About Prateek Joshi

I want the machines to see the world ... see the world the way I see it!

Deep Learning For Sequential Data – Part V: Handling Long Term Temporal Dependencies

1 mainIn the previous blog post, we learnt why we cannot use regular backpropagation to train a Recurrent Neural Network (RNN). We discussed how we can use backpropagation through time to train an RNN. The next step is to understand how exactly the RNN can be trained. Does the unrolling strategy work in practice? If we can just unroll an RNN and make it into a feedforward neural network, then what’s so special about the RNN in the first place? Let’s see how we tackle these issues.   Continue reading

Deep Learning For Sequential Data – Part IV: Training Recurrent Neural Networks

1 mainIn the previous blog post, we learnt how Recurrent Neural Networks (RNNs) can be used to build deep learning models for sequential data. Building a deep learning model involves many steps, and the training process is an important step. We should be able to train a model in a robust way in order to use it for inferencing. The training process needs to be trackable and it should converge in a reasonable amount of time. So how do we train RNNs? Can we just use the regular techniques that are used for feedforward neural networks?   Continue reading

Deep Learning For Sequential Data – Part III: What Are Recurrent Neural Networks

1 mainIn the previous two blog posts, we discussed why Hidden Markov Models and Feedforward Neural Networks are restrictive. If we want to build a good sequential data model, we should give more freedom to our learning model to understand the underlying patterns. This is where Recurrent Neural Networks (RNNs) come into picture. One of the biggest restrictions of Convolutional Neural Networks is that they force us to operate on fixed size input data. RNNs know how to operate on sequences of data, which is exciting because it opens up a lot of possibilities! What are RNNs? How do we construct them?   Continue reading

Deep Learning For Sequential Data – Part II: Constraints Of Traditional Approaches

1 mainIn the previous blog post, we discussed the nature of sequential data and why we need a robust separate modeling technique to analyze that data. Traditionally, people have been using Hidden Markov Models (HMMs) to analyze sequential data, so we will center the discussion around HMMs in this blog post. HMMs have been implemented for many tasks such as speech recognition, gesture recognition, part-of-speech tagging, and so on. But HMMs place a lot of restrictions as to how we can model our data. HMMs are definitely better than using classical machine learning techniques, but they don’t fully cover the needs of all the modern data analysis. This is because of the constraints that are used to build HMMs. What are those constraints?   Continue reading

Deep Learning For Sequential Data – Part I: Why Do We Need It

1 mainMost of the current research on deep learning is focused on images. Deep learning is being actively applied to many areas, but image recognition is definitely generating a lot of buzz. Deep neural networks are being used for image classification tasks and they are able to outperform all the other approaches by a big margin. The networks that are used here are traditional feedforward neural networks that learn how to classify data by generating the optimal feature representation. These neural networks severely limited when it comes to sequential data. Time series data is perhaps the most popular form of sequential data. Why can’t we use feedforward neural networks analyze sequential data?   Continue reading

How To Extract Feature Vectors From Deep Neural Networks In Python Caffe

1 mainConvolutional Neural Networks are great at identifying all the information that makes an image distinct. When we train a deep neural network in Caffe to classify images, we specify a multilayered neural network with different types of layers like convolution, rectified linear unit, softmax loss, and so on. The last layer is the output layer that gives us the output tag with the corresponding confidence value. But sometimes it’s useful for us to extract the feature vectors from various layers and use it for other purposes. Let’s see how to do it in Python Caffe, shall we?   Continue reading

How To Programmatically Create A Deep Neural Network In Python Caffe

1 mainWhen you are working with Caffe, you need to define your deep neural network architecture in a ‘.prototxt’ file. These prototxt files usually consist of hundreds of lines, defining layers and corresponding parameters. Before you start training your neural network, you need to create these files and define your architecture. One way to do this is manually write all these lines into a file. But sometimes, it’s beneficial to dynamically create this architecture depending on our needs. In such cases, creating a deep neural network programmatically can be very useful. Let’s go ahead and see how to do it in Python Caffe, shall we?   Continue reading