Deep Learning For Sequential Data – Part IV: Training Recurrent Neural Networks

In the previous blog post, we learnt how Recurrent Neural Networks (RNNs) can be used to build deep learning models for sequential data. Building a deep learning model involves many steps, and training is one of the most important. We need to train a model robustly before we can use it for inference. The training process needs to be trackable and it should converge in a reasonable amount of time. So how do we train RNNs? Can we just use the regular techniques that are used for feedforward neural networks?

Deep Learning For Sequential Data – Part III: What Are Recurrent Neural Networks

In the previous two blog posts, we discussed why Hidden Markov Models and Feedforward Neural Networks are restrictive. If we want to build a good sequential data model, we should give our learning model more freedom to understand the underlying patterns. This is where Recurrent Neural Networks (RNNs) come into the picture. One of the biggest restrictions of Convolutional Neural Networks is that they force us to operate on fixed-size input data. RNNs know how to operate on sequences of data, which is exciting because it opens up a lot of possibilities! What are RNNs? How do we construct them?

Deep Learning For Sequential Data – Part II: Constraints Of Traditional Approaches

In the previous blog post, we discussed the nature of sequential data and why we need a separate, robust modeling technique to analyze that data. Traditionally, people have been using Hidden Markov Models (HMMs) to analyze sequential data, so we will center the discussion around HMMs in this blog post. HMMs have been applied to many tasks such as speech recognition, gesture recognition, part-of-speech tagging, and so on. But HMMs place a lot of restrictions on how we can model our data. HMMs are definitely better than classical machine learning techniques for this kind of data, but they don’t fully cover the needs of modern data analysis. This is because of the constraints that are used to build HMMs. What are those constraints?

Deep Learning For Sequential Data – Part I: Why Do We Need It

Most of the current research on deep learning is focused on images. Deep learning is being actively applied to many areas, but image recognition is definitely generating a lot of buzz. Deep neural networks are being used for image classification tasks and they are able to outperform other approaches by a big margin. The networks that are used here are traditional feedforward neural networks that learn how to classify data by generating the optimal feature representation. These neural networks are severely limited when it comes to sequential data. Time series data is perhaps the most popular form of sequential data. Why can’t we use feedforward neural networks to analyze sequential data?

How To Extract Feature Vectors From Deep Neural Networks In Python Caffe

Convolutional Neural Networks are great at identifying all the information that makes an image distinct. When we train a deep neural network in Caffe to classify images, we specify a multilayered neural network with different types of layers like convolution, rectified linear unit, softmax loss, and so on. The last layer is the output layer that gives us the output tag with the corresponding confidence value. But sometimes it’s useful for us to extract the feature vectors from various layers and use them for other purposes. Let’s see how to do it in Python Caffe, shall we?

How To Programmatically Create A Deep Neural Network In Python Caffe

When you are working with Caffe, you need to define your deep neural network architecture in a ‘.prototxt’ file. These prototxt files usually consist of hundreds of lines, defining layers and corresponding parameters. Before you start training your neural network, you need to create these files and define your architecture. One way to do this is to manually write all these lines into a file. But sometimes, it’s beneficial to dynamically create this architecture depending on our needs. In such cases, creating a deep neural network programmatically can be very useful. Let’s go ahead and see how to do it in Python Caffe, shall we?

Understanding Locally Connected Layers In Convolutional Neural Networks

Convolutional Neural Networks (CNNs) have been phenomenal in the field of image recognition. Researchers have been focusing heavily on building deep learning models for various tasks and these models just keep getting better every year. As we know, a CNN is composed of many types of layers like convolution, pooling, fully connected, and so on. Convolutional layers are great at dealing with image data, but they come with a couple of restrictions as well. The DeepFace network built by Facebook used another type of layer to speed up their training and get amazing results. This layer is called a Locally Connected Layer with unshared weights. So what exactly does it do that other layers can’t?