Deep Learning For Sequential Data – Part I: Why Do We Need It

May 3, 2016 / February 22, 2022 ~ Prateek Joshi ~ 4 Comments

Most of the current research on deep learning is focused on images. Deep learning is being actively applied to many areas, but image recognition is definitely generating a lot of buzz. Deep neural networks are being used for image classification tasks, where they often outperform other approaches by a wide margin. The networks used here are traditional feedforward neural networks that learn how to classify data by learning a good feature representation. These neural networks are severely limited when it comes to sequential data. Time series data is perhaps the most popular form of sequential data. Why can’t we use feedforward neural networks to analyze sequential data?

How To Extract Feature Vectors From Deep Neural Networks In Python Caffe

April 26, 2016 / April 20, 2016 ~ Prateek Joshi ~ 6 Comments

Convolutional Neural Networks are great at identifying the information that makes an image distinct. When we train a deep neural network in Caffe to classify images, we specify a multilayered neural network with different types of layers like convolution, rectified linear unit, softmax loss, and so on. The last layer is the output layer that gives us the output tag with the corresponding confidence value. But sometimes it’s useful to extract the feature vectors from various layers and use them for other purposes. Let’s see how to do it in Python Caffe, shall we?

How To Programmatically Create A Deep Neural Network In Python Caffe

April 19, 2016 / April 4, 2016 ~ Prateek Joshi ~ 2 Comments

When you are working with Caffe, you need to define your deep neural network architecture in a ‘.prototxt’ file. These prototxt files usually consist of hundreds of lines, defining layers and their corresponding parameters. Before you start training your neural network, you need to create these files and define your architecture. One way to do this is to manually write all these lines into a file. But sometimes, it’s beneficial to create this architecture dynamically depending on our needs. In such cases, creating a deep neural network programmatically can be very useful. Let’s go ahead and see how to do it in Python Caffe, shall we?

Understanding Locally Connected Layers In Convolutional Neural Networks

April 12, 2016 / April 13, 2016 ~ Prateek Joshi ~ 12 Comments

Convolutional Neural Networks (CNNs) have been phenomenal in the field of image recognition. Researchers have been focusing heavily on building deep learning models for various tasks and they just keep getting better every year. As we know, a CNN is composed of many types of layers like convolution, pooling, fully connected, and so on. Convolutional layers are great at dealing with image data, but there are a couple of restrictions as well. The DeepFace network built by Facebook used another type of layer to speed up their training and get amazing results. This layer is called a Locally Connected Layer with unshared weights. So what exactly does it do that other layers can’t?
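To make the "unshared weights" idea concrete, here is a minimal one-dimensional sketch in plain Python. It is not taken from the post or from DeepFace; the function names and the toy numbers are made up for illustration. A convolution slides one kernel over every position, while a locally connected layer uses the same sliding window but keeps a separate kernel for each output position.

```python
def conv1d(signal, kernel):
    """Valid 1D cross-correlation: one kernel shared by every position."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]


def locally_connected1d(signal, kernels):
    """Same sliding window, but output position i uses its own kernels[i]."""
    k = len(kernels[0])
    n_out = len(signal) - k + 1
    if len(kernels) != n_out:
        raise ValueError("need exactly one kernel per output position")
    return [sum(signal[i + j] * kernels[i][j] for j in range(k))
            for i in range(n_out)]


# Toy data (hypothetical values chosen only for illustration).
signal = [1.0, 2.0, 3.0, 4.0]
shared_kernel = [1.0, -1.0]
per_position_kernels = [[1.0, -1.0], [0.5, 0.5], [2.0, 0.0]]

conv_out = conv1d(signal, shared_kernel)
local_out = locally_connected1d(signal, per_position_kernels)

# Parameter counts: the convolution learns k weights in total,
# the locally connected layer learns k weights per output position.
conv_params = len(shared_kernel)
local_params = sum(len(kernel) for kernel in per_position_kernels)
```

In this toy case the convolution has 2 weights and the locally connected layer has 6, which shows the trade-off: more freedom to treat each region differently, at the cost of more parameters.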

What Is Local Response Normalization In Convolutional Neural Networks

April 5, 2016 / April 7, 2016 ~ Prateek Joshi ~ 12 Comments

Convolutional Neural Networks (CNNs) have been doing wonders in the field of image recognition in recent times. A CNN is a type of deep neural network in which the layers are connected using spatially organized patterns. This is in line with how the human visual cortex processes image data. Researchers have been working on coming up with better architectures over the last few years. In this blog post, we will discuss a particular type of layer that has been used consistently across many famous architectures. This layer is called the Local Response Normalization layer and it plays an important role. What does it do? What’s the advantage of having this in our network?
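As a rough sketch of what such a layer computes, the snippet below implements across-channel local response normalization in the form given in the AlexNet paper: each activation is divided by a term built from the squared activations of its neighbouring channels at the same spatial location. The function name is hypothetical, the defaults (n=5, k=2, alpha=1e-4, beta=0.75) are the values reported in that paper, and individual frameworks may scale alpha differently.

```python
def local_response_norm(activations, n=5, k=2.0, alpha=1e-4, beta=0.75):
    """Normalize channel activations at a single spatial location.

    activations: list of floats, one value per channel.
    n: number of adjacent channels in the normalization window.
    """
    num_channels = len(activations)
    half = n // 2
    normalized = []
    for i, a in enumerate(activations):
        lo = max(0, i - half)
        hi = min(num_channels - 1, i + half)
        # Sum of squares over the neighbouring channels (window clipped at the edges).
        sq_sum = sum(activations[j] ** 2 for j in range(lo, hi + 1))
        normalized.append(a / (k + alpha * sq_sum) ** beta)
    return normalized


# A strongly active channel dampens itself and its neighbours.
example = local_response_norm([1.0, 50.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0])
```

In the example, the channels within two positions of the large activation end up slightly smaller than the channels farther away, even though they all started at the same value.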

Understanding Xavier Initialization In Deep Neural Networks

March 29, 2016 / April 7, 2016 ~ Prateek Joshi ~ 6 Comments

I recently stumbled upon an interesting piece of information when I was working on deep neural networks. I started thinking about the initialization of network weights and the theory behind it. Does the image to the left make sense now? The guy in that picture is lifting “weights” and we are talking about network “weights”. Anyway, when we implement convolutional neural networks, we tend to utilize all the knowledge and research available out there. A good number of things in deep learning are based on heuristics! It’s worth exploring why we do things in a certain way whenever possible. This goes a long way in unlocking the hidden mysteries of deep learning and why it’s so unbelievably accurate. Let’s go ahead and understand how network weights are initialized, shall we?
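For readers who want a quick feel for the idea, here is a small sketch of the normalized Xavier (Glorot) scheme in plain Python. It assumes the uniform variant, where weights are drawn from [-limit, limit] with limit = sqrt(6 / (fan_in + fan_out)), which gives a variance of 2 / (fan_in + fan_out). The function name is hypothetical, and frameworks differ in which fan count they use by default.

```python
import math
import random


def xavier_uniform(fan_in, fan_out, rng=None):
    """Return a fan_out x fan_in weight matrix drawn from the Xavier uniform range."""
    if rng is None:
        rng = random.Random(0)  # fixed seed so the sketch is reproducible
    limit = math.sqrt(6.0 / (fan_in + fan_out))
    return [[rng.uniform(-limit, limit) for _ in range(fan_in)]
            for _ in range(fan_out)]


weights = xavier_uniform(fan_in=300, fan_out=200)

# The empirical variance should be close to 2 / (fan_in + fan_out).
flat = [w for row in weights for w in row]
empirical_variance = sum(w * w for w in flat) / len(flat)
target_variance = 2.0 / (300 + 200)
```

The point of tying the range to the layer sizes is to keep the scale of the signal roughly constant from layer to layer, so that it neither vanishes nor blows up as the network gets deeper.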

How Are Decision Trees Constructed In Machine Learning

March 22, 2016 / April 7, 2016 ~ Prateek Joshi ~ 6 Comments

Decision trees occupy an important place in machine learning. They form the basis of Random Forests, which are used extensively in real-world systems. A famous example is Microsoft Kinect, where Random Forests are used to track your body parts. The technique is popular because it provides high accuracy with relatively little effort. Decision trees are fast to train and computationally inexpensive. They are not sensitive to outliers either, which helps them stay robust in a variety of cases. So how exactly are they constructed? How are the nodes in the trees generated so that they are optimal?
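As a small illustration of how a single node can be chosen, the sketch below scores candidate thresholds on one numeric feature using Gini impurity and keeps the threshold with the lowest weighted impurity. Gini is only one common criterion (entropy-based information gain is another), and the excerpt does not say which one the full post uses; the function names and toy data here are made up for illustration.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means the node is pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(values, labels):
    """Return (threshold, weighted_gini) for the best 'value <= threshold' split.

    Returns (None, parent_gini) when no split improves on the unsplit node.
    """
    n = len(values)
    best = (None, gini(labels))
    # The largest value is skipped because it would leave the right side empty.
    for threshold in sorted(set(values))[:-1]:
        left = [y for x, y in zip(values, labels) if x <= threshold]
        right = [y for x, y in zip(values, labels) if x > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (threshold, score)
    return best


# Toy data: the two classes separate cleanly between 3.0 and 10.0.
values = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
labels = ["a", "a", "a", "b", "b", "b"]
threshold, impurity = best_split(values, labels)
```

A full tree builder would repeat this search over every feature, split the data on the winner, and recurse on each side until a stopping rule is met.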

Deep Learning With Caffe In Python – Part IV: Classifying An Image

February 23, 2016 / April 7, 2016 ~ Prateek Joshi ~ 11 Comments

In the previous blog post, we learnt how to train a convolutional neural network (CNN). One of the most popular use cases for a CNN is to classify images. Once the CNN is trained, we need to know how to use it to classify an unknown image. The trained models are stored as “caffemodel” files, so we need to load those files, preprocess the input images, and then extract the output tags for those images. In this post, we will see how to load those trained model files and use them to classify an image. Let’s go ahead and see how to do it, shall we?

Deep Learning With Caffe In Python – Part III: Training A CNN

February 16, 2016 / April 7, 2016 ~ Prateek Joshi ~ 9 Comments

In the previous blog post, we learnt how to interact with a Caffe model. In this blog post, we will learn how to train a proper CNN. Up until now, we were dealing with a single layer network. We just defined it in a prototxt file and visualized it easily. If we want our CNN to perform any meaningful task, we should define a multilayer network and train it on a large amount of data. Caffe makes it very easy for us to train a multilayer network. We can specify all the parameters in a prototxt file, create a training database, and just train the network. Let’s go ahead and see how to do that, shall we?

Deep Learning With Caffe In Python – Part II: Interacting With A Model

February 9, 2016 / April 7, 2016 ~ Prateek Joshi ~ 4 Comments

I know that the title looks slightly misleading. If you are thinking that we will be talking about how to interact with fashion models at a coffee shop, you are in for a big surprise! In the previous blog post, we talked about how to define and visualize a single layer convolutional neural network (CNN). In this post, we will discuss how to interact with a Caffe model. This is a continuation of the previous blog post. So if you haven’t read it, you may want to take a quick glance at it before you proceed. In that post, we defined our CNN architecture in a prototxt file. Now how do we make it do stuff for us? When we load such a network using Caffe, it comes with a bunch of features. Let’s see how to work with our model, shall we?