Understanding Xavier Initialization In Deep Neural Networks

March 29, 2016 ~ April 7, 2016 ~ Prateek Joshi ~ 6 Comments

I recently stumbled upon an interesting piece of information while working on deep neural networks. I started thinking about the initialization of network weights and the theory behind it. Does the image to the left make sense now? The guy in that picture is lifting “weights” and we are talking about network “weights”. Anyway, when we implement convolutional neural networks, we tend to utilize all the knowledge and research available out there. A good number of things in deep learning are based on heuristics! It’s worth exploring why we do things a certain way whenever possible. This goes a long way towards unlocking the hidden mysteries of deep learning and why it works so well. Let’s go ahead and understand how network weights are initialized, shall we?
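As a rough illustration of the idea, here is a minimal sketch of the uniform variant of Xavier initialization in plain Python. It assumes the commonly cited rule that weights are drawn from a uniform range of plus or minus sqrt(6 / (fan_in + fan_out)), which gives a variance of 2 / (fan_in + fan_out). The function name `xavier_uniform` and the layer sizes are hypothetical, and some libraries scale by fan_in alone, so treat this as one possible form rather than the definitive one.

```python
import math
import random


def xavier_uniform(fan_in, fan_out, rng=None):
    """Return a fan_in x fan_out weight matrix as a list of lists.

    Assumption: weights are drawn uniformly from [-limit, limit] with
    limit = sqrt(6 / (fan_in + fan_out)).
    """
    rng = rng or random.Random(0)  # fixed seed so the sketch is repeatable
    limit = math.sqrt(6.0 / (fan_in + fan_out))
    return [
        [rng.uniform(-limit, limit) for _ in range(fan_out)]
        for _ in range(fan_in)
    ]


# Hypothetical layer with 256 inputs and 128 outputs.
fan_in, fan_out = 256, 128
weights = xavier_uniform(fan_in, fan_out)

# Compare the empirical variance with the target 2 / (fan_in + fan_out).
flat = [w for row in weights for w in row]
mean = sum(flat) / len(flat)
variance = sum((w - mean) ** 2 for w in flat) / len(flat)
target_variance = 2.0 / (fan_in + fan_out)
```

The point of the scaling is that the spread of the weights shrinks as the layer gets wider, so the signal passing through the layer neither blows up nor dies out.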

How Are Decision Trees Constructed In Machine Learning

March 22, 2016 ~ April 7, 2016 ~ Prateek Joshi ~ 6 Comments

Decision trees occupy an important place in machine learning. They form the basis of Random Forests, which are used extensively in real world systems. A famous example is Microsoft Kinect, where Random Forests are used to track your body parts. Random Forests are popular because they provide high accuracy with relatively little effort. They are fast to train and they are not computationally expensive. They are not sensitive to outliers either, which helps them to be robust in a variety of cases. So how exactly are decision trees constructed? How are the nodes in the trees generated so that they are optimal?
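To make the question about node generation a little more concrete, here is a minimal sketch of how a single split could be chosen. It assumes Gini impurity as the quality measure and a single numeric feature. The excerpt does not say which criterion is used, so the criterion, the function names `gini` and `best_threshold`, and the toy data are all assumptions for illustration.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_threshold(values, labels):
    """Return (threshold, weighted_impurity) for one numeric feature.

    Every midpoint between two distinct neighbouring values is tried,
    and the one with the lowest weighted impurity is kept.
    """
    pairs = sorted(zip(values, labels))
    best = (None, gini(labels))  # baseline: no split at all
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot split between identical values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if score < best[1]:
            best = (threshold, score)
    return best


# Toy data: two well separated groups.
values = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
labels = ["a", "a", "a", "b", "b", "b"]
threshold, impurity = best_threshold(values, labels)
```

A full tree would apply the same search recursively to the left and right subsets, over all features, until the nodes are pure or a depth limit is reached.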

Deep Learning With Caffe In Python – Part IV: Classifying An Image

February 23, 2016 ~ April 7, 2016 ~ Prateek Joshi ~ 11 Comments

In the previous blog post, we learnt how to train a convolutional neural network (CNN). One of the most popular use cases for a CNN is to classify images. Once the CNN is trained, we need to know how to use it to classify an unknown image. The trained models are stored as “caffemodel” files, so we need to load those files, preprocess the input images, and then extract the output tags for those images. In this post, we will see how to load those trained model files and use them to classify an image. Let’s go ahead and see how to do it, shall we?

Deep Learning With Caffe In Python – Part III: Training A CNN

February 16, 2016 ~ April 7, 2016 ~ Prateek Joshi ~ 9 Comments

In the previous blog post, we learnt how to interact with a Caffe model. In this blog post, we will learn how to train a proper CNN. Up until now, we were dealing with a single layer network. We just defined it in a prototxt file and visualized it. If we want our CNN to perform any meaningful task, we should define a multilayer network and train it on a large amount of data. Caffe makes it very easy for us to train a multilayer network. We can specify all the parameters in a prototxt file, create a training database, and just train the network. Let’s go ahead and see how to do that, shall we?

Deep Learning With Caffe In Python – Part II: Interacting With A Model

February 9, 2016 ~ April 7, 2016 ~ Prateek Joshi ~ 4 Comments

I know that the title looks slightly misleading. If you are thinking that we will be talking about how to interact with fashion models at a coffee shop, you are in for a big surprise! In the previous blog post, we talked about how to define and visualize a single layer convolutional neural network (CNN). In this post, we will discuss how to interact with a Caffe model. This is a continuation of the previous blog post, so if you haven’t read it, you may want to take a quick glance at it before you proceed. In that post, we defined our CNN architecture in a prototxt file. Now how do we make it do stuff for us? When we load such a network using Caffe, it comes with a bunch of features. Let’s see how to work with our model, shall we?

Deep Learning With Caffe In Python – Part I: Defining A Layer

February 2, 2016 ~ April 7, 2016 ~ Prateek Joshi ~ 7 Comments

Caffe is one of the most popular deep learning packages out there. In one of the previous blog posts, we talked about how to install Caffe. In this blog post, we will discuss how to get started with Caffe and use its various features. We will then build a convolutional neural network (CNN) that can be used for image classification. Caffe plays very well with the GPU during the training process, so we can achieve a significant speed-up. For the purpose of this discussion, it is assumed that you have already installed Caffe on your machine. Let’s go ahead and see how to interact with Caffe, shall we?

How To Train A Neural Network In Python – Part III

January 26, 2016 ~ December 19, 2015 ~ Prateek Joshi ~ 3 Comments

In the previous blog post, we learnt how to build a multilayer neural network in Python. What we did there falls under the category of supervised learning. In that realm, we have some training data and we have the associated labels. The goal is to train the neural network to correctly label our training data. Once we train the model, we can use it to predict the labels of unknown datapoints. But what about unsupervised learning? In the real world, we also have to deal with a lot of unlabeled data. Can we train a neural network to recognize clusters in our data? Yes, we certainly can! Let’s go ahead and see how we can do that in Python, shall we?

How To Train A Neural Network In Python – Part II

January 19, 2016 ~ January 20, 2016 ~ Prateek Joshi ~ 4 Comments

In the previous blog post, we discussed perceptrons. We learnt how to train a perceptron in Python to achieve a simple classification task. If you need a quick refresher on perceptrons, you can check out that blog post before proceeding further. In a way, a perceptron is a single layer neural network with a single neuron. In this blog post, we will learn how to develop a multilayer neural network. A multilayer neural network consists of multiple layers, each layer consists of many perceptrons, and it is much better at classifying data than a single perceptron. So how exactly does a multilayer neural network function? How do we build it in Python?

How To Train A Neural Network In Python – Part I

January 12, 2016 ~ January 13, 2016 ~ Prateek Joshi ~ 10 Comments

Deep learning uses neural networks to build sophisticated models. The basic building blocks of these neural networks are called “neurons”. When a neuron is trained to act like a simple classifier, we call it a “perceptron”. A neural network consists of a lot of perceptrons interconnected with each other. Let’s say we have a bunch of inputs and the corresponding desired outputs. The goal of deep learning is to train this neural network so that the system outputs the right value for the given set of inputs. This process basically involves tuning the weights of each neuron in the network until it produces the desired outputs. So what exactly is this perceptron? How do we train it in Python?
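The idea of tuning a single neuron until it gives the right output for each input can be sketched in a few lines of plain Python. This is a minimal illustration that assumes the classic perceptron learning rule with a step activation; the function names, the learning rate, and the logical AND dataset are all chosen here for illustration and are not taken from the post.

```python
def predict(weights, bias, x):
    """Step activation: output 1 if the weighted sum is positive, else 0."""
    activation = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if activation > 0 else 0


def train_perceptron(samples, targets, learning_rate=1.0, epochs=10):
    """Train one perceptron with the classic perceptron update rule."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, target in zip(samples, targets):
            # error is 0 when the prediction is right, +1 or -1 otherwise
            error = target - predict(weights, bias, x)
            weights = [w + learning_rate * error * xi
                       for w, xi in zip(weights, x)]
            bias += learning_rate * error
    return weights, bias


# Inputs and desired outputs for logical AND (a linearly separable task).
samples = [(0, 0), (0, 1), (1, 0), (1, 1)]
targets = [0, 0, 0, 1]

weights, bias = train_perceptron(samples, targets)
predictions = [predict(weights, bias, x) for x in samples]
```

Because logical AND is linearly separable, the updates stop changing the weights after a few passes over the data, and `predictions` matches `targets`.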

How To Install Caffe On Ubuntu

January 5, 2016 ~ December 4, 2015 ~ Prateek Joshi ~ 1 Comment

The concept of deep learning is becoming increasingly pervasive. It is a new area of research in machine learning that focuses on learning optimal representations of data. Now what does that mean? In the realm of classical machine learning, we have to build the features first and then the machine learning algorithm will learn how to classify the data based on these features. The problem is that feature-building is a trial-and-error process and we want to avoid manual intervention. This is where deep learning tends to shine! Instead of manually building the features ourselves, we can just let our deep neural networks learn the features and then build a system to classify that data. In this field, people work towards building a set of algorithms that can model abstractions in our data using multilayered neural networks. Caffe is one of the most popular libraries available out there for deep learning. Let’s go ahead and see how to get it up and running on Ubuntu, shall we?