What Does Backpropagation Mean?

December 12, 2012 / December 2, 2015 ~ Prateek Joshi ~ Leave a comment

People started working on artificial intelligence back in the 1950s. After they came up with the concept of the perceptron, the field looked very promising. But as the years passed, no significant progress was made despite several attempts from multiple directions. As people were beginning to lose hope, backpropagation came into the picture and breathed new life into the field. Backpropagation was the result of pioneering work by mathematicians and computer scientists, and it eventually led to a successful revival of artificial intelligence! So what exactly is backpropagation? How is it used in real life? Continue reading “What Does Backpropagation Mean?” →

Perceiving The Perceptron

December 4, 2012 / November 2, 2013 ~ Prateek Joshi ~ 4 Comments

If you are hearing the word “perceptron” for the first time, it sounds a lot like a futuristic robot that can perceive things, right? Well, that’s not exactly what it means! The perceptron is a machine learning algorithm for supervised classification. It is one of the very first algorithms to be formulated in the field of artificial intelligence. When it first came out, it looked very promising. But over the following years, its performance didn’t live up to expectations. It was studied for many years, and the theory was modified and extended in a lot of ways. Now it has become an integral part of the field of artificial neural networks. So what exactly is a perceptron? Where do we use it in real life? Continue reading “Perceiving The Perceptron” →
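To make “supervised classification” a bit more concrete, here is a minimal sketch of the classic perceptron learning rule in plain Python. The function names, the learning rate, the epoch limit and the toy AND-gate data are all my own assumptions for illustration; they are not taken from the full post.

```python
def train_perceptron(samples, labels, epochs=20, lr=1.0):
    """Learn weights and a bias with the perceptron rule (labels are -1 or +1)."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, y in zip(samples, labels):
            activation = sum(w * xi for w, xi in zip(weights, x)) + bias
            predicted = 1 if activation > 0 else -1
            if predicted != y:
                # Nudge the decision boundary toward the misclassified point.
                weights = [w + lr * y * xi for w, xi in zip(weights, x)]
                bias += lr * y
                mistakes += 1
        if mistakes == 0:
            break  # every training point is classified correctly
    return weights, bias


def predict(weights, bias, x):
    """Return +1 or -1 depending on which side of the boundary x falls."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else -1


# Toy data (an assumption): a logical AND gate, which is linearly separable.
samples = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [-1, -1, -1, 1]
weights, bias = train_perceptron(samples, labels)
predictions = [predict(weights, bias, x) for x in samples]
```

The rule only changes the weights when a point is misclassified, so on linearly separable data like this it stops once every training point is on the correct side.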

Dynamic Programming

November 14, 2012 / February 27, 2014 ~ Prateek Joshi ~ 1 Comment

Most techies have come across this concept at one time or another. People know that it’s really good and very useful, but not many of them know exactly how it works and why it works in the first place! Let’s say you are presented with a big box of precious stones of different sizes and weights. You have a bag with you that can only hold a limited weight, so obviously you can’t take everything. Let’s say it can only hold W pounds. You also know the market value of each of those stones. Given that you can only carry W pounds, which stones should you pick in order to maximize your profit? Continue reading “Dynamic Programming” →
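The stone-picking question above is usually known as the 0/1 knapsack problem. As a rough sketch, and assuming whole-number weights (my assumption, along with the function name and the sample numbers), dynamic programming can solve it like this:

```python
def knapsack(weights, values, capacity):
    """Return the best total value achievable within the weight capacity.

    Assumes integer weights; each stone can be taken at most once.
    """
    # best[c] holds the best value achievable with a weight limit of c.
    best = [0] * (capacity + 1)
    for weight, value in zip(weights, values):
        # Walk capacities downward so each stone is counted only once.
        for c in range(capacity, weight - 1, -1):
            best[c] = max(best[c], best[c - weight] + value)
    return best[capacity]


# Hypothetical stones: weights in pounds and their market values.
stone_weights = [1, 3, 4, 5]
stone_values = [1, 4, 5, 7]
best_value = knapsack(stone_weights, stone_values, capacity=7)
```

The idea is to reuse the answers for smaller bags: the best result for a bag of capacity c either skips the current stone or adds it to the best result for a bag of capacity c minus that stone’s weight.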

Constrained Optimization

September 25, 2012 / November 4, 2013 ~ Prateek Joshi ~ Leave a comment

Whenever we think of a real-life problem, we want the optimal result. I said optimal and not the best possible because we don’t have unlimited resources. Given unlimited resources, we would always pick the best option and wouldn’t have to think about it at all. But unfortunately, in real life this is almost never the case. Let’s say you want to buy a car. Ideally you want the best possible car, but you don’t have unlimited money. So you would try to buy a car with as many features as possible while keeping the cost low. This is not so hard to do because you have a limited number of variables, so you would just do it manually. But what would you do when you have to deal with a lot of variables? How would you do it? Continue reading “Constrained Optimization” →

Principal Component Analysis

September 21, 2012 / November 4, 2013 ~ Prateek Joshi ~ 1 Comment

Principal Component Analysis (PCA) is one of the most useful tools in the field of pattern recognition. Let’s say you are making a list of people and collecting information about their physical attributes. Some of the more common attributes include height, weight, chest, waist and biceps. If you store 5 attributes per person, it is equivalent to storing a 5-dimensional feature vector. If you generalize it to ‘n’ different attributes, you are constructing an n-dimensional feature vector. Now you may want to analyze this data and cluster people into different categories based on these attributes. PCA comes into the picture when we have a set of data points that are multidimensional feature vectors and the dimensionality is high. Analyzing the patterns in our earlier example is quite simple because it’s just a 5-dimensional feature vector. In real-life systems, the dimensionality is really high (often in the hundreds or thousands), and it becomes very complex and time-consuming to analyze such data. What should we do now? Continue reading “Principal Component Analysis” →

Interpretation of Gaussian Distribution

September 9, 2012 ~ Prateek Joshi ~ 9 Comments

When we deal with large amounts of data, we can’t have specific rules for each and every instance. We have to come up with a model that describes the whole data set. This model can then be used to analyze unknown inputs. More often than not, the data has some underlying pattern. When we think of a model, we extract specific characteristics from the data and come up with a formulation that best explains the behavior of the data. One of the most frequently occurring patterns is the Gaussian distribution. It is used almost everywhere in science and technology. But what is it exactly? Why do we need it? Continue reading “Interpretation of Gaussian Distribution” →
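For reference, the standard one-dimensional form of the Gaussian distribution is written below. The symbols are the usual textbook ones and are my assumption here, not notation taken from the full post: \mu is the mean of the data and \sigma is its standard deviation.

```latex
p(x) = \frac{1}{\sigma \sqrt{2\pi}} \exp\!\left( -\frac{(x - \mu)^2}{2\sigma^2} \right)
```

The mean sets where the bell curve is centered, and the standard deviation sets how wide it is.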

Kernel Functions For Machine Learning

September 1, 2012 ~ Prateek Joshi ~ 1 Comment

You must have heard the term ‘kernel’ floating around quite a few times. People from many different backgrounds use it in different contexts, because the term has been applied to different things in different domains. When we talk about operating systems, we talk about which kernel is being used. The term is also used extensively in parallel computing and in the GPU domain, where it is the function that is called repeatedly on a computing grid. It has a few other meanings in different hardware-related programming fields. But in this post, I will discuss kernels as applied to machine learning. Kernels are used in machine learning to transform the data so that classification becomes easier. One thing common to all these definitions of the term ‘kernel’ is that it is used as a bridge between two things. In operating systems, it is the bridge between hardware and software. In the GPU domain, it is the bridge between the geometric grid and the programmer. In machine learning, it is the bridge between linearity and non-linearity. I will discuss the underlying mathematical structure in this post. So readers beware, this is a technical deep-dive. Continue reading “Kernel Functions For Machine Learning” →
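To give a flavor of that “bridge between linearity and non-linearity”, here is a small sketch using a degree-2 polynomial kernel on 2-dimensional points. The choice of kernel, the function names and the sample points are my assumptions for illustration. The kernel value computed directly in the original space matches an ordinary dot product taken after an explicit non-linear feature mapping.

```python
import math


def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def poly_kernel(x, y):
    """Degree-2 polynomial kernel, computed in the original 2-D space."""
    return dot(x, y) ** 2


def feature_map(x):
    """Explicit non-linear mapping from 2-D to the 3-D space the kernel implies."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)


# Hypothetical sample points.
a = (1.0, 2.0)
b = (3.0, 4.0)

kernel_value = poly_kernel(a, b)
mapped_value = dot(feature_map(a), feature_map(b))
```

Both numbers agree (up to floating-point rounding), which is the whole point: the kernel gives us the dot product in the higher-dimensional space without ever building the mapped vectors.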

Support Vector Machines

August 24, 2012 ~ Prateek Joshi ~ 3 Comments

In machine learning, we have supervised learning on one end and unsupervised learning on the other. Support Vector Machines (SVMs) are supervised learning models used to analyze and classify data. We use machine learning algorithms to train the machines. Once we have a model, we can classify unknown data. Let’s say you have a set of data points and each of them belongs to one of two possible classes. Our task is to find the best possible way to put a boundary between the two sets of points. When a new point comes in, we can use this boundary to decide whether it belongs to class 1 or class 2. In real life, these data points can be sets of observations like images, text, characters, protein sequences, etc. How can we achieve this in an optimal way? Continue reading “Support Vector Machines” →

Artificial Neural Networks

August 14, 2012 / December 2, 2015 ~ Prateek Joshi ~ 2 Comments

We want our machines to learn as much as possible on their own. Over the past few decades, researchers have come up with many theories and formulations about how we can best achieve this. This realm is called machine learning. We come up with algorithms to teach the machines how to learn. I have discussed more about machine learning here. The human brain seems to achieve this rather effortlessly. Our ultimate goal is to make the machines as good as our brains, or even better. The formulation of the Artificial Neural Network (ANN) is an attempt in this direction. Continue reading “Artificial Neural Networks” →

Augmented Reality

August 2, 2012 / August 14, 2012 ~ Prateek Joshi ~ Leave a comment

Augmented Reality (AR) has been one of the most exciting fields to come into prominence in the last few years. Back when people started working aggressively on computer graphics, great innovations took place. Today, we have 3D movies with high-end computer graphics, but they are still on the screens inside our machines. People then started to think about how to pull the graphics out of the screen and integrate them into the real world. The result of this effort was augmented reality. It tries to blur the line between what’s real and what’s virtual. It enhances our perception of reality. You can take a look at this video to see what I’m talking about. How does this technology work? How does it track the marker? Continue reading “Augmented Reality” →