Package OpenCV not found? Let’s Find It.

October 18, 2013 ~ Prateek Joshi ~ 6 Comments

OpenCV has been refined over the years, and installing it is now far more user-friendly. It has been ported to many platforms, and you don’t need an IDE to use it. There are many ways to install it and use it in your existing code. But sometimes, when people try to do different things with it, they run into problems. This post deals with one such problem. When you write code that uses OpenCV, you need to tell your compiler where it is installed, because that location contains all the libraries and binaries that the header files in your code depend on. If you don’t specify this, you will encounter the following error: Continue reading “Package OpenCV not found? Let’s Find It.” →
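As a rough illustration of "telling the compiler where OpenCV is", the sketch below asks `pkg-config` for the compiler and linker flags of an installed OpenCV. It assumes that `pkg-config` is available and that OpenCV registered itself under the package name `opencv` or `opencv4`. The helper name `opencv_build_flags` is hypothetical, and this is not necessarily the fix the full post describes.

```python
import shutil
import subprocess


def opencv_build_flags(package_names=("opencv4", "opencv")):
    """Return the flags pkg-config reports for OpenCV, or None if it is not found.

    The package names are assumptions: they depend on how OpenCV was installed.
    """
    if shutil.which("pkg-config") is None:
        # pkg-config itself is not installed or not on PATH.
        return None
    for name in package_names:
        result = subprocess.run(
            ["pkg-config", "--cflags", "--libs", name],
            capture_output=True,
            text=True,
        )
        if result.returncode == 0:
            # Include paths (-I...) and libraries (-L..., -l...) for the compiler.
            return result.stdout.strip()
    return None


flags = opencv_build_flags()
if flags is None:
    print("OpenCV was not found through pkg-config")
else:
    print("Pass these flags to your compiler:", flags)
```

If the function returns `None` even though OpenCV is installed, `pkg-config` most likely cannot see the `.pc` file that OpenCV ships, which is the kind of "where is it" problem the post is about.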

Graph-Cuts In Computer Vision

June 30, 2013 / November 1, 2013 ~ Prateek Joshi

Graph-cut theory is used often in computer vision. Graph-cuts are employed to efficiently solve a wide variety of computer vision problems, such as image smoothing, stereo correspondence, and many others that can be formulated in terms of energy minimization. Hold on, energy minimization? It basically refers to finding the equilibrium state. We will talk more about it soon. Many computer vision algorithms involve transforming a given problem into a graph and cutting that graph in the best way. When we say “graph-cuts”, we are specifically referring to models that use max-flow/min-cut optimization. Too much jargon? Let’s just dissect it and see what’s inside, shall we? Continue reading “Graph-Cuts In Computer Vision” →
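To make the max-flow/min-cut idea concrete, here is a minimal sketch using the Edmonds-Karp algorithm (breadth-first augmenting paths) on a toy graph. The graph is an assumption made up for illustration: two "pixels" `p` and `q`, a source `s` standing for one label and a sink `t` standing for the other, with arbitrary capacities. It is not the formulation from the full post.

```python
from collections import deque


def min_cut(capacity, source, sink):
    """Return (max_flow, source_side) for a directed graph given as nested dicts."""
    # Build the residual graph, adding zero-capacity reverse edges where needed.
    residual = {u: dict(nbrs) for u, nbrs in capacity.items()}
    residual.setdefault(source, {})
    for u, nbrs in capacity.items():
        for v in nbrs:
            residual.setdefault(v, {}).setdefault(u, 0)

    flow = 0
    while True:
        # Breadth-first search for a path with spare capacity.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            break  # no augmenting path left: the flow is maximal

        # Find the bottleneck capacity along the path.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]

        # Push the bottleneck amount of flow along the path.
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        flow += bottleneck

    # Nodes still reachable from the source form the source side of the cut.
    return flow, set(parent)


# Toy capacities (illustrative values only).
graph = {
    "s": {"p": 9, "q": 2},
    "p": {"q": 3, "t": 1},
    "q": {"p": 3, "t": 8},
}
flow, source_side = min_cut(graph, "s", "t")
print(flow, sorted(source_side))  # prints: 6 ['p', 's']
```

The cheapest cut here separates `{s, p}` from `{q, t}` with total capacity 6, so in this toy setup pixel `p` ends up with the source label and `q` with the sink label.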

What Are Conditional Random Fields?

February 23, 2013 / November 1, 2013 ~ Prateek Joshi ~ 2 Comments

This is a continuation of my previous blog post. In that post, we discussed why we need conditional random fields in the first place. Graphical models are widely used in machine learning to solve many different problems, but Conditional Random Fields (CRFs) address a critical problem faced by these models. A popular example of a graphical model is the Hidden Markov Model (HMM). HMMs have gained a lot of popularity in recent years due to their robustness and accuracy. They are used in computer vision, speech recognition and other time-series analysis tasks. CRFs outperform HMMs in many different tasks. How is that? What are these CRFs and how are they formulated? Continue reading “What Are Conditional Random Fields?” →

Why Do We Need Conditional Random Fields?

February 23, 2013 / November 1, 2013 ~ Prateek Joshi ~ 3 Comments

This is a two-part discussion. In this blog post, we will discuss the need for conditional random fields. In the next one, we will discuss what exactly they are and how we use them. The task of assigning labels to a set of observation sequences arises in many fields, including computer vision, bioinformatics, computational linguistics and speech recognition. For example, consider the natural language processing task of labeling the words in a sentence with their corresponding part-of-speech tags. In this task, each word is labeled with a tag indicating its appropriate part of speech, resulting in annotated text. To give another example, consider the task of labeling a video with the mental state of a person based on the observed behavior. You have to analyze the facial expressions of the user and determine if the user is happy, angry, sad, etc. We often wish to predict a large number of variables that depend on each other as well as on other observed variables. How do we accomplish these tasks? What model should we use? Continue reading “Why Do We Need Conditional Random Fields?” →

Derandomization Of RANSAC

February 15, 2013 / November 1, 2013 ~ Prateek Joshi ~ 2 Comments

Let’s say you are a clothes designer and you want to design a pair of jeans. Since you are new to all this, you go out and collect a bunch of measurements from people to see how to size your jeans. One aspect of this project would be to see how the height of a person relates to the size of the jeans you are designing. From the measurements you took, you notice a certain pattern that relates the height of a person to the overall size of the jeans. Now you generalize this pattern and say that for a given height, a particular size is recommended. To deduce the pattern, you just took a bunch of points and drew a line through them so that it is close to all those points. Pretty simple, right? What if there are a few points that are way off from all the other points? Would you consider them while deducing your pattern? You will probably discard them because they are outliers. This was a small sample set, so you could notice these outliers manually. What if there were a million points? Continue reading “Derandomization Of RANSAC” →
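The line-through-points story above can be sketched with plain, randomized RANSAC: repeatedly pick two points, fit a line through them, and keep the line that the most points agree with. This is only the basic randomized version, not the derandomized variant the full post goes on to discuss. The data, the threshold and the function name `ransac_line` are made up for illustration.

```python
import random


def ransac_line(points, iterations=100, threshold=0.5, seed=0):
    """Fit y = slope * x + intercept while ignoring outliers (basic RANSAC)."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    best_model, best_inliers = None, []
    for _ in range(iterations):
        # A line is defined by two points, so sample a minimal set of two.
        (x1, y1), (x2, y2) = rng.sample(points, 2)
        if x1 == x2:
            continue  # vertical line: cannot be written as y = slope * x + intercept
        slope = (y2 - y1) / (x2 - x1)
        intercept = y1 - slope * x1
        # Points whose vertical distance to the line is small count as inliers.
        inliers = [
            (x, y) for x, y in points
            if abs(y - (slope * x + intercept)) <= threshold
        ]
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = (slope, intercept), inliers
    return best_model, best_inliers


# Twenty points on y = 2x + 1 plus three points that are way off.
points = [(x, 2 * x + 1) for x in range(20)]
points += [(5, 40), (10, -15), (15, 60)]

model, inliers = ransac_line(points)
print("slope, intercept:", model)
print("inliers kept:", len(inliers), "of", len(points))
```

Because the outliers never agree with a line drawn through two good points, they are left out automatically, which is what you would otherwise have to do by eye.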

How To Build A Face Detector?

January 8, 2013 / November 2, 2013 ~ Prateek Joshi

Ever wondered if you can build a working face detector quickly? If you are in computer vision, you have probably built a face detector at one time or another. If not, then you will be able to build one by the end of this post. This blog post is not about the concept or the algorithm behind real-time face detection. We will only deal with building one. If you are not willing to get your hands dirty, then this post won’t be very useful to you. Continue reading “How To Build A Face Detector?” →

Ridgelet Analysis

January 3, 2013 / November 2, 2013 ~ Prateek Joshi ~ 1 Comment

The pioneering work of researchers in signal processing paved the way for the powerful concept of multiresolution analysis. This is perhaps best known under the generic name of wavelets. Signals occur in the form of images, voice, radar, sonar, infrared, etc. Different techniques have been developed over the years to understand these signals. Multiresolution analysis provides us with tools to analyze these signals at different levels of resolution. It’s like looking at the same thing using a microscope with different magnifying powers. The formulation of multiresolution analysis moved the signal processing field away from classical Fourier analysis. But are wavelets equally efficient for all shapes? Can we somehow take advantage of the shape of the object? Continue reading “Ridgelet Analysis” →
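To give a feel for "the same signal at different levels of resolution", here is a minimal sketch of a Haar-style decomposition: each level replaces neighbouring samples by their average (a coarser view) and keeps half their difference (the detail that was lost). The unnormalized averaging convention, the sample signal and the function name `haar_pyramid` are assumptions chosen for readability. This illustrates plain wavelets, not ridgelets.

```python
def haar_pyramid(signal):
    """Return a list of (approximation, detail) pairs, from finest to coarsest."""
    n = len(signal)
    if n == 0 or n & (n - 1) != 0:
        raise ValueError("this sketch assumes a signal length that is a power of two")
    levels = []
    approx = list(signal)
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        detail = [(a - b) / 2 for a, b in pairs]   # what the coarser view loses
        approx = [(a + b) / 2 for a, b in pairs]   # the coarser view itself
        levels.append((approx, detail))
    return levels


signal = [4, 6, 10, 12, 8, 6, 5, 5]
for level, (approx, detail) in enumerate(haar_pyramid(signal), start=1):
    print(f"level {level}: approx={approx} detail={detail}")
# level 1: approx=[5.0, 11.0, 7.0, 5.0] detail=[-1.0, -1.0, 1.0, 0.0]
# level 2: approx=[8.0, 6.0] detail=[-3.0, 1.0]
# level 3: approx=[7.0] detail=[1.0]
```

Each level is like switching to a weaker lens on the microscope: the approximation gets shorter and smoother, while the detail coefficients record what would be needed to zoom back in.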

OpenCV On Mac: How To Get It Up And Running?

December 13, 2012 / December 18, 2013 ~ Prateek Joshi ~ 19 Comments

OpenCV is a computer vision library used extensively by people in the computer vision field. Until a couple of years ago, OpenCV was a bit hacky and its usage was not very straightforward. But determined efforts by multiple companies finally standardized the process, and now it is nice and clean. Computer vision algorithms are computationally intensive, requiring a lot of processing power to run in real time. Before OpenCV came along, efforts were very fragmented and repetitive, and there was no standard library as such. Hence Intel decided to do something about it and came up with OpenCV. The advantage of OpenCV is that the algorithms are highly optimized and the library is available on almost all the popular platforms. I have outlined the procedure below to get OpenCV up and running on your Mac. Continue reading “OpenCV On Mac: How To Get It Up And Running?” →

Panoramic Images

August 18, 2012 / August 19, 2012 ~ Prateek Joshi

Consider a situation where you are standing on top of a mountain or in some other beautiful natural setting. You are enjoying a beautiful view that seems to span from far left to far right and you want to take a nice picture of the whole thing. Your camera allows you to capture only a limited field of view. So to capture the whole scene, you will have to capture multiple images. Doesn’t feel exactly the same watching it in pieces, does it? We really want to capture the beauty within a single image. You can certainly record a video and capture the whole scene, but what if you want to print it out? This is where panoramic photography comes in. Panoramic images are images with an elongated field of view. The image above is one such example. These images cannot be captured with a single camera click because the field of view is limited. So how do we capture panoramic images? Continue reading “Panoramic Images” →

Augmented Reality

August 2, 2012 / August 14, 2012 ~ Prateek Joshi

Augmented Reality (AR) has been one of the most exciting fields to have come into prominence in the last few years. Back when people started working aggressively on computer graphics, great innovations took place. Today, we have 3D movies with high-end computer graphics, but the graphics still live on the screen inside our machines. People then started to think about how to pull the graphics out of the screen and integrate them into the real world. The result of this effort was augmented reality. It tries to blur the line between what’s real and what’s virtual. It enhances our perception of reality. You can take a look at this video to see what I’m talking about. How does this technology work? How does it track the marker? Continue reading “Augmented Reality” →