Before we start, I want to clarify that this post is not about treasure hunting! As you read along, the title will start making sense. In one of my previous blog posts, I discussed speech recognition and a few ways to model the problem. I have also talked about how we can use machine learning to solve various real-life problems. A lot of times, we need to model temporal events. Temporal events are things that happen over a period of time. Sometimes we know everything about a system, and so we can just predict what’s going to happen next. What if we don’t know everything about a system? What if we can only see the effects of that system? Can we learn about a system even though we cannot directly observe what’s happening inside?
Okay, I read the first paragraph and the title doesn’t make sense.
To demonstrate, let’s check out this conversation between Walter and Michael. Walter is a mathematician and Michael is his friend who refuses to acknowledge that complex mathematical models are required in real life. So Walter has taken it upon himself to prove Michael wrong.
Walter: Look at this, I have a black box here.
Michael: Oh! And what’s so special about that?
Walter: Well, this black box has a heating facility inside and the temperature keeps varying.
Michael: Alright. So who’s controlling the temperature?
Walter: That’s the problem. It varies on its own. We want to know what the temperature is at all times, but we cannot go inside and measure it.
Michael: Why did you get this box then?
Walter: The interesting thing about this box is that it emits a small spherical object every now and then. This object is sensitive to heat and changes its color according to temperature.
Michael: That’s interesting!
Walter: Now we need to determine how exactly the temperature varies so that we can actually put it to use.
The obvious way to determine the temperature variations would be to look at the color of the emitted object. We are not measuring the temperature directly; we are using the color of the object to infer it. The object might occasionally behave erratically and change its color at random, but the chances of that are small. So if you see that the color is red, you can be fairly sure that the current temperature is high.
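To put a number on "fairly sure", here is a minimal sketch using Bayes' rule. Everything in it is an assumption made up for illustration: the two temperature levels ("hot" and "cold"), the two colors, and all the probabilities. The story itself doesn't give any figures.

```python
# All numbers are invented for illustration; the story gives no figures.
# P(temperature) before we see any object.
prior = {"hot": 0.5, "cold": 0.5}

# P(color | temperature): the object usually, but not always,
# shows the color that matches the temperature.
emission = {
    "hot": {"red": 0.9, "blue": 0.1},
    "cold": {"red": 0.2, "blue": 0.8},
}

def posterior(color):
    """Return P(temperature | color) using Bayes' rule."""
    unnormalized = {t: prior[t] * emission[t][color] for t in prior}
    total = sum(unnormalized.values())
    return {t: p / total for t, p in unnormalized.items()}

print(posterior("red"))
```

With these made-up numbers, seeing a red object pushes the probability of "hot" from 50% to roughly 82%. The erratic behavior of the object is what keeps it from reaching 100%.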
Where are we going with this?
This blog post is about Hidden Markov Models (HMMs). Does the title make sense now? The image in the beginning of this post doesn’t look so random now, does it? HMMs are statistical models of systems whose states cannot be observed directly. They are very useful for modeling sequential data in areas such as speech recognition, gesture recognition, bioinformatics and so on. The parameters of these models are learned from data. Given a lot of training data, HMMs can often model a temporal system efficiently and accurately.
What are Markov Models and what exactly is “hidden”?
Markov models are statistical models which obey the Markov property. The Markov property says that the next state of a system depends only on its current state, and not on the states that came before it. Let’s consider Walter, who is traveling around Europe. Given that he is a mathematician, he will plan his trip efficiently. It doesn’t matter where he was 10 days ago; his next destination will only depend on where he is today. If he is in Lisbon, it’s more likely that he will visit Barcelona before he goes to Rome. It doesn’t matter that he was somewhere close to Rome 10 days ago. He will only consider where he is today to plan for tomorrow. To give a counterexample, consider adding sugar to water. If you add sugar to water now, it will have an effect till the very end. You cannot work out the sweetness just by considering how much sugar you added a moment ago. You have to think about how much you added two hours ago or maybe two days ago, because that will also have an effect. This is an example of a system which doesn’t obey the Markov property.
Markov models use a random variable to model the state of a system which changes through time. Contrast this with a light bulb: once you switch it on, it stays on. Its state will not change until you manually do something. But in the black box example earlier, the temperature of the box keeps changing on its own. Markov models are very useful in modeling such systems.
The “hidden” part in HMMs is the state. We cannot directly observe the state of some systems, and hence we need HMMs to model them and estimate the state of that system at any given time. In our black box example, the temperature of the box is the state of that system.
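Walter's trip can be sketched as a small Markov chain. The three cities come from the example above, but the transition probabilities are hypothetical numbers picked for illustration. The point to notice is that `next_city` only ever looks at the current city and never at the rest of the trip.

```python
import random

# Hypothetical transition probabilities: P(next city | current city).
# Each row sums to 1.
transitions = {
    "Lisbon": {"Barcelona": 0.7, "Rome": 0.2, "Lisbon": 0.1},
    "Barcelona": {"Rome": 0.6, "Lisbon": 0.2, "Barcelona": 0.2},
    "Rome": {"Barcelona": 0.5, "Lisbon": 0.3, "Rome": 0.2},
}

def next_city(current, rng):
    """Pick tomorrow's city using only today's city (the Markov property)."""
    options = transitions[current]
    return rng.choices(list(options), weights=list(options.values()))[0]

def simulate_trip(start, days, rng):
    """Return the list of cities visited, starting from `start`."""
    trip = [start]
    for _ in range(days):
        trip.append(next_city(trip[-1], rng))
    return trip

# A seeded generator makes the random trip repeatable.
print(simulate_trip("Lisbon", 5, random.Random(0)))
```

The sugar example would not fit this shape: to compute the sweetness you would need the whole history of additions, not just the latest one.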
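Here is a rough simulation of the black box, to show what "hidden" means in practice. The two temperature levels, the two colors, and every probability below are assumptions for the sake of the sketch. Inside the function both the temperature and the color are known, but an observer like Walter only gets to see the second list.

```python
import random

# Made-up parameters for illustration.
start = {"hot": 0.5, "cold": 0.5}            # P(first temperature)
transition = {                               # P(next temperature | current)
    "hot": {"hot": 0.8, "cold": 0.2},
    "cold": {"hot": 0.3, "cold": 0.7},
}
emission = {                                 # P(color | temperature)
    "hot": {"red": 0.9, "blue": 0.1},
    "cold": {"red": 0.2, "blue": 0.8},
}

def draw(dist, rng):
    """Sample one key from a {outcome: probability} dictionary."""
    return rng.choices(list(dist), weights=list(dist.values()))[0]

def run_box(steps, rng):
    """Return (hidden temperatures, observed colors) for `steps` emissions."""
    hidden, observed = [], []
    state = draw(start, rng)
    for _ in range(steps):
        hidden.append(state)
        observed.append(draw(emission[state], rng))
        state = draw(transition[state], rng)
    return hidden, observed

hidden, observed = run_box(8, random.Random(42))
print("hidden  :", hidden)    # what is really going on inside the box
print("observed:", observed)  # all that Walter and Michael can see
```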
What was the point of that Walter-Michael story earlier?
The whole formulation in that story is conceptualized in the form of Hidden Markov Models. We have something called ‘states’ and ‘outputs’. The temperature of the box corresponds to ‘states’ and the color of the spherical object corresponds to ‘outputs’. We used the color of the emitted object to guess the temperature of the box. This is the very foundation of HMMs. Of course there is a lot of mathematical background involved, and these formulations get very complex when used in real-life applications. These models have become an integral part of many advanced technologies. If you start digging deep into the field of your interest, you will be surprised at how often you will come across Markov Models!
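For the curious, guessing the temperatures from a whole sequence of colors can be sketched with the Viterbi algorithm, a standard dynamic-programming method for HMMs. This post doesn't cover it, so treat this as one possible approach rather than the only one. The states, colors and probabilities are again made-up assumptions.

```python
# Made-up parameters for illustration.
start = {"hot": 0.5, "cold": 0.5}            # P(first temperature)
transition = {                               # P(next temperature | current)
    "hot": {"hot": 0.8, "cold": 0.2},
    "cold": {"hot": 0.3, "cold": 0.7},
}
emission = {                                 # P(color | temperature)
    "hot": {"red": 0.9, "blue": 0.1},
    "cold": {"red": 0.2, "blue": 0.8},
}

def viterbi(colors):
    """Return the most likely temperature sequence for the observed colors."""
    # best[s] = (probability of the best path ending in state s, that path)
    best = {s: (start[s] * emission[s][colors[0]], [s]) for s in start}
    for color in colors[1:]:
        new_best = {}
        for s in start:
            # Choose the previous state that makes arriving in s most likely.
            prob, path = max(
                (best[p][0] * transition[p][s], best[p][1]) for p in start
            )
            new_best[s] = (prob * emission[s][color], path + [s])
        best = new_best
    return max(best.values())[1]

print(viterbi(["red", "red", "blue", "blue"]))
# ['hot', 'hot', 'cold', 'cold']
```

For long sequences, real implementations usually work with log-probabilities so that the products don't shrink towards zero. This sketch skips that to stay short.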