There are many phenomena in everyday life that are very difficult to model. There are so many variables and so many dependencies that any approximation or assumption can lead to huge errors in the outputs. This is usually a combination of uncertainty and variability. Even when we have access to all the historical information, we can’t accurately predict a future outcome because the model itself is inaccurate. This becomes especially relevant when we are dealing with systems where the degrees of freedom depend on each other. Examples include the movement of fluids and the kinetic modeling of gases. How do we compute the possible outcomes? How can we assess the impact of all the free variables so that we can predict the outcome under uncertainty?
What’s the answer?
This is where Monte Carlo simulation comes into the picture. It refers to a class of computational algorithms that obtain results through repeated random sampling. The central idea is to specify the range of inputs and let the algorithm repeatedly sample this input space to compute the output. If you repeat the experiment enough times, the aggregated output will tend to be close to the real output, provided the model and the input ranges are reasonable. Monte Carlo simulation provides us with a range of possible outcomes. These outcomes are associated with probabilities, so we can see how likely it is that a particular course of action turns out a certain way. This analysis lets us see what the outcome looks like if we are too conservative as well as too aggressive.
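To make the idea of repeated random sampling concrete, here is a minimal sketch using the classic textbook exercise of estimating pi. This example is not from the discussion above; the function name `estimate_pi` and the fixed seed are assumptions chosen for illustration. Points are sampled uniformly from the unit square, and the fraction that lands inside the quarter circle approximates pi/4.

```python
import math
import random


def estimate_pi(n_samples, seed=0):
    """Estimate pi by sampling points uniformly in the unit square."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    inside = 0
    for _ in range(n_samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:  # point falls inside the quarter circle
            inside += 1
    # The quarter circle covers pi/4 of the unit square.
    return 4.0 * inside / n_samples


if __name__ == "__main__":
    for n in (100, 10_000, 100_000):
        estimate = estimate_pi(n)
        error = abs(estimate - math.pi)
        print(f"{n:>7} samples: {estimate:.4f} (error {error:.4f})")
```

The error typically shrinks as the number of samples grows, which is the behavior described above: more repetitions tend to bring the estimate closer to the real value.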
Where is it used?
Monte Carlo simulation is a hugely popular technique that finds applications across many fields like energy, environmental science, finance, insurance, beverage processing, logistics, and so on. Some of the specific use cases include predicting pipe failures in oil and gas, designing heat shields, analyzing variations in circuits, modeling radiation transport, predicting the energy output of a wind farm, determining the position of a robot in uncertain terrain, modeling fluid flows, studying biological systems like genomes and proteins, modeling climate change, rendering a 3D scene by tracing the possible light paths, planning search and rescue operations, and so on. There are way too many applications to list! Once you realize the power of Monte Carlo simulation, you will see that it can be applied to pretty much any problem where there’s uncertainty.
How does it work?
Monte Carlo simulation works by building models of possible outcomes, substituting a range of values for any input parameter that has uncertainty. This range of values corresponds to a probability distribution. This means that we have to specify a probability distribution for each uncertain input, and the values are then chosen randomly from it. Once the values are chosen, the algorithm computes the result for this set of values. This experiment is repeated over and over again, each time with a different set of values chosen randomly from the probability distributions.
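The steps above can be sketched in a few lines. Everything specific here is a hypothetical example rather than a real model: the profit formula, the input names (`units_sold`, `unit_cost`, `price`), and the distributions and their parameters are all assumptions picked only to show the sample, compute, repeat loop.

```python
import random
import statistics


def simulate_profit(rng):
    """One trial: draw each uncertain input, then evaluate the model."""
    # Hypothetical inputs and distributions, chosen for illustration only.
    units_sold = rng.triangular(800, 1500, 1000)  # low, high, most likely
    unit_cost = rng.gauss(6.0, 0.5)               # mean, standard deviation
    price = 10.0                                  # treated as known
    return units_sold * (price - unit_cost)


def run_simulation(n_trials, seed=0):
    """Repeat the trial many times and collect every outcome."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    return [simulate_profit(rng) for _ in range(n_trials)]


if __name__ == "__main__":
    outcomes = run_simulation(10_000)
    print(f"mean profit: {statistics.mean(outcomes):.0f}")
    print(f"std dev:     {statistics.stdev(outcomes):.0f}")
```

The list returned by `run_simulation` is the raw material for everything else: each entry is one possible outcome, and together they approximate the distribution of outcomes.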
A real-world Monte Carlo simulation usually involves tens of thousands of computations. In the end, it produces a probability distribution of possible outcomes. Instead of trying to come up with an exact number, it comes up with probabilities, which is a much more realistic way of modeling the real world. There are a lot of uncertainties associated with the inputs, so we need to know how various scenarios are going to impact us. When we are dealing with big infrastructure like oil pipes, we need to be very careful about how we choose our parameters. If we are too aggressive, the pipe might burst and kill people. If we are too conservative, we might end up losing billions of dollars because of suboptimal operations. We need to know how thousands of free parameters will impact the pipe in the future in order to make an informed decision.
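As a rough illustration of how a set of simulated outcomes turns into probabilities, the sketch below summarizes trials with percentiles and with the chance of crossing a threshold. The project model (three task durations drawn from uniform ranges) and the 11-day deadline are hypothetical values assumed for the example.

```python
import random
import statistics


def simulate_total_days(rng):
    """One trial of a hypothetical three-task project (durations in days)."""
    return rng.uniform(2, 4) + rng.uniform(3, 6) + rng.uniform(1, 3)


def run(n_trials, seed=0):
    rng = random.Random(seed)  # seeded so that runs are repeatable
    return [simulate_total_days(rng) for _ in range(n_trials)]


def summarize(outcomes, deadline):
    """Describe the outcome distribution instead of a single number."""
    cuts = statistics.quantiles(outcomes, n=100)  # 99 percentile cut points
    late = sum(1 for days in outcomes if days > deadline)
    return {
        "p5": cuts[4],     # optimistic end of the range
        "p50": cuts[49],   # median outcome
        "p95": cuts[94],   # pessimistic end of the range
        "prob_late": late / len(outcomes),
    }


if __name__ == "__main__":
    summary = summarize(run(20_000), deadline=11.0)
    for name, value in summary.items():
        print(f"{name}: {value:.3f}")
```

Reporting a range (5th to 95th percentile) together with the probability of missing the deadline is one simple way to express the "too conservative versus too aggressive" trade-off in numbers.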
How is it better than just estimating the underlying parameters governing the data?
When we have a bunch of variables, our first instinct is to model the data by extracting the underlying parameters. A simple example would be to assume that the data is Gaussian and estimate the mean and variance from it. This technique is called single point estimation. We deterministically estimate each individual parameter to model our data. But data in the real world is not so deterministic. It has a lot of uncertainty associated with it, and we often don’t have any idea about the interdependencies, which adds to the complexity.
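One way to see the limitation is to compare a calculation that plugs in a single best-guess value with a simulation over the whole input range. The sketch below is a hypothetical example, not a model from this article: the travel-time formula, the 120 km distance, and the uniform speed range of 30 to 90 km/h are assumptions for illustration.

```python
import random
import statistics

DISTANCE_KM = 120.0  # hypothetical trip length


def travel_hours(speed_kmh):
    """A simple nonlinear model: time = distance / speed."""
    return DISTANCE_KM / speed_kmh


def point_estimate():
    """Plug in the single average speed (midpoint of 30 to 90 km/h)."""
    return travel_hours(60.0)


def monte_carlo(n_trials, seed=0):
    """Sample the speed from its whole range and evaluate every trial."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    return [travel_hours(rng.uniform(30.0, 90.0)) for _ in range(n_trials)]


if __name__ == "__main__":
    outcomes = monte_carlo(100_000)
    print(f"point estimate: {point_estimate():.2f} h")
    print(f"simulated mean: {statistics.mean(outcomes):.2f} h")
    print(f"simulated range: {min(outcomes):.2f} to {max(outcomes):.2f} h")
```

In this toy model the point estimate is exactly 2 hours, while the simulated mean comes out at about 2.2 hours with outcomes spread between roughly 1.3 and 4 hours. The single number hides both the spread and the shift caused by the nonlinear model.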
When we compute any outcome in Monte Carlo simulation, not only do we know whether it’s feasible, but we also know how likely it is to occur. This allows us to create visualizations of the different possible outcomes along with their associated probabilities. This is a very useful tool for communicating with people who might not be well versed in all the data. With the single point estimation technique, it’s difficult to see how the different input variables affect the outcome and how much each one contributes individually. This is in contrast with Monte Carlo simulation, where we can see which inputs had the biggest impact on the outcomes.
Single point estimation techniques make it very difficult to model how various combinations of input values give rise to different scenarios. Monte Carlo simulation, on the other hand, allows us to see exactly which combination of inputs led to a particular outcome. It’s possible to model the interdependency between input variables, which is very important in the real world.
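A small sketch can show both ideas: modeling a dependency between inputs, and tracing outcomes back to the inputs that produced them. The model below is hypothetical (the names `temperature`, `demand`, `unit_cost`, and all the numbers are assumptions), and ranking inputs by their correlation with the outcome is just one simple proxy for impact, not the only way to measure it.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def run_trials(n_trials, seed=0):
    """Record the sampled inputs alongside the outcome of every trial."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    trials = []
    for _ in range(n_trials):
        temperature = rng.gauss(20.0, 5.0)
        # Dependent input: demand rises with temperature, plus its own noise.
        demand = 100.0 + 3.0 * (temperature - 20.0) + rng.gauss(0.0, 10.0)
        unit_cost = rng.gauss(2.0, 0.1)
        outcome = demand * (5.0 - unit_cost)  # hypothetical profit model
        trials.append({"temperature": temperature, "demand": demand,
                       "unit_cost": unit_cost, "outcome": outcome})
    return trials


def rank_inputs(trials):
    """Rank inputs by the strength of their correlation with the outcome."""
    outcome = [t["outcome"] for t in trials]
    names = [name for name in trials[0] if name != "outcome"]
    scores = {name: pearson([t[name] for t in trials], outcome)
              for name in names}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


if __name__ == "__main__":
    trials = run_trials(5_000)
    for name, score in rank_inputs(trials):
        print(f"{name:<12} correlation with outcome: {score:+.2f}")
    worst = min(trials, key=lambda t: t["outcome"])
    print("inputs behind the worst outcome:", worst)
```

Because every trial keeps its inputs, any outcome of interest (here the worst one) can be traced back to the exact combination of values that produced it.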