The theory of graph-cuts is used often in the field of computer vision. Graph-cuts are employed to efficiently solve a wide variety of computer vision problems, such as image smoothing, the stereo correspondence, and many other problems that can be formulated in terms of energy minimization. Hold on, energy minimization? It basically refers to finding the equilibrium state. We will talk more about it soon. Many computer vision algorithms involve transforming a given problem into a graph and cutting that graph in the best way. When we say “graph-cuts”, we are specifically referring to the models which use a max-flow/min-cut optimization. Too much jargon? Let’s just dissect it and see what’s inside, shall we?
What is energy minimization?
The reason behind using the term “energy” is that typical object detection/segmentation tasks are posed as energy minimization problems. We define “energy” to be the term that would capture the solution we desire and perform gradient-descent to compute its lowest value, resulting in a solution for the given problem. Energy is like the information present in the image. For example, compressing an image causes energy-loss (i.e. if we use lossy compression like JPEG). Sometimes the energy can be a negative measure to be minimised and sometimes it is a positive measure to be maximized.
What is a cut?
In graph theory, a cut is a partition of the vertices of a graph into two disjoint subsets. The cut-set of the cut is the set of edges whose end points are in different subsets of the partition. For example, in the graph here, the dotted line represents a cut. The weight of the cut is 4+5+9+7+11+7+3=46. Edges are said to be crossing the cut if they are in its cut-set. In an unweighted undirected graph, the size or weight of a cut is the number of edges crossing the cut. In a weighted graph, the same term is defined by the sum of the weights of the edges crossing the cut.
When we are dealing with, say image segmentation, we treat every pixel as a node on a graph. As mentioned earlier, we first transformed the given problem into a graph. Now we need to cut the image into multiple parts. We have to “cut” the graph in the best possible way. The reason we do this is because graph theory is very well developed branch of mathematics with robust formulations and it gives really optimized results.
What is minimum-cut?
A cut is minimum if the size of the cut is not larger than the size of any other cut. In the above image, the minimum cut of the weighted graph G is the the cut of the graph with minimum weight. The minimum cut between two vertices v and w in G is the minimum weight cut of G that puts v and w in different partitions. In the graph given here, the dotted line represents the minimum cut that separates the source from the sink. The max-flow min-cut theorem proves that the maximum network flow and the sum of the cut-edge weights of any minimum cut that separates the source and the sink are equal. Graph-cuts formulate a given problem as a graph problem and use minimum cut to get the optimized result. It is widely used for image segmentation.