Why Is Python Slow?

1 mainWhen people talk about speed in the world of programming languages, they usually center the discussion around compiled vs interpreted languages. In this post, we will discuss two of the most famous languages on this planet, Python and C. I was recently playing around with this and I made a few pleasant discoveries. So I thought I should share them here. One of the biggest reasons as to why Python is slower than C is because of the dynamic typing feature in Python. While it may be true that dynamically typed programming languages are slower than statically typed languages, it may not be the major factor slowing down your Python code. The dynamic typing feature of programming languages like Python makes the interpreters harder to optimize. I guess this is the cost of having an extremely beginner-friendly language! But one thing to note is that there is a big difference between interpreter being harder to optimize and your code being slow. We’ve had years of research focusing on the best way to perform type checking at runtime in these languages, thus making this overhead negligible. So how do we understand why Python code is slower than C code? How do we write Python code that’s not slow?  

The root of all evil

The problem when we write Python code is that we just write it very differently as compared to C. We tend to be a bit too relaxed as far as type safety is concerned. Statically typed languages like C or C++ don’t allow you to just randomly pass data around without checking for types first. I’ve discussed more about static and dynamic typing here. When it comes down to it, the data structures and the algorithms used underneath are the biggest reasons for your Python code being slower than C code. The thing is that this can happen without the programmer’s knowledge. Let’s take a simple example in Python:

rectangle = {'x': 0, 'y': 0, 'width': 60, 'height': 40}

This looks really clean, right? It’s easy on the eye, and it’s just elegant! Now let’s do the same thing in C:

struct rectangle {
    int x;
    int y;
    int width;
    int height;
};

This looks elegant too, right? One can argue that this is as easy to work with as the Python code we just looked at. So far, so good? Okay, let’s continue.

So what’s the problem?

The thing to note here is that C uses a completely different data structure. In C, we tell the compiler that each rectangle object would have exactly 4 fields. Also, we specify the type, and hence the allowed size for these variables. The compiler will utilize all this knowledge and allocate them near each other as a single contiguous memory block. In other words, the structure would be similar to an array. The compiler would know exactly where x, y, width and height fields of a given rectangle are at any time. We could also access both of these fields in memory easily, just by looking some constant offset away from where the pointer itself is.

2 Contiguous-AllocationOn the other hand, Python uses a hash map data structure to perform the same task. As you can see in our code here, we defined rectangle to be a dictionary, which translates to a hash map underneath. The interpreter simply cannot preallocate a contiguous block of memory to store all the fields next to each other to make them easily accessible. Now why is that? Well, because we haven’t specified any datatype, which means that the interpreter has no knowledge about the size of these fields. So it has to allocate enough space assuming the worst case scenario, which means they cannot be stored in a contiguous block of memory. We could also delete these keys if we want to, which means the interpreter doesn’t know about the number of keys present in the dictionary at any time. The interpreter must use a hash function to be able to map any input you could throw at it to a memory location. Needless to say, these functions add a some overhead to the calculation. While it is usually small, it may as well be significant enough to slow down your code, especially if you have to do a lot of them.

What about C++?

If we were to directly translate the Python code to C++, we would do something like this:

std::unordered_map<std::string, int> rectangle;
rectangle["x"] = x
rectangle["y"] = y
rectangle["width"] = width
rectangle["height"] = height

As we can see here, C++ uses a hash map only and only if we specify it. People wouldn’t just use it by accident! If we look at this code, we can immediately see that an expensive hash map is being used. In Python, this looks deceptively simple, which makes it hard to spot.

Why do we let it happen in Python?

The reason for this is that the Python programmers tend to think of dictionaries as lightweight objects. I mean, look at how easy they are to use in Python! Let’s see if we can make Python code look like the C struct:

class Rectangle(object):
    __slots__ = ('x', 'y', 'width', 'height')

    def __init__(self, x, y, width, height):
        self.x, self.y, self.width, self.height = x, y, width, height

Here, the second line restricts the variables that this class can contain. Without this, people can instantiate this class and randomly add variables to this class. If you ask an object oriented programmer, that’s just blasphemy! The good thing about this representation is that it provides useful hints to the compiler, just like the C struct does. For instance, in the second line, we are explicitly telling the interpreter that our Rectangle object will always have only 4 fields x, y, width and height when it is created, hoping it would be able to deal with them.

Does it help?

The standard Python interpreter doesn’t make much use of this representation. But the good thing is that some Python implementations are specifically designed to be efficient, such as PyPy. Using a class is much faster than using a dictionary under this JIT-capable interpreter. In other words, PyPy is able to use correct data structure here. The interesting thing to note here is that we can be really fast if we use the right Python implementation, while having nice features like dynamic typing still built into it. All of this, by just using a better data structure, both in the code and in the implementation. Next time, when you are writing a line of code in Python, take a moment to think about the data structure you are using, both explicitly and implicitly, and ask yourself if there is a better one.

———————————————————————————————————

3 thoughts on “Why Is Python Slow?

Leave a comment