About Prateek Joshi

I want the machines to see the world ... see the world the way I see it!

How To Setup Nginx For Load Balancing?

Let’s say that you have a nice idea for a website and you want to host it somewhere to make it available to users. To do this, you put your website on a server somewhere in the cloud. Then, you purchase a domain name and point all the requests to this server. But you soon realize that you are getting too much traffic, and that your server won’t be able to handle all of it by itself. So you go ahead and get three more servers. Now you want to make sure all your servers share the load evenly. How will you do that? We would like to avoid the situation where one of the servers gets almost all the traffic and the remaining servers get only a small amount. That wouldn’t serve our purpose here! So how do we handle this?
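As a rough illustration of what "sharing the load" means, here is a minimal Python sketch of round-robin distribution, which is also the default balancing method in nginx. This is a toy simulation, not nginx configuration: the server names and the request count are made up for the example.

```python
from collections import Counter
from itertools import cycle

# Hypothetical pool: the original server plus the three new ones.
servers = ["server1", "server2", "server3", "server4"]

# Round robin: hand each incoming request to the next server in the list,
# wrapping around to the first server after the last one.
next_server = cycle(servers)

# Simulate 1000 incoming requests and count how many each server receives.
assignments = Counter(next(next_server) for _ in range(1000))

for name in servers:
    print(name, assignments[name])
```

With four servers and 1000 requests, each server ends up with 250 requests, so no single machine absorbs the whole load.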

What Is Relative Entropy?

In this blog post, we will be using a bit of background from my previous blog post. If you are familiar with the basics of entropy coding, you should be fine. If not, you may want to quickly read through my previous blog post. Coming to the topic at hand, let’s continue our discussion on entropy coding. Let’s say we have a stream of English letters coming in, and you want to store them in the best possible way by consuming the least amount of space. So you go ahead and build your nifty entropy coder to take care of all this. But what if you don’t have access to all the data? How do you know which letter appears most frequently if you can’t access the full data? The problem now is that you cannot know for sure whether you have chosen the best possible representation. Since you cannot wait forever, you just wait for the first ‘n’ letters and build your entropy coder, hoping that the rest of the data will adhere to this distribution. Do we end up suffering in terms of compression by doing this? How do we measure the loss in quality?
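The loss described above is what relative entropy (also called Kullback-Leibler divergence) measures: the extra bits per symbol you pay for building the coder from an estimated distribution instead of the true one. The sketch below assumes two made-up distributions over four letters; the names `true_dist` and `estimated_dist` are illustrative only.

```python
import math

def relative_entropy(p, q):
    """D(p || q) in bits. Assumes q[s] > 0 for every symbol s with p[s] > 0."""
    return sum(ps * math.log2(ps / q[s]) for s, ps in p.items() if ps > 0)

# Hypothetical true distribution of the full stream.
true_dist = {"A": 0.5, "B": 0.25, "C": 0.125, "D": 0.125}

# Hypothetical estimate built from the first 'n' letters: it looked uniform.
estimated_dist = {"A": 0.25, "B": 0.25, "C": 0.25, "D": 0.25}

# Extra bits per letter paid for coding with the wrong distribution.
extra_bits = relative_entropy(true_dist, estimated_dist)
print(extra_bits)
```

For these numbers the penalty works out to 0.25 bits per letter: the coder built from the estimate spends 2 bits per letter, while the best possible coder for the true distribution needs 1.75.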

What Is Entropy Coding?

Entropy coding appears everywhere in modern digital systems. It is a fundamental building block of data compression, and data compression is needed pretty much everywhere, especially for the internet, video, audio, communication, etc. Let’s consider the following scenario. You have a stream of English letters coming in and you want to store them in the best possible way by consuming the least amount of space. For the sake of discussion, let’s assume that they are all uppercase letters. Bear in mind that you have an empty machine which doesn’t know anything, and it understands only binary symbols, i.e. 0 and 1. It will do exactly what you tell it to do, and it will need data in binary format. So what do we do here? One way would be to use numbers to represent these letters, right? Since there are 26 letters in English, we can convert them to numbers ranging from 0 to 25, and then convert those numbers into binary form. The biggest number, 25, needs 5 bits to be represented in binary form. So considering the worst-case scenario, we can say that we need 5 bits to represent every letter. If we have to store 100 letters, we will need 500 bits. But is that the best we can do? Are we perhaps not exploiting our data to the fullest possible extent?
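The arithmetic above can be checked with a short Python sketch. It also computes the Shannon entropy of a small sample, which is the theoretical lower bound on the average number of bits per symbol. The sample string and the function names are assumptions made for illustration; they are not from the post.

```python
import math
from collections import Counter

def fixed_length_bits(alphabet_size):
    """Bits needed to give every symbol its own fixed-length binary code."""
    return math.ceil(math.log2(alphabet_size))

def shannon_entropy(text):
    """Average information per symbol, in bits, for the symbol frequencies in text."""
    counts = Counter(text)
    total = len(text)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

bits_per_letter = fixed_length_bits(26)  # 26 letters need 5 bits each
total_bits = 100 * bits_per_letter       # 100 letters need 500 bits

# Hypothetical skewed sample: 'A' is far more frequent than 'C' or 'D'.
sample = "AAAAAAAABBBBCCDD"
entropy = shannon_entropy(sample)

print(bits_per_letter, total_bits, entropy)
```

The sample uses only four distinct letters, so a fixed-length code would need 2 bits per letter, yet its entropy is 1.75 bits per letter. That gap is what an entropy coder tries to close by giving frequent letters shorter codes.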

HTTP vs HTTPS: Latency Comparison

I recently came across the issue of latency differences between HTTP and HTTPS. It got me curious and I started looking into it. To give a quick introduction to those who are new to this, HTTP stands for Hypertext Transfer Protocol and it’s a protocol for communication over the internet. Whenever somebody types something into the address bar of their browser, the browser interprets the address and displays the appropriate page. When you look at the address bar, you usually won’t see the address beginning with “http” because modern web browsers hide it. If you copy that address and paste it into a text file, you will see the full address starting with “http”. The problem with HTTP is that it is susceptible to wiretapping and other kinds of attacks. So people came up with a solution and introduced HTTPS. HTTPS stands for Hypertext Transfer Protocol Secure. As the name suggests, it is secure! It’s the same HTTP protocol layered on top of a security protocol. Now that brings us to the main question. Will this affect the internet speed in any way? Will this be an issue when we are dealing with large amounts of traffic on the internet?

The Cost Of Abstraction In Python

People who have been coding for years will tell you that abstraction is always a good thing. It makes the system more robust and tolerant to future changes. Recently, I’ve been working on some Python libraries that wrap C extensions. While working on those libraries, a couple of interesting things came up. This blog post is about one of those things, and how it affects speed. When we talk about abstraction, we tend to think about hierarchies, encapsulation, and robustness in general. But what about the ways in which abstractions are implemented underneath? Different languages implement them in their own way, and those implementations are not always optimal. This is what we are going to talk about. Let’s dive in, shall we?

Homomorphism vs Homeomorphism

Did you get the joke in the picture to the left? If not, you will do so in a few minutes. I was recently reading an article and I came across the terms mentioned in the title. From the looks of it, they are very close to each other, right? In many fields within mathematics, we talk about objects and the maps between them. Now you may ask why we would want to do that. Well, transformation is one of the most fundamental things in any field. For example, how do we transform a line into a circle, or fuel into mechanical energy, or words into numbers? There are infinitely many types of transformations that can exist. Obviously, we cannot account for every single one of them. So we limit ourselves to only the interesting ones. So what exactly is it all about? How does it even relate to the title of this blog post?

Why Is Python Slow?

When people talk about speed in the world of programming languages, they usually center the discussion around compiled vs interpreted languages. In this post, we will discuss two of the most famous languages on this planet, Python and C. I was recently playing around with this and I made a few pleasant discoveries. So I thought I should share them here. One of the most commonly cited reasons for Python being slower than C is Python’s dynamic typing. While it may be true that dynamically typed programming languages are slower than statically typed languages, dynamic typing may not be the major factor slowing down your Python code. The dynamic typing feature of programming languages like Python makes the interpreters harder to optimize. I guess this is the cost of having an extremely beginner-friendly language! But one thing to note is that there is a big difference between the interpreter being harder to optimize and your code being slow. We’ve had years of research focusing on the best way to perform type checking at runtime in these languages, which has made this overhead negligible. So how do we understand why Python code is slower than C code? How do we write Python code that’s not slow?