HTTP vs HTTPS: Latency Comparison

I recently came across the issue of latency differences between HTTP and HTTPS. It got me curious and I started looking into it. To give a quick introduction to those who are new to this, HTTP stands for Hypertext Transfer Protocol and it’s a protocol for communication over the internet. Whenever somebody types something into the address bar of their browser, the browser interprets the address and displays the appropriate page. When you look at the address bar, you usually won’t see the address beginning with “http” because modern web browsers hide it. If you copy that address and paste it into a text file, you will see the full address starting with “http”. The problem with HTTP is that it is susceptible to wiretapping and other kinds of attacks. So people came up with a solution and introduced HTTPS. HTTPS stands for Hypertext Transfer Protocol Secure. As the name suggests, it is secure! It’s the same HTTP protocol layered on top of a security protocol. Now that brings us to the main question. Will this affect internet speed in any way? Will this be an issue when we are dealing with large amounts of traffic on the internet?

Why should I care about this?

To understand why this matters, let’s understand a bit about HTTP. Let’s say you go to your web browser and type something into the address bar. This means that you are requesting a webpage that’s located somewhere in the cloud, typically on a server. In this situation, you are called the “client”, and by typing that address, you just submitted an HTTP request message to the server. Now it’s time for the server to respond. The server looks at the request and fetches resources such as files, images, etc. It then returns a response message to the client, and you will see a bunch of stuff displayed on your browser. Pretty straightforward!
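
If you want to see this exchange spelled out, curl can print it for you. The run below is just an illustrative sketch, assuming you have curl installed; the exact headers you get back will vary:

$ curl -v -o /dev/null -s http://www.google.com
> GET / HTTP/1.1
> Host: www.google.com
< HTTP/1.1 200 OK
< Content-Type: text/html; charset=ISO-8859-1

Lines starting with “>” are the request message your client sends, and lines starting with “<” are the server’s response (trimmed here for brevity).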

Now let’s go to HTTPS. The problem with HTTP is that it is not safe. So if you have sensitive data, you need some form of security. When you type something into your web browser beginning with “https”, you are asking your browser to use an encryption layer to protect the traffic. This provides reasonable protection against eavesdroppers, but the problem is that it will be slower. Since we want to encrypt our traffic, there is some computation involved, which adds to the time. This means that if you don’t design your system correctly, your website will appear sluggish to users. As we all know, people hate waiting, especially on the internet! Okay, so how slow can it possibly be? Well, the interesting thing is that HTTPS takes almost 4 times longer than HTTP to serve the same thing. This ratio actually tends to fluctuate between 3.5 and 4.5 depending on various factors, but it’s a big multiplier nonetheless! So why do we have such a big multiplier? Is the encryption so computationally intensive that it takes this long? Let’s go ahead and find out, shall we?

How do we know it takes 4 times longer?

To measure the latency difference between HTTP and HTTPS, we will take the same request and measure the response times for both. We use something called the Transmission Control Protocol (TCP) to communicate over the internet. HTTP and HTTPS are both carried over TCP. The reason we want to know about TCP is because we need to understand the “handshake”. TCP uses a handshake to establish a connection with a server. It’s how both sides synchronize and confirm that they are ready to exchange data. Now, this TCP handshake is a three-packet event. Wait a minute, what is a “packet”? Think of packets as units of data, and these units flow around all over the internet. Anything that travels through the internet is broken down into packets first. Okay, so during the handshake, the client sends two packets and the server sends one. When you (the client) receive the second packet in the handshake, you reply with an acknowledgment packet and consider the connection open. Exactly one round trip is required before you can send your HTTP request. How do we know this is actually happening? Well, let’s do a quick test.
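
You can actually watch those three packets go by. Here’s a minimal sketch using tcpdump, assuming it’s installed and you can run it as root; start it in one terminal and run the curl command from the next section in another:

$ sudo tcpdump -n -c 3 'tcp port 80 and host www.google.com'

The first three packets should show the flags [S] (the client’s SYN), [S.] (the server’s SYN-ACK) and [.] (the client’s ACK): exactly the three-packet handshake described above.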

Let’s start with TCP and see what happens. We are going to use “curl” and measure the response times. Go to your terminal and type the following:

$ curl -o /dev/null -w "%{time_connect}\n" -s http://www.google.com

This will display the time taken to establish the TCP connection, in seconds. Now run the same command multiple times and note down the timings; a simple loop for this is sketched after the numbers below. Let’s say we run it seven times to get a sense of the average value. You will get something like this:

0.040
0.042
0.041
0.043
0.048
0.039
0.044
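
Here is that loop: a minimal convenience sketch, assuming a POSIX-style shell:

$ for i in 1 2 3 4 5 6 7; do curl -o /dev/null -w "%{time_connect}\n" -s http://www.google.com; done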

Now let’s explicitly ping Google and measure the response time. When we use “ping”, we are measuring exactly one round trip: it sends a small packet and times how long the reply takes to come back. Type the following in your terminal:

$ ping -q -c 7 www.google.com

You will get something like this:

--- www.google.com ping statistics ---
7 packets transmitted, 7 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 37.098/40.333/43.420/2.003 ms

As you can see, the average round-trip time of about 40 ms is almost the same as the connection times we got from curl earlier (curl reports seconds, so 0.042 is 42 ms). That confirms that establishing a plain TCP connection costs about one round trip. What about when we use HTTPS? Let’s check that out:

$ curl -o /dev/null -w "HTTP time taken: %{time_connect}\nHTTPS time taken: %{time_appconnect}\n" -s https://www.google.com
HTTP time taken: 0.042 
HTTPS time taken: 0.163

As we can see, time_connect (the TCP handshake) is still around 42 ms, but time_appconnect (TCP plus the SSL handshake) is 163 ms: roughly a 4x jump in latency just by using HTTPS, and this is before we even send the HTTP request.

Why is this happening?

Let’s dig into this further by analyzing the packets that are being transmitted. This way, we can see exactly what’s happening underneath. We can use a tool called “tcpdump” for this purpose: we sniff the HTTPS traffic with tcpdump while using openssl s_client to connect to the server over SSL and do nothing else. SSL stands for Secure Sockets Layer and it is the security protocol that’s used for HTTPS. If we summarize the whole thing, we see that we need 12 packets for the SSL handshake as compared to 3 packets for TCP alone. This means that it has to go through 3 extra network round trips. We were assuming that HTTPS is slow because of the computational requirements of the security protocol, but as it turns out, the network round trips for SSL contribute significantly to this slowdown.
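
Here’s a minimal sketch of that experiment, assuming tcpdump and openssl are installed and you can run tcpdump as root. Start the capture in one terminal, then open the SSL connection in another:

$ sudo tcpdump -n 'host www.google.com and port 443'
$ openssl s_client -connect www.google.com:443 < /dev/null

Redirecting /dev/null into s_client makes it exit as soon as the handshake completes, so the packets counted in the tcpdump output are just the connection setup (and teardown).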

So what’s the solution here? Do we just use bigger and faster servers?

It seems like an obvious thing to do, right? If SSL is computationally expensive, then we can just use bigger servers to speed this thing up. But the point is that no matter how fast our servers are, if they are not near the user, the first connection will be slow. The encryption part of the SSL handshake takes a minimal amount of time, which means that a significant portion of the time is due to network latency. Wait, isn’t the internet fast and instantaneous? Why should the servers be “near” the user? Well, as it turns out, it matters a lot where your users are coming from.

Let’s say you build a website and a majority of your users are coming from Argentina. Now when you want to host your website somewhere, you have the option of picking the geographical location of your servers. For example, if you go to Amazon Web Services, you can see a list of locations of their data centers. It takes a finite, non-trivial amount of time for packets to travel across the internet. So if you host your website on a server in Oregon (US), then a person visiting the website from the US will have a much faster connection than a person visiting the website from Argentina. This is the reason cloud service providers have data centers all over the world: so that you can optimize your design based on your demographic. Ideally, you would want to host your website on a server located as close as possible to Argentina.
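
You can get a rough feel for this from your own machine with the same curl trick pointed at servers in different regions. As an illustrative sketch, the hostnames below are AWS’s regional EC2 API endpoints (chosen only because their locations are known: Oregon and São Paulo); your absolute numbers will depend on where you are sitting:

$ curl -o /dev/null -w "%{time_connect}\n" -s https://ec2.us-west-2.amazonaws.com/
$ curl -o /dev/null -w "%{time_connect}\n" -s https://ec2.sa-east-1.amazonaws.com/

The gap between the two connection times is mostly just the distance the packets have to travel.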

Fixing this SSL latency is not exactly straightforward. Even if the computationally intensive part is handled by more powerful servers, the only sure way to reduce the network round-trip latency is to be closer to your users and to minimize the total number of round trips. Okay, now we are talking about the protocol itself! It’s there for a reason, and we can see why. But at the same time, it’s adding a lot of latency. Data centers are more expensive in some parts of the world, which means that you will incur those costs if you want to reduce the latency that way. If we optimize the protocol instead, then we can afford to be further away from our users and still not face the wrath of latency, which can save money in the long run.
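
One concrete round-trip saver that SSL already offers is session resumption: the client caches the session negotiated in the first handshake and presents it again, so a reconnect can skip most of the negotiation. A minimal sketch with openssl s_client, assuming openssl is installed (the session file name here is just an example):

$ openssl s_client -connect www.google.com:443 -sess_out session.pem < /dev/null
$ openssl s_client -connect www.google.com:443 -sess_in session.pem < /dev/null

If the server accepts the cached session, the second connection completes an abbreviated handshake with fewer round trips, and the output will report the session as reused instead of new.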
