If you are a techie, regardless of your field, you have probably heard the term “proxy server”. If you are a web developer, you hear it a lot! Off the top of your head, what do you think a proxy server is? It’s very similar to what happens when one person proxies for another, that is, acts on that person’s behalf. Remember the good old college days when this used to happen a lot? The professors frowned upon it back then, but now it makes for a good analogy to discuss an important concept. So why would we need a proxy server? Why can’t we just talk directly to the actual server and leave this whole proxy thing aside?
Why do we care about this?
Before we begin, let’s see how a server works. When you connect to the internet, you basically open the browser and click on some link. Internally, what happens is that your browser sends a request to a server about the link. The server will fetch that information and send it back. Once the browser receives the information, it will display that for you.
Now, how do you know what server to connect to? Servers go down all the time, and you shouldn’t have to keep tabs on which server is up right now. Also, if you visit the same webpage multiple times, you shouldn’t have to download the same content over and over again. If you are hosting a server and people know exactly where the information is coming from, they may try to hack it. It can also be the other way round, where the content coming into your computer is malicious. What I am getting at is that you need a gatekeeper between your machine and the internet that can handle all of this. This is where a proxy server comes into the picture. It is a critical part of the web infrastructure.
What exactly is a proxy server?
A proxy server acts as a buffer between the user and the servers. It is a computer that allows users to make indirect network connections to the servers. It basically acts like a manager! A client connects to the proxy server, then requests a connection, file, or other resource available on a different server. The proxy provides the resource either by connecting to the specified server or by serving it from a cache. In some cases, the proxy may alter the client’s request or the server’s response for various purposes.
When people talk about a proxy server, they are usually referring to a forward proxy. I mention this because there is also something called a reverse proxy, which we will discuss in the next blog post. A forward proxy provides proxy services to a client or a group of clients. These clients usually belong to a common internal network. When one of these clients tries to connect to a server on the Internet, say a file transfer server, its requests have to pass through the forward proxy first.
Depending on the forward proxy’s settings, a request can be allowed or denied. If allowed, then the request is forwarded to the firewall and then to the file transfer server. From the point of view of the file transfer server, it is the proxy server that issued the request, not the client. So when the server responds, it addresses its response to the proxy. But when the forward proxy receives the response, it recognizes it as a response to the request that went through earlier. So it sends that response to the client that made the request.
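To make that round trip concrete, here is a minimal in-memory sketch in Python. Nothing here touches a real network: the class names `Request`, `OriginServer` and `ForwardProxy`, the IP addresses, the host names and the status codes are all made up for illustration. It only shows the two ideas from the paragraph above: the server sees the proxy as the source of the request, and the proxy remembers which client to hand the response back to.

```python
from dataclasses import dataclass


@dataclass
class Request:
    client_ip: str   # who is really asking
    host: str        # which server the client wants to reach
    path: str        # which resource on that server


class OriginServer:
    """Stand-in for the file transfer server; records who it thinks is calling."""

    def __init__(self):
        self.seen_sources = []

    def handle(self, source_ip, path):
        self.seen_sources.append(source_ip)
        return f"contents of {path}"


class ForwardProxy:
    """Hypothetical forward proxy: allow or deny, forward, then relay the response."""

    def __init__(self, proxy_ip, servers, denied_hosts=()):
        self.proxy_ip = proxy_ip
        self.servers = servers                  # host name -> OriginServer
        self.denied_hosts = set(denied_hosts)

    def handle(self, request):
        # Step 1: the proxy's settings decide whether the request may go out.
        if request.host in self.denied_hosts:
            return (request.client_ip, 403, "denied by proxy policy")
        server = self.servers.get(request.host)
        if server is None:
            return (request.client_ip, 502, "unknown host")
        # Step 2: forward the request. The server only ever sees the proxy's IP.
        body = server.handle(self.proxy_ip, request.path)
        # Step 3: relay the response to the client that made the request.
        return (request.client_ip, 200, body)


ftp = OriginServer()
proxy = ForwardProxy("203.0.113.10", {"ftp.example.com": ftp},
                     denied_hosts={"blocked.example.com"})
print(proxy.handle(Request("10.0.0.5", "ftp.example.com", "/report.txt")))
print(ftp.seen_sources)
```

A real proxy would do this over sockets and keep per-connection state, but the bookkeeping is the same: the client address stays on the proxy's side of the conversation.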
Why do we need proxy servers?
We need them because proxy servers can keep track of requests, responses, their sources and their destinations. Because of that, different clients can send out various requests to different servers through the forward proxy and the proxy will intermediate for all of them. Again, some requests will be allowed, while some will be denied.
As we can see here, the proxy can serve as a single point of access and control, making it easier for you to enforce security policies. You know how companies block sites like Facebook, YouTube, etc. to prevent their employees from slacking off? This is how they do it. A forward proxy is typically used in tandem with a firewall to enhance an internal network’s security by controlling traffic that originates from clients in the internal network and is directed at hosts on the Internet. Thus, from a security standpoint, a forward proxy is primarily aimed at enforcing security on client computers in your internal network.
But then again, client computers aren’t the only machines you find in your internal network. Sometimes, we also have servers. Those servers can act as clients to outside servers too. But when they have to provide services to external clients, e.g. when the contractors you are working with need to access files from your FTP server, a more appropriate solution would be a reverse proxy.
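As a rough illustration of such a policy, the check a proxy runs before letting a request out could look something like the sketch below. The blocklist contents and the function name `is_blocked` are assumptions made for this example, and real products use far richer rules (categories, time of day, user groups). The one detail worth noticing is that the match covers subdomains without accidentally matching unrelated domains that merely end with the same letters.

```python
# Hypothetical company blocklist; the entries are just examples.
BLOCKED_DOMAINS = {"facebook.com", "youtube.com"}


def is_blocked(host, blocked=BLOCKED_DOMAINS):
    """Return True if host is a blocked domain or one of its subdomains."""
    host = host.lower().rstrip(".")   # normalise case and a trailing dot
    return any(host == domain or host.endswith("." + domain)
               for domain in blocked)


for host in ["www.facebook.com", "YouTube.com", "notfacebook.com", "example.org"]:
    print(host, "->", "deny" if is_blocked(host) else "allow")
```

Because every client request passes through the proxy, this one function is enough to apply the rule to the whole internal network.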
How is it used?
A common proxy application is a caching web proxy. This provides a nearby cache of web pages and files available on remote web servers, allowing local network clients to access them more quickly or reliably. When it receives a request for a web resource (specified by a URL), a caching proxy looks for that URL in its local cache. If found, it returns the document immediately. Otherwise it fetches it from the remote server, returns it to the requester and saves a copy in the cache. The cache usually uses an expiry algorithm to remove documents from the cache, according to their age, size, and access history. Two simple cache algorithms are Least Recently Used (LRU) and Least Frequently Used (LFU). LRU removes the documents that have gone unused the longest, and LFU removes the documents that are used least often.
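The lookup, fetch and evict cycle described above can be sketched in a few lines of Python. This is a toy version under some assumptions: the class name `CachingProxy` and the `fetch` callable are invented for the example, the cache is bounded by number of documents rather than by size or age, and eviction is plain LRU built on the standard library's `OrderedDict`.

```python
from collections import OrderedDict


class CachingProxy:
    """Toy caching proxy with LRU eviction (illustrative only)."""

    def __init__(self, fetch, capacity=2):
        self.fetch = fetch            # callable: url -> document (the "remote server")
        self.capacity = capacity      # maximum number of cached documents
        self.cache = OrderedDict()    # least recently used entry comes first

    def get(self, url):
        if url in self.cache:
            # Cache hit: mark as most recently used and return immediately.
            self.cache.move_to_end(url)
            return self.cache[url]
        # Cache miss: fetch from the remote server and keep a copy.
        document = self.fetch(url)
        self.cache[url] = document
        if len(self.cache) > self.capacity:
            # Evict the least recently used document.
            self.cache.popitem(last=False)
        return document


fetched = []  # records every trip to the "remote server"


def fake_fetch(url):
    fetched.append(url)
    return f"<html>{url}</html>"


proxy = CachingProxy(fake_fetch, capacity=2)
for url in ["/a", "/b", "/a", "/c", "/a", "/b"]:
    proxy.get(url)
print(fetched)  # ['/a', '/b', '/c', '/b']
```

Six requests lead to only four fetches: the repeat visits to `/a` are served from the cache, while `/b` has to be fetched again because it was the least recently used document when `/c` arrived. An LFU variant would keep a hit counter per document and evict the one with the lowest count instead.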
Web proxies can also filter the content of web pages served. Some censor-ware applications, which attempt to block offensive web content, are implemented as web proxies. Other web proxies reformat web pages for a specific purpose or audience. For example, some services can reformat web pages for mobile phones. Network operators can also deploy proxies to intercept computer viruses and other hostile content served from remote web pages.
A special case of web proxies is the “CGI proxy”. These are web sites which allow a user to access another site through them. They generally use PHP or CGI to implement the proxying functionality. CGI proxies are frequently used to gain access to web sites blocked by corporate or school proxies. Since they also hide the user’s own IP address from the web sites accessed through the proxy, they are sometimes also used to gain a degree of anonymity.
What are the different types of proxy?
In real life, the people who proxy for you come in a few different kinds: the one who gets caught all the time, the one who is very efficient, the one who is absent sometimes, the one who is present but forgets to proxy, etc. Jokes aside, when we talk about web servers, there are four different types of proxy servers:
- Transparent Proxy: This type of proxy server identifies itself as a proxy server and also makes the original IP address available through the HTTP headers. These are generally used for their ability to cache websites and do not effectively provide any anonymity to those who use them. However, the use of a transparent proxy will get you around simple IP bans. They are transparent in the sense that your IP address is exposed, not in the sense that you are unaware that you are using one.
- Anonymous Proxy: This type of proxy server identifies itself as a proxy server, but does not make the original IP address available. This type of proxy server is detectable, but provides reasonable anonymity for most users.
- Distorting Proxy: This type of proxy server identifies itself as a proxy server, but makes an incorrect original IP address available through the HTTP headers.
- High Anonymity Proxy: This type of proxy server does not identify itself as a proxy server and does not make available the original IP address.
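One way to picture the difference between the four types is to look at what each would add to the request it sends on. The sketch below assumes the proxy announces itself with a `Via` header and passes the client address along in `X-Forwarded-For`. Those are the headers conventionally used for this, but the list above does not name them, so treat that choice, the function name `outgoing_headers`, the proxy name and the addresses as illustrative assumptions.

```python
def outgoing_headers(proxy_type, client_ip, fake_ip="198.51.100.7"):
    """Return the proxy-related headers each (hypothetical) proxy type would add."""
    via = "1.1 proxy.example"  # made-up proxy identifier
    if proxy_type == "transparent":
        # Identifies itself and reveals the real client address.
        return {"Via": via, "X-Forwarded-For": client_ip}
    if proxy_type == "anonymous":
        # Identifies itself but says nothing about the client.
        return {"Via": via}
    if proxy_type == "distorting":
        # Identifies itself and reports a wrong client address.
        return {"Via": via, "X-Forwarded-For": fake_ip}
    if proxy_type == "high anonymity":
        # Adds nothing, so the request looks like it came straight from a client.
        return {}
    raise ValueError(f"unknown proxy type: {proxy_type}")


for kind in ["transparent", "anonymous", "distorting", "high anonymity"]:
    print(f"{kind:15} {outgoing_headers(kind, '10.0.0.5')}")
```

From the server's side, a request through a high anonymity proxy carries no such headers at all, which is exactly why it is hard to tell apart from a direct connection.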