Git is one of the most popular version control systems out there. If you are a programmer, you have probably used it at one time or another. This post deals with a couple of specific commands, so if you are new to git, it may not make much sense to you. Git offers a variety of powerful commands to control source code and collaborate with your peers on software projects. Every time I have to tinker with git, I tend to learn something new. In this post, we will discuss the difference between ‘git fetch’ and ‘git pull’.
Why do we care about this?
Before we look into git, we need to understand why we have it in the first place. We need to understand the idea behind git and see how it’s different from a more traditional source control tool like Subversion, popularly referred to as svn. When people realized that they needed a tool to collaborate, they built something with a client/server model. Subversion is a very good example of such a system. What this means is that there is a single server, and several clients can fetch code from it whenever needed. The clients are usually the people working on the project. They fetch the code, make the required changes, then send them back to the server. Pretty straightforward! In this situation, we are assuming that the clients can talk to the server whenever they want.
Now it seems like a good system, right? I mean, everybody can access the code and we have all the history stored on the server. What’s the problem? Well, there are a few problems that you can face with svn. Every commit has to go through the central repository, which tends to push development toward a single, linear timeline. The software systems of today are very complex, and their development cannot stick to a linear timeline. We need something that can support non-linear, distributed development. This is where git comes into the picture. Git was designed to support a more distributed model with no need for a central repository, although you can certainly use one if you want. Here, you don’t need to be online to work on your code. You can keep making changes and commit them anytime you want. This is in contrast with svn, where committing requires a connection to the server. If you are using git, you can exchange code even through email or a flash drive, and still keep all the history intact.
How does git do it?
Now that we know what git is capable of doing, you must be wondering how it does it all. Well, git maintains a local repository that holds your code along with its full history. Inside that same repository, it also keeps a local snapshot of the branches in the remote repository, known as remote-tracking branches (origin/master, for example). Now why would git do that? This comes in pretty handy when you are offline or the remote repository is not reachable. Because the snapshot records what the remote looked like the last time you talked to it, git can figure out which changes are new on your side without contacting the remote. When you actually need to send the changes to someone else, git can transfer them as a set of changes from a point in history that the remote already knows. It’s pretty powerful that way!
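To make the local mirror of the remote a bit more concrete, here is a minimal sketch you can run in a throwaway directory. The names remote.git and work are made up for this demo, and a bare repository on disk stands in for a real server, so treat it as an illustration and not a recommended workflow.

```shell
#!/bin/sh
# Sketch: inspect the local snapshot of a remote in a throwaway sandbox.
# The names remote.git and work are hypothetical.
set -e
sandbox=$(mktemp -d)
cd "$sandbox"

# Demo identity, so commits work without any global git configuration.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q --bare remote.git      # stands in for the shared remote
git clone -q remote.git work       # our local repository
cd work

echo "first version" > A
git add A
git commit -q -m "Add A"
git push -q -u origin HEAD         # publish; origin/<branch> now matches HEAD

# Commit locally without pushing. No contact with the remote is needed.
echo "second version" > A
git commit -q -a -m "Update A"

# The remote-tracking branch still points at the last state we pushed.
git branch -r

# Count the commits we have that the snapshot of the remote does not.
unpushed=$(git rev-list --count '@{u}..HEAD')
echo "commits not yet on the remote: $unpushed"
```

The last two commands only read local data, which is why they would keep working with the network unplugged.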
What’s the difference?
We are almost at the end and we still haven’t discussed the difference between ‘git fetch’ and ‘git pull’. That’s what you are thinking, right? Well, as it turns out, we already have. It’s just that we used a verbal description instead of the actual names of the commands. It will become clear in a minute.
As we know, git maintains your own code history as well as a local copy of the state of the remote repository. ‘git fetch’ is the command that brings that local copy of the remote up to date. We need this because somebody else might have made changes to the code and you want to keep yourself updated. On its own, ‘git fetch’ does not touch your own branches or the files you are working on. ‘git pull’ brings the changes in the remote repository to where you keep your own code. Normally, it does this by doing a ‘git fetch’ first to bring the local copy of the remote up to date, and then merging those changes into your current branch, which also updates your working copy.
For example, let’s say you have two files, A and B. You are working on A and somebody makes some changes to B in the repository. Now ‘git pull’ will merge your changes on A with the remote changes to B. Now wait a minute, wouldn’t this be a problem if I am not ready to merge my code yet? Why doesn’t ‘git pull’ ask me before doing this? That’s a totally valid concern. We will discuss more about it in our next blog post.
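The A and B example above can be tried out in a sandbox. The sketch below is one possible way to set it up: remote.git, alice and bob are hypothetical names, a bare repository on disk plays the role of the remote, and the two flags passed to ‘git pull’ are only there to keep the demo non-interactive.

```shell
#!/bin/sh
# Sketch: compare `git fetch` and `git pull` in a throwaway sandbox.
# The names remote.git, alice and bob are hypothetical.
set -e
sandbox=$(mktemp -d)
cd "$sandbox"

# Demo identity, so commits work without any global git configuration.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q --bare remote.git
git clone -q remote.git alice
(
  cd alice
  echo "a1" > A
  echo "b1" > B
  git add A B
  git commit -q -m "Add A and B"
  git push -q -u origin HEAD
)

git clone -q remote.git bob        # bob starts with both files

# Somebody else (alice) changes B and publishes it.
(
  cd alice
  echo "b2" > B
  git commit -q -a -m "Change B"
  git push -q origin HEAD
)

cd bob
echo "a1, edited by bob" > A       # bob works on A
git commit -q -a -m "Edit A"
head_before=$(git rev-parse HEAD)

# fetch: only the remote-tracking branch moves.
git fetch -q
head_after_fetch=$(git rev-parse HEAD)
b_after_fetch=$(cat B)
incoming=$(git rev-list --count 'HEAD..@{u}')
echo "after fetch: B=$b_after_fetch, incoming commits=$incoming"

# pull: fetch again, then merge into the current branch.
# --no-rebase asks for a merge explicitly (recent git versions refuse to
# guess when both sides have new commits); --no-edit accepts the default
# merge message instead of opening an editor.
git pull -q --no-rebase --no-edit
head_after_pull=$(git rev-parse HEAD)
b_after_pull=$(cat B)
echo "after pull: B=$b_after_pull, A=$(cat A)"
```

After the fetch, bob’s branch and files are untouched and the incoming change to B is only visible through the remote-tracking branch. After the pull, B carries the remote change while bob’s edit to A is preserved.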