Mastering Git: A Practical Guide to Efficient Version Control for Developers

What is Version Control?

Version control is a system that helps developers manage changes to their code over time. It allows multiple developers to work on the same codebase simultaneously, keep track of changes, and revert to previous versions of the code if necessary. Software development teams rely on it to protect code integrity, improve collaboration, and enable efficient development workflows.

Why do we need Version Control?

Large-scale applications can quickly become complex and difficult to manage. To combat this, we need software that lets us store and move between different versions of our application. Without it, if something went wrong we would have no way to return to the previous version, when everything was working.

Starting a timeline

Imagine that starting a project marks the beginning of a timeline. As you make changes and updates to your project, new points are added to the timeline. You can then pick any point on the timeline and see what the files and folders in your project looked like at that moment. The requirements for this timeline would be as follows:

Basic timeline requirements

Store the content of our files and directories
Add new points to the timeline that reflect our files and directories at that point in time
View the changes we've made since the last point in our timeline
Move to different points in the timeline to see the state of our files and directories at that point in time

git is the version control software that allows us to do all of the above.

git CLI

CLI stands for 'command line interface'.

The git CLI is a set of commands that we can run in our terminal to use git's features. One of the first commands you will use is git init.

git init creates a new git repository - this will be the place where git stores the content of files and folders in each snapshot on your timeline.

$ git init
Initialized empty Git repository in /Users/james/Documents/files/.git/

If we run the command ls -a we can print all the files and directories, including those whose names start with a . (which are normally hidden). Note that ls is a standard shell command rather than a git command; it stands for list and lists the current directory's contents.

$ ls -a
. .. .git

Working directory

The working directory is the directory you are currently working in, with all the files and folders you have immediate access to. It is the directory in which your commands are executed. To find out what your working directory is you can use the pwd command, which stands for 'print working directory'.

`git init`

Say I create an empty directory with no files or folders in it. Then I run git init which creates a new empty git repository. Note that the empty git repository doesn't have anything stored within it yet. Check the snippet below to see how this would look running the commands in our terminal.

$ mkdir files

$ cd files

$ ls

$ git init
Initialized empty Git repository in /Users/james/Documents/files/.git/

$ ls -a
. .. .git
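The same steps can be run as a small script. This is only a sketch: the scratch location created with mktemp is an assumption made so the example never touches a real project, and the variable name project is purely illustrative.

```shell
set -eu

# Hypothetical scratch location, so the sketch stays away from real projects.
project="$(mktemp -d)/files"
mkdir "$project"
cd "$project"

# Create the repository; -q suppresses the "Initialized empty Git repository" message.
git init -q

# The hidden .git directory is where git will store its snapshots.
ls -a

# Show which directory the commands above ran in.
pwd
```

After it runs, the only entry in the new directory besides . and .. is the .git directory.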

I then create a new file in my project called alphabet.txt with some content as shown:

---- alphabet.txt ----
abcdefghijklmnopqrstuvwxyz

So far, nothing has been stored in the git repository (the .git directory). All we have done is update the working directory.

The working directory refers to all of the files and directories we can currently view and edit in our project.

`git status`

git keeps track of when the working directory has been updated, and we can see this information by running the git status command:

$ git status
On branch main

No commits yet

Untracked files:
  (use "git add <file>..." to include in what will be committed)

        alphabet.txt

nothing added to commit but untracked files present (use "git add" to track)

The part we care about in this message is Untracked files. An untracked file is a newly created file that has not yet been stored inside the git repository. To store this content in our git repository we need to use additional git commands.
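For scripts, or when the full message is more than we need, git status also has a compact form. The sketch below assumes a fresh scratch repository created with mktemp; short_status is just an illustrative variable name.

```shell
set -eu

# Scratch repository (assumption: a temporary directory stands in for the project).
cd "$(mktemp -d)"
git init -q

# Create the file from the walkthrough; nothing is stored in .git yet.
printf 'abcdefghijklmnopqrstuvwxyz\n' > alphabet.txt

# --porcelain prints one short line per file; "??" marks an untracked file.
short_status="$(git status --porcelain)"
echo "$short_status"   # prints: ?? alphabet.txt
```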

`git add`

Git uses three different areas to manage the state of our content. To permanently store our changes in the git repository we have to move them through each of these areas in turn.

These are the stages we move our changes through:

Working directory: alphabet.txt
Staging area:
Git repository:

First, we make a change to our working directory like above where we added a file called alphabet.txt.

Then we want to move these changes to something called the staging area which is simply an area inside git where we prepare files to be stored permanently in a snapshot in our git repository. To stage our changes we use the git add command.

We can run the git add command like this:

$ git add alphabet.txt

This will 'add' the file alphabet.txt to the staging area and we can check that this has worked by running the command git status again:

$ git status
On branch main

No commits yet

Changes to be committed:
  (use "git rm --cached <file>..." to unstage)

  new file:   alphabet.txt

If you are updating multiple files at once and want to add all of them to the staging area, you can use git add . instead. The . refers to the current directory, so git stages every new and changed file in it and in its subdirectories.
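Here is a small sketch of both forms of git add, run in a scratch repository. The second file, digits.txt, is hypothetical and exists only to show the difference between staging one file and staging everything.

```shell
set -eu

# Scratch repository (assumption: created in a temporary directory).
cd "$(mktemp -d)"
git init -q
printf 'abcdefghijklmnopqrstuvwxyz\n' > alphabet.txt
printf '0123456789\n' > digits.txt   # hypothetical second file

# Stage a single file by name.
git add alphabet.txt

# alphabet.txt is now reported as a staged new file ("A"),
# while digits.txt is still untracked ("??").
git status --porcelain

# Stage everything under the current directory.
git add .

# git ls-files lists the files currently recorded in the staging area.
staged="$(git ls-files)"
echo "$staged"
```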

`git commit`

Committing our work to the git repository stores it as a snapshot of our files and is like saving your work. Once you're happy with the changes in the staging area you can run git commit to store them. To do this, use git commit -m "<commit-message-here>". The commit message should provide a brief description of the changes that you've made.

$ git commit -m "Add alphabet.txt"
[main (root-commit) a5c983b] Add alphabet.txt
 1 file changed, 1 insertion(+)
 create mode 100644 alphabet.txt

Note that the output for this command is also saying that this commit is regarded as the root commit as it is the first commit stored inside the git repository.
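The whole path from working directory to commit can be sketched in one script. The name and email below are placeholders set only for the scratch repository, since git needs an author identity before it will create a commit.

```shell
set -eu

# Scratch repository with a placeholder identity (assumptions for this sketch).
cd "$(mktemp -d)"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

# Working directory -> staging area -> repository.
printf 'abcdefghijklmnopqrstuvwxyz\n' > alphabet.txt
git add alphabet.txt
git commit -q -m "Add alphabet.txt"

# Once everything is committed, the short status is empty.
remaining="$(git status --porcelain)"

# %s prints only the subject line of the commit message.
subject="$(git log -1 --format=%s)"
echo "Last commit: $subject"
```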

`git log`

We can use the command git log to list all the commits stored inside a git repository. You can also think of git log as printing the timeline of different commits in our project.

$ git log
commit a5c983b7915c5d89586feba51026cb6bceb0f4cb (HEAD -> main)
Author: james <james@hashnode.com>
Date:   Sun Apr 9 22:05:56 2023 +0100

    Add alphabet.txt
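With more than one commit the timeline becomes easier to see. In this sketch the second commit and its message are hypothetical, and the identity is again a placeholder.

```shell
set -eu

# Scratch repository with a placeholder identity (assumptions for this sketch).
cd "$(mktemp -d)"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

printf 'abcdefghijklm\n' > alphabet.txt
git add alphabet.txt
git commit -q -m "Add alphabet.txt"

# Hypothetical follow-up change, recorded as a second point on the timeline.
printf 'nopqrstuvwxyz\n' >> alphabet.txt
git add alphabet.txt
git commit -q -m "Extend alphabet.txt"

# One line per commit, newest first: abbreviated hash followed by the message.
git log --oneline

# The same timeline, reduced to the commit messages.
subjects="$(git log --format=%s)"
```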

Branches

A commit is a snapshot of the files and folders in our project at a point in time. Over time we create multiple commits to build up a timeline of snapshots. Each commit automatically points back to the commit before it (unless it is the root commit), so the commits form a chain. Conceptually, a branch in git is a series of commits that forms a timeline representing some work in a project. In practice, git only needs to record the most recent commit in that series, because following each commit's pointer to its parent recovers the rest of the chain.

A branch is a reference to a particular commit.

In this guide the branch that is created by default in a new git repository is called main; depending on your git version and configuration it may be called master instead. Whenever we create a new commit, the branch we currently have checked out is updated so that it points to that new commit.
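We can watch a branch behave like a pointer by asking git which commit it refers to before and after a new commit. This is a sketch in a scratch repository; the branch is explicitly renamed to main so the example does not depend on the default branch name, and the second commit is hypothetical.

```shell
set -eu

# Scratch repository with a placeholder identity (assumptions for this sketch).
cd "$(mktemp -d)"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

printf 'abcdefghijklm\n' > alphabet.txt
git add alphabet.txt
git commit -q -m "Add alphabet.txt"

# Name the current branch main, whatever the configured default was.
git branch -M main

# git rev-parse prints the hash of the commit a branch points to.
first="$(git rev-parse main)"

# -a stages changes to already-tracked files before committing.
printf 'nopqrstuvwxyz\n' >> alphabet.txt
git commit -q -am "Extend alphabet.txt"
second="$(git rev-parse main)"

# main~1 means "the parent of the commit main points to".
parent="$(git rev-parse main~1)"
echo "main moved from $first to $second"
```

The branch now refers to the second commit, and that commit's parent is the first one, which is how the chain is recovered from a single reference.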

When working on a project in a team it is quite common to have multiple branches for the same project. Common reasons include:

Developing new features
Fixing bugs
Testing
Teamwork

Creating a new branch

If you'd like to create a new branch in your local repository then you can use the git checkout command to 'checkout' a new branch like this:

$ git checkout -b my-new-branch
Switched to a new branch 'my-new-branch'

The -b flag tells git checkout to create the branch before switching to it. As with a commit message, the name of the branch is up to you. Note that it can't include spaces, but you can use dashes instead.

Switching branches

To switch branches you can use the git checkout command along with the name of the branch you'd like to switch to like so:

$ git checkout main
Switched to branch 'main'
Your branch is up to date with 'origin/main'.
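Creating a branch and switching back can be tried safely in a scratch repository. In this sketch notes.txt is a hypothetical file committed only on the new branch, which makes it easy to see the working directory change when we switch.

```shell
set -eu

# Scratch repository with a placeholder identity (assumptions for this sketch).
cd "$(mktemp -d)"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

printf 'abcdefghijklmnopqrstuvwxyz\n' > alphabet.txt
git add alphabet.txt
git commit -q -m "Add alphabet.txt"
git branch -M main

# Create a branch and switch to it in one step.
git checkout -q -b my-new-branch
printf 'notes\n' > notes.txt   # hypothetical file that exists only on this branch
git add notes.txt
git commit -q -m "Add notes.txt"

# Switch back: the working directory is restored to main's snapshot.
git checkout -q main
current="$(git symbolic-ref --short HEAD)"
echo "On branch $current"
ls
```

After switching back to main, notes.txt is no longer in the working directory, but it is still stored safely on my-new-branch.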

GitHub

Version control software allows us to manage and update the versions of a project over time. More precisely, git is distributed version control software. The term distributed means that every collaborator has a complete copy of the git repository, including the entire history of the project, on their own machine, rather than relying on a single central copy.

GitHub is a website that hosts git repositories in the cloud, so that each collaborator can store their work and share their changes online.

New repositories can also be created directly on the GitHub website.

Remotes

A remote repository is a copy of the repository hosted somewhere other than your local machine, for example on GitHub, that others (with the relevant permissions) can access and update from their own machines. A typical pattern is to have one remote repository, which each user clones to their local machine. Users commit changes to the git repository on their machine and then, once ready, push those commits to the remote repository. The name origin is the conventional name for this remote repository; when you clone a repository, git automatically names the remote it was cloned from origin.

Creating a link

When a new repository is created on GitHub, it is given a URL that points to the remote repository. We can use this URL to connect the repository on our local machine to the GitHub repo. We can then retrieve what is already on the remote repository as well as update it with the commits on our local machine. To create this link we can use the command git remote add:

$ git remote add origin <origin-repository-url>

Once this command has been executed, we can check the result by running git remote -v. The -v flag stands for verbose and makes git show the URL of each remote alongside its name:

$ git remote -v
origin  https://github.com/james/new_project.git (fetch)
origin  https://github.com/james/new_project.git (push)
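To experiment with remotes without a GitHub account, a bare repository on disk can play the role of the remote. That substitution is an assumption of this sketch, as are the directory names remote.git and local; with a real GitHub repository you would use its URL instead of the local path.

```shell
set -eu

base="$(mktemp -d)"

# A bare repository (no working directory) stands in for the GitHub repository.
git init -q --bare "$base/remote.git"

# Our local repository.
git init -q "$base/local"
cd "$base/local"

# Link the local repository to the stand-in remote under the name origin.
git remote add origin "$base/remote.git"

# List remotes with their URLs, then read back the URL stored for origin.
git remote -v
url="$(git remote get-url origin)"
```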

`git push`

We can use the command git push to upload commits from our local machine to a remote repository. If our local main branch is three commits ahead of the origin repository's main branch, then git push will add those three commits to the main branch of the origin repository. We can use the git push command in the following way:

$ git push -u origin main
Enumerating objects: 3, done.
Counting objects: 100% (3/3), done.
Writing objects: 100% (3/3), 245 bytes | 245.00 KiB/s, done.
Total 3 (delta 0), reused 0 (delta 0), pack-reused 0
To https://github.com/james/files.git
 * [new branch]      main -> main
Branch 'main' set up to track remote branch 'main' from 'origin'.

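The -u in git push -u origin main is short for --set-upstream. It records origin/main as the branch that our local main tracks, so that later we can run git push and git pull on this branch without naming the remote and branch each time.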

You should get a message in your terminal like the one above if the git push command was successful. Once this command is executed both the local main branch and the origin main branch, abbreviated as origin/main, will be pointing at the same commit. At this point, both the local and the remote main branches are up to date and in sync with each other.
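We can confirm that the two branches are in sync by comparing the commits they point to. As a sketch, this again uses a bare repository on disk as a stand-in for GitHub and a placeholder identity; both are assumptions made so the example runs without a network connection.

```shell
set -eu

base="$(mktemp -d)"

# Stand-in remote (assumption: a bare repository on disk instead of GitHub).
git init -q --bare "$base/remote.git"

git init -q "$base/local"
cd "$base/local"
git config user.name "Example User"
git config user.email "user@example.com"

printf 'abcdefghijklmnopqrstuvwxyz\n' > alphabet.txt
git add alphabet.txt
git commit -q -m "Add alphabet.txt"
git branch -M main
git remote add origin "$base/remote.git"

# Upload main and record origin/main as the branch it tracks.
git push -q -u origin main

# Both names should now resolve to the same commit hash.
local_tip="$(git rev-parse main)"
remote_tip="$(git rev-parse origin/main)"
echo "main:        $local_tip"
echo "origin/main: $remote_tip"
```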

`git pull`

When the remote repository contains commits that aren't present in the repository on our local machine, we can use git pull to download those commits and merge them into our current branch, bringing our local repository up to date.
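The following sketch simulates two collaborators on one machine. The directory names alice and bob are hypothetical, the remote is a bare repository on disk rather than GitHub, and --ff-only is added so that git pull simply moves the branch forward when there is nothing to merge.

```shell
set -eu

base="$(mktemp -d)"

# Stand-in remote (assumption: a bare repository on disk instead of GitHub).
git init -q --bare "$base/remote.git"

# First collaborator: create a commit and publish it.
git init -q "$base/alice"
cd "$base/alice"
git config user.name "Example User"
git config user.email "user@example.com"
printf 'abcdefghijklm\n' > alphabet.txt
git add alphabet.txt
git commit -q -m "Add alphabet.txt"
git branch -M main
git remote add origin "$base/remote.git"
git push -q -u origin main

# Second collaborator: cloning sets up the origin remote automatically.
git clone -q -b main "$base/remote.git" "$base/bob"

# The first collaborator publishes another commit.
printf 'nopqrstuvwxyz\n' >> alphabet.txt
git commit -q -am "Extend alphabet.txt"
git push -q origin main

# The second collaborator is now one commit behind and pulls to catch up.
cd "$base/bob"
git pull -q --ff-only origin main

bob_tip="$(git rev-parse HEAD)"
alice_tip="$(git -C "$base/alice" rev-parse HEAD)"
```

After the pull both clones point at the same commit, and each holds the full two-commit history.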

Conclusion

Version control is something that all developers should know when going into their first role. There aren't too many commands to learn and with practice, it's something that you can quickly become proficient in. Try integrating version control into your current projects for practice even if they don't require it.

Also, GitHub isn't the only platform available. There are others, like GitLab and Bitbucket, but GitHub is one of the most popular and likely the one your future employer will use.