What is Version Control?
Version control is a system that helps developers manage changes to their code over time. It lets multiple developers work on the same codebase simultaneously, keep track of changes, and revert to previous versions of the code if necessary. Software development teams rely on it to protect code integrity, improve collaboration, and enable efficient development workflows.
Why do we need Version Control?
Large-scale applications can quickly become complex and difficult to manage over time. To combat this, we need software that lets us store and move between different versions of our application. Without it, if something went wrong we would have no way to move back to the previous version of the application, when everything was working.
Starting a timeline
Imagine that starting a project marks the beginning of a timeline. As you make changes and updates to your project, new points are added to the timeline. You can then pick any point in the timeline and see what the files and folders in your project looked like at that particular point in time. The requirements for this timeline would be as follows:
Basic timeline requirements
Store the content of our files and directories
Add new points to the timeline that reflect our files and directories at that point in time
View the changes we've made since the last point in our timeline
Move to different points in the timeline to see the state of the files and directories at that point in time
git is the version control software that allows us to do all of the above.
git CLI
CLI stands for 'command line interface'.
The git CLI is a set of commands that we can run in our terminal to use git's features. One of the first commands to learn is git init.
git init creates a new git repository. This is the place where git stores the content of the files and folders in each snapshot on your timeline.
$ git init
Initialized empty Git repository in /Users/james/Documents/files/.git/
If we run the command ls -a we can print all the files and directories, including those whose names start with a . (which are hidden by default).
The ls command is a shell command rather than part of git. Its name is short for 'list' and it lists the current directory's contents.
$ ls -a
. .. .git
Working directory
The working directory is the directory you are currently working in, with all the files and folders you have immediate access to. It is the directory where all your commands are executed. To find out what your working directory is you can use the pwd command. pwd stands for 'print working directory'.
git init
Say I create an empty directory with no files or folders in it. Then I run git init, which creates a new empty git repository. Note that the empty git repository doesn't have anything stored within it yet. The snippet below shows how this looks when running the commands in our terminal.
$ mkdir files
$ cd files
$ ls
$ git init
Initialized empty Git repository in /Users/james/Documents/files/.git/
$ ls -a
. .. .git
I then create a new file in my project called alphabet.txt with some content as shown:
---- alphabet.txt ----
abcdefghijklmnopqrstuvwxyz
So far, nothing has been stored in the git repository (the .git directory) yet. All we have done is update the working directory.
The working directory refers to all of the files and directories we can currently view and edit in our project.
git status
git keeps track of when the working directory has been updated, and we can see this information by running the git status command:
$ git status
On branch main
No commits yet
Untracked files:
(use "git add <file>..." to include in what will be committed)
alphabet.txt
nothing added to commit but untracked files present (use "git add" to track)
The part we care about in this message is Untracked files. An untracked file is a newly created file that has not yet been stored inside the git repository. To store this content in our git repository we need to use additional git commands.
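For a more compact view of the same information, git status accepts a --short option. The sketch below is illustrative only: it assumes git is installed and works in a throwaway directory created with mktemp, which is not part of the walkthrough above.

```shell
# Create a throwaway repository in a temporary directory (the location is arbitrary).
demo_dir=$(mktemp -d)
cd "$demo_dir"
git init -q

# Add a file to the working directory without telling git about it.
echo "abcdefghijklmnopqrstuvwxyz" > alphabet.txt

# --short prints one line per path; "??" marks an untracked file.
status_output=$(git status --short)
echo "$status_output"
```

Here the single line printed should start with ?? followed by the file name, which is the short form of the Untracked files section.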
git add
Git uses 3 different areas to manage the state of our content. To permanently store our changes in the git repository we have to move them through these areas in order.
These are the areas we move our changes through:
Working directory:
alphabet.txt
Staging area:
Git repository:
First, we make a change to our working directory, as above where we added a file called alphabet.txt.
Then we move these changes to the staging area, which is simply an area inside git where we prepare files to be stored permanently in a snapshot in our git repository. To stage our changes we use the git add command.
We can run the git add command like this:
$ git add alphabet.txt
This will 'add' the file alphabet.txt to the staging area, and we can check that this has worked by running the command git status again:
$ git status
On branch main
No commits yet
Changes to be committed:
(use "git rm --cached <file>..." to unstage)
new file: alphabet.txt
If you want to add all the changed files to the staging area at once, say because you were updating multiple files, you can use git add . instead. The . refers to the current directory, so git stages every new or modified file in it and in its subdirectories.
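The movement between the working directory and the staging area can be sketched as a short script. This is a minimal example under assumptions: the second file, numbers.txt, and the temporary directory are made up for illustration and do not appear in the walkthrough.

```shell
# Throwaway repository in a temporary directory.
demo_dir=$(mktemp -d)
cd "$demo_dir"
git init -q

# Two new files in the working directory (numbers.txt is a hypothetical extra file).
echo "abcdefghijklmnopqrstuvwxyz" > alphabet.txt
echo "0123456789" > numbers.txt

# Stage only one file; numbers.txt stays untracked.
git add alphabet.txt
after_single_add=$(git status --short)
echo "$after_single_add"

# Stage everything under the current directory.
git add .
after_add_all=$(git status --short)
echo "$after_add_all"

# Take numbers.txt out of the staging area again; the file itself stays on disk.
git rm -q --cached numbers.txt
after_unstage=$(git status --short)
echo "$after_unstage"
```

In the short status format, a staged new file is marked with A and an untracked file with ??, so numbers.txt should move from ?? to A and back to ?? as the script runs.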
git commit
Committing our work to the git repository stores it as a snapshot of our files and is like saving your work. Once you're happy with the changes in the staging area you can run the git commit command to save and store them. To do this, you can use git commit -m "<commit-message-here>". The commit message should provide a brief description of the changes that you've made.
$ git commit -m "Add alphabet.txt"
[main (root-commit) a5c983b] Add alphabet.txt
1 file changed, 1 insertion(+)
create mode 100644 alphabet.txt
Note that the output for this command is also saying that this commit is regarded as the root commit as it is the first commit stored inside the git repository.
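One of the timeline requirements was being able to view the changes made since the last point. A rough sketch of how that might look uses git diff, a command not covered above. The name and email below are placeholders, set only inside the throwaway repository so that the commit can be created.

```shell
# Throwaway repository with a placeholder identity (stored only in this repository's config).
demo_dir=$(mktemp -d)
cd "$demo_dir"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

# Create, stage and commit the first file.
echo "abcdefghijklmnopqrstuvwxyz" > alphabet.txt
git add alphabet.txt
git commit -q -m "Add alphabet.txt"

# With everything committed, the short status prints nothing.
clean_status=$(git status --short)

# Edit the file, then ask git what has changed since the last commit.
echo "ABCDEFGHIJKLMNOPQRSTUVWXYZ" >> alphabet.txt
changes=$(git diff)
echo "$changes"
```

Lines that were added since the last commit are prefixed with + in the diff output, and removed lines with -.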
git log
We can use the command git log to list all the commits stored inside a git repository. You can also think of git log as printing the timeline of different commits in our project.
$ git log
commit a5c983b7915c5d89586feba51026cb6bceb0f4cb (HEAD -> main)
Author: james <james@hashnode.com>
Date: Sun Apr 9 22:05:56 2023 +0100
Add alphabet.txt
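Once the timeline has more than one commit, it can be useful to print it compactly and to look at a file as it was at an earlier point. The following is a sketch rather than part of the original walkthrough: the second commit, the placeholder identity, and the use of git log --oneline and git show are all illustrative choices.

```shell
# Throwaway repository with a placeholder identity.
demo_dir=$(mktemp -d)
cd "$demo_dir"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

# First point on the timeline.
echo "abcdefghijklmnopqrstuvwxyz" > alphabet.txt
git add alphabet.txt
git commit -q -m "Add alphabet.txt"

# Second point on the timeline (a hypothetical follow-up change).
echo "ABCDEFGHIJKLMNOPQRSTUVWXYZ" >> alphabet.txt
git add alphabet.txt
git commit -q -m "Add uppercase letters"

# One line per commit, newest first.
git log --oneline

# Count the commits reachable from the current commit.
commit_count=$(git rev-list --count HEAD)

# Print the file as it was one commit before the latest one (HEAD~1).
previous_version=$(git show HEAD~1:alphabet.txt)
echo "$previous_version"
```

git show reads the older snapshot straight from the repository, so the working directory is left untouched.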
Branches
A commit is a snapshot of the files and folders in our project at that point in time. Over time we can create multiple commits to build up a timeline containing different snapshots. Each commit that we create automatically points back to the previous commit (unless it is the root commit), so the commits form a chain. A branch in git represents a series of commits that forms a timeline of some work in a project. Because each commit points back to the one before it, git only needs to record the most recent commit to describe the whole series.
A branch is a reference to a particular commit.
The branch created by default in a new empty git repository is called main in the examples here (older versions of git, or a different init.defaultBranch setting, may name it master). Whenever a new commit is created, the branch you currently have checked out is updated so it points to the most recent commit.
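The idea that a branch is just a reference that moves forward with each commit can be checked with git rev-parse, which prints the commit a name points to. This is a minimal sketch with a placeholder identity. It reads the current branch name from git instead of assuming it is main.

```shell
# Throwaway repository with a placeholder identity.
demo_dir=$(mktemp -d)
cd "$demo_dir"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

# Ask git for the name of the current branch (main or master, depending on configuration).
current_branch=$(git symbolic-ref --short HEAD)

# Root commit: the branch now points at it.
echo "abcdefghijklmnopqrstuvwxyz" > alphabet.txt
git add alphabet.txt
git commit -q -m "Add alphabet.txt"
first_commit=$(git rev-parse HEAD)
branch_before=$(git rev-parse "$current_branch")

# Second commit: the branch reference moves forward automatically.
echo "ABCDEFGHIJKLMNOPQRSTUVWXYZ" >> alphabet.txt
git add alphabet.txt
git commit -q -m "Add uppercase letters"
branch_after=$(git rev-parse "$current_branch")

# The newest commit points back to the first one as its parent.
parent_of_latest=$(git rev-parse HEAD~1)

echo "branch pointed at $branch_before, now points at $branch_after"
echo "parent of latest commit: $parent_of_latest"
```

The parent of the latest commit should match the commit the branch pointed at before, which is the chain described above.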
When working on a project in a team it is quite common to have multiple branches for the same project. There are multiple reasons why this may be the case which are as follows:
Developing new features
Fixing bugs
Testing
Teamwork
Creating a new branch
If you'd like to create a new branch in your local repository then you can use the git checkout command with the -b option to create the branch and 'check out' (switch to) it in one step, like this:
$ git checkout -b my-new-branch
Switched to a new branch 'my-new-branch'
Like with entering a commit message, we can type in what we want to call the branch. Note it can't include spaces but you can use dashes instead.
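As a rough illustration of the two points above, the sketch below creates a branch, confirms that it starts at the same commit as the branch it was created from, and asks git whether a name containing spaces would be accepted. git check-ref-format is a validation command not covered above, and the identity is a placeholder.

```shell
# Throwaway repository with a placeholder identity and one commit.
demo_dir=$(mktemp -d)
cd "$demo_dir"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
echo "abcdefghijklmnopqrstuvwxyz" > alphabet.txt
git add alphabet.txt
git commit -q -m "Add alphabet.txt"

# Remember the default branch name, then create and switch to a new branch.
default_branch=$(git symbolic-ref --short HEAD)
git checkout -q -b my-new-branch
current_branch=$(git symbolic-ref --short HEAD)

# A new branch starts out pointing at the same commit as the one it was created from.
new_branch_commit=$(git rev-parse my-new-branch)
default_branch_commit=$(git rev-parse "$default_branch")

# List local branches; the current one is marked with an asterisk.
git branch

# Ask git whether a name containing spaces is a valid branch name.
if git check-ref-format --branch "my new branch" >/dev/null 2>&1; then
  name_with_space="accepted"
else
  name_with_space="rejected"
fi
echo "branch name with spaces: $name_with_space"
```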
Switching branches
To switch branches you can use the git checkout command along with the name of the branch you'd like to switch to, like so:
$ git checkout main
Switched to branch 'main'
Your branch is up to date with 'origin/main'.
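Switching branches also updates the working directory to match the commit that the branch points to. A small sketch of this, assuming a hypothetical numbers.txt file that is committed only on the new branch:

```shell
# Throwaway repository with a placeholder identity and one commit.
demo_dir=$(mktemp -d)
cd "$demo_dir"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
echo "abcdefghijklmnopqrstuvwxyz" > alphabet.txt
git add alphabet.txt
git commit -q -m "Add alphabet.txt"
default_branch=$(git symbolic-ref --short HEAD)

# Commit a new file on a separate branch.
git checkout -q -b my-new-branch
echo "0123456789" > numbers.txt
git add numbers.txt
git commit -q -m "Add numbers.txt"
files_on_new_branch=$(ls)

# Switch back: numbers.txt disappears from the working directory,
# because the default branch's latest commit does not contain it.
git checkout -q "$default_branch"
files_on_default_branch=$(ls)

echo "on my-new-branch: $files_on_new_branch"
echo "on $default_branch: $files_on_default_branch"
```

The file is not lost. It is stored in the commit on my-new-branch and reappears when that branch is checked out again.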
GitHub
Version control software allows us to manage and update the versions of a project over time. More specifically, git is distributed version control software. The term distributed means that every user who clones the git repository (with the relevant permissions) gets their own complete copy of it, including the entire history of the project.
GitHub is a website that stores git repositories in the cloud, so that each collaborator can upload their work and share their changes online.
New repositories can also be created on the GitHub website.
Remotes
A remote repository is one hosted somewhere other than your working copy, such as on GitHub, that others can access and update from their local machines. A typical pattern is to have one remote repository, with each user creating a clone of it on their local machine. Users commit changes to the git repository on their machine and then, once ready, add those commits to the remote repository. The name origin is the conventional name for a git remote repository: git clone uses it by default, and GitHub's setup instructions for a new repository use it too.
Creating a link
When a new repository is created on GitHub, a new URL is generated which points to the remote repository. We can use this URL to connect the repository on our local machine with the GitHub repo. We can then retrieve what is already on the remote repository as well as update it with the commits on our local machine. To create this link we can use the command git remote add:
$ git remote add origin <origin-repository-url>
Once this command has been executed, the link can be checked by running git remote -v. The -v stands for verbose, which makes git print each remote's URL alongside its name:
$ git remote -v
origin https://github.com/james/new_project.git (fetch)
origin https://github.com/james/new_project.git (push)
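To experiment with remotes without a GitHub account, a local bare repository (one with no working directory) can stand in for the remote. The sketch below makes that substitution, so the URL is a path on disk and not a GitHub address. The directory names are arbitrary.

```shell
# A temporary directory to hold both repositories.
work_dir=$(mktemp -d)

# A bare repository acts as the stand-in for the repository hosted on GitHub.
git init -q --bare "$work_dir/remote.git"

# The local repository we work in.
git init -q "$work_dir/local"
cd "$work_dir/local"

# Link the local repository to the stand-in remote under the conventional name origin.
git remote add origin "$work_dir/remote.git"

# List remotes with their fetch and push URLs.
git remote -v

# Read back the URL recorded for origin.
remote_url=$(git remote get-url origin)
echo "$remote_url"
```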
git push
We can use the command git push to add any of the commits on our local machine to a remote repository. If our local main branch is 3 commits ahead of the origin repository's main branch, then git push will add the three local commits onto the main branch of the origin repository. We can use the git push command in the following way:
$ git push -u origin main
Enumerating objects: 3, done.
Counting objects: 100% (3/3), done.
Writing objects: 100% (3/3), 245 bytes | 245.00 KiB/s, done.
Total 3 (delta 0), reused 0 (delta 0), pack-reused 0
To https://github.com/james/files.git
* [new branch] main -> main
Branch 'main' set up to track remote branch 'main' from 'origin'.
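The -u in git push -u origin main is short for --set-upstream. It records origin's main branch as the upstream of the local main branch, so later git push and git pull commands on this branch can be run without extra arguments.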
You should get a message in your terminal like the one above if the git push command was successful. Once this command is executed, both the local main branch and the origin main branch (abbreviated as origin/main) will be pointing at the same commit. At this point, the local and the remote main branches are up to date and in sync with each other.
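That the local branch and its origin counterpart end up at the same commit can be verified with git rev-parse. The sketch below again uses a local bare repository as a stand-in for GitHub and a placeholder identity. It reads the branch name from git instead of assuming it is main.

```shell
# Stand-in remote and a local repository with one commit.
work_dir=$(mktemp -d)
git init -q --bare "$work_dir/remote.git"
git init -q "$work_dir/local"
cd "$work_dir/local"
git config user.name "Example User"
git config user.email "user@example.com"
echo "abcdefghijklmnopqrstuvwxyz" > alphabet.txt
git add alphabet.txt
git commit -q -m "Add alphabet.txt"
branch=$(git symbolic-ref --short HEAD)

# Link to the remote and push, recording the upstream with -u.
git remote add origin "$work_dir/remote.git"
git push -q -u origin "$branch"

# Compare where the local branch and the remote-tracking branch point.
local_commit=$(git rev-parse "$branch")
remote_tracking_commit=$(git rev-parse "origin/$branch")

# Show which upstream branch was recorded by -u.
upstream=$(git rev-parse --abbrev-ref "$branch@{upstream}")

echo "local:    $local_commit"
echo "remote:   $remote_tracking_commit"
echo "upstream: $upstream"
```

If the push succeeded, the two commit IDs printed should be identical.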
git pull
When there are commits on the remote repository that aren't present in the repository on our local machine, we can use git pull to pull in the latest commits from the remote repository and bring our repository up to date.
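A minimal sketch of this situation uses two local repositories that share one remote: the first pushes a new commit and the second pulls it in. The directory names first and second, the placeholder identity, and the local bare repository standing in for GitHub are all assumptions made for illustration.

```shell
# Stand-in remote plus a first repository that pushes one commit to it.
work_dir=$(mktemp -d)
git init -q --bare "$work_dir/remote.git"
git init -q "$work_dir/first"
cd "$work_dir/first"
git config user.name "Example User"
git config user.email "user@example.com"
echo "abcdefghijklmnopqrstuvwxyz" > alphabet.txt
git add alphabet.txt
git commit -q -m "Add alphabet.txt"
git remote add origin "$work_dir/remote.git"
git push -q -u origin "$(git symbolic-ref --short HEAD)"

# A second collaborator clones the remote; origin and the upstream are set up by git clone.
git clone -q "$work_dir/remote.git" "$work_dir/second"

# The first collaborator adds another commit and pushes it.
echo "ABCDEFGHIJKLMNOPQRSTUVWXYZ" >> alphabet.txt
git add alphabet.txt
git commit -q -m "Add uppercase letters"
git push -q

# The second repository is now one commit behind until it pulls.
cd "$work_dir/second"
commits_before_pull=$(git rev-list --count HEAD)
git pull -q
commits_after_pull=$(git rev-list --count HEAD)

echo "commits before pull: $commits_before_pull"
echo "commits after pull:  $commits_after_pull"
```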
Conclusion
Version control is something that all developers should know when going into their first role. There aren't too many commands to learn and with practice, it's something that you can quickly become proficient in. Try integrating version control into your current projects for practice even if they don't require it.
Also, GitHub isn't the only platform available. There are others like GitLab or Bitbucket, but GitHub is one of the most popular and one you are likely to come across at work.