A Visual Guide to Git Internals — Objects, Branches, and How to Create a Repo From Scratch
Many of us use git on a daily basis. But how many of us know what goes on under the hood?
For example, what happens when we use git commit? What is stored between commits? Is it just a diff between the current and previous commit? If so, how is the diff encoded? Or is an entire snapshot of the repo stored each time? What really happens when we use git init?
Many people who use git don’t know the answers to the questions above. But does it really matter?
First, as professionals, we should strive to understand the tools we use, especially if we use them all the time, like git.
More practically, I've found that understanding how git actually works is useful in many scenarios, whether it’s resolving merge conflicts, carrying out a tricky rebase, or just recovering when something goes slightly wrong.
You’ll benefit from this post if you’re experienced enough with git to feel comfortable with commands such as git pull, git push, git add, or git commit.
Still, we will start with an overview to make sure we are on the same page regarding the mechanisms of git, and specifically, the terms used throughout this post.
I also uploaded a YouTube series covering this post — you are welcome to watch it here.
What to expect from this tutorial
We will get a rare understanding of what goes on under the hood of what we do almost daily.
We will start by covering objects — blobs, trees, and commits. We will then briefly discuss branches and how they are implemented. We will dive into the working directory, staging area and repository.
And we will make sure we understand how these terms relate to the git commands we know and use to create a new repository.
Next, we will create a repository from scratch, without using git init, git add, or git commit. This will allow us to deepen our understanding of what is happening under the hood when we work with git.
We will also create new branches, switch branches, and create additional commits, all without using git branch or git checkout.
By the end of this post, you will feel like you understand **git**. Are you up for it? 😎
Time to get hard core
So far we've covered some Git fundamentals, and now we’re ready to really Git going.
To understand deeply how git works, we will create a repository, but this time we'll build it from scratch.
We won’t use git init, git add, or git commit, which will give us a better hands-on understanding of the process.
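To make "from scratch" concrete, here is a minimal Python sketch of the directory layout that git expects to find in a repository: an objects directory, a refs/heads directory, and a HEAD file. The function name init_repo_by_hand and the default branch name master are assumptions made for illustration; a real git init also writes other files (such as a config file) that are left out here.

```python
import tempfile
from pathlib import Path


def init_repo_by_hand(root: Path, branch: str = "master") -> Path:
    """Create a bare-bones .git layout without running git init.

    The helper name and the default branch name are illustrative assumptions.
    """
    git_dir = root / ".git"
    # Object database: blobs, trees, and commits end up here.
    (git_dir / "objects").mkdir(parents=True)
    # Branches live under refs/heads, one small file per branch.
    (git_dir / "refs" / "heads").mkdir(parents=True)
    # HEAD names the active branch, even before that branch has any commit.
    (git_dir / "HEAD").write_bytes(f"ref: refs/heads/{branch}\n".encode())
    return git_dir


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        git_dir = init_repo_by_hand(Path(tmp))
        for path in sorted(git_dir.rglob("*")):
            print(path.relative_to(git_dir).as_posix())
```

Running the sketch in a scratch directory and then calling git status there is a quick way to check whether git accepts the result as a repository.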
Summary
This post introduced you to the internals of git. We started by covering the basic objects: blobs, trees, and commits.
We learned that a blob holds the contents of a file. A tree is a directory listing that contains blobs and/or sub-trees. A commit is a snapshot of our working directory, together with metadata such as the time and the commit message.
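As a rough illustration of what "a blob holds the contents of a file" means in practice, the Python sketch below computes the ID git would assign to an object. It assumes a repository using the default SHA-1 object format, and the helper name git_object_id is made up for this example. Git hashes a short header (the object type, the size in bytes, and a NUL byte) followed by the raw content, which should match what git hash-object reports for the same bytes.

```python
import hashlib


def git_object_id(obj_type: str, body: bytes) -> str:
    """Return the SHA-1 ID of a git object (hypothetical helper name)."""
    # Header format: "<type> <size in bytes>\0", followed by the raw body.
    header = f"{obj_type} {len(body)}\0".encode()
    return hashlib.sha1(header + body).hexdigest()


if __name__ == "__main__":
    # The file name plays no part in the ID: only the content is hashed,
    # so two files with identical content share a single blob.
    print(git_object_id("blob", b"hello world\n"))
    print(git_object_id("blob", b""))
```

Because the name is not part of the blob, it has to live somewhere else, and that is the job of the tree that lists the blob.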
We then discussed branches and explained that they are nothing but a named reference to a commit.
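The idea that a branch is only a named reference can be sketched in a few lines of Python. The sketch assumes loose refs (one file per branch under refs/heads, not packed refs), and the helper names and the all-zero commit ID are placeholders for illustration. Note that rewriting HEAD like this only moves the pointer; unlike git checkout, it does not touch the index or the working directory.

```python
import tempfile
from pathlib import Path


def write_branch(git_dir: Path, name: str, commit_id: str) -> Path:
    """Point branch `name` at `commit_id` by writing a loose ref file."""
    ref = git_dir / "refs" / "heads" / name
    ref.parent.mkdir(parents=True, exist_ok=True)
    ref.write_bytes((commit_id + "\n").encode())
    return ref


def switch_head(git_dir: Path, name: str) -> None:
    """Make HEAD a symbolic reference to the given branch."""
    (git_dir / "HEAD").write_bytes(f"ref: refs/heads/{name}\n".encode())


def resolve_head(git_dir: Path) -> str:
    """Follow HEAD to the commit ID it currently refers to."""
    head = (git_dir / "HEAD").read_text().strip()
    if head.startswith("ref: "):
        return (git_dir / head[len("ref: "):]).read_text().strip()
    return head  # detached HEAD: the file holds a commit ID directly


if __name__ == "__main__":
    placeholder_commit = "0" * 40  # stand-in, not a real commit ID
    with tempfile.TemporaryDirectory() as tmp:
        git_dir = Path(tmp) / ".git"
        write_branch(git_dir, "feature", placeholder_commit)
        switch_head(git_dir, "feature")
        print(resolve_head(git_dir))
```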
We went on to describe the working directory, which is a directory that has a repository associated with it; the staging area (index), which holds the tree for the next commit; and the repository, which is a collection of commits.
We clarified how these terms relate to git commands we know by creating a new repository and committing a file using the well-known git init, git add, and git commit.
Then, we fearlessly deep-dived into git. We stopped using porcelain commands and switched to plumbing commands.
By using echo and low-level commands such as git hash-object, we were able to create a blob, add it to the index, create a tree of the index, and create a commit object pointing to that tree.
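The chain from blob to tree to commit can also be sketched in Python, as a rough stand-in for the plumbing commands. This is a simplification under stated assumptions: it skips the index entirely and builds the tree directly, it assumes SHA-1 loose objects, it only handles a flat tree of regular files, and the helper names, file name, and author identity are invented for the example.

```python
import hashlib
import tempfile
import zlib
from pathlib import Path


def store_object(git_dir: Path, obj_type: str, body: bytes) -> str:
    """Write a loose object under objects/ and return its ID."""
    data = f"{obj_type} {len(body)}\0".encode() + body
    oid = hashlib.sha1(data).hexdigest()
    # Loose objects are zlib-compressed and filed under the first two hex digits.
    path = git_dir / "objects" / oid[:2] / oid[2:]
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(zlib.compress(data))
    return oid


def tree_body(entries) -> bytes:
    """Serialize (mode, name, object_id) tuples for a flat tree of files."""
    out = b""
    # Sorting by name is enough here because there are no sub-trees.
    for mode, name, oid in sorted(entries, key=lambda e: e[1]):
        # Each entry stores the ID as 20 raw bytes, not as hex text.
        out += f"{mode} {name}\0".encode() + bytes.fromhex(oid)
    return out


def commit_body(tree_id, message, author, timestamp, parent=None) -> bytes:
    """Serialize a commit that points at a tree (and optionally a parent)."""
    lines = [f"tree {tree_id}"]
    if parent:
        lines.append(f"parent {parent}")
    lines.append(f"author {author} {timestamp} +0000")
    lines.append(f"committer {author} {timestamp} +0000")
    return ("\n".join(lines) + f"\n\n{message}\n").encode()


if __name__ == "__main__":
    author = "Example Author <author@example.com>"  # made-up identity
    with tempfile.TemporaryDirectory() as tmp:
        git_dir = Path(tmp) / ".git"
        blob_id = store_object(git_dir, "blob", b"hello world\n")
        tree_id = store_object(
            git_dir, "tree", tree_body([("100644", "hello.txt", blob_id)])
        )
        commit_id = store_object(
            git_dir, "commit", commit_body(tree_id, "Initial commit", author, 0)
        )
        print(blob_id, tree_id, commit_id, sep="\n")
```

Each object refers to the one below it only by ID: the commit names the tree, and the tree names the blob, which is why changing a single file produces a new blob, a new tree, and a new commit.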
We were also able to create and switch between branches. Kudos to those of you who tried this on their own!👏
Hopefully, after following this post you feel you’ve deepened your understanding of what is happening under the hood when working with git.
Thanks for reading! If you enjoyed this article, you can read more on this topic on the swimm.io blog.
Omer Rosenbaum
- Swimm’s Chief Technology Officer
- Cyber training expert and Founder of Checkpoint Security Academy.
- Author of Computer Networks (in Hebrew)
- YouTube (@BriefVid)
- LinkedIn (omer-rosenbaum-034a08b9)
Additional References
A lot has been written and said about git. Specifically, I found these references to be useful: