The Git submodule feature lets you embed one repository inside another as a subdirectory. A submodule is simply a reference to the state of another repository at a given point in time, that is, a reference to a specific commit.

Introduction

When developing a complex project, you often need to rely on external code packages. Reusing them simplifies code management and saves the time and effort of building the same tools again.

There are usually two ways to rely on external code.

  • Copy the external code directly into the current project's source tree. This is the least recommended approach: it discards the external code's Git history, and manual copying is error-prone, which can lead to missing features and bugs.
  • Rely on a language-specific package manager, such as npm or NuGet. This is currently the most popular approach. Its disadvantage is that you have to manage package versions, and whenever you fix a bug or add a feature in the dependency, you need to release the package again before the project can pick up the change.

So if you have a complex project that depends on an external package, and you need to update that package frequently, you can use the Git submodules feature.

A Git submodule is a reference to a commit in another Git repository; it does not keep track of a specific branch of that repository. Adding a submodule to a repository automatically creates a .gitmodules file that records the submodule's details: the URL of the submodule's repository and the location of its code within the current repository.

Usage

1. Adding submodules

Use the git submodule add <submodule_url> command to add a submodule.

$ git submodule add https://bitbucket.org/jaredw/awesomelibrary
Cloning into '/Users/atlassian/git-submodule-demo/awesomelibrary'...
remote: Counting objects: 8, done.
remote: Compressing objects: 100% (6/6), done.
remote: Total 8 (delta 1), reused 0 (delta 0)
Unpacking objects: 100% (8/8), done.

After adding a submodule, run git status to check the repository. It shows that a .gitmodules file and an awesomelibrary folder have been added.

$ git status
On branch main

No commits yet

Changes to be committed:
  (use "git rm --cached <file>..." to unstage)

 new file:   .gitmodules
 new file:   awesomelibrary

View the contents of the .gitmodules file.

[submodule "awesomelibrary"]
 path = awesomelibrary
 url = https://bitbucket.org/jaredw/awesomelibrary

  • path specifies the location of the submodule in the current repository.
  • url specifies the location of the submodule’s repository.
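To try this without touching a real remote, the following sketch builds a throwaway library repository on local disk and adds it to a parent project as a submodule. The directory names (lib-origin, app), the demo identity, and the protocol.file.allow=always setting (recent Git versions need it when a submodule is cloned from a local path) are assumptions made for this demo only.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Throwaway identity so commits work in a clean environment (demo values).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

sandbox=$(mktemp -d)

# 1. A stand-in for the external library, with a single commit.
git init -q "$sandbox/lib-origin"
echo "hello" > "$sandbox/lib-origin/README"
git -C "$sandbox/lib-origin" add README
git -C "$sandbox/lib-origin" commit -q -m "initial library commit"
lib_commit=$(git -C "$sandbox/lib-origin" rev-parse HEAD)

# 2. The parent project, with the library added as a submodule.
git init -q "$sandbox/app"
cd "$sandbox/app"
git -c protocol.file.allow=always submodule --quiet add "$sandbox/lib-origin" awesomelibrary
git commit -q -m "add awesomelibrary submodule"

# 3. .gitmodules uses Git's config file syntax, so git config can read it.
sub_path=$(git config -f .gitmodules --get submodule.awesomelibrary.path)
sub_url=$(git config -f .gitmodules --get submodule.awesomelibrary.url)
echo "path=$sub_path url=$sub_url"

# 4. The parent stores the submodule as a "gitlink" tree entry (mode 160000)
#    that points at one specific commit of the library.
pinned=$(git ls-tree HEAD awesomelibrary)
echo "$pinned"
```

The last command makes the "reference to a commit" idea concrete: the parent's tree holds only a commit ID for awesomelibrary, not the library's files.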

2. Submodule initialization and update

When you clone a Git repository that contains submodules, git clone does not pull the submodule code by default, so the submodule folders in the cloned repository are empty. Every submodule must be initialized and then updated.

Use the git submodule init [<submodule_name> ...] command to initialize submodules. It copies the submodule information from the .gitmodules file into the current repository's ./.git/config file. At first glance this step seems redundant: .gitmodules already contains the submodule information, so why bother? In fact it is not redundant. Suppose the current repository contains many submodules, but you only need some of them for your current work. You can initialize just the submodules you need and then pull only their code.
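The selective initialization described above can be sketched in a local sandbox. Everything named here (the make_repo helper, the liba and libb submodules, the demo identity) is hypothetical, and protocol.file.allow=always is only needed because the submodules are cloned from local paths.

```shell
#!/usr/bin/env bash
set -euo pipefail

export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

sandbox=$(mktemp -d)

# Hypothetical helper for this demo: create a repository with one commit.
make_repo() {
  git init -q "$1"
  echo "$2" > "$1/README"
  git -C "$1" add README
  git -C "$1" commit -q -m "initial commit"
}

make_repo "$sandbox/liba-origin" "library A"
make_repo "$sandbox/libb-origin" "library B"

# Parent project with two submodules.
make_repo "$sandbox/app" "parent project"
git -C "$sandbox/app" -c protocol.file.allow=always submodule --quiet add "$sandbox/liba-origin" liba
git -C "$sandbox/app" -c protocol.file.allow=always submodule --quiet add "$sandbox/libb-origin" libb
git -C "$sandbox/app" commit -q -m "add two submodules"

# A fresh clone: both submodule directories exist but are empty.
git clone -q "$sandbox/app" "$sandbox/app-clone"
cd "$sandbox/app-clone"

# Initialize only liba; its entry is copied from .gitmodules into .git/config.
git submodule --quiet init liba
liba_url=$(git config --get submodule.liba.url)
libb_url=$(git config --get submodule.libb.url || true)
echo "liba registered as: $liba_url"
echo "libb registered as: ${libb_url:-<not initialized>}"

# update only fetches submodules that have been initialized.
git -c protocol.file.allow=always submodule --quiet update
ls liba
ls -A libb
```

After the update, liba contains its README while libb is still an empty directory, because libb was never registered in .git/config.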

$ git submodule init
Submodule 'awesomelibrary' (https://bitbucket.org/jaredw/awesomelibrary) registered for path 'awesomelibrary'

Once the submodule is initialized, use the git submodule update command to pull its code.

$ git submodule update
Cloning into 'awesomelibrary'...
remote: Counting objects: 11, done.
remote: Compressing objects: 100% (10/10), done.
remote: Total 11 (delta 0), reused 11 (delta 0)
Unpacking objects: 100% (11/11), done.
Checking connectivity... done.
Submodule path 'awesomelibrary': checked out 'c3f01dc8862123d317dd46284b05b6892c7b29bc'

After the update, the ./awesomelibrary folder is no longer empty. You can run both steps at once with git submodule init && git submodule update, or with the equivalent shorthand git submodule update --init.

Here’s a trick: when submodules are nested, that is, when a submodule itself contains submodules, you can use git submodule update --init --recursive to initialize and update all of them, including the nested ones.
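A minimal sketch of the nested case, again in a throwaway sandbox. The repository names (inner-origin, lib-origin, app), the make_repo helper, and the demo identity are hypothetical; protocol.file.allow=always is an assumption needed only because everything is cloned from local paths.

```shell
#!/usr/bin/env bash
set -euo pipefail

export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

sandbox=$(mktemp -d)

# Hypothetical helper for this demo: create a repository with one commit.
make_repo() {
  git init -q "$1"
  echo "$2" > "$1/README"
  git -C "$1" add README
  git -C "$1" commit -q -m "initial commit"
}

# inner-origin is a submodule of lib-origin, which is a submodule of app.
make_repo "$sandbox/inner-origin" "inner library"
make_repo "$sandbox/lib-origin" "outer library"
git -C "$sandbox/lib-origin" -c protocol.file.allow=always submodule --quiet add "$sandbox/inner-origin" inner
git -C "$sandbox/lib-origin" commit -q -m "add inner submodule"

make_repo "$sandbox/app" "parent project"
git -C "$sandbox/app" -c protocol.file.allow=always submodule --quiet add "$sandbox/lib-origin" lib
git -C "$sandbox/app" commit -q -m "add lib submodule"

# Fresh clone, then one command to initialize and fetch every level.
git clone -q "$sandbox/app" "$sandbox/app-clone"
cd "$sandbox/app-clone"
git -c protocol.file.allow=always submodule --quiet update --init --recursive

# The nested submodule's files are now present two levels down.
nested_readme=$(cat lib/inner/README)
echo "$nested_readme"
```

Without --recursive, lib would be populated but lib/inner would remain an empty directory.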

3. Submodule workflow

Submodules have their own independent versioning. Remember what we said earlier? A submodule reference in a repository points to a commit in the submodule, not to a branch. So when a submodule changes, the reference in the parent repository has to be updated to point to the new commit. This means every change in a submodule needs two add and commit operations: one in the submodule itself, and one in the parent repository to update the submodule reference. This is a rather annoying point, and one I have always disliked, but that is how the mechanism works. Since you can’t change it, you have to accept it.
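The two-commit workflow can be sketched as follows. The sandbox layout and the demo identity are hypothetical, protocol.file.allow=always is only needed for the local-path submodule, and pushing the submodule commit to its remote is left out to keep the sketch short.

```shell
#!/usr/bin/env bash
set -euo pipefail

export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

sandbox=$(mktemp -d)

# Stand-in library with one commit.
git init -q "$sandbox/lib-origin"
echo "hello" > "$sandbox/lib-origin/README"
git -C "$sandbox/lib-origin" add README
git -C "$sandbox/lib-origin" commit -q -m "initial library commit"

# Parent project with the library as a submodule.
git init -q "$sandbox/app"
cd "$sandbox/app"
git -c protocol.file.allow=always submodule --quiet add "$sandbox/lib-origin" awesomelibrary
git commit -q -m "add awesomelibrary submodule"
old_pin=$(git rev-parse HEAD:awesomelibrary)

# Commit 1: the change itself, made inside the submodule.
echo "new feature" >> awesomelibrary/README
git -C awesomelibrary add README
git -C awesomelibrary commit -q -m "add new feature"
new_sub_commit=$(git -C awesomelibrary rev-parse HEAD)

# The parent now reports awesomelibrary as modified: the checked-out
# commit no longer matches the commit recorded in the parent.
git status --short

# Commit 2: record the new submodule commit in the parent.
git add awesomelibrary
git commit -q -m "update awesomelibrary reference"
new_pin=$(git rev-parse HEAD:awesomelibrary)

echo "before: $old_pin"
echo "after:  $new_pin"
```

In a real project you would also push the submodule commit before pushing the parent; otherwise other clones of the parent would reference a commit they cannot fetch.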