Keep clear commit records
Uniformly standardize commit messages
Git requires every commit to have a summary message, but places no restrictions on its content. Take a look at the following commit history.
More explicit (django-oscar)
patch 9.0.0316: screen flickers when 'cmdheight' is zero

Problem: Screen flickers when 'cmdheight' is zero.
Solution: Redraw over existing text instead of clearing.

patch 9.0.0315: shell command is displayed in message window

Problem: Shell command is displayed in message window.
Solution: Do not echo the shell command in the message window.
I think most developers would agree that a commit message should at least describe what was done in the commit. By that standard, the first style of writing is as good as writing nothing; a commit record is only useful if it at least follows the second style.
Among the many commit message conventions, the one proposed by the Angular front-end framework team is probably the most popular. It divides a commit message into three parts: header, body and footer, of which only the header is required.
The header contains three parts:
- type: the commit type, such as feat or fix
- scope: the part of the codebase the change affects
- subject: a short description of the change, written in the present tense, starting with a lowercase letter, with no period at the end
The body complements the subject and covers the motivation for the change and how it differs from the previous behavior.
The footer mainly describes breaking changes or closes related issues.
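To make the three parts concrete, here is a minimal sketch of a commit written in this style. The repository, file name, scope, message text and issue number are all made up for illustration; only the header/body/footer layout comes from the convention described above.

```shell
# Work in a throwaway repository so the example is safe to run anywhere.
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.name "Example Dev" && git config user.email "dev@example.com"

# Hypothetical change: a settings file with a new default.
echo "timeout = 30" > settings.conf
git add settings.conf

# Each -m becomes its own paragraph: header, then body, then footer.
git commit -q \
  -m "fix(config): raise default request timeout" \
  -m "Slow upstream servers caused frequent failures with the previous default, so the default is now 30 seconds." \
  -m "Closes #123"

# Print the full message of the commit that was just created.
git log -1 --format=%B
```

Passing several -m options is only one way to get the blank lines between the parts; writing the message in the editor works just as well.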
This format looks a bit complicated, but tools can help with it, such as commit-formatter, a helper script I wrote.
Clean up useless commit messages
Sometimes you finish a git commit and suddenly notice a spelling mistake. You could fix it and commit again, but such a small change does not deserve an extra commit record (lint or a git hook can prevent this, but that’s a different issue). Instead, add the changed file to the staging area and run git commit --amend to fold the fix into the last commit. This opens the default editor so you can edit the commit message; if the message doesn’t need to change, use git commit --amend --no-edit.
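The amend workflow can be sketched as follows. The repository and the README typo are invented for the demonstration; the git commands are the ones discussed above.

```shell
# Throwaway repository with a commit that contains a typo.
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.name "Example Dev" && git config user.email "dev@example.com"

echo "Helo world" > README.md
git add README.md
git commit -q -m "docs: add README"

# Fix the typo, stage the file, and fold the fix into the previous commit.
echo "Hello world" > README.md
git add README.md
git commit -q --amend --no-edit

# The history still contains a single commit, now with the corrected file.
git log --oneline
```

Keep in mind that amending replaces the last commit with a new one, so it belongs to the same family of history-rewriting operations as rebase.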
Sometimes we need to compress multiple commits into one, for example when developing a feature has produced several unnecessary commits for small changes, or when we open a PR from our own branch in an open source project and the reviewer asks for some changes. This is the time to use git rebase -i.
For example, to work on the last three commits, use the command git rebase -i HEAD~3, which opens the default editor in the terminal.
Each commit is preceded by the word pick, and the comments below the list explain the other commands that can replace pick. There you can see that squash keeps the commit’s message but melds the commit into the previous one. Now change the last two picks to squash.
After saving and confirming, you can use git log again to view the commit history and see that the three commits have been merged.
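Here is a rough, non-interactive sketch of the same squash so it can be tried in a scratch repository. Normally you would edit the todo list by hand; in this sketch GIT_SEQUENCE_EDITOR and a sed one-liner stand in for that manual edit, and GIT_EDITOR=true accepts the combined commit message unchanged. The file name and commit messages are made up.

```shell
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.name "Example Dev" && git config user.email "dev@example.com"

# One base commit plus three small commits that belong together.
for step in base one two three; do
  echo "$step" >> notes.txt
  git add notes.txt
  git commit -q -m "commit $step"
done

# Stand-in for the manual edit: turn the 2nd and 3rd "pick" into "squash".
# GIT_EDITOR=true keeps the combined commit message without opening an editor.
GIT_SEQUENCE_EDITOR="sed -i -e '2,3s/^pick/squash/'" GIT_EDITOR=true \
  git rebase -i HEAD~3

# Two commits remain: "commit base" and the single squashed commit.
git log --oneline
```

The first line of the todo list must stay a pick, because squash always melds a commit into the one listed before it.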
Using rebase synchronization
Sometimes a project’s commit history is confusing because developers work with remote branches in inappropriate ways, such as only knowing how to use pull and push.
Many of you will have seen a warning when using git pull.
Now suppose A and B are working on the same dev branch. A modifies the code and creates commit1, which is pushed to the server via git push. If B has meanwhile created commit2 locally, B will get an error from git push, because B has not synchronized the latest changes in the remote dev branch.
Image from Gitbucket
If B pulls the remote branch at this point, the pull creates an additional merge commit. The pull operation here is equivalent to git fetch <remote> && git merge <remote>/<branch>: it downloads the changes from the remote branch and then merges them into the local branch.
How do you avoid this merge commit? You can use git pull --rebase.
A rebase effectively lifts the local commits off and replays them on top of the other branch, so that you get a linear commit history. Notice in the diagram that what was originally local E F G becomes E’ F’ G’, which will be mentioned later.
Looking back at the earlier warning: you can make git pull --rebase the default pull behavior by running git config pull.rebase true.
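The A and B scenario can be replayed locally with a bare repository playing the server. This is only a sketch: the directory names a, b and server.git and the file names are made up, and the identity variables exist just so commits work on a machine without Git configuration.

```shell
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

work=$(mktemp -d) && cd "$work"
git init -q --bare server.git

# A creates the shared dev branch and pushes a first commit.
git clone -q server.git a 2>/dev/null
git -C a symbolic-ref HEAD refs/heads/dev
echo base > a/base.txt
git -C a add base.txt && git -C a commit -q -m "base"
git -C a push -q origin dev

# B clones; then A pushes commit1 while B creates commit2 locally.
git clone -q -b dev server.git b
echo one > a/a.txt && git -C a add a.txt && git -C a commit -q -m "commit1"
git -C a push -q origin dev
echo two > b/b.txt && git -C b add b.txt && git -C b commit -q -m "commit2"

# A plain push from B would be rejected here. Rebasing on pull replays
# commit2 on top of commit1 instead of creating a merge commit.
git -C b pull -q --rebase origin dev
git -C b push -q origin dev

# Linear history: commit2, commit1, base.
git -C b log --oneline
```

With git config pull.rebase true set in b, a plain git pull would behave the same way.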
Similarly, for different branches on the same machine, you can use git rebase other-branch instead of merge.
The Golden Rule of Rebase
There is a golden rule for rebase operations: **Don’t use rebase on shared branches!**
Perhaps because of this rule, some programmers are afraid to use rebase.
As mentioned earlier, rebasing a local commit actually generates a new commit with the same content, so the hashes of E’ F’ G’ are not the same as those of the original E F G. Suppose now the branching situation is as follows.
Suppose A rebases the local dev branch onto a feature branch and then wants to push the local dev to the remote end. Here comes the trouble: A’s local dev no longer matches the remote dev that B is also working on. If A ignores this and uses git push --force, then B will encounter an error when pushing local changes, and when B uses git pull, Git will try to merge the diverged branches.
If everyone operated like A, this shared dev branch would end up being very confusing.
But consider the scenario mentioned earlier. A’s local dev is A -> B -> C -> D, the remote end was originally A -> B -> C, and after B’s push it becomes A -> B -> C -> E. A can safely use git pull --rebase, after which the local branch becomes A -> B -> C -> E -> D'. Why is this operation safe? The remote dev branch is shared, but the local dev can be treated as a private branch. git pull --rebase rebases the local commits onto the remote dev branch, so the final push only adds a new commit to the remote end, and B also ends up with A -> B -> C -> E -> D' after using git pull.
For example, suppose you fork a repository on GitHub, check out a dev branch, make some changes and open a PR. Although this dev branch lives on a public code hosting platform where everyone can see it, it exists only to be merged into the mainline of the target repository eventually, so it can still be considered a private branch. Before this PR is merged, changes to the target master branch can be synchronized via rebase, and commits can be compressed with squash, all of which are safe operations.
In summary, amend, squash, rebase and similar operations are safe as long as you don’t change commits that have already been shared. If A -> B -> C on a shared remote branch becomes A -> B -> D -> F, it will cause confusion.
Git provides a hook mechanism to trigger specific actions before and after specific events. For example, checking test coverage before a code commit, checking code formatting, etc. The Python open source tool pre-commit provides a number of nice hooks.
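As a small illustration of the mechanism (not of the pre-commit tool itself), the following sketch installs a hand-written pre-commit hook that rejects staged changes containing whitespace errors. The check used here, git diff --cached --check, is just one simple choice; the file names are made up.

```shell
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.name "Example Dev" && git config user.email "dev@example.com"

# Install the hook. A non-zero exit status from it aborts the commit.
mkdir -p .git/hooks
cat > .git/hooks/pre-commit <<'EOF'
#!/bin/sh
# Fail if the staged changes introduce whitespace errors.
exec git diff --cached --check
EOF
chmod +x .git/hooks/pre-commit

# A clean file passes the hook.
printf 'clean line\n' > ok.txt
git add ok.txt
git commit -q -m "add ok.txt"

# A file with trailing whitespace is rejected by the hook.
printf 'trailing space \n' > bad.txt
git add bad.txt
if git commit -q -m "add bad.txt"; then
  hook_result="accepted"
else
  hook_result="rejected"
fi
echo "second commit was $hook_result"
```

Hooks in .git/hooks are local to one clone and are not shared through the repository, which is one reason tools that manage hooks from a committed configuration file are convenient.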
If you write an extension script for Git, you can name the executable git-foo and put it on your PATH; Git then lets you call the custom script in subcommand form, as git foo.
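A minimal sketch of such a script, using the made-up name git-hello and a temporary directory that is put on PATH only for the one call:

```shell
bindir=$(mktemp -d)

# Any executable named git-<name> found on PATH can be run as "git <name>".
cat > "$bindir/git-hello" <<'EOF'
#!/bin/sh
echo "hello from a custom subcommand: $*"
EOF
chmod +x "$bindir/git-hello"

# Arguments after the subcommand name are passed through to the script.
PATH="$bindir:$PATH" git hello world
```

The script can be written in any language, as long as the file is executable and follows the naming pattern.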
You can configure short aliases for commands that are common or long.
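The alias names below (st and lg) are my own picks for illustration, not a recommendation from any standard. The sketch stores them in a scratch repository's config so it doesn't touch your real settings.

```shell
repo=$(mktemp -d) && cd "$repo" && git init -q

# Stored in this repository's config; add --global to use them everywhere.
git config alias.st "status --short"
git config alias.lg "log --oneline --graph --decorate"

# "git st" now expands to "git status --short".
touch draft.txt
git st
```

An alias only replaces the part after git, so options can still be appended, for example git lg -5.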
Different editors and IDEs have their own project configuration files, such as .idea for the JetBrains series and .vscode for VSCode. I personally think such files should not be committed to a public repository, because it should not be mandatory for all developers to use the same tools (projects that are closely tied to an IDE, such as Android development, might be the exception).
So how do you let developers use different editors while still maintaining a uniform code style? One way is to use the aforementioned git hooks to format code before committing; another is to use EditorConfig and place a .editorconfig file in the project to configure indentation, line breaks, etc. Most mainstream editors respect this configuration, either natively or through a plugin.
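For reference, a small .editorconfig might look like the sketch below. The concrete values (four spaces, LF line endings, tabs in Makefiles) are assumptions chosen for the example; pick whatever your project agrees on.

```shell
project=$(mktemp -d) && cd "$project"

# Write an example .editorconfig into the project root.
cat > .editorconfig <<'EOF'
# Stop searching parent directories for further .editorconfig files
root = true

# Defaults for every file
[*]
charset = utf-8
end_of_line = lf
indent_style = space
indent_size = 4
insert_final_newline = true
trim_trailing_whitespace = true

# Makefiles require tabs
[Makefile]
indent_style = tab
EOF

cat .editorconfig
```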
The Git command line provides a number of options to quickly find commits.
- Find by commit information:
git log --all --grep='<pattern>'
- Find by committer:
git log --committer=<pattern>
- By date:
git log --since=<date>, git log --before=<date>
For more query options, you can check the official documentation.
Tracking Empty Folders
Git itself can’t track empty directories, but sometimes you do need to put an empty directory into the repository. In that case, you can put an empty .gitkeep file in that directory. This filename is just a naming convention and has no special meaning to Git, so you also have to modify the .gitignore file: ignore everything in the directory, then add an exception for .gitkeep.
This makes Git ignore all files in that directory except for .gitkeep, while keeping the directory itself.
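A sketch of the setup, assuming a directory called logs (the directory and file names are made up):

```shell
repo=$(mktemp -d) && cd "$repo" && git init -q

# A directory we want to keep, with a placeholder and some generated content.
mkdir logs
touch logs/.gitkeep logs/debug.log

# Ignore everything inside logs/ except the placeholder file.
cat > .gitignore <<'EOF'
logs/*
!logs/.gitkeep
EOF

# Only .gitignore and logs/.gitkeep get staged; logs/debug.log is ignored.
git add .
git status --short
```

Note that the pattern is logs/* rather than logs/: if the directory itself were ignored, Git would never look inside it and the exception for .gitkeep would have no effect.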
Git is designed for text files, but sometimes you need to put large binary files in your repository, such as images, audio, and other design resources. This can make the repository huge, and if the binary changes, the change history can become large. To solve this problem, you can use the LFS (Large File Storage) extension, which simply allows large files to be saved in another repository, keeping a pointer to it locally. See LFS for details.
The git gc command cleans unneeded files out of the Git database and reduces the disk footprint, which is useful when working on large repositories with a huge number of commits, like nixpkgs.
Only the most recent commit is needed
Sometimes we only need the latest code from a repository for the time being and don’t need the whole Git commit history. In that case, you can use git clone --depth 1 repo-url to clone the repository, which saves download time and local disk space.
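The effect is easy to check against a local repository. In this sketch the source repository and its three commits are made up, and a file:// URL stands in for repo-url because Git ignores --depth when cloning from a plain local path.

```shell
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

work=$(mktemp -d) && cd "$work"

# A source repository with three commits.
git init -q source
for n in 1 2 3; do
  echo "$n" > source/file.txt
  git -C source add file.txt
  git -C source commit -q -m "commit $n"
done

# Shallow clone: only the most recent commit is downloaded.
git clone -q --depth 1 "file://$work/source" shallow

# Compare how many commits each repository knows about.
git -C source rev-list --count HEAD
git -C shallow rev-list --count HEAD
```

If the full history is needed later, git fetch --unshallow in the clone downloads the rest.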