What is the point of 'git submodule init'?
Background
To populate a repository's submodules, one typically invokes:
git submodule init
git submodule update
In this usage, git submodule init seems to do only one thing: populate .git/config with information that is already in .gitmodules.
What is the point of that?
Couldn't git submodule update simply use the information from .gitmodules? This would avoid both:
- an unnecessary command (git submodule init); and
- an unnecessary duplication of data (.gitmodules content into .git/config).
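The duplication described above can be observed directly. The following is a minimal sketch, not taken from the original question: it assumes a reasonably recent Git and builds throwaway local repositories (the names lib1, upstream and clone are placeholders) instead of using real remotes.

```shell
#!/bin/sh
set -eu

# Scratch directory; every repository name below is a placeholder.
work=$(mktemp -d)
cd "$work"

# Wrapper: throwaway identity, plus permission to clone submodules from
# local paths (recent Git versions refuse that by default).
g() {
    git -c user.name=demo -c user.email=demo@example.com \
        -c protocol.file.allow=always "$@"
}

# A one-commit library and a superproject that registers it as a submodule.
g init -q lib1
echo "lib1" > lib1/README
g -C lib1 add README
g -C lib1 commit -q -m "initial lib1"

g init -q upstream
g -C upstream submodule --quiet add "$work/lib1" lib1
g -C upstream commit -q -m "add lib1 as a submodule"

# A fresh clone has .gitmodules but no submodule entries in .git/config.
g clone -q upstream clone
cd clone

declared=$(git config -f .gitmodules --get submodule.lib1.url)
before=$(git config --get submodule.lib1.url || echo "unset")

g submodule --quiet init
after=$(git config --get submodule.lib1.url)

echo ".gitmodules:             $declared"
echo ".git/config before init: $before"
echo ".git/config after init:  $after"
```

After init, the URL in .git/config matches the one in .gitmodules, and the lib1 directory is still empty because nothing has been cloned yet.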
Question
Either:
- there are use-cases for git submodule init that I do not know (in which case, please enlighten me!); or else
- git submodule init is cruft that could be deprecated in Git without any harm.
Which of these is true?
Solution 1:[1]
Reading the git submodule documentation, there is a use-case that ostensibly justifies the existence of git submodule init as a standalone command.
If a user who has cloned a repository wishes to use a different URL for a submodule than is specified by the upstream repository, then that user can:
git submodule init
vim .git/config # Alter submodule URL as desired, without changing .gitmodules
# or polluting history.
git submodule update
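The same override can be scripted with git config instead of an editor. The sketch below is an illustration under stated assumptions: the repositories are throwaway local ones, and lib1-mirror is a hypothetical stand-in for whatever alternative URL the user prefers.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
cd "$work"

# Wrapper: throwaway identity, plus permission to clone submodules from
# local paths (recent Git versions refuse that by default).
g() {
    git -c user.name=demo -c user.email=demo@example.com \
        -c protocol.file.allow=always "$@"
}

# Upstream library, plus a second copy standing in for a private mirror.
g init -q lib1
echo "lib1" > lib1/README
g -C lib1 add README
g -C lib1 commit -q -m "initial lib1"
g clone -q lib1 lib1-mirror

g init -q upstream
g -C upstream submodule --quiet add "$work/lib1" lib1
g -C upstream commit -q -m "add lib1 as a submodule"

g clone -q upstream clone
cd clone

# Same three steps as above, with `git config` replacing the manual edit.
g submodule --quiet init lib1
git config submodule.lib1.url "$work/lib1-mirror"
g submodule --quiet update lib1

declared=$(git config -f .gitmodules --get submodule.lib1.url)
origin=$(git -C lib1 remote get-url origin)
pending=$(git status --porcelain)

echo ".gitmodules still says: $declared"
echo "submodule cloned from:  $origin"
echo "uncommitted changes:    ${pending:-none}"
```

The override lives only in the local .git/config, so .gitmodules is untouched and there is nothing to commit.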
Solution 2:[2]
Imagine the repository has 10 submodules and you are interested in only two of them. In that case, you may want to pull updates from the remote repository for just those two from time to time. git submodule init works well for this: once you run git submodule init for those two submodules, git submodule update --remote applies only to them.
Two workflow demos follow.
Workflow 1: Submodules are libraries that several projects use.
I think this is one of the common use cases.
You just cloned "my-project".
git clone https://example.com/demo/my-project
Its top-level structure looks like this:
[Figure: top-level directory tree of my-project]
The contents of .gitmodules:
[submodule "lib1"]
path = lib1
url = https://example.com/demo/lib1
[submodule "lib2"]
path = lib2
url = https://example.com/demo/lib2
[submodule "lib3"]
path = lib3
url = https://example.com/demo/lib3
[submodule "lib4"]
path = lib4
url = https://example.com/demo/lib4
You want to refactor code1.js, which references lib1 and lib2, so you don't need to clone and check out lib3 and lib4. You just run the command below.
git submodule init lib1 lib2
Now let's see the contents of .git/config
...
[submodule "lib1"]
active = true
url = https://example.com/demo/lib1
[submodule "lib2"]
active = true
url = https://example.com/demo/lib2
This means something like "Ready to update lib1 and lib2 from example.com/demo".
At this point, the lib1 and lib2 directories are empty. You can clone and check out lib1 and lib2 with one command:
git submodule update
Now you are able to refactor code1.js without import errors.
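The effect of the selective init can be checked with git submodule status, which marks uninitialized submodules with a leading "-". The following is a rough sketch using throwaway local repositories in place of the example.com URLs; the directory name clone is an assumption made for the demo.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
cd "$work"

# Wrapper: throwaway identity, plus permission to clone submodules from
# local paths (recent Git versions refuse that by default).
g() {
    git -c user.name=demo -c user.email=demo@example.com \
        -c protocol.file.allow=always "$@"
}

# Superproject with four one-commit libraries registered as submodules.
g init -q my-project
for lib in lib1 lib2 lib3 lib4; do
    g init -q "$lib"
    echo "$lib" > "$lib/README"
    g -C "$lib" add README
    g -C "$lib" commit -q -m "initial $lib"
    g -C my-project submodule --quiet add "$work/$lib" "$lib"
done
g -C my-project commit -q -m "add four submodules"

g clone -q my-project clone
cd clone

# Initialize only the two libraries that are needed, then populate them.
g submodule --quiet init lib1 lib2
g submodule --quiet update

# Lines starting with a space are checked out; "-" means not initialized.
git submodule status
populated=$(git submodule status | grep -c '^ ')
skipped=$(git submodule status | grep -c '^-')
echo "populated: $populated, skipped: $skipped"
```

Only lib1 and lib2 are cloned; the lib3 and lib4 directories stay empty.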
Submodules are just references to specific commits, so when you want to move the libraries to newer versions you have to update those references. You can do that with the command below.
git submodule update --remote
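To make the "reference" aspect concrete, here is a small sketch, again with throwaway local repositories rather than the real URLs. It assumes the submodule tracks the default branch of its remote, which is what Git does when no branch is configured in .gitmodules.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
cd "$work"

# Wrapper: throwaway identity, plus permission to clone submodules from
# local paths (recent Git versions refuse that by default).
g() {
    git -c user.name=demo -c user.email=demo@example.com \
        -c protocol.file.allow=always "$@"
}

g init -q lib1
echo "v1" > lib1/README
g -C lib1 add README
g -C lib1 commit -q -m "lib1 v1"

g init -q upstream
g -C upstream submodule --quiet add "$work/lib1" lib1
g -C upstream commit -q -m "add lib1 as a submodule"

g clone -q upstream clone
cd clone
g submodule --quiet init lib1
g submodule --quiet update

# Meanwhile, the library publishes a new commit.
echo "v2" >> "$work/lib1/README"
g -C "$work/lib1" commit -q -a -m "lib1 v2"

# Move the submodule checkout to the tip of its remote branch.
g submodule --quiet update --remote lib1

latest=$(git -C "$work/lib1" rev-parse HEAD)
checked_out=$(git -C lib1 rev-parse HEAD)
recorded=$(git rev-parse HEAD:lib1)

echo "latest upstream commit:    $latest"
echo "checked out in submodule:  $checked_out"
echo "recorded in superproject:  $recorded"
```

The superproject keeps recording the old commit until the changed lib1 entry is staged and committed there (git add lib1, then git commit).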
Now you can see how useful it is to only initialize the submodules you need.
Workflow 2: Each submodule is a project and one big top project includes them.
I'm a fan of this.
You clone "main-project".
git clone https://example.com/demo/main-project
Its top-level structure looks like this:
[Figure: top-level directory tree of main-project]
You can see a directory named "shared". There is a rule in this workflow: if you want to use the shared code of main-project in your project, you have to create the project as a submodule of main-project.
I like to put entity classes in the shared directory, like below.
[Figure: contents of the shared directory]
Back to the submodule workflow: the contents of .gitmodules are as follows.
[submodule "sub-project1"]
path = sub-project1
url = https://example.com/demo/sub-project1
[submodule "sub-project2"]
path = sub-project2
url = https://example.com/demo/sub-project2
[submodule "sub-project3"]
path = sub-project3
url = https://example.com/demo/sub-project3
[submodule "sub-project4"]
path = sub-project4
url = https://example.com/demo/sub-project4
This time you want to refactor some code in the shared directory of main-project, and you know that only sub-project1 and sub-project2 reference the shared code, so you don't need to clone and check out sub-project3 and sub-project4. You just run the command below.
git submodule init sub-project1 sub-project2
As mentioned in Workflow 1, you need to run the command below to clone and check them out.
git submodule update
Would I run git submodule update --remote in this case? Do I even have to init and update the submodules to refactor code in the shared directory? Yes: you have to run the tests in the submodules after refactoring the shared code, and if any update to a submodule is committed and pushed to the remote repository while you are refactoring, you need to fetch it with git submodule update --remote.
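One way to run the tests in just the initialized submodules is git submodule foreach, which skips submodules that are not checked out. This is only a sketch: the repositories are throwaway local ones, and the echo command is a placeholder for whatever test command each sub-project really uses.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
cd "$work"

# Wrapper: throwaway identity, plus permission to clone submodules from
# local paths (recent Git versions refuse that by default).
g() {
    git -c user.name=demo -c user.email=demo@example.com \
        -c protocol.file.allow=always "$@"
}

# Top project with three one-commit sub-projects registered as submodules.
g init -q main-project
for sub in sub-project1 sub-project2 sub-project3; do
    g init -q "$sub"
    echo "$sub" > "$sub/README"
    g -C "$sub" add README
    g -C "$sub" commit -q -m "initial $sub"
    g -C main-project submodule --quiet add "$work/$sub" "$sub"
done
g -C main-project commit -q -m "add sub-projects"

g clone -q main-project clone
cd clone
g submodule --quiet init sub-project1 sub-project2
g submodule --quiet update

# Placeholder for a real test command: print each submodule that is visited.
visited=$(g submodule --quiet foreach 'echo "$name"')
echo "$visited"
```

sub-project3 is never visited because it was not initialized.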
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | |
Solution 2 | Peter Mortensen |