Git submodules allow you to keep a Git repository as a subdirectory of another Git repository. Git submodules are simply a reference to another repository at a particular snapshot in time. Git submodules enable a Git repository to incorporate and track version history of external code.
Often a code repository will depend upon external code. This external code can be incorporated in a few different ways. The external code can be directly copied and pasted into the main repository. This method has the downside of losing any upstream changes to the external repository. Another method of incorporating external code is through the use of a language's package management system like RubyGems or npm. This method has the downside of requiring installation and version management everywhere the original code is deployed. Neither of these methods enables tracking edits and changes to the external repository.
A Git submodule is a record within a host Git repository that points to a specific commit in another external repository. Submodules are very static and only track specific commits. Submodules do not track Git refs or branches and are not automatically updated when the host repository is updated. When adding a submodule to a repository, a new .gitmodules file will be created. The .gitmodules file contains metadata about the mapping between the submodule project's URL and local directory. If the host repository has multiple submodules, the .gitmodules file will have an entry for each submodule.
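As a rough illustration of that record, the sketch below builds two throwaway repositories in a temporary directory and adds one to the other as a submodule. The names awesomelibrary and host, and the use of a local path in place of a real project URL, are assumptions made only for this example. The protocol.file.allow setting is there because recent Git versions refuse local-path submodules by default.

```shell
set -eu
# Throwaway sandbox so nothing touches your real repositories or Git config.
work="$(mktemp -d)"
export HOME="$work" XDG_CONFIG_HOME="$work/.config"
git config --global user.name "Example Dev"
git config --global user.email "dev@example.com"
git config --global protocol.file.allow always   # permit local-path submodules

# Stand-in for the external project (a local path instead of a real URL).
git init -q "$work/awesomelibrary"
cd "$work/awesomelibrary"
echo "v1" > lib.txt
git add lib.txt
git commit -q -m "Initial library commit"

# Host repository that embeds the library as a submodule.
git init -q "$work/host"
cd "$work/host"
git submodule --quiet add "$work/awesomelibrary" awesomelibrary
git commit -q -m "Add awesomelibrary submodule"

cat .gitmodules                         # submodule name, path, and URL
git ls-files --stage awesomelibrary     # mode 160000 entry: the pinned commit
```

The .gitmodules file holds the path-to-URL mapping, while the pinned commit itself is stored in the host repository's index and history as a special entry.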
The default behavior of git submodule init is to copy the mapping from the .gitmodules file into the local ./.git/config file. This may seem redundant and lead you to question the usefulness of git submodule init. However, git submodule init has extended behavior in which it accepts a list of explicit module names. This enables a workflow of activating only the specific submodules that are needed for work on the repository. This can be helpful if there are many submodules in a repo but they don't all need to be fetched for the work you are doing.
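The selective workflow might look like the following sketch. It assumes a hypothetical host repository with two submodules, liba and libb, all created locally in a temporary directory. Only liba is initialized in the fresh clone, so only liba is fetched.

```shell
set -eu
# Throwaway sandbox so nothing touches your real repositories or Git config.
work="$(mktemp -d)"
export HOME="$work" XDG_CONFIG_HOME="$work/.config"
git config --global user.name "Example Dev"
git config --global user.email "dev@example.com"
git config --global protocol.file.allow always   # permit local-path submodules

# Two stand-in libraries (hypothetical names).
for name in liba libb; do
  git init -q "$work/$name"
  echo "$name" > "$work/$name/README"
  git -C "$work/$name" add README
  git -C "$work/$name" commit -q -m "Initial $name commit"
done

# Host repository that uses both libraries as submodules.
git init -q "$work/host"
cd "$work/host"
git submodule --quiet add "$work/liba" liba
git submodule --quiet add "$work/libb" libb
git commit -q -m "Add two submodules"

# A fresh clone starts with empty submodule directories.
git clone -q "$work/host" "$work/clone"
cd "$work/clone"
git submodule init liba          # copy only liba's mapping into .git/config
git submodule --quiet update     # fetches initialized submodules only

git config --get-regexp '^submodule\..*\.url$'   # shows which submodules are active
```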
Here we have changed directory to the awesomelibrary submodule. We have created a new text file new_awesome.txt with some content and we have added and committed this new file to the submodule. Now let us change directories back to the parent repository and review the current state of the parent repo.
Executing git status shows us that the parent repository is aware of the new commits to the awesomelibrary submodule. It doesn't go into detail about the specific updates because that is the submodule repository's responsibility. The parent repository is only concerned with pinning the submodule to a commit. Now we can update the parent repository again by doing a git add and git commit on the submodule. This will put everything into a good state with the local content. If you are working in a team environment, it is critical that you then git push both the submodule updates and the parent repository updates.
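The local part of that sequence can be sketched as follows. The repositories are throwaway ones created in a temporary directory, and the file contents and commit messages are placeholders. Pushing is left out here because no remote is involved.

```shell
set -eu
# Throwaway sandbox so nothing touches your real repositories or Git config.
work="$(mktemp -d)"
export HOME="$work" XDG_CONFIG_HOME="$work/.config"
git config --global user.name "Example Dev"
git config --global user.email "dev@example.com"
git config --global protocol.file.allow always   # permit local-path submodules

# Stand-in library and a parent repository that embeds it.
git init -q "$work/awesomelibrary"
echo "v1" > "$work/awesomelibrary/lib.txt"
git -C "$work/awesomelibrary" add lib.txt
git -C "$work/awesomelibrary" commit -q -m "Initial library commit"

git init -q "$work/parent"
cd "$work/parent"
git submodule --quiet add "$work/awesomelibrary" awesomelibrary
git commit -q -m "Add awesomelibrary submodule"

# Work inside the submodule: it is an ordinary Git repository.
cd awesomelibrary
echo "new awesome textfile" > new_awesome.txt
git add new_awesome.txt
git commit -q -m "Add new text file"

# Back in the parent, the submodule shows up as changed.
cd ..
git status --short

# Record the new pinned commit in the parent repository.
git add awesomelibrary
git commit -q -m "Update awesomelibrary submodule"
```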
When working with submodules, a common pattern of confusion and error is forgetting to push updates for remote users. Suppose that, in the awesomelibrary work we just did, we had pushed only the updates to the parent repository. Another developer would pull the latest parent repository and it would be pointing at a commit of awesomelibrary that they were unable to pull because we had forgotten to push the submodule. This would break the remote developer's local repo. To avoid this failure scenario, make sure to always commit and push both the submodule and the parent repository.
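One way to guard against this is the --recurse-submodules option of git push. With check, Git refuses to push the parent when a referenced submodule commit is not on any remote. With on-demand, it pushes the submodule first. The sketch below is a minimal demonstration under assumptions: both remotes are local bare repositories created for the example, and the parent and submodule share the same default branch name.

```shell
set -eu
# Throwaway sandbox so nothing touches your real repositories or Git config.
work="$(mktemp -d)"
export HOME="$work" XDG_CONFIG_HOME="$work/.config"
git config --global user.name "Example Dev"
git config --global user.email "dev@example.com"
git config --global protocol.file.allow always   # permit local-path submodules

# Bare repositories play the role of the shared remotes.
git init -q --bare "$work/awesomelibrary.git"
git init -q --bare "$work/parent.git"

# Seed the library remote with one commit.
git clone -q "$work/awesomelibrary.git" "$work/seed" 2>/dev/null
cd "$work/seed"
echo "v1" > lib.txt
git add lib.txt
git commit -q -m "Initial library commit"
git push -q -u origin HEAD

# Developer clone of the parent, with the library added as a submodule.
git clone -q "$work/parent.git" "$work/dev" 2>/dev/null
cd "$work/dev"
git submodule --quiet add "$work/awesomelibrary.git" awesomelibrary
git commit -q -m "Add awesomelibrary submodule"
git push -q -u origin HEAD

# Commit in the submodule and record the new pointer in the parent.
cd awesomelibrary
echo "new awesome textfile" > new_awesome.txt
git add new_awesome.txt
git commit -q -m "Add new text file"
cd ..
git add awesomelibrary
git commit -q -m "Update awesomelibrary submodule"

# "check" aborts because the submodule commit exists only locally.
if git push -q --recurse-submodules=check 2>/dev/null; then
  check_result="pushed"
else
  check_result="refused"
fi
echo "push with check: $check_result"

# "on-demand" pushes the submodule first, then the parent.
git push -q --recurse-submodules=on-demand
```

Setting the push.recurseSubmodules configuration option to check or on-demand makes the same behavior the default for git push.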
Git submodules are a powerful way to leverage Git as an external dependency management tool. Weigh the pros and cons of Git submodules before using them, as they are an advanced feature and can involve a learning curve for team members.
Eventually, any interesting software project will come to depend on another project, library, or framework. Git provides submodules to help with this. Submodules allow you to include or embed one or more repositories as a sub-folder inside another repository.
Git addresses this issue using submodules. Submodules allow you to keep a Git repository as a subdirectory of another Git repository. This lets you clone another repository into your project and keep your commits separate.
Since the URL in the .gitmodules file is what other people will first try to clone/fetch from, make sure to use a URL that they can access if possible. For example, if you use a different URL to push to than others would to pull from, use the one that others have access to. You can overwrite this value locally with git config submodule.DbConnector.url PRIVATE_URL for your own use. When applicable, a relative URL can be helpful.
There is another way to do this which is a little simpler, however. If you pass --recurse-submodules to the git clone command, it will automatically initialize and update each submodule in the repository, including nested submodules if any of the submodules in the repository have submodules themselves.
If you already cloned the project and forgot --recurse-submodules, you can combine the git submodule init and git submodule update steps by running git submodule update --init. To also initialize, fetch and checkout any nested submodules, you can use the foolproof git submodule update --init --recursive.
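Both routes can be compared side by side in a small sandbox. The sketch below assumes a hypothetical MainProject whose submodule outer has its own nested submodule inner, all created locally. It clones the project once with --recurse-submodules and once without, repairing the second clone afterwards.

```shell
set -eu
# Throwaway sandbox so nothing touches your real repositories or Git config.
work="$(mktemp -d)"
export HOME="$work" XDG_CONFIG_HOME="$work/.config"
git config --global user.name "Example Dev"
git config --global user.email "dev@example.com"
git config --global protocol.file.allow always   # permit local-path submodules

# Innermost library.
git init -q "$work/inner"
echo "inner" > "$work/inner/inner.txt"
git -C "$work/inner" add inner.txt
git -C "$work/inner" commit -q -m "Initial inner commit"

# A library that itself contains a submodule.
git init -q "$work/outer"
cd "$work/outer"
git submodule --quiet add "$work/inner" inner
git commit -q -m "Add inner submodule"

# The main project, which embeds the outer library.
git init -q "$work/MainProject"
cd "$work/MainProject"
git submodule --quiet add "$work/outer" outer
git commit -q -m "Add outer submodule"

# Route 1: clone and populate all submodules, nested ones included.
git clone -q --recurse-submodules "$work/MainProject" "$work/clone-a"

# Route 2: plain clone first, then initialize and update recursively.
git clone -q "$work/MainProject" "$work/clone-b"
cd "$work/clone-b"
git submodule --quiet update --init --recursive

ls "$work/clone-a/outer/inner" "$work/clone-b/outer/inner"
```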
There is an easier way to do this as well, if you prefer to not manually fetch and merge in the subdirectory. If you run git submodule update --remote, Git will go into your submodules and fetch and update for you.
Git will by default try to update all of your submodules when you run git submodule update --remote. If you have a lot of them, you may want to pass the name of just the submodule you want to try to update.
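A minimal sketch of updating a single submodule follows. It assumes a local MainProject with two submodules, DbConnector and CryptoLibrary, whose upstream repositories both gain a new commit. Only DbConnector is named on the command line, so only it moves forward.

```shell
set -eu
# Throwaway sandbox so nothing touches your real repositories or Git config.
work="$(mktemp -d)"
export HOME="$work" XDG_CONFIG_HOME="$work/.config"
git config --global user.name "Example Dev"
git config --global user.email "dev@example.com"
git config --global protocol.file.allow always   # permit local-path submodules

# Two stand-in upstream projects.
for name in DbConnector CryptoLibrary; do
  git init -q "$work/$name"
  echo "v1" > "$work/$name/version.txt"
  git -C "$work/$name" add version.txt
  git -C "$work/$name" commit -q -m "$name v1"
done

# Main project that pins both at v1.
git init -q "$work/MainProject"
cd "$work/MainProject"
for name in DbConnector CryptoLibrary; do
  git submodule --quiet add "$work/$name" "$name"
done
git commit -q -m "Add submodules"

# Both upstream projects move ahead.
for name in DbConnector CryptoLibrary; do
  echo "v2" > "$work/$name/version.txt"
  git -C "$work/$name" commit -q -am "$name v2"
done

# Fetch and check out the latest upstream commit for DbConnector only.
git submodule --quiet update --remote DbConnector

git status --short   # the moved submodule is listed as modified until committed
```

The new submodule commit still has to be added and committed in MainProject before others will see it.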
Note that to be on the safe side, you should run git submodule update with the --init flag in case the MainProject commits you just pulled added new submodules, and with the --recursive flag if any submodules have nested submodules.
If you want to automate this process, you can add the --recurse-submodules flag to the git pull command (since Git 2.14). This will make Git run git submodule update right after the pull, putting the submodules in the correct state. Moreover, if you want to make Git always pull with --recurse-submodules, you can set the configuration option submodule.recurse to true (this works for git pull since Git 2.15). This option will make Git use the --recurse-submodules flag for all commands that support it (except clone).
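The following sketch shows the flag in a sandbox. It assumes a local MainProject with a DbConnector submodule and a second clone named teammate that is one submodule update behind. All of these are stand-ins created for the example.

```shell
set -eu
# Throwaway sandbox so nothing touches your real repositories or Git config.
work="$(mktemp -d)"
export HOME="$work" XDG_CONFIG_HOME="$work/.config"
git config --global user.name "Example Dev"
git config --global user.email "dev@example.com"
git config --global protocol.file.allow always   # permit local-path submodules

# Stand-in upstream library and the main project that embeds it.
git init -q "$work/DbConnector"
echo "v1" > "$work/DbConnector/version.txt"
git -C "$work/DbConnector" add version.txt
git -C "$work/DbConnector" commit -q -m "DbConnector v1"

git init -q "$work/MainProject"
cd "$work/MainProject"
git submodule --quiet add "$work/DbConnector" DbConnector
git commit -q -m "Add DbConnector"

# A collaborator's clone, fully populated at v1.
git clone -q --recurse-submodules "$work/MainProject" "$work/teammate"

# Upstream: the library gains a commit and MainProject records it.
echo "v2" > "$work/DbConnector/version.txt"
git -C "$work/DbConnector" commit -q -am "DbConnector v2"
cd "$work/MainProject"
git submodule --quiet update --remote DbConnector
git commit -q -am "Bump DbConnector"

# Collaborator: pull the superproject and update submodules in one step.
cd "$work/teammate"
git pull -q --recurse-submodules

# To make this the default for commands that support it (except clone):
#   git config submodule.recurse true

cat DbConnector/version.txt
```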
There is a special situation that can happen when pulling superproject updates: it could be that the upstream repository has changed the URL of the submodule in the .gitmodules file in one of the commits you pull. This can happen, for example, if the submodule project changes its hosting platform. In that case, it is possible for git pull --recurse-submodules, or git submodule update, to fail if the superproject references a submodule commit that is not found in the submodule remote locally configured in your repository. To remedy this situation, run the git submodule sync command to copy the new URL into your local configuration, and then update the submodule again.
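The scenario can be reproduced in a sandbox, as in the sketch below. The oldhost and newhost directories are stand-ins for two hosting platforms, and the teammate clone plays the developer who pulls the URL change. The pull is done with --no-recurse-submodules so that the failure appears in the separate git submodule update step.

```shell
set -eu
# Throwaway sandbox so nothing touches your real repositories or Git config.
work="$(mktemp -d)"
export HOME="$work" XDG_CONFIG_HOME="$work/.config"
git config --global user.name "Example Dev"
git config --global user.email "dev@example.com"
git config --global protocol.file.allow always   # permit local-path submodules

# The submodule project at its original location.
git init -q "$work/oldhost/DbConnector"
cd "$work/oldhost/DbConnector"
echo "v1" > version.txt
git add version.txt
git commit -q -m "DbConnector v1"

git init -q "$work/MainProject"
cd "$work/MainProject"
git submodule --quiet add "$work/oldhost/DbConnector" DbConnector
git commit -q -m "Add DbConnector"

git clone -q --recurse-submodules "$work/MainProject" "$work/teammate"

# The project moves to a new host and gains a commit that exists only there.
git clone -q "$work/oldhost/DbConnector" "$work/newhost/DbConnector"
cd "$work/newhost/DbConnector"
echo "v2" > version.txt
git commit -q -am "DbConnector v2"

# Upstream MainProject switches the URL and pins the new commit.
cd "$work/MainProject"
git config -f .gitmodules submodule.DbConnector.url "$work/newhost/DbConnector"
git submodule --quiet sync
git submodule --quiet update --remote DbConnector
git commit -q -am "Move DbConnector to new host"

# Teammate pulls; the locally configured submodule URL is still the old one.
cd "$work/teammate"
git pull -q --no-recurse-submodules
if git submodule --quiet update 2>/dev/null; then
  first_try="succeeded"
else
  first_try="failed"
fi
echo "update before sync: $first_try"

# Copy the new URL into the local config, then update from the new URL.
git submodule --quiet sync --recursive
git submodule --quiet update --init --recursive
cat DbConnector/version.txt
```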
If we commit in the main project and push it up without pushing the submodule changes up as well, other people who try to check out our changes are going to be in trouble since they will have no way to get the submodule changes that are depended on. Those changes will only exist on our local copy.
For instance, switching branches with submodules in them can also be tricky with Git versions older than Git 2.13. If you create a new branch, add a submodule there, and then switch back to a branch without that submodule, you still have the submodule directory as an untracked directory.
Newer Git versions (Git >= 2.13) simplify all this by adding the --recurse-submodules flag to the git checkout command, which takes care of placing the submodules in the right state for the branch we are switching to.
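A small sandbox sketch of that behavior follows. It assumes a local MainProject, a stand-in CryptoLibrary, and a branch named add-crypto that introduces the submodule. Switching away with --recurse-submodules should remove the submodule files, and switching back should restore them.

```shell
set -eu
# Throwaway sandbox so nothing touches your real repositories or Git config.
work="$(mktemp -d)"
export HOME="$work" XDG_CONFIG_HOME="$work/.config"
git config --global user.name "Example Dev"
git config --global user.email "dev@example.com"
git config --global protocol.file.allow always   # permit local-path submodules

# Stand-in library.
git init -q "$work/CryptoLibrary"
echo "crypto" > "$work/CryptoLibrary/crypto.txt"
git -C "$work/CryptoLibrary" add crypto.txt
git -C "$work/CryptoLibrary" commit -q -m "Initial crypto commit"

# Main project with one commit on its original branch.
git init -q "$work/MainProject"
cd "$work/MainProject"
echo "main" > README
git add README
git commit -q -m "Initial commit"

# New branch that adds the submodule.
git checkout -q -b add-crypto
git submodule --quiet add "$work/CryptoLibrary" CryptoLibrary
git commit -q -m "Add crypto library"

# Switch back to the previous branch ("-"), updating submodules too.
git checkout -q --recurse-submodules -
if [ -e CryptoLibrary/crypto.txt ]; then
  on_original="present"
else
  on_original="absent"
fi
echo "submodule files on the original branch: $on_original"

# Switch to the branch with the submodule again; its files come back.
git checkout -q --recurse-submodules add-crypto
ls CryptoLibrary
```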
Luckily, you can tell Git (>= 2.14) to always use the --recurse-submodules flag by setting the configuration option submodule.recurse: git config submodule.recurse true. As noted above, this will also make Git recurse into submodules for every command that has a --recurse-submodules option (except git clone).
Then, when you switch back, you get an empty CryptoLibrary directory for some reason and git submodule update may not fix it either. You may need to go into your submodule directory and run a git checkout . to get all your files back. You could run this in a submodule foreach script to run it for multiple submodules.
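The foreach variant might look like this sketch. It assumes a local MainProject with two submodules and simulates the missing files by deleting them from each submodule's working tree. The repository names are placeholders.

```shell
set -eu
# Throwaway sandbox so nothing touches your real repositories or Git config.
work="$(mktemp -d)"
export HOME="$work" XDG_CONFIG_HOME="$work/.config"
git config --global user.name "Example Dev"
git config --global user.email "dev@example.com"
git config --global protocol.file.allow always   # permit local-path submodules

# Two stand-in libraries.
for name in DbConnector CryptoLibrary; do
  git init -q "$work/$name"
  echo "$name" > "$work/$name/README"
  git -C "$work/$name" add README
  git -C "$work/$name" commit -q -m "Initial $name commit"
done

git init -q "$work/MainProject"
cd "$work/MainProject"
for name in DbConnector CryptoLibrary; do
  git submodule --quiet add "$work/$name" "$name"
done
git commit -q -m "Add submodules"

# Simulate submodule working trees that lost their files.
rm DbConnector/README CryptoLibrary/README

# Run the same restore command inside every submodule.
git submodule --quiet foreach 'git checkout .'

ls DbConnector CryptoLibrary
```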
How is this done? In my limited experience, modules with submodules are simply folders with an __init__.py file, while modules with functions/classes are actual Python files. How does one create a module "folder" that also has functions/classes?
I have a website directory versioned with git. I use submodules for required libraries like Twitter Bootstrap, colorbox and lessjs because I should not track the source code but only the version of their code I use.
It seems like this should be a built-in capability of Git -- it's too complicated to piece together with scripting. I guess not many people use submodules or git-archive, so even fewer want to use both.
For an example of exactly the kind of problem that @congvan is correctly pointing out, see this recent post: "I'm confused with types names". Including a file multiple times will result in different types with the same name and is likely to create all kinds of confusion.
Another possibility would be to pull all the submodule repos directly into your source tree, to avoid problems with submodules altogether. That still requires a different branch, and merging the upstream master branch every time changes occur.
Well, overall it sounds more complicated the more I think about it. If there is no specific reason to also fork the submodules, stick with them as upstream added them. Modifying/removing submodules is really not fun btw.
May I ask for your opinion about how you would work with a forked project containing submodules if you would want to work on the main repo as well as on the submodules and occasionally create pull requests on each of the repos?