aragost Trifork: Mercurial Kick Start Exercises


Subrepositories

Introduction

Mercurial has a feature called subrepositories. This feature allows you to treat a collection of repositories as a group.

Reusing code is an important concept in software architecture. Subrepositories support reuse by letting shared libraries and modules live in their own repositories, which the projects using them include as subrepositories.

Subrepositories can be nested.

The basic commands are the same for subrepositories as for ordinary repositories. Where it is meaningful, commands take a --subrepos option to recurse down into the subrepositories.

When you execute a command within the working directory of a subrepository, it behaves exactly as it would in an ordinary repository: a subrepository doesn’t “know” that it is inside another repository.

Subrepositories

Subrepositories are just normal repositories, so we first create one repository and then another one inside it, using the hg init command:

alice$ hg init mainrepo
alice$ cd mainrepo
alice$ hg init subrepo
alice$ ls
subrepo

We have now created two repositories, one nested within the other. This alone doesn’t make the inner repository a subrepository. It also has to be marked as such in the repository that contains it.

Marking a subrepository is done by creating an entry for it in the special .hgsub file. This is done by adding the following line:

alice$ echo subrepo = subrepo > .hgsub
alice$ hg add .hgsub

Note

The way to interpret the .hgsub file is as a set of lines of the form:

where/to/put/the/subrepo = where/to/get/the/subrepo

The left-hand side is a path relative to the root of your clone and it tells Mercurial where to put the subrepository in your clone.

The right-hand side is either a path relative to the place you clone from, or an absolute path. If it is a relative path and you do hg clone mainrepo my-main, then Mercurial will create a new repository at my-main/where/to/put/the/subrepo and fill it with changesets taken from mainrepo/where/to/get/the/subrepo. If it is an absolute path, Mercurial will naturally take the changesets from where/to/get/the/subrepo.

In the simple case where you have ‘subrepo = subrepo’, you end up with Mercurial doing these commands:

$ hg clone http://server/repos/repo my-repo
$ cd my-repo
$ hg clone http://server/repos/repo/subrepo
$ cd ..

and if you had used subrepo = ../subrepo, then the commands would be:

$ hg clone http://server/repos/repo my-repo
$ cd my-repo
$ hg clone http://server/repos/subrepo
$ cd ..

where the last URL is the normalized version of http://server/repos/repo/../subrepo.
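The resolution rule in this note can be sketched in plain shell. This is only a rough illustration of the rule as described above, not Mercurial's own code: resolve_subrepo_source is a hypothetical helper, and it assumes the main repository was cloned from a URL of the form scheme://host/path.

```shell
# Hypothetical helper: work out where a subrepository would be fetched from,
# given the URL the main repository was cloned from ($1) and the right-hand
# side of an .hgsub entry ($2). Assumes $1 looks like scheme://host/path.
resolve_subrepo_source() {
    parent_url=$1
    src=$2
    case $src in
        *://*|/*)
            # Absolute URL or absolute path: used as-is.
            printf '%s\n' "$src"
            return 0
            ;;
    esac
    scheme=${parent_url%%://*}
    rest=${parent_url#*://}
    host=${rest%%/*}
    joined=${rest#"$host"}/$src      # e.g. /repos/repo/../subrepo
    result=
    old_ifs=$IFS
    IFS=/
    set -f                           # no globbing while splitting on "/"
    for part in $joined; do
        case $part in
            ''|.) ;;                       # skip empty and "." components
            ..) result=${result%/*} ;;     # ".." drops the previous component
            *)  result=$result/$part ;;
        esac
    done
    set +f
    IFS=$old_ifs
    printf '%s://%s%s\n' "$scheme" "$host" "$result"
}

resolve_subrepo_source http://server/repos/repo subrepo      # http://server/repos/repo/subrepo
resolve_subrepo_source http://server/repos/repo ../subrepo   # http://server/repos/subrepo
```

The two calls at the end reproduce the two cases from the note: a plain relative path lands below the main repository, while ../subrepo lands next to it.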

Remote Subrepositories

We also want to link a remote subrepository to our project. The procedure is similar to the one above:

alice$ hg clone http://www.selenic.com/repo/hello remoterepo
requesting all changes
adding changesets
adding manifests
adding file changes
added 2 changesets with 2 changes to 2 files
updating to branch default
2 files updated, 0 files merged, 0 files removed, 0 files unresolved
alice$ echo remoterepo = http://www.selenic.com/repo/hello >> .hgsub
alice$ ls
remoterepo
subrepo

Note that we clone the repository to the place where it should be, because just adding the entry to the .hgsub file will not do anything unless there is a repository at the given location.

Subversion Subrepositories

A company might have some libraries in Subversion repositories which, for various reasons, cannot be converted to Mercurial. No problem: Mercurial can use SVN repositories as subrepositories. The procedure is as above, except that [svn] should be prepended to the URL in the .hgsub file.

Adding an SVN subrepository therefore looks like this:

alice$ svn co http://mercurial.aragost.com/svn/hello/trunk svnrepo
A    svnrepo/hello.c
A    svnrepo/Makefile
A    svnrepo/README
 U   svnrepo
Checked out revision 10.
alice$ ls
remoterepo
subrepo
svnrepo
alice$ echo svnrepo = [svn]http://mercurial.aragost.com/svn/hello/trunk >> .hgsub

SVN commands should then be used within the subrepository, and Mercurial commands in the main repository.

Working with Subrepositories

We will now show how subrepositories work in daily usage.

Committing

The status can, as always, be seen with hg status, but you have to add the --subrepos option to recurse through the subrepositories. Note that the nested repositories are not treated as subrepositories until the .hgsub file has been committed.

alice$ hg status
A .hgsub

Our changes are not committed yet, so we will do that now.

alice$ hg commit -m "Subrepositories added"
alice$ hg status
alice$ hg status --subrepos

The default behavior of commit is to first recurse through the subrepositories and commit in each, and then commit the outer repository. The reason is that the subrepositories are part of the project under version control, so a snapshot of the whole project makes the most sense. (In newer Mercurial versions this recursion has to be requested with --subrepos or the ui.commitsubrepos setting; otherwise hg commit aborts when a subrepository has uncommitted changes.) The resulting state of the subrepositories is saved in a special file called .hgsubstate. This file is not intended to be edited by the user; it tracks which revision of each subrepository is linked with the repository.
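To make the role of .hgsubstate more concrete: each line in the file pairs a changeset hash with a subrepository path. The sketch below only reads such a file and never edits it. The helper name pinned_revision is hypothetical, and both hashes in the sample are made-up placeholders.

```shell
# Look up the revision recorded for one subrepository in an
# .hgsubstate-style file. pinned_revision is a hypothetical helper,
# not a Mercurial command.
pinned_revision() {
    state_file=$1
    wanted=$2
    # Each line has the form "<changeset hash> <subrepository path>".
    while read -r node path; do
        if [ "$path" = "$wanted" ]; then
            printf '%s\n' "$node"
            return 0
        fi
    done < "$state_file"
    return 1    # no entry for that path
}

# Sample content; both hashes are made-up placeholders.
sample=$(mktemp)
cat > "$sample" <<'EOF'
0123456789abcdef0123456789abcdef01234567 remoterepo
89abcdef0123456789abcdef0123456789abcdef subrepo
EOF

pinned_revision "$sample" subrepo    # 89abcdef0123456789abcdef0123456789abcdef
```

In a real working copy you would pass the path to the .hgsubstate file in the root of the main repository instead of the sample file.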

Pushing and Pulling

Mercurial will always attempt to push all subrepositories of a repository before pushing the repository itself. This ensures that the repository being pushed to cannot end up referring to a nonexistent version of a subrepository.

Pulling, on the other hand, is not recursive, because Mercurial cannot know which subrepositories to pull until it has updated to a specific changeset.

Another Version

As mentioned, you should not edit .hgsubstate. If you want to switch the subrepository to a different version, go to the subrepository and update it to the wanted version. This change also has to be recorded in the repository above, so you have to commit there as well. As described above, that commit updates the state of the subrepository in the .hgsubstate file.

First we make some changes and commit them in the subrepository. The final commit, in the main repository, updates and commits the .hgsubstate file.

alice$ cd subrepo
alice$ echo a > a.txt
alice$ hg add a.txt
alice$ hg commit -m "A" 
alice$ echo b > b.txt
alice$ hg add b.txt
alice$ hg commit -m "B" 
alice$ echo c > c.txt
alice$ hg add c.txt
alice$ hg commit -m "C" 
alice$ cd ..
alice$ hg commit -m "Main repo"

Now the main repository is linked with the newest revision of the subrepository. We realize that we want to link the main repository with an earlier version of the subrepository. This is done by updating the subrepository to the wanted revision and then performing a commit from the main repository again to update and commit the .hgsubstate file.

alice$ cd subrepo
alice$ hg log
changeset:   2:50b194df1a80
tag:         tip
user:        Alice <alice@example.net>
date:        Wed Mar 10 20:12:05 2010 +0000
summary:     C

changeset:   1:00bd589746e0
user:        Alice <alice@example.net>
date:        Wed Mar 10 20:11:05 2010 +0000
summary:     B

changeset:   0:3cf2ce324347
user:        Alice <alice@example.net>
date:        Wed Mar 10 20:10:05 2010 +0000
summary:     A
alice$ hg update 1
0 files updated, 0 files merged, 1 files removed, 0 files unresolved
alice$ cd ..
alice$ hg commit -m "Changed subrepo version"

Update On Demand

Update naturally also acts on the subrepositories, since updating to another revision of a repository might mean that a subrepository has to be updated to another revision as well.

As mentioned the .hgsubstate file tracks the revision of the subrepository to use. Therefore we do not know which version of the subrepository to update to before this is checked out. Therefore the repository is first updated and then the subrepositories are updated according to the updated version of the .hgsubstate file. This might also involve pulling from the subrepository path, if the revision in the .hgsubstate file has not yet been pulled.

Commands Crossing Repository Boundaries

hg commit and hg update go recursively through the subrepositories. If a subrepository has no changes when the command runs, nothing is committed in that subrepository.

Other commands that can work on subrepositories need a --subrepos option to recurse through the subrepositories.

Commands supporting this are e.g. hg add, hg archive, hg diff, hg incoming, hg outgoing, and hg status.

hg add with the --subrepos option lets you add a file in a subrepository, with the same syntax as if it were just in another directory.

The other commands do what they normally do, but also recurse into the subrepositories when the --subrepos flag is given.

Tips & Tricks

We will end this guide by explaining how to do some uncommon, but still quite useful operations.

Converting Folder Into a Subrepository

A project might contain a folder with some code which, at a later point, turns out to be useful across different projects. This code could of course just be copied to a new repository, which the projects could then include as a subrepository. This would however mean that we would lose our precious history of these files.

The way to do it is to convert the folder into a repository using the convert extension, and then include this repository as a subrepository in the different projects.

The first part below just creates our folder, makes some changes, and commits its contents.

alice$ mkdir folder
alice$ echo 'infolder' > folder/1.txt
alice$ hg add folder/1.txt
alice$ ls
folder
remoterepo
subrepo
svnrepo
alice$ hg commit -m "In folder" 
alice$ echo 'notinfolder' > 2.txt
alice$ hg add 2.txt
alice$ hg commit -m "Not in folder"

Now we want to make the folder into a subrepository. First we make the folder into a repository of its own. This is done by converting the original repository from a Mercurial repository to a Mercurial repository, using a filemap to specify that we only want that particular folder.

The filemap specifies that the folder should be included. It also contains a rename mapping, which ensures that the contents of the folder end up in the root directory of the repository we create. Then we perform the conversion, and see that we now have a repository containing only the changeset which touched the file within the folder. This repository can now be included as a subrepository as previously described.

alice$ echo 'include folder' > map.txt
alice$ echo 'rename folder .' >> map.txt
alice$ hg --config extensions.hgext.convert= convert --filemap map.txt . mynewsubrepo
initializing destination mynewsubrepo repository
scanning source...
sorting...
converting...
4 Subrepositories added
3 Main repo
2 Changed subrepo version
1 In folder
0 Not in folder
alice$ cd mynewsubrepo
alice$ hg update
1 files updated, 0 files merged, 0 files removed, 0 files unresolved
alice$ hg log
changeset:   0:c505b3dd95a2
tag:         tip
user:        Alice <alice@example.net>
date:        Wed Mar 10 20:13:05 2010 +0000
summary:     In folder

Detaching a Subrepository

If a subrepository no longer needs to be associated with a project, then it is detached from the repository by just deleting the line for it in the .hgsub file. Then hg status will show the files from that subrepository as unknown files, and you can delete these files from your file system.
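Deleting the line by hand in an editor works fine. For scripted setups, the removal could be sketched as below. The helper name detach_entry is hypothetical, and the sample .hgsub file is built in a temporary directory rather than in a real repository.

```shell
# Remove the .hgsub entry for one subrepository path.
# detach_entry is a hypothetical helper, not a Mercurial command.
detach_entry() {
    hgsub_file=$1
    sub_path=$2
    # Keep every line whose left-hand side is not exactly sub_path.
    awk -F'=' -v p="$sub_path" '
        { lhs = $1; gsub(/^[ \t]+|[ \t]+$/, "", lhs) }
        lhs != p { print }
    ' "$hgsub_file" > "$hgsub_file.tmp" && mv "$hgsub_file.tmp" "$hgsub_file"
}

# Try it on a sample file in a scratch directory.
demo=$(mktemp -d)
printf '%s\n' 'subrepo = subrepo' \
    'remoterepo = http://www.selenic.com/repo/hello' > "$demo/.hgsub"
detach_entry "$demo/.hgsub" remoterepo
cat "$demo/.hgsub"    # subrepo = subrepo
```

In a real repository you would then commit the changed .hgsub file in the main repository and delete the detached directory if it is no longer needed.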

Exercises

  1. Use hg init and create, edit and add the .hgsub file to create a repository called kick-start and a subrepository called mysubrepo

  2. Use hg add and hg commit to make some commits. Try both within a subrepository and with --subrepos to do it across repository boundaries.

  3. Use hg update in the subrepository to switch it to an older revision.

  4. Go to the main repository. Use hg status to see that the subrepository in the

  5. Use hg commit to update the registered subrepository revision. What is committed now will include the .hgsubstate file, so the commit is not empty.

  6. Add an SVN repository as a subrepository.