Mercurial 4.0 Sprint Notes

Gregory Szorc

Oct 17, 2016, 3:52:53 PM
to dev-version-control
I attended a semi-annual Mercurial developers meeting on Oct 7-9. While a
lot was discussed (full notes available at
https://www.mercurial-scm.org/wiki/4.0sprint), there were a number of
developments relevant to Mozilla.

* Facebook has implemented an extension that effectively caches blame
lookups. My measurements show a 4-10x speedup across the board. Files that
used to take 10s to blame can now take under 1s! Bug 1308226 tracks rolling
it out to hg.mozilla.org. It isn't likely to happen before December due to
a probable requirement on Mercurial 4.0.

* There was a demo of limiting blame output to certain lines. This is
useful for e.g. following the history of just a function. There was talk of
integrating this into the HTML interface (so you e.g. highlight lines and
then limit output to just those lines over time).

* Facebook demoed `hg absorb` which is probably the coolest workflow
enhancement I've seen to version control in years. Essentially, when your
working directory has uncommitted changes on top of draft changesets, you
can run `hg absorb` and the uncommitted modifications are automagically
folded ("absorbed") into the appropriate draft ancestor changesets. This is
essentially doing `hg histedit` + "roll" actions without having to make a
commit or manually make history modification rules. The command essentially
looks at the lines that were modified, finds a changeset modifying those
lines, and amends that changeset to include your uncommitted changes. If
the changes can't be made without conflicts, they remain uncommitted. This
workflow is insanely useful for things like applying review feedback. You
just make file changes, run `hg absorb` and the mapping of changes to
commits sorts itself out. It is magical.

* Facebook, Google, and Unity all are experimenting with virtual
filesystems for Mercurial (e.g. FUSE filesystems). This allows them to
bypass having to write 100,000+ files at checkout time and makes subsequent
operations like `hg update` much faster since the filesystem doesn't need
to be touched as much. This has the potential to significantly speed up VCS
operations in Firefox automation.

* Facebook has developed a large file storage extension using the git-lfs
protocol.

* Google demoed a working narrow clone (clone a subset of files). It is a
3rd party extension for now. But they are still intent on upstreaming once
it is stable enough.

* `hg version` now runs under Python 3. It looks like official Python 3
support will happen sometime in 2017.

* Facebook reported significant improvements to developer sentiment towards
Mercurial at Facebook. Initially, a lot of developers were skeptical about
Mercurial and preferred Git. Now, apparently a number of their developers
have forgotten how to use Git. A number of their developers working on open
source projects on GitHub (which are essentially mirrors of subdirectories
from their monorepo) now prefer working out of a Mercurial clone instead.
Keep in mind Facebook essentially runs their own Mercurial distribution
that has a lot of customizations to work around rough edges. So the
Mercurial experience at Facebook is vastly different from the experience
others have.

* There are very promising performance wins when running Mercurial under
PyPy, especially on the server side.

* Facebook will be investing a lot of time in the next year into improving
the long tail of minor performance issues in Mercurial. They want all
commands to be as fast as possible.

* Facebook is writing a Mercurial server in Rust. It will be distributed
and will support pluggable key-value stores for storage (meaning that we
could move hg.mozilla.org to be backed by Amazon S3 or some such). The
primary author also has aspirations for supporting the Git wire protocol on
the server and enabling sub-directories to be `git clone`d independently of
a large repo. This means you could use Mercurial to back your monorepo
while still providing the illusion of multiple "sub-repos" to Mercurial or
Git clients. The author is also interested in things like GraphQL to query
repo data. Facebook engineers are crazy... in a good way.

* Mercurial 4.1 should contain an `hg display <view>` command that provides
a common command for showing common views of various pieces of data. Look
for new views like `hg display inprogress` as an officially supported
version of `hg wip`.

* There were discussions on improving the robustness of clone (resume from
interrupted clone, automatic retries, etc).

* Supporting zstandard (a new compression format that beats zlib in
compression ratio and speed) in the core Mercurial distribution has a green
light. Look for initial support in Mercurial 4.1. This has potential to
drastically speed up several operations and to make servers scale much
better.

* Work on a rewrite of the Mercurial book is underway. More info at
https://book.mercurial-scm.org/.

* Discussions about SHA-1 concerns, commit integrity/signing.

If you have any questions, just ask.

Byron Jones

Oct 17, 2016, 11:07:54 PM
to dev-version-control, Gregory Szorc
thanks gps, nice writeup...
> * Facebook demoed `hg absorb` which is probably the coolest workflow
> enhancement I've seen to version control in years. Essentially, when your
> working directory has uncommitted changes on top of draft changesets, you
> can run `hg absorb` and the uncommitted modifications are automagically
> folded ("absorbed") into the appropriate draft ancestor changesets.
that does indeed sound magical. is there a time frame for its release,
and do you know if it will require mercurial 4?
> * Facebook reported significant improvements to developer sentiment towards
> Mercurial at Facebook. [snip]
> Keep in mind Facebook essentially runs their own Mercurial distribution
> that has a lot of customizations to work around rough edges. So the
> Mercurial experience at Facebook is vastly different from the experience
> others have.
are these customisations documented/available?
i wasn't able to find anything, but searching for anything that includes
"facebook" is a mess of false positives :)


-glob

--
glob — engineering productivity — mozilla

Gregory Szorc

Oct 18, 2016, 12:09:29 AM
to Byron Jones, Gregory Szorc, dev-version-control
Pretty much all of Facebook's Mercurial foo is available at
https://bitbucket.org/facebook/hg-experimental/. The "experimental" bit in
the repo name is a bit misleading: many of the extensions are production
quality. Although some of them aren't. Knowing which are which often
requires pinging someone. It doesn't help that the docs are a bit sparse :/

`hg absorb` is in that repo in hgext3rd/absorb.py. It currently requires
the "linelog" Python C extension, which is the core technology behind the
"fast annotate" feature. Getting that compiled and installed in a way
Mercurial can load is... fun. In theory, `hg absorb` doesn't need linelog:
linelog just makes it faster.

At this time, there is no timetable to get absorb in core. The code is
literally only a few weeks old and the concept needs to be refined before
incorporation can be considered. Given the massive amount of interest for
this feature (people were literally gasping when it was demoed), it is
likely on the fast track.

mos...@gmail.com

Oct 18, 2016, 12:58:12 PM
to mozilla-dev-v...@lists.mozilla.org
On Monday, October 17, 2016 at 2:52:53 PM UTC-5, Gregory Szorc wrote:
> have forgot how to use Git. A number of their developers working on open
> source projects on GitHub (which are essentially mirrors of subdirectories
> from their monorepo) now prefer working out of a Mercurial clone instead.

Do they use hg-git, a Facebook version of hg-git, or something else? This is effectively how I work, myself. Everything is a Mercurial clone of some GitHub project.

Gregory Szorc

Oct 18, 2016, 1:23:37 PM
to mos...@gmail.com, mozilla-dev-version-control
On Tue, Oct 18, 2016 at 9:13 AM, <mos...@gmail.com> wrote:

> On Monday, October 17, 2016 at 2:52:53 PM UTC-5, Gregory Szorc wrote:
> > have forgot how to use Git. A number of their developers working on open
> > source projects on GitHub (which are essentially mirrors of
> subdirectories
> > from their monorepo) now prefer working out of a Mercurial clone instead.
>
> Do they use hg-git, a facebook version of hg-git, or something else? This
> is effectively how I work, myself. Everything is a Mercurial clone of some
> Github project.
>

The commit history of hg-git contains patches from a number of Facebook
employees. So you can infer hg-git is used in some capacity.

Jun WU

Oct 18, 2016, 3:16:34 PM
to mozilla-dev-v...@lists.mozilla.org
On Monday, October 17, 2016 at 8:52:53 PM UTC+1, Gregory Szorc wrote:
> * Facebook demoed `hg absorb` which is probably the coolest workflow
> enhancement I've seen to version control in years. Essentially, when your
> working directory has uncommitted changes on top of draft changesets, you
> can run `hg absorb` and the uncommitted modifications are automagically
> folded ("absorbed") into the appropriate draft ancestor changesets. This is
> essentially doing `hg histedit` + "roll" actions without having to make a
> commit or manually make history modification rules. The command essentially
> looks at the lines that were modified, finds a changeset modifying those
> lines, and amends that changeset to include your uncommitted changes. If
> the changes can't be made without conflicts, they remain uncommitted. This

This is a bit inaccurate. Only ambiguous changes are left untouched.

The "absorb" command does not use a traditional merge algorithm, so it
cannot produce merge conflicts. Things that would result in merge conflicts
using histedit + roll just work with "absorb". An example:

$ echo 1 > a
$ hg commit -A a -m 1
$ echo 2 >> a
$ hg commit -m 2

Then try to change "1" to "one" and "2" to "two" using both histedit (or
other workflows like update, amend, evolve) and absorb. The former will
have merge conflict. The latter would just work as you would expect.

Note that absorb never writes to the working copy even with the experimental
interactive mode ("-i"). So it is faster and the working copy is safer.
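
To make the "1" -> "one", "2" -> "two" example above concrete, here is a toy
Python model of the idea. It is only a sketch under loud assumptions: it is
not how absorb.py or linelog actually work, the function names `annotate` and
`absorb` are made up for illustration, every revision only appends lines to
the previous one, and the working copy only modifies existing lines in place.

```python
# Toy model of the absorb idea. NOT Mercurial's implementation; the names
# and data layout here are hypothetical and chosen for illustration only.

def annotate(revisions):
    """For each line of the last revision, return the index of the revision
    that introduced it.

    Assumes each revision only appends lines to the previous one, as in the
    two-commit example above.
    """
    blame = []
    for rev_index, lines in enumerate(revisions):
        # Any line beyond what has been blamed so far is new in this revision.
        blame.extend([rev_index] * (len(lines) - len(blame)))
    return blame


def absorb(revisions, working_copy):
    """Fold in-place line modifications into the revision that introduced
    each modified line, and into every later revision."""
    tip = revisions[-1]
    if len(working_copy) != len(tip):
        raise ValueError("this sketch only handles in-place line modifications")
    blame = annotate(revisions)
    new_revisions = [list(lines) for lines in revisions]
    for line_no, (old, new) in enumerate(zip(tip, working_copy)):
        if old == new:
            continue
        # Rewrite the line where it was introduced and in all descendants.
        for rev_index in range(blame[line_no], len(new_revisions)):
            new_revisions[rev_index][line_no] = new
    return new_revisions


history = [["1"], ["1", "2"]]            # changesets "1" and "2" from above
result = absorb(history, ["one", "two"])
print(result)
```

In this model each modified line is routed to its introducing changeset by
blame alone, so no merge step is involved and there is nothing that could
conflict. Handling of ambiguous changes (which absorb leaves untouched) is
deliberately left out of the sketch.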

Jun WU

Oct 18, 2016, 4:31:39 PM
to mozilla-dev-v...@lists.mozilla.org
On Tuesday, October 18, 2016 at 5:09:29 AM UTC+1, Gregory Szorc
wrote:
> `hg absorb` is in that repo in hgext3rd/absorb.py. It currently
> requires the "linelog" Python C extension, which is the core
> technology behind the "fast annotate" feature. Getting that
> compiled and installed in a way Mercurial can load is... fun. In
> theory, `hg absorb` doesn't need linelog: linelog just makes it
> faster.

Actually, it's less about performance (because absorb runs the diff
algorithm and builds the linelog on the fly) and more about being "able
(or, easy) to implement".

absorb uses linelog in an unusual way (different from what
fastannotate does) to do the editing. It replaces hunks belonging
to earlier changesets directly from an annotate view of a later
changeset [1], which cannot be easily done otherwise.

If the existing merge algorithm is used, one challenge would be how
to identify lines (figuring out which line in which revision
introduced a given line) after a merge. I think this may require
additional changes to the merge algorithm, making things more
complex. Let alone the possibility of unnecessary conflicts.

That said, I agree that linelog is not the only way to get this
operation done. It's just making the code much easier to write
confidently.

I have recently written a prototype called "collate" [2] to edit
lines of multiple revisions of a file. It has the potential to
handle some cases that "absorb" supports (but there are still some
subtle differences where absorb is superior). I think if we want
"absorb" upstreamed, it would probably come after the "collate"
command, and may become a flag (sub-feature) of "collate".

[1]: https://bitbucket.org/facebook/hg-experimental/src/2b9f2a2c5/hgext3rd/absorb.py#absorb.py-323
[2]: https://bitbucket.org/quark-zju/hgext-collate/

Jun WU

Oct 19, 2016, 2:03:50 PM
to mozilla-dev-v...@lists.mozilla.org
(resending since the previous reply didn't show up after 1 hour.
sorry if it turns out to be a duplicate)

On Tuesday, October 18, 2016 at 5:09:29 AM UTC+1, Gregory Szorc
wrote:
> `hg absorb` is in that repo in hgext3rd/absorb.py. It currently
> requires the "linelog" Python C extension, which is the core
> technology behind the "fast annotate" feature. Getting that
> compiled and installed in a way Mercurial can load is... fun. In
> theory, `hg absorb` doesn't need linelog: linelog just makes it
> faster.

Actually, it's less about performance (because absorb runs the diff
algorithm and builds the linelog on the fly) and more about being "able
(or, easy) to implement".

What is more interesting here is the writing part, not the reading
part that gets the annotate result. absorb uses linelog in an unusual
way (different from fastannotate, which always appends history) to do
the editing. It replaces chunks belonging to earlier changesets
directly from an annotate view of a later changeset [1]. This logic
cannot be easily done otherwise.

If the existing merge algorithm is used, one challenge would be how
to identify lines (figuring out which line in which revision
introduced a given line) after a merge. I think this may require
additional changes to the merge algorithm, making things more
complex. Let alone the possibility of unnecessary conflicts.

That said, I agree that linelog is not the only way to get this
operation done. It's just making the code much easier to write
confidently.

I have recently written a prototype called "collate" [2] to edit
lines of multiple revisions of a file. It has the potential to
handle some cases that "absorb" supports (but there are still some
subtle differences where absorb is superior). I think if we want
"absorb" upstreamed, it would probably come after the "collate"
command, and may become a flag (sub-feature) of "collate".

[1]:
bb/facebook/hg-experimental/src/2b9f2a2c5/hgext3rd/absorb.py#absorb.py-323
[2]: bb/bitbucket.org/quark-zju/hgext-collate/