| Mercurial 4.0 Sprint Notes | Gregory Szorc | 17/10/16 12:52 | I attended a semi-annual Mercurial developers meeting on Oct 7-9. While a
lot was discussed (full notes available at https://www.mercurial-scm.org/wiki/4.0sprint), there were a number of developments relevant to Mozilla.
* Facebook has implemented an extension that effectively caches blame lookups. My measurements show a 4-10x speedup across the board. Files that used to take 10s to blame can now take under 1s! Bug 1308226 tracks rolling it out to hg.mozilla.org. It isn't likely to happen before December due to a probable requirement on Mercurial 4.0.
* There was a demo of limiting blame output to certain lines. This is useful for e.g. following the history of just a function. There was talk of integrating this into the HTML interface (so you could e.g. highlight lines and then limit output to just those lines over time).
* Facebook demoed `hg absorb` which is probably the coolest workflow enhancement I've seen to version control in years. When your working directory has uncommitted changes on top of draft changesets, you can run `hg absorb` and the uncommitted modifications are automagically folded ("absorbed") into the appropriate draft ancestor changesets. This is like doing `hg histedit` + "roll" actions without having to make a commit or manually write history modification rules. The command essentially looks at the lines that were modified, finds a changeset modifying those lines, and amends that changeset to include your uncommitted changes. If the changes can't be made without conflicts, they remain uncommitted. This workflow is insanely useful for things like applying review feedback. You just make file changes, run `hg absorb`, and the mapping of changes to commits sorts itself out. It is magical.
* Facebook, Google, and Unity are all experimenting with virtual filesystems for Mercurial (e.g. FUSE filesystems). This allows them to bypass having to write 100,000+ files at checkout time and makes subsequent operations like `hg update` much faster since the filesystem doesn't need to be touched as much.
  This has the potential to significantly speed up VCS operations in Firefox automation.
* Facebook has developed a large file storage extension using the git-lfs protocol.
* Google demoed a working narrow clone (clone a subset of files). It is a 3rd party extension for now, but they are still intent on upstreaming it once it is stable enough.
* `hg version` now runs under Python 3. It looks like official Python 3 support will happen sometime in 2017.
* Facebook reported significant improvements to developer sentiment towards Mercurial at Facebook. Initially, a lot of developers were skeptical about Mercurial and preferred Git. Now, apparently a number of their developers have forgotten how to use Git. A number of their developers working on open source projects on GitHub (which are essentially mirrors of subdirectories from their monorepo) now prefer working out of a Mercurial clone instead. Keep in mind Facebook essentially runs their own Mercurial distribution that has a lot of customizations to work around rough edges. So the Mercurial experience at Facebook is vastly different from the experience others have.
* There are very promising performance wins when running Mercurial under PyPy, especially on the server side.
* Facebook will be investing a lot of time in the next year into improving the long tail of minor performance issues in Mercurial. They want all commands to be as fast as possible.
* Facebook is writing a Mercurial server in Rust. It will be distributed and will support pluggable key-value stores for storage (meaning that we could move hg.mozilla.org to be backed by Amazon S3 or some such). The primary author also has aspirations for supporting the Git wire protocol on the server and enabling sub-directories to be `git clone`d independently of a large repo. This means you could use Mercurial to back your monorepo while still providing the illusion of multiple "sub-repos" to Mercurial or Git clients.
  The author is also interested in things like GraphQL to query repo data. Facebook engineers are crazy... in a good way.
* Mercurial 4.1 should contain an `hg display <view>` command that provides a common command for showing common views of various pieces of data. Look for new views like `hg display inprogress` as an officially supported version of `hg wip`.
* There were discussions on improving the robustness of clone (resume from interrupted clone, automatic retries, etc).
* Supporting zstandard (a new compression format that beats zlib in compression ratio and speed) in the core Mercurial distribution has a green light. Look for initial support in Mercurial 4.1. This has the potential to drastically speed up several operations and to make servers scale much better.
* Work on a rewrite of the Mercurial book is underway. More info at https://book.mercurial-scm.org/.
* There were discussions about SHA-1 concerns and commit integrity/signing.
If you have any questions, just ask. |
| Re: Mercurial 4.0 Sprint Notes | Byron Jones | 17/10/16 20:07 | thanks gps, nice writeup...
> * Facebook demoed `hg absorb` which is probably the coolest workflow
that does indeed sound magical. is there a time frame for its release, and do you know if it will require mercurial 4?
> * Facebook reported significant improvements to developer sentiment towards
> Mercurial at Facebook. [snip]
> Keep in mind Facebook essentially runs their own Mercurial distribution
are these customisations documented/available? i wasn't able to find anything, but searching for anything that includes "facebook" is a mess of false positives :)
-glob |
| Re: Mercurial 4.0 Sprint Notes | Gregory Szorc | 17/10/16 21:09 | Pretty much all of Facebook's Mercurial foo is available at
https://bitbucket.org/facebook/hg-experimental/. The "experimental" bit in the repo name is a bit misleading: many of the extensions are production quality. Although some of them aren't. Knowing which are which often requires pinging someone. It doesn't help that the docs are a bit sparse :/
`hg absorb` is in that repo in hgext3rd/absorb.py. It currently requires the "linelog" Python C extension, which is the core technology behind the "fast annotate" feature. Getting that compiled and installed in a way Mercurial can load is... fun. In theory, `hg absorb` doesn't need linelog: linelog just makes it faster.
At this time, there is no timetable to get absorb in core. The code is literally only a few weeks old and the concept needs to be refined before incorporation can be considered. Given the massive amount of interest for this feature (people were literally gasping when it was demoed), it is likely on the fast track. |
| Re: Mercurial 4.0 Sprint Notes | mos...@gmail.com | 18/10/16 09:58 | On Monday, October 17, 2016 at 2:52:53 PM UTC-5, Gregory Szorc wrote:
Do they use hg-git, a Facebook version of hg-git, or something else? This is effectively how I work, myself. Everything is a Mercurial clone of some GitHub project. |
| Re: Mercurial 4.0 Sprint Notes | Gregory Szorc | 18/10/16 10:23 | The commit history of hg-git contains patches from a number of Facebook
employees. So you can infer hg-git is used in some capacity. |
| Re: Mercurial 4.0 Sprint Notes | Jun WU | 18/10/16 12:16 | On Monday, October 17, 2016 at 8:52:53 PM UTC+1, Gregory Szorc wrote:
This is a bit inaccurate. Only ambiguous changes are left untouched. The "absorb" command does not use a traditional merge algorithm, so it cannot produce merge conflicts. Changes that would result in merge conflicts with histedit + roll just work with "absorb". An example:
  $ echo 1 > a
  $ hg commit -A a -m 1
  $ echo 2 >> a
  $ hg commit -m 2
Then try to change "1" to "one" and "2" to "two" using both histedit (or other workflows like update, amend, evolve) and absorb. The former will hit a merge conflict. The latter just works as you would expect.
Note that absorb never writes to the working copy, even with the experimental interactive mode ("-i"). So it is faster and the working copy is safer. |
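For readers who want to see why the "1" to "one", "2" to "two" example is unambiguous, here is a toy model of mapping working-copy edits to the changeset that owns the touched lines. It is only a sketch and not how the absorb extension is implemented: `absorb_plan` and the `(rev, text)` annotate representation are made up for illustration, the rule that an equal-sized replacement is treated line by line is an assumption of this model, and only Python's standard `difflib` is used.

```python
from difflib import SequenceMatcher


def absorb_plan(annotated, edited):
    """Attribute working-copy edits to the revision owning the changed lines.

    annotated: list of (rev, text) pairs, an annotate view of the parent.
    edited:    list of str, the lines currently in the working copy.
    Returns (plan, leftover). plan maps rev -> list of (old_lines, new_lines).
    leftover holds hunks that cannot be tied to exactly one revision.
    """
    old = [text for _rev, text in annotated]
    plan, leftover = {}, []
    matcher = SequenceMatcher(a=old, b=edited, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        if tag == "replace" and i2 - i1 == j2 - j1:
            # Toy-model assumption: an equal-sized replacement is a set of
            # independent one-line edits.
            pieces = [(i, i + 1, j, j + 1)
                      for i, j in zip(range(i1, i2), range(j1, j2))]
        else:
            pieces = [(i1, i2, j1, j2)]
        for a1, a2, b1, b2 in pieces:
            owners = {rev for rev, _text in annotated[a1:a2]}
            hunk = (old[a1:a2], edited[b1:b2])
            if len(owners) == 1:
                plan.setdefault(next(iter(owners)), []).append(hunk)
            else:
                # Pure insertions (no owner) and hunks spanning several
                # revisions are ambiguous, so they stay uncommitted.
                leftover.append(hunk)
    return plan, leftover


# Annotate view after the two commits in the example: rev 1 owns "1", rev 2 owns "2".
annotated = [(1, "1"), (2, "2")]
plan, leftover = absorb_plan(annotated, ["one", "two"])
print(plan)      # {1: [(['1'], ['one'])], 2: [(['2'], ['two'])]}
print(leftover)  # []
```

In this model an inserted line that sits between lines owned by different revisions has no single owner, so it ends up in `leftover`, which mirrors the "only ambiguous changes are left untouched" behaviour described above.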
| Re: Mercurial 4.0 Sprint Notes | Jun WU | 18/10/16 13:31 | On Tuesday, October 18, 2016 at 5:09:29 AM UTC+1, Gregory Szorc
wrote:
> `hg absorb` is in that repo in hgext3rd/absorb.py. It currently
Actually, it's less about performance (because absorb runs the diff algorithm and builds the linelog on the fly), more about "able (or, easy) to implement". absorb uses linelog in an unusual way (different from what fastannotate does) to do the editing. It replaces hunks belonging to earlier changesets directly from an annotate view of a later changeset [1], which cannot be easily done otherwise.
If the existing merge algorithm were used, one challenge would be how to identify lines (figuring out which line in which revision introduced a given line) after a merge. I think this may require additional changes to the merge algorithm, making things more complex. Let alone the possibility of unnecessary conflicts.
That said, I agree that linelog is not the only way to get this operation done. It just makes the code much easier to write confidently. I have recently written a prototype called "collate" [2] to edit lines of multiple revisions of a file. It has the potential to handle some cases that "absorb" supports (but there are still some subtle differences where absorb is superior). I think if we want "absorb" upstreamed, it will probably be after the "collate" command, and may be a flag (sub feature) of "collate".
[1]: https://bitbucket.org/facebook/hg-experimental/src/2b9f2a2c5/hgext3rd/absorb.py#absorb.py-323
[2]: https://bitbucket.org/quark-zju/hgext-collate/ |
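To illustrate why editing through an annotate view sidesteps merging, here is a deliberately simplified model. It is not the real linelog data structure or its API: `Line`, `checkout`, and `annotate` are hypothetical names, and the model only records which revision introduced (and optionally removed) each line. The history is the two-commit example from earlier in the thread.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Line:
    text: str
    introduced: int                 # first revision that contains the line
    removed: Optional[int] = None   # first revision that no longer contains it


def visible(line, rev):
    """True if `line` is part of the file content at revision `rev`."""
    return line.introduced <= rev and (line.removed is None or rev < line.removed)


def checkout(lines, rev):
    """Rebuild the file content at `rev` from the shared line store."""
    return [line.text for line in lines if visible(line, rev)]


def annotate(lines, rev):
    """Return (introducing revision, Line) pairs for the content at `rev`."""
    return [(line.introduced, line) for line in lines if visible(line, rev)]


# rev 1 adds "1", rev 2 appends "2".
store = [Line("1", introduced=1), Line("2", introduced=2)]

# Edit through the annotate view of rev 2. Each line is rewritten in place,
# so the new text shows up in every revision that contains that line,
# starting with the one that introduced it. No merge step is involved.
replacements = {"1": "one", "2": "two"}
for _rev, line in annotate(store, 2):
    line.text = replacements.get(line.text, line.text)

print(checkout(store, 1))  # ['one']
print(checkout(store, 2))  # ['one', 'two']
```

Because every revision is derived from the same line store, there is never a second copy of a line to reconcile, which is the intuition behind the point about avoiding unnecessary conflicts.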
| Re: Mercurial 4.0 Sprint Notes | Jun WU | 19/10/16 11:03 | (resending since the previous reply didn't show up after 1 hour.
sorry if it turns out to be a duplicate)
> `hg absorb` is in that repo in hgext3rd/absorb.py. It currently
Actually, it's less about performance (because absorb runs the diff ...
What is more interesting here is the writing part, not the reading part that gets the annotate result. absorb uses linelog in an unusual way (different from fastannotate, which always appends history) to do the editing. It replaces chunks belonging to earlier changesets directly from an annotate view of a later changeset [1]. This logic cannot be easily done otherwise.
[1]: https://bitbucket.org/facebook/hg-experimental/src/2b9f2a2c5/hgext3rd/absorb.py#absorb.py-323
[2]: https://bitbucket.org/quark-zju/hgext-collate/ |