
Mercurial 4.6 Sprint Report


Gregory Szorc

Mar 14, 2018, 6:03:41 PM
to dev-version-control
The semiannual Mercurial developer meetup (or "sprint") was held March 2-4
in Boston. Mozilla was represented by me and Connor Sheehan.

As usual, the sprint was 3 very long days of discussion, planning, and
hacking.

One of the larger topics of the sprint was support for partial clones.
A partial clone is a client-side clone that has a subset of the files
and/or a subset of history. Fully distributed version control systems like
Mercurial and Git transfer all the data all the time, which obviously
doesn't scale. Partial clone is the solution to that scaling problem. We
tend to use the terms "narrow clone" for a subset of files and "shallow
clone" for a subset of history. Google has upstreamed their "narrow"
extension, which allows Mercurial to support narrow clones. Facebook has a
"remotefilelog" extension that implements support for shallow clone.

Non-experimental support for shallow clones will require significant work
to refactor client-side storage. So that is a few releases out. The
immediate focus to support partial clones is to get the server and wire
protocol pieces in place and to provide experimental-level support for
partial clone so it can be used by limited-use clients (like automated
systems). The thinking here is that if the server pieces get deployed and
are reasonably backwards compatible, then this enables multiple versions of
clients in the near future to support partial clones. For example (and this
is the plan of record), Mozilla could roll out partial clone support on
hg.mozilla.org and enable the experimental client bits in Firefox CI, which
has a tightly controlled client environment. This would allow us to start
using partial clone for critical CI efficiency wins a few releases before
the client-side bits are stabilized and non-experimental.

Supporting partial clones will be an overhaul of the wire protocol, which
is long overdue for the project. The new wire protocol is being designed
with modern practices in mind. For example, CPU-bound activities will be
able to scale out to multiple CPU cores. The new HTTP protocol will also be
designed such that repository hosting can easily leverage a CDN for content
distribution (it can be difficult to scale version control servers and the
use of redirects to CDN or scalable blob store services will make running
services at scale much easier).

After a long discussion, it was concluded that the minimum viable product
for partial clones will be narrow clones. The bulk of the complexity for
partial clone is related to shallow clones and this problem will be
deferred to later releases.
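
To make the narrow/shallow distinction concrete, here is a minimal Python
sketch over a toy in-memory history. The `Commit` type and the `narrow` and
`shallow` helpers are hypothetical names made up for illustration. They do
not correspond to Mercurial's storage or wire protocol code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    """Toy commit: an identifier plus the file paths it touches."""
    node: str
    files: tuple


def narrow(history, include_prefixes):
    """Subset of files: keep every commit, but only paths under the prefixes."""
    prefixes = tuple(include_prefixes)
    return [
        Commit(c.node, tuple(f for f in c.files if f.startswith(prefixes)))
        for c in history
    ]


def shallow(history, depth):
    """Subset of history: keep only the newest `depth` commits.

    `history` is assumed to be ordered oldest to newest.
    """
    return list(history[-depth:]) if depth > 0 else []


history = [
    Commit("a", ("browser/x.js", "docs/readme")),
    Commit("b", ("docs/b",)),
    Commit("c", ("browser/y.js",)),
]
narrowed = narrow(history, ["browser/"])
shallowed = shallow(history, 2)
```

In this toy model, narrowing keeps all three commits but drops the paths
outside `browser/`, while shallowing keeps all paths but drops the oldest
commit.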

Google has authored an `hg fix` command which runs code formatters (such as
clang-format). Basically, you can check in a file that defines which code
formatters run for which files and `hg fix` automatically runs them. I
believe they will be upstreaming it into core Mercurial. More details at
https://www.mercurial-scm.org/wiki/AutomaticFormattingPlan.
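
As a rough illustration of the "which formatters run for which files" idea,
here is a small Python sketch that maps file patterns to formatter commands
and plans what would run. The pattern list, the helper names, and the
`my-python-formatter` command are all assumptions for the example. This is
not how `hg fix` is actually configured or implemented.

```python
from fnmatch import fnmatchcase

# Hypothetical checked-in mapping from file pattern to formatter command.
FORMATTERS = [
    ("*.cpp", ["clang-format"]),
    ("*.h", ["clang-format"]),
    ("*.py", ["my-python-formatter"]),  # made-up command name
]


def formatter_for(path, formatters=FORMATTERS):
    """Return the command for the first pattern matching `path`, or None."""
    for pattern, command in formatters:
        if fnmatchcase(path, pattern):
            return command
    return None


def plan(paths, formatters=FORMATTERS):
    """Map each path that has a formatter to the command that would run on it."""
    result = {}
    for path in paths:
        command = formatter_for(path, formatters)
        if command is not None:
            result[path] = command
    return result


planned = plan(["src/a.cpp", "README", "tools/x.py"])
```

Files with no matching pattern (`README` here) are simply left out of the
plan, so they would not be touched.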

Speaking of Google, their project to adopt Mercurial for their massive,
internal monorepo is picking up steam. A percentage of people at Google are
now using the `hg` client for interacting with their monorepo. (I can't
share more detailed numbers, sorry.)

Facebook continues to do crazy and interesting things. They rewrote
"dirstate" in Rust and this yielded significant performance improvements.
This isn't yet upstreamed, though. They are still working on Mononoke, a
Mercurial server implemented in Rust, which is under heavy development. They
are pretty obsessed with performance everywhere. They want to make heavy
use of progress bars on all commands and operations that could take
unbounded time so they have a better grasp on what operations need perf
attention.

During the sprint, a Facebook engineer imported Git's "xdiff" diffing
library into Mercurial. Since the sprint, he has cleaned up the code
dramatically and made changes to increase performance by up to 10x. He is
attempting to upstream some of this work back to Git.

There was a long discussion about obsolescence, hiddenness, and a path
forward in core. This is a technically complicated topic. The short version
is there appears to be a plan for enabling hiding changesets in core by
default. This will make history rewriting operations significantly faster.
The sentiment is Mercurial should focus on shipping that, then we can worry
about adding more changeset evolution / evolve features in core.

The oxidation (Rust in Mercurial) effort is underway. There is a Rust
version of `hg` in the core repository and it passes all but ~5 tests. This
is basically a Rust program that starts a Python interpreter. That still
needs to get shipped, which will require significant packaging work. The
project wants to start implementing performance critical and low-level
components in Rust (as opposed to C). We generally know what we want here.
We're still waiting for the first domino to fall.

We agreed that we want proper support for subcommands, e.g. `hg command
subcommand <args>`. The first consumer may be `hg show`, which we agreed
should move from an extension to a core command.

There was a discussion on "named commits" - allowing people to give human
readable names to individual commits. This seems to be something that MQ
users really love. There is already a mechanism in core to associate extra
names with changesets. So this was mostly a discussion about the UI for
defining names and where to store the name. I think we agreed that `hg
commit --name X` would store a name in the changeset extras field and we'd
cache names to make lookup faster.

The Python 3 port is proceeding at a healthy pace. Just recently, the
effort reached a milestone with over 50% of tests now passing on Python 3!
We're optimistic that we'll be able to ship a beta quality release of
Mercurial that supports Python 3 by the end of 2018.

There was talk of random bigger projects/features that are "good ideas" and
need someone to work on them:

* `hg shellprompt` - a command that spits out shell script that you can
eval in your shell init so you can get better shell integration
* --dry flags for every command that modifies things
* curses interface for resolving merge conflicts
* curses interface for running revsets
* side-by-side diff support
* curses interface for annotate

When you get a bunch of VCS people together in a room, we tend to end up
talking about things related to VCS, like code review. There was an
interesting discussion on commit authoring and review workflows. It was
interesting to see people from Google, Facebook, and other companies talk
about many of the same "review workflow" issues that have come up at
Mozilla, e.g. should commits be squashed before review, should you use
fix-up commits or amend commits, etc. People from large companies tended to
agree that fix-up commits and having reviewers see the intermediate,
throw-away commits did not scale.

There was talk of https://www.mercurial-scm.org/wiki/GenericTemplatingPlan and
the steps needed to complete that work.

There's interest in adding a "bug report" extension/command that can be
used to report Mercurial bugs in a more turnkey manner. This devolved into
a data sensitivity and privacy discussion.

There was talk of establishing a formal "stack" primitive, both as an
internal and user-facing concept for expressing "the current commits I'm
working on." Various parts of the internal code already expose a "stack"
and "hg show stack" exposes a stack to the user. The end goal would be for
various commands to take the current stack into account when deciding how
to behave.

There was a discussion on `hg push --force` and how the UX is bad because
it removes all of the safety stops. Conclusion: add `hg push --allow
<thing>` to provide a granular override for each safety check you want to
bypass, e.g. `--allow new-head`, `--allow create-branch`, etc.
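
The difference between one blanket `--force` and per-check overrides can be
sketched with Python's `argparse`. This only models the proposed UX. The
`push-sketch` program name, the `blocked` helper, and the set of check names
(taken from the examples above) are assumptions, not Mercurial code.

```python
import argparse

# Check names taken from the examples in the discussion; the real set may differ.
KNOWN_CHECKS = {"new-head", "create-branch"}


def build_parser():
    """Parser with a repeatable --allow flag, one safety check per use."""
    parser = argparse.ArgumentParser(prog="push-sketch")
    parser.add_argument(
        "--allow",
        action="append",
        default=[],
        choices=sorted(KNOWN_CHECKS),
        help="safety check to bypass (may be given more than once)",
    )
    return parser


def blocked(triggered, allowed):
    """Return the triggered checks that were not explicitly allowed."""
    return sorted(set(triggered) - set(allowed))


args = build_parser().parse_args(["--allow", "new-head"])
still_blocked = blocked(["new-head", "create-branch"], args.allow)
```

Here the push would still be refused because `create-branch` was triggered
but not allowed, whereas a blanket force flag would have bypassed both
checks.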

There were several more discussions and side-conversations. The full notes
are available at https://public.etherpad-mozilla.org/p/sprint-hg4.6