Chromium Git migration: Flag day planned for Friday, April 25, 2014


Chase Phillips

Apr 15, 2014, 4:07:41 PM
to Chromium-dev, blin...@chromium.org, Ben Henry, Aaron Gable, Ryan Tseng, Michael Moss, Stefan Zager, Robert Iannucci

Hi Chromium Developers,


The Chrome Infra team has been working via bugs and a planning document to complete the final tasks that remain prior to flipping the switch from SVN to Git.  We are now nearly ready to make the switch.


When will the switch happen?  Friday, April 25, 2014.


A known quantity of work remains before we make the switch.  We are addressing that now.  We plan to switch the Git repo to be authoritative for writes in the evening of April 25, 2014 ("flag day").  If some unforeseeable issue occurs on flag day, our backup date is May 2, 2014.


What do I need to do?


- Today: Since, after flag day, SVN checkouts will no longer update and Git checkouts from other locations may break in mysterious ways, please make sure your gclient ‘src’ is configured to use https://chromium.googlesource.com/chromium/src.git.


- Soon: Watch for updates about if/when services may be temporarily offline prior to and during the switch.  Before the switch, we also plan to host a tech talk for the team on 4/22 (expecting this to be recorded) with more info about the switch.  Details for the tech talk to follow.
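
For the "Today" item, one way to confirm the setting is to look at the `.gclient` file that sits next to `src`. The sketch below is only an illustration: it assumes the file defines a `solutions` list of dicts with `name` and `url` keys, and `src_url_from_gclient` and `SAMPLE_GCLIENT` are made-up names, not part of depot_tools.

```python
# Hypothetical helper, not part of depot_tools: report which URL the 'src'
# solution in a .gclient file points at.
EXPECTED_URL = "https://chromium.googlesource.com/chromium/src.git"


def src_url_from_gclient(text):
    """Return the url of the solution named 'src', or None if absent."""
    scope = {}
    # A .gclient file is written in Python syntax, so it can be evaluated
    # to get at its `solutions` list. Only do this with files you trust.
    exec(text, scope)
    for solution in scope.get("solutions", []):
        if solution.get("name") == "src":
            return solution.get("url")
    return None


# Assumed example contents; a real .gclient may carry more keys.
SAMPLE_GCLIENT = '''
solutions = [
  {
    "name": "src",
    "url": "https://chromium.googlesource.com/chromium/src.git",
    "managed": False,
  },
]
'''

if __name__ == "__main__":
    url = src_url_from_gclient(SAMPLE_GCLIENT)
    print("OK" if url == EXPECTED_URL else "src points at %r" % url)
```

To check a real checkout, pass the contents of its `.gclient` file instead of `SAMPLE_GCLIENT`.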


Questions?  If you have an issue that has not been addressed in any available documentation, please let us know.


Thanks!


Chase

John Abd-El-Malek

Apr 15, 2014, 4:23:15 PM
to Chase Phillips, Chromium-dev, blink-dev, Ben Henry, Aaron Gable, Ryan Tseng, Michael Moss, Stefan Zager, Robert Iannucci
What's the status of the web-based blame working? i.e. from the example from the "Chromium development, Git, and you" thread, the workflow I described still doesn't work. It took several minutes for the annotated view for render_view_impl.cc to show up, which is too slow to be usable. Even then, when I clicked on the link to the left to see the code before previous change, it took another 2 minutes for the same html to load, and I couldn't see the code before the previous change. This is something that needs to work for the reasons I listed before.

Peter Kasting

Apr 15, 2014, 4:37:06 PM
to John Abd-El-Malek, Chase Phillips, Chromium-dev, blink-dev, Ben Henry, Aaron Gable, Ryan Tseng, Michael Moss, Stefan Zager, Robert Iannucci
On Tue, Apr 15, 2014 at 1:23 PM, John Abd-El-Malek <j...@chromium.org> wrote:
> What's the status of the web-based blame working? i.e. from the example from the "Chromium development, Git, and you" thread, the workflow I described still doesn't work. It took several minutes for the annotated view for render_view_impl.cc to show up, which is too slow to be usable. Even then, when I clicked on the link to the left to see the code before previous change, it took another 2 minutes for the same html to load, and I couldn't see the code before the previous change. This is something that needs to work for the reasons I listed before.

+1.  Having a web-based blame flow of equivalent quality to the SVN-based system is critically important.  Please, please do not switch to git without this.

PK

Brett Wilson

Apr 15, 2014, 4:43:27 PM
to John Abd-El-Malek, Chase Phillips, Chromium-dev, blink-dev, Ben Henry, Aaron Gable, Ryan Tseng, Michael Moss, Stefan Zager, Robert Iannucci
On Tue, Apr 15, 2014 at 1:23 PM, John Abd-El-Malek <j...@chromium.org> wrote:
> What's the status of the web-based blame working? i.e. from the example from
> the "Chromium development, Git, and you" thread, the workflow I described
> still doesn't work. It took several minutes for the annotated view for
> render_view_impl.cc to show up, which is too slow to be usable. Even then,
> when I clicked on the link to the left to see the code before previous
> change, it took another 2 minutes for the same html to load, and I couldn't
> see the code before the previous change. This is something that needs to
> work for the reasons I listed before.

It's great to see the git stuff moving along, but I recall there were
some pretty strong assurances about this blame stuff working before
the switchover. According to John's comment, these haven't been
addressed. Lately, this is something I've been doing every single day.

Brett

Mike Wittman

Apr 15, 2014, 5:02:55 PM
to Chase Phillips, Chromium-dev, blink-dev, Ben Henry, Aaron Gable, Ryan Tseng, Michael Moss, Stefan Zager, Robert Iannucci
It's been stated (here) that the git submodules workflow will break with this transition. What do those of us still using that workflow need to do to transition back to the gclient workflow?

-Mike


On Tue, Apr 15, 2014 at 1:07 PM, Chase Phillips <c...@google.com> wrote:

--
--
Chromium Developers mailing list: chromi...@chromium.org
View archives, change email options, or unsubscribe:
http://groups.google.com/a/chromium.org/group/chromium-dev

To unsubscribe from this group and stop receiving emails from it, send an email to chromium-dev...@chromium.org.

Robert Iannucci

Apr 15, 2014, 5:11:25 PM
to Mike Wittman, Chase Phillips, Chromium-dev, blink-dev, Ben Henry, Aaron Gable, Ryan Tseng, Michael Moss, Stefan Zager
Yes, but the unmanaged gclient workflow (i.e. what you get when you run `fetch chromium`) is very similar now. Basically, `gclient sync` won't touch your src.git repo, but it will update all the dependencies from whatever DEPS (currently, .DEPS.git) says in your current working copy. On flagday, .DEPS.git will become DEPS so that discrepancy will go away.

Michael Moss

Apr 15, 2014, 5:14:42 PM
to Brett Wilson, John Abd-El-Malek, Chase Phillips, Chromium-dev, blink-dev, Ben Henry, Aaron Gable, Ryan Tseng, Stefan Zager, Robert Iannucci
On Tue, Apr 15, 2014 at 1:43 PM, Brett Wilson <bre...@chromium.org> wrote:
On Tue, Apr 15, 2014 at 1:23 PM, John Abd-El-Malek <j...@chromium.org> wrote:
> What's the status of the web-based blame working? i.e. from the example from
> the "Chromium development, Git, and you" thread, the workflow I described
> still doesn't work. It took several minutes for the annotated view for

As I stated in that thread, I see similar performance characteristics trying to load history and blame for those files in viewvc (35s for the history view, 1.2m for the annotate view). Are you seeing some sort of blazingly fast response times that I don't? I think caching on repeated requests might be faster in viewvc, but even then, I wouldn't call the views of any heavily modified file "fast".

> render_view_impl.cc to show up, which is too slow to be usable. Even then,
> when I clicked on the link to the left to see the code before previous
> change, it took another 2 minutes for the same html to load, and I couldn't
> see the code before the previous change. This is something that needs to

This might be a legitimate problem, though I'm not sure it's actually been brought up before. If I understand you correctly, the issue is that in viewvc, the links go to a diff between the blame revision and revision for that line, but in gitiles, the links take you to a new blame view, based on revision at that line. You prefer the viewvc "links to diff" over the gitiles "links to blame" feature?

Michael Moss

Apr 15, 2014, 5:15:34 PM
to Brett Wilson, John Abd-El-Malek, Chase Phillips, Chromium-dev, blink-dev, Ben Henry, Aaron Gable, Ryan Tseng, Stefan Zager, Robert Iannucci
[ugh, from @chromium account]

Ilya Sherman

Apr 15, 2014, 5:19:01 PM
to Chase Phillips, Chromium-dev, blink-dev, Ben Henry, Aaron Gable, Ryan Tseng, Michael Moss, Stefan Zager, Robert Iannucci
The Chrome version string currently has the format "35.0.1916.27 (Official Build 262261) dev".  What will the format be after the switch?  Specifically, will there still be a (per-branch?) monotonically increasing build number for official releases?

Thanks,
~ilya



Peter Kasting

Apr 15, 2014, 5:24:23 PM
to Michael Moss, Brett Wilson, John Abd-El-Malek, Chase Phillips, Chromium-dev, blink-dev, Ben Henry, Aaron Gable, Ryan Tseng, Stefan Zager, Robert Iannucci
On Tue, Apr 15, 2014 at 2:14 PM, Michael Moss <mm...@google.com> wrote:
> This might be a legitimate problem, though I'm not sure it's actually been brought up before. If I understand you correctly, the issue is that in viewvc, the links go to a diff between the blame revision and revision for that line, but in gitiles, the links take you to a new blame view, based on revision at that line. You prefer the viewvc "links to diff" over the gitiles "links to blame" feature?

Yes.  I raised this before, though I may have raised it on a gitiles bug directly.  Linking to a blame view doesn't help you do the primary thing blame is useful for, which is to determine whether a particular revision was, in fact, the source of the changes you're interested in.  Linking to a diff, OTOH, tells you precisely what the change in question did, and allows you to click a link to see the blame of the pre-diff file if it didn't turn out to be the change you're interested in.

PK

Aaron Gable

Apr 15, 2014, 5:24:37 PM
to Ilya Sherman, Chase Phillips, Chromium-dev, blink-dev, Ben Henry, Ryan Tseng, Michael Moss, Stefan Zager, Robert Iannucci
The Chrome version string will keep exactly the same format -- the Chrome Version is determined by the release scripts which cut the branch and increment the various pieces of the version number. None of that held a correspondence to SVN revision numbers in the first place, so none of that is changing.

Aaron

Aaron Gable

Apr 15, 2014, 5:29:43 PM
to Aaron Gable, Ilya Sherman, Chase Phillips, Chromium-dev, blink-dev, Ben Henry, Ryan Tseng, Michael Moss, Stefan Zager, Robert Iannucci
Sorry, Ilya, clarification: The X.Y.Z.A format of the version string will not change. The "Official Build 262261" part will, as the "262261" piece of that is the revision currently at HEAD of the branch in question, so that will become a hash.

Aaron

Rachel Blum

Apr 15, 2014, 5:35:39 PM
to Michael Moss, Brett Wilson, John Abd-El-Malek, Chase Phillips, Chromium-dev, blink-dev, Ben Henry, Aaron Gable, Ryan Tseng, Stefan Zager, Robert Iannucci
On Tue, Apr 15, 2014 at 2:14 PM, Michael Moss <mm...@google.com> wrote:
> As I stated in that thread, I see similar performance characteristics trying to load history and blame for those files in viewvc (35s for the history view, 1.2m for the annotate view).

FWIW - 20s on the annotated view of render_view_impl.cc here. 2:30m for gitiles version of that. That's 7.5x faster for viewvc - somewhat noticeable :)

> You prefer the viewvc "links to diff" over the gitiles "links to blame" feature?

Yes, please - I usually care about what exactly changed, not how the entire file looked at the time in question.

Ilya Sherman

Apr 15, 2014, 6:08:34 PM
to Aaron Gable, Chase Phillips, Chromium-dev, blink-dev, Ben Henry, Ryan Tseng, Michael Moss, Stefan Zager, Robert Iannucci
Can we pretty please keep a monotonically increasing number for official commits (possibly per-branch), rather than reverting to hashes?  It's very common for me to need to know: Is my change, rXXXXXX, included in the build corresponding to this bug report?  With hashes, I need to query a tool to answer this common question.  That's much less convenient, and slower, than just being able to compare two numbers.
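
The difference can be sketched roughly as follows. With numbers, "is my change in this build?" is one comparison (valid only along a single linear branch); with hashes, something has to walk the history, which is what a tool such as `git merge-base --is-ancestor` does against a real repo. The commit graph, the short hashes and the change number 262100 below are made up for illustration.

```python
# Toy history: each (made-up) hash maps to its list of parent hashes.
PARENTS = {
    "a1f": [],
    "b2e": ["a1f"],
    "c3d": ["b2e"],
    "d4c": ["c3d"],
}


def included_by_number(change_rev, build_rev):
    """On one linear branch, inclusion is a plain integer comparison."""
    return change_rev <= build_rev


def included_by_hash(change, build, parents=PARENTS):
    """With hashes, walk back from the build commit looking for the change."""
    stack, seen = [build], set()
    while stack:
        commit = stack.pop()
        if commit == change:
            return True
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(parents.get(commit, []))
    return False


if __name__ == "__main__":
    print(included_by_number(262100, 262261))  # compares two numbers
    print(included_by_hash("b2e", "d4c"))      # needs the commit graph
    print(included_by_hash("d4c", "b2e"))
```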

Elliott Sprehn

Apr 15, 2014, 6:14:54 PM
to Rachel Blum, Michael Moss, Brett Wilson, John Abd-El-Malek, Chase Phillips, Chromium-dev, blink-dev, Ben Henry, Aaron Gable, Ryan Tseng, Stefan Zager, Robert Iannucci
Same here, please don't do this migration until the blame tools work properly and are fast. This would be a huge hit to productivity, especially in the blink code base where lots of things are 10+ years old and I blame all the time.

I remember the same thing that Jam does, that this was supposed to be addressed before the migration?

Mike Wittman

Apr 15, 2014, 6:25:01 PM
to Robert Iannucci, Chase Phillips, Chromium-dev, blink-dev, Ben Henry, Aaron Gable, Ryan Tseng, Michael Moss, Stefan Zager
Sorry, should have been more specific. I'm fine with the gclient workflow, but I have several repositories created using the git submodules workflow that have a lot of local working branches I'd like to keep. Can I preserve these repositories when switching to the gclient workflow, or do I need to check out fresh and bring that state over from the old repositories?

-Mike

Dana Jansens

Apr 15, 2014, 6:27:31 PM
to Mike Wittman, Robert Iannucci, Chase Phillips, Chromium-dev, blink-dev, Ben Henry, Aaron Gable, Ryan Tseng, Michael Moss, Stefan Zager
On Tue, Apr 15, 2014 at 6:25 PM, Mike Wittman <wit...@google.com> wrote:
> Sorry, should have been more specific. I'm fine with the gclient workflow, but I have several repositories created using the git submodules workflow that have a lot of local working branches I'd like to keep. Can I preserve these repositories when switching to the gclient workflow, or do I need to check out fresh and bring that state over from the old repositories?

FWIW If you checkout fresh, you can git-fetch the branches from the other repos over.
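
A rough sketch of that approach, assuming the old submodule-style checkout is still on local disk and shares history with the fresh one: build a `git fetch` command that copies the old local branches into the fresh checkout under a separate ref namespace. The paths and the `old-submodule` namespace are placeholders, and `fetch_old_branches_cmd` is a made-up helper.

```python
import subprocess


def fetch_old_branches_cmd(old_repo, namespace="old-submodule"):
    """Build a git fetch command that copies old_repo's local branches into
    refs/remotes/<namespace>/ so they cannot clobber branches here."""
    refspec = "refs/heads/*:refs/remotes/%s/*" % namespace
    return ["git", "fetch", old_repo, refspec]


def fetch_old_branches(new_checkout, old_repo):
    """Run the fetch inside the fresh checkout (both paths are placeholders)."""
    subprocess.run(fetch_old_branches_cmd(old_repo), cwd=new_checkout, check=True)


if __name__ == "__main__":
    # Printing only; call fetch_old_branches() with real paths to run it.
    print(" ".join(fetch_old_branches_cmd("/path/to/old/src")))
```

After the fetch, something like `git branch my-feature old-submodule/my-feature` would recreate a local branch from the copied ref (`my-feature` being a placeholder name).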

John Abd-El-Malek

Apr 15, 2014, 6:29:16 PM
to Michael Moss, Brett Wilson, Chase Phillips, Chromium-dev, blink-dev, Ben Henry, Aaron Gable, Ryan Tseng, Stefan Zager, Robert Iannucci
On Tue, Apr 15, 2014 at 2:14 PM, Michael Moss <mm...@google.com> wrote:

On Tue, Apr 15, 2014 at 1:43 PM, Brett Wilson <bre...@chromium.org> wrote:
On Tue, Apr 15, 2014 at 1:23 PM, John Abd-El-Malek <j...@chromium.org> wrote:
> What's the status of the web-based blame working? i.e. from the example from
> the "Chromium development, Git, and you" thread, the workflow I described
> still doesn't work. It took several minutes for the annotated view for

> As I stated in that thread, I see similar performance characteristics trying to load history and blame for those files in viewvc (35s for the history view, 1.2m for the annotate view). Are you seeing some sort of blazingly fast response times that I don't? I think caching on repeated requests might be faster in viewvc, but even then, I wouldn't call the views of any heavily modified file "fast".

The numbers that I wrote were reproduced on different days (also today). It's still 20s to annotate in viewvc, and multiple minutes in gitiles. And that's when gitiles loads at all; it often seems to time out.


> render_view_impl.cc to show up, which is too slow to be usable. Even then,
> when I clicked on the link to the left to see the code before previous
> change, it took another 2 minutes for the same html to load, and I couldn't
> see the code before the previous change. This is something that needs to

> This might be a legitimate problem, though I'm not sure it's actually been brought up before.

It's been brought up multiple times, in that thread and on the bug. Please see the email I linked to below; the process to blame an old CL is outlined there step by step.

Marshall Greenblatt

Apr 15, 2014, 6:57:30 PM
to Chase Phillips, Chromium-dev, blin...@chromium.org, Ben Henry, Aaron Gable, Ryan Tseng, Michael Moss, Stefan Zager, Robert Iannucci
Hi Chase,

What's the timeline and plan for migrating Chromium release branches to git?

Thanks,
Marshall

Michael Moss

Apr 15, 2014, 7:00:43 PM
to John Abd-El-Malek, Brett Wilson, Chase Phillips, Chromium-dev, blink-dev, Ben Henry, Aaron Gable, Ryan Tseng, Stefan Zager, Robert Iannucci
We are investigating. Rest assured, if the performance can't be improved, we won't switch.

Viet-Trung Luu

Apr 15, 2014, 7:00:53 PM
to Aaron Gable, Ilya Sherman, Chase Phillips, Chromium-dev, blink-dev, Ben Henry, Ryan Tseng, Michael Moss, Stefan Zager, Robert Iannucci
What good is a hash if we think that we'll rewrite history (thereby quite possibly invalidating lots of/most hashes)?

Each time we rewrite history, will we have to keep a copy of the git repo before the rewrite, so that we can do proper archaeology?

Michael Moss

Apr 15, 2014, 7:01:06 PM
to John Abd-El-Malek, Brett Wilson, Chase Phillips, Chromium-dev, blink-dev, Ben Henry, Aaron Gable, Ryan Tseng, Stefan Zager, Robert Iannucci

Michael Moss

Apr 15, 2014, 7:07:06 PM
to Marshall Greenblatt, Chase Phillips, Chromium-dev, blin...@chromium.org, Ben Henry, Aaron Gable, Ryan Tseng, Stefan Zager, Robert Iannucci
Branches already exist in git (https://chromium.googlesource.com/chromium/src.git/+log/refs/branch-heads/1933). You can 'gclient sync --with_branch_heads' to pull them into your checkout if you don't have them already.

Marshall Greenblatt

Apr 15, 2014, 7:19:07 PM
to Michael Moss, Chase Phillips, Chromium-dev, blin...@chromium.org, Ben Henry, Aaron Gable, Ryan Tseng, Stefan Zager, Robert Iannucci
On Apr 15, 2014, at 7:07 PM, Michael Moss <mm...@chromium.org> wrote:

> Branches already exist in git (https://chromium.googlesource.com/chromium/src.git/+log/refs/branch-heads/1933). You can 'gclient sync --with_branch_heads' to pull them into your checkout if you don't have them already.

These instructions seem to indicate that there's still an svn dependency when building release branches: http://www.chromium.org/developers/how-tos/get-the-code#Working_with_release_branches

Can you update the instructions if there's now a way to do it without requiring svn?

Thanks,
Marshall

Michael Moss

Apr 15, 2014, 7:34:46 PM
to Marshall Greenblatt, Chase Phillips, Chromium-dev, blin...@chromium.org, Ben Henry, Aaron Gable, Ryan Tseng, Stefan Zager, Robert Iannucci
On Tue, Apr 15, 2014 at 4:19 PM, Marshall Greenblatt <magree...@gmail.com> wrote:
> On Apr 15, 2014, at 7:07 PM, Michael Moss <mm...@chromium.org> wrote:
>
>> Branches already exist in git (https://chromium.googlesource.com/chromium/src.git/+log/refs/branch-heads/1933). You can 'gclient sync --with_branch_heads' to pull them into your checkout if you don't have them already.
>
> These instructions seem to indicate that there's still an svn dependency when building release branches: http://www.chromium.org/developers/how-tos/get-the-code#Working_with_release_branches
>
> Can you update the instructions if there's now a way to do it without requiring svn?

Those instructions are still valid until the switch happens, since svn is still the source of truth. But if you're talking about a plan to migrate "directly editable" release branches to git, that's just a normal part of the migration, and all that hackish branch workflow stuff will basically go away.

Aaron Gable

Apr 15, 2014, 7:52:57 PM
to Viet-Trung Luu, Aaron Gable, Ilya Sherman, Chase Phillips, Chromium-dev, blink-dev, Ben Henry, Ryan Tseng, Michael Moss, Stefan Zager, Robert Iannucci
On Tue, Apr 15, 2014 at 4:00 PM, Viet-Trung Luu <viettr...@chromium.org> wrote:
> What good is a hash if we think that we'll rewrite history (thereby quite possibly invalidating lots of/most hashes)?
>
> Each time we rewrite history, will we have to keep a copy of the git repo before the rewrite, so that we can do proper archaeology?

It's definitely not in the plan to do lots of (or any) rewriting of the history of the repo. There has been discussion of history rewriting for the sake of the Chrome+Blink merge, but that would mostly be rewriting of Blink, not Chrome.

That said, even if we did rewrite history, it would not be hard to keep the old history in (for example) a refs/archive ref namespace, so that all the same git objects would continue to exist and could be pulled by someone who wants to inspect them or viewed on chromium.googlesource.com, but wouldn't take up any space on developer or bot machines.

Elliot Glaysher (Chromium)

Apr 15, 2014, 7:57:51 PM
to aga...@chromium.org, Viet-Trung Luu, Ilya Sherman, Chase Phillips, Chromium-dev, blink-dev, Ben Henry, Ryan Tseng, Michael Moss, Stefan Zager, Robert Iannucci
On Tue, Apr 15, 2014 at 4:52 PM, Aaron Gable <aga...@chromium.org> wrote:
> On Tue, Apr 15, 2014 at 4:00 PM, Viet-Trung Luu <viettr...@chromium.org> wrote:
>> Each time we rewrite history, will we have to keep a copy of the git repo before the rewrite, so that we can do proper archaeology?
>
> It's definitely not in the plan to do lots of (or any) rewriting of the history of the repo. There has been discussion of history rewriting for the sake of the Chrome+Blink merge, but that would mostly be rewriting of Blink, not Chrome.

I've maintained scripts to clean up the large binary weight of the main chrome repo; this is tracked in crbug.com/111570. Given all the binary copies of cygwin and other programs, the historical layout test results, etc, I think it would be a good idea if you're doing the same for blink.

-- Elliot

Viet-Trung Luu

Apr 15, 2014, 8:26:24 PM
to Aaron Gable, Ilya Sherman, Chase Phillips, Chromium-dev, blink-dev, Ben Henry, Ryan Tseng, Michael Moss, Stefan Zager, Robert Iannucci
On Tue, Apr 15, 2014 at 4:52 PM, Aaron Gable <aga...@chromium.org> wrote:

> On Tue, Apr 15, 2014 at 4:00 PM, Viet-Trung Luu <viettr...@chromium.org> wrote:
>> What good is a hash if we think that we'll rewrite history (thereby quite possibly invalidating lots of/most hashes)?
>>
>> Each time we rewrite history, will we have to keep a copy of the git repo before the rewrite, so that we can do proper archaeology?
>
> It's definitely not in the plan to do lots of (or any) rewriting of the history of the repo. There has been discussion of history rewriting for the sake of the Chrome+Blink merge, but that would mostly be rewriting of Blink, not Chrome.

What's the plan for preventing, e.g., people from accidentally (or unaccidentally) checking in large things? (Or, not necessarily even large things, but things that shouldn't -- where "shouldn't" may only be clear in hindsight -- be committed?)

Is Blink the only repo we're thinking about bringing in? What about other repos? (Skia, the Chrome OS repos, NaCl, Android, whatever -- thinking long-term.) What's our plan to scale our repo?

罗勇刚(Yonggang Luo)

Apr 15, 2014, 8:33:54 PM
to Viet-Trung Luu, Aaron Gable, Ilya Sherman, Chase Phillips, Chromium-dev, blink-dev, Ben Henry, Ryan Tseng, Michael Moss, Stefan Zager, Robert Iannucci
Looking into the src.git history, there are problems:
it contains commits like "SVN changes up to revision 264039", which make no sense.
https://chromium.googlesource.com/chromium/src.git/+/e7f039f2099d9a0af22cdcdc548929051f449df9
--
Yours sincerely,
Yonggang Luo (罗勇刚)

Aaron Gable

Apr 15, 2014, 8:37:10 PM
to luoyo...@gmail.com, Viet-Trung Luu, Aaron Gable, Ilya Sherman, Chase Phillips, Chromium-dev, blink-dev, Ben Henry, Ryan Tseng, Michael Moss, Stefan Zager, Robert Iannucci
Those changes will no longer exist. They're an artifact of the current system of replicating changes from SVN into Git. If you take a look at the history of the 'git-svn' ref, you'll see what will become 'master' on flag day.

Aaron

罗勇刚(Yonggang Luo)

Apr 15, 2014, 8:42:58 PM
to Aaron Gable, Viet-Trung Luu, Ilya Sherman, Chase Phillips, Chromium-dev, blink-dev, Ben Henry, Ryan Tseng, Michael Moss, Stefan Zager, Robert Iannucci
Also, I reviewed the git history; the Author and Email fields don't conform to the Git standard.
Revision: fb9269308b8bcdc21e6bed05f7708857ef542d69
Author: mall...@chromium.org <mall...@chromium.org@0039d316-1c4b-4281-b951-d872f2087c98>
Date: 2014/4/16 5:13:48


The Author: should be something like:
Author: Yonggang Luo <luoyo...@gmail.com>
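
The odd shape comes from git-svn, which records the author as the SVN user followed by `<user@repository-uuid>`. Below is a small sketch of recognising that shape and recovering a conventional address. The sample address is a placeholder and `normalize_git_svn_author` is a hypothetical helper, not an existing tool; it cannot recover a real name, since the commit does not carry one.

```python
import re

# git-svn style author: "<svn user> <<svn user>@<repository uuid>>"
GIT_SVN_AUTHOR = re.compile(
    r"^(?P<user>\S+) <(?P=user)@"
    r"(?P<uuid>[0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12})>$"
)


def normalize_git_svn_author(author):
    """Return 'user <user>' when the SVN user is an email address;
    otherwise return the author string unchanged."""
    match = GIT_SVN_AUTHOR.match(author)
    if match and "@" in match.group("user"):
        user = match.group("user")
        return "%s <%s>" % (user, user)
    return author


if __name__ == "__main__":
    sample = ("someone@chromium.org "
              "<someone@chromium.org@0039d316-1c4b-4281-b951-d872f2087c98>")
    print(normalize_git_svn_author(sample))
```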

Aaron Gable

Apr 15, 2014, 8:57:06 PM
to luoyo...@gmail.com, Aaron Gable, Viet-Trung Luu, Ilya Sherman, Chase Phillips, Chromium-dev, blink-dev, Ben Henry, Ryan Tseng, Michael Moss, Stefan Zager, Robert Iannucci
Yes. That is the price we pay for having used git-svn and replication for the last few years. Fixing it would require rewriting all of history (which, as mentioned above, we don't plan to do). New commits after flag day will not have this problem: they will have properly formatted committer and author fields.

Aaron

罗勇刚(Yonggang Luo)

Apr 15, 2014, 9:49:49 PM
to Aaron Gable, Viet-Trung Luu, Ilya Sherman, Chase Phillips, Chromium-dev, blink-dev, Ben Henry, Ryan Tseng, Michael Moss, Stefan Zager, Robert Iannucci
2014-04-16 8:57 GMT+08:00 Aaron Gable <aga...@chromium.org>:
> Yes. That is the price we pay for having used git-svn and replication for the last few years. Fixing it would require rewriting all of history (which, as mentioned above, we don't plan to do). That will not be true for all new commits after the flag day -- they will have properly formatted committer and author fields.

On flag day, I think it's worth running a filter-branch to fix things like this up. Right now the git history has not yet been used for official releases, whereas after flag day nothing can be changed. Besides, crbug.com/111570 already needs a filter-branch to remove those binary files, so this is our chance to do better.

ingrid krammova

Apr 15, 2014, 10:19:29 PM
to luoyo...@gmail.com, Aaron Gable, Viet-Trung Luu, Ilya Sherman, Chase Phillips, Chromium-dev, blink-dev, Ben Henry, Ryan Tseng, Michael Moss, Stefan Zager, Robert Iannucci

"罗勇刚(Yonggang Luo)" <luoyo...@gmail.com> wrote:

>>>>>>>>>> The Chrome Infra team has been working via bugs <https://code.google.com/p/chromium/issues/list?can=2&q=blocking%3A263848&colspec=ID+Pri+M+Iteration+ReleaseBlock+Cr+Status+Owner+Summary+OS+Modified&x=m&y=releaseblock&cells=tiles> and a
>>>>>>>>>> planning document <https://docs.google.com/a/chromium.org/document/d/1JHs1SjHTnUK77PAVeG6T-xTjUWBGqfuglTvYVa2zwmU/edit#heading=h.3ptkkl4ppzn8> to complete the final tasks that remain prior to flipping the switch from

Matt Giuca

Apr 15, 2014, 11:31:33 PM
to inge3...@gmail.com, luoyo...@gmail.com, Aaron Gable, Viet-Trung Luu, Ilya Sherman, Chase Phillips, Chromium-dev, blink-dev, Ben Henry, Ryan Tseng, Michael Moss, Stefan Zager, Robert Iannucci
On 16 April 2014 08:08, Ilya Sherman <ishe...@chromium.org> wrote:
> Can we pretty please keep a monotonically increasing number for official commits (possibly per-branch), rather than reverting to hashes?  It's very common for me to need to know: Is my change, rXXXXXX, included in the build corresponding to this bug report?  With hashes, I need to query a tool to answer this common question.  That's much less convenient, and slower, than just being able to compare two numbers.

+cb0f9bc

I mean +1.

We had a big discussion about revnos vs hashes already on the "Chromium, Git, and you" thread but I don't think we came to any conclusion. If those SVN revision numbers get replaced with hashes, life is going to be awful. I need to be able to tell at a glance whether one revision is newer than another. I need the git log, OmahaProxy page and chrome://version pages to all show me the revision numbers consistently and in a monotonically increasing order. (For just one quick example: all bugs reported from TOT should have this revision number in the bug report so we can easily compare to the version we have checked out.)

We discussed previously the idea of using "git number" (which is in depot-tools) which just tells you the height of the git DAG. This is sufficient for the trunk but could cause confusion when you're in a branch. Another idea is just to have a CQ script (like the one that currently adds "Review URL" to every commit message) that appends a special line to every commit message saying what its revision number is. We could then automatically scrape the commit message for this line and display it in chrome://version. The advantage of this approach is that we can continue the numbering from the SVN IDs.

The format would be very simple. All commits to master end in a line:
"chromium-revision-id: 264090"

The CQ looks at the previous commit for chromium-revision-id, increments it, and appends the new chromium-revision-id to the new commit (with a special case for the first commit post-flag-day; it would look at git-svn-id to get the predecessor of the first chromium-revision-id number).
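
To make the stamping step concrete, here is a minimal sketch of what such a CQ hook might do. It is an illustration only: `previous_revision` and `stamp` are made-up names, and the git-svn-id line is assumed to look like `git-svn-id: <url>@<revision> <uuid>`.

```python
import re

REVISION_FOOTER = re.compile(r"^chromium-revision-id: (\d+)$", re.MULTILINE)
# Assumed git-svn footer shape: "git-svn-id: <url>@<revision> <uuid>"
GIT_SVN_FOOTER = re.compile(r"^git-svn-id: \S+@(\d+) \S+$", re.MULTILINE)


def previous_revision(parent_message):
    """Read the parent's number, falling back to git-svn-id for the
    first commit after flag day."""
    match = (REVISION_FOOTER.search(parent_message)
             or GIT_SVN_FOOTER.search(parent_message))
    if match is None:
        raise ValueError("parent commit carries no revision footer")
    return int(match.group(1))


def stamp(message, parent_message):
    """Append the next chromium-revision-id line to a new commit message."""
    next_id = previous_revision(parent_message) + 1
    return "%s\n\nchromium-revision-id: %d\n" % (message.rstrip("\n"), next_id)


if __name__ == "__main__":
    # Example parent: a pre-flag-day commit that only has a git-svn-id line.
    parent = ("Old change\n\n"
              "git-svn-id: svn://svn.chromium.org/chrome/trunk/src@264089 "
              "0039d316-1c4b-4281-b951-d872f2087c98\n")
    print(stamp("Fix a bug", parent))
```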

Commits to a branch would be handled differently to how they are now (where they interleave SVN IDs with the commits to trunk). Instead of tracking the chromium-revision-ids, we have a separate ID counter for each branch. Perhaps we can commandeer the fourth field of the Chrome version number instead of inventing a fifth one: when creating a branch, it is set to 0, and we add this line to the commit that creates the branch:
"chromium-version-id: 35.0.1916.0"

Every commit to a branch ends in a line like this, and when creating a new branch commit, the CQ automatically increments the fourth field, like this:
"chromium-version-id: 35.0.1916.1"

And when building from a branch release, that becomes the version ID (so the fourth field will be much larger than it has been in the past, but that seems OK). You don't need to see a revision ID when comparing versions in a branch because the version IDs are monotonically increasing.
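
The per-branch counter could be sketched like this, again purely as an illustration with hypothetical helper names. Note that versions need to be compared field by field as integers; a plain string comparison would put "35.0.1916.10" before "35.0.1916.9".

```python
def next_branch_version(version):
    """Increment the fourth field of an X.Y.Z.A version string."""
    fields = version.split(".")
    if len(fields) != 4 or not all(f.isdigit() for f in fields):
        raise ValueError("expected X.Y.Z.A, got %r" % version)
    fields[3] = str(int(fields[3]) + 1)
    return ".".join(fields)


def version_key(version):
    """Key for comparing versions numerically, field by field."""
    return tuple(int(f) for f in version.split("."))


if __name__ == "__main__":
    version = "35.0.1916.0"
    for _ in range(3):
        version = next_branch_version(version)
        print("chromium-version-id: %s" % version)
```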

This is just a rough sketch. I'm happy to continue discussing this offline and even help out, but I would really like for trunk commits to have monotonic revision IDs before the flag day.

Dirk Pranke

Apr 16, 2014, 12:10:20 AM
to Viet-Trung Luu, Aaron Gable, Ilya Sherman, Chase Phillips, Chromium-dev, blink-dev, Ben Henry, Ryan Tseng, Michael Moss, Stefan Zager, Robert Iannucci
On Tue, Apr 15, 2014 at 5:26 PM, Viet-Trung Luu <viettr...@chromium.org> wrote:
> On Tue, Apr 15, 2014 at 4:52 PM, Aaron Gable <aga...@chromium.org> wrote:
>> On Tue, Apr 15, 2014 at 4:00 PM, Viet-Trung Luu <viettr...@chromium.org> wrote:
>>> What good is a hash if we think that we'll rewrite history (thereby quite possibly invalidating lots of/most hashes)?
>>>
>>> Each time we rewrite history, will we have to keep a copy of the git repo before the rewrite, so that we can do proper archaeology?
>>
>> It's definitely not in the plan to do lots of (or any) rewriting of the history of the repo. There has been discussion of history rewriting for the sake of the Chrome+Blink merge, but that would mostly be rewriting of Blink, not Chrome.
>
> What's the plan for preventing, e.g., people from accidentally (or unaccidentally) checking in large things? (Or, not necessarily even large things, but things that shouldn't -- where "shouldn't" may only be clear in hindsight -- be committed?)

I don't think we have a plan for this at the moment, at least not beyond the things we normally do (like hope it gets caught in review). 

It's also not clear to me that this is a problem that needs to be urgently solved. It's true that Git is worse at handling large binary objects in a large repo than SVN is, and it's true that we have checked large things in occasionally in the past, but I don't remember anything causing much more than a bit of temporary pain. 

Do you have specific things you're afraid of that happen regularly, or is this more of a general concern?

(I do believe we need to evaluate repo size seriously in the context of the Blink merger, but that is really a separate project from flipping Chrome to Git and we shouldn't conflate the two).
 

Is Blink the only repo we're thinking about bringing in? What about other repos? (Skia, the Chrome OS repos, NaCl, Android, whatever -- thinking long-term.) What's our plan to scale our repo?

As far as I know, Blink is the only repo we're thinking about. 

Since we actively discourage people from trying to use Blink w/o the rest of Chrome, and since we increasingly need to change things in both repos at once, it's clear that there is a big win to blink devs to merge the two repos. 

I don't believe this is true of the others you mention, all of which have a meaningful non-chrome-centric existence.

-- Dirk

Robert Iannucci

unread,
Apr 16, 2014, 12:32:45 AM4/16/14
to Matt Giuca, inge3...@gmail.com, luoyo...@gmail.com, Aaron Gable, Viet-Trung Luu, Ilya Sherman, Chase Phillips, Chromium-dev, blink-dev, Ben Henry, Ryan Tseng, Michael Moss, Stefan Zager
Hey Matt,

So incidentally, we've been bouncing the same idea around infra today. I'm not sure about the release numbering (I think there would be some seriously deep implications there for a lot of other systems), but we could maybe do something like what you propose.

** Disclaimer: This is not a plan. **

If we supposed the existence of a commit queue which interdicted all pushes to the chromium repo (and I mean all (no, seriously everything. Even dcommit would go through this service)), then we could have that commit queue stamp all commits with a repo-wide monotonically increasing number. Note that this would change the commit hashes of all the commits (but `git cl upload` and the existing CQ already do this).

We could then have services (like bugdroid) mention the revision when pasting hashes in places, like deadbeef....-r12345. We could potentially have a chromium extension do this lookup and inject the numbers whenever it sees a hash... There would probably have to be some web service support for this to make it fast, however.

Note also that this would not remove the need for commit hashes. Tools and services, like gitiles and the git command line interface, would still not support these (the amount of work to do that would be large, and diverging our toolchain from standard gitiles / git / etc. would carry a very high cost). These numbers would be correct, but they would effectively be display only.

We could eventually also extend this mechanism to include other numbering schemes (like N-th commit on branch), as well.

** End not-a-plan. **
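
Purely as an illustration of the not-a-plan above, the stamping and the display format could be sketched like this. The class and method names are invented for the example; a real service would also rewrite the commit message (which is what changes the hash), whereas this sketch only records the mapping for the final hash.

```python
import itertools


class StampingQueue:
    """Hypothetical commit queue that numbers every commit it lands."""

    def __init__(self, start=1):
        # One repo-wide counter; every landed commit takes the next value.
        self._counter = itertools.count(start)
        self.number_by_hash = {}

    def land(self, commit_hash):
        """Record a landed commit and return its monotonic number."""
        number = next(self._counter)
        self.number_by_hash[commit_hash] = number
        return number

    def display(self, commit_hash, abbrev=8):
        """Format a hash the way a service like bugdroid might print it."""
        return "%s-r%d" % (commit_hash[:abbrev], self.number_by_hash[commit_hash])


queue = StampingQueue(start=12345)
full_hash = "deadbeef" + "0" * 32
queue.land(full_hash)
print(queue.display(full_hash))
```

The number is display only here: lookups still go through the hash, which matches the point that git and gitiles themselves would not understand these numbers.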

My question to the ML is: if we had one-to-one hash-to-number (within a repo, monotonically increasing), and printed these numbers in addition to the hashes, what flows would this enable?

R

Robert Iannucci

unread,
Apr 16, 2014, 12:33:42 AM4/16/14
to Matt Giuca, inge3...@gmail.com, luoyo...@gmail.com, Aaron Gable, Viet-Trung Luu, Ilya Sherman, Chase Phillips, Chromium-dev, blink-dev, Ben Henry, Ryan Tseng, Michael Moss, Stefan Zager
*sigh* from my chromium account. Twice in the same day. Terrible.

Hey Matt,

So incidentally, we've been bouncing the same idea around infra today. I'm not sure about the release numbering (I think there would be some seriously deep implications there for a lot of other systems), but we could maybe do something like what you propose.

** Disclaimer: This is not a plan. **

If we supposed the existence of a commit queue which interdicted all pushes to the chromium repo (and I mean all (no, seriously everything. Even dcommit would go through this service)), then we could have that commit queue stamp all commits with a repo-wide monotonically increasing number. Note that this would change the commit hashes of all the commits (but `git cl upload` and the existing CQ already do this).

We could then have services (like bugdroid) mention the revision when pasting hashes in places, like deadbeef....-r12345. We could potentially have a chromium extension do this lookup and inject the numbers whenever it sees a hash... There would probably have to be some web service support for this to make it fast, however.

Note also that this would not remove the need for commit hashes. Tools and services, like gitiles and the git command line interface, would still not support these (the amount of work to do that would be large, and diverging our toolchain from standard gitiles / git / etc. would carry a very high cost). These numbers would be correct, but they would effectively be display only.

We could eventually also extend this mechanism to include other numbering schemes (like N-th commit on branch), as well.

** End not-a-plan. **

My question to the ML is: if we had one-to-one hash-to-number (within a repo, monotonically increasing), and printed these numbers in addition to the hashes, what flows would this enable?

R


On Tue, Apr 15, 2014 at 8:31 PM, Matt Giuca <mgi...@chromium.org> wrote:

Viet-Trung Luu

unread,
Apr 16, 2014, 12:47:36 AM4/16/14
to Dirk Pranke, Aaron Gable, Ilya Sherman, Chase Phillips, Chromium-dev, blink-dev, Ben Henry, Ryan Tseng, Michael Moss, Stefan Zager, Robert Iannucci
On Tue, Apr 15, 2014 at 9:10 PM, Dirk Pranke <dpr...@chromium.org> wrote:
On Tue, Apr 15, 2014 at 5:26 PM, Viet-Trung Luu <viettr...@chromium.org> wrote:
On Tue, Apr 15, 2014 at 4:52 PM, Aaron Gable <aga...@chromium.org> wrote:

On Tue, Apr 15, 2014 at 4:00 PM, Viet-Trung Luu <viettr...@chromium.org> wrote:
What good is a hash if we think that we'll rewrite history (thereby quite possibly invalidating lots of/most hashes)?

Each time we rewrite history, will we have to keep a copy of the git repo before the rewrite, so that we can do proper archaeology?

It's definitely not in the plan to do lots of (or any) rewriting of the history of the repo. There has been discussion of history rewriting for the sake of the Chrome+Blink merge, but that would mostly be rewriting of Blink, not Chrome.

What's the plan for preventing, e.g., people from accidentally (or unaccidentally) checking in large things? (Or, not necessarily even large things, but things that shouldn't -- where "shouldn't" may only be clear in hindsight -- be committed?)

I don't think we have a plan for this at the moment, at least not beyond the things we normally do (like hope it gets caught in review). 

My concern is that mistakes happen, and that the pain of git repo growth is inflicted on everyone in perpetuity, unless we plan on rewriting history (which really should be viewed as an awful thing -- it's antithetical to the basic purpose of revision control, and defeats any benefits that using hashes may have).
 

It's also not clear to me that this is a problem that needs to be urgently solved. It's true that Git is worse at handling large binary objects in a large repo than SVN is, and it's true that we have checked large things in occasionally in the past, but I don't remember anything causing much more than a bit of temporary pain. 

It's a concern because we apparently want to move to git as our source of truth, despite a significant number of drawbacks.
 

Do you have specific things you're afraid of that happen regularly, or is this more of a general concern?

(I do believe we need to evaluate repo size seriously in the context of the Blink merger, but that is really a separate project from flipping Chrome to Git and we shouldn't conflate the two).

In general, I'm concerned about the scaling properties of git. This is not to say that git is bad, but that it obviously wasn't designed for our use case.
 
 

Is Blink the only repo we're thinking about bringing in? What about other repos? (Skia, the Chrome OS repos, NaCl, Android, whatever -- thinking long-term.) What's our plan to scale our repo?

As far as I know, Blink is the only repo we're thinking about. 

Since we actively discourage people from trying to use Blink w/o the rest of Chrome, and since we increasingly need to change things in both repos at once, it's clear that there is a big win to blink devs to merge the two repos. 

I don't believe this is true of the others you mention, all of which have a meaningful non-chrome-centric existence.

To be clear, systems like subversion allow separate and combined existence. We don't take advantage of it very much, but the capability is there. (Chromium developers tend to think of the repo as being rooted at src/, which is incorrect.)

The great advantage of "one big repo" is that it allows for atomic changes, and can reduce the amount of gardening needed. (You can still have gardening schemes if you want[*], but there's the capability to use everything at ToT if you want.[**])

For example, take Chrome OS. As I understand it, they currently use parts of Chromium (e.g., base/), and semi-regularly have to garden bits of Chromium in. Now, whenever someone does major refactoring in Chromium, they potentially inflict great pain on Chrome OS. Currently, a great deal of Chrome OS-specific UI code lives in Chromium. This is largely due to an architectural problem. But if we solve that problem, then it'd be natural to kick such code closer to Chrome OS. But this is basically not doable in any meaningful sense, at least without inflicting great pain on everyone, with the many-repos scheme.

It seems to me that moving to a revision control system that forces you to have lots of repos (and thus forcing gardening and/or code duplication) is a step in the wrong direction.

[*] Subversion doesn't force you to check out all of your repo at the same revision. Things like Perforce clients offer even more mapping flexibility/pain.
[**] E.g., probably you could put NaCl into the Chromium subversion repo somewhere ("beside" Chromium's src/; probably "beside" Chromium's trunk/ even), and no one would be the wiser (other than some svn paths needing to be updated, and revision numbers changing). But then if you wanted to, you could try to use NaCl at ToT.[***]
[***] Of course, our build/test infrastructure would need significant changes to make things like this practical, but that's a different problem. My concern is that moving to git might make this (a "google3"-like model) impossible.


-- Dirk

Matt Giuca

unread,
Apr 16, 2014, 1:01:02 AM4/16/14
to Robert Iannucci, Inge Krammova Inge, Yonggang Luo, Aaron Gable, Viet-Trung Luu, Ilya Sherman, Chase Phillips, Chromium-dev, blink-dev, Ben Henry, Ryan Tseng, Michael Moss, Stefan Zager
On 16 April 2014 14:32, Robert Iannucci <iann...@google.com> wrote:
Hey Matt,

So incidentally, we've been bouncing the same idea around infra today. I'm not sure about the release numbering (I think there would be some seriously deep implications there for a lot of other systems), but we could maybe do something like what you propose.

Great to hear this is still being discussed.

** Disclaimer: This is not a plan. **

If we supposed the existence of a commit queue which interdicted all pushes to the chromium repo (and I mean all (no, seriously everything. Even dcommit would go through this service)), then we could have that commit queue stamp all commits with a repo-wide monotonically increasing number. Note that this would change the commit hashes of all the commits (but `git cl upload` and the existing CQ already do this).

We could then have services (like bugdroid) mention the revision when pasting hashes in places, like deadbeef....-r12345. We could potentially have a chromium extension do this lookup and inject the numbers whenever it sees a hash... There would probably have to be some web service support for this to make it fast, however.

Note also that this would not remove the need for commit hashes. Tools and services, like gitiles and the git command line interface, would still not support these (the amount of work to do that would be large, and diverging our toolchain from standard gitiles / git / etc. would carry a very high cost). These numbers would be correct, but they would effectively be display only.

We could eventually also extend this mechanism to include other numbering schemes (like N-th commit on branch), as well.

That's fine (being display only). I am quite used to dealing with git commits (as I'm sure most Chrome engineers are). I just need the monotonic numbers for quick comparison purposes.
 

** End not-a-plan. **

My question to the ML is: if we had one-to-one hash-to-number (within a repo, monotonically increasing), and printed these numbers in addition to the hashes, what flows would this enable?

Here are some (all really similar variations on the same theme):
  • Bug reports from TOT would generally include the revno.
  • Similarly, I can mention on bug discussions what my local build revno is, so everybody knows which patches I have and which ones I don't. (Someone can quickly think, "Oh, Matt is only on 264046, which explains why he isn't seeing my patch r264182 yet.")
  • I can quickly check if my CL is earlier than the bug report revno (and therefore if it could be responsible).
  • I can quickly check if my local build is earlier or later than a particular commit that fixes a bug (and therefore if I should expect the bug to be fixed in my release).
  • I can quickly check if my CL is earlier than the base revision of a branch (reported by OmahaProxy) (and therefore if it made it into a release).
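
All of these reduce to a single integer comparison. A minimal sketch, assuming trunk history is linear and every trunk commit carries a unique, monotonically increasing revno (the function names are made up for illustration):

```python
def build_includes(build_revno, commit_revno):
    """True if a trunk build at build_revno contains trunk commit commit_revno."""
    return commit_revno <= build_revno


def could_have_caused(cl_revno, report_revno):
    """True if a CL landed at or before the revision a bug was reported against."""
    return cl_revno <= report_revno


# The example from the list above: a build at 264046 does not yet have r264182.
print(build_includes(264046, 264182))
```

With bare hashes the same question needs a repository lookup (an ancestry check), which is what makes the numbers convenient for quick mental comparison.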

Robert Iannucci

unread,
Apr 16, 2014, 1:11:57 AM4/16/14
to Viet-Trung Luu, Dirk Pranke, Aaron Gable, Ilya Sherman, Chase Phillips, Chromium-dev, blink-dev, Ben Henry, Ryan Tseng, Michael Moss, Stefan Zager
On Tue, Apr 15, 2014 at 9:47 PM, Viet-Trung Luu <v...@google.com> wrote:
On Tue, Apr 15, 2014 at 9:10 PM, Dirk Pranke <dpr...@chromium.org> wrote:
On Tue, Apr 15, 2014 at 5:26 PM, Viet-Trung Luu <viettr...@chromium.org> wrote:
On Tue, Apr 15, 2014 at 4:52 PM, Aaron Gable <aga...@chromium.org> wrote:

On Tue, Apr 15, 2014 at 4:00 PM, Viet-Trung Luu <viettr...@chromium.org> wrote:
What good is a hash if we think that we'll rewrite history (thereby quite possibly invalidating lots of/most hashes)?

Each time we rewrite history, will we have to keep a copy of the git repo before the rewrite, so that we can do proper archaeology?

It's definitely not in the plan to do lots of (or any) rewriting of the history of the repo. There has been discussion of history rewriting for the sake of the Chrome+Blink merge, but that would mostly be rewriting of Blink, not Chrome.

What's the plan for preventing, e.g., people from accidentally (or unaccidentally) checking in large things? (Or, not necessarily even large things, but things that shouldn't -- where "shouldn't" may only be clear in hindsight -- be committed?)

I don't think we have a plan for this at the moment, at least not beyond the things we normally do (like hope it gets caught in review). 

My concern is that mistakes happen, and that the pain of git repo growth is inflicted on everyone in perpetuity, unless we plan on rewriting history (which really should be viewed as an awful thing -- it's antithetical to the basic purpose of revision control, and defeats any benefits that using hashes may have).
 

It's also not clear to me that this is a problem that needs to be urgently solved. It's true that Git is worse at handling large binary objects in a large repo than SVN is, and it's true that we have checked large things in occasionally in the past, but I don't remember anything causing much more than a bit of temporary pain. 

It's a concern because we apparently want to move to git as our source of truth, despite a significant number of drawbacks.
 

Do you have specific things you're afraid of that happen regularly, or is this more of a general concern?

(I do believe we need to evaluate repo size seriously in the context of the Blink merger, but that is really a separate project from flipping Chrome to Git and we shouldn't conflate the two).

In general, I'm concerned about the scaling properties of git. This is not to say that git is bad, but that it obviously wasn't designed for our use case.

So keep in mind, SVN is really not stellar at keeping binaries in source control either. It's easier from the client perspective, but the server side is, at best, not great.

FTR, I think it would be absolutely worthwhile to reject pushes containing binaries once we have a transparent or semi-transparent way to push binaries into google storage (the {upload,download}_from_google_storage scripts are half of a solution). We could even have a commit-queue system do this sort of thing automatically.
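
A rough sketch of what such a push check could look like. Everything here is an assumption for illustration: the function names, the 1 MiB limit, and the NUL-byte test (a common heuristic for spotting binary content) are not part of any existing Chromium tool.

```python
MAX_BYTES = 1024 * 1024  # assumed size limit, chosen arbitrarily for the sketch
SNIFF_BYTES = 8000       # only the start of each file is inspected


def looks_binary(data):
    """Heuristic: treat content with a NUL byte near the start as binary."""
    return b"\0" in data[:SNIFF_BYTES]


def rejected_files(files, max_bytes=MAX_BYTES):
    """Return the sorted paths in a push that should be rejected.

    files maps each path touched by the push to its new content as bytes.
    """
    return sorted(
        path
        for path, data in files.items()
        if looks_binary(data) or len(data) > max_bytes
    )


push = {
    "base/foo.cc": b"int main() { return 0; }\n",
    "chrome/logo.png": b"\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR",
}
print(rejected_files(push))
```

A rejected file would then be uploaded to storage out of band, with only a small pointer file checked in.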
 
 
 

Is Blink the only repo we're thinking about bringing in? What about other repos? (Skia, the Chrome OS repos, NaCl, Android, whatever -- thinking long-term.) What's our plan to scale our repo?

As far as I know, Blink is the only repo we're thinking about. 

Since we actively discourage people from trying to use Blink w/o the rest of Chrome, and since we increasingly need to change things in both repos at once, it's clear that there is a big win to blink devs to merge the two repos. 

I don't believe this is true of the others you mention, all of which have a meaningful non-chrome-centric existence.

To be clear, systems like subversion allow separate and combined existence. We don't take advantage of it very much, but the capability is there. (Chromium developers tend to think of the repo as being rooted at src/, which is incorrect.)

The great advantage of "one big repo" is that it allows for atomic changes, and can reduce the amount of gardening needed. (You can still have gardening schemes if you want[*], but there's the capability to use everything at ToT if you want.[**])

For example, take Chrome OS. As I understand it, they currently use parts of Chromium (e.g., base/), and semi-regularly have to garden bits of Chromium in. Now, whenever someone does major refactoring in Chromium, they potentially inflict great pain on Chrome OS. Currently, a great deal of Chrome OS-specific UI code lives in Chromium. This is largely due to an architectural problem. But if we solve that problem, then it'd be natural to kick such code closer to Chrome OS. But this is basically not doable in any meaningful sense, at least without inflicting great pain on everyone, with the many-repos scheme.

It seems to me that moving to a revision control system that forces you to have lots of repos (and thus forcing gardening and/or code duplication) is a step in the wrong direction.

[*] Subversion doesn't force you to check out all of your repo at the same revision. Things like Perforce clients offer even more mapping flexibility/pain.
[**] E.g., probably you could put NaCl into the Chromium subversion repo somewhere ("beside" Chromium's src/; probably "beside" Chromium's trunk/ even), and no one would be the wiser (other than some svn paths needing to be updated, and revision numbers changing). But then if you wanted to, you could try to use NaCl at ToT.[***]
[***] Of course, our build/test infrastructure would need significant changes to make things like this practical, but that's a different problem. My concern is that moving to git might make this (a "google3"-like model) impossible.

On the flip side, moving off SVN will enable significant improvements on the workflow side, including affording the ability to better tool the management of multiple repos. This is not to mention that the majority of devs are already using git and are suffering daily with its second-class-citizen status. In addition, there are much better stories around security and auditability of the data-at-rest with git than with SVN (even in the context of the proposed history rewrites), and we can provide significantly better QoS with git than we can with SVN.

One of the problems with the one-big-repo world is it can very easily lead to tight coupling of presumably independent code bases (our one-big-repo is currently not immune to this). Either they should be independent code bases (in which case atomic commits are nice but not strictly necessary: i.e. automated multi-repo commits could be enabled by better infrastructure tooling), or they really are coupled, in which case they should probably be merged together (such as blink).

R

Robert Iannucci

unread,
Apr 16, 2014, 1:14:18 AM4/16/14
to Viet-Trung Luu, Dirk Pranke, Aaron Gable, Ilya Sherman, Chase Phillips, Chromium-dev, blink-dev, Ben Henry, Ryan Tseng, Michael Moss, Stefan Zager
Wow, today is not my day in the whole replying-with-the-correct-address department.


On Tue, Apr 15, 2014 at 9:47 PM, Viet-Trung Luu <v...@google.com> wrote:
On Tue, Apr 15, 2014 at 9:10 PM, Dirk Pranke <dpr...@chromium.org> wrote:
On Tue, Apr 15, 2014 at 5:26 PM, Viet-Trung Luu <viettr...@chromium.org> wrote:
On Tue, Apr 15, 2014 at 4:52 PM, Aaron Gable <aga...@chromium.org> wrote:

On Tue, Apr 15, 2014 at 4:00 PM, Viet-Trung Luu <viettr...@chromium.org> wrote:
What good is a hash if we think that we'll rewrite history (thereby quite possibly invalidating lots of/most hashes)?

Each time we rewrite history, will we have to keep a copy of the git repo before the rewrite, so that we can do proper archaeology?

It's definitely not in the plan to do lots of (or any) rewriting of the history of the repo. There has been discussion of history rewriting for the sake of the Chrome+Blink merge, but that would mostly be rewriting of Blink, not Chrome.

What's the plan for preventing, e.g., people from accidentally (or unaccidentally) checking in large things? (Or, not necessarily even large things, but things that shouldn't -- where "shouldn't" may only be clear in hindsight -- be committed?)

I don't think we have a plan for this at the moment, at least not beyond the things we normally do (like hope it gets caught in review). 

My concern is that mistakes happen, and that the pain of git repo growth is inflicted on everyone in perpetuity, unless we plan on rewriting history (which really should be viewed as an awful thing -- it's antithetical to the basic purpose of revision control, and defeats any benefits that using hashes may have).
 

It's also not clear to me that this is a problem that needs to be urgently solved. It's true that Git is worse at handling large binary objects in a large repo than SVN is, and it's true that we have checked large things in occasionally in the past, but I don't remember anything causing much more than a bit of temporary pain. 

It's a concern because we apparently want to move to git as our source of truth, despite a significant number of drawbacks.
 

Do you have specific things you're afraid of that happen regularly, or is this more of a general concern?

(I do believe we need to evaluate repo size seriously in the context of the Blink merger, but that is really a separate project from flipping Chrome to Git and we shouldn't conflate the two).

In general, I'm concerned about the scaling properties of git. This is not to say that git is bad, but that it obviously wasn't designed for our use case.


So keep in mind, SVN is really not stellar at keeping binaries in source control either. It's easier from the client perspective, but the server side is, at best, not great.

FTR, I think it would be absolutely worthwhile to reject pushes containing binaries once we have a transparent or semi-transparent way to push binaries into google storage (the {upload,download}_from_google_storage scripts are half of a solution). We could even have a commit-queue system do this sort of thing automatically.
 
Is Blink the only repo we're thinking about bringing in? What about other repos? (Skia, the Chrome OS repos, NaCl, Android, whatever -- thinking long-term.) What's our plan to scale our repo?
As far as I know, Blink is the only repo we're thinking about. 

Since we actively discourage people from trying to use Blink w/o the rest of Chrome, and since we increasingly need to change things in both repos at once, it's clear that there is a big win to blink devs to merge the two repos. 

I don't believe this is true of the others you mention, all of which have a meaningful non-chrome-centric existence.
To be clear, systems like subversion allow separate and combined existence. We don't take advantage of it very much, but the capability is there. (Chromium developers tend to think of the repo as being rooted at src/, which is incorrect.)

The great advantage of "one big repo" is that it allows for atomic changes, and can reduce the amount of gardening needed. (You can still have gardening schemes if you want[*], but there's the capability to use everything at ToT if you want.[**])

For example, take Chrome OS. As I understand it, they currently use parts of Chromium (e.g., base/), and semi-regularly have to garden bits of Chromium in. Now, whenever someone does major refactoring in Chromium, they potentially inflict great pain on Chrome OS. Currently, a great deal of Chrome OS-specific UI code lives in Chromium. This is largely due to an architectural problem. But if we solve that problem, then it'd be natural to kick such code closer to Chrome OS. But this is basically not doable in any meaningful sense, at least without inflicting great pain on everyone, with the many-repos scheme.

It seems to me that moving to a revision control system that forces you to have lots of repos (and thus forcing gardening and/or code duplication) is a step in the wrong direction.

[*] Subversion doesn't force you to check out all of your repo at the same revision. Things like Perforce clients offer even more mapping flexibility/pain.
[**] E.g., probably you could put NaCl into the Chromium subversion repo somewhere ("beside" Chromium's src/; probably "beside" Chromium's trunk/ even), and no one would be the wiser (other than some svn paths needing to be updated, and revision numbers changing). But then if you wanted to, you could try to use NaCl at ToT.[***]
[***] Of course, our build/test infrastructure would need significant changes to make things like this practical, but that's a different problem. My concern is that moving to git might make this (a "google3"-like model) impossible.

On the flip side, moving off SVN will enable significant improvements on the workflow side, including affording the ability to better tool the management of multiple repos. This is not to mention that the majority of devs are already using git and are suffering daily with its second-class-citizen status. In addition, there are much better stories around security and auditability of the data-at-rest with git than with SVN (even in the context of the proposed history rewrites), and we can provide significantly better QoS with git than we can with SVN.

One of the problems with the one-big-repo world is it can very easily lead to tight coupling of presumably independent code bases (our one-big-repo is currently not immune to this). Either they should be independent code bases (in which case atomic commits are nice but not strictly necessary: i.e. automated multi-repo commits could be enabled by better infrastructure tooling), or they really are coupled, in which case they should probably be merged together (such as blink).

R
 


-- Dirk

Viet-Trung Luu

unread,
Apr 16, 2014, 1:44:14 AM4/16/14
to Robert Iannucci, Dirk Pranke, Aaron Gable, Ilya Sherman, Chase Phillips, Chromium-dev, blink-dev, Ben Henry, Ryan Tseng, Michael Moss, Stefan Zager
On Tue, Apr 15, 2014 at 10:11 PM, Robert Iannucci <iann...@google.com> wrote:
On Tue, Apr 15, 2014 at 9:47 PM, Viet-Trung Luu <v...@google.com> wrote:
On Tue, Apr 15, 2014 at 9:10 PM, Dirk Pranke <dpr...@chromium.org> wrote:
On Tue, Apr 15, 2014 at 5:26 PM, Viet-Trung Luu <viettr...@chromium.org> wrote:
On Tue, Apr 15, 2014 at 4:52 PM, Aaron Gable <aga...@chromium.org> wrote:

On Tue, Apr 15, 2014 at 4:00 PM, Viet-Trung Luu <viettr...@chromium.org> wrote:
What good is a hash if we think that we'll rewrite history (thereby quite possibly invalidating lots of/most hashes)?

Each time we rewrite history, will we have to keep a copy of the git repo before the rewrite, so that we can do proper archaeology?

It's definitely not in the plan to do lots of (or any) rewriting of the history of the repo. There has been discussion of history rewriting for the sake of the Chrome+Blink merge, but that would mostly be rewriting of Blink, not Chrome.

What's the plan for preventing, e.g., people from accidentally (or unaccidentally) checking in large things? (Or, not necessarily even large things, but things that shouldn't -- where "shouldn't" may only be clear in hindsight -- be committed?)

I don't think we have a plan for this at the moment, at least not beyond the things we normally do (like hope it gets caught in review). 

My concern is that mistakes happen, and that the pain of git repo growth is inflicted on everyone in perpetuity, unless we plan on rewriting history (which really should be viewed as an awful thing -- it's antithetical to the basic purpose of revision control, and defeats any benefits that using hashes may have).
 

It's also not clear to me that this is a problem that needs to be urgently solved. It's true that Git is worse at handling large binary objects in a large repo than SVN is, and it's true that we have checked large things in occasionally in the past, but I don't remember anything causing much more than a bit of temporary pain. 

It's a concern because we apparently want to move to git as our source of truth, despite a significant number of drawbacks.
 

Do you have specific things you're afraid of that happen regularly, or is this more of a general concern?

(I do believe we need to evaluate repo size seriously in the context of the Blink merger, but that is really a separate project from flipping Chrome to Git and we shouldn't conflate the two).

In general, I'm concerned about the scaling properties of git. This is not to say that git is bad, but that it obviously wasn't designed for our use case.

So keep in mind, SVN is really not stellar at keeping binaries in source control either. It's easier from the client perspective, but the server side is, at best, not great.

The essential difference here is that one problem (git's) is architectural, whereas the other (svn's) is implementation.

The problem with binary files is that they usually change quite radically (I don't know if git even does binary diffs, but even if it did it'd be of limited help), so changes to binary files add up to large repo growth.

A large repo isn't necessarily a problem, within certain limits (e.g., why should dealing with binary files of size on the order of megabytes, or even tens of megabytes be a problem?). Making a large repo work quickly/efficiently is largely a problem of implementation.

The problem git has with this is that the repo size is inflicted on every single clone. This is a fundamental issue.
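
As a back-of-the-envelope illustration of that growth, here is a worst-case estimate that assumes no revision of a binary file compresses against any other. Real repositories may do better, and the numbers are invented for the example.

```python
def worst_case_growth_mb(file_size_mb, revisions):
    """Upper bound on history growth when every revision is stored in full."""
    return file_size_mb * revisions


# A hypothetical 10 MB binary that is replaced 50 times over its lifetime.
print(worst_case_growth_mb(10, 50))
```

Under that assumption every full clone carries all 500 MB, even though a checkout only ever needs the latest 10 MB.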
 

FTR, I think it would be absolutely worthwhile to reject pushes containing binaries once we have a transparent or semi-transparent way to push binaries into google storage (the {upload,download}_from_google_storage scripts are half of a solution). We could even have a commit-queue system do this sort of thing automatically.

It seems to me that piling on more scripts only increases fragility, and indicates that git is really poorly suited for what we're doing. This applies to the proposed handling for large (binary) files just as much as it does to other things, such as revision numbering.

 
 
 

Is Blink the only repo we're thinking about bringing in? What about other repos? (Skia, the Chrome OS repos, NaCl, Android, whatever -- thinking long-term.) What's our plan to scale our repo?

As far as I know, Blink is the only repo we're thinking about. 

Since we actively discourage people from trying to use Blink w/o the rest of Chrome, and since we increasingly need to change things in both repos at once, it's clear that there is a big win to blink devs to merge the two repos. 

I don't believe this is true of the others you mention, all of which have a meaningful non-chrome-centric existence.

To be clear, systems like subversion allow separate and combined existence. We don't take advantage of it very much, but the capability is there. (Chromium developers tend to think of the repo as being rooted at src/, which is incorrect.)

The great advantage of "one big repo" is that it allows for atomic changes, and can reduce the amount of gardening needed. (You can still have gardening schemes if you want[*], but there's the capability to use everything at ToT if you want.[**])

For example, take Chrome OS. As I understand it, they currently use parts of Chromium (e.g., base/), and semi-regularly have to garden bits of Chromium in. Now, whenever someone does major refactoring in Chromium, they potentially inflict great pain on Chrome OS. Currently, a great deal of Chrome OS-specific UI code lives in Chromium. This is largely due to an architectural problem. But if we solve that problem, then it'd be natural to kick such code closer to Chrome OS. But with the many-repos scheme, this is basically not doable in any meaningful sense, at least without inflicting great pain on everyone.

It seems to me that moving to a revision control system that forces you to have lots of repos (and thus forcing gardening and/or code duplication) is a step in the wrong direction.

[*] Subversion doesn't force you to check out all of your repo at the same revision. Things like Perforce clients offer even more mapping flexibility/pain.
[**] E.g., probably you could put NaCl into the Chromium subversion repo somewhere ("beside" Chromium's src/; probably "beside" Chromium's trunk/ even), and no one would be the wiser (other than some svn paths needing to be updated, and revision numbers changing). But then if you wanted to, you could try to use NaCl at ToT.[***]
[***] Of course, our build/test infrastructure would need significant changes to make things like this practical, but that's a different problem. My concern is that moving to git might make this (a "google3"-like model) impossible.

On the flip side, moving off SVN will enable significant improvements on the workflow side, including affording the ability to better tool the management of multiple repos. This is not to mention that the majority of devs are already using git and are suffering daily with its second-class-citizen status. In addition, there are much better stories around security and auditability of the data-at-rest with git than with SVN (even in the context of the proposed history rewrites), and we can provide significantly better QoS with git than we can with SVN.

To be honest, given the amount of effort we've poured into trying to make git work, I don't think it's fair to compare versus subversion.

And to be honest, I found the old git workflow -- the one that evmar originally set up -- to be mostly superior in its usability/reliability to what we have now. I'm not entirely sure where you're getting this "suffering daily with its second-class-citizen status" bit.


One of the problems with the one-big-repo world is it can very easily lead to tight coupling of presumably independent code bases (our one-big-repo is currently not immune to this). Either they should be independent code bases (in which case atomic commits are nice but not strictly necessary: i.e. automated multi-repo commits could be enabled by better infrastructure tooling), or they really are coupled, in which case they should probably be merged together (such as blink).

As I said, in the case of Chrome OS, they're already pulling things from Chromium. (Because who really wants to duplicate base/? net/? And in the future, probably more.)

And so if you're suggesting that they (or at least parts of them) should be merged into the Chromium repo, it's important to ask if our repo could actually accommodate them?

Here, there are actually two questions: Can git handle such a repo size at all with reasonable performance? And how painful would it be for developers? (Keep in mind that the ability of subversion (and similar systems) to do partial checkouts makes it easy to limit/eliminate the pain.)

Matt Giuca

Apr 16, 2014, 1:48:43 AM4/16/14
to Viet-Trung Luu, Robert Iannucci, Dirk Pranke, Aaron Gable, Ilya Sherman, Chase Phillips, Chromium-dev, blink-dev, Ben Henry, Ryan Tseng, Michael Moss, Stefan Zager
On 16 April 2014 15:44, Viet-Trung Luu <viettr...@chromium.org> wrote:
On Tue, Apr 15, 2014 at 10:11 PM, Robert Iannucci <iann...@google.com> wrote:
On Tue, Apr 15, 2014 at 9:47 PM, Viet-Trung Luu <v...@google.com> wrote:
On Tue, Apr 15, 2014 at 9:10 PM, Dirk Pranke <dpr...@chromium.org> wrote:
On Tue, Apr 15, 2014 at 5:26 PM, Viet-Trung Luu <viettr...@chromium.org> wrote:
On Tue, Apr 15, 2014 at 4:52 PM, Aaron Gable <aga...@chromium.org> wrote:

On Tue, Apr 15, 2014 at 4:00 PM, Viet-Trung Luu <viettr...@chromium.org> wrote:
What good is a hash if we think that we'll rewrite history (thereby quite possibly invalidating lots of/most hashes)?

Each time we rewrite history, will we have to keep a copy of the git repo before the rewrite, so that we can do proper archaeology?

It's definitely not in the plan to do lots of (or any) rewriting of the history of the repo. There has been discussion of history rewriting for the sake of the Chrome+Blink merge, but that would mostly be rewriting of Blink, not Chrome.

What's the plan for preventing, e.g., people from accidentally (or unaccidentally) checking in large things? (Or, not necessarily even large things, but things that shouldn't -- where "shouldn't" may only be clear in hindsight -- be committed?)

I don't think we have a plan for this at the moment, at least not beyond the things we normally do (like hope it gets caught in review). 

My concern is that mistakes happen, and that the pain of git repo growth is inflicted on everyone in perpetuity, unless we plan on rewriting history (which really should be viewed as an awful thing -- it's antithetical to the basic purpose of revision control, and defeats any benefits that using hashes may have).
 

It's also not clear to me that this is a problem that needs to be urgently solved. It's true that Git is worse at handling large binary objects in a large repo than SVN is, and it's true that we have checked large things in occasionally in the past, but I don't remember anything causing much more than a bit of temporary pain. 

It's a concern because we apparently want to move to git as our source of truth, despite a significant number of drawbacks.
 

Do you have specific things you're afraid of that happen regularly, or is this more of a general concern?

(I do believe we need to evaluate repo size seriously in the context of the Blink merger, but that is really a separate project from flipping Chrome to Git and we shouldn't conflate the two).

In general, I'm concerned about the scaling properties of git. This is not to say that git is bad, but that it obviously wasn't designed for our use case.

So keep in mind, SVN is really not stellar at keeping binaries in source control either. It's easier from the client perspective, but the server side is, at best, not great.

The essential difference here is that one problem (git's) is architectural, whereas the other (svn's) is implementation.

The problem with binary files is that they usually change quite radically (I don't know if git even does binary diffs, but even if it did it'd be of limited help), so changes to binary files accumulate to large repo growth.

A large repo isn't necessarily a problem, within certain limits (e.g., why should dealing with binary files of size on the order of megabytes, or even tens of megabytes be a problem?). Making a large repo work quickly/efficiently is largely a problem of implementation.

The problem git has with this is that the repo size is inflicted on every single clone. This is a fundamental issue.

We're already dealing with that issue when we use git as a svn client (which as far as I can tell, nearly all Chrome engineers do). Right now I have on my computer a copy of every version of every binary file that has ever been checked into Chromium, and it's huuuuge (and horrible to check out) but it isn't a serious day-to-day issue.
 
 

FTR, I think it would be absolutely worthwhile to reject pushes containing binaries once we have a transparent or semi-transparent way to push binaries into google storage (the {upload,download}_from_google_storage scripts are half of a solution). We could even have a commit-queue system do this sort of thing automatically.

It seems to me that piling on more scripts only increases fragility, and indicates that git is really poorly suited for what we're doing. This applies to the proposed handling for large (binary) files just as much as it does to other things, such as revision numbering.

 
 
 

Is Blink the only repo we're thinking about bringing in? What about other repos? (Skia, the Chrome OS repos, NaCl, Android, whatever -- thinking long-term.) What's our plan to scale our repo?

As far as I know, Blink is the only repo we're thinking about. 

Since we actively discourage people from trying to use Blink w/o the rest of Chrome, and since we increasingly need to change things in both repos at once, it's clear that there is a big win to blink devs to merge the two repos. 

I don't believe this is true of the others you mention, all of which have a meaningful non-chrome-centric existence.

To be clear, systems like subversion allow separate and combined existence. We don't take advantage of it very much, but the capability is there. (Chromium developers tend to think of the repo as being rooted at src/, which is incorrect.)

The great advantage of "one big repo" is that it allows for atomic changes, and can reduce the amount of gardening needed. (You can still have gardening schemes if you want[*], but there's the capability to use everything at ToT if you want.[**])

For example, take Chrome OS. As I understand it, they currently use parts of Chromium (e.g., base/), and semi-regularly have to garden bits of Chromium in. Now, whenever someone does major refactoring in Chromium, they potentially inflict great pain on Chrome OS. Currently, a great deal of Chrome OS-specific UI code lives in Chromium. This is largely due to an architectural problem. But if we solve that problem, then it'd be natural to kick such code closer to Chrome OS. But with the many-repos scheme, this is basically not doable in any meaningful sense, at least without inflicting great pain on everyone.

It seems to me that moving to a revision control system that forces you to have lots of repos (and thus forcing gardening and/or code duplication) is a step in the wrong direction.

[*] Subversion doesn't force you to check out all of your repo at the same revision. Things like Perforce clients offer even more mapping flexibility/pain.
[**] E.g., probably you could put NaCl into the Chromium subversion repo somewhere ("beside" Chromium's src/; probably "beside" Chromium's trunk/ even), and no one would be the wiser (other than some svn paths needing to be updated, and revision numbers changing). But then if you wanted to, you could try to use NaCl at ToT.[***]
[***] Of course, our build/test infrastructure would need significant changes to make things like this practical, but that's a different problem. My concern is that moving to git might make this (a "google3"-like model) impossible.

On the flip side, moving off SVN will enable significant improvements on the workflow side, including affording the ability to better tool the management of multiple repos. This is not to mention that the majority of devs are already using git and are suffering daily with its second-class-citizen status. In addition, there are much better stories around security and auditability of the data-at-rest with git than with SVN (even in the context of the proposed history rewrites), and we can provide significantly better QoS with git than we can with SVN.

To be honest, given the amount of effort we've poured into trying to make git work, I don't think it's fair to compare versus subversion.

And to be honest, I found the old git workflow -- the one that evmar originally set up -- to be mostly superior in its usability/reliability to what we have now. I'm not entirely sure where you're getting this "suffering daily with its second-class-citizen status" bit.


One of the problems with the one-big-repo world is it can very easily lead to tight coupling of presumably independent code bases (our one-big-repo is currently not immune to this). Either they should be independent code bases (in which case atomic commits are nice but not strictly necessary: i.e. automated multi-repo commits could be enabled by better infrastructure tooling), or they really are coupled, in which case they should probably be merged together (such as blink).

As I said, in the case of Chrome OS, they're already pulling things from Chromium. (Because who really wants to duplicate base/? net/? And in the future, probably more.)

And so if you're suggesting that they (or at least parts of them) should be merged into the Chromium repo, it's important to ask if our repo could actually accommodate them?

Here, there are actually two questions: Can git handle such a repo size at all with reasonable performance? And how painful would it be for developers? (Keep in mind that the ability of subversion (and similar systems) to do partial checkouts makes it easy to limit/eliminate the pain.)


R
 


-- Dirk




Robert Iannucci

Apr 16, 2014, 2:47:23 AM4/16/14
to Viet-Trung Luu, Dirk Pranke, Aaron Gable, Ilya Sherman, Chase Phillips, Chromium-dev, blink-dev, Ben Henry, Ryan Tseng, Michael Moss, Stefan Zager
On Tue, Apr 15, 2014 at 10:44 PM, Viet-Trung Luu <viettr...@chromium.org> wrote:
On Tue, Apr 15, 2014 at 10:11 PM, Robert Iannucci <iann...@google.com> wrote:
On Tue, Apr 15, 2014 at 9:47 PM, Viet-Trung Luu <v...@google.com> wrote:
On Tue, Apr 15, 2014 at 9:10 PM, Dirk Pranke <dpr...@chromium.org> wrote:
On Tue, Apr 15, 2014 at 5:26 PM, Viet-Trung Luu <viettr...@chromium.org> wrote:
On Tue, Apr 15, 2014 at 4:52 PM, Aaron Gable <aga...@chromium.org> wrote:

On Tue, Apr 15, 2014 at 4:00 PM, Viet-Trung Luu <viettr...@chromium.org> wrote:
What good is a hash if we think that we'll rewrite history (thereby quite possibly invalidating lots of/most hashes)?

Each time we rewrite history, will we have to keep a copy of the git repo before the rewrite, so that we can do proper archaeology?

It's definitely not in the plan to do lots of (or any) rewriting of the history of the repo. There has been discussion of history rewriting for the sake of the Chrome+Blink merge, but that would mostly be rewriting of Blink, not Chrome.

What's the plan for preventing, e.g., people from accidentally (or unaccidentally) checking in large things? (Or, not necessarily even large things, but things that shouldn't -- where "shouldn't" may only be clear in hindsight -- be committed?)

I don't think we have a plan for this at the moment, at least not beyond the things we normally do (like hope it gets caught in review). 

My concern is that mistakes happen, and that the pain of git repo growth is inflicted on everyone in perpetuity, unless we plan on rewriting history (which really should be viewed as an awful thing -- it's antithetical to the basic purpose of revision control, and defeats any benefits that using hashes may have).
 

It's also not clear to me that this is a problem that needs to be urgently solved. It's true that Git is worse at handling large binary objects in a large repo than SVN is, and it's true that we have checked large things in occasionally in the past, but I don't remember anything causing much more than a bit of temporary pain. 

It's a concern because we apparently want to move to git as our source of truth, despite a significant number of drawbacks.
 

Do you have specific things you're afraid of that happen regularly, or is this more of a general concern?

(I do believe we need to evaluate repo size seriously in the context of the Blink merger, but that is really a separate project from flipping Chrome to Git and we shouldn't conflate the two).

In general, I'm concerned about the scaling properties of git. This is not to say that git is bad, but that it obviously wasn't designed for our use case.

So keep in mind, SVN is really not stellar at keeping binaries in source control either. It's easier from the client perspective, but the server side is, at best, not great.

The essential difference here is that one problem (git's) is architectural, whereas the other (svn's) is implementation.

Er... How is this an architectural problem instead of an implementation problem? Tools like git-annex, and git-bigfiles exist, which suggests that it is just an implementation problem. Unlike SVN however, there's a lot of development effort being put into making the tools better, because the architecture of git allows for that. 
 

The problem with binary files is that they usually change quite radically (I don't know if git even does binary diffs, but even if it did it'd be of limited help), so changes to binary files accumulate to large repo growth.


Git does use libXDiff to do deltification of packfiles. It's why the size of a git checkout of chromium is only 2x the size of the svn checkout of chromium but includes the entire history. The current size of an SVN checkout (no build artifacts) is 1.6GB. The current size of a git clone of chromium is 3.2GB.
 
A large repo isn't necessarily a problem, within certain limits (e.g., why should dealing with binary files of size on the order of megabytes, or even tens of megabytes be a problem?). Making a large repo work quickly/efficiently is largely a problem of implementation.

The problem git has with this is that the repo size is inflicted on every single clone. This is a fundamental issue.

Not so. git supports shallow clones if you don't care about the history of the repo. A 1-commit shallow clone of chromium/src is 1.1GB (yep, smaller than SVN).
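
As an illustration of the shallow clone mentioned above, here is a minimal sketch that only builds the command line for a 1-commit clone. The helper name `shallow_clone_command` and the destination directory are made up for this example; `git clone --depth` is the standard git option, and the URL is the one given in the flag-day announcement.

```python
from typing import List

# URL from the flag-day announcement at the top of this thread.
CHROMIUM_SRC = "https://chromium.googlesource.com/chromium/src.git"


def shallow_clone_command(url: str, dest: str, depth: int = 1) -> List[str]:
    """Build the argv for a shallow clone (hypothetical helper).

    depth=1 fetches only the most recent commit, so the full history
    is not downloaded.
    """
    if depth < 1:
        raise ValueError("depth must be at least 1")
    return ["git", "clone", "--depth", str(depth), url, dest]


if __name__ == "__main__":
    # Print the command rather than running it; the clone itself is large.
    print(" ".join(shallow_clone_command(CHROMIUM_SRC, "src")))
```

Passing the resulting list to `subprocess.run(..., check=True)` would perform the clone. A shallow checkout gives up local history operations such as blame and bisect past the fetched depth, so it only suits people who really don't care about history.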
 
 

FTR, I think it would be absolutely worthwhile to reject pushes containing binaries once we have a transparent or semi-transparent way to push binaries into google storage (the {upload,download}_from_google_storage scripts are half of a solution). We could even have a commit-queue system do this sort of thing automatically.

It seems to me that piling on more scripts only increases fragility, and indicates that git is really poorly suited for what we're doing. This applies to the proposed handling for large (binary) files just as much as it does to other things, such as revision numbering.

I think this is a bit misleading because there are many flows which are enabled by git that are not enabled by svn.

Not surprisingly, chromium is a large enough project to have unique requirements. It's unlikely that there will be an out-of-the-box solution that gets everyone everything, but we're trying our best to accommodate.

As I mentioned above, there's a substantial amount of work (internally and externally) being put into improving the implementation of git with binary files. The desire for revision numbers stems from the enormity of the chromium project and the need for centralized coordination in that project.
 

 
 
 

Is Blink the only repo we're thinking about bringing in? What about other repos? (Skia, the Chrome OS repos, NaCl, Android, whatever -- thinking long-term.) What's our plan to scale our repo?

As far as I know, Blink is the only repo we're thinking about. 

Since we actively discourage people from trying to use Blink w/o the rest of Chrome, and since we increasingly need to change things in both repos at once, it's clear that there is a big win to blink devs to merge the two repos. 

I don't believe this is true of the others you mention, all of which have a meaningful non-chrome-centric existence.

To be clear, systems like subversion allow separate and combined existence. We don't take advantage of it very much, but the capability is there. (Chromium developers tend to think of the repo as being rooted at src/, which is incorrect.)

The great advantage of "one big repo" is that it allows for atomic changes, and can reduce the amount of gardening needed. (You can still have gardening schemes if you want[*], but there's the capability to use everything at ToT if you want.[**])

For example, take Chrome OS. As I understand it, they currently use parts of Chromium (e.g., base/), and semi-regularly have to garden bits of Chromium in. Now, whenever someone does major refactoring in Chromium, they potentially inflict great pain on Chrome OS. Currently, a great deal of Chrome OS-specific UI code lives in Chromium. This is largely due to an architectural problem. But if we solve that problem, then it'd be natural to kick such code closer to Chrome OS. But with the many-repos scheme, this is basically not doable in any meaningful sense, at least without inflicting great pain on everyone.

It seems to me that moving to a revision control system that forces you to have lots of repos (and thus forcing gardening and/or code duplication) is a step in the wrong direction.

[*] Subversion doesn't force you to check out all of your repo at the same revision. Things like Perforce clients offer even more mapping flexibility/pain.
[**] E.g., probably you could put NaCl into the Chromium subversion repo somewhere ("beside" Chromium's src/; probably "beside" Chromium's trunk/ even), and no one would be the wiser (other than some svn paths needing to be updated, and revision numbers changing). But then if you wanted to, you could try to use NaCl at ToT.[***]
[***] Of course, our build/test infrastructure would need significant changes to make things like this practical, but that's a different problem. My concern is that moving to git might make this (a "google3"-like model) impossible.

On the flip side, moving off SVN will enable significant improvements on the workflow side, including affording the ability to better tool the management of multiple repos. This is not to mention that the majority of devs are already using git and are suffering daily with its second-class-citizen status. In addition, there are much better stories around security and auditability of the data-at-rest with git than with SVN (even in the context of the proposed history rewrites), and we can provide significantly better QoS with git than we can with SVN.

To be honest, given the amount of effort we've poured into trying to make git work, I don't think it's fair to compare versus subversion.

Er... there are a lot of features that we get with git that we can't have with subversion. I'm not going to list them. If we wanted to build all of those features for subversion (or even just a couple of them), then it would probably be even more work.

Unless I missed your meaning?
 

And to be honest, I found the old git workflow -- the one that evmar originally set up -- to be mostly superior in its usability/reliability to what we have now. I'm not entirely sure where you're getting this "suffering daily with its second-class-citizen status" bit.

A lot of the complexity with the current flow stems from the current tooling. A lot of the complexity in the current tooling (looking at you, gclient :) stems from its need to support multiple SCMs. This is one of the things we would love to dig into and clean up, but it's nigh impossible when we're in this amorphous git/svn middle transition state.

The current system which supports both git and svn has an incredible number of moving parts, many of which speak both git and svn. The ability to improve anything in this environment is staggeringly difficult.

Because all of these tools need to support both git and svn, they all must be written to serve the lowest common denominator of both systems, which means that currently we get all the pain of both, but few of the advantages of either one. Once we move to git, we'll be able to de-cruftify a lot of the tooling and begin to rely on the more advanced features of git. 

 
 


One of the problems with the one-big-repo world is it can very easily lead to tight coupling of presumably independent code bases (our one-big-repo is currently not immune to this). Either they should be independent code bases (in which case atomic commits are nice but not strictly necessary: i.e. automated multi-repo commits could be enabled by better infrastructure tooling), or they really are coupled, in which case they should probably be merged together (such as blink).

As I said, in the case of Chrome OS, they're already pulling things from Chromium. (Because who really wants to duplicate base/? net/? And in the future, probably more.)

>>**Given improved tooling**<<, it would make sense to me (if I actually mattered here), to move these to a separate repo and pin them in both, because they're being consumed independently. Improved tooling would mean that you could prepare a multi-repo patch for chromeos, chromium and e.g. base, get it reviewed together, try it all together on the trybots, and then have the patches land automatically in dependency order (base, chromium, chromeos) when you did the 'commit' action.

Improved tooling would also, in my mind, automatically post DEPS roll CLs in all downstream repos any time the upstream repo changed (or maybe batch these up to once a day) and run tryjobs for them, so an owner could just lgtm+commit them.
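
To make "land the patches in dependency order" concrete, here is a rough sketch using a topological sort. No such tool exists in this thread; `landing_order` and the dependency map are invented for illustration, using the base / chromium / chromeos example above.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+
from typing import Dict, List, Set


def landing_order(depends_on: Dict[str, Set[str]]) -> List[str]:
    """Return repos ordered so each one lands after everything it depends on.

    depends_on maps a repo name to the set of repos it consumes.
    graphlib raises CycleError if the dependencies form a cycle.
    """
    return list(TopologicalSorter(depends_on).static_order())


if __name__ == "__main__":
    # Hypothetical dependency map for the example in this message.
    deps = {
        "base": set(),
        "chromium": {"base"},
        "chromeos": {"chromium", "base"},
    }
    print(landing_order(deps))  # ['base', 'chromium', 'chromeos']
```

The same ordering, read in the other direction, is what the automatic DEPS rolls would follow: a change in an upstream repo triggers roll CLs in everything downstream of it.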
 

And so if you're suggesting that they (or at least parts of them) should be merged into the Chromium repo, it's important to ask if our repo could actually accommodate them? 

Here, there are actually two questions: Can git handle such a repo size at all with reasonable performance? And how painful would it be for developers? (Keep in mind that the ability of subversion (and similar systems) to do partial checkouts makes it easy to limit/eliminate the pain.)


Yes, git can handle very large repos of text data with possibly a few small binaries. If we start committing compiled build products, then not so much, but I don't think anyone's planning to do that.

I'm not sure how to answer 'how painful', but if you mean size-on-disk, then I think I've already demonstrated that this is already not-a-huge-problem, and can be mitigated by doing shallow clones (if you can't spare the extra 2GB for the full history).

R

Torne (Richard Coles)

Apr 16, 2014, 3:37:56 AM4/16/14
to Matt Giuca, inge3...@gmail.com, Yonggang Luo, Aaron Gable, Viet-Trung Luu, Ilya Sherman, Chase Phillips, Chromium-dev, blink-dev, Ben Henry, Ryan Tseng, Michael Moss, Stefan Zager, Robert Iannucci
On 16 April 2014 04:31, Matt Giuca <mgi...@chromium.org> wrote:
On 16 April 2014 08:08, Ilya Sherman <ishe...@chromium.org> wrote:

Can we pretty please keep a monotonically increasing number for official commits (possibly per-branch), rather than reverting to hashes?  It's very common for me to need to know: Is my change, rXXXXXX, included in the build corresponding to this bug report?  With hashes, I need to query a tool to answer this common question.  That's much less convenient, and slower, than just being able to compare two numbers.

+cb0f9bc

I mean +1.

We had a big discussion about revnos vs hashes already on the "Chromium, Git, and you" thread but I don't think we came to any conclusion. If those SVN revision numbers get replaced with hashes, life is going to be awful. I need to be able to tell at a glance whether one revision is newer than another. I need the git log, OmahaProxy page and chrome://version pages to all show me the revision numbers consistently and in a monotonically increasing order. (For just one quick example: all bugs reported from TOT should have this revision number in the bug report so we can easily compare to the version we have checked out.)

We discussed previously the idea of using "git number" (which is in depot-tools) which just tells you the height of the git DAG. This is sufficient for the trunk but could cause confusion when you're in a branch. Another idea is just to have a CQ script (like the one that currently adds "Review URL" to every commit message) that appends a special line to every commit message saying what its revision number is. We could then automatically scrape the commit message for this line and display it in chrome://version. The advantage of this approach is that we can continue the numbering from the SVN IDs.

The format would be very simple. All commits to master end in a line:
"chromium-revision-id: 264090"
[snip]

If we implement this kind of thing, then that would be fine, but I am still absolutely opposed to the idea of using any variant of "git-number" as the way of generating a number. git-number is not even fine for trunk if you're talking about local builds. I think we should delete git-number right now as I still consider it an attractive nuisance which will cause more problems than it solves :)
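
For reference, scraping the proposed footer line out of a commit message could look roughly like the sketch below. The footer key is the one quoted above; the function names and the sample messages are made up, and the comparison is only meaningful for two commits on the same linear branch.

```python
import re
from typing import Optional

# Footer format as proposed above: "chromium-revision-id: 264090".
_FOOTER = re.compile(r"^chromium-revision-id: (\d+)$", re.MULTILINE)


def revision_id(commit_message: str) -> Optional[int]:
    """Return the revision number from the footer, or None if it is absent.

    If the line appears more than once, the last occurrence wins, since
    the footer is appended at the end of the message.
    """
    matches = _FOOTER.findall(commit_message)
    return int(matches[-1]) if matches else None


def build_includes_change(build_message: str, change_message: str) -> bool:
    """True if the build's revision is at or after the change's revision.

    Assumes both commits sit on the same linear branch.
    """
    build_rev = revision_id(build_message)
    change_rev = revision_id(change_message)
    if build_rev is None or change_rev is None:
        raise ValueError("missing chromium-revision-id footer")
    return change_rev <= build_rev


if __name__ == "__main__":
    change = "Fix a crash\n\nchromium-revision-id: 264090\n"
    build = "Roll something\n\nchromium-revision-id: 264123\n"
    print(build_includes_change(build, change))  # True
```

Because the number travels inside the commit message, every clone sees the same value, which is the property a locally computed git-number does not have.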

Torne (Richard Coles)

Apr 16, 2014, 3:39:49 AM4/16/14
to Matt Giuca, Robert Iannucci, Inge Krammova Inge, Yonggang Luo, Aaron Gable, Viet-Trung Luu, Ilya Sherman, Chase Phillips, Chromium-dev, blink-dev, Ben Henry, Ryan Tseng, Michael Moss, Stefan Zager
On 16 April 2014 06:01, Matt Giuca <mgi...@chromium.org> wrote:


On 16 April 2014 14:32, Robert Iannucci <iann...@google.com> wrote:
Hey Matt,

So incidentally, we've been bouncing the same idea around infra today. I'm not sure about the release numbering (I think there would be some seriously deep implications there for a lot of other systems), but we could maybe do something like what you propose.

Great to hear this is still being discussed.

** Disclaimer: This is not a plan. **

If we supposed the existence of a commit queue which interdicted all pushes to the chromium repo (and I mean all (no, seriously everything. Even dcommit would go through this service)), then we could have that commit queue stamp all commits with a repo-wide monotonically increasing number. Note that this would change the commit hashes of all the commits (but `git cl upload` and the existing CQ already do this).

We could then have services (like bugdroid) mention the revision when pasting hashes in places, like deadbeef....-r12345. We could potentially have a chromium extension do this lookup and inject the numbers whenever it sees a hash... There would probably have to be some web service support for this to make it fast, however.

Note also that this would not remove the need for commit hashes. Tools and services, like gitiles and the git command line interface, would still not support these (the amount of work to do that would be large, and diverging our toolchain from standard gitiles / git / etc. would carry a very high cost). These numbers would be correct, but they would effectively be display only.

We could eventually also extend this mechanism to include other numbering schemes (like N-th commit on branch), as well.

That's fine (being display only). I am quite used to dealing with git commits (as I'm sure most Chrome engineers are). I just need the monotonic numbers for quick comparison purposes.
 

** End not-a-plan. **

My question to the ML is: if we had one-to-one hash-to-number (within a repo, monotonically increasing), and printed these numbers in addition to the hashes, what flows would this enable?

If they're just within a repo then almost none of these things are enabled, because non-global numbering is intrinsically error prone and misleading. You need the numbers to be determined by a *single* authoritative repo to be meaningful, which means they have to be recorded in the actual commit data/metadata somewhere in that repo and then downloaded into people's clones.

Robert Iannucci

Apr 16, 2014, 3:56:46 AM
to Torne (Richard Coles), Matt Giuca, Inge Krammova Inge, Yonggang Luo, Aaron Gable, Viet-Trung Luu, Ilya Sherman, Chase Phillips, Chromium-dev, blink-dev, Ben Henry, Ryan Tseng, Michael Moss, Stefan Zager
On Wed, Apr 16, 2014 at 12:39 AM, Torne (Richard Coles) <to...@chromium.org> wrote:



On 16 April 2014 06:01, Matt Giuca <mgi...@chromium.org> wrote:


On 16 April 2014 14:32, Robert Iannucci <iann...@google.com> wrote:
Hey Matt,

So incidentally, we've been bouncing the same idea around infra today. I'm not sure about the release numbering (I think there would be some seriously deep implications there for a lot of other systems), but we could maybe do something like what you propose.

Great to hear this is still being discussed.

** Disclaimer: This is not a plan. **

If we supposed the existence of a commit queue which interdicted all pushes to the chromium repo (and I mean all (no, seriously everything. Even dcommit would go through this service)), then we could have that commit queue stamp all commits with a repo-wide monotonically increasing number. Note that this would change the commit hashes of all the commits (but `git cl upload` and the existing CQ already do this).

We could then have services (like bugdroid) mention the revision when pasting hashes in places, like deadbeef....-r12345. We could potentially have a chromium extension do this lookup and inject the numbers whenever it sees a hash... There would probably have to be some web service support for this to make it fast, however.

Note also that this would not remove the need for commit hashes. Tools and services, like gitiles and the git command line interface, would still not support these (the amount of work to do that would be large, and diverging our toolchain from standard gitiles / git / etc. would carry a very high cost). These numbers would be correct, but they would effectively be display only.

We could eventually also extend this mechanism to include other numbering schemes (like N-th commit on branch), as well.

That's fine (being display only). I am quite used to dealing with git commits (as I'm sure most Chrome engineers are). I just need the monotonic numbers for quick comparison purposes.
 

** End not-a-plan. **

My question to the ML is: if we had one-to-one hash-to-number (within a repo, monotonically increasing), and printed these numbers in addition to the hashes, what flows would this enable?

If they're just within a repo then almost none of these things are enabled, because non-global numbering is intrinsically error prone and misleading. You need the numbers to be determined by a *single* authoritative repo to be meaningful, which means they have to be recorded in the actual commit data/metadata somewhere in that repo and then downloaded into people's clones.

Sorry my question was unclear here. The metadata would be recorded once in the commit message by the authoritative repo at push-time.

Robert Iannucci

Apr 16, 2014, 3:58:00 AM
to Torne (Richard Coles), Matt Giuca, Inge Krammova Inge, Yonggang Luo, Aaron Gable, Viet-Trung Luu, Ilya Sherman, Chase Phillips, Chromium-dev, blink-dev, Ben Henry, Ryan Tseng, Michael Moss, Stefan Zager
On Wed, Apr 16, 2014 at 12:37 AM, Torne (Richard Coles) <to...@chromium.org> wrote:



On 16 April 2014 04:31, Matt Giuca <mgi...@chromium.org> wrote:
On 16 April 2014 08:08, Ilya Sherman <ishe...@chromium.org> wrote:

Can we pretty please keep a monotonically increasing number for official commits (possibly per-branch), rather than reverting to hashes?  It's very common for me to need to know: Is my change, rXXXXXX, included in the build corresponding to this bug report?  With hashes, I need to query a tool to answer this common question.  That's much less convenient, and slower, than just being able to compare two numbers.

+cb0f9bc

I mean +1.

We had a big discussion about revnos vs hashes already on the "Chromium, Git, and you" thread but I don't think we came to any conclusion. If those SVN revision numbers get replaced with hashes, life is going to be awful. I need to be able to tell at a glance whether one revision is newer than another. I need the git log, OmahaProxy page and chrome://version pages to all show me the revision numbers consistently and in a monotonically increasing order. (For just one quick example: all bugs reported from TOT should have this revision number in the bug report so we can easily compare to the version we have checked out.)

We discussed previously the idea of using "git number" (which is in depot-tools) which just tells you the height of the git DAG. This is sufficient for the trunk but could cause confusion when you're in a branch. Another idea is just to have a CQ script (like the one that currently adds "Review URL" to every commit message) that appends a special line to every commit message saying what its revision number is. We could then automatically scrape the commit message for this line and display it in chrome://version. The advantage of this approach is that we can continue the numbering from the SVN IDs.

The format would be very simple. All commits to master end in a line:
"chromium-revision-id: 264090"
[snip]
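
A minimal sketch of the footer idea above, assuming a hypothetical push-time service. `RevisionStamper` and `revision_of` are illustrative names, not existing tools; only the footer name "chromium-revision-id" comes from the proposal. It shows a number being written once at landing time and scraped back out later, so that two commits can be compared as plain integers.

```python
import re

# Footer name taken from the proposal above; everything else is hypothetical.
FOOTER_KEY = "chromium-revision-id"
FOOTER_RE = re.compile(r"^" + re.escape(FOOTER_KEY) + r": (\d+)$", re.MULTILINE)


class RevisionStamper:
    """Hypothetical push-time service that numbers commits in landing order."""

    def __init__(self, last_revision):
        # Starting from the last SVN revision lets the numbering continue.
        self.last_revision = last_revision

    def stamp(self, message):
        # The footer is written exactly once; refuse to renumber a commit.
        if FOOTER_RE.search(message):
            raise ValueError("commit message is already stamped")
        self.last_revision += 1
        return "%s\n\n%s: %d" % (message.rstrip(), FOOTER_KEY, self.last_revision)


def revision_of(message):
    """Return the stamped number, or None for an unstamped commit."""
    match = FOOTER_RE.search(message)
    return int(match.group(1)) if match else None


stamper = RevisionStamper(last_revision=264089)
first = stamper.stamp("Fix a crash in the renderer.\n")
second = stamper.stamp("Follow-up cleanup.\n")
print(revision_of(first) < revision_of(second))
```

Because the number lives in the commit message, it would travel with every clone, which seems to be the property asked for elsewhere in this thread. Stamping changes the commit hash, so it would have to happen before the commit becomes public.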

If we implement this kind of thing, then that would be fine, but I am still absolutely opposed to the idea of using any variant of "git-number" as the way of generating a number. git-number is not even fine for trunk if you're talking about local builds. I think we should delete git-number right now as I still consider it an attractive nuisance which will cause more problems than it solves :)

agree, modulo that it's used by other tools to do exactly what it's designed for: sorting large numbers of commits.
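
To illustrate the objection to DAG-height numbering, here is a rough sketch, assuming only what the thread says about "git number" (that it reports the height of the commit DAG). The function name, the toy history, and the choice to number root commits as 1 are all assumptions; the real tool may differ.

```python
def generation_numbers(parents):
    """Map each commit to its height in the DAG (root commits get 1).

    `parents` maps a commit id to the list of its parent ids.
    Recursive for brevity; a real history would need an iterative walk.
    """
    heights = {}

    def height(commit):
        if commit not in heights:
            heights[commit] = 1 + max(
                (height(p) for p in parents[commit]), default=0)
        return heights[commit]

    for commit in parents:
        height(commit)
    return heights


# Toy history: a -> b -> c on trunk, plus a local commit x branching off b.
history = {"a": [], "b": ["a"], "c": ["b"], "x": ["b"]}
numbers = generation_numbers(history)
print(numbers["c"], numbers["x"])
```

In this toy history the trunk commit `c` and the local commit `x` get the same number, which is the kind of collision that makes the scheme misleading for local builds while still being usable as a sort key.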

Daniel Bratell

Apr 16, 2014, 8:55:25 AM
to Chromium-dev, blin...@chromium.org, Chase Phillips, Ben Henry, Aaron Gable, Ryan Tseng, Michael Moss, Stefan Zager, Robert Iannucci, Cezary Kulakowski, odinho, Eirik Byrkjeflot Anonsen, Jens Lindström
On Tue, 15 Apr 2014 22:07:41 +0200, Chase Phillips <c...@google.com> wrote:

>
>> Questions? If you have an issue that has not been addressed in any
>> available documentation, please let us know.
>

The currently documented[1] workflow for fetching release branches still
uses subversion.

How will release branches be managed and how will they be accessed? I'm a
bit scared that we at Opera will get stale engine code for a release
because of this. The release work on Chromium 35 based products will be at
full speed when the switch happens.

/Daniel

[1]:
http://www.chromium.org/developers/how-tos/get-the-code#Working_with_release_branches

Daniel Bratell

Apr 16, 2014, 9:02:28 AM
to Aaron Gable, Ilya Sherman, Chase Phillips, Chromium-dev, blink-dev, Ben Henry, Ryan Tseng, Michael Moss, Stefan Zager, Robert Iannucci
On Wed, 16 Apr 2014 00:08:34 +0200, Ilya Sherman <ishe...@chromium.org>
wrote:

> Can we pretty please keep a monotonically increasing number for official
> commits (possibly per-branch), rather than reverting to hashes? It's
> very common for me to need to know: Is my change, rXXXXXX, included in
> the build corresponding to this bug report? With hashes, I need to
> query a tool to answer this common question. That's much less
> convenient, and slower, than just being able to compare two numbers.

At Opera we switched to git in 2009 and this was one of the concerns
discussed at the time, and I was admittedly one of the people raising
concerns. In hindsight it turned out to be a much smaller problem than
expected. You change how you work, gain a few tools and lose a few tools,
but the end result is positive.

/Daniel
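
For reference, the "is my change included in this build?" question quoted above becomes an ancestry check once hashes replace numbers. The sketch below works on a toy parent map rather than a real repository; the function name and history are illustrative assumptions. On a real checkout, `git merge-base --is-ancestor <change> <build>` answers the same question through its exit status.

```python
def is_ancestor(parents, ancestor, descendant):
    """Return True if `ancestor` is reachable from `descendant` via parent links."""
    stack, seen = [descendant], set()
    while stack:
        commit = stack.pop()
        if commit == ancestor:
            return True
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(parents.get(commit, []))
    return False


# Toy history: a -> b -> c on trunk, plus a side commit x branching off b.
history = {"a": [], "b": ["a"], "c": ["b"], "x": ["b"]}
print(is_ancestor(history, "b", "c"))  # change b versus a build made at c
print(is_ancestor(history, "x", "c"))  # side commit x versus the same build
```

The check needs the commit graph at hand, which is the cost being discussed: two integers can be compared by eye, two hashes cannot.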

Jiang Jiang

Apr 16, 2014, 10:12:21 AM
to bratell at Opera, Chromium-dev, blink-dev, Chase Phillips, Ben Henry, Aaron Gable, Ryan Tseng, Michael Moss, Stefan Zager, Robert Iannucci, Cezary Kulakowski, odinho, Eirik Byrkjeflot Anonsen, Jens Lindström
On Wed, Apr 16, 2014 at 2:55 PM, Daniel Bratell <bra...@opera.com> wrote:
> The currently documented[1] workflow for fetching release branches still
> uses subversion.
>
> How will release branches be managed and how will they be accessed? I'm a
> bit scared that we at Opera will get stale engine code for a release because
> of this. The release work on Chromium 35 based products will be at full
> speed when the switch happens.
>
> /Daniel
>
> [1]:
> http://www.chromium.org/developers/how-tos/get-the-code#Working_with_release_branches

Fetching release branches from git was mentioned here:

https://groups.google.com/a/chromium.org/d/msg/chromium-dev/GZOeMUPE7Bc/_TAdn7dYleEJ

Of course we do want to get some official documentation on this before
the switch, just so we have some time in advance to prepare for that at
Opera.

- Jiang

Michael Moss

Apr 16, 2014, 11:21:21 AM
to Jiang Jiang, bratell at Opera, Chromium-dev, blink-dev, Chase Phillips, Ben Henry, Aaron Gable, Ryan Tseng, Stefan Zager, Robert Iannucci, Cezary Kulakowski, odinho, Eirik Byrkjeflot Anonsen, Jens Lindström
I'm not sure I understand the question here. Can you describe the workflow that you're concerned about? The migration has the same implications for branches as it does for trunk/master. Basically, the branches in git will be the same as they are now, but subsequent changes will happen directly in git, rather than being mirrored from svn. The git migration won't change any existing branches or how you access them, since that was always a normal "git checkout". The use of svn in the current branch workflow is to be able to make changes, just like when making changes to trunk/master (i.e. http://www.chromium.org/developers/contributing-code/direct-commit), and that usage will go away for branches just like it will for master.

Michael Moss

Apr 16, 2014, 11:22:30 AM
to Jiang Jiang, bratell at Opera, Chromium-dev, blink-dev, Chase Phillips, Ben Henry, Aaron Gable, Ryan Tseng, Stefan Zager, Robert Iannucci, Cezary Kulakowski, odinho, Eirik Byrkjeflot Anonsen, Jens Lindström

Viet-Trung Luu

Apr 16, 2014, 11:24:34 AM
to Robert Iannucci, Dirk Pranke, Aaron Gable, Ilya Sherman, Chase Phillips, Chromium-dev, blink-dev, Ben Henry, Ryan Tseng, Michael Moss, Stefan Zager
On Tue, Apr 15, 2014 at 11:47 PM, Robert Iannucci <iann...@google.com> wrote:
On Tue, Apr 15, 2014 at 10:44 PM, Viet-Trung Luu <viettr...@chromium.org> wrote:
On Tue, Apr 15, 2014 at 10:11 PM, Robert Iannucci <iann...@google.com> wrote:
On Tue, Apr 15, 2014 at 9:47 PM, Viet-Trung Luu <v...@google.com> wrote:
On Tue, Apr 15, 2014 at 9:10 PM, Dirk Pranke <dpr...@chromium.org> wrote:
On Tue, Apr 15, 2014 at 5:26 PM, Viet-Trung Luu <viettr...@chromium.org> wrote:
On Tue, Apr 15, 2014 at 4:52 PM, Aaron Gable <aga...@chromium.org> wrote:

On Tue, Apr 15, 2014 at 4:00 PM, Viet-Trung Luu <viettr...@chromium.org> wrote:
What good is a hash if we think that we'll rewrite history (thereby quite possibly invalidating lots of/most hashes)?

Each time we rewrite history, will we have to keep a copy of the git repo before the rewrite, so that we can do proper archaeology?

It's definitely not in the plan to do lots of (or any) rewriting of the history of the repo. There has been discussion of history rewriting for the sake of the Chrome+Blink merge, but that would mostly be rewriting of Blink, not Chrome.

What's the plan for preventing, e.g., people from accidentally (or unaccidentally) checking in large things? (Or, not necessarily even large things, but things that shouldn't -- where "shouldn't" may only be clear in hindsight -- be committed?)

I don't think we have a plan for this at the moment, at least not beyond the things we normally do (like hope it gets caught in review). 

My concern is that mistakes happen, and that the pain of git repo growth is inflicted on everyone in perpetuity, unless we plan on rewriting history (which really should be viewed as an awful thing -- it's antithetical to the basic purpose of revision control, and defeats any benefits that using hashes may have).
 

It's also not clear to me that this is a problem that needs to be urgently solved. It's true that Git is worse at handling large binary objects in a large repo than SVN is, and it's true that we have checked large things in occasionally in the past, but I don't remember anything causing much more than a bit of temporary pain. 

It's a concern because we apparently want to move to git as our source of truth, despite a significant number of drawbacks.
 

Do you have specific things you're afraid of that happen regularly, or is this more of a general concern?

(I do believe we need to evaluate repo size seriously in the context of the Blink merger, but that is really a separate project from flipping Chrome to Git and we shouldn't conflate the two).

In general, I'm concerned about the scaling properties of git. This is not to say that git is bad, but that it obviously wasn't designed for our use case.

So keep in mind, SVN is really not stellar at keeping binaries in source control either. It's easier from the client perspective, but the server side is, at best, not great.

The essential difference here is that one problem (git's) is architectural, whereas the other (svn's) is implementation.

Er... How is this an architectural problem instead of an implementation problem? Tools like git-annex, and git-bigfiles exist, which suggests that it is just an implementation problem. Unlike SVN however, there's a lot of development effort being put into making the tools better, because the architecture of git allows for that.

The traditional assumption of git is that every useful repo has full history. This was a conscious design choice, and was an entirely reasonable one for a truly distributed revision control system.

Now, obviously this choice can be revisited and weakened, but then you're in a situation where parts of git will work with that choice weakened and other parts won't.

Note that our use-case is anything but as a distributed revision system. As such, for our use-case the design of git has disadvantages. To not acknowledge this is simply to deny reality.

 

The problem with binary files is that they usually change quite radically (I don't know if git even does binary diffs, but even if it did it'd be of limited help), so changes to binary files accumulate to large repo growth.


Git does use libXDiff to do deltification of packfiles. It's why the size of a git checkout of chromium is only 2x the size of the svn checkout of chromium but includes the entire history. The current size of an SVN checkout (no build artifacts) is 1.6GB. The current size of a git clone of chromium is 3.2GB.
 
A large repo isn't necessarily a problem, within certain limits (e.g., why should dealing with binary files of size on the order of megabytes, or even tens of megabytes be a problem?). Making a large repo work quickly/efficiently is largely a problem of implementation.

The problem git has with this is that the repo size is inflicted on every single clone. This is a fundamental issue.

Not so. git supports shallow clones if you don't care about the history of the repo. A 1-commit shallow clone of chromium/src is 1.1GB (yep, smaller than SVN).

Well, let's see.

Useful-ish support for shallow clones was added in git 1.9, which was only released a few months ago. (In particular, before then, you couldn't push from a shallow clone.)

Realistically though, you do care about revision history. Now, I don't have much experience with shallow clones, but I'm guessing your only choice would then be to do "git pull --unshallow", which would then pull down the world.

Perhaps in the fullness of time, the various parts of git will be made to automatically fetch what's needed for the given operation, i.e., operate like a centralized revision control system would.

Perhaps in the fullness of time, we'll also get partial (or "narrow") checkouts.

But I'm not holding my breath (esp. for reasonably complete support), since these all basically run counter to the basic design choices made for git.
 
 
 

FTR, I think it would be absolutely worthwhile to reject pushes containing binaries once we have a transparent or semi-transparent way to push binaries into google storage (the {upload,download}_from_google_storage scripts are half of a solution). We could even have a commit-queue system do this sort of thing automatically.

It seems to me that piling on more scripts only increases fragility, and indicates that git is really poorly suited for what we're doing. This applies to the proposed handling for large (binary) files just as much as it does to other things, such as revision numbering.

I think this is a bit misleading because there are many flows which are enabled by git that are not enabled by svn.

Such flows as?

In any case, we're not interested in arbitrary flows, only those that are appropriate for Chromium. But we already have enough trouble with supporting two workflows (rebase vs merge).
 

Not surprisingly, chromium is a large enough project to have unique requirements. It's unlikely that there will be an out-of-the-box solution that gets everyone everything, but we're trying our best to accommodate.

As I mentioned above, there's a substantial amount of work (internally and externally) being put into improving the implementation of git with binary files. The desire for revision numbers stems from the enormity of the chromium project and the need for centralized coordination in that project.

Chromium basically operates with a central repo, in a star topology. Why are we insisting on moving to a system that was designed for everything but that, and then building workarounds to try to support that?
 
 

 
 
 

Is Blink the only repo we're thinking about bringing in? What about other repos? (Skia, the Chrome OS repos, NaCl, Android, whatever -- thinking long-term.) What's our plan to scale our repo?

As far as I know, Blink is the only repo we're thinking about. 

Since we actively discourage people from trying to use Blink w/o the rest of Chrome, and since we increasingly need to change things in both repos at once, it's clear that there is a big win to blink devs to merge the two repos. 

I don't believe this is true of the others you mention, all of which have a meaningful non-chrome-centric existence.

To be clear, systems like subversion allow separate and combined existence. We don't take advantage of it very much, but the capability is there. (Chromium developers tend to think of the repo as being rooted at src/, which is incorrect.)

The great advantage of "one big repo" is that it allows for atomic changes, and can reduce the amount of gardening needed. (You can still have gardening schemes if you want[*], but there's the capability to use everything at ToT if you want.[**])

For example, take Chrome OS. As I understand it, they currently use parts of Chromium (e.g., base/), and semi-regularly have to garden bits of Chromium in. Now, whenever someone does major refactoring in Chromium, they potentially inflict great pain on Chrome OS. Currently, a great deal of Chrome OS-specific UI code lives in Chromium. This is largely due to an architectural problem. But if we solve that problem, then it'd be natural to kick such code closer to Chrome OS. But this is basically not doable in any meaningful sense, at least without inflicting great pain on everyone, with the many-repos scheme.

It seems to me that moving to a revision control system that forces you to have lots of repos (and thus forcing gardening and/or code duplication) is a step in the wrong direction.

[*] Subversion doesn't force you to check out all of your repo at the same revision. Things like Perforce clients offer even more mapping flexibility/pain.
[**] E.g., probably you could put NaCl into the Chromium subversion repo somewhere ("beside" Chromium's src/; probably "beside" Chromium's trunk/ even), and no one would be the wiser (other than some svn paths needing to be updated, and revision numbers changing). But then if you wanted to, you could try to use NaCl at ToT.[***]
[***] Of course, our build/test infrastructure would need significant changes to make things like this practical, but that's a different problem. My concern is that moving to git might make this (a "google3"-like model) impossible.

On the flip side, moving off SVN will enable significant improvements on the workflow side, including affording the ability to better tool the management of multiple repos. This is not to mention that the majority of devs are already using git and are suffering daily with its second-class-citizen status. In addition, there are much better stories around security and auditability of the data-at-rest with git than with SVN (even in the context of the proposed history rewrites), and we can provide significantly better QoS with git than we can with SVN.

To be honest, given the amount of effort we've poured into trying to make git work, I don't think it's fair to compare versus subversion.

Er... there are a lot of features that we get with git that we can't have with subversion. I'm not going to list them. If we wanted to build all of those features for subversion (or even just a couple of them), then it would probably be even more work.

Which features do we want? Please list them. And please compare them to the things we lose.

I'm not trying to defend subversion per se. But I haven't seen any serious discussion of the pros and cons of git vs subversion (or other systems).
 

Unless I missed your meaning?

I was speaking in particular to the things you listed above. 
 

And to be honest, I found the old git workflow -- the one that evmar originally set up -- to be mostly superior in its usability/reliability to what we have now. I'm not entirely sure where you're getting this "suffering daily with its second-class-citizen status" bit.

A lot of the complexity with the current flow stems from the current tooling. A lot of the complexity in the current tooling (looking at you, gclient :) stems from its need to support multiple SCMs. This is one of the things we would love to dig into and clean up, but it's nigh impossible when we're in this amorphous git/svn middle transition state.

You can't seriously denounce things like gclient while at the same time saying that our solution for big(ish)/binary/whatever files is to check in hashes and pull them from google storage using some other tool. Because then your revision control system isn't git, but git + tools.


The current system which supports both git and svn has an incredible number of moving parts, many of which speak both git and svn. The ability to improve anything in this environment is staggeringly difficult.

A lot of this was self-imposed, in our quest to actually move to git.

It's not obvious that moving to git is superior to a world that's entirely svn in the backend, with a thin veneer of git for developers who want to have local branches (i.e., what Evan originally set up).


Because all of these tools need to support both git and svn, they all must be written to serve the lowest common denominator of both systems, which means that currently we get all the pain of both, but few of the advantages of either one. Once we move to git, we'll be able to de-cruftify a lot of the tooling and begin to rely on the more advanced features of git. 

What advanced features of git do we plan on relying on?
 

 
 


One of the problems with the one-big-repo world is it can very easily lead to tight coupling of presumably independent code bases (our one-big-repo is currently not immune to this). Either they should be independent code bases (in which case atomic commits are nice but not strictly necessary: i.e. automated multi-repo commits could be enabled by better infrastructure tooling), or they really are coupled, in which case they should probably be merged together (such as blink).

As I said, in the case of Chrome OS, they're already pulling things from Chromium. (Because who really wants to duplicate base/? net/? And in the future, probably more.)

>>**Given improved tooling**<<, it would make sense to me (if I actually mattered here), to move these to a separate repo and pin them in both, because they're being consumed independently. Improved tooling would mean that you could prepare a multi-repo patch for chromeos, chromium and e.g. base, get it reviewed together, try it all together on the trybots, and then land the patches in dependency order (base, chromium, chromeos) when you did the 'commit' action automatically.
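
The "land the patches in dependency order" step can be sketched as a topological sort. This is only an illustration: the dependency map below is an assumption drawn from the example in the paragraph above (base, chromium, chromeos), and no such multi-repo landing tool exists in the thread.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical dependency map: each repo lists the repos it consumes.
depends_on = {
    "base": set(),
    "chromium": {"base"},
    "chromeos": {"base", "chromium"},
}


def landing_order(deps):
    """Return repos ordered so every dependency lands before its consumers."""
    return list(TopologicalSorter(deps).static_order())


print(landing_order(depends_on))
```

Ordering is the easy part. The sketch says nothing about what happens when a later patch in the sequence fails to land, which is where the atomicity of a single repo differs.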

I think the problem is that we have too much tooling, not too little.
 

Improved tooling would also, in my mind, automatically post DEPS roll CLs in all downstream repos any time the upstream repo changed (or maybe batches these up to once a day) and run tryjobs for them, so an owner could just lgtm+commit them.

My mind is boggled.

The major problem that combining repos solves is atomic commits, which eliminates the need for multi-sided changes (change here, roll deps there, change there, update here, etc.). This can't reasonably be replicated by tooling.
 
 

And so if you're suggesting that they (or at least parts of them) should be merged into the Chromium repo, it's important to ask if our repo could actually accommodate them? 

Here, there are actually two questions: Can git handle such a repo size at all with reasonable performance? And how painful would it be for developers? (Keep in mind that the ability of subversion (and similar systems) to do partial checkouts makes it easy to limit/eliminate the pain.)


Yes, git can handle very huge repos of text data with possibly a few small binaries. If we start committing compiled build products, then not so much, but I don't think anyone's planning to do that.

Please define "huge".

E.g., if I take all of the Chrome OS repos, blink, and Chromium and combine them, do I still get a functional repo?


I'm not sure how to answer 'how painful', but if you mean size-on-disk, then I think I've already demonstrated that this is already not-a-huge-problem, and can be mitigated by doing shallow clones (if you can't spare the extra 2GB for the full history).

On-disk size is not the only consideration.

Network transfer size is probably a bigger consideration, especially if, say, you're an external contributor with poorer connectivity. Shallow clones help somewhat, but are still limited (if less so than pre-git-1.9). The lack of support for partial/narrow clones is still a major limitation.

Marshall Greenblatt

Apr 16, 2014, 12:38:06 PM
to Michael Moss, Chase Phillips, Chromium-dev, blin...@chromium.org, Ben Henry, Aaron Gable, Ryan Tseng, Stefan Zager, Robert Iannucci
On Tue, Apr 15, 2014 at 7:19 PM, Marshall Greenblatt <magree...@gmail.com> wrote:
On Apr 15, 2014, at 7:07 PM, Michael Moss <mm...@chromium.org> wrote:

Branches already exist in git (https://chromium.googlesource.com/chromium/src.git/+log/refs/branch-heads/1933). You can 'gclient sync --with_branch_heads' to pull them into your checkout if you don't have them already.

These instructions seem to indicate that there's still an svn dependency when building release branches: http://www.chromium.org/developers/how-tos/get-the-code#Working_with_release_branches

Can you update the instructions if there's now a way to do it without requiring svn?

Just to follow up, this worked for me on Windows:

fetch --nohooks chromium --nosvn=True
gclient sync --nohooks --with_branch_heads
git fetch
git checkout refs/remotes/branch-heads/1916
gclient sync --jobs 16

However, it looks like tags aren't always updated in a timely manner. I expected to see 35.0.1916.27 in the below list since that's the current version shown on omahaproxy.

$ git show-ref | grep 35.0.1916
43b6e91259065ea65a50c4ff54a566f6997727a3 refs/tags/35.0.1916.0
fd3e875f4c8681b0890142b33cc4bb549aeed3bc refs/tags/35.0.1916.1
050085866580d77a276254d4406c26368dfc96b5 refs/tags/35.0.1916.10
425d2641586587e359ee9cf7369690acb5f1961e refs/tags/35.0.1916.11
22d0184131ab11d0da6ae15060d346baba80263f refs/tags/35.0.1916.13
fe4bc8dbb87e9ddf41e20e6c640d71d6c46606cb refs/tags/35.0.1916.14
96600a11b2975f51f67daeb16a039643a5b101e7 refs/tags/35.0.1916.15
0b7be79444fec0be6507188eee73a3b8b471ccd4 refs/tags/35.0.1916.17
4f19bff843bbc21306dc6ed70efb0bb3748225ce refs/tags/35.0.1916.2
503bc028cd2c9d1a886ab33bdf4c97bde4f45ba7 refs/tags/35.0.1916.21
84146968f467c5677dd91b2b29c26dbcd8d048fc refs/tags/35.0.1916.22
e3d2c5e61e796209210230e0c8c1d48e88d90123 refs/tags/35.0.1916.3
d009715289deea55de44e7c809946db174dae375 refs/tags/35.0.1916.4
29aad34e553b78ed7bb33c5a82319867f718d714 refs/tags/35.0.1916.5
85bc33759ab5dbac8091fa758cf0e91d05f50524 refs/tags/35.0.1916.6
a52fe06ec0c45d79a6fd676a1d418cce4bb5c98a refs/tags/35.0.1916.7
64d0a315c51603a46b3c185fec8e0e857de22869 refs/tags/35.0.1916.8
f33e556bd9ceed46d8e7522855bb78aa105590e1 refs/tags/35.0.1916.9
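
The listing above is in plain string order (35.0.1916.10 sorts before 35.0.1916.2), which makes it harder to spot the newest tag. A small sketch for finding it numerically follows; `version_key` and `newest_tag` are illustrative helper names, and the sample lines are copied from the output above.

```python
def version_key(tag):
    """Turn '35.0.1916.10' into (35, 0, 1916, 10) for numeric ordering."""
    return tuple(int(part) for part in tag.split("."))


def newest_tag(ref_lines, prefix):
    """Return the highest refs/tags/<prefix>* version in `git show-ref` output."""
    tags = []
    for line in ref_lines:
        _sha, ref = line.split()
        if ref.startswith("refs/tags/"):
            name = ref[len("refs/tags/"):]
            if name.startswith(prefix):
                tags.append(name)
    return max(tags, key=version_key) if tags else None


sample = [
    "0b7be79444fec0be6507188eee73a3b8b471ccd4 refs/tags/35.0.1916.17",
    "4f19bff843bbc21306dc6ed70efb0bb3748225ce refs/tags/35.0.1916.2",
    "84146968f467c5677dd91b2b29c26dbcd8d048fc refs/tags/35.0.1916.22",
    "f33e556bd9ceed46d8e7522855bb78aa105590e1 refs/tags/35.0.1916.9",
]
print(newest_tag(sample, "35.0.1916."))
```

Comparing the result against the version shown on omahaproxy would show whether the tags in a checkout are lagging.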
 

Thanks,
Marshall




On Tue, Apr 15, 2014 at 3:57 PM, Marshall Greenblatt <magree...@gmail.com> wrote:
Hi Chase,

What's the timeline and plan for migrating Chromium release branches to git?

Thanks,
Marshall

Eric Seidel

Apr 16, 2014, 1:26:29 PM
to Daniel Bratell, Aaron Gable, Ilya Sherman, Chase Phillips, Chromium-dev, blink-dev, Ben Henry, Ryan Tseng, Michael Moss, Stefan Zager, Robert Iannucci
+1.

There is a ton of stop-energy on this thread.

I would like to publicly thank our tireless Infra engineers for taking this on.

Personally I'd much rather Infra offered fewer awesomer solutions.
These are good people, who are clearly understaffed to maintain the
menagerie of systems they've inherited. Part of better serving Chrome
is moving from 2 version control systems to 1. I'm very thankful that
these folks are doing that *for* us. There are a number of reasons why
Git was chosen [1], but now (after multiple years of prep) is the time
to execute.

As Daniel so nicely said, "you gain some tools you lose some tools"
but we believe the end result will be positive. Let's move forward and
then pick up the pieces. Stopping for more time to prepare is not a
good option.

The infra repositories are (mostly) open source [2] and I'm sure these
engineers would love a hand with the re-painting.

-eric

1. https://docs.google.com/a/chromium.org/document/d/1JHs1SjHTnUK77PAVeG6T-xTjUWBGqfuglTvYVa2zwmU/edit#heading=h.3ptkkl4ppzn8
2. https://chromium.googlesource.com/ the tools/* repos.

Viet-Trung Luu

Apr 16, 2014, 1:37:20 PM
to Eric Seidel, Daniel Bratell, Aaron Gable, Ilya Sherman, Chase Phillips, Chromium-dev, blink-dev, Ben Henry, Ryan Tseng, Michael Moss, Stefan Zager, Robert Iannucci
On Wed, Apr 16, 2014 at 10:26 AM, Eric Seidel <ese...@chromium.org> wrote:
+1.

There is a ton of stop-energy on this thread.

I would like to publicly thank our tireless Infra engineers for taking this on.

Personally I'd much rather Infra offered fewer awesomer solutions.
These are good people, who are clearly understaffed to maintain the
menagerie of systems they've inherited. Part of better serving Chrome
is moving from 2 version control systems to 1.  I'm very thankful that
these folks are doing that *for* us. There are a number of reasons why
Git was chosen [1], but now (after multiple years of prep) is the time
to execute.

This document is a little hard to take seriously, since it has no discussion about the disadvantages of git and any alternatives considered.
 

As Daniel so nicely said, "you gain some tools you lose some tools"
but we believe the end result will be positive.  Let's move forward and
then pick up the pieces.  Stopping for more time to prepare is not a
good option.

The infra repositories are (mostly) open source [2] and I'm sure these
engineers would love a hand with the re-painting.

-eric

1. https://docs.google.com/a/chromium.org/document/d/1JHs1SjHTnUK77PAVeG6T-xTjUWBGqfuglTvYVa2zwmU/edit#heading=h.3ptkkl4ppzn8
2. https://chromium.googlesource.com/ the tools/* repos.

On Wed, Apr 16, 2014 at 6:02 AM, Daniel Bratell <bra...@opera.com> wrote:
> On Wed, 16 Apr 2014 00:08:34 +0200, Ilya Sherman <ishe...@chromium.org>
> wrote:
>
>> Can we pretty please keep a monotonically increasing number for official
>> commits (possibly per-branch), rather than reverting to hashes?  It's very
>> common for me to need to know: Is my change, rXXXXXX, included in the build
>> corresponding to this bug report?  With hashes, I need to query a tool to
>> answer this common question.  That's much less convenient, and slower, than
>> just being able to compare two numbers.
>
>
> At Opera we switched to git in 2009 and this was one of the concerns
> discussed at the time, and I was admittedly one of the people raising
> concerns. In hindsight it turned out to be a much smaller problem than
> expected. You change how you work, gain a few tools and lose a few tools,
> but the end result is positive.
>
> /Daniel
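
To make the "query a tool" step in Ilya's question concrete: with hashes, "is my change included in this build?" becomes an ancestry check rather than a numeric comparison. Below is a minimal sketch that wraps `git merge-base --is-ancestor`, which exits 0 when the first commit is an ancestor of the second and 1 when it is not. The function name `is_included` and the injectable `run` parameter are hypothetical, added here only so the logic can be exercised without a real repository.

```python
import subprocess


def is_included(change, build, repo=".", run=subprocess.run):
    """Return True if commit `change` is an ancestor of (or equal to) `build`.

    Relies on `git merge-base --is-ancestor`: exit status 0 means yes,
    1 means no, and anything else means git reported an error.
    """
    result = run(
        ["git", "-C", repo, "merge-base", "--is-ancestor", change, build]
    )
    if result.returncode in (0, 1):
        return result.returncode == 0
    raise RuntimeError("git could not resolve one of the commits")
```

Inside a checkout, `is_included(my_change_hash, build_hash)` answers the question with one call, though it is still slower and less convenient than comparing two revision numbers by eye, which is the concern raised above.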

Michael Spang

Apr 16, 2014, 2:00:13 PM
to Eric Seidel, Daniel Bratell, Aaron Gable, Ilya Sherman, Chase Phillips, Chromium-dev, blink-dev, Ben Henry, Ryan Tseng, Michael Moss, Stefan Zager, Robert Iannucci
+1

Big thanks to infrastructure for supporting so many workflows for so long. Dropping one will let them move faster.

Dirk Pranke

Apr 16, 2014, 3:27:36 PM
to Viet-Trung Luu, Eric Seidel, Daniel Bratell, Aaron Gable, Ilya Sherman, Chase Phillips, Chromium-dev, blink-dev, Ben Henry, Ryan Tseng, Michael Moss, Stefan Zager, Robert Iannucci
On Wed, Apr 16, 2014 at 10:37 AM, Viet-Trung Luu <viettr...@chromium.org> wrote:
On Wed, Apr 16, 2014 at 10:26 AM, Eric Seidel <ese...@chromium.org> wrote:
+1.

There is a ton of stop-energy on this thread.

I would like to publicly thank our tireless Infra engineers for taking this on.

Personally I'd much rather Infra offered fewer awesomer solutions.
These are good people, who are clearly understaffed to maintain the
menagerie of systems they've inherited. Part of better serving Chrome
is moving from 2 version control systems to 1.  I'm very thankful that
these folks are doing that *for* us. There are a number of reasons why
Git was chosen [1], but now (after multiple years of prep) is the time
to execute.

This document is a little hard to take seriously, since it has no discussion about the disadvantages of git and any alternatives considered.

Trung, the "svn vs. git" ship sailed a long time ago. That document is not intended to justify that decision. It is probably better than anything else we have written up, but that's not saying a lot.

The reason we don't have a document for the decision is that, in truth, there never was "a decision". 

The whole drive to Git was developer-driven and developer-initiated, and it wasn't made on purely technical merits. Infra's life would've been *far* easier if we had all stayed on svn and gcl, but of course that wasn't good enough for the vast majority of *us*. 

When we did the Blink fork, I'm pretty sure I heard from essentially every person still using SVN as we worked out the bugs. That number was probably less than a dozen out of the many hundreds of Chromium and Blink committers. 

I would also guess that I spent well more than 75% of my time troubleshooting issues for those people, compared to almost none for people using Git. Obviously, I had done the work to make Git easy, but my point is that making *both* systems work equally well is a substantial amount of work and simply doesn't make sense given all of the other demands on our time. At some point, the infra team (and others) have fairly observed that we need to reduce or eliminate this self-inflicted cost, and switching to Git is *by far* the easiest way to do that.

Others on this thread have raised fair concerns about various tools and issues that need to be addressed ASAP in order for the switch to go smoothly (revision numbers, fast web-based blame, etc.). We need to not lose sight of them.

I think your questions are also fair and we should keep them in mind as we need to scale the repos further. We should definitely consider them further in the context of merging Chromium and Blink, for example. But I think this is best done by starting a new thread with that specific focus, rather than trying to continue it here.

(I actually had a long-ish response written up where I attempted to list the advantages and disadvantages of git vs. svn, but decided that posting it here would just muddy the waters further.)

I hope this helps,

-- Dirk

Jeffrey Yasskin

Apr 16, 2014, 4:09:49 PM
to Eric Seidel, Daniel Bratell, Aaron Gable, Ilya Sherman, Chase Phillips, Chromium-dev, blink-dev, Ben Henry, Ryan Tseng, Michael Moss, Stefan Zager, Robert Iannucci
+1. Thank you to the infrastructure folks for finishing this out.


Paweł Hajdan, Jr.

Apr 17, 2014, 7:57:05 AM
to Viet-Trung Luu, Eric Seidel, Daniel Bratell, Aaron Gable, Ilya Sherman, Chase Phillips, Chromium-dev, blink-dev, Ben Henry, Ryan Tseng, Michael Moss, Stefan Zager, Robert Iannucci
On Wed, Apr 16, 2014 at 7:37 PM, Viet-Trung Luu <viettr...@chromium.org> wrote:
This document is a little hard to take seriously, since it has no discussion about the disadvantages of git and any alternatives considered.

Trung, I think I can see your point about git not being a perfect fit for Chromium. Yes, we do have large binary files and a big repo with a long history, and we mostly use it in a centralized manner anyway. There are probably other reasons against using git as well that I could recognize.

However, I'd like to urge everyone to be constructive, i.e. don't just shoot down ideas for change but always propose alternatives.

In this case there are actual issues we're hitting with svn, e.g. reverting with gclient is unreliable, repos get corrupted in ways we can only recover from by deleting the entire checkout, and so on. This is happening on a daily basis on the trybots. We believe moving to git is an important part of addressing these issues.

Paweł

Stefan Zager

Apr 21, 2014, 4:12:41 PM
to Mike Wittman, Chase Phillips, Chromium-dev, blink-dev, Ben Henry, Aaron Gable, Ryan Tseng, Michael Moss, Robert Iannucci
On Tue, Apr 15, 2014 at 2:02 PM, Mike Wittman <wit...@google.com> wrote:
It's been stated (here) that the git submodules workflow will break with this transition. What do those of us still using that workflow need to do to transition back to the gclient workflow?

I'll be sending out a separate PSA on this topic a bit later today.

Stefan

Elliott Sprehn

Apr 21, 2014, 5:43:18 PM
to Stefan Zager, Mike Wittman, Chase Phillips, Chromium-dev, blink-dev, Ben Henry, Aaron Gable, Ryan Tseng, Michael Moss, Robert Iannucci
I see announcements that this migration is still happening. Has the slowness of the blame tool been addressed? I haven't seen anything change.

https://chromium.googlesource.com/chromium/blink.git/+blame/87e70381aca484f1700ea23e72f5adc78dc4581c/Source/core/inspector/InspectorPageAgent.cpp took 6 seconds. It was faster the second time, but clicking back through the history is very slow.

Dirk Pranke

Apr 21, 2014, 5:48:19 PM
to Elliott Sprehn, Stefan Zager, Mike Wittman, Chase Phillips, Chromium-dev, blink-dev, Ben Henry, Aaron Gable, Ryan Tseng, Michael Moss, Robert Iannucci
I did actually see a note from one of the maintainers of gitiles (the web-based git viewer that is the equivalent of viewvc) saying that they had made some significant improvements recently. So, yes, it is being worked on.

-- Dirk

Stefan Zager

Apr 21, 2014, 5:53:37 PM
to Elliott Sprehn, Mike Wittman, Chase Phillips, Chromium-dev, blink-dev, Ben Henry, Aaron Gable, Ryan Tseng, Michael Moss, Robert Iannucci
On Mon, Apr 21, 2014 at 2:43 PM, Elliott Sprehn <esp...@chromium.org> wrote:
I see announcements that this migration is still happening. Has the slowness of the blame tool been addressed? I haven't seen anything change.

https://chromium.googlesource.com/chromium/blink.git/+blame/87e70381aca484f1700ea23e72f5adc78dc4581c/Source/core/inspector/InspectorPageAgent.cpp took 6 seconds. It was faster the second time, but clicking back through the history is very slow.


The git-on-borg team has been busy at work.  Here is their latest update:

Last week we improved Gitiles blame performance in two areas:

1. Optimized the JGit blame algorithm so it is comparable to C git on most files. On the same machine, it now outperforms C git blame on some files, and lags on others. These changes have also improved performance on Git-on-Borg, but due to various issues, Git-on-Borg performance may still lag behind local C git performance overall.

2. Made the Gitiles blame cache persistent, so blame results are cached indefinitely.  Ran some rough numbers at crbug.com/363822. We will monitor cache growth and long term we may be concerned about its unbounded nature, but in the short term GoB team is fairly confident we will not hit quota limits soon, and can react quickly in the event limits are reached.

Today/this week, we are working on adding more links to make the blame interface more useful, see https://code.google.com/p/gitiles/issues/detail?id=5.



A few simple experiments show that with a cold cache, blame performance is a bit faster but roughly comparable to viewvc; with a warm cache, it's much faster than viewvc.
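
To illustrate why a persistent blame cache produces the cold/warm difference described above: blame output for a fixed commit hash and path never changes, so a result can be stored once and reused indefinitely, at the cost of unbounded growth. The sketch below is only an illustration of that idea using SQLite; it is not how Gitiles or Git-on-Borg implement their cache, and `BlameCache` and `compute_blame` are hypothetical names.

```python
import sqlite3


class BlameCache:
    """Persistent cache keyed by (commit, path).

    Entries are never invalidated, because blame for a fixed commit
    hash and path is immutable. Nothing here bounds the table size.
    """

    def __init__(self, db_path, compute_blame):
        self._db = sqlite3.connect(db_path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS blame ("
            "commit_id TEXT, path TEXT, result TEXT, "
            "PRIMARY KEY (commit_id, path))"
        )
        self._compute = compute_blame  # hypothetical slow blame function

    def get(self, commit_id, path):
        row = self._db.execute(
            "SELECT result FROM blame WHERE commit_id = ? AND path = ?",
            (commit_id, path),
        ).fetchone()
        if row is not None:
            return row[0]  # warm cache: no blame computation needed
        result = self._compute(commit_id, path)  # cold cache: slow path
        self._db.execute(
            "INSERT INTO blame VALUES (?, ?, ?)", (commit_id, path, result)
        )
        self._db.commit()
        return result
```

Passing a file path as `db_path` keeps results across restarts; passing ":memory:" gives a throwaway cache for experiments. Note that keying on a branch name instead of a commit hash would break the "never invalidate" assumption.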


Stefan

Michael Moss

Apr 21, 2014, 6:05:25 PM
to Stefan Zager, Elliott Sprehn, Mike Wittman, Chase Phillips, Chromium-dev, blink-dev, Ben Henry, Aaron Gable, Ryan Tseng, Robert Iannucci
In general, git cold cache seems to be slower than viewvc in the best cases (though much better than it was last week), and faster in the worst cases. The warm cache is just faster overall.





Sébastien Marchand

May 9, 2014, 4:58:33 PM
to Michael Moss, Robert Iannucci, blink-dev, Ryan Tseng, Elliott Sprehn, Aaron Gable, Ben Henry, Chase Phillips, Stefan Zager, Mike Wittman, Chromium-dev

The migration to git will also break the pdb source file indexing, which is super useful when you're looking at a minidump... (This automatically downloads the source code at the right revision when you load it in windbg).

Maybe not a blocker but it'd be nice to fix this before doing the migration... I'll look at it next week.


Michael Moss

May 9, 2014, 5:04:58 PM
to Sébastien Marchand, Robert Iannucci, blink-dev, Ryan Tseng, Elliott Sprehn, Aaron Gable, Ben Henry, Chase Phillips, Stefan Zager, Mike Wittman, Chromium-dev

Elliott Sprehn

May 19, 2014, 6:50:56 PM
to Michael Moss, Stefan Zager, Mike Wittman, Chase Phillips, Chromium-dev, blink-dev, Ben Henry, Aaron Gable, Ryan Tseng, Robert Iannucci
I'm seeing git blame performance be horrible again. ex.


That took 22 seconds to load; it took 1 second in ViewVC. Each time I jump backwards a revision in that file it takes another 20 seconds. Please don't migrate to git until blame isn't 20x slower.

John Abd-El-Malek

May 20, 2014, 12:01:18 AM
to Elliott Sprehn, Michael Moss, Stefan Zager, Mike Wittman, Chase Phillips, Chromium-dev, blink-dev, Ben Henry, Aaron Gable, Ryan Tseng, Robert Iannucci
It's interesting that viewvc for blink is much faster than viewvc for chromium. I wonder if it's because the chromium repo has more code or revisions?
My limited testing with files in chromium has been that the improved gitiles is about the same speed as viewvc or sometimes faster.