On Monday, September 29, 2014 04:39:01 pm 'Dave Borowitz' via
Repo and Gerrit Discussion wrote:
> On Mon, Sep 29, 2014 at 1:11 PM, Simon Lei <name.s...@gmail.com> wrote:
> > On Monday, September 29, 2014 3:43:33 PM UTC-4, Dave Borowitz wrote:
> >> On Mon, Sep 29, 2014 at 12:38 PM, Simon Lei <name.s...@gmail.com> wrote:
> >
> > ProjectCacheImpl.all() does return quickly, but it isn't
> > sufficient by itself since it returns a list of every
> > project, including the ones not visible to the user. In
> > ListProjects (the use case that I'm focusing on), an extra
> > step is taken to check the visibility of every project to
> > the user, amongst other things.
Are you on local disk (spinning or SSD) or on NFS? I wonder
if this list takes longer on NFS. One thing that is certain
is that our NFS slaves are definitely slower than our SSD
slaves at ls-projects, so that difference must be related to
I/O. I suspect that this piece is part of what is slow on
NFS, but I have yet to confirm that via testing. If it is,
we may still want to improve this.
For fun, I set up a single repo and used git to fetch every
other project's refs/meta/config into it on separate
branches (a common git object cache for all the
refs/meta/configs). This took surprisingly long, over an
hour for 3K repos! I then hacked the Gerrit code to look for
every repo's SHAs in that repo first. I saw no difference in
speed on my local disk, but I could tell it was working by
looking at which pack files were opened. I need to test this
idea on NFS to see if it helps. If it does not speed things
up on NFS, I would also need to rule out the ref lookups (to
get the SHAs to load) as a potential culprit.
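The loop itself was nothing fancy; in JGit terms it would
look roughly like the sketch below, though I actually
scripted it with plain git. The paths, the branch naming,
and the listProjects() helper here are invented for
illustration, not what I actually ran:

  // Illustration only: fetch each project's refs/meta/config into one
  // bare "cache" repo, each under its own branch. Paths and naming are
  // invented; a real Gerrit site also nests projects in subdirectories.
  import java.io.File;
  import java.util.ArrayList;
  import java.util.List;
  import org.eclipse.jgit.api.Git;
  import org.eclipse.jgit.transport.RefSpec;

  public class BuildConfigObjectCache {
    public static void main(String[] args) throws Exception {
      File site = new File("/gerrit/git");    // assumed repo directory
      try (Git cache = Git.open(new File("/gerrit/config-cache.git"))) {
        for (String project : listProjects(site)) {
          String url = "file://" + new File(site, project + ".git").getAbsolutePath();
          cache.fetch()
              .setRemote(url)
              .setRefSpecs(new RefSpec(
                  "refs/meta/config:refs/heads/" + project.replace('/', '_')))
              .call();
        }
      }
    }

    // Naive project listing: top-level *.git directories only.
    static List<String> listProjects(File site) {
      List<String> names = new ArrayList<String>();
      File[] files = site.listFiles();
      if (files == null) {
        return names;
      }
      for (File f : files) {
        if (f.isDirectory() && f.getName().endsWith(".git")) {
          names.add(f.getName().substring(0, f.getName().length() - 4));
        }
      }
      return names;
    }
  }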
> That makes sense. Unfortunately the simplest thing to
> persist would be the project list itself.
>
> I suspect caching the ProjectStates persistently would be
> much harder because they store references to all kinds of
> stateful singletons. Not impossible, but probably a heck of
> a lot harder than writing something to prewarm the cache on
> server startup.
As a hack, two weeks ago I tried persisting the
ProjectConfigs. It required making tons of things
serializable (I think some transient fields broke it ->
NPEs), and I still didn't get it to work. So yes, it is not
easy (not even a simple hack).
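To show the kind of breakage I mean (this is not Gerrit's
code, just a minimal stand-alone demonstration of the
transient-field problem):

  import java.io.ByteArrayInputStream;
  import java.io.ByteArrayOutputStream;
  import java.io.ObjectInputStream;
  import java.io.ObjectOutputStream;
  import java.io.Serializable;

  public class TransientNpeDemo {
    static class Config implements Serializable {
      private static final long serialVersionUID = 1L;
      String name = "some-project";
      // transient fields are skipped by Java serialization, so they
      // come back null after the object is read in again
      transient StringBuilder parsed = new StringBuilder("parsed data");

      int parsedLength() {
        return parsed.length();   // NPE once 'parsed' is null
      }
    }

    public static void main(String[] args) throws Exception {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
        out.writeObject(new Config());
      }
      try (ObjectInputStream in = new ObjectInputStream(
          new ByteArrayInputStream(bytes.toByteArray()))) {
        Config copy = (Config) in.readObject();
        copy.parsedLength();      // throws NullPointerException
      }
    }
  }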
One additional piece of the puzzle is that slaves tend to
flush the project states on fetch if they are older than
some threshold (5 mins?). Because of this design, I don't
think that running ls-projects regularly prevents the flush.
So even if we had a way of forcing a prewarm on startup, it
might not solve the problem on slaves without a change to
this flushing behavior (flush after 5 mins, not on fetch,
and then auto reload).
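What I am imagining is closer to Guava's refreshAfterWrite
behavior (which Gerrit's caches are built on) than to
outright invalidation: readers keep getting the old entry
while a new one is loaded. A rough sketch, with a String
standing in for ProjectState and none of Gerrit's actual
cache wiring:

  import java.util.concurrent.TimeUnit;
  import com.google.common.cache.CacheBuilder;
  import com.google.common.cache.CacheLoader;
  import com.google.common.cache.LoadingCache;

  public class RefreshingProjectCacheSketch {
    static LoadingCache<String, String> build() {
      return CacheBuilder.newBuilder()
          // entries older than 5 minutes get reloaded on the next
          // access, instead of being thrown away on fetch
          .refreshAfterWrite(5, TimeUnit.MINUTES)
          .build(new CacheLoader<String, String>() {
            @Override
            public String load(String projectName) {
              // stand-in for parsing refs/meta/config into a ProjectState
              return "state-of-" + projectName;
            }
            // overriding reload(...) to return a ListenableFuture would
            // make the refresh fully asynchronous; omitted for brevity
          });
    }
  }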
One reason I started looking into this is that the project
list is not the only thing that suffers from this visibility
check; queries do too. Queries that need to iterate over
tons of changes also suffer drastically from cold-start
issues. My testing points towards the same cause: visibility
testing that is slow because of ProjectState building.
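To spell out the pattern I am talking about (a paraphrase,
not the actual ListProjects or query code; the types and the
isVisibleTo() check below are stand-ins for Gerrit's real
ones):

  import java.util.ArrayList;
  import java.util.List;

  public class VisibilityLoopSketch {
    // Stand-in for Gerrit's project cache; only the shape matters here.
    interface ProjectCache {
      Iterable<String> all();    // cheap: just the project names
      Object get(String name);   // expensive when cold: builds the state
    }

    static List<String> visibleTo(ProjectCache cache, Object user) {
      List<String> visible = new ArrayList<String>();
      for (String name : cache.all()) {
        Object state = cache.get(name);  // refs/meta/config parse on a cold cache
        if (isVisibleTo(state, user)) {  // per-project permission evaluation
          visible.add(name);
        }
      }
      return visible;
    }

    // Stand-in for the real permission checks.
    static boolean isVisibleTo(Object state, Object user) {
      return state != null;
    }
  }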
One thing that is still unexplored for me is parallelizing
the ProjectState loading/parsing. I have no idea how to even
begin doing that (we might need a separate queue just for
it!), but it might be a big win for those of us with
high-end servers and tons of cores. Unfortunately, I don't
know that there are any quick, easy solutions to this
problem. :(
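To make "parallelizing" a bit more concrete, the naive
version would just load every project's state from a thread
pool at startup, something like the sketch below. This is
purely illustrative: loadState() is a hypothetical stand-in,
and a real version would presumably have to go through
Gerrit's own work queues.

  import java.util.List;
  import java.util.concurrent.ExecutorService;
  import java.util.concurrent.Executors;
  import java.util.concurrent.TimeUnit;

  public class ParallelPrewarm {
    static void prewarm(List<String> allProjects, int threads)
        throws InterruptedException {
      ExecutorService pool = Executors.newFixedThreadPool(threads);
      for (final String name : allProjects) {
        pool.submit(new Runnable() {
          @Override
          public void run() {
            loadState(name);  // each task parses one project's refs/meta/config
          }
        });
      }
      pool.shutdown();
      pool.awaitTermination(30, TimeUnit.MINUTES);
    }

    // Hypothetical stand-in for whatever populates the project state
    // cache, e.g. a projectCache.get(name) call in real Gerrit code.
    static void loadState(String projectName) {
    }
  }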
-Martin
--
The Qualcomm Innovation Center, Inc. is a member of Code
Aurora Forum, hosted by The Linux Foundation