Gerrit replication fails for a big repository

Chunlin Zhang

Dec 13, 2012, 3:18:32 AM
to repo-d...@googlegroups.com
I have set up Gerrit replication (I use Gerrit 2.5). It succeeds for some small test projects, but fails for some Git projects that have large repositories (for example, larger than 10 GB).

When I run "replication start", I notice that the SSH activity lasts only a short time, for example half a minute, and ends with this error in the log:

[2012-12-13 15:52:48,834] ERROR com.googlesource.gerrit.plugins.replication.ReplicationQueue : Cannot replicate to ger...@10.115.6.212:review_site/git/projects/phone.git
org.eclipse.jgit.errors.TransportException: ger...@10.115.6.212:review_site/git/projects/phone.git: java.io.IOException: channel is broken
        at org.eclipse.jgit.transport.BasePackPushConnection.doPush(BasePackPushConnection.java:204)
        at org.eclipse.jgit.transport.BasePackPushConnection.push(BasePackPushConnection.java:142)
        at org.eclipse.jgit.transport.PushProcess.execute(PushProcess.java:141)
        at org.eclipse.jgit.transport.Transport.push(Transport.java:1127)
        at com.googlesource.gerrit.plugins.replication.PushOne.pushVia(PushOne.java:299)
        at com.googlesource.gerrit.plugins.replication.PushOne.runImpl(PushOne.java:244)
        at com.googlesource.gerrit.plugins.replication.PushOne.runPushOperation(PushOne.java:202)
        at com.googlesource.gerrit.plugins.replication.PushOne.access$000(PushOne.java:69)
        at com.googlesource.gerrit.plugins.replication.PushOne$1.call(PushOne.java:181)
        at com.googlesource.gerrit.plugins.replication.PushOne$1.call(PushOne.java:178)
        at com.google.gerrit.server.util.RequestScopePropagator$5.call(RequestScopePropagator.java:196)
        at com.google.gerrit.server.util.RequestScopePropagator$4.call(RequestScopePropagator.java:174)
        at com.google.gerrit.server.git.PerThreadRequestScope$Propagator$1.call(PerThreadRequestScope.java:73)
        at com.googlesource.gerrit.plugins.replication.PushOne.run(PushOne.java:178)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:98)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:206)
        at com.google.gerrit.server.git.WorkQueue$Task.run(WorkQueue.java:337)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)
Caused by: java.io.IOException: java.io.IOException: channel is broken
        at com.jcraft.jsch.Channel$1.flush(Channel.java:337)
        at com.jcraft.jsch.Channel$1.write(Channel.java:308)
        at org.eclipse.jgit.storage.pack.PackOutputStream.write(PackOutputStream.java:124)
        at org.eclipse.jgit.storage.file.PackFile.copyAsIs2(PackFile.java:503)
        at org.eclipse.jgit.storage.file.PackFile.copyAsIs(PackFile.java:327)
        at org.eclipse.jgit.storage.file.WindowCursor.copyObjectAsIs(WindowCursor.java:162)
        at org.eclipse.jgit.storage.pack.PackWriter.writeObjectImpl(PackWriter.java:1360)
        at org.eclipse.jgit.storage.pack.PackWriter.writeObject(PackWriter.java:1331)
        at org.eclipse.jgit.storage.pack.PackOutputStream.writeObject(PackOutputStream.java:161)
        at org.eclipse.jgit.storage.file.WindowCursor.writeObjects(WindowCursor.java:168)
        at org.eclipse.jgit.storage.pack.PackWriter.writeObjects(PackWriter.java:1319)
        at org.eclipse.jgit.storage.pack.PackWriter.writeObjects(PackWriter.java:1307)
        at org.eclipse.jgit.storage.pack.PackWriter.writePack(PackWriter.java:897)
        at org.eclipse.jgit.transport.BasePackPushConnection.writePack(BasePackPushConnection.java:284)
        at org.eclipse.jgit.transport.BasePackPushConnection.doPush(BasePackPushConnection.java:184)
        ... 22 more

What can I do about this?
Is Gerrit replication not suitable for big repositories?
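
When several projects replicate at once, it can help to pull the failing targets out of the log before digging into any one stack trace. The following is a minimal sketch in Python, assuming the log uses the same line format as the single error line quoted above; the function name `failed_targets` and the regular expression are illustrative and are not part of Gerrit or the replication plugin.

```python
import re
from datetime import datetime

# Pattern modelled on the one error line quoted in this thread.
# Assumption: other Gerrit versions may format the line differently.
ERROR_LINE = re.compile(
    r"^\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3}\] ERROR "
    r"\S*ReplicationQueue : Cannot replicate to (?P<target>\S+)"
)


def failed_targets(lines):
    """Return (timestamp, target) pairs for each failed replication."""
    failures = []
    for line in lines:
        match = ERROR_LINE.match(line)
        if match:
            ts = datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S")
            failures.append((ts, match.group("target")))
    return failures
```

Stack-trace lines are skipped because they do not start with a bracketed timestamp. Comparing the returned timestamps with the time "replication start" was issued gives a rough idea of how long each push survived.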

Chunlin Zhang

Dec 17, 2012, 10:56:50 PM
to repo-d...@googlegroups.com
I have decided to write a script that uses rsync to sync the mirror instead of Gerrit replication.
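
The script itself was not posted. As a rough sketch of what such an rsync-based mirror could look like, the Python below builds and runs one rsync call per bare repository. The function names and any paths passed to them are hypothetical; only the rsync options (`-a`, `--delete`, `-e ssh`) are standard.

```python
import subprocess


def rsync_command(src_repo, dest):
    """Build an rsync command that mirrors one bare repository."""
    # The trailing slash makes rsync copy the directory's contents
    # instead of nesting the directory inside dest.
    src = src_repo.rstrip("/") + "/"
    # -a keeps permissions and timestamps; --delete removes files on
    # the mirror that no longer exist on the master (e.g. old packs).
    return ["rsync", "-a", "--delete", "-e", "ssh", src, dest]


def mirror(src_repo, dest):
    """Run rsync and raise CalledProcessError if it exits non-zero."""
    subprocess.run(rsync_command(src_repo, dest), check=True)
```

A call might look like `mirror("/path/to/phone.git", "user@mirror-host:git/phone.git")`, with both arguments being placeholders. One caveat with this approach: copying a repository while it is being written to may capture refs that point at objects not yet copied, so a second pass or a quiet period may be needed.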

On Thursday, December 13, 2012 at 4:18:32 PM UTC+8, Chunlin Zhang wrote:

Shawn Pearce

Dec 24, 2012, 12:05:42 PM
to Chunlin Zhang, repo-discuss
On Thu, Dec 13, 2012 at 12:18 AM, Chunlin Zhang <zhangc...@gmail.com> wrote:
> I have setup gerrit replication(I use gerrit 2.5) and success for some
> little test project,but fail in some git project which have large
> repository(for example > 10G)
>
> When I do "replication start",I notice that the ssh action last for a short
> time for example half a minute and end with this error msg log:
>
> [2012-12-13 15:52:48,834] ERROR
> com.googlesource.gerrit.plugins.replication.ReplicationQueue : Cannot
> replicate to ger...@10.115.6.212:review_site/git/projects/phone.git
> org.eclipse.jgit.errors.TransportException:
> ger...@10.115.6.212:review_site/git/projects/phone.git:
> java.io.IOException: channel is broken

Is the slave SSH server timing out the connection because of an idle
timer? Gerrit connects, then does a lot of processing, then sends
data. While it is doing that processing there is no data travelling in
either direction on the connection.

Chunlin Zhang

Dec 24, 2012, 8:38:49 PM
to Shawn Pearce, repo-discuss
On Tue, Dec 25, 2012 at 1:05 AM, Shawn Pearce <s...@google.com> wrote:
> Is the slave SSH server timing out the connection because of an idle
> timer? Gerrit connects, then does a lot of processing, then sends
> data. While it is doing that processing there is no data travelling in
> either direction on the connection.

I don't think the slave is timing out during replication, because the error comes up quickly.
My guess now is that replication may trigger a "git gc" when pushing to the slave. For some of the big Git repositories on my master Gerrit, a "git gc" run through JGit may fail or take a very long time to finish (when I run "git gc" on a repository of this size, it typically takes 40+ minutes).
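
To put a number on the "40+ minutes" observation, the gc can be timed outside Gerrit. The sketch below is one way to do that in Python; `timed_run` and `time_git_gc` are illustrative names, and it measures native `git gc`, not JGit's implementation, so it says nothing about whether replication actually triggers a gc.

```python
import subprocess
import time


def timed_run(cmd, cwd=None):
    """Run cmd and return (exit_code, elapsed_seconds)."""
    start = time.monotonic()
    result = subprocess.run(cmd, cwd=cwd)
    return result.returncode, time.monotonic() - start


def time_git_gc(repo_path):
    """Time a native `git gc` inside the repository at repo_path."""
    return timed_run(["git", "gc"], cwd=repo_path)
```

If the gc takes tens of minutes while the replication error appears after about half a minute, the two timings do not line up, which would be worth noting when weighing the gc guess against the idle-timer one.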

georgey

Jan 9, 2013, 10:34:36 AM
to repo-d...@googlegroups.com


On Thursday, 13 December 2012 03:18:32 UTC-5, Chunlin Zhang wrote:
> I have setup gerrit replication(I use gerrit 2.5) and success for some little test project,but fail in some git project which have large repository(for example > 10G)
>
> When I do "replication start",I notice that the ssh action last for a short time for example half a minute and end with this error msg log:
>
> ... Cannot replicate to ger...@10.115.6.212:review_site/git/projects/phone.git


Is your replication target another Gerrit server? I get errors in org.eclipse.jgit.transport.SideBandOutputStream.writeBuffer when some native Git clients clone my large repos, though I'm running an earlier Gerrit/JGit combination. I know some older Git clients do not cope well with large TCP window sizes or with a lack of timely session "ack" communications. I have limited the packed object sizes in my repos, but some project owners store binaries, and (J)Git != Svn. Perhaps these communication limitations are a real bug in JGit.