We started with an rsync snapshot copy of our Gerrit 3.2.7 environment on a separate test Gerrit master, which we then upgraded to 3.5.2 and then on to 3.6.6. From there we created a set of remote 3.6.6 mirror servers and got them replicating smoothly.
We are trying to streamline this process so that we can migrate production to Gerrit 3.6. To that end we currently refresh this master with a weekly rsync snapshot from the 3.2.7 environment, using the same script as before, which reindexes 3.2.7->3.5.2->3.6.6. This process is fairly well understood and has worked well so far. The Gerrit 3.6.6 servers exist in parallel to the production systems. We do not touch the production Gerrit 3.2 servers for this purpose and leave them running throughout.
The problem arises when we try to populate the Gerrit 3.6 proxies from the 3.6 master after the weekly snapshot. The methods tried so far are as follows:
- rsync is reliable, but takes too long to complete
- using replication, we see many jobs getting rejected with REJECTED_NONFASTFORWARD
- when we set defaultForceUpdate=true we get TransportException: null (stack trace below), presumably for the same projects that were getting REJECTED_NONFASTFORWARD before
- rsyncing the affected repos works around this exception so that they replicate smoothly again, but it takes manual work to track down the failures and correct them
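To make the manual tracking in the last point concrete, here is a minimal sketch of how the failing repos could be collected from the replication log. It assumes the failures are logged as "Cannot replicate to <target>" lines like the one in the stack trace below; the function name failed_targets is hypothetical, and the target format will depend on the remote URL in use.

```python
import re

# Matches the failure line as it appears in the log excerpt in this post:
#   [timestamp] Cannot replicate to <target> [CONTEXT pushOneId="..." ]
# Assumption: the target never contains whitespace.
FAILURE_RE = re.compile(r"Cannot replicate to (?P<target>\S+)")


def failed_targets(log_lines):
    """Return the distinct targets that logged a failure, in first-seen order."""
    seen = {}  # dict preserves insertion order, so duplicates collapse cleanly
    for line in log_lines:
        match = FAILURE_RE.search(line)
        if match:
            seen.setdefault(match.group("target"), None)
    return list(seen)
```

Passing an open log file to failed_targets() yields one entry per affected repo, which could then feed the per-repo rsync instead of searching the log by hand.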
Since we perform tests on the Gerrit 3.6 systems during the week and since routine work is ongoing in the production environment, it is expected that they will get out of sync.
Is there a way to configure the replication plugin so that the proxies get clobbered by the current state of the master?
Regards,
Robert.
Stack trace follows:
[2024-01-13 13:39:09,028] Cannot replicate to <redacted>.git [CONTEXT pushOneId="00f8be0c" ]
org.eclipse.jgit.errors.TransportException: <redacted>.git: null
at org.eclipse.jgit.transport.BasePackPushConnection.doPush(BasePackPushConnection.java:209)
at org.eclipse.jgit.transport.BasePackPushConnection.push(BasePackPushConnection.java:139)
at org.eclipse.jgit.transport.PushProcess.execute(PushProcess.java:179)
at org.eclipse.jgit.transport.Transport.push(Transport.java:1537)
at org.eclipse.jgit.transport.Transport.push(Transport.java:1583)
at com.googlesource.gerrit.plugins.replication.PushOne.pushInBatches(PushOne.java:591)
at com.googlesource.gerrit.plugins.replication.PushOne.pushVia(PushOne.java:584)
at com.googlesource.gerrit.plugins.replication.PushOne.runImpl(PushOne.java:555)
at com.googlesource.gerrit.plugins.replication.PushOne.doRunPushOperation(PushOne.java:437)
at com.googlesource.gerrit.plugins.replication.PushOne.runPushOperation(PushOne.java:405)
at com.googlesource.gerrit.plugins.replication.PushOne.lambda$run$2(PushOne.java:391)
at com.google.gerrit.server.util.RequestScopePropagator.lambda$cleanup$1(RequestScopePropagator.java:186)
at com.google.gerrit.server.util.RequestScopePropagator.lambda$context$0(RequestScopePropagator.java:174)
at com.google.gerrit.server.git.PerThreadRequestScope$Propagator.lambda$scope$0(PerThreadRequestScope.java:70)
at com.googlesource.gerrit.plugins.replication.PushOne.run(PushOne.java:394)
at com.google.gerrit.server.logging.LoggingContextAwareRunnable.run(LoggingContextAwareRunnable.java:113)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
at com.google.gerrit.server.git.WorkQueue$Task.run(WorkQueue.java:612)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: java.lang.NullPointerException