Master/slave and Lucene index


Jay Soffian

Sep 18, 2014, 11:42:09 PM
to Repo and Gerrit Discussion
I have a master Gerrit instance running against MySQL.

For failover purposes, a standby server is running a Gerrit instance
in slave mode. MySQL replication is used for the database, and the
Gerrit replication plugin replicates the git changes from the master
server to the standby server.

That leaves the Lucene index. As things stand currently, during a
failover I'd have to shut down the slave instance and run the reindex
operation before restarting in master mode.

Is there a way to avoid the reindex? Can rsync be used to keep the
index on the standby server up-to-date? Other suggestions?

Currently on 2.8, but looking to upgrade to 2.9 soon.

Thanks,

j.

Magnus Bäck

Sep 19, 2014, 2:00:06 AM
to Jay Soffian, Repo and Gerrit Discussion
On Friday, September 19, 2014 at 05:42 CEST,
Not really an answer to your question, but there's work in progress to
support Elasticsearch as an indexing backend. That way the master and
the standby server can share the same index. This won't be in until
2.11 at the earliest.

--
Magnus Bäck | Software Engineer, Development Tools
magnu...@sonymobile.com | Sony Mobile Communications

David Ostrovsky

Sep 19, 2014, 2:15:36 AM
to repo-d...@googlegroups.com

On Friday, September 19, 2014 at 05:42:09 UTC+2, Jay Soffian wrote:
> I have a master Gerrit instance running against MySQL.
>
> For failover purposes, a standby server is running a Gerrit instance
> in slave mode. MySQL replication is used for the database, and the
> Gerrit replication plugin replicates the git changes from the master
> server to the standby server.
>
> That leaves the lucene index. As things stand currently, during a
> failover I'd have to shutdown the slave instance and run the reindex
> operation before restarting in master mode.


This doesn't work: as of 2.9.1 reindexing on slaves is disabled.
 
> Is there a way to avoid the reindex? Can rsync be used to keep the
> index on the standby server up-to-date? Other suggestions?
>
> Currently on 2.8, but looking to upgrade to 2.9 soon.

In theory, the Solr index type is available in 2.9. But I think the Solr
integration in Gerrit is broken, and it was removed in this change [1].


Jay Soffian

Sep 19, 2014, 12:23:27 PM
to David Ostrovsky, Repo and Gerrit Discussion
On Fri, Sep 19, 2014 at 2:15 AM, David Ostrovsky
<david.o...@gmail.com> wrote:
>> That leaves the lucene index. As things stand currently, during a
>> failover I'd have to shutdown the slave instance and run the reindex
>> operation before restarting in master mode.
>>
>
> This doesn't work: as of 2.9.1 reindexing on slaves is disabled.

During a failover, it wouldn't be a slave anymore.

- Shut down the slave Gerrit instance running on the standby server
- Reindex (on the standby server)
- Restart the Gerrit instance on the standby server as a master.

I'm trying to avoid the time-consuming re-index step.
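
The three failover steps above could be sketched as follows. This is only an illustration under stated assumptions: the site path /srv/gerrit is hypothetical, and the sketch assumes a standard site layout (bin/gerrit.sh, bin/gerrit.war, etc/gerrit.config) with slave mode controlled by container.slave. It only builds the command lines and does not run them.

```python
from pathlib import Path
from typing import List


def failover_commands(site: str) -> List[List[str]]:
    """Return the promotion commands in order; nothing is executed here."""
    site_path = Path(site)
    gerrit_sh = str(site_path / "bin" / "gerrit.sh")
    gerrit_war = str(site_path / "bin" / "gerrit.war")
    gerrit_config = str(site_path / "etc" / "gerrit.config")
    return [
        # 1. Stop the slave instance on the standby server.
        [gerrit_sh, "stop"],
        # Assumption: slave mode is switched off via container.slave.
        ["git", "config", "-f", gerrit_config, "container.slave", "false"],
        # 2. Offline reindex (the time-consuming step).
        ["java", "-jar", gerrit_war, "reindex", "-d", str(site_path)],
        # 3. Start the instance again, now as a master.
        [gerrit_sh, "start"],
    ]


if __name__ == "__main__":
    # "/srv/gerrit" is a hypothetical site path.
    for cmd in failover_commands("/srv/gerrit"):
        print(" ".join(cmd))
```

Everything except the reindex command is quick, which is why the rest of the thread concentrates on avoiding that one step.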

Some googling indicates it may be viable to rsync the Lucene index. I
think I can do this, keeping track of the highest synced change. Then
during a failover, use the REST API to index any changes missing from
the index.

At worst, it doesn't work and I'm stuck with running a re-index operation.
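
A rough sketch of the rsync idea, assuming the Lucene index lives in the index/ directory of the Gerrit site and that the standby uses the same site path. The host name and path are hypothetical. Copying an index while the master is writing to it may produce an inconsistent snapshot, so this is a starting point rather than a tested procedure.

```python
from typing import List


def rsync_index_command(site: str, standby_host: str, dry_run: bool = False) -> List[str]:
    """Build an rsync command that mirrors <site>/index/ to the standby."""
    # Trailing slash: copy the directory contents, not the directory itself.
    src = site.rstrip("/") + "/index/"
    dest = f"{standby_host}:{src}"
    # -a preserves attributes; --delete removes files the master has dropped.
    cmd = ["rsync", "-a", "--delete"]
    if dry_run:
        cmd.append("--dry-run")
    return cmd + [src, dest]


if __name__ == "__main__":
    # Hypothetical site path and host name.
    print(" ".join(rsync_index_command("/srv/gerrit", "standby.example.com", dry_run=True)))
```

Recording a timestamp just before each sync starts would give the failover fixup a lower bound for which changes might be missing from the copied index.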

j.

Dave Borowitz

Sep 19, 2014, 2:50:18 PM
to Jay Soffian, David Ostrovsky, Repo and Gerrit Discussion
On Fri, Sep 19, 2014 at 9:23 AM, Jay Soffian <jayso...@gmail.com> wrote:
> On Fri, Sep 19, 2014 at 2:15 AM, David Ostrovsky
> <david.o...@gmail.com> wrote:
> >> That leaves the lucene index. As things stand currently, during a
> >> failover I'd have to shutdown the slave instance and run the reindex
> >> operation before restarting in master mode.
> >>
> >
> > This doesn't work: as of 2.9.1 reindexing on slaves is disabled.
>
> During a failover, it wouldn't be a slave anymore.

I'm curious if you've successfully failed over in the past even without a Lucene index. There is zero consistency between the git repos and the database, as they're replicated completely separately, and depending on how replication works in your database there's no knowing how many writes you've dropped. I also have no idea how database replication interacts with the sequences Gerrit needs to hand out incrementing IDs.

At least with Lucene you can be assured that after a reindex it's consistent with the DB. With everything else you're not so lucky :)
 
> - Shutdown the slave Gerrit instance running on the standby server
> - Reindex (on the standby server)
> - Restart the Gerrit instance on the standby server as a master.
>
> I'm trying to avoid the time-consuming re-index step.
>
> Some googling indicates it may be viable to rsync the Lucene index.

Yes, AFAIK this will work.
 
> I think I can do this, keeping track of the highest synced change.

Not sure what you mean by "highest synced change." If you're referring to change numbers, those are irrelevant. What you want to keep track of is the latest updated timestamp across all changes. (And ensure master and slaves have the same timezone. And hope that clock skew among them is negligible.)
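
One way to track the "latest updated timestamp" could look like the sketch below. It assumes the values are in the format the Gerrit REST API uses for timestamps (UTC, "yyyy-mm-dd hh:mm:ss.fffffffff"). The function names and the five-minute skew margin are made up for illustration.

```python
from datetime import datetime, timedelta, timezone
from typing import Iterable


def parse_gerrit_timestamp(value: str) -> datetime:
    """Parse 'yyyy-mm-dd hh:mm:ss.fffffffff' (UTC) into an aware datetime."""
    # strptime's %f accepts at most 6 fractional digits, so trim nanoseconds.
    head, _, frac = value.partition(".")
    micros = (frac + "000000")[:6]
    parsed = datetime.strptime(f"{head}.{micros}", "%Y-%m-%d %H:%M:%S.%f")
    return parsed.replace(tzinfo=timezone.utc)


def sync_watermark(updated_values: Iterable[str],
                   skew: timedelta = timedelta(minutes=5)) -> datetime:
    """Latest 'updated' time seen, moved back by a margin for clock skew."""
    latest = max(parse_gerrit_timestamp(v) for v in updated_values)
    return latest - skew


if __name__ == "__main__":
    seen = ["2014-09-19 18:50:18.126000000", "2014-09-19 20:43:47.000000000"]
    print(sync_watermark(seen).isoformat())
```

Subtracting a margin means a few changes get reindexed unnecessarily, which is cheaper than missing one because of clock skew.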
 
> Then during a failover, use the REST API to index any changes missing from
> the index.

As I mentioned in another thread recently I'd really like a background repair of stale changes to be part of core. If you are going to spend significant coding effort doing this fixup operation, it would be great if you could contribute it upstream.
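
A minimal sketch of the REST-based fixup, assuming the Gerrit version in use offers the "Index Change" endpoint (POST /changes/{change-id}/index, reached under the authenticated /a/ prefix). The list of stale change numbers has to come from somewhere other than the index itself, for example a database query on the last-updated time; that part is an assumption and is not shown. The base URL is hypothetical, and the requests are only built, not sent.

```python
import urllib.request
from typing import Iterable, List


def index_requests(base_url: str,
                   change_numbers: Iterable[int]) -> List[urllib.request.Request]:
    """Build one 'Index Change' POST request per stale change."""
    base = base_url.rstrip("/")
    return [
        # The /a/ prefix selects authenticated access; credentials are
        # deliberately left out of this sketch.
        urllib.request.Request(f"{base}/a/changes/{number}/index",
                               data=b"", method="POST")
        for number in change_numbers
    ]


if __name__ == "__main__":
    # Hypothetical server URL and change numbers.
    for req in index_requests("https://gerrit.example.com", [101, 102]):
        print(req.get_method(), req.full_url)
```

Sending the requests would additionally need an opener configured with whatever HTTP authentication the server expects.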
 
> At worst, it doesn't work and I'm stuck with running a re-index operation.
>
> j.

