Dataverse Solr synchronisation in "Advanced Installation" architecture


Michel Bamouni

May 9, 2018, 10:52:57 AM
to Dataverse Users Community
Hi,

On the Dataverse installation page, I see an "Advanced Installation" architecture with two Solr instances and two PostgreSQL databases, as in the image below:


I want to set up this architecture, but I don't know how to synchronise the solrActive instance with the solrStandBy instance.
I have the same problem with synchronising the two databases.


I'm interested in your feedback on how to ensure data consistency between the active and standby instances.

regards,

Michel

Auto Generated Inline Image 1

Philip Durbin

May 13, 2018, 10:28:41 PM
to dataverse...@googlegroups.com
Hi Michel,

Apologies for the slow response. I created that image years ago based on what I heard the operations team for Harvard Dataverse was doing at the time, but I've never set up a standby Solr server before. In development, I only have a single Solr server running.

You should be aware that in the next release of Dataverse after 4.8.6 we will be requiring an update to Solr from 4.x to 7.x. This is a large enough jump that whatever strategies you may research for 4.x may not work exactly the same for 7.x. I don't know. I haven't researched this myself. Solr is definitely used at scale (at Netflix, for example) so I'm sure someone has worked on these problems before.

No one else from the community has replied with any advice on active and standby Solr instances, so I assume that most people are like me and only run a single Solr server. I would suggest emailing the "solr-user" mailing list about this ( http://lucene.apache.org/solr/community.html#mailing-lists-irc ). I'm subscribed to this list but I don't usually read it because its traffic is so high. If you let me know when you email the list, I can try to reply on the thread and answer any Dataverse-specific questions.

I hope this helps!

Phil

--
You received this message because you are subscribed to the Google Groups "Dataverse Users Community" group.
To unsubscribe from this group and stop receiving emails from it, send an email to dataverse-community+unsub...@googlegroups.com.
To post to this group, send email to dataverse-community@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/dataverse-community/c1d644d9-1698-44a6-9726-ad90285d35c3%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.



Michel Bamouni

May 14, 2018, 7:53:18 AM
to Dataverse Users Community
Hi Phil,

Thanks for your feedback.
I will send an email to the "solr-user" mailing list about synchronising two Solr 7.x instances.

Regards,

Michel




Michel Bamouni

May 14, 2018, 7:58:22 AM
to Dataverse Users Community
Hi Phil,

After looking at the Dataverse 4.8.6 installation, I see Solr 4.6.0 and not Solr 7.x.
So I want to know: does Dataverse 4.8.6 really need Solr 7.x to work properly?


Michel



Philip Durbin

May 14, 2018, 12:37:38 PM
to dataverse...@googlegroups.com
Sorry for the confusion. I meant that Solr 7.x will be required for the *next* version of Dataverse *after* 4.8.6. We've already merged the changes into the "develop" branch, but we have not yet cut a release.


Michel Bamouni

May 15, 2018, 3:07:22 AM
to Dataverse Users Community
Hi Phil,

I understand now.

Thanks,

Michel