What this boils down to, given the current state of the code, is
whether an HttpSolrClient can operate properly against a SolrCloud
cluster. I *think* the answer is "yes" but have not found it
documented anywhere. Whether this is a good way to run a Solr client
is another question. I *think* it
is probably good enough.

If we *required* cloud mode, it would be much easier to set up
and manage our data in Solr. There are a number of management API
calls which only work in cloud mode, which could be used to create and
configure collections instead of manually copying preconfigured cores
into Solr's directory tree. Sharding could be done entirely
within Solr using Time Routed Aliases instead of the way it was done
in 6_x by fiddling with Solr's storage behind its back. (This bypasses
the question of whether sharding is worth the hassle, considering the
way we slice the data vs. the usual distribution of sharded records.)
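
To make the management-API idea a bit more concrete, here is a rough
sketch of the two Collections API requests involved: one CREATE call
for an ordinary collection and one CREATEALIAS call with
router.name=time for a Time Routed Alias. It only builds the request
URLs and sends nothing. The base URL, the collection and configset
name "statistics", the routing field "time", and the yearly
start/interval values are all assumptions chosen for illustration,
not settings taken from DSpace.

```python
from urllib.parse import urlencode

# Assumed location of a SolrCloud node; adjust for a real installation.
SOLR_BASE = "http://localhost:8983/solr"


def collections_api_url(base, action, params):
    """Build a Collections API URL for the given action and parameters."""
    query = {"action": action}
    query.update(params)
    return base + "/admin/collections?" + urlencode(query)


def create_collection_url(base, name, config_name, num_shards=1,
                          replication_factor=1):
    """URL that would create a collection from a configset already in ZooKeeper."""
    return collections_api_url(base, "CREATE", {
        "name": name,
        "collection.configName": config_name,
        "numShards": num_shards,
        "replicationFactor": replication_factor,
    })


def time_routed_alias_url(base, alias, config_name, field,
                          start="NOW/YEAR", interval="+1YEAR"):
    """URL that would create a Time Routed Alias, letting Solr add one
    collection per interval instead of our code managing yearly shards."""
    return collections_api_url(base, "CREATEALIAS", {
        "name": alias,
        "router.name": "time",
        "router.field": field,        # date field used for routing (assumed name)
        "router.start": start,        # start of the first time slice
        "router.interval": interval,  # date-math increment between slices
        "create-collection.collection.configName": config_name,
        "create-collection.numShards": 1,
    })


if __name__ == "__main__":
    # "statistics" and "time" are illustrative names only.
    print(create_collection_url(SOLR_BASE, "statistics", "statistics"))
    print(time_routed_alias_url(SOLR_BASE, "statistics", "statistics", "time"))
```

Either URL could be requested with any HTTP client. Both calls work
only when Solr is running in cloud mode, which is the point of the
paragraph above.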

I think we could find a willing volunteer or two to do the work. What
hasn't been found is enough people willing to discuss the issues that
arise, and the broader questions of the ways in which DSpace uses
Solr.
--
Mark H. Wood
Lead Technology Analyst
University Library
Indiana University - Purdue University Indianapolis
755 W. Michigan Street
Indianapolis, IN 46202
317-274-0749
www.ulib.iupui.edu