upgrading to solr 4.0


kfoley

Jan 18, 2011, 8:25:36 PM
to Blacklight Development
I'm currently running blacklight 2.7 (works fine) and am trying to
upgrade the solr version to 4.0. I tried to follow the info here:
https://github.com/projectblacklight/blacklight/wiki/README_SOLR
making my best guess about where to put the new solr directory. I'm
guessing it didn't work, as I'm getting a catalog controller error
complaining about an RSolr::RequestError (Solr Response: unknown
handler: search). I think it's complaining that it can't find the solr
files?

I put the untar'd solr files within my blacklight app, so instead of
jetty/solr/conf, it's now jetty/example/solr/conf. I also have this
new solr working with my schema.xml, solrconfig.xml and
data-config.xml files, so I know that solr is configured (via
localhost:8983/solr queries).

Assuming it's doable to upgrade solr 1.4 to solr 4.0, am I missing
something to configure blacklight to work with 4.0? Looking in
catalog_controller.rb, I didn't see anything that appeared obvious to
me (I'm a ruby newb).

Thanks,
Karen

Naomi Dushay

Jan 19, 2011, 1:07:32 AM
to blacklight-...@googlegroups.com
Hi Karen,

If you can do http Solr requests directly to your Solr server, and
they work (e.g. http://localhost:8983/solr/select?q=blah )

then the next place to look is in your solr.yml file -- are you
talking to the right Solr instance? Is Solr running when you fire
up Blacklight?


Bill Dueber

Jan 19, 2011, 6:49:07 AM
to blacklight-...@googlegroups.com
If you're seeing "Unknown handler: search" then the first thing I'd check is to make sure that there's a 'search' requestHandler in your solrconfig.xml. Did you maybe copy over schema.xml and not solrconfig.xml?
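
One quick way to run that check, as a minimal sketch rather than anything Blacklight ships: the Python below lists the requestHandler names declared in a solrconfig.xml so you can confirm that 'search' is among them. The function name `request_handler_names` and the trimmed sample config are made up for illustration; a real solrconfig.xml will contain far more.

```python
import xml.etree.ElementTree as ET


def request_handler_names(solrconfig_xml):
    """Return the name attribute of every <requestHandler> in a solrconfig.xml string."""
    root = ET.fromstring(solrconfig_xml)
    return [handler.get("name") for handler in root.iter("requestHandler")]


# Hypothetical, heavily trimmed solrconfig.xml used only for this illustration.
SAMPLE = """<config>
  <requestHandler name="search" class="solr.SearchHandler" default="true"/>
  <requestHandler name="document" class="solr.SearchHandler"/>
</config>"""

names = request_handler_names(SAMPLE)
print("search" in names)
```

To check a real file, read its contents from wherever your Solr conf directory lives and pass the text to the same function.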

--
Bill Dueber
Library Systems Programmer
University of Michigan Library

Jason Ronallo

Jan 19, 2011, 8:04:28 AM
to blacklight-...@googlegroups.com
Karen,
I'm running a Solr 4.0 nightly for one of my Blacklight applications.
You'll need to work through some Solr errors if you try to use the
solrconfig.xml from Blacklight. For instance the luceneMatchVersion
should be set to LUCENE_40 and I think some of the solr libraries
Blacklight uses (unicode tokenizer may be one) will need to be
commented out.

If you have a working Solr you're happy with already, then it is just
a matter of configuring Blacklight as others have suggested.

Jason


Jonathan Rochkind

Jan 19, 2011, 11:32:18 AM
to blacklight-...@googlegroups.com
Alternately you can tell Blacklight to use the Solr request handler of your choice for searches, in blacklight_config.rb, in config[:default_solr_params], :qt. 

Blacklight may still be hard-coded to look for a Solr request handler called 'document'  in other cases though; we need to fix that.
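
To make the effect of :qt concrete, here is a rough sketch (in Python purely for illustration; Blacklight itself sets this in its Ruby config) of the kind of Solr request that results when qt names a handler. The helper name `solr_select_url`, the base URL and the handler name 'search' are assumptions borrowed from earlier in this thread, and it assumes the /select endpoint honors qt as in the setups discussed here.

```python
from urllib.parse import urlencode


def solr_select_url(base_url, handler, query):
    """Build a /select URL that asks Solr to route the query to a named handler via qt."""
    params = {"qt": handler, "q": query}
    return base_url.rstrip("/") + "/select?" + urlencode(params)


print(solr_select_url("http://localhost:8983/solr", "search", "blah"))
```

If that URL works in a browser but Blacklight still fails, the handler name Blacklight sends is the next thing to compare.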

Jonathan

kfoley

Jan 19, 2011, 1:21:36 PM
to Blacklight Development
I finally got Solr4.0 working by adding in the requestHandlers.
Thanks to all for the suggestions. I'm slowly figuring out Solr and
ruby(Blacklight), but it's seeming to be a pretty steep learning curve
for me. One day I hope to be able to provide a different use case
(non library) for Blacklight.

Question for Jason (or anyone else that may have done this too)
I'm wanting to test out the FieldCollapse feature within Blacklight
and saw that you made a post about this and was wondering if you could
give more guidance on how you went about getting this to work,
assuming you ended up getting it to fully work.

-Karen


Jonathan Rochkind

Jan 19, 2011, 1:25:25 PM
to blacklight-...@googlegroups.com, kfoley
Yeah, it's not just you, it is a bit of a steep learning curve in my
opinion, to have to learn both Solr and Rails and Blacklight all at
once. I'd recommend definitely not neglecting the Solr part -- if you
understand how Solr works, understanding how Blacklight interacts with
Solr becomes a lot easier.

And we're continually trying to make Blacklight easier to use for
newbies, so feel free to give us suggestions of things you think could
be improved, whether documentation or the way configuration works or
anything else. [Can't promise we'll implement em of course, but
suggestions welcome.]

Jason Ronallo

Jan 19, 2011, 1:40:02 PM
to blacklight-...@googlegroups.com
Karen,
I do have field collapse working with Blacklight and it was the main
reason to move to Solr 4.0. Just to get it working I've done quite a
bit of overriding of Blacklight in my local app. I think some of what
I've done is more than was necessary as I was figuring out how to make
it work, so hopefully I'll be able to refactor it down to just what's
needed.

If you set group.main=true you'll have an easier time. This means that
grouped results will pretty much be formatted like ungrouped results.
This limitation comes mainly from rsolr(-ext) rather than Blacklight.
This means that you'll only have access to the top matching document
for the group. I've gotten around this limitation by indexing some
aggregation-level data along with each individual document within that
aggregation.

Other issues include pagination. There's no way (at least in the
nightly snapshot I saw) to determine how many groups you'd have for
any search. This means that regular pagination helpers won't work,
since you don't know the total number of "documents" being paginated.
To work around this I've used the per_page + 1 trick where I request
more documents than I intend to show on the page and if there is an
extra record then I know that there is at least one more page.
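
The per_page + 1 trick can be sketched roughly as follows. This is an illustration in Python, not the actual Blacklight override; `fetch(start, rows)` is a hypothetical stand-in for whatever issues the Solr request, and the fake data exists only so the example runs on its own.

```python
def page_of_groups(fetch, page, per_page):
    """Fetch one page of rows plus one extra row to learn whether a next page exists.

    `fetch(start, rows)` is a hypothetical callable standing in for the Solr request.
    """
    start = (page - 1) * per_page
    rows = fetch(start, per_page + 1)   # ask for one more row than will be shown
    has_next = len(rows) > per_page     # the extra row proves another page exists
    return rows[:per_page], has_next


# Stand-in data source: 25 fake groups instead of a real Solr call.
FAKE_GROUPS = ["group-%d" % i for i in range(25)]


def fake_fetch(start, rows):
    return FAKE_GROUPS[start:start + rows]


shown, has_next = page_of_groups(fake_fetch, page=3, per_page=10)
print(len(shown), has_next)
```

This only supports "previous" and "next" links; it still cannot show a total page count.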

In the UI I'm developing collapsed results when clicked on send the
user to a search where the group is atomized into individual documents
with the same faceted search interface available. So things like links
from a document in a search result needs to be overridden to provide
this new behavior where some documents (atomized) link to show views
while clicking on an aggregation continues to take one to a search
results page. I've also added another index view so that collapsed and
uncollapsed results look different enough. We'll hopefully be doing
some testing of this interface to see what works best for our
users. This may mean ditching field collapsing altogether or may
provide me the chance to do more work on making field collapsing
easier to work with.

Jason

Jonathan Rochkind

Jan 19, 2011, 1:52:33 PM
to blacklight-...@googlegroups.com
Yeah, while field collapsing can initially _seem_ like it will be a
great solution to things that Solr can't easily do without it -- my own
personal evaluation at this point is it's still best avoided for
anything that isn't exactly the sort of use case it's originally
intended for (look at its Solr JIRA ticket or wiki page) -- it's still
kinda weird, doesn't work exactly like you'd expect, has performance
implications, etc.

[FieldCollapsing definitely doesn't give you the total number of
collapsed 'pages' you'll get, and probably never will -- the Solr
developers don't see any way to do that without destroying performance,
because right now Solr only 'collapses' documents in the visible page,
it doesn't go through the entire result set and collapse it, which is
what it would need to do to know the total number of post-collapsed items.]

I wouldn't use it as a first resort to your problem, if you can find
another way around it. Which you may not be able to do, Solr isn't great
at certain things. (In many cases, I think the best solution is to
pre-process your records and merge them _before_ you add them to the
Solr index, but that's not easy to do either in a typical library
environment).

What are you actually thinking of using field collapsing for, Karen?

Jason Ronallo

Jan 19, 2011, 2:15:46 PM
to blacklight-...@googlegroups.com
One reason I went with field collapsing rather than merging documents
before indexing is because I didn't want to create aggregated
documents to fit each new project. I initially went down the extract,
transform, and load (ETL) path. I found the ETL approach to be more
cumbersome and fragmenting than I liked. Lots of different Solr
indexes that each get indexed in slightly different ways. I don't want
to have to maintain more than one Solr index if at all possible.
Instead I can have a single Solr index which can appear to have
whatever aggregations make sense for any new application.

Anyone know if there is a way to determine how many unique values are
in a single facet for any arbitrary search that wouldn't be a big
performance problem? I've figured if pagination wasn't already part of
field collapsing, and as Jonathan mentions, would likely never get in
there, that there might not be a way to get this information, but I
figured I'd ask in any case. Since it seems as if it is possible to
group on more than one field (group.field can be specified more than
once), it may be that factor which limits getting pagination to work?
I'm only grouping by one field so far, so maybe there is a way to
calculate how many pages I'd have?

Jason

Jonathan Rochkind

Jan 19, 2011, 2:21:30 PM
to blacklight-...@googlegroups.com, Jason Ronallo
On 1/19/2011 2:15 PM, Jason Ronallo wrote:
> Anyone know if there is a way to determine how many unique values are
> in a single facet for any arbitrary search that wouldn't be a big
> performance problem?

Sadly, there's also no way to do that in Solr, and the Solr developers
seem uninterested in figuring one out, believing that it may not be
possible without performance implications.

Although I have looked at the code a _bit_, and in my _completely_
not-familiar-with-solr not-a-solr-developer opinion, I thought I saw
some ways maybe I could add it in -- but it gets confusing, as there are
3 or 4 different paths faceting can take depending on the nature of
your data. I thought I saw a way to put it in without performance
problems for the strategy Solr uses for facet.method=fc on a
multi-valued field. Which is pretty much always what I have. So if you
feel like doing some Java hacking, you could try to write some Java to
do this, a custom version of the SimpleFacet component. (And/or patch
suggested back to Solr of course). When I looked a bit, it did seem
possible to me, but I could be wrong, and don't really have time to get
into it right now compared to how much I need it.

If you don't need "for an arbitrary search", but just across your
entire corpus, you can do it. But within an arbitrary search, nope.

K Foley

Jan 19, 2011, 2:28:47 PM
to blacklight-...@googlegroups.com
On the solr forums it was suggested to look at FieldCollapse as a solution to my problem.  It appears to be doing what I'd like when I do a ...group=true&group.field=groups&group.limit=9. Of course I won't know a limit in all cases, but adding that gave me what I was hoping for. So I thought I'd see if it could work within Blacklight for demo purposes.

What I'm wanting to do (in a test environment) is create an archiving-like system for lab experiments.  These lab experiments are defined by metadata such that the user can browse/search for them.  These experiments can also belong to 1+ groups and the users want to be presented with a list of groups (via faceted search) that upon expanding them will show the member conditions (experiments).  The database has "condition" (an experiment) as the atomic unit and I was struggling with how to switch the data-config.xml file to not have its main entity be condition, but be group.

A simplified result view would be like:

- GroupAflkdfj
   - Condition_alpha: a listing of its metadata
   - Condition_beta: "  "
   - Condition_gamma: " "
- GroupGLKDlkfj
   - Condition_copper: " "
   - Condition_beta: " "
   - Condition_zinc: " "
- Groupfkdlkjf
   - Condition_gamma: " "
   - Condition_zinc: " "

Jonathan Rochkind

Jan 19, 2011, 2:35:36 PM
to blacklight-...@googlegroups.com, K Foley
Cool, if collapsing works for you, then collapsing works for you.  Yeah, Blacklight can't easily or out of the box handle collapsed search results though, it'll take some rails development to make it do so.

Another possible solution (which may or may not work for you) instead of using collapsing. Have your initial screen be a facet display for a field you keep 'group' in, faceting on groups.

However, this isn't going to be great either, because you can't really do sophisticated searches within a 'facet'. But if you don't have that many groups, and all you need to do is show a list of all the groups at once, it could work. 

Yet another option would be to put BOTH documents for groups and documents for experiments in your index, but with entirely different fields.  Then allow two searches, over groups, or over experiments. Once you've identified an experiment or experiments you are interested in, your app would have to do a second Solr query to then find all the experiments in that group.  However, you can do that query on 10 or 20 groups at a time, if you want to fetch all the experiments for all the groups on a page.  This strategy won't be too hard to fit into Blacklight, I think -- but might not support certain sophisticated kinds of queries, depending on what you need.
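
As a rough sketch of that second query, the Python below builds one Solr filter that matches experiments belonging to any of several groups at once, so a page of 10 or 20 groups needs only one extra request. The function name `group_filter`, the field name `groups` and the sample values are assumptions for illustration, and it assumes the field is indexed as an untokenized string so that quoted values match exactly.

```python
def group_filter(field, group_values):
    """Build one Solr filter query matching documents in any of the given groups."""
    def quote(value):
        # Escape backslashes and double quotes so the value is safe inside a quoted phrase.
        return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'

    return "%s:(%s)" % (field, " OR ".join(quote(v) for v in group_values))


# Hypothetical group names standing in for the groups shown on one results page.
print(group_filter("groups", ["Group A", "Group B"]))
```

The resulting string would go in an fq (or q) parameter of the follow-up request, and the app would then sort the returned experiments under their groups.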

Solr is in general not great at hierarchical data.

Jonathan Rochkind

Jan 19, 2011, 2:41:22 PM
to blacklight-...@googlegroups.com, K Foley
Oh, but, I'm also not sure that Collapsing really WILL work for you, although the folks on the solr list might (or might not :) ) know better than me.

My impression of collapsing was that it only collapsed items in the current page.  So if a condition for Group1 appeared on page 1, and another condition for Group1 appeared way off on page 100 or something, that second one wouldn't actually appear in your collapsed group.  Which would not do what you need.

Maybe try things out in pure Solr first, just using the Solr HTTP api by hand, to make sure collapsing in Solr really does what you want, before touching the (potentially non-trivial) task of getting Blacklight to accommodate it.
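
For trying it by hand, a small helper like the one below can assemble the grouping URL to paste into a browser or curl. It is only a sketch: the function name `grouping_url`, the base URL and the query term are assumptions, while the group, group.field and group.limit parameters are the ones Karen mentioned above.

```python
from urllib.parse import urlencode


def grouping_url(base_url, query, field, limit):
    """Build a Solr /select URL that requests grouped (field-collapsed) results as JSON."""
    params = [
        ("q", query),
        ("group", "true"),             # turn result grouping on
        ("group.field", field),        # field whose values define the groups
        ("group.limit", str(limit)),   # how many documents to return per group
        ("wt", "json"),
    ]
    return base_url.rstrip("/") + "/select?" + urlencode(params)


print(grouping_url("http://localhost:8983/solr", "zinc", "groups", 9))
```

Comparing the groups returned for a few known queries against what you expect is a cheap way to see whether grouping behaves the way you need.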

Jason Ronallo

Jan 19, 2011, 2:56:42 PM
to blacklight-...@googlegroups.com
Field collapsing doesn't work exactly like Google domain collapsing
does. So in Google you can have only 7 main results on the first page
with 3 results nested under one or more results as long as all 10
documents are relevant enough to be on the first page. That obviously
can happen very quickly.

What Solr does is allow you to ask for the number of groups (as rows)
you want returned. So if you have more than 10 aggregations and you
request 10 groups, it will return 10 groups. In the group.main
representation you only get the first matching document back. This
means it is able to be parsed by most Solr response parsers. Getting
back grouped results has a completely different syntax. Using that new
grouping syntax you could get back 10 groups and then display more
than one result within each group. Exactly how the relevance of groups
and documents within groups is determined, I'm not certain of yet. So
the first document within a group may be very relevant, while the
second document may be much less relevant in the current search
context--I don't know. Certainly each document in the group must show
up somewhere in the search results, but may not be as relevant. Does
relevancy with grouping work like that? I don't know.

The implementation of field collapsing in current Solr trunk is
different than the patches to previous versions of Solr. It may be
that those previous patches worked more like Google's collapsing does.

Jason

K Foley

Jan 19, 2011, 2:57:33 PM
to Jonathan Rochkind, blacklight-...@googlegroups.com
Yeah, that's actually been a frustration of mine with solr documentation.  I read the documentation for a lot of it and see at the start that a particular feature sounds like it will work for what I want to do, but then by the time I'm done reading the full page I'm scratching my head saying, "wait, does this actually do what it said at the beginning?"  I've also found that the book everyone keeps referring to isn't as helpful to me as it was for others.  Perhaps it's a lack of understanding of the details of search terms/definitions.  I will keep trying, though, as I believe both Solr and Blacklight are a marked improvement over what our users currently use.  Besides, I don't see what I'm ultimately wanting to do as being any different from, say, Netflix or Zappos (two names I keep seeing tossed around as users of Solr).

If I haven't said it before, Thanks to all the Blacklight contributors, it is a really good app and I see its potential for many use cases.

Naomi Dushay

Jan 19, 2011, 4:56:10 PM
to blacklight-...@googlegroups.com
I readily admit my ignorance on field collapsing, but am wondering if
this is what we think of as "hierarchical facets".

Check out call numbers and pub dates at

http://searchworks.stanford.edu

( you have to select a top level facet value to see what's underneath,
in our implementation).


If this is the desired behavior, I can provide more details. I
believe there is now a "hierarchical facets" patch in Solr, if it isn't
included already in 4.0

- Naomi

K Foley

Jan 19, 2011, 6:35:33 PM
to blacklight-...@googlegroups.com
Here is a snippet of what the FieldCollapse produces when I do:
http://localhost:8983/solr/select?wt=json&indent=true&fl=id,name,groups&q=Zn&group=true&group.field=groups&group.limit=70

*I set group.limit=70 because I know one of the groups has 70 conditions (which I'm not including in the snippet ;)  )

"grouped":{
  "groups":{
    "matches":188,
    "groups":[{
        "groupValue":"zinc concentrations set II",
        "doclist":{"numFound":4,"start":0,"docs":[
            {
              "id":"625",
              "name":"Zn_0.000_vs_NRC-1d.sig",
              "groups":["zinc concentrations set II"]},
            {
              "id":"626",
              "name":"Zn_0.005_vs_NRC-1d.sig",
              "groups":["zinc concentrations set II"]},
            {
              "id":"627",
              "name":"Zn_0.010_vs_NRC-1d.sig",
              "groups":["zinc concentrations set II"]},
            {
              "id":"628",
              "name":"Zn_0.015_vs_NRC-1d.sig",
              "groups":["zinc concentrations set II"]}]
        }},
      {
        "groupValue":"ZnSO4 0.015mM step time series rep-1",
        "doclist":{"numFound":8,"start":0,"docs":[
            {
              "id":"652",
              "name":"ZnSO4_ts_set-1_-005min_vs_NRC-1h1.sig",
              "groups":["ZnSO4 0.015mM step time series rep-1"]},
            {
              "id":"653",
              "name":"ZnSO4_ts_set-1_000min_vs_NRC-1h1.sig",
              "groups":["ZnSO4 0.015mM step time series rep-1"]},
            {
              "id":"654",
              "name":"ZnSO4_ts_set-1_005min_vs_NRC-1h1.sig",
              "groups":["ZnSO4 0.015mM step time series rep-1"]},
            {
              "id":"655",
              "name":"ZnSO4_ts_set-1_010min_vs_NRC-1h1.sig",
              "groups":["ZnSO4 0.015mM step time series rep-1"]},
            {
              "id":"656",
              "name":"ZnSO4_ts_set-1_020min_vs_NRC-1h1.sig",
              "groups":["ZnSO4 0.015mM step time series rep-1"]},
            {
              "id":"657",
              "name":"ZnSO4_ts_set-1_040min_vs_NRC-1h1.sig",
              "groups":["ZnSO4 0.015mM step time series rep-1"]},
            {
              "id":"658",
              "name":"ZnSO4_ts_set-1_080min_vs_NRC-1h1.sig",
              "groups":["ZnSO4 0.015mM step time series rep-1"]},
            {
              "id":"659",
              "name":"ZnSO4_ts_set-1_160min_vs_NRC-1h1.sig",
              "groups":["ZnSO4 0.015mM step time series rep-1"]}]
        }},
      ....

For the doclist, I only have it returning the id, name and groups for testing purposes.  In reality there would be much more info 
there (the metadata).
You'll notice that it says, "numFound" then displays that number of "docs" for that groupValue. This is similar to what I'm
wanting to do in the UI - have the groupValue be clickable (expand/collapse) such that depending on the state it will
display/hide the member conditions (docs) to the user.
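
To show how a UI might consume this, here is a small sketch that walks a grouped response and produces the group-then-conditions outline. It is Python for illustration only (a Blacklight view would do this in Ruby); the function name `groups_to_tree` is made up, and the sample is a trimmed copy of the snippet above with only two of the four documents, as if group.limit=2 had been used.

```python
import json


def groups_to_tree(response, field):
    """Return (groupValue, numFound, [doc names]) for each group in a Solr grouped response."""
    tree = []
    for group in response["grouped"][field]["groups"]:
        doclist = group["doclist"]
        names = [doc["name"] for doc in doclist["docs"]]
        tree.append((group["groupValue"], doclist["numFound"], names))
    return tree


# Trimmed copy of the response above; numFound stays 4 although only 2 docs are included.
SAMPLE = json.loads("""
{"grouped": {"groups": {"matches": 188, "groups": [
  {"groupValue": "zinc concentrations set II",
   "doclist": {"numFound": 4, "start": 0, "docs": [
     {"id": "625", "name": "Zn_0.000_vs_NRC-1d.sig", "groups": ["zinc concentrations set II"]},
     {"id": "626", "name": "Zn_0.005_vs_NRC-1d.sig", "groups": ["zinc concentrations set II"]}]}}
]}}}
""")

for value, num_found, names in groups_to_tree(SAMPLE, "groups"):
    print("- %s (%d)" % (value, num_found))
    for name in names:
        print("   - " + name)
```

Because numFound is reported per group, the UI can show the full member count even when fewer documents are returned for that group.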


I, too, am not 100% certain of what FieldCollapsing is doing, but just wanted to show you an example of what (I think) it can do.  Is this a similar effect to what you are doing with the call numbers and pub dates?

-Karen


Naomi Dushay

Jan 19, 2011, 7:05:08 PM
to blacklight-...@googlegroups.com
Karen,

I *think* the distinction is:

hierarchical facets, when selected, determine the set of your search results

grouping  takes your search results and groups them.

- Naomi

Chris Beer

Mar 28, 2011, 2:28:46 PM
to blacklight-...@googlegroups.com
Apologies for resurrecting a dead thread, but I’ve started a solr 4 blacklight-jetty branch (mainly to run the tests against).

Branch: https://github.com/projectblacklight/blacklight-jetty/tree/solr-4
Diff: https://github.com/projectblacklight/blacklight-jetty/compare/master...solr-4

The majority of the Blacklight tests passed using stock Blacklight, however there were some spellchecking failures (which might actually be bad tests — the intended behavior isn’t entirely clear to me)

To run the Blacklight tests, I had to self-compile SolrMarc using the latest SolrJ library. I can’t remember off-hand if the embedded solr feature worked, or if I had to run against the http endpoint.

I tried to leave the out-of-the-box solr config alone as much as possible and just add in the appropriate Blacklight configuration. I wasn’t entirely successful in this attempt, but should do better next time (and, note to self, also commit the stock Solr configs for ease-of-diffing later)

https://github.com/projectblacklight/blacklight-jetty/blob/solr-4/solr/development-core/conf/solrconfig.xml
https://github.com/projectblacklight/blacklight-jetty/blob/solr-4/solr/development-core/conf/schema.xml

The two major differences are:

  1. using the solr multicore configuration with a development core and a test core. I certainly like the multicore configuration better, and if there are no objections would like to proceed with it.
  2. Using the new (in Solr 3.1) ICU tokenizers and filters to replace the  schema.UnicodeNormalizationFilterFactory, schema.CJKFilterFactory,  etc. See http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.ICUTokenizerFactory and http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.ICUFoldingFilterFactory

As the tests pass, I assume nothing terrible happens when I do that. I’d love for someone who actually knows what they are doing with CJK languages or unicode normalization to take a look sometime.


On a related note — hopefully as part of this work, I’ll be able to cobble together some HOWTO documentation about going from a stock Solr config to what Blacklight expects and add it to the wiki.


Chris

Naomi Dushay

Mar 28, 2011, 2:52:40 PM
to blacklight-...@googlegroups.com
Sounds great, Chris.  Comments below.  - Naomi


On Mar 28, 2011, at 11:28 AM, Chris Beer wrote:

> To run the Blacklight tests, I had to self-compile SolrMarc using the latest SolrJ library. I can’t remember off-hand if the embedded solr feature worked, or if I had to run against the http endpoint.

I am in favor of solrmarc using SolrJ over embedded solr.  I believe Jonathan agrees heartily.

> 1. using the solr multicore configuration with a development core and a test core. I certainly like the multicore configuration better, and if there are no objections would like to proceed with it.

The only concern I have is:  is it easier for a Solr newbie to work with cores, or with separate instances?  Is it simply a matter of a good README?

> 2. Using the new (in Solr 3.1) ICU tokenizers and filters to replace the schema.UnicodeNormalizationFilterFactory, schema.CJKFilterFactory, etc.
>
> As the tests pass, I assume nothing terrible happens when I do that. I’d love for someone who actually knows what they are doing with CJK languages or unicode normalization to take a look sometime.

I suspect Bob Haschart would be the best person to do this testing;  we are not yet doing anything with CJK.  And/or perhaps you could ask Erik or the solr-dev list if anyone has done a comparison?

> On a related note -- hopefully as part of this work, I’ll be able to cobble together some HOWTO documentation about going from a stock Solr config to what Blacklight expects and add it to the wiki.

That would be awesome.  Seems a lot of folks struggle with this point.


Jonathan Rochkind

Mar 28, 2011, 3:03:52 PM
to blacklight-...@googlegroups.com
On 3/28/2011 2:52 PM, Naomi Dushay wrote:
>
> I am in favor of solrmarc using SolrJ over embedded solr. I believe
> Jonathan agrees heartily.

Indeed. But to be clear, I think either way SolrMarc is using "SolrJ",
just a question of whether it's in embedded mode, or HTTP mode.

>> On a related note -- hopefully as part of this work, I'll be able to
>> cobble together some HOWTO documentation about going from a stock
>> Solr config to what Blacklight expects and add it to the wiki.

I was thinking about doing this too, perhaps not exactly the same thing
as you, which might make our approaches complementary. What I was
thinking: There are actually a bunch of different choices for how you
set up your solrconfig.xml and corresponding blacklight config, I was
thinking of making a list of "scenarios" with solrconfig.xml set up a
certain way, blacklight config set up the simplest way that works for
that config, and then bonus advanced search config for that scenario too.

kfoley

Mar 28, 2011, 4:49:01 PM
to Blacklight Development


On Mar 28, 12:03 pm, Jonathan Rochkind <rochk...@jhu.edu> wrote:
> I was thinking about doing this too, perhaps not exactly the same thing
> as you, which might make our approaches complementary.  What I was
> thinking: There are actually a bunch of different choices for how you
> set up your solrconfig.xml and corresponding blacklight config, I was
> thinking of making a list of "scenarios" with solrconfig.xml set up a
> certain way, blacklight config set up the simplest way that works for
> that config, and then bonus advanced search config for that scenario too.

+1

I would be very interested in seeing a few different scenarios in the
form as you've described.

Erik Hatcher

Mar 28, 2011, 5:16:14 PM
to blacklight-...@googlegroups.com

On Jan 19, 2011, at 13:52 , Jonathan Rochkind wrote:
> [FieldCollapsing definitely doesn't give you the total number of collapsed 'pages' you'll get, and probably never will -- the Solr developers don't see any way to do that without destroying performance, because right now Solr only 'collapses' documents in the visible page, it doesn't go through the entire result set and collapse it, which is what it would need to do to know the total number of post-collapsed items.]

That's simply not true that it only collapses documents visible in the page. It is some way serious Lucene Collector magic across values of a field to "group" them, and definitely runs over the entire result set from q/fq's.

Note that the feature is really called Field *Grouping*, not collapsing, in case there's any semantic confusions about that. It's explained a little here: <http://wiki.apache.org/solr/FieldCollapsing>
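
A toy model of that behavior, for anyone trying to picture it: the Python below groups an entire ranked result list first and only then takes the first few groups. It is an in-memory sketch under simplifying assumptions (a single-valued group field, groups ordered by their best-ranked document) and says nothing about how Solr's collectors actually do the work; the function name and sample data are hypothetical.

```python
from collections import OrderedDict


def group_then_page(ranked_docs, field, rows, group_limit):
    """Group the whole ranked result list by `field`, then keep the first `rows` groups."""
    groups = OrderedDict()
    for doc in ranked_docs:                      # walk every match, not just one page
        groups.setdefault(doc[field], []).append(doc["id"])
    first_groups = list(groups.items())[:rows]
    return [(value, ids[:group_limit]) for value, ids in first_groups]


# Hypothetical ranked results: the second Group1 document sits far down the list.
docs = (
    [{"id": "a", "group": "Group1"}]
    + [{"id": "x%d" % i, "group": "Other%d" % i} for i in range(50)]
    + [{"id": "z", "group": "Group1"}]
)

print(group_then_page(docs, "group", rows=1, group_limit=10))
```

In this model the low-ranked document "z" still shows up inside Group1 on the first page, which is the difference from collapsing only the documents on the visible page.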

Erik
