we need Solr to reindex Solr ;)


Fuad Efendi

Aug 8, 2011, 6:25:29 PM
to lily-d...@googlegroups.com
Hi,


If I have a few million records and need to reindex only a subset (a specific namespace), Lily will run a MapReduce job that scans all records… isn't it strange that we can't use secondary indexes for that yet? Am I right that this is a performance bottleneck right now? A user asked to reindex 2000 records and it took 4 hours… we need Solr to reindex Solr ;)

Evert Arckens

Aug 9, 2011, 8:27:39 AM
to lily-d...@googlegroups.com
Hi,

The current batch build is used to reindex all records in your repository. Re-indexing a subset is not supported yet.
So it will reindex all of your millions of records.

Regards,
Evert Arckens.
--
Evert Arckens
http://outerthought.org/
Scalable Smart Data
Makers of Kauri, Daisy CMS and Lily

Fuad Efendi

Aug 9, 2011, 9:58:33 AM
to lily-d...@googlegroups.com
Hi Evert,

As a workaround, I can run a simple Solr instance that indexes just the "namespace" of each record (and use the WAL); then, to reindex a subset, I can implement my own MapReduce task. But I don't think Lily supports a "namespace index"; I would need to use a specific "namespace" field (or hack the HBase table)… Am I right?
-Fuad
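
The workaround described above can be sketched roughly as follows. This is only an illustration of the idea (keep a small secondary index from namespace to record IDs, then reindex just those IDs instead of scanning everything). It is not Lily's, Solr's, or HBase's API: `NamespaceIndex`, `reindex_subset`, `load_record`, and `send_to_solr` are all hypothetical names, and an in-memory dict stands in for the separate Solr instance.

```python
from collections import defaultdict


class NamespaceIndex:
    """Hypothetical secondary index: namespace -> set of record IDs.

    In the workaround this role would be played by a separate Solr
    instance; a dict is used here only to keep the sketch runnable.
    """

    def __init__(self):
        self._ids_by_namespace = defaultdict(set)

    def on_record_written(self, record_id, namespace):
        # Assumed to be called for every record create/update
        # (for example, driven by the write-ahead log).
        self._ids_by_namespace[namespace].add(record_id)

    def record_ids(self, namespace):
        # Sorted only to make the iteration order deterministic.
        return sorted(self._ids_by_namespace.get(namespace, ()))


def reindex_subset(index, namespace, load_record, send_to_solr):
    """Reindex only the records of one namespace; return how many were sent.

    load_record and send_to_solr are caller-supplied callables
    (hypothetical stand-ins for the repository read and the Solr update).
    """
    count = 0
    for record_id in index.record_ids(namespace):
        send_to_solr(load_record(record_id))
        count += 1
    return count
```

With something like this, reindexing 2000 records touches 2000 records plus one index lookup, rather than every row in the repository.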


Evert Arckens

Aug 10, 2011, 3:06:26 AM
to lily-d...@googlegroups.com
Hi,

Lily indeed does not support a namespace index, so you would have to put the 'namespace' in a specific field.

Evert.