where/how to adjust chunk overlap?


dan haig

Sep 20, 2017, 11:23:18 AM
to xtf-...@googlegroups.com
I wanted to change the proximity values for searching in our XTF instance. After a few moments of poking around, I remembered there's all this chunk overlap business, and that it lives in conf/textIndexer.conf:

<chunk size="200" overlap="20"/>
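
For anyone who hasn't met these two attributes, here is a rough Java sketch of what overlapping chunks look like. It rests on my own assumptions: that size and overlap are counted in words, and that each chunk starts (size - overlap) words after the previous one. The class and method names are made up for illustration; this is not XTF's actual indexing code.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: hypothetical names, not part of XTF.
final class ChunkSketch {

    /** Returns the [start, end) word ranges covered by each chunk. */
    static List<int[]> chunkRanges(int totalWords, int size, int overlap) {
        if (size <= 0 || overlap < 0 || overlap >= size) {
            throw new IllegalArgumentException("need 0 <= overlap < size");
        }
        List<int[]> ranges = new ArrayList<>();
        int step = size - overlap; // assumed: how far each new chunk advances
        for (int start = 0; start < totalWords; start += step) {
            int end = Math.min(start + size, totalWords);
            ranges.add(new int[] {start, end});
            if (end == totalWords) {
                break; // the last chunk reached the end of the document
            }
        }
        return ranges;
    }

    public static void main(String[] args) {
        // Same values as the <chunk size="200" overlap="20"/> line above.
        for (int[] r : chunkRanges(500, 200, 20)) {
            System.out.println("words " + r[0] + " to " + (r[1] - 1));
        }
    }
}
```

With size 200 and overlap 20, a 500-word document comes out as three ranges (0 to 199, 180 to 379, 360 to 499), so each chunk shares its first 20 words with the tail of the previous one.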


So I changed the overlap from 20 to 50, a little perplexed because I was sure it had been 50 to begin with (which is why I was in there in the first place). Easy, right? But when I ran the indexer, it told me:


  Indexing New/Updated Documents:
    Index: "default"

*** Error: class java.lang.RuntimeException
java.lang.RuntimeException: Index chunk overlap (20) doesn't match config (50)
    at org.cdlib.xtf.textIndexer.XMLTextProcessor.open(XMLTextProcessor.java:581)
    at org.cdlib.xtf.textIndexer.SrcTreeProcessor.open(SrcTreeProcessor.java:142)
    at org.cdlib.xtf.textIndexer.TextIndexer.doIndexing(TextIndexer.java:474)
    at org.cdlib.xtf.textIndexer.TextIndexer.main(TextIndexer.java:339)



So off to the Java files I go, with a torch and machete, as that is about all I have with me when it comes to Java. In WEB-INF/src/org/cdlib/xtf/textIndexer/indexInfo.java I find:


  public final static int defaultChunkOvlp = 50;


So now I'm really confused: if the default is 50 in the Java, why did it work when I had 20 in the conf file, and why does 50 in the conf file gag it? Where else might this be overridden?


.d




Martin Haye

Sep 20, 2017, 11:44:47 AM
to xtf-...@googlegroups.com
Hi Dan,

My guess is that you're trying to do an incremental index. Blow away your old index and build a clean one with the new chunk value and I think it'll work. That said, I can't honestly say I've ever changed the chunk value, so there's a non-zero chance it's broken. But give the clean re-index a try.
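
To make the guess concrete, here is a rough Java sketch of the kind of check that would produce that message. It assumes the index records the chunk overlap it was built with, and that the indexer compares that stored value with the conf value whenever it opens an existing index. The class and method names are invented for illustration; this is not the actual XMLTextProcessor code.

```java
// Illustrative only: hypothetical names, not the real XTF classes.
final class IndexOpenSketch {

    /** Settings assumed to be recorded inside an index when it is first built. */
    static final class StoredInfo {
        final int chunkSize;
        final int chunkOverlap;

        StoredInfo(int chunkSize, int chunkOverlap) {
            this.chunkSize = chunkSize;
            this.chunkOverlap = chunkOverlap;
        }
    }

    /**
     * Mimics opening an index for an indexing run.
     * A null 'stored' stands for "no index on disk yet" (a clean build).
     */
    static StoredInfo open(StoredInfo stored, int confSize, int confOverlap) {
        if (stored == null) {
            // Clean build: the conf values become the index's own values.
            return new StoredInfo(confSize, confOverlap);
        }
        if (stored.chunkOverlap != confOverlap) {
            // Incremental run against an index built with a different overlap.
            throw new RuntimeException("Index chunk overlap (" + stored.chunkOverlap
                + ") doesn't match config (" + confOverlap + ")");
        }
        return stored;
    }

    public static void main(String[] args) {
        StoredInfo existing = open(null, 200, 20);   // index first built with overlap 20
        try {
            open(existing, 200, 50);                 // conf edited, old index still on disk
        } catch (RuntimeException e) {
            System.out.println(e.getMessage());
        }
        StoredInfo rebuilt = open(null, 200, 50);    // old index removed, clean build
        System.out.println("clean build overlap: " + rebuilt.chunkOverlap);
    }
}
```

If that is roughly how it works, the old index would keep insisting on 20 until it is removed, which is why a clean build should accept 50.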

--Martin


dan haig

Sep 20, 2017, 11:59:36 AM
to xtf-...@googlegroups.com
Yep, nailed it, Martin. I removed the index dir and started fresh, and the error went away. I'm now seeing the results I'd expect with the new values I wanted. Thanks!

.d