Hi there,
I'd like to create a simple tool for indexing directories of large text files (or HTML or PDF, which I'd convert first), typically books or articles of thousands of words each.
I'd likely only be doing simple queries for words or phrases, though tokenisation would be nice.
I was wondering whether Bleve would perform well here. It sounds like there's a lot of work in it for updating documents and the like, and since mine is a fixed set of texts I thought the value proposition might be different.
I suppose there are two questions:
1) Do you see anything wrong with using Bleve to index (by word) hundreds of documents, each several thousand words long?
2) Do you think I may be better off writing something myself if I'm not using a large chunk of Bleve's functionality, or is tokenisation/search/etc. a hard problem?
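To make question 2 concrete, here is a rough sketch of what I imagine "writing something myself" would look like: a small positional inverted index in Go, standard library only. The names (Index, Add, SearchPhrase, tokenize) are my own and none of this is Bleve's API. It also assumes everything fits in memory and ignores stemming, stop words, ranking and persistence, which is probably where the hard parts are.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
	"unicode"
)

// Index maps token -> document ID -> positions of that token in the document.
type Index struct {
	postings map[string]map[string][]int
}

func NewIndex() *Index {
	return &Index{postings: make(map[string]map[string][]int)}
}

// tokenize lower-cases the text and splits on anything that is not a letter or digit.
func tokenize(text string) []string {
	return strings.FieldsFunc(strings.ToLower(text), func(r rune) bool {
		return !unicode.IsLetter(r) && !unicode.IsDigit(r)
	})
}

// Add records the position of every token in the document.
func (ix *Index) Add(docID, text string) {
	for pos, tok := range tokenize(text) {
		if ix.postings[tok] == nil {
			ix.postings[tok] = make(map[string][]int)
		}
		ix.postings[tok][docID] = append(ix.postings[tok][docID], pos)
	}
}

// SearchPhrase returns the sorted IDs of documents in which the query
// tokens appear at consecutive positions. A single word is a phrase of length one.
func (ix *Index) SearchPhrase(query string) []string {
	toks := tokenize(query)
	if len(toks) == 0 {
		return nil
	}
	var hits []string
	for docID, starts := range ix.postings[toks[0]] {
		for _, start := range starts {
			if ix.matchesAt(docID, toks, start) {
				hits = append(hits, docID)
				break
			}
		}
	}
	sort.Strings(hits)
	return hits
}

// matchesAt checks that toks[1:] follow the first token, which sits at position start.
func (ix *Index) matchesAt(docID string, toks []string, start int) bool {
	for i := 1; i < len(toks); i++ {
		if !containsInt(ix.postings[toks[i]][docID], start+i) {
			return false
		}
	}
	return true
}

// containsInt relies on positions having been appended in increasing order.
func containsInt(sorted []int, want int) bool {
	i := sort.SearchInts(sorted, want)
	return i < len(sorted) && sorted[i] == want
}

func main() {
	ix := NewIndex()
	ix.Add("moby.txt", "Call me Ishmael. Some years ago, never mind how long precisely.")
	ix.Add("tale.txt", "It was the best of times, it was the worst of times.")
	fmt.Println(ix.SearchPhrase("some years ago"))
	fmt.Println(ix.SearchPhrase("worst of times"))
}
```

In the real tool, Add would be called once per file while walking the directory, with the file path as the document ID.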
Thanks!