Which is better? Several small/medium indexes or one huge/massive index?

Ajk Sanders

Jan 31, 2018, 7:49:39 PM
to foxtrot-search
Which is better?

Several small/medium indexes or one huge/massive index?


Thanks






FoxTrot Engineering

Feb 1, 2018, 4:27:22 AM
to foxtrot...@googlegroups.com
'Ajk Sanders' via foxtrot-search wrote:

>Which is better?
>
>Several small/medium indexes or one huge/massive index?

It depends…

If some of your indexed data is rarely modified (e.g. archives or reference documents) while other data is modified frequently, it is usually wise to keep them in separate FoxTrot indices; updating the index of frequently modified data will then be considerably faster.

The same goes if you index logically distinct sets of data and often know in advance which sets you want to search.

If you have an (i)Mac Pro with many cores and a fast SSD, indexing or updating multiple indices in parallel may also be faster than maintaining a single monolithic index.

In other cases, a single massive index should be an acceptable choice (as long as your hardware is adequate for the volume of data you index). We recently found a bug which currently limits the size of an index to 16 GB (or, more precisely, the size of one of the files inside the .ftindex package); this will be fixed in a later version.

By the way, what exactly is "huge/massive" for you? Of the 1.35 TB of data you mentioned, how much is actually textual data rather than video, images, etc.?


Jérôme - CTM Engineering




Ajk Sanders

Feb 1, 2018, 4:58:28 AM
to foxtrot-search
The 1.3 TB is almost entirely PDFs of textbooks, reference books and other files.
Some are true (text-based) PDFs; others are scans that have been through OCR.

Indexing is set to ignore image, video and audio content (I uncheck those boxes under "Indexed data--Index contents of files").

In total, the indexed files are one folder of 155,580 items totalling 1.37 TB and another folder of 52,400 items totalling 175 GB.

I currently have 14 indexes running.

My biggest index file is currently 24.95 GB. This is from a folder of 80,000 items totalling 364 GB.

I don't know if that makes me a power user or an average user.
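
For a rough sense of scale, the figures above can be turned into an index-to-data ratio. This is only a back-of-the-envelope sketch in Python: it assumes index size grows roughly linearly with the amount of source data (an assumption, not something FoxTrot documents), it assumes decimal units (1 TB = 1000 GB), and the function names are made up for illustration.

```python
# Back-of-the-envelope estimate only; nothing here is a FoxTrot API.
# Assumption: index size scales linearly with the size of the source data.

GB_PER_TB = 1000  # decimal units assumed


def index_ratio(index_gb: float, data_gb: float) -> float:
    """Return index size as a fraction of the source data size."""
    return index_gb / data_gb


def projected_index_gb(data_gb: float, ratio: float) -> float:
    """Project an index size for a folder, given an observed ratio."""
    return data_gb * ratio


# Observed in the post: a 24.95 GB index built from a 364 GB folder.
ratio = index_ratio(24.95, 364)

# Hypothetical projections for the two folders mentioned in the post,
# if each were indexed as a single index.
large = projected_index_gb(1.37 * GB_PER_TB, ratio)
small = projected_index_gb(175, ratio)

print(f"index/data ratio: {ratio:.1%}")
print(f"projected index for the 1.37 TB folder: {large:.1f} GB")
print(f"projected index for the 175 GB folder: {small:.1f} GB")
```

The real ratio will vary with how much extractable text each PDF contains, so treat the projections as order-of-magnitude guesses.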