1. There might be "something by apollo" at export.lcs.mit.edu; "check files
apollo*" in contrib. I have not been able to check this.
2. "maybe just using GNUS' local spool options" was suggested. From what I
have seen (admittedly not much), I doubt this would help much with hundreds
of megabytes.
3. "many professional quality programs" sold by "vendors of search software
for cdrom", with special mentions for "Personal Librarian", and for INDIC
from Emerging Technology, 4760 Walnut Street, Boulder, CO, 80301, USA,
(303) 447-9495. Unfortunately I don't think I can get funding for a
commercial product, so I have not pursued this.
4. Only one book was specifically suggested: "Automatic Text Processing"
by Gerard Salton. It is said to describe many different considerations for
writing a text retrieval package, including the Vector Space Model.
5. The "most popular suggestion" award must definitely go to the "lq-text"
package, by Liam Quin, recently posted in alt.sources (usual hints on
how to find alt.sources archives omitted). It is available for anon ftp
at ftp.cs.toronto.edu in subdirectory pub, file lq-text1.10.tar.Z. It
is said to have "hooks for indexing news", although some work would be
needed (e.g. to avoid indexing uuencoded/shar/etc articles).
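As a rough illustration of the kind of extra work item 5 hints at, here is a minimal sketch of a filter that guesses whether an article is uuencoded or a shell archive, so that it can be skipped before indexing. It is not part of lq-text; the function name looks_like_binary_or_archive and both patterns are my own assumptions, and the heuristics are deliberately crude (a "begin <mode> <name>" line for uuencode, the conventional "This is a shell archive" remark for shar).

```python
import re

# Assumption: uuencoded data starts with a line like "begin 644 filename".
UU_BEGIN = re.compile(r"^begin [0-7]{3,4} \S+", re.MULTILINE)

# Assumption: shar files usually carry a comment such as
# "# This is a shell archive" or "# shar:" near the top. Heuristic only.
SHAR_HINT = re.compile(
    r"^#\s*(This is a shell archive|shar:)",
    re.MULTILINE | re.IGNORECASE,
)


def looks_like_binary_or_archive(article_text: str) -> bool:
    """Guess whether a news article body is uuencoded or a shar archive.

    Hypothetical helper for illustration; returns True when the article
    should probably be skipped by an indexer.
    """
    # Headers and body are separated by the first blank line.
    _head, sep, body = article_text.partition("\n\n")
    if not sep:
        # No blank line found: treat the whole text as the body.
        body = article_text
    return bool(UU_BEGIN.search(body) or SHAR_HINT.search(body))
```

An indexing script could call this on each article file and pass only the articles for which it returns False to the indexer. Multi-part postings whose later parts lack a "begin" line would slip through, so a real filter would need more than this.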
Alex Martelli - CAD.LAB s.p.a., v. Stalingrado 53, Bologna, Italia
Email: (work:) mart...@cadlab.sublink.org, (home:) al...@am.sublink.org
Phone: (work:) ++39 (51) 371099, (home:) ++39 (51) 250434;
Fax: ++39 (51) 366964 (work only), Fidonet: 332/407.314 (home only).