> why doesn't or can't a search engine index all of a file when it's of
> large size?
The answer to this (and also partly to your question about stopwords)
is simple: saving disk space. Well, that used to be the excuse for both
things, but compressing the index and the position pointers within each
inverted file entry will save more space than partial indexing or
stopwording.
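
To give a rough idea of what compressing the position pointers can look
like, here is a minimal sketch in Python. It is only an illustration under
my own assumptions, not the scheme of any particular engine: positions are
stored as gaps (differences between consecutive values) and each gap is
written with a variable-byte code. All function names are made up for this
example.

```python
def to_gaps(positions):
    """Turn a sorted list of positions into gaps between neighbours."""
    gaps, prev = [], 0
    for p in positions:
        gaps.append(p - prev)
        prev = p
    return gaps


def from_gaps(gaps):
    """Inverse of to_gaps: running sum restores the original positions."""
    positions, total = [], 0
    for g in gaps:
        total += g
        positions.append(total)
    return positions


def vbyte_encode(numbers):
    """Encode non-negative ints, 7 payload bits per byte.

    The high bit is set on the last byte of each number.
    """
    out = bytearray()
    for n in numbers:
        groups = [n & 0x7F]          # least-significant 7 bits first
        n >>= 7
        while n:
            groups.append(n & 0x7F)
            n >>= 7
        groups[0] |= 0x80            # mark the final byte of this number
        out.extend(reversed(groups)) # emit most-significant group first
    return bytes(out)


def vbyte_decode(data):
    """Decode a byte string produced by vbyte_encode."""
    numbers, n = [], 0
    for byte in data:
        if byte & 0x80:              # terminator byte: number is complete
            numbers.append((n << 7) | (byte & 0x7F))
            n = 0
        else:
            n = (n << 7) | byte
    return numbers


def compress_postings(positions):
    return vbyte_encode(to_gaps(positions))


def decompress_postings(data):
    return from_gaps(vbyte_decode(data))


positions = [3, 17, 130, 131, 100000]
packed = compress_postings(positions)
restored = decompress_postings(packed)
```

Here the five positions pack into 7 bytes, against 20 bytes if each were
kept as a fixed 32-bit integer. Small gaps are what make this pay off, so
frequent words (the usual stopword candidates) compress best.
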
You should read Witten, Moffat and Bell's book "Managing Gigabytes", which
will provide you with enough technical background to answer these (and
many more) questions for yourself.
Regards, Trevor
<>< Re: deemed!