I'm not sure if this is the exact answer you are looking for, but I hope
it helps.
Brian
I'm not sure whether this feature is available on the Mini, but I use a
regexp to retrieve content only from the site's root folder down to
three folders deep. A similar approach may work?
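
To make the depth idea concrete, here is a rough sketch in Python. It is not the Mini's own pattern syntax (I can't confirm what the Mini accepts), and the names MAX_DEPTH, PATH_PATTERN, and within_depth are made up for illustration. It only shows the kind of regexp that accepts a URL path up to three folders below the site root and rejects anything deeper.

```python
import re
from urllib.parse import urlparse

# Assumed limit, taken from the "three folders deep" idea above.
MAX_DEPTH = 3

# A leading slash, then at most MAX_DEPTH "folder/" segments,
# then an optional file name with no further slashes.
PATH_PATTERN = re.compile(r"^/(?:[^/]+/){0,%d}[^/]*$" % MAX_DEPTH)


def within_depth(url: str) -> bool:
    """Return True if the URL's path is at most MAX_DEPTH folders deep."""
    path = urlparse(url).path or "/"  # treat a bare host as the root folder
    return PATH_PATTERN.match(path) is not None


if __name__ == "__main__":
    for candidate in (
        "http://example.com/",
        "http://example.com/a/b/c/page.html",
        "http://example.com/a/b/c/d/page.html",
    ):
        print(candidate, within_depth(candidate))
```

Note that a last segment without a trailing slash is treated as a file name, so /a/b/c/d passes while /a/b/c/d/ does not. Whatever tool you use, the same pattern would need adapting to its own URL pattern rules.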
There are also free web crawler applications on the web. I won't name
them because they are not all Google products, but one would hopefully
help you get a number of documents. Besides, it's that Google algorithm
that everyone seems to want...
Just ideas that may be useful.