Re-index one directory


Dan Waterloo

Jan 7, 2016, 4:17:27 AM
to DataparkSearch Engine
My DataparkSearch (dps) installation is configured to crawl various websites and index portions of them.

I have updated a single directory on one website, and want to add the new pages to the index.

For example:
www.domain.com/directory/index.html has been updated to contain links to all of the pages (some new) in www.domain.com/directory/.

I would like to re-index just that one directory, or all of the links on the page www.domain.com/directory/index.html.

Is this possible? The index.html was indexed previously, but it now has new links in it, to pages in the same directory. I would prefer NOT to re-scan all of the various other websites. It might also take a week or more before the system gets to this page on its own, and the index.html file might not have expired yet.

Is there a command for the indexer to rescan just the www.domain.com/directory/index.html page, and 1 hop (all of the links on that one page)?

Thanks!

Maxim Zakharov

Jan 8, 2016, 9:28:50 AM
to DataparkSearch Engine
Hi Dan,

You can limit indexer operations to a subdirectory of a site using the -u switch:

./indexer -am -u http://www.domain.com/directory/% -u https://www.domain.com/directory/%

This command reindexes all documents under the specified directory.


You may prefer to do it another way. First, reindex the single page with the new links:

./indexer -amu www.domain.com/directory/index.html

Then index only the new documents found under this directory, plus any expired documents under it (which should be reindexed anyway), using the command:

./indexer -u http://www.domain.com/directory/% -u https://www.domain.com/directory/%
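
The two-step approach above can be wrapped in a small script. This is only a sketch: it reuses the exact indexer invocations quoted in this thread, and the names INDEXER, DRY_RUN, run and reindex_dir are hypothetical helpers made up for illustration, not part of DataparkSearch.

```shell
#!/bin/sh
# Sketch: refresh one directory of a site in two passes, using only the
# indexer command lines quoted in this thread.
# INDEXER, DRY_RUN, run and reindex_dir are hypothetical names.

INDEXER="${INDEXER:-./indexer}"

# run CMD ARGS...: print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# reindex_dir HOST_AND_DIR PAGE
#   HOST_AND_DIR  e.g. www.domain.com/directory/ (no scheme, trailing slash)
#   PAGE          e.g. index.html, the page that gained new links
reindex_dir() {
    dir="$1"
    page="$2"

    # Pass 1: force a re-fetch of the single page that holds the new links.
    run "$INDEXER" -amu "${dir}${page}" || return 1

    # Pass 2: without -a/-m, so only new and expired documents under the
    # directory are fetched, for both the http and https variants.
    run "$INDEXER" -u "http://${dir}%" -u "https://${dir}%"
}
```

Calling `DRY_RUN=1 reindex_dir www.domain.com/directory/ index.html` only prints the two command lines, which is a cheap way to check the URL patterns before letting the indexer touch the database.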



Best regards.


--
You received this message because you are subscribed to the Google Groups "DataparkSearch Engine" group.
To unsubscribe from this group and stop receiving emails from it, send an email to dataparksearc...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.



Dan Waterloo

Jan 18, 2016, 12:52:04 AM
to datapar...@googlegroups.com
Thanks Maxim,

That is exactly what I needed.

Dan
