SNM:
Did you mean to index nodes with Neo4j, or to create a set of nodes within
Neo4j that represents what a spider has crawled outside the Neo instance?
If the latter, following on what Michael recommended: depending on how
dynamic your pages are, a page may have multiple states. If those states
adhere to RESTful architectural principles (a unique URI for each state
representation), you might also add an extra node for each state and link
it to its page. Deep linking and indexing of dynamic content was a big
concern for many people before Google provided guidance on how to do it
properly.
Refs:
http://support.google.com/webmasters/bin/answer.py?hl=en&answer=174993
http://support.google.com/webmasters/bin/answer.py?hl=en&answer=72746#1
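A rough sketch of that model, in case it helps: pages as nodes, links as
relationships carrying the link text, plus an extra node for each state.
Everything named here is an assumption for illustration: the Page label,
the LINKS_TO and HAS_STATE relationship types, the property names, the
shape of the page dict, and the heuristic that a URL with a fragment is a
state of the URL without it. The Cypher uses MERGE and $parameters, which
assumes a recent Neo4j version. The function only builds statements; it
does not talk to a database.

```python
from urllib.parse import urldefrag

# Illustrative Cypher; labels, relationship types and properties are assumptions.
PAGE_QUERY = (
    "MERGE (p:Page {url: $url}) "
    "SET p.title = $title, p.text = $text, p.crawled = $crawled"
)
LINK_QUERY = (
    "MERGE (a:Page {url: $src}) "
    "MERGE (b:Page {url: $dst}) "
    "MERGE (a)-[l:LINKS_TO]->(b) "
    "SET l.text = $text"
)
STATE_QUERY = (
    "MERGE (p:Page {url: $base}) "
    "MERGE (s:Page {url: $state}) "
    "MERGE (p)-[:HAS_STATE]->(s)"
)


def statements_for(page):
    """Return a list of (query, params) pairs for one crawled page.

    `page` is a hypothetical dict with the keys url, title, text, crawled
    and links, where links is a list of (href, link_text) tuples.
    """
    out = [(PAGE_QUERY, {
        "url": page["url"],
        "title": page["title"],
        "text": page["text"],
        "crawled": page["crawled"],
    })]
    for href, link_text in page["links"]:
        out.append((LINK_QUERY, {"src": page["url"], "dst": href, "text": link_text}))
        # Assumed heuristic: a fragment marks a state of the base page.
        base, fragment = urldefrag(href)
        if fragment:
            out.append((STATE_QUERY, {"base": base, "state": href}))
    return out
```

Each spider thread could build these pairs for the page it just fetched
and hand them to whatever executes Cypher for you, for example
session.run(query, params) with the official Python driver. Because the
statements use MERGE, two threads that reach the same URL should not
create duplicate page nodes, provided there is a uniqueness constraint on
the url property.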
If you want to index links that are already stored in a Neo4j database
and make that information available outside Neo, running Cypher queries
and pre-building the results page (façade pattern) might be effective if
your audience wants to see the data. Google uses this principle in their
core search.
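To make the pre-building idea concrete, here is a minimal sketch. The
query, the Page label, the LINKS_TO relationship type, the property names
and the function name are all assumptions for illustration, not a fixed
schema. The rows are assumed to arrive as plain dicts from whatever runs
the query, so the rendering step itself needs no database.

```python
from html import escape

# Hypothetical query; label, relationship type and property names are assumptions.
OUTLINKS_QUERY = (
    "MATCH (p:Page {url: $url})-[l:LINKS_TO]->(q:Page) "
    "RETURN q.url AS url, q.title AS title, l.text AS link_text"
)


def render_results_page(source_url, rows):
    """Build a static HTML page from rows already fetched with OUTLINKS_QUERY.

    Each row is a dict with the keys url, title and link_text. The page is
    built once and can then be served without touching the database.
    """
    items = []
    for row in rows:
        # Fall back from title to link text to the bare URL for the label.
        label = row.get("title") or row.get("link_text") or row["url"]
        items.append('<li><a href="{}">{}</a></li>'.format(
            escape(row["url"], quote=True), escape(label)))
    return "<html><body><h1>Links from {}</h1>\n<ul>\n{}\n</ul></body></html>".format(
        escape(source_url), "\n".join(items))
```

The output can be written to disk or a cache and regenerated whenever the
spider updates the graph, so readers outside Neo only ever see the
pre-built page.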
Cheers!
Duane Nickull
"Now from Berlin!"
***********************************
Technoracle Advanced Systems Inc.
Consulting and Contracting; Proven Results!
i. Neo4J, PDF, Java, LiveCycle ES, Flex, AIR, CQ5 & Mobile
b. http://technoracle.blogspot.com
t. @duanechaos
"Don't fear the Graph! Embrace Neo4J"
On 2012-10-01 6:07 AM, "Michael Hunger" <michael.hun...@neotechnology.com>
wrote:
>You create nodes for the pages and relationships for the links pointing
>to other pages.
>The link text can be stored as a property on the relationships.
>The page text, title and crawl date can be stored as properties on the nodes.
>Michael
>On 01.10.2012 at 13:55, snm wrote:
>> I am working on a multithreaded spider and I want to index links in
>>a Neo4j database. How can I do that automatically?