neo4j-admin import documentation and importing into existing database


Farid

Aug 2, 2017, 2:31:19 PM
to Neo4j
Hi,

Trying to import a very large dataset, I see a huge difference in performance between Cypher and neo4j-import.

Sadly:
1) neo4j-import is already deprecated, yet I can't find any documentation or real examples for its successor, neo4j-admin import.
2) Even the current documentation is limited to a very simple case of two node types and a single relationship between them; I can't find how to import a more complex graph.
3) Import is supported only on an empty database, so there is no way to use it to import a batch of data into an existing database, or to create a complex graph by importing nodes and more complex relationships.

Any solution to these?

Michael Hunger

Aug 2, 2017, 3:28:21 PM
to ne...@googlegroups.com
Hi Farid,

On Wed, Aug 2, 2017 at 11:17 AM, Farid <farid...@geniee.co.jp> wrote:
Hi,

Trying to import very large dataset, I see a huge difference between cypher and neo4j-import

Sadly:
1) It's already deprecated, yet I can't find any documentation nor real examples using its successor: neo4j-admin import

neo4j-admin import has built-in help, but it's mostly just a different script around the neo4j-import code.
 
2) Even current documentation is limited to a very simple 2 nodes types and a simple relation between them, can't find how to use a more complex graph.


 
3) Import is supported only on an empty database, so there is no way to use it to import a batch of data into existing database, or create a complex graph by importing nodes and more complex relations.

Yes, this is currently a limitation, but it is meant to be accommodated in the future.

Michael
 

Any solution to these?


Farid

Aug 3, 2017, 4:34:38 AM
to Neo4j, michael...@neo4j.com
Dear Michael,

Thanks for the answer.


1) It's already deprecated, yet I can't find any documentation nor real examples using its successor: neo4j-admin import
neo4j-admin import has built in help, but it's mostly just a different script around the neo4j-import code.

That's the complaint: it's very much Linux-like, where some documentation can be found in the help, but there is nothing on the website and there are no tutorials.
For people who are used to the tool and have worked with it for a long time, that documentation makes sense, but for people new to the technology it's too abstract and confusing (and I have 20+ years of dev experience).


2) Even current documentation is limited to a very simple 2 nodes types and a simple relation between them, can't find how to use a more complex graph.
here is some other documentation: 

Thanks for the links; I've seen them, as well as some other use cases.
As I understand it, this is still a simple relation between 2 different nodes. What I mean by a more complex graph is something that can be done in Cypher, such as "if/else" comparisons, etc.

I'll give it another shot, even if it means duplicating the data (for cases where multiple IDs are present: for example, a "Content" can have a "Creator ID", "Category ID", and "Size ID", so being able to match more than 2 nodes at a time would be awesome).
  

3) Import is supported only on an empty database, so there is no way to use it to import a batch of data into existing database, or create a complex graph by importing nodes and more complex relations.
Yes, this is currently a limitation, but it is meant to be accommodated in the future.

Thanks, that's great news. Hopefully it's the near future, as this is highly important when working with high-frequency data that needs to be imported in batches.

Sincerely,
Farid

Kamal Murthy

Aug 3, 2017, 5:32:45 PM
to Neo4j, michael...@neo4j.com
Hi Farid,

Item # 2: As I understand it, this is still a simple relation between 2 different nodes. What I mean by a more complex graph is something that can be done in Cypher, such as "if/else" comparisons, etc.

You can use the FOREACH construct to create nodes based on some conditions. This is similar to "if/else".
I use this a lot and you can check the code in my blog: 

Let me know if this helps.

-Kamal
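
The FOREACH idiom mentioned above is commonly written by iterating over a list that holds one element when the condition is true and none when it is false. The block below is only a sketch of that idea and is not the code from the blog post referred to above. The Cypher is held in a Python string and nothing connects to a database; the labels, property names, relationship type and the `$rows` parameter are all hypothetical. The small helper turns empty strings into `None` so that the `IS NOT NULL` test behaves as intended once the rows are passed as a query parameter.

```python
# Cypher kept as plain text; running it would need a Neo4j driver, which is not shown.
# Content, Creator, CREATED_BY and $rows are hypothetical names used for illustration.
CONDITIONAL_IMPORT = """
UNWIND $rows AS row
MERGE (c:Content {id: row.contentId})
FOREACH (ignored IN CASE WHEN row.creatorId IS NOT NULL THEN [1] ELSE [] END |
  MERGE (u:Creator {id: row.creatorId})
  MERGE (c)-[:CREATED_BY]->(u)
)
"""


def to_params(rows):
    """Replace empty strings with None so Cypher sees them as null."""
    cleaned = [
        {key: (value if value != "" else None) for key, value in row.items()}
        for row in rows
    ]
    return {"rows": cleaned}


if __name__ == "__main__":
    print(CONDITIONAL_IMPORT)
    print(to_params([{"contentId": "c1", "creatorId": ""}]))
```

This applies to Cypher-based loading only; it does not change what the import tool can do.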

Farid

Aug 4, 2017, 6:16:05 AM
to Neo4j, michael...@neo4j.com
Dear Kamal,

Thank you for the answer.

Pardon my incompetence if I somehow misunderstood, but what I see in your example is Cypher queries, which aren't the problem (I could meet my needs by using CASE, but FOREACH also seems to work, thanks).

The problem I'm having is rather whether this can be done with neo4j-import (or neo4j-admin). Without that, neo4j-import would be limited to very simple DBs, as we can't reproduce Cypher scripts with it.

If I'm mistaken and it can be done, any pointer on how would be awesome.

Sincerely,
farid

Michael Hunger

Aug 4, 2017, 6:29:18 AM
to Farid, Neo4j, Mark Needham, Max De Marzi Jr.
Hi,

On Thu, Aug 3, 2017 at 10:34 AM, Farid <farid...@geniee.co.jp> wrote:
Dear Michael,

Thanks for the answer.


1) It's already deprecated, yet I can't find any documentation nor real examples using its successor: neo4j-admin import
neo4j-admin import has built in help, but it's mostly just a different script around the neo4j-import code.

That's the complaint: it's very much Linux-like, where some documentation can be found in the help, but there is nothing on the website and there are no tutorials.
For people used to the tool and have been working with it for long, that documentation does make sense, but for people new to the technology, it's too abstract and confusing (and I have 20 years+ of dev experience).

I will send the feedback to our docs team. 

What kind of docs would be more helpful? Do you have good examples for this kind of tool?


2) Even current documentation is limited to a very simple 2 nodes types and a simple relation between them, can't find how to use a more complex graph.
here is some other documentation: 

Thanks for the links, that I've seen as well as some other use case.
As I understand it, this is still a simple relation between 2 different nodes. What I mean by a more complex graph is something that can be done in Cypher, such as "if/else" comparisons, etc.

I'll give it another shot, even if it means duplicating the data (for cases where multiple IDs are present: for example, a "Content" can have a "Creator ID", "Category ID", and "Size ID", so being able to match more than 2 nodes at a time would be awesome).

It is not limited to 2 nodes and 1 relationship; you can import many different node types and relationship types at the same time.
It's just that a relationship is always between two node IDs. In your case you would have one rel for content->creator, one for content->category, and one for device->size (although I would store the size as a property, to be honest).

There are no conditionals in the import tool.

3) Import is supported only on an empty database, so there is no way to use it to import a batch of data into existing database, or create a complex graph by importing nodes and more complex relations.
Yes, this is currently a limitation, but it is meant to be accommodated in the future.

Thanks, that's great news. Hopefully it's the near future, as this is highly important when working with high-frequency data that needs to be imported in batches.

I hear you.

I still want to see how we can improve your Cypher import performance, so it would be good to get a realistic sample of data plus your existing Cypher import scripts.

The other option for such a dedicated use case would be to create a custom importer using the Neo4j Java APIs, which give you full control over concurrency, batching, and queuing, like here: https://maxdemarzi.com/2016/09/26/custom-importers/

Sincerely,
Farid
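
Since a relationship in the import tool always connects two node IDs and the tool has no conditionals, one way to handle a "Content" row carrying several foreign IDs is to split it into separate node and relationship files beforehand and apply any "if/else" logic in that preprocessing step. The following is a minimal sketch under stated assumptions: the sample rows, file names, labels and relationship types are invented for illustration, while the header fields (`:ID`, `:LABEL`, `:START_ID`, `:END_ID`, `:TYPE`, with ID spaces in parentheses) follow the import tool's CSV header format.

```python
import csv
import io

# Hypothetical flat source rows; all field names and values are made up.
ROWS = [
    {"content_id": "c1", "title": "Intro", "creator_id": "u1", "category_id": "k1"},
    {"content_id": "c2", "title": "Guide", "creator_id": "u1", "category_id": ""},
    {"content_id": "c3", "title": "Notes", "creator_id": "", "category_id": "k2"},
]


def build_import_tables(rows):
    """Split flat rows into one table per node type and relationship type."""
    content = [["contentId:ID(Content)", "title", ":LABEL"]]
    creators = {}    # dicts keep insertion order and drop duplicate IDs
    categories = {}
    created_by = [[":START_ID(Content)", ":END_ID(Creator)", ":TYPE"]]
    in_category = [[":START_ID(Content)", ":END_ID(Category)", ":TYPE"]]

    for row in rows:
        content.append([row["content_id"], row["title"], "Content"])
        # The "if/else" logic lives here, before the import tool ever runs.
        if row["creator_id"]:
            creators[row["creator_id"]] = None
            created_by.append([row["content_id"], row["creator_id"], "CREATED_BY"])
        if row["category_id"]:
            categories[row["category_id"]] = None
            in_category.append([row["content_id"], row["category_id"], "IN_CATEGORY"])

    return {
        "content.csv": content,
        "creators.csv": [["creatorId:ID(Creator)", ":LABEL"]]
        + [[cid, "Creator"] for cid in creators],
        "categories.csv": [["categoryId:ID(Category)", ":LABEL"]]
        + [[cid, "Category"] for cid in categories],
        "created_by.csv": created_by,
        "in_category.csv": in_category,
    }


def to_csv(table):
    """Render one table as CSV text, header row first."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(table)
    return buffer.getvalue()


if __name__ == "__main__":
    for name, table in build_import_tables(ROWS).items():
        print(f"--- {name}")
        print(to_csv(table), end="")
```

Each resulting file would then be handed to the tool through its `--nodes` and `--relationships` options. The exact option syntax differs between Neo4j versions, so the built-in help is the place to check it.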

Kamal Murthy

Aug 4, 2017, 12:50:15 PM
to Neo4j, michael...@neo4j.com
Hi Farid,

This is on a different point:

cases when multiple ids are present, for example, a "Content" can have "Creator ID", "Category ID", "Size ID", so being able to match with more than 2 nodes at a time would be awesome

If I understand correctly, you want to search or filter "Content" based on any combination of "Creator ID", "Category ID", and "Size ID". Here you can use a multidimensional approach (also called context-aware recommendation). I used this approach to recommend restaurants based on users' choices (https://neo4j.com/graphgist/800a57b2-bbd1-40d3-9dee-a00c4ef624e6).

-Kamal