[Neo4j] KeyError in python

Jacopo Farina

Dec 6, 2011, 12:40:09 PM
to Neo4j user discussions
Hi all!
I'm using python embedded in order to classify all the nodes in a neo4j
graph previously labeled with properties.
The graph is about 3.9GB with 7M nodes and 30-40M relationships. I've two
questions:
1- the program worked correctly for hours then crashed suddenly with this
error:
Traceback (most recent call last):
  File "assegnaCategoria.py", line 14, in <module>
    for n in db.nodes:
  File "/usr/local/lib/python2.6/dist-packages/neo4j/__init__.py", line 44, in __getitem__
    return self.get(items)
  File "/usr/local/lib/python2.6/dist-packages/neo4j/__init__.py", line 61, in get
    rethrow_current_exception_as(KeyError)
  File "/usr/local/lib/python2.6/dist-packages/neo4j/util.py", line 76, in rethrow_current_exception_as
    raise ErrorClass(msg)
KeyError: u'Node[9327924]'

2- The program is very slow. I started it at 6 pm and it crashed at ~65% of
the work at 4 am. It only reads the database and never changes it. Is there a
way to make it use the cache intensively? I would put the database in /dev/shm/,
but I have 3GB of RAM and the database is bigger.

The code is this http://codepad.org/leSwqhnc

Cheers,
Jacopo
_______________________________________________
Neo4j mailing list
Us...@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user

Peter Neubauer

Dec 12, 2011, 6:16:40 PM
to ne...@googlegroups.com, Neo4j user discussions
Jacopo,
are you still seeing this? It looks like there are some problems with
the Python engine. Have you tried different machines, etc.?

Cheers,

/peter neubauer

TC CEO of the year - vote for Emil Eifrém!
http://crunchies2011.techcrunch.com/nominate/

Google:neubauer.peter
Skype:peter.neubauer
Phone: +46 704 106975
LinkedIn   http://www.linkedin.com/in/neubauer
Twitter      @peterneubauer

brew install neo4j && neo4j start
heroku addons:add neo4j

Jacopo Farina

Dec 13, 2011, 5:31:57 PM
to ne...@googlegroups.com, Neo4j user discussions
I tried again to run the program and still got the same error, at the same
point. I'm running it on Ubuntu 10.10, but I could try on a pc with Windows
7 and more RAM.
Cheers,
Jacopo

Peter Neubauer

Dec 29, 2011, 4:56:03 AM
to ne...@googlegroups.com
Mms,
I am no Python expert by any means. Maybe running it on different platforms is a good idea. Also, I have been googling for JVM settings with Python, and I think you can provide them when you start the VM.
--
Sent from Gmail Mobile

Jacob Hansson

Jan 2, 2012, 12:27:48 PM
to ne...@googlegroups.com
Hmm..

I *think* that the problem is that you are using an older version of neo4j-embedded, one that did not have an __iter__ method for the nodes. Python falls back to passing numbers in order to __getitem__, which eventually fails when it hits a node id that does not exist.

Try upgrading to the latest version of neo4j-embedded.
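
To make the fallback described above concrete, here is a minimal, self-contained sketch. It does not use neo4j-embedded at all: `LegacyNodes`, `IterableNodes` and `collect` are hypothetical stand-ins, assuming a collection whose ids have a gap. Without `__iter__`, Python calls `__getitem__` with 0, 1, 2, ... and only stops cleanly on `IndexError`, so a `KeyError` for a missing id escapes the loop.

```python
class LegacyNodes(object):
    """Hypothetical stand-in: defines __getitem__ but no __iter__."""

    def __init__(self, existing_ids):
        self._ids = set(existing_ids)

    def __getitem__(self, node_id):
        # A missing id raises KeyError, as in the traceback in this thread.
        if node_id not in self._ids:
            raise KeyError("Node[%d]" % node_id)
        return "Node[%d]" % node_id


class IterableNodes(LegacyNodes):
    """Same collection, but with an explicit __iter__ that skips gaps."""

    def __iter__(self):
        for node_id in sorted(self._ids):
            yield self[node_id]


def collect(nodes):
    """Iterate with a plain for loop; return (items seen, error or None)."""
    seen = []
    try:
        for n in nodes:
            seen.append(n)
    except KeyError as err:
        return seen, err
    return seen, None


if __name__ == "__main__":
    # Id 2 is missing, so the fallback protocol fails there.
    print(collect(LegacyNodes([0, 1, 3])))
    # With __iter__ defined, all three existing nodes are visited.
    print(collect(IterableNodes([0, 1, 3])))
```

If this is what is happening, the loop would fail at the first id that has no node, which would fit a crash at the same point on every run.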

Also: for higher throughput when dealing with large batch operations like this, it is usually a good idea to split the work into multiple transactions. Ideally, each transaction should handle something like 50 000 to 100 000 inserts; past that, performance will degrade. If you refactor the code to use multiple transactions, performance should be significantly better.
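
A rough sketch of the batching idea, under stated assumptions: `batches`, `process_in_batches` and `fake_transaction` are hypothetical names invented for illustration, and `fake_transaction` only records begin/finish events so the example runs without a database. The point is that the iterator is created once and shared, so every new transaction resumes where the previous batch stopped.

```python
from contextlib import contextmanager
from itertools import islice


def batches(iterable, size):
    """Yield lists of up to `size` items from a single shared iterator."""
    it = iter(iterable)  # created once, so each batch resumes where the last ended
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


@contextmanager
def fake_transaction(log):
    """Hypothetical stand-in for a real transaction; it only logs events."""
    log.append("begin")
    try:
        yield
    finally:
        log.append("finish")


def process_in_batches(nodes, handle, size, transaction):
    """Call handle(node) for every node, one transaction per batch."""
    count = 0
    for chunk in batches(nodes, size):
        with transaction():
            for node in chunk:
                handle(node)
                count += 1
    return count


if __name__ == "__main__":
    log, seen = [], []
    total = process_in_batches(range(5), seen.append, 2,
                               lambda: fake_transaction(log))
    print(total, seen, log)
```

Note that each chunk is pulled from the iterator before its transaction opens; whether that is acceptable depends on whether reads need a transaction in the version in use. With neo4j-embedded, the `transaction()` call would presumably be replaced by the library's own transaction block, and `size` set somewhere in the 50 000 to 100 000 range mentioned above.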

Let me know if it works!
jake
--
Jacob Hansson
Phone: +46 (0) 763503395
Twitter: @jakewins

Jacopo Farina

Jan 9, 2012, 5:19:34 AM
to ne...@googlegroups.com
Hi,
so far I have not been able to access the Windows 7 PC, so I have not tried it yet.
I'm using the latest version of Python embedded; in fact, I had some issues using a graph created with an older version of Java embedded (solved by passing allow_store_upgrade="true" when opening the DB).
I tried to split the work into smaller transactions, even though the program is single-threaded and never changes data, but when I start a new transaction I can't resume the iterator. There is probably a workaround to keep the cursor, but I'm not very familiar with Python.

Cheers,
Jacopo


2012/1/2 Jacob Hansson <jacob....@neotechnology.com>