failed assertion on startup

James Turk

Oct 6, 2011, 10:55:11 AM
to superfa...@googlegroups.com
I just started hitting another problem after loading about 20,000 documents. I'm going to attempt to debug it, but I figured you might be able to give some tips on what to look for based on the error I'm seeing.

About 90% of the way through a load of documents I got:
superfastmatch: src/command.cc:124: std::string& superfastmatch::Command::getPayload(): Assertion `registry_->getPayloadDB()->get(toString(payload_id_),payload_)' failed.
Aborted

I was able to restart it and resume the document load, but after another thousand or so documents it happened again.

Now when I try to restart, it fails immediately with the same assertion, so I can't bring the SFM instance back up at all.

superfastmatch@hartford:~$ ./superfastmatch -debug -window_size 30 
2011-10-06T14:51:42.973531Z: [SYSTEM]: ================ [START]: pid=6360
2011-10-06T14:51:42.973793Z: [SYSTEM]: starting the server: expr=127.0.0.1:8080
2011-10-06T14:51:42.973975Z: [SYSTEM]: server socket opened: expr=127.0.0.1:8080 timeout=1.0
2011-10-06T14:51:42.974009Z: [SYSTEM]: listening server socket started: fd=10
superfastmatch: src/command.cc:124: std::string& superfastmatch::Command::getPayload(): Assertion `registry_->getPayloadDB()->get(toString(payload_id_),payload_)' failed.
Aborted

Donovan Hide

Oct 6, 2011, 11:04:33 AM
to superfa...@googlegroups.com
Hi James,

The assertion fires because the payload for a POSTed document couldn't
be found in the payload db while processing the command queue. I'm not
sure why, but it might be worth checking that there are no zero-length
documents in the corpora that you are loading. Obviously, this needs to
be guarded against in code, so it might be a bug.
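
A quick way to check for zero-length documents before loading is to walk
the corpus directory and list the empty files. This is only a rough
sketch: the function name find_empty_documents is made up for
illustration, and treating whitespace-only files as empty is an extra
assumption, not something superfastmatch is known to do.

```python
import os


def find_empty_documents(root):
    """Return sorted paths of files under root that have no content.

    A file counts as empty if it is zero bytes long. Files containing
    only whitespace are also reported (an assumption, see above).
    """
    empty = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.getsize(path) == 0:
                empty.append(path)
                continue
            # Reads the whole file; fine for text documents of modest size.
            with open(path, "rb") as handle:
                if not handle.read().strip():
                    empty.append(path)
    return sorted(empty)
```

Calling find_empty_documents on the directory that holds the corpus
returns the paths worth inspecting or excluding before the next load.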

To restart from scratch you'll have to pass the -reset flag. There are
plenty of bugfixes coming in the next commit, some of which deal with
limits that are hit with thousands of documents and that didn't come up
with corpora that have larger documents but fewer of them. I'm doing my
best to get this all tested and pushed ASAP. The new code is much faster
at loading and associating, so it will be less frustrating with big
corpora if an error is hit.

If you look at the debug log, you might be able to see the actual
document that is failing, and can then check that document on the
file system for any oddities. If you want me to ssh in and do a gdb
session on the server, I'm more than happy to do so!

Cheers,
Donny.
