error (or failure) codes


Márton Makrai

Sep 19, 2016, 6:28:29 AM
to bob-...@googlegroups.com

Dear People,

I would like to ask where I can find out what an error (or failure) code such as the -9 in the following example means:

gridtk@2016-09-15 17:24:30,863 -- INFO: Starting execution of Job 'train-p' (74)
gridtk@2016-09-15 17:41:31,986 -- INFO: Job 'train-p' (74) finished execution with result 'failure (-9)'
gridtk@2016-09-15 17:41:31,988 -- INFO: Stopping task scheduler since there are no more jobs running.
bob.bio.base@2016-09-15 17:41:31,990 -- ERROR: The jobs with the following IDS did not finish successfully: '74'.
<Job: 74 (74)  - 'train-p'> | local - ginny : failure (-9) -- '/home/makrai/tool/python/venv/bin/verify.py' -d '/home/makrai/repo/hunspeech/emLid/babel.py' -p 'energy-2gauss' -e 'mfcc-60' -a 'ivector-cosine' -vvvs '/mnt/store/makrai/work/emLid/spear_babel/' --parallel '8' --sub-task 'train-projector'
gridtk@2016-09-15 17:41:31,991 -- INFO: Contents of output file: '/mnt/store/makrai/work/emLid/spear_babel/gridtk_logs/train-p/train-p.o74'
------------------------------------------------------------

Thanks
Márton Makrai

Manuel Günther

Sep 19, 2016, 9:06:39 PM
to bob-devel
Dear Marton,

As you can see, bob.bio.base relies on GridTK to run experiments in parallel (or on the SGE).
The errors reported here are usually the return codes (exit statuses) of the scripts that were executed. In your case the code was -9, and according to this page: http://stackoverflow.com/questions/18529452/sudden-exit-with-status-of-9 that means your process was killed by the system with signal 9 (SIGKILL).
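
To illustrate where a negative code like this can come from, here is a minimal sketch. It assumes the job is launched through Python's subprocess module on a POSIX system (I have not checked that GridTK does exactly this), where a return code of -N means the child was terminated by signal N. The helper name describe_returncode is made up for this example.

```python
import signal
import subprocess
import sys


def describe_returncode(returncode):
    """Translate a subprocess-style return code into a readable message."""
    if returncode < 0:
        # POSIX convention in Python's subprocess module:
        # a return code of -N means "terminated by signal N".
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = "unknown signal"
        return "killed by signal %d (%s)" % (-returncode, name)
    return "exited with status %d" % returncode


# Demonstration (POSIX only): start a child that would sleep for a minute,
# then kill it the way the system would, with SIGKILL.
child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
child.kill()   # sends SIGKILL on POSIX
child.wait()   # reap the child so that returncode is set
print(describe_returncode(child.returncode))
```

On Linux this should report signal 9 (SIGKILL), which matches the 'failure (-9)' in your log. A process cannot catch SIGKILL, which would also explain why it left nothing in its error file.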

As there is no output in the error file, I assume that it was killed because it ran out of memory. I wouldn't be surprised, as the ivector training requires a lot of memory.

Best wishes
Manuel
