exit codes?


Brian Law

Aug 19, 2015, 4:21:35 AM
to Luigi
what are the exit codes for workers and what do they mean?

Arash Rouhani

Aug 19, 2015, 4:30:01 AM
to Brian Law, Luigi
I'm not sure if this is documented or even implemented. But it would be interesting to sketch out the life cycle of a worker and associate each kind of failure with an exit code. Currently, task failures do not even result in a non-zero exit code. There's some issue in the issue tracker about this ...

/Arash


Brian Law

Aug 19, 2015, 4:30:54 AM
to Luigi, bpl...@gmail.com
I keep getting exit code 1.
Is it just a generic thing?

Arash Rouhani

Aug 19, 2015, 4:37:55 AM
to Brian Law, Luigi
Yea, currently exit codes are not organized or anything. I think they are all 1 actually.

However, you should be able to tell from the content of stdout what went wrong, though you can't do it programmatically.

Brian Law

Aug 19, 2015, 4:40:07 AM
to Luigi, bpl...@gmail.com
I don't receive any Python errors. They all just crash instantly, which is concerning.

Arash Rouhani

Aug 19, 2015, 4:43:54 AM
to Brian Law, Luigi
That's odd. Maybe paste a small example, show the log output, and show how you run it and with which version of Luigi.

Brian Law

Aug 19, 2015, 4:48:41 AM
to Luigi, bpl...@gmail.com
I am triggering from an ipython notebook with the command:

luigi.build(ActiveTask, workers=3, local_scheduler=False, scheduler_host='localhost', scheduler_port=8081)

ActiveTask is a list of tasks.
Each task consists of the execution of some SQL commands via sqlalchemy to a MS SQL Server.

When I set workers=1 it all works, but anything bigger gives me:

INFO:luigi-interface:Worker task TableCheck_1(current=2015-08-19, Server=ProdDB, case_run=TestCase_1) died unexpectedly with exit code 1

Is that enough info?
Is my triggering process wrong?

Arash Rouhani

Aug 19, 2015, 4:52:49 AM
to Brian Law, Luigi
That seems like a bug if it works for workers=1 but not workers=2. But it seems from the log that it is the task that dies? Have you println-debugged your task's run method?

Ramnath Medikonda

May 31, 2016, 6:36:44 PM
to Luigi, bpl...@gmail.com
Was anyone able to figure out a solution for this?

I have an alerting service that goes off if the exit code is non-zero. However, Luigi always returns 0 as the exit code, irrespective of the status of the workflow.

Can someone suggest a way to work around this?

Thanks,
Ramnath M

Arash Rouhani Kalleh

May 31, 2016, 10:06:22 PM
to Ramnath Medikonda, Luigi, bpl...@gmail.com
Luigi exit codes have been supported for quite a while now (since 2.0.0).

Ramnath Medikonda

Jun 2, 2016, 2:50:33 PM
to Luigi, medikond...@gmail.com, bpl...@gmail.com
Hi Arash,

I was trying to use the exit codes, but for some reason when I echo $?, it always prints 0. I tried using sys.exit(luigi.retcodes.retcode().unhandled_exception) but it still prints 0.

Am I doing something wrong, or am I missing something in the way I need to tell Luigi to exit with a certain exit code?


Thanks,
Ramnath M
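
Since the alerting setup described here depends on the process exit status, one way to inspect that status without relying on the shell is to launch the command from Python. This is only a minimal sketch: `run_and_report` is a hypothetical helper, and the child process below is a stand-in that exits with status 4 in place of a real Luigi invocation.

```python
import subprocess
import sys


def run_and_report(cmd):
    """Run cmd, wait for it to finish, and return its exit code."""
    return subprocess.call(cmd)


# Stand-in child process that exits with status 4.
# Replace it with the real command line to check an actual run.
demo_cmd = [sys.executable, "-c", "import sys; sys.exit(4)"]
code = run_and_report(demo_cmd)
print("exit code: %d" % code)
```

An alerting wrapper could then compare `code` against 0 instead of parsing shell output.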

Ramnath Medikonda

Jun 2, 2016, 4:56:29 PM
to Luigi, medikond...@gmail.com, bpl...@gmail.com
Here is the sample:

# Class that logs the processing time of each task
class SuperLuigiTask(luigi.Config, luigi.Task, object):

    @luigi.Task.event_handler(luigi.Event.FAILURE)
    def exit_message(self, exception):
        logger.info(exception)
        sys.exit(luigi.retcodes.retcode().unhandled_exception)


Logs:

medikond.desktop% /apollo/bin/env -e GyroServiceEnhancedClient /apollo/env/GyroServiceEnhancedClient/bin/python2.7 -m luigi --module datalytics_luigi_druid_workflow DruidSpecFileCreation --local-scheduler && echo $?

DEBUG: Checking if DruidSpecFileCreation() is complete
INFO: Informed scheduler that task   DruidSpecFileCreation()   has status   PENDING
INFO: Done scheduling tasks
INFO: Running Worker with 1 processes
DEBUG: Asking scheduler for work...
DEBUG: Pending tasks: 1
INFO: [pid 4313] Worker Worker(salt=610339599, workers=1, host=medikond.desktop.com, username=medikond, pid=4313) running   DruidSpecFileCreation()
INFO: Modifying the spec file ad3_index_task_template.json
ERROR: [pid 4313] Worker Worker(salt=610339599, workers=1, host=medikond.desktop.com, username=medikond, pid=4313) failed    DruidSpecFileCreation()
Traceback (most recent call last):
  File "/apollo/env/GyroServiceEnhancedClient/lib/python2.7/site-packages/luigi/worker.py", line 162, in run
    new_deps = self._run_get_new_deps()
  File "/apollo/env/GyroServiceEnhancedClient/lib/python2.7/site-packages/luigi/worker.py", line 113, in _run_get_new_deps
    task_gen = self.task.run()
  File "datalytics_luigi_druid_workflow.py", line 108, in run
    self.json_index_generate(self.dlp.json_index_input)
  File "datalytics_luigi_druid_workflow.py", line 95, in json_index_generate
    json_data = json.load(new_file, object_pairs_hook=OrderedDict)
  File "/apollo/env/GyroServiceEnhancedClient/lib/python2.7/json/__init__.py", line 290, in load
    **kw)
  File "/apollo/env/GyroServiceEnhancedClient/lib/python2.7/json/__init__.py", line 351, in loads
    return cls(encoding=encoding, **kw).decode(s)
  File "/apollo/env/GyroServiceEnhancedClient/lib/python2.7/json/decoder.py", line 366, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/apollo/env/GyroServiceEnhancedClient/lib/python2.7/json/decoder.py", line 384, in raw_decode
    raise ValueError("No JSON object could be decoded")
ValueError: No JSON object could be decoded
INFO: No JSON object could be decoded
ERROR: Error in event callback for 'event.core.failure'
Traceback (most recent call last):
  File "/apollo/env/GyroServiceEnhancedClient/lib/python2.7/site-packages/luigi/task.py", line 155, in trigger_event
    callback(*args, **kwargs)
  File "/apollo/env/GyroServiceEnhancedClient/lib/python2.7/site-packages/datalytics_luigi_util/datalytics_luigi_super.py", line 36, in exit_message
    sys.exit(luigi.retcodes.retcode().unhandled_exception)
SystemExit: 4
INFO: Skipping error email. Set `error-email` in the `core` section of the luigi config file or override `owner_email`in the task to receive error emails.
DEBUG: 1 running tasks, waiting for next task to finish
INFO: Informed scheduler that task   DruidSpecFileCreation()   has status   FAILED
DEBUG: Asking scheduler for work...
INFO: Done
INFO: There are no more tasks to run at this time
INFO: Worker Worker(salt=610339599, workers=1, host=medikond.desktop.com, username=medikond, pid=4313) was stopped. Shutting down Keep-Alive thread
INFO:
===== Luigi Execution Summary =====

Scheduled 1 tasks of which:
* 1 failed:
    - 1 DruidSpecFileCreation()

This progress looks :( because there were failed tasks

===== Luigi Execution Summary =====

0
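
The log above reports `SystemExit: 4` under "Error in event callback" and the worker then carries on, which suggests the exception raised by `sys.exit` inside the handler was caught and logged rather than ending the process. The sketch below illustrates that pattern in isolation. It is a simplified stand-in and not Luigi's actual code: `trigger_event`, `exit_in_callback`, and `main` are hypothetical names, and catching `BaseException` is an assumption made to reproduce the behaviour seen in the log.

```python
import logging

logger = logging.getLogger("callback-demo")


def trigger_event(callbacks, *args):
    """Call each callback; log and swallow anything it raises."""
    for callback in callbacks:
        try:
            callback(*args)
        except BaseException:  # assumption: broad enough to catch SystemExit too
            logger.exception("Error in event callback")


def exit_in_callback(task, exception):
    # Equivalent to calling sys.exit(4) inside the handler.
    raise SystemExit(4)


def main():
    trigger_event([exit_in_callback], "DemoTask", ValueError("boom"))
    # Reached because the SystemExit never propagated out of trigger_event.
    return 0


exit_status = main()
print("process would exit with %d" % exit_status)
```

Under these assumptions the handler's exit request never reaches the interpreter, so the final status is decided elsewhere.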



Arash Rouhani Kalleh

Jun 2, 2016, 10:30:38 PM
to Ramnath Medikonda, Luigi, Brian Law
Please first consult these two pieces of docs:

http://luigi.readthedocs.io/en/stable/configuration.html#retcode (most important)

http://luigi.readthedocs.io/en/stable/api/luigi.retcodes.html (worth checking out too, but shouldn't be needed)

Cheers,
Arash
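
For readers following the first link: the return codes are set in a `[retcode]` section of the Luigi configuration file and apply to runs started through the Luigi command line. The sketch below builds such a section as text and parses it with the standard library as a quick sanity check. The option names are the ones listed on that documentation page; the numeric values are illustrative, and any distinct non-zero integers should serve the same purpose.

```python
import configparser

# Text that would go into the Luigi configuration file (for example luigi.cfg).
RETCODE_CFG = """\
[retcode]
already_running = 10
missing_data = 20
not_run = 25
task_failed = 30
scheduling_error = 35
unhandled_exception = 40
"""

parser = configparser.ConfigParser()
parser.read_string(RETCODE_CFG)

# Collect the configured codes as integers to confirm the section parses cleanly.
retcodes = {name: parser.getint("retcode", name) for name in parser.options("retcode")}
for name, value in sorted(retcodes.items(), key=lambda item: item[1]):
    print("%s -> %d" % (name, value))
```

The values increase with severity here so that, when several conditions apply at once, the largest code is also the most serious one.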

Ramnath Medikonda

Jun 22, 2016, 7:25:17 PM
to Luigi, medikond...@gmail.com, bpl...@gmail.com
Hi Arash,

Thank you. I was able to incorporate the return codes.

However, I am still unable to get a non-zero return code for a scheduling error. In this case, I expected the return code to be 50, which I explicitly specified, but somehow it returns 0. I suppose this is the scheduling_error retcode, right?

DEBUG: Checking if DruidHadoopIndexingHourly(job_start_date=2016-06-21, job_start_hour=20) is complete
DEBUG: Hash 654a3d8a67a6376988eecf60f3f9f5ba corresponds to task DruidHadoopIndexingHourly(job_start_date=2016-06-21, job_start_hour=20)
WARNING: Will not schedule DruidHadoopIndexingHourly(job_start_date=2016-06-21, job_start_hour=20) or any dependencies due to error in deps() method:
Traceback (most recent call last):
  File "/apollo/env/GyroServiceEnhancedClient/lib/python2.7/site-packages/luigi/worker.py", line 588, in _add
    deps = task.deps()
  File "/apollo/env/GyroServiceEnhancedClient/lib/python2.7/site-packages/luigi/contrib/hadoop.py", line 710, in deps
    return luigi.task.flatten(self.requires_hadoop()) + luigi.task.flatten(self.requires_local())
  File "/apollo/env/GyroServiceEnhancedClient/lib/python2.7/site-packages/luigi/contrib/hadoop.py", line 700, in requires_hadoop
    return self.requires()  # default impl
  File "datalytics_luigi_druid_workflow.py", line 122, in requires
    return [DruidSpecFileCreation(job_start_date=self.job_start_date, job_start_hour=self.start_hour(), job_end_hour=self.end_hour(), engagement_output_path=self.dist_copy_dest())]
  File "datalytics_luigi_druid_workflow.py", line 97, in dist_copy_dest
    return '-'.join([self.engagement_output_path, random_string, self.job_start_date(), self.start_hour()])
TypeError: 'datetime.date' object is not callable

INFO: Skipping error email. Set `error-email` in the `core` section of the luigi config file or override `owner_email`in the task to receive error emails.
INFO: Done scheduling tasks
INFO: Running Worker with 1 processes
DEBUG: Asking scheduler for work...
INFO: Done
INFO: There are no more tasks to run at this time
INFO: Worker Worker(salt=985179584, workers=1, host=medikond.desktop.com, username=medikond, pid=3918) was stopped. Shutting down Keep-Alive thread
INFO: 
===== Luigi Execution Summary =====

Did not schedule any tasks

===== Luigi Execution Summary =====
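
Separately from the return-code question, the traceback above ends in `TypeError: 'datetime.date' object is not callable` at the `self.job_start_date()` call, which indicates that `job_start_date` holds a date value and is not a method. The sketch below shows the attribute being read and converted to text instead of being called. It is a minimal stand-in without Luigi: `HourlyJob`, the example path, and the `random_string` value are hypothetical, and it assumes `start_hour()` returns a string.

```python
import datetime


class HourlyJob(object):
    """Hypothetical stand-in for the task; attributes play the role of parameter values."""

    def __init__(self, engagement_output_path, job_start_date, job_start_hour):
        self.engagement_output_path = engagement_output_path
        self.job_start_date = job_start_date  # a datetime.date value, not a method
        self.job_start_hour = job_start_hour

    def start_hour(self):
        # Assumed to return a zero-padded string such as "20".
        return "%02d" % self.job_start_hour

    def dist_copy_dest(self, random_string):
        # Read the date as an attribute and convert it to text for str.join;
        # self.job_start_date() would raise the TypeError seen in the log.
        return "-".join([
            self.engagement_output_path,
            random_string,
            self.job_start_date.isoformat(),
            self.start_hour(),
        ])


job = HourlyJob("/tmp/engagement", datetime.date(2016, 6, 21), 20)
dest = job.dist_copy_dest("abc123")
print(dest)
```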