Luigi Task fails with output as a dictionary. (Under Docker env)


Stephen Sun

Oct 10, 2021, 9:36:25 PM
to Luigi
Hi,

I have a Luigi Task with 4 output files. I put them in a dictionary and it runs smoothly on my local Mac. However, when I run it in a Docker environment, it fails. With multiple workers it returns "unexpected exit with code -9"; with 1 worker the error message is blank. Could someone please help me out?

class RemoveDuplicates(Task):
    params = DictParameter()

    def requires(self):
        return ExtractTest()

    def output(self):
        return {
            'payslips_dups': LocalTarget(self.params['payslips_dups'], format=luigi.format.Nop),
            'timesheets_dups': LocalTarget(self.params['timesheets_dups'], format=luigi.format.Nop),
            'payslips': LocalTarget(self.params['payslips'], format=luigi.format.Nop),
            'timesheets': LocalTarget(self.params['timesheets'], format=luigi.format.Nop),
        }

    def run(self):
        meta_payslips = pd.read_csv(self.params['metaraw_payslips'])
        meta_timesheets = pd.read_csv(self.params['metaraw_timesheets'])
        cleand_payslips = core.cleaning(meta_payslips).duplicates()
        cleand_timesheets = core.cleaning(meta_timesheets).duplicates()
        with self.output()['payslips_dups'].open('wb') as ofile:
            cleand_payslips[1].to_csv(ofile)
        with self.output()['payslips'].open('wb') as ofile:
            cleand_payslips[0].to_csv(ofile)
        with self.output()['timesheets_dups'].open('wb') as ofile:
            cleand_timesheets[1].to_csv(ofile)
        with self.output()['timesheets'].open('wb') as ofile:
            cleand_timesheets[0].to_csv(ofile)
        del meta_payslips, meta_timesheets, cleand_payslips, cleand_timesheets

Tashrif

Oct 10, 2021, 10:26:37 PM
to Stephen Sun, Luigi
Here are a few thoughts:

1. Did you forget to start a luigid server local to your Docker container?

2. If you run your job with a single worker, you should get the exact error message.

3. Your formatting is hard to read. You may want to share a gist and include the command you use to run the pipeline so we can help better.

-Tashrif

Stephen Sun

Oct 11, 2021, 12:14:02 AM
to Luigi

Hi Tashrif,

Thanks for replying. I've initialised luigid on the local server and I'm able to access the Luigi UI on port 8082. I have 40 tasks in total and I get the error in the middle of the pipeline. At first I thought the container might not have enough memory allocated, but that doesn't seem to be the case.
Here is the command I run: LUIGI_CONFIG_PATH=luigi.cfg python -m luigi --module main GenerateExceptionReports --workers=10 --scheduler-host luigid

May I please know what this "code -9" stands for? Btw, the entire pipeline runs smoothly on my local Mac.
Screen Shot 2021-10-11 at 3.08.32 pm.png

Lars Albertsson

Oct 11, 2021, 3:27:55 AM
to Stephen Sun, Luigi
-9 means that the process was killed with the SIGKILL signal, either by the operating system or by some other process. It is not caused by Luigi. Google for 'exit code "-9"' for more information.
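
For anyone who wants to see where the number comes from, here is a minimal sketch (not Luigi code, just the Python standard library on a POSIX system) that starts a child process, kills it with SIGKILL, and prints the return code. The sleeping child is only a stand-in for a worker process.

```python
import signal
import subprocess
import sys

# Start a child Python process that would otherwise sleep for a minute.
child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])

# Terminate it with SIGKILL, the same signal an out-of-memory killer sends.
child.send_signal(signal.SIGKILL)
child.wait()

# On POSIX, a negative return code -N means "terminated by signal N".
print(child.returncode)                        # -9 on Linux and macOS
print(signal.Signals(-child.returncode).name)  # SIGKILL
```

The negative sign is how Python's subprocess and multiprocessing modules report "terminated by a signal", which is why the worker message says -9 rather than 9.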

It is likely caused by insufficient memory, so check that again. It might for example happen if you run in Kubernetes and the processes in the container allocate more memory than the container's memory limit.
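
One way to check is to log the peak memory of a worker process from inside the task. Below is a rough sketch using only the standard library; `peak_rss_mib` is a helper name made up for this example, and the 50 MiB buffer is just a stand-in for loading a large CSV.

```python
import resource
import sys


def peak_rss_mib():
    """Return this process's peak resident set size in MiB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


before = peak_rss_mib()
buffer = bytearray(50 * 1024 * 1024)  # stand-in for a large DataFrame
after = peak_rss_mib()
print(f"peak RSS before: {before:.0f} MiB, after: {after:.0f} MiB")
```

Calling something like this at the end of run() gives a per-process figure. With --workers=10, up to ten such processes may be alive at once, so their combined usage is what has to fit under the container's limit.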

