How do you set the luigi configuration from within code?


Matthew Lueder

Nov 10, 2020, 7:22:54 PM
to Luigi
Hello,

I have multiple Luigi pipelines on the same machine and would like a separate configuration file for each one. I tried setting the LUIGI_CONFIG_PATH environment variable through os.environ, but this does not work.
My code:

---------------------------------------------------------------------------------------------------------
import os

import luigi


def main(args):
    print('Running search pipeline')

    # Point Luigi at the config file that sits next to this script
    cfg_path = os.path.join(
        os.path.dirname(os.path.realpath(__file__)),
        'luigi.cfg'
    )
    print(cfg_path)  # Produces correct path
    os.environ['LUIGI_CONFIG_PATH'] = cfg_path
    job_array = [Tasks...]
    luigi.build(job_array, workers=1)
---------------------------------------------------------------------------------------------------------

When Luigi tries to resolve parameters, it raises luigi.parameter.MissingParameterException. However, if I set the LUIGI_CONFIG_PATH environment variable outside of Python, it works as expected. What is strange is that when I print os.environ right before instantiating my luigi.Config class, the environment appears to be exactly the same in both cases.

---------------------------------------------------------------------------------------------------------
class Globals(luigi.Config):
    MAS_USERNAME = luigi.Parameter()

print(os.environ)  # LUIGI_CONFIG_PATH appears to be set correctly
global_config = Globals() # Causes luigi.parameter.MissingParameterException
---------------------------------------------------------------------------------------------------------

luigi.cfg:
---------------------------------------------------------------------------------------------------------
[Globals]
MAS_USERNAME=luigi
---------------------------------------------------------------------------------------------------------

I don't know if this is relevant, but I am running this from a Celery worker.

Matthew Lueder

Nov 12, 2020, 10:29:20 AM
to Luigi
It appears this was a Celery issue and not a Luigi one. I fixed it by moving the code that sets the environment variable into celery.py (https://docs.celeryproject.org/en/stable/django/first-steps-with-django.html). My guess is that it has something to do with how Celery uses multiprocessing.
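
One possible explanation, offered as a guess rather than a confirmed diagnosis: if a library reads LUIGI_CONFIG_PATH only once, when it is first imported, then setting the variable afterwards changes os.environ but not the value the library already captured. That would fit both symptoms, since os.environ looks correct yet the config is not picked up, and celery.py runs before the task modules are imported. The sketch below illustrates the mechanism with a hypothetical stand-in module, fake_lib, which is not Luigi's actual code; the config path is also made up.

```python
import importlib
import os
import sys
import tempfile
import textwrap

# Hypothetical stand-in for a library that reads its config path exactly once,
# at import time. This is NOT Luigi's real source, only an illustration.
MODULE_SOURCE = textwrap.dedent("""
    import os
    CONFIG_PATH = os.environ.get("LUIGI_CONFIG_PATH")
""")


def import_fake_lib(module_dir):
    """Write fake_lib.py into module_dir and import it as a fresh module."""
    with open(os.path.join(module_dir, "fake_lib.py"), "w") as fh:
        fh.write(MODULE_SOURCE)
    sys.path.insert(0, module_dir)
    try:
        sys.modules.pop("fake_lib", None)  # force a real re-import
        importlib.invalidate_caches()
        return importlib.import_module("fake_lib")
    finally:
        sys.path.remove(module_dir)


def demo():
    """Return (value seen when set after import, value seen when set before)."""
    os.environ.pop("LUIGI_CONFIG_PATH", None)
    try:
        with tempfile.TemporaryDirectory() as tmp:
            # Case 1: the library is imported first, the variable is set later.
            lib = import_fake_lib(tmp)
            os.environ["LUIGI_CONFIG_PATH"] = "/tmp/pipeline_a/luigi.cfg"
            set_after_import = lib.CONFIG_PATH  # snapshot taken at import time

            # Case 2: the variable is already set when the library is imported.
            lib = import_fake_lib(tmp)
            set_before_import = lib.CONFIG_PATH
        return set_after_import, set_before_import
    finally:
        os.environ.pop("LUIGI_CONFIG_PATH", None)
        sys.modules.pop("fake_lib", None)


if __name__ == "__main__":
    late, early = demo()
    print("set after import: ", late)
    print("set before import:", early)
```

If this is what is happening, setting os.environ['LUIGI_CONFIG_PATH'] before the first `import luigi` in the worker process should behave the same as exporting it in the shell. Luigi also appears to expose luigi.configuration.add_config_path for registering a config file from code; treat that as something to verify against the installed version rather than a confirmed fix.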