Structure of DAG repo


Chris Riccomini

Jan 4, 2016, 3:39:11 PM
to Airflow
Hey all,

I'm curious how people structure their DAG repository (Airbnb in particular).

1) Do you have just one repo?
2) Do the DAGs run in multiple environments (e.g. staging, production, etc.)?
3) What is the folder hierarchy/structure that you guys use?

Cheers,
Chris

cha...@chartboost.com

Jan 4, 2016, 5:18:18 PM
to Airflow
Hi Chris,

Here's how we structure our DAGs and repos:
1) We have a repo for our data pipelines and another repo for plugins.

2) Right now we just have a production environment. In the past we created a separate MySQL instance of the Airflow DB, and we sometimes test things against it by updating airflow.cfg to point to it. But these days we're mostly just using production. You can always clear old jobs, so it's not a big deal to do things in production.

3) We have a /data_pipelines repo, with a /dags folder.

Under the dags folder we have a bunch of different directories, each one representing a group of dags.  So:

dags/folder1
dags/folder2

And then in each folder we have one or more files that contain different dags.

We also have a dags/utils folder where we keep some common files that can be imported into other DAGs via:
from utils.utils import my_util_method
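
As a rough sketch of why that import resolves: it works as long as the dags folder itself is on sys.path, so utils is importable as a top-level package. The snippet below builds a throwaway copy of the layout and loads one DAG file from it. The file name my_dags.py, the body of my_util_method, and both helper functions are made up for illustration, and nothing here depends on Airflow itself.

```python
import importlib.util
import sys
import tempfile
from pathlib import Path

# Hypothetical layout, mirroring the description above:
#   dags/
#     utils/__init__.py
#     utils/utils.py       <- shared helpers
#     folder1/my_dags.py   <- one or more DAG definitions


def build_example_repo(root: Path) -> Path:
    """Create the illustrative dags/ tree under root and return its path."""
    dags = root / "dags"
    (dags / "utils").mkdir(parents=True)
    (dags / "folder1").mkdir()
    (dags / "utils" / "__init__.py").write_text("")
    # Placeholder helper; the real my_util_method is not shown in the thread.
    (dags / "utils" / "utils.py").write_text(
        "def my_util_method():\n"
        "    return 'shared helper'\n"
    )
    # A stand-in for a DAG file that uses the shared helper.
    (dags / "folder1" / "my_dags.py").write_text(
        "from utils.utils import my_util_method\n"
        "RESULT = my_util_method()\n"
    )
    return dags


def load_dag_file(dags: Path, relative_path: str):
    """Import one file from the dags folder with dags/ on sys.path."""
    # Assumption: the dags folder is on sys.path when DAG files are parsed,
    # which is what lets 'from utils.utils import ...' resolve.
    if str(dags) not in sys.path:
        sys.path.insert(0, str(dags))
    target = dags / relative_path
    spec = importlib.util.spec_from_file_location(target.stem, target)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        dags_dir = build_example_repo(Path(tmp))
        dag_module = load_dag_file(dags_dir, "folder1/my_dags.py")
        print(dag_module.RESULT)
```

Note that a package named utils can collide with any other top-level utils module already on the path, so a more specific package name may be safer in a larger setup.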

Cheers,
Charlie
