What is backfill used for?

Hao Ren

Mar 23, 2016, 1:11:42 PM
to Airflow
Hi, 

I am new to Airflow.

When reading the docs, I didn't understand what backfill means.

Is it just for testing? Does it run the DAGs for a given period of time in the past, like a simulation?

If I set schedule_interval="@once", `backfill` does nothing.

Any clarification is highly appreciated.

Hao

Joy Gao

Mar 24, 2016, 3:38:46 AM
to Airflow
A backfill job essentially runs a specified DAG or SubDAG "on demand" rather than being monitored and scheduled by the scheduler. This means that a backfill job doesn't actually use the schedule_interval to determine if or when it should run.
Instead, it looks at the start_date and end_date that you passed in (from the CLI or in the DAG script) and determines whether the run falls within that range. If it does, the DAG starts running.
Note that unlike 'airflow test', 'airflow backfill' actually checks dependencies before running and records all the states in the database.
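
To make the start_date/end_date idea concrete, here is a minimal Python sketch. It is not Airflow's own code: `backfill_command` and `in_backfill_range` are hypothetical helpers, `example_dag` is a made-up DAG id, and treating both endpoints as inclusive is an assumption of the sketch. The command it builds follows the `airflow backfill <dag_id> -s <start> -e <end>` form of the Airflow 1.x CLI; later releases changed the command layout, so check the CLI reference for your version.

```python
from datetime import date
import shlex


def backfill_command(dag_id, start, end):
    """Build the argument list for an Airflow 1.x-style backfill call.

    Hypothetical helper for illustration; it only assembles the command
    and does not run anything.
    """
    return ["airflow", "backfill", dag_id,
            "-s", start.isoformat(),
            "-e", end.isoformat()]


def in_backfill_range(execution_date, start, end):
    """Toy version of the range check described above.

    Assumes both endpoints are inclusive (an assumption of this sketch).
    """
    return start <= execution_date <= end


# "example_dag" is a placeholder DAG id, not one from this thread.
cmd = backfill_command("example_dag", date(2016, 3, 1), date(2016, 3, 7))

# Print the command as it would be typed in a shell.
print(" ".join(shlex.quote(part) for part in cmd))
```

Running the printed command in a shell with Airflow installed would start the backfill; passing `cmd` to `subprocess.run` would do the same from Python.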

Hope it helps!

Hao Ren

Mar 24, 2016, 5:00:38 AM
to Airflow
Thank you. It is clear now. =)