xcom feature - how to populate values for `airflow test`


Marc Limotte

Sep 10, 2015, 10:47:15 AM
to Airflow
The xcom feature is very useful.  Thanks @jlowin for contributing that.

One problem I run into when using xcom is how to test (or run) a task in isolation.  If I run `airflow test DAGID some_task` for a task that expects xcom values from a prior dependency, the task fails because the prior task doesn't run in this test context.

Any suggestions on how to handle this?


marc


Marc Limotte

Sep 10, 2015, 11:04:23 AM
to Airflow
Also, the same question/problem applies to `airflow backfill`.  

Example: Say I have a dag for a particular execution date where 3 out of 5 tasks succeeded and then the next task failed.  I fix the problem and run `airflow backfill` for that dag/execution date, but the xcom values from the earlier tasks are apparently (AFAICT) not available.


marc

Jeremiah Lowin

Sep 11, 2015, 9:29:56 AM
to Airflow
Hi Marc,

Unfortunately, I don't think there's a way to mock an XCom (or any other task dependency). A couple of thoughts:

1. XComs are persisted to the database, so if you `test` the prior task, it should properly store its XCom and make it available when you `test` the subsequent task.
2. If a matching XCom value isn't found, the pull should return None (or potentially a list of Nones, if you passed a list of task_ids), so you could build logic to handle this case explicitly. For example, if your task communicates with its future self (via a "last_updated" value, say), then at least one run (the first one) will encounter a None that has to be handled.
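
A rough sketch of option 2, in case it helps. It assumes the task is a Python callable that receives a task instance and calls `xcom_pull` on it. `FakeTaskInstance`, `upstream_task`, the `last_updated` key, and the default date are all made-up names for illustration; the fake class only stands in for a real task instance so the snippet runs without Airflow.

```python
# Sketch only: FakeTaskInstance, "upstream_task", and "last_updated" are
# hypothetical names, not part of Airflow.


class FakeTaskInstance:
    """Stand-in for an Airflow task instance, exposing only xcom_pull."""

    def __init__(self, stored=None):
        # Maps (task_id, key) -> value, mimicking XComs kept in the database.
        self.stored = stored or {}

    def xcom_pull(self, task_ids=None, key="return_value"):
        # Mirrors the behavior described above: None when nothing matches.
        return self.stored.get((task_ids, key))


def process(ti, default="1970-01-01"):
    """Task body that tolerates a missing upstream XCom."""
    last_updated = ti.xcom_pull(task_ids="upstream_task", key="last_updated")
    if last_updated is None:
        # The upstream task has not stored a value for this run, for example
        # when this task is run in isolation with `airflow test`.
        last_updated = default
    return "processing since " + last_updated
```

With this shape, the task still runs when the upstream task has never been executed for that date, and it picks up the real value once the upstream task has been run (or `test`ed) first, as in option 1.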

I'm a little surprised about the backfill behavior, because backfilled tasks should only run if all dependencies are done. Could you provide a simple test case?

Marc Limotte

Sep 13, 2015, 3:20:22 PM
to Airflow
Thanks, Jeremiah.  I used #2 as a workaround the first time, but #1 is helpful, too.  I wonder if #1 is potentially confusing, because the main docs say that test "will run a task without ... recording its state in the database", yet the XCom is still written.  It helps in my particular scenario, but the inconsistency might confuse people.

About the backfill behavior: I was able to see xcom values available in subsequent runs of the missing tasks (as you expected), so the issue I saw was probably due to some other error on my part.

Thanks for your clarifications on this topic.