Status

State: Draft
Discussion Thread
Vote Thread
Vote Result Thread
Progress Tracking (PR/GitHub Project/Issue Label)
Date Created: 2024-03-05

Version Released
Authors

One significant challenge with Airflow today is that it takes a very optimistic view of DAGs: that they never change in a meaningful way. Of course, the real world necessitates that these DAGs do change. This problem has a couple of layers to it:

  • Airflow does not track the history of a DAG. When a DAG changes, all of Airflow just assumes that the DAG has always looked like it does now.
  • Airflow always runs the latest code of a DAG, even if the change lands partway through a DAG run, or partway through a task when retries are required.

Fixing these shortcomings will be a large undertaking, so I propose we tackle them in stages. Each stage delivers incremental value to the community.

First, one necessary foundational change is to start tracking the details of every Task Instance try. Currently, some Task Instance details are reset when a task is retried or cleared. This causes data loss, and we can’t show the user anything other than task logs about an earlier try. This will be tackled in AIP-64: Keep TaskInstance try history.
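
To illustrate the idea of keeping try history, here is a minimal sketch that snapshots a try's details before they are reset. The class `TaskInstanceSketch`, its fields, and the method `prepare_for_retry` are hypothetical names chosen for this example; they are not Airflow's model or API, and the real design is left to AIP-64.

```python
from dataclasses import dataclass, field, replace
from typing import List, Optional


@dataclass
class TaskInstanceSketch:
    """Hypothetical stand-in for a task instance record (not Airflow's model)."""

    task_id: str
    try_number: int = 1
    state: Optional[str] = None
    hostname: Optional[str] = None
    # Snapshots of earlier tries, oldest first.
    history: List["TaskInstanceSketch"] = field(default_factory=list)

    def prepare_for_retry(self) -> None:
        # Copy the current try's details before they are overwritten.
        # The snapshot carries an empty history to avoid nesting.
        self.history.append(replace(self, history=[]))
        # Reset the per-try fields, as a retry or a clear would.
        self.try_number += 1
        self.state = None
        self.hostname = None


ti = TaskInstanceSketch("extract", state="failed", hostname="worker-1")
ti.prepare_for_retry()
first_try = ti.history[0]  # details of try 1 remain available
```

With a snapshot per try, a UI could show the state and host of each earlier try rather than only its logs.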

Next, we will enhance Airflow’s UI by tracking and displaying historical versions of a DAG. This will allow users to see a DAG’s structure and the code that created it, for any given DAG run, down to the task try granularity. We will accomplish this by allowing Airflow to detect when a DAG has changed and store that version history for later use in the UI. This will only impact the UI - the execution path will remain unchanged in this AIP. This will be tackled in AIP-65: Improve DAG history in UI.
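
One possible way to detect that a DAG has changed is to hash a canonical form of its structure and record a new version only when the hash differs from the latest one. This is a sketch under that assumption; the text above does not specify the detection mechanism, and `dag_structure_hash` and `DagVersionStore` are hypothetical names, not Airflow APIs.

```python
import hashlib
import json
from typing import Dict, List


def dag_structure_hash(tasks: Dict[str, List[str]]) -> str:
    """Hash a DAG structure given as task_id -> list of downstream task_ids."""
    # Sort keys and edges so equivalent structures produce the same digest.
    canonical = json.dumps(
        {task_id: sorted(downstream) for task_id, downstream in tasks.items()},
        sort_keys=True,
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class DagVersionStore:
    """Hypothetical in-memory version history, keyed by dag_id."""

    def __init__(self) -> None:
        self._versions: Dict[str, List[str]] = {}

    def record(self, dag_id: str, tasks: Dict[str, List[str]]) -> int:
        """Store a new version if the structure changed; return the version number."""
        digest = dag_structure_hash(tasks)
        versions = self._versions.setdefault(dag_id, [])
        if not versions or versions[-1] != digest:
            versions.append(digest)
        return len(versions)  # 1-based number of the latest version
```

A DAG run could then be associated with the version number returned at parse time, which is what would let the UI show the structure as it was for that run.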

Finally, we will introduce the concept of a versioned DAG bundle (a collection of DAGs and other files) and allow Airflow to control the DAG bundle version to use for a given task try. This means that a DAG run could continue running on the same DAG code for the entire run, even if the DAG is updated midway through. This will require Airflow to support a different way of finding DAG code - it can no longer simply expect the DAG to be on local disk in all cases. This will be done in a pluggable way, so the ecosystem can evolve as time goes on. This will be tackled in AIP-66: Execution of specific DAG code versions.
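
The pluggable bundle idea can be sketched as an abstract interface plus a run that pins a version when it starts. Everything here is illustrative: `DagBundleSketch`, `InMemoryBundle`, `DagRunSketch`, and their methods are hypothetical names, and the actual interface is left to AIP-66.

```python
from abc import ABC, abstractmethod
from typing import Dict, Optional


class DagBundleSketch(ABC):
    """Hypothetical bundle interface; backends would plug in behind it."""

    @abstractmethod
    def current_version(self) -> str:
        """Return the identifier of the latest bundle version."""

    @abstractmethod
    def fetch(self, version: str) -> Dict[str, str]:
        """Return file path -> source text for the given version."""


class InMemoryBundle(DagBundleSketch):
    """Toy backend that keeps every published version in memory."""

    def __init__(self) -> None:
        self._versions: Dict[str, Dict[str, str]] = {}
        self._latest: Optional[str] = None

    def publish(self, version: str, files: Dict[str, str]) -> None:
        self._versions[version] = dict(files)
        self._latest = version

    def current_version(self) -> str:
        if self._latest is None:
            raise LookupError("no bundle version has been published")
        return self._latest

    def fetch(self, version: str) -> Dict[str, str]:
        return dict(self._versions[version])


class DagRunSketch:
    """Pins the bundle version at creation so every try sees the same code."""

    def __init__(self, bundle: DagBundleSketch) -> None:
        self.bundle = bundle
        self.pinned_version = bundle.current_version()

    def code_for_try(self) -> Dict[str, str]:
        # Always resolve code through the pinned version, not the latest one.
        return self.bundle.fetch(self.pinned_version)
```

Because the run resolves code through its pinned version, publishing a newer bundle midway through does not change what later tries of that run execute; only new runs pick up the update.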

This AIP will be considered done when its "sub-AIPs" are all done: