Note
This page collects opinions from different stakeholders as input to planning the Airflow 3.0 release. It is forked off the discussion on the devlist: [HUGE DISCUSSION] Airflow3 and tactical (Airflow 2) vs strategic (Airflow 3) approach. It is just a set of points; there is no concrete planning yet and nothing is agreed. Please treat this page as a playground to collect items to discuss and notes to share.
As this page is intended to collect thoughts and ideas, please feel free to adjust content and leave your name as vote/stakeholder. The intent is that we can use this space to collaborate between the Airflow maintainer group and interested stakeholders.
If an argument fits into multiple sections, please add it to just one. We might move and re-align content over time; redundancy is not helpful in the first place.
Airflow 2.x Pain Points (Which Require a 3.0)
Note
In the Airflow 2.0 planning (see Airflow 2.0 - Planning [Archived]) a couple of learnings from Airflow 1.x were considered and a plan for 2.0 was made. It was a great effort to carry it over. Times are changing, though: complexity and requirements have shifted since 2.0. In this chapter/table please describe and collect the points that hinder you/us from changing the existing 2.x chain to achieve the needed requirements.
Item / Description Describe your Airflow 2.x pain point that needs to be fixed in 3.0 | Stakeholder backing Please add your name if you agree - so we see how many people share this pain | Open Questions to discuss Please post questions that need to be discussed for understanding of the raised point or clarifications (preventing too many comments, please directly here as text) | Any Tactical Alternatives? Please add/describe any tactical alternative that would be an option to prevent a 3.0 |
---|---|---|---|
LLM/Gen-AI mainly as the important trigger Reiterating the fact that this needs more work, I do believe this can be incremental to Airflow. As Astronomer, we have worked on the LLM Providers which we contributed to Airflow late last year. But clearly, there is so much more to do, both in building awareness of the patterns / templates to use, and in supporting patterns in Airflow that make these easier to use and adopt. |
|
| |
Cloud Native is the "way to go" |
| ||
Need to submit DAGs in other ways than dropping them into a shared DAG folder Different DAG distribution processes |
|
| |
Local testing and fast iteration on developing pipelines |
|
| |
Ability to run tasks within a workflow with "affinity" so that they can share inputs/outputs in shared CPU/GPU memory |
|
| |
Ability to integrate seamlessly with other workflow engines - making Airflow a "workflow of workflows" |
|
| |
Maintaining 700+ Python dependencies across all the complexity of provider packages Because the integrated provider packages form a complex ecosystem, we need to fix CI almost every second day. It would be good to think about modularizing provider packages and code, even while keeping a monorepo, so that dependencies of provider packages can be modeled independently (e.g. keep a venv per provider) rather than in a global Python site-packages tree. This was very visible in the migration to Python 3.12. |
|
| |
First user experience: Most common failures in DAG authoring Many users fail in their first DAG editing (if they do not start with a copy&paste template) because they do not understand the difference between the DAG parsing stage and the execution stage in the Python code. It is not directly visible when "global code" and code between operators/tasks are executed. The compensation is Jinja templating, which in most cases also adds complexity. The TaskFlow approach has made this better in many cases but is not a viable approach for the many operators which just consume "properties" as configuration. |
|
| |
Airflow User Improvements
|
|
| |
Easy adoption of Airflow by new users We have discussed this many times, but we absolutely need to make the individual first-time adoption of Airflow better. I think the most common term I recall here is the notion of "Airflow Standalone", but whatever the term may be, an ultra quick, simple install of Airflow and the getting started experience is something we owe our community. |
|
| |
Integration improvements / Provider maintainability The changes we made as part of Airflow 2.0 to split the Core Airflow releases from the Provider releases were clearly a good choice and made a huge impact. However, integration maintainability balanced with growth still seems like it could use a significant set of improvements. Elad and I spoke about this a couple of days ago as well, and I don't have a clear set of next steps here, but it is definitely worth exploring. | Vikram Koka +Elad? |
|
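The parse-versus-execution confusion described in the "First user experience" row above can be illustrated without Airflow at all. The following is a minimal sketch in plain Python, not Airflow code: `ToyTask`, `parse_dag_file`, and `run_tasks` are hypothetical names invented for this illustration, and the function body of `parse_dag_file` merely plays the role of the module-level code in a DAG file.

```python
"""Toy illustration of DAG parse time vs. task execution time (no Airflow needed)."""

events = []  # records the order in which pieces of code run


class ToyTask:
    """Hypothetical stand-in for an operator: stores a callable, does not run it."""

    def __init__(self, task_id, python_callable):
        self.task_id = task_id
        self.python_callable = python_callable

    def execute(self):
        return self.python_callable()


def parse_dag_file():
    """Plays the role of the module body of a DAG file."""
    # "Global code": runs every time the file is parsed.
    events.append("parse: top-level code")

    def extract():
        # Task body: runs only when the task is actually executed.
        events.append("execute: extract")
        return 42

    # Creating the task object only registers the callable.
    return [ToyTask("extract", extract)]


def run_tasks(tasks):
    """Plays the role of a worker executing the tasks of one run."""
    return {task.task_id: task.execute() for task in tasks}


tasks = parse_dag_file()    # parsing alone never calls extract()
parse_dag_file()            # a second parse repeats only the top-level code
results = run_tasks(tasks)  # only now does the task body run
print(events)
print(results)
```

In this toy model the top-level code runs once per parse, while the task body runs once per execution. That asymmetry is what tends to surprise first-time DAG authors.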
Airflow 3.0 Fundamental (Breaking) Concept Demand / Wish List
Note
Airflow 2.x focused on keeping its Semantic Versioning promise, so we piled up a lot of things (e.g. technical interfaces) which we did not change and marked as "will be cleaner in Airflow 3.0", or where we could not change structures because of our backwards-compatibility promise. Please list the things you have in mind which need to change and are a (mainly technical) reason to spin a 3.0 version. Besides technical items, this should also list fundamental concepts.
Item / Description Describe the Airflow 3.0 Breaking Point that cannot be achieved non-breaking in 2.x. Also try to sketch the "Value" it brings to the user or product. | Describe the Pressure Please describe what the impact would be if we did not go with this, e.g. competitive/comparable products that carry this and Airflow has a gap because of existing 2.x concepts | Stakeholder backing Please add your name if you agree - so we see how many people share this demand | Open Questions to discuss Please post questions that need to be discussed for understanding of the raised point or clarifications (preventing too many comments, please directly here as text) |
---|---|---|---|
DAG / Workflow Support for non-linear complexity. DAGs are only one-way. Fundamentally, this is because loops would call a task multiple times and you would need to keep the context and execution history of each call separate (see also AIP-64: Keep TaskInstance try history). But there is real demand to support "experimental approaches" that call for loops, e.g. attempting to train a network until the desired state is reached. For such cases a DAG is not the right thing; a workaround would be to have a long-running task that calls a second DAG in a loop. | Support non-linear workflows of an experimental character, which cannot be modeled as a DAG. |
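The workaround mentioned in the row above (a long-running task that calls a second DAG in a loop) could look roughly like the following sketch. It is plain Python under stated assumptions, not an Airflow API: `run_until_converged` is a hypothetical task body, and `trigger_and_wait` is an assumed callable that starts one run of the second DAG, blocks until it finishes, and returns a score. How that callable would be implemented against a real Airflow deployment is left open here.

```python
from typing import Callable, Tuple


def run_until_converged(
    trigger_and_wait: Callable[[int], float],
    target: float,
    max_iterations: int = 10,
) -> Tuple[int, float]:
    """Hypothetical body of a long-running task that drives a second DAG in a loop.

    trigger_and_wait(i) is assumed to start run number i of the second DAG,
    wait for it to finish, and return a score (e.g. a validation metric).
    """
    for iteration in range(1, max_iterations + 1):
        score = trigger_and_wait(iteration)
        if score >= target:
            # Desired state reached: stop looping.
            return iteration, score
    # Bound the loop so the outer task cannot run forever.
    raise RuntimeError(f"target {target} not reached after {max_iterations} runs")


def fake_training_run(iteration: int) -> int:
    """Stand-in for the second DAG: the score improves by 25 points per run."""
    return min(100, 25 * iteration)


result = run_until_converged(fake_training_run, target=75)
print(result)  # (3, 75)
```

The sketch also shows why this is only a workaround: the loop state lives inside one task, so the history of the individual iterations is not visible as part of the outer DAG.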
What we must Keep from 2.x Approach
Note
We know that no software is perfect. We always have more wishes. Assuming we are moving to Airflow 3.0, which things from 2.x are a "must have" to keep? List them here so that we do not forget about them.
Item / Description Describe the Airflow 2.x approach or feature we must not break. | Rationale Describe the reason | Stakeholder backing Please add your name if you agree - so we see how many people share this requirement | Open Questions to discuss Please post questions that need to be discussed for understanding of the raised point or clarifications (preventing too many comments, please directly here as text) |
---|---|---|---|
Continuing to have the option of using the many thousands of operators with 90+ providers | If we lost all providers with 3.0, the whole ecosystem would need to start from scratch. This would make Airflow unusable. |
| |
Allowing the scale and complexity of DAGs we have with Airflow 2.x today | Because setups existing today must continue to be supported with Airflow 3.0 as well |
Things we need to consider as "Promise" for Migration
Note
Most of the contributors have been with Airflow since 1.x. If not, then most of us have at least gone through the migration from 1.x to 2.x as a user. In this chapter/area please list the things that need to be assured for a 3.x planning. We know we need to make the user transition from 2.x to 3.0 "easy", assuming that with a 3.0 version we do not want to lose a large user base.
Item / Description Describe the Airflow 2.x approach or feature we must not break. | Rationale Describe the reason | Stakeholder backing Please add your name if you agree - so we see how many people share this requirement | Open Questions to discuss Please post questions that need to be discussed for understanding of the raised point or clarifications (preventing too many comments, please directly here as text) |
---|---|---|---|
With a 3.0 version we move a lot of existing installs out of their comfort zone. Users might be scared to migrate and will stay on a 2.x release for a long time, until all stability and functionality is available in a 3.x branch. We also might lose users, and the effort of migration will lead many to consider moving to other products / solutions | In support channels (Slack/GitHub) we still see a lot of requests coming in for Airflow 1.x and 2.3-2.5; it seems a lot of people are not upgrading regularly. If we release a 3.0 with a lot of breaking changes we might lose a lot of users and installs. |
Ideas for Target Airflow 3.0 Design
Note
In this section please sketch ideas and designs we should consider. This is the most "playground-like" area. Rather drop in more than less; see it as a "brainstorming" field.
Item / Description Describe the Airflow 2.x approach or feature we must not break. | Stakeholder backing Please add your name if you agree - so we see how many people share this requirement | Open Questions to discuss Please post questions that need to be discussed for understanding of the raised point or clarifications (preventing too many comments, please directly here as text) |
---|---|---|