How does Airflow decide to render template values?

I am working with Airflow 2.2.3 in GCP (Composer) and I am seeing inconsistent behavior which I can't explain when trying to use template values.

When I reference the templated value directly, it works without issue:

ts = '{{ ds }}' # results in 2022-05-09

When I reference the templated value in a function call, it doesn't work as expected:

ts_parts = '{{ ds }}'.split('-') # result ['2022-05-09']

The non-function call value is rendered without any issues, so it doesn't have any dependency on operator scope. There are examples here that show rendering outside of an operator, so I expect that not to be the issue. It's possible that Composer has a setting configured so that Airflow applies rendering to all Python files.

Here's the full code for reference (dag.py):

from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG('rendering_test',
    description='Testing template rendering',
    schedule_interval=None,   # only run on demand
    start_date=datetime(2020, 11, 10), ) as rendering_dag:

    # These three assignments run at DAG parse time, before any rendering.
    ts = '{{ ds }}'
    ts_parts = '{{ ds }}'.split('-')
    literal_parts = '2022-05-09'.split('-')

    print_gcs_info = BashOperator(
        task_id='print_rendered_values',
        bash_command=f'echo "ts: {ts}\nts_parts: {ts_parts}\nliteral_parts {literal_parts}"'
    )

I thought that Airflow writes the files to some location with template values, then runs jinja against them with some supplied values, then runs the resulting python code. It looks like there is some logic applied if the line contains a function call? The documentation mentions none of these architectural principles and gives very limited examples.



Solution 1:[1]

Airflow does not render values outside of operator scope. Rendering is part of task execution, which means it happens only when the task is running on a worker (after being scheduled).

In your code, the Jinja string sits in top-level code that is not part of an operator's templated fields, so Airflow treats it as a regular string.

In your case, the function call (os.path.dirname() on '{{ dag_run.conf.name }}', or equivalently .split('-') on '{{ ds }}') is executed on the raw Jinja string before it is rendered.
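The two phases can be imitated in plain Python without Airflow. This is only a minimal sketch: fake_render is a hypothetical stand-in for Jinja rendering (it substitutes a fixed date for {{ ds }}) and is not how Airflow actually renders templates.

```python
def fake_render(template: str, ds: str = "2022-05-09") -> str:
    """Hypothetical stand-in for Jinja: replace the {{ ds }} placeholder."""
    return template.replace("{{ ds }}", ds)


# Parse time: these lines run when the DAG file is imported; nothing is rendered yet.
ts = '{{ ds }}'
ts_parts = '{{ ds }}'.split('-')  # the raw string has no '-', so one element

# Still parse time: the f-string embeds the raw placeholders in the command.
bash_command = f"ts: {ts} ts_parts: {ts_parts}"

# Run time: the templated field is rendered as one whole string.
rendered = fake_render(bash_command)

print(ts_parts)
print(bash_command)
print(rendered)
```

The list only looks rendered because its string form was placed inside the templated bash_command; split('-') itself never saw a date.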

To fix your issue, you need to put the Jinja string in a templated field of the operator.

bash_command=""" echo "path: {{ dag_run.conf.name }} path:  os.path.dirname('{{ dag_run.conf.name }}')" """

Triggering the DAG with {"name": "value"} will give:

[screenshot of the rendered task output]

Note that if you wish to use an f-string with Jinja strings, you must double the braces (write {{{{ and }}}}) so that the f-string emits a literal {{ and }}:

source_file_path = '{{ dag_run.conf.name }}'

print_template_info = BashOperator(
    task_id='print_template_info',
    bash_command=f""" echo "path: { source_file_path } path:  os.path.dirname('{{{{ dag_run.conf.name }}}}')" """
)
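The brace doubling can be checked in plain Python, with no Airflow involved. The variable names from_variable, from_literal, and too_few below are illustrative only:

```python
source_file_path = '{{ dag_run.conf.name }}'

# Single braces interpolate the Python variable, which already holds a Jinja string.
from_variable = f"{ source_file_path }"

# Quadrupled braces: the f-string halves them, leaving a literal Jinja expression.
from_literal = f"{{{{ dag_run.conf.name }}}}"

# Doubled braces only: the f-string collapses them to single braces,
# which Jinja would treat as plain text rather than an expression.
too_few = f"{{ dag_run.conf.name }}"

print(from_variable)
print(from_literal)
print(too_few)
```

Both of the first two forms hand Jinja the same expression; the third leaves nothing for it to render.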

Edit: Let me clarify. Airflow renders template fields as part of task execution. You can see in the code base that Airflow invokes render_templates before it invokes pre_execute() and before it invokes execute(). This means that this step happens when the task is running on a worker. Code outside of an operator never runs as part of a task, so the templating step never runs for it.
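The ordering described above can be modeled with a toy class. This is a sketch under stated assumptions, not Airflow's code: ToyOperator and run_task are hypothetical names, and the rendering is a simple string replacement rather than Jinja.

```python
class ToyOperator:
    """Toy model of an operator; not Airflow's BaseOperator."""

    template_fields = ("bash_command",)

    def __init__(self, bash_command: str):
        self.bash_command = bash_command
        self.calls = []  # records the order in which the steps run

    def render_templates(self, context: dict) -> None:
        # Only attributes listed in template_fields are rendered.
        for field in self.template_fields:
            value = getattr(self, field)
            for key, val in context.items():
                value = value.replace("{{ " + key + " }}", str(val))
            setattr(self, field, value)
        self.calls.append("render_templates")

    def pre_execute(self, context: dict) -> None:
        self.calls.append("pre_execute")

    def execute(self, context: dict) -> str:
        self.calls.append("execute")
        return self.bash_command


def run_task(op: ToyOperator, context: dict) -> str:
    """Hypothetical worker-side driver mirroring the order described above."""
    op.render_templates(context)
    op.pre_execute(context)
    return op.execute(context)


op = ToyOperator("echo {{ ds }}")
before = op.bash_command            # still the raw template at "parse time"
result = run_task(op, {"ds": "2022-05-09"})
print(before)
print(result)
print(op.calls)
```

Anything computed before run_task is called, such as a split on the raw string, works on the unrendered template.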

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1