'How does the mode "reschedule" in Airflow Sensors work?
I have an Airflow Http sensor that calls a REST endpoint and checks for a specific value in the JSON structure returned by the API
sensor = HttpSensor(
soft_fail=True,
task_id='http_sensor_check',
http_conn_id='http_default',
endpoint='http://localhost:8082/api/v1/resources/games/all',
request_params={},
response_check=lambda response: True if check_api_response(response) is True else False,
mode='reschedule',
dag=dag)
If the response_check is false, the DAG is put in a "up_for_reschedule" state. The issue is, the DAG stayed in that status forever and never got rescheduled.
My questions are:
- What does "up_for_reschedule" means? and when would be the DAG rescheduled?
- Let's suppose my DAG is scheduled to run every 5 minutes but because of the sensor, the "up_for_reschedule" DAG instance overlaps with the new run, will I have 2 DAGS running at the same time?
Thank you in advance.
Solution 1:[1]
In sensor mode='reschedule'
means that if the criteria of the sensor isn't True then the sensor will release the worker to other tasks. This is very useful for cases when sensor may wait for a long time.
up_for_reschedule
means that the sensor condition isn't true yet and it hasnt reached timout so the task is waiting to be rescheduled by the scheduler.- You don't know when the task will run. That depends on the scheduler (available resources, priorities etc..). If you don't want to allow parallel dag runs use
max_active_runs=1
in DAG constructor.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 |