Category "airflow"

Airflow Subdag tasks are stuck in None state while subdag is showing as Running

I have a problem with my DAG getting stuck at a subdag. The subdag is in the RUNNING state, but on zooming in, all the tasks of the subdag are in None status. Using Air

Get a list of all the DAGs in Python

I have a list of DAGs that are hosted on Airflow. I want to get the names of the DAGs in an AWS Lambda function so that I can use the names and trigger the dag us

Is there any difference between running a Python script in Airflow and running the same script in Python?

I wrote the code below, but it runs endlessly in Airflow, whereas on my system it takes 5 minutes to run: gc=pygsheets.authorize(service_account_file='file.json'

How to push xcom from AwsGlueJobOperator when this task fails

I am trying to get the XCom for a Glue job run to get its glueid. I need this to display the CloudWatch link on the Airflow output console in case the Glue job fails.

How do you access Airflow Web Interface?

Hi, I am taking a DataCamp class on how to use Airflow, and it shows how to create DAGs once you have access to an Airflow Web Interface. Is there an easy way to

In the Airflow UI under Connections, what do 'encrypted' and 'extra-encrypted' mean?

Simple question here. When looking at Airflow's UI, like in the screenshot below: What do Is Encrypted and Is Extra Encrypted stand for? Is there clear docum

Airflow 2.3 - Dynamic Task Mapping using Operators

I've got a current implementation of some code which works fine, but only carries out a single check per dag run as I cannot feed through multiple results to do

Using Airflow to upload data to S3

I tried to upload a dataframe containing information about Apple stock (using their API) as a CSV to S3 using Airflow and PythonOperator. The script is below. Wh

Apache Airflow: How to template Volumes in DockerOperator using Jinja Templating

I want to pass a list of volumes into DockerOperator using a Jinja template. Hard-coded volumes work fine: volumes=['first:/dest', 'second:/sec_destination'] ho

Issue looping over Airflow variables

I am having a hard time looping over an Airflow variable in my script. I have a requirement to list all files prefixed by a string in a bucket. Next, loop throug

How to extract the query result from a Hive job output logs using DataprocHiveOperator?

I am trying to build a data migration pipeline using Airflow, source being a Hive table on a Dataproc cluster and the destination is BigQuery. I'm using Datapro

Airflow + sqlalchemy short-lived connections to metadata db

I deployed the latest Airflow on a CentOS 7.5 VM and updated sql_alchemy_conn and result_backend to Postgres databases on a PostgreSQL instance and designated m

sqlite3 raised an error after running Airflow command line

When I ran the command airflow list_users, it raised the error below: sqlite3.OperationalError: no such table: ab_permission_view_role ... sqlalchemy.exc.Opera

How do I create a chain for data with parent child relationship using python?

If I have this set of input to convert, Input: Task A -> Task B Task A -> Task C Task B -> Task D Task C -> Task E Making use of pandas python: df

Read and group json files by date element using pyspark

I have multiple JSON files (~10 TB) in an S3 bucket, and I need to organize these files by a date element present in every JSON document. What I think that my c

Amazon Managed Airflow (MWAA) import custom plugins

I'm setting up an AWS MWAA instance and I have a problem with importing custom plugins. My local project structure looks like this: airflow-project ├─&

Airflow - call an operator inside a function

I'm trying to call a Python operator which is inside a function using another Python operator. It seems I missed something; can someone help me find out what I

DAG run as per timezone

I want to run my DAG according to the New York time zone, as the data arrives in New York time, and the DAG fails for the initial runs and skips the last run's data as w

Airflow metrics with Prometheus and Grafana

Does anyone know how to send metrics from Airflow to Prometheus? I'm not finding much documentation about it. I tried the airflow operator metrics on Grafana, but it d

Maximum memory size for an XCom in Airflow

I was wondering if there is any memory size limit for an XCom in Airflow?