Managing the dbt profile file in MWAA

I would like to use dbt in an MWAA (Amazon Managed Workflows for Apache Airflow) environment. To achieve this I need to install dbt in the managed environment and run the dbt commands from there, via Airflow operators or the CLI (BashOperator).

My problem with this solution is that I need to store the dbt profile file(s), which contain the target / source database credentials, in S3. Otherwise the file is not deployed to the Airflow worker nodes and so cannot be used by dbt.

Is there any other option? I feel this is a big security risk, and it also undermines the use of Airflow (because I would like to use its built-in password manager).

My ideas:

  • Create the profile file on the fly in the Airflow DAG as a task and write it to local disk. I do not think this is a feasible workaround, because there is no guarantee that the dbt task will run on the same worker node where my code created the file.
  • Move the profile file to S3 manually (excluding it from CI/CD). Again, I see a security risk, as I am storing credentials on S3.
  • Create a custom operator which builds the profile file on the same machine where the command will run. Maintenance nightmare.
  • Use MWAA environment variables (https://docs.aws.amazon.com/mwaa/latest/userguide/configuring-env-variables.html) and combine them with dbt's env_var function (https://docs.getdbt.com/reference/dbt-jinja-functions/env_var). Storing credentials in system-wide environment variables this way feels awkward.
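To make the last idea more concrete, here is a minimal sketch of what I have in mind. It assumes a Postgres target, and the project name, the DBT_* variable names, and the helper dbt_env_from_connection are all made up for illustration. The profile file holds only env_var() placeholders, so it contains no secrets, and the values are built per task from a connection-like object (anything with host, port, login, password, and schema attributes, like an Airflow Connection) rather than set system-wide.

```python
from types import SimpleNamespace

# profiles.yml content with no secrets in it; dbt resolves env_var() at run time.
# Assumption: Postgres target; project and variable names are illustrative only.
PROFILES_YML = """\
my_project:
  target: prod
  outputs:
    prod:
      type: postgres
      host: "{{ env_var('DBT_HOST') }}"
      port: "{{ env_var('DBT_PORT') | int }}"
      user: "{{ env_var('DBT_USER') }}"
      password: "{{ env_var('DBT_PASSWORD') }}"
      dbname: "{{ env_var('DBT_DBNAME') }}"
      schema: analytics
"""


def dbt_env_from_connection(conn):
    """Map connection-like fields to the env vars the profile above expects.

    Hypothetical helper: `conn` only needs host, port, login, password and
    schema attributes, which an Airflow Connection object provides.
    """
    return {
        "DBT_HOST": conn.host,
        "DBT_PORT": str(conn.port),  # environment values must be strings
        "DBT_USER": conn.login,
        "DBT_PASSWORD": conn.password,
        "DBT_DBNAME": conn.schema,
    }


if __name__ == "__main__":
    # Stand-in for a connection fetched from Airflow's connection store.
    fake_conn = SimpleNamespace(
        host="db.example.com", port=5432, login="dbt_user",
        password="not-a-real-password", schema="warehouse",
    )
    env = dbt_env_from_connection(fake_conn)
    # Print only the variable names, never the values.
    print(sorted(env))
```

The resulting dict could then be passed to BashOperator through its env argument, with the dbt command pointing at the secret-free profile via --profiles-dir, so the credentials exist only in the environment of that one task's process. I have not verified how this behaves on MWAA workers.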

Any good ideas or best practices?



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow