Combining Python variables into SQL queries
I am pulling data from an online database using SQL (PostgreSQL) queries and loading it into a pandas DataFrame in Python. I want to be able to change the dates in the SQL queries from one place in my Python script instead of going through every query and changing them one by one, as there are many queries and many lines in each one.
This is what I have to begin with for example:
random_query = """
select *
from table_A as a
where date_trunc('day',a.created_at) >= date('2022-03-01')
and date_trunc('day',a.created_at) <= date('2022-03-31')
group by 1,2,3
"""
Then I will read the data into Pandas as follows:
df_random_query = pd.read_sql(random_query, conn)
The connection above is to the database - the issue is not there so I am excluding that portion of code here.
What I have attempted is the following:
start_date = '2022-03-01'
end_date = '2022-03-31'
I have set the above 2 dates as variables and then below I have tried to use them in the SQL query as follows:
attempted_solution = """
select *
from table_A as a
where date_trunc('day',a.created_at) >= date(
""" + start_date + """)
and date_trunc('day',a.created_at) <= date(
""" + end_date + """)
group by 1,2,3
"""
This does run but it gives me a dataframe with no data in it - i.e. no numbers. I am not sure what I am doing wrong - any assistance will really help.
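One way to see what the database actually receives is to print the assembled string before running it. The sketch below only rebuilds the concatenation from the attempt above (no database connection is needed) and checks whether the start date ends up wrapped in single quotes. The `has_quoted_date` name is just for illustration.

```python
start_date = '2022-03-01'
end_date = '2022-03-31'

attempted_solution = """
select *
from table_A as a
where date_trunc('day',a.created_at) >= date(
""" + start_date + """)
and date_trunc('day',a.created_at) <= date(
""" + end_date + """)
group by 1,2,3
"""

# Print the final SQL text exactly as it would be sent to the database.
print(attempted_solution)

# Check whether the start date is wrapped in single quotes in the final text.
has_quoted_date = "'2022-03-01'" in attempted_solution
print("start date is quoted:", has_quoted_date)
```

The printed query contains the bare text 2022-03-01 inside date( ), not '2022-03-01'. Without the quotes, SQL treats 2022-03-01 as integer subtraction rather than as a date literal, which would explain why the filter does not behave as intended.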
Solution 1:[1]
I was able to work it out as follows:
start_date = '2022-03-01'
end_date = '2022-03-31'
random_query = f"""
select *
from table_A as a
where date_trunc('day',a.created_at) >= date('{start_date}')
and date_trunc('day',a.created_at) <= date('{end_date}')
group by 1,2,3
"""
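It was amusing to see that all I needed to do was wrap the {start_date} and {end_date} placeholders in single quotes as well, so that the final SQL contains date('2022-03-01') rather than date(2022-03-01). I noticed this simply by printing the query the script was building. The key thing here is to know how to troubleshoot.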
Another option is to leave off the f prefix, keep the {start_date} and {end_date} placeholders in the query, and call .format(start_date = '2022-03-01', end_date = '2022-03-31') at the end of the query string.
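A minimal sketch of the .format() variant, assuming the same query text as above. The `query_template` name is hypothetical, and the query is only built and printed here, not executed.

```python
# Plain (non-f) string with named placeholders wrapped in single quotes.
query_template = """
select *
from table_A as a
where date_trunc('day',a.created_at) >= date('{start_date}')
and date_trunc('day',a.created_at) <= date('{end_date}')
group by 1,2,3
"""

# Fill in both dates in one place; the same call can be reused for every query.
random_query = query_template.format(start_date='2022-03-01', end_date='2022-03-31')

# Inspect the final SQL before handing it to pd.read_sql.
print(random_query)
```

Keeping the template separate from the dates means the same two values can be passed to every query in the script.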
Solution 2:[2]
Try dropping the date function and formatting the quoted value directly:
my_query = f"... where date_trunc('day', a.created_at) >= '{start_date}'"
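Neither answer uses it, but a different technique worth knowing is to let the database driver bind the values instead of building them into the string. This avoids the quoting problem altogether. The sketch below is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module with an in-memory table as a stand-in for the real PostgreSQL connection, SQLite's date() function in place of date_trunc, and made-up sample rows.

```python
import sqlite3

start_date = '2022-03-01'
end_date = '2022-03-31'

# In-memory SQLite database standing in for the real PostgreSQL connection.
conn = sqlite3.connect(":memory:")
conn.execute("create table table_A (id integer, created_at text)")
conn.executemany(
    "insert into table_A values (?, ?)",
    [
        (1, '2022-02-27 09:00:00'),  # before the range (hypothetical sample data)
        (2, '2022-03-15 12:30:00'),  # inside the range
        (3, '2022-04-02 08:00:00'),  # after the range
    ],
)

# Named placeholders: the driver binds the values, so no manual quoting is needed.
param_query = """
select id, created_at
from table_A as a
where date(a.created_at) >= :start_date
and date(a.created_at) <= :end_date
"""

params = {"start_date": start_date, "end_date": end_date}
rows = conn.execute(param_query, params).fetchall()
print(rows)
conn.close()
```

With pandas, the same dictionary can be passed through the params argument of pd.read_sql. The placeholder syntax depends on the driver behind the connection; if it is psycopg2, for example, the placeholders would be written as %(start_date)s and %(end_date)s rather than :start_date and :end_date.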
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | Andronicus |
Solution 2 |