Python Pandas Dataframe Datetime Range
Here is my code block:
import pandas as pd
import datetime as dt
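# Derive the current year and month, then build the first day of that month
today = dt.date.today()
todays_year, todays_month = today.year, today.month
first_day = dt.date(todays_year, todays_month, 1)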
print(first_day)
>2021-02-01
print(type(first_day))
><class 'datetime.date'>
The following code runs successfully:
df = pd.read_excel('AllServiceActivities.xlsx',
sheet_name='All Service Activities',
usecols=[7, 12, 13]).query(f'Resources.str.contains("{name} {surname}")',
engine='python')
However, I would also like to do something like this ("Scheduled Start" is my column name):
df = pd.read_excel('AllServiceActivities.xlsx',
sheet_name='All Service Activities',
usecols=[7, 12, 13]).query(f'Scheduled Start >= {first_day})',
engine='python')
As you can guess, it does not work.
There are solutions such as "Select DataFrame rows between two dates", but I want to use the query method because I don't want to carry along all of the irrelevant data.
Edit (to generate test data):
dtr = [dt.datetime(2021,1,27,12,0),
dt.datetime(2021,2,3,10,0),
dt.datetime(2021,1,25,9,0),
dt.datetime(2021,1,15,7,59),
dt.datetime(2021,1,13,10,59),
dt.datetime(2021,1,12,13,59),
dt.datetime(2021,1,11,13,59),
dt.datetime(2021,2,2,9,29),
dt.datetime(2021,1,20,7,59),
dt.datetime(2021,1,19,10,59),
dt.datetime(2021,2,1,10,0),
dt.datetime(2021,1,19,7,59),
dt.datetime(2021,1,29,7,59),
dt.datetime(2021,1,28,13,0),
dt.datetime(2021,1,28,10,59),
dt.datetime(2021,1,27,19,30),
dt.datetime(2021,1,27,13,30),
dt.datetime(2021,1,18,17,30),
dt.datetime(2021,1,19,9,0),
dt.datetime(2021,1,18,13,0),
dt.datetime(2021,2,1,14,19),
dt.datetime(2021,1,29,14,30),
dt.datetime(2021,1,14,13,0),
dt.datetime(2021,1,8,13,0),
dt.datetime(2021,1,26,10,59),
dt.datetime(2021,1,25,10,0),
dt.datetime(2021,1,23,16,0),
dt.datetime(2021,1,21,10,0),
dt.datetime(2021,1,18,10,59),
dt.datetime(2021,1,11,13,30),
dt.datetime(2021,1,20,22,0),
dt.datetime(2021,1,20,21,0),
dt.datetime(2021,1,22,19,59),
dt.datetime(2021,1,12,13,59),
dt.datetime(2021,1,21,13,59),
dt.datetime(2021,1,20,10,30),
dt.datetime(2021,1,19,16,59),
dt.datetime(2021,1,19,10,0),
dt.datetime(2021,1,14,9,29),
dt.datetime(2021,1,19,8,53),
dt.datetime(2021,1,18,10,59),
dt.datetime(2021,1,13,16,0),
dt.datetime(2021,1,13,15,0),
dt.datetime(2021,1,12,13,59),
dt.datetime(2021,1,11,10,0),
dt.datetime(2021,1,8,9,0),
dt.datetime(2021,1,7,13,0),
dt.datetime(2021,1,6,13,59),
dt.datetime(2021,1,5,12,0),
dt.datetime(2021,1,10,0,0),
dt.datetime(2020,12,8,13,0),
dt.datetime(2021,1,7,11,10),
dt.datetime(2021,1,6,8,12),
dt.datetime(2021,1,5,10,0),
dt.datetime(2021,1,5,15,15),
dt.datetime(2021,1,4,7,59)]
df1 = pd.DataFrame(dtr, columns=['Scheduled Start'])
df2 = df1.query("'Scheduled Start' >= @first_day")
Thanks!
Solution 1:[1]
First, thanks to @mullinscr for the guidance.
I found extra information about date_parser and parse_dates here:
https://www.programcreek.com/python/example/101346/pandas.read_excel
date_parser is a parser function specific to my case.
def date_parser(x):
    # Parse "YYYY-MM-DD HH:MM:SS", dropping any fractional seconds;
    # values containing "1899" (placeholder dates) are treated as missing.
    # dt.datetime replaces pd.datetime, which newer pandas versions no longer provide.
    s = str(x)
    if "." in s:
        return dt.datetime.strptime(s.split(".")[0], "%Y-%m-%d %H:%M:%S")
    if "1899" in s:
        return None
    return dt.datetime.strptime(s, "%Y-%m-%d %H:%M:%S")

df = pd.read_excel(
    'AllServiceActivities.xlsx',
    sheet_name='All Service Activities',
    header=None,
    names=["Resources", "Start", "End"],
    skiprows=1,
    usecols=[7, 12, 13],
    parse_dates=[1],
    date_parser=date_parser,
).query(
    "Start >= @first_day and End <= @last_day and Resources.str.contains('{} {}')".format(name, surname),
    engine='python',
)
Hope this helps everyone :).
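The query in this answer refers to first_day and last_day, but the answer does not show how they are defined. A minimal sketch of one way they might be computed, assuming the bounds should cover a whole calendar month; the helper name month_bounds is hypothetical, and pd.Timestamp is used so the values compare cleanly against a datetime64 column:

```python
import calendar
import datetime as dt

import pandas as pd


def month_bounds(today):
    """Return (first_day, last_day) of today's month as pandas Timestamps.

    Hypothetical helper for illustration; not part of the original answer.
    """
    # monthrange returns (weekday of the 1st, number of days in the month)
    days_in_month = calendar.monthrange(today.year, today.month)[1]
    first_day = pd.Timestamp(year=today.year, month=today.month, day=1)
    # Use the last second of the final day so same-day events are included
    last_day = pd.Timestamp(
        year=today.year, month=today.month, day=days_in_month,
        hour=23, minute=59, second=59,
    )
    return first_day, last_day


first_day, last_day = month_bounds(dt.date(2021, 2, 15))
print(first_day, last_day)
```

Both values can then be referenced inside query with the @ prefix, as in the call above.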
Solution 2:[2]
Without a reproducible example it's hard to know for sure, but try this. It uses the @ character to reference variables and backticks around the column name, which query requires when the name contains a space.
df = pd.read_excel(
'AllServiceActivities.xlsx',
sheet_name='All Service Activities',
usecols=[7, 12, 13]) \
.query('`Scheduled Start` >= @first_day')
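To check the idea without the Excel file, here is a minimal sketch on a few of the dates from the question's test data. It assumes the column holds datetime64 values, and it converts first_day to a pd.Timestamp because comparing a datetime64 column with a plain datetime.date may warn or fail depending on the pandas version:

```python
import datetime as dt

import pandas as pd

# Small sample standing in for the spreadsheet (values taken from the question)
df = pd.DataFrame({
    "Scheduled Start": [
        dt.datetime(2021, 1, 27, 12, 0),
        dt.datetime(2021, 2, 3, 10, 0),
        dt.datetime(2021, 2, 1, 10, 0),
        dt.datetime(2021, 1, 15, 7, 59),
    ]
})

# Timestamp at midnight on the first day of the month
first_day = pd.Timestamp(dt.date(2021, 2, 1))

# Backticks quote the column name with a space; @ pulls in the local variable
filtered = df.query("`Scheduled Start` >= @first_day", engine="python")
print(filtered)
```

Only the two February rows remain. The engine="python" argument mirrors the question's code; it is not required for this comparison.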
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | Umut TEKİN |
Solution 2 | mullinscr |