How to add an extra middle step into a list comprehension?
Let's say I have a list[str] object containing timestamps in "HH:mm" format, e.g.
timestamps = ["22:58", "03:11", "12:21"]
I want to convert it to a list[int] object with the "number of minutes since midnight" values for each timestamp:
converted = [22*60+58, 3*60+11, 12*60+21]
... but I want to do it in style and use a single list comprehension to do it. A (syntactically incorrect) implementation that I naively constructed was something like:
def timestamps_to_minutes(timestamps: list[str]) -> list[int]:
    return [int(hh) * 60 + int(mm) for ts in timestamps for hh, mm = ts.split(":")]
... but this doesn't work because for hh, mm = ts.split(":") is not valid syntax.
What would be the valid way of writing the same thing?
To clarify: I can see a formally satisfying solution in the form of:
def timestamps_to_minutes(timestamps: list[str]) -> list[int]:
    return [int(ts.split(":")[0]) * 60 + int(ts.split(":")[1]) for ts in timestamps]
... but this is highly inefficient and I don't want to split the string twice.
Solution 1:[1]
You could use an inner generator expression to do the splitting:
[int(hh)*60 + int(mm) for hh, mm in (ts.split(':') for ts in timestamps)]
Although personally, I'd rather use a helper function instead:
def timestamp_to_minutes(timestamp: str) -> int:
    hh, mm = timestamp.split(":")
    return int(hh)*60 + int(mm)
[timestamp_to_minutes(ts) for ts in timestamps]
# Alternative
list(map(timestamp_to_minutes, timestamps))
Solution 2:[2]
Your initial pseudocode
[int(hh) * 60 + int(mm) for ts in timestamps for hh, mm = ts.split(":")]
is pretty close to what you can do:
[int(hh) * 60 + int(mm) for ts in timestamps for hh, mm in [ts.split(':')]]
Since Python 3.9, this idiom is optimized: iterating over a single-element list inside a comprehension just to bind its one element (for x in [expr]) is as fast as a simple assignment.
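To illustrate why this works, here is a minimal sketch. The inner for clause loops over a one-element list, so it runs exactly once per timestamp and simply unpacks the split result. The function name timestamps_to_minutes comes from the question; timestamps_to_minutes_loop is a hypothetical name added only for comparison.

```python
def timestamps_to_minutes(timestamps: list[str]) -> list[int]:
    # The second "for" iterates over a one-element list, so it behaves
    # like the assignment: hh, mm = ts.split(":")
    return [int(hh) * 60 + int(mm) for ts in timestamps for hh, mm in [ts.split(":")]]


def timestamps_to_minutes_loop(timestamps: list[str]) -> list[int]:
    # Equivalent explicit loop, shown only for comparison.
    result = []
    for ts in timestamps:
        hh, mm = ts.split(":")
        result.append(int(hh) * 60 + int(mm))
    return result


timestamps = ["22:58", "03:11", "12:21"]
print(timestamps_to_minutes(timestamps))       # [1378, 191, 741]
print(timestamps_to_minutes_loop(timestamps))  # [1378, 191, 741]
```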
Solution 3:[3]
If you don't want to split the string twice, you can use the := assignment operator:
timestamps = [int((s := t.split(":"))[0]) * 60 + int(s[1]) for t in timestamps]
print(timestamps)
Prints:
[1378, 191, 741]
Alternative:
print([int(h) * 60 + int(m) for h, m in (t.split(":") for t in timestamps)])
Prints:
[1378, 191, 741]
Note: := is a feature of Python 3.8+ commonly referred to as the "walrus operator". It was proposed in PEP 572.
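As a small sketch of how the walrus operator behaves here: the parenthesized (parts := t.split(":")) both binds the split result and evaluates to it, so each string is split only once. The names converted and parts are my own choices, and the result goes into a separate variable instead of overwriting timestamps.

```python
timestamps = ["22:58", "03:11", "12:21"]

# (parts := t.split(":")) binds the split result to `parts` and also
# evaluates to it, so the string is split once and reused via parts[1].
converted = [int((parts := t.split(":"))[0]) * 60 + int(parts[1]) for t in timestamps]
print(converted)  # [1378, 191, 741]

# Unlike the loop variable `t`, a name bound with := inside a comprehension
# remains visible in the enclosing scope (it holds the last split result).
print(parts)  # ['12', '21']
```

One thing to be aware of: because parts leaks into the surrounding scope, it can silently overwrite an existing variable of the same name.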
Solution 4:[4]
If you use generators (as opposed to list comprehensions) for the middle steps, the whole list will still be converted in a single pass:
timestamps = ["22:58", "03:11", "12:21"]
#NOTE: Use () for generators, not [].
hh_mms = (timestamp.split(':') for timestamp in timestamps)
converted = [int(hh) * 60 + int(mm) for (hh, mm) in hh_mms]
print(converted)
# [1378, 191, 741]
You can split the comprehension into multiple steps, written on multiple lines, without defining any function.
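To make the "single pass" claim concrete, here is a rough sketch that logs when each step happens. The traced_split helper and the log list are hypothetical, added purely for the demonstration; they stand in for the plain generator expression above.

```python
def traced_split(timestamps, log):
    # Hypothetical helper: a generator that records when each item is split.
    for ts in timestamps:
        log.append(f"split {ts}")
        yield ts.split(":")


log = []
timestamps = ["22:58", "03:11", "12:21"]

hh_mms = traced_split(timestamps, log)
print(log)  # [] : creating the generator does not split anything yet

converted = []
for hh, mm in hh_mms:
    log.append(f"convert {hh}:{mm}")
    converted.append(int(hh) * 60 + int(mm))

print(converted)  # [1378, 191, 741]
print(log[:4])
# ['split 22:58', 'convert 22:58', 'split 03:11', 'convert 03:11']
```

The interleaved log shows that each timestamp is split and converted before the next one is touched, so no intermediate list is ever built.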
Solution 5:[5]
Late to the party .. but why not use datetime / timedelta to convert your time?
For "hh:mm" this may be overkill, but you can easily adjust it to more complex time strings:
from datetime import datetime as dt
import typing
def timestamps_to_minutes(timestamps: typing.List[str]) -> typing.List[int]:
    """Uses datetime.strptime to parse a datetime string and return
    minutes spent in this day."""
    # p is the parsed datetime; subtracting midnight of the same day
    # leaves a timedelta holding the time elapsed since midnight.
    return [int(((p := dt.strptime(t, "%H:%M")) - dt(p.year, p.month, p.day)
                 ).total_seconds() // 60) for t in timestamps]
timestamps = ["22:58", "03:11", "12:21"]
print(timestamps_to_minutes(timestamps))
Outputs:
[1378, 191, 741]
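As one possible illustration of adjusting this to a more complex time string, here is a sketch for an "HH:MM:SS" input that returns seconds since midnight. The function names seconds_since_midnight and timestamps_to_seconds, the fmt parameter, and the sample inputs are my own assumptions, not part of the answer above.

```python
from datetime import datetime


def seconds_since_midnight(timestamp: str, fmt: str = "%H:%M:%S") -> int:
    # strptime validates the input and raises ValueError on malformed strings.
    parsed = datetime.strptime(timestamp, fmt)
    return parsed.hour * 3600 + parsed.minute * 60 + parsed.second


def timestamps_to_seconds(timestamps: list[str], fmt: str = "%H:%M:%S") -> list[int]:
    return [seconds_since_midnight(t, fmt) for t in timestamps]


print(timestamps_to_seconds(["22:58:30", "03:11:00", "12:21:05"]))
# [82710, 11460, 44465]
```

Only the format string changes between variants; the parsing and validation are left to datetime.strptime.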
Solution 6:[6]
Just for fun, we could also use operator.methodcaller:
from operator import methodcaller
out = [int(h) * 60 + int(m) for h, m in map(methodcaller("split", ":"), timestamps)]
Output:
[1378, 191, 741]
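For anyone unfamiliar with it, here is a short sketch of what methodcaller does: methodcaller("split", ":") returns a callable that invokes .split(":") on whatever object it is given. The name split_on_colon is my own, chosen for readability.

```python
from operator import methodcaller

# split_on_colon(s) is equivalent to s.split(":")
split_on_colon = methodcaller("split", ":")
print(split_on_colon("22:58"))  # ['22', '58']

timestamps = ["22:58", "03:11", "12:21"]
out = [int(h) * 60 + int(m) for h, m in map(split_on_colon, timestamps)]
print(out)  # [1378, 191, 741]
```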
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | wjandrea |
Solution 2 | wjandrea |
Solution 3 | Doug Lipinski |
Solution 4 | |
Solution 5 | |
Solution 6 | |