How to add an extra middle step into a list comprehension?

Let's say I have a list[str] object containing timestamps in "HH:mm" format, e.g.

timestamps = ["22:58", "03:11", "12:21"]

I want to convert it to a list[int] object with the "number of minutes since midnight" values for each timestamp:

converted = [22*60+58, 3*60+11, 12*60+21]

... but I want to do it in style, using a single list comprehension. A (syntactically incorrect) implementation that I naively constructed was something like:

def timestamps_to_minutes(timestamps: list[str]) -> list[int]:
    return [int(hh) * 60 + int(mm) for ts in timestamps for hh, mm = ts.split(":")]

... but this doesn't work because for hh, mm = ts.split(":") is not valid syntax.

What would be the valid way of writing the same thing?

To clarify: I can see a formally satisfying solution in the form of:

def timestamps_to_minutes(timestamps: list[str]) -> list[int]:
    return [int(ts.split(":")[0]) * 60 + int(ts.split(":")[1]) for ts in timestamps]

... but this is highly inefficient and I don't want to split the string twice.



Solution 1:[1]

You could use an inner generator expression to do the splitting:

[int(hh)*60 + int(mm) for hh, mm in (ts.split(':') for ts in timestamps)]

Although personally, I'd rather use a helper function instead:

def timestamp_to_minutes(timestamp: str) -> int:
    hh, mm = timestamp.split(":")
    return int(hh)*60 + int(mm)

[timestamp_to_minutes(ts) for ts in timestamps]

# Alternative
list(map(timestamp_to_minutes, timestamps))

Solution 2:[2]

Your initial pseudocode

[int(hh) * 60 + int(mm) for ts in timestamps for hh, mm = ts.split(":")]

is pretty close to what you can do:

[int(hh) * 60 + int(mm) for ts in timestamps for hh, mm in [ts.split(':')]]

In Python 3.9, comprehensions of this form were optimized so that iterating over a single-element list just to bind its one element is as fast as a simple assignment.
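As a rough sketch of why the single-element-list idiom behaves like an assignment, the two forms below can be compared side by side. The function names are made up for illustration and are not part of the original answer.

```python
timestamps = ["22:58", "03:11", "12:21"]


def minutes_via_single_element_list(timestamps: list[str]) -> list[int]:
    # `for hh, mm in [ts.split(":")]` iterates exactly once per timestamp,
    # so it effectively assigns hh and mm inside the comprehension.
    return [
        int(hh) * 60 + int(mm)
        for ts in timestamps
        for hh, mm in [ts.split(":")]
    ]


def minutes_via_inner_generator(timestamps: list[str]) -> list[int]:
    # Same result, but the split happens in an inner generator expression.
    return [
        int(hh) * 60 + int(mm)
        for hh, mm in (ts.split(":") for ts in timestamps)
    ]


print(minutes_via_single_element_list(timestamps))  # [1378, 191, 741]
print(minutes_via_inner_generator(timestamps))      # [1378, 191, 741]
```

Both versions split each string only once; which one reads better is largely a matter of taste.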

Solution 3:[3]

If you don't want to split the string twice, you can use the := assignment expression operator:

timestamps = [int((s := t.split(":"))[0]) * 60 + int(s[1]) for t in timestamps]
print(timestamps)

Prints:

[1378, 191, 741]

Alternative:

print([int(h) * 60 + int(m) for h, m in (t.split(":") for t in timestamps)])

Prints:

[1378, 191, 741]

Note: := is a feature of Python 3.8+ commonly referred to as the "walrus operator". It was introduced by PEP 572 (Assignment Expressions).
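One detail worth knowing when using := inside a comprehension: the assigned name is bound in the enclosing scope, not in the comprehension's own scope. A small sketch of that behavior, with the names `minutes` and `parts` chosen only for illustration:

```python
timestamps = ["22:58", "03:11", "12:21"]

# `parts` is assigned once per element, then reused for the minutes part.
minutes = [
    int((parts := t.split(":"))[0]) * 60 + int(parts[1])
    for t in timestamps
]
print(minutes)  # [1378, 191, 741]

# The walrus target is still visible after the comprehension and holds
# the result of the last split; the loop variable `t` is not visible here.
print(parts)  # ['12', '21']
```

If that leftover name is unwanted, the inner-generator form avoids it.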

Solution 4:[4]

If you use generator expressions (as opposed to list comprehensions) for the intermediate steps, the whole list is still converted in a single pass:

timestamps = ["22:58", "03:11", "12:21"]

#NOTE: Use () for generators, not [].
hh_mms = (timestamp.split(':') for timestamp in timestamps)
converted = [int(hh) * 60 + int(mm) for (hh, mm) in hh_mms]

print(converted)
# [1378, 191, 741]

You can split the comprehension into multiple steps, written on separate lines, without defining any function.
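To make the "single pass" claim concrete, here is a small sketch that records the order in which the work happens. The helpers `split_logged` and `to_minutes_logged` and the `events` list are hypothetical and exist only to trace the execution.

```python
events = []


def split_logged(ts: str) -> list[str]:
    # Record each split so the evaluation order can be inspected.
    events.append(f"split {ts}")
    return ts.split(":")


def to_minutes_logged(hh: str, mm: str) -> int:
    events.append(f"convert {hh}:{mm}")
    return int(hh) * 60 + int(mm)


timestamps = ["22:58", "03:11"]

# Creating the generator does not call split_logged yet.
hh_mms = (split_logged(ts) for ts in timestamps)
events_before = list(events)

# Each timestamp is split and converted before the next one is touched.
converted = [to_minutes_logged(hh, mm) for hh, mm in hh_mms]

print(events_before)  # []
print(events)
# ['split 22:58', 'convert 22:58', 'split 03:11', 'convert 03:11']
```

With a list comprehension as the middle step instead, all splits would be logged before the first conversion, and an intermediate list would be built.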

Solution 5:[5]

Late to the party .. but why not use datetime / timedelta to convert your time?

For "hh:mm" this may be overkill, but you can easily adjust it to more complex time strings:

from datetime import datetime as dt
import typing

def timestamps_to_minutes(timestamps: typing.List[str]) -> typing.List[int]:
    """Uses datetime.strptime to parse a datetime string and return
    minutes spent in this day."""
    return [int(((p := dt.strptime(t,"%H:%M")) - dt(p.year,p.month, p.day)
                 ).total_seconds()//60) for t in timestamps]

timestamps = ["22:58", "03:11", "12:21"]

print(timestamps_to_minutes(timestamps))

Outputs:

[1378, 191, 741]
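As an illustration of adjusting this to a more complex time string, the sketch below assumes an "HH:MM:SS" input and returns seconds since midnight. The function name `timestamps_to_seconds` and the `fmt` parameter are assumptions for this example, and it builds a timedelta from the parsed fields rather than subtracting midnight as the answer above does.

```python
from datetime import datetime, timedelta


def timestamps_to_seconds(timestamps: list[str], fmt: str = "%H:%M:%S") -> list[int]:
    """Parse each timestamp with strptime and return seconds since midnight."""
    return [
        int(timedelta(hours=p.hour, minutes=p.minute, seconds=p.second).total_seconds())
        # The inner generator parses each string exactly once.
        for p in (datetime.strptime(t, fmt) for t in timestamps)
    ]


print(timestamps_to_seconds(["22:58:30", "00:00:05"]))
# [82710, 5]
```

Passing a different `fmt`, such as "%H:%M", handles the original input format with the same code.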

Solution 6:[6]

Just for fun, we could also use operator.methodcaller:

from operator import methodcaller
out = [int(h) * 60 + int(m) for h, m in map(methodcaller("split", ":"), timestamps)]

Output:

[1378, 191, 741]
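For readers unfamiliar with it, methodcaller("split", ":") builds a callable that invokes .split(":") on whatever it is given. A minimal sketch, where the name `split_on_colon` is chosen only for illustration:

```python
from operator import methodcaller

# Equivalent to: lambda s: s.split(":")
split_on_colon = methodcaller("split", ":")

print(split_on_colon("22:58"))  # ['22', '58']

timestamps = ["22:58", "03:11", "12:21"]
out = [int(h) * 60 + int(m) for h, m in map(split_on_colon, timestamps)]
print(out)  # [1378, 191, 741]
```

Because map is lazy in Python 3, each string is still split only once and no intermediate list is built.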

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 wjandrea
Solution 2 wjandrea
Solution 3 Doug Lipinski
Solution 4
Solution 5
Solution 6