functools.reduce in Python not working as expected
I would like to sum across the keys of dictionaries nested within a list using the functools.reduce function.
I can accomplish this WITHOUT the functools.reduce function with the following simple program:
dict1 = {'a': '1', 'b': '2'}
dict2 = {'a': '5', 'b': '0'}
dict3 = {'a': '7', 'b': '3'}
data_list = [dict1, dict2, dict3]
total_a = 0
total_b = 0
for record in data_list:
    total_a += eval(record['a'])
    total_b += eval(record['b'])
print(total_a)
print(total_b)
As I said, however, I would like to produce the same results using the functools.reduce function instead.
Here is my attempt at using functools.reduce with a lambda expression:
from functools import reduce
dict1 = {'a': '1', 'b': '2'}
dict2 = {'a': '5', 'b': '0'}
dict3 = {'a': '7', 'b': '3'}
data_list = [dict1, dict2, dict3]
total_a = reduce(lambda x, y: int(x['a']) + int(y['a']), data_list)
total_b = reduce(lambda x, y: int(x['b']) + int(y['b']), data_list)
print(total_a)
print(total_b)
Unfortunately, I get the following error and do not know why:
TypeError: 'int' object is not subscriptable
Does someone know why I am getting this error?
Solution 1:[1]
TypeError: 'int' object is not subscriptable
Does someone know why I am getting this error?
First, let's reduce (pun intended) the sample to a minimum:
>>> from functools import reduce
>>> data = [{"a": 1}, {"a": 2}, {"a": 3}]
>>> reduce(lambda x, y: x["a"] + y["a"], data)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 1, in <lambda>
TypeError: 'int' object is not subscriptable
Same error. But observe this:
>>> reduce(lambda x, y: x["a"] + y["a"], data[:2])
3
That's working. So what's going on? For simplification, let's assign the lambda expression to a variable:
f = lambda x, y: x['a'] + y['a']
Reduce combines the input like this:
# Example: reduce(lambda x, y: x["a"] + y["a"], data[:2])
>>> f(data[0], data[1])
# which evaluates in steps like this:
>>> data[0]["a"] + data[1]["a"]
>>> 1 + 2
>>> 3
But what happens when reducing the full list? This evaluates to:
# Example: reduce(lambda x, y: x["a"] + y["a"], data)
>>> f(f(data[0], data[1]), data[2])
# which evaluates in steps like this:
>>> f(data[0]["a"] + data[1]["a"], data[2])
>>> f(1 + 2, data[2])
>>> f(3, data[2])
>>> 3["a"] + data[2]["a"]
So this errors out because it tries to access item "a" from the integer 3.
Basically: the output of the function passed to reduce must be acceptable as its first parameter. In your example, the lambda expects a dictionary as its first parameter but returns an integer.
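One way to satisfy that rule without an initial value, as a minimal sketch reusing the reduced `data` list from above: have the lambda return a dictionary of the same shape it receives, so every intermediate result can be fed back in as `x`. The names `combined` and `total` are illustrative only.

```python
from functools import reduce

data = [{"a": 1}, {"a": 2}, {"a": 3}]

# The lambda returns a dict, so its result is a valid first argument
# for the next call: f(f(data[0], data[1]), data[2]).
combined = reduce(lambda x, y: {"a": x["a"] + y["a"]}, data)
total = combined["a"]
print(total)  # 6
```

This keeps the input and output types identical, at the cost of building a throwaway dictionary on each step.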
Solution 2:[2]
The reduction function receives the current reduced value plus the next iterated item to be reduced. The trick is in choosing what that reduced value looks like. In your case, if you choose a two-item list holding the reduced values of 'a' and 'b', then the reduction function just adds the next 'a' and 'b' to those values. The reduction is most easily written as a couple of statements, so it should be moved from an anonymous lambda to a regular function. Start with an initializer of [0, 0] to hold the reduced 'a' and 'b', and you get:
from functools import reduce
def reducer(accum, next_dict):
    print(accum, next_dict)  # debug trace
    accum[0] += int(next_dict['a'])
    accum[1] += int(next_dict['b'])
    return accum
dict1 = {'a': '1', 'b': '2'}
dict2 = {'a': '5', 'b': '0'}
dict3 = {'a': '7', 'b': '3'}
data_list = [dict1, dict2, dict3]
total_a, total_b = reduce(reducer, data_list, [0, 0])
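The reducer above updates the accumulator list in place. If that mutation is a concern (for example, when the initializer is a list that is reused elsewhere), a variant that returns a new tuple on each step might look like the sketch below. The name `tuple_reducer` is hypothetical and not part of the original answer.

```python
from functools import reduce

def tuple_reducer(accum, next_dict):
    # accum is an (a_total, b_total) tuple; return a new tuple
    # rather than modifying the accumulator in place.
    return (accum[0] + int(next_dict['a']),
            accum[1] + int(next_dict['b']))

data_list = [{'a': '1', 'b': '2'}, {'a': '5', 'b': '0'}, {'a': '7', 'b': '3'}]
total_a, total_b = reduce(tuple_reducer, data_list, (0, 0))
print(total_a, total_b)  # 13 5
```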
Solution 3:[3]
You misunderstand how reduce() works. In the function passed to it, the first argument is the partial result so far, and has nothing directly to do with the iterable passed to reduce(). The iterable is passed one element at a time to the function's second argument. Since you want a sum, the initial value of the "partial result" needs to be 0, which also needs to be passed to reduce().
So, in all, these lines will print what you want:
print(reduce(lambda x, y: x + int(y['a']), data_list, 0))
print(reduce(lambda x, y: x + int(y['b']), data_list, 0))
EDIT: replaced eval() with int() above, so it matches the edited question. It's irrelevant to the answer, though.
EDIT 2: you keep changing the question, but I'm not going to keep changing the answer to match ;-) The code just above fully answers an earlier version of the question, and nothing material has changed. Exactly the same things are still at work, and exactly the same kind of approach is needed.
Gloss on types
While Python doesn't require explicit type declarations, sometimes they can be helpful.
If you have an iterable delivering objects of type A, and the result of reduce() is of type B, then the signature of the first argument passed to reduce() must be
def reduction_function(x: B, y: A) -> B
In the example, A is dict and B is int. Passing a dict for both can't possibly work. That's essentially why we need to specify an initial value of type B (int) in this case.
In doc examples, A and B are typically both int or float. Then a simple + or * is already type-correct for reduce()'s first argument.
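To make the signature concrete, here is a small sketch with A as dict and B as int, using the question's data. The function name `add_a` is made up for illustration; the type hints are only annotations and are not enforced at runtime.

```python
from functools import reduce

def add_a(x: int, y: dict) -> int:
    # x is the partial result so far (type B = int);
    # y is the next item from the iterable (type A = dict).
    return x + int(y['a'])

data_list = [{'a': '1', 'b': '2'}, {'a': '5', 'b': '0'}, {'a': '7', 'b': '3'}]

# The initial value 0 has type B, so the very first call is add_a(0, data_list[0]).
total_a = reduce(add_a, data_list, 0)
print(total_a)  # 13
```

With the initial value supplied, reduce() also returns 0 for an empty list instead of raising a TypeError.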
Solution 4:[4]
Since we're adding integers, an alternative to reduce is sum:
data_list = [{'a': '1', 'b': '2'}, {'a': '5', 'b': '0'}, {'a': '7', 'b': '3'}]
total_a, total_b = map(sum, zip(*((int(d['a']),int(d['b'])) for d in data_list)))
print(total_a, total_b)
# 13 5
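If the zip/map combination feels dense, a plainer sketch of the same idea is one sum() call per key with a generator expression. This walks the list twice, which is assumed to be acceptable for small inputs like the one here.

```python
data_list = [{'a': '1', 'b': '2'}, {'a': '5', 'b': '0'}, {'a': '7', 'b': '3'}]

# One pass per key; each generator converts the string values to int.
total_a = sum(int(d['a']) for d in data_list)
total_b = sum(int(d['b']) for d in data_list)
print(total_a, total_b)  # 13 5
```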
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | Bluehorn |
Solution 2 | tdelaney |
Solution 3 | |
Solution 4 | Stef |