np.vectorize fails on a 2-d numpy array as input
I am trying to vectorize a function that takes a numpy array as input. I have a 2-d numpy array of shape (1000, 100), and the function should be applied to each of its 1000 rows. I tried to vectorize the function using np.vectorize. Here is the code:
def fun(i):
    print(i)
    location = geocoder.google([i[1], i[0]], method="reverse")
    #print type(location)
    location = str(location)
    location = location.split("Reverse")
    if len(location) > 1:
        location1 = location[1]
    return [i[0], i[1], location1]
#using np.vectorize
vec_fun = np.vectorize(fun)
This raises the following error:
<ipython-input-19-1ee9482c6161> in fun(i)
      1 def fun(i):
      2     print(i)
----> 3     location = geocoder.google([i[1], i[0]], method="reverse")
      4     #print type(location)
      5     location = str(location)
IndexError: invalid index to scalar variable.
I printed the argument that is passed to fun. It is a single value (the first element of the first row) rather than a whole row, which explains the IndexError, but I have no idea how to resolve this.
Solution 1:[1]
np.vectorize runs your function on each element of an array, not on each row, so it's not the right choice here. Use a regular loop instead:
for row in some_array:
    i0, i1, loc = fun(row)
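To make the element-wise behaviour concrete, here is a small sketch that records what a vectorized function actually receives. The names record_ndim and seen_ndims are made up for the illustration and are not part of the question's code.

```python
import numpy as np

seen_ndims = []  # dimensionality of every argument the function receives

def record_ndim(x):
    # Hypothetical probe: note how many dimensions the argument has
    seen_ndims.append(np.ndim(x))
    return 0.0

# otypes is given so that vectorize does not make a trial call
# on the first element just to infer the output type
vec_record = np.vectorize(record_ndim, otypes=[float])
result = vec_record(np.arange(6).reshape(2, 3))

print(seen_ndims)    # one entry per element, each entry 0 (a scalar)
print(result.shape)  # same shape as the input
```

Every call gets a 0-d value, which is why indexing i[1] inside fun fails.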
What you do with the output of the loop is up to you. Keep in mind that your function does not assign location1 if len(location) <= 1, and will raise an error in that case. It also returns a string rather than a numerical value as the third output.
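One possible way to close the unassigned-variable gap is to return an explicit fallback. The sketch below isolates just the string handling; extract_address is a hypothetical helper, the sample strings are invented, and geocoder is deliberately not called.

```python
def extract_address(text):
    """Return the part after "Reverse", or None when the marker is absent."""
    parts = text.split("Reverse")
    if len(parts) > 1:
        return parts[1]
    return None  # explicit fallback instead of an unassigned variable

# Invented sample strings, only to exercise both branches
with_marker = extract_address("Google - Reverse [Some Street]")
without_marker = extract_address("no marker here")
```

Whether None, an empty string, or a raised exception is the right fallback depends on how the results are consumed later.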
Once you fix those issues, if you want to make an array of the output:
output = np.empty((some_array.shape[0], 3))
for i, row in enumerate(some_array):
    output[i, :] = fun(row)
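As a runnable sketch of this pattern, the example below swaps in a purely numeric stand-in for fun. fun_numeric and the sample data are assumptions made for the illustration, not the geocoding function from the question.

```python
import numpy as np

def fun_numeric(row):
    # Hypothetical stand-in for fun: first two entries plus the row sum
    return [row[0], row[1], row.sum()]

some_array = np.arange(12, dtype=float).reshape(3, 4)

# One output row of three floats per input row
output = np.empty((some_array.shape[0], 3))
for i, row in enumerate(some_array):
    output[i, :] = fun_numeric(row)

print(output)
```

If the third value has to stay a string, a float array cannot hold it; collecting the results in a plain Python list is one option in that case.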
Solution 2:[2]
By this time I think you have solved your problem. However, I just found a way to solve this that may help other people with the same question. You can pass a signature parameter (a string) to np.vectorize to specify the input and output shapes. For example, the signature "(n) -> ()" expects an input of shape (n) (one row) and outputs a scalar (). Therefore, a 2-d input is processed row by row:
def my_sum(row):
    return np.sum(row)
row_sum = np.vectorize(my_sum, signature="(n) -> ()")
my_mat = np.array([
    [1, 1, 1],
    [2, 2, 2],
])
row_sum(my_mat)
OUT: array([3, 6])
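The same mechanism should also cover a function that returns several values per row, as in the question, by giving the output its own core dimension. This is a minimal sketch with a numeric stand-in; summarize and the sample matrix are invented for the example.

```python
import numpy as np

def summarize(row):
    # Hypothetical row function: first two entries plus the row sum
    return np.array([row[0], row[1], row.sum()])

# "(n)->(k)": each length-n row maps to a length-k vector
row_summary = np.vectorize(summarize, signature="(n)->(k)")

mat = np.array([[1.0, 2.0, 3.0],
                [4.0, 5.0, 6.0]])
summary = row_summary(mat)

print(summary.shape)  # (2, 3): one length-3 result per input row
```

Note that np.vectorize is a convenience wrapper and still calls the Python function once per row, so it is not expected to be faster than an explicit loop.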
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | |
Solution 2 | Dharman |