'How to vectorize such an algorithm in python?
I have four (nx1) dimensional arrays named a, b, c, and F. I want to run this algorithm without any loops.
for i in range(n):
if a[i] < b[i]:
F[i] = 0
elif a[i] > c[i]:
F[i] = 1
elif b[i] <= a[i] <= c[i]:
F[i] = 2
I want to write this code in a more vectorized way to make it more efficient as my dataset is quite large.
Solution 1:[1]
If you care about performance, why not try numba
? It might get 10X
faster than logical operations while saving memory at the same time. As a bonus, the loop code you wrote will be kept intact, only through an @njit
decorator in front of the function.
from numba import njit
@njit
def loop(F, a, b, c):
for i in range(F.shape[0]):
if a[i] < b[i]:
F[i] = 0
elif a[i] < c[i]:
F[i] = 1
elif b[i] <= a[i] <= c[i]:
F[i] = 2
Compare with vectorized solution by @NiziL using sizes of 100 and 1000 vectors,
timeit(lambda: loop(F, a, b, c))
timeit(lambda: idx(F, a, b, c))
Gives:
# 1.0355658 (Size: 100, @njit loop)
# 4.6863165 (Size: 100, idx)
# 1.9563843 (Size: 1000, @njit loop)
# 16.658198 (Size: 1000, idx)
Solution 2:[2]
I feel like you could use boolean indexing for this task.
F[np.logical_and(b <= a, a <= c)] = 2
F[a > c] = 1
F[a < b] = 0
Beware, the affectation order is important here to get the desired outcomes.
Some timeit
benchmark:
def loop(F, a, b, c):
for i in range(F.shape[0]):
if a[i] < b[i]:
F[i] = 0
elif a[i] > c[i]:
F[i] = 1
elif b[i] <= a[i] <= c[i]:
F[i] = 2
def idx(F, a, b, c):
F[np.logical_and(b <= a, a <= c)] = 2
F[a > c] = 1
F[a < b] = 0
with (10x1) array:
>>> timeit.timeit(lambda: loop(F, a, b, c))
11.585818066001593
>>> timeit.timeit(lambda: idx(F, a, b, c))
3.337863392000145
with (1000x1) array:
>>> timeit.timeit(lambda: loop(F, a, b, c))
1457.268110728999
>>> timeit.timeit(lambda: idx(F, a, b, c))
10.00236530300026
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | AboAmmar |
Solution 2 |