Performance issues when iterating numpy array
I have a 3D array of an image, such as

    [
        [
            [225, 0, 0],
            [225, 225, 0],
            ...
        ],
        [
            [225, 0, 0],
            [225, 225, 0],
            ...
        ],
        ...
    ]
The array has shape 500×500×3, which is 750,000 elements. These are the simple nested loops I use to iterate over it:
    for row in arr:
        for col in row:
            for elem in col:
                elem = (2 * elem / MAX_COLOR_VAL) - 1
But it takes a lot of time (> 5 min) to iterate.
I'm new to numpy, so maybe I'm iterating over the array the wrong way? How can I optimize these loops?
Solution 1:[1]
Numpy arrays are not designed for iterating over their elements. That will likely be even slower than iterating over a plain Python list, since every element access wraps the raw value in a new numpy scalar object (and unwraps it again), which adds overhead per element.

Numpy arrays are designed for processing in bulk, for example calculating the elementwise sum of two 1000×1000 matrices in a single operation.
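In the bulk style, the whole operation is written once on the arrays and the per-element work runs in numpy's compiled loops rather than in the Python interpreter. A minimal sketch of that elementwise-sum example (the random data and sizes are just for illustration):

    import numpy as np

    a = np.random.rand(1000, 1000)
    b = np.random.rand(1000, 1000)

    # one vectorized expression; no Python-level loop over the million elements
    c = a + b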
If you want to multiply all elements by 2, divide them by MAX_COLOR_VAL, and subtract one, you can simply construct a new array with:
    arr = (2 * arr.astype(float) / MAX_COLOR_VAL) - 1
This applies the operation to all elements at once.
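For a self-contained picture, here is a minimal runnable sketch of the vectorized version; MAX_COLOR_VAL = 255 (the usual maximum for 8-bit channels) and the random image are assumptions standing in for the asker's data:

    import numpy as np

    MAX_COLOR_VAL = 255  # assumption: 8-bit color channels

    # stand-in for the 500x500 RGB image from the question
    arr = np.random.randint(0, 256, size=(500, 500, 3), dtype=np.uint8)

    # map every channel value from [0, 255] to [-1, 1] in one expression;
    # astype(float) avoids integer division and uint8 overflow
    arr = (2 * arr.astype(float) / MAX_COLOR_VAL) - 1

This should finish in milliseconds rather than minutes, since the arithmetic happens in numpy's compiled inner loops instead of three Python-level loops.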
Note: if you iterate over a numpy array, you do not iterate over the indices; you iterate over the rows themselves. So the row in "for row in arr" will be a 2D subarray, not the index of one.
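A quick way to confirm this, using the question's 500×500×3 shape:

    import numpy as np

    arr = np.zeros((500, 500, 3), dtype=np.uint8)
    for row in arr:
        print(row.shape)  # (500, 3): each iteration yields a 2D slice, not an index
        break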
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source
---|---
Solution 1 | Willem Van Onsem