'NumPy fancy indexing with provided output array

I have a two-dimensional array and index it with a pair of arrays (in fact my arrays are much larger, millions of elements):

a = np.array([[1, 2, 3], [4, 5, 6]])
b = a[[0, 0, 0, 1], [0, 1, 2, 0]]

Indexing will allocate a new array. Is there a way to do this indexing with an output array provided?

I looked at np.take and np.choose, but it seems that they don't work with a pair of arrays. I managed to use np.take(..., out=buf) if I ravel the array and manually construct the 1-d instance, but it causes more memory accesses, and almost kills the improvement over eliminating allocation for the indexing result.



Solution 1:[1]

This was asked a long time ago, but just in case:

If your large array (a in your question) is C-contiguous, then you could use np.lib.stride_tricks.as_strided() instead of np.ravel() as in your question. That returns a view of the large array, without copying, and is very fast. You can then use np.take() with an existing out=... destination.

Note however that, in your question, the index pair itself is large (twice as many elements as the output you seek). If we convert it to a 1-d index (with np.ravel_multi_index(ix, a.shape) as below), that is also a new array of the same size as the output you want, so in the end you might not save much time not memory.

In any case, here is one use of np.lib.stride_tricks.as_strided:

n = np.prod(a.shape)
as1d_view = np.lib.stride_tricks.as_strided(a, (n, ), writeable=False)

# then, to index and copy into buf:
np.take(as1d_view, np.ravel_multi_index(ix, a.shape), out=buf)

Example 1: single large array, multiple indices

# setup
a = np.array([[1, 2, 3], [4, 5, 6]])
indices = [
    (np.array([0, 0, 0, 1]), np.array([0, 1, 2, 0])),
    (np.array([0, 1, 0, 1, 0, 1]), np.array([2, 1, 1, 2, 0, 0])),
    (np.array([0, 0, 1]), np.array([0, 1, 2])),
]

# init
n = np.prod(a.shape)
as1d_view = np.lib.stride_tricks.as_strided(a, (n, ), writeable=False)
m = max(len(ix[0]) for ix in indices)
buf = np.empty(m, dtype=a.dtype)

# loop
for i, ix in enumerate(indices):
    m = len(ix[0])
    np.take(as1d_view, np.ravel_multi_index(ix, a.shape), out=buf[:m])
    print(f'for index {i}, buf={buf[:m]!r}')

# gives:
# for index 0, buf=array([1, 2, 3, 4])
# for index 1, buf=array([3, 5, 2, 6, 1, 4])
# for index 2, buf=array([1, 2, 6])

Example 2: single index, multiple large arrays

# setup (presumably the large arrays are obtained one at a time instead...)
a_list = [
    np.array([[1, 2, 3], [4, 5, 6]]),
    np.arange(15).reshape(3, 5),
    np.random.randint(0, 20, (4,4)),
]
ix = np.array([0, 0, 0, 1]), np.array([0, 1, 2, 0])

# init
m = len(ix[0])
buf = np.empty(m, dtype=a.dtype)

# loop
for i, a in enumerate(a_list):
    ix1d = np.ravel_multi_index(ix, a.shape)
    as1d_view = np.lib.stride_tricks.as_strided(a, (n, ), writeable=False)
    np.take(as1d_view, ix1d, out=buf)
    print(f'for array {i}, buf={buf!r}')

# gives:
# for array 0, buf=array([1, 2, 3, 4])
# for array 1, buf=array([0, 1, 2, 5])
# for array 2, buf=array([ 5, 19,  1,  3])

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Pierre D