Extract subarray between certain value in Python
I have a list of values that are the result of merging many files. I need to pad some of the values. I know that each sub-section begins with the value -1. I am trying to basically extract a sub-array between -1's in the main array via iteration.
For example, suppose this is the main list:
-1 1 2 3 4 5 7 -1 4 4 4 5 6 7 7 8 -1 0 2 3 5 -1
I would like to extract the values between the -1s:
list_a = 1 2 3 4 5 7
list_b = 4 4 4 5 6 7 7 8
list_c = 0 2 3 5 ...
list_n = a1 a2 a3 ... aM
I have extracted the indices for each -1 by searching through the main list:
minus_ones = [i for i, j in enumerate(q) if j == -1]
I also assembled them as pairs using a common recipe:
from itertools import tee

def pairwise(iterable):
    # s -> (s0, s1), (s1, s2), (s2, s3), ...
    a, b = tee(iterable)
    next(b, None)
    return zip(a, b)

for index in pairwise(minus_ones):
    print(index)
The next step I am trying to do is grab the values between the index pairs, for example:
list_b: (7 , 16) -> 4 4 4 5 6 7 7 8
so I can then do some work to those values (I will add a fixed int. to each value in each sub-array).
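A minimal sketch of that next step, assuming the main list is named q as above and that the fixed integer is a hypothetical offset of 10. Each index pair is used as slice bounds, starting one past the first -1 so the marker itself is left out:

```python
from itertools import tee

def pairwise(iterable):
    """Yield consecutive overlapping pairs: (s0, s1), (s1, s2), ..."""
    a, b = tee(iterable)
    next(b, None)
    return zip(a, b)

q = [-1, 1, 2, 3, 4, 5, 7, -1, 4, 4, 4, 5, 6, 7, 7, 8, -1, 0, 2, 3, 5, -1]
offset = 10  # hypothetical fixed integer to add to every value

# Positions of every -1 marker in the main list.
minus_ones = [i for i, value in enumerate(q) if value == -1]

# Slice strictly between each pair of markers, skipping the -1 itself.
sub_lists = [q[start + 1:stop] for start, stop in pairwise(minus_ones)]

# Add the offset to each value in each sub-list.
shifted = [[value + offset for value in sub] for sub in sub_lists]
```

For the pair (7, 16) this takes q[8:16], which is the list_b shown above.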
Solution 1:[1]
You mentioned numpy in the tags. If you're using it, have a look at np.split.
For example:
import numpy as np
x = np.array([-1, 1, 2, 3, 4, 5, 7, -1, 4, 4, 4, 5, 6, 7, 7, 8, -1, 0, 2,
3, 5, -1])
arrays = np.split(x, np.where(x == -1)[0])
arrays = [item[1:] for item in arrays if len(item) > 1]
This yields:
[array([1, 2, 3, 4, 5, 7]),
array([4, 4, 4, 5, 6, 7, 7, 8]),
array([0, 2, 3, 5])]
What's going on is that where will yield an array (actually a tuple of arrays, hence the where(blah)[0]) of the indices where the given expression is true. We can then pass these indices to split to get a sequence of arrays.
However, each resulting array will still start with its -1, and if the sequence starts with -1 there will also be an empty array at the start. Therefore, we need to filter these out.
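To make those intermediate steps concrete, here is a small sketch on a shorter, made-up array. The variable names and the offset of 10 are assumptions for illustration only:

```python
import numpy as np

x = np.array([-1, 1, 2, 3, -1, 4, 5, -1])

where_result = np.where(x == -1)   # a 1-tuple holding one index array
split_points = where_result[0]     # positions of the -1 markers: 0, 4, 7

# Splitting at the marker positions leaves each marker at the front of
# its piece, plus an empty piece before the first marker.
raw_pieces = np.split(x, split_points)

# Drop the leading -1 from each piece and discard pieces with nothing left.
pieces = [piece[1:] for piece in raw_pieces if len(piece) > 1]

# Adding a fixed integer is then a plain vectorized operation per piece.
shifted = [piece + 10 for piece in pieces]  # 10 is an arbitrary example offset
```

Here raw_pieces holds four arrays (an empty one, two that begin with -1, and a final one containing only -1), and the filter keeps the two that carry data.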
If you're not already using numpy, though, your (or @DSM's) itertools solution is probably a better choice.
Solution 2:[2]
If you only need the groups themselves and don't care about the indices of the groups (you could always reconstruct them, after all), I'd use itertools.groupby:
>>> from itertools import groupby
>>> seq = [-1, 1, 2, 3, 4, 5, 7, -1, 4, 4, 4, 5, 6, 7, 7, 8, -1, 0, 2, 3, 5, -1]
>>> groups = [list(g) for k,g in groupby(seq, lambda x: x != -1) if k]
>>> groups
[[1, 2, 3, 4, 5, 7], [4, 4, 4, 5, 6, 7, 7, 8], [0, 2, 3, 5]]
I missed the numpy tags, though: if you're working with numpy arrays, using np.split/np.where is a better choice.
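If the indices do matter, one possible way to keep them is to group over enumerate(seq) instead of seq. This is only a sketch; the name groups_with_start is made up for illustration:

```python
from itertools import groupby

seq = [-1, 1, 2, 3, 4, 5, 7, -1, 4, 4, 4, 5, 6, 7, 7, 8, -1, 0, 2, 3, 5, -1]

groups_with_start = []
# Group (index, value) pairs by whether the value is data or a -1 marker.
for is_data, run in groupby(enumerate(seq), key=lambda pair: pair[1] != -1):
    if is_data:
        run = list(run)
        start_index = run[0][0]                # index of the first value in the run
        values = [value for _, value in run]   # strip the indices back off
        groups_with_start.append((start_index, values))
```

Note that groupby merges adjacent -1 values into a single run, so two markers in a row do not produce an empty group.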
Solution 3:[3]
I would do it something like this, which is a little different from the path you started down:
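input_list = [-1,1,2,3,4,5,7,-1,4,4,4,5,6,7,7,8,-1,0,2,3,5,-1]
list_index = -1
new_lists = []
for i in input_list:
    if i == -1:
        # Each -1 starts a new sub-list.
        list_index += 1
        new_lists.append([])
        continue
    else:
        # Debug output: current sub-list index and the lists built so far.
        print(list_index)
        print(new_lists)
        new_lists[list_index].append(i)
# Note: the trailing -1 leaves an empty list at the end of new_lists.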
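The same loop can be wrapped in a function. The sketch below is one possible variant, not the answer's exact code: the name split_on_marker is hypothetical, values before the first marker are ignored, and empty sub-lists (such as the one left by a trailing -1) are dropped:

```python
def split_on_marker(values, marker=-1):
    """Split `values` into sub-lists, starting a new one at each `marker`."""
    sub_lists = []
    for value in values:
        if value == marker:
            sub_lists.append([])           # a marker opens a new sub-list
        elif sub_lists:                    # ignore anything before the first marker
            sub_lists[-1].append(value)
    # A trailing marker leaves an empty list behind; drop empty sub-lists.
    return [sub for sub in sub_lists if sub]

input_list = [-1, 1, 2, 3, 4, 5, 7, -1, 4, 4, 4, 5, 6, 7, 7, 8, -1, 0, 2, 3, 5, -1]
new_lists = split_on_marker(input_list)
```

Dropping empties also hides genuinely empty sections between two adjacent markers, so the final filter should be removed if those need to be preserved.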
Solution 4:[4]
I think when you build your list, you can directly add the values to a string. So rather than starting with a list like xx = [], you can start with xx = '' and then update it with xx = xx + ' ' + str(val). The result will be a string rather than a list. Then you can just use the split() method on the string.
In [48]: xx
Out[48]: '-1 1 2 3 4 5 7 -1 4 4 4 5 6 7 7 8 -1 0 2 3 5 -1'
In [49]: xx.split('-1')
Out[49]: ['', ' 1 2 3 4 5 7 ', ' 4 4 4 5 6 7 7 8 ', ' 0 2 3 5 ', '']
In [50]: xx.split('-1')[1:-1]
Out[50]: [' 1 2 3 4 5 7 ', ' 4 4 4 5 6 7 7 8 ', ' 0 2 3 5 ']
I'm sure you can take it from here ...
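One way to take it from there, sketched under the assumption that the values are whitespace-separated integers as in the session above, is to split each chunk again and convert the tokens back to int:

```python
xx = '-1 1 2 3 4 5 7 -1 4 4 4 5 6 7 7 8 -1 0 2 3 5 -1'

# Drop the empty strings before the first and after the last '-1'.
chunks = xx.split('-1')[1:-1]

# Split each chunk on whitespace and convert the tokens back to integers.
sub_lists = [[int(token) for token in chunk.split()] for chunk in chunks]
```

This relies on no other value containing the text '-1': a value such as -10 or -15 would also be split, so the string approach is only safe when -1 is the sole negative number that can appear.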
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | |
Solution 2 | |
Solution 3 | ernie |
Solution 4 | ssm |