Count groups of consecutive 1s in pandas

I have a list of 1s and 0s, and I would like to count the number of groups of consecutive 1s.

mylist = [0,0,1,1,0,1,1,1,1,0,1,0]

Counting by hand gives 3 groups, but is there a way to do it in Python?



Solution 1:[1]

Option 1

With pandas. First, initialise a dataframe:

In [78]: df
Out[78]: 
    Col1
0      0
1      0
2      1
3      1
4      0
5      1
6      1
7      1
8      1
9      0
10     1
11     0

Now divide the total number of 1s by the number of groups, which gives the mean group length:

In [79]: df.sum() / df.diff().eq(1).cumsum().max()
Out[79]: 
Col1    2.333333
dtype: float64

If you want just the number of groups, df.diff().eq(1).cumsum().max() is enough.
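The answer does not show how the frame is built, so here is a minimal sketch, assuming a single column named Col1 as in the output above. The helper name count_groups_pd is hypothetical. It compares each value with the previous one via shift(fill_value=0) rather than diff(), because diff() leaves NaN in the first row and would therefore miss a group that starts at index 0.

```python
import pandas as pd

def count_groups_pd(values):
    """Count runs of consecutive 1s in a sequence of 0s and 1s."""
    s = pd.Series(values, dtype="int64")
    # A group starts where the value is 1 and the previous value is 0;
    # fill_value=0 treats the position before the first row as 0.
    starts = s.eq(1) & s.shift(fill_value=0).eq(0)
    return int(starts.sum())

df = pd.DataFrame({"Col1": [0, 0, 1, 1, 0, 1, 1, 1, 1, 0, 1, 0]})
n_groups = count_groups_pd(df["Col1"])
print(n_groups)                      # 3
print(df["Col1"].sum() / n_groups)   # mean group length
```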


Option 2

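With itertools.groupby (array is the list of 0s and 1s, for example mylist from the question, and itertools must be imported first):

In [88]: sum(array) / sum(1 if sum(g) else 0 for _, g in itertools.groupby(array))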
Out[88]: 2.3333333333333335

If you want just the number of groups, sum(1 if sum(g) else 0 for _, g in itertools.groupby(array)) is enough.

Solution 2:[2]

Here I count every jump from 0 to 1. Prepending a 0 ensures that a run of 1s at the very start of the list is counted too.

import numpy as np

mylist_arr = np.array([0] + [0,0,1,1,0,1,1,1,1,0,1,0])
diff = np.diff(mylist_arr)
count = np.sum(diff == 1)
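The same idea can be wrapped in a reusable function. This is only a sketch: the name count_rising_edges is hypothetical, and the input is assumed to contain only 0s and 1s.

```python
import numpy as np

def count_rising_edges(values):
    """Count 0 -> 1 transitions, treating the position before the start as 0."""
    arr = np.asarray(values, dtype=int)
    # Prepend a 0 so that a run of 1s at index 0 produces a rising edge.
    padded = np.concatenate(([0], arr))
    return int(np.sum(np.diff(padded) == 1))

print(count_rising_edges([0, 0, 1, 1, 0, 1, 1, 1, 1, 0, 1, 0]))  # 3
print(count_rising_edges([1, 1, 0, 1]))                           # 2
```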

Solution 3:[3]

You can try this:

import numpy as np
import pandas as pd
df=pd.DataFrame(data = [0,0,1,1,0,1,1,1,1,0,1,0])
df['Gid']=df[0].diff().eq(1).cumsum()
df=df[df[0].eq(1)]
df.groupby('Gid').size()
Out[245]: 
Gid
1    2
2    4
3    1
dtype: int64

sum(df.groupby('Gid').size())/len(df.groupby('Gid').size())
Out[244]: 2.3333333333333335

Solution 4:[4]

Here's one solution:

durations = []

for n, d in enumerate(mylist):
    if (n == 0 and d == 1) or (n > 0 and mylist[n-1] == 0 and d == 1):
        durations.append(1)
    elif d == 1:
        durations[-1] += 1

def mean(x):
    return sum(x)/len(x)

print(durations)
print(mean(durations))
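The run lengths collected in durations can also be obtained with itertools.groupby. A minimal sketch follows; run_lengths and its target parameter are hypothetical names introduced for illustration. It also guards the mean against a list with no 1s, where durations would be empty and mean() above would divide by zero.

```python
from itertools import groupby

def run_lengths(values, target=1):
    """Return the length of each run of consecutive `target` values."""
    return [sum(1 for _ in run) for key, run in groupby(values) if key == target]

mylist = [0, 0, 1, 1, 0, 1, 1, 1, 1, 0, 1, 0]
durations = run_lengths(mylist)
print(durations)        # [2, 4, 1]
print(len(durations))   # 3 groups
# Guard the mean against a list that contains no 1s at all.
print(sum(durations) / len(durations) if durations else 0.0)
```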

Solution 5:[5]

You can try this:

mylist = [0,0,1,1,0,1,1,1,1,0,1,0]
previous = mylist[0]
count = 0

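for i in mylist[1:]:
    if i == 1:
        if previous == 0:
            previous = 1
    elif i == 0 and previous == 1:
        # a 1 -> 0 transition closes a group
        count += 1
        previous = 0

# a group that runs to the end of the list is never closed by a 0
if previous == 1:
    count += 1

print(count)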

Output:

3

Solution 6:[6]

Take a look at itertools.groupby:

import itertools
import operator

def get_1_groups(ls):
    return sum(map(operator.itemgetter(0), itertools.groupby(ls)))

This works because itertools.groupby returns the iterable equivalent of:

itertools.groupby([0, 0, 1, 1, 0, 1, 1, 1, 1, 0, 1, 0])
# ==>
[(0, [0, 0]), (1, [1, 1]), (0, [0]), (1, [1, 1, 1, 1]), (0, [0]), (1, [1]), (0, [0])]

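So you are just summing the first item of each pair: every group of 1s contributes 1 and every group of 0s contributes 0.

If the list can contain values other than 0 and 1, those values would be added to the sum as well.

To count only the groups of one specific value, you can do something like this: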

def count_groups(ls, target=1):
    return sum(target == value for value, _ in itertools.groupby(ls))
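To illustrate the difference between the two functions, here is a short sketch with a list that also contains 2s. The sample list is made up for this example.

```python
import itertools
import operator

def get_1_groups(ls):
    return sum(map(operator.itemgetter(0), itertools.groupby(ls)))

def count_groups(ls, target=1):
    return sum(target == value for value, _ in itertools.groupby(ls))

data = [0, 2, 2, 0, 1, 1, 0]          # group keys are 0, 2, 0, 1, 0
print(get_1_groups(data))             # 3: the keys 2 and 1 are added together
print(count_groups(data))             # 1: only the run of 1s is counted
print(count_groups(data, target=2))   # 1: only the run of 2s is counted
```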

Solution 7:[7]

This can be done without much work by counting the number of times the list transitions from 0 to 1 (counting rising signal edges):

count = 0
last = 0
for element in mylist:
    if element != last:
        last = element
        if element:  # 1 is truthy
            count += 1
print(count)

Solution 8:[8]

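Here is my solution. Note that it finds the length of the longest run of consecutive 1s, not the number of groups.

c is the list to process:

c = [1,0,1,1,1,0]
longest = 0
counter = 0

for j in c:
    if j == 1:
        counter += 1
    else:
        # a 0 ends the current run: record it if it is the longest so far
        if counter > longest:
            longest = counter
        # always reset, even when the run was not a new maximum
        counter = 0

# the list may end in the middle of a run
if counter > longest:
    longest = counter

print(longest)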

Solution 9:[9]

A quick and dirty one-liner (almost):

import re
mylist = [0,0,1,1,0,1,1,1,1,0,1,0]
print(len(re.sub(r'0+', '0', ''.join(str(x) for x in mylist)).strip('0').split('0')))
3

Step by step:

import re
mylist = [0,0,1,1,0,1,1,1,1,0,1,0]
sal1 = ''.join(str(x) for x in mylist) # returns a string from the list
sal2 = re.sub(r'0+', '0', sal1)   # remove duplicates of zeroes
sal3 = sal2.strip('0')            # remove 0s from the start & the end of the string
sal4 = len(sal3.split('0'))       # split the string on '0' into a list and take its length

This produces:

sal1 -> 001101111010
sal2 -> 01101111010
sal3 -> 110111101
sal4 -> 3
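One edge case is worth noting: for a list with no 1s (or an empty list), the strip/split steps end with ''.split('0'), which yields [''] and so reports 1 group instead of 0. As an alternative sketch, not the original author's method, the runs of 1s can be matched directly with re.findall. The function name count_groups_re is hypothetical, and the input is assumed to contain only the single digits 0 and 1.

```python
import re

def count_groups_re(values):
    """Count runs of consecutive 1s by matching '1+' in the joined string."""
    text = ''.join(str(x) for x in values)
    # Each non-overlapping match of '1+' is one maximal run of 1s.
    return len(re.findall(r'1+', text))

print(count_groups_re([0, 0, 1, 1, 0, 1, 1, 1, 1, 0, 1, 0]))  # 3
print(count_groups_re([0, 0, 0]))                              # 0
```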

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2 philippd
Solution 3 BENY
Solution 4 hilssu
Solution 5 Ajax1234
Solution 6 Artyer
Solution 7
Solution 8 Community
Solution 9 Leonardo Brugues