Count consecutive occurrences of values varying in length in a numpy array

Say I have a bunch of numbers in a numpy array and I test them based on a condition returning a boolean array:

import numpy as np

np.random.seed(3456)
a = np.random.rand(8)
condition = a > 0.5

With this boolean array I want to count the lengths of all consecutive occurrences of True. For example, if I had [True,True,True,False,False,True,True,False,True] I would want to get back [3,2,1].

I can do that using this code:

length, count = [], 0
for i in range(len(condition)):

    if condition[i]:
        count += 1
    elif count > 0:
        # A run of True values just ended; record its length.
        length.append(count)
        count = 0

    # Record a run that extends to the end of the array.
    if i == len(condition) - 1 and count > 0:
        length.append(count)

print(length)

But is there anything already implemented for this, such as a Python, NumPy, or SciPy function that counts the lengths of consecutive occurrences in a list or array for a given input?

Answers:


Method 1

If you already have a NumPy array, a vectorized approach is probably going to be faster than a Python-level loop:

>>> condition = np.array([True,True,True,False,False,True,True,False,True])
>>> np.diff(np.where(np.concatenate(([condition[0]],
                                     condition[:-1] != condition[1:],
                                     [True])))[0])[::2]
array([3, 2, 1])

It detects where chunks begin, handles the first and last chunks specially, computes the differences between consecutive chunk starts, and discards the lengths that correspond to False chunks.
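The one-liner can be unpacked into named steps. The following is a minimal sketch of the same idea; the function name `true_run_lengths` and the empty-input guard are assumptions added for illustration and are not part of the original answer.

```python
import numpy as np

def true_run_lengths(condition):
    """Return the lengths of consecutive runs of True in a 1-D boolean array."""
    condition = np.asarray(condition, dtype=bool)
    if condition.size == 0:
        # Guard added for illustration: condition[0] would fail on empty input.
        return np.array([], dtype=int)
    # changes[i] is True when element i + 1 differs from element i.
    changes = condition[:-1] != condition[1:]
    # Mark index 0 only if the array starts with True, and always mark the end,
    # so the first marked boundary is the start of a True chunk.
    boundaries = np.concatenate(([condition[0]], changes, [True]))
    edges = np.where(boundaries)[0]
    # Differences between edges alternate: True chunk, False chunk, True chunk, ...
    return np.diff(edges)[::2]

print(true_run_lengths([True, True, True, False, False, True, True, False, True]))
# [3 2 1]
```

Because index 0 is marked only when the array starts with True, the `[::2]` slice picks the True chunks whether the input begins with True or with False.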

Method 2

Here’s a solution using itertools (it’s probably not the fastest solution):

import itertools
condition = [True,True,True,False,False,True,True,False,True]
[ sum( 1 for _ in group ) for key, group in itertools.groupby( condition ) if key ]

Out:
[3, 2, 1]
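The same expression also works when the input is a NumPy boolean array, since `itertools.groupby` accepts any iterable. A minimal sketch, where the helper name `run_lengths_groupby` is hypothetical and introduced only for illustration:

```python
import itertools
import numpy as np

def run_lengths_groupby(values):
    """Lengths of consecutive runs of truthy items in any iterable."""
    # groupby yields one (key, group) pair per run of equal consecutive items;
    # keep only the runs whose key is truthy and count their members.
    return [sum(1 for _ in group)
            for key, group in itertools.groupby(values) if key]

condition = np.array([True, True, True, False, False, True, True, False, True])
print(run_lengths_groupby(condition))  # [3, 2, 1]
```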

Method 3

You can also count the distance between consecutive False values by looking at the indices (the result of np.where) of the inverse of your condition array. The trick is ensuring the boolean array starts with a False; it must also end with a False, as in the example below, or a trailing run of True will not be counted. Basically, you're measuring the distance between the boundaries that enclose each run of True.

condition = np.array([True, True, True, False, False, True, True, False, True, False])
if condition[0]:
    condition = np.concatenate([[False], condition])

idx = np.where(~condition)[0]

As the final step, subtract 1 from these differences to remove the left and right edges. Adjacent False values show up as 0 in the result.

>>> np.ediff1d(idx) - 1
array([3, 0, 2, 1])
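To get only the lengths of the True runs, the zeros can be filtered out, and padding both ends with False avoids having to special-case the first and last elements. A minimal sketch of that variant, assuming a 1-D boolean input; the function name `true_run_lengths_padded` is hypothetical:

```python
import numpy as np

def true_run_lengths_padded(condition):
    """Lengths of runs of True, measured as gaps between False positions."""
    condition = np.asarray(condition, dtype=bool)
    # Pad with False on both sides so every True run is bounded by a False.
    padded = np.concatenate(([False], condition, [False]))
    idx = np.where(~padded)[0]
    # The gap between neighbouring False positions, minus one, is the run length.
    gaps = np.ediff1d(idx) - 1
    # Adjacent False values give gaps of 0; keep only the real runs.
    return gaps[gaps > 0]

print(true_run_lengths_padded([True, True, True, False, False, True, True, False, True]))
# [3 2 1]
```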

Method 4

If t is a NumPy array sorted in ascending order, then:

d=np.diff(t)
d_incr = np.argwhere(d>0).flatten()
d_incr = np.insert(d_incr, 0, 0)

The array d_incr will contain 0 followed by the indices i at which t[i+1] > t[i], that is, the positions just before each change in value. This allows one to perform operations on the groups of values between d_incr[i-1] and d_incr[i] for i in range(1, d_incr.size).
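For the original question of counting run lengths, the same np.diff idea can be turned into group sizes. The following is a sketch of a variant, not the exact code above: it shifts the change indices by one so they point at the first element of each new group, and it assumes t is non-empty and sorted in ascending order. The function name `sorted_run_lengths` and the sample array are hypothetical.

```python
import numpy as np

def sorted_run_lengths(t):
    """Sizes of the groups of equal values in a non-empty ascending array."""
    t = np.asarray(t)
    # Index of the first element of each new group (np.diff is one shorter
    # than t, hence the + 1).
    starts = np.flatnonzero(np.diff(t) > 0) + 1
    # Add the start of the first group and the end of the array as boundaries.
    boundaries = np.concatenate(([0], starts, [t.size]))
    return np.diff(boundaries)

t = np.array([1, 1, 2, 2, 2, 3])  # hypothetical sorted input
print(sorted_run_lengths(t))  # [2 3 1]
```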

All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
