Identify if list has consecutive elements that are equal

I’m trying to identify if a large list has consecutive elements that are the same.

So let’s say:

lst = [1, 2, 3, 4, 5, 5, 6]

And in this case, I would return true, since there are two consecutive elements lst[4] and lst[5], are the same value.

I know this could probably be done with some sort of combination of loops, but I was wondering if there were a more efficient way to do this?

Answers:

Thank you for visiting the Q&A section on Magenaut. Please note that all the answers may not help you solve the issue immediately. So please treat them as advisements. If you found the post helpful (or not), leave a comment & I’ll get back to you as soon as possible.

Method 1

You can use itertools.groupby() and a generator expression within any()
*:

>>> from itertools import groupby
>>> any(sum(1 for _ in g) > 1 for _, g in groupby(lst))
True

Or as a more Pythonic way you can use zip(), in order to check if at least there are two equal consecutive items in your list:

>>> any(i==j for i,j in zip(lst, lst[1:])) # In python-2.x,in order to avoid creating a 'list' of all pairs instead of an iterator use itertools.izip()
True

Note: The first approach is good when you want to check if there are more than 2 consecutive equal items, otherwise, in this case the second one takes the cake!


* Using sum(1 for _ in g) instead of len(list(g)) is very optimized in terms of memory use (not reading the whole list in memory at once) but the latter is slightly faster.

Method 2

You can use a simple any condition:

lst = [1, 2, 3, 4, 5, 5, 6]
any(lst[i]==lst[i+1] for i in range(len(lst)-1))
#outputs:
True

any return True if any of the iterable elements are True

Method 3

If you’re looking for an efficient way of doing this and the lists are numerical, you would probably want to use numpy and apply the diff (difference) function:

>>> numpy.diff([1,2,3,4,5,5,6])
array([1, 1, 1, 1, 0, 1])

Then to get a single result regarding whether there are any consecutive elements:

>>> numpy.any(~numpy.diff([1,2,3,4,5,5,6]).astype(bool))

This first performs the diff, inverts the answer, and then checks if any of the resulting elements are non-zero.

Similarly,

>>> 0 in numpy.diff([1, 2, 3, 4, 5, 5, 6])

also works well and is similar in speed to the np.any approach (credit for this last version to heracho).

Method 4

Here a more general numpy one-liner:

number = 7
n_consecutive = 3
arr = np.array([3, 3, 6, 5, 8, 7, 7, 7, 4, 5])
#                              ^  ^  ^ 
 
np.any(np.convolve(arr == number, v=np.ones(n_consecutive), mode='valid') 
       == n_consecutive)[0]

This method always searches the whole array, while the approach from @Kasramvd ends when the condition is first met. So which method is faster dependents on how sparse those cases of consecutive numbers are.
If you are interested in the positions of the consecutive numbers, and have to look at all elements of the array this approach should be faster (for larger arrays (or/and longer sequences)).

idx = np.nonzero(np.convolve(arr==number, v=np.ones(n_consecutive), mode='valid') 
                             == n_consecutive)
# idx = i: all(arr[i:i+n_consecutive] == number)

If you are not interested in a specific value but at all consecutive numbers in general a slight variation of @jmetz‘s answer:

np.any(np.convolve(np.abs(np.diff(arr)), v=np.ones(n_consecutive-1), mode='valid') == 0)
#                  ^^^^^^
# EDIT see djvg's comment

Method 5

Starting in Python 3.10, the new pairwise function provides a way to slide through pairs of consecutive elements, so that we can test the quality between consecutive elements:

from itertools import pairwise

any(x == y for (x, y) in pairwise([1, 2, 3, 4, 5, 5, 6]))
# True

The intermediate result of pairwise:

pairwise([1, 2, 3, 4, 5, 5, 6])
# [(1, 2), (2, 3), (3, 4), (4, 5), (5, 5), (5, 6)]

Method 6

A simple for loop should do it:

def check(lst):
    last = lst[0]
    for num in lst[1:]:
        if num == last:
            return True
        last = num
    return False


lst = [1, 2, 3, 4, 5, 5, 6]
print (check(lst)) #Prints True

Here, in each loop, I check if the current element is equal to the previous element.

Method 7

The convolution approach suggested in scleronomic’s answer is very promising, especially if you’re looking for more than two consecutive elements.

However, the implementation presented in that answer might not be the most efficient, because it consists of two steps: diff() followed by convolve().

Alternative implementation

If we consider that the diff() can also be calculated using convolution, we can combine the two steps into a single convolution.

The following alternative implementation only requires a single convolution of the full signal, which is advantageous if the signal has many elements.

Note that we cannot take the absolute values of the diff (to prevent false positives, as mentioned in this comment), so we add some random noise to the unit kernel instead.

# example signal
signal = numpy.array([1, 0, 1, 0, 1, 0, 0, 1, 1, 1, 0])
# minimum number of consecutive elements
n_consecutive = 3
# convolution kernel for weighted moving sum (with small random component)
rng = numpy.random.default_rng()
random_kernel = 1 + 0.01 * rng.random(n_consecutive - 1)
# convolution kernel for first-order difference (similar to numpy.diff)
diff_kernel = [1, -1]
# combine the kernels so we only need to do one convolution with the signal
combined_kernel = numpy.convolve(diff_kernel, random_kernel, mode='full')
# convolve the signal to get the moving weighted sum of differences
moving_sum_of_diffs = numpy.convolve(signal, combined_kernel, mode='valid')
# check if moving sum is zero anywhere
result = numpy.any(moving_sum_of_diffs == 0)

See the DSP guide for a detailed discussion of convolution.

Timing

The difference between the two implementations boils down to this:

def original(signal, unit_kernel):
    return numpy.convolve(numpy.abs(numpy.diff(signal)), unit_kernel, mode='valid')


def alternative(signal, combined_kernel):
    return numpy.convolve(signal, combined_kernel, mode='valid')

where unit_kernel = numpy.ones(n_consecutive - 1) and combined_kernel is defined above.

Comparison of these two functions, using timeit, shows that alternative() can be several times faster, for small kernel sizes (i.e. small value of n_consecutive). However, for large kernel sizes the advantage becomes negligible, because the convolution becomes dominant (compared to the diff).

Notes:

  • For large kernel sizes I would prefer the original two-step approach, as I think it is easier to understand.
  • Due to numerical issues it may be necessary to replace numpy.any(moving_sum_of_diffs == 0) by numpy.any(numpy.abs(moving_sum_of_diffs) < very_small_number), see e.g. here.

Method 8

My solution for this if you want to find out whether 3 consecutive values are equal to 7. For example, a tuple of intList = (7, 7, 7, 8, 9, 1):

for i in range(len(intList) - 1):
        if intList[i] == 7 and intList[i + 2] == 7 and intList[i + 1] == 7:
            return True
    return False


All methods was sourced from stackoverflow.com or stackexchange.com, is licensed under cc by-sa 2.5, cc by-sa 3.0 and cc by-sa 4.0

0 0 votes
Article Rating
Subscribe
Notify of
guest

0 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments
0
Would love your thoughts, please comment.x
()
x