Check if list items contains substrings from another list

I have a lists:

my_list = ['abc-123', 'def-456', 'ghi-789', 'abc-456', 'def-111', 'qwe-111']

bad = ['abc', 'def']

and want to search for items that contain the string ‘abc’ and ‘def’ (and others in bad). How can I do that?

Almost same question here.

Answers:

Thank you for visiting the Q&A section on Magenaut. Please note that all the answers may not help you solve the issue immediately. So please treat them as advisements. If you found the post helpful (or not), leave a comment & I’ll get back to you as soon as possible.

Method 1

If you just want a test, join the target list into a string and test each element of bad like so:

>>> my_list = ['abc-123', 'def-456', 'ghi-789', 'abc-456', 'def-111', 'qwe-111']
>>> bad = ['abc', 'def']
>>> [e for e in bad if e in 'n'.join(my_list)]
['abc', 'def']

From your question, you can test each element as a sub string against the each element of the other this way:

>>> [i for e in bad for i in my_list if e in i]
['abc-123', 'abc-456', 'def-456', 'def-111']

It is fast (in comparison to one of the other methods):

>>> def f1():
...    [item for item in my_list if any(x in item for x in bad)]
... 
>>> def f2():
...    [i for e in bad for i in my_list if e in i]
... 
>>> timeit.Timer(f1).timeit()
5.062238931655884
>>> timeit.Timer(f2).timeit()
1.35371994972229

From your comment, here is how you get the elements that do not match:

>>> set(my_list)-{i for e in bad for i in my_list if e in i}
{'ghi-789', 'qwe-111'}

Method 2

In [4]: filter(lambda item: any(x in item for x in bad), my_list)
Out[4]: ['abc-123', 'def-456', 'abc-456', 'def-111']

or

In [13]: [item for item in my_list if any(x in item for x in bad)]
Out[13]: ['abc-123', 'def-456', 'abc-456', 'def-111']

Method 3

some_list = ['abc-123', 'def-456', 'ghi-789', 'abc-456']
bad = ['abc', 'def']
for s in some_list:
    for item in bad:
       if item in s:
          print 'Found ', s

It’s simple, works fine and fast( only if your list is not very big.)

Method 4

some_list=['abc-123', 'def-456', 'ghi-789', 'abc-456']
bad = ['abc', 'def']
for i in range (0,len(bad)):
    if bad[i] in some_list:
        print('Found a bad entry:', bad[i])


All methods was sourced from stackoverflow.com or stackexchange.com, is licensed under cc by-sa 2.5, cc by-sa 3.0 and cc by-sa 4.0

0 0 votes
Article Rating
Subscribe
Notify of
guest

0 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments
0
Would love your thoughts, please comment.x
()
x