Difference Between Two Lists with Duplicates in Python

I have two lists that contain many of the same items, including duplicate items. I want to check which items in the first list are not in the second list. For example, I might have one list like this:

l1 = ['a', 'b', 'c', 'b', 'c']

and one list like this:

l2 = ['a', 'b', 'c', 'b']

Comparing these two lists I would want to return a third list like this:

l3 = ['c']

I am currently using some terrible code that I wrote a while ago, shown below, which I’m fairly certain doesn’t even work properly.

def list_difference(l1,l2):
    for i in range(0, len(l1)):
        for j in range(0, len(l2)):
            if l1[i] == l1[j]:
                l1[i] = 'damn'
                l2[j] = 'damn'
    l3 = []
    for item in l1:
        if item!='damn':
            l3.append(item)
    return l3

How can I better accomplish this task?

Answers:


Method 1

You didn’t specify whether the order matters. If it doesn’t, you can do this in Python 2.7 or later:

l1 = ['a', 'b', 'c', 'b', 'c']
l2 = ['a', 'b', 'c', 'b']

from collections import Counter

c1 = Counter(l1)
c2 = Counter(l2)

diff = c1-c2
print(list(diff.elements()))  # ['c']
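
As a quick check of how this behaves at the edges, here is a minimal sketch that wraps the same idea in a function. The name `unordered_difference` is made up for illustration and is not part of the answer. Subtracting with `-` keeps only positive counts, so items that occur more often in the second list simply vanish from the result instead of showing up as negatives.

```python
from collections import Counter

def unordered_difference(first, second):
    # Hypothetical helper name, used only for this illustration.
    # Counter subtraction with "-" drops every item whose count ends up <= 0.
    return list((Counter(first) - Counter(second)).elements())

# The lists from the question
print(unordered_difference(['a', 'b', 'c', 'b', 'c'], ['a', 'b', 'c', 'b']))  # ['c']

# Extra items in the second list do not produce negative entries
print(unordered_difference(['a'], ['a', 'a', 'z']))  # []
```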

Method 2

Create Counters for both lists, then subtract one from the other.

from collections import Counter

a = [1,2,3,1,2]
b = [1,2,3,1]

c = Counter(a)
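c.subtract(Counter(b))  # in place; counts may end up zero or negative

print(list(c.elements()))  # elements() skips counts below one -> [2]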
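
It may help to see how `subtract()` differs from the `-` operator. The sketch below uses a slightly modified `b` (an extra `4` that is not in `a`, added here purely for illustration) so that a negative count appears.

```python
from collections import Counter

a = [1, 2, 3, 1, 2]
b = [1, 2, 3, 1, 4]  # modified input: the 4 is an assumption added for this demo

in_place = Counter(a)
in_place.subtract(Counter(b))            # mutates in_place, keeps zero and negative counts

operator_diff = Counter(a) - Counter(b)  # new Counter, positive counts only

print(dict(in_place))                # {1: 0, 2: 1, 3: 0, 4: -1}
print(dict(operator_diff))           # {2: 1}
print(sorted(in_place.elements()))   # [2]
```

Either way, `elements()` yields only the items with a count of at least one, so the resulting list is the same.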

Method 3

To take into account both duplicates and the order of elements:

from collections import Counter

def list_difference(a, b):
    count = Counter(a) # count items in a
    count.subtract(b)  # subtract items that are in b
    diff = []
    for x in a:
        if count[x] > 0:
            count[x] -= 1
            diff.append(x)
    return diff

Example

print(list_difference("z y z x v x y x u".split(), "x y z w z".split()))
# -> ['y', 'x', 'v', 'x', 'u']

Python 2.5 version:

from collections import defaultdict 

def list_difference25(a, b):
    # count items in a
    count = defaultdict(int) # item -> number of occurrences
    for x in a:
        count[x] += 1

    # subtract items that are in b
    for x in b: 
        count[x] -= 1

    diff = []
    for x in a:
        if count[x] > 0:
            count[x] -= 1
            diff.append(x)
    return diff
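
Both versions above keep the earliest surplus occurrences of each item in `a`. If you would rather treat the earliest occurrences as the ones cancelled by `b` and keep the later ones, a variation along these lines might work. This is only a sketch, and the name `list_difference_keep_last` is hypothetical.

```python
from collections import Counter

def list_difference_keep_last(a, b):
    # Hypothetical variant: the first occurrences in a are cancelled by b,
    # so the surplus that survives comes from later positions in a.
    remaining = Counter(b)  # how many of each item b can still cancel
    diff = []
    for x in a:
        if remaining[x] > 0:
            remaining[x] -= 1   # this occurrence is cancelled by one in b
        else:
            diff.append(x)      # nothing left in b to cancel it
    return diff

print(list_difference_keep_last("z y z x v x y x u".split(), "x y z w z".split()))
# -> ['v', 'x', 'y', 'x', 'u']
```

The result contains the same items with the same multiplicities as before; only the positions they are taken from differ.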

Method 4

Counter was added in Python 2.7.
For a general solution that does not need it, subtract a from b like this:

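def list_difference(b, a):
    c = list(b)  # copy, so the caller's list is left untouched
    for item in a:
        try:
            c.remove(item)  # drops the first matching occurrence
        except ValueError:
            pass  # item is not in b; handle it here if you want to keep a's extra values
    return c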
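
One property of the `remove`-based approach that may matter in practice: it relies only on `==`, so it also handles unhashable items such as nested lists, which `Counter` cannot count. The trade-off is that each `remove` scans the list, so the running time grows with `len(a) * len(b)`. A minimal sketch, with `rows` and `seen` as made-up sample data:

```python
def list_difference(b, a):
    c = list(b)  # work on a copy
    for item in a:
        try:
            c.remove(item)  # removes the first element equal to item
        except ValueError:
            pass  # item does not occur in c
    return c

# Made-up sample data: the elements are lists, which are not hashable
rows = [[1, 2], [3], [1, 2], [4]]
seen = [[1, 2], [5]]

print(list_difference(rows, seen))  # [[3], [1, 2], [4]]
```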

Method 5

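You can try this. Note that it modifies l1 in place, and l1.remove(x) raises ValueError if l2 contains an item that is not in l1:

list(filter(lambda x: l1.remove(x), l2))
print(l1)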

Method 6

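Try this one. It always subtracts the shorter list from the longer one, whichever order they are passed in, and returns a Counter rather than a list: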

from collections import Counter
from typing import Sequence

def duplicates_difference(a: Sequence, b: Sequence) -> Counter:
    """
    >>> duplicates_difference([1,2],[1,2,2,3])
    Counter({2: 1, 3: 1})
    """
    shorter, longer = sorted([a, b], key=len)
    return Counter(longer) - Counter(shorter)
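
A short usage sketch, repeating the function so the snippet runs on its own. It shows that swapping the arguments gives the same answer when the lengths differ, and what happens when they are equal. Converting the result with `elements()` is an addition here, not part of the answer.

```python
from collections import Counter
from typing import Sequence

def duplicates_difference(a: Sequence, b: Sequence) -> Counter:
    shorter, longer = sorted([a, b], key=len)
    return Counter(longer) - Counter(shorter)

l1 = ['a', 'b', 'c', 'b', 'c']
l2 = ['a', 'b', 'c', 'b']

# Argument order does not matter when the lengths differ
print(list(duplicates_difference(l1, l2).elements()))  # ['c']
print(list(duplicates_difference(l2, l1).elements()))  # ['c']

# Equal lengths: sorted() is stable, so this computes Counter(b) - Counter(a)
print(duplicates_difference([1, 1], [1, 2]))  # Counter({2: 1})
```

Items that occur only in the shorter list are dropped, because `-` discards non-positive counts.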


All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
