Is it possible to find out which values are duplicated in a list using Python?
I have a list of items:
mylist = [20, 30, 25, 20]
I know the best way of removing the duplicates is set(mylist), but is it possible to find out which values are duplicated? In this list the duplicates are the first and last values, at indices [0, 3].
Is it possible to get this result or something similar in Python? I’m trying to avoid writing a ridiculously big if/elif conditional statement.
Answers:
Method 1
These answers are O(n): they take a little more code than using mylist.count(), but are much more efficient as mylist gets longer.
If you just want to know which values are duplicated, use collections.Counter:
from collections import Counter
mylist = [20, 30, 25, 20]
[k for k,v in Counter(mylist).items() if v>1]
If you need to know the indices:
from collections import defaultdict
D = defaultdict(list)
for i,item in enumerate(mylist):
    D[item].append(i)
D = {k:v for k,v in D.items() if len(v)>1}
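As a minimal sketch, the two snippets above can be wrapped in small functions and run against the list from the question. The function names duplicate_values and duplicate_indices are made up for illustration, and both assume the items are hashable.

```python
from collections import Counter, defaultdict


def duplicate_values(items):
    """Return each value that occurs more than once."""
    return [value for value, count in Counter(items).items() if count > 1]


def duplicate_indices(items):
    """Map each duplicated value to the list of positions where it occurs."""
    positions = defaultdict(list)
    for index, item in enumerate(items):
        positions[item].append(index)
    # Keep only the values that were seen at more than one position.
    return {value: idxs for value, idxs in positions.items() if len(idxs) > 1}


mylist = [20, 30, 25, 20]
print(duplicate_values(mylist))   # [20]
print(duplicate_indices(mylist))  # {20: [0, 3]}
```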
Method 2
Here’s a list comprehension that does what you want. As @Codemonkey says, the list starts at index 0, so the indices of the duplicates are 0 and 3.
>>> [i for i, x in enumerate(mylist) if mylist.count(x) > 1]
[0, 3]
Method 3
The following list comprehension will yield the duplicate values:
[x for x in mylist if mylist.count(x) >= 2]
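Note that this comprehension repeats a value once per occurrence. If each duplicate should be listed only once, in order of first appearance, one option (an addition here, not part of the original answer) is to pass the result through dict.fromkeys, which assumes the items are hashable:

```python
mylist = [20, 30, 25, 20]

# Every occurrence of a duplicated value is kept, so 20 shows up twice.
repeated = [x for x in mylist if mylist.count(x) >= 2]
print(repeated)  # [20, 20]

# dict.fromkeys keeps one key per distinct value and preserves order.
unique_duplicates = list(dict.fromkeys(repeated))
print(unique_duplicates)  # [20]
```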
Method 4
You can use a list comprehension together with set, so that count() runs only once per distinct value:
my_list = [3, 5, 2, 1, 4, 4, 1]
opt = [item for item in set(my_list) if my_list.count(item) > 1]
Method 5
The simplest way, without any intermediate list, is to use list.index(). This expression keeps only the first occurrence of each item:
>>> z = ['a', 'b', 'a', 'c', 'b', 'a']
>>> [z[i] for i in range(len(z)) if i == z.index(z[i])]
['a', 'b', 'c']
You can also list the duplicates themselves (the result may itself contain duplicates, as in this example):
>>> [z[i] for i in range(len(z)) if not i == z.index(z[i])]
['a', 'b', 'a']
or their indices:
>>> [i for i in range(len(z)) if not i == z.index(z[i])]
[2, 4, 5]
or the duplicates as a list of 2-tuples pairing each index with the index of the first occurrence, which is the answer to the original question:
>>> [(i,z.index(z[i])) for i in range(len(z)) if not i == z.index(z[i])]
[(2, 0), (4, 1), (5, 0)]
or this together with the item itself:
>>> [(i,z.index(z[i]),z[i]) for i in range(len(z)) if not i == z.index(z[i])]
[(2, 0, 'a'), (4, 1, 'b'), (5, 0, 'a')]
or any other combination of elements and indices.
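Each call to z.index() rescans the list from the start, so these comprehensions take quadratic time on long lists. A possible single-pass variant, sketched here under the assumption that the items are hashable, remembers the first index of each value in a dict. The names first_seen and repeats are illustrative only.

```python
z = ['a', 'b', 'a', 'c', 'b', 'a']

first_seen = {}  # value -> index of its first occurrence
repeats = []     # (index, first_index, value) for every later occurrence

for i, item in enumerate(z):
    if item in first_seen:
        repeats.append((i, first_seen[item], item))
    else:
        first_seen[item] = i

print(repeats)  # [(2, 0, 'a'), (4, 1, 'b'), (5, 0, 'a')]
```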
Method 6
I tried the code below to find the duplicate values in a list:
1) Create a set from the list that contains duplicates.
2) Iterate through the set, counting how often each element occurs in the original list.
glist=[1, 2, 3, "one", 5, 6, 1, "one"]
x=set(glist)
dup=[]
for c in x:
    if(glist.count(c)>1):
        dup.append(c)
print(dup)
OUTPUT (the order may vary, because sets are unordered)
[1, 'one']
Now get all the indices of each duplicate element:
glist=[1, 2, 3, "one", 5, 6, 1, "one"]
x=set(glist)
dup=[]
for c in x:
    if(glist.count(c)>1):
        # v is used here so the loop variable does not shadow the set x
        indices = [i for i, v in enumerate(glist) if v == c]
        dup.append((c,indices))
print(dup)
OUTPUT (the order may vary, because sets are unordered)
[(1, [0, 6]), ('one', [3, 7])]
Hope this helps someone.
Method 7
That’s the simplest way I can think of for finding duplicates in a list:
my_list = [3, 5, 2, 1, 4, 4, 1]
my_list.sort()  # sorting puts equal values next to each other
for i in range(0,len(my_list)-1):
    if my_list[i] == my_list[i+1]:
        print(str(my_list[i]) + ' is a duplicate')
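The loop above sorts my_list in place and prints a value once for every adjacent equal pair, so a value that occurs three times is reported twice. If that matters, one possible variation (a sketch, not the original answer's code) sorts a copy and groups equal neighbours with itertools.groupby:

```python
from itertools import groupby

my_list = [3, 5, 2, 1, 4, 4, 1, 4]

duplicates = []
# sorted() returns a new list, so my_list keeps its original order.
for value, group in groupby(sorted(my_list)):
    # group iterates over one run of equal values; count how long it is.
    if sum(1 for _ in group) > 1:
        duplicates.append(value)

print(duplicates)  # [1, 4]
print(my_list)     # [3, 5, 2, 1, 4, 4, 1, 4]
```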
Method 8
The following code prints each duplicated item together with the index of its first occurrence:
for i in set(mylist):
    if mylist.count(i) > 1:
        print(i, mylist.index(i))
Method 9
You should sort the list:
mylist.sort()
After this, iterate through it like this:
doubles = []
for i, elem in enumerate(mylist):
    if i != 0:
        if elem == old:
            doubles.append(elem)
            old = None
            continue
    old = elem
Method 10
You can print the duplicate and unique values using the logic below:
def dup(x):
    duplicate = []
    unique = []
    for i in x:
        if i in unique:
            duplicate.append(i)
        else:
            unique.append(i)
    print("Duplicate values: ",duplicate)
    print("Unique Values: ",unique)
list1 = [1, 2, 1, 3, 2, 5]
dup(list1)
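Because `i in unique` scans a list, the function above gets slow on long inputs. Assuming the items are hashable, a set can do the same membership test in roughly constant time. The sketch below, with the hypothetical name split_duplicates, returns the two lists instead of printing them.

```python
def split_duplicates(items):
    """Return (duplicates, unique), both in order of appearance."""
    seen = set()  # fast membership test for values already met
    duplicates = []
    unique = []
    for item in items:
        if item in seen:
            duplicates.append(item)
        else:
            seen.add(item)
            unique.append(item)
    return duplicates, unique


duplicates, unique = split_duplicates([1, 2, 1, 3, 2, 5])
print("Duplicate values:", duplicates)  # Duplicate values: [1, 2]
print("Unique values:", unique)         # Unique values: [1, 2, 3, 5]
```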
Method 11
mylist = [20, 30, 25, 20]
kl = {i: mylist.count(i) for i in mylist if mylist.count(i) > 1 }
print(kl)
Method 12
It looks like you want the indices of the duplicates. Here is some short code that will find those in O(n) time, without using any packages:
dups = {}
[dups.setdefault(v, []).append(i) for i, v in enumerate(mylist)]
dups = {k: v for k, v in dups.items() if len(v) > 1}
# dups now has keys for all the duplicate values
# and a list of matching indices for each
# The second line produces an unused list.
# It could be replaced with this:
for i, v in enumerate(mylist):
    dups.setdefault(v, []).append(i)
Method 13
m = len(mylist)
for index,value in enumerate(mylist):
    for i in range(1,m):
        if(index != i):
            if (mylist[i] == mylist[index]):
                print("Location %d and location %d has same list-entry: %r" % (index,i,value))
This has some redundancy that could be improved, however: a matching pair can be reported twice, once from each side.
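The redundancy can be avoided by generating each pair of positions only once. As a sketch (not the original author's code), itertools.combinations does this. It still compares every pair, so the running time remains quadratic.

```python
from itertools import combinations

mylist = [20, 30, 25, 20]

# combinations(..., 2) yields every pair of (index, value) items exactly once,
# always with the lower index first.
matching_pairs = [
    (i, j)
    for (i, a), (j, b) in combinations(enumerate(mylist), 2)
    if a == b
]

for i, j in matching_pairs:
    print("Location %d and location %d have the same entry: %r" % (i, j, mylist[i]))
# Location 0 and location 3 have the same entry: 20
```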
Method 14
def checkduplicate(lists):
    a = []  # values seen so far
    for i in lists:
        if i in a:
            return i  # first value that appears a second time
        else:
            a.append(i)
    return None  # no duplicate found
print(checkduplicate([1,9,78,989,2,2,3,6,8]))
All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.