'Fastest way to remove duplicates in a list without importing libraries and using sets
I was trying to remove duplicates from a list using the following code:
a = [1,2,3,4,2,6,1,1,5,2]
res = []
[res.append(i) for i in a if i not in res]
But I would like to do this without defining the list I want as an empty list (i.e., omit the line res = []
) like:
a = [1,2,3,4,2,6,1,1,5,2]
# Either:
res = [i for i in a if i not in res]
# Or:
[i for i in a if i not in 'this list'] # This list is not a string. I meant it as the list being comprehended.
I want to avoid library imports and set()
.
Solution 1:[1]
For Python 3.6+, you can use dict.fromkeys()
:
>>> a = [1, 2, 3, 4, 2, 6, 1, 1, 5, 2]
>>> list(dict.fromkeys(a))
[1, 2, 3, 4, 6, 5]
From the documentation:
Create a new dictionary with keys from iterable and values set to value.
If you are using a lower Python version, you will need to use collections.OrderedDict
to maintain order:
>>> from collections import OrderedDict
>>> a = [1, 2, 3, 4, 2, 6, 1, 1, 5, 2]
>>> list(OrderedDict.fromkeys(a))
[1, 2, 3, 4, 6, 5]
Solution 2:[2]
I think this may work for you. It removes duplicates from the list while keeping the order.
newlist = [i for n,i in enumerate(L) if i not in L[:n]]
Solution 3:[3]
Here is a simple benchmark with the proposed solutions,
It shows that dict.fromkeys
will perform the best.
from simple_benchmark import BenchmarkBuilder
import random
b = BenchmarkBuilder()
@b.add_function()
def AmitDavidson(a):
return [i for n,i in enumerate(a) if i not in a[:n]]
@b.add_function()
def RoadRunner(a):
return list(dict.fromkeys(a))
@b.add_function()
def DaniMesejo(a):
return list({k: '' for k in a})
@b.add_function()
def rdas(a):
return sorted(list(set(a)), key=lambda x: a.index(x))
@b.add_function()
def unwanted_set(a):
return list(set(a))
@b.add_arguments('List lenght')
def argument_provider():
for exp in range(2, 18):
size = 2**exp
yield size, [random.randint(0, 10) for _ in range(size)]
r = b.run()
r.plot()
Solution 4:[4]
Here is a solution using set
that does preserve the order:
a = [1,2,3,4,2,6,1,1,5,2]
a_uniq = sorted(list(set(a)), key=lambda x: a.index(x))
print(a_uniq)
Solution 5:[5]
One-liner, comprehension, O(n)
, that preserves order in Python 3.6+:
a = [1, 2, 3, 4, 2, 6, 1, 1, 5, 2]
res = list({k: '' for k in a})
print(res)
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | Peter Mortensen |
Solution 2 | Peter Mortensen |
Solution 3 | Peter Mortensen |
Solution 4 | rdas |
Solution 5 |