In Python, how do I find duplicates in a sequence and collect them in a list? [duplicate]
I was not allowed to answer "How do I find the duplicates in a list and create another list with them?", but I think my solution is worth sharing. So I will generalize the question, and I would be glad to get feedback.
How would you answer the question for this list:
a = [1, 2, 1, 1, 2, 3]
Solution 1:[1]
very simple:
have = []        # items seen so far
duplicates = []  # items seen more than once
for item in a:
    if item not in have:
        have.append(item)
    else:
        # record each duplicated item only once
        if item not in duplicates:
            duplicates.append(item)
The condition after else just makes sure that each duplicated item is added to the list duplicates only once.
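For comparison, here is a minimal sketch of a different technique that gives the same result by counting occurrences with collections.Counter from the standard library. The function name duplicates_by_count is made up for illustration, and the sketch assumes that all items are hashable and that Python 3.7+ is used, so that the order of first appearance is preserved.

```python
from collections import Counter


def duplicates_by_count(seq):
    """Return each item that occurs more than once, in order of first appearance.

    Hypothetical helper name; assumes the items are hashable.
    """
    counts = Counter(seq)  # maps item -> number of occurrences
    return [item for item, n in counts.items() if n > 1]


a = [1, 2, 1, 1, 2, 3]
print(duplicates_by_count(a))  # [1, 2]
```

Unlike the loop above, this does not work for unhashable items such as lists, so the loop remains the more general option.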
Solution 2:[2]
I'm not sure what your question is exactly. But if you want to compare two lists and place the similarities into one list you can do something like this:
a = [1, 2, 1, 1, 2, 3]
b = [1, 2, 5, 6, 1, 3]
c = []
for item in a:
    if item in b:
        c.append(item)
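Note that with the two lists above, the loop keeps repeated items, so c ends up as [1, 2, 1, 1, 2, 3]. If each common item should appear only once, one possible variant is sketched below. It assumes hashable items; the names b_set and common are introduced here for illustration only.

```python
a = [1, 2, 1, 1, 2, 3]
b = [1, 2, 5, 6, 1, 3]

b_set = set(b)  # a set gives fast membership tests
# dict.fromkeys(a) drops repeats from a while keeping first-seen order
common = [item for item in dict.fromkeys(a) if item in b_set]
print(common)  # [1, 2, 3]
```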
Solution 3:[3]
As I have learned, the accepted answer to the original question is best. I missed the fact that a look-up in a set or dictionary takes only O(1) time on average. Sets and dicts address their elements via hashes, from which the location in memory is known directly, which is very neat. (NB: calling set() on a sequence of hashable elements removes any duplicates, and building the set is efficient: O(n) for n items.)
To demonstrate, I compare searching for the last item in a list with searching for a random item in a set:
>>> import timeit, random
>>> rng = range(100999)
>>> myset = set(rng)
>>> mylist = list(rng)
>>> # How long does it take to test the list for its last value
>>> # compared to testing for a value in a set?
>>> timeit.timeit(lambda: mylist[-1] in mylist, number=9999)
12.907866499997908
>>> timeit.timeit(lambda: random.choice(rng) in myset, number=9999)
0.012736899996525608
>>>
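Applied to the original problem, the same idea might look like the sketch below: it is Solution 1 with the two lists used for membership tests replaced by sets, so each test is a hash look-up instead of a scan. The function name duplicates_with_sets and the helper set reported are made up for illustration, and the items are assumed to be hashable.

```python
def duplicates_with_sets(seq):
    """Return items that occur more than once, in the order they first repeat.

    Hypothetical helper name; assumes the items are hashable.
    """
    seen = set()      # every item encountered so far
    reported = set()  # duplicated items already added to the result
    duplicates = []   # result list, keeps order
    for item in seq:
        if item not in seen:
            seen.add(item)
        elif item not in reported:
            reported.add(item)
            duplicates.append(item)
    return duplicates


a = [1, 2, 1, 1, 2, 3]
print(duplicates_with_sets(a))  # [1, 2]
```

The result list is kept separate from the sets because sets do not preserve order.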
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | Martino |
Solution 2 | Quessts |
Solution 3 |