'Removing a list in a list of lists if a condition not satisfied

I have a list of lists. I want to remove lists if the lenght of intersection of values in the list with previous lists is more than one.

For example

A=[(1,2,3,4), (2,3,5,6),(1,7,8,9),(2,4,6,8)]

We need to keep first list because there is no previous list. We remove (2,3,5,6) as its intersection of previous list is (2,3) and its length is 2. We keep (1,7,8,9) and remove (2,4,6,8) likewise.

At the end our list should become B=[(1,2,3,4), (1,7,8,9)]



Solution 1:[1]

A potentially more efficient solution, in case you have (and keep) many more but still short tuples. This checks pairs of numbers in the lists. Keep the current tuple a if none of its pairs occur in an already kept tuple.

from itertools import combinations

A = [(1,2,3,4), (2,3,5,6), (1,7,8,9), (2,4,6,8)]

B = []
B_pairs = set()
for a in A:
    pairs = set(map(frozenset, combinations(a, 2)))
    if not B_pairs & pairs:
        B.append(a)
        B_pairs |= pairs
print(B)

Some benchmark with Ouss's solution, since they brought that up:

With 5000 tuples, 2553 of which are kept:
  23 ms  Kelly
2853 ms  Ouss

With 5000 tuples, 2540 of which are kept:
  21 ms  Kelly
2776 ms  Ouss

With 5000 tuples, 2558 of which are kept:
  20 ms  Kelly
2685 ms  Ouss

Benchmark code (Try it online!):

from itertools import combinations

def Kelly(A):
  B = []
  B_pairs = set()
  for a in A:
    pairs = set(map(frozenset, combinations(a, 2)))
    if not B_pairs & pairs:
        B.append(a)
        B_pairs |= pairs
  return B

def Ouss(A):
  B=[]

  # iterate over sublists a of the list A
  for a in A:
    # define and reset the intersections counter
    intersections=0
    # iterate of sublists b of B
    for b in B:
        # reset the intersections counter
        intersections=0
        # iterate over the numbers num of sublist a
        for num in a:
            # perform checks
            if num in b:
                intersections+=1
            if intersections>1:
                break
        # break the iteration over sublists b if check fulfilled
        if intersections>1:
            break
    # Append a to the subset B only if intersection were not > 1 
    if intersections>1:
        continue
    B.append(a)
    # repeat
  #B now contains the result...
  return B

from timeit import timeit
from random import sample

A = [tuple(sample(range(100), 4)) for _ in range(500)]
print(Kelly(A) == Ouss(A), len(Kelly(A)))

for _ in range(3):
    A = [tuple(sample(range(400), 4)) for _ in range(5000)]
    print(f'With {len(A)} tuples, {len(Kelly(A))} of which are kept:')
    for func in Kelly, Ouss:
        time = timeit(lambda: func(A), number=1)
        print('%4d ms ' % (time * 1e3), func.__name__)
    print()

Solution 2:[2]

Here is a direct implementation:

A = [(1,2,3,4), (2,3,5,6), (1,7,8,9), (2,4,6,8)]

B = []
intersect = 0
for a in A:
    for b in B:
        intersect = 0
        for num in b:
            if num in a:
                intersect += 1
        if intersect > 1:
            break
    if intersect < 2:
        B.append(a)

which gives:

B = [(1, 2, 3, 4), (1, 7, 8, 9)]

Solution 3:[3]

Here is a possible solution:

A=[(1,2,3,4),(2,3,5,6),(1,7,8,9),(2,4,6,8)]
B=[]

# iterate over sublists a of the list A
for a in A:
    # define and reset the intersections counter
    intersections=0
    # iterate of sublists b of B
    for b in B:
        # reset the intersections counter
        intersections=0
        # iterate over the numbers num of sublist a
        for num in a:
            # perform checks
            if num in b:
                intersections+=1
            if intersections>1:
                break
        # break the iteration over sublists b if check fulfilled
        if intersections>1:
            break
    # Append a to the subset B only if intersection were not > 1 
    if intersections>1:
        continue
    B.append(a)
    # repeat
#B now contains the result...
print(B)

Solution 4:[4]

Normally, this piece of code should do what you are looking for

A=[(1,2,3,4), (2,3,5,6),(1,7,8,9),(2,4,6,8)]

B = []

precedent = []
for list_ in A:
    intersection = 0

    for num in list_:
        if num in precedent:
            intersection += 1

    if intersection < 1:
        B.append(list_)

    precedent = list_

Solution 5:[5]

I think this is the one way you could try.

def intersection_length(lst1, lst2):
    intersection_list = [value for value in lst1 if value in lst2]
    return len(intersection_list)

def filter_lists(lists):
    filtered_lists = []
    filtered_lists.append(lists[0])
    del lists[0]

    for list_item in lists:
        is_ok = True
        for prev_list in filtered_lists:
            if intersection_length(list_item, prev_list) > 1:
                is_ok = False
                continue
        if is_ok:
            filtered_lists.append(list_item)
    return filtered_lists

def main():
    filtered = filter_lists(
        [(1, 2, 3, 4), (2, 3, 5, 6), (1, 7, 8, 9), (2, 4, 6, 8)]
    )
    print(filtered)

if __name__ == "__main__":
    main()

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2 AboAmmar
Solution 3
Solution 4 crazycat256
Solution 5