Split list by sum of sublist items
I have a list of sublists, each holding a file name and a file size. I need to split that list into sublists so that each resulting sublist has a total file size of less than 500 000 000 bytes. I have tried several approaches but could not get any of them to work. My last attempt is this:
import functools
import operator
data = [["c:\example_path", 480000],["c:\example_path2", 500000], ...]
list_final = []
sum = 0
list_items_subset = []
for index, item in enumerate(data):
    sum += item[1]
    if sum < 500000000:
        list_items_subset.append(item[0])
    else:
        list_final.append(list_items_subset)
        sum = 0
        list_items_subset = []
        list_items_subset.append(item[0])
        sum += item[1]
print("len data init: ", len(data))
print("len items final: ", len(functools.reduce(operator.iconcat, list_final, [])))
list_final should store all the sublists of files whose cumulative size is less than 500 000 000 bytes. In the code above, sublists are created and inserted, but I am left with items that are not included anywhere.
Thanks for any suggestions!
Solution 1:[1]
Is this what you want to get?
import functools
import operator
data = [[r"c:\example_path", 480000], [r"c:\example_path2", 500000]] * 10000
list_final = []
total_size = 0
list_items_subset = []
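for name, size in data:
    total_size += size
    if total_size < 500000000:
        list_items_subset.append(name)
    else:
        # Limit reached: store the finished subset and start a new one
        # with the current file.
        list_final.append(list_items_subset)
        total_size = 0
        list_items_subset = [name]
        total_size += size
# Store the last subset, which the loop never appends on its own.
list_final.append(list_items_subset)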
print("len data init: ", len(data))
print(len(list_final))
print("len items final: ", len(functools.reduce(operator.iconcat, list_final, [])))
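The same greedy grouping could also be wrapped in a generator. This is only a sketch of a variation on the loop above, not the answer's own code: the function name split_by_total_size and its limit parameter are made up for illustration. It assumes each item is a [name, size] pair, and a file whose size alone reaches the limit still gets a group of its own.

```python
from typing import Iterable, Iterator, List


def split_by_total_size(items: Iterable, limit: int) -> Iterator[List[str]]:
    """Greedily group names so that each group's total size stays below limit.

    Hypothetical helper: items is assumed to yield [name, size] pairs.
    """
    group: List[str] = []
    group_size = 0
    for name, size in items:
        # Close the current group if adding this item would reach the limit.
        if group and group_size + size >= limit:
            yield group
            group, group_size = [], 0
        group.append(name)
        group_size += size
    # Emit the last, partially filled group; the question's loop skips
    # this step, which is why some items end up in no sublist.
    if group:
        yield group


data = [[r"c:\example_path", 480000], [r"c:\example_path2", 500000]] * 10000
groups = list(split_by_total_size(data, 500_000_000))
print("len data init: ", len(data))
print("len items final: ", sum(len(g) for g in groups))
```

Both printed counts should match, since every item lands in exactly one group.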
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | pL3b |