How to get the remaining sample after using random.sample() in Python?

I have a large list of elements (in this example I'll assume it's filled with numbers). For example: l = [1,2,3,4,5,6,7,8,9,10]. Now I want to take two samples from that list: one with 80% of the elements (randomly chosen, of course), and the other with the remaining 20%, so I can use the bigger one to train a machine-learning tool and the rest to test that training. The function I used is sample() from the random module, and I used it this way:

import random

sz = len(l) # Size of the original list
per = int((80 * sz) / 100) # Length of the sample list with 80% of the elements (I guess)
random.seed(1) # As I want to obtain the same results every time I run it
l2 = random.sample(l, per)

I'm not totally sure, but I believe that with that code I'm getting a random sample with 80% of the numbers.

l2 = [3,4,7,2,9,5,1,8]

Nonetheless, I can't seem to find the way to get the other sample list with the remaining elements l3 = [6,10] (the sample() function does not remove the elements it takes from the original list). Can you please help me? Thank you in advance.

Solution 1:^[1]

The following code worked for me to randomly split a list into two (training/testing) sets, although most machine-learning libraries include easy-to-use splitting functions:

import random

l = [1,2,3,4,5,6,7,8,9,10]
sz = len(l)
cut = int(0.8 * sz) # 80% of the list
random.shuffle(l) # shuffles l in place and returns None, so do not assign the result
l2 = l[:cut] # first 80% of shuffled list
l3 = l[cut:] # last 20% of shuffled list
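
If the original list has to stay in its original order, the shuffle-and-slice idea can be wrapped in a small function that works on a shuffled copy. This is only a sketch: the name split_list and its fraction and seed parameters are made up for illustration, and it relies on random.sample(items, len(items)) returning a new, shuffled list.

```python
import random


def split_list(items, fraction=0.8, seed=None):
    """Return (first_part, remainder) without modifying items."""
    # A private generator, so the module-level random state is left alone.
    rng = random.Random(seed)
    # Sampling every element yields a shuffled copy of the list.
    shuffled = rng.sample(items, len(items))
    cut = int(fraction * len(items))
    return shuffled[:cut], shuffled[cut:]


l = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
l2, l3 = split_list(l, fraction=0.8, seed=1)
print(len(l2), len(l3))  # sizes of the two parts
```

Passing the same seed gives the same split on every run, which matches the reproducibility the question asks for.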

Solution 2:^[2]

You can simply do:

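from random import sample

data = [1, 2, 3, 4, 5]
cut = 0.8  # fraction of the data used for training

# sample() needs an integer count, so truncate the product
training = sample(data, int(len(data) * cut))

# The rest goes to testing (this assumes the values in data are unique)
testing = [value for value in data if value not in training]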
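
Note that the test `value not in training` compares values, so if the list contains duplicates, every copy of a sampled value is dropped from the testing set. One possible workaround, sketched below, is to sample positions instead of values. The function name sample_with_remainder and its parameters are hypothetical, and unlike random.sample() both returned lists keep the original order of the data.

```python
import random


def sample_with_remainder(data, k, seed=None):
    """Pick k elements by position; return (picked, rest)."""
    rng = random.Random(seed)
    # Sample indices rather than values so duplicates are handled correctly.
    chosen = set(rng.sample(range(len(data)), k))
    picked = [x for i, x in enumerate(data) if i in chosen]
    rest = [x for i, x in enumerate(data) if i not in chosen]
    return picked, rest


data = [1, 1, 2, 2, 3]
training, testing = sample_with_remainder(data, k=4, seed=1)
print(len(training), len(testing))  # sizes of the two parts
```

Using a set for the chosen indices also keeps the membership test fast on large lists.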

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution	Source
Solution 1	L. Brasi
Solution 2	cigien
