How to assign an entire list to each row of a pandas dataframe

I have a dataframe and a list:

df = pd.DataFrame({'A':[1,2,3], 'B':[4,5,6]})
mylist= [10,20,30,40,50]

I would like each row of the dataframe to hold the entire list as a single element. If I assign the list directly,

df['C'] = mylist

pandas tries to assign one list element to each row, so I get the error "Length of values does not match length of index". The result I want is:

   A  B                     C
0  1  4  [10, 20, 30, 40, 50]
1  2  5  [10, 20, 30, 40, 50]
2  3  6  [10, 20, 30, 40, 50]
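
For reference, the failure can be reproduced with a short sketch like the one below. It only assumes the df and mylist from the question; the exact wording of the message may differ between pandas versions.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})
mylist = [10, 20, 30, 40, 50]

try:
    # pandas treats the 5 list items as 5 row values, but df has only 3 rows
    df['C'] = mylist
except ValueError as exc:
    error_message = str(exc)
    print(error_message)
```

The mismatch is between len(mylist) and len(df), which is why every answer below builds a sequence with exactly one entry per row.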


Solution 1:[1]

First, I think working with lists in pandas is not a good idea.

But it is possible with a list comprehension:

df['C'] = [mylist for _ in df.index]
# another solution (pass the index so the Series aligns with df's rows)
# df['C'] = pd.Series([mylist] * len(df), index=df.index)

print (df)

   A  B                     C
0  1  4  [10, 20, 30, 40, 50]
1  2  5  [10, 20, 30, 40, 50]
2  3  6  [10, 20, 30, 40, 50]
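
One thing to keep in mind with this approach: the comprehension puts the same list object into every row. The sketch below is only an illustration of that (the column names shared and copied are made up for the example); it compares it with storing a copy per row via list(mylist).

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})
mylist = [10, 20, 30, 40, 50]

# Same object in every row: a later change to mylist shows up in all cells.
df['shared'] = [mylist for _ in df.index]

# Shallow copy per row: each cell owns its own list.
df['copied'] = [list(mylist) for _ in df.index]

mylist.append(60)

print(df['shared'].tolist())  # every row now ends with 60
print(df['copied'].tolist())  # rows are unchanged
```

If the lists will never be modified, sharing is harmless and saves memory; otherwise the per-row copy is probably the safer choice.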

Solution 2:[2]

One alternative using np.tile:

import numpy as np
df['C'] = np.tile(mylist, (len(df), 1)).tolist()

print (df)

   A  B                     C
0  1  4  [10, 20, 30, 40, 50]
1  2  5  [10, 20, 30, 40, 50]
2  3  6  [10, 20, 30, 40, 50]
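
To make the intermediate steps visible, here is a small sketch of what np.tile and tolist() produce (the names tiled and rows are just for illustration). Note that np.tile converts the list to an array first, so a list with mixed types may be coerced to a common dtype.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})
mylist = [10, 20, 30, 40, 50]

# 2-D array: one copy of mylist per dataframe row
tiled = np.tile(mylist, (len(df), 1))
print(tiled.shape)

# tolist() turns the 2-D array into a list of separate Python lists
rows = tiled.tolist()
df['C'] = rows

# Each row holds its own list object, unlike the comprehension over mylist
same_object = df.at[0, 'C'] is df.at[1, 'C']
print(same_object)
```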

Solution 3:[3]

Here is another solution. It uses a lambda and does things "Pythonically". I think it is easier to read.

import pandas as pd
df = pd.DataFrame({'A':[1,2,3], 'B':[4,5,6]})
mylist= [10,20,30,40,50]
df['combined'] = df.apply(lambda x: mylist, axis=1)
df

Solution 4:[4]

Just to complete my earlier answer with df.assign, borrowing the list comprehension from @jezrael:

>>> df
   A  B
0  1  4
1  2  5
2  3  6

>>> df.assign(C =  [mylist for i in df.index])
   A  B                     C
0  1  4  [10, 20, 30, 40, 50]
1  2  5  [10, 20, 30, 40, 50]
2  3  6  [10, 20, 30, 40, 50]

Or, to keep the new column, assign the result back to the DataFrame:

df = df.assign(C =  [mylist for i in df.index])

Another way of doing it is with df.insert.

Since insert takes the position of the new column, inserting at index 2 makes C the third column of the dataframe. This assumes df does not already have a column named C, and the value must be one list per row rather than the bare list or its string form:

>>> df.insert(2, 'C', [mylist for i in df.index]) # one list per row
>>> df
   A  B                     C
0  1  4  [10, 20, 30, 40, 50]
1  2  5  [10, 20, 30, 40, 50]
2  3  6  [10, 20, 30, 40, 50]

Solution 5:[5]

I agree with @jezrael that working with lists in pandas is not a good idea. Still, there is a much faster vectorized way:

  1. Squeeze the list into a single numpy cell.
  2. Tile that cell and assign it to the DF.

df = pd.DataFrame(index=np.arange(1e6))
mylist= [10,20,30,40,50]

#ORIGINAL:
%%timeit -n 100 
df['C'] = [mylist for i in df.index]
>>> 188 ms ± 922 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

# VECTORIZED:
%%timeit -n 100 
q = np.array([1,], dtype=object)   # dummy array, note the dtype
q[0] = mylist                      # squeeze the list into single cell
df['C'] = np.tile(q, df.shape[0])  # tile and assign
>>> 12.1 ms ± 44.7 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

The gain is especially high with larger DF sizes (about 15x in this example). Hopefully there is a more elegant way to fit a list into a single numpy cell.
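
One possibly tidier way to build that single cell is np.empty with dtype=object, which avoids the dummy value. The sketch below is an assumption about what "more elegant" could look like, not a benchmarked replacement; tile_list_column is a hypothetical helper name.

```python
import numpy as np
import pandas as pd

def tile_list_column(values, n_rows):
    """Return a 1-D object array of length n_rows whose cells all hold `values`."""
    cell = np.empty(1, dtype=object)  # one empty object cell, no dummy value needed
    cell[0] = values                  # store the list itself, not its elements
    return np.tile(cell, n_rows)      # repeat the cell once per row

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})
mylist = [10, 20, 30, 40, 50]

column = tile_list_column(mylist, len(df))
df['C'] = column
print(df)
```

As with the list comprehension, every cell refers to the same list object, so this fits best when the lists are treated as read-only.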

Solution 6:[6]

That should work:

df = pd.DataFrame({'A':[1,2,3], 'B':[4,5,6]})
my_list = [10, 20, 30, 40]
df['C'] = [my_list] * df.shape[0]
df

   A  B                 C
0  1  4  [10, 20, 30, 40]
1  2  5  [10, 20, 30, 40]
2  3  6  [10, 20, 30, 40]

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2
Solution 3
Solution 4
Solution 5 Poe Dator
Solution 6 Vladimir Lukin