Successively filling in a new column of a pandas DataFrame

I would like to extend an existing pandas DataFrame and fill the new column successively:

df = pd.DataFrame({'col1': [1, 2, 3, 4, 5, 6], 'col2': [7, 8, 9, 10, 11, 12]})
df['col3'] = pd.Series(['a' for x in df[:3]])
df['col3'] = pd.Series(['b' for x in df[3:4]])
df['col3'] = pd.Series(['c' for x in df[4:]])

I would expect a result as follows:

  col1 col2 col3
0   1   7   a
1   2   8   a
2   3   9   a
3   4   10  b
4   5   11  c
5   6   12  c

However, my code fails and I get:

  col1 col2 col3
0   1   7   a
1   2   8   a
2   3   9   NaN
3   4   10  NaN
4   5   11  NaN
5   6   12  NaN

What is wrong?



Solution 1:[1]

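Every time you do something like df['col3'] = pd.Series(['a' for x in df[:3]]), you're assigning a new pd.Series to the column col3, which replaces whatever the column held before. In addition, iterating over a DataFrame (or a row slice of it) yields its column labels, not its rows, so each list comprehension here produces only two elements; the resulting Series covers index 0 and 1, and every other row is filled with NaN on assignment. One alternative is to build the new column separately, then assign it to the df.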

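import pandas as pd

df = pd.DataFrame({'col1': [1, 2, 3, 4, 5, 6], 'col2': [7, 8, 9, 10, 11, 12]})
# Build the full column first: three 'a', one 'b', then 'c' for the remaining rows.
new_col = ['a' for _ in range(3)] + ['b'] + ['c' for _ in range(4, len(df))]
df['col3'] = pd.Series(new_col)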
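
If the goal is really to fill the column piece by piece, as in the question, one possible sketch is to create the column once and then write into row slices by position. This is not part of the answer above; creating the column with None and using .iloc with the column's integer position are choices made here for illustration. The first lines also show why the comprehension from the question produces only two elements.

```python
import pandas as pd

df = pd.DataFrame({'col1': [1, 2, 3, 4, 5, 6], 'col2': [7, 8, 9, 10, 11, 12]})

# Iterating over a DataFrame (or a row slice of one) yields column labels,
# so the comprehension from the question builds a two-element list.
labels = ['a' for x in df[:3]]
print(labels)  # ['a', 'a']

# Create the column once, then fill it slice by slice, by row position.
df['col3'] = None                     # placeholder values (illustrative choice)
col3 = df.columns.get_loc('col3')     # integer position of the new column
df.iloc[:3, col3] = 'a'
df.iloc[3:4, col3] = 'b'
df.iloc[4:, col3] = 'c'
print(df['col3'].tolist())  # ['a', 'a', 'a', 'b', 'c', 'c']
```

Because each assignment targets only its own rows, later assignments no longer overwrite earlier ones.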

Solution 2:[2]

As @Amirhossein Kiani and @Emma note in the comments, you're never using df itself to assign values, so there is no need to slice it. Since you can assign a list to a DataFrame column, the following suffices:

df['col3'] = ['a'] * 3 + ['b'] + ['c'] * (len(df) - 4)
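
With more segments, the same idea can be written with numpy.repeat, which repeats each value a given number of times. This is a sketch under the assumption that the segment lengths are known up front; the names values and counts are hypothetical and do not appear in the original answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'col1': [1, 2, 3, 4, 5, 6], 'col2': [7, 8, 9, 10, 11, 12]})

values = ['a', 'b', 'c']   # one label per segment (hypothetical name)
counts = [3, 1, 2]         # rows per segment (hypothetical name)

# The segment lengths must add up to the number of rows,
# otherwise the column assignment raises a length-mismatch error.
assert sum(counts) == len(df)

df['col3'] = np.repeat(values, counts)
print(df['col3'].tolist())  # ['a', 'a', 'a', 'b', 'c', 'c']
```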

You can also use numpy.select to assign values. The idea is to create a list of boolean masks for certain index ranges and select values accordingly: if the index is less than 3, select 'a'; if it is at least 3 and less than 4 (that is, equal to 3), select 'b'; otherwise fall back to the default, 'c'.

import numpy as np
# inclusive='left' means 3 <= index < 4; rows matching no condition get the default 'c'.
df['col3'] = np.select([df.index<3, df.index.to_series().between(3, 4, inclusive='left')], ['a','b'], 'c')

Output:

   col1  col2 col3
0     1     7    a
1     2     8    a
2     3     9    a
3     4    10    b
4     5    11    c
5     6    12    c
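
To make the selection logic easier to follow, here is a minimal sketch that builds the masks from plain row positions instead of the index. It relies on the documented behaviour of numpy.select that, when several conditions are True for a row, the first one in the list is used. The variable names pos, conditions and choices are illustrative, not taken from the answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'col1': [1, 2, 3, 4, 5, 6], 'col2': [7, 8, 9, 10, 11, 12]})

pos = np.arange(len(df))          # row positions 0..5
conditions = [pos < 3, pos < 4]   # checked in order; the first True wins
choices = ['a', 'b']

# Rows 0-2 match the first mask, row 3 matches only the second,
# and rows 4-5 match neither, so they receive the default.
df['col3'] = np.select(conditions, choices, default='c')
print(df['col3'].tolist())  # ['a', 'a', 'a', 'b', 'c', 'c']
```

Writing the second mask as pos < 4 instead of an explicit range works only because the conditions are evaluated in order.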

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 kwsp
Solution 2