Python pandas: fill a dataframe row by row
The simple task of adding a row to a pandas.DataFrame object seems to be hard to accomplish. There are 3 Stack Overflow questions relating to this, none of which gives a working answer.
Here is what I'm trying to do. I have a DataFrame of which I already know the shape as well as the names of the rows and columns.
>>> df = pandas.DataFrame(columns=['a','b','c','d'], index=['x','y','z'])
>>> df
     a    b    c    d
x  NaN  NaN  NaN  NaN
y  NaN  NaN  NaN  NaN
z  NaN  NaN  NaN  NaN
Now, I have a function to compute the values of the rows iteratively. How can I fill in one of the rows with either a dictionary or a pandas.Series? Here are various attempts that have failed:
>>> y = {'a':1, 'b':5, 'c':2, 'd':3}
>>> df['y'] = y
AssertionError: Length of values does not match length of index
Apparently it tried to add a column instead of a row.
>>> y = {'a':1, 'b':5, 'c':2, 'd':3}
>>> df.join(y)
AttributeError: 'builtin_function_or_method' object has no attribute 'is_unique'
Very uninformative error message.
>>> y = {'a':1, 'b':5, 'c':2, 'd':3}
>>> df.set_value(index='y', value=y)
TypeError: set_value() takes exactly 4 arguments (3 given)
Apparently that is only for setting individual values in the dataframe.
>>> y = {'a':1, 'b':5, 'c':2, 'd':3}
>>> df.append(y)
Exception: Can only append a Series if ignore_index=True
Well, I don't want to ignore the index, otherwise here is the result:
>>> df.append(y, ignore_index=True)
     a    b    c    d
0  NaN  NaN  NaN  NaN
1  NaN  NaN  NaN  NaN
2  NaN  NaN  NaN  NaN
3    1    5    2    3
It did align the column names with the values, but lost the row labels.
>>> y = {'a':1, 'b':5, 'c':2, 'd':3}
>>> df.ix['y'] = y
>>> df
a b \
x NaN NaN
y {'a': 1, 'c': 2, 'b': 5, 'd': 3} {'a': 1, 'c': 2, 'b': 5, 'd': 3}
z NaN NaN
c d
x NaN NaN
y {'a': 1, 'c': 2, 'b': 5, 'd': 3} {'a': 1, 'c': 2, 'b': 5, 'd': 3}
z NaN NaN
That also failed miserably.
So how do you do it?
Solution 1:[1]
df['y'] sets a column. Since you want to set a row, use .loc instead.
Note that .ix is equivalent here (.ix has since been deprecated and removed from pandas, so prefer .loc). Your attempt failed because it assigned the whole dictionary to each element of row y, which is probably not what you want. Converting to a Series tells pandas that you want to align the input, so, for example, you don't have to specify all of the elements.
In [6]: import pandas as pd
In [7]: df = pd.DataFrame(columns=['a','b','c','d'], index=['x','y','z'])
In [8]: df.loc['y'] = pd.Series({'a':1, 'b':5, 'c':2, 'd':3})
In [9]: df
Out[9]:
     a    b    c    d
x  NaN  NaN  NaN  NaN
y    1    5    2    3
z  NaN  NaN  NaN  NaN
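As a rough sketch of the row-by-row use case from the question, the loop below fills each pre-labelled row with a Series. compute_row is a hypothetical stand-in for whatever function produces the values; it is not from the original post. The last assignment illustrates the alignment point: columns missing from the Series end up as NaN.

```python
import pandas as pd

df = pd.DataFrame(columns=['a', 'b', 'c', 'd'], index=['x', 'y', 'z'])

def compute_row(label):
    # Hypothetical placeholder: returns the values for one row as a dict.
    offset = {'x': 0, 'y': 10, 'z': 20}[label]
    return {'a': offset + 1, 'b': offset + 2, 'c': offset + 3, 'd': offset + 4}

# Wrapping the dict in a Series makes pandas align its keys to the columns.
for label in ['x', 'y']:
    df.loc[label] = pd.Series(compute_row(label))

# A partial Series sets only the matching columns; the rest become NaN.
df.loc['z'] = pd.Series({'a': 100, 'd': 400})
print(df)
```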
Solution 2:[2]
Update: because append has been deprecated (and removed in pandas 2.0), use pd.concat instead:
import pandas as pd
df = pd.DataFrame(columns=["firstname", "lastname"])
entry = pd.DataFrame.from_dict({
"firstname": ["John"],
"lastname": ["Johny"]
})
df = pd.concat([df, entry], ignore_index=True)
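Since the question wants to keep row labels, one possible variation (my assumption, not part of the answer above) is to give the one-row frame an explicit index and leave out ignore_index. The names and the labels 'x' and 'y' below are illustrative only.

```python
import pandas as pd

# Existing frame with a labelled row (illustrative data).
df = pd.DataFrame({"firstname": ["Jane"], "lastname": ["Doe"]}, index=["x"])

# One-row frame whose index carries the desired row label.
entry = pd.DataFrame([{"firstname": "John", "lastname": "Johny"}], index=["y"])

# Without ignore_index, concat keeps the labels of both frames.
df = pd.concat([df, entry])
print(df)
```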
Solution 3:[3]
This is a simpler version:
import pandas as pd
df = pd.DataFrame(columns=('col1', 'col2', 'col3'))
for i in range(5):
    # assigning to a label that does not exist yet adds a new row
    df.loc[i] = ['<some value for first>', '<some value for second>', '<some value for third>']
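For a version with concrete values, here is a minimal sketch of the same pattern. The arithmetic is a made-up placeholder for real row values. Each assignment grows the frame by one row, so for many rows it may be cheaper to collect the rows first and build the frame once.

```python
import pandas as pd

df = pd.DataFrame(columns=['col1', 'col2', 'col3'])
for i in range(5):
    # Placeholder values; each new label i appends one row.
    df.loc[i] = [i, i * 2, i * 3]
print(df)
```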
Solution 4:[4]
If your input rows are lists rather than dictionaries, then the following is a simple solution:
import pandas as pd
list_of_lists = []
list_of_lists.append([1,2,3])
list_of_lists.append([4,5,6])
pd.DataFrame(list_of_lists, columns=['A', 'B', 'C'])
#    A  B  C
# 0  1  2  3
# 1  4  5  6
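If the rows are dictionaries, as in the question, a similar collect-then-build sketch could use DataFrame.from_dict with orient='index', so that the outer keys become row labels. The values for row 'x' are invented for illustration.

```python
import pandas as pd

rows = {}  # maps row label -> dict of column values
rows['x'] = {'a': 7, 'b': 8, 'c': 9, 'd': 0}  # illustrative values
rows['y'] = {'a': 1, 'b': 5, 'c': 2, 'd': 3}

# orient='index' treats each outer key as a row label.
df = pd.DataFrame.from_dict(rows, orient='index')
print(df)
```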
Solution 5:[5]
The logic behind the code is simple and straightforward:
Make a df with 1 row using the dictionary.
Then create a df of shape (1, 4) that contains only NaN and has the dictionary keys as its columns.
Then concatenate a NaN df, the dict df, and another NaN df.
import pandas as pd
import numpy as np
raw_datav = {'a':1, 'b':5, 'c':2, 'd':3}
datav_df = pd.DataFrame(raw_datav, index=[0])
nan_df = pd.DataFrame([[np.nan]*4], columns=raw_datav.keys())
df = pd.concat([nan_df, datav_df, nan_df], ignore_index=True)
df.index = ["x", "y", "z"]
print(df)
gives
     a    b    c    d
x  NaN  NaN  NaN  NaN
y  1.0  5.0  2.0  3.0
z  NaN  NaN  NaN  NaN
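A more compact way to get what should be the same frame, offered as an alternative sketch rather than as this answer's method, is to build the single labelled row and then reindex to the full set of row labels. Rows missing from the original are filled with NaN.

```python
import pandas as pd

raw_datav = {'a': 1, 'b': 5, 'c': 2, 'd': 3}

# Build one row labelled 'y', then expand to all labels; new rows are NaN.
df = pd.DataFrame([raw_datav], index=['y']).reindex(['x', 'y', 'z'])
print(df)
```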
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | dangom |
Solution 2 | |
Solution 3 | Rajitha Fernando |
Solution 4 | stackoverflowuser2010 |
Solution 5 | Subham |