Python pandas: Why does df.iloc[:, :-1].values for my training data select till only the second last column?
Very simply put, for the same training data frame `df`, when I use
X = df.iloc[:, :-1].values
it selects only up to the second-last column of the data frame instead of up to the last column (which is what I want, but it is strange behavior I have never seen before). I know this because the second-last column's value and the last column's value for that row are different.
However, using
y = df.iloc[:, -1].values
gives me the array of the last column's values, which is exactly what I want.
Why does the -1 for X give me the second-last column's value instead?
Solution 1:[1]
I think you have only two columns in `df`, because if there are more columns, `iloc` selects all columns except the last:
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3],
                   'B': [4, 5, 6],
                   'C': [7, 8, 9],
                   'D': [1, 3, 5],
                   'E': [5, 3, 6],
                   'F': [7, 4, 3]})
print(df)
   A  B  C  D  E  F
0  1  4  7  1  5  7
1  2  5  8  3  3  4
2  3  6  9  5  6  3
# Select every column except the last one
print(df.iloc[:, :-1])
   A  B  C  D  E
0  1  4  7  1  5
1  2  5  8  3  3
2  3  6  9  5  6
# .values converts the selection to a NumPy array
X = df.iloc[:, :-1].values
print(X)
[[1 4 7 1 5]
 [2 5 8 3 3]
 [3 6 9 5 6]]
print(X.shape)
(3, 5)
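To illustrate the two-column guess above: with only two columns, `:-1` leaves just the first column, which is also the second-last one. This is a minimal sketch with a made-up frame; the column names `feature` and `target` are assumptions and do not come from the question.

```python
import pandas as pd

# Hypothetical two-column frame; the names and values are illustrative only.
df = pd.DataFrame({'feature': [10, 20, 30],
                   'target': [0, 1, 0]})

X = df.iloc[:, :-1].values   # every column except the last -> 2-D array
y = df.iloc[:, -1].values    # the last column only -> 1-D array

print(list(df.columns[:-1]))  # ['feature']
print(X.shape)                # (3, 1)
print(y.shape)                # (3,)
```

Here `X` holds only `feature`, so it looks as if `:-1` "stopped at the second-last column", while `y` holds `target`.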
Solution 2:[2]
Just for clarity: with respect to Python syntax, this question has been answered here.
Python list slicing syntax states that `a:b` gets the element at `a` and everything up to but not including `b`. `a:` gets `a` and everything after it. `:b` gets everything before `b` but not `b` itself. The list index `-1` refers to the last element, so `:-1` follows the same rule: it gets everything before the last element but not the last element. If you want the last element included, use `:`.
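The slicing rules described in this answer can be checked on a plain list. A small sketch (the list `a` is just an example):

```python
a = [1, 2, 3, 4]

print(a[1:3])   # [2, 3]        start at index 1, stop before index 3
print(a[1:])    # [2, 3, 4]     index 1 and everything after it
print(a[:3])    # [1, 2, 3]     everything before index 3
print(a[-1])    # 4             the last element
print(a[:-1])   # [1, 2, 3]     everything before the last element
print(a[:])     # [1, 2, 3, 4]  the whole list
```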
Solution 3:[3]
Because the upper bound is exclusive. It's similar to slicing a list:
a = [1, 2, 3, 4]
a[:3]
will result in [1, 2, 3]. It does not take the last element.
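The same exclusive upper bound applies to column positions in `iloc`. As a rough sketch on a hypothetical four-column frame (the frame and the variable names are assumptions for illustration), `:-1` behaves like stopping at position "number of columns minus one":

```python
import pandas as pd

# Hypothetical four-column frame, used only for illustration.
df = pd.DataFrame({'A': [1, 2], 'B': [3, 4], 'C': [5, 6], 'D': [7, 8]})

n_cols = df.shape[1]                  # 4 columns in total
first_three = df.iloc[:, :3]          # positions 0, 1, 2; position 3 is excluded
all_but_last = df.iloc[:, :-1]        # stop before the last column
explicit = df.iloc[:, :n_cols - 1]    # the same slice with the bound spelled out

print(list(first_three.columns))      # ['A', 'B', 'C']
print(all_but_last.equals(explicit))  # True
```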
Solution 4:[4]
In case it helps, here are some single selections with iloc:
# Single selections using iloc and DataFrame
# Rows:
data.iloc[0] # first row of data frame (Aleshia Tomkiewicz) - Note a Series data type output.
data.iloc[1] # second row of data frame (Evan Zigomalas)
data.iloc[-1] # last row of data frame (Mi Richan)
# Columns:
data.iloc[:,0] # first column of data frame (first_name)
data.iloc[:,1] # second column of data frame (last_name)
data.iloc[:,-1] # last column of data frame (id)
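The `data` frame used in this answer is not shown, so the following is only a stand-in built from the names in the comments above; the `id` values are invented. It sketches how an integer position returns a Series, while a slice such as `-1:` keeps a DataFrame.

```python
import pandas as pd

# Stand-in for `data`: the names come from the comments in the answer,
# the id values are made up for illustration.
data = pd.DataFrame({
    'first_name': ['Aleshia', 'Evan', 'Mi'],
    'last_name': ['Tomkiewicz', 'Zigomalas', 'Richan'],
    'id': [1, 2, 3],
})

first_row = data.iloc[0]            # integer position -> Series indexed by column name
last_col = data.iloc[:, -1]         # integer position -> Series holding the 'id' column
last_col_frame = data.iloc[:, -1:]  # slice -> DataFrame with one column

print(type(first_row).__name__)     # Series
print(last_col.name)                # id
print(last_col_frame.shape)         # (3, 1)
```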
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | |
Solution 2 | Community |
Solution 3 | Manoj Kumar |
Solution 4 | Shafin Junayed |