Find specific value knowing row pandas
I have a dataframe with this structure:
A | indexer | attr1_rank | attr2_rank | attr3_rank | attr4_rank | ... | attrn_rank |
---|---|---|---|---|---|---|---|
P | 1 | 2 | 1 | 3 | 4 | ... | n |
S | 2 | 1 | 2 | 4 | 3 | ... | n |
How can I add a column, return_value, that holds the name of the column whose value matches indexer? indexer should be compared with attr1_rank and attr2_rank, and the header of the matching column (without the _rank suffix) should be returned:
return_value |
---|
attr2 |
attr2 |
I have this code, but it raises an IndexError. The variable tmp holds the value of indexer.
index_value = self._data.iloc[row, [2, 4]] == int(tmp)
col_name = self._data.columns[index_value]
col_name = col_name.removesuffix('_rank')
self._data.iloc[row, column + 3] = col_name # assuming that 'return_value' column is 3 positions to the right
Update: following @JAV's solution, this is the code I adapted:
def setData(self, index, value, role=Qt.EditRole):
    based_columns = [6, 8, 10, 12]
    if role == Qt.EditRole:
        row = index.row()
        column = index.column()
        tmp = str(value)
        if column in based_columns:
            if column == 6 and tmp in self._data.columns.values.tolist():
                index_no = self._data.columns.get_loc(tmp)
                self._data.iloc[row, column + 1] = self._data.iloc[row, index_no]
                self._data.iloc[row, column] = tmp
            elif column in [8, 10, 12]:
                self._data.iloc[row, column + 1] = self._data.apply(self.index_match(row), axis=1)
                self._data.iloc[row, column] = tmp
            self.dataChanged.emit(index, index)

def index_match(self, row):
    for col in row[97:].index:
        if row[col] == row['indexer']:
            return col[:-5]
It fails with this traceback:
Traceback (most recent call last):
  File "helper_classes.py", line 171, in setData
    self._data.iloc[row, column + 1] = self._data.apply(self.index_match(row), axis=1)
  File "helper_classes.py", line 176, in index_match
    for col in row[97:].index:
TypeError: 'int' object is not subscriptable
Solution 1:[1]
It was easier than I thought for my functionality.
for x in range(initial_column, end_column):
    if self._data.iloc[row, x] == int(tmp):
        index_value = x
        break
col_name = self._data.columns[index_value]
col_name = col_name.removesuffix('_rank')
self._data.iloc[row, column + 1] = col_name
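The loop above can be tried outside the Qt model with a small sketch like the one below. The helper name find_rank_column and the sample DataFrame are assumptions made for illustration; unlike the snippet above, the sketch returns None when no column matches instead of leaving index_value undefined. str.removesuffix needs Python 3.9 or later.

```python
import pandas as pd

def find_rank_column(df, row, target, initial_column, end_column):
    """Return the header (minus '_rank') of the first column in
    [initial_column, end_column) whose value in `row` equals `target`."""
    for x in range(initial_column, end_column):
        if df.iloc[row, x] == int(target):
            return df.columns[x].removesuffix('_rank')
    return None  # no column in the range matched

# Hypothetical sample data mirroring the table in the question
df = pd.DataFrame({
    'A': ['P', 'S'],
    'indexer': [1, 2],
    'attr1_rank': [2, 1],
    'attr2_rank': [1, 2],
})

# Columns 2 and 3 are the *_rank columns; column 1 is indexer
result = [find_rank_column(df, r, df.iloc[r, 1], 2, 4) for r in range(len(df))]
print(result)  # ['attr2', 'attr2']
```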
Solution 2:[2]
You can use this if you have an undefined number of columns to loop through:
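def index_match(row):
    # Check only the *_rank entries so that 'indexer' is never compared with itself
    for col in row.index:
        if col.endswith('_rank') and row[col] == row['indexer']:
            return col[:-5]  # trimming off _rank
    return None  # no rank column matched

df['return_value'] = df.apply(index_match, axis=1)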
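The TypeError in the question's update comes from the call self._data.apply(self.index_match(row), axis=1): there row is an integer position, and index_match is called immediately instead of being passed to apply, so row[97:] tries to slice an int. The sketch below, on a hypothetical sample DataFrame, shows the two working forms: passing the function itself to apply for the whole frame, or calling it on df.iloc[row] for a single row.

```python
import pandas as pd

def index_match(row):
    # `row` must be a pandas Series (one DataFrame row), not an integer position
    for col in row.index:
        if col.endswith('_rank') and row[col] == row['indexer']:
            return col[:-5]  # drop the '_rank' suffix
    return None

# Hypothetical sample data mirroring the table in the question
df = pd.DataFrame({
    'A': ['P', 'S'],
    'indexer': [1, 2],
    'attr1_rank': [2, 1],
    'attr2_rank': [1, 2],
})

# Whole frame: pass the function object; apply calls it once per row
all_rows = df.apply(index_match, axis=1)

# Single row (e.g. inside setData): fetch the row as a Series first
single = index_match(df.iloc[0])

print(all_rows.tolist(), single)  # ['attr2', 'attr2'] attr2
```

Inside setData only one cell is being written, so the single-row form is probably the better fit; assigning the result of a whole-frame apply to one cell would not store a single string.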
Solution 3:[3]
Here's a one-liner that does it for you.
df['return_value'] = df.apply(lambda x: 'attr2' if x['indexer'] == x['attr2_rank'] else ('attr1' if x['indexer'] == x['attr1_rank'] else None), axis=1)
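If the number of columns is large:
def get_return_val(x):
    # Labels of every entry in the row equal to indexer, excluding indexer itself
    vals = set(x.loc[x.indexer == x].index) - {'indexer'}
    if len(vals):
        # removesuffix (Python 3.9+) drops the literal suffix; rstrip would strip a set of characters
        return [v.removesuffix('_rank') for v in vals][0]
    else:
        return None

df['return_value'] = df.apply(get_return_val, axis=1)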
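A caveat when trimming the suffix: str.rstrip('_rank') treats its argument as a set of characters and removes any trailing run of them, not the literal suffix. It happens to work for names like attr1_rank because the digit stops the stripping, but it can eat into other names. The short sketch below uses a hypothetical column name, bank_rank, to show the difference from str.removesuffix (Python 3.9+) and from slicing.

```python
# 'bank_rank' is a hypothetical column name chosen to expose the difference
name = 'bank_rank'

stripped = name.rstrip('_rank')       # strips trailing chars from the set {_, r, a, n, k}
trimmed = name.removesuffix('_rank')  # removes the literal suffix only
sliced = name[:-5]                    # drops the last five characters

print(stripped, trimmed, sliced)      # b bank bank

# With the question's naming scheme the digit protects the prefix
print('attr1_rank'.rstrip('_rank'))   # attr1
```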
Solution 4:[4]
df.apply(lambda x: list((dictionary:=x[[index for index in x.index if '_rank' in index]].to_dict()).keys())[list(dictionary.values()).index(x['indexer'])].replace("_rank",""), axis=1)
or
df.apply(lambda x: list((dictionary:=x[[index for index in x.index if '_rank' in index]].to_dict()).keys())[list(dictionary.values()).index(x['indexer'])][:-5], axis=1)
or
df.apply(lambda x: list((dictionary:=x.drop(['A', 'indexer']).to_dict()).keys())[list(dictionary.values()).index(x['indexer'])][:-5], axis=1)
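The one-liners above pack several steps into a single lambda (and need Python 3.8+ for the := operator). The sketch below unrolls the first variant into a named function to show what it does; the function name rank_header and the sample DataFrame are assumptions for illustration. It also catches the ValueError that list.index raises when no rank equals indexer, which the one-liners do not handle.

```python
import pandas as pd

def rank_header(row):
    # Keep only the *_rank entries of the row as a {column: value} dict
    ranks = row[[c for c in row.index if '_rank' in c]].to_dict()
    keys, values = list(ranks.keys()), list(ranks.values())
    try:
        # Position of the first rank equal to indexer, mapped back to its column name
        return keys[values.index(row['indexer'])].replace('_rank', '')
    except ValueError:
        # list.index found no rank equal to indexer
        return None

# Hypothetical sample data; the last row has no matching rank
df = pd.DataFrame({
    'A': ['P', 'S', 'T'],
    'indexer': [1, 2, 9],
    'attr1_rank': [2, 1, 1],
    'attr2_rank': [1, 2, 2],
})

df['return_value'] = df.apply(rank_header, axis=1)
print(df)
```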
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | Victor Caicedo |
Solution 2 | |
Solution 3 | |
Solution 4 | MoRe |