Python pandas dataframe populate hierarchical levels from parent child
I have the following dataframe, which contains parent-child relations:
import pandas as pd
data = pd.DataFrame({'Parent': ['a', 'a', 'b', 'c', 'c', 'f', 'q', 'z', 'k'],
                     'Child': ['b', 'c', 'd', 'f', 'g', 'h', 'k', 'q', 'w']})
a
├── b
│   └── d
└── c
    ├── f
    │   └── h
    └── g
z
└── q
    └── k
        └── w
I would like to get a new dataframe which contains, e.g., all children of parent a:
child | level1 | level2 | level x |
---|---|---|---|
d | a | b | - |
b | a | - | - |
c | a | - | - |
f | a | c | - |
h | a | c | f |
g | a | c | - |
I do not know upfront how many levels there are, which is why I have used 'level x'.
I guess I somehow need a recursive pattern to iterate over the dataframe.
Solution 1:[1]
I'd suggest
- building, for each child, its list of parents (children: parentList)
- building the DataFrame, giving each parent a level name
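import pandas as pd

values = {'Parent': ['a', 'a', 'b', 'c', 'c', 'f', 'q', 'z', 'k'],
          'Child': ['b', 'c', 'd', 'f', 'g', 'h', 'k', 'q', 'w']}

# Map each child to its direct parent
relations = dict(zip(values['Child'], values['Parent']))

def get_parent_list(element):
    # Walk up the tree recursively; the result is ordered from the root down
    parent = relations.get(element)
    return get_parent_list(parent) + [parent] if parent else []

# One entry per child: {'level_0': root, 'level_1': next ancestor, ...}
# set() removes duplicates; its iteration order (and hence the row order) is arbitrary
all_relations = {
    children: {f'level_{idx}': value for idx, value in enumerate(get_parent_list(children))}
    for children in set(values['Child'])
}

df = pd.DataFrame.from_dict(all_relations, orient='index')
print(df)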
  level_0 level_1 level_2
b       a     NaN     NaN
f       a       c     NaN
d       a       b     NaN
g       a       c     NaN
h       a       c       f
q       z     NaN     NaN
k       z       q     NaN
w       z       q       k
c       a     NaN     NaN
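The table above covers every child in the data, while the question asks only for the descendants of one parent such as a. Here is a minimal sketch of one way to narrow it down, reusing get_parent_list from the answer. The helper name descendants_of is hypothetical and not part of the original answer. It assumes the data has no cycles, since a cycle would make the recursion run forever.

```python
import pandas as pd

values = {'Parent': ['a', 'a', 'b', 'c', 'c', 'f', 'q', 'z', 'k'],
          'Child': ['b', 'c', 'd', 'f', 'g', 'h', 'k', 'q', 'w']}

# Map each child to its direct parent
relations = dict(zip(values['Child'], values['Parent']))

def get_parent_list(element):
    """Return the ancestors of element, ordered from the root down."""
    parent = relations.get(element)
    return get_parent_list(parent) + [parent] if parent else []

def descendants_of(root):
    """Hypothetical helper: one row per descendant of root with its ancestor chain."""
    rows = {}
    for child in values['Child']:
        parents = get_parent_list(child)
        if root in parents:
            # Keep only the part of the chain from root downwards,
            # so level_0 is always root even if root is not a top-level node
            chain = parents[parents.index(root):]
            rows[child] = {f'level_{idx}': value for idx, value in enumerate(chain)}
    # sort_index gives a deterministic row order
    return pd.DataFrame.from_dict(rows, orient='index').sort_index()

result = descendants_of('a')
print(result)
```

For root a this yields the rows b, c, d, f, g and h, with missing levels left as NaN. Calling descendants_of('z') would return q, k and w in the same layout.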
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | azro |