'How can I get branch of a networkx graph as a list from pandas dataframe in Python?
I have a pandas dataframe df
which looks as follows:
From To
0 Node1 Node2
1 Node1 Node3
2 Node2 Node4
3 Node2 Node5
4 Node3 Node6
5 Node3 Node7
6 Node4 Node8
7 Node5 Node9
8 Node6 Node10
9 Node7 Node11
df.to_dict()
is:
{'From': {0: 'Node1',
1: 'Node1',
2: 'Node2',
3: 'Node2',
4: 'Node3',
5: 'Node3',
6: 'Node4',
7: 'Node5',
8: 'Node6',
9: 'Node7'},
'To': {0: 'Node2',
1: 'Node3',
2: 'Node4',
3: 'Node5',
4: 'Node6',
5: 'Node7',
6: 'Node8',
7: 'Node9',
8: 'Node10',
9: 'Node11'}}
I have plotted this pandas dataframe as a network graph using networkx package which looks as follows:
I want to get the list of unique scenarios/branches from this network graph. There are four branches here starting from Node1.
Node1-Node2-Node4-Node8
Node1-Node2-Node5-Node9
Node1-Node3-Node6-Node10
Node1-Node3-Node7-Node11
How can I get the list of the branches above from the given pandas dataframe in Python?
Solution 1:[1]
You can define Recursive Function
and save path and print path:
df = pd.DataFrame({
'From':['Node1','Node1', 'Node2', 'Node2', 'Node3', 'Node3', 'Node4', 'Node5', 'Node6', 'Node7'],
'TO' :['Node2','Node3', 'Node4', 'Node5', 'Node6', 'Node7', 'Node8', 'Node9', 'Node10', 'Node11']
})
def prntPath(lst, node, df, lst_vst):
for val in df.values:
if val[0] == node:
lst.append(val[1])
prntPath(lst, val[1], df, lst_vst)
if not lst[-1] in lst_vst:
print('-'.join(lst))
for l in lst: lst_vst.add(l)
lst.pop()
return
lst_vst = set()
prntPath(['Node1'],'Node1', df, lst_vst)
Output:
Node1-Node2-Node4-Node8
Node1-Node2-Node5-Node9
Node1-Node3-Node6-Node10
Node1-Node3-Node7-Node11
You can check and use for other graph like below:
import networkx as nx
import matplotlib.pyplot as plt
import pandas as pd
from itertools import chain
from networkx.drawing.nx_pydot import graphviz_layout
def prntPath(lst, node, df, lst_vst):
for val in df.values:
if val[0] == node:
lst.append(val[1])
prntPath(lst, val[1], df, lst_vst)
if not lst[-1] in lst_vst: print('-'.join(lst))
for l in lst: lst_vst.add(l)
lst.pop()
return
df = pd.DataFrame({
'From':['Node1','Node1', 'Node2', 'Node3', 'Node3', 'Node5', 'Node7'],
'TO' :['Node2','Node3', 'Node5', 'Node6', 'Node7', 'Node9', 'Node11']
})
g = nx.DiGraph()
g.add_nodes_from(set(chain.from_iterable(df.values)))
for edg in df.values:
g.add_edge(*edg)
pos = graphviz_layout(g, prog="dot")
nx.draw(g, pos,with_labels=True, node_shape='s')
plt.draw()
plt.show()
lst_vst = set()
prntPath(['Node1'],'Node1', df, lst_vst)
Output:
Node1-Node2-Node5-Node9
Node1-Node3-Node6
Node1-Node3-Node7-Node11
Solution 2:[2]
I think @I'mahdi's first snippet of code assumes that a child node will not have more than one parent node, i.e., the graph is a binary tree. When the graph is not a binary tree, the code can be modified as:
import pandas as pd
def prntPath(lst, node, df, lst_vst):
for val in df.values:
if val[0] == node:
lst.append(val[1])
# recursely find child node first until no child nodes are found
prntPath(lst, val[1], df, lst_vst)
# recursion is over
# when no child nodes are found
if len(lst)>=2:
if (not lst[-1] in lst_vst) or (not lst[-2] in lst_vst):
print('-'.join(lst))
for l in lst:
lst_vst.add(l)
lst.pop()
# Two links are added to the graph: 1) from node 4 to node 9. 2) from node 1 to node 11.
df = pd.DataFrame({
'From':['Node1','Node1', 'Node2', 'Node2', 'Node3', 'Node3', 'Node4', 'Node4','Node5', 'Node6', 'Node7','Node1'],
'TO' :['Node2','Node3', 'Node4', 'Node5', 'Node6', 'Node7', 'Node8', 'Node9','Node9', 'Node10', 'Node11','Node12']
})
lst_vst = set()
prntPath(['Node1'],'Node1', df, lst_vst)
The results are:
Node1-Node2-Node4-Node8
Node1-Node2-Node4-Node9
Node1-Node2-Node5-Node9
Node1-Node3-Node6-Node10
Node1-Node3-Node7-Node11
Node1-Node12
Solution 3:[3]
An alternative approach to solve this using the networkx package.
import networkx as nx
G = nx.DiGraph()
G.add_edges_from((r.From, r.To) for r in df.itertuples())
roots = [n for (n, d) in G.in_degree if d == 0]
print(roots)
leafs = [n for (n, d) in G.out_degree if d == 0]
print(leafs)
df1 = pd.DataFrame(nx.algorithms.all_simple_paths(G, roots[0], leafs))
for index in df1:
print (df1.loc[index].to_list())
A directed graph is created and the edges are added to it from df
. Node 1 is the only node in G, where the in_degree is equal to 0, i.e. it is the root of the entire graph. Therefore, roots
is equal to Node 0.
leafs
imply the nodes at the directed graph at which out_degree is equal to 0. i.e., ["Node8", "Node9", "Node10", "Node11"].
nx.algorithms.all_simple_paths(G, roots[0], leafs)
provide all the paths starting from Node 1 and ending at each of the nodes in leafs.
Each row of the dataframe df1
is printed as a list using the for-loop statement.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | |
Solution 2 | |
Solution 3 | hbstha123 |