'How can I get branch of a networkx graph as a list from pandas dataframe in Python?

I have a pandas dataframe df which looks as follows:

From    To
0   Node1   Node2
1   Node1   Node3
2   Node2   Node4
3   Node2   Node5
4   Node3   Node6
5   Node3   Node7
6   Node4   Node8
7   Node5   Node9
8   Node6   Node10
9   Node7   Node11

df.to_dict() is:

{'From': {0: 'Node1',
  1: 'Node1',
  2: 'Node2',
  3: 'Node2',
  4: 'Node3',
  5: 'Node3',
  6: 'Node4',
  7: 'Node5',
  8: 'Node6',
  9: 'Node7'},
 'To': {0: 'Node2',
  1: 'Node3',
  2: 'Node4',
  3: 'Node5',
  4: 'Node6',
  5: 'Node7',
  6: 'Node8',
  7: 'Node9',
  8: 'Node10',
  9: 'Node11'}}

I have plotted this pandas dataframe as a network graph using networkx package which looks as follows: enter image description here

I want to get the list of unique scenarios/branches from this network graph. There are four branches here starting from Node1.

Node1-Node2-Node4-Node8
Node1-Node2-Node5-Node9
Node1-Node3-Node6-Node10
Node1-Node3-Node7-Node11

How can I get the list of the branches above from the given pandas dataframe in Python?



Solution 1:[1]

You can define Recursive Function and save path and print path:

df = pd.DataFrame({
          'From':['Node1','Node1', 'Node2', 'Node2', 'Node3', 'Node3', 'Node4', 'Node5', 'Node6', 'Node7'],
          'TO'  :['Node2','Node3', 'Node4', 'Node5', 'Node6', 'Node7', 'Node8', 'Node9', 'Node10', 'Node11']
        })

def prntPath(lst, node, df, lst_vst):
    for val in df.values:
        if val[0] == node:
            lst.append(val[1])
            prntPath(lst, val[1], df, lst_vst)
    
    if not lst[-1] in lst_vst:
        print('-'.join(lst))
    for l in lst: lst_vst.add(l)
    lst.pop()
    return
    
lst_vst = set()
prntPath(['Node1'],'Node1', df, lst_vst)

Output:

Node1-Node2-Node4-Node8
Node1-Node2-Node5-Node9
Node1-Node3-Node6-Node10
Node1-Node3-Node7-Node11

You can check and use for other graph like below:

import networkx as nx
import matplotlib.pyplot as plt
import pandas as pd
from itertools import chain
from networkx.drawing.nx_pydot import graphviz_layout

def prntPath(lst, node, df, lst_vst):
    for val in df.values:
        if val[0] == node:
            lst.append(val[1])
            prntPath(lst, val[1], df, lst_vst)
    if not lst[-1] in lst_vst: print('-'.join(lst))
    for l in lst: lst_vst.add(l)
    lst.pop()
    return

df = pd.DataFrame({
          'From':['Node1','Node1', 'Node2', 'Node3', 'Node3', 'Node5', 'Node7'],
          'TO'  :['Node2','Node3', 'Node5', 'Node6', 'Node7', 'Node9', 'Node11']
        })

g = nx.DiGraph()
g.add_nodes_from(set(chain.from_iterable(df.values)))
for edg in df.values:
    g.add_edge(*edg)
pos = graphviz_layout(g, prog="dot")
nx.draw(g, pos,with_labels=True, node_shape='s')
plt.draw()
plt.show() 

lst_vst = set()
prntPath(['Node1'],'Node1', df, lst_vst)

Output:

enter image description here

Node1-Node2-Node5-Node9
Node1-Node3-Node6
Node1-Node3-Node7-Node11

Solution 2:[2]

I think @I'mahdi's first snippet of code assumes that a child node will not have more than one parent node, i.e., the graph is a binary tree. When the graph is not a binary tree, the code can be modified as:

import pandas as pd

def prntPath(lst, node, df, lst_vst):
    for val in df.values:
        if val[0] == node:
            lst.append(val[1])
            # recursely find child node first until no child nodes are found
            prntPath(lst, val[1], df, lst_vst)
            
            # recursion is over
    
    # when no child nodes are found
    if len(lst)>=2:
        if (not lst[-1] in lst_vst) or (not lst[-2] in lst_vst):
            print('-'.join(lst))
    for l in lst:
        lst_vst.add(l)
    lst.pop()

# Two links are added to the graph: 1) from node 4 to node 9. 2) from node 1 to node 11.
 
df = pd.DataFrame({
          'From':['Node1','Node1', 'Node2', 'Node2', 'Node3', 'Node3', 'Node4', 'Node4','Node5', 'Node6', 'Node7','Node1'],
          'TO'  :['Node2','Node3', 'Node4', 'Node5', 'Node6', 'Node7', 'Node8', 'Node9','Node9', 'Node10', 'Node11','Node12']
        })
lst_vst = set()
prntPath(['Node1'],'Node1', df, lst_vst)

The results are:

Node1-Node2-Node4-Node8
Node1-Node2-Node4-Node9
Node1-Node2-Node5-Node9
Node1-Node3-Node6-Node10
Node1-Node3-Node7-Node11
Node1-Node12

Solution 3:[3]

An alternative approach to solve this using the networkx package.

import networkx as nx
G = nx.DiGraph()
G.add_edges_from((r.From, r.To) for r in df.itertuples())

roots = [n for (n, d) in G.in_degree if d == 0]
print(roots)

leafs = [n for (n, d) in G.out_degree if d == 0]
print(leafs)

df1 = pd.DataFrame(nx.algorithms.all_simple_paths(G, roots[0], leafs))

for index in df1:
    print (df1.loc[index].to_list())
    

A directed graph is created and the edges are added to it from df. Node 1 is the only node in G, where the in_degree is equal to 0, i.e. it is the root of the entire graph. Therefore, roots is equal to Node 0.

leafs imply the nodes at the directed graph at which out_degree is equal to 0. i.e., ["Node8", "Node9", "Node10", "Node11"].

nx.algorithms.all_simple_paths(G, roots[0], leafs) provide all the paths starting from Node 1 and ending at each of the nodes in leafs.

Each row of the dataframe df1 is printed as a list using the for-loop statement.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2
Solution 3 hbstha123