'Detecting cycles in Topological sort using Kahn's algorithm (in degree / out degree)
I have been practicing graph questions lately.
https://leetcode.com/problems/course-schedule-ii/
https://leetcode.com/problems/alien-dictionary/
The current way I detect cycles is to use two hashsets. One for visiting nodes, and one for fully visited nodes. And I push the result onto a stack with DFS traversal.
If I ever visit a node that is currently in the visiting set, then it is a cycle.
The code is pretty verbose and the length is long.
Can anyone please explain how I can use a more standard top-sort algorithm (Kahn's) to detect cycles and generate the top sort sequence?
I just want my method to exit or set some global variable which flags that a cycle has been detected.
Many thanks.
Solution 1:[1]
Khan's algorithm with cycle detection (summary)
- Step 1: Compute In-degree: First we create compute a lookup for the in-degrees of every node. In this particular Leetcode problem, each node has a unique integer identifier, so we can simply store all the in-degrees values using a list where
indegree[i]
tells us the in-degree of nodei
. - Step 2: Keep track of all nodes with in-degree of zero: If a node has an in-degree of zero it means it is a course that we can take right now. There are no other courses that it depends on. We create a queue
q
of all these nodes that have in-degree of zero. At any step of Khan's algorithm, if a node is in q then it is guaranteed that it's "safe to take this course" because it does not depend on any courses that "we have not taken yet". - Step 3: Delete node and edges, then repeat: We take one of these special safe courses
x
from the queueq
and conceptually treat everything as if we have deleted the nodex
and all its outgoing edges from the graphg
. In practice, we don't need to update the graphg
, for Khan's algorithm it is sufficient to just update thein-degree
value of its neighbours to reflect that this node no longer exists.
This step is basically as if a person took and passed the exam for coursex
, and now we want to update the other courses dependencies to show that they don't need to worry aboutx
anymore. - Step 4: Repeat: When we removing these edges from
x
, we are decreasing the in-degree ofx
's neighbours; this can introduce more nodes with an in-degree of zero. During this step, if any more nodes have their in-degree become zero then they are added toq
. We repeat step 3 to process these nodes. Each time we remove a node fromq
we add it to the final topological sort listresult
. - Step 5. Detecting Cycle with Khan's Algorithm: If there is a cycle in the graph then
result
will not include all the nodes in the graph,result
will return only some of the nodes. To check if there is a cycle, you just need to check whether the length ofresult
is equal to the number of nodes in the graph,n
.- Why does this work?:
Suppose there is a cycle in the graph: x1 -> x2 -> ... -> xn -> x1, then none of these nodes will appear in the list because their in-degree will not reach 0 during Khan's algorithm. Each node xi in the cycle can't be put into the queue
q
because there is always some other predecessor node x_(i-1) with an edge going from x_(i-1) to xi preventing this from happening.
- Why does this work?:
Suppose there is a cycle in the graph: x1 -> x2 -> ... -> xn -> x1, then none of these nodes will appear in the list because their in-degree will not reach 0 during Khan's algorithm. Each node xi in the cycle can't be put into the queue
Full solution to Leetcode course-schedule-ii in Python 3:
from collections import defaultdict
def build_graph(edges, n):
g = defaultdict(list)
for i in range(n):
g[i] = []
for a, b in edges:
g[b].append(a)
return g
def topsort(g, n):
# -- Step 1 --
indeg = [0] * n
for u in g:
for v in g[u]:
indeg[v] += 1
# -- Step 2 --
q = []
for i in range(n):
if indeg[i] == 0:
q.append(i)
# -- Step 3 and 4 --
result = []
while q:
x = q.pop()
result.append(x)
for y in g[x]:
indeg[y] -= 1
if indeg[y] == 0:
q.append(y)
return result
def courses(n, edges):
g = build_graph(edges, n)
ordering = topsort(g, n)
# -- Step 5 --
has_cycle = len(ordering) < n
return [] if has_cycle else ordering
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | James Lawson |