'given a number k and a graph is there a DFS run that will give forest larger then k
I was given a question that I can't seem to solve.
given a directed graph G=(V,E) and a natural number k, k>0.
Find an algorithm that will return "YES" if there is a DFS run that the number of trees in the DFS forest is >= K.
the algorithm should be linear in the size of the graph G. O(|V| + |E|).
I Will Explain:
so for the following run where s is the starting node, we will get 2 trees :
but if we stared at the node u we would get only 1 tree. so I need to return yes for k = 1 or 2. and no else.
so how can I know the number of trees?
thanks for the help!
Solution 1:[1]
Imagine that the given graph is actually a tree. Then if you would start a DFS at the root of the tree, you would find the whole graph in one DFS search. On the other extreme, if you would start a DFS in a leaf, and then in the next leaf, and would start every new DFS in the nodes as you would get them bottom-up, then each DFS will only find one node and then quit (because the children would already have been visited by a previous DFS). So then you can launch as many DFS searches as there are nodes in the tree.
The same remains true if the graph has some extra edges, but remains acyclic.
It becomes different when the graph has cycles. In that case, a DFS that starts in any member of a cycle will find all other members in that cycle. So a cycle can never get split over different DFS searches. This cycle, combined with every other cycle that intersects with it, is a so-called Strongly connected component.
An algorithm would thus have to find strongly connected components and count those as 1 DFS, while all other nodes which do not partake in any cycle can each be counted as a separate DFS (since you would start in the leaves of those subtrees).
Here is an algorithm that uses DFS (which is confusing, since it is a DFS that is counting possible DFSs) to identify cycles and update the count accordingly. I've used recursion for this algorithm, and so there must be some fast backtracking when the required k has been reached: further, searching is not necessary in that case.
All edges are visited only once, and the main loop also visits all nodes exactly once, so the required time complexity is attained.
def k_forests(adj, k):
# pathindex[node] == 0: means node has not been visited
# pathindex[node] == -1: means node has been visited and all neighbors processed
# pathindex[node] > 0: means node is at this step in the current DFS path
pathindex = [0] * len(adj) # none of the nodes has been visited
def recur(node, depth):
nonlocal k # we will decrease this count
if pathindex[node] > 0: # cycle back
return pathindex[node]
if pathindex[node] == -1: # already processed
return depth
pathindex[node] = depth # being visited
cycle = depth + 1 # infinity
for neighbor in adj[node]:
cycle = min(cycle, recur(neighbor, depth + 1))
if k == 0: # success
return -1 # backtrack completely...
if cycle >= depth: # no cylce detected or back out of cycle
k -= 1
if k == 0:
return -1 # success
pathindex[node] = -1 # completely visited and processed
return cycle
# main loop over the nodes
for node in range(len(adj)):
recur(node, 1)
if k == 0:
return "YES"
return "NO"
This function should be called with an adjacency list for every node, where nodes are identified by a sequential number, starting from 0. For example, the graph in the question can be represented as follows, where s=0, t=1, u=2, v=3, w=4, x=5, y=6, and z=7:
adj = [
[4, 7],
[2, 3],
[3, 1],
[0, 4],
[5],
[7],
[5],
[6, 4]
]
print(k_forests(adj, 4)) # YES
print(k_forests(adj, 5)) # NO
Solution 2:[2]
After the edit on the question:
Use Kosaraju's algorithm to get strongly connected components in O(V + E) time. That would give the max K.
Here max K is same as the number of strongly connected components.
Why?
Let us do the proof by contradiction now. Let us say we have 4 strongly connected components for the graph shown in the question. Assume there is a possibility of getting an extra dfs tree starting at some node v
. That would mean either the node v
was not covered while counting the number of strongly connected components or the node was missed during DFS. But either of the case is not possible if we do a DFS or find the strongly connected components using well proven algorithm. Hence, our assumption is false. Thus, the proof by contradiction.
Answer before edit on the question:
DFS(Vertex v):
mark v as visited;
for(Vertex neighbor: Neighbors of v){
if(!isVisited(neighbor)){
DFS(neighbor);
})
}
count_trees(Graph G): //V vertices and E edges in total
for(Vertex v: V){
if(!isVisited(v)){
DFS(v);
trees++;
})
}
return trees;
Above steps are self explanatory. Maintaining if a vertex has been visited is trivial.
The above approach is just DFS on every node that hasn't been visited before. Hence, the time complexity is the same as that of DFS which is O(|V| + |E|)
.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | xxx |
Solution 2 |