'Python OpenCV Contour Tree Hierarchy Structure

I am trying to implement the algorithm found here in python with OpenCV.

I am trying to implement the part of the algorithm that remove irrelevant edge boundaries based on the number of interior boundaries that they have.

  • If the current edge boundary has exactly one or two interior edge boundaries, the internal boundaries can be ignored
  • If the current edge boundary has more than two interior edge boundaries, it can be ignored

I am having trouble determining the tree structure of the contours I have extracted from the image.

My current source:

import cv2

# Load the image
img = cv2.imread('test.png')
cv2.copyMakeBorder(img, 50,50,50,50,cv2.BORDER_CONSTANT, img, (255,255,255))

# Split out each channel
blue = cv2.split(img)[0]
green = cv2.split(img)[1]
red = cv2.split(img)[2]

# Run canny edge detection on each channel
blue_edges = cv2.Canny(blue, 1, 255)
green_edges = cv2.Canny(green, 1, 255)
red_edges = cv2.Canny(red, 1, 255)

# Join edges back into image
edges = blue_edges | green_edges | red_edges

# Find the contours
contours,hierarchy = cv2.findContours(edges.copy(),cv2.RETR_TREE,cv2.CHAIN_APPROX_SIMPLE)

# For each contour, find the bounding rectangle and draw it
for cnt in contours:
    x,y,w,h = cv2.boundingRect(cnt)
    cv2.rectangle(edges,(x,y),(x+w,y+h),(200,200,200),2)

# Finally show the image
cv2.imshow('img',edges)
cv2.waitKey(0)
cv2.destroyAllWindows()

I assumed that using cv2.RETR_TREE would give me a nice nested array of the contours but that doesn't seem to be the case. How do I retrieve the tree structure of my contours?



Solution 1:[1]

The main confusion here is probably the fact that the hierarchy returned is a numpy array with more dimensions than necessary. On top of that, it looks like the Python FindContours function returns a tuple that is a LIST of contours, and and NDARRAY of the hierarchy...

You can get a sensible array of hierarchy information that is more in line with the C docs by just taking hierarchy[0]. It would then be an appropriate shape to zip, for example, with the contours.

Below is an example that, will draw the outermost rectangles in green and the innermost rectangles in red on this image:

enter image description here

Output:

enter image description here

Note, by the way, that the wording in the OpenCV docs is a little ambiguous, but hierarchyDataOfAContour[2] describes the children of that contour (if it is negative then that is an inner contour), and hierarchyDataOfAContour[3] describes the parents of that contour (if it is negative then that is an exterior contour).

Also note: I looked into implementing the algorithm that you referred to in the OCR paper, and I saw that FindContours was giving me a lot of repeats of near-identical contours. This would complicate the finding of "Edge Boxes" as the paper describes. This may be because the Canny thresholds were too low (note that I was playing around with them as described in the paper), but there may be some way to reduce that effect or just look at the average deviation of the four corners of all the boxes and eliminate duplicates...

import cv2
import numpy

# Load the image
img = cv2.imread("/ContourTest.PNG")

# Split out each channel
blue, green, red = cv2.split(img)

def medianCanny(img, thresh1, thresh2):
    median = numpy.median(img)
    img = cv2.Canny(img, int(thresh1 * median), int(thresh2 * median))
    return img

# Run canny edge detection on each channel
blue_edges = medianCanny(blue, 0.2, 0.3)
green_edges = medianCanny(green, 0.2, 0.3)
red_edges = medianCanny(red, 0.2, 0.3)

# Join edges back into image
edges = blue_edges | green_edges | red_edges

# Find the contours
contours,hierarchy = cv2.findContours(edges, cv2.RETR_TREE,cv2.CHAIN_APPROX_SIMPLE)

hierarchy = hierarchy[0] # get the actual inner list of hierarchy descriptions

# For each contour, find the bounding rectangle and draw it
for component in zip(contours, hierarchy):
    currentContour = component[0]
    currentHierarchy = component[1]
    x,y,w,h = cv2.boundingRect(currentContour)
    if currentHierarchy[2] < 0:
        # these are the innermost child components
        cv2.rectangle(img,(x,y),(x+w,y+h),(0,0,255),3)
    elif currentHierarchy[3] < 0:
        # these are the outermost parent components
        cv2.rectangle(img,(x,y),(x+w,y+h),(0,255,0),3)

# Finally show the image
cv2.imshow('img',img)
cv2.waitKey(0)
cv2.destroyAllWindows()

Solution 2:[2]

Understanding Contour Hierarchies

When finding contours in a binary image using cv2.findContours(), you can use contour hierarchy to select and extract specific contours within the image. Specifically, you can choose a contour retrieval mode to optionally return an output vector containing information about the image topology. There are the four possible modes:

  • cv2.RETR_EXTERNAL - retrieves only the extreme outer contours (no hierarchy)
  • cv2.RETR_LIST - retrieves all of the contours without establishing any hierarchical relationships
  • cv2.RETR_CCOMP - retrieves all of the contours and organizes them into a two-level hierarchy. At the top level, there are external boundaries of the components. At the second level, there are boundaries of the holes. If there is another contour inside a hole of a connected component, it is still put at the top level
  • cv2.RETR_TREE - retrieves all of the contours and reconstructs a full hierarchy of nested contours

Contour Tree Structure

So with this information, we can use cv2.RETR_CCOMP or cv2.RETR_TREE to return a hierarchy list. Take for example this image:

enter image description here

When we use the cv2.RETR_TREE parameter, the contours are arranged in a hierarchy, with the outermost contours for each object at the top. Moving down the hierarchy, each new level of contours represents the next innermost contour for each object. In the image above, the contours in the image are colored to represent the hierarchical structure of the returned contours data. The outermost contours are red, and they are at the top of the hierarchy. The next innermost contours -- the dice pips, in this case -- are green.

We can get that information about the contour hierarchies via the hierarchy array from the cv2.findContours function call. Suppose we call the function like this:

(_, contours, hierarchy) = cv2.findContours(binary, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)

The third return value, saved in the hierarchy variable in this code, is a three-dimensional NumPy array, with one row, X columns, and a "depth" of 4. The X columns correspond to the number of contours found by the function. The cv2.RETR_TREE parameter causes the function to find the internal contours as well as the outermost contours for each object. Column zero corresponds to the first contour, column one the second, and so on.

Each of the columns has a four-element array of integers, representing indices of other contours, according to this scheme:

[next, previous, first child, parent]

The next index refers to the next contour in this contour's hierarchy level, while the previous index refers to the previous contour in this contour's hierarchy level. The first child index refers to the first contour that is contained inside this contour. The parent index refers to the contour containing this contour. In all cases, an value of -1 indicates that there is no next, previous, first child, or parent contour, as appropriate. For a more concrete example, here are some example hierarchy values. The values are in square brackets, and the indices of the contours precede each entry. If you printed out the hierarchy array you will get something like this

0:  [ 6 -1  1 -1]   18: [19 -1 -1 17]
1:  [ 2 -1 -1  0]   19: [20 18 -1 17]
2:  [ 3  1 -1  0]   20: [21 19 -1 17]
3:  [ 4  2 -1  0]   21: [22 20 -1 17]
4:  [ 5  3 -1  0]   22: [-1 21 -1 17]
5:  [-1  4 -1  0]   23: [27 17 24 -1]
6:  [11  0  7 -1]   24: [25 -1 -1 23]
7:  [ 8 -1 -1  6]   25: [26 24 -1 23]
8:  [ 9  7 -1  6]   26: [-1 25 -1 23]
9:  [10  8 -1  6]   27: [32 23 28 -1]
10: [-1  9 -1  6]   28: [29 -1 -1 27]
11: [17  6 12 -1]   29: [30 28 -1 27]
12: [15 -1 13 11]   30: [31 29 -1 27]
13: [14 -1 -1 12]   31: [-1 30 -1 27]
14: [-1 13 -1 12]   32: [-1 27 33 -1]
15: [16 12 -1 11]   33: [34 -1 -1 32]
16: [-1 15 -1 11]   34: [35 33 -1 32]
17: [23 11 18 -1]   35: [-1 34 -1 32]

The entry for the first contour is [6, -1, 1, -1]. This represents the first of the outermost contours; note that there is no particular order for the contours, e.g., they are not stored left to right by default. The entry tells us that the next dice outline is the contour with index six, that there is no previous contour in the list, that the first contour inside this one has index one, and that there is no parent for this contour (no contour containing this one). We can visualize the information in the hierarchy array as seven trees, one for each of the dice in the image.

enter image description here

The seven outermost contours are all those that have no parent, i.e., those with an value of -1 in the fourth field of their hierarchy entry. Each of the child nodes beneath one of the "roots" represents a contour inside the outermost contour. Note how contours 13 and 14 are beneath contour 12 in the diagram. These two contours represent the innermost contours, perhaps noise or some lost paint in one of the pips.


Once we understand how contours are arranged into a hierarchy, we can perform more sophisticated tasks, such as counting the number of contours within a shape in addition to the number of objects in an image. Depending on the retrieval mode you select, you will have full access to the topology of an image and be able to control the tree structure of the contours.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 bellkev
Solution 2 nathancy