How to convert cv2.rectangle bounding box to YoloV4 annotation format (relative x,y,w,h)?
I have trained a Yolo4 network and it is giving me bounding boxes as:
img_array = cv2.cvtColor(cv2.imread('image.png'), cv2.COLOR_BGR2RGB)
classes, scores, bboxes = model.detect(img_array, CONFIDENCE_THRESHOLD, NMS_THRESHOLD)
box = bboxes[0]
(x, y) = (box[0], box[1])
(w, h) = (box[2], box[3])
When I save the image using cv2.rectangle as:
cv2.rectangle(img_array, (x, y), (x + w, y + h), (127,0,75), 1)
cv2.imwrite('image.png',img_array)
It gives me a very good bounding box. I want to use this box and the shape of the image array to create a text file in the YOLOv4 format: x, y, w, h as floating-point values between 0 and 1, relative to the image size.
Let us suppose I have my values as:
img_array.shape -> (443, 1265, 3)
box -> array([489, 126, 161, 216], dtype=int32)
So it gives me
(x, y) = (box[0], box[1]) -> (489, 126)
(w, h) = (box[2], box[3]) -> (161, 216)
Also, the bounding boxes I created with LabelImg are stored in the text file as:
0.453125 0.538462 0.132212 0.509615 # 0 is the class
How can I convert these coordinates to the YOLOv4 format? It is a bit confusing. I have tried several snippets from this answer, but none of them seem to work.
I also tried the code below, but I don't know whether it is right. Even if it is, I have no idea how to get x_ and y_:
def yolov4_format(img_shape, box):
    x_img, y_img, c = img_shape
    (x, y) = (box[0], box[1])
    (w, h) = (box[2], box[3])
    x_, y_ = None, None  # logic for these?
    w_ = w/x_img
    h_ = h/y_img
    return x_, y_, w_, h_
Solution 1:[1]
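I guess I was close to solving it. The x and y values are NOT the absolute top-left corner but the center of the rectangle, relative to the image size, as described by AlexeyAB in this answer. So I looked through the LabelImg code, found the relevant function, and adapted it to my use case. Note that the image shape is (height, width, channels), so widths are divided by img_size[1] and heights by img_size[0].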
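def bnd_box_to_yolo_line(box, img_size):
    # box is (x_min, y_min, w, h) in pixels; img_size is img_array.shape
    (x_min, y_min) = (box[0], box[1])
    (w, h) = (box[2], box[3])
    x_max = x_min + w
    y_max = y_min + h
    # Center of the box, normalized by image width and height
    x_center = float((x_min + x_max)) / 2 / img_size[1]
    y_center = float((y_min + y_max)) / 2 / img_size[0]
    # Box width and height, normalized the same way
    w = float((x_max - x_min)) / img_size[1]
    h = float((y_max - y_min)) / img_size[0]
    return x_center, y_center, w, h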
All you need is the bounding box and the image shape.
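As a quick check against the values in the question, here is a minimal self-contained sketch. The function name `xywh_to_yolo` and the six-decimal output formatting are assumptions made for illustration, not part of LabelImg. It assumes `box` is `(x_min, y_min, w, h)` in pixels, `img_shape` is the NumPy shape `(height, width, channels)`, and the class id is 0.

```python
def xywh_to_yolo(box, img_shape):
    """Convert a pixel (x_min, y_min, w, h) box to relative (x_center, y_center, w, h)."""
    img_h, img_w = img_shape[:2]  # NumPy image shape is (height, width, channels)
    x_min, y_min, w, h = box
    x_center = (x_min + w / 2) / img_w
    y_center = (y_min + h / 2) / img_h
    return x_center, y_center, w / img_w, h / img_h


yolo_box = xywh_to_yolo((489, 126, 161, 216), (443, 1265, 3))

# One annotation line: class id (assumed to be 0) followed by the four relative values
line = "0 " + " ".join(f"{v:.6f}" for v in yolo_box)
print(line)  # 0 0.450198 0.528217 0.127273 0.487585
```

All four values fall between 0 and 1, which is a cheap way to catch a swapped width and height.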
Solution 2:[2]
There is a more straightforward way to do this with pybboxes. Install it with:
pip install pybboxes
In your case,
import pybboxes as pbx
coco_bbox = (489, 126, 161, 216)  # (x_min, y_min, w, h), i.e. COCO-style
W, H = 1265, 443  # WxH of the image; img_array.shape is (H, W, channels)
pbx.convert_bbox(coco_bbox, from_type="coco", to_type="yolo", image_width=W, image_height=H)
# roughly (0.450, 0.528, 0.127, 0.488)
Note that converting to YOLO format requires the image width and height for scaling.
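The same width and height are needed to go back from YOLO values to pixels, which gives a simple sanity check before drawing with cv2.rectangle. The sketch below does not use pybboxes; the helper name `yolo_to_xywh` is hypothetical, and it assumes the input is `(x_center, y_center, w, h)` relative to the image size.

```python
def yolo_to_xywh(yolo_box, img_w, img_h):
    """Convert relative (x_center, y_center, w, h) back to pixel (x_min, y_min, w, h)."""
    x_center, y_center, w_rel, h_rel = yolo_box
    w = w_rel * img_w
    h = h_rel * img_h
    x_min = x_center * img_w - w / 2
    y_min = y_center * img_h - h / 2
    # Round to the nearest pixel to absorb floating-point error
    return tuple(round(v) for v in (x_min, y_min, w, h))


img_w, img_h = 1265, 443  # width and height from img_array.shape == (443, 1265, 3)
yolo_box = (569.5 / img_w, 234 / img_h, 161 / img_w, 216 / img_h)
print(yolo_to_xywh(yolo_box, img_w, img_h))  # (489, 126, 161, 216)
```

If the round trip does not reproduce the original box, the width and height were most likely swapped.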
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | Deshwal |
Solution 2 | null |