What and where am I going wrong in this code for PyTorch-based object detection?
I am using YOLOv5 for this project.
Here is my code:
import time
import numpy as np
import cv2
import torch
import torch.backends.cudnn as cudnn
from models.experimental import attempt_load
from utils.general import non_max_suppression

weights = '/Users/nidhi/Desktop/yolov5/best.pt'
device = torch.device('cpu')
model = attempt_load(weights, map_location=device)  # load FP32 model
stride = int(model.stride.max())  # model stride
cudnn.benchmark = True

# Capture with opencv and detect object
cap = cv2.VideoCapture('Pothole testing.mp4')
width, height = (352, 352)  # quality
cap.set(3, width)   # width
cap.set(4, height)  # height

while cap.isOpened():
    time.sleep(0.2)  # wait for 0.2 second
    ret, frame = cap.read()
    if ret == True:
        now = time.time()
        img = torch.from_numpy(frame).float().to(device).permute(2, 0, 1)
        img /= 255.0  # 0 - 255 to 0.0 - 1.0
        if img.ndimension() == 3:
            img = img.unsqueeze(0)
        pred = model(img, augment=False)[0]
        pred = non_max_suppression(pred, 0.39, 0.45, classes=0, agnostic=True)  # img, conf, iou, classes, ...
        print('time -> ', time.time() - now)
    else:
        break
cap.release()
The error I am getting:
File "run.py", line 38, in <module>
pred = model(img, augment=False)[0]
File "/Users/nidhi/Library/Python/3.8/lib/python/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/Users/nidhi/Desktop/yolov5/models/yolo.py", line 118, in forward
return self.forward_once(x, profile) # single-scale inference, train
File "/Users/nidhi/Desktop/yolov5/models/yolo.py", line 134, in forward_once
x = m(x) # run
File "/Users/nidhi/Library/Python/3.8/lib/python/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/Users/nidhi/Desktop/yolov5/models/common.py", line 152, in forward
return torch.cat(x, self.d)
RuntimeError: Sizes of tensors must match except in dimension 1. Got 108 and 107 in dimension 3 (The offending index is 1)
Operating system: macOS Big Sur 11.2.3
Python version: 3.8.2
The model used is best.pt, which I trained on Google Colab; I used the yolov5l model to train on my dataset.
Solution 1:[1]
Are you getting your error in the following line?
pred = model(img, augment=False)[0]
It might be because YOLO expects input images whose dimensions are multiples of 32, e.g. 320×320 or 352×352, but your frames are 352×288. You will either have to resize the frames, or pad the 288 dimension with white/black pixels to make it 352.
If you are not sure about where you are getting the error, can you attach the whole error?
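The padding approach described above can be sketched as follows. This is a minimal, illustrative sketch using NumPy (not the helper YOLOv5 itself ships); it pads the bottom/right edges of a frame with black pixels so both sides become multiples of 32 before the frame is handed to the model:

```python
import numpy as np

def pad_to_multiple_of_32(frame):
    """Pad an HxWx3 frame with black pixels so H and W are multiples of 32."""
    h, w = frame.shape[:2]
    pad_h = (-h) % 32  # rows to add to reach the next multiple of 32
    pad_w = (-w) % 32  # columns to add
    # Pad only the bottom and right edges; zeros give black pixels.
    return np.pad(frame, ((0, pad_h), (0, pad_w), (0, 0)))

# Example: a width of 300 is not a multiple of 32, so it gets padded to 320.
frame = np.zeros((288, 300, 3), dtype=np.uint8)
padded = pad_to_multiple_of_32(frame)
print(padded.shape)  # (288, 320, 3)
```

Note that if you pad (rather than resize), the detection boxes are in the padded frame's coordinates, which here match the original frame since padding is only added on the bottom/right.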
Solution 2:[2]
See the solution here: https://www.youtube.com/watch?v=_gQ2Xzld0m4 It worked for me; do exactly the same steps, starting from model=yolov5s.pt.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | stupid_cannon |
| Solution 2 | Citoyen x14 |