Add comments
.gitignore (vendored, new file)
@@ -0,0 +1,34 @@
# .gitignore
# First, ignore everything
*
# But do not ignore directories
!*/
# Ignore these specific directories
ut/
runs/
.vscode/
build/
result1/
result/
mytest/
mytest_double/
pretrained_model/
gangao/
extra/
ccpd/
*.pyc
# Do not ignore the following file types
!*.cpp
!*.h
!*.hpp
!*.c
!.gitignore
!*.py
!*.sh
!*.npy
!*.jpg
!*.pt
!*.pth
!*.png
!*.yaml
!*.md
.idea/.gitignore (vendored, new file)
@@ -0,0 +1,3 @@
# Default ignored files
/shelf/
/workspace.xml
README.md (new file)
@@ -0,0 +1,93 @@
## What's New

**2022.12.04 To detect vehicles and plates together, see [Car_recognition](https://github.com/we0091234/Car_recognition)**

[yolov8 plate detection + recognition](https://github.com/we0091234/yolov8-plate)

[yolov7 plate detection + recognition](https://github.com/we0091234/yolov7_plate)

[Android NCNN](https://github.com/Ayers-github/Chinese-License-Plate-Recognition)

## Contact

The released models are trained on public datasets. For higher-accuracy models or business cooperation, please add

**wechat: we0091234 (state your purpose)**

## **A comprehensive Chinese license plate recognition algorithm supporting 12 plate types**

**Requirements: python >= 3.6, pytorch >= 1.7**

#### **Image demo:**

Run detect_plate.py directly, or use the command line below:

```
python detect_plate.py --detect_model weights/plate_detect.pt --rec_model weights/plate_rec_color.pth --image_path imgs --output result
```

This runs on the imgs folder; results are saved to the result folder.

#### Video demo [2.MP4](https://pan.baidu.com/s/1O1sT8hCEwJZmVScDwBHgOg) extraction code: 41aq

```
python detect_plate.py --detect_model weights/plate_detect.pt --rec_model weights/plate_rec_color.pth --video 2.mp4
```

The input video is 2.mp4; the annotated output is saved as result.mp4.

## **Plate detection training**

Plate detection training is documented here:

[plate detection training](https://github.com/we0091234/Chinese_license_plate_detection_recognition/tree/main/readme)

## **Plate recognition training**

Plate recognition training is documented here:

[plate recognition training](https://github.com/we0091234/crnn_plate_recognition)

#### **Supported plate types:**

- [X] 1. Single-row blue plates
- [X] 2. Single-row yellow plates
- [X] 3. New-energy plates
- [X] 4. White police plates
- [X] 5. Driving-school plates
- [X] 6. Armed-police plates
- [X] 7. Double-row yellow plates
- [X] 8. Double-row white plates
- [X] 9. Embassy plates
- [X] 10. Hong Kong/Macau (粤Z) plates
- [X] 11. Double-row green plates
- [X] 12. Civil aviation plates

![Image ](image/README/test_1.jpg)

## Deployment

1. [Android NCNN](https://github.com/Ayers-github/Chinese-License-Plate-Recognition)

2. **onnx demo** Baidu Netdisk: [k874](https://pan.baidu.com/s/1K3L3xubd6pXIreAydvUm4g)

```
python onnx_infer.py --detect_model weights/plate_detect.onnx --rec_model weights/plate_rec_color.onnx --image_path imgs --output result_onnx
```

3. **tensorrt** deployment: see [tensorrt_plate](https://github.com/we0091234/chinese_plate_tensorrt)

4. **openvino demo** (version 2022.2)

```
python openvino_infer.py --detect_model weights/plate_detect.onnx --rec_model weights/plate_rec.onnx --image_path imgs --output result_openvino
```

## References

* [https://github.com/deepcam-cn/yolov5-face](https://github.com/deepcam-cn/yolov5-face)
* [https://github.com/Sierkinhane/CRNN_Chinese_Characters_Rec](https://github.com/Sierkinhane/CRNN_Chinese_Characters_Rec)

## More

**QQ groups: 871797331 (full), 837982567 (second group) for questions**

![Image ](image/README/105384078.png)
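For programmatic use, here is a minimal sketch built on the helpers defined in detect_demo.py later in this diff (load_model, detect_plate, draw_result); the weight and image paths are placeholders:

```
# Minimal sketch of driving the detector from Python; paths are placeholders.
import cv2
import torch
from detect_demo import load_model, detect_plate, draw_result

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = load_model("weights/plate_detect.pt", device)       # detection weights
img = cv2.imread("imgs/example.jpg")                        # BGR image
dict_list = detect_plate(model, img, device, 640)           # [{'rect', 'landmarks', 'class'}, ...]
cv2.imwrite("result/example.jpg", draw_result(img, dict_list))
```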
ccpd_process.py (new file)
@@ -0,0 +1,196 @@
import os
import shutil
import cv2
import numpy as np

def allFilePath(rootPath, allFIleList):
    # Recursively collect all .jpg files under rootPath.
    fileList = os.listdir(rootPath)
    for temp in fileList:
        if os.path.isfile(os.path.join(rootPath, temp)):
            if temp.endswith(".jpg"):
                allFIleList.append(os.path.join(rootPath, temp))
        else:
            allFilePath(os.path.join(rootPath, temp), allFIleList)

def order_points(pts):
    # Initialize a list of coordinates that will be ordered such that the
    # first entry is the top-left, the second is the top-right, the third
    # is the bottom-right, and the fourth is the bottom-left.
    pts = pts[:4, :]
    rect = np.zeros((5, 2), dtype="float32")  # row 4 is reserved for the optional fifth (center) point below

    # The top-left point has the smallest sum, whereas
    # the bottom-right point has the largest sum.
    s = pts.sum(axis=1)
    rect[0] = pts[np.argmin(s)]
    rect[2] = pts[np.argmax(s)]

    # Now compute the difference between the points: the top-right point has
    # the smallest difference, whereas the bottom-left has the largest.
    diff = np.diff(pts, axis=1)
    rect[1] = pts[np.argmin(diff)]
    rect[3] = pts[np.argmax(diff)]

    # Return the ordered coordinates.
    return rect

def get_partical_ccpd():
    # Move up to 1000 images per CCPD2020 sub-folder into save_Path.
    ccpd_dir = r"/mnt/Gpan/BaiduNetdiskDownload/CCPD1/CCPD2020/ccpd_green"
    save_Path = r"ccpd/green_plate"
    folder_list = os.listdir(ccpd_dir)
    for folder_name in folder_list:
        count = 0
        folder_path = os.path.join(ccpd_dir, folder_name)
        if os.path.isfile(folder_path):
            continue
        if folder_name == "ccpd_fn":
            continue
        name_list = os.listdir(folder_path)

        save_folder = save_Path
        if not os.path.exists(save_folder):
            os.mkdir(save_folder)

        for name in name_list:
            file_path = os.path.join(folder_path, name)
            count += 1
            if count > 1000:
                break
            new_file_path = os.path.join(save_folder, name)
            shutil.move(file_path, new_file_path)
            print(count, new_file_path)

def get_rect_and_landmarks(img_path):
    # Parse the CCPD filename: field 2 is the bounding box, field 3 the four corner points.
    file_name = img_path.split("/")[-1].split("-")
    landmarks_np = np.zeros((5, 2))
    rect = file_name[2].split("_")
    landmarks = file_name[3].split("_")
    rect_str = "&".join(rect)
    landmarks_str = "&".join(landmarks)
    rect = rect_str.split("&")
    landmarks = landmarks_str.split("&")
    rect = [int(x) for x in rect]
    landmarks = [int(x) for x in landmarks]
    for i in range(4):
        landmarks_np[i][0] = landmarks[2 * i]
        landmarks_np[i][1] = landmarks[2 * i + 1]
    # middle_landmark_w = int((landmarks[4] + landmarks[6]) / 2)
    # middle_landmark_h = int((landmarks[5] + landmarks[7]) / 2)
    # landmarks.append(middle_landmark_w)
    # landmarks.append(middle_landmark_h)
    landmarks_np_new = order_points(landmarks_np)
    # landmarks_np_new[4] = np.array([middle_landmark_w, middle_landmark_h])
    return rect, landmarks, landmarks_np_new

def x1x2y1y2_yolo(rect, landmarks, img):
    h, w, c = img.shape
    rect[0] = max(0, rect[0])
    rect[1] = max(0, rect[1])
    rect[2] = min(w - 1, rect[2] - rect[0])
    rect[3] = min(h - 1, rect[3] - rect[1])
    annotation = np.zeros((1, 14))
    annotation[0, 0] = (rect[0] + rect[2] / 2) / w  # cx
    annotation[0, 1] = (rect[1] + rect[3] / 2) / h  # cy
    annotation[0, 2] = rect[2] / w  # w
    annotation[0, 3] = rect[3] / h  # h

    annotation[0, 4] = landmarks[0] / w  # l0_x
    annotation[0, 5] = landmarks[1] / h  # l0_y
    annotation[0, 6] = landmarks[2] / w  # l1_x
    annotation[0, 7] = landmarks[3] / h  # l1_y
    annotation[0, 8] = landmarks[4] / w  # l2_x
    annotation[0, 9] = landmarks[5] / h  # l2_y
    annotation[0, 10] = landmarks[6] / w  # l3_x
    annotation[0, 11] = landmarks[7] / h  # l3_y
    # annotation[0, 12] = landmarks[8] / w  # l4_x
    # annotation[0, 13] = landmarks[9] / h  # l4_y
    return annotation

def xywh2yolo(rect, landmarks_sort, img):
    h, w, c = img.shape
    rect[0] = max(0, rect[0])
    rect[1] = max(0, rect[1])
    rect[2] = min(w - 1, rect[2] - rect[0])
    rect[3] = min(h - 1, rect[3] - rect[1])
    annotation = np.zeros((1, 12))
    annotation[0, 0] = (rect[0] + rect[2] / 2) / w  # cx
    annotation[0, 1] = (rect[1] + rect[3] / 2) / h  # cy
    annotation[0, 2] = rect[2] / w  # w
    annotation[0, 3] = rect[3] / h  # h

    annotation[0, 4] = landmarks_sort[0][0] / w  # l0_x
    annotation[0, 5] = landmarks_sort[0][1] / h  # l0_y
    annotation[0, 6] = landmarks_sort[1][0] / w  # l1_x
    annotation[0, 7] = landmarks_sort[1][1] / h  # l1_y
    annotation[0, 8] = landmarks_sort[2][0] / w  # l2_x
    annotation[0, 9] = landmarks_sort[2][1] / h  # l2_y
    annotation[0, 10] = landmarks_sort[3][0] / w  # l3_x
    annotation[0, 11] = landmarks_sort[3][1] / h  # l3_y
    # annotation[0, 12] = landmarks_sort[4][0] / w  # l4_x
    # annotation[0, 13] = landmarks_sort[4][1] / h  # l4_y
    return annotation

def yolo2x1y1x2y2(annotation, img):
    h, w, c = img.shape
    rect = annotation[:, 0:4].squeeze().tolist()
    landmarks = annotation[:, 4:].squeeze().tolist()
    rect_w = w * rect[2]
    rect_h = h * rect[3]
    rect_x = int(rect[0] * w - rect_w / 2)
    rect_y = int(rect[1] * h - rect_h / 2)
    new_rect = [rect_x, rect_y, rect_x + rect_w, rect_y + rect_h]
    for i in range(5):
        landmarks[2 * i] = landmarks[2 * i] * w
        landmarks[2 * i + 1] = landmarks[2 * i + 1] * h
    return new_rect, landmarks

def write_lable(file_path):
    pass


if __name__ == '__main__':
    file_root = r"ccpd"
    file_list = []
    count = 0
    allFilePath(file_root, file_list)
    for img_path in file_list:
        count += 1
        # img_path = r"ccpd_yolo_test/02-90_85-173&466_452&541-452&553_176&556_178&463_454&460-0_0_6_26_15_26_32-68-53.jpg"
        text_path = img_path.replace(".jpg", ".txt")
        img = cv2.imread(img_path)
        rect, landmarks, landmarks_sort = get_rect_and_landmarks(img_path)
        # annotation = x1x2y1y2_yolo(rect, landmarks, img)
        annotation = xywh2yolo(rect, landmarks_sort, img)
        str_label = "0 "
        for i in range(len(annotation[0])):
            str_label = str_label + " " + str(annotation[0][i])
        str_label = str_label.replace('[', '').replace(']', '')
        str_label = str_label.replace(',', '') + '\n'
        with open(text_path, "w") as f:
            f.write(str_label)
        print(count, img_path)
    # get_partical_ccpd()
    # file_root = r"ccpd/green_plate"
    # file_list = []
    # allFilePath(file_root, file_list)
    # count = 0
    # for img_path in file_list:
    #     img_name = img_path.split(os.sep)[-1]
    #     if not "&" in img_name:
    #         count += 1
    #         os.remove(img_path)
    #         print(count, img_path)

    # new_rect, new_landmarks = yolo2x1y1x2y2(annotation, img)
    # rect = [int(x) for x in new_rect]
    # cv2.rectangle(img, (rect[0], rect[1]), (rect[2], rect[3]), (255, 0, 0), 2)
    # colors = [(0, 255, 0), (0, 255, 255), (255, 255, 0), (255, 255, 255), (255, 0, 255)]  # green, yellow, cyan, white, magenta
    # for i in range(5):
    #     cv2.circle(img, (landmarks[2 * i], landmarks[2 * i + 1]), 2, colors[i], 2)
    # cv2.imwrite("1.jpg", img)
    # print(rect, landmarks)
    # get_partical_ccpd()
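To make the filename convention concrete, here is a small worked example (not part of the script) decoding the CCPD name from the commented-out test path above; it mirrors the parsing in get_rect_and_landmarks:

```
# Decode one CCPD filename: '-' separates fields; field 2 is the bbox
# "x1&y1_x2&y2", field 3 holds the four "x&y" plate corners.
name = "02-90_85-173&466_452&541-452&553_176&556_178&463_454&460-0_0_6_26_15_26_32-68-53.jpg"
fields = name.split(".")[0].split("-")
rect = [int(v) for v in fields[2].replace("&", "_").split("_")]   # [173, 466, 452, 541]
pts = [int(v) for v in fields[3].replace("&", "_").split("_")]    # 8 ints: 4 (x, y) corners
print(rect, list(zip(pts[0::2], pts[1::2])))
```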
data/argoverse_hd.yaml (new file)
@@ -0,0 +1,21 @@
# Argoverse-HD dataset (ring-front-center camera) http://www.cs.cmu.edu/~mengtial/proj/streaming/
# Train command: python train.py --data argoverse_hd.yaml
# Default dataset location is next to /yolov5:
#   /parent_folder
#     /argoverse
#     /yolov5


# download command/URL (optional)
download: bash data/scripts/get_argoverse_hd.sh

# train and val data as 1) directory: path/images/, 2) file: path/images.txt, or 3) list: [path1/images/, path2/images/]
train: ../argoverse/Argoverse-1.1/images/train/  # 39384 images
val: ../argoverse/Argoverse-1.1/images/val/  # 15062 images
test: ../argoverse/Argoverse-1.1/images/test/  # Submit to: https://eval.ai/web/challenges/challenge-page/800/overview

# number of classes
nc: 8

# class names
names: [ 'person', 'bicycle', 'car', 'motorcycle', 'bus', 'truck', 'traffic_light', 'stop_sign' ]
data/coco.yaml (new file)
@@ -0,0 +1,35 @@
# COCO 2017 dataset http://cocodataset.org
# Train command: python train.py --data coco.yaml
# Default dataset location is next to /yolov5:
#   /parent_folder
#     /coco
#     /yolov5


# download command/URL (optional)
download: bash data/scripts/get_coco.sh

# train and val data as 1) directory: path/images/, 2) file: path/images.txt, or 3) list: [path1/images/, path2/images/]
train: ../coco/train2017.txt  # 118287 images
val: ../coco/val2017.txt  # 5000 images
test: ../coco/test-dev2017.txt  # 20288 of 40670 images, submit to https://competitions.codalab.org/competitions/20794

# number of classes
nc: 80

# class names
names: [ 'person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus', 'train', 'truck', 'boat', 'traffic light',
         'fire hydrant', 'stop sign', 'parking meter', 'bench', 'bird', 'cat', 'dog', 'horse', 'sheep', 'cow',
         'elephant', 'bear', 'zebra', 'giraffe', 'backpack', 'umbrella', 'handbag', 'tie', 'suitcase', 'frisbee',
         'skis', 'snowboard', 'sports ball', 'kite', 'baseball bat', 'baseball glove', 'skateboard', 'surfboard',
         'tennis racket', 'bottle', 'wine glass', 'cup', 'fork', 'knife', 'spoon', 'bowl', 'banana', 'apple',
         'sandwich', 'orange', 'broccoli', 'carrot', 'hot dog', 'pizza', 'donut', 'cake', 'chair', 'couch',
         'potted plant', 'bed', 'dining table', 'toilet', 'tv', 'laptop', 'mouse', 'remote', 'keyboard', 'cell phone',
         'microwave', 'oven', 'toaster', 'sink', 'refrigerator', 'book', 'clock', 'vase', 'scissors', 'teddy bear',
         'hair drier', 'toothbrush' ]

# Print classes
# with open('data/coco.yaml') as f:
#     d = yaml.load(f, Loader=yaml.FullLoader)  # dict
#     for i, x in enumerate(d['names']):
#         print(i, x)
data/coco128.yaml (new file)
@@ -0,0 +1,28 @@
# COCO 2017 dataset http://cocodataset.org - first 128 training images
# Train command: python train.py --data coco128.yaml
# Default dataset location is next to /yolov5:
#   /parent_folder
#     /coco128
#     /yolov5


# download command/URL (optional)
download: https://github.com/ultralytics/yolov5/releases/download/v1.0/coco128.zip

# train and val data as 1) directory: path/images/, 2) file: path/images.txt, or 3) list: [path1/images/, path2/images/]
train: ../coco128/images/train2017/  # 128 images
val: ../coco128/images/train2017/  # 128 images

# number of classes
nc: 80

# class names
names: [ 'person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus', 'train', 'truck', 'boat', 'traffic light',
         'fire hydrant', 'stop sign', 'parking meter', 'bench', 'bird', 'cat', 'dog', 'horse', 'sheep', 'cow',
         'elephant', 'bear', 'zebra', 'giraffe', 'backpack', 'umbrella', 'handbag', 'tie', 'suitcase', 'frisbee',
         'skis', 'snowboard', 'sports ball', 'kite', 'baseball bat', 'baseball glove', 'skateboard', 'surfboard',
         'tennis racket', 'bottle', 'wine glass', 'cup', 'fork', 'knife', 'spoon', 'bowl', 'banana', 'apple',
         'sandwich', 'orange', 'broccoli', 'carrot', 'hot dog', 'pizza', 'donut', 'cake', 'chair', 'couch',
         'potted plant', 'bed', 'dining table', 'toilet', 'tv', 'laptop', 'mouse', 'remote', 'keyboard', 'cell phone',
         'microwave', 'oven', 'toaster', 'sink', 'refrigerator', 'book', 'clock', 'vase', 'scissors', 'teddy bear',
         'hair drier', 'toothbrush' ]
data/hyp.finetune.yaml (new file)
@@ -0,0 +1,38 @@
# Hyperparameters for VOC finetuning
# python train.py --batch 64 --weights yolov5m.pt --data voc.yaml --img 512 --epochs 50
# See tutorials for hyperparameter evolution https://github.com/ultralytics/yolov5#tutorials


# Hyperparameter Evolution Results
# Generations: 306
#                    P         R     mAP.5 mAP.5:.95       box       obj       cls
# Metrics:         0.6     0.936     0.896     0.684    0.0115   0.00805   0.00146

lr0: 0.0032
lrf: 0.12
momentum: 0.843
weight_decay: 0.00036
warmup_epochs: 2.0
warmup_momentum: 0.5
warmup_bias_lr: 0.05
box: 0.0296
cls: 0.243
cls_pw: 0.631
obj: 0.301
obj_pw: 0.911
iou_t: 0.2
anchor_t: 2.91
# anchors: 3.63
fl_gamma: 0.0
hsv_h: 0.0138
hsv_s: 0.664
hsv_v: 0.464
degrees: 0.373
translate: 0.245
scale: 0.898
shear: 0.602
perspective: 0.0
flipud: 0.00856
fliplr: 0.5
mosaic: 1.0
mixup: 0.243
data/hyp.scratch.yaml (new file)
@@ -0,0 +1,34 @@
# Hyperparameters for COCO training from scratch
# python train.py --batch 40 --cfg yolov5m.yaml --weights '' --data coco.yaml --img 640 --epochs 300
# See tutorials for hyperparameter evolution https://github.com/ultralytics/yolov5#tutorials


lr0: 0.01  # initial learning rate (SGD=1E-2, Adam=1E-3)
lrf: 0.2  # final OneCycleLR learning rate (lr0 * lrf)
momentum: 0.937  # SGD momentum/Adam beta1
weight_decay: 0.0005  # optimizer weight decay 5e-4
warmup_epochs: 3.0  # warmup epochs (fractions ok)
warmup_momentum: 0.8  # warmup initial momentum
warmup_bias_lr: 0.1  # warmup initial bias lr
box: 0.05  # box loss gain
cls: 0.5  # cls loss gain
landmark: 0.005  # landmark loss gain
cls_pw: 1.0  # cls BCELoss positive_weight
obj: 1.0  # obj loss gain (scale with pixels)
obj_pw: 1.0  # obj BCELoss positive_weight
iou_t: 0.20  # IoU training threshold
anchor_t: 4.0  # anchor-multiple threshold
# anchors: 3  # anchors per output layer (0 to ignore)
fl_gamma: 0.0  # focal loss gamma (efficientDet default gamma=1.5)
hsv_h: 0.015  # image HSV-Hue augmentation (fraction)
hsv_s: 0.7  # image HSV-Saturation augmentation (fraction)
hsv_v: 0.4  # image HSV-Value augmentation (fraction)
degrees: 0.0  # image rotation (+/- deg)
translate: 0.1  # image translation (+/- fraction)
scale: 0.5  # image scale (+/- gain)
shear: 0.5  # image shear (+/- deg)
perspective: 0.0  # image perspective (+/- fraction), range 0-0.001
flipud: 0.0  # image flip up-down (probability)
fliplr: 0.5  # image flip left-right (probability)
mosaic: 0.5  # image mosaic (probability)
mixup: 0.0  # image mixup (probability)
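As a sketch of how a hyperparameter file like the one above is consumed (yolov5-style train.py loads it into a plain dict; this is illustrative, not an excerpt of train.py):

```
# Load the hyperparameters above into a dict; keys map 1:1 to the YAML names.
import yaml

with open("data/hyp.scratch.yaml") as f:
    hyp = yaml.safe_load(f)
print(hyp["lr0"], hyp["box"], hyp["landmark"])  # 0.01 0.05 0.005
```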
data/plateAndCar.yaml (new file)
@@ -0,0 +1,20 @@
# License plate + car detection dataset (header template adapted from voc.yaml)
# Train command: python train.py --data plateAndCar.yaml
# Default dataset location is next to /yolov5:
#   /parent_folder
#     /VOC
#     /yolov5


# download command/URL (optional)
download: bash data/scripts/get_voc.sh

# train and val data as 1) directory: path/images/, 2) file: path/images.txt, or 3) list: [path1/images/, path2/images/]
train: /mnt/Gpan/Mydata/pytorchPorject/datasets/ccpd/train_car_plate/train_detect
val: /mnt/Gpan/Mydata/pytorchPorject/datasets/ccpd/train_car_plate/val_detect
# number of classes
nc: 3

# class names
names: [ 'single_plate', 'double_plate', 'car' ]
data/retinaface2yolo.py (new file)
@@ -0,0 +1,150 @@
import os
import os.path
import sys
import torch
import torch.utils.data as data
import cv2
import numpy as np

class WiderFaceDetection(data.Dataset):
    def __init__(self, txt_path, preproc=None):
        self.preproc = preproc
        self.imgs_path = []
        self.words = []
        f = open(txt_path, 'r')
        lines = f.readlines()
        isFirst = True
        labels = []
        for line in lines:
            line = line.rstrip()
            if line.startswith('#'):
                if isFirst is True:
                    isFirst = False
                else:
                    labels_copy = labels.copy()
                    self.words.append(labels_copy)
                    labels.clear()
                path = line[2:]
                path = txt_path.replace('label.txt', 'images/') + path
                self.imgs_path.append(path)
            else:
                line = line.split(' ')
                label = [float(x) for x in line]
                labels.append(label)

        self.words.append(labels)

    def __len__(self):
        return len(self.imgs_path)

    def __getitem__(self, index):
        img = cv2.imread(self.imgs_path[index])
        height, width, _ = img.shape

        labels = self.words[index]
        annotations = np.zeros((0, 15))
        if len(labels) == 0:
            return annotations
        for idx, label in enumerate(labels):
            annotation = np.zeros((1, 15))
            # bbox
            annotation[0, 0] = label[0]  # x1
            annotation[0, 1] = label[1]  # y1
            annotation[0, 2] = label[0] + label[2]  # x2
            annotation[0, 3] = label[1] + label[3]  # y2

            # landmarks
            annotation[0, 4] = label[4]  # l0_x
            annotation[0, 5] = label[5]  # l0_y
            annotation[0, 6] = label[7]  # l1_x
            annotation[0, 7] = label[8]  # l1_y
            annotation[0, 8] = label[10]  # l2_x
            annotation[0, 9] = label[11]  # l2_y
            annotation[0, 10] = label[13]  # l3_x
            annotation[0, 11] = label[14]  # l3_y
            annotation[0, 12] = label[16]  # l4_x
            annotation[0, 13] = label[17]  # l4_y
            if (annotation[0, 4] < 0):
                annotation[0, 14] = -1
            else:
                annotation[0, 14] = 1

            annotations = np.append(annotations, annotation, axis=0)
        target = np.array(annotations)
        if self.preproc is not None:
            img, target = self.preproc(img, target)

        return torch.from_numpy(img), target

def detection_collate(batch):
    """Custom collate fn for dealing with batches of images that have a different
    number of associated object annotations (bounding boxes).

    Arguments:
        batch: (tuple) A tuple of tensor images and lists of annotations

    Return:
        A tuple containing:
            1) (tensor) batch of images stacked on their 0 dim
            2) (list of tensors) annotations for a given image are stacked on 0 dim
    """
    targets = []
    imgs = []
    for _, sample in enumerate(batch):
        for _, tup in enumerate(sample):
            if torch.is_tensor(tup):
                imgs.append(tup)
            elif isinstance(tup, type(np.empty(0))):
                annos = torch.from_numpy(tup).float()
                targets.append(annos)

    return (torch.stack(imgs, 0), targets)

save_path = '/ssd_1t/derron/yolov5-face/data/widerface/train'
aa = WiderFaceDetection("/ssd_1t/derron/yolov5-face/data/widerface/widerface/train/label.txt")
for i in range(len(aa.imgs_path)):
    print(i, aa.imgs_path[i])
    img = cv2.imread(aa.imgs_path[i])
    base_img = os.path.basename(aa.imgs_path[i])
    base_txt = os.path.basename(aa.imgs_path[i])[:-4] + ".txt"
    save_img_path = os.path.join(save_path, base_img)
    save_txt_path = os.path.join(save_path, base_txt)
    with open(save_txt_path, "w") as f:
        height, width, _ = img.shape
        labels = aa.words[i]
        annotations = np.zeros((0, 14))
        if len(labels) == 0:
            continue
        for idx, label in enumerate(labels):
            annotation = np.zeros((1, 14))
            # bbox
            label[0] = max(0, label[0])
            label[1] = max(0, label[1])
            label[2] = min(width - 1, label[2])
            label[3] = min(height - 1, label[3])
            annotation[0, 0] = (label[0] + label[2] / 2) / width  # cx
            annotation[0, 1] = (label[1] + label[3] / 2) / height  # cy
            annotation[0, 2] = label[2] / width  # w
            annotation[0, 3] = label[3] / height  # h
            # if (label[2] - label[0]) < 8 or (label[3] - label[1]) < 8:
            #     img[int(label[1]):int(label[3]), int(label[0]):int(label[2])] = 127
            #     continue
            # landmarks
            annotation[0, 4] = label[4] / width  # l0_x
            annotation[0, 5] = label[5] / height  # l0_y
            annotation[0, 6] = label[7] / width  # l1_x
            annotation[0, 7] = label[8] / height  # l1_y
            annotation[0, 8] = label[10] / width  # l2_x
            annotation[0, 9] = label[11] / height  # l2_y
            annotation[0, 10] = label[13] / width  # l3_x
            annotation[0, 11] = label[14] / height  # l3_y
            annotation[0, 12] = label[16] / width  # l4_x
            annotation[0, 13] = label[17] / height  # l4_y
            str_label = "0 "
            for k in range(len(annotation[0])):  # renamed from i to avoid shadowing the outer loop variable
                str_label = str_label + " " + str(annotation[0][k])
            str_label = str_label.replace('[', '').replace(']', '')
            str_label = str_label.replace(',', '') + '\n'
            f.write(str_label)
    cv2.imwrite(save_img_path, img)
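A hypothetical DataLoader hookup for the classes above; the label path is a placeholder, and in practice a `preproc` must resize every image to one shape before torch.stack can succeed inside detection_collate:

```
# Sketch: batching WiderFaceDetection with the custom collate_fn.
from torch.utils.data import DataLoader

dataset = WiderFaceDetection("/path/to/widerface/train/label.txt", preproc=None)
loader = DataLoader(dataset, batch_size=4, shuffle=True, collate_fn=detection_collate)
for images, targets in loader:
    # images: stacked image tensor; targets: list of per-image (N_i, 15) tensors
    print(images.shape, [t.shape for t in targets])
    break
```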
data/scripts/get_argoverse_hd.sh (new file)
@@ -0,0 +1,62 @@
#!/bin/bash
# Argoverse-HD dataset (ring-front-center camera) http://www.cs.cmu.edu/~mengtial/proj/streaming/
# Download command: bash data/scripts/get_argoverse_hd.sh
# Train command: python train.py --data argoverse_hd.yaml
# Default dataset location is next to /yolov5:
#   /parent_folder
#     /argoverse
#     /yolov5

# Download/unzip images
d='../argoverse/' # unzip directory
mkdir $d
url=https://argoverse-hd.s3.us-east-2.amazonaws.com/
f=Argoverse-HD-Full.zip
curl -L $url$f -o $f && unzip -q $f -d $d && rm $f & # download, unzip, remove in background
wait # finish background tasks

cd ../argoverse/Argoverse-1.1/
ln -s tracking images

cd ../Argoverse-HD/annotations/

python3 - "$@" <<END
import json
from pathlib import Path

annotation_files = ["train.json", "val.json"]
print("Converting annotations to YOLOv5 format...")

for val in annotation_files:
    a = json.load(open(val, "rb"))

    label_dict = {}
    for annot in a['annotations']:
        img_id = annot['image_id']
        img_name = a['images'][img_id]['name']
        img_label_name = img_name[:-3] + "txt"

        obj_class = annot['category_id']
        x_center, y_center, width, height = annot['bbox']
        x_center = (x_center + width / 2) / 1920.  # offset and scale
        y_center = (y_center + height / 2) / 1200.  # offset and scale
        width /= 1920.  # scale
        height /= 1200.  # scale

        img_dir = "./labels/" + a['seq_dirs'][a['images'][annot['image_id']]['sid']]

        Path(img_dir).mkdir(parents=True, exist_ok=True)

        if img_dir + "/" + img_label_name not in label_dict:
            label_dict[img_dir + "/" + img_label_name] = []

        label_dict[img_dir + "/" + img_label_name].append(f"{obj_class} {x_center} {y_center} {width} {height}\n")

    for filename in label_dict:
        with open(filename, "w") as file:
            for string in label_dict[filename]:
                file.write(string)

END

mv ./labels ../../Argoverse-1.1/
data/scripts/get_coco.sh (new file)
@@ -0,0 +1,27 @@
#!/bin/bash
# COCO 2017 dataset http://cocodataset.org
# Download command: bash data/scripts/get_coco.sh
# Train command: python train.py --data coco.yaml
# Default dataset location is next to /yolov5:
#   /parent_folder
#     /coco
#     /yolov5

# Download/unzip labels
d='../' # unzip directory
url=https://github.com/ultralytics/yolov5/releases/download/v1.0/
f='coco2017labels.zip' # or 'coco2017labels-segments.zip', 68 MB
echo 'Downloading' $url$f ' ...'
curl -L $url$f -o $f && unzip -q $f -d $d && rm $f & # download, unzip, remove in background

# Download/unzip images
d='../coco/images' # unzip directory
url=http://images.cocodataset.org/zips/
f1='train2017.zip' # 19G, 118k images
f2='val2017.zip' # 1G, 5k images
f3='test2017.zip' # 7G, 41k images (optional)
for f in $f1 $f2; do
  echo 'Downloading' $url$f '...'
  curl -L $url$f -o $f && unzip -q $f -d $d && rm $f & # download, unzip, remove in background
done
wait # finish background tasks
data/scripts/get_voc.sh (new file)
@@ -0,0 +1,139 @@
#!/bin/bash
# PASCAL VOC dataset http://host.robots.ox.ac.uk/pascal/VOC/
# Download command: bash data/scripts/get_voc.sh
# Train command: python train.py --data voc.yaml
# Default dataset location is next to /yolov5:
#   /parent_folder
#     /VOC
#     /yolov5

start=$(date +%s)
mkdir -p ../tmp
cd ../tmp/

# Download/unzip images and labels
d='.' # unzip directory
url=https://github.com/ultralytics/yolov5/releases/download/v1.0/
f1=VOCtrainval_06-Nov-2007.zip # 446MB, 5012 images
f2=VOCtest_06-Nov-2007.zip # 438MB, 4953 images
f3=VOCtrainval_11-May-2012.zip # 1.95GB, 17126 images
for f in $f3 $f2 $f1; do
  echo 'Downloading' $url$f '...'
  curl -L $url$f -o $f && unzip -q $f -d $d && rm $f & # download, unzip, remove in background
done
wait # finish background tasks

end=$(date +%s)
runtime=$((end - start))
echo "Completed in" $runtime "seconds"

echo "Splitting dataset..."
python3 - "$@" <<END
import xml.etree.ElementTree as ET
import pickle
import os
from os import listdir, getcwd
from os.path import join

sets = [('2012', 'train'), ('2012', 'val'), ('2007', 'train'), ('2007', 'val'), ('2007', 'test')]

classes = ["aeroplane", "bicycle", "bird", "boat", "bottle", "bus", "car", "cat", "chair", "cow", "diningtable", "dog", "horse", "motorbike", "person", "pottedplant", "sheep", "sofa", "train", "tvmonitor"]


def convert(size, box):
    dw = 1. / (size[0])
    dh = 1. / (size[1])
    x = (box[0] + box[1]) / 2.0 - 1
    y = (box[2] + box[3]) / 2.0 - 1
    w = box[1] - box[0]
    h = box[3] - box[2]
    x = x * dw
    w = w * dw
    y = y * dh
    h = h * dh
    return (x, y, w, h)


def convert_annotation(year, image_id):
    in_file = open('VOCdevkit/VOC%s/Annotations/%s.xml' % (year, image_id))
    out_file = open('VOCdevkit/VOC%s/labels/%s.txt' % (year, image_id), 'w')
    tree = ET.parse(in_file)
    root = tree.getroot()
    size = root.find('size')
    w = int(size.find('width').text)
    h = int(size.find('height').text)

    for obj in root.iter('object'):
        difficult = obj.find('difficult').text
        cls = obj.find('name').text
        if cls not in classes or int(difficult) == 1:
            continue
        cls_id = classes.index(cls)
        xmlbox = obj.find('bndbox')
        b = (float(xmlbox.find('xmin').text), float(xmlbox.find('xmax').text), float(xmlbox.find('ymin').text), float(xmlbox.find('ymax').text))
        bb = convert((w, h), b)
        out_file.write(str(cls_id) + " " + " ".join([str(a) for a in bb]) + '\n')


wd = getcwd()

for year, image_set in sets:
    if not os.path.exists('VOCdevkit/VOC%s/labels/' % (year)):
        os.makedirs('VOCdevkit/VOC%s/labels/' % (year))
    image_ids = open('VOCdevkit/VOC%s/ImageSets/Main/%s.txt' % (year, image_set)).read().strip().split()
    list_file = open('%s_%s.txt' % (year, image_set), 'w')
    for image_id in image_ids:
        list_file.write('%s/VOCdevkit/VOC%s/JPEGImages/%s.jpg\n' % (wd, year, image_id))
        convert_annotation(year, image_id)
    list_file.close()

END

cat 2007_train.txt 2007_val.txt 2012_train.txt 2012_val.txt > train.txt
cat 2007_train.txt 2007_val.txt 2007_test.txt 2012_train.txt 2012_val.txt > train.all.txt

python3 - "$@" <<END

import shutil
import os

os.system('mkdir ../VOC/')
os.system('mkdir ../VOC/images')
os.system('mkdir ../VOC/images/train')
os.system('mkdir ../VOC/images/val')

os.system('mkdir ../VOC/labels')
os.system('mkdir ../VOC/labels/train')
os.system('mkdir ../VOC/labels/val')

print(os.path.exists('../tmp/train.txt'))
f = open('../tmp/train.txt', 'r')
lines = f.readlines()

for line in lines:
    line = "/".join(line.split('/')[-5:]).strip()
    if (os.path.exists("../" + line)):
        os.system("cp ../" + line + " ../VOC/images/train")

    line = line.replace('JPEGImages', 'labels')
    line = line.replace('jpg', 'txt')
    if (os.path.exists("../" + line)):
        os.system("cp ../" + line + " ../VOC/labels/train")


print(os.path.exists('../tmp/2007_test.txt'))
f = open('../tmp/2007_test.txt', 'r')
lines = f.readlines()

for line in lines:
    line = "/".join(line.split('/')[-5:]).strip()
    if (os.path.exists("../" + line)):
        os.system("cp ../" + line + " ../VOC/images/val")

    line = line.replace('JPEGImages', 'labels')
    line = line.replace('jpg', 'txt')
    if (os.path.exists("../" + line)):
        os.system("cp ../" + line + " ../VOC/labels/val")

END

rm -rf ../tmp # remove temporary directory
echo "VOC download done."
data/train2yolo.py (new file)
@@ -0,0 +1,176 @@
import os.path
import sys
import torch
import torch.utils.data as data
import cv2
import numpy as np


class WiderFaceDetection(data.Dataset):
    def __init__(self, txt_path, preproc=None):
        self.preproc = preproc
        self.imgs_path = []
        self.words = []
        f = open(txt_path, 'r')
        lines = f.readlines()
        isFirst = True
        labels = []
        for line in lines:
            line = line.rstrip()
            if line.startswith('#'):
                if isFirst is True:
                    isFirst = False
                else:
                    labels_copy = labels.copy()
                    self.words.append(labels_copy)
                    labels.clear()
                path = line[2:]
                path = txt_path.replace('label.txt', 'images/') + path
                self.imgs_path.append(path)
            else:
                line = line.split(' ')
                label = [float(x) for x in line]
                labels.append(label)

        self.words.append(labels)

    def __len__(self):
        return len(self.imgs_path)

    def __getitem__(self, index):
        img = cv2.imread(self.imgs_path[index])
        height, width, _ = img.shape

        labels = self.words[index]
        annotations = np.zeros((0, 15))
        if len(labels) == 0:
            return annotations
        for idx, label in enumerate(labels):
            annotation = np.zeros((1, 15))
            # bbox
            annotation[0, 0] = label[0]  # x1
            annotation[0, 1] = label[1]  # y1
            annotation[0, 2] = label[0] + label[2]  # x2
            annotation[0, 3] = label[1] + label[3]  # y2

            # landmarks
            annotation[0, 4] = label[4]  # l0_x
            annotation[0, 5] = label[5]  # l0_y
            annotation[0, 6] = label[7]  # l1_x
            annotation[0, 7] = label[8]  # l1_y
            annotation[0, 8] = label[10]  # l2_x
            annotation[0, 9] = label[11]  # l2_y
            annotation[0, 10] = label[13]  # l3_x
            annotation[0, 11] = label[14]  # l3_y
            annotation[0, 12] = label[16]  # l4_x
            annotation[0, 13] = label[17]  # l4_y
            if annotation[0, 4] < 0:
                annotation[0, 14] = -1
            else:
                annotation[0, 14] = 1

            annotations = np.append(annotations, annotation, axis=0)
        target = np.array(annotations)
        if self.preproc is not None:
            img, target = self.preproc(img, target)

        return torch.from_numpy(img), target


def detection_collate(batch):
    """Custom collate fn for dealing with batches of images that have a different
    number of associated object annotations (bounding boxes).

    Arguments:
        batch: (tuple) A tuple of tensor images and lists of annotations

    Return:
        A tuple containing:
            1) (tensor) batch of images stacked on their 0 dim
            2) (list of tensors) annotations for a given image are stacked on 0 dim
    """
    targets = []
    imgs = []
    for _, sample in enumerate(batch):
        for _, tup in enumerate(sample):
            if torch.is_tensor(tup):
                imgs.append(tup)
            elif isinstance(tup, type(np.empty(0))):
                annos = torch.from_numpy(tup).float()
                targets.append(annos)

    return torch.stack(imgs, 0), targets


if __name__ == '__main__':
    if len(sys.argv) == 1:
        print('Missing path to WIDERFACE train folder.')
        print('Run command: python3 train2yolo.py /path/to/original/widerface/train [/path/to/save/widerface/train]')
        exit(1)
    elif len(sys.argv) > 3:
        print('Too many arguments were provided.')
        print('Run command: python3 train2yolo.py /path/to/original/widerface/train [/path/to/save/widerface/train]')
        exit(1)
    original_path = sys.argv[1]

    if len(sys.argv) == 2:
        if not os.path.isdir('widerface'):
            os.mkdir('widerface')
        if not os.path.isdir('widerface/train'):
            os.mkdir('widerface/train')

        save_path = 'widerface/train'
    else:
        save_path = sys.argv[2]

    if not os.path.isfile(os.path.join(original_path, 'label.txt')):
        print('Missing label.txt file.')
        exit(1)

    aa = WiderFaceDetection(os.path.join(original_path, 'label.txt'))

    for i in range(len(aa.imgs_path)):
        print(i, aa.imgs_path[i])
        img = cv2.imread(aa.imgs_path[i])
        base_img = os.path.basename(aa.imgs_path[i])
        base_txt = os.path.basename(aa.imgs_path[i])[:-4] + ".txt"
        save_img_path = os.path.join(save_path, base_img)
        save_txt_path = os.path.join(save_path, base_txt)
        with open(save_txt_path, "w") as f:
            height, width, _ = img.shape
            labels = aa.words[i]
            annotations = np.zeros((0, 14))
            if len(labels) == 0:
                continue
            for idx, label in enumerate(labels):
                annotation = np.zeros((1, 14))
                # bbox
                label[0] = max(0, label[0])
                label[1] = max(0, label[1])
                label[2] = min(width - 1, label[2])
                label[3] = min(height - 1, label[3])
                annotation[0, 0] = (label[0] + label[2] / 2) / width  # cx
                annotation[0, 1] = (label[1] + label[3] / 2) / height  # cy
                annotation[0, 2] = label[2] / width  # w
                annotation[0, 3] = label[3] / height  # h
                # if (label[2] - label[0]) < 8 or (label[3] - label[1]) < 8:
                #     img[int(label[1]):int(label[3]), int(label[0]):int(label[2])] = 127
                #     continue
                # landmarks
                annotation[0, 4] = label[4] / width  # l0_x
                annotation[0, 5] = label[5] / height  # l0_y
                annotation[0, 6] = label[7] / width  # l1_x
                annotation[0, 7] = label[8] / height  # l1_y
                annotation[0, 8] = label[10] / width  # l2_x
                annotation[0, 9] = label[11] / height  # l2_y
                annotation[0, 10] = label[13] / width  # l3_x
                annotation[0, 11] = label[14] / height  # l3_y
                annotation[0, 12] = label[16] / width  # l4_x
                annotation[0, 13] = label[17] / height  # l4_y
                str_label = "0 "
                for k in range(len(annotation[0])):  # renamed from i to avoid shadowing the outer loop variable
                    str_label = str_label + " " + str(annotation[0][k])
                str_label = str_label.replace('[', '').replace(']', '')
                str_label = str_label.replace(',', '') + '\n'
                f.write(str_label)
        cv2.imwrite(save_img_path, img)
data/val2yolo.py (new file)
@@ -0,0 +1,88 @@
import os
import cv2
import numpy as np
import shutil
import sys
from tqdm import tqdm


def xywh2xxyy(box):
    x1 = box[0]
    y1 = box[1]
    x2 = box[0] + box[2]
    y2 = box[1] + box[3]
    return x1, x2, y1, y2


def convert(size, box):
    dw = 1. / (size[0])
    dh = 1. / (size[1])
    x = (box[0] + box[1]) / 2.0 - 1
    y = (box[2] + box[3]) / 2.0 - 1
    w = box[1] - box[0]
    h = box[3] - box[2]
    x = x * dw
    w = w * dw
    y = y * dh
    h = h * dh
    return x, y, w, h


def wider2face(root, phase='val', ignore_small=0):
    data = {}
    with open('{}/{}/label.txt'.format(root, phase), 'r') as f:
        lines = f.readlines()
        for line in tqdm(lines):
            line = line.strip()
            if '#' in line:
                path = '{}/{}/images/{}'.format(root, phase, line.split()[-1])
                img = cv2.imread(path)
                height, width, _ = img.shape
                data[path] = list()
            else:
                box = np.array(line.split()[0:4], dtype=np.float32)  # (x1, y1, w, h)
                if box[2] < ignore_small or box[3] < ignore_small:
                    continue
                box = convert((width, height), xywh2xxyy(box))
                label = '0 {} {} {} {} -1 -1 -1 -1 -1 -1 -1 -1 -1 -1'.format(round(box[0], 4), round(box[1], 4),
                                                                             round(box[2], 4), round(box[3], 4))
                data[path].append(label)
    return data


if __name__ == '__main__':
    if len(sys.argv) == 1:
        print('Missing path to WIDERFACE folder.')
        print('Run command: python3 val2yolo.py /path/to/original/widerface [/path/to/save/widerface/val]')
        exit(1)
    elif len(sys.argv) > 3:
        print('Too many arguments were provided.')
        print('Run command: python3 val2yolo.py /path/to/original/widerface [/path/to/save/widerface/val]')
        exit(1)

    root_path = sys.argv[1]
    if not os.path.isfile(os.path.join(root_path, 'val', 'label.txt')):
        print('Missing label.txt file.')
        exit(1)

    if len(sys.argv) == 2:
        if not os.path.isdir('widerface'):
            os.mkdir('widerface')
        if not os.path.isdir('widerface/val'):
            os.mkdir('widerface/val')

        save_path = 'widerface/val'
    else:
        save_path = sys.argv[2]

    datas = wider2face(root_path, phase='val')
    for idx, data in enumerate(datas.keys()):
        pict_name = os.path.basename(data)
        out_img = f'{save_path}/{idx}.jpg'
        out_txt = f'{save_path}/{idx}.txt'
        shutil.copyfile(data, out_img)
        labels = datas[data]
        f = open(out_txt, 'w')
        for label in labels:
            f.write(label + '\n')
        f.close()
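A quick numeric check of convert() above (the -1 offset is a VOC-era convention kept from the original script):

```
# A 100x200 box with x1=50, x2=150, y1=100, y2=300 in a 1000x600 image,
# assuming convert() from val2yolo.py above is in scope:
print(convert((1000, 600), (50, 150, 100, 300)))
# -> (0.099, 0.33166..., 0.1, 0.33333...), i.e. normalized (cx, cy, w, h)
```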
data/val2yolo_for_test.py (new file)
@@ -0,0 +1,65 @@
import os
import cv2
import numpy as np
import shutil
from tqdm import tqdm

root = '/ssd_1t/derron/WiderFace'


def xywh2xxyy(box):
    x1 = box[0]
    y1 = box[1]
    x2 = box[0] + box[2]
    y2 = box[1] + box[3]
    return (x1, x2, y1, y2)


def convert(size, box):
    dw = 1. / (size[0])
    dh = 1. / (size[1])
    x = (box[0] + box[1]) / 2.0 - 1
    y = (box[2] + box[3]) / 2.0 - 1
    w = box[1] - box[0]
    h = box[3] - box[2]
    x = x * dw
    w = w * dw
    y = y * dh
    h = h * dh
    return (x, y, w, h)


def wider2face(phase='val', ignore_small=0):
    data = {}
    with open('{}/{}/label.txt'.format(root, phase), 'r') as f:
        lines = f.readlines()
        for line in tqdm(lines):
            line = line.strip()
            if '#' in line:
                path = '{}/{}/images/{}'.format(root, phase, os.path.basename(line))
                img = cv2.imread(path)
                height, width, _ = img.shape
                data[path] = list()
            else:
                box = np.array(line.split()[0:4], dtype=np.float32)  # (x1, y1, w, h)
                if box[2] < ignore_small or box[3] < ignore_small:
                    continue
                box = convert((width, height), xywh2xxyy(box))
                label = '0 {} {} {} {} -1 -1 -1 -1 -1 -1 -1 -1 -1 -1'.format(round(box[0], 4), round(box[1], 4),
                                                                             round(box[2], 4), round(box[3], 4))
                data[path].append(label)
    return data


if __name__ == '__main__':
    datas = wider2face('val')
    for idx, data in enumerate(datas.keys()):
        pict_name = os.path.basename(data)
        out_img = 'widerface/val/images/{}'.format(pict_name)
        out_txt = 'widerface/val/labels/{}.txt'.format(os.path.splitext(pict_name)[0])
        shutil.copyfile(data, out_img)
        labels = datas[data]
        f = open(out_txt, 'w')
        for label in labels:
            f.write(label + '\n')
        f.close()
data/voc.yaml (new file)
@@ -0,0 +1,21 @@
# PASCAL VOC dataset http://host.robots.ox.ac.uk/pascal/VOC/
# Train command: python train.py --data voc.yaml
# Default dataset location is next to /yolov5:
#   /parent_folder
#     /VOC
#     /yolov5


# download command/URL (optional)
download: bash data/scripts/get_voc.sh

# train and val data as 1) directory: path/images/, 2) file: path/images.txt, or 3) list: [path1/images/, path2/images/]
train: ../VOC/images/train/  # 16551 images
val: ../VOC/images/val/  # 4952 images

# number of classes
nc: 20

# class names
names: [ 'aeroplane', 'bicycle', 'bird', 'boat', 'bottle', 'bus', 'car', 'cat', 'chair', 'cow', 'diningtable', 'dog',
         'horse', 'motorbike', 'person', 'pottedplant', 'sheep', 'sofa', 'train', 'tvmonitor' ]
data/widerface.yaml (new file)
@@ -0,0 +1,19 @@
# CCPD single/double plate detection dataset (header template adapted from voc.yaml)
# Train command: python train.py --data widerface.yaml
# Default dataset location is next to /yolov5:
#   /parent_folder
#     /VOC
#     /yolov5


# download command/URL (optional)
download: bash data/scripts/get_voc.sh

# train and val data as 1) directory: path/images/, 2) file: path/images.txt, or 3) list: [path1/images/, path2/images/]
train: /mnt/Gpan/Mydata/pytorchPorject/yolov5-face/ccpd/train_detect
val: /mnt/Gpan/Mydata/pytorchPorject/yolov5-face/ccpd/val_detect
# number of classes
nc: 2

# class names
names: [ 'single', 'double' ]
demo.sh (new file)
@@ -0,0 +1,3 @@
python detect_plate_hongkang.py --image_path gangao --img_size 640 --detect_model runs/train/exp32/weights/best.pt --rec_model /mnt/Gpan/Mydata/pytorchPorject/yolov7-face/weights/plate_rec.pth
# runs/train/exp26/weights/last.ptimgs
# /mnt/Gpan/Mydata/pytorchPorject/datasets/ccpd/train_detect/gangao
detect_demo.py (new file)
@@ -0,0 +1,223 @@
# -*- coding: UTF-8 -*-
import argparse
import time
import os
import cv2
import torch
from numpy import random
import copy
import numpy as np
from models.experimental import attempt_load
from utils.datasets import letterbox
from utils.general import check_img_size, non_max_suppression_face, scale_coords

from utils.torch_utils import time_synchronized
from utils.cv_puttext import cv2ImgAddText
from plate_recognition.plate_rec import get_plate_result, allFilePath, cv_imread

from plate_recognition.double_plate_split_merge import get_split_merge

clors = [(255, 0, 0), (0, 255, 0), (0, 0, 255), (255, 255, 0), (0, 255, 255)]

def load_model(weights, device):
    model = attempt_load(weights, map_location=device)  # load FP32 model
    return model


def scale_coords_landmarks(img1_shape, coords, img0_shape, ratio_pad=None):
    # Rescale coords (xyxy) from img1_shape to img0_shape
    if ratio_pad is None:  # calculate from img0_shape
        gain = min(img1_shape[0] / img0_shape[0], img1_shape[1] / img0_shape[1])  # gain = old / new
        pad = (img1_shape[1] - img0_shape[1] * gain) / 2, (img1_shape[0] - img0_shape[0] * gain) / 2  # wh padding
    else:
        gain = ratio_pad[0][0]
        pad = ratio_pad[1]

    coords[:, [0, 2, 4, 6]] -= pad[0]  # x padding
    coords[:, [1, 3, 5, 7]] -= pad[1]  # y padding
    coords[:, :10] /= gain
    # clip_coords(coords, img0_shape)
    coords[:, 0].clamp_(0, img0_shape[1])  # x1
    coords[:, 1].clamp_(0, img0_shape[0])  # y1
    coords[:, 2].clamp_(0, img0_shape[1])  # x2
    coords[:, 3].clamp_(0, img0_shape[0])  # y2
    coords[:, 4].clamp_(0, img0_shape[1])  # x3
    coords[:, 5].clamp_(0, img0_shape[0])  # y3
    coords[:, 6].clamp_(0, img0_shape[1])  # x4
    coords[:, 7].clamp_(0, img0_shape[0])  # y4
    # coords[:, 8].clamp_(0, img0_shape[1])  # x5
    # coords[:, 9].clamp_(0, img0_shape[0])  # y5
    return coords


def get_plate_rec_landmark(img, xyxy, conf, landmarks, class_num, device):
    h, w, c = img.shape
    result_dict = {}
    tl = 1 or round(0.002 * (h + w) / 2) + 1  # line/font thickness

    x1 = int(xyxy[0])
    y1 = int(xyxy[1])
    x2 = int(xyxy[2])
    y2 = int(xyxy[3])
    landmarks_np = np.zeros((4, 2))
    rect = [x1, y1, x2, y2]
    for i in range(4):
        point_x = int(landmarks[2 * i])
        point_y = int(landmarks[2 * i + 1])
        landmarks_np[i] = np.array([point_x, point_y])

    class_label = int(class_num)  # plate type: 0 = single-row plate, 1 = double-row plate
    result_dict['rect'] = rect
    result_dict['landmarks'] = landmarks_np.tolist()
    result_dict['class'] = class_label
    return result_dict


def detect_plate(model, orgimg, device, img_size):
    # Load model
    # img_size = opt_img_size
    conf_thres = 0.3
    iou_thres = 0.5
    dict_list = []
    # orgimg = cv2.imread(image_path)  # BGR
    img0 = copy.deepcopy(orgimg)
    assert orgimg is not None, 'Image Not Found '
    h0, w0 = orgimg.shape[:2]  # orig hw
    r = img_size / max(h0, w0)  # resize image to img_size
    if r != 1:  # always resize down, only resize up if training with augmentation
        interp = cv2.INTER_AREA if r < 1 else cv2.INTER_LINEAR
        img0 = cv2.resize(img0, (int(w0 * r), int(h0 * r)), interpolation=interp)

    imgsz = check_img_size(img_size, s=model.stride.max())  # check img_size

    img = letterbox(img0, new_shape=imgsz)[0]
    # img = process_data(img0)
    # Convert
    img = img[:, :, ::-1].transpose(2, 0, 1).copy()  # BGR to RGB, to 3x416x416

    # Run inference
    t0 = time.time()

    img = torch.from_numpy(img).to(device)
    img = img.float()  # uint8 to fp16/32
    img /= 255.0  # 0 - 255 to 0.0 - 1.0
    if img.ndimension() == 3:
        img = img.unsqueeze(0)

    # Inference
    t1 = time_synchronized()
    pred = model(img)[0]
    t2 = time_synchronized()
    # print(f"infer time is {(t2-t1)*1000} ms")

    # Apply NMS
    pred = non_max_suppression_face(pred, conf_thres, iou_thres)

    # print('img.shape: ', img.shape)
    # print('orgimg.shape: ', orgimg.shape)

    # Process detections
    for i, det in enumerate(pred):  # detections per image
        if len(det):
            # Rescale boxes from img_size to im0 size
            det[:, :4] = scale_coords(img.shape[2:], det[:, :4], orgimg.shape).round()

            # Print results
            for c in det[:, -1].unique():
                n = (det[:, -1] == c).sum()  # detections per class

            det[:, 5:13] = scale_coords_landmarks(img.shape[2:], det[:, 5:13], orgimg.shape).round()

            for j in range(det.size()[0]):
                xyxy = det[j, :4].view(-1).tolist()
                conf = det[j, 4].cpu().numpy()
                landmarks = det[j, 5:13].view(-1).tolist()
                class_num = det[j, 13].cpu().numpy()
                result_dict = get_plate_rec_landmark(orgimg, xyxy, conf, landmarks, class_num, device)
                dict_list.append(result_dict)
    return dict_list
    # cv2.imwrite('result.jpg', orgimg)


def draw_result(orgimg, dict_list):
    result_str = ""
    for result in dict_list:
        rect_area = result['rect']

        x, y, w, h = rect_area[0], rect_area[1], rect_area[2] - rect_area[0], rect_area[3] - rect_area[1]
        padding_w = 0.05 * w
        padding_h = 0.11 * h
        rect_area[0] = max(0, int(x - padding_w))
        rect_area[1] = max(0, int(y - padding_h))
        rect_area[2] = min(orgimg.shape[1], int(rect_area[2] + padding_w))
        rect_area[3] = min(orgimg.shape[0], int(rect_area[3] + padding_h))

        landmarks = result['landmarks']
        label = result['class']
        # result_str += result + " "
        for i in range(4):  # key points
            cv2.circle(orgimg, (int(landmarks[i][0]), int(landmarks[i][1])), 5, clors[i], -1)
        cv2.rectangle(orgimg, (rect_area[0], rect_area[1]), (rect_area[2], rect_area[3]), clors[label], 2)  # draw the box
        cv2.putText(orgimg, str(label), (rect_area[0], rect_area[1]), cv2.FONT_HERSHEY_SIMPLEX, 0.5, clors[label], 2)
        # orgimg = cv2ImgAddText(orgimg, label, rect_area[0] - height_area, rect_area[1] - height_area - 10, (0, 255, 0), height_area)
    # print(result_str)
    return orgimg

if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('--detect_model', nargs='+', type=str, default='weights/plate_detect.pt', help='model.pt path(s)')  # detection model
    parser.add_argument('--image_path', type=str, default=r'D:\Project\ChePai\test\images\val', help='source')
    parser.add_argument('--img_size', type=int, default=640, help='inference size (pixels)')
    parser.add_argument('--output', type=str, default='result1', help='source')
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    # device = torch.device("cpu")
    opt = parser.parse_args()
    print(opt)
    save_path = opt.output
    count = 0
    if not os.path.exists(save_path):
        os.mkdir(save_path)

    detect_model = load_model(opt.detect_model, device)  # initialize the detection model
    time_all = 0
    time_begin = time.time()
    if not os.path.isfile(opt.image_path):  # a directory of images
        file_list = []
        allFilePath(opt.image_path, file_list)
        for img_path in file_list:

            print(count, img_path)
            time_b = time.time()
            img = cv_imread(img_path)

            if img is None:
                continue
            if img.shape[-1] == 4:
                img = cv2.cvtColor(img, cv2.COLOR_BGRA2BGR)
            # detect_one(model, img_path, device)
            dict_list = detect_plate(detect_model, img, device, opt.img_size)
            ori_img = draw_result(img, dict_list)
            img_name = os.path.basename(img_path)
            save_img_path = os.path.join(save_path, img_name)
            time_e = time.time()
            time_gap = time_e - time_b
            if count:
                time_all += time_gap  # skip the first (warm-up) image in the average
            cv2.imwrite(save_img_path, ori_img)
            count += 1
        # summary over the directory run (file_list is defined only in this branch)
        print(f"sumTime time is {time.time() - time_begin} s, average pic time is {time_all / (len(file_list) - 1)}")
    else:  # a single image
        print(count, opt.image_path, end=" ")
        img = cv_imread(opt.image_path)
        if img.shape[-1] == 4:
            img = cv2.cvtColor(img, cv2.COLOR_BGRA2BGR)
        # detect_one(model, img_path, device)
        dict_list = detect_plate(detect_model, img, device, opt.img_size)
        ori_img = draw_result(img, dict_list)
        img_name = os.path.basename(opt.image_path)
        save_img_path = os.path.join(save_path, img_name)
        cv2.imwrite(save_img_path, ori_img)
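To make the inverse letterbox mapping in scale_coords_landmarks concrete, a small worked example of the gain/pad arithmetic:

```
# A 1080x1920 (h x w) image letterboxed to 640x640: gain = min(640/1080, 640/1920) = 1/3,
# horizontal padding 0, vertical padding (640 - 1080/3) / 2 = 140.
# A detected point is mapped back to the original image as (p - pad) / gain.
img0_h, img0_w = 1080, 1920
gain = min(640 / img0_h, 640 / img0_w)   # 1/3
pad_x = (640 - img0_w * gain) / 2        # 0.0
pad_y = (640 - img0_h * gain) / 2        # 140.0
x, y = 320.0, 400.0                      # landmark in letterboxed coordinates
print((x - pad_x) / gain, (y - pad_y) / gain)  # 960.0 780.0
```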
372
detect_plate.py
Normal file
@ -0,0 +1,372 @@
|
||||
# -*- coding: UTF-8 -*-
|
||||
import argparse
|
||||
import time
|
||||
from pathlib import Path
|
||||
import os
|
||||
import cv2
|
||||
import torch
|
||||
import torch.backends.cudnn as cudnn
|
||||
from numpy import random
|
||||
import copy
|
||||
import numpy as np
|
||||
from models.experimental import attempt_load
|
||||
from utils.datasets import letterbox
|
||||
from utils.general import check_img_size, non_max_suppression_face, apply_classifier, scale_coords, xyxy2xywh, \
|
||||
strip_optimizer, set_logging, increment_path
|
||||
from utils.plots import plot_one_box
|
||||
from utils.torch_utils import select_device, load_classifier, time_synchronized
|
||||
from utils.cv_puttext import cv2ImgAddText
|
||||
from plate_recognition.plate_rec import get_plate_result,allFilePath,init_model,cv_imread
|
||||
# from plate_recognition.plate_cls import cv_imread
|
||||
from plate_recognition.double_plate_split_merge import get_split_merge
|
||||
|
||||
clors = [(255,0,0),(0,255,0),(0,0,255),(255,255,0),(0,255,255)]
|
||||
danger=['危','险']
|
||||
def order_points(pts): #四个点按照左上 右上 右下 左下排列
|
||||
rect = np.zeros((4, 2), dtype = "float32")
|
||||
s = pts.sum(axis = 1)
|
||||
rect[0] = pts[np.argmin(s)]
|
||||
rect[2] = pts[np.argmax(s)]
|
||||
diff = np.diff(pts, axis = 1)
|
||||
rect[1] = pts[np.argmin(diff)]
|
||||
rect[3] = pts[np.argmax(diff)]
|
||||
return rect
|
||||
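# How the sum/diff trick orders the corners: for pts = [[0,0],[100,0],[100,50],[0,50]],
# x+y is smallest at the top-left (0) and largest at the bottom-right (150), while
# np.diff gives y-x, smallest at the top-right (-100) and largest at the bottom-left (50),
# so rect comes back as [top-left, top-right, bottom-right, bottom-left].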
|
||||
|
||||
def four_point_transform(image, pts): # perspective-warp the plate region into a rectangular patch
|
||||
# rect = order_points(pts)
|
||||
rect = pts.astype('float32')
|
||||
(tl, tr, br, bl) = rect
|
||||
widthA = np.sqrt(((br[0] - bl[0]) ** 2) + ((br[1] - bl[1]) ** 2))
|
||||
widthB = np.sqrt(((tr[0] - tl[0]) ** 2) + ((tr[1] - tl[1]) ** 2))
|
||||
maxWidth = max(int(widthA), int(widthB))
|
||||
heightA = np.sqrt(((tr[0] - br[0]) ** 2) + ((tr[1] - br[1]) ** 2))
|
||||
heightB = np.sqrt(((tl[0] - bl[0]) ** 2) + ((tl[1] - bl[1]) ** 2))
|
||||
maxHeight = max(int(heightA), int(heightB))
|
||||
dst = np.array([
|
||||
[0, 0],
|
||||
[maxWidth - 1, 0],
|
||||
[maxWidth - 1, maxHeight - 1],
|
||||
[0, maxHeight - 1]], dtype = "float32")
|
||||
M = cv2.getPerspectiveTransform(rect, dst)
|
||||
warped = cv2.warpPerspective(image, M, (maxWidth, maxHeight))
|
||||
return warped
|
||||
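# Minimal usage sketch (the image path and corner values below are illustrative,
# not from the original code): warp a plate given its four corners in
# top-left, top-right, bottom-right, bottom-left order.
#
#   img = cv_imread('imgs/single_blue.jpg')
#   corners = np.array([[120, 200], [320, 210], [318, 270], [118, 260]])
#   plate_patch = four_point_transform(img, corners)
#   cv2.imwrite('plate_patch.jpg', plate_patch)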
|
||||
def load_model(weights, device): # load the detection model
|
||||
model = attempt_load(weights, map_location=device) # load FP32 model
|
||||
return model
|
||||
|
||||
def scale_coords_landmarks(img1_shape, coords, img0_shape, ratio_pad=None): # map landmark coords back to the original image
|
||||
# Rescale coords (xyxy) from img1_shape to img0_shape
|
||||
if ratio_pad is None: # calculate from img0_shape
|
||||
gain = min(img1_shape[0] / img0_shape[0], img1_shape[1] / img0_shape[1]) # gain = old / new
|
||||
pad = (img1_shape[1] - img0_shape[1] * gain) / 2, (img1_shape[0] - img0_shape[0] * gain) / 2 # wh padding
|
||||
else:
|
||||
gain = ratio_pad[0][0]
|
||||
pad = ratio_pad[1]
|
||||
|
||||
coords[:, [0, 2, 4, 6]] -= pad[0] # x padding
|
||||
coords[:, [1, 3, 5, 7]] -= pad[1] # y padding
|
||||
coords[:, :8] /= gain
|
||||
#clip_coords(coords, img0_shape)
|
||||
coords[:, 0].clamp_(0, img0_shape[1]) # x1
|
||||
coords[:, 1].clamp_(0, img0_shape[0]) # y1
|
||||
coords[:, 2].clamp_(0, img0_shape[1]) # x2
|
||||
coords[:, 3].clamp_(0, img0_shape[0]) # y2
|
||||
coords[:, 4].clamp_(0, img0_shape[1]) # x3
|
||||
coords[:, 5].clamp_(0, img0_shape[0]) # y3
|
||||
coords[:, 6].clamp_(0, img0_shape[1]) # x4
|
||||
coords[:, 7].clamp_(0, img0_shape[0]) # y4
|
||||
# coords[:, 8].clamp_(0, img0_shape[1]) # x5
|
||||
# coords[:, 9].clamp_(0, img0_shape[0]) # y5
|
||||
return coords
|
||||
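# Worked example with illustrative numbers: for a 640x640 letterboxed input and a
# 1280x720 original, gain = min(640/720, 640/1280) = 0.5 and
# pad = ((640 - 1280*0.5)/2, (640 - 720*0.5)/2) = (0, 140), so each landmark y is
# shifted up by 140 and both coordinates are divided by 0.5 to land on the original image.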
|
||||
|
||||
def get_plate_rec_landmark(img, xyxy, conf, landmarks, class_num,device,plate_rec_model,is_color=False): # get the plate box and four corner points, then recognize the plate number
|
||||
h,w,c = img.shape
|
||||
Box={}
|
||||
result_dict={}
|
||||
tl = 1 # line/font thickness (the original expression "1 or round(...)" always evaluates to 1)
|
||||
|
||||
x1 = int(xyxy[0])
|
||||
y1 = int(xyxy[1])
|
||||
x2 = int(xyxy[2])
|
||||
y2 = int(xyxy[3])
|
||||
height=y2-y1
|
||||
landmarks_np=np.zeros((4,2))
|
||||
rect=[x1,y1,x2,y2]
|
||||
for i in range(4):
|
||||
point_x = int(landmarks[2 * i])
|
||||
point_y = int(landmarks[2 * i + 1])
|
||||
landmarks_np[i]=np.array([point_x,point_y])
|
||||
|
||||
class_label= int(class_num) # plate type: 0 = single-layer, 1 = double-layer
|
||||
roi_img = four_point_transform(img,landmarks_np) # perspective transform to crop the plate patch
|
||||
if class_label: # double-layer plate: split into two rows, then merge into one line
|
||||
roi_img=get_split_merge(roi_img)
|
||||
if not is_color:
|
||||
plate_number,rec_prob = get_plate_result(roi_img,device,plate_rec_model,is_color=is_color) # recognize the plate patch
|
||||
else:
|
||||
plate_number,rec_prob,plate_color,color_conf=get_plate_result(roi_img,device,plate_rec_model,is_color=is_color)
|
||||
# cv2.imwrite("roi.jpg",roi_img)
|
||||
# result_dict['Score']=conf # detection confidence
|
||||
Box['X']=landmarks_np[0][0].tolist() # plate corner coordinates
|
||||
Box['Y']=landmarks_np[0][1].tolist()
|
||||
Box['Width']=rect[2]-rect[0]
|
||||
Box['Height']=rect[3]-rect[1]
|
||||
Box['label']=plate_number # plate number
|
||||
Box['rect']=rect
|
||||
result_dict['rect'] = rect # plate ROI box
|
||||
result_dict['detect_conf'] = conf # detection confidence
|
||||
result_dict['landmarks'] = landmarks_np.tolist() # plate corner points
|
||||
result_dict['plate_no'] = plate_number # plate number
|
||||
result_dict['rec_conf'] = rec_prob # per-character probabilities
|
||||
result_dict['roi_height'] = roi_img.shape[0] # plate patch height
|
||||
result_dict['plate_color'] = ""
|
||||
if is_color:
|
||||
result_dict['plate_color'] = plate_color # plate color
|
||||
result_dict['color_conf'] = color_conf # color confidence
|
||||
result_dict['plate_type'] = class_label # plate layout: 0 = single-layer, 1 = double-layer
|
||||
score = conf.tolist()
|
||||
return plate_number, score, Box,result_dict
|
||||
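# Shape of the returned values, reconstructed from the assignments above:
#   plate_number : str, the recognized plate text
#   score        : float, detection confidence
#   Box          : dict with 'X', 'Y', 'Width', 'Height', 'label', 'rect'
#   result_dict  : dict with 'rect', 'detect_conf', 'landmarks', 'plate_no', 'rec_conf',
#                  'roi_height', 'plate_color' ('color_conf' only when is_color=True)
#                  and 'plate_type' (0 = single-layer, 1 = double-layer)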
|
||||
|
||||
|
||||
def detect_Recognition_plate(model, orgimg, device,plate_rec_model,img_size,is_color=False): # detect plates and recognize their numbers
|
||||
# Load model
|
||||
# img_size = opt_img_size
|
||||
conf_thres = 0.3 # detection confidence threshold
|
||||
iou_thres = 0.5 # IoU threshold for NMS
|
||||
dict_list=[]
|
||||
result_jpg=[]
|
||||
# orgimg = cv2.imread(image_path) # BGR
|
||||
img0 = copy.deepcopy(orgimg)
|
||||
assert orgimg is not None, 'Image Not Found '
|
||||
h0, w0 = orgimg.shape[:2] # orig hw
|
||||
r = img_size / max(h0, w0) # resize image to img_size
|
||||
if r != 1: # always resize down, only resize up if training with augmentation
|
||||
interp = cv2.INTER_AREA if r < 1 else cv2.INTER_LINEAR
|
||||
img0 = cv2.resize(img0, (int(w0 * r), int(h0 * r)), interpolation=interp)
|
||||
|
||||
imgsz = check_img_size(img_size, s=model.stride.max()) # check img_size
|
||||
|
||||
img = letterbox(img0, new_shape=imgsz)[0] # pre-processing: letterbox the image so both sides are multiples of 32, e.g. 640x640
|
||||
# img =process_data(img0)
|
||||
# Convert
|
||||
img = img[:, :, ::-1].transpose(2, 0, 1).copy() # BGR to RGB, then HWC to CHW
|
||||
|
||||
# Run inference
|
||||
t0 = time.time()
|
||||
|
||||
img = torch.from_numpy(img).to(device)
|
||||
img = img.float() # uint8 to fp16/32
|
||||
img /= 255.0 # 0 - 255 to 0.0 - 1.0
|
||||
if img.ndimension() == 3:
|
||||
img = img.unsqueeze(0)
|
||||
|
||||
# Inference
|
||||
# t1 = time_synchronized()/
|
||||
pred = model(img)[0]
|
||||
# t2=time_synchronized()
|
||||
# print(f"infer time is {(t2-t1)*1000} ms")
|
||||
|
||||
# Apply NMS
|
||||
pred = non_max_suppression_face(pred, conf_thres, iou_thres)
|
||||
result_jpg.insert(0, pred[0].tolist())
|
||||
# print('img.shape: ', img.shape)
|
||||
# print('orgimg.shape: ', orgimg.shape)
|
||||
# Process detections
|
||||
for i, det in enumerate(pred): # detections per image
|
||||
if len(det):
|
||||
# Rescale boxes from img_size to im0 size
|
||||
det[:, :4] = scale_coords(img.shape[2:], det[:, :4], orgimg.shape).round()
|
||||
|
||||
# Print results
|
||||
for c in det[:, -1].unique():
|
||||
n = (det[:, -1] == c).sum() # detections per class
|
||||
|
||||
det[:, 5:13] = scale_coords_landmarks(img.shape[2:], det[:, 5:13], orgimg.shape).round()
|
||||
|
||||
for j in range(det.size()[0]):
|
||||
xyxy = det[j, :4].view(-1).tolist()
|
||||
conf = det[j, 4].cpu().numpy()
|
||||
landmarks = det[j, 5:13].view(-1).tolist()
|
||||
class_num = det[j, 13].cpu().numpy()
|
||||
label,score,Box,result_dict = get_plate_rec_landmark(orgimg, xyxy, conf, landmarks, class_num,device,plate_rec_model,is_color=is_color)
|
||||
dict_list.append(result_dict)
|
||||
result_jpg.append(Box)
|
||||
result_jpg.append(score)
|
||||
result_jpg.append(label)
|
||||
|
||||
return dict_list, result_jpg
|
||||
# cv2.imwrite('result.jpg', orgimg)
|
||||
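# End-to-end sketch mirroring the __main__ block below (weight paths are the repo
# defaults; the image path is an assumption):
#
#   device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
#   det = load_model('weights/plate_detect.pt', device)
#   rec = init_model(device, 'weights/plate_rec_color.pth', is_color=True)
#   image = cv_imread('imgs/single_blue.jpg')
#   plates, _ = detect_Recognition_plate(det, image, device, rec, 640, is_color=True)
#   annotated, text = draw_result(image, plates)
#   cv2.imwrite('result/demo.jpg', annotated)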
|
||||
def draw_result(orgimg,dict_list,is_color=False): # draw the recognition results on the image
|
||||
result_str =""
|
||||
for result in dict_list:
|
||||
rect_area = result['rect']
|
||||
|
||||
x,y,w,h = rect_area[0],rect_area[1],rect_area[2]-rect_area[0],rect_area[3]-rect_area[1]
|
||||
padding_w = 0.05*w
|
||||
padding_h = 0.11*h
|
||||
rect_area[0]=max(0,int(x-padding_w))
|
||||
rect_area[1]=max(0,int(y-padding_h))
|
||||
rect_area[2]=min(orgimg.shape[1],int(rect_area[2]+padding_w))
|
||||
rect_area[3]=min(orgimg.shape[0],int(rect_area[3]+padding_h))
|
||||
|
||||
height_area = result['roi_height']
|
||||
landmarks=result['landmarks']
|
||||
result_p = result['plate_no']
|
||||
if result['plate_type']==0: # single-layer
|
||||
result_p+=" "+result['plate_color']
|
||||
else: # double-layer
|
||||
result_p+=" "+result['plate_color']+"双层"
|
||||
result_str+=result_p+" "
|
||||
for i in range(4): # corner keypoints
|
||||
cv2.circle(orgimg, (int(landmarks[i][0]), int(landmarks[i][1])), 5, clors[i], -1)
|
||||
cv2.rectangle(orgimg,(rect_area[0],rect_area[1]),(rect_area[2],rect_area[3]),(0,0,255),2) # draw the plate box
|
||||
|
||||
labelSize = cv2.getTextSize(result_p,cv2.FONT_HERSHEY_SIMPLEX,0.5,1) # measure the label text size
|
||||
if rect_area[0]+labelSize[0][0]>orgimg.shape[1]: # keep the label inside the image
|
||||
rect_area[0]=int(orgimg.shape[1]-labelSize[0][0])
|
||||
orgimg=cv2.rectangle(orgimg,(rect_area[0],int(rect_area[1]-round(1.6*labelSize[0][1]))),(int(rect_area[0]+round(1.2*labelSize[0][0])),rect_area[1]+labelSize[1]),(255,255,255),cv2.FILLED) # white filled background for the label
|
||||
|
||||
if len(result)>=1:
|
||||
orgimg=cv2ImgAddText(orgimg,result_p,rect_area[0],int(rect_area[1]-round(1.6*labelSize[0][1])),(0,0,0),21)
|
||||
# orgimg=cv2ImgAddText(orgimg,result_p,rect_area[0]-height_area,rect_area[1]-height_area-10,(0,255,0),height_area)
|
||||
|
||||
print(result_str) # print the recognized plates
|
||||
return orgimg, result_str
|
||||
|
||||
|
||||
|
||||
def get_second(capture):
|
||||
if capture.isOpened():
|
||||
rate = capture.get(5) # frame rate (cv2.CAP_PROP_FPS)
|
||||
FrameNumber = capture.get(7) # total frame count (cv2.CAP_PROP_FRAME_COUNT)
|
||||
duration = FrameNumber/rate # total frames / fps gives the duration in seconds (divide by 60 for minutes)
|
||||
return int(rate),int(FrameNumber),int(duration)
|
||||
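# Usage sketch (the file name is an assumption): get_second returns
# (fps, frame count, duration in seconds) for an opened capture.
#
#   cap = cv2.VideoCapture('2.mp4')
#   fps, n_frames, seconds = get_second(cap)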
|
||||
|
||||
if __name__ == '__main__':
|
||||
parser = argparse.ArgumentParser()
|
||||
parser.add_argument('--detect_model', nargs='+', type=str, default='weights/plate_detect.pt', help='model.pt path(s)') # detection model
|
||||
parser.add_argument('--rec_model', type=str, default='weights/plate_rec_color.pth', help='model.pt path(s)') # plate recognition + color model
|
||||
parser.add_argument('--is_color',type=bool,default=True,help='plate color') # whether to predict plate color (note: argparse bool is truthy for any non-empty string)
|
||||
parser.add_argument('--image_path', type=str, default=r'D:\Project\ChePai\test\images\val\20230331163841.jpg', help='source') # image path
|
||||
parser.add_argument('--img_size', type=int, default=640, help='inference size (pixels)') # network input size
|
||||
parser.add_argument('--output', type=str, default='result', help='output folder') # where result images are saved
|
||||
parser.add_argument('--video', type=str, default='', help='video path') # video path
|
||||
device = torch.device("cuda" if torch.cuda.is_available() else "cpu") # use GPU if available, else CPU
|
||||
# device =torch.device("cpu")
|
||||
opt = parser.parse_args()
|
||||
print(opt)
|
||||
save_path = opt.output
|
||||
count=0
|
||||
if not os.path.exists(save_path):
|
||||
os.mkdir(save_path)
|
||||
|
||||
detect_model = load_model(opt.detect_model, device) # initialize the detection model
|
||||
plate_rec_model=init_model(device,opt.rec_model,is_color=opt.is_color) # initialize the recognition model
|
||||
# count parameters
|
||||
total = sum(p.numel() for p in detect_model.parameters())
|
||||
total_1 = sum(p.numel() for p in plate_rec_model.parameters())
|
||||
print("detect params: %.2fM,rec params: %.2fM" % (total/1e6,total_1/1e6))
|
||||
|
||||
# plate_color_model =init_color_model(opt.color_model,device)
|
||||
time_all = 0
|
||||
time_begin=time.time()
|
||||
if not opt.video: # image mode
|
||||
if not os.path.isfile(opt.image_path): # a directory of images
|
||||
file_list=[]
|
||||
allFilePath(opt.image_path,file_list) # collect every image path under the directory into file_list
|
||||
for img_path in file_list: # iterate over the image files
|
||||
|
||||
print(count,img_path,end=" ")
|
||||
time_b = time.time() # start time
|
||||
img =cv_imread(img_path) # read the image with OpenCV
|
||||
|
||||
if img is None:
|
||||
continue
|
||||
if img.shape[-1]==4: # convert 4-channel (BGRA) images to 3 channels
|
||||
img=cv2.cvtColor(img,cv2.COLOR_BGRA2BGR)
|
||||
# detect_one(model,img_path,device)
|
||||
dict_list, _ = detect_Recognition_plate(detect_model, img, device,plate_rec_model,opt.img_size,is_color=opt.is_color) # detect and recognize plates (the function returns two values; the second is unused here)
|
||||
ori_img, str_result=draw_result(img,dict_list) # draw the results on the image
|
||||
print(str_result)
|
||||
img_name = os.path.basename(img_path)
|
||||
save_img_path = os.path.join(save_path,img_name) # output image path
|
||||
time_e=time.time()
|
||||
time_gap = time_e-time_b # per-image time
|
||||
if count:
|
||||
time_all+=time_gap
|
||||
cv2.imwrite(save_img_path,ori_img) # save the annotated image
|
||||
count+=1
|
||||
print(f"total time: {time.time()-time_begin:.2f} s, average time per image: {time_all/max(len(file_list)-1,1):.4f} s") # the first image is excluded as warm-up; max() guards against a single-file folder
|
||||
else: # a single image
|
||||
print(count,opt.image_path,end=" ")
|
||||
img =cv_imread(opt.image_path)
|
||||
if img.shape[-1]==4:
|
||||
img=cv2.cvtColor(img,cv2.COLOR_BGRA2BGR)
|
||||
# detect_one(model,img_path,device)
|
||||
dict_list, result_jpg=detect_Recognition_plate(detect_model, img, device,plate_rec_model,opt.img_size,is_color=opt.is_color)
|
||||
ori_img, result_str = draw_result(img,dict_list) # unpack: draw_result returns (annotated image, plate string)
|
||||
ori_list=ori_img.tolist()
|
||||
result_jpg.insert(0,ori_list)
|
||||
img_name = os.path.basename(opt.image_path)
|
||||
save_img_path = os.path.join(save_path,img_name)
|
||||
cv2.imwrite(save_img_path,ori_img)
|
||||
|
||||
|
||||
else: # video mode
|
||||
video_name = opt.video
|
||||
capture=cv2.VideoCapture(video_name)
|
||||
fourcc = cv2.VideoWriter_fourcc(*'mp4v') # MPEG-4 FourCC (lowercase avoids a FourCC warning)
|
||||
fps = capture.get(cv2.CAP_PROP_FPS) # frames per second
|
||||
width, height = int(capture.get(cv2.CAP_PROP_FRAME_WIDTH)), int(capture.get(cv2.CAP_PROP_FRAME_HEIGHT)) # frame width and height
|
||||
out = cv2.VideoWriter('result.mp4', fourcc, fps, (width, height)) # output video writer
|
||||
frame_count = 0
|
||||
fps_all=0
|
||||
rate,FrameNumber,duration=get_second(capture)
|
||||
if capture.isOpened():
|
||||
while True:
|
||||
t1 = cv2.getTickCount()
|
||||
frame_count+=1
|
||||
print(f"第{frame_count} 帧",end=" ")
|
||||
ret,img=capture.read()
|
||||
if not ret:
|
||||
break
|
||||
# if frame_count%rate==0:
|
||||
img0 = copy.deepcopy(img)
|
||||
dict_list, _ = detect_Recognition_plate(detect_model, img, device,plate_rec_model,opt.img_size,is_color=opt.is_color) # unpack: the function returns (dict_list, result_jpg)
|
||||
ori_img, _ = draw_result(img,dict_list) # unpack: draw_result returns (annotated image, plate string)
|
||||
t2 =cv2.getTickCount()
|
||||
infer_time =(t2-t1)/cv2.getTickFrequency()
|
||||
fps=1.0/infer_time
|
||||
fps_all+=fps
|
||||
str_fps = f'fps:{fps:.4f}'
|
||||
|
||||
cv2.putText(ori_img,str_fps,(20,20),cv2.FONT_HERSHEY_SIMPLEX,1,(0,255,0),2)
|
||||
# cv2.imshow("haha",ori_img)
|
||||
# cv2.waitKey(1)
|
||||
out.write(ori_img)
|
||||
|
||||
# current_time = int(frame_count/FrameNumber*duration)
|
||||
# sec = current_time%60
|
||||
# minute = current_time//60
|
||||
# for result_ in result_list:
|
||||
# plate_no = result_['plate_no']
|
||||
# if not is_car_number(pattern_str,plate_no):
|
||||
# continue
|
||||
# print(f'车牌号:{plate_no},时间:{minute}分{sec}秒')
|
||||
# time_str =f'{minute}分{sec}秒'
|
||||
# writer.writerow({"车牌":plate_no,"时间":time_str})
|
||||
# out.write(ori_img)
|
||||
|
||||
|
||||
else:
|
||||
print("失败")
|
||||
capture.release()
|
||||
out.release()
|
||||
cv2.destroyAllWindows()
|
||||
print(f"all frame is {frame_count},average fps is {fps_all/frame_count} fps")
|
28
detect_test.py
Normal file
@ -0,0 +1,28 @@
|
||||
import torch
|
||||
from ultralytics import YOLO
|
||||
|
||||
# load the pretrained detection model
|
||||
model = YOLO('weights/plate_detect.pt')
|
||||
|
||||
# image path
|
||||
image_path = r'D:\Project\ChePai\test\images\val\20230331163841.jpg'
|
||||
|
||||
# run inference
|
||||
results = model(image_path)
|
||||
|
||||
# parse the results
|
||||
for r in results:
|
||||
boxes = r.boxes # Boxes object holding the detections
|
||||
# bounding-box coordinates (xyxy)
|
||||
box_coordinates = boxes.xyxy.cpu().numpy()
|
||||
# confidence scores
|
||||
confidences = boxes.conf.cpu().numpy()
|
||||
# class labels
|
||||
labels = boxes.cls.cpu().numpy().astype(int)
|
||||
|
||||
# print the detections
|
||||
for i in range(len(box_coordinates)):
|
||||
x1, y1, x2, y2 = box_coordinates[i]
|
||||
confidence = confidences[i]
|
||||
label = model.names[labels[i]]
|
||||
print(f'Object: {label}, Confidence: {confidence:.2f}, Bounding Box: ({x1:.2f}, {y1:.2f}, {x2:.2f}, {y2:.2f})')
|
161
export.py
Normal file
@ -0,0 +1,161 @@
|
||||
"""Exports a YOLOv5 *.pt model to ONNX and TorchScript formats
|
||||
|
||||
Usage:
|
||||
$ export PYTHONPATH="$PWD" && python models/export.py --weights ./weights/yolov5s.pt --img 640 --batch 1
|
||||
"""
|
||||
|
||||
import argparse
|
||||
import sys
|
||||
import time
|
||||
|
||||
sys.path.append('./') # to run '$ python *.py' files in subdirectories
|
||||
|
||||
import torch
|
||||
import torch.nn as nn
|
||||
|
||||
import models
|
||||
from models.experimental import attempt_load
|
||||
from utils.activations import Hardswish, SiLU
|
||||
from utils.general import set_logging, check_img_size
|
||||
import onnx
|
||||
|
||||
if __name__ == '__main__':
|
||||
parser = argparse.ArgumentParser()
|
||||
parser.add_argument('--weights', type=str, default='./yolov5s.pt', help='weights path') # from yolov5/models/
|
||||
parser.add_argument('--img_size', nargs='+', type=int, default=[640, 640], help='image size') # height, width
|
||||
parser.add_argument('--batch_size', type=int, default=1, help='batch size')
|
||||
parser.add_argument('--dynamic', action='store_true', default=False, help='enable dynamic axis in onnx model')
|
||||
parser.add_argument('--onnx2pb', action='store_true', default=False, help='export onnx to pb')
|
||||
parser.add_argument('--onnx_infer', action='store_true', default=True, help='onnx infer test')
|
||||
#=======================TensorRT=================================
|
||||
parser.add_argument('--onnx2trt', action='store_true', default=False, help='export onnx to tensorrt')
|
||||
parser.add_argument('--fp16_trt', action='store_true', default=False, help='fp16 infer')
|
||||
#================================================================
|
||||
opt = parser.parse_args()
|
||||
opt.img_size *= 2 if len(opt.img_size) == 1 else 1 # expand
|
||||
print(opt)
|
||||
set_logging()
|
||||
t = time.time()
|
||||
|
||||
# Load PyTorch model
|
||||
model = attempt_load(opt.weights, map_location=torch.device('cpu')) # load FP32 model
|
||||
delattr(model.model[-1], 'anchor_grid')
|
||||
model.model[-1].anchor_grid=[torch.zeros(1)] * 3 # nl=3 number of detection layers
|
||||
model.model[-1].export_cat = True
|
||||
model.eval()
|
||||
labels = model.names
|
||||
|
||||
# Checks
|
||||
gs = int(max(model.stride)) # grid size (max stride)
|
||||
opt.img_size = [check_img_size(x, gs) for x in opt.img_size] # verify img_size are gs-multiples
|
||||
|
||||
# Input
|
||||
img = torch.zeros(opt.batch_size, 3, *opt.img_size) # image size(1,3,320,192) iDetection
|
||||
|
||||
# Update model
|
||||
for k, m in model.named_modules():
|
||||
m._non_persistent_buffers_set = set() # pytorch 1.6.0 compatibility
|
||||
if isinstance(m, models.common.Conv): # assign export-friendly activations
|
||||
if isinstance(m.act, nn.Hardswish):
|
||||
m.act = Hardswish()
|
||||
elif isinstance(m.act, nn.SiLU):
|
||||
m.act = SiLU()
|
||||
# elif isinstance(m, models.yolo.Detect):
|
||||
# m.forward = m.forward_export # assign forward (optional)
|
||||
if isinstance(m, models.common.ShuffleV2Block):#shufflenet block nn.SiLU
|
||||
for i in range(len(m.branch1)):
|
||||
if isinstance(m.branch1[i], nn.SiLU):
|
||||
m.branch1[i] = SiLU()
|
||||
for i in range(len(m.branch2)):
|
||||
if isinstance(m.branch2[i], nn.SiLU):
|
||||
m.branch2[i] = SiLU()
|
||||
if isinstance(m, models.common.BlazeBlock): # BlazeBlock nn.SiLU
|
||||
if isinstance(m.relu, nn.SiLU):
|
||||
m.relu = SiLU()
|
||||
if isinstance(m, models.common.DoubleBlazeBlock): # DoubleBlazeBlock nn.SiLU
|
||||
if isinstance(m.relu, nn.SiLU):
|
||||
m.relu = SiLU()
|
||||
for i in range(len(m.branch1)):
|
||||
if isinstance(m.branch1[i], nn.SiLU):
|
||||
m.branch1[i] = SiLU()
|
||||
# for i in range(len(m.branch2)):
|
||||
# if isinstance(m.branch2[i], nn.SiLU):
|
||||
# m.branch2[i] = SiLU()
|
||||
y = model(img) # dry run
|
||||
|
||||
# ONNX export
|
||||
print('\nStarting ONNX export with onnx %s...' % onnx.__version__)
|
||||
f = opt.weights.replace('.pt', '.onnx') # filename
|
||||
model.fuse() # only for ONNX
|
||||
input_names=['input']
|
||||
output_names=['output']
|
||||
#tensorrt 7
|
||||
# grid = model.model[-1].anchor_grid
|
||||
# model.model[-1].anchor_grid = [a[..., :1, :1, :] for a in grid]
|
||||
#tensorrt 7
|
||||
|
||||
torch.onnx.export(model, img, f, verbose=False, opset_version=12,
|
||||
input_names=input_names,
|
||||
output_names=output_names,
|
||||
dynamic_axes = {'input': {0: 'batch'},
|
||||
'output': {0: 'batch'}
|
||||
} if opt.dynamic else None)
|
||||
|
||||
# model.model[-1].anchor_grid = grid
|
||||
|
||||
# Checks
|
||||
onnx_model = onnx.load(f) # load onnx model
|
||||
onnx.checker.check_model(onnx_model) # check onnx model
|
||||
print('ONNX export success, saved as %s' % f)
|
||||
# Finish
|
||||
print('\nExport complete (%.2fs). Visualize with https://github.com/lutzroeder/netron.' % (time.time() - t))
|
||||
|
||||
|
||||
# onnx infer
|
||||
if opt.onnx_infer:
|
||||
import onnxruntime
|
||||
import numpy as np
|
||||
providers = ['CPUExecutionProvider']
|
||||
session = onnxruntime.InferenceSession(f, providers=providers)
|
||||
im = img.cpu().numpy().astype(np.float32) # torch to numpy
|
||||
y_onnx = session.run([session.get_outputs()[0].name], {session.get_inputs()[0].name: im})[0]
|
||||
print("pred's shape is ",y_onnx.shape)
|
||||
print("max(|torch_pred - onnx_pred|) =",abs(y.cpu().numpy()-y_onnx).max())
|
||||
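# A stricter check than eyeballing the max difference (optional; the tolerances
# are assumptions, not a project requirement):
#   np.testing.assert_allclose(y.cpu().numpy(), y_onnx, rtol=1e-3, atol=1e-4)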
|
||||
|
||||
# TensorRT export
|
||||
if opt.onnx2trt:
|
||||
from torch2trt.trt_model import ONNX_to_TRT
|
||||
print('\nStarting TensorRT...')
|
||||
ONNX_to_TRT(onnx_model_path=f,trt_engine_path=f.replace('.onnx', '.trt'),fp16_mode=opt.fp16_trt)
|
||||
|
||||
# PB export
|
||||
if opt.onnx2pb:
|
||||
print('download the newest onnx_tf by https://github.com/onnx/onnx-tensorflow/tree/master/onnx_tf')
|
||||
from onnx_tf.backend import prepare
|
||||
import tensorflow as tf
|
||||
|
||||
outpb = f.replace('.onnx', '.pb') # filename
|
||||
# strict=True maybe leads to KeyError: 'pyfunc_0', check: https://github.com/onnx/onnx-tensorflow/issues/167
|
||||
tf_rep = prepare(onnx_model, strict=False) # prepare tf representation
|
||||
tf_rep.export_graph(outpb) # export the model
|
||||
|
||||
out_onnx = tf_rep.run(img) # onnx output
|
||||
|
||||
# check pb
|
||||
with tf.Graph().as_default():
|
||||
graph_def = tf.GraphDef()
|
||||
with open(outpb, "rb") as f:
|
||||
graph_def.ParseFromString(f.read())
|
||||
tf.import_graph_def(graph_def, name="")
|
||||
with tf.Session() as sess:
|
||||
init = tf.global_variables_initializer()
|
||||
input_x = sess.graph.get_tensor_by_name(input_names[0]+':0') # input
|
||||
outputs = []
|
||||
for i in output_names:
|
||||
outputs.append(sess.graph.get_tensor_by_name(i+':0'))
|
||||
out_pb = sess.run(outputs, feed_dict={input_x: img})
|
||||
|
||||
print(f'out_pytorch {y}')
|
||||
print(f'out_onnx {out_onnx}')
|
||||
print(f'out_pb {out_pb}')
|
141
hubconf.py
Normal file
@ -0,0 +1,141 @@
|
||||
"""File for accessing YOLOv5 via PyTorch Hub https://pytorch.org/hub/
|
||||
|
||||
Usage:
|
||||
import torch
|
||||
model = torch.hub.load('ultralytics/yolov5', 'yolov5s', pretrained=True, channels=3, classes=80)
|
||||
"""
|
||||
|
||||
from pathlib import Path
|
||||
|
||||
import torch
|
||||
|
||||
from models.yolo import Model
|
||||
from utils.general import set_logging
|
||||
from utils.google_utils import attempt_download
|
||||
|
||||
dependencies = ['torch', 'yaml']
|
||||
set_logging()
|
||||
|
||||
|
||||
def create(name, pretrained, channels, classes, autoshape):
|
||||
"""Creates a specified YOLOv5 model
|
||||
|
||||
Arguments:
|
||||
name (str): name of model, i.e. 'yolov5s'
|
||||
pretrained (bool): load pretrained weights into the model
|
||||
channels (int): number of input channels
|
||||
classes (int): number of model classes
|
||||
|
||||
Returns:
|
||||
pytorch model
|
||||
"""
|
||||
config = Path(__file__).parent / 'models' / f'{name}.yaml' # model.yaml path
|
||||
try:
|
||||
model = Model(config, channels, classes)
|
||||
if pretrained:
|
||||
fname = f'{name}.pt' # checkpoint filename
|
||||
attempt_download(fname) # download if not found locally
|
||||
ckpt = torch.load(fname, map_location=torch.device('cpu')) # load
|
||||
state_dict = ckpt['model'].float().state_dict() # to FP32
|
||||
state_dict = {k: v for k, v in state_dict.items() if model.state_dict()[k].shape == v.shape} # filter
|
||||
model.load_state_dict(state_dict, strict=False) # load
|
||||
if len(ckpt['model'].names) == classes:
|
||||
model.names = ckpt['model'].names # set class names attribute
|
||||
if autoshape:
|
||||
model = model.autoshape() # for file/URI/PIL/cv2/np inputs and NMS
|
||||
return model
|
||||
|
||||
except Exception as e:
|
||||
help_url = 'https://github.com/ultralytics/yolov5/issues/36'
|
||||
s = 'Cache may be out of date, try force_reload=True. See %s for help.' % help_url
|
||||
raise Exception(s) from e
|
||||
|
||||
|
||||
def yolov5s(pretrained=False, channels=3, classes=80, autoshape=True):
|
||||
"""YOLOv5-small model from https://github.com/ultralytics/yolov5
|
||||
|
||||
Arguments:
|
||||
pretrained (bool): load pretrained weights into the model, default=False
|
||||
channels (int): number of input channels, default=3
|
||||
classes (int): number of model classes, default=80
|
||||
|
||||
Returns:
|
||||
pytorch model
|
||||
"""
|
||||
return create('yolov5s', pretrained, channels, classes, autoshape)
|
||||
|
||||
|
||||
def yolov5m(pretrained=False, channels=3, classes=80, autoshape=True):
|
||||
"""YOLOv5-medium model from https://github.com/ultralytics/yolov5
|
||||
|
||||
Arguments:
|
||||
pretrained (bool): load pretrained weights into the model, default=False
|
||||
channels (int): number of input channels, default=3
|
||||
classes (int): number of model classes, default=80
|
||||
|
||||
Returns:
|
||||
pytorch model
|
||||
"""
|
||||
return create('yolov5m', pretrained, channels, classes, autoshape)
|
||||
|
||||
|
||||
def yolov5l(pretrained=False, channels=3, classes=80, autoshape=True):
|
||||
"""YOLOv5-large model from https://github.com/ultralytics/yolov5
|
||||
|
||||
Arguments:
|
||||
pretrained (bool): load pretrained weights into the model, default=False
|
||||
channels (int): number of input channels, default=3
|
||||
classes (int): number of model classes, default=80
|
||||
|
||||
Returns:
|
||||
pytorch model
|
||||
"""
|
||||
return create('yolov5l', pretrained, channels, classes, autoshape)
|
||||
|
||||
|
||||
def yolov5x(pretrained=False, channels=3, classes=80, autoshape=True):
|
||||
"""YOLOv5-xlarge model from https://github.com/ultralytics/yolov5
|
||||
|
||||
Arguments:
|
||||
pretrained (bool): load pretrained weights into the model, default=False
|
||||
channels (int): number of input channels, default=3
|
||||
classes (int): number of model classes, default=80
|
||||
|
||||
Returns:
|
||||
pytorch model
|
||||
"""
|
||||
return create('yolov5x', pretrained, channels, classes, autoshape)
|
||||
|
||||
|
||||
def custom(path_or_model='path/to/model.pt', autoshape=True):
|
||||
"""YOLOv5-custom model from https://github.com/ultralytics/yolov5
|
||||
|
||||
Arguments (3 options):
|
||||
path_or_model (str): 'path/to/model.pt'
|
||||
path_or_model (dict): torch.load('path/to/model.pt')
|
||||
path_or_model (nn.Module): torch.load('path/to/model.pt')['model']
|
||||
|
||||
Returns:
|
||||
pytorch model
|
||||
"""
|
||||
model = torch.load(path_or_model) if isinstance(path_or_model, str) else path_or_model # load checkpoint
|
||||
if isinstance(model, dict):
|
||||
model = model['model'] # load model
|
||||
|
||||
hub_model = Model(model.yaml).to(next(model.parameters()).device) # create
|
||||
hub_model.load_state_dict(model.float().state_dict()) # load state_dict
|
||||
hub_model.names = model.names # class names
|
||||
return hub_model.autoshape() if autoshape else hub_model
|
||||
|
||||
|
||||
if __name__ == '__main__':
|
||||
model = create(name='yolov5s', pretrained=True, channels=3, classes=80, autoshape=True) # pretrained example
|
||||
# model = custom(path_or_model='path/to/model.pt') # custom example
|
||||
|
||||
# Verify inference
|
||||
from PIL import Image
|
||||
|
||||
imgs = [Image.open(x) for x in Path('data/images').glob('*.jpg')]
|
||||
results = model(imgs)
|
||||
results.show()
|
||||
results.print()
|
BIN
image/README/1.png
Normal file
After Width: | Height: | Size: 26 KiB |
BIN
image/README/105384078.png
Normal file
After Width: | Height: | Size: 18 KiB |
BIN
image/README/test_1.jpg
Normal file
After Width: | Height: | Size: 1.0 MiB |
BIN
image/README/weixian.png
Normal file
After Width: | Height: | Size: 960 KiB |
BIN
imgs/Quicker_20220930_180856.png
Normal file
After Width: | Height: | Size: 1.4 MiB |
BIN
imgs/Quicker_20220930_180919.png
Normal file
After Width: | Height: | Size: 1.0 MiB |
BIN
imgs/Quicker_20220930_180938.png
Normal file
After Width: | Height: | Size: 241 KiB |
BIN
imgs/Quicker_20220930_181044.png
Normal file
After Width: | Height: | Size: 328 KiB |
BIN
imgs/double_yellow.jpg
Normal file
After Width: | Height: | Size: 29 KiB |
BIN
imgs/hongkang1.jpg
Normal file
After Width: | Height: | Size: 571 KiB |
BIN
imgs/moto.png
Normal file
After Width: | Height: | Size: 400 KiB |
BIN
imgs/police.jpg
Normal file
After Width: | Height: | Size: 382 KiB |
BIN
imgs/shi_lin_guan.jpg
Normal file
After Width: | Height: | Size: 47 KiB |
BIN
imgs/single_blue.jpg
Normal file
After Width: | Height: | Size: 1.8 MiB |
BIN
imgs/single_green.jpg
Normal file
After Width: | Height: | Size: 903 KiB |
BIN
imgs/single_yellow.jpg
Normal file
After Width: | Height: | Size: 85 KiB |
BIN
imgs/tmp8F1F.png
Normal file
After Width: | Height: | Size: 932 KiB |
BIN
imgs/tmpA5E3.png
Normal file
After Width: | Height: | Size: 513 KiB |
BIN
imgs/xue.jpg
Normal file
After Width: | Height: | Size: 999 KiB |
121
json2yolo.py
Normal file
@ -0,0 +1,121 @@
|
||||
import json
|
||||
import os
|
||||
import numpy as np
|
||||
from copy import deepcopy
|
||||
import cv2
|
||||
|
||||
def allFilePath(rootPath,allFIleList):
|
||||
fileList = os.listdir(rootPath)
|
||||
for temp in fileList:
|
||||
if os.path.isfile(os.path.join(rootPath,temp)):
|
||||
allFIleList.append(os.path.join(rootPath,temp))
|
||||
else:
|
||||
allFilePath(os.path.join(rootPath,temp),allFIleList)
|
||||
|
||||
def xywh2yolo(rect,landmarks_sort,img):
|
||||
h,w,c =img.shape
|
||||
rect[0] = max(0, rect[0]) # clamp x1 to the image
|
||||
rect[1] = max(0, rect[1]) # clamp y1 to the image
|
||||
rect[2] = min(w - 1, rect[2]-rect[0]) # x2 -> box width
|
||||
rect[3] = min(h - 1, rect[3]-rect[1]) # y2 -> box height
|
||||
annotation = np.zeros((1, 12))
|
||||
annotation[0, 0] = (rect[0] + rect[2] / 2) / w # cx
|
||||
annotation[0, 1] = (rect[1] + rect[3] / 2) / h # cy
|
||||
annotation[0, 2] = rect[2] / w # w
|
||||
annotation[0, 3] = rect[3] / h # h
|
||||
|
||||
annotation[0, 4] = landmarks_sort[0][0] / w # l0_x
|
||||
annotation[0, 5] = landmarks_sort[0][1] / h # l0_y
|
||||
annotation[0, 6] = landmarks_sort[1][0] / w # l1_x
|
||||
annotation[0, 7] = landmarks_sort[1][1] / h # l1_y
|
||||
annotation[0, 8] = landmarks_sort[2][0] / w # l2_x
|
||||
annotation[0, 9] = landmarks_sort[2][1] / h # l2_y
|
||||
annotation[0, 10] = landmarks_sort[3][0] / w # l3_x
|
||||
annotation[0, 11] = landmarks_sort[3][1] / h # l3_y
|
||||
# annotation[0, 12] = (landmarks_sort[0][0]+landmarks_sort[1][0])/2 / w # l4_x
|
||||
# annotation[0, 13] = (landmarks_sort[0][1]+landmarks_sort[1][1])/2 / h # l4_y
|
||||
return annotation
|
||||
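# Resulting YOLO-style label line, one row per plate, all values normalized to [0, 1]:
#   class cx cy w h  x1 y1  x2 y2  x3 y3  x4 y4
# where (x_i, y_i) are the four plate corners in the order stored in landmarks_sort.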
|
||||
def order_points(pts):
|
||||
rect = np.zeros((4, 2), dtype = "float32")
|
||||
s = pts.sum(axis = 1)
|
||||
rect[0] = pts[np.argmin(s)]
|
||||
rect[2] = pts[np.argmax(s)]
|
||||
diff = np.diff(pts, axis = 1)
|
||||
rect[1] = pts[np.argmin(diff)]
|
||||
rect[3] = pts[np.argmax(diff)]
|
||||
|
||||
# return the ordered coordinates
|
||||
return rect
|
||||
|
||||
def four_point_transform(image, pts):
|
||||
rect = order_points(pts)
|
||||
(tl, tr, br, bl) = rect
|
||||
widthA = np.sqrt(((br[0] - bl[0]) ** 2) + ((br[1] - bl[1]) ** 2))
|
||||
widthB = np.sqrt(((tr[0] - tl[0]) ** 2) + ((tr[1] - tl[1]) ** 2))
|
||||
maxWidth = max(int(widthA), int(widthB))
|
||||
heightA = np.sqrt(((tr[0] - br[0]) ** 2) + ((tr[1] - br[1]) ** 2))
|
||||
heightB = np.sqrt(((tl[0] - bl[0]) ** 2) + ((tl[1] - bl[1]) ** 2))
|
||||
maxHeight = max(int(heightA), int(heightB))
|
||||
dst = np.array([
|
||||
[0, 0],
|
||||
[maxWidth - 1, 0],
|
||||
[maxWidth - 1, maxHeight - 1],
|
||||
[0, maxHeight - 1]], dtype = "float32")
|
||||
M = cv2.getPerspectiveTransform(rect, dst)
|
||||
warped = cv2.warpPerspective(image, M, (maxWidth, maxHeight))
|
||||
|
||||
# return the warped image
|
||||
return warped
|
||||
|
||||
if __name__ == "__main__":
|
||||
pic_file_list = []
|
||||
pic_file = r"/mnt/Gpan/Mydata/pytorchPorject/datasets/ccpd/train_bisai/train_bisai"
|
||||
save_small_path = "small"
|
||||
label_file = ['0','1']
|
||||
allFilePath(pic_file,pic_file_list)
|
||||
count=0
|
||||
index = 0
|
||||
for pic_ in pic_file_list:
|
||||
if not pic_.endswith(".jpg"):
|
||||
continue
|
||||
count+=1
|
||||
img = cv2.imread(pic_)
|
||||
img_name = os.path.basename(pic_)
|
||||
txt_name = img_name.replace(".jpg",".txt")
|
||||
txt_path = os.path.join(pic_file,txt_name)
|
||||
json_file_ = pic_.replace(".jpg",".json")
|
||||
if not os.path.exists(json_file_):
|
||||
continue
|
||||
with open(json_file_, 'r',encoding='utf-8') as a:
|
||||
data_dict = json.load(a)
|
||||
# print(data_dict['shapes'])
|
||||
with open(txt_path,"w") as f:
|
||||
for data_message in data_dict['shapes']:
|
||||
index+=1
|
||||
label=data_message['label']
|
||||
points = data_message['points']
|
||||
pts = np.array(points)
|
||||
# pts=order_points(pts)
|
||||
# new_img = four_point_transform(img,pts)
|
||||
roi_img_name = label+"_"+str(index)+".jpg"
|
||||
save_path=os.path.join(save_small_path,roi_img_name)
|
||||
# cv2.imwrite(save_path,new_img)
|
||||
x_max,y_max = np.max(pts,axis=0)
|
||||
x_min,y_min = np.min(pts,axis=0)
|
||||
rect = [x_min,y_min,x_max,y_max]
|
||||
rect1=deepcopy(rect)
|
||||
annotation=xywh2yolo(rect1,pts,img)
|
||||
print(data_message)
|
||||
label = data_message['label']
|
||||
str_label = label_file.index(label)
|
||||
# str_label = "0 "
|
||||
str_label = str(str_label)+" "
|
||||
for i in range(len(annotation[0])):
|
||||
str_label = str_label + " " + str(annotation[0][i])
|
||||
str_label = str_label.replace('[', '').replace(']', '')
|
||||
str_label = str_label.replace(',', '') + '\n'
|
||||
|
||||
f.write(str_label)
|
||||
print(count,img_name)
|
||||
# point=data_message[points]
|
349
main.py
Normal file
@ -0,0 +1,349 @@
|
||||
# -*- coding: UTF-8 -*-
|
||||
from flask import Flask, request, jsonify
|
||||
from PIL import Image
|
||||
import io
|
||||
import base64
|
||||
import time
|
||||
from pathlib import Path
|
||||
import os
|
||||
import cv2
|
||||
import torch
|
||||
import torch.backends.cudnn as cudnn
|
||||
from numpy import random
|
||||
import copy
|
||||
import numpy as np
|
||||
from models.experimental import attempt_load
|
||||
from utils.datasets import letterbox
|
||||
from utils.general import check_img_size, non_max_suppression_face, apply_classifier, scale_coords, xyxy2xywh, \
|
||||
strip_optimizer, set_logging, increment_path
|
||||
from utils.plots import plot_one_box
|
||||
from utils.torch_utils import select_device, load_classifier, time_synchronized
|
||||
from utils.cv_puttext import cv2ImgAddText
|
||||
from plate_recognition.plate_rec import get_plate_result, allFilePath, init_model, cv_imread
|
||||
# from plate_recognition.plate_cls import cv_imread
|
||||
from plate_recognition.double_plate_split_merge import get_split_merge
|
||||
|
||||
|
||||
clors = [(255, 0, 0), (0, 255, 0), (0, 0, 255), (255, 255, 0), (0, 255, 255)]
|
||||
danger = ['危', '险']
|
||||
|
||||
|
||||
def order_points(pts):  # order the four corners as top-left, top-right, bottom-right, bottom-left
|
||||
rect = np.zeros((4, 2), dtype="float32")
|
||||
s = pts.sum(axis=1)
|
||||
rect[0] = pts[np.argmin(s)]
|
||||
rect[2] = pts[np.argmax(s)]
|
||||
diff = np.diff(pts, axis=1)
|
||||
rect[1] = pts[np.argmin(diff)]
|
||||
rect[3] = pts[np.argmax(diff)]
|
||||
return rect
|
||||
|
||||
|
||||
def four_point_transform(image, pts):  # perspective-warp the plate region into a rectangular patch
|
||||
# rect = order_points(pts)
|
||||
rect = pts.astype('float32')
|
||||
(tl, tr, br, bl) = rect
|
||||
widthA = np.sqrt(((br[0] - bl[0]) ** 2) + ((br[1] - bl[1]) ** 2))
|
||||
widthB = np.sqrt(((tr[0] - tl[0]) ** 2) + ((tr[1] - tl[1]) ** 2))
|
||||
maxWidth = max(int(widthA), int(widthB))
|
||||
heightA = np.sqrt(((tr[0] - br[0]) ** 2) + ((tr[1] - br[1]) ** 2))
|
||||
heightB = np.sqrt(((tl[0] - bl[0]) ** 2) + ((tl[1] - bl[1]) ** 2))
|
||||
maxHeight = max(int(heightA), int(heightB))
|
||||
dst = np.array([
|
||||
[0, 0],
|
||||
[maxWidth - 1, 0],
|
||||
[maxWidth - 1, maxHeight - 1],
|
||||
[0, maxHeight - 1]], dtype="float32")
|
||||
M = cv2.getPerspectiveTransform(rect, dst)
|
||||
warped = cv2.warpPerspective(image, M, (maxWidth, maxHeight))
|
||||
return warped
|
||||
|
||||
|
||||
def load_model(weights, device):  # load the detection model
|
||||
model = attempt_load(weights, map_location=device) # load FP32 model
|
||||
return model
|
||||
|
||||
|
||||
def scale_coords_landmarks(img1_shape, coords, img0_shape, ratio_pad=None):  # map landmark coords back to the original image
|
||||
# Rescale coords (xyxy) from img1_shape to img0_shape
|
||||
if ratio_pad is None: # calculate from img0_shape
|
||||
gain = min(img1_shape[0] / img0_shape[0], img1_shape[1] / img0_shape[1]) # gain = old / new
|
||||
pad = (img1_shape[1] - img0_shape[1] * gain) / 2, (img1_shape[0] - img0_shape[0] * gain) / 2 # wh padding
|
||||
else:
|
||||
gain = ratio_pad[0][0]
|
||||
pad = ratio_pad[1]
|
||||
|
||||
coords[:, [0, 2, 4, 6]] -= pad[0] # x padding
|
||||
coords[:, [1, 3, 5, 7]] -= pad[1] # y padding
|
||||
coords[:, :8] /= gain
|
||||
# clip_coords(coords, img0_shape)
|
||||
coords[:, 0].clamp_(0, img0_shape[1]) # x1
|
||||
coords[:, 1].clamp_(0, img0_shape[0]) # y1
|
||||
coords[:, 2].clamp_(0, img0_shape[1]) # x2
|
||||
coords[:, 3].clamp_(0, img0_shape[0]) # y2
|
||||
coords[:, 4].clamp_(0, img0_shape[1]) # x3
|
||||
coords[:, 5].clamp_(0, img0_shape[0]) # y3
|
||||
coords[:, 6].clamp_(0, img0_shape[1]) # x4
|
||||
coords[:, 7].clamp_(0, img0_shape[0]) # y4
|
||||
# coords[:, 8].clamp_(0, img0_shape[1]) # x5
|
||||
# coords[:, 9].clamp_(0, img0_shape[0]) # y5
|
||||
return coords
|
||||
|
||||
|
||||
def get_plate_rec_landmark(img, xyxy, conf, landmarks, class_num, device, plate_rec_model,
|
||||
is_color=False):  # get the plate box and four corner points, then recognize the plate number
|
||||
h, w, c = img.shape
|
||||
Box = {}
|
||||
result_dict = {}
|
||||
tl = 1  # line/font thickness (the original expression "1 or round(...)" always evaluates to 1)
|
||||
|
||||
x1 = int(xyxy[0])
|
||||
y1 = int(xyxy[1])
|
||||
x2 = int(xyxy[2])
|
||||
y2 = int(xyxy[3])
|
||||
height = y2 - y1
|
||||
landmarks_np = np.zeros((4, 2))
|
||||
rect = [x1, y1, x2, y2]
|
||||
for i in range(4):
|
||||
point_x = int(landmarks[2 * i])
|
||||
point_y = int(landmarks[2 * i + 1])
|
||||
landmarks_np[i] = np.array([point_x, point_y])
|
||||
|
||||
class_label = int(class_num)  # plate type: 0 = single-layer, 1 = double-layer
|
||||
roi_img = four_point_transform(img, landmarks_np)  # perspective transform to crop the plate patch
|
||||
if class_label:  # double-layer plate: split into two rows, then merge into one line
|
||||
roi_img = get_split_merge(roi_img)
|
||||
if not is_color:
|
||||
plate_number, rec_prob = get_plate_result(roi_img, device, plate_rec_model, is_color=is_color)  # recognize the plate patch
|
||||
else:
|
||||
plate_number, rec_prob, plate_color, color_conf = get_plate_result(roi_img, device, plate_rec_model,
|
||||
is_color=is_color)
|
||||
Box['X'] = landmarks_np[0][0].tolist()  # plate corner coordinates
|
||||
Box['Y'] = landmarks_np[0][1].tolist()
|
||||
Box['Width'] = rect[2] - rect[0]
|
||||
Box['Height'] = rect[3] - rect[1]
|
||||
# Box['label'] = plate_number  # plate number
|
||||
# Box['rect'] = rect
|
||||
result_dict['rect'] = rect  # plate ROI box
|
||||
result_dict['detect_conf'] = conf  # detection confidence
|
||||
result_dict['landmarks'] = landmarks_np.tolist()  # plate corner points
|
||||
result_dict['plate_no'] = plate_number  # plate number
|
||||
result_dict['rec_conf'] = rec_prob  # per-character probabilities
|
||||
result_dict['roi_height'] = roi_img.shape[0]  # plate patch height
|
||||
result_dict['plate_color'] = ""
|
||||
if is_color:
|
||||
result_dict['plate_color'] = plate_color  # plate color
|
||||
result_dict['color_conf'] = color_conf  # color confidence
|
||||
result_dict['plate_type'] = class_label  # plate layout: 0 = single-layer, 1 = double-layer
|
||||
score = conf.tolist()
|
||||
return plate_number, score, Box, result_dict
|
||||
|
||||
|
||||
def detect_Recognition_plate(model, orgimg, device, plate_rec_model, img_size, is_color=False):  # detect plates and recognize their numbers
|
||||
# Load model
|
||||
# img_size = opt_img_size
|
||||
conf_thres = 0.3  # detection confidence threshold
|
||||
iou_thres = 0.5  # IoU threshold for NMS
|
||||
dict_list = []
|
||||
result_jpg = []
|
||||
# orgimg = cv2.imread(image_path) # BGR
|
||||
img0 = copy.deepcopy(orgimg)
|
||||
assert orgimg is not None, 'Image Not Found '
|
||||
h0, w0 = orgimg.shape[:2] # orig hw
|
||||
r = img_size / max(h0, w0) # resize image to img_size
|
||||
if r != 1: # always resize down, only resize up if training with augmentation
|
||||
interp = cv2.INTER_AREA if r < 1 else cv2.INTER_LINEAR
|
||||
img0 = cv2.resize(img0, (int(w0 * r), int(h0 * r)), interpolation=interp)
|
||||
|
||||
imgsz = check_img_size(img_size, s=model.stride.max()) # check img_size
|
||||
|
||||
img = letterbox(img0, new_shape=imgsz)[0]  # pre-processing: letterbox the image so both sides are multiples of 32, e.g. 640x640
|
||||
# img =process_data(img0)
|
||||
# Convert
|
||||
img = img[:, :, ::-1].transpose(2, 0, 1).copy()  # BGR to RGB, then HWC to CHW
|
||||
|
||||
# Run inference
|
||||
t0 = time.time()
|
||||
|
||||
img = torch.from_numpy(img).to(device)
|
||||
img = img.float() # uint8 to fp16/32
|
||||
img /= 255.0 # 0 - 255 to 0.0 - 1.0
|
||||
if img.ndimension() == 3:
|
||||
img = img.unsqueeze(0)
|
||||
|
||||
# Inference
|
||||
pred = model(img)[0]
|
||||
# Apply NMS
|
||||
pred = non_max_suppression_face(pred, conf_thres, iou_thres)
|
||||
# result_jpg.insert(0, pred)
|
||||
# Process detections
|
||||
for i, det in enumerate(pred): # detections per image
|
||||
if len(det):
|
||||
# Rescale boxes from img_size to im0 size
|
||||
det[:, :4] = scale_coords(img.shape[2:], det[:, :4], orgimg.shape).round()
|
||||
|
||||
# Print results
|
||||
for c in det[:, -1].unique():
|
||||
n = (det[:, -1] == c).sum() # detections per class
|
||||
|
||||
det[:, 5:13] = scale_coords_landmarks(img.shape[2:], det[:, 5:13], orgimg.shape).round()
|
||||
|
||||
for j in range(det.size()[0]):
|
||||
xyxy = det[j, :4].view(-1).tolist()
|
||||
conf = det[j, 4].cpu().numpy()
|
||||
landmarks = det[j, 5:13].view(-1).tolist()
|
||||
class_num = det[j, 13].cpu().numpy()
|
||||
label, score, Box, result_dict = get_plate_rec_landmark(orgimg, xyxy, conf, landmarks, class_num,
|
||||
device, plate_rec_model, is_color=is_color)
|
||||
dict_list.append(result_dict)
|
||||
result_jpg.append(Box)
|
||||
result_jpg.append(score)
|
||||
result_jpg.append(label)
|
||||
return dict_list, result_jpg
|
||||
# cv2.imwrite('result.jpg', orgimg)
|
||||
|
||||
|
||||
def draw_result(orgimg, dict_list, is_color=False):  # draw the recognition results on the image
|
||||
result_str = ""
|
||||
for result in dict_list:
|
||||
rect_area = result['rect']
|
||||
|
||||
x, y, w, h = rect_area[0], rect_area[1], rect_area[2] - rect_area[0], rect_area[3] - rect_area[1]
|
||||
padding_w = 0.05 * w
|
||||
padding_h = 0.11 * h
|
||||
rect_area[0] = max(0, int(x - padding_w))
|
||||
rect_area[1] = max(0, int(y - padding_h))
|
||||
rect_area[2] = min(orgimg.shape[1], int(rect_area[2] + padding_w))
|
||||
rect_area[3] = min(orgimg.shape[0], int(rect_area[3] + padding_h))
|
||||
|
||||
height_area = result['roi_height']
|
||||
landmarks = result['landmarks']
|
||||
result_p = result['plate_no']
|
||||
if result['plate_type'] == 0:  # single-layer
|
||||
result_p += " " + result['plate_color']
|
||||
else:  # double-layer
|
||||
result_p += " " + result['plate_color'] + "双层"
|
||||
result_str += result_p + " "
|
||||
for i in range(4):  # corner keypoints
|
||||
cv2.circle(orgimg, (int(landmarks[i][0]), int(landmarks[i][1])), 5, clors[i], -1)
|
||||
cv2.rectangle(orgimg, (rect_area[0], rect_area[1]), (rect_area[2], rect_area[3]), (0, 0, 255), 2)  # draw the plate box
|
||||
|
||||
labelSize = cv2.getTextSize(result_p, cv2.FONT_HERSHEY_SIMPLEX, 0.5, 1)  # measure the label text size
|
||||
if rect_area[0] + labelSize[0][0] > orgimg.shape[1]:  # keep the label inside the image
|
||||
rect_area[0] = int(orgimg.shape[1] - labelSize[0][0])
|
||||
orgimg = cv2.rectangle(orgimg, (rect_area[0], int(rect_area[1] - round(1.6 * labelSize[0][1]))),
|
||||
(int(rect_area[0] + round(1.2 * labelSize[0][0])), rect_area[1] + labelSize[1]),
|
||||
(255, 255, 255), cv2.FILLED)  # white filled background for the label
|
||||
|
||||
if len(result) >= 1:
|
||||
orgimg = cv2ImgAddText(orgimg, result_p, rect_area[0], int(rect_area[1] - round(1.6 * labelSize[0][1])),
|
||||
(0, 0, 0), 21)
|
||||
# orgimg=cv2ImgAddText(orgimg,result_p,rect_area[0]-height_area,rect_area[1]-height_area-10,(0,255,0),height_area)
|
||||
|
||||
print(result_str)  # print the recognized plates
|
||||
return orgimg, result_str
|
||||
|
||||
|
||||
def get_second(capture):
|
||||
if capture.isOpened():
|
||||
rate = capture.get(5)  # frame rate (cv2.CAP_PROP_FPS)
|
||||
FrameNumber = capture.get(7)  # total frame count (cv2.CAP_PROP_FRAME_COUNT)
|
||||
duration = FrameNumber / rate  # total frames / fps gives the duration in seconds (divide by 60 for minutes)
|
||||
return int(rate), int(FrameNumber), int(duration)
|
||||
|
||||
def process_images(detect_model_path, rec_model_path, is_color, img, img_size, output, video_path):
|
||||
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
|
||||
|
||||
# create the output folder
|
||||
save_path = output
|
||||
if not os.path.exists(save_path):
|
||||
os.mkdir(save_path)
|
||||
|
||||
# load the models (note: re-loaded on every call; see the note after this function)
|
||||
detect_model = load_model(detect_model_path, device)
|
||||
plate_rec_model = init_model(device, rec_model_path, is_color=is_color)
|
||||
|
||||
# img = cv_imread(image_path)
|
||||
if img.shape[-1] == 4:
|
||||
img = cv2.cvtColor(img, cv2.COLOR_BGRA2BGR)
|
||||
dict_list, result_jpg = detect_Recognition_plate(detect_model, img, device, plate_rec_model, img_size,
|
||||
is_color=is_color)
|
||||
# ori_img = draw_result(img, dict_list)
|
||||
# ori_list=ori_img[0].tolist()
|
||||
# result_jpg.insert(0,ori_list)
|
||||
result_jpg.insert(0, [[1, 0, 0], [0, 1, 0], [0, 0, 1]])
|
||||
return result_jpg
|
||||
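# Note: process_images reloads both models on every call. For a long-running Flask
# service it is usually cheaper to load them once at import time and reuse them;
# a minimal sketch (module-level globals are an assumption, not the original design):
#
#   DEVICE = torch.device("cuda" if torch.cuda.is_available() else "cpu")
#   DETECT_MODEL = load_model('weights/plate_detect.pt', DEVICE)
#   PLATE_REC_MODEL = init_model(DEVICE, 'weights/plate_rec_color.pth', is_color=True)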
|
||||
|
||||
app = Flask(__name__)
|
||||
def base64_to_image(base64_str):
|
||||
# strip the data-URI header, if present
|
||||
base64_str = base64_str.split(",")[-1]
|
||||
# decode the base64 string
|
||||
image_data = base64.b64decode(base64_str)
|
||||
# convert to a numpy buffer
|
||||
nparr = np.frombuffer(image_data, np.uint8)
|
||||
# decode into an OpenCV BGR image
|
||||
image = cv2.imdecode(nparr, cv2.IMREAD_COLOR)
|
||||
return image
|
||||
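# Client-side sketch for the /upload route (the file name and server address are
# assumptions):
#
#   import base64, requests
#   with open('imgs/single_blue.jpg', 'rb') as fh:
#       b64 = base64.b64encode(fh.read()).decode()
#   resp = requests.post('http://127.0.0.1:5000/upload', json={'image': b64})
#   print(resp.json())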
|
||||
@app.route('/upload', methods=['POST'])
|
||||
def upload_image():
|
||||
try:
|
||||
# get the base64-encoded image data from the request
|
||||
data = request.json
|
||||
# print(data)
|
||||
base64_str = data.get('image')
|
||||
# print(base64_str)
|
||||
if not base64_str:
|
||||
return jsonify({'error': 'No image data provided'}), 400
|
||||
|
||||
# convert the base64 string into an image
|
||||
image = base64_to_image(base64_str)
|
||||
|
||||
result_jpg = process_images(
|
||||
detect_model_path='weights/plate_detect.pt',
|
||||
rec_model_path='weights/plate_rec_color.pth',
|
||||
is_color=True,
|
||||
img=image,
|
||||
img_size=640,
|
||||
output='result',
|
||||
video_path=''  # left empty when processing images
|
||||
)
|
||||
|
||||
# collect the results in a list
|
||||
results = []
|
||||
|
||||
# append the registration matrix (identity placeholder)
|
||||
register_matrix = [
|
||||
[1, 0, 0],
|
||||
[0, 1, 0],
|
||||
[0, 0, 1]
|
||||
]
|
||||
results.append({"RegisterMatrix": register_matrix})
|
||||
|
||||
# append the detection results
|
||||
for i in range(1, len(result_jpg), 3):
|
||||
box, score, label = result_jpg[i:i + 3]
|
||||
box_data = box
|
||||
|
||||
detection_result = {
|
||||
"Box": box_data,
|
||||
"Score": score,
|
||||
"label": label
|
||||
}
|
||||
results.append(detection_result)
|
||||
# print(detection_result)
|
||||
# return the processed results
|
||||
return jsonify({"result.jpg": results})
|
||||
except Exception as e:
|
||||
# log the exception to aid diagnosis
|
||||
print(f"Caught an exception: {type(e).__name__}: {str(e)}")
|
||||
return jsonify({"error_msg": "Content processing is incorrect",
|
||||
"error_code": "AIS.0404"})
|
||||
|
||||
|
||||
|
||||
|
||||
if __name__ == '__main__':
|
||||
app.run(debug=True)
|
0
models/__init__.py
Normal file
33
models/blazeface.yaml
Normal file
@ -0,0 +1,33 @@
|
||||
# parameters
|
||||
nc: 1 # number of classes
|
||||
depth_multiple: 1.0 # model depth multiple
|
||||
width_multiple: 1.0 # layer channel multiple
|
||||
|
||||
# anchors
|
||||
anchors:
|
||||
- [5,6, 10,13, 21,26] # P3/8
|
||||
- [55,72, 225,304, 438,553] # P4/16
|
||||
|
||||
# YOLOv5 backbone
|
||||
backbone:
|
||||
# [from, number, module, args]
|
||||
[[-1, 1, Conv, [24, 3, 2]], # 0-P1/2
|
||||
[-1, 2, BlazeBlock, [24]], # 1
|
||||
[-1, 1, BlazeBlock, [48, None, 2]], # 2-P2/4
|
||||
[-1, 2, BlazeBlock, [48]], # 3
|
||||
[-1, 1, DoubleBlazeBlock, [96, 24, 2]], # 4-P3/8
|
||||
[-1, 2, DoubleBlazeBlock, [96, 24]], # 5
|
||||
[-1, 1, DoubleBlazeBlock, [96, 24, 2]], # 6-P4/16
|
||||
[-1, 2, DoubleBlazeBlock, [96, 24]], # 7
|
||||
]
|
||||
|
||||
|
||||
# YOLOv5 head
|
||||
head:
|
||||
[[-1, 1, Conv, [64, 1, 1]], # 8 (P4/32-large)
|
||||
[-1, 1, nn.Upsample, [None, 2, 'nearest']],
|
||||
[[-1, 5], 1, Concat, [1]], # cat backbone P3
|
||||
[-1, 1, Conv, [64, 1, 1]], # 11 (P3/8-medium)
|
||||
|
||||
[[11, 8], 1, Detect, [nc, anchors]], # Detect(P3, P4)
|
||||
]
|
38
models/blazeface_fpn.yaml
Normal file
@ -0,0 +1,38 @@
|
||||
# parameters
|
||||
nc: 1 # number of classes
|
||||
depth_multiple: 1.0 # model depth multiple
|
||||
width_multiple: 1.0 # layer channel multiple
|
||||
|
||||
# anchors
|
||||
anchors:
|
||||
- [5,6, 10,13, 21,26] # P3/8
|
||||
- [55,72, 225,304, 438,553] # P4/16
|
||||
|
||||
# YOLOv5 backbone
|
||||
backbone:
|
||||
# [from, number, module, args]
|
||||
[[-1, 1, Conv, [24, 3, 2]], # 0-P1/2
|
||||
[-1, 2, BlazeBlock, [24]], # 1
|
||||
[-1, 1, BlazeBlock, [48, None, 2]], # 2-P2/4
|
||||
[-1, 2, BlazeBlock, [48]], # 3
|
||||
[-1, 1, DoubleBlazeBlock, [96, 24, 2]], # 4-P3/8
|
||||
[-1, 2, DoubleBlazeBlock, [96, 24]], # 5
|
||||
[-1, 1, DoubleBlazeBlock, [96, 24, 2]], # 6-P4/16
|
||||
[-1, 2, DoubleBlazeBlock, [96, 24]], # 7
|
||||
]
|
||||
|
||||
|
||||
# YOLOv5 head
|
||||
head:
|
||||
[[-1, 1, Conv, [48, 1, 1]], # 8
|
||||
[-1, 1, nn.Upsample, [None, 2, 'nearest']],
|
||||
[[-1, 5], 1, Concat, [1]], # cat backbone P3
|
||||
[-1, 1, Conv, [48, 1, 1]], # 11 (P3/8-medium)
|
||||
|
||||
[-1, 1, nn.MaxPool2d, [3, 2, 1]], # 12
|
||||
[[-1, 7], 1, Concat, [1]], # cat backbone P3
|
||||
[-1, 1, Conv, [48, 1, 1]], # 14 (P4/16-large)
|
||||
|
||||
[[11, 14], 1, Detect, [nc, anchors]], # Detect(P3, P4)
|
||||
]
|
||||
|
456
models/common.py
Normal file
@ -0,0 +1,456 @@
|
||||
# This file contains modules common to various models
|
||||
|
||||
import math
|
||||
|
||||
import numpy as np
|
||||
import requests
|
||||
import torch
|
||||
import torch.nn as nn
|
||||
from PIL import Image, ImageDraw
|
||||
|
||||
from utils.datasets import letterbox
|
||||
from utils.general import non_max_suppression, make_divisible, scale_coords, xyxy2xywh
|
||||
from utils.plots import color_list
|
||||
|
||||
def autopad(k, p=None): # kernel, padding
|
||||
# Pad to 'same'
|
||||
if p is None:
|
||||
p = k // 2 if isinstance(k, int) else [x // 2 for x in k] # auto-pad
|
||||
return p
|
||||
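# e.g. autopad(3) -> 1 and autopad(5) -> 2, so odd kernels keep the spatial size at stride 1.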
|
||||
def channel_shuffle(x, groups):
|
||||
batchsize, num_channels, height, width = x.data.size()
|
||||
channels_per_group = num_channels // groups
|
||||
|
||||
# reshape
|
||||
x = x.view(batchsize, groups, channels_per_group, height, width)
|
||||
x = torch.transpose(x, 1, 2).contiguous()
|
||||
|
||||
# flatten
|
||||
x = x.view(batchsize, -1, height, width)
|
||||
return x
|
||||
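# Shape walkthrough for groups=2: (N, C, H, W) -> (N, 2, C/2, H, W) -> swap the group
# and channel dims -> flatten back to (N, C, H, W), interleaving channels across groups.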
|
||||
def DWConv(c1, c2, k=1, s=1, act=True):
|
||||
# Depthwise convolution
|
||||
return Conv(c1, c2, k, s, g=math.gcd(c1, c2), act=act)
|
||||
|
||||
class Conv(nn.Module):
|
||||
# Standard convolution
|
||||
def __init__(self, c1, c2, k=1, s=1, p=None, g=1, act=True): # ch_in, ch_out, kernel, stride, padding, groups
|
||||
super(Conv, self).__init__()
|
||||
self.conv = nn.Conv2d(c1, c2, k, s, autopad(k, p), groups=g, bias=False)
|
||||
self.bn = nn.BatchNorm2d(c2)
|
||||
self.act = nn.SiLU() if act is True else (act if isinstance(act, nn.Module) else nn.Identity())
|
||||
# self.act = nn.LeakyReLU(0.1, inplace=True) if act is True else (act if isinstance(act, nn.Module) else nn.Identity())
|
||||
|
||||
def forward(self, x):
|
||||
return self.act(self.bn(self.conv(x)))
|
||||
|
||||
def fuseforward(self, x):
|
||||
return self.act(self.conv(x))
|
||||
|
||||
class StemBlock(nn.Module):
|
||||
def __init__(self, c1, c2, k=3, s=2, p=None, g=1, act=True):
|
||||
super(StemBlock, self).__init__()
|
||||
self.stem_1 = Conv(c1, c2, k, s, p, g, act)
|
||||
self.stem_2a = Conv(c2, c2 // 2, 1, 1, 0)
|
||||
self.stem_2b = Conv(c2 // 2, c2, 3, 2, 1)
|
||||
self.stem_2p = nn.MaxPool2d(kernel_size=2,stride=2,ceil_mode=True)
|
||||
self.stem_3 = Conv(c2 * 2, c2, 1, 1, 0)
|
||||
|
||||
def forward(self, x):
|
||||
stem_1_out = self.stem_1(x)
|
||||
stem_2a_out = self.stem_2a(stem_1_out)
|
||||
stem_2b_out = self.stem_2b(stem_2a_out)
|
||||
stem_2p_out = self.stem_2p(stem_1_out)
|
||||
out = self.stem_3(torch.cat((stem_2b_out,stem_2p_out),1))
|
||||
return out
|
||||
|
||||
class Bottleneck(nn.Module):
|
||||
# Standard bottleneck
|
||||
def __init__(self, c1, c2, shortcut=True, g=1, e=0.5): # ch_in, ch_out, shortcut, groups, expansion
|
||||
super(Bottleneck, self).__init__()
|
||||
c_ = int(c2 * e) # hidden channels
|
||||
        self.cv1 = Conv(c1, c_, 1, 1)
        self.cv2 = Conv(c_, c2, 3, 1, g=g)
        self.add = shortcut and c1 == c2

    def forward(self, x):
        return x + self.cv2(self.cv1(x)) if self.add else self.cv2(self.cv1(x))


class BottleneckCSP(nn.Module):
    # CSP Bottleneck https://github.com/WongKinYiu/CrossStagePartialNetworks
    def __init__(self, c1, c2, n=1, shortcut=True, g=1, e=0.5):  # ch_in, ch_out, number, shortcut, groups, expansion
        super(BottleneckCSP, self).__init__()
        c_ = int(c2 * e)  # hidden channels
        self.cv1 = Conv(c1, c_, 1, 1)
        self.cv2 = nn.Conv2d(c1, c_, 1, 1, bias=False)
        self.cv3 = nn.Conv2d(c_, c_, 1, 1, bias=False)
        self.cv4 = Conv(2 * c_, c2, 1, 1)
        self.bn = nn.BatchNorm2d(2 * c_)  # applied to cat(cv2, cv3)
        self.act = nn.LeakyReLU(0.1, inplace=True)
        self.m = nn.Sequential(*[Bottleneck(c_, c_, shortcut, g, e=1.0) for _ in range(n)])

    def forward(self, x):
        y1 = self.cv3(self.m(self.cv1(x)))
        y2 = self.cv2(x)
        return self.cv4(self.act(self.bn(torch.cat((y1, y2), dim=1))))


class C3(nn.Module):
    # CSP Bottleneck with 3 convolutions
    def __init__(self, c1, c2, n=1, shortcut=True, g=1, e=0.5):  # ch_in, ch_out, number, shortcut, groups, expansion
        super(C3, self).__init__()
        c_ = int(c2 * e)  # hidden channels
        self.cv1 = Conv(c1, c_, 1, 1)
        self.cv2 = Conv(c1, c_, 1, 1)
        self.cv3 = Conv(2 * c_, c2, 1)  # act=FReLU(c2)
        self.m = nn.Sequential(*[Bottleneck(c_, c_, shortcut, g, e=1.0) for _ in range(n)])

    def forward(self, x):
        return self.cv3(torch.cat((self.m(self.cv1(x)), self.cv2(x)), dim=1))


class ShuffleV2Block(nn.Module):
    def __init__(self, inp, oup, stride):
        super(ShuffleV2Block, self).__init__()

        if not (1 <= stride <= 3):
            raise ValueError('illegal stride value')
        self.stride = stride

        branch_features = oup // 2
        assert (self.stride != 1) or (inp == branch_features << 1)

        if self.stride > 1:
            self.branch1 = nn.Sequential(
                self.depthwise_conv(inp, inp, kernel_size=3, stride=self.stride, padding=1),
                nn.BatchNorm2d(inp),
                nn.Conv2d(inp, branch_features, kernel_size=1, stride=1, padding=0, bias=False),
                nn.BatchNorm2d(branch_features),
                nn.SiLU(),
            )
        else:
            self.branch1 = nn.Sequential()

        self.branch2 = nn.Sequential(
            nn.Conv2d(inp if (self.stride > 1) else branch_features, branch_features, kernel_size=1, stride=1, padding=0, bias=False),
            nn.BatchNorm2d(branch_features),
            nn.SiLU(),
            self.depthwise_conv(branch_features, branch_features, kernel_size=3, stride=self.stride, padding=1),
            nn.BatchNorm2d(branch_features),
            nn.Conv2d(branch_features, branch_features, kernel_size=1, stride=1, padding=0, bias=False),
            nn.BatchNorm2d(branch_features),
            nn.SiLU(),
        )

    @staticmethod
    def depthwise_conv(i, o, kernel_size, stride=1, padding=0, bias=False):
        return nn.Conv2d(i, o, kernel_size, stride, padding, bias=bias, groups=i)

    def forward(self, x):
        if self.stride == 1:
            x1, x2 = x.chunk(2, dim=1)
            out = torch.cat((x1, self.branch2(x2)), dim=1)
        else:
            out = torch.cat((self.branch1(x), self.branch2(x)), dim=1)
        out = channel_shuffle(out, 2)
        return out


class BlazeBlock(nn.Module):
    def __init__(self, in_channels, out_channels, mid_channels=None, stride=1):
        super(BlazeBlock, self).__init__()
        mid_channels = mid_channels or in_channels
        assert stride in [1, 2]
        if stride > 1:
            self.use_pool = True
        else:
            self.use_pool = False

        self.branch1 = nn.Sequential(
            nn.Conv2d(in_channels=in_channels, out_channels=mid_channels, kernel_size=5, stride=stride, padding=2, groups=in_channels),
            nn.BatchNorm2d(mid_channels),
            nn.Conv2d(in_channels=mid_channels, out_channels=out_channels, kernel_size=1, stride=1),
            nn.BatchNorm2d(out_channels),
        )

        if self.use_pool:
            self.shortcut = nn.Sequential(
                nn.MaxPool2d(kernel_size=stride, stride=stride),
                nn.Conv2d(in_channels=in_channels, out_channels=out_channels, kernel_size=1, stride=1),
                nn.BatchNorm2d(out_channels),
            )

        self.relu = nn.SiLU(inplace=True)

    def forward(self, x):
        branch1 = self.branch1(x)
        out = (branch1 + self.shortcut(x)) if self.use_pool else (branch1 + x)
        return self.relu(out)


class DoubleBlazeBlock(nn.Module):
    def __init__(self, in_channels, out_channels, mid_channels=None, stride=1):
        super(DoubleBlazeBlock, self).__init__()
        mid_channels = mid_channels or in_channels
        assert stride in [1, 2]
        if stride > 1:
            self.use_pool = True
        else:
            self.use_pool = False

        self.branch1 = nn.Sequential(
            nn.Conv2d(in_channels=in_channels, out_channels=in_channels, kernel_size=5, stride=stride, padding=2, groups=in_channels),
            nn.BatchNorm2d(in_channels),
            nn.Conv2d(in_channels=in_channels, out_channels=mid_channels, kernel_size=1, stride=1),
            nn.BatchNorm2d(mid_channels),
            nn.SiLU(inplace=True),
            nn.Conv2d(in_channels=mid_channels, out_channels=mid_channels, kernel_size=5, stride=1, padding=2),
            nn.BatchNorm2d(mid_channels),
            nn.Conv2d(in_channels=mid_channels, out_channels=out_channels, kernel_size=1, stride=1),
            nn.BatchNorm2d(out_channels),
        )

        if self.use_pool:
            self.shortcut = nn.Sequential(
                nn.MaxPool2d(kernel_size=stride, stride=stride),
                nn.Conv2d(in_channels=in_channels, out_channels=out_channels, kernel_size=1, stride=1),
                nn.BatchNorm2d(out_channels),
            )

        self.relu = nn.SiLU(inplace=True)

    def forward(self, x):
        branch1 = self.branch1(x)
        out = (branch1 + self.shortcut(x)) if self.use_pool else (branch1 + x)
        return self.relu(out)


class SPP(nn.Module):
    # Spatial pyramid pooling layer used in YOLOv3-SPP
    def __init__(self, c1, c2, k=(5, 9, 13)):
        super(SPP, self).__init__()
        c_ = c1 // 2  # hidden channels
        self.cv1 = Conv(c1, c_, 1, 1)
        self.cv2 = Conv(c_ * (len(k) + 1), c2, 1, 1)
        self.m = nn.ModuleList([nn.MaxPool2d(kernel_size=x, stride=1, padding=x // 2) for x in k])

    def forward(self, x):
        x = self.cv1(x)
        return self.cv2(torch.cat([x] + [m(x) for m in self.m], 1))


class SPPF(nn.Module):
    # Spatial Pyramid Pooling - Fast (SPPF) layer for YOLOv5 by Glenn Jocher
    def __init__(self, c1, c2, k=5):  # equivalent to SPP(k=(5, 9, 13))
        super().__init__()
        c_ = c1 // 2  # hidden channels
        self.cv1 = Conv(c1, c_, 1, 1)
        self.cv2 = Conv(c_ * 4, c2, 1, 1)
        self.m = nn.MaxPool2d(kernel_size=k, stride=1, padding=k // 2)

    def forward(self, x):
        x = self.cv1(x)
        with warnings.catch_warnings():
            warnings.simplefilter('ignore')  # suppress torch 1.9.0 max_pool2d() warning
            y1 = self.m(x)
            y2 = self.m(y1)
            return self.cv2(torch.cat((x, y1, y2, self.m(y2)), 1))
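SPPF with k=5 is mathematically equivalent to SPP with k=(5, 9, 13) (two cascaded 5x5 max pools equal one 9x9, three equal one 13x13) but reuses the intermediate pooling results, so it runs faster. A minimal sketch of the equivalence, assuming this file is importable as models.common; the weights of the two modules are shared so the outputs are comparable:

```
import torch
from models.common import SPP, SPPF  # assumes this file is models/common.py

spp, sppf = SPP(64, 64), SPPF(64, 64)
sppf.cv1, sppf.cv2 = spp.cv1, spp.cv2  # share weights so outputs are comparable
x = torch.randn(1, 64, 32, 32)
with torch.no_grad():
    print(torch.allclose(spp(x), sppf(x), atol=1e-5))  # True: same receptive fields
```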
class Focus(nn.Module):
    # Focus wh information into c-space
    def __init__(self, c1, c2, k=1, s=1, p=None, g=1, act=True):  # ch_in, ch_out, kernel, stride, padding, groups
        super(Focus, self).__init__()
        self.conv = Conv(c1 * 4, c2, k, s, p, g, act)
        # self.contract = Contract(gain=2)

    def forward(self, x):  # x(b,c,w,h) -> y(b,4c,w/2,h/2)
        return self.conv(torch.cat([x[..., ::2, ::2], x[..., 1::2, ::2], x[..., ::2, 1::2], x[..., 1::2, 1::2]], 1))
        # return self.conv(self.contract(x))
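The slicing in Focus.forward is a space-to-depth rearrangement: the four interleaved pixel grids become four channel groups at half resolution, before the single Conv. A quick shape check:

```
import torch

x = torch.arange(16.).view(1, 1, 4, 4)  # one 4x4 channel
patches = torch.cat([x[..., ::2, ::2], x[..., 1::2, ::2],
                     x[..., ::2, 1::2], x[..., 1::2, 1::2]], 1)
print(patches.shape)  # torch.Size([1, 4, 2, 2]) -> 4x the channels, half the resolution
```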
class Contract(nn.Module):
    # Contract width-height into channels, i.e. x(1,64,80,80) to x(1,256,40,40)
    def __init__(self, gain=2):
        super().__init__()
        self.gain = gain

    def forward(self, x):
        N, C, H, W = x.size()  # assert (H / s == 0) and (W / s == 0), 'Indivisible gain'
        s = self.gain
        x = x.view(N, C, H // s, s, W // s, s)  # x(1,64,40,2,40,2)
        x = x.permute(0, 3, 5, 1, 2, 4).contiguous()  # x(1,2,2,64,40,40)
        return x.view(N, C * s * s, H // s, W // s)  # x(1,256,40,40)


class Expand(nn.Module):
    # Expand channels into width-height, i.e. x(1,64,80,80) to x(1,16,160,160)
    def __init__(self, gain=2):
        super().__init__()
        self.gain = gain

    def forward(self, x):
        N, C, H, W = x.size()  # assert C / s ** 2 == 0, 'Indivisible gain'
        s = self.gain
        x = x.view(N, s, s, C // s ** 2, H, W)  # x(1,2,2,16,80,80)
        x = x.permute(0, 3, 4, 1, 5, 2).contiguous()  # x(1,16,80,2,80,2)
        return x.view(N, C // s ** 2, H * s, W * s)  # x(1,16,160,160)
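Contract and Expand are exact inverses for a matching gain, since both use the same (s, s, C) channel ordering in their permutes. A round-trip sketch, assuming this file is importable as models.common:

```
import torch
from models.common import Contract, Expand  # assumes this file is models/common.py

x = torch.randn(1, 16, 8, 8)
y = Contract(gain=2)(x)   # (1, 64, 4, 4)
z = Expand(gain=2)(y)     # (1, 16, 8, 8)
print(torch.equal(x, z))  # True: the two ops undo each other
```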
class Concat(nn.Module):
    # Concatenate a list of tensors along dimension
    def __init__(self, dimension=1):
        super(Concat, self).__init__()
        self.d = dimension

    def forward(self, x):
        return torch.cat(x, self.d)


class NMS(nn.Module):
    # Non-Maximum Suppression (NMS) module
    conf = 0.25  # confidence threshold
    iou = 0.45  # IoU threshold
    classes = None  # (optional list) filter by class

    def __init__(self):
        super(NMS, self).__init__()

    def forward(self, x):
        return non_max_suppression(x[0], conf_thres=self.conf, iou_thres=self.iou, classes=self.classes)


class autoShape(nn.Module):
    # input-robust model wrapper for passing cv2/np/PIL/torch inputs. Includes preprocessing, inference and NMS
    img_size = 640  # inference size (pixels)
    conf = 0.25  # NMS confidence threshold
    iou = 0.45  # NMS IoU threshold
    classes = None  # (optional list) filter by class

    def __init__(self, model):
        super(autoShape, self).__init__()
        self.model = model.eval()

    def autoshape(self):
        print('autoShape already enabled, skipping... ')  # model already converted to model.autoshape()
        return self

    def forward(self, imgs, size=640, augment=False, profile=False):
        # Inference from various sources. For height=720, width=1280, RGB images example inputs are:
        #   filename:  imgs = 'data/samples/zidane.jpg'
        #   URI:            = 'https://github.com/ultralytics/yolov5/releases/download/v1.0/zidane.jpg'
        #   OpenCV:         = cv2.imread('image.jpg')[:,:,::-1]  # HWC BGR to RGB x(720,1280,3)
        #   PIL:            = Image.open('image.jpg')  # HWC x(720,1280,3)
        #   numpy:          = np.zeros((720,1280,3))  # HWC
        #   torch:          = torch.zeros(16,3,720,1280)  # BCHW
        #   multiple:       = [Image.open('image1.jpg'), Image.open('image2.jpg'), ...]  # list of images

        p = next(self.model.parameters())  # for device and type
        if isinstance(imgs, torch.Tensor):  # torch
            return self.model(imgs.to(p.device).type_as(p), augment, profile)  # inference

        # Pre-process
        n, imgs = (len(imgs), imgs) if isinstance(imgs, list) else (1, [imgs])  # number of images, list of images
        shape0, shape1 = [], []  # image and inference shapes
        for i, im in enumerate(imgs):
            if isinstance(im, str):  # filename or uri
                im = Image.open(requests.get(im, stream=True).raw if im.startswith('http') else im)  # open
            im = np.array(im)  # to numpy
            if im.shape[0] < 5:  # image in CHW
                im = im.transpose((1, 2, 0))  # reverse dataloader .transpose(2, 0, 1)
            im = im[:, :, :3] if im.ndim == 3 else np.tile(im[:, :, None], 3)  # enforce 3ch input
            s = im.shape[:2]  # HWC
            shape0.append(s)  # image shape
            g = (size / max(s))  # gain
            shape1.append([y * g for y in s])
            imgs[i] = im  # update
        shape1 = [make_divisible(x, int(self.stride.max())) for x in np.stack(shape1, 0).max(0)]  # inference shape
        x = [letterbox(im, new_shape=shape1, auto=False)[0] for im in imgs]  # pad
        x = np.stack(x, 0) if n > 1 else x[0][None]  # stack
        x = np.ascontiguousarray(x.transpose((0, 3, 1, 2)))  # BHWC to BCHW
        x = torch.from_numpy(x).to(p.device).type_as(p) / 255.  # uint8 to fp16/32

        # Inference
        with torch.no_grad():
            y = self.model(x, augment, profile)[0]  # forward
        y = non_max_suppression(y, conf_thres=self.conf, iou_thres=self.iou, classes=self.classes)  # NMS

        # Post-process
        for i in range(n):
            scale_coords(shape1, y[i][:, :4], shape0[i])

        return Detections(imgs, y, self.names)


class Detections:
    # detections class for YOLOv5 inference results
    def __init__(self, imgs, pred, names=None):
        super(Detections, self).__init__()
        d = pred[0].device  # device
        gn = [torch.tensor([*[im.shape[i] for i in [1, 0, 1, 0]], 1., 1.], device=d) for im in imgs]  # normalizations
        self.imgs = imgs  # list of images as numpy arrays
        self.pred = pred  # list of tensors pred[0] = (xyxy, conf, cls)
        self.names = names  # class names
        self.xyxy = pred  # xyxy pixels
        self.xywh = [xyxy2xywh(x) for x in pred]  # xywh pixels
        self.xyxyn = [x / g for x, g in zip(self.xyxy, gn)]  # xyxy normalized
        self.xywhn = [x / g for x, g in zip(self.xywh, gn)]  # xywh normalized
        self.n = len(self.pred)

    def display(self, pprint=False, show=False, save=False, render=False):
        colors = color_list()
        for i, (img, pred) in enumerate(zip(self.imgs, self.pred)):
            s = f'Image {i + 1}/{len(self.pred)}: {img.shape[0]}x{img.shape[1]} '  # renamed from str to avoid shadowing the builtin
            if pred is not None:
                for c in pred[:, -1].unique():
                    n = (pred[:, -1] == c).sum()  # detections per class
                    s += f'{n} {self.names[int(c)]}s, '  # add to string
                if show or save or render:
                    img = Image.fromarray(img.astype(np.uint8)) if isinstance(img, np.ndarray) else img  # from np
                    for *box, conf, cls in pred:  # xyxy, confidence, class
                        ImageDraw.Draw(img).rectangle(box, width=4, outline=colors[int(cls) % 10])  # plot
            if pprint:
                print(s)
            if show:
                img.show(f'Image {i}')  # show
            if save:
                f = f'results{i}.jpg'
                s += f"saved to '{f}'"
                img.save(f)  # save
            if render:
                self.imgs[i] = np.asarray(img)

    def print(self):
        self.display(pprint=True)  # print results

    def show(self):
        self.display(show=True)  # show results

    def save(self):
        self.display(save=True)  # save results

    def render(self):
        self.display(render=True)  # render results
        return self.imgs

    def __len__(self):
        return self.n

    def tolist(self):
        # return a list of Detections objects, i.e. 'for result in results.tolist():'
        x = [Detections([self.imgs[i]], [self.pred[i]], self.names) for i in range(self.n)]
        for d in x:
            for k in ['imgs', 'pred', 'xyxy', 'xyxyn', 'xywh', 'xywhn']:
                setattr(d, k, getattr(d, k)[0])  # pop out of list
        return x


class Classify(nn.Module):
    # Classification head, i.e. x(b,c1,20,20) to x(b,c2)
    def __init__(self, c1, c2, k=1, s=1, p=None, g=1):  # ch_in, ch_out, kernel, stride, padding, groups
        super(Classify, self).__init__()
        self.aap = nn.AdaptiveAvgPool2d(1)  # to x(b,c1,1,1)
        self.conv = nn.Conv2d(c1, c2, k, s, autopad(k, p), groups=g)  # to x(b,c2,1,1)
        self.flat = nn.Flatten()

    def forward(self, x):
        z = torch.cat([self.aap(y) for y in (x if isinstance(x, list) else [x])], 1)  # cat if list
        return self.flat(self.conv(z))  # flatten to x(b,c2)
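A quick shape check for the Classify head defined above; the channel counts are arbitrary examples, not values used elsewhere in the repo:

```
import torch
from models.common import Classify  # assumes this file is models/common.py

head = Classify(c1=256, c2=12)      # e.g. 12 output categories
feat = torch.randn(4, 256, 20, 20)  # backbone feature map
print(head(feat).shape)             # torch.Size([4, 12])
```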
133
models/experimental.py
Normal file
@ -0,0 +1,133 @@
# This file contains experimental modules

import numpy as np
import torch
import torch.nn as nn

from models.common import Conv, DWConv
from utils.google_utils import attempt_download


class CrossConv(nn.Module):
    # Cross Convolution Downsample
    def __init__(self, c1, c2, k=3, s=1, g=1, e=1.0, shortcut=False):
        # ch_in, ch_out, kernel, stride, groups, expansion, shortcut
        super(CrossConv, self).__init__()
        c_ = int(c2 * e)  # hidden channels
        self.cv1 = Conv(c1, c_, (1, k), (1, s))
        self.cv2 = Conv(c_, c2, (k, 1), (s, 1), g=g)
        self.add = shortcut and c1 == c2

    def forward(self, x):
        return x + self.cv2(self.cv1(x)) if self.add else self.cv2(self.cv1(x))


class Sum(nn.Module):
    # Weighted sum of 2 or more layers https://arxiv.org/abs/1911.09070
    def __init__(self, n, weight=False):  # n: number of inputs
        super(Sum, self).__init__()
        self.weight = weight  # apply weights boolean
        self.iter = range(n - 1)  # iter object
        if weight:
            self.w = nn.Parameter(-torch.arange(1., n) / 2, requires_grad=True)  # layer weights

    def forward(self, x):
        y = x[0]  # no weight
        if self.weight:
            w = torch.sigmoid(self.w) * 2
            for i in self.iter:
                y = y + x[i + 1] * w[i]
        else:
            for i in self.iter:
                y = y + x[i + 1]
        return y


class GhostConv(nn.Module):
    # Ghost Convolution https://github.com/huawei-noah/ghostnet
    def __init__(self, c1, c2, k=1, s=1, g=1, act=True):  # ch_in, ch_out, kernel, stride, groups
        super(GhostConv, self).__init__()
        c_ = c2 // 2  # hidden channels
        self.cv1 = Conv(c1, c_, k, s, None, g, act)
        self.cv2 = Conv(c_, c_, 5, 1, None, c_, act)

    def forward(self, x):
        y = self.cv1(x)
        return torch.cat([y, self.cv2(y)], 1)


class GhostBottleneck(nn.Module):
    # Ghost Bottleneck https://github.com/huawei-noah/ghostnet
    def __init__(self, c1, c2, k, s):
        super(GhostBottleneck, self).__init__()
        c_ = c2 // 2
        self.conv = nn.Sequential(GhostConv(c1, c_, 1, 1),  # pw
                                  DWConv(c_, c_, k, s, act=False) if s == 2 else nn.Identity(),  # dw
                                  GhostConv(c_, c2, 1, 1, act=False))  # pw-linear
        self.shortcut = nn.Sequential(DWConv(c1, c1, k, s, act=False),
                                      Conv(c1, c2, 1, 1, act=False)) if s == 2 else nn.Identity()

    def forward(self, x):
        return self.conv(x) + self.shortcut(x)


class MixConv2d(nn.Module):
    # Mixed Depthwise Conv https://arxiv.org/abs/1907.09595
    def __init__(self, c1, c2, k=(1, 3), s=1, equal_ch=True):
        super(MixConv2d, self).__init__()
        groups = len(k)
        if equal_ch:  # equal c_ per group
            i = torch.linspace(0, groups - 1E-6, c2).floor()  # c2 indices
            c_ = [(i == g).sum() for g in range(groups)]  # intermediate channels
        else:  # equal weight.numel() per group
            b = [c2] + [0] * groups
            a = np.eye(groups + 1, groups, k=-1)
            a -= np.roll(a, 1, axis=1)
            a *= np.array(k) ** 2
            a[0] = 1
            c_ = np.linalg.lstsq(a, b, rcond=None)[0].round()  # solve for equal weight indices, ax = b

        self.m = nn.ModuleList([nn.Conv2d(c1, int(c_[g]), k[g], s, k[g] // 2, bias=False) for g in range(groups)])
        self.bn = nn.BatchNorm2d(c2)
        self.act = nn.LeakyReLU(0.1, inplace=True)

    def forward(self, x):
        return x + self.act(self.bn(torch.cat([m(x) for m in self.m], 1)))


class Ensemble(nn.ModuleList):
    # Ensemble of models
    def __init__(self):
        super(Ensemble, self).__init__()

    def forward(self, x, augment=False):
        y = []
        for module in self:
            y.append(module(x, augment)[0])
        # y = torch.stack(y).max(0)[0]  # max ensemble
        # y = torch.stack(y).mean(0)  # mean ensemble
        y = torch.cat(y, 1)  # nms ensemble
        return y, None  # inference, train output


def attempt_load(weights, map_location=None):
    # Loads an ensemble of models weights=[a,b,c] or a single model weights=[a] or weights=a
    model = Ensemble()
    for w in weights if isinstance(weights, list) else [weights]:
        attempt_download(w)
        model.append(torch.load(w, map_location=map_location)['model'].float().fuse().eval())  # load FP32 model

    # Compatibility updates
    for m in model.modules():
        if type(m) in [nn.Hardswish, nn.LeakyReLU, nn.ReLU, nn.ReLU6, nn.SiLU]:
            m.inplace = True  # pytorch 1.7.0 compatibility
        elif type(m) is Conv:
            m._non_persistent_buffers_set = set()  # pytorch 1.6.0 compatibility

    if len(model) == 1:
        return model[-1]  # return model
    else:
        print('Ensemble created with %s\n' % weights)
        for k in ['names', 'stride']:
            setattr(model, k, getattr(model[-1], k))
        return model  # return ensemble
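attempt_load accepts either a single checkpoint path or a list of paths; with a list it returns an Ensemble whose per-model outputs are concatenated before NMS. A hedged usage sketch (the weights path reuses the one from this repo's demos; the ensemble line is hypothetical):

```
from models.experimental import attempt_load

model = attempt_load('weights/plate_detect.pt', map_location='cpu')  # single model
# model = attempt_load(['a.pt', 'b.pt'])  # hypothetical two-checkpoint ensemble
model.eval()
```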
351
models/yolo.py
Normal file
@ -0,0 +1,351 @@
import argparse
import logging
import math
import sys
from copy import deepcopy
from pathlib import Path

import torch
import torch.nn as nn

sys.path.append('./')  # to run '$ python *.py' files in subdirectories
logger = logging.getLogger(__name__)

from models.common import Conv, Bottleneck, SPP, DWConv, Focus, BottleneckCSP, C3, ShuffleV2Block, Concat, NMS, autoShape, StemBlock, BlazeBlock, DoubleBlazeBlock
from models.experimental import MixConv2d, CrossConv
from utils.autoanchor import check_anchor_order
from utils.general import make_divisible, check_file, set_logging
from utils.torch_utils import time_synchronized, fuse_conv_and_bn, model_info, scale_img, initialize_weights, \
    select_device, copy_attr

try:
    import thop  # for FLOPS computation
except ImportError:
    thop = None


class Detect(nn.Module):
    stride = None  # strides computed during build
    export_cat = False  # onnx export cat output

    def __init__(self, nc=80, anchors=(), ch=()):  # detection layer
        super(Detect, self).__init__()
        self.nc = nc  # number of classes
        self.no = nc + 5 + 8  # number of outputs per anchor (+8 for the four landmark points)

        self.nl = len(anchors)  # number of detection layers
        self.na = len(anchors[0]) // 2  # number of anchors
        self.grid = [torch.zeros(1)] * self.nl  # init grid
        a = torch.tensor(anchors).float().view(self.nl, -1, 2)
        self.register_buffer('anchors', a)  # shape(nl,na,2)
        self.register_buffer('anchor_grid', a.clone().view(self.nl, 1, -1, 1, 1, 2))  # shape(nl,1,na,1,1,2)
        self.m = nn.ModuleList(nn.Conv2d(x, self.no * self.na, 1) for x in ch)  # output conv

    def forward(self, x):
        z = []  # inference output
        if self.export_cat:
            for i in range(self.nl):
                x[i] = self.m[i](x[i])  # conv
                bs, _, ny, nx = x[i].shape  # x(bs,255,20,20) to x(bs,3,20,20,85)
                x[i] = x[i].view(bs, self.na, self.no, ny, nx).permute(0, 1, 3, 4, 2).contiguous()

                if self.grid[i].shape[2:4] != x[i].shape[2:4]:
                    self.grid[i], self.anchor_grid[i] = self._make_grid_new(nx, ny, i)

                y = torch.full_like(x[i], 0)
                y = y + torch.cat((x[i][:, :, :, :, 0:5].sigmoid(), torch.cat((x[i][:, :, :, :, 5:13], x[i][:, :, :, :, 13:13 + self.nc].sigmoid()), 4)), 4)

                box_xy = (y[:, :, :, :, 0:2] * 2. - 0.5 + self.grid[i].to(x[i].device)) * self.stride[i]  # xy
                box_wh = (y[:, :, :, :, 2:4] * 2) ** 2 * self.anchor_grid[i]  # wh

                landm1 = y[:, :, :, :, 5:7] * self.anchor_grid[i] + self.grid[i].to(x[i].device) * self.stride[i]  # landmark x1 y1
                landm2 = y[:, :, :, :, 7:9] * self.anchor_grid[i] + self.grid[i].to(x[i].device) * self.stride[i]  # landmark x2 y2
                landm3 = y[:, :, :, :, 9:11] * self.anchor_grid[i] + self.grid[i].to(x[i].device) * self.stride[i]  # landmark x3 y3
                landm4 = y[:, :, :, :, 11:13] * self.anchor_grid[i] + self.grid[i].to(x[i].device) * self.stride[i]  # landmark x4 y4
                y = torch.cat([box_xy, box_wh, y[:, :, :, :, 4:5], landm1, landm2, landm3, landm4, y[:, :, :, :, 13:13 + self.nc]], -1)

                z.append(y.view(bs, -1, self.no))
            return torch.cat(z, 1)

        for i in range(self.nl):
            x[i] = self.m[i](x[i])  # conv
            bs, _, ny, nx = x[i].shape  # x(bs,255,20,20) to x(bs,3,20,20,85)
            x[i] = x[i].view(bs, self.na, self.no, ny, nx).permute(0, 1, 3, 4, 2).contiguous()

            if not self.training:  # inference
                if self.grid[i].shape[2:4] != x[i].shape[2:4]:
                    self.grid[i] = self._make_grid(nx, ny).to(x[i].device)

                y = torch.full_like(x[i], 0)
                class_range = list(range(5)) + list(range(13, 13 + self.nc))
                y[..., class_range] = x[i][..., class_range].sigmoid()
                y[..., 5:13] = x[i][..., 5:13]  # landmarks are regressed without sigmoid

                y[..., 0:2] = (y[..., 0:2] * 2. - 0.5 + self.grid[i].to(x[i].device)) * self.stride[i]  # xy
                y[..., 2:4] = (y[..., 2:4] * 2) ** 2 * self.anchor_grid[i]  # wh

                y[..., 5:7] = y[..., 5:7] * self.anchor_grid[i] + self.grid[i].to(x[i].device) * self.stride[i]  # landmark x1 y1
                y[..., 7:9] = y[..., 7:9] * self.anchor_grid[i] + self.grid[i].to(x[i].device) * self.stride[i]  # landmark x2 y2
                y[..., 9:11] = y[..., 9:11] * self.anchor_grid[i] + self.grid[i].to(x[i].device) * self.stride[i]  # landmark x3 y3
                y[..., 11:13] = y[..., 11:13] * self.anchor_grid[i] + self.grid[i].to(x[i].device) * self.stride[i]  # landmark x4 y4

                z.append(y.view(bs, -1, self.no))

        return x if self.training else (torch.cat(z, 1), x)

    @staticmethod
    def _make_grid(nx=20, ny=20):
        yv, xv = torch.meshgrid([torch.arange(ny), torch.arange(nx)])
        return torch.stack((xv, yv), 2).view((1, 1, ny, nx, 2)).float()

    def _make_grid_new(self, nx=20, ny=20, i=0):
        d = self.anchors[i].device
        if '1.10.0' in torch.__version__:  # torch>=1.10.0 requires an explicit indexing argument for meshgrid
            yv, xv = torch.meshgrid([torch.arange(ny).to(d), torch.arange(nx).to(d)], indexing='ij')
        else:
            yv, xv = torch.meshgrid([torch.arange(ny).to(d), torch.arange(nx).to(d)])
        grid = torch.stack((xv, yv), 2).expand((1, self.na, ny, nx, 2)).float()
        anchor_grid = (self.anchors[i].clone() * self.stride[i]).view((1, self.na, 1, 1, 2)).expand((1, self.na, ny, nx, 2)).float()
        return grid, anchor_grid
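The decode in Detect.forward maps sigmoid outputs to pixel space: box centers move in [-0.5, 1.5] cells around the grid cell, widths/heights scale the anchor by up to 4x. A tiny numeric sketch for one cell (all values are illustrative):

```
import torch

stride, anchor = 8.0, torch.tensor([4.0, 5.0])  # anchor already in pixels
grid_xy = torch.tensor([10.0, 7.0])             # cell indices (x, y)
txy, twh = torch.tensor([0.6, 0.4]), torch.tensor([0.5, 0.5])  # sigmoid outputs
xy = (txy * 2.0 - 0.5 + grid_xy) * stride       # tensor([85.6000, 58.4000]) pixels
wh = (twh * 2.0) ** 2 * anchor                  # tensor([4., 5.]); range is [0, 4*anchor]
print(xy, wh)
```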
class Model(nn.Module):
    def __init__(self, cfg='yolov5s.yaml', ch=3, nc=None):  # model, input channels, number of classes
        super(Model, self).__init__()
        if isinstance(cfg, dict):
            self.yaml = cfg  # model dict
        else:  # is *.yaml
            import yaml  # for torch hub
            self.yaml_file = Path(cfg).name
            with open(cfg) as f:
                self.yaml = yaml.load(f, Loader=yaml.FullLoader)  # model dict

        # Define model
        ch = self.yaml['ch'] = self.yaml.get('ch', ch)  # input channels
        if nc and nc != self.yaml['nc']:
            logger.info('Overriding model.yaml nc=%g with nc=%g' % (self.yaml['nc'], nc))
            self.yaml['nc'] = nc  # override yaml value
        self.model, self.save = parse_model(deepcopy(self.yaml), ch=[ch])  # model, savelist
        self.names = [str(i) for i in range(self.yaml['nc'])]  # default names

        # Build strides, anchors
        m = self.model[-1]  # Detect()
        if isinstance(m, Detect):
            s = 128  # 2x min stride
            m.stride = torch.tensor([s / x.shape[-2] for x in self.forward(torch.zeros(1, ch, s, s))])  # forward
            m.anchors /= m.stride.view(-1, 1, 1)
            check_anchor_order(m)
            self.stride = m.stride
            self._initialize_biases()  # only run once

        # Init weights, biases
        initialize_weights(self)
        self.info()
        logger.info('')

    def forward(self, x, augment=False, profile=False):
        if augment:
            img_size = x.shape[-2:]  # height, width
            s = [1, 0.83, 0.67]  # scales
            f = [None, 3, None]  # flips (2-ud, 3-lr)
            y = []  # outputs
            for si, fi in zip(s, f):
                xi = scale_img(x.flip(fi) if fi else x, si)
                yi = self.forward_once(xi)[0]  # forward
                yi[..., :4] /= si  # de-scale
                if fi == 2:
                    yi[..., 1] = img_size[0] - yi[..., 1]  # de-flip ud
                elif fi == 3:
                    yi[..., 0] = img_size[1] - yi[..., 0]  # de-flip lr
                y.append(yi)
            return torch.cat(y, 1), None  # augmented inference, train
        else:
            return self.forward_once(x, profile)  # single-scale inference, train

    def forward_once(self, x, profile=False):
        y, dt = [], []  # outputs
        for m in self.model:
            if m.f != -1:  # if not from previous layer
                x = y[m.f] if isinstance(m.f, int) else [x if j == -1 else y[j] for j in m.f]  # from earlier layers

            if profile:
                o = thop.profile(m, inputs=(x,), verbose=False)[0] / 1E9 * 2 if thop else 0  # FLOPS
                t = time_synchronized()
                for _ in range(10):
                    _ = m(x)
                dt.append((time_synchronized() - t) * 100)
                print('%10.1f%10.0f%10.1fms %-40s' % (o, m.np, dt[-1], m.type))

            x = m(x)  # run
            y.append(x if m.i in self.save else None)  # save output

        if profile:
            print('%.1fms total' % sum(dt))
        return x

    def _initialize_biases(self, cf=None):  # initialize biases into Detect(), cf is class frequency
        # https://arxiv.org/abs/1708.02002 section 3.3
        # cf = torch.bincount(torch.tensor(np.concatenate(dataset.labels, 0)[:, 0]).long(), minlength=nc) + 1.
        m = self.model[-1]  # Detect() module
        for mi, s in zip(m.m, m.stride):  # from
            b = mi.bias.view(m.na, -1)  # conv.bias(255) to (3,85)
            b.data[:, 4] += math.log(8 / (640 / s) ** 2)  # obj (8 objects per 640 image)
            b.data[:, 5:] += math.log(0.6 / (m.nc - 0.99)) if cf is None else torch.log(cf / cf.sum())  # cls
            mi.bias = torch.nn.Parameter(b.view(-1), requires_grad=True)

    def _print_biases(self):
        m = self.model[-1]  # Detect() module
        for mi in m.m:  # from
            b = mi.bias.detach().view(m.na, -1).T  # conv.bias(255) to (3,85)
            print(('%6g Conv2d.bias:' + '%10.3g' * 6) % (mi.weight.shape[1], *b[:5].mean(1).tolist(), b[5:].mean()))

    def fuse(self):  # fuse model Conv2d() + BatchNorm2d() layers
        print('Fusing layers... ')
        for m in self.model.modules():
            if type(m) is Conv and hasattr(m, 'bn'):
                m.conv = fuse_conv_and_bn(m.conv, m.bn)  # update conv
                delattr(m, 'bn')  # remove batchnorm
                m.forward = m.fuseforward  # update forward
            elif type(m) is nn.Upsample:
                m.recompute_scale_factor = None  # torch 1.11.0 compatibility
        self.info()
        return self

    def nms(self, mode=True):  # add or remove NMS module
        present = type(self.model[-1]) is NMS  # last layer is NMS
        if mode and not present:
            print('Adding NMS... ')
            m = NMS()  # module
            m.f = -1  # from
            m.i = self.model[-1].i + 1  # index
            self.model.add_module(name='%s' % m.i, module=m)  # add
            self.eval()
        elif not mode and present:
            print('Removing NMS... ')
            self.model = self.model[:-1]  # remove
        return self

    def autoshape(self):  # add autoShape module
        print('Adding autoShape... ')
        m = autoShape(self)  # wrap model
        copy_attr(m, self, include=('yaml', 'nc', 'hyp', 'names', 'stride'), exclude=())  # copy attributes
        return m

    def info(self, verbose=False, img_size=640):  # print model information
        model_info(self, verbose, img_size)


def parse_model(d, ch):  # model_dict, input_channels(3)
    logger.info('\n%3s%18s%3s%10s  %-40s%-30s' % ('', 'from', 'n', 'params', 'module', 'arguments'))
    anchors, nc, gd, gw = d['anchors'], d['nc'], d['depth_multiple'], d['width_multiple']
    na = (len(anchors[0]) // 2) if isinstance(anchors, list) else anchors  # number of anchors
    no = na * (nc + 5)  # number of outputs = anchors * (classes + 5)

    layers, save, c2 = [], [], ch[-1]  # layers, savelist, ch out
    for i, (f, n, m, args) in enumerate(d['backbone'] + d['head']):  # from, number, module, args
        m = eval(m) if isinstance(m, str) else m  # eval strings
        for j, a in enumerate(args):
            try:
                args[j] = eval(a) if isinstance(a, str) else a  # eval strings
            except Exception:
                pass

        n = max(round(n * gd), 1) if n > 1 else n  # depth gain
        if m in [Conv, Bottleneck, SPP, DWConv, MixConv2d, Focus, CrossConv, BottleneckCSP, C3, ShuffleV2Block, StemBlock, BlazeBlock, DoubleBlazeBlock]:
            c1, c2 = ch[f], args[0]

            c2 = make_divisible(c2 * gw, 8) if c2 != no else c2  # width gain

            args = [c1, c2, *args[1:]]
            if m in [BottleneckCSP, C3]:
                args.insert(2, n)
                n = 1
        elif m is nn.BatchNorm2d:
            args = [ch[f]]
        elif m is Concat:
            c2 = sum([ch[-1 if x == -1 else x + 1] for x in f])
        elif m is Detect:
            args.append([ch[x + 1] for x in f])
            if isinstance(args[1], int):  # number of anchors
                args[1] = [list(range(args[1] * 2))] * len(f)
        else:
            c2 = ch[f]

        m_ = nn.Sequential(*[m(*args) for _ in range(n)]) if n > 1 else m(*args)  # module
        t = str(m)[8:-2].replace('__main__.', '')  # module type
        np = sum([x.numel() for x in m_.parameters()])  # number params
        m_.i, m_.f, m_.type, m_.np = i, f, t, np  # attach index, 'from' index, type, number params
        logger.info('%3s%18s%3s%10.0f  %-40s%-30s' % (i, f, n, np, t, args))  # print
        save.extend(x % i for x in ([f] if isinstance(f, int) else f) if x != -1)  # append to savelist
        layers.append(m_)
        ch.append(c2)
    return nn.Sequential(*layers), sorted(save)
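parse_model scales each yaml entry by depth_multiple (gd) and width_multiple (gw). For example, with the yolov5s multiples below (gd=0.33, gw=0.5), a `[-1, 9, C3, [512]]` entry becomes 3 repeats of 256 channels:

```
from utils.general import make_divisible  # same helper parse_model uses

gd, gw = 0.33, 0.5                 # yolov5s.yaml depth/width multiples
n, c2 = 9, 512                     # '[-1, 9, C3, [512]]' backbone entry
print(max(round(n * gd), 1))       # 3 repeats
print(make_divisible(c2 * gw, 8))  # 256 channels
```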
if __name__ == '__main__':
    from thop import profile
    from thop import clever_format

    parser = argparse.ArgumentParser()
    parser.add_argument('--cfg', type=str, default='yolov5s.yaml', help='model.yaml')
    parser.add_argument('--device', default='', help='cuda device, i.e. 0 or 0,1,2,3 or cpu')
    opt = parser.parse_args()
    opt.cfg = check_file(opt.cfg)  # check file
    set_logging()
    device = select_device(opt.device)

    # Create model
    model = Model(opt.cfg).to(device)
    stride = model.stride.max()
    if stride == 32:
        input = torch.Tensor(1, 3, 480, 640).to(device)
    else:
        input = torch.Tensor(1, 3, 512, 640).to(device)
    model.train()
    print(model)
    flops, params = profile(model, inputs=(input,))
    flops, params = clever_format([flops, params], '%.3f')
    print('Flops:', flops, ', Params:', params)
47
models/yolov5l.yaml
Normal file
@ -0,0 +1,47 @@
# parameters
nc: 1  # number of classes
depth_multiple: 1.0  # model depth multiple
width_multiple: 1.0  # layer channel multiple

# anchors
anchors:
  - [4,5, 8,10, 13,16]  # P3/8
  - [23,29, 43,55, 73,105]  # P4/16
  - [146,217, 231,300, 335,433]  # P5/32

# YOLOv5 backbone
backbone:
  # [from, number, module, args]
  [[-1, 1, StemBlock, [64, 3, 2]],  # 0-P1/2
   [-1, 3, C3, [128]],
   [-1, 1, Conv, [256, 3, 2]],  # 2-P3/8
   [-1, 9, C3, [256]],
   [-1, 1, Conv, [512, 3, 2]],  # 4-P4/16
   [-1, 9, C3, [512]],
   [-1, 1, Conv, [1024, 3, 2]],  # 6-P5/32
   [-1, 1, SPP, [1024, [3,5,7]]],
   [-1, 3, C3, [1024, False]],  # 8
  ]

# YOLOv5 head
head:
  [[-1, 1, Conv, [512, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 5], 1, Concat, [1]],  # cat backbone P4
   [-1, 3, C3, [512, False]],  # 12

   [-1, 1, Conv, [256, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 3], 1, Concat, [1]],  # cat backbone P3
   [-1, 3, C3, [256, False]],  # 16 (P3/8-small)

   [-1, 1, Conv, [256, 3, 2]],
   [[-1, 13], 1, Concat, [1]],  # cat head P4
   [-1, 3, C3, [512, False]],  # 19 (P4/16-medium)

   [-1, 1, Conv, [512, 3, 2]],
   [[-1, 9], 1, Concat, [1]],  # cat head P5
   [-1, 3, C3, [1024, False]],  # 22 (P5/32-large)

   [[16, 19, 22], 1, Detect, [nc, anchors]],  # Detect(P3, P4, P5)
  ]
60
models/yolov5l6.yaml
Normal file
@ -0,0 +1,60 @@
# parameters
nc: 1  # number of classes
depth_multiple: 1.0  # model depth multiple
width_multiple: 1.0  # layer channel multiple

# anchors
anchors:
  - [6,7, 9,11, 13,16]  # P3/8
  - [18,23, 26,33, 37,47]  # P4/16
  - [54,67, 77,104, 112,154]  # P5/32
  - [174,238, 258,355, 445,568]  # P6/64

# YOLOv5 backbone
backbone:
  # [from, number, module, args]
  [ [ -1, 1, StemBlock, [ 64, 3, 2 ] ],  # 0-P1/2
    [ -1, 3, C3, [ 128 ] ],
    [ -1, 1, Conv, [ 256, 3, 2 ] ],  # 2-P3/8
    [ -1, 9, C3, [ 256 ] ],
    [ -1, 1, Conv, [ 512, 3, 2 ] ],  # 4-P4/16
    [ -1, 9, C3, [ 512 ] ],
    [ -1, 1, Conv, [ 768, 3, 2 ] ],  # 6-P5/32
    [ -1, 3, C3, [ 768 ] ],
    [ -1, 1, Conv, [ 1024, 3, 2 ] ],  # 8-P6/64
    [ -1, 1, SPP, [ 1024, [ 3, 5, 7 ] ] ],
    [ -1, 3, C3, [ 1024, False ] ],  # 10
  ]

# YOLOv5 head
head:
  [ [ -1, 1, Conv, [ 768, 1, 1 ] ],
    [ -1, 1, nn.Upsample, [ None, 2, 'nearest' ] ],
    [ [ -1, 7 ], 1, Concat, [ 1 ] ],  # cat backbone P5
    [ -1, 3, C3, [ 768, False ] ],  # 14

    [ -1, 1, Conv, [ 512, 1, 1 ] ],
    [ -1, 1, nn.Upsample, [ None, 2, 'nearest' ] ],
    [ [ -1, 5 ], 1, Concat, [ 1 ] ],  # cat backbone P4
    [ -1, 3, C3, [ 512, False ] ],  # 18

    [ -1, 1, Conv, [ 256, 1, 1 ] ],
    [ -1, 1, nn.Upsample, [ None, 2, 'nearest' ] ],
    [ [ -1, 3 ], 1, Concat, [ 1 ] ],  # cat backbone P3
    [ -1, 3, C3, [ 256, False ] ],  # 22 (P3/8-small)

    [ -1, 1, Conv, [ 256, 3, 2 ] ],
    [ [ -1, 19 ], 1, Concat, [ 1 ] ],  # cat head P4
    [ -1, 3, C3, [ 512, False ] ],  # 25 (P4/16-medium)

    [ -1, 1, Conv, [ 512, 3, 2 ] ],
    [ [ -1, 15 ], 1, Concat, [ 1 ] ],  # cat head P5
    [ -1, 3, C3, [ 768, False ] ],  # 28 (P5/32-large)

    [ -1, 1, Conv, [ 768, 3, 2 ] ],
    [ [ -1, 11 ], 1, Concat, [ 1 ] ],  # cat head P6
    [ -1, 3, C3, [ 1024, False ] ],  # 31 (P6/64-xlarge)

    [ [ 22, 25, 28, 31 ], 1, Detect, [ nc, anchors ] ],  # Detect(P3, P4, P5, P6)
  ]
47
models/yolov5m.yaml
Normal file
@ -0,0 +1,47 @@
# parameters
nc: 1  # number of classes
depth_multiple: 0.67  # model depth multiple
width_multiple: 0.75  # layer channel multiple

# anchors
anchors:
  - [4,5, 8,10, 13,16]  # P3/8
  - [23,29, 43,55, 73,105]  # P4/16
  - [146,217, 231,300, 335,433]  # P5/32

# YOLOv5 backbone
backbone:
  # [from, number, module, args]
  [[-1, 1, StemBlock, [64, 3, 2]],  # 0-P1/2
   [-1, 3, C3, [128]],
   [-1, 1, Conv, [256, 3, 2]],  # 2-P3/8
   [-1, 9, C3, [256]],
   [-1, 1, Conv, [512, 3, 2]],  # 4-P4/16
   [-1, 9, C3, [512]],
   [-1, 1, Conv, [1024, 3, 2]],  # 6-P5/32
   [-1, 1, SPP, [1024, [3,5,7]]],
   [-1, 3, C3, [1024, False]],  # 8
  ]

# YOLOv5 head
head:
  [[-1, 1, Conv, [512, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 5], 1, Concat, [1]],  # cat backbone P4
   [-1, 3, C3, [512, False]],  # 12

   [-1, 1, Conv, [256, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 3], 1, Concat, [1]],  # cat backbone P3
   [-1, 3, C3, [256, False]],  # 16 (P3/8-small)

   [-1, 1, Conv, [256, 3, 2]],
   [[-1, 13], 1, Concat, [1]],  # cat head P4
   [-1, 3, C3, [512, False]],  # 19 (P4/16-medium)

   [-1, 1, Conv, [512, 3, 2]],
   [[-1, 9], 1, Concat, [1]],  # cat head P5
   [-1, 3, C3, [1024, False]],  # 22 (P5/32-large)

   [[16, 19, 22], 1, Detect, [nc, anchors]],  # Detect(P3, P4, P5)
  ]
60
models/yolov5m6.yaml
Normal file
@ -0,0 +1,60 @@
# parameters
nc: 1  # number of classes
depth_multiple: 0.67  # model depth multiple
width_multiple: 0.75  # layer channel multiple

# anchors
anchors:
  - [6,7, 9,11, 13,16]  # P3/8
  - [18,23, 26,33, 37,47]  # P4/16
  - [54,67, 77,104, 112,154]  # P5/32
  - [174,238, 258,355, 445,568]  # P6/64

# YOLOv5 backbone
backbone:
  # [from, number, module, args]
  [ [ -1, 1, StemBlock, [ 64, 3, 2 ] ],  # 0-P1/2
    [ -1, 3, C3, [ 128 ] ],
    [ -1, 1, Conv, [ 256, 3, 2 ] ],  # 2-P3/8
    [ -1, 9, C3, [ 256 ] ],
    [ -1, 1, Conv, [ 512, 3, 2 ] ],  # 4-P4/16
    [ -1, 9, C3, [ 512 ] ],
    [ -1, 1, Conv, [ 768, 3, 2 ] ],  # 6-P5/32
    [ -1, 3, C3, [ 768 ] ],
    [ -1, 1, Conv, [ 1024, 3, 2 ] ],  # 8-P6/64
    [ -1, 1, SPP, [ 1024, [ 3, 5, 7 ] ] ],
    [ -1, 3, C3, [ 1024, False ] ],  # 10
  ]

# YOLOv5 head
head:
  [ [ -1, 1, Conv, [ 768, 1, 1 ] ],
    [ -1, 1, nn.Upsample, [ None, 2, 'nearest' ] ],
    [ [ -1, 7 ], 1, Concat, [ 1 ] ],  # cat backbone P5
    [ -1, 3, C3, [ 768, False ] ],  # 14

    [ -1, 1, Conv, [ 512, 1, 1 ] ],
    [ -1, 1, nn.Upsample, [ None, 2, 'nearest' ] ],
    [ [ -1, 5 ], 1, Concat, [ 1 ] ],  # cat backbone P4
    [ -1, 3, C3, [ 512, False ] ],  # 18

    [ -1, 1, Conv, [ 256, 1, 1 ] ],
    [ -1, 1, nn.Upsample, [ None, 2, 'nearest' ] ],
    [ [ -1, 3 ], 1, Concat, [ 1 ] ],  # cat backbone P3
    [ -1, 3, C3, [ 256, False ] ],  # 22 (P3/8-small)

    [ -1, 1, Conv, [ 256, 3, 2 ] ],
    [ [ -1, 19 ], 1, Concat, [ 1 ] ],  # cat head P4
    [ -1, 3, C3, [ 512, False ] ],  # 25 (P4/16-medium)

    [ -1, 1, Conv, [ 512, 3, 2 ] ],
    [ [ -1, 15 ], 1, Concat, [ 1 ] ],  # cat head P5
    [ -1, 3, C3, [ 768, False ] ],  # 28 (P5/32-large)

    [ -1, 1, Conv, [ 768, 3, 2 ] ],
    [ [ -1, 11 ], 1, Concat, [ 1 ] ],  # cat head P6
    [ -1, 3, C3, [ 1024, False ] ],  # 31 (P6/64-xlarge)

    [ [ 22, 25, 28, 31 ], 1, Detect, [ nc, anchors ] ],  # Detect(P3, P4, P5, P6)
  ]
46
models/yolov5n-0.5.yaml
Normal file
@ -0,0 +1,46 @@
# parameters
nc: 1  # number of classes
depth_multiple: 1.0  # model depth multiple
width_multiple: 0.5  # layer channel multiple

# anchors
anchors:
  - [4,5, 8,10, 13,16]  # P3/8
  - [23,29, 43,55, 73,105]  # P4/16
  - [146,217, 231,300, 335,433]  # P5/32

# YOLOv5 backbone
backbone:
  # [from, number, module, args]
  [[-1, 1, StemBlock, [32, 3, 2]],  # 0-P2/4
   [-1, 1, ShuffleV2Block, [128, 2]],  # 1-P3/8
   [-1, 3, ShuffleV2Block, [128, 1]],  # 2
   [-1, 1, ShuffleV2Block, [256, 2]],  # 3-P4/16
   [-1, 7, ShuffleV2Block, [256, 1]],  # 4
   [-1, 1, ShuffleV2Block, [512, 2]],  # 5-P5/32
   [-1, 3, ShuffleV2Block, [512, 1]],  # 6
  ]

# YOLOv5 head
head:
  [[-1, 1, Conv, [128, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 4], 1, Concat, [1]],  # cat backbone P4
   [-1, 1, C3, [128, False]],  # 10

   [-1, 1, Conv, [128, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 2], 1, Concat, [1]],  # cat backbone P3
   [-1, 1, C3, [128, False]],  # 14 (P3/8-small)

   [-1, 1, Conv, [128, 3, 2]],
   [[-1, 11], 1, Concat, [1]],  # cat head P4
   [-1, 1, C3, [128, False]],  # 17 (P4/16-medium)

   [-1, 1, Conv, [128, 3, 2]],
   [[-1, 7], 1, Concat, [1]],  # cat head P5
   [-1, 1, C3, [128, False]],  # 20 (P5/32-large)

   [[14, 17, 20], 1, Detect, [nc, anchors]],  # Detect(P3, P4, P5)
  ]
46
models/yolov5n.yaml
Normal file
@ -0,0 +1,46 @@
# parameters
nc: 1  # number of classes
depth_multiple: 1.0  # model depth multiple
width_multiple: 1.0  # layer channel multiple

# anchors
anchors:
  - [4,5, 8,10, 13,16]  # P3/8
  - [23,29, 43,55, 73,105]  # P4/16
  - [146,217, 231,300, 335,433]  # P5/32

# YOLOv5 backbone
backbone:
  # [from, number, module, args]
  [[-1, 1, StemBlock, [32, 3, 2]],  # 0-P2/4
   [-1, 1, ShuffleV2Block, [128, 2]],  # 1-P3/8
   [-1, 3, ShuffleV2Block, [128, 1]],  # 2
   [-1, 1, ShuffleV2Block, [256, 2]],  # 3-P4/16
   [-1, 7, ShuffleV2Block, [256, 1]],  # 4
   [-1, 1, ShuffleV2Block, [512, 2]],  # 5-P5/32
   [-1, 3, ShuffleV2Block, [512, 1]],  # 6
  ]

# YOLOv5 head
head:
  [[-1, 1, Conv, [128, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 4], 1, Concat, [1]],  # cat backbone P4
   [-1, 1, C3, [128, False]],  # 10

   [-1, 1, Conv, [128, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 2], 1, Concat, [1]],  # cat backbone P3
   [-1, 1, C3, [128, False]],  # 14 (P3/8-small)

   [-1, 1, Conv, [128, 3, 2]],
   [[-1, 11], 1, Concat, [1]],  # cat head P4
   [-1, 1, C3, [128, False]],  # 17 (P4/16-medium)

   [-1, 1, Conv, [128, 3, 2]],
   [[-1, 7], 1, Concat, [1]],  # cat head P5
   [-1, 1, C3, [128, False]],  # 20 (P5/32-large)

   [[14, 17, 20], 1, Detect, [nc, anchors]],  # Detect(P3, P4, P5)
  ]
58
models/yolov5n6.yaml
Normal file
@ -0,0 +1,58 @@
# parameters
nc: 1  # number of classes
depth_multiple: 1.0  # model depth multiple
width_multiple: 1.0  # layer channel multiple

# anchors
anchors:
  - [6,7, 9,11, 13,16]  # P3/8
  - [18,23, 26,33, 37,47]  # P4/16
  - [54,67, 77,104, 112,154]  # P5/32
  - [174,238, 258,355, 445,568]  # P6/64

# YOLOv5 backbone
backbone:
  # [from, number, module, args]
  [[-1, 1, StemBlock, [32, 3, 2]],  # 0-P2/4
   [-1, 1, ShuffleV2Block, [128, 2]],  # 1-P3/8
   [-1, 3, ShuffleV2Block, [128, 1]],  # 2
   [-1, 1, ShuffleV2Block, [256, 2]],  # 3-P4/16
   [-1, 7, ShuffleV2Block, [256, 1]],  # 4
   [-1, 1, ShuffleV2Block, [384, 2]],  # 5-P5/32
   [-1, 3, ShuffleV2Block, [384, 1]],  # 6
   [-1, 1, ShuffleV2Block, [512, 2]],  # 7-P6/64
   [-1, 3, ShuffleV2Block, [512, 1]],  # 8
  ]

# YOLOv5 head
head:
  [[-1, 1, Conv, [128, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 6], 1, Concat, [1]],  # cat backbone P5
   [-1, 1, C3, [128, False]],  # 12

   [-1, 1, Conv, [128, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 4], 1, Concat, [1]],  # cat backbone P4
   [-1, 1, C3, [128, False]],  # 16 (P4/16-medium)

   [-1, 1, Conv, [128, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 2], 1, Concat, [1]],  # cat backbone P3
   [-1, 1, C3, [128, False]],  # 20 (P3/8-small)

   [-1, 1, Conv, [128, 3, 2]],
   [[-1, 17], 1, Concat, [1]],  # cat head P4
   [-1, 1, C3, [128, False]],  # 23 (P4/16-medium)

   [-1, 1, Conv, [128, 3, 2]],
   [[-1, 13], 1, Concat, [1]],  # cat head P5
   [-1, 1, C3, [128, False]],  # 26 (P5/32-large)

   [-1, 1, Conv, [128, 3, 2]],
   [[-1, 9], 1, Concat, [1]],  # cat head P6
   [-1, 1, C3, [128, False]],  # 29 (P6/64-xlarge)

   [[20, 23, 26, 29], 1, Detect, [nc, anchors]],  # Detect(P3, P4, P5, P6)
  ]
47
models/yolov5s.yaml
Normal file
@ -0,0 +1,47 @@
# parameters
nc: 1  # number of classes
depth_multiple: 0.33  # model depth multiple
width_multiple: 0.5  # layer channel multiple

# anchors
anchors:
  - [4,5, 8,10, 13,16]  # P3/8
  - [23,29, 43,55, 73,105]  # P4/16
  - [146,217, 231,300, 335,433]  # P5/32

# YOLOv5 backbone
backbone:
  # [from, number, module, args]
  [[-1, 1, StemBlock, [64, 3, 2]],  # 0-P1/2
   [-1, 3, C3, [128]],
   [-1, 1, Conv, [256, 3, 2]],  # 2-P3/8
   [-1, 9, C3, [256]],
   [-1, 1, Conv, [512, 3, 2]],  # 4-P4/16
   [-1, 9, C3, [512]],
   [-1, 1, Conv, [1024, 3, 2]],  # 6-P5/32
   [-1, 1, SPP, [1024, [3,5,7]]],
   [-1, 3, C3, [1024, False]],  # 8
  ]

# YOLOv5 head
head:
  [[-1, 1, Conv, [512, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 5], 1, Concat, [1]],  # cat backbone P4
   [-1, 3, C3, [512, False]],  # 12

   [-1, 1, Conv, [256, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 3], 1, Concat, [1]],  # cat backbone P3
   [-1, 3, C3, [256, False]],  # 16 (P3/8-small)

   [-1, 1, Conv, [256, 3, 2]],
   [[-1, 13], 1, Concat, [1]],  # cat head P4
   [-1, 3, C3, [512, False]],  # 19 (P4/16-medium)

   [-1, 1, Conv, [512, 3, 2]],
   [[-1, 9], 1, Concat, [1]],  # cat head P5
   [-1, 3, C3, [1024, False]],  # 22 (P5/32-large)

   [[16, 19, 22], 1, Detect, [nc, anchors]],  # Detect(P3, P4, P5)
  ]
60
models/yolov5s6.yaml
Normal file
@ -0,0 +1,60 @@
# parameters
nc: 1  # number of classes
depth_multiple: 0.33  # model depth multiple
width_multiple: 0.50  # layer channel multiple

# anchors
anchors:
  - [6,7, 9,11, 13,16]  # P3/8
  - [18,23, 26,33, 37,47]  # P4/16
  - [54,67, 77,104, 112,154]  # P5/32
  - [174,238, 258,355, 445,568]  # P6/64

# YOLOv5 backbone
backbone:
  # [from, number, module, args]
  [ [ -1, 1, StemBlock, [ 64, 3, 2 ] ],  # 0-P1/2
    [ -1, 3, C3, [ 128 ] ],
    [ -1, 1, Conv, [ 256, 3, 2 ] ],  # 2-P3/8
    [ -1, 9, C3, [ 256 ] ],
    [ -1, 1, Conv, [ 512, 3, 2 ] ],  # 4-P4/16
    [ -1, 9, C3, [ 512 ] ],
    [ -1, 1, Conv, [ 768, 3, 2 ] ],  # 6-P5/32
    [ -1, 3, C3, [ 768 ] ],
    [ -1, 1, Conv, [ 1024, 3, 2 ] ],  # 8-P6/64
    [ -1, 1, SPP, [ 1024, [ 3, 5, 7 ] ] ],
    [ -1, 3, C3, [ 1024, False ] ],  # 10
  ]

# YOLOv5 head
head:
  [ [ -1, 1, Conv, [ 768, 1, 1 ] ],
    [ -1, 1, nn.Upsample, [ None, 2, 'nearest' ] ],
    [ [ -1, 7 ], 1, Concat, [ 1 ] ],  # cat backbone P5
    [ -1, 3, C3, [ 768, False ] ],  # 14

    [ -1, 1, Conv, [ 512, 1, 1 ] ],
    [ -1, 1, nn.Upsample, [ None, 2, 'nearest' ] ],
    [ [ -1, 5 ], 1, Concat, [ 1 ] ],  # cat backbone P4
    [ -1, 3, C3, [ 512, False ] ],  # 18

    [ -1, 1, Conv, [ 256, 1, 1 ] ],
    [ -1, 1, nn.Upsample, [ None, 2, 'nearest' ] ],
    [ [ -1, 3 ], 1, Concat, [ 1 ] ],  # cat backbone P3
    [ -1, 3, C3, [ 256, False ] ],  # 22 (P3/8-small)

    [ -1, 1, Conv, [ 256, 3, 2 ] ],
    [ [ -1, 19 ], 1, Concat, [ 1 ] ],  # cat head P4
    [ -1, 3, C3, [ 512, False ] ],  # 25 (P4/16-medium)

    [ -1, 1, Conv, [ 512, 3, 2 ] ],
    [ [ -1, 15 ], 1, Concat, [ 1 ] ],  # cat head P5
    [ -1, 3, C3, [ 768, False ] ],  # 28 (P5/32-large)

    [ -1, 1, Conv, [ 768, 3, 2 ] ],
    [ [ -1, 11 ], 1, Concat, [ 1 ] ],  # cat head P6
    [ -1, 3, C3, [ 1024, False ] ],  # 31 (P6/64-xlarge)

    [ [ 22, 25, 28, 31 ], 1, Detect, [ nc, anchors ] ],  # Detect(P3, P4, P5, P6)
  ]
256
onnx_infer.py
Normal file
@ -0,0 +1,256 @@
import onnxruntime
import numpy as np
import cv2
import copy
import os
import argparse
from PIL import Image, ImageDraw, ImageFont
import time

plate_color_list = ['黑色', '蓝色', '绿色', '白色', '黄色']
plateName = r"#京沪津渝冀晋蒙辽吉黑苏浙皖闽赣鲁豫鄂湘粤桂琼川贵云藏陕甘青宁新学警港澳挂使领民航危0123456789ABCDEFGHJKLMNPQRSTUVWXYZ险品"
mean_value, std_value = 0.588, 0.193  # mean and std of the recognition model


def decodePlate(preds):  # greedy CTC decoding of the recognition output
    pre = 0
    newPreds = []
    for i in range(len(preds)):
        if preds[i] != 0 and preds[i] != pre:  # drop blanks (index 0) and repeated indices
            newPreds.append(preds[i])
        pre = preds[i]
    plate = ""
    for i in newPreds:
        plate += plateName[int(i)]
    return plate
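A small worked example of the greedy CTC decode above (the indices are arbitrary; assumes decodePlate is in scope):

```
preds = [0, 1, 1, 0, 25, 25, 25, 0, 3]  # raw per-frame argmax indices
# collapsing repeats and dropping blanks (0) leaves [1, 25, 3],
# which decodePlate then maps through plateName to characters
print(decodePlate(preds))
```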
def rec_pre_precessing(img, size=(48, 168)):  # recognition pre-processing
    img = cv2.resize(img, (168, 48))
    img = img.astype(np.float32)
    img = (img / 255 - mean_value) / std_value  # normalize: subtract mean, divide by std
    img = img.transpose(2, 0, 1)  # HWC to CHW
    img = img.reshape(1, *img.shape)  # CHW to BCHW
    return img


def get_plate_result(img, session_rec):  # run the recognition model on one plate crop
    img = rec_pre_precessing(img)
    y_onnx_plate, y_onnx_color = session_rec.run([session_rec.get_outputs()[0].name, session_rec.get_outputs()[1].name], {session_rec.get_inputs()[0].name: img})
    index = np.argmax(y_onnx_plate, axis=-1)
    index_color = np.argmax(y_onnx_color)
    plate_color = plate_color_list[index_color]
    plate_no = decodePlate(index[0])
    return plate_no, plate_color


def allFilePath(rootPath, allFIleList):  # recursively collect all files under rootPath
    fileList = os.listdir(rootPath)
    for temp in fileList:
        if os.path.isfile(os.path.join(rootPath, temp)):
            allFIleList.append(os.path.join(rootPath, temp))
        else:
            allFilePath(os.path.join(rootPath, temp), allFIleList)


def get_split_merge(img):  # split a double-layer plate and merge the halves into one line for recognition
    h, w, c = img.shape
    img_upper = img[0:int(5 / 12 * h), :]
    img_lower = img[int(1 / 3 * h):, :]
    img_upper = cv2.resize(img_upper, (img_lower.shape[1], img_lower.shape[0]))
    new_img = np.hstack((img_upper, img_lower))
    return new_img


def order_points(pts):  # sort the four landmarks into (top-left, top-right, bottom-right, bottom-left) order
    rect = np.zeros((4, 2), dtype="float32")
    s = pts.sum(axis=1)
    rect[0] = pts[np.argmin(s)]
    rect[2] = pts[np.argmax(s)]
    diff = np.diff(pts, axis=1)
    rect[1] = pts[np.argmin(diff)]
    rect[3] = pts[np.argmax(diff)]
    return rect


def four_point_transform(image, pts):  # perspective transform that rectifies the plate to ease recognition
    rect = order_points(pts)
    (tl, tr, br, bl) = rect
    widthA = np.sqrt(((br[0] - bl[0]) ** 2) + ((br[1] - bl[1]) ** 2))
    widthB = np.sqrt(((tr[0] - tl[0]) ** 2) + ((tr[1] - tl[1]) ** 2))
    maxWidth = max(int(widthA), int(widthB))
    heightA = np.sqrt(((tr[0] - br[0]) ** 2) + ((tr[1] - br[1]) ** 2))
    heightB = np.sqrt(((tl[0] - bl[0]) ** 2) + ((tl[1] - bl[1]) ** 2))
    maxHeight = max(int(heightA), int(heightB))
    dst = np.array([
        [0, 0],
        [maxWidth - 1, 0],
        [maxWidth - 1, maxHeight - 1],
        [0, maxHeight - 1]], dtype="float32")
    M = cv2.getPerspectiveTransform(rect, dst)
    warped = cv2.warpPerspective(image, M, (maxWidth, maxHeight))

    # return the warped image
    return warped
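A quick sketch of the two functions above with an axis-aligned quad, assuming cv2/numpy are imported as at the top of this file; the input image is a stand-in:

```
import numpy as np

img0 = np.zeros((120, 160, 3), dtype=np.uint8)  # stand-in BGR image
pts = np.array([[100, 60], [20, 10], [100, 10], [20, 60]], dtype=np.float32)
# order_points -> tl=(20,10), tr=(100,10), br=(100,60), bl=(20,60)
warped = four_point_transform(img0, pts)
print(warped.shape)  # (50, 80, 3)
```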
def my_letter_box(img, size=(640, 640)):  # letterbox resize: scale to fit and pad to the target size
    h, w, c = img.shape
    r = min(size[0] / h, size[1] / w)
    new_h, new_w = int(h * r), int(w * r)
    top = int((size[0] - new_h) / 2)
    left = int((size[1] - new_w) / 2)

    bottom = size[0] - new_h - top
    right = size[1] - new_w - left
    img_resize = cv2.resize(img, (new_w, new_h))
    img = cv2.copyMakeBorder(img_resize, top, bottom, left, right, borderType=cv2.BORDER_CONSTANT, value=(114, 114, 114))
    return img, r, left, top


def xywh2xyxy(boxes):  # convert center x,y,w,h boxes to corner x1,y1,x2,y2 boxes
    xywh = copy.deepcopy(boxes)
    xywh[:, 0] = boxes[:, 0] - boxes[:, 2] / 2
    xywh[:, 1] = boxes[:, 1] - boxes[:, 3] / 2
    xywh[:, 2] = boxes[:, 0] + boxes[:, 2] / 2
    xywh[:, 3] = boxes[:, 1] + boxes[:, 3] / 2
    return xywh


def my_nms(boxes, iou_thresh):  # plain NumPy non-maximum suppression
    index = np.argsort(boxes[:, 4])[::-1]
    keep = []
    while index.size > 0:
        i = index[0]
        keep.append(i)
        x1 = np.maximum(boxes[i, 0], boxes[index[1:], 0])
        y1 = np.maximum(boxes[i, 1], boxes[index[1:], 1])
        x2 = np.minimum(boxes[i, 2], boxes[index[1:], 2])
        y2 = np.minimum(boxes[i, 3], boxes[index[1:], 3])

        w = np.maximum(0, x2 - x1)
        h = np.maximum(0, y2 - y1)

        inter_area = w * h
        union_area = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1]) + (boxes[index[1:], 2] - boxes[index[1:], 0]) * (boxes[index[1:], 3] - boxes[index[1:], 1])
        iou = inter_area / (union_area - inter_area)
        idx = np.where(iou <= iou_thresh)[0]
        index = index[idx + 1]
    return keep


def restore_box(boxes, r, left, top):  # map boxes and landmarks back to the original image coordinates
    boxes[:, [0, 2, 5, 7, 9, 11]] -= left
    boxes[:, [1, 3, 6, 8, 10, 12]] -= top

    boxes[:, [0, 2, 5, 7, 9, 11]] /= r
    boxes[:, [1, 3, 6, 8, 10, 12]] /= r
    return boxes
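my_letter_box returns the scale r and the left/top padding so detections on the 640x640 canvas can be mapped back to the source image; restore_box simply inverts that transform. A worked example (a 960x1280 image gives r=0.5, left=0, top=80):

```
# forward: x_pad = x_orig * r + left,  y_pad = y_orig * r + top
# restore_box inverts it: x_orig = (x_pad - left) / r
r, left, top = 0.5, 0, 80
x_pad, y_pad = 320.0, 400.0
print((x_pad - left) / r, (y_pad - top) / r)  # 640.0 640.0 in original coordinates
```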
def detect_pre_precessing(img,img_size): #检测前处理
|
||||
img,r,left,top=my_letter_box(img,img_size)
|
||||
# cv2.imwrite("1.jpg",img)
|
||||
img =img[:,:,::-1].transpose(2,0,1).copy().astype(np.float32)
|
||||
img=img/255
|
||||
img=img.reshape(1,*img.shape)
|
||||
return img,r,left,top


def post_precessing(dets, r, left, top, conf_thresh=0.3, iou_thresh=0.5):  # detection post-processing
    choice = dets[:, :, 4] > conf_thresh
    dets = dets[choice]
    dets[:, 13:15] *= dets[:, 4:5]
    box = dets[:, :4]
    boxes = xywh2xyxy(box)
    score = np.max(dets[:, 13:15], axis=-1, keepdims=True)
    index = np.argmax(dets[:, 13:15], axis=-1).reshape(-1, 1)
    output = np.concatenate((boxes, score, dets[:, 5:13], index), axis=1)
    reserve_ = my_nms(output, iou_thresh)
    output = output[reserve_]
    output = restore_box(output, r, left, top)
    return output
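
# Shape note: every row of the returned array has 14 columns,
#   [x1, y1, x2, y2, score, lm1x, lm1y, lm2x, lm2y, lm3x, lm3y, lm4x, lm4y, cls]
# where cls is 0 for single-layer and 1 for double-layer plates and all
# coordinates are already mapped back to the original image by restore_box.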


def rec_plate(outputs, img0, session_rec):  # recognize each detected plate
    dict_list = []
    for output in outputs:
        result_dict = {}
        rect = output[:4].tolist()
        land_marks = output[5:13].reshape(4, 2)
        roi_img = four_point_transform(img0, land_marks)
        label = int(output[-1])
        score = output[4]
        if label == 1:  # double-layer plate
            roi_img = get_split_merge(roi_img)
        plate_no, plate_color = get_plate_result(roi_img, session_rec)
        result_dict['rect'] = rect
        result_dict['landmarks'] = land_marks.tolist()
        result_dict['plate_no'] = plate_no
        result_dict['roi_height'] = roi_img.shape[0]
        result_dict['plate_color'] = plate_color
        dict_list.append(result_dict)
    return dict_list


def cv2ImgAddText(img, text, left, top, textColor=(0, 255, 0), textSize=20):  # draw the recognized text on the image
    if isinstance(img, np.ndarray):  # convert an OpenCV image to PIL so non-ASCII text renders
        img = Image.fromarray(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
    draw = ImageDraw.Draw(img)
    fontText = ImageFont.truetype(
        "fonts/platech.ttf", textSize, encoding="utf-8")
    draw.text((left, top), text, textColor, font=fontText)
    return cv2.cvtColor(np.asarray(img), cv2.COLOR_RGB2BGR)


def draw_result(orgimg, dict_list):
    result_str = ""
    for result in dict_list:
        rect_area = result['rect']

        x, y, w, h = rect_area[0], rect_area[1], rect_area[2] - rect_area[0], rect_area[3] - rect_area[1]
        padding_w = 0.05 * w
        padding_h = 0.11 * h
        # clip the padded box to the image: lower bounds at 0, upper bounds at width/height
        rect_area[0] = max(0, int(x - padding_w))
        rect_area[1] = max(0, int(y - padding_h))
        rect_area[2] = min(orgimg.shape[1], int(rect_area[2] + padding_w))
        rect_area[3] = min(orgimg.shape[0], int(rect_area[3] + padding_h))

        height_area = result['roi_height']
        landmarks = result['landmarks']
        result = result['plate_no']
        result_str += result + " "
        for i in range(4):  # draw the four landmark points
            cv2.circle(orgimg, (int(landmarks[i][0]), int(landmarks[i][1])), 5, clors[i], -1)
        cv2.rectangle(orgimg, (rect_area[0], rect_area[1]), (rect_area[2], rect_area[3]), (255, 255, 0), 2)  # draw the box
        if len(result) >= 1:
            orgimg = cv2ImgAddText(orgimg, result, rect_area[0] - height_area, rect_area[1] - height_area - 10, (0, 255, 0), height_area)
    print(result_str)
    return orgimg


if __name__ == "__main__":
    begin = time.time()
    parser = argparse.ArgumentParser()
    parser.add_argument('--detect_model', type=str, default=r'weights/plate_detect.onnx', help='model.pt path(s)')  # detection model
    parser.add_argument('--rec_model', type=str, default='weights/plate_rec_color.onnx', help='model.pt path(s)')  # recognition model
    parser.add_argument('--image_path', type=str, default='imgs', help='source')
    parser.add_argument('--img_size', type=int, default=640, help='inference size (pixels)')
    parser.add_argument('--output', type=str, default='result1', help='output folder')
    opt = parser.parse_args()
    file_list = []
    allFilePath(opt.image_path, file_list)
    providers = ['CPUExecutionProvider']
    clors = [(255, 0, 0), (0, 255, 0), (0, 0, 255), (255, 255, 0), (0, 255, 255)]
    img_size = (opt.img_size, opt.img_size)
    session_detect = onnxruntime.InferenceSession(opt.detect_model, providers=providers)
    session_rec = onnxruntime.InferenceSession(opt.rec_model, providers=providers)
    if not os.path.exists(opt.output):
        os.mkdir(opt.output)
    save_path = opt.output
    count = 0
    for pic_ in file_list:
        count += 1
        print(count, pic_, end=" ")
        img = cv2.imread(pic_)
        img0 = copy.deepcopy(img)
        img, r, left, top = detect_pre_precessing(img, img_size)  # detection pre-processing
        # print(img.shape)
        y_onnx = session_detect.run([session_detect.get_outputs()[0].name], {session_detect.get_inputs()[0].name: img})[0]
        outputs = post_precessing(y_onnx, r, left, top)  # detection post-processing
        result_list = rec_plate(outputs, img0, session_rec)
        ori_img = draw_result(img0, result_list)
        img_name = os.path.basename(pic_)
        save_img_path = os.path.join(save_path, img_name)
        cv2.imwrite(save_img_path, ori_img)
    print(f"total time: {time.time() - begin} s")
342
openvino_infer.py
Normal file
@ -0,0 +1,342 @@

import cv2
import matplotlib.pyplot as plt
import numpy as np
from openvino.runtime import Core
import os
import time
import copy
from PIL import Image, ImageDraw, ImageFont
import argparse


def cv_imread(path):  # read images whose paths contain non-ASCII characters
    img = cv2.imdecode(np.fromfile(path, dtype=np.uint8), -1)
    return img


def allFilePath(rootPath, allFIleList):  # recursively collect all files under rootPath
    fileList = os.listdir(rootPath)
    for temp in fileList:
        if os.path.isfile(os.path.join(rootPath, temp)):
            # if temp.endswith("jpg"):
            allFIleList.append(os.path.join(rootPath, temp))
        else:
            allFilePath(os.path.join(rootPath, temp), allFIleList)


mean_value, std_value = (0.588, 0.193)  # mean and std of the recognition model
plateName = r"#京沪津渝冀晋蒙辽吉黑苏浙皖闽赣鲁豫鄂湘粤桂琼川贵云藏陕甘青宁新学警港澳挂使领民航危0123456789ABCDEFGHJKLMNPQRSTUVWXYZ险品"


def rec_pre_precessing(img, size=(48, 168)):  # recognition pre-processing
    img = cv2.resize(img, (168, 48))
    img = img.astype(np.float32)
    img = (img / 255 - mean_value) / std_value
    img = img.transpose(2, 0, 1)
    img = img.reshape(1, *img.shape)
    return img


def decodePlate(preds):  # recognition post-processing: greedy CTC-style decode
    pre = 0
    newPreds = []
    preds = preds.astype(np.int8)[0]
    for i in range(len(preds)):
        if preds[i] != 0 and preds[i] != pre:  # skip blanks (index 0) and repeated indices
            newPreds.append(preds[i])
        pre = preds[i]
    plate = ""
    for i in newPreds:
        plate += plateName[int(i)]
    return plate
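
# Decode sketch (hypothetical index sequence, not part of the original script):
# preds = [[0, 5, 5, 0, 5, 7, 0]] collapses to [5, 5, 7]: frame-level repeats
# are merged unless separated by the blank index 0, so genuinely doubled
# characters survive while duplicate frames are dropped.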


def load_model(onnx_path):  # compile the ONNX model with OpenVINO and return it with its output layer
    ie = Core()
    model_onnx = ie.read_model(model=onnx_path)
    compiled_model_onnx = ie.compile_model(model=model_onnx, device_name="CPU")
    output_layer_onnx = compiled_model_onnx.output(0)
    return compiled_model_onnx, output_layer_onnx


def get_plate_result(img, rec_model, rec_output):
    img = rec_pre_precessing(img)
    # time_b = time.time()
    res_onnx = rec_model([img])[rec_output]
    # time_e = time.time()
    index = np.argmax(res_onnx, axis=-1)  # pick the most probable character index per position
    plate_no = decodePlate(index)
    # print(f'{plate_no},time is {time_e-time_b}')
    return plate_no


def get_split_merge(img):  # split a double-layer plate and merge the two rows into one line before recognition
    h, w, c = img.shape
    img_upper = img[0:int(5 / 12 * h), :]
    img_lower = img[int(1 / 3 * h):, :]
    img_upper = cv2.resize(img_upper, (img_lower.shape[1], img_lower.shape[0]))
    new_img = np.hstack((img_upper, img_lower))
    return new_img
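
# Usage sketch for the OpenVINO helpers above (paths follow the argparse
# defaults below; the plate_crop variable is hypothetical):
#   rec_model, rec_output = load_model("weights/plate_rec.onnx")
#   plate_no = get_plate_result(plate_crop, rec_model, rec_output)
# compile_model(..., device_name="CPU") compiles the graph once; the compiled
# model is then called like a function on a list of input arrays.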


def order_points(pts):
    rect = np.zeros((4, 2), dtype="float32")
    s = pts.sum(axis=1)
    rect[0] = pts[np.argmin(s)]
    rect[2] = pts[np.argmax(s)]
    diff = np.diff(pts, axis=1)
    rect[1] = pts[np.argmin(diff)]
    rect[3] = pts[np.argmax(diff)]
    return rect


def four_point_transform(image, pts):
    rect = order_points(pts)
    (tl, tr, br, bl) = rect
    widthA = np.sqrt(((br[0] - bl[0]) ** 2) + ((br[1] - bl[1]) ** 2))
    widthB = np.sqrt(((tr[0] - tl[0]) ** 2) + ((tr[1] - tl[1]) ** 2))
    maxWidth = max(int(widthA), int(widthB))
    heightA = np.sqrt(((tr[0] - br[0]) ** 2) + ((tr[1] - br[1]) ** 2))
    heightB = np.sqrt(((tl[0] - bl[0]) ** 2) + ((tl[1] - bl[1]) ** 2))
    maxHeight = max(int(heightA), int(heightB))
    dst = np.array([
        [0, 0],
        [maxWidth - 1, 0],
        [maxWidth - 1, maxHeight - 1],
        [0, maxHeight - 1]], dtype="float32")
    M = cv2.getPerspectiveTransform(rect, dst)
    warped = cv2.warpPerspective(image, M, (maxWidth, maxHeight))

    # return the warped image
    return warped


def my_letter_box(img, size=(640, 640)):
    h, w, c = img.shape
    r = min(size[0] / h, size[1] / w)
    new_h, new_w = int(h * r), int(w * r)
    top = int((size[0] - new_h) / 2)
    left = int((size[1] - new_w) / 2)

    bottom = size[0] - new_h - top
    right = size[1] - new_w - left
    img_resize = cv2.resize(img, (new_w, new_h))
    img = cv2.copyMakeBorder(img_resize, top, bottom, left, right, borderType=cv2.BORDER_CONSTANT, value=(114, 114, 114))
    return img, r, left, top


def xywh2xyxy(boxes):
    xywh = copy.deepcopy(boxes)
    xywh[:, 0] = boxes[:, 0] - boxes[:, 2] / 2
    xywh[:, 1] = boxes[:, 1] - boxes[:, 3] / 2
    xywh[:, 2] = boxes[:, 0] + boxes[:, 2] / 2
    xywh[:, 3] = boxes[:, 1] + boxes[:, 3] / 2
    return xywh


def my_nms(boxes, iou_thresh):
    index = np.argsort(boxes[:, 4])[::-1]
    keep = []
    while index.size > 0:
        i = index[0]
        keep.append(i)
        x1 = np.maximum(boxes[i, 0], boxes[index[1:], 0])
        y1 = np.maximum(boxes[i, 1], boxes[index[1:], 1])
        x2 = np.minimum(boxes[i, 2], boxes[index[1:], 2])
        y2 = np.minimum(boxes[i, 3], boxes[index[1:], 3])

        w = np.maximum(0, x2 - x1)
        h = np.maximum(0, y2 - y1)

        inter_area = w * h
        union_area = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1]) + (boxes[index[1:], 2] - boxes[index[1:], 0]) * (boxes[index[1:], 3] - boxes[index[1:], 1])
        iou = inter_area / (union_area - inter_area)
        idx = np.where(iou <= iou_thresh)[0]
        index = index[idx + 1]
    return keep


def restore_box(boxes, r, left, top):
    boxes[:, [0, 2, 5, 7, 9, 11]] -= left
    boxes[:, [1, 3, 6, 8, 10, 12]] -= top

    boxes[:, [0, 2, 5, 7, 9, 11]] /= r
    boxes[:, [1, 3, 6, 8, 10, 12]] /= r
    return boxes


def detect_pre_precessing(img, img_size):
    img, r, left, top = my_letter_box(img, img_size)
    # cv2.imwrite("1.jpg", img)
    img = img[:, :, ::-1].transpose(2, 0, 1).copy().astype(np.float32)
    img = img / 255
    img = img.reshape(1, *img.shape)
    return img, r, left, top


def post_precessing(dets, r, left, top, conf_thresh=0.3, iou_thresh=0.5):  # detection post-processing
    choice = dets[:, :, 4] > conf_thresh
    dets = dets[choice]
    dets[:, 13:15] *= dets[:, 4:5]
    box = dets[:, :4]
    boxes = xywh2xyxy(box)
    score = np.max(dets[:, 13:15], axis=-1, keepdims=True)
    index = np.argmax(dets[:, 13:15], axis=-1).reshape(-1, 1)
    output = np.concatenate((boxes, score, dets[:, 5:13], index), axis=1)
    reserve_ = my_nms(output, iou_thresh)
    output = output[reserve_]
    output = restore_box(output, r, left, top)
    return output


def rec_plate(outputs, img0, rec_model, rec_output):
    dict_list = []
    for output in outputs:
        result_dict = {}
        rect = output[:4].tolist()
        land_marks = output[5:13].reshape(4, 2)
        roi_img = four_point_transform(img0, land_marks)
        label = int(output[-1])
        if label == 1:  # double-layer plate
            roi_img = get_split_merge(roi_img)
        plate_no = get_plate_result(roi_img, rec_model, rec_output)  # plate recognition result
        result_dict['rect'] = rect
        result_dict['landmarks'] = land_marks.tolist()
        result_dict['plate_no'] = plate_no
        result_dict['roi_height'] = roi_img.shape[0]
        dict_list.append(result_dict)
    return dict_list




def cv2ImgAddText(img, text, left, top, textColor=(0, 255, 0), textSize=20):
    if isinstance(img, np.ndarray):  # convert an OpenCV image to PIL so non-ASCII text renders
        img = Image.fromarray(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
    draw = ImageDraw.Draw(img)
    fontText = ImageFont.truetype(
        "fonts/platech.ttf", textSize, encoding="utf-8")
    draw.text((left, top), text, textColor, font=fontText)
    return cv2.cvtColor(np.asarray(img), cv2.COLOR_RGB2BGR)


def draw_result(orgimg, dict_list):
    result_str = ""
    for result in dict_list:
        rect_area = result['rect']

        x, y, w, h = rect_area[0], rect_area[1], rect_area[2] - rect_area[0], rect_area[3] - rect_area[1]
        padding_w = 0.05 * w
        padding_h = 0.11 * h
        # clip the padded box to the image: lower bounds at 0, upper bounds at width/height
        rect_area[0] = max(0, int(x - padding_w))
        rect_area[1] = max(0, int(y - padding_h))
        rect_area[2] = min(orgimg.shape[1], int(rect_area[2] + padding_w))
        rect_area[3] = min(orgimg.shape[0], int(rect_area[3] + padding_h))

        height_area = result['roi_height']
        landmarks = result['landmarks']
        result = result['plate_no']
        result_str += result + " "
        # for i in range(4):  # landmark points
        #     cv2.circle(orgimg, (int(landmarks[i][0]), int(landmarks[i][1])), 5, clors[i], -1)

        if len(result) >= 6:  # only draw plates with a plausible length
            cv2.rectangle(orgimg, (rect_area[0], rect_area[1]), (rect_area[2], rect_area[3]), (0, 0, 255), 2)  # draw the box
            orgimg = cv2ImgAddText(orgimg, result, rect_area[0] - height_area, rect_area[1] - height_area - 10, (0, 255, 0), height_area)
    # print(result_str)
    return orgimg


def get_second(capture):
    if capture.isOpened():
        rate = capture.get(5)  # frame rate
        FrameNumber = capture.get(7)  # total number of frames in the video
        duration = FrameNumber / rate  # duration in seconds (divide by 60 for minutes)
        return int(rate), int(FrameNumber), int(duration)


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument('--detect_model', type=str, default=r'weights/plate_detect.onnx', help='model.pt path(s)')  # detection model
    parser.add_argument('--rec_model', type=str, default='weights/plate_rec.onnx', help='model.pt path(s)')  # recognition model
    parser.add_argument('--image_path', type=str, default='imgs', help='source')
    parser.add_argument('--img_size', type=int, default=640, help='inference size (pixels)')
    parser.add_argument('--output', type=str, default='result1', help='output folder')
    opt = parser.parse_args()
    file_list = []
    file_folder = opt.image_path
    allFilePath(file_folder, file_list)
    rec_onnx_path = opt.rec_model
    detect_onnx_path = opt.detect_model
    rec_model, rec_output = load_model(rec_onnx_path)
    detect_model, detect_output = load_model(detect_onnx_path)
    count = 0
    img_size = (opt.img_size, opt.img_size)
    begin = time.time()
    save_path = opt.output
    if not os.path.exists(save_path):
        os.mkdir(save_path)
    for pic_ in file_list:

        count += 1
        print(count, pic_, end=" ")
        img = cv2.imread(pic_)
        time_b = time.time()
        if img.shape[-1] == 4:  # drop the alpha channel if present
            img = cv2.cvtColor(img, cv2.COLOR_BGRA2BGR)
        img0 = copy.deepcopy(img)
        img, r, left, top = detect_pre_precessing(img, img_size)  # detection pre-processing
        # print(img.shape)
        det_result = detect_model([img])[detect_output]
        outputs = post_precessing(det_result, r, left, top)  # detection post-processing
        time_1 = time.time()
        result_list = rec_plate(outputs, img0, rec_model, rec_output)
        time_e = time.time()
        print(f'time: {time_e - time_b} s')
        ori_img = draw_result(img0, result_list)
        img_name = os.path.basename(pic_)
        save_img_path = os.path.join(save_path, img_name)

        cv2.imwrite(save_img_path, ori_img)
    print(f"total time: {time.time() - begin} s")

    # video_name = r"plate.mp4"
    # capture = cv2.VideoCapture(video_name)
    # fourcc = cv2.VideoWriter_fourcc(*'MP4V')
    # fps = capture.get(cv2.CAP_PROP_FPS)  # frame rate
    # width, height = int(capture.get(cv2.CAP_PROP_FRAME_WIDTH)), int(capture.get(cv2.CAP_PROP_FRAME_HEIGHT))  # frame width and height
    # out = cv2.VideoWriter('2result.mp4', fourcc, fps, (width, height))  # video writer
    # frame_count = 0
    # fps_all = 0
    # rate, FrameNumber, duration = get_second(capture)
    # # with open("example.csv", mode='w', newline='') as example_file:
    # #     fieldnames = ['plate', 'time']
    # #     writer = csv.DictWriter(example_file, fieldnames=fieldnames, delimiter=',', quotechar='"', quoting=csv.QUOTE_MINIMAL)
    # #     writer.writeheader()
    # if capture.isOpened():
    #     while True:
    #         t1 = cv2.getTickCount()
    #         frame_count += 1
    #         ret, img = capture.read()
    #         if not ret:
    #             break
    #         # if frame_count % rate == 0:
    #         img0 = copy.deepcopy(img)
    #         img, r, left, top = detect_pre_precessing(img, img_size)  # detection pre-processing
    #         # print(img.shape)
    #         det_result = detect_model([img])[detect_output]
    #         outputs = post_precessing(det_result, r, left, top)  # detection post-processing
    #         result_list = rec_plate(outputs, img0, rec_model, rec_output)
    #         ori_img = draw_result(img0, result_list)
    #         t2 = cv2.getTickCount()
    #         infer_time = (t2 - t1) / cv2.getTickFrequency()
    #         fps = 1.0 / infer_time
    #         fps_all += fps
    #         str_fps = f'fps:{fps:.4f}'
    #         out.write(ori_img)
    #         cv2.putText(ori_img, str_fps, (20, 20), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
    #         cv2.imshow("haha", ori_img)
    #         cv2.waitKey(1)

    #         # current_time = int(frame_count / FrameNumber * duration)
    #         # sec = current_time % 60
    #         # minute = current_time // 60
    #         # for result_ in result_list:
    #         #     plate_no = result_['plate_no']
    #         #     if not is_car_number(pattern_str, plate_no):
    #         #         continue
    #         #     print(f'plate: {plate_no}, time: {minute}m{sec}s')
    #         #     time_str = f'{minute}m{sec}s'
    #         #     writer.writerow({"plate": plate_no, "time": time_str})
    #         # out.write(ori_img)

    # else:
    #     print("failed to open video")
    # capture.release()
    # out.release()
    # cv2.destroyAllWindows()
    # print(f"all frame is {frame_count}, average fps is {fps_all / frame_count}")
15
plate_recognition/double_plate_split_merge.py
Normal file
@ -0,0 +1,15 @@

import os
import cv2
import numpy as np

def get_split_merge(img):  # split a double-layer plate into upper and lower rows and merge them into one line
    h, w, c = img.shape
    img_upper = img[0:int(5 / 12 * h), :]
    img_lower = img[int(1 / 3 * h):, :]
    img_upper = cv2.resize(img_upper, (img_lower.shape[1], img_lower.shape[0]))
    new_img = np.hstack((img_upper, img_lower))
    return new_img
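
# Geometry sketch (hypothetical height, not part of the original module):
# for h = 96 the upper strip is rows 0..39 (int(5/12 * 96) = 40) and the lower
# strip is rows 32..95 (int(1/3 * 96) = 32), so the two crops deliberately
# overlap by a few rows before the upper one is resized and stacked beside
# the lower one.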


if __name__ == "__main__":
    img = cv2.imread("double_plate/tmp8078.png")
    new_img = get_split_merge(img)
    cv2.imwrite("double_plate/new.jpg", new_img)
203
plate_recognition/plateNet.py
Normal file
@ -0,0 +1,203 @@

import torch.nn as nn
import torch
import torch.nn.functional as F  # needed by myNet_ocr_color's log_softmax below


class myNet_ocr(nn.Module):
    def __init__(self, cfg=None, num_classes=78, export=False):
        super(myNet_ocr, self).__init__()
        if cfg is None:
            cfg = [32, 32, 64, 64, 'M', 128, 128, 'M', 196, 196, 'M', 256, 256]
            # cfg = [32, 32, 'M', 64, 64, 'M', 128, 128, 'M', 256, 256]
        self.feature = self.make_layers(cfg, True)
        self.export = export
        # self.classifier = nn.Linear(cfg[-1], num_classes)
        # self.loc = nn.MaxPool2d((2, 2), (5, 1), (0, 1), ceil_mode=True)
        # self.loc = nn.AvgPool2d((2, 2), (5, 2), (0, 1), ceil_mode=False)
        self.loc = nn.MaxPool2d((5, 2), (1, 1), (0, 1), ceil_mode=False)
        self.newCnn = nn.Conv2d(cfg[-1], num_classes, 1, 1)
        # self.newBn = nn.BatchNorm2d(num_classes)

    def make_layers(self, cfg, batch_norm=False):
        layers = []
        in_channels = 3
        for i in range(len(cfg)):
            if i == 0:
                conv2d = nn.Conv2d(in_channels, cfg[i], kernel_size=5, stride=1)
                if batch_norm:
                    layers += [conv2d, nn.BatchNorm2d(cfg[i]), nn.ReLU(inplace=True)]
                else:
                    layers += [conv2d, nn.ReLU(inplace=True)]
                in_channels = cfg[i]
            else:
                if cfg[i] == 'M':
                    layers += [nn.MaxPool2d(kernel_size=3, stride=2, ceil_mode=True)]
                else:
                    conv2d = nn.Conv2d(in_channels, cfg[i], kernel_size=3, padding=(1, 1), stride=1)
                    if batch_norm:
                        layers += [conv2d, nn.BatchNorm2d(cfg[i]), nn.ReLU(inplace=True)]
                    else:
                        layers += [conv2d, nn.ReLU(inplace=True)]
                    in_channels = cfg[i]
        return nn.Sequential(*layers)

    def forward(self, x):
        x = self.feature(x)
        x = self.loc(x)
        x = self.newCnn(x)
        # x = self.newBn(x)
        if self.export:
            conv = x.squeeze(2)  # b * num_classes * width
            conv = conv.transpose(2, 1)  # [b, w, c]
            # conv = conv.argmax(dim=2)
            return conv
        else:
            b, c, h, w = x.size()
            assert h == 1, "the height of conv must be 1"
            conv = x.squeeze(2)  # b * num_classes * width
            conv = conv.permute(2, 0, 1)  # [w, b, c]
            # output = F.log_softmax(self.rnn(conv), dim=2)
            output = torch.softmax(conv, dim=2)
            return output
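
# Shape sketch (numbers follow the __main__ check at the bottom of this file):
# a 1 x 3 x 48 x 216 input shrinks to height 1 after the conv stack and the
# (5, 2) max-pool, so export=True returns logits of shape [batch, width,
# classes] for ONNX export, while export=False returns a softmax over
# [width, batch, classes], the layout CTC decoding expects.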


myCfg = [32, 'M', 64, 'M', 96, 'M', 128, 'M', 256]


class myNet(nn.Module):
    def __init__(self, cfg=None, num_classes=3):
        super(myNet, self).__init__()
        if cfg is None:
            cfg = myCfg
        self.feature = self.make_layers(cfg, True)
        self.classifier = nn.Linear(cfg[-1], num_classes)

    def make_layers(self, cfg, batch_norm=False):
        layers = []
        in_channels = 3
        for i in range(len(cfg)):
            if i == 0:
                conv2d = nn.Conv2d(in_channels, cfg[i], kernel_size=5, stride=1)
                if batch_norm:
                    layers += [conv2d, nn.BatchNorm2d(cfg[i]), nn.ReLU(inplace=True)]
                else:
                    layers += [conv2d, nn.ReLU(inplace=True)]
                in_channels = cfg[i]
            else:
                if cfg[i] == 'M':
                    layers += [nn.MaxPool2d(kernel_size=3, stride=2, ceil_mode=True)]
                else:
                    conv2d = nn.Conv2d(in_channels, cfg[i], kernel_size=3, padding=1, stride=1)
                    if batch_norm:
                        layers += [conv2d, nn.BatchNorm2d(cfg[i]), nn.ReLU(inplace=True)]
                    else:
                        layers += [conv2d, nn.ReLU(inplace=True)]
                    in_channels = cfg[i]
        return nn.Sequential(*layers)

    def forward(self, x):
        x = self.feature(x)
        x = nn.AvgPool2d(kernel_size=3, stride=1)(x)
        x = x.view(x.size(0), -1)
        y = self.classifier(x)
        return y


class MyNet_color(nn.Module):
    def __init__(self, class_num=6):
        super(MyNet_color, self).__init__()
        self.class_num = class_num
        self.backbone = nn.Sequential(
            nn.Conv2d(in_channels=3, out_channels=16, kernel_size=(5, 5), stride=(1, 1)),  # 0
            torch.nn.BatchNorm2d(16),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=(2, 2)),
            nn.Dropout(0),
            nn.Flatten(),
            nn.Linear(480, 64),  # 480 = 16 * H' * W', so this layer fixes the expected input size
            nn.Dropout(0),
            nn.ReLU(),
            nn.Linear(64, class_num),
            nn.Dropout(0),
            nn.Softmax(1)
        )

    def forward(self, x):
        logits = self.backbone(x)

        return logits


class myNet_ocr_color(nn.Module):
    def __init__(self, cfg=None, num_classes=78, export=False, color_num=None):
        super(myNet_ocr_color, self).__init__()
        if cfg is None:
            cfg = [32, 32, 64, 64, 'M', 128, 128, 'M', 196, 196, 'M', 256, 256]
            # cfg = [32, 32, 'M', 64, 64, 'M', 128, 128, 'M', 256, 256]
        self.feature = self.make_layers(cfg, True)
        self.export = export
        self.color_num = color_num
        self.conv_out_num = 12  # the first conv of the color branch outputs 12 channels
        if self.color_num:
            self.conv1 = nn.Conv2d(cfg[-1], self.conv_out_num, kernel_size=3, stride=2)
            self.bn1 = nn.BatchNorm2d(self.conv_out_num)
            self.relu1 = nn.ReLU(inplace=True)
            self.gap = nn.AdaptiveAvgPool2d(output_size=1)
            self.color_classifier = nn.Conv2d(self.conv_out_num, self.color_num, kernel_size=1, stride=1)
            self.color_bn = nn.BatchNorm2d(self.color_num)
            self.flatten = nn.Flatten()
        self.loc = nn.MaxPool2d((5, 2), (1, 1), (0, 1), ceil_mode=False)
        self.newCnn = nn.Conv2d(cfg[-1], num_classes, 1, 1)
        # self.newBn = nn.BatchNorm2d(num_classes)

    def make_layers(self, cfg, batch_norm=False):
        layers = []
        in_channels = 3
        for i in range(len(cfg)):
            if i == 0:
                conv2d = nn.Conv2d(in_channels, cfg[i], kernel_size=5, stride=1)
                if batch_norm:
                    layers += [conv2d, nn.BatchNorm2d(cfg[i]), nn.ReLU(inplace=True)]
                else:
                    layers += [conv2d, nn.ReLU(inplace=True)]
                in_channels = cfg[i]
            else:
                if cfg[i] == 'M':
                    layers += [nn.MaxPool2d(kernel_size=3, stride=2, ceil_mode=True)]
                else:
                    conv2d = nn.Conv2d(in_channels, cfg[i], kernel_size=3, padding=(1, 1), stride=1)
                    if batch_norm:
                        layers += [conv2d, nn.BatchNorm2d(cfg[i]), nn.ReLU(inplace=True)]
                    else:
                        layers += [conv2d, nn.ReLU(inplace=True)]
                    in_channels = cfg[i]
        return nn.Sequential(*layers)

    def forward(self, x):
        x = self.feature(x)
        if self.color_num:
            x_color = self.conv1(x)
            x_color = self.bn1(x_color)
            x_color = self.relu1(x_color)
            x_color = self.color_classifier(x_color)
            x_color = self.color_bn(x_color)
            x_color = self.gap(x_color)
            x_color = self.flatten(x_color)
        x = self.loc(x)
        x = self.newCnn(x)

        if self.export:
            conv = x.squeeze(2)  # b * num_classes * width
            conv = conv.transpose(2, 1)  # [b, w, c]
            if self.color_num:
                return conv, x_color
            return conv
        else:
            b, c, h, w = x.size()
            assert h == 1, "the height of conv must be 1"
            conv = x.squeeze(2)  # b * num_classes * width
            conv = conv.permute(2, 0, 1)  # [w, b, c]
            output = F.log_softmax(conv, dim=2)
            if self.color_num:
                return output, x_color
            return output


if __name__ == '__main__':
    x = torch.randn(1, 3, 48, 216)
    model = myNet_ocr(num_classes=78, export=True)
    out = model(x)
    print(out.shape)
119
plate_recognition/plate_rec.py
Normal file
@ -0,0 +1,119 @@

from plate_recognition.plateNet import myNet_ocr, myNet_ocr_color
import torch
import torch.nn as nn
import cv2
import numpy as np
import os
import time
import sys


def cv_imread(path):  # read images whose paths contain non-ASCII characters
    img = cv2.imdecode(np.fromfile(path, dtype=np.uint8), -1)
    return img


def allFilePath(rootPath, allFIleList):
    fileList = os.listdir(rootPath)
    for temp in fileList:
        if os.path.isfile(os.path.join(rootPath, temp)):
            if temp.endswith('.jpg') or temp.endswith('.png') or temp.endswith('.JPG'):
                allFIleList.append(os.path.join(rootPath, temp))
        else:
            allFilePath(os.path.join(rootPath, temp), allFIleList)


device = torch.device('cuda') if torch.cuda.is_available() else torch.device("cpu")
color = ['黑色', '蓝色', '绿色', '白色', '黄色']  # black, blue, green, white, yellow
plateName = r"#京沪津渝冀晋蒙辽吉黑苏浙皖闽赣鲁豫鄂湘粤桂琼川贵云藏陕甘青宁新学警港澳挂使领民航危0123456789ABCDEFGHJKLMNPQRSTUVWXYZ险品"
mean_value, std_value = (0.588, 0.193)


def decodePlate(preds):  # greedy CTC-style decode that also returns the kept frame indices
    pre = 0
    newPreds = []
    index = []
    for i in range(len(preds)):
        if preds[i] != 0 and preds[i] != pre:
            newPreds.append(preds[i])
            index.append(i)
        pre = preds[i]
    return newPreds, index


def image_processing(img, device):
    img = cv2.resize(img, (168, 48))
    img = np.reshape(img, (48, 168, 3))

    # normalize
    img = img.astype(np.float32)
    img = (img / 255. - mean_value) / std_value
    img = img.transpose([2, 0, 1])
    img = torch.from_numpy(img)

    img = img.to(device)
    img = img.view(1, *img.size())
    return img


def get_plate_result(img, device, model, is_color=False):
    input = image_processing(img, device)
    if is_color:  # also predict the plate color
        preds, color_preds = model(input)
        color_preds = torch.softmax(color_preds, dim=-1)
        color_conf, color_index = torch.max(color_preds, dim=-1)
        color_conf = color_conf.item()
    else:
        preds = model(input)
    preds = torch.softmax(preds, dim=-1)
    prob, index = preds.max(dim=-1)
    index = index.view(-1).detach().cpu().numpy()
    prob = prob.view(-1).detach().cpu().numpy()

    # preds = preds.view(-1).detach().cpu().numpy()
    newPreds, new_index = decodePlate(index)
    prob = prob[new_index]
    plate = ""
    for i in newPreds:
        plate += plateName[i]
    # if not (plate[0] in plateName[1:44]):
    #     return ""
    if is_color:
        return plate, prob, color[color_index], color_conf  # plate string, per-character probabilities, color, color confidence
    else:
        return plate, prob


def init_model(device, model_path, is_color=False):
    # print(sys.path)
    # model_path = "plate_recognition/model/checkpoint_61_acc_0.9715.pth"
    check_point = torch.load(model_path, map_location=device)
    model_state = check_point['state_dict']
    cfg = check_point['cfg']
    color_classes = 0
    if is_color:
        color_classes = 5  # number of color classes
    model = myNet_ocr_color(num_classes=len(plateName), export=True, cfg=cfg, color_num=color_classes)

    model.load_state_dict(model_state, strict=False)
    model.to(device)
    model.eval()
    return model


# model = init_model(device)
if __name__ == '__main__':
    model_path = r"weights/plate_rec_color.pth"
    image_path = "images/tmp2424.png"
    testPath = r"/mnt/Gpan/Mydata/pytorchPorject/CRNN/crnn_plate_recognition/images"
    fileList = []
    allFilePath(testPath, fileList)
    # result = get_plate_result(image_path, device)
    # print(result)
    is_color = False
    model = init_model(device, model_path, is_color=is_color)
    right = 0
    begin = time.time()

    for imge_path in fileList:
        img = cv2.imread(imge_path)
        if is_color:
            plate, _, plate_color, _ = get_plate_result(img, device, model, is_color=is_color)
            print(plate)
        else:
            plate, _ = get_plate_result(img, device, model, is_color=is_color)
            print(plate, imge_path)
42
readme/README.md
Normal file
@ -0,0 +1,42 @@

### **Plate detection training**

1. **Download the dataset:** contact wechat we0091234 to obtain it.
The data is selected and converted from the CCPD and CRPD datasets.
The labels use the YOLO format:

```
label x y w h pt1x pt1y pt2x pt2y pt3x pt3y pt4x pt4y
```

The keypoints are ordered (top-left, top-right, bottom-right, bottom-left).
All coordinates are normalized: x and y are the box center divided by the image width and height, w and h are the box size divided by the image width and height, and each ptx, pty is a keypoint coordinate divided by the width and height; see the example line after this item.

**Your own dataset** can be annotated with labelme (create polygons, marking the four plate corners), then converted to YOLO format with json2yolo.py, after which it is ready for training.
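
A concrete label line under assumed numbers (a 1920x1080 image whose single-layer plate spans (864, 486) to (1056, 594), corners listed tl, tr, br, bl):

```
0 0.5 0.5 0.1 0.1 0.45 0.45 0.55 0.45 0.55 0.55 0.45 0.55
```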

2. **Edit the train and val paths in data/widerface.yaml to point at your data**

```
train: /your/train/path  # change to your training set path
val: /your/val/path      # change to your validation set path
# number of classes
nc: 2  # two classes are used here: 0 single-layer plate, 1 double-layer plate

# class names
names: [ 'single','double']

```
3. **Train**

```
python3 train.py --data data/widerface.yaml --cfg models/yolov5n-0.5.yaml --weights weights/plate_detect.pt --epoch 120
```

The results are saved in the run folder.

### onnx export

1. To export the detection model to onnx, install **[onnx-simplifier](https://github.com/daquexian/onnx-simplifier)** first:

```
python export.py --weights ./weights/plate_detect.pt --img_size 640 --batch_size 1
onnxsim weights/plate_detect.onnx weights/plate_detect.onnx
```
BIN result_Rainy/20230402110037.jpg Normal file (674 KiB)
BIN result_Rainy/20230402114216.jpg Normal file (790 KiB)
BIN result_Rainy/20230402122209.jpg Normal file (725 KiB)
BIN result_Rainy/20230402124902.jpg Normal file (824 KiB)
BIN result_Rainy/20230402174157.jpg Normal file (552 KiB)
BIN result_Rainy/20230402174728.jpg Normal file (507 KiB)
BIN result_Rainy/20230402180202.jpg Normal file (704 KiB)
BIN result_Rainy/20230402181145.jpg Normal file (542 KiB)
BIN result_Rainy/20230402181201.jpg Normal file (453 KiB)
BIN result_Rainy/20230402181706.jpg Normal file (405 KiB)
BIN result_Rainy/20230402181837.jpg Normal file (438 KiB)
BIN result_Rainy/20230402181905.jpg Normal file (501 KiB)
BIN result_Rainy/20230402182355.jpg Normal file (455 KiB)
BIN result_Rainy/20230402185127.jpg Normal file (483 KiB)
BIN result_Rainy/20230402185227.jpg Normal file (453 KiB)
BIN result_Rainy/20230402185235.jpg Normal file (494 KiB)
BIN result_Rainy/20230402185440.jpg Normal file (461 KiB)
BIN result_Rainy/20230402185635.jpg Normal file (498 KiB)
BIN result_Rainy/20230402185824.jpg Normal file (476 KiB)
BIN result_Rainy/20230402190134.jpg Normal file (513 KiB)
BIN result_Rainy/20230402190219.jpg Normal file (471 KiB)
BIN result_Rainy/20230402191206.jpg Normal file (471 KiB)
BIN result_Rainy/20230402191441.jpg Normal file (498 KiB)
BIN result_Rainy/20230402193706.jpg Normal file (449 KiB)
BIN result_Rainy/20230402204602.jpg Normal file (468 KiB)
BIN result_Rainy/20230402205418.jpg Normal file (519 KiB)
BIN result_Rainy/20230402215131.jpg Normal file (735 KiB)
BIN result_Rainy/20230403095717.jpg Normal file (1.0 MiB)
BIN result_Rainy/20230403095725.jpg Normal file (1.0 MiB)
BIN result_Rainy/20230403095737.jpg Normal file (1.0 MiB)
BIN result_Rainy/20230403095753.jpg Normal file (1.0 MiB)
BIN result_Rainy/20230403095805.jpg Normal file (1.1 MiB)
BIN result_Rainy/20230403100026.jpg Normal file (974 KiB)