Add comments

This commit is contained in:
zty8080123 2024-08-07 09:32:38 +08:00
commit 5d9b9a6d9f
256 changed files with 19346 additions and 0 deletions

35
.gitignore vendored Normal file
@@ -0,0 +1,35 @@
# .gitignore
# First, ignore every file
*
# But do not ignore directories
!*/
# Ignore these specific directories
ut/
runs/
.vscode/
build/
result1/
result/
mytest/
mytest_double/
pretrained_model/
gangao/
extra/
ccpd/
*.pyc
# Do not ignore the following file types
!*.cpp
!*.h
!*.hpp
!*.c
!.gitignore
!*.py
!*.sh
!*.npy
!*.jpg
!*.pt
!*.npy
!*.pth
!*.png
!*.yaml
!*.md

3
.idea/.gitignore vendored Normal file
@@ -0,0 +1,3 @@
# Default ignored files
/shelf/
/workspace.xml

93
README.md Normal file
@@ -0,0 +1,93 @@
## What's New
**2022.12.04 For joint vehicle and plate detection, see the [vehicle recognition system](https://github.com/we0091234/Car_recognition)**
[yolov8 plate detection + recognition](https://github.com/we0091234/yolov8-plate)
[yolov7 plate detection + recognition](https://github.com/we0091234/yolov7_plate)
[Android NCNN](https://github.com/Ayers-github/Chinese-License-Plate-Recognition)
## Contact
The models are trained on public datasets. For higher-accuracy models or business cooperation, please add
**wechat: we0091234 (please state your purpose)**
## **Comprehensive plate recognition supporting 12 types of Chinese license plates**
**Requirements: python >= 3.6, pytorch >= 1.7**
#### **Image demo:**
Run detect_plate.py directly, or use the following command line:
```
python detect_plate.py --detect_model weights/plate_detect.pt --rec_model weights/plate_rec_color.pth --image_path imgs --output result
```
Results for the test folder imgs are saved in the result folder.
#### Video demo: [2.MP4](https://pan.baidu.com/s/1O1sT8hCEwJZmVScDwBHgOg) (extraction code: 41aq)
```
python detect_plate.py --detect_model weights/plate_detect.pt --rec_model weights/plate_rec_color.pth --video 2.mp4
```
The input video is 2.mp4; the annotated output is saved as result.mp4.
## **Plate detection training**
Plate detection training is documented here:
[Plate detection training](https://github.com/we0091234/Chinese_license_plate_detection_recognition/tree/main/readme)
## **Plate recognition training**
Plate recognition training is documented here:
[Plate recognition training](https://github.com/we0091234/crnn_plate_recognition)
#### **Supported plate types:**
- [X] 1. Single-row blue plates
- [X] 2. Single-row yellow plates
- [X] 3. New-energy plates
- [X] 4. White police plates
- [X] 5. Driving-school (coach) plates
- [X] 6. Armed-police plates
- [X] 7. Double-row yellow plates
- [X] 8. Double-row white plates
- [X] 9. Embassy plates
- [X] 10. Hong Kong/Macau (Guangdong Z) plates
- [X] 11. Double-row green plates
- [X] 12. Civil aviation plates
![Image ](image/README/test_1.jpg)
## Deployment
1. [Android NCNN](https://github.com/Ayers-github/Chinese-License-Plate-Recognition)
2. **onnx demo** Baidu Netdisk (extraction code): [k874](https://pan.baidu.com/s/1K3L3xubd6pXIreAydvUm4g)
```
python onnx_infer.py --detect_model weights/plate_detect.onnx --rec_model weights/plate_rec_color.onnx --image_path imgs --output result_onnx
```
3. **tensorrt** deployment: see [tensorrt_plate](https://github.com/we0091234/chinese_plate_tensorrt)
4. **openvino demo** (version 2022.2)
```
python openvino_infer.py --detect_model weights/plate_detect.onnx --rec_model weights/plate_rec.onnx --image_path imgs --output result_openvino
```
## References
* [https://github.com/deepcam-cn/yolov5-face](https://github.com/deepcam-cn/yolov5-face)
* [https://github.com/Sierkinhane/CRNN_Chinese_Characters_Rec](https://github.com/Sierkinhane/CRNN_Chinese_Characters_Rec)
## More
**QQ group: 871797331 (full); second group: 837982567 for questions**
![Image ](image/README/105384078.png)

196
ccpd_process.py Normal file
@@ -0,0 +1,196 @@
import os
import shutil
import cv2
import numpy as np
def allFilePath(rootPath,allFIleList):
fileList = os.listdir(rootPath)
for temp in fileList:
if os.path.isfile(os.path.join(rootPath,temp)):
if temp.endswith(".jpg"):
allFIleList.append(os.path.join(rootPath,temp))
else:
allFilePath(os.path.join(rootPath,temp),allFIleList)
def order_points(pts):
# initialize a list of coordinates that will be ordered
# such that the first entry in the list is the top-left,
# the second entry is the top-right, the third is the
# bottom-right, and the fourth is the bottom-left
pts=pts[:4,:]
rect = np.zeros((5, 2), dtype = "float32")
# the top-left point will have the smallest sum, whereas
# the bottom-right point will have the largest sum
s = pts.sum(axis = 1)
rect[0] = pts[np.argmin(s)]
rect[2] = pts[np.argmax(s)]
# now, compute the difference between the points, the
# top-right point will have the smallest difference,
# whereas the bottom-left will have the largest difference
diff = np.diff(pts, axis = 1)
rect[1] = pts[np.argmin(diff)]
rect[3] = pts[np.argmax(diff)]
# return the ordered coordinates
return rect
def get_partical_ccpd():
ccpd_dir = r"/mnt/Gpan/BaiduNetdiskDownload/CCPD1/CCPD2020/ccpd_green"
save_Path = r"ccpd/green_plate"
folder_list = os.listdir(ccpd_dir)
for folder_name in folder_list:
count=0
folder_path = os.path.join(ccpd_dir,folder_name)
if os.path.isfile(folder_path):
continue
if folder_name == "ccpd_fn":
continue
name_list = os.listdir(folder_path)
save_folder=save_Path
if not os.path.exists(save_folder):
os.mkdir(save_folder)
for name in name_list:
file_path = os.path.join(folder_path,name)
count+=1
if count>1000:
break
new_file_path =os.path.join(save_folder,name)
shutil.move(file_path,new_file_path)
print(count,new_file_path)
def get_rect_and_landmarks(img_path):
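# CCPD file names encode the annotation: splitting the name on "-", field 2 is
# the bounding box "x1&y1_x2&y2" and field 3 holds the four plate corner points
# as "x&y" pairs joined by "_".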
file_name = img_path.split("/")[-1].split("-")
landmarks_np =np.zeros((5,2))
rect = file_name[2].split("_")
landmarks=file_name[3].split("_")
rect_str = "&".join(rect)
landmarks_str= "&".join(landmarks)
rect= rect_str.split("&")
landmarks=landmarks_str.split("&")
rect=[int(x) for x in rect]
landmarks=[int(x) for x in landmarks]
for i in range(4):
landmarks_np[i][0]=landmarks[2*i]
landmarks_np[i][1]=landmarks[2*i+1]
# middle_landmark_w =int((landmarks[4]+landmarks[6])/2)
# middle_landmark_h =int((landmarks[5]+landmarks[7])/2)
# landmarks.append(middle_landmark_w)
# landmarks.append(middle_landmark_h)
landmarks_np_new=order_points(landmarks_np)
# landmarks_np_new[4]=np.array([middle_landmark_w,middle_landmark_h])
return rect,landmarks,landmarks_np_new
def x1x2y1y2_yolo(rect,landmarks,img):
h,w,c =img.shape
rect[0] = max(0, rect[0])
rect[1] = max(0, rect[1])
rect[2] = min(w - 1, rect[2]-rect[0])
rect[3] = min(h - 1, rect[3]-rect[1])
annotation = np.zeros((1, 14))
annotation[0, 0] = (rect[0] + rect[2] / 2) / w # cx
annotation[0, 1] = (rect[1] + rect[3] / 2) / h # cy
annotation[0, 2] = rect[2] / w # w
annotation[0, 3] = rect[3] / h # h
annotation[0, 4] = landmarks[0] / w # l0_x
annotation[0, 5] = landmarks[1] / h # l0_y
annotation[0, 6] = landmarks[2] / w # l1_x
annotation[0, 7] = landmarks[3] / h # l1_y
annotation[0, 8] = landmarks[4] / w # l2_x
annotation[0, 9] = landmarks[5] / h # l2_y
annotation[0, 10] = landmarks[6] / w # l3_x
annotation[0, 11] = landmarks[7] / h # l3_y
# annotation[0, 12] = landmarks[8] / w # l4_x
# annotation[0, 13] = landmarks[9] / h # l4_y
return annotation
def xywh2yolo(rect,landmarks_sort,img):
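# Returns a 1x12 row: normalized box centre x, centre y, width, height, followed
# by the four sorted corner points (x, y), each divided by the image width/height.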
h,w,c =img.shape
rect[0] = max(0, rect[0])
rect[1] = max(0, rect[1])
rect[2] = min(w - 1, rect[2]-rect[0])
rect[3] = min(h - 1, rect[3]-rect[1])
annotation = np.zeros((1, 12))
annotation[0, 0] = (rect[0] + rect[2] / 2) / w # cx
annotation[0, 1] = (rect[1] + rect[3] / 2) / h # cy
annotation[0, 2] = rect[2] / w # w
annotation[0, 3] = rect[3] / h # h
annotation[0, 4] = landmarks_sort[0][0] / w # l0_x
annotation[0, 5] = landmarks_sort[0][1] / h # l0_y
annotation[0, 6] = landmarks_sort[1][0] / w # l1_x
annotation[0, 7] = landmarks_sort[1][1] / h # l1_y
annotation[0, 8] = landmarks_sort[2][0] / w # l2_x
annotation[0, 9] = landmarks_sort[2][1] / h # l2_y
annotation[0, 10] = landmarks_sort[3][0] / w # l3_x
annotation[0, 11] = landmarks_sort[3][1] / h # l3_y
# annotation[0, 12] = landmarks_sort[4][0] / w # l4_x
# annotation[0, 13] = landmarks_sort[4][1] / h # l4_y
return annotation
def yolo2x1y1x2y2(annotation,img):
h,w,c = img.shape
rect= annotation[:,0:4].squeeze().tolist()
landmarks=annotation[:,4:].squeeze().tolist()
rect_w = w*rect[2]
rect_h =h*rect[3]
rect_x =int(rect[0]*w-rect_w/2)
rect_y = int(rect[1]*h-rect_h/2)
new_rect=[rect_x,rect_y,rect_x+rect_w,rect_y+rect_h]
for i in range(5):
landmarks[2*i]=landmarks[2*i]*w
landmarks[2*i+1]=landmarks[2*i+1]*h
return new_rect,landmarks
def write_lable(file_path):
pass
if __name__ == '__main__':
file_root = r"ccpd"
file_list=[]
count=0
allFilePath(file_root,file_list)
for img_path in file_list:
count+=1
# img_path = r"ccpd_yolo_test/02-90_85-173&466_452&541-452&553_176&556_178&463_454&460-0_0_6_26_15_26_32-68-53.jpg"
text_path= img_path.replace(".jpg",".txt")
img =cv2.imread(img_path)
rect,landmarks,landmarks_sort=get_rect_and_landmarks(img_path)
# annotation=x1x2y1y2_yolo(rect,landmarks,img)
annotation=xywh2yolo(rect,landmarks_sort,img)
str_label = "0 "
for i in range(len(annotation[0])):
str_label = str_label + " " + str(annotation[0][i])
str_label = str_label.replace('[', '').replace(']', '')
str_label = str_label.replace(',', '') + '\n'
with open(text_path,"w") as f:
f.write(str_label)
print(count,img_path)
# get_partical_ccpd()
# file_root = r"ccpd/green_plate"
# file_list=[]
# allFilePath(file_root,file_list)
# count=0
# for img_path in file_list:
# img_name = img_path.split(os.sep)[-1]
# if not "&" in img_name:
# count+=1
# os.remove(img_path)
# print(count,img_path)
# new_rect,new_landmarks=yolo2x1y1x2y2(annotation,img)
# rect= [int(x) for x in new_rect]
# cv2.rectangle(img,(rect[0],rect[1]),(rect[2],rect[3]),(255,0,0),2)
# colors=[(0,255,0),(0,255,255),(255,255,0),(255,255,255),(255,0,255)] # green yellow cyan white magenta
# for i in range(5):
# cv2.circle(img,(landmarks[2*i],landmarks[2*i+1]),2,colors[i],2)
# cv2.imwrite("1.jpg",img)
# print(rect,landmarks)
# get_partical_ccpd()
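A minimal single-image sketch of the conversion implemented above (hypothetical usage; the file name is the CCPD example already referenced in the commented-out test line, and the image is assumed to exist on disk):

```python
import cv2

img_path = "ccpd_yolo_test/02-90_85-173&466_452&541-452&553_176&556_178&463_454&460-0_0_6_26_15_26_32-68-53.jpg"
img = cv2.imread(img_path)  # only the image height/width are used for normalization
rect, landmarks, landmarks_sort = get_rect_and_landmarks(img_path)
annotation = xywh2yolo(rect, landmarks_sort, img)  # 1x12 normalized label row
line = "0 " + " ".join(str(v) for v in annotation[0]) + "\n"
with open(img_path.replace(".jpg", ".txt"), "w") as f:
    f.write(line)
```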

21
data/argoverse_hd.yaml Normal file
@@ -0,0 +1,21 @@
# Argoverse-HD dataset (ring-front-center camera) http://www.cs.cmu.edu/~mengtial/proj/streaming/
# Train command: python train.py --data argoverse_hd.yaml
# Default dataset location is next to /yolov5:
# /parent_folder
# /argoverse
# /yolov5
# download command/URL (optional)
download: bash data/scripts/get_argoverse_hd.sh
# train and val data as 1) directory: path/images/, 2) file: path/images.txt, or 3) list: [path1/images/, path2/images/]
train: ../argoverse/Argoverse-1.1/images/train/ # 39384 images
val: ../argoverse/Argoverse-1.1/images/val/ # 15062 images
test: ../argoverse/Argoverse-1.1/images/test/ # Submit to: https://eval.ai/web/challenges/challenge-page/800/overview
# number of classes
nc: 8
# class names
names: [ 'person', 'bicycle', 'car', 'motorcycle', 'bus', 'truck', 'traffic_light', 'stop_sign' ]

35
data/coco.yaml Normal file
@@ -0,0 +1,35 @@
# COCO 2017 dataset http://cocodataset.org
# Train command: python train.py --data coco.yaml
# Default dataset location is next to /yolov5:
# /parent_folder
# /coco
# /yolov5
# download command/URL (optional)
download: bash data/scripts/get_coco.sh
# train and val data as 1) directory: path/images/, 2) file: path/images.txt, or 3) list: [path1/images/, path2/images/]
train: ../coco/train2017.txt # 118287 images
val: ../coco/val2017.txt # 5000 images
test: ../coco/test-dev2017.txt # 20288 of 40670 images, submit to https://competitions.codalab.org/competitions/20794
# number of classes
nc: 80
# class names
names: [ 'person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus', 'train', 'truck', 'boat', 'traffic light',
'fire hydrant', 'stop sign', 'parking meter', 'bench', 'bird', 'cat', 'dog', 'horse', 'sheep', 'cow',
'elephant', 'bear', 'zebra', 'giraffe', 'backpack', 'umbrella', 'handbag', 'tie', 'suitcase', 'frisbee',
'skis', 'snowboard', 'sports ball', 'kite', 'baseball bat', 'baseball glove', 'skateboard', 'surfboard',
'tennis racket', 'bottle', 'wine glass', 'cup', 'fork', 'knife', 'spoon', 'bowl', 'banana', 'apple',
'sandwich', 'orange', 'broccoli', 'carrot', 'hot dog', 'pizza', 'donut', 'cake', 'chair', 'couch',
'potted plant', 'bed', 'dining table', 'toilet', 'tv', 'laptop', 'mouse', 'remote', 'keyboard', 'cell phone',
'microwave', 'oven', 'toaster', 'sink', 'refrigerator', 'book', 'clock', 'vase', 'scissors', 'teddy bear',
'hair drier', 'toothbrush' ]
# Print classes
# with open('data/coco.yaml') as f:
# d = yaml.load(f, Loader=yaml.FullLoader) # dict
# for i, x in enumerate(d['names']):
# print(i, x)

28
data/coco128.yaml Normal file
@@ -0,0 +1,28 @@
# COCO 2017 dataset http://cocodataset.org - first 128 training images
# Train command: python train.py --data coco128.yaml
# Default dataset location is next to /yolov5:
# /parent_folder
# /coco128
# /yolov5
# download command/URL (optional)
download: https://github.com/ultralytics/yolov5/releases/download/v1.0/coco128.zip
# train and val data as 1) directory: path/images/, 2) file: path/images.txt, or 3) list: [path1/images/, path2/images/]
train: ../coco128/images/train2017/ # 128 images
val: ../coco128/images/train2017/ # 128 images
# number of classes
nc: 80
# class names
names: [ 'person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus', 'train', 'truck', 'boat', 'traffic light',
'fire hydrant', 'stop sign', 'parking meter', 'bench', 'bird', 'cat', 'dog', 'horse', 'sheep', 'cow',
'elephant', 'bear', 'zebra', 'giraffe', 'backpack', 'umbrella', 'handbag', 'tie', 'suitcase', 'frisbee',
'skis', 'snowboard', 'sports ball', 'kite', 'baseball bat', 'baseball glove', 'skateboard', 'surfboard',
'tennis racket', 'bottle', 'wine glass', 'cup', 'fork', 'knife', 'spoon', 'bowl', 'banana', 'apple',
'sandwich', 'orange', 'broccoli', 'carrot', 'hot dog', 'pizza', 'donut', 'cake', 'chair', 'couch',
'potted plant', 'bed', 'dining table', 'toilet', 'tv', 'laptop', 'mouse', 'remote', 'keyboard', 'cell phone',
'microwave', 'oven', 'toaster', 'sink', 'refrigerator', 'book', 'clock', 'vase', 'scissors', 'teddy bear',
'hair drier', 'toothbrush' ]

38
data/hyp.finetune.yaml Normal file
@@ -0,0 +1,38 @@
# Hyperparameters for VOC finetuning
# python train.py --batch 64 --weights yolov5m.pt --data voc.yaml --img 512 --epochs 50
# See tutorials for hyperparameter evolution https://github.com/ultralytics/yolov5#tutorials
# Hyperparameter Evolution Results
# Generations: 306
# P R mAP.5 mAP.5:.95 box obj cls
# Metrics: 0.6 0.936 0.896 0.684 0.0115 0.00805 0.00146
lr0: 0.0032
lrf: 0.12
momentum: 0.843
weight_decay: 0.00036
warmup_epochs: 2.0
warmup_momentum: 0.5
warmup_bias_lr: 0.05
box: 0.0296
cls: 0.243
cls_pw: 0.631
obj: 0.301
obj_pw: 0.911
iou_t: 0.2
anchor_t: 2.91
# anchors: 3.63
fl_gamma: 0.0
hsv_h: 0.0138
hsv_s: 0.664
hsv_v: 0.464
degrees: 0.373
translate: 0.245
scale: 0.898
shear: 0.602
perspective: 0.0
flipud: 0.00856
fliplr: 0.5
mosaic: 1.0
mixup: 0.243

34
data/hyp.scratch.yaml Normal file
@@ -0,0 +1,34 @@
# Hyperparameters for COCO training from scratch
# python train.py --batch 40 --cfg yolov5m.yaml --weights '' --data coco.yaml --img 640 --epochs 300
# See tutorials for hyperparameter evolution https://github.com/ultralytics/yolov5#tutorials
lr0: 0.01 # initial learning rate (SGD=1E-2, Adam=1E-3)
lrf: 0.2 # final OneCycleLR learning rate (lr0 * lrf)
momentum: 0.937 # SGD momentum/Adam beta1
weight_decay: 0.0005 # optimizer weight decay 5e-4
warmup_epochs: 3.0 # warmup epochs (fractions ok)
warmup_momentum: 0.8 # warmup initial momentum
warmup_bias_lr: 0.1 # warmup initial bias lr
box: 0.05 # box loss gain
cls: 0.5 # cls loss gain
landmark: 0.005 # landmark loss gain
cls_pw: 1.0 # cls BCELoss positive_weight
obj: 1.0 # obj loss gain (scale with pixels)
obj_pw: 1.0 # obj BCELoss positive_weight
iou_t: 0.20 # IoU training threshold
anchor_t: 4.0 # anchor-multiple threshold
# anchors: 3 # anchors per output layer (0 to ignore)
fl_gamma: 0.0 # focal loss gamma (efficientDet default gamma=1.5)
hsv_h: 0.015 # image HSV-Hue augmentation (fraction)
hsv_s: 0.7 # image HSV-Saturation augmentation (fraction)
hsv_v: 0.4 # image HSV-Value augmentation (fraction)
degrees: 0.0 # image rotation (+/- deg)
translate: 0.1 # image translation (+/- fraction)
scale: 0.5 # image scale (+/- gain)
shear: 0.5 # image shear (+/- deg)
perspective: 0.0 # image perspective (+/- fraction), range 0-0.001
flipud: 0.0 # image flip up-down (probability)
fliplr: 0.5 # image flip left-right (probability)
mosaic: 0.5 # image mosaic (probability)
mixup: 0.0 # image mixup (probability)
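These hyperparameters are plain YAML, so they can be inspected directly from Python; a minimal sketch (file path assumed relative to the repository root):

```python
import yaml

# Load the training hyperparameters into a dict.
with open('data/hyp.scratch.yaml') as f:
    hyp = yaml.safe_load(f)
print(hyp['lr0'], hyp['box'], hyp['landmark'])  # e.g. 0.01 0.05 0.005
```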

20
data/plateAndCar.yaml Normal file
@@ -0,0 +1,20 @@
# Plate and car detection dataset (CCPD-based)
# Train command: python train.py --data plateAndCar.yaml
# Default dataset location is next to /yolov5:
# /parent_folder
# /VOC
# /yolov5
# download command/URL (optional)
download: bash data/scripts/get_voc.sh
# train and val data as 1) directory: path/images/, 2) file: path/images.txt, or 3) list: [path1/images/, path2/images/]
train: /mnt/Gpan/Mydata/pytorchPorject/datasets/ccpd/train_car_plate/train_detect
val: /mnt/Gpan/Mydata/pytorchPorject/datasets/ccpd/train_car_plate/val_detect
# number of classes
nc: 3
# class names
names: [ 'single_plate','double_plate','car']

150
data/retinaface2yolo.py Normal file
@@ -0,0 +1,150 @@
import os
import os.path
import sys
import torch
import torch.utils.data as data
import cv2
import numpy as np
class WiderFaceDetection(data.Dataset):
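# Parses a WIDER FACE style label.txt: lines starting with "#" give an image
# path (resolved against the images/ folder next to label.txt); the numeric
# lines that follow are one annotation each (bbox x, y, w, h plus landmark fields).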
def __init__(self, txt_path, preproc=None):
self.preproc = preproc
self.imgs_path = []
self.words = []
f = open(txt_path,'r')
lines = f.readlines()
isFirst = True
labels = []
for line in lines:
line = line.rstrip()
if line.startswith('#'):
if isFirst is True:
isFirst = False
else:
labels_copy = labels.copy()
self.words.append(labels_copy)
labels.clear()
path = line[2:]
path = txt_path.replace('label.txt','images/') + path
self.imgs_path.append(path)
else:
line = line.split(' ')
label = [float(x) for x in line]
labels.append(label)
self.words.append(labels)
def __len__(self):
return len(self.imgs_path)
def __getitem__(self, index):
img = cv2.imread(self.imgs_path[index])
height, width, _ = img.shape
labels = self.words[index]
annotations = np.zeros((0, 15))
if len(labels) == 0:
return annotations
for idx, label in enumerate(labels):
annotation = np.zeros((1, 15))
# bbox
annotation[0, 0] = label[0] # x1
annotation[0, 1] = label[1] # y1
annotation[0, 2] = label[0] + label[2] # x2
annotation[0, 3] = label[1] + label[3] # y2
# landmarks
annotation[0, 4] = label[4] # l0_x
annotation[0, 5] = label[5] # l0_y
annotation[0, 6] = label[7] # l1_x
annotation[0, 7] = label[8] # l1_y
annotation[0, 8] = label[10] # l2_x
annotation[0, 9] = label[11] # l2_y
annotation[0, 10] = label[13] # l3_x
annotation[0, 11] = label[14] # l3_y
annotation[0, 12] = label[16] # l4_x
annotation[0, 13] = label[17] # l4_y
if (annotation[0, 4]<0):
annotation[0, 14] = -1
else:
annotation[0, 14] = 1
annotations = np.append(annotations, annotation, axis=0)
target = np.array(annotations)
if self.preproc is not None:
img, target = self.preproc(img, target)
return torch.from_numpy(img), target
def detection_collate(batch):
"""Custom collate fn for dealing with batches of images that have a different
number of associated object annotations (bounding boxes).
Arguments:
batch: (tuple) A tuple of tensor images and lists of annotations
Return:
A tuple containing:
1) (tensor) batch of images stacked on their 0 dim
2) (list of tensors) annotations for a given image are stacked on 0 dim
"""
targets = []
imgs = []
for _, sample in enumerate(batch):
for _, tup in enumerate(sample):
if torch.is_tensor(tup):
imgs.append(tup)
elif isinstance(tup, type(np.empty(0))):
annos = torch.from_numpy(tup).float()
targets.append(annos)
return (torch.stack(imgs, 0), targets)
save_path = '/ssd_1t/derron/yolov5-face/data/widerface/train'
aa=WiderFaceDetection("/ssd_1t/derron/yolov5-face/data/widerface/widerface/train/label.txt")
for i in range(len(aa.imgs_path)):
print(i, aa.imgs_path[i])
img = cv2.imread(aa.imgs_path[i])
base_img = os.path.basename(aa.imgs_path[i])
base_txt = os.path.basename(aa.imgs_path[i])[:-4] +".txt"
save_img_path = os.path.join(save_path, base_img)
save_txt_path = os.path.join(save_path, base_txt)
with open(save_txt_path, "w") as f:
height, width, _ = img.shape
labels = aa.words[i]
annotations = np.zeros((0, 14))
if len(labels) == 0:
continue
for idx, label in enumerate(labels):
annotation = np.zeros((1, 14))
# bbox
label[0] = max(0, label[0])
label[1] = max(0, label[1])
label[2] = min(width - 1, label[2])
label[3] = min(height - 1, label[3])
annotation[0, 0] = (label[0] + label[2] / 2) / width # cx
annotation[0, 1] = (label[1] + label[3] / 2) / height # cy
annotation[0, 2] = label[2] / width # w
annotation[0, 3] = label[3] / height # h
#if (label[2] -label[0]) < 8 or (label[3] - label[1]) < 8:
# img[int(label[1]):int(label[3]), int(label[0]):int(label[2])] = 127
# continue
# landmarks
annotation[0, 4] = label[4] / width # l0_x
annotation[0, 5] = label[5] / height # l0_y
annotation[0, 6] = label[7] / width # l1_x
annotation[0, 7] = label[8] / height # l1_y
annotation[0, 8] = label[10] / width # l2_x
annotation[0, 9] = label[11] / height # l2_y
annotation[0, 10] = label[13] / width # l3_x
annotation[0, 11] = label[14] / height # l3_y
annotation[0, 12] = label[16] / width # l4_x
annotation[0, 13] = label[17] / height # l4_y
str_label="0 "
for i in range(len(annotation[0])):
str_label =str_label+" "+str(annotation[0][i])
str_label = str_label.replace('[', '').replace(']', '')
str_label = str_label.replace(',', '') + '\n'
f.write(str_label)
cv2.imwrite(save_img_path, img)

62
data/scripts/get_argoverse_hd.sh Normal file
@@ -0,0 +1,62 @@
#!/bin/bash
# Argoverse-HD dataset (ring-front-center camera) http://www.cs.cmu.edu/~mengtial/proj/streaming/
# Download command: bash data/scripts/get_argoverse_hd.sh
# Train command: python train.py --data argoverse_hd.yaml
# Default dataset location is next to /yolov5:
# /parent_folder
# /argoverse
# /yolov5
# Download/unzip images
d='../argoverse/' # unzip directory
mkdir $d
url=https://argoverse-hd.s3.us-east-2.amazonaws.com/
f=Argoverse-HD-Full.zip
curl -L $url$f -o $f && unzip -q $f -d $d && rm $f & # download, unzip, remove in background
wait # finish background tasks
cd ../argoverse/Argoverse-1.1/
ln -s tracking images
cd ../Argoverse-HD/annotations/
python3 - "$@" <<END
import json
from pathlib import Path
annotation_files = ["train.json", "val.json"]
print("Converting annotations to YOLOv5 format...")
for val in annotation_files:
a = json.load(open(val, "rb"))
label_dict = {}
for annot in a['annotations']:
img_id = annot['image_id']
img_name = a['images'][img_id]['name']
img_label_name = img_name[:-3] + "txt"
obj_class = annot['category_id']
x_center, y_center, width, height = annot['bbox']
x_center = (x_center + width / 2) / 1920. # offset and scale
y_center = (y_center + height / 2) / 1200. # offset and scale
width /= 1920. # scale
height /= 1200. # scale
img_dir = "./labels/" + a['seq_dirs'][a['images'][annot['image_id']]['sid']]
Path(img_dir).mkdir(parents=True, exist_ok=True)
if img_dir + "/" + img_label_name not in label_dict:
label_dict[img_dir + "/" + img_label_name] = []
label_dict[img_dir + "/" + img_label_name].append(f"{obj_class} {x_center} {y_center} {width} {height}\n")
for filename in label_dict:
with open(filename, "w") as file:
for string in label_dict[filename]:
file.write(string)
END
mv ./labels ../../Argoverse-1.1/

27
data/scripts/get_coco.sh Normal file
@@ -0,0 +1,27 @@
#!/bin/bash
# COCO 2017 dataset http://cocodataset.org
# Download command: bash data/scripts/get_coco.sh
# Train command: python train.py --data coco.yaml
# Default dataset location is next to /yolov5:
# /parent_folder
# /coco
# /yolov5
# Download/unzip labels
d='../' # unzip directory
url=https://github.com/ultralytics/yolov5/releases/download/v1.0/
f='coco2017labels.zip' # or 'coco2017labels-segments.zip', 68 MB
echo 'Downloading' $url$f ' ...'
curl -L $url$f -o $f && unzip -q $f -d $d && rm $f & # download, unzip, remove in background
# Download/unzip images
d='../coco/images' # unzip directory
url=http://images.cocodataset.org/zips/
f1='train2017.zip' # 19G, 118k images
f2='val2017.zip' # 1G, 5k images
f3='test2017.zip' # 7G, 41k images (optional)
for f in $f1 $f2; do
echo 'Downloading' $url$f '...'
curl -L $url$f -o $f && unzip -q $f -d $d && rm $f & # download, unzip, remove in background
done
wait # finish background tasks

139
data/scripts/get_voc.sh Normal file
@@ -0,0 +1,139 @@
#!/bin/bash
# PASCAL VOC dataset http://host.robots.ox.ac.uk/pascal/VOC/
# Download command: bash data/scripts/get_voc.sh
# Train command: python train.py --data voc.yaml
# Default dataset location is next to /yolov5:
# /parent_folder
# /VOC
# /yolov5
start=$(date +%s)
mkdir -p ../tmp
cd ../tmp/
# Download/unzip images and labels
d='.' # unzip directory
url=https://github.com/ultralytics/yolov5/releases/download/v1.0/
f1=VOCtrainval_06-Nov-2007.zip # 446MB, 5012 images
f2=VOCtest_06-Nov-2007.zip # 438MB, 4953 images
f3=VOCtrainval_11-May-2012.zip # 1.95GB, 17126 images
for f in $f3 $f2 $f1; do
echo 'Downloading' $url$f '...'
curl -L $url$f -o $f && unzip -q $f -d $d && rm $f & # download, unzip, remove in background
done
wait # finish background tasks
end=$(date +%s)
runtime=$((end - start))
echo "Completed in" $runtime "seconds"
echo "Splitting dataset..."
python3 - "$@" <<END
import xml.etree.ElementTree as ET
import pickle
import os
from os import listdir, getcwd
from os.path import join
sets=[('2012', 'train'), ('2012', 'val'), ('2007', 'train'), ('2007', 'val'), ('2007', 'test')]
classes = ["aeroplane", "bicycle", "bird", "boat", "bottle", "bus", "car", "cat", "chair", "cow", "diningtable", "dog", "horse", "motorbike", "person", "pottedplant", "sheep", "sofa", "train", "tvmonitor"]
def convert(size, box):
dw = 1./(size[0])
dh = 1./(size[1])
x = (box[0] + box[1])/2.0 - 1
y = (box[2] + box[3])/2.0 - 1
w = box[1] - box[0]
h = box[3] - box[2]
x = x*dw
w = w*dw
y = y*dh
h = h*dh
return (x,y,w,h)
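# Example for convert(): a 100x200 image with box (xmin=10, xmax=50, ymin=20, ymax=120)
# gives x = ((10+50)/2 - 1)/100 = 0.29, y = ((20+120)/2 - 1)/200 = 0.345,
# w = 40/100 = 0.40, h = 100/200 = 0.50 (the -1 makes coordinates zero-based).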
def convert_annotation(year, image_id):
in_file = open('VOCdevkit/VOC%s/Annotations/%s.xml'%(year, image_id))
out_file = open('VOCdevkit/VOC%s/labels/%s.txt'%(year, image_id), 'w')
tree=ET.parse(in_file)
root = tree.getroot()
size = root.find('size')
w = int(size.find('width').text)
h = int(size.find('height').text)
for obj in root.iter('object'):
difficult = obj.find('difficult').text
cls = obj.find('name').text
if cls not in classes or int(difficult)==1:
continue
cls_id = classes.index(cls)
xmlbox = obj.find('bndbox')
b = (float(xmlbox.find('xmin').text), float(xmlbox.find('xmax').text), float(xmlbox.find('ymin').text), float(xmlbox.find('ymax').text))
bb = convert((w,h), b)
out_file.write(str(cls_id) + " " + " ".join([str(a) for a in bb]) + '\n')
wd = getcwd()
for year, image_set in sets:
if not os.path.exists('VOCdevkit/VOC%s/labels/'%(year)):
os.makedirs('VOCdevkit/VOC%s/labels/'%(year))
image_ids = open('VOCdevkit/VOC%s/ImageSets/Main/%s.txt'%(year, image_set)).read().strip().split()
list_file = open('%s_%s.txt'%(year, image_set), 'w')
for image_id in image_ids:
list_file.write('%s/VOCdevkit/VOC%s/JPEGImages/%s.jpg\n'%(wd, year, image_id))
convert_annotation(year, image_id)
list_file.close()
END
cat 2007_train.txt 2007_val.txt 2012_train.txt 2012_val.txt >train.txt
cat 2007_train.txt 2007_val.txt 2007_test.txt 2012_train.txt 2012_val.txt >train.all.txt
python3 - "$@" <<END
import shutil
import os
os.system('mkdir ../VOC/')
os.system('mkdir ../VOC/images')
os.system('mkdir ../VOC/images/train')
os.system('mkdir ../VOC/images/val')
os.system('mkdir ../VOC/labels')
os.system('mkdir ../VOC/labels/train')
os.system('mkdir ../VOC/labels/val')
import os
print(os.path.exists('../tmp/train.txt'))
f = open('../tmp/train.txt', 'r')
lines = f.readlines()
for line in lines:
line = "/".join(line.split('/')[-5:]).strip()
if (os.path.exists("../" + line)):
os.system("cp ../"+ line + " ../VOC/images/train")
line = line.replace('JPEGImages', 'labels')
line = line.replace('jpg', 'txt')
if (os.path.exists("../" + line)):
os.system("cp ../"+ line + " ../VOC/labels/train")
print(os.path.exists('../tmp/2007_test.txt'))
f = open('../tmp/2007_test.txt', 'r')
lines = f.readlines()
for line in lines:
line = "/".join(line.split('/')[-5:]).strip()
if (os.path.exists("../" + line)):
os.system("cp ../"+ line + " ../VOC/images/val")
line = line.replace('JPEGImages', 'labels')
line = line.replace('jpg', 'txt')
if (os.path.exists("../" + line)):
os.system("cp ../"+ line + " ../VOC/labels/val")
END
rm -rf ../tmp # remove temporary directory
echo "VOC download done."

176
data/train2yolo.py Normal file
@@ -0,0 +1,176 @@
import os.path
import sys
import torch
import torch.utils.data as data
import cv2
import numpy as np
class WiderFaceDetection(data.Dataset):
def __init__(self, txt_path, preproc=None):
self.preproc = preproc
self.imgs_path = []
self.words = []
f = open(txt_path, 'r')
lines = f.readlines()
isFirst = True
labels = []
for line in lines:
line = line.rstrip()
if line.startswith('#'):
if isFirst is True:
isFirst = False
else:
labels_copy = labels.copy()
self.words.append(labels_copy)
labels.clear()
path = line[2:]
path = txt_path.replace('label.txt', 'images/') + path
self.imgs_path.append(path)
else:
line = line.split(' ')
label = [float(x) for x in line]
labels.append(label)
self.words.append(labels)
def __len__(self):
return len(self.imgs_path)
def __getitem__(self, index):
img = cv2.imread(self.imgs_path[index])
height, width, _ = img.shape
labels = self.words[index]
annotations = np.zeros((0, 15))
if len(labels) == 0:
return annotations
for idx, label in enumerate(labels):
annotation = np.zeros((1, 15))
# bbox
annotation[0, 0] = label[0] # x1
annotation[0, 1] = label[1] # y1
annotation[0, 2] = label[0] + label[2] # x2
annotation[0, 3] = label[1] + label[3] # y2
# landmarks
annotation[0, 4] = label[4] # l0_x
annotation[0, 5] = label[5] # l0_y
annotation[0, 6] = label[7] # l1_x
annotation[0, 7] = label[8] # l1_y
annotation[0, 8] = label[10] # l2_x
annotation[0, 9] = label[11] # l2_y
annotation[0, 10] = label[13] # l3_x
annotation[0, 11] = label[14] # l3_y
annotation[0, 12] = label[16] # l4_x
annotation[0, 13] = label[17] # l4_y
if annotation[0, 4] < 0:
annotation[0, 14] = -1
else:
annotation[0, 14] = 1
annotations = np.append(annotations, annotation, axis=0)
target = np.array(annotations)
if self.preproc is not None:
img, target = self.preproc(img, target)
return torch.from_numpy(img), target
def detection_collate(batch):
"""Custom collate fn for dealing with batches of images that have a different
number of associated object annotations (bounding boxes).
Arguments:
batch: (tuple) A tuple of tensor images and lists of annotations
Return:
A tuple containing:
1) (tensor) batch of images stacked on their 0 dim
2) (list of tensors) annotations for a given image are stacked on 0 dim
"""
targets = []
imgs = []
for _, sample in enumerate(batch):
for _, tup in enumerate(sample):
if torch.is_tensor(tup):
imgs.append(tup)
elif isinstance(tup, type(np.empty(0))):
annos = torch.from_numpy(tup).float()
targets.append(annos)
return torch.stack(imgs, 0), targets
if __name__ == '__main__':
if len(sys.argv) == 1:
print('Missing path to WIDERFACE train folder.')
print('Run command: python3 train2yolo.py /path/to/original/widerface/train [/path/to/save/widerface/train]')
exit(1)
elif len(sys.argv) > 3:
print('Too many arguments were provided.')
print('Run command: python3 train2yolo.py /path/to/original/widerface/train [/path/to/save/widerface/train]')
exit(1)
original_path = sys.argv[1]
if len(sys.argv) == 2:
if not os.path.isdir('widerface'):
os.mkdir('widerface')
if not os.path.isdir('widerface/train'):
os.mkdir('widerface/train')
save_path = 'widerface/train'
else:
save_path = sys.argv[2]
if not os.path.isfile(os.path.join(original_path, 'label.txt')):
print('Missing label.txt file.')
exit(1)
aa = WiderFaceDetection(os.path.join(original_path, 'label.txt'))
for i in range(len(aa.imgs_path)):
print(i, aa.imgs_path[i])
img = cv2.imread(aa.imgs_path[i])
base_img = os.path.basename(aa.imgs_path[i])
base_txt = os.path.basename(aa.imgs_path[i])[:-4] + ".txt"
save_img_path = os.path.join(save_path, base_img)
save_txt_path = os.path.join(save_path, base_txt)
with open(save_txt_path, "w") as f:
height, width, _ = img.shape
labels = aa.words[i]
annotations = np.zeros((0, 14))
if len(labels) == 0:
continue
for idx, label in enumerate(labels):
annotation = np.zeros((1, 14))
# bbox
label[0] = max(0, label[0])
label[1] = max(0, label[1])
label[2] = min(width - 1, label[2])
label[3] = min(height - 1, label[3])
annotation[0, 0] = (label[0] + label[2] / 2) / width # cx
annotation[0, 1] = (label[1] + label[3] / 2) / height # cy
annotation[0, 2] = label[2] / width # w
annotation[0, 3] = label[3] / height # h
#if (label[2] -label[0]) < 8 or (label[3] - label[1]) < 8:
# img[int(label[1]):int(label[3]), int(label[0]):int(label[2])] = 127
# continue
# landmarks
annotation[0, 4] = label[4] / width # l0_x
annotation[0, 5] = label[5] / height # l0_y
annotation[0, 6] = label[7] / width # l1_x
annotation[0, 7] = label[8] / height # l1_y
annotation[0, 8] = label[10] / width # l2_x
annotation[0, 9] = label[11] / height # l2_y
annotation[0, 10] = label[13] / width # l3_x
annotation[0, 11] = label[14] / height # l3_y
annotation[0, 12] = label[16] / width # l4_x
annotation[0, 13] = label[17] / height # l4_y
str_label = "0 "
for i in range(len(annotation[0])):
str_label = str_label + " " + str(annotation[0][i])
str_label = str_label.replace('[', '').replace(']', '')
str_label = str_label.replace(',', '') + '\n'
f.write(str_label)
cv2.imwrite(save_img_path, img)

88
data/val2yolo.py Normal file
@@ -0,0 +1,88 @@
import os
import cv2
import numpy as np
import shutil
import sys
from tqdm import tqdm
def xywh2xxyy(box):
x1 = box[0]
y1 = box[1]
x2 = box[0] + box[2]
y2 = box[1] + box[3]
return x1, x2, y1, y2
def convert(size, box):
dw = 1. / (size[0])
dh = 1. / (size[1])
x = (box[0] + box[1]) / 2.0 - 1
y = (box[2] + box[3]) / 2.0 - 1
w = box[1] - box[0]
h = box[3] - box[2]
x = x * dw
w = w * dw
y = y * dh
h = h * dh
return x, y, w, h
def wider2face(root, phase='val', ignore_small=0):
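# Builds a dict mapping image path -> list of YOLO label strings: class 0,
# normalized cx cy w h, then ten -1 placeholders for the five landmarks that
# the val split does not provide.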
data = {}
with open('{}/{}/label.txt'.format(root, phase), 'r') as f:
lines = f.readlines()
for line in tqdm(lines):
line = line.strip()
if '#' in line:
path = '{}/{}/images/{}'.format(root, phase, line.split()[-1])
img = cv2.imread(path)
height, width, _ = img.shape
data[path] = list()
else:
box = np.array(line.split()[0:4], dtype=np.float32) # (x1,y1,w,h)
if box[2] < ignore_small or box[3] < ignore_small:
continue
box = convert((width, height), xywh2xxyy(box))
label = '0 {} {} {} {} -1 -1 -1 -1 -1 -1 -1 -1 -1 -1'.format(round(box[0], 4), round(box[1], 4),
round(box[2], 4), round(box[3], 4))
data[path].append(label)
return data
if __name__ == '__main__':
if len(sys.argv) == 1:
print('Missing path to WIDERFACE folder.')
print('Run command: python3 val2yolo.py /path/to/original/widerface [/path/to/save/widerface/val]')
exit(1)
elif len(sys.argv) > 3:
print('Too many arguments were provided.')
print('Run command: python3 val2yolo.py /path/to/original/widerface [/path/to/save/widerface/val]')
exit(1)
root_path = sys.argv[1]
if not os.path.isfile(os.path.join(root_path, 'val', 'label.txt')):
print('Missing label.txt file.')
exit(1)
if len(sys.argv) == 2:
if not os.path.isdir('widerface'):
os.mkdir('widerface')
if not os.path.isdir('widerface/val'):
os.mkdir('widerface/val')
save_path = 'widerface/val'
else:
save_path = sys.argv[2]
datas = wider2face(root_path, phase='val')
for idx, data in enumerate(datas.keys()):
pict_name = os.path.basename(data)
out_img = f'{save_path}/{idx}.jpg'
out_txt = f'{save_path}/{idx}.txt'
shutil.copyfile(data, out_img)
labels = datas[data]
f = open(out_txt, 'w')
for label in labels:
f.write(label + '\n')
f.close()

65
data/val2yolo_for_test.py Normal file
@@ -0,0 +1,65 @@
import os
import cv2
import numpy as np
import shutil
from tqdm import tqdm
root = '/ssd_1t/derron/WiderFace'
def xywh2xxyy(box):
x1 = box[0]
y1 = box[1]
x2 = box[0] + box[2]
y2 = box[1] + box[3]
return (x1, x2, y1, y2)
def convert(size, box):
dw = 1. / (size[0])
dh = 1. / (size[1])
x = (box[0] + box[1]) / 2.0 - 1
y = (box[2] + box[3]) / 2.0 - 1
w = box[1] - box[0]
h = box[3] - box[2]
x = x * dw
w = w * dw
y = y * dh
h = h * dh
return (x, y, w, h)
def wider2face(phase='val', ignore_small=0):
data = {}
with open('{}/{}/label.txt'.format(root, phase), 'r') as f:
lines = f.readlines()
for line in tqdm(lines):
line = line.strip()
if '#' in line:
path = '{}/{}/images/{}'.format(root, phase, os.path.basename(line))
img = cv2.imread(path)
height, width, _ = img.shape
data[path] = list()
else:
box = np.array(line.split()[0:4], dtype=np.float32) # (x1,y1,w,h)
if box[2] < ignore_small or box[3] < ignore_small:
continue
box = convert((width, height), xywh2xxyy(box))
label = '0 {} {} {} {} -1 -1 -1 -1 -1 -1 -1 -1 -1 -1'.format(round(box[0], 4), round(box[1], 4),
round(box[2], 4), round(box[3], 4))
data[path].append(label)
return data
if __name__ == '__main__':
datas = wider2face('val')
for idx, data in enumerate(datas.keys()):
pict_name = os.path.basename(data)
out_img = 'widerface/val/images/{}'.format(pict_name)
out_txt = 'widerface/val/labels/{}.txt'.format(os.path.splitext(pict_name)[0])
shutil.copyfile(data, out_img)
labels = datas[data]
f = open(out_txt, 'w')
for label in labels:
f.write(label + '\n')
f.close()

21
data/voc.yaml Normal file
@@ -0,0 +1,21 @@
# PASCAL VOC dataset http://host.robots.ox.ac.uk/pascal/VOC/
# Train command: python train.py --data voc.yaml
# Default dataset location is next to /yolov5:
# /parent_folder
# /VOC
# /yolov5
# download command/URL (optional)
download: bash data/scripts/get_voc.sh
# train and val data as 1) directory: path/images/, 2) file: path/images.txt, or 3) list: [path1/images/, path2/images/]
train: ../VOC/images/train/ # 16551 images
val: ../VOC/images/val/ # 4952 images
# number of classes
nc: 20
# class names
names: [ 'aeroplane', 'bicycle', 'bird', 'boat', 'bottle', 'bus', 'car', 'cat', 'chair', 'cow', 'diningtable', 'dog',
'horse', 'motorbike', 'person', 'pottedplant', 'sheep', 'sofa', 'train', 'tvmonitor' ]

19
data/widerface.yaml Normal file
@@ -0,0 +1,19 @@
# License plate detection dataset (CCPD-based)
# Train command: python train.py --data widerface.yaml
# Default dataset location is next to /yolov5:
# /parent_folder
# /VOC
# /yolov5
# download command/URL (optional)
download: bash data/scripts/get_voc.sh
# train and val data as 1) directory: path/images/, 2) file: path/images.txt, or 3) list: [path1/images/, path2/images/]
train: /mnt/Gpan/Mydata/pytorchPorject/yolov5-face/ccpd/train_detect
val: /mnt/Gpan/Mydata/pytorchPorject/yolov5-face/ccpd/val_detect
# number of classes
nc: 2
# class names
names: [ 'single','double']

3
demo.sh Normal file
@@ -0,0 +1,3 @@
python detect_plate_hongkang.py --image_path gangao --img_size 640 --detect_model runs/train/exp32/weights/best.pt --rec_model /mnt/Gpan/Mydata/pytorchPorject/yolov7-face/weights/plate_rec.pth
# runs/train/exp26/weights/last.pt  imgs
# /mnt/Gpan/Mydata/pytorchPorject/datasets/ccpd/train_detect/gangao

223
detect_demo.py Normal file
@@ -0,0 +1,223 @@
# -*- coding: UTF-8 -*-
import argparse
import time
import os
import cv2
import torch
from numpy import random
import copy
import numpy as np
from models.experimental import attempt_load
from utils.datasets import letterbox
from utils.general import check_img_size, non_max_suppression_face, scale_coords
from utils.torch_utils import time_synchronized
from utils.cv_puttext import cv2ImgAddText
from plate_recognition.plate_rec import get_plate_result,allFilePath,cv_imread
from plate_recognition.double_plate_split_merge import get_split_merge
clors = [(255,0,0),(0,255,0),(0,0,255),(255,255,0),(0,255,255)]
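# BGR colors used to draw the four plate corner points and the detection boxes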
def load_model(weights, device):
model = attempt_load(weights, map_location=device) # load FP32 model
return model
def scale_coords_landmarks(img1_shape, coords, img0_shape, ratio_pad=None):
# Rescale coords (xyxy) from img1_shape to img0_shape
if ratio_pad is None: # calculate from img0_shape
gain = min(img1_shape[0] / img0_shape[0], img1_shape[1] / img0_shape[1]) # gain = old / new
pad = (img1_shape[1] - img0_shape[1] * gain) / 2, (img1_shape[0] - img0_shape[0] * gain) / 2 # wh padding
else:
gain = ratio_pad[0][0]
pad = ratio_pad[1]
coords[:, [0, 2, 4, 6]] -= pad[0] # x padding
coords[:, [1, 3, 5, 7]] -= pad[1] # y padding
coords[:, :10] /= gain
#clip_coords(coords, img0_shape)
coords[:, 0].clamp_(0, img0_shape[1]) # x1
coords[:, 1].clamp_(0, img0_shape[0]) # y1
coords[:, 2].clamp_(0, img0_shape[1]) # x2
coords[:, 3].clamp_(0, img0_shape[0]) # y2
coords[:, 4].clamp_(0, img0_shape[1]) # x3
coords[:, 5].clamp_(0, img0_shape[0]) # y3
coords[:, 6].clamp_(0, img0_shape[1]) # x4
coords[:, 7].clamp_(0, img0_shape[0]) # y4
# coords[:, 8].clamp_(0, img0_shape[1]) # x5
# coords[:, 9].clamp_(0, img0_shape[0]) # y5
return coords
def get_plate_rec_landmark(img, xyxy, conf, landmarks, class_num,device):
h,w,c = img.shape
result_dict={}
tl = 1 or round(0.002 * (h + w) / 2) + 1 # line/font thickness
x1 = int(xyxy[0])
y1 = int(xyxy[1])
x2 = int(xyxy[2])
y2 = int(xyxy[3])
landmarks_np=np.zeros((4,2))
rect=[x1,y1,x2,y2]
for i in range(4):
point_x = int(landmarks[2 * i])
point_y = int(landmarks[2 * i + 1])
landmarks_np[i]=np.array([point_x,point_y])
class_label= int(class_num) # plate type: 0 = single-layer plate, 1 = double-layer plate
result_dict['rect']=rect
result_dict['landmarks']=landmarks_np.tolist()
result_dict['class']=class_label
return result_dict
def detect_plate(model, orgimg, device,img_size):
# Load model
# img_size = opt_img_size
conf_thres = 0.3
iou_thres = 0.5
dict_list=[]
# orgimg = cv2.imread(image_path) # BGR
img0 = copy.deepcopy(orgimg)
assert orgimg is not None, 'Image Not Found '
h0, w0 = orgimg.shape[:2] # orig hw
r = img_size / max(h0, w0) # resize image to img_size
if r != 1: # always resize down, only resize up if training with augmentation
interp = cv2.INTER_AREA if r < 1 else cv2.INTER_LINEAR
img0 = cv2.resize(img0, (int(w0 * r), int(h0 * r)), interpolation=interp)
imgsz = check_img_size(img_size, s=model.stride.max()) # check img_size
img = letterbox(img0, new_shape=imgsz)[0]
# img =process_data(img0)
# Convert
img = img[:, :, ::-1].transpose(2, 0, 1).copy() # BGR to RGB, to 3x416x416
# Run inference
t0 = time.time()
img = torch.from_numpy(img).to(device)
img = img.float() # uint8 to fp16/32
img /= 255.0 # 0 - 255 to 0.0 - 1.0
if img.ndimension() == 3:
img = img.unsqueeze(0)
# Inference
t1 = time_synchronized()
pred = model(img)[0]
t2=time_synchronized()
# print(f"infer time is {(t2-t1)*1000} ms")
# Apply NMS
pred = non_max_suppression_face(pred, conf_thres, iou_thres)
# print('img.shape: ', img.shape)
# print('orgimg.shape: ', orgimg.shape)
# Process detections
for i, det in enumerate(pred): # detections per image
if len(det):
# Rescale boxes from img_size to im0 size
det[:, :4] = scale_coords(img.shape[2:], det[:, :4], orgimg.shape).round()
# Print results
for c in det[:, -1].unique():
n = (det[:, -1] == c).sum() # detections per class
det[:, 5:13] = scale_coords_landmarks(img.shape[2:], det[:, 5:13], orgimg.shape).round()
for j in range(det.size()[0]):
xyxy = det[j, :4].view(-1).tolist()
conf = det[j, 4].cpu().numpy()
landmarks = det[j, 5:13].view(-1).tolist()
class_num = det[j, 13].cpu().numpy()
result_dict = get_plate_rec_landmark(orgimg, xyxy, conf, landmarks, class_num,device)
dict_list.append(result_dict)
return dict_list
# cv2.imwrite('result.jpg', orgimg)
def draw_result(orgimg,dict_list):
result_str =""
for result in dict_list:
rect_area = result['rect']
x,y,w,h = rect_area[0],rect_area[1],rect_area[2]-rect_area[0],rect_area[3]-rect_area[1]
padding_w = 0.05*w
padding_h = 0.11*h
rect_area[0]=max(0,int(x-padding_w))
rect_area[1]=max(0,int(y-padding_h))
rect_area[2]=min(orgimg.shape[1],int(rect_area[2]+padding_w))
rect_area[3]=min(orgimg.shape[0],int(rect_area[3]+padding_h))
landmarks=result['landmarks']
label=result['class']
# result_str+=result+" "
for i in range(4): # plate corner key points
cv2.circle(orgimg, (int(landmarks[i][0]), int(landmarks[i][1])), 5, clors[i], -1)
cv2.rectangle(orgimg,(rect_area[0],rect_area[1]),(rect_area[2],rect_area[3]),clors[label],2) # draw the box
cv2.putText(orgimg,str(label),(rect_area[0],rect_area[1]),cv2.FONT_HERSHEY_SIMPLEX,0.5,clors[label],2)
# orgimg=cv2ImgAddText(orgimg,label,rect_area[0]-height_area,rect_area[1]-height_area-10,(0,255,0),height_area)
# print(result_str)
return orgimg
if __name__ == '__main__':
parser = argparse.ArgumentParser()
parser.add_argument('--detect_model', nargs='+', type=str, default='weights/plate_detect.pt', help='model.pt path(s)') # detection model
parser.add_argument('--image_path', type=str, default=r'D:\Project\ChePai\test\images\val', help='source')
parser.add_argument('--img_size', type=int, default=640, help='inference size (pixels)')
parser.add_argument('--output', type=str, default='result1', help='source')
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
# device =torch.device("cpu")
opt = parser.parse_args()
print(opt)
save_path = opt.output
count=0
if not os.path.exists(save_path):
os.mkdir(save_path)
detect_model = load_model(opt.detect_model, device) # initialize the detection model
time_all = 0
time_begin=time.time()
if not os.path.isfile(opt.image_path): # a directory of images
file_list=[]
allFilePath(opt.image_path,file_list)
for img_path in file_list:
print(count,img_path)
time_b = time.time()
img =cv_imread(img_path)
if img is None:
continue
if img.shape[-1]==4:
img=cv2.cvtColor(img,cv2.COLOR_BGRA2BGR)
# detect_one(model,img_path,device)
dict_list=detect_plate(detect_model, img, device,opt.img_size)
ori_img=draw_result(img,dict_list)
img_name = os.path.basename(img_path)
save_img_path = os.path.join(save_path,img_name)
time_e=time.time()
time_gap = time_e-time_b
if count:
time_all+=time_gap
cv2.imwrite(save_img_path,ori_img)
count+=1
else: # a single image
print(count,opt.image_path,end=" ")
img =cv_imread(opt.image_path)
if img.shape[-1]==4:
img=cv2.cvtColor(img,cv2.COLOR_BGRA2BGR)
# detect_one(model,img_path,device)
dict_list=detect_plate(detect_model, img, device,opt.img_size)
ori_img=draw_result(img,dict_list)
img_name = os.path.basename(opt.image_path)
save_img_path = os.path.join(save_path,img_name)
cv2.imwrite(save_img_path,ori_img)
print(f"sumTime time is {time.time()-time_begin} s, average pic time is {time_all/(len(file_list)-1)}")

372
detect_plate.py Normal file
@@ -0,0 +1,372 @@
# -*- coding: UTF-8 -*-
import argparse
import time
from pathlib import Path
import os
import cv2
import torch
import torch.backends.cudnn as cudnn
from numpy import random
import copy
import numpy as np
from models.experimental import attempt_load
from utils.datasets import letterbox
from utils.general import check_img_size, non_max_suppression_face, apply_classifier, scale_coords, xyxy2xywh, \
strip_optimizer, set_logging, increment_path
from utils.plots import plot_one_box
from utils.torch_utils import select_device, load_classifier, time_synchronized
from utils.cv_puttext import cv2ImgAddText
from plate_recognition.plate_rec import get_plate_result,allFilePath,init_model,cv_imread
# from plate_recognition.plate_cls import cv_imread
from plate_recognition.double_plate_split_merge import get_split_merge
clors = [(255,0,0),(0,255,0),(0,0,255),(255,255,0),(0,255,255)]
danger=['','']
def order_points(pts): # order the four points as top-left, top-right, bottom-right, bottom-left
rect = np.zeros((4, 2), dtype = "float32")
s = pts.sum(axis = 1)
rect[0] = pts[np.argmin(s)]
rect[2] = pts[np.argmax(s)]
diff = np.diff(pts, axis = 1)
rect[1] = pts[np.argmin(diff)]
rect[3] = pts[np.argmax(diff)]
return rect
def four_point_transform(image, pts): # perspective transform to obtain the plate crop
# rect = order_points(pts)
rect = pts.astype('float32')
(tl, tr, br, bl) = rect
widthA = np.sqrt(((br[0] - bl[0]) ** 2) + ((br[1] - bl[1]) ** 2))
widthB = np.sqrt(((tr[0] - tl[0]) ** 2) + ((tr[1] - tl[1]) ** 2))
maxWidth = max(int(widthA), int(widthB))
heightA = np.sqrt(((tr[0] - br[0]) ** 2) + ((tr[1] - br[1]) ** 2))
heightB = np.sqrt(((tl[0] - bl[0]) ** 2) + ((tl[1] - bl[1]) ** 2))
maxHeight = max(int(heightA), int(heightB))
dst = np.array([
[0, 0],
[maxWidth - 1, 0],
[maxWidth - 1, maxHeight - 1],
[0, maxHeight - 1]], dtype = "float32")
M = cv2.getPerspectiveTransform(rect, dst)
warped = cv2.warpPerspective(image, M, (maxWidth, maxHeight))
return warped
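# Given the four plate corners ordered top-left, top-right, bottom-right,
# bottom-left (see order_points above), the warp produces an axis-aligned
# plate crop sized to the longer of each pair of opposite edges.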
def load_model(weights, device): # load the detection model
model = attempt_load(weights, map_location=device) # load FP32 model
return model
def scale_coords_landmarks(img1_shape, coords, img0_shape, ratio_pad=None): # map landmark coordinates back to the original image
# Rescale coords (xyxy) from img1_shape to img0_shape
if ratio_pad is None: # calculate from img0_shape
gain = min(img1_shape[0] / img0_shape[0], img1_shape[1] / img0_shape[1]) # gain = old / new
pad = (img1_shape[1] - img0_shape[1] * gain) / 2, (img1_shape[0] - img0_shape[0] * gain) / 2 # wh padding
else:
gain = ratio_pad[0][0]
pad = ratio_pad[1]
coords[:, [0, 2, 4, 6]] -= pad[0] # x padding
coords[:, [1, 3, 5, 7]] -= pad[1] # y padding
coords[:, :8] /= gain
#clip_coords(coords, img0_shape)
coords[:, 0].clamp_(0, img0_shape[1]) # x1
coords[:, 1].clamp_(0, img0_shape[0]) # y1
coords[:, 2].clamp_(0, img0_shape[1]) # x2
coords[:, 3].clamp_(0, img0_shape[0]) # y2
coords[:, 4].clamp_(0, img0_shape[1]) # x3
coords[:, 5].clamp_(0, img0_shape[0]) # y3
coords[:, 6].clamp_(0, img0_shape[1]) # x4
coords[:, 7].clamp_(0, img0_shape[0]) # y4
# coords[:, 8].clamp_(0, img0_shape[1]) # x5
# coords[:, 9].clamp_(0, img0_shape[0]) # y5
return coords
def get_plate_rec_landmark(img, xyxy, conf, landmarks, class_num,device,plate_rec_model,is_color=False): # get the plate box, the four corner points, and the recognized plate number
h,w,c = img.shape
Box={}
result_dict={}
tl = 1 or round(0.002 * (h + w) / 2) + 1 # line/font thickness
x1 = int(xyxy[0])
y1 = int(xyxy[1])
x2 = int(xyxy[2])
y2 = int(xyxy[3])
height=y2-y1
landmarks_np=np.zeros((4,2))
rect=[x1,y1,x2,y2]
for i in range(4):
point_x = int(landmarks[2 * i])
point_y = int(landmarks[2 * i + 1])
landmarks_np[i]=np.array([point_x,point_y])
class_label= int(class_num) # plate type: 0 = single-layer plate, 1 = double-layer plate
roi_img = four_point_transform(img,landmarks_np) # perspective transform to get the plate crop
if class_label: # if it is a double-layer plate, split the two rows and merge them into one
roi_img=get_split_merge(roi_img)
if not is_color:
plate_number,rec_prob = get_plate_result(roi_img,device,plate_rec_model,is_color=is_color) # recognize the plate crop
else:
plate_number,rec_prob,plate_color,color_conf=get_plate_result(roi_img,device,plate_rec_model,is_color=is_color)
# cv2.imwrite("roi.jpg",roi_img)
# result_dict['Score']=conf # detection confidence
Box['X']=landmarks_np[0][0].tolist() # plate corner coordinates
Box['Y']=landmarks_np[0][1].tolist()
Box['Width']=rect[2]-rect[0]
Box['Height']=rect[3]-rect[1]
Box['label']=plate_number # plate number
Box['rect']=rect
result_dict['rect'] = rect # plate ROI rectangle
result_dict['detect_conf'] = conf # detection confidence
result_dict['landmarks'] = landmarks_np.tolist() # plate corner coordinates
result_dict['plate_no'] = plate_number # plate number
result_dict['rec_conf'] = rec_prob # per-character probabilities
result_dict['roi_height'] = roi_img.shape[0] # plate crop height
result_dict['plate_color'] = ""
if is_color:
result_dict['plate_color'] = plate_color # plate color
result_dict['color_conf'] = color_conf # color confidence
result_dict['plate_type'] = class_label # 0 = single-layer, 1 = double-layer
score = conf.tolist()
return plate_number, score, Box,result_dict
def detect_Recognition_plate(model, orgimg, device,plate_rec_model,img_size,is_color=False): # detect plates and run recognition
# Load model
# img_size = opt_img_size
conf_thres = 0.3 # confidence threshold
iou_thres = 0.5 # NMS IoU threshold
dict_list=[]
result_jpg=[]
# orgimg = cv2.imread(image_path) # BGR
img0 = copy.deepcopy(orgimg)
assert orgimg is not None, 'Image Not Found '
h0, w0 = orgimg.shape[:2] # orig hw
r = img_size / max(h0, w0) # resize image to img_size
if r != 1: # always resize down, only resize up if training with augmentation
interp = cv2.INTER_AREA if r < 1 else cv2.INTER_LINEAR
img0 = cv2.resize(img0, (int(w0 * r), int(h0 * r)), interpolation=interp)
imgsz = check_img_size(img_size, s=model.stride.max()) # check img_size
img = letterbox(img0, new_shape=imgsz)[0] # letterbox preprocessing: pad/resize so both sides are multiples of 32, e.g. 640x640
# img =process_data(img0)
# Convert
img = img[:, :, ::-1].transpose(2, 0, 1).copy() # BGR to RGB, and HWC to CHW
# Run inference
t0 = time.time()
img = torch.from_numpy(img).to(device)
img = img.float() # uint8 to fp16/32
img /= 255.0 # 0 - 255 to 0.0 - 1.0
if img.ndimension() == 3:
img = img.unsqueeze(0)
# Inference
# t1 = time_synchronized()/
pred = model(img)[0]
# t2=time_synchronized()
# print(f"infer time is {(t2-t1)*1000} ms")
# Apply NMS
pred = non_max_suppression_face(pred, conf_thres, iou_thres)
result_jpg.insert(0, pred[0].tolist())
# print('img.shape: ', img.shape)
# print('orgimg.shape: ', orgimg.shape)
# Process detections
for i, det in enumerate(pred): # detections per image
if len(det):
# Rescale boxes from img_size to im0 size
det[:, :4] = scale_coords(img.shape[2:], det[:, :4], orgimg.shape).round()
# Print results
for c in det[:, -1].unique():
n = (det[:, -1] == c).sum() # detections per class
det[:, 5:13] = scale_coords_landmarks(img.shape[2:], det[:, 5:13], orgimg.shape).round()
for j in range(det.size()[0]):
xyxy = det[j, :4].view(-1).tolist()
conf = det[j, 4].cpu().numpy()
landmarks = det[j, 5:13].view(-1).tolist()
class_num = det[j, 13].cpu().numpy()
label,score,Box,result_dict = get_plate_rec_landmark(orgimg, xyxy, conf, landmarks, class_num,device,plate_rec_model,is_color=is_color)
dict_list.append(result_dict)
result_jpg.append(Box)
result_jpg.append(score)
result_jpg.append(label)
return dict_list, result_jpg
# cv2.imwrite('result.jpg', orgimg)
def draw_result(orgimg,dict_list,is_color=False): # draw the plate results on the image
result_str =""
for result in dict_list:
rect_area = result['rect']
x,y,w,h = rect_area[0],rect_area[1],rect_area[2]-rect_area[0],rect_area[3]-rect_area[1]
padding_w = 0.05*w
padding_h = 0.11*h
rect_area[0]=max(0,int(x-padding_w))
rect_area[1]=max(0,int(y-padding_h))
rect_area[2]=min(orgimg.shape[1],int(rect_area[2]+padding_w))
rect_area[3]=min(orgimg.shape[0],int(rect_area[3]+padding_h))
height_area = result['roi_height']
landmarks=result['landmarks']
result_p = result['plate_no']
if result['plate_type']==0: # single-layer
result_p+=" "+result['plate_color']
else: # double-layer
result_p+=" "+result['plate_color']+"双层"
result_str+=result_p+" "
for i in range(4): # plate corner key points
cv2.circle(orgimg, (int(landmarks[i][0]), int(landmarks[i][1])), 5, clors[i], -1)
cv2.rectangle(orgimg,(rect_area[0],rect_area[1]),(rect_area[2],rect_area[3]),(0,0,255),2) # draw the box
labelSize = cv2.getTextSize(result_p,cv2.FONT_HERSHEY_SIMPLEX,0.5,1) # get the rendered text size
if rect_area[0]+labelSize[0][0]>orgimg.shape[1]: # keep the label text inside the image
rect_area[0]=int(orgimg.shape[1]-labelSize[0][0])
orgimg=cv2.rectangle(orgimg,(rect_area[0],int(rect_area[1]-round(1.6*labelSize[0][1]))),(int(rect_area[0]+round(1.2*labelSize[0][0])),rect_area[1]+labelSize[1]),(255,255,255),cv2.FILLED) # draw a white background box for the text
if len(result)>=1:
orgimg=cv2ImgAddText(orgimg,result_p,rect_area[0],int(rect_area[1]-round(1.6*labelSize[0][1])),(0,0,0),21)
# orgimg=cv2ImgAddText(orgimg,result_p,rect_area[0]-height_area,rect_area[1]-height_area-10,(0,255,0),height_area)
print(result_str) # print the results
return orgimg, result_str
def get_second(capture):
if capture.isOpened():
rate = capture.get(5) # frame rate
FrameNumber = capture.get(7) # total number of frames in the video
duration = FrameNumber/rate # duration in seconds (total frames / frame rate)
return int(rate),int(FrameNumber),int(duration)
if __name__ == '__main__':
parser = argparse.ArgumentParser()
parser.add_argument('--detect_model', nargs='+', type=str, default='weights/plate_detect.pt', help='model.pt path(s)') # detection model
parser.add_argument('--rec_model', type=str, default='weights/plate_rec_color.pth', help='model.pt path(s)') # plate recognition + color recognition model
parser.add_argument('--is_color',type=bool,default=True,help='plate color') # whether to also predict plate color
parser.add_argument('--image_path', type=str, default=r'D:\Project\ChePai\test\images\val\20230331163841.jpg', help='source') # image path
parser.add_argument('--img_size', type=int, default=640, help='inference size (pixels)') # network input size
parser.add_argument('--output', type=str, default='result', help='source') # output directory for result images
parser.add_argument('--video', type=str, default='', help='source') # video path
device = torch.device("cuda" if torch.cuda.is_available() else "cpu") # use GPU if available, otherwise CPU
# device =torch.device("cpu")
opt = parser.parse_args()
print(opt)
save_path = opt.output
count=0
if not os.path.exists(save_path):
os.mkdir(save_path)
detect_model = load_model(opt.detect_model, device) # initialize the detection model
plate_rec_model=init_model(device,opt.rec_model,is_color=opt.is_color) # initialize the recognition model
# count model parameters
total = sum(p.numel() for p in detect_model.parameters())
total_1 = sum(p.numel() for p in plate_rec_model.parameters())
print("detect params: %.2fM,rec params: %.2fM" % (total/1e6,total_1/1e6))
# plate_color_model =init_color_model(opt.color_model,device)
time_all = 0
time_begin=time.time()
if not opt.video: # process images
if not os.path.isfile(opt.image_path): # a directory of images
file_list=[]
allFilePath(opt.image_path,file_list) # collect every image path in this directory into file_list
for img_path in file_list: # iterate over the image files
print(count,img_path,end=" ")
time_b = time.time() # start time
img =cv_imread(img_path) # read the image with OpenCV
if img is None:
continue
if img.shape[-1]==4: # convert 4-channel images to 3 channels
img=cv2.cvtColor(img,cv2.COLOR_BGRA2BGR)
# detect_one(model,img_path,device)
dict_list, result_jpg=detect_Recognition_plate(detect_model, img, device,plate_rec_model,opt.img_size,is_color=opt.is_color) # detect and recognize plates (the function returns two values)
ori_img, str_result=draw_result(img,dict_list) # draw the results on the image
print(str_result)
img_name = os.path.basename(img_path)
save_img_path = os.path.join(save_path,img_name) # path to save the annotated image
time_e=time.time()
time_gap = time_e-time_b # per-image processing time
if count:
time_all+=time_gap
cv2.imwrite(save_img_path,ori_img) # save the annotated image with OpenCV
count+=1
print(f"sumTime time is {time.time()-time_begin} s, average pic time is {time_all/(len(file_list)-1)}")
else: # a single image
print(count,opt.image_path,end=" ")
img =cv_imread(opt.image_path)
if img.shape[-1]==4:
img=cv2.cvtColor(img,cv2.COLOR_BGRA2BGR)
# detect_one(model,img_path,device)
dict_list=detect_Recognition_plate(detect_model, img, device,plate_rec_model,opt.img_size,is_color=opt.is_color) # detect and recognize the plate
ori_img, str_result=draw_result(img,dict_list) # draw_result returns (annotated image, result string)
img_name = os.path.basename(opt.image_path)
save_img_path = os.path.join(save_path,img_name)
cv2.imwrite(save_img_path,ori_img)
else: # video mode
video_name = opt.video
capture=cv2.VideoCapture(video_name)
fourcc = cv2.VideoWriter_fourcc(*'mp4v') # mp4 codec
fps = capture.get(cv2.CAP_PROP_FPS) # frames per second
width, height = int(capture.get(cv2.CAP_PROP_FRAME_WIDTH)), int(capture.get(cv2.CAP_PROP_FRAME_HEIGHT)) # frame size
out = cv2.VideoWriter('result.mp4', fourcc, fps, (width, height)) # output video writer
frame_count = 0
fps_all=0
rate,FrameNumber,duration=get_second(capture)
if capture.isOpened():
while True:
t1 = cv2.getTickCount()
frame_count+=1
print(f"{frame_count}",end=" ")
ret,img=capture.read()
if not ret:
break
# if frame_count%rate==0:
img0 = copy.deepcopy(img)
dict_list=detect_Recognition_plate(detect_model, img, device,plate_rec_model,opt.img_size,is_color=opt.is_color)
ori_img, _ = draw_result(img,dict_list) # draw_result returns (annotated image, result string)
t2 =cv2.getTickCount()
infer_time =(t2-t1)/cv2.getTickFrequency()
fps=1.0/infer_time
fps_all+=fps
str_fps = f'fps:{fps:.4f}'
cv2.putText(ori_img,str_fps,(20,20),cv2.FONT_HERSHEY_SIMPLEX,1,(0,255,0),2)
# cv2.imshow("haha",ori_img)
# cv2.waitKey(1)
out.write(ori_img)
# current_time = int(frame_count/FrameNumber*duration)
# sec = current_time%60
# minute = current_time//60
# for result_ in result_list:
# plate_no = result_['plate_no']
# if not is_car_number(pattern_str,plate_no):
# continue
# print(f'车牌号:{plate_no},时间:{minute}分{sec}秒')
# time_str =f'{minute}分{sec}秒'
# writer.writerow({"车牌":plate_no,"时间":time_str})
# out.write(ori_img)
else:
print("failed to open the video")
capture.release()
out.release()
cv2.destroyAllWindows()
print(f"all frame is {frame_count},average fps is {fps_all/frame_count} fps")
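For programmatic use, the same pipeline can be driven without the command line. A minimal sketch, assuming the helpers defined in detect_plate.py above (load_model, init_model, cv_imread, detect_Recognition_plate, draw_result) are importable and the default weights from the README are present; the sample image path is illustrative only.

```
# sketch only: call the detection + recognition pipeline from another script
import os
import cv2
import torch
from detect_plate import (load_model, init_model, cv_imread,
                          detect_Recognition_plate, draw_result)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
detect_model = load_model('weights/plate_detect.pt', device)                        # detection model
plate_rec_model = init_model(device, 'weights/plate_rec_color.pth', is_color=True)  # recognition + color model

img = cv_imread('imgs/single_blue.jpg')            # handles non-ASCII paths
if img.shape[-1] == 4:
    img = cv2.cvtColor(img, cv2.COLOR_BGRA2BGR)

dict_list = detect_Recognition_plate(detect_model, img, device,
                                     plate_rec_model, 640, is_color=True)
annotated, text = draw_result(img, dict_list)
print(text)                                        # plate number + color string built by draw_result
os.makedirs('result', exist_ok=True)
cv2.imwrite('result/demo.jpg', annotated)
```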

28
detect_test.py Normal file
View File

@ -0,0 +1,28 @@
import torch
from ultralytics import YOLO
# load the pretrained detection model
model = YOLO('weights/plate_detect.pt')
# image path
image_path = r'D:\Project\ChePai\test\images\val\20230331163841.jpg'
# run inference
results = model(image_path)
# parse the results
for r in results:
boxes = r.boxes # Boxes object holding the detections
# bounding box coordinates
box_coordinates = boxes.xyxy.cpu().numpy()
# confidence scores
confidences = boxes.conf.cpu().numpy()
# class labels
labels = boxes.cls.cpu().numpy().astype(int)
# print the detections
for i in range(len(box_coordinates)):
x1, y1, x2, y2 = box_coordinates[i]
confidence = confidences[i]
label = model.names[labels[i]]
print(f'Object: {label}, Confidence: {confidence:.2f}, Bounding Box: ({x1:.2f}, {y1:.2f}, {x2:.2f}, {y2:.2f})')

161
export.py Normal file
View File

@ -0,0 +1,161 @@
"""Exports a YOLOv5 *.pt model to ONNX and TorchScript formats
Usage:
$ export PYTHONPATH="$PWD" && python models/export.py --weights ./weights/yolov5s.pt --img 640 --batch 1
"""
import argparse
import sys
import time
sys.path.append('./') # to run '$ python *.py' files in subdirectories
import torch
import torch.nn as nn
import models
from models.experimental import attempt_load
from utils.activations import Hardswish, SiLU
from utils.general import set_logging, check_img_size
import onnx
if __name__ == '__main__':
parser = argparse.ArgumentParser()
parser.add_argument('--weights', type=str, default='./yolov5s.pt', help='weights path') # from yolov5/models/
parser.add_argument('--img_size', nargs='+', type=int, default=[640, 640], help='image size') # height, width
parser.add_argument('--batch_size', type=int, default=1, help='batch size')
parser.add_argument('--dynamic', action='store_true', default=False, help='enable dynamic axis in onnx model')
parser.add_argument('--onnx2pb', action='store_true', default=False, help='export onnx to pb')
parser.add_argument('--onnx_infer', action='store_true', default=True, help='onnx infer test')
#=======================TensorRT=================================
parser.add_argument('--onnx2trt', action='store_true', default=False, help='export onnx to tensorrt')
parser.add_argument('--fp16_trt', action='store_true', default=False, help='fp16 infer')
#================================================================
opt = parser.parse_args()
opt.img_size *= 2 if len(opt.img_size) == 1 else 1 # expand
print(opt)
set_logging()
t = time.time()
# Load PyTorch model
model = attempt_load(opt.weights, map_location=torch.device('cpu')) # load FP32 model
delattr(model.model[-1], 'anchor_grid')
model.model[-1].anchor_grid=[torch.zeros(1)] * 3 # nl=3 number of detection layers
model.model[-1].export_cat = True
model.eval()
labels = model.names
# Checks
gs = int(max(model.stride)) # grid size (max stride)
opt.img_size = [check_img_size(x, gs) for x in opt.img_size] # verify img_size are gs-multiples
# Input
img = torch.zeros(opt.batch_size, 3, *opt.img_size) # image size(1,3,320,192) iDetection
# Update model
for k, m in model.named_modules():
m._non_persistent_buffers_set = set() # pytorch 1.6.0 compatibility
if isinstance(m, models.common.Conv): # assign export-friendly activations
if isinstance(m.act, nn.Hardswish):
m.act = Hardswish()
elif isinstance(m.act, nn.SiLU):
m.act = SiLU()
# elif isinstance(m, models.yolo.Detect):
# m.forward = m.forward_export # assign forward (optional)
if isinstance(m, models.common.ShuffleV2Block):#shufflenet block nn.SiLU
for i in range(len(m.branch1)):
if isinstance(m.branch1[i], nn.SiLU):
m.branch1[i] = SiLU()
for i in range(len(m.branch2)):
if isinstance(m.branch2[i], nn.SiLU):
m.branch2[i] = SiLU()
if isinstance(m, models.common.BlazeBlock): # replace nn.SiLU in BlazeBlock
if isinstance(m.relu, nn.SiLU):
m.relu = SiLU()
if isinstance(m, models.common.DoubleBlazeBlock): # replace nn.SiLU in DoubleBlazeBlock
if isinstance(m.relu, nn.SiLU):
m.relu = SiLU()
for i in range(len(m.branch1)):
if isinstance(m.branch1[i], nn.SiLU):
m.branch1[i] = SiLU()
# for i in range(len(m.branch2)):
# if isinstance(m.branch2[i], nn.SiLU):
# m.branch2[i] = SiLU()
y = model(img) # dry run
# ONNX export
print('\nStarting ONNX export with onnx %s...' % onnx.__version__)
f = opt.weights.replace('.pt', '.onnx') # filename
model.fuse() # only for ONNX
input_names=['input']
output_names=['output']
#tensorrt 7
# grid = model.model[-1].anchor_grid
# model.model[-1].anchor_grid = [a[..., :1, :1, :] for a in grid]
#tensorrt 7
torch.onnx.export(model, img, f, verbose=False, opset_version=12,
input_names=input_names,
output_names=output_names,
dynamic_axes = {'input': {0: 'batch'},
'output': {0: 'batch'}
} if opt.dynamic else None)
# model.model[-1].anchor_grid = grid
# Checks
onnx_model = onnx.load(f) # load onnx model
onnx.checker.check_model(onnx_model) # check onnx model
print('ONNX export success, saved as %s' % f)
# Finish
print('\nExport complete (%.2fs). Visualize with https://github.com/lutzroeder/netron.' % (time.time() - t))
# onnx infer
if opt.onnx_infer:
import onnxruntime
import numpy as np
providers = ['CPUExecutionProvider']
session = onnxruntime.InferenceSession(f, providers=providers)
im = img.cpu().numpy().astype(np.float32) # torch to numpy
y_onnx = session.run([session.get_outputs()[0].name], {session.get_inputs()[0].name: im})[0]
print("pred's shape is ",y_onnx.shape)
print("max(|torch_pred - onnx_pred| =",abs(y.cpu().numpy()-y_onnx).max())
# TensorRT export
if opt.onnx2trt:
from torch2trt.trt_model import ONNX_to_TRT
print('\nStarting TensorRT...')
ONNX_to_TRT(onnx_model_path=f,trt_engine_path=f.replace('.onnx', '.trt'),fp16_mode=opt.fp16_trt)
# PB export
if opt.onnx2pb:
print('download the newest onnx_tf by https://github.com/onnx/onnx-tensorflow/tree/master/onnx_tf')
from onnx_tf.backend import prepare
import tensorflow as tf
outpb = f.replace('.onnx', '.pb') # filename
# strict=True maybe leads to KeyError: 'pyfunc_0', check: https://github.com/onnx/onnx-tensorflow/issues/167
tf_rep = prepare(onnx_model, strict=False) # prepare tf representation
tf_rep.export_graph(outpb) # export the model
out_onnx = tf_rep.run(img) # onnx output
# check pb
with tf.Graph().as_default():
graph_def = tf.GraphDef()
with open(outpb, "rb") as f:
graph_def.ParseFromString(f.read())
tf.import_graph_def(graph_def, name="")
with tf.Session() as sess:
init = tf.global_variables_initializer()
input_x = sess.graph.get_tensor_by_name(input_names[0]+':0') # input
outputs = []
for i in output_names:
outputs.append(sess.graph.get_tensor_by_name(i+':0'))
out_pb = sess.run(outputs, feed_dict={input_x: img})
print(f'out_pytorch {y}')
print(f'out_onnx {out_onnx}')
print(f'out_pb {out_pb}')
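A quick sanity check of the exported model: a sketch of running weights/plate_detect.onnx with onnxruntime on a real image. The preprocessing below is a simplified letterbox (padding to the bottom/right instead of centring), and decoding plus NMS are left to onnx_infer.py from the README, so only the raw output shape is inspected here.

```
import cv2
import numpy as np
import onnxruntime

sess = onnxruntime.InferenceSession('weights/plate_detect.onnx',
                                    providers=['CPUExecutionProvider'])

img = cv2.imread('imgs/single_blue.jpg')                  # BGR, HWC
h0, w0 = img.shape[:2]
r = 640 / max(h0, w0)                                     # scale the long side to 640
img = cv2.resize(img, (int(w0 * r), int(h0 * r)))
pad_h, pad_w = 640 - img.shape[0], 640 - img.shape[1]
img = cv2.copyMakeBorder(img, 0, pad_h, 0, pad_w,         # pad to 640x640 (simplified letterbox)
                         cv2.BORDER_CONSTANT, value=(114, 114, 114))
blob = img[:, :, ::-1].transpose(2, 0, 1)[None].astype(np.float32) / 255.0  # BGR->RGB, HWC->CHW, 0-1

pred = sess.run(None, {sess.get_inputs()[0].name: blob})[0]
print(pred.shape)   # (1, num_anchors, 5 + 8 + nc): box, objectness, 4 landmark pairs, class scores
```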

141
hubconf.py Normal file
View File

@ -0,0 +1,141 @@
"""File for accessing YOLOv5 via PyTorch Hub https://pytorch.org/hub/
Usage:
import torch
model = torch.hub.load('ultralytics/yolov5', 'yolov5s', pretrained=True, channels=3, classes=80)
"""
from pathlib import Path
import torch
from models.yolo import Model
from utils.general import set_logging
from utils.google_utils import attempt_download
dependencies = ['torch', 'yaml']
set_logging()
def create(name, pretrained, channels, classes, autoshape):
"""Creates a specified YOLOv5 model
Arguments:
name (str): name of model, i.e. 'yolov5s'
pretrained (bool): load pretrained weights into the model
channels (int): number of input channels
classes (int): number of model classes
Returns:
pytorch model
"""
config = Path(__file__).parent / 'models' / f'{name}.yaml' # model.yaml path
try:
model = Model(config, channels, classes)
if pretrained:
fname = f'{name}.pt' # checkpoint filename
attempt_download(fname) # download if not found locally
ckpt = torch.load(fname, map_location=torch.device('cpu')) # load
state_dict = ckpt['model'].float().state_dict() # to FP32
state_dict = {k: v for k, v in state_dict.items() if model.state_dict()[k].shape == v.shape} # filter
model.load_state_dict(state_dict, strict=False) # load
if len(ckpt['model'].names) == classes:
model.names = ckpt['model'].names # set class names attribute
if autoshape:
model = model.autoshape() # for file/URI/PIL/cv2/np inputs and NMS
return model
except Exception as e:
help_url = 'https://github.com/ultralytics/yolov5/issues/36'
s = 'Cache may be out of date, try force_reload=True. See %s for help.' % help_url
raise Exception(s) from e
def yolov5s(pretrained=False, channels=3, classes=80, autoshape=True):
"""YOLOv5-small model from https://github.com/ultralytics/yolov5
Arguments:
pretrained (bool): load pretrained weights into the model, default=False
channels (int): number of input channels, default=3
classes (int): number of model classes, default=80
Returns:
pytorch model
"""
return create('yolov5s', pretrained, channels, classes, autoshape)
def yolov5m(pretrained=False, channels=3, classes=80, autoshape=True):
"""YOLOv5-medium model from https://github.com/ultralytics/yolov5
Arguments:
pretrained (bool): load pretrained weights into the model, default=False
channels (int): number of input channels, default=3
classes (int): number of model classes, default=80
Returns:
pytorch model
"""
return create('yolov5m', pretrained, channels, classes, autoshape)
def yolov5l(pretrained=False, channels=3, classes=80, autoshape=True):
"""YOLOv5-large model from https://github.com/ultralytics/yolov5
Arguments:
pretrained (bool): load pretrained weights into the model, default=False
channels (int): number of input channels, default=3
classes (int): number of model classes, default=80
Returns:
pytorch model
"""
return create('yolov5l', pretrained, channels, classes, autoshape)
def yolov5x(pretrained=False, channels=3, classes=80, autoshape=True):
"""YOLOv5-xlarge model from https://github.com/ultralytics/yolov5
Arguments:
pretrained (bool): load pretrained weights into the model, default=False
channels (int): number of input channels, default=3
classes (int): number of model classes, default=80
Returns:
pytorch model
"""
return create('yolov5x', pretrained, channels, classes, autoshape)
def custom(path_or_model='path/to/model.pt', autoshape=True):
"""YOLOv5-custom model from https://github.com/ultralytics/yolov5
Arguments (3 options):
path_or_model (str): 'path/to/model.pt'
path_or_model (dict): torch.load('path/to/model.pt')
path_or_model (nn.Module): torch.load('path/to/model.pt')['model']
Returns:
pytorch model
"""
model = torch.load(path_or_model) if isinstance(path_or_model, str) else path_or_model # load checkpoint
if isinstance(model, dict):
model = model['model'] # load model
hub_model = Model(model.yaml).to(next(model.parameters()).device) # create
hub_model.load_state_dict(model.float().state_dict()) # load state_dict
hub_model.names = model.names # class names
return hub_model.autoshape() if autoshape else hub_model
if __name__ == '__main__':
model = create(name='yolov5s', pretrained=True, channels=3, classes=80, autoshape=True) # pretrained example
# model = custom(path_or_model='path/to/model.pt') # custom example
# Verify inference
from PIL import Image
imgs = [Image.open(x) for x in Path('data/images').glob('*.jpg')]
results = model(imgs)
results.show()
results.print()

BIN (binary image files added, not shown): image/README/1.png, image/README/105384078.png, image/README/test_1.jpg, image/README/weixian.png, four further unnamed images, imgs/double_yellow.jpg, imgs/hongkang1.jpg, imgs/moto.png, imgs/police.jpg, imgs/shi_lin_guan.jpg, imgs/single_blue.jpg, imgs/single_green.jpg, imgs/single_yellow.jpg, imgs/tmp8F1F.png, imgs/tmpA5E3.png, imgs/xue.jpg
121
json2yolo.py Normal file
View File

@ -0,0 +1,121 @@
import json
import os
import numpy as np
from copy import deepcopy
import cv2
def allFilePath(rootPath,allFIleList):
fileList = os.listdir(rootPath)
for temp in fileList:
if os.path.isfile(os.path.join(rootPath,temp)):
allFIleList.append(os.path.join(rootPath,temp))
else:
allFilePath(os.path.join(rootPath,temp),allFIleList)
def xywh2yolo(rect,landmarks_sort,img):
h,w,c =img.shape
rect[0] = max(0, rect[0])
rect[1] = max(0, rect[1])
rect[2] = min(w - 1, rect[2]-rect[0])
rect[3] = min(h - 1, rect[3]-rect[1])
annotation = np.zeros((1, 12))
annotation[0, 0] = (rect[0] + rect[2] / 2) / w # cx
annotation[0, 1] = (rect[1] + rect[3] / 2) / h # cy
annotation[0, 2] = rect[2] / w # w
annotation[0, 3] = rect[3] / h # h
annotation[0, 4] = landmarks_sort[0][0] / w # l0_x
annotation[0, 5] = landmarks_sort[0][1] / h # l0_y
annotation[0, 6] = landmarks_sort[1][0] / w # l1_x
annotation[0, 7] = landmarks_sort[1][1] / h # l1_y
annotation[0, 8] = landmarks_sort[2][0] / w # l2_x
annotation[0, 9] = landmarks_sort[2][1] / h # l2_y
annotation[0, 10] = landmarks_sort[3][0] / w # l3_x
annotation[0, 11] = landmarks_sort[3][1] / h # l3_y
# annotation[0, 12] = (landmarks_sort[0][0]+landmarks_sort[1][0])/2 / w # l4_x
# annotation[0, 13] = (landmarks_sort[0][1]+landmarks_sort[1][1])/2 / h # l4_y
return annotation
def order_points(pts):
rect = np.zeros((4, 2), dtype = "float32")
s = pts.sum(axis = 1)
rect[0] = pts[np.argmin(s)]
rect[2] = pts[np.argmax(s)]
diff = np.diff(pts, axis = 1)
rect[1] = pts[np.argmin(diff)]
rect[3] = pts[np.argmax(diff)]
# return the ordered coordinates
return rect
def four_point_transform(image, pts):
rect = order_points(pts)
(tl, tr, br, bl) = rect
widthA = np.sqrt(((br[0] - bl[0]) ** 2) + ((br[1] - bl[1]) ** 2))
widthB = np.sqrt(((tr[0] - tl[0]) ** 2) + ((tr[1] - tl[1]) ** 2))
maxWidth = max(int(widthA), int(widthB))
heightA = np.sqrt(((tr[0] - br[0]) ** 2) + ((tr[1] - br[1]) ** 2))
heightB = np.sqrt(((tl[0] - bl[0]) ** 2) + ((tl[1] - bl[1]) ** 2))
maxHeight = max(int(heightA), int(heightB))
dst = np.array([
[0, 0],
[maxWidth - 1, 0],
[maxWidth - 1, maxHeight - 1],
[0, maxHeight - 1]], dtype = "float32")
M = cv2.getPerspectiveTransform(rect, dst)
warped = cv2.warpPerspective(image, M, (maxWidth, maxHeight))
# return the warped image
return warped
if __name__ == "__main__":
pic_file_list = []
pic_file = r"/mnt/Gpan/Mydata/pytorchPorject/datasets/ccpd/train_bisai/train_bisai"
save_small_path = "small"
label_file = ['0','1']
allFilePath(pic_file,pic_file_list)
count=0
index = 0
for pic_ in pic_file_list:
if not pic_.endswith(".jpg"):
continue
count+=1
img = cv2.imread(pic_)
img_name = os.path.basename(pic_)
txt_name = img_name.replace(".jpg",".txt")
txt_path = os.path.join(pic_file,txt_name)
json_file_ = pic_.replace(".jpg",".json")
if not os.path.exists(json_file_):
continue
with open(json_file_, 'r',encoding='utf-8') as a:
data_dict = json.load(a)
# print(data_dict['shapes'])
with open(txt_path,"w") as f:
for data_message in data_dict['shapes']:
index+=1
label=data_message['label']
points = data_message['points']
pts = np.array(points)
# pts=order_points(pts)
# new_img = four_point_transform(img,pts)
roi_img_name = label+"_"+str(index)+".jpg"
save_path=os.path.join(save_small_path,roi_img_name)
# cv2.imwrite(save_path,new_img)
x_max,y_max = np.max(pts,axis=0)
x_min,y_min = np.min(pts,axis=0)
rect = [x_min,y_min,x_max,y_max]
rect1=deepcopy(rect)
annotation=xywh2yolo(rect1,pts,img)
print(data_message)
label = data_message['label']
str_label = label_file.index(label)
# str_label = "0 "
str_label = str(str_label)+" "
for i in range(len(annotation[0])):
str_label = str_label + " " + str(annotation[0][i])
str_label = str_label.replace('[', '').replace(']', '')
str_label = str_label.replace(',', '') + '\n'
f.write(str_label)
print(count,img_name)
# point=data_message[points]

349
main.py Normal file
View File

@ -0,0 +1,349 @@
# -*- coding: UTF-8 -*-
from flask import Flask, request, jsonify
from PIL import Image
import io
import base64
import time
from pathlib import Path
import os
import cv2
import torch
import torch.backends.cudnn as cudnn
from numpy import random
import copy
import numpy as np
from models.experimental import attempt_load
from utils.datasets import letterbox
from utils.general import check_img_size, non_max_suppression_face, apply_classifier, scale_coords, xyxy2xywh, \
strip_optimizer, set_logging, increment_path
from utils.plots import plot_one_box
from utils.torch_utils import select_device, load_classifier, time_synchronized
from utils.cv_puttext import cv2ImgAddText
from plate_recognition.plate_rec import get_plate_result, allFilePath, init_model, cv_imread
# from plate_recognition.plate_cls import cv_imread
from plate_recognition.double_plate_split_merge import get_split_merge
clors = [(255, 0, 0), (0, 255, 0), (0, 0, 255), (255, 255, 0), (0, 255, 255)]
danger = ['', '']
def order_points(pts): # order the four points as top-left, top-right, bottom-right, bottom-left
rect = np.zeros((4, 2), dtype="float32")
s = pts.sum(axis=1)
rect[0] = pts[np.argmin(s)]
rect[2] = pts[np.argmax(s)]
diff = np.diff(pts, axis=1)
rect[1] = pts[np.argmin(diff)]
rect[3] = pts[np.argmax(diff)]
return rect
def four_point_transform(image, pts): # perspective transform to crop the plate patch
# rect = order_points(pts)
rect = pts.astype('float32')
(tl, tr, br, bl) = rect
widthA = np.sqrt(((br[0] - bl[0]) ** 2) + ((br[1] - bl[1]) ** 2))
widthB = np.sqrt(((tr[0] - tl[0]) ** 2) + ((tr[1] - tl[1]) ** 2))
maxWidth = max(int(widthA), int(widthB))
heightA = np.sqrt(((tr[0] - br[0]) ** 2) + ((tr[1] - br[1]) ** 2))
heightB = np.sqrt(((tl[0] - bl[0]) ** 2) + ((tl[1] - bl[1]) ** 2))
maxHeight = max(int(heightA), int(heightB))
dst = np.array([
[0, 0],
[maxWidth - 1, 0],
[maxWidth - 1, maxHeight - 1],
[0, maxHeight - 1]], dtype="float32")
M = cv2.getPerspectiveTransform(rect, dst)
warped = cv2.warpPerspective(image, M, (maxWidth, maxHeight))
return warped
def load_model(weights, device): # load the detection model
model = attempt_load(weights, map_location=device) # load FP32 model
return model
def scale_coords_landmarks(img1_shape, coords, img0_shape, ratio_pad=None): # map landmark coords back to the original image
# Rescale coords (xyxy) from img1_shape to img0_shape
if ratio_pad is None: # calculate from img0_shape
gain = min(img1_shape[0] / img0_shape[0], img1_shape[1] / img0_shape[1]) # gain = old / new
pad = (img1_shape[1] - img0_shape[1] * gain) / 2, (img1_shape[0] - img0_shape[0] * gain) / 2 # wh padding
else:
gain = ratio_pad[0][0]
pad = ratio_pad[1]
coords[:, [0, 2, 4, 6]] -= pad[0] # x padding
coords[:, [1, 3, 5, 7]] -= pad[1] # y padding
coords[:, :8] /= gain
# clip_coords(coords, img0_shape)
coords[:, 0].clamp_(0, img0_shape[1]) # x1
coords[:, 1].clamp_(0, img0_shape[0]) # y1
coords[:, 2].clamp_(0, img0_shape[1]) # x2
coords[:, 3].clamp_(0, img0_shape[0]) # y2
coords[:, 4].clamp_(0, img0_shape[1]) # x3
coords[:, 5].clamp_(0, img0_shape[0]) # y3
coords[:, 6].clamp_(0, img0_shape[1]) # x4
coords[:, 7].clamp_(0, img0_shape[0]) # y4
# coords[:, 8].clamp_(0, img0_shape[1]) # x5
# coords[:, 9].clamp_(0, img0_shape[0]) # y5
return coords
def get_plate_rec_landmark(img, xyxy, conf, landmarks, class_num, device, plate_rec_model,
is_color=False): # get the plate box and its four corner points, then recognize the plate number
h, w, c = img.shape
Box = {}
result_dict = {}
tl = 1 or round(0.002 * (h + w) / 2) + 1 # line/font thickness
x1 = int(xyxy[0])
y1 = int(xyxy[1])
x2 = int(xyxy[2])
y2 = int(xyxy[3])
height = y2 - y1
landmarks_np = np.zeros((4, 2))
rect = [x1, y1, x2, y2]
for i in range(4):
point_x = int(landmarks[2 * i])
point_y = int(landmarks[2 * i + 1])
landmarks_np[i] = np.array([point_x, point_y])
class_label = int(class_num) # plate type: 0 = single-layer, 1 = double-layer
roi_img = four_point_transform(img, landmarks_np) # perspective transform to crop the plate patch
if class_label: # double-layer plates are split and re-stitched into a single line
roi_img = get_split_merge(roi_img)
if not is_color:
plate_number, rec_prob = get_plate_result(roi_img, device, plate_rec_model, is_color=is_color) # recognize the cropped plate
else:
plate_number, rec_prob, plate_color, color_conf = get_plate_result(roi_img, device, plate_rec_model,
is_color=is_color)
Box['X'] = landmarks_np[0][0].tolist() # first plate corner (x, y)
Box['Y'] = landmarks_np[0][1].tolist()
Box['Width'] = rect[2] - rect[0]
Box['Height'] = rect[3] - rect[1]
# Box['label'] = plate_number # plate number
# Box['rect'] = rect
result_dict['rect'] = rect # plate ROI box
result_dict['detect_conf'] = conf # detection confidence
result_dict['landmarks'] = landmarks_np.tolist() # four plate corner points
result_dict['plate_no'] = plate_number # plate number
result_dict['rec_conf'] = rec_prob # per-character probabilities
result_dict['roi_height'] = roi_img.shape[0] # height of the plate patch
result_dict['plate_color'] = ""
if is_color:
result_dict['plate_color'] = plate_color # plate color
result_dict['color_conf'] = color_conf # color confidence
result_dict['plate_type'] = class_label # 0 single-layer, 1 double-layer
score = conf.tolist()
return plate_number, score, Box, result_dict
def detect_Recognition_plate(model, orgimg, device, plate_rec_model, img_size, is_color=False): # detect plates and recognize them
# Load model
# img_size = opt_img_size
conf_thres = 0.3 # confidence threshold
iou_thres = 0.5 # NMS IoU threshold
dict_list = []
result_jpg = []
# orgimg = cv2.imread(image_path) # BGR
img0 = copy.deepcopy(orgimg)
assert orgimg is not None, 'Image Not Found '
h0, w0 = orgimg.shape[:2] # orig hw
r = img_size / max(h0, w0) # resize image to img_size
if r != 1: # always resize down, only resize up if training with augmentation
interp = cv2.INTER_AREA if r < 1 else cv2.INTER_LINEAR
img0 = cv2.resize(img0, (int(w0 * r), int(h0 * r)), interpolation=interp)
imgsz = check_img_size(img_size, s=model.stride.max()) # check img_size
img = letterbox(img0, new_shape=imgsz)[0] # pad/resize so height and width are multiples of 32, e.g. 640x640
# img =process_data(img0)
# Convert
img = img[:, :, ::-1].transpose(2, 0, 1).copy() # BGR to RGB, HWC to CHW
# Run inference
t0 = time.time()
img = torch.from_numpy(img).to(device)
img = img.float() # uint8 to fp16/32
img /= 255.0 # 0 - 255 to 0.0 - 1.0
if img.ndimension() == 3:
img = img.unsqueeze(0)
# Inference
pred = model(img)[0]
# Apply NMS
pred = non_max_suppression_face(pred, conf_thres, iou_thres)
# result_jpg.insert(0, pred)
# Process detections
for i, det in enumerate(pred): # detections per image
if len(det):
# Rescale boxes from img_size to im0 size
det[:, :4] = scale_coords(img.shape[2:], det[:, :4], orgimg.shape).round()
# Print results
for c in det[:, -1].unique():
n = (det[:, -1] == c).sum() # detections per class
det[:, 5:13] = scale_coords_landmarks(img.shape[2:], det[:, 5:13], orgimg.shape).round()
for j in range(det.size()[0]):
xyxy = det[j, :4].view(-1).tolist()
conf = det[j, 4].cpu().numpy()
landmarks = det[j, 5:13].view(-1).tolist()
class_num = det[j, 13].cpu().numpy()
label, score, Box, result_dict = get_plate_rec_landmark(orgimg, xyxy, conf, landmarks, class_num,
device, plate_rec_model, is_color=is_color)
dict_list.append(result_dict)
result_jpg.append(Box)
result_jpg.append(score)
result_jpg.append(label)
return dict_list, result_jpg
# cv2.imwrite('result.jpg', orgimg)
def draw_result(orgimg, dict_list, is_color=False): # draw the recognition results on the image
result_str = ""
for result in dict_list:
rect_area = result['rect']
x, y, w, h = rect_area[0], rect_area[1], rect_area[2] - rect_area[0], rect_area[3] - rect_area[1]
padding_w = 0.05 * w
padding_h = 0.11 * h
rect_area[0] = max(0, int(x - padding_w))
rect_area[1] = max(0, int(y - padding_h))
rect_area[2] = min(orgimg.shape[1], int(rect_area[2] + padding_w))
rect_area[3] = min(orgimg.shape[0], int(rect_area[3] + padding_h))
height_area = result['roi_height']
landmarks = result['landmarks']
result_p = result['plate_no']
if result['plate_type'] == 0: # single-layer plate
result_p += " " + result['plate_color']
else: # double-layer plate
result_p += " " + result['plate_color'] + "双层"
result_str += result_p + " "
for i in range(4): # four corner key points
cv2.circle(orgimg, (int(landmarks[i][0]), int(landmarks[i][1])), 5, clors[i], -1)
cv2.rectangle(orgimg, (rect_area[0], rect_area[1]), (rect_area[2], rect_area[3]), (0, 0, 255), 2) # draw the detection box
labelSize = cv2.getTextSize(result_p, cv2.FONT_HERSHEY_SIMPLEX, 0.5, 1) # measure the label text size
if rect_area[0] + labelSize[0][0] > orgimg.shape[1]: # keep the label inside the image
rect_area[0] = int(orgimg.shape[1] - labelSize[0][0])
orgimg = cv2.rectangle(orgimg, (rect_area[0], int(rect_area[1] - round(1.6 * labelSize[0][1]))),
(int(rect_area[0] + round(1.2 * labelSize[0][0])), rect_area[1] + labelSize[1]),
(255, 255, 255), cv2.FILLED) # white background box for the label text
if len(result) >= 1:
orgimg = cv2ImgAddText(orgimg, result_p, rect_area[0], int(rect_area[1] - round(1.6 * labelSize[0][1])),
(0, 0, 0), 21)
# orgimg=cv2ImgAddText(orgimg,result_p,rect_area[0]-height_area,rect_area[1]-height_area-10,(0,255,0),height_area)
print(result_str) # print all recognized plates
return orgimg, result_str
def get_second(capture):
if capture.isOpened():
rate = capture.get(5) # frame rate
FrameNumber = capture.get(7) # total number of frames
duration = FrameNumber / rate # total frames / frame rate = duration in seconds
return int(rate), int(FrameNumber), int(duration)
def process_images(detect_model_path, rec_model_path, is_color, img, img_size, output, video_path):
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
# create the output folder
save_path = output
if not os.path.exists(save_path):
os.mkdir(save_path)
# load the detection and recognition models
detect_model = load_model(detect_model_path, device)
plate_rec_model = init_model(device, rec_model_path, is_color=is_color)
# img = cv_imread(image_path)
if img.shape[-1] == 4:
img = cv2.cvtColor(img, cv2.COLOR_BGRA2BGR)
dict_list, result_jpg = detect_Recognition_plate(detect_model, img, device, plate_rec_model, img_size,
is_color=is_color)
# ori_img = draw_result(img, dict_list)
# ori_list=ori_img[0].tolist()
# result_jpg.insert(0,ori_list)
result_jpg.insert(0, [[1, 0, 0], [0, 1, 0], [0, 0, 1]])
return result_jpg
app = Flask(__name__)
def base64_to_image(base64_str):
# strip the data-URI header from the base64 string, if present
base64_str = base64_str.split(",")[-1]
# decode the base64 string
image_data = base64.b64decode(base64_str)
# convert to a numpy array
nparr = np.frombuffer(image_data, np.uint8)
# decode into an OpenCV image
image = cv2.imdecode(nparr, cv2.IMREAD_COLOR)
return image
@app.route('/upload', methods=['POST'])
def upload_image():
try:
# get the base64-encoded image data from the request
data = request.json
# print(data)
base64_str = data.get('image')
# print(base64_str)
if not base64_str:
return jsonify({'error': 'No image data provided'}), 400
# convert the base64 string back into an image
image = base64_to_image(base64_str)
result_jpg = process_images(
detect_model_path='weights/plate_detect.pt',
rec_model_path='weights/plate_rec_color.pth',
is_color=True,
img=image,
img_size=640,
output='result',
video_path='' # leave empty when processing an image
)
# collect the results
results = []
# add the registration matrix (identity placeholder)
register_matrix = [
[1, 0, 0],
[0, 1, 0],
[0, 0, 1]
]
results.append({"RegisterMatrix": register_matrix})
# append the detection results
for i in range(1, len(result_jpg), 3):
box, score, label = result_jpg[i:i + 3]
box_data = box
detection_result = {
"Box": box_data,
"Score": score,
"label": label
}
results.append(detection_result)
# print(detection_result)
# return the processing results
return jsonify({"result.jpg": results})
except Exception as e:
# print the exception to aid debugging
print(f"Caught an exception: {type(e).__name__}: {str(e)}")
return jsonify({"error_msg": "Content processing is incorrect",
"error_code": "AIS.0404"})
if __name__ == '__main__':
app.run(debug=True)
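A sketch of a client for the /upload endpoint above: it base64-encodes a local image, posts it as JSON and prints the returned detections. The server address assumes the default Flask development server started by app.run; the image path is just an example.

```
import base64
import requests

with open('imgs/single_blue.jpg', 'rb') as f:
    img_b64 = base64.b64encode(f.read()).decode('utf-8')

resp = requests.post('http://127.0.0.1:5000/upload',   # default Flask dev server address
                     json={'image': img_b64})
print(resp.json())   # {"result.jpg": [{"RegisterMatrix": ...}, {"Box": ..., "Score": ..., "label": ...}, ...]}
```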

0
models/__init__.py Normal file
View File

33
models/blazeface.yaml Normal file
View File

@ -0,0 +1,33 @@
# parameters
nc: 1 # number of classes
depth_multiple: 1.0 # model depth multiple
width_multiple: 1.0 # layer channel multiple
# anchors
anchors:
- [5,6, 10,13, 21,26] # P3/8
- [55,72, 225,304, 438,553] # P4/16
# YOLOv5 backbone
backbone:
# [from, number, module, args]
[[-1, 1, Conv, [24, 3, 2]], # 0-P1/2
[-1, 2, BlazeBlock, [24]], # 1
[-1, 1, BlazeBlock, [48, None, 2]], # 2-P2/4
[-1, 2, BlazeBlock, [48]], # 3
[-1, 1, DoubleBlazeBlock, [96, 24, 2]], # 4-P3/8
[-1, 2, DoubleBlazeBlock, [96, 24]], # 5
[-1, 1, DoubleBlazeBlock, [96, 24, 2]], # 6-P4/16
[-1, 2, DoubleBlazeBlock, [96, 24]], # 7
]
# YOLOv5 head
head:
[[-1, 1, Conv, [64, 1, 1]], # 8 (P4/32-large)
[-1, 1, nn.Upsample, [None, 2, 'nearest']],
[[-1, 5], 1, Concat, [1]], # cat backbone P3
[-1, 1, Conv, [64, 1, 1]], # 11 (P3/8-medium)
[[11, 8], 1, Detect, [nc, anchors]], # Detect(P3, P4)
]

38
models/blazeface_fpn.yaml Normal file
View File

@ -0,0 +1,38 @@
# parameters
nc: 1 # number of classes
depth_multiple: 1.0 # model depth multiple
width_multiple: 1.0 # layer channel multiple
# anchors
anchors:
- [5,6, 10,13, 21,26] # P3/8
- [55,72, 225,304, 438,553] # P4/16
# YOLOv5 backbone
backbone:
# [from, number, module, args]
[[-1, 1, Conv, [24, 3, 2]], # 0-P1/2
[-1, 2, BlazeBlock, [24]], # 1
[-1, 1, BlazeBlock, [48, None, 2]], # 2-P2/4
[-1, 2, BlazeBlock, [48]], # 3
[-1, 1, DoubleBlazeBlock, [96, 24, 2]], # 4-P3/8
[-1, 2, DoubleBlazeBlock, [96, 24]], # 5
[-1, 1, DoubleBlazeBlock, [96, 24, 2]], # 6-P4/16
[-1, 2, DoubleBlazeBlock, [96, 24]], # 7
]
# YOLOv5 head
head:
[[-1, 1, Conv, [48, 1, 1]], # 8
[-1, 1, nn.Upsample, [None, 2, 'nearest']],
[[-1, 5], 1, Concat, [1]], # cat backbone P3
[-1, 1, Conv, [48, 1, 1]], # 11 (P3/8-medium)
[-1, 1, nn.MaxPool2d, [3, 2, 1]], # 12
[[-1, 7], 1, Concat, [1]], # cat backbone P3
[-1, 1, Conv, [48, 1, 1]], # 14 (P4/16-large)
[[11, 14], 1, Detect, [nc, anchors]], # Detect(P3, P4)
]

456
models/common.py Normal file
View File

@ -0,0 +1,456 @@
# This file contains modules common to various models
import math
import warnings
import numpy as np
import requests
import torch
import torch.nn as nn
from PIL import Image, ImageDraw
from utils.datasets import letterbox
from utils.general import non_max_suppression, make_divisible, scale_coords, xyxy2xywh
from utils.plots import color_list
def autopad(k, p=None): # kernel, padding
# Pad to 'same'
if p is None:
p = k // 2 if isinstance(k, int) else [x // 2 for x in k] # auto-pad
return p
def channel_shuffle(x, groups):
batchsize, num_channels, height, width = x.data.size()
channels_per_group = num_channels // groups
# reshape
x = x.view(batchsize, groups, channels_per_group, height, width)
x = torch.transpose(x, 1, 2).contiguous()
# flatten
x = x.view(batchsize, -1, height, width)
return x
def DWConv(c1, c2, k=1, s=1, act=True):
# Depthwise convolution
return Conv(c1, c2, k, s, g=math.gcd(c1, c2), act=act)
class Conv(nn.Module):
# Standard convolution
def __init__(self, c1, c2, k=1, s=1, p=None, g=1, act=True): # ch_in, ch_out, kernel, stride, padding, groups
super(Conv, self).__init__()
self.conv = nn.Conv2d(c1, c2, k, s, autopad(k, p), groups=g, bias=False)
self.bn = nn.BatchNorm2d(c2)
self.act = nn.SiLU() if act is True else (act if isinstance(act, nn.Module) else nn.Identity())
#self.act = self.act = nn.LeakyReLU(0.1, inplace=True) if act is True else (act if isinstance(act, nn.Module) else nn.Identity())
def forward(self, x):
return self.act(self.bn(self.conv(x)))
def fuseforward(self, x):
return self.act(self.conv(x))
class StemBlock(nn.Module):
def __init__(self, c1, c2, k=3, s=2, p=None, g=1, act=True):
super(StemBlock, self).__init__()
self.stem_1 = Conv(c1, c2, k, s, p, g, act)
self.stem_2a = Conv(c2, c2 // 2, 1, 1, 0)
self.stem_2b = Conv(c2 // 2, c2, 3, 2, 1)
self.stem_2p = nn.MaxPool2d(kernel_size=2,stride=2,ceil_mode=True)
self.stem_3 = Conv(c2 * 2, c2, 1, 1, 0)
def forward(self, x):
stem_1_out = self.stem_1(x)
stem_2a_out = self.stem_2a(stem_1_out)
stem_2b_out = self.stem_2b(stem_2a_out)
stem_2p_out = self.stem_2p(stem_1_out)
out = self.stem_3(torch.cat((stem_2b_out,stem_2p_out),1))
return out
class Bottleneck(nn.Module):
# Standard bottleneck
def __init__(self, c1, c2, shortcut=True, g=1, e=0.5): # ch_in, ch_out, shortcut, groups, expansion
super(Bottleneck, self).__init__()
c_ = int(c2 * e) # hidden channels
self.cv1 = Conv(c1, c_, 1, 1)
self.cv2 = Conv(c_, c2, 3, 1, g=g)
self.add = shortcut and c1 == c2
def forward(self, x):
return x + self.cv2(self.cv1(x)) if self.add else self.cv2(self.cv1(x))
class BottleneckCSP(nn.Module):
# CSP Bottleneck https://github.com/WongKinYiu/CrossStagePartialNetworks
def __init__(self, c1, c2, n=1, shortcut=True, g=1, e=0.5): # ch_in, ch_out, number, shortcut, groups, expansion
super(BottleneckCSP, self).__init__()
c_ = int(c2 * e) # hidden channels
self.cv1 = Conv(c1, c_, 1, 1)
self.cv2 = nn.Conv2d(c1, c_, 1, 1, bias=False)
self.cv3 = nn.Conv2d(c_, c_, 1, 1, bias=False)
self.cv4 = Conv(2 * c_, c2, 1, 1)
self.bn = nn.BatchNorm2d(2 * c_) # applied to cat(cv2, cv3)
self.act = nn.LeakyReLU(0.1, inplace=True)
self.m = nn.Sequential(*[Bottleneck(c_, c_, shortcut, g, e=1.0) for _ in range(n)])
def forward(self, x):
y1 = self.cv3(self.m(self.cv1(x)))
y2 = self.cv2(x)
return self.cv4(self.act(self.bn(torch.cat((y1, y2), dim=1))))
class C3(nn.Module):
# CSP Bottleneck with 3 convolutions
def __init__(self, c1, c2, n=1, shortcut=True, g=1, e=0.5): # ch_in, ch_out, number, shortcut, groups, expansion
super(C3, self).__init__()
c_ = int(c2 * e) # hidden channels
self.cv1 = Conv(c1, c_, 1, 1)
self.cv2 = Conv(c1, c_, 1, 1)
self.cv3 = Conv(2 * c_, c2, 1) # act=FReLU(c2)
self.m = nn.Sequential(*[Bottleneck(c_, c_, shortcut, g, e=1.0) for _ in range(n)])
def forward(self, x):
return self.cv3(torch.cat((self.m(self.cv1(x)), self.cv2(x)), dim=1))
class ShuffleV2Block(nn.Module):
def __init__(self, inp, oup, stride):
super(ShuffleV2Block, self).__init__()
if not (1 <= stride <= 3):
raise ValueError('illegal stride value')
self.stride = stride
branch_features = oup // 2
assert (self.stride != 1) or (inp == branch_features << 1)
if self.stride > 1:
self.branch1 = nn.Sequential(
self.depthwise_conv(inp, inp, kernel_size=3, stride=self.stride, padding=1),
nn.BatchNorm2d(inp),
nn.Conv2d(inp, branch_features, kernel_size=1, stride=1, padding=0, bias=False),
nn.BatchNorm2d(branch_features),
nn.SiLU(),
)
else:
self.branch1 = nn.Sequential()
self.branch2 = nn.Sequential(
nn.Conv2d(inp if (self.stride > 1) else branch_features, branch_features, kernel_size=1, stride=1, padding=0, bias=False),
nn.BatchNorm2d(branch_features),
nn.SiLU(),
self.depthwise_conv(branch_features, branch_features, kernel_size=3, stride=self.stride, padding=1),
nn.BatchNorm2d(branch_features),
nn.Conv2d(branch_features, branch_features, kernel_size=1, stride=1, padding=0, bias=False),
nn.BatchNorm2d(branch_features),
nn.SiLU(),
)
@staticmethod
def depthwise_conv(i, o, kernel_size, stride=1, padding=0, bias=False):
return nn.Conv2d(i, o, kernel_size, stride, padding, bias=bias, groups=i)
def forward(self, x):
if self.stride == 1:
x1, x2 = x.chunk(2, dim=1)
out = torch.cat((x1, self.branch2(x2)), dim=1)
else:
out = torch.cat((self.branch1(x), self.branch2(x)), dim=1)
out = channel_shuffle(out, 2)
return out
class BlazeBlock(nn.Module):
def __init__(self, in_channels,out_channels,mid_channels=None,stride=1):
super(BlazeBlock, self).__init__()
mid_channels = mid_channels or in_channels
assert stride in [1, 2]
if stride>1:
self.use_pool = True
else:
self.use_pool = False
self.branch1 = nn.Sequential(
nn.Conv2d(in_channels=in_channels,out_channels=mid_channels,kernel_size=5,stride=stride,padding=2,groups=in_channels),
nn.BatchNorm2d(mid_channels),
nn.Conv2d(in_channels=mid_channels,out_channels=out_channels,kernel_size=1,stride=1),
nn.BatchNorm2d(out_channels),
)
if self.use_pool:
self.shortcut = nn.Sequential(
nn.MaxPool2d(kernel_size=stride, stride=stride),
nn.Conv2d(in_channels=in_channels, out_channels=out_channels, kernel_size=1, stride=1),
nn.BatchNorm2d(out_channels),
)
self.relu = nn.SiLU(inplace=True)
def forward(self, x):
branch1 = self.branch1(x)
out = (branch1+self.shortcut(x)) if self.use_pool else (branch1+x)
return self.relu(out)
class DoubleBlazeBlock(nn.Module):
def __init__(self,in_channels,out_channels,mid_channels=None,stride=1):
super(DoubleBlazeBlock, self).__init__()
mid_channels = mid_channels or in_channels
assert stride in [1, 2]
if stride > 1:
self.use_pool = True
else:
self.use_pool = False
self.branch1 = nn.Sequential(
nn.Conv2d(in_channels=in_channels, out_channels=in_channels, kernel_size=5, stride=stride,padding=2,groups=in_channels),
nn.BatchNorm2d(in_channels),
nn.Conv2d(in_channels=in_channels, out_channels=mid_channels, kernel_size=1, stride=1),
nn.BatchNorm2d(mid_channels),
nn.SiLU(inplace=True),
nn.Conv2d(in_channels=mid_channels, out_channels=mid_channels, kernel_size=5, stride=1,padding=2),
nn.BatchNorm2d(mid_channels),
nn.Conv2d(in_channels=mid_channels, out_channels=out_channels, kernel_size=1, stride=1),
nn.BatchNorm2d(out_channels),
)
if self.use_pool:
self.shortcut = nn.Sequential(
nn.MaxPool2d(kernel_size=stride, stride=stride),
nn.Conv2d(in_channels=in_channels, out_channels=out_channels, kernel_size=1, stride=1),
nn.BatchNorm2d(out_channels),
)
self.relu = nn.SiLU(inplace=True)
def forward(self, x):
branch1 = self.branch1(x)
out = (branch1 + self.shortcut(x)) if self.use_pool else (branch1 + x)
return self.relu(out)
class SPP(nn.Module):
# Spatial pyramid pooling layer used in YOLOv3-SPP
def __init__(self, c1, c2, k=(5, 9, 13)):
super(SPP, self).__init__()
c_ = c1 // 2 # hidden channels
self.cv1 = Conv(c1, c_, 1, 1)
self.cv2 = Conv(c_ * (len(k) + 1), c2, 1, 1)
self.m = nn.ModuleList([nn.MaxPool2d(kernel_size=x, stride=1, padding=x // 2) for x in k])
def forward(self, x):
x = self.cv1(x)
return self.cv2(torch.cat([x] + [m(x) for m in self.m], 1))
class SPPF(nn.Module):
# Spatial Pyramid Pooling - Fast (SPPF) layer for YOLOv5 by Glenn Jocher
def __init__(self, c1, c2, k=5): # equivalent to SPP(k=(5, 9, 13))
super().__init__()
c_ = c1 // 2 # hidden channels
self.cv1 = Conv(c1, c_, 1, 1)
self.cv2 = Conv(c_ * 4, c2, 1, 1)
self.m = nn.MaxPool2d(kernel_size=k, stride=1, padding=k // 2)
def forward(self, x):
x = self.cv1(x)
with warnings.catch_warnings():
warnings.simplefilter('ignore') # suppress torch 1.9.0 max_pool2d() warning
y1 = self.m(x)
y2 = self.m(y1)
return self.cv2(torch.cat((x, y1, y2, self.m(y2)), 1))
class Focus(nn.Module):
# Focus wh information into c-space
def __init__(self, c1, c2, k=1, s=1, p=None, g=1, act=True): # ch_in, ch_out, kernel, stride, padding, groups
super(Focus, self).__init__()
self.conv = Conv(c1 * 4, c2, k, s, p, g, act)
# self.contract = Contract(gain=2)
def forward(self, x): # x(b,c,w,h) -> y(b,4c,w/2,h/2)
return self.conv(torch.cat([x[..., ::2, ::2], x[..., 1::2, ::2], x[..., ::2, 1::2], x[..., 1::2, 1::2]], 1))
# return self.conv(self.contract(x))
class Contract(nn.Module):
# Contract width-height into channels, i.e. x(1,64,80,80) to x(1,256,40,40)
def __init__(self, gain=2):
super().__init__()
self.gain = gain
def forward(self, x):
N, C, H, W = x.size() # assert (H / s == 0) and (W / s == 0), 'Indivisible gain'
s = self.gain
x = x.view(N, C, H // s, s, W // s, s) # x(1,64,40,2,40,2)
x = x.permute(0, 3, 5, 1, 2, 4).contiguous() # x(1,2,2,64,40,40)
return x.view(N, C * s * s, H // s, W // s) # x(1,256,40,40)
class Expand(nn.Module):
# Expand channels into width-height, i.e. x(1,64,80,80) to x(1,16,160,160)
def __init__(self, gain=2):
super().__init__()
self.gain = gain
def forward(self, x):
N, C, H, W = x.size() # assert C / s ** 2 == 0, 'Indivisible gain'
s = self.gain
x = x.view(N, s, s, C // s ** 2, H, W) # x(1,2,2,16,80,80)
x = x.permute(0, 3, 4, 1, 5, 2).contiguous() # x(1,16,80,2,80,2)
return x.view(N, C // s ** 2, H * s, W * s) # x(1,16,160,160)
class Concat(nn.Module):
# Concatenate a list of tensors along dimension
def __init__(self, dimension=1):
super(Concat, self).__init__()
self.d = dimension
def forward(self, x):
return torch.cat(x, self.d)
class NMS(nn.Module):
# Non-Maximum Suppression (NMS) module
conf = 0.25 # confidence threshold
iou = 0.45 # IoU threshold
classes = None # (optional list) filter by class
def __init__(self):
super(NMS, self).__init__()
def forward(self, x):
return non_max_suppression(x[0], conf_thres=self.conf, iou_thres=self.iou, classes=self.classes)
class autoShape(nn.Module):
# input-robust model wrapper for passing cv2/np/PIL/torch inputs. Includes preprocessing, inference and NMS
img_size = 640 # inference size (pixels)
conf = 0.25 # NMS confidence threshold
iou = 0.45 # NMS IoU threshold
classes = None # (optional list) filter by class
def __init__(self, model):
super(autoShape, self).__init__()
self.model = model.eval()
def autoshape(self):
print('autoShape already enabled, skipping... ') # model already converted to model.autoshape()
return self
def forward(self, imgs, size=640, augment=False, profile=False):
# Inference from various sources. For height=720, width=1280, RGB images example inputs are:
# filename: imgs = 'data/samples/zidane.jpg'
# URI: = 'https://github.com/ultralytics/yolov5/releases/download/v1.0/zidane.jpg'
# OpenCV: = cv2.imread('image.jpg')[:,:,::-1] # HWC BGR to RGB x(720,1280,3)
# PIL: = Image.open('image.jpg') # HWC x(720,1280,3)
# numpy: = np.zeros((720,1280,3)) # HWC
# torch: = torch.zeros(16,3,720,1280) # BCHW
# multiple: = [Image.open('image1.jpg'), Image.open('image2.jpg'), ...] # list of images
p = next(self.model.parameters()) # for device and type
if isinstance(imgs, torch.Tensor): # torch
return self.model(imgs.to(p.device).type_as(p), augment, profile) # inference
# Pre-process
n, imgs = (len(imgs), imgs) if isinstance(imgs, list) else (1, [imgs]) # number of images, list of images
shape0, shape1 = [], [] # image and inference shapes
for i, im in enumerate(imgs):
if isinstance(im, str): # filename or uri
im = Image.open(requests.get(im, stream=True).raw if im.startswith('http') else im) # open
im = np.array(im) # to numpy
if im.shape[0] < 5: # image in CHW
im = im.transpose((1, 2, 0)) # reverse dataloader .transpose(2, 0, 1)
im = im[:, :, :3] if im.ndim == 3 else np.tile(im[:, :, None], 3) # enforce 3ch input
s = im.shape[:2] # HWC
shape0.append(s) # image shape
g = (size / max(s)) # gain
shape1.append([y * g for y in s])
imgs[i] = im # update
shape1 = [make_divisible(x, int(self.stride.max())) for x in np.stack(shape1, 0).max(0)] # inference shape
x = [letterbox(im, new_shape=shape1, auto=False)[0] for im in imgs] # pad
x = np.stack(x, 0) if n > 1 else x[0][None] # stack
x = np.ascontiguousarray(x.transpose((0, 3, 1, 2))) # BHWC to BCHW
x = torch.from_numpy(x).to(p.device).type_as(p) / 255. # uint8 to fp16/32
# Inference
with torch.no_grad():
y = self.model(x, augment, profile)[0] # forward
y = non_max_suppression(y, conf_thres=self.conf, iou_thres=self.iou, classes=self.classes) # NMS
# Post-process
for i in range(n):
scale_coords(shape1, y[i][:, :4], shape0[i])
return Detections(imgs, y, self.names)
class Detections:
# detections class for YOLOv5 inference results
def __init__(self, imgs, pred, names=None):
super(Detections, self).__init__()
d = pred[0].device # device
gn = [torch.tensor([*[im.shape[i] for i in [1, 0, 1, 0]], 1., 1.], device=d) for im in imgs] # normalizations
self.imgs = imgs # list of images as numpy arrays
self.pred = pred # list of tensors pred[0] = (xyxy, conf, cls)
self.names = names # class names
self.xyxy = pred # xyxy pixels
self.xywh = [xyxy2xywh(x) for x in pred] # xywh pixels
self.xyxyn = [x / g for x, g in zip(self.xyxy, gn)] # xyxy normalized
self.xywhn = [x / g for x, g in zip(self.xywh, gn)] # xywh normalized
self.n = len(self.pred)
def display(self, pprint=False, show=False, save=False, render=False):
colors = color_list()
for i, (img, pred) in enumerate(zip(self.imgs, self.pred)):
str = f'Image {i + 1}/{len(self.pred)}: {img.shape[0]}x{img.shape[1]} '
if pred is not None:
for c in pred[:, -1].unique():
n = (pred[:, -1] == c).sum() # detections per class
str += f'{n} {self.names[int(c)]}s, ' # add to string
if show or save or render:
img = Image.fromarray(img.astype(np.uint8)) if isinstance(img, np.ndarray) else img # from np
for *box, conf, cls in pred: # xyxy, confidence, class
# str += '%s %.2f, ' % (names[int(cls)], conf) # label
ImageDraw.Draw(img).rectangle(box, width=4, outline=colors[int(cls) % 10]) # plot
if pprint:
print(str)
if show:
img.show(f'Image {i}') # show
if save:
f = f'results{i}.jpg'
str += f"saved to '{f}'"
img.save(f) # save
if render:
self.imgs[i] = np.asarray(img)
def print(self):
self.display(pprint=True) # print results
def show(self):
self.display(show=True) # show results
def save(self):
self.display(save=True) # save results
def render(self):
self.display(render=True) # render results
return self.imgs
def __len__(self):
return self.n
def tolist(self):
# return a list of Detections objects, i.e. 'for result in results.tolist():'
x = [Detections([self.imgs[i]], [self.pred[i]], self.names) for i in range(self.n)]
for d in x:
for k in ['imgs', 'pred', 'xyxy', 'xyxyn', 'xywh', 'xywhn']:
setattr(d, k, getattr(d, k)[0]) # pop out of list
return x
class Classify(nn.Module):
# Classification head, i.e. x(b,c1,20,20) to x(b,c2)
def __init__(self, c1, c2, k=1, s=1, p=None, g=1): # ch_in, ch_out, kernel, stride, padding, groups
super(Classify, self).__init__()
self.aap = nn.AdaptiveAvgPool2d(1) # to x(b,c1,1,1)
self.conv = nn.Conv2d(c1, c2, k, s, autopad(k, p), groups=g) # to x(b,c2,1,1)
self.flat = nn.Flatten()
def forward(self, x):
z = torch.cat([self.aap(y) for y in (x if isinstance(x, list) else [x])], 1) # cat if list
return self.flat(self.conv(z)) # flatten to x(b,c2)
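To make the Focus layer above concrete: it slices every second pixel into four half-resolution copies and stacks them on the channel axis before the convolution, so a 3-channel 640x640 input becomes a 12-channel 320x320 tensor. A quick shape check, as a sketch that assumes models.common is importable:

```
import torch
from models.common import Focus

x = torch.randn(1, 3, 640, 640)
slices = torch.cat([x[..., ::2, ::2], x[..., 1::2, ::2],
                    x[..., ::2, 1::2], x[..., 1::2, 1::2]], 1)
print(slices.shape)              # torch.Size([1, 12, 320, 320])

focus = Focus(3, 64, k=3)        # conv then maps the 12 stacked channels to 64
print(focus(x).shape)            # torch.Size([1, 64, 320, 320])
```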

133
models/experimental.py Normal file
View File

@ -0,0 +1,133 @@
# This file contains experimental modules
import numpy as np
import torch
import torch.nn as nn
from models.common import Conv, DWConv
from utils.google_utils import attempt_download
class CrossConv(nn.Module):
# Cross Convolution Downsample
def __init__(self, c1, c2, k=3, s=1, g=1, e=1.0, shortcut=False):
# ch_in, ch_out, kernel, stride, groups, expansion, shortcut
super(CrossConv, self).__init__()
c_ = int(c2 * e) # hidden channels
self.cv1 = Conv(c1, c_, (1, k), (1, s))
self.cv2 = Conv(c_, c2, (k, 1), (s, 1), g=g)
self.add = shortcut and c1 == c2
def forward(self, x):
return x + self.cv2(self.cv1(x)) if self.add else self.cv2(self.cv1(x))
class Sum(nn.Module):
# Weighted sum of 2 or more layers https://arxiv.org/abs/1911.09070
def __init__(self, n, weight=False): # n: number of inputs
super(Sum, self).__init__()
self.weight = weight # apply weights boolean
self.iter = range(n - 1) # iter object
if weight:
self.w = nn.Parameter(-torch.arange(1., n) / 2, requires_grad=True) # layer weights
def forward(self, x):
y = x[0] # no weight
if self.weight:
w = torch.sigmoid(self.w) * 2
for i in self.iter:
y = y + x[i + 1] * w[i]
else:
for i in self.iter:
y = y + x[i + 1]
return y
class GhostConv(nn.Module):
# Ghost Convolution https://github.com/huawei-noah/ghostnet
def __init__(self, c1, c2, k=1, s=1, g=1, act=True): # ch_in, ch_out, kernel, stride, groups
super(GhostConv, self).__init__()
c_ = c2 // 2 # hidden channels
self.cv1 = Conv(c1, c_, k, s, None, g, act)
self.cv2 = Conv(c_, c_, 5, 1, None, c_, act)
def forward(self, x):
y = self.cv1(x)
return torch.cat([y, self.cv2(y)], 1)
class GhostBottleneck(nn.Module):
# Ghost Bottleneck https://github.com/huawei-noah/ghostnet
def __init__(self, c1, c2, k, s):
super(GhostBottleneck, self).__init__()
c_ = c2 // 2
self.conv = nn.Sequential(GhostConv(c1, c_, 1, 1), # pw
DWConv(c_, c_, k, s, act=False) if s == 2 else nn.Identity(), # dw
GhostConv(c_, c2, 1, 1, act=False)) # pw-linear
self.shortcut = nn.Sequential(DWConv(c1, c1, k, s, act=False),
Conv(c1, c2, 1, 1, act=False)) if s == 2 else nn.Identity()
def forward(self, x):
return self.conv(x) + self.shortcut(x)
class MixConv2d(nn.Module):
# Mixed Depthwise Conv https://arxiv.org/abs/1907.09595
def __init__(self, c1, c2, k=(1, 3), s=1, equal_ch=True):
super(MixConv2d, self).__init__()
groups = len(k)
if equal_ch: # equal c_ per group
i = torch.linspace(0, groups - 1E-6, c2).floor() # c2 indices
c_ = [(i == g).sum() for g in range(groups)] # intermediate channels
else: # equal weight.numel() per group
b = [c2] + [0] * groups
a = np.eye(groups + 1, groups, k=-1)
a -= np.roll(a, 1, axis=1)
a *= np.array(k) ** 2
a[0] = 1
c_ = np.linalg.lstsq(a, b, rcond=None)[0].round() # solve for equal weight indices, ax = b
self.m = nn.ModuleList([nn.Conv2d(c1, int(c_[g]), k[g], s, k[g] // 2, bias=False) for g in range(groups)])
self.bn = nn.BatchNorm2d(c2)
self.act = nn.LeakyReLU(0.1, inplace=True)
def forward(self, x):
return x + self.act(self.bn(torch.cat([m(x) for m in self.m], 1)))
class Ensemble(nn.ModuleList):
# Ensemble of models
def __init__(self):
super(Ensemble, self).__init__()
def forward(self, x, augment=False):
y = []
for module in self:
y.append(module(x, augment)[0])
# y = torch.stack(y).max(0)[0] # max ensemble
# y = torch.stack(y).mean(0) # mean ensemble
y = torch.cat(y, 1) # nms ensemble
return y, None # inference, train output
def attempt_load(weights, map_location=None):
# Loads an ensemble of models weights=[a,b,c] or a single model weights=[a] or weights=a
model = Ensemble()
for w in weights if isinstance(weights, list) else [weights]:
attempt_download(w)
model.append(torch.load(w, map_location=map_location)['model'].float().fuse().eval()) # load FP32 model
# Compatibility updates
for m in model.modules():
if type(m) in [nn.Hardswish, nn.LeakyReLU, nn.ReLU, nn.ReLU6, nn.SiLU]:
m.inplace = True # pytorch 1.7.0 compatibility
elif type(m) is Conv:
m._non_persistent_buffers_set = set() # pytorch 1.6.0 compatibility
if len(model) == 1:
return model[-1] # return model
else:
print('Ensemble created with %s\n' % weights)
for k in ['names', 'stride']:
setattr(model, k, getattr(model[-1], k))
return model # return ensemble
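attempt_load is the loader used by export.py and by main.py's load_model. A short sketch of loading the plate detector directly and inspecting the attributes the rest of the code relies on (stride for letterboxing, names for class labels); it assumes the weights from the README are present:

```
import torch
from models.experimental import attempt_load

device = torch.device('cpu')
model = attempt_load('weights/plate_detect.pt', map_location=device)  # fused FP32 model in eval mode
print(int(model.stride.max()), model.names)   # max stride (grid size) and class names

img = torch.zeros(1, 3, 640, 640, device=device)
with torch.no_grad():
    pred = model(img)[0]                      # (1, num_anchors, 5 + 8 + nc)
print(pred.shape)
```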

351
models/yolo.py Normal file
View File

@ -0,0 +1,351 @@
import argparse
import logging
import math
import sys
from copy import deepcopy
from pathlib import Path
import torch
import torch.nn as nn
sys.path.append('./') # to run '$ python *.py' files in subdirectories
logger = logging.getLogger(__name__)
from models.common import Conv, Bottleneck, SPP, DWConv, Focus, BottleneckCSP, C3, ShuffleV2Block, Concat, NMS, autoShape, StemBlock, BlazeBlock, DoubleBlazeBlock
from models.experimental import MixConv2d, CrossConv
from utils.autoanchor import check_anchor_order
from utils.general import make_divisible, check_file, set_logging
from utils.torch_utils import time_synchronized, fuse_conv_and_bn, model_info, scale_img, initialize_weights, \
select_device, copy_attr
try:
import thop # for FLOPS computation
except ImportError:
thop = None
class Detect(nn.Module):
stride = None # strides computed during build
export_cat = False # onnx export cat output
def __init__(self, nc=80, anchors=(), ch=()): # detection layer
super(Detect, self).__init__()
self.nc = nc # number of classes
#self.no = nc + 5 # number of outputs per anchor
self.no = nc + 5 + 8 # number of outputs per anchor
self.nl = len(anchors) # number of detection layers
self.na = len(anchors[0]) // 2 # number of anchors
self.grid = [torch.zeros(1)] * self.nl # init grid
a = torch.tensor(anchors).float().view(self.nl, -1, 2)
self.register_buffer('anchors', a) # shape(nl,na,2)
self.register_buffer('anchor_grid', a.clone().view(self.nl, 1, -1, 1, 1, 2)) # shape(nl,1,na,1,1,2)
self.m = nn.ModuleList(nn.Conv2d(x, self.no * self.na, 1) for x in ch) # output conv
def forward(self, x):
# x = x.copy() # for profiling
z = [] # inference output
# self.training=True
if self.export_cat:
for i in range(self.nl):
x[i] = self.m[i](x[i]) # conv
bs, _, ny, nx = x[i].shape # x(bs,255,20,20) to x(bs,3,20,20,85)
x[i] = x[i].view(bs, self.na, self.no, ny, nx).permute(0, 1, 3, 4, 2).contiguous()
if self.grid[i].shape[2:4] != x[i].shape[2:4]:
# self.grid[i] = self._make_grid(nx, ny).to(x[i].device)
self.grid[i], self.anchor_grid[i] = self._make_grid_new(nx, ny,i)
y = torch.full_like(x[i], 0)
y = y + torch.cat((x[i][:, :, :, :, 0:5].sigmoid(), torch.cat((x[i][:, :, :, :, 5:13], x[i][:, :, :, :, 13:13+self.nc].sigmoid()), 4)), 4)
box_xy = (y[:, :, :, :, 0:2] * 2. - 0.5 + self.grid[i].to(x[i].device)) * self.stride[i] # xy
box_wh = (y[:, :, :, :, 2:4] * 2) ** 2 * self.anchor_grid[i] # wh
# box_conf = torch.cat((box_xy, torch.cat((box_wh, y[:, :, :, :, 4:5]), 4)), 4)
landm1 = y[:, :, :, :, 5:7] * self.anchor_grid[i] + self.grid[i].to(x[i].device) * self.stride[i] # landmark x1 y1
landm2 = y[:, :, :, :, 7:9] * self.anchor_grid[i] + self.grid[i].to(x[i].device) * self.stride[i] # landmark x2 y2
landm3 = y[:, :, :, :, 9:11] * self.anchor_grid[i] + self.grid[i].to(x[i].device) * self.stride[i] # landmark x3 y3
landm4 = y[:, :, :, :, 11:13] * self.anchor_grid[i] + self.grid[i].to(x[i].device) * self.stride[i] # landmark x4 y4
prob= y[:, :, :, :, 13:13+self.nc]
score,index_ = torch.max(prob,dim=-1,keepdim=True)
score=score.type(box_xy.dtype)
index_=index_.type(box_xy.dtype)
index =torch.argmax(prob,dim=-1,keepdim=True).type(box_xy.dtype)
# landm5 = y[:, :, :, :, 13:13] * self.anchor_grid[i] + self.grid[i].to(x[i].device) * self.stride[i] # landmark x5 y5
# landm = torch.cat((landm1, torch.cat((landm2, torch.cat((landm3, torch.cat((landm4, landm5), 4)), 4)), 4)), 4)
# y = torch.cat((box_conf, torch.cat((landm, y[:, :, :, :, 13:13+self.nc]), 4)), 4)
y = torch.cat([box_xy, box_wh, y[:, :, :, :, 4:5], landm1, landm2, landm3, landm4, y[:, :, :, :, 13:13+self.nc]], -1)
z.append(y.view(bs, -1, self.no))
return torch.cat(z, 1)
for i in range(self.nl):
x[i] = self.m[i](x[i]) # conv
bs, _, ny, nx = x[i].shape # x(bs,255,20,20) to x(bs,3,20,20,85)
x[i] = x[i].view(bs, self.na, self.no, ny, nx).permute(0, 1, 3, 4, 2).contiguous()
if not self.training: # inference
if self.grid[i].shape[2:4] != x[i].shape[2:4]:
self.grid[i] = self._make_grid(nx, ny).to(x[i].device)
y = torch.full_like(x[i], 0)
class_range = list(range(5)) + list(range(13,13+self.nc))
y[..., class_range] = x[i][..., class_range].sigmoid()
y[..., 5:13] = x[i][..., 5:13]
#y = x[i].sigmoid()
y[..., 0:2] = (y[..., 0:2] * 2. - 0.5 + self.grid[i].to(x[i].device)) * self.stride[i] # xy
y[..., 2:4] = (y[..., 2:4] * 2) ** 2 * self.anchor_grid[i] # wh
#y[..., 5:13] = y[..., 5:13] * 8 - 4
y[..., 5:7] = y[..., 5:7] * self.anchor_grid[i] + self.grid[i].to(x[i].device) * self.stride[i] # landmark x1 y1
y[..., 7:9] = y[..., 7:9] * self.anchor_grid[i] + self.grid[i].to(x[i].device) * self.stride[i]# landmark x2 y2
y[..., 9:11] = y[..., 9:11] * self.anchor_grid[i] + self.grid[i].to(x[i].device) * self.stride[i]# landmark x3 y3
y[..., 11:13] = y[..., 11:13] * self.anchor_grid[i] + self.grid[i].to(x[i].device) * self.stride[i]# landmark x4 y4
# y[..., 13:13] = y[..., 13:13] * self.anchor_grid[i] + self.grid[i].to(x[i].device) * self.stride[i]# landmark x5 y5
#y[..., 5:7] = (y[..., 5:7] * 2 -1) * self.anchor_grid[i] # landmark x1 y1
#y[..., 7:9] = (y[..., 7:9] * 2 -1) * self.anchor_grid[i] # landmark x2 y2
#y[..., 9:11] = (y[..., 9:11] * 2 -1) * self.anchor_grid[i] # landmark x3 y3
#y[..., 11:13] = (y[..., 11:13] * 2 -1) * self.anchor_grid[i] # landmark x4 y4
#y[..., 13:13] = (y[..., 13:13] * 2 -1) * self.anchor_grid[i] # landmark x5 y5
z.append(y.view(bs, -1, self.no))
return x if self.training else (torch.cat(z, 1), x)
@staticmethod
def _make_grid(nx=20, ny=20):
yv, xv = torch.meshgrid([torch.arange(ny), torch.arange(nx)])
return torch.stack((xv, yv), 2).view((1, 1, ny, nx, 2)).float()
def _make_grid_new(self,nx=20, ny=20,i=0):
d = self.anchors[i].device
if '1.10.0' in torch.__version__: # meshgrid API changed in torch 1.10: pass indexing='ij' explicitly
yv, xv = torch.meshgrid([torch.arange(ny).to(d), torch.arange(nx).to(d)], indexing='ij')
else:
yv, xv = torch.meshgrid([torch.arange(ny).to(d), torch.arange(nx).to(d)])
grid = torch.stack((xv, yv), 2).expand((1, self.na, ny, nx, 2)).float()
anchor_grid = (self.anchors[i].clone() * self.stride[i]).view((1, self.na, 1, 1, 2)).expand((1, self.na, ny, nx, 2)).float()
return grid, anchor_grid
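# --- Illustrative sketch, not part of the original file: how one grid cell's raw
# sigmoid outputs are turned into pixel-space boxes by the formulas used above.
# (gx, gy) is the cell index taken from self.grid, (aw, ah) the anchor in pixels
# taken from self.anchor_grid, and stride the downsampling factor of the layer.
def _demo_decode_cell(tx, ty, tw, th, gx, gy, aw, ah, stride):
    cx = (tx * 2. - 0.5 + gx) * stride  # box centre x in pixels
    cy = (ty * 2. - 0.5 + gy) * stride  # box centre y in pixels
    w = (tw * 2.) ** 2 * aw             # box width in pixels
    h = (th * 2.) ** 2 * ah             # box height in pixels
    return cx, cy, w, h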
class Model(nn.Module):
def __init__(self, cfg='yolov5s.yaml', ch=3, nc=None): # model, input channels, number of classes
super(Model, self).__init__()
if isinstance(cfg, dict):
self.yaml = cfg # model dict
else: # is *.yaml
import yaml # for torch hub
self.yaml_file = Path(cfg).name
with open(cfg) as f:
self.yaml = yaml.load(f, Loader=yaml.FullLoader) # model dict
# Define model
ch = self.yaml['ch'] = self.yaml.get('ch', ch) # input channels
if nc and nc != self.yaml['nc']:
logger.info('Overriding model.yaml nc=%g with nc=%g' % (self.yaml['nc'], nc))
self.yaml['nc'] = nc # override yaml value
self.model, self.save = parse_model(deepcopy(self.yaml), ch=[ch]) # model, savelist
self.names = [str(i) for i in range(self.yaml['nc'])] # default names
# print([x.shape for x in self.forward(torch.zeros(1, ch, 64, 64))])
# Build strides, anchors
m = self.model[-1] # Detect()
if isinstance(m, Detect):
s = 128 # 2x min stride
m.stride = torch.tensor([s / x.shape[-2] for x in self.forward(torch.zeros(1, ch, s, s))]) # forward
m.anchors /= m.stride.view(-1, 1, 1)
check_anchor_order(m)
self.stride = m.stride
self._initialize_biases() # only run once
# print('Strides: %s' % m.stride.tolist())
# Init weights, biases
initialize_weights(self)
self.info()
logger.info('')
def forward(self, x, augment=False, profile=False):
if augment:
img_size = x.shape[-2:] # height, width
s = [1, 0.83, 0.67] # scales
f = [None, 3, None] # flips (2-ud, 3-lr)
y = [] # outputs
for si, fi in zip(s, f):
xi = scale_img(x.flip(fi) if fi else x, si)
yi = self.forward_once(xi)[0] # forward
# cv2.imwrite('img%g.jpg' % s, 255 * xi[0].numpy().transpose((1, 2, 0))[:, :, ::-1]) # save
yi[..., :4] /= si # de-scale
if fi == 2:
yi[..., 1] = img_size[0] - yi[..., 1] # de-flip ud
elif fi == 3:
yi[..., 0] = img_size[1] - yi[..., 0] # de-flip lr
y.append(yi)
return torch.cat(y, 1), None # augmented inference, train
else:
return self.forward_once(x, profile) # single-scale inference, train
def forward_once(self, x, profile=False):
y, dt = [], [] # outputs
for m in self.model:
if m.f != -1: # if not from previous layer
x = y[m.f] if isinstance(m.f, int) else [x if j == -1 else y[j] for j in m.f] # from earlier layers
if profile:
o = thop.profile(m, inputs=(x,), verbose=False)[0] / 1E9 * 2 if thop else 0 # FLOPS
t = time_synchronized()
for _ in range(10):
_ = m(x)
dt.append((time_synchronized() - t) * 100)
print('%10.1f%10.0f%10.1fms %-40s' % (o, m.np, dt[-1], m.type))
x = m(x) # run
y.append(x if m.i in self.save else None) # save output
if profile:
print('%.1fms total' % sum(dt))
return x
def _initialize_biases(self, cf=None): # initialize biases into Detect(), cf is class frequency
# https://arxiv.org/abs/1708.02002 section 3.3
# cf = torch.bincount(torch.tensor(np.concatenate(dataset.labels, 0)[:, 0]).long(), minlength=nc) + 1.
m = self.model[-1] # Detect() module
for mi, s in zip(m.m, m.stride): # from
b = mi.bias.view(m.na, -1) # conv.bias(255) to (3,85)
b.data[:, 4] += math.log(8 / (640 / s) ** 2) # obj (8 objects per 640 image)
b.data[:, 5:] += math.log(0.6 / (m.nc - 0.99)) if cf is None else torch.log(cf / cf.sum()) # cls
mi.bias = torch.nn.Parameter(b.view(-1), requires_grad=True)
def _print_biases(self):
m = self.model[-1] # Detect() module
for mi in m.m: # from
b = mi.bias.detach().view(m.na, -1).T # conv.bias(255) to (3,85)
print(('%6g Conv2d.bias:' + '%10.3g' * 6) % (mi.weight.shape[1], *b[:5].mean(1).tolist(), b[5:].mean()))
# def _print_weights(self):
# for m in self.model.modules():
# if type(m) is Bottleneck:
# print('%10.3g' % (m.w.detach().sigmoid() * 2)) # shortcut weights
def fuse(self): # fuse model Conv2d() + BatchNorm2d() layers
print('Fusing layers... ')
for m in self.model.modules():
if type(m) is Conv and hasattr(m, 'bn'):
m.conv = fuse_conv_and_bn(m.conv, m.bn) # update conv
delattr(m, 'bn') # remove batchnorm
m.forward = m.fuseforward # update forward
elif type(m) is nn.Upsample:
m.recompute_scale_factor = None # torch 1.11.0 compatibility
self.info()
return self
def nms(self, mode=True): # add or remove NMS module
present = type(self.model[-1]) is NMS # last layer is NMS
if mode and not present:
print('Adding NMS... ')
m = NMS() # module
m.f = -1 # from
m.i = self.model[-1].i + 1 # index
self.model.add_module(name='%s' % m.i, module=m) # add
self.eval()
elif not mode and present:
print('Removing NMS... ')
self.model = self.model[:-1] # remove
return self
def autoshape(self): # add autoShape module
print('Adding autoShape... ')
m = autoShape(self) # wrap model
copy_attr(m, self, include=('yaml', 'nc', 'hyp', 'names', 'stride'), exclude=()) # copy attributes
return m
def info(self, verbose=False, img_size=640): # print model information
model_info(self, verbose, img_size)
def parse_model(d, ch): # model_dict, input_channels(3)
logger.info('\n%3s%18s%3s%10s %-40s%-30s' % ('', 'from', 'n', 'params', 'module', 'arguments'))
anchors, nc, gd, gw = d['anchors'], d['nc'], d['depth_multiple'], d['width_multiple']
na = (len(anchors[0]) // 2) if isinstance(anchors, list) else anchors # number of anchors
no = na * (nc + 5) # number of outputs = anchors * (classes + 5)
layers, save, c2 = [], [], ch[-1] # layers, savelist, ch out
for i, (f, n, m, args) in enumerate(d['backbone'] + d['head']): # from, number, module, args
m = eval(m) if isinstance(m, str) else m # eval strings
for j, a in enumerate(args):
try:
args[j] = eval(a) if isinstance(a, str) else a # eval strings
except:
pass
n = max(round(n * gd), 1) if n > 1 else n # depth gain
if m in [Conv, Bottleneck, SPP, DWConv, MixConv2d, Focus, CrossConv, BottleneckCSP, C3, ShuffleV2Block, StemBlock, BlazeBlock, DoubleBlazeBlock]:
c1, c2 = ch[f], args[0]
# Normal
# if i > 0 and args[0] != no: # channel expansion factor
# ex = 1.75 # exponential (default 2.0)
# e = math.log(c2 / ch[1]) / math.log(2)
# c2 = int(ch[1] * ex ** e)
# if m != Focus:
c2 = make_divisible(c2 * gw, 8) if c2 != no else c2
# Experimental
# if i > 0 and args[0] != no: # channel expansion factor
# ex = 1 + gw # exponential (default 2.0)
# ch1 = 32 # ch[1]
# e = math.log(c2 / ch1) / math.log(2) # level 1-n
# c2 = int(ch1 * ex ** e)
# if m != Focus:
# c2 = make_divisible(c2, 8) if c2 != no else c2
args = [c1, c2, *args[1:]]
if m in [BottleneckCSP, C3]:
args.insert(2, n)
n = 1
elif m is nn.BatchNorm2d:
args = [ch[f]]
elif m is Concat:
c2 = sum([ch[-1 if x == -1 else x + 1] for x in f])
elif m is Detect:
args.append([ch[x + 1] for x in f])
if isinstance(args[1], int): # number of anchors
args[1] = [list(range(args[1] * 2))] * len(f)
else:
c2 = ch[f]
m_ = nn.Sequential(*[m(*args) for _ in range(n)]) if n > 1 else m(*args) # module
t = str(m)[8:-2].replace('__main__.', '') # module type
np = sum([x.numel() for x in m_.parameters()]) # number params
m_.i, m_.f, m_.type, m_.np = i, f, t, np # attach index, 'from' index, type, number params
logger.info('%3s%18s%3s%10.0f %-40s%-30s' % (i, f, n, np, t, args)) # print
save.extend(x % i for x in ([f] if isinstance(f, int) else f) if x != -1) # append to savelist
layers.append(m_)
ch.append(c2)
return nn.Sequential(*layers), sorted(save)
from thop import profile
from thop import clever_format
if __name__ == '__main__':
parser = argparse.ArgumentParser()
parser.add_argument('--cfg', type=str, default='yolov5s.yaml', help='model.yaml')
parser.add_argument('--device', default='', help='cuda device, i.e. 0 or 0,1,2,3 or cpu')
opt = parser.parse_args()
opt.cfg = check_file(opt.cfg) # check file
set_logging()
device = select_device(opt.device)
# Create model
model = Model(opt.cfg).to(device)
stride = model.stride.max()
if stride == 32:
input = torch.Tensor(1, 3, 480, 640).to(device)
else:
input = torch.Tensor(1, 3, 512, 640).to(device)
model.train()
print(model)
flops, params = profile(model, inputs=(input, ))
flops, params = clever_format([flops, params], "%.3f")
print('Flops:', flops, ',Params:' ,params)

47
models/yolov5l.yaml Normal file
View File

@ -0,0 +1,47 @@
# parameters
nc: 1 # number of classes
depth_multiple: 1.0 # model depth multiple
width_multiple: 1.0 # layer channel multiple
# anchors
anchors:
- [4,5, 8,10, 13,16] # P3/8
- [23,29, 43,55, 73,105] # P4/16
- [146,217, 231,300, 335,433] # P5/32
# YOLOv5 backbone
backbone:
# [from, number, module, args]
[[-1, 1, StemBlock, [64, 3, 2]], # 0-P1/2
[-1, 3, C3, [128]],
[-1, 1, Conv, [256, 3, 2]], # 2-P3/8
[-1, 9, C3, [256]],
[-1, 1, Conv, [512, 3, 2]], # 4-P4/16
[-1, 9, C3, [512]],
[-1, 1, Conv, [1024, 3, 2]], # 6-P5/32
[-1, 1, SPP, [1024, [3,5,7]]],
[-1, 3, C3, [1024, False]], # 8
]
# YOLOv5 head
head:
[[-1, 1, Conv, [512, 1, 1]],
[-1, 1, nn.Upsample, [None, 2, 'nearest']],
[[-1, 5], 1, Concat, [1]], # cat backbone P4
[-1, 3, C3, [512, False]], # 12
[-1, 1, Conv, [256, 1, 1]],
[-1, 1, nn.Upsample, [None, 2, 'nearest']],
[[-1, 3], 1, Concat, [1]], # cat backbone P3
[-1, 3, C3, [256, False]], # 16 (P3/8-small)
[-1, 1, Conv, [256, 3, 2]],
[[-1, 13], 1, Concat, [1]], # cat head P4
[-1, 3, C3, [512, False]], # 19 (P4/16-medium)
[-1, 1, Conv, [512, 3, 2]],
[[-1, 9], 1, Concat, [1]], # cat head P5
[-1, 3, C3, [1024, False]], # 22 (P5/32-large)
[[16, 19, 22], 1, Detect, [nc, anchors]], # Detect(P3, P4, P5)
]

60
models/yolov5l6.yaml Normal file
View File

@ -0,0 +1,60 @@
# parameters
nc: 1 # number of classes
depth_multiple: 1.0 # model depth multiple
width_multiple: 1.0 # layer channel multiple
# anchors
anchors:
- [6,7, 9,11, 13,16] # P3/8
- [18,23, 26,33, 37,47] # P4/16
- [54,67, 77,104, 112,154] # P5/32
- [174,238, 258,355, 445,568] # P6/64
# YOLOv5 backbone
backbone:
# [from, number, module, args]
[ [ -1, 1, StemBlock, [ 64, 3, 2 ] ], # 0-P1/2
[ -1, 3, C3, [ 128 ] ],
[ -1, 1, Conv, [ 256, 3, 2 ] ], # 2-P3/8
[ -1, 9, C3, [ 256 ] ],
[ -1, 1, Conv, [ 512, 3, 2 ] ], # 4-P4/16
[ -1, 9, C3, [ 512 ] ],
[ -1, 1, Conv, [ 768, 3, 2 ] ], # 6-P5/32
[ -1, 3, C3, [ 768 ] ],
[ -1, 1, Conv, [ 1024, 3, 2 ] ], # 8-P6/64
[ -1, 1, SPP, [ 1024, [ 3, 5, 7 ] ] ],
[ -1, 3, C3, [ 1024, False ] ], # 10
]
# YOLOv5 head
head:
[ [ -1, 1, Conv, [ 768, 1, 1 ] ],
[ -1, 1, nn.Upsample, [ None, 2, 'nearest' ] ],
[ [ -1, 7 ], 1, Concat, [ 1 ] ], # cat backbone P5
[ -1, 3, C3, [ 768, False ] ], # 14
[ -1, 1, Conv, [ 512, 1, 1 ] ],
[ -1, 1, nn.Upsample, [ None, 2, 'nearest' ] ],
[ [ -1, 5 ], 1, Concat, [ 1 ] ], # cat backbone P4
[ -1, 3, C3, [ 512, False ] ], # 18
[ -1, 1, Conv, [ 256, 1, 1 ] ],
[ -1, 1, nn.Upsample, [ None, 2, 'nearest' ] ],
[ [ -1, 3 ], 1, Concat, [ 1 ] ], # cat backbone P3
[ -1, 3, C3, [ 256, False ] ], # 22 (P3/8-small)
[ -1, 1, Conv, [ 256, 3, 2 ] ],
[ [ -1, 19 ], 1, Concat, [ 1 ] ], # cat head P4
[ -1, 3, C3, [ 512, False ] ], # 25 (P4/16-medium)
[ -1, 1, Conv, [ 512, 3, 2 ] ],
[ [ -1, 15 ], 1, Concat, [ 1 ] ], # cat head P5
[ -1, 3, C3, [ 768, False ] ], # 28 (P5/32-large)
[ -1, 1, Conv, [ 768, 3, 2 ] ],
[ [ -1, 11 ], 1, Concat, [ 1 ] ], # cat head P6
[ -1, 3, C3, [ 1024, False ] ], # 31 (P6/64-xlarge)
[ [ 22, 25, 28, 31 ], 1, Detect, [ nc, anchors ] ], # Detect(P3, P4, P5, P6)
]

47
models/yolov5m.yaml Normal file
View File

@ -0,0 +1,47 @@
# parameters
nc: 1 # number of classes
depth_multiple: 0.67 # model depth multiple
width_multiple: 0.75 # layer channel multiple
# anchors
anchors:
- [4,5, 8,10, 13,16] # P3/8
- [23,29, 43,55, 73,105] # P4/16
- [146,217, 231,300, 335,433] # P5/32
# YOLOv5 backbone
backbone:
# [from, number, module, args]
[[-1, 1, StemBlock, [64, 3, 2]], # 0-P1/2
[-1, 3, C3, [128]],
[-1, 1, Conv, [256, 3, 2]], # 2-P3/8
[-1, 9, C3, [256]],
[-1, 1, Conv, [512, 3, 2]], # 4-P4/16
[-1, 9, C3, [512]],
[-1, 1, Conv, [1024, 3, 2]], # 6-P5/32
[-1, 1, SPP, [1024, [3,5,7]]],
[-1, 3, C3, [1024, False]], # 8
]
# YOLOv5 head
head:
[[-1, 1, Conv, [512, 1, 1]],
[-1, 1, nn.Upsample, [None, 2, 'nearest']],
[[-1, 5], 1, Concat, [1]], # cat backbone P4
[-1, 3, C3, [512, False]], # 12
[-1, 1, Conv, [256, 1, 1]],
[-1, 1, nn.Upsample, [None, 2, 'nearest']],
[[-1, 3], 1, Concat, [1]], # cat backbone P3
[-1, 3, C3, [256, False]], # 16 (P3/8-small)
[-1, 1, Conv, [256, 3, 2]],
[[-1, 13], 1, Concat, [1]], # cat head P4
[-1, 3, C3, [512, False]], # 19 (P4/16-medium)
[-1, 1, Conv, [512, 3, 2]],
[[-1, 9], 1, Concat, [1]], # cat head P5
[-1, 3, C3, [1024, False]], # 22 (P5/32-large)
[[16, 19, 22], 1, Detect, [nc, anchors]], # Detect(P3, P4, P5)
]

60
models/yolov5m6.yaml Normal file
View File

@ -0,0 +1,60 @@
# parameters
nc: 1 # number of classes
depth_multiple: 0.67 # model depth multiple
width_multiple: 0.75 # layer channel multiple
# anchors
anchors:
- [6,7, 9,11, 13,16] # P3/8
- [18,23, 26,33, 37,47] # P4/16
- [54,67, 77,104, 112,154] # P5/32
- [174,238, 258,355, 445,568] # P6/64
# YOLOv5 backbone
backbone:
# [from, number, module, args]
[ [ -1, 1, StemBlock, [ 64, 3, 2 ] ], # 0-P1/2
[ -1, 3, C3, [ 128 ] ],
[ -1, 1, Conv, [ 256, 3, 2 ] ], # 2-P3/8
[ -1, 9, C3, [ 256 ] ],
[ -1, 1, Conv, [ 512, 3, 2 ] ], # 4-P4/16
[ -1, 9, C3, [ 512 ] ],
[ -1, 1, Conv, [ 768, 3, 2 ] ], # 6-P5/32
[ -1, 3, C3, [ 768 ] ],
[ -1, 1, Conv, [ 1024, 3, 2 ] ], # 8-P6/64
[ -1, 1, SPP, [ 1024, [ 3, 5, 7 ] ] ],
[ -1, 3, C3, [ 1024, False ] ], # 10
]
# YOLOv5 head
head:
[ [ -1, 1, Conv, [ 768, 1, 1 ] ],
[ -1, 1, nn.Upsample, [ None, 2, 'nearest' ] ],
[ [ -1, 7 ], 1, Concat, [ 1 ] ], # cat backbone P5
[ -1, 3, C3, [ 768, False ] ], # 14
[ -1, 1, Conv, [ 512, 1, 1 ] ],
[ -1, 1, nn.Upsample, [ None, 2, 'nearest' ] ],
[ [ -1, 5 ], 1, Concat, [ 1 ] ], # cat backbone P4
[ -1, 3, C3, [ 512, False ] ], # 18
[ -1, 1, Conv, [ 256, 1, 1 ] ],
[ -1, 1, nn.Upsample, [ None, 2, 'nearest' ] ],
[ [ -1, 3 ], 1, Concat, [ 1 ] ], # cat backbone P3
[ -1, 3, C3, [ 256, False ] ], # 22 (P3/8-small)
[ -1, 1, Conv, [ 256, 3, 2 ] ],
[ [ -1, 19 ], 1, Concat, [ 1 ] ], # cat head P4
[ -1, 3, C3, [ 512, False ] ], # 25 (P4/16-medium)
[ -1, 1, Conv, [ 512, 3, 2 ] ],
[ [ -1, 15 ], 1, Concat, [ 1 ] ], # cat head P5
[ -1, 3, C3, [ 768, False ] ], # 28 (P5/32-large)
[ -1, 1, Conv, [ 768, 3, 2 ] ],
[ [ -1, 11 ], 1, Concat, [ 1 ] ], # cat head P6
[ -1, 3, C3, [ 1024, False ] ], # 31 (P6/64-xlarge)
[ [ 22, 25, 28, 31 ], 1, Detect, [ nc, anchors ] ], # Detect(P3, P4, P5, P6)
]

46
models/yolov5n-0.5.yaml Normal file
View File

@ -0,0 +1,46 @@
# parameters
nc: 1 # number of classes
depth_multiple: 1.0 # model depth multiple
width_multiple: 0.5 # layer channel multiple
# anchors
anchors:
- [4,5, 8,10, 13,16] # P3/8
- [23,29, 43,55, 73,105] # P4/16
- [146,217, 231,300, 335,433] # P5/32
# YOLOv5 backbone
backbone:
# [from, number, module, args]
[[-1, 1, StemBlock, [32, 3, 2]], # 0-P2/4
[-1, 1, ShuffleV2Block, [128, 2]], # 1-P3/8
[-1, 3, ShuffleV2Block, [128, 1]], # 2
[-1, 1, ShuffleV2Block, [256, 2]], # 3-P4/16
[-1, 7, ShuffleV2Block, [256, 1]], # 4
[-1, 1, ShuffleV2Block, [512, 2]], # 5-P5/32
[-1, 3, ShuffleV2Block, [512, 1]], # 6
]
# YOLOv5 head
head:
[[-1, 1, Conv, [128, 1, 1]],
[-1, 1, nn.Upsample, [None, 2, 'nearest']],
[[-1, 4], 1, Concat, [1]], # cat backbone P4
[-1, 1, C3, [128, False]], # 10
[-1, 1, Conv, [128, 1, 1]],
[-1, 1, nn.Upsample, [None, 2, 'nearest']],
[[-1, 2], 1, Concat, [1]], # cat backbone P3
[-1, 1, C3, [128, False]], # 14 (P3/8-small)
[-1, 1, Conv, [128, 3, 2]],
[[-1, 11], 1, Concat, [1]], # cat head P4
[-1, 1, C3, [128, False]], # 17 (P4/16-medium)
[-1, 1, Conv, [128, 3, 2]],
[[-1, 7], 1, Concat, [1]], # cat head P5
[-1, 1, C3, [128, False]], # 20 (P5/32-large)
[[14, 17, 20], 1, Detect, [nc, anchors]], # Detect(P3, P4, P5)
]

46
models/yolov5n.yaml Normal file
View File

@ -0,0 +1,46 @@
# parameters
nc: 1 # number of classes
depth_multiple: 1.0 # model depth multiple
width_multiple: 1.0 # layer channel multiple
# anchors
anchors:
- [4,5, 8,10, 13,16] # P3/8
- [23,29, 43,55, 73,105] # P4/16
- [146,217, 231,300, 335,433] # P5/32
# YOLOv5 backbone
backbone:
# [from, number, module, args]
[[-1, 1, StemBlock, [32, 3, 2]], # 0-P2/4
[-1, 1, ShuffleV2Block, [128, 2]], # 1-P3/8
[-1, 3, ShuffleV2Block, [128, 1]], # 2
[-1, 1, ShuffleV2Block, [256, 2]], # 3-P4/16
[-1, 7, ShuffleV2Block, [256, 1]], # 4
[-1, 1, ShuffleV2Block, [512, 2]], # 5-P5/32
[-1, 3, ShuffleV2Block, [512, 1]], # 6
]
# YOLOv5 head
head:
[[-1, 1, Conv, [128, 1, 1]],
[-1, 1, nn.Upsample, [None, 2, 'nearest']],
[[-1, 4], 1, Concat, [1]], # cat backbone P4
[-1, 1, C3, [128, False]], # 10
[-1, 1, Conv, [128, 1, 1]],
[-1, 1, nn.Upsample, [None, 2, 'nearest']],
[[-1, 2], 1, Concat, [1]], # cat backbone P3
[-1, 1, C3, [128, False]], # 14 (P3/8-small)
[-1, 1, Conv, [128, 3, 2]],
[[-1, 11], 1, Concat, [1]], # cat head P4
[-1, 1, C3, [128, False]], # 17 (P4/16-medium)
[-1, 1, Conv, [128, 3, 2]],
[[-1, 7], 1, Concat, [1]], # cat head P5
[-1, 1, C3, [128, False]], # 20 (P5/32-large)
[[14, 17, 20], 1, Detect, [nc, anchors]], # Detect(P3, P4, P5)
]

58
models/yolov5n6.yaml Normal file
View File

@ -0,0 +1,58 @@
# parameters
nc: 1 # number of classes
depth_multiple: 1.0 # model depth multiple
width_multiple: 1.0 # layer channel multiple
# anchors
anchors:
- [6,7, 9,11, 13,16] # P3/8
- [18,23, 26,33, 37,47] # P4/16
- [54,67, 77,104, 112,154] # P5/32
- [174,238, 258,355, 445,568] # P6/64
# YOLOv5 backbone
backbone:
# [from, number, module, args]
[[-1, 1, StemBlock, [32, 3, 2]], # 0-P2/4
[-1, 1, ShuffleV2Block, [128, 2]], # 1-P3/8
[-1, 3, ShuffleV2Block, [128, 1]], # 2
[-1, 1, ShuffleV2Block, [256, 2]], # 3-P4/16
[-1, 7, ShuffleV2Block, [256, 1]], # 4
[-1, 1, ShuffleV2Block, [384, 2]], # 5-P5/32
[-1, 3, ShuffleV2Block, [384, 1]], # 6
[-1, 1, ShuffleV2Block, [512, 2]], # 7-P6/64
[-1, 3, ShuffleV2Block, [512, 1]], # 8
]
# YOLOv5 head
head:
[[-1, 1, Conv, [128, 1, 1]],
[-1, 1, nn.Upsample, [None, 2, 'nearest']],
[[-1, 6], 1, Concat, [1]], # cat backbone P5
[-1, 1, C3, [128, False]], # 12
[-1, 1, Conv, [128, 1, 1]],
[-1, 1, nn.Upsample, [None, 2, 'nearest']],
[[-1, 4], 1, Concat, [1]], # cat backbone P4
[-1, 1, C3, [128, False]], # 16 (P4/16-medium)
[-1, 1, Conv, [128, 1, 1]],
[-1, 1, nn.Upsample, [None, 2, 'nearest']],
[[-1, 2], 1, Concat, [1]], # cat backbone P3
[-1, 1, C3, [128, False]], # 20 (P3/8-small)
[-1, 1, Conv, [128, 3, 2]],
[[-1, 17], 1, Concat, [1]], # cat head P4
[-1, 1, C3, [128, False]], # 23 (P4/16-medium)
[-1, 1, Conv, [128, 3, 2]],
[[-1, 13], 1, Concat, [1]], # cat head P5
[-1, 1, C3, [128, False]], # 26 (P5/32-large)
[-1, 1, Conv, [128, 3, 2]],
[[-1, 9], 1, Concat, [1]], # cat head P6
[-1, 1, C3, [128, False]], # 29 (P6/64-large)
[[20, 23, 26, 29], 1, Detect, [nc, anchors]], # Detect(P3, P4, P5, P6)
]

47
models/yolov5s.yaml Normal file
View File

@ -0,0 +1,47 @@
# parameters
nc: 1 # number of classes
depth_multiple: 0.33 # model depth multiple
width_multiple: 0.5 # layer channel multiple
# anchors
anchors:
- [4,5, 8,10, 13,16] # P3/8
- [23,29, 43,55, 73,105] # P4/16
- [146,217, 231,300, 335,433] # P5/32
# YOLOv5 backbone
backbone:
# [from, number, module, args]
[[-1, 1, StemBlock, [64, 3, 2]], # 0-P1/2
[-1, 3, C3, [128]],
[-1, 1, Conv, [256, 3, 2]], # 2-P3/8
[-1, 9, C3, [256]],
[-1, 1, Conv, [512, 3, 2]], # 4-P4/16
[-1, 9, C3, [512]],
[-1, 1, Conv, [1024, 3, 2]], # 6-P5/32
[-1, 1, SPP, [1024, [3,5,7]]],
[-1, 3, C3, [1024, False]], # 8
]
# YOLOv5 head
head:
[[-1, 1, Conv, [512, 1, 1]],
[-1, 1, nn.Upsample, [None, 2, 'nearest']],
[[-1, 5], 1, Concat, [1]], # cat backbone P4
[-1, 3, C3, [512, False]], # 12
[-1, 1, Conv, [256, 1, 1]],
[-1, 1, nn.Upsample, [None, 2, 'nearest']],
[[-1, 3], 1, Concat, [1]], # cat backbone P3
[-1, 3, C3, [256, False]], # 16 (P3/8-small)
[-1, 1, Conv, [256, 3, 2]],
[[-1, 13], 1, Concat, [1]], # cat head P4
[-1, 3, C3, [512, False]], # 19 (P4/16-medium)
[-1, 1, Conv, [512, 3, 2]],
[[-1, 9], 1, Concat, [1]], # cat head P5
[-1, 3, C3, [1024, False]], # 22 (P5/32-large)
[[16, 19, 22], 1, Detect, [nc, anchors]], # Detect(P3, P4, P5)
]

60
models/yolov5s6.yaml Normal file
View File

@ -0,0 +1,60 @@
# parameters
nc: 1 # number of classes
depth_multiple: 0.33 # model depth multiple
width_multiple: 0.50 # layer channel multiple
# anchors
anchors:
- [6,7, 9,11, 13,16] # P3/8
- [18,23, 26,33, 37,47] # P4/16
- [54,67, 77,104, 112,154] # P5/32
- [174,238, 258,355, 445,568] # P6/64
# YOLOv5 backbone
backbone:
# [from, number, module, args]
[ [ -1, 1, StemBlock, [ 64, 3, 2 ] ], # 0-P1/2
[ -1, 3, C3, [ 128 ] ],
[ -1, 1, Conv, [ 256, 3, 2 ] ], # 2-P3/8
[ -1, 9, C3, [ 256 ] ],
[ -1, 1, Conv, [ 512, 3, 2 ] ], # 4-P4/16
[ -1, 9, C3, [ 512 ] ],
[ -1, 1, Conv, [ 768, 3, 2 ] ], # 6-P5/32
[ -1, 3, C3, [ 768 ] ],
[ -1, 1, Conv, [ 1024, 3, 2 ] ], # 8-P6/64
[ -1, 1, SPP, [ 1024, [ 3, 5, 7 ] ] ],
[ -1, 3, C3, [ 1024, False ] ], # 10
]
# YOLOv5 head
head:
[ [ -1, 1, Conv, [ 768, 1, 1 ] ],
[ -1, 1, nn.Upsample, [ None, 2, 'nearest' ] ],
[ [ -1, 7 ], 1, Concat, [ 1 ] ], # cat backbone P5
[ -1, 3, C3, [ 768, False ] ], # 14
[ -1, 1, Conv, [ 512, 1, 1 ] ],
[ -1, 1, nn.Upsample, [ None, 2, 'nearest' ] ],
[ [ -1, 5 ], 1, Concat, [ 1 ] ], # cat backbone P4
[ -1, 3, C3, [ 512, False ] ], # 18
[ -1, 1, Conv, [ 256, 1, 1 ] ],
[ -1, 1, nn.Upsample, [ None, 2, 'nearest' ] ],
[ [ -1, 3 ], 1, Concat, [ 1 ] ], # cat backbone P3
[ -1, 3, C3, [ 256, False ] ], # 22 (P3/8-small)
[ -1, 1, Conv, [ 256, 3, 2 ] ],
[ [ -1, 19 ], 1, Concat, [ 1 ] ], # cat head P4
[ -1, 3, C3, [ 512, False ] ], # 25 (P4/16-medium)
[ -1, 1, Conv, [ 512, 3, 2 ] ],
[ [ -1, 15 ], 1, Concat, [ 1 ] ], # cat head P5
[ -1, 3, C3, [ 768, False ] ], # 28 (P5/32-large)
[ -1, 1, Conv, [ 768, 3, 2 ] ],
[ [ -1, 11 ], 1, Concat, [ 1 ] ], # cat head P6
[ -1, 3, C3, [ 1024, False ] ], # 31 (P6/64-xlarge)
[ [ 22, 25, 28, 31 ], 1, Detect, [ nc, anchors ] ], # Detect(P3, P4, P5, P6)
]

256
onnx_infer.py Normal file
View File

@ -0,0 +1,256 @@
import onnxruntime
import numpy as np
import cv2
import copy
import os
import argparse
from PIL import Image, ImageDraw, ImageFont
import time
plate_color_list=['黑色','蓝色','绿色','白色','黄色']
plateName=r"#京沪津渝冀晋蒙辽吉黑苏浙皖闽赣鲁豫鄂湘粤桂琼川贵云藏陕甘青宁新学警港澳挂使领民航危0123456789ABCDEFGHJKLMNPQRSTUVWXYZ险品"
mean_value,std_value=(0.588,0.193) # mean/std used by the recognition model
def decodePlate(preds): # recognition post-processing: merge repeats and drop blanks (index 0)
pre=0
newPreds=[]
for i in range(len(preds)):
if preds[i]!=0 and preds[i]!=pre:
newPreds.append(preds[i])
pre=preds[i]
plate=""
for i in newPreds:
plate+=plateName[int(i)]
return plate
# return newPreds
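# A minimal worked example of the greedy collapse above (assumption: index 0 is the
# CTC blank). Illustrative only, not part of the original script.
def _demo_decode():
    sample = np.array([0, 3, 3, 0, 7, 7, 7])
    # repeats are merged and blanks dropped: [0,3,3,0,7,7,7] -> [3,7]
    assert decodePlate(sample) == plateName[3] + plateName[7]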
def rec_pre_precessing(img,size=(48,168)): # recognition pre-processing: resize to 168x48, normalize, to NCHW
img =cv2.resize(img,(168,48))
img = img.astype(np.float32)
img = (img/255-mean_value)/std_value # normalize: scale to [0,1], subtract mean, divide by std
img = img.transpose(2,0,1) # h,w,c -> c,h,w
img = img.reshape(1,*img.shape) # c,h,w -> batch,c,h,w
return img
def get_plate_result(img,session_rec): # run the recognition model; returns plate string and plate color
img =rec_pre_precessing(img)
y_onnx_plate,y_onnx_color = session_rec.run([session_rec.get_outputs()[0].name,session_rec.get_outputs()[1].name], {session_rec.get_inputs()[0].name: img})
index =np.argmax(y_onnx_plate,axis=-1)
index_color = np.argmax(y_onnx_color)
plate_color = plate_color_list[index_color]
# print(y_onnx[0])
plate_no = decodePlate(index[0])
return plate_no,plate_color
def allFilePath(rootPath,allFIleList): # recursively collect all files under rootPath
fileList = os.listdir(rootPath)
for temp in fileList:
if os.path.isfile(os.path.join(rootPath,temp)):
allFIleList.append(os.path.join(rootPath,temp))
else:
allFilePath(os.path.join(rootPath,temp),allFIleList)
def get_split_merge(img): # split a double-layer plate into upper/lower halves and stitch them side by side for recognition
h,w,c = img.shape
img_upper = img[0:int(5/12*h),:]
img_lower = img[int(1/3*h):,:]
img_upper = cv2.resize(img_upper,(img_lower.shape[1],img_lower.shape[0]))
new_img = np.hstack((img_upper,img_lower))
return new_img
def order_points(pts): # order the 4 keypoints as (top-left, top-right, bottom-right, bottom-left)
rect = np.zeros((4, 2), dtype = "float32")
s = pts.sum(axis = 1)
rect[0] = pts[np.argmin(s)]
rect[2] = pts[np.argmax(s)]
diff = np.diff(pts, axis = 1)
rect[1] = pts[np.argmin(diff)]
rect[3] = pts[np.argmax(diff)]
return rect
def four_point_transform(image, pts): # perspective transform to a rectified plate image for easier recognition
rect = order_points(pts)
(tl, tr, br, bl) = rect
widthA = np.sqrt(((br[0] - bl[0]) ** 2) + ((br[1] - bl[1]) ** 2))
widthB = np.sqrt(((tr[0] - tl[0]) ** 2) + ((tr[1] - tl[1]) ** 2))
maxWidth = max(int(widthA), int(widthB))
heightA = np.sqrt(((tr[0] - br[0]) ** 2) + ((tr[1] - br[1]) ** 2))
heightB = np.sqrt(((tl[0] - bl[0]) ** 2) + ((tl[1] - bl[1]) ** 2))
maxHeight = max(int(heightA), int(heightB))
dst = np.array([
[0, 0],
[maxWidth - 1, 0],
[maxWidth - 1, maxHeight - 1],
[0, maxHeight - 1]], dtype = "float32")
M = cv2.getPerspectiveTransform(rect, dst)
warped = cv2.warpPerspective(image, M, (maxWidth, maxHeight))
# return the warped image
return warped
def my_letter_box(img,size=(640,640)): # letterbox: resize keeping aspect ratio, pad to size with gray (114)
h,w,c = img.shape
r = min(size[0]/h,size[1]/w)
new_h,new_w = int(h*r),int(w*r)
top = int((size[0]-new_h)/2)
left = int((size[1]-new_w)/2)
bottom = size[0]-new_h-top
right = size[1]-new_w-left
img_resize = cv2.resize(img,(new_w,new_h))
img = cv2.copyMakeBorder(img_resize,top,bottom,left,right,borderType=cv2.BORDER_CONSTANT,value=(114,114,114))
return img,r,left,top
def xywh2xyxy(boxes): # convert center-based x,y,w,h to corner coordinates x1,y1,x2,y2
xywh =copy.deepcopy(boxes)
xywh[:,0]=boxes[:,0]-boxes[:,2]/2
xywh[:,1]=boxes[:,1]-boxes[:,3]/2
xywh[:,2]=boxes[:,0]+boxes[:,2]/2
xywh[:,3]=boxes[:,1]+boxes[:,3]/2
return xywh
def my_nms(boxes,iou_thresh): #nms
index = np.argsort(boxes[:,4])[::-1]
keep = []
while index.size >0:
i = index[0]
keep.append(i)
x1=np.maximum(boxes[i,0],boxes[index[1:],0])
y1=np.maximum(boxes[i,1],boxes[index[1:],1])
x2=np.minimum(boxes[i,2],boxes[index[1:],2])
y2=np.minimum(boxes[i,3],boxes[index[1:],3])
w = np.maximum(0,x2-x1)
h = np.maximum(0,y2-y1)
inter_area = w*h
union_area = (boxes[i,2]-boxes[i,0])*(boxes[i,3]-boxes[i,1])+(boxes[index[1:],2]-boxes[index[1:],0])*(boxes[index[1:],3]-boxes[index[1:],1])
iou = inter_area/(union_area-inter_area)
idx = np.where(iou<=iou_thresh)[0]
index = index[idx+1]
return keep
def restore_box(boxes,r,left,top): # map boxes and landmarks back to the original image coordinates
boxes[:,[0,2,5,7,9,11]]-=left
boxes[:,[1,3,6,8,10,12]]-=top
boxes[:,[0,2,5,7,9,11]]/=r
boxes[:,[1,3,6,8,10,12]]/=r
return boxes
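# Quick round-trip check (illustrative only, not part of the original script): a box
# mapped into letterbox space with (r, left, top) comes back out via restore_box.
def _demo_restore():
    img = np.zeros((360, 640, 3), dtype=np.uint8)
    _, r, left, top = my_letter_box(img, (640, 640))
    row = np.zeros((1, 14), dtype=np.float32)
    row[0, :4] = [100 * r + left, 50 * r + top, 300 * r + left, 200 * r + top]
    print(restore_box(row, r, left, top)[0, :4])  # ~[100, 50, 300, 200]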
def detect_pre_precessing(img,img_size): # detection pre-processing: letterbox, BGR->RGB, NCHW, scale to [0,1]
img,r,left,top=my_letter_box(img,img_size)
# cv2.imwrite("1.jpg",img)
img =img[:,:,::-1].transpose(2,0,1).copy().astype(np.float32)
img=img/255
img=img.reshape(1,*img.shape)
return img,r,left,top
def post_precessing(dets,r,left,top,conf_thresh=0.3,iou_thresh=0.5):# detection post-processing
# each row: 0-3 box xywh, 4 objectness, 5-12 four landmark x,y pairs, 13-14 class scores
choice = dets[:,:,4]>conf_thresh
dets=dets[choice]
dets[:,13:15]*=dets[:,4:5] # scale class scores by objectness
box = dets[:,:4]
boxes = xywh2xyxy(box)
score= np.max(dets[:,13:15],axis=-1,keepdims=True)
index = np.argmax(dets[:,13:15],axis=-1).reshape(-1,1)
output = np.concatenate((boxes,score,dets[:,5:13],index),axis=1)
reserve_=my_nms(output,iou_thresh)
output=output[reserve_]
output = restore_box(output,r,left,top)
return output
def rec_plate(outputs,img0,session_rec): # crop, rectify and recognize every detected plate
dict_list=[]
for output in outputs:
result_dict={}
rect=output[:4].tolist()
land_marks = output[5:13].reshape(4,2)
roi_img = four_point_transform(img0,land_marks)
label = int(output[-1])
score = output[4]
if label==1: # class 1 is a double-layer plate
roi_img = get_split_merge(roi_img)
plate_no,plate_color = get_plate_result(roi_img,session_rec)
result_dict['rect']=rect
result_dict['landmarks']=land_marks.tolist()
result_dict['plate_no']=plate_no
result_dict['roi_height']=roi_img.shape[0]
result_dict['plate_color']=plate_color
dict_list.append(result_dict)
return dict_list
def cv2ImgAddText(img, text, left, top, textColor=(0, 255, 0), textSize=20): # draw the recognized text on the image
if (isinstance(img, np.ndarray)): # OpenCV image (numpy array): convert to PIL so Chinese text can be rendered
img = Image.fromarray(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
draw = ImageDraw.Draw(img)
fontText = ImageFont.truetype(
"fonts/platech.ttf", textSize, encoding="utf-8")
draw.text((left, top), text, textColor, font=fontText)
return cv2.cvtColor(np.asarray(img), cv2.COLOR_RGB2BGR)
def draw_result(orgimg,dict_list):
result_str =""
for result in dict_list:
rect_area = result['rect']
x,y,w,h = rect_area[0],rect_area[1],rect_area[2]-rect_area[0],rect_area[3]-rect_area[1]
padding_w = 0.05*w
padding_h = 0.11*h
rect_area[0]=max(0,int(x-padding_w))
rect_area[1]=max(0,int(y-padding_h)) # clamp the padded box to the image bounds
rect_area[2]=min(orgimg.shape[1],int(rect_area[2]+padding_w))
rect_area[3]=min(orgimg.shape[0],int(rect_area[3]+padding_h))
height_area = result['roi_height']
landmarks=result['landmarks']
result = result['plate_no']
result_str+=result+" "
for i in range(4): # draw the 4 landmark points
cv2.circle(orgimg, (int(landmarks[i][0]), int(landmarks[i][1])), 5, clors[i], -1)
cv2.rectangle(orgimg,(rect_area[0],rect_area[1]),(rect_area[2],rect_area[3]),(255,255,0),2) # draw the box
if len(result)>=1:
orgimg=cv2ImgAddText(orgimg,result,rect_area[0]-height_area,rect_area[1]-height_area-10,(0,255,0),height_area)
print(result_str)
return orgimg
if __name__ == "__main__":
begin = time.time()
parser = argparse.ArgumentParser()
parser.add_argument('--detect_model',type=str, default=r'weights/plate_detect.onnx', help='model.pt path(s)') # detection model
parser.add_argument('--rec_model', type=str, default='weights/plate_rec_color.onnx', help='model.pt path(s)')# recognition model
parser.add_argument('--image_path', type=str, default='imgs', help='source')
parser.add_argument('--img_size', type=int, default=640, help='inference size (pixels)')
parser.add_argument('--output', type=str, default='result1', help='source')
opt = parser.parse_args()
file_list = []
allFilePath(opt.image_path,file_list)
providers = ['CPUExecutionProvider']
clors = [(255,0,0),(0,255,0),(0,0,255),(255,255,0),(0,255,255)]
img_size = (opt.img_size,opt.img_size)
session_detect = onnxruntime.InferenceSession(opt.detect_model, providers=providers )
session_rec = onnxruntime.InferenceSession(opt.rec_model, providers=providers )
if not os.path.exists(opt.output):
os.mkdir(opt.output)
save_path = opt.output
count = 0
for pic_ in file_list:
count+=1
print(count,pic_,end=" ")
img=cv2.imread(pic_)
img0 = copy.deepcopy(img)
img,r,left,top = detect_pre_precessing(img,img_size) # detection pre-processing
# print(img.shape)
y_onnx = session_detect.run([session_detect.get_outputs()[0].name], {session_detect.get_inputs()[0].name: img})[0]
outputs = post_precessing(y_onnx,r,left,top) # detection post-processing
result_list=rec_plate(outputs,img0,session_rec)
ori_img = draw_result(img0,result_list)
img_name = os.path.basename(pic_)
save_img_path = os.path.join(save_path,img_name)
cv2.imwrite(save_img_path,ori_img)
print(f"总共耗时{time.time()-begin} s")

342
openvino_infer.py Normal file
View File

@ -0,0 +1,342 @@
import cv2
import matplotlib.pyplot as plt
import numpy as np
from openvino.runtime import Core
import os
import time
import copy
from PIL import Image, ImageDraw, ImageFont
import argparse
def cv_imread(path):
img=cv2.imdecode(np.fromfile(path,dtype=np.uint8),-1)
return img
def allFilePath(rootPath,allFIleList):
fileList = os.listdir(rootPath)
for temp in fileList:
if os.path.isfile(os.path.join(rootPath,temp)):
# if temp.endswith("jpg"):
allFIleList.append(os.path.join(rootPath,temp))
else:
allFilePath(os.path.join(rootPath,temp),allFIleList)
mean_value,std_value=(0.588,0.193) # mean/std used by the recognition model
plateName=r"#京沪津渝冀晋蒙辽吉黑苏浙皖闽赣鲁豫鄂湘粤桂琼川贵云藏陕甘青宁新学警港澳挂使领民航危0123456789ABCDEFGHJKLMNPQRSTUVWXYZ险品"
def rec_pre_precessing(img,size=(48,168)): # recognition pre-processing: resize to 168x48, normalize, to NCHW
img =cv2.resize(img,(168,48))
img = img.astype(np.float32)
img = (img/255-mean_value)/std_value
img = img.transpose(2,0,1)
img = img.reshape(1,*img.shape)
return img
def decodePlate(preds): # recognition post-processing: merge repeats and drop blanks (index 0)
pre=0
newPreds=[]
preds=preds.astype(np.int8)[0]
for i in range(len(preds)):
if preds[i]!=0 and preds[i]!=pre:
newPreds.append(preds[i])
pre=preds[i]
plate=""
for i in newPreds:
plate+=plateName[int(i)]
return plate
def load_model(onnx_path):
ie = Core()
model_onnx = ie.read_model(model=onnx_path)
compiled_model_onnx = ie.compile_model(model=model_onnx, device_name="CPU")
output_layer_onnx = compiled_model_onnx.output(0)
return compiled_model_onnx,output_layer_onnx
def get_plate_result(img,rec_model,rec_output):
img =rec_pre_precessing(img)
# time_b = time.time()
res_onnx = rec_model([img])[rec_output]
# time_e= time.time()
index =np.argmax(res_onnx,axis=-1) # most probable character index at each position
plate_no = decodePlate(index)
# print(f'{plate_no},time is {time_e-time_b}')
return plate_no
def get_split_merge(img): # split a double-layer plate into upper/lower halves and stitch them side by side for recognition
h,w,c = img.shape
img_upper = img[0:int(5/12*h),:]
img_lower = img[int(1/3*h):,:]
img_upper = cv2.resize(img_upper,(img_lower.shape[1],img_lower.shape[0]))
new_img = np.hstack((img_upper,img_lower))
return new_img
def order_points(pts):
rect = np.zeros((4, 2), dtype = "float32")
s = pts.sum(axis = 1)
rect[0] = pts[np.argmin(s)]
rect[2] = pts[np.argmax(s)]
diff = np.diff(pts, axis = 1)
rect[1] = pts[np.argmin(diff)]
rect[3] = pts[np.argmax(diff)]
return rect
def four_point_transform(image, pts):
rect = order_points(pts)
(tl, tr, br, bl) = rect
widthA = np.sqrt(((br[0] - bl[0]) ** 2) + ((br[1] - bl[1]) ** 2))
widthB = np.sqrt(((tr[0] - tl[0]) ** 2) + ((tr[1] - tl[1]) ** 2))
maxWidth = max(int(widthA), int(widthB))
heightA = np.sqrt(((tr[0] - br[0]) ** 2) + ((tr[1] - br[1]) ** 2))
heightB = np.sqrt(((tl[0] - bl[0]) ** 2) + ((tl[1] - bl[1]) ** 2))
maxHeight = max(int(heightA), int(heightB))
dst = np.array([
[0, 0],
[maxWidth - 1, 0],
[maxWidth - 1, maxHeight - 1],
[0, maxHeight - 1]], dtype = "float32")
M = cv2.getPerspectiveTransform(rect, dst)
warped = cv2.warpPerspective(image, M, (maxWidth, maxHeight))
# return the warped image
return warped
def my_letter_box(img,size=(640,640)):
h,w,c = img.shape
r = min(size[0]/h,size[1]/w)
new_h,new_w = int(h*r),int(w*r)
top = int((size[0]-new_h)/2)
left = int((size[1]-new_w)/2)
bottom = size[0]-new_h-top
right = size[1]-new_w-left
img_resize = cv2.resize(img,(new_w,new_h))
img = cv2.copyMakeBorder(img_resize,top,bottom,left,right,borderType=cv2.BORDER_CONSTANT,value=(114,114,114))
return img,r,left,top
def xywh2xyxy(boxes):
xywh =copy.deepcopy(boxes)
xywh[:,0]=boxes[:,0]-boxes[:,2]/2
xywh[:,1]=boxes[:,1]-boxes[:,3]/2
xywh[:,2]=boxes[:,0]+boxes[:,2]/2
xywh[:,3]=boxes[:,1]+boxes[:,3]/2
return xywh
def my_nms(boxes,iou_thresh):
index = np.argsort(boxes[:,4])[::-1]
keep = []
while index.size >0:
i = index[0]
keep.append(i)
x1=np.maximum(boxes[i,0],boxes[index[1:],0])
y1=np.maximum(boxes[i,1],boxes[index[1:],1])
x2=np.minimum(boxes[i,2],boxes[index[1:],2])
y2=np.minimum(boxes[i,3],boxes[index[1:],3])
w = np.maximum(0,x2-x1)
h = np.maximum(0,y2-y1)
inter_area = w*h
union_area = (boxes[i,2]-boxes[i,0])*(boxes[i,3]-boxes[i,1])+(boxes[index[1:],2]-boxes[index[1:],0])*(boxes[index[1:],3]-boxes[index[1:],1])
iou = inter_area/(union_area-inter_area)
idx = np.where(iou<=iou_thresh)[0]
index = index[idx+1]
return keep
def restore_box(boxes,r,left,top):
boxes[:,[0,2,5,7,9,11]]-=left
boxes[:,[1,3,6,8,10,12]]-=top
boxes[:,[0,2,5,7,9,11]]/=r
boxes[:,[1,3,6,8,10,12]]/=r
return boxes
def detect_pre_precessing(img,img_size):
img,r,left,top=my_letter_box(img,img_size)
# cv2.imwrite("1.jpg",img)
img =img[:,:,::-1].transpose(2,0,1).copy().astype(np.float32)
img=img/255
img=img.reshape(1,*img.shape)
return img,r,left,top
def post_precessing(dets,r,left,top,conf_thresh=0.3,iou_thresh=0.5):# detection post-processing
choice = dets[:,:,4]>conf_thresh
dets=dets[choice]
dets[:,13:15]*=dets[:,4:5]
box = dets[:,:4]
boxes = xywh2xyxy(box)
score= np.max(dets[:,13:15],axis=-1,keepdims=True)
index = np.argmax(dets[:,13:15],axis=-1).reshape(-1,1)
output = np.concatenate((boxes,score,dets[:,5:13],index),axis=1)
reserve_=my_nms(output,iou_thresh)
output=output[reserve_]
output = restore_box(output,r,left,top)
return output
def rec_plate(outputs,img0,rec_model,rec_output):
dict_list=[]
for output in outputs:
result_dict={}
rect=output[:4].tolist()
land_marks = output[5:13].reshape(4,2)
roi_img = four_point_transform(img0,land_marks)
label = int(output[-1])
if label==1: # class 1 is a double-layer plate
roi_img = get_split_merge(roi_img)
plate_no = get_plate_result(roi_img,rec_model,rec_output) # run plate recognition on the rectified crop
result_dict['rect']=rect
result_dict['landmarks']=land_marks.tolist()
result_dict['plate_no']=plate_no
result_dict['roi_height']=roi_img.shape[0]
dict_list.append(result_dict)
return dict_list
def cv2ImgAddText(img, text, left, top, textColor=(0, 255, 0), textSize=20):
if (isinstance(img, np.ndarray)): # OpenCV image (numpy array): convert to PIL so Chinese text can be rendered
img = Image.fromarray(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
draw = ImageDraw.Draw(img)
fontText = ImageFont.truetype(
"fonts/platech.ttf", textSize, encoding="utf-8")
draw.text((left, top), text, textColor, font=fontText)
return cv2.cvtColor(np.asarray(img), cv2.COLOR_RGB2BGR)
def draw_result(orgimg,dict_list):
result_str =""
for result in dict_list:
rect_area = result['rect']
x,y,w,h = rect_area[0],rect_area[1],rect_area[2]-rect_area[0],rect_area[3]-rect_area[1]
padding_w = 0.05*w
padding_h = 0.11*h
rect_area[0]=max(0,int(x-padding_w))
rect_area[1]=max(0,int(y-padding_h)) # clamp the padded box to the image bounds
rect_area[2]=min(orgimg.shape[1],int(rect_area[2]+padding_w))
rect_area[3]=min(orgimg.shape[0],int(rect_area[3]+padding_h))
height_area = result['roi_height']
landmarks=result['landmarks']
result = result['plate_no']
result_str+=result+" "
# for i in range(4): #关键点
# cv2.circle(orgimg, (int(landmarks[i][0]), int(landmarks[i][1])), 5, clors[i], -1)
if len(result)>=6:
cv2.rectangle(orgimg,(rect_area[0],rect_area[1]),(rect_area[2],rect_area[3]),(0,0,255),2) # draw the box
orgimg=cv2ImgAddText(orgimg,result,rect_area[0]-height_area,rect_area[1]-height_area-10,(0,255,0),height_area)
# print(result_str)
return orgimg
def get_second(capture):
if capture.isOpened():
rate = capture.get(5) # frame rate (fps)
FrameNumber = capture.get(7) # total number of frames
duration = FrameNumber/rate # duration in seconds (frame count / fps)
return int(rate),int(FrameNumber),int(duration)
if __name__=="__main__":
parser = argparse.ArgumentParser()
parser.add_argument('--detect_model',type=str, default=r'weights/plate_detect.onnx', help='model.pt path(s)') # detection model
parser.add_argument('--rec_model', type=str, default='weights/plate_rec.onnx', help='model.pt path(s)')# recognition model
parser.add_argument('--image_path', type=str, default='imgs', help='source')
parser.add_argument('--img_size', type=int, default=640, help='inference size (pixels)')
parser.add_argument('--output', type=str, default='result1', help='source')
opt = parser.parse_args()
file_list=[]
file_folder=opt.image_path
allFilePath(file_folder,file_list)
rec_onnx_path =opt.rec_model
detect_onnx_path=opt.detect_model
rec_model,rec_output=load_model(rec_onnx_path)
detect_model,detect_output=load_model(detect_onnx_path)
count=0
img_size=(opt.img_size,opt.img_size)
begin=time.time()
save_path=opt.output
if not os.path.exists(save_path):
os.mkdir(save_path)
for pic_ in file_list:
count+=1
print(count,pic_,end=" ")
img=cv2.imread(pic_)
time_b = time.time()
if img.shape[-1]==4:
img = cv2.cvtColor(img,cv2.COLOR_BGRA2BGR)
img0 = copy.deepcopy(img)
img,r,left,top = detect_pre_precessing(img,img_size) # detection pre-processing
# print(img.shape)
det_result = detect_model([img])[detect_output]
outputs = post_precessing(det_result,r,left,top) # detection post-processing
time_1 = time.time()
result_list=rec_plate(outputs,img0,rec_model,rec_output)
time_e= time.time()
print(f'elapsed {time_e-time_b} s')
ori_img = draw_result(img0,result_list)
img_name = os.path.basename(pic_)
save_img_path = os.path.join(save_path,img_name)
cv2.imwrite(save_img_path,ori_img)
print(f"总共耗时{time.time()-begin} s")
# video_name = r"plate.mp4"
# capture=cv2.VideoCapture(video_name)
# fourcc = cv2.VideoWriter_fourcc(*'MP4V')
# fps = capture.get(cv2.CAP_PROP_FPS) # 帧数
# width, height = int(capture.get(cv2.CAP_PROP_FRAME_WIDTH)), int(capture.get(cv2.CAP_PROP_FRAME_HEIGHT)) # 宽高
# out = cv2.VideoWriter('2result.mp4', fourcc, fps, (width, height)) # 写入视频
# frame_count = 0
# fps_all=0
# rate,FrameNumber,duration=get_second(capture)
# # with open("example.csv",mode='w',newline='') as example_file:
# # fieldnames = ['车牌', '时间']
# # writer = csv.DictWriter(example_file, fieldnames=fieldnames, delimiter=',', quotechar='"', quoting=csv.QUOTE_MINIMAL)
# # writer.writeheader()
# if capture.isOpened():
# while True:
# t1 = cv2.getTickCount()
# frame_count+=1
# ret,img=capture.read()
# if not ret:
# break
# # if frame_count%rate==0:
# img0 = copy.deepcopy(img)
# img,r,left,top = detect_pre_precessing(img,img_size) #检测前处理
# # print(img.shape)
# det_result = detect_model([img])[detect_output]
# outputs = post_precessing(det_result,r,left,top) #检测后处理
# result_list=rec_plate(outputs,img0,rec_model,rec_output)
# ori_img = draw_result(img0,result_list)
# t2 =cv2.getTickCount()
# infer_time =(t2-t1)/cv2.getTickFrequency()
# fps=1.0/infer_time
# fps_all+=fps
# str_fps = f'fps:{fps:.4f}'
# out.write(ori_img)
# cv2.putText(ori_img,str_fps,(20,20),cv2.FONT_HERSHEY_SIMPLEX,1,(0,255,0),2)
# cv2.imshow("haha",ori_img)
# cv2.waitKey(1)
# # current_time = int(frame_count/FrameNumber*duration)
# # sec = current_time%60
# # minute = current_time//60
# # for result_ in result_list:
# # plate_no = result_['plate_no']
# # if not is_car_number(pattern_str,plate_no):
# # continue
# # print(f'车牌号:{plate_no},时间:{minute}分{sec}秒')
# # time_str =f'{minute}分{sec}秒'
# # writer.writerow({"车牌":plate_no,"时间":time_str})
# # out.write(ori_img)
# else:
# print("失败")
# capture.release()
# out.release()
# cv2.destroyAllWindows()
# print(f"all frame is {frame_count},average fps is {fps_all/frame_count}")

View File

@ -0,0 +1,15 @@
import os
import cv2
import numpy as np
def get_split_merge(img):
h,w,c = img.shape
img_upper = img[0:int(5/12*h),:]
img_lower = img[int(1/3*h):,:]
img_upper = cv2.resize(img_upper,(img_lower.shape[1],img_lower.shape[0]))
new_img = np.hstack((img_upper,img_lower))
return new_img
if __name__=="__main__":
img = cv2.imread("double_plate/tmp8078.png")
new_img =get_split_merge(img)
cv2.imwrite("double_plate/new.jpg",new_img)

View File

@ -0,0 +1,203 @@
import torch.nn as nn
import torch
import torch.nn.functional as F  # needed by myNet_ocr_color.forward (log_softmax)
class myNet_ocr(nn.Module):
def __init__(self,cfg=None,num_classes=78,export=False):
super(myNet_ocr, self).__init__()
if cfg is None:
cfg =[32,32,64,64,'M',128,128,'M',196,196,'M',256,256]
# cfg =[32,32,'M',64,64,'M',128,128,'M',256,256]
self.feature = self.make_layers(cfg, True)
self.export = export
# self.classifier = nn.Linear(cfg[-1], num_classes)
# self.loc = nn.MaxPool2d((2, 2), (5, 1), (0, 1),ceil_mode=True)
# self.loc = nn.AvgPool2d((2, 2), (5, 2), (0, 1),ceil_mode=False)
self.loc = nn.MaxPool2d((5, 2), (1, 1),(0,1),ceil_mode=False)
self.newCnn=nn.Conv2d(cfg[-1],num_classes,1,1)
# self.newBn=nn.BatchNorm2d(num_classes)
def make_layers(self, cfg, batch_norm=False):
layers = []
in_channels = 3
for i in range(len(cfg)):
if i == 0:
conv2d =nn.Conv2d(in_channels, cfg[i], kernel_size=5,stride =1)
if batch_norm:
layers += [conv2d, nn.BatchNorm2d(cfg[i]), nn.ReLU(inplace=True)]
else:
layers += [conv2d, nn.ReLU(inplace=True)]
in_channels = cfg[i]
else :
if cfg[i] == 'M':
layers += [nn.MaxPool2d(kernel_size=3, stride=2,ceil_mode=True)]
else:
conv2d = nn.Conv2d(in_channels, cfg[i], kernel_size=3, padding=(1,1),stride =1)
if batch_norm:
layers += [conv2d, nn.BatchNorm2d(cfg[i]), nn.ReLU(inplace=True)]
else:
layers += [conv2d, nn.ReLU(inplace=True)]
in_channels = cfg[i]
return nn.Sequential(*layers)
def forward(self, x):
x = self.feature(x)
x=self.loc(x)
x=self.newCnn(x)
# x=self.newBn(x)
if self.export:
conv = x.squeeze(2) # b *512 * width
conv = conv.transpose(2,1) # [w, b, c]
# conv =conv.argmax(dim=2)
return conv
else:
b, c, h, w = x.size()
assert h == 1, "the height of conv must be 1"
conv = x.squeeze(2) # b *512 * width
conv = conv.permute(2, 0, 1) # [w, b, c]
# output = F.log_softmax(self.rnn(conv), dim=2)
output = torch.softmax(conv, dim=2)
return output
myCfg = [32,'M',64,'M',96,'M',128,'M',256]
class myNet(nn.Module):
def __init__(self,cfg=None,num_classes=3):
super(myNet, self).__init__()
if cfg is None:
cfg = myCfg
self.feature = self.make_layers(cfg, True)
self.classifier = nn.Linear(cfg[-1], num_classes)
def make_layers(self, cfg, batch_norm=False):
layers = []
in_channels = 3
for i in range(len(cfg)):
if i == 0:
conv2d =nn.Conv2d(in_channels, cfg[i], kernel_size=5,stride =1)
if batch_norm:
layers += [conv2d, nn.BatchNorm2d(cfg[i]), nn.ReLU(inplace=True)]
else:
layers += [conv2d, nn.ReLU(inplace=True)]
in_channels = cfg[i]
else :
if cfg[i] == 'M':
layers += [nn.MaxPool2d(kernel_size=3, stride=2,ceil_mode=True)]
else:
conv2d = nn.Conv2d(in_channels, cfg[i], kernel_size=3, padding=1,stride =1)
if batch_norm:
layers += [conv2d, nn.BatchNorm2d(cfg[i]), nn.ReLU(inplace=True)]
else:
layers += [conv2d, nn.ReLU(inplace=True)]
in_channels = cfg[i]
return nn.Sequential(*layers)
def forward(self, x):
x = self.feature(x)
x = nn.AvgPool2d(kernel_size=3, stride=1)(x)
x = x.view(x.size(0), -1)
y = self.classifier(x)
return y
class MyNet_color(nn.Module):
def __init__(self, class_num=6):
super(MyNet_color, self).__init__()
self.class_num = class_num
self.backbone = nn.Sequential(
nn.Conv2d(in_channels=3, out_channels=16, kernel_size=(5, 5), stride=(1, 1)), # 0
torch.nn.BatchNorm2d(16),
nn.ReLU(),
nn.MaxPool2d(kernel_size=(2, 2)),
nn.Dropout(0),
nn.Flatten(),
nn.Linear(480, 64),
nn.Dropout(0),
nn.ReLU(),
nn.Linear(64, class_num),
nn.Dropout(0),
nn.Softmax(1)
)
def forward(self, x):
logits = self.backbone(x)
return logits
class myNet_ocr_color(nn.Module):
def __init__(self,cfg=None,num_classes=78,export=False,color_num=None):
super(myNet_ocr_color, self).__init__()
if cfg is None:
cfg =[32,32,64,64,'M',128,128,'M',196,196,'M',256,256]
# cfg =[32,32,'M',64,64,'M',128,128,'M',256,256]
self.feature = self.make_layers(cfg, True)
self.export = export
self.color_num=color_num
self.conv_out_num=12 # output channels of the first conv in the color branch
if self.color_num:
self.conv1=nn.Conv2d(cfg[-1],self.conv_out_num,kernel_size=3,stride=2)
self.bn1=nn.BatchNorm2d(self.conv_out_num)
self.relu1=nn.ReLU(inplace=True)
self.gap =nn.AdaptiveAvgPool2d(output_size=1)
self.color_classifier=nn.Conv2d(self.conv_out_num,self.color_num,kernel_size=1,stride=1)
self.color_bn = nn.BatchNorm2d(self.color_num)
self.flatten = nn.Flatten()
self.loc = nn.MaxPool2d((5, 2), (1, 1),(0,1),ceil_mode=False)
self.newCnn=nn.Conv2d(cfg[-1],num_classes,1,1)
# self.newBn=nn.BatchNorm2d(num_classes)
def make_layers(self, cfg, batch_norm=False):
layers = []
in_channels = 3
for i in range(len(cfg)):
if i == 0:
conv2d =nn.Conv2d(in_channels, cfg[i], kernel_size=5,stride =1)
if batch_norm:
layers += [conv2d, nn.BatchNorm2d(cfg[i]), nn.ReLU(inplace=True)]
else:
layers += [conv2d, nn.ReLU(inplace=True)]
in_channels = cfg[i]
else :
if cfg[i] == 'M':
layers += [nn.MaxPool2d(kernel_size=3, stride=2,ceil_mode=True)]
else:
conv2d = nn.Conv2d(in_channels, cfg[i], kernel_size=3, padding=(1,1),stride =1)
if batch_norm:
layers += [conv2d, nn.BatchNorm2d(cfg[i]), nn.ReLU(inplace=True)]
else:
layers += [conv2d, nn.ReLU(inplace=True)]
in_channels = cfg[i]
return nn.Sequential(*layers)
def forward(self, x):
x = self.feature(x)
if self.color_num:
x_color=self.conv1(x)
x_color=self.bn1(x_color)
x_color =self.relu1(x_color)
x_color = self.color_classifier(x_color)
x_color = self.color_bn(x_color)
x_color =self.gap(x_color)
x_color = self.flatten(x_color)
x=self.loc(x)
x=self.newCnn(x)
if self.export:
conv = x.squeeze(2) # b *512 * width
conv = conv.transpose(2,1) # [w, b, c]
if self.color_num:
return conv,x_color
return conv
else:
b, c, h, w = x.size()
assert h == 1, "the height of conv must be 1"
conv = x.squeeze(2) # b *512 * width
conv = conv.permute(2, 0, 1) # [w, b, c]
output = F.log_softmax(conv, dim=2)
if self.color_num:
return output,x_color
return output
if __name__ == '__main__':
x = torch.randn(1,3,48,216)
model = myNet_ocr(num_classes=78,export=True)
out = model(x)
print(out.shape)
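# Illustrative sketch, not part of the original file: with color_num set and export=True,
# the color-aware variant returns both the character logits and a color vector.
model_color = myNet_ocr_color(num_classes=78, export=True, color_num=5)
seq_logits, color_logits = model_color(torch.randn(1, 3, 48, 168))  # 48x168 is the inference input size
print(seq_logits.shape, color_logits.shape)  # (batch, width, classes) and (batch, color_num)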

View File

@ -0,0 +1,119 @@
from plate_recognition.plateNet import myNet_ocr,myNet_ocr_color
import torch
import torch.nn as nn
import cv2
import numpy as np
import os
import time
import sys
def cv_imread(path): # imread that also handles non-ASCII (e.g. Chinese) file paths
img=cv2.imdecode(np.fromfile(path,dtype=np.uint8),-1)
return img
def allFilePath(rootPath,allFIleList):
fileList = os.listdir(rootPath)
for temp in fileList:
if os.path.isfile(os.path.join(rootPath,temp)):
if temp.endswith('.jpg') or temp.endswith('.png') or temp.endswith('.JPG'):
allFIleList.append(os.path.join(rootPath,temp))
else:
allFilePath(os.path.join(rootPath,temp),allFIleList)
device = torch.device('cuda') if torch.cuda.is_available() else torch.device("cpu")
color=['黑色','蓝色','绿色','白色','黄色']
plateName=r"#京沪津渝冀晋蒙辽吉黑苏浙皖闽赣鲁豫鄂湘粤桂琼川贵云藏陕甘青宁新学警港澳挂使领民航危0123456789ABCDEFGHJKLMNPQRSTUVWXYZ险品"
mean_value,std_value=(0.588,0.193)
def decodePlate(preds):
pre=0
newPreds=[]
index=[]
for i in range(len(preds)):
if preds[i]!=0 and preds[i]!=pre:
newPreds.append(preds[i])
index.append(i)
pre=preds[i]
return newPreds,index
def image_processing(img,device):
img = cv2.resize(img, (168,48))
img = np.reshape(img, (48, 168, 3))
# normalize
img = img.astype(np.float32)
img = (img / 255. - mean_value) / std_value
img = img.transpose([2, 0, 1])
img = torch.from_numpy(img)
img = img.to(device)
img = img.view(1, *img.size())
return img
def get_plate_result(img,device,model,is_color=False):
input = image_processing(img,device)
if is_color: # also predict the plate color
preds,color_preds = model(input)
color_preds = torch.softmax(color_preds,dim=-1)
color_conf,color_index = torch.max(color_preds,dim=-1)
color_conf=color_conf.item()
else:
preds = model(input)
preds=torch.softmax(preds,dim=-1)
prob,index=preds.max(dim=-1)
index = index.view(-1).detach().cpu().numpy()
prob=prob.view(-1).detach().cpu().numpy()
# preds=preds.view(-1).detach().cpu().numpy()
newPreds,new_index=decodePlate(index)
prob=prob[new_index]
plate=""
for i in newPreds:
plate+=plateName[i]
# if not (plate[0] in plateName[1:44] ):
# return ""
if is_color:
return plate,prob,color[color_index],color_conf # plate string, per-character probabilities, color, color confidence
else:
return plate,prob
def init_model(device,model_path,is_color = False):
# print( print(sys.path))
# model_path ="plate_recognition/model/checkpoint_61_acc_0.9715.pth"
check_point = torch.load(model_path,map_location=device)
model_state=check_point['state_dict']
cfg=check_point['cfg']
color_classes=0
if is_color:
color_classes=5 # number of color classes
model = myNet_ocr_color(num_classes=len(plateName),export=True,cfg=cfg,color_num=color_classes)
model.load_state_dict(model_state,strict=False)
model.to(device)
model.eval()
return model
# model = init_model(device)
if __name__ == '__main__':
model_path = r"weights/plate_rec_color.pth"
image_path ="images/tmp2424.png"
testPath = r"/mnt/Gpan/Mydata/pytorchPorject/CRNN/crnn_plate_recognition/images"
fileList=[]
allFilePath(testPath,fileList)
# result = get_plate_result(image_path,device)
# print(result)
is_color = False
model = init_model(device,model_path,is_color=is_color)
right=0
begin = time.time()
for imge_path in fileList:
img=cv2.imread(imge_path)
if is_color:
plate,_,plate_color,_=get_plate_result(img,device,model,is_color=is_color)
print(plate)
else:
plate,_=get_plate_result(img,device,model,is_color=is_color)
print(plate,imge_path)

42
readme/README.md Normal file
View File

@ -0,0 +1,42 @@
### **License plate detection training**
1. **Download the dataset:** the dataset can be obtained by contacting WeChat: we0091234
The data was selected and converted from the CCPD and CRPD datasets.
The dataset uses the YOLO label format:
```
label x y w h pt1x pt1y pt2x pt2y pt3x pt3y pt4x pt4y
```
The keypoints are ordered (top-left, top-right, bottom-right, bottom-left).
All coordinates are normalized: x,y are the box center divided by the image width/height, w,h are the box width/height divided by the image width/height, and ptx,pty are the keypoint coordinates divided by the image width/height.
**For your own dataset**: annotate the four plate corners with labelme ("create polygons"), then convert the annotations to the YOLO format with json2yolo.py and train; a minimal conversion sketch is shown below.
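A minimal sketch of such a conversion, assuming a labelme JSON with a single 4-point plate polygon, `imageWidth`/`imageHeight` fields, and a class id you choose yourself; the json2yolo.py shipped with this repo may differ in details:
```
import json
from pathlib import Path

def labelme_to_yolo(json_path, label=0):
    """Turn one labelme 4-point plate polygon into a YOLO label line."""
    data = json.loads(Path(json_path).read_text(encoding="utf-8"))
    w, h = data["imageWidth"], data["imageHeight"]
    pts = data["shapes"][0]["points"]            # [[x1,y1], ..., [x4,y4]] in TL, TR, BR, BL order
    xs, ys = [p[0] for p in pts], [p[1] for p in pts]
    x1, y1, x2, y2 = min(xs), min(ys), max(xs), max(ys)
    cx, cy = (x1 + x2) / 2 / w, (y1 + y2) / 2 / h          # normalized box centre
    bw, bh = (x2 - x1) / w, (y2 - y1) / h                  # normalized box size
    kps = " ".join(f"{x / w:.6f} {y / h:.6f}" for x, y in pts)  # normalized keypoints
    return f"{label} {cx:.6f} {cy:.6f} {bw:.6f} {bh:.6f} {kps}"
```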
2. **Edit the train and val paths in data/widerface.yaml to point to your data**
```
train: /your/train/path #change to your training-set path
val: /your/val/path #change to your validation-set path
# number of classes
nc: 2 #two classes are used here: 0 single-layer plate, 1 double-layer plate
# class names
names: [ 'single','double']
```
3. **Train**
```
python3 train.py --data data/widerface.yaml --cfg models/yolov5n-0.5.yaml --weights weights/plate_detect.pt --epoch 120
```
Results are saved in the run folder.
### onnx export
1. Export the detection model to onnx; onnx-sim **[onnx-simplifier](https://github.com/daquexian/onnx-simplifier)** must be installed
```
python export.py --weights ./weights/plate_detect.pt --img_size 640 --batch_size 1
onnxsim weights/plate_detect.onnx weights/plate_detect.onnx
```
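A quick sanity check of the exported and simplified model with onnxruntime (a sketch; the per-row layout of the output — box, objectness, 8 landmark coordinates, 2 class scores — is the same one onnx_infer.py consumes):
```
import numpy as np
import onnxruntime

sess = onnxruntime.InferenceSession("weights/plate_detect.onnx", providers=["CPUExecutionProvider"])
dummy = np.zeros((1, 3, 640, 640), dtype=np.float32)
out = sess.run(None, {sess.get_inputs()[0].name: dummy})[0]
print(out.shape)  # expect (1, N, 15): x y w h obj + 4 landmark x,y pairs + 2 class scores
```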

33 binary image files (roughly 405 KiB to 1.1 MiB each) were added in this commit but are not shown in the diff.
Some files were not shown because too many files have changed in this diff.