
Building a Slack-Off-at-Work Helper with PaddleDetection (Complete Tutorial)

Source: reposted. Author: livingbody
Editor: 学姐

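Remember sneaking in some TV as a kid while your parents were away?

Remember reading novels in high school while keeping an eye out for the teacher?

Remember playing games at work while watching out for the boss?

With this tool, you can slack off in peace.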

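Introduction

This article uses PaddleDetection with a self-made dataset to classify faces. It calls the local computer's front camera to recognize the face that appears in front of the screen, then switches windows on the local computer depending on whose face it is. The result is a handy tool for slacking off.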

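Results

Demo video of this project:

https://www.bilibili.com/video/BV1ni4y1V7TT

Baidu's promotional demo video:

https://www.bilibili.com/video/BV1kC4y1t7s1

Screenshot of this project's result:

Screenshot from Baidu's promotional video: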

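About PaddleDetection

Object detection is one of the core problems in machine vision. At the Baidu AI Developer Conference on July 3, the release of Paddle Fluid v1.5, the core framework of PaddlePaddle (飞桨), announced the open-sourcing of PaddleDetection, a unified object detection framework. With it, users can quickly and easily build a variety of detection models and powerful applications.

PaddleDetection covers the mainstream detection algorithms, offers both high-accuracy models and high-speed inference models, and provides a rich set of pretrained models. Its strengths are industrial readiness, modularity, and high performance.

  • Industrial readiness: combined with the high-speed inference engine of the PaddlePaddle core framework, training connects seamlessly to deployment.
  • Modularity: the design is modular, so both the network structure and the data processing can be customized.
  • High performance: built on an efficient core framework, it has some advantage in training speed and GPU memory usage. For example, YOLO v3 training is 1.6x faster than in comparable frameworks.

Available models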

Implementing this project with PaddleDetection

PaddleDetection can be downloaded from:

https://github.com/PaddlePaddle/PaddleDetection

or

https://gitee.com/paddlepaddle/PaddleDetection

    # Download the latest version with git
    !git clone https://gitee.com/paddlepaddle/PaddleDetection.git --depth=1

    Cloning into 'PaddleDetection'...
    remote: Enumerating objects: 1870, done.
    remote: Counting objects: 100% (1870/1870), done.
    remote: Compressing objects: 100% (1438/1438), done.
    remote: Total 1870 (delta 617), reused 1061 (delta 368), pack-reused 0
    Receiving objects: 100% (1870/1870), 175.31 MiB | 15.71 MiB/s, done.
    Resolving deltas: 100% (617/617), done.
    Checking connectivity... done.

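01 Building the dataset locally

For this project it is best to shoot the dataset with the camera you plan to run the experiment on. For example, I want to run it on my own laptop, so I call the laptop's front camera directly to collect face photos. The code below opens the camera automatically; press S to save the captured frame and ESC to quit.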

    # Run this locally!
    import os

    import cv2

    path = "./pictures/"  # directory where captured images are saved
    if not os.path.exists(path):
        os.makedirs(path)

    cap = cv2.VideoCapture(0)  # 0 selects the default camera
    i = 0  # index of the next image to save
    while True:
        ret, frame = cap.read()
        if not ret:  # camera unavailable or frame grab failed
            break
        cv2.imshow("capture", frame)
        k = cv2.waitKey(1)
        if k == 27:  # ESC quits
            break
        elif k == ord('s'):  # S saves the current frame
            cv2.imwrite(path + str(i) + '.jpg', frame)
            print("save" + str(i) + ".jpg")
            i += 1
    cap.release()
    cv2.destroyAllWindows()

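02 Data annotation

After the code above has run, a pictures folder holding the images is created in the directory where the script lives. Next, download the labelImg tool to annotate the data. Reference tutorial, "Installing labelImg on Linux with Python 3":

https://blog.csdn.net/weixin_40745291/article/details/85775708

After installation, run the following commands locally to open labelImg: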

    # Run locally; this cannot be run on AI Studio!
    cd your_path/labelImg
    python labelImg.py

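Open the image directory:

Select the box tool and label each face with a name (to protect privacy, I use a traffic-light image as the example here):

Annotation complete: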
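Before uploading, it can help to confirm what an annotation file actually contains. The sketch below assumes labelImg was left in its default Pascal VOC output mode, where each box is an `object` element with a `name` and a `bndbox`. The sample XML, the label `zhangsan`, and the helper name `read_voc_objects` are made up for illustration.

```python
import xml.etree.ElementTree as ET

# A hand-written sample in the Pascal VOC layout (assumed labelImg default).
# File name, label, and coordinates are invented for illustration.
SAMPLE_XML = """
<annotation>
  <filename>0.jpg</filename>
  <size><width>640</width><height>480</height><depth>3</depth></size>
  <object>
    <name>zhangsan</name>
    <bndbox><xmin>120</xmin><ymin>80</ymin><xmax>360</xmax><ymax>400</ymax></bndbox>
  </object>
</annotation>
"""


def read_voc_objects(xml_text):
    """Return a list of (label, xmin, ymin, xmax, ymax) tuples from VOC XML text."""
    root = ET.fromstring(xml_text.strip())
    objects = []
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        coords = tuple(
            int(float(box.find(tag).text))
            for tag in ("xmin", "ymin", "xmax", "ymax"))
        objects.append((obj.find("name").text,) + coords)
    return objects


print(read_voc_objects(SAMPLE_XML))
```

For a real file, read its text with `open(path, encoding="utf-8").read()` and pass it to the same function; every label it returns should also appear in your label list.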

03 Training the model

    # Unzip the photos (pictures.zip, uploaded from the local machine); run on AI Studio
    %mkdir -p PaddleDetection/dataset/faces
    %mv pictures.zip PaddleDetection/dataset/faces
    %cd PaddleDetection/dataset/faces
    !unzip pictures.zip

    /home/aistudio/PaddleDetection-release-0.3/dataset

The directory layout should now look like this:

The pictures folder holds our images and XML files.
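Since every image is expected to come with a same-named XML file, a quick check for images that were never annotated can save a failed training run. This is a minimal sketch; the helper name `find_unlabeled` and the list of image extensions are assumptions, not part of PaddleDetection.

```python
import os


def find_unlabeled(folder):
    """Return image file names in `folder` that have no same-named .xml file."""
    names = os.listdir(folder)
    xml_stems = {
        os.path.splitext(n)[0] for n in names if n.lower().endswith(".xml")
    }
    images = [n for n in names if n.lower().endswith((".jpg", ".jpeg", ".png"))]
    return sorted(n for n in images if os.path.splitext(n)[0] not in xml_stems)


if __name__ == "__main__":
    folder = "pictures"  # assumed location; adjust to where the data was unzipped
    if os.path.isdir(folder):
        print("images without annotation:", find_unlabeled(folder))
```

An empty list means every image has a matching annotation file.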

3.1 Generating the files that map images to XML annotations

    # Run on AI Studio
    # An absolute path is used because the previous cell already changed directory
    %cd ~/PaddleDetection/dataset/faces
    %mkdir xmls
    %cd ./pictures/
    %mv *.xml ../xmls
    # Record the train and val image lists
    import os

    # Assumes PaddleDetection sits in the home directory, as the later %cd commands do
    path = os.path.expanduser("~/PaddleDetection/dataset/faces/")
    total_red = sorted(os.listdir(path + "pictures/"))  # sorted for a stable order

    with open(path + "train.txt", "w") as ftrain, \
            open(path + "val.txt", "w") as fval:
        for i, file_name in enumerate(total_red):
            name = os.path.splitext(file_name)[0]
            line = "pictures/" + file_name + " xmls/" + name + ".xml\n"
            # Every 9th image goes to the validation list, the rest to training
            if i % 9 != 0:
                ftrain.write(line)
            else:
                fval.write(line)
    # Record the labels; remember to edit the label list!
    import os

    path = os.path.expanduser("~/PaddleDetection/dataset/faces/")
    label = ['张三', '李四']
    with open(path + "label_list.txt", "w", encoding="utf-8") as flabel:
        for _label in label:
            flabel.write(_label + "\n")

After the two code blocks above have run, three files are generated (train.txt, val.txt, and label_list.txt):
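The split above sends every 9th file to validation in whatever order the files are listed. If you would rather have a reproducible random split, something like the sketch below could be used. It is an alternative, not the author's method; the helper names `split_names` and `to_list_line` and the `seed` parameter are hypothetical.

```python
import random


def split_names(names, val_every=9, seed=0):
    """Shuffle names reproducibly, then send every `val_every`-th one to val."""
    shuffled = sorted(names)  # sort first so the result ignores listing order
    random.Random(seed).shuffle(shuffled)
    train, val = [], []
    for i, name in enumerate(shuffled):
        (val if i % val_every == 0 else train).append(name)
    return train, val


def to_list_line(image_name):
    """Format one entry the way train.txt and val.txt are written above."""
    stem = image_name.rsplit(".", 1)[0]
    return "pictures/{} xmls/{}.xml".format(image_name, stem)


train, val = split_names(["{}.jpg".format(i) for i in range(20)])
print(len(train), len(val))
print(to_list_line("3.jpg"))
```

With the same seed the split is identical on every run, which keeps validation scores comparable between training runs.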

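3.2 Modifying the configuration file

Modify this file: PaddleDetection/configs/ssd/ssd_mobilenet_v1_voc.yml

The main changes are to dataset_dir, anno_path, and use_default_label. The ssd_mobilenet_v1_voc.yml file in this project has already been modified.

Its contents are as follows: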

    # For reference only, do not run!
    architecture: SSD
    pretrain_weights: https://paddlemodels.bj.bcebos.com/object_detection/ssd_mobilenet_v1_coco_pretrained.tar
    use_gpu: true
    max_iters: 28000
    snapshot_iter: 2000
    log_smooth_window: 1
    metric: VOC
    map_type: 11point
    save_dir: output
    weights: output/ssd_mobilenet_v1_voc/model_final
    # 20(label_class) + 1(background)
    num_classes: 21

    SSD:
      backbone: MobileNet
      multi_box_head: MultiBoxHead
      output_decoder:
        background_label: 0
        keep_top_k: 200
        nms_eta: 1.0
        nms_threshold: 0.45
        nms_top_k: 400
        score_threshold: 0.01

    MobileNet:
      norm_decay: 0.
      conv_group_scale: 1
      conv_learning_rate: 0.1
      extra_block_filters: [[256, 512], [128, 256], [128, 256], [64, 128]]
      with_extra_blocks: true

    MultiBoxHead:
      aspect_ratios: [[2.], [2., 3.], [2., 3.], [2., 3.], [2., 3.], [2., 3.]]
      base_size: 300
      flip: true
      max_ratio: 90
      max_sizes: [[], 150.0, 195.0, 240.0, 285.0, 300.0]
      min_ratio: 20
      min_sizes: [60.0, 105.0, 150.0, 195.0, 240.0, 285.0]
      offset: 0.5

    LearningRate:
      schedulers:
      - !PiecewiseDecay
        milestones: [10000, 15000, 20000, 25000]
        values: [0.001, 0.0005, 0.00025, 0.0001, 0.00001]

    OptimizerBuilder:
      optimizer:
        momentum: 0.0
        type: RMSPropOptimizer
      regularizer:
        factor: 0.00005
        type: L2

    TrainReader:
      inputs_def:
        image_shape: [3, 300, 300]
        fields: ['image', 'gt_bbox', 'gt_class']
      dataset:
        !VOCDataSet
        anno_path: train.txt
        dataset_dir: dataset/faces
        use_default_label: false
      sample_transforms:
      - !DecodeImage
        to_rgb: true
      - !RandomDistort
        brightness_lower: 0.875
        brightness_upper: 1.125
        is_order: true
      - !RandomExpand
        fill_value: [127.5, 127.5, 127.5]
      - !RandomCrop
        allow_no_crop: false
      - !NormalizeBox {}
      - !ResizeImage
        interp: 1
        target_size: 300
        use_cv2: false
      - !RandomFlipImage
        is_normalized: true
      - !Permute {}
      - !NormalizeImage
        is_scale: false
        mean: [127.5, 127.5, 127.5]
        std: [127.502231, 127.502231, 127.502231]
      batch_size: 32
      shuffle: true
      drop_last: true
      worker_num: 8
      bufsize: 16
      use_process: true

    EvalReader:
      inputs_def:
        image_shape: [3, 300, 300]
        fields: ['image', 'gt_bbox', 'gt_class', 'im_shape', 'im_id', 'is_difficult']
      dataset:
        !VOCDataSet
        anno_path: val.txt
        dataset_dir: dataset/faces
        use_default_label: false
      sample_transforms:
      - !DecodeImage
        to_rgb: true
      - !NormalizeBox {}
      - !ResizeImage
        interp: 1
        target_size: 300
        use_cv2: false
      - !Permute {}
      - !NormalizeImage
        is_scale: false
        mean: [127.5, 127.5, 127.5]
        std: [127.502231, 127.502231, 127.502231]
      batch_size: 32
      worker_num: 8
      bufsize: 16
      use_process: false

    TestReader:
      inputs_def:
        image_shape: [3,300,300]
        fields: ['image', 'im_id', 'im_shape']
      dataset:
        !ImageFolder
        dataset_dir: dataset/faces
        anno_path: label_list.txt
        use_default_label: false
      sample_transforms:
      - !DecodeImage
        to_rgb: true
      - !ResizeImage
        interp: 1
        max_size: 0
        target_size: 300
        use_cv2: false
      - !Permute {}
      - !NormalizeImage
        is_scale: false
        mean: [127.5, 127.5, 127.5]
        std: [127.502231, 127.502231, 127.502231]
      batch_size: 1
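To get a feel for the LearningRate section of this config, the piecewise schedule can be sketched in a few lines. The sketch assumes the usual piecewise-constant reading, where the rate switches to the next value once the iteration count reaches a milestone. The function name `piecewise_lr` is hypothetical and this is not PaddleDetection's own implementation.

```python
import bisect

# Values copied from the LearningRate section of the config
MILESTONES = [10000, 15000, 20000, 25000]
VALUES = [0.001, 0.0005, 0.00025, 0.0001, 0.00001]


def piecewise_lr(step, milestones=MILESTONES, values=VALUES):
    """Return values[k], where k is how many milestones `step` has reached."""
    if len(values) != len(milestones) + 1:
        raise ValueError("values needs exactly one more entry than milestones")
    return values[bisect.bisect_right(milestones, step)]


for step in (0, 9999, 10000, 20000, 27999):
    print(step, piecewise_lr(step))
```

With max_iters at 28000, the last 3000 iterations would run at the smallest rate under this reading. If you shorten max_iters for a small face dataset, the milestones probably need to shrink along with it.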

Start training

    %cd ~/PaddleDetection/
    !pip install -r requirements.txt

    # Test the project environment
    %cd ~/PaddleDetection/
    # PYTHONPATH is set on the same line as the command, because a separate
    # "!export" only affects its own subshell
    !PYTHONPATH=`pwd`:$PYTHONPATH python ppdet/modeling/tests/test_architectures.py

    # Train the model (SSD)
    %cd ~/PaddleDetection/
    !python -u tools/train.py -c configs/ssd/ssd_mobilenet_v1_voc.yml --eval

    # Export the model
    %cd ~/PaddleDetection/
    !python -u tools/export_model.py -c configs/ssd/ssd_mobilenet_v1_voc.yml --output_dir=./inference_model

Local deployment

After the model is exported, the directory ~/PaddleDetection-release/inference_model/ssd_mobilenet_v1_voc contains three files: __params__, __model__, and infer_cfg.yml. Download them to your local machine.
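The inference script shown later in this article loads `__model__`, `__params__`, and `infer_cfg.yml` from the model directory, so it is worth checking that all three arrived after the download. The helper below is a minimal sketch; the name `missing_model_files` and the example path are assumptions.

```python
import os

# File names that the inference script in this article reads from model_dir
EXPECTED_FILES = ("__model__", "__params__", "infer_cfg.yml")


def missing_model_files(model_dir):
    """Return the expected file names that are absent from model_dir."""
    return [
        name for name in EXPECTED_FILES
        if not os.path.isfile(os.path.join(model_dir, name))
    ]


if __name__ == "__main__":
    model_dir = "output/ssd_mobilenet_v1_voc"  # assumed local path; adjust as needed
    print("missing files:", missing_model_files(model_dir))
```

An empty list means the directory is ready to be passed to the script as --model_dir.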

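On your local machine, download the PaddleDetection package into your Python environment and place the three model files in the output directory under the PaddleDetection directory:

Install the wmctrl tool by running the following in a Linux terminal: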

sudo apt-get install wmctrl
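The script later in this article calls `wmctrl -a "pycharm"` through os.system to bring a window to the front. As a rough sketch, the same call can be wrapped with subprocess so the window title is passed as a plain argument and a missing wmctrl install is detected first. The function names `wmctrl_command` and `activate_window` are hypothetical.

```python
import shutil
import subprocess


def wmctrl_command(title):
    """Build the argument list for `wmctrl -a <title>`."""
    return ["wmctrl", "-a", title]


def activate_window(title):
    """Try to raise the window whose title matches `title`.

    Returns True only if wmctrl is installed and exited with status 0.
    """
    if shutil.which("wmctrl") is None:
        return False  # wmctrl is not installed or not on PATH
    return subprocess.run(wmctrl_command(title)).returncode == 0


print(wmctrl_command("pycharm"))
```

Calling `activate_window("pycharm")` from a terminal while PyCharm is open is a quick way to confirm that the title match works on your desktop before wiring it into the detector.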

Modify this file in the local PaddleDetection package:

/PaddleDetection-release-0.3/deploy/python/infer.py

Its contents are as follows:

    # Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
    #
    # Licensed under the Apache License, Version 2.0 (the "License");
    # you may not use this file except in compliance with the License.
    # You may obtain a copy of the License at
    #
    #     http://www.apache.org/licenses/LICENSE-2.0
    #
    # Unless required by applicable law or agreed to in writing, software
    # distributed under the License is distributed on an "AS IS" BASIS,
    # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    # See the License for the specific language governing permissions and
    # limitations under the License.

    # python -u deploy/python/infer.py --model_dir output/light/mobilnet_best_model
    import os
    import argparse
    import threading
    import time
    import yaml
    import ast
    from functools import reduce

    from PIL import Image
    import cv2
    import numpy as np
    import paddle.fluid as fluid
    from visualize import visualize_box_mask

    def decode_image(im_file, im_info):
        """read rgb image
        Args:
            im_file (str/np.ndarray): path of image/ np.ndarray read by cv2
            im_info (dict): info of image
        Returns:
            im (np.ndarray):  processed image (np.ndarray)
            im_info (dict): info of processed image
        """
        if isinstance(im_file, str):
            with open(im_file, 'rb') as f:
                im_read = f.read()
            data = np.frombuffer(im_read, dtype='uint8')
            im = cv2.imdecode(data, 1)  # BGR mode, but need RGB mode
            im = cv2.cvtColor(im, cv2.COLOR_BGR2RGB)
            im_info['origin_shape'] = im.shape[:2]
            im_info['resize_shape'] = im.shape[:2]
        else:
            im = im_file
            im_info['origin_shape'] = im.shape[:2]
            im_info['resize_shape'] = im.shape[:2]
        return im, im_info

    class Resize(object):
        """resize image by target_size and max_size
        Args:
            arch (str): model type
            target_size (int): the target size of image
            max_size (int): the max size of image
            use_cv2 (bool): whether to use cv2
            image_shape (list): input shape of model
            interp (int): method of resize
        """

        def __init__(self,
                     arch,
                     target_size,
                     max_size,
                     use_cv2=True,
                     image_shape=None,
                     interp=cv2.INTER_LINEAR):
            self.target_size = target_size
            self.max_size = max_size
            # trailing comma kept from the listing: image_shape is stored as a 1-tuple
            self.image_shape = image_shape,
            self.arch = arch
            self.use_cv2 = use_cv2
            self.interp = interp
            self.scale_set = {'RCNN', 'RetinaNet'}

        def __call__(self, im, im_info):
            """
            Args:
                im (np.ndarray): image (np.ndarray)
                im_info (dict): info of image
            Returns:
                im (np.ndarray):  processed image (np.ndarray)
                im_info (dict): info of processed image
            """
            im_channel = im.shape[2]
            im_scale_x, im_scale_y = self.generate_scale(im)
            if self.use_cv2:
                im = cv2.resize(
                    im,
                    None,
                    None,
                    fx=im_scale_x,
                    fy=im_scale_y,
                    interpolation=self.interp)
            else:
                resize_w = int(im_scale_x * float(im.shape[1]))
                resize_h = int(im_scale_y * float(im.shape[0]))
                if self.max_size != 0:
                    raise TypeError(
                        'If you set max_size to cap the maximum size of image,'
                        'please set use_cv2 to True to resize the image.')
                im = im.astype('uint8')
                im = Image.fromarray(im)
                im = im.resize((int(resize_w), int(resize_h)), self.interp)
                im = np.array(im)
            # padding im when image_shape fixed by infer_cfg.yml
            if self.max_size != 0 and self.image_shape is not None:
                padding_im = np.zeros(
                    (self.max_size, self.max_size, im_channel), dtype=np.float32)
                im_h, im_w = im.shape[:2]
                padding_im[:im_h, :im_w, :] = im
                im = padding_im
            if self.arch in self.scale_set:
                im_info['scale'] = im_scale_x
            im_info['resize_shape'] = im.shape[:2]
            return im, im_info

        def generate_scale(self, im):
            """
            Args:
                im (np.ndarray): image (np.ndarray)
            Returns:
                im_scale_x: the resize ratio of X
                im_scale_y: the resize ratio of Y
            """
            origin_shape = im.shape[:2]
            im_c = im.shape[2]
            if self.max_size != 0 and self.arch in self.scale_set:
                im_size_min = np.min(origin_shape[0:2])
                im_size_max = np.max(origin_shape[0:2])
                im_scale = float(self.target_size) / float(im_size_min)
                if np.round(im_scale * im_size_max) > self.max_size:
                    im_scale = float(self.max_size) / float(im_size_max)
                im_scale_x = im_scale
                im_scale_y = im_scale
            else:
                im_scale_x = float(self.target_size) / float(origin_shape[1])
                im_scale_y = float(self.target_size) / float(origin_shape[0])
            return im_scale_x, im_scale_y

    class Normalize(object):
        """normalize image
        Args:
            mean (list): im - mean
            std (list): im / std
            is_scale (bool): whether need im / 255
            is_channel_first (bool): if True: image shape is CHW, else: HWC
        """

        def __init__(self, mean, std, is_scale=True, is_channel_first=False):
            self.mean = mean
            self.std = std
            self.is_scale = is_scale
            self.is_channel_first = is_channel_first

        def __call__(self, im, im_info):
            """
            Args:
                im (np.ndarray): image (np.ndarray)
                im_info (dict): info of image
            Returns:
                im (np.ndarray):  processed image (np.ndarray)
                im_info (dict): info of processed image
            """
            im = im.astype(np.float32, copy=False)
            if self.is_channel_first:
                mean = np.array(self.mean)[:, np.newaxis, np.newaxis]
                std = np.array(self.std)[:, np.newaxis, np.newaxis]
            else:
                mean = np.array(self.mean)[np.newaxis, np.newaxis, :]
                std = np.array(self.std)[np.newaxis, np.newaxis, :]
            if self.is_scale:
                im = im / 255.0
            im -= mean
            im /= std
            return im, im_info

    class Permute(object):
        """permute image
        Args:
            to_bgr (bool): whether convert RGB to BGR
            channel_first (bool): whether convert HWC to CHW
        """

        def __init__(self, to_bgr=False, channel_first=True):
            self.to_bgr = to_bgr
            self.channel_first = channel_first

        def __call__(self, im, im_info):
            """
            Args:
                im (np.ndarray): image (np.ndarray)
                im_info (dict): info of image
            Returns:
                im (np.ndarray):  processed image (np.ndarray)
                im_info (dict): info of processed image
            """
            if self.channel_first:
                im = im.transpose((2, 0, 1)).copy()
            if self.to_bgr:
                im = im[[2, 1, 0], :, :]
            return im, im_info

    class PadStride(object):
        """ padding image for model with FPN
        Args:
            stride (bool): model with FPN need image shape % stride == 0
        """

        def __init__(self, stride=0):
            self.coarsest_stride = stride

        def __call__(self, im, im_info):
            """
            Args:
                im (np.ndarray): image (np.ndarray)
                im_info (dict): info of image
            Returns:
                im (np.ndarray):  processed image (np.ndarray)
                im_info (dict): info of processed image
            """
            coarsest_stride = self.coarsest_stride
            if coarsest_stride == 0:
                # return both values: the caller unpacks (im, im_info)
                return im, im_info
            im_c, im_h, im_w = im.shape
            pad_h = int(np.ceil(float(im_h) / coarsest_stride) * coarsest_stride)
            pad_w = int(np.ceil(float(im_w) / coarsest_stride) * coarsest_stride)
            padding_im = np.zeros((im_c, pad_h, pad_w), dtype=np.float32)
            padding_im[:, :im_h, :im_w] = im
            im_info['resize_shape'] = padding_im.shape[1:]
            return padding_im, im_info

    def create_inputs(im, im_info, model_arch='YOLO'):
        """generate input for different model type
        Args:
            im (np.ndarray): image (np.ndarray)
            im_info (dict): info of image
            model_arch (str): model type
        Returns:
            inputs (dict): input of model
        """
        inputs = {}
        inputs['image'] = im
        origin_shape = list(im_info['origin_shape'])
        resize_shape = list(im_info['resize_shape'])
        scale = im_info['scale']
        if 'YOLO' in model_arch:
            im_size = np.array([origin_shape]).astype('int32')
            inputs['im_size'] = im_size
        elif 'RetinaNet' in model_arch:
            im_info = np.array([resize_shape + [scale]]).astype('float32')
            inputs['im_info'] = im_info
        elif 'RCNN' in model_arch:
            im_info = np.array([resize_shape + [scale]]).astype('float32')
            im_shape = np.array([origin_shape + [1.]]).astype('float32')
            inputs['im_info'] = im_info
            inputs['im_shape'] = im_shape
        return inputs

    class Config():
        """set config of preprocess, postprocess and visualize
        Args:
            model_dir (str): root path of model.yml
        """
        support_models = ['YOLO', 'SSD', 'RetinaNet', 'RCNN', 'Face']

        def __init__(self, model_dir):
            # parsing Yaml config for Preprocess
            deploy_file = os.path.join(model_dir, 'infer_cfg.yml')
            with open(deploy_file) as f:
                yml_conf = yaml.safe_load(f)
            self.check_model(yml_conf)
            self.arch = yml_conf['arch']
            self.preprocess_infos = yml_conf['Preprocess']
            self.use_python_inference = yml_conf['use_python_inference']
            self.min_subgraph_size = yml_conf['min_subgraph_size']
            self.labels = yml_conf['label_list']
            self.mask_resolution = None
            if 'mask_resolution' in yml_conf:
                self.mask_resolution = yml_conf['mask_resolution']
            self.print_config()

        def check_model(self, yml_conf):
            """
            Raises:
                ValueError: loaded model not in supported model type
            """
            for support_model in self.support_models:
                if support_model in yml_conf['arch']:
                    return True
            raise ValueError(
                "Unsupported arch: {}, expect SSD, YOLO, RetinaNet, RCNN and Face".
                format(yml_conf['arch']))

        def print_config(self):
            print('-----------  Model Configuration -----------')
            print('%s: %s' % ('Model Arch', self.arch))
            print('%s: %s' % ('Use Paddle Executor', self.use_python_inference))
            print('%s: ' % ('Transform Order'))
            for op_info in self.preprocess_infos:
                print('--%s: %s' % ('transform op', op_info['type']))
            print('--------------------------------------------')

    def load_predictor(model_dir,
                       run_mode='fluid',
                       batch_size=1,
                       use_gpu=False,
                       min_subgraph_size=3):
        """set AnalysisConfig, generate AnalysisPredictor
        Args:
            model_dir (str): root path of __model__ and __params__
            use_gpu (bool): whether use gpu
        Returns:
            predictor (PaddlePredictor): AnalysisPredictor
        Raises:
            ValueError: predict by TensorRT need use_gpu == True.
        """
        if not use_gpu and not run_mode == 'fluid':
            raise ValueError(
                "Predict by TensorRT mode: {}, expect use_gpu==True, but use_gpu == {}"
                .format(run_mode, use_gpu))
        if run_mode == 'trt_int8':
            raise ValueError("TensorRT int8 mode is not supported now, "
                             "please use trt_fp32 or trt_fp16 instead.")
        precision_map = {
            'trt_int8': fluid.core.AnalysisConfig.Precision.Int8,
            'trt_fp32': fluid.core.AnalysisConfig.Precision.Float32,
            'trt_fp16': fluid.core.AnalysisConfig.Precision.Half
        }
        config = fluid.core.AnalysisConfig(
            os.path.join(model_dir, '__model__'),
            os.path.join(model_dir, '__params__'))
        if use_gpu:
            # initial GPU memory(M), device ID
            config.enable_use_gpu(100, 0)
            # optimize graph and fuse op
            config.switch_ir_optim(True)
        else:
            config.disable_gpu()
        if run_mode in precision_map.keys():
            config.enable_tensorrt_engine(
                workspace_size=1 << 10,
                max_batch_size=batch_size,
                min_subgraph_size=min_subgraph_size,
                precision_mode=precision_map[run_mode],
                use_static=False,
                use_calib_mode=False)
        # disable print log when predict
        config.disable_glog_info()
        # enable shared memory
        config.enable_memory_optim()
        # disable feed, fetch OP, needed by zero_copy_run
        config.switch_use_feed_fetch_ops(False)
        predictor = fluid.core.create_paddle_predictor(config)
        return predictor

    def load_executor(model_dir, use_gpu=False):
        if use_gpu:
            place = fluid.CUDAPlace(0)
        else:
            place = fluid.CPUPlace()
        exe = fluid.Executor(place)
        program, feed_names, fetch_targets = fluid.io.load_inference_model(
            dirname=model_dir,
            executor=exe,
            model_filename='__model__',
            params_filename='__params__')
        return exe, program, fetch_targets

    def visualize(image_file,
                  results,
                  labels,
                  mask_resolution=14,
                  output_dir='output/'):
        # visualize the predict result
        im = visualize_box_mask(
            image_file, results, labels, mask_resolution=mask_resolution)
        img_name = os.path.split(image_file)[-1]
        if not os.path.exists(output_dir):
            os.makedirs(output_dir)
        out_path = os.path.join(output_dir, img_name)
        im.save(out_path, quality=95)
        print("save result to: " + out_path)

    class Detector():
        """
        Args:
            model_dir (str): root path of __model__, __params__ and infer_cfg.yml
            use_gpu (bool): whether use gpu
        """

        def __init__(self,
                     model_dir,
                     use_gpu=False,
                     run_mode='fluid',
                     threshold=0.5):
            self.config = Config(model_dir)
            if self.config.use_python_inference:
                self.executor, self.program, self.fecth_targets = load_executor(
                    model_dir, use_gpu=use_gpu)
            else:
                self.predictor = load_predictor(
                    model_dir,
                    run_mode=run_mode,
                    min_subgraph_size=self.config.min_subgraph_size,
                    use_gpu=use_gpu)
            self.preprocess_ops = []
            for op_info in self.config.preprocess_infos:
                op_type = op_info.pop('type')
                if op_type == 'Resize':
                    op_info['arch'] = self.config.arch
                self.preprocess_ops.append(eval(op_type)(**op_info))

        def preprocess(self, im):
            # process image by preprocess_ops
            im_info = {
                'scale': 1.,
                'origin_shape': None,
                'resize_shape': None,
            }
            im, im_info = decode_image(im, im_info)
            for operator in self.preprocess_ops:
                im, im_info = operator(im, im_info)
            im = np.array((im, )).astype('float32')
            inputs = create_inputs(im, im_info, self.config.arch)
            return inputs, im_info

        def postprocess(self, np_boxes, np_masks, im_info, threshold=0.5):
            # postprocess output of predictor
            results = {}
            if self.config.arch in ['SSD', 'Face']:
                w, h = im_info['origin_shape']
                np_boxes[:, 2] *= h
                np_boxes[:, 3] *= w
                np_boxes[:, 4] *= h
                np_boxes[:, 5] *= w
            expect_boxes = np_boxes[:, 1] > threshold
            np_boxes = np_boxes[expect_boxes, :]
            for box in np_boxes:
                print('class_id:{:d}, confidence:{:.2f},'
                      'left_top:[{:.2f},{:.2f}],'
                      ' right_bottom:[{:.2f},{:.2f}]'.format(
                          int(box[0]), box[1], box[2], box[3], box[4], box[5]))
            results['boxes'] = np_boxes
            if np_masks is not None:
                np_masks = np_masks[expect_boxes, :, :, :]
                results['masks'] = np_masks
            return results

        def predict(self, image, threshold=0.5, warmup=0, repeats=1):
            '''
            Args:
                image (str/np.ndarray): path of image/ np.ndarray read by cv2
                threshold (float): threshold of predicted box' score
            Returns:
                results (dict): include 'boxes': np.ndarray: shape:[N,6], N: number of box,
                                matrix element:[class, score, x_min, y_min, x_max, y_max]
                                MaskRCNN's results include 'masks': np.ndarray:
                                shape:[N, class_num, mask_resolution, mask_resolution]
            '''
            inputs, im_info = self.preprocess(image)
            np_boxes, np_masks = None, None
            if self.config.use_python_inference:
                for i in range(warmup):
                    outs = self.executor.run(self.program,
                                             feed=inputs,
                                             fetch_list=self.fecth_targets,
                                             return_numpy=False)
                t1 = time.time()
                for i in range(repeats):
                    outs = self.executor.run(self.program,
                                             feed=inputs,
                                             fetch_list=self.fecth_targets,
                                             return_numpy=False)
                t2 = time.time()
                ms = (t2 - t1) * 1000.0 / repeats
                print("Inference: {} ms per batch image".format(ms))
                np_boxes = np.array(outs[0])
                if self.config.mask_resolution is not None:
                    np_masks = np.array(outs[1])
            else:
                input_names = self.predictor.get_input_names()
                for i in range(len(inputs)):
                    input_tensor = self.predictor.get_input_tensor(input_names[i])
                    input_tensor.copy_from_cpu(inputs[input_names[i]])
                for i in range(warmup):
                    self.predictor.zero_copy_run()
                    output_names = self.predictor.get_output_names()
                    boxes_tensor = self.predictor.get_output_tensor(output_names[0])
                    np_boxes = boxes_tensor.copy_to_cpu()
                    if self.config.mask_resolution is not None:
                        masks_tensor = self.predictor.get_output_tensor(
                            output_names[1])
                        np_masks = masks_tensor.copy_to_cpu()
                t1 = time.time()
                for i in range(repeats):
                    self.predictor.zero_copy_run()
                    output_names = self.predictor.get_output_names()
                    boxes_tensor = self.predictor.get_output_tensor(output_names[0])
                    np_boxes = boxes_tensor.copy_to_cpu()
                    if self.config.mask_resolution is not None:
                        masks_tensor = self.predictor.get_output_tensor(
                            output_names[1])
                        np_masks = masks_tensor.copy_to_cpu()
                t2 = time.time()
                ms = (t2 - t1) * 1000.0 / repeats
                print("Inference: {} ms per batch image".format(ms))
            if reduce(lambda x, y: x * y, np_boxes.shape) < 6:
                print('[WARNING] No object detected.')
                results = {'boxes': np.array([])}
            else:
                results = self.postprocess(
                    np_boxes, np_masks, im_info, threshold=threshold)
            return results

    def predict_image():
        detector = Detector(
            FLAGS.model_dir, use_gpu=FLAGS.use_gpu, run_mode=FLAGS.run_mode)
        if FLAGS.run_benchmark:
            detector.predict(
                FLAGS.image_file, FLAGS.threshold, warmup=100, repeats=100)
        else:
            results = detector.predict(FLAGS.image_file, FLAGS.threshold)
            visualize(
                FLAGS.image_file,
                results,
                detector.config.labels,
                mask_resolution=detector.config.mask_resolution,
                output_dir=FLAGS.output_dir)

def predict_video():
    detector = Detector(
        FLAGS.model_dir, use_gpu=FLAGS.use_gpu, run_mode=FLAGS.run_mode)
    capture = cv2.VideoCapture(0)  # 0 selects the default (front) camera
    fps = 30
    width = int(capture.get(cv2.CAP_PROP_FRAME_WIDTH))
    height = int(capture.get(cv2.CAP_PROP_FRAME_HEIGHT))
    fourcc = cv2.VideoWriter_fourcc(*'mp4v')
    # Note: FLAGS.video_file is empty in camera mode, and no frames are
    # written to `writer` in the loop below.
    video_name = os.path.split(FLAGS.video_file)[-1]
    if not os.path.exists(FLAGS.output_dir):
        os.makedirs(FLAGS.output_dir)
    out_path = os.path.join(FLAGS.output_dir, video_name)
    writer = cv2.VideoWriter(out_path, fourcc, fps, (width, height))
    index = 1
    while True:
        ret, frame = capture.read()
        if not ret:
            break
        print('detect frame:%d' % (index))
        index += 1
        results = detector.predict(frame, FLAGS.threshold)
        if len(results['boxes']) != 0:
            for box in results['boxes']:
                # OpenCV drawing functions need integer pixel coordinates.
                x1, y1, x2, y2 = [int(v) for v in box[2:]]
                font = cv2.FONT_HERSHEY_SIMPLEX
                flag = 0
                # With two faces, box[0] is 1.0 or 2.0 (with three faces: 1.0,
                # 2.0 or 3.0), so for two faces "< 1.5" is enough to tell who it is.
                if box[0] < 1.5:
                    flag = 1
                    cv2.putText(frame, 'wyj', (x1 - 5, y1 - 5), font, 1.2, (255, 255, 0), 2)
                else:
                    cv2.putText(frame, 'xmy', (x1 - 5, y1 - 5), font, 1.2, (255, 255, 0), 2)
                cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 5)
                if flag == 1:
                    # Sister's face detected: bring the PyCharm window to the front.
                    os.system("wmctrl -a \"pycharm\"")
        cv2.imshow("1", frame)
        cv2.waitKey(3)
    capture.release()
    writer.release()
    cv2.destroyAllWindows()
def print_arguments(args):
    print('-----------  Running Arguments -----------')
    for arg, value in sorted(vars(args).items()):
        print('%s: %s' % (arg, value))
    print('------------------------------------------')
flag = 0
# Helper that saves one camera frame to temp.jpg; the main flow below does not call it.
def showImg():
    global flag
    cap = cv2.VideoCapture(0)
    while True:
        ret, frame = cap.read()
        # cv2.imshow("cap", frame)
        if flag == 0:
            cv2.imwrite("temp.jpg", frame)
            flag = 1
        if cv2.waitKey(100) & 0xff == ord('q'):
            break
    cap.release()
    cv2.destroyAllWindows()
if __name__ == '__main__':
    parser = argparse.ArgumentParser(description=__doc__)
    parser.add_argument(
        "--model_dir",
        type=str,
        default="",
        help=("Directory include:'__model__', '__params__', "
              "'infer_cfg.yml', created by tools/export_model.py."),
        required=True)
    parser.add_argument(
        "--image_file", type=str, default='', help="Path of image file.")
    parser.add_argument(
        "--video_file", type=str, default='', help="Path of video file.")
    parser.add_argument(
        "--run_mode",
        type=str,
        default='fluid',
        help="mode of running(fluid/trt_fp32/trt_fp16)")
    parser.add_argument(
        "--use_gpu",
        type=ast.literal_eval,
        default=False,
        help="Whether to predict with GPU.")
    parser.add_argument(
        "--run_benchmark",
        type=ast.literal_eval,
        default=False,
        help="Whether to predict a image_file repeatedly for benchmark")
    parser.add_argument(
        "--threshold", type=float, default=0.5, help="Threshold of score.")
    parser.add_argument(
        "--output_dir",
        type=str,
        default="output",
        help="Directory of output visualization files.")
    FLAGS = parser.parse_args()
    print_arguments(FLAGS)
    if FLAGS.image_file != '' and FLAGS.video_file != '':
        # A bare `assert "message"` always passes, so raise explicitly instead.
        raise ValueError("Cannot predict image and video at the same time")
    if FLAGS.image_file != '':
        predict_image()
    # The camera loop always runs, whether or not --image_file was given.
    predict_video()

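The script above is the core of the project. Adapt it slightly to your own setup: the part to change is the key logic the author cites at line 585 of their file, which appears to be the face-label branch in `predict_video`. Nothing else needs to be modified.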
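
The face-label branch in `predict_video` can be pulled out into a small pure function, which makes it easier to adapt when you train more than two faces. The following is a minimal sketch, not the author's code: it assumes each detection row is laid out as `[class_id, score, x1, y1, x2, y2]` (consistent with how the script reads `box[0]` and `box[2:]`), and the names `LABELS`, `TRIGGER_LABEL`, `label_for`, and `should_switch` are hypothetical.

```python
# Hypothetical label table; class ids 1 and 2 follow the script's comment
# that box[0] is 1.0 or 2.0 when two faces were trained.
LABELS = {1: "wyj", 2: "xmy"}
TRIGGER_LABEL = "wyj"  # assumed: the face that should trigger the window switch


def label_for(class_id):
    """Map a float class id such as 1.0 to a name, or None if unknown."""
    return LABELS.get(int(round(float(class_id))))


def should_switch(boxes):
    """Return True if any detection row belongs to TRIGGER_LABEL.

    Each row is assumed to be [class_id, score, x1, y1, x2, y2].
    """
    return any(label_for(row[0]) == TRIGGER_LABEL for row in boxes)


detections = [
    [2.0, 0.91, 40.0, 60.0, 200.0, 240.0],
    [1.0, 0.87, 300.0, 80.0, 420.0, 230.0],
]
print([label_for(row[0]) for row in detections])  # ['xmy', 'wyj']
print(should_switch(detections))                  # True
```

Rounding the class id instead of testing `< 1.5` keeps the lookup valid for three or more faces: adding a person only means adding an entry to the table.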

# Start recognition. Run this locally; it cannot run in AI Studio!
# In PyCharm, open the terminal and enter:
python -u deploy/python/infer.py --model_dir output/face
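
Once the script is running, it calls `wmctrl` on every frame in which the trigger face is visible. One way to soften that is to throttle the call. This is a minimal sketch under assumptions, not part of the original project: the `Throttle` class, the `raise_window` helper, and the 2-second interval are all hypothetical, and `wmctrl` must be installed (Linux with X11).

```python
import subprocess
import time


class Throttle:
    """Allow an action at most once every `min_interval` seconds."""

    def __init__(self, min_interval, clock=time.monotonic):
        self.min_interval = min_interval
        self.clock = clock          # injectable so the logic can be tested
        self.last_fired = None

    def ready(self):
        """Return True (and record the time) if enough time has passed."""
        now = self.clock()
        if self.last_fired is None or now - self.last_fired >= self.min_interval:
            self.last_fired = now
            return True
        return False


def raise_window(title):
    """Ask wmctrl to activate a window whose title contains `title`."""
    # Passing a list avoids shell quoting problems with the window title.
    return subprocess.run(["wmctrl", "-a", title], check=False).returncode


switch_throttle = Throttle(min_interval=2.0)  # 2 seconds is an arbitrary choice

# Inside the frame loop, instead of calling os.system on every frame:
# if flag == 1 and switch_throttle.ready():
#     raise_window("pycharm")
```

The clock is passed in as a parameter so the throttling logic can be checked without a camera or a window manager.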

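Summary

PaddleDetection handles a wide range of object detection tasks well: instead of writing large amounts of code, you mostly just edit a simple configuration.

It also makes local deployment straightforward.

If you find any problems with the project, feel free to point them out in the comments.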

