
YOLOv5 with TensorRT: model conversion, quantized acceleration, and video inference


yolov5s: .pt -> .wts -> .engine, then RTSP video stream inference in C++

Environment:

jetson AGX xavier
cmake 3.13
python 3.6.9
NVIDIA Jetson AGX Xavier [16GB] - Jetpack 4.6
version 3.1.1
Cuda ARCH:7.2
CUDA:10.2.300
OpenCV:4.1.1
TensorRT:8.0.1.6
cuDNN:8.2.1.32
pycuda 2020.1
tensorrt 8.0.1.6
torch 1.9.0
torchvision 0.10.0a0+300a8a4
P.S. Installing torchvision on aarch64 has some pitfalls.

Model:

Standard yolov5s model, number of classes = 4

Step 1:

https://github.com/wang-xinyu/tensorrtx
https://github.com/Alex-Beh/tensorrtx/tree/c1220241f80aa9db8411c724905e6e695416f3e9
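The first repository is used for the .pt -> .engine conversion and the second for video inference. Downloading only the second one appears to be enough.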

Step 2:

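  1. Enter the yolov5 folder and use gen_wts.py to convert the trained yolov5 .pt model into a .wts file. Be sure to update the class count in yololayer.h and the model depth and width settings.
  2. mkdir build -> cd build -> cmake .. -> make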
  3. sudo ./yolov5 -s yolov5s.wts yolov5s.engine s
  4. sudo ./yolov5 -d yolov5s.engine …/samples
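
To make the intermediate .wts step less opaque, here is a minimal sketch of the plain-text layout that, as far as I understand it, gen_wts.py produces: a first line with the number of tensors, then one line per tensor holding its name, its element count, and each value as a big-endian float32 in hex. The helper names `write_wts` and `read_wts` and the toy weights are hypothetical and only for illustration. The real script reads the tensors from the PyTorch checkpoint instead.

```python
import os
import struct
import tempfile


def write_wts(weights, path):
    """Write {name: flat list of floats} as a .wts-style text file."""
    with open(path, "w") as f:
        # First line: number of tensors in the file.
        f.write("{}\n".format(len(weights)))
        for name, values in weights.items():
            # Each tensor line: name, element count, then hex-encoded values.
            f.write("{} {}".format(name, len(values)))
            for v in values:
                # Big-endian IEEE 754 float32, written as 8 hex characters.
                f.write(" " + struct.pack(">f", float(v)).hex())
            f.write("\n")


def read_wts(path):
    """Parse a .wts-style text file back into {name: list of floats}."""
    weights = {}
    with open(path) as f:
        count = int(f.readline())
        for _ in range(count):
            parts = f.readline().split()
            name, size = parts[0], int(parts[1])
            values = [struct.unpack(">f", bytes.fromhex(h))[0] for h in parts[2:]]
            if len(values) != size:
                raise ValueError("size mismatch for tensor " + name)
            weights[name] = values
    return weights


# Hypothetical toy weights, not taken from a real yolov5s checkpoint.
demo = {"model.0.conv.weight": [1.0, 0.5, -2.0], "model.0.bn.bias": [0.0]}
path = os.path.join(tempfile.gettempdir(), "demo.wts")
write_wts(demo, path)
restored = read_wts(path)
print(open(path).read())
```

Because the file is plain text, a quick look at the first line and a few tensor names is an easy sanity check before building the engine.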

Step 3:

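  1. In the yolov5 folder of the second repository from Step 1, likewise run mkdir build -> cd build -> cmake .. -> make
  2. From the build folder, run ./video_inteference_yolov5 (set the .engine file and the RTSP address in the source before compiling)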

Quantization results:

(timings include image preprocessing and box postprocessing)
INT8 :
sumtime = 20.953526973724365
imgnum = 500
Avr = 41.90ms

FP16 :
sumtime = 21.62316060066223
imgnum = 500
Avr = 43.24ms

FP32 :
sumtime = 24.902515172958374
imgnum = 500
Avr = 49.80ms
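
The averages above can be rechecked with a few lines of arithmetic. This sketch assumes sumtime is in seconds, which is consistent with the reported per-image averages, and derives FPS and the speedup relative to FP32 from the same numbers. The function name `summarize` is only illustrative.

```python
# Total times (assumed to be seconds) and image count, as reported above.
results = {
    "INT8": 20.953526973724365,
    "FP16": 21.62316060066223,
    "FP32": 24.902515172958374,
}
IMG_NUM = 500


def summarize(sum_time_s, img_num):
    """Return (average latency in ms per image, throughput in images/s)."""
    avg_ms = sum_time_s / img_num * 1000.0
    fps = img_num / sum_time_s
    return avg_ms, fps


summary = {name: summarize(t, IMG_NUM) for name, t in results.items()}
fp32_ms = summary["FP32"][0]

for name, (avg_ms, fps) in summary.items():
    speedup = fp32_ms / avg_ms
    print("{}: {:.2f} ms/img, {:.1f} FPS, {:.2f}x vs FP32".format(
        name, avg_ms, fps, speedup))
```

The computed averages agree with the reported ones up to the last digit. On these numbers, INT8 and FP16 are each less than 20% faster than FP32 end to end, which suggests that the preprocessing and postprocessing included in the timing take a sizable share of each frame.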

yolov5s is fairly slow. Bookmarking this for later:
https://github.com/ppogg/YOLOv5-Lite

cmake installation

wget https://cmake.org/files/v3.13/cmake-3.13.0.tar.gz
tar -xzvf cmake-3.13.0.tar.gz
cd cmake-3.13.0
# To also build the cmake-gui tool, append the --qt-gui option: ./bootstrap --qt-gui
./bootstrap && make && sudo make install