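Object detection has long been a core problem in remote sensing imagery and computer vision. It is usually defined as locating the target objects in an input image and identifying their categories. Automatic object detection is widely used in practical applications such as hazard detection, environmental monitoring, change detection, and urban planning.
Over the past decades, object detection has been studied extensively, and many methods have been developed to detect man-made objects (vehicles, buildings, roads, bridges, etc.) and natural objects (lakes, coasts, forests, etc.) in remote sensing images. Existing object detection methods for remote sensing datasets fall roughly into four categories: (1) template-matching-based methods, (2) knowledge-based methods, (3) object-based image analysis methods, and (4) machine-learning-based methods. Among these, machine-learning-based methods are robust in feature extraction and object classification, and many recent works build on them to make significant progress on this problem.
In recent years, few-shot learning has been widely studied in computer vision for scene classification, image segmentation, and object detection. In remote sensing images, however, object sizes can differ greatly and so can spatial resolution, which makes the problem even harder when only a few annotated samples are available.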
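Small object detection has broad application value and research significance in video surveillance, autonomous driving, UAV aerial photography, remote sensing image analysis, and similar areas. There are currently two main ways to define a small object, by relative size and by absolute size:
Relative size: the ratio of the bounding box's width and height to the image's width and height is below a given value
Relative size: the square root of the ratio of bounding-box area to image area is below a given value
Absolute size: objects smaller than 32×32 pixels, as in the MS-COCO dataset
Absolute size: objects whose size falls in the range of [10, 50] pixels, as in the DOTA and WIDER FACE datasets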
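Paddle proposes the following definition at the level of the whole dataset:
When the median ratio of bounding-box width and height to image width and height is below 0.04, the dataset is considered a small-object dataset.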
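The dataset-level rule above can be sketched for a COCO-format annotation dictionary. This is only one plausible reading of the rule: it pools the width ratios and the height ratios of all boxes and takes a single median. The function name `is_small_object_dataset` is hypothetical and is not part of PaddleDetection.

```python
import statistics


def is_small_object_dataset(coco, threshold=0.04):
    """Return True if the median box-to-image size ratio is below `threshold`.

    Assumption: width ratios and height ratios are pooled into one list
    before taking the median. `coco` is a dict in COCO annotation format.
    """
    # Map each image id to its (width, height)
    sizes = {img["id"]: (img["width"], img["height"]) for img in coco["images"]}

    ratios = []
    for ann in coco["annotations"]:
        img_w, img_h = sizes[ann["image_id"]]
        _, _, box_w, box_h = ann["bbox"]  # COCO bbox is [x, y, width, height]
        ratios.append(box_w / img_w)
        ratios.append(box_h / img_h)

    return statistics.median(ratios) < threshold


# Toy example with made-up numbers: two small boxes in a 1000x1000 image
toy = {
    "images": [{"id": 1, "width": 1000, "height": 1000}],
    "annotations": [
        {"image_id": 1, "bbox": [10, 10, 20, 30]},
        {"image_id": 1, "bbox": [50, 50, 10, 25]},
    ],
}
print(is_small_object_dataset(toy))
```

With a real dataset, the dictionary would come from `json.load` on the annotation file.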

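Small object detection currently faces the following main difficulties:
Small objects cover little area and provide few useful features
Small objects can vanish after downsampling, which makes bounding boxes hard to regress and the model hard to converge
Small objects of the same class are often densely packed, so NMS (non-maximum suppression) filters out many correctly predicted boxes
Few datasets exist for small object detection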
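The NMS difficulty in the list above can be made concrete with a minimal greedy NMS sketch. It is a plain-Python illustration, not PaddleDetection's `MultiClassNMS`, and the boxes and scores are made-up values: two neighbouring objects whose boxes overlap with an IoU of about 0.54.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold):
    """Greedy NMS: keep a box only if it does not overlap a kept box too much."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


# Two correct detections of adjacent objects (hypothetical values)
boxes = [(0, 0, 10, 10), (3, 0, 13, 10)]
scores = [0.9, 0.8]

print(nms(boxes, scores, iou_threshold=0.5))  # the second box is suppressed
print(nms(boxes, scores, iou_threshold=0.7))  # both boxes survive
```

A stricter threshold removes one of the two valid detections, which is why dense scenes tend to need a looser NMS threshold.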
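To address these problems, the PaddlePaddle (飞桨) team improved both the pipeline and the algorithm of the general-purpose PP-YOLOE+ detector and proposed PP-YOLOE-SOD (Small Object Detection), a detector dedicated to small objects.
Compared with PP-YOLOE, the main improvements in PP-YOLOE-SOD are a Transformer global attention mechanism in the neck and a vector-based DFL in the regression branch.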
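As a rough illustration of the DFL idea, the regression branch predicts a discrete distribution over candidate offsets for each box side and decodes the offset as the expectation of that distribution. The sketch below assumes the bins are the integers covered by the `reg_range: [-2, 17]` setting that appears in the configuration later in this post, that is -2 through 16. That bin layout and the function name `dfl_expected_offset` are assumptions, not the exact PaddleDetection implementation.

```python
import math


def dfl_expected_offset(logits, reg_range=(-2, 17)):
    """Decode one box-side offset as the expectation of a softmax distribution.

    Assumption: one logit per integer bin in [reg_range[0], reg_range[1]).
    """
    bins = list(range(reg_range[0], reg_range[1]))
    if len(logits) != len(bins):
        raise ValueError("expected %d logits, got %d" % (len(bins), len(logits)))

    # Numerically stable softmax
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)

    # Expectation over the bin values
    return sum(b * e / total for b, e in zip(bins, exps))


# A uniform distribution decodes to the mean of the bins
print(dfl_expected_offset([0.0] * 19))

# A sharply peaked distribution decodes to (almost exactly) the peaked bin
peaked = [0.0] * 19
peaked[7] = 50.0  # index 7 corresponds to bin value 5
print(dfl_expected_offset(peaked))
```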
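Applying Transformers to computer vision is currently an active research direction. The original ViT splits the image into patches, adds a position embedding, feeds the result into a Transformer encoder, and attaches a classification or detection head to obtain good results.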

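The approach here is similar: it mainly adds two modules, a position embedding and an encoder. The difference is that the input is the last-level feature map.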
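A minimal NumPy sketch of the first of those two steps: flattening a feature map into a token sequence and adding a 2D sine-cosine position embedding. The sine-cosine scheme is a common choice and is assumed here; whether PP-YOLOE-SOD uses exactly this embedding is not stated in this post. Both function names are hypothetical, and the encoder itself is omitted.

```python
import numpy as np


def sincos_position_embedding(h, w, dim, temperature=10000.0):
    """Build a fixed (h*w, dim) 2D sine-cosine position embedding."""
    assert dim % 4 == 0, "dim must be divisible by 4"
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    pos_dim = dim // 4
    omega = 1.0 / temperature ** (np.arange(pos_dim) / pos_dim)
    out_x = xs.reshape(-1, 1) * omega[None, :]
    out_y = ys.reshape(-1, 1) * omega[None, :]
    return np.concatenate(
        [np.sin(out_x), np.cos(out_x), np.sin(out_y), np.cos(out_y)], axis=1
    )


def feature_map_to_tokens(feat):
    """Turn a (C, H, W) feature map into (H*W, C) tokens with positions added."""
    c, h, w = feat.shape
    tokens = feat.reshape(c, h * w).T  # one token per spatial location
    return tokens + sincos_position_embedding(h, w, c)


# Toy feature map: 8 channels on a 2x3 grid
feat = np.zeros((8, 2, 3))
tokens = feature_map_to_tokens(feat)
print(tokens.shape)
```

The resulting token sequence is what a Transformer encoder layer would then attend over, so every location of the last feature map can draw on global context.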


| Model | $mAP^{val}$ | $AP^{0.5}$ | $AP^{0.75}$ | $AP^{small}$ | $AP^{medium}$ | $AP^{large}$ | $AR^{small}$ | $AR^{medium}$ | $AR^{large}$ | Download link | Config file |
|---|---|---|---|---|---|---|---|---|---|---|---|
| PP-YOLOE+_SOD-l | 53.0 | 70.4 | 57.7 | 37.1 | 57.5 | 69.0 | 56.5 | 77.5 | 86.7 | download link | config file |
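The NWPU VHR-10 dataset contains 800 high-resolution satellite images cropped from Google Earth and the Vaihingen dataset and then manually annotated by experts. It covers 10 classes (airplane, ship, storage tank, baseball diamond, tennis court, basketball court, ground track field, harbor, bridge, and vehicle).
It consists of 715 RGB images and 85 pan-sharpened color infrared images. The 715 RGB images were collected from Google Earth, with spatial resolutions ranging from 0.5 m to 2 m. The 85 pan-sharpened infrared images, with a spatial resolution of 0.08 m, come from the Vaihingen data.
The dataset contains 3775 object instances in total: 757 airplanes, 390 baseball diamonds, 159 basketball courts, 124 bridges, 224 harbors, 163 ground track fields, 302 ships, 655 storage tanks, 524 tennis courts, and 477 vehicles. All instances are manually annotated with horizontal bounding boxes.
The original dataset contains the following files: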
(x1,y1),(x2,y2),a
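where (x1,y1) is the top-left corner of the bounding box, (x2,y2) is the bottom-right corner,
and a is the object class (1: airplane, 2: ship, 3: storage tank, 4: baseball diamond, 5: tennis court, 6: basketball court, 7: ground track field, 8: harbor, 9: bridge, 10: vehicle).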

The dataset used here has already been converted to COCO format; the original dataset is in VOC format.
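To show how the corner-based `(x1,y1),(x2,y2),a` description relates to a COCO-style box, here is a small sketch that parses one such line into a `[x, y, width, height]` box. It assumes integer coordinates and exactly five numbers per line. The function name `parse_corner_line` and the sample line are hypothetical, and this is not the script that produced the converted dataset.

```python
import re


def parse_corner_line(line):
    """Parse '(x1,y1),(x2,y2),a' into a COCO-style annotation fragment."""
    numbers = [int(v) for v in re.findall(r"-?\d+", line)]
    if len(numbers) != 5:
        raise ValueError("expected 5 numbers, got %d: %r" % (len(numbers), line))
    x1, y1, x2, y2, class_id = numbers
    width, height = x2 - x1, y2 - y1
    return {
        "bbox": [x1, y1, width, height],  # COCO stores top-left corner plus size
        "area": width * height,
        "category_id": class_id,
    }


# Made-up example line: an airplane (class 1)
print(parse_corner_line("(563,478),(630,573),1"))
```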
# Unzip the dataset
%cd work
!mkdir dataset
!unzip /home/aistudio/data/data198756/dataset_coco.zip -d /home/aistudio/work/dataset
# Clone the PaddleDetection repository
# Gitee is faster to download from within mainland China
%cd /home/aistudio
!git clone https://gitee.com/paddlepaddle/PaddleDetection.git
# GitHub
# !git clone https://github.com/PaddlePaddle/PaddleDetection.git
# If git clone is very slow, unzip the PaddleDetection archive I uploaded instead
!unzip /home/aistudio/data/data199313/PaddleDetection.zip -d /home/aistudio
Before training, open /home/aistudio/PaddleDetection/configs/datasets/coco_detection.yml and change the dataset paths as follows:
metric: COCO
num_classes: 10  # this dataset has 10 classes

TrainDataset:
  name: COCODataSet
  image_dir: /home/aistudio/work/dataset/image
  anno_path: dataset/instances_train2017.json
  dataset_dir: /home/aistudio/work
  data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']

EvalDataset:
  name: COCODataSet
  image_dir: /home/aistudio/work/dataset/image
  anno_path: dataset/instances_val2017.json
  dataset_dir: /home/aistudio/work
  allow_empty: true

TestDataset:
  name: ImageFolder
  anno_path: dataset/instances_val2017.json  # also support txt (like VOC's label_list.txt)
  dataset_dir: /home/aistudio/work  # if set, anno_path will be 'dataset_dir/anno_path'
We also need to adjust a few parameters in /home/aistudio/PaddleDetection/configs/smalldet/ppyoloe_plus_sod_crn_l_80e_coco.yml:
_BASE_: [
  '../datasets/coco_detection.yml',
  '../runtime.yml',
  '../ppyoloe/_base_/optimizer_80e.yml',
  '../ppyoloe/_base_/ppyoloe_plus_crn.yml',
  '../ppyoloe/_base_/ppyoloe_plus_reader.yml',
]
log_iter: 10  # interval, in iterations, between log messages
snapshot_epoch: 5  # evaluate once every this many epochs
weights: output/ppyoloe_plus_sod_crn_l_80e_coco/model_final

pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/pretrained/ppyoloe_crn_l_obj365_pretrained.pdparams
depth_mult: 1.0
width_mult: 1.0

CustomCSPPAN:
  num_layers: 4
  use_trans: True

PPYOLOEHead:
  reg_range: [-2, 17]
  static_assigner_epoch: -1
  assigner:
    name: TaskAlignedAssigner_CR
    center_radius: 1
  nms:
    name: MultiClassNMS
    nms_top_k: 1000
    keep_top_k: 300
    score_threshold: 0.01
    nms_threshold: 0.7
In addition, because we train on a single GPU while the PP-YOLOE defaults assume 8 GPUs, we need to lower the learning rate in /home/aistudio/PaddleDetection/configs/ppyoloe/_base_/optimizer_80e.yml as follows:
epoch: 80

LearningRate:
  base_lr: 0.000125  # the original 0.001 divided by 8
  schedulers:
    - name: CosineDecay
      max_epochs: 96
    - name: LinearWarmup
      start_factor: 0.
      epochs: 5

OptimizerBuilder:
  optimizer:
    momentum: 0.9
    type: Momentum
  regularizer:
    factor: 0.0005
    type: L2
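The division by 8 follows the usual heuristic of scaling the learning rate in proportion to the number of GPUs, assuming the per-GPU batch size stays the same. The sketch below reproduces that arithmetic and adds an epoch-level approximation of a linear warmup followed by cosine decay with the values from the config. The real schedulers step per iteration, so this is only an illustration, and both function names are hypothetical.

```python
import math


def scale_base_lr(reference_lr, num_gpus, reference_gpus=8):
    """Scale the learning rate linearly with the number of GPUs."""
    return reference_lr * num_gpus / reference_gpus


def lr_at_epoch(epoch, base_lr, warmup_epochs=5, start_factor=0.0, max_epochs=96):
    """Approximate learning rate at a given epoch (warmup, then cosine decay)."""
    if epoch < warmup_epochs:
        # Linear ramp from start_factor * base_lr up to base_lr
        alpha = epoch / warmup_epochs
        return base_lr * (start_factor * (1.0 - alpha) + alpha)
    return base_lr * 0.5 * (math.cos(math.pi * epoch / max_epochs) + 1.0)


base_lr = scale_base_lr(0.001, num_gpus=1)
print(base_lr)

for epoch in (0, 5, 48, 80):
    print(epoch, lr_at_epoch(epoch, base_lr))
```

Under this approximation, setting `max_epochs: 96` while training for 80 epochs means the cosine curve is cut off before it reaches zero, so the final epochs still run with a small non-zero learning rate.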
# Install the required dependencies
!pip install pycocotools
# Install the packages listed in requirements.txt
!pip install -r ~/PaddleDetection/requirements.txt
# Train
%cd /home/aistudio/PaddleDetection
!python tools/train.py -c configs/smalldet/ppyoloe_plus_sod_crn_l_80e_coco.yml --amp --eval --use_vdl True --vdl_log_dir vdl_log_dir/scalar
We can visualize the training process through the VisualDL service, as follows:




After opening VisualDL, the visualization looks like this:

# Evaluate
%cd /home/aistudio/PaddleDetection
!python tools/eval.py -c configs/smalldet/ppyoloe_plus_sod_crn_l_80e_coco.yml -o weights=output/ppyoloe_plus_sod_crn_l_80e_coco/best_model.pdparams
/home/aistudio/PaddleDetection
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/matplotlib/__init__.py:107: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated, and in 3.8 it will stop working
  from collections import MutableMapping
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/matplotlib/rcsetup.py:20: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated, and in 3.8 it will stop working
  from collections import Iterable, Mapping
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/matplotlib/colors.py:53: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated, and in 3.8 it will stop working
  from collections import Sized
Warning: Unable to use MOT metric, please install motmetrics, for example: `pip install motmetrics`, see https://github.com/longcw/py-motmetrics
Warning: Unable to use MCMOT metric, please install motmetrics, for example: `pip install motmetrics`, see https://github.com/longcw/py-motmetrics
Warning: import ppdet from source directory without installing, run 'python setup.py install' to install ppdet firstly
W0315 17:09:25.167379 38757 gpu_resources.cc:61] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 11.2, Runtime API Version: 11.2
W0315 17:09:25.170883 38757 gpu_resources.cc:91] device: 0, cuDNN Version: 8.2.
loading annotations into memory...
Done (t=0.01s)
creating index...
index created!
[03/15 17:09:27] ppdet.data.source.coco INFO: Load [130 samples valid, 0 samples invalid] in file /home/aistudio/work/dataset/instances_val2017.json.
[03/15 17:09:29] ppdet.utils.checkpoint INFO: Finish loading model weights: output/ppyoloe_plus_sod_crn_l_80e_coco/best_model.pdparams
[03/15 17:09:29] ppdet.engine INFO: Eval iter: 0
[03/15 17:09:34] ppdet.metrics.metrics INFO: The bbox result is saved to bbox.json.
loading annotations into memory...
Done (t=0.01s)
creating index...
index created!
[03/15 17:09:34] ppdet.metrics.coco_utils INFO: Start evaluate...
Loading and preparing results...
DONE (t=0.46s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type *bbox*
DONE (t=2.33s).
Accumulating evaluation results...
DONE (t=0.31s).
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.776
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.977
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.882
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.768
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.759
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.843
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.288
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.710
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.831
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.787
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.813
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.897
[03/15 17:09:37] ppdet.engine INFO: Total sample number: 130, average FPS: 26.371598818398887
Extract the validation images from the image folder according to instances_val2017.json:
import json
import os
import shutil

dataset_dir = '/home/aistudio/work/dataset'
img_dir = os.path.join(dataset_dir, 'image')
test_dir = os.path.join(dataset_dir, 'test')

# Create the output folder by absolute path, independent of the current working directory
os.makedirs(test_dir, exist_ok=True)

# Read the validation annotations
with open(os.path.join(dataset_dir, 'instances_val2017.json'), encoding='utf-8') as f:
    gt = json.load(f)

# Copy every image listed in the annotation file into the test folder
for img_info in gt['images']:
    img_path = os.path.join(img_dir, img_info['file_name'])
    if os.path.isfile(img_path):
        print(img_path)
        shutil.copy(img_path, test_dir)
# Predict
%cd /home/aistudio/PaddleDetection
!python tools/infer.py -c configs/smalldet/ppyoloe_plus_sod_crn_l_80e_coco.yml -o weights=output/ppyoloe_plus_sod_crn_l_80e_coco/best_model.pdparams --infer_dir=/home/aistudio/work/dataset/test --output_dir infer_output/
The inference result is shown below:


To deploy PP-YOLOE-SOD on a GPU or benchmark its speed, the model must first be exported with tools/export_model.py.
%cd /home/aistudio/PaddleDetection
!python tools/export_model.py -c configs/smalldet/ppyoloe_plus_sod_crn_l_80e_coco.yml -o weights=output/ppyoloe_plus_sod_crn_l_80e_coco/best_model.pdparams
/home/aistudio/PaddleDetection
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/matplotlib/__init__.py:107: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated, and in 3.8 it will stop working
  from collections import MutableMapping
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/matplotlib/rcsetup.py:20: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated, and in 3.8 it will stop working
  from collections import Iterable, Mapping
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/matplotlib/colors.py:53: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated, and in 3.8 it will stop working
  from collections import Sized
Warning: Unable to use MOT metric, please install motmetrics, for example: `pip install motmetrics`, see https://github.com/longcw/py-motmetrics
Warning: Unable to use MCMOT metric, please install motmetrics, for example: `pip install motmetrics`, see https://github.com/longcw/py-motmetrics
Warning: import ppdet from source directory without installing, run 'python setup.py install' to install ppdet firstly
[03/15 18:17:47] ppdet.utils.checkpoint INFO: Finish loading model weights: output/ppyoloe_plus_sod_crn_l_80e_coco/best_model.pdparams
loading annotations into memory...
Done (t=0.01s)
creating index...
index created!
[03/15 18:17:48] ppdet.engine INFO: Export inference config file to output_inference/ppyoloe_plus_sod_crn_l_80e_coco/infer_cfg.yml
[03/15 18:18:02] ppdet.engine INFO: Export model and saved in output_inference/ppyoloe_plus_sod_crn_l_80e_coco
# Pick one validation image to test the deployed model
%cd /home/aistudio/PaddleDetection
!python deploy/python/infer.py --model_dir=/home/aistudio/PaddleDetection/output_inference/ppyoloe_plus_sod_crn_l_80e_coco --image_file=/home/aistudio/work/dataset/test/421.jpg --device=GPU --save_images=True --threshold=0.25 --slice_infer --slice_size 500 500 --overlap_ratio 0.25 0.25 --combine_method=nms --match_threshold=0.6 --match_metric=ios
/home/aistudio/PaddleDetection
----------- Running Arguments -----------
action_file: None
batch_size: 1
camera_id: -1
combine_method: nms
cpu_threads: 1
device: GPU
enable_mkldnn: False
enable_mkldnn_bfloat16: False
image_dir: None
image_file: /home/aistudio/work/dataset/test/421.jpg
match_metric: ios
match_threshold: 0.6
model_dir: /home/aistudio/PaddleDetection/output_inference/ppyoloe_plus_sod_crn_l_80e_coco
output_dir: output
overlap_ratio: [0.25, 0.25]
random_pad: False
reid_batch_size: 50
reid_model_dir: None
run_benchmark: False
run_mode: paddle
save_images: True
save_mot_txt_per_img: False
save_mot_txts: False
save_results: False
scaled: False
slice_infer: True
slice_size: [500, 500]
threshold: 0.25
tracker_config: None
trt_calib_mode: False
trt_max_shape: 1280
trt_min_shape: 1
trt_opt_shape: 640
use_coco_category: False
use_dark: True
use_gpu: False
video_file: None
window_size: 50
------------------------------------------
----------- Model Configuration -----------
Model Arch: YOLO
Transform Order:
--transform op: Resize
--transform op: NormalizeImage
--transform op: Permute
--------------------------------------------
slice to {} sub_samples. 6
class_id:9, confidence:0.8728, left_top:[340.72,103.24],right_bottom:[369.60,158.67]
class_id:9, confidence:0.7929, left_top:[330.79,163.38],right_bottom:[361.78,210.25]
class_id:9, confidence:0.5966, left_top:[352.58,53.78],right_bottom:[379.66,105.11]
class_id:9, confidence:0.5286, left_top:[361.60,4.02],right_bottom:[391.68,48.03]
class_id:9, confidence:0.8936, left_top:[407.95,233.15],right_bottom:[465.88,262.83]
class_id:9, confidence:0.8747, left_top:[696.56,390.71],right_bottom:[723.77,438.41]
class_id:9, confidence:0.8253, left_top:[626.39,434.95],right_bottom:[653.88,482.35]
class_id:9, confidence:0.8880, left_top:[922.41,258.08],right_bottom:[954.44,307.52]
class_id:9, confidence:0.8653, left_top:[654.27,256.37],right_bottom:[678.71,303.64]
class_id:9, confidence:0.8627, left_top:[745.64,64.88],right_bottom:[772.16,110.10]
class_id:9, confidence:0.8569, left_top:[887.41,241.04],right_bottom:[920.09,294.39]
class_id:9, confidence:0.8382, left_top:[686.31,25.25],right_bottom:[714.68,78.60]
class_id:9, confidence:0.8245, left_top:[657.26,187.57],right_bottom:[689.04,244.98]
class_id:9, confidence:0.7201, left_top:[736.03,115.66],right_bottom:[764.34,166.88]
class_id:9, confidence:0.8704, left_top:[285.31,409.72],right_bottom:[315.14,465.77]
class_id:9, confidence:0.8475, left_top:[261.15,549.25],right_bottom:[289.68,595.72]
class_id:9, confidence:0.8131, left_top:[272.90,482.58],right_bottom:[302.10,531.71]
class_id:9, confidence:0.8013, left_top:[305.93,293.37],right_bottom:[331.93,343.54]
class_id:9, confidence:0.6527, left_top:[246.47,612.18],right_bottom:[276.39,671.80]
class_id:9, confidence:0.8807, left_top:[689.47,232.56],right_bottom:[715.29,278.37]
class_id:9, confidence:0.5627, left_top:[982.50,276.36],right_bottom:[1008.19,328.45]
save result to: output/421.jpg
Test iter 0
------------------ Inference Time Info ----------------------
total_time(ms): 1583.2, img_num: 1
average latency time(ms): 1583.20, QPS: 0.631632
preprocess_time(ms): 68.30, inference_time(ms): 1514.90, postprocess_time(ms): 0.00
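The `--slice_size 500 500`, `--overlap_ratio 0.25 0.25`, and `--match_metric=ios` flags in the command above describe sliced inference: the image is cut into overlapping windows, each window is detected separately, and duplicate boxes from overlapping windows are merged. The sketch below shows one simple way to place such windows and to compute IOS (intersection over the smaller box). The 1000×700 image size is hypothetical, both function names are made up, and the deploy script's actual window placement may differ.

```python
def slice_windows(img_w, img_h, slice_w=500, slice_h=500,
                  overlap_w=0.25, overlap_h=0.25):
    """Return overlapping (x1, y1, x2, y2) windows that cover the image."""
    step_x = int(slice_w * (1 - overlap_w))
    step_y = int(slice_h * (1 - overlap_h))
    windows = []
    y = 0
    while True:
        y2 = min(y + slice_h, img_h)
        y1 = max(y2 - slice_h, 0)  # shift the last row back inside the image
        x = 0
        while True:
            x2 = min(x + slice_w, img_w)
            x1 = max(x2 - slice_w, 0)  # shift the last column back inside
            windows.append((x1, y1, x2, y2))
            if x2 >= img_w:
                break
            x += step_x
        if y2 >= img_h:
            break
        y += step_y
    return windows


def ios(a, b):
    """Intersection area divided by the area of the smaller box."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    smaller = min((a[2] - a[0]) * (a[3] - a[1]), (b[2] - b[0]) * (b[3] - b[1]))
    return inter_w * inter_h / smaller if smaller > 0 else 0.0


windows = slice_windows(1000, 700)
print(len(windows), windows)

# A box truncated at a window border lies fully inside the complete box
print(ios((0, 0, 10, 10), (5, 0, 10, 10)))
```

IOS is a natural fit for merging here because a box cut off at a window border sits almost entirely inside the full box from the neighbouring window: its IOS is close to 1 even when its IoU is low.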
The inference result is shown below:

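PP-YOLOE-SOD is a small object detection model developed in-house by the PaddleDetection team. It uses a vector-based DFL algorithm tied to the dataset distribution and a center-prior optimization strategy tuned for small objects, adds a Transformer module to the neck (FPN), and combines these with strategies such as adding a P2 layer and using large input sizes, reaching high accuracy on several small-object datasets.
When training, evaluating, and predicting directly on original images or sub-images, without slicing and stitching, the PP-YOLOE-SOD model is recommended. For more details and ablation experiments, see the COCO model and the VisDrone model.
Through this project I learned many skills I had not mastered before. For example, I had never worked with a COCO-format dataset, and this project gave me the chance to use it and become familiar with it.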
Student: 吉康毅
Mentor: 周军