赞
踩
水下目标检测旨在对水下场景中的物体进行定位和识别。这项研究由于在海洋学、水下导航等领域的广泛应用而引起了持续的关注。但是,由于复杂的水下环境和光照条件,这仍然是一项艰巨的任务。
基于深度学习的物体检测系统已在各种应用中表现出较好的性能,但在处理水下目标检测方面仍然感到不足,主要有原因是:可用的水下目标检测数据集稀少,实际应用中的水下场景的图像杂乱无章,并且水下环境中的目标物体通常很小,而当前基于深度学习的目标检测器通常无法有效地检测小物体,或者对小目标物体的检测性能较差。同时,在水下场景中,与波长有关的吸收和散射问题大大降低了水下图像的质量,从而导致了可见度损失,弱对比度和颜色变化等问题。
Al+水下勘探是一个新兴领域,目前专门用于水下研究工作的解决方案不多,高质量的数据集更是弥足珍贵。使用 PP-YOLOE+ 来推进水下目标检测的进步,从而使得水下机器人等设备能够更加智能化,提高海底资源勘探等方面的效率。而水下机器人又反哺出高质量的水下目标检测数据集,推动 Al+水下勘探的发展。

深度学习中,数据往往决定了性能的上限,算法只是不断地逼近上限。尽管基于深度学习的方法在标准的目标检测中取得了可喜的性能。水下目标检测仍具有以下几点挑战:
(1)水下场景的实际应用中目标通常很小,含有大量的小目标;
(2)水下数据集和实际应用中的图像通常是模糊的,图像中具有异构的噪声。
因此,针对以上所述的背景和水中目标检测所遇到的挑战,本项目将选用 PP-YOLOE+ 这一基于飞桨云边一体高精度模型PP-YOLOE迭代优化升级的版本。针对性的解决以上问题。
PP-YOLOE+ 具有如下特点:
超强性能
训练收敛加速
下游任务泛化性显著提升
高性能部署能力

PaddlePaddle >= 2.3.2
Python == 3.7
%cd /home/aistudio/work/
# gitee 国内下载比较快
!git clone https://gitee.com/paddlepaddle/PaddleDetection.git -b develop
# github
# !git clone https://github.com/PaddlePaddle/PaddleDetection.git -b develop
# 环境安装
%cd /home/aistudio/work/
# gitee 国内下载比较快
# !git clone https://gitee.com/paddlepaddle/PaddleDetection.git -b develop
%cd PaddleDetection/
!pip install -r requirements.txt > /dev/null
/home/aistudio/work
/home/aistudio/work/PaddleDetection
本项目选用数据集来源于水下目标检测算法赛(注:数据由鹏城实验室提供),训练集是5543张 jpg 格式的水下光学图像与对应标注结果构成,其中主要有海参、海胆、扇贝、海星四种目标。

| supercategory | id | name |
|---|---|---|
| component | 1 | echinus |
| component | 2 | holothurian |
| component | 3 | scallop |
| component | 4 | starfish |
| component | 5 | waterweeds |
| 长度 | 宽度 | 图片数量 |
|---|---|---|
| 704 | 576 | 38 |
| 1920 | 1080 | 596 |
| 3840 | 2160 | 1712 |
| 720 | 405 | 3153 |
| 586 | Text | 44 |
# 1.数据集解压
!unzip data/data172711/fish.zip > /dev/null
!mv ./fish ./work/PaddleDetection/dataset
%cd work/PaddleDetection/
/home/aistudio/work/PaddleDetection
# 2.voc2coco import argparse import glob import json import os import os.path as osp import sys import shutil import numpy as np import PIL.ImageDraw import xml.dom.minidom as xmldom import cv2 label_to_num = {} categories_list = [] labels_list = [] class MyEncoder(json.JSONEncoder): def default(self, obj): if isinstance(obj, np.integer): return int(obj) elif isinstance(obj, np.floating): return float(obj) elif isinstance(obj, np.ndarray): return obj.tolist() else: return super(MyEncoder, self).default(obj) def getbbox(self, points): polygons = points mask = self.polygons_to_mask([self.height, self.width], polygons) return self.mask2box(mask) def images_labelme(data, num): image = {} image['height'] = data['imageHeight'] image['width'] = data['imageWidth'] image['id'] = num + 1 image['file_name'] = data['imagePath'].split('/')[-1] return image def images_cityscape(num, img_file, w, h): image = {} image['height'] = h image['width'] = w image['id'] = num + 1 image['file_name'] = img_file return image def categories(label, labels_list): category = {} category['supercategory'] = 'component' category['id'] = len(labels_list) + 1 category['name'] = label return category def annotations_rectangle(points, label, image_num, object_num, label_to_num): annotation = {} seg_points = np.asarray(points).copy() annotation['segmentation'] = [list(seg_points.flatten())] annotation['iscrowd'] = 0 annotation['image_id'] = image_num + 1 annotation['bbox'] = list( map(float, [ points[0][0], points[0][1], points[1][0] - points[0][0], points[1][ 1] - points[0][1] ])) annotation['area'] = annotation['bbox'][2] * annotation['bbox'][3] annotation['category_id'] = label_to_num[label] annotation['id'] = object_num + 1 return annotation def annotations_polygon(height, width, points, label, image_num, object_num, label_to_num): annotation = {} annotation['segmentation'] = [list(np.asarray(points).flatten())] annotation['iscrowd'] = 0 annotation['image_id'] = image_num + 1 annotation['bbox'] = list(map(float, get_bbox(height, width, points))) annotation['area'] = annotation['bbox'][2] * annotation['bbox'][3] annotation['category_id'] = label_to_num[label] annotation['id'] = object_num + 1 return annotation def get_bbox(height, width, points): polygons = points mask = np.zeros([height, width], dtype=np.uint8) mask = PIL.Image.fromarray(mask) xy = list(map(tuple, polygons)) PIL.ImageDraw.Draw(mask).polygon(xy=xy, outline=1, fill=1) mask = np.array(mask, dtype=bool) index = np.argwhere(mask == 1) rows = index[:, 0] clos = index[:, 1] left_top_r = np.min(rows) left_top_c = np.min(clos) right_bottom_r = np.max(rows) right_bottom_c = np.max(clos) return [ left_top_c, left_top_r, right_bottom_c - left_top_c, right_bottom_r - left_top_r ] def deal_json(output_dir, input_dir, img_list): data_coco = {} images_list = [] annotations_list = [] image_num = -1 object_num = -1 # labels_list =[] for img_file in img_list: img = cv2.imread(osp.join('dataset/fish/Images', img_file)) w = img.shape[1] h = img.shape[0] img_label = img_file.split('.')[0] label_file = osp.join('dataset/fish/Annotations', img_label + '.xml') # print('Generating dataset from:', label_file) image_num = image_num + 1 xml_file = xmldom.parse(label_file) eles = xml_file.documentElement images_list.append(images_cityscape(image_num, img_file, w, h)) for i in range(len(eles.getElementsByTagName('name'))): label = eles.getElementsByTagName('name')[i].firstChild.data # if label == 'starfish': # continue object_num = object_num + 1 # print(label) if label not in labels_list: # print(label) categories_list.append(categories(label, labels_list)) labels_list.append(label) label_to_num[label] = len(labels_list) points = [] xmin = int(eles.getElementsByTagName('xmin')[i].firstChild.data) ymin = int(eles.getElementsByTagName('ymin')[i].firstChild.data) xmax = int(eles.getElementsByTagName('xmax')[i].firstChild.data) ymax = int(eles.getElementsByTagName('ymax')[i].firstChild.data) # print(xmin,ymin,xmax,ymax) # if xmin > 2000: # print(label_file) points.append([xmin, ymin]) points.append([xmax, ymax]) annotations_list.append( annotations_rectangle(points, label, image_num, object_num, label_to_num)) data_coco['images'] = images_list data_coco['categories'] = categories_list data_coco['annotations'] = annotations_list # print(labels_list) return data_coco import os.path as osp import glob import os import shutil train_img = 'dataset/fish/Images' train_box = 'dataset/fish/Annotations' # Allocate the dataset. total_num = len(glob.glob(osp.join(train_img, '*.jpg'))) train_num = int(total_num * 0.8) os.makedirs('data/cocome' + '/train') val_num = total_num - train_num os.makedirs('data/cocome' + '/val') count = 1 train_list = [] val_list = [] for img_name in os.listdir(train_img): if count <= train_num: if osp.exists('data/cocome' + '/train/'): shutil.copyfile( osp.join(train_img, img_name), osp.join('data/cocome' + '/train/', img_name)) train_list.append(img_name) else: if count <= train_num + val_num: if osp.exists('data/cocome' + '/val/'): shutil.copyfile( osp.join(train_img, img_name), osp.join('data/cocome' + '/val/', img_name)) val_list.append(img_name) count = count + 1 if not os.path.exists('data/cocome' + '/annotations'): os.makedirs('data/cocome' + '/annotations') train_data_coco = deal_json( 'data/cocome' + '/train', train_img ,train_list) train_json_path = osp.join('data/cocome' + '/annotations', 'instance_train.json') json.dump( train_data_coco, open(train_json_path, 'w'), indent=4, cls=MyEncoder) val_data_coco = deal_json('data/cocome' + '/val', train_img, val_list) val_json_path = osp.join('data/cocome' + '/annotations', 'instance_val.json') json.dump(val_data_coco, open(val_json_path, 'w'), indent=4, cls=MyEncoder)
# 3.将转换后的数据集移动至dataset/fish
!mv data/cocome/* dataset/fish/
metric: COCO num_classes: 5 TrainDataset: !COCODataSet image_dir: train anno_path: annotations/instance_train.json dataset_dir: dataset/fish data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd'] EvalDataset: !COCODataSet image_dir: val anno_path: annotations/instance_val.json dataset_dir: dataset/fish TestDataset: !ImageFolder anno_path: label_list.txt # also support txt (like VOC's label_list.txt) dataset_dir: dataset/fish # if set, anno_path will be 'dataset_dir/anno_path'
使用 PP-YOLOE+ 进行训练:
在./configs/ppyoloe/ppyoloe_plus_crn_s_80e_coco.yml中提供了基于PP-YOLOE+训练该场景的配置,训练脚本如下:
!python tools/train.py -c configs/ppyoloe/ppyoloe_plus_crn_s_80e_coco.yml -o LearningRate.base_lr=0.000125 -o snapshot_epoch=1 -o worker_num=1 --eval
在训练模型以后,我们可以通过运行评估命令来得到模型的精度,以确认训练的效果。评估可以参考以下命令执行。
这里使用了我们已经训练好的模型。如希望使用自己训练的模型,请对应将weights=后的值更改为对应模型.pdparams文件的存储路径。
# 模型评估
!python tools/eval.py -c configs/ppyoloe/ppyoloe_plus_crn_s_80e_coco.yml -o weights=output/ppyoloe_plus_crn_s_80e_coco/best_model.pdparams
Warning: import ppdet from source directory without installing, run 'python setup.py install' to install ppdet firstly W1016 22:54:41.591436 22455 gpu_resources.cc:61] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 11.2, Runtime API Version: 11.2 W1016 22:54:41.598824 22455 gpu_resources.cc:91] device: 0, cuDNN Version: 8.2. loading annotations into memory... Done (t=0.00s) creating index... index created! [10/16 22:54:43] ppdet.utils.checkpoint INFO: Finish loading model weights: output/ppyoloe_plus_crn_s_80e_coco/best_model.pdparams [10/16 22:54:44] ppdet.engine INFO: Eval iter: 0 [10/16 22:54:55] ppdet.metrics.metrics INFO: The bbox result is saved to bbox.json. loading annotations into memory... Done (t=0.01s) creating index... index created! [10/16 22:54:55] ppdet.metrics.coco_utils INFO: Start evaluate... Loading and preparing results... DONE (t=0.50s) creating index... index created! Running per image evaluation... Evaluate annotation type *bbox* DONE (t=3.09s). Accumulating evaluation results... DONE (t=0.43s). Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.278 Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.541 Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.267 Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.237 Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.275 Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.304 Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.160 Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.436 Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.543 Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.514 Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.509 Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.572 [10/16 22:54:59] ppdet.engine INFO: Total sample number: 200, averge FPS: 19.422192298181745
这里我们将训练好的模型对着整个数据集的验证集来一番批量预测,这些预测结果会记录在VisualDL中,可以很方便地与原图对比,观察预测效果。
!python tools/infer.py -c configs/ppyoloe/ppyoloe_plus_crn_s_80e_coco.yml -o weights=output/ppyoloe_plus_crn_s_80e_coco/best_model --infer_dir ./dataset/fish/val --use_vdl=True --vdl_log_dir=./output/image --output_dir ./output/results
.pdparams只包括了模型的参数数据,实际部署还需要执行导出步骤。导出步骤可以参考下面列举的步骤:
注意,这里使用了我们已经训练好的模型。如希望使用自己训练的模型,请对应将weights=后的值更改为对应模型.pdparams文件的存储路径。如果没有指定–output_dir,那么导出的模型将默认存储在 output_inference/ 路径下。
!python tools/export_model.py -c configs/ppyoloe/ppyoloe_plus_crn_s_80e_coco.yml -o weights=output/ppyoloe_plus_crn_s_80e_coco/best_model.pdparams
至此,我们就完成了水底生物目标检测模型的从训练到导出的过程。接下来,看看该模型使用Paddle Inference部署时的具体性能表现。
!python deploy/python/infer.py --model_dir=output_inference/ppyoloe_plus_crn_s_80e_coco --image_file=dataset/fish/val/c000418.jpg --run_mode=paddle --device=gpu
----------- Running Arguments ----------- action_file: None batch_size: 1 camera_id: -1 combine_method: nms cpu_threads: 1 device: gpu enable_mkldnn: False enable_mkldnn_bfloat16: False image_dir: None image_file: dataset/fish/val/c000418.jpg match_metric: ios match_threshold: 0.6 model_dir: output_inference/ppyoloe_plus_crn_s_80e_coco output_dir: output overlap_ratio: [0.25, 0.25] random_pad: False reid_batch_size: 50 reid_model_dir: None run_benchmark: False run_mode: paddle save_images: True save_mot_txt_per_img: False save_mot_txts: False save_results: False scaled: False slice_infer: False slice_size: [640, 640] threshold: 0.5 tracker_config: None trt_calib_mode: False trt_max_shape: 1280 trt_min_shape: 1 trt_opt_shape: 640 use_coco_category: False use_dark: True use_gpu: False video_file: None window_size: 50 ------------------------------------------ ----------- Model Configuration ----------- Model Arch: YOLO Transform Order: --transform op: Resize --transform op: NormalizeImage --transform op: Permute -------------------------------------------- class_id:0, confidence:0.7396, left_top:[299.31,0.93],right_bottom:[331.32,32.59] class_id:0, confidence:0.7280, left_top:[293.39,100.21],right_bottom:[332.52,138.38] class_id:0, confidence:0.6212, left_top:[528.09,-1.20],right_bottom:[566.73,20.30] class_id:0, confidence:0.5986, left_top:[71.87,-0.63],right_bottom:[111.20,26.91] class_id:0, confidence:0.5766, left_top:[120.02,101.65],right_bottom:[181.98,152.03] class_id:0, confidence:0.5554, left_top:[405.12,215.59],right_bottom:[458.40,272.60] class_id:0, confidence:0.5486, left_top:[476.20,49.54],right_bottom:[511.81,85.39] class_id:1, confidence:0.6370, left_top:[446.48,281.86],right_bottom:[504.50,342.79] class_id:1, confidence:0.5986, left_top:[450.84,89.30],right_bottom:[494.58,123.99] class_id:1, confidence:0.5715, left_top:[510.54,89.15],right_bottom:[570.02,138.44] class_id:1, confidence:0.5151, left_top:[512.79,94.07],right_bottom:[562.67,133.47] save result to: output/c000418.jpg Test iter 0 ------------------ Inference Time Info ---------------------- total_time(ms): 1072.8, img_num: 1 average latency time(ms): 1072.80, QPS: 0.932140 preprocess_time(ms): 18.20, inference_time(ms): 1054.50, postprocess_time(ms): 0.10

本项目用EasyEdge端与边缘AI服务平台对水底生物目标检测模型进行部署。EasyEdge是基于百度飞桨轻量化推理框架Paddle Lite研发的端与边缘AI服务平台,能够帮助深度学习开发者将自建模型快速部署到设备端。只需上传模型,最快2分种即可获得适配终端硬件/芯片的模型。


至此,本项目的一个完整流程就结束啦!如果对此感兴趣,不妨动起来呀!
请点击此处查看本环境基本用法.
Please click here for more detailed instructions.
此文章为搬运
原项目链接
Copyright © 2003-2013 www.wpsshop.cn 版权所有,并保留所有权利。