
[PyTorch] YOLO-Based Multi-Object Detection Project (Part 1)

Object detection is the process of locating and classifying the objects present in an image. Each detected object is marked in the image with a bounding box. There are two general families of object detection methods: region-proposal-based and regression/classification-based. Here we use a regression/classification-based method called YOLO.

Contents

Preparing the COCO Dataset

Creating a Custom Dataset

Transforming the Data

Defining the Data Loaders


Preparing the COCO Dataset

COCO is a large-scale object detection, segmentation, and captioning dataset. It contains 80 object categories for object detection.

Download the following GitHub repository:

https://github.com/pjreddie/darknet

Create a folder named config and copy the darknet/cfg/coco.data and darknet/cfg/yolov3.cfg files into it.

Create a folder named data, get the coco.names file from the following link, and place it in the data folder. The coco.names file contains the list of the 80 object categories in the COCO dataset.

https://github.com/pjreddie/darknet/blob/master/data/coco.names

Copy the darknet/scripts/get_coco_dataset.sh file into the data folder. Next, open a terminal and run get_coco_dataset.sh; the script downloads the full COCO dataset into a subfolder named coco. Alternatively, the dataset can be downloaded from the following link:

COCO2014 dataset on Baidu AI Studio: https://aistudio.baidu.com/datasetdetail/165195

In the images folder there are two subfolders, train2014 and val2014, containing 82,783 and 40,504 images respectively. In the labels folder there are two subfolders, also named train2014 and val2014, containing 82,081 and 40,137 text files respectively. These text files hold the bounding-box coordinates of the objects in each image. In addition, the trainvalno5k.txt file lists the 117,264 images that will be used to train the model; this list combines the images of train2014 and val2014, excluding 5,000 images. The 5k.txt file lists the 5,000 images that will be used for validation.
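The label files follow the darknet convention: one row per object, containing the class index followed by the box center and size, all normalized by the image width and height. This is the format that the dataset class below relies on. A minimal sketch of inspecting one of these files (the file name is only an illustration; any file under data/coco/labels/train2014 works):

import numpy as np

# Hypothetical example path; substitute any file from data/coco/labels/train2014
path2label = "./data/coco/labels/train2014/COCO_train2014_000000000009.txt"

# Each row: class_id, x_center, y_center, width, height
# (center and size are fractions of the image width/height)
boxes = np.loadtxt(path2label).reshape(-1, 5)
print("objects in this image:", boxes.shape[0])
print(boxes)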

Creating a Custom Dataset

After the dataset has been downloaded, use PyTorch's Dataset and DataLoader classes to create the training and validation datasets and their data loaders.

from torch.utils.data import Dataset
from PIL import Image
import torchvision.transforms.functional as TF
import os
import numpy as np
import torch

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
print(torch.__version__)
# Define the CocoDataset class; later we show some sample images
# from the training and validation datasets.
class CocoDataset(Dataset):
    def __init__(self, path2listFile, transform=None, trans_params=None):
        # The list file (e.g. trainvalno5k.txt) holds one image path per line
        with open(path2listFile, "r") as file:
            self.path2imgs = file.readlines()
        # Each label file mirrors its image path: images -> labels, .jpg/.png -> .txt
        self.path2labels = [
            path.replace("images", "labels").replace(".png", ".txt").replace(".jpg", ".txt")
            for path in self.path2imgs]
        self.trans_params = trans_params
        self.transform = transform

    def __len__(self):
        return len(self.path2imgs)

    def __getitem__(self, index):
        path2img = self.path2imgs[index % len(self.path2imgs)].rstrip()
        img = Image.open(path2img).convert('RGB')

        path2label = self.path2labels[index % len(self.path2imgs)].rstrip()
        labels = None
        if os.path.exists(path2label):
            # Each row: class_id, x_center, y_center, width, height (normalized)
            labels = np.loadtxt(path2label).reshape(-1, 5)

        if self.transform:
            img, labels = self.transform(img, labels, self.trans_params)
        return img, labels, path2img
root_data = "./data/coco"
path2trainList = os.path.join(root_data, "trainvalno5k.txt")

coco_train = CocoDataset(path2trainList)
print(len(coco_train))

 

# Get an image, its labels, and the image path from coco_train
img, labels, path2img = coco_train[1]
print("image size:", img.size, type(img))
print("labels shape:", labels.shape, type(labels))
print("labels \n", labels)

path2valList = os.path.join(root_data, "5k.txt")
coco_val = CocoDataset(path2valList, transform=None, trans_params=None)
print(len(coco_val))

img, labels, path2img = coco_val[7]
print("image size:", img.size, type(img))
print("labels shape:", labels.shape, type(labels))
print("labels \n", labels)

import matplotlib.pylab as plt
import numpy as np
from PIL import Image, ImageDraw, ImageFont
from torchvision.transforms.functional import to_pil_image
import random
%matplotlib inline

# Load the 80 COCO class names
path2cocoNames = "./data/coco.names"
fp = open(path2cocoNames, "r")
coco_names = fp.read().split("\n")[:-1]
print("number of classes:", len(coco_names))
print(coco_names)

def rescale_bbox(bb, W, H):
    # Convert a normalized (xc, yc, w, h) box to pixel coordinates
    x, y, w, h = bb
    return [x*W, y*H, w*W, h*H]

COLORS = np.random.randint(0, 255, size=(80, 3), dtype="uint8")
# Use a font file that exists on your system, e.g.:
# fnt = ImageFont.truetype('Pillow/Tests/fonts/FreeMono.ttf', 16)
fnt = ImageFont.truetype('arial.ttf', 16)

def show_img_bbox(img, targets):
    if torch.is_tensor(img):
        img = to_pil_image(img)
    if torch.is_tensor(targets):
        # Drop the batch-index column so rows become (class, xc, yc, w, h)
        targets = targets.numpy()[:, 1:]
    W, H = img.size
    draw = ImageDraw.Draw(img)
    for tg in targets:
        id_ = int(tg[0])
        bbox = tg[1:]
        bbox = rescale_bbox(bbox, W, H)
        xc, yc, w, h = bbox
        color = [int(c) for c in COLORS[id_]]
        name = coco_names[id_]
        draw.rectangle(((xc-w/2, yc-h/2), (xc+w/2, yc+h/2)), outline=tuple(color), width=3)
        draw.text((xc-w/2, yc-h/2), name, font=fnt, fill=(255, 255, 255, 0))
    plt.imshow(np.array(img))

np.random.seed(1)
rnd_ind = np.random.randint(len(coco_train))
img, labels, path2img = coco_train[rnd_ind]
print(img.size, labels.shape)

plt.rcParams['figure.figsize'] = (20, 10)
show_img_bbox(img, labels)

np.random.seed(1)
rnd_ind = np.random.randint(len(coco_val))
img, labels, path2img = coco_val[rnd_ind]
print(img.size, labels.shape)

plt.rcParams['figure.figsize'] = (20, 10)
show_img_bbox(img, labels)

Transforming the Data

Define a transform function and the parameters that are passed to the CocoDataset class. The transform pads each image to a square, resizes it to the target size (416×416), optionally flips it horizontally, converts it to a tensor, and packs the labels into a six-column target tensor.

def pad_to_square(img, boxes, pad_value=0, normalized_labels=True):
    w, h = img.size
    w_factor, h_factor = (w, h) if normalized_labels else (1, 1)

    # Split the size difference between the two sides that get padded
    dim_diff = np.abs(h - w)
    pad1 = dim_diff // 2
    pad2 = dim_diff - pad1
    if h <= w:
        left, top, right, bottom = 0, pad1, 0, pad2
    else:
        left, top, right, bottom = pad1, 0, pad2, 0
    padding = (left, top, right, bottom)

    img_padded = TF.pad(img, padding=padding, fill=pad_value)
    w_padded, h_padded = img_padded.size

    # Convert (xc, yc, w, h) to corner coordinates in pixels
    x1 = w_factor * (boxes[:, 1] - boxes[:, 3] / 2)
    y1 = h_factor * (boxes[:, 2] - boxes[:, 4] / 2)
    x2 = w_factor * (boxes[:, 1] + boxes[:, 3] / 2)
    y2 = h_factor * (boxes[:, 2] + boxes[:, 4] / 2)

    # Shift the corners by the added padding
    x1 += padding[0]  # left
    y1 += padding[1]  # top
    x2 += padding[2]  # right
    y2 += padding[3]  # bottom

    # Re-normalize with respect to the padded image size
    boxes[:, 1] = ((x1 + x2) / 2) / w_padded
    boxes[:, 2] = ((y1 + y2) / 2) / h_padded
    boxes[:, 3] *= w_factor / w_padded
    boxes[:, 4] *= h_factor / h_padded
    return img_padded, boxes
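A quick sanity check of pad_to_square on a synthetic landscape image helps confirm that the boxes stay aligned after padding; the image size and box values below are made up purely for illustration:

# 640x480 dummy image with a single box centered in the image
dummy_img = Image.new("RGB", (640, 480))
dummy_boxes = np.array([[0, 0.5, 0.5, 0.2, 0.3]])   # class, xc, yc, w, h (normalized)

img_sq, boxes_sq = pad_to_square(dummy_img, dummy_boxes.copy())
print(img_sq.size)   # (640, 640): 80 pixels of padding added at the top and bottom
print(boxes_sq)      # the center stays at (0.5, 0.5); the height becomes 0.3*480/640 = 0.225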
def hflip(image, labels):
    # Horizontal flip: mirror the image and the normalized x centers
    image = TF.hflip(image)
    labels[:, 1] = 1.0 - labels[:, 1]
    return image, labels

def transformer(image, labels, params):
    if params["pad2square"] is True:
        image, labels = pad_to_square(image, labels)
    image = TF.resize(image, params["target_size"])
    if random.random() < params["p_hflip"]:
        image, labels = hflip(image, labels)
    image = TF.to_tensor(image)

    # Targets: [batch_index placeholder, class, xc, yc, w, h];
    # column 0 is filled in later by the data loader's collate function
    targets = torch.zeros((len(labels), 6))
    targets[:, 1:] = torch.from_numpy(labels)
    return image, targets
trans_params_train = {
    "target_size": (416, 416),
    "pad2square": True,
    "p_hflip": 1.0,
    "normalized_labels": True,
}
coco_train = CocoDataset(path2trainList, transform=transformer, trans_params=trans_params_train)

np.random.seed(100)
rnd_ind = np.random.randint(len(coco_train))
img, targets, path2img = coco_train[rnd_ind]
print("image shape:", img.shape)
print("labels shape:", targets.shape)

plt.rcParams['figure.figsize'] = (20, 10)
COLORS = np.random.randint(0, 255, size=(80, 3), dtype="uint8")
show_img_bbox(img, targets)

Define a CocoDataset object for the validation data by passing it the transformer function:

trans_params_val = {
    "target_size": (416, 416),
    "pad2square": True,
    "p_hflip": 0.0,
    "normalized_labels": True,
}
coco_val = CocoDataset(path2valList,
                       transform=transformer,
                       trans_params=trans_params_val)

np.random.seed(55)
rnd_ind = np.random.randint(len(coco_val))
img, targets, path2img = coco_val[rnd_ind]
print("image shape:", img.shape)
print("labels shape:", targets.shape)

plt.rcParams['figure.figsize'] = (20, 10)
COLORS = np.random.randint(0, 255, size=(80, 3), dtype="uint8")
show_img_bbox(img, targets)

 

Defining the Data Loaders

Define two data loaders for the training and validation datasets, which fetch mini-batches from coco_train and coco_val. Because each image can contain a different number of objects, a custom collate function concatenates the target rows of a batch into a single tensor and records, in column 0 of each row, the index of the image the row belongs to.

from torch.utils.data import DataLoader

batch_size = 8

def collate_fn(batch):
    # Each image may hold a different number of boxes, so the default
    # collation cannot stack the targets; concatenate them instead
    imgs, targets, paths = list(zip(*batch))
    # Drop samples that have no label file
    targets = [boxes for boxes in targets if boxes is not None]
    # Column 0 of every target row stores the index of its image in the batch
    for b_i, boxes in enumerate(targets):
        boxes[:, 0] = b_i
    targets = torch.cat(targets, 0)
    imgs = torch.stack([img for img in imgs])
    return imgs, targets, paths

train_dl = DataLoader(
    coco_train,
    batch_size=batch_size,
    shuffle=True,
    num_workers=0,
    pin_memory=True,
    collate_fn=collate_fn,
)

torch.manual_seed(0)
for imgs_batch, tg_batch, path_batch in train_dl:
    break
print(imgs_batch.shape)
print(tg_batch.shape, tg_batch.dtype)
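Since the collate function writes the image index into column 0 of every target row, the rows of tg_batch can be grouped back per image. A small check on the batch fetched above:

# Count the ground-truth boxes belonging to each of the batch_size images
for b_i in range(imgs_batch.shape[0]):
    n_boxes = (tg_batch[:, 0] == b_i).sum().item()
    print("image", b_i, "-> boxes:", n_boxes)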

 

val_dl = DataLoader(
    coco_val,
    batch_size=batch_size,
    shuffle=False,
    num_workers=0,
    pin_memory=True,
    collate_fn=collate_fn,
)

torch.manual_seed(0)
for imgs_batch, tg_batch, path_batch in val_dl:
    break
print(imgs_batch.shape)
print(tg_batch.shape, tg_batch.dtype)
