
Converting YOLO annotations to the rotated-box VOC format of roLabelImg/labelImg2 (with PCA-based angle extraction)


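I had previously annotated about seventy images in YOLO format with labelImg, but I am now switching to a different detection model that needs a VOC-format dataset with angles. A conventional VOC annotation stores the box position as (xmin, ymin, xmax, ymax), whereas a rotated-rectangle dataset annotated with roLabelImg/labelImg2 stores (cx, cy, w, h, angle).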

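The first four values, cx, cy, w and h, can be computed from the x_0, y_0, w_0 and h_0 that YOLO provides, but YOLO carries no angle information. I therefore use the PCA-based orientation correction method (OpenCV) to estimate the angle of each target box, and then write everything out as VOC XML.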
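As a minimal sketch of the first step, the normalised YOLO values can be scaled by the image size to obtain cx, cy, w and h in pixels. The function name `yolo_to_pixel_box` and the sample label line are assumptions made for illustration; the script later in this article additionally applies offsets that are specific to my dataset.

```python
def yolo_to_pixel_box(line, img_w, img_h):
    """Convert one YOLO label line to (class_id, cx, cy, w, h) in pixels.

    A YOLO line is "class x_center y_center width height", with the four
    box values normalised to [0, 1] by the image width and height.
    """
    cls, x0, y0, w0, h0 = line.split()
    cx = float(x0) * img_w
    cy = float(y0) * img_h
    w = float(w0) * img_w
    h = float(h0) * img_h
    return cls, cx, cy, w, h


# Hypothetical label line for a 1280x800 image
print(yolo_to_pixel_box("0 0.5 0.25 0.1 0.2", 1280, 800))
```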

This article has two parts: the principal-direction extraction code and the VOC generation code.

1. Principal-direction extraction code (OpenCV)

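My objects are simple elliptical structures whose orientation is easy to extract, so PCA can be used to find the principal direction. The changes I made to the reference code are described below.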
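To make the idea concrete, here is a minimal pure-Python sketch of what PCA does for 2-D points: the first principal direction is the axis of largest variance, which has a closed form for a 2x2 covariance matrix. This is an illustration only, not the OpenCV routine used in this article; `principal_angle`, `ellipse_points` and the sample ellipse are assumptions made for the example.

```python
import math


def principal_angle(points):
    """Return the angle (radians, in (-pi/2, pi/2]) of the first principal axis."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / n
    syy = sum((p[1] - my) ** 2 for p in points) / n
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    # Orientation of the dominant eigenvector of [[sxx, sxy], [sxy, syy]]
    return 0.5 * math.atan2(2 * sxy, sxx - syy)


def ellipse_points(a, b, tilt, n=360):
    """Sample n points on an ellipse with semi-axes a, b rotated by tilt radians."""
    pts = []
    for k in range(n):
        t = 2 * math.pi * k / n
        x, y = a * math.cos(t), b * math.sin(t)
        pts.append((x * math.cos(tilt) - y * math.sin(tilt),
                    x * math.sin(tilt) + y * math.cos(tilt)))
    return pts


pts = ellipse_points(a=50, b=20, tilt=math.radians(30))
long_axis = principal_angle(pts)
short_axis = long_axis + math.pi / 2  # the second principal direction is perpendicular
print(math.degrees(long_axis), math.degrees(short_axis))
```

The long axis comes out at the tilt used to generate the ellipse, and the short axis is a quarter turn away from it. That is the relationship between the first and second principal directions that the rest of this section relies on.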

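The function takes (img, cx, cy, width, height) and returns an angle. The angle is defined as in the figure: it starts at the positive vertical axis and increases with clockwise rotation until it reaches π at the negative vertical axis; continuing the rotation, the angle starts again from 0 and increases to π until it is back at the positive vertical axis.

What I need is the angle of the short side. PCA yields two directions, the first and second principal components: the first is the direction of the long side in the figure, and the second is the short-side direction I need. In my results the second principal direction always fell in the first, third or fourth quadrant and never in the second (that is, theta_PCA never exceeded π/2), with only a few cases in the first quadrant, so the angle conversion is simple to design: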

theta = (theta_PCA - math.pi / 2) if (theta_PCA > math.pi / 2) else (theta_PCA + math.pi / 2)
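The one-line conversion above can be wrapped in a small function and tried on a few sample values. The name `pca_to_box_angle` is a hypothetical helper for illustration, and the input is assumed to be an `arctan2` result in radians.

```python
import math


def pca_to_box_angle(theta_pca):
    """Shift a PCA direction angle by a quarter turn, following the rule above.

    Angles above pi/2 are shifted down by pi/2; all others are shifted up by pi/2.
    """
    if theta_pca > math.pi / 2:
        return theta_pca - math.pi / 2
    return theta_pca + math.pi / 2


for deg in (-90, -45, 0, 45, 135):
    out = pca_to_box_angle(math.radians(deg))
    print(deg, "->", round(math.degrees(out), 1))
```

Note that -45° and 135° describe the same axis (they differ by half a turn), and both are mapped to 45°, so the result does not depend on which way the eigenvector happens to point.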

The get_angle function is as follows:

    import math

    import cv2
    import numpy as np


    def get_angle(img, cx, cy, width, height):
        # Based on "14.18 PCA-based orientation correction (OpenCV)":
        # https://blog.csdn.net/youcans/article/details/125782405
        # The crop window must be clamped to the image. A negative start index
        # would count from the far edge in NumPy and give a wrong crop.
        Pheight, Pwidth = img.shape[:2]
        x0 = max(int(cx - width / 2), 0)         # left edge
        y0 = max(int(cy - height / 2), 0)        # top edge
        x1 = min(int(cx + width / 2), Pwidth)    # right edge
        y1 = min(int(cy + height / 2), Pheight)  # bottom edge
        img_crop = img[y0:y1, x0:x1]             # slicing order is [y0:y1, x0:x1]
        # cv_show('img_crop', img_crop, True)
        gray = cv2.cvtColor(img_crop, cv2.COLOR_BGR2GRAY)
        # Inverse threshold: pixels at or below 200 become white foreground
        _, binary = cv2.threshold(gray, 200, 255, cv2.THRESH_BINARY_INV)
        # Find all contours and keep every contour pixel (OpenCV 4 return signature)
        contours, hierarchy = cv2.findContours(binary, cv2.RETR_TREE, cv2.CHAIN_APPROX_NONE)
        if not contours:
            raise ValueError("no contour found in the cropped region")
        # Keep the contour with the largest area, shape (N, 1, 2)
        cnt = max(contours, key=cv2.contourArea)
        # Draw only the largest contour (drawContours expects a list of contours)
        markedCnt = np.zeros(img_crop.shape[:2], np.uint8)
        cv2.drawContours(markedCnt, [cnt], -1, 255, thickness=3)
        # PCA on the contour points gives the principal directions of the object
        ptsXY = cnt.reshape(-1, 2).astype(np.float64)  # (N, 1, 2) -> (N, 2)
        mean, eigenvectors, eigenvalues = cv2.PCACompute2(ptsXY, np.array([]))  # (1, 2) (2, 2) (2, 1)
        center = mean[0, :].astype(int)  # approximate centre of the object
        # e1xy = eigenvectors[0, :] * eigenvalues[0, 0]  # first principal axis
        e2xy = eigenvectors[1, :] * eigenvalues[1, 0]  # second principal axis
        # np.int has been removed from recent NumPy releases; use the built-in int
        p2 = (center + 0.1 * e2xy).astype(int)
        # Angle of the second principal direction, in radians
        theta = np.arctan2(eigenvectors[1, 1], eigenvectors[1, 0])
        # Debug drawing: arrow from the centre along the second principal axis
        cv2.arrowedLine(markedCnt, tuple(map(int, center)), tuple(map(int, p2)),
                        255, thickness=2, tipLength=0.2)
        # cv_show('markedCnt', markedCnt, True)
        # Convert the PCA angle to the annotation angle
        theta = (theta - math.pi / 2) if (theta > math.pi / 2) else (theta + math.pi / 2)
        return float(theta)
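The crop step is where out-of-range coordinates can cause trouble, so it may help to look at it in isolation. Below is a minimal sketch of clamping a centre/size window to the image bounds; `clamp_crop` is a hypothetical helper, not part of the original script.

```python
def clamp_crop(cx, cy, width, height, img_w, img_h):
    """Return (x0, y0, x1, y1) of a window centred on (cx, cy), clamped to the image."""
    x0 = min(max(int(cx - width / 2), 0), img_w)
    y0 = min(max(int(cy - height / 2), 0), img_h)
    x1 = min(max(int(cx + width / 2), 0), img_w)
    y1 = min(max(int(cy + height / 2), 0), img_h)
    return x0, y0, x1, y1


# A window hanging over the top-left corner of a 1024x1024 image
print(clamp_crop(30, 20, 100, 80, 1024, 1024))
# A window hanging over the bottom-right corner
print(clamp_crop(1000, 1010, 100, 80, 1024, 1024))
```

The returned values can be used directly as `img[y0:y1, x0:x1]`.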

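2. VOC generation code

Compared with the reference code, I made the following changes so that the output is a rotated-rectangle VOC dataset that roLabelImg/labelImg2 can read:

  1. Added the second-level node <path>.

  2. Added the second-level node <source> and its third-level child <database>.

  3. Added the second-level node <segmented>.

  4. Renamed the third-level node <bndbox> under the second-level node <object> to <robndbox>, and changed its fourth-level children from <xmin>, <ymin>, <xmax>, <ymax> to <cx>, <cy>, <w>, <h>, <angle>.

  5. Set the remaining nodes <folder>, <filename>, <path> and <size> according to the current dataset. I used an absolute path for <path>; I have not checked whether a relative path also works.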
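For orientation, the node layout described in the list can be sketched with `xml.dom.minidom` from the standard library. The numeric values and the helper name `build_object` are made up for illustration.

```python
from xml.dom.minidom import Document


def build_object(doc, name, cx, cy, w, h, angle):
    """Create an <object> element holding a rotated box (<robndbox>)."""
    obj = doc.createElement("object")
    name_node = doc.createElement("name")
    name_node.appendChild(doc.createTextNode(name))
    obj.appendChild(name_node)
    box = doc.createElement("robndbox")
    for tag, value in (("cx", cx), ("cy", cy), ("w", w), ("h", h), ("angle", angle)):
        node = doc.createElement(tag)
        node.appendChild(doc.createTextNode(str(value)))
        box.appendChild(node)
    obj.appendChild(box)
    return obj


doc = Document()
annotation = doc.createElement("annotation")
doc.appendChild(annotation)
annotation.appendChild(build_object(doc, "single", 512.0, 400.0, 120.0, 60.0, 0.7854))
print(doc.toprettyxml(indent="  "))
```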

The rest is similar to the reference code and fairly easy to follow, so I will not go into detail. The makexml code is as follows:

    import os
    from xml.dom.minidom import Document

    import cv2


    def makexml(picPath, txtPath, xmlPath):  # image folder, YOLO txt folder, XML output folder
        """Convert YOLO-format txt annotations into rotated-box VOC XML annotations."""
        # Class id -> class name; must match classes.txt, in the same order
        dic = {'0': "single",
               '1': "overlap"}

        def add_text(doc, parent, tag, text):
            # Append <tag>text</tag> to parent and return the new element
            node = doc.createElement(tag)
            node.appendChild(doc.createTextNode(str(text)))
            parent.appendChild(node)
            return node

        for name in os.listdir(txtPath):
            if name == 'classes.txt':
                continue
            with open(os.path.join(txtPath, name)) as txtFile:
                txtList = txtFile.readlines()
            add_0 = name.split('.')[0].rjust(4, '0')  # left-pad the file number with zeros
            img_name = "images1_" + add_0 + ".png"
            img = cv2.imread(os.path.join(picPath, img_name))
            if img is None:  # cv2.imread returns None when the file cannot be read
                print(img_name, 'could not be read, skipped')
                continue
            # if cv_show(img_name, img, True) is None:
            #     break
            Pheight, Pwidth, Pdepth = img.shape

            xmlBuilder = Document()
            annotation = xmlBuilder.createElement("annotation")
            xmlBuilder.appendChild(annotation)
            add_text(xmlBuilder, annotation, "folder", "image")
            add_text(xmlBuilder, annotation, "filename", img_name)
            add_text(xmlBuilder, annotation, "path",
                     os.path.abspath(os.path.join(picPath, img_name)))
            source = xmlBuilder.createElement("source")
            annotation.appendChild(source)
            add_text(xmlBuilder, source, "database", "Unknown")
            size = xmlBuilder.createElement("size")
            annotation.appendChild(size)
            add_text(xmlBuilder, size, "width", Pwidth)
            add_text(xmlBuilder, size, "height", Pheight)
            add_text(xmlBuilder, size, "depth", Pdepth)
            add_text(xmlBuilder, annotation, "segmented", "0")

            for line in txtList:
                oneline = line.split()
                if not oneline:  # skip blank lines
                    continue
                obj = xmlBuilder.createElement("object")  # avoid shadowing the built-in object
                annotation.appendChild(obj)
                add_text(xmlBuilder, obj, "name", dic[oneline[0]])
                add_text(xmlBuilder, obj, "pose", "Unspecified")
                add_text(xmlBuilder, obj, "truncated", "0")
                add_text(xmlBuilder, obj, "difficult", "0")
                # The scale factors (1280, 800) and offsets (-256/2, +224/2) are
                # specific to my dataset; adjust them to your own images
                cxData = float(oneline[1]) * 1280 - 256 / 2
                cyData = float(oneline[2]) * 800 + 224 / 2
                wData = float(oneline[3]) * 1280
                hData = float(oneline[4]) * 800
                # The crop window passed to get_angle is 1.25x the box size
                angleData = get_angle(img, cxData, cyData, 1.25 * wData, 1.25 * hData)
                robndbox = xmlBuilder.createElement("robndbox")
                obj.appendChild(robndbox)
                add_text(xmlBuilder, robndbox, "cx", cxData)
                add_text(xmlBuilder, robndbox, "cy", cyData)
                add_text(xmlBuilder, robndbox, "w", wData)
                add_text(xmlBuilder, robndbox, "h", hData)
                add_text(xmlBuilder, robndbox, "angle", angleData)

            xml_file = os.path.join(xmlPath, "images1_" + add_0 + ".xml")
            with open(xml_file, 'w', encoding='utf-8') as f:
                xmlBuilder.writexml(f, indent='\t', newl='\n', addindent='\t', encoding='utf-8')
            print(xml_file, 'created successfully!')
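Once the XML files are written, a quick way to sanity-check one is to parse it back. The sketch below uses `xml.etree.ElementTree` on an inline sample string; the sample values are invented and `read_robndboxes` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<annotation>
  <filename>images1_0001.png</filename>
  <object>
    <name>single</name>
    <robndbox>
      <cx>512.0</cx><cy>400.0</cy><w>120.0</w><h>60.0</h><angle>0.7854</angle>
    </robndbox>
  </object>
</annotation>"""


def read_robndboxes(xml_text):
    """Return a list of (name, cx, cy, w, h, angle) tuples from an annotation."""
    root = ET.fromstring(xml_text)
    boxes = []
    for obj in root.iter("object"):
        box = obj.find("robndbox")
        values = [float(box.findtext(tag)) for tag in ("cx", "cy", "w", "h", "angle")]
        boxes.append((obj.findtext("name"), *values))
    return boxes


print(read_robndboxes(SAMPLE))
```

For a file on disk, `ET.parse(path).getroot()` can take the place of `ET.fromstring`.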

Complete code. The full script consists of the imports and the cv_show helper below, the get_angle and makexml functions exactly as listed in sections 1 and 2, and the entry point:

    from xml.dom.minidom import Document
    import os
    import math

    import cv2
    import numpy as np


    def cv_show(name, img, pause=False):
        # Show an image; with pause=True wait for a key press, otherwise wait 1 ms
        cv2.imshow(name, img)
        cv2.waitKey(0 if pause else 1)
        cv2.destroyAllWindows()
        return 1


    # def get_angle(img, cx, cy, width, height): same as the listing in section 1
    # def makexml(picPath, txtPath, xmlPath): same as the listing in section 2


    if __name__ == "__main__":
        picPath = "image/"  # folder containing the images; keep the trailing /
        txtPath = "yolo/"   # folder containing the YOLO txt files; keep the trailing /
        xmlPath = "voc/"    # folder where the XML files are saved; keep the trailing /
        makexml(picPath, txtPath, xmlPath)

References:

  1. [OpenCV Examples 300] 237. Orientation correction based on principal component extraction (OpenCV), YouCans' blog, CSDN

  2. Converting between YOLO and VOC formats, in detail, @秋野's blog, CSDN
