
The KITTI Dataset for 3D Object Detection (Label Format, 3D Box Visualization, Point Cloud to Image Projection, BEV Bird's-Eye View)


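This article explains how to understand and use the KITTI dataset for 3D object detection. It covers the basics of KITTI, downloading the data, the label format, 3D box visualization, projecting the point cloud onto the image, and drawing BEV (bird's-eye view) images, with implementation code throughout.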

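Contents

1. Visualizing 3D Boxes on the KITTI Dataset

2. The KITTI 3D Dataset

3. Downloading the Dataset

4. Label Format

5. Calibration Parameters

6. Point Cloud --> Projection onto the Image

7. Image --> Projection into the Point Cloud

8. Visualizing 2D and 3D Results on the Image

9. 3D Point Cloud Results --> BEV Image Results (Coordinate Conversion)

10. Drawing the BEV Image

11. Drawing 2D Boxes on the BEV Image

12. Complete Project Code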


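1. Visualizing 3D Boxes on the KITTI Dataset

2. The KITTI 3D Dataset

Basic facts about the KITTI 3D dataset:

The entire KITTI dataset was recorded in Karlsruhe, Germany, over about 6 hours of driving. The data released on the KITTI website is roughly 25% of everything recorded, with the segments related to the test set removed. By scene, it falls into five categories: Road, City, Residential, Campus, and Person.

Sensor configuration:

Sensor mounting positions: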


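3. Downloading the Dataset

The KITTI Vision Benchmark Suite (cvlibs.net)

Downloading from the official site requires a registered account; alternatively, the data can be fetched from a Baidu Netdisk mirror. The file formats are as follows:

Images: xxx.png

Point clouds: xxx.bin (stored as raw binary)

Calibration: xxx.txt (a single file holding the projection matrices of each camera, the rectification matrix, the LiDAR-to-camera transform, and the IMU-to-LiDAR transform)

Labels: xxx.txt (class, truncation, occlusion, observation angle, top-left and bottom-right corners of the 2D box, 3D object size as height/width/length, 3D object location as x/y/z, and the yaw angle rotation_y; detection result files additionally append a confidence score)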

4. Label Format

Sample label: Pedestrian 0.00 0 -0.20 712.40 143.00 810.73 307.92 1.89 0.48 1.20 1.84 1.47 8.41 0.01
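To make the field order concrete, here is a minimal sketch that splits the sample label above into named fields. The `KittiLabel` type and `parse_label_line` function are hypothetical names introduced only for this illustration; the field order follows the `Object3d` class in the project code at the end of this article.

```python
from typing import NamedTuple, Tuple


class KittiLabel(NamedTuple):
    """One object from a KITTI label file (illustrative container)."""
    type: str                               # 'Car', 'Pedestrian', ...
    truncated: float                        # 0 (fully visible) .. 1 (fully truncated)
    occluded: int                           # 0, 1, 2 or 3
    alpha: float                            # observation angle in radians
    bbox: Tuple[float, float, float, float] # left, top, right, bottom (pixels)
    dimensions: Tuple[float, float, float]  # height, width, length (metres)
    location: Tuple[float, float, float]    # x, y, z in camera coordinates (metres)
    rotation_y: float                       # yaw around the camera Y axis (radians)


def parse_label_line(line: str) -> KittiLabel:
    """Parse one whitespace-separated label line (15 fields)."""
    fields = line.split()
    values = [float(v) for v in fields[1:15]]
    return KittiLabel(
        type=fields[0],
        truncated=values[0],
        occluded=int(values[1]),
        alpha=values[2],
        bbox=tuple(values[3:7]),
        dimensions=tuple(values[7:10]),
        location=tuple(values[10:13]),
        rotation_y=values[13],
    )


label = parse_label_line(
    "Pedestrian 0.00 0 -0.20 712.40 143.00 810.73 307.92 "
    "1.89 0.48 1.20 1.84 1.47 8.41 0.01"
)
print(label)
```

Read this way, the sample describes a fully visible pedestrian 1.89 m tall, about 8.4 m in front of the camera, and the final value 0.01 is the yaw angle rather than a confidence score.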

This video is a useful companion at this point:

An introduction to Nuscenes, KITTI and other open-source BEV datasets

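5. Calibration Parameters

Next, the calibration parameters:

P0-P3: the 3×4 projection matrices of the cameras, which contain the camera intrinsics. Indices 0 to 3 correspond to the left grayscale, right grayscale, left color, and right color cameras.

R0_rect: the 3×3 rectifying rotation matrix of the reference camera (Cam 0). It rotates reference camera coordinates into rectified camera coordinates.

Tr_velo_to_cam: the 3×4 transform from the LiDAR coordinate system to the Cam 0 coordinate system.

Tr_imu_to_velo: the 3×4 transform from the IMU coordinate system to the LiDAR coordinate system.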
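As a rough sketch of how these entries become matrices, the snippet below parses `key: values` lines and reshapes them to the sizes listed above. The `parse_calib_text` helper is a hypothetical name, and the numbers in `sample` are made up for illustration; they are not real KITTI calibration values.

```python
import numpy as np


def parse_calib_text(text):
    """Parse 'key: v1 v2 ...' lines into a dict of flat float arrays."""
    calib = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        key, value = line.split(":", 1)
        calib[key] = np.array([float(v) for v in value.split()])
    return calib


# Hypothetical values, laid out like a KITTI calibration file.
sample = """
P2: 700 0 600 45 0 700 180 0 0 0 1 0
R0_rect: 1 0 0 0 1 0 0 0 1
Tr_velo_to_cam: 0 -1 0 0 0 0 -1 -0.08 1 0 0 -0.27
"""

calib = parse_calib_text(sample)
P2 = calib["P2"].reshape(3, 4)                          # projection matrix
R0_rect = calib["R0_rect"].reshape(3, 3)                # rectifying rotation
Tr_velo_to_cam = calib["Tr_velo_to_cam"].reshape(3, 4)  # [R | t]
print(P2.shape, R0_rect.shape, Tr_velo_to_cam.shape)
```

Each matrix is stored row by row on a single line, so a plain `reshape` recovers it.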

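6. Point Cloud --> Projection onto the Image

Given point cloud data, how is it projected onto the image? At its core this is a chain of coordinate transforms:

  1. Start with a point (x, y, z) in the LiDAR coordinate system.
  2. Transform from the LiDAR coordinate system to the camera coordinate system using the Tr_velo_to_cam matrix from the calibration file, giving camera coordinates (x1, y1, z1).
  3. Rectify the camera coordinates using the R0_rect matrix from the calibration file, giving rectified camera coordinates (x2, y2, z2).
  4. Project from the rectified camera coordinate system to the image using the camera projection matrix, which holds the intrinsics (the code in this article uses P2, the left color camera), giving image coordinates (u, v).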
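The four steps above can be sketched in a few lines of NumPy. The calibration values here are hypothetical (an identity rectification and a pure axis swap between LiDAR and camera), chosen only so the result is easy to check by hand; `velo_to_image` is an illustrative name, not a function from the project code.

```python
import numpy as np

# Hypothetical calibration, NOT real KITTI values.
P2 = np.array([[700.0, 0.0, 600.0, 0.0],
               [0.0, 700.0, 180.0, 0.0],
               [0.0, 0.0, 1.0, 0.0]])
R0_rect = np.eye(3)
# LiDAR (front x, left y, up z) -> camera (right x, down y, front z)
Tr_velo_to_cam = np.array([[0.0, -1.0, 0.0, 0.0],
                           [0.0, 0.0, -1.0, 0.0],
                           [1.0, 0.0, 0.0, 0.0]])


def velo_to_image(pts_velo):
    """Project n x 3 LiDAR points to n x 2 pixel coordinates plus depth."""
    n = pts_velo.shape[0]
    pts_hom = np.hstack([pts_velo, np.ones((n, 1))])       # step 1: n x 4
    pts_ref = pts_hom @ Tr_velo_to_cam.T                   # step 2: camera coords
    pts_rect = pts_ref @ R0_rect.T                         # step 3: rectified coords
    pts_rect_hom = np.hstack([pts_rect, np.ones((n, 1))])
    uvw = pts_rect_hom @ P2.T                              # step 4: homogeneous pixels
    uv = uvw[:, :2] / uvw[:, 2:3]                          # divide by depth
    return uv, pts_rect[:, 2]


pts = np.array([[10.0, 0.0, 0.0],    # 10 m straight ahead
                [10.0, 1.0, 0.0]])   # 10 m ahead, 1 m to the left
uv, depth = velo_to_image(pts)
print(uv, depth)
```

With these values the point straight ahead lands on the principal point (600, 180), and the point 1 m to the left lands 70 pixels further left, at u = 530.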

Example result:

Interface code:

    import cv2
    import numpy as np
    import matplotlib.pyplot as plt
    from PIL import Image

    def show_lidar_on_image(pc_velo, img, calib, img_width, img_height):
        ''' Project LiDAR points onto the image, colored by depth. '''
        # Keep only the points that fall inside the image field of view
        imgfov_pc_velo, pts_2d, fov_inds = get_lidar_in_image_fov(
            pc_velo, calib, 0, 0, img_width, img_height, True)
        imgfov_pts_2d = pts_2d[fov_inds, :]
        # Rectified camera coordinates give the depth (z) of each point
        imgfov_pc_rect = calib.project_velo_to_rect(imgfov_pc_velo)

        # 256-entry HSV lookup table, scaled to 0..255
        cmap = plt.get_cmap('hsv', 256)
        cmap = np.array([cmap(i) for i in range(256)])[:, :3] * 255

        for i in range(imgfov_pts_2d.shape[0]):
            depth = imgfov_pc_rect[i, 2]
            # Closer points get a larger index; clamp it so that
            # depths below 2.5 m cannot run past the end of the table
            color = cmap[min(int(640.0 / depth), 255), :]
            cv2.circle(img, (int(np.round(imgfov_pts_2d[i, 0])),
                             int(np.round(imgfov_pts_2d[i, 1]))),
                       2, color=tuple(color), thickness=-1)
        Image.fromarray(img).save('save_output/lidar_on_image.png')
        Image.fromarray(img).show()
        return img

Core code:

    def get_lidar_in_image_fov(pc_velo, calib, xmin, ymin, xmax, ymax,
                               return_more=False, clip_distance=2.0):
        ''' Filter LiDAR points, keeping those inside the image FOV. '''
        # Project every point to pixel coordinates
        pts_2d = calib.project_velo_to_image(pc_velo)
        # Keep points whose projection falls inside [xmin, xmax) x [ymin, ymax)
        fov_inds = (pts_2d[:, 0] < xmax) & (pts_2d[:, 0] >= xmin) & \
                   (pts_2d[:, 1] < ymax) & (pts_2d[:, 1] >= ymin)
        # Drop points behind the sensor or closer than clip_distance metres
        fov_inds = fov_inds & (pc_velo[:, 0] > clip_distance)
        imgfov_pc_velo = pc_velo[fov_inds, :]
        if return_more:
            return imgfov_pc_velo, pts_2d, fov_inds
        else:
            return imgfov_pc_velo

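7. Image --> Projection into the Point Cloud

Given RGB image data, how is it projected into the point cloud? This is again a chain of coordinate transforms, the inverse of the one above:

  1. Start with a pixel (u, v) in the image coordinate system.
  2. Transform from the image coordinate system to the camera coordinate system by inverting the projection with the camera intrinsics, giving camera coordinates (x, y, z). A pixel alone does not determine depth, so a depth value for each pixel is also needed; project_image_to_rect in the code below takes (u, v, depth).
  3. Undo the rectification using the inverse of the R0_rect matrix, giving camera coordinates (x1, y1, z1).
  4. Transform from the camera coordinate system to the LiDAR coordinate system using the inverse of the Tr_velo_to_cam matrix, giving LiDAR coordinates (x2, y2, z2).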
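A minimal sketch of this inverse chain, assuming a known depth per pixel and a projection matrix with zero baseline offset (so the b_x and b_y terms used by project_image_to_rect vanish). All calibration values and the name `image_to_velo` are hypothetical.

```python
import numpy as np

# Hypothetical intrinsics and extrinsics, NOT real KITTI values.
f_u, f_v, c_u, c_v = 700.0, 700.0, 600.0, 180.0
R0_rect = np.eye(3)
R = np.array([[0.0, -1.0, 0.0],     # rotation part of Tr_velo_to_cam
              [0.0, 0.0, -1.0],
              [1.0, 0.0, 0.0]])
t = np.array([0.0, -0.08, -0.27])   # translation part of Tr_velo_to_cam


def image_to_velo(uv_depth):
    """Map n x 3 rows of (u, v, depth) back to n x 3 LiDAR points."""
    u, v, z = uv_depth[:, 0], uv_depth[:, 1], uv_depth[:, 2]
    # Step 2: pixel + depth -> rectified camera coordinates
    x = (u - c_u) * z / f_u
    y = (v - c_v) * z / f_v
    pts_rect = np.stack([x, y, z], axis=1)
    # Step 3: undo the rectifying rotation
    pts_ref = pts_rect @ np.linalg.inv(R0_rect).T
    # Step 4: invert x_ref = R x_velo + t, i.e. x_velo = R^T (x_ref - t)
    return (pts_ref - t) @ R


velo = image_to_velo(np.array([[600.0, 180.0, 10.0]]))
print(velo)
```

The pixel at the principal point with 10 m depth maps back to a LiDAR point about 10.27 m ahead and 0.08 m below the sensor origin, which is exactly the translation offset between the two sensors in this made-up setup.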

8. Visualizing 2D and 3D Results on the Image

First, the 2D boxes:

Then the 3D boxes:

Interface code:

    def show_image_with_boxes(img, objects, calib, show3d=True):
        ''' Draw 2D boxes and 3D boxes on the image. '''
        img1 = np.copy(img)  # for 2D boxes
        img2 = np.copy(img)  # for 3D boxes
        for obj in objects:
            if obj.type == 'DontCare':
                continue
            # Draw the 2D box
            cv2.rectangle(img1, (int(obj.xmin), int(obj.ymin)),
                          (int(obj.xmax), int(obj.ymax)), (0, 255, 0), 2)
            # 3D box corners in the image (8x2) and in camera coordinates (8x3)
            box3d_pts_2d, box3d_pts_3d = utils.compute_box_3d(obj, calib.P)
            # compute_box_3d returns None for boxes that reach behind the camera
            if box3d_pts_2d is None:
                continue
            img2 = utils.draw_projected_box3d(img2, box3d_pts_2d)
        if show3d:
            Image.fromarray(img2).save('save_output/image_with_3Dboxes.png')
            Image.fromarray(img2).show()
        else:
            Image.fromarray(img1).save('save_output/image_with_2Dboxes.png')
            Image.fromarray(img1).show()

Core code:

    def compute_box_3d(obj, P):
        '''
        Compute the 3D bounding box of an object and its projection onto the image.
        Input:  obj is one parsed label; P is the 3x4 camera projection matrix.
        Output: (corners_2d, corners_3d)
                corners_2d: the 8 box corners in the image, shape (8, 2),
                            or None if the box reaches behind the camera.
                corners_3d: the 8 box corners in camera coordinates, shape (8, 3).
        '''
        # Rotation about the camera Y axis by the object's yaw angle obj.ry
        R = roty(obj.ry)
        # Physical length, width and height of the object
        l = obj.l
        w = obj.w
        h = obj.h
        # The 8 corners relative to the box origin, which sits at the
        # center of the bottom face (y points down, so the top is at -h)
        x_corners = [l/2, l/2, -l/2, -l/2, l/2, l/2, -l/2, -l/2]
        y_corners = [0, 0, 0, 0, -h, -h, -h, -h]
        z_corners = [w/2, -w/2, -w/2, w/2, w/2, -w/2, -w/2, w/2]
        # 1. Rotate the corners from the object frame into the camera frame
        corners_3d = np.dot(R, np.vstack([x_corners, y_corners, z_corners]))
        # Translate by the object location
        corners_3d[0, :] = corners_3d[0, :] + obj.t[0]
        corners_3d[1, :] = corners_3d[1, :] + obj.t[1]
        corners_3d[2, :] = corners_3d[2, :] + obj.t[2]
        # 2. Only draw objects in front of the camera. If any corner has
        #    depth z < 0.1, it is behind (or too close to) the camera,
        #    so corners_2d is returned as None.
        if np.any(corners_3d[2, :] < 0.1):
            corners_2d = None
            return corners_2d, np.transpose(corners_3d)
        # 3. Project the camera-frame corners onto the image plane
        corners_2d = project_to_image(np.transpose(corners_3d), P)
        return corners_2d, np.transpose(corners_3d)

    def draw_projected_box3d(image, qs, color=(0, 60, 255), thickness=2):
        ''' Draw a 3D bounding box in the image.
            qs: (8, 2) array of box vertices in image coordinates,
                in the following order:
                1 -------- 0
               /|         /|
              2 -------- 3 .
              | |        | |
              . 5 -------- 4
              |/         |/
              6 -------- 7
        '''
        qs = qs.astype(np.int32)  # integer pixel coordinates for drawing
        # Four iterations; each one draws three edges of the box
        for k in range(0, 4):
            # Ref: http://docs.enthought.com/mayavi/mayavi/auto/mlab_helper_functions.html
            # Edge k of the first face (corners 0..3)
            i, j = k, (k + 1) % 4
            cv2.line(image, (qs[i, 0], qs[i, 1]), (qs[j, 0], qs[j, 1]), color, thickness)
            # Edge k of the opposite face (corners 4..7), parallel to the one above
            i, j = k + 4, (k + 1) % 4 + 4
            cv2.line(image, (qs[i, 0], qs[i, 1]), (qs[j, 0], qs[j, 1]), color, thickness)
            # Edge connecting the two faces
            i, j = k, k + 4
            cv2.line(image, (qs[i, 0], qs[i, 1]), (qs[j, 0], qs[j, 1]), color, thickness)
        return image
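To see what the corner layout in compute_box_3d produces, here is a small self-contained sketch that builds the 8 corners for the sample pedestrian label from section 4 (h=1.89, w=0.48, l=1.20, location (1.84, 1.47, 8.41)). The `box_corners` helper is a hypothetical name; it repeats only the rotate-and-translate part and skips the image projection.

```python
import numpy as np


def roty(t):
    """Rotation matrix about the Y axis."""
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, 0, s],
                     [0, 1, 0],
                     [-s, 0, c]])


def box_corners(h, w, l, t, ry):
    """Return the 8 box corners in camera coordinates, shape (8, 3)."""
    x = [l/2, l/2, -l/2, -l/2, l/2, l/2, -l/2, -l/2]
    y = [0, 0, 0, 0, -h, -h, -h, -h]   # origin on the bottom face, y points down
    z = [w/2, -w/2, -w/2, w/2, w/2, -w/2, -w/2, w/2]
    corners = roty(ry) @ np.vstack([x, y, z])          # 3 x 8
    return (corners + np.asarray(t).reshape(3, 1)).T   # 8 x 3


location = (1.84, 1.47, 8.41)
corners = box_corners(1.89, 0.48, 1.20, location, ry=0.0)
turned = box_corners(1.89, 0.48, 1.20, location, ry=np.pi / 2)
print(corners.min(axis=0), corners.max(axis=0))
```

With ry = 0 the box spans l along camera x and w along camera z, and y runs from the location's y up to y - h, which shows that the label location is the center of the bottom face. Turning the box by 90 degrees swaps the x and z extents.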

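9. 3D Point Cloud Results --> BEV Image Results (Coordinate Conversion)

The approach:

  1. Read the point cloud. It is stored as an n*4 array, where n is the number of points in the file and the 4 values are (x, y, z, intensity): the 3D position of the point and its reflectance.
  2. Only the first two columns are needed, giving the (x, y) coordinates.
  3. Draw the (x, y) points as a scatter plot.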
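A short sketch of steps 1 and 2. So that it runs without the dataset, it first writes a tiny synthetic scan in the same float32 layout; the file name and point values are made up for illustration.

```python
import os
import tempfile

import numpy as np

# Synthetic stand-in for a KITTI .bin scan: rows of (x, y, z, intensity).
points = np.array([[10.0, 5.0, -1.5, 0.3],
                   [20.0, -3.0, -1.6, 0.1]], dtype=np.float32)
path = os.path.join(tempfile.mkdtemp(), "000000.bin")
points.tofile(path)

# Step 1: read the flat float32 buffer and restore the n x 4 layout
scan = np.fromfile(path, dtype=np.float32).reshape(-1, 4)
# Step 2: keep only the first two columns, the ground-plane coordinates
xy = scan[:, :2]
print(scan.shape, xy)
```

For step 3, passing `xy[:, 0]` and `xy[:, 1]` to matplotlib's `plt.scatter` with a small marker size gives a quick top-down view of a real scan.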

The resulting BEV view:

10. Drawing the BEV Image

Example BEV image:

Core code:

    def show_lidar_topview(pc_velo, objects, calib):
        ''' Render the point cloud as a BEV (bird's-eye view) image. '''
        # 1. Extent of the BEV image, in metres
        side_range = (-30, 30)  # left/right
        fwd_range = (0, 80)     # back/front
        x_points = pc_velo[:, 0]
        y_points = pc_velo[:, 1]
        z_points = pc_velo[:, 2]
        # 2. Keep only the points inside that extent
        f_filt = np.logical_and(x_points > fwd_range[0], x_points < fwd_range[1])
        s_filt = np.logical_and(y_points > side_range[0], y_points < side_range[1])
        mask = np.logical_and(f_filt, s_filt)
        indices = np.argwhere(mask).flatten()
        x_points = x_points[indices]
        y_points = y_points[indices]
        z_points = z_points[indices]
        # Distance covered by one pixel, in metres
        res = 0.1
        # 3-1. LiDAR coordinates -> BEV pixel coordinates
        #      (image x grows to the right = -y, image y grows downward = -x)
        x_img = (-y_points / res).astype(np.int32)
        y_img = (-x_points / res).astype(np.int32)
        # 3-2. Shift the origin so that all pixel indices are non-negative
        x_img -= int(np.floor(side_range[0]) / res)
        y_img += int(np.floor(fwd_range[1]) / res)
        print(x_img.min(), x_img.max(), y_img.min(), y_img.max())
        # 4. Pixel values: map the clipped height (z) to 0..255
        height_range = (-3, 1.0)
        pixel_value = np.clip(a=z_points, a_max=height_range[1], a_min=height_range[0])
        def scale_to_255(a, lo, hi, dtype=np.uint8):
            return ((a - lo) / float(hi - lo) * 255).astype(dtype)
        pixel_value = scale_to_255(pixel_value, height_range[0], height_range[1])
        # Create the image array and fill it
        x_max = 1 + int((side_range[1] - side_range[0]) / res)
        y_max = 1 + int((fwd_range[1] - fwd_range[0]) / res)
        im = np.zeros([y_max, x_max], dtype=np.uint8)
        im[y_img, x_img] = pixel_value
        im2 = Image.fromarray(im)
        im2.save('save_output/BEV.png')
        im2.show()
11. Drawing 2D Boxes on the BEV Image

Boxes drawn in the BEV view:

Interface code:

    def show_lidar_topview_with_boxes(img, objects, calib):
        ''' Project the 3D boxes onto the BEV image. '''
        def bbox3d(obj):
            # 3D box corners in the image and in camera coordinates
            box3d_pts_2d, box3d_pts_3d = utils.compute_box_3d(obj, calib.P)
            # Camera coordinates -> LiDAR coordinates
            box3d_pts_3d_velo = calib.project_rect_to_velo(box3d_pts_3d)
            return box3d_pts_3d_velo  # n x 3 points

        boxes3d = [bbox3d(obj) for obj in objects if obj.type == "Car"]
        gt = np.array(boxes3d)
        # Take the x and y of the LiDAR-frame corners and draw them on the BEV plane
        im2 = utils.draw_box3d_label_on_bev(img, gt, scores=None, thickness=1)
        im2 = Image.fromarray(im2)
        im2.save('save_output/BEV with boxes.png')
        im2.show()

Core code:

    # BEV parameters
    side_range = (-30, 30)  # left/right range in metres
    fwd_range = (0, 80)     # back/front range in metres
    res = 0.1               # resolution: 0.1 m per pixel

The compute_box_3d function used here is identical to the one listed under "Core code" in section 8.

12. Complete Project Code

Project layout:

kitti_vis_main.py (main entry point)

    from __future__ import print_function
    import os
    import sys
    import cv2
    from PIL import Image

    BASE_DIR = os.path.dirname(os.path.abspath(__file__))
    ROOT_DIR = os.path.dirname(BASE_DIR)
    sys.path.append(BASE_DIR)
    sys.path.append(os.path.join(ROOT_DIR, 'mayavi'))
    from kitti_object import *

    def visualization():
        import mayavi.mlab as mlab
        dataset = kitti_object(os.path.join(ROOT_DIR, 'Kitti_3D_Vis/dataset/object'))  # Linux path
        data_idx = 10  # index of the sample to visualize

        # 1. Load the labels
        objects = dataset.get_label_objects(data_idx)
        print("There are %d objects." % len(objects))

        # 2. Load the image
        img = dataset.get_image(data_idx)
        img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
        img_height, img_width, img_channel = img.shape

        # 3. Load the point cloud
        pc_velo = dataset.get_lidar(data_idx)[:, 0:3]  # (x, y, z)

        # 4. Load the calibration
        calib = dataset.get_calibration(data_idx)

        # 5. Show the raw image
        print(' ------------ show raw image -------- ')
        Image.fromarray(img).show()

        # 6. Draw 2D boxes on the image
        print(' ------------ show image with 2D bounding box -------- ')
        show_image_with_boxes(img, objects, calib, False)

        # 7. Draw 3D boxes on the image
        print(' ------------ show image with 3D bounding box ------- ')
        show_image_with_boxes(img, objects, calib, True)

        # 8. Project the point cloud onto the image
        print(' ----------- LiDAR points projected to image plane -- ')
        show_lidar_on_image(pc_velo, img, calib, img_width, img_height)

        # 9. Draw the BEV image
        print('------------------ BEV of LiDAR points -----------------------------')
        show_lidar_topview(pc_velo, objects, calib)

        # 10. Draw 2D boxes on the BEV image
        print('--------------- BEV of LiDAR points with boxes ---------------------')
        img1 = cv2.imread('save_output/BEV.png')
        img = cv2.cvtColor(img1, cv2.COLOR_BGR2RGB)
        show_lidar_topview_with_boxes(img, objects, calib)

    if __name__ == '__main__':
        visualization()

kitti_util.py

    from __future__ import print_function
    import os
    import cv2
    import numpy as np
    from PIL import Image

    # BEV parameters
    side_range = (-30, 30)  # left/right range in metres
    fwd_range = (0, 80)     # back/front range in metres
    res = 0.1               # resolution: 0.1 m per pixel

    def compute_box_3d(obj, P):
        '''
        Compute the 3D bounding box of an object and its projection onto the image.
        Input:  obj is one parsed label; P is the 3x4 camera projection matrix.
        Output: (corners_2d, corners_3d)
                corners_2d: the 8 box corners in the image, shape (8, 2),
                            or None if the box reaches behind the camera.
                corners_3d: the 8 box corners in camera coordinates, shape (8, 3).
        '''
        # Rotation about the camera Y axis by the object's yaw angle obj.ry
        R = roty(obj.ry)
        # Physical length, width and height of the object
        l = obj.l
        w = obj.w
        h = obj.h
        # The 8 corners relative to the center of the bottom face
        x_corners = [l/2, l/2, -l/2, -l/2, l/2, l/2, -l/2, -l/2]
        y_corners = [0, 0, 0, 0, -h, -h, -h, -h]
        z_corners = [w/2, -w/2, -w/2, w/2, w/2, -w/2, -w/2, w/2]
        # 1. Rotate the corners from the object frame into the camera frame
        corners_3d = np.dot(R, np.vstack([x_corners, y_corners, z_corners]))
        # Translate by the object location
        corners_3d[0, :] = corners_3d[0, :] + obj.t[0]
        corners_3d[1, :] = corners_3d[1, :] + obj.t[1]
        corners_3d[2, :] = corners_3d[2, :] + obj.t[2]
        # 2. Only draw objects in front of the camera: if any corner has
        #    depth z < 0.1, corners_2d is returned as None.
        if np.any(corners_3d[2, :] < 0.1):
            corners_2d = None
            return corners_2d, np.transpose(corners_3d)
        # 3. Project the camera-frame corners onto the image plane
        corners_2d = project_to_image(np.transpose(corners_3d), P)
        return corners_2d, np.transpose(corners_3d)

    def project_to_image(pts_3d, P):
        '''
        Project 3D points in camera coordinates onto the image plane.
        Input:  pts_3d is an n x 3 array (one point per row);
                P is the 3x4 camera projection matrix.
        Output: an n x 2 array of pixel coordinates.

        P(3x4) dot pts_3d_extended(4xn) = projected_pts_2d(3xn)
            => normalize projected_pts_2d(2xn)
        <=> pts_3d_extended(nx4) dot P'(4x3) = projected_pts_2d(nx3)
            => normalize projected_pts_2d(nx2)
        '''
        n = pts_3d.shape[0]  # number of points
        # Append 1 to every point to get homogeneous coordinates (n x 4)
        pts_3d_extend = np.hstack((pts_3d, np.ones((n, 1))))
        # Multiply by the projection matrix: each row becomes [x, y, z] (n x 3)
        pts_2d = np.dot(pts_3d_extend, np.transpose(P))
        # Divide x and y by z to get pixel coordinates
        pts_2d[:, 0] /= pts_2d[:, 2]
        pts_2d[:, 1] /= pts_2d[:, 2]
        return pts_2d[:, 0:2]

    def draw_projected_box3d(image, qs, color=(0, 60, 255), thickness=2):
        ''' Draw a 3D bounding box in the image.
            qs: (8, 2) array of box vertices in image coordinates,
                in the following order:
                1 -------- 0
               /|         /|
              2 -------- 3 .
              | |        | |
              . 5 -------- 4
              |/         |/
              6 -------- 7
        '''
        qs = qs.astype(np.int32)  # integer pixel coordinates for drawing
        # Four iterations; each one draws three edges of the box
        for k in range(0, 4):
            # Ref: http://docs.enthought.com/mayavi/mayavi/auto/mlab_helper_functions.html
            # Edge k of the first face (corners 0..3)
            i, j = k, (k + 1) % 4
            cv2.line(image, (qs[i, 0], qs[i, 1]), (qs[j, 0], qs[j, 1]), color, thickness)
            # Edge k of the opposite face (corners 4..7), parallel to the one above
            i, j = k + 4, (k + 1) % 4 + 4
            cv2.line(image, (qs[i, 0], qs[i, 1]), (qs[j, 0], qs[j, 1]), color, thickness)
            # Edge connecting the two faces
            i, j = k, k + 4
            cv2.line(image, (qs[i, 0], qs[i, 1]), (qs[j, 0], qs[j, 1]), color, thickness)
        return image

    def draw_box3d_label_on_bev(image, boxes3d, thickness=1, scores=None):
        ''' Draw ground-truth boxes on the BEV image, colored by forward distance. '''
        img = image.copy()
        for b in boxes3d:
            # x (forward distance) of the first four corners in LiDAR coordinates
            xs = b[0:4, 0]
            if np.all(xs < 30):
                color = (0, 255, 0)   # green: closer than 30 m
            elif np.all(xs < 50):
                color = (255, 0, 0)   # red: closer than 50 m
            else:
                color = (0, 0, 255)   # blue: farther away
            # Corners 0..3 form the footprint of the box on the ground plane
            pts = [lidar_to_top_coords(b[k, 0], b[k, 1]) for k in range(4)]
            for k in range(4):
                cv2.line(img, pts[k], pts[(k + 1) % 4], color, thickness, cv2.LINE_AA)
        return img

    def draw_box3d_predict_on_bev(image, boxes3d, thickness=1, scores=None):
        ''' Draw predicted boxes on the BEV image in white. '''
        img = image.copy()
        color = (255, 255, 255)  # white
        for b in boxes3d:
            pts = [lidar_to_top_coords(b[k, 0], b[k, 1]) for k in range(4)]
            for k in range(4):
                cv2.line(img, pts[k], pts[(k + 1) % 4], color, thickness, cv2.LINE_AA)
        return img

    def lidar_to_top_coords(x, y, z=None):
        ''' Map a LiDAR (x, y) position to BEV pixel coordinates (column, row). '''
        xx = (-y / res).astype(np.int32)
        yy = (-x / res).astype(np.int32)
        # Shift the origin so that pixel indices are non-negative
        xx -= int(np.floor(side_range[0]) / res)
        yy += int(np.floor(fwd_range[1]) / res)
        # Plain Python ints are the safest point type for cv2 drawing calls
        return int(xx), int(yy)

    # Parse label data
    class Object3d(object):
        ''' 3d object label '''
        def __init__(self, label_file_line):
            data = label_file_line.split(' ')
            data[1:] = [float(x) for x in data[1:]]
            # extract label, truncation, occlusion
            self.type = data[0]  # 'Car', 'Pedestrian', ...
            self.truncation = data[1]  # truncated pixel ratio [0..1]
            self.occlusion = int(data[2])  # 0=visible, 1=partly occluded, 2=fully occluded, 3=unknown
            self.alpha = data[3]  # object observation angle [-pi..pi]
            # extract 2d bounding box in 0-based coordinates
            self.xmin = data[4]  # left
            self.ymin = data[5]  # top
            self.xmax = data[6]  # right
            self.ymax = data[7]  # bottom
            self.box2d = np.array([self.xmin, self.ymin, self.xmax, self.ymax])
            # extract 3d bounding box information
            self.h = data[8]   # box height
            self.w = data[9]   # box width
            self.l = data[10]  # box length (in meters)
            self.t = (data[11], data[12], data[13])  # location (x,y,z) in camera coord.
            self.ry = data[14]  # yaw angle (around Y-axis in camera coordinates) [-pi..pi]

        def print_object(self):
            print('Type, truncation, occlusion, alpha: %s, %d, %d, %f' %
                  (self.type, self.truncation, self.occlusion, self.alpha))
            print('2d bbox (x0,y0,x1,y1): %f, %f, %f, %f' %
                  (self.xmin, self.ymin, self.xmax, self.ymax))
            print('3d bbox h,w,l: %f, %f, %f' %
                  (self.h, self.w, self.l))
            print('3d bbox location, ry: (%f, %f, %f), %f' %
                  (self.t[0], self.t[1], self.t[2], self.ry))

    class Calibration(object):
        ''' Calibration matrices and utils
            3d XYZ in <label>.txt are in rect camera coord.
            2d box xy are in image2 coord
            Points in <lidar>.bin are in Velodyne coord.

            y_image2 = P^2_rect * x_rect
            y_image2 = P^2_rect * R0_rect * Tr_velo_to_cam * x_velo
            x_ref = Tr_velo_to_cam * x_velo
            x_rect = R0_rect * x_ref

            P^2_rect = [f^2_u,  0,      c^2_u,  -f^2_u b^2_x;
                        0,      f^2_v,  c^2_v,  -f^2_v b^2_y;
                        0,      0,      1,      0]
                     = K * [1|t]

            image2 coord:
             ----> x-axis (u)
            |
            |
            v y-axis (v)

            velodyne coord:
            front x, left y, up z

            rect/ref camera coord:
            right x, down y, front z

            Ref (KITTI paper): http://www.cvlibs.net/publications/Geiger2013IJRR.pdf

            TODO(rqi): do matrix multiplication only once for each projection.
        '''
        def __init__(self, calib_filepath, from_video=False):
            if from_video:
                calibs = self.read_calib_from_video(calib_filepath)
            else:
                calibs = self.read_calib_file(calib_filepath)
            # Projection matrix from rect camera coord to image2 coord
            self.P = calibs['P2']
            self.P = np.reshape(self.P, [3, 4])
            # Rigid transform from Velodyne coord to reference camera coord
            self.V2C = calibs['Tr_velo_to_cam']
            self.V2C = np.reshape(self.V2C, [3, 4])
            self.C2V = inverse_rigid_trans(self.V2C)
            # Rotation from reference camera coord to rect camera coord
            self.R0 = calibs['R0_rect']
            self.R0 = np.reshape(self.R0, [3, 3])
            # Camera intrinsics and extrinsics
            self.c_u = self.P[0, 2]
            self.c_v = self.P[1, 2]
            self.f_u = self.P[0, 0]
            self.f_v = self.P[1, 1]
            self.b_x = self.P[0, 3] / (-self.f_u)  # relative
            self.b_y = self.P[1, 3] / (-self.f_v)

        def read_calib_file(self, filepath):
            ''' Read in a calibration file and parse into a dictionary.'''
            data = {}
            with open(filepath, 'r') as f:
                for line in f.readlines():
                    line = line.rstrip()
                    if len(line) == 0:
                        continue
                    key, value = line.split(':', 1)
                    # The only non-float values in these files are dates, which
                    # we don't care about anyway
                    try:
                        data[key] = np.array([float(x) for x in value.split()])
                    except ValueError:
                        pass
            return data

        def read_calib_from_video(self, calib_root_dir):
            ''' Read calibration for camera 2 from video calib files.
                there are calib_cam_to_cam and calib_velo_to_cam under the calib_root_dir
            '''
            data = {}
            cam2cam = self.read_calib_file(os.path.join(calib_root_dir, 'calib_cam_to_cam.txt'))
            velo2cam = self.read_calib_file(os.path.join(calib_root_dir, 'calib_velo_to_cam.txt'))
            Tr_velo_to_cam = np.zeros((3, 4))
            Tr_velo_to_cam[0:3, 0:3] = np.reshape(velo2cam['R'], [3, 3])
            Tr_velo_to_cam[:, 3] = velo2cam['T']
            data['Tr_velo_to_cam'] = np.reshape(Tr_velo_to_cam, [12])
            data['R0_rect'] = cam2cam['R_rect_00']
            data['P2'] = cam2cam['P_rect_02']
            return data

        def cart2hom(self, pts_3d):
            ''' Input: nx3 points in Cartesian
                Output: nx4 points in Homogeneous by appending 1
            '''
            n = pts_3d.shape[0]
            pts_3d_hom = np.hstack((pts_3d, np.ones((n, 1))))
            return pts_3d_hom

        # ===========================
        # ------- 3d to 3d ----------
        # ===========================
        def project_velo_to_ref(self, pts_3d_velo):
            pts_3d_velo = self.cart2hom(pts_3d_velo)  # nx4
            return np.dot(pts_3d_velo, np.transpose(self.V2C))

        def project_ref_to_velo(self, pts_3d_ref):
            pts_3d_ref = self.cart2hom(pts_3d_ref)  # nx4
            return np.dot(pts_3d_ref, np.transpose(self.C2V))

        def project_rect_to_ref(self, pts_3d_rect):
            ''' Input and Output are nx3 points '''
            return np.transpose(np.dot(np.linalg.inv(self.R0), np.transpose(pts_3d_rect)))

        def project_ref_to_rect(self, pts_3d_ref):
            ''' Input and Output are nx3 points '''
            return np.transpose(np.dot(self.R0, np.transpose(pts_3d_ref)))

        def project_rect_to_velo(self, pts_3d_rect):
            ''' Input: nx3 points in rect camera coord.
                Output: nx3 points in velodyne coord.
            '''
            pts_3d_ref = self.project_rect_to_ref(pts_3d_rect)
            return self.project_ref_to_velo(pts_3d_ref)

        def project_velo_to_rect(self, pts_3d_velo):
            pts_3d_ref = self.project_velo_to_ref(pts_3d_velo)
            return self.project_ref_to_rect(pts_3d_ref)

        def corners3d_to_img_boxes(self, corners3d):
            """
            :param corners3d: (N, 8, 3) corners in rect coordinate
            :return: boxes: (None, 4) [x1, y1, x2, y2] in rgb coordinate
            :return: boxes_corner: (None, 8) [xi, yi] in rgb coordinate
            """
            sample_num = corners3d.shape[0]
            corners3d_hom = np.concatenate((corners3d, np.ones((sample_num, 8, 1))), axis=2)  # (N, 8, 4)
            img_pts = np.matmul(corners3d_hom, self.P.T)  # (N, 8, 3)
            x, y = img_pts[:, :, 0] / img_pts[:, :, 2], img_pts[:, :, 1] / img_pts[:, :, 2]
            x1, y1 = np.min(x, axis=1), np.min(y, axis=1)
            x2, y2 = np.max(x, axis=1), np.max(y, axis=1)
            boxes = np.concatenate((x1.reshape(-1, 1), y1.reshape(-1, 1),
                                    x2.reshape(-1, 1), y2.reshape(-1, 1)), axis=1)
            boxes_corner = np.concatenate((x.reshape(-1, 8, 1), y.reshape(-1, 8, 1)), axis=2)
            return boxes, boxes_corner

        # ===========================
        # ------- 3d to 2d ----------
        # ===========================
        def project_rect_to_image(self, pts_3d_rect):
            ''' Input: nx3 points in rect camera coord.
                Output: nx2 points in image2 coord.
            '''
            pts_3d_rect = self.cart2hom(pts_3d_rect)
            pts_2d = np.dot(pts_3d_rect, np.transpose(self.P))  # nx3
            pts_2d[:, 0] /= pts_2d[:, 2]
            pts_2d[:, 1] /= pts_2d[:, 2]
            return pts_2d[:, 0:2]

        def project_velo_to_image(self, pts_3d_velo):
            ''' Input: nx3 points in velodyne coord.
                Output: nx2 points in image2 coord.
            '''
            pts_3d_rect = self.project_velo_to_rect(pts_3d_velo)
            return self.project_rect_to_image(pts_3d_rect)

        # ===========================
        # ------- 2d to 3d ----------
        # ===========================
        def project_image_to_rect(self, uv_depth):
            ''' Input: nx3 first two channels are uv, 3rd channel
                       is depth in rect camera coord.
                Output: nx3 points in rect camera coord.
            '''
            n = uv_depth.shape[0]
            x = ((uv_depth[:, 0] - self.c_u) * uv_depth[:, 2]) / self.f_u + self.b_x
            y = ((uv_depth[:, 1] - self.c_v) * uv_depth[:, 2]) / self.f_v + self.b_y
            pts_3d_rect = np.zeros((n, 3))
            pts_3d_rect[:, 0] = x
            pts_3d_rect[:, 1] = y
            pts_3d_rect[:, 2] = uv_depth[:, 2]
            return pts_3d_rect

        def project_image_to_velo(self, uv_depth):
            pts_3d_rect = self.project_image_to_rect(uv_depth)
            return self.project_rect_to_velo(pts_3d_rect)

    def rotx(t):
        ''' 3D Rotation about the x-axis. '''
        c = np.cos(t)
        s = np.sin(t)
        return np.array([[1, 0, 0],
                         [0, c, -s],
                         [0, s, c]])

    def roty(t):
        ''' Rotation about the y-axis. '''
        c = np.cos(t)
        s = np.sin(t)
        return np.array([[c, 0, s],
                         [0, 1, 0],
                         [-s, 0, c]])

    def rotz(t):
        ''' Rotation about the z-axis. '''
        c = np.cos(t)
        s = np.sin(t)
        return np.array([[c, -s, 0],
                         [s, c, 0],
                         [0, 0, 1]])

    def transform_from_rot_trans(R, t):
        ''' Transformation matrix from rotation matrix and translation vector. '''
        R = R.reshape(3, 3)
        t = t.reshape(3, 1)
        return np.vstack((np.hstack([R, t]), [0, 0, 0, 1]))

    def inverse_rigid_trans(Tr):
        ''' Inverse a rigid body transform matrix (3x4 as [R|t])
            [R'|-R't; 0|1]
        '''
        inv_Tr = np.zeros_like(Tr)  # 3x4
        inv_Tr[0:3, 0:3] = np.transpose(Tr[0:3, 0:3])
        inv_Tr[0:3, 3] = np.dot(-np.transpose(Tr[0:3, 0:3]), Tr[0:3, 3])
        return inv_Tr

    def read_label(label_filename):
        # Close the file promptly instead of relying on garbage collection
        with open(label_filename) as f:
            lines = [line.rstrip() for line in f]
        objects = [Object3d(line) for line in lines]
        return objects

    def load_image(img_filename):
        return cv2.imread(img_filename)

    def load_velo_scan(velo_filename):
        # Each point is four float32 values: x, y, z, intensity
        scan = np.fromfile(velo_filename, dtype=np.float32)
        scan = scan.reshape((-1, 4))
        return scan

kitti_object.py

    from __future__ import print_function
    import os
    import sys
    import cv2
    import numpy as np
    from PIL import Image
    import matplotlib.pyplot as plt

    BASE_DIR = os.path.dirname(os.path.abspath(__file__))
    ROOT_DIR = os.path.dirname(BASE_DIR)
    sys.path.append(os.path.join(ROOT_DIR, 'mayavi'))
    import kitti_util as utils

    def show_image_with_boxes(img, objects, calib, show3d=True):
        ''' Draw 2D boxes and 3D boxes on the image. '''
        img1 = np.copy(img)  # for 2D boxes
        img2 = np.copy(img)  # for 3D boxes
        for obj in objects:
            if obj.type == 'DontCare':
                continue
            # Draw the 2D box
            cv2.rectangle(img1, (int(obj.xmin), int(obj.ymin)),
                          (int(obj.xmax), int(obj.ymax)), (0, 255, 0), 2)
            # 3D box corners in the image (8x2) and in camera coordinates (8x3)
            box3d_pts_2d, box3d_pts_3d = utils.compute_box_3d(obj, calib.P)
            # compute_box_3d returns None for boxes that reach behind the camera
            if box3d_pts_2d is None:
                continue
            img2 = utils.draw_projected_box3d(img2, box3d_pts_2d)
        if show3d:
            Image.fromarray(img2).save('save_output/image_with_3Dboxes.png')
            Image.fromarray(img2).show()
        else:
            Image.fromarray(img1).save('save_output/image_with_2Dboxes.png')
            Image.fromarray(img1).show()

    def show_lidar_topview(pc_velo, objects, calib):
        ''' Render the point cloud as a BEV (bird's-eye view) image. '''
        # 1. Extent of the BEV image, in metres
        side_range = (-30, 30)  # left/right
        fwd_range = (0, 80)     # back/front
        x_points = pc_velo[:, 0]
        y_points = pc_velo[:, 1]
        z_points = pc_velo[:, 2]
        # 2. Keep only the points inside that extent
        f_filt = np.logical_and(x_points > fwd_range[0], x_points < fwd_range[1])
        s_filt = np.logical_and(y_points > side_range[0], y_points < side_range[1])
        mask = np.logical_and(f_filt, s_filt)
        indices = np.argwhere(mask).flatten()
        x_points = x_points[indices]
        y_points = y_points[indices]
        z_points = z_points[indices]
        # Distance covered by one pixel, in metres
        res = 0.1
        # 3-1. LiDAR coordinates -> BEV pixel coordinates
        x_img = (-y_points / res).astype(np.int32)
        y_img = (-x_points / res).astype(np.int32)
        # 3-2. Shift the origin so that all pixel indices are non-negative
        x_img -= int(np.floor(side_range[0]) / res)
        y_img += int(np.floor(fwd_range[1]) / res)
        print(x_img.min(), x_img.max(), y_img.min(), y_img.max())
        # 4. Pixel values: map the clipped height (z) to 0..255
        height_range = (-3, 1.0)
        pixel_value = np.clip(a=z_points, a_max=height_range[1], a_min=height_range[0])
        def scale_to_255(a, lo, hi, dtype=np.uint8):
            return ((a - lo) / float(hi - lo) * 255).astype(dtype)
        pixel_value = scale_to_255(pixel_value, height_range[0], height_range[1])
        # Create the image array and fill it
        x_max = 1 + int((side_range[1] - side_range[0]) / res)
        y_max = 1 + int((fwd_range[1] - fwd_range[0]) / res)
        im = np.zeros([y_max, x_max], dtype=np.uint8)
        im[y_img, x_img] = pixel_value
        im2 = Image.fromarray(im)
        im2.save('save_output/BEV.png')
        im2.show()

# Project the 3D boxes onto the BEV image
def show_lidar_topview_with_boxes(img, objects, calib):
    def bbox3d(obj):
        # 3D box corners: projected onto the image, and in camera coordinates
        box3d_pts_2d, box3d_pts_3d = utils.compute_box_3d(obj, calib.P)
        # Convert the camera-coordinate corners to LiDAR coordinates
        box3d_pts_3d_velo = calib.project_rect_to_velo(box3d_pts_3d)
        return box3d_pts_3d_velo  # n x 3 array of corner points

    boxes3d = [bbox3d(obj) for obj in objects if obj.type == "Car"]
    gt = np.array(boxes3d)
    # Take the x and y dimensions of the LiDAR-coordinate corners and draw them on the BEV plane
    im2 = utils.draw_box3d_label_on_bev(img, gt, scores=None, thickness=1)
    im2 = Image.fromarray(im2)
    im2.save('save_output/BEV with boxes.png')
    im2.show()

# Project the point cloud onto the image
def show_lidar_on_image(pc_velo, img, calib, img_width, img_height):
    ''' Project LiDAR points to image '''
    imgfov_pc_velo, pts_2d, fov_inds = get_lidar_in_image_fov(
        pc_velo, calib, 0, 0, img_width, img_height, True)
    imgfov_pts_2d = pts_2d[fov_inds, :]
    imgfov_pc_rect = calib.project_velo_to_rect(imgfov_pc_velo)
    import matplotlib.pyplot as plt
    # plt.get_cmap is used here because matplotlib.cm.get_cmap was removed in recent Matplotlib releases
    cmap = plt.get_cmap('hsv', 256)
    cmap = np.array([cmap(i) for i in range(256)])[:, :3] * 255
    for i in range(imgfov_pts_2d.shape[0]):
        depth = imgfov_pc_rect[i, 2]
        # Nearer points get a larger index; clamp so depths below 2.5 m stay inside the 256-entry table
        color = cmap[min(int(640.0 / depth), 255), :]
        cv2.circle(img, (int(np.round(imgfov_pts_2d[i, 0])),
                         int(np.round(imgfov_pts_2d[i, 1]))),
                   2, color=tuple(color), thickness=-1)
    Image.fromarray(img).save('save_output/lidar_on_image.png')
    Image.fromarray(img).show()
    return img

# Keep only the LiDAR points that fall inside the image field of view
def get_lidar_in_image_fov(pc_velo, calib, xmin, ymin, xmax, ymax,
                           return_more=False, clip_distance=2.0):
    ''' Filter lidar points, keep those in image FOV '''
    pts_2d = calib.project_velo_to_image(pc_velo)
    fov_inds = (pts_2d[:, 0] < xmax) & (pts_2d[:, 0] >= xmin) & \
               (pts_2d[:, 1] < ymax) & (pts_2d[:, 1] >= ymin)
    # Drop points closer than clip_distance along the LiDAR x (forward) axis
    fov_inds = fov_inds & (pc_velo[:, 0] > clip_distance)
    imgfov_pc_velo = pc_velo[fov_inds, :]
    if return_more:
        return imgfov_pc_velo, pts_2d, fov_inds
    else:
        return imgfov_pc_velo

# Load images, point clouds, calibration and labels by sample index
class kitti_object(object):
    '''Load and parse object data into a usable format.'''

    def __init__(self, root_dir, split='training'):
        '''root_dir contains training and testing folders'''
        self.root_dir = root_dir
        self.split = split
        self.split_dir = os.path.join(root_dir, split)
        if split == 'training':
            self.num_samples = 7481
        elif split == 'testing':
            self.num_samples = 7518
        else:
            # Raise instead of print + exit(-1) so that callers can handle the error
            raise ValueError('Unknown split: %s' % (split))
        self.image_dir = os.path.join(self.split_dir, 'image_2')
        self.calib_dir = os.path.join(self.split_dir, 'calib')
        self.lidar_dir = os.path.join(self.split_dir, 'velodyne')
        self.label_dir = os.path.join(self.split_dir, 'label_2')

    def __len__(self):
        return self.num_samples

    def get_image(self, idx):
        assert(idx < self.num_samples)
        img_filename = os.path.join(self.image_dir, '%06d.png' % (idx))
        return utils.load_image(img_filename)

    def get_lidar(self, idx):
        assert(idx < self.num_samples)
        lidar_filename = os.path.join(self.lidar_dir, '%06d.bin' % (idx))
        return utils.load_velo_scan(lidar_filename)

    def get_calibration(self, idx):
        assert(idx < self.num_samples)
        calib_filename = os.path.join(self.calib_dir, '%06d.txt' % (idx))
        return utils.Calibration(calib_filename)

    def get_label_objects(self, idx):
        # Labels are only available for the training split
        assert(idx < self.num_samples and self.split == 'training')
        label_filename = os.path.join(self.label_dir, '%06d.txt' % (idx))
        return utils.read_label(label_filename)

    def get_depth_map(self, idx):
        pass  # not implemented

    def get_top_down(self, idx):
        pass  # not implemented
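
The class above hands label parsing to utils.read_label, which is not shown in this section. As a rough illustration of what that step involves, here is a minimal standalone sketch that parses one line of a label_2 file. The names KittiLabel and parse_label_line are hypothetical and are not part of the project code. The field order is assumed to follow the KITTI object development kit: type, truncation, occlusion, alpha, 2D box (4 values), dimensions h/w/l, location x/y/z in camera coordinates, and rotation_y as the last value of a ground-truth line.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class KittiLabel:
    """One object from a KITTI label_2 file (hypothetical container)."""
    type: str
    truncation: float
    occlusion: int
    alpha: float
    box2d: List[float]  # [xmin, ymin, xmax, ymax] in pixels
    h: float            # box height in metres
    w: float            # box width in metres
    l: float            # box length in metres
    t: List[float]      # [x, y, z] location in camera coordinates, metres
    ry: float           # rotation around the camera y axis, radians


def parse_label_line(line: str) -> KittiLabel:
    """Split a whitespace-separated label line into typed fields."""
    fields = line.split()
    values = [float(v) for v in fields[1:15]]  # the 14 numeric fields after the class name
    return KittiLabel(
        type=fields[0],
        truncation=values[0],
        occlusion=int(values[1]),
        alpha=values[2],
        box2d=values[3:7],
        h=values[7],
        w=values[8],
        l=values[9],
        t=values[10:13],
        ry=values[13],
    )


# The sample label quoted earlier in this article
sample = ("Pedestrian 0.00 0 -0.20 712.40 143.00 810.73 307.92 "
          "1.89 0.48 1.20 1.84 1.47 8.41 0.01")
obj = parse_label_line(sample)
print(obj.type, obj.box2d, (obj.h, obj.w, obj.l), obj.t, obj.ry)
```

A detection result file may carry one extra trailing value (the score); this sketch reads only the first 15 fields and ignores anything after them.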

Running kitti_vis_main.py saves five result images.
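
To check the BEV mapping without downloading the dataset, the rasterization used in show_lidar_topview can be reproduced on a few synthetic points. The sketch below is a simplified, hypothetical variant (the function name lidar_to_bev and its parameters are not part of the project); it assumes the same conventions as the code above: LiDAR x points forward, y points left, and one pixel covers res metres.

```python
import numpy as np


def lidar_to_bev(points, side_range=(-30.0, 30.0), fwd_range=(0.0, 80.0),
                 res=0.1, height_range=(-3.0, 1.0)):
    """Rasterize an (N, 3+) LiDAR point array into a uint8 top-view image."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]

    # Keep only the points inside the requested extent
    keep = ((x > fwd_range[0]) & (x < fwd_range[1]) &
            (y > side_range[0]) & (y < side_range[1]))
    x, y, z = x[keep], y[keep], z[keep]

    # LiDAR -> pixel indices: columns grow to the right (-y), rows grow downward (-x),
    # then shift the origin so that all indices are non-negative
    col = (-y / res).astype(np.int32) - int(round(side_range[0] / res))
    row = (-x / res).astype(np.int32) + int(round(fwd_range[1] / res))

    # Map the clipped height to a grey level in [0, 255]
    z = np.clip(z, height_range[0], height_range[1])
    span = height_range[1] - height_range[0]
    value = ((z - height_range[0]) / span * 255).astype(np.uint8)

    height = 1 + int(round((fwd_range[1] - fwd_range[0]) / res))
    width = 1 + int(round((side_range[1] - side_range[0]) / res))
    bev = np.zeros((height, width), dtype=np.uint8)
    bev[row, col] = value
    return bev


# Three synthetic points: two inside the extent, one behind the sensor
pts = np.array([[10.0, 0.0, 1.0],
                [10.0, 5.0, -1.0],
                [-5.0, 0.0, 0.0]])
bev = lidar_to_bev(pts, res=0.5)
print(bev.shape, np.count_nonzero(bev))
```

With a coarser res the image gets smaller and more points fall into the same pixel; when that happens the last point written wins, which is also how the original function behaves.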

Later posts will also cover other 3D datasets such as nuScenes and Waymo.
