
No Bells, Just Whistles


Table of contents

  • Data

  • Inference

    • Model-predicted heatmaps
    • Post-processing
  • Template

  • Map Template

    • Cam params
    • Plotting
    • Converting back and forth
  • clamp pitch

Data

When I first read the paper I assumed SoccerNet was the name of the network the authors use; it is actually the dataset.

    from SoccerNet.Downloader import SoccerNetDownloader
    mySoccerNetDownloader = SoccerNetDownloader(LocalDirectory="soccernet")
    mySoccerNetDownloader.downloadDataTask(task="calibration", split=["train","valid","test"])
    {
    "Side line left": [
        {
            "x": 0.22446875274181366,
            "y": 0.0
        },
        {
            "x": 0.0,
            "y": 0.1467222273349762
        }
    ],
    "Side line top": [
        {
            "x": 0.22700782120227814,
            "y": 0.0
        },
        {
            "x": 1.0,
            "y": 0.101583331823349
        }
    ],
    "Big rect. left top": [
        {
            "x": 0.0898749977350235,
            "y": 0.09255555272102356
        },
        {
            "x": 0.47080469131469727,
            "y": 0.15123611688613892
        }
    ],
    "Big rect. left main": [
        {
            "x": 0.47080469131469727,
            "y": 0.15123611688613892
        },
        {
            "x": 0.0,
            "y": 0.5869166254997253
        }
    ],
    "Small rect. left top": [
        {
            "x": 0.0,
            "y": 0.19638888537883759
        },
        {
            "x": 0.06321094185113907,
            "y": 0.20766666531562805
        }
    ],
    "Small rect. left main": [
        {
            "x": 0.0,
            "y": 0.2505694627761841
        },
        {
            "x": 0.06321094185113907,
            "y": 0.20993055403232574
        }
    ],
    "Circle left": [
        {
            "x": 0.1251562535762787,
            "y": 0.4714166820049286
        },
        {
            "x": 0.16253124177455902,
            "y": 0.46788889169692993
        },
        {
            "x": 0.21246874332427979,
            "y": 0.45495834946632385
        },
        {
            "x": 0.27497655153274536,
            "y": 0.4361388683319092
        },
        {
            "x": 0.31565624475479126,
            "y": 0.41673609614372253
        },
        {
            "x": 0.35170310735702515,
            "y": 0.38616669178009033
        },
        {
            "x": 0.3692343831062317,
            "y": 0.35854166746139526
        },
        {
            "x": 0.3735312521457672,
            "y": 0.3320833444595337
        },
        {
            "x": 0.36427342891693115,
            "y": 0.3038611114025116
        },
        {
            "x": 0.3447578251361847,
            "y": 0.2832777798175812
        },
        {
            "x": 0.33616405725479126,
            "y": 0.27856945991516113
        }
    ]
    }
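The x and y values in these annotation files appear to be normalized to [0, 1] relative to the image size. A minimal sketch for turning one annotated line into pixel coordinates (the file name and the 1920x1080 source resolution are assumptions, not something stated above):

    import json

    with open("00001.json") as f:                 # hypothetical annotation file name
        ann = json.load(f)

    img_w, img_h = 1920, 1080                     # assumed source frame resolution
    side_line_left = [(p["x"] * img_w, p["y"] * img_h) for p in ann["Side line left"]]
    print(side_line_left)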

Inference

Model-predicted heatmaps


Input

  • The input image is RGB; this is important to keep in mind
  • The image is resized to 540x960 before inference, as in the sketch below
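
A minimal preprocessing sketch reflecting these two points (the exact calls are my assumption, not the repository's code):

    import cv2
    import torch

    img_bgr = cv2.imread("frame.jpg")                     # OpenCV loads images as BGR
    img_rgb = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2RGB)    # the model expects RGB
    img_rgb = cv2.resize(img_rgb, (960, 540))             # cv2.resize takes (width, height)
    frame = torch.from_numpy(img_rgb).permute(2, 0, 1).float().unsqueeze(0) / 255.0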

Feature output
The same image is fed into two models (one for keypoints, one for lines), as shown in the paper:

     with torch.no_grad():
        heatmaps = model(frame)
        heatmaps_l = model_l(frame)
After the backbone stages finish, the features list contains tensors of the following sizes:

    [1, 48, 135, 240]
    [1, 96, 68, 120]
    [1, 192, 34, 60]
    [1, 384, 17, 30]

After the last stage, every feature map in the list is resized to the spatial size of the first item and they are concatenated into a single (1, 720, 135, 240) tensor; the channel count is 48 + 96 + 192 + 384 = 720:

     height, width = x[0].size(2), x[0].size(3)
     x1 = F.interpolate(x[1], size=(height, width), mode='bilinear', align_corners=False)
     x2 = F.interpolate(x[2], size=(height, width), mode='bilinear', align_corners=False)
     x3 = F.interpolate(x[3], size=(height, width), mode='bilinear', align_corners=False)
     x = torch.cat([x[0], x1, x2, x3], 1)

Finally, this tensor is upsampled to the resolution of x_skip and concatenated with it, giving (1, 784, 270, 480).
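
A rough sketch of that last fusion step with dummy tensors; only the 720 and 784 channel counts are stated above, so the 64-channel x_skip is an assumption:

    import torch
    import torch.nn.functional as F

    x_cat = torch.randn(1, 720, 135, 240)    # multi-scale features after the first cat
    x_skip = torch.randn(1, 64, 270, 480)    # higher-resolution skip features (64 channels assumed)

    x_up = F.interpolate(x_cat, size=x_skip.shape[2:], mode='bilinear', align_corners=False)
    out = torch.cat([x_skip, x_up], dim=1)
    print(out.shape)                          # torch.Size([1, 784, 270, 480])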

Head output

    self.head = nn.Sequential(nn.Sequential(
        nn.Conv2d(
            in_channels=final_inp_channels,
            out_channels=final_inp_channels,
            kernel_size=1),
        BatchNorm2d(final_inp_channels, momentum=BN_MOMENTUM),
        nn.ReLU(inplace=True),
        nn.Conv2d(
            in_channels=final_inp_channels,
            out_channels=config['MODEL']['NUM_JOINTS'],
            kernel_size=extra['FINAL_CONV_KERNEL']),
        nn.Softmax(dim=1)))

So the number of output channels is controlled by out_channels=config['MODEL']['NUM_JOINTS']. For the lines model, for example, the head output is (1, 24, 270, 480).


Post-processing

Extracting the coordinates:

Why isn't the full heatmap used? The slice [:, :-1] below drops the last channel (presumably a background channel).

     kp_coords = get_keypoints_from_heatmap_batch_maxpool(heatmaps[:,:-1,:,:])
     line_coords = get_keypoints_from_heatmap_batch_maxpool_l(heatmaps_l[:,:-1,:,:])

The extraction is borrowed from mmdetection / CenterNet: after the heatmaps are produced, a max-pooling operation is applied per channel to find the maximum response and its coordinates. After thresholding, only 11 of the 57 keypoints keep a significant response in this frame; each retained point corresponds to a location in the original input.
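
A minimal sketch of that max-pooling trick, in the usual CenterNet style (not the project's actual get_keypoints_from_heatmap_batch_maxpool): a pixel counts as a peak only if it survives a 3x3 max pooling, and the strongest peak per channel is kept if its score passes a threshold.

    import torch
    import torch.nn.functional as F

    def simple_heatmap_peaks(heatmaps, threshold=0.05):
        # heatmaps: (B, C, H, W), one channel per keypoint/line class
        pooled = F.max_pool2d(heatmaps, kernel_size=3, stride=1, padding=1)
        peaks = heatmaps * (heatmaps == pooled).float()        # zero out non-maxima
        B, C, H, W = peaks.shape
        scores, idx = peaks.view(B, C, -1).max(dim=-1)         # best peak per channel
        ys = torch.div(idx, W, rounding_mode='floor')
        xs = idx % W
        keep = scores > threshold                              # channels with a significant response
        return xs, ys, scores, keep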

From these points, the camera parameters are then estimated.

This in turn gives us a projection matrix P.

Template

lines_coords comes from inference.py.

    import numpy as np
    import matplotlib.pyplot as plt

    lines_coords = np.asarray(lines_coords)
    fig = plt.figure()
    ax = fig.add_subplot(111, projection='3d')
    for line in lines_coords:
        x_values = line[:, 0]
        y_values = line[:, 1]
        z_values = line[:, 2]
        ax.plot(x_values, y_values, z_values, color='blue', linewidth=2)  # customize color and width

    ax.set_xlabel('X', fontsize=12)
    ax.set_ylabel('Y', fontsize=12)
    ax.set_zlabel('Z', fontsize=12)
    ax.invert_zaxis()
    plt.show()

Map Template

Cam params

    {'mode': 'full',
     'use_ransac': 0,
     'rep_err': 1.7726237119131933,
     'cam_params': {'pan_degrees': 0.5474708001659633,
      'tilt_degrees': 76.4671550566091,
      'roll_degrees': 0.07939687652485798,
      'x_focal_length': 2547.636128205226,
      'y_focal_length': 2547.636128205226,
      'principal_point': [480.0, 270.0],
      'position_meters': [-0.15350727988838408, 73.82576553769255, -19.2506671316509],
      'rotation_matrix': [[0.9999502912697644, 0.009879264774156918, 0.0013472627937911466],
       [-0.003621572525175545, 0.2339785911681853, 0.972235004043465],
       [0.009289736377224345, -0.9721915546858414, 0.23400273886339104]],
      'radial_distortion': [0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
      'tangential_distortion': [0.0, 0.0],
      'thin_prism_distortion': [0.0, 0.0, 0.0, 0.0]},
     'calib_plane': 0}

Plotting

    import numpy as np
    import matplotlib.pyplot as plt

    lines_coords = np.asarray(lines_coords)
    fig = plt.figure()
    ax = fig.add_subplot(111, projection='3d')
    for line in lines_coords:
        x_values = line[:, 0]
        y_values = line[:, 1]
        # z_values = line[:, 2]
        ax.plot(x_values, y_values, color='blue', linewidth=2)  # customize color and width

    ax.scatter(50, 20, 0, c='red', marker='o', linewidths=3)
    ax.set_xlabel('X', fontsize=12)
    ax.set_ylabel('Y', fontsize=12)
    ax.set_zlabel('Z', fontsize=12)
    ax.invert_zaxis()
    plt.show()
    import numpy as np
    import matplotlib.pyplot as plt

    lines_coords = np.asarray(lines_coords)
    fig = plt.figure()
    ax = fig.add_subplot(111, projection='3d')
    for line in lines_coords:
        w1 = line[0]
        w2 = line[1]
        # shift by half the template size (105 x 68 m pitch), then project with P
        i1 = P @ np.array([w1[0] - 105 / 2, w1[1] - 68 / 2, w1[2], 1])
        i2 = P @ np.array([w2[0] - 105 / 2, w2[1] - 68 / 2, w2[2], 1])
        i1 /= P[:, -1]
        i2 /= P[:, -1]
        x = [i1[0], i2[0]]
        y = [i1[1], i2[1]]
        # z = [i1[2], i2[2]]
        ax.plot(x, y, color='green', linewidth=2)  # customize color and width

    pt = [50, 20, 0]
    pt1 = np.asarray([pt[0] - 105 / 2, pt[1] - 68 / 2, pt[2], 1])
    print(pt1)
    pt2 = P @ pt1
    print(pt2, pt2.shape)
    print(P[:, -1], P[:, -1].shape)
    pt2 /= P[:, -1]
    print(pt2, pt2.shape)
    ax.scatter(pt2[0], pt2[1], c='red', marker='o', linewidths=3)
    ax.set_xlabel('X', fontsize=12)
    ax.set_ylabel('Y', fontsize=12)
    # ax.set_zlabel('Z', fontsize=12)
    ax.invert_zaxis()
    plt.show()

Converting back and forth

    P = np.array([
        [1845.919955, -439.396327, 190.897618, 25160.905485],
        [-78.776258, 179.997439, 1854.946559, 16853.477120],
        [0.004139, -0.970780, 0.239934, 54.382024]
    ])
    tx = -105 / 2
    ty = -68 / 2

Assume P has been extracted from the model as above, and that tx and ty are fixed constants; their values depend on the template (here half of the 105 m x 68 m pitch, so that the template is centred at the origin).

    def reverse_transform(W4, tx, ty, P):
        W4 = np.append(W4, 1)
        P = np.vstack((P, np.array([0, 0, 0, 1])))
        W3 = W4 * P[:, -1]
        P_inv = np.linalg.pinv(P)
        W2 = P_inv @ W3
        W = np.array([W2[0] - tx, W2[1] - ty, W2[2]])
        print('W4:', W4, W4.shape)
        print('W3:', W3, W3.shape)
        print('W2:', W2, W2.shape)
        print('W:', W, W.shape)
        return W

    W = reverse_transform(W4, tx, ty, P)
    def create_projection_matrix(cam_params):
        fx, fy = cam_params['x_focal_length'], cam_params['y_focal_length']
        cx, cy = cam_params['principal_point']
        R = np.array(cam_params['rotation_matrix'])
        t = np.array(cam_params['position_meters'])
        K = np.array([[fx, 0, cx],
                      [0, fy, cy],
                      [0, 0, 1]])
        extrinsic = np.column_stack((R, -R @ t))
        P = K @ extrinsic
        return P
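
A short usage sketch, assuming the prediction dict shown under Cam params is stored in a variable called pred, and using the standard pinhole convention of dividing by the third homogeneous coordinate:

    P = create_projection_matrix(pred['cam_params'])
    X = np.array([0.0, 0.0, 0.0, 1.0])        # pitch centre in the template-centred frame
    p = P @ X
    u, v = p[0] / p[2], p[1] / p[2]            # pixel coordinates in the 960x540 frame
    print(u, v)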
An alternative inverse intersects the viewing ray with a fixed-height plane (note the original snippet placed the default argument before P, which is a syntax error; the signature is reordered here):

    def reverse_transform(W4, cam_params, P, z_constraint=0):
        # W4 is an image point (u, v); cam_params is unused in this variant
        w4 = np.array([W4[0], W4[1], 1])
        W3 = np.linalg.inv(P[:, :3]) @ w4        # ray direction through the pixel
        C = -np.linalg.inv(P[:, :3]) @ P[:, 3]   # camera centre
        t = (z_constraint - C[2]) / W3[2]        # intersect the ray with z = z_constraint
        W = C + t * W3
        return W
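
A round-trip sanity check for this version, as a sketch; it assumes the standard homogeneous division by the third coordinate for the forward projection (unlike the element-wise division by P[:,-1] used earlier):

    X = np.array([50 - 105 / 2, 20 - 68 / 2, 0.0])     # shifted template point on the ground plane
    p = P @ np.append(X, 1.0)
    uv = np.array([p[0] / p[2], p[1] / p[2]])            # project to pixel coordinates
    X_back = reverse_transform(uv, cam_params=None, P=P, z_constraint=0)
    print(X, X_back)                                     # X_back should match X closely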

50, 20, 0

50, 20, 0

50, 20, 0

The printed results above all come back to the template point [50, 20, 0]. In fact, reverse_transform was derived by inverting the forward transform below, which maps the template point [50, 20, 0] into the image:

    def normal_transform(W, tx, ty, P):
        W2 = np.array([W[0] + tx, W[1] + ty, W[2], 1])
        W3 = P @ W2
        W4 = W3 / P[:, -1]
        print('W:', W, W.shape)
        print('W2:', W2, W2.shape)
        print('W3:', W3, W3.shape)
        print('W4:', W4, W4.shape)
        return W4

    W = np.asarray([50, 20, 0])
    W4 = normal_transform(W, tx, ty, P)

这样两个function 可以用不同的值多测试几次。
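
One quick way to do that, as a sketch (assuming the tx/ty version of reverse_transform is the one currently in scope):

    for pt in ([50, 20, 0], [0, 0, 0], [105, 68, 0]):
        W4 = normal_transform(np.asarray(pt, dtype=float), tx, ty, P)
        W_back = reverse_transform(W4, tx, ty, P)
        print(pt, W_back, np.allclose(pt, W_back))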

clamp pitch

This is the result after filtering out the redundant lines.

The main idea:
Two known endpoints define a line y = m*x + b, which can be obtained with a polynomial fit.
Fitting gives the slope m and intercept b, which fully determine the line.
For example, in the figure the green line is the original and the red line is the clamped one, so that every point stays inside the frame.
Assume m and b are known and the left endpoint is already valid; only the right endpoint needs to be fixed.
Set the right endpoint's x to 960 and substitute it into the equation to get y; this yields the new, clamped line (see the sketch below).
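
As a small sketch of that step, with made-up endpoint values and a 960-pixel-wide frame:

    import numpy as np

    pt_left = [120.0, 300.0]      # endpoint already inside the frame
    pt_right = [1010.0, 380.0]    # endpoint outside the frame (x > 960)

    m, b = np.polyfit([pt_left[0], pt_right[0]], [pt_left[1], pt_right[1]], 1)
    pt_right = [960.0, m * 960.0 + b]   # clamp x to the frame edge, recompute y on the same line
    print(m, b, pt_right)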

In this case, the values are filtered before calling the function, so that anything negative has already been removed.

    def replace_points(pt1, pt2, width_lim, height_lim, m=None, b=None):
        if 0 <= pt1[0] <= width_lim and 0 <= pt1[1] <= height_lim and 0 <= pt2[0] <= width_lim and 0 <= pt2[1] <= height_lim:
            return pt1, pt2
        elif pt1[0] == pt2[0]:
            m = 0
            b = pt1[1]
        elif m is None and b is None:
            coefficient = np.polyfit([pt1[0], pt2[0]], [pt1[1], pt2[1]], 1)
            m, b = coefficient[0], coefficient[1]
        else:
            print("------------", pt1, pt2, m, b)
        if pt1[0] > width_lim:
            pt1[0] = width_lim
            pt1[1] = pt1[0] * m + b
            print("---1", pt1[0], pt1[1])
        if pt2[0] > width_lim:
            pt2[0] = width_lim
            pt2[1] = pt2[0] * m + b
            print("---2", pt2[0], pt2[1])
        if pt1[0] < 0:
            print("---*3", pt1[0], pt1[1])
            pt1[0] = 0
            pt1[1] = b
            print("---3", pt1[0], pt1[1])
        if pt2[0] < 0:
            print("---*4", pt1[0], pt1[1])
            pt2[0] = 0
            pt2[1] = b
            print("---4", pt1[0], pt1[1])
        if pt1[1] < 0:
            print("---7", pt1[0], pt1[1])
            pt1[1] = 0
            pt1[0] = -b / m
            print("---7", pt1[0], pt1[1])
        if pt2[1] < 0:
            print("---8", pt1[0], pt1[1])
            pt2[1] = 0
            pt2[0] = -b / m
            print("---8", pt1[0], pt1[1])

        if pt1[1] > height_lim:
            print("---*5", pt1[0], pt1[1])
            pt1[1] = height_lim
            pt1[0] = (pt1[1] - b) / m
            print("---5", pt1[0], pt1[1])
        if pt2[1] > height_lim:
            print("---*6", pt2[0], pt2[1])
            pt2[1] = height_lim
            pt2[0] = (pt2[1] - b) / m
            print("---6", pt2[0], pt2[1])
        return replace_points(pt1, pt2, width_lim, height_lim, m, b)

Below is the refined code.

    import numpy as np

    def replace_points(pt1, pt2, width_lim, height_lim, m=None, b=None):
        def clip_point(x, y, width_lim, height_lim, m, b):
            if x < 0:
                x = 0
                y = b
            elif x > width_lim:
                x = width_lim
                y = x * m + b

            if y < 0:
                y = 0
                if m != 0:
                    x = -b / m
            elif y > height_lim:
                y = height_lim
                if m != 0:
                    x = (y - b) / m
            return x, y

        if (0 <= pt1[0] <= width_lim and 0 <= pt1[1] <= height_lim and
                0 <= pt2[0] <= width_lim and 0 <= pt2[1] <= height_lim):
            return pt1, pt2

        if m is None or b is None:
            if pt1[0] == pt2[0]:
                m = 0
                b = pt1[1]
            else:
                coefficients = np.polyfit([pt1[0], pt2[0]], [pt1[1], pt2[1]], 1)
                m, b = coefficients

        pt1[0], pt1[1] = clip_point(pt1[0], pt1[1], width_lim, height_lim, m, b)
        pt2[0], pt2[1] = clip_point(pt2[0], pt2[1], width_lim, height_lim, m, b)

        if pt1[0] != pt2[0] or pt1[1] != pt2[1]:
            return replace_points(pt1, pt2, width_lim, height_lim, m, b)

        return pt1, pt2

    pt1 = [10, 10]
    pt2 = [300, 300]
    width_lim = 200
    height_lim = 200

    new_pt1, new_pt2 = replace_points(pt1, pt2, width_lim, height_lim)
    print(new_pt1, new_pt2)
