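Study notes (2): handwritten digit recognition with a linear (fully connected) neural network in PyTorch, using 0-1 txt files; for the convolutional version, see the next post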

Author: 你好赵伟 | 2024-04-04 10:38:22


Handwritten digit classification with a linear regression approach in PyTorch

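First, I am new to machine learning and deep learning myself, so the code here should not be too difficult. These notes build on some introductory machine learning and deep learning courses, plus part of a PyTorch tutorial. The relevant links are: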

Andrew Ng's Machine Learning, Andrew Ng's Deep Learning, Morvan (莫烦) PyTorch tutorials

All of the code below has been uploaded to GitHub, in the repository deep_learning_cnn_target_detection.

First we prepare the dataset, a folder named data1. Download links:

Baidu Netdisk

Link: https://pan.baidu.com/s/1qRS8-cbFGZNQsmLHZ704jA
Extraction code: hsxx

Aliyun Drive

「data1」https://www.aliyundrive.com/s/VY4iSy628jQ Extraction code: 59ar

The files in data1 are named in the format digit_serialnumber, so in Python the label can be extracted from the file name with split.
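As a small illustration of that naming convention, here is a minimal sketch of pulling the label out of a file name. The helper name `label_from_filename` and the example file names are hypothetical and chosen only for illustration; they are not taken from the repository.

```python
import os


def label_from_filename(filename):
    """Return the digit label encoded before the first underscore."""
    # "data1/3_12.txt" -> "3_12" (drop the folder and the extension)
    stem = os.path.splitext(os.path.basename(filename))[0]
    # "3_12" -> "3" (everything before the first underscore)
    digit, _, _serial = stem.partition("_")
    return int(digit)


# Hypothetical file names that follow the digit_serialnumber pattern
print(label_from_filename("3_12.txt"))
print(label_from_filename("data1/9_105.txt"))
```

The loader in this post does the same thing inline with two split calls.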

The Python code that reads the data is as follows:


import os

import numpy as np


def img2vector(filename):
    # Create a 1 x 1024 vector that will hold one 32 x 32 image
    returnVect = np.zeros((1, 1024))
    # Open the data file and read it line by line
    with open(filename) as fr:
        for i in range(32):
            # Read one line
            lineStr = fr.readline()
            # Convert the first 32 characters of the line to int and store them
            for j in range(32):
                returnVect[0, 32 * i + j] = int(lineStr[j])
    return returnVect
 
 
def trainData(trainPath):
    trainfile = os.listdir(trainPath)  # names of all files in the dataset folder
    Y = np.zeros((len(trainfile), 1))
    # One row per sample and 1024 columns; 1024 is the pixel count, i.e. 32*32
    X = np.zeros((len(trainfile), 1024))
    size = [[] for _ in range(10)]  # size[d] lists the sample indices of digit d
    # The number before the underscore in the file name is the label
    for i in range(len(trainfile)):
        thislabel = trainfile[i].split(".")[0].split("_")[0]
        if len(thislabel) != 0:
            size[int(thislabel)].append(i)
            Y[i][0] = int(thislabel)  # store the label
        X[i, :] = img2vector(os.path.join(trainPath, trainfile[i]))  # fill in the sample row
    return X, Y, size
 
 
X, Y, size = trainData('data1')

Next, the data is randomly split into a training set and a test set; see handwrite.py in the GitHub repository for details.
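The split itself is not shown in this post. A minimal sketch of one way to do it follows, assuming a random 80/20 split; the function name `split_train_test`, the ratio, and the seed are assumptions made for illustration, and handwrite.py may do it differently.

```python
import numpy as np


def split_train_test(X, Y, test_ratio=0.2, seed=0):
    """Shuffle the sample indices and cut them into train and test parts."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(X))            # random order of sample indices
    n_test = int(round(len(X) * test_ratio))   # number of held-out samples
    test_idx, train_idx = order[:n_test], order[n_test:]
    return X[train_idx], Y[train_idx], X[test_idx], Y[test_idx]


# Stand-in data with the same layout as X and Y returned by trainData
X_demo = np.zeros((10, 1024))
Y_demo = np.arange(10).reshape(10, 1)
X_train, Y_train, X_test, Y_test = split_train_test(X_demo, Y_demo)
print(X_train.shape, X_test.shape)
```

Before training, the arrays still have to become tensors. Since `torch.nn.CrossEntropyLoss` expects class-index targets as a 1D integer tensor, something like `torch.tensor(X_train, dtype=torch.float32)` for the inputs and `torch.tensor(Y_train.ravel(), dtype=torch.long)` for the labels would fit the training loop in this post.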

Each sample is flattened into a 1024-element row vector, and these values form the input layer of the neural network:


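import os

import torch
import torch.nn.functional as F


class Net(torch.nn.Module):     # inherit from torch's Module
    def __init__(self, n_feature, n_hidden, n_output):
        super(Net, self).__init__()     # run the parent class __init__
        self.hidden = torch.nn.Linear(n_feature, n_hidden)   # hidden layer (linear)
        self.out = torch.nn.Linear(n_hidden, n_output)       # output layer (linear)
 
    def forward(self, x):
        # Forward pass: the network maps the input to its output
        x = F.relu(self.hidden(x))      # activation applied to the hidden layer's linear output
        x = self.out(x)                 # raw scores, not yet the prediction; that is computed separately
        return x


net = Net(n_feature=32*32, n_hidden=10000, n_output=10)  # one output per class
# Resume from previously saved parameters if a checkpoint exists
if os.path.exists('net2_par.pkl'):
    net.load_state_dict(torch.load('net2_par.pkl'))
optimizer = torch.optim.SGD(net.parameters(), lr=0.04)  # pass in all parameters of net and the learning rate
# When computing the loss, the targets are NOT one-hot: they are a 1D tensor of shape (batch,)
# The predictions are a 2D tensor of shape (batch, n_classes)
loss_func = torch.nn.CrossEntropyLoss()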

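The Net class above defines the network architecture (see the Morvan PyTorch link above for details). The lines after the class create the concrete network, together with its optimizer and loss function.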
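To get a feel for the size of this network, the parameter count can be worked out by hand: a `torch.nn.Linear(n_in, n_out)` layer with its default bias holds `n_in * n_out` weights plus `n_out` biases. The helper `linear_param_count` below is a hypothetical name used only for this arithmetic sketch.

```python
def linear_param_count(n_in, n_out):
    """Weights (n_in * n_out) plus one bias per output unit."""
    return n_in * n_out + n_out


n_feature, n_hidden, n_output = 32 * 32, 10000, 10
hidden_params = linear_param_count(n_feature, n_hidden)  # hidden layer
out_params = linear_param_count(n_hidden, n_output)      # output layer
total_params = hidden_params + out_params
print(hidden_params, out_params, total_params)  # 10250000 100010 10350010
```

With about 10 million parameters, almost all of them sit in the hidden layer, which is why the saved net2_par.pkl file is fairly large for such a simple model.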

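torch.optim.SGD trains the network with stochastic gradient descent (SGD), and torch.nn.CrossEntropyLoss() is the cross-entropy loss. The output has 10 classes, so this is a classification network, which is why the cross-entropy loss is used.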
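For intuition about what the cross-entropy loss measures, here is a NumPy sketch of the underlying computation: the mean over the batch of the negative log-softmax score of the true class. It is an illustrative re-derivation with made-up numbers, not PyTorch's implementation; the function name `cross_entropy` and the sample logits are assumptions.

```python
import numpy as np


def cross_entropy(logits, targets):
    """Mean over the batch of -log(softmax(logits)[target])."""
    # Subtract the row maximum first for numerical stability
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # Pick the log-probability of the true class in every row
    return -log_probs[np.arange(len(targets)), targets].mean()


# Made-up scores for 2 samples and 3 classes; targets are class indices, not one-hot
logits = np.array([[2.0, 0.0, 0.0],
                   [0.0, 0.0, 0.0]])
targets = np.array([0, 2])
loss = cross_entropy(logits, targets)
print(loss)
```

The first sample is fairly confident in the correct class and contributes a small loss; the second spreads its scores evenly over three classes and contributes log(3).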

The code below trains the network and then prints the test results:


for t in range(350):
    out = net(X_train_tensor)  # feed the training data x to net and get its output
    loss = loss_func(out, Y_train_tensor)  # compute the error between the two
    print(t, loss.item())
    optimizer.zero_grad()  # clear the gradients left over from the previous step
    loss.backward()  # backpropagate the error to compute the gradients
    optimizer.step()  # apply the updates to the parameters of net
torch.save(net.state_dict(), 'net2_par.pkl')   # save the parameters only
 
 
 
with torch.no_grad():  # no gradients are needed for evaluation
    out = net(X_test_tensor)
prediction = torch.max(F.softmax(out, dim=1), 1)[1]  # index of the largest probability
pred_y = prediction.numpy().squeeze()
target_y = Y_test_tensor.numpy()
accuracy = sum(pred_y == target_y) / X_test_tensor.shape[0]  # fraction of predictions that match the true labels
print(accuracy)
print(target_y)
print('-'*50)
print(pred_y)
print(len(pred_y))

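Because this is a classification problem, what we need is the largest of the 10 class outputs: the position (index) of the maximum is the predicted label. The complete code is in handwrite.py in the GitHub repository.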
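The "index of the maximum" step can be sketched in NumPy as follows. Softmax preserves the ordering of the scores, so taking the argmax of the raw outputs gives the same labels as taking it after softmax. The scores and labels here are made up for illustration.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax with the usual max-subtraction for stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


# Made-up raw outputs for 3 samples and 3 classes, with their true labels
logits = np.array([[0.1, 2.5, -1.0],
                   [3.0, 0.2, 0.4],
                   [0.0, 0.1, 0.3]])
target_y = np.array([1, 0, 1])

pred_from_probs = softmax(logits).argmax(axis=1)   # what the post's code does
pred_from_logits = logits.argmax(axis=1)           # same labels without softmax
accuracy = (pred_from_logits == target_y).mean()   # fraction of correct predictions
print(pred_from_probs, pred_from_logits, accuracy)
```

Two of the three made-up samples are classified correctly, so the accuracy is 2/3.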

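The actual live detection can be found in num_detect_linear.py in the GitHub repository.

The drawback of this code is that the target region has to be located first before a digit can be recognized; it cannot detect digits at arbitrary positions in an image (arbitrary-position detection is covered in a later post).

Below are the parameters and the network setup used for detection: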


import cv2

image = cv2.imread("detect3.jpg")
# num = 7   # used when collecting the dataset
# epo = 60  # used when collecting the dataset
# [0, 255, 210]
 
# gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
# r, b = cv2.threshold(gray, 60, 255, cv2.THRESH_BINARY)
[X, Y, D] = image.shape  # X = image height, Y = image width, D = channels
gs_frame = cv2.GaussianBlur(image, (5, 5), 0)  # Gaussian blur
hsv = cv2.cvtColor(gs_frame, cv2.COLOR_BGR2HSV)  # convert to an HSV image
erode_hsv = cv2.erode(hsv, None, iterations=2)  # erosion: thick strokes become thinner
# color_dist and ball_color (the HSV range to keep) are not shown in this excerpt
inRange_hsv = cv2.inRange(erode_hsv, color_dist[ball_color]['Lower'], color_dist[ball_color]['Upper'])
cnts = cv2.findContours(inRange_hsv.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[-2]
net4 = Net(n_feature=32 * 32, n_hidden=10000, n_output=10)  # one output per class
net4.load_state_dict(torch.load('net2_par.pkl'))

Below, each target region is located and its digit is recognized:


for i in cnts:
    rect = cv2.minAreaRect(i)   # minimum-area rotated rectangle around the contour
    box = cv2.boxPoints(rect)   # its four corner points
    # cv2.drawContours(image, [box.astype(np.int32)], -1, (0, 0, 255), 2)
    x = [box[0][0], box[1][0], box[2][0], box[3][0]]
    y = [box[0][1], box[1][1], box[2][1], box[3][1]]
    # Pad the box by 30 pixels and clamp it to the image (Y = width, X = height)
    x_min = max(int(min(x) - 30), 0)
    x_max = min(int(max(x) + 30), Y)
    y_min = max(int(min(y) - 30), 0)
    y_max = min(int(max(y) + 30), X)
    # fron.append([x_min, x_max, y_min, y_max])
    img = image[y_min:y_max, x_min:x_max]
    cv2.imshow('camera', img)
    img_s = cv2.resize(img, (32, 32))
 
    im = pretreatment(img_s)  # preprocessing helper, not shown in this excerpt
 
    X_test = np.reshape(im, (1, 1024))
    X_test_tensor = torch.tensor(X_test, dtype=torch.float32)
    out = net4(X_test_tensor)
    prediction = torch.max(F.softmax(out, dim=1), 1)[1]
    pred_y = prediction.data.numpy().squeeze()
    print(pred_y)
    cv2.waitKey(0)  # wait for a key press before moving to the next region
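The padding-and-clamping arithmetic in the loop can be isolated into a small pure-Python helper, which makes it easy to check on its own. This is only a sketch: the name `padded_box` and the sample corner coordinates are hypothetical, and the 30-pixel margin is the value used in the code above.

```python
def padded_box(xs, ys, pad, width, height):
    """Grow the bounding box of the corner points by pad pixels, clamped to the image."""
    x_min = max(int(min(xs) - pad), 0)
    x_max = min(int(max(xs) + pad), width)
    y_min = max(int(min(ys) - pad), 0)
    y_max = min(int(max(ys) + pad), height)
    return x_min, x_max, y_min, y_max


# Made-up corner coordinates of a rotated rectangle in a 100 x 200 (width x height) image
corners_x = [10.5, 80.2, 82.0, 12.3]
corners_y = [20.0, 18.4, 90.1, 91.7]
print(padded_box(corners_x, corners_y, pad=30, width=100, height=200))  # (0, 100, 0, 121)
```

The returned values can be used directly as slice bounds, in the same order as the crop above: rows `y_min:y_max`, then columns `x_min:x_max`.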

The final result is as follows:
