我在B站读大学-【小土堆】PyTorch深度学习快速入门教程学习笔记(四)_小土堆笔记

作者：算法创新者 | 2024-01-31 19:46:15

踩

小土堆笔记

文章目录

介绍

Pytorch入门选手的学习笔记 (序号与视频序号完全对应), 本文尽可能的在详细记录了自己的学习内容, 包括但不限于(加了详细注释的代码, 截图并补充了说明的插图, UP讲课时的关键内容),因为自己学习过程中看其他博主的笔记, 有所收获,.
希望我的笔记也可以帮助一起学习的小伙伴, 加油哇~

视频链接

PyTorch深度学习快速入门教程【小土堆】:通往学习之门-点这里 !!!

参考的其他博主的笔记链接 ,非常感谢!!

笔记链接再次感谢上述博主分享, 在学习过程中有效帮助我节约了时间并加深了理解~

30. 利用GPU训练(一)

能进行GPU训练部分: 三部分可添加 .cuda()
- 网络模型
- 数据（imgs，targets）
- 损失函数
- train_gpu_1.py
  
  在上一节视频中 train.py 的基础上
  
  增加.cuda()
  
  增加测试时间的计算
  
  增加

# 准备数据集
import torch
import torchvision
from torch import nn
from torch.nn.modules.flatten import Flatten
from torch.utils.tensorboard import SummaryWriter
import time


# from model import *
from torch.utils.data import DataLoader

train_data = torchvision.datasets.CIFAR10(root="../data", train=True, transform=torchvision.transforms.ToTensor(),
                                          download=True)
test_data = torchvision.datasets.CIFAR10(root="../data", train=False, transform=torchvision.transforms.ToTensor(),
                                         download=True)

# 查看数据集大小 length 长度
train_data_size = len(train_data)
test_data_size = len(test_data)
print("训练数据集长度为:{}", format(train_data_size))
print("测试数据集长度为:{}", format(test_data_size))

# 利用 DataLoader 来加载数据集
train_dataloader = DataLoader(train_data, batch_size=64)
test_dataloader = DataLoader(test_data, batch_size=64)


# 创建网络模型
class Tudui(nn.Module):
    def __init__(self):
        self.model = super(Tudui, self).__init__()
        # 把网络放在一个序列中
        self.model = nn.Sequential(
            nn.Conv2d(3, 32, 5, stride=1, padding=2),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 32, 5, 1, 2),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 5, 1, 2),
            nn.MaxPool2d(2),
            Flatten(),
            nn.Linear(1024, 64),
            nn.Linear(64, 10)
        )

    def forward(self, x):
        x = self.model(x)
        return x


tudui = Tudui()
# 判断是否可用cuda
if torch.cuda.is_available():
    tudui = tudui.cuda() # 网络模型转移到cuda


# 定义损失函数 用交叉熵
loss_fn = nn.CrossEntropyLoss()
if torch.cuda.is_available():
    loss_fn = loss_fn.cuda()  # loss转移到cuda

# 定义优化器 选择的是随机梯度下降
# 1e-2 = 1*(10)^(-2) = 1/100 =0.01
learning_rate = 1e-2
optimizer = torch.optim.SGD(tudui.parameters(), lr=learning_rate)

# 设置训练网络的一些参数
# 记录训练的次数
total_train_step = 0
# 记录测试的次数
total_test_step = 0
# 训练的轮数
epoch = 10

# 添加tensorboard
writer = SummaryWriter()
# 记录训练起始时间
start_time = time.time()

for i in range(epoch):
    print("-----------第{}轮训练开始-----------".format(i + 1))

    # 训练步骤开始
    tudui.train()  # 作用:将网络设置成训练模式 只对部分网络层有作用,例如Dropout层,BatchNorm层等
    for data in train_dataloader:
        imgs, targets = data
        if torch.cuda.is_available():
            imgs = imgs.cuda()
            targets = targets.cuda()
        outputs = tudui(imgs)
        loss = loss_fn(outputs, targets)

        # 优化器优化模型
        # 利用优化器将梯度清零
        optimizer.zero_grad()
        # 利用反向传播得到每个梯度的结点
        loss.backward()
        # 调用优化器
        optimizer.step()

        # 训练次数+1
        total_train_step += 1
        if total_train_step % 100 == 0:
            # 记录训练终止的时间
            end_time = time.time()
            print("训练时间:{}".format(end_time-start_time))
            print("训练次数:{},Loss:{}".format(total_train_step, loss))
            writer.add_scalar("train_loss", loss.item(), total_train_step)

    # 测试步骤开始
    tudui.eval()  # 设置模型进入验证状态 只对特定层有作用,例如Dropout层,BatchNorm层等
    # 计算整个数据集上的loss
    total_test_loss = 0
    # 计算整体的正确率 整体正确的个数 初始为0
    total_accuracy = 0

    with torch.no_grad():  # 本行要有, 测试时不要改变梯度
        for data in test_dataloader:
            imgs, targets = data
            if torch.cuda.is_available():
                imgs = imgs.cuda()
                targets = targets.cuda()
            outputs = tudui(imgs)
            # 计算数据的损失
            loss = loss_fn(outputs, targets)
            # 计算整体损失
            total_test_loss = total_test_loss + loss
            # 横向比较
            # argmax(1)中  填1 水平比较  如果为正确为True  sum + 1
            accuracy = (outputs.argmax(1) == targets).sum()
            total_accuracy = total_accuracy + accuracy
    print("整体测试集上的loss:{}".format(total_test_loss))
    print("整体测试集上的正确率:{}".format(total_accuracy / test_data_size))
    writer.add_scalar("test_loss", total_test_loss, total_test_step)
    writer.add_scalar("test_accuracy", total_accuracy / test_data_size, total_test_step)
    total_train_step += 1

    # 保存训练时每一轮的模型
    torch.save(tudui, "tudui_{}.pth".format(i))
    # 官方推荐的保存方式 torch.save(tudui.stats_dict(),"tudui_{}.path.format)
    print('模型已保存')

writer.close()

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144

本人电脑没有独显,在 colab 上试运行的, 那是真的爽哇~

在这期里小土堆也讲了下colab的简单使用!

下述是查看colab配置的方法(转载)
版权声明：本文为CSDN博主「打孔猿」的原创文章，遵循CC 4.0 BY-SA版权协议，转载请附上原文出处链接及本声明。
原文链接：https://blog.csdn.net/weixin_44562957/article/details/120139677

print("============查看GPU信息================")
# 查看GPU信息
!/opt/bin/nvidia-smi
print("==============查看pytorch版本==============")
# 查看pytorch版本
import torch
print(torch.__version__)
print("============查看虚拟机硬盘容量================")
# 查看虚拟机硬盘容量
!df -lh
print("============查看cpu配置================")
# 查看cpu配置
!cat /proc/cpuinfo | grep model\ name
print("=============查看内存容量===============")
# 查看内存容量
!cat /proc/meminfo | grep MemTotal
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16

我随机分的配置, 跑起来真的比CPU快很多~~啦

在这里插入图片描述

31. 利用GPU训练(二)

.to(device) # 这种方式更常用~

device = torch.device("cpu)
device = torch.device(“cuda”)
当电脑上有多张显卡时: 可以指定显卡
- device = torch.device(“cuda:0”) # 制定第一个显卡
- device = torch.device(“cuda:1”) # 制定第二个显卡

train_gpu_2.py

首先把train_gpu_1.py 里的代码复制过来 , 开改~
改动部分如下:
- 定义训练的设备: device = torch.device('cuda')
- 定义模型 :
  - 常见写法1 tudui.to('device')
  - 常见写法2 device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
- 修改损失函数: loss_fn = loss_fn.to(device)
- 修改训练和测试的cuda部分
  
  imgs = imgs.to(device)
  targets = targets.to(device)

# 准备数据集
import torch
import torchvision
from torch import nn
from torch.nn.modules.flatten import Flatten
from torch.utils.tensorboard import SummaryWriter
import time


from torch.utils.data import DataLoader


# 定义训练的设备
# 写法2  device = torch.device('cuda' if torch.cuda.is_available() else 'cpu') 可适应不同情况
device = torch.device('cpu')
# 打印正在使用的设备
print(device)

train_data = torchvision.datasets.CIFAR10(root="../data", train=True, transform=torchvision.transforms.ToTensor(),
                                          download=True)
test_data = torchvision.datasets.CIFAR10(root="../data", train=False, transform=torchvision.transforms.ToTensor(),
                                         download=True)

# 查看数据集大小 length 长度
train_data_size = len(train_data)
test_data_size = len(test_data)
print("训练数据集长度为:{}", format(train_data_size))
print("测试数据集长度为:{}", format(test_data_size))

# 利用 DataLoader 来加载数据集
train_dataloader = DataLoader(train_data, batch_size=64)
test_dataloader = DataLoader(test_data, batch_size=64)


# 创建网络模型
class Tudui(nn.Module):
    def __init__(self):
        self.model = super(Tudui, self).__init__()
        # 把网络放在一个序列中
        self.model = nn.Sequential(
            nn.Conv2d(3, 32, 5, stride=1, padding=2),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 32, 5, 1, 2),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 5, 1, 2),
            nn.MaxPool2d(2),
            Flatten(),
            nn.Linear(1024, 64),
            nn.Linear(64, 10)
        )

    def forward(self, x):
        x = self.model(x)
        return x


tudui = Tudui()
# 将网络转移到设备上去
tudui.to(device)


# 定义损失函数 用交叉熵
loss_fn = nn.CrossEntropyLoss()
loss_fn = loss_fn.to(device)

# 定义优化器 选择的是随机梯度下降
# 1e-2 = 1*(10)^(-2) = 1/100 =0.01
learning_rate = 1e-2
optimizer = torch.optim.SGD(tudui.parameters(), lr=learning_rate)

# 设置训练网络的一些参数
# 记录训练的次数
total_train_step = 0
# 记录测试的次数
total_test_step = 0
# 训练的轮数
epoch = 10

# 添加tensorboard
writer = SummaryWriter()
# 记录训练起始时间
start_time = time.time()

for i in range(epoch):
    print("-----------第{}轮训练开始-----------".format(i + 1))

    # 训练步骤开始
    tudui.train()  # 作用:将网络设置成训练模式 只对部分网络层有作用,例如Dropout层,BatchNorm层等
    for data in train_dataloader:
        imgs, targets = data
        imgs = imgs.to(device)
        targets = targets.to(device)
        outputs = tudui(imgs)
        loss = loss_fn(outputs, targets)

        # 优化器优化模型
        # 利用优化器将梯度清零
        optimizer.zero_grad()
        # 利用反向传播得到每个梯度的结点
        loss.backward()
        # 调用优化器
        optimizer.step()

        # 训练次数+1
        total_train_step += 1
        if total_train_step % 100 == 0:
            # 记录训练终止的时间
            end_time = time.time()
            print("训练时间:{}".format(end_time-start_time))
            print("训练次数:{},Loss:{}".format(total_train_step, loss))
            writer.add_scalar("train_loss", loss.item(), total_train_step)

    # 测试步骤开始
    tudui.eval()  # 设置模型进入验证状态 只对特定层有作用,例如Dropout层,BatchNorm层等
    # 计算整个数据集上的loss
    total_test_loss = 0
    # 计算整体的正确率 整体正确的个数 初始为0
    total_accuracy = 0

    with torch.no_grad():  # 本行要有, 测试时不要改变梯度
        for data in test_dataloader:
            imgs, targets = data
            imgs = imgs.to(device)
            targets = targets.to(device)
            outputs = tudui(imgs)
            # 计算数据的损失
            loss = loss_fn(outputs, targets)
            # 计算整体损失
            total_test_loss = total_test_loss + loss
            # 横向比较
            # argmax(1)中  填1 水平比较  如果为正确为True  sum + 1
            accuracy = (outputs.argmax(1) == targets).sum()
            total_accuracy = total_accuracy + accuracy
    print("整体测试集上的loss:{}".format(total_test_loss))
    print("整体测试集上的正确率:{}".format(total_accuracy / test_data_size))
    writer.add_scalar("test_loss", total_test_loss, total_test_step)
    writer.add_scalar("test_accuracy", total_accuracy / test_data_size, total_test_step)
    total_train_step += 1

    # 保存训练时每一轮的模型
    torch.save(tudui, "tudui_{}.pth".format(i))
    # 官方推荐的保存方式 torch.save(tudui.stats_dict(),"tudui_{}.path.format)
    print('模型已保存')

writer.close()
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145

在colab上跑起来~

真的好快, 再一次向Colab表达我的爱,轻薄本无独显人士的大救星!!!~

32. 完整模型的验证套路

即利用已经训练好的模型，然后给它提供输入

test.py

# 数据路径
import torch
import torchvision
from PIL import Image
from torch import nn
from torch.nn.modules.flatten import Flatten

image_path = "../imgs/dog.png"
image = Image.open(image_path)
print(image)
# png格式是四通道，除了RGB三通道外，还有一个透明度通道，所以我们需要调用convert('RGB')保留其颜色通道
# 如果图片本来就是三个颜色通道，经过此操作，不变。
# 加上这一步后，可以适应png jpg各种格式的图片。

# torchvision.transforms.Compose(transforms) 将几个转换组合在一起
image = image.convert('RGB') # 有了这句就不会报错;
transform = torchvision.transforms.Compose([torchvision.transforms.Resize((32,32)),
                                           torchvision.transforms.ToTensor()])

image = transform(image)
print(image.shape)

# 搭建神经网络模型
class Tudui(nn.Module):
    def __init__(self):
        self.model = super(Tudui,self).__init__()
        # 把网络放在一个序列中
        self.model = nn.Sequential(
            nn.Conv2d(3, 32, 5,stride=1, padding=2),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 32, 5, 1,2),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 5, 1,2),
            nn.MaxPool2d(2),
            Flatten(),
            nn.Linear(1024, 64),
            nn.Linear(64, 10)
        )

    def forward(self,x):
        x = self.model(x)
        return x

# 加载网络模型
# 如果使用的是GPU训练的模型,在CPU上使用,映射到CPU上 如下
# model = torch.load('tudui_29_gpu.pth',map_location=torch.device('cpu'))
model = torch.load("tudui_0.pth")
print(model)
image = torch.reshape(image,(1,3,32,32))
# 把模型转化为测试类型
model.eval() # 此步易遗忘
with torch.no_grad(): # 本步可节约内存,提升性能
    output = model(image)
output = model(image)
print(output)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55

+= 1

# 保存训练时每一轮的模型
torch.save(tudui, "tudui_{}.pth".format(i))
# 官方推荐的保存方式 torch.save(tudui.stats_dict(),"tudui_{}.path.format)
print('模型已保存')
1
2
3
4

writer.close()


在colab上跑起来~

真的好快, 再一次向Colab表达我的爱,轻薄本无独显人士的大救星!!!~

[外链图片转存中...(img-t2ScrtXv-1663768832015)]





## 32. 完整模型的验证套路

- 即利用已经训练好的模型，然后给它提供输入

test.py

```python
# 数据路径
import torch
import torchvision
from PIL import Image
from torch import nn
from torch.nn.modules.flatten import Flatten

image_path = "../imgs/dog.png"
image = Image.open(image_path)
print(image)
# png格式是四通道，除了RGB三通道外，还有一个透明度通道，所以我们需要调用convert('RGB')保留其颜色通道
# 如果图片本来就是三个颜色通道，经过此操作，不变。
# 加上这一步后，可以适应png jpg各种格式的图片。

# torchvision.transforms.Compose(transforms) 将几个转换组合在一起
image = image.convert('RGB') # 有了这句就不会报错;
transform = torchvision.transforms.Compose([torchvision.transforms.Resize((32,32)),
                                           torchvision.transforms.ToTensor()])

image = transform(image)
print(image.shape)

# 搭建神经网络模型
class Tudui(nn.Module):
    def __init__(self):
        self.model = super(Tudui,self).__init__()
        # 把网络放在一个序列中
        self.model = nn.Sequential(
            nn.Conv2d(3, 32, 5,stride=1, padding=2),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 32, 5, 1,2),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 5, 1,2),
            nn.MaxPool2d(2),
            Flatten(),
            nn.Linear(1024, 64),
            nn.Linear(64, 10)
        )

    def forward(self,x):
        x = self.model(x)
        return x

# 加载网络模型
# 如果使用的是GPU训练的模型,在CPU上使用,映射到CPU上 如下
# model = torch.load('tudui_29_gpu.pth',map_location=torch.device('cpu'))
model = torch.load("tudui_0.pth")
print(model)
image = torch.reshape(image,(1,3,32,32))
# 把模型转化为测试类型
model.eval() # 此步易遗忘
with torch.no_grad(): # 本步可节约内存,提升性能
    output = model(image)
output = model(image)
print(output)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73

声明：本文内容由网友自发贡献，不代表【wpsshop博客】立场，版权归原作者所有，本站不承担相应法律责任。如您发现有侵权的内容，请联系我们。转载请注明出处：https://www.wpsshop.cn/article/detail/51824