
PyTorch Official Tutorial Study Notes 6: Training a Complete Model

Data Processing

PyTorch has two basic utilities for working with data: torch.utils.data.DataLoader and torch.utils.data.Dataset. Dataset stores the samples and their corresponding labels, while DataLoader wraps an iterable around the Dataset.
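For intuition, a map-style Dataset is simply an object that implements __len__ and __getitem__. Below is a minimal sketch of a custom dataset holding in-memory data; the class name and fields are hypothetical, just for illustration:

    from torch.utils.data import Dataset

    class InMemoryDataset(Dataset):
        """Hypothetical example: stores samples and labels; DataLoader iterates over it."""
        def __init__(self, samples, labels):
            self.samples, self.labels = samples, labels

        def __len__(self):
            return len(self.samples)

        def __getitem__(self, idx):
            # Return one (sample, label) pair by index.
            return self.samples[idx], self.labels[idx]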

    import torch
    from torch import nn
    from torch.utils.data import DataLoader
    from torchvision import datasets
    from torchvision.transforms import ToTensor

PyTorch offers domain-specific libraries such as TorchText, TorchVision, and TorchAudio, all of which include datasets. In this tutorial, we will use a TorchVision dataset.

The torchvision.datasets module contains Dataset objects for many real-world vision datasets, such as CIFAR and COCO (see the TorchVision documentation for the full list). In this tutorial, we use the FashionMNIST dataset. Every TorchVision Dataset includes two arguments, transform and target_transform, which modify the samples and the labels respectively.

    # Download training data from open datasets.
    training_data = datasets.FashionMNIST(
        root="data",
        train=True,
        download=True,
        transform=ToTensor(),
    )

    # Download test data from open datasets.
    test_data = datasets.FashionMNIST(
        root="data",
        train=False,
        download=True,
        transform=ToTensor(),
    )
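The calls above only used transform; target_transform works the same way on the labels. As a minimal sketch (the variable name one_hot_data is just for illustration), the integer labels can be one-hot encoded with Lambda, following the pattern from the official transforms tutorial:

    from torchvision.transforms import Lambda

    one_hot_data = datasets.FashionMNIST(
        root="data",
        train=True,
        download=True,
        transform=ToTensor(),
        # Turn the integer label y into a 10-dimensional one-hot float tensor.
        target_transform=Lambda(
            lambda y: torch.zeros(10, dtype=torch.float).scatter_(dim=0, index=torch.tensor(y), value=1)
        ),
    )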

We have downloaded the FashionMNIST training and test sets into training_data and test_data respectively. Next we pass the Dataset as an argument to DataLoader. DataLoader wraps an iterable around the dataset and supports automatic batching, sampling, shuffling, and multiprocess data loading. Here we define a batch size of 64, i.e. each element in the dataloader iterable will return a batch of 64 features and labels.

    batch_size = 64

    # Create data loaders.
    train_dataloader = DataLoader(training_data, batch_size=batch_size)
    test_dataloader = DataLoader(test_data, batch_size=batch_size)

    for X, y in test_dataloader:
        print(f"Shape of X [N, C, H, W]: {X.shape}")
        print(f"Shape of y: {y.shape} {y.dtype}")
        break

Output:

    Shape of X [N, C, H, W]: torch.Size([64, 1, 28, 28])
    Shape of y: torch.Size([64]) torch.int64
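Note that the loaders above rely on DataLoader defaults, so there is no shuffling and data is loaded in the main process. The shuffling and multiprocess loading mentioned earlier are opt-in arguments; a minimal sketch (the worker count is chosen arbitrarily):

    # Reshuffle the training data every epoch and load batches in two worker processes.
    train_dataloader = DataLoader(
        training_data, batch_size=batch_size, shuffle=True, num_workers=2
    )

Shuffling the training set is generally a good idea; the tutorial keeps the defaults for simplicity.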

Creating the Model

To define a neural network in PyTorch, we create a class that inherits from nn.Module. The network's layers are defined in the __init__ function, and how data flows through the network is specified in the forward function. To accelerate operations in the network, we move it to a GPU or MPS (the accelerator on Apple silicon) if one is available.

    # Get cpu, gpu or mps device for training.
    device = (
        "cuda"
        if torch.cuda.is_available()
        else "mps"
        if torch.backends.mps.is_available()
        else "cpu"
    )
    print(f"Using {device} device")

    # Define model
    class NeuralNetwork(nn.Module):
        def __init__(self):
            super().__init__()
            self.flatten = nn.Flatten()
            self.linear_relu_stack = nn.Sequential(
                nn.Linear(28*28, 512),
                nn.ReLU(),
                nn.Linear(512, 512),
                nn.ReLU(),
                nn.Linear(512, 10)
            )

        def forward(self, x):
            x = self.flatten(x)
            logits = self.linear_relu_stack(x)
            return logits

    model = NeuralNetwork().to(device)
    print(model)

Output:

    Using cuda device
    NeuralNetwork(
      (flatten): Flatten(start_dim=1, end_dim=-1)
      (linear_relu_stack): Sequential(
        (0): Linear(in_features=784, out_features=512, bias=True)
        (1): ReLU()
        (2): Linear(in_features=512, out_features=512, bias=True)
        (3): ReLU()
        (4): Linear(in_features=512, out_features=10, bias=True)
      )
    )
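Before training, it can help to sanity-check the forward pass with a random input and convert the raw logits into class probabilities; a minimal sketch (not part of the original tutorial flow):

    X = torch.rand(1, 28, 28, device=device)  # one random 28x28 "image"
    logits = model(X)                          # raw scores, shape [1, 10]
    pred_probab = nn.Softmax(dim=1)(logits)    # normalize the scores to probabilities
    print(f"Predicted class: {pred_probab.argmax(1)}")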

Optimizing the Model Parameters

To train the model, we need a loss function and an optimizer.

    loss_fn = nn.CrossEntropyLoss()
    optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)

In a single training loop, the model makes predictions on the training dataset (fed to it in batches) and backpropagates the prediction error to adjust the model's parameters.

    def train(dataloader, model, loss_fn, optimizer):
        size = len(dataloader.dataset)
        model.train()
        for batch, (X, y) in enumerate(dataloader):
            X, y = X.to(device), y.to(device)

            # Compute prediction error
            pred = model(X)
            loss = loss_fn(pred, y)

            # Backpropagation
            loss.backward()
            optimizer.step()
            optimizer.zero_grad()

            if batch % 100 == 0:
                loss, current = loss.item(), (batch + 1) * len(X)
                print(f"loss: {loss:>7f}  [{current:>5d}/{size:>5d}]")

Training runs over several iterations (epochs). During each epoch, the model learns parameters to make better predictions. We print the model's accuracy and loss at each epoch, hoping to see the accuracy increase and the loss decrease over time.
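The epoch loop below also calls a test helper that evaluates accuracy and average loss on the test set; it is reproduced here (verbatim from the full train.py listing in the Summary) so the snippet is self-contained:

    def test(dataloader, model, loss_fn):
        size = len(dataloader.dataset)
        num_batches = len(dataloader)
        model.eval()
        test_loss, correct = 0, 0
        with torch.no_grad():
            for X, y in dataloader:
                X, y = X.to(device), y.to(device)
                pred = model(X)
                test_loss += loss_fn(pred, y).item()
                correct += (pred.argmax(1) == y).type(torch.float).sum().item()
        test_loss /= num_batches
        correct /= size
        print(f"Test Error: \n Accuracy: {(100*correct):>0.1f}%, Avg loss: {test_loss:>8f} \n")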

    epochs = 5
    for t in range(epochs):
        print(f"Epoch {t+1}\n-------------------------------")
        train(train_dataloader, model, loss_fn, optimizer)
        test(test_dataloader, model, loss_fn)
    print("Done!")

Output:

    Epoch 1
    -------------------------------
    loss: 2.303494  [   64/60000]
    loss: 2.294637  [ 6464/60000]
    loss: 2.277102  [12864/60000]
    loss: 2.269977  [19264/60000]
    loss: 2.254235  [25664/60000]
    loss: 2.237146  [32064/60000]
    loss: 2.231055  [38464/60000]
    loss: 2.205037  [44864/60000]
    loss: 2.203240  [51264/60000]
    loss: 2.170889  [57664/60000]
    Test Error:
     Accuracy: 53.9%, Avg loss: 2.168588

    Epoch 2
    -------------------------------
    loss: 2.177787  [   64/60000]
    loss: 2.168083  [ 6464/60000]
    loss: 2.114910  [12864/60000]
    loss: 2.130412  [19264/60000]
    loss: 2.087473  [25664/60000]
    loss: 2.039670  [32064/60000]
    loss: 2.054274  [38464/60000]
    loss: 1.985457  [44864/60000]
    loss: 1.996023  [51264/60000]
    loss: 1.917241  [57664/60000]
    Test Error:
     Accuracy: 60.2%, Avg loss: 1.920374

    Epoch 3
    -------------------------------
    loss: 1.951705  [   64/60000]
    loss: 1.919516  [ 6464/60000]
    loss: 1.808730  [12864/60000]
    loss: 1.846550  [19264/60000]
    loss: 1.740618  [25664/60000]
    loss: 1.698733  [32064/60000]
    loss: 1.708889  [38464/60000]
    loss: 1.614436  [44864/60000]
    loss: 1.646475  [51264/60000]
    loss: 1.524308  [57664/60000]
    Test Error:
     Accuracy: 61.4%, Avg loss: 1.547092

    Epoch 4
    -------------------------------
    loss: 1.612695  [   64/60000]
    loss: 1.570870  [ 6464/60000]
    loss: 1.424730  [12864/60000]
    loss: 1.489542  [19264/60000]
    loss: 1.367256  [25664/60000]
    loss: 1.373464  [32064/60000]
    loss: 1.376744  [38464/60000]
    loss: 1.304962  [44864/60000]
    loss: 1.347154  [51264/60000]
    loss: 1.230661  [57664/60000]
    Test Error:
     Accuracy: 62.7%, Avg loss: 1.260891

    Epoch 5
    -------------------------------
    loss: 1.337803  [   64/60000]
    loss: 1.313278  [ 6464/60000]
    loss: 1.151837  [12864/60000]
    loss: 1.252142  [19264/60000]
    loss: 1.123048  [25664/60000]
    loss: 1.159531  [32064/60000]
    loss: 1.175011  [38464/60000]
    loss: 1.115554  [44864/60000]
    loss: 1.160974  [51264/60000]
    loss: 1.062730  [57664/60000]
    Test Error:
     Accuracy: 64.6%, Avg loss: 1.087374

    Done!

Saving the Model

A common way to save a model is to serialize the internal state dictionary (which contains the model parameters).

    torch.save(model.state_dict(), "model.pth")
    print("Saved PyTorch Model State to model.pth")

Output:

    Saved PyTorch Model State to model.pth
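As an aside, torch.save can also serialize the entire model object instead of just its state dict. This relies on pickle and requires the class definition to be importable at load time, which is why the state-dict approach above is generally preferred. A sketch (the filename is arbitrary):

    # Save the whole module (architecture + weights) via pickle.
    torch.save(model, "model_full.pth")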
    

Loading the Model

Loading a model involves re-creating the model structure and loading the state dictionary into it.

    model = NeuralNetwork().to(device)
    model.load_state_dict(torch.load("model.pth", weights_only=True))

Output:

    <All keys matched successfully>
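One caveat: if model.pth was saved on a CUDA machine and is later loaded on a CPU-only machine, pass map_location so the saved tensors are remapped to an available device; a sketch:

    # Remap GPU-saved tensors onto the CPU when no GPU is available.
    model.load_state_dict(
        torch.load("model.pth", map_location=torch.device("cpu"), weights_only=True)
    )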
    

This model can now be used to make predictions.

    classes = [
        "T-shirt/top",
        "Trouser",
        "Pullover",
        "Dress",
        "Coat",
        "Sandal",
        "Shirt",
        "Sneaker",
        "Bag",
        "Ankle boot",
    ]

    model.eval()
    x, y = test_data[0][0], test_data[0][1]
    with torch.no_grad():
        x = x.to(device)
        pred = model(x)
        predicted, actual = classes[pred[0].argmax(0)], classes[y]
        print(f'Predicted: "{predicted}", Actual: "{actual}"')

Output:

    Predicted: "Ankle boot", Actual: "Ankle boot"
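To see how confident the model is, rather than just its top label, you can rank the classes by softmax probability; a minimal sketch reusing x and classes from above:

    with torch.no_grad():
        probs = model(x.to(device))[0].softmax(0)  # probabilities over the 10 classes
        top_p, top_i = probs.topk(3)               # the three most likely classes
        for p, i in zip(top_p, top_i):
            print(f"{classes[i]}: {p.item():.3f}")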
    

Summary

Here is the complete code to run. It is split into two files: one trains the model, the other loads and uses it. Note that both files must be in the same directory, so that apply.py can find the saved model.pth.

train.py:

    import torch
    from torch import nn
    from torch.utils.data import DataLoader
    from torchvision import datasets
    from torchvision.transforms import ToTensor

    # Download training data from open datasets.
    training_data = datasets.FashionMNIST(
        root="data",
        train=True,
        download=True,
        transform=ToTensor(),
    )

    # Download test data from open datasets.
    test_data = datasets.FashionMNIST(
        root="data",
        train=False,
        download=True,
        transform=ToTensor(),
    )

    batch_size = 64

    # Create data loaders.
    train_dataloader = DataLoader(training_data, batch_size=batch_size)
    test_dataloader = DataLoader(test_data, batch_size=batch_size)

    for X, y in test_dataloader:
        print(f"Shape of X [N, C, H, W]: {X.shape}")
        print(f"Shape of y: {y.shape} {y.dtype}")
        break

    # Get cpu, gpu or mps device for training.
    device = (
        "cuda"
        if torch.cuda.is_available()
        else "mps"
        if torch.backends.mps.is_available()
        else "cpu"
    )
    print(f"Using {device} device")

    # Define model
    class NeuralNetwork(nn.Module):
        def __init__(self):
            super().__init__()
            self.flatten = nn.Flatten()
            self.linear_relu_stack = nn.Sequential(
                nn.Linear(28*28, 512),
                nn.ReLU(),
                nn.Linear(512, 512),
                nn.ReLU(),
                nn.Linear(512, 10)
            )

        def forward(self, x):
            x = self.flatten(x)
            logits = self.linear_relu_stack(x)
            return logits

    model = NeuralNetwork().to(device)
    print(model)

    loss_fn = nn.CrossEntropyLoss()
    optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)

    def train(dataloader, model, loss_fn, optimizer):
        size = len(dataloader.dataset)
        model.train()
        for batch, (X, y) in enumerate(dataloader):
            X, y = X.to(device), y.to(device)

            # Compute prediction error
            pred = model(X)
            loss = loss_fn(pred, y)

            # Backpropagation
            loss.backward()
            optimizer.step()
            optimizer.zero_grad()

            if batch % 100 == 0:
                loss, current = loss.item(), (batch + 1) * len(X)
                print(f"loss: {loss:>7f}  [{current:>5d}/{size:>5d}]")

    def test(dataloader, model, loss_fn):
        size = len(dataloader.dataset)
        num_batches = len(dataloader)
        model.eval()
        test_loss, correct = 0, 0
        with torch.no_grad():
            for X, y in dataloader:
                X, y = X.to(device), y.to(device)
                pred = model(X)
                test_loss += loss_fn(pred, y).item()
                correct += (pred.argmax(1) == y).type(torch.float).sum().item()
        test_loss /= num_batches
        correct /= size
        print(f"Test Error: \n Accuracy: {(100*correct):>0.1f}%, Avg loss: {test_loss:>8f} \n")

    epochs = 5
    for t in range(epochs):
        print(f"Epoch {t+1}\n-------------------------------")
        train(train_dataloader, model, loss_fn, optimizer)
        test(test_dataloader, model, loss_fn)
    print("Done!")

    torch.save(model.state_dict(), "model.pth")
    print("Saved PyTorch Model State to model.pth")

Output:

    Shape of X [N, C, H, W]: torch.Size([64, 1, 28, 28])
    Shape of y: torch.Size([64]) torch.int64
    Using cuda device
    NeuralNetwork(
      (flatten): Flatten(start_dim=1, end_dim=-1)
      (linear_relu_stack): Sequential(
        (0): Linear(in_features=784, out_features=512, bias=True)
        (1): ReLU()
        (2): Linear(in_features=512, out_features=512, bias=True)
        (3): ReLU()
        (4): Linear(in_features=512, out_features=10, bias=True)
      )
    )
    Epoch 1
    -------------------------------
    loss: 2.321039  [   64/60000]
    loss: 2.300082  [ 6464/60000]
    loss: 2.276550  [12864/60000]
    loss: 2.261506  [19264/60000]
    loss: 2.253393  [25664/60000]
    loss: 2.225407  [32064/60000]
    loss: 2.232775  [38464/60000]
    loss: 2.197544  [44864/60000]
    loss: 2.204715  [51264/60000]
    loss: 2.160194  [57664/60000]
    Test Error:
     Accuracy: 47.8%, Avg loss: 2.158097

    Epoch 2
    -------------------------------
    loss: 2.177010  [   64/60000]
    loss: 2.158526  [ 6464/60000]
    loss: 2.095876  [12864/60000]
    loss: 2.105602  [19264/60000]
    loss: 2.056697  [25664/60000]
    loss: 2.007255  [32064/60000]
    loss: 2.030588  [38464/60000]
    loss: 1.952178  [44864/60000]
    loss: 1.966830  [51264/60000]
    loss: 1.876572  [57664/60000]
    Test Error:
     Accuracy: 56.8%, Avg loss: 1.879299

    Epoch 3
    -------------------------------
    loss: 1.920324  [   64/60000]
    loss: 1.881016  [ 6464/60000]
    loss: 1.757274  [12864/60000]
    loss: 1.795695  [19264/60000]
    loss: 1.687385  [25664/60000]
    loss: 1.645998  [32064/60000]
    loss: 1.667249  [38464/60000]
    loss: 1.569485  [44864/60000]
    loss: 1.599562  [51264/60000]
    loss: 1.485055  [57664/60000]
    Test Error:
     Accuracy: 61.0%, Avg loss: 1.506500

    Epoch 4
    -------------------------------
    loss: 1.573652  [   64/60000]
    loss: 1.539013  [ 6464/60000]
    loss: 1.383209  [12864/60000]
    loss: 1.455913  [19264/60000]
    loss: 1.338859  [25664/60000]
    loss: 1.340163  [32064/60000]
    loss: 1.354653  [38464/60000]
    loss: 1.285331  [44864/60000]
    loss: 1.317009  [51264/60000]
    loss: 1.218405  [57664/60000]
    Test Error:
     Accuracy: 63.4%, Avg loss: 1.243824

    Epoch 5
    -------------------------------
    loss: 1.316245  [   64/60000]
    loss: 1.301429  [ 6464/60000]
    loss: 1.130637  [12864/60000]
    loss: 1.240115  [19264/60000]
    loss: 1.112968  [25664/60000]
    loss: 1.142092  [32064/60000]
    loss: 1.165823  [38464/60000]
    loss: 1.113276  [44864/60000]
    loss: 1.144331  [51264/60000]
    loss: 1.062266  [57664/60000]
    Test Error:
     Accuracy: 64.9%, Avg loss: 1.082635

    Done!
    Saved PyTorch Model State to model.pth

    Process finished with exit code 0

Once train.py has finished training and saving the model, run apply.py:

    import torch
    from torch import nn
    from torch.utils.data import DataLoader
    from torchvision import datasets
    from torchvision.transforms import ToTensor

    # Download training data from open datasets (not actually used in this script).
    training_data = datasets.FashionMNIST(
        root="data",
        train=True,
        download=True,
        transform=ToTensor(),
    )

    # Download test data from open datasets.
    test_data = datasets.FashionMNIST(
        root="data",
        train=False,
        download=True,
        transform=ToTensor(),
    )

    # Get cpu, gpu or mps device.
    device = (
        "cuda"
        if torch.cuda.is_available()
        else "mps"
        if torch.backends.mps.is_available()
        else "cpu"
    )
    print(f"Using {device} device")

    # Define model
    class NeuralNetwork(nn.Module):
        def __init__(self):
            super().__init__()
            self.flatten = nn.Flatten()
            self.linear_relu_stack = nn.Sequential(
                nn.Linear(28*28, 512),
                nn.ReLU(),
                nn.Linear(512, 512),
                nn.ReLU(),
                nn.Linear(512, 10)
            )

        def forward(self, x):
            x = self.flatten(x)
            logits = self.linear_relu_stack(x)
            return logits

    # Re-create the model and load the trained weights.
    model = NeuralNetwork().to(device)
    model.load_state_dict(torch.load("model.pth", weights_only=True))

    classes = [
        "T-shirt/top",
        "Trouser",
        "Pullover",
        "Dress",
        "Coat",
        "Sandal",
        "Shirt",
        "Sneaker",
        "Bag",
        "Ankle boot",
    ]

    model.eval()
    x, y = test_data[0][0], test_data[0][1]
    with torch.no_grad():
        x = x.to(device)
        pred = model(x)
        predicted, actual = classes[pred[0].argmax(0)], classes[y]
        print(f'Predicted: "{predicted}", Actual: "{actual}"')

Output:

    Using cuda device
    Predicted: "Ankle boot", Actual: "Ankle boot"

    Process finished with exit code 0

You can of course skip the save-and-load step and use the model directly in a single file right after training; but when you need to use the model repeatedly, the approach above saves you the time of retraining it each time.
