Neural Networks: Linear Layers and an Overview of Other Layers

28 May, 2025

Normalization: one paper reports that applying normalization can speed up neural network training.

Normalization Layers -> BatchNorm2d

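BatchNorm2d is a layer that applies batch normalization to the output of a 2D convolution, that is, to a 4D input of shape (N, C, H, W).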

torch.nn.BatchNorm2d(num_features, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True, device=None, dtype=None)

Here, num_features (int) is C from an expected input of size (N, C, H, W).

Example:

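import torch
from torch import nn

# With learnable parameters
m = nn.BatchNorm2d(100)
# Without learnable parameters (this reassignment replaces the module above)
m = nn.BatchNorm2d(100, affine=False)
input = torch.randn(20, 100, 35, 45)  # (N, C, H, W) with C = 100
output = m(input)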
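To make the effect concrete, here is a minimal sketch (my own illustration, not part of the original example) that checks what the layer does in training mode: each channel is normalized with the mean and biased variance computed over the N, H, and W dimensions. The input shape, the seed, and the variable names are assumptions chosen for the demonstration.

```python
import torch
from torch import nn

torch.manual_seed(0)  # assumption: fixed seed only so the run is repeatable

# Hypothetical input: 8 images, 4 channels, 16x16, deliberately shifted and scaled
x = torch.randn(8, 4, 16, 16) * 3.0 + 5.0

bn = nn.BatchNorm2d(4)  # a freshly built module starts in training mode
y = bn(x)

# Per-channel statistics over the batch and spatial dimensions (N, H, W)
mean = y.mean(dim=(0, 2, 3))
var = y.var(dim=(0, 2, 3), unbiased=False)
print(mean)  # each entry is close to 0
print(var)   # each entry is close to 1

# The same result computed by hand (weight starts at 1 and bias at 0)
x_mean = x.mean(dim=(0, 2, 3), keepdim=True)
x_var = x.var(dim=(0, 2, 3), unbiased=False, keepdim=True)
manual = (x - x_mean) / torch.sqrt(x_var + bn.eps)
print(torch.allclose(y, manual, atol=1e-4))
```

In evaluation mode (after calling bn.eval()) the layer uses its running statistics instead of the statistics of the current batch, so the output will generally differ from the training-mode result.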

Normalization layers are used relatively rarely, so I won't go into more detail here.

Recurrent Layers: this kind of structure may be used in text recognition; look into it if you need it.

Transformer Layers: a structure proposed for a specific network architecture; in the vast majority of cases you won't need it.

Linear Layers: used quite often.

Dropout Layers: during training, each element of the input tensor is randomly set to zero with probability p, which helps prevent overfitting.
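As a rough illustration of that behavior, the sketch below (an addition of mine, with an arbitrary p = 0.5, tensor size, and seed) runs nn.Dropout in training mode and in evaluation mode. In training mode the surviving elements are scaled by 1 / (1 - p); in evaluation mode the layer passes its input through unchanged.

```python
import torch
from torch import nn

torch.manual_seed(0)  # assumption: fixed seed only so the run is repeatable

drop = nn.Dropout(p=0.5)
x = torch.ones(1000)

drop.train()  # training mode: elements are zeroed at random
y_train = drop(x)
# Surviving elements are scaled by 1 / (1 - p), so here they become 2.0
print(y_train.unique())
print((y_train == 0).float().mean())  # fraction zeroed, roughly p

drop.eval()  # evaluation mode: dropout does nothing
y_eval = drop(x)
print(torch.equal(y_eval, x))
```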

Sparse Layers -> Embedding: used in natural language processing.
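For readers who have not met it, nn.Embedding is essentially a lookup table from integer indices to learnable vectors. The following is a small sketch under assumed sizes (a vocabulary of 10 tokens and 3-dimensional vectors, both made up for the example):

```python
import torch
from torch import nn

# Hypothetical vocabulary of 10 tokens, each mapped to a 3-dimensional vector
embedding = nn.Embedding(num_embeddings=10, embedding_dim=3)

# A batch of 2 sequences, each holding 4 token indices
token_ids = torch.tensor([[1, 2, 4, 5], [4, 3, 2, 9]])
vectors = embedding(token_ids)
print(vectors.shape)  # torch.Size([2, 4, 3])

# An index always returns the matching row of the weight matrix
print(torch.equal(vectors[0, 2], embedding.weight[4]))
```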

Distance Functions: compute the error between two values according to a chosen metric.
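The two modules listed under this heading in torch.nn are nn.CosineSimilarity and nn.PairwiseDistance. A minimal sketch with hand-picked vectors (the values are my own, chosen so the results are easy to check by eye):

```python
import torch
from torch import nn

a = torch.tensor([[1.0, 0.0], [3.0, 4.0]])
b = torch.tensor([[0.0, 1.0], [3.0, 4.0]])

cos = nn.CosineSimilarity(dim=1)   # compares direction, row by row
dist = nn.PairwiseDistance(p=2)    # Euclidean distance, row by row

cos_out = cos(a, b)
dist_out = dist(a, b)
print(cos_out)   # row 0 is orthogonal (about 0), row 1 is identical (about 1)
print(dist_out)  # row 0 is about sqrt(2), row 1 is about 0
```

PairwiseDistance adds a small eps to the difference for numerical stability, so the distance between identical rows comes out very close to zero rather than exactly zero.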

The remaining categories are used even less; I suggest studying them as your own needs dictate.

Linear Layers

import torch
import torchvision
from torch import nn
from torch.nn import Linear
from torch.utils.data import DataLoader
from torch.utils.tensorboard import SummaryWriter

dataset = torchvision.datasets.CIFAR10(root="P19_nn_maxpool/dataset", train=False, transform=torchvision.transforms.ToTensor(), download=True)

dataLoader = DataLoader(dataset, batch_size=64)

class nn_linear(nn.Module):
    def __init__(self):
        super(nn_linear, self).__init__()
        self.linear1 = Linear(196608, 10)

    def forward(self, input):
        output = self.linear1(input)
        return output

NN_linear = nn_linear()

for data in dataLoader:
    imgs, targets = data
    print(imgs.shape)
    output = torch.reshape(imgs, (1, 1, 1, -1))
    # output = torch.flatten(imgs)  # flatten into a single row
    print(output.shape)
    output = NN_linear(output)
    print(output.shape)

torch.Size([64, 3, 32, 32])
torch.Size([1, 1, 1, 196608])
torch.Size([1, 1, 1, 10])
...
torch.Size([64, 3, 32, 32])
torch.Size([1, 1, 1, 196608])
torch.Size([1, 1, 1, 10])
torch.Size([16, 3, 32, 32])
torch.Size([1, 1, 1, 49152])
Traceback (most recent call last):
  File "D:\desktop\learn_dl\pytorch_1\P21_nn_linear.py", line 29, in <module>
    output = NN_linear(output)
  File "D:\Anaconda_python3.12\envs\py3.10\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "D:\desktop\learn_dl\pytorch_1\P21_nn_linear.py", line 18, in forward
    output = self.linear1(input)
  File "D:\Anaconda_python3.12\envs\py3.10\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "D:\Anaconda_python3.12\envs\py3.10\lib\site-packages\torch\nn\modules\linear.py", line 114, in forward
    return F.linear(input, self.weight, self.bias)
RuntimeError: mat1 and mat2 shapes cannot be multiplied (1x49152 and 196608x10)

With output = torch.flatten(imgs) (flatten into a single row) used instead, the output is:

torch.Size([64, 3, 32, 32])
torch.Size([196608])
torch.Size([10])
...
torch.Size([64, 3, 32, 32])
torch.Size([196608])
torch.Size([10])
torch.Size([16, 3, 32, 32])
torch.Size([49152])
Traceback (most recent call last):
  File "D:\desktop\learn_dl\pytorch_1\P21_nn_linear.py", line 29, in <module>
    output = NN_linear(output)
  File "D:\Anaconda_python3.12\envs\py3.10\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "D:\desktop\learn_dl\pytorch_1\P21_nn_linear.py", line 18, in forward
    output = self.linear1(input)
  File "D:\Anaconda_python3.12\envs\py3.10\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "D:\Anaconda_python3.12\envs\py3.10\lib\site-packages\torch\nn\modules\linear.py", line 114, in forward
    return F.linear(input, self.weight, self.bias)
RuntimeError: mat1 and mat2 shapes cannot be multiplied (1x49152 and 196608x10)
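Both tracebacks are raised on the last batch. The CIFAR10 test set has 10,000 images, and 10000 = 156 × 64 + 16, so the final batch holds only 16 images. Flattened, that is 16 × 3 × 32 × 32 = 49152 values, not the 64 × 3 × 32 × 32 = 196608 that Linear(196608, 10) expects. One common fix is to pass drop_last=True to the DataLoader so the incomplete batch is discarded. The sketch below is my own illustration of that fix; to run without downloading anything it substitutes random tensors for CIFAR10, which is an assumption of the example and not what the script above does.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# Assumption: 144 random "images" stand in for CIFAR10 so nothing is downloaded.
# 144 = 2 * 64 + 16, so without drop_last the final batch would be short.
images = torch.randn(144, 3, 32, 32)
labels = torch.zeros(144, dtype=torch.long)
dataset = TensorDataset(images, labels)

# drop_last=True discards the final incomplete batch
loader = DataLoader(dataset, batch_size=64, drop_last=True)

linear = nn.Linear(64 * 3 * 32 * 32, 10)  # 196608 input features

shapes = []
for imgs, targets in loader:
    flat = torch.flatten(imgs)  # shape [196608]
    out = linear(flat)          # shape [10]
    shapes.append((tuple(imgs.shape), tuple(out.shape)))

print(shapes)
```

An alternative that keeps every image is to flatten each sample separately with torch.flatten(imgs, start_dim=1) and use Linear(3 * 32 * 32, 10); the output then has shape [batch_size, 10] regardless of how many images the batch contains.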

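Some layers are fairly complicated to use. That more or less completes the walkthrough of the building blocks of a network model.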

Sequential, under Containers, is also simple to use; I'll cover it briefly later.
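As a small preview, and only as a sketch under my own assumptions, here is a linear classifier written with nn.Sequential. Unlike the script above, it flattens each image separately with nn.Flatten, so the linear layer takes 3 × 32 × 32 = 3072 features and the batch size can vary.

```python
import torch
from torch import nn

# Assumption: each image is flattened on its own, giving 3 * 32 * 32 features
model = nn.Sequential(
    nn.Flatten(),                # [N, 3, 32, 32] -> [N, 3072]
    nn.Linear(3 * 32 * 32, 10),  # [N, 3072] -> [N, 10]
)

x = torch.randn(16, 3, 32, 32)  # hypothetical batch of 16 CIFAR10-sized images
y = model(x)
print(y.shape)  # torch.Size([16, 10])
```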

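At this point you can build your own network models, but sometimes you can simply use the ones PyTorch already provides.
torchvision.models offers many network architectures for images.
torchaudio.models also has some models for audio and speech.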

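Image models are the most numerous. They include models for semantic segmentation,

as well as object detection, instance segmentation, human keypoint detection, and more.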
