Deep Learning Paper: Activate or Not: Learning Customized Activation, with a PyTorch Implementation

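This post covers the paper's method for learning customized activation functions.

1 Overview

The paper establishes a connection between ReLU and Swish, the activation function found by NAS, and extends that connection into a general form. To let the network decide adaptively whether or not to activate a neuron, two 1x1 convolution layers are added to the model.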

2 ACON

ACON-A: the Swish activation function is a smooth approximation of ReLU.
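
To make the smooth-approximation claim concrete, here is a minimal sketch, assuming the usual Swish definition x * sigmoid(beta * x). The function name `swish` and the sample values of beta are illustrative and do not come from the original post. As beta grows, Swish moves toward ReLU; at beta = 0 it reduces to the linear function x / 2.

```python
import torch


def swish(x: torch.Tensor, beta: float) -> torch.Tensor:
    """Swish with a fixed slope parameter: x * sigmoid(beta * x)."""
    return x * torch.sigmoid(beta * x)


x = torch.linspace(-3.0, 3.0, steps=13)
relu = torch.relu(x)

# Print the largest gap between Swish and ReLU on the sample grid for each beta.
for beta in (0.0, 1.0, 10.0, 50.0):
    gap = (swish(x, beta) - relu).abs().max().item()
    print(f"beta={beta:5.1f}  max |swish - relu| = {gap:.4f}")
```

The gap shrinks as beta increases, which is the sense in which a learnable beta lets a unit move between a linear response and a ReLU-like one.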

PyTorch code:

    import torch
    import torch.nn as nn


    class AconC(nn.Module):
        r"""ACON activation (activate or not).

        AconC: (p1*x - p2*x) * sigmoid(beta * (p1*x - p2*x)) + p2*x, where beta is a learnable parameter.
        See "Activate or Not: Learning Customized Activation" <https://arxiv.org/pdf/2009.04759.pdf>.
        """

        def __init__(self, width):
            super().__init__()
            # One learnable value per channel, broadcast over (N, C, H, W) inputs.
            self.p1 = nn.Parameter(torch.randn(1, width, 1, 1))
            self.p2 = nn.Parameter(torch.randn(1, width, 1, 1))
            self.beta = nn.Parameter(torch.ones(1, width, 1, 1))

        def forward(self, x):
            dpx = (self.p1 - self.p2) * x  # p1*x - p2*x, computed once
            return dpx * torch.sigmoid(self.beta * dpx) + self.p2 * x
    
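A short usage sketch may help here. It repeats the AconC formula so that it runs on its own, applies the module to a random feature map, and then checks one special case: with p1 = 1, p2 = 0 and beta = 1 the formula collapses to x * sigmoid(x), that is, Swish with beta = 1. The tensor sizes and the choice of special case are illustrative assumptions.

```python
import torch
import torch.nn as nn


class AconC(nn.Module):
    """Same formula as the AconC module above, repeated so this snippet is self-contained."""

    def __init__(self, width):
        super().__init__()
        self.p1 = nn.Parameter(torch.randn(1, width, 1, 1))
        self.p2 = nn.Parameter(torch.randn(1, width, 1, 1))
        self.beta = nn.Parameter(torch.ones(1, width, 1, 1))

    def forward(self, x):
        dpx = (self.p1 - self.p2) * x
        return dpx * torch.sigmoid(self.beta * dpx) + self.p2 * x


acon = AconC(width=16)
x = torch.randn(2, 16, 8, 8)  # (N, C, H, W); C must match `width`
y = acon(x)                   # element-wise activation, shape is preserved

# Illustrative special case: p1 = 1, p2 = 0 (beta is already initialized to 1).
with torch.no_grad():
    acon.p1.fill_(1.0)
    acon.p2.zero_()
y_special = acon(x)
expected = x * torch.sigmoid(x)  # Swish with beta = 1
```

Because p1, p2 and beta have shape (1, width, 1, 1), each channel learns its own activation shape while sharing it across the batch and all spatial positions.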

3 Meta-ACON

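The authors initialize two learnable parameters, p₁ and p₂, and add two $1 \times 1$ convolution layers to the network. In the forward pass, β is computed as follows: the input is first average-pooled over its height (H) and width (W) dimensions; the pooled tensor then passes through the two $1 \times 1$ convolutions; finally, a sigmoid maps the result to a per-channel value between 0 and 1.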

    import torch
    import torch.nn as nn


    class MetaAconC(nn.Module):
        r"""ACON activation (activate or not).

        MetaAconC: (p1*x - p2*x) * sigmoid(beta * (p1*x - p2*x)) + p2*x, where beta is generated by a small network.
        See "Activate or Not: Learning Customized Activation" <https://arxiv.org/pdf/2009.04759.pdf>.
        """

        def __init__(self, width, r=16):
            super().__init__()
            hidden = max(r, width // r)  # bottleneck width of the beta generator
            self.fc1 = nn.Conv2d(width, hidden, kernel_size=1, stride=1, bias=True)
            self.bn1 = nn.BatchNorm2d(hidden)
            self.fc2 = nn.Conv2d(hidden, width, kernel_size=1, stride=1, bias=True)
            self.bn2 = nn.BatchNorm2d(width)

            self.p1 = nn.Parameter(torch.randn(1, width, 1, 1))
            self.p2 = nn.Parameter(torch.randn(1, width, 1, 1))

        def forward(self, x):
            # Global average pooling over H and W, keeping shape (N, C, 1, 1).
            pooled = x.mean(dim=(2, 3), keepdim=True)
            beta = torch.sigmoid(self.bn2(self.fc2(self.bn1(self.fc1(pooled)))))
            dpx = (self.p1 - self.p2) * x
            return dpx * torch.sigmoid(beta * dpx) + self.p2 * x
    
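The β branch can be inspected on its own. The sketch below isolates it in a helper class named `BetaGenerator`, which is a hypothetical name introduced for illustration (the original code keeps these layers inside MetaAconC). It shows that β has one value per sample and channel and lies between 0 and 1. A batch size larger than 1 is used because BatchNorm in training mode cannot normalize a single 1x1 sample per channel.

```python
import torch
import torch.nn as nn


class BetaGenerator(nn.Module):
    """Hypothetical helper: the beta branch of MetaAconC, isolated for inspection."""

    def __init__(self, width, r=16):
        super().__init__()
        hidden = max(r, width // r)
        self.fc1 = nn.Conv2d(width, hidden, kernel_size=1)
        self.bn1 = nn.BatchNorm2d(hidden)
        self.fc2 = nn.Conv2d(hidden, width, kernel_size=1)
        self.bn2 = nn.BatchNorm2d(width)

    def forward(self, x):
        pooled = x.mean(dim=(2, 3), keepdim=True)  # (N, C, 1, 1)
        return torch.sigmoid(self.bn2(self.fc2(self.bn1(self.fc1(pooled)))))


gen = BetaGenerator(width=64)
x = torch.randn(4, 64, 14, 14)  # batch of 4 feature maps
beta = gen(x)
print(beta.shape)  # one beta per sample and channel
```

Unlike AconC, where beta is a fixed learned parameter, this beta depends on the input, so different samples can switch the same channel between a more linear and a more ReLU-like response.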

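The authors then apply different replacement strategies depending on network size. In small networks, every ReLU layer is replaced. In larger networks (for example ResNet50 and ResNet101), only the ReLU that follows the 3×3 convolution in each block is replaced. The aim is to limit overfitting, since replacing every activation would add too many extra parameters.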
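
The large-network strategy can be sketched as a ResNet-style bottleneck in which only the activation after the 3×3 convolution is swapped. The class name `Bottleneck`, its layer layout, and the channel sizes are assumptions made for illustration, not the authors' code. AconC is used here for brevity; MetaAconC could be plugged in the same way.

```python
import torch
import torch.nn as nn


class AconC(nn.Module):
    """AconC formula from the post, repeated so this snippet is self-contained."""

    def __init__(self, width):
        super().__init__()
        self.p1 = nn.Parameter(torch.randn(1, width, 1, 1))
        self.p2 = nn.Parameter(torch.randn(1, width, 1, 1))
        self.beta = nn.Parameter(torch.ones(1, width, 1, 1))

    def forward(self, x):
        dpx = (self.p1 - self.p2) * x
        return dpx * torch.sigmoid(self.beta * dpx) + self.p2 * x


class Bottleneck(nn.Module):
    """Hypothetical bottleneck block: ACON after the 3x3 conv, ReLU elsewhere."""

    def __init__(self, channels, mid_channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, mid_channels, kernel_size=1, bias=False)
        self.bn1 = nn.BatchNorm2d(mid_channels)
        self.act1 = nn.ReLU()
        self.conv2 = nn.Conv2d(mid_channels, mid_channels, kernel_size=3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(mid_channels)
        self.act2 = AconC(mid_channels)  # the only replaced activation
        self.conv3 = nn.Conv2d(mid_channels, channels, kernel_size=1, bias=False)
        self.bn3 = nn.BatchNorm2d(channels)
        self.act3 = nn.ReLU()

    def forward(self, x):
        out = self.act1(self.bn1(self.conv1(x)))
        out = self.act2(self.bn2(self.conv2(out)))
        out = self.bn3(self.conv3(out))
        return self.act3(out + x)  # residual connection


block = Bottleneck(channels=256, mid_channels=64)
y = block(torch.randn(2, 256, 14, 14))
```

Placing ACON on the narrow middle layer keeps the number of added parameters tied to `mid_channels` rather than to the wider block output.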

Although the method adds a small number of parameters (roughly 10%), it improves the performance of both large and small networks.
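
To get a feel for where the extra parameters come from, the sketch below counts the learnable parameters of a single AconC or MetaAconC layer as defined in the code above. The helper names are hypothetical, and the counts are per activation layer; they do not reproduce the network-level percentage quoted in the text.

```python
def acon_c_params(width: int) -> int:
    """p1, p2 and beta, each with one value per channel."""
    return 3 * width


def meta_acon_c_params(width: int, r: int = 16) -> int:
    """Learnable parameters of one MetaAconC layer with reduction ratio r."""
    hidden = max(r, width // r)
    fc1 = width * hidden + hidden  # 1x1 conv weights plus bias
    bn1 = 2 * hidden               # BatchNorm scale and shift
    fc2 = hidden * width + width   # 1x1 conv weights plus bias
    bn2 = 2 * width                # BatchNorm scale and shift
    p1_p2 = 2 * width
    return fc1 + bn1 + fc2 + bn2 + p1_p2


for width in (64, 256, 1024):
    print(width, acon_c_params(width), meta_acon_c_params(width))
```

The two 1x1 convolutions dominate the MetaAconC count, which is consistent with replacing only some activations in wide networks.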

A series of ablation experiments over the design space shows that the channel-wise variant performs best.
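
As a rough illustration of what "channel-wise" means, the sketch below assumes the design space varies how finely β is shared, with layer-wise and pixel-wise sharing as the alternatives. The random β values only demonstrate the tensor shapes and broadcasting; in Meta-ACON, β would come from a small generator network.

```python
import torch


def acon_c(x, p1, p2, beta):
    """Functional form of the ACON-C formula used in the code above."""
    dpx = (p1 - p2) * x
    return dpx * torch.sigmoid(beta * dpx) + p2 * x


n, c, h, w = 2, 8, 5, 5
x = torch.randn(n, c, h, w)
p1 = torch.randn(1, c, 1, 1)
p2 = torch.randn(1, c, 1, 1)

# Assumed granularities: one beta per sample, per channel, or per element.
beta_shapes = {
    "layer-wise": (n, 1, 1, 1),
    "channel-wise": (n, c, 1, 1),
    "pixel-wise": (n, c, h, w),
}

outputs = {}
for name, shape in beta_shapes.items():
    beta = torch.rand(shape)  # placeholder values in [0, 1)
    outputs[name] = acon_c(x, p1, p2, beta)
    print(name, tuple(beta.shape), tuple(outputs[name].shape))
```

In every case the output keeps the input shape; only the number of distinct β values changes.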
