
Daily Attention Study #7: Frequency-Perception Module

##### Module Source

[link](https://github.com/rmcong/FPNet_ACMMM23) [ACM MM 23] Frequency Perception Network for Camouflaged Object Detection

* * *

##### Module Name

Frequency-Perception Module (FPM)

* * *

##### Module Purpose

Captures frequency-domain information to better identify camouflaged objects.
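The frequency perception in FPM rests on a simple signal-processing fact: average pooling acts as a low-pass filter, so the pooled map carries the low-frequency content and the residual carries the high-frequency detail. A minimal NumPy sketch of this decomposition (an illustration of the idea, not code from the paper):

```python
import numpy as np

def freq_split(x):
    """Split a 2D map into a low-frequency part (2x2 average pooling)
    and a high-frequency residual, mimicking octave-style pooling."""
    h, w = x.shape
    # Each low-frequency pixel is the mean of a 2x2 block (low-pass filter)
    low = x.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
    # Nearest-neighbour upsample the low part back to full resolution;
    # the residual is the detail the pooling discarded, i.e. high frequency
    low_up = np.repeat(np.repeat(low, 2, axis=0), 2, axis=1)
    high = x - low_up
    return low, high

x = np.arange(16, dtype=float).reshape(4, 4)
low, high = freq_split(x)
print(low.shape, high.shape)  # (2, 2) (4, 4)

# A constant image has no detail, so its high-frequency residual is zero
flat_low, flat_high = freq_split(np.full((4, 4), 7.0))
print(np.allclose(flat_high, 0))  # True
```

This is exactly the role the `AvgPool2d` layers play inside the module below: the pooled path feeds the low-frequency branch, while the full-resolution path keeps the high-frequency detail.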

* * *

##### Module Structure
![在这里插入图片描述](https://ad.itadn.com/c/weblog/blog-img/images/2025-05-04/2wO75YhLjrFipPzZaEyHqJDBMe6Q.jpeg)

##### Module Code
```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class FirstOctaveConv(nn.Module):
    def __init__(self, in_channels, out_channels, kernel_size, alpha=0.5, stride=1, padding=1, dilation=1,
                 groups=1, bias=False):
        super(FirstOctaveConv, self).__init__()
        self.stride = stride
        kernel_size = kernel_size[0]
        self.h2g_pool = nn.AvgPool2d(kernel_size=(2, 2), stride=2)
        # High-to-low branch: produces the low-frequency component
        self.h2l = torch.nn.Conv2d(in_channels, int(alpha * in_channels),
                                   kernel_size, 1, padding, dilation, groups, bias)
        # High-to-high branch: produces the high-frequency component
        self.h2h = torch.nn.Conv2d(in_channels, in_channels - int(alpha * in_channels),
                                   kernel_size, 1, padding, dilation, groups, bias)

    def forward(self, x):
        if self.stride == 2:
            x = self.h2g_pool(x)      # downsample the input first
        X_h2l = self.h2g_pool(x)      # pooled input for the low-frequency branch
        X_h = self.h2h(x)             # high-frequency component
        X_l = self.h2l(X_h2l)         # low-frequency component
        return X_h, X_l
```

```python
class OctaveConv(nn.Module):
    def __init__(self, in_channels, out_channels, kernel_size, alpha=0.5, stride=1, padding=1, dilation=1,
                 groups=1, bias=False):
        super(OctaveConv, self).__init__()
        kernel_size = kernel_size[0]
        self.h2g_pool = nn.AvgPool2d(kernel_size=(2, 2), stride=2)
        self.upsample = torch.nn.Upsample(scale_factor=2, mode='nearest')
        self.stride = stride
        # Four convolution paths exchange information between the two frequencies
        self.l2l = torch.nn.Conv2d(int(alpha * in_channels), int(alpha * out_channels),
                                   kernel_size, 1, padding, dilation, groups, bias)
        self.l2h = torch.nn.Conv2d(int(alpha * in_channels),
                                   out_channels - int(alpha * out_channels),
                                   kernel_size, 1, padding, dilation, groups, bias)
        self.h2l = torch.nn.Conv2d(in_channels - int(alpha * in_channels),
                                   int(alpha * out_channels),
                                   kernel_size, 1, padding, dilation, groups, bias)
        self.h2h = torch.nn.Conv2d(in_channels - int(alpha * in_channels),
                                   out_channels - int(alpha * out_channels),
                                   kernel_size, 1, padding, dilation, groups, bias)

    def forward(self, x):
        # The input is the pair of high- and low-frequency feature maps
        X_h, X_l = x
        if self.stride == 2:
            X_h, X_l = self.h2g_pool(X_h), self.h2g_pool(X_l)
        X_h2l = self.h2g_pool(X_h)      # downsample high for the high-to-low path
        X_h2h = self.h2h(X_h)           # high -> high
        X_l2h = self.l2h(X_l)           # low -> high (still at low resolution)
        X_l2l = self.l2l(X_l)           # low -> low
        X_h2l = self.h2l(X_h2l)         # high -> low
        X_l2h = self.upsample(X_l2h)    # upsample to the high-frequency size
        X_h = X_l2h + X_h2h             # fuse the two high-frequency branches
        X_l = X_h2l + X_l2l             # fuse the two low-frequency branches
        return X_h, X_l
```

```python
class LastOctaveConv(nn.Module):
    def __init__(self, in_channels, out_channels, kernel_size, alpha=0.5, stride=1, padding=1, dilation=1,
                 groups=1, bias=False):
        super(LastOctaveConv, self).__init__()
        self.stride = stride
        kernel_size = kernel_size[0]
        # Average pooling halves the spatial resolution of the input features
        self.h2g_pool = nn.AvgPool2d(kernel_size=(2, 2), stride=2)
        # alpha controls the split between high- and low-frequency channels
        self.l2h = torch.nn.Conv2d(int(alpha * out_channels), out_channels,
                                   kernel_size, 1, padding, dilation, groups, bias)
        self.h2h = torch.nn.Conv2d(out_channels - int(alpha * out_channels),
                                   out_channels,
                                   kernel_size, 1, padding, dilation, groups, bias)
        # Nearest-neighbour upsampling restores the low-frequency spatial resolution
        self.upsample = torch.nn.Upsample(scale_factor=2, mode='nearest')

    def forward(self, x):
        # The input is the pair of high- and low-frequency feature maps
        X_h, X_l = x
        if self.stride == 2:
            X_h, X_l = self.h2g_pool(X_h), self.h2g_pool(X_l)
        X_l2h = self.l2h(X_l)           # project the low-frequency features
        X_h2h = self.h2h(X_h)           # project the high-frequency features
        X_l2h = self.upsample(X_l2h)    # match the high-frequency resolution
        X_h = X_h2h + X_l2h             # merge into a single output feature map
        return X_h
```

```python
class FPM(nn.Module):
    def __init__(self, in_channels, out_channels, kernel_size=(3, 3)):
        super(FPM, self).__init__()
        self.fir = FirstOctaveConv(in_channels, out_channels, kernel_size)
        self.mid1 = OctaveConv(in_channels, in_channels, kernel_size)
        self.mid2 = OctaveConv(in_channels, out_channels, kernel_size)
        self.lst = LastOctaveConv(in_channels, out_channels, kernel_size)

    def forward(self, x):
        x_h, x_l = self.fir(x)                    # initial frequency decomposition
        x_h_1, x_l_1 = self.mid1((x_h, x_l))      # octave conv, channels preserved
        x_h_2, x_l_2 = self.mid1((x_h_1, x_l_1))
        x_h_5, x_l_5 = self.mid2((x_h_2, x_l_2))  # project to the output channels
        x_ret = self.lst((x_h_5, x_l_5))          # merge back into one feature map
        return x_ret
```

```python
if __name__ == '__main__':
    # 3 samples, 256 channels, 16x16 spatial size
    x = torch.randn([3, 256, 16, 16])
    fpm = FPM(in_channels=256, out_channels=64)
    out = fpm(x)
    print(out.shape)  # torch.Size([3, 64, 16, 16])
```
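One detail worth noting in the code above: the `int(alpha * channels)` split means the low- and high-frequency branches only divide the channels evenly when `alpha * channels` is an integer; for odd counts the low branch rounds down and the high branch keeps the remainder. A quick standalone check of this bookkeeping (plain Python, not from the repository):

```python
def octave_channel_split(channels, alpha=0.5):
    """Return (high, low) channel counts, mirroring the
    int(alpha * channels) split used throughout the module code."""
    low = int(alpha * channels)   # low-frequency branch width
    high = channels - low         # the remainder goes to the high branch
    return high, low

for c in (256, 64, 3):
    high, low = octave_channel_split(c)
    print(f"{c} channels -> high: {high}, low: {low}")
```

With the demo's `in_channels=256` and `out_channels=64`, every split is even (128/128 and 32/32), which is why the shapes line up cleanly across all four convolution paths.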
* * *

##### Original Description

Specifically, we adopt octave convolution to automatically perceive high-frequency and low-frequency information in an end-to-end manner, enabling online learning for camouflaged object detection. Octave convolution effectively avoids the blocking artifacts caused by the DCT and takes advantage of GPU computation speed. Moreover, it can be easily plugged into arbitrary networks.
