[COD] Camouflaged Object Detection (CVPR 2020, oral)


Contents

1. Motivation
2. Contribution
3. Related Work
- 3.1 Generic and Salient Object Detection
- 3.2 Camouflaged Object Detection
  - 3.2.1 Types of Camouflage
  - 3.2.2 COD Formulation
  - 3.2.3 Evaluation Metrics.
4. Dataset
- 4.1 Professional Annotation
- 4.2 Dataset Features and Statistics
5. Proposed Framework
- 5.1 Overview
- 5.2 Search Module (SM)
- 5.3 Identification Module (IM)
- 5.4 Partial Decoder Component (PDC)
6. Experiment
- 6.1 Performance on CHAMELEON, CAMO-Test and COD10K-Test
- 6.2 Qualitative Analysis

1. Motivation

This paper studies camouflaged object detection (COD), which the authors define as follows:

Camouflaged object detection(COD) aims to identify objects that are ‘seamlessly’ embedded in their surroundings.

Because the target and the background in COD share high intrinsic similarities, COD is more challenging than traditional object detection.

The high intrinsic similarities between the target object and the background make COD far more challenging than the traditional object detection task.

Applications of COD:

COD is also beneficial for applications in the fields of computer vision, medical image segmentation, agriculture and art.

The COD field lacks datasets.

Currently, camouflaged object detection is not well studied due to the lack of a sufficiently large dataset.

2. Contribution

The paper introduces the COD10K dataset.

To address this issue, we elaborately collect a novel dataset, called COD10K, which comprises 10,000 images covering camouflaged objects in various natural scenes, over 78 object categories.

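The paper builds a framework for COD called the Search Identification Network (SINet).

In addition, we develop a simple but effective framework for COD, termed Search Identification Network (SINet).

COD10K differs from existing COD datasets in three ways:

COD10K contains 10K images in 78 categories, covering aquatic, flying, amphibian, and terrestrial objects.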

It contains 10K images covering 78 camouflaged object categories, such as aquatic, flying, amphibians, and terrestrial, etc.

COD10K is hierarchically annotated, and the annotations can serve multiple tasks.

All the camouflaged images are hierarchically annotated with category, bounding-box, object-level, and instance-level labels, facilitating many vision tasks, such as localization, object proposal, semantic edge detection [42], task transfer learning [69], etc.

High-quality annotations give deeper insight into algorithm performance.

Each camouflaged image is assigned with challenging attributes found in the real-world and matting-level [73] labeling (requiring ∼60 minutes per image). These high-quality annotations could help with providing deeper insight into the performance of algorithms.

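Figure 1 compares summary statistics of COD10K with the two earlier COD datasets.

3. Related Work

Figure 2 divides the tasks into generic object detection (GOD), salient object detection (SOD), and COD.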

3.1 Generic and Salient Object Detection

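GOD is one of the most popular research directions in computer vision; typical GOD tasks include semantic segmentation and panoptic segmentation.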

One of the most popular directions in computer vision is generic object detection.

Typical GOD tasks include semantic segmentation and panoptic segmentation

SOD targets the most attention-grabbing object in an image. Camouflaged is the opposite of salient.

That is, positive samples (images containing a salient object) can be utilized as the negative samples in a COD dataset.

3.2 Camouflaged Object Detection

3.2.1 Types of Camouflage

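Camouflaged images can be split into two types: natural camouflage and artificial camouflage.

Natural camouflage is used by animals. Artificial camouflage occurs in products (so-called defects) or is used in games to hide information.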

Camouflaged images can be roughly split into two types: those containing natural camouflage and those with artificial camouflage

3.2.2 COD Formulation

Unlike class-dependent semantic segmentation, COD is a class-independent task. COD formulation: given an image, each pixel $i$ is assigned a confidence $p_i \in [0, 1]$. A value of 0 means the pixel does not belong to a camouflaged object, and a value of 1 means the pixel is fully assigned to a camouflaged object.

Given an image, the task requires a camouflaged object detection approach to assign each pixel $i$ a confidence $p_i \in [0,1]$, where $p_i$ denotes the probability score of pixel $i$.

A score of 0 is given to pixels that don’t belong to the camouflaged objects, while a score of 1 indicates that a pixel is fully assigned to the camouflaged objects.
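
As a small illustration of this formulation, the sketch below stores a toy confidence map as nested lists and converts it into a binary mask. The function name `binarize` and the 0.5 threshold are assumptions made for this example; the formulation itself only defines the per-pixel confidence $p_i$.

```python
# Toy 2x3 confidence map: each entry is p_i in [0, 1] for one pixel.
confidence = [
    [0.05, 0.80, 0.95],
    [0.10, 0.40, 0.70],
]


def binarize(conf_map, threshold=0.5):
    """Turn per-pixel confidences into a binary camouflage mask.

    The threshold is an illustrative assumption, not part of the task definition.
    """
    for row in conf_map:
        for p in row:
            if not 0.0 <= p <= 1.0:
                raise ValueError("confidences must lie in [0, 1]")
    return [[1 if p >= threshold else 0 for p in row] for row in conf_map]


mask = binarize(confidence)
print(mask)  # [[0, 1, 1], [0, 0, 1]]
```

The mask carries no category label, which reflects the class-independent nature of the task.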

3.2.3 Evaluation Metrics.

The plain MAE is not well suited here.

This paper uses three metrics:

- A human visual perception based E-measure ($E_{\phi}$), which simultaneously evaluates the pixel-level matching and image-level statistics.
- Since camouflaged objects often contain complex shapes, COD also requires a metric that can judge structural similarity. We utilize the S-measure ($S_\alpha$) [12] as our alternative metric.
- The weighted F-measure ($F^w_{\beta}$) [43] can provide more reliable evaluation.
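
One possible way to see why a pixel-wise MAE may fall short is the toy sketch below. It computes the mean absolute error between a prediction and a ground-truth mask; the helper name `mean_absolute_error` and the example maps are made up for illustration. Two predictions that err in different places receive the same score, so the number alone does not say where the error occurs.

```python
def mean_absolute_error(pred, gt):
    """Average |pred - gt| over all pixels of two equally sized 2D maps."""
    if len(pred) != len(gt) or any(len(p) != len(g) for p, g in zip(pred, gt)):
        raise ValueError("prediction and ground truth must have the same shape")
    total = sum(abs(p - g)
                for pred_row, gt_row in zip(pred, gt)
                for p, g in zip(pred_row, gt_row))
    count = sum(len(row) for row in pred)
    return total / count


gt = [[0, 1], [0, 1]]
pred_a = [[0.5, 1.0], [0.0, 1.0]]  # the error sits on a background pixel
pred_b = [[0.0, 0.5], [0.0, 1.0]]  # the error sits on an object pixel

print(mean_absolute_error(pred_a, gt))  # 0.125
print(mean_absolute_error(pred_b, gt))  # 0.125
```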

4. Dataset

our goals for studying and developing a dataset for COD are:

(1) to provide a new challenging task,

(2) to promote research in a new topic,

(3) to spark novel ideas.

4.1 Professional Annotation

Figure 4 shows the annotation information of COD10K.

4.2 Dataset Features and Statistics

Figures 4 and 6 show several statistics of COD10K:

Object size
Global/Local contrast
Center bias
Quality control
Super/Sub-class distribution
Resolution distribution
Dataset splits

5. Proposed Framework

5.1 Overview

Figure 8 shows the SINet framework. It has two main components: the receptive field (RF) and the partial decoder component (PDC).

The RF is introduced to mimic the structure of RFs in the human visual system.

The PDC reproduces the search and identification stages of animal predation.

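SINet is inspired by the two stages of hunting and contains two main modules: the search module (SM) and the identification module (IM).

The SM searches for a camouflaged object, and the IM then detects it precisely.

5.2 Search Module (SM)

The RF module is motivated by mimicking pRFs, so that the network incorporates more discriminative feature representations.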

This motivates us to use an RF [41, 68] component to incorporate more discriminative feature representations during the searching stage.

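Given an input $I \in \mathbb{R}^{W \times H \times 3}$, a set of features $\{ \mathcal{X}_k \}_{k=0}^{4}$ is extracted from ResNet-50. The resolution of each layer is $\{ [\frac{H}{k}, \frac{W}{k}],\ k = 4, 4, 8, 16, 32 \}$, so the first two layers keep the same resolution. The five layers are combined through concatenation, up-sampling, and down-sampling, which yields $\{ rf^s_k,\ k = 1, 2, 3, 4 \}$.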
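
To make the resolution list concrete, the following sketch computes the spatial size of each backbone feature for a given input. The 352x352 input size is an assumption chosen for this example (it matches the 44x44 size quoted for the 1/8-resolution maps), and `feature_resolutions` is a hypothetical helper, not a function from the SINet code.

```python
def feature_resolutions(height, width, factors=(4, 4, 8, 16, 32)):
    """Spatial size [H/k, W/k] of each backbone feature X_0 .. X_4."""
    return [(height // k, width // k) for k in factors]


# Assumed 352x352 input.
for idx, (h, w) in enumerate(feature_resolutions(352, 352)):
    print(f"X_{idx}: {h}x{w}")
# X_0: 88x88
# X_1: 88x88
# X_2: 44x44
# X_3: 22x22
# X_4: 11x11
```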

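Low-level (shallow) layers preserve spatial detail for object boundaries, while high-level (deep) layers preserve semantic information for locating objects.

The receptive field (RF) component has five branches. In each branch, the first Bconv (Conv+BN+ReLU) uses a 1x1 convolution to reduce the number of channels to 32. It is followed by a (2k-1)x(2k-1) Bconv (the code and the network diagram show that this is actually a [1x(2k-1)] Bconv and a [(2k-1)x1] Bconv) and a 3x3 Bconv with dilation rate (2k-1). The first four branches are concatenated, and a 1x1 Bconv (3x3 in the code) reduces the channels to 32. Finally, the fifth branch is added to this fused feature, and a ReLU produces the feature $rf_k$. The code is as follows: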

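    import torch
    import torch.nn as nn


    class BasicConv2d(nn.Sequential):
        """Bconv as described above: Conv + BN + ReLU (helper assumed here)."""
        def __init__(self, in_channel, out_channel, kernel_size, padding=0, dilation=1):
            super().__init__(
                nn.Conv2d(in_channel, out_channel, kernel_size,
                          padding=padding, dilation=dilation, bias=False),
                nn.BatchNorm2d(out_channel),
                nn.ReLU(inplace=True),
            )


    class RF(nn.Module):
        def __init__(self, in_channel, out_channel):
            super().__init__()
            self.relu = nn.ReLU(inplace=True)
            # branch0 is not shown in the original snippet; a single 1x1 Bconv is
            # assumed so that four branches feed conv_cat.
            self.branch0 = BasicConv2d(in_channel, out_channel, 1)
            self.branch1 = nn.Sequential(
                BasicConv2d(in_channel, out_channel, 1),  # 1x1: reduce channels
                BasicConv2d(out_channel, out_channel, kernel_size=(1, 3), padding=(0, 1)),
                BasicConv2d(out_channel, out_channel, kernel_size=(3, 1), padding=(1, 0)),
                BasicConv2d(out_channel, out_channel, 3, padding=3, dilation=3),  # dilated 3x3
            )
            self.branch2 = nn.Sequential(
                BasicConv2d(in_channel, out_channel, 1),
                BasicConv2d(out_channel, out_channel, kernel_size=(1, 5), padding=(0, 2)),
                BasicConv2d(out_channel, out_channel, kernel_size=(5, 1), padding=(2, 0)),
                BasicConv2d(out_channel, out_channel, 3, padding=5, dilation=5),
            )
            self.branch3 = nn.Sequential(
                BasicConv2d(in_channel, out_channel, 1),
                BasicConv2d(out_channel, out_channel, kernel_size=(1, 7), padding=(0, 3)),
                BasicConv2d(out_channel, out_channel, kernel_size=(7, 1), padding=(3, 0)),
                BasicConv2d(out_channel, out_channel, 3, padding=7, dilation=7),
            )
            self.conv_cat = BasicConv2d(4 * out_channel, out_channel, 3, padding=1)
            self.conv_res = BasicConv2d(in_channel, out_channel, 1)  # fifth (shortcut) branch

        def forward(self, x):
            branches = (self.branch0, self.branch1, self.branch2, self.branch3)
            x_cat = self.conv_cat(torch.cat([b(x) for b in branches], dim=1))
            return self.relu(x_cat + self.conv_res(x))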
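
The padding values in the branches above are chosen so that every convolution keeps the spatial size, which is what allows the branches to be concatenated and added. The sketch below checks this with the standard convolution output-size formula. The helper names `conv_out_size` and `branch_out_size` are hypothetical, and the 44-pixel input is only an example value.

```python
def conv_out_size(size, kernel, padding=0, dilation=1, stride=1):
    """Output length of a convolution along one axis (standard formula)."""
    return (size + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1


def branch_out_size(size, k):
    """Trace the width axis through one RF branch with kernel size k."""
    size = conv_out_size(size, 1)                         # 1x1 Bconv
    size = conv_out_size(size, k, padding=k // 2)         # [1 x k] Bconv
    size = conv_out_size(size, 1)                         # [k x 1] Bconv (1 tap on this axis)
    size = conv_out_size(size, 3, padding=k, dilation=k)  # 3x3 Bconv with dilation k
    return size


for k in (3, 5, 7):
    print(k, branch_out_size(44, k))  # every branch keeps the size at 44
```

A 3x3 kernel with dilation k spans 2k+1 pixels, so a padding of k on each side is exactly what keeps the size unchanged.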

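5.3 Identification Module (IM)

The IM detects the camouflaged object. The paper extends the partial decoder component (PDC) with densely connected features.

Specifically, the PDC integrates the four levels of features from the SM to compute the coarse camouflage map $C_s$.

The authors introduce attention because it can suppress interference from irrelevant features. The paper presents a search attention (SA) module that enhances the middle-level feature $\mathcal{X}_2$ and yields the enhanced camouflage map $C_h$.

$g(\cdot)$ is the SA function, a Gaussian filter with standard deviation 32 and kernel size 4. In the code, the kernel is built with `_get_kernel(31, 4)`, and after the max operation the attention map is multiplied with x element-wise (`torch.mul`), not by matrix multiplication: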

    import numpy as np
    import torch
    import torch.nn as nn
    import torch.nn.functional as F
    from torch.nn.parameter import Parameter

    # `_get_kernel` and `min_max_norm` are helpers from the same code base (not shown here).


    class SA(nn.Module):
        """Search attention (named "holistic attention" in the source)."""

        def __init__(self):
            super(SA, self).__init__()
            gaussian_kernel = np.float32(_get_kernel(31, 4))  # [31, 31] array
            gaussian_kernel = gaussian_kernel[np.newaxis, np.newaxis, ...]  # [1, 1, 31, 31]
            self.gaussian_kernel = Parameter(torch.from_numpy(gaussian_kernel))

        def forward(self, attention, x):
            # attention is C_s, e.g. shape [1, 1, 44, 44]; x is X_2, e.g. shape [1, 512, 44, 44]
            soft_attention = F.conv2d(attention, self.gaussian_kernel, padding=15)  # [1, 1, 44, 44]
            soft_attention = min_max_norm(soft_attention)  # rescale to [0, 1]
            # element-wise max with the raw attention, then element-wise product with x
            x = torch.mul(x, soft_attention.max(attention))
            return x  # [1, 512, 44, 44]
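
The arithmetic of the SA step (blur, normalize, element-wise max, element-wise product) can be traced on a tiny single-channel example without any tensor library. The sketch below is only an illustration under stated assumptions: it uses a plain separable Gaussian instead of the code base's `_get_kernel`, a 3-tap kernel instead of 31x31, and all function names are invented for this example.

```python
import math


def gaussian_kernel_1d(size, sigma):
    """Normalized 1D Gaussian weights centred on the middle tap."""
    centre = (size - 1) / 2
    weights = [math.exp(-((i - centre) ** 2) / (2 * sigma ** 2)) for i in range(size)]
    total = sum(weights)
    return [w / total for w in weights]


def transpose(grid):
    return [list(col) for col in zip(*grid)]


def blur_rows(grid, kernel):
    """Zero-padded 'same' convolution along each row."""
    half = len(kernel) // 2
    return [
        [sum(kernel[j] * row[x + j - half]
             for j in range(len(kernel)) if 0 <= x + j - half < len(row))
         for x in range(len(row))]
        for row in grid
    ]


def blur(grid, kernel):
    """Separable 2D Gaussian blur: rows first, then columns."""
    return transpose(blur_rows(transpose(blur_rows(grid, kernel)), kernel))


def min_max_norm(grid):
    """Rescale a 2D map to [0, 1]."""
    flat = [v for row in grid for v in row]
    lo, hi = min(flat), max(flat)
    return [[(v - lo) / (hi - lo + 1e-8) for v in row] for row in grid]


def search_attention(attention, feature, kernel):
    """Gate `feature` by max(normalized blurred attention, raw attention)."""
    soft = min_max_norm(blur(attention, kernel))
    gate = [[max(s, a) for s, a in zip(s_row, a_row)]
            for s_row, a_row in zip(soft, attention)]
    return [[f * g for f, g in zip(f_row, g_row)]
            for f_row, g_row in zip(feature, gate)]


attention = [[0.0] * 5 for _ in range(5)]
attention[2][2] = 1.0                       # one confident pixel in the coarse map
feature = [[1.0] * 5 for _ in range(5)]     # a constant single-channel feature
gated = search_attention(attention, feature, gaussian_kernel_1d(3, 1.0))
print(gated[2][2], gated[0][0])  # 1.0 0.0
```

The confident pixel keeps its full response, its neighbours are partially preserved by the blur, and pixels far from any coarse detection are suppressed. In the real module the same gate is broadcast over all feature channels.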

Finally, the PDC and RF aggregate the other three levels of features to obtain the final camouflage map $C_i$.

5.4 Partial Decoder Component (PDC)

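The computation in the PDC is easy to follow. It only adds an element-wise operation, and the core idea is still to aggregate multi-level features while upsampling them to the size of $rf_4^c$ (the input downscaled by a factor of 8, which is 44x44 in the figure). The code (and also the SINet network diagram in the paper) shows that the PDC output is a single-channel grayscale map, $I^{PDC} \in \mathbb{R}^{H/8 \times W/8 \times 1}$.

The total loss of the network consists of the cross-entropy losses of $C_{csm}$ and $C_{cim}$ against the ground truth. $C_{csm}$ and $C_{cim}$ are the two camouflaged object maps obtained from $C_s$ and $C_i$ by upsampling by a factor of 8.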
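
A minimal sketch of this two-term loss is given below, assuming both maps are already probabilities in [0, 1] and that each term is a binary cross-entropy averaged over pixels. The function names and the clamping constant `eps` are assumptions for this example, not taken from the SINet code.

```python
import math


def binary_cross_entropy(pred, gt, eps=1e-7):
    """Mean binary cross-entropy between two equally sized 2D maps."""
    total, count = 0.0, 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
            total += -(g * math.log(p) + (1 - g) * math.log(1.0 - p))
            count += 1
    return total / count


def total_loss(c_csm, c_cim, gt):
    """Sum of the search-stage and identification-stage losses."""
    return binary_cross_entropy(c_csm, gt) + binary_cross_entropy(c_cim, gt)


gt = [[1, 0]]
uncertain = [[0.5, 0.5]]
print(total_loss(uncertain, uncertain, gt))  # equals 2 * ln(2), about 1.386
```

Both maps are supervised with the same ground truth, so the search stage receives a training signal of its own in addition to the one passed back from the final prediction.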

6. Experiment

The authors build 12 baselines on three datasets.

6.1 Performance on CHAMELEON, CAMO-Test and COD10K-Test

6.2 Qualitative Analysis
