
[WSIS] Weakly-supervised Instance Segmentation via Class-agnostic Learning with Salient Images


1. Motivation

Weakly-supervised instance segmentation (WSIS):

Weakly-supervised instance segmentation (WSIS) is important in computer vision for at least two reasons.

Humans have a strong class-agnostic object segmentation ability and can outline boundaries of unknown objects precisely, which motivates us to propose a box-supervised class-agnostic object segmentation (BoxCaseg) based solution for weakly-supervised instance segmentation.

The definition of BoxCaseg:

We think the key problem in WSIS is box-supervised class-agnostic object segmentation (BoxCaseg), i.e., given the object bounding box, we need to infer the pixel-level object mask.

Salient images and box-supervised images are illustrated in Figure 1 of the paper.

Related work

There is a class of methods [7, 36, 20] that utilize traditional image segmentation algorithms.

Dai et al. [7] first proposed to combine deep networks with hand-crafted object proposals for WSIS via iterative training and pseudo-labeling.

BBTP [16] designs a multi-instance learning (MIL) formulation to train a weakly-supervised segmentation model.

Our weak segmentation head is also based on BBTP and does not rely on any traditional algorithms.

SOD

Our method is also based on the development of salient object detection (SOD, also known as salient object segmentation) [19, 49, 29, 4, 28, 15, 46], which aims at finding visually attractive objects in an image and segmenting them as a binary mask.

2. Contribution

We propose a novel weakly-supervised instance segmentation method based on box-supervised class-agnostic object segmentation, in which class-agnostic precise salient object localization information is utilized as auxiliary memory to promote weakly-supervised instance segmentation during training.

We propose new mask merging and dropping strategies to obtain high-quality proxy masks for pseudo-training Mask R-CNN.

The class-agnostic segmentation model generalizes well. Using only 7991 salient images that are disjoint from PASCAL and COCO, a box-supervised instance segmentation method obtains, for the first time, performance similar to its fully-supervised counterpart. On COCO, our result is significantly better than previous state-of-the-art WSIS methods.

3. Method

3.1 Weakly-supervised Instance Segmentation via Multi-instance Learning

As shown in Figure 3, BBTP uses positive bags (a bag inside the box needs to contain only one pixel belonging to the object). The positive and negative bags are defined as follows:

The entire region of the object is in the bounding box and each row or each column in the bounding box must contain at least one pixel belonging to the object. BBTP regards these rows and columns as positive bags.

Other rows outside the bounding box are treated as negative bags.

Bag classification probabilities are generated by max-pooling the pixel classification probabilities.

Once the positive and negative bags are generated, their classification probabilities are generated by max-pooling pixel classification probabilities.
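As a concrete illustration, the bag construction and max-pooled bag probabilities can be sketched in PyTorch as follows. This is a minimal sketch assuming a single ground-truth box and a single-channel sigmoid score map; the function name and the `eps` constant are mine, not from the paper.

```python
import torch

def mil_loss(score_map, box, eps=1e-6):
    # score_map: (H, W) sigmoid scores S(p); box: (x1, y1, x2, y2) in pixels.
    x1, y1, x2, y2 = box
    inside = score_map[y1:y2, x1:x2]
    # Positive bags: every row and every column crossing the box contains
    # at least one object pixel, so max-pool each row/column of the box.
    pos = torch.cat([inside.max(dim=0).values,   # one bag per column
                     inside.max(dim=1).values])  # one bag per row
    loss = -torch.log(pos + eps).mean()
    # Negative bags: rows entirely outside the box contain no object
    # pixel, so their max-pooled scores are pushed towards zero.
    neg_rows = torch.cat([score_map[:y1], score_map[y2:]], dim=0)
    if neg_rows.numel() > 0:
        neg = neg_rows.max(dim=1).values
        loss = loss - torch.log(1.0 - neg + eps).mean()
    return loss
```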

3.2 Joint Training with Salient Images

3.2.1 Salient Images

The salient images come from the DUTS-TR dataset.

Each salient image is guaranteed to contain only one object, so the object is relatively large and occupies a large portion of the image.

The salient images are used to provide fine-grained object boundary information.

3.2.2 Data Augmentation

Unlike BBTP, which uses RoI Align to perform detection and segmentation for WSIS simultaneously, this paper does not use the RoI Align operation; objects are simply cropped out and used as the network input. The crop (data augmentation) differs between salient images and box-supervised images.

Salient images are first resized to 320x320 and then randomly cropped to 288x288 as input. For box-supervised images, the cropping strategy is different:

This data augmentation strategy shifts the position and shape of the bounding box and introduces background information.
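Since the exact box-supervised cropping procedure was given in a figure, the following is only a hypothetical sketch of the two augmentations: the salient-image branch follows the stated 320-to-288 resize-and-crop, while `augment_box_supervised` illustrates one plausible way to shift the box and introduce background (the `max_expand` parameter and the random expansion scheme are my assumptions).

```python
import random
from PIL import Image

def augment_salient(img: Image.Image) -> Image.Image:
    # Salient images: resize to 320x320, then take a random 288x288 crop.
    img = img.resize((320, 320))
    x, y = random.randint(0, 32), random.randint(0, 32)
    return img.crop((x, y, x + 288, y + 288))

def augment_box_supervised(img: Image.Image, box, max_expand=0.2) -> Image.Image:
    # Hypothetical: randomly expand each side of the GT box before
    # cropping, which shifts the box position/shape inside the crop
    # and brings in background pixels.
    x1, y1, x2, y2 = box
    w, h = x2 - x1, y2 - y1
    x1 = max(0, x1 - random.uniform(0, max_expand) * w)
    y1 = max(0, y1 - random.uniform(0, max_expand) * h)
    x2 = min(img.width,  x2 + random.uniform(0, max_expand) * w)
    y2 = min(img.height, y2 + random.uniform(0, max_expand) * h)
    return img.crop((int(x1), int(y1), int(x2), int(y2))).resize((288, 288))
```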

3.3. Multi-task Learning


Figure 4 shows the framework for jointly training box-supervised, class-agnostic segmentation. The network takes salient and box-supervised images as input and extracts convolutional feature maps with an HRNet backbone, which are split into weak and salient feature maps. The weak feature maps are passed to the weak segmentation head to compute the MIL loss; the salient feature maps are passed simultaneously to the weak segmentation, transferred segmentation, and salient segmentation heads, and the latter two heads jointly compute pixel-wise losses.

The three heads, shown on the right of Figure 4, have exactly the same network structure.

Training BoxCaseg involves three tasks: MIL training for box-supervised images, MIL training for salient images, and pixel-labeling for salient images.
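To make the routing concrete, here is a rough sketch of the joint model; the backbone and the head internals are assumptions, and only the feature routing follows Figure 4.

```python
import torch.nn as nn

def make_head(ch):
    # All three heads share the same structure (Fig. 4, right);
    # the exact layers used here are an assumption.
    return nn.Sequential(
        nn.Conv2d(ch, ch, 3, padding=1),
        nn.ReLU(inplace=True),
        nn.Conv2d(ch, 1, 1),
        nn.Sigmoid(),
    )

class BoxCaseg(nn.Module):
    def __init__(self, backbone, ch=48):
        super().__init__()
        self.backbone = backbone            # HRNet in the paper
        self.weak_head = make_head(ch)      # MIL loss, all images
        self.salient_head = make_head(ch)   # pixel-wise loss, salient images
        # In the paper the transferred head's parameters are generated
        # from the weak head via the weight transfer module (Sec. 3.3.2);
        # it is kept as a plain module here only for brevity.
        self.transferred_head = make_head(ch)

    def forward(self, x, is_salient: bool):
        f = self.backbone(x)
        out = {"weak": self.weak_head(f)}   # MIL training for every image
        if is_salient:                      # pixel-labeling heads: salient images only
            out["salient"] = self.salient_head(f)
            out["transferred"] = self.transferred_head(f)
        return out
```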

3.3.1 MIL loss for weak segmentation

The MIL loss is given by Equation 1:
$$\mathcal{L}_{mil}(S) = -\sum_{B \in \mathcal{B}^{+}} \log \max_{p \in B} S(p) \;-\; \sum_{B \in \mathcal{B}^{-}} \log\Big(1 - \max_{p \in B} S(p)\Big) \;+\; \mathcal{L}_{smooth}(S) \tag{1}$$

Here B is a bag, B \in \mathcal{B}^{+} \cup \mathcal{B}^{-}, and S is a sigmoid score map, S \in \{S_s, S_w\}; p is a position, and the sigmoid keeps each score S(p) in [0, 1]. The loss is optimized so that the maximum sigmoid score inside each positive bag becomes large, making -\log \max_{p \in B} S(p) small, while the maximum score inside each negative bag becomes small, making -\log(1 - \max_{p \in B} S(p)) small. In other words, in Equation 1 only the highest-scoring pixel of each positive/negative bag (effectively a further selection step) contributes its score S(p) to the loss.

The last term is a smoothness term, given by Equation 2:
$$\mathcal{L}_{smooth}(S) = \sum_{p} \sum_{p' \in \Omega(p)} \big|S(p) - S(p')\big| \tag{2}$$

where \Omega(p) denotes the eight-connected neighborhood pixels around pixel p.
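The smoothness term is straightforward to implement with shifted slices, counting each 8-connected neighbor pair once; a sketch assuming `S` is a 2-D torch tensor of sigmoid scores:

```python
def smooth_term(S):
    # Absolute score differences between each pixel and its
    # 8-connected neighbours Omega(p); each unordered pair counted once.
    diffs = [
        (S[1:, :] - S[:-1, :]).abs(),     # vertical neighbours
        (S[:, 1:] - S[:, :-1]).abs(),     # horizontal neighbours
        (S[1:, 1:] - S[:-1, :-1]).abs(),  # "\" diagonal neighbours
        (S[1:, :-1] - S[:-1, 1:]).abs(),  # "/" diagonal neighbours
    ]
    return sum(d.sum() for d in diffs)
```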

3.3.2 Pixel-wise loss for salient segmentation and transferred segmentation

Through a weight transfer method, the pixel-wise labeling parameters learned on salient images are made to better suit box-supervised images.

The design of the weight transfer module:

As for the weight transfer module, we simply use a two-layer multi-layer perceptron (MLP) with leaky ReLU as activation function.

Moreover, weight transfer is a one-way process: its input is detached from the weak segmentation head, so gradients do not back-propagate into the weak segmentation head.

Note that the weight transfer is a single-direction process; since it is detached from the weak segmentation head, the gradients do not back-propagate to the weak segmentation head.

Training thus optimizes only the transfer MLP.
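A sketch of the module under these constraints; representing the head weights as a flattened vector and the hidden width are my assumptions, while the two-layer MLP, the leaky ReLU, and the detach follow the paper.

```python
import torch
import torch.nn as nn

class WeightTransfer(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(dim, dim),
            nn.LeakyReLU(),
            nn.Linear(dim, dim),
        )

    def forward(self, weak_head_weights: torch.Tensor) -> torch.Tensor:
        # detach() makes the transfer one-way: no gradient flows back
        # into the weak segmentation head; only the MLP is optimized.
        return self.mlp(weak_head_weights.detach())
```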

The pixel-wise loss is formulated as follows:
$$\mathcal{L}_{pixel} = (1 - \alpha)\,\ell(S_{s}, M) + \alpha\,\ell(S_{t}, M) \tag{3}$$

where M denotes the GT mask of the salient image, S_s and S_t denote the score maps output by the salient and transferred heads respectively, \ell is a per-pixel segmentation loss, and \alpha \in [0,1] weights the transferred segmentation head.

The total loss is:
$$\mathcal{L} = \mathcal{L}_{mil} + \lambda\,\mathcal{L}_{pixel} \tag{4}$$

where λ is equal to 0 if the input comes from box-supervised images, otherwise 1. Last but not least, during the training of the proposed network, there are both box-supervised images and salient images in a mini-batch.
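Putting Equations 3 and 4 together, the loss routing can be sketched as below; the binary cross-entropy choice for \ell and the function names are assumptions.

```python
import torch.nn.functional as F

def pixel_loss(S_s, S_t, M, alpha=0.5):
    # Eq. 3 (assumed form): per-pixel BCE on the salient head output S_s
    # and the transferred head output S_t, with alpha weighting the latter.
    return (1 - alpha) * F.binary_cross_entropy(S_s, M) + \
           alpha * F.binary_cross_entropy(S_t, M)

def total_loss(l_mil, l_pixel, from_salient_image):
    # Eq. 4: lambda = 0 for box-supervised inputs, 1 for salient images.
    lam = 1.0 if from_salient_image else 0.0
    return l_mil + lam * l_pixel
```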

3.4. Training a Proxy Mask R-CNN

The trained BoxCaseg model is used to generate proxy masks for the training set.

we can use the BoxCaseg model to generate proxy masks on the training set.

Specifically, patches are cropped according to the GT bounding boxes, and the salient segmentation head and the transferred head of BoxCaseg produce the predicted masks. A Mask R-CNN is then trained on the bounding box annotations and the proxy masks.

In our method, the Mask R-CNN model is trained with the bounding box annotations and the proxy masks, thus called Proxy Mask R-CNN.

To reduce errors caused by multiple highly overlapping objects within one bounding box, the authors propose merge and drop strategies.

To reduce the errors in the masks generated by BoxCaseg, we propose a merge and drop, i.e., merge masks using the strategy of smaller object better and drop masks via the proxy box agreement rule, which are detailed as follows.

3.4.1 Merging via smaller object better

When pixels belong to multiple predicted objects, they are assigned to the smaller object; the rationale is that small objects are usually contained within large ones.

When merging segmentation results, for those pixels belonging to multiple predicted masks, we assign them to the smallest objects.
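A sketch of the merging rule; the (N, H, W) boolean-mask representation is an assumption.

```python
import numpy as np

def merge_smaller_object_better(masks):
    # masks: (N, H, W) boolean masks, one per predicted object.
    # Paint larger objects first so that smaller objects, painted
    # last, win every contested pixel.
    areas = masks.reshape(len(masks), -1).sum(axis=1)
    canvas = np.full(masks.shape[1:], -1, dtype=np.int64)  # -1 = background
    for idx in np.argsort(areas)[::-1]:                    # largest -> smallest
        canvas[masks[idx]] = idx
    return canvas  # per-pixel object index
```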

3.4.2 Dropping via proxy box agreement

After applying the merging strategy, low-quality masks are further removed.

Since only GT boxes are available (there are no GT masks), the authors take the proxy box, i.e., the bounding box of a proxy mask, and compute the box IoU between the proxy box and the GT box as the proxy box agreement; proxy masks with low box IoU are dropped and ignored during back-propagation.
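A sketch of the dropping rule; the 0.5 IoU threshold is an illustrative assumption, not the paper's value.

```python
import numpy as np

def box_iou(a, b):
    # a, b: (x1, y1, x2, y2) boxes.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-6)

def keep_proxy_mask(proxy_mask, gt_box, thresh=0.5):
    # Proxy box = tight bounding box of the proxy mask; drop the mask
    # (ignore it during back-propagation) if its box IoU with the GT
    # box, the "proxy box agreement", is low.
    ys, xs = np.nonzero(proxy_mask)
    if xs.size == 0:
        return False
    proxy_box = (xs.min(), ys.min(), xs.max() + 1, ys.max() + 1)
    return box_iou(proxy_box, gt_box) >= thresh
```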

4. Experiment

4.1 Comparison with the State-of-the-art Methods


4.2. Ablation Studies

4.2.1 Less salient images for joint training


4.2.2 The effectiveness of the three segmentation heads


4.2.3 Sampling strategies for imbalanced training images


4.2.4 The weight of transferred segmentation head


4.2.5 The strategies for merging and deleting proxy masks

