UM-Net: Re-examining ICGNet for polyp segmentation with uncertainty modeling | Literature Digest: multimodal, semi-supervised deep learning for pathology diagnosis and lesion segmentation
Title
UM-Net: Re-examining ICGNet for colorectal polyp segmentation while incorporating uncertainty modeling
01
Introduction
Colorectal cancer (CRC) is one of the most common malignancies worldwide, ranking third among men and second among women. According to Sung et al. (2021), more than 1.9 million new cases were diagnosed in 2020, accounting for 9.4% of global cancer deaths. About 85% of CRCs arise from adenomatous polyps, most commonly high-risk adenomas (Strum and Williamson, 2016). Detecting and removing these polyps before CRC develops significantly reduces CRC incidence and mortality, yielding a 5-year survival rate of about 90% (Siegel et al., 2021). In clinical practice, expert colonoscopy is regarded as the gold standard for screening and removing polyps. However, the procedure is performed manually by endoscopists, and differences in subjective judgment can lead to missed diagnoses: studies report miss rates of up to 21.4% in routine colonoscopy (Kim et al., 2017). Developing an automatic polyp segmentation algorithm is therefore of great significance.
The morphology and characteristics of colorectal polyps change across developmental stages (Jha et al., 2021). Polyps vary in size, have irregular shapes, and show diverse appearances, which complicates analysis. In addition, water flow, intestinal contents, motion blur, bubbles, and brightness variations in the images can all cause segmentation errors (Wu et al., 2021).
Abstract
Automatic segmentation of polyps from colonoscopy images plays a critical role in the early diagnosis and treatment of colorectal cancer. Nevertheless, some bottlenecks still exist. In our previous work, we mainly focused on polyps with intra-class inconsistency and low contrast, using ICGNet to solve them. Due to the different equipment, specific locations and properties of polyps, the color distribution of the collected images is inconsistent. ICGNet was designed primarily with reverse-contour guide information and local–global context information, ignoring this inconsistent color distribution, which leads to overfitting problems and makes it difficult to focus only on beneficial image content. In addition, a trustworthy segmentation model should not only produce high-precision results but also provide a measure of uncertainty to accompany its predictions so that physicians can make informed decisions. However, ICGNet only gives the segmentation result and lacks the uncertainty measure. To cope with these novel bottlenecks, we further extend the original ICGNet to a comprehensive and effective network (UM-Net) with two main contributions that have been proved by experiments to have substantial practical value. Firstly, we employ a color transfer operation to weaken the relationship between color and polyps, making the model more concerned with the shape of the polyps. Secondly, we provide the uncertainty to represent the reliability of the segmentation results and use variance to rectify uncertainty. Our improved method is evaluated on five polyp datasets, which shows competitive results compared to other advanced methods in both learning ability and generalization capability.
Method
3.1. Problem definition
Let T = \{(X_i, Y_i)\}_{i=1}^N denote the labeled dataset of N samples, where each pair (X_i, Y_i) consists of an image X_i \in \mathbb{R}^{C \times H \times W} and its corresponding ground truth Y_i \in \{0, 1\}^{H \times W}, with H \times W the spatial dimensions and C the number of channels. As elaborated in the introduction, our objective is to train a segmentation network F_{seg} that addresses the polyp color and uncertainty problems so as to achieve satisfactory performance on test data. Given two inputs X_1^a and X_2^b, we transfer the color b from X_2^b to X_1^a, yielding a new input X_1^b, on which the segmentation network F_{seg}(X_1^b) is built. Additionally, we attach uncertainty to the predictions by modeling \mathcal{U}_{avg}(F_{seg}(X_1^b)), where the parameter \mu \in [0, 4], while keeping the prediction bias \mathrm{Var}(F_{seg}(X_1^b), Y_1) minimal.
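The color transfer from X_2^b onto X_1^a can be sketched with Reinhard-style per-channel statistics matching. The paper's exact operation is not reproduced here; the function below, including its use of raw RGB channels, is an illustrative assumption:

```python
import numpy as np

def color_transfer(source: np.ndarray, reference: np.ndarray) -> np.ndarray:
    """Transfer the global color statistics of `reference` onto `source`.

    Both inputs are float arrays of shape (H, W, C) in [0, 1]. Each channel
    of `source` is re-normalized to match the mean and standard deviation of
    `reference`, weakening the link between color and polyp appearance.
    """
    out = np.empty_like(source, dtype=np.float64)
    for c in range(source.shape[2]):
        s_mean, s_std = source[..., c].mean(), source[..., c].std()
        r_mean, r_std = reference[..., c].mean(), reference[..., c].std()
        # Match first- and second-order statistics channel by channel.
        out[..., c] = (source[..., c] - s_mean) / (s_std + 1e-8) * r_std + r_mean
    return np.clip(out, 0.0, 1.0)
```

In practice such transfers are often done in the LAB color space rather than RGB; the per-channel form above is kept only to show the core statistic-matching idea.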
Results
5.1. Quantitative evaluation for metric superiority
5.1.1. Learning ability
In this section, we assess the learning capability of our approach on two datasets, with detailed results provided in Tables 2 and 3. Compared to ICGNet, UM-Net demonstrates notable gains in the Dice and mIoU metrics: on the EndoScene dataset they rise from 87.93% and 89.56% to 89.26% and 90.33%, respectively, and on the Kvasir-SEG dataset from 92.35% and 91.99% to 93.04% and 92.54%. We also compare complexity metrics: floating-point operations (FLOPs), network parameters (Params), and frames per second (FPS). UM-Net runs at 16.87G FLOPs, 22.75M Params, and 46 FPS on the EndoScene dataset, and at 15.62G FLOPs, 22.75M Params, and 50 FPS on the Kvasir-SEG dataset; only Polyp-PVT reports lower FLOPs, at roughly 8.28G and 7.66G, respectively. In terms of Params, our model is comparable to the leading methods while keeping computational demands low. The improved segmentation accuracy is particularly valuable, since accurate polyp segmentation with minimal computational overhead directly supports physicians' diagnostic precision. Overall, UM-Net offers the best balance between effectiveness and computational cost.
Notably, the model averages about 48 FPS at inference, which is fast enough for real-time prediction and makes it suitable for deployment in computer-aided diagnosis systems.
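For reference, the Dice and mIoU metrics reported above can be computed from binary masks as in the minimal NumPy sketch below (function names are our own, not the paper's):

```python
import numpy as np

def dice_score(pred: np.ndarray, gt: np.ndarray, eps: float = 1e-8) -> float:
    """Dice coefficient between two binary masks: 2|P∩G| / (|P| + |G|)."""
    inter = np.logical_and(pred, gt).sum()
    return float((2.0 * inter + eps) / (pred.sum() + gt.sum() + eps))

def miou(pred: np.ndarray, gt: np.ndarray, eps: float = 1e-8) -> float:
    """Mean IoU averaged over the background (0) and foreground (1) classes."""
    ious = []
    for cls in (0, 1):
        p, g = pred == cls, gt == cls
        inter = np.logical_and(p, g).sum()
        union = np.logical_or(p, g).sum()
        ious.append((inter + eps) / (union + eps))
    return float(np.mean(ious))
```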
Figure

Fig. 1. The challenges addressed by our framework for polyp segmentation from colonoscopy images. (a) and (b) show the original ICGNet and the improved UM-Net, respectively, each targeting a different set of challenges.

Fig. 2. Overview of the improved UM-Net architecture. The network segments polyps in three stages: Stage 1 feeds polyp images processed by the color transfer operation into the network; Stage 2 extracts features from these images; Stage 3 produces the segmentation masks together with uncertainty measures. The RCG, ALGM, and HPPF modules are the same as in ICGNet (Du et al., 2022).

Fig. 3. One iteration of the color transfer operation.

Fig. 4. Qualitative results of different methods on Kvasir-SEG and EndoScene datasets. The segmentation results are converted into contour outlines and displayed in the final column (with ground truth in red, PraNet in cyan, ACSNet in yellow, CCBANet in black, SANet in white, ICGNet in blue, and UM-Net in green). Furthermore, the red dashed boxes highlight areas where models have missed diagnoses, while red arrows indicate regions where predictions exceed ground truth. The white dashed lines illustrate the distinction between ICGNet and UM-Net predictions.

Fig. 5. Forest plot of the ablation study on the EndoScene test set. The left side lists the ablated sub-modules, the middle shows the corresponding visual results, and the right side gives each sub-module's Dice score with its 95% confidence interval. Each diamond marks a Dice score, and the horizontal line through it spans the upper and lower bounds of the confidence interval.
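The 95% confidence intervals shown in the forest plot can be obtained from per-image Dice scores with a normal approximation; the paper may use a different estimator, so the sketch below is only one common choice:

```python
import numpy as np

def dice_ci95(scores):
    """Mean Dice and a normal-approximation 95% confidence interval.

    `scores` is a sequence of per-image Dice values; the half-width is
    1.96 * standard error of the mean (sample std, ddof=1).
    """
    scores = np.asarray(scores, dtype=np.float64)
    mean = scores.mean()
    half = 1.96 * scores.std(ddof=1) / np.sqrt(len(scores))
    return mean, (mean - half, mean + half)
```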

Fig. 6. Feature visualization of the second layer of UM-Net. From left to right: the input image (the green curve marks the ground-truth outline), the E-Block 2 feature, the RCG module feature, and the ALGM module feature. After applying the two modules, the network precisely captures the missing object parts and effectively completes the feature representation.

Fig. 7. How the uncertainty modeled by UM-Net evolves as training iterations increase. The left side shows, from top to bottom, the input images, ground truths, and corresponding uncertainties. The first row is UM-Net's uncertainty without variance rectification, the second row is the rectified uncertainty, and the third row is the variance computed between the predicted masks and the ground truths.
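One common way to realize the uncertainty and variance maps described above is per-pixel binary entropy and squared error against the ground truth. The sketch below is an assumed formulation for illustration, not the paper's exact definition:

```python
import numpy as np

def pixel_uncertainty(prob: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Binary predictive entropy per pixel; largest (ln 2) where prob = 0.5."""
    p = np.clip(prob, eps, 1.0 - eps)
    return -(p * np.log(p) + (1.0 - p) * np.log(1.0 - p))

def prediction_variance(prob: np.ndarray, gt: np.ndarray) -> np.ndarray:
    """Squared error between the soft prediction and the binary ground truth,
    used here as a stand-in for the paper's variance-based rectification."""
    return (prob - gt.astype(np.float64)) ** 2
```

During training, the variance map could then down-weight or correct unreliable uncertainty estimates, in the spirit of the rectification shown in rows two and three of the figure.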

Fig. 8. Reliability analysis of two test cases from the test set. For each case, from left to right: the input image, the ground truth, the predictions of ICGNet and UM-Net, the associated uncertainty maps, and the variance maps.

Fig. 9. Failure cases on the EndoScene (a, b) and Kvasir-SEG (c, d) datasets. Green and red contours delineate the predicted and ground-truth polyp boundaries, respectively.
Table

Table 1. Details of the datasets used for training, validation and testing.

Table 2. Quantitative results on the EndoScene test set. 'n/a' indicates that the result is unavailable.

Table 3. Quantitative results on the Kvasir-SEG test set. 'n/a' indicates that the result is unavailable.

Table 4. Quantitative results on the ColonDB, ETIS and CVC300 test sets.

Table 5. Quantitative results of the ablation study on the EndoScene test set.

Table 6. 95% confidence intervals for all metrics.

Table 7. Quantitative results on a subset (C6) of the PolypGen dataset.

Table 8. Quantitative results on the CVC-300-TV dataset.
