
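Adversarial EM for Variational Deep Learning: Application to Semi-Supervised Image Quality Enhancement in Low-Dose PET and Low-Dose CT | Literature Digest: Transformer Architectures in Medical Image Analysis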


Title

Adversarial EM for variational deep learning: application to semi-supervised image quality enhancement in low-dose PET and low-dose CT

Introduction

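Medical imaging has become central to modern diagnosis, disease screening, and treatment planning. Non-invasive imaging is used more and more widely to support diagnosis, plan treatment, and monitor disease progression and surgery, and continuing technical progress has drawn further attention to it. Two of the most commonly used modalities are X-ray computed tomography (CT) and positron emission tomography (PET).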

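CT sends X-ray beams from a rotating X-ray tube through the subject and records the transmitted radiation at the detectors. Measurements taken from many angles are used to reconstruct two-dimensional cross-sectional slices that show the internal detail of organs and tissue, and the slices are stacked to form a three-dimensional volume. PET, in contrast, is a functional imaging modality. Before the scan, a radioactive tracer is injected intravenously. As the tracer undergoes positron decay in the body, the emitted positrons annihilate with electrons and produce gamma rays. Detectors convert these gamma rays into electrical signals, from which an image is reconstructed that maps physiological quantities such as blood flow, metabolic activity, and tracer distribution (Townsend et al., 2004).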

Abstract

In positron emission tomography (PET) and X-ray computed tomography (CT), reducing the radiation dose can significantly degrade image quality. To enhance image quality in low-dose PET and CT, we propose a new theoretical framework, adversarial EM (AdvEM), that combines adversarial and variational deep neural networks (DNNs) and relies on expectation maximization (EM) based learning. AdvEM uses an encoder–decoder architecture with a multiscale latent space, and generalized-Gaussian models that allow datum-specific robust statistical modeling in both latent space and image space. Including adversarial learning in the training further improves model robustness. Unlike typical variational-DNN learning, AdvEM samples the latent space from the posterior distribution, using a Metropolis–Hastings scheme. Unlike existing PET or CT enhancement schemes, which train on low-dose images paired with their normal-dose versions, our semi-supervised AdvEM (ssAdvEM) framework can learn from only a small number of normal-dose images. Both AdvEM and ssAdvEM provide per-pixel uncertainty estimates for their outputs. Empirical evaluations on real-world PET and CT data, covering many baselines, out-of-distribution data, and ablation studies, show the benefits of the proposed framework over existing approaches.
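To make the generalized-Gaussian modeling mentioned in the abstract concrete, here is a minimal sketch of a generalized-Gaussian negative log-likelihood. The parameterization (location `mu`, scale `alpha`, shape `beta`) and the function name `gg_nll` are assumptions for illustration; the digest does not state which parameterization the paper uses.

```python
import math

import numpy as np


def gg_nll(x, mu, alpha, beta):
    """Per-element negative log-likelihood of a generalized Gaussian.

    Assumed density (one standard parameterization, not taken from the paper):
        p(x) = beta / (2 * alpha * Gamma(1 / beta)) * exp(-(|x - mu| / alpha) ** beta)
    with scalar scale alpha > 0 and scalar shape beta > 0.
    """
    x = np.asarray(x, dtype=float)
    z = np.abs(x - mu) / alpha
    # Log of the normalizing constant, written with lgamma for numerical stability.
    log_norm = math.log(2.0 * alpha) + math.lgamma(1.0 / beta) - math.log(beta)
    return z ** beta + log_norm


# beta = 2 with alpha = sqrt(2) * sigma recovers a Gaussian with standard deviation sigma;
# beta = 1 recovers a Laplace distribution, which penalizes large residuals less.
gaussian_like = gg_nll(5.0, 0.0, math.sqrt(2.0), 2.0)
laplace_like = gg_nll(5.0, 0.0, 1.0, 1.0)
```

Under this parameterization, a shape parameter below 2 gives heavier tails, so a large residual costs less than under a Gaussian. That is the usual sense in which such a loss is called robust.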

Method


We propose a novel semi-supervised, adversarial, and variational deep neural network (DNN) framework for enhancing image quality in low-dose PET and CT, trained with expectation maximization (EM). The framework (Figure 1) uses an encoder–decoder architecture and comes in two versions: (i) AdvEM, the fully supervised version, which trains on paired data consisting of low-dose images and their corresponding normal-dose images; and (ii) ssAdvEM, the semi-supervised version, which does not require a normal-dose image for every low-dose image and is therefore aimed at settings where paired normal-dose data are scarce.

Conclusion


We propose a novel adversarial and variational deep learning framework, ssAdvEM, for semi-supervised image quality enhancement in low-dose PET and CT, with variational modeling and learning based on Monte Carlo expectation maximization (MCEM). This sets it apart from typical variational deep learning schemes such as the VAE and its extensions, which rely on the evidence lower bound (ELBO). Within MCEM, the framework models the posterior distribution of the latent-space variables and draws from it with a Metropolis–Hastings (MH) sampler. Unlike typical sampling in variational deep learning, this sampler uses the encoder, the decoder, and a reference image. The framework also combines a novel multiscale latent space with datum-specific robust generalized-Gaussian (GG) modeling, so that it can handle both in-distribution images and out-of-distribution (OOD) images arising from lower doses.

ssAdvEM adds adversarial learning to our earlier method ss-NonAdv-EM, first presented in Sharma et al. (2022). Adversarial learning makes the framework more robust, as shown by the significant performance gains on OOD images in Sections 4.3 and 4.4. Adapting MCEM to semi-supervised learning allows effective training with very few high-quality images, which addresses the practical problem of limited dataset availability. Both ssAdvEM and ss-NonAdv-EM outperform other unsupervised and semi-supervised frameworks such as N2N, DIP, and N2N+UNet (a baseline proposed for fair comparison), as well as prior-model-based methods such as TV denoising and BM3D/BM4D.

Another key feature of the framework is that it provides uncertainty estimates, which the other methods lack. The ablation studies in Section 4.6 show the benefits of adversarial learning, variational modeling, the GG loss, and the multiscale latent space in building an effective and robust framework for low-dose PET and CT image enhancement.
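The conclusion refers to Metropolis–Hastings (MH) sampling from a posterior over latent variables. As a rough illustration of the sampler itself, the sketch below runs a random-walk MH chain on a one-dimensional toy posterior. The toy target, the function names, and the step size are all assumptions; the paper's sampler works with the encoder, decoder, and a reference image, which are not reproduced here.

```python
import numpy as np


def metropolis_hastings(log_target, h0, n_steps, step, rng):
    """Random-walk Metropolis-Hastings for a scalar variable h.

    log_target returns the log of an unnormalized density; with a symmetric
    Gaussian proposal the acceptance ratio is target(proposal) / target(current).
    """
    h = float(h0)
    logp = log_target(h)
    samples = np.empty(n_steps)
    for t in range(n_steps):
        proposal = h + step * rng.standard_normal()
        logp_proposal = log_target(proposal)
        # Accept with probability min(1, target(proposal) / target(current)).
        if rng.random() < np.exp(min(0.0, logp_proposal - logp)):
            h, logp = proposal, logp_proposal
        samples[t] = h
    return samples


def toy_log_posterior(h):
    """Hypothetical stand-in: N(0, 1) term on h times a N(1; h, 0.5^2) likelihood term."""
    return -0.5 * h ** 2 - 0.5 * ((1.0 - h) / 0.5) ** 2


rng = np.random.default_rng(0)
chain = metropolis_hastings(toy_log_posterior, h0=0.0, n_steps=20000, step=1.0, rng=rng)
kept = chain[2000:]  # discard burn-in
```

For this toy target the exact posterior is Gaussian with mean 0.8 and variance 0.2, so the retained samples can be checked against those values.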

Results


This section describes the datasets, the baseline methods, the results for ssAdvEM and the baselines, the results on OOD data, the ablation studies on ssAdvEM, and a comparison of computational cost between ssAdvEM and the baselines. We quantify the level of supervision by γ, defined as the number of paired low-dose and normal-dose images divided by the size of the whole training set: γ := |P| / |P ∪ U|, where P is the set of training images that have a paired normal-dose image and U is the set of training images that do not.
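As a small worked example of the supervision level γ, the sketch below computes the fraction of the training set that has a paired normal-dose image. The function name and the counts are hypothetical and chosen only to reproduce the value γ = 0.02 that the figure captions quote for CT.

```python
def supervision_level(n_paired, n_unpaired):
    """Fraction of training images that come with a paired normal-dose image."""
    total = n_paired + n_unpaired
    if total == 0:
        raise ValueError("training set is empty")
    return n_paired / total


# Hypothetical counts: 2 paired images among 100 training images.
gamma_ct = supervision_level(n_paired=2, n_unpaired=98)
```

In this reading, γ = 1 corresponds to the fully supervised setting, where every low-dose image has a normal-dose counterpart, and smaller values mean less supervision.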

Figure

Fig. 1. Semi-supervised adversarial expectation maximization (ssAdvEM) framework. (a) In fully supervised AdvEM, the DNN takes a low-dose image X as input, and the reference image Y for its output is the corresponding normal-dose image Z. In ssAdvEM, the reference image Y for some inputs X is instead a low-dose image rather than Z. (b) Each encoder block outputs a generalized-Gaussian (GG) distribution at one scale of the multiscale latent space H.

Fig. 2. Results: quantitative analysis. Image quality enhancement for low-dose (a) CT and (b) PET images at varying levels of supervision γ, for all methods, shown as boxplots of relative root mean squared error (RRMSE) and structural similarity (SSIM) over the test set.
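For readers unfamiliar with RRMSE, the sketch below shows one common definition: the norm of the error divided by the norm of the reference image. The digest does not give the paper's exact formula, so this definition and the function name `rrmse` are assumptions.

```python
import numpy as np


def rrmse(output, reference):
    """Relative root mean squared error: ||output - reference|| / ||reference||.

    Uses the Euclidean (Frobenius) norm over all pixels; lower is better.
    """
    output = np.asarray(output, dtype=float)
    reference = np.asarray(reference, dtype=float)
    return float(np.linalg.norm(output - reference) / np.linalg.norm(reference))


reference_image = np.array([[3.0, 4.0]])
perfect = rrmse(reference_image, reference_image)     # identical images
doubled = rrmse(2.0 * reference_image, reference_image)  # error as large as the reference
```

Because the error is normalized by the reference, values are comparable across images with different intensity ranges, which is convenient when pooling test images into a boxplot.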

Fig. 3. Results: qualitative analysis, CT. Image quality enhancement of low-dose images, showing the outputs and the corresponding per-pixel magnitudes of the difference between each output and the ground-truth image.

Fig. 4. Results: qualitative analysis, PET. Image quality enhancement of low-dose images, showing the outputs and the corresponding per-pixel magnitudes of the difference between each output and the ground-truth image. For better visualization, the head and torso regions use different colormaps to preserve contrast and avoid saturation.

Fig. 5. Results: evaluating Bayesian inference schemes. Quantitative results, at varying levels of supervision γ, for the mean, median, and mode of the output set S.

Fig. 6. Results on OOD images: quantitative analysis. Image quality enhancement for CT and PET images at varying levels of supervision γ, for all methods, shown as boxplots of RRMSE and SSIM over the test set.

Fig. 7. Results on OOD images: qualitative analysis, CT. Image quality enhancement of very-low-dose images, showing the outputs and the corresponding per-pixel magnitudes of the difference between each output and the ground-truth image.

Fig. 8. Results on OOD images: qualitative analysis, PET. Image quality enhancement of very-low-dose images, showing the outputs and the corresponding per-pixel magnitudes of the difference between each output and the ground-truth image. For better visualization, the head and torso regions use different colormaps.

Fig. 9. Results: visualizing per-pixel uncertainty in CT outputs. Per-pixel standard deviation (σ) maps of the output set S for an example CT image, at varying levels of supervision γ. Overall, σ decreases as γ increases.

Fig. 10. Results: visualizing per-pixel uncertainty in PET outputs. Per-pixel standard deviation (σ) maps of the output set S for an example PET image, at varying levels of supervision γ. Overall, σ decreases as γ increases.
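The σ maps in Figs. 9 and 10 are per-pixel standard deviations over a set of outputs for a single input image. The sketch below shows how such maps could be computed from a stack of sampled outputs, assuming the samples are already available as an array. The function name, array layout, and use of the unbiased estimator are assumptions, not details taken from the paper.

```python
import numpy as np


def summarize_outputs(samples):
    """Per-pixel summaries of a stack of outputs with shape (n_samples, height, width)."""
    samples = np.asarray(samples, dtype=float)
    return {
        "mean": samples.mean(axis=0),            # per-pixel mean image
        "median": np.median(samples, axis=0),    # per-pixel median image
        "sigma": samples.std(axis=0, ddof=1),    # per-pixel standard deviation map
    }


# Three constant 2x2 "outputs" with values 0, 1, and 2 at every pixel.
stack = np.stack([np.zeros((2, 2)), np.ones((2, 2)), 2.0 * np.ones((2, 2))])
summary = summarize_outputs(stack)
```

Here every pixel has mean 1, median 1, and σ = 1. In a real output set, σ would vary across the image and could be displayed as a heat map.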

Fig. 11. Results: visualizing variability in the output distribution through its principal modes of variation. The first two principal modes of variation, for CT.

Fig. 12. Results: visualizing variability in the output distribution through its principal modes of variation. The first two principal modes of variation, for PET.
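Principal modes of variation, as shown in Figs. 11 and 12, are commonly obtained by principal component analysis of the sampled outputs. The sketch below is one way to do that with a singular value decomposition. It assumes that the modes are plain PCA directions of the flattened images, which the digest does not confirm, and the function name is hypothetical.

```python
import numpy as np


def principal_modes(samples, n_modes=2):
    """Top principal modes of a stack of images with shape (n_samples, height, width).

    Assumes n_modes <= min(n_samples, height * width).
    Returns the mean image, the modes (unit-norm images), and the standard
    deviation of the samples along each mode.
    """
    samples = np.asarray(samples, dtype=float)
    n = samples.shape[0]
    flat = samples.reshape(n, -1)
    mean = flat.mean(axis=0)
    centered = flat - mean
    # Rows of vt are orthonormal directions in pixel space, ordered by decreasing variance.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    std_along = singular_values[:n_modes] / np.sqrt(n - 1)
    modes = vt[:n_modes].reshape((n_modes,) + samples.shape[1:])
    return mean.reshape(samples.shape[1:]), modes, std_along


rng = np.random.default_rng(0)
toy_samples = rng.standard_normal((20, 4, 4))  # stand-in for sampled outputs
mean_image, modes, std_along = principal_modes(toy_samples)
```

A typical visualization then shows the mean image shifted along each mode by a few standard deviations, for example `mean_image + 2 * std_along[0] * modes[0]`.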

Fig. 13. Results: ablation studies on modeling and training. Performance of six semi-supervised methods (γ = 0.02 for CT and γ = 0.5 for PET): (i) N2N+UNet; (ii) N2N+UNet with adversarial learning; (iii) N2N+UNet with distribution modeling in the latent space; (iv) N2N+UNet with GG distribution modeling at the decoder output; (v) ss-NonAdv-EM, which combines the latent-space and decoder-output distribution modeling of (iii) and (iv); and (vi) ssAdvEM, which adds adversarial learning. Test performance is reported on (a) in-distribution images and (b) OOD images.

Fig. 14. Results: ablation studies on the multiscale latent space. Performance of two semi-supervised methods (γ = 0.02 for CT and γ = 0.5 for PET): (i) the standard ssAdvEM framework with a multiscale latent space, and (ii) an ablated ssAdvEM with a single-scale latent space, as in VAE-based methods. Test performance is reported on (a) in-distribution images and (b) OOD images.
