A robust image segmentation and synthesis pipeline for histopathology | Literature Digest: Generative models and transformers in medical imaging

Title

A robust image segmentation and synthesis pipeline for histopathology

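Introduction

Histopathological examination is the clinical gold standard for pathological diagnosis (Tan et al., 2020). In recent years, technical advances and the spread of scanners that turn glass slides into whole slide images (WSIs) have allowed pathologists to analyze patient specimens without a microscope, relying only on specialized hardware and software (Stathonikos et al., 2019). With the outbreak of the pandemic, more and more pathology departments adopted digital pathology, digitizing tissue samples into electronic files that can be viewed remotely on a computer (Schüffler et al., 2021). The positive impact of these advances does not end there.

For example, Natarajan et al. (2020) proposed LinkNET-34 for breast cancer nuclei segmentation, which preprocesses the histopathology images before running the segmentation pipeline. For weakly supervised problems, Silva Rodríguez et al. (2021) proposed WeGleNet, which supports a multi-class segmentation module combined with an encoder network. Other architectures such as WILDCAT (Durand et al., 2017) and Attention-MIL (Ilse et al., 2018) are attention-based and use an adaptation layer to produce a segmentation map for each class in the dataset. In addition, because histopathology data are limited in both quantity and quality, unsupervised methods have also become an active research area. For example, Xu et al. (2021a) proposed an unsupervised tissue cluster level graph cut (TisCut) method to segment tumor and non-tumor tissue regions, complementing the downstream tasks of supervised deep learning models. Irfan et al. (2021) proposed a 24-layer dilated semantic segmentation network (Di-CNN) that uses a transfer-learning feature extraction module to segment breast lesion images. Kiran et al. (2022) proposed the DenseRes U-Net model, which integrates dense blocks into the last layers of the U-Net encoder and replaces the conventional skip connections with residual connections combined with Atrous blocks.

Zhang et al. (2022) proposed a novel dense dual-task network (DDTNet) that simultaneously detects and segments tumor-infiltrating lymphocytes (TILs) in histopathology images. The model contains a backbone network (a feature pyramid network) that extracts multi-scale morphological features of TILs, a detection module that localizes TIL centers, and a segmentation module that delineates TIL boundaries. Razavi et al. (2022) proposed a novel conditional generative adversarial network (cGAN) architecture called MiNuGAN for the dual task of mitosis and nuclei segmentation. The model is adapted from the original pix2pixHD cGAN architecture and uses an encoder-decoder setup with ResNet blocks to detect and segment mitoses in breast tissue.

In addition, Zidan et al. (2023) proposed a new transformer-based model, Swin Transformer with Cascaded UPsampling (SwinCup), for segmenting colorectal cancer images. Hörst et al. (2023) proposed a transformer-based method for accurate nuclei detection and segmentation in hematoxylin and eosin (H&E) stained tissue images.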

Abstract

Significant diagnostic variability between and within observers persists in pathology, despite the fact that digital slide images provide the ability to measure and quantify features much more precisely compared to conventional methods. Automated and accurate segmentation of cancerous cell and tissue regions can streamline the diagnostic process, providing insights into the cancer progression and helping experts decide on the most effective treatment. Here, we evaluate the performance of the proposed PathoSeg model, with an architecture comprising a modified HRNet encoder and a UNet++ decoder integrated with a CBAM block to utilize an attention mechanism for an improved segmentation capability. We demonstrate that PathoSeg outperforms the current state-of-the-art (SOTA) networks in both quantitative and qualitative assessment of instance and semantic segmentation. Notably, we leverage synthetic data generated by PathopixGAN, which effectively addresses the data imbalance problem commonly encountered in histopathology datasets, further improving the performance of PathoSeg. PathopixGAN utilizes spatially adaptive normalization within a generative and discriminative mechanism to synthesize diverse histopathological environments dictated by semantic information passed through pixel-level annotated Ground Truth semantic masks. Besides, we contribute to the research community by providing an in-house dataset that includes semantically segmented masks for breast carcinoma tubules (BCT), micro/macrovesicular steatosis of the liver (MSL), and prostate carcinoma glands (PCG). In the first part of the dataset, we have a total of 14 whole slide images from 13 patients' liver, with fat cell segmented masks, totaling 951 masks of size 512 × 512 pixels. In the second part, it includes 17 whole slide images from 13 patients with prostate carcinoma gland segmentation masks, amounting to 30,000 masks of size 512 × 512 pixels. In the third part, the dataset contains 51 whole slides from 36 patients, with breast carcinoma tubule masks totaling 30,000 masks of size 512 × 512 pixels. To ensure transparency and encourage further research, we will make this dataset publicly available for non-commercial and academic purposes.

Method

2.1. In-house dataset

Three histopathology datasets pertaining to different histopathological environments were used in this study: Breast Cancer Tubules (BCT), Micro/macrovesicular Steatosis of Liver (MSL), and Prostate Carcinoma Glands (PCG). The datasets comprise corresponding pixel-level Ground Truth (GT) segmentation masks annotated by board-certified pathologists and biomedical engineers.

The BCT dataset was utilized for a binary-class instance segmentation problem and focuses on abnormal tubule region formation in breast carcinoma. Slides were scanned using the PANNORAMIC® Flash DESK DX scanner at 20× magnification and then pre-processed into 512 × 512 patches. The dataset includes 30,000 annotated patches. The stratified training cohort included 25,000 images with a consistent class distribution of tubule and background regions, of which 24,997 are tubule patches. The stratified test cohort included 5,000 images, of which 4,990 are tubule patches.
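As a quick check on the BCT split figures quoted above, the following minimal sketch recomputes the share of tubule patches in each cohort. The dictionary layout and the function name `tubule_fraction` are illustrative assumptions; only the four counts come from the text.

```python
# Patch counts quoted in the text for the BCT dataset.
# The dictionary layout is an illustrative assumption.
BCT_SPLITS = {
    "train": {"total": 25_000, "tubule": 24_997},
    "test": {"total": 5_000, "tubule": 4_990},
}


def tubule_fraction(split: dict) -> float:
    """Share of patches in a split that contain tubule regions."""
    return split["tubule"] / split["total"]


fractions = {name: tubule_fraction(split) for name, split in BCT_SPLITS.items()}
total_patches = sum(split["total"] for split in BCT_SPLITS.values())

for name, fraction in fractions.items():
    print(f"{name}: {fraction:.3%} of patches contain tubules")
print("total patches:", total_patches)
```

Both cohorts come out above 99.8% tubule patches and the two totals add up to the stated 30,000, which is consistent with the description of a stratified split.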

Results

3.1. PathopixGAN performance evaluation

PathopixGAN performance for synthetic data generation is evaluated on binary-class segmentation of the BCT and MSL datasets as well as multiclass segmentation of the PCG dataset. To this end, we present a comparative analysis of PathopixGAN and the original GauGAN (Park et al., 2019) data generation architecture to prove the superiority of the proposed PathopixGAN method.

A mask synthesis module, FastGAN (Liu et al., 2020), is used to create a cohort of new and different semantic masks that resemble the spatial distributions of real histopathology masks. Most importantly, this technique introduces crucial differences between tissue- and cell-level histopathology environments that PathopixGAN can replicate. The new semantically varied GT masks were provided to both the PathopixGAN and GauGAN image generators to evaluate their ability to synthesize images from histopathology data. The qualitative analysis of the FastGAN-synthesized GT masks and their corresponding PathopixGAN-synthesized histopathological imagery is provided in Fig. 3.

Figure

Fig. 1. Overview of the general workflow of the proposed pipeline. Masks from real data are given to the FastGAN (Liu et al., 2020) mask synthesis module, which synthesizes new semantic masks with diverse histopathology micro-environments. The increased number of input masks is fed to PathopixGAN to produce corresponding tissue patches. A high-level view of PathopixGAN is provided, with the architecture inspired by the original GauGAN implementation (Park et al., 2019). PathopixGAN residual blocks (ResBLK) replace the Rectified Linear Unit (ReLU) of the original implementation with the Gaussian Error Linear Unit (GELU) (Hendrycks and Gimpel, 2016) due to its smoother differentiation, while taking GT semantic masks as input to perform semantically adaptive normalization for conditional synthesis. Denser skip connections are employed to ensure a more consistent information flow through the generator. PatchGAN is employed as the discriminator, as in the original work. The PathopixGAN generator learns the semantic information for the different classes in the dataset through their synthetic masks and synthesizes images resembling original tissue patches. The synthetic images are passed to the discriminator during the training phase, which learns to distinguish between them and the real images, making the PathopixGAN generator better at image synthesis. The synthesized images, along with their corresponding FastGAN-generated masks, are combined with the real data and used as the training cohort for the novel PathoSeg architecture, which outputs an inferred semantic mask for the input histopathology image.
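The caption motivates the switch from ReLU to GELU by GELU's smoother differentiation. The sketch below illustrates that point on scalars using the standard definitions of the two activations. It is not the PathopixGAN code, and the helper `numerical_slope` is a hypothetical name introduced only for this illustration.

```python
import math


def relu(x: float) -> float:
    """Rectified Linear Unit: zero for negative inputs, identity otherwise."""
    return max(0.0, x)


def gelu(x: float) -> float:
    """Exact GELU: x * Phi(x), where Phi is the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def numerical_slope(f, x: float, h: float = 1e-4) -> float:
    """Central-difference estimate of the derivative of f at x."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


# Compare the slopes just left and right of zero.
for x in (-0.01, 0.01):
    print(f"x={x:+.2f}  relu'={numerical_slope(relu, x):.3f}  "
          f"gelu'={numerical_slope(gelu, x):.3f}")
```

Across zero, the ReLU slope jumps from 0 to 1, while the GELU slope stays close to 0.5 on both sides and changes gradually.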

Fig. 2. Overview of PathoSeg segmentation architecture. The proposed encoder-decoder model employs HRNet and UNet++. Dilated convolutions with a large kernel size of 5 × 5 and a rate of 2 are employed by the HRNet encoder to better conserve the global context of histopathological environments. Moreover, skip connections were utilized at different resolutions to incorporate and strengthen the flow of low-level information throughout the network. The architecture has five stages of convolutions, as displayed in the figure. The first stage, namely the 'Bottleneck layer', has four blocks of the convolution and pooling layers specified in the figure, followed by a dimensionality-reducing 1 × 1 convolution layer. The number of blocks in the following 'Basic' stages increases by 4, whereas BatchNorm is replaced with LayerNorm to reduce the interdependency of samples within a mini-batch and the inflated impact of batch size on model performance. Notably, this also reduces the computational complexity of the PathoSeg architecture. Furthermore, as a major introduction, a CBAM block (Woo et al., 2018) was employed after the encoder to integrate both channel and spatial attention mechanisms. This addition induces attention to the features and regions of higher diagnostic potential. The decoder is the original UNet++, which takes the attention maps and performs deconvolution with skip connections to produce precise and explainable segmentation masks.
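To make the effect of the 5 × 5 kernel with a dilation rate of 2 concrete, the sketch below computes the spatial extent such a kernel covers, using the standard dilated-convolution relation. The function names are hypothetical, and the padding helper assumes stride 1 with an odd kernel size, which the caption does not state.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Spatial extent covered by a dilated kernel along one axis."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def same_padding(kernel_size: int, dilation: int) -> int:
    """Padding that keeps the output size at stride 1 (assumes an odd kernel size)."""
    return dilation * (kernel_size - 1) // 2


k, d = 5, 2  # kernel size and dilation rate quoted in the caption
print("effective extent:", effective_kernel_size(k, d))  # 9
print("weights per filter slice:", k * k)                # 25
print("padding for same-size output:", same_padding(k, d))  # 4
```

A 5 × 5 kernel at rate 2 therefore spans a 9 × 9 window while still using 25 weights per channel pair, which is how dilation widens context without adding parameters.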

Fig. 3. Qualitative results of conditional PathopixGAN image synthesis, guided by GT semantic masks from the BCT, MSL, and PCG datasets. As a preprocessing routine, FastGAN was trained on the real GT semantic masks from all datasets to synthesize masks that cover a range of high-quality and diverse tissue and cell environments for the three datasets. These synthesized semantic masks were then used for semantically controlled image generation with the PathopixGAN generator, which was first trained and optimized on the real data. The proposed PathopixGAN synthesis method was shown to accurately recreate tubule formation, fat cells, and prostate carcinoma tissues guided by the semantic information from the input GT masks. The method successfully synthesized the epithelial component, which does not contain the lumen and the stroma. Similarly, for prostate, PathopixGAN successfully synthesized images with realistic and diagnostically relevant epithelial and stromal components of the prostate carcinomas. In addition, fat droplets in the fatty liver can be almost completely masked. However, some of the small-diameter droplets, namely microvesicles, can occasionally escape masking.

Fig. 4. Qualitative results of real and synthetic images evaluated on the BCT, MSL, and PCG datasets. A qualitative comparison between the fidelity of real histopathological images and the synthesized images obtained by PathopixGAN synthesis. PathopixGAN was first trained and optimized on real data from the BCT, MSL, and PCG datasets to obtain an accurate mapping function for image synthesis. During the evaluation, PathopixGAN demonstrated the ability to accurately synthesize tubule images with stromal and epithelial components, with tissue structures that could not be told apart from those in the real data. In fatty liver, the texture and the microvesicular and macrovesicular fats in the synthetic images appear compatible with real data. Similarly, for prostate carcinoma, the synthetic images are epithelial, making them indistinguishable from the real prostate carcinoma images.

Fig. 5. Qualitative analysis of PathoSeg inference on real and mixed data cohorts. A qualitative comparison of inference quality between PathoSeg trained on the real data and PathoSeg fine-tuned on the mixed data. The synthetic data generated by PathopixGAN improved the performance of the model: inference improved significantly when the network was fine-tuned on mixed (real and synthetic) data compared to the real data alone. For tubules, PathoSeg optimized on PathopixGAN-generated tubule images showed improved generalizability, with a reduction in false positives and improved accuracy in predicting GT tubule regions. For fat, PathoSeg improved at correctly classifying fat cells while reducing the number of false predictions for microvesicular fat. Most importantly, PathoSeg demonstrated that fine-tuning on mixed data increased generalizability for the lower-represented classes in the imbalanced prostate dataset. The model improved predictions for the imbalanced red and blue classes while further enhancing the quality of inferences for the green and yellow prostate mask areas. The reduction in false positives for BCT, MSL, and PCG was about 1.89%, 1.66%, and 1.07%, respectively.

Table

Table 1. FID and Inception scores achieved by the PathopixGAN and original GauGAN generators. PathopixGAN outperformed GauGAN in BCT, MSL, and PCG image synthesis by achieving superior FID and Inception scores. This performance improvement is due to the ability of PathopixGAN to ensure the flow of information from higher to lower resolutions via a denser network of skip connections and better activations in the convolutional blocks.
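For reference, the two scores named in the caption are commonly defined as below. The caption does not spell them out, so the symbols are assumptions: $\mu_r, \Sigma_r$ and $\mu_g, \Sigma_g$ are the mean and covariance of Inception features for real and generated images, $p(y \mid x)$ is the Inception class posterior for an image $x$, and $p(y)$ is its marginal over generated images.

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\left( \Sigma_r \Sigma_g \right)^{1/2} \right)

\mathrm{IS} = \exp\!\left( \mathbb{E}_{x \sim p_g}\,
  D_{\mathrm{KL}}\!\left( p(y \mid x) \,\Vert\, p(y) \right) \right)
```

Under these definitions a lower FID and a higher Inception score indicate better synthesis, which is the sense in which "superior" scores should be read.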

Table 2. Quantitative comparison between PathoSeg and benchmark models for which the train and test datasets are real. Optimal hyperparameter settings were obtained through a random search approach, where the models were trained on the real images from the BCT, MSL, and PCG datasets and evaluated by Dice score, IoU score, Pixel Accuracy, F1 score, and Matthews Correlation Coefficient (MCC). In general, the attention-based frameworks performed better for histopathology segmentation, as attention empowers the model to better learn the histological features of higher importance. The evaluation scores were also generally higher for the BCT and MSL datasets, since they had less severe class imbalance than the PCG dataset. A batch size of 64, the Cosine Annealing LR scheduler, the Adam optimizer, and Affine transformations gave optimal performance for the majority of the models. Cosine Annealing provides a dynamic learning rate that follows a cosine curve, ensuring that the model landscape is traversed smoothly without getting hindered by local minima. Generally, lower learning rates were employed for PCG than for the others, since the dataset was severely imbalanced, leading to a difficult model landscape for segmentation and thus requiring smaller updates to the model weights. The best results for each evaluation metric are highlighted, exhibiting the superior performance of PathoSeg in binary segmentation of BCT and MSL, while also showing better generalizability on the multi-class imbalanced PCG dataset. The higher Dice and IoU scores and better pixel accuracy denote that PathoSeg performed better in the pixel-level classification done in both instance and semantic segmentation, correctly classifying the pixels with an accurate class label. Moreover, a higher F1 score and MCC demonstrate the robustness and better capability of PathoSeg to generalize on multiple classes despite the class imbalance.
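The five metrics named in the caption can all be derived from the pixel-level confusion counts of a single foreground class. The following is a minimal sketch using the standard definitions, not the evaluation code of the paper. The function name and the example counts are hypothetical.

```python
import math


def binary_segmentation_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Pixel-level metrics for one foreground class from confusion counts."""

    def safe_div(num: float, den: float) -> float:
        # Return 0.0 instead of raising when a denominator is empty.
        return num / den if den else 0.0

    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "dice": safe_div(2 * tp, 2 * tp + fp + fn),
        "iou": safe_div(tp, tp + fp + fn),
        "pixel_accuracy": safe_div(tp + tn, tp + fp + fn + tn),
        "f1": safe_div(2 * precision * recall, precision + recall),
        "mcc": safe_div(tp * tn - fp * fn, mcc_den),
    }


# Hypothetical counts for a 100-pixel patch, not taken from the paper.
example = binary_segmentation_metrics(tp=40, fp=10, fn=10, tn=40)
for name, value in example.items():
    print(f"{name}: {value:.4f}")
```

For a single binary class, Dice and F1 coincide by definition. MCC also uses the true negatives, which makes it more informative than pixel accuracy when the background dominates.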

Table 3. Quantitative comparison between PathoSeg and the benchmark models. A random search approach was used to obtain the optimal hyperparameter set when fine-tuning the pretrained models on mixed data, consisting of the original data and PathopixGAN-synthesized data, with a total of 5000 synthetic images for BCT, MSL, and PCG. A batch size of 64, the Cosine Annealing LR scheduler, and Affine transformations remained optimal for the majority of the models. Furthermore, lower learning rates were again employed for PCG than for the other datasets because of the severe class imbalance, which led to a difficult model landscape and required smaller updates to the model weights. The best results for each evaluation metric are highlighted, exhibiting the superior performance of our proposed PathoSeg architecture when fine-tuned on the mixed data. For BCT, an improvement of at least 3% was observed in the Dice score, followed by a 2% improvement in IoU score, Pixel Accuracy, F1 score, and MCC. In MSL, PathoSeg achieved an improvement of 3% in Dice score and IoU score, while a superiority of 2% was observed in Pixel Accuracy, F1 score, and MCC. Similarly, the proposed model outperformed other SOTA architectures when fine-tuned on mixed data in binary segmentation of fat cells in the liver. Furthermore, similar to the MSL and BCT datasets, PathoSeg improved its overall generalization capability for the PCG dataset, which suffers from the class imbalance problem, as demonstrated by an increase of at least 3% in all the evaluation metrics. Finally, the increase in Dice and IoU scores, Pixel Accuracy, F1 score, and MCC provides further evidence that fine-tuning on the PathopixGAN-generated images can assist in improving the model generalizability when predicting abnormal tissue and cell regions, which can be a critical aspect in clinical diagnosis.
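The cosine annealing schedule referred to in the table captions can be sketched with its usual closed form, in which the learning rate falls from a maximum to a minimum along half a cosine period. The function name and the values `lr_max=1e-3`, `lr_min=1e-5`, and `total_steps=100` are hypothetical. The captions do not report the learning rates actually used.

```python
import math


def cosine_annealing_lr(step: int, total_steps: int,
                        lr_max: float, lr_min: float = 0.0) -> float:
    """Learning rate at `step` on a single cosine decay from lr_max to lr_min."""
    cos_term = math.cos(math.pi * step / total_steps)
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + cos_term)


# Hypothetical settings, chosen only to show the shape of the curve.
total_steps, lr_max, lr_min = 100, 1e-3, 1e-5
for step in (0, 25, 50, 75, 100):
    lr = cosine_annealing_lr(step, total_steps, lr_max, lr_min)
    print(f"step {step:3d}: lr = {lr:.6f}")
```

The rate changes slowly near the start and end of the schedule and fastest in the middle, which is the smooth traversal the captions describe.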
