[Paper Reading] APMSA: Adversarial Perturbation Against Model Stealing Attacks (2023)
This paper proposes a defense against model stealing attacks based on adversarial confidence perturbation (APMSA). An attacker pays for queries to an MLaaS API to obtain the confidence distribution of the target model (the model under attack, MUA) and then reconstructs a substitute model, threatening the model's privacy and business value. To address this, the authors add delicate noise to each input query so that the returned confidence distribution lies close to the decision boundary, hiding the information an attacker could exploit. The method requires no change to the original model and can be deployed as a plug-in front end. Experiments on CIFAR10 and GTSRB validate its effectiveness: it substantially degrades the inference accuracy of the stolen model while leaving normal users' hard-label inference accuracy unaffected.
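A plausible way to formalize this per-query perturbation (my own reading of the abstract; the paper's exact objective may differ) is as a small-budget optimization over an input noise $\delta$:

$$\max_{\|\delta\|_\infty \le \epsilon} \; H\big(f(x+\delta)\big) \quad \text{s.t.} \quad \arg\max_k f_k(x+\delta) = \arg\max_k f_k(x),$$

where $f$ denotes the softmax output of the MUA and $H$ is the entropy; the constraint keeps the hard label, and hence normal users' accuracy, unchanged, while the returned confidence is pushed toward the decision boundary.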

Abstract
Training a Deep Learning (DL) model requires proprietary data and computing-intensive resources. To recoup their training costs, a model provider can monetize DL models through Machine Learning as a Service (MLaaS). Generally, the model is deployed in the cloud while exposing a publicly accessible Application Programming Interface (API) for paid queries. However, model stealing attacks pose a security threat to this monetization scheme, since they let an attacker steal the model and avoid paying for future extensive queries. Specifically, an adversary queries a targeted model to obtain input-output pairs and thus infers the model's internal working mechanism by reverse-engineering a substitute model, which deprives the model owner of its business advantage and leaks the privacy of the model. In this work, we observe that the confidence vector or the top-1 confidence returned by the model under attack (MUA) varies to a relatively large degree across different queried inputs. Therefore, rich internal information of the MUA is leaked to the attacker, which facilitates her reconstruction of a substitute model. We thus propose to leverage adversarial confidence perturbation to hide such varied confidence distributions across different queries, consequently defending against model stealing attacks (dubbed APMSA). In other words, the confidence vectors returned for queries from a specific category are now similar, considerably reducing information leakage from the MUA. To achieve this objective, through automated optimization we constructively add delicate noise into each input query to make its confidence close to the decision boundary of the MUA. Generally, this process is achieved in a manner similar to crafting adversarial examples, with the distinction that the hard label is preserved to be the same as that of the queried input. This retains the inference utility (i.e., without sacrificing inference accuracy) for normal users but bounds the confidence information leaked to the attacker within a small constrained area (i.e., close to the decision boundary). The latter greatly deteriorates the accuracy of the attacker's substitute model. As APMSA serves as a plug-in front end and requires no change to the MUA, it is generic and easy to deploy. The high efficacy of APMSA is validated through experiments on the CIFAR10 and GTSRB datasets. Given a MUA of ResNet-18 on CIFAR10, our defense can degrade the accuracy of the stolen model by up to 15% (rendering the stolen model useless to a large extent) with a 0% accuracy drop for normal users' hard-label inference requests.
In short, the paper presents an adversarial-perturbation-based defense against model stealing attacks that demonstrates strong protective performance.
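To make this concrete, here is a minimal sketch of such a confidence-perturbation step in PyTorch, assuming a PGD-style optimization that maximizes output entropy; `model`, `epsilon`, `step_size`, and `num_steps` are illustrative choices, not the authors' released implementation.

```python
# Hedged sketch: push the confidence of a query toward the decision boundary
# while preserving its top-1 (hard) label. All hyperparameters are assumptions.
import torch
import torch.nn.functional as F

def perturb_query(model, x, epsilon=8 / 255, step_size=1 / 255, num_steps=20):
    """Return a perturbed copy of x whose confidence vector is close to the
    decision boundary (high entropy) but whose hard label is unchanged."""
    model.eval()
    with torch.no_grad():
        orig_label = model(x).argmax(dim=1)

    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(num_steps):
        probs = F.softmax(model(x + delta), dim=1)
        # Gradient ascent on the output entropy drives the confidence toward
        # uniform, i.e., toward the decision boundary of the MUA.
        entropy = -(probs * torch.log(probs + 1e-12)).sum(dim=1).mean()
        entropy.backward()
        with torch.no_grad():
            prev = delta.clone()
            delta += step_size * delta.grad.sign()   # ascend the entropy
            delta.clamp_(-epsilon, epsilon)          # keep the noise "delicate"
            new_label = model((x + delta).clamp(0, 1)).argmax(dim=1)
            flipped = new_label != orig_label
            delta[flipped] = prev[flipped]           # revert steps that would flip the hard label
        delta.grad.zero_()
    return (x + delta).clamp(0, 1).detach()
```

Because the hard label never flips, a normal user who only consumes top-1 predictions sees no accuracy drop, while the attacker observes nearly identical, near-boundary confidence vectors within each class.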
Related Work
Model Stealing
Model stealing attacks mainly follow a query-predict-retrain procedure. Using the API exposed by an MLaaS service, the attacker submits a query input x and obtains the prediction y. Depending on the service, the prediction may be returned as a probability vector (the confidence that x belongs to each class) or only as a label (e.g., top-1). The attacker then uses the collected input-output pairs to train a substitute model that is functionally similar to the MUA. Afterwards, whenever inference is needed, the attacker simply uses the substitute model and no longer pays for further queries.
In general, the attacker has no access to the architecture or hyperparameters of the MUA, but she can obtain a public dataset whose distribution is similar to that of the training data and select suitable samples from it to query the API. In model stealing attacks that threaten model privacy, the attacker typically issues an acceptable (usually small) number of API queries on such public data and uses the predictions returned by the MUA to build a "transfer dataset" for training the substitute (stolen) model, thereby compromising the MUA's privacy and undermining its business value. Notably, beyond enabling free local inference in place of paid queries, the stolen model can also be used to generate transferable adversarial examples that fool the MUA [12], [13].
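As a point of reference, the sketch below illustrates the query-predict-retrain pipeline described above; `victim_api`, `public_loader`, and the distillation loss are hypothetical stand-ins for the paid API, the public dataset, and the attacker's training recipe.

```python
# Hedged sketch of a confidence-based model stealing attack (illustrative only).
import torch
import torch.nn.functional as F

def steal_model(victim_api, public_loader, substitute, epochs=10, lr=1e-3):
    """Query the victim on public data, collect (input, confidence-vector) pairs,
    and distill them into a substitute model."""
    # 1. Build the "transfer dataset" from API responses.
    transfer_set = []
    for x, _ in public_loader:                  # labels of the public data are ignored
        with torch.no_grad():
            soft_labels = victim_api(x)         # confidence vectors returned by the MLaaS API
        transfer_set.append((x, soft_labels))

    # 2. Retrain the substitute to imitate the victim's outputs (soft-label distillation).
    optimizer = torch.optim.Adam(substitute.parameters(), lr=lr)
    for _ in range(epochs):
        for x, soft_labels in transfer_set:
            log_probs = F.log_softmax(substitute(x), dim=1)
            loss = F.kl_div(log_probs, soft_labels, reduction="batchmean")
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return substitute
```

APMSA targets step 1: when every returned confidence vector within a class looks nearly the same and sits near the decision boundary, the transfer dataset carries far less information about the MUA, and the distilled substitute degrades accordingly.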
model under attack (MUA): the model targeted by the stealing attack.
[12] N. Papernot et al., "Practical black-box attacks against machine learning," in Proceedings of AsiaCCS, Apr. 2017, pp. 506–519.
[13] X. Liu et al., "ATMPA: A framework for attacking machine learning-based malware visualization detection methods using adversarial examples," in Proceedings of IWQoS, Jun. 2019, pp. 1–10.
Defense Strategies

Method

Paper Link
APMSA: Adversarial Perturbation Against Model Stealing Attacks
