A poisoning attack works mainly by inserting carefully crafted anomalous samples into the training data, distorting the original training-data distribution so that the model misclassifies (or mis-clusters) under certain conditions [1].
Because data poisoning requires the attacker to reach the training data, it is most effective against online-learning scenarios (where the model is continuously updated from newly arriving data) or systems that are periodically retrained. Typical targets include recommender systems, adaptive biometric systems, and spam-detection systems.
A data poisoning attack can be divided into three stages (assuming the attacker knows neither the victim's training set nor its model parameters); a minimal code sketch covering the three stages is given below.
1. Based on the victim model's input/output behaviour, choose a substitute training set and train a substitute model that produces the same outputs for the same inputs, e.g. a cat-vs-dog classifier.
2. Initialize a set of malicious samples (from any source) and update them with gradients (constructing a loss function as needed, e.g. a gradient-ascent strategy) until the desired effect is reached.
3. Inject the malicious samples into the victim model's training set.
(The original post illustrates this workflow with a diagram, not reproduced here.)
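The following is a minimal PyTorch-style sketch of the three stages, not the exact method of the cited papers. All names (`SubstituteNet`, `craft_poison`, the loss choice, and the input range [0, 1]) are illustrative assumptions; in particular, stage 2 here simply runs gradient ascent on the substitute model's loss over the malicious samples, which is a simplified stand-in for the attacker's real objective.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SubstituteNet(nn.Module):
    """Stage 1: a substitute model trained to mimic the victim's input/output
    behaviour (the supervised training loop on the substitute set is omitted)."""
    def __init__(self, in_dim=784, n_classes=2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 128), nn.ReLU(), nn.Linear(128, n_classes)
        )

    def forward(self, x):
        return self.net(x)

def craft_poison(substitute, x_seed, y_label, steps=100, lr=0.1):
    """Stage 2: update the seed samples by gradient ascent so that the substitute's
    loss on the chosen labels grows, making the samples maximally misleading."""
    x_poison = x_seed.clone().detach().requires_grad_(True)
    for _ in range(steps):
        loss = F.cross_entropy(substitute(x_poison), y_label)
        grad, = torch.autograd.grad(loss, x_poison)
        with torch.no_grad():
            x_poison += lr * grad.sign()   # gradient-ascent step on the loss
            x_poison.clamp_(0.0, 1.0)      # keep samples in a valid input range
    return x_poison.detach()

# Stage 3: append the crafted (x_poison, y_label) pairs to the victim's training
# data, e.g. through the feedback channel of an online-learning system, so that
# the next retraining round absorbs the poisoned distribution.
```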
On the comparison between poisoning attacks and adversarial examples, [3] states:
1. An exploratory attack exploits the misclassification of models without affecting the training process. A typical exploratory attack is the adversarial example attack. An adversarial example attack is a deformed sample, with some disturbance added to the original sample, to make the model liable to be misinterpreted, and a person cannot identify the disturbance. This attack's characteristics do not affect the training data of the DNN.
2. A causative attack degrades the accuracy of the model by approaching the training process of the model. A poisoning attack [9,26,27] is a causative attack method that reduces the accuracy of a model by adding malicious data between the processes during the training of the model. There is a strong possibility that this attack will have access to the training process of the model, but it has the advantage of effectively reducing the accuracy of the model.
Summarizing these two passages: the distinction between an adversarial example attack and a poisoning attack is not whether gradient updates are used, but whether the attack takes effect at inference time or at training time.
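To make the contrast concrete, here is a minimal FGSM-style sketch of an adversarial example attack, assuming an already-trained PyTorch model; the function name `fgsm_example` and the parameter `epsilon` are illustrative. Unlike the poisoning sketch above, the gradient step is applied to a single input at prediction time and the training data are never touched.

```python
import torch
import torch.nn.functional as F

def fgsm_example(model, x, y, epsilon=0.03):
    """Exploratory attack: craft an adversarial input against a trained model."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    grad, = torch.autograd.grad(loss, x_adv)
    # One gradient-sign step bounded by epsilon, so the disturbance is hard to notice.
    return (x_adv + epsilon * grad.sign()).clamp(0.0, 1.0).detach()
```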
References
[1] 人工智能安全标准化白皮书 (Artificial Intelligence Security Standardization White Paper).
[2] 华为AI安全白皮书 (Huawei AI Security White Paper).
[3] Kwon H., Yoon H., Park K. W. Selective Poisoning Attack on Deep Neural Networks. Symmetry, 2019, 11(7): 892.