
A Deep Learning-Based Network Intrusion Detection Method


Author: Zheng Li

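Abstract: Network intrusion detection is an important technology in the national cyberspace security strategy and has become a research hotspot in network security in recent years. Building effective, intelligent intrusion detection methods on advanced deep learning algorithms matters for defending against diverse intrusions in complex network environments. However, most existing detection methods do not fully use the information contained in raw network traffic: information is lost during feature extraction, and the resulting feature dimensions are incomplete. To address these problems, this paper exploits the spatio-temporal features of network traffic and proposes CR-GTCN, a hierarchical intrusion detection model that combines a CNN, an RNN, a Gated TCN, and Self-Attention. A one-dimensional Convolutional Neural Network (CNN) extracts spatial features, that is, local patterns in the input such as edge, texture, shape, and orientation cues. A Gated TCN and a Recurrent Neural Network (RNN) extract temporal features: the RNN processes the elements of a sequence in time order, keeps a memory of past inputs in its internal state, and models short-range dependencies well. The features extracted by the CNN, the Gated TCN, and the RNN are concatenated to form fused features. So that useful input information is expressed more precisely, Self-Attention assigns different weights to the fused features, which lets the model extract both the spatial and the temporal features of the data. The model is evaluated on the UNSW-NB15 dataset. The experimental results show that CR-GTCN integrates feature selection with the deep learning model, detects the different categories of network intrusion, and outperforms the baseline network intrusion detection methods.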

Keywords: Intrusion Detection; Convolutional Neural Network; Gated TCN; Recurrent Neural Network; Self-Attention

1 Introduction

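The spread of information technology and the growth of cyberspace have driven economic and social development, but they have also brought new security risks and challenges [1]. As computer networks and communication technology keep developing, cyberspace faces a growing range of security threats, network intrusion among them [2]. Network intrusion detection systems (NIDS) are a key defence mechanism: they monitor traffic in real time and identify abnormal behaviour by analysing the characteristics of traffic data, so that defensive measures can be taken in time [3]. Meeting current security requirements means detecting intrusions quickly and also identifying the attack type, so that a targeted response can be chosen, while the volume of data in big-data environments makes both tasks increasingly hard [4].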

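Traditional network intrusion detection falls into misuse detection and anomaly detection [5]. Misuse detection identifies likely attacks by matching traffic against rules or feature patterns; anomaly detection uses statistical or pattern-recognition techniques to decide whether the data deviate from normal behaviour. Both suffer from low detection rates and high false-alarm rates. Researchers have therefore turned to machine learning and deep neural networks to improve intrusion detection [6]. Early work relied mainly on conventional machine learning with a single model, whose accuracy is limited. Classifiers used for the task include k-nearest neighbours (k-NN) [7], naive Bayes [8], decision trees [9], and random forests [10], as well as deep models such as the multilayer perceptron (MLP) [11], the convolutional neural network (CNN) [12], the recurrent neural network (RNN) [12], and long short-term memory (LSTM) [12]. The traditional techniques are simple and need little training, but their accuracy is relatively low; against complex threats, deep learning methods usually achieve markedly higher accuracy.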

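Intrusion detection systems based on machine learning and deep learning concentrate on optimising model performance rather than on analysing the semantics of network data. In a typical deployment, the system monitors network traffic and inspects hosts to identify potential threats; it is attached at network nodes such as hubs or switches so that it can track day-to-day network activity. For anomaly detection, semantic techniques can reveal the presence of potentially malicious scripts. To improve training, network data can be analysed semantically to extract key features before they are fed to a deep learning architecture [13]. At this level, detailed attack information can be combined with other attributes such as host identifiers, attack names, and reference logs. A single feature gives only a limited view of the threat and usually targets one class of attack, and different features vary considerably in how well they expose anomalous activity. Existing systems therefore fuse several feature types. Attackers, however, use accumulated knowledge of these features to design attack variants that evade existing mechanisms, so a fixed feature set cannot cover every threat. Developing new features and feature combinations is thus an important way to strengthen detection.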

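To address this problem, we propose a new network intrusion detection method, CR-GTCN, which combines a CNN, an RNN, a Gated TCN, and a Self-Attention mechanism. A one-dimensional convolutional neural network (CNN) extracts spatial features, and a gated temporal convolutional network (Gated TCN) together with a recurrent neural network (RNN) captures the temporal behaviour of the traffic. The two kinds of representation are combined into a fused representation. Self-Attention then assigns weights to the fused features, which lets the model pick out the key information in sequences of different lengths and select the most discriminative features.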

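The contributions of this paper are as follows:

1. We build a new intrusion detection model, CR-GTCN, that analyses the distinct characteristics of each component of raw network traffic and learns its spatial and temporal features hierarchically.

2. We develop several modules (CNN, RNN, and Gated TCN) to extract network traffic features and integrate them so that feature information is not lost.

3. We develop a self-attention module that assigns different weights to the fused features and performs a second round of feature extraction to select the key information.

4. We evaluate the model on the UNSW-NB15 dataset; the experiments show that it has strong detection capability.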

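2 Related works

Network intrusion detection is essentially a supervised anomaly detection problem and also a classification problem. In recent years researchers have proposed many approaches, ranging from traditional statistical techniques to deep learning, that extract distinctive feature patterns and separate attack traffic from normal traffic.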

2.1 Semantic-based methods

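Enterprise systems generate large volumes of logs that record runtime state and active network traffic. Traditional intrusion detection parses PCAP files to process unstructured network data. Such data contain useful information about specific attacks, such as the attack name, references, and host information. Semantic tools can preprocess network data to extract this information and remove noise, which lightens the load on the intrusion detection system. Seyyar et al. [13] proposed a web intrusion detection model that separates normal from anomalous requests; in the URL analysis stage it uses a BERT Transformer model combined with a CNN to classify attacks. To bridge the semantic gap, Li et al. [15] used word2vec with TF-IDF weighting to map HTTP traffic to low-dimensional feature vectors. Huang et al. [16] proposed HitAnomaly, which uses BERT-style encoders to model log template sequences and parameter values for anomaly detection, with separate encoders for the log sequence and the parameters. Min et al. [17] proposed an intrusion detection system based on statistical and payload features, combining word embeddings with a text CNN to extract key features from the payload; the method achieved the best detection accuracy in their evaluation.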

2.2 Feature Selection-Based Methods

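Feature selection can serve as a preprocessing step in machine learning: it lowers the computational cost and removes redundant features while maintaining or even improving intrusion detection performance. To achieve real-time, lightweight detection, researchers have proposed a range of feature selection methods. Jia et al. [18] used information gain to select the ten most relevant network features, then used an LSTM to identify the network traffic and a CNN to classify it. Shafiq et al. [19] filtered the feature set with a wrapper technique and used the area under the curve (AUC) metric to select network features; Decision Tree (DT) and Support Vector Machine (SVM) classifiers were then trained on the selected features. Ding et al. [20] proposed a hybrid model that reduces the number of network features through cluster analysis: a real-coded chemical reaction optimization algorithm finds the best cluster centres, a fuzzy means method selects the best feature set, and SVM and other algorithms perform detection on that set. Injadat et al. [21] also used information gain, reducing the feature set by almost 60%, and detected intrusions with Random Forest (RF). Aslahi-Shahri et al. [22] combined SVM with a genetic algorithm to reduce the 41 features of the KDD CUP99 dataset to 10, with a very low false-alarm rate in their experiments. Alazzam et al. [23] used the Pigeon Inspired Optimizer (PIO) to reduce the original feature sets of KDD CUP99, NSL-KDD, and UNSW-NB15 to 7, 5, and 5 features respectively, while keeping a high true positive rate and markedly shortening the time needed to build the decision tree. To find the best feature subset for intrusion detection, Khammassi et al. [24] used logistic regression as the selection criterion within a trained model and reported high detection accuracy.

Although feature selection effectively reduces dimensionality, its usefulness depends heavily on the quality of the semantic information in the extracted features. It can give good results, but it may be biased on different kinds of data, particularly when the class distribution is imbalanced, so it may not reach the best performance on every dataset.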

2.3 Machine Learning-Based Methods

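Machine learning has achieved notable results in data analysis, network security, and artificial intelligence, and many machine learning methods have been developed for network intrusion detection. Gao et al. [25] proposed a scheme that combines a long short-term memory network (LSTM) with a feedforward neural network (FNN): the LSTM extracts latent features from time-series data, and the FNN maps those features to a detection decision. Marteau [26] proposed a detection model based on a forest of binary trees that identifies both point anomalies and collective anomalies. Li et al. [27] developed a two-layer deep learning architecture based on a convolutional neural network (CNN) that learns deep features from normal data and flags an attack when predictions deviate from observed values.

Machine learning methods nevertheless face limitations that grow as attacks become more complex. Zhou et al. [28] introduced a Siamese CNN based on few-shot learning, which alleviates overfitting, and built a pre-trained model on a transformed low-dimensional feature representation. Xu et al. [29] proposed an intrusion detection system based on the log-cosh conditional variational autoencoder (LCVAE), which keeps the advantages of the conventional conditional variational autoencoder, uses a log-cosh loss, and combines it with a CNN for detection. Li et al. [30] proposed a joint deep architecture that fuses a multilayer perceptron (MLP), an RNN, and a CNN. Teng et al. [31] built an adaptive, collaborative detection scheme on SVM and decision trees (DT). Gu et al. [32] transformed network traffic features with naive Bayes and trained an SVM on the result. Jing et al. [33] combined SVM with a nonlinear kernel. Raman proposed SVM-based hypergraph feature selection. Aburomman et al. [35] proposed a weighted one-against-one SVM to recognise multiple attack types. Ponmalar et al. [36] designed an SVM ensemble driven by a chaos game optimization algorithm. Rashid et al. [37] proposed a tree-based stacking ensemble that integrates DT, RF, and extreme gradient boosting. Overall, these machine learning methods are general and extensible.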

2.4 Deep learning methods

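In recent years many effective deep learning models, such as deep belief networks (DBN), deep CNNs, RNNs, and deep generative networks, have been applied to network intrusion detection [38]. Shone et al. [6] proposed a deep learning classification model built from stacked non-symmetric deep autoencoders that correctly detects network intrusion categories. Hassan et al. [40] proposed an efficient method for big-data environments that combines a deep CNN with a weight-dropped LSTM to extract meaningful features from traffic data and improve detection accuracy. Khan et al. [41] used a CNN and an RNN to capture local and temporal features respectively, in a hybrid deep learning framework for predicting and classifying malicious attacks. Cao et al. [42] proposed a model that fuses a CNN and a GRU to address low classification accuracy and class imbalance. Kasongo [43] implemented an intrusion detection framework with several RNN types, including LSTM, GRU, and simple RNN. Vinayakumar et al. [44] proposed a hybrid method that uses distributed deep learning and a deep neural network (DNN) with five hidden layers to process and analyse large volumes of data in real time. Jiang et al. [45] proposed a multi-channel method based on LSTM-RNN that mines attack-specific information and exposes attack behaviour as sequence data. In the approach of Shone et al. [6], which pairs RF with non-symmetric deep autoencoders, the autoencoders are stacked and the output of the final encoding layer feeds a softmax classifier. These deep learning methods usually reach high detection accuracy, but the algorithms are complex, training needs many iterations and considerable computing resources, and the training process is generally slow.

Khediri [47] used a back-propagation deep neural network trained on the CSE-CICIDS2018 dataset, but did not consider the time spent in training and validation. Thakkar et al. [48] proposed a deep neural network detector trained on NSL-KDD, UNSW-NB-15, and CIC-IDS-2017; its training takes too long. El-Ghamry et al. [49] introduced an optimised CNN architecture (VGG16-PSO) for intrusion detection, but did not report the time required for training and validation. Wu et al. [50] proposed a deep neural network detector validated only on CSECICIDS2018, with no assessment of the time it would need in practice. Hnamte et al. [51] proposed a detector based on a long short-term memory autoencoder (LSTM-AE), trained and tested on two public datasets; its long computation time limits its use against modern, complex attacks. Dong et al. [52] proposed a semi-supervised double deep Q-network (SSDDQN) that uses NSL-KDD and AWID as its datasets; because both are old and limited in size, the method may not adapt to today's fast-changing threats. Pelletier et al. [53] studied the combination of recurrent neural networks and temporal convolutional neural networks (TempCNN) for time-series classification.

Current research has limitations. For example, studies that adopt deeper CNNs did not apply L1 or L2 regularization to the model. This, however, is not the decisive factor for performance; model structure matters more. Earlier work tuned CNN and DNN structures to optimise the architecture, but future work should enlarge the search space and design fitness functions that account for computational efficiency.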

3 Proposed CR-GTCN approach for intrusion detection

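A CNN is designed to attend to spatial information, whereas an RNN is designed to capture change over time. Among existing work, HAST-IDS simply connects the two in series, as shown in Figure 1. As learning proceeds through the stacked CNN layers, the extracted information becomes increasingly spatial, and the temporal features of the sequence may be lost, which hampers the learning of the RNN (including LSTM) that follows.

Figure 1. The HAST-IDS model.

This paper integrates CNN, Gated TCN, and RNN sub-networks and adds a Self-Attention mechanism. Because a CNN can extract high-level features from large amounts of data, the CNN is placed before the Gated TCN and RNN, and its output still retains temporal information. The Gated TCN and RNN then capture that temporal information. Self-Attention generates connection weights dynamically and can therefore handle input sequences of varying length. As the data move from one stage to the next, the learning granularity becomes finer. In this way the components are optimised jointly while each module remains independent.

Figure 2. The CR-GTCN model.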

3.1 Convolutional Neural Network (CNN)

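A convolutional neural network is built on two core operations, convolution and pooling. Convolution applies a set of filters to the input and produces feature maps that reflect the input's features; the feature maps pass through an activation function before further computation. Pooling down-samples the output, which reduces the spatial dimension, filters out redundant information, and makes the network more robust to noise and interference. The network learns the filter parameters automatically and extracts progressively higher-level features. Since the data are usually flattened into a one-dimensional vector, the one-dimensional convolution is:

$$(f * g)(i) = \sum_{j=1}^{m} g(j)\, f\!\left(i - j + \tfrac{m}{2}\right) \tag{1}$$

where $i$ is the position of a value in the sequence data, $f$ is a filter of size $m$, and $g(j)$ is the value of the function $g$ at $j$. ReLU is chosen as the activation function.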
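As a rough illustration of a one-dimensional convolution followed by ReLU, the sketch below slides a filter over a flattened record. It is plain Python and makes two assumptions that are not taken from the paper: outputs are computed only where the filter fits entirely (no padding), and the filter is applied without flipping, as most deep learning libraries do. The names `conv1d_relu`, `signal`, and `kernel` are illustrative.

```python
def conv1d_relu(signal, kernel):
    """Slide `kernel` over `signal` (no padding) and apply ReLU to each response."""
    m = len(kernel)
    out = []
    for i in range(len(signal) - m + 1):
        # Weighted sum of the m inputs currently under the filter.
        response = sum(kernel[j] * signal[i + j] for j in range(m))
        out.append(max(0.0, response))  # ReLU keeps only positive responses
    return out


# A difference filter responds where the sequence drops from one value to the next.
feature_map = conv1d_relu([1.0, 2.0, 0.0, -1.0, 3.0], [1.0, -1.0])
```

Negative responses are clipped to zero, so the feature map marks only the positions where the filter's pattern is present.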

3.2 Batch Normalization

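One potential problem is that the range of the inputs to each layer shifts during training, a phenomenon known as internal covariate shift. It couples the learning of the layers, can destabilise the model, and forces the optimiser to use a lower learning rate, which slows convergence. Batch normalization addresses this; here it is applied between the CNN module and the RNN.

$$\hat{x}_i = \frac{x_i - \mu_B}{\sqrt{\sigma_B^2 + \epsilon}} \tag{2}$$

where $x_i$ is a value in the input batch, $\mu_B$ and $\sigma_B^2$ are the batch mean and variance, and $\epsilon$ is a negligibly small value that keeps the denominator non-zero.

From the normalized value $\hat{x}_i$, the output $y_i$ is obtained as in Eq. (3), where $\gamma$ and $\beta$ are learned during training to give better results.

$$y_i = \gamma\, \hat{x}_i + \beta \tag{3}$$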
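A minimal sketch of batch normalization on a batch of scalar activations may make the two steps concrete: normalize with the batch statistics, then scale and shift. It is plain Python for illustration; the function name and the default values of `gamma`, `beta`, and `eps` are assumptions, and in a real network `gamma` and `beta` would be learned rather than fixed.

```python
import math


def batch_norm(batch, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize a batch of scalars with its own mean and variance, then scale and shift."""
    mean = sum(batch) / len(batch)
    var = sum((x - mean) ** 2 for x in batch) / len(batch)  # biased batch variance
    normalized = [(x - mean) / math.sqrt(var + eps) for x in batch]
    return [gamma * x_hat + beta for x_hat in normalized]


out = batch_norm([2.0, 4.0, 6.0, 8.0])
```

With the default `gamma` and `beta`, the output has approximately zero mean and unit variance regardless of the scale of the input batch.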

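3.3 Gated Temporal Convolutional Network (Gated TCN)

Gating mechanisms play a key role in recurrent neural networks, and they are equally effective at controlling the information flow through the layers of a temporal convolutional network (TCN). Given an input $\mathcal{X}$, a Gated TCN takes the form:

$$h = g(\Theta_1 \star \mathcal{X} + b) \odot \sigma(\Theta_2 \star \mathcal{X} + c) \tag{4}$$

where $\Theta_1$, $\Theta_2$, $b$, and $c$ are model parameters, $\odot$ is the element-wise product, $g(\cdot)$ is the activation function of the output, and $\sigma(\cdot)$ is the sigmoid function, which determines the ratio of information passed to the next layer.

This paper uses a dilated causal convolution network as the TCN to capture features. A dilated causal convolution network enlarges the receptive field by adding layers; it supports parallel computation and alleviates the vanishing-gradient problem. As a variant of 1D convolution, causal convolution pads the input sequence with zeros to preserve temporal causality (each output depends only on earlier inputs). Dilation inserts gaps between kernel elements, that is, it skips input values, which widens the receptive field without increasing the kernel size or the computational cost, so the network can capture longer-range temporal dependencies (as illustrated in Figure 3). Mathematically, given a one-dimensional input sequence $x$ and a filter $f$, the dilated causal convolution of $x$ with $f$ at step $t$ is given by Eq. (5):

$$x \star f(t) = \sum_{s=0}^{K-1} f(s)\, x(t - d \times s) \tag{5}$$

where $K$ is the length of the filter and $d$ is the dilation factor, which controls the skipping distance.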
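The dilated causal convolution and the gating step can be sketched in a few lines of plain Python. This is an illustration under stated assumptions rather than the paper's implementation: the input is a single-channel sequence, inputs before the start of the sequence are treated as zero, and `tanh` is assumed as the output activation; the function and parameter names are hypothetical.

```python
import math


def dilated_causal_conv(x, f, d):
    """y[t] = sum over s of f[s] * x[t - d*s], with x taken as zero before t = 0."""
    y = []
    for t in range(len(x)):
        total = 0.0
        for s, weight in enumerate(f):
            idx = t - d * s
            if idx >= 0:  # implicit left zero-padding keeps the output causal
                total += weight * x[idx]
        y.append(total)
    return y


def gated_tcn(x, f_filter, f_gate, d, b=0.0, c=0.0):
    """tanh(filter branch) multiplied element-wise by sigmoid(gate branch)."""
    a = dilated_causal_conv(x, f_filter, d)
    g = dilated_causal_conv(x, f_gate, d)
    return [math.tanh(ai + b) / (1.0 + math.exp(-(gi + c))) for ai, gi in zip(a, g)]


x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = dilated_causal_conv(x, [1.0, 1.0], d=2)         # each output sums x[t] and x[t-2]
h = gated_tcn(x, [0.5, -0.5], [1.0, 1.0], d=2)
```

With a two-tap filter and `d=2`, each output looks two steps back instead of one, which is how stacking layers with growing dilation widens the receptive field without enlarging the filter.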

3.4 Long Short-Term Memory (LSTM)

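Unlike a CNN, which learns from one data record at a time, an RNN feeds what it learned from the previous step back into the current step and so relates the records to one another. This lets it capture the temporal features of the input. In a plain RNN, however, the errors accumulated through this feedback over long-term dependencies can become large enough to degrade learning. The long short-term memory network (LSTM) mitigates the problem with gating. An LSTM has four components: an input gate, a forget gate, an output gate, and a memory cell that stores long-term information. Working together through their gating functions, they suppress short-term disturbances while preserving long-term features.

The memory holds a state $S(t)$, a vector of fixed size, that is fed back into the current learning step; it is computed from the current input, the previous output, the bias, and the weight matrices. Two kinds of control gate, the forget gate and the input gate, each realised by a sub-network, determine how much of the feedback $S(t)$ comes from previous learning and how much from the current output.

The LSTM learns by adjusting the weights and bias values in these networks so that the output captures the temporal features among the input data.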
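For orientation, the sketch below implements one step of the widely used textbook LSTM cell for scalar inputs. It is not claimed to match the paper's exact notation or sub-network layout; the gate names, the `(w_x, w_h, bias)` parameter layout, and the example weights are assumptions made for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, p):
    """One step of a scalar LSTM cell; `p` maps each gate name to (w_x, w_h, bias)."""
    def pre(name):
        w_x, w_h, bias = p[name]
        return w_x * x + w_h * h_prev + bias

    f = sigmoid(pre("forget"))     # how much of the old cell state to keep
    i = sigmoid(pre("input"))      # how much of the candidate to write
    o = sigmoid(pre("output"))     # how much of the cell state to expose
    cand = math.tanh(pre("cell"))  # candidate content from the current input
    c = f * c_prev + i * cand      # updated memory
    h = o * math.tanh(c)           # new output
    return h, c


params = {name: (0.5, 0.5, 0.0) for name in ("forget", "input", "output", "cell")}
h, c = 0.0, 0.0
for x in [1.0, 0.5, -0.5]:
    h, c = lstm_step(x, h, c, params)
```

When the forget gate saturates near 1 and the input gate near 0, the cell state passes through unchanged, which is the mechanism that lets the network carry information across long gaps.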

3.5 Self-Attention(SA)

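Combining the CNN, Gated TCN, and LSTM provides both feature extraction and sequence modelling. An attention mechanism then selects the more important parts of a large volume of data, which helps improve detection accuracy and reduces the influence of external interference. Self-attention adjusts the connection weights dynamically to decide how much attention each connection receives, and it can be added as a layer in a neural network. It is expressed as:

$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{QK^{T}}{\sqrt{d_k}}\right)V \tag{9}$$

where $Q$ is the matrix of query vectors, $K$ is the matrix of key vectors, and $V$ is the matrix of value vectors; $K^{T}$ is the transpose of $K$, and $\sqrt{d_k}$ is the square root of the dimension $d_k$ of the query and key vectors. The dot product of $Q$ and $K^{T}$ is computed first and divided by $\sqrt{d_k}$ to keep the result from growing too large; Softmax converts the scaled scores into a probability distribution, which is then multiplied by $V$.

Each value vector thus receives a weight that is assigned dynamically from the correlation between a query vector and the key vectors. The mechanism attends adaptively to the input sequence, models dependencies between different positions, and supports long input sequences. In practice it is used as a module in a neural network, interleaved with other layers, to improve learning on complex sequence data.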
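Scaled dot-product self-attention is short enough to write out directly. The sketch below uses plain Python lists as matrices and, as a simplifying assumption, feeds the same matrix in as queries, keys, and values; a trained model would first apply learned projections, which are omitted here. All names are illustrative.

```python
import math


def softmax(row):
    m = max(row)  # subtract the maximum for numerical stability
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(Q, K, V):
    """softmax(Q K^T / sqrt(d_k)) V for matrices given as lists of rows."""
    d_k = len(K[0])
    out, weights = [], []
    for q in Q:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in K]
        w = softmax(scores)  # one probability distribution per query
        weights.append(w)
        out.append([sum(wj * v[col] for wj, v in zip(w, V)) for col in range(len(V[0]))])
    return out, weights


X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
out, weights = self_attention(X, X, X)
```

Each row of `weights` sums to 1, and positions whose keys align with the query receive the larger share; the code works for any number of rows, which is why the mechanism handles sequences of different lengths.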

3.6 Output

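The learning granularity changes from one CNN+RNN level to the next.

The convolutional layer identifies local patterns and key features in packets. In network traffic analysis, for example, it can recognise specific byte sequences, regular distributions across protocol fields, or short-term correlations between packets, all of which are spatial features. Convolution over a packet sequence can also extract some temporal information, such as recurring patterns or evolving trends.

Global average pooling aggregates spatial information by averaging each feature map produced by the convolution. For example, the convolutional network may produce several two-dimensional feature maps; each is reduced to a single value, the sum of its elements divided by their number, that summarises the map. The per-channel values are then combined into a feature vector that is the input to the subsequent fully connected network.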
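Global average pooling reduces each feature map to one number. A minimal plain-Python sketch, assuming each feature map is a list of rows and using illustrative names:

```python
def global_average_pool(feature_maps):
    """Reduce each 2-D feature map (a list of rows) to the mean of its elements."""
    pooled = []
    for fmap in feature_maps:
        values = [v for row in fmap for v in row]
        pooled.append(sum(values) / len(values))
    return pooled


maps = [
    [[1.0, 3.0], [5.0, 7.0]],   # channel 0
    [[2.0, 2.0], [2.0, 2.0]],   # channel 1
]
pooled = global_average_pool(maps)  # one value per channel
```

The result has one entry per channel, so its length is independent of the size of the feature maps; this vector is what a fully connected layer would consume.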

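4 Datasets

The evaluation of a neural network design depends closely on the dataset used. Many datasets collected for NID contain large amounts of redundant data [54], which makes evaluation results unreliable [55]. To ensure a valid evaluation, this paper uses UNSW-NB15, a dataset without redundant records.

UNSW-NB15 [56] is a well-known network security benchmark created by the Australian Centre for Cyber Security (ACCS) in 2015. Its data were collected from three real-world websites and then emulated in a laboratory environment; after standardised processing, the result is a benchmark dataset that covers a range of typical network attack behaviours.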

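5 Experiment

5.1 Data Preprocessing

The UNSW-NB15 dataset is split 7:3: 70% is used as the training set for the intrusion detection model, and the remaining 30% is used as the test set to evaluate the trained model.

This ratio performed well in our tests compared with other common split ratios. The dataset is evaluated with both binary and multi-class classification. Binary classification has two outcomes, normal or attack. Multi-class classification distinguishes normal traffic from the individual attack categories.

(1) Encoding: Neural networks require numeric input. The raw data contain several categorical attributes that a learning algorithm cannot process directly and that must be encoded as numbers. We use the 'get_dummies' function of the Pandas library for this encoding.

(2) Normalization: The input features may have different means and non-unit standard deviations, which can affect how quickly and how well the model converges. We standardize the input so that each feature has zero mean and unit standard deviation. This preprocessing step is common across machine learning classifiers.

(3) Stratified K-fold cross-validation: UNSW-NB15 contains 257,673 samples. To obtain sufficient, non-overlapping samples for training and validation, we use stratified K-fold cross-validation, a widely used technique in machine learning. The dataset is divided into k disjoint subsets that preserve the class proportions; k-1 of them form the training set and the remaining one is held out for testing, and each subset is held out once in turn.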
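The three preprocessing steps can be sketched without any third-party library. The code below is a plain-Python stand-in written for illustration: `one_hot` mimics the effect of one-hot encoding (the paper itself uses Pandas' `get_dummies`), `standardize` rescales one numeric column, and `stratified_k_fold` deals the samples of each class round-robin across folds without shuffling. The function names and the toy values are assumptions, not data from UNSW-NB15.

```python
import math
from collections import defaultdict


def one_hot(values):
    """Map each categorical value to a 0/1 indicator vector (sorted category order)."""
    categories = sorted(set(values))
    return [[1 if v == c else 0 for c in categories] for v in values], categories


def standardize(column):
    """Rescale a numeric column to zero mean and unit standard deviation."""
    mean = sum(column) / len(column)
    std = math.sqrt(sum((x - mean) ** 2 for x in column) / len(column))
    return [(x - mean) / std if std > 0 else 0.0 for x in column]


def stratified_k_fold(labels, k):
    """Yield (train_idx, test_idx) pairs with each class spread evenly over k folds."""
    folds = [[] for _ in range(k)]
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    for indices in by_class.values():
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)  # deal one class round-robin across folds
    for i in range(k):
        test = sorted(folds[i])
        train = sorted(idx for j, fold in enumerate(folds) if j != i for idx in fold)
        yield train, test


encoded, cats = one_hot(["tcp", "udp", "tcp", "icmp"])
scaled = standardize([10.0, 20.0, 30.0])
splits = list(stratified_k_fold(["normal"] * 6 + ["attack"] * 3, k=3))
```

Each of the three folds here holds two normal and one attack sample, so every test fold keeps the 2:1 class ratio of the full toy set. In practice the statistics used for standardization would be computed on the training folds only and reused on the held-out fold.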

5.2 Evaluation Metrics

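This paper evaluates CR-GTCN with accuracy (ACC), detection rate (DR), and false positive rate (FPR). ACC measures how well CR-GTCN classifies both attack and normal traffic. DR measures how well it identifies attack traffic. A high DR is of limited value if the FPR is also high, so DR should be read together with the other metrics. The metrics are defined as follows:

$$\mathrm{ACC} = \frac{TP + TN}{TP + TN + FP + FN} \tag{10}$$

$$\mathrm{DR} = \frac{TP}{TP + FN} \tag{11}$$

$$\mathrm{FPR} = \frac{FP}{FP + TN} \tag{12}$$

$$\mathrm{Precision} = \frac{TP}{TP + FP} \tag{13}$$

$$\mathrm{F1} = \frac{2 \times \mathrm{Precision} \times \mathrm{DR}}{\mathrm{Precision} + \mathrm{DR}} \tag{14}$$

where $TP$ is the number of correctly classified attacks, $TN$ is the number of correctly classified normal records, $FP$ is the number of normal records misclassified as attacks, and $FN$ is the number of attacks misclassified as normal traffic.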
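The three named metrics follow directly from the four confusion-matrix counts. The sketch below computes them for binary labels in plain Python; it assumes the label `1` marks an attack, and the function name and the toy labels are illustrative, not results from the paper.

```python
def detection_metrics(y_true, y_pred, attack=1):
    """Return ACC, DR (recall on attacks), and FPR from binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == attack and p == attack)
    tn = sum(1 for t, p in pairs if t != attack and p != attack)
    fp = sum(1 for t, p in pairs if t != attack and p == attack)
    fn = sum(1 for t, p in pairs if t == attack and p != attack)
    return {
        "ACC": (tp + tn) / len(pairs),
        "DR": tp / (tp + fn) if tp + fn else 0.0,   # share of attacks that were caught
        "FPR": fp / (fp + tn) if fp + tn else 0.0,  # share of normal traffic flagged
    }


m = detection_metrics(
    y_true=[1, 1, 1, 0, 0, 0, 0, 1],
    y_pred=[1, 1, 0, 0, 0, 1, 0, 1],
)
```

In this toy case three of four attacks are caught and one of four normal records is flagged, which shows why DR and FPR need to be reported together: a detector can raise DR simply by flagging more traffic.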

5.3 Analysis of Experimental Results

The performance of CR-GTCN is evaluated in two ways: (1) binary classification, in which each traffic record is judged to be either attack or normal traffic; and (2) multi-class classification, in which each record is assigned to one of the 10 traffic classes in UNSW-NB15 (normal plus nine attack types).

  1. Binary Classification

Table 1 shows the binary classification results of CR-GTCN. The accuracy of the compared methods ranges from 55% to 99.8%. CR-GTCN gives the best detection performance, reaching 91.5%, 91.6%, and 99.7% on precision, recall, and completeness respectively.

  2. Multi-class Classification

Table 2 shows the multi-class results. On UNSW-NB15, CR-GTCN reaches an accuracy of 90.5%, whereas the other algorithms range from 8.5% to 73.5%. Compared with existing multi-class methods such as Random Forest (RF) and Decision Tree (DT), CR-GTCN performs best, with an accuracy of 90.5% and an F1 score of 74.3%. The detailed results, including accuracy and F1 score, are shown in the figure.

  3. Ablation Experiment

To examine the effectiveness of CR-GTCN further, we ran ablation experiments; the results are given in Tables 3 and 4. Plain CNN and RNN architectures perform poorly, while CR-GTCN achieves markedly higher accuracy and F1 scores than either, which confirms the effectiveness of the full model.

To examine the results further, we visualize the binary classification of CR-GTCN on the UNSW-NB15 dataset, as shown in Figure 3.

Figure 3. Training loss and FPR.

Figure 4 shows the multi-class evaluation metrics (recall and loss) of CR-GTCN on UNSW-NB15, and Figure 5 shows the corresponding confusion matrix.

Figure 4. Training recall and loss.

Figure 5. Confusion matrix.

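6 Conclusion

This paper proposed CR-GTCN, a deep neural network architecture for large-scale network intrusion detection. It uses a CNN to extract the spatial features of traffic data and a gated temporal convolutional network (Gated TCN) and long short-term memory (LSTM) to learn temporal features. Because a CNN attends to spatial information and an RNN to temporal information, learning them separately can lose information; CR-GTCN therefore synchronises the two so that they learn from the input at the same granularity, which reduces the loss. Self-attention then weights the fused features so that useful information is expressed more fully, and batch normalization further improves performance. Experiments on the UNSW-NB15 benchmark show that CR-GTCN fuses the features from the CNN, LSTM, Gated TCN, and self-attention components effectively, raises validation accuracy, and lowers the false positive rate of network intrusion detection.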

REFERENCES

[1] M. Humayun, M. Niazi, N. Jhanjhi, M. Alshayeb, and S. Mahmood,‘‘Cyber security threats and vulnerabilities: A systematic mapping study,’’ Arabian J. Sci. Eng., vol. 45, no. 4, pp. 3171–3189, Apr. 2020.

A study of countermeasure practices for cyberspace security, published in the International Journal of Communication Systems.

[4] L. Yi et al., "A thorough methodological review of intrusion detection systems in fog computing environments," Transactions on Emerging Telecommunications Technologies, vol. 34, no. 1, pp. e4632, Jan 2023.]

T. Mehmood and H. B. M. Rais, Machine learning algorithms in the context of intrusion detection, in: 2016 3rd International Conference on Computer and Information Sciences, 2016.

A deep learning method for network traffic anomaly detection was proposed by the authors in IEEE transactions on emerging topics in computational intelligence, 2 (1), 2018, pp. 41–50.

In their research paper, Rao and Swathi introduced an efficient k-nearest neighbor classifier tailored for rapid deployment in network intrusion detection systems. This study was published in the Indian Journal of Science and Technology in 2017, where they explored innovative machine learning techniques for enhancing cybersecurity frameworks. The article is featured in volume 10, issue 14, covering pages 1 through 10.

An intrusion detection system for networks has been developed using a hidden Naïve Bayes multiclass classifier as outlined in the research paper by L.Koc et al published in Expert Systems with Applications journal volume 39 issue 18 year 2012 page range 13492 to 13500

S. Sahu and B. M. Mehtre presented a network-based intrusion detection system utilizing the j48 decision tree algorithm at the 2015 International Conference on Advances in Computing, Communications and Informatics (ICACCI), published by IEEE, covering pages 2023–2026.

[10] Y. Chang, W. Li, Z. Yang, Network intrusion detection based on random forest and support vector machine, in: 2017 IEEE international conference on computational science and engineering (CSE) and IEEE international conference on embedded and ubiquitous computing (EUC), Vol. 1, IEEE, 2017, pp. 635–638.

A highly efficient mlp-based network-based intrusion detection method named Mlp4nids has been proposed in this study

An ensemble intrusion detection method for train Ethernet consist networks based on CNN and RNN, IEEE Access 9 (2021) 59527–59539.

[14]

A research study employing a weighted-based word2vec paragraph vectors approach to identify anomalies within the context of internet traffic analysis was carried out in IEEE Access 8 (2020).

[16] S. Huang, Y. Liu, C. Fung, R. He, Y. Zhao, H. Yang, Z. Luan, Hitanomaly: Hierarchical transformers for anomaly detection in system log, IEEE transactions on network and service management 17 (4) (2020) 2064–2076.

[17] Edwin Min, James Long, Quan Liu, James Cui, Wenyan Chen, Tr-ids: Anomaly-based intrusion detection via text-based convolutional neural network and random forest, Security and Communication Networks 2018.]

[18] Y. Jia, F. Zhong, A. Alrawais, B. Gong, and X. Cheng, “FlowGuard: An intelligent edge defense mechanism against IoT DDoS attacks,” IEEE Internet Things J., vol. 7, no. 10, pp. 9552–9562, Oct. 2020.

[19] M. Shafiq, Z. Tian, A. K. Bashir, X. Du, and M. Guizani, “CorrAUC: A malicious Bot-IoT traffic detection method in IoT network using machine-learning techniques,” IEEE Internet Things J., vol. 8, no. 5, pp. 3242–3254, Mar. 2021.

[20] W. Ding, J. Nayak, B. Naik, D. Pelusi, and M. Mishra, “Fuzzy and real-coded chemical reaction optimization for intrusion detection in industrial big data environment,” IEEE Trans. Ind. Informat., vol. 17, no. 6, pp. 4298–4307, Jun. 2021.

A systematic approach to enhancing network security through multi-stage optimization of machine learning models was developed by M.Injadat,A.Moubayed,A.B.Nassif,and A.Shami in their study titled "Multi-stage optimized machine learning framework for network intrusion detection" published in IEEE Transactions on Network Service Management in June 2021

[22] B. Aslahi-Shahri, R. Rahmani, M. Chizari, A. Maralani, M. Eslami, M. J. Golkar, A. Ebrahimi, A hybrid method consisting of ga and svm for intrusion detection system, Neural computing and applications 27 (6) (2016) 1669–1676.

[23] H. Alazzam, A. Sharieh, K. E. Sabri, An algorithm for feature selection in an intrusion detection system developed using a method inspired by the behavior of pigeons, Expert systems with applications 148 (2020) 113249.

[24] C. Khammassi and S. Krichen, A genetic-algorithm-based feature selection framework for network intrusion detection, Computers & Security, 2017.

[25] J. Gao et al., “Omniscient SCADA intrusion detection system based on deep learning techniques,” IEEE Internet Things J., vol. 8, no. 2, pp. 951–961, Jan. 2021.]

Ensemble of randomized partition trees for both point-wise and collective anomaly detection is applied in the context of network intrusion detection as detailed in [26].

[27] F. Li, Y. Shi, A. Shinde, J. Ye, and W. Song, “Enhanced cyber-physical security in Internet of Things through energy auditing,” IEEE Internet Things J., vol. 6, no. 3, pp. 5224–5231, Jun. 2019.

[28] X. Zhou, W. Liang, S. Shimizu, J. Ma, and Q. Jin, “Siamese neural network based few-shot learning for anomaly detection in industrial cyber-physical systems,” IEEE Trans. Ind. Informat., vol. 17, no. 8, pp. 5790–5798, Aug. 2021.

[29] X. Xu et al., "Toward effective intrusion detection using log-cosh conditional variational autoencoder," IEEE Internet Things J., vol. 8, no. 8, pp. 6187–6196, Apr. 2021.

Li et al., "DeepFed: Federated deep learning for intrusion detection in industrial cyber-physical systems," IEEE Trans Ind Informat vol 17 no 8 pp 5615–5624 Aug 2021

[31] S. Teng, N. Wu, H. Zhu, L. Teng, and W. Zhang, ‘‘SVM-DT-based adaptive and collaborative intrusion detection,’’ IEEE/CAA J. Autom. Sin., vol. 5, no. 1, pp. 108–118, Jan. 2018.

J. Gu and S. Lu, "An effective intrusion detection approach using SVM with naive Bayes feature embedding," Comput. Secur., vol. 103, no. 4, 2021.

[33] D. Jing and H.-B. Chen, ‘‘SVM based network intrusion detection for the UNSW-NB15 dataset,’’ in Proc. IEEE 13th Int. Conf. ASIC (ASICON), Oct. 2019, pp. 1–4.

[35] A.A Aburomman and M.B.I Reaz,'An innovative weighted support vector machines multiclass classifier utilizing differential evolution in intrusion detection systems',Inf.Sci.,vol .414 pp .225-246,Nov .2017

A robust intrusion detection framework is proposed incorporating a novel ensembled support vector machine into a chaotic game-based optimization algorithm for effective intrusion detection in large-scale data platforms.

[37] M. Rashid, J. Kamruzzaman, T. Imam, S. Wibowo, and S. Gordon, ‘‘A treebased stacking ensemble technique with feature selection for network intrusion detection,’’ Int. J. Speech Technol., vol. 52, no. 9, pp. 9768–9781, Jul. 2022.

[38] S.-W. Lee, H. M. Sidqi, M. Mohammadi, S. Rashidi, A. M. Rahmani, M. Masdari, and M. Hosseinzadeh, ‘‘Towards secure intrusion detection systems using deep learning techniques: Comprehensive analysis and review,’’ J. Netw. Comput. Appl., vol. 187, Aug. 2021, Art. no. 103111.

[39]

[40] M. M. Hassan, A. Gumaei, A. Alsanad, M. Alrubaian, and G. Fortino,“A hybrid deep learning model for efficient intrusion detection in big data environment,’’ Inf. Sci., vol. 513, pp. 386–396, Mar. 2020.

M. A. Khan, "HCRNNIDS: Hybrid convolutional recurrent neural network-based network intrusion detection system," Processes, vol. 9, no. 5, p. 834, May 2021.

[42] B. Cao et al., “Network-based intrusion detection system constructed upon convolutional neural networks (CNNs) and gated recurrent units (GRUs),” Appl. Sci., vol. 12, no. 9, pp. 4184–4185, April 2022.]

[43] S. M. Kasongo, "A deep learning-based approach for Intrusion Detection System (IDS) utilizing an effective recurrent neural network (RNN) framework," Comput. Commun., vol. 199, pp. 113–125, Feb. 2023.]

Deep learning approaches have been employed to develop intelligent intrusion detection systems as detailed in the study by R. Vinayakumar and colleagues. The research article focuses on enhancing security mechanisms through advanced computational techniques published in IEEE Access journal volume 7 during the year 2019 covering pages from 41525 to 41550.

[45] S. Jian, G. Pang, L. Cao, K. Lu, H. Gao, Flexible categorical data representation by hierarchical coupling learning, IEEE transactions on knowledge and data engineering 31 (5) (2018) 853–866.

[46]

A cloud-based intrusion detection system built upon deep neural networks employing backpropagation and particle swarm optimization was developed and evaluated using the CSE-CIC-IDS2018 dataset, appearing in Applied Sciences in 2023 with the DOI 10.3390/app13042276.

[48] A. Thakkar, R. Lohiya, Fusion of statistical importance for feature selection in deep neural network-based intrusion detection system, Inf. Fusion 90 (2023) 353–363, doi:10.1016/j.inffus. 2022.09.026.

[49] A. El-Ghamry, A. Darwish, A.E. Hassanien, An optimized cnn-based intrusion detection system for reducing risks in smart farming, Internet of Things 22 (2023) 100709, doi:10.1016/ j.iot.2023.100709.

[50] C.-s. Wu, S. Chen, A heuristic intrusion detection approach using deep learning model, in: 2023 International Conference on Information Networking (ICOIN), 2023, pp. 438–442. 10.1109/ ICOIN56518.2023.10049024

An innovative bi-layered deep learning-based framework for network-based intrusion detection systems has been developed in this study, incorporating a long short-term memory autoencoder architecture to enhance the accuracy and robustness of anomaly detection in network traffic.

[52] S. Dong, Y. Xia, T. Peng, Network abnormal traffic detection model based on semisupervised deep reinforcement learning, IEEE Trans. Netw. Serv. Manage. 18 (4) (2021) 4197–4212, doi:10.1109/TNSM.2021.3120804.

[53] C. Pelletier, G.I. Webb, F. Petitjean, Deep learning for the classification of sentinel-2 image time series, in: IGARSS 2019 - 2019 IEEE International Geoscience and Remote Sensing Symposium, 2019, pp. 461–464. doi: 10.1109/IGARSS.2019.8900123.

[54] J. McHugh, “Testing intrusion detection systems: a critique of the 1998 and 1999 darpa intrusion detection system evaluations as performed by lincoln laboratory,” ACM Transactions on Information and System Security (TISSEC), vol. 3, no. 4, pp. 262–294, 2000.

W. Hu, J. Gao, Y. Wang, O. Wu, and S. Maybank, "Real-time boosting-based parameterized approach for dynamically distributed network intrusion detection system," IEEE Transactions on Cybernetics, vol. 44, no. 1, pp. 66–82, 2013.

The assessment of network-based anomaly detection mechanisms: a statistical evaluation of the UNSW-NB15 dataset alongside an analytical comparison with the KDD Cup '99 dataset.

CR-GTCN: A Deep Learning-Based Approach for Network Intrusion Detection

Author: Zheng Li

Abstract: Network intrusion detection is an important technology in the national cyberspace security strategy and has been a research hotspot in cyberspace security in recent years. Effective, intelligent detection methods built on advanced deep learning algorithms are of great significance for defending against network intrusions in complex network environments. However, most existing detection methods do not fully use the information contained in raw network traffic: information is lost during feature extraction, and the feature dimensions are incomplete. To address these problems, this paper exploits the spatio-temporal features of network traffic data and proposes CR-GTCN, a hierarchical intrusion detection model built from a CNN, an RNN, a Gated TCN, and Self-Attention. A 1D Convolutional Neural Network (CNN) extracts spatial features, and a Gated TCN and a Recurrent Neural Network (RNN) extract temporal features; the features from the three networks are concatenated to obtain the fused features. So that useful input information is better expressed, Self-Attention assigns different weights to the fused features, which allows the spatial and temporal features of the data to be extracted effectively. The model is evaluated on the UNSW-NB15 dataset. The experimental results show that CR-GTCN integrates feature selection with the deep learning model, detects different categories of network intrusion, and achieves better detection performance than the baseline network intrusion detection methods.

Keywords: Intrusion Detection; Convolutional Neural Network; Gated TCN; Recurrent Neural Network; Self-Attention

1 Introduction

The increasing prevalence of information technology along with the emergence and evolution of cyber space has significantly contributed to economic growth as well as social progress. However, this advancement has also introduced new cyber space security risks and challenges [1]. As computer networks become increasingly sophisticated, cyber space faces a variety of complex security threats such as network intrusions [2]. Network intrusion detection serves as a vital security defense mechanism in the domain of cyber space. By analyzing network traffic data characteristics, it identifies potential malicious activities before they occur, enabling timely detection. The system then formulates effective defense strategies to protect network resources from malicious activities [3]. To meet stringent cybersecurity requirements, it is not only necessary to promptly detect network intrusions but also essential to accurately classify their types so that appropriate countermeasures can be implemented accordingly. This necessitates research into efficient methods for rapid identification alongside precise categorization of various types of cyber attacks. With the exponential growth in data volume within big data environments, designing scalable intrusion detection systems has become both crucial yet highly challenging [4]. Consequently, developing intelligent methods for accurate identification alongside rapid response becomes particularly significant for defending against cyber threats in complex network environments.

Traditional network intrusion detection systems are typically classified into misuse detection and anomaly detection approaches [5]. Misuse detection decides whether traffic is an attack by matching it against rules or known feature patterns. Anomaly detection, by contrast, identifies unusual patterns that may indicate malicious activity. Both approaches suffer from low detection rates and high false-positive rates. Researchers have improved network intrusion detection by applying machine learning techniques and deep neural networks [6]. Early work relied mainly on conventional machine learning with a single model, and models based on a single algorithm often achieved limited accuracy. Algorithms such as k-nearest neighbor [7], naive Bayes [8], decision tree [9], and random forest [10] have been used for network intrusion detection. Although they are simple and need little training, these techniques have relatively low accuracy. Deep learning-based systems perform better than traditional machine learning approaches; examples include the Multilayer Perceptron (MLP) [11], Convolutional Neural Networks (CNN) [12], Recurrent Neural Networks (RNN), and Long Short-Term Memory (LSTM) networks. These algorithms improve detection accuracy but require significant computational resources for training.

Network intrusion detection systems based on machine learning and deep learning focus more on analysing behaviour patterns than on the semantics of network data. In a typical deployment, the system examines both network traffic and host devices to identify potential intrusions, and it is attached to specific nodes in the network, such as hubs or switches, so that it can track and monitor network activity. Network Intrusion Detection (NID) watchpoints are placed where significant data flows occur and continuously monitor incoming packets for signs of malicious behaviour. Attackers often exploit knowledge of the features in use by creating attack variants designed to bypass current detection methods. To remain effective against such evolving threats, intrusion detection systems need diverse, complementary feature sets and a growing space of feature combinations, so that an attacker cannot evade detection by defeating a single type of feature.

To solve the above problems, this paper proposes CR-GTCN, a new network intrusion detection model.

The contribution of this paper is as follows:

This paper proposes a new intrusion detection model, CR-GTCN, that takes the distinct characteristics of each component of network traffic into account and learns the spatio-temporal features of the traffic hierarchically.

This paper develops several modules (CNN, RNN, and Gated TCN networks) to extract network traffic features and integrates them in a way that preserves feature information.

This paper presents a self-attention mechanism module designed to determine different weights for fused features and conduct secondary feature extraction aimed at selecting important feature information.

This study tests the CR-GTCN model against the UNSW-NB15 dataset, demonstrating strong detection capabilities through experimental results.

2 Related works

Network intrusion detection is both a supervised anomaly detection task and a classification task. Over the past few years, researchers have explored various approaches, including traditional statistical techniques and deep learning-based methods, to identify distinctive attack patterns and to separate attack traffic from normal network traffic.

2.1 Semantic-based methods

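Enterprise systems typically generate large volumes of log files that record their running state and active network traffic. Traditional intrusion detection methods rely on parsing unstructured network data and analyzing PCAP files. Such data may contain useful information about a specific attack, such as the attack name, reference information, and host information. Seyyar et al. [13] proposed a web-based intrusion detection system (IDS) model that distinguishes normal URLs from anomalous ones. In the URL analysis stage, the method uses the BERT Transformer model and a convolutional neural network (CNN) to classify multiple types of attacks. To bridge the semantic gap, the work in [15] processes HTTP traffic with TF-IDF-weighted word-vector mapping combined with a Word2Vec model. Huang et al. [16] proposed a BERT-based method that builds on log template segments and attribute values for anomaly detection, designing an encoder over log sequences and parameter values to extract information from the ordered data. In addition, [17] proposed an intrusion detection system that uses both statistical features and payload features; its experimental results indicate that the proposed method achieves the best detection accuracy.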

2.2 Feature Selection-Based Methods

Feature selection can be used as a preprocessing step in machine learning to reduce computational cost: it removes unnecessary features while maintaining, or even improving, intrusion detection performance. To achieve real-time, lightweight intrusion detection, researchers have proposed various feature selection methods that reduce the computational complexity of detection models. Jia et al. [18] used information gain to select the ten most relevant features and then applied LSTM and CNN to the selected features: information gain performs the feature selection, the LSTM model identifies the network traffic, and the CNN classifies it. Shafiq et al. [19] filtered the feature set with a wrapper technique and used the area-under-the-curve (AUC) metric to select the network feature set; machine learning algorithms such as Decision Tree (DT) and Support Vector Machine (SVM) were then applied to the selected features. Ding et al. [20] proposed a hybrid model that reduces the number of network features through cluster analysis. The method first uses an information-based feature selection algorithm to select potentially important features, then uses a real-coded chemical reaction optimization algorithm to find the best cluster centers and a fuzzy mean method to select the best feature set, and finally applies machine learning methods such as SVM to the selected feature set. Injadat et al. [21] used information gain for feature selection, reducing the size of the network feature set by almost 60%, and used Random Forest (RF) to detect network data efficiently. Aslahi-Shahri et al. [22] combined SVM with a genetic algorithm to reduce the number of features in the KDD CUP99 dataset from 41 to 10, and their experiments showed a very low false alarm rate. Alazzam et al. [23] used the Pigeon Inspired Optimizer (PIO) to reduce the number of features in the KDDCUP99, NSL-KDD, and UNSW-NB15 datasets from 41 to 7, 5, and 5, respectively, while maintaining high true positive rates and accuracy and decreasing the time needed to construct the DT. To select the best feature subset for network intrusion detection, Khammassi et al. [24] used logistic regression as both the selection method and the training model, achieving high detection accuracy.

As the preceding discussion shows, although these techniques reduce the feature dimension, their effectiveness depends heavily on the quality of the features and their semantic details. Optimal results cannot be guaranteed across all data scenarios, particularly for imbalanced datasets. Furthermore, current feature selection approaches rely on heuristic rules and metrics, which limits their capacity to uncover intricate relationships among features.
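Several of the works surveyed in this subsection rank features by information gain. As an illustration only, the following minimal sketch computes the information gain IG(Y; X) = H(Y) − Σ_v p(X = v) H(Y | X = v) of a discrete feature and keeps the top-k features. The function names and the toy flow records are hypothetical and are not taken from any of the cited papers.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """IG(Y; X) = H(Y) - sum_v p(X=v) * H(Y | X=v) for a discrete feature X."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def top_k_features(columns, labels, k):
    """Rank named feature columns by information gain and keep the k best."""
    ranked = sorted(columns, key=lambda name: information_gain(columns[name], labels), reverse=True)
    return ranked[:k]


# Toy, made-up flow records (assumption: both features are categorical).
labels = ["attack", "attack", "normal", "normal"]
columns = {
    "proto": ["tcp", "tcp", "udp", "udp"],  # perfectly separates the two classes
    "state": ["FIN", "CON", "FIN", "CON"],  # carries no class information
}
print(top_k_features(columns, labels, k=1))  # ['proto']
```

Continuous features would first have to be discretized (or scored with a different estimator), which is one reason such heuristic rankings can miss interactions between features.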

2.3 Machine Learning-Based Methods

Machine learning has made great achievements in fields such as data analysis, detection, and artificial intelligence, and this has led to numerous machine learning-based techniques for network intrusion detection. For instance, Gao et al. [25] proposed an approach that combines LSTM and feedforward neural networks (FNN): the network traffic data are preprocessed with LSTM and FNN, the outputs are combined and used to train a combined model with an FNN, and intrusion detection is performed by integrating the trained models. Marteau [26] developed an intrusion detection model based on a forest of binary trees that identifies both point-wise and collective anomalies. Li et al. [27] constructed a dual deep learning model using CNNs that learns adaptively from normal data by comparing the differences between predicted and actual results.

2.4 Deep learning methods

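In recent years, many effective deep learning models, such as deep belief networks (DBNs), deep convolutional neural networks (deep CNNs), recurrent neural networks (RNNs), and deep generative networks, have emerged and been widely applied to intrusion detection systems in network security [38]. Shone et al. [6] proposed a new deep learning classification model built from stacked non-symmetric deep autoencoders and used it to detect network intrusion classes. Hassan et al. [40] proposed an efficient intrusion detection method for big data environments that combines a deep convolutional neural network (DCNN) with a weight-dropped long short-term memory (LSTM) network to extract meaningful features from network security data and improve detection accuracy. Khan [41] used a CNN to capture local features and an RNN to extract temporal features, and proposed an intrusion detection framework based on a hybrid deep learning architecture for predicting and classifying malicious attacks. Cao et al. [42] proposed a network intrusion detection model that fuses a CNN with gated recurrent units (GRU) and addresses low classification accuracy and class imbalance. Kasongo [43] built an intrusion detection framework from different types of recurrent units, including long short-term memory (LSTM), gated recurrent unit (GRU), and simple RNN units. Vinayakumar et al. [44] proposed a hybrid intrusion detection method based on distributed deep learning with deep neural networks (DNNs), using five hidden layers to support real-time processing of large volumes of data. Jiang et al. [45] proposed a multi-channel intrusion detection method that uses LSTM-RNN sequence modeling to mine behavioral features from specific attack events; attack behavior is treated as sequence data so that the behavior pattern of a given attack event can be recognized and modeled. The deep learning architecture of Shone et al. [6] combines random forest (RF) with non-symmetric deep autoencoders: the autoencoders are arranged hierarchically, and the output of the final encoding layer is used as the input to a softmax classifier.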

Khediri [47] proposed a DNN trained with backpropagation on CSE-CIC-IDS2018, but did not report training and validation time, which may limit its usefulness for further research. Thakkar et al. [48] proposed a DNN for attack detection trained separately on the NSL-KDD, UNSW-NB15, and CIC-IDS-2017 datasets; training was time-consuming. El-Ghamry et al. [49] introduced VGG16-PSO, an optimized CNN model for intrusion detection, without reporting training and validation time. Wu et al. [50] proposed a DNN for detecting cyber-attacks that was validated only on CSE-CIC-IDS2018 and for which training and inference times were not reported. Hnamte et al. [51] proposed an LSTM-AE model for attack detection, trained and validated on two publicly available datasets; the model takes longer to train and requires high-end computing resources for larger datasets. Dong et al. [52] proposed a semi-supervised double deep Q-network (SSDDQN) for attack detection, trained on the NSL-KDD and AWID datasets; because these datasets are older and smaller, the model may not apply to modern attack scenarios. Pelletier et al. [53] used RNNs and Temporal Convolutional Neural Networks (TempCNN) and evaluated their training and testing duration.

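These studies have certain limitations. For example, when deeper convolutional neural networks (CNNs) are used, L1 or L2 regularization is not introduced when tuning the CNN hyperparameters. Regularization, however, is not the key factor in model performance; in deep learning, architecture design often plays the decisive role. Previous researchers have typically designed architectures from CNNs and deep neural networks (DNNs) and improved performance metrics through repeated iteration. Nevertheless, expanding the search space still needs further exploration, and the fitness function should take into account multiple factors such as model running time.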

3 Proposed CR-GTCN approach for intrusion detection

CNNs focus on spatial features, whereas RNNs focus on temporal ones. HAST-IDS represents a basic approach that merely concatenates a CNN and an RNN, as shown in Figure 1. As data passes through successive CNN layers, the extracted information becomes increasingly spatially oriented, so temporal information may be lost in the CNN hierarchy, which severely impairs the learning capability of a subsequent RNN such as an LSTM.

Figure 1. Diagram of the HAST-IDS model

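To address this problem, this paper integrates a convolutional neural network (CNN), a gated temporal convolutional network (Gated TCN), and a recurrent neural network (RNN) as sub-networks; the resulting architecture is shown in Figure 2.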

Figure 2. CR-GTCN Model Diagrams

3.1 Convolutional Neural Network (CNN)

A convolutional neural network is built from two basic operations, convolution and pooling. The convolution operation applies a set of kernels to the input and produces an output, called a feature map, that highlights the key characteristics of the input data. The convolution output is passed through an activation function and then down-sampled by a pooling operation, which removes redundant information, reduces the effect of defects in the data, and makes learning easier for the subsequent layers. Through repeated learning, the network automatically adjusts its filters so that the output feature maps reflect the structure of the original input. Because network packets are one-dimensional sequences, a one-dimensional convolution is sufficient, and it is computed as follows:

$$y_i = \mathrm{ReLU}\left(\sum_{j=0}^{k-1} w_j \, x_{i+j}\right) \tag{1}$$

where $x$ is the input sequence, $i$ denotes the position of a value in the sequence data, $w$ denotes the filter of size $k$, and ReLU is selected as the activation function.
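To make the convolution and pooling steps concrete, here is a minimal pure-Python sketch of a one-dimensional convolution with ReLU followed by max pooling. It assumes no padding, a stride of 1, no bias term, and non-overlapping pooling windows; these settings and the helper names are illustrative assumptions, not the configuration used in the paper.

```python
def relu(v):
    """Rectified linear unit."""
    return max(0.0, v)


def conv1d_relu(x, w):
    """1D convolution (cross-correlation form, no padding, stride 1) followed by ReLU."""
    k = len(w)
    return [relu(sum(w[j] * x[i + j] for j in range(k))) for i in range(len(x) - k + 1)]


def max_pool1d(x, size):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [max(x[i:i + size]) for i in range(0, len(x) - size + 1, size)]


x = [1.0, 3.0, 2.0, 0.0, 4.0, 1.0]   # toy 1D input sequence
w = [1.0, -1.0]                      # toy filter of size k = 2
feature_map = conv1d_relu(x, w)
print(feature_map)                   # [0.0, 1.0, 2.0, 0.0, 3.0]
print(max_pool1d(feature_map, 2))    # [1.0, 2.0]
```

In a trained network the filter values are learned rather than fixed, and many filters are applied in parallel to produce multiple feature maps.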

3.2 Batch Normalization

A difficulty in training deep neural networks is that the range of the input values to each layer changes as training proceeds, a phenomenon known as covariate shift. Covariate shift makes learning unstable, because the effectiveness of one layer depends on the behavior of the others. It may also force a lower learning rate so that the model can learn across varying input ranges, which slows convergence. Batch normalization addresses this problem. In this paper, batch normalization is used within the RNN module to adjust the output of the CNN:

$$\hat{x}_i = \frac{x_i - \mu_B}{\sqrt{\sigma_B^2 + \epsilon}} \tag{3}$$

where $x_i$ is a value in the input batch, $\mu_B$ and $\sigma_B^2$ denote the batch mean and variance, respectively, and $\epsilon$ is a very small constant that prevents the denominator from being zero.

Based on the normalized value $\hat{x}_i$, the output $y_i$ is obtained as shown in Eq. (4), where both $\gamma$ and $\beta$ are trained during the learning process to improve the learning outcome:

$$y_i = \gamma \hat{x}_i + \beta \tag{4}$$
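The two steps of batch normalization (normalize with the batch mean and variance, then scale and shift with the learned parameters) can be sketched in a few lines. This is a training-mode sketch for a single feature; the default values of `gamma`, `beta`, and `eps` are assumptions chosen for illustration, and the running statistics used at inference time are omitted.

```python
import math


def batch_norm(batch, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize one feature over a mini-batch, then apply y = gamma * x_hat + beta."""
    mean = sum(batch) / len(batch)
    var = sum((v - mean) ** 2 for v in batch) / len(batch)  # biased batch variance
    normalized = [(v - mean) / math.sqrt(var + eps) for v in batch]
    return [gamma * v + beta for v in normalized]


out = batch_norm([2.0, 4.0, 6.0, 8.0])
print([round(v, 3) for v in out])  # approximately [-1.342, -0.447, 0.447, 1.342]
```

With `gamma=1.0` and `beta=0.0` the output has approximately zero mean and unit variance; during training the network is free to learn other values if a different scale suits the next layer better.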

3.3 Gated Temporal Convolutional Network (Gated TCN)

Gating mechanisms play a crucial role in recurrent neural networks, and they are equally useful for controlling the information flow through the layers of a temporal convolutional network (TCN). A simple form of Gated TCN contains only an output gate, as illustrated in Figure 2. Given an input $X$, it takes the form:

$$h = g(\Theta_1 \star X + b) \odot \sigma(\Theta_2 \star X + c) \tag{7}$$

where $\Theta_1$, $\Theta_2$, $b$, and $c$ are the model parameters, $\star$ denotes the convolution operation, $\odot$ is the element-wise product, $g(\cdot)$ is the output activation function, and $\sigma(\cdot)$ is the sigmoid function, which determines the proportion of information passed from the current layer to the next.

In this study, dilated causal convolution is used as the TCN to extract features. By stacking layers, a dilated causal convolution network markedly enlarges the receptive field and alleviates the vanishing-gradient problem. Temporal causality is preserved by zero-padding the input. As a special case of 1D convolution, dilated causal convolution slides over the input while skipping values at a fixed step, as shown in Figure 3. Mathematically, given a one-dimensional input sequence $x \in \mathbb{R}^{T}$ and a filter $f \in \mathbb{R}^{K}$, the dilated causal convolution of $x$ with $f$ at step $t$ is given by Equation (8):

$$x \star f(t) = \sum_{s=0}^{K-1} f(s)\, x(t - d \times s) \tag{8}$$

where $d$ is the dilation factor that controls the skipping distance.

Figure 3 illustrates dilated causal convolution with a kernel size of 2 and dilation factor $d$: the inputs are selected every $d$ steps, and a standard 1D convolution is applied to the selected inputs.

Figure 3. Dilated causal convolution
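The following sketch illustrates the two ideas of this subsection on a plain Python list: a dilated causal convolution, y[t] = Σ_s f[s] · x[t − d·s], and an output-gated unit that multiplies a filter branch by a sigmoid gate branch. Treating positions before the start of the sequence as zeros, using `tanh` as the output activation, and the parameter names `theta1`, `theta2`, `b`, `c` are assumptions made for illustration.

```python
import math


def dilated_causal_conv(x, f, d):
    """y[t] = sum_s f[s] * x[t - d*s]; positions before the start count as zero padding."""
    out = []
    for t in range(len(x)):
        acc = 0.0
        for s, weight in enumerate(f):
            idx = t - d * s
            if idx >= 0:  # causal: only the current and earlier steps contribute
                acc += weight * x[idx]
        out.append(acc)
    return out


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def gated_tcn(x, theta1, theta2, b, c, d):
    """h = tanh(theta1 * x + b) (element-wise product) sigmoid(theta2 * x + c)."""
    filt = dilated_causal_conv(x, theta1, d)
    gate = dilated_causal_conv(x, theta2, d)
    return [math.tanh(u + b) * sigmoid(v + c) for u, v in zip(filt, gate)]


x = [1.0, 2.0, 3.0, 4.0, 5.0]
print(dilated_causal_conv(x, [1.0, 1.0], d=2))  # [1.0, 2.0, 4.0, 6.0, 8.0]
h = gated_tcn(x, theta1=[1.0, 1.0], theta2=[0.0, 0.0], b=0.0, c=0.0, d=2)
```

With kernel size 2 and `d=2`, each output combines the current value with the value two steps earlier. In the gated call, the zero gate filter makes every gate value 0.5, so half of each filtered activation is passed on.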

3.4 Long Short-Term Memory (LSTM)

While a CNN learns from individual data records, an RNN establishes relationships between records by feeding what was learned at the previous step into the current step, thereby capturing temporal patterns in the input data. However, the simple feedback used in a traditional RNN can cause learning errors to accumulate over long-term dependencies, which may invalidate the final learning result. LSTM, a gated recurrent neural network, alleviates this problem: it controls the feedback through a set of gate functions, so that short-term errors are eventually eliminated and only persistent features remain. This paper therefore uses LSTM as the RNN. As shown in Fig. 4, an LSTM can be viewed as four sub-networks together with a set of control gates and a memory unit. All input and output values are vectors of the same size, determined by the input. The current input $x_t$, the previous output $h_{t-1}$, and the state $c_{t-1}$ stored in the memory are used as feedback for the current learning step. The four sub-networks compute:

$$\begin{aligned}
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f), \\
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i), \\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o), \\
\tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c)
\end{aligned} \tag{9}$$

where $x_t$ is the current input, $h_{t-1}$ is the previous output, $b$ denotes a bias, and $W$ and $U$ denote weight matrices.

After the four sub-networks, the control gates determine the updated memory state $c_t$ and the current output $h_t$ that are fed back:

$$c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \tag{10}$$

$$h_t = o_t \odot \tanh(c_t) \tag{11}$$

The LSTM learns from the input by adjusting the weights and biases of these sub-networks, so that the output captures the temporal characteristics of the input data.

Fig 4. A high-level data processing diagram of LSTM
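As a hedged illustration of how the sub-networks, gates, and memory interact, the sketch below implements one time step of a standard LSTM cell in pure Python. It assumes the common formulation with forget, input, and output gates plus a candidate state; the parameter layout (a dictionary keyed by `'f'`, `'i'`, `'o'`, `'c'`) and the helper names are hypothetical and may differ from the exact notation of the paper.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def affine(W, U, b, x, h):
    """Compute W x + U h + b for list-of-lists matrices and list vectors."""
    return [
        sum(w_ij * x_j for w_ij, x_j in zip(w_row, x))
        + sum(u_ij * h_j for u_ij, h_j in zip(u_row, h))
        + b_i
        for w_row, u_row, b_i in zip(W, U, b)
    ]


def lstm_step(x, h_prev, c_prev, params):
    """One LSTM time step; params maps 'f', 'i', 'o', 'c' to a (W, U, b) tuple."""
    f = [sigmoid(v) for v in affine(*params["f"], x, h_prev)]          # forget gate
    i = [sigmoid(v) for v in affine(*params["i"], x, h_prev)]          # input gate
    o = [sigmoid(v) for v in affine(*params["o"], x, h_prev)]          # output gate
    c_tilde = [math.tanh(v) for v in affine(*params["c"], x, h_prev)]  # candidate state
    c = [fv * cp + iv * ct for fv, cp, iv, ct in zip(f, c_prev, i, c_tilde)]
    h = [ov * math.tanh(cv) for ov, cv in zip(o, c)]
    return h, c


# Toy cell: 2 inputs, 1 hidden unit, all-zero weights, so every gate equals 0.5.
zero = ([[0.0, 0.0]], [[0.0]], [0.0])
params = {name: zero for name in "fioc"}
h, c = lstm_step([1.0, -1.0], h_prev=[0.0], c_prev=[2.0], params=params)
print(c)  # [1.0]  (half of the previous memory is kept, nothing new is written)
```

Processing a sequence amounts to calling `lstm_step` once per record and passing the returned `h` and `c` into the next call.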

3.5 Self-Attention (SA)

This study combines a CNN, a Gated TCN, and an LSTM: the CNN is used for feature extraction and the LSTM for sequence prediction. Introducing an attention mechanism makes it possible to select the critical information from a large amount of data, which improves intrusion detection accuracy, and self-attention mitigates interference from external information. As shown in Figure 5, the self-attention (SA) model uses the attention mechanism to dynamically generate the weights of different connections. It can handle variable-length input sequences and can be used as a layer in a neural network:

$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\left(\frac{QK^{\top}}{\sqrt{d_k}}\right)V \tag{15}$$

where $Q$ denotes the matrix of query vectors, $K$ denotes the matrix of key vectors, and $V$ denotes the matrix of value vectors. The dot product of $Q$ and $K$ is computed first and divided by $\sqrt{d_k}$, where $d_k$ is the dimension of the key vectors, to prevent the result from becoming too large. The result is then normalized to a probability distribution with the Softmax operation and multiplied by the matrix $V$ to obtain the weighted output.

Figure 5. Self-Attention structure diagram
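The computation described in this subsection (dot product of queries and keys, scaling by the square root of the key dimension, Softmax, then multiplication by the values) can be sketched as follows. For brevity the sketch passes the same matrix as queries, keys, and values, which assumes identity projections; a real self-attention layer would first apply learned projection matrices, and the helper names here are hypothetical.

```python
import math


def softmax(row):
    """Numerically stable softmax of one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(m):
    return [list(col) for col in zip(*m)]


def scaled_dot_product_attention(Q, K, V):
    """Return (softmax(Q K^T / sqrt(d_k)) V, attention weights)."""
    d_k = len(K[0])
    scores = matmul(Q, transpose(K))
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    return matmul(weights, V), weights


X = [[1.0, 0.0], [0.0, 1.0]]  # two toy fused feature vectors
output, weights = scaled_dot_product_attention(X, X, X)
print([round(sum(row), 6) for row in weights])  # [1.0, 1.0]
```

Each row of `weights` is a probability distribution over the input positions, so every output row is a weighted average of the value vectors, with larger weights on positions whose keys match the query more closely.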

3.6 Output

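Because the CNN+RNN components at different levels learn at different granularities, the output size of the current level may not match the expected input size of the next level. The data are therefore reshaped by a Dimension Reshape layer. In addition, overfitting often occurs when training deep neural networks; to address it, this study adopts Dropout, which randomly removes some connections in the network. Finally, an additional convolutional layer and a Global Average Pooling (GAP) layer are used to extract further spatio-temporal features, and a final fully connected layer produces the final learning result.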
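Two of the output-stage operations, Dropout and Global Average Pooling, are simple enough to sketch directly. The example below uses the common "inverted dropout" variant, which rescales the surviving activations during training; that variant, the dropout rate, and the function names are assumptions for illustration rather than details confirmed by the paper.

```python
import random


def dropout(values, rate, rng, training=True):
    """Inverted dropout: zero each value with probability `rate`, rescale the rest."""
    if not training or rate == 0.0:
        return list(values)
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in values]


def global_average_pooling(feature_maps):
    """Average each channel (a list of values over time) down to a single number."""
    return [sum(channel) / len(channel) for channel in feature_maps]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
dropped = dropout([1.0] * 8, rate=0.5, rng=rng)
pooled = global_average_pooling([[1.0, 3.0], [2.0, 4.0, 6.0]])
print(pooled)  # [2.0, 4.0]
```

At inference time `training=False` leaves the activations unchanged, and GAP turns each feature map into one value that the final fully connected layer can consume.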

4 Datasets

The evaluation of a neural network architecture is closely tied to the dataset used. Evaluation results on many of the datasets collected for NID have been found to be unreliable [46], primarily because those datasets contain a large amount of redundant data [39]. In this study, we use the non-redundant UNSW-NB15 dataset to obtain reliable evaluations.

The UNSW-NB15 dataset [14] was created by the Australian Centre for Cyber Security (ACCS) in 2015. Its attack samples were collected from three sources: Common Vulnerabilities and Exposures (CVE), BID (Symantec Corporation), and MSD (Microsoft Security Bulletin). These attacks were then simulated in a laboratory environment to generate the dataset. UNSW-NB15 covers nine primary attack types.

5 Experiment

5.1 Data Preprocessing

In this paper, the UNSW-NB15 dataset is split in a 7:3 ratio: 70% of the data forms the training set, and the remaining 30% is used as the test set for evaluating the trained network intrusion detection model.

We evaluate the model with both binary and multi-class classification. In the first set of experiments, binary classification produces one of two outcomes, normal or attack. In the second set, the multi-class label takes a value that depends on the specific attack type. Before evaluation, the dataset is preprocessed as follows.

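1. Numerical conversion: To ensure that the experiments are valid, the data must match the input format accepted by the neural network. The raw network data contain several categorical features; because the learning algorithm cannot process text, these features are converted to numerical form.

2. Standardization: The features of the input data may differ in mean and standard deviation, which can reduce learning efficiency. In this paper, the input data are standardized so that each feature has a mean of 0 and a standard deviation of 1.

3. Stratified K-fold cross-validation: The UNSW-NB15 data used in this study contain 257,673 samples. To make full use of this large amount of non-redundant data for training and validation, stratified K-fold cross-validation is adopted. The dataset is divided into K subsets; K−1 subsets are used together for training, and the remaining subset is used for testing.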
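The three preprocessing steps can be sketched with the standard library alone. One-hot encoding is used here as one possible way of converting categorical features to numbers (the paper does not say which encoding it uses), and the round-robin fold assignment is a simplified, unshuffled stand-in for a library implementation; all function names are hypothetical.

```python
import math
from collections import defaultdict


def one_hot(values):
    """Encode a categorical column as one-hot vectors; returns (rows, categories)."""
    categories = sorted(set(values))
    index = {c: i for i, c in enumerate(categories)}
    rows = [[1.0 if index[v] == i else 0.0 for i in range(len(categories))] for v in values]
    return rows, categories


def standardize(column):
    """Rescale a numeric column to mean 0 and standard deviation 1."""
    mean = sum(column) / len(column)
    std = math.sqrt(sum((v - mean) ** 2 for v in column) / len(column))
    if std == 0.0:  # constant column: nothing to scale
        return [0.0 for _ in column]
    return [(v - mean) / std for v in column]


def stratified_k_fold(labels, k):
    """Yield (train_idx, test_idx); each class is dealt round-robin across the k folds."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    for i in range(k):
        test = sorted(folds[i])
        train = sorted(j for n, fold in enumerate(folds) if n != i for j in fold)
        yield train, test


rows, cats = one_hot(["tcp", "udp", "tcp"])
print(cats, rows)  # ['tcp', 'udp'] [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]

labels = ["attack"] * 4 + ["normal"] * 6
for train_idx, test_idx in stratified_k_fold(labels, k=2):
    print(test_idx)  # first [0, 2, 4, 6, 8], then [1, 3, 5, 7, 9]
```

Every test fold keeps the 4:6 class ratio of the full label list, which is the point of stratification on an imbalanced dataset. The standardization statistics should be computed on the training folds only and then reused for the held-out fold.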

5.2 Evaluation Metrics

This paper evaluates CR-GTCN using accuracy (ACC), detection rate (DR), and false positive rate (FPR), together with precision and F1-score. ACC measures the ability of CR-GTCN to correctly classify both attack and normal traffic, whereas DR (recall) reflects its ability to identify attacks. A high DR can be undermined by a high FPR, so the FPR must be considered alongside the DR. The metrics are defined as follows:

$$\mathrm{ACC} = \frac{TP + TN}{TP + TN + FP + FN} \tag{17}$$

$$\mathrm{DR} = \mathrm{Recall} = \frac{TP}{TP + FN} \tag{18}$$

$$\mathrm{FPR} = \frac{FP}{FP + TN} \tag{19}$$

$$\mathrm{Precision} = \frac{TP}{TP + FP} \tag{20}$$

$$F_1 = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \tag{21}$$

where $TP$ is the number of attacks correctly classified, $TN$ is the number of normal traffic records correctly classified, $FP$ is the number of normal traffic records wrongly classified as attacks, and $FN$ is the number of attacks wrongly classified as normal traffic.
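For reference, the evaluation metrics can be computed from the four confusion-matrix counts as in the sketch below. It assumes the conventional definitions (accuracy, detection rate as recall, false positive rate, precision, F1-score), treats "attack" as the positive class, and assumes all denominators are non-zero; the counts in the usage line are made up for illustration and are not results from the paper.

```python
def detection_metrics(tp, tn, fp, fn):
    """Binary detection metrics from confusion-matrix counts (attack = positive class)."""
    recall = tp / (tp + fn)       # detection rate (DR)
    precision = tp / (tp + fp)
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "detection_rate": recall,
        "false_positive_rate": fp / (fp + tn),
        "precision": precision,
        "f1": 2 * precision * recall / (precision + recall),
    }


m = detection_metrics(tp=90, tn=80, fp=20, fn=10)  # hypothetical counts
print(round(m["accuracy"], 3), round(m["detection_rate"], 3), round(m["false_positive_rate"], 3))
# 0.85 0.9 0.2
```

In this toy case the detector finds 90% of the attacks but also flags 20% of the normal traffic, which illustrates why the detection rate and the false positive rate have to be read together.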

5.3 Analysis of Experimental Results

First, we evaluate the performance of the CR-GTCN model in two scenarios: (1) a binary classification task, in which the model predicts whether a data packet is an attack or normal traffic; and (2) a multi-class classification task, in which it predicts whether a data packet is normal traffic or one of the nine attack types defined in the UNSW-NB15 dataset. The experimental results are as follows:

1. Binary Classification

Table 1 reports the detection results for the binary classification task in terms of Accuracy, F1-score, and Recall. The Accuracy of the comparison algorithms ranges from 71.6% to 88.5%, whereas the CR-GTCN algorithm proposed in this paper outperforms them, achieving an Accuracy of 91.5% and an F1-score of 91.6%.

Table 1 Result of Binary Classification

| Methods | Accuracy | F1-score | Recall |
|---|---|---|---|
| LR | 0.753 | 0.792 | 0.735 |
| GNB | 0.716 | 0.818 | 0.997 |
| KNN | 0.829 | 0.869 | 0.887 |
| DT | 0.885 | 0.910 | 0.906 |
| AdaB | 0.839 | 0.884 | 0.965 |
| RF | 0.877 | 0.912 | 0.991 |
| CNN-LSTM | 0.835 | 0.889 | 0.980 |
| LSTM | 0.767 | 0.798 | 0.721 |
| GRU | 0.777 | 0.818 | 0.782 |
| DNN | 0.827 | 0.879 | 0.987 |
| CR-GTCN | **0.915** | **0.916** | **0.997** |

2. Multi-class Classification

Table 2 reports the multi-class classification results. On the UNSW-NB15 dataset, the proposed CR-GTCN algorithm achieves a prediction accuracy of 90.5%, whereas the other algorithms range from 8.5% to 73.6%, and its F1-score is 74.3%. CR-GTCN therefore outperforms multi-class baselines such as RF and DT on this dataset.

Table 2 Result of Multi-class Classification

| Methods | Accuracy | F1-score |
|---|---|---|
| LR | 0.561 | 0.428 |
| GNB | 0.085 | 0.130 |
| KNN | 0.652 | 0.638 |
| DT | 0.735 | 0.718 |
| AdaB | 0.631 | 0.557 |
| RF | 0.736 | 0.695 |
| CNN-LSTM | 0.680 | 0.615 |
| LSTM | 0.661 | 0.598 |
| GRU | 0.665 | 0.608 |
| DNN | 0.663 | 0.608 |
| CR-GTCN | **0.905** | **0.743** |

3. Ablation Experiment

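To further examine the effectiveness of the proposed CR-GTCN model, ablation experiments were conducted; the results are given in Tables 3 and 4. The tables show that detection performance is poor when a CNN or an RNN is used alone. In the binary classification setting, CR-GTCN reaches an accuracy of 91.5% and an F1-score of 91.6%; in the multi-class setting, its accuracy and F1-score are 90.5% and 74.3%, respectively. In summary, the ablation results indicate that the proposed model detects intrusions better than the standalone CNN and RNN models, which supports the effectiveness of the method.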

Table 3 Binary Classification ablation experiment

| Methods | Accuracy | F1-score |
|---|---|---|
| CNN | 0.856 | 0.897 |
| RNN | 0.807 | 0.867 |
| CRNN-SA | 0.937 | 0.913 |
| CR-GTCN | **0.915** | **0.916** |

Table 4 Multi-classification ablation experiment

| Methods | Accuracy | F1-score |
|---|---|---|
| CNN | 0.684 | 0.627 |
| RNN | 0.662 | 0.587 |
| CRNN-SA | 0.899 | 0.775 |
| CR-GTCN | **0.905** | **0.743** |

Furthermore, this study visualizes the binary classification performance of the CR-GTCN algorithm on the UNSW-NB15 dataset, specifically the training loss curve and the false positive rate (FPR). The results are shown in Figure 6.

Figure 6. Visualization of training loss and FPR

We also examine the recall and the training loss of the CR-GTCN algorithm on the UNSW-NB15 dataset; the results are shown in Figure 7. The corresponding confusion matrix obtained by CR-GTCN on the UNSW-NB15 dataset is shown in Figure 8.

Figure 7. Visualization of Recall and Loss of training

Figure 8. Visualization diagram of confusion matrix

6 Conclusion

In this paper, a novel deep neural network architecture named CR-GTCN is introduced for large-scale network intrusion detection. The proposed model leverages CNNs to extract spatial traffic features from data and employs Gated TCNs fused with LSTM layers to capture temporal patterns. To mitigate the issue of information loss arising from the distinct learning characteristics of CNNs and RNNs, we synchronize the two network components to process input data at uniform granularity. This synchronization enables the Gated TCN module within the Temporal Convolutional Network (TCN) to effectively manage and regulate feature information across different layers. Additionally, we enhance feature representation by incorporating self-attention mechanisms, allowing each fused feature to be assigned optimal weights based on its relevance. This approach ensures that both spatial and temporal characteristics of network traffic are accurately captured. Furthermore, batch normalization is integrated into the model architecture to improve training efficiency and regularization performance. Comprehensive experiments conducted on the UNSW-NB15 dataset demonstrate that our proposed CR-GTCN model achieves superior detection accuracy compared to other state-of-the-art baseline models, while maintaining a lower false positive rate in network intrusion detection scenarios.

REFERENCES

[1] M. Humayun, M. Niazi, N. Jhanjhi, M. Alshayeb, and S. Mahmood, “Cyber security threats and vulnerabilities: A systematic mapping study,’’ Arabian J. Sci. Eng., vol. 45, no. 4, pp. 3171–3189, Apr. 2020.

[2] This article presents a systematic study on the approach for network space security exercise procedure.

[3] Z. Yang, X. Liu, T. Li, D. Wu, J. Wang, Y. Zhao, and H. Han, ‘‘A systematic literature review of methods and datasets for anomaly-based network intrusion detection,’’ Comput. Secur., vol. 116, May 2022, Art. no. 102675.

[4] L. Yi, M. Yin, and M. Darbandi, "An extensive and thorough analysis of intrusion detection systems within a fog environment," Trans. Emerg. Telecommun. Technol., vol. 34, no. 1, p. e4632, Jan. 2023.

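[5] A study of machine learning algorithms applied to intrusion detection, presented at the 2016 3rd International Conference on Computer and Information Sciences.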

[6] A study of a deep learning-based approach to network intrusion detection systems, published in an IEEE Transactions journal.

[7] B. B. Rao, K. Swathi, Fast knn classifiers for network intrusion detection system, Indian Journal of Science and Technology 10 (14) (2017) 1–10.

[8] A network intrusion detection system based on a hidden naive Bayes multiclass classifier.

[9] S. Sahu and B. M. Mehtre, "Network intrusion detection system that employs a j48 decision tree algorithm," presented at the 2015 International Conference on Advances in Computing, Communications and Informatics (ICACCI), IEEE, 2015, pp. 2023–2026.

[10] Y. Chang, W. Li, Z. Yang, Network intrusion detection based on random forest and support vector machine, in: 2017 IEEE international conference on computational science and engineering (CSE) and IEEE international conference on embedded and ubiquitous computing (EUC), Vol. 1, IEEE, 2017, pp. 635–638.

[11] A. Rosay, F. Carlier, P. Leroux, Mlp4nids: An efficient MLP-based network intrusion detection for cicids2017 dataset, in: International Conference on Machine Learning for Networking, Springer, 2019, pp. 240–254.

[12] C. Yue et al., "An integrated intrusion detection method for railway Ethernet consist networks based on convolutional neural network (CNN) and recurrent neural network (RNN)," IEEE Access, vol. 48, no. 6, p. 486.

[13] Y. E. Seyyar, A. G. Yavuz, H. M. Ünver, Detection of web attacks using the bert model, in: 2022 30th Signal Processing and Communications Applications Conference (SIU), IEEE, 2022, pp. 1–4.

[14] N. Moustafa and J. Slay, “The evaluation of network anomaly detection systems: Statistical analysis of the unsw-nb15 data set and the comparison with the kdd99 data set,” Information Security Journal: A Global Perspective, vol. 25, no. 1-3, pp. 18-31, 2016.

[15] J. Li, H. Zhang, and Z. Wei, "A method using weighted Word2Vec paragraph vectors for anomaly detection in HTTP traffic," IEEE Access, vol. 8.

[16] S. Huang et al., "HITs: HiT anomaly detection in system logs using hierarchical transformers," IEEE Transactions on Network and Service Management, vol. 17, no. 4, pp. 2064–2076, 2020.

[17] E. Min, J. Long, Q. Liu, J. Cui, W. Chen, TR-IDS: Anomaly-based intrusion detection through text-convolutional neural network and random forest, Security and Communication Networks, 2018.

[18] Y. Jia, F. Zhong, A. Alrawais, B. Gong, and X. Cheng, “FlowGuard: An intelligent edge defense mechanism against IoT DDoS attacks,” IEEE Internet Things J., vol. 7, no. 10, pp. 9552–9562, Oct. 2020.

[19] M. Shafiq, Z. Tian, A. K. Bashir, X. Du, and M. Guizani, “CorrAUC: A malicious Bot-IoT traffic detection method in IoT network using machine-learning techniques,” IEEE Internet Things J., vol. 8, no. 5, pp. 3242–3254, Mar. 2021.

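[20] W. Ding, J. Nayak, B. Naik, D. Pelusi, and M. Mishra, "Fuzzy and real-coded chemical reaction optimization for intrusion detection in industrial big data environment," IEEE Trans. Ind. Informat., vol. 17, no. 6, pp. 4298–4307, Jun. 2021.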

[21] M. Injadat, A. Moubayed, A. B. Nassif, and A. Shami, “Multi-stage optimized machine learning framework for network intrusion detection,” IEEE Trans. Netw. Service Manag., vol. 18, no. 2, pp. 1803–1816, Jun. 2021.

[22] B. Aslahi-Shahri, R. Rahmani, M. Chizari, A. Maralani, M. Eslami, M. J. Golkar, A. Ebrahimi, A hybrid method consisting of ga and svm for intrusion detection system, Neural computing and applications 27 (6) (2016) 1669–1676.

[23] A feature selection algorithm for intrusion detection systems based on the pigeon-inspired optimizer.

[24] C. Khammassi, S. Krichen, A ga-lr wrapper approach for feature selection in network intrusion detection, computers & security 70 (2017) 255–277.

[25] J. Gao et al., "A comprehensive SCADA intrusion detection system based on deep learning algorithms," IEEE Internet Things J., vol. 8, no. 2, pp. 951–961, Jan. 2021.

[26] P.-F. Marteau, "Random partitioning forest for point-wise and collective anomaly detection, with application to network intrusion detection," IEEE Trans. Inf. Forensics Security, vol. 16, pp. 2157–2172, 2021.

[27] Li et al., "Significant enhancement of cyber-physical security in IoT via energy auditing techniques," IEEE Internet Things J., vol. 6, no. 3, pp. 5224–5231, Jun. 2019.

[28] X. Zhou, W. Liang, S. Shimizu, J. Ma, and Q. Jin, “Siamese neural network based few-shot learning for anomaly detection in industrial cyber-physical systems,” IEEE Trans. Ind. Informat., vol. 17, no. 8, pp. 5790–5798, Aug. 2021.

[29] X. Xu, J. Li, Y. Yang, and F. Shen, “Toward effective intrusion detection using log-cosh conditional variational autoencoder,” IEEE Internet Things J., vol. 8, no. 8, pp. 6187–6196, Apr. 2021.

[30] B. Li, Y. Wu, J. Song, R. Lu, T. Li, and L. Zhao, “DeepFed: Federated deep learning for intrusion detection in industrial cyber-physical systems,” IEEE Trans. Ind. Informat., vol. 17, no. 8, pp. 5615–5624, Aug. 2021.

[31] S.-Q. Tang et al., IEEE/CAA J. Autom. Sinica, vol. 5, no. 1, Apr. 2019.

[32] J. Gu and S. Lu, "An effective intrusion detection approach using support vector machine with naïve Bayes feature embedding," Comput. Secur.

[33] D. Jing and H.-B. Chen, "Utilizing SVM techniques for network intrusion detection in the UNSW-NB15 dataset," in Proc. IEEE 13th Int. Conf. ASIC (ASICON), Oct. 2019, pp. 1–4.

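[34] M. R. G. Raman et al., "An efficient intrusion detection system based on a hypergraph-genetic algorithm for parameter optimization and feature selection in support vector machine."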

[35] A. Aburomman and M. B. I. Reaz, "An innovative weighted support vector machines multiclass system built upon differential evolution techniques for intrusion detection systems," Inf. Sci., vol. 414, pp. 225–246, Nov. 2017.

[36] A. Ponmalar and V. Dhanakoti, "An intrusion detection approach using ensemble support vector machine based chaos game optimization algorithm in big data platform," Appl. Soft Comput., vol. 116, 2022, Art. no. 108295.

[37] M. Rashid, J. Kamruzzaman, T. Imam, S. Wibowo, and S. Gordon, ‘‘A treebased stacking ensemble technique with feature selection for network intrusion detection,’’ Int. J. Speech Technol., vol. 52, no. 9, pp. 9768–9781, Jul. 2022.

[38] S.-W. Lee, H. M. Sidqi, M. Mohammadi, S. Rashidi, A. M. Rahmani, M. Masdari, and M. Hosseinzadeh, ‘‘Towards secure intrusion detection systems using deep learning techniques: Comprehensive analysis and review,’’ J. Netw. Comput. Appl., vol. 187, Aug. 2021, Art. no. 103111.

[39] The authors propose online AdaBoost-based parameterized approaches for dynamic distributed network intrusion detection systems.

[40] M. M. Hassan, A. Gumaei, A. Alsanad, M. Alrubaian, and G. Fortino, "An intelligent system for efficient intrusion detection in big data environments," Inf. Sci., vol. 513, pp. 386–396, Mar. 2020.

[41] M. A. Khan, ‘‘HCRNNIDS: Hybrid convolutional recurrent neural network-based network intrusion detection system,’’ Processes, vol. 9, no. 5, p. 834, May 2021.

[42] “Network intrusion detection model based on CNN and GRU.”

[43] S. M. Kasongo, “A deep learning technique for intrusion detection system using a recurrent neural networks based framework,” Comput. Commun., vol. 199, pp. 113–125, Feb. 2023.

[44] R. Vinayakumar, M. Alazab, K. Soman, P. Poornachandran, A. Al-Nemrat, and S. Venkatraman, “Deep learning approach for intelligent intrusion detection system,” IEEE Access, vol. 7, pp. 41525–41550, 2019.

[45] S. Jian, G. Pang, L. Cao, K. Lu, and H. Gao, “CURE: Flexible categorical data representation by hierarchical coupling learning,” IEEE Trans. Knowl. Data Eng., vol. 31, no. 5, pp. 853–866, 2018.

[46] J. McHugh, “Testing intrusion detection systems: A critique of the 1998 and 1999 DARPA intrusion detection system evaluations as performed by Lincoln Laboratory,” ACM Trans. Inf. Syst. Secur., vol. 3, no. 4, pp. 262–294, Nov. 2000.

[47] S. Alzughaibi and S. El Khediri, “A cloud intrusion detection systems based on DNN using backpropagation and PSO on the CSE-CIC-IDS2018 dataset.”

[48] A. Thakkar and R. Lohiya, “Fusion of statistical importance for feature selection in deep neural network-based intrusion detection system,” Inf. Fusion, vol. 90, pp. 353–363, 2023, doi: 10.1016/j.inffus.2022.09.026.

[49] A. El-Ghamry, A. Darwish, and A. E. Hassanien, “An optimized CNN-based intrusion detection system for reducing risks in smart farming,” Internet Things, vol. 22, 2023, Art. no. 100709, doi: 10.1016/j.iot.2023.100709.

[50] C.-S. Wu and S. Chen, “A heuristic intrusion detection approach using deep learning model,” in Proc. Int. Conf. Inf. Netw. (ICOIN), 2023, pp. 438–442, doi: 10.1109/ICOIN56518.2023.10049024.

[51] V. Hnamte, H. Nhung-Nguyen, J. Hussain, and Y. Hwa-Kim, “A novel two-stage deep learning model for network intrusion detection: LSTM-AE,” IEEE Access, vol. 11, pp. 37131–37148, 2023, doi: 10.1109/ACCESS.2023.3266979.

[52] S. Dong, Y. Xia, and T. Peng, “Network abnormal traffic detection model based on semi-supervised deep reinforcement learning,” IEEE Trans. Netw. Service Manag., vol. 18, no. 4, pp. 4197–4212, 2021, doi: 10.1109/TNSM.2021.3120804.

[53] C. Pelletier, G. I. Webb, and F. Petitjean, “Deep learning for the classification of Sentinel-2 image time series,” in Proc. IEEE Int. Geosci. Remote Sens. Symp. (IGARSS), 2019, pp. 461–464, doi: 10.1109/IGARSS.2019.8900123.
