TY - GEN
T1 - Implementation of ensemble learning and feature selection for performance improvements in anomaly-based intrusion detection systems
AU - Fitni, Qusyairi Ridho Saeful
AU - Ramli, Kalamullah
N1 - Publisher Copyright: © 2020 IEEE.
PY - 2020/7
Y1 - 2020/7
N2 - In recent years, data security in organizational information systems has become a serious concern. Many attacks are becoming less detectable by firewalls and antivirus software. To improve security, intrusion detection systems (IDSs) are used to detect anomalies in network traffic. Currently, IDS technology has performance issues regarding detection accuracy, detection times, false alarm notifications, and unknown attack detection. Several studies have applied machine-learning approaches as solutions. This study used an ensemble learning approach that integrates the benefits of the individual detection algorithms. We compared seven single classifiers to identify the most appropriate base classifiers for ensemble learning. Based on this comparison, logistic regression, decision trees, and gradient boosting were chosen for our ensemble model. The Communications Security Establishment and Canadian Institute for Cybersecurity 2018 (CSE-CIC-IDS2018) dataset was used to evaluate the proposed model. Spearman's rank correlation coefficient was used to identify data features that could be omitted. The experimental results showed that 23 of the 80 features were selected, and the model achieved the following scores: final accuracy, 98.8%; precision, 98.8%; recall, 97.1%; and F1, 97.9%.
AB - In recent years, data security in organizational information systems has become a serious concern. Many attacks are becoming less detectable by firewalls and antivirus software. To improve security, intrusion detection systems (IDSs) are used to detect anomalies in network traffic. Currently, IDS technology has performance issues regarding detection accuracy, detection times, false alarm notifications, and unknown attack detection. Several studies have applied machine-learning approaches as solutions. This study used an ensemble learning approach that integrates the benefits of the individual detection algorithms. We compared seven single classifiers to identify the most appropriate base classifiers for ensemble learning. Based on this comparison, logistic regression, decision trees, and gradient boosting were chosen for our ensemble model. The Communications Security Establishment and Canadian Institute for Cybersecurity 2018 (CSE-CIC-IDS2018) dataset was used to evaluate the proposed model. Spearman's rank correlation coefficient was used to identify data features that could be omitted. The experimental results showed that 23 of the 80 features were selected, and the model achieved the following scores: final accuracy, 98.8%; precision, 98.8%; recall, 97.1%; and F1, 97.9%.
KW - CSE-CIC-IDS2018
KW - ensemble learning method
KW - features selection
KW - intrusion detection
UR - http://www.scopus.com/inward/record.url?scp=85091975593&partnerID=8YFLogxK
U2 - 10.1109/IAICT50021.2020.9172014
DO - 10.1109/IAICT50021.2020.9172014
M3 - Conference contribution
AN - SCOPUS:85091975593
T3 - Proceedings - 2020 IEEE International Conference on Industry 4.0, Artificial Intelligence, and Communications Technology, IAICT 2020
SP - 118
EP - 124
BT - Proceedings - 2020 IEEE International Conference on Industry 4.0, Artificial Intelligence, and Communications Technology, IAICT 2020
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2020 IEEE International Conference on Industry 4.0, Artificial Intelligence, and Communications Technology, IAICT 2020
Y2 - 7 July 2020 through 8 July 2020
ER -