Optimizing cloud-based intrusion detection systems through hybrid data sampling and feature selection for enhanced anomaly detection

To enhance detection accuracy in network intrusion scenarios, this study proposes an optimized intrusion detection system (IDS) framework that integrates data sampling, feature selection, and anomaly detection techniques. The framework uses a random forest (RF) classifier together with a genetic algorithm to optimize sampling ratios and identify critical features, while the isolation forest algorithm detects and excludes outliers, improving dataset quality and classification performance. Evaluated on the UNSW-NB15 dataset, which comprises over 2.5 million records and 42 features, the proposed framework shows marked improvements in anomaly detection, including lower false alarm rates and better identification of rare threats such as shellcode, worms, and backdoors. Experimental results show that the RF-based model achieves an F1 score of 91.8% and an area under the curve (AUC) of 96%, outperforming traditional machine learning models and standalone RF classifiers. Integrating extreme gradient boosting (XGB) and its optimized variant, XGBGA, improves the framework further: XGBGA achieves the highest performance, with an F1 score of 92.8% and an AUC of 97%. These findings underscore the importance of data optimization strategies for the accuracy and reliability of IDSs, particularly when handling imbalanced datasets and diverse network traffic. Future work will focus on adding real-time processing capabilities for streaming data and on extending the framework’s applicability to domains such as fraud detection and cybersecurity, where precise anomaly detection is essential.
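As a reading aid, the evaluation metrics quoted above (F1 score, AUC, and false alarm rate) can be illustrated with a minimal sketch. The labels and scores below are toy values invented for illustration, not data or results from the study, and the function names are hypothetical. The sketch assumes binary labels (1 = attack, 0 = normal) and a fixed decision threshold of 0.5.

```python
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn) for binary labels where 1 = attack, 0 = normal."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 = 2*TP / (2*TP + FP + FN), the harmonic mean of precision and recall."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def false_alarm_rate(fp: int, tn: int) -> float:
    """Fraction of normal records that were flagged as attacks: FP / (FP + TN)."""
    denom = fp + tn
    return fp / denom if denom else 0.0


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random attack record scores higher than
    a random normal record (ties count as one half). Pairwise, so O(n*m)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one record of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy, imbalanced example: 4 attack records and 6 normal records (invented values).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.2, 0.1, 0.05]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
f1 = f1_score(tp, fp, fn)
far = false_alarm_rate(fp, tn)
auc = roc_auc(y_true, scores)

print(f"F1 = {f1:.3f}")                 # F1 = 0.750
print(f"False alarm rate = {far:.3f}")  # False alarm rate = 0.167
print(f"AUC = {auc:.3f}")               # AUC = 0.875
```

Unlike F1 and the false alarm rate, AUC does not depend on the threshold, which is one reason the two kinds of metric are usually reported together for imbalanced intrusion data.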