Advancing Infrastructure-as-Code Resilience through Generative AI Agents for Predictive Remediation and Autonomous Security Enforcement
Abstract
Infrastructure as Code (IaC) is the accepted approach to provisioning cloud infrastructure declaratively. Yet misconfigurations in IaC remain the primary cause of cloud security incidents, accounting for 67% of all disclosed cloud breaches, and the current set of countermeasures, such as rule-based static scanners, policy-as-code tools, and manual review gates, is inadequate either to prevent such misconfigurations or to address them through autonomous remediation. In this paper, we propose GenSecOps, a multi-agent generative AI system comprising four cooperating agents that prevent and remediate misconfigurations in IaC: the IaC Understanding Agent (IUA), which builds a semantic resource graph from the IaC artefact; the Risk Prediction Agent (RPA), which uses a hybrid Transformer and Graph Neural Network model to produce risk mappings; the Generative Remediation Agent (GRA), which uses those risk mappings to generate corrected, policy-compliant IaC templates; and the Autonomous Enforcement Orchestrator (AEO), which deploys the remediations and recovers from configuration drift. Experiments on a corpus of 48,000 IaC templates (Terraform, CloudFormation, Kubernetes) show that GenSecOps achieves a misconfiguration detection F1 score of 0.934, a 73.2% reduction in critical findings over rule-based baselines, an 81.5% improvement in mean time to remediate (MTTR), and drift-recovery latency below 4.2 minutes. These results demonstrate that generative AI agents provide a viable, deployable foundation for self-healing, autonomously secured cloud-native infrastructure.
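To make the pipeline concrete, the toy sketch below mimics the first two stages described in the abstract: an IaC Understanding step that turns parsed Terraform-style resources into a semantic resource graph, followed by a risk-flagging step. All names (`Resource`, `build_resource_graph`, `flag_risks`) and the hard-coded rules are illustrative stand-ins invented for this sketch; the paper's RPA uses a learned Transformer/GNN hybrid, not fixed rules.

```python
# Illustrative sketch only — not the paper's implementation.
from dataclasses import dataclass, field

@dataclass
class Resource:
    name: str
    rtype: str          # e.g. "aws_s3_bucket"
    attrs: dict
    refs: list = field(default_factory=list)  # names of referenced resources

def build_resource_graph(resources):
    """IUA stand-in: map each resource name to its node and outgoing reference edges."""
    return {r.name: {"node": r, "edges": list(r.refs)} for r in resources}

def flag_risks(graph):
    """RPA stand-in: fixed rules where the paper uses a Transformer/GNN risk model."""
    findings = []
    for name, entry in graph.items():
        r = entry["node"]
        if r.rtype == "aws_s3_bucket" and r.attrs.get("acl") == "public-read":
            findings.append((name, "CRITICAL", "publicly readable bucket"))
        if r.rtype == "aws_security_group" and "0.0.0.0/0" in r.attrs.get("ingress_cidrs", []):
            findings.append((name, "HIGH", "security group open to the internet"))
    return findings

resources = [
    Resource("logs", "aws_s3_bucket", {"acl": "public-read"}),
    Resource("web_sg", "aws_security_group", {"ingress_cidrs": ["0.0.0.0/0"]}, refs=["logs"]),
]
for name, severity, msg in flag_risks(build_resource_graph(resources)):
    print(f"{severity}: {name}: {msg}")
```

In the full system, findings like these would be handed to the GRA to synthesize corrected templates and to the AEO for enforcement; the sketch stops at detection.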
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.