Cybersecurity-Oriented Detection of Hallucinated API Responses Using Hybrid LS-SVM and Attention Mechanisms
Keywords:
hallucination detection, least squares support vector machine, attention mechanism, API security, cybersecurity infrastructure, interpretable machine learning, socio-technical governance
Abstract
The proliferation of large language model (LLM) based application programming interfaces (APIs) has introduced unprecedented capabilities for automated content generation, yet it has simultaneously amplified the risk of hallucinated outputs—factually incorrect, semantically incoherent, or maliciously fabricated responses—that threaten the security and reliability of downstream systems. This paper proposes a cybersecurity-oriented detection framework that hybridizes least squares support vector machines (LS-SVM) with attention mechanisms to identify hallucinated API responses in real-time operational environments. The framework is designed not merely as a classifier but as an infrastructure layer that integrates with existing API gateways and logging pipelines, enabling continuous monitoring and governance of LLM outputs. We analyze the structural trade-offs between detection accuracy, latency, and computational sustainability, emphasizing the necessity of lightweight models that can be deployed at scale without imposing prohibitive overhead on production systems. The attention component captures contextual dependencies across response sequences, while the LS-SVM provides a regularized decision boundary resistant to overfitting in high-dimensional feature spaces derived from semantic embeddings. From a governance perspective, the framework supports interpretability through SHAP-based feature attribution, allowing system administrators to trace the causes of detected hallucinations and to refine API policies accordingly. The paper further discusses deployment architectures, fairness implications across diverse user populations, and the evolving policy landscape surrounding LLM-generated content. By situating hallucination detection within a broader socio-technical infrastructure, we argue that hybrid statistical and neural approaches offer a robust path toward trustworthy automation. 
Experimental illustrations using benchmark datasets and real-world API logs demonstrate the efficacy of the proposed method while highlighting areas for future research in adversarial robustness and cross-domain generalization.
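As a rough illustration of the hybrid idea summarized in the abstract, the following minimal sketch pools per-token embeddings with a single-query attention step and then trains an LS-SVM classifier by solving the standard linear system of Suykens and Vandewalle. Everything concrete here is an assumption made for illustration and is not taken from the paper: the function names, the fixed query vector, the RBF kernel, the hyperparameters gamma and sigma, and the synthetic data that stands in for semantic embeddings of API responses.

```python
import numpy as np


def attention_pool(tokens: np.ndarray, query: np.ndarray) -> np.ndarray:
    """Pool a (T, d) token-embedding matrix into one d-vector.

    Single-query scaled dot-product attention; an illustrative stand-in
    for the attention component, not the paper's architecture.
    """
    scores = tokens @ query / np.sqrt(tokens.shape[1])
    scores = scores - scores.max()          # numerical stability for softmax
    weights = np.exp(scores)
    weights /= weights.sum()                # attention weights sum to 1
    return weights @ tokens


def rbf_kernel(a: np.ndarray, b: np.ndarray, sigma: float = 1.0) -> np.ndarray:
    """RBF kernel matrix between the rows of a and the rows of b."""
    sq = (a ** 2).sum(1)[:, None] + (b ** 2).sum(1)[None, :] - 2.0 * a @ b.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma ** 2))


def lssvm_fit(X: np.ndarray, y: np.ndarray, gamma: float = 10.0, sigma: float = 1.0):
    """Solve the LS-SVM classifier system for the bias b and multipliers alpha.

    [ 0   y^T           ] [ b     ]   [ 0 ]
    [ y   Omega + I/gamma ] [ alpha ] = [ 1 ]
    with Omega_ij = y_i * y_j * K(x_i, x_j).
    """
    n = len(y)
    omega = np.outer(y, y) * rbf_kernel(X, X, sigma)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = y
    A[1:, 0] = y
    A[1:, 1:] = omega + np.eye(n) / gamma
    rhs = np.concatenate(([0.0], np.ones(n)))
    sol = np.linalg.solve(A, rhs)
    return sol[0], sol[1:]


def lssvm_decision(X_train, y_train, alpha, b, X_new, sigma: float = 1.0):
    """Decision values; the sign gives the predicted label in {-1, +1}."""
    return rbf_kernel(X_new, X_train, sigma) @ (alpha * y_train) + b


# Synthetic stand-in data: +1 marks "hallucinated", -1 marks "faithful".
rng = np.random.default_rng(0)
d, T, n_per_class = 4, 6, 20
query = np.ones(d)  # hypothetical fixed query; a real system would learn it

X, y = [], []
for label in (-1.0, 1.0):
    for _ in range(n_per_class):
        tokens = rng.normal(loc=label, scale=0.3, size=(T, d))
        X.append(attention_pool(tokens, query))
        y.append(label)
X, y = np.asarray(X), np.asarray(y)

b, alpha = lssvm_fit(X, y)
pred = np.sign(lssvm_decision(X, y, alpha, b, X))
```

Because LS-SVM training reduces to one dense linear solve rather than a quadratic program, a model of this kind is cheap to refit, which is in keeping with the abstract's emphasis on lightweight deployment. The pooled vectors in `X` are also the kind of fixed-length features to which a SHAP explainer could be applied for attribution.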
License
Copyright (c) 2025 Computer Science and Engineering Transactions

This article is published under the Creative Commons Attribution 4.0 International License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.



