Cybersecurity-Oriented Detection of Hallucinated API Responses Using Hybrid LS-SVM and Attention Mechanisms

Authors

  • Wayne Crawford, Department of Computer Science, Binghamton University, Binghamton, NY, USA.
  • Rishi Gupta, Department of Computer Science and Engineering, University at Buffalo, Buffalo, NY, USA.

Keywords

hallucination detection, least squares support vector machine, attention mechanism, API security, cybersecurity infrastructure, interpretable machine learning, socio-technical governance

Abstract

The proliferation of application programming interfaces (APIs) backed by large language models (LLMs) has enabled automated content generation at scale, but it has also amplified the risk of hallucinated outputs (factually incorrect, semantically incoherent, or maliciously fabricated responses) that threaten the security and reliability of downstream systems. This paper proposes a cybersecurity-oriented detection framework that combines least squares support vector machines (LS-SVM) with attention mechanisms to identify hallucinated API responses in real-time operational environments. The framework is designed not merely as a classifier but as an infrastructure layer that integrates with existing API gateways and logging pipelines, enabling continuous monitoring and governance of LLM outputs. We analyze the structural trade-offs among detection accuracy, latency, and computational sustainability, emphasizing the need for lightweight models that can be deployed at scale without imposing prohibitive overhead on production systems. The attention component captures contextual dependencies across response sequences, while the LS-SVM provides a regularized decision boundary that resists overfitting in the high-dimensional feature spaces derived from semantic embeddings. From a governance perspective, the framework supports interpretability through SHAP-based feature attribution, allowing system administrators to trace the causes of detected hallucinations and to refine API policies accordingly. The paper further discusses deployment architectures, fairness implications across diverse user populations, and the evolving policy landscape surrounding LLM-generated content. By situating hallucination detection within a broader socio-technical infrastructure, we argue that hybrid statistical and neural approaches offer a robust path toward trustworthy automation. Experimental illustrations using benchmark datasets and real-world API logs demonstrate the efficacy of the proposed method and highlight areas for future research in adversarial robustness and cross-domain generalization.
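
As a rough illustration of how the two components named in the abstract could fit together, the following minimal sketch pools per-token embeddings with a single-query softmax attention and then fits an LS-SVM classifier by solving the standard LS-SVM linear system. It is not the authors' implementation: the function names, the fixed query vector, the RBF kernel, the hyperparameter values, and the two-dimensional toy embeddings are all assumptions made for illustration only.

```python
import math

# Hypothetical attention query; in a real system this would be learned.
QUERY = [1.0, 1.0]

def attention_pool(tokens, query):
    """Softmax-weighted average of token vectors (scaled dot-product scores)."""
    scale = math.sqrt(len(query))
    scores = [sum(t * q for t, q in zip(tok, query)) / scale for tok in tokens]
    top = max(scores)  # subtract the max score for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    dim = len(tokens[0])
    return [sum(w * tok[d] for w, tok in zip(weights, tokens)) / total
            for d in range(dim)]

def rbf(u, v, sigma=1.0):
    """Gaussian (RBF) kernel; sigma is an assumed bandwidth."""
    return math.exp(-sum((a - b) ** 2 for a, b in zip(u, v)) / (2 * sigma ** 2))

def solve(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [r] for row, r in zip(A, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def lssvm_fit(X, y, gamma=10.0):
    """Solve [[0, y^T], [y, Omega + I/gamma]] [b; alpha] = [0; 1]."""
    n = len(X)
    A = [[0.0] + [float(v) for v in y]]
    for i in range(n):
        row = [float(y[i])]
        for j in range(n):
            omega = y[i] * y[j] * rbf(X[i], X[j])
            row.append(omega + (1.0 / gamma if i == j else 0.0))
        A.append(row)
    solution = solve(A, [0.0] + [1.0] * n)
    return solution[0], solution[1:]  # bias b, dual weights alpha

def lssvm_score(x, X, y, b, alpha):
    """Decision value; a positive value means 'hallucinated' in this toy setup."""
    return sum(a * yi * rbf(x, xi) for a, yi, xi in zip(alpha, y, X)) + b

# Toy token embeddings per response (made up): -1 = grounded, +1 = hallucinated.
responses = [
    ([[1.0, 0.1], [0.9, 0.2]], -1),
    ([[0.8, 0.0], [1.0, 0.3], [0.9, 0.1]], -1),
    ([[0.1, 1.0], [0.2, 0.8]], 1),
    ([[0.0, 0.9], [0.3, 1.0]], 1),
]
train_X = [attention_pool(tokens, QUERY) for tokens, _ in responses]
train_y = [label for _, label in responses]
b, alpha = lssvm_fit(train_X, train_y)

new_response = [[0.1, 0.9], [0.2, 1.0]]
flagged = lssvm_score(attention_pool(new_response, QUERY), train_X, train_y, b, alpha) > 0
print("flag as hallucinated:", flagged)
```

Because LS-SVM replaces the inequality constraints of a standard SVM with equalities, training reduces to one linear solve, which is part of what makes it a plausible candidate for the lightweight deployment the abstract emphasizes. A production version would likely use a numerical library for the solve and learned embeddings in place of the toy vectors.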

Published

2025-03-15

How to Cite

Wayne Crawford, & Rishi Gupta. (2025). Cybersecurity-Oriented Detection of Hallucinated API Responses Using Hybrid LS-SVM and Attention Mechanisms. Computer Science and Engineering Transactions, 3(1). Retrieved from https://csetx.org/index.php/cset/article/view/166