Reinforcement Learning Enhanced AI-AugETM for Adaptive Real-Time Dose Adjustment in Oncology Phase I Trials

Authors

  • Xuantianyi Feng, Department of Computer Science and Engineering, University at Buffalo, Buffalo, NY, USA.
  • Jack L. Taylor, Department of Electrical Engineering and Computer Science, University of Kansas, Lawrence, KS, USA.

Keywords

reinforcement learning, AI-AugETM, dose adjustment, Phase I clinical trials, oncology, adaptive systems, real-time decision support, algorithmic fairness

Abstract

The pace of oncology drug development depends on the efficiency and safety of Phase I dose-escalation trials, which traditionally use rule-based or model-guided designs to identify the maximum tolerated dose. Recent advances in artificial intelligence have introduced exposure-toxicity joint modeling frameworks such as AI-AugETM, which integrate pharmacokinetic and toxicity data to personalize dose recommendations. However, these systems operate under static assumptions and cannot adapt in real time to evolving patient responses. This paper proposes a reinforcement learning enhanced extension of AI-AugETM that recasts dose adjustment as a continuous, adaptive decision-making process in which dosing policies are learned online as patient data accrue. We examine the architectural integration of reinforcement learning with the existing AI-AugETM infrastructure, focusing on state representation, reward design, and policy learning under high uncertainty. We then analyze the system-level implications of deploying such an adaptive agent in regulated clinical environments, including robustness to noisy data, trade-offs between exploration and patient safety, fairness across heterogeneous patient populations, and governance challenges related to interpretability and regulatory approval. We also discuss deployment considerations such as computational sustainability, integration with electronic health record systems, and real-time safety monitoring. By situating the technical innovation within a broader socio-technical context, this paper argues that reinforcement learning enhanced AI-AugETM offers a promising path toward personalized dose optimization, but that its success depends on governance structures that balance algorithmic autonomy with clinician oversight. The framework also raises questions about equity in algorithmic decision-making and the need for transparent, auditable models. We conclude by outlining a research agenda for future empirical validation and policy development.
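
The abstract does not specify the state, reward, or learning algorithm used, so the following is only an illustrative sketch of how the exploration versus patient safety trade-off might be encoded, not the authors' method. It assumes a toy setting with five dose levels, tabular Q-learning in which the state is the current dose and the action is the next dose, and a safety mask that limits moves to one dose level at a time and blocks doses whose estimated toxicity rate is too high. Every name and number here (`TRUE_TOX`, `SAFETY_CAP`, `BENEFIT`, `TOX_PENALTY`, `safe_actions`, `run_trial`) is a hypothetical choice made for illustration.

```python
import random

# Hypothetical toy problem: all numbers below are illustrative assumptions.
TRUE_TOX = [0.02, 0.08, 0.20, 0.45, 0.70]  # assumed true DLT probability per dose level
SAFETY_CAP = 0.40   # assumed cap on the estimated DLT rate of any recommended dose
BENEFIT = 0.3       # assumed benefit credited per dose level
TOX_PENALTY = 2.0   # assumed penalty for one dose-limiting toxicity (DLT)
N_DOSES = len(TRUE_TOX)


def safe_actions(dose, n_tox, n_pat):
    """Return the dose levels the agent may pick next.

    Moves are limited to one level down, stay, or one level up, and any dose
    whose posterior-mean DLT rate (Beta(1, 3) prior) exceeds SAFETY_CAP is
    masked out. The lowest neighbouring dose is the fallback if all are masked.
    """
    candidates = [d for d in (dose - 1, dose, dose + 1) if 0 <= d < N_DOSES]
    allowed = [d for d in candidates
               if (n_tox[d] + 1) / (n_pat[d] + 4) <= SAFETY_CAP]
    return allowed or [min(candidates)]


def run_trial(n_patients=60, alpha=0.1, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning where state = current dose and action = next dose."""
    rng = random.Random(seed)
    q = [[0.0] * N_DOSES for _ in range(N_DOSES)]
    n_tox, n_pat = [0] * N_DOSES, [0] * N_DOSES
    dose, history = 0, []
    for _ in range(n_patients):
        actions = safe_actions(dose, n_tox, n_pat)
        if rng.random() < epsilon:            # explore, but only inside the safe set
            nxt = rng.choice(actions)
        else:                                 # exploit the current value estimates
            nxt = max(actions, key=lambda a: q[dose][a])
        tox = rng.random() < TRUE_TOX[nxt]    # simulated patient outcome
        n_pat[nxt] += 1
        n_tox[nxt] += int(tox)
        reward = BENEFIT * nxt - TOX_PENALTY * int(tox)
        # Bootstrap only from actions that are safe in the successor state.
        best_next = max(q[nxt][a] for a in safe_actions(nxt, n_tox, n_pat))
        q[dose][nxt] += alpha * (reward + gamma * best_next - q[dose][nxt])
        history.append((dose, nxt, tox))
        dose = nxt
    return q, history


if __name__ == "__main__":
    q_table, history = run_trial()
    print("doses given:", [nxt for _, nxt, _ in history])
```

In this sketch the safety constraint is applied as a hard mask on the action set instead of as a penalty in the reward, so random exploration can never skip a dose level or select a dose already estimated to be too toxic. A clinical system would need a much richer state (for example pharmacokinetic exposure and patient covariates) and clinician review of every recommendation.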

Published

2025-03-15

How to Cite

Xuantianyi Feng, & Jack L. Taylor. (2025). Reinforcement Learning Enhanced AI-AugETM for Adaptive Real-Time Dose Adjustment in Oncology Phase I Trials. Computer Science and Engineering Transactions, 3(1). Retrieved from https://csetx.org/index.php/cset/article/view/171