Thinh, Nguyen Van, Tran Van Lang, and Van The Thanh. “RGTranCNet: Effective Image Captioning Model Using Cross-Attention and Semantic Knowledge.” Vietnam Journal of Science and Technology 64, no. 1 (July 15, 2025): 123–138. Accessed June 1, 2026. https://vjst.vast.vn/jst/article/view/22381.