One Agent to Guide Them All: Empowering MLLMs for Vision-and-Language Navigation via Explicit World Representation
Zerui Li, Hongpei Zheng, Fangguo Zhao, Aidan Chan, Jian Zhou, Sihao Lin, Shijie Li, and Qi Wu
arXiv preprint arXiv:2602.15400, 2026-02-17