Biography

My name is Xiyang Wu. I am a Ph.D. student in Electrical and Computer Engineering at the University of Maryland, College Park, and a member of the GAMMA group. My research advisor is Prof. Dinesh Manocha. I hold a Master’s degree from the Georgia Institute of Technology, where I worked with Prof. Matthew Gombolay. Before that, I earned my Bachelor’s in Engineering from Tianjin University, supervised by Prof. Xiaodong Zhang.

My research explores multi-modal foundation models, with an emphasis on hallucination detection and mitigation and on their physical reasoning capabilities under varied, real-world conditions. In parallel, I investigate deploying neural networks and large language models for robotic decision-making and navigation, aiming to enhance robots’ situational awareness, robustly interpret human behaviors and intentions, and adapt their actions accordingly across complex, dynamic, and collaborative environments.

Please find my list of publications here.

I am actively seeking internship opportunities for Fall 2025, Spring 2026, or Summer 2026. If you’re interested in my research, please feel free to reach out.

I also welcome collaboration on potential research projects. Please feel free to contact me if you’d like to connect or discuss ideas.

Research Interest

  • Robotics
  • Reinforcement Learning
  • Multi-Modality
  • Vision Language Model

Education

  • Ph.D. in Electrical and Computer Engineering, University of Maryland, College Park, 2021 - 2026 (Expected)
  • M.S. in Electrical and Computer Engineering, Georgia Institute of Technology, 2019 - 2021
  • B.Eng. in Electrical Engineering (Honors Class), Tianjin University, 2015 - 2019

News

Jun 2025: One paper was accepted by IROS 2025!
May 2025: We released a technical report introducing VideoHallu, a novel benchmark for hallucinations in synthetic video understanding over common sense and physics, with QA pairs requiring human-level reasoning. The benchmark evaluates SoTA MLLMs and shows that post-training them on commonsense/physics data improves model reasoning. The project webpage is released here.
Sep 2024: AUTOHALLUSION was accepted by EMNLP 2024!
Jun 2024: LANCAR and AGL-NET were accepted by IROS 2024!
Jun 2024: We released a technical report introducing AUTOHALLUSION, a novel automatic benchmark generation approach that harnesses a few principal strategies to create diverse hallucination examples by probing the language modules in LVLMs for context cues. The project webpage is released here.
Apr 2024: One paper was accepted by VLADR Workshop at CVPR 2024!
Feb 2024: HallusionBench was accepted by CVPR 2024! The data, evaluation, and code are available on GitHub.
Feb 2024: We released a technical report highlighting the critical robustness and safety issues associated with integrating large language models (LLMs) and vision-language models (VLMs) into robotics applications. The project webpage is released here.
Oct 2023: We released an early report and analysis on failure modes of GPT-4V and LLaVA-1.5. Stay tuned for the release of our dataset HallusionBench!
Oct 2023: iPLAN received the Best Paper Award at the MRS Workshop at IROS 2023!
Aug 2023: iPLAN was accepted by CoRL 2023 as an oral presentation (acceptance rate: 6.6%)!
Jul 2023: One paper was accepted by Digital Signal Processing!
Aug 2021: Started Ph.D. at University of Maryland, College Park.
Aug 2019: Started M.S. at Georgia Institute of Technology.


Selected Publications

On the Vulnerability of LLM/VLM-Controlled Robotics
Xiyang Wu, Souradip Chakraborty, Ruiqi Xian, Jing Liang, Tianrui Guan, Fuxiao Liu, Brian Sadler, Dinesh Manocha, Amrit Singh Bedi
The IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2025) 2025.
[paper] [webpage] [code]
VideoHallu: Evaluating and Mitigating Multi-modal Hallucinations on Synthetic Video Understanding
Zongxia Li*, Xiyang Wu*, Yubin Qin, Guangyao Shi, Hongyang Du, Dinesh Manocha, Tianyi Zhou, Jordan Lee Boyd-Graber (* indicates equal contributions)
arXiv preprint, 2025.
[paper] [webpage] [code]
AUTOHALLUSION: Automatic Generation of Hallucination Benchmarks for Vision-Language Models
Xiyang Wu*, Tianrui Guan*, Dianqi Li, Shuaiyi Huang, Xiaoyu Liu, Xijun Wang, Ruiqi Xian, Abhinav Shrivastava, Furong Huang, Jordan Lee Boyd-Graber, Tianyi Zhou, Dinesh Manocha (* indicates equal contributions)
The 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP 2024) 2024.
[paper] [webpage] [code]
LANCAR: Leveraging Language for Context-Aware Robot Locomotion in Unstructured Environments
Chak Lam Shek*, Xiyang Wu*, Wesley A. Suttle, Carl Busart, Erin Zaroukian, Dinesh Manocha, Pratap Tokekar, Amrit Singh Bedi (* indicates equal contributions)
The IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2024) 2024.
[paper] [webpage]
HallusionBench: An Advanced Diagnostic Suite for Entangled Language Hallucination and Visual Illusion in Large Vision-Language Models
Tianrui Guan*, Fuxiao Liu*, Xiyang Wu, Ruiqi Xian, Zongxia Li, Xiaoyu Liu, Xijun Wang, Lichang Chen, Furong Huang, Yaser Yacoob, Dinesh Manocha, Tianyi Zhou (* indicates equal contributions)
The IEEE / CVF Computer Vision and Pattern Recognition Conference (CVPR 2024) 2024.
[paper] [webpage] [code]
iPLAN: Intent-Aware Planning in Heterogeneous Traffic via Distributed Multi-Agent Reinforcement Learning
Xiyang Wu, Rohan Chandra, Tianrui Guan, Amrit Singh Bedi, Dinesh Manocha
7th Annual Conference on Robot Learning (CoRL 2023) 2023. Oral presentation (acceptance rate: 6.6%).
Abridged in IROS 2023 Advances in Multi-Agent Learning - Coordination, Perception and Control Workshop. Best Paper and Presentation Award.
[paper] [webpage] [code]