I am a third-year Ph.D. candidate at the Singapore University of Technology and Design (SUTD, established in collaboration with MIT and Zhejiang University), advised by Prof. Roy Ka-Wei Lee. My research focuses on long-form generation and the long-context capabilities of LLMs, spanning data generation, chain-of-thought and planning, RL-based training and alignment, and end-to-end evaluation for long text, code, and reasoning.
I am currently a research intern at ByteDance Seed, working on large-scale pretraining data synthesis and Skill-Augmented Pretraining. Previously, I interned at Kimi (Moonshot AI), working on the long-context capabilities of Kimi-K2.5, and at Zhipu AI, working on long-form generation with RL for the GLM-4.x series.
🔥 News
- 2026.01: LongWriter-Zero, pure RL for ultra-long text generation without SFT, accepted to ICLR 2026 as an Oral (top ~1.8%)!
- 2026: Seed2.0 Model Card released by ByteDance Seed.
- 2026: Kimi K2.5: Visual Agentic Intelligence released by Moonshot AI.
- 2025.12: Started as an Algorithm Research Intern at ByteDance Seed, working on LLM pretraining data synthesis and Skill-Augmented Pretraining.
- 2025.06 – 2025.12: Algorithm Research Intern at Kimi (Moonshot AI), contributing to the long-context capabilities of Kimi-K2.5.
- 2025.01: LongGenBench accepted to the ICLR 2025 main track.
- 2024.09: Joined Zhipu AI as an Algorithm Research Intern, contributing to the GLM-4.1/4.5 series.
💼 Experience
ByteDance Seed – Algorithm Research Intern Dec. 2025 – Present · Beijing
- Research on pretraining-level data synthesis for LLMs: designed high-quality pretraining data pipelines and studied the impact of large-scale synthetic data on model capabilities.
- Researched Skill-Augmented Pretraining: built structured skill libraries and explored skill-data integration to strengthen model knowledge and capability expression.
Kimi (Moonshot AI) – Algorithm Research Intern Jun. – Dec. 2025 · Beijing
- Core contributor to iterating the long-context capabilities of Kimi-K2.5, covering synthetic data construction for long text and code as well as data pipelines for long code generation.
- Follow-up work includes Kimi Linear and Kimi-K2-Thinking.
Zhipu AI – Algorithm Research Intern Sep. 2024 – Jun. 2025 · Beijing
- Core contributor to the GLM-Zero series supporting GLM-4.1/4.5; work covered long chain-of-thought data construction, reward model design, RLHF alignment, and end-to-end benchmark evaluation.
- Proposed SuperWriter: agent-guided hierarchical SFT + hierarchical DPO for long-form writing.
- Proposed LongWriter-Zero: a pure-RL strategy for ultra-long text generation (ICLR 2026 Oral).
🎓 Education
- Sep. 2023 – Present: Ph.D. in Natural Language Processing, Singapore University of Technology and Design (SUTD). Advisor: Prof. Roy Ka-Wei Lee.
- Sep. 2024 – Jul. 2025: Visiting Ph.D. Student, Tsinghua University (THU), Beijing.
- Sep. 2018 – Jun. 2022: B.Sc. in Mathematics, Huazhong Agricultural University (HZAU), Wuhan.
📝 Selected Publications
* denotes equal contribution | For a complete list, see Google Scholar
Technical Reports
- Co-author – Seed2.0 Model Card: Towards Intelligence Frontier for Real-World Complexity. ByteDance Seed, 2026.
- Co-author – Kimi K2.5: Visual Agentic Intelligence. Moonshot AI, 2026.
- Co-author – Kimi Linear: An Expressive, Efficient Attention Architecture. Moonshot AI, 2025.
- Co-author – GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models. Zhipu AI, 2025.
Papers
- [ICLR 2026 Oral] Yuhao Wu*, Yushi Bai*, Zhiqiang Hu, Roy Ka-Wei Lee, Juanzi Li. LongWriter-Zero: Mastering Ultra-Long Text Generation via Reinforcement Learning. [Code]
- [arXiv 2026] Yuhao Wu, Maojia Song, Yihuai Lan, Lei Wang, Zhiqiang Hu, Yao Xiao, Heng Zhou, Weihua Zheng, Dylan Raharja, Soujanya Poria, Roy Ka-Wei Lee. From Perception to Action: An Interactive Benchmark for Vision Reasoning.
- [arXiv 2026] Shuangshuang Ying, Zheyu Wang, Yunjian Peng, Jin Chen, Yuhao Wu, Hongbin Lin, Dingyu He, Siyi Liu, Gengchen Yu, YinZhu Piao, Yuchen Wu, Xin Gui, Zhongyuan Peng, Xin Li, Xeron Du, Libo Qin, YiXin Cao, Ge Zhang, Stephen Huang. Retrieval-Infused Reasoning Sandbox: A Benchmark for Decoupling Retrieval and Reasoning Capabilities.
- [ACM MM 2025] Shangqing Tu, Yucheng Wang, Daniel Zhang-Li, Yushi Bai, Jifan Yu, Yuhao Wu, Lei Hou, Huiqin Liu, Zhiyuan Liu, Bin Xu, Juanzi Li. LongWriter-V: Enabling Ultra-Long and High-Fidelity Generation in Vision-Language Models. [Code]
- [IJCAI 2025] Ziyu Ge*, Yuhao Wu*, Daniel Wai Kit Chin, Roy Ka-Wei Lee, Rui Cao. Resolving Conflicting Evidence in Automated Fact-Checking: A Study on Retrieval-Augmented LLMs.
- [ICLR 2025] Yuhao Wu, Ming Shan Hee, Zhiqiang Hu, Roy Ka-Wei Lee. LongGenBench: Benchmarking Long-Form Generation in Long Context LLMs. [Code]
- [AAAI 2025] Weihua Zheng, Xin Huang, Zhengyuan Liu, Tarun Kumar Vangani, Bowei Zou, Xiyan Tao, Yuhao Wu, Ai Ti Aw, Nancy F. Chen, Roy Ka-Wei Lee. AdaMCoT: Rethinking Cross-Lingual Factual Reasoning through Adaptive Multilingual Chain-of-Thought.
- [arXiv 2025] Yuhao Wu*, Yushi Bai*, Zhiqiang Hu, Juanzi Li, Roy Ka-Wei Lee. SuperWriter: Reflection-Driven Long-Form Writing with LLMs.
- [arXiv 2025] Yuhao Wu, Yushi Bai, Zhiqiang Hu, Shangqing Tu, Ming Shan Hee, Juanzi Li, Roy Ka-Wei Lee. Shifting Long-Context LLMs Research from Input to Output.
- [EMNLP 2023] Yuhao Wu, Karthick Sharma, Chun Wei Seah, Shuhao Zhang. SentiStream: A Co-Training Framework for Adaptive Online Sentiment Analysis in Evolving Data Streams.
- [arXiv 2023] Yuhao Wu*, Tongjun Shi*, Karthick Sharma, Chun Wei Seah, Shuhao Zhang. Online Continual Knowledge Learning for Language Models.