😎 About Me

I am Chengyou Jia (贾成铕), a final-year Ph.D. student in Computer Science at Xi’an Jiaotong University. I will join the Hunyuan Team at Tencent as a researcher (Qingyun Project). My Ph.D. advisor is Prof. Minnan Luo, and I also work closely with Prof. Xiaojun Chang. Previously, I was a visiting student in Singapore under the supervision of Prof. Ivor Tsang, and a research intern at Shanghai AI Lab, supervised by Dr. Zhiyong Wu. I received my B.E. degree from Xi’an Jiaotong University in 2021.

I have authored several publications in top-tier conferences and journals, including CVPR, AAAI, ACL, and IEEE TIP. I also serve as a reviewer for conferences such as NeurIPS, ICML, CVPR, and ECCV.

Research Interests

I work on computer vision and multimodal learning. My current research interests and past experience can be summarized as follows:

  • Agentic Vision Generation: Consistent Generation, Video Generation, Reward Models and RL for Visual Generation
  • Multimodal Agents: GUI Agents, Multi-Agent Systems

🔥 News

  • 2026.02: Our papers [PaCo-RL, Chain-of-Merging] are accepted by CVPR 2026. See you in Denver. 🎉🎉
  • 2025.11: One paper is accepted by AAAI 2026. 🎉🎉
  • 2025.10: CoFFT is accepted by NeurIPS 2025. 🎉🎉
  • 2025.06: Our papers [T2IS, DenseDiT, AutoGPS] were recently released. 🎉🎉
  • 2025.05: Three papers are accepted by ACL 2025. 🎉🎉
  • 2025.02: ChatGen is accepted by CVPR 2025. 🎉🎉
  • 2025.01: OS-Atlas is accepted by ICLR 2025 (Spotlight). 🎉🎉
  • 2024.12: One paper is accepted by IEEE TCSVT. 🎉🎉
  • 2024.11: Our papers [ChatGen, OS-Atlas, AgentStore, OS-Genesis] were recently released. 🎉🎉
  • 2024.10: Ended a fulfilling internship at Shanghai AI Lab and started as a visiting student in Singapore.
  • 2024.07: One paper is accepted by ACM-MM 2024. See you in Melbourne, Australia. 🎉🎉
  • 2023.11: Two papers are accepted by AAAI 2024. 🎉🎉
  • 2023.09: One paper is accepted by IEEE TIP. 🎉🎉

📖 Education

  • 2021.09 - 2026.06 (expected), M.S. + Ph.D. Student, Computer Science, Xi’an Jiaotong University.
  • 2017.09 - 2021.06, B.S. in Computer Science, Xi’an Jiaotong University.

💻 Internships

  • 2026.03 - Present, Researcher @ Tencent Hunyuan Team.
  • 2024.11 - 2025.11, Research Intern @ CFAR, A*STAR. Focus on Multimodal Agents for Image Generation.
  • 2024.03 - 2024.10, Research Intern @ Shanghai AI Lab. Focus on Multimodal Agents for OS (Operating System).
  • 2022.12 - 2024.03, Research Intern @ SGIT AI Lab, State Grid Corporation of China. Focus on Controllable Image Generation.

๐Ÿ“ Selected Publications

CVPR 2026
PaCo-RL

PaCo-RL: Advancing Reinforcement Learning for Consistent Image Generation with Pairwise Reward Modeling [CCF-A]
Bowen Ping*, Chengyou Jia*, Minnan Luo, Changliang Xia, Xin Shen, Zhuohang Dang, Hangwei Qian

(* means equal contributions)

Project Page   Datasets   Code

ICLR 2026 Workshop
Flow-Factory

Flow-Factory: A Unified Framework for Reinforcement Learning in Flow-Matching Models [Workshop]
Bowen Ping, Chengyou Jia, Minnan Luo, Hangwei Qian, Ivor Tsang

Code

Preprint

Why Settle for One? Text-to-ImageSet Generation and Evaluation 🔥 [Preprint]
Chengyou Jia, Xin Shen, Zhuohang Dang, Changliang Xia, Weijia Wu, Xinyu Zhang, Hangwei Qian, Ivor Tsang, Minnan Luo

Project Page   Datasets   Code

Preprint

From Ideal to Real: Unified and Data-Efficient Dense Prediction for Real-World Scenarios 🔥 [Preprint]
Changliang Xia*, Chengyou Jia*, Zhuohang Dang, Minnan Luo

(* means equal contributions)

Code   Project Page   Models   Datasets

CVPR 2025

ChatGen: Automatic Text-to-Image Generation From FreeStyle Chatting 🔥🔥 [CCF-A]
Chengyou Jia*, Changliang Xia*, Zhuohang Dang, Weijia Wu, Hangwei Qian, Minnan Luo

(* means equal contributions)

Code   Project Page   Datasets   Models

ACL 2025 Findings

AgentStore: Scalable Integration of Heterogeneous Agents As Specialized Generalist Computer Assistant 🔥🔥
Chengyou Jia, Minnan Luo, Zhuohang Dang, Qiushi Sun, Fangzhi Xu, Junlin Hu, Tianbao Xie, Zhiyong Wu

Code   Project Page

ICLR 2025

OS-ATLAS: A Foundation Action Model for Generalist GUI Agents 🔥🔥 [CCF-A]
Zhiyong Wu, Zhenyu Wu, Fangzhi Xu, Yian Wang, Qiushi Sun, Chengyou Jia, Kanzhi Cheng, Zichen Ding, Liheng Chen, Paul Pu Liang, Yu Qiao

Code   Project Page   Demo

ACM-MM 2024

Generating Action-conditioned Prompts for Open-vocabulary Video Action Recognition [CCF-A]

Chengyou Jia, Minnan Luo, Xiaojun Chang, Zhuohang Dang, Mingfei Han, Mengmeng Wang, Guang Dai, Sizhe Dang, Jingdong Wang

AAAI 2024

SSMG: Spatial-Semantic Map Guided Diffusion Model for Free-Form Layout-to-Image Generation [CCF-A]

Chengyou Jia, Minnan Luo, Zhuohang Dang, Guang Dai, Xiaojun Chang, Mengmeng Wang, Jingdong Wang

IEEE TIP

Collaborative Contrastive Refining for Weakly Supervised Person Search [CCF-A]

Chengyou Jia, Minnan Luo, Caixia Yan, Linchao Zhu, Xiaojun Chang, Qinghua Zheng

🧑 Other Papers

🎖 Honors and Awards

Honors

  • Oct. 2022   Outstanding Graduate Student of Xi’an Jiaotong University.
  • Jun. 2021   Excellent Bachelor's Degree Thesis Award (Top 1%).
  • Jun. 2021   Outstanding Graduate Student of Xi’an Jiaotong University.
  • Sep. 2019   Outstanding Student of Xi’an Jiaotong University.
  • Sep. 2018   Outstanding Student of Xi’an Jiaotong University.

Competition Awards

  • Jun. 2021   Top 1% in the TianChi Global Video Cloud Innovation Challenge, Video Object Segmentation Algorithm Challenge (8/2904).

🎖 Scholarships

  • 2025.09   National Scholarship (国家奖学金)
  • 2023.09   Freshman First Prize Scholarship (PhD)
  • 2022.09   First-Class Fellowship for Graduate Students at Xi’an Jiaotong University
  • 2020.09   Computer Science Special Scholarship at Xi’an Jiaotong University

💬 Academic Services

  • Reviewer: NeurIPS, ECCV, AAAI, ICASSP, IEEE TIP, IEEE TCSVT, and IEEE TNNLS.