About Me
I am Chengyou Jia, a Ph.D. candidate in Computer Science at Xi'an Jiaotong University, supervised by Prof. Minnan Luo. I also work closely with Prof. Xiaojun Chang. I am currently a visiting student in Singapore under the supervision of Prof. Ivor Tsang. Previously, I was a research intern at Shanghai AI Lab, supervised by Dr. Zhiyong Wu. Before starting my doctoral studies, I received my B.E. degree from Xi'an Jiaotong University in 2021.
I have authored several publications in top-tier conferences and journals, including CVPR, AAAI, ACL, ACM-MM, and IEEE TIP. I also serve as a reviewer for conferences and journals such as NeurIPS, ECCV, AAAI, IEEE TIP, and IEEE TNNLS.
I am currently on the job market for Fall 2025. Please feel free to reach out!
Research Interests
I work in computer vision and multimodal learning. My current research interests and past experience can be summarized as follows:
- Controllable Image Generation: Text-to-image; Layout-to-image; Consistent Generation
- Multimodal Learning: Multimodal for Open-set Recognition; Multimodal for Automated Agents
- Object Detection & Identification: Person Search, Person Re-Identification
News
- 2025.06: Our papers [T2IS, DenseDiT, AutoGPS] were recently released.
- 2025.05: Three papers were accepted by ACL 2025.
- 2025.02: ChatGen was accepted by CVPR 2025.
- 2025.01: OS-Atlas was accepted by ICLR 2025 (Spotlight).
- 2024.12: One paper was accepted by IEEE TCSVT.
- 2024.11: Our papers [ChatGen, OS-Atlas, AgentStore, OS-Genesis] were recently released.
- 2024.10: Ended a fulfilling internship at Shanghai AI Lab and started as a visiting student in Singapore.
- 2024.07: One paper was accepted by ACM-MM 2024. See you in Melbourne, Australia.
- 2023.11: Two papers were accepted by AAAI 2024.
- 2023.09: One paper was accepted by IEEE TIP.
Education
- 2021.09 - 2026.06 (expected), M.S. + Ph.D. Student, Computer Science, Xi'an Jiaotong University.
- 2017.09 - 2021.06, B.S. in Computer Science, Xi'an Jiaotong University.
Internships
- 2024.11 - Present, Research Intern @ CFAR, A*STAR. Focus on Multimodal Agents for Image Generation.
- 2024.03 - 2024.10, Research Intern @ Shanghai AI Lab. Focus on Multimodal Agents for OS (Operating System).
- 2022.12 - 2024.03, Research Intern @ SGIT AI Lab, State Grid Corporation of China. Focus on Controllable Image Generation.
Selected Publications

Why Settle for One? Text-to-ImageSet Generation and Evaluation [Preprint]
Chengyou Jia, Xin Shen, Zhuohang Dang, Changliang Xia, Weijia Wu, Xinyu Zhang, Hangwei Qian, Ivor Tsang, Minnan Luo
(* means equal contributions)
Project Page (Stay tuned for the code and dataset!)

From Ideal to Real: Unified and Data-Efficient Dense Prediction for Real-World Scenarios [Preprint]
Changliang Xia*, Chengyou Jia*, Zhuohang Dang, Minnan Luo
(* means equal contributions)
Code
Project Page
Models
Datasets

ChatGen: Automatic Text-to-Image Generation From FreeStyle Chatting [CCF-A]
Chengyou Jia*, Changliang Xia*, Zhuohang Dang, Weijia Wu, Hangwei Qian, Minnan Luo
(* means equal contributions)
Code
Project Page
Datasets
Models

AgentStore: Scalable Integration of Heterogeneous Agents As Specialized Generalist Computer Assistant
Chengyou Jia, Minnan Luo, Zhuohang Dang, Qiushi Sun, Fangzhi Xu, Junlin Hu, Tianbao Xie, Zhiyong Wu
Code
Project Page

OS-ATLAS: A Foundation Action Model for Generalist GUI Agents [CAAI-A]
Zhiyong Wu, Zhenyu Wu, Fangzhi Xu, Yian Wang, Qiushi Sun, Chengyou Jia, Kanzhi Cheng, Zichen Ding, Liheng Chen, Paul Pu Liang, Yu Qiao
Code
Project Page
Demo

Generating Action-conditioned Prompts for Open-vocabulary Video Action Recognition [CCF-A]
Chengyou Jia, Minnan Luo, Xiaojun Chang, Zhuohang Dang, Mingfei Han, Mengmeng Wang, Guang Dai, Sizhe Dang, Jingdong Wang

SSMG: Spatial-semantic map guided diffusion model for free-form layout-to-image generation [CCF-A]
Chengyou Jia, Minnan Luo, Zhuohang Dang, Guang Dai, Xiaojun Chang, Mengmeng Wang, Jingdong Wang

Collaborative Contrastive Refining for Weakly Supervised Person Search [CCF-A]
Chengyou Jia, Minnan Luo, Caixia Yan, Linchao Zhu, Xiaojun Chang, Qinghua Zheng
Other Papers
- Preprint
  AutoGPS: Automated geometry problem solving via multimodal formalization and deductive reasoning
  Bowen Ping, Minnan Luo, Zhuohang Dang, Chenxi Wang, Chengyou Jia
- Preprint
  Multi-Modal Dataset Distillation in the Wild
  Zhuohang Dang, Minnan Luo, Chengyou Jia, Hangwei Qian, Xiaojun Chang, Ivor W Tsang
- ACL 2025
  PhysReason: A Comprehensive Benchmark towards Physics-Based Reasoning [CCF-A]
  Xinyu Zhang, Yuxuan Dong, Yanrui Wu, Jiaxing Huang, Chengyou Jia, Basura Fernando, Mike Zheng Shou, Lingling Zhang, Jun Liu
- ACL 2025
  OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis [CCF-A]
  Qiushi Sun, Kanzhi Cheng, Zichen Ding, Chuanyang Jin, Yian Wang, Fangzhi Xu, Zhenyu Wu, Chengyou Jia, Liheng Chen, Zhoumianze Liu, Ben Kao, Guohao Li, Junxian He, Yu Qiao, Zhiyong Wu
- IEEE TCSVT
  PSDiff: Diffusion Model for Person Search with Iterative and Collaborative Refinement [CCF-B]
  Chengyou Jia, Minnan Luo, Zhuohang Dang, Guang Dai, Xiaojun Chang, Jingdong Wang, Qinghua Zheng
- ICASSP 2023
  Towards Real-time Person Search with Invariant Feature Learning [CCF-B]
  Chengyou Jia, Minnan Luo, Zhuohang Dang, Xiaojun Chang, Qinghua Zheng
- AAAI 2024
  Noisy Correspondence Learning with Self-Reinforcing Errors Mitigation [CCF-A]
  Zhuohang Dang, Minnan Luo, Chengyou Jia, Guang Dai, Xiaojun Chang, Jingdong Wang
- IEEE TCSVT
  Disentangled representation learning with transmitted information bottleneck [CCF-B]
  Zhuohang Dang, Minnan Luo, Chengyou Jia, Guang Dai, Jihong Wang, Xiaojun Chang, Jingdong Wang
- IEEE TIP
  Disentangled Generation with Information Bottleneck for Enhanced Few-Shot Learning [CCF-A]
  Zhuohang Dang, Minnan Luo, Jihong Wang, Chengyou Jia, Caixia Yan, Guang Dai, Xiaojun Chang, Qinghua Zheng
- IEEE TCSVT
  Counterfactual Generation Framework for Few-Shot Learning [CCF-B]
  Zhuohang Dang, Minnan Luo, Chengyou Jia, Caixia Yan, Xiaojun Chang, Qinghua Zheng
- IEEE TIP
  Disentangled Noisy Correspondence Learning
  Zhuohang Dang, Minnan Luo, Jihong Wang, Chengyou Jia, Haochen Han, Herun Wan, Guang Dai, Xiaojun Chang, Jingdong Wang
Honors and Awards
Honors
- Oct. 2022: Outstanding Graduate Student of Xi'an Jiaotong University.
- Jun. 2021: Excellent Bachelor's Degree Thesis Award (Top 1%).
- Jun. 2021: Outstanding Graduate Student of Xi'an Jiaotong University.
- Sep. 2019: Outstanding Student of Xi'an Jiaotong University.
- Sep. 2018: Outstanding Student of Xi'an Jiaotong University.
Competition Awards
- Jun. 2021: Top 1% in the TianChi Global Video Cloud Innovation Challenge, Video Object Segmentation Algorithm Challenge (8/2904).
Scholarships
- 2023.09: Freshman First Prize Scholarship (PhD).
- 2022.09: First-Class Fellowship for Graduate Students at Xi'an Jiaotong University.
- 2020.09: Computer Science Special Scholarship at Xi'an Jiaotong University.
Academic Services
- Reviewer: NeurIPS, ECCV, AAAI, ICASSP, IEEE TIP, IEEE TCSVT, and IEEE TNNLS.