
Jingye Chen (陈竞晔)

Ph.D. Candidate at HKUST

About Me

Jingye Chen is a third-year Ph.D. student at HKUST, supervised by Prof. Qifeng Chen. He previously obtained his BSc and MSc degrees from the School of Computer Science at Fudan University, supervised by Prof. Bin Li and Prof. Xiangyang Xue. He enjoys doing interesting research and thinking outside the box. He also interned in the General AI Group at Microsoft Research Asia, advised by Dr. Lei Cui and Dr. Furu Wei, and at Adobe Research, where he was mentored by Dr. Zhaowen Wang.

News
[Mar. 2025]

An awesome-list repo on generative games is maintained at link. Contributions are welcome!

[Mar. 2025]

A paper on the numerical and spatial consistency of generative games is released.

[Feb. 2025]

One paper accepted to CVPR2025.

[Nov. 2024]

We release VideoTuna, an all-in-one video fine-tuning framework.

[Nov. 2024]

I passed the qualifying exam and became a Ph.D. candidate.

[Jul. 2024]

One paper accepted to ECCV2024 Oral.

[Jul. 2024]

One paper accepted to ACMMM2024.

[May. 2024]

We release a survey on LLMs for multimodal generation and editing.

[Nov. 2023]

TextDiffuser-2 is released, offering more flexibility.

[Sept. 2023]

We released Kosmos-2.5, a multimodal literate model.

[Sept. 2023]

One paper accepted to NeurIPS2023.

[Nov. 2022]

One paper accepted to AAAI2023.

[Oct. 2022]

One paper accepted to EMNLP2022-Findings.

[Jan. 2022]

We construct a benchmark for Chinese text recognition.

[Dec. 2021]

One paper accepted to AAAI2022.

[Apr. 2021]

One paper accepted to IJCAI2021.

[Mar. 2021]

One paper accepted to CVPR2021.

Publications

Model as a Game: On Numerical and Spatial Consistency for Generative Games

Jingye Chen, Yuzhong Zhao, Yupan Huang, Lei Cui, Li Dong, Tengchao Lv, Qifeng Chen, Furu Wei

Technical Report, 2025

[PDF] [Blog]

VideoTuna: A Powerful Toolkit for Video Generation with Model Fine-Tuning and Post-Training

Yingqing He, Yazhou Xing, Zhefan Rao, Haoyu Wu, Zhaoyang Liu, Jingye Chen, Pengjun Fang, Jiajun Li, Liya Ji, Runtao Liu, Xiaowei Chi, Yang Fei, Guocheng Shao, Yue Ma, Qifeng Chen

Open-source Project, 2025

[Code]

Large Motion Video Autoencoding with Cross-modal Video VAE

Yazhou Xing, Yang Fei, Yingqing He, Jingye Chen, Jiaxin Xie, Xiaowei Chi, Qifeng Chen

Technical Report, 2024

[PDF] [Code]

LLMs Meet Multimodal Generation and Editing: A Survey

Yingqing He, Zhaoyang Liu, Jingye Chen, Zeyue Tian, Hongyu Liu, Xiaowei Chi, Runtao Liu, Ruibin Yuan, Yazhou Xing, Wenhai Wang, Jifeng Dai, Yong Zhang, Wei Xue, Qifeng Liu, Yike Guo, Qifeng Chen

Technical Report, 2024

[PDF] [Code]

TextDiffuser-2: Unleashing the Power of Language Models for Text Rendering

Jingye Chen, Yupan Huang, Tengchao Lv, Lei Cui, Qifeng Chen, Furu Wei

European Conference on Computer Vision (ECCV), 2024, Oral Presentation

Top 10 on the Hugging Face Spaces trending list on Dec. 31, 2023; featured as Space of the Week.

Used by Recraft V3, the top-ranked image generation model on the global leaderboard.

[PDF] [Code] [ProjectPage] [HuggingFace] [Twitter] [PaperWeekly] [Discord] [?]

Kosmos-2.5: A Multimodal Literate Model

Tengchao Lv*, Yupan Huang*, Jingye Chen*, Lei Cui*, Shuming Ma, Yaoyao Chang, Shaohan Huang, Wenhui Wang, Li Dong, Weiyao Luo, Shaoxiang Wu, Guoxin Wang, Cha Zhang, Furu Wei

Technical Report, 2023

[PDF] [Code] [HuggingFace]

TextDiffuser: Diffusion Models as Text Painters

Jingye Chen*, Yupan Huang*, Tengchao Lv, Lei Cui, Qifeng Chen, Furu Wei

Neural Information Processing Systems (NeurIPS), 2023

Top 10 on the Hugging Face Spaces trending list on Jun. 29, 2023; featured as Space of the Week.

[PDF] [Code] [ProjectPage] [HuggingFace] [GoogleColab] [Twitter] [Zhihu]

TrOCR: Transformer-based Optical Character Recognition with Pre-trained Models

Minghao Li, Tengchao Lv, Jingye Chen, Lei Cui, Yijuan Lu, Dinei Florencio, Cha Zhang, Zhoujun Li, Furu Wei

AAAI Conference on Artificial Intelligence (AAAI), 2023

Ranked 4th among the Most Influential AAAI 2023 Papers

[PDF] [Code] [HuggingFace]

XDoc: Unified Pre-training for Cross-Format Document Understanding

Jingye Chen, Tengchao Lv, Lei Cui, Cha Zhang, Furu Wei

Empirical Methods in Natural Language Processing (EMNLP-Findings), 2022

[PDF] [Code] [HuggingFace]

Benchmarking Chinese Text Recognition: Datasets, Baselines, and an Empirical Study

Jingye Chen, Haiyang Yu, Jianqi Ma, Mengnan Guan, Xixi Xu, Xiaocong Wang, Shaobo Qu, Bin Li, Xiangyang Xue

Technical Report, 2022

[PDF] [Code] [Zhihu]

Text Gestalt: Stroke-Aware Scene Text Image Super-Resolution

Jingye Chen, Haiyang Yu, Jianqi Ma, Bin Li, Xiangyang Xue

AAAI Conference on Artificial Intelligence (AAAI), 2022

[PDF] [Code]

Zero-Shot Chinese Character Recognition with Stroke-Level Decomposition

Jingye Chen, Bin Li, Xiangyang Xue

International Joint Conference on Artificial Intelligence (IJCAI), 2021

[PDF] [Code]

Scene Text Telescope: Text-Focused Scene Image Super-Resolution

Jingye Chen, Bin Li, Xiangyang Xue

IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2021

[PDF] [Code]

MT-TransUNet: Mediating Multi-Task Tokens in Transformers for Skin Lesion Segmentation and Classification

Jingye Chen, Jieneng Chen, Zongwei Zhou, Bin Li, Alan Yuille, Yongyi Lu

Technical Report, 2021

[PDF] [Code]
Education
HKUST
Hong Kong
Sept. 2022 – Present
Ph.D. Candidate in Computer Science, supervised by Prof. Qifeng Chen
Fudan University
Shanghai
Sept. 2019 – Feb. 2022
MSc in Computer Science, supervised by Prof. Bin Li and Prof. Xiangyang Xue
Fudan University
Shanghai
Sept. 2015 – Jun. 2019
BSc in Computer Science
Experiences
Microsoft Research Asia
Beijing
Feb. 2022 – Jul. 2022, Dec. 2022 – Feb. 2024, Nov. 2024 – Present
Research Intern, supervised by Dr. Lei Cui and Dr. Furu Wei
Adobe Research
San Jose, U.S.A.
Apr. 2024 – Aug. 2024
Research Intern, supervised by Dr. Zhaowen Wang
Johns Hopkins University
U.S.A.
Apr. 2021 – Sept. 2021
Summer Intern, supervised by Dr. Yongyi Lu and Prof. Alan Yuille
University of Cambridge
Cambridge, U.K.
Jan. 2018 – Feb. 2018
Visiting student, winter program
Services

Conference Reviewer: CVPR, ICCV, NeurIPS, ACL, EMNLP, AAAI, ACMMM

Journal Reviewer: TPAMI, TMM

Teaching

2023 Spring: COMP 2011 Programming with C++

2023 Fall: COMP 2011 Programming with C++

Awards & Scholarships

Excellent Master Dissertation Award of Shanghai

2023

RedBird PhD Scholarship at HKUST

2022

Outstanding Graduate of Shanghai (top 5%)

2022

Excellent Student Award

2021

National Scholarship (top 1%)

2021

Outstanding Undergraduate of Shanghai (top 5%)

2019

Best Team Award at the University of Cambridge (as team leader)

2018

Third Class Undergraduate Scholarship

2016-2018