Jinheng Xie

I'm Xie Jinheng (Sier Kinhane /ˈsiːər.kɪn.heɪn/), a final-year PhD student at the National University of Singapore, working with Prof. Mike Shou. I've had the privilege of interning at Google Research, Google DeepMind, ByteDance, and Tencent.

My research focuses on unifying multimodal understanding and generation within a single architecture. I have trained two unified multimodal models, Show-o and Show-o2, with up to 7B parameters on billion-scale datasets.

Google Scholar GitHub LinkedIn Twitter
sierkinhane at gmail dot com / jinheng at u dot nus dot edu

2026/01

Joined Google as an intern

2025/09

Show-o2 has been accepted to NeurIPS 2025

2025/08

Show-o received the PREMIA Best Student Paper Award 2025

2025/07

Gave a talk on the Show-o series

2025/07

Invited talk at Shanghai Jiao Tong University

2025/07

Gave a brief introduction to Show-o2 at MIT

2025/06

We released Show-o2, an improved native unified multimodal model. Code and models are available here.

2025/04

Invited talk at Zhejiang University

2025/03

One paper got accepted to IJCV

2025/02

Two papers got accepted to CVPR 2025

2025/02

One paper got accepted to TMLR

2025/01

Show-o has been accepted to ICLR 2025

2024/08

We released Show-o, a model that unifies multimodal understanding and generation in one single transformer. Code and models are available here.

2024/02

One paper got accepted to CVPR 2024

2024/02

Invited talk at Pensees Pte Ltd.

2023/12

One paper got accepted to AAAI 2024

2023/09

Two papers got accepted to NeurIPS 2023

2023/07

One paper got accepted to ACM MM 2023

2023/07

One paper got accepted to ICCV 2023 and one paper got accepted to MICCAI 2023

2022/11

Served as a reviewer for ICCV 2023

2022/11

Served as a reviewer for CVPR 2023

2022/10

Served as a reviewer for IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)

2022/09

Served as a reviewer for International Journal of Computer Vision (IJCV)

2022/09

Received the China National Scholarship

2022/08

One paper got accepted to ECCV 2022 Workshop

2022/08

Ranked 4th in the ECCV 2022 NICO Challenge on Out-of-Distribution Visual Recognition

2022/06

One paper got accepted to MICCAI 2022

2022/03

Three papers got accepted to CVPR 2022

2021/09

Received the China National Scholarship

2021/07

One paper got accepted to ICCV 2021

★ 2025 ★

Show-o2: Improved Native Unified Multimodal Models

Jinheng Xie, Zhenheng Yang, Mike Zheng Shou
Thirty-ninth Conference on Neural Information Processing Systems (NeurIPS), 2025
PDF • Code

Show-o: One Single Transformer to Unify Multimodal Understanding and Generation

Jinheng Xie, Weijia Mao, Zechen Bai, David Junhao Zhang, Weihao Wang, Kevin Qinghong Lin, Yuchao Gu, Zhijie Chen, Zhenheng Yang, Mike Zheng Shou
International Conference on Learning Representations (ICLR), 2025
PREMIA Best Student Paper Award 2025
Most Influential ICLR Papers #4
PDF • Code

CLIMS++: Cross Language Image Matching with Automatic Context Discovery for Weakly Supervised Semantic Segmentation

Jinheng Xie^*, Songhe Deng^*, Xianxu Hou, Zhaochuan Luo, Linlin Shen, Yawen Huang, Yefeng Zheng, Mike Zheng Shou
International Journal of Computer Vision (IJCV), 2025
PDF

Faster Diffusion via Temporal Attention Decomposition

Wentian Zhang^*, Haozhe Liu^*, Jinheng Xie^*, Francesco Faccio, Mengmeng Xu, Tao Xiang, Mike Zheng Shou, Juan-Manuel Perez-Rua, Jürgen Schmidhuber
Transactions on Machine Learning Research (TMLR), 2025
PDF • Code

★ 2024 ★

Cross-Attention Makes Inference Cumbersome in Text-to-Image Diffusion Models

Wentian Zhang^*, Haozhe Liu^*, Jinheng Xie^*, Francesco Faccio, Mike Zheng Shou, Jürgen Schmidhuber
arXiv preprint arXiv:2404.02747
PDF • Code

Tune-An-Ellipse: CLIP Has Potential to Find What You Want

Jinheng Xie, Songhe Deng, Bing Li, Haozhe Liu, Yawen Huang, Yefeng Zheng, Jürgen Schmidhuber, Bernard Ghanem, Linlin Shen, Mike Zheng Shou
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024 (Highlight)
PDF • Code

★ 2023 ★

Learning Visual Prior via Generative Pre-Training

Jinheng Xie, Kai Ye, Yudong Li, Yuexiang Li, Kevin Qinghong Lin, Yefeng Zheng, Linlin Shen, Mike Zheng Shou
Thirty-seventh Conference on Neural Information Processing Systems (NeurIPS), 2023
Webpage • PDF • Code

BoxDiff: Text-to-Image Synthesis with Training-Free Box-Constrained Diffusion

Jinheng Xie, Yuexiang Li, Yawen Huang, Haozhe Liu, Wentian Zhang, Yefeng Zheng, Mike Zheng Shou
IEEE/CVF International Conference on Computer Vision (ICCV), 2023
PDF • Code

★ 2022 ★

Decoupled Mixup for Out-of-Distribution Visual Recognition

Haozhe Liu^*, Wentian Zhang^*, Jinheng Xie^*, Haoqian Wu, Bing Li, Ziqi Zhang, Yuexiang Li, Yawen Huang, Bernard Ghanem, Yefeng Zheng
European Conference on Computer Vision Workshop (ECCVW), 2022
PDF • Code

CLIMS: Cross Language Image Matching for Weakly Supervised Semantic Segmentation

Jinheng Xie, Xianxu Hou, Kai Ye, Linlin Shen
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022
PDF • Code

C2AM: Contrastive Learning of Class-agnostic Activation Map for Weakly Supervised Object Localization and Semantic Segmentation

Jinheng Xie, Jianfeng Xiang, Junliang Chen, Xianxu Hou, Xiaodong Zhao, Linlin Shen
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022
PDF • Code

Online Refinement of Low-level Feature Based Activation Map for Weakly Supervised Object Localization

Jinheng Xie, Cheng Luo, Xiangping Zhu, Ziqi Jin, Weizeng Lu, Linlin Shen
IEEE/CVF International Conference on Computer Vision (ICCV), 2021
PDF • Code

2025

PREMIA Best Student Paper Award (Excellence)

2023

Show Lab Annual Award (4,000 SGD)

2023

Outstanding Graduate Award (Rate < 5%)

2022

China National Scholarship (Rate <= 0.02%)

2021

China National Scholarship (Rate <= 0.02%)

2021

Excellent Academic Scholarship, First Class