Jinheng XIE

Hi there! I'm Xie Jinheng (Sier Kinhane /ˈsiːər.kɪn.heɪn/), a third-year PhD student at Show Lab, National University of Singapore, working with Prof. Mike Shou. Prior to my PhD, I spent three years exploring label-efficient learning for scene understanding, focusing on weakly supervised object localization and semantic segmentation. In the first year of my PhD, I delved into visual prompt learning and effective controllable image synthesis. Currently, I’m concentrating on unifying multimodal understanding and generation within a native unified multimodal model. I have trained two unified multimodal models, Show-o and Show-o2, with up to 7 billion trainable parameters on billion-scale datasets.
Google Scholar     GitHub     LinkedIn     Twitter

sierkinhane at gmail dot com / jinheng at u dot nus dot edu

2025/12

I will be joining Google in New York City as an intern in 2026.

2025/09

Show-o2 has been accepted to NeurIPS 2025.

2025/08

Show-o received the PREMIA Best Student Paper Award 2025.

2025/07

Gave a talk on the Show-o series.

2025/07

Invited talk at Shanghai Jiao Tong University.

2025/07

Gave a brief introduction to Show-o2 at MIT.

2025/06

We release an improved native unified multimodal model, i.e., Show-o2. Code and models are available here.

2025/04

Invited talk at Zhejiang University.

2025/03

One paper got accepted to IJCV

2025/02

Two papers got accepted to CVPR 2025

2025/02

One paper got accepted to TMLR

2025/01

Show-o has been accepted to ICLR 2025

2024/08

We release a unified model, i.e., Show-o, that unifies multimodal understanding and generation in one single transformer. Code and models are available here.

2024/02

One paper got accepted to CVPR 2024

2024/02

Invited talk at Pensees Pte Ltd.

2023/12

One paper got accepted to AAAI 2024

2023/09

Two papers got accepted to NeurIPS 2023

2023/07

One paper got accepted to ACM MM 2023

2023/07

One paper got accepted to ICCV 2023 and one paper got accepted to MICCAI 2023

2022/11

Served as a reviewer for ICCV 2023

2022/11

Served as a reviewer for CVPR 2023

2022/10

Served as a reviewer for IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)

2022/09

Served as a reviewer for International Journal of Computer Vision (IJCV)

2022/09

Received China National Scholarship

2022/08

One paper got accepted to ECCV 2022 Workshop

2022/08

Ranked 4th in the ECCV 2022 NICO Challenge on Out-of-Distribution Visual Recognition

2022/06

One paper got accepted to MICCAI 2022

2022/03

Three papers got accepted to CVPR 2022

2021/09

Received China National Scholarship

2021/07

One paper got accepted to ICCV 2021
★ 2025 ★

Show-o2: Improved Native Unified Multimodal Models

Jinheng Xie, Zhenheng Yang, Mike Zheng Shou
Thirty-ninth Conference on Neural Information Processing Systems (NeurIPS), 2025
PDF  •   Code

Show-o: One Single Transformer to Unify Multimodal Understanding and Generation

Jinheng Xie*, Weijia Mao*, Zechen Bai*, David Junhao Zhang*, Weihao Wang, Kevin Qinghong Lin, Yuchao Gu, Zhijie Chen, Zhenheng Yang, Mike Zheng Shou
International Conference on Learning Representations (ICLR), 2025
PREMIA Best Student Paper Award 2025
Most Influential ICLR Papers #4
PDF  •   Code

CLIMS++: Cross Language Image Matching with Automatic Context Discovery for Weakly Supervised Semantic Segmentation

Jinheng Xie*, Songhe Deng*, Xianxu Hou, Zhaochuan Luo, Linlin Shen, Yawen Huang, Yefeng Zheng, Mike Zheng Shou
International Journal of Computer Vision (IJCV), 2025

Faster Diffusion via Temporal Attention Decomposition

Wentian Zhang*, Haozhe Liu*, Jinheng Xie*, Francesco Faccio, Mengmeng Xu, Tao Xiang, Mike Zheng Shou, Juan-Manuel Perez-Rua, Jürgen Schmidhuber
Transactions on Machine Learning Research (TMLR), 2025
PDF  •   Code

★ 2024 ★

Cross-Attention Makes Inference Cumbersome in Text-to-Image Diffusion Models

Wentian Zhang*, Haozhe Liu*, Jinheng Xie*, Francesco Faccio, Mike Zheng Shou, Jürgen Schmidhuber
arXiv preprint arXiv:2404.02747
PDF  •   Code

Tune-An-Ellipse: CLIP Has Potential to Find What You Want

Jinheng Xie, Songhe Deng, Bing Li, Haozhe Liu, Yawen Huang, Yefeng Zheng, Jürgen Schmidhuber, Bernard Ghanem, Linlin Shen, Mike Zheng Shou
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024 (Highlight)
PDF  •   Code

HEAP: Unsupervised Object Discovery and Localization with Contrastive Grouping

Xin Zhang, Jinheng Xie, Yuan Yuan, Michael Bi Mi, Robby T. Tan
Thirty-eighth Annual AAAI Conference on Artificial Intelligence (AAAI), 2024
PDF  

★ 2023 ★

Learning Visual Prior via Generative Pre-Training

Jinheng Xie, Kai Ye, Yudong Li, Yuexiang Li, Kevin Qinghong Lin, Yefeng Zheng, Linlin Shen, Mike Zheng Shou
Thirty-seventh Conference on Neural Information Processing Systems (NeurIPS), 2023
Webpage  •   PDF  •   Code

Dynamically Masked Discriminator for GANs

Wentian Zhang, Haozhe Liu, Bing Li, Jinheng Xie, Yawen Huang, Yuexiang Li, Yefeng Zheng, Bernard Ghanem
Thirty-seventh Conference on Neural Information Processing Systems (NeurIPS), 2023
PDF  •   Code

BoxDiff: Text-to-Image Synthesis with Training-Free Box-Constrained Diffusion

Jinheng Xie, Yuexiang Li, Yawen Huang, Haozhe Liu, Wentian Zhang, Yefeng Zheng, Mike Zheng Shou
IEEE/CVF International Conference on Computer Vision (ICCV), 2023
PDF  •   Code

★ 2022 ★

Decoupled Mixup for Out-of-Distribution Visual Recognition

Haozhe Liu*, Wentian Zhang*, Jinheng Xie*, Haoqian Wu, Bing Li, Ziqi Zhang, Yuexiang Li, Yawen Huang, Bernard Ghanem, Yefeng Zheng
European Conference on Computer Vision Workshop (ECCVW), 2022
PDF  •   Code

Point Beyond Class: A Benchmark for Weakly Semi-Supervised Abnormality Localization in Chest X-Rays

Haoqin Ji, Haozhe Liu, Yuexiang Li, Jinheng Xie, Nanjun He, Yawen Huang, Dong Wei, Xinrong Chen, Linlin Shen, Yefeng Zheng
International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI), 2022
PDF  •   Code

CLIMS: Cross Language Image Matching for Weakly Supervised Semantic Segmentation

Jinheng Xie, Xianxu Hou, Kai Ye, Linlin Shen
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022
PDF  •   Code

C2AM: Contrastive Learning of Class-agnostic Activation Map for Weakly Supervised Object Localization and Semantic Segmentation

Jinheng Xie, Jianfeng Xiang, Junliang Chen, Xianxu Hou, Xiaodong Zhao, Linlin Shen
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022
PDF  •   Code

Frequency-driven Imperceptible Adversarial Attack on Semantic Similarity

Cheng Luo, Qinliang Lin, Weicheng Xie, Bizhu Wu, Jinheng Xie, Linlin Shen
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022
PDF  •   Code

Online Refinement of Low-level Feature Based Activation Map for Weakly Supervised Object Localization

Jinheng Xie, Cheng Luo, Xiangping Zhu, Ziqi Jin, Weizeng Lu, Linlin Shen
IEEE/CVF International Conference on Computer Vision (ICCV), 2021
PDF  •   Code

2025

PREMIA Best Student Paper Award (Excellence)

2023

Show Lab Annual Award (4,000 SGD)

2023

Outstanding Graduate Award (Rate < 5%)

2022

China National Scholarship (Rate <= 0.02%)

2021

China National Scholarship (Rate <= 0.02%)

2021

Excellent Academic Scholarship, First Class
My favorite singer is "Liang Bo", and I would be delighted to recommend some of his songs to you.