Jinheng XIE
Hi there! I'm Jinheng, a PhD student at Show Lab, National University of Singapore, working with Prof. Mike Shou. My research interests focus on multi-modal computer vision. Previously, I worked on label-efficient learning for scene understanding (object localization and segmentation in unsupervised and weakly supervised settings). I'm currently exploring multi-modal pre-training and generation, such as generative pre-training and text-to-image generation.
Google Scholar
GitHub
LinkedIn
sierkinhane at gmail dot com / jinheng at u dot nus dot edu
2024/08
We released Show-o, a unified model that handles multimodal understanding and generation in one single transformer. Code and models are available here.
2024/02
One paper got accepted to CVPR 2024
2023/12
One paper got accepted to AAAI 2024
2023/09
Two papers got accepted to NeurIPS 2023
2023/07
One paper got accepted to ACM MM 2023
2023/07
One paper got accepted to ICCV 2023 and one paper got accepted to MICCAI 2023
2022/11
Served as a reviewer for ICCV 2023
2022/11
Served as a reviewer for CVPR 2023
2022/10
Served as a reviewer for IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)
2022/09
Served as a reviewer for International Journal of Computer Vision (IJCV)
2022/09
Received China National Scholarship
2022/08
One paper got accepted to ECCV 2022 Workshop
2022/08
Ranked 4th in the Out-of-Distribution Visual Recognition ECCV'2022 NICO Challenge
2022/06
One paper got accepted to MICCAI 2022
2022/03
Three papers got accepted to CVPR 2022
2021/09
Received China National Scholarship
2021/07
One paper got accepted to ICCV 2021
Show-o: One Single Transformer to Unify Multimodal Understanding and Generation
Jinheng Xie*, Weijia Mao*, Zechen Bai*, David Junhao Zhang*, Weihao Wang, Kevin Qinghong Lin, Yuchao Gu, Zhijie Chen, Zhenheng Yang, Mike Zheng Shou
arXiv preprint arXiv:2408.12528
PDF • Code
Cross-Attention Makes Inference Cumbersome in Text-to-Image Diffusion Models
Wentian Zhang*, Haozhe Liu*, Jinheng Xie*, Francesco Faccio, Mike Zheng Shou, Jürgen Schmidhuber
arXiv preprint arXiv:2404.02747
PDF • Code
Tune-An-Ellipse: CLIP Has Potential to Find What You Want
Jinheng Xie, Songhe Deng, Bing Li, Haozhe Liu, Yawen Huang, Yefeng Zheng, Jürgen Schmidhuber, Bernard Ghanem, Linlin Shen, Mike Zheng Shou
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024
HEAP: Unsupervised Object Discovery and Localization with Contrastive Grouping
Xin Zhang, Jinheng Xie, Yuan Yuan, Michael Bi Mi, Robby T. Tan
Thirty-eighth Annual AAAI Conference on Artificial Intelligence (AAAI), 2024
PDF
Learning Visual Prior via Generative Pre-Training
Jinheng Xie, Kai Ye, Yudong Li, Yuexiang Li, Kevin Qinghong Lin, Yefeng Zheng, Linlin Shen, Mike Zheng Shou
Thirty-seventh Conference on Neural Information Processing Systems (NeurIPS), 2023
Webpage • PDF • Code
Dynamically Masked Discriminator for GANs
Wentian Zhang, Haozhe Liu, Bing Li, Jinheng Xie, Yawen Huang, Yuexiang Li, Yefeng Zheng, Bernard Ghanem
Thirty-seventh Conference on Neural Information Processing Systems (NeurIPS), 2023
PDF • Code
BoxDiff: Text-to-Image Synthesis with Training-Free Box-Constrained Diffusion
Jinheng Xie, Yuexiang Li, Yawen Huang, Haozhe Liu, Wentian Zhang, Yefeng Zheng, Mike Zheng Shou
IEEE/CVF International Conference on Computer Vision (ICCV), 2023
PDF • Code
Decoupled Mixup for Out-of-Distribution Visual Recognition
Haozhe Liu*, Wentian Zhang*, Jinheng Xie*, Haoqian Wu, Bing Li, Ziqi Zhang, Yuexiang Li, Yawen Huang, Bernard Ghanem, Yefeng Zheng
European Conference on Computer Vision Workshop (ECCVW), 2022
PDF • Code
Point Beyond Class: A Benchmark for Weakly Semi-Supervised Abnormality Localization in Chest X-Rays
Haoqin Ji, Haozhe Liu, Yuexiang Li, Jinheng Xie, Nanjun He, Yawen Huang, Dong Wei, Xinrong Chen, Linlin Shen, Yefeng Zheng
International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI), 2022
PDF • Code
CLIMS: Cross Language Image Matching for Weakly Supervised Semantic Segmentation
Jinheng Xie, Xianxu Hou, Kai Ye, Linlin Shen
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022
PDF • Code
C2AM: Contrastive Learning of Class-agnostic Activation Map for Weakly Supervised Object Localization and Semantic Segmentation
Jinheng Xie, Jianfeng Xiang, Junliang Chen, Xianxu Hou, Xiaodong Zhao, Linlin Shen
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022
PDF • Code
Frequency-driven Imperceptible Adversarial Attack on Semantic Similarity
Cheng Luo, Qinliang Lin, Weicheng Xie, Bizhu Wu, Jinheng Xie, Linlin Shen
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022
PDF • Code
Online Refinement of Low-level Feature Based Activation Map for Weakly Supervised Object Localization
Jinheng Xie, Cheng Luo, Xiangping Zhu, Ziqi Jin, Weizeng Lu, Linlin Shen
IEEE/CVF International Conference on Computer Vision (ICCV), 2021
PDF • Code
2023
Show Lab Annual Award (4,000 SGD)
2023
Outstanding Graduate Award (Rate < 5%)
2022
China National Scholarship (Rate <= 0.02%)
2021
China National Scholarship (Rate <= 0.02%)
2021
Excellent Academic Scholarship, First Class