I am a Monge tenure-track assistant professor in the Computer Science Department (Département d'Informatique, DIX), École Polytechnique, France. I am also a PI at the DaSciM lab, which is part of the Computer Science Laboratory (Laboratoire d'Informatique, LIX) of École Polytechnique, Institut Polytechnique de Paris (IPP).

My research lies in Machine Learning and Computer Vision, with a particular focus on deep probabilistic generative models (e.g., diffusion models and GANs) and their applications in multimodal settings (e.g., vision, audio, and text) as well as in scientific domains (e.g., astrophysical inversions).

Before joining l'X, I spent two years as a postdoctoral researcher at Princeton University, working with Prof. Olga Russakovsky. I earned my Ph.D. in Computer Science under the supervision of Prof. Yan Yan at Illinois Tech in Chicago. I also hold M.S. and B.S. degrees from Shanghai Jiao Tong University (SJTU), and received the French engineering diploma through the dual-degree program with SJTU after studying at École Polytechnique.

[Google Scholar]     [Twitter]     [GitHub]     [CV]

News

09/2025: Our works Dynamic diffusion Schrödinger bridge for astrophysical inversions and BNMusic for noise acoustic masking via personalized music generation accepted to NeurIPS 2025.

09/2025: I joined École Polytechnique as a Monge tenure-track assistant professor in Computer Science.

07/2025: Our work NoiseQuery for enhanced goal-driven image generation accepted to ICCV 2025 as a Highlight paper.

02/2025: Our work D3 for scaling up deepfake detection accepted to CVPR 2025.

01/2025: Our work on exploring magnetic fields in the interstellar medium via diffusion generative models accepted to The Astrophysical Journal (ApJ).

Researcher
Ye Zhu

Monge Tenure-Track Assistant Professor

Department of Computer Science
École Polytechnique
Institut Polytechnique de Paris (IPP)
Bâtiment Alan Turing, 1 rue Honoré d'Estienne d'Orves
Palaiseau 91120, France

Selected Publications

* denotes equal contribution. A complete list can be found on Google Scholar.

  • Dynamic Diffusion Schrödinger Bridge in Astrophysical Observational Inversions
    Ye Zhu, Duo Xu, Zhiwei Deng, Jonathan C. Tan, Olga Russakovsky.
    In Conference on Neural Information Processing Systems (NeurIPS), 2025.
    [Paper]   [Code]   [Bibtex]
  • BNMusic: Blending Environmental Noises into Personalized Music
    Chi Zuo, Martin B. Møller, Pablo Martínez-Nuevo, Huayang Huang, Yu Wu, Ye Zhu.
    In Conference on Neural Information Processing Systems (NeurIPS), 2025.
    [Paper]   [Code]   [Bibtex]
  • The Silent Assistant: NoiseQuery as Implicit Guidance for Goal-Driven Image Generation
    Ruoyu Wang, Huayang Huang, Ye Zhu, Olga Russakovsky, Yu Wu.
    In International Conference on Computer Vision (ICCV Highlight), 2025.
    [Paper]   [Code]   [Bibtex]
  • D3: Scaling Up Deepfake Detection by Learning from Discrepancy
    Yongqi Yang*, Zhihao Qian*, Ye Zhu, Olga Russakovsky, and Yu Wu.
    In IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2025.
    [Paper]   [Code]   [Bibtex]
  • Exploring Magnetic Fields in Molecular Clouds through Denoising Diffusion Probabilistic Models
    Duo Xu, Jenna Karcheski, Chi-Yan Law, Ye Zhu, Chia-Jung Hsu, and Jonathan Tan.
    In The Astrophysical Journal (ApJ), 2025.
    [Paper]   [Code]   [Bibtex]
  • Vision + X: A Survey on Multimodal Learning in the Light of Data
    Ye Zhu, Yu Wu, Nicu Sebe, and Yan Yan.
    In IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2024.
    [Paper]   [Bibtex]
  • What is Dataset Distillation Learning?
    William Yang, Ye Zhu, Zhiwei Deng, Olga Russakovsky.
    In International Conference on Machine Learning (ICML), 2024.
    [Paper]   [Code]   [Bibtex]
  • Diffusion in Diffusion: Cyclic One-Way Diffusion for Text-Vision-Conditioned Generation
    Ruoyu Wang*, Yongqi Yang*, Zhihao Qian, Ye Zhu, and Yu Wu.
    In International Conference on Learning Representations (ICLR), 2024.
    [Paper]   [Code]   [Bibtex]
  • Surveying Image Segmentation Approaches in Astronomy
    Duo Xu, Ye Zhu.
    In Astronomy and Computing, 2024.
    [Paper]   [Bibtex]
  • Boundary Guided Learning-Free Semantic Control with Diffusion Models
    Ye Zhu, Yu Wu, Zhiwei Deng, Olga Russakovsky, and Yan Yan.
    In Conference on Neural Information Processing Systems (NeurIPS), 2023.
    [Paper]   [Code]   [Bibtex]
  • Denoising Diffusion Probabilistic Models to Predict the Density of Molecular Clouds
    Duo Xu, Jonathan Tan, Chia-Jung Hsu, and Ye Zhu.
    In The Astrophysical Journal (ApJ), 2023.
    [Paper]   [Bibtex]
  • Discrete Contrastive Diffusion for Cross-Modal Music and Image Generation
    Ye Zhu, Yu Wu, Kyle Olszewski, Jian Ren, Sergey Tulyakov, and Yan Yan.
    In International Conference on Learning Representations (ICLR), 2023.
    [Paper]   [Code]   [Bibtex]
  • Quantized GAN for Complex Music Generation from Dance Videos
    Ye Zhu, Kyle Olszewski, Yu Wu, Panos Achlioptas, Menglei Chai, Yan Yan, and Sergey Tulyakov.
    In European Conference on Computer Vision (ECCV), 2022.
    [Paper]   [Code]   [Bibtex]
  • Saying the Unseen: Video Descriptions via Dialog Agents
    Ye Zhu, Yu Wu, Yi Yang, and Yan Yan.
    In IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022.
    [Paper]   [Code]   [Bibtex]
  • Learning Audio-Visual Correlations From Variational Cross-Modal Generations
    Ye Zhu, Yu Wu, Hugo Latapie, Yi Yang, and Yan Yan.
    In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2021.
    [Paper]   [Code]   [Bibtex]
  • Describing Unseen Videos via Multi-Modal Cooperative Dialog Agents
    Ye Zhu, Yu Wu, Yi Yang, and Yan Yan.
    In European Conference on Computer Vision (ECCV), 2020.
    [Paper]   [Code]   [Bibtex]