Research

* denotes equal contributions.

2024

  1. Diffusion in Diffusion: Cyclic One-Way Diffusion for Text-Vision-Conditioned Generation
    Ruoyu Wang*, Yongqi Yang*, Zhihao Qian, Ye Zhu, and Yu Wu
    In The Twelfth International Conference on Learning Representations (ICLR), 2024
  2. Mining and Unifying Heterogeneous Contrastive Relations for Weakly-Supervised Actor-Action Segmentation
    Bin Duan, Hao Tang, Changchang Sun, Ye Zhu, and Yan Yan
    In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2024

2023

  1. DETER: Detecting Edited Regions for Deterring Generative Manipulations
    Sai Wang*, Ye Zhu*, Ruoyu Wang, Amaya Dharmasiri, Olga Russakovsky, and Yu Wu
    arXiv preprint arXiv:2312.10539, 2023
  2. Unseen Image Synthesis with Diffusion Models
    Ye Zhu, Yu Wu, Zhiwei Deng, Olga Russakovsky, and Yan Yan
    arXiv preprint arXiv:2310.09213, 2023
  3. Boundary Guided Learning-Free Semantic Control with Diffusion Models
    Ye Zhu, Yu Wu, Zhiwei Deng, Olga Russakovsky, and Yan Yan
    In Thirty-seventh Conference on Neural Information Processing Systems (NeurIPS), 2023
  4. Discrete Contrastive Diffusion for Cross-Modal Music and Image Generation
    Ye Zhu, Yu Wu, Kyle Olszewski, Jian Ren, Sergey Tulyakov, and Yan Yan
    In The Eleventh International Conference on Learning Representations (ICLR), 2023
  5. Denoising Diffusion Probabilistic Models to Predict the Density of Molecular Clouds
    Duo Xu, Jonathan C. Tan, Chia-Jung Hsu, and Ye Zhu
    The Astrophysical Journal, 2023
  6. Discrete Diffusion Reward Guidance Methods for Offline Reinforcement Learning
    Matthew Coleman, Olga Russakovsky, Christine Allen-Blanchette, and Ye Zhu
    In ICML 2023 Workshop: Sampling and Optimization in Discrete Space, 2023

2022

  1. Vision + X: A Survey on Multimodal Learning in the Light of Data
    Ye Zhu, Yu Wu, Nicu Sebe, and Yan Yan
    arXiv preprint arXiv:2210.02884, 2022
  2. Quantized GAN for Complex Music Generation from Dance Videos
    Ye Zhu, Kyle Olszewski, Yu Wu, Panos Achlioptas, Menglei Chai, Yan Yan, and Sergey Tulyakov
    In European Conference on Computer Vision (ECCV), 2022
  3. Saying the Unseen: Video Descriptions via Dialog Agents
    Ye Zhu, Yu Wu, Yi Yang, and Yan Yan
    IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022
  4. Skeleton Sequence and RGB Frame Based Multi-Modality Feature Fusion Network for Action Recognition
    Xiaoguang Zhu, Ye Zhu, Haoyu Wang, Honglin Wen, Yan Yan, and Peilin Liu
    ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), 2022

2021

  1. Learning Audio-Visual Correlations from Variational Cross-Modal Generation
    Ye Zhu, Yu Wu, Hugo Latapie, Yi Yang, and Yan Yan
    In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2021

2020

  1. Describing Unseen Videos via Multi-Modal Cooperative Dialog Agents
    Ye Zhu, Yu Wu, Yi Yang, and Yan Yan
    In European Conference on Computer Vision (ECCV), 2020
  2. Hierarchical HMM for Eye Movement Classification
    Ye Zhu, Yan Yan, and Oleg Komogortsev
    In European Conference on Computer Vision Workshop (ECCV Workshop), 2020