News
Paper assignment

Written on 30.10.25 by María Martínez-García

Dear Students,

After reviewing your paper preferences, we have finalized the list of paper assignments. You can find your matriculation number listed before each reference. Presentations will take place during the presentation session corresponding to each block. Remember that your presentation should be 15 minutes long, followed by a Q&A with questions from both the audience and the instructors. If you notice that you’ve been assigned a paper you marked as “I would not present” or find any other error, please let us know as soon as possible.

Additionally, we’ve noticed that only three of you have registered on LSF so far. Please remember that registration is required by November 4, one week before the first presentation session.

If you have any questions, don’t hesitate to reach out.
Deep Probabilistic Generative Models
With the development of neural networks and increased computational power, deep generative modeling has emerged as one of the leading directions in AI. We are shifting from traditional discriminative tasks (such as classification, segmentation, or clustering), which focus on modeling conditional distributions, to a more comprehensive framework aimed at modeling the joint distribution of the data itself. Discriminative models alone can be insufficient for robust decision-making and the development of intelligent systems, as it is also necessary to understand the underlying data-generating process and be able to express uncertainty about the environment.
Typically, in the deep learning literature, generative models are viewed as methods for synthesizing new data. In this seminar, however, we will adopt a probabilistic perspective to highlight that modeling the marginal likelihood of the data has much broader applicability and can be essential for building successful AI systems.
In this seminar, we will ask ourselves how to formulate deep generative models (i.e., how to express and learn the marginal likelihood of the data) and explore the different approaches proposed in the literature. The aim is for students to critically assess existing methods, understand their strengths and limitations, and identify potential directions for future research.
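To make the phrase "express and learn the marginal likelihood of the data" concrete, here is a hypothetical toy sketch (not part of the seminar materials): fitting a one-dimensional Gaussian density p(x) by maximum likelihood, the simplest instance of the generative-modeling recipe the seminar generalizes with deep models. The function names are illustrative, not taken from any course code.

```python
import math
import random

def fit_gaussian_mle(data):
    """Closed-form maximum-likelihood fit of a 1-D Gaussian model p(x)."""
    n = len(data)
    mean = sum(data) / n
    var = sum((x - mean) ** 2 for x in data) / n  # MLE uses 1/n, not 1/(n-1)
    return mean, var

def log_likelihood(data, mean, var):
    """Log marginal likelihood of the data under a Gaussian with given parameters."""
    return sum(
        -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)
        for x in data
    )

# Draw synthetic data from a "true" N(2, 0.5^2) process, then recover it.
random.seed(0)
data = [random.gauss(2.0, 0.5) for _ in range(1000)]
mean, var = fit_gaussian_mle(data)
print(mean, var)  # close to 2.0 and 0.25
```

Deep generative models replace this closed-form family with a neural-network-parameterized density and replace the closed-form fit with (approximate) maximum-likelihood training; the objective of modeling p(x) itself is the same.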
- Block 1: Explicit Density Models (VAEs and Flows)
- Block 2: Implicit Density Models (GANs and DDPMs)
- Block 3: Multimodal Generation
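As a toy illustration of the "explicit density" idea behind Block 1 (again a hypothetical sketch, not seminar code): a one-dimensional affine normalizing flow transforms a standard normal base variable and yields an exact, tractable log-density via the change-of-variables formula, which is the property that distinguishes explicit from implicit density models.

```python
import math

def log_standard_normal(z):
    """log N(z; 0, 1), the base density of the flow."""
    return -0.5 * (math.log(2 * math.pi) + z * z)

def affine_flow_log_density(x, a, b):
    """Exact log p(x) for the flow x = a * z + b with z ~ N(0, 1).

    Change of variables: log p(x) = log N(z; 0, 1) - log|a|, with z = (x - b) / a.
    """
    z = (x - b) / a
    return log_standard_normal(z) - math.log(abs(a))

# With a = 2, b = 1 the flow pushes N(0, 1) to N(1, 4); the flow's density
# matches the closed-form Gaussian density at any point.
lp = affine_flow_log_density(3.0, 2.0, 1.0)
closed_form = -0.5 * (math.log(2 * math.pi * 4.0) + (3.0 - 1.0) ** 2 / 4.0)
print(lp, closed_form)
```

GANs and DDPMs in Block 2, by contrast, define the model through a sampling process for which such an exact log-density is not directly available.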
Each block includes a paper session and a panel discussion, as detailed below. Students are evaluated on their presentations, active participation in discussions, and a final report that summarizes the seminar topics, offers critical analysis, identifies limitations, and suggests potential research directions.
| Date | Block | Content |
|---|---|---|
| 28/10/2025 (12:15 - 13:45) | Background Session | Introduction to probabilistic modeling |
| 04/11/2025 (12:15 - 13:45) | Background Session | Introduction to generative modeling |
| 11/11/2025 (12:15 - 13:45) | Block 1 - Explicit Density Models | Paper presentations + Q&A |
| 18/11/2025 (12:15 - 13:45) | Block 1 - Explicit Density Models | Panel discussion |
| 09/12/2025 (12:15 - 13:45) | Block 2 - Implicit Density Models | Paper presentations + Q&A |
| 16/12/2025 (12:15 - 13:45) | Block 2 - Implicit Density Models | Panel discussion |
| 13/01/2026 (12:15 - 13:45) | Block 3 - Multimodal Generation | Paper presentations + Q&A |
| 20/01/2026 (12:15 - 13:45) | Block 3 - Multimodal Generation | Panel discussion |
LOCATION: SR4 in E2.5
Attendance
The seminar will be held in person, and attendance is required. Students may miss no more than two sessions without providing a justification.
Deliverables and Grading Scheme
- Paper Presentation (15 minutes) (40%)
- Submission (requirement): Students must submit a PDF of their slides on the day they present.
- Context
- Positioning of the paper within the state of the art and identification of gaps the paper addresses.
- Clear articulation of What/Why/How.
- Content
- Clear explanation of the paper's core intuition and methodology.
- Rationale behind experiments and significance of results.
- Advantages and disadvantages of the method presented.
- Final slide/section presenting the take-home messages.
- Q&A
- Responding to questions from TAs and the audience.
- Discussion Session (20%)
- Pre-Submission Requirements
- Each participant must submit one discussion question per block; this is mandatory for students who are not presenting a paper in that block.
- Submission deadline: 2 days before the session via CMS.
- Participation Expectations
- For Presenters:
- Active engagement answering the questions.
- Facilitating broader discussion.
- For Listeners:
- Quality of pre-submitted questions.
- Active participation in discussions.
- Final Report (6-8 pages, excluding references) (40%)
- Critical Analysis
- Comprehensive summary of the seminar, including the key points from the papers presented and discussion sessions.
- Comprehensive analysis regarding limitations, advantages, and open challenges in the field.
- Connection of the discussion to the broader research context, beyond the papers discussed in the seminar.
- Evaluation Criteria
- Content
- Depth of the analysis.
- Demonstration of understanding across all three seminar blocks.
- Synthesis of seminar content with broader research context.
- Delivery
- Quality of writing and argumentation.
List of papers
- Variational Autoencoders
- Tomczak, J., & Welling, M. (2018). VAE with a VampPrior. Proceedings of the Twenty-First International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 84:1214-1223. Link: https://proceedings.mlr.press/v84/tomczak18a.html.
- Vahdat, A., & Kautz, J. (2020). NVAE: A Deep Hierarchical Variational Autoencoder. Advances in Neural Information Processing Systems, 33, 19667-19679. Link: https://proceedings.neurips.cc/paper_files/paper/2020/file/e3b21256183cf7c2c7a66be163579d37-Paper.pdf
- Kingma, D. P., Salimans, T., Jozefowicz, R., Chen, X., Sutskever, I., & Welling, M. (2016). Improved Variational Inference With Inverse Autoregressive Flow. Advances in Neural Information Processing Systems, 29. Link: https://proceedings.neurips.cc/paper_files/paper/2016/file/ddeebdeefdb7e7e7a697e1c3e3d8ef54-Paper.pdf
- Chen, R. T., Li, X., Grosse, R. B., & Duvenaud, D. K. (2018). Isolating Sources of Disentanglement in Variational Autoencoders. Advances in Neural Information Processing Systems, 31. Link: https://proceedings.neurips.cc/paper_files/paper/2018/file/1ee3dfcd8a0645a25a35977997223d22-Paper.pdf
- Van Den Oord, A., & Vinyals, O. (2017). Neural Discrete Representation Learning. Advances in Neural Information Processing Systems, 30. Link: https://proceedings.neurips.cc/paper_files/paper/2017/file/7a98af17e63a0ac09ce2e96d03992fbc-Paper.pdf
- Flows
- Kingma, D. P., & Dhariwal, P. (2018). Glow: Generative Flow with Invertible 1x1 Convolutions. Advances in neural information processing systems, 31. Link: https://proceedings.neurips.cc/paper_files/paper/2018/file/d139db6a236200b21cc7f752979132d0-Paper.pdf
- Ho, J., Chen, X., Srinivas, A., Duan, Y., & Abbeel, P. (2019, May). Flow++: Improving Flow-Based Generative Models with Variational Dequantization and Architecture Design. In International Conference on Machine Learning (pp. 2722-2730). PMLR. Link: http://proceedings.mlr.press/v97/ho19a/ho19a.pdf
- [ORAL @ ICML 2025] Zhai, S., Zhang, R., Nakkiran, P., Berthelot, D., Gu, J., Zheng, H., ... & Susskind, J. M. (2025). Normalizing Flows are Capable Generative Models. In Forty-second International Conference on Machine Learning. Link: https://openreview.net/pdf?id=2uheUFcFsM
- Generative Adversarial Networks
- Chen, X., Duan, Y., Houthooft, R., Schulman, J., Sutskever, I., & Abbeel, P. (2016). InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets. Advances in Neural Information Processing Systems, 29. Link: https://proceedings.neurips.cc/paper_files/paper/2016/file/7c9d0b1f96aebd7b5eca8c3edaa19ebb-Paper.pdf
- Karras, T., Laine, S., & Aila, T. (2019). A Style-Based Generator Architecture for Generative Adversarial Networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 4401-4410). Link: https://openaccess.thecvf.com/content_CVPR_2019/papers/Karras_A_Style-Based_Generator_Architecture_for_Generative_Adversarial_Networks_CVPR_2019_paper.pdf
- Arjovsky, M., Chintala, S., & Bottou, L. (2017, July). Wasserstein Generative Adversarial Networks. In International Conference on Machine Learning (pp. 214-223). PMLR. Link: http://proceedings.mlr.press/v70/arjovsky17a/arjovsky17a.pdf
- Casanova, A., Careil, M., Verbeek, J., Drozdzal, M., & Romero Soriano, A. (2021). Instance-Conditioned GAN. Advances in Neural Information Processing Systems, 34, 27517-27529. Link: https://proceedings.neurips.cc/paper_files/paper/2021/file/e7ac288b0f2d41445904d071ba37aaff-Paper.pdf
- Diffusion
- Ho, J., Jain, A., & Abbeel, P. (2020). Denoising Diffusion Probabilistic Models. Advances in Neural Information Processing Systems, 33, 6840-6851. Link: https://proceedings.neurips.cc/paper_files/paper/2020/file/4c5bcfec8584af0d967f1ab10179ca4b-Paper.pdf
- Rombach, R., Blattmann, A., Lorenz, D., Esser, P., & Ommer, B. (2022). High-Resolution Image Synthesis with Latent Diffusion Models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 10684-10695). Link: https://openaccess.thecvf.com/content/CVPR2022/papers/Rombach_High-Resolution_Image_Synthesis_With_Latent_Diffusion_Models_CVPR_2022_paper.pdf
- Austin, J., Johnson, D. D., Ho, J., Tarlow, D., & Van Den Berg, R. (2021). Structured Denoising Diffusion Models in Discrete State-Spaces. Advances in Neural Information Processing Systems, 34, 17981-17993. Link: https://proceedings.neurips.cc/paper_files/paper/2021/file/958c530554f78bcd8e97125b70e6973d-Paper.pdf
- Ho, J., & Salimans, T. (2021). Classifier-Free Diffusion Guidance. In NeurIPS 2021 Workshop on Deep Generative Models and Downstream Applications. Link: https://openreview.net/pdf?id=qw8AKxfYbI
- Multimodality
- Nichol, A. Q., Dhariwal, P., Ramesh, A., Shyam, P., Mishkin, P., Mcgrew, B., ... & Chen, M. (2022, June). GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models. In International Conference on Machine Learning (pp. 16784-16804). PMLR. Link: https://proceedings.mlr.press/v162/nichol22a/nichol22a.pdf
- [ORAL @ NeurIPS 2024] Karras, T., Aittala, M., Kynkäänniemi, T., Lehtinen, J., Aila, T., & Laine, S. (2024). Guiding a Diffusion Model with a Bad Version of Itself. Advances in Neural Information Processing Systems, 37, 52996-53021. Link: https://proceedings.neurips.cc/paper_files/paper/2024/file/5ee7ed60a7e8169012224dec5fe0d27f-Paper-Conference.pdf
- Palumbo, E., Daunhawer, I., & Vogt, J. E. (2023). MMVAE+: Enhancing the Generative Quality of Multimodal VAEs without Compromises. In The Eleventh International Conference on Learning Representations. Link: https://openreview.net/pdf?id=sdQGxouELX
- Mancisidor, R. A., Jenssen, R., Yu, S., & Kampffmeyer, M. (2025, May). Aggregation of Dependent Expert Distributions in Multimodal Variational Autoencoders. In Forty-second International Conference on Machine Learning. Link: https://openreview.net/pdf?id=jYmGi1175R
- Campbell, A., Yim, J., Barzilay, R., Rainforth, T., & Jaakkola, T. (2024, July). Generative Flows on Discrete State-Spaces: Enabling Multimodal Flows with Applications to Protein Co-Design. In International Conference on Machine Learning (pp. 5453-5512). PMLR. Link: https://proceedings.mlr.press/v235/campbell24a.html
- Kang, M., Zhu, J. Y., Zhang, R., Park, J., Shechtman, E., Paris, S., & Park, T. (2023). Scaling up GANs for Text-to-Image Synthesis. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 10124-10134). Link: https://openaccess.thecvf.com/content/CVPR2023/papers/Kang_Scaling_Up_GANs_for_Text-to-Image_Synthesis_CVPR_2023_paper.pdf
