News
Paper assignment

Written on 30.10.25 by María Martínez-García

Dear Students,

After reviewing your paper preferences, we have finalized the list of paper assignments. You can find your matriculation number listed before each reference. Presentations will take place during the presentation session corresponding to each block. Remember that your presentation should be 15 minutes long, followed by a Q&A with questions from both the audience and the instructors. If you notice that you’ve been assigned a paper you marked as “I would not present” or find any other error, please let us know as soon as possible.

Additionally, we’ve noticed that only three of you have registered on LSF so far. Please remember that registration is required by November 4, one week before the first presentation session.

If you have any questions, don’t hesitate to reach out.
Deep Probabilistic Generative Models
With the development of neural networks and increased computational power, deep generative modeling has emerged as one of the leading directions in AI. We are shifting from traditional discriminative tasks (such as classification, segmentation, or clustering), which focus on modeling conditional distributions, to a more comprehensive framework aimed at modeling the joint distribution of the data itself. Discriminative models alone can be insufficient for robust decision-making and the development of intelligent systems, as it is also necessary to understand the underlying data-generating process and be able to express uncertainty about the environment.
Typically, in the deep learning literature, generative models are viewed as methods for synthesizing new data. In this seminar, however, we will adopt a probabilistic perspective to highlight that modeling the marginal likelihood of the data has much broader applicability and can be essential for building successful AI systems.
In this seminar, we will ask ourselves how to formulate deep generative models (i.e., how to express and learn the marginal likelihood of the data) and explore the different approaches proposed in the literature. The aim is for students to critically assess existing methods, understand their strengths and limitations, and identify potential directions for future research.
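To make the phrase "express and learn the marginal likelihood of the data" concrete, here is a hypothetical toy sketch (not part of the seminar materials): fitting a one-dimensional Gaussian density p(x) by maximum likelihood, the simplest instance of the generative-modeling recipe the seminar generalizes with deep models. The function names are illustrative, not taken from any course code.

```python
import math
import random

def fit_gaussian_mle(data):
    """Closed-form maximum-likelihood fit of a 1-D Gaussian model p(x)."""
    n = len(data)
    mean = sum(data) / n
    var = sum((x - mean) ** 2 for x in data) / n  # MLE uses 1/n, not 1/(n-1)
    return mean, var

def log_likelihood(data, mean, var):
    """Log marginal likelihood of the data under a Gaussian with given parameters."""
    return sum(
        -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)
        for x in data
    )

# Draw synthetic data from a "true" N(2, 0.5^2) process, then recover it.
random.seed(0)
data = [random.gauss(2.0, 0.5) for _ in range(1000)]
mean, var = fit_gaussian_mle(data)
print(mean, var)  # close to 2.0 and 0.25
```

Deep generative models replace this closed-form family with a neural-network-parameterized density and replace the closed-form fit with (approximate) maximum-likelihood training; the objective of modeling p(x) itself is the same.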
- Block 1: Explicit Density Models (VAEs and Flows)
- Block 2: Implicit Density Models (GANs and DDPMs)
- Block 3: Multimodal Generation
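As a toy illustration of the "explicit density" idea behind Block 1 (again a hypothetical sketch, not seminar code): a one-dimensional affine normalizing flow transforms a standard normal base variable and yields an exact, tractable log-density via the change-of-variables formula, which is the property that distinguishes explicit from implicit density models.

```python
import math

def log_standard_normal(z):
    """log N(z; 0, 1), the base density of the flow."""
    return -0.5 * (math.log(2 * math.pi) + z * z)

def affine_flow_log_density(x, a, b):
    """Exact log p(x) for the flow x = a * z + b with z ~ N(0, 1).

    Change of variables: log p(x) = log N(z; 0, 1) - log|a|, with z = (x - b) / a.
    """
    z = (x - b) / a
    return log_standard_normal(z) - math.log(abs(a))

# With a = 2, b = 1 the flow pushes N(0, 1) to N(1, 4); the flow's density
# matches the closed-form Gaussian density at any point.
lp = affine_flow_log_density(3.0, 2.0, 1.0)
closed_form = -0.5 * (math.log(2 * math.pi * 4.0) + (3.0 - 1.0) ** 2 / 4.0)
print(lp, closed_form)
```

GANs and DDPMs in Block 2, by contrast, define the model through a sampling process for which such an exact log-density is not directly available.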
Each block includes a paper session and a panel discussion, as detailed below. Students are evaluated on their presentations, active participation in discussions, and a final report that summarizes the seminar topics, offers critical analysis, identifies limitations, and suggests potential research directions.
| Date | Block | Content |
|---|---|---|
| 28/10/2025 (12:15 - 13:45) | Background Session | Introduction to probabilistic modeling |
| 04/11/2025 (12:15 - 13:45) | Background Session | Introduction to generative modeling |
| 11/11/2025 (12:15 - 13:45) | Block 1 - Explicit Density Models | Paper presentations + Q&A |
| 18/11/2025 (12:15 - 13:45) | Block 1 - Explicit Density Models | Panel discussion |
| 09/12/2025 (12:15 - 13:45) | Block 2 - Implicit Density Models | Paper presentations + Q&A |
| 16/12/2025 (12:15 - 13:45) | Block 2 - Implicit Density Models | Panel discussion |
| 13/01/2026 (12:15 - 13:45) | Block 3 - Multimodal Generation | Paper presentations + Q&A |
| 20/01/2026 (12:15 - 13:45) | Block 3 - Multimodal Generation | Panel discussion |
LOCATION: SR4 in E2.5
Attendance
The seminar will be held in person, and attendance is required. Students may miss no more than two sessions without providing a justification.
Deliverables and Grading Scheme
- Paper Presentation (15 minutes) (40%)
- Submission (requirement): Students must submit a PDF of their slides on the day they present.
- Context
- Positioning of the paper within the state of the art and identification of gaps the paper addresses.
- Clear articulation of What/Why/How.
- Content
- Clear explanation of the paper's core intuition and methodology.
- Rationale behind experiments and significance of results.
- Advantages and disadvantages of the method presented.
- Final slide/section presenting the take-home messages.
- Q&A
- Responding to questions from TAs and the audience.
- Discussion Session (20%)
- Pre-Submission Requirements
- Each participant must submit one discussion question per block; this is mandatory for students who are not presenting a paper in that block.
- Submission deadline: 2 days before the session via CMS.
- Participation Expectations
- For Presenters:
- Active engagement answering the questions.
- Facilitating broader discussion.
- For Listeners:
- Quality of pre-submitted questions.
- Active participation in discussions.
- Final Report (6-8 pages, excluding references) (40%)
- Critical Analysis
- Comprehensive summary of the seminar, including the key points from the papers presented and discussion sessions.
- Comprehensive analysis regarding limitations, advantages, and open challenges in the field.
- Connection of the discussion to the broader research context, beyond the papers discussed in the seminar.
- Evaluation Criteria
- Content
- Depth of the analysis.
- Demonstration of understanding across all three seminar blocks.
- Synthesis of seminar content with broader research context.
- Delivery
- Quality of writing and argumentation.
List of papers
- Variational Autoencoders
- Tomczak, J., & Welling, M. (2018). VAE with a VampPrior. Proceedings of the Twenty-First International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 84:1214-1223. Link: https://proceedings.mlr.press/v84/tomczak18a.html.
- Vahdat, A., & Kautz, J. (2020). NVAE: A Deep Hierarchical Variational Autoencoder. Advances in Neural Information Processing Systems, 33, 19667-19679. Link: https://proceedings.neurips.cc/paper_files/paper/2020/file/e3b21256183cf7c2c7a66be163579d37-Paper.pdf
- Kingma, D. P., Salimans, T., Jozefowicz, R., Chen, X., Sutskever, I., & Welling, M. (2016). Improved Variational Inference With Inverse Autoregressive Flow. Advances in Neural Information Processing Systems, 29. Link: https://proceedings.neurips.cc/paper_files/paper/2016/file/ddeebdeefdb7e7e7a697e1c3e3d8ef54-Paper.pdf
- Chen, R. T., Li, X., Grosse, R. B., & Duvenaud, D. K. (2018). Isolating Sources of Disentanglement in Variational Autoencoders. Advances in Neural Information Processing Systems, 31. Link: https://proceedings.neurips.cc/paper_files/paper/2018/file/1ee3dfcd8a0645a25a35977997223d22-Paper.pdf
- Van Den Oord, A., & Vinyals, O. (2017). Neural Discrete Representation Learning. Advances in Neural Information Processing Systems, 30. Link: https://proceedings.neurips.cc/paper_files/paper/2017/file/7a98af17e63a0ac09ce2e96d03992fbc-Paper.pdf
- Flows
- Kingma, D. P., & Dhariwal, P. (2018). Glow: Generative Flow with Invertible 1x1 Convolutions. Advances in neural information processing systems, 31. Link: https://proceedings.neurips.cc/paper_files/paper/2018/file/d139db6a236200b21cc7f752979132d0-Paper.pdf
- Ho, J., Chen, X., Srinivas, A., Duan, Y., & Abbeel, P. (2019, May). Flow++: Improving Flow-Based Generative Models with Variational Dequantization and Architecture Design. In International Conference on Machine Learning (pp. 2722-2730). PMLR. Link: http://proceedings.mlr.press/v97/ho19a/ho19a.pdf
- [ORAL @ ICML 2025] Zhai, S., Zhang, R., Nakkiran, P., Berthelot, D., Gu, J., Zheng, H., ... & Susskind, J. M. (2025). Normalizing Flows are Capable Generative Models. In Forty-second International Conference on Machine Learning. Link: https://openreview.net/pdf?id=2uheUFcFsM
- Generative Adversarial Networks
- Chen, X., Duan, Y., Houthooft, R., Schulman, J., Sutskever, I., & Abbeel, P. (2016). InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets. Advances in Neural Information Processing Systems, 29. Link: https://proceedings.neurips.cc/paper_files/paper/2016/file/7c9d0b1f96aebd7b5eca8c3edaa19ebb-Paper.pdf
- Karras, T., Laine, S., & Aila, T. (2019). A Style-Based Generator Architecture for Generative Adversarial Networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 4401-4410). Link: https://openaccess.thecvf.com/content_CVPR_2019/papers/Karras_A_Style-Based_Generator_Architecture_for_Generative_Adversarial_Networks_CVPR_2019_paper.pdf
- Arjovsky, M., Chintala, S., & Bottou, L. (2017, July). Wasserstein Generative Adversarial Networks. In International Conference on Machine Learning (pp. 214-223). PMLR. Link: http://proceedings.mlr.press/v70/arjovsky17a/arjovsky17a.pdf
- Casanova, A., Careil, M., Verbeek, J., Drozdzal, M., & Romero Soriano, A. (2021). Instance-Conditioned GAN. Advances in Neural Information Processing Systems, 34, 27517-27529. Link: https://proceedings.neurips.cc/paper_files/paper/2021/file/e7ac288b0f2d41445904d071ba37aaff-Paper.pdf
- Diffusion
- Ho, J., Jain, A., & Abbeel, P. (2020). Denoising Diffusion Probabilistic Models. Advances in Neural Information Processing Systems, 33, 6840-6851. Link: https://proceedings.neurips.cc/paper_files/paper/2020/file/4c5bcfec8584af0d967f1ab10179ca4b-Paper.pdf
- Rombach, R., Blattmann, A., Lorenz, D., Esser, P., & Ommer, B. (2022). High-Resolution Image Synthesis with Latent Diffusion Models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 10684-10695). Link: https://openaccess.thecvf.com/content/CVPR2022/papers/Rombach_High-Resolution_Image_Synthesis_With_Latent_Diffusion_Models_CVPR_2022_paper.pdf
- Austin, J., Johnson, D. D., Ho, J., Tarlow, D., & Van Den Berg, R. (2021). Structured Denoising Diffusion Models in Discrete State-Spaces. Advances in Neural Information Processing Systems, 34, 17981-17993. Link: https://proceedings.neurips.cc/paper_files/paper/2021/file/958c530554f78bcd8e97125b70e6973d-Paper.pdf
- Ho, J., & Salimans, T. (2021). Classifier-Free Diffusion Guidance. In NeurIPS 2021 Workshop on Deep Generative Models and Downstream Applications. Link: https://openreview.net/pdf?id=qw8AKxfYbI
- Multimodality
- Nichol, A. Q., Dhariwal, P., Ramesh, A., Shyam, P., Mishkin, P., Mcgrew, B., ... & Chen, M. (2022, June). GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models. In International Conference on Machine Learning (pp. 16784-16804). PMLR. Link: https://proceedings.mlr.press/v162/nichol22a/nichol22a.pdf
- [ORAL @ NeurIPS 2024] Karras, T., Aittala, M., Kynkäänniemi, T., Lehtinen, J., Aila, T., & Laine, S. (2024). Guiding a Diffusion Model with a Bad Version of Itself. Advances in Neural Information Processing Systems, 37, 52996-53021. Link: https://proceedings.neurips.cc/paper_files/paper/2024/file/5ee7ed60a7e8169012224dec5fe0d27f-Paper-Conference.pdf
- Palumbo, E., Daunhawer, I., & Vogt, J. E. (2023). MMVAE+: Enhancing the Generative Quality of Multimodal VAEs without Compromises. In The Eleventh International Conference on Learning Representations. Link: https://openreview.net/pdf?id=sdQGxouELX
- Mancisidor, R. A., Jenssen, R., Yu, S., & Kampffmeyer, M. (2025, May). Aggregation of Dependent Expert Distributions in Multimodal Variational Autoencoders. In Forty-second International Conference on Machine Learning. Link: https://openreview.net/pdf?id=jYmGi1175R
- Campbell, A., Yim, J., Barzilay, R., Rainforth, T., & Jaakkola, T. (2024, July). Generative Flows on Discrete State-Spaces: Enabling Multimodal Flows with Applications to Protein Co-Design. In International Conference on Machine Learning (pp. 5453-5512). PMLR. Link: https://proceedings.mlr.press/v235/campbell24a.html
- Kang, M., Zhu, J. Y., Zhang, R., Park, J., Shechtman, E., Paris, S., & Park, T. (2023). Scaling up GANs for Text-to-Image Synthesis. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 10124-10134). Link: https://openaccess.thecvf.com/content/CVPR2023/papers/Kang_Scaling_Up_GANs_for_Text-to-Image_Synthesis_CVPR_2023_paper.pdf
