News
Panel Discussion
Written on 24.11.25 by María Martínez-García

Dear students, Tomorrow we will have the first panel session of the seminar. We have uploaded a PDF file compiling the pending points from the last session along with your submitted questions, so you can prepare for the discussion. Students who presented papers in this block will act as “experts” and lead the discussion, but all students are expected to participate actively. Everyone will be evaluated based on their assigned role. The session will begin with the presentation of the paper “VAE with a VampPrior”, followed by the panel discussion. Looking forward to seeing you all tomorrow! Your seminar team
Reschedule Presentations and Panel Discussion
Written on 11.11.25 by María Martínez-García

Dear Students, Due to the illness of the three presenters scheduled for today’s session, we have decided to cancel and postpone the first-block sessions by one week. This means that the paper presentations of the first block will take place on November 18, and the panel discussion on November 25, at the usual time slot. Remember that you can reach out to get feedback on your presentations, but please do so at least one week before your presentation to ensure we can schedule a session and you have time to incorporate the feedback. We’ve also noticed some confusion regarding submissions, so here’s a clarification:

Also, remember that attendance is mandatory, and you may miss no more than two sessions without a valid justification. Thank you all for your understanding today. See you all next week! Best, Your seminar team
Paper assignment
Written on 30.10.25 by María Martínez-García

Dear Students, After reviewing your paper preferences, we have finalized the list of paper assignments. You can find your matriculation number listed before each reference. Presentations will take place during the presentation session corresponding to each block. Remember that your presentation should be 15 minutes long, followed by a Q&A with questions from both the audience and the instructors. If you notice that you’ve been assigned a paper you marked as “I would not present” or find any other error, please let us know as soon as possible. Additionally, we’ve noticed that only three of you have registered on LSF so far. Please remember that registration is required by November 4, one week before the first presentation session.

If you have any questions, don’t hesitate to reach out.
Deep Probabilistic Generative Models
With the development of neural networks and increased computational power, deep generative modeling has emerged as one of the leading directions in AI. We are shifting from traditional discriminative tasks (such as classification, segmentation, or clustering), which focus on modeling conditional distributions, to a more comprehensive framework aimed at modeling the joint distribution of the data itself. Discriminative models alone can be insufficient for robust decision-making and the development of intelligent systems, as it is also necessary to understand the underlying data-generating process and be able to express uncertainty about the environment.
Typically, in deep learning literature, generative models are viewed as methods for synthesizing new data. However, in this seminar, we will adopt a probabilistic perspective to highlight that modeling the marginal likelihood of the data has much broader applicability, and this could be essential for building successful AI systems.
In this seminar, we will ask ourselves how to formulate deep generative models (i.e., how to express and learn the marginal likelihood of the data) and explore the different approaches proposed in the literature. The aim is for students to critically assess existing methods, understand their strengths and limitations, and identify potential directions for future research.
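As a toy illustration of what "expressing and learning the marginal likelihood" means (this sketch is not part of the seminar materials), the simplest explicit density model fits a 1D Gaussian p(x) by maximum likelihood and then evaluates the log-likelihood of held-out data — exactly the quantity that flows make tractable and VAEs lower-bound, only with far more expressive densities:

```python
import math
import random

def fit_gaussian(xs):
    """Closed-form maximum-likelihood estimates for a 1D Gaussian."""
    n = len(xs)
    mu = sum(xs) / n
    var = sum((x - mu) ** 2 for x in xs) / n
    return mu, var

def log_likelihood(xs, mu, var):
    """Sum of log N(x; mu, var) over the data: the marginal log-likelihood
    under the fitted model."""
    return sum(
        -0.5 * (math.log(2 * math.pi * var) + (x - mu) ** 2 / var)
        for x in xs
    )

random.seed(0)
train = [random.gauss(2.0, 1.0) for _ in range(1000)]  # data from N(2, 1)
test = [random.gauss(2.0, 1.0) for _ in range(100)]

mu, var = fit_gaussian(train)
ll = log_likelihood(test, mu, var)
print(f"mu={mu:.2f} var={var:.2f} test log-likelihood={ll:.1f}")
```

A deep generative model replaces the single Gaussian with a neural density (a flow, a hierarchical latent-variable model, etc.), but the objective — maximize the log-likelihood of the data under the model — is the same.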
- Block 1: Explicit Density Models (VAEs and Flows)
- Block 2: Implicit Density Models (GANs and DDPMs)
- Block 3: Multimodal Generation
Each block includes a paper session and a panel discussion, as detailed below. Students are evaluated on their presentations, their active participation in the discussions, and a final report that summarizes the seminar topics, offers critical analysis, identifies limitations, and suggests potential research directions.
| Date | Block | Content |
|---|---|---|
| 28/10/2025 (12:15 - 13:45) | Background Session | Introduction to probabilistic modeling |
| 04/11/2025 (12:15 - 13:45) | Background Session | Introduction to generative modeling |
| 18/11/2025 (12:15 - 13:45) | Block 1 - Explicit Density Models | Paper presentations + Q&A |
| 25/11/2025 (12:15 - 13:45) | Block 1 - Explicit Density Models | Panel discussion |
| 09/12/2025 (12:15 - 13:45) | Block 2 - Implicit Density Models | Paper presentations + Q&A |
| 16/12/2025 (12:15 - 13:45) | Block 2 - Implicit Density Models | Panel discussion |
| 13/01/2026 (12:15 - 13:45) | Block 3 - Multimodal Generation | Paper presentations + Q&A |
| 20/01/2026 (12:15 - 13:45) | Block 3 - Multimodal Generation | Panel discussion |
LOCATION: SR4 in E2.5
Attendance
The seminar will be held in person, and attendance is required. Students may miss no more than two sessions without providing a justification.
Deliverables and Grading Scheme
- Paper Presentation (15 minutes) (40%)
  - Submission (requirement): Students must submit a PDF file with their slides on the day of their presentation.
- Context
- Positioning of the paper within the state of the art and identification of gaps the paper addresses.
- Clear articulation of What/Why/How.
- Content
- Clear explanation of the paper's core intuition and methodology.
- Rationale behind experiments and significance of results.
- Advantages and disadvantages of the method presented.
- Final slide/section presenting the take-home messages.
- Q&A
  - Responding to questions from TAs and the audience.
- Discussion Session (20%)
- Pre-Submission Requirements
- Each participant must submit one discussion question per block. This requirement is mandatory for students who did not present a paper in that block.
- Submission deadline: 2 days before the session via CMS.
- Participation Expectations
- For Presenters:
- Active engagement answering the questions.
- Facilitating broader discussion.
- For Listeners:
- Quality of pre-submitted questions.
- Active participation in discussions.
- Final Report (6-8 pages, excluding references) (40%)
- Critical Analysis
- Comprehensive summary of the seminar, including the key points from the papers presented and discussion sessions.
- Comprehensive analysis regarding limitations, advantages, and open challenges in the field.
  - Connection of the discussion to the broader research context, beyond the papers discussed in the seminar.
- Evaluation Criteria
- Content
- Depth of the analysis.
- Demonstration of understanding across all three seminar blocks.
- Synthesis of seminar content with broader research context.
- Delivery
- Quality of writing and argumentation.
Paper presentations
- BLOCK 1:
- [7058394] Tomczak, J., & Welling, M. (2018). VAE with a VampPrior. Proceedings of the Twenty-First International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 84:1214-1223. Link: https://proceedings.mlr.press/v84/tomczak18a.html
- [7070291] Vahdat, A., & Kautz, J. (2020). NVAE: A Deep Hierarchical Variational Autoencoder. Advances in Neural Information Processing Systems, 33, 19667-19679. Link: https://proceedings.neurips.cc/paper_files/paper/2020/file/e3b21256183cf7c2c7a66be163579d37-Paper.pdf
- [7075715] Kingma, D. P., & Dhariwal, P. (2018). Glow: Generative Flow with Invertible 1x1 Convolutions. Advances in Neural Information Processing Systems, 31. Link: https://proceedings.neurips.cc/paper_files/paper/2018/file/d139db6a236200b21cc7f752979132d0-Paper.pdf
- [7076153] Zhai, S., Zhang, R., Nakkiran, P., Berthelot, D., Gu, J., Zheng, H., ... & Susskind, J. M. (2025). Normalizing Flows are Capable Generative Models. In Forty-second International Conference on Machine Learning. Link: https://openreview.net/pdf?id=2uheUFcFsM
- BLOCK 2:
- [7062300] Rombach, R., Blattmann, A., Lorenz, D., Esser, P., & Ommer, B. (2022). High-Resolution Image Synthesis with Latent Diffusion Models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 10684-10695). Link: https://openaccess.thecvf.com/content/CVPR2022/papers/Rombach_High-Resolution_Image_Synthesis_With_Latent_Diffusion_Models_CVPR_2022_paper.pdf
- [7076165] Ho, J., & Salimans, T. Classifier-Free Diffusion Guidance. In NeurIPS 2021 Workshop on Deep Generative Models and Downstream Applications. Link: https://openreview.net/pdf?id=qw8AKxfYbI
- [7026925] Chen, X., Duan, Y., Houthooft, R., Schulman, J., Sutskever, I., & Abbeel, P. (2016). InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets. Advances in Neural Information Processing Systems, 29. Link: https://proceedings.neurips.cc/paper_files/paper/2016/file/7c9d0b1f96aebd7b5eca8c3edaa19ebb-Paper.pdf
- [7074899] Casanova, A., Careil, M., Verbeek, J., Drozdzal, M., & Romero Soriano, A. (2021). Instance-Conditioned GAN. Advances in Neural Information Processing Systems, 34, 27517-27529. Link: https://proceedings.neurips.cc/paper_files/paper/2021/file/e7ac288b0f2d41445904d071ba37aaff-Paper.pdf
- BLOCK 3:
- [7086510] Palumbo, E., Daunhawer, I., & Vogt, J. E. (2023). MMVAE+: Enhancing the Generative Quality of Multimodal VAEs without Compromises. In The Eleventh International Conference on Learning Representations. Link: https://openreview.net/pdf?id=sdQGxouELX
- [7073034] Campbell, A., Yim, J., Barzilay, R., Rainforth, T., & Jaakkola, T. (2024, July). Generative Flows on Discrete State-Spaces: Enabling Multimodal Flows with Applications to Protein Co-Design. In International Conference on Machine Learning (pp. 5453-5512). PMLR. Link: https://proceedings.mlr.press/v235/campbell24a.html
- [7057757] Kang, M., Zhu, J. Y., Zhang, R., Park, J., Shechtman, E., Paris, S., & Park, T. (2023). Scaling up GANs for Text-to-Image Synthesis. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 10124-10134). Link: https://openaccess.thecvf.com/content/CVPR2023/papers/Kang_Scaling_Up_GANs_for_Text-to-Image_Synthesis_CVPR_2023_paper.pdf
- [7081324] Chun, S., Kim, W., Park, S., & Yun, S. (2025) Probabilistic Language-Image Pre-Training. In The Thirteenth International Conference on Learning Representations. Link: https://openreview.net/pdf?id=D5X6nPGFUY
Full list of papers proposed
- Variational Autoencoders
- Tomczak, J., & Welling, M. (2018). VAE with a VampPrior. Proceedings of the Twenty-First International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 84:1214-1223. Link: https://proceedings.mlr.press/v84/tomczak18a.html
- Vahdat, A., & Kautz, J. (2020). NVAE: A Deep Hierarchical Variational Autoencoder. Advances in Neural Information Processing Systems, 33, 19667-19679. Link: https://proceedings.neurips.cc/paper_files/paper/2020/file/e3b21256183cf7c2c7a66be163579d37-Paper.pdf
- Kingma, D. P., Salimans, T., Jozefowicz, R., Chen, X., Sutskever, I., & Welling, M. (2016). Improved Variational Inference With Inverse Autoregressive Flow. Advances in Neural Information Processing Systems, 29. Link: https://proceedings.neurips.cc/paper_files/paper/2016/file/ddeebdeefdb7e7e7a697e1c3e3d8ef54-Paper.pdf
- Chen, R. T., Li, X., Grosse, R. B., & Duvenaud, D. K. (2018). Isolating Sources of Disentanglement in Variational Autoencoders. Advances in Neural Information Processing Systems, 31. Link: https://proceedings.neurips.cc/paper_files/paper/2018/file/1ee3dfcd8a0645a25a35977997223d22-Paper.pdf
- Van Den Oord, A., & Vinyals, O. (2017). Neural Discrete Representation Learning. Advances in Neural Information Processing Systems, 30. Link: https://proceedings.neurips.cc/paper_files/paper/2017/file/7a98af17e63a0ac09ce2e96d03992fbc-Paper.pdf
- Flows
- Kingma, D. P., & Dhariwal, P. (2018). Glow: Generative Flow with Invertible 1x1 Convolutions. Advances in Neural Information Processing Systems, 31. Link: https://proceedings.neurips.cc/paper_files/paper/2018/file/d139db6a236200b21cc7f752979132d0-Paper.pdf
- Ho, J., Chen, X., Srinivas, A., Duan, Y., & Abbeel, P. (2019, May). Flow++: Improving Flow-Based Generative Models with Variational Dequantization and Architecture Design. In International Conference on Machine Learning (pp. 2722-2730). PMLR. Link: http://proceedings.mlr.press/v97/ho19a/ho19a.pdf
- [ORAL @ ICML 2025] Zhai, S., Zhang, R., Nakkiran, P., Berthelot, D., Gu, J., Zheng, H., ... & Susskind, J. M. (2025). Normalizing Flows are Capable Generative Models. In Forty-second International Conference on Machine Learning. Link: https://openreview.net/pdf?id=2uheUFcFsM
- Generative Adversarial Networks
- Chen, X., Duan, Y., Houthooft, R., Schulman, J., Sutskever, I., & Abbeel, P. (2016). InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets. Advances in Neural Information Processing Systems, 29. Link: https://proceedings.neurips.cc/paper_files/paper/2016/file/7c9d0b1f96aebd7b5eca8c3edaa19ebb-Paper.pdf
- Karras, T., Laine, S., & Aila, T. (2019). A Style-Based Generator Architecture for Generative Adversarial Networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 4401-4410). Link: https://openaccess.thecvf.com/content_CVPR_2019/papers/Karras_A_Style-Based_Generator_Architecture_for_Generative_Adversarial_Networks_CVPR_2019_paper.pdf
- Arjovsky, M., Chintala, S., & Bottou, L. (2017, July). Wasserstein Generative Adversarial Networks. In International Conference on Machine Learning (pp. 214-223). PMLR. Link: http://proceedings.mlr.press/v70/arjovsky17a/arjovsky17a.pdf
- Casanova, A., Careil, M., Verbeek, J., Drozdzal, M., & Romero Soriano, A. (2021). Instance-Conditioned GAN. Advances in Neural Information Processing Systems, 34, 27517-27529. Link: https://proceedings.neurips.cc/paper_files/paper/2021/file/e7ac288b0f2d41445904d071ba37aaff-Paper.pdf
- Diffusion
- Ho, J., Jain, A., & Abbeel, P. (2020). Denoising Diffusion Probabilistic Models. Advances in Neural Information Processing Systems, 33, 6840-6851. Link: https://proceedings.neurips.cc/paper_files/paper/2020/file/4c5bcfec8584af0d967f1ab10179ca4b-Paper.pdf
- Rombach, R., Blattmann, A., Lorenz, D., Esser, P., & Ommer, B. (2022). High-Resolution Image Synthesis with Latent Diffusion Models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 10684-10695). Link: https://openaccess.thecvf.com/content/CVPR2022/papers/Rombach_High-Resolution_Image_Synthesis_With_Latent_Diffusion_Models_CVPR_2022_paper.pdf
- Austin, J., Johnson, D. D., Ho, J., Tarlow, D., & Van Den Berg, R. (2021). Structured Denoising Diffusion Models in Discrete State-Spaces. Advances in Neural Information Processing Systems, 34, 17981-17993. Link: https://proceedings.neurips.cc/paper_files/paper/2021/file/958c530554f78bcd8e97125b70e6973d-Paper.pdf
- Ho, J., & Salimans, T. Classifier-Free Diffusion Guidance. In NeurIPS 2021 Workshop on Deep Generative Models and Downstream Applications. Link: https://openreview.net/pdf?id=qw8AKxfYbI
- Multimodality
- Nichol, A. Q., Dhariwal, P., Ramesh, A., Shyam, P., Mishkin, P., Mcgrew, B., ... & Chen, M. (2022, June). GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models. In International Conference on Machine Learning (pp. 16784-16804). PMLR. Link: https://proceedings.mlr.press/v162/nichol22a/nichol22a.pdf
- [ORAL @ NeurIPS 2024] Karras, T., Aittala, M., Kynkäänniemi, T., Lehtinen, J., Aila, T., & Laine, S. (2024). Guiding a Diffusion Model with a Bad Version of Itself. Advances in Neural Information Processing Systems, 37, 52996-53021. Link: https://proceedings.neurips.cc/paper_files/paper/2024/file/5ee7ed60a7e8169012224dec5fe0d27f-Paper-Conference.pdf
- Palumbo, E., Daunhawer, I., & Vogt, J. E. (2023). MMVAE+: Enhancing the Generative Quality of Multimodal VAEs without Compromises. In The Eleventh International Conference on Learning Representations. Link: https://openreview.net/pdf?id=sdQGxouELX
- Mancisidor, R. A., Jenssen, R., Yu, S., & Kampffmeyer, M. (2025, May). Aggregation of Dependent Expert Distributions in Multimodal Variational Autoencoders. In Forty-second International Conference on Machine Learning. Link: https://openreview.net/pdf?id=jYmGi1175R
- Campbell, A., Yim, J., Barzilay, R., Rainforth, T., & Jaakkola, T. (2024, July). Generative Flows on Discrete State-Spaces: Enabling Multimodal Flows with Applications to Protein Co-Design. In International Conference on Machine Learning (pp. 5453-5512). PMLR. Link: https://proceedings.mlr.press/v235/campbell24a.html
- Kang, M., Zhu, J. Y., Zhang, R., Park, J., Shechtman, E., Paris, S., & Park, T. (2023). Scaling up GANs for Text-to-Image Synthesis. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 10124-10134). Link: https://openaccess.thecvf.com/content/CVPR2023/papers/Kang_Scaling_Up_GANs_for_Text-to-Image_Synthesis_CVPR_2023_paper.pdf
