News

Data and Society, Presentations Starting this Friday

Written on 20.11.24 by Annika Hass

Dear all,

 

We would like to remind you of the criteria for the paper presentations starting this Friday. You can find them in the section Materials under Grading.

 

On Friday, we will listen to the following paper presentations:

 

Parking occupancy estimation on planetscope satellite images, Chaitanya

(https://ieeexplore.ieee.org/abstract/document/9323104)

 

Ideational diffusion and the great witch hunt in Central Europe, Prakhar Narian

(https://link.springer.com/article/10.1007/s11186-024-09576-1)

 

Persistent Pre-Training Poisoning of LLMs, Prakhar

(https://arxiv.org/abs/2410.13722)

 

Please ensure you are well-prepared for the discussion and have reviewed the key figures in advance.

 

If desired, we could record the presentations and provide more detailed feedback on presentation style.

 

Looking forward to hearing the talks and discussing the papers with you on Friday!

 

Best regards,

 

Your Data and Society Team

Written on 20.11.24 by Annika Hass

Dear all,

Elisa has sent me the slides from her talk, which you can find under Materials.

They are for your private use. Please do not share them.
Best regards,

Annika

Links I showed on Friday re "most published research findings are false"

Written on 18.11.24 by Ingmar Weber

Pimeyes: scary facial recognition service, https://pimeyes.com/en

"Why Most Published Research Findings Are False", the paper that kicked off the "replication crisis" (https://en.wikipedia.org/wiki/Replication_crisis), https://journals.plos.org/plosmedicine/article?id=10.1371/journal.pmed.0020124

"Chocolate promotes weight loss", the problem with trying out lots of things and only reporting the one that works, https://gizmodo.com/i-fooled-millions-into-thinking-chocolate-helps-weight-1707251800, https://www.cbsnews.com/news/how-the-chocolate-diet-hoax-fooled-millions/

Bonferroni Correction, one of the ways to deal with this "multiple hypothesis" setting: https://en.wikipedia.org/wiki/Bonferroni_correction
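
To make the multiple-hypothesis problem concrete, here is a minimal Python sketch. The p-values and the number of tests are made up for illustration and are not taken from any of the linked studies, and the false-positive formula assumes independent tests. It shows how quickly the chance of at least one false positive grows with the number of tests, and how the Bonferroni correction tightens the threshold to alpha / m.

```python
def family_wise_error_rate(m, alpha=0.05):
    """Chance of at least one false positive across m independent tests."""
    return 1 - (1 - alpha) ** m


def bonferroni(p_values, alpha=0.05):
    """Return the corrected threshold alpha / m and a per-test significance flag."""
    threshold = alpha / len(p_values)
    return threshold, [p <= threshold for p in p_values]


# Hypothetical p-values from testing five outcomes in a single study.
p_values = [0.04, 0.20, 0.011, 0.60, 0.008]

uncorrected = [p <= 0.05 for p in p_values]
threshold, corrected = bonferroni(p_values)

print("Significant without correction:", sum(uncorrected))
print("Bonferroni threshold:", threshold)
print("Significant after correction:", sum(corrected))
print("Chance of a false positive with 20 tests:", round(family_wise_error_rate(20), 2))
```

With these made-up numbers, three of the five results pass the usual 0.05 threshold, but only one survives the correction.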

Same data, different analysts, different conclusions: two studies show that the _same_ data and the _same_ research question can lead to different results: https://journals.sagepub.com/doi/full/10.1177/2515245917747646, https://www.sciencedirect.com/science/article/pii/S0749597821000200

Reproducibility crisis in machine learning, partly caused by "leakage" where some information from the training data leaks into the test data: https://reproducible.cs.princeton.edu/

One example of a meta-analysis of whether [insert some food] is good or bad for you. Coffee in this case: https://www.bmj.com/content/359/bmj.j5024

 

 

 

Paper assignments

Written on 15.11.24 (last change on 20.11.24) by Till Koebe

Dear all, please find below the assignment of papers for the upcoming four sessions. A few points to remember:

1. 3 papers per session: 15 min presentation and 15 min discussion each. Stick to the time limit; we will cut you off.
2. Please truly understand the key figure of each of the papers presented beforehand, whatever it takes.
3. When giving your presentation, put yourself in the role of the listener and tune your presentation to maximise the value added for them.

A final note: One paper has not been assigned yet. The presentation date for it will be Nov 29. I will let you know in the upcoming session which paper has been assigned.

And finally, thanks to many of you for sticking around for the lecture series afterwards today. I think for a speaker it is always a good feeling to see every seat being taken.

Best,
Till

 

# | Paper Title | Student Name | Presentation Date
1 | High-resolution satellite images reveal the prevalent positive indirect impact of urbanization on urban tree canopy coverage in South America | - | -
2 | Modelling and evaluation of land use changes through satellite images in a multifunctional catchment: Social, economic and environmental implications | - | -
3 | The Social Impact of Generative AI: An Analysis on ChatGPT | Abaad | Dec 6
4 | Social media influence on students' knowledge sharing and learning: An empirical study | Shiraz | Nov 29
5 | Urban Flood Mapping With Bitemporal Multispectral Imagery Via a Self-Supervised Learning Framework | Rishant | Dec 13
6 | Parking occupancy estimation on planetscope satellite images | Chaitanya | Nov 22
7 | The Evolution of the Manosphere Across the Web | Bharat | Nov 29
8 | From individual to group privacy in big data analytics | Parnian | Dec 6
9 | Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead | Najia | Dec 6
10 | Ideational diffusion and the great witch hunt in Central Europe | Prakhar Narian | Nov 22
11 | Conceptual structure and the growth of scientific knowledge | - | -
12 | Fact-checker warning labels are effective even for those who distrust fact-checkers | David | Nov 29
13 | A 27-country test of communicating the scientific consensus on climate change | Khalid | Dec 13
14 | Persistent Pre-Training Poisoning of LLMs | Prakhar | Nov 22

No seminar on Fri, Oct 18 - We start on Fri, Oct 25

Written on 15.10.24 by Ingmar Weber

The first seminar will be on Friday, October 25, 10am (c.t.) - noon in building E1.7, 3rd floor, room 3.23.

See you then!

Your Data and Society Team.

Data and Society


From finding a mate to booking a holiday, our lives are increasingly mediated by online platforms. Digital traces left by these interactions provide opportunities to study societal phenomena while creating challenges around the responsible use of data. In this seminar, students will learn how computational methods and machine learning can be applied to study society through such data.

The first part of the seminar will familiarize students with existing work in computational social science, with each week focused on a topic such as “Digital Democracy” or “Gender Gaps” and methods to quantify it. The second part of the seminar will be about projects in which students are asked to quantify a societal phenomenon of their choice using computational methods. Here, students can either propose their own topics or choose from topics defined by the lecturers.

The overall course performance will be based on (i) overall course participation, (ii) assigned paper presentations, (iii) literature review and “project pitch” (prior to in-depth work) and (iv) the written project report.

Apart from learning about interdisciplinary research and applications of machine learning, students will also learn research skills such as how to read and discuss papers, how to plan a project, how to present their work, how to write a scientific paper, and how to work in teams. 

Students can take this course as a seminar. 

Requirements: MSc students only. The project-based element of the seminar will require some Python programming and data analysis experience. An interest beyond the foundations of CS and a concern for societal problems are a must.
