This is the main course website for the seminar Critical Data Studies in the summer term 2024 at the University of Bayreuth. The umbrella topic for this iteration of the seminar is Recognising and preventing bias in AI.
Syllabus
The collection and analysis of big data now influence all aspects of our lives. Critical Data Studies (CDS) explores the cultural, ethical, and socio-technical challenges at the intersection of computer science, humanities, social science, and society. Students critically explore topics such as Artificial Intelligence, Big Data, data science, data ethics, privacy, and fake news, and discuss how data systems and algorithms can help solve societal problems.
Based on the umbrella topic of Recognising and preventing bias in AI, the participants develop their own scientific questions and work on them during the semester, either individually or in groups of two.
Schedule
Introduction
- The Trouble with Bias: https://www.youtube.com/watch?v=fMym_BKWQzk
- Causes for Bias
    - Historical Bias: Human biases included in training datasets
      https://www.theverge.com/2016/3/24/11297050/tay-microsoft-chatbot-racist
    - Unrepresentative Training Data: Incomplete or unrepresentative data
      http://proceedings.mlr.press/v81/buolamwini18a/buolamwini18a.pdf
    - Proxies & Feedback Loops: Proxies for sensitive attributes can create self-reinforcing feedback loops
      https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing
      https://www.propublica.org/article/how-we-analyzed-the-compas-recidivism-algorithm
    - Algorithmic Objectives: Minimizing overall prediction error tends to benefit majority groups
      https://doi.org/10.1145/2783258.2788613
- Recent Conferences and Publications:
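The "Algorithmic Objectives" cause listed under Causes for Bias can be illustrated with a small sketch. It is not taken from the course materials: the data is synthetic, and all names (`make_data`, `best_threshold`, `error_rate_by_group`, the group labels) are hypothetical. It assumes a single score threshold is chosen to minimize the overall error, while the true decision boundary differs between a large and a small group.

```python
from collections import defaultdict


def make_data():
    """Build synthetic (group, score, label) records.

    Assumption for illustration only: the majority group is positive
    from score 5 upward, the minority group from score 3 upward.
    """
    data = []
    for score in range(10):
        # 9 majority samples per score value (90 in total)
        data += [("majority", score, int(score >= 5))] * 9
        # 1 minority sample per score value (10 in total)
        data.append(("minority", score, int(score >= 3)))
    return data


def error_count(data, threshold):
    """Number of records misclassified by the rule 'score >= threshold'."""
    return sum(int(score >= threshold) != label for _, score, label in data)


def best_threshold(data):
    """Pick the threshold that minimizes the overall error count."""
    return min(range(11), key=lambda t: error_count(data, t))


def error_rate_by_group(data, threshold):
    """Error rate of the threshold rule, computed separately per group."""
    errors, totals = defaultdict(int), defaultdict(int)
    for group, score, label in data:
        totals[group] += 1
        errors[group] += int(int(score >= threshold) != label)
    return {group: errors[group] / totals[group] for group in totals}


data = make_data()
threshold = best_threshold(data)
overall = error_count(data, threshold) / len(data)
rates = error_rate_by_group(data, threshold)

print(f"chosen threshold: {threshold}")
print(f"overall error rate: {overall:.2f}")
for group, rate in rates.items():
    print(f"{group} error rate: {rate:.2f}")
```

On this toy data the error-minimizing threshold is 5, which fits the majority group perfectly. The overall error rate is 2%, but all of it falls on the minority group, whose error rate is 20%. Reporting error rates per group rather than only in aggregate makes this kind of disparity visible.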
Papers
- Samuel R. Bowman. Eight things to know about large language models. CoRR, 2023. arXiv:2304.00612.
- Kenneth Church, Annika Schoene, John E. Ortega, Raman Chandrasekar, and Valia Kordoni. Emerging trends: unfair, biased, addictive, dangerous, deadly, and insanely profitable. Natural Language Engineering, 29(2):483–508, 2023. doi:10.1017/S1351324922000481.
- Plamen P. Angelov, Eduardo A. Soares, Richard Jiang, Nicholas I. Arnold, and Peter M. Atkinson. Explainable artificial intelligence: an analytical review. WIREs Data Mining and Knowledge Discovery, 11(5):e1424, 2021. doi:10.1002/widm.1424.
- Emily M. Bender, Timnit Gebru, Angelina McMillan-Major, and Shmargaret Shmitchell. On the dangers of stochastic parrots: can language models be too big? 🦜. In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, FAccT '21, 610–623. New York, NY, USA, 2021. Association for Computing Machinery. doi:10.1145/3442188.3445922.
- Ninareh Mehrabi, Fred Morstatter, Nripsuta Saxena, Kristina Lerman, and Aram Galstyan. A survey on bias and fairness in machine learning. ACM Comput. Surv., July 2021. doi:10.1145/3457607.
- Duri Long and Brian Magerko. What is AI literacy? Competencies and design considerations. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems, CHI '20, 1–16. New York, NY, USA, 2020. Association for Computing Machinery. doi:10.1145/3313831.3376727.
- Eirini Ntoutsi et al. Bias in data-driven artificial intelligence systems—an introductory survey. WIREs Data Mining and Knowledge Discovery, 10(3):e1356, 2020. doi:10.1002/widm.1356.
- Joy Buolamwini and Timnit Gebru. Gender shades: intersectional accuracy disparities in commercial gender classification. In Sorelle A. Friedler and Christo Wilson, editors, Proceedings of the 1st Conference on Fairness, Accountability and Transparency, volume 81 of Proceedings of Machine Learning Research, 77–91. PMLR, 23–24 Feb 2018.
- Rich Caruana, Yin Lou, Johannes Gehrke, Paul Koch, Marc Sturm, and Noemie Elhadad. Intelligible models for healthcare: predicting pneumonia risk and hospital 30-day readmission. In Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD '15, 1721–1730. New York, NY, USA, 2015. Association for Computing Machinery. doi:10.1145/2783258.2788613.