Toxicity Classification and Maximum Group Loss
Overview
Although each of the problems in the problem set builds on the others, the ethics assignment itself begins with Problem 4: Toxicity Classification and Maximum Group Loss. Toxicity classifiers are designed to assist in moderating online forums by predicting whether an online comment is toxic, so that comments predicted to be toxic can be flagged for human review. Unfortunately, such models have been observed to be biased: non-toxic comments mentioning demographic identities often get misclassified as toxic (e.g., “I am a [demographic identity]”). These biases arise because toxic comments often mention and attack demographic identities, and as a result, models learn to spuriously correlate toxicity with the mention of these identities. Therefore, some groups are more likely to have comments incorrectly flagged for review: their group-level loss is higher than that of other groups.
In Problem 4, students train two classifiers with different objectives: one minimizes the average loss over all examples, while the other minimizes the maximum group loss, i.e., the worst per-group average loss.
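To make the two objectives concrete, here is a minimal sketch in Python. The function names and toy data are illustrative, not taken from the assignment's starter code: per-example losses from a linear classifier are aggregated either as a plain mean or as the worst per-group mean.

```python
import numpy as np

def per_example_losses(w: np.ndarray, X: np.ndarray, y: np.ndarray) -> np.ndarray:
    """Logistic losses for a linear classifier; y entries are in {-1, +1}."""
    margins = y * (X @ w)
    return np.log1p(np.exp(-margins))

def average_loss(losses: np.ndarray) -> float:
    """Objective 1: the mean loss over all examples."""
    return float(losses.mean())

def maximum_group_loss(losses: np.ndarray, groups: np.ndarray) -> float:
    """Objective 2: the largest per-group average loss."""
    return max(float(losses[groups == g].mean()) for g in np.unique(groups))

# Toy usage: two groups of comments; a classifier can have a low average
# loss overall while still doing badly on one group.
rng = np.random.default_rng(0)
X = rng.normal(size=(8, 3))
y = np.array([1, -1, 1, 1, -1, -1, 1, -1])
groups = np.array([0, 0, 0, 0, 1, 1, 1, 1])
w = rng.normal(size=3)

losses = per_example_losses(w, X, y)
print("average loss:", average_loss(losses))
print("maximum group loss:", maximum_group_loss(losses, groups))
```

Because the maximum group loss is always at least as large as the average loss, a classifier trained to minimize it prioritizes the worst-off group, typically at some cost to overall accuracy.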
The ethical reflection questions ask how correct classifications should be distributed across commenters. We introduce three well-known principles of fair distribution and ask students to apply one in arguing for which classifier should be preferred. A final question asks students to consider the ethics of data collection.
The assignment covers topics including machine learning, gradient descent, linear prediction, loss functions, classifiers, k-means clustering, bias, distributive justice, content moderation, consequentialism, prioritarianism, and participatory design.
Contributors
- Ethics materials by Kathleen Creel, Lauren Gillespie, Dorsa Sadigh, and Percy Liang. Updated by Diana Acosta Navas.
- Assignment by Percy Liang, Dorsa Sadigh, Tatsunori Hashimoto, and Lauren Gillespie.
Assignment goals
- Train classifiers to identify toxic comments on social media.
- Train classifiers with different objectives.
Ethics goals
- Understand and apply principles of fair distribution to the outcomes of a toxicity classifier.
- Identify and compare ethical concerns with different methods of data collection.
Download Links
Additional Readings for Context (Instructors or Students):
- Toxicity | Jigsaw
- Unintended Bias and Identity Terms | by Jigsaw
- Egalitarianism (Stanford Encyclopedia of Philosophy)
- Consequentialism (Stanford Encyclopedia of Philosophy)
- Rule Consequentialism (Stanford Encyclopedia of Philosophy)
- Parfit, Equality and Priority
- Distributive Justice (Stanford Encyclopedia of Philosophy)
- Freeman, Rawls on Distributive Justice and the Difference Principle
- Hellman, Big Data and Compounding Injustice
- Discrimination (Stanford Encyclopedia of Philosophy)
- Mills, Towards a Black Radical Liberalism
- Organising principles and general guidelines for Participatory Design