From Language to Information

Ethics Content Description

There are three ethics units in the class: one on sentiment classification, one on information retrieval, and one on large language models. The ethics content is embedded within three labs that students work on in groups during the scheduled class period. The ethics material is therefore a mix of technical problems that students work on to practice the course content and broader questions that they discuss in small groups and with the rest of the class. The topics covered by these three units include biases introduced by sentiment classifiers, issues of data privacy and data sovereignty related to personalized search engines, and the societal implications of large language models.

Course Description

Extracting meaning, information, and structure from human language text, speech, web pages, social networks. Introducing methods (string algorithms, edit distance, language modeling, machine learning, logistic regression, neural networks, neural embeddings, inverted indices, collaborative filtering, PageRank), applications (chatbots, sentiment analysis, information retrieval, text classification, social networks, recommender systems), and ethical issues.

Contributors

Ethics materials created by Veronica Rivera (with Uma Phatak, Deveshi Buch, and Dan Jurafsky).

Assignments
