Human-in-the-Loop Machine Learning

University of Amsterdam
Masters of AI: Semester 1, Block 2, 2023
Course Coordinators: Eric Nalisnick and Mohammad Aliannejadi
Teaching Assistants: Rajeev Verma, Dharmesh Tailor, Alexander Timans
Notes: ML_with_Humans_Course_Notes.pdf

Machine learning (ML) is being deployed in increasingly consequential tasks, such as healthcare and autonomous driving. For the foreseeable future, successfully deploying ML in such settings will require close collaboration and integration with humans, whether they be users, designers, engineers, or policy-makers. This course will look at how humans can be incorporated into the foundations of ML in a principled way. The course will be broken down (unequally) into three parts: demonstration, collaboration, and oversight. Demonstration is about how machines can learn from 'observing' humans, such as learning to drive a car from data collected while humans drive. In this setting, the human is assumed to be strictly better than the machine, and so the primary goal is to transmit the human's knowledge and abilities into the ML model. The second part, collaboration, is about when humans and models are near equals in performance but not in abilities. A relevant setting is AI-assisted healthcare: perhaps a human radiologist and an ML model are good at diagnosing different kinds of diseases. Thus we will look at methodologies that allow machines to 'ask for help' when they are unconfident in their own performance and/or think the human can better carry out the task. Lastly, the course will close with the setting in which machines are strictly better at a task than humans are, but we still wish to monitor them to ensure safety and alignment with our goals (oversight).

Schedule and Topics

Dates	Topics	Readings
31.10.23	Logistics, introduction, building prior knowledge into models	Do we still need models or just more data and compute? by Welling; Prior Probabilities by Jaynes
02.11.23	Crowdsourcing, Dawid-Skene model	Maximum Likelihood Estimation of Observer Error-Rates Using the EM Algorithm by Dawid & Skene; Learning From Crowds by Raykar et al.; Who Said What: Modeling Individual Labelers Improves Classification by Guan et al.
07.11.23	Active learning	Active Learning Literature Survey by Settles; Active Learning from Crowds by Yan et al.
09.11.23	Practical on crowdsourcing platforms, overview of reinforcement learning (RL)	Lecture 1: Introduction to RL by Silver
14.11.23	Imitation learning: behavior cloning, distribution matching	Notes on Imitation Learning, CS237B, Stanford University; Model-Based Imitation Learning by Probabilistic Trajectory Matching by Englert et al.; Generative Adversarial Imitation Learning by Ho & Ermon
16.11.23	Imitation learning continued: distribution matching (cont.), inverse RL	Apprenticeship Learning via Inverse Reinforcement Learning by Abbeel & Ng; Maximum Entropy Inverse Reinforcement Learning by Ziebart et al.
21.11.23	Inverse RL continued, RL from human feedback	Deep Reinforcement Learning from Human Preferences by Christiano et al.
23.11.23	Direct preference optimization, exam review	Direct Preference Optimization: Your Language Model is Secretly a Reward Model by Rafailov et al.
30.11.23	Exam
05.12.23	Guest lectures: large language models by Prof. Charlie Clarke (Univ. of Waterloo) and the alignment problem by Leon Lang (UvA)	Perspectives on Large Language Models for Relevance Judgment by Faggioli et al.; The Alignment Problem from a Deep Learning Perspective by Ngo et al.
07.12.23	Combining human and machine predictions, learning to defer to an expert	Directions in Hybrid Intelligence: Complementing AI Systems with Human Intelligence by Kamar; Combining Human Predictions with Model Probabilities via Confusion Matrices and Calibration by Kerrigan et al.; Learning to Complement Humans by Wilder et al.; Consistent Estimators for Learning to Defer to an Expert by Mozannar & Sontag
12.12.23	The off-switch game (slides)	Corrigibility by Soares et al.; The Off-Switch Game by Hadfield-Menell et al.
14.12.23	Off-switch game continued, human control (causal definitions), course summary	Human Control: Definitions and Algorithms by Carey & Everitt