Quarter
Course Type
Course Area
Foundations
Enrollment Code
8193
Location
Online
Units
4
Day and Time
M/W 1:00 - 2:50 pm
Course Description

In this PhD-level course, you will learn the models and algorithms of reinforcement learning (RL), as well as techniques for analyzing the sample complexity of RL algorithms in various settings. The course focuses on the statistical foundations of RL. For the most part, we will consider the Markov decision process (MDP) model with finite states and finite actions. Towards the end of the course, we will also cover function approximation and the ideas behind some deep RL algorithms.

Key topics include Markov decision processes, multi-armed bandits, exploration, off-policy evaluation, offline reinforcement learning, and function approximation.

Students are expected to have a reasonable level of mathematical maturity and a working knowledge of probability, statistics, and optimization. Prior knowledge of statistical learning theory, reinforcement learning algorithms, and the theory of convex optimization is ideal but not required.