CMPSC 292F Statistical Reinforcement Learning

Quarter

Spring 2021

Instructor/s

Yu-Xiang Wang

Course Type

Special Topics Course

Course Area

Foundations

Enrollment Code

8193

Location

Online

Units

4

Day and Time

M/W 1:00 - 2:50 pm

Course Description

In this PhD-level course, you will learn the models and algorithms of reinforcement learning (RL), as well as techniques for analyzing the sample complexity of RL algorithms in various settings. The course focuses on the statistical foundations of RL. For the most part, we will consider the Markov decision process (MDP) model with finite states and finite actions; towards the end of the course, we will also discuss function approximation and the ideas behind some deep RL algorithms.

Key topics of the course include Markov decision processes, multi-armed bandits, exploration, off-policy evaluation, offline reinforcement learning, and function approximation.

Students are expected to have a reasonable level of mathematical maturity and a working knowledge of probability, statistics, and optimization. Prior knowledge of statistical learning theory, reinforcement learning algorithms, and the theory of convex optimization is ideal but not required.