Lecture 1: Epsilon-Greedy & the Multi-Armed Bandit Problem

Description

In this lecture, we introduce you to your very first reinforcement learning (RL) algorithm, Epsilon-Greedy. We start off by exploring a toy problem known as the “multi-armed bandit problem”, or in plain English, how to win at slot machines! We then dive into how Epsilon-Greedy solves the bandit problem, take a detour to introduce OpenAI Gym (and why it is important!), and finally hand you over to your first exercise: solving the bandit problem in OpenAI Gym.

A written version of this lecture is also available on the StarAi blog.
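
To give a flavour of what Epsilon-Greedy does before you watch the lecture, here is a minimal sketch in plain Python. It is not the lecture's or the exercise's code and it does not use OpenAI Gym: the function name `epsilon_greedy_bandit`, its parameters, and the simulated slot machines (each arm pays 1 with a fixed probability, otherwise 0) are all assumptions made for illustration. With probability epsilon the agent pulls a random arm (explore); otherwise it pulls the arm with the highest estimated payout so far (exploit).

```python
import random


def epsilon_greedy_bandit(true_probs, steps=5000, epsilon=0.1, seed=0):
    """Run Epsilon-Greedy on a simulated Bernoulli bandit (illustrative only).

    true_probs: hypothetical payout probability of each arm; the agent never
    reads these directly, it only observes the rewards they produce.
    """
    rng = random.Random(seed)
    n_arms = len(true_probs)
    counts = [0] * n_arms    # how many times each arm has been pulled
    values = [0.0] * n_arms  # running average reward observed for each arm
    total_reward = 0.0

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick any arm uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the best estimate (ties go to the first).
            arm = max(range(n_arms), key=lambda a: values[a])

        # Simulated slot machine: reward 1 with probability true_probs[arm].
        reward = 1.0 if rng.random() < true_probs[arm] else 0.0

        # Incremental update of the sample average for the chosen arm.
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]
        total_reward += reward

    return values, counts, total_reward


if __name__ == "__main__":
    values, counts, total = epsilon_greedy_bandit([0.2, 0.5, 0.8])
    print("estimated values:", [round(v, 2) for v in values])
    print("pull counts:", counts)
    print("total reward:", total)
```

Try changing `epsilon`: a value of 0 never explores and can get stuck on the first arm it tries, while a value of 1 pulls arms purely at random and never uses what it has learned.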

Lecture Slides

StarAi Lecture 1: Epsilon-Greedy lecture slides

Exercise

Follow the link below to access the exercises for lecture 1:

Lecture 1: Epsilon-Greedy & the Multi-Armed Bandit with OpenAI Gym

Exercise Solutions

Follow the link below to access the exercise solutions for lecture 1:

Exercise Solutions: Epsilon-Greedy & the Multi-Armed Bandit with OpenAI Gym

Additional Learning Material

Sutton & Barto’s Reinforcement Learning: An Introduction - Chapter 2 introduction and Sections 2.1 through 2.5
Tom Roth’s Multi-Armed Bandit Simulator - Get a feel for how the multi-armed bandit works, live in your browser!
edX’s Brilliant Python Course - Note that the majority of StarAi’s exercises are in the Python programming language. If you would like to further your knowledge in this field, we strongly suggest you learn Python.

Last updated on Apr 4, 2019