Lecture 1: Epsilon-Greedy & the Multi-Armed Bandit Problem

Video

   

Description

In this lecture, we introduce you to your very first reinforcement learning (RL) algorithm, epsilon-greedy. We start by exploring a toy problem known as the “multi-armed bandit problem”, or in plain English, how to win at slot machines! We then dive into how epsilon-greedy solves the bandit problem, take a detour to introduce OpenAI Gym (and why it is important!), and finally hand you over to your first exercise: solving the bandit problem in OpenAI Gym.

A written version of this lecture is also available on the StarAi blog.
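
To give a feel for the idea before you watch the lecture, here is a minimal sketch of epsilon-greedy on a simulated slot-machine problem. It is not the lecture's or the exercise's code and it does not use OpenAI Gym: the function name `run_epsilon_greedy`, the `win_probs` list and the default values for `epsilon`, `steps` and `seed` are all assumptions made for illustration. Each arm pays 1 with a fixed hidden probability and 0 otherwise. With probability `epsilon` the agent pulls a random arm (explore); the rest of the time it pulls the arm with the highest average reward so far (exploit).

```python
import random


def run_epsilon_greedy(win_probs, epsilon=0.1, steps=5000, seed=0):
    """Play a simulated bandit with epsilon-greedy.

    win_probs: hypothetical hidden win probability of each arm.
    Returns (estimates, counts): the average reward seen per arm
    and the number of times each arm was pulled.
    """
    rng = random.Random(seed)
    n_arms = len(win_probs)
    estimates = [0.0] * n_arms  # running average reward per arm
    counts = [0] * n_arms       # number of pulls per arm

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick any arm uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the best-looking arm, breaking ties at random.
            best = max(estimates)
            arm = rng.choice([a for a, q in enumerate(estimates) if q == best])

        # The simulated slot machine pays 1 with probability win_probs[arm].
        reward = 1.0 if rng.random() < win_probs[arm] else 0.0

        # Incremental update of the running average for the pulled arm.
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]

    return estimates, counts


if __name__ == "__main__":
    estimates, counts = run_epsilon_greedy([0.2, 0.5, 0.8])
    print("estimated win rates:", [round(q, 2) for q in estimates])
    print("pulls per arm:", counts)
```

Try changing `epsilon`: a value of 0 never explores on purpose and can lock onto a poor arm, while a value of 1 ignores everything it has learned and pulls arms purely at random.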

   

Lecture Slides

StarAi Lecture 1: Epsilon-Greedy lecture slides

   

Exercise

Follow the link below to access the exercises for lecture 1:

Lecture 1: Epsilon-Greedy & the Multi-Armed Bandit with OpenAI Gym

   

Exercise Solutions

Follow the link below to access the exercise solutions for lecture 1:

Exercise Solutions: Epsilon-Greedy & the Multi-Armed Bandit with OpenAI Gym

   

Additional Learning Material

  1. Sutton & Barto’s Reinforcement Learning: An Introduction - Chapter 2 introduction and Sections 2.1 to 2.5
  2. Tom Roth’s Multiarmed Bandit Simulator - Get a feel for how the multi-armed bandit works, live in your browser!
  3. edX’s Brilliant Python Course - Note that the majority of StarAi’s exercises are written in the Python programming language. If you would like to further your knowledge in this field, we strongly suggest you learn Python.