Video
Description
In this lecture, we introduce you to your very first RL algorithm, Epsilon-Greedy. We start by exploring a toy problem known as the “multi-armed bandit problem” or, in plain English, how to win at slot machines! We then dive into how Epsilon-Greedy solves the bandit problem, take a detour to introduce OpenAI Gym (and why it is important!), and finally hand you over to your first exercise: solving the bandit problem in OpenAI Gym.
A written version of this lecture is also available on the StarAi blog.
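To give a feel for the idea before you watch, here is a minimal sketch of Epsilon-Greedy on a toy slot-machine problem. It is not the exercise code and does not use OpenAI Gym: the function name `run_epsilon_greedy`, its parameters, and the win probabilities are hypothetical, and each arm is assumed to pay out 1 with a fixed probability and 0 otherwise.

```python
import random


def run_epsilon_greedy(true_probs, epsilon=0.1, steps=5000, seed=0):
    """Play a toy bandit with epsilon-greedy.

    true_probs: assumed win probability of each arm (hidden from the agent).
    Returns (estimates, counts): the agent's estimated value and pull count per arm.
    """
    rng = random.Random(seed)
    n_arms = len(true_probs)
    estimates = [0.0] * n_arms  # running average reward seen on each arm
    counts = [0] * n_arms       # how many times each arm has been pulled

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: with probability epsilon, pull a random arm.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pull the arm with the highest estimate (ties broken at random).
            best = max(estimates)
            arm = rng.choice([a for a in range(n_arms) if estimates[a] == best])

        # Simulated slot machine: reward 1 with the arm's win probability, else 0.
        reward = 1.0 if rng.random() < true_probs[arm] else 0.0

        # Incremental update of the running average for the pulled arm.
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]

    return estimates, counts


if __name__ == "__main__":
    estimates, counts = run_epsilon_greedy([0.2, 0.5, 0.8])
    print("estimated values:", [round(e, 2) for e in estimates])
    print("pull counts:", counts)
```

With a small `epsilon`, the agent spends most of its pulls on the arm it currently believes is best while still sampling the others occasionally, which is the exploration versus exploitation trade-off the lecture discusses.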
Lecture Slides
StarAi Lecture 1 Epsilon Greedy Lecture slides
Exercise
Follow the link below to access the exercises for lecture 1:
Lecture 1: Epsilon-Greedy & the Multi-Armed Bandit with OpenAI Gym
Exercise Solutions
Follow the link below to access the exercise solutions for lecture 1:
Exercise Solutions: Epsilon-Greedy & the Multi-Armed Bandit with OpenAI Gym
Additional Learning Material
- Sutton & Barto’s Reinforcement Learning: An Introduction - Chapter 2 introduction and Sections 2.1 to 2.5
- Tom Roth’s Multiarmed Bandit Simulator - Get a feel for how the Multiarmed Bandit works live in your browser!
- edX’s Brilliant Python Course - Note that the majority of StarAi’s exercises are in the Python programming language. If you would like to further your knowledge in this field, we strongly suggest you learn Python.