Seminar Computational Intelligence A (708.111)

WS 2019

Institut für Grundlagen der Informationsverarbeitung (708)

Lecturer:

Assoc. Prof. Dr. Robert Legenstein

Office hours: by appointment (via e-mail)

E-mail: robert.legenstein@igi.tugraz.at
Homepage: https://www.tugraz.at/institute/igi/team/prof-legenstein/

Location: IGI-seminar room, Inffeldgasse 16b/I, 8010 Graz
Date: starting on Tuesday, Oct 8 2019, 14:00 - 15:30

Content of the seminar: Deep Reinforcement Learning

In this seminar, we will cover deep reinforcement learning (RL), a class of learning methods that has achieved impressive results in recent years. We will start by introducing the general reinforcement learning framework and its most important algorithms before moving on to the modern approach of deep reinforcement learning, which uses neural networks as a basis. No prior knowledge of reinforcement learning is assumed. However, we assume that students are familiar with general machine learning concepts and with at least the basics of neural networks.

The general introductory talks will be based on the book “Reinforcement Learning: An Introduction, Second edition”, by RS Sutton and AG Barto (abbreviated as SB below). Later talks will be based on recent papers on deep reinforcement learning.

Literature: PDF.

Notes about key concepts that should be discussed in the specific talks: PDF.

Use this guide to help you prepare your talk successfully.

Topics:

Basic Concepts and Algorithms

Introduction to RL and Multi-armed Bandits (SB Ch.1, Ch.2)
Finite Markov Decision Processes (SB Ch.3)
Dynamic Programming (SB Ch.4)
Monte Carlo Methods (SB Ch.5)
Temporal-Difference Learning and Q-learning (SB Ch.6)
Function Approximation (SB Ch.9)

Deep Q-Learning

Learning to play video games with Reinforcement Learning. Mnih, Volodymyr, et al. "Human-level control through deep reinforcement learning." Nature 518.7540 (2015): 529 [PDF] and SB Ch.16.5.
Double Q-Learning. Van Hasselt et al., "Deep reinforcement learning with double q-learning." Thirtieth AAAI conference on artificial intelligence [PDF].

Policy Gradient Methods

Reinforcement Learning with Policy Gradients (SB Ch.13)
Advantage actor critic. Mnih, Volodymyr, et al. "Asynchronous methods for deep reinforcement learning." International conference on machine learning. 2016 [PDF].
Deterministic Policy Gradients. Silver, David, et al. "Deterministic policy gradient algorithms." 2014. [PDF]. See also [Sutton et al., 2000]
PPO. Schulman, John, et al. "Proximal policy optimization algorithms." arXiv preprint arXiv:1707.06347 (2017). [PDF].

Deep RL for robotics

Andrychowicz, Marcin, et al. "Learning dexterous in-hand manipulation." arXiv preprint arXiv:1808.00177 (2018). [PDF]
Levine, Sergey, et al. "End-to-end training of deep visuomotor policies." The Journal of Machine Learning Research 17.1 (2016): 1334-1373. [PDF] (split into two talks, Part 1)
Levine, Sergey, et al. "End-to-end training of deep visuomotor policies." The Journal of Machine Learning Research 17.1 (2016): 1334-1373. [PDF] (split into two talks, Part 2)

Including models

Weber, Theophane, et al. "Imagination-augmented agents for deep reinforcement learning." arXiv preprint arXiv:1707.06203 (2017). [PDF]
Francois-Lavet, Vincent, et al. "Combined reinforcement learning via abstract representations." Proceedings of the AAAI Conference on Artificial Intelligence. Vol. 33. 2019. [PDF]
Planning and Learning with Tabular Methods (SB Ch.8)

Deep RL for board games

Learning to play board games with RL: Silver, David, et al. "Mastering the game of Go with deep neural networks and tree search." Nature 529.7587 (2016): 484. [PDF]
Learning to play board games with RL: Silver, David, et al. "A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play." Science 362.6419 (2018): 1140-1144. [PDF, Supplement PDF]

Talks should be no longer than 35 minutes, and they should be clear, interesting and informative rather than a reprint of the material. Select which parts of the material you want to present and which to leave out, and then present the selected material well (including definitions not given in the material: look them up on the web or, if that is not successful, ask the seminar organizers). Diagrams or figures are often useful in a talk. On the other hand, citing references in the talk by their numbers in a list at the end is a no-no (a talk is followed in real time and is not meant to be read). For the same reason, you can also briefly repeat earlier definitions if you suspect that the audience may not remember them.

Talks will be assigned at the first seminar meeting. Students are requested to have a quick glance at the topics prior to this meeting in order to determine their preferences. Note that the number of participants for this seminar will be limited.

General rules:

Participation in the seminar meetings is obligatory. We also request your courtesy and attention for the seminar speaker: no smartphones, laptops, etc. during a talk. Furthermore, your active attention, questions, and contributions to the discussion are expected.

After your talk (and possibly some corrections), send a PDF of your talk to Darjan Salaj (salaj@igi.tugraz.at), who will post it on the seminar webpage.

TALKS:

Date	#	Topic / paper title	Presenter	Presentation
29.10.2019	1	SB Ch 1,2	Kulmer Marvin Jonathan	PDF
29.10.2019	2	SB Ch 3	Feichtner Johannes	PDF
05.11.2019	3	SB Ch 4	Schögler Christoph	PDF
05.11.2019	4	SB Ch 5	Wachter Alexander	PDF
12.11.2019	5	SB Ch 6	Ziegler Dominik	PDF
12.11.2019	6	SB Ch 9	Fuchs Alexander	PDF
26.11.2019	7	Human-level control through deep reinforcement learning.	Baronig Maximilian	PDF
26.11.2019	8	Deep reinforcement learning with double q-learning.	Trapp Martin	PDF
03.12.2019	9	SB Ch 13	Koschatko Katharina	PDF
03.12.2019	10	Policy Gradient Methods for Reinforcement Learning with Function Approximation	Ek Hanna Kristin	PDF
10.12.2019	11	Asynchronous methods for deep reinforcement learning.	Khodachenko Ian	PDF
10.12.2019	12	Deterministic policy gradient algorithms.	Weinrauch Alexander	PDF
10.12.2019	13	Proximal policy optimization algorithms.	Toth Christian	PDF
07.01.2020	15	End-to-end training of deep visuomotor policies. Part 1	Nguyen Thi Kim Truc	PDF
07.01.2020	16	End-to-end training of deep visuomotor policies. Part 2	Rohr Benjamin	PDF
14.01.2020	14	Learning dexterous in-hand manipulation.	Novak Markus	PDF
14.01.2020	17	Imagination-augmented agents for deep reinforcement learning.	Lazaro Garcia Ernesto	PDF
21.01.2020	18	Combined reinforcement learning via abstract representations.	Kumar Chetan Srinivasa	PDF
21.01.2020	19	Unsupervised State Representation Learning in Atari.	Maiti Shalini	PDF
28.01.2020	20	SB Ch 8	Simic Ilija	PDF
28.01.2020	21	Mastering the game of Go with deep neural networks and tree search.	Muelleder Thoma	PDF

Misc:

An excellent tutorial on reinforcement learning by Katja Hofmann is available online (Video, Slides).