Seminar Computational Intelligence A (708.111)

WS 2019

Institut für Grundlagen der Informationsverarbeitung (708)

Lecturer:

Assoc. Prof. Dr. Robert Legenstein

Office hours: by appointment (via e-mail)

E-mail: robert.legenstein@igi.tugraz.at
Homepage: https://www.tugraz.at/institute/igi/team/prof-legenstein/




Location: IGI-seminar room, Inffeldgasse 16b/I, 8010 Graz
Date: starting on Tuesday, Oct 8 2019, 14:00 - 15:30

Content of the seminar: Deep Reinforcement Learning

In this seminar, we will cover deep reinforcement learning (RL), a class of learning methods that has achieved impressive results in recent years. We will start by introducing the general reinforcement learning framework and its most important algorithms before moving on to the modern approach of deep reinforcement learning, which builds on neural networks. No prior knowledge of reinforcement learning is assumed. However, we assume that students are familiar with general machine learning concepts and with at least the basics of neural networks.

The general introductory talks will be based on the book “Reinforcement Learning: An Introduction, Second edition”, by RS Sutton and AG Barto (abbreviated as SB below). Later talks will be based on recent papers on deep reinforcement learning.

Literature: PDF.

Notes about key concepts that should be discussed in the specific talks: PDF.

Use this guide to help you prepare your talk successfully.



Topics:

    Basic Concepts and Algorithms

  1. Introduction to RL and Multi-armed Bandits (SB Ch.1, Ch.2)
  2. Finite Markov Decision Processes (SB Ch.3)
  3. Dynamic Programming (SB Ch.4)
  4. Monte Carlo Methods (SB Ch.5)
  5. Temporal-Difference Learning and Q-learning (SB Ch.6)
  6. Function approximation (SB Ch.9)

    Deep Q-Learning

  7. Learning to play video games with Reinforcement Learning. Mnih, Volodymyr, et al. "Human-level control through deep reinforcement learning." Nature 518.7540 (2015): 529 [PDF] and SB Ch.16.5.
  8. Double Q-Learning. Van Hasselt et al., "Deep reinforcement learning with double q-learning." Thirtieth AAAI conference on artificial intelligence. 2016 [PDF].

    Policy Gradient Methods

  9. Reinforcement Learning with Policy Gradients (SB Ch.13)
  10. Advantage actor critic. Mnih, Volodymyr, et al. "Asynchronous methods for deep reinforcement learning." International conference on machine learning. 2016 [PDF].
  11. Deterministic Policy Gradients. Silver, David, et al. "Deterministic policy gradient algorithms." 2014. [PDF]. See also [Sutton et al., 2000]
  12. PPO. Schulman, John, et al. "Proximal policy optimization algorithms." arXiv preprint arXiv:1707.06347 (2017). [PDF].

    Deep RL for robotics

  13. Andrychowicz, Marcin, et al. "Learning dexterous in-hand manipulation." arXiv preprint arXiv:1808.00177 (2018). [PDF]
  14. Levine, Sergey, et al. "End-to-end training of deep visuomotor policies." The Journal of Machine Learning Research 17.1 (2016): 1334-1373. [PDF] (split into two talks, Part 1)
  15. Levine, Sergey, et al. "End-to-end training of deep visuomotor policies." The Journal of Machine Learning Research 17.1 (2016): 1334-1373. [PDF] (split into two talks, Part 2)

    Including models

  16. Weber, Theophane, et al. "Imagination-augmented agents for deep reinforcement learning." arXiv preprint arXiv:1707.06203 (2017). [PDF]
  17. Francois-Lavet, Vincent, et al. "Combined reinforcement learning via abstract representations." Proceedings of the AAAI Conference on Artificial Intelligence. Vol. 33. 2019. [PDF]
  18. Planning and Learning with Tabular Methods (SB Ch.8)

    Deep RL for board games

  19. Learning to play board games with RL: Silver, David, et al. "Mastering the game of Go with deep neural networks and tree search." Nature 529.7587 (2016): 484. [PDF]
  20. Learning to play board games with RL: Silver, David, et al. "A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play." Science 362.6419 (2018): 1140-1144. [PDF, Supplement PDF]

Talks should be no longer than 35 minutes, and they should be clear, interesting and informative, rather than a reprint of the material. Select which parts of the material you want to present and which to leave out, and then present the selected material well (including definitions not given in the material: look them up on the web or, if that is not successful, ask the seminar organizers). Diagrams or figures are often useful for a talk. On the other hand, referring in the talk to references by their numbers in a list at the end is a no-no (a talk is an online process, not meant to be read). For the same reason, you can also briefly repeat earlier definitions if you suspect that the audience may not remember them.


Talks will be assigned at the first seminar meeting. Students are requested to have a quick glance at the topics prior to this meeting in order to determine their preferences. Note that the number of participants for this seminar will be limited.

General rules:

Participation in the seminar meetings is obligatory. We also request your courtesy and attention for the seminar speaker: no smartphones, laptops, etc. during a talk. Furthermore, your active attention, questions, and discussion contributions are expected.

After your talk (and possibly some corrections), send a PDF of your talk to Darjan Salaj (salaj@igi.tugraz.at), who will post it on the seminar webpage.



TALKS:

Date | # | Topic / paper title | Presenter | Presentation
29.10.2019 | 1 | SB Ch 1, 2 | Kulmer Marvin Jonathan | PDF
29.10.2019 | 2 | SB Ch 3 | Feichtner Johannes | PDF
05.11.2019 | 3 | SB Ch 4 | Schögler Christoph | PDF
05.11.2019 | 4 | SB Ch 5 | Wachter Alexander | PDF
12.11.2019 | 5 | SB Ch 6 | Ziegler Dominik | PDF
12.11.2019 | 6 | SB Ch 9 | Fuchs Alexander | PDF
26.11.2019 | 7 | Human-level control through deep reinforcement learning. | Baronig Maximilian | PDF
26.11.2019 | 8 | Deep reinforcement learning with double q-learning. | Trapp Martin | PDF
03.12.2019 | 9 | SB Ch 13 | Koschatko Katharina | PDF
03.12.2019 | 10 | Policy Gradient Methods for Reinforcement Learning with Function Approximation | Ek Hanna Kristin | PDF
10.12.2019 | 11 | Asynchronous methods for deep reinforcement learning. | Khodachenko Ian | PDF
10.12.2019 | 12 | Deterministic policy gradient algorithms. | Weinrauch Alexander | PDF
10.12.2020 | 13 | Proximal policy optimization algorithms. | Toth Christian | PDF
07.01.2020 | 15 | End-to-end training of deep visuomotor policies. Part 1 | Nguyen Thi Kim Truc | PDF
07.01.2020 | 16 | End-to-end training of deep visuomotor policies. Part 2 | Rohr Benjamin | PDF
14.01.2020 | 14 | Learning dexterous in-hand manipulation. | Novak Markus | PDF
14.01.2020 | 17 | Imagination-augmented agents for deep reinforcement learning. | Lazaro Garcia Ernesto | PDF
21.01.2020 | 18 | Combined reinforcement learning via abstract representations. | Kumar Chetan Srinivasa | PDF
21.01.2020 | 19 | Unsupervised State Representation Learning in Atari. | Maiti Shalini | PDF
28.01.2020 | 20 | SB Ch 8 | Simic Ilija | PDF
28.01.2020 | 21 | Mastering the game of Go with deep neural networks and tree search. | Muelleder Thoma | PDF

Misc:

An excellent tutorial on reinforcement learning by Katja Hofmann is available online (Video, Slides).