Seminar Computational Intelligence A (708.111)

WS 2019

Institut für Grundlagen der Informationsverarbeitung (708)

Lecturer:

Assoc. Prof. Dr. Robert Legenstein

Office hours: by appointment (via e-mail)

E-mail: robert.legenstein@igi.tugraz.at
Homepage: https://www.tugraz.at/institute/igi/team/prof-legenstein/




Location: IGI-seminar room, Inffeldgasse 16b/I, 8010 Graz
Date: starting on Tuesday, Oct 8, 2019, 14:00-15:30

Content of the seminar: Deep Reinforcement Learning

In this seminar, we will cover deep reinforcement learning (RL), a class of learning methods that have achieved impressive results in recent years. We will start by introducing the general reinforcement learning framework and its most important algorithms before moving to the modern approach of deep reinforcement learning, which uses neural networks as a basis. No prior knowledge of reinforcement learning is assumed. However, we assume that students are familiar with general machine learning concepts as well as with neural networks (at least their basics).

The general introductory talks will be based on the book “Reinforcement Learning: An Introduction, Second edition”, by RS Sutton and AG Barto (abbreviated as SB below). Later talks will be based on recent papers on deep reinforcement learning.
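To give a first feel for the agent-environment interaction covered in the early talks, here is a minimal sketch of epsilon-greedy action selection on a stationary multi-armed bandit (the setting of SB Ch. 2). The arm means and parameters below are illustrative assumptions, not taken from the book:

```python
import random

def epsilon_greedy_bandit(true_means, steps=10000, epsilon=0.1, seed=0):
    """Sample-average epsilon-greedy action selection on a stationary
    multi-armed bandit (cf. SB Ch. 2)."""
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms        # number of pulls per arm
    estimates = [0.0] * n_arms   # incremental sample-average value estimates
    for _ in range(steps):
        if rng.random() < epsilon:                      # explore: random arm
            a = rng.randrange(n_arms)
        else:                                           # exploit: greedy arm
            a = max(range(n_arms), key=lambda i: estimates[i])
        reward = rng.gauss(true_means[a], 1.0)          # noisy reward signal
        counts[a] += 1
        # incremental update: Q_{n+1} = Q_n + (R - Q_n) / n
        estimates[a] += (reward - estimates[a]) / counts[a]
    return estimates, counts

# with enough steps, the best arm (index 2) is pulled most often
est, cnt = epsilon_greedy_bandit([0.2, 0.5, 1.0])
```

The incremental update rule avoids storing all past rewards and is the same pattern that reappears in the temporal-difference methods of later talks.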

Literature: PDF.

Notes about key concepts that should be discussed in the specific talks: PDF.

Use this guide to help you prepare your talk successfully.



Topics:

    Basic Concepts and Algorithms

  1. Introduction to RL and Multi-armed Bandits (SB Ch.1, Ch.2)
  2. Finite Markov Decision Processes (SB Ch.3)
  3. Dynamic Programming (SB Ch.4)
  4. Monte Carlo Methods (SB Ch.5)
  5. Temporal-Difference Learning and Q-learning (SB Ch.6)
  6. Function Approximation (SB Ch.9)

    Deep Q-Learning

  7. Learning to play video games with Reinforcement Learning. Mnih, Volodymyr, et al. "Human-level control through deep reinforcement learning." Nature 518.7540 (2015): 529 [PDF] and SB Ch.16.5.
  8. Double Q-Learning. Van Hasselt et al. "Deep reinforcement learning with double Q-learning." Thirtieth AAAI Conference on Artificial Intelligence. 2016 [PDF].

    Policy Gradient Methods

  9. Reinforcement Learning with Policy Gradients (SB Ch.13)
  10. Advantage Actor-Critic. Mnih, Volodymyr, et al. "Asynchronous methods for deep reinforcement learning." International Conference on Machine Learning. 2016 [PDF].
  11. Deterministic Policy Gradients. Silver, David, et al. "Deterministic policy gradient algorithms." 2014 [PDF]. See also [Sutton et al., 2000].
  12. PPO. Schulman, John, et al. "Proximal policy optimization algorithms." arXiv preprint arXiv:1707.06347 (2017) [PDF].

    Deep RL for Robotics

  13. Andrychowicz, Marcin, et al. "Learning dexterous in-hand manipulation." arXiv preprint arXiv:1808.00177 (2018) [PDF].
  14. Levine, Sergey, et al. "End-to-end training of deep visuomotor policies." The Journal of Machine Learning Research 17.1 (2016): 1334-1373 [PDF] (split into two talks, Part 1).
  15. Levine, Sergey, et al. "End-to-end training of deep visuomotor policies." The Journal of Machine Learning Research 17.1 (2016): 1334-1373 [PDF] (split into two talks, Part 2).

    Including Models

  16. Weber, Theophane, et al. "Imagination-augmented agents for deep reinforcement learning." arXiv preprint arXiv:1707.06203 (2017) [PDF].
  17. Francois-Lavet, Vincent, et al. "Combined reinforcement learning via abstract representations." Proceedings of the AAAI Conference on Artificial Intelligence. Vol. 33. 2019 [PDF].
  18. Planning and Learning with Tabular Methods (SB Ch.8)

    Deep RL for Board Games

  19. Learning to play board games with RL: Silver, David, et al. "Mastering the game of Go with deep neural networks and tree search." Nature 529.7587 (2016): 484 [PDF].
  20. Learning to play board games with RL: Silver, David, et al. "A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play." Science 362.6419 (2018): 1140-1144 [PDF, Supplement PDF].

Talks should be no longer than 35 minutes, and they should be clear, interesting, and informative, rather than a reprint of the material. Select which parts of the material you want to present and which not, and then present the selected material well (including definitions not given in the material: look them up on the web, or if that is not successful, ask the seminar organizers). Diagrams or figures are often useful in a talk. On the other hand, referring in the talk to numbered references listed at the end is a no-no (a talk is an online process, not meant to be read). For the same reason, you may also quickly repeat earlier definitions if you suspect that the audience may not remember them.


Talks will be assigned at the first seminar meeting. Students are requested to have a quick glance at the topics prior to this meeting in order to determine their preferences. Note that the number of participants for this seminar will be limited.

General rules:

Participation in the seminar meetings is obligatory. We also request your courtesy and attention for the seminar speaker: no smartphones, laptops, etc. during a talk. Furthermore, your active attention, questions, and discussion contributions are expected.

After your talk (and possibly some corrections), send a PDF of your talk to Charlotte Rumpf (charlotte.rumpf@tugraz.at), who will post it on the seminar webpage.




TALKS:

Date # Topic / paper title Presenter Presentation
29.10.2019 1 SB Ch 1,2 Kulmer Marvin Jonathan
29.10.2019 2 SB Ch 3 Feichtner Johannes
5.11.2019 3 SB Ch 4 Schögler Christoph
5.11.2019 4 SB Ch 5 Wachter Alexander
12.11.2019 5 SB Ch 6 Ziegler Dominik
12.11.2019 6 SB Ch 9 Fuchs Alexander
26.11.2019 7 Human-level control through deep reinforcement learning. Baronig Maximilian
26.11.2019 8 Deep reinforcement learning with double q-learning. Trapp Martin
9 SB Ch 13 Koschatko Katharina
10 Asynchronous methods for deep reinforcement learning. Khodachenko Ian
11 Deterministic policy gradient algorithms. Weinrauch Alexander
12 Proximal policy optimization algorithms. Toth Christian
13 Learning dexterous in-hand manipulation. Novak Markus
14 End-to-end training of deep visuomotor policies. Part 1 Nguyen Thi Kim Truc
15 End-to-end training of deep visuomotor policies. Part 2 Rohr Benjamin
16 Imagination-augmented agents for deep reinforcement learning. Lazaro Garcia Ernesto
17 Combined reinforcement learning via abstract representations. Kumar Chetan Srinivasa
18 SB Ch8 Simic Ilija
19 Mastering the game of Go with deep neural networks and tree search. Müllede Thoma
20 A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play. Könighofer Bettina
21 Unsupervised State Representation Learning in Atari. Maiti Shalini
22 Policy Gradient Methods for Reinforcement Learning with Function Approximation Ek Hanna Kristin