Seminar Computational Intelligence B (708.112)

SS 2018

Institut für Grundlagen der Informationsverarbeitung (708)

Lecturer:

Assoc. Prof. Dr. Robert Legenstein

Office hours: by appointment (via e-mail)

E-mail: robert.legenstein@igi.tugraz.at
Homepage: https://www.tugraz.at/institute/igi/team/prof-legenstein/




Location: IGI-seminar room, Inffeldgasse 16b/I, 8010 Graz
Date: starting on Tuesday, March 6, 2018, 13:15 - 15:00

Content of the seminar: Reinforcement Learning

In this seminar, we will discuss Reinforcement Learning in depth. Reinforcement Learning is a very important subfield of Machine Learning, where learning is not performed from explicit target labels, but from reward signals.
We will start with the basic concepts and algorithms. Our treatment of the topic will be based on a new book [PDF].

We will later turn our attention to current research topics.

Prior knowledge of reinforcement learning is not necessary. However, prior knowledge of machine learning is expected.


Topics:

    Basic Concepts and Algorithms

  1. Ch.2 Multi-armed Bandits
  2. Ch.3 Finite Markov Decision Processes
  3. Ch.4 Dynamic Programming
  4. Ch.5 Monte Carlo Methods
  5. Ch.6 Temporal-Difference Learning
  6. Ch.7 n-step Bootstrapping
  7. Ch.8 Planning and Learning with Tabular Methods

    Approximate Solutions

  8. Ch.9 On-policy prediction with approximation
  9. Ch.10 On-policy control with approximation
  10. Ch.12 Eligibility traces
  11. Ch.13 Policy Gradient Methods

    Reinforcement Learning in Humans and Animals

  12. Ch.14 Reinforcement Learning and Psychology
  13. Ch.15 Reinforcement Learning and Neuroscience

    Current Research Topics

  14. Learning to play video games with Reinforcement Learning: Mnih, Volodymyr, et al. "Human-level control through deep reinforcement learning." Nature 518.7540 (2015): 529 [PDF] and Ch.16.5.
  15. Learning to play board games with Reinforcement Learning: Silver, David, et al. "Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm." arXiv preprint arXiv:1712.01815 (2017). [PDF] and Ch.16.6.
  16. Learning to Learn: Wang, J. X., Kurth-Nelson, Z., Tirumala, D., Soyer, H., Leibo, J. Z., Munos, R., ... & Botvinick, M. (2016). Learning to reinforcement learn. [PDF]

Talks should be no longer than 35 minutes, and they should be clear, interesting and informative, rather than a reprint of the material. Select which parts of the material you want to present and which to leave out, and then present the selected material well (including definitions not given in the material: look them up on the web or, if that is not successful, ask the seminar organizers). Diagrams or figures are often useful for a talk. On the other hand, citing references in the talk by their numbers in a list at the end is a no-no (a talk is an online process, not meant to be read). For the same reason, you can also quickly repeat earlier definitions if you suspect that the audience may not remember them.


Talks will be assigned at the first seminar meeting. Students are requested to have a quick glance at the topics prior to this meeting in order to determine their preferences. Note that the number of participants for this seminar will be limited. Preference will be given to students who

  1. are writing or will write a Master's thesis at the institute
  2. are carrying out or will carry out a student project at the institute
  3. have registered early.

General rules:

Participation in the seminar meetings is obligatory. We also request your courtesy and attention for the seminar speaker: no smartphones, laptops, etc. during a talk. Furthermore, your active attention, questions, and discussion contributions are expected.

After your talk (and possibly some corrections), send a PDF of your talk to Charlotte Rumpf (charlotte.rumpf@tugraz.at), who will post it on the seminar webpage.




TALKS:

Date | # | Topic / paper title | Presenter 1 | Presenter 2 | Presentation
20.3.2018 | 1 | Chapter 2: Multi-armed bandits | Könighofer | | PDF
20.3.2018 | 2 | Chapter 3: Markov Decision Processes | Basirat | Ebrahimi | PDF
20.3.2018 | 3 | Chapter 4: Dynamic Programming | Karl | | PDF
24.4.2018 | 4 | Chapter 5: Monte Carlo methods | Gigerl | Petschenig | PDF
24.4.2018 | 5 | Chapter 6: Temporal-Difference Learning | Ahmetovic | Music | PDF
8.5.2018 | 6 | Chapter 7: n-step Bootstrapping | Kassarnig | | PDF
8.5.2018 | 7 | Chapter 8: Planning and Learning with Tabular Methods | Schlacher | Schlüsselbauer | PDF
15.5.2018 | 8 | Chapter 14: Reinforcement Learning and Psychology | Benninger | Hajek | PDF
15.5.2018 | 9 | Chapter 15: Reinforcement Learning and Neuroscience | Raggam | | canceled
29.5.2018 | 10 | Chapter 9: On-policy prediction with approximation | Toth | Zöhrer | PDF
29.5.2018 | 11 | Chapter 10: On-policy control with approximation | Hehenberger | | PDF
12.6.2018 | 12 | Chapter 12: Eligibility traces | Gherman | Moik | canceled
12.6.2018 | 13 | Chapter 13: Policy Gradient Methods | Spataru | | PDF
25.6.2018 | 15 | Learning to play board games with Reinforcement Learning | Remonda | | PDF