Seminar Computational Intelligence B (708.112)

Summer semester (SS) 2018

Institut für Grundlagen der Informationsverarbeitung (708)

Lecturer:

Assoc. Prof. Dr. Robert Legenstein

Office hours: by appointment (via e-mail)

E-mail: robert.legenstein@igi.tugraz.at
Homepage: https://www.tugraz.at/institute/igi/team/prof-legenstein/

Location: IGI-seminar room, Inffeldgasse 16b/I, 8010 Graz
Date: starting on Tuesday, March 6, 2018, 13:15 - 15:00

Content of the seminar: Reinforcement Learning

In this seminar, we will discuss Reinforcement Learning in depth. Reinforcement Learning is an important subfield of Machine Learning in which learning is driven by reward signals rather than by explicit target labels.
We will start with the basic concepts and algorithms. Our treatment of the topic will be based on a new book [PDF].

We will later turn our attention to current research topics.

Prior knowledge of reinforcement learning is not necessary. However, prior knowledge of machine learning is expected.

Topics:

Basic Concepts and Algorithms

Ch.2 Multi-armed Bandits
Ch.3 Finite Markov Decision Processes
Ch.4 Dynamic Programming
Ch.5 Monte Carlo Methods
Ch.6 Temporal-Difference Learning
Ch.7 n-step Bootstrapping
Ch.8 Planning and Learning with Tabular Methods

Approximate Solutions

Ch.9 On-policy Prediction with Approximation
Ch.10 On-policy Control with Approximation
Ch.12 Eligibility Traces
Ch.13 Policy Gradient Methods

Reinforcement Learning in Humans and Animals

Ch.14 Reinforcement Learning and Psychology
Ch.15 Reinforcement Learning and Neuroscience

Current Research Topics

Learning to play video games with Reinforcement Learning: Mnih, Volodymyr, et al. "Human-level control through deep reinforcement learning." Nature 518.7540 (2015): 529. [PDF] and Ch.16.5.
Learning to play board games with Reinforcement Learning: Silver, David, et al. "Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm." arXiv preprint arXiv:1712.01815 (2017). [PDF] and Ch.16.6.
Learning to Learn: Wang, J. X., Kurth-Nelson, Z., Tirumala, D., Soyer, H., Leibo, J. Z., Munos, R., ... & Botvinick, M. (2016). Learning to reinforcement learn. [PDF]

Talks should be no longer than 35 minutes, and they should be clear, interesting and informative, rather than a reprint of the material. Select which parts of the material you want to present and which to leave out, and then present the selected material well (including definitions not given in the material: look them up on the web or, if that is not successful, ask the seminar organizers). Diagrams and figures are often useful in a talk. On the other hand, citing references in the talk by the numbers under which they are listed at the end is a no-no (a talk is an online process, not meant to be read). For the same reason, you can also briefly repeat earlier definitions if you suspect that the audience may not remember them.

Talks will be assigned at the first seminar meeting. Students are requested to have a quick glance at the topics prior to this meeting in order to determine their preferences. Note that the number of participants for this seminar will be limited. Preference will be given to students who

are writing or will write a Master's Thesis at the institute

are carrying out or will carry out a Student's Project at the institute

have registered early.

General rules:

Participation in the seminar meetings is obligatory. We also request your courtesy and attention for the seminar speaker: no smartphones, laptops, etc. during a talk. Furthermore, your active attention, questions, and contributions to the discussion are expected.

After your talk (and possibly some corrections), send a PDF of your talk to Charlotte Rumpf, who will post it on the seminar webpage.

TALKS:

Date	#	Topic / paper title	Presenter 1	Presenter 2	Presentation
20.3.2018	1	Chapter 2: Multi-armed bandits	Könighofer		PDF
20.3.2018	2	Chapter 3: Markov Decision Processes	Basirat	Ebrahimi	PDF
20.3.2018	3	Chapter 4: Dynamic Programming	Karl		PDF
24.4.2018	4	Chapter 5: Monte Carlo methods	Gigerl	Petschenig	PDF
24.4.2018	5	Chapter 6: Temporal-Difference Learning	Ahmetovic	Music	PDF
8.5.2018	6	Chapter 7: n-step Bootstrapping	Kassarnig		PDF
8.5.2018	7	Chapter 8: Planning and Learning with Tabular Methods	Schlacher	Schlüsselbauer	PDF
15.5.2018	8	Chapter 14: Reinforcement Learning and Psychology	Benninger	Hajek	PDF
15.5.2018	9	Chapter 15: Reinforcement Learning and Neuroscience	Raggam		canceled
29.5.2018	10	Chapter 9: On-policy prediction with approximation	Toth	Zöhrer	PDF
29.5.2018	11	Chapter 10: On-policy control with approximation	Hehenberger		PDF
12.6.2018	12	Chapter 12: Eligibility traces	Gherman	Moik	canceled
12.6.2018	13	Chapter 13: Policy Gradient Methods	Spataru		PDF
25.6.2018	15	Learning to play board games with Reinforcement Learning	Remonda		PDF