Notice

AISB Event Bulletin Item

ICML-07 Tutorial on Bayesian Methods for Reinforcement Learning
http://www.cs.uwaterloo.ca/~ppoupart/ICML-07-tutorial-Bayes-RL.html
Corvallis, Oregon, USA
20 June 2007

MOTIVATION

Although Bayesian methods for Reinforcement Learning can be traced
back to the 1960s (Howard's work in Operations Research), they have
been used only sporadically in modern Reinforcement Learning. This is
in part because non-Bayesian approaches tend to be much simpler to
work with. However, recent advances have shown that Bayesian
approaches need not be as complex as initially thought, and that they
offer several theoretical advantages. For instance, by keeping track
of full distributions (instead of point estimates) over the unknowns,
Bayesian approaches permit a more comprehensive quantification of the
uncertainty regarding the transition probabilities, the rewards, the
value function parameters and the policy parameters. Such
distributional information can be used to optimize, in a principled
way, the classic exploration/exploitation tradeoff, which can speed
up the learning process. Similarly, active learning for reinforcement
learning can be optimized in a natural way. Performance gradients
with respect to value function and/or policy parameters can also be
estimated more accurately using less data. Bayesian approaches also
facilitate the encoding of prior knowledge and the explicit
formulation of domain assumptions. The primary goal of this tutorial
is to raise the research community's awareness of Bayesian methods,
their properties and their potential benefits for the advancement of
Reinforcement Learning.
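As a concrete illustration of how such distributional information can
drive exploration (a minimal sketch for orientation, not material from
the tutorial itself): in the model-based setting, a Dirichlet posterior
over the transition probabilities supports posterior sampling, where the
agent draws one plausible MDP from its posterior, acts greedily in that
sample, and updates the posterior with each observed transition. The
env_step interface and the assumption of known rewards below are
hypothetical simplifications.

    import numpy as np

    def posterior_sampling_episode(env_step, counts, rewards,
                                   gamma=0.95, horizon=100, n_vi_iters=200):
        """Run one episode of posterior (Thompson) sampling for
        model-based Bayesian RL.

        counts[s, a] holds Dirichlet parameters over successor states
        (the posterior over transition probabilities); rewards[s, a] is
        assumed known for simplicity, although the Bayesian setting also
        treats reward uncertainty.
        """
        n_states, n_actions, _ = counts.shape

        # Draw one plausible MDP from the Dirichlet posterior.
        P = np.zeros_like(counts, dtype=float)
        for s in range(n_states):
            for a in range(n_actions):
                P[s, a] = np.random.dirichlet(counts[s, a])

        # Solve the sampled MDP by value iteration and act greedily in it;
        # posterior uncertainty thus drives exploration automatically.
        Q = np.zeros((n_states, n_actions))
        for _ in range(n_vi_iters):
            Q = rewards + gamma * P @ Q.max(axis=1)  # (S,A,S) @ (S,) -> (S,A)

        s = 0  # assume episodes start in state 0
        for _ in range(horizon):
            a = int(Q[s].argmax())
            s_next, done = env_step(s, a)  # hypothetical environment interface
            counts[s, a, s_next] += 1      # conjugate posterior update
            s = s_next
            if done:
                break
        return counts

Starting from a uniform prior, counts = np.ones((S, A, S)), repeated
calls interleave learning and acting: early on the sampled MDPs disagree
widely, producing exploration, and they concentrate as data accumulates.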

OUTLINE

1. Introduction to Reinforcement Learning and Bayesian learning

2. History of Bayesian RL

3. Model-based Bayesian RL
3.1 Policy optimization techniques
3.2 Encoding of domain knowledge
3.3 Exploration/exploitation tradeoff and active learning
3.4 Bayesian imitation learning in RL
3.5 Bayesian multi-agent coordination and coalition formation in RL

4. Model-free Bayesian RL
4.1 Gaussian process temporal difference (GPTD; see the sketch after this outline)
4.2 Gaussian process SARSA
4.3 Bayesian policy gradient
4.4 Bayesian actor-critic algorithms

5. Demo
5.1 Control of an octopus arm using GPTD
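For orientation on item 4.1 above, here is a minimal sketch of the
quantity GPTD computes, under simplifying assumptions: a single
trajectory, an i.i.d. Gaussian noise model (corresponding to the
deterministic-transition case; the stochastic case uses a correlated
noise covariance), and a hypothetical squared-exponential kernel. GPTD
places a GP prior on the value function V, treats each reward as a noisy
observation of the Bellman residual V(x_t) - gamma*V(x_{t+1}), and
obtains the posterior over V in closed form.

    import numpy as np

    def gptd_posterior_mean(states, rewards, query, gamma=0.95,
                            lengthscale=1.0, noise_var=0.1):
        """Posterior mean of V under a GP prior, from one trajectory.

        states:  (T+1, d) array of visited states x_0 .. x_T
        rewards: (T,) array, r_t observed on the transition x_t -> x_{t+1}
        query:   (m, d) array of states at which to estimate V
        """
        def rbf(A, B):
            # Squared-exponential kernel; the kernel choice is an assumption.
            d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
            return np.exp(-0.5 * d2 / lengthscale ** 2)

        T = len(rewards)
        # H encodes the observation model r = H v + noise, one row per
        # transition: [0 ... 0, 1, -gamma, 0 ... 0].
        H = np.zeros((T, T + 1))
        H[np.arange(T), np.arange(T)] = 1.0
        H[np.arange(T), np.arange(T) + 1] = -gamma

        K = rbf(states, states)                  # prior covariance of V(x_{0:T})
        G = H @ K @ H.T + noise_var * np.eye(T)  # marginal covariance of rewards
        alpha = np.linalg.solve(G, rewards)
        # Posterior mean at the query points: k(x*, x_{0:T}) H^T G^{-1} r
        return rbf(query, states) @ H.T @ alpha

Extending the same construction to state-action values leads to the
GP-SARSA variant listed in 4.2, and a practical implementation such as
the octopus-arm demo in 5.1 would use an online, sparsified update
rather than the batch solve shown here.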

PRESENTERS

Pascal Poupart, University of Waterloo
ppoupart[at]cs[dot]uwaterloo[dot]ca
http://www.cs.uwaterloo.ca/~ppoupart

Pascal Poupart received a Ph.D. degree in Computer Science from the
University of Toronto in 2005. Since August 2004, he has been an
Assistant Professor in the David R. Cheriton School of Computer Science
at the University of Waterloo. Poupart's research focuses on the design
and analysis of scalable algorithms for sequential decision making
under uncertainty (including Bayesian reinforcement learning), with
applications to assistive technologies in eldercare, spoken dialogue
management and information retrieval. He has served on the program
committees of several international conferences, including AAMAS (2006,
2007), UAI (2005, 2006, 2007), ICML (2007), AAAI (2005, 2006, 2007),
NIPS (2007) and AISTATS (2007).

Mohammad Ghavamzadeh, University of Alberta
mgh[at]cs[dot]ualberta[dot]ca
http://www.cs.ualberta.ca/~mgh

Mohammad Ghavamzadeh received a Ph.D. degree in computer science from
the University of Massachusetts Amherst in 2005. Since September 2005
he has been a postdoctoral fellow at the Department of Computing
Science at the University of Alberta, working with Prof. Richard
Sutton. The main objective of his research is to investigate the
principles of scalable decision making grounded in real-world
applications. In the last two years, Ghavamzadeh's research has focused
mostly on using recent advances in statistical machine learning,
especially Bayesian reasoning and kernel methods, to develop more
scalable reinforcement learning algorithms.

Yaakov Engel, University of Alberta
yaki[at]cs[dot]ualberta[dot]ca
http://www.cs.ualberta.ca/~yaki

Yaakov Engel received a Ph.D. degree from the Hebrew University of
Jerusalem in 2005. Since April 2005 he has been a postdoctoral fellow
with the Alberta Ingenuity Centre for Machine Learning (AICML) at the
Department of Computing Science at the University of Alberta.