AISB event Bulletin Item

ICML-07 Tutorial on Bayesian Methods for Reinforcement Learning

Corvallis, Oregon, USA
20 June 2007


Although Bayesian methods for Reinforcement Learning can be traced
back to the 1960s (Howard's work in Operations Research), Bayesian
methods have only been used sporadically in modern Reinforcement
Learning. This is in part because non-Bayesian approaches tend to be
much simpler to work with. However, recent advances have shown that
Bayesian approaches need not be as complex as initially thought and
offer several theoretical advantages. For instance, by keeping track
of full distributions (instead of point estimates) over the unknowns,
Bayesian approaches permit a more comprehensive quantification of the
uncertainty regarding the transition probabilities, the rewards, the
value-function parameters and the policy parameters. Such
distributional information can be used to optimize, in a principled
way, the classic exploration/exploitation tradeoff, which can speed
up the learning process. Similarly, active learning for reinforcement
learning can be optimized naturally. The performance gradient with
respect to value-function and/or policy parameters can also be
estimated more accurately while using less data. Bayesian approaches
also facilitate the encoding of prior knowledge and the explicit
formulation of domain assumptions. The primary goal of this tutorial
is to raise the research community's awareness of Bayesian methods,
their properties and their potential benefits for the advancement of
Reinforcement Learning.
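
To make the distributional bookkeeping above concrete, here is a
minimal sketch (in Python; not material from the tutorial) of
model-based Bayesian RL on a toy MDP with known rewards: Dirichlet
posteriors are maintained over the unknown transition probabilities,
and a Thompson-style posterior sample drives action selection, so
exploration emerges from the spread of the posterior. The toy MDP,
the per-step resampling schedule and all names are illustrative
assumptions.

import numpy as np

n_states, n_actions, gamma = 5, 2, 0.95
rng = np.random.default_rng(0)

# Hypothetical ground-truth dynamics and known rewards; the dynamics
# are what the agent is uncertain about.
true_P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))
R = rng.uniform(0.0, 1.0, size=(n_states, n_actions))

# Dirichlet(1,...,1) prior over the next-state distribution of each
# (state, action) pair; alpha holds the posterior counts.
alpha = np.ones((n_states, n_actions, n_states))

def greedy_policy(P, iters=100):
    # Value iteration on a sampled MDP; returns the greedy policy.
    V = np.zeros(n_states)
    for _ in range(iters):
        Q = R + gamma * (P @ V)   # Q[s, a] = R[s, a] + gamma * E[V(s')]
        V = Q.max(axis=1)
    return Q.argmax(axis=1)

s = 0
for step in range(1000):
    # Thompson sampling: draw one plausible MDP from the posterior ...
    P_sample = np.array([[rng.dirichlet(alpha[i, a]) for a in range(n_actions)]
                         for i in range(n_states)])
    # ... and act greedily with respect to that sample.
    a = greedy_policy(P_sample)[s]
    s_next = rng.choice(n_states, p=true_P[s, a])
    alpha[s, a, s_next] += 1      # conjugate posterior update
    s = s_next

print(alpha[0, 0])                # posterior counts for state 0, action 0

Resampling once per step keeps the sketch short; episodic variants
resample less often, but the conjugate Dirichlet update is the same
either way.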


1. Introduction to Reinforcement Learning and Bayesian learning

2. History of Bayesian RL

3. Model-based Bayesian RL
3.1 Policy optimization techniques
3.2 Encoding of domain knowledge
3.3 Exploration/exploitation tradeoff and active learning
3.4 Bayesian imitation learning in RL
3.5 Bayesian multi-agent coordination and coalition formation in RL

4. Model-free Bayesian RL
4.1 Gaussian process temporal difference (GPTD; a sketch follows this outline)
4.2 Gaussian process SARSA
4.3 Bayesian policy gradient
4.4 Bayesian actor-critic algorithms

5. Demo
5.1 Control of an octopus arm using GPTD
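
For the model-free side, the GPTD idea of item 4.1 can be sketched in
a few lines: the value function gets a GP prior, and each observed
reward is treated as a noisy observation of the Bellman residual
V(s_t) - gamma*V(s_{t+1}), so standard GP regression yields a
posterior mean and variance for V. The kernel, the synthetic
one-dimensional trajectory and the white-noise model (the
deterministic-transition variant of GPTD) are illustrative
assumptions; this is not the octopus-arm demo.

import numpy as np

gamma, sigma2 = 0.9, 0.1
rng = np.random.default_rng(1)

def kernel(a, b, scale=1.0):
    # Squared-exponential kernel on one-dimensional states.
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / scale ** 2)

# A single hypothetical trajectory s_0, ..., s_T and its rewards.
states = rng.uniform(0.0, 5.0, size=21)
rewards = np.sin(states[:-1]) + 0.1 * rng.standard_normal(20)
T = len(rewards)

# H encodes the temporal-difference relation: (H v)_t = v_t - gamma * v_{t+1}.
H = np.zeros((T, T + 1))
H[np.arange(T), np.arange(T)] = 1.0
H[np.arange(T), np.arange(T) + 1] = -gamma

K = kernel(states, states)              # prior covariance of V at visited states
G = H @ K @ H.T + sigma2 * np.eye(T)    # covariance of the observed rewards

# GP posterior over V at query states (mean and pointwise variance).
query = np.linspace(0.0, 5.0, 6)
k_q = kernel(query, states)
w = np.linalg.solve(G, rewards)
mean_V = k_q @ H.T @ w
cov_V = kernel(query, query) - k_q @ H.T @ np.linalg.solve(G, H @ k_q.T)

print(mean_V)                                     # posterior value estimates
print(np.sqrt(np.clip(np.diag(cov_V), 0, None)))  # posterior uncertainty

The published GPTD algorithms compute this posterior online, with
recursive updates and sparsification; the batch closed form above
just exposes the underlying Bayesian regression view.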


Pascal Poupart, University of Waterloo

Pascal Poupart received a Ph.D. degree in Computer Science from the
University of Toronto in 2005. Since August 2004, he has been an
Assistant Professor in the David R. Cheriton School of Computer Science at the
University of Waterloo. Poupart's research focuses on the design and
analysis of scalable algorithms for sequential decision making under
uncertainty (including Bayesian reinforcement learning), with
application to assistive technologies in eldercare, spoken dialogue
management and information retrieval. He has served on the program
committee of several international conferences, including AAMAS
(2006, 2007), UAI (2005, 2006, 2007), ICML (2007), AAAI (2005, 2006,
2007), NIPS (2007) and AISTATS (2007).

Mohammad Ghavamzadeh, University of Alberta

Mohammad Ghavamzadeh received a Ph.D. degree in computer science
from the University of Massachusetts Amherst in 2005. Since September
2005 he has been a postdoctoral fellow at the Department of Computing
Science at the University of Alberta, working with Prof. Richard
Sutton. The main objective of his research is to investigate the
principles of scalable decision-making grounded in real-world
applications. In the last two years, Ghavamzadeh's research has
focused mostly on using recent advances in statistical machine
learning, especially Bayesian reasoning and kernel methods, to develop
more scalable reinforcement learning algorithms.

Yaakov Engel, University of Alberta

Yaakov Engel received a Ph.D. degree from the Hebrew University of
Jerusalem in 2005. Since April 2005 he has been a postdoctoral fellow
with the Alberta Ingenuity Centre for Machine Learning (AICML) at the
Department of Computing Science at the University of Alberta.