
AISB event Bulletin Item

ICML-07 Tutorial on Bayesian Methods for Reinforcement Learning

Corvallis, Oregon, USA
20 June 2007


Although Bayesian methods for Reinforcement Learning can be traced
back to the 1960s (Howard's work in Operations Research), Bayesian
methods have only been used sporadically in modern Reinforcement
Learning. This is in part because non-Bayesian approaches tend to be
much simpler to work with. However, recent advances have shown that
Bayesian approaches do not need to be as complex as initially thought
and offer several theoretical advantages. For instance, by keeping
track of full distributions (instead of point estimates) over the
unknowns, Bayesian approaches permit a more comprehensive
quantification of the uncertainty regarding the transition
probabilities, the rewards, the value function parameters and the
policy parameters. Such distributional information can be used to
optimize (in a principled way) the classic exploration/exploitation
tradeoff, which can speed up the learning process. Similarly, active
learning for reinforcement learning can be naturally optimized. The
estimation of the performance gradient with respect to value function
and/or policy parameters can also be done more accurately while using
less data. Bayesian approaches also facilitate the encoding of prior
knowledge and the explicit formulation of domain assumptions. The
primary goal of this tutorial is to raise the awareness of the
research community with regard to Bayesian methods, their properties
and potential benefits for the advancement of Reinforcement Learning.
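
As a rough illustration of what "keeping track of full distributions
(instead of point estimates)" can mean for transition probabilities,
the sketch below maintains a Dirichlet posterior over next states for
a single state-action pair. It is not taken from the tutorial
material: the class name DirichletTransitionModel, the symmetric
prior, and the three-state example are assumptions made purely for
illustration.

```python
import random


class DirichletTransitionModel:
    """Hypothetical helper: Dirichlet posterior over next-state
    probabilities for one (state, action) pair."""

    def __init__(self, n_states, prior_count=1.0):
        # Assumed symmetric prior: one pseudo-count per next state.
        self.alpha = [prior_count] * n_states

    def update(self, next_state):
        # Conjugate update: each observed transition adds one count.
        self.alpha[next_state] += 1.0

    def mean(self):
        # Posterior mean, i.e. the usual point estimate.
        total = sum(self.alpha)
        return [a / total for a in self.alpha]

    def variance(self):
        # Per-component posterior variance: the uncertainty that a
        # point estimate alone would discard.
        total = sum(self.alpha)
        return [a * (total - a) / (total ** 2 * (total + 1.0))
                for a in self.alpha]

    def sample(self, rng=random):
        # Draw one plausible transition distribution from the posterior
        # by normalising independent Gamma(alpha_i, 1) variates.
        draws = [rng.gammavariate(a, 1.0) for a in self.alpha]
        total = sum(draws)
        return [d / total for d in draws]


if __name__ == "__main__":
    model = DirichletTransitionModel(n_states=3)
    for observed_next_state in [0, 0, 1, 0]:
        model.update(observed_next_state)
    print("posterior mean:    ", model.mean())
    print("posterior variance:", model.variance())
    print("one posterior draw:", model.sample())
```

Planning against sampled models rather than the mean alone is one
simple way such distributional information can drive exploration:
rarely visited state-action pairs have high variance, so their samples
differ more from one draw to the next.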


1. Introduction to Reinforcement Learning and Bayesian learning

2. History of Bayesian RL

3. Model-based Bayesian RL
3.1 Policy optimization techniques
3.2 Encoding of domain knowledge
3.3 Exploration/exploitation tradeoff and active learning
3.4 Bayesian imitation learning in RL
3.5 Bayesian multi-agent coordination and coalition formation in RL

4. Model-free Bayesian RL
4.1 Gaussian process temporal difference (GPTD)
4.2 Gaussian process SARSA
4.3 Bayesian policy gradient
4.4 Bayesian actor-critic algorithms

5. Demo
5.1 Control of an octopus arm using GPTD


Pascal Poupart, University of Waterloo

Pascal Poupart received a Ph.D. degree in Computer Science from the
University of Toronto in 2005. Since August 2004, he has been an Assistant
Professor in the David R. Cheriton School of Computer Science at the
University of Waterloo. Poupart's research focuses on the design and
analysis of scalable algorithms for sequential decision making under
uncertainty (including Bayesian reinforcement learning), with
application to assistive technologies in eldercare, spoken dialogue
management and information retrieval. He has served on the program
committee of several international conferences, including AAMAS
(2006, 2007), UAI (2005, 2006, 2007), ICML (2007), AAAI (2005, 2006,
2007), NIPS (2007) and AISTATS (2007).

Mohammad Ghavamzadeh, University of Alberta

Mohammad Ghavamzadeh received a Ph.D. degree in computer science
from the University of Massachusetts Amherst in 2005. Since September
2005 he has been a postdoctoral fellow at the Department of Computing
Science at the University of Alberta, working with Prof. Richard
Sutton. The main objective of his research is to investigate the
principles of scalable decision-making grounded by real-world
applications. In the last two years, Ghavamzadeh's research has been
mostly focused on using recent advances in statistical machine
learning, especially Bayesian reasoning and kernel methods, to develop
more scalable reinforcement learning algorithms.

Yaakov Engel, University of Alberta

Yaakov Engel received a Ph.D. degree from the Hebrew University of
Jerusalem in 2005. Since April 2005 he has been a postdoctoral fellow
with the Alberta Ingenuity Centre for Machine Learning (AICML) at the
Department of Computing Science at the University of Alberta.