Robotics: Science and Systems XV

DESPOT-Alpha: Online POMDP Planning with Large State and Observation Spaces

Neha Priyadarshini Garg, David Hsu, Wee Sun Lee


State-of-the-art sampling-based online POMDP solvers compute near-optimal policies for POMDPs with very large state spaces. However, when faced with large observation spaces, they may become overly optimistic and compute sub-optimal policies because of particle divergence. This paper presents a new online POMDP solver, DESPOT-α, which builds upon the widely used DESPOT solver. DESPOT-α improves the practical performance of online planning for POMDPs with large observation as well as state spaces. Like DESPOT, DESPOT-α uses the particle belief approximation and searches a determinized sparse belief tree. To tackle large observation spaces, DESPOT-α shares sub-policies among many observations during online policy computation. The value function of a sub-policy is a linear function of the belief, commonly known as an alpha-vector. We introduce a particle approximation of the alpha-vector to improve the efficiency of online policy search. We further speed up DESPOT-α using the CPU and GPU parallelization ideas introduced in HyP-DESPOT. Experimental results show that DESPOT-α and HyP-DESPOT-α outperform DESPOT and HyP-DESPOT on POMDPs with large observation spaces, including a complex simulation task involving an autonomous vehicle driving among many pedestrians.
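The particle approximation of the alpha-vector mentioned above can be illustrated with a minimal sketch: since an alpha-vector's value is linear in the belief, a weighted particle set approximates it by a weighted average of the vector's per-state values. The function names, the dict-based alpha-vector representation, and the weighting scheme below are illustrative assumptions, not the paper's actual data structures.

```python
def alpha_value(alpha, particles, weights):
    """Approximate V_alpha(b) = sum_s b(s) * alpha(s) using a weighted
    particle set: V_alpha(b) ~ (1/W) * sum_i w_i * alpha(s_i).

    alpha:     dict mapping state -> value (illustrative representation)
    particles: list of sampled states representing the belief
    weights:   importance weights of the particles
    """
    total_w = sum(weights)
    return sum(w * alpha[s] for s, w in zip(particles, weights)) / total_w


def best_sub_policy_value(alphas, particles, weights):
    """Lower-bound the belief's value by the best sub-policy, i.e. the
    maximum over the available alpha-vectors at this particle belief."""
    return max(alpha_value(a, particles, weights) for a in alphas)
```

For example, with particles [0, 0, 1] of equal weight and an alpha-vector {0: 1.0, 1: 0.0}, the approximated value is 2/3 — the weighted fraction of particles in state 0 times that state's value.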



@INPROCEEDINGS{Garg-RSS-19,
    AUTHOR    = {Neha Priyadarshini Garg AND David Hsu AND Wee Sun Lee},
    TITLE     = {DESPOT-Alpha: Online POMDP Planning with Large State and Observation Spaces},
    BOOKTITLE = {Proceedings of Robotics: Science and Systems},
    YEAR      = {2019},
    ADDRESS   = {Freiburg im Breisgau, Germany},
    MONTH     = {June},
    DOI       = {10.15607/RSS.2019.XV.006}
}