Sequential, Sparse Learning in Gaussian Processes
8 September 2003
Dan Cornford, Lehel Csato and Manfred Opper
The application of Gaussian processes (or Gaussian random field models, in the spatial context) has historically been limited to small datasets. This limitation is imposed by the need to store and invert the covariance matrix of all the samples in order to obtain a predictive distribution at unsampled locations. Various ad hoc approaches to this problem have been adopted, such as selecting a neighbourhood region and/or a small number of observations to use in the kriging process, but these have no sound theoretical basis and it is unclear what information is being lost. In this paper we present a recently developed Bayesian method for estimating the mean and covariance structures of a Gaussian process using a sequential learning algorithm which attempts to minimise the relative entropy between the true posterior process and the approximating Gaussian process. By imposing sparsity within a well-defined framework, the algorithm retains a subset of `basis vectors' which best represent the `true' posterior Gaussian random field model in the relative entropy sense (that is, both the mean and the covariance are taken into account in the approximation). This allows a principled treatment of Gaussian processes on very large datasets, particularly when the process is regarded as a latent variable model which may be non-linearly related to the observations. We show the application of sequential, sparse learning in Gaussian processes to wind field modelling and discuss its merits and drawbacks.
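The basis-vector selection idea described above can be sketched in a few lines. The code below is an illustrative simplification, not the authors' full algorithm (which also sequentially projects the approximate posterior mean and covariance in the relative-entropy sense): it keeps a data point as a basis vector only when its kernel feature cannot be well approximated by a projection onto the current basis set, then performs standard GP regression on that reduced subset. The kernel choice, tolerance, and noise level are illustrative assumptions.

```python
import numpy as np

def rbf(a, b, ls=1.0):
    # Squared-exponential (RBF) covariance between two sets of points.
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ls ** 2)

def select_basis(X, tol=1e-2, ls=1.0):
    """Sequentially retain points whose projection residual onto the
    span of the current basis vectors exceeds tol (a novelty criterion
    in the spirit of the sparse online GP; posterior updates omitted)."""
    bv = [0]
    for i in range(1, len(X)):
        Kbb = rbf(X[bv], X[bv], ls)
        kxb = rbf(X[i:i + 1], X[bv], ls)
        # Residual variance of k(x, .) after projecting onto the basis set.
        gamma = float(rbf(X[i:i + 1], X[i:i + 1], ls)
                      - kxb @ np.linalg.solve(Kbb, kxb.T))
        if gamma > tol:
            bv.append(i)
    return bv

def gp_predict(Xb, yb, Xs, ls=1.0, noise=1e-2):
    # Standard GP regression mean using only the retained basis vectors.
    K = rbf(Xb, Xb, ls) + noise * np.eye(len(Xb))
    return rbf(Xs, Xb, ls) @ np.linalg.solve(K, yb)

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(200)

bv = select_basis(X, tol=1e-2)
mu = gp_predict(X[bv], y[bv], X)
print(len(bv), "basis vectors retained out of", len(X), "observations")
```

Because the retained set grows only until the input space is covered at the kernel lengthscale, the cost of prediction is governed by the (small) number of basis vectors rather than the full sample size, which is what makes the approach attractive for very large spatial datasets.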
Reference: Proceedings of the 7th International Conference on GeoComputation, University of Southampton, United Kingdom, 8-10 September 2003. CD-ROM, "GeoComputation CD-ROM", produced by D. Martin.