Building Gaussian Process Regression

This example shows how to build Gaussian process regression models with bayeso. We cover three scenarios: a Gaussian process with fixed hyperparameters, a Gaussian process with hyperparameters learned by marginal likelihood maximization, and a Gaussian process with particular prior mean functions.

First, import NumPy and the bayeso modules we need.

import numpy as np

from bayeso import covariance
from bayeso.gp import gp
from bayeso.utils import utils_covariance
from bayeso.utils import utils_plotting

Declare some parameters to control this example. use_tex is a flag for rendering labels in a LaTeX style, num_test is the number of test data points, and str_cov selects the covariance kernel.

use_tex = False
num_test = 200
str_cov = 'matern52'

Make a simple synthetic dataset using a cosine function. The underlying true function is cos(x) + 10.

X_train = np.array([
    [-3.0],
    [-2.0],
    [-1.0],
    [2.0],
    [1.2],
    [1.1],
])
Y_train = np.cos(X_train) + 10.0
X_test = np.linspace(-3, 3, num_test)
X_test = X_test.reshape((num_test, 1))
Y_test = np.cos(X_test) + 10.0

Sample functions from a zero-mean prior distribution. As shown in the figure below, num_samples smooth functions are sampled.

mu = np.zeros(num_test)
hyps = utils_covariance.get_hyps(str_cov, 1)
Sigma = covariance.cov_main(str_cov, X_test, X_test, hyps, True)

Ys = gp.sample_functions(mu, Sigma, num_samples=5)
utils_plotting.plot_gp_via_sample(X_test, Ys, use_tex=use_tex,
                                  str_x_axis='$x$', str_y_axis='$y$')
(Figure: gp_sampled_prior)
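Under the hood, sampling from a GP prior amounts to drawing from a multivariate normal with the prior mean and covariance. A minimal NumPy sketch of this idea, using a squared-exponential kernel for simplicity rather than bayeso's matern52 (sample_prior is a hypothetical name, not a bayeso function):

```python
import numpy as np

def sample_prior(mu, Sigma, num_samples, jitter=1e-6, seed=42):
    # Add a small jitter to the diagonal for numerical stability,
    # then draw num_samples functions from N(mu, Sigma).
    rng = np.random.default_rng(seed)
    Sigma_stable = Sigma + jitter * np.eye(Sigma.shape[0])
    return rng.multivariate_normal(mu, Sigma_stable, size=num_samples)

X = np.linspace(-3, 3, 50).reshape(-1, 1)
Sigma = np.exp(-0.5 * (X - X.T) ** 2)  # squared-exponential covariance
Ys = sample_prior(np.zeros(50), Sigma, num_samples=5)
print(Ys.shape)  # (5, 50): each row is one sampled function
```

Each row of the returned array is one function evaluated at the 50 test locations, which is why nearby entries within a row vary smoothly.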

Build a Gaussian process regression model with fixed hyperparameters, obtained via get_hyps. mu, sigma, and Sigma are the mean estimates, standard deviation estimates, and covariance estimates, respectively. In addition, num_samples functions are sampled using mu and Sigma. Then, plot the results.

hyps = utils_covariance.get_hyps(str_cov, 1)
mu, sigma, Sigma = gp.predict_with_hyps(X_train, Y_train, X_test, hyps, str_cov=str_cov)
utils_plotting.plot_gp_via_distribution(
    X_train, Y_train, X_test, mu, sigma,
    Y_test=Y_test, use_tex=use_tex,
    str_x_axis='$x$', str_y_axis='$y$'
)

Ys = gp.sample_functions(mu.flatten(), Sigma, num_samples=5)
utils_plotting.plot_gp_via_sample(X_test, Ys, use_tex=use_tex,
                                  str_x_axis='$x$', str_y_axis='$y$')
(Figures: gp_fixed, gp_sampled_fixed)
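For reference, the closed-form posterior that predict_with_hyps computes is mu_* = K_*^T (K + sigma_n^2 I)^{-1} y and Sigma_* = K_** - K_*^T (K + sigma_n^2 I)^{-1} K_*. A hedged NumPy sketch of these equations, using a squared-exponential kernel in place of matern52 and a hypothetical gp_posterior helper (not bayeso's implementation):

```python
import numpy as np

def se_kernel(X, Xp):
    # Squared-exponential kernel with unit signal variance and lengthscale.
    return np.exp(-0.5 * (X - Xp.T) ** 2)

def gp_posterior(X_train, Y_train, X_test, kernel, noise=1e-2):
    # Cholesky-based GP posterior: solve with L instead of inverting K.
    K = kernel(X_train, X_train) + noise * np.eye(len(X_train))
    K_s = kernel(X_train, X_test)
    K_ss = kernel(X_test, X_test)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, Y_train))
    mu = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    Sigma = K_ss - v.T @ v
    sigma = np.sqrt(np.clip(np.diag(Sigma), 0.0, None)).reshape(-1, 1)
    return mu, sigma, Sigma

X_train = np.array([[-3.0], [-2.0], [-1.0], [2.0], [1.2], [1.1]])
Y_train = np.cos(X_train) + 10.0
X_test = np.linspace(-3, 3, 200).reshape(-1, 1)
mu, sigma, Sigma = gp_posterior(X_train, Y_train, X_test, se_kernel)
print(mu.shape, sigma.shape, Sigma.shape)  # (200, 1) (200, 1) (200, 200)
```

Near the training inputs the posterior mean interpolates the observations and sigma shrinks; away from the data it reverts toward the zero prior mean with growing uncertainty.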

Build a Gaussian process regression model with the hyperparameters optimized by marginal likelihood maximization, and plot the results. As in the previous case, num_samples functions are sampled from the distribution defined by mu and Sigma.

mu, sigma, Sigma = gp.predict_with_optimized_hyps(X_train, Y_train, X_test, str_cov=str_cov)
utils_plotting.plot_gp_via_distribution(
    X_train, Y_train, X_test, mu, sigma,
    Y_test=Y_test, use_tex=use_tex,
    str_x_axis='$x$', str_y_axis='$y$'
)

Ys = gp.sample_functions(mu.flatten(), Sigma, num_samples=5)
utils_plotting.plot_gp_via_sample(X_test, Ys, use_tex=use_tex,
                                  str_x_axis='$x$', str_y_axis='$y$')
(Figures: gp_optimized, gp_sampled_optimized)
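Marginal likelihood maximization minimizes the negative log marginal likelihood, -log p(y | X, theta) = 0.5 y^T K^{-1} y + 0.5 log|K| + (n/2) log 2pi, over the kernel hyperparameters. A sketch of that objective for a squared-exponential kernel (a hypothetical helper for illustration, not bayeso's actual optimizer):

```python
import numpy as np

def neg_log_marginal_likelihood(X, Y, lengthscale, signal, noise):
    # -log p(Y | X, theta) for a zero-mean GP with a squared-exponential kernel.
    n = X.shape[0]
    sqdist = (X - X.T) ** 2
    K = signal**2 * np.exp(-0.5 * sqdist / lengthscale**2) + noise**2 * np.eye(n)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, Y))
    data_fit = 0.5 * (Y.T @ alpha).item()       # 0.5 y^T K^{-1} y
    complexity = np.sum(np.log(np.diag(L)))     # 0.5 log |K| via Cholesky
    const = 0.5 * n * np.log(2.0 * np.pi)
    return data_fit + complexity + const

X = np.array([[-3.0], [-2.0], [-1.0], [2.0], [1.2], [1.1]])
Y = np.cos(X) + 10.0
Y = Y - np.mean(Y)  # center targets, since this sketch assumes a zero mean
nll = neg_log_marginal_likelihood(X, Y, 1.0, 1.0, 0.1)
print(nll)
```

An optimizer such as scipy.optimize.minimize would search over (lengthscale, signal, noise) to drive this value down; the data-fit and complexity terms trade off against each other, which is what prevents the optimized hyperparameters from overfitting.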

Declare some functions to be employed as prior mean functions. cosine is a cosine prior, linear_down is a linearly decreasing prior, and linear_up is a linearly increasing prior.

def cosine(X):
    return np.cos(X)

def linear_down(X):
    list_down = []
    for elem_X in X:
        list_down.append([-0.5 * np.sum(elem_X)])
    return np.array(list_down)

def linear_up(X):
    list_up = []
    for elem_X in X:
        list_up.append([0.5 * np.sum(elem_X)])
    return np.array(list_up)
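The loop-based priors above can also be written with vectorized NumPy. linear_down_vec and linear_up_vec are hypothetical names for these equivalents; they sum each row, scale by -0.5 or 0.5, and keep the column-vector shape that the loop versions produce:

```python
import numpy as np

def linear_down_vec(X):
    # Vectorized decreasing prior: row sums scaled by -0.5, shape (n, 1).
    return -0.5 * np.sum(X, axis=1, keepdims=True)

def linear_up_vec(X):
    # Vectorized increasing prior: row sums scaled by 0.5, shape (n, 1).
    return 0.5 * np.sum(X, axis=1, keepdims=True)

X = np.array([[-3.0], [-2.0], [-1.0]])
print(linear_down_vec(X))  # [[1.5], [1.0], [0.5]]
```

The keepdims=True argument preserves the (n, 1) output shape, matching what the per-element loops return.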

Make another synthetic dataset using a cosine function. The true function is cos(x) + 2.

X_train = np.array([
    [-3.0],
    [-2.0],
    [-1.0],
])
Y_train = np.cos(X_train) + 2.0
X_test = np.linspace(-3, 6, num_test)
X_test = X_test.reshape((num_test, 1))
Y_test = np.cos(X_test) + 2.0

Build Gaussian process regression models with the prior mean functions declared above and the hyperparameters optimized by marginal likelihood maximization, and plot the results. Again, num_samples functions are sampled from the distributions defined by mu and Sigma.

prior_mu = cosine
mu, sigma, Sigma = gp.predict_with_optimized_hyps(X_train, Y_train, X_test,
                                                  str_cov=str_cov, prior_mu=prior_mu)
utils_plotting.plot_gp_via_distribution(
    X_train, Y_train, X_test, mu, sigma,
    Y_test=Y_test, use_tex=use_tex,
    str_x_axis='$x$', str_y_axis='$y$'
)

Ys = gp.sample_functions(mu.flatten(), Sigma, num_samples=5)
utils_plotting.plot_gp_via_sample(X_test, Ys, use_tex=use_tex,
                                  str_x_axis='$x$', str_y_axis='$y$')

prior_mu = linear_down
mu, sigma, Sigma = gp.predict_with_optimized_hyps(X_train, Y_train, X_test,
                                                  str_cov=str_cov, prior_mu=prior_mu)
utils_plotting.plot_gp_via_distribution(
    X_train, Y_train, X_test, mu, sigma,
    Y_test=Y_test, use_tex=use_tex,
    str_x_axis='$x$', str_y_axis='$y$'
)

Ys = gp.sample_functions(mu.flatten(), Sigma, num_samples=5)
utils_plotting.plot_gp_via_sample(X_test, Ys, use_tex=use_tex,
                                  str_x_axis='$x$', str_y_axis='$y$')

prior_mu = linear_up
mu, sigma, Sigma = gp.predict_with_optimized_hyps(X_train, Y_train, X_test,
                                                  str_cov=str_cov, prior_mu=prior_mu)
utils_plotting.plot_gp_via_distribution(
    X_train, Y_train, X_test, mu, sigma,
    Y_test=Y_test, use_tex=use_tex,
    str_x_axis='$x$', str_y_axis='$y$'
)

Ys = gp.sample_functions(mu.flatten(), Sigma, num_samples=5)
utils_plotting.plot_gp_via_sample(X_test, Ys, use_tex=use_tex,
                                  str_x_axis='$x$', str_y_axis='$y$')
(Figures: gp_optimized_prior_cosine, gp_sampled_optimized_prior_cosine, gp_optimized_prior_linear_down, gp_sampled_optimized_prior_linear_down, gp_optimized_prior_linear_up, gp_sampled_optimized_prior_linear_up)
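With a nonzero prior mean m(x), the posterior mean becomes m(X_*) + K_*^T (K + sigma_n^2 I)^{-1} (y - m(X)): the prior mean is subtracted from the targets, the residuals are regressed, and the prior mean is added back at the test locations. A minimal sketch of this recentering, again with a squared-exponential kernel and a hypothetical helper name:

```python
import numpy as np

def se_kernel(X, Xp):
    return np.exp(-0.5 * (X - Xp.T) ** 2)

def cosine(X):
    return np.cos(X)

def gp_posterior_mean_with_prior(X_train, Y_train, X_test, kernel,
                                 prior_mu, noise=1e-2):
    # Regress the residuals Y - m(X), then add m back at the test points.
    K = kernel(X_train, X_train) + noise * np.eye(len(X_train))
    K_s = kernel(X_train, X_test)
    residual = Y_train - prior_mu(X_train)
    return prior_mu(X_test) + K_s.T @ np.linalg.solve(K, residual)

X_train = np.array([[-3.0], [-2.0], [-1.0]])
Y_train = np.cos(X_train) + 2.0
X_test = np.linspace(-3, 6, 200).reshape(-1, 1)
mu = gp_posterior_mean_with_prior(X_train, Y_train, X_test, se_kernel, cosine)
print(mu.shape)  # (200, 1)
```

This makes the effect visible in the plots above: far from the training data (toward x = 6), the posterior mean reverts to the chosen prior, cosine, linear_down, or linear_up, instead of falling back to zero.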

Full code:

import numpy as np

from bayeso import covariance
from bayeso.gp import gp
from bayeso.utils import utils_covariance
from bayeso.utils import utils_plotting

use_tex = False
num_test = 200
str_cov = 'matern52'

X_train = np.array([
    [-3.0],
    [-2.0],
    [-1.0],
    [2.0],
    [1.2],
    [1.1],
])
Y_train = np.cos(X_train) + 10.0
X_test = np.linspace(-3, 3, num_test)
X_test = X_test.reshape((num_test, 1))
Y_test = np.cos(X_test) + 10.0

mu = np.zeros(num_test)
hyps = utils_covariance.get_hyps(str_cov, 1)
Sigma = covariance.cov_main(str_cov, X_test, X_test, hyps, True)

Ys = gp.sample_functions(mu, Sigma, num_samples=5)
utils_plotting.plot_gp_via_sample(X_test, Ys, use_tex=use_tex,
                                  str_x_axis='$x$', str_y_axis='$y$')

hyps = utils_covariance.get_hyps(str_cov, 1)
mu, sigma, Sigma = gp.predict_with_hyps(X_train, Y_train, X_test, hyps, str_cov=str_cov)
utils_plotting.plot_gp_via_distribution(
    X_train, Y_train, X_test, mu, sigma,
    Y_test=Y_test, use_tex=use_tex,
    str_x_axis='$x$', str_y_axis='$y$'
)

Ys = gp.sample_functions(mu.flatten(), Sigma, num_samples=5)
utils_plotting.plot_gp_via_sample(X_test, Ys, use_tex=use_tex,
                                  str_x_axis='$x$', str_y_axis='$y$')

mu, sigma, Sigma = gp.predict_with_optimized_hyps(X_train, Y_train, X_test, str_cov=str_cov)
utils_plotting.plot_gp_via_distribution(
    X_train, Y_train, X_test, mu, sigma,
    Y_test=Y_test, use_tex=use_tex,
    str_x_axis='$x$', str_y_axis='$y$'
)

Ys = gp.sample_functions(mu.flatten(), Sigma, num_samples=5)
utils_plotting.plot_gp_via_sample(X_test, Ys, use_tex=use_tex,
                                  str_x_axis='$x$', str_y_axis='$y$')

def cosine(X):
    return np.cos(X)

def linear_down(X):
    list_down = []
    for elem_X in X:
        list_down.append([-0.5 * np.sum(elem_X)])
    return np.array(list_down)

def linear_up(X):
    list_up = []
    for elem_X in X:
        list_up.append([0.5 * np.sum(elem_X)])
    return np.array(list_up)

X_train = np.array([
    [-3.0],
    [-2.0],
    [-1.0],
])
Y_train = np.cos(X_train) + 2.0
X_test = np.linspace(-3, 6, num_test)
X_test = X_test.reshape((num_test, 1))
Y_test = np.cos(X_test) + 2.0

prior_mu = cosine
mu, sigma, Sigma = gp.predict_with_optimized_hyps(X_train, Y_train, X_test,
                                                  str_cov=str_cov, prior_mu=prior_mu)
utils_plotting.plot_gp_via_distribution(
    X_train, Y_train, X_test, mu, sigma,
    Y_test=Y_test, use_tex=use_tex,
    str_x_axis='$x$', str_y_axis='$y$'
)

Ys = gp.sample_functions(mu.flatten(), Sigma, num_samples=5)
utils_plotting.plot_gp_via_sample(X_test, Ys, use_tex=use_tex,
                                  str_x_axis='$x$', str_y_axis='$y$')

prior_mu = linear_down
mu, sigma, Sigma = gp.predict_with_optimized_hyps(X_train, Y_train, X_test,
                                                  str_cov=str_cov, prior_mu=prior_mu)
utils_plotting.plot_gp_via_distribution(
    X_train, Y_train, X_test, mu, sigma,
    Y_test=Y_test, use_tex=use_tex,
    str_x_axis='$x$', str_y_axis='$y$'
)

Ys = gp.sample_functions(mu.flatten(), Sigma, num_samples=5)
utils_plotting.plot_gp_via_sample(X_test, Ys, use_tex=use_tex,
                                  str_x_axis='$x$', str_y_axis='$y$')

prior_mu = linear_up
mu, sigma, Sigma = gp.predict_with_optimized_hyps(X_train, Y_train, X_test,
                                                  str_cov=str_cov, prior_mu=prior_mu)
utils_plotting.plot_gp_via_distribution(
    X_train, Y_train, X_test, mu, sigma,
    Y_test=Y_test, use_tex=use_tex,
    str_x_axis='$x$', str_y_axis='$y$'
)

Ys = gp.sample_functions(mu.flatten(), Sigma, num_samples=5)
utils_plotting.plot_gp_via_sample(X_test, Ys, use_tex=use_tex,
                                  str_x_axis='$x$', str_y_axis='$y$')