Friday, November 10, 2017

Simple regression by Edward: variational inference

Overview

Edward is a probabilistic programming language (PPL). It lets us use variational inference, Gibbs sampling, and Monte Carlo methods relatively easily. Still, it doesn't look that easy at first, so I'll try it step by step.

In this article, the simple regression from the earlier post Simple Bayesian modeling by Stan serves as a nice example. I reproduce the same model with Edward, using variational inference.




Edward

Honestly, I don't know Edward well yet. I just understand it as one of the PPLs, offering variational inference, Gibbs sampling, and Monte Carlo methods. Personally, I think variational inference is the main attraction.

Anyway, for the details, the official tutorials and the Edward paper are better references.


Data


The data I'll use here are the same as in the article below.

Simple Bayesian modeling by Stan

For Bayesian modeling we can use several languages and tools, such as BUGS, PyMC, and Stan. In that article, I built a simple regression model with Stan from Python, using the data set Speed and Stopping Distances of Cars, which R ships with by default.


You can export the data set to CSV from R.
Here, I'll build a model that predicts 'dist' from the variable 'speed'.
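
If R is not at hand, one alternative is to fetch the same data set with statsmodels, which can download R data sets by name. This is just a sketch of that route (it assumes statsmodels is installed and a network connection is available), not what I actually did:

# alternative: fetch the 'cars' data set without R (assumes statsmodels and network access)
import statsmodels.api as sm

cars_df = sm.datasets.get_rdataset("cars", "datasets").data
cars_df.to_csv("cars.csv", index=False)  # columns: 'speed', 'dist'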

Modeling

First, we import the necessary libraries and read the data.

import pandas as pd
import tensorflow as tf
import edward as ed
from edward.models import Normal
import numpy as np

# read the data set exported from R
cars = pd.read_csv('cars.csv')

# the number of data points
N = cars.shape[0]
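
A quick sanity check that the data loaded as expected, just as a sketch:

# peek at the data: 'cars' should have the columns 'speed' and 'dist'
print(N)
print(cars.head())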

The model is a simple linear regression. The goal is to estimate its parameters, a weight b and a bias a, by passing the data to the model (more precisely, to sample points from their approximate posterior).
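
Written out as read off from the code below, the model is a normal likelihood around a linear predictor, with standard normal priors on the weight and the bias and a fixed noise scale of 1:

dist_i ~ Normal(a + b * speed_i, 1)
b ~ Normal(0, 1)
a ~ Normal(0, 1)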



In code, this becomes:

# placeholder for the input (speed), shaped [N, 1]
X = tf.placeholder(tf.float32, [N, 1])

# priors: standard normal on the weight b and the bias a
b = Normal(loc=tf.zeros(1), scale=tf.ones(1))
a = Normal(loc=tf.zeros(1), scale=tf.ones(1))

# likelihood: y is normal around the linear predictor X*b + a
y = Normal(loc=ed.dot(X, b) + a, scale=tf.ones(1))

I'll use variational inference. For the weight b and the bias a, the code below specifies normal approximating distributions; the softplus transform keeps their scale parameters positive.

q_b = Normal(loc=tf.Variable(tf.zeros(1)),
             scale=tf.nn.softplus(tf.Variable(tf.zeros(1))))
q_a = Normal(loc=tf.Variable(tf.zeros(1)),
             scale=tf.nn.softplus(tf.Variable(tf.zeros(1))))

Execute

After this setup, we run variational inference, which minimizes the Kullback-Leibler divergence between the approximating distributions and the posterior.

inference = ed.KLqp({b: q_b, a: q_a},
                    data={X: np.array(cars['speed']).reshape([N, 1]),
                          y: np.array(cars['dist'])})
inference.run(n_iter=1000)

1000/1000 [100%] ██████████████████████████████ Elapsed: 4s | Loss: 5980.085
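
For a quick point estimate, one way is to evaluate the means of the fitted approximations. A minimal sketch, assuming the objects above are still in scope:

# point estimates: means of the fitted approximating distributions
sess = ed.get_session()
print(sess.run([q_b.mean(), q_a.mean()]))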

Check the parameters


We can draw samples of the parameters from the fitted approximations. Histograms of those samples show what the parameters look like.

import matplotlib.pyplot as plt

# histograms of 1000 samples from each fitted approximation
plt.subplot(2, 1, 1)
plt.title("b")
plt.hist(q_b.sample(1000).eval())
plt.subplot(2, 1, 2)
plt.title("a")
plt.hist(q_a.sample(1000).eval())
plt.show()
[Figure: histograms of the sampled values of b (top) and a (bottom)]
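
To see what the fitted model implies, we can also overlay a few regression lines drawn from the approximations on the raw data. A rough sketch, assuming cars, q_a, and q_b are still in scope:

# overlay regression lines sampled from the fitted approximations on the data
xs = np.array(cars['speed'])
plt.scatter(xs, cars['dist'])
for _ in range(10):
    a_s = q_a.sample().eval()[0]
    b_s = q_b.sample().eval()[0]
    plt.plot(xs, a_s + b_s * xs, alpha=0.4)
plt.xlabel('speed')
plt.ylabel('dist')
plt.show()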