Monday, September 25, 2017

Simple tutorial: writing a deep neural network with TensorFlow

Overview


In this article, I’ll show a simple deep neural network (DNN) model for regression in TensorFlow.

TensorFlow is an open source library from Google. From the official website:
TensorFlow™ is an open source software library for numerical computation using data flow graphs.

From Wikipedia:
TensorFlow is an open-source software library for machine learning across a range of tasks. It is a system for building and training neural networks to detect and decipher patterns and correlations, analogous to (but not the same as) human learning and reasoning. It is used for both research and production at Google, often replacing its closed-source predecessor, DistBelief.

It lets us build neural networks relatively easily. Unlike Keras, though, it requires a proper understanding of what you want to build.
This article is a simple tutorial on building a deep neural network model for regression.

This article covers the following:
  • What is a deep neural network?
  • How do we write a deep neural network model in TensorFlow?

What is a deep neural network?


We sometimes hear the terms neural network and deep neural network. What is the difference between them?

Roughly speaking, from the viewpoint of TensorFlow modeling, the difference is the number of hidden layers. Both neural networks and deep neural networks have three types of layers, distinguished by their position: input, hidden, and output layers.

The input layer receives the data. For example, when you try to predict a flower’s name from its petal length and width, the input layer takes the petal length and width as the data used to predict the name.

The output layer produces the predicted values. In the example above, the output layer outputs the name of the flower. More precisely, it outputs a value for each class name that appears in the training data.

Hidden layers sit between the input and output layers. Their role is to extract features for prediction. While the shape of the input and output layers is fixed by the data and the prediction target, the shape of the hidden layers is up to you: you can change the number of layers and the number of nodes in each layer.

Usually, a model with just one hidden layer is called a neural network, and a model with more hidden layers is called a deep neural network, although personally I don’t distinguish between them.

[Image: two network diagrams side by side]

In the image, the left diagram has just one hidden layer and the right one has three hidden layers.
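To make the "number of hidden layers" idea concrete, here is a minimal NumPy sketch, not TensorFlow code. The helper names `relu`, `init_layers`, and `forward` are hypothetical, and the layer sizes are arbitrary. The sketch keeps the output layer linear, which is a choice made here and differs from the listing later in this article, where ReLU is also applied to the output.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def init_layers(layer_sizes, rng):
    """Create one (W, b) pair per pair of consecutive layer sizes (hypothetical helper)."""
    return [
        (rng.normal(size=(n_in, n_out)), rng.normal(size=n_out))
        for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
    ]

def forward(x, layers):
    """Apply ReLU after every hidden layer; the output layer stays linear in this sketch."""
    h = x
    for W, b in layers[:-1]:
        h = relu(h @ W + b)
    W_out, b_out = layers[-1]
    return h @ W_out + b_out

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 2))  # 5 samples, 2 features (e.g. petal length and width)

shallow = init_layers([2, 4, 1], rng)     # input -> 1 hidden layer -> output
deep = init_layers([2, 4, 4, 4, 1], rng)  # input -> 3 hidden layers -> output

# number of hidden layers = number of weight matrices minus the output one
print(len(shallow) - 1, forward(x, shallow).shape)
print(len(deep) - 1, forward(x, deep).shape)
```

Only the list of layer sizes changes between the two models. The forward pass itself is the same loop, which is why the two kinds of network can be written in the same manner.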

How do we write a deep neural network model in TensorFlow?


With TensorFlow, you can write a deep neural network in the same manner as a neural network. If you wrote it from scratch, you would need to take care of backpropagation yourself, but TensorFlow lets us write the model without worrying about it.

For how to write a simple neural network, please check the article below.
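For a sense of what "taking care of backpropagation" involves, here is a rough NumPy sketch for a network with one hidden layer and a mean squared error loss. It is an illustration under my own assumptions (random data, a linear output layer, hypothetical variable names), not what TensorFlow does internally. The last lines compare one hand-derived gradient entry with a finite-difference estimate.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 3))  # 8 samples, 3 features (made-up data)
y = rng.normal(size=(8, 1))  # made-up regression targets

W1 = rng.normal(size=(3, 6))
b1 = np.zeros(6)
W2 = rng.normal(size=(6, 1))
b2 = np.zeros(1)

def loss_fn(W1, b1, W2, b2):
    """Mean squared error of a one-hidden-layer ReLU network."""
    h = np.maximum(x @ W1 + b1, 0.0)
    out = h @ W2 + b2
    return np.mean((out - y) ** 2)

# forward pass, keeping the intermediate values needed by the backward pass
z1 = x @ W1 + b1
h = np.maximum(z1, 0.0)
out = h @ W2 + b2

# backward pass: apply the chain rule layer by layer, from the loss to the inputs
n = x.shape[0]
d_out = 2.0 * (out - y) / n   # gradient of the mean squared error w.r.t. out
dW2 = h.T @ d_out
db2 = d_out.sum(axis=0)
d_h = d_out @ W2.T
d_z1 = d_h * (z1 > 0)         # ReLU passes gradient only where its input was positive
dW1 = x.T @ d_z1
db1 = d_z1.sum(axis=0)

# finite-difference check of a single entry of dW1
eps = 1e-6
W1_plus = W1.copy()
W1_plus[0, 0] += eps
W1_minus = W1.copy()
W1_minus[0, 0] -= eps
numeric = (loss_fn(W1_plus, b1, W2, b2) - loss_fn(W1_minus, b1, W2, b2)) / (2 * eps)
print(numeric, dW1[0, 0])
```

Every extra hidden layer adds another pair of lines like `d_h` and `d_z1` to the backward pass. This is the bookkeeping that `optimizer.minimize(loss)` handles for us in TensorFlow.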

Simple regression model by TensorFlow

That article builds the simplest neural network for regression in TensorFlow. A neural network is composed of input, hidden, and output layers, and the number of hidden layers is optional, so the simplest architecture has just one hidden layer.


Actually, I use almost the same code as in the article above. The only difference is that the deep neural network has one more hidden layer. The full code is at the bottom of this article.

Here I’ll just go over what differs from the article above.

# parameters
# hidden layer 1: 3 input features -> 6 nodes
W1 = tf.Variable(tf.random_normal(shape=[3, 6]))
b1 = tf.Variable(tf.random_normal(shape=[6]))
hidden_1 = tf.nn.relu(tf.add(tf.matmul(x_data, W1), b1))

# hidden layer 2: 6 nodes -> 4 nodes
W2 = tf.Variable(tf.random_normal(shape=[6, 4]))
b2 = tf.Variable(tf.random_normal(shape=[4]))
hidden_2 = tf.nn.relu(tf.add(tf.matmul(hidden_1, W2), b2))

# output layer: 4 nodes -> 1 predicted value
W3 = tf.Variable(tf.random_normal(shape=[4, 1]))
b3 = tf.Variable(tf.random_normal(shape=[1]))
output = tf.nn.relu(tf.add(tf.matmul(hidden_2, W3), b3))

With TensorFlow, we can write the model as matrix calculations. When we do that, we need to pay attention to the number of rows and columns.

For example, if we write a matrix with 3 rows and 4 columns as a [3, 4] matrix, the product [3, 4] * [4, 6] has the shape [3, 6].
So, in the code above, I wrote the model as [3, 6] * [6, 4] * [4, 1]. As a diagram, the model looks like the following.

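[Image: diagram of the network with blue input nodes, green hidden nodes, and a red output node]

It has one input layer, one output layer, and two hidden layers.
Concretely, the blue circles are the input nodes, the green circles are the hidden layers’ nodes, and the red circle is the output node.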
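The shape rule for matrix products can be checked quickly with NumPy. This is only a sketch with arrays of ones. It leaves out the biases and ReLU, which do not change the shapes, and the batch of 50 rows is an assumption taken from `batch_size = 50` in the full code.

```python
import numpy as np

a = np.ones((3, 4))
b = np.ones((4, 6))
print((a @ b).shape)  # (3, 6)

batch = np.ones((50, 3))  # 50 samples, 3 features each
W1 = np.ones((3, 6))
W2 = np.ones((6, 4))
W3 = np.ones((4, 1))
print((batch @ W1 @ W2 @ W3).shape)  # (50, 1)

# inner dimensions must match: [3, 6] * [4, 1] is not a valid product
try:
    W1 @ W3
except ValueError as err:
    print("shape mismatch:", err)
```

Each weight matrix has as many rows as the previous layer has nodes and as many columns as the next layer has nodes, so a whole batch ends up as one predicted value per sample.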

Executing the code shown at the bottom produces a plot of how training progressed.

[Image: plot of train and test loss over the training iterations]
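The values stored for the plot are the square root of the mean squared error, that is, the root mean squared error (RMSE). As a small NumPy sketch with made-up numbers, the two quantities relate like this:

```python
import numpy as np

# made-up targets and predictions, only to illustrate the arithmetic
y_true = np.array([0.2, 1.3, 2.0, 1.8])
y_pred = np.array([0.0, 1.5, 2.0, 2.2])

mse = np.mean((y_true - y_pred) ** 2)  # the quantity the optimizer minimizes
rmse = np.sqrt(mse)                    # same unit as the target itself

print(mse, rmse)
```

RMSE is in the same unit as the target, so it is easier to read off a plot than the squared error.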

Reference

The book TensorFlow Machine Learning Cookbook has basic information and many tips for using TensorFlow well.
The following articles cover neural networks and TensorFlow.
  • Simple guide to Neural Network

    A neural network is an algorithm that passes its input through at least one hidden layer and an output layer to produce its output. To understand how a neural network works, the article starts from a very simple model.
  • Simple guide for Tensorflow

    This article gives a rough understanding of Tensorflow and builds an easy model. These days, if you are a machine-oriented person, you can't pass even a day without hearing the name of Tensorflow. It is a very useful tool but not so easily approachable. Let's check what Tensorflow is and how you can use it.
  • Simple regression model by TensorFlow

    A neural network is composed of input, hidden, and output layers, and the number of hidden layers is optional, so the simplest network architecture has just one hidden layer. That article builds the simplest neural network for regression in TensorFlow.

Code

from sklearn import datasets
iris = datasets.load_iris()
print(iris.data[:5])

import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

# set random seeds for reproducibility
seed = 42
tf.set_random_seed(seed)
np.random.seed(seed)

# data
iris = datasets.load_iris()
x = np.array([x[0:3] for x in iris.data])
y = np.array([x[3] for x in iris.data])

x_train, x_test, y_train, y_test = train_test_split(x, y, train_size=0.7)

# normalization: fit the scaler on the training data only, then reuse it for the test data
mms = MinMaxScaler()
x_train = mms.fit_transform(x_train)
x_test = mms.transform(x_test)

batch_size = 50

# placeholder
x_data = tf.placeholder(shape=[None, 3], dtype=tf.float32)
y_target = tf.placeholder(shape=[None, 1], dtype=tf.float32)

# parameters
W1 = tf.Variable(tf.random_normal(shape=[3, 6]))
b1 = tf.Variable(tf.random_normal(shape=[6]))
hidden_1 = tf.nn.relu(tf.add(tf.matmul(x_data, W1), b1))

W2 = tf.Variable(tf.random_normal(shape=[6, 4]))
b2 = tf.Variable(tf.random_normal(shape=[4]))
hidden_2 = tf.nn.relu(tf.add(tf.matmul(hidden_1, W2), b2))

W3 = tf.Variable(tf.random_normal(shape=[4, 1]))
b3 = tf.Variable(tf.random_normal(shape=[1]))
output = tf.nn.relu(tf.add(tf.matmul(hidden_2, W3), b3))



# loss
loss = tf.reduce_mean(tf.square(y_target - output))

# optimize
optimizer = tf.train.GradientDescentOptimizer(0.005)
train_step = optimizer.minimize(loss)

with tf.Session() as sess:
    # initialize variables
    init = tf.global_variables_initializer()
    sess.run(init)

    train_loss = []
    test_loss = []
    for i in range(300):
        # index for training
        random_index = np.random.choice(len(x_train), size=batch_size)

        # prepare data
        random_x = x_train[random_index]
        random_y = np.transpose([y_train[random_index]])

        sess.run(train_step, feed_dict={x_data: random_x, y_target: random_y})

        # record train and test loss
        temp_train_loss = sess.run(loss, feed_dict={x_data: random_x, y_target: random_y})
        temp_test_loss = sess.run(loss, feed_dict={x_data: x_test, y_target: np.transpose([y_test])})

        # store RMSE; np.sqrt avoids adding a new op to the graph on every iteration
        train_loss.append(np.sqrt(temp_train_loss))
        test_loss.append(np.sqrt(temp_test_loss))

        if i % 50 == 0:
            print(str(i) + ':' + str([temp_train_loss, temp_test_loss]))

        if i == 300 - 1:
            pred = sess.run(output, feed_dict={x_data: x_test})
            pred_list = [x[0] for x in pred]
            print(np.transpose(np.array([y_test, pred_list])))

plt.plot(train_loss, 'k-', label='train loss')
plt.plot(test_loss, 'r--', label='test loss')
plt.legend(loc='upper right')
plt.show()
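One detail in the normalization step deserves care. The scaling parameters are usually learned from the training data only and then reused for the test data, so that both sets are scaled in the same way. Below is a NumPy-only sketch of that idea. The helpers `fit_min_max` and `apply_min_max` are hypothetical names used for illustration, and the data is made up. With scikit-learn, the equivalent is calling `fit_transform` on the training data and `transform` on the test data.

```python
import numpy as np

def fit_min_max(x):
    """Learn the per-feature minimum and maximum (hypothetical helper)."""
    return x.min(axis=0), x.max(axis=0)

def apply_min_max(x, x_min, x_max):
    """Scale features with previously learned minimum and maximum (hypothetical helper)."""
    return (x - x_min) / (x_max - x_min)

# made-up data: 2 features per sample
x_train = np.array([[1.0, 10.0], [3.0, 30.0], [5.0, 50.0]])
x_test = np.array([[2.0, 20.0], [4.0, 60.0]])

# the parameters come from the training data only
x_min, x_max = fit_min_max(x_train)
x_train_scaled = apply_min_max(x_train, x_min, x_max)
x_test_scaled = apply_min_max(x_test, x_min, x_max)

print(x_test_scaled)
# [[0.25 0.25]
#  [0.75 1.25]]
```

A scaled test value can fall outside [0, 1], like the 1.25 here, when the test data contains values the training data never saw. That is expected and keeps the two sets comparable.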