TensorFlow - How to visualize training data and graphs using TensorBoard

Contents:


  • Here is a good, short high-level introduction to how TensorBoard works and what the different dashboard types (events, histograms, images, audio, graphs) are meant for.
  • Here is a more detailed explanation of how to augment TF code so that it saves log data to a log directory, which can later be visualized.

How to write logging data for TensorBoard

The following example script generates some scalar logging data that can be visualized with TensorBoard. It also shows how to group ops so that the computation graph visualization in TensorBoard remains clear.
            # tf_tensorboard_hello_world.py
            #
            # Simple TF python script that shows
            # how to use the logging functionality
            # of tensorboard
            #
            # by Prof. Dr. Juergen Brauer, www.juergenbrauer.org

            import tensorflow as tf


            # 1. import MNIST data
            from tensorflow.examples.tutorials.mnist import input_data
            mnist = input_data.read_data_sets("/tmp/data/", one_hot=True)


            # 2. set parameters
            learning_rate   = 0.01
            training_epochs = 25
            batch_size      = 100
            display_step    = 1
            logs_path = '/tmp/tensorflow_logs/example'


            # 3. define TF graph input
            x = tf.placeholder(tf.float32, [None, 784], name='InputData')
            y = tf.placeholder(tf.float32, [None, 10], name='LabelData')


            # 4. we will use a linear model for prediction y=W*x+b
            #    define variables (to be learned) here
            W = tf.Variable(tf.zeros([784, 10]), name='Weights')
            b = tf.Variable(tf.zeros([10]), name='Bias')


            # 5. construct model and encapsulate all ops into scopes,
            #    making TensorBoard's graph visualization more convenient
            with tf.name_scope('Model'):
                pred = tf.nn.softmax(tf.matmul(x, W) + b) # Softmax
            with tf.name_scope('Loss'):
                cost = tf.reduce_mean(-tf.reduce_sum(y*tf.log(pred), reduction_indices=1))
            with tf.name_scope('SGD'):
                optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)
            with tf.name_scope('Accuracy'):
                acc = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))
                acc = tf.reduce_mean(tf.cast(acc, tf.float32))


            # 6. initialize all variables defined in the model and init the graph
            init = tf.initialize_all_variables()
            sess = tf.InteractiveSession()
            sess.run(init)


            # 7. create summary ops to monitor the loss and accuracy tensors
            tf.scalar_summary("loss", cost)
            tf.scalar_summary("accuracy", acc)


            # 8. merge all summaries into a single op to make collection of log data
            #    more convenient in the following
            merged_summary_op = tf.merge_all_summaries()


            # 9. create summary_writer object to write logs to TensorBoard
            summary_writer = tf.train.SummaryWriter(logs_path, graph=tf.get_default_graph())


            # 10. let's train the linear classifier!
            for epoch in range(training_epochs):

                    avg_cost = 0.0

                    total_batch = int(mnist.train.num_examples/batch_size)

                    # loop over all batches
                    for i in range(total_batch):

                        # get next training batch
                        batch_xs, batch_ys = mnist.train.next_batch(batch_size)

                        # run optimization op (backprop), cost op (to get loss value)
                        # and summary nodes
                        _, c, summary = sess.run([optimizer, cost, merged_summary_op],
                                                 feed_dict={x: batch_xs, y: batch_ys})

                        # write logs at every iteration
                        # (buffered writing to file is done automatically for speed reasons)
                        summary_writer.add_summary(summary, epoch * total_batch + i)

                        # compute average loss
                        avg_cost += c / total_batch

                    # display logs per epoch step
                    if epoch % display_step == 0:
                        print("Epoch:", '%04d' % epoch, "cost=", "{:.3f}".format(avg_cost))

            print("Optimization Finished!")


            # 11. test model using the accuracy op in the graph and the 10,000 test images
            print("Accuracy:", acc.eval({x: mnist.test.images, y: mnist.test.labels}))

            # after executing the script
            # call: tensorboard --logdir=/tmp/tensorflow_logs
            # then: open http://0.0.0.0:6006/ in a Chrome or Firefox web browser        
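Note: the script uses the summary API names of the TensorFlow version it was written for. In TensorFlow 1.0 and later the same ops were renamed; here is a sketch of the equivalents (the rest of the script stays the same):

            # TensorFlow 1.x names for the ops used above
            init = tf.global_variables_initializer()           # was: tf.initialize_all_variables()
            tf.summary.scalar("loss", cost)                    # was: tf.scalar_summary(...)
            tf.summary.scalar("accuracy", acc)
            merged_summary_op = tf.summary.merge_all()         # was: tf.merge_all_summaries()
            summary_writer = tf.summary.FileWriter(logs_path,  # was: tf.train.SummaryWriter(...)
                                                   graph=tf.get_default_graph())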
		
And here is the output of the script above:
juebrauer@ubuntu:~/my_lectures/deep_learning/python$ python3 tf_tensorboard_hello_world.py 
Extracting /tmp/data/train-images-idx3-ubyte.gz
Extracting /tmp/data/train-labels-idx1-ubyte.gz
Extracting /tmp/data/t10k-images-idx3-ubyte.gz
Extracting /tmp/data/t10k-labels-idx1-ubyte.gz
Epoch: 0000 cost= 1.182
Epoch: 0001 cost= 0.665
Epoch: 0002 cost= 0.553
Epoch: 0003 cost= 0.499
Epoch: 0004 cost= 0.465
Epoch: 0005 cost= 0.443
Epoch: 0006 cost= 0.425
Epoch: 0007 cost= 0.412
Epoch: 0008 cost= 0.401
Epoch: 0009 cost= 0.392
Epoch: 0010 cost= 0.385
Epoch: 0011 cost= 0.378
Epoch: 0012 cost= 0.372
Epoch: 0013 cost= 0.367
Epoch: 0014 cost= 0.363
Epoch: 0015 cost= 0.359
Epoch: 0016 cost= 0.355
Epoch: 0017 cost= 0.351
Epoch: 0018 cost= 0.348
Epoch: 0019 cost= 0.345
Epoch: 0020 cost= 0.343
Epoch: 0021 cost= 0.340
Epoch: 0022 cost= 0.338
Epoch: 0023 cost= 0.336
Epoch: 0024 cost= 0.334
Optimization Finished!
Accuracy: 0.9138	
    

After executing the script, we can see that TensorFlow has written an event file to the specified log directory:

juebrauer@ubuntu:~/my_lectures/deep_learning/python$ ls -la /tmp/tensorflow_logs/example/
total 832
drwxr-xr-x 2 juebrauer juebrauer   4096 Sep 16 01:09 .
drwxr-xr-x 3 juebrauer juebrauer   4096 Sep 16 01:09 ..
-rw-rw-r-- 1 juebrauer juebrauer 840096 Sep 16 01:09 events.out.tfevents.1474013364.ubuntu
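
If you want to check programmatically what was written into the event file, you can iterate over it with tf.train.summary_iterator. A minimal sketch (the filename is the one from the listing above; yours will differ, and the function may live elsewhere in very old TF versions):

            # print all scalar summary values stored in the event file
            import tensorflow as tf

            fname = "/tmp/tensorflow_logs/example/events.out.tfevents.1474013364.ubuntu"
            for event in tf.train.summary_iterator(fname):
                for value in event.summary.value:
                    if value.HasField('simple_value'):
                        print(event.step, value.tag, value.simple_value)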
    

Examples of how the written log data is visualized by TensorBoard

To visualize the previously logged data, call
        juebrauer@ubuntu:~/my_lectures/deep_learning/python$ tensorboard --logdir=/tmp/tensorflow_logs
    
and then open http://0.0.0.0:6006 in your web browser.
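In case port 6006 is already taken by another TensorBoard instance, you can choose a different port:

        tensorboard --logdir=/tmp/tensorflow_logs --port=6007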

Since we added two scalar summaries, the "loss" and "accuracy" graphs are now available in the events dashboard.
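
The histograms dashboard stays empty in this example, since we only defined scalar summaries. If you also want to monitor, e.g., the distribution of the learned weights, you can add histogram summaries before merging (a sketch in the summary API used above):

            # add before tf.merge_all_summaries():
            tf.histogram_summary("weights", W)
            tf.histogram_summary("biases", b)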

Since we grouped the ops with tf.name_scope() into "namespaces" (visualized by rounded rectangles), we now get a clear high-level visualization of the computation graph.

However, we can still zoom into the grouped ops, e.g. the "Model" op, and visualize what happens inside it (it consists of ops as well).
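
Name scopes can also be nested; each inner scope then appears as its own expandable node inside its parent in the graph visualization. A small sketch with a hypothetical two-layer model (not part of the script above):

            import tensorflow as tf

            x = tf.placeholder(tf.float32, [None, 784], name='InputData')

            # nested scopes: "Layer1" and "Layer2" show up inside "Model"
            with tf.name_scope('Model'):
                with tf.name_scope('Layer1'):
                    W1 = tf.Variable(tf.zeros([784, 100]), name='Weights1')
                    b1 = tf.Variable(tf.zeros([100]), name='Bias1')
                    h  = tf.nn.relu(tf.matmul(x, W1) + b1)
                with tf.name_scope('Layer2'):
                    W2 = tf.Variable(tf.zeros([100, 10]), name='Weights2')
                    b2 = tf.Variable(tf.zeros([10]), name='Bias2')
                    pred = tf.nn.softmax(tf.matmul(h, W2) + b2)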