Deep Learning With Caffe In Python – Part II: Interacting With A Model

I know that the title looks slightly misleading. If you are thinking that we will be talking about how to interact with fashion models at a coffee shop, you are in for a big surprise! In the previous blog post, we talked about how to define and visualize a single layer convolutional neural network (CNN). This post continues from there and discusses how to interact with a Caffe model, so if you haven’t read the previous post, you may want to take a quick glance at it before you proceed. In that post, we defined our CNN architecture in a prototxt file. Now how do we make it do stuff for us? When we load such a network using Caffe, it comes with a bunch of features. Let’s see how to work with our model, shall we?

Interacting with the neural network

Let’s go ahead and load our single layer network. Open the python file from the previous blog post and add the following lines:

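import sys
sys.path.insert(0, '/path/to/caffe/python')  # adjust to your Caffe checkout
import caffe
import cv2
import numpy as np

# Load the network definition in test (inference) mode
net = caffe.Net('myconvnet.prototxt', caffe.TEST)

# Single-argument print calls behave the same on Python 2 and Python 3
print("\nnet.inputs = {}".format(net.inputs))
print("\ndir(net.blobs) = {}".format(dir(net.blobs)))
print("\ndir(net.params) = {}".format(dir(net.params)))
print("\nconv shape = {}".format(net.blobs['conv'].data.shape))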

We just created the “net” object to hold our convolutional neural network. The names of the network’s input blobs are available in “net.inputs”, which the script above prints.

This “net” object contains two dictionaries: net.blobs and net.params. Basically, net.blobs holds the data flowing through the layers and net.params holds the weights and biases of the network. You can check them out using dir(net.blobs) and dir(net.params). In this case, net.blobs[‘data’] contains an array of shape (1, 1, 256, 256). Now why does it have 4 dimensions if we are dealing with a simple 2D grayscale image? The first ‘1’ refers to the number of images and the second ‘1’ refers to the number of channels in the image; the last two are the height and width. Caffe uses this format for all data. If you check net.blobs[‘conv’], you’ll see that it contains the output of the ‘conv’ layer, of shape (1, 10, 254, 254): one image, 10 output channels (one per neuron), each 254×254. If you run a 3×3 kernel over a 256×256 image with no padding and a stride of 1, the kernel fits in 256 − 3 + 1 = 254 positions along each axis, so the output is of size 254×254, which is what we get here.
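The shape arithmetic above can be checked with a few lines of plain Python. This is only a sketch of the usual convolution output-size formula; the helper names conv_output_size and conv_output_shape are made up for illustration and are not part of Caffe, and the defaults assume no padding and a stride of 1.

```python
def conv_output_size(input_size, kernel_size, pad=0, stride=1):
    """Output length along one spatial axis of a convolution."""
    return (input_size + 2 * pad - kernel_size) // stride + 1


def conv_output_shape(input_shape, num_output, kernel_size, pad=0, stride=1):
    """Map an (N, C, H, W) input shape to the (N, num_output, H', W') output shape."""
    n, _, h, w = input_shape
    return (n,
            num_output,
            conv_output_size(h, kernel_size, pad, stride),
            conv_output_size(w, kernel_size, pad, stride))


# The 'data' blob from this post, fed through 10 filters of size 3x3
print(conv_output_shape((1, 1, 256, 256), num_output=10, kernel_size=3))
# (1, 10, 254, 254)
```

Changing pad to 1 in the call keeps the output at 256×256, which is a handy way to see why padding is often used with 3×3 kernels.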

Let’s inspect the parameters:

  • net.params[‘conv’][0] contains the weight parameters of our neurons. It’s an array of shape (10, 1, 3, 3), that is, 10 kernels of size 3×3 over 1 input channel, initialized according to the “weight_filler” settings. In the prototxt file, we specified “gaussian”, so the initial kernel values are drawn from a Gaussian distribution.
  • net.params[‘conv’][1] contains the bias parameters of our neurons. It’s an array of shape (10,), one bias per neuron, initialized according to the “bias_filler” settings. In the prototxt file, we specified “constant”, so every bias starts at the same constant value (0 in this case).
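To get a feel for what these two arrays look like, here is a small NumPy sketch that builds arrays with the same shapes and the same kind of initialization. It does not touch Caffe at all. The standard deviation (assumed_std) is a hypothetical value chosen for illustration, since the real one comes from the “weight_filler” settings in your prototxt file.

```python
import numpy as np

rng = np.random.RandomState(0)  # fixed seed so the sketch is repeatable

num_output, channels, kernel = 10, 1, 3
assumed_std = 0.01  # hypothetical; use the std from your own weight_filler

# Counterpart of net.params['conv'][0].data: Gaussian-initialized kernels
weights = rng.normal(0.0, assumed_std,
                     size=(num_output, channels, kernel, kernel)).astype(np.float32)

# Counterpart of net.params['conv'][1].data: constant (zero) biases
biases = np.zeros(num_output, dtype=np.float32)

print(weights.shape)                # (10, 1, 3, 3)
print(biases.shape)                 # (10,)
print(weights.size + biases.size)   # 100 learnable parameters in total
```

With a loaded network, the real values live in net.params[‘conv’][0].data and net.params[‘conv’][1].data, and you can inspect them the same way.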

Caffe handles data as “blobs”, which are basically memory abstraction objects. Our data is contained as an array in the field named ‘data’.
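The 4-dimensional layout is easy to reproduce with NumPy alone. The sketch below uses a hypothetical helper, images_to_blob, that stacks equally sized grayscale images into an array of shape (N, 1, H, W); it only illustrates the layout and makes no calls into Caffe.

```python
import numpy as np


def images_to_blob(images):
    """Stack equally sized 2D grayscale images into an (N, 1, H, W) float32 array."""
    batch = np.stack([np.asarray(img, dtype=np.float32) for img in images])  # (N, H, W)
    return batch[:, np.newaxis, :, :]  # insert the channel axis


single = np.zeros((256, 256), dtype=np.uint8)  # stand-in for one grayscale image

print(images_to_blob([single]).shape)      # (1, 1, 256, 256)
print(images_to_blob([single] * 4).shape)  # (4, 1, 256, 256)
```

An array like this has the layout that the ‘data’ blob expects, so a batch of several images would presumably be handled by reshaping the blob to the batch shape and copying the whole array in before running the network.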

Extracting the output of the network

Let’s consider the following image:

[Image: buildings]

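Let’s go ahead and see how to compute the output for this image. The size of the above image is 960×640, so the data blob no longer fits: we need to reshape it from (1, 1, 256, 256) to (1, 1, 960, 640). The last two blob dimensions are height and width, in the same order as the array returned by cv2.imread, so the code below takes the new shape directly from the image array.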

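# Read the image as a single-channel grayscale array (flag 0)
img = cv2.imread('input_image.jpg', 0)
# Add batch and channel axes: (H, W) -> (1, 1, H, W)
img_blobinp = img[np.newaxis, np.newaxis, :, :]
# Resize the input blob to match, then copy the pixels in
net.blobs['data'].reshape(*img_blobinp.shape)
net.blobs['data'].data[...] = img_blobinp

# Run a forward pass; this fills net.blobs['conv']
net.forward()

# Save each of the 10 feature maps as its own image
for i in range(10):
    cv2.imwrite('output_image_' + str(i) + '.jpg', 255*net.blobs['conv'].data[0,i])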

The “net” object is now populated. If you check net.blobs[‘conv’], you will see that it is filled with data. Each of the 10 neurons in the layer produces its own output image. To look at the image from the nth neuron, access net.blobs[‘conv’].data[0, n-1], since indices start at 0. The loop above wrote 10 output files, one per neuron. Let’s check what the output from, say, the ninth neuron looks like:

[Image: output_image_8]

As we can see here, it’s an edge detector. You can check out the other files to see the different types of filters generated.
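If you are curious what the forward pass is doing for this one layer, here is a rough NumPy sketch. It assumes the layer computes a sliding-window product of each kernel with the image (no kernel flip, no padding, stride 1) plus a bias; conv_forward and to_uint8 are names made up for this example, and the hand-written kernel stands in for one of the randomly initialized ones. The min-max rescaling in to_uint8 is a swapped-in alternative to the fixed factor of 255 used above, not what the post’s code does.

```python
import numpy as np


def conv_forward(blob, weights, biases):
    """Naive convolution: (N, C, H, W) input, (K, C, kh, kw) kernels, (K,) biases."""
    blob = np.asarray(blob, dtype=np.float32)
    weights = np.asarray(weights, dtype=np.float32)
    n, c, h, w = blob.shape
    k_out, _, kh, kw = weights.shape
    oh, ow = h - kh + 1, w - kw + 1
    out = np.zeros((n, k_out, oh, ow), dtype=np.float32)
    # Accumulate the contribution of each kernel position over all windows at once
    for dy in range(kh):
        for dx in range(kw):
            patch = blob[:, :, dy:dy + oh, dx:dx + ow]  # (N, C, oh, ow)
            out += np.einsum('nchw,kc->nkhw', patch, weights[:, :, dy, dx])
    return out + np.asarray(biases, dtype=np.float32)[np.newaxis, :, np.newaxis, np.newaxis]


def to_uint8(feature_map):
    """Rescale a feature map to the 0..255 range so it can be saved as an image."""
    lo, hi = feature_map.min(), feature_map.max()
    if hi == lo:
        return np.zeros(feature_map.shape, dtype=np.uint8)
    return ((feature_map - lo) / (hi - lo) * 255).astype(np.uint8)


# An 8x8 test image: dark on the left, bright on the right
image = np.zeros((8, 8), dtype=np.float32)
image[:, 4:] = 1.0

# One hand-written kernel that responds to left-to-right brightness changes
kernel = np.array([[-1, 0, 1]] * 3, dtype=np.float32)[np.newaxis, np.newaxis]

response = conv_forward(image[np.newaxis, np.newaxis], kernel, np.zeros(1))
print(response.shape)     # (1, 1, 6, 6)
print(response[0, 0, 0])  # nonzero only in the two columns next to the edge
```

A kernel drawn from a Gaussian will not be this clean, but any kernel whose values differ across its columns or rows responds to brightness changes in a similar way, which is why some of the random filters end up looking like edge detectors.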

We want the ability to reuse this layer without going through the process again. So let’s save it:

net.save('myconvmodel.caffemodel')
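To reuse the saved weights later, you can pass the weights file to caffe.Net along with the prototxt file. The sketch below wraps that in a small helper; load_saved_net is a name made up for this example, and it assumes both files from this post are in the current directory when you call it.

```python
def load_saved_net(prototxt_path, weights_path):
    """Build the network from its prototxt and fill it with previously saved weights."""
    import caffe  # imported here so this file can be loaded without Caffe installed
    return caffe.Net(prototxt_path, weights_path, caffe.TEST)


# Example usage (requires Caffe and the two files created in this post):
# net = load_saved_net('myconvnet.prototxt', 'myconvmodel.caffemodel')
# print(net.params['conv'][0].data.shape)
```

Since the weights here were randomly initialized, loading them back is what lets you reproduce the same 10 filters instead of getting fresh random ones on every run.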

We just created a single layer network in Caffe. You should play around with the “net” object until you get familiar with it. It is used extensively in deep learning applications built using Caffe.

——————————————————————————————————————

4 thoughts on “Deep Learning With Caffe In Python – Part II: Interacting With A Model”

  1. Very good text. A question: if I have more than one image to classify, how can I load them into net.blobs? In that case, will all the predictions be made when net.forward() executes?

  2. Hi Prateek,

    Could you please explain how I can play with the functionality of the various nodes in the convolutional layer? Where did you define which neuron performs which function?
    I understand that all these 10 neurons have a Gaussian filter, but how can we modify the behavior of each individual neuron? For example, suppose I want the 9th neuron to implement something other than edge detection. How can I accomplish that?

  3. Hi! Thank you for the really helpful tutorials. I was wondering if there is any way of extracting the output during training. When I try, I just get the 1st image and the net is not training. Thanks!
