In the previous blog post, we learnt how to train a convolutional neural network (CNN). One of the most popular use cases for a CNN is image classification. Once the CNN is trained, we need to know how to use it to classify an unknown image. The trained model is stored as a “caffemodel” file, so we need to load that file, preprocess the input image, and then extract the output tags for that image. In this post, we will see how to load a trained model file and use it to classify an image. Let’s go ahead and see how to do it, shall we?
Loading the model
Training a full network takes time, so we will use an existing trained model to classify an image for now. There are many models available here for tasks such as flower classification, digit classification, scene recognition, and so on. We will be using the caffemodel file available here. Download and save it before you proceed. Open up a new Python file and add the following:
import numpy as np
import caffe

net = caffe.Net('/path/to/caffe/models/bvlc_reference_caffenet/deploy.prototxt',
                'bvlc_reference_caffenet.caffemodel',
                caffe.TEST)
This will load the model into the “net” object. Make sure you substitute the right paths in the input parameters above.
Preprocessing the image
Let’s define the transformer:
transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})
The function of the transformer is to preprocess the input image and transform it into something that Caffe can understand. Let’s set the mean image:
transformer.set_mean('data', np.load('/path/to/caffe/python/caffe/imagenet/ilsvrc_2012_mean.npy').mean(1).mean(1))
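The “.mean(1).mean(1)” call collapses the stored (3, 256, 256) mean image down to one average value per channel. A quick numpy sketch of that reduction, using toy data rather than the actual ImageNet mean file:

```python
import numpy as np

# Stand-in for the mean blob: shape (channels, height, width)
mean_blob = np.arange(3 * 4 * 4, dtype=np.float32).reshape(3, 4, 4)

# Average over height, then over width -> one mean per channel
channel_means = mean_blob.mean(1).mean(1)

print(channel_means.shape)  # (3,)
print(float(channel_means[0]))  # 7.5 -- mean of values 0..15
```

The result is a length-3 vector of per-channel means, which is what set_mean expects.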
The mean image needs to be subtracted from each input image. A couple of other parameters:
transformer.set_transpose('data', (2,0,1))
transformer.set_channel_swap('data', (2,1,0)) # if using RGB instead of BGR
transformer.set_raw_scale('data', 255.0)
The “set_transpose” function here will transform an image from (256,256,3) to (3,256,256). The “set_channel_swap” function will change the channel ordering from RGB to BGR, since Caffe expects images in the BGR format. If you are using OpenCV to load the image, then this step is not necessary, since OpenCV also uses the BGR format. The “set_raw_scale” function rescales the pixel values from the [0,1] range returned by caffe.io.load_image to the [0,255] range that the model was trained on.
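The three transforms can be imitated with plain numpy on a toy image. This is only a sketch of what the Transformer does internally, not its actual code:

```python
import numpy as np

# Toy 2x2 RGB image in HxWxC layout, values in [0, 1] as load_image returns
rgb = np.zeros((2, 2, 3), dtype=np.float32)
rgb[..., 0], rgb[..., 1], rgb[..., 2] = 0.25, 0.5, 0.75  # R, G, B planes

chw = rgb.transpose((2, 0, 1))   # set_transpose: HxWxC -> CxHxW
bgr = chw[[2, 1, 0], :, :]       # set_channel_swap: RGB -> BGR
scaled = bgr * 255.0             # set_raw_scale: [0,1] -> [0,255]

print(chw.shape)               # (3, 2, 2)
print(float(scaled[0, 0, 0]))  # 191.25 -- blue plane (0.75 * 255) now first
```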
We need to reshape the blobs so that they match the image shape. Let’s add the following line to the python file:
net.blobs['data'].reshape(1,3,227,227)
The first input parameter specifies the batch size. Since we are only classifying one image, it is set to 1. The next three parameters correspond to the shape of the cropped image: 3 channels of size 227×227. During training of the model we loaded, a 227×227 sub-image was taken from each 256×256 image; this cropping acts as data augmentation and makes the model more robust. That’s the reason we are using 227 here!
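The crop arithmetic itself is simple. A toy sketch of a center crop (not part of the Caffe API, just the offset calculation):

```python
# Center-crop offsets for a 256x256 image and a 227x227 crop
image_size, crop_size = 256, 227
offset = (image_size - crop_size) // 2  # 14 pixels on the top/left

print(offset)              # 14
print(offset + crop_size)  # 241 -- crop spans rows/cols [14, 241)
```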
Classifying an image
Let’s load the input image:
img = caffe.io.load_image('myimage.jpg')
net.blobs['data'].data[...] = transformer.preprocess('data', img)
Let’s compute the output:
output = net.forward()
The predicted output class can be printed using:
print output['prob'].argmax()
Download this file before you proceed. It contains the mapping from class IDs to human-readable labels. Let’s print the top five predicted labels:
label_mapping = np.loadtxt("synset_words.txt", str, delimiter='\t')
best_n = net.blobs['prob'].data[0].flatten().argsort()[-1:-6:-1]
print label_mapping[best_n]
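The slice [-1:-6:-1] simply reads off the five highest-probability indices in descending order, since argsort sorts ascending. A toy numpy example of the same trick, with made-up probabilities and a top-3 slice:

```python
import numpy as np

# Made-up class probabilities for five classes
probs = np.array([0.05, 0.50, 0.10, 0.30, 0.05])

# argsort gives ascending order; the reversed slice takes the top 3
top3 = probs.argsort()[-1:-4:-1]

print(top3)  # [1 3 2] -- indices of the three largest values, highest first
```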
The above lines will print the top five labels predicted by the CNN. You will get the following output:
['n03837869 obelisk' 'n03743016 megalith, megalithic structure' 'n02793495 barn' 'n03028079 church, church building' 'n02980441 castle']
Isn’t it awesome? Looks like we are all set! During the course of these four blog posts, you learnt how to define, visualize, train, and run a CNN using Caffe. Keep playing with it and you’ll see that it can be used to perform a wide variety of computer vision tasks!
——————————————————————————————————————
Thanks for the post. I have a question: if I create a custom Python layer in Caffe, will its computation take place on the GPU, or will it only run on the CPU? Is it necessary to implement a layer in C++ and CUDA to make use of the GPU?
A custom Python layer works only on the CPU. Data will be copied to the CPU in order to apply that layer, so placing the Python layer as the first or last layer will reduce the performance impact.
To use both the CPU and GPU, you have to add .cpp and .cu implementations for that layer.
When I run this script, I get the following error. How do I solve it?
Traceback (most recent call last):
File "caffee_lib.py", line 8, in
'bvlc_reference_caffenet.caffemodel', caffe.TEST)
RuntimeError: Could not open file bvlc_reference_caffenet.caffemodel
Hi, can I create my own caffemodel file? Thank you so much.
Nice. I’m going to have a deeper look tommmorw
*tommorow
How can we draw a bounding box over a classified object (make a detector)?
Hi, here is my code to draw a bounding box with a single shot detector!
https://github.com/jssmile/Distance_Estimation/blob/master/caffe/examples/detect.py
Thank you! Does it work for all trained SSD models?
Thank you, Prateek Joshi, for this useful tutorial. Can I ask you for a suggestion? I am wondering whether I can use Caffe for the problem of finding a specific grain/seed on the ground. For example, I have pictures of ground covered with soil, other grains, plants, insects, etc., and I would like to find that specific grain in those pictures. Is this viable with Caffe? Is Caffe also suitable for objects with very particular characteristics? Thank you a lot.