Reading JPEG Into A Byte Array

Let’s say you are working with images for your project. Often, when you have to work across multiple platforms, the encoding doesn’t remain the same. In these scenarios, you cannot process images directly by treating them like 2D matrices. One of the most common image formats you will come across is JPEG. If you are working on a single platform with a library like OpenCV, you can read JPEG files directly into 2D data structures. If not, you will have to read the file into a byte array, process it, and then encode it back. How do we do that?

A novice would do the following and run into an error:

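// Define file stream object, and open the file
std::ifstream file("./sample.jpg", std::ios::binary);

// Keep whitespace bytes: operator>> skips them by default, even in binary mode
file.unsetf(std::ios::skipws);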

// Prepare iterator pairs to iterate the file content
// (plain char, which is signed on many platforms)
std::istream_iterator<char> begin(file), end;

// Reading the file content using the iterator
std::vector<char> buffer(begin,end);

std::copy(buffer.begin(), buffer.end(), 
          std::ostream_iterator<unsigned int>(std::cout, ","));

Why is there an error? The code seems fine, right? Well, here is the deal. On many machines, the plain “char” type is signed. The bytes in a JPEG file range from 0 to 255, but a signed char can only hold -128 to 127, so any byte greater than 127 ends up as a negative value between -128 and -1 when it is stored in a plain char buffer. When you then convert that negative number to unsigned int, you get a big garbage value. For example, every JPEG file starts with the byte 0xFF (255). Stored in a signed char it becomes -1, and converting -1 to a 32-bit unsigned int gives 4294967295. The big values in the output are these negative values after the conversion.
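To see the effect in isolation, here is a minimal sketch. The helper name widen_naively is made up for this illustration, and it takes a signed char explicitly so that the result does not depend on whether plain char is signed on your platform.

```cpp
// Hypothetical helper (not part of any library): converts a signed byte
// straight to unsigned int, which is the conversion that happens when a
// negative char is printed through an unsigned int.
constexpr unsigned int widen_naively(signed char c) {
    // A negative value wraps around modulo 2^N, where N is the width of
    // unsigned int, so -1 becomes the largest unsigned int.
    return static_cast<unsigned int>(c);
}
```

Values from 0 to 127 pass through unchanged, which is why only some of the numbers in the output look wrong.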

You need to use unsigned char as given below:

unsigned char *s;

Or do this:

// Cast to unsigned char first, then cast to unsigned int
std::cout << static_cast<unsigned int>(static_cast<unsigned char>(s[i])) << ",";

Here, we are casting to unsigned char first by using “static_cast<unsigned char>(s[i])”. After that, we are casting the resulting value to unsigned int. Basically, cast char to unsigned char first, and then to unsigned int. Do you see what happened here? The first cast maps a negative value such as -1 back to its byte value, 255, and widening 255 to unsigned int leaves it unchanged. Take a moment to think about it and see how it solves our problem.
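The two-step cast can be wrapped in a small helper, sketched below. The names byte_value and format_bytes are hypothetical and only serve this example. The buffer is assumed to be a plain char array of known length.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns the 0..255 value of a byte held in a plain char.
constexpr unsigned int byte_value(char c) {
    // Step 1: char -> unsigned char recovers the byte value (e.g. -1 -> 255).
    // Step 2: unsigned char -> unsigned int widens without changing the value.
    return static_cast<unsigned int>(static_cast<unsigned char>(c));
}

// Hypothetical helper: formats n bytes from s as comma-separated decimal values.
std::string format_bytes(const char* s, std::size_t n) {
    std::string out;
    for (std::size_t i = 0; i < n; ++i) {
        out += std::to_string(byte_value(s[i]));
        out += ',';
    }
    return out;
}
```

The result is the same whether plain char is signed or unsigned on your platform, because the value always passes through unsigned char before it is widened.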

Why did we use static casting here? A static_cast makes the conversion explicit: it tells the compiler, and anyone reading the code, that the conversion is intentional. For example, if you convert an int to a char implicitly, the compiler may warn you that not all the values are going to fit inside the smaller datatype. So if you are absolutely sure that none of the values will exceed the range, you can use static_cast to show that you are aware of the situation and it’s okay with you. Keep in mind that static_cast does not check the value at runtime. It silences the warning, but a value that is out of range still will not fit.
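A short sketch of that last point, assuming 8-bit bytes. The helper name low_byte is hypothetical. It narrows an int to unsigned char with static_cast, which compiles cleanly but does nothing to protect an out-of-range value.

```cpp
// Hypothetical helper: narrows an int to unsigned char with an explicit cast.
constexpr unsigned char low_byte(int value) {
    // Conversion to an unsigned type is done modulo 2^N (N = 8 here, by
    // assumption), so values outside 0..255 wrap around instead of failing.
    return static_cast<unsigned char>(value);
}
```

With this helper, 200 stays 200, but 300 becomes 44 (300 modulo 256), so the cast is only safe when you already know the range.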

One more thing to note about C++ is that you should avoid using “new” when it’s not necessary. You can use std::vector as shown below:

// Define file stream object, and open the file
std::ifstream file("image.jpg", std::ios::binary);

// Stop operator>> from skipping whitespace bytes (it does so by default)
file.unsetf(std::ios::skipws);

// Prepare iterator pairs to iterate the file content
std::istream_iterator<unsigned char> begin(file), end;

// Reading the file content using the iterator
std::vector<unsigned char> buffer(begin,end);

The last line reads all the data from the file into buffer. As you can see, this solution doesn’t use “new”, nor does it need any kind of casting! Because the elements are unsigned char, each one converts to an unsigned int between 0 and 255. Now you can print it as:

std::copy(buffer.begin(), buffer.end(), std::ostream_iterator<unsigned int>(std::cout, ","));

For the sake of completeness, you need to include the following headers to make this work:

#include <fstream> // for std::ifstream
#include <iostream> // for std::cout
#include <vector> // for vector
#include <iterator> // for std::istream_iterator and std::ostream_iterator
#include <algorithm> // for std::copy
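Putting the pieces together, the reading step could be packaged as a function along these lines. This is a sketch, not the only way to do it: the name read_bytes is hypothetical, and returning an empty vector when the file cannot be opened is a simplification chosen for this example (it cannot be told apart from an empty file).

```cpp
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical helper: reads the whole file at `path` into a byte vector.
// Returns an empty vector if the file cannot be opened (an assumption made
// to keep the example short).
std::vector<unsigned char> read_bytes(const std::string& path) {
    std::ifstream file(path, std::ios::binary);
    if (!file) {
        return {};
    }

    // operator>> skips whitespace by default, which would silently drop
    // bytes such as 0x20 and 0x0A from the image data.
    file.unsetf(std::ios::skipws);

    std::istream_iterator<unsigned char> begin(file), end;
    return std::vector<unsigned char>(begin, end);
}
```

Calling read_bytes("image.jpg") would then give you the buffer used above, ready to be printed with std::copy.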

