What goes on behind a compress function

Harsh Sinha
5 min read · Jul 23, 2022


Recently I was part of a project where we had to reduce the size of images before uploading them. We had noticed that the upload API call was taking a long time because of the large image payloads, which hurt the user experience. We decided to compress images before uploading. The project is in React Native, so we had several libraries to choose from.

We tried id.zelory:compressor:2.1.0 (Java), id.zelory:compressor:3.0.1 (Kotlin), and the react-native-compressor library. After comparing them, react-native-compressor was our first choice because it compresses images in a way similar to WhatsApp. Moreover, the other two libraries are Android-only, and one requires Kotlin configuration, which increases the app size significantly. The library has a “compress” function that takes the file path of an image, does some “processing”, and returns a promise that resolves to the path of the compressed image in the cache directory. We can then upload that new image and clear the cache directory. This gave us the desired result: on average the image size dropped by a factor of 3–4 without a major hit on quality. But the question is, what is this “processing” that produces the result?

After digging into the library, I found that it calls a native Java module through React Native's NativeModules. In that module, the file at the image path is decoded into a Bitmap using BitmapFactory. Let us first look at what a bitmap is.

A bitmap is the simplest way to represent image data. It is an array of bits that specifies the color of each pixel in a rectangular grid of pixels. The number of bits devoted to an individual pixel determines the number of colors that can be assigned to that pixel. For example, if each pixel is represented by 4 bits, then a given pixel can be assigned one of 16 different colors (2⁴ = 16). Modern cameras typically use 24 bits per pixel, which allows 2²⁴ = 16,777,216 different colors. That is more than the roughly 10 million colors the human eye is usually said to distinguish.

A one-bit-per-pixel image, 16×16 (source: Google)
A 24-bit-per-pixel image, 1000×667 (source: Google)

Let's do some calculations for the two images above. The first image uses 1 bit per pixel, and its width and height are both 16 pixels, so its size is 1 × 16 × 16 = 256 bits, which is 32 bytes. Similarly, the size of the second image is 24 × 1000 × 667 = 16,008,000 bits = 2,001,000 bytes, or about 1.9 MB.
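
The arithmetic above can be checked with a few lines of Java. This is only a sketch of the raw pixel-data size: the class and method names (BitmapSize, colorCount, rawBytes) are made up for illustration, and real bitmap files or in-memory Android bitmaps may add headers, padding, or an alpha channel on top of this.

```java
final class BitmapSize {
    /** Number of distinct colors representable with the given bits per pixel. */
    static long colorCount(int bitsPerPixel) {
        return 1L << bitsPerPixel;
    }

    /** Raw pixel data size in bytes, rounded up to a whole byte. */
    static long rawBytes(int width, int height, int bitsPerPixel) {
        long bits = (long) width * height * bitsPerPixel;
        return (bits + 7) / 8;
    }

    public static void main(String[] args) {
        System.out.println(colorCount(4));            // 16
        System.out.println(colorCount(24));           // 16777216
        System.out.println(rawBytes(16, 16, 1));      // 32
        System.out.println(rawBytes(1000, 667, 24));  // 2001000
    }
}
```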

Now let's come back to our compression function. Once the image has been decoded into a bitmap, we can do many things with it, such as changing its height and width, but the bitmap's compress function is the one that matters to us. It takes three parameters: format, quality (the library defaults to 80%), and an output stream, here a ByteArrayOutputStream. The function fills the stream with the encoded data. The contents of the ByteArrayOutputStream are then written to a file using a FileOutputStream, and that file is the compressed image.

bitmapImage.compress(format, quality, stream);

This function is part of AOSP (the Android Open Source Project). It in turn uses a native C++ implementation through a function called nativeCompress, which is the last function I could follow in the project. The “format” parameter can be JPEG, PNG, WEBP_LOSSLESS, or WEBP_LOSSY, and the quality parameter can range from 0 to 100. The format descriptions below are taken from the Android documentation.

JPEG: Compress to the JPEG format. quality of 0 means compress for the smallest size. 100 means compress for max visual quality.

PNG: Compress to the PNG format. PNG is lossless, so quality is ignored.

WEBP_LOSSLESS: Compress to the WEBP lossless format. quality refers to how much effort to put into compression. A value of 0 means to compress quickly, resulting in relatively large file size. 100 means to spend more time compressing, resulting in a smaller file.

WEBP_LOSSY: Compress to the WEBP lossy format. quality of 0 means compress for the smallest size. 100 means compress for max visual quality.
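
To get a feel for the JPEG quality trade-off without a device, here is a rough sketch that uses desktop Java's javax.imageio instead of Android's Bitmap.compress. It is an analogue, not the Android code path. The class JpegQualityDemo, the 0 to 100 quality mapping, and the sampleImage test pattern are all assumptions made for this illustration.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import javax.imageio.IIOImage;
import javax.imageio.ImageIO;
import javax.imageio.ImageWriteParam;
import javax.imageio.ImageWriter;
import javax.imageio.stream.MemoryCacheImageOutputStream;

final class JpegQualityDemo {
    /** Encodes the image as JPEG; quality runs from 0 (smallest) to 100 (best), mirroring Android. */
    static byte[] encodeJpeg(BufferedImage image, int quality) {
        ImageWriter writer = ImageIO.getImageWritersByFormatName("jpeg").next();
        ImageWriteParam param = writer.getDefaultWriteParam();
        param.setCompressionMode(ImageWriteParam.MODE_EXPLICIT);
        param.setCompressionQuality(quality / 100f); // ImageIO expects 0.0 to 1.0
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (MemoryCacheImageOutputStream out = new MemoryCacheImageOutputStream(bytes)) {
            writer.setOutput(out);
            writer.write(null, new IIOImage(image, null, null), param);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            writer.dispose();
        }
        return bytes.toByteArray();
    }

    /** Hypothetical test image: a color gradient with a fine texture in the blue channel. */
    static BufferedImage sampleImage(int width, int height) {
        BufferedImage image = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                int r = (x * 255) / width;
                int g = (y * 255) / height;
                int b = ((x ^ y) & 7) * 32;
                image.setRGB(x, y, (r << 16) | (g << 8) | b);
            }
        }
        return image;
    }
}
```

Calling encodeJpeg(sampleImage(256, 256), 20) and then again with 90 should give a noticeably smaller byte array for the lower quality, which is the same trade-off the Android documentation describes.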

Let’s see what lossless and lossy compression are.

Lossless compression

Lossless compression stores all the information exactly, without loss of quality or accuracy, as a perfect copy. The image recreated on the screen is exactly the same as the image created by the original designer. Lossless compression works particularly well for images with large, solid blocks of color, which condense very effectively. For example, if a row of pixels contains a run of thirty red pixels, we can store the color value for ‘red’ once, along with the fact that it is repeated thirty times, replacing thirty color-value entries with a single color and a count. This type of simple compression, known as run-length encoding, can make significant savings with certain types of images. PNG is a lossless format, so only lossless compression applies to it; that is why the “quality” parameter is ignored there. Some algorithms used in lossless compression are lossless predictive coding, Huffman coding, arithmetic coding, and LZ77.
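
The thirty-red-pixels example can be sketched as a tiny run-length encoder. This is a minimal illustration of the idea, not what PNG actually does internally. The RunLength class and its {color, count} pair layout are assumptions chosen for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class RunLength {
    /** Encodes a row of pixels as {color, count} pairs, one pair per run of equal pixels. */
    static List<int[]> encode(int[] pixels) {
        List<int[]> runs = new ArrayList<>();
        int start = 0;
        while (start < pixels.length) {
            int end = start;
            while (end < pixels.length && pixels[end] == pixels[start]) {
                end++;
            }
            runs.add(new int[] {pixels[start], end - start});
            start = end;
        }
        return runs;
    }

    /** Rebuilds the original row exactly, which is what makes the scheme lossless. */
    static int[] decode(List<int[]> runs) {
        int total = 0;
        for (int[] run : runs) {
            total += run[1];
        }
        int[] pixels = new int[total];
        int pos = 0;
        for (int[] run : runs) {
            Arrays.fill(pixels, pos, pos + run[1], run[0]);
            pos += run[1];
        }
        return pixels;
    }
}
```

A row of thirty red pixels followed by two blue ones encodes to just two pairs, and decode returns the same thirty-two values. A row where every pixel differs from its neighbor would instead grow, which is why this only pays off on flat areas of color.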

Lossy compression

Lossy compression, on the other hand, uses more advanced ways of compressing the image data, though this comes at the cost of exact reproduction. It discards data whose loss is hard to notice. Lossy compression is irreversible: once you have compressed an image this way, you can’t get the original back. JPEG uses this kind of compression. Well-designed lossy compression often reduces file sizes significantly before the end user notices any degradation. Some algorithms used in lossy compression are transform coding and fractal compression. The best-known transform is the discrete cosine transform (DCT).
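
To see where the loss comes from, here is a small sketch of a one-dimensional DCT followed by quantization. It is a simplification: JPEG works on 8×8 blocks in two dimensions and uses per-frequency quantization tables, whereas this example uses a single row and one uniform step. The DctSketch class and the step parameter are assumptions made for illustration.

```java
final class DctSketch {
    /** Orthonormal 1-D DCT-II: turns pixel values into frequency coefficients. */
    static double[] forward(double[] x) {
        int n = x.length;
        double[] out = new double[n];
        for (int k = 0; k < n; k++) {
            double sum = 0;
            for (int i = 0; i < n; i++) {
                sum += x[i] * Math.cos(Math.PI * (2 * i + 1) * k / (2.0 * n));
            }
            out[k] = scale(k, n) * sum;
        }
        return out;
    }

    /** Inverse of forward() (an orthonormal DCT-III): turns coefficients back into pixels. */
    static double[] inverse(double[] c) {
        int n = c.length;
        double[] out = new double[n];
        for (int i = 0; i < n; i++) {
            double sum = 0;
            for (int k = 0; k < n; k++) {
                sum += scale(k, n) * c[k] * Math.cos(Math.PI * (2 * i + 1) * k / (2.0 * n));
            }
            out[i] = sum;
        }
        return out;
    }

    /** Rounds each coefficient to the nearest multiple of step; this is the irreversible part. */
    static double[] quantize(double[] c, double step) {
        double[] out = new double[c.length];
        for (int k = 0; k < c.length; k++) {
            out[k] = Math.round(c[k] / step) * step;
        }
        return out;
    }

    private static double scale(int k, int n) {
        return Math.sqrt((k == 0 ? 1.0 : 2.0) / n);
    }
}
```

Without the quantize step, inverse(forward(row)) gives back the original row up to floating-point error. With it, small coefficients round to zero and can be stored very cheaply, but the reconstructed pixels are only close to the originals, and a larger step means a smaller file and a larger error.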

On Android, PNG compression combines filtering techniques with LZ77 and Huffman coding. For JPEG, the discrete cosine transform is used together with Huffman or arithmetic coding.
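
The LZ77 plus Huffman combination is the DEFLATE algorithm, which standard Java exposes through java.util.zip.Deflater. The sketch below exercises only that stage, with none of PNG's filtering or file structure, to show how much better flat data compresses than noise. The DeflateDemo class and the sample inputs are assumptions made for illustration.

```java
import java.util.zip.Deflater;

final class DeflateDemo {
    /** Returns the number of bytes DEFLATE (LZ77 + Huffman coding) needs for the input. */
    static int deflatedSize(byte[] input) {
        Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION);
        deflater.setInput(input);
        deflater.finish();
        byte[] buffer = new byte[4096];
        int total = 0;
        while (!deflater.finished()) {
            total += deflater.deflate(buffer); // the output itself is discarded, only counted
        }
        deflater.end();
        return total;
    }
}
```

Feeding it 10,000 identical bytes (a stand-in for a solid block of color) should produce only a few dozen bytes, while 10,000 random bytes stay at roughly their original size.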

A detailed explanation of these algorithms is beyond the scope of this article, but I have listed some references below.

Conclusion

It is remarkable what goes on behind a simple compress statement: all those complex mathematical algorithms, which we were taught in our digital image processing course in college, working to give us the desired result. And with the advent of high-end smartphones and devices, these heavy computations are done in a matter of milliseconds.

References:

https://brilliant.org/wiki/huffman-encoding/

https://www.geeksforgeeks.org/discrete-cosine-transform-algorithm-program/

https://blog.mindorks.com/understanding-image-compression-in-android

https://youtube.com/playlist?list=PL3rE2jS8zxAykFjinlf6EsucLv5EA03_m

https://youtu.be/1I6kfkY4GyQ
