Rotation-Based Compression

Background

Between the late 19th and mid-20th centuries, the United States Department of Agriculture commissioned artists to paint watercolors of the fruit grown in the United States. The resulting collection contains 7,497 watercolor paintings, 87 line drawings, and 79 wax models created by approximately 21 artists. I thought it would make a great pool of images for picking a random desktop wallpaper every day. The entire collection is 59.06 GB because the images are stored in archival quality.

Right off the bat, I noticed that there are a few images that have to be rotated because their orientation is incorrect.

A watercolor of strawberries.

One image that needed rotating was the strawberries watercolor, so I went ahead and rotated it in Preview.

The strawberries watercolor rotated to the correct orientation.

After rotating the image, I noticed something interesting. The original image's file size was 12.67 MB, but the rotated image's size was only 5.41 MB. That's a huge size saving! And even better, I can't visually tell the difference.

Image quality

I used ImageMagick to compare the JPEG quality of the two files:

identify -format "%Q" POM00000042.jpg
98%
identify -format "%Q" POM00000042-rotated.jpg
93%

Clearly there's a difference that I don't notice visually. This got me wondering what's going on when Preview rotates an image. I looked up how Preview rotates images and came across a macOS tool called sips.
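Since both quality and file size matter for the comparisons that follow, the check above can be wrapped in a small helper. This is only a sketch: `quality_and_size` is a hypothetical name, and it assumes ImageMagick's `identify` is on the PATH as in the commands above.

```shell
# quality_and_size FILE...
# Prints each file's name, its ImageMagick-estimated JPEG quality,
# and its size in bytes, separated by tabs.
# (Hypothetical helper for illustration, not part of any tool.)
quality_and_size() {
  for file in "$@"; do
    quality=$(identify -format "%Q" "$file")
    # wc -c pads its output on some systems, so strip whitespace.
    bytes=$(wc -c < "$file" | tr -d '[:space:]')
    printf '%s\tquality=%s\tbytes=%s\n' "$file" "$quality" "$bytes"
  done
}

# Example:
#   quality_and_size POM00000042.jpg POM00000042-rotated.jpg
```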

sips compression

Next, I used sips directly to re-encode the original image at 93% quality:

sips -s formatOptions 93 POM00000042.jpg

The strawberries watercolor compressed via sips.

The quality of this image was also 93%, but the size was 6.57 MB, which is larger than the rotated image!

ImageMagick compression

Next, I wondered what would happen if I used a more mainstream tool to compress the image to 93%. Surely ImageMagick will be the most efficient, right?

magick POM00000042.jpg -quality 93 POM00000042.jpg

The strawberries watercolor compressed via ImageMagick.

The size of this image was 7.01 MB, which is even bigger than the sips output. Something must be going on when the image is rotated. What would happen if we rotated the image 360 degrees to keep it in the same orientation? Well, it turns out that using sips to rotate 360 degrees doesn't do anything, but if you first rotate the file 90 degrees and then rotate it another 270 degrees to return it to its original orientation, the image compresses.

strawberry compression

I wrote a script called strawberry that handles this rotation.
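A minimal sketch of the idea, assuming the script does nothing more than the two in-place sips rotations described above. The function body is illustrative, not the script's actual contents.

```shell
# strawberry FILE...
# Rotates each file 90 degrees clockwise and then a further 270 degrees,
# so it ends up in its original orientation after two re-encodes.
# sips modifies the file in place when no --out is given, so work on copies
# if you want to keep the originals.
strawberry() {
  for file in "$@"; do
    sips -r 90 "$file" >/dev/null && sips -r 270 "$file" >/dev/null
  done
}

# Example:
#   strawberry POM00000042.jpg
```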

The strawberries watercolor compressed via strawberry.

After rotating the original image in place, the output size was 4.10 MB, which is the smallest file size yet. That's a 68% reduction in size, and I still can't notice the difference!
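For reference, the percentages can be reproduced from the file sizes quoted in this post. The helper name `percent_saved` is just an illustration.

```shell
# percent_saved ORIGINAL NEW
# Prints the percentage by which NEW is smaller than ORIGINAL, to one decimal.
percent_saved() {
  awk -v orig="$1" -v new="$2" 'BEGIN { printf "%.1f\n", (1 - new / orig) * 100 }'
}

percent_saved 12.67 5.41   # single rotation: prints 57.3
percent_saved 12.67 4.10   # 90 then 270 degrees: prints 67.6
```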

GIMP comparison

Next, I compared the compressed image with the original in GIMP to check for differences. Assuming I made the comparison correctly, GIMP showed no visible difference. Well, that's pretty great for me.
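A numeric cross-check is also possible from the command line. The sketch below is not what I did in GIMP: it swaps in ImageMagick's compare with the RMSE metric, and it assumes ImageMagick 7 (the `magick` command used earlier) and two images of identical dimensions. The name `image_rmse` is hypothetical.

```shell
# image_rmse A B
# Prints the root-mean-square error between two same-sized images.
# compare writes the metric to stderr, hence the 2>&1, and exits
# non-zero when the images differ, so check the printed value
# rather than the exit status.
image_rmse() {
  magick compare -metric RMSE "$1" "$2" null: 2>&1
}

# Example (file names are placeholders):
#   image_rmse original.jpg compressed.jpg
```

A value of 0 would mean the decoded pixels are identical; a small non-zero value means the re-encode changed pixels by amounts that may still be invisible.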

Rotating too many times

Next I decided to rotate the image 12,500 times for fun.
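A loop along these lines would do it. This is a sketch that assumes each step is a 90-degree in-place rotation with sips; the post doesn't record the exact rotation used, and `rotate_times` is a hypothetical name.

```shell
# rotate_times FILE COUNT
# Rotates FILE 90 degrees clockwise COUNT times, re-encoding it each time.
# Stops early if sips reports an error.
rotate_times() {
  file=$1
  count=$2
  i=0
  while [ "$i" -lt "$count" ]; do
    sips -r 90 "$file" >/dev/null || return 1
    i=$((i + 1))
  done
}

# Example (12,500 is a multiple of 4, so the orientation is unchanged):
#   rotate_times POM00000042.jpg 12500
```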

The strawberries watercolor after 12,500 rotations.

I think this might hint at what's going on. When I compare this image with the original in GIMP, it now shows a lot of differences.

Thoughts

So what's going on here? I'm not entirely sure, and this is where my ignorance of how JPEG works shows. My understanding is that JPEG rotation isn't lossless by default, and that is true for sips as well. There's a great article on JPEG, which allowed me to plug in strawberries as an example image. I believe there's either some minor compression happening related to the discrete cosine transform or a lossless saving from the Huffman encoding, but I'm not 100% confident in either explanation. Regardless, it seems doing a single rotation compresses the image to a smaller size than direct compression, and I can't visually perceive the difference. That's an absolute win for me. The final thing I noticed is that this method seems particularly effective on images like these, which are stored at a significantly higher quality than most other JPEGs, so that's worth considering as well.

The original size of the image collection was 59.06 GB. After compressing them, it dropped down to about 17 GB. That's a roughly 70% size reduction! I wish I could find an answer for what's going on, but I'm pretty happy with the results.