Thursday, September 5, 2013

Comprehending Compression. A13, Image compression

Pictures are everywhere. Taking photos has become so common that many of us in this generation probably take at least one picture a day. In fact, this blog entry will probably contain at least 5 images, which is about the average for posts on this blog. Just imagine all the billions of photos uploaded to Facebook and the amount of disk space required to store them. We would either need new storage technologies and super fast internet to carry all the information in each picture, or we could just use image compression.

The concept of compression comes from the idea that you can represent the same file or picture with less stored information. In this activity we use PCA, or principal component analysis. From my understanding, the most common way to compress images is through the Fourier transform, which says that any function can be expressed as a sum of sines and cosines. The same is true for a picture, which is just a matrix of values. A pair of Dirac deltas in frequency space corresponds to a sinusoid in the image, so superimposing sinusoids of different frequencies along the horizontal and vertical axes lets you reconstruct any image. However, reconstructing an image exactly this way would require storing just as much data. What is interesting is that keeping only the few strongest frequency components already gives a good reconstruction of the original; the information for the remaining sinusoids can be discarded to save space. The result may not be exactly the same as the original, but depending on how much information is kept, it comes pretty close.
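Just to make the Fourier picture concrete, here is a minimal numpy sketch of that idea. This is not the method we use in this activity, and the function name and the 5% figure are only illustrative: it keeps the strongest few percent of Fourier coefficients and rebuilds the image from those.

```python
import numpy as np

def fourier_compress(gray, keep=0.05):
    """Keep only the largest `keep` fraction of Fourier coefficients
    and reconstruct the image from them (illustrative sketch)."""
    F = np.fft.fft2(gray)                           # 2D Fourier transform of the grayscale image
    threshold = np.quantile(np.abs(F), 1.0 - keep)  # magnitude cutoff for the coefficients we keep
    F_kept = np.where(np.abs(F) >= threshold, F, 0)
    return np.real(np.fft.ifft2(F_kept))            # rebuild from the surviving sinusoids
```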

We have an image of my uncle's cat here for the same reason my uncle brought him to our house one evening: none.
What are you looking at?

Thanks to the miracle of image compression and broadband internet, we don't have to wait so long for it to upload, or for you to download, his serious face :|

Next, to simplify matters, we convert the image to grayscale. This grayscale version, at 147 KB, will also be the reference for our file size comparison.
don't try anything funny. I'm watching you.
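For reference, loading the photo and converting it to grayscale takes only a couple of lines. The activity itself was done with the code provided in class, so treat this numpy/PIL version (and the filename) as a stand-in:

```python
import numpy as np
from PIL import Image

img = Image.open("serious_cat.jpg").convert("L")  # "L" = 8-bit grayscale; filename is hypothetical
gray = np.asarray(img, dtype=float)               # e.g. a 600x800 array of values from 0 to 255
print(gray.shape)
```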

Next, we cut the image into small 10x10 sub-images. It is intuitive that a 10x10 sub-image is easier to reconstruct than the whole 600x800 image, which would need a huge number of eigenfunctions to rebuild and is therefore counterproductive. We take the cumulative sum (cumsum) of the eigenvalues (lambda) to decide how many components to keep: the cumulative sum crosses 99% of the total at the 30th eigenvalue, so we use the first 30 eigenfunctions. I am thankful to Ma'am for providing the code. A few modifications, with some help from Nestor, and we get a working code with this result.
I said don't try anything funny!
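For anyone curious what the code roughly does, here is a numpy sketch of the same procedure. This is not Ma'am's code; the block size, the 99% cutoff, and all the names are only stand-ins for the general idea: flatten each 10x10 tile into a 100-element vector, get the eigenvalues and eigenvectors of their covariance, keep the eigenfunctions that carry 99% of the variance, and rebuild each tile from those.

```python
import numpy as np

def pca_compress(gray, block=10, var_kept=0.99):
    """Cut the image into block x block tiles, do PCA on the tiles,
    and rebuild every tile from only the leading eigenfunctions (sketch)."""
    h, w = gray.shape
    h, w = h - h % block, w - w % block              # crop to a multiple of the block size
    tiles = (gray[:h, :w]
             .reshape(h // block, block, w // block, block)
             .swapaxes(1, 2)
             .reshape(-1, block * block))            # one flattened 10x10 tile per row

    mean = tiles.mean(axis=0)
    centered = tiles - mean
    lam, vecs = np.linalg.eigh(np.cov(centered, rowvar=False))
    lam, vecs = lam[::-1], vecs[:, ::-1]             # sort eigenvalues in descending order

    # cumsum of lambda tells us how many eigenfunctions carry 99% of the variance
    n = np.searchsorted(np.cumsum(lam) / lam.sum(), var_kept) + 1
    print(f"keeping {n} of {lam.size} eigenfunctions")

    basis = vecs[:, :n]                              # 100 x n matrix of eigenfunctions
    recon_tiles = (centered @ basis) @ basis.T + mean

    return (recon_tiles
            .reshape(h // block, w // block, block, block)
            .swapaxes(1, 2)
            .reshape(h, w))
```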

If we panic here, we might think we did something wrong and start rewriting the code from scratch. Don't worry: a quick check of the matrix of the reconstructed image shows that its values fall outside the usual grayscale range, and a quick, elementary normalization of those values gives us the image below. The original grayscale image is shown below the reconstruction for a quick comparison.
spot the difference
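The "elementary normalization" mentioned above is just a rescaling of the reconstructed values back onto the 0 to 255 range; something like this sketch (the function name is made up):

```python
import numpy as np

def to_uint8(img):
    """Stretch arbitrary float values onto 0..255 so the reconstruction
    displays and saves like an ordinary grayscale image."""
    img = img - img.min()
    return (img / img.max() * 255.0).astype(np.uint8)
```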

Visually, it is impossible to tell the difference between the two, but there are some real differences, trust me. First of all is the file size: the reconstruction is 106 KB against the original 147 KB, or about 72% of the original. Not bad, especially since they appear exactly the same. And to prove that they aren't exactly the same, here is the difference between the values of the reconstruction and the original, shown as an image.
ktnxbye
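The difference image itself is just the absolute difference of the two arrays, stretched so that it is visible; a short sketch, reusing the hypothetical names from the snippets above:

```python
import numpy as np
from PIL import Image

def save_difference(gray, recon, path="difference.png"):
    """Save |original - reconstruction| as an image; a mostly dark result
    means the two look the same but are not pixel-for-pixel identical."""
    gray = gray[:recon.shape[0], :recon.shape[1]]    # match shapes if the tiling cropped the image
    diff = np.abs(gray.astype(float) - recon.astype(float))
    diff = (diff / diff.max() * 255.0).astype(np.uint8)
    Image.fromarray(diff).save(path)
```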

I believe I was able to do the activity sufficiently and give myself a 9.
