Friday, September 20, 2013

for rich or richer. A14 Pattern recognition


For this experiment, I will attempt to classify something that most UP students only see during enrollment: 500 and 1000 peso bills. As a UP student in the middle of September, I don't have such bills on hand, so I simply used the first few good image results from Google. Being lazy, I grabbed every type available: front and back, old and new designs. I cropped the images to contain only the bill and no background, and did not alter them in any other way, particularly the color or brightness. I then used half of the images, 7 out of the 14 I found for each denomination, for learning, and the other half for testing.

From the images reserved for learning, I extracted the red, green and blue components and took the mean value of each channel. Intuitively, the blue 1000 peso bills should have a higher mean blue value, while the yellow 500 peso bills should have more green. I plotted these mean values (normalized by their sum) for each learning image in 3D, shown as the solid circles. The color of the circles conveniently matches the color of the bills: blue for 1000 and yellow for 500. The resulting plot shows the 1000 peso bills roughly grouped together, with the 500 peso bills forming another cluster. The test bills were plotted in the same way and, as expected, the hollow circles representing them land inside their corresponding clusters. Next, the mean of the mean R, G and B values of the 1000 and 500 peso bills was calculated and plotted as well, represented by stars of the same color as the bills; each star sits at the center of its cluster. The test bills were then classified using the distance from these two stars: if a tested bill is closer to the yellow star, it is a 500 peso bill; otherwise, it is a 1000 peso bill. Applied to this set of bills, the method yields an accuracy of 100%.

The code is quite lengthy but actually just a bit repetitive. You can try it out if you don't believe me; however, you don't have the pictures of the bills, so contact me if you want a copy. A few snapshots of the final 3D plot from different angles are shown below. You can simply run the code to play around with the 3D plot yourself. I found myself spending quite some time doing that, because everything is better in 3D.
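In code, the feature extracted from each bill photo is just the per-channel mean, normalized by the sum of the three means (the normalization I mention at the end of this post). A minimal sketch for a single image, assuming the SIVP toolbox for imread and a hypothetical file name 'bill.jpg':

a = double(imread('bill.jpg'))      //hypothetical file name; imread comes from SIVP
r = mean(a(:,:,1))                  //mean of each color channel
g = mean(a(:,:,2))
b = mean(a(:,:,3))
feat = [r g b] / (r + g + b)        //normalized feature used for plotting and classification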





The part of this experiment where I learned the most is 3D plotting. I spent quite some time figuring out the 3D plot and how to make it look presentable. Just to clarify, the x axis is the red value, the y axis is green and the z axis is blue. The degree to which the data are scattered can be attributed to the fact that different kinds of bills were used; again, I didn't care whether a picture showed the front or back of an old or new bill. However, this could also be considered a strength of the code, since it is not limited to identifying one specific type of bill. The downside is that there is a part of the plot where the two clusters almost touch. I think this can be attributed to the fact that some of the bills, especially the front of the old bills, contain a lot of white. The code can be tested further by adding another denomination to classify; I recommend 50 peso bills, which are red and so shouldn't be confused with the yellow 500. However, I blame my lack of internet at the moment for not being able to download pictures of 50 peso bills and add them to the code. In the end, I give myself a 9 because I believe I was able to do what was required satisfactorily and was even able to give a good visual representation of the classification. I am also happy that I was able to incorporate some basic normalization techniques for the color values, which I learned from AP187.


//base filename of the 1000-peso learning images (p10001.jpg ... p10007.jpg)
bn = 'C:\Users\Shua\Desktop\p1000\p1000'
//mean R, G and B of each learning image (1 = 1000 peso, 5 = 500 peso)
mr1 = []
mg1 = []
mb1 = []
mr5 = []
mg5 = []
mb5 = []
//mean R, G and B of each test (check) image
cmr1 = []
cmg1 = []
cmb1 = []
cmr5 = []
cmg5 = []
cmb5 = []

//learn blue 1000
for i = 1:7
    a = imread([bn + string(i) + '.jpg'])
    a = double(a)
    mr1 = [mr1 mean(a(:,:,1))]
    mg1 = [mg1 mean(a(:,:,2))]
    mb1 = [mb1 mean(a(:,:,3))]
end

//class means of the 1000-peso learning features and the normalizing sum R+G+B
mmr1 = mean(mr1)
mmg1 = mean(mg1)
mmb1 = mean(mb1)
tot1 = mr1 + mg1 + mb1
mtot1 = mean(tot1)

//plot the normalized 1000-peso learning features as solid blue marks
param3d(mr1./tot1,mg1./tot1,mb1./tot1)

p = get('hdl')
p.line_mode = 'off'
p.mark_mode = 'on'
p.mark_size = 1
p.mark_foreground = 2

//plot the 1000-peso class mean as a large blue star
param3d(mmr1./mtot1,mmg1./mtot1,mmb1./mtot1)

p = get('hdl')
p.line_mode = 'off'
p.mark_mode = 'on'
p.mark_style = 14
p.mark_size = 4
p.mark_foreground = 2

//learn yellow 500
bn = 'C:\Users\Shua\Desktop\p500\p500'

for i = 1:7
    a = imread([bn + string(i) + '.jpg'])
    a = double(a)
    mr5 = [mr5 mean(a(:,:,1))]
    mg5 = [mg5 mean(a(:,:,2))]
    mb5 = [mb5 mean(a(:,:,3))]
end
//class means of the 500-peso learning features and the normalizing sum R+G+B
mmr5 = mean(mr5)
mmg5 = mean(mg5)
mmb5 = mean(mb5)
tot5 = mr5 + mg5 + mb5
mtot5 = mean(tot5)
//plot the normalized 500-peso learning features as solid yellow marks
param3d(mr5./tot5,mg5./tot5,mb5./tot5)

p = get('hdl')
p.line_mode = 'off'
p.mark_mode = 'on'
p.mark_size = 1
p.mark_foreground = 7

//plot the 500-peso class mean as a large yellow star
param3d(mmr5./mtot5,mmg5./mtot5,mmb5./mtot5)

p = get('hdl')
p.line_mode = 'off'
p.mark_mode = 'on'
p.mark_style = 14
p.mark_size = 4
p.mark_foreground = 7

//check 1000 (test images c11.jpg ... c17.jpg)
bn = 'C:\Users\Shua\Desktop\p1000\c1'
for i = 1:7
    a = imread([bn + string(i) + '.jpg'])
    a = double(a)
    cmr1 = [cmr1 mean(a(:,:,1))]
    cmg1 = [cmg1 mean(a(:,:,2))]
    cmb1 = [cmb1 mean(a(:,:,3))]
end

ctot1 = cmr1 + cmg1 + cmb1

//plot the 1000-peso test bills as hollow blue circles
param3d(cmr1./ctot1,cmg1./ctot1,cmb1./ctot1)
p = get('hdl')
p.line_mode = 'off'
p.mark_mode = 'on'
p.mark_style = 9
p.mark_size = 1
p.mark_foreground = 2

//distance formula: distance of each 1000-peso test bill to the 1000-peso mean (d121)
//and to the 500-peso mean (d125)
dr121 = cmr1./ctot1 - mmr1./mtot1
dg121 = cmg1./ctot1 - mmg1./mtot1
db121 = cmb1./ctot1 - mmb1./mtot1
d121 = sqrt(dr121.^2 + dg121.^2 + db121.^2)
dr125 = cmr1./ctot1 - mmr5./mtot5
dg125 = cmg1./ctot1 - mmg5./mtot5
db125 = cmb1./ctot1 - mmb5./mtot5
d125 = sqrt(dr125.^2 + dg125.^2 + db125.^2)

//check 500 (test images c1.jpg ... c7.jpg)
bn = 'C:\Users\Shua\Desktop\p500\c'
for i = 1:7
    a = imread([bn + string(i) + '.jpg'])
    a = double(a)
    cmr5 = [cmr5 mean(a(:,:,1))]
    cmg5 = [cmg5 mean(a(:,:,2))]
    cmb5 = [cmb5 mean(a(:,:,3))]
end

ctot5 = cmr5 + cmg5 + cmb5

//plot the 500-peso test bills as hollow yellow circles
param3d(cmr5./ctot5,cmg5./ctot5,cmb5./ctot5)

p = get('hdl')
p.line_mode = 'off'
p.mark_mode = 'on'
p.mark_style = 9
p.mark_size = 1
p.mark_foreground = 7


//distance of each 500-peso test bill to the 1000-peso mean (d521) and to the 500-peso mean (d525)
dr521 = cmr5./ctot5 - mmr1./mtot1
dg521 = cmg5./ctot5 - mmg1./mtot1
db521 = cmb5./ctot5 - mmb1./mtot1
d521 = sqrt(dr521.^2 + dg521.^2 + db521.^2)
dr525 = cmr5./ctot5 - mmr5./mtot5
dg525 = cmg5./ctot5 - mmg5./mtot5
db525 = cmb5./ctot5 - mmb5./mtot5
d525 = sqrt(dr525.^2 + dg525.^2 + db525.^2)
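
For completeness, here is a minimal sketch (not part of the original code) of the decision rule described above: each test bill is assigned to whichever class mean, blue star or yellow star, it is closer to in the normalized RGB space.

//classify each test bill by the nearest class mean
for i = 1:7
    if d121(i) < d125(i) then
        disp('1000-peso test bill ' + string(i) + ' classified as 1000')
    else
        disp('1000-peso test bill ' + string(i) + ' classified as 500')
    end
    if d525(i) < d521(i) then
        disp('500-peso test bill ' + string(i) + ' classified as 500')
    else
        disp('500-peso test bill ' + string(i) + ' classified as 1000')
    end
end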


Thursday, September 5, 2013

Comprehending Compression. A13, Image compression

Pictures are everywhere. They have become so common that many of us in this generation probably take at least one picture a day. In fact, this blog entry will probably contain at least 5 images, which is the average number of pictures per post in this blog. Just imagine the billions of photos uploaded to Facebook and the amount of disk space required to store all of them. We would probably need new technologies and super fast internet to support all the information in each picture, or we could just use file compression.

The concept of compression comes from the idea that the same file or picture can be represented with less stored information. In this activity we use PCA, or principal components analysis. From my understanding, the most common way to think about compressing images is through the Fourier transform, which states that any function is a sum of sines and cosines. The same is true for pictures, which are represented by a matrix of values; we also know that the Fourier transform of a Dirac delta is a sinusoid. Superimposing various sine waves of different frequencies along the horizontal and vertical axes would allow you to reconstruct any image. However, reconstructing an image exactly through this method would most likely require storing a large amount of data as well. What is interesting is how just a few terms already give a good reconstruction of the original; the information for the remaining sine waves can be discarded to save space. The result may not be exactly the same as the original but, depending on the amount of information stored, comes pretty close.
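Just to illustrate that Fourier-domain idea (this is only a toy sketch, not the PCA method used in this activity, and it assumes a grayscale image already loaded into a matrix I):

//keep only the largest Fourier coefficients and reconstruct
F = fft(I)                              //2-D FFT of the grayscale image matrix I
s = gsort(abs(F(:)), 'g', 'd')          //coefficient magnitudes, largest first
thr = s(round(0.05 * length(s)))        //threshold that keeps roughly the top 5%
F(abs(F) < thr) = 0                     //discard everything below the threshold
Irec = real(ifft(F))                    //approximate reconstruction from what is left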

We have an image of my uncle's cat, for the same reason my uncle brought him to our house one evening: none.
What are you looking at?

Thanks to the miracle of image compression and broadband internet, we don't have to wait so long for me to upload, and for you to download, his serious face :|

Next, to simplify matters, we take the grayscale of the image. This will also be the basis for our file size comparison: 147 KB.
don't try anything funny. I'm watching you.

Next, we cut the image into small 10x10 sub-images. It is intuitive that a 10x10 sub-image is easier to reconstruct than the whole 600x800 image, which would probably need an almost infinite number of eigenfunctions to reconstruct, and that is counterproductive. We use cumsum on lambda to find the required threshold: the 99% level is reached at the 30th lambda, so we use the first 30 eigenfunctions. I am thankful to mam for providing the code. A few modifications, plus some help from Nestor, and we get a working code with this result.
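Since I can't post mam's code here, below is a rough standalone sketch of the same idea, written from scratch: cut the grayscale image into 10x10 blocks, do PCA on the blocks, keep only enough eigenvectors to cover 99% of the variance, and rebuild the image. The file name 'cat.jpg', the SIVP functions imread/rgb2gray, the assumption that the image dimensions are multiples of 10, and the use of spec on the block covariance are all my own choices, not the provided code.

//read the image and convert to a grayscale matrix (SIVP toolbox)
I = double(rgb2gray(imread('cat.jpg')))
[nr, nc] = size(I)
bs = 10                                  //block size

//cut the image into 10x10 blocks, one flattened block per row of X
X = []
for r = 1:bs:nr
    for c = 1:bs:nc
        blk = I(r:r+bs-1, c:c+bs-1)
        X = [X; matrix(blk, 1, bs*bs)]
    end
end

//PCA: eigen-decomposition of the covariance of the mean-centered blocks
mu = mean(X, 'r')                        //mean block (1 x 100)
Xc = X - repmat(mu, size(X, 1), 1)
C = Xc' * Xc / (size(Xc, 1) - 1)
[V, D] = spec(C)                         //eigenvectors V, eigenvalues on diag(D)
[lambda, idx] = gsort(diag(D), 'g', 'd') //sort eigenvalues in decreasing order
V = V(:, idx)

//keep the smallest number of eigenvectors covering 99% of the variance
k = find(cumsum(lambda) / sum(lambda) >= 0.99, 1)
Vk = V(:, 1:k)

//project onto the retained eigenvectors and reconstruct the blocks
Xrec = (Xc * Vk) * Vk' + repmat(mu, size(X, 1), 1)

//reassemble the reconstructed blocks into an image
J = zeros(nr, nc)
n = 1
for r = 1:bs:nr
    for c = 1:bs:nc
        J(r:r+bs-1, c:c+bs-1) = matrix(Xrec(n, :), bs, bs)
        n = n + 1
    end
end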
I said don't try anything funny!

If we panic, we might think we did something wrong and try to rewrite the code from scratch. Don't worry: a quick check of the matrix of the reconstructed image shows some weird values, and a quick elementary normalization of those values gives us this. The original grayscale image is shown below the reconstruction for a quick comparison.
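By "elementary normalization" I just mean linearly stretching the reconstructed values back into the 0 to 255 display range. A rough sketch, using the reconstructed image J from the sketch above (this is my own guess at the exact rescaling used):

//rescale the reconstruction to 0-255 and display it (imshow is from SIVP)
Jn = (J - min(J)) / (max(J) - min(J)) * 255
imshow(uint8(Jn))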
spot the difference

Visually, it is impossible to tell the difference between the two, but there are some significant differences, trust me. First of all is the file size: 106 KB versus 147 KB, or 72% of the original. Not bad, especially since they appear exactly the same. And to prove that they aren't exactly the same, here is the difference between the values of the reconstruction and the original, in image form.
ktnxbye
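For reference, the difference image above can be produced with something like the following, reusing I and Jn from the earlier sketches (again my own variable names, not the provided code):

//per-pixel absolute difference between the reconstruction and the original
d = abs(Jn - I)
imshow(uint8(d))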

I believe I was able to do the activity sufficiently, and I give myself a 9.