Assignment 4: Lossy Compression using  SVD

Overview

This part of the assignment allows you to look at an interesting image compression technique. The idea here is to apply a matrix decomposition method known as Singular Value Decomposition(SVD) to an image to extract "enough" information form the image to recreate an image that is "closer" to the original. Based on the level of details that is needed in the image, you can choose to store more or less bytes in the compressed file. We will take a look at some samples of images of people and  places to see the level of compression that is acceptable.

Actually, lossy compression is normally very tricky and requires some higher math, but for this assignment we've provided  the code for SVD decomposition algorithm. In the next section, we outline the basic concepts you'll need to understand, and at the bottom, you're given step-by-step instructions on exactly what you have to do.


Basic Concepts

So far you've learned about Huffman compression and LZW compression. One thing that is nice about both of these algorithms is that they work reasonably well on any type of file. Obviously some files can be compressed much more than others, but even when they can't compress a file at all, both LZW and Huffman will still run, and the file they produce can always be decompressed to produce the original exactly.

Sometimes, though, by making assumptions about the format of the input data, you can achieve much higher compression ratios. However, you have to be willing to give up the requirement that the after compressing and decompressing, the resulting file is identical to the original. That's why this type of compression is called lossy - because you lose data. In the case of images, we would like to compress in such a way that the decompressed image looks very similar to the original, but not exactly the same.

You will be implementing a lossy compression algorithm using a technique adopted from linear algebra. If you have done a course in linear algebra, you may remember several matrix decomposition methods that were discussed in class (if you went to class). You may remember decompositions such as LU, QR and SVD decompositions. SVD stands for the singular value decomposition, a method that has many practical applciations. One such application is applying SVD to compress an image by extracting and storing "enough" important information about the matrix in the compressed file. So the cool thing about this algorithm is you really don't have to depend on the fixed level of compression provided by a method such as jpeg or gif compression, but choose how much coded information you need to store in order to recover an "approximate" image. This method is particulary useful for image transmission over distances where the receiver can request information as needed to reconstruct an "acceptable" image.. If you can live with a fuzzy image you can certainly choose to receive a much smaller array of values than the original. There are several variations of the SVD decomposition as well. We can apply the SVD method to the entire image or to various blocks sizes of the image. Of course like any compression algorithm, there are enhancements and drawbacks to SVD compression which we can discuss later.
 


Singular Value Decomposition(SVD)

The basic concept in SVD compression algorithm is to represent an image as an mxn matrix, say A. We then can obtain a decompostion of the matrix A such that  A= USVT   where U is an mxn orthogonal matrix and V is an nxn orthogonal matrix. S is a diagonal matrix such that S = diag (s1 , s2, s3 , ....., sk , 0, 0, ..0) where si's are called the singular values of A and are in descending order. Finding this decompostion involves determining the eigenvalues of AAT and using a orthogonalization process known as Gram-Schmidt method. If you are interested in finding out more, you can find details in basic linear algebra textbooks or on the web. Of course, we don't expect you to learn all this Math to do this assignment. So we decided to give you a Java class for computing the SVD. This is a class created by a project named JAMA (A JAVA Matrix Package) by MathWorks and the National Institute of Standards and Technology (NIST) which has been released to the public domain.

The SingularValueDecomposition class from JAMA looks like this: 
public class SingularValueDecomposition  {
  public SingularValueDecomposition (Matrix Arg){
      public Matrix getU (); // returns the orthogonal matrix U
      public Matrix getV (); // returns the orthogonal matrix V
      public Matrix getS (); // returns the sigular values matrix S
      //Return the one-dimensional array of singular values
      public double[] getSingularValues (); 
      // bunch of other methods, you may or may not use. 
      // See documentation for more details
}
The SingularValueDecomposition class from JAMA uses Matrix class that is also included with the JAMA package. 
public class Matrix implements Cloneable, java.io.Serializable {
  public int getColumnDimension (); // returns the number of columns
  public int getRowDimension (); // returns the number of rows
  public Matrix plus(Matrix M);  // adds a matrix M
  public Matrix times(double s); // multiplies a matrix by scalar s
  ......
  // See documentation for more details
}

The SVD class can be used to find the Singular Value Decomposition of any size array. Therefore you can apply SVD to the entire image or chunks of it, say 8x8 arrays. We will compress the whole image first and then apply the SVD to chunks of the image as extra credit. 
So how does all this work? Lets assume that the original image A is a nxn square matrix. We apply the singular value decompostion and find a matrix factorization such that A= USVT   where S is a diagonal matrix with k non-zero singular values {s1 , s2, s3 , ....., sk } where s1 > s2, > s3 > .....> sk.  and  k <= n. Now consider the following scalar-vector-vector product  A1 = s1 U1 V1T  , where  Ui is the i-th column of U and Vi is the i-th column of V and s1 is the largest singular value of A. Note that when you multiply a nx1 column vector and 1xn row vector we get an nxn matrix. This product is called the rank-1 approximation of A. If you can live with a fuzzy image, all you need to store in the compressed file is  two vectors and a scalar, a total of (2n + 1)*3 bytes if we assume each pixel is 3 bytes. This is substantially smaller than the original array of size 3n2 bytes. You can choose how many vectors, you need to store in the compressed file depending on how much clarity you want. So for example, if you want a rank-r approximation, then you need to form the sum  A = s1 U1 V1T  + s2 U2 V2T  + ..... + sr Ur Vrand that would require us to store (2n+1)*3r bytes. So larger the rank approximation, the closer you get to the original image, but you end up storing more bits in the compressed file. If r = k , you can actually form the original image. So of course if  k  is substantially smaller than n, then we still can get a pretty good image using all k singular values. However, if k is equal to n, then we need to store (2n2 + n)*3 bytes to get the original image back, much much worse than  the size required for the original image. Therefore, the value of r must be smaller than n2 / (1 + 2n) to achieve the goal of compression. 
Extra Credit Part: (Adaptive Rank Method)
Instead of compressing the entire image at once, all popular image compression programs apply compression algorithm to sub-blocks of the image. We can then exploit the uneven characteristics of the original image. If parts of the image are less complex than the others, then a smaller number of singular values are needed to create a "close" approximation. On the other hand if the image is complex, then a larger number of values are needed. Thus we can specify a percentange of ranks to be used for each block. We can define a percentage as
 = (sum of singular values used to reach that percentage) / (sum of all singular values). So for example, if we choose a percentage of 80% for each block of say 8x8, and the singular values for a particular block are 6, 5, 1, 0 .9, 0.8, 0.2, 0.2, 0.1,  then only the first two are necessary to be stored for that block. So we can get a pretty good "approximation" to a sub-block of the image using only two singular values. But for complex sub-blocks you may use a larger number of singular values. Therefore this method will use an adpative ranking scheme for each block based on the complexity of the image chunks. 

The following figures show the image after applying SVD to the entire image (128x128) using rank-1, rank-8, and rank-16 approximations.
 

Images using various rank approximations
Original Image 
size 49K
rank1
Size 1K
rank 8
Size 7K
rank 16
size 13K

Theoritically sizes of these rank-k files should be even less. But we will use this as a benchmark to make sure you will stay close to the sizes
given above. A challenging thing here is to find a good bit packing algorithm (you must write your own) to efficiently store, U, V and S. But you dont have to find the most efficient way to pack the bytes. Just try to stay close to above sizes.

Adaptive Ranking Samples
The following figures show the result of applying compression to Professor Danny Sleator (image size 128x128) using an adaptive ranking scheme on sub-blocks of size 8x8 using 80%, 50%, 10% percentage of values. Note that the compressed file didn't reduce in size much after 50%. We encourage you to think about why this is the case.
 
 

Danny Sleator in various rank percentages
Original Image
size 49K
80% 
size 26K
50%
size 15K
10%
size 14K
 

The BMP file format

The images we are giving you are in the BMP file format, which is one of the most popular uncompressed file formats for images. If you're on a Unix machine, you can use the display command to view a BMP image. For example, to view the image called desert.bmp, you would type: 
display desert.bmp
If you're on Windows, you can open BMP files using Windows Paint, and on a Macintosh you can use a shareware progam called GraphicConverter. You'll need to find a way to view BMP files to do this assignment, because otherwise you'll have no way of knowing if your program is working or not. 

All of the BMP files we are giving you are exactly 128x128 pixels. Each file is exactly 49,206 bytes in size, because there are 54 bytes of header, and each pixel takes up 3 bytes (and 54 + 128*128*3 = 49206). Don't worry about the BMP header - we want you to ignore it. So when you read a BMP file, you should just read the first 54 bytes into an array, and when you write it out again, you should just write the header bytes out unchanged. 

The pixels are stored in RGB format, which means that every pixel contains three bytes, one each for the red, green, and blue color component of each pixel. One byte can store numbers in the range 0-255, and in this model 0 is black and 255 is full color. So the color red is (255, 0, 0) and green is (0, 255, 0). (128, 0, 0) makes a dark red, and (255, 0, 255) makes purple since purple is red + blue. 

The pixels are stored from left to right and from bottom to top. So the first 128 groups of three bytes in the file (after the 54-byte header) correspond to the bottom row of the image, and the last 128 groups of thee bytes correspond to the top row. 

The compression algorithm we're using only works with one color component at a time, so when you extract the 128x128 chunks (or 8x8 chunck's in extra credit) of the image to pass to the SVD, you'll need to do this separately for the red, green, and blue components of the image.

BTW, every other file format I can think of stores images from top to bottom. But of course, BMP was invented by Microsoft, and they like to do everything upside-down.

In order to write this program, you'll need to work with two-dimensional arrays. Luckily, Java makes this very easy. Here's an example of how to create an 128x128 array of bytes and initialize all of the values to zero:

  byte grid[][] = new byte[128][128];
  int i, j;

  for(i=0; i<128; i++)
    for(j=0; j<128; j++)
      grid[i][j] = 0;
OK, now you have all of the tools you need. Here's what you actually write:


What you have to do

You should write a class called Lossy which implements the Compressor interface. Here's the interface as a reminder:
public interface Compressor {
    public void compress (BitReader reader, BitWriter writer) throws IOException;
    public void decompress (BitReader reader, BitWriter writer) throws IOException;
}
Here are the steps you'll need to compress a file: Here's how you decompress a file: To make sure your program is working correctly, compress an image, then decompress it at different rank approximations. Based on the rank approximations you select, the resulting image may look "similar" to the original. (As similar as in the examples above.)

How to run Jama

The Jama library that is provided to you contain all the classes you need to find the SVD of a matrix. In particular, SingularValueDecomposition class and Matrix class are useful. See Java Doc for more information. To run your code with Jama you may want to type:
> javac -classPath .; Jama.jar   my211Zip.java

Extra Credit

To receive extra credit you must modify your program to handle any size sub-block as well as allow each sub-block to pick adpative ranks based on the percentage criteria described above. You can also think of efficient storage mechanisms for U, V and S to further reduce the compressed file size. We will determine at a later date how much extra credit will be given for extra work. We will announce other opportunities during class.

Footnote

You have now implemented the SVD decomposition algorithm. Many of the drawbacks to the method comes from the need to store the matirces(or parts of it) U and V. Perhaps some of the latest research work in linear algebra may give the answers to further reducing the necessity to store all of U and V. There are many open questions remain in the SVD method in image decomposition. For example,