Matrices and Digital Images

Originating authors are Dirce Uesu Pesco and Humberto José Bortolossi.
The images you see on internet pages and the photos you take with your mobile phone are examples of digital images. It is possible to represent this kind of image using matrices. For example, the small image of Felix the Cat (on the left) can be represented by a 35 \times 35 matrix whose elements are the numbers 0 and 1. These numbers specify the color of each pixel (a pixel is the smallest graphical element of a matricial image, which can take only one color at a time): the number 0 indicates black, and the number 1 indicates white. Digital images using only two colors are called binary images or boolean images.

Figure 2: The matrix corresponding to Felix The Cat

Grayscale images can also be represented by matrices. Each element of the matrix determines the intensity of the corresponding pixel. For convenience, most of the current digital files use integer numbers between 0 (to indicate black, the color of minimal intensity) and 255 (to indicate white, maximum intensity), giving a total of 256 = 2^8 different levels of gray (This quantity of levels of gray is reasonable to work with images in WEB pages. However, there are certain specific applications that need more levels of gray in order to reproduce
the image with more details and avoid rounding errors in numerical calculations, as is the case of medical images).

Color images, in turn, can be represented by three matrices. Each matrix specifies the amount of red, green and blue that makes up the image. This color system is known as RGB (There are many other color systems that are used depending on the application: CMYK (for printing), Y’IQ (for TV analog transmission in NTSC), etc). The elements of these matrices are integer numbers between 0 and 255, and they determine the intensity of the pixel with respect to the color of the matrix. Thus, in the RGB system, it is possible to represent 256^3 = 2^{24} = 16777216 different colors.

Figure 3: Original picture, Red, Green and Blue components

Digital image processing and operations with matrices

Once a digital image can be represented by matrices, we may ask how operations on their elements affect the corresponding image. For example, if we consider the binary image A below as a matrix, say A = (a_{i,j}), then the image B corresponds to the transposed matrix of A, that is, B = (b_{i, j}) = (a_{j, i}) = A^{T}. The image H, by its turn, corresponds to the matrix (a_{j, 35 - i + 1}). Try to discover the matricial relationships between the image A and the other images!

Figure 4: Matrix transformations

Another example: if we take the standard arithmetic mean of the component matrices R, G and B from a color image A, we will get a grayscale version of the image (non-integer values are rounded to the nearest integer):

Figure 5: arithmetic mean of the component matrices

One more example: using the operations of multiplication by a scalar and sum of matrices, it is possible to create an image transition effect commonly used, for instance, in PowerPoint{}^{\copyright} presentations and slide shows. More precisely, consider two grayscale images of the same size, represented by the matrices A and Z.
For each scalar (real number) t in the interval [0, 1], define the matrix

    \[M(t) = (1 - t) A + t Z.\]

Notice that M(0) = A, M(1) = Z and, for each t between 0 and 1, the elements of the matrix M(t) are between the elements of the matrices A and Z. Therefore, when t varies from 0 to 1, the matrix M(t) varies from A to Z. For the case of color images, the transformation above must be applied to the matrices R, G and B that compose each image.

Figure 6: M(0) = A, M(0.13), M(0.25), M(0.38), M(0.50), M(0.63), M(0.75), M(0.88), M(1)=Z

Multiplication of matrices also has applications in digital image processing. Although our next example will be more elaborate (with a rationale based on more advanced mathematical techniques usually only studied in Linear Algebra university courses), we believe, still, that it will be of interest to the reader, since this will have the opportunity to enjoy an amazing application derived from the ability to decompose a matrix as a product of matrices with special structures. The omitted details may be found in the references [Lay, 2011] and [Poole, 2005]. Consider, therefore, a singular value decomposition (SVD), that consists of writing a matrix A_{m \times n} as the product of three matrices:

    \[A_{m \times n}^{\vphantom{T}} = U_{m \times m}^{\vphantom{T}} S_{m \times n}^{\vphantom{T}} V_{n \times n}^{T},\]

where U and V are orthogonal matrices (that is, U^{T} U and V^{T}V are m \times m and n \times n identity matrices, respectively) and S is a matrix whose elements s_{i, j} are equal to zero for i \not = j and s_{1, 1} \geq s_{2, 2} \geq \cdots \geq s_{k, k} \geq 0, with k = \min\{m, n\}. Here is an example of an SVD decomposition:

    \[A = \left[\begin{array}{rrr} -1 & 1 & 0 \\ 0 & -1 & 1 \end{array}\right] = U S V^{T}\]

    \[= \left[ \begin{array}{rr} \displaystyle -\frac{\sqrt{2}}{2} & \displaystyle \frac{\sqrt{2}}{2} \vspace*{0.5ex} \\ \displaystyle \frac{\sqrt{2}}{2} & \displaystyle \frac{\sqrt{2}}{2} \end{array} \right] \left[ \begin{array}{rrr} \sqrt{3} & 0 & 0 \\ 0 & 1 & 0 \end{array} \right] \left[ \begin{array}{rrr} \displaystyle \frac{\sqrt{6}}{6} & \displaystyle -\frac{\sqrt{2}}{2} & \displaystyle \frac{\sqrt{3}}{3} \vspace*{0.5ex} \\ \displaystyle -\frac{\sqrt{6}}{3} & 0 & \displaystyle \frac{\sqrt{3}}{3} \vspace*{0.5ex} \\ \displaystyle \frac{\sqrt{6}}{6} & \displaystyle \frac{\sqrt{2}}{2} & \displaystyle \frac{\sqrt{3}}{3} \end{array} \right]^{T}. $ } }\]

It can be shown that every matrix has an SVD decomposition ([Lay, 2011], [Poole, 2005]). Moreover, algorithms exist that allow us to calculate such decompositions using a computer. The key point of our example is to observe that if \textbf{u}_{1}, \textbf{u}_{2}, \ldots, \textbf{u}_{m} are the columns of the matrix U and \textbf{v}_{1}, \textbf{v}_{2}, \ldots, \textbf{v}_{n} are the columns of the matrix V, then

    \[A = USV^{T} = s_{1, 1} \textbf{u}_{1} \textbf{v}_{1}^{T} + s_{2, 2} \textbf{u}_{2} \textbf{v}_{2}^{T} + \cdots + s_{k, k} \textbf{u}_{k} \textbf{v}_{k}^{T}.\]

Why is that? Suppose that A, a grayscale image of size 1000 \times 1000, must be transmitted from a satellite to a laboratory on Earth. In principle, the satellite would have to send 1 million numbers (one for each pixel). As typically only the first elements s_{i, i} of the matrix S of the SVD decomposition for A are significant (the others are “small”), it is enough, then, that the satellite sends, say, the 20 first columns of U and V, and the 20 first numbers s_{i, i} (totaling only 20 \cdot 1000 + 20 \cdot 1000 + 20 = 40020 numbers that must be sent). Upon receiving these data, the laboratory on Earth calculates the matrix s_{1, 1} \textbf{u}_{1} \textbf{v}_{1}^{T} + s_{2, 2} \textbf{u}_{2} \textbf{v}_{2}^{T} + \cdots + s_{20, 20} \textbf{u}_{20} \textbf{v}_{20}^{T} that will give an approximation of the original image.

Let’s see an example: the picture below of the mathematician Christian Felix Klein (1849-1925) has 720 \times 524 = 377 280 pixels.

Figure 7: Felix Klein

From the SVD decomposition of the corresponding matrix of this image, we can calculate the matrices s_{1, 1} \textbf{u}_{1} \textbf{v}_{1}^{T} + s_{2, 2} \textbf{u}_{2} \textbf{v}_{2}^{T} + \cdots + s_{r, r} \textbf{u}_{r} \textbf{v}_{r}^{T} for r = 1, 5, 10 and 20. These matrices generate approximations to the original image, as illustrated in the following figures. Notice that the original image corresponds to the case r = 524. It is quite impressive, is it not?

Figure 8: cases r=1, 5, 10 and 20

Other applications

Digital image processing has many applications, such as remote sensing, data transmission, medicine, robotics, computer vision, film industry, etc. In remote sensing, for example, images acquired by satellites are useful for tracking natural resources, geographical mapping, analysis of urban growth, and many other environmental applications. In the transmission of images, we have communications via fax, networks, the internet, and closed-circuit TV for monitoring and security. In medical applications, we have X-ray image processing, projection images of transaxial tomography, radiology, nuclear magnetic resonance (NMR) and ultrasonic scanning.

Some methods of acquisition and transmission can generate noise in an image. The median filter is an image processing technique used to remove them or reduce their effects: for each element of the matrix that represents the image, we observe its neighboring elements and, then, we arrange them in an ordered list. The median filter consists of choosing the central element of this list and replace the element at the center by this one.

Figure 9: the median filter

Figure 10: image with noise and image with the median filter

There are many other techniques in image processing with different objectives. The following images illustrate examples of contrast adjustment, edge detection and threshold.

Figure 11: original picture, and the same picture with contrast adjustment, edge detection and threshold

Final comment

Our objective with this text is to present a little known application of matrices for teachers and high school students: the processing of digital images. It is important to notice that the mathematical tools related to this topic go far beyond matrices. The subject is vast, rich and modern. Unfortunately, the limit of few pages recommended for this article does not allow us to provide further details. As a starting point for the reader motivated to find out more, we recommend the books [Gonzalez and Woods, 2007] and [Gomes and Velho, 2008].

References

On this website (or at its mirror), there is a series of interactive applets that allows you to explore the relations between matrices and digital images presented in this text. You will also find a DOC file with suggestions of exercises to be worked in the classroom.

Here are the references cited in the text:

Gomes, J.; Velho, L. Image Processing for Computer Graphics and Vision. Springer-Verlag, 2008.

Gonzalez, R. C.; Woods, R. E. Digital Image Processing. Third Edition. Prentice Hall, 2007.

Lay, D. Linear Algebra and Its Applications. Forth Edition. Addison Wesley, 2011.

Poole, D. Linear Algebra: A Modern Introduction. Second Edition. Brooks Cole, 2005.

The photo of the Mona Liza in LEGO is a property of Marco Pece Udronotto, who kindly granted permission to use it in this work.

This post is also available in: French, German, Italian, Arabic, Khmer

This entry was posted in Mathematics Within the Last 100 Years. Bookmark the permalink.

  1. Eric Hsu says:

    Excellent, I really enjoyed this article. I wonder if it’s not too far out of the way to introduce the SVD via eigenvalues and diagonalizing a matrix, which I believe *is* often covered in linear algebra classes.

    Also, I personally am wondering what connection there is between the SVD filter approach and filtering using Fourier analysis.

    • Antoine Nectoux says:

      Eigenvalues and diagonalizing matrices are indeed covered in undergraduate linear algebra courses. However, this article is not designed to introduce linear, but to provide an interesting example of it, in digital images. As far as I know, there is no connection between SVD decomposition and Fourier analysis, which is used in other image compression algorithms (like JPEG for example).

    • Hi! If I am not mistaken, The Klein Project is aimed to senior secondary school curriculum. In Brazil, Linear Algebra and Calculus are not presented in secondary schools. They are studied only in College! Indeed, one of the reviewers of the Portuguese version of this vignette suggested to cut off all SVD stuff because he found the subject too advanced for brazilian secondary school teachers. So, we have decided to keep the presentation as simple as possible.

  2. Govind says:

    In case of color image, which matrix do we use as ‘A’ in the above?
    I mean color image having RGB values then we will get R, G, B matrices. how to perform SVD on these matrices.

    please help me. please provide java code for SVD. my mail id :
    govind.satyam@gmail.com

    • Antoine Nectoux says:

      In case of a color image, just apply SVD to each of R, G and B.
      We do not have a java code for this, but you have all the informations you need to create one. However, there are codes available on the internet for more efficients image compression algorithms.

  3. Nitsa Movshovitz-Hadar says:

    Hi
    I am working on a series of 20 minutes powerpoint presentations of math news snapshots for high school students, to be interwoven in the ordinary curriculum. Would you be interested in joining it by working on such a snapshot following this vignette?
    Nitsa

  4. Matthew Green says:

    This is a wonderful post, and I’d love to use it with my students, but Java security issues prevent me from accessing any of the interactive code on the supplementary sites. That’s very disappointing, but otherwise, this is great!

  5. Harsh says:

    Hello .. !! It was really nice reading the article.It was up to the point and very Helpful.I just want to ask two things:
    1.How to represent the image in various possible spaces like time domain , frequency Domain, Using Fourier Series etc.
    2.What is Image preprocessing and what are the different techniques for image preprocessing.
    I am a University Student and i am researching in Image Processing field.
    I would be glad if you reply me.
    Thank You.
    shahharsh1916@gmail.com

Leave a Reply

Your email address will not be published. Required fields are marked *