Introduction to digital image processing using Python

Digital images

An image may be defined as a two-dimensional function $f(x,y)$, where $x$ and $y$ are spatial coordinates, and the value of $f$ at any pair of coordinates $(x,y)$ is called the intensity of the image at that point. For a grayscale image, the intensity is a single value (one channel). For a color image, the intensity is a three-component vector (three channels), usually stored in the order RGB.

An image may be regarded as continuous with respect to $x$ and $y$ and also in intensity (analog image), or as a discrete function defined on a discrete domain (digital image). Both viewpoints are useful for image processing.

Converting an analog image to digital form requires both the coordinates and the intensity to be digitized. Digitizing the coordinates is called sampling, while digitizing the intensity is referred to as quantization. Thus, when all these quantities are discrete, we call the image a digital image.
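
As a minimal sketch of these two steps, the following code samples a made-up continuous function on a small grid and then quantizes the result to 8 bits. The function f, the grid size and the variable names are assumptions chosen for illustration only.

```python
import numpy as np

def f(x, y):
    """Hypothetical analog image: a smooth intensity with values in [0, 1]."""
    return 0.5 * (1.0 + np.sin(2 * np.pi * x) * np.cos(2 * np.pi * y))

# Sampling: evaluate f on a discrete M x N grid of coordinates in [0, 1].
M, N = 4, 6
y = np.linspace(0.0, 1.0, M)
x = np.linspace(0.0, 1.0, N)
X, Y = np.meshgrid(x, y)      # both arrays have shape (M, N)
sampled = f(X, Y)             # discrete coordinates, intensity still real-valued

# Quantization: map intensities in [0, 1] to the 256 levels of uint8.
digital = np.round(sampled * 255).astype(np.uint8)

print(digital.shape, digital.dtype)
```

After both steps, digital is a matrix with M rows and N columns whose entries are integers in $[0,255]$.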

The opposite operation, converting from digital to analog, is also possible and called interpolation.
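
A minimal one-dimensional sketch of interpolation: numpy.interp estimates intensities between the samples of a single image row by linear interpolation. The row values and query positions are made up for illustration, and linear interpolation is only one of several possible schemes.

```python
import numpy as np

# Samples of one image row at the integer positions x = 0, 1, 2, 3.
row = np.array([0.0, 10.0, 20.0, 40.0])
x_samples = np.arange(row.size)

# Positions between the samples where we want an intensity value.
x_query = np.array([0.5, 1.25, 2.5])

# Linear interpolation between neighbouring samples.
values = np.interp(x_query, x_samples, row)   # values are 5.0, 12.5 and 30.0
print(values)
```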

Coordinate conventions

The result of sampling and quantization is a matrix of real numbers. The size of the image is the number of rows by the number of columns, $M\times N$. Image indexing in Python follows the usual zero-based convention, with the row index first:

$$\left(\begin{array}{cccc}a(0,0) & a(0,1) & \cdots & a(0,N-1) \\a(1,0) & a(1,1) & \cdots & a(1,N-1) \\\vdots & \vdots & \ddots & \vdots \\ a(M-1,0) & a(M-1,1) & \cdots & a(M-1,N-1)\end{array}\right)$$
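
The convention can be checked on a small NumPy array. The array below is a made-up stand-in for an image, with $M=3$ rows and $N=5$ columns.

```python
import numpy as np

M, N = 3, 5
a = np.arange(M * N).reshape(M, N)   # entries 0..14, filled row by row

print(a.shape)          # (3, 5): rows first, then columns
print(a[0, 0])          # 0: top-left element
print(a[1, 2])          # 7: row 1, column 2
print(a[M - 1, N - 1])  # 14: bottom-right element
```

Note that the size attribute of a PIL image is given as (width, height), that is, $(N, M)$, the reverse of the shape of the corresponding NumPy array.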

Python modules

In [1]:
from __future__ import division   # true division for /; only needed on Python 2
from PIL import Image             # Python Imaging Library
import numpy as np                # Numerical Python 
import matplotlib.pyplot as plt   # Python plotting

The following magic command displays plots inside the notebook. Do not use it in Python scripts.

In [2]:
%matplotlib inline

Reading, displaying and writing images

PIL supports the most common image formats. Let us load the image lena.jpg:

In [3]:
I = Image.open('lena.jpg')

The usual data type of an image is uint8, i.e. 8-bit unsigned integer. This gives $2^8 = 256$ intensity values which are distributed in the interval $[0,255]$. We'll comment on data types later on.

The variable I is not a matrix but an object (an instance of the class Image). Many attributes and methods are defined for this class. We review the essentials in what follows.

To display the image in an external viewer:

In [4]:
I.show()

To show the image in this notebook:

In [5]:
plt.imshow(np.asarray(I))
plt.show()

To get image information, such as size, mode, and format:

In [6]:
print(I.size, I.mode, I.format)
(512, 512) RGB JPEG

To convert to another mode, in this case grayscale:

In [7]:
I1 = I.convert('L')  # 'L' for grayscale mode
print(I1.mode)
L
In [8]:
plt.imshow(np.asarray(I1), cmap='gray')
plt.show()

To write to disk:

In [9]:
I1.save('lena_gray.tif')

Image types and conversions

There are three main types of images:

  • An intensity image is a data matrix whose values have been scaled to represent intensities. When the elements of an intensity image are of class uint8 or class uint16, they have integer values in the range $[0,255]$ or $[0,65535]$, respectively. If the image is of class float32, the values are single-precision floating-point numbers. They are usually scaled to the range $[0,1]$, although the scale $[0,255]$ is also common.
  • A binary image is a black and white image. Each pixel has one logical value, $0$ or $1$.
  • A color image is like an intensity image but with three channels, i.e. each pixel has three intensity values (RGB) instead of one.
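
The three types can be illustrated with small NumPy arrays. The $2\times 2$ values, the threshold 127 and the variable names below are arbitrary choices made for this sketch.

```python
import numpy as np

# Intensity image stored as uint8, values in [0, 255].
intensity = np.array([[0, 64], [128, 255]], dtype=np.uint8)

# The same intensity image as float32, rescaled to [0, 1].
intensity_f = intensity.astype(np.float32) / 255.0

# Binary image: one logical value per pixel, obtained here by thresholding.
binary = intensity > 127

# Color image: three channels per pixel, here three copies of the gray channel.
color = np.dstack([intensity, intensity, intensity])

print(intensity.dtype, intensity_f.dtype, binary.dtype, color.shape)
```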

When performing mathematical transformations of images we often need the image to be of floating-point type. But when reading and writing we save space by using an integer encoding. We use the following commands:

In [10]:
a = np.asarray(I1,dtype=np.float32)  # Image class instance, I1, to float32 Numpy array, a

This command converts the object I1 into a float32 matrix.
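
One reason to leave uint8 before doing arithmetic is that uint8 operations on arrays wrap around modulo 256 instead of producing values above 255. A short sketch with made-up pixel values:

```python
import numpy as np

a = np.array([100, 200, 250], dtype=np.uint8)
b = np.array([100, 100, 10], dtype=np.uint8)

# uint8 + uint8 stays uint8, so sums above 255 wrap around modulo 256.
wrapped = a + b                      # [200, 44, 4]

# Converting one operand to a wider type first keeps the true sums.
exact = a.astype(np.int16) + b       # [200, 300, 260]

print(wrapped, exact)
```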

In [11]:
Image.fromarray(a.astype(np.uint8)).save("test.jpg")  # convert a to uint8 and then to an Image instance

This command first converts the float32 numpy array a into a uint8 numpy array and then into an Image object.
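
Note that astype(np.uint8) drops the fractional part and does not saturate values outside $[0,255]$. A more cautious conversion rounds and clips first. The helper to_uint8 below is a hypothetical name introduced for this sketch, not part of NumPy or PIL.

```python
import numpy as np

def to_uint8(a):
    """Round to the nearest integer, clip to [0, 255], then cast to uint8."""
    return np.clip(np.rint(a), 0, 255).astype(np.uint8)

# Made-up float values, some of them outside the valid range.
values = np.array([-3.2, 0.4, 127.6, 300.0], dtype=np.float32)
out = to_uint8(values)   # [0, 0, 128, 255]
print(out)
```

The result of to_uint8 can then be passed to Image.fromarray in the same way as above.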

Once we have the image defined as a float32 matrix, we may start working with it (processing).

Example.

We are going to

  • extract a part of the image by index restriction,
  • make a plot,
  • save the result.

We start with the extraction.

In [12]:
Lena_eye=a[251:283,317:349]
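
Slices are half-open, so 251:283 selects rows 251 to 282 and 317:349 selects columns 317 to 348, giving a $32\times 32$ block. The sketch below checks this on a synthetic array that stands in for the image, and also shows that a basic slice is a view of the original data, so an explicit copy is needed before modifying the patch independently.

```python
import numpy as np

# Synthetic 512 x 512 stand-in for the float32 image matrix.
a = np.arange(512 * 512, dtype=np.float32).reshape(512, 512)

patch = a[251:283, 317:349]          # rows 251..282, columns 317..348
print(patch.shape)                   # (32, 32)
print(patch[0, 0] == a[251, 317])    # True

# A basic slice shares memory with the original array.
print(np.shares_memory(a, patch))    # True

# Work on a copy if the original must stay untouched.
patch_copy = patch.copy()
patch_copy[0, 0] = -1.0
print(a[251, 317])                   # unchanged
```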

Then, the plotting:

In [13]:
plt.subplot(121)
plt.imshow(a, cmap='gray', interpolation='none')
plt.title('Lena')
plt.axis('off')

plt.subplot(122)
plt.imshow(Lena_eye, cmap='gray', interpolation='none')
plt.title("Right Lena's eye")
plt.axis('off')

plt.show()

Finally, we save the result:

In [14]:
Image.fromarray(Lena_eye.astype(np.uint8)).save("LenaEye.jpg")

Exercises


Exercise 1

Write a function with

  • input: an image of any class and the index ranges of the $(x,y)$ pixels to keep.

  • output: the matrix (float32) corresponding to the original image restricted to the given indices, and a figure of it.

Apply the function to extract the cameraman's head from cameraman.tif.

In [16]:
%run Exercise1.py

Exercise 2

Masks are geometric filters on an image. For instance, if we want to extract a region of an image, we may do it by multiplying the matrix of the original image by a matrix of equal size containing $1$'s in the region we want to keep and $0$'s elsewhere. In this exercise we extract a circular region of radius 150 from the image lena_gray_512.tif. Follow these steps:

  • Read the image and convert it to double.
  • Create a matrix of the same dimensions filled with zeros.
  • Modify the above matrix to contain $1$'s in a circle of radius 150, i.e. where $(j-c_x)^2+(i-c_y)^2<150^2$, with $(c_x,c_y)$ the center of the image.
  • Multiply the image by the mask element by element (they are matrices!)
  • Show the result.

Multiplying by zero sets the pixels outside the circle to black. Modify the program so that those pixels remain visible with half their intensity.

In [17]:
%run Exercise2.py
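
The masking steps can be sketched without loops. The code below is one possible vectorized approach on a synthetic constant image, not the contents of Exercise2.py, which is not shown here. The image values and the use of np.ogrid are our own choices for the illustration.

```python
import numpy as np

M, N = 512, 512
image = np.full((M, N), 200.0)       # synthetic stand-in for the real image

# Center of the image and open grids of row (i) and column (j) indices.
cy, cx = M / 2, N / 2
i, j = np.ogrid[:M, :N]              # i has shape (M, 1), j has shape (1, N)

# Boolean array that is True inside the circle of radius 150.
inside = (j - cx) ** 2 + (i - cy) ** 2 < 150 ** 2

# Mask of 1's and 0's, applied by elementwise multiplication.
mask = inside.astype(float)
masked = image * mask

# Variant: pixels outside the circle keep half of their intensity.
dimmed = image * np.where(inside, 1.0, 0.5)
```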

Exercise 3

Linear degradation is the well-known effect of darkening an image vertically (or horizontally). We may do this with a mask that is constant along each row but takes decreasing values from row to row, from 1 in the first row to 0 in the last.

Construct such a matrix and apply it to Lena's image. Save the resulting image to a file.

Hint: you may use loops and if's, but vectorizing saves execution time. Explore the command numpy.linspace to build the degradation and numpy.tile to construct a repeated (tiled) matrix from the linspace output vector.

In [18]:
%run Exercise3.py
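
To see how the two commands in the hint fit together, here is a sketch on a tiny synthetic image. The sizes and intensity values are made up, and Exercise3.py itself is not shown here and may differ.

```python
import numpy as np

M, N = 4, 3
image = np.full((M, N), 90.0)        # synthetic stand-in for the real image

# One value per row, decreasing linearly from 1 (first row) to 0 (last row).
ramp = np.linspace(1.0, 0.0, M)

# Repeat the column vector N times side by side to get an M x N mask.
mask = np.tile(ramp.reshape(M, 1), (1, N))

degraded = image * mask              # first row unchanged, last row black
print(mask)
```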

On page 13 of the book cited in the References you will find the functions, methods and attributes defined for the Image module of PIL, such as crop, getextrema, getpixel, histogram, resize, or rotate.


Exercise 4

Create, as a numpy array, the image of a chessboard, where the squares have a size of $250 \times 250$ pixels. Show the result in the IPython terminal. You may use the command numpy.tile. Save the resulting image to a file.

In [27]:
%run Exercise4.py
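
A scaled-down sketch of the tiling idea, with squares of $2 \times 2$ pixels instead of $250 \times 250$. The use of np.kron to enlarge each square is our own choice (the exercise only suggests numpy.tile), and Exercise4.py is not shown here and may differ.

```python
import numpy as np

s = 2                                            # side of each square in pixels
cell = np.array([[0, 1], [1, 0]], dtype=np.uint8)  # basic 2 x 2 pattern of squares

# Enlarge every entry of the pattern to an s x s square.
block = np.kron(cell, np.ones((s, s), dtype=np.uint8))

# Repeat the enlarged pattern 4 times in each direction: 8 x 8 squares.
board = np.tile(block, (4, 4)) * 255             # 0 = black, 255 = white

print(board.shape)                               # (16, 16)
```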

Exercise 5

Create, as a numpy array, the image of concentric circles. The image has a size of $500 \times 500$ pixels and each circumference line is approximately $10$ pixels wide. Show the result in the IPython terminal. Save the resulting image to a file.

In [28]:
%run Exercise5.py

Exercise 6

Create, as a numpy array, the image of the napkin. The squares have a size of $10 \times 10$ pixels. Show the result in the IPython terminal. You may use the command numpy.tile. Save the resulting image to a file.

In [29]:
%run Exercise6.py

Exercise 7

Create, as a numpy array, the image shown below. The image has a size of $500 \times 500$ pixels. Each circle has a radius of $10$ pixels and their centers are spaced $50$ pixels apart. Show the result in the IPython terminal. You may use the command numpy.tile. Save the resulting image to a file.

In [30]:
%run Exercise7.py

References