An image may be defined as a two-dimensional function $f(x,y)$, where $x$ and $y$ are spatial coordinates, and the value of $f$ at any pair of coordinates $(x,y)$ is called the intensity of the image at that point. For a grayscale image, the intensity is given by a single value (one channel). For color images, the intensity is a three-component vector (three channels), usually ordered as RGB.
An image may be regarded either as continuous with respect to $x$ and $y$, and also in intensity (analog image), or as a discrete function defined on a discrete domain (digital image). Both viewpoints are useful for image processing.
Converting an analog image to digital form requires both the coordinates and the intensity to be digitized. Digitizing the coordinates is called sampling, while digitizing the intensity is referred to as quantization. Thus, when all these quantities are discrete, we call the image a digital image.
The opposite operation, converting from digital to analog, is also possible and is called interpolation.
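Sampling and quantization can be illustrated with a minimal sketch. The function `f` below is a hypothetical analog image, not one used elsewhere in this notebook, and the grid size and the 256 gray levels are assumptions chosen for illustration.

```python
import numpy as np

def f(x, y):
    """Hypothetical analog image: smooth intensity in [0, 1] on the unit square."""
    return 0.5 * (1.0 + np.sin(2 * np.pi * x) * np.cos(2 * np.pi * y))

# Sampling: evaluate f on an M x N grid of coordinates.
M, N = 64, 48                      # rows, columns (assumed sizes)
y = np.linspace(0.0, 1.0, M)       # one y value per row
x = np.linspace(0.0, 1.0, N)       # one x value per column
X, Y = np.meshgrid(x, y)           # both arrays have shape (M, N)
sampled = f(X, Y)                  # discrete coordinates, still real-valued

# Quantization: map the interval [0, 1] onto the integer levels 0..255.
digital = np.round(sampled * 255).astype(np.uint8)
print(digital.shape, digital.dtype, digital.min(), digital.max())
```

After the first step only the coordinates are discrete; the second step makes the intensities discrete as well.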
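The result of sampling and quantization is a matrix of real numbers. The size of the image is the number of rows by the number of columns, $M\times N$. Indexing of the image in Python follows the usual zero-based convention: the pixel in row $i$ and column $j$ is the matrix element $(i,j)$, with $0\le i\le M-1$ and $0\le j\le N-1$.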
from __future__ import division # true division for / in Python 2 (no effect in Python 3)
from PIL import Image # Python Imaging Library
import numpy as np # Numerical Python
import matplotlib.pyplot as plt # Python plotting
The following magic command shows plots inline in the notebook. Do not use it in Python scripts.
%matplotlib inline
PIL supports most common image formats. Let us load the image lena.jpg:
I = Image.open('lena.jpg')
The usual data type of an image is uint8, i.e. 8-bit unsigned integer. This gives $2^8 = 256$ intensity values, which are distributed in the interval $[0,255]$. We'll comment on data types later on.
The variable I is not a matrix, but an object (an instance of the class Image). There are many attributes and methods defined for this class. We review the essentials in what follows.
To display the image in an external viewer:
I.show()
To show the image in this notebook:
plt.imshow(np.asarray(I))
plt.show()
For getting image information, such as size, type, and format:
print (I.size, I.mode, I.format)
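Note that PIL reports the size as (width, height), whereas the corresponding NumPy array has shape (rows, columns), i.e. (height, width). A small sketch of this, using a synthetic image created with `Image.new` as a stand-in (an assumption, so that no image file is needed):

```python
from PIL import Image
import numpy as np

# Synthetic stand-in image: 64 pixels wide, 32 pixels high, gray level 128.
img = Image.new('L', (64, 32), color=128)
arr = np.asarray(img)

print(img.size)    # PIL convention: (width, height)
print(arr.shape)   # NumPy convention: (rows, columns) = (height, width)
```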
For converting to other modes, in this case to a grayscale image:
I1=I.convert('L') # 'L' for gray scale mode
print (I1.mode)
plt.imshow(np.asarray(I1), cmap='gray')
plt.show()
For writing to disk:
I1.save('lena_gray.tif')
Images are commonly stored with one of three data types. If the image is of class uint8 or class uint16, it has integer values in the range $[0,255]$ or $[0,65535]$, respectively. If the image is of class float32, the values are single-precision floating-point numbers. They are usually scaled to the range $[0,1]$, although it is not rare to use the scale $[0,255]$ too. When performing mathematical transformations of images we often need the image to be of floating-point type, but when reading and writing we save space by using an integer encoding. We use the following commands:
a = np.asarray(I1,dtype=np.float32) # Image class instance, I1, to float32 Numpy array, a
This command converts the object I1 into a float32 matrix.
Image.fromarray(a.astype(np.uint8)).save("test.jpg") # convert a to uint8 and then to an Image instance
This command first converts the float32 numpy array a into a uint8 numpy array and then to an Image object.
Once we have the image defined as a float32 matrix, we may start working with it (processing).
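One caveat about converting back to uint8: `astype(np.uint8)` does not saturate, so after processing it is safer to bring the values into $[0,255]$ before casting. The helper `to_uint8` below is a hypothetical name introduced only for this sketch; it assumes the array uses the $[0,255]$ scale.

```python
import numpy as np

def to_uint8(a):
    """Hypothetical helper: round, clip to [0, 255], then cast to uint8."""
    return np.clip(np.rint(a), 0, 255).astype(np.uint8)

# Values below 0 and above 255 may appear after arithmetic on the image.
a = np.array([[-20.0, 0.4, 127.6],
              [255.0, 300.0, 64.0]], dtype=np.float32)
b = to_uint8(a)
print(b)

# An image scaled in [0, 1] must be rescaled to [0, 255] first.
unit = np.array([0.0, 0.5, 1.0], dtype=np.float32)
c = to_uint8(unit * 255)
print(c)
```

The resulting array can then be passed to `Image.fromarray` as in the command above.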
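Example.
We are going to extract a region of the image (one of Lena's eyes), plot it next to the original, and save it to disk.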
We start with the extraction.
Lena_eye=a[251:283,317:349]
Then, the plotting:
plt.subplot(121)
plt.imshow(a,cmap='gray',interpolation='none')
plt.title('Lena')
plt.axis('off')
plt.subplot(122)
plt.imshow(Lena_eye,cmap='gray',interpolation='none')
plt.title("Lena's right eye")
plt.axis('off')
plt.show()
And we finish by saving:
Image.fromarray(Lena_eye.astype(np.uint8)).save("LenaEye.jpg")
Write a function with
input: an image of any class and some ranges for its $(x,y)$ pixels.
output: the matrix (float32) corresponding to the original image restricted to the given indices, and a figure of it.
Apply the function to extract the cameraman's head from cameraman.tif.
%run Exercise1.py
Masks are geometric filters on an image. For instance, if we want to extract a region of an image, we may do it by multiplying the matrix of the original image by a matrix of equal size containing ones in the region we want to keep and zeros elsewhere. In this exercise we extract a circular region of radius 150 from the image lena_gray_512.tif. Follow these steps:
When multiplying by zero, you set the pixels outside the circle to black. Modify the program so that those pixels remain visible with half their intensity.
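The masking idea can be sketched as follows. This is only one possible approach: it uses a random array as a stand-in for lena_gray_512.tif and assumes the circle is centered in the image, neither of which is prescribed by the exercise.

```python
import numpy as np

# Synthetic stand-in for a 512 x 512 grayscale image (assumption: the
# exercise itself loads lena_gray_512.tif instead).
rng = np.random.default_rng(0)
img = rng.uniform(0, 255, size=(512, 512)).astype(np.float32)

rows, cols = img.shape
r, c = np.ogrid[:rows, :cols]                 # row and column index grids
center_r, center_c = rows // 2, cols // 2     # assumed: circle centered in the image
radius = 150
inside = (r - center_r) ** 2 + (c - center_c) ** 2 <= radius ** 2

mask = inside.astype(np.float32)              # 1 inside the circle, 0 outside
masked = img * mask                           # pixels outside become black

soft_mask = np.where(inside, 1.0, 0.5).astype(np.float32)
dimmed = img * soft_mask                      # pixels outside keep half intensity
```

Building the mask from a boolean comparison avoids explicit loops over the pixels.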
%run Exercise2.py
Linear degradation is the well-known effect of darkening an image vertically (or horizontally). We may do this with a mask which is constant along each row but takes decreasing values from row to row, from 1 in the first row to zero in the last.
Construct such a matrix and apply it to Lena's image. Save the resulting image to a file.
Hint: you may use loops and if's, but vectorizing saves execution time. Explore the commands np.linspace to make the degradation and numpy.tile to construct a repeated (tiled) matrix from the linspace vector output.
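The hint can be sketched on a small synthetic image. The constant 5 x 3 array below is an assumption used to keep the output readable; it is not the image from the exercise.

```python
import numpy as np

# Small constant image as a stand-in: 5 rows, 3 columns, gray level 200.
img = np.full((5, 3), 200.0, dtype=np.float32)
rows, cols = img.shape

# One weight per row: 1 in the first row, decreasing linearly to 0 in the last.
ramp = np.linspace(1.0, 0.0, rows)

# Repeat the column of weights across all columns to get a rows x cols mask.
mask = np.tile(ramp[:, np.newaxis], (1, cols))

degraded = img * mask
print(degraded[:, 0])   # values fall from 200 in the first row to 0 in the last
```

NumPy broadcasting would give the same result without the explicit tiling, via `img * ramp[:, np.newaxis]`.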
%run Exercise3.py
If you open the book cited in the References at page 13, you will see the functions, methods and attributes defined for the module Image of PIL, such as crop, getextrema, getpixel, histogram, resize, or rotate.
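A brief sketch of how these methods are called, using a synthetic constant image as a stand-in (an assumption, so that no file is required). The box and sizes are arbitrary illustrative values.

```python
from PIL import Image

# Synthetic stand-in image: 64 pixels wide, 32 pixels high, gray level 128.
img = Image.new('L', (64, 32), color=128)

box = (10, 5, 30, 25)                   # (left, upper, right, lower)
cropped = img.crop(box)                 # region of 20 x 20 pixels
resized = img.resize((32, 16))          # target size as (width, height)
rotated = img.rotate(90, expand=True)   # counterclockwise; canvas grows to fit
lo, hi = img.getextrema()               # minimum and maximum gray level
value = img.getpixel((0, 0))            # coordinates given as (x, y)
counts = img.histogram()                # list of pixel counts per gray level

print(cropped.size, resized.size, rotated.size)
print(lo, hi, value, len(counts))
```

These methods use the (x, y) convention with width first, unlike the (row, column) indexing of NumPy arrays.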
%run Exercise4.py
%run Exercise5.py
%run Exercise6.py
%run Exercise7.py