Course Webpage

Approximation

Contents

We import the modules matplotlib.pyplot, numpy and the numpy functions for polynomials.

In [1]:
import numpy as np
import matplotlib.pyplot as plt
import numpy.polynomial.polynomial as pol

We set the print options to print matrices neatly

In [2]:
np.set_printoptions(precision = 2)   # only 2 fractionary digits
np.set_printoptions(suppress = True) # do not use exponential notation

Discrete polynomial approximation

Let us begin with an example. We search for a polynomial of degree $2$ that approximates the function $f(x)=\sin(x)$ and we have values for $5$ nodes

$$x_1=0,x_2=0.5,x_3=1,x_4=1.5,x_5=2$$

The polynomial would be

$$P_2(x)=a_0+a_1x+a_2x^2$$

and we have three unknowns $a_0,a_1,a_2.$

The equations are

$$ \begin{array}{lll} P_{2}(x_{1})=y_{1} & \quad & a_{0}+a_{1}x_{1}+a_{2}x_{1}^{2}=y_{1}\\ P_{2}(x_{2})=y_{2} & \quad & a_{0}+a_{1}x_{2}+a_{2}x_{2}^{2}=y_{2}\\ P_{2}(x_{3})=y_{3} & \quad & a_{0}+a_{1}x_{3}+a_{2}x_{3}^{2}=y_{3}\\ P_{2}(x_{4})=y_{4} & \quad & a_{0}+a_{1}x_{4}+a_{2}x_{4}^{2}=y_{4}\\ P_{2}(x_{5})=y_{5} & \quad & a_{0}+a_{1}x_{5}+a_{2}x_{5}^{2}=y_{5} \end{array} $$

And we must solve the system

$$ \left(\begin{array}{ccccc} 1 & x_{1} & x_{1}^{2} \\ 1 & x_{2} & x_{2}^{2} \\ 1 & x_{3} & x_{3}^{2} \\ 1 & x_{4} & x_{4}^{2} \\ 1 & x_{5} & x_{5}^{2} \end{array}\right)\left(\begin{array}{c} a_{0}\\ a_{1}\\ a_{2} \end{array}\right)=\left(\begin{array}{c} y_{1}\\ y_{2}\\ y_{3}\\ y_{4}\\ y_{5} \end{array}\right)\quad \quad (1) $$

That is,

$$ Vp=y\quad \quad (1) $$

But we have more equations than unknowns and the system has no solution. If we multiply the two sides by the transpose of the coefficient matrix:

$$ \left(\begin{array}{ccccc} 1 & 1 & 1 & 1 & 1 \\ x_{1} & x_{2} & x_{3}& x_{4} &x_{5}\\ x_{1}^{2} & x_{2}^{2} & x_{3}^{2} & x_{4}^{2} & x_{5}^{2} \end{array}\right) \left(\begin{array}{ccccc} 1 & x_{1} & x_{1}^{2} \\ 1 & x_{2} & x_{2}^{2} \\ 1 & x_{3} & x_{3}^{2} \\ 1 & x_{4} & x_{4}^{2} \\ 1 & x_{5} & x_{5}^{2} \end{array}\right) \left(\begin{array}{c} a_{0}\\ a_{1}\\ a_{2} \end{array}\right)= \left(\begin{array}{ccccc} 1 & 1 & 1 & 1 & 1 \\ x_{1} & x_{2} & x_{3}& x_{4} &x_{5}\\ x_{1}^{2} & x_{2}^{2} & x_{3}^{2} & x_{4}^{2} & x_{5}^{2} \end{array}\right) \left(\begin{array}{c} y_{1}\\ y_{2}\\ y_{3}\\ y_{4}\\ y_{5} \end{array}\right) $$

And now we have a consistent system of $3$ unknowns and $3$ equations:

$$ \left(\begin{array}{ccc} n & \sum\limits_{k=1}^{n}x_{k} & \sum\limits_{k=1}^{n}x_{k}^{2}\\ \sum\limits_{k=1}^{n}x_{k} & \sum\limits_{k=1}^{n}x_{k}^{2} & \sum\limits_{k=1}^{n}x_{k}^{3}\\ \sum_\limits{k=1}^{n}x_{k}^{2} & \sum\limits_{k=1}^{n}x_{k}^{3} & \sum\limits_{k=1}^{n}x_{k}^{4} \end{array}\right)\left(\begin{array}{c} a_{0}\\ a_{1}\\ a_{2} \end{array}\right)=\left(\begin{array}{c} \sum\limits_{k=1}^{n}y_{k}\\ \sum\limits_{k=1}^{n}x_{k}y_{k}\\ \sum\limits_{k=1}^{n}x_{k}^{2}y_{k} \end{array}\right) \quad \quad (2) $$

That is, we have a system

$$Cp=d \quad \quad (2)$$

where the data are $C$ y $d$ and we want to calculate the coefficients of the approximation polynomial $p.$


Exercise 1

Write a function approx1(f,deg,a,b,n) that computes and plots the approximating polynomial of degree deg for the n equispaced points of the function f in the interval [a,b] (the border points are nodes). The function will not return anything: it will do the necessary calculations to obtain the approximation polynomial and it will plot the polynomial and the points. Follow the steps:

  1. Build the nodes x with np.linspace and obtain y = f(x)
  2. Build the coefficients matrix $C$ and the right hand side matrix $d$ of system (2).
    1. Build the matrix $V$ from system (1) and the obtain C = np.dot(V.T,V) and d = np.dot(V.T,y), where V.T is the transposed matrix of $V$ or
    2. Build $C$ directly using np.sum that sums the elements of a vector. Also, take into account that x**2 is a vector whose elements are x elements squared.
  3. Solve the system $Cp=d$ with the command np.linalg.solve(C,d).
  4. Obtain 50 points in the interval $[a,b]$ and the corresponding values of the polynomial p with pol.polyval.
  5. Plot, in interval $[a,b]$:
    • The points.
    • The polynomial.

Use this function to obtain and plot:

  • The degree $\mathrm{deg} = 2$ approximation polynomial for the function $f_1(x)=\sin(x)$ using $n = 5$ equispaced points in the interval $[0,2].$
  • The degree $\mathrm{deg} = 4$ approximation polynomial for the function $f_2(x) =\cos(\arctan(x))-\log(x+5)$ using $n = 10$ equispaced points in the interval $[-2,0].$
In [1]:
%run Exercise1.py
 Coefficient matrix

[[ 5.    5.    7.5 ]
 [ 5.    7.5  12.5 ]
 [ 7.5  12.5  22.12]]


 Right-hand side term

[3.23 4.4  6.84]


 System solution

[-0.01  1.23 -0.38]

We can get the polynomial coefficients with the command polyfit using as input the polynomial degree.

In [2]:
x = np.linspace(0,2,5)
y = np.sin(x)
p = pol.polyfit(x,y,2)
print(p)
[-0.01  1.23 -0.38]

Using the dataset Auto MPG from UC Irvine Machine Learning Repository We will estimate the power (horsepower) needed for a car of a given weight (weight) and the miles per gallon (mpg) that it would be able to travel. The car models in this database are from the year 1970 to 1982. The database is also in kaggle.

Using the pandas library, we read the data file cars.csv and store it as a dataframe, which is the equivalent of an Excel sheet: the data is organized in a matrix with one sample per row and one variable per column. The data can be non homogeneous. The dataframe can contain variables of different types (categorical, ordinal, numerical,...)

In [3]:
import pandas as pd

df = pd.read_csv('http://www.unioviedo.es/compnum/labs/new/cars.csv',sep=',')

Another way to read the file would be to download it to the working directory from cars and read it with

In [4]:
df = pd.read_csv('cars.csv',sep=',') # It reads data separated by commas

Let us have a look at the dataframe

In [5]:
df
Out[5]:
mpg cylinders displacement horsepower weight acceleration model year origin car name
0 18.0 8 307.0 130 3504 12.0 70 1 chevrolet chevelle malibu
1 15.0 8 350.0 165 3693 11.5 70 1 buick skylark 320
2 18.0 8 318.0 150 3436 11.0 70 1 plymouth satellite
3 16.0 8 304.0 150 3433 12.0 70 1 amc rebel sst
4 17.0 8 302.0 140 3449 10.5 70 1 ford torino
... ... ... ... ... ... ... ... ... ...
387 27.0 4 140.0 86 2790 15.6 82 1 ford mustang gl
388 44.0 4 97.0 52 2130 24.6 82 2 vw pickup
389 32.0 4 135.0 84 2295 11.6 82 1 dodge rampage
390 28.0 4 120.0 79 2625 18.6 82 1 ford ranger
391 31.0 4 119.0 82 2720 19.4 82 1 chevy s-10

392 rows × 9 columns

For intance, to extract the column horsepower

In [6]:
x = df['horsepower']

Exercise 2

Using the file cars that contains the dataset Auto MPG from UC Irvine Machine Learning Repository, estimate the necessary power (horsepower) for a car of a given weight(weight) and the miles per gallon (mpg) that would be able to travel in a city. Follow the steps:

  1. Extract the columns weight y horsepower.
  2. Using pol.polyfit calculate and plot the degree 1 fitting polynomial using as variable $x$ the weight and as variable $y$ the horsepower.
  3. Using pol.polyval estimate the necessary power (horsepower) of a $3000$ lbs car.
  4. Extract the column mpg and using as variable $x$ the horsepower and as variable $y$ the mpg calculate and plot the degree 2 fitting polynomial.
  5. With the horsepower obtained for the $3000$ lbs car, estimate the miles per gallon that this car could travel within a city using the approximation polynomial.
  6. Plot the data, the corresponding fitting polynomial and the point over the polynomial used to estimate the corresponding mpg.
In [7]:
%run Exercise2.py
105 horse power
21  mpg

Continuous polynomial approximation

If, instead of knowing only the value of the function in some points, we know all the values for the points in an interval, we can perform a continuous fitting. In this case, the additions transform into integrals:

$$ \left(\begin{array}{ccc} \int_{a}^{b}1 dx & \int_{a}^{b}x dx & \int_{a}^{b}x^2 dx\\ \int_{a}^{b}x dx & \int_{a}^{b}x^2 dx & \int_{a}^{b}x^3 dx\\ \int_{a}^{b}x^2 dx & \int_{a}^{b}x^3 dx & \int_{a}^{b}x^4 dx \end{array}\right)\left(\begin{array}{c} a_{0}\\ a_{1}\\ a_{2} \end{array}\right)=\left(\begin{array}{c} \int_{a}^{b}f(x)dx\\ \int_{a}^{b}x f(x)dx\\ \int_{a}^{b}x^2 f(x)dx \end{array}\right)\quad \quad (3) $$

Exercise 3

Write a function approx2(f,deg,a,b) that computes and plots the degree deg fitting polynomial p for the function f in the interval [a,b]. Follow the steps:

  1. Build the coefficients matrix and the right hand side matrix of system (3) and print them. Use the function scipy.integrate.quad. For instance, to compute the value of an integral $$I = \int_{a}^{b}g(x)dx$$
    from scipy.integrate import quad
    Ia = quad(g,a,b)[0]
    
    where g is a lambda function. We can define a lambda function from other one. For example
    g = lambda x: x * f(x)
    
  2. Solve the system with the command np.linalg.solve and print the solution.
  3. Find the values of the approximating polynomial for the points in the interval $[a,b]$ with the command pol.polyval.
  4. Plot, in interval $[a,b]$:
    • The function.
    • The polynomial.

Use this function to obtain and plot:

  • The degree $\mathrm{deg} = 2$ fitting polynomial for the function
$$f_1(x)= \sin(x)$$

in the interval $[a,b]=[0,2].$

  • The degree $\mathrm{deg}=4$ fitting polynomial for the function
$$f_2(x) =\cos(\arctan(x))-\log(x+5)$$

in the interval $[a,b]=[-2,0].$

In [10]:
%run Exercise3.py
 Coefficient matrix

[[2.   2.   2.67]
 [2.   2.67 4.  ]
 [2.67 4.   6.4 ]]


 Right-hand side term

[1.42 1.74 2.47]


 System solution

[-0.04  1.27 -0.39]

More exercises

Approximation by orthogonal basis

We can assume that we have been using the dot product for functions

$$ \left\langle f(x),g(x)\right\rangle=\int_{a}^ b f(x)g(x)dx$$

for the polynomial basis $\{P_0,P_1,P_2\}=\{1,x,x^2\}$ we may write the system

$$ \left(\begin{array}{ccc} \left\langle P_0,P_0\right\rangle & \left\langle P_0,P_1\right\rangle & \left\langle P_0,P_2\right\rangle\\ \left\langle P_1,P_0\right\rangle & \left\langle P_1,P_1\right\rangle & \left\langle P_1,P_2\right\rangle\\ \left\langle P_2,P_0\right\rangle & \left\langle P_2,P_1\right\rangle & \left\langle P_2,P_2\right\rangle \end{array}\right)\left(\begin{array}{c} a_{0}\\ a_{1}\\ a_{2} \end{array}\right)=\left(\begin{array}{c} \left\langle P_0,f(x)\right\rangle\\ \left\langle P_1,f(x)\right\rangle\\ \left\langle P_2,f(x)\right\rangle \end{array}\right) $$

We can improve the previous method making the coefficient matrix diagonal. We can achieve it using an orthogonal polynomial basis $\{L_0,L_1,L_2\}$. Then

$$ \left(\begin{array}{ccc} \left\langle L_0,L_0\right\rangle & 0 & 0\\ 0& \left\langle L_1,L_1\right\rangle & 0\\ 0 & 0 & \left\langle L_2,L_2\right\rangle \end{array}\right)\left(\begin{array}{c} a_{0}\\ a_{1}\\ a_{2} \end{array}\right)=\left(\begin{array}{c} \left\langle L_0,f(x)\right\rangle\\ \left\langle L_1,f(x)\right\rangle\\ \left\langle L_2,f(x)\right\rangle \end{array}\right) $$

And the solution of the system would be

$$ a_0=\frac{\left\langle L_0,f(x)\right\rangle}{\left\langle L_0,L_0\right\rangle} \quad a_1=\frac{\left\langle L_1,f(x)\right\rangle}{\left\langle L_1,L_1\right\rangle} \quad a_2=\frac{\left\langle L_2,f(x)\right\rangle}{\left\langle L_2,L_2\right\rangle}\quad \quad (3) $$

This orthogonal polynomials are the Legendre polynomials. We can obtain them from the module scipy.special.

In [4]:
from scipy.special import eval_legendre

xp = np.linspace(-1,1,100)

plt.figure(figsize=(8,6))
plt.plot(xp,0*xp,'k-')
for i in range(6):
    plt.plot(xp,eval_legendre(i,xp),label=r'$L_'+str(i)+'$') 
plt.legend(bbox_to_anchor=(1, 1), loc='upper left', ncol=1,fontsize=16)  
plt.title('Legendre polynomials',fontsize=16)
plt.show() 

If we solve the latter exercise using this orthogonal polynomials, the result will be the same but it would be expressed as

$$ P_2(x)=a_0\,L_0(x)+a_1\,L_1(x)+a_2\,L_2(x) \quad \quad (4) $$

But this Legendre polynomials are defined for $[-1,1].$

We can apply this to any $[a,b]$ interval with a linear transformation.

We want to change variable that takes us from the interval $[-1,1]$ to the interval $[a,b].$

If we perform a linear change of variable

$$x = m\,t +n,$$

that if $x = -1$, then $t = a$ then

$$-1 = a\,m+n.$$

And if $x = 1$, then $t = b$ then

$$1 = b\,m+n.$$

We multiply the first equation by $-1$ and we add these two equations

$$2 = (b-a)\,m \qquad m = \dfrac{2}{b-a}$$

Now we substitute this value in the first equation

$$-1 = a\,\dfrac{2}{b-a} +n$$

And solving this equation for $n$ we get

$$n =-\dfrac{a+b}{b-a}$$

Thus, the change of variable is

$$\fbox{$x=\dfrac{2t-(a+b)}{b-a}$}$$

And if we want to use the Legendre polynomials in an interval that it is not $[-1,1]$

In [5]:
a = -1.; b = 3.
xp = np.linspace(a,b,100)

plt.figure(figsize=(8,6))
plt.plot(xp,0*xp,'k-')

for i in range(6):
    plt.plot(xp,eval_legendre(i,(2*xp-(a+b))/(b-a)),label=r'$L_'+str(i)+'$') 
plt.legend(bbox_to_anchor=(1, 1), loc='upper left', ncol=1,fontsize=16)  
plt.title('Legendre polynomials',fontsize=16)
plt.show() 

Exercise 4

Write a function approx3(f,deg,a,b) that computes and plots the degree deg fitting polynomial p for the function f in the interval [a,b]. Follow the steps:

  • Compute the coefficients in (3). Use the function scipy.integrate.quad and the Legendre polynomials from scipy.special transformed for the interval $[a,b]$
  • Find the values of the polynomial (4) in the interval $[a,b]$
  • Plot, in the interval $[0,2]$:
    • The function $f(x)=\cos(x)$
    • The degree $2$ approximation polynomial (4)

Also compute the degree $4$ polynomial approximation of $f_2(x)=\cos(\arctan(x))-\log(x+5)$ in the interval $[-2,0]$. Plot the function and the polynomial approximation.

In [5]:
%run Exercise4.py
a0 num =  1.4161468365471424
a0 den =  2.0
a0     =  0.7080734182735712

a1 num =  0.32544426337282417
a1 den =  0.6666666666666666
a1     =  0.48816639505923626

a2 num =  -0.10440139261723963
a2 den =  0.4
a2     =  -0.26100348154309905
a0 num =  -1.3077172209873624
a0 den =  2.0
a0     =  -0.6538586104936812

a1 num =  0.03875967592395079
a1 den =  0.6666666666666666
a1     =  0.05813951388592619

a2 num =  0.015371522898630807
a2 den =  0.4
a2     =  0.03842880724657702

a3 num =  -0.009963365987656741
a3 den =  0.2857142857142858
a3     =  -0.03487178095679858

a4 num =  -0.003526531279820857
a4 den =  0.22222222222222227
a4     =  -0.015869390759193854

Approximation with Fourier series

If $f$ is a periodic function with period $T$, the approximation with a basis of trigonometric functions works very well. We use the dot product

$$ \left\langle f(x),g(x)\right\rangle=\int_{\lambda}^{\lambda+T} f(x)g(x)dx\quad\quad (5)$$

where the integration interval covers a full period.

The basis of trigonometric functions is

$$ \left\{ 1,\cos\left(\frac{2\pi x}{T}\right),\mathrm{sen} \left(\frac{2\pi x}{T}\right),\cos\left(2\frac{2\pi x}{T}\right),\mathrm{sen} \left(2\frac{2\pi x}{T}\right),\ldots,\cos\left(n\frac{2\pi x}{T}\right),\mathrm{sen} \left(n\frac{2\pi x}{T}\right)\right\} $$

that is an orthogonal basis for dot product (5). Then, we may write the system $Ax=b$

$$ A= \left(\begin{array}{cccccc} \int_{\lambda}^{\lambda+T} 1^2 dx & 0 & 0 & \cdots & 0 & 0\\ 0& \int_{\lambda}^{\lambda+T} \cos^2\left(\frac{2\pi x}{T}\right) dx & 0& \cdots & 0 & 0\\ 0 & 0 & \int_{\lambda}^{\lambda+T} \mathrm{sen}^2 \left(\frac{2\pi x}{T}\right) dx & \cdots & 0 & 0\\ \cdots & \cdots & \cdots & \cdots & \cdots & \cdots\\ 0 & 0 & 0 & \cdots & \int_{\lambda}^{\lambda+T} \cos^2 \left(n\frac{2\pi x}{T}\right) dx & 0\\ 0 & 0 & 0 & \cdots & 0 & \int_{\lambda}^{\lambda+T} \mathrm{sen}^2 \left(n\frac{2\pi x}{T}\right) dx\\ \end{array}\right)\left(\begin{array}{c} \frac{a_{0}}{2}\\ a_{1}\\ b_{1}\\ \cdots\\ a_{n}\\ b_{n} \end{array}\right)= $$$$ =\left(\begin{array}{c} \int_{\lambda}^{\lambda+T} f(x) dx\\ \int_{\lambda}^{\lambda+T} f(x)\cos \left(\frac{2\pi x}{T}\right) dx\\ \int_{\lambda}^{\lambda+T} f(x)\mathrm{sen} \left(\frac{2\pi x}{T}\right) dx\\ \cdots\\ \int_{\lambda}^{\lambda+T} f(x)\cos \left(n\frac{2\pi x}{T}\right) dx\\ \int_{\lambda}^{\lambda+T} f(x)\mathrm{sen} \left(n\frac{2\pi x}{T}\right) dx \end{array}\right) $$

And the function that approximates $f$ would be of the form:

$$ F(x)=\frac{a_0}{2}+a_1\cos\left(\frac{2\pi x}{T}\right)+b_1\mathrm{sen} \left(\frac{2\pi x}{T}\right)+a_2\cos\left(2\frac{2\pi x}{T}\right)+b_2\mathrm{sen} \left(2\frac{2\pi x}{T}\right)+\cdots+a_n\cos\left(n\frac{2\pi x}{T}\right)+b_n\mathrm{sen} \left(n\frac{2\pi x}{T}\right) $$

And now,

$$ \frac{a_0}{2}=\frac{\int_{\lambda}^{\lambda+T} f(x) dx}{\int_{\lambda}^{\lambda+T} 1^2 dx} \quad a_1=\frac{\int_{\lambda}^{\lambda+T} f(x)\cos \left(\frac{2\pi x}{T}\right) dx}{\int_{\lambda}^{\lambda+T} \cos^2\left(\frac{2\pi x}{T}\right)dx} \quad b_1=\frac{\int_{\lambda}^{\lambda+T} f(x)\mathrm{sen} \left(\frac{2\pi x}{T}\right) dx}{\int_{\lambda}^{\lambda+T} \mathrm{sen}^2\left(\frac{2\pi x}{T}\right)dx} ,\ldots, \\ a_n=\frac{\int_{\lambda}^{\lambda+T} f(x)\cos \left(n\frac{2\pi x}{T}\right) dx}{\int_{\lambda}^{\lambda+T} \cos^2\left(n\frac{2\pi x}{T}\right)dx} \quad b_n=\frac{\int_{\lambda}^{\lambda+T} f(x)\mathrm{sen} \left(n\frac{2\pi x}{T}\right) dx}{\int_{\lambda}^{\lambda+T} \mathrm{sen}^2\left(n\frac{2\pi x}{T}\right)dx} $$

As

$$ \int_{\lambda}^{\lambda+T} 1^2 dx= T, \quad \int_{\lambda}^{\lambda+T} \cos^2\left(k\frac{2\pi x}{T}\right)dx = \frac{T}{2}, \quad \int_{\lambda}^{\lambda+T} \mathrm{sen}^2\left(k\frac{2\pi x}{T}\right)dx = \frac{T}{2} $$

for any positive integer $k$, the coefficients of the basis function are

$$ a_0=\frac{2}{T}\int_{\lambda}^{\lambda+T} f(x) dx, \quad a_1=\frac{2}{T}\int_{\lambda}^{\lambda+T} f(x) \cos\left(\frac{2\pi x}{T}\right) dx, \quad b_1=\frac{2}{T}\int_{\lambda}^{\lambda+T} f(x)\mathrm{sen} \left(\frac{2\pi x}{T}\right) dx, \ldots, a_n=\frac{2}{T}\int_{\lambda}^{\lambda+T} f(x)\cos \left(n\frac{2\pi x}{T}\right) dx, \quad b_n=\frac{2}{T}\int_{\lambda}^{\lambda+T} f(x)\mathrm{sen} \left(n\frac{2\pi x}{T}\right) dx. $$

Exercise 5

  • Plot the fith order Fourier series for the function of period $T=3$ $f(x)=x$ in interval $[0,3]$.
  • Plot the sixth order Fourier series for the function of period $T=2\pi$ $f(x)=(x-\pi)^2$ in interval $[0,2\pi]$.
In [6]:
%run Exercise5.py