Curve-Fitting

Curve Fitting - Interpolation.

Cubic Spline Interpolation

This section will derive the equations of a cubic spline to interpolate a set of data points P1, P2, ..., Pn. In two dimensions, the point Pi has coordinates xi and yi. In three dimensions, it has coordinates xi, yi and zi. The working below assumes the two dimensional case and fits y=f(t) where t depends linearly on x. It is straightforward to adjust this to the case where we have two sets of cubics, one y=f(t) and the other z=g(t) or to the case where x,y,and z are all represented by sets of cubics in some other parameter.

Consider the interval from Pi to P{i+1} and use a local variable t (with t=0 at Pi) so that y may be expressed in the form
y = at^{3} + bt^{2} +ct + d
then
dy/dt = 3at^{2} + 2bt + c
and
d^{2}y/dt^{2} = 6at + 2b
Let the first derivative have the value f(i) at the point Pi. Then at the point Pi, t=0 and so
y(i) = d
f(i) = c
Now consider the value of t at the point P{i+1}. It would simplify the algebra if t was equal to 1 at this point and it is possible to find equations for t which make this true for each of our curves.

If the equation is y = f(x), then the mapping from x to the local variable t is given by
t = (x-x{i})/(x{i+1}-x{i})
Substituting x=x{i} gives t=0 and substituting x=x{i+1} gives t=1, so this does provide the required mapping. This will be satisfactory unless the intervals are very different in length. Problems sometimes arise because with this change of variable, you are fitting to ensure continuity of the derivatives with respect to t not x and the scaling of the x-axis to make all the intervals of unit length will change the gradients and hence the second derivatives as well. If the intervals are nearly equal in length, then so is this scaling factor and a fitting to make the second derivative with respect to t continuous will be close to the same condition for x and will provide an acceptable spline fit.

If, on the other hand, the equation is y = f1(s) and s=i-1 at the points Pi, then t= s -i+ 1 at the points Pi. Now it is the lengths along the curve for each interval that must not vary too greatly. The same condition applies to the next example. If s is the arc length and has values si at Pi and s{i+1} at P{i+1} at the two end points, then the required equation is
t = (s - si)/(s{i+1} - si)
So for any of these cases, it is possible to choose a local variable t which varies from 0 to 1 over the interval.

So if t=1 at the point P(i+1),
y(i+1) = a + b + c + d
f(i+1) = 3a + 2b + c
These equations may be rearranged to give
a = f{i+1} + f{i} - 2y{i+1} + 2yi
b = 3y{i+1}-3yi - f{i+1} - 2f{i }
c = f{i }
d = yi
which shows that the the coefficients are completely known once the values of f{i} have been calculated.

To obtain a recurrance relation, and hence a set of simultaneous linear equations, it is necessary to use the remaining condition of continuity of the second derivative.

"At the knot-point Pi, the value of the second derivative for the left-hand curve (over the interval P{i-1} to Pi) is equal to the value of this derivative for the right-hand curve (over the interval Pi to P{i+1})".

Considering the left-hand curve, its local variable t will be equal to 1 and its coefficients A and B will give
6A + 2B = 6(f{i}+f{i-1}-2yi+2y{i-1}) + 2(3yi-3y{i-1}-f{i}-2f{i-1})
= 2(2f{i} + f{i-1} -3yi +3y{i-1})

Considering the right-hand curve, its local variable t will be equal to 0 and its coefficient b will give
2b = 2(3y{i+1}-3yi-f{i+1}-2f{i})
Since these two expressions are equal,
2f{i}+f{i-1}-3yi+3y{i-1} = 3y{i+1}-3yi -f{i+1} -2f{i}
which gives
f{i+1} + 4f{i} + f{i-1} = 3( y{i+1}-y{i-1} )

and this is true for i = 2,3,4,....,n-1, giving a set of n-2 equations connecting the n unknowns f{i}. Two end-conditions are necessary to complete the set and they usually take the form
f{1} + A f{2} = 3B
C f{n-1} + f{n} = 3D
The easiest choice is to set A and C to zero and insert estimates of f{1} and f{n} to give the end conditions. This gives the set of equations:

1   0   0   0   .   .   .   0 	f{1}	   	B
1   4   1   0   .   .   .   0  	f{2}		y3-y1
0   1   4   1   .   .   .   0  	f{3}   = 3 	y4-y2
.     .     .    .               . 
.     .     .    .               . 
0   0   0   0   .   .   .   1 	f{n}	  	D

which may be solved by a standard method such as Gaussian Elimination. Once the values of f{i} have been calculated, then the coefficients ai, bi, ci and di for each interval may be calculated and hence the values of y along the curve for this interval.

If you are using the form y = f1(s) and x = f2(s), then a second set of splines must be calculated to give the values of x in the same way.

This has the advantage of applying the same set of software to all possible examples. However in the simple cases where Polyline would have been adequate, it is slightly wasteful of computer time.

Gaussian Elimination Method

For the full discussion of this method, students are recommended to consult one of the many Numerical Analysis textbooks currently in print or read the description given in the NAG library. However for the simple case of a tri-diagonal matrix, which is the form needed for the cubic spline, a simple example is appended here. Consider the 4x4 set of equations:

1    0.5    0     0 	f{1}		0.6
1     4      1     0 	f{2}   =	2.4
0     1      4     1 	f{3}		2.8
0     0     0.2   1 	f{4}		0.5

The aim of the Gaussian Elimination method is to obtain a new set of equations with the same solution as the previous set. This is obtained by using row operations as follows:

1. Calculate the factor F1 = A(2,1)/A(1,1) = 1.0 in this case.
Then carry out the row operation:
new row 2 = old row 2 - F1 * row 1
This gives the new value of row 2 as

	0    3.5    1    0    1.8

2. Calculate the factor F2 = A(3,2)/A(2,2) = 0.286
then: new row 3 = old row 3 - F2 * row 2
This gives the new value of row 3 as

	0     0    3.714     1    2.286

3. Finally calculate the factor F3 = A(4,3)/A(3,3) = 0.054
then: new row 4 = old row 4 - F3 * row 3
Giving the new value of row 4 as

	0    0    0    0.946    0.349

So the new set of equations has the form
	1    0.5    0         0      0.6  
	0    3.5    1         0      1.8    
	0     0     3.714   1      2.286     
	0     0      0      0.946  0.349

Solving this by back substitution, you take the last equation first.
0.946*f{4} = 0.349 or f{4} = 0.369

The next equation is
3.714*f{3} + f{4} = 2.286 or f{3} = (2.286-f{4})/3.714 = 0.516

The next equation is
f{2} = (1.8 - f{3})/3.5 = 0.369

and the final equation is
f{1} = (0.6 - 0.5*f{2})/1 = 0.415

which completes the solution of this set of equations. The same method can be extended to apply it to larger sets of equations.