smooth1dData¶
Smooth one-dimensional data by error detection.
Synopsis¶
smooth1dData (xdata, fdata)
Required Arguments¶
- float
xdata[]
(Input) - Array with
ndata
components containing the abscissas of the data points. - float
ydata[]
(Input) - Array with
ndata
components containing the ordinates of the data points.
Return Value¶
The vector of length ndata
containing the smoothed data.
Optional Arguments¶
itmax
, int (Input)The maximum number of iterations allowed.
Default:
itmax
= 500distance
, float (Input)Proportion of the distance the ordinate in error is moved to its
interpolating curve. It must be in the range 0.0 to 1.0.
Default:
distance
= 1.0stoppingCriterion
, float (Input)The stopping criterion.
stoppingCriterion
should be greater than or equal to zero.Default:
stoppingCriterion
= 0.0
Description¶
The function smooth1dData
is designed to smooth a data set that is
mildly contaminated with isolated errors. In general, the routine will not
work well if more than 25% of the data points are in error. The routine
smooth1dData
is based on an algorithm of Guerra and Tapia (1974).
Setting ndata
= n, ydata
= f, sdata
= s and xdata
=
x, the algorithm proceeds as follows. Although the user need not input an
ordered xdata
sequence, we will assume that x is increasing for
simplicity. The algorithm first sorts the xdata
values into an
increasing sequence and then continues. A cubic spline interpolant is
computed for each of the 6-point data sets (initially setting s = f)
where \(i = 4, ..., n - 3\). For each i the interpolant, which we will call \(S_i\), is compared with the current value of \(s_i\), and a ‘point energy’ is computed as
Setting stoppingCriterion
= stoppingCriterion, the algorithm
terminates either if itmax
iterations have taken place or if
If the above inequality is violated for any i, then we update the i-th
element of s by setting \(s_i = s_i + d(pe_i)\), where d =
distance
. Note that neither the first three nor the last three data
points are changed. Thus, if these points are inaccurate, care must be taken
to interpret the results.
The choice of the parameters d, stoppingCriterion and itmax
are
crucial to the successful usage of this subroutine. If the user has specific
information about the extent of the contamination, then he should choose the
parameters as follows: \(d = 1\), \(stoppingCriterion = 0\) and
itmax
to be the number of data points in error. On the other hand, if no
such specific information is available, then choose \(d = 5\), itmax
≤ 2n, and
In any case, we would encourage the user to experiment with these values.
Example¶
We take 91 uniform samples from the function \(5 + (5 + t^2 \sin t)/t\) on the interval [1, 10]. Then, we contaminate 10 of the samples and try to recover the original function values.
from __future__ import print_function
from numpy import *
from pyimsl.math.smooth1dData import smooth1dData
def F(X):
return (X * X * sin(double(X)) + 5.0) / X + 5.0
ndata = 91
isub = (5, 16, 25, 33, 41, 48, 55, 61, 74, 82)
fdata = empty(ndata)
xdata = empty(ndata)
s_user = empty(ndata)
rnoise = (2.5, -3., -2., 2.5, 3., -2., -2.5, 2., -2., 3.)
# Example 1: No specific information available.
dis = .5
sc = .56
maxit = 182
# Set values for xdata and fdata.
xdata[0] = 1.
fdata[0] = F(xdata[0])
for i in range(1, ndata):
xdata[i] = xdata[i - 1] + .1
fdata[i] = F(xdata[i])
# Contaminate the data.
for i in range(0, 10):
fdata[isub[i]] += rnoise[i]
# Smooth the data.
sdata = smooth1dData(xdata, fdata,
distance=dis,
stoppingCriterion=sc,
itmax=maxit)
# Output the result.
print("\nCase A - No specific information available.")
print(" F(X) F(X)+noise sdata")
for i in range(0, 10):
print("%7.3f\t%15.3f\t%15.3f" %
(F(xdata[isub[i]]), fdata[isub[i]], sdata[isub[i]]))
# Example 2: No specific information is available.
dis = 1.0
sc = 0.0
maxit = 10
# A warning message is produced because the maximum
# number of iterations is reached.
# Smooth the data.
s_user = smooth1dData(xdata, fdata,
distance=dis,
stoppingCriterion=sc,
itmax=maxit)
# Output the result.
print("\nCase B - Specific information available.")
print(" F(X) F(X)+noise sdata")
for i in range(0, 10):
print("%7.3f\t%15.3f\t%15.3f"
% (F(xdata[isub[i]]), fdata[isub[i]], s_user[isub[i]]))
Output¶
Case A - No specific information available.
F(X) F(X)+noise sdata
9.830 12.330 9.870
8.263 5.263 8.215
5.201 3.201 5.168
2.223 4.723 2.264
1.259 4.259 1.308
3.167 1.167 3.138
7.168 4.668 7.131
10.880 12.880 10.909
12.774 10.774 12.708
7.594 10.594 7.639
***
*** Warning error issued from IMSL function smooth1dData:
*** Maximum number of iterations limit "itmax" = 10 exceeded. The best answer found is returned.
***
Case B - Specific information available.
F(X) F(X)+noise sdata
9.830 12.330 9.831
8.263 5.263 8.262
5.201 3.201 5.199
2.223 4.723 2.225
1.259 4.259 1.261
3.167 1.167 3.170
7.168 4.668 7.170
10.880 12.880 10.878
12.774 10.774 12.770
7.594 10.594 7.592