DNFFT
Computes Gaussian kernel estimates of a univariate density via the fast Fourier transform over a fixed interval.
Required Arguments
X — Vector of length NOBS containing the data for which a univariate density estimate is desired. (Input)
X is not referenced and may be dimensioned of length 1 in the calling program if IFFT = 1.
BNDS — Vector of length 2 containing the minimum and maximum values of X at which the density is to be estimated. (Input)
Observations less than BNDS(1) or greater than BNDS(2) are ignored. If either range of the hypothesized density is infinite, a value equal to the smallest observation minus 3 * WINDOW is a good choice for BNDS(1), and a value equal to the largest observation plus 3 * WINDOW is a good choice for BNDS(2). Let STEP = (BNDS(2) ‑ BNDS(1))/(NXPT ‑ 1), and note that the density is estimated at the points BNDS(1) + i STEP where i = 0, 1, …, NXPT ‑ 1. The density is assumed constant over the interval from BNDS(1) + i * STEP to BNDS(1) + (i + 1) * STEP.
WINDOW — Window width for the kernel function. (Input)
Generally, several different values for WINDOW should be tried. When several different values are tried, use the IFFT option.
COEF — Vector of length NXPT containing the Fourier coefficients. (Input, if IFFT = 1; output, otherwise)
DENS — Vector of length NXPT containing the density estimates. (Output)
The density is estimated at the points BNDS(1) + i * STEP, i = 0, 1, …, NXPT ‑ 1, where STEP = (BNDS (2) ‑ BNDS(1))/(NXPT ‑ 1).
Optional Arguments
NOBS — Number of observations. (Input)
Default: NOBS = size (X,1).
FRQ — Vector of length NOBS containing the frequency of the corresponding element of X. (Input)
If FRQ(1) is ‑ 1.0, then the vector FRQ is not used and all frequencies are taken to be one. FRQ is also not used if IFFT = 1. In either case, FRQ may be dimensioned of length 1 in the calling program.
Default: FRQ(1) = –1.0.
IFFT — Fourier transform option parameter. (Input)
If IFFT = 1, then COEF contains the Fourier coefficients on input, and the coefficients are not computed. Otherwise, the coefficients are computed. This option is used when several different values for WINDOW are to be tried. On the first call to DNFFT, IFFT = 0 and the coefficients COEF are computed. On subsequent calls, these coefficients do not need to be recomputed (but only if NXPT also remains fixed).
Default: IFFT = 0.
NXPT — Number of equally‑spaced points points at which the density is to be estimated. (Input)
Routine DNFFT is most efficient when NXPT is a power of 2. Little efficiency is lost if NXPT is a product of small primes. Because of the method of estimation, NXPT should be large, say greater than 64.
Default: NXPT = 128.
NRMISS — Number of rows of data that contain missing values in X or FRQ. (Output) NRMISS is not referenced if IFFT = 1.
FORTRAN 90 Interface
Generic: CALL DNFFT (X, BNDS, WINDOW, COEF, DENS [, …])
Specific: The specific interface names are S_DNFFT and D_DNFFT.
FORTRAN 77 Interface
Single: CALL DNFFT (NOBS, X, FRQ, BNDS, WINDOW, IFFT, NXPT, COEF, DENS, NRMISS)
Double: The double precision name is DDNFFT.
Description
Routine
DNFFT computes Gaussian kernel estimates of the density function for a random sample of (scalar‑valued) observations using a Gaussian kernel (normal density). The computations are comparatively fast because they are performed through the use of the fast Fourier transform. Routine
DESKN should be used in place of
DNFFT if a kernel other than the Gaussian kernel is to be used, if a irregular grid is desired, or if the approximations in
DNFFT are not acceptable. Because of its speed,
DNFFT will usually be preferred to
DESKN.
A Gaussian kernel estimate of the density at the point y is given by:
where
is the estimated density at y, xi denotes the i‑th observation, n is the number of observations, and h is a fixed constant called the window width supplied by the user. If density estimates for several different window sizes are to be computed, then DNFFT performs a fast Fourier transform on the data only during the first call (when IFFT is zero). On subsequent calls (with IFFT set at 1), the Fourier transform of the data need not be recomputed.
If the same value of NXPT is to be used with several different input vectors X, then the computations can be made faster by the use of D2FFT. In D2FFT, it is assumed that some constants required by the Fourier transform and its inverse have already been computed via routine FFTRI (IMSL MATH/LIBRARY) in work array WFFTR. If D2FFT is called repeatedly with the same value of NXPT, WFTTR need only be computed once.
Routine DNFFT is an implementation of Applied Statistics algorithm AS 176 (Silverman 1982) using IMSL routines for the fast Fourier transforms. Modification to algorithm AS 176, as discussed in Silverman (1986, pages 61–66), gives the details of the computational method. The basic idea is to partition the support of the density into NXPT equally‑sized nonoverlapping intervals. The frequency of the observations within each interval is then computed, and the Fourier transform of the frequencies obtained. Since the kernel density estimate is the convolution of the frequencies with the Gaussian kernel (for given window size), the Fourier coefficients for the Gaussian kernel density estimates are computed as the product of the coefficients obtained for the frequencies, times the Fourier coefficients for the Gaussian kernel function. The discrete Fourier coefficients for the Gaussian kernel may be estimated from the continuous transform. The inverse transform is then used to to obtain the density estimates.
Because the fast Fourier transform is used in computing
the computations are relatively fast (providing that NXPT is a product of small primes). To maintain precision, a large number of intervals, say 256, is usually recommended. Tapia and Thompson (1978), Chapter 2, give some considerations relevant to the choice of the window size parameter WINDOW. Generally, several different window sizes should be tried in order to obtain the best value for this parameter.
Comments
1. Workspace may be explicitly provided, if desired, by use of D2FFT/DD2FFT. The reference is:
CALL D2FFT (NOBS, X, FRQ, BNDS, WINDOW, IFFT, NXPT, COEF, DENS, NRMISS, WFFTR)
The additional argument is:
WFFTR – Work vector of length 2 * NXPT + 15. See Comment 3. (Input)
2. Informational errors
Type | Code | Description |
---|
4 | 1 | The sum of the frequencies must be positive. |
4 | 2 | Each frequency must be nonnegative. |
4 | 3 | There are no valid observations remaining after all missing values are eliminated. |
3. WFFTR is computed in DNFFT. If D2FFT is to be called, WFFTR must first be computed via the following FORTRAN statement:
CALL FFTRI (NXPT, WFFTR)
If DD2FFT is used, call DFFTRI instead of FFTRI. WFFTR need not be recomputed between successive calls to D2FFT if NXPT does not change.
Example
In this example, the density function is estimated at 64 points using a random sample of 150 points from a standard normal distribution. The actual density for the standard normal density is also reported in the output for comparison. The random sample is generated using routines
RNSET and
RNNOR in
Chapter 18, “Random Number Generation”.
USE RNSET_INT
USE RNNOR_INT
USE DNFFT_INT
USE UMACH_INT
IMPLICIT NONE
INTEGER NOBS, NXPT
REAL CONS, WINDOW
PARAMETER (CONS=0.39894228, NOBS=150, NXPT=64, WINDOW=0.25)
!
INTEGER I, NOUT
REAL BNDS(2), COEF(NXPT), DENS(NXPT), EXP, STEP, X(NOBS), XX
INTRINSIC EXP
!
DATA BNDS/-4.0, 3.875/
!
CALL RNSET (123457)
CALL RNNOR (X)
!
CALL DNFFT (X, BNDS, WINDOW, COEF, DENS, NXPT=NXPT)
!
CALL UMACH (2, NOUT)
WRITE (NOUT,99998)
99998 FORMAT (' X DENSITY POPULATION')
STEP = (BNDS(2)-BNDS(1))/(NXPT-1)
XX = BNDS(1)
DO 10 I=1, NXPT, 2
WRITE (NOUT,99999) XX, DENS(I), CONS*EXP(-XX*XX/2.0)
99999 FORMAT (F6.2, 2F8.4)
XX = XX + STEP*2.0
10 CONTINUE
!
END
Output
X DENSITY POPULATION
-4.00 0.0000 0.0001
-3.75 0.0000 0.0004
-3.50 0.0000 0.0009
-3.25 0.0000 0.0020
-3.00 0.0001 0.0044
-2.75 0.0011 0.0091
-2.50 0.0089 0.0175
-2.25 0.0345 0.0317
-2.00 0.0772 0.0540
-1.75 0.1204 0.0863
-1.50 0.1573 0.1295
-1.25 0.2076 0.1826
-1.00 0.2682 0.2420
-0.75 0.2987 0.3011
-0.50 0.2976 0.3521
-0.25 0.3072 0.3867
0.00 0.3336 0.3989
0.25 0.3458 0.3867
0.50 0.3169 0.3521
0.75 0.2834 0.3011
1.00 0.2683 0.2420
1.25 0.2242 0.1826
1.50 0.1557 0.1295
1.75 0.1182 0.0863
2.00 0.0946 0.0540
2.25 0.0569 0.0317
2.50 0.0199 0.0175
2.75 0.0033 0.0091
3.00 0.0002 0.0044
3.25 0.0000 0.0020
3.50 0.0000 0.0009
3.75 0.0000 0.0004