DNFFT

 


   more...

Computes Gaussian kernel estimates of a univariate density via the fast Fourier transform over a fixed interval.

Required Arguments

X — Vector of length NOBS containing the data for which a univariate density estimate is desired. (Input)
X is not referenced and may be dimensioned of length 1 in the calling program if IFFT = 1.

BNDS — Vector of length 2 containing the minimum and maximum values of X at which the density is to be estimated. (Input)
Observations less than BNDS(1) or greater than BNDS(2) are ignored. If either range of the hypothesized density is infinite, a value equal to the smallest observation minus 3 * WINDOW is a good choice for BNDS(1), and a value equal to the largest observation plus 3 * WINDOW is a good choice for BNDS(2). Let STEP = (BNDS(2)  BNDS(1))/(NXPT  1), and note that the density is estimated at the points BNDS(1) + i STEP where i = 0, 1, NXPT  1. The density is assumed constant over the interval from BNDS(1) + i * STEP to BNDS(1) + (i + 1) * STEP.

WINDOW — Window width for the kernel function. (Input)
Generally, several different values for WINDOW should be tried. When several different values are tried, use the IFFT option.

COEF — Vector of length NXPT containing the Fourier coefficients. (Input, if IFFT = 1; output, otherwise)

DENS — Vector of length NXPT containing the density estimates. (Output)
The density is estimated at the points BNDS(1) + i * STEP, i = 0, 1, NXPT  1, where STEP = (BNDS (2)  BNDS(1))/(NXPT  1).

Optional Arguments

NOBS — Number of observations. (Input)
Default: NOBS = size (X,1).

FRQ — Vector of length NOBS containing the frequency of the corresponding element of X. (Input)
If FRQ(1) is  1.0, then the vector FRQ is not used and all frequencies are taken to be one. FRQ is also not used if IFFT = 1. In either case, FRQ may be dimensioned of length 1 in the calling program.
Default: FRQ(1) = –1.0.

IFFT — Fourier transform option parameter. (Input)
If IFFT = 1, then COEF contains the Fourier coefficients on input, and the coefficients are not computed. Otherwise, the coefficients are computed. This option is used when several different values for WINDOW are to be tried. On the first call to DNFFT, IFFT = 0 and the coefficients COEF are computed. On subsequent calls, these coefficients do not need to be recomputed (but only if NXPT also remains fixed).
Default: IFFT = 0.

NXPT — Number of equally‑spaced points points at which the density is to be estimated. (Input)
Routine DNFFT is most efficient when NXPT is a power of 2. Little efficiency is lost if NXPT is a product of small primes. Because of the method of estimation, NXPT should be large, say greater than 64.
Default: NXPT = 128.

NRMISS — Number of rows of data that contain missing values in X or FRQ. (Output) NRMISS is not referenced if IFFT = 1.

FORTRAN 90 Interface

Generic: CALL DNFFT (X, BNDS, WINDOW, COEF, DENS [])

Specific: The specific interface names are S_DNFFT and D_DNFFT.

FORTRAN 77 Interface

Single: CALL DNFFT (NOBS, X, FRQ, BNDS, WINDOW, IFFT, NXPT, COEF, DENS, NRMISS)

Double: The double precision name is DDNFFT.

Description

Routine DNFFT computes Gaussian kernel estimates of the density function for a random sample of (scalar‑valued) observations using a Gaussian kernel (normal density). The computations are comparatively fast because they are performed through the use of the fast Fourier transform. Routine DESKN should be used in place of DNFFT if a kernel other than the Gaussian kernel is to be used, if a irregular grid is desired, or if the approximations in DNFFT are not acceptable. Because of its speed, DNFFT will usually be preferred to DESKN.

A Gaussian kernel estimate of the density at the point y is given by:

 

where

 

is the estimated density at yxi denotes the i‑th observation, n is the number of observations, and h is a fixed constant called the window width supplied by the user. If density estimates for several different window sizes are to be computed, then DNFFT performs a fast Fourier transform on the data only during the first call (when IFFT is zero). On subsequent calls (with IFFT set at 1), the Fourier transform of the data need not be recomputed.

If the same value of NXPT is to be used with several different input vectors X, then the computations can be made faster by the use of D2FFT. In D2FFT, it is assumed that some constants required by the Fourier transform and its inverse have already been computed via routine FFTRI (IMSL MATH/LIBRARY) in work array WFFTR. If D2FFT is called repeatedly with the same value of NXPT, WFTTR need only be computed once.

Routine DNFFT is an implementation of Applied Statistics algorithm AS 176 (Silverman 1982) using IMSL routines for the fast Fourier transforms. Modification to algorithm AS 176, as discussed in Silverman (1986, pages 61–66), gives the details of the computational method. The basic idea is to partition the support of the density into NXPT equally‑sized nonoverlapping intervals. The frequency of the observations within each interval is then computed, and the Fourier transform of the frequencies obtained. Since the kernel density estimate is the convolution of the frequencies with the Gaussian kernel (for given window size), the Fourier coefficients for the Gaussian kernel density estimates are computed as the product of the coefficients obtained for the frequencies, times the Fourier coefficients for the Gaussian kernel function. The discrete Fourier coefficients for the Gaussian kernel may be estimated from the continuous transform. The inverse transform is then used to to obtain the density estimates.

Because the fast Fourier transform is used in computing

 

the computations are relatively fast (providing that NXPT is a product of small primes). To maintain precision, a large number of intervals, say 256, is usually recommended. Tapia and Thompson (1978), Chapter 2, give some considerations relevant to the choice of the window size parameter WINDOW. Generally, several different window sizes should be tried in order to obtain the best value for this parameter.

Comments

1. Workspace may be explicitly provided, if desired, by use of D2FFT/DD2FFT. The reference is:

CALL D2FFT (NOBS, X, FRQ, BNDS, WINDOW, IFFT, NXPT, COEF, DENS, NRMISS, WFFTR)

The additional argument is:

WFFTR – Work vector of length 2 * NXPT + 15. See Comment 3. (Input)

2. Informational errors

 

Type

Code

Description

4

1

The sum of the frequencies must be positive.

4

2

Each frequency must be nonnegative.

4

3

There are no valid observations remaining after all missing values are eliminated.

3. WFFTR is computed in DNFFT. If D2FFT is to be called, WFFTR must first be computed via the following FORTRAN statement:

CALL FFTRI (NXPT, WFFTR)

If DD2FFT is used, call DFFTRI instead of FFTRI. WFFTR need not be recomputed between successive calls to D2FFT if NXPT does not change.

Example

In this example, the density function is estimated at 64 points using a random sample of 150 points from a standard normal distribution. The actual density for the standard normal density is also reported in the output for comparison. The random sample is generated using routines RNSET and RNNOR in Chapter 18, “Random Number Generation”.

 

USE RNSET_INT

USE RNNOR_INT

USE DNFFT_INT

USE UMACH_INT

 

IMPLICIT NONE

INTEGER NOBS, NXPT

REAL CONS, WINDOW

PARAMETER (CONS=0.39894228, NOBS=150, NXPT=64, WINDOW=0.25)

!

INTEGER I, NOUT

REAL BNDS(2), COEF(NXPT), DENS(NXPT), EXP, STEP, X(NOBS), XX

INTRINSIC EXP

!

DATA BNDS/-4.0, 3.875/

!

CALL RNSET (123457)

CALL RNNOR (X)

!

CALL DNFFT (X, BNDS, WINDOW, COEF, DENS, NXPT=NXPT)

!

CALL UMACH (2, NOUT)

WRITE (NOUT,99998)

99998 FORMAT (' X DENSITY POPULATION')

STEP = (BNDS(2)-BNDS(1))/(NXPT-1)

XX = BNDS(1)

DO 10 I=1, NXPT, 2

WRITE (NOUT,99999) XX, DENS(I), CONS*EXP(-XX*XX/2.0)

99999 FORMAT (F6.2, 2F8.4)

XX = XX + STEP*2.0

10 CONTINUE

!

END

Output

 

X DENSITY POPULATION

-4.00 0.0000 0.0001

-3.75 0.0000 0.0004

-3.50 0.0000 0.0009

-3.25 0.0000 0.0020

-3.00 0.0001 0.0044

-2.75 0.0011 0.0091

-2.50 0.0089 0.0175

-2.25 0.0345 0.0317

-2.00 0.0772 0.0540

-1.75 0.1204 0.0863

-1.50 0.1573 0.1295

-1.25 0.2076 0.1826

-1.00 0.2682 0.2420

-0.75 0.2987 0.3011

-0.50 0.2976 0.3521

-0.25 0.3072 0.3867

0.00 0.3336 0.3989

0.25 0.3458 0.3867

0.50 0.3169 0.3521

0.75 0.2834 0.3011

1.00 0.2683 0.2420

1.25 0.2242 0.1826

1.50 0.1557 0.1295

1.75 0.1182 0.0863

2.00 0.0946 0.0540

2.25 0.0569 0.0317

2.50 0.0199 0.0175

2.75 0.0033 0.0091

3.00 0.0002 0.0044

3.25 0.0000 0.0020

3.50 0.0000 0.0009

3.75 0.0000 0.0004