empiricalQuantiles

Computes empirical quantiles.

Synopsis

empiricalQuantiles (x, qprop)

Required Arguments

float x[] (Input)
An array of length nObservations containing the data.
float qprop[] (Input)
An array of length nQprop containing the desired quantile proportions. Each value must lie in the interval (0,1).

Return Value

The function empiricalQuantiles returns an array of length nQprop containing the empirical quantiles corresponding to the input proportions in qprop.

Optional Arguments

nMissing (Output)
The number of missing values, if any, in x.
xlo (Output)
An array of length nQprop containing, for each proportion in qprop, the largest element of x that is less than or equal to the corresponding empirical quantile.
xhi (Output)
An array of length nQprop containing, for each proportion in qprop, the smallest element of x that is greater than or equal to the corresponding empirical quantile.

Description

The function empiricalQuantiles determines the empirical quantiles of the data in x for the proportions given in the vector qprop. It first checks whether x is sorted; if it is not, the routine performs either a complete or a partial sort, depending on how many order statistics are required to compute the requested quantiles.
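The idea behind a partial sort can be illustrated with NumPy's partition routine. This is only a minimal sketch under stated assumptions, not the library's implementation: the helper name order_statistics and its 1-based ranks argument are hypothetical, and the criterion the routine actually uses to choose between a complete and a partial sort is not documented here.

```python
import numpy as np


def order_statistics(x, ranks):
    """Return the order statistics of x at the given 1-based ranks.

    Hypothetical helper for illustration only; not part of PyIMSL.
    """
    arr = np.asarray(x, dtype=float)
    kth = np.asarray(ranks, dtype=int) - 1  # convert to 0-based positions
    if arr.size == 0 or kth.min() < 0 or kth.max() >= arr.size:
        raise ValueError("ranks must lie between 1 and len(x)")
    # np.partition puts the element at each requested position where it
    # would sit in a fully sorted array, without ordering the rest.
    partitioned = np.partition(arr, np.unique(kth).tolist())
    return partitioned[kth]
```

When only a few order statistics are needed, as for a handful of quantiles, a selection like this avoids ordering the whole sample; when many are needed, a complete sort is likely simpler.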

This function returns the empirical quantiles and, for each quantile, the two order statistics from the sample that bracket it: the largest sample value no greater than the quantile and the smallest sample value no less than the quantile. For a sample of size n, the quantile corresponding to the proportion p is defined as

\[Q(p) = (1-f) x_{(j)} + fx_{(j+1)}\]

where \(j=\lfloor p(n+1) \rfloor\), \(f=p(n+1)-j\), and \(x_{(j)}\) is the j-th order statistic. This formula applies when \(1\leq j<n\); otherwise, the empirical quantile is the smallest order statistic (when \(j<1\)) or the largest order statistic (when \(j\geq n\)).
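The definition can be written out directly in NumPy. The sketch below is an illustration under stated assumptions rather than the library's code: the function name empirical_quantiles_sketch is hypothetical, x is assumed to contain no missing values, and when f = 0 the upper bracketing value is taken to equal the lower one, which is an inference from the descriptions of xlo and xhi.

```python
import math

import numpy as np


def empirical_quantiles_sketch(x, qprop):
    """Return (quantiles, xlo, xhi) for the proportions in qprop.

    Hypothetical helper for illustration only; not part of PyIMSL.
    Assumes x is non-empty and has no missing values.
    """
    xs = np.sort(np.asarray(x, dtype=float))  # order statistics x_(1..n)
    n = xs.size
    quantiles, xlo, xhi = [], [], []
    for p in qprop:
        if not 0.0 < p < 1.0:
            raise ValueError("each proportion must lie in (0, 1)")
        h = p * (n + 1)
        j = math.floor(h)  # 1-based index of the lower order statistic
        f = h - j          # interpolation weight
        if j < 1:          # below the range: smallest order statistic
            value = lo = hi = xs[0]
        elif j >= n:       # above the range: largest order statistic
            value = lo = hi = xs[-1]
        else:
            lo, hi = xs[j - 1], xs[j]  # x_(j) and x_(j+1)
            value = (1.0 - f) * lo + f * hi
            if f == 0.0:   # assumption: quantile coincides with x_(j)
                hi = lo
        quantiles.append(value)
        xlo.append(lo)
        xhi.append(hi)
    return np.array(quantiles), np.array(xlo), np.array(xhi)
```

For a sample of size 30 and p = 0.5, this gives p(n+1) = 15.5, so j = 15 and f = 0.5, and the quantile is the average of the 15th and 16th order statistics.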

Example

In this example, five empirical quantiles from a sample of size 30 are obtained. Notice that the 0.5 quantile corresponds to the sample median. The data are from Hinkley (1977) and Velleman and Hoaglin (1981). They are the measurements (in inches) of precipitation in Minneapolis/St. Paul during the month of March for 30 consecutive years.

from __future__ import print_function
from numpy import array
from pyimsl.stat.empiricalQuantiles import empiricalQuantiles

x = array([
    0.77, 1.74, 0.81, 1.20, 1.95,
    1.20, 0.47, 1.43, 3.37, 2.20,
    3.00, 3.09, 1.51, 2.10, 0.52,
    1.62, 1.31, 0.32, 0.59, 0.81,
    2.81, 1.87, 1.18, 1.35, 4.75,
    2.48, 0.96, 1.89, 0.90, 2.05])

qprop = [0.01, 0.5, 0.9, 0.95, 0.99]

p_xlo = []
p_xhi = []

p_q = empiricalQuantiles(x, qprop,
                         xlo=p_xlo,
                         xhi=p_xhi)

print("          Smaller  Empirical  Larger")
print("Quantile   Datum    Quantile   Datum")
for i in range(len(qprop)):
    print("  %4.2f   %7.2f   %7.2f   %7.2f" %
          (qprop[i], p_xlo[i], p_q[i], p_xhi[i]))

Output

          Smaller  Empirical  Larger
Quantile   Datum    Quantile   Datum
  0.01      0.32      0.32      0.32
  0.50      1.43      1.47      1.51
  0.90      3.00      3.08      3.09
  0.95      3.37      3.99      4.75
  0.99      4.75      4.75      4.75
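
The Empirical Quantile column can be cross-checked without PyIMSL. The sketch below assumes NumPy 1.22 or later, where numpy.quantile accepts method="weibull"; that method interpolates at position p(n+1) and clamps to the smallest or largest observation outside the range, so it is expected to agree with the definition given in the Description. It does not reproduce the xlo and xhi outputs.

```python
import numpy as np

x = np.array([
    0.77, 1.74, 0.81, 1.20, 1.95,
    1.20, 0.47, 1.43, 3.37, 2.20,
    3.00, 3.09, 1.51, 2.10, 0.52,
    1.62, 1.31, 0.32, 0.59, 0.81,
    2.81, 1.87, 1.18, 1.35, 4.75,
    2.48, 0.96, 1.89, 0.90, 2.05])

qprop = [0.01, 0.5, 0.9, 0.95, 0.99]

# Assumption: method="weibull" matches the p(n+1) definition used above.
check = np.quantile(x, qprop, method="weibull")

for p, q in zip(qprop, check):
    print("  %4.2f   %7.2f" % (p, q))
```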