maxLikelihoodEstimates

Calculates maximum likelihood estimates (MLE) for the parameters of one of several univariate probability distributions.

Synopsis

maxLikelihoodEstimates (x, ipdf)

Required Arguments

float x[] (Input)
Array of length nObservations containing the data.
int ipdf (Input)

Specifies the probability density function.

Distribution               ipdf  nParameters  i  parameters[i]
Discrete uniform           0     1            0  scale - upper limit
Bernoulli                  1     1            0  probability of success (mean)
Binomial (1)               2     1            0  probability of success
Negative binomial (2)      3     1            0  probability of success
Poisson                    4     1            0  location (mean) - θ
Geometric                  5     1            0  probability of success
Continuous uniform         6     2            0  scale - lower boundary
                                              1  scale - upper boundary
Beta                       7     2            0  shape - p
                                              1  shape - q
Exponential                8     1            0  scale - b
Gamma                      9     2            0  shape - k
                                              1  scale - θ
Weibull                    10    2            0  scale - λ
                                              1  shape - k
Rayleigh                   11    1            0  scale - α
Extreme value              12    2            0  location - μ
                                              1  scale - σ
Generalized extreme value  13    3            0  location - μ
                                              1  scale - σ
                                              2  shape - β
Pareto                     14    2            0  scale (lower boundary) - \(x_m\)
                                              1  shape - k
Generalized Pareto         15    2            0  scale - σ
                                              1  shape - α
Normal                     16    2            0  location (mean) - μ
                                              1  scale (variance) - \(\sigma^2\)
Log-normal                 17    2            0  location (mean of log(x)) - μ
                                              1  scale (variance of log(x)) - \(\sigma^2\)
Logistic                   18    2            0  location (mean) - μ
                                              1  scale - s
Log-logistic               19    2            0  scale (exp(mean)) - \(e^\mu\)
                                              1  shape - β
Inverse Gaussian           20    2            0  location (mean) - μ
                                              1  shape - λ

Note: 1 ‑ The binomial distribution requires the optional argument numberOfTrials.

Note: 2 ‑ The negative binomial distribution requires the optional argument numberOfFailures.

Return Value

An array of length nParameters containing the parameter values (see ipdf table above).

Optional Arguments

printLevel, int (Input)

Printing option.

printLevel Action
0 No printing
1 Print final results only
2 Print intermediate and final results

Default: printLevel = 0.

nParameters (Output)
The number of parameters in the distribution specified by ipdf.
numberOfTrials, int (Input)

The number of trials. numberOfTrials is required for the binomial distribution (ipdf = 2).

Default: Not used, except for ipdf = 2.

numberOfFailures, int (Input)

The number of failures. numberOfFailures is required for the negative binomial distribution (ipdf = 3).

Default: Not used, except for ipdf = 3.

mloglike (Output)
Minus log-likelihood evaluated at the parameter estimates.
stdErrors (Output)
An array of length nParameters containing the standard errors of the parameter estimates.
hessian (Output)
An array of length nParameters × nParameters containing the Hessian matrix.
paramLb, float[] (Input)

Array of length nParameters containing the lower bounds of the parameters.

Exceptions paramLb
Extreme value distribution (ipdf = 12) paramLb[1] = 0.25, for the scale parameter
Generalized Pareto distribution (ipdf = 15) paramLb[1] = -5.0, for the shape parameter
Generalized extreme value distribution (ipdf = 13) paramLb[2] = -10.0, for the shape parameter

Default: The default lower bound depends on the range of the parameter. That is, if the range of the parameter is positive for the desired distribution, paramLb[i] = 0.01. If the range of the parameter is non-negative (≥ 0), then paramLb[i] = 0.0. If the range of the parameter is unbounded, then paramLb[i] = -10000.00.

paramUb, float[] (Input)

Array of length nParameters containing the upper bounds of the parameters.

Exceptions paramUb
Generalized Pareto distribution (ipdf = 15) paramUb[1] = 5.0, for the shape parameter
Generalized extreme value distribution (ipdf = 13) paramUb[2] = 10.0, for the shape parameter

Default: paramUb[i] = 10000.0.

initialEstimates, float[] (Input)

Array of length nParameters containing the initial estimates of the parameters.

Default: Method of moments estimates are used for initial estimates.

xscale, float[] (Input)

Array of length nParameters containing the scaling factors for the parameters. xscale is used in the optimization algorithm in scaling the gradient and the distance between two points.

Default: xscale[i] = 1.0.

maxIterations, int (Input)

Maximum number of iterations.

Default: maxIterations = 100.

maxFcn, int (Input)

Maximum number of function evaluations.

Default: maxFcn = 400.

maxGrad, int (Input)

Maximum number of gradient evaluations.

Default: maxGrad = 400.

Description

Function maxLikelihoodEstimates calculates maximum likelihood estimates for the parameters of a univariate probability distribution, where the distribution is one specified by ipdf and where the input data x is (assumed to be) a random sample from that distribution.

Let \(\{x_i,i=1,\ldots,N\}\) represent a random sample from a probability distribution with density function \(f (x | \theta)\), which depends on a vector \(\theta\in \Re^p\) containing the values of the parameters of the distribution. The values in θ are fixed but unknown and the problem is to find an estimate for θ given the sample data.

The likelihood function is defined to be the product

\[L\left(\theta | \left\{x_i; i=1, \ldots, N\right\}\right) = \prod_{i=1,\ldots,N} f\left(x_i | \theta\right)\]

The maximum likelihood estimator is

\[\begin{split}\begin{aligned} \hat{\theta}_{\mathit{MLE}} &= \arg\max_{\theta} L\left(\theta | \left\{x_1, x_2, \ldots, x_N\right\}\right) \\ &= \arg\max_{\theta} \prod_{i=1,\ldots,N} f\left(x_i | \theta\right) \\ &= \arg\max_{\theta} \sum_{i=1,\ldots,N} \log\left(f\left(x_i | \theta\right)\right) \end{aligned}\end{split}\]

That is, the value of θ that maximizes L also maximizes log L, and it is the maximum likelihood estimate (MLE) of θ.
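As a small illustration of the definitions above, the following sketch evaluates the negative log-likelihood for the exponential distribution with scale b (ipdf = 8 in the table) and compares its value at the sample mean, the well-known closed-form MLE for this case, with its value at two other scales. The helper name exp_neg_loglike and the sample values are made up for illustration; this is not the library's internal code.

```python
import math

def exp_neg_loglike(x, b):
    """Negative log-likelihood of an exponential sample x with scale b > 0."""
    # Density: f(x | b) = (1 / b) * exp(-x / b), so
    # -log L = n * log(b) + sum(x) / b
    return len(x) * math.log(b) + sum(x) / b

# Hypothetical data, for illustration only.
sample = [0.8, 1.9, 0.3, 2.4, 1.1, 0.5]

# For the exponential distribution the MLE of the scale is the sample mean.
b_hat = sum(sample) / len(sample)

for b in (0.5 * b_hat, b_hat, 2.0 * b_hat):
    print(f"b = {b:.4f}   -log L = {exp_neg_loglike(sample, b):.4f}")
```

The value printed for b_hat should be the smallest of the three, which is the same as saying that b_hat maximizes log L among the candidates tried.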

The likelihood problem is in general a constrained non-linear optimization problem, where the constraints are determined by the permissible range of θ. In some situations, the problem has a closed form solution. Otherwise, maxLikelihoodEstimates uses a quasi-Newton method to solve the likelihood problem. If optional argument initialEstimates is not supplied, method of moments estimates serve as starting values of the parameters. In some cases, method of moments estimators may not exist, such as when certain moments of the true distribution do not exist; thus it is possible that the starting values are not truly method of moments estimates.
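To make the idea of method of moments starting values concrete, here is a minimal sketch for the gamma distribution (ipdf = 9), using the textbook relations mean = kθ and variance = kθ². The function name gamma_moment_estimates and the data are hypothetical, and the formulas used internally by maxLikelihoodEstimates may differ.

```python
def gamma_moment_estimates(x):
    """Method of moments estimates [shape k, scale theta] for a gamma sample."""
    n = len(x)
    mean = sum(x) / n
    var = sum((xi - mean) ** 2 for xi in x) / (n - 1)  # sample variance
    k = mean ** 2 / var      # from mean = k * theta and var = k * theta**2
    theta = var / mean
    return [k, theta]

# Hypothetical data, for illustration only.
sample = [2.1, 3.4, 1.7, 4.9, 2.8, 3.3, 5.6, 2.2]
start = gamma_moment_estimates(sample)
print(start)
```

A list ordered like start (shape first, then scale, matching the ipdf table) is the kind of value the optional argument initialEstimates expects for ipdf = 9.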

Upper and lower bounds, when needed for the optimization, have default values for each selection of ipdf (defaults will vary depending on the allowable range of the parameters). It is possible that the optimization will fail. In such cases, the user may try adjusting upper and lower bounds using the optional arguments paramLb, paramUb, or adjusting up or down the scaling factors using optional argument xscale, which can sometimes help the optimization converge.

Standard errors and covariances are supplied, in most cases, using the asymptotic properties of ML estimators. Under some general regularity conditions, ML estimates are consistent and asymptotically normally distributed, with variance-covariance matrix equal to the inverse of the Fisher information matrix evaluated at the true value of the parameter, \(\theta_0\):

\[\mathit{Var}\left(\hat{\theta}\right) = I\left(\theta_0\right)^{-1} = -E \left[\frac{\partial^2 \log L}{\partial \theta^2}\right]_{\theta_0}^{-1}\]

maxLikelihoodEstimates approximates the asymptotic variance using the negative inverse Hessian evaluated at the ML estimate:

\[\mathit{Var}\left(\hat{\theta}\right) \approx -\left[\frac{\partial^2 \log L}{\partial \theta^2}\right] _{\theta = \hat{\theta}_{\mathit{MLE}}}^{-1}\]

The Hessian is approximated numerically for all but a few cases where it can be determined in closed form.
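The relation between the Hessian and the standard errors can be checked by hand. The sketch below takes the rounded 2 × 2 Hessian printed in Example 1, negates and inverts it, and takes square roots of the diagonal. It uses only the Python standard library; the variable names are illustrative.

```python
import math

# Hessian of the log-likelihood as printed (rounded) in Example 1.
hessian = [[-128.6, -8.3],
           [-8.3, -548.6]]

# Negate the Hessian, then invert the resulting 2x2 matrix in closed form.
a, b = -hessian[0][0], -hessian[0][1]
c, d = -hessian[1][0], -hessian[1][1]
det = a * d - b * c
cov = [[d / det, -b / det],
       [-c / det, a / det]]

# Standard errors are the square roots of the diagonal of the covariance.
std_errors = [math.sqrt(cov[0][0]), math.sqrt(cov[1][1])]
print(std_errors)
```

The results should be close to the standard errors 0.08821 and 0.04271 reported in Example 1; small differences are expected because the printed Hessian is rounded to one decimal place.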

In cases where the asymptotic result does not hold, standard errors may be available from the known sampling distribution. For example, the ML estimate of the Pareto scale (lower boundary) parameter \(x_m\) is the minimum of the sample. Its variance is estimated using the known sampling distribution of the minimum, or first order statistic, for the Pareto distribution.
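The Pareto case is one where the estimates have a closed form. The following sketch uses the standard textbook formulas: the sample minimum for the lower boundary and n / Σ log(x_i / x_m) for the shape. Only the first of these is stated in this document; the shape formula, the function name pareto_mle, and the data are assumptions made for illustration.

```python
import math

def pareto_mle(x):
    """Closed-form ML estimates [x_m, k] for a Pareto sample with all x > 0."""
    x_m = min(x)  # scale (lower boundary): the sample minimum
    # Shape: n divided by the sum of log ratios to the minimum.
    k = len(x) / sum(math.log(xi / x_m) for xi in x)
    return [x_m, k]

# Hypothetical data, for illustration only.
sample = [1.3, 2.7, 1.1, 4.2, 1.9, 1.5]
print(pareto_mle(sample))
```

Because the estimate of the lower boundary sits on the edge of the sample, the usual Hessian-based approximation does not apply to it, which is why a sampling-distribution argument is used instead.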

For further details regarding the properties of the estimators and the theory of the maximum likelihood method, see Kendall and Stuart (1979). The different probability distributions have wide coverage in the statistical literature. See Johnson and Kotz (1970a, 1970b, or later editions).

Parameter estimation (including maximum likelihood) for the generalized Pareto distribution is studied in Hosking and Wallis (1987) and Giles and Feng (2009), and estimation for the generalized extreme value distribution is treated in Hosking, Wallis, and Wood (1985).

Remarks

  1. The location parameter is not estimated for the generalized Pareto distribution (ipdf=15). Instead, the minimum of the sample is subtracted from each observation before the estimation procedure.
  2. Only the probability of success parameter is estimated for the binomial and negative binomial distributions (ipdf = 2, 3). The number of trials and the number of failures, respectively, must be provided using the optional arguments numberOfTrials and numberOfFailures.
  3. maxLikelihoodEstimates issues an error if missing or NaN values are encountered in the input data. Missing or NaN values should be removed before calling maxLikelihoodEstimates.
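Remark 3 can be handled with a short pre-processing step. The sketch below drops NaN entries, and also None entries on the assumption that missing values might be represented that way in the caller's data; the helper name drop_missing is hypothetical.

```python
import math

def drop_missing(x):
    """Return a copy of x without None or NaN entries."""
    return [xi for xi in x if xi is not None and not math.isnan(xi)]

# Hypothetical data containing one NaN and one None.
raw = [0.7, float("nan"), 1.2, None, 0.4]
clean = drop_missing(raw)
print(clean)  # [0.7, 1.2, 0.4]
```

The cleaned list can then be passed as x in place of the raw data.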

Examples

Example 1

The data are \(N=100\) observations generated from the logistic distribution with location parameter \(\mu=0.85\) and scale parameter \(\sigma=0.5\).

from pyimsl.stat.maxLikelihoodEstimates import maxLikelihoodEstimates

# ipdf = 18 selects the logistic distribution (see the ipdf table).
ipdf = 18
n_observations = 100  # number of values in x

x = [2.020394, 2.562315, -0.5453395, 1.258546, 0.7704533, 0.3662717,
     0.6885536, 2.619634, -0.49581, 2.972249, 0.5356222, 0.4262079,
     1.023666, 0.8286033, 1.319018, 2.123659, 0.3904647, -0.1196832,
     1.629261, 1.069602, 0.9438083, 1.314796, 1.404453, -0.5496156,
     0.8326595, 1.570288, 1.326737, 0.9619384, -0.1795268, 1.330161,
     -0.2916453, 0.7430826, 1.640854, 1.582755, 1.559261, 0.6177695,
     1.739638, 1.308973, 0.568709, 0.2587071, 0.745583, 1.003815,
     1.475413, 1.444586, 0.4515438, 1.264374, 1.788313, 1.062330,
     2.126034, 0.3626510, 1.365612, 0.5044735, 2.51385, 0.7910572,
     0.5932584, 1.140248, 2.104453, 1.345562, -0.9120445, 0.0006519341,
     1.049729, -0.8246097, 0.8053433, 1.493787, -0.5199705, 2.285175,
     0.9005916, 2.108943, 1.40268, 1.813626, 1.007817, 1.925250, 1.037391,
     0.6767235, -0.3574937, 0.696697, 1.104745, -0.7691124, 1.554932,
     2.090315, 0.60919, 0.4949385, -2.449544, 0.668952, 0.9480486,
     0.9908558, -1.495384, 2.179275, 0.1858808, -0.3715074, 0.1447150,
     0.857202, 1.805844, 0.405371, 1.425935, 0.3187476, 1.536181,
     -0.6352768, 0.5692068, 1.706736]

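# Empty lists passed as optional arguments receive the corresponding outputs.
p_hess = []    # Hessian matrix
p_se = []      # standard errors of the estimates
mloglike = []  # minus log-likelihood at the estimates

# printLevel=2 prints intermediate and final results.
param = maxLikelihoodEstimates(x, ipdf,
                               printLevel=2,
                               hessian=p_hess,
                               stdErrors=p_se,
                               mloglike=mloglike)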

Output

Maximum likelihood estimation for the logistic distribution
Starting Estimates:    0.90677     0.51128

Initial -log-likelihood:  132.75303


-log-likelihood  132.61489

MLE for parameter         1      0.95323

MLE for parameter         2      0.50951

Std error for parameter   1      0.08821

Std error for parameter   2      0.04271

 
          Hessian
             1            2
1       -128.6         -8.3
2         -8.3       -548.6

Example 2

The data are \(N=100\) observations generated from the generalized extreme value distribution with location parameter \(\mu=0\), scale parameter \(\sigma=1.0\), and shape parameter \(\xi=-0.25\).

from pyimsl.stat.maxLikelihoodEstimates import maxLikelihoodEstimates

# ipdf = 13 selects the generalized extreme value distribution.
ipdf = 13
n_observations = 100  # number of values in x

x = [0.7688048, 0.1944504, -0.2992029, -0.3853738,
     -1.185593, 0.3056149, -0.4407711, 0.5001115,
     0.3635027, -1.058632, -0.2927695, -0.3205969,
     0.03367599, 0.8850839, 1.860485, 0.4841038,
     0.5421101, 1.883694, 1.707392, 0.2166106,
     1.537204, 1.340291, 0.4589722, 1.616080,
     -0.8389288, 0.7057426, 1.532988, 1.161350,
     0.9475416, 0.4995294, -0.2392898, 0.8167126,
     0.992479, -0.8357962, -0.3194499, 1.233603,
     2.321555, -0.3715629, -0.1735171, 0.4624801,
     -0.6249577, 0.7040129, -0.3598889, 0.7121399,
     -0.5178735, -1.069429, 0.7169358, 0.4148059,
     1.606248, -0.4640152, 1.463425, 0.9544342,
     -1.383239, 0.1393160, 0.622689, 0.365793,
     0.7592438, 0.810005, 0.3483791, 2.375727,
     -0.08124195, -0.4726068, 0.1496043, 0.4961212,
     1.532723, -0.1106993, 1.028553, 0.856018,
     -0.6634978, 0.3573150, 0.06391576, 0.3760349,
     -0.5998756, 0.4158309, -0.2832369, -1.023551,
     1.116887, 1.237714, 1.900794, 0.6010037,
     1.599663, -0.3341879, 0.5278575, 0.5497694,
     0.6392933, 0.592865, 1.646261, -1.042950,
     -1.113611, 1.229645, 1.655998, 0.6913992,
     0.4548073, 0.4982649, -1.073640, -0.4765107,
     -0.8692533, -0.8316462, -0.03609102, 0.655814]

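# Empty lists passed as optional arguments receive the corresponding outputs.
p_hess = []    # Hessian matrix
p_se = []      # standard errors of the estimates
mloglike = []  # minus log-likelihood at the estimates

# printLevel=2 prints intermediate and final results.
param = maxLikelihoodEstimates(x, ipdf,
                               printLevel=2,
                               hessian=p_hess,
                               stdErrors=p_se,
                               mloglike=mloglike)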

Output

Maximum likelihood estimation for the generalized extreme value distribution
Starting Estimates:   -0.00888     0.67451     0.00000

Initial -log-likelihood:  135.43817


-log-likelihood  126.09405

MLE for parameter         1      0.07529

MLE for parameter         2      0.85115

MLE for parameter         3     -0.27964

Std error for parameter   1      0.09474

Std error for parameter   2      0.06823

Std error for parameter   3      0.07030

 
                 Hessian
             1            2            3
1       -141.7        -56.1       -109.0
2        -56.1       -353.7       -234.1
3       -109.0       -234.1       -396.1

Warning Errors

IMSLS_HESSIAN_NOT_CALCULATED The Hessian is not calculated for the requested distribution.
IMSLS_HESSIAN_NOT_USED The Hessian is not used to calculate the standard errors of the estimates for the # distribution.
IMSLS_HESSIAN_NOT_CALC_2 For the Pareto distribution, the Hessian cannot be calculated because the parameter estimate is 0.