
Package com.imsl.stat.distributions


Probability distributions and parameter estimation. This package contains common univariate probability distributions and methods for parameter estimation. The classes BetaPD, GammaPD, NormalPD and others extend the abstract class ProbabilityDistribution. The class MaximumLikelihoodEstimation performs maximum likelihood estimation on subclasses of ProbabilityDistribution.

Random variables and probability distributions

A random variable \( X \) is a real-valued function defined on the set of potential outcomes of an experiment. The set of all potential outcomes is known as the sample space. The random variable \( X \) assigns a real number to the elements and subsets of the sample space according to a probability measure or law.

For example, suppose the experiment is one flip of a fair coin. Since the possible outcomes are heads (H) or tails (T), the sample space \( \mathcal{C} \) is the set \( \{H, T\} \). Now, define the random variable to be the number of heads. Then \( X \) has the range \( \{0, 1\} \), and because the coin is fair, the probability of each outcome is $$ Pr[X=0] = Pr[X=1] = \frac{1}{2} $$ If the experiment is to flip the coin twice, the sample space is \( \{HH, HT, TH, TT\} \). Because the coin is fair and the flips do not affect one another, the probability of each event \( c \in \mathcal{C} = \{HH, HT, TH, TT\} \) is given by the probability function $$ P(c) = \frac{1}{2}\cdot\frac{1}{2} = \frac{1}{4}, \quad c \in \mathcal{C} $$ It follows that the random variable \( X \), defined as the number of heads in two tosses, has the range \( \{0, 1, 2\} \) and the probability distribution $$ \begin{array}{l} p(0) = Pr[X=0] = P(TT) = \frac{1}{4} \\ p(1) = Pr[X=1] = P(HT) + P(TH) = \frac{1}{2} \\ p(2) = Pr[X=2] = P(HH) = \frac{1}{4} \end{array} $$

Because the sample space is finite, the range of the random variable is also finite. Often the sample space of an experiment is not finite, and the range of the random variable may not be finite either. It may still be discrete, such as the set of natural numbers \( \{1, 2, 3, \ldots\} \) or the integers \( \{\ldots, -2, -1, 0, 1, 2, \ldots\} \). For infinite sample spaces that are not discrete, the range of \( X \) is a continuous subset of the real numbers. Measurements of temperature, weight, water level, and financial quantities, to name a few, typically have continuous ranges.

The probability density function is the function on the range of \( X \) that defines the probabilities. For a discrete random variable, $$ p(x) = Pr[X = x] \ge 0 $$ and $$ \sum_{x \in \mathcal{A}} p(x) = 1 $$ where \( \mathcal{A} \) denotes the range of \( X \). For a continuous random variable, the probability density function (pdf) is the function \( f \) such that $$ Pr[X \in A] = \int_{A} f(x)\,dx $$

The cumulative distribution function (cdf) of the random variable \( X \), often denoted by \( F \), is defined by $$ F(x) = Pr[X \le x] = \sum_{x_i \le x} p(x_i) $$ in the case of a discrete random variable, or $$ F(x) = Pr[X \le x] = \int_{-\infty}^{x} f(t)\,dt $$ for continuous \( X \). See Hogg & Craig (1978, 4th or later editions) for a thorough introduction to random variables and probability distributions.
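
As an illustration of these definitions, the short sketch below (plain Java, independent of the classes in this package) tabulates the probability function \( p(x) \) and the discrete cdf \( F(x) \) for the coin example above, where \( X \) is the number of heads in two flips of a fair coin.

// Tabulate p(x) and F(x) for X = number of heads in two fair coin flips.
public class CoinFlipDistribution {
    public static void main(String[] args) {
        // p(0) = P(TT), p(1) = P(HT) + P(TH), p(2) = P(HH)
        double[] p = {0.25, 0.50, 0.25};

        // Discrete cdf: F(x) is the sum of p(x_i) over all x_i <= x.
        double cumulative = 0.0;
        for (int x = 0; x < p.length; x++) {
            cumulative += p[x];
            System.out.printf("x = %d   p(x) = %.2f   F(x) = %.2f%n", x, p[x], cumulative);
        }
        // Prints:
        // x = 0   p(x) = 0.25   F(x) = 0.25
        // x = 1   p(x) = 0.50   F(x) = 0.75
        // x = 2   p(x) = 0.25   F(x) = 1.00
    }
}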

Parameter estimation and maximum likelihood

Suppose we have a random sample \( \{x_i,\ i=1,2,\ldots,N\} \) from a probability distribution having a density function \( f(x;\theta) \) that depends on a vector of unknown parameters \( \theta \). The likelihood function given the sample is the product of the probability densities evaluated at the sample points: $$ L(\theta; \{x_i,\ i=1,2,\ldots,N\}) = \prod_{i=1}^{N} f(x_i;\theta) $$ The estimator $$ \hat{\theta} = \text{argmax}_{\theta}\, L(\theta; \{x_i\}) $$ is the maximum likelihood estimator (MLE) for \( \theta \).

The problem is usually expressed in terms of the log-likelihood: $$ \hat{\theta} = \text{argmax}_{\theta}\, \log L(\theta; \{x_i\}) = \text{argmax}_{\theta} \sum_{i=1}^{N} \log f(x_i;\theta) $$ Or, equivalently, as a minimization problem: $$ \hat{\theta} = \text{argmin}_{\theta}\left(-\sum_{i=1}^{N} \log f(x_i;\theta)\right) $$

The likelihood problem is a constrained nonlinear optimization problem, where the constraints are determined by the domain of \( \theta \). Numerical optimization is usually successful in solving the likelihood problem for densities having first and second partial derivatives with respect to \( \theta \). Furthermore, under some general regularity conditions, the maximum likelihood estimator is consistent and asymptotically normally distributed, with mean equal to the true value of the parameter \( \theta_0 \) and variance-covariance matrix equal to the inverse of the Fisher information matrix evaluated at the true value of the parameter: $$ \text{Var}(\hat{\theta}) = I(\theta_0)^{-1} = \left(-E_{\theta_0}\left[\frac{\partial^2 \log L}{\partial\theta^2}\right]\right)^{-1} $$ In practice, the variance is approximated by the negative inverse of the Hessian of the log-likelihood evaluated at the maximum likelihood estimate: $$ \text{Var}(\hat{\theta}) \approx \left(-\left[\frac{\partial^2 \log L}{\partial\theta^2}\right]_{\hat{\theta}}\right)^{-1} $$ See Kendall and Stuart (1979) for further details on the theory of maximum likelihood estimation.
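
As a concrete one-parameter illustration of this procedure, the minimal sketch below (plain Java, not the MaximumLikelihoodEstimation class, with made-up data) estimates the mean \( \mu \) of a normal distribution with known standard deviation \( \sigma = 1 \) by maximizing the log-likelihood with Newton's method, and approximates \( \text{Var}(\hat{\mu}) \) by the negative inverse of a finite-difference Hessian.

// Maximum likelihood estimation of the mean of a normal distribution
// with known sigma = 1, using hypothetical sample data for illustration.
public class NormalMeanMLE {
    static final double SIGMA = 1.0;

    // log L(mu) = sum_i log f(x_i; mu) for the normal density.
    static double logLikelihood(double mu, double[] x) {
        double sum = 0.0;
        for (double xi : x) {
            double z = (xi - mu) / SIGMA;
            sum += -0.5 * Math.log(2.0 * Math.PI * SIGMA * SIGMA) - 0.5 * z * z;
        }
        return sum;
    }

    public static void main(String[] args) {
        double[] x = {4.1, 5.3, 4.7, 5.9, 4.5, 5.2, 4.8, 5.6};

        // Maximize log L by Newton's method on its first derivative.
        // Here d(log L)/d(mu) = sum_i (x_i - mu)/sigma^2 and
        // d^2(log L)/d(mu)^2 = -N/sigma^2, so one step reaches the maximum.
        double mu = 0.0;
        for (int iter = 0; iter < 5; iter++) {
            double grad = 0.0;
            for (double xi : x) grad += (xi - mu) / (SIGMA * SIGMA);
            double hess = -x.length / (SIGMA * SIGMA);
            mu -= grad / hess;
        }
        System.out.println("MLE of mu (Newton):        " + mu);

        // The closed-form MLE is the sample mean; it should agree with Newton.
        double mean = 0.0;
        for (double xi : x) mean += xi;
        mean /= x.length;
        System.out.println("Sample mean (closed form): " + mean);

        // Var(mu-hat) ~ (-[d^2 log L / d mu^2])^{-1}, approximated by a
        // central finite difference of the log-likelihood at the MLE.
        double h = 1e-4;
        double d2 = (logLikelihood(mu + h, x) - 2.0 * logLikelihood(mu, x)
                + logLikelihood(mu - h, x)) / (h * h);
        System.out.println("Approximate variance:      " + (-1.0 / d2)); // ~ sigma^2 / N
    }
}

For a distribution with several parameters the same recipe applies with the gradient and Hessian taken with respect to the full parameter vector; the MaximumLikelihoodEstimation class performs the analogous computation for the distributions in this package.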


Copyright © 2020 Rogue Wave Software. All rights reserved.