Chi-square distribution
From Wikipedia, the free encyclopedia
This article is about the mathematics of the chi-square distribution. For its uses in statistics, see chi-square test.
In probability theory and statistics, the chi-square distribution (also chi-squared or $\chi^2$ distribution) is one of the most widely used theoretical probability distributions in inferential statistics, e.g., in statistical significance tests. It is useful because, under reasonable assumptions, easily calculated quantities can be proven to have distributions that approximate the chi-square distribution if the null hypothesis is true.
If $X_i$ are $k$ independent, normally distributed random variables with mean 0 and variance 1, then the random variable
$Q = \sum_{i=1}^k X_i^2$
is distributed according to the chi-square distribution with $k$ degrees of freedom, usually written $Q \sim \chi^2_k$. The chi-square distribution has one parameter:
- $k$, a positive integer that specifies the number of degrees of freedom (i.e. the number of $X_i$).
The chi-square distribution is a special case of the gamma distribution.
The best-known situations in which the chi-square distribution is used are the common chi-square tests for goodness of fit of an observed distribution to a theoretical one, and for the independence of two criteria of classification of qualitative data. However, many other statistical tests also lead to this distribution. One example is Friedman's analysis of variance by ranks.
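The definition as a sum of squared standard normal variables can be illustrated with a small simulation. The following is a minimal sketch using only the Python standard library; the function names `sample_chi_square` and `simulate`, the sample size, and the seed are illustrative choices, not part of the article. It compares the sample mean and variance with the theoretical values $k$ and $2k$.

```python
import random
import statistics

def sample_chi_square(k, rng):
    """Draw one chi-square variate as the sum of k squared standard normals."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(k))

def simulate(k, n_samples=20000, seed=0):
    """Return the sample mean and sample variance of n_samples simulated draws."""
    rng = random.Random(seed)  # fixed seed (arbitrary choice) for reproducibility
    draws = [sample_chi_square(k, rng) for _ in range(n_samples)]
    return statistics.fmean(draws), statistics.variance(draws)

mean, var = simulate(k=3)
print(f"k=3: sample mean {mean:.3f} (theory 3), sample variance {var:.3f} (theory 6)")
```

The printed values should be close to, but not exactly equal to, the theoretical ones, since they are Monte Carlo estimates.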
Characteristics
Probability density function
A probability density function of the chi-square distribution is
$f(x;k) = \frac{1}{2^{k/2}\,\Gamma(k/2)}\, x^{k/2-1} e^{-x/2}$ for $x > 0$, and $f(x;k) = 0$ for $x \le 0$,
where $\Gamma$ denotes the Gamma function, which takes particular values at the half-integers.
Cumulative distribution function
Its cumulative distribution function is
$F(x;k) = \frac{\gamma(k/2,\, x/2)}{\Gamma(k/2)} = P(k/2,\, x/2)$,
where $\gamma(k, z)$ is the lower incomplete Gamma function and $P(k, z)$ is the regularized Gamma function.
Tables of this distribution, usually in its cumulative form, are widely available, and the function is included in many spreadsheets and in most statistical packages.
Characteristic function
The characteristic function of the chi-square distribution is
$\chi(t;k) = (1-2it)^{-k/2}$.
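The density and cumulative distribution function of the chi-square distribution can be evaluated with a few lines of standard-library Python. This is a sketch, not a production implementation: the function names are hypothetical, and the regularized lower incomplete Gamma function is computed with its standard power series, which is an assumption of this example rather than the method used by any particular package.

```python
import math

def chi2_pdf(x, k):
    """Density x^(k/2-1) e^(-x/2) / (2^(k/2) Gamma(k/2)) for x > 0, else 0."""
    if x <= 0:
        return 0.0
    half = k / 2.0
    # Work in log space to avoid overflow in Gamma(k/2) for large k.
    return math.exp((half - 1.0) * math.log(x) - x / 2.0
                    - half * math.log(2.0) - math.lgamma(half))

def regularized_lower_gamma(s, z, tol=1e-12, max_terms=10000):
    """P(s, z) via the series z^s e^-z / Gamma(s+1) * sum_n z^n / ((s+1)...(s+n))."""
    if z <= 0:
        return 0.0
    term = 1.0
    total = 1.0
    for n in range(1, max_terms):
        term *= z / (s + n)
        total += term
        if term < tol * total:
            break
    return total * math.exp(s * math.log(z) - z - math.lgamma(s + 1.0))

def chi2_cdf(x, k):
    """Cumulative distribution function P(k/2, x/2)."""
    return regularized_lower_gamma(k / 2.0, x / 2.0)

for k in (1, 2, 5):
    print(k, chi2_pdf(3.0, k), chi2_cdf(3.0, k))
```

For $k = 2$ the result can be checked by hand, because the cumulative distribution function reduces to $1 - e^{-x/2}$.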
Properties
The chi-square distribution has numerous applications in inferential statistics, for instance in chi-square tests and in estimating variances. It enters the problem of estimating the mean of a normally distributed population and the problem of estimating the slope of a regression line via its role in Student's t-distribution. It enters all analysis of variance problems via its role in the F-distribution, which is the distribution of the ratio of two independent chi-squared random variables, each divided by its respective degrees of freedom.
Normal approximation
If $X\sim\chi^2_k$, then as $k$ tends to infinity, the distribution of $X$ tends to normality. However, the tendency is slow (the skewness is $\sqrt{8/k}$ and the kurtosis excess is $12/k$), and two transformations are commonly considered, each of which approaches normality faster than $X$ itself.
Fisher empirically showed that $\sqrt{2X}$ is approximately normally distributed with mean $\sqrt{2k-1}$ and unit variance. It is possible to arrive at the same normal approximation result by using moment matching. To see this, consider the mean and the variance of a chi-distributed random variable $z=\sqrt{X}$, which are given by $\mu_z= \sqrt{2}\, \frac{\Gamma\left(k/2+1/2\right)}{\Gamma\left(k/2 \right)}$ and $\sigma_z^2= k-\mu_z^2$, where $\Gamma(\cdot)$ is the Gamma function. The particular ratio of the Gamma functions in $\mu_z$ has the following series expansion [1]:
$\frac{\Gamma\left(N+1/2\right)}{\Gamma\left(N \right)}=\sqrt{N}\left(1-\frac{1}{8N}+ \frac{1}{128N^2}+\frac{5}{1024N^3}-\frac{21}{32768N^4}+\ldots\right).$
When $N\gg 1$, this ratio can be approximated as follows:
$\frac{\Gamma\left(N+1/2\right)}{\Gamma\left(N \right)}\approx\sqrt{N}\left(1-\frac{1}{8N}\right)\approx\sqrt{N}\left(1-\frac{1}{4N}\right)^{0.5}=\sqrt{N-1/4}.$
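The series expansion and the simpler approximation $\sqrt{N-1/4}$ for the ratio of Gamma functions can be checked numerically. The sketch below uses `math.lgamma` from the Python standard library; the function names are illustrative and the chosen values of $N$ are arbitrary.

```python
import math

def gamma_ratio(n):
    """Gamma(n + 1/2) / Gamma(n), computed in log space for stability."""
    return math.exp(math.lgamma(n + 0.5) - math.lgamma(n))

def gamma_ratio_series(n):
    """Series expansion truncated after the 1/N^4 term."""
    return math.sqrt(n) * (1 - 1 / (8 * n) + 1 / (128 * n**2)
                           + 5 / (1024 * n**3) - 21 / (32768 * n**4))

def gamma_ratio_simple(n):
    """Large-N approximation sqrt(N - 1/4)."""
    return math.sqrt(n - 0.25)

for n in (1, 5, 10, 50):
    print(n, gamma_ratio(n), gamma_ratio_series(n), gamma_ratio_simple(n))
```

Both approximations improve as $N$ grows; the truncated series is considerably more accurate than $\sqrt{N-1/4}$ at the same $N$.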
With $N = k/2$, the approximation $\Gamma(N+1/2)/\Gamma(N)\approx\sqrt{N-1/4}$ gives $\mu_z\approx\sqrt{2}\,\sqrt{k/2-1/4}=\sqrt{k-1/2}$ and $\sigma_z^2=k-\mu_z^2\approx 1/2$. Moment matching therefore yields the approximation $z\sim{\mathcal N}\left(\sqrt{k-1/2}, \frac{1}{2}\right)$, from which it follows that $\sqrt{2X}\sim{\mathcal N}\left(\sqrt{2k-1}, 1\right)$.
Wilson and Hilferty showed in 1931 that $\sqrt[3]{X/k}$ is approximately normally distributed with mean $1-2/(9k)$ and variance $2/(9k)$.
The expected value of a random variable having chi-square distribution with $k$ degrees of freedom is $k$ and the variance is $2k$. The median is given approximately by
$k\left(1-\frac{2}{9k}\right)^3 = k-\frac{2}{3}+\frac{4}{27k}-\frac{8}{729k^2}.$
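The two normal approximations can be compared against exact values. The following sketch makes two assumptions that are not in the article: it restricts the exact reference to even $k$, where the cumulative distribution function has the closed form $1 - e^{-x/2}\sum_{j<k/2}(x/2)^j/j!$, and it takes the median approximation to be the cube of the Wilson and Hilferty mean times $k$. All function names are hypothetical.

```python
import math

def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def chi2_cdf_even(x, k):
    """Exact CDF for even k: 1 - exp(-x/2) * sum_{j < k/2} (x/2)^j / j!."""
    if k <= 0 or k % 2 != 0:
        raise ValueError("this closed form requires a positive even k")
    if x <= 0:
        return 0.0
    half_x = x / 2.0
    term, total = 1.0, 1.0
    for j in range(1, k // 2):
        term *= half_x / j
        total += term
    return 1.0 - math.exp(-half_x) * total

def fisher_cdf(x, k):
    """Fisher: sqrt(2X) is roughly normal with mean sqrt(2k - 1), variance 1."""
    return normal_cdf(math.sqrt(2.0 * x) - math.sqrt(2.0 * k - 1.0))

def wilson_hilferty_cdf(x, k):
    """Wilson-Hilferty: (X/k)^(1/3) is roughly normal, mean 1 - 2/(9k), variance 2/(9k)."""
    mean = 1.0 - 2.0 / (9.0 * k)
    sd = math.sqrt(2.0 / (9.0 * k))
    return normal_cdf(((x / k) ** (1.0 / 3.0) - mean) / sd)

def median_approx(k):
    """Approximate median k * (1 - 2/(9k))^3."""
    return k * (1.0 - 2.0 / (9.0 * k)) ** 3

k = 10
for x in (5.0, 10.0, 20.0):
    print(x, chi2_cdf_even(x, k), fisher_cdf(x, k), wilson_hilferty_cdf(x, k))
print("median, k=2:", median_approx(2), "exact:", 2 * math.log(2))
```

In this comparison the cube-root transformation tracks the exact values more closely than the square-root transformation at $k = 10$, $x = 10$.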
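Information entropy
The information entropy is given by
$H = \frac{k}{2}+\ln\left(2\,\Gamma(k/2)\right)+\left(1-\frac{k}{2}\right)\psi(k/2),$
where $\psi(x)$ is the Digamma function.
Related distributions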
- If $X \sim \chi_2^2$ (with 2 degrees of freedom), then $X$ follows an exponential distribution with rate $\lambda = \tfrac{1}{2}$.
- $Y \sim \chi_k^2$ follows a chi-square distribution if $Y = \sum_{m=1}^k X_m^2$ for independent, normally distributed $X_m \sim N(0,1)$.
- If the $X_m \sim N(\mu_m, 1)$ have nonzero means, then $Y = \sum_{m=1}^k X_m^2$ is drawn from a noncentral chi-square distribution.
- The chi-square distribution $X \sim \chi_\nu^2$ is a special case of the gamma distribution, in that $X \sim \textrm{Gamma}(\tfrac{\nu}{2}, 2)$ (shape $\nu/2$, scale 2).
- $Y \sim F(\nu_1, \nu_2)$ follows an F-distribution if $Y = \frac{X_1 / \nu_1}{X_2 / \nu_2}$, where $X_1 \sim \chi_{\nu_1}^2$ and $X_2 \sim \chi_{\nu_2}^2$ are independent with their respective degrees of freedom.
- $Y \sim \chi^2(\bar{\nu})$ follows a chi-square distribution if $Y = \sum_{m=1}^N X_m$, where the $X_m \sim \chi^2(\nu_m)$ are independent and $\bar{\nu} = \sum_{m=1}^N \nu_m$.
- If $X$ is chi-square distributed, then $\sqrt{X}$ is chi distributed.
- In particular, if $X \sim \chi_2^2$ (chi-square with 2 degrees of freedom), then $\sqrt{X}$ is Rayleigh distributed.
- If $X_1, \dots, X_n$ are i.i.d. $N(\mu,\sigma^2)$ random variables, then $\sum_{i=1}^n(X_i - \bar X)^2 \sim \sigma^2 \chi^2_{n-1}$, where $\bar X = \frac{1}{n} \sum_{i=1}^n X_i$.
- If $X \sim \mathrm{SkewLogistic}(\tfrac{1}{2})$, then $\mathrm{log}(1 + e^{-X}) \sim \chi_2^2$.
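Two of the relations in this list lend themselves to a quick Monte Carlo check: the chi-square distribution with 2 degrees of freedom coincides with an exponential distribution, and the sum of squared deviations from the sample mean of normal data, divided by $\sigma^2$, is chi-square with $n-1$ degrees of freedom (and so has mean $n-1$). The sketch below is illustrative only; function names, sample sizes, seeds, and the parameter values $n=5$, $\mu=1$, $\sigma=2$ are arbitrary assumptions.

```python
import math
import random

def exponential_check(n_samples=20000, x=2.0, seed=1):
    """Empirical P(Z1^2 + Z2^2 <= x) next to the exponential CDF 1 - exp(-x/2)."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n_samples)
               if rng.gauss(0, 1) ** 2 + rng.gauss(0, 1) ** 2 <= x)
    return hits / n_samples, 1.0 - math.exp(-x / 2.0)

def scaled_sum_of_squares_mean(n=5, mu=1.0, sigma=2.0, n_trials=20000, seed=2):
    """Average of sum((X_i - Xbar)^2) / sigma^2 over many samples of size n."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_trials):
        xs = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = sum(xs) / n
        total += sum((x - xbar) ** 2 for x in xs) / sigma ** 2
    return total / n_trials

empirical, theoretical = exponential_check()
print("P(X <= 2), k=2:", empirical, "vs", theoretical)
print("mean of scaled sum of squares, n=5:", scaled_sum_of_squares_mean(), "vs", 4)
```

Agreement is only up to sampling error, so the simulated numbers fluctuate around the theoretical ones when the seed or sample size is changed.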


