Skewness

From Wikipedia, the free encyclopedia

Figure: Example of experimental data with non-zero skewness (gravitropic response of wheat coleoptiles, 1,790)

In probability theory and statistics, skewness is a measure of the asymmetry of the probability distribution of a real-valued random variable.

Introduction

Consider the distribution in the figure. The bars on the right side of the distribution taper differently than the bars on the left side. These tapering sides are called tails, and they provide a visual means of determining which of the two kinds of skewness a distribution has:

  1. positive skew: The right tail is longer; the mass of the distribution is concentrated on the left of the figure. The distribution is said to be right-skewed.
  2. negative skew: The left tail is longer; the mass of the distribution is concentrated on the right of the figure. The distribution is said to be left-skewed.

Definition

Skewness, the third standardized moment, is written as $\gamma_1$ and defined as

$$\gamma_1 = \frac{\mu_3}{\sigma^3},$$

where $\mu_3$ is the third moment about the mean and $\sigma$ is the standard deviation. Equivalently, skewness can be defined as the ratio of the third cumulant $\kappa_3$ to the third power of the square root of the second cumulant $\kappa_2$:

$$\gamma_1 = \frac{\kappa_3}{\kappa_2^{3/2}}.$$
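As a rough illustration of the moment-based definition, the following Python sketch evaluates the ratio of the third central moment to the cube of the standard deviation for a small discrete distribution. The function name `skewness_from_pmf` and the three-point distributions are hypothetical choices made for this example only.

```python
def skewness_from_pmf(values, probs):
    """Return mu_3 / sigma**3 for a discrete distribution given as a pmf."""
    mean = sum(p * x for x, p in zip(values, probs))
    # Second and third moments about the mean.
    mu2 = sum(p * (x - mean) ** 2 for x, p in zip(values, probs))
    mu3 = sum(p * (x - mean) ** 3 for x, p in zip(values, probs))
    # sigma**3 equals mu2 ** 1.5 because sigma is the square root of mu2.
    return mu3 / mu2 ** 1.5


# Hypothetical distribution with a long right tail: positive skew.
right_tailed = skewness_from_pmf([0, 1, 10], [0.5, 0.4, 0.1])

# Its mirror image has a long left tail: negative skew of equal magnitude.
left_tailed = skewness_from_pmf([0, -1, -10], [0.5, 0.4, 0.1])

# A distribution that is symmetric about its mean has zero skewness.
symmetric = skewness_from_pmf([-1, 0, 1], [0.25, 0.5, 0.25])
```

Reflecting a distribution about any point flips the sign of the third central moment and leaves the variance unchanged, which is why the two tailed examples differ only in sign.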


This is analogous to the definition of kurtosis, which is expressed as the fourth cumulant divided by the fourth power of the square root of the second cumulant.

For a sample of n values the sample skewness is

$$g_1 = \frac{m_3}{m_2^{3/2}} = \frac{\sqrt{n}\,\sum_{i=1}^n (x_i-\bar{x})^3}{\left(\sum_{i=1}^n (x_i-\bar{x})^2\right)^{3/2}},$$

where $x_i$ is the $i$th value, $\bar{x}$ is the sample mean, $m_3$ is the sample third central moment, and $m_2$ is the sample variance (computed with divisor $n$).
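The sample formula can be sketched in a few lines of Python. This is a minimal illustration and not a reference implementation: the function names and the data set are hypothetical, and both central moments are assumed to use the divisor n, which is what makes the two forms of the formula agree.

```python
import math


def sample_skewness(xs):
    """Sample skewness as m_3 / m_2**(3/2), with divisor-n central moments."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5


def sample_skewness_sum_form(xs):
    """The same quantity written with raw sums and a sqrt(n) factor."""
    n = len(xs)
    mean = sum(xs) / n
    sum_sq = sum((x - mean) ** 2 for x in xs)
    sum_cu = sum((x - mean) ** 3 for x in xs)
    return math.sqrt(n) * sum_cu / sum_sq ** 1.5


# Hypothetical data with one large value stretching the right tail.
data = [1, 2, 2, 3, 3, 3, 4, 10]
g1 = sample_skewness(data)
g1_alt = sample_skewness_sum_form(data)
```

For this data set the result is positive, consistent with the longer right tail, and the two functions agree up to floating-point rounding.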

Given samples from a population, the equation for the sample skewness $g_1$ above is a biased estimator of the population skewness. The usual estimator of skewness is

$$G_1 = \frac{k_3}{k_2^{3/2}} = \frac{\sqrt{n\,(n-1)}}{n-2}\; g_1,$$

where $k_3$ is the unique symmetric unbiased estimator of the third cumulant and $k_2$ is the symmetric unbiased estimator of the second cumulant. Unfortunately, $G_1$ is nevertheless generally biased. Its expected value can even have the opposite sign from the true skewness.
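The adjusted estimator can be sketched in Python in two ways: by rescaling the sample skewness with the factor sqrt(n(n-1))/(n-2), and directly from the estimators of the second and third cumulants. The cumulant estimators below use the standard relations k_2 = n m_2 / (n-1) and k_3 = n^2 m_3 / ((n-1)(n-2)), which are not spelled out in the text above and are stated here as an assumption. All function names and the data set are hypothetical.

```python
import math


def central_moments(xs):
    """Return (n, m2, m3), where m2 and m3 use the divisor n."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    return n, m2, m3


def adjusted_skewness(xs):
    """Rescale the sample skewness by sqrt(n (n - 1)) / (n - 2)."""
    n, m2, m3 = central_moments(xs)
    if n < 3:
        raise ValueError("at least 3 values are needed")
    g1 = m3 / m2 ** 1.5
    return math.sqrt(n * (n - 1)) / (n - 2) * g1


def adjusted_skewness_from_k_statistics(xs):
    """The same estimator written as k_3 / k_2**(3/2)."""
    n, m2, m3 = central_moments(xs)
    if n < 3:
        raise ValueError("at least 3 values are needed")
    k2 = n / (n - 1) * m2                       # assumed form of k_2
    k3 = n ** 2 / ((n - 1) * (n - 2)) * m3      # assumed form of k_3
    return k3 / k2 ** 1.5


data = [1, 2, 2, 3, 3, 3, 4, 10]  # hypothetical sample
G1 = adjusted_skewness(data)
G1_alt = adjusted_skewness_from_k_statistics(data)
```

The rescaling factor exceeds 1 for every n of at least 3, so the adjusted value is always larger in magnitude than the unadjusted sample skewness; the difference shrinks as n grows.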

The skewness of a random variable X is sometimes denoted Skew[X]. If Y is the sum of n independent random variables, all with the same distribution as X, then it can be shown that Skew[Y] = Skew[X] / √n.
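The scaling rule follows from the additivity of cumulants for independent variables: the second and third cumulants of Y are each n times those of X, so their ratio picks up a factor n / n^(3/2) = 1/√n. As a numerical check, the sketch below uses a Bernoulli variable as X, chosen only for convenience, so that Y is binomial and its distribution can be written down exactly. The helper names are hypothetical.

```python
import math


def skewness_from_pmf(values, probs):
    """Return mu_3 / sigma**3 for a discrete distribution given as a pmf."""
    mean = sum(p * x for x, p in zip(values, probs))
    mu2 = sum(p * (x - mean) ** 2 for x, p in zip(values, probs))
    mu3 = sum(p * (x - mean) ** 3 for x, p in zip(values, probs))
    return mu3 / mu2 ** 1.5


def binomial_pmf(n, p):
    """Probabilities of 0..n successes in n independent Bernoulli(p) trials."""
    return [math.comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(n + 1)]


p = 0.2   # success probability of X (arbitrary choice)
n = 16    # number of independent copies summed to form Y

skew_x = skewness_from_pmf([0, 1], [1 - p, p])
skew_y = skewness_from_pmf(range(n + 1), binomial_pmf(n, p))

# Value predicted for Skew[Y] by the scaling rule.
predicted = skew_x / math.sqrt(n)
```

With these parameters the skewness of Y should come out at one quarter of the skewness of X, up to floating-point rounding.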

Skewness is useful in many areas. Many simple models assume a normal distribution, that is, data that are symmetric about the mean. The normal distribution has a skewness of zero. Real data, however, are rarely perfectly symmetric, so the skewness of a data set indicates whether large deviations from the mean are more likely to be positive or negative.

Pearson skewness coefficients

Karl Pearson suggested two simpler calculations as a measure of skewness:

  1. (mean − mode) / standard deviation
  2. 3 (mean − median) / standard deviation

There is no guarantee that these will be the same sign as each other or as the ordinary definition of skewness.
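A small Python sketch can make the sign disagreement concrete. It takes the two coefficients in their commonly cited forms, (mean − mode) / standard deviation and 3 (mean − median) / standard deviation, and assumes the sample standard deviation with divisor n − 1 (a choice that does not affect the sign). The function names and the data set are hypothetical.

```python
import statistics


def pearson_mode_skewness(xs):
    """(mean - mode) / standard deviation."""
    return (statistics.mean(xs) - statistics.mode(xs)) / statistics.stdev(xs)


def pearson_median_skewness(xs):
    """3 * (mean - median) / standard deviation."""
    return 3 * (statistics.mean(xs) - statistics.median(xs)) / statistics.stdev(xs)


# Hypothetical data: mean 3.2, median 3, mode 5.
data = [1, 2, 3, 5, 5]

mode_coefficient = pearson_mode_skewness(data)      # negative: mean < mode
median_coefficient = pearson_median_skewness(data)  # positive: mean > median
```

For this data set the mean lies between the median and the mode, so the two coefficients have opposite signs.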
