
Rényi entropy

From Wikipedia, the free encyclopedia


In information theory, the Rényi entropy, a generalisation of Shannon entropy, is one of a family of functionals for quantifying the diversity, uncertainty or randomness of a system. It is named after Alfréd Rényi.

The Rényi entropy of order α, where α ≥ 0 and α ≠ 1, is defined as

$$H_\alpha(X) = \frac{1}{1-\alpha}\log\Bigg(\sum_{i=1}^n p_i^\alpha\Bigg)$$


where $p_i$ are the probabilities of the outcomes $\{x_1, x_2, \ldots, x_n\}$. If the probabilities are all equal, then all the Rényi entropies of the distribution are equal, with $H_\alpha(X) = \log n$. Otherwise the entropies are weakly decreasing as a function of α.
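The definition can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `renyi_entropy` and the example distributions are assumptions made here, natural logarithms are used, and the orders 1 and ∞ (where the formula itself is undefined) are handled by their standard limiting values.

```python
import math

def renyi_entropy(p, alpha):
    """Rényi entropy in nats of a discrete distribution p, for order alpha >= 0."""
    # Outcomes with zero probability contribute nothing to any of the sums.
    support = [x for x in p if x > 0]
    if alpha == 1:
        # Limiting value as alpha -> 1 (Shannon entropy).
        return -sum(x * math.log(x) for x in support)
    if math.isinf(alpha):
        # Limiting value as alpha -> infinity (min-entropy).
        return -math.log(max(support))
    return math.log(sum(x ** alpha for x in support)) / (1 - alpha)

# Hypothetical example distributions on four outcomes.
uniform = [0.25] * 4
skewed = [0.5, 0.25, 0.125, 0.125]
orders = [0, 0.5, 1, 2, math.inf]

uniform_values = [renyi_entropy(uniform, a) for a in orders]
skewed_values = [renyi_entropy(skewed, a) for a in orders]
```

For the uniform distribution every entry of `uniform_values` equals log 4 up to rounding, while `skewed_values` is non-increasing in the order, matching the two statements above.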

Some particular cases:

$$H_0 (X) = \log n = \log |X|,$$


which is the logarithm of the cardinality of X, sometimes called the Hartley entropy of X.

In the limit that α approaches 1, it can be shown that $H_\alpha$ converges to

$$H_1 (X) = - \sum_{i=1}^n p_i \log p_i$$

which is the Shannon entropy. Sometimes Rényi entropy refers only to the case α = 2,

$$H_2 (X) = - \log \sum_{i=1}^n p_i^2 = - \log P(X = Y)$$


where Y is a random variable independent of X but identically distributed to X. As α → ∞, the limit exists as

$$H_\infty (X) = - \log \sup_{i=1..n} p_i$$


and this is called min-entropy, because it is the smallest value of $H_\alpha$. These two latter cases are related by $H_\infty \le H_2 \le 2 H_\infty$, while on the other hand Shannon entropy can be arbitrarily high for a random variable X with fixed min-entropy.
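The two-sided bound relating the order-2 entropy to the min-entropy can be checked directly from the definitions. The following short derivation is a sketch; the symbol $p_{\max}$ is notation introduced here for the largest probability.

```latex
% Let p_max = max_i p_i. Since one term of the sum is p_max^2, and each p_i <= p_max,
\[
  p_{\max}^2 \;\le\; \sum_{i=1}^n p_i^2 \;\le\; p_{\max} \sum_{i=1}^n p_i \;=\; p_{\max}.
\]
% Applying -log, which is decreasing, reverses the inequalities:
\[
  -\log p_{\max} \;\le\; -\log \sum_{i=1}^n p_i^2 \;\le\; -2 \log p_{\max},
  \qquad\text{i.e.}\qquad
  H_\infty(X) \;\le\; H_2(X) \;\le\; 2 H_\infty(X).
\]
```

The left inequality is an equality for a uniform distribution, and both are equalities when a single outcome has probability 1, so neither bound is strict in general.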

The Rényi entropies are important in ecology and statistics as indices of diversity. They also lead to a spectrum of indices of fractal dimension.


Rényi relative informations

As well as the absolute Rényi entropies, Rényi also defined a spectrum of generalised relative information gains (the negative of relative entropies), generalising the Kullback–Leibler divergence.

The Rényi generalised divergence of order α, where α > 0, of an approximate distribution or a prior distribution Q(x) from a "true" distribution or an updated distribution P(x) is defined to be:

$$D_\alpha (P \| Q) = \frac{1}{\alpha-1}\log\Bigg(\sum_{i=1}^n \frac{p_i^\alpha}{q_i^{\alpha-1}}\Bigg) = \frac{1}{\alpha-1}\log \sum_{i=1}^n p_i^\alpha q_i^{1-\alpha}$$


Like the Kullback-Leibler divergence, the Rényi generalised divergences are always non-negative.

Some special cases:

$D_0(P \| Q) = - \log Q(\{i : p_i > 0\})$: minus the log probability under Q that $p_i > 0$;
$D_{1/2}(P \| Q) = -2 \log \sum_{i=1}^n \sqrt{p_i q_i}$: minus twice the logarithm of the Bhattacharyya coefficient;
$D_1(P \| Q) = \sum_{i=1}^n p_i \log \frac{p_i}{q_i}$: the Kullback–Leibler divergence;
$D_2(P \| Q) = \log \Big\langle \frac{p_i}{q_i} \Big\rangle$: the log of the expected ratio of the probabilities, with the expectation taken under P;
$D_\infty(P \| Q) = \log \sup_i \frac{p_i}{q_i}$: the log of the maximum ratio of the probabilities.
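The special cases listed above can be compared against the general formula numerically. The sketch below is illustrative only: the function name `renyi_divergence` and the example distributions are assumptions made here, natural logarithms are used, every $q_i$ is assumed strictly positive so that no ratio is undefined, and the orders 1 and ∞ are handled by their limiting forms.

```python
import math

def renyi_divergence(p, q, alpha):
    """Rényi divergence D_alpha(P || Q) in nats; assumes every q_i > 0."""
    # Terms with p_i = 0 vanish for alpha > 0 and are excluded for alpha = 0.
    pairs = [(pi, qi) for pi, qi in zip(p, q) if pi > 0]
    if alpha == 1:
        # Limiting value as alpha -> 1 (Kullback-Leibler divergence).
        return sum(pi * math.log(pi / qi) for pi, qi in pairs)
    if math.isinf(alpha):
        # Limiting value as alpha -> infinity (log of the maximum ratio).
        return math.log(max(pi / qi for pi, qi in pairs))
    total = sum(pi ** alpha * qi ** (1 - alpha) for pi, qi in pairs)
    return math.log(total) / (alpha - 1)

# Hypothetical example distributions on three outcomes.
p = [0.5, 0.3, 0.2]
q = [0.25, 0.25, 0.5]

bhattacharyya = sum(math.sqrt(pi * qi) for pi, qi in zip(p, q))
d_half = renyi_divergence(p, q, 0.5)   # compare with -2 * log(bhattacharyya)
d_two = renyi_divergence(p, q, 2)      # compare with log of sum p_i^2 / q_i
d_zero = renyi_divergence([0.5, 0.5, 0.0], q, 0)  # -log Q({i : p_i > 0})
```

With these inputs `d_zero` is the negative log of the mass that Q assigns to the first two outcomes, which is log 2.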

Why α = 1 is special

The value α = 1, which gives the Shannon entropy and the Kullback–Leibler divergence, is special because it is only when α=1 that one can separate out variables A and X from a joint probability distribution, and write:

$$H(A,X) = H(A) + \mathbb{E}_{p(a)} \{ H(X|a) \}$$


for the absolute entropies, and

$$D_\mathrm{KL}(p(x|a)p(a) \| m(x,a)) = \mathbb{E}_{p(a)}\{D_\mathrm{KL}(p(x|a) \| m(x|a))\} + D_\mathrm{KL}(p(a) \| m(a)),$$


for the relative entropies.

The latter in particular means that if we seek a distribution p(x,a) which minimises the divergence from some underlying prior measure m(x,a), and we acquire new information which only affects the distribution of a, then the distribution of p(x|a) remains m(x|a), unchanged.
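The decomposition of the joint entropy can be illustrated numerically. The sketch below evaluates the gap between $H_\alpha(A,X)$ and $H_\alpha(A)$ plus the $p(a)$-weighted average of $H_\alpha(X|a)$. Treating that weighted average as the order-α analogue of the conditional term is an assumption made here for illustration, as are the function names and the example joint distribution.

```python
import math

def renyi_entropy(p, alpha):
    """Rényi entropy in nats for finite alpha >= 0; alpha = 1 gives Shannon entropy."""
    support = [x for x in p if x > 0]
    if alpha == 1:
        return -sum(x * math.log(x) for x in support)
    return math.log(sum(x ** alpha for x in support)) / (1 - alpha)

def chain_rule_gap(joint, alpha):
    """H(A,X) - [H(A) + E_{p(a)} H(X|a)] for a joint table with rows a, columns x."""
    p_a = [sum(row) for row in joint]
    flat = [v for row in joint for v in row]
    # Average, under p(a), of the entropy of each conditional distribution p(x|a).
    conditional = sum(
        pa * renyi_entropy([v / pa for v in row], alpha)
        for pa, row in zip(p_a, joint)
        if pa > 0
    )
    return renyi_entropy(flat, alpha) - (renyi_entropy(p_a, alpha) + conditional)

# Hypothetical joint distribution p(a, x) in which X depends on A.
joint = [[0.5, 0.0],
         [0.25, 0.25]]

gap_shannon = chain_rule_gap(joint, 1)  # vanishes up to rounding
gap_order2 = chain_rule_gap(joint, 2)   # does not vanish for this table
```

For this table the gap is zero at α = 1 but not at α = 2, which is consistent with the statement that the separation holds only for the Shannon case.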

The other Rényi divergences satisfy the criteria of being positive and continuous; being invariant under 1-to-1 co-ordinate transformations; and of combining additively when A and X are independent, so that if p(A,X) = p(A)p(X), then

$$H_\alpha(A,X) = H_\alpha(A) + H_\alpha(X)$$


and

$$D_\alpha(P(A)P(X)\|Q(A)Q(X)) = D_\alpha(P(A)\|Q(A)) + D_\alpha(P(X)\|Q(X)).$$
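Additivity under independence can be checked on a small example. This is a sketch under stated assumptions: the helper names, the example marginals and the chosen orders are all hypothetical, natural logarithms are used, every $q_i$ is strictly positive, and only finite orders α ≠ 1 are covered.

```python
import math
from itertools import product

def renyi_entropy(p, alpha):
    """Rényi entropy in nats for finite alpha >= 0, alpha != 1."""
    return math.log(sum(x ** alpha for x in p if x > 0)) / (1 - alpha)

def renyi_divergence(p, q, alpha):
    """Rényi divergence in nats for finite alpha > 0, alpha != 1; assumes q_i > 0."""
    total = sum(pi ** alpha * qi ** (1 - alpha) for pi, qi in zip(p, q) if pi > 0)
    return math.log(total) / (alpha - 1)

def product_distribution(p_a, p_x):
    """Joint distribution of independent A and X, flattened in a fixed order."""
    return [a * x for a, x in product(p_a, p_x)]

# Hypothetical marginals for P and Q.
p_a, p_x = [0.7, 0.3], [0.5, 0.3, 0.2]
q_a, q_x = [0.4, 0.6], [0.2, 0.2, 0.6]

entropy_gaps = []
divergence_gaps = []
for alpha in (0.5, 2, 3):
    joint_p = product_distribution(p_a, p_x)
    joint_q = product_distribution(q_a, q_x)
    entropy_gaps.append(
        renyi_entropy(joint_p, alpha)
        - renyi_entropy(p_a, alpha) - renyi_entropy(p_x, alpha)
    )
    divergence_gaps.append(
        renyi_divergence(joint_p, joint_q, alpha)
        - renyi_divergence(p_a, q_a, alpha) - renyi_divergence(p_x, q_x, alpha)
    )
```

Both lists contain only values that are zero up to floating-point rounding, because the sum over the product distribution factorises into the product of the two marginal sums.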


The stronger properties of the α = 1 quantities, which allow the definition of the conditional informations and mutual informations which are so important in communication theory, may be very important in other applications, or entirely unimportant, depending on those applications' requirements.

References

A. Rényi (1961). "On measures of information and entropy". Proceedings of the 4th Berkeley Symposium on Mathematics, Statistics and Probability 1960: 547-561. 
