Nyquist–Shannon sampling theorem
From Wikipedia, the free encyclopedia
The Nyquist–Shannon sampling theorem is a fundamental result in information theory, in particular in telecommunications and signal processing. It is commonly called the Shannon sampling theorem and is also known as the Nyquist–Shannon–Kotelnikov, Whittaker–Shannon–Kotelnikov, Whittaker–Nyquist–Kotelnikov–Shannon, or WKS sampling theorem, as well as the Cardinal Theorem of Interpolation Theory. It is often referred to simply as the sampling theorem. (See Historical background below.)

Sampling is the process of converting a signal (for example, a function of continuous time or space) into a numeric sequence (a function of discrete time or space). The theorem states, in the original words of Shannon (where he uses "cps" for "cycles per second" instead of the modern unit hertz):[1]

"If a function f(t) contains no frequencies higher than W cps, it is completely determined by giving its ordinates at a series of points spaced 1/(2W) seconds apart."
More recent statements of the theorem are sometimes careful to exclude the equality condition; that is, the condition is that f(t) contains no frequencies higher than or equal to W. This condition is equivalent to Shannon's except when the function includes a steady sinusoidal component at exactly frequency W.

The assumptions necessary to prove the theorem form a mathematical model that is only an idealization of any real-world situation. The conclusion, that perfect reconstruction is possible, is mathematically correct for the model, but only an approximation for actual signals and actual sampling techniques. The theorem also leads to a formula for reconstruction of the original signal.

Introduction

A signal or function is bandlimited if it contains no energy at frequencies higher than some bandlimit or bandwidth $B$. A signal that is bandlimited is constrained in how rapidly it changes in time, and therefore in how much detail it can convey in an interval of time. The sampling theorem asserts that the uniformly spaced discrete samples are a complete representation of the signal if this bandwidth is less than half the sampling rate. To formalize these concepts, let $x(t)$ represent a continuous-time signal and $X(f)$ be the continuous Fourier transform of that signal (which exists if $x(t)$ is square-integrable):

$$X(f) = \int_{-\infty}^{\infty} x(t)\, e^{-i 2\pi f t}\, dt$$
The signal $x(t)$ is bandlimited to a one-sided baseband bandwidth $B$ if:

$$X(f) = 0 \quad \text{for all } |f| > B$$

Then the sufficient condition for exact reconstructability from samples at a uniform sampling rate $f_s$ (in samples per unit time) is:

$$f_s > 2B$$

or equivalently,

$$B < \frac{f_s}{2}$$

$2B$ is called the Nyquist rate and is a property of the bandlimited signal, while $f_s/2$ is called the Nyquist frequency and is a property of this sampling system. The time interval between successive samples is referred to as the sampling interval

$$T = \frac{1}{f_s}$$

and the samples of $x(t)$ are denoted by

$$x[n] = x(nT), \quad n \in \mathbb{Z}$$
(integers). The sampling theorem leads to a procedure for reconstructing the original $x(t)$ from the samples $x[n]$ and states sufficient conditions for such a reconstruction to be exact.

The sampling process

The theorem describes two processes in signal processing: a sampling process, in which a continuous-time signal is converted to a discrete-time signal, and a reconstruction process, in which the original continuous signal is recovered from the discrete-time signal.

The continuous signal varies over time (or space in a digitized image, or another independent variable in some other application), and the sampling process is performed by measuring the continuous signal's value every T units of time (or space); T is called the sampling interval. In practice, for signals that are a function of time, the sampling interval is typically quite small, on the order of milliseconds, microseconds, or less. This results in a sequence of numbers, called samples, that represents the original signal. Each sample value is associated with the instant in time when it was measured. The reciprocal of the sampling interval (1/T) is the sampling frequency, denoted fs, which is measured in samples per unit of time. If T is expressed in seconds, then fs is expressed in Hz.

Reconstruction of the original signal is an interpolation process that mathematically defines a continuous-time signal x(t) from the discrete samples x[n], including at times in between the sample instants nT.
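The sampling step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper name `sample`, the 3 Hz test cosine, and the 20 Hz sampling rate are hypothetical choices, not part of the theorem.

```python
import math

def sample(x, fs, num_samples):
    """Return x[n] = x(n*T) for n = 0 .. num_samples-1, where T = 1/fs."""
    T = 1.0 / fs  # sampling interval, in seconds if fs is in Hz
    return [x(n * T) for n in range(num_samples)]

# Hypothetical test signal: a 3 Hz cosine, so B = 3 Hz and the Nyquist rate is 2B = 6 Hz.
def x(t):
    return math.cos(2 * math.pi * 3.0 * t)

fs = 20.0                      # assumed sampling rate, above the Nyquist rate of 6 Hz
samples = sample(x, fs, 8)     # the sequence x[0], ..., x[7]
nyquist_frequency = fs / 2.0   # property of the sampling system (10 Hz here)
```

Each entry of `samples` is tied to its instant nT only through its index n, which is why T (or fs) must be known to interpret the sequence.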
The normalized sinc function: sin(πx) / (πx) ... showing the central peak at x = 0, and zero-crossings at the other integer values of x.
If the original signal contains a frequency component equal to one-half the sampling rate, the sampling condition (bandwidth strictly less than half the sampling rate) is not satisfied. The resulting reconstructed signal may have a component at that frequency, but the amplitude and phase of that component generally will not match the original component.

Reconstruction or interpolation using sinc functions is not the only interpolation scheme. Indeed, it is impossible in practice because it requires summing an infinite number of terms. However, it is the interpolation method that in theory exactly reconstructs any given bandlimited x(t) with any bandlimit B < 1/(2T); any other method that does so is formally equivalent to it.

Practical considerations

A few consequences can be drawn from the theorem:
Aliasing

[Figure: Nonoverlapped.svg]
If the sampling condition is not satisfied, then frequencies will overlap; that is, frequencies above half the sampling rate will be reconstructed as, and appear as, frequencies below half the sampling rate. The resulting distortion is called aliasing; the reconstructed signal is said to be an alias of the original signal, in the sense that it has the same set of sample values. For a sinusoidal component of exactly half the sampling frequency, the component will in general alias to another sinusoid of the same frequency, but with a different phase and amplitude. To prevent or reduce aliasing, two things can be done:
The purpose of the anti-aliasing filter is to restrict the bandwidth of the signal so that it satisfies the condition for proper sampling. Such a restriction works in theory, but cannot be achieved exactly in practice, because realizable filters always allow some leakage of high frequencies. However, the leakage energy can be made small enough that the aliasing effects are negligible.

Application to multivariable signals and images
Subsampled image showing a Moiré pattern
The sampling theorem is usually formulated for functions of a single variable. Consequently, the theorem is directly applicable to time-dependent signals and is normally formulated in that context. However, the sampling theorem can be extended in a straightforward way to functions of arbitrarily many variables. Grayscale images, for example, are often represented as two-dimensional arrays (or matrices) of real numbers representing the relative intensities of pixels (picture elements) located at the intersections of row and column sample locations. As a result, images require two independent variables, or indices, to specify each pixel uniquely: one for the row and one for the column.

Color images typically consist of a composite of three separate grayscale images, one for each of the three primary colors: red, green, and blue, or RGB for short. Other colorspaces using 3-vectors for colors include HSV, LAB, and XYZ. Some colorspaces, such as cyan, magenta, yellow, and black (CMYK), represent color with four dimensions. All of these are treated as vector-valued functions over a two-dimensional sampled domain.

Similar to one-dimensional discrete-time signals, images can also suffer from aliasing if the sampling resolution, or pixel density, is inadequate. For example, a digital photograph of a striped shirt with high spatial frequencies (in other words, the distance between the stripes is small) can cause aliasing of the shirt when it is sampled by the camera's image sensor. The aliasing appears as a Moiré pattern. The remedy in this case, a higher sampling rate in the spatial domain, would be to move closer to the shirt or to use a higher-resolution sensor.

Another example is shown to the right in the brick patterns. The top image shows the effects when the sampling theorem's condition is not satisfied.
When software rescales an image (the same process that creates the thumbnail shown in the lower image), it in effect runs the image through a low-pass filter first and then downsamples it, producing a smaller image that does not exhibit the Moiré pattern. The top image is what happens when the image is downsampled without low-pass filtering: aliasing results. The top image was created by zooming out in GIMP and then taking a screenshot. The likely reason that this causes a banding problem is that the zooming feature simply downsamples without low-pass filtering (probably for performance reasons), since the zoomed image is for on-screen display rather than for printing or saving.

The application of the sampling theorem to images should be made with care. For example, the sampling process in any standard image sensor (CCD or CMOS camera) is relatively far from ideal sampling, which would measure the image intensity at a single point. Instead, these devices have a relatively large sensor area at each sample point in order to collect a sufficient amount of light. In other words, any detector has a finite-width point spread function. The analog optical image intensity function that is sampled by the sensor is not in general bandlimited, and the non-ideal sampling itself acts as a useful low-pass filter, though not always one strong enough to remove the high frequencies that cause aliasing. When the area of the sampling spot (the size of the pixel sensor) is not large enough to provide sufficient anti-aliasing, a separate anti-aliasing filter (optical low-pass filter) is typically included in the camera system to further blur the optical image. Despite these problems in relating images to the sampling theorem, the theorem can be used to describe the basics of downsampling and upsampling of images.
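The difference between downsampling with and without a low-pass filter can be sketched on a synthetic striped image. The following is a rough illustration, not how any particular image editor works: the function names are hypothetical, the stripe frequency of 0.4 cycles per pixel is an arbitrary choice, and a 2×2 box average stands in for the low-pass filter (it is a crude one, so it reduces the aliased stripe pattern rather than removing it).

```python
import math

def make_stripes(rows, cols, cycles_per_pixel):
    """Grayscale image (list of rows) with vertical cosine stripes in [0, 1]."""
    return [[0.5 + 0.5 * math.cos(2 * math.pi * cycles_per_pixel * c)
             for c in range(cols)] for _ in range(rows)]

def decimate(image, factor):
    """Downsample by keeping every factor-th pixel, with no filtering."""
    return [row[::factor] for row in image[::factor]]

def box_downsample(image, factor):
    """Downsample by averaging factor x factor blocks (a crude low-pass filter)."""
    rows, cols = len(image), len(image[0])
    out = []
    for r in range(0, rows - factor + 1, factor):
        out_row = []
        for c in range(0, cols - factor + 1, factor):
            block = [image[r + i][c + j]
                     for i in range(factor) for j in range(factor)]
            out_row.append(sum(block) / len(block))
        out.append(out_row)
    return out

def contrast(image):
    """Difference between the brightest and darkest pixel."""
    values = [v for row in image for v in row]
    return max(values) - min(values)

# 0.4 cycles/pixel is below the original Nyquist limit (0.5 cycles/pixel) but
# above the limit after halving the resolution (0.25 cycles per original pixel).
image = make_stripes(rows=4, cols=20, cycles_per_pixel=0.4)
naive = decimate(image, 2)           # stripes alias to a coarse false pattern
filtered = box_downsample(image, 2)  # same false pattern, but much fainter
```

Comparing `contrast(naive)` with `contrast(filtered)` shows that the false low-frequency stripe pattern is strongly attenuated when the image is averaged before decimation.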
Downsampling

When a signal is downsampled, the sampling theorem can be invoked via the artifice of resampling a hypothetical continuous-time reconstruction. The Nyquist criterion must still be satisfied with respect to the new, lower sampling frequency in order to avoid aliasing. To meet the requirements of the theorem, the signal must usually pass through a low-pass filter of appropriate cutoff frequency as part of the downsampling operation. This low-pass filter, which prevents aliasing, is called an anti-aliasing filter.

Critical frequency

[Figure: CriticalFrequencyAliasing.png]
A family of sinusoids at the critical frequency, all having the same sample sequences of alternating +1 and –1. That is, they all are aliases of each other, even though their frequency is not above half the sample rate.
The Nyquist rate is defined as twice the bandwidth of the continuous-time signal. The sampling frequency must be strictly greater than the Nyquist rate of the signal to achieve unambiguous representation of the signal. This constraint is equivalent to requiring that the system's Nyquist frequency (also known as critical frequency, and equal to half the sample rate) be strictly greater than the bandwidth of the signal.

If the signal contains a frequency component at precisely the Nyquist frequency, then the corresponding component of the sample values does not carry sufficient information to reconstruct the Nyquist-frequency component of the continuous-time signal, because of phase ambiguity. In such a case, there is an infinite number of different sinusoids at the Nyquist frequency (of varying amplitude and phase) that are represented by the same discrete samples. As an example, consider this family of signals at the critical frequency:

$$x(t) = \frac{\cos(2\pi B t + \theta)}{\cos\theta} = \cos(2\pi B t) - \sin(2\pi B t)\tan\theta, \qquad -\frac{\pi}{2} < \theta < \frac{\pi}{2}$$

With $f_s = 2B$ (that is, $T = 1/(2B)$), the samples are $x(nT) = \cos(\pi n) - \sin(\pi n)\tan\theta = (-1)^n$ for every value of $\theta$.
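This phase ambiguity can be checked numerically. The sketch below assumes the family x(t) = cos(2πBt + θ)/cos θ sampled at exactly fs = 2B; the value B = 1 Hz, the chosen values of θ, and the helper name `family_member` are illustrative assumptions.

```python
import math

B = 1.0        # hypothetical bandwidth in Hz
fs = 2.0 * B   # sampling exactly at the Nyquist rate (the critical case)
T = 1.0 / fs

def family_member(theta):
    """x(t) = cos(2*pi*B*t + theta) / cos(theta), valid for -pi/2 < theta < pi/2."""
    return lambda t: math.cos(2 * math.pi * B * t + theta) / math.cos(theta)

thetas = [-1.0, -0.3, 0.0, 0.5, 1.2]

# Samples of each member at t = 0, T, 2T, ...: all come out as +1, -1, +1, ...
sample_sets = [[family_member(th)(n * T) for n in range(6)] for th in thetas]

# The continuous-time signals themselves differ: their peak amplitude is 1/cos(theta).
peak_amplitudes = [1.0 / math.cos(th) for th in thetas]
```

Every list in `sample_sets` matches the alternating sequence to within rounding error, while `peak_amplitudes` differs from member to member, so the samples alone cannot tell the members apart.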
Mathematical basis for the theorem

The Nyquist–Shannon sampling theorem states that, given a bandlimited continuous-time signal x(t) that is uniformly sampled at a sufficient rate, even if all of the information in the signal between samples is discarded, there remains sufficient information in the samples that the original continuous-time signal can be mathematically reconstructed perfectly from only those discrete samples. To prove this, a different function is first constructed, conceptually, from the whole original signal, but preserving information from just the sample instants:

$$x_s(t) = x(t) \cdot T\, \Delta_T(t), \qquad \Delta_T(t) = \sum_{n=-\infty}^{\infty} \delta(t - nT)$$

where $\delta$ is the Dirac impulse and $\Delta_T(t)$ is the Dirac comb with period $T$.
Since the Dirac impulse is zero except where its argument is zero, ΔT(t) is zero except at the sampling instants nT, for integer n. Therefore xs(t) is also zero for all t except the sampling instants nT. Multiplying x(t) by ΔT(t) effectively discards all of the information between sampling instants and retains information only at the sampling instants nT. xs(t) can be represented in terms of the samples:

$$x_s(t) = T \sum_{n=-\infty}^{\infty} x[n]\, \delta(t - nT)$$
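where x[n] = x(nT) are the samples. The sequence of sample impulses xs(t) can also be written in terms of the Fourier series of the Dirac comb:

$$x_s(t) = x(t) \cdot T \cdot \frac{1}{T} \sum_{k=-\infty}^{\infty} e^{i 2\pi k f_s t} = x(t) \sum_{k=-\infty}^{\infty} e^{i 2\pi k f_s t}$$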
Using the frequency shifting property of the continuous Fourier transform,

$$X_s(f) = \sum_{k=-\infty}^{\infty} X(f - k f_s)$$
where X(f) is the Fourier transform of x(t). This says that the spectrum of the baseband signal being sampled is shifted and repeated forever at integral multiples of the sampling frequency, fs. These repeated copies are called images of the original signal spectrum. Now constrain x(t) to be bandlimited to B (that is, X(f) = 0 for all |f| > B), and consider what condition precludes overlapping of the adjacent images X(f − k fs):

$$f_s - B > B \quad \Longleftrightarrow \quad f_s > 2B$$
With that condition satisfied, there is no overlap of images in Xs(f), and X(f) (and thus x(t)) can be reconstructed from Xs(f) (or xs(t)) by low-pass filtering out all of the images of X(f) in Xs(f) except for the original image at baseband. To do that, fs > 2B must hold (to prevent overlap) and the frequency response of the reconstruction filter H(f) must be:

$$H(f) = \mathrm{rect}\!\left(\frac{f}{f_s}\right) = \begin{cases} 1 & |f| < f_s/2 \\ 0 & |f| > f_s/2 \end{cases}$$

where $\mathrm{rect}(u)$ is the rectangular function. With H(f) so defined, it is clear that

$$H(f)\, X_s(f) = X(f)$$
and the spectrum of the original signal that was sampled, X(f), is recovered from the spectrum of the sampled signal, Xs(f). This means, in the time domain, that the original signal that was sampled, x(t), is recovered from the sampled signal, xs(t). This completes the proof of the Nyquist–Shannon sampling theorem. It says that if the sampling frequency, fs, is strictly greater than twice the bandwidth, B, of the continuous-time baseband signal, x(t), then no information is lost (or "aliased") by sampling.

To reconstruct x(t) from the samples x[n], a reconstruction filter (a brick-wall low-pass filter) with response H(f) is constructed. The impulse response of the reconstruction filter is the inverse Fourier transform of H(f):

$$h(t) = \mathcal{F}^{-1}\{H(f)\} = f_s\, \mathrm{sinc}(f_s t) = \frac{1}{T}\, \mathrm{sinc}\!\left(\frac{t}{T}\right)$$
This function is the impulse response of the reconstruction filter whose input is the sampled signal xs(t), which is just a collection of Dirac impulses, δ(t − nT), each delayed to the time of its sampling instant, nT, and weighted by a value proportional to the value of the continuous-time signal at that instant, x[n] = x(nT). Since the reconstruction filter is a linear, time-invariant system, each impulse at time nT generates its own impulse response delayed to the same time, and the output of the reconstruction filter is the sum of the outputs driven by each weighted impulse separately. For each input impulse, the component of the output is the impulse response delayed to the time of that input impulse, h(t − nT), and weighted by the same coefficient attached to that input impulse, T·x[n]. That is, the output of the reconstruction filter is:

$$x(t) = \sum_{n=-\infty}^{\infty} T\, x[n]\, h(t - nT) = \sum_{n=-\infty}^{\infty} x[n]\, \mathrm{sinc}\!\left(\frac{t - nT}{T}\right)$$
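The reconstruction sum can be tried numerically. The sketch below assumes the interpolation formula x(t) = Σ x[n] sinc((t − nT)/T) with the normalized sinc, and truncates the infinite sum to a finite window, so the result is only approximate. The test signal (a 3 Hz cosine sampled at 20 Hz), the window of 1001 samples, and the function names are hypothetical choices.

```python
import math

def sinc(u):
    """Normalized sinc: sin(pi*u) / (pi*u), with sinc(0) = 1."""
    return 1.0 if u == 0 else math.sin(math.pi * u) / (math.pi * u)

def reconstruct(samples, first_index, T, t):
    """Truncated interpolation sum: sum over n of x[n] * sinc((t - n*T) / T).

    samples[k] holds x[first_index + k]; samples outside the list are ignored,
    which is where the truncation error comes from.
    """
    return sum(xn * sinc((t - (first_index + k) * T) / T)
               for k, xn in enumerate(samples))

# Hypothetical bandlimited test signal: 3 Hz cosine, sampled above its Nyquist rate.
def x(t):
    return math.cos(2 * math.pi * 3.0 * t)

fs = 20.0
T = 1.0 / fs
half_width = 500   # use samples n = -500 .. 500
samples = [x(n * T) for n in range(-half_width, half_width + 1)]

t_between = 0.37 * T   # an instant strictly between two sample instants
estimate = reconstruct(samples, -half_width, T, t_between)
error = abs(estimate - x(t_between))   # small, but nonzero because of truncation
```

Because the sinc terms decay only like 1/n, the truncation error shrinks slowly as the window grows, which is one reason practical reconstruction filters do not use this sum directly.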
This shows explicitly how the samples x[n] are combined to reconstruct the original function x(t).

Concise summary of the mathematical proof

There is no actual device that produces the infinite-valued samples implied by the Dirac comb model of sampling. The finite-valued samples, x[n], are not a function of continuous time, thus their Fourier transform is undefined. To use that analysis tool, a continuous-time function is contrived conceptually (neither actually nor numerically) by using the samples to modulate the "teeth" of a Dirac comb function. This modulated comb does have a continuous-time Fourier transform (not within the strict definition that requires square-integrable functions, but in the generalization that allows Schwartz distributions, in the case of the original signal being square-integrable). The transform of the (virtual) modulated comb, $X_s(f)$, is related to the transform of the physical waveform, $X(f)$, via a superposition of shifted copies (which is equivalent to convolution with a frequency-domain Dirac comb); this superposition viewpoint leads to an understanding of aliasing and ways to mitigate it. When the shifted copies do not overlap, the original can be extracted by low-pass filtering, giving back the original signal.

The Fourier transform view also reveals that the sample rate can be higher than twice the highest frequency with no ill effect, and even leaves room for a transition band in which the transfer function of the reconstruction filter is free to take intermediate values. Undersampling, which causes aliasing, is not in general a reversible operation. Oversampling may be inefficient or wasteful, but it is also reversible, meaning that no information is lost.
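The irreversibility of undersampling can be seen in a small numerical experiment. In this sketch the sampling rate of 10 Hz and the 7 Hz tone are arbitrary illustrative values: a cosine above half the sampling rate produces exactly the same samples as a cosine at the mirrored frequency fs − f, so the samples cannot reveal which one was present.

```python
import math

fs = 10.0               # assumed sampling rate in Hz (Nyquist frequency 5 Hz)
T = 1.0 / fs
f_high = 7.0            # above fs/2, so the sampling condition is violated
f_alias = fs - f_high   # 3 Hz, the frequency the 7 Hz tone is mistaken for

high = [math.cos(2 * math.pi * f_high * n * T) for n in range(20)]
low = [math.cos(2 * math.pi * f_alias * n * T) for n in range(20)]

# Largest sample-by-sample difference between the two sequences.
max_diff = max(abs(a - b) for a, b in zip(high, low))
```

`max_diff` is zero up to floating-point rounding, so any reconstruction limited to frequencies below fs/2 returns the 3 Hz cosine regardless of which signal was sampled.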
Shannon's original proof

The original proof presented by Shannon is elegant and quite brief, but it offers less intuitive insight into the subtleties of aliasing, both unintentional and intentional. Quoting Shannon's original paper, which uses f for the function, F for the spectrum, and W for the bandwidth limit:
Let $F(\omega)$ be the spectrum of $f(t)$. Then

$$f(t) = \frac{1}{2\pi} \int_{-\infty}^{\infty} F(\omega)\, e^{i\omega t}\, d\omega = \frac{1}{2\pi} \int_{-2\pi W}^{2\pi W} F(\omega)\, e^{i\omega t}\, d\omega$$

since $F(\omega)$ is assumed to be zero outside the band W. If we let

$$t = \frac{n}{2W}$$

where n is any positive or negative integer, we obtain

$$f\!\left(\frac{n}{2W}\right) = \frac{1}{2\pi} \int_{-2\pi W}^{2\pi W} F(\omega)\, e^{i\omega \frac{n}{2W}}\, d\omega$$

On the left are values of $f(t)$ at the sampling points. The integral on the right will be recognized as essentially the nth coefficient in a Fourier-series expansion of the function $F(\omega)$, taking the interval –W to W as a fundamental period. This means that the values of the samples $f(n/2W)$ determine the Fourier coefficients in the series expansion of $F(\omega)$. Thus they determine $F(\omega)$, since $F(\omega)$ is zero for frequencies greater than W, and for lower frequencies $F(\omega)$ is determined if its Fourier coefficients are determined. But $F(\omega)$ determines the original function $f(t)$ completely, since a function is determined if its spectrum is known. Therefore the original samples determine the function $f(t)$ completely.

Shannon's proof of the theorem is complete at that point, but he goes on to discuss reconstruction via sinc functions, what we now call the Whittaker–Shannon interpolation formula as discussed above. He does not derive or prove the properties of the sinc function, but these would have been familiar to engineers reading his works at the time, since the Fourier pair relationship between rect and sinc was well known. Quoting Shannon:

Let $x_n$ be the nth sample. Then the function $f(t)$ is represented by:

$$f(t) = \sum_{n=-\infty}^{\infty} x_n\, \frac{\sin \pi(2Wt - n)}{\pi(2Wt - n)}$$
Sampling of non-baseband signals

For sampling a non-baseband signal, the conditions to avoid information loss and to allow perfect reconstruction can be generalized in terms of conditions on the frequency interval of nonzero spectrum. See Sampling (signal processing) for more details and examples. A bandpass condition is that $X(f) = 0$ for all nonnegative $f$ outside the open band of frequencies

$$\left( \frac{N}{2} f_s,\ \frac{N+1}{2} f_s \right)$$

for some nonnegative integer $N$. This formulation includes the normal baseband condition as the case N = 0. The corresponding interpolation function is the impulse response of a bandpass filter with cutoffs at the upper and lower edges of the specified band, which is the difference between a pair of lowpass impulse responses:

$$(N+1)\, \mathrm{sinc}\!\left(\frac{(N+1)\,t}{T}\right) - N\, \mathrm{sinc}\!\left(\frac{N t}{T}\right)$$
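The bandpass condition can be turned into a small check. The sketch below rests on two assumptions: that the signal's spectrum is nonzero only on an open interval (f_lo, f_hi) of nonnegative frequencies, and that the interpolation kernel is (N+1) sinc((N+1)t/T) − N sinc(Nt/T) with the normalized sinc. The function names `band_index` and `bandpass_kernel` are hypothetical.

```python
import math

def sinc(u):
    """Normalized sinc: sin(pi*u) / (pi*u), with sinc(0) = 1."""
    return 1.0 if u == 0 else math.sin(math.pi * u) / (math.pi * u)

def band_index(f_lo, f_hi, fs):
    """Return the nonnegative integer N such that the band (f_lo, f_hi) fits inside
    (N*fs/2, (N+1)*fs/2), or None if the band straddles a multiple of fs/2."""
    N = int(math.floor(2.0 * f_lo / fs))
    if f_hi <= (N + 1) * fs / 2.0:
        return N
    return None

def bandpass_kernel(t, T, N):
    """Difference of two lowpass impulse responses, scaled so the value at t = 0 is 1."""
    return (N + 1) * sinc((N + 1) * t / T) - N * sinc(N * t / T)

# A 20-25 Hz band sampled at only 10 Hz satisfies the condition with N = 4,
# while an 18-23 Hz band at the same rate straddles 20 Hz and does not.
n_ok = band_index(20.0, 25.0, 10.0)
n_bad = band_index(18.0, 23.0, 10.0)
```

With N = 0 the kernel reduces to sinc(t/T), the baseband interpolation function.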
Other generalizations, for example to signals occupying multiple non-contiguous bands, are possible as well. Even the most generalized form of the sampling theorem does not have a provably true converse. That is, one cannot conclude that information is necessarily lost just because the conditions of the sampling theorem are not satisfied; from an engineering perspective, however, it is generally safe to assume that if the sampling theorem is not satisfied then information will most likely be lost.

Historical background

The sampling theorem was implied by the work of Harry Nyquist in 1928 ("Certain topics in telegraph transmission theory"), in which he showed that up to 2B independent pulse samples could be sent through a system of bandwidth B; but he did not explicitly consider the problem of sampling and reconstruction of continuous signals. About the same time, Karl Küpfmüller showed a similar result[2] and discussed the sinc-function impulse response of a band-limiting filter, via its integral, the step response Integralsinus; this bandlimiting and reconstruction filter that is so central to the sampling theorem is sometimes referred to as a Küpfmüller filter (but seldom so in English).

The sampling theorem, essentially a dual of Nyquist's result, was proved by Claude E. Shannon in 1949 ("Communication in the presence of noise"). V. A. Kotelnikov published similar results in 1933 ("On the transmission capacity of the 'ether' and of cables in electrical communications", translation from the Russian), as did the mathematician E. T. Whittaker in 1915 ("Expansions of the Interpolation-Theory", "Theorie der Kardinalfunktionen"), J. M. Whittaker in 1935 ("Interpolatory function theory"), and Gabor in 1946 ("Theory of communication").
Other discoverers

Others who have independently discovered or played roles in the development of the sampling theorem have been discussed in several historical articles, for example by Jerri[3] and by Lüke.[4] For example, Lüke points out that H. Raabe, an assistant to Küpfmüller, proved the theorem in his 1939 Ph.D. dissertation; the term Raabe condition came to be associated with the criterion for unambiguous representation (sampling rate greater than twice the bandwidth). Meijering[5] mentions several other discoverers and names in a paragraph and pair of footnotes:
Why Nyquist?

Exactly how, when, or why Harry Nyquist had his name attached to the sampling theorem remains obscure. The term Nyquist Sampling Theorem (capitalized thus) appeared as early as 1959 in a book from his former employer, Bell Labs,[6] and appeared again in 1963,[7] and not capitalized in 1965.[8] It had been called the Shannon Sampling Theorem as early as 1954,[9] but also just the sampling theorem by several other books in the early 1950s. In 1958, Blackman and Tukey[10] cited Nyquist's 1928 paper as a reference for the sampling theorem of information theory, even though that paper does not treat sampling and reconstruction of continuous signals as others did. Their glossary of terms includes these entries:
When Shannon stated and proved the sampling theorem in his 1949 paper, according to Meijering[5] "he referred to the critical sampling interval T = 1/2W as the Nyquist interval corresponding to the band W, in recognition of Nyquist’s discovery of the fundamental importance of this interval in connection with telegraphy." This explains Nyquist's name on the critical interval, but not on the theorem. Similarly, Nyquist's name was attached to Nyquist rate in 1953 by Harold S. Black:[11]
According to the OED, this may be the origin of the term Nyquist rate. In Black's usage, it is not a sampling rate, but a signaling rate.


