**Chapter 5**

**III, Sampling, and Interpolation**

**5.1** **X-Ray Diffraction: Through a Glass Darkly**^{1}

Diffraction is not only an interesting phenomenon to look at, it is an important experimental tool, the tool being diffraction gratings. A diffraction grating is an aperture plane with a large number of parallel slits, closely spaced. See, for example

http://hyperphysics.phy-astr.gsu.edu/hbase/phyopt/grating.html.

Diffraction gratings are used to separate light of different wavelengths, and to measure wavelengths. I want to look briefly at this latter application.

X-rays were discovered by Wilhelm Röntgen in 1895. It was not known whether they were particles or waves, but the wave hypothesis put their wavelength at about 10^{−8} cm. Using diffraction gratings was out of the question for experiments on X-rays because diffraction effects are only seen if the width of the slits is comparable to the wavelength. It was possible to build such gratings for experiments on visible light, where the wavelengths are between 400 and 700 nanometers (4–7 × 10^{−5} cm), but the extra orders of magnitude needed to get down to X-ray wavelengths couldn't be managed.

A related set of mysteries had to do with the structure of crystals. It was thought that the macroscopic structure of crystals could be explained by a periodic arrangement of atoms, but there was no way to test this. In 1912 Max von Laue proposed that the purported periodic structure of crystals could be used to diffract X-rays, just as gratings diffracted visible light. He thus had three hypotheses:

1. X-rays are waves.

2. Crystals are periodic.

3. The spacing between atoms is of the order 10^{−8} cm.

Friedrich and Knipping carried out experiments that confirmed von Laue's hypotheses, and the subject of X-ray crystallography was born.

But you need to know some math.

^{1} 1 Corinthians 13: When I was a child, I spake as a child, I understood as a child, I thought as a child: but when I became a man, I put away childish things. For now we see through a glass, darkly; but then face to face: now I know in part; but then shall I know even as also I am known.

**Electron density distribution** An important quantity to consider in crystallography is how the electrons are distributed among the atoms in the crystal. This is usually referred to as the electron density distribution of the crystal. We want to see how we might represent this as a function, and consider what happens to the function in the course of an X-ray diffraction experiment.

Let's take the one-dimensional case as an illustration; we'll look at the (more realistic) higher dimensional case later in the course. We view a one-dimensional crystal as an evenly spaced collection of atoms along a line. In fact, for purposes of approximation, we suppose that an infinite number of them are strung out along a line. If we describe the electron density distribution of a single atom by a function ρ(x), then the electron density distribution of the crystal with spacing p is the periodic function

$$\rho_p(x) = \sum_{k=-\infty}^{\infty} \rho(x - kp)\,.$$

As our discussion of diffraction might indicate, the Fourier transform of ρ_p(x) is proportional to the "scattered amplitude" of X-rays diffracted by the crystal. Thus we want to write ρ_p(x) in a form that's amenable to taking the Fourier transform. (Incidentally, it's not unreasonable to suppose that ρ is rapidly decreasing — the electron density of a single atom dies off as we move away from the atom.)

As we'll see, it's convenient to write the periodized density as a convolution with a sum of shifted δ's:

$$\rho_p(x) = \sum_{k=-\infty}^{\infty} \rho(x - kp) = \sum_{k=-\infty}^{\infty} \delta(x - kp) * \rho(x) = \Big( \sum_{k=-\infty}^{\infty} \delta(x - kp) \Big) * \rho(x)\,.$$

Now introduce

$$\mathrm{III}_p(x) = \sum_{k=-\infty}^{\infty} \delta(x - kp)\,,$$

so that, simply,

$$\rho_p = \mathrm{III}_p * \rho\,.$$

III_p is the star of the show. Bracewell calls it the "shah function", after the Cyrillic letter Ш, and this has caught on. It's also referred to as the Dirac comb (with spacing p).

Using the convolution theorem, we have

$$\mathcal{F}\rho_p = \mathcal{F}\rho \cdot \mathcal{F}\mathrm{III}_p\,.$$

What is F III_p? That's a really interesting question.

**5.2** **The III Distribution**

We want to develop the properties of III_p, particularly its Fourier transform. In fact, we met this distribution earlier, in Chapter 1. Rather, we met its Fourier transform — it's the continuous buzz signal, as we'll discuss further below.

As a "standard" we take the spacing p to be 1, so we sum over the integer points and define

$$\mathrm{III}(x) = \sum_{k=-\infty}^{\infty} \delta(x - k) \quad\text{or}\quad \mathrm{III} = \sum_{k=-\infty}^{\infty} \delta_k\,.$$


As above, for a series of δ's spaced p apart we write

$$\mathrm{III}_p(x) = \sum_{k=-\infty}^{\infty} \delta(x - kp) \quad\text{or}\quad \mathrm{III}_p = \sum_{k=-\infty}^{\infty} \delta_{kp}\,.$$

I'll mostly write δ's "at points" in this section. It seems the more natural thing to do.

To see that the series for III_p makes sense as a distribution, let ϕ be a test function; then

$$\langle \mathrm{III}_p, \varphi \rangle = \Big\langle \sum_{k=-\infty}^{\infty} \delta_{kp}, \varphi \Big\rangle = \sum_{k=-\infty}^{\infty} \langle \delta_{kp}, \varphi \rangle = \sum_{k=-\infty}^{\infty} \varphi(kp)\,.$$

This sum converges because of the rapid decrease of ϕ at ±∞.

There are two facets to III's versatility: periodizing and sampling. We'll consider each in turn.

**5.2.1** **Periodizing with III**

Our first application of III was as above, to write the periodization of the electron density function ρ of a single atom in a crystal as a convolution. The purpose was to periodize ρ to reflect the physical structure of the crystal. This is a general procedure. The III function furnishes a handy way of generating and working with periodic functions and distributions. Take that as an aphorism.

If f is a function or distribution for which convolution with III_p makes sense, then

$$(f * \mathrm{III}_p)(t) = \sum_{k=-\infty}^{\infty} f(t - kp)$$

is periodic with period p. Note that

$$f(at + b) * \mathrm{III}_p(t) = \sum_{k=-\infty}^{\infty} f(at + b - apk)$$

also has period p, and this can just as well be written in terms of a shifted III:

$$\sum_{k=-\infty}^{\infty} f(at + b - apk) = f(at) * \mathrm{III}_p\Big(t + \frac{b}{a}\Big)\,.$$

Convolving with III_p now emerges as the basic, familiar way to produce a periodic function. However, some care must be taken; convolving with III to periodize doesn't shift the graph and link the copies up, it shifts the graph and adds the copies up.

In many cases the series

$$\sum_{k=-\infty}^{\infty} f(t - kp)$$

will converge in some reasonable sense, often at least to define a periodic distribution (see Section 5.4). A common application is to form f ∗ III_p when f is zero for |t| ≥ p/2. In this case the convolution exists and we naturally say that f ∗ III_p is the p-periodic extension of f.
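To make the periodization concrete, here is a small numerical sketch in Python. The narrow Gaussian standing in for f, and the truncation of the infinite sum at |k| ≤ 50, are my choices for illustration, not anything from the text:

```python
import numpy as np

def periodize(f, p, t, n_terms=50):
    """Approximate (f * III_p)(t) = sum over k of f(t - k*p) by a finite sum."""
    return sum(f(t - k * p) for k in range(-n_terms, n_terms + 1))

# A single "atom" profile: a narrow Gaussian, effectively zero outside |t| < p/2.
p = 1.0
f = lambda t: np.exp(-((t / 0.1) ** 2))

t = np.linspace(-2.0, 2.0, 801)
rho_p = periodize(f, p, t)

# The periodized function repeats with period p (up to tiny truncation error).
assert np.allclose(rho_p, periodize(f, p, t + p), atol=1e-6)

# On |t| < p/2 it agrees with f itself, since the shifted copies barely overlap.
middle = np.abs(t) < p / 2
assert np.allclose(rho_p[middle], f(t[middle]), atol=1e-6)
```

Since f dies off rapidly, the finite sum is indistinguishable from the infinite one at this tolerance.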

I want to look at this operation in a little more detail, one reason being that it will make the discussion of sampling and aliasing, soon to come, much cleaner and easier. Recall the scaled rect function

$$\Pi_p(x) = \begin{cases} 1 & |x| < p/2 \\ 0 & |x| \ge p/2 \end{cases}$$

If f is zero when |t| ≥ p/2 (note the ≥, not >) then

$$\Pi_p f = f$$

and

$$f = \Pi_p (f * \mathrm{III}_p)\,.$$

In fact these two conditions are equivalent. That should be clear if you have the geometric picture in mind.

For example, shown below are the graphs of a function f(x) that is zero outside of |x| < p/2 and of three cycles of its periodization; that's

$$f(x + p) + f(x) + f(x - p) = f(x) * \sum_{k=-1}^{1} \delta(x - kp)\,.$$


Here are the algebraic details that go from the picture to the formulas. If Π_p f = f then

$$\begin{aligned}
\Pi_p(t)\,(f * \mathrm{III}_p)(t) &= \Pi_p(t)\,((\Pi_p f) * \mathrm{III}_p)(t) \\
&= \Pi_p(t) \sum_{k=-\infty}^{\infty} \Pi_p(t - kp)\, f(t - kp) \\
&= \sum_{k=-\infty}^{\infty} \Pi_p(t)\,\Pi_p(t - kp)\, f(t - kp) = \Pi_p(t)\, f(t) = f(t)
\end{aligned}$$

since

$$\Pi_p(t)\,\Pi_p(t - kp) = \begin{cases} \Pi_p(t) & k = 0 \\ 0 & k \ne 0 \end{cases}$$

On the other hand, if f = Π_p (f ∗ III_p) then

$$\Pi_p f = \Pi_p\big(\Pi_p (f * \mathrm{III}_p)\big) = \Pi_p^2 (f * \mathrm{III}_p) = \Pi_p (f * \mathrm{III}_p) = f\,.$$

If we had defined Π_p differently at ±p/2 (in other cultures either Π_p(±p/2) = 1 or Π_p(±p/2) = 1/2 is typical) then the calculations and results above would hold except at the translates of ±p/2, a discrete set of points. Such an exceptional set generally poses no problems in applications.

This all seems pretty innocent, but cutting off a distribution by Π_p (a discontinuous function) is not part of the theory. We only defined the product of a distribution and a smooth function. In general we'll proceed as though all is well, though careful justifications can take some work (which we won't do). Be not afraid.

**5.2.2** **Sampling with III**

The flip side of periodizing with III is sampling with III. Here's what this means. Suppose we multiply III by a function f. Then as a distribution

$$f(x)\,\mathrm{III}(x) = \sum_{k=-\infty}^{\infty} f(x)\,\delta(x - k) = \sum_{k=-\infty}^{\infty} f(k)\,\delta(x - k)\,.$$

Multiplying III by f "samples" f at the integer points, in the sense that it "records" the values of f at those points in the sum.

There's nothing sacred about sampling at the integers, of course. Sampling using III_p means

$$f(x)\,\mathrm{III}_p(x) = \sum_{k=-\infty}^{\infty} f(kp)\,\delta(x - kp)\,,$$

so f is sampled at the points kp. Scaled or not, the thing to keep in mind about the shah function is that it takes evenly spaced samples of a function f.

To summarize:

• Convolving a function with III (with III_p) produces a periodic function with period 1 (with period p).

• Multiplying a function by III (by III_p) samples the function at the integer points (at the points kp).
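Both bullets can be checked side by side on a discrete grid, where the comb is just a vector with a spike every m entries. This is a rough numerical analogue; the grid size, the bump, and the spacing are all my choices:

```python
import numpy as np

# Discrete stand-ins on a grid: the "comb" has a spike every m grid points.
N, m = 240, 40
comb = np.zeros(N)
comb[::m] = 1.0

# A bump supported well inside one period, centered at m // 2.
x = np.arange(N)
bump = np.exp(-0.5 * ((x - m // 2) / 4.0) ** 2)

# Convolving with the comb periodizes: the result repeats every m samples.
periodized = np.convolve(comb, bump)[:N]
assert np.allclose(periodized[m:2 * m], periodized[2 * m:3 * m])

# Multiplying by the comb samples: only the values at multiples of m survive.
sampled = periodized * comb
assert np.count_nonzero(sampled) == N // m
assert np.allclose(sampled[::m], periodized[::m])
```

The same vector plays both roles: convolved against a signal it spreads copies out, multiplied against a signal it picks values off.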

**5.2.3** **Scaling identity for III_p**

There's a simple scaling identity for III_p that comes up often enough in formulas and derivations to make it worth pointing out. We've defined

$$\mathrm{III}_p(x) = \sum_{k=-\infty}^{\infty} \delta(x - kp)\,,$$

scaling the spacing of the impulses by p, but it's also natural to consider

$$\mathrm{III}(px) = \sum_{k=-\infty}^{\infty} \delta(px - k)\,.$$

Now recall the scaling property of δ; for p > 0,

$$\delta(px) = \frac{1}{p}\,\delta(x)\,.$$

Plugging this into the formula for III(px) gives

$$\mathrm{III}(px) = \sum_{k=-\infty}^{\infty} \delta(px - k) = \sum_{k=-\infty}^{\infty} \delta\Big(p\Big(x - \frac{k}{p}\Big)\Big) = \sum_{k=-\infty}^{\infty} \frac{1}{p}\,\delta\Big(x - \frac{k}{p}\Big) = \frac{1}{p}\,\mathrm{III}_{1/p}(x)\,.$$

To give it its own display:

$$\mathrm{III}(px) = \frac{1}{p}\,\mathrm{III}_{1/p}(x)$$

(It would be a good exercise to derive this in a variable-free environment, using the delay operator τ_p and the scaling operator σ_p.) By the same token,

$$\mathrm{III}_p(x) = \frac{1}{p}\,\mathrm{III}\Big(\frac{1}{p}\,x\Big)\,.$$

**5.3** **The Fourier Transform of III, or, The deepest fact about the integers is well known to every electrical engineer and spectroscopist**

The most interesting thing about III is what happens when we take its Fourier transform. If we start with the definition

$$\mathrm{III}(x) = \sum_{k=-\infty}^{\infty} \delta(x - k)$$

and apply what we know about the Fourier transform of δ (it's 1) plus the shift theorem, we obtain

$$\mathcal{F}\mathrm{III}(s) = \sum_{k=-\infty}^{\infty} e^{-2\pi i k s}\,.$$


Since we're summing over all positive and negative k we can write this as

$$\mathcal{F}\mathrm{III}(s) = \sum_{k=-\infty}^{\infty} e^{2\pi i k s}\,,$$

which looks more like a Fourier series. We did see this when we introduced the buzz signal. It sounds like a signal with every harmonic present in equal amounts. It sounds terrible.

The expression

$$\sum_{k=-\infty}^{\infty} e^{2\pi i k s}$$

actually does make sense as a distribution, as we'll see, but it's not yet a helpful expression. Instead, to find the Fourier transform of III we go back to the definition in terms of tempered distributions. If ϕ is a Schwartz function then

$$\langle \mathcal{F}\mathrm{III}, \varphi \rangle = \langle \mathrm{III}, \mathcal{F}\varphi \rangle\,.$$

On the right hand side,

$$\langle \mathrm{III}, \mathcal{F}\varphi \rangle = \Big\langle \sum_{k=-\infty}^{\infty} \delta_k, \mathcal{F}\varphi \Big\rangle = \sum_{k=-\infty}^{\infty} \langle \delta_k, \mathcal{F}\varphi \rangle = \sum_{k=-\infty}^{\infty} \mathcal{F}\varphi(k)\,.$$

And now we have something absolutely remarkable.

• **The Poisson summation formula** Let ϕ be a Schwartz function. Then

$$\sum_{k=-\infty}^{\infty} \mathcal{F}\varphi(k) = \sum_{k=-\infty}^{\infty} \varphi(k)\,.$$

This result actually holds for other classes of functions (the Schwartz class was certainly not known to Poisson!) but that’s not important for us.
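The formula is easy to check numerically for a particular Schwartz function. Here is a sketch using a Gaussian, whose Fourier transform under the convention Fφ(s) = ∫ e^{−2πist} φ(t) dt is known in closed form; the parameter a = 1/2 and the truncation at |k| ≤ 50 are arbitrary choices:

```python
import math

# Gaussian test function: with F phi(s) = integral of e^{-2 pi i s t} phi(t) dt,
# the transform of exp(-pi * a * t^2) is (1 / sqrt(a)) * exp(-pi * s^2 / a).
a = 0.5
phi = lambda x: math.exp(-math.pi * a * x * x)
phi_hat = lambda s: math.exp(-math.pi * s * s / a) / math.sqrt(a)

lhs = sum(phi(k) for k in range(-50, 51))       # sum of phi at the integers
rhs = sum(phi_hat(k) for k in range(-50, 51))   # sum of F phi at the integers
assert abs(lhs - rhs) < 1e-12                   # Poisson summation in action
```

Both Gaussians decay so fast that the truncated sums agree with the infinite ones to machine precision.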

The Poisson summation formula is the deepest fact known about the integers. It’s known to every electrical engineer and every spectroscopist because of what it says about the Fourier transform of F III. We’ll settle that now and come back to the derivation of the formula afterward.

We pick up our calculation of F III where we left off:

$$\begin{aligned}
\langle \mathcal{F}\mathrm{III}, \varphi \rangle &= \sum_{k=-\infty}^{\infty} \mathcal{F}\varphi(k) \\
&= \sum_{k=-\infty}^{\infty} \varphi(k) \quad \text{(because of the Poisson summation formula)} \\
&= \sum_{k=-\infty}^{\infty} \langle \delta_k, \varphi \rangle \quad \text{(definition of } \delta_k\text{)} \\
&= \Big\langle \sum_{k=-\infty}^{\infty} \delta_k, \varphi \Big\rangle = \langle \mathrm{III}, \varphi \rangle\,.
\end{aligned}$$

Comparing where we started to where we ended up, we conclude that

$$\mathcal{F}\mathrm{III} = \mathrm{III}\,.$$

Outstanding. The III distribution is its own Fourier transform. (See also Section 5.10.)

Outstanding. The III distribution is its own Fourier transform. (See also Section 5.10.)

**Proof of the Poisson Summation Formula** The proof of the Poisson summation formula is an
excellent example of the power of having two different representations of the same thing, an idea certainly
at the heart of Fourier analysis. Remember the maxim: If you can evaluate an expression in two different
ways it’s likely you’ve done something significant.

Given a test function ϕ(t) we periodize to Φ(t) of period 1:

$$\Phi(t) = (\varphi * \mathrm{III})(t) = \sum_{k=-\infty}^{\infty} \varphi(t - k)\,.$$

As a periodic function, Φ has a Fourier series:

$$\Phi(t) = \sum_{m=-\infty}^{\infty} \hat{\Phi}(m)\, e^{2\pi i m t}\,.$$

Let's find the Fourier coefficients of Φ(t):

$$\begin{aligned}
\hat{\Phi}(m) &= \int_0^1 e^{-2\pi i m t}\,\Phi(t)\,dt \\
&= \int_0^1 \sum_{k=-\infty}^{\infty} e^{-2\pi i m t}\,\varphi(t - k)\,dt = \sum_{k=-\infty}^{\infty} \int_0^1 e^{-2\pi i m t}\,\varphi(t - k)\,dt \\
&= \sum_{k=-\infty}^{\infty} \int_{-k}^{-k+1} e^{-2\pi i m (t + k)}\,\varphi(t)\,dt \\
&= \sum_{k=-\infty}^{\infty} \int_{-k}^{-k+1} e^{-2\pi i m t}\,e^{-2\pi i m k}\,\varphi(t)\,dt \quad \text{(using } e^{-2\pi i m k} = 1\text{)} \\
&= \int_{-\infty}^{\infty} e^{-2\pi i m t}\,\varphi(t)\,dt \\
&= \mathcal{F}\varphi(m)\,.
\end{aligned}$$

Therefore

$$\Phi(t) = \sum_{m=-\infty}^{\infty} \mathcal{F}\varphi(m)\,e^{2\pi i m t}\,.$$

(We’ve actually seen this calculation before, in a disguised form; look back to Section 3.5 on the relationship between the solutions of the heat equation on the line and on the circle.)

Since Φ is a smooth function, the Fourier series converges. Now compute Φ(0) two ways, one way from plugging into its definition and the other from plugging into its Fourier series:

$$\Phi(0) = \sum_{k=-\infty}^{\infty} \varphi(-k) = \sum_{k=-\infty}^{\infty} \varphi(k)$$

$$\Phi(0) = \sum_{k=-\infty}^{\infty} \mathcal{F}\varphi(k)\,e^{2\pi i k \cdot 0} = \sum_{k=-\infty}^{\infty} \mathcal{F}\varphi(k)$$

Done.


**The Fourier transform of III_p** From F III = III we can easily deduce the formula for F III_p. Using the identities

$$\mathrm{III}_p(x) = \frac{1}{p}\,\mathrm{III}\Big(\frac{1}{p}\,x\Big) \quad\text{and}\quad \mathrm{III}(px) = \frac{1}{p}\,\mathrm{III}_{1/p}(x)\,,$$

we have

$$\begin{aligned}
\mathcal{F}\mathrm{III}_p(s) &= \frac{1}{p}\,\mathcal{F}\Big(\mathrm{III}\Big(\frac{x}{p}\Big)\Big) \\
&= \frac{1}{p}\,p\,\mathcal{F}\mathrm{III}(ps) \quad \text{(stretch theorem)} \\
&= \mathrm{III}(ps) \\
&= \frac{1}{p}\,\mathrm{III}_{1/p}(s)\,.
\end{aligned}$$
**5.3.1** **Crystal gazing**

Let's return now to the setup for X-ray diffraction for a one-dimensional crystal. We described the electron density distribution of a single atom by a function ρ(x) and the electron density distribution of the crystal with spacing p as

$$\rho_p(x) = \sum_{k=-\infty}^{\infty} \rho(x - kp) = (\rho * \mathrm{III}_p)(x)\,.$$

Then

$$\begin{aligned}
\mathcal{F}\rho_p(s) &= \mathcal{F}(\rho * \mathrm{III}_p)(s) \\
&= (\mathcal{F}\rho \cdot \mathcal{F}\mathrm{III}_p)(s) \\
&= \mathcal{F}\rho(s)\,\frac{1}{p}\,\mathrm{III}_{1/p}(s) \\
&= \sum_{k=-\infty}^{\infty} \frac{1}{p}\,\mathcal{F}\rho\Big(\frac{k}{p}\Big)\,\delta\Big(s - \frac{k}{p}\Big)\,.
\end{aligned}$$

Here's the significance of this. In an X-ray diffraction experiment what you see on the X-ray film is a bunch of spots, corresponding to F ρ_p. The intensity of each spot is proportional to the magnitude of the Fourier transform of the electron density ρ, and the spots are spaced a distance 1/p apart, not p apart. If you were an X-ray crystallographer and didn't know your Fourier transforms, you might assume that there is a relation of direct proportion between the spacing of the dots on the film and the spacing of the atoms in the crystal, but it's a reciprocal relation — kiss your Nobel prize goodbye. Every spectroscopist knows this.
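Here is a numerical caricature of the experiment, with Gaussian "atoms" standing in for ρ; the atom width, spacing, and grid parameters are all my choices. The FFT peaks land at multiples of 1/p, the reciprocal of the atomic spacing:

```python
import numpy as np

# A one-dimensional "crystal": Gaussian "atoms" spaced p apart on a long
# interval [0, L), sampled on a fine grid of n points.
p, L, n = 2.0, 256.0, 4096
x = np.arange(n) * (L / n)
rho_p = sum(np.exp(-((x - c) ** 2) / 0.02) for c in np.arange(0, L, p))

spectrum = np.abs(np.fft.rfft(rho_p))
freqs = np.fft.rfftfreq(n, d=L / n)

# The strong peaks sit at multiples of 1/p = 0.5, not of p: reciprocal spacing.
peaks = freqs[spectrum > 0.5 * spectrum.max()]
assert np.allclose(peaks, [0.0, 0.5, 1.0, 1.5])
assert np.allclose(np.diff(peaks), 1 / p)
```

Only the first few reciprocal-lattice peaks clear the 50% threshold because the transform of a single atom, the envelope Fρ(k/p), dies off with frequency.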

We'll see a similar relation when we consider higher dimensional Fourier transforms and higher dimensional III-functions. A III-function will be associated with a lattice and the Fourier transform will be a III-function associated with the reciprocal or dual lattice. This phenomenon has turned out to be important in image processing; see, for example, *Digital Video Processing* by A. M. Tekalp.

**5.4** **Periodic Distributions and Fourier series**

I want to collect a few facts about periodic distributions and Fourier series, and show how we can use III as a convenient tool for “classical” Fourier series.

**Periodicity** The notion of periodicity for distributions is invariance under the delay operator τ_p; i.e., a distribution (or a function, for that matter) is periodic with period p if

$$\tau_p S = S\,.$$

This is the "variable free" definition, since we're not supposed to write

$$S(x - p) = S(x) \quad\text{or}\quad S(x + p) = S(x)\,,$$

which is the usual way of expressing periodicity. It's a pleasure to report that III_p is periodic with period p.

You can see that most easily by doing what we're not supposed to do:

$$\mathrm{III}_p(x + p) = \sum_{k=-\infty}^{\infty} \delta(x + p - kp) = \sum_{k=-\infty}^{\infty} \delta(x - (k - 1)p) = \sum_{k=-\infty}^{\infty} \delta(x - kp) = \mathrm{III}_p(x)\,.$$

It's also easy to give a variable-free demonstration, which amounts to the same thing:

$$\tau_p \mathrm{III}_p = \sum_{k=-\infty}^{\infty} \tau_p \delta_{kp} = \sum_{k=-\infty}^{\infty} \delta_{kp+p} = \sum_{k=-\infty}^{\infty} \delta_{p(k+1)} = \sum_{k=-\infty}^{\infty} \delta_{kp} = \mathrm{III}_p\,.$$

When we periodize a test function ϕ by forming the convolution

$$\Phi(x) = (\varphi * \mathrm{III}_p)(x)\,,$$

it's natural to view the periodicity of Φ as a consequence of the periodicity of III_p. By this I mean we can appeal to:

• If S or T is periodic of period p then S ∗ T (when it is defined) is periodic of period p.

Let me show this for functions (something we could have done way back) and I'll let you establish the general result. Suppose f is periodic of period p. Consider (f ∗ g)(x + p). We have

$$(f * g)(x + p) = \int_{-\infty}^{\infty} f(x + p - y)\,g(y)\,dy = \int_{-\infty}^{\infty} f(x - y)\,g(y)\,dy = (f * g)(x)\,.$$

The same argument works if instead g is periodic.

So, on the one hand, convolving with III_p produces a periodic function. On the other hand, suppose Φ is periodic of period p and we cut out one period of it by forming Π_p Φ. We get Φ back, in toto, by forming the convolution with III_p; that is,

$$\Phi = \varphi * \mathrm{III}_p = (\Pi_p \Phi) * \mathrm{III}_p\,.$$

(Well, this is almost right. The cut-off Π_p Φ is zero at ±p/2 while Φ(±p/2) certainly may not be zero. These "exceptions" at the endpoints won't affect the discussion here in any substantive way.^{2})

The upshot of this is that something is periodic if and only if it is a convolution with III_p. This is a nice point of view. I'll take this up further in Section 5.10.

^{2} We can either: (a) ignore this problem; (b) jigger the definition of Π_p to make it really true, which has other problems; or (c) say that the statement is true as an equality between distributions, and tell ourselves that modifying the functions at a discrete set of points will not affect that equality.


**Fourier series for III** Taking the Fourier transform of III term by term we arrived at

$$\mathcal{F}\mathrm{III} = \sum_{k=-\infty}^{\infty} e^{2\pi i k t}\,,$$

and if we next use F III = III we would then have

$$\mathrm{III} = \sum_{k=-\infty}^{\infty} e^{2\pi i k t}\,.$$

The series

$$\sum_{k=-\infty}^{\infty} e^{2\pi i k t}$$
does define a distribution, for

$$\Big\langle \sum_{k=-\infty}^{\infty} e^{2\pi i k t}, \varphi \Big\rangle = \int_{-\infty}^{\infty} \sum_{k=-\infty}^{\infty} e^{2\pi i k t}\,\varphi(t)\,dt$$

exists for any test function ϕ because ϕ is rapidly decreasing. There's a pretty straightforward development of Fourier series for tempered distributions, and while we won't enter into it, suffice it to say we do indeed have

$$\mathrm{III} = \sum_{k=-\infty}^{\infty} e^{2\pi i k t}\,.$$

The right hand side really is the Fourier series for III. But, by the way, you can't prove this without proving the Poisson summation formula and that F III = III, so Fourier series isn't a shortcut to the latter in this case.

Remember that we saw the finite version of the Fourier series for III back in the Fourier series section:

$$D_N(t) = \sum_{n=-N}^{N} e^{2\pi i n t} = \frac{\sin(\pi(2N+1)t)}{\sin \pi t}\,.$$

Here's the graph for N = 20:

It's now really true that

$$D_N \to \mathrm{III}$$

as N → ∞, where the convergence is in the sense of distributions.
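Both the sum and the closed form are easy to compute, so we can check them against each other (N = 20 as in the graph; the particular sample points are my choice):

```python
import numpy as np

def dirichlet(N, t):
    """D_N(t) = sum_{n=-N}^{N} e^{2 pi i n t}, computed two ways."""
    direct = sum(np.exp(2j * np.pi * n * t) for n in range(-N, N + 1))
    closed = np.sin(np.pi * (2 * N + 1) * t) / np.sin(np.pi * t)
    return direct, closed

t = np.linspace(0.01, 0.99, 99)      # avoid the integers, where sin(pi t) = 0
direct, closed = dirichlet(20, t)
assert np.allclose(direct.real, closed)
assert np.allclose(direct.imag, 0.0, atol=1e-9)

# At each integer the terms all equal 1, so the sum is 2N + 1 = 41: the tall
# spike that sharpens toward a delta as N grows.
peak = sum(np.exp(2j * np.pi * n * 0.0) for n in range(-20, 21))
assert np.isclose(peak.real, 41.0)
```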

**Fourier transform of a Fourier series** When we first started to work with tempered distributions, I said that we would be able to take the Fourier transform of functions that didn't have one, i.e., functions for which the integral defining the (classical) Fourier transform does not exist. We've made good on that promise, including for complex exponentials, for which

$$\mathcal{F} e^{2\pi i k t / p} = \delta\Big(s - \frac{k}{p}\Big)\,.$$

With this we can now find the Fourier transform of a Fourier series. If

$$\varphi(t) = \sum_{k=-\infty}^{\infty} c_k\, e^{2\pi i k t / p}$$

then

$$\mathcal{F}\varphi(s) = \sum_{k=-\infty}^{\infty} c_k\, \mathcal{F} e^{2\pi i k t / p} = \sum_{k=-\infty}^{\infty} c_k\, \delta\Big(s - \frac{k}{p}\Big)\,.$$

It may well be that the series $\sum_{k=-\infty}^{\infty} c_k e^{2\pi i k t / p}$ converges to define a tempered distribution — that's not asking too much^{3} — even if it doesn't converge pointwise to ϕ(t). Then it still makes sense to consider its Fourier transform, and the formula above is OK.

**Rederiving Fourier series for a periodic function** We can turn this around and rederive the formula for Fourier series as a consequence of our work on Fourier transforms. Suppose Φ is periodic of period p and write, as we know we can,

$$\Phi = \varphi * \mathrm{III}_p\,,$$

where ϕ is one period of Φ, say ϕ = Π_p Φ. Take the Fourier transform of both sides and boldly invoke the convolution theorem:

$$\mathcal{F}\Phi = \mathcal{F}(\varphi * \mathrm{III}_p) = \mathcal{F}\varphi \cdot \mathcal{F}\mathrm{III}_p = \mathcal{F}\varphi \cdot \frac{1}{p}\,\mathrm{III}_{1/p}\,,$$

or, at points,

$$\mathcal{F}\Phi(s) = \mathcal{F}\varphi(s)\,\frac{1}{p} \sum_{k=-\infty}^{\infty} \delta\Big(s - \frac{k}{p}\Big) = \frac{1}{p} \sum_{k=-\infty}^{\infty} \mathcal{F}\varphi\Big(\frac{k}{p}\Big)\,\delta\Big(s - \frac{k}{p}\Big)\,.$$

Now boldly take the inverse Fourier transform:

$$\Phi(t) = \sum_{k=-\infty}^{\infty} \frac{1}{p}\,\mathcal{F}\varphi\Big(\frac{k}{p}\Big)\,e^{2\pi i k t / p} \quad \Big(\text{the } \mathcal{F}\varphi\Big(\frac{k}{p}\Big) \text{ are constants}\Big)\,.$$
But

$$\begin{aligned}
\frac{1}{p}\,\mathcal{F}\varphi\Big(\frac{k}{p}\Big) &= \frac{1}{p} \int_{-\infty}^{\infty} e^{-2\pi i (k/p) t}\,\varphi(t)\,dt \\
&= \frac{1}{p} \int_{-\infty}^{\infty} e^{-2\pi i (k/p) t}\,\Pi_p(t)\,\Phi(t)\,dt = \frac{1}{p} \int_{-p/2}^{p/2} e^{-2\pi i (k/p) t}\,\Phi(t)\,dt\,,
\end{aligned}$$

^{3} For example, if ϕ is integrable, so that the coefficients c_k tend to zero. Or even less than that will do, just as long as the coefficients don't grow too rapidly.


and this is the k-th Fourier coefficient c_k of Φ. We've rederived

$$\Phi(t) = \sum_{k=-\infty}^{\infty} c_k\, e^{2\pi i k t / p}\,, \quad\text{where}\quad c_k = \frac{1}{p} \int_{-p/2}^{p/2} e^{-2\pi i (k/p) t}\,\Phi(t)\,dt\,.$$
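As a check on the rederived formula, here is a numerical computation of the c_k for a simple p-periodic signal whose coefficients we know in advance (the particular Φ and p = 2 are my choices):

```python
import numpy as np

p = 2.0
Phi = lambda t: np.cos(2 * np.pi * t / p) + 0.5 * np.sin(4 * np.pi * t / p)

def fourier_coeff(k, n=20000):
    """c_k = (1/p) * integral over one period of e^{-2 pi i k t / p} Phi(t) dt,
    approximated by the rectangle rule (exact here, since Phi is a finite
    trigonometric sum well below the grid's Nyquist frequency)."""
    t = np.linspace(-p / 2, p / 2, n, endpoint=False)
    return np.mean(np.exp(-2j * np.pi * k * t / p) * Phi(t))

# cos contributes 1/2 at k = +-1; the 0.5*sin term contributes -+0.25i at k = +-2.
assert np.isclose(fourier_coeff(1), 0.5)
assert np.isclose(fourier_coeff(2), -0.25j)
assert np.isclose(fourier_coeff(0), 0.0, atol=1e-12)
```

Averaging over an evenly spaced grid covering exactly one period is the discrete version of the integral, which is why `np.mean` does the whole job.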

**5.5** **Sampling Signals**

In the previous lecture we studied three properties of III that make it so useful in many applications. They are:

• Periodizing

◦ Convolving with III periodizes a function.

• Sampling

◦ Multiplying by III samples a function.

• The Fourier transform of III is III.

◦ Convolving and multiplying are themselves flip sides of the same coin via the convolution theorem for Fourier transforms.

We are now about to combine all of these ideas in a spectacular way to treat the problem of “sampling and interpolation”. Let me state the problem this way:

• Given a signal f(t) and a collection of samples of the signal, i.e., values of the signal at a set of points f(t_0), f(t_1), f(t_2), . . . , to what extent can one interpolate the values f(t) at other points from the sample values?

This is an old question, and a broad one, and it would appear on the surface to have nothing to do with III’s or Fourier transforms, or any of that. But we’ve already seen some clues, and the full solution is set to unfold.

**5.5.1** **Sampling sines and bandlimited signals**

Why should we expect to be able to do interpolation at all? Imagine putting down a bunch of dots — maybe even infinitely many — and asking someone to pass a curve through them that agrees everywhere exactly with a predetermined mystery function passing through those dots. Ridiculous. But it's not ridiculous. If a relatively simple hypothesis is satisfied then interpolation can be done! Here's one way of getting some intuitive sense of the problem and what that hypothesis should be.

Suppose we know a signal is a single sinusoid. A sinusoid repeats, so if we have enough information to pin it down over one period, or cycle, then we know the whole thing. How many samples — how many values of the function — within one period do we need to know to know which sinusoid we have? We need three samples strictly within one cycle. You can think of the graph, or you can think of the equation: a general sinusoid is of the form A sin(2πνt + φ). There are three unknowns: the amplitude A, the frequency ν, and the phase φ. We would expect to need three equations to find the unknowns, hence we need values of the function at three points, three samples.

What if the signal is a sum of sinusoids, say

$$\sum_{n=1}^{N} A_n \sin(2\pi n \nu t + \phi_n)\,?$$

Sample points for the sum are "morally" sample points for the individual harmonics, though not explicitly.

We need to take enough samples to get sufficient information to determine all of the unknowns for all of the harmonics. Now, in the time it takes for the combined signal to go through one cycle, the individual harmonics will have gone through several cycles, the lowest frequency harmonic through one cycle, the lower frequency harmonics through a few cycles, say, and the higher frequency harmonics through many.

We have to take enough samples of the combined signal so that as the individual harmonics go rolling along we'll be sure to have at least three samples in some cycle of every harmonic.

To simplify and standardize, we assume that we take evenly spaced samples (in t). Since we've phrased things in terms of cycles per second, to understand how many samples are enough it's then also better to think in terms of "sampling rate", i.e., samples/sec, instead of "number of samples". If we are to have at least three samples strictly within a cycle then the sample points must be strictly less than a half-cycle apart. A sinusoid of frequency ν goes through a half-cycle in 1/(2ν) seconds, so we want

$$\text{spacing between samples} = \frac{\text{number of seconds}}{\text{number of samples}} < \frac{1}{2\nu}\,.$$

The more usual way of putting this is

$$\text{sampling rate} = \text{samples/sec} > 2\nu\,.$$

This is the rate at which we should sample a given sinusoid of frequency ν to guarantee that a single cycle will contain at least three sample points. Furthermore, if we sample at this rate for a given frequency, we will certainly have more than three sample points in some cycle of any harmonic at a lower frequency.

Note that the sampling rate has units 1/seconds and that sample points are 1/(sampling rate) seconds apart.

For the combined signal — a sum of harmonics — the higher frequencies are driving up the sampling rate; specifically, the highest frequency is driving up the rate. To think of the interpolation problem geometrically, high frequencies cause more rapid oscillations, i.e., rapid changes in the function over small intervals, so to hope to interpolate such fluctuations accurately we'll need a lot of sample points and thus a high sampling rate. For example, here's a picture of the sum of two sinusoids, one of low frequency and one of high frequency.


If we sample at too low rate we might miss the wiggles entirely. We might mistakenly think we had only the low frequency sinusoid, and, moreover, if all we had to go on were the samples we wouldn’t even know we’d made a mistake! We’ll come back to just this problem a little later.
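Here is that mistake in miniature (the rates, 9 Hz and 8 samples/sec, are my choices): a sinusoid above half the sampling rate produces exactly the same samples as one below it.

```python
import numpy as np

fs = 8.0                       # sampling rate: 8 samples/sec
t = np.arange(0, 2, 1 / fs)    # the sample points t = k / fs

# A 9 Hz sinusoid sampled at 8 Hz is indistinguishable from a 1 Hz sinusoid:
# sin(2 pi * 9 * k/8) = sin(2 pi * k + 2 pi * k/8) = sin(2 pi * 1 * k/8).
high = np.sin(2 * np.pi * 9 * t)
low = np.sin(2 * np.pi * 1 * t)
assert np.allclose(high, low, atol=1e-9)
```

From the samples alone there is no way to tell the two signals apart; the 9 Hz sinusoid is "aliased" down to 1 Hz.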

If we sample at a rate greater than twice the highest frequency, our sense is that we will be sampling often enough for all the lower harmonics as well, and we should be able to determine everything. The problem here is if the spectrum is unbounded. If, as for a square wave, we have a full Fourier series and not just a finite sum of sinusoids, then we have no hope of sampling frequently enough to determine the combined signal from the samples. For a square wave, for example, there is no "highest frequency". That's trouble.

It’s time to define ourselves out of this trouble.

**Bandlimited signals** From the point of view of the preceding discussion, the problem for interpolation is high frequencies, and the best thing a signal can be is a finite Fourier series. The latter is much too restrictive for applications, of course, so what's the "next best" thing a signal can be? It's one for which there is a highest frequency. These are the bandlimited signals — signals whose Fourier transforms are identically zero outside of a finite interval. Such a signal has a bounded spectrum; there is a "highest frequency".

More formally:

• A signal f(t) is bandlimited if there is a finite number p such that Ff(s) = 0 for all |s| ≥ p/2. The smallest number p for which this is true is called the bandwidth of f(t).

There's a question about having Ff be zero at the endpoints ±p/2 as part of the definition. For the following discussion on sampling and interpolation, it's easiest to assume this is the case, and treat separately some special cases when it isn't. For those who want to know more, read the next paragraph.

**Some technical remarks** If f(t) is an integrable function then Ff(s) is continuous, so if Ff(s) = 0 for all |s| > p/2 then Ff(±p/2) = 0 as well. On the other hand, it's also common first to define the support of a function (integrable or not) as the complement of the largest open set on which the function is identically zero. (This definition can also be given for distributions.) This makes the support closed, being the complement of an open set. For example, if Ff(s) is identically zero for |s| > p/2, and on no larger open set, then the support of Ff is the closed interval [−p/2, +p/2]. Thus, with this definition, even if Ff(±p/2) = 0 the endpoints ±p/2 are included in the support of Ff.

One then says, as an alternate definition, that f is bandlimited if the support of Ff is closed and bounded. In mathematical terms, a closed, bounded set (in R^n) is said to be compact, and so the shorthand definition of bandlimited is that Ff has compact support. A typical compact set is a closed interval, like [−p/2, +p/2], but we could also take finite unions of closed intervals. This definition is probably the one more often given, but it's a little more involved to set up, as you've just witnessed. Whichever definition of bandlimited one adopts, there are always questions about what happens at the endpoints anyway, as we'll see.

**5.6** **Sampling and Interpolation for Bandlimited Signals**

We’re about to solve the interpolation problem for bandlimited signals. We’ll show that interpolation is possible by finding an explicit formula that does the job. Before going through the solution, however, I want to make a general observation that’s independent of the interpolation problem but is important to it.

It is unphysical to consider a signal as lasting forever in time. A physical signal f(t) is naturally "timelimited", meaning that f(t) is identically zero for |t| ≥ q/2 for some q; there just isn't any signal beyond a point. On the other hand, it is very physical to consider a bandlimited signal, one with no frequencies beyond a certain point, or at least no frequencies that our instruments can register. Well, we can't have both, at least not in the ideal world of mathematics. Here is where mathematical description meets physical expectation — and they disagree. The fact is:

• A signal cannot be both timelimited and bandlimited.

What this means in practice is that there must be inaccuracies in a mathematical model of a phenomenon that assumes a signal is both timelimited and bandlimited. Such a model can be at best an approximation, and one has to be prepared to estimate the errors as they may affect measurements and conclusions.

Here's one argument why the statement is true; I'll give a more complete proof of a more general statement in Appendix 1. Suppose f is bandlimited, say Ff(s) is zero for |s| ≥ p/2. Then

$$\mathcal{F}f = \Pi_p \cdot \mathcal{F}f\,.$$

Take the inverse Fourier transform of both sides to obtain

$$f(t) = p\,\mathrm{sinc}\,pt * f(t)\,.$$

Now sinc pt "goes on forever"; it decays, but it has nonzero values all the way out to ±∞. Hence the convolution with f also goes on forever; it is not timelimited.

**sinc as a “convolution identity”** There's an interesting observation that goes along with the argument we just gave. We're familiar with $\delta$ acting as an "identity element" for convolution, meaning

$$f * \delta = f\,.$$

This important property of $\delta$ holds for all signals for which the convolution is defined. We've just seen, for the more restricted class of bandlimited functions with spectrum from $-p/2$ to $+p/2$, that the sinc function also has this property:

$$p\,\mathrm{sinc}\,pt * f(t) = f(t)\,.$$
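As a numerical sanity check (a sketch, not part of the argument above), we can approximate the convolution integral by a Riemann sum on a long grid and watch the identity hold for a concrete bandlimited signal: $\mathrm{sinc}^2 t$ has the triangle function as its spectrum, supported in $[-1, 1]$, so $p = 2$ here.

```python
import numpy as np

def f(t):
    # sinc^2 is bandlimited: its spectrum (a triangle) is supported in |s| <= 1, so p = 2
    return np.sinc(t) ** 2

h = 0.01
s = np.arange(-200, 200, h)   # truncate the infinite convolution integral to [-200, 200]

def conv_with_sinc(t):
    # (p sinc(p .) * f)(t) = integral of p sinc(p(t - s)) f(s) ds, with p = 2
    return np.sum(2 * np.sinc(2 * (t - s)) * f(s)) * h

print(conv_with_sinc(0.3), f(0.3))   # the two values agree to several decimal places
```

The agreement is only approximate because the integral is truncated and discretized, but the tails decay fast enough that the error is far below the print precision.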

**The Sampling Theorem** Ready to solve the interpolation problem? It uses all the important properties of III, but it goes so fast that you might miss the fun entirely if you read too quickly.

Suppose $f(t)$ is bandlimited with $\mathcal{F}f(s)$ identically zero for $|s| \ge p/2$. We periodize $\mathcal{F}f$ using $\mathrm{III}_p$ and then cut off to get $\mathcal{F}f$ back again:

$$\mathcal{F}f = \Pi_p(\mathcal{F}f * \mathrm{III}_p)\,.$$

This is the crucial equation.


Now take the inverse Fourier transform:

$$
\begin{aligned}
f(t) = \mathcal{F}^{-1}\mathcal{F}f(t)
&= \mathcal{F}^{-1}\big(\Pi_p(\mathcal{F}f * \mathrm{III}_p)\big)(t) \\
&= \mathcal{F}^{-1}\Pi_p(t) * \mathcal{F}^{-1}(\mathcal{F}f * \mathrm{III}_p)(t)
&& \text{(taking $\mathcal{F}^{-1}$ turns multiplication into convolution)} \\
&= \mathcal{F}^{-1}\Pi_p(t) * \big(\mathcal{F}^{-1}\mathcal{F}f(t) \cdot \mathcal{F}^{-1}\mathrm{III}_p(t)\big)
&& \text{(ditto, except it's convolution turning into multiplication)} \\
&= p\,\mathrm{sinc}\,pt * \Big(f(t) \cdot \frac{1}{p}\,\mathrm{III}_{1/p}(t)\Big) \\
&= \mathrm{sinc}\,pt * \sum_{k=-\infty}^{\infty} f\Big(\frac{k}{p}\Big)\,\delta\Big(t - \frac{k}{p}\Big)
&& \text{(the sampling property of $\mathrm{III}_{1/p}$)} \\
&= \sum_{k=-\infty}^{\infty} f\Big(\frac{k}{p}\Big)\,\mathrm{sinc}\,pt * \delta\Big(t - \frac{k}{p}\Big) \\
&= \sum_{k=-\infty}^{\infty} f\Big(\frac{k}{p}\Big)\,\mathrm{sinc}\,p\Big(t - \frac{k}{p}\Big)
&& \text{(the sifting property of $\delta$)}
\end{aligned}
$$

We've just established the classic "Sampling Theorem", though it might be better to call it the "interpolation theorem". Here it is as a single statement:

• If $f(t)$ is a signal with $\mathcal{F}f(s)$ identically zero for $|s| \ge p/2$, then

$$f(t) = \sum_{k=-\infty}^{\infty} f\Big(\frac{k}{p}\Big)\,\mathrm{sinc}\,p\Big(t - \frac{k}{p}\Big)\,.$$

Some people write the formula as

$$f(t) = \sum_{k=-\infty}^{\infty} f\Big(\frac{k}{p}\Big)\,\mathrm{sinc}(pt - k)\,,$$

but I generally prefer to emphasize the sample points

$$t_k = \frac{k}{p}$$

and then to write the formula as

$$f(t) = \sum_{k=-\infty}^{\infty} f(t_k)\,\mathrm{sinc}\,p(t - t_k)\,.$$

What does the formula do, once again? It computes any value of $f$ in terms of sample values. Here are a few general comments to keep in mind:

• The sample points are spaced $1/p$ apart, the reciprocal of the bandwidth.^{4}

4That sort of reciprocal phenomenon is present again in higher dimensional versions of the sampling formula. This will be a later topic for us.

• The formula involves infinitely many sample points: $k/p$ for $k = 0, \pm 1, \pm 2, \ldots$.

So don't think you're getting away too cheaply, and realize that any practical implementation can involve only a finite number of terms in the sum, so it will necessarily be an approximation.

◦ Since a bandlimited signal cannot be timelimited, we should expect to have to take samples all the way out to $\pm\infty$. However, sampling a bandlimited periodic signal, i.e., a finite Fourier series, requires only a finite number of samples. We'll cover this below.
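Here is a small numerical illustration of the sampling formula (a sketch under the assumptions above, with the inevitable truncation of the infinite sum to finitely many terms): we reconstruct the bandlimited signal $\mathrm{sinc}^2 t$, which has bandwidth $p = 2$, from its samples at the half-integers.

```python
import numpy as np

def f(t):
    # A bandlimited test signal: the spectrum of sinc^2 (a triangle) lives in |s| <= 1,
    # so p = 2 in the notation of the text.
    return np.sinc(t) ** 2

p = 2.0
K = 2000                   # truncate the infinite sampling sum at |k| <= K
k = np.arange(-K, K + 1)
tk = k / p                 # sample points spaced 1/p apart

def reconstruct(t):
    # f(t) = sum_k f(k/p) sinc(p t - k)
    return np.sum(f(tk) * np.sinc(p * t - k))

print(reconstruct(0.3), f(0.3))   # agree to high accuracy
```

The sample values here decay like $1/k^2$, so the truncated sum is already very accurate; for a general bandlimited signal the truncation error can decay much more slowly.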

Put the outline of the argument for the sampling theorem into your head — it’s important. Starting with a bandlimited signal, there are three parts:

• Periodize the Fourier transform.

• Cut off this periodic function to get back where you started.

• Take the inverse Fourier transform.

Cutting off in the second step, a multiplication, exactly undoes periodizing in the first step, a convolution, provided that $\mathcal{F}f = \Pi_p(\mathcal{F}f * \mathrm{III}_p)$. But taking the inverse Fourier transform swaps multiplication with convolution, and this is why something nontrivial happens. It's almost obscene the way this works.

**Sampling rates and the Nyquist frequency** The bandwidth determines the minimal sampling rate we can use to reconstruct the signal from its samples. I'd almost say that the bandwidth *is* the minimal sampling rate, except for the slight ambiguity about where the spectrum starts being identically zero (the "endpoint problem"). Here's the way the situation is usually expressed: if the (nonzero) spectrum runs from $-\nu_{\max}$ to $\nu_{\max}$ then we need

$$\text{sampling rate} > 2\nu_{\max}$$

to reconstruct the signal from its samples.

The number $2\nu_{\max}$ is often called the *Nyquist frequency*, after Harry Nyquist, God of Sampling, who was the first engineer to consider these problems for the purpose of communications. There are other names associated with this circle of ideas, most notably E. Whittaker, a mathematician, and C. Shannon, an all-around genius and founder of Information Theory. The formula as we've given it is often referred to as the Shannon Sampling Theorem.
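The inequality matters. Here is a quick demonstration of what goes wrong below the Nyquist rate (the particular frequencies are chosen just for illustration): a 3 Hz cosine sampled at rate 2, far below its Nyquist frequency of 6, produces exactly the same samples as a 1 Hz cosine; no reconstruction scheme could tell them apart. This is aliasing.

```python
import numpy as np

rate = 2.0               # samples per second -- too slow for a 3 Hz signal
n = np.arange(8)
t = n / rate             # sample times

high = np.cos(2 * np.pi * 3 * t)   # samples of cos(2 pi 3 t)
low = np.cos(2 * np.pi * 1 * t)    # samples of cos(2 pi 1 t)

# The two sample sequences are identical: 3 Hz "aliases" to 1 Hz at this rate.
print(np.allclose(high, low))      # prints True
```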

The derivation of the formula gives us some one-sided freedom, or rather the opportunity to do more work than we have to. We cannot take $p$ smaller than the length of the interval where $\mathcal{F}f$ is supported, the bandwidth, but we can take it larger. That is, if $p$ is the bandwidth and $q > p$, we can periodize $\mathcal{F}f$ to have period $q$ by convolving with $\mathrm{III}_q$, and we still have the fundamental equation

$$\mathcal{F}f = \Pi_q(\mathcal{F}f * \mathrm{III}_q)\,.$$

(Draw a picture.) The derivation can then proceed exactly as above, and we get

$$f(t) = \sum_{k=-\infty}^{\infty} f(\tau_k)\,\mathrm{sinc}\,q(t - \tau_k)\,,$$

where the sample points are

$$\tau_k = \frac{k}{q}\,.$$

These sample points are spaced closer together than the sample points $t_k = k/p$. The sampling rate is higher than we need. We're doing more work than we have to.


**5.7** **Interpolation a Little More Generally**

Effective approximation and interpolation of signals raises a lot of interesting and general questions. One approach that provides a good framework for many such questions is to bring in orthogonality. It’s very much analogous to the way we looked at Fourier series.

**Interpolation and orthogonality** We begin with still another amazing property of sinc functions: they form an orthonormal collection. Specifically, the family of sinc functions $\{\mathrm{sinc}(t - n) : n = 0, \pm 1, \pm 2, \ldots\}$ is orthonormal with respect to the usual inner product on $L^2(\mathbf{R})$. Recall that the inner product is

$$(f, g) = \int_{-\infty}^{\infty} f(t)\,\overline{g(t)}\,dt\,.$$

The calculation to establish the orthonormality property of the sinc functions uses the general Parseval identity,

$$\int_{-\infty}^{\infty} f(t)\,\overline{g(t)}\,dt = \int_{-\infty}^{\infty} \mathcal{F}f(s)\,\overline{\mathcal{F}g(s)}\,ds\,.$$

We then have

$$\int_{-\infty}^{\infty} \mathrm{sinc}(t - n)\,\mathrm{sinc}(t - m)\,dt
= \int_{-\infty}^{\infty} \big(e^{-2\pi i s n}\,\Pi(s)\big)\,\overline{\big(e^{-2\pi i s m}\,\Pi(s)\big)}\,ds
= \int_{-\infty}^{\infty} e^{2\pi i s(m - n)}\,\Pi(s)^2\,ds
= \int_{-1/2}^{1/2} e^{2\pi i s(m - n)}\,ds\,.$$

From here direct integration will give you that this is 1 when $n = m$ and 0 when $n \ne m$.

In case you're fretting over it, the sinc function is in $L^2(\mathbf{R})$ and the product of two sinc functions is integrable. Parseval's identity holds for functions in $L^2(\mathbf{R})$, though we did not establish this.
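If you'd like to see the orthonormality numerically rather than through Parseval, here is a rough check (a sketch: the inner product integral is approximated by a Riemann sum over a long grid, and the truncated tails contribute an error of order $10^{-4}$):

```python
import numpy as np

# Approximate the inner product (sinc(t - n), sinc(t - m)) over R by a Riemann
# sum; the product of two sincs decays like 1/t^2, so truncating is harmless.
h = 0.01
t = np.arange(-500.0, 500.0, h)

def inner(n, m):
    return np.sum(np.sinc(t - n) * np.sinc(t - m)) * h

print(inner(0, 0))   # approximately 1
print(inner(0, 1))   # approximately 0
print(inner(2, 5))   # approximately 0
```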

Now let's consider bandlimited signals $g(t)$, and to be definite let's suppose the spectrum is contained in $-1/2 \le s \le 1/2$. Then the Nyquist sampling rate is 1, i.e., we sample at the integer points, and the interpolation formula takes the form

$$g(t) = \sum_{n=-\infty}^{\infty} g(n)\,\mathrm{sinc}(t - n)\,.$$

Coupled with the result on orthogonality, this formula suggests that the family of sinc functions forms an orthonormal basis for the space of bandlimited signals with spectrum in $[-1/2, 1/2]$, and that we're expressing $g(t)$ in terms of this basis. To see that this really is the case, we interpret the coefficients (the sample values $g(n)$) as the inner product of $g(t)$ with $\mathrm{sinc}(t - n)$. We have, again using Parseval,
$$
\begin{aligned}
(g(t), \mathrm{sinc}(t - n)) &= \int_{-\infty}^{\infty} g(t)\,\mathrm{sinc}(t - n)\,dt \\
&= \int_{-\infty}^{\infty} \mathcal{F}g(s)\,\overline{\mathcal{F}(\mathrm{sinc}(t - n))}\,ds && \text{(by Parseval)} \\
&= \int_{-\infty}^{\infty} \mathcal{F}g(s)\,\overline{\big(e^{-2\pi i s n}\,\Pi(s)\big)}\,ds \\
&= \int_{-1/2}^{1/2} \mathcal{F}g(s)\,e^{2\pi i n s}\,ds \\
&= \int_{-\infty}^{\infty} \mathcal{F}g(s)\,e^{2\pi i n s}\,ds && \text{(because $g$ is bandlimited)} \\
&= g(n) && \text{(by Fourier inversion)}
\end{aligned}
$$

It's perfect! The interpolation formula says that $g(t)$ is written in terms of an orthonormal basis, and the coefficient $g(n)$, the $n$-th sampled value of $g(t)$, is exactly the projection of $g(t)$ onto the $n$-th basis element:

$$g(t) = \sum_{n=-\infty}^{\infty} g(n)\,\mathrm{sinc}(t - n) = \sum_{n=-\infty}^{\infty} \big(g(t), \mathrm{sinc}(t - n)\big)\,\mathrm{sinc}(t - n)\,.$$

**Lagrange interpolation** Certainly for computational questions, going way back, it is desirable to find reasonably simple approximations of complicated functions, particularly those arising from solutions to differential equations.^{5} The classic way to approximate is to interpolate: find a simple function that, at least, assumes the same values as the complicated function at a given finite set of points. Curve fitting, in other words. The classic way to do this is via polynomials. One method, presented here just for your general background and know-how, is due to Lagrange.

Suppose we have $n$ points $t_1, t_2, \ldots, t_n$. We want a polynomial of degree $n - 1$ that assumes given values at the $n$ sample points. (Why degree $n - 1$?)

For this, we start with an $n$-th degree polynomial that vanishes exactly at those points. This is given by

$$p(t) = (t - t_1)(t - t_2) \cdots (t - t_n)\,.$$

Next put

$$p_k(t) = \frac{p(t)}{t - t_k}\,.$$

Then $p_k(t)$ is a polynomial of degree $n - 1$; we divide out the factor $(t - t_k)$, and so $p_k(t)$ vanishes at the same points as $p(t)$ except at $t_k$. Next consider the quotient

$$\frac{p_k(t)}{p_k(t_k)}\,.$$

This is again a polynomial of degree $n - 1$. The key property is that $p_k(t)/p_k(t_k)$ vanishes at the sample points $t_j$ except at the point $t_k$, where the value is 1; i.e.,

$$\frac{p_k(t_j)}{p_k(t_k)} = \begin{cases} 1 & j = k \\ 0 & j \ne k \end{cases}$$

5The sinc function may not really qualify as an “easy approximation”. How is it computed, really?


To interpolate a function by a polynomial (to fit a curve through a given set of points) we just scale and add. That is, suppose we have a function $g(t)$ and we want a polynomial that has values $g(t_1), g(t_2), \ldots, g(t_n)$ at the points $t_1, t_2, \ldots, t_n$. We get this by forming the sum

$$p(t) = \sum_{k=1}^{n} g(t_k)\,\frac{p_k(t)}{p_k(t_k)}\,.$$

This does the trick. It is known as the *Lagrange Interpolation Polynomial*. Remember, unlike the sampling formula, we're not reconstructing all the values of $g(t)$ from a set of sample values. We're approximating $g(t)$ by a polynomial that has the same values as $g(t)$ at a prescribed set of points.
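The recipe above translates directly into code. A minimal sketch (the three data points are made up for illustration; they happen to lie on $t^2 + t + 1$):

```python
def lagrange_interp(tpts, gvals, t):
    """Evaluate the Lagrange interpolation polynomial through (t_k, g(t_k)) at t."""
    total = 0.0
    for k, (tk, gk) in enumerate(zip(tpts, gvals)):
        # basis = p_k(t) / p_k(t_k): product over j != k of (t - t_j) / (t_k - t_j)
        basis = 1.0
        for j, tj in enumerate(tpts):
            if j != k:
                basis *= (t - tj) / (tk - tj)
        total += gk * basis
    return total

# Three points on t^2 + t + 1; the degree-2 interpolant recovers it exactly.
tpts = [0.0, 1.0, 2.0]
gvals = [1.0, 3.0, 7.0]
print(lagrange_interp(tpts, gvals, 0.5))   # 1.75, i.e. 0.25 + 0.5 + 1
```

At each node $t_k$ every basis product but the $k$-th vanishes, so the interpolant really does pass through the prescribed values.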

The sinc function is an analog of the $p_k(t)/p_k(t_k)$ for "Fourier interpolation", if we can call it that. With

$$\mathrm{sinc}\,t = \frac{\sin \pi t}{\pi t}$$

we recall some properties, analogous to the polynomials we built above:

• $\mathrm{sinc}\,t = 1$ when $t = 0$

• $\mathrm{sinc}\,t = 0$ at the nonzero integer points $t = \pm 1, \pm 2, \ldots$.

Now shift this and consider

$$\mathrm{sinc}(t - k) = \frac{\sin \pi(t - k)}{\pi(t - k)}\,.$$

This has the value 1 at $t = k$ and is zero at the other integers.

Suppose we have our signal $g(t)$ and the sample values $\ldots, g(-2), g(-1), g(0), g(1), g(2), \ldots$. So, again, we're sampling at evenly spaced points, and we've taken the sampling rate to be 1 just to simplify. To interpolate these values we would then form the sum

$$\sum_{k=-\infty}^{\infty} g(k)\,\mathrm{sinc}(t - k)\,.$$

There it is again, the general interpolation formula. In the case that $g(t)$ is bandlimited (bandwidth 1 in this example) we know we recover all values of $g(t)$ from the sample values.

**5.8** **Finite Sampling for a Bandlimited Periodic Signal**

We started this whole discussion of sampling and interpolation by arguing that one ought to be able to interpolate the values of a finite sum of sinusoids from knowledge of a finite number of samples. Let’s see how this works out, but rather than starting from scratch let’s use what we’ve learned about sampling for general bandlimited signals.

As always, it's best to work with the complex form of a sum of sinusoids, so we consider a real signal given by

$$f(t) = \sum_{k=-N}^{N} c_k\,e^{2\pi i k t/q}\,, \qquad c_{-k} = \overline{c_k}\,.$$

Then $f(t)$ is periodic of period $q$. Some of the coefficients may be zero, but we assume that $c_N \ne 0$.

There are $2N + 1$ terms in the sum (don't forget $k = 0$), and it should take $2N + 1$ sampled values over one period to determine $f(t)$ completely. You might think it would take twice this many sampled values, because the values of $f(t)$ are real and we have to determine complex coefficients. But remember that $c_{-k} = \overline{c_k}$, so if we know $c_k$ we know $c_{-k}$. Think of the $2N + 1$ sample values as enough information to determine the real number $c_0$ and the $N$ complex numbers $c_1, c_2, \ldots, c_N$.

The Fourier transform of $f$ is

$$\mathcal{F}f(s) = \sum_{k=-N}^{N} c_k\,\delta\Big(s - \frac{k}{q}\Big)\,,$$

and the spectrum goes from $-N/q$ to $N/q$. The sampling formula applies to $f(t)$, and we can write an equation of the form

$$f(t) = \sum_{k=-\infty}^{\infty} f(t_k)\,\mathrm{sinc}\,p(t - t_k)\,,$$

but it's a question of what to take for the sampling rate, and hence how to space the sample points.

We want to make use of the known periodicity of $f(t)$. If the sample points $t_k$ are a fraction of a period apart, say $q/M$ for an $M$ to be determined, then the values $f(t_k)$ with $t_k = kq/M$, $k = 0, \pm 1, \pm 2, \ldots$ will repeat after $M$ samples. We'll see how this collapses the interpolation formula.

To find the right sampling rate $p$, think about the derivation of the sampling formula, the first step being: "periodize $\mathcal{F}f$". The Fourier transform $\mathcal{F}f$ is a bunch of $\delta$'s spaced $1/q$ apart (and scaled by the coefficients $c_k$). The natural periodization of $\mathcal{F}f$ keeps the spacing $1/q$ in the periodized version, essentially making the periodized $\mathcal{F}f$ a scaled version of $\mathrm{III}_{1/q}$. We do this by convolving $\mathcal{F}f$ with $\mathrm{III}_p$, where $p/2$ is the midpoint between $N/q$, the last point in the spectrum of $\mathcal{F}f$, and the point $(N + 1)/q$, which is the next point $1/q$ away. Here's a picture.

[Figure: the spectrum of $\mathcal{F}f$, with spikes at $0, \pm 1/q, \ldots, \pm N/q$ and the cutoff points $\pm p/2$ midway between $\pm N/q$ and $\pm(N + 1)/q$.]

Thus we find $p$ from

$$\frac{p}{2} = \frac{1}{2}\Big(\frac{N}{q} + \frac{N + 1}{q}\Big) = \frac{2N + 1}{2q}\,, \quad\text{or}\quad p = \frac{2N + 1}{q}\,.$$

We periodize $\mathcal{F}f$ by $\mathrm{III}_p$ (draw yourself a picture of this!), cut off by $\Pi_p$, then take the inverse Fourier transform. The sampling formula back in the time domain is

$$f(t) = \sum_{k=-\infty}^{\infty} f(t_k)\,\mathrm{sinc}\,p(t - t_k)\,, \qquad\text{with}\quad t_k = \frac{k}{p}\,.$$


With our particular choice of $p$, let's now see how the $q$-periodicity of $f(t)$ comes into play. Write

$$M = 2N + 1\,,$$

so that

$$t_k = \frac{k}{p} = \frac{kq}{M}\,.$$

Then, to repeat what we said earlier, the sample points are spaced a fraction of a period apart, $q/M$, and after $f(t_0), f(t_1), \ldots, f(t_{M-1})$ the sample values repeat, e.g., $f(t_M) = f(t_0)$, $f(t_{M+1}) = f(t_1)$, and so on. More succinctly,

$$t_{k + k'M} = t_k + k'q\,,$$

and so

$$f(t_{k + k'M}) = f(t_k + k'q) = f(t_k)\,,$$

for any $k$ and $k'$. Using this periodicity of the coefficients in the sampling formula, the single sampling sum splits into $M$ sums:

$$
\begin{aligned}
\sum_{k=-\infty}^{\infty} f(t_k)\,\mathrm{sinc}\,p(t - t_k)
&= f(t_0)\sum_{m=-\infty}^{\infty}\mathrm{sinc}(pt - mM) + f(t_1)\sum_{m=-\infty}^{\infty}\mathrm{sinc}(pt - (1 + mM)) \\
&\qquad + f(t_2)\sum_{m=-\infty}^{\infty}\mathrm{sinc}(pt - (2 + mM)) + \cdots + f(t_{M-1})\sum_{m=-\infty}^{\infty}\mathrm{sinc}(pt - (M - 1 + mM))\,.
\end{aligned}
$$

Those sums of sincs on the right are periodizations of $\mathrm{sinc}\,pt$ and, remarkably, they have a simple closed form expression. The $k$-th sum is

$$\sum_{m=-\infty}^{\infty}\mathrm{sinc}(pt - k - mM) = \mathrm{sinc}(pt - k) * \mathrm{III}_{M/p}(t) = \frac{\mathrm{sinc}(pt - k)}{\mathrm{sinc}\big(\frac{1}{M}(pt - k)\big)} = \frac{\mathrm{sinc}\,p(t - t_k)}{\mathrm{sinc}\big(\frac{1}{q}(t - t_k)\big)}\,.$$

(I'll give a derivation of this at the end of this section.) Using these identities, we find that the sampling formula to interpolate

$$f(t) = \sum_{k=-N}^{N} c_k\,e^{2\pi i k t/q}$$

from $2N + 1 = M$ sampled values is

$$f(t) = \sum_{k=0}^{2N} f(t_k)\,\frac{\mathrm{sinc}\,p(t - t_k)}{\mathrm{sinc}\big(\frac{1}{q}(t - t_k)\big)}\,, \qquad\text{where}\quad p = \frac{2N + 1}{q}\,,\quad t_k = \frac{k}{p} = \frac{kq}{2N + 1}\,.$$

This is the "finite sampling theorem" for periodic functions.

It might also be helpful to write the sampling formula in terms of frequencies. Thus, if the lowest frequency is $\nu_{\min} = 1/q$ and the highest frequency is $\nu_{\max} = N\nu_{\min}$, then

$$f(t) = \sum_{k=0}^{2N} f(t_k)\,\frac{\mathrm{sinc}\big((2\nu_{\max} + \nu_{\min})(t - t_k)\big)}{\mathrm{sinc}\big(\nu_{\min}(t - t_k)\big)}\,, \qquad\text{where}\quad t_k = \frac{kq}{2N + 1}\,.$$

The sampling rate is

$$\text{sampling rate} = 2\nu_{\max} + \nu_{\min}\,.$$

Compare this to

$$\text{sampling rate} > 2\nu_{\max}$$

for a general bandlimited function.

Here's a simple example of the formula. Take $f(t) = \cos 2\pi t$. There's only one frequency, and $\nu_{\min} = \nu_{\max} = 1$. Then $N = 1$, the sampling rate is 3, and the sample points are $t_0 = 0$, $t_1 = 1/3$, and $t_2 = 2/3$. The formula says

$$\cos 2\pi t = \frac{\mathrm{sinc}\,3t}{\mathrm{sinc}\,t} + \cos\Big(\frac{2\pi}{3}\Big)\frac{\mathrm{sinc}\,3\big(t - \frac{1}{3}\big)}{\mathrm{sinc}\big(t - \frac{1}{3}\big)} + \cos\Big(\frac{4\pi}{3}\Big)\frac{\mathrm{sinc}\,3\big(t - \frac{2}{3}\big)}{\mathrm{sinc}\big(t - \frac{2}{3}\big)}\,.$$

Does this really work? I’m certainly not going to plow through the trig identities needed to check it!

However, here’s a plot of the right hand side.

*Any questions? Ever thought you’d see such a complicated way of writing cos 2πt?*
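Rather than plow through trig identities, we can also let the computer check the formula on a grid (a sketch; `D` below is the ratio of sincs from the formula, $\mathrm{sinc}\,3u/\mathrm{sinc}\,u = \sin 3\pi u/(3\sin \pi u)$, the periodized sinc for $p = 3$, $q = 1$):

```python
import numpy as np

def D(u):
    # Periodized sinc for p = 3, q = 1: sinc(3u)/sinc(u) = sin(3*pi*u)/(3*sin(pi*u)),
    # equal to 1 at u = 0 and to 0 at the other sample points.
    return np.sinc(3 * u) / np.sinc(u)

t = np.linspace(0.01, 0.99, 199)      # stay away from integer u, where sinc(u) = 0
tk = np.array([0.0, 1 / 3, 2 / 3])    # the sample points t_0, t_1, t_2

# Right-hand side of the finite sampling formula for f(t) = cos(2*pi*t)
rhs = sum(np.cos(2 * np.pi * k) * D(t - k) for k in tk)

print(np.max(np.abs(rhs - np.cos(2 * np.pi * t))))   # essentially machine zero
```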

**Periodizing sinc Functions** In applying the general sampling theorem to the special case of a periodic signal, we wound up with sums of sinc functions which we recognized (sharp-eyed observers that we are) to be periodizations. Then, out of nowhere, came a closed form expression for such periodizations as a ratio of sinc functions. Here's where this comes from, and here's a fairly general result that covers it.

**Lemma** Let $p, q > 0$ and let $N$ be the largest integer strictly less than $pq/2$. Then

$$\sum_{k=-\infty}^{\infty}\mathrm{sinc}(pt - kpq) = \mathrm{sinc}(pt) * \mathrm{III}_q(t) = \frac{1}{pq}\,\frac{\sin\big((2N + 1)\pi t/q\big)}{\sin(\pi t/q)}\,.$$

There's a version of this lemma with $N \le pq/2$, too, but that's not important for us. In terms of sinc functions the formula is

$$\mathrm{sinc}(pt) * \mathrm{III}_q(t) = \frac{2N + 1}{pq}\,\frac{\mathrm{sinc}\big((2N + 1)t/q\big)}{\mathrm{sinc}(t/q)}\,.$$
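The lemma is easy to test numerically. A sketch for $p = 3$, $q = 1$ (so $N = 1$ and $(2N + 1)/pq = 1$), truncating the slowly convergent sum symmetrically so the tails largely cancel:

```python
import numpy as np

# Lemma with p = 3, q = 1: sum_k sinc(3t - 3k) should equal sinc(3t)/sinc(t),
# i.e. (1/3) sin(3*pi*t)/sin(pi*t).
t = 0.25
k = np.arange(-20000, 20001)          # symmetric truncation of the infinite sum
lhs = np.sum(np.sinc(3 * t - 3 * k))
rhs = np.sinc(3 * t) / np.sinc(t)

print(lhs, rhs)                       # both close to 1/3 at t = 0.25
```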