
Convolutions and the Convolution Theorem

Convolution of distributions presents some special problems and we’re not going to go into this too deeply.

It’s not so hard figuring out formally how to define S ∗ T for distributions S and T, it’s setting up conditions under which the convolution exists that’s somewhat tricky. This is related to the fact of nature that it’s impossible to define (in general) the product of two distributions, for we also want to have a convolution theorem that says F (S ∗ T ) = (F S)(F T ) and both sides of the formula should make sense.

What works easily is the convolution of a distribution with a test function. This goes through as you might expect (with a little twist) but in case you want to skip the following discussion I am pleased to report right away that the convolution theorem on Fourier transforms continues to hold: If ψ is a test function and T is a distribution then

F (ψ ∗ T ) = (F ψ)(F T ) .

The right hand side is the product of a test function and a distribution, which is defined.

Here’s the discussion that supports the development of convolution in this setting. First we consider how to define convolution of ψ and T . As in every other case of extending operations from functions to distributions, we suppose first that a distribution T comes from a function f . If ψ is a test function we want to look at the pairing of ψ ∗ f with a test function ϕ. This is

⟨ψ ∗ f, ϕ⟩ = ∫_{−∞}^{∞} (ψ ∗ f)(x)ϕ(x) dx
  = ∫_{−∞}^{∞} ( ∫_{−∞}^{∞} ψ(x − y)f(y) dy ) ϕ(x) dx
  = ∫_{−∞}^{∞} ∫_{−∞}^{∞} ψ(x − y)ϕ(x)f(y) dy dx
  = ∫_{−∞}^{∞} ( ∫_{−∞}^{∞} ψ(x − y)ϕ(x) dx ) f(y) dy

(The interchange of integration in the last line is justified because every function in sight is as nice as can be.) We almost see a convolution ψ ∗ ϕ in the inner integral, but the sign is wrong. However, bringing back our notation ψ⁻(x) = ψ(−x), we can write the inner integral as the convolution (ψ⁻ ∗ ϕ)(y) (or as (ϕ ∗ ψ⁻)(y), since convolution is commutative). That is

⟨ψ ∗ f, ϕ⟩ = ∫_{−∞}^{∞} (ψ ∗ f)(x)ϕ(x) dx = ∫_{−∞}^{∞} (ψ⁻ ∗ ϕ)(x)f(x) dx = ⟨f, ψ⁻ ∗ ϕ⟩ .

This tells us what to do in general:

• If T is a distribution and ψ is a test function then ψ ∗ T is defined by ⟨ψ ∗ T, ϕ⟩ = ⟨T, ψ⁻ ∗ ϕ⟩ .

Convolution property of δ Let’s see how this works to establish the basic convolution property of the δ-function:

ψ ∗ δ = ψ

where on the right hand side we regard ψ as a distribution. To check this:

⟨ψ ∗ δ, ϕ⟩ = ⟨δ, ψ⁻ ∗ ϕ⟩ = (ψ⁻ ∗ ϕ)(0)
  = ∫_{−∞}^{∞} ψ⁻(−y)ϕ(y) dy = ∫_{−∞}^{∞} ψ(y)ϕ(y) dy = ⟨ψ, ϕ⟩ .

Look at this carefully, or rather, simply. It says that ψ ∗ δ has the same outcome as ψ does when paired with ϕ. That is, ψ ∗ δ = ψ. Works like a charm. Air tight.

As pointed out earlier, it’s common practice to write this property of δ as an integral,

ψ(x) = ∫_{−∞}^{∞} δ(x − y)ψ(y) dy .

This is sometimes called the sifting property of δ. Generations of distinguished engineers and scientists have written this identity in this way, and no harm seems to have befallen them.
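The sifting property has an exact discrete analogue that is easy to check numerically: convolving a sequence with the unit impulse returns the sequence. A minimal sketch (the sample values of psi are arbitrary choices):

```python
import numpy as np

# Discrete analogue of the sifting property: the unit impulse is the
# discrete stand-in for delta, and convolving any signal with it returns
# the signal itself. The sample values of psi are arbitrary choices.
psi = np.array([1.0, -2.0, 3.0, 0.5])
delta = np.array([1.0])          # discrete unit impulse

result = np.convolve(psi, delta)
print(np.allclose(result, psi))  # True: psi * delta = psi
```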

We can even think of Fourier inversion as a kind of convolution identity, in fact as exactly the sifting property of δ. The inversion theorem is sometimes presented in this way (proved, according to some people, though it’s circular reasoning). We need to write (formally)

∫_{−∞}^{∞} e^{2πisx} ds = δ(x)

viewing the left hand side as the inverse Fourier transform of 1, and then, shifting,

∫_{−∞}^{∞} e^{2πisx} e^{−2πist} ds = δ(x − t) .

And now, shamelessly,

F⁻¹Fϕ(x) = ∫_{−∞}^{∞} e^{2πisx} ( ∫_{−∞}^{∞} e^{−2πist} ϕ(t) dt ) ds
  = ∫_{−∞}^{∞} ∫_{−∞}^{∞} e^{2πisx} e^{−2πist} ϕ(t) dt ds
  = ∫_{−∞}^{∞} ( ∫_{−∞}^{∞} e^{2πisx} e^{−2πist} ds ) ϕ(t) dt
  = ∫_{−∞}^{∞} δ(x − t)ϕ(t) dt = ϕ(x) .

At least these manipulations didn’t lead to a contradiction! I don’t mind if you think of the inversion theorem in this way, as long as you know what’s behind it, and as long as you don’t tell anyone where you saw it.

The convolution theorem Having come this far, we can now derive the convolution theorem for the Fourier transform:

⟨F(ψ ∗ T), ϕ⟩ = ⟨ψ ∗ T, Fϕ⟩ = ⟨T, ψ⁻ ∗ Fϕ⟩
  = ⟨T, FFψ ∗ Fϕ⟩ (using the identity FFψ = ψ⁻)
  = ⟨T, F(Fψ · ϕ)⟩ (for functions, the convolution of the Fourier transforms is the Fourier transform of the product)
  = ⟨FT, Fψ · ϕ⟩ (bringing F back to T)
  = ⟨(Fψ)(FT), ϕ⟩ (how multiplication by a function is defined)

Comparing where we started and where we ended up:

⟨F(ψ ∗ T), ϕ⟩ = ⟨(Fψ)(FT), ϕ⟩ ,

that is,

F(ψ ∗ T) = (Fψ)(FT) . Done.
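Although the distributional proof is the real content here, the theorem is easy to test numerically in its discrete form, where F is the DFT and convolution is circular. A sketch (the two sampled signals and the length 64 are arbitrary choices; both are just functions here, not genuine distributions):

```python
import numpy as np

# Numerical sanity check of the convolution theorem in its discrete (DFT)
# form: the DFT of a circular convolution equals the pointwise product of
# the DFTs. The two sampled signals are arbitrary illustrative choices.
N = 64
rng = np.random.default_rng(0)
psi = rng.standard_normal(N)
t = rng.standard_normal(N)

# Circular convolution computed directly from the definition.
conv = np.array([sum(psi[n] * t[(k - n) % N] for n in range(N))
                 for k in range(N)])

lhs = np.fft.fft(conv)                 # DFT of the convolution
rhs = np.fft.fft(psi) * np.fft.fft(t)  # product of the DFTs
print(np.allclose(lhs, rhs))           # True
```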

One can also show the dual identity:

F (ψT ) = F ψ ∗ F T

Pay attention to how everything makes sense here and has been previously defined. The product of the Schwartz function ψ and the distribution T is defined, and as a tempered distribution it has a Fourier transform. Since ψ is a Schwartz function so is its Fourier transform F ψ, and hence F ψ ∗ F T is defined.

I’ll leave it to you to check that the algebraic properties of the convolution continue to hold for distributions, whenever all the quantities are defined.

Note that the convolution identities are consistent with ψ ∗ δ = ψ, and with ψδ = ψ(0)δ. The first of these convolution identities says that

F(ψ ∗ δ) = (Fψ)(Fδ) = Fψ ,

since F δ = 1, and that jibes with ψ ∗ δ = ψ. The other identity is a little more interesting. We have

F(ψδ) = Fψ ∗ Fδ = Fψ ∗ 1 = ∫_{−∞}^{∞} 1 · Fψ(x) dx = F⁻¹Fψ(0) = ψ(0) .

This is consistent with F(ψδ) = F(ψ(0)δ) = ψ(0)Fδ = ψ(0).

Convolution in general I said earlier that convolution can’t be defined for every pair of distributions.

I want to say a little more about this, but only a little, and give a few examples of cases when it works out OK.

At the beginning of this section we considered, as we always do, what convolution looks like for distributions in the case when the distribution comes from a function. With f playing the role of the distribution and ψ a Schwartz function we wrote

⟨ψ ∗ f, ϕ⟩ = ∫_{−∞}^{∞} (ψ ∗ f)(x)ϕ(x) dx
  = ∫_{−∞}^{∞} ( ∫_{−∞}^{∞} ψ(x − y)f(y) dy ) ϕ(x) dx
  = ∫_{−∞}^{∞} ∫_{−∞}^{∞} ψ(x − y)ϕ(x)f(y) dy dx
  = ∫_{−∞}^{∞} ( ∫_{−∞}^{∞} ψ(x − y)ϕ(x) dx ) f(y) dy .

At this point we stopped and wrote this as the pairing

⟨ψ ∗ f, ϕ⟩ = ⟨f, ψ⁻ ∗ ϕ⟩

so that we could see how to define ψ ∗ T when T is a distribution.

This time, and for a different reason, I want to take the inner integral one step further and write

∫_{−∞}^{∞} ψ(x − y)ϕ(x) dx = ∫_{−∞}^{∞} ψ(u)ϕ(u + y) du (using the substitution u = x − y).

This latter integral is the pairing ⟨ψ(x), ϕ(x + y)⟩, where I wrote the variable of the pairing (the integration variable) as x and I included it in the notation for pairing to indicate that what results from the pairing is a function of y. In fact, what we see from this is that ⟨ψ ∗ f, ϕ⟩ can be written as a “nested” pairing, namely

⟨ψ ∗ f, ϕ⟩ = ⟨f(y), ⟨ψ(x), ϕ(x + y)⟩⟩

where I included the variable y in the outside pairing to keep things straight and to help recall that in the end everything gets integrated away and the result of the nested pairing is a number.

Now, this nested pairing tells us how we might define the convolution S ∗ T of two distributions S and T . It is, with a strong proviso:

Convolution of two distributions If S and T are two distributions then their convolution is the distribution S ∗ T defined by

⟨S ∗ T, ϕ⟩ = ⟨S(y), ⟨T(x), ϕ(x + y)⟩⟩

provided the right-hand side exists.

We’ve written S(y) and T(x) “at points” to keep straight what gets paired with what; ϕ(x + y) makes sense, is a function of x and y, and it’s necessary to indicate which variable, x or y, is getting hooked up with T in the inner pairing and then with S in the outer pairing.

Why the proviso? Because the inner pairing ⟨T(x), ϕ(x + y)⟩ produces a function of y which might not be a test function. Sad, but true. One can state some general conditions under which S ∗ T exists, but this requires a few more definitions and a little more discussion.21 Enough is enough. It can be dicey, but we’ll play a little fast and loose with existence of convolution and applications of the convolution theorem. Tell the rigor police to take the day off.

21 It inevitably brings in questions about associativity of convolution, which might not hold in general, as it turns out, and a more detailed treatment of the convolution theorem.

Convolving δ with itself. For various applications you may find yourself wanting to use the identity δ ∗ δ = δ .

By all means, use it. In this case the convolution makes sense and the formula follows:

⟨δ ∗ δ, ϕ⟩ = ⟨δ(y), ⟨δ(x), ϕ(x + y)⟩⟩ = ⟨δ(y), ϕ(y)⟩ = ϕ(0) = ⟨δ, ϕ⟩ .

A little more generally, we have

δ_a ∗ δ_b = δ_{a+b} ,

a nice formula! We can derive this easily from the definition:

⟨δ_a ∗ δ_b, ϕ⟩ = ⟨δ_a(y), ⟨δ_b(x), ϕ(x + y)⟩⟩ = ⟨δ_a(y), ϕ(b + y)⟩ = ϕ(b + a) = ⟨δ_{a+b}, ϕ⟩ .

It would be more common to write this identity as

δ(x − a) ∗ δ(x − b) = δ(x − a − b) .

In this notation, here’s the down and dirty version of what we just did (so you know how it looks):

δ(x − a) ∗ δ(x − b) = ∫_{−∞}^{∞} δ(y − a)δ(x − b − y) dy
  = ∫_{−∞}^{∞} δ(u − b − a)δ(x − u) du (using u = b + y)
  = δ(x − b − a) (by the sifting property of δ).
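The identity δ_a ∗ δ_b = δ_{a+b} also has an exact discrete counterpart: a unit impulse at index a convolved with a unit impulse at index b is a unit impulse at index a + b. A quick sketch (the indices a = 2, b = 3 and the array length 8 are arbitrary choices):

```python
import numpy as np

# Discrete counterpart of delta_a * delta_b = delta_{a+b}: unit impulses
# at indices a and b convolve to a single unit impulse at index a + b.
# a = 2, b = 3, and the array length 8 are arbitrary choices.
a, b = 2, 3
da = np.zeros(8)
da[a] = 1.0
db = np.zeros(8)
db[b] = 1.0

conv = np.convolve(da, db)   # length 15; single spike at index a + b
print(int(np.argmax(conv)))  # 5
print(conv[a + b])           # 1.0
```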

Convolution really is a “smoothing operation” (most of the time) I want to say a little more about general properties of convolution (first for functions) and why convolution is a smoothing operation.

In fact, it’s often taken as a maxim when working with convolutions that:

• The function f ∗ g has the good properties of f and g.

This maxim is put to use through a result called the derivative theorem for convolutions:

(f ∗ g)′(x) = (f ∗ g′)(x) = (f′ ∗ g)(x) .

On the left hand side is the derivative of the convolution, while on the right hand side we put the derivative on whichever factor has a derivative.

We allow ourselves to differentiate under the integral sign — sometimes a delicate business, but set that aside — and the derivation is easy. If g is differentiable, then

(f ∗ g)′(x) = d/dx ∫_{−∞}^{∞} f(u)g(x − u) du
  = ∫_{−∞}^{∞} f(u) (d/dx) g(x − u) du = ∫_{−∞}^{∞} f(u)g′(x − u) du = (f ∗ g′)(x) .

The second formula follows similarly if f is differentiable.

The importance of this is that the convolution of two functions may have more smoothness than the individual factors. We’ve seen one example of this already, where it’s not smoothness but continuity that’s improved. Remember Π ∗ Π = Λ; the convolution of the rectangle function with itself is the triangle function. The rectangle function is not continuous — it has jump discontinuities at x = ±1/2 — but the convolved function is continuous.22 We also saw that repeated convolution of a function with itself will lead to a Gaussian.

The derivative theorem is saying: If f is rough, but g is smooth then f ∗ g will be smoother than f because we can differentiate the convolution by putting the derivative on g. We can also compute higher order derivatives in the same way. If g is n-times differentiable then

(f ∗ g)⁽ⁿ⁾(x) = (f ∗ g⁽ⁿ⁾)(x) .

Thus convolving a rough function f with an n-times differentiable function g produces an n-times differentiable function f ∗ g. It is in this sense that convolution is a “smoothing” operation.
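The derivative theorem can be checked numerically by discretizing the convolution integral. The sketch below convolves the rectangle function (rough) with a Gaussian (smooth) and compares the derivative of the convolution against the convolution with the Gaussian's derivative; the grid spacing and the particular Gaussian are arbitrary choices:

```python
import numpy as np

# Check (f * g)'(x) = (f * g')(x) on a grid: f is the rectangle function
# (rough, with jumps at +-1/2), g is a Gaussian (smooth). The grid spacing
# dx and the particular Gaussian are arbitrary illustrative choices.
dx = 0.01
x = np.arange(-5, 5, dx)
f = np.where(np.abs(x) < 0.5, 1.0, 0.0)        # rectangle function
g = np.exp(-x**2)                               # smooth Gaussian
gprime = -2 * x * np.exp(-x**2)                 # exact derivative of g

conv = np.convolve(f, g, mode="same") * dx      # samples of f * g
lhs = np.gradient(conv, dx)                     # derivative of f * g
rhs = np.convolve(f, gprime, mode="same") * dx  # samples of f * g'

print(np.max(np.abs(lhs - rhs)) < 1e-2)         # True: the two sides agree
```

Note that the derivative of f ∗ g exists and is computed without ever differentiating the rectangle itself, which is the point of the theorem.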

The technique of smoothing by convolution can also be applied to distributions. There one works with ψ ∗ T where ψ is, for example, a Schwartz function. Using the family of Gaussians g_t(x) = (1/√(2πt)) e^{−x²/2t} to form g_t ∗ T produces the so-called regularization of T. This is the basis of the theorem on approximating a general distribution by a sequence of distributions that come from Schwartz functions.
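A numerical sketch of regularization: convolving a jump discontinuity with g_t smooths it out, and shrinking t brings the smoothed version back toward the original. The step function, the grid, and the two values of t are arbitrary choices:

```python
import numpy as np

# Regularization by Gaussians g_t(x) = (1/sqrt(2 pi t)) exp(-x^2 / (2 t)):
# convolving a step with g_t smooths it, and as t shrinks the smoothed
# version approaches the step. Step, grid, and t values are arbitrary.
dx = 0.001
x = np.arange(-3, 3, dx)
step = np.where(x < 0, 0.0, 1.0)

def regularize(t):
    g = np.exp(-x**2 / (2 * t)) / np.sqrt(2 * np.pi * t)
    return np.convolve(step, g, mode="same") * dx   # samples of g_t * step

# L1 distance from the original step: smaller t gives a better fit.
err_coarse = np.sum(np.abs(regularize(0.1) - step)) * dx
err_fine = np.sum(np.abs(regularize(0.001) - step)) * dx
print(err_fine < err_coarse)  # True
```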

The distribution δ is the breakeven point for smoothing by convolution — it doesn’t do any smoothing, it leaves the function alone, as in

δ ∗ f = f .

Going further, convolving a differentiable function with derivatives of δ produces derivatives of the function, for example,

δ′ ∗ f = f′ .

You can derive this from scratch using the definition of the derivative of a distribution and the definition of convolution, or you can also think of

δ′ ∗ f = δ ∗ f′ = f′ .

(Careful here: This is δ′ convolved with f, not δ′ paired with f.) A similar result holds for higher derivatives:

δ⁽ⁿ⁾ ∗ f = f⁽ⁿ⁾ .

Sometimes one thinks of taking a derivative as making a function less smooth, so counterbalancing the maxim that convolution is a smoothing operation, one should add that convolving with derivatives of δ may roughen a function up.
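The identity δ′ ∗ f = f′ is mirrored in discrete form by convolving with a difference kernel, a crude stand-in for δ′. A sketch (the test function sin x and the grid are arbitrary choices):

```python
import numpy as np

# Discrete analogue of delta' * f = f': the kernel [1, -1]/dx is a crude
# discrete stand-in for delta', and convolving with it approximates
# differentiation. The test function sin(x) is an arbitrary smooth choice.
dx = 0.001
x = np.arange(0, 2 * np.pi, dx)
f = np.sin(x)

dkernel = np.array([1.0, -1.0]) / dx   # discrete stand-in for delta'
deriv = np.convolve(f, dkernel)[1:-1]  # interior values; equals np.diff(f)/dx

# Compare against the exact derivative cos(x) on the interior grid points.
print(np.max(np.abs(deriv - np.cos(x[1:]))) < 1e-2)  # True
```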
