Continued Fractions - The Princeton Companion to Mathematics

most n^k steps. A problem belongs to P if and only if the time taken to solve it scales up by at most a constant factor when the size of the input scales up by a constant factor. A good example of such a problem is multiplication of two n-digit numbers: if you use ordinary long multiplication, then replacing n by 2n increases the time taken by a factor of 4.
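The factor of 4 can be checked by counting single-digit multiplications. The following is a minimal sketch rather than part of the original text: the function name `long_multiply` is hypothetical, and taking "time" to mean the number of digit-by-digit products is an assumption made for illustration.

```python
def long_multiply(x, y):
    """Schoolbook long multiplication on decimal digits.

    Returns (product, number of single-digit multiplications performed).
    """
    xs = [int(d) for d in reversed(str(x))]  # least significant digit first
    ys = [int(d) for d in reversed(str(y))]
    result = [0] * (len(xs) + len(ys))
    steps = 0
    for i, a in enumerate(xs):
        carry = 0
        for j, b in enumerate(ys):
            total = result[i + j] + a * b + carry
            result[i + j] = total % 10
            carry = total // 10
            steps += 1
        result[i + len(ys)] += carry  # final carry of this row
    product = int("".join(str(d) for d in reversed(result)))
    return product, steps


_, steps_4_digits = long_multiply(1234, 5678)
_, steps_8_digits = long_multiply(12345678, 87654321)
```

Two 4-digit numbers need 16 digit products and two 8-digit numbers need 64, so doubling n multiplies the count by 4.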

Suppose that you are presented with a positive integer x and told that it is a product of two primes p and q. How difficult is it to determine p and q? Nobody knows, but one thing is easy to see: if you are told p and q, then it is not hard (for a computer, at any rate) to check that pq really does equal x. Indeed, as we have just seen, long multiplication takes polynomial time, and comparing the answer with x is even easier. The complexity class NP consists of those computational tasks for which a correct answer can be verified in polynomial time, even if it cannot necessarily be found in polynomial time. Remarkably, although this is a fundamental distinction, nobody knows how to prove that P ≠ NP: this problem is widely considered to be the most important in theoretical computer science.
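The contrast between checking and finding can be made concrete. The sketch below is illustrative only: the function names are hypothetical, and trial division is simply one naive way of searching, chosen here for simplicity. Nothing in it says that no faster method of finding p and q exists.

```python
def verify_factorization(x, p, q):
    """Checking a claimed answer: one multiplication and one comparison.

    (This does not test that p and q are prime; it only checks pq = x.)
    """
    return p > 1 and q > 1 and p * q == x


def find_factor_by_trial_division(x):
    """Finding an answer by brute force: try d = 2, 3, 4, ... up to sqrt(x)."""
    d = 2
    while d * d <= x:
        if x % d == 0:
            return d, x // d
        d += 1
    return None  # no nontrivial factor: x is prime or less than 4


x = 101 * 103
found = find_factor_by_trial_division(x)
```

Verification costs about as much as one long multiplication, whereas the number of trial divisions can be as large as the square root of x, which grows exponentially with the number of digits of x.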

We briefly mention two other important complexity classes. PSPACE consists of all problems that can be solved using an amount of memory that grows at most polynomially with the size of the input. It turns out to be the natural class associated with reasonable computational strategies for games such as chess. The complexity class NC is the set of all Boolean functions that can be computed by a “circuit of polynomial size and depth at most a polynomial in log n.” This last class is a model for the class of problems that can be solved very rapidly using parallel processing. In general, complexity classes are often surprisingly good at characterizing large families of problems with interesting and intuitively recognizable features in common. Another remarkable fact is that almost all complexity classes have “hardest problems” within them: that is, problems for which a solution can be converted into a solution for any other problem in the class. These problems are said to be complete for the class in question.

These issues, as well as several other complexity classes, are discussed in computational complexity [IV.20]. A vast number of further classes can be found at http://qwiki.stanford.edu/wiki/Complexity_Zoo along with a brief definition of each.

natural numbers with them in the obvious way: 1 with 0, then 2 with 1, then 3 with −1, then 4 with 2, then 5 with −2, and so on.
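The pairing just described can be written as an explicit formula. The sketch below is an illustration, not part of the original text; the function names `nth_integer` and `position_of` are hypothetical.

```python
def nth_integer(n):
    """Integer paired with the natural number n (n = 1, 2, 3, ...)."""
    return n // 2 if n % 2 == 0 else -(n // 2)


def position_of(z):
    """Inverse map: the natural number paired with the integer z."""
    return 2 * z if z > 0 else 1 - 2 * z


listing = [nth_integer(n) for n in range(1, 8)]
```

Because each map undoes the other, every integer is paired with exactly one natural number and vice versa.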

An infinite set is called countable if it has the same size as the natural numbers. As the above example shows, this is exactly the same as saying that we can list the elements of the set. Indeed, if we have listed a set as a₁, a₂, a₃, . . . , then our correspondence is just to send n to aₙ. It is worth noting that there are of course many attempted listings that fail: for example, for ℤ we might have tried −3, −2, −1, 0, 1, 2, 3, 4, . . . . So it is important to recognize that when we say that a set is countable we are not saying that every attempt to list it works, or even that the obvious attempt does: we are merely saying that there is some way of listing the elements. This is in complete contrast to finite sets, where if we attempt to match up two sets and find some elements of one set left over, then we know that the two sets cannot be in one-to-one correspondence. It is this difference that is mainly responsible for the “odd consequences” mentioned above.

Now that we have established that some sets that seem smaller or larger than ℕ, such as the squares or the integers, are actually countable, let us turn to a set that seems “much larger,” namely ℚ. How could we hope to list all the rationals? After all, between any two of them you can find infinitely many others, so it seems hard not to leave some of them out when you try to list them. However, remarkable as it may seem, it is possible to list the rationals. The key idea is that listing the rationals whose numerator and denominator are both smaller (in modulus) than some fixed number k is easy, as there are only finitely many of them. So we go through in order: first when both numerator and denominator are at most 1, then when they are at most 2, and so on (being careful not to relist any number, so that for example 1/2 should not also appear as 2/4 or 3/6).

This leads to an ordering such as 0, 1, −1, 2, −2, 1/2, −1/2, 3, −3, 1/3, −1/3, 2/3, −2/3, 3/2, −3/2, 4, −4, . . . .
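One way to turn this idea into a procedure is sketched below. It is an illustration under stated assumptions: the generator name `list_rationals` is hypothetical, and the order in which fractions are visited within each stage is one arbitrary choice that happens to reproduce the ordering displayed above.

```python
from fractions import Fraction
from itertools import islice
from math import gcd


def list_rationals():
    """Yield every rational number exactly once.

    Stage k yields the fractions in lowest terms whose larger entry
    (numerator or denominator) is exactly k, each followed by its negative.
    """
    yield Fraction(0)
    k = 1
    while True:
        candidates = [(k, 1)]
        candidates += [(p, k) for p in range(1, k)]  # p/k with p < k
        candidates += [(k, q) for q in range(2, k)]  # k/q with 1 < q < k
        for p, q in candidates:
            if gcd(p, q) == 1:  # skip 2/4, 3/6, ...: already listed as 1/2
                yield Fraction(p, q)
                yield Fraction(-p, q)
        k += 1


first = list(islice(list_rationals(), 17))
```

Any given rational p/q in lowest terms appears during stage max(|p|, q), so it is reached after finitely many steps, which is all that countability requires.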

We could use the same idea to list sets that look even larger, such as, for example, the algebraic numbers (all real numbers, such as √2, that satisfy a polynomial equation with integer coefficients). Indeed, we note that each polynomial has only finitely many roots (which are therefore listable), so all we need to do is list the polynomials (as then we can go through them, in order, listing their roots). And we can do that by applying the same technique again: for each d we list those polynomials of degree at most d that we have not already listed, with coefficients that are at most d in modulus.
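The listing of polynomials can be sketched in the same spirit. In the illustration below, the generator name `list_polynomials` and the representation of a polynomial as a tuple of coefficients are assumptions, as is the decision to skip constant polynomials (which contribute no algebraic numbers). Finding the roots of each polynomial is left out.

```python
from itertools import islice, product


def list_polynomials():
    """Yield each nonconstant integer polynomial exactly once.

    The polynomial c0 + c1*x + ... + cm*x^m is represented by the tuple
    (c0, c1, ..., cm) with cm != 0.
    """
    seen = set()
    d = 1
    while True:
        # All coefficient tuples of degree at most d, entries in [-d, d].
        for coeffs in product(range(-d, d + 1), repeat=d + 1):
            coeffs = list(coeffs)
            while coeffs and coeffs[-1] == 0:  # drop zero leading coefficients
                coeffs.pop()
            poly = tuple(coeffs)
            if len(poly) >= 2 and poly not in seen:  # skip constants, repeats
                seen.add(poly)
                yield poly
        d += 1


first_two_stages = list(islice(list_polynomials(), 120))
```

The polynomial x² − 2, whose roots are ±√2, is the tuple (−2, 0, 1) and appears in the second stage.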

Based on the above examples, one might well guess that every infinite set is countable. But a beautiful argument of Cantor [VI.54], called his “diagonal” argument, shows that the real numbers are not countable.

We imagine that we have a list of all real numbers, say r₁, r₂, r₃, . . .. Our aim is to show that this list cannot possibly contain all the reals, so we wish to construct a real that is not on this list. How do we accomplish this?

We have each rᵢ written as an infinite decimal, say, and now we define a new number s as follows. For the first digit of s (after the decimal point), we choose a digit that is not the first digit of r₁. Note that this already guarantees that s cannot equal r₁. (To avoid coincidences with recurring 9s and the like, it is best to choose this first digit of s not to be 0 or 9 either.) Then, for the second digit of s, we choose a digit that is not the second digit of r₂; this guarantees that s cannot be equal to r₂. Continuing in this way, we end up with a real number s that is not on our list: whatever n is, the number s cannot be rₙ, as s and rₙ differ in the nth decimal place!
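A finite fragment of this construction can be carried out mechanically. The sketch below is only an illustration: it works with the first m digits of m listed numbers, and the particular rule (use 5, or 4 when the digit to be avoided is 5) is one arbitrary choice that never produces 0 or 9.

```python
def diagonal_number(decimal_expansions):
    """Given digit strings for r_1, ..., r_m (the digits after the decimal
    point, at least m of them each), return the first m digits of a number s
    that differs from each r_n in the nth decimal place."""
    digits = []
    for n, r in enumerate(decimal_expansions):  # n is 0-based here
        # Pick a digit different from the diagonal digit, avoiding 0 and 9.
        digits.append("5" if r[n] != "5" else "4")
    return "0." + "".join(digits)


rows = ["5000", "2718", "1415", "3333"]
s = diagonal_number(rows)
```

Here s begins 0.4555, and its nth digit differs from the nth digit of the nth row for each of the four rows.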

One can use similar arguments any time that we have “an infinite number of independent choices” to make in specifying an object (like the various digits of s). For example, let us use the same ideas to show that the set of all subsets of ℕ is uncountable. Suppose we have listed all the subsets as A₁, A₂, A₃, . . . . We will define a new set B that is not equal to any of the Aₙ. So we include the point 1 in B if and only if 1 does not belong to A₁ (this guarantees that B is not equal to A₁), and we include 2 in B if and only if 2 does not belong to A₂, and so on. It is amusing to note that one can write this set B down as {n ∈ ℕ : n ∉ Aₙ}, which shows a striking resemblance to the set in Russell’s paradox.
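The same construction can be tried on a finite list of sets. The following sketch is illustrative only; the function name `diagonal_set` and the sample sets are hypothetical, and only the points 1, ..., m are decided when m sets are given.

```python
def diagonal_set(subsets):
    """Given a list A_1, ..., A_m of sets of positive integers, return
    B = {n : 1 <= n <= m and n is not in A_n}."""
    return {n for n, A in enumerate(subsets, start=1) if n not in A}


A = [{1, 2, 3}, {1, 3}, set(), {4}, {2, 4, 6}]
B = diagonal_set(A)
```

For this list B is {2, 3, 5}, and B disagrees with each Aₙ about whether n is a member.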

Countable sets are the “smallest” infinite sets. However, the set of real numbers is by no means the “largest” infinite set. Indeed, the above argument shows that no set X can be put into one-to-one correspondence with the set of all its subsets. So the set of all subsets of the real numbers is “strictly larger” than the set of real numbers, and so on.

The notion of countability is often a very fruitful one to bear in mind. For example, suppose we want to know whether or not all real numbers are algebraic. It is a genuinely hard exercise to write down a particular real that is transcendental [III.41] (meaning not algebraic; see Liouville’s theorem and Roth’s theorem [V.22] for an idea of how it can be done), but the above notions make it utterly trivial that transcendental numbers exist. Indeed, the set of all real numbers is uncountable but the set of algebraic numbers is countable! Furthermore, this shows that “most” real numbers are transcendental: the algebraic numbers form only a tiny proportion of the reals.

III.12 C*-Algebras

A Banach space [III.62] is both a vector space [I.3 §2.3] and a metric space [III.56], and the study of Banach spaces is therefore a mixture of linear algebra and analysis. However, one can arrive at more sophisticated mixtures of algebra and analysis if one looks at Banach spaces that have more algebraic structure. In particular, while one can add two elements of a Banach space together, one cannot in general multiply them. However, sometimes one can: a vector space with a multiplicative structure is called an algebra, and if the vector space is also a Banach space, and if the multiplication has the property that ‖xy‖ ≤ ‖x‖ ‖y‖ for any two elements x and y, then it is called a Banach algebra. (This name does not really reflect historical reality, since the basic theory of Banach algebras was not worked out by Banach. A more appropriate name might have been Gelfand algebras.)

A C*-algebra is a Banach algebra with an involution, which means a function that associates with each element x another element x* in such a way that x** = x, ‖x*‖ = ‖x‖, (x + y)* = x* + y*, and (xy)* = y*x* for any elements x and y; this involution is required to satisfy the C*-identity ‖xx*‖ = ‖x‖². A basic example of a C*-algebra is the algebra B(H) of all continuous linear maps T defined on a Hilbert space [III.37] H. The norm of T is defined to be the smallest constant M such that ‖Tx‖ ≤ M‖x‖ for every x ∈ H, and the involution takes T to its adjoint. This is a map T* that has the property that ⟨x, Ty⟩ = ⟨T*x, y⟩ for every x and y in H. (It can be shown that there is exactly one map with this property.) If H is finite dimensional, then T can be thought of as an n×n matrix for some n, and T* is then the complex conjugate of the transpose of T.
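In the finite-dimensional case these properties can be checked numerically. The sketch below is an illustration under stated assumptions: it is restricted to 2×2 complex matrices so that the operator norm has a closed form (the square root of the largest eigenvalue of T*T), the inner product is taken to be linear in its first argument, and all function names and sample matrices are hypothetical.

```python
import math


def adjoint(T):
    """Conjugate transpose of a square matrix given as a list of rows."""
    n = len(T)
    return [[T[j][i].conjugate() for j in range(n)] for i in range(n)]


def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def apply(T, x):
    """The vector Tx."""
    return [sum(T[i][k] * x[k] for k in range(len(x))) for i in range(len(T))]


def inner(x, y):
    """Inner product <x, y>, linear in x and conjugate-linear in y."""
    return sum(a * b.conjugate() for a, b in zip(x, y))


def op_norm_2x2(T):
    """Operator norm of a 2x2 matrix: sqrt of the largest eigenvalue of T*T."""
    H = matmul(adjoint(T), T)  # Hermitian and positive semidefinite
    tr = (H[0][0] + H[1][1]).real
    det = (H[0][0] * H[1][1] - H[0][1] * H[1][0]).real
    disc = max(tr * tr - 4 * det, 0.0)  # guard against rounding below zero
    return math.sqrt((tr + math.sqrt(disc)) / 2)


T = [[1 + 2j, 3], [0, 1j]]
S = [[2, 1j], [1, -1]]
x, y = [1, 2 - 1j], [1j, 4]

adjoint_gap = abs(inner(x, apply(T, y)) - inner(apply(adjoint(T), x), y))
cstar_gap = abs(op_norm_2x2(matmul(T, adjoint(T))) - op_norm_2x2(T) ** 2)
```

Both gaps are zero up to rounding error, reflecting the defining property of the adjoint and the identity ‖TT*‖ = ‖T‖² for these particular matrices; the same helpers can be used to compare ‖ST‖ with ‖S‖ ‖T‖.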

A fundamental theorem of Gelfand and Naimark states that every C*-algebra can be represented as a subalgebra of B(H) for some Hilbert space H. For more information, see operator algebras [IV.15 §3].

III.13 Curvature

If you cut an orange in half, scoop out the inside, and try to flatten one of the resulting hemispheres of peel, then you will tear it. If you try to flatten a horse’s saddle, or a soggy potato chip, then you will have the opposite problem: this time, there is “too much” of the surface to flatten and you will have to fold it over itself. If, however, you have a roll of wallpaper and wish to flatten it, then there is no difficulty: you just unroll it. Surfaces such as spheres are said to be positively curved, ones with a saddle-like shape are negatively curved, and ones like a piece of wallpaper are flat.

Notice that a surface can be flat in this sense even if it does not lie in a plane. This is because curvature is defined in terms of the intrinsic geometry of a surface, where distance is measured in terms of paths that lie inside the surface.

There are various ways of making the above notion of curvature precise, and also quantitative, so that with each point of a surface one can associate a number that tells you “how curved” it is at that point. In order to do this, the surface must have a Riemannian metric [I.3 §6.10] on it, which is used to determine the lengths of paths. The notion of curvature can also be generalized to higher dimensions, so that one can talk about the curvature at a point in a d-dimensional Riemannian manifold. However, when the dimension is higher than 2, the way that the manifold can curve at a point is more complicated, and is expressed not by a single number but by the so-called Ricci tensor. See Ricci flow [III.78] for more details.

Curvature is one of the fundamental concepts of modern geometry: not only the notion just described but also various alternative definitions that measure in other ways how far a geometric object deviates from being flat. It is also an integral part of the theory of general relativity (which is discussed in general relativity and the Einstein equations [IV.13]).

III.14 Designs

Peter J. Cameron

Block designs were first used in the design of experiments in statistics, as a method for coping with systematic differences in the experimental material. Suppose, for example, that we want to test seven different varieties of seed in an agricultural experiment, and that we have twenty-one plots of land available for the experiment. If the plots can be regarded as identical, then the best strategy is clearly to plant three plots with each variety. Suppose, however, that the available plots are on seven farms in different regions, with three plots on each farm. If we simply plant one variety on each farm, we lose information, because we cannot distinguish systematic differences between regions from differences in the seed varieties. It is better to follow a scheme like this: plant varieties 1, 2, 3 on the first farm; 1, 4, 5 on the second; and then 1, 6, 7; 2, 4, 6; 2, 5, 7; 3, 4, 7; and 3, 5, 6. This design is represented in figure 1.

Figure 1 A block design.

This arrangement is called a balanced incomplete-block design, or BIBD for short. The blocks are the sets of seed varieties used on the seven farms. The blocks are “incomplete” because not every variety can be planted on every farm; the design is “balanced” because each pair of varieties occurs in the same block the same number of times (just once in this case).

This is a (7, 3, 1) design: there are seven varieties; each block contains three of them; and two varieties occur together in a block once. It is also an example of a finite projective plane. Because of the connection with geometry, varieties are usually called “points.”
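The balance property of the seven blocks listed earlier can be verified directly. The sketch below is an illustration; the function names `pair_counts` and `is_bibd` are hypothetical, and points are assumed to be labeled 1, ..., v.

```python
from collections import Counter
from itertools import combinations

# The seven farms from the seed-variety example.
blocks = [(1, 2, 3), (1, 4, 5), (1, 6, 7), (2, 4, 6),
          (2, 5, 7), (3, 4, 7), (3, 5, 6)]


def pair_counts(blocks):
    """Count how many blocks contain each unordered pair of points."""
    counts = Counter()
    for block in blocks:
        for pair in combinations(sorted(block), 2):
            counts[pair] += 1
    return counts


def is_bibd(blocks, v, k, lam):
    """Check the defining properties of a (v, k, lam) design on points 1..v."""
    counts = pair_counts(blocks)
    return (all(len(set(b)) == k for b in blocks)
            and all(1 <= x <= v for b in blocks for x in b)
            and all(counts[pair] == lam
                    for pair in combinations(range(1, v + 1), 2)))


fano_ok = is_bibd(blocks, 7, 3, 1)
```

The seven blocks contain 7 × 3 = 21 pairs in total, which is exactly the number of pairs of seven points, so each pair must occur once and only once.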

Mathematicians have developed an extensive theory of BIBDs and related classes of designs. Indeed, the study of such designs predates their use in statistics. In 1847, T. P. Kirkman showed that a (v, 3, 1) design exists if and only if v is congruent to 1 or 3 mod 6. (Such designs are now called Steiner triple systems, although Steiner did not pose the problem of their existence until 1853.)

Kirkman also posed a more difficult problem. In his own words,

Fifteen young ladies in a school walk out three abreast for seven days in succession: it is required to arrange them daily so that no two shall walk twice abreast.

The solution requires a (15, 3, 1) Steiner triple system with the extra property that the thirty-five blocks can be partitioned into seven sets called “replicates,” each replicate consisting of five blocks that partition the set of points. Kirkman himself gave a solution, but it was not until the late 1960s that Ray-Chaudhuri and Wilson showed that (v, 3, 1) designs with this property exist whenever v is congruent to 3 mod 6.

For which v, k, λ do designs exist? Counting arguments show that, given k and λ, the values of v for which (v, k, λ) designs exist are restricted to certain congruence classes. (We noted above that (v, 3, 1) designs exist only if v is congruent to 1 or 3 mod 6.) An asymptotic existence theory developed by Richard Wilson shows that this necessary condition is sufficient for the existence of a design, apart from finitely many exceptions, for each value of k and λ.
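The counting arguments are not spelled out in the text; the sketch below uses the standard ones, which is an assumption about what is meant. Counting the pairs through a fixed point shows that each point lies in r = λ(v − 1)/(k − 1) blocks, and counting all pairs shows that there are b = λv(v − 1)/(k(k − 1)) blocks, so both quantities must be whole numbers. The function names are hypothetical.

```python
def admissible(v, k, lam):
    """Necessary divisibility conditions for a (v, k, lam) design to exist."""
    return ((lam * (v - 1)) % (k - 1) == 0
            and (lam * v * (v - 1)) % (k * (k - 1)) == 0)


def parameters(v, k, lam):
    """(r, b): blocks through each point and total number of blocks."""
    return lam * (v - 1) // (k - 1), lam * v * (v - 1) // (k * (k - 1))


triple_system_orders = [v for v in range(3, 30) if admissible(v, 3, 1)]
```

For k = 3 and λ = 1 the admissible values below 30 are 3, 7, 9, 13, 15, 19, 21, 25, 27, which are exactly the values congruent to 1 or 3 mod 6. For v = 15 the formulas give r = 7 and b = 35, matching the seven days and thirty-five blocks of Kirkman's problem.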

The concept of design has been further generalized: a t-(v, k, λ) design has the property that any t points are contained in exactly λ blocks. Luc Teirlinck showed that nontrivial t-designs exist for all t, but examples for t > 3 are comparatively rare.

The statisticians’ concerns are a bit different. In our introductory example, if only six farms were available, we could not use a BIBD for the experiment, but would have to choose the most “efficient” possible design (allowing the most information to be obtained from the experimental results). A BIBD is most efficient if it exists; but not much is known in other cases.

There are other types of design; these can be important to statistics and also lead to new mathematics. Here, for example, is an orthogonal array: if you take any two rows of this matrix you obtain a 2×9 matrix in which each ordered pair of symbols from {0, 1, 2} occurs exactly once as a column.

    0 0 0 1 1 1 2 2 2
    0 1 2 0 1 2 0 1 2
    0 1 2 1 2 0 2 0 1
    0 2 1 1 0 2 2 1 0

It could be used if we had four different treatments, each of which could be applied at three different levels, and if we had nine plots available for testing.
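The defining property of this orthogonal array is easy to check by machine. The sketch below is illustrative; the function name `is_orthogonal_array` is hypothetical.

```python
from itertools import combinations, product

array = [
    [0, 0, 0, 1, 1, 1, 2, 2, 2],
    [0, 1, 2, 0, 1, 2, 0, 1, 2],
    [0, 1, 2, 1, 2, 0, 2, 0, 1],
    [0, 2, 1, 1, 0, 2, 2, 1, 0],
]


def is_orthogonal_array(rows, symbols):
    """True if, for every two rows, each ordered pair of symbols
    occurs exactly once as a column."""
    wanted = sorted(product(symbols, repeat=2))
    return all(sorted(zip(r1, r2)) == wanted
               for r1, r2 in combinations(rows, 2))


array_ok = is_orthogonal_array(array, [0, 1, 2])
```

All six pairs of rows pass the test, so each of the nine plots (columns) receives a combination of levels in which every two treatments are balanced against each other.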

Design theory is closely related to other combinatorial topics such as error-correcting codes; indeed, Fisher “discovered” the Hamming codes as designs five years before R. W. Hamming found them in the context of error correction. Other related subjects include packing and covering problems, and especially finite geometry, where many finite versions of classical geometries can be regarded as designs.

III.15 Determinants

The determinant of a 2×2 matrix

$$\begin{pmatrix} a & b \\ c & d \end{pmatrix}$$

is defined to be ad − bc. The determinant of a 3×3 matrix

$$\begin{pmatrix} a & b & c \\ d & e & f \\ g & h & i \end{pmatrix}$$

is defined to be aei + bfg + cdh − afh − bdi − ceg. What do these expressions have in common, how do they generalize, and why is the generalization significant?

To begin with the first question, let us make a few simple observations. Both expressions are sums and differences of products of entries from the matrix. Each one of these products contains exactly one element from each row of the matrix and also exactly one element from each column. In both cases, a minus sign seems to attach itself to the products for which the entries selected from the matrix “slope upward” rather than “downward.”

Up to a point it is easy to see how to extend this definition to n×n matrices with n ≥ 4. We simply take sums and differences of all possible products of n entries, where one entry from each row is used and one from each column. The difficulty comes in deciding which of these products to add and which to subtract. To do this we take one of the products and use it to define a permutation σ of the set {1, 2, . . . , n} as follows. For each i ≤ n, the product contains exactly one entry in the ith row. If it belongs to the jth column then σ(i) = j. The product is added if this permutation is even and subtracted if it is odd (see permutation groups [III.68]). So, for example, the permutation corresponding to the entry afh in the 3×3 determinant above sends 1 to 1, 2 to 3, and 3 to 2. This is an odd permutation, which is why afh receives a minus sign.
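The definition translates directly into a short program. The sketch below is an illustration of the definition rather than a practical method, since it sums n! terms; the function names are hypothetical, indices are 0-based, and the parity of a permutation is computed by counting inversions.

```python
from itertools import permutations


def sign(perm):
    """+1 for an even permutation, -1 for an odd one (counting inversions)."""
    inversions = sum(1 for i in range(len(perm))
                     for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1


def det(matrix):
    """Determinant as a signed sum over all permutations sigma."""
    n = len(matrix)
    total = 0
    for perm in permutations(range(n)):
        term = sign(perm)
        for i in range(n):
            term *= matrix[i][perm[i]]  # entry in row i, column sigma(i)
        total += term
    return total


d2 = det([[1, 2], [3, 4]])
d3 = det([[2, 0, 1], [1, 3, 2], [1, 1, 4]])
```

With 0-based indices the permutation belonging to afh is (0, 2, 1), and `sign((0, 2, 1))` is −1, in agreement with the minus sign in the 3×3 formula.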

We still need to explain why the particular choice of products and minus signs that we have just defined is important. The reason is that it tells us something about the effect of a matrix when it is considered as a linear map. Let A be an n×n matrix. Then, as explained in [I.3 §3.2], A specifies a linear map α from ℝⁿ to ℝⁿ. The determinant of A tells us what this linear map does to volumes. More precisely, if X is a subset of ℝⁿ with n-dimensional volume V, then αX, the result of transforming X using the linear map α, will have volume V times the determinant of A. We could write this symbolically as follows:

vol(αX) = det A · vol(X).

For example, consider the 2×2 matrix

$$A = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}.$$

The corresponding linear map is a rotation of ℝ² through an angle of θ. Since rotating a shape does not affect its volume, we should expect the determinant of A to be 1, and sure enough it is cos²θ + sin²θ, which is 1 by Pythagoras’s theorem.

The above explanation is a slight oversimplification in one respect: determinants can be negative, but clearly volumes cannot. If the determinant of a matrix is −2, to give an example, it means that the linear map multiplies volumes by 2 but also “turns shapes inside out” by reflecting them.
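In two dimensions both effects can be seen by applying a matrix to the unit square and measuring the signed area of the image. The sketch below is an illustration: the function names are hypothetical, and the shoelace formula is used for the area, with a negative value indicating that the counterclockwise order of the corners has been reversed.

```python
import math


def image_of_unit_square(A):
    """Corners of the unit square after applying the 2x2 matrix A."""
    corners = [(0, 0), (1, 0), (1, 1), (0, 1)]  # counterclockwise
    return [(A[0][0] * x + A[0][1] * y, A[1][0] * x + A[1][1] * y)
            for x, y in corners]


def signed_area(polygon):
    """Shoelace formula: positive if the vertices run counterclockwise."""
    n = len(polygon)
    return sum(polygon[i][0] * polygon[(i + 1) % n][1]
               - polygon[(i + 1) % n][0] * polygon[i][1]
               for i in range(n)) / 2


def det2(A):
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]


theta = 0.7
rotation = [[math.cos(theta), -math.sin(theta)],
            [math.sin(theta), math.cos(theta)]]
stretch_and_reflect = [[1, 0], [0, -2]]

area_rotated = signed_area(image_of_unit_square(rotation))
area_reflected = signed_area(image_of_unit_square(stretch_and_reflect))
```

The rotated square still has area 1, while the second matrix, whose determinant is −2, produces a signed area of −2: the area has doubled and the orientation has been reversed.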

Determinants have many useful properties, which become obvious once one knows the above interpretation in terms of volumes. (However, it is much less obvious that this interpretation is correct: in setting up the theory of determinants one must do some work somewhere.) Let us give three of these properties.

(i) Let V be a vector space [I.3 §2.3] and let α : V → V be a linear map. Let v₁, . . . , vₙ be a basis of V and let A be the matrix of α with respect to this basis. Now let w₁, . . . , wₙ be another basis of V and let B be the matrix of α with respect to this different basis. Then A and B are different matrices, but since they both represent the linear map α, they must have the same effect on volumes. It follows that det(A) = det(B). To put this another way: the determinant is better thought of as a property of linear maps rather than of matrices.

Two matrices that represent the same linear map in the above sense are called similar. It turns out that A and B are similar if and only if there is an invertible matrix P such that P⁻¹AP = B. (An n×n matrix P is invertible if there is a matrix Q such that PQ equals the n×n identity matrix, Iₙ, which turns out to imply that QP equals Iₙ as well. If this is true, then Q is called the inverse of P and is denoted P⁻¹.) What we have just shown is that similar matrices have the same determinant.
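As a small numerical check of the last statement, the sketch below conjugates one 2×2 matrix by another and compares determinants. The particular matrices A and P and the helper names are arbitrary choices made for illustration; exact fractions are used so that no rounding is involved.

```python
from fractions import Fraction


def matmul2(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def det2(A):
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]


def inverse2(P):
    """Inverse of an invertible 2x2 matrix, using exact fractions."""
    d = Fraction(det2(P))
    if d == 0:
        raise ValueError("matrix is not invertible")
    return [[P[1][1] / d, -P[0][1] / d], [-P[1][0] / d, P[0][0] / d]]


A = [[2, 1], [0, 3]]
P = [[1, 2], [3, 5]]
B = matmul2(inverse2(P), matmul2(A, P))  # B = P^{-1} A P
```

Here B works out to be the matrix with rows (−7, −15) and (6, 12), which looks nothing like A, yet both have determinant 6.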

(ii) If A and B are any two n×n matrices, then they represent linear maps α and β of ℝⁿ. The product AB represents the linear map αβ: that is, the linear map that results from doing β followed by α. Since β multiplies volumes by det B and α multiplies them by

From the document The Princeton Companion to Mathematics (pages 193-200)