See normed spaces and Banach spaces [III.62]
III.3 Bayesian Analysis
Suppose you throw a pair of standard dice. The probability that the total is 10 is 1/12 because there are thirty-six ways the dice can come up, of which three (4 and 6, 5 and 5, and 6 and 4) give 10. If, however, you look at the first die and see that it came up as a 6, then the conditional probability that the total is 10, given this information, is 1/6 (since that is the probability that the other die comes up as a 4).
In general, the probability of A given B is defined to be the probability of A and B divided by the probability of B. In symbols, one writes
P[A|B] = P[A∧B] / P[B].
From this it follows that P[A∧B] = P[A|B]P[B]. Now P[A∧B] is the same as P[B∧A]. Therefore,
P[A|B]P[B] = P[B|A]P[A],
since the left-hand side is P[A∧B] and the right-hand side is P[B∧A]. Dividing through by P[B] we obtain Bayes’s theorem:
P[A|B] = P[B|A]P[A] / P[B],
which expresses the conditional probability of A given B in terms of the conditional probability of B given A.
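The dice example can be checked by brute-force enumeration. The following is a minimal sketch rather than part of the original discussion: the helper names `prob` and `cond_prob` are illustrative, and exact fractions are used so that the values 1/12 and 1/6 appear without rounding.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes for an ordered pair of standard dice.
outcomes = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event, given as a predicate on (first, second)."""
    favourable = sum(1 for o in outcomes if event(o))
    return Fraction(favourable, len(outcomes))

def cond_prob(a, b):
    """P[A|B] = P[A and B] / P[B], the definition used in the text."""
    return prob(lambda o: a(o) and b(o)) / prob(b)

def total_is_10(o):
    return o[0] + o[1] == 10

def first_is_6(o):
    return o[0] == 6

p_a = prob(total_is_10)                            # P[A]
p_a_given_b = cond_prob(total_is_10, first_is_6)   # P[A|B]
p_b_given_a = cond_prob(first_is_6, total_is_10)   # P[B|A]

# Bayes's theorem: P[A|B] = P[B|A] P[A] / P[B].
bayes_rhs = p_b_given_a * p_a / prob(first_is_6)
print(p_a, p_a_given_b, bayes_rhs)   # prints: 1/12 1/6 1/6
```

Here A is the event that the total is 10 and B is the event that the first die shows 6, so the last two printed values agree, as Bayes’s theorem requires.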
A fundamental problem in statistics is to analyze random data given by an unknown probability distribution [III.71]. Here, Bayes’s theorem can make a significant contribution. For example, suppose you are told that some unbiased coins have been tossed and that three of them have come up heads. Suppose that you are told that the number of coins tossed is between 1 and 10, and that you wish to guess this number. Let H_3 stand for the event that three coins came up heads and let C be the number of coins. Then for each n between 1 and 10 it is not hard to calculate the conditional probability P[H_3|C = n], but we would like to know the reverse, namely P[C = n|H_3]. Bayes’s theorem tells us that it is
P[H_3|C = n]P[C = n] / P[H_3].
This would tell us the ratios between the various conditional probabilities P[C = n|H_3] if we knew what the probabilities P[C = n] were. Typically, one does not know this, but one makes some kind of guess, called a prior distribution. For example, one might guess, before knowing that three coins had come up heads, that for each n between 1 and 10 the probability that n coins had been chosen was 1/10. After receiving this information, one would use the calculation above to revise one’s assessment and obtain a posterior distribution, in which the probability that C = n would be proportional to
(1/10) P[H_3|C = n].
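The posterior distribution for the coin example can be computed directly. The sketch below makes two assumptions that go beyond the text: that "three of them have come up heads" means exactly three heads, so that P[H_3|C = n] is the binomial probability C(n, 3)/2^n, and that the prior is the uniform guess of 1/10 for each n. The names `likelihood`, `prior`, and `posterior` are illustrative.

```python
from fractions import Fraction
from math import comb

def likelihood(n, heads=3):
    """P[H_3 | C = n]: exactly `heads` heads among n fair coin tosses."""
    return Fraction(comb(n, heads), 2 ** n)

ns = range(1, 11)
prior = {n: Fraction(1, 10) for n in ns}       # uniform prior (a guess)

# Unnormalized posterior: prior times likelihood, as in the text.
weights = {n: prior[n] * likelihood(n) for n in ns}
p_h3 = sum(weights.values())                   # P[H_3], the normalizing constant
posterior = {n: w / p_h3 for n, w in weights.items()}

for n in ns:
    print(n, posterior[n], round(float(posterior[n]), 4))
```

Under these assumptions the posterior vanishes for n = 1 and n = 2 and is largest, with equal values, at n = 5 and n = 6. A different prior would in general give a different posterior, which is the point made in the paragraph that follows.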
There is more to Bayesian analysis than simply applying Bayes’s theorem to replace prior distributions by posterior distributions. In particular, as in the example just given, there is not always an obvious prior distribution to take, and it is a subtle and interesting mathematical problem to devise methods for choosing prior distributions that are “optimal” in different ways.
For further discussion, see mathematics and medical statistics [VII.11] and mathematical statistics [VII.10].
III.4 Braid Groups
F. E. A. Johnson
Take two parallel planes, each punctured at n points. Label the holes 1 to n in each plane, and run a string from each hole in the first plane to one in the second, in such a way that no two strings go to the same hole. The result is an n-braid. Two different 3-braids, shown in two-dimensional projection in a similar manner to knot diagrams [III.44], are given in figure 1.
As the diagrams suggest, we insist that the strings go from left to right without “doubling back”; so, for example, a knotted string is not allowed.
A certain freedom is allowed when we describe a braid: provided that the string ends remain fixed and the strings do not break or pass through each other, one can stretch, contract, bend, and otherwise move the strings about in three dimensions and end up with the “same” braid. This notion of “sameness” is an equivalence relation [I.2 §2.3] called braid isotopy.
Braids may be composed as follows: arrange a pair of braids end to end to abut in a common (middle) plane, join up the strings, and remove the middle plane. For the braids X and Y in figure 1, the composition XY is given in figure 2.
With this notion of composition, n-braids form a group B_n. In our example, Y = X^{−1}, since “pulling all the strings tight” shows that XY is isotopic to the trivial braid (figure 3), which acts as the identity.
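Composition and inversion of braids can be sketched in code if a braid is recorded as a sequence of crossings. The encoding below is an illustrative assumption, not notation from the text: the integer k stands for a crossing of the kth string over the (k+1)st, and −k for the opposite crossing. The braid X used here is hypothetical and is not claimed to be the X of figure 1. The function `free_reduce` cancels only adjacent opposite crossings, so it detects the most obvious isotopies and nothing more.

```python
# A braid word is a list of nonzero integers: k stands for a crossing of
# string k over string k+1, and -k for the opposite crossing.
# (Illustrative convention, not taken from the text.)

def compose(x, y):
    """Place braid y after braid x by concatenating the words."""
    return list(x) + list(y)

def inverse(x):
    """Mirror the braid: reverse the word and flip each crossing."""
    return [-k for k in reversed(x)]

def free_reduce(word):
    """Cancel adjacent pairs k, -k. This is only free cancellation; it
    does not decide braid isotopy in general."""
    stack = []
    for k in word:
        if stack and stack[-1] == -k:
            stack.pop()
        else:
            stack.append(k)
    return stack

X = [1, -2, 1]        # a hypothetical 3-braid
Y = inverse(X)
print(Y)                              # prints: [-1, 2, -1]
print(free_reduce(compose(X, Y)))     # prints: [] (the trivial braid)
```

Concatenation mirrors the "end to end" description of composition, and the empty word plays the role of the trivial braid.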
Figure 1 Two 3-braids (labeled X and Y).
Figure 2 Braid composition.
Figure 3 The trivial braid.
Figure 4 The generator σ_i.
As a group, B_n is generated by elements σ_i (1 ≤ i ≤ n−1), where σ_i is formed from the trivial braid by crossing the ith string over the (i+1)st as in figure 4. The reader may perceive a similarity between the σ_i and the adjacent transpositions that generate the group S_n of permutations [III.68] of {1, . . . , n}. Indeed, any braid determines a permutation by the rule i → right-hand label of ith string.
Ignoring everything except the behavior at the ends gives a surjective homomorphism B_n → S_n, which maps σ_i to the transposition (i, i+1). This is not an isomorphism, however, as B_n is infinite. In fact, σ_i has infinite order, whereas the transposition (i, i+1) squares to the identity. In his celebrated 1925 paper “Theorie der Zöpfe,” Artin [VI.86] showed that multiplication in B_n is completely described by the relations
σ_i σ_j = σ_j σ_i (|i−j| ≥ 2),   σ_i σ_{i+1} σ_i = σ_{i+1} σ_i σ_{i+1}.
These relations have subsequently acquired impor- tance in statistical physics, where they are known as the Yang–Baxter equations.
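The homomorphism from B_n to S_n described above can be sketched as follows. The encoding of a braid as a list of signed integers (k for σ_i with i = k, −k for its inverse) is an illustrative assumption, as is the function name `braid_permutation`. Since only the ends of the strings matter, the sign of each crossing is ignored.

```python
def braid_permutation(word, n):
    """Image of a braid word under the map B_n -> S_n.

    word: list of nonzero ints, k for sigma_k and -k for its inverse
    (an illustrative encoding). Returns a dict sending each left-hand
    label i to the right-hand label of the i-th string.
    """
    # at_position[p - 1] is the string currently at height p.
    at_position = list(range(1, n + 1))
    for k in word:
        i = abs(k)   # over/under information is forgotten in S_n
        at_position[i - 1], at_position[i] = at_position[i], at_position[i - 1]
    return {string: p + 1 for p, string in enumerate(at_position)}

identity = {i: i for i in range(1, 4)}

# sigma_1 maps to the transposition (1, 2).
print(braid_permutation([1], 3) == {1: 2, 2: 1, 3: 3})      # prints: True
# sigma_1 squared maps to the identity permutation.
print(braid_permutation([1, 1], 3) == identity)             # prints: True
# Both sides of the braid relation have the same image.
print(braid_permutation([1, 2, 1], 3) == braid_permutation([2, 1, 2], 3))  # prints: True
```

The second check illustrates why the map is not an isomorphism: the word σ_1 σ_1 has trivial image although, as stated above, σ_1 has infinite order in B_n.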
In groups defined by generators and relations it is usually difficult (there being no method that works uniformly in all cases) to decide whether an arbitrary word in the generators represents the identity element (see geometric and combinatorial group theory [IV.10]). For B_n, Artin solved this problem geometrically, by “combing the braid.” An alternative algebraic method, due to Garside (1967), also decides when two elements in B_n are conjugate.
In relation to the decidability of such questions, and in many other respects, braid groups display close affinities with linear groups: that is, groups in which all elements behave as if they were invertible N × N matrices. Although such similarities suggested that it should be possible to prove that braid groups genuinely are linear, the problem of doing so remained unsolved for many years, until in 2001 a proof was eventually found by Bigelow and independently by Krammer.
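To see concretely what it means for braids to "behave like matrices," one can send each generator σ_i to an invertible matrix and check that Artin’s relations hold. The sketch below uses the unreduced Burau matrices, which are not mentioned in the text and are brought in here purely as an illustration. This is not the representation used by Bigelow and Krammer, and nothing here shows that it is faithful. The parameter t is evaluated at the arbitrary value 2, so the check confirms the relations only at that value.

```python
from fractions import Fraction

def burau(i, n, t):
    """Unreduced Burau matrix of sigma_i in B_n, evaluated at a number t."""
    m = [[Fraction(int(r == c)) for c in range(n)] for r in range(n)]
    r = i - 1   # zero-based index of row/column i
    m[r][r], m[r][r + 1] = 1 - t, t
    m[r + 1][r], m[r + 1][r + 1] = Fraction(1), Fraction(0)
    return m

def mul(a, b):
    """Product of two square matrices of the same size."""
    n = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]

n, t = 4, Fraction(2)   # arbitrary illustrative choices
s = {i: burau(i, n, t) for i in range(1, n)}

# sigma_i sigma_j = sigma_j sigma_i when |i - j| >= 2.
far_commute = mul(s[1], s[3]) == mul(s[3], s[1])

# sigma_i sigma_{i+1} sigma_i = sigma_{i+1} sigma_i sigma_{i+1}.
braid_relation = all(
    mul(mul(s[i], s[i + 1]), s[i]) == mul(mul(s[i + 1], s[i]), s[i + 1])
    for i in range(1, n - 1)
)

# Adjacent generators are not expected to commute.
adjacent_commute = mul(s[1], s[2]) == mul(s[2], s[1])

print(far_commute, braid_relation, adjacent_commute)   # prints: True True False
```

Exact rational arithmetic is used so that matrix equality is tested without rounding error.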
The groups described here are, strictly speaking, braid groups of the plane, the plane being the object punctured. Other braid groups also occur, often in surprising contexts. The connection with statistical physics has already been mentioned. They also arise in algebraic geometry, when algebraic curves become punctured by discarding exceptional points. Thus, though originating in topology, braids may intervene significantly in areas such as “constructive Galois theory” that seem at first sight to be purely algebraic.
III.5 Buildings
Mark Ronan
The invertible linear transformations on a vector space form a group, called the general linear group. If n is the dimension of the vector space and K is the field of scalars, then it is denoted by GL_n(K), and if we pick a basis for the vector space, then each group element can be written as an n × n matrix whose determinant [III.15] is nonzero. This group and its subgroups are of great interest in mathematics, and can be studied “geometrically” in the following way. Instead of looking at the vector space V, where of course the origin plays a unique role and is fixed by the group, we use the projective space [I.3 §6.7] associated with V: the points of projective space are the one-dimensional subspaces of V, the lines are the two-dimensional subspaces, the planes are the three-dimensional subspaces, and so on.
Several important subgroups of GL_n(K) can be obtained by imposing constraints on the linear maps (or matrices). For example, SL_n(K) consists of all linear transformations of determinant 1. The group O(n) consists of all linear transformations α of an n-dimensional real inner-product space such that ⟨αv, αw⟩ = ⟨v, w⟩ for any two vectors v and w (or in matrix terms all real matrices A such that AA^T = I); more generally, one can define many similar subgroups of GL_n(K) by taking all linear maps that preserve certain forms, such as bilinear or sesquilinear forms. These subgroups are called classical groups. The classical groups are either simple or close to simple (for example, we can often make them simple by quotienting out by the subgroup of scalar matrices). When K is the field of real or complex numbers, the classical groups are Lie groups.
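The two descriptions of O(n), as maps preserving the inner product and as matrices with AA^T = I, can be compared on a small example. The sketch below is illustrative only: the function names are assumptions, and the sample matrix is a 2 × 2 rotation with rational entries taken from the 3-4-5 triangle so that the arithmetic is exact.

```python
from fractions import Fraction

def transpose(a):
    return [list(col) for col in zip(*a)]

def mul(a, b):
    """Matrix product of a and b."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def is_orthogonal(a):
    """True when A A^T = I, the matrix form of preserving <v, w>."""
    n = len(a)
    identity = [[Fraction(int(r == c)) for c in range(n)] for r in range(n)]
    return mul(a, transpose(a)) == identity

def inner(v, w):
    """Standard inner product on R^n."""
    return sum(x * y for x, y in zip(v, w))

def apply(a, v):
    """Matrix a applied to the vector v."""
    return [inner(row, v) for row in a]

# A rotation with rational entries (cosine 3/5, sine 4/5).
c, s = Fraction(3, 5), Fraction(4, 5)
A = [[c, -s], [s, c]]

v = [Fraction(1), Fraction(2)]
w = [Fraction(-3), Fraction(5)]

print(is_orthogonal(A))                                    # prints: True
print(inner(apply(A, v), apply(A, w)) == inner(v, w))      # prints: True
```

A matrix such as the shear with rows (1, 1) and (0, 1) is invertible, so it lies in GL_2, but it fails the test `is_orthogonal`.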
Lie groups and their classification are discussed in Lie theory [III.48]: the simple Lie groups comprise the classical groups, which fall into one of four families, known as A_n, B_n, C_n, and D_n (where n is a natural number), along with other types known as E_6, E_7, E_8, F_4, and G_2. The subscripts are related to the dimensions of the groups. For example, the groups of type A_n are the groups of invertible linear transformations in n+1 dimensions.
These simple Lie groups have analogues over any field, where they are often referred to as groups of Lie type. For example, K can be a finite field, in which case the groups are finite. It turns out that almost all finite simple groups are of Lie type: see the classification of finite simple groups [V.7]. A geometric theory underlying the classical groups had been developed by the first half of the twentieth century. It used projective space and various subgeometries of projective space, which made it possible to provide analogues for the classical groups, but it did not provide analogues for the groups of types E_6, E_7, E_8, F_4, and G_2. For this reason, Jacques Tits looked for a geometric theory that would embrace all families, and ended up creating the theory of buildings.
The full abstract definition of a building is somewhat complicated, so instead we shall try to give some idea of the concept by looking at the building associated with the groups GL_n(K) and SL_n(K), which are of type A_{n−1}. This building is an abstract simplicial complex, which can be thought of as a higher-dimensional analogue of a graph [III.34]. It consists of a collection of points called vertices; as in a graph, some pairs of vertices form edges; however, it is then possible for triples of vertices to form two-dimensional faces, and for sets of k vertices to form (k−1)-dimensional “simplexes.” (The geometrical meaning of the word “simplex” is a convex hull of a finite set of points in general position: for instance, a three-dimensional simplex is a tetrahedron.) All faces of simplexes must also be included, so for example three vertices cannot form a two-dimensional face unless each pair is joined by an edge.
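The closure condition on faces can be stated as a short check. In the sketch below a simplex is recorded as a collection of vertex names, which is an illustrative encoding, and `is_simplicial_complex` is a hypothetical helper that verifies that every nonempty proper face of every simplex is itself present.

```python
from itertools import combinations

def is_simplicial_complex(simplexes):
    """Check that every nonempty proper face of each simplex is present.

    simplexes: an iterable of vertex collections (illustrative encoding).
    """
    family = {frozenset(s) for s in simplexes}
    for s in family:
        for k in range(1, len(s)):
            for face in combinations(s, k):
                if frozenset(face) not in family:
                    return False
    return True

vertices = [("a",), ("b",), ("c",)]
edges = [("a", "b"), ("b", "c"), ("a", "c")]
triangle = [("a", "b", "c")]

# A filled triangle together with its edges and vertices.
print(is_simplicial_complex(vertices + edges + triangle))       # prints: True
# The same face with the edge {a, c} missing is not allowed.
print(is_simplicial_complex(vertices + edges[:2] + triangle))   # prints: False
```

The second call fails for exactly the reason given in the text: three vertices cannot span a two-dimensional face unless each pair is joined by an edge.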
To form the building of type A_{n−1}, we start by taking all the 1-spaces, 2-spaces, 3-spaces, and so on (corresponding to points, lines, planes, and so on, in projective space), and treat them as “vertices.” The simplexes are formed by all nested sequences of proper subspaces: for example, a 2-space inside a 4-space inside a 5-space will form a “triangle” whose vertices are these three subspaces. The simplexes of maximal dimension have n−1 vertices: a 1-space inside a 2-space inside a 3-space, and so on. These simplexes are called chambers.
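For a very small field the whole building can be listed. The sketch below takes n = 3 and K the field with two elements, a choice made here for illustration and not discussed in the text. The vertices are the 1-spaces and 2-spaces of a three-dimensional space over that field, and the chambers have n − 1 = 2 vertices: a 1-space inside a 2-space. Encoding vectors as the integers 1 to 7, with addition given by bitwise XOR, is an implementation convenience.

```python
from itertools import combinations

# Nonzero vectors of the 3-dimensional space over the field with two
# elements, encoded as the integers 1..7; vector addition is bitwise XOR.
nonzero = range(1, 8)

# Over this field a 1-space is {0, v}, so it is determined by its single
# nonzero vector v.
one_spaces = [frozenset([v]) for v in nonzero]

# A 2-space is spanned by two distinct nonzero vectors u and v; its
# nonzero elements are u, v, and u + v.
two_spaces = {frozenset([u, v, u ^ v]) for u, v in combinations(nonzero, 2)}

# Chambers: a 1-space contained in a 2-space.
chambers = [(p, l) for p in one_spaces for l in two_spaces if p <= l]

print(len(one_spaces), len(two_spaces), len(chambers))   # prints: 7 7 21
```

So this building has 14 vertices and 21 chambers. Over larger fields the counts grow quickly, in line with the remark that a building is a huge object.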
There are many subspaces, so a building is a huge object. However, buildings have important subgeometries called apartments, which in the A_{n−1} case are obtained by taking a basis for the vector space, and then taking all subspaces generated by subsets of this basis. For example, in the A_3 case our vector space is four dimensional, so a basis has four elements; its subsets span four 1-spaces, six 2-spaces, and four 3-spaces. To visualize this apartment it helps to view the four 1-spaces as the vertices of a tetrahedron, the six 2-spaces as the midpoints of its edges, and the four 3-spaces as the midpoints of its faces. The apartment has twenty-four chambers, six for each face of the original tetrahedron, and they form a triangular tiling of the surface of the tetrahedron. This surface is topologically equivalent to a sphere, as are all apartments of this building: such buildings are called spherical. The buildings for the groups of Lie type are all spherical, and, just as A_3 is related to the tetrahedron, their apartments are related to the regular and semiregular polyhedra in n dimensions, where n is the subscript in the Lie notation given earlier.
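The counts for the A_3 apartment can be reproduced by listing subsets of a four-element basis. In the sketch below the basis names e1 to e4 are hypothetical labels, and each subspace is represented by the subset of basis vectors that spans it.

```python
from itertools import combinations

basis = ("e1", "e2", "e3", "e4")   # hypothetical names for a basis of a 4-space

# Vertices of the apartment: subspaces spanned by nonempty proper subsets
# of the basis, grouped by dimension.
by_dim = {k: [frozenset(s) for s in combinations(basis, k)] for k in (1, 2, 3)}

# Chambers: nested chains 1-space < 2-space < 3-space of such subspaces.
chambers = [
    (a, b, c)
    for a in by_dim[1] for b in by_dim[2] for c in by_dim[3]
    if a < b < c
]

counts = {k: len(v) for k, v in by_dim.items()}
print(counts, len(chambers))   # prints: {1: 4, 2: 6, 3: 4} 24

# Six chambers for each 3-space, that is, for each face of the tetrahedron.
per_face = [sum(1 for ch in chambers if ch[2] == c) for c in by_dim[3]]
print(per_face)                # prints: [6, 6, 6, 6]
```

Each chamber corresponds to an ordering of the four basis vectors, which is one way to see why the total is 24.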
Buildings have the following two noteworthy features. First, any two chambers lie in a common apartment: this is not obvious in the example above but it can be proved using linear algebra. Second, in any building all apartments are isomorphic and any two apartments intersect nicely: more precisely, if A and A′ are apartments, then A ∩ A′ is convex and there is an isomorphism from A to A′ that fixes A ∩ A′. These two features were originally used by Tits in defining buildings.
The theory of spherical buildings does not just give a pleasing geometric basis for the groups of Lie type: it can also be used to construct the ones of types E_6, E_7, E_8, and F_4, for an arbitrary field K, without the need for sophisticated machinery such as Lie algebras. Once the building has been constructed (and a construction can be given in a surprisingly simple manner), a theorem of Tits on the existence of automorphisms shows that the groups themselves must exist.
In a spherical building the apartments are tilings of a sphere, but other types of buildings also play significant roles. Of particular importance are affine buildings, in which the apartments are tilings of Euclidean space; such buildings arise in a natural way from groups, such as GL_n(K), where K is a p-adic field [III.51]. For such fields there are two buildings, one spherical and one affine, but the affine one carries more information and yields the spherical building as a structure “at infinity.” Going beyond affine buildings, there are hyperbolic buildings, whose apartments are tilings of hyperbolic space; they arise naturally in the study of hyperbolic Kac–Moody groups.
III.6 Calabi–Yau Manifolds
Eric Zaslow
1 Basic Definition
Calabi–Yau manifolds, named after Eugenio Calabi and Shing-Tung Yau, arise in Riemannian geometry and algebraic geometry, and play a prominent role in string theory and mirror symmetry.
In order to explain what they are, we need first to recall the notion of orientability on a real manifold [I.3 §6.9]. Such a manifold is orientable if you can choose coordinate systems at each point in such a way that any two systems x = (x_1, . . . , x_m) and y = (y_1, . . . , y_m) that are defined on overlapping sets give rise to a positive Jacobian: det(∂y_i/∂x_j) > 0. The notion of a Calabi–Yau manifold is the natural complex analogue of this. Now the manifold is complex, and for each local coordinate system z = (z_1, . . . , z_n) one has a holomorphic function [I.3 §5.6] f(z). It is vital that f should be nonvanishing: that is, it never takes the value 0. There is also a compatibility condition: if z̃(z) is another coordinate system, then the corresponding function f̃ is related to f by the equation f = f̃ det(∂z̃_a/∂z_b). Note that if we replace all complex terms by real terms in this definition, then we have the notion of a real orientation. So a Calabi–Yau manifold can be thought of informally as a complex manifold with complex orientation.
2 Complex Manifolds and Hermitian Structure
Before we go any further, a few words about complex and Kähler geometry are in order. A complex manifold is a structure that looks locally like C^n, in the sense that one can find complex coordinates z = (z_1, . . . , z_n) near every point. Moreover, where two coordinate systems z and z̃ overlap, the coordinates z̃_a are holomorphic when they are regarded as functions of the z_b. Thus, the notion of a holomorphic function on a complex manifold makes sense and does not depend on the coordinates used to express the function. In this way, the local geometry of a complex manifold does indeed look like an open set in C^n, and the tangent space at a point looks like C^n itself.
On complex vector spaces it is natural to consider Hermitian inner products [III.37] represented by Hermitian matrices [III.50 §3] g_{ab̄} with respect to a basis e_a. On complex manifolds, a Hermitian inner product on the tangent spaces is called a “Hermitian metric,” and is represented in a coordinate basis by a Hermitian matrix g_{ab̄}, which depends on position.¹
3 Holonomy, and Calabi–Yau Manifolds in Riemannian Geometry
On a Riemannian manifold [I.3 §6.10] one can move a vector along a path so as to keep it of constant length and “always pointing in the same direction.” Curvature expresses the fact that the vector you wind up with at the end of the path depends on the path itself. When your path is a closed loop, the vector at the starting point comes back to a new vector at the same point. (A good example to think about is a path on a sphere that goes from the North Pole to the equator, then a quarter of the way around the equator, then back to the North Pole again. When the journey is completed, the “constant” vector that began by pointing south will have been rotated by 90°.) With each loop we associate a matrix operator called the holonomy matrix, which sends the starting vector to the ending vector; the group generated by all of these matrices is called the holonomy group of the manifold. Since the length of the vector does not change during the process of keeping it constant along the loop, the holonomy matrices all lie in the orthogonal group of length-preserving matrices, O(m). If the manifold is oriented, then the holonomy group must lie in SO(m), as one can see by transporting an oriented basis of vectors around the loop.
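The sphere example can be checked with three quarter-turn rotations. The sketch below relies on one fact that is assumed here and not derived in the text: on the unit sphere, transporting a tangent vector along an arc of a great circle has the same effect as applying the rotation of space that slides that arc along itself. The matrix names and the placement of the pole at (0, 0, 1) are illustrative choices.

```python
def mat_vec(m, v):
    """Apply a 3 x 3 matrix (tuple of rows) to a vector."""
    return tuple(sum(m[r][c] * v[c] for c in range(3)) for r in range(3))

# Quarter-turn rotations of three-dimensional space. Each one slides one
# leg of the loop along itself (see the assumption stated above).
POLE_TO_EQUATOR = ((0, 0, 1), (0, 1, 0), (-1, 0, 0))   # about the y-axis: z -> x
ALONG_EQUATOR = ((0, -1, 0), (1, 0, 0), (0, 0, 1))     # about the z-axis: x -> y
BACK_TO_POLE = ((1, 0, 0), (0, 0, -1), (0, 1, 0))      # about the x-axis: y -> z

def around_loop(v):
    """Carry a vector along the three legs of the loop in order."""
    for rotation in (POLE_TO_EQUATOR, ALONG_EQUATOR, BACK_TO_POLE):
        v = mat_vec(rotation, v)
    return v

north_pole = (0, 0, 1)
start = (1, 0, 0)   # tangent vector at the pole, pointing along the first leg
end = around_loop(start)

# The base point returns to the pole, and the transported vector is
# perpendicular to the vector we started with.
dot = sum(a * b for a, b in zip(start, end))
print(around_loop(north_pole), end, dot)   # prints: (0, 0, 1) (0, 1, 0) 0
```

The composite of the three rotations fixes the pole and turns the tangent plane there through a right angle, which is the holonomy matrix of this particular loop.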
Every complex manifold of (complex) dimension n is also a real manifold of (real) dimension m = 2n, which one can think of as coordinatized by the real and imaginary parts of the complex coordinates z_j. Real manifolds that arise in this way have additional structure. For example, the fact that we can multiply complex coordinate directions by i = √−1 implies that there must be an operator on the real tangent space that squares to −1. This operator has eigenvalues ±i, whose eigenspaces can be thought of as the “holomorphic” and “anti-holomorphic” directions. The Hermitian property states that these directions are orthogonal, and we say that the manifold is Kähler if they remain so after transport around loops. This means that the holonomy group is a subgroup of U(n) (which itself is a subgroup of SO(m): complex manifolds always have real orientations). There is a nice local characterization of the Kähler property: if g_{ab̄} are the components of the Hermitian metric in some coordinate patch, then there exists a function ϕ on that patch such that g_{ab̄} = ∂²ϕ/(∂z_a ∂z̄_b).
¹ The notation g_{ab̄} indicates the conjugate-linear property of a Hermitian inner product.
Given a complex orientation (that is, the nonmetric definition of a Calabi–Yau manifold given above), a compatible Kähler structure leads to a holonomy that lies in SU(n) ⊂ U(n), the natural analogue of the case of real orientation. This is the metric definition of a Calabi–Yau manifold.
4 The Calabi Conjecture
Calabi conjectured that, for any Kähler manifold of complex dimension n and any complex orientation, there exists a function u and a new Kähler metric g̃, given in coordinates by
g̃_{ab̄} = g_{ab̄} + ∂²u/(∂z_a ∂z̄_b),
that is compatible with the orientation. In equations, the compatibility condition states that
det(g_{ab̄} + ∂²u/(∂z_a ∂z̄_b)) = |f|²,
where f is the holomorphic orientation function discussed above. Thus, the metric notion of a Calabi–Yau manifold amounts to a formidable nonlinear partial differential equation for u. Calabi proved the uniqueness and Yau proved the existence of a solution to this equation. So in fact the metric definition of a Calabi–Yau manifold is uniquely determined by its Kähler structure and its complex orientation.
Yau’s theorem establishes that the space of metrics with holonomy group SU(n) on a manifold with complex orientation is in correspondence with the space of inequivalent Kähler structures. The latter space can easily be probed with the techniques of algebraic geometry.
5 Calabi–Yau Manifolds in Physics
Einstein’s theory of gravity, general relativity, constructs equations that the metric of a Riemannian space-time manifold must obey (see general relativity and the Einstein equations [IV.13]). The equations involve three symmetric tensors: the metric, the Ricci curvature [III.78] tensor, and the energy–momentum tensor of matter. A Riemannian manifold whose Ricci tensor vanishes is a solution to these equations when there is no matter, and is a special case of an Einstein manifold. A Calabi–Yau manifold with its unique SU(n)-holonomy metric has vanishing Ricci tensor, and is therefore of interest in general relativity.
A fundamental problem in theoretical physics is the incorporation of Einstein’s theory into the quantum theory of particles. This enterprise is known as quantum gravity, and Calabi–Yau manifolds figure prominently in the leading theory of quantum gravity, string theory [IV.17 §2].
In string theory, the fundamental objects are one-dimensional “strings.” The motion of the strings in space-time is described by two-dimensional trajectories, known as worldsheets, so every point on the worldsheet is labeled by the point in space-time where it sits. In this way, string theory is constructed from a quantum field theory of maps from two-dimensional Riemann surfaces [III.79] to a space-time manifold M. The two-dimensional surface should be given a Riemannian metric, and there is an infinite-dimensional space of such metrics to consider. This means that we must solve quantum gravity in two dimensions, a problem that, like its four-dimensional cousin, is too hard. If, however, it happens that the two-dimensional worldsheet theory is conformal (invariant under local changes of scale), then just a finite-dimensional space of conformally inequivalent metrics remains, and the theory is well-defined.
The Calabi–Yau condition arises from these considerations. The requirement that the two-dimensional theory should be conformal, so that the string theory makes good sense, is in essence the requirement that the Ricci tensor of space-time should vanish. Thus, a two-dimensional condition leads to a space-time equation, which turns out to be exactly Einstein’s equation without matter. We add to this condition the “phenomenological” criterion that the theory be endowed with “supersymmetry,” which requires the space-time manifold M to be complex. The two conditions together mean that M is a complex manifold with holonomy group SU(n): that is, a Calabi–Yau manifold. By Yau’s theorem, the choices of such M can easily be described by algebraic geometric methods.
We remark that there is a kind of distillation of string theory called “topological strings,” which can be given a rigorous mathematical framework. Calabi–Yau manifolds are both symplectic and complex, and this leads to two versions of topological strings, called A and B, that one can associate with a Calabi–Yau manifold. Mirror symmetry is the remarkable phenomenon that the A version of one Calabi–Yau manifold is related to the B version of an entirely different “mirror partner.” The