A survey of modern algebra - Birkhoff & MacLane.pdf


Garrett Birkhoff, Harvard University, and Saunders Mac Lane, The University of Chicago

A SURVEY OF MODERN ALGEBRA, fourth edition

Macmillan Publishing Co., Inc., New York. Collier Macmillan Publishers, London.

Copyright © 1977, Macmillan Publishing Co., Inc. Printed in the United States of America. All rights reserved. No part of this book may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or any information storage and retrieval system, without permission in writing from the Publisher. Earlier editions copyright 1941 and 1953 and copyright © 1965 by Macmillan Publishing Co., Inc.

Macmillan Publishing Co., Inc., 866 Third Avenue, New York, New York 10022. Collier Macmillan Canada, Ltd.

Library of Congress Cataloging in Publication Data: Birkhoff, Garrett, (date). A survey of modern algebra. Bibliography: p. Includes index. 1. Algebra, Abstract. I. MacLane, Saunders, (date), joint author. II. Title. QA162.B57 1977 512 75-42402. ISBN 0-02-310070-2. Printing: 3 4 5 6 7 8. Year: 0 1 2.

Preface to the Fourth Edition

During the thirty-five years since the first edition of this book was written, courses in "modern algebra" have become a standard part of college curricula all over the world, and many books have been written for use in such courses. Nevertheless, it seems desirable to recall our basic philosophy, which remains that of the present book.

"We have tried throughout to express the conceptual background of the various definitions used. We have done this by illustrating each new term by as many familiar examples as possible. This seems especially important in an elementary text because it serves to emphasize the fact that the abstract concepts all arise from the analysis of concrete situations.

"To develop the student's power to think for himself in terms of the new concepts, we have included a wide variety of exercises on each topic.
Some of these exercises are computational, some explore further examples of the new concepts, and others give additional theoretical developments. Exercises of the latter type serve the important function of familiarizing the student with the construction of a formal proof. The selection of exercises is sufficient to allow an instructor to adapt the text to students of quite varied degrees of maturity, of undergraduate or first year graduate level.

"Modern algebra also enables one to reinterpret the results of classical algebra, giving them far greater unity and generality. Therefore, instead of omitting these results, we have attempted to incorporate them systematically within the framework of the ideas of modern algebra.

"We have also tried not to lose sight of the fact that, for many students, the value of algebra lies in its applications to other fields: higher analysis, geometry, physics, and philosophy. This has influenced us in our emphasis on the real and complex fields, on groups of transformations as contrasted with abstract groups, on symmetric matrices and reduction to diagonal form, on the classification of quadratic forms under the orthogonal and Euclidean groups, and finally, in the inclusion of Boolean algebra, lattice theory, and transfinite numbers, all of which are important in mathematical logic and in the modern theory of real functions."

In detail, our Chapters 1-3 give an introduction to the theory of linear and polynomial equations in commutative rings. The familiar domain of integers and the rational field are emphasized, together with the rings of integers modulo n and associated polynomial rings. Chapters 4 and 5 develop the basic algebraic properties of the real and complex fields which are of such paramount importance for geometry and physics. Chapter 6 introduces noncommutative algebra through its simplest and most fundamental concept: that of a group.
The group concept is applied systematically in Chapters 7-10, on vector spaces and matrices. Here care is taken to keep in the foreground the fundamental role played by algebra in Euclidean, affine, and projective geometry. Dual spaces and tensor products are also discussed, but generalizations to modules over rings are not considered. Chapter 11 includes a completely revised introduction to Boolean algebra and lattice theory. This is followed in Chapter 12 by a brief discussion of transfinite numbers. Finally, the last three chapters provide an introduction to general commutative algebra and arithmetic: ideals and quotient-rings, extensions of fields, algebraic numbers and their factorization, and Galois theory.

Many of the chapters are independent of one another; for example, the chapter on group theory may be introduced just after Chapter 1, while the material on ideals and fields (§§13.1 and 14.1) may be studied immediately after the chapter on vector spaces. This independence is intended to make the book useful not only for a full-year course, assuming only high-school algebra, but also for various shorter courses. For example, a semester or quarter course covering linear algebra may be based on Chapters 6-10, the real and complex fields being emphasized. A semester course on abstract algebra could deal with Chapters 1-3, 6-8, 11, 13, and 14. Still other arrangements are possible.

We hope that our book will continue to serve not only as a text but also as a convenient reference for those wishing to apply the basic concepts of modern algebra to other branches of mathematics, including statistics and computing, and also to physics, chemistry, and engineering.

It is a pleasure to acknowledge our indebtedness to Clifford Bell, A. A. Bennett, E. Artin, F. A. Ficken, J. S. Frame, Nathan Jacobson, Walter Leighton, Gaylord Merriman, D. D. Miller, Ivan Niven, and many other friends and colleagues who assisted with helpful suggestions and improvements, and to Mrs.
Saunders Mac Lane, who helped with the secretarial work in the first three editions.

Cambridge, Mass.  GARRETT BIRKHOFF
Chicago, Illinois  SAUNDERS MAC LANE

Contents

Preface to the Fourth Edition

1 The Integers
1.1 Commutative Rings; Integral Domains
1.2 Elementary Properties of Commutative Rings
1.3 Ordered Domains
1.4 Well-Ordering Principle
1.5 Finite Induction; Laws of Exponents
1.6 Divisibility
1.7 The Euclidean Algorithm
1.8 Fundamental Theorem of Arithmetic
1.9 Congruences
1.10 The Rings Z_n
1.11 Sets, Functions, and Relations
1.12 Isomorphisms and Automorphisms

2 Rational Numbers and Fields
2.1 Definition of a Field
2.2 Construction of the Rationals
2.3 Simultaneous Linear Equations
2.4 Ordered Fields
2.5 Postulates for the Positive Integers
2.6 Peano Postulates

3 Polynomials
3.1 Polynomial Forms
3.2 Polynomial Functions
3.3 Homomorphisms of Commutative Rings
3.4 Polynomials in Several Variables
3.5 The Division Algorithm
3.6 Units and Associates
3.7 Irreducible Polynomials
3.8 Unique Factorization Theorem
3.9 Other Domains with Unique Factorization
3.10 Eisenstein's Irreducibility Criterion
3.11 Partial Fractions

4 Real Numbers
4.1 Dilemma of Pythagoras
4.2 Upper and Lower Bounds
4.3 Postulates for Real Numbers
4.4 Roots of Polynomial Equations
4.5 Dedekind Cuts

5 Complex Numbers
5.1 Definition
5.2 The Complex Plane
5.3 Fundamental Theorem of Algebra
5.4 Conjugate Numbers and Real Polynomials
5.5 Quadratic and Cubic Equations
5.6 Solution of Quartic by Radicals
5.7 Equations of Stable Type

6 Groups
6.1 Symmetries of the Square
6.2 Groups of Transformations
6.3 Further Examples
6.4 Abstract Groups
6.5 Isomorphism
6.6 Cyclic Groups
6.7 Subgroups
6.8 Lagrange's Theorem
6.9 Permutation Groups
6.10 Even and Odd Permutations
6.11 Homomorphisms
6.12 Automorphisms; Conjugate Elements
6.13 Quotient Groups
6.14 Equivalence and Congruence Relations

7 Vectors and Vector Spaces
7.1 Vectors in a Plane
7.2 Generalizations
7.3 Vector Spaces and Subspaces
7.4 Linear Independence and Dimension
7.5 Matrices and Row-equivalence
7.6 Tests for Linear Dependence
7.7 Vector Equations; Homogeneous Equations
7.8 Bases and Coordinate Systems
7.9 Inner Products
7.10 Euclidean Vector Spaces
7.11 Normal Orthogonal Bases
7.12 Quotient-spaces
7.13 Linear Functions and Dual Spaces

8 The Algebra of Matrices
8.1 Linear Transformations and Matrices
8.2 Matrix Addition
8.3 Matrix Multiplication
8.4 Diagonal, Permutation, and Triangular Matrices
8.5 Rectangular Matrices
8.6 Inverses
8.7 Rank and Nullity
8.8 Elementary Matrices
8.9 Equivalence and Canonical Form
8.10 Bilinear Functions and Tensor Products
8.11 Quaternions

9 Linear Groups
9.1 Change of Basis
9.2 Similar Matrices and Eigenvectors
9.3 The Full Linear and Affine Groups
9.4 The Orthogonal and Euclidean Groups
9.5 Invariants and Canonical Forms
9.6 Linear and Bilinear Forms
9.7 Quadratic Forms
9.8 Quadratic Forms Under the Full Linear Group
9.9 Real Quadratic Forms Under the Full Linear Group
9.10 Quadratic Forms Under the Orthogonal Group
9.11 Quadrics Under the Affine and Euclidean Groups
9.12 Unitary and Hermitian Matrices
9.13 Affine Geometry
9.14 Projective Geometry

10 Determinants and Canonical Forms
10.1 Definition and Elementary Properties of Determinants
10.2 Products of Determinants
10.3 Determinants as Volumes
10.4 The Characteristic Polynomial
10.5 The Minimal Polynomial
10.6 Cayley-Hamilton Theorem
10.7 Invariant Subspaces and Reducibility
10.8 First Decomposition Theorem
10.9 Second Decomposition Theorem
10.10 Rational and Jordan Canonical Forms

11 Boolean Algebras and Lattices
11.1 Basic Definition
11.2 Laws: Analogy with Arithmetic
11.3 Boolean Algebra
11.4 Deduction of Other Basic Laws
11.5 Canonical Forms of Boolean Polynomials
11.6 Partial Orderings
11.7 Lattices
11.8 Representation by Sets

12 Transfinite Arithmetic
12.1 Numbers and Sets
12.2 Countable Sets
12.3 Other Cardinal Numbers
12.4 Addition and Multiplication of Cardinals
12.5 Exponentiation

13 Rings and Ideals
13.1 Rings
13.2 Homomorphisms
13.3 Quotient-rings
13.4 Algebra of Ideals
13.5 Polynomial Ideals
13.6 Ideals in Linear Algebras
13.7 The Characteristic of a Ring
13.8 Characteristics of Fields

14 Algebraic Number Fields
14.1 Algebraic and Transcendental Extensions
14.2 Elements Algebraic over a Field
14.3 Adjunction of Roots
14.4 Degrees and Finite Extensions
14.5 Iterated Algebraic Extensions
14.6 Algebraic Numbers
14.7 Gaussian Integers
14.8 Algebraic Integers
14.9 Sums and Products of Integers
14.10 Factorization of Quadratic Integers

15 Galois Theory
15.1 Root Fields for Equations
15.2 Uniqueness Theorem
15.3 Finite Fields
15.4 The Galois Group
15.5 Separable and Inseparable Polynomials
15.6 Properties of the Galois Group
15.7 Subgroups and Subfields
15.8 Irreducible Cubic Equations
15.9 Insolvability of Quintic Equations

Bibliography
List of Special Symbols
Index

1 The Integers

1.1. Commutative Rings; Integral Domains

Modern algebra has exposed for the first time the full variety and richness of possible mathematical systems. We shall construct and examine many such systems, but the most fundamental of them all is the oldest mathematical system, that consisting of all the positive integers (whole numbers). A related but somewhat larger system is the collection Z of all integers 0, ±1, ±2, ±3, .... We begin our discussion with this system because it more closely resembles the other systems which arise in modern algebra.

The integers have many interesting algebraic properties. In this chapter, we will assume some especially obvious such properties as postulates, and deduce from them many other properties as logical consequences. We first assume eight postulates for addition and multiplication.

Definition. Let R be a set of elements a, b, c, ... for which the sum a + b and the product ab of any two elements a and b (distinct or not) of R are defined. Then R is called a commutative ring if the following postulates (i)-(viii) hold:

(i) Closure. If a and b are in R, then the sum a + b and the product ab are in R.
(ii) Uniqueness. If a = a' and b = b' in R, then a + b = a' + b' and ab = a'b'.
(iii) Commutative laws. For all a and b in R, a + b = b + a and ab = ba.
(iv) Associative laws. For all a, b, and c in R, a + (b + c) = (a + b) + c and a(bc) = (ab)c.
(v) Distributive law. For all a, b, and c in R, a(b + c) = ab + ac.
(vi) Zero. R contains an element 0 such that a + 0 = a for all a in R.
(vii) Unity. R contains an element 1 ≠ 0 such that a1 = a for all a in R.
(viii) Additive inverse. For each a in R, the equation a + x = 0 has a solution x in R.

It is a familiar fact that the set Z of all integers satisfies these postulates. These postulates hold not only for the integers, but for many other systems of numbers, such as that of all rational numbers (fractions), all real numbers (unlimited decimals), and all complex numbers. They are also satisfied by polynomials, and by continuous real functions on any given interval. When these eight postulates hold for a system R, we shall say that R is a commutative ring.

The property of zero stated in (vi) is the characteristic property of the number zero, and similarly, the property of 1 stated in (vii) is the characteristic property of the number one; we may say that 0 and 1 are the "identity elements" for addition and multiplication, respectively. The assumption 1 ≠ 0 in (vii) is included to eliminate trivial cases (otherwise the set consisting of the integer 0 alone would be a commutative ring). The commutative and associative laws are so familiar that they are ordinarily used without explicit mention: thus a + b + c customarily denotes the equal numbers a + (b + c) and (a + b) + c.

The system Z of all integers has another property which cannot be deduced from the preceding postulates. Namely, if c ≠ 0 and ca = cb in Z, then necessarily a = b (partial converse of (ii)). This property is not satisfied by real functions on a given interval, though these form a commutative ring. The integers therefore constitute not only a commutative ring but also an integral domain in the sense of the following definition.

Definition. An integral domain is a commutative ring in which the following additional postulate holds:

(ix) Cancellation law. If c ≠ 0 and ca = cb, then a = b.

The domain Z[√2]. An integral domain of interest for number theory consists of all numbers of the form a + b√2, where a and b are ordinary integers (in Z). In Z[√2], a + b√2 = c + d√2 if and only if a = c and b = d. Addition and multiplication are defined by

(a + b√2) + (c + d√2) = (a + c) + (b + d)√2,
(a + b√2)(c + d√2) = (ac + 2bd) + (ad + bc)√2.

Uniqueness and commutativity are easily verified for these operations, while 0 + 0√2 acts as a zero and 1 + 0√2 as a unity. The additive inverse of a + b√2 is (-a) + (-b)√2. The verification of the associative and distributive laws is a little more tedious, while that of the cancellation law will be deferred to the end of §1.2.

1.2. Elementary Properties of Commutative Rings

In elementary algebra one often takes the preceding postulates and their elementary consequences for granted.
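The arithmetic of the domain Z[√2] just defined is concrete enough to experiment with directly. The following sketch is our own illustration, not part of the text (the class name and helper code are assumptions): it represents a + b√2 by the integer pair (a, b), implements the sum and product formulas above, and spot-checks the distributive law on small samples.

```python
# Elements of Z[sqrt(2)] represented as integer pairs (a, b), meaning a + b*sqrt(2).
# Illustrative sketch only, for experimenting with the definitions of Section 1.1.

class ZSqrt2:
    def __init__(self, a, b):
        self.a, self.b = a, b

    def __add__(self, other):
        # (a + b√2) + (c + d√2) = (a + c) + (b + d)√2
        return ZSqrt2(self.a + other.a, self.b + other.b)

    def __mul__(self, other):
        # (a + b√2)(c + d√2) = (ac + 2bd) + (ad + bc)√2
        return ZSqrt2(self.a * other.a + 2 * self.b * other.b,
                      self.a * other.b + self.b * other.a)

    def __neg__(self):
        # additive inverse: (-a) + (-b)√2
        return ZSqrt2(-self.a, -self.b)

    def __eq__(self, other):
        # a + b√2 = c + d√2 if and only if a = c and b = d
        return self.a == other.a and self.b == other.b

    def __repr__(self):
        return f"{self.a} + {self.b}√2"

# Spot-check the distributive law x(y + z) = xy + xz over small samples.
samples = [ZSqrt2(a, b) for a in range(-2, 3) for b in range(-2, 3)]
assert all(x * (y + z) == x * y + x * z
           for x in samples for y in samples for z in samples)
```

Such exhaustive small checks prove nothing by themselves, but they catch a wrong formula quickly; the actual verifications are the formal proofs described in the text.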
This seldom leads to serious errors, provided algebraic manipulations are checked against specific examples. However, much more care must be taken when one wishes to reach reliable conclusions about whole families of algebraic systems (e.g., valid for all integral domains generally). One must be sure that all proofs use only postulates listed explicitly and standard rules of logic. Among the most fundamental rules of logic are the three basic laws for equality:

Reflexive law: a = a.
Symmetric law: If a = b, then b = a.
Transitive law: If a = b and b = c, then a = c,

valid for all a, b, and c. We now illustrate the idea of a formal proof for several rules valid in any commutative ring R.

RULE 1. (a + b)c = ac + bc, for all a, b, c in R.

This rule may be called the right distributive law, in contrast to postulate (v), which is the left distributive law.

Proof. For all a, b, and c in R:
1. (a + b)c = c(a + b)      (commutative law of mult.).
2. c(a + b) = ca + cb       (distributive law).
3. (a + b)c = ca + cb       (1, 2, transitive law).
4. ca = ac, cb = bc         (commutative law of mult.).
5. ca + cb = ac + bc        (4, uniqueness of addn.).
6. (a + b)c = ac + bc       (3, 5, transitive law).

RULE 2. For all a in R, 0 + a = a and 1·a = a.

Proof. For all a in R:
1. 0 + a = a + 0            (commutative law of addn.).
2. a + 0 = a                (zero).
3. 0 + a = a                (1, 2, transitive law).
The proof for 1·a = a is similar.

RULE 3. If z in R has the property that a + z = a for all a in R, then z = 0.

This rule states that R contains only one element 0 which can act as the identity element for addition.

Proof. Since a + z = a holds for all a, it holds if a is 0.
1. 0 + z = 0                (hypothesis, with a = 0).
2. 0 = 0 + z                (1, symmetric law).
3. 0 + z = z                (Rule 2 when a is z).
4. 0 = z                    (2, 3, transitive law).

In subsequent proofs such as this one, we shall condense the repeated use of the symmetric and transitive laws for equality.

RULE 4. For all a, b, c in R: a + b = a + c implies b = c.

This rule is called the cancellation law for addition.
Proof. By postulate (viii) there is for the element a an element x with a + x = 0. Then
1. x + a = a + x = 0                    (comm. law addn., trans. law).
2. x = x, a + b = a + c                 (reflexive law, hypothesis).
3. x + (a + b) = x + (a + c)            (2, uniqueness of addn.).
4. b = 0 + b = (x + a) + b = x + (a + b) = x + (a + c) = (x + a) + c = 0 + c = c.
(Supply the reason for each step of 4!)

RULE 5. For each a, R contains one and only one solution x of the equation a + x = 0.

This solution is denoted by x = -a, as usual. The rule may then be quoted as a + (-a) = 0. As customary, the symbol a - b denotes a + (-b).

Proof. By postulate (viii), there is a solution x. If y is a second solution, then a + x = 0 = a + y by the transitive and symmetric laws. Hence by Rule 4, x = y. Q.E.D.

RULE 6. For given a and b in R, there is one and only one x in R with a + x = b.

This rule asserts that subtraction is possible and unique.

Proof. Take x = (-a) + b. Then (give reasons!)

a + x = a + ((-a) + b) = (a + (-a)) + b = 0 + b = b.

If y is a second solution, then a + x = b = a + y by the transitive law; hence x = y by Rule 4. Q.E.D.

RULE 7. For all a in R, a·0 = 0 = 0·a.

Proof.
1. a = a, a + 0 = a                     (reflexive law, postulate (vi)).
2. a(a + 0) = aa                        (1, uniqueness of mult.).
3. aa + a·0 = a(a + 0) = aa = aa + 0    (distributive law, etc.).
4. a·0 = 0                              (3, Rule 4).
5. 0·a = a·0 = 0                        (comm. law mult., 4).

RULE 8. If u in R has the property that au = a for all a in R, then u = 1.

This rule asserts the uniqueness of the identity element 1 for multiplication. The proof, which resembles that of Rule 3, is left as an exercise.

RULE 9. For all a and b in R, (-a)(-b) = ab.

A special case of this rule is the "mysterious" law (-1)(-1) = 1.

Proof. Consider the triple sum (associative law!)

1. [ab + a(-b)] + (-a)(-b) = ab + [a(-b) + (-a)(-b)].

By the distributive law, the definition of -a, Rule 7, and (vi),

2.
ab + [a(-b) + (-a)(-b)] = ab + [a + (-a)](-b) = ab + 0·(-b) = ab.

For similar reasons,

3. [ab + a(-b)] + (-a)(-b) = a[b + (-b)] + (-a)(-b) = a·0 + (-a)(-b) = (-a)(-b).

The result then follows from 1, 2, and 3 by the transitive and symmetric laws for equality. Q.E.D.

Various other simple and familiar rules are consequences of our postulates; some are stated in the exercises below. Another basic algebraic law is the one used in the solution of quadratic equations, when it is argued that (x + 2)(x - 3) = 0 means either that x + 2 = 0 or that x - 3 = 0. The general law involved is the assertion

(1) if ab = 0, then either a = 0 or b = 0.

This assertion is not true in all commutative rings. But the proof is immediate in any integral domain D, by the cancellation law. For suppose that the first factor a is not zero. Then ab = 0 = a·0, and a may be cancelled; whence b = 0. Conversely, the cancellation law follows from this assertion (1) in any commutative ring R, for if a ≠ 0, ab = ac means that ab - ac = a(b - c) = 0, which by (1) makes b - c = 0. We therefore have

Theorem 1. The cancellation law of multiplication is equivalent in a commutative ring to the assertion that a product of nonzero factors is not zero.

Nonzero elements a and b with a product ab = 0 are sometimes called "divisors of zero," so that the cancellation law in a commutative ring R is equivalent to the assumption that R contains no divisors of zero.

Theorem 1 can be used to prove the cancellation law for the domain Z[√2] defined at the end of §1.1, as follows. Suppose that Z[√2] included divisors of zero, with

(a + b√2)(c + d√2) = (ac + 2bd) + (ad + bc)√2 = 0.

By definition, this gives ac + 2bd = 0 and ad + bc = 0. Multiply the first by d, the second by c, and subtract; this gives b(2d² - c²) = 0, whence either b = 0 or c² = 2d². If b = 0, then the two preceding equations give ac = ad = 0, so either a = 0 or c = d = 0 by Theorem 1.
But the first alternative, a = 0, would imply that a + b√2 = 0 (since b = 0); the second, that c + d√2 = 0; in neither case do we have divisors of zero. There remains the possibility c² = 2d²; this would imply √2 = c/d rational, whose impossibility will be proved in Theorem 10, §3.7.

If one admits that √2 is a real number, and that the set of all real numbers forms an integral domain R, then one can very easily prove that Z[√2] is an integral domain, by appealing to the following concept of a subdomain.

Definition. A subdomain of an integral domain D is a subset of D which is also an integral domain, for the same operations of addition and multiplication.

It is obvious that such a subset S is a subdomain if and only if it contains 0 and 1, with any element a its additive inverse, and with any two elements a and b their sum a + b and product ab.

Exercises

In each of Exercises 1-5 give complete proofs, supporting each step by a postulate, a previous step, one of the rules established in the text, or an already established exercise.

1. Prove that the following rules hold in any integral domain:
(a) (a + b)(c + d) = (ac + bc) + (ad + bd),
(b) a + [b + (c + d)] = (a + b) + (c + d) = [(a + b) + c] + d,
(c) a + (b + c) = (c + a) + b,
(d) a(bc) = c(ab),
(e) a(b + (c + d)) = (ab + ac) + ad,
(f) a(b + c)d = (ab)d + a(cd).

2. (a) Prove Rule 8.
(b) Prove 1·1 = 1.
(c) Prove that the only "idempotents" (i.e., elements x satisfying xx = x) in an integral domain are 0 and 1.

3. Prove that the following rules hold for -a in any integral domain:
(a) -(-a) = a,
(b) -0 = 0,
(c) -(a + b) = (-a) + (-b),
(d) -a = (-1)a,
(e) (-a)b = a(-b) = -(ab).

4. Prove Rule 9 from Ex. 3(d) and the special case (-1)(-1) = 1.

5.
Prove that the following rules hold for the operation a - b = a + (-b) in any integral domain:
(a) (a - b) + (c - d) = (a + c) - (b + d),
(b) (a - b) - (c - d) = (a + d) - (b + c),
(c) (a - b)(c - d) = (ac + bd) - (ad + bc),
(d) a - b = c - d if and only if a + d = b + c,
(e) (a - b)c = ac - bc.

6. Are the following sets of real numbers integral domains? Why?
(a) all even integers,
(b) all odd integers,
(c) all positive integers,
(d) all real numbers a + b·5^(1/4), where a and b are integers,
(e) all real numbers a + b·9^(1/4), where a and b are integers,
(f) all rational numbers whose denominators are 1 or a power of 2.

7. (a) Show that the system consisting of 0 and 1 alone, with addition and multiplication defined as usual, except that 1 + 1 = 0 (instead of 2), is an integral domain.
(b) Show that the system which consists of 0 alone, with 0 + 0 = 0·0 = 0, satisfies all postulates for an integral domain except for the requirement 0 ≠ 1 in (vii).

8. (a) Show that if an algebraic system S satisfies all the postulates for an integral domain except possibly for the requirement 0 ≠ 1 in (vii), then S is either an integral domain or the system consisting of 0 alone, as described in Ex. 7(b).
(b) Is 0 ≠ 1 used in proving Rules 1-9?

9. Suppose that the sum of any two integers is defined as usual, but that the product of any two integers is defined to be zero. With this interpretation, which ones among the postulates for an integral domain are still satisfied?

10. Find two functions f ≠ 0 and g ≠ 0 such that fg = 0.

1.3. Properties of Ordered Domains

Because the ring Z of all ordinary integers plays a unique role in mathematics, one should be aware of its special properties, of which the commutative and cancellation laws of multiplication are only two. Many other properties stem from the possibility of listing the integers in the usual order

..., -4, -3, -2, -1, 0, 1, 2, 3, 4, ....
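Before leaving §1.2, note that Theorem 1 and Exercise 7(a) can both be explored computationally. The sketch below is our own illustration, not from the text (the ring of integers modulo n is treated formally later, in §1.10): it searches for divisors of zero by brute force.

```python
# Arithmetic modulo n: a brute-force search for divisors of zero.
# Illustrative only; the rings Z_n are developed formally in Section 1.10.

def zero_divisors(n):
    """Return the pairs (a, b) of nonzero residues mod n with a*b ≡ 0 (mod n)."""
    return [(a, b) for a in range(1, n) for b in range(1, n)
            if (a * b) % n == 0]

# Modulo 6 there are divisors of zero, e.g. 2 * 3 ≡ 0 (mod 6); by Theorem 1
# the cancellation law must fail there: 2*1 ≡ 2*4 (mod 6), yet 1 ≠ 4.
assert (2, 3) in zero_divisors(6)
assert (2 * 1) % 6 == (2 * 4) % 6

# Modulo 5 there are none, so nonzero factors may be cancelled.
assert zero_divisors(5) == []
```

The two-element system of Exercise 7(a) is the case n = 2: zero_divisors(2) is empty, consistent with that system being an integral domain.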
This order is customarily expressed in terms of the relation a < b, where the assertion a < b (a is less than b) is taken to mean that the integer a stands to the left of the integer b in the list above. But the relation a < b holds if and only if the difference b - a is a positive integer. Consequently, every property of the relation a < b can be derived from properties of the set of positive integers. We assume then as postulates the following three properties of the set of positive integers 1, 2, 3, ....

Addition: The sum of two positive integers is positive.
Multiplication: The product of two positive integers is positive.
Law of trichotomy: For a given integer a, one and only one of the following alternatives holds: either a is positive, or a = 0, or -a is positive.

Incidentally, these properties are shared by the positive rational numbers and the positive real numbers; hence all the consequences of these properties are also shared. It is convenient to call an integral domain containing positive elements with these properties an ordered domain.

Definition. An integral domain D is said to be ordered if there are certain elements of D, called the positive elements, which satisfy the addition, multiplication, and trichotomy laws stated above for integers.

Theorem 2. In any ordered domain, all squares of nonzero elements are positive.

Proof. Let a² be given, with a ≠ 0. By the law of trichotomy, either a or -a is positive. In the first case, a² is positive by the multiplication law for positive elements; in the second, -a is positive, and so a² = (-a)² > 0 by Rule 9 of §1.2. Q.E.D.

It is a corollary that 1 = 1² is always positive.

Definition. In an ordered domain, the two equivalent statements a < b (read "a is less than b") and b > a ("b is greater than a") both mean that b - a is positive. Also a ≤ b means that either a < b or a = b.
According to this definition, the positive elements a can now be described as the elements a greater than zero. Elements b < 0 are called negative. One can deduce a number of familiar properties of the relation "less than" from its definition above.

Transitive law: If a < b and b < c, then a < c.

Proof. By definition, the hypotheses a < b and b < c mean that b - a and c - b are positive. Hence by the addition principle, the sum (b - a) + (c - b) = c - a is positive, which means that a < c.

The three basic postulates for positive elements are reflected by three corresponding properties of inequalities:

Addition to an inequality: If a < b, then a + c < b + c.
Multiplication of an inequality: If a < b and 0 < c, then ac < bc.
Law of trichotomy: For any a and b, one and only one of the relations a < b, a = b, or a > b holds.

As an example, we prove the principle that an inequality may be multiplied by a positive number c. The conclusion requires us to prove that bc - ac = (b - a)c is positive (cf. Ex. 5(e) of §1.2). But this is an immediate consequence of the multiplication postulate, for the factors b - a and c are both positive by hypothesis. By a similar argument one may demonstrate that the multiplication of an inequality by a negative number inverts the sense of the inequality (see Ex. 1(c) below).

Definition. In an ordered domain, the absolute value |a| of a number is 0 if a is 0, and otherwise is the positive member of the couple a, -a. This definition might be restated as

(2) |a| = +a if a ≥ 0;  |a| = -a if a < 0.
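The absolute value laws for products and sums stated next can be spot-checked mechanically over a small range of ordinary integers before being proved. A minimal sketch (our own illustration, not part of the text):

```python
# Spot-check of the absolute value laws over a range of ordinary integers:
#   product law:  |ab| = |a| * |b|
#   sum law:      |a + b| <= |a| + |b|

def absval(a):
    """Absolute value per definition (2): +a if a >= 0, else -a."""
    return a if a >= 0 else -a

rng = range(-20, 21)
assert all(absval(a * b) == absval(a) * absval(b) for a in rng for b in rng)
assert all(absval(a + b) <= absval(a) + absval(b) for a in rng for b in rng)
```

In these samples, equality holds in the sum law exactly when a and b do not have opposite signs, which matches the separate consideration of cases suggested in the text.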
By appropriate separate consideration of these two cases, one may prove the laws for absolute values of sums and products,

(3) |ab| = |a||b|,  |a + b| ≤ |a| + |b|.

The sum law may also be obtained thus: by the definition, we have -|a| ≤ a ≤ |a| and -|b| ≤ b ≤ |b|; hence adding inequalities gives

-(|a| + |b|) ≤ a + b ≤ |a| + |b|.

This indicates at once that, whether a + b is positive or negative, its absolute value cannot exceed |a| + |b|.

Exercises

1. Deduce from the postulates for an ordered domain the following rules:
(a) if a < b, then a + c < b + c, and conversely,
(b) a - x < a - y if and only if x > y,
(c) if a < 0, then ax > ay if and only if x < y,
(d) 0 < c and ac < bc imply a < b,
(e) x + x + x + x = 0 implies x = 0,
(f) a < b implies a³ < b³,
(g) if c ≥ 0, then a ≥ b implies ac ≥ bc.

2. Prove that the equation x² + 1 = 0 has no solution in an ordered domain.
3. Prove as many laws on the relation a ≤ b as you can.
4. Prove that ||a| - |b|| ≤ |a - b| in any ordered domain.
*5. Prove that a⁷ = b⁷ implies a = b in any ordered domain.
*6. In any ordered domain, show that a² - ab + b² ≥ 0 for all a, b.
*7. Define "positive" element in the domain Z[√2], and show that the addition, multiplication, and trichotomy laws hold.
*8. Let D be an integral domain in which there is defined a relation a < b which satisfies the transitive law, the principles for addition and multiplication of inequalities, and the law of trichotomy stated in the text. Prove that if a set of "positive" elements is suitably chosen, D is an ordered domain.
*9. Prove in detail that any subdomain of an ordered domain is an ordered domain.
*10. Let R be any commutative ring which contains a subset of "positive" elements satisfying the addition, multiplication, and trichotomy laws. Prove that R is an ordered domain.
(Hint: Show that the cancellation law of multiplication holds, by considering separately the four cases x > 0 and y > 0, x > 0 and -y > 0, -x > 0 and y > 0, -x > 0 and -y > 0.)

Here and subsequently, exercises of greater difficulty are starred.

1.4. Well-Ordering Principle

A subset S of an ordered domain (such as the real number system) is called well-ordered if each nonempty subset of S contains a smallest member. In terms of this concept, one can formulate an important property of the integers, not characteristically algebraic and not shared by other number systems. This is the

Well-ordering principle. The positive integers are well-ordered.

In other words, any nonempty collection C of positive integers must contain some smallest member m, such that whenever c is in C, m ≤ c. For instance, the least positive even integer is 2. To illustrate the force of this principle, we prove

Theorem 3. There is no integer between 0 and 1.

This is immediately clear by a glance at the natural order of the integers, but we wish to show that this fact can also be proved from our
But since 1 > 0, m - 1 < m; hence by the choice of m, m - 1 would be in S. It follows by hypothesis that (m - 1) + 1 = m would be in S. This contradiction establishes the theorem.

Exercises

1. Show that for any integer a, a - 1 is the greatest integer less than a.
2. Which of the following sets are well-ordered: (a) all odd positive integers, (b) all even negative integers, (c) all integers greater than -7, (d) all odd integers greater than 249?
3. Prove that any subset of a well-ordered set is well-ordered.
4. Prove that a set of integers which contains -1000, and contains x + 1 when it contains x, contains all the positive integers.
5. (a) A set S of integers is said to have the integer b as "lower bound" if b ≤ x for all x in S; b itself need not be in S. Show that any nonempty set S of integers having a lower bound has a least element.
(b) Show that any nonempty set of integers having an "upper bound" has a greatest element.

1.5. Finite Induction; Laws of Exponents

We have now formulated a complete list of basic properties for the integers in terms of addition, multiplication, and order. Henceforth we assume that the integers form an ordered integral domain Z in which the positive elements are well-ordered. Every other mathematical property of the integers can be proved, by strictly logical processes, from those assumed. In particular, we can deduce the extremely important

Principle of Finite Induction. Let there be associated with each positive integer n a proposition P(n) which is either true or false. If, first, P(1) is true and, second, for all k, P(k) implies P(k + 1), then P(n) is true for all positive integers n.

To deduce this principle from the well-ordering assumption, simply observe that the set of those positive integers k for which P(k) is true satisfies the hypotheses and hence the conclusion of Theorem 4.

The method of proof by induction will now be used to prove various laws valid in any commutative ring. We first use it to establish formally the general distributive law for any number n of summands,

(4)  a(b₁ + b₂ + ··· + bₙ) = ab₁ + ab₂ + ··· + abₙ.

To be explicit, we define the repeated sum b₁ + ··· + bₙ as follows:

b₁ + b₂ + b₃ = (b₁ + b₂) + b₃,
b₁ + b₂ + b₃ + b₄ = [(b₁ + b₂) + b₃] + b₄.

This convention can be stated in general as a recursive formula (for k ≥ 1)

(5)  b₁ + ··· + bₖ + bₖ₊₁ = (b₁ + ··· + bₖ) + bₖ₊₁,

which determines the arrangement of parentheses in k + 1 terms, given this arrangement for k terms.

The inductive proof of (4) requires first the proof for n = 1, which is immediate. Secondly, we assume the law (4) for n = k and try to prove it for n = k + 1. By the definition (5) and the simple distributive law (v),

a(b₁ + ··· + bₖ₊₁) = a[(b₁ + ··· + bₖ) + bₖ₊₁] = a(b₁ + ··· + bₖ) + abₖ₊₁.

On the right, the first term can now be reduced by the assumed case of (4) for k summands, the induction assumption. Since the result ab₁ + ··· + abₖ + abₖ₊₁ is, by the definition (5), exactly the right side of (4) for n = k + 1, we have completed the inductive proof of (4).

Similar but more complicated inductive arguments will yield the general associative law (a special case appears in Ex. 9 below), which asserts that a sum b₁ + ··· + bₖ or a product b₁ ··· bₖ has the same value for any arrangement of parentheses, and the general associative and commutative law, according to which the sum of k given terms always has the same value, whatever the order or the grouping of the terms. Using this result and (4), one can then also establish the two-sided general distributive law

(a₁ + ··· + aₘ)(b₁ + ··· + bₙ) = a₁b₁ + ··· + a₁bₙ + ··· + aₘb₁ + ··· + aₘbₙ.

Positive integral exponents in any commutative ring R may also be treated by induction. If n is a positive integer, the power aⁿ stands for the product a · a ··· a, to n factors. This can also be stated as a "recursive" definition (any a in R):

(6)  a¹ = a,  aⁿ⁺¹ = aⁿ · a,

which makes it possible to compute any power aⁿ⁺¹ in terms of an already computed lower power aⁿ. From these definitions one may prove the usual laws, valid for any positive integral exponents m and n, as follows:

(7)  aᵐaⁿ = aᵐ⁺ⁿ,
(8)  (aᵐ)ⁿ = aᵐⁿ.

For instance, the first law may be proved by induction on n. In case m = 1, the law becomes aᵐ · a = aᵐ⁺¹, which is exactly the definition (6). Next assume that the law (7) is true for every m and for a given positive integer n = k, and consider the analogous expression aᵐaᵏ⁺¹ for the next larger exponent k + 1. One finds, by successive applications of the definition (6), the associative law, the induction assumption, and the definition again,

aᵐaᵏ⁺¹ = aᵐ(aᵏ · a) = (aᵐaᵏ)a = aᵐ⁺ᵏ · a = a⁽ᵐ⁺ᵏ⁾⁺¹.

This gives the law (7) for the case n = k + 1, and so completes the induction.

Finally, the binomial formula can be proved over any commutative ring R, as follows. First define the factorial function n! on the nonnegative integers by recursion: 0! = 1 and (n + 1)! = (n!)(n + 1). Then define the binomial coefficients (n choose k) similarly for n ≥ 0 in Z. From these definitions it follows by induction on n that

(9)  (x + y)ⁿ = xⁿ + nxⁿ⁻¹y + ··· + (n choose k)xⁿ⁻ᵏyᵏ + ··· + yⁿ

and that

(10)  (n choose k) = n!/[(k!)(n - k)!].

We leave the proofs as an exercise.

The Principle of Finite Induction permits one to assume the truth of P(n) gratis in proving P(n + 1). We shall now show that one can even assume the truth of P(k) for all k ≤ n. This is called the Second Principle of Finite Induction.

Second Principle of Finite Induction. Let there be associated with each positive integer n a proposition P(n). If, for each m, the assumption that P(k) is true for all k < m implies the conclusion that P(m) is itself true, then P(n) is true for all n.

Proof. Let S be the set of integers for which P(n) is false. Unless S is empty, it will have a first member m. By choice of m, P(k) will be true for all k < m; hence by hypothesis, P(m) must itself be true, giving a contradiction. The only way out is to admit that S is empty. Q.E.D.

(Caution: In case m = 1, the set of all k < 1 is empty, so that one must implicitly include a proof of P(1).)

Exercises

1. Prove by induction that 1 + 2 + ··· + n = n(n + 1)/2.
2. Prove by induction the following summation formulas:
(a) 1 + 4 + 9 + ··· + n² = n(n + 1)(2n + 1)/6,
(b) 1 + 8 + 27 + ··· + n³ = [n(n + 1)/2]².
3. Prove by induction that x₁² + ··· + xₙ² > 0 unless x₁ = ··· = xₙ = 0.
4. In any ordered domain, show that every odd power of a negative element is negative.
5. Prove by induction that the following laws for positive exponents are valid in any integral domain:
(a) (aᵐ)ⁿ = aᵐⁿ, (b) (ab)ⁿ = aⁿbⁿ, (c) 1ⁿ = 1.
6. Prove formulas (9) and (10).
7. Using the Second Principle of Finite Induction, but not the well-ordering principle, prove Theorem 3. (Hint: Let P(n) mean n ≥ 1.)
*8. Prove the well-ordering principle from the Principle of Finite Induction. (Hint: Let P(n) be the proposition that any class of positive integers containing a number ≤ n has a least member.)
9. Using the definition (5), prove the following associative law:
(a₁ + ··· + aₘ) + (b₁ + ··· + bₙ) = a₁ + ··· + aₘ + b₁ + ··· + bₙ.
10. Obtain a formula for the nth derivative of the product of two functions and prove the formula by induction on n.
*11. Prove that to any base a > 1, each positive integer m has a unique expression of the form

m = rₙaⁿ + rₙ₋₁aⁿ⁻¹ + ··· + r₁a + r₀,

where the integers rₖ satisfy 0 ≤ rₖ < a.
*12. Illustrate Ex. 11 by converting the equation 63 · 111 = 6993 to the base 7, checking by multiplying out.
13. Prove that the sum of the digits of any multiple of 9 is itself divisible by 9.
14. A druggist has only the five weights of 1, 3, 9, 27, and 81 ounces and a two-pan balance (weights may be placed in either pan). Show that he can weigh any amount up to 121 ounces.

1.6. Divisibility

An equation ax = b with integral coefficients does not always have an integral solution x. If there is an integral solution, b is said to be divisible by a; the investigation of this situation is the first problem of number theory. An analogous concept of divisibility arises in every integral domain; it is defined as follows.

Definition. In an integral domain D, an element b is divisible by an element a when b = aq for some q in D. When b is divisible by a, we write a | b; we also call a a factor or divisor of b, and b a multiple of a. The divisors of 1 in D are called units or invertibles of D.

Like the equality relation a = b, the relation a | b is reflexive and transitive:

(11)  a | a;  a | b and b | c imply a | c.

The first law of (11) is trivial, since a = a · 1 implies that a | a. To prove the second, recall that the hypotheses a | b and b | c are defined to mean
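Exercises 11 and 12 of §1.5 can be explored computationally. The sketch below is our own illustration (the function names are not from the text); it produces the base-a digits rₖ by repeated division with remainder:

```python
# Digits r_0, r_1, ..., r_n of m to the base a > 1, as in Ex. 11 of §1.5:
# m = r_n a^n + ... + r_1 a + r_0, with 0 <= r_k < a for every k.
def digits_base(m, a):
    assert m > 0 and a > 1
    digits = []
    while m > 0:
        m, r = divmod(m, a)   # division with remainder, 0 <= r < a
        digits.append(r)
    return digits             # least significant digit first

def from_digits(digits, a):
    # Reassemble the sum r_k * a^k, inverting digits_base.
    return sum(r * a**k for k, r in enumerate(digits))
```

For Ex. 12, digits_base(6993, 7) gives [0, 5, 2, 6, 2]; that is, 6993 is written 26250 to the base 7.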
If the integers a and b divide each other (a Iband b Ia). Since neither a nor b is zero.0.D. Definition.19. so by the law of trichotomy Ia I > 1 and Ib I > 1.31. any integer a is divisible by a. Substitution of the first equation in the second gives c = a(dld z). Theorem 5.29. and hence again a = ±b.17. But according to the rules for the absolute value of a product.23. Corollary. = 7·96 = 7·12·8 = 7'3. Q. b = ± 1.E. The only units of Z are ± 1. as asserted in the conclusion of (11). thus 128 = 27. so that a = ± 1. this states according to the definition that a I c. exclusive of the right-hand end point. If a Ib. The Euclidean Algorithm The ordinary process of dividing an integer a by b yields a quotient q and a remainder r. say a .bq. while a = bq + (a . division points on the line -3b -2b -b o b 2b 3b The point representing a must fall in one of the intervals determined by these points. Hence 0 <: r < b. with b > 0.bx. then a I(b + c). 2. (Hint: Throwaway multiples of 2.bq = r. prove that la I < Ib I when b '" O. then a . while if r >. If we imagine the whole numbers displayed on the real axis.7. Division Algorithm. This picture suggests the following proof based on our postulates. There certainly is some integral multiple of b not exceeding a. a .bq) = bq + r. there exist integers q and r such that (12) a = bq o <: r < b. .b >. 4. this amounts to the following assertion. (c) if c divides every x in D. and use Ex. Proof. List all positive primes less than 100. For given integers a and b. (b) a unit u of D divides every element of D.(-I a \)b.1 by Theorem 3. 3.18 Ch.0. Hence. 1 The Integers Exercises 1.bx contains at least one nonnegative integer. by the well-ordering postulate. + r. contrary to our choice of q. Prove that if a Ib and a Ic. c is a unit. as asserted. 3. there is a least nonnegative a .b.0 would be less than a . say in the interval between bq and b(q + 1).) 5. 1. By construction. 3. 
Prove the following properties of units in any domain: (a) the product of two units is a unit. We conclude that 0 <: r < b. Prove: If b is positive and not prime. r >. namely. since b > 0. This means that a . 5.b(q + 1) = r . it has a positive prime divisor d < Jb. Therefore the set of differences a .bq = r. so (-Ia I)b <: -Ia 1 <: a. the possible multiples bq of b form a set of equally spaced Geometric picture. Formally. 7. where r represents a length shorter than the whole length b of an interval. for instance. b >. negative. Therefore. there is at least one positive element Ia I = ±a in S.a = 0.q) is numerically smaller than b.bq = r.7 The Euclidean Algorithm Corollary 1. All the even integers (positive. 3. Then r . The well-ordering principle will provide a least positive element b in S. which gives the uniqueness of q and r.d. for xm ± ym = (x ± y)m is a multiple of m. An integer d is a greatest common divisor (g. Q. d must have the properties d I a. Definition. Suppose that a = bq + r = bq' + r'. 9. O.19 §1. The set S can contain nothing but the integral multiples of b.. d I b. the Division Algorithm may be applied to give a difference a . More generally. then (k + 1)b = kb + b is a sum of two elements of S. such as the set . For if a is any element of S. which is also in S. Hence r = r'. and a = bq is a multiple of b. Theorem 6. .. as asserted. where 0 <: r < b. while b is the smallest positive element in S. 0.D.r' = b(q' . a set S of integers is said to be closed under addition and subtraction if S contains the sum a + b and the difference a . In general. hence is in S.. In symbols. Let such a set S contain an element a o. Any non void set of integers closed under addition and subtraction either consists of zero alone or else contains a least positive element and consists of all the multiples of this integer. For one may first show by induction on n that any positive multiple nb is in S: if n = 1. Proof. 
the quotient q and the remainder r which satisfy (12) are uniquely determined.r' must be zero. . and hence the difference 0 . bq = bq'. It follows that r .E. the set of all multiples xm of any fixed integer m is closed under addition and subtraction.(nb) is a difference of two elements of S. This set has the important property that the sum or the difference of any two integers in the set is again an integer in the set. Therefore r == 0. hence is in S. We now prove that such sets of multiples are the only sets of integers with these properties.) of the integers a and b if d is a common divisor of a and b which is a multiple of every other common divisor. The set S must contain all integral multiples of b. b is in S. The remainder r is nonnegative and less than b. 6. -3.E.b of any two integers a and b in S.D. any negative multiple (-n)b = 0 . For given integers a and b. which consists of all mUltiples of 3. q =q'. Frequently. Then S contains the difference a . Q. we have occasion to deal not with individual integers but with certain sets of integers.i.. 0 <: r' < b. if kb is already known to lie in S.c. -6. Proof. Consequently. but is a multiple of b.a = -a. c Ia and c Ib imply c I d. and zero) form such a set. with integral coefficients sand t. The Division Algorithm gives (14) o -< r1 < b. of two integers a and b. Note that the adjective "greatest" in the definition of a g. but that d is a multiple of any such c.D.'s ± d for a and b. For any two such Therefore the set S of all integers sa + tb is closed under addition and subtraction.c.c. d is a common divisor. On the other hand. (a. the original integers a = 1 . means not primarily that d has a greater magnitude than any other common divisor c. Similarly. 1 The Integers For example. To find explicitly the g. b). Of the two possible g. b) = sa + tb. a + 0 . so by Theorem 6 consists of all mUltiples of some minimum positive number d = sa + tb. In other words. According to the definition two different g. 
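A g.c.d. in this sense can be computed by repeated division with remainder — the Euclidean algorithm developed in the next paragraphs — and the same computation yields coefficients s and t with d = sa + tb. The following sketch is ours, not the book's:

```python
# Euclidean algorithm, carrying each remainder along as a combination
# s*a + t*b, so that the final g.c.d. d satisfies d = s*a + t*b.
def extended_gcd(a, b):
    (old_r, old_s, old_t) = (a, 1, 0)   # a = 1*a + 0*b
    (r, s, t) = (b, 0, 1)               # b = 0*a + 1*b
    while r != 0:
        q = old_r // r                  # quotient of one division step
        (old_r, old_s, old_t), (r, s, t) = (r, s, t), \
            (old_r - q * r, old_s - q * s, old_t - q * t)
    return old_r, old_s, old_t          # last nonzero remainder + coefficients

d, s, t = extended_gcd(2873, 6643)
assert d == s * 2873 + t * 6643
```

Here d = 13, and every common divisor of 2873 and 6643 divides 13, as the definition requires.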
Theorem 7. Any two integers a ≠ 0 and b ≠ 0 have a positive greatest common divisor (a, b). It can be expressed as a "linear combination" of a and b, in the form

(13)  (a, b) = sa + tb,

with integral coefficients s and t.

Proof. Consider the numbers of the form sa + tb. For any two such numbers, the sum and the difference are again of this form. Therefore the set S of all integers sa + tb is closed under addition and subtraction, so by Theorem 6 it consists of all multiples of some minimum positive number d = sa + tb. The original integers a = 1 · a + 0 · b and b = 0 · a + 1 · b both lie in the set S under consideration, and hence must be multiples of the minimum number d in this set; in other words, d is a common divisor. On the other hand, from the formula d = sa + tb it is clear that any common factor c of a and b must be a factor of d. Hence d is the desired greatest common divisor. Q.E.D.

According to the definition, two different g.c.d.'s of a and b must divide each other, hence differ only in sign. For example, both 3 and -3 are greatest common divisors of 6 and 9. Of the two possible g.c.d.'s ±d for a and b, the positive one is often denoted by the symbol (a, b); thus (a, b) = (a, -b). Note that the adjective "greatest" in the definition of a g.c.d. means not primarily that d has a greater magnitude than any other common divisor c, but that d is a multiple of any such c.

Theorem 8. Any two integers a and b have a least common multiple m = [a, b] which is a divisor of every common multiple and which itself is a common multiple.

Proof. The set M of common multiples of a and b is closed under addition and subtraction. Its least positive member m will be a common multiple of a and b dividing every common multiple. Thus m is a "least common multiple" (or l.c.m.). Q.E.D.

To find explicitly the g.c.d. of two integers a and b, one may use the so-called Euclidean algorithm. We may suppose that a and b are both positive, since a negative integer b could be replaced by -b without altering the g.c.d. The Division Algorithm gives

(14)  r₁ = a - bq₁,  0 ≤ r₁ < b.

Every integer which divides the terms a and b must divide the remainder r₁; conversely, every common divisor of b and r₁ is a divisor of a in (14). Therefore the common divisors of a and b are the same as the common divisors of b and r₁, so the g.c.d.'s (a, b) and (b, r₁) are identical. This reduction can be repeated on b and r₁:

(15)  r₂ = b - r₁q₂,  0 ≤ r₂ < r₁,
      r₃ = r₁ - r₂q₃,  0 ≤ r₃ < r₂,
      ···
      rₙ₊₁ = rₙ₋₁ - rₙqₙ₊₁.

Since the remainders continually decrease, there must ultimately be a remainder rₙ₊₁ which is zero. (Why? Does a proof of this involve the well-ordering principle?) The argument above shows that the desired greatest common divisor is

(a, b) = (b, r₁) = (r₁, r₂) = ··· = (rₙ₋₁, rₙ).

But the last equation of (15), with rₙ₊₁ = 0, shows that rₙ is itself a divisor of rₙ₋₁, so that the last g.c.d. (rₙ₋₁, rₙ) is just rₙ itself. The g.c.d. of the given integers a and b is thus the last nonzero remainder rₙ in the Euclidean algorithm (14) and (15).

The algorithm can also be used to represent the g.c.d. explicitly as a linear combination sa + tb, as in (13). This can be done by expressing the successive remainders in terms of a and b:

r₁ = a + (-q₁)b,
r₂ = b - q₂r₁ = (-q₂)a + (1 + q₁q₂)b,
···

The form of these equations indicates that one would eventually obtain rₙ as a linear combination of a and b with integral coefficients s and t which involve the quotients qᵢ. The expression (a, b) = sa + tb for the g.c.d. is of the greatest utility.

One important consequence is the fact that a prime which divides a product of two numbers must always divide at least one of the factors:

Theorem 9. If p is a prime, then p | ab implies p | a or p | b.

Proof. By the definition of a prime, the only factors of p are ±1 and ±p. If the conclusion p | a is false, the only common divisors of p and a are ±1, so that 1 is a g.c.d. of a and p and can thus be expressed in the form 1 = sa + tp. On multiplying through by b, we have b = sab + tbp. Both terms on the right are divisible by p; hence the left side b is divisible by p, as in the second alternative in the theorem. Q.E.D.

If (a, b) = 1, we call a and b relatively prime. In other words, two integers a and b are relatively prime if they have no common divisors except ±1. The argument used to prove Theorem 9 will also prove the following generalization:

Theorem 10. If (c, a) = 1 and c | ab, then c | b.

One consequence may be drawn for an integer m which is a multiple of each of two relatively prime integers a and c. Such an m has the form m = ad and is divisible by c, so by Theorem 10, c | d. Then d = cd', and m = ad = a(cd'). Therefore the product ac divides m. This argument proves

Theorem 11. If (a, c) = 1, a | m, and c | m, then ac | m.

Exercises

1. Use the Euclidean algorithm to find the g.c.d. of
(a) (14, 35), (b) (11, 15), (c) (180, 252), (d) (2873, 6643), (e) (4148, 7684), (f) (1001, 7655).
2. Write (x, y) in the form sx + ty (s, t integers) in Ex. 1(a)-(c).
3. Prove that (0, a) = |a| for any integer a.
4. Show that b | c and |c| < b imply c = 0.
5. (a) Prove that any three integers a, b, c have a g.c.d. which can be expressed in the form sa + tb + uc.
(b) Prove that (a, (b, c)) = ((a, b), c).
6. If a > 0, prove that (ab, ac) = a(b, c).
7. Discuss Exs. 3-5 and 6 for the case of l.c.m.'s.
8. Show that a set of integers closed under subtraction is necessarily also closed under addition. (This fact is used in proving Corollary 1.)
9. In the Euclidean algorithm, show by induction on k that each remainder can be expressed in the form rₖ = sₖa + tₖb, where sₖ and tₖ are integers.
10. Give a detailed proof of Theorem 10.
11. Show that a set of integers closed under addition alone need not consist of all multiples of one fixed element.
*12. Show that for any positive integers a, b, the set of all ma + nb (m, n positive integers) includes all multiples of (a, b) larger than ab.
13. (a) Prove that if (a, m) = 1, then (ab, m) = (b, m).
(b) Prove that if (a, m) = (b, m) = 1, then (ab, m) = 1.
(c) Prove that [a, c] = ac/(a, c).
14. If q is an integer such that for all integers a and b, q | ab implies q | a or q | b, prove that q is 0, ±1, or a prime (cf. Theorem 9).

1.8. Fundamental Theorem of Arithmetic

It is now easy to prove the unique factorization theorem for integers, also called the fundamental theorem of arithmetic.

Theorem 12. Any integer not zero can be expressed as a unit (±1) times a product of positive primes. This expression is unique except for the order in which the prime factors occur.

That any integer a can be written as such a product may be proved by successively breaking a up into smaller factors. This process involves the second principle of finite induction and can be described as follows. It clearly suffices to consider only positive integers a. Let P(a) be the proposition that a can be factored as in Theorem 12. If a = 1 or if a is a prime, then P(a) is trivially true. On the other hand, if a is composite, then it has a positive divisor b which is neither 1 nor a, so that a = bc, with b < a, c < a. By the second induction principle, we can assume P(b) and P(c) to be true, so that b and c can be expressed as products of primes, yielding for a = bc a composite expression which is of the desired form.

To prove the uniqueness, we have to consider two possible prime factorizations of an integer a,

a = p₁p₂ ··· pₘ = q₁q₂ ··· qₙ.

Since the primes pᵢ and qⱼ are all positive, the terms ±1 in the two decompositions must agree. The prime p₁ in the first factorization is a divisor of the product a = q₁q₂ ··· qₙ, so that repeated application of Theorem 9 insures that p₁ must divide at least one factor qⱼ of this product. Rearrange the factorization q₁q₂ ··· qₙ so that qⱼ appears first. Since p₁ | qⱼ and both are positive primes, p₁ = qⱼ. We may then cancel p₁ against qⱼ, leaving

p₂ ··· pₘ = q₂' ··· qₙ',

where the accents denote the q's in their new order. Continue this process until no primes are left on one side of the resulting equation. There can then be no primes left on the other side, so that m = n. We have caused the two factorizations to agree simply by rearranging the primes in the second factorization, as asserted in our uniqueness theorem. Q.E.D.

In the factorization of a number the same prime p may occur several times. Collecting these occurrences, we may write the decomposition as

(16)  a = ±p₁^e₁ p₂^e₂ ··· pₖ^eₖ  (1 < p₁ < p₂ < ··· < pₖ).

Here our uniqueness theorem asserts that the exponent eᵢ to which each prime pᵢ occurs is uniquely determined by the given number a.

Exercises

1. Describe a systematic process for finding the g.c.d. and the l.c.m. of two integers whose prime-power decompositions (16) are known, illustrating with a = 216, b = 360 and with a = 144, b = 625.
2. If Vp(a) denotes the exponent of the highest power of the prime p dividing the nonzero integer a, prove the formulas
(i) Vp(a + b) ≥ min {Vp(a), Vp(b)},
(ii) Vp((a, b)) = min {Vp(a), Vp(b)},
(iii) Vp(ab) = Vp(a) + Vp(b),
(iv) Vp([a, b]) = max {Vp(a), Vp(b)}.
3. Using the formulas of Ex. 2, show that for any positive integers a and b, ab = (a, b)[a, b]. (Hint: It is helpful to use "dummy" zero components for primes dividing one but not both of a or b.)
*4. If ‖a‖ = 2^(-V₂(a)), for V₂ as in Ex. 2, prove that ‖ab‖ = ‖a‖ · ‖b‖ and ‖a + b‖ ≤ max (‖a‖, ‖b‖).
*5. Let V(a) be a nonnegative function with integral values, defined for all nonzero integers a and having properties (i) and (iii) of Ex. 2. Prove that V(a) is either identically 0 or a constant multiple of one of the functions Vp(a) of Ex. 2. (Hint: First locate some p with V(p) > 0.)
6. Prove that the number of primes is infinite (Euclid). (Hint: If p₁, ···, pₙ are n primes, then the integer p₁p₂ ··· pₙ + 1 is divisible by none of these primes.)
*7. If a product mn of positive integers is a square and if (m, n) = 1, show that both m and n are squares.
*8. The possible right triangles with sides measured by integers x, y, and z may be found as follows. Assume that x, y, and z have no common factors except ±1.
(a) If x² + y² = z², show that x and y cannot both be odd.
(b) If y is even, show that y = 2mn, x = m² - n², z = m² + n², where m and n are integers. (Hint: Factor z² - x², show that (z + x, z - x) = 2, and apply Ex. 7.)
*9. Define the function e(n) (n any positive integer) as the g.c.d. of the exponents occurring in the prime factorization of n. Show that
(a) for given n and r, there is an integer x such that xʳ = n if and only if r | e(n),
(b) e(nʳ) = r · e(n),
(c) if e(m) = e(n) = d, then d | e(mn).

1.9. Congruences

In giving the time of day, it is customary to count only up to 12. We call two integers congruent "modulo 12" if they differ only by an integral multiple of 12; for instance, 7 and 19 are so congruent, and we write 7 ≡ 19 (mod 12). This simple idea of throwing away the multiples of a fixed number 12 is the basis of the arithmetical notion of congruence, which is defined as follows.

Definition. For integers a, b, and m, a ≡ b (mod m) holds if and only if m | (a - b).

One might equally well say that a ≡ b (mod m) means that the difference a - b lies in the set of all multiples of m. There is still another alternative definition, based on the fact that each integer a on division by m leaves a unique remainder (Corollary 1 of §1.7). This alternative we state as follows:

Theorem 13. Two integers a and b are congruent modulo m if and only if they leave the same remainder when divided by |m|.

Proof. Since a ≡ b (mod m) if and only if a ≡ b (mod -m), it will suffice to prove this result for the case m > 0. Suppose first that a ≡ b (mod m) according to our definition, and let b = qm + r, where 0 ≤ r < m. The hypothesis means that a - b = cm for some integer c, so that a = b + cm = (qm + r) + cm = (q + c)m + r. This equation indicates that r is the unique remainder of a on division by m; hence a and b do have the same remainder. Conversely, suppose that a = qm + r and b = q'm + r, with the same remainder r. Then a - b = (q - q')m is divisible by m, so that a ≡ b (mod m), as asserted. Q.E.D.
The relation of congruence for a fixed modulus m has for all integers a, b, and c the following properties, reminiscent of the laws of equality (§1.2):

Reflexive: a ≡ a;
Symmetric: a ≡ b implies b ≡ a;
Transitive: a ≡ b and b ≡ c imply a ≡ c  (all taken mod m).

Each of these laws may be proved by reversion to the definition of congruence. The symmetric law, so translated, requires that m | (a - b) imply m | (b - a). The hypothesis here is a - b = dm; from this we may derive the conclusion in the form b - a = (-d)m, so that m | (b - a).

The relation of congruence for a fixed modulus m has a further "substitution property," reminiscent of equality also: sums of congruent integers are congruent, and products of congruent integers are congruent.

Theorem 14. If a ≡ b (mod m), then for all integers x,

a + x ≡ b + x,  ax ≡ bx,  -a ≡ -b  (all mod m).

Here again the proofs rest on an appeal to the definition. By definition, the hypothesis states that m | (a - b); from this we may derive the conclusions in the form m | ((a + x) - (b + x)), m | (ax - bx), and m | (-a + b).

The law of cancellation which holds for equations need not hold for congruences. Thus 2 · 7 ≡ 2 · 1 (mod 12) does not imply that 7 ≡ 1 (mod 12). This inference fails because the 2 which was cancelled is a factor of the modulus. Whenever c is relatively prime to m, a modified cancellation law can be found:

Theorem 15. If ca ≡ cb (mod m) and (c, m) = 1, then a ≡ b (mod m).

Proof. By definition, the hypothesis states that m | (ca - cb); in other words, that m | c(a - b). But m is assumed relatively prime to the first factor c of this product, so Theorem 10 allows us to conclude that m divides the second factor a - b. This means that a ≡ b (mod m), as asserted. Q.E.D.

The study of linear equations may be extended to congruences.

Theorem 16. If c is relatively prime to m, then the congruence cx ≡ b (mod m) has an integral solution x. Any two solutions x₁ and x₂ are congruent modulo m.

Proof. By hypothesis, the g.c.d. (c, m) is 1, so 1 = sc + tm for suitable integers s and t. Multiplying by b, we obtain b = bsc + btm. The final term here is a multiple of m, so that b ≡ (bs)c (mod m). This states that x = bs is the required solution of cx ≡ b. On the other hand, two solutions x₁ and x₂ of this congruence must satisfy cx₁ ≡ cx₂ (mod m), because congruence is a transitive and symmetric relation. Since c is relatively prime to m, we can cancel the c here, as in Theorem 15, obtaining the desired conclusion x₁ ≡ x₂ (mod m). Q.E.D.

An important special case arises when the modulus m is a prime. In this case all integers not divisible by m are relatively prime to m. This fact gives the

Corollary. If p is a prime and if c ≢ 0 (mod p), then cx ≡ b (mod p) has a solution which is unique modulo p.

Simultaneous congruences can also be treated.

Theorem 17. If the moduli m₁ and m₂ are relatively prime, then the congruences

(17)  x ≡ b₁ (mod m₁),  x ≡ b₂ (mod m₂)

have a common solution x. Any two solutions are congruent modulo m₁m₂.

Proof. For any integer y, x = b₁ + ym₁ is a solution of the first congruence. Such an x satisfies the second congruence also if and only if b₁ + ym₁ ≡ b₂ (mod m₂), or ym₁ ≡ b₂ - b₁ (mod m₂). Since m₁ is relatively prime to the modulus m₂, this congruence can be solved for y by Theorem 16. Conversely, suppose that x and x' are two solutions of the given simultaneous congruences (17). Then x - x' ≡ 0 (mod m₁) and also (mod m₂). Since m₁ and m₂ are relatively prime, this implies that the difference x - x' is divisible by the product modulus m₁m₂, so that x ≡ x' (mod m₁m₂). Q.E.D.

The same methods of attack apply to two or more congruences of the form aᵢx ≡ bᵢ (mod mᵢ), with (aᵢ, mᵢ) = 1 and with the various moduli relatively prime in pairs.

Theorem 18 (Fermat). If a is an integer and p is a prime, then aᵖ ≡ a (mod p).

Proof. For a fixed prime p, let P(n) be the proposition that nᵖ ≡ n (mod p). Then P(0) and P(1) are obvious. In the binomial expansion (9) for (n + 1)ᵖ, every coefficient except the first and the last is divisible by p, hence (n + 1)ᵖ ≡ nᵖ + 1 (mod p); whence P(n) implies (n + 1)ᵖ ≡ n + 1 (mod p), which is the proposition P(n + 1).

Exercises

1. Solve the following congruences:
(a) 3x ≡ 2 (mod 5), (b) 7x ≡ 4 (mod 10), (c) 243x + 17 ≡ 101 (mod 725),
(d) 4x + 3 ≡ 4 (mod 5), (e) 6x + 3 ≡ 4 (mod 10), (f) 6x + 3 ≡ 1 (mod 10).
2. Solve the simultaneous congruences:
(a) x ≡ 2 (mod 5), 2x ≡ 1 (mod 8);
(b) 3x ≡ 2 (mod 5), 2x ≡ 1 (mod 3).
3. Prove that the relation a ≡ b (mod m) is reflexive and transitive.
4. Prove directly that a ≡ b (mod m) and c ≡ d (mod m) imply a + c ≡ b + d (mod m) and ac ≡ bd (mod m).
5. (a) Show that the congruence ax ≡ b (mod m) has a solution if and only if (a, m) | b.
(b) Show that if (a, m) | b, the congruence has exactly (a, m) incongruent solutions modulo m. (Hint: Divide a, b, and m by (a, m).)
6. If m is an integer, show that m² ≡ 0, 1, or 4 (mod 8).
*7. (a) Show by tables that all numbers from 25 to 40 can be expressed as sums of four or fewer squares (the result is actually true for all positive numbers).
(b) Prove that no integer m ≡ 7 (mod 8) can be expressed as a sum of three squares. (Hint: Use Ex. 6.)
8. Prove that x² ≡ 35 (mod 100) has no solutions.
*9. If x is an odd number not divisible by 3, prove that x² ≡ 1 (mod 24).
*10. Prove that if x² ≡ n (mod 65) has a solution, then so does x² ≡ -n (mod 65).
11. Show by induction that Theorem 17 can be generalized to n congruences with moduli relatively prime in pairs.
*12. On a desert island, five men and a monkey gather coconuts all day, then sleep. The first man awakens and decides to take his share. He divides the coconuts into five equal shares, with one coconut left over. He gives the extra one to the monkey, hides his share, and goes to sleep. Later, the second man awakens and takes his fifth from the remaining pile; he too finds one extra coconut and gives it to the monkey. Each of the remaining three men does likewise in turn. Find the minimum number of coconuts originally present. (Hint: Try -4 coconuts.)
*13. Prove that if (m₁, m₂) = 1, then the simultaneous congruences a₁x ≡ b₁ (mod m₁), a₂x ≡ b₂ (mod m₂), with (a₁, m₁) = (a₂, m₂) = 1, have a common solution, and that any two solutions are congruent modulo m₁m₂.
*14. Generalize Ex. 13 to n simultaneous congruences.
15. For what positive integers m is it true that whenever x² ≡ 0 (mod m), then also x ≡ 0 (mod m)?
16. If a and b are integers and p a prime, prove that (a + b)ᵖ ≡ aᵖ + bᵖ (mod p).

1.10. The Rings Zn

From early antiquity, man has distinguished between the "even" integers 2, 4, 6, ··· and the "odd" integers 1, 3, 5, ···. The following laws for reckoning with even and odd integers are also familiar:

(18)  even + even = odd + odd = even,  even + odd = odd,
      even · even = even · odd = even,  odd · odd = odd.

These identities define a new integral domain Z₂, which consists of two elements 0 ("even") and 1 ("odd") alone, and which has the addition and multiplication tables

0 + 0 = 1 + 1 = 0,  0 + 1 = 1 + 0 = 1,
0 · 0 = 0 · 1 = 1 · 0 = 0,  1 · 1 = 1.

We will now show that a similar construction can be applied to the remainders 0, 1, ···, n - 1 to any modulus n. Two such remainders can be added (or multiplied) by simply forming the sum (or product) in the ordinary sense (i.e., in Z), and then replacing the result by its remainder modulo n.
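The construction just described — add or multiply in Z, then take the remainder modulo n — can be sketched directly. This is our own illustration, not part of the text:

```python
# Build the addition and multiplication tables of Z_n by computing in Z
# and replacing each result by its remainder modulo n.
def tables_mod(n):
    add = [[(i + j) % n for j in range(n)] for i in range(n)]
    mul = [[(i * j) % n for j in range(n)] for i in range(n)]
    return add, mul
```

For n = 5, tables_mod(5) reproduces the tables printed below.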
§1.10 The Rings Zn

From early antiquity, man has distinguished between the "even" integers 0, ±2, ±4, ... and the "odd" integers ±1, ±3, ±5, .... The following laws for reckoning with even and odd integers are also familiar:

(18)  even + even = odd + odd = even,  even + odd = odd,
      even · even = even · odd = even,  odd · odd = odd.

These identities define a new integral domain Z2, which consists of two elements 0 ("even") and 1 ("odd") alone, and has the addition and multiplication tables

0 + 0 = 1 + 1 = 0,  0 + 1 = 1 + 0 = 1,  0 · 0 = 0 · 1 = 1 · 0 = 0,  1 · 1 = 1.

We will now show that a similar construction can be applied to the remainders 0, 1, ..., n − 1 for any modulus n. Two such remainders can be added (or multiplied) by simply forming the sum (or product) in the ordinary sense (i.e., in Z), and then replacing the result by its remainder modulo n. Tables for the case n = 5 are

+ | 0 1 2 3 4        · | 0 1 2 3 4
0 | 0 1 2 3 4        0 | 0 0 0 0 0
1 | 1 2 3 4 0        1 | 0 1 2 3 4
2 | 2 3 4 0 1        2 | 0 2 4 1 3
3 | 3 4 0 1 2        3 | 0 3 1 4 2
4 | 4 0 1 2 3        4 | 0 4 3 2 1

In every case the resulting system has properties (i)–(viii) of §1.1. That is, we have

Theorem 19. Under addition and multiplication modulo any fixed n > 1, the set of integers 0, 1, 2, ..., n − 1 constitutes a commutative ring Zn.

Proof. Equations in Zn mean congruences for ordinary integers, so the postulates hold in Zn provided "equality" in Z is reinterpreted to mean "congruent modulo n." By Theorem 14, a ≡ b (mod n) and c ≡ d (mod n) together imply

(19)  a + c ≡ b + d (mod n),  a − c ≡ b − d (mod n),  ac ≡ bd (mod n);

that is, sums and products of remainders are well defined modulo n. Now consider, for instance, the distributive law. Since a(b + c) = ab + ac for any integers, one must by (19) have a(b + c) ≡ ab + ac (mod n) when remainders are taken mod n. This is the distributive law in Zn. The proofs of the commutative and associative laws are the same, so postulates (i) and (ii) hold. It remains to verify postulates (iii)–(v): 0 and 1 in Z act in Zn as identities for addition and multiplication, respectively, while n − k is an additive inverse of k, modulo n. Q.E.D.

The only postulate for an integral domain not such an identity is the cancellation law of multiplication. This law is equivalent to the assertion that there are no divisors of zero in Zn: ab = 0 implies a = 0 or b = 0. These equations in Zn mean congruences for ordinary integers, so the law becomes the statement: ab ≡ 0 (mod n) implies a ≡ 0 (mod n) or b ≡ 0 (mod n). This is equivalent to the assertion that n | ab implies n | a or n | b. This is true if n is a prime (Theorem 9). If n is not prime, n has a nontrivial factorization n = ab, so n | ab although neither n | a nor n | b, and Zn has zero-divisors. This proves

Theorem 20. The ring Zn of integers modulo n is an integral domain if and only if n is a prime.

There are other, more systematic ways to construct the algebra of integers modulo n. In the last section, we saw that the relation x ≡ y (mod n) is reflexive, symmetric, and transitive, like ordinary equality. The device of replacing congruence by equality means essentially that all the integers which leave the same remainder on division by n are grouped together to make one new "number." Each such group of integers is called a "residue class." For the modulus 5 there are five such classes, corresponding to the possible remainders 0, 1, 2, 3, and 4; some of these classes are

1_5 = { ..., −14, −9, −4, 1, 6, 11, 16, ... },
2_5 = { ..., −13, −8, −3, 2, 7, 12, 17, ... },
3_5 = { ..., −12, −7, −2, 3, 8, 13, 18, ... }.

For any modulus n, the residue class r_n determined by a remainder r with 0 ≤ r < n consists of all integers a which leave the remainder r on division by n.
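Theorems 19 and 20 can be checked experimentally for small n. The sketch below is my own illustration, not the book's: it builds the tables of Zn by the rule "operate in Z, then take the remainder," verifies the distributive law for n = 5, and lists the zero-divisors, which appear exactly when n is composite.

```python
def make_tables(n):
    # Z_n: form sums and products in Z, then replace each result
    # by its remainder modulo n (the construction of Theorem 19).
    add = [[(a + b) % n for b in range(n)] for a in range(n)]
    mul = [[(a * b) % n for b in range(n)] for a in range(n)]
    return add, mul

def zero_divisors(n):
    # Nonzero a with a*b = 0 (mod n) for some nonzero b;
    # nonempty exactly when n is composite (Theorem 20).
    return sorted({a for a in range(1, n)
                     for b in range(1, n) if (a * b) % n == 0})

add5, mul5 = make_tables(5)
# a(b + c) = ab + ac holds in Z_5:
distributive = all(mul5[a][add5[b][c]] == add5[mul5[a][b]][mul5[a][c]]
                   for a in range(5) for b in range(5) for c in range(5))
```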
If a is in 'n.10 31 The Rings Zn division by n. (n . (a) Enumerate the units of Z15' (b) Show that if n = 2m + 1 is odd. Construct addition and multiplication tables for Z3 and Z4' Compute in 7. this rule may be stated as (20) For instance. as in the proof of Theorem 19. then the number of units of Z. and s any other elements in the corresponding classes. *7. . 2. 6.1)n. 3· (4·5). + s = t (mod n). For suppose that two residues. pick any representatives a and b of these classes. any element c which is not a unit is a zero-divisor. .···.13. Find all divisors of zero in Z26' Z24. all give the same sum. b in s"' then a + b is in the class tn belonging to the sum t. let x = y (mod 211") mean that x = y + 2n1l" for some integer n. 5. Show that k is a unit of Z. 11 + 7 = 18. The answer would be obtained if one used instead of the residues . by a general method to be discussed in §6. The residue classes which we have defined in terms of remainders may also be defined directly in terms of congruences. 3·4 + 3·5. Each integer belongs to one and only one residue class. 5.-. 3. 4. -14 + 17 = 3. How are these related to the sets 48 + 48 and 48 . For real numbers x and y. if and only if (k. whereas multiplication of residue classes cannot be so defined. *8.§1. Determine the exact set of all sums x + y and that of all products xy for x in 48. 48? Verify the associative law for the addition of residue classes. is even. 1". There are n residue classes: 0". In general. 35. and s give in Zn a remainder t as sum. y in 48. and find the residue class containing the sum (or the product) of these representatives. 3· (4 + 5). Exercises 1. Show that addition of residue classes can then be defined as in (20). Show that in Z. Other choices -9 + (-3) = -12. If an denotes the residue class which contains a. for a = . the algebra Zn could be defined as the algebra of these residue classes: to add (or mUltiply) two classes. *9.+ s = t (mod n). 
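The claim that rule (20) is independent of the chosen representatives can be tested directly. This small experiment is mine, not the book's; it reproduces the example 1_5 + 2_5 = 3_5 for all pairs of representatives in a window of integers. Note that Python's % operator already returns the remainder in {0, ..., n − 1} even for negative integers, so it picks out residue classes correctly.

```python
def residue_class(r, n, lo=-20, hi=21):
    # All integers in [lo, hi) leaving remainder r on division by n.
    return [a for a in range(lo, hi) if a % n == r]

ones = residue_class(1, 5)   # ..., -14, -9, -4, 1, 6, 11, 16, ...
twos = residue_class(2, 5)   # ..., -13, -8, -3, 2, 7, 12, 17, ...

# Every choice of representatives lands in the same sum class 3_5:
sum_classes = {(a + b) % 5 for a in ones for b in twos}
```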
and two integers will belong to the same residue class if and only if they are congruent (Theorem 13). The algebraic operations of Zn can be carried out directly on these classes. and b = s give a + b == . 2. a ~ Ia I. a function 4>: A ~ B on A to B is a rule which assigns to each element a in A an element a4> in B. x E A if and only if x E B. and Relations At this point. the process of taking the absolute value. the empty set 0 (the set with no members) is a subset of every set. for example.2 for any equality. A set is a quite arbitrary collection of mathematical objects: for example. Thus x ~ x 2 is a function 4> on the set A = Q of all rational numbers to the set B of all nonnegative rationals (it can also be considered as a function 4>: Q ~ Q). 4} denotes the set whose (only) elements are the numbers 0. The relation a ~ a4> is sometimes written a ~ 4>a or a ~ 4>(a). given any set A and a property P. so the relation "subset of" is transitive. . This principle (called the axiom of extensionality) can also be stated symbolically: A = B means that for all x. the set of all odd numbers or the set of all points in the plane equidistant from two given points. 2. symmetric. If A is a set.11. in the sense that two sets A and B are equal (the "same") if and only if they have the same elements. function. we can pick out various subsets: the set of all positive integers. These examples illustrate the principle that any property determines a subset. Moreover. hence it is a function 4>: z ~ Z. the operation "add one" sends each integer n to another. A function 4>: A ~ B is also . Taking the negative. with the symbol 4> for the function in front. a ~ -a. {O. Starting with any set. the set of all odd positive integers. we pause to discuss briefly the fundamental notions of set. In any ordered domain D. If both T c: Sand S c: A. is still another function on D to D . and transitive relation. 
the condition for the equality of sets becomes the statement that A = B if and only if both A c: Band B c: A. Generally. The resulting equality of sets is clearly a reflexive. and 4. A finite set A can be specified by listing its elements. by n ~ n + 1. we write x E A to signify that the object x is an element of the set A. then clearly T c: A. the set of all integers greater than 18. as required in §1. More generally. is similarly a function on the set D to the set of nonnegative elements in D. more exactly. the symbol S c: A indicates that S is a subset of A. if A and B are sets. one may form the subset (21) S == {x Ix E A and . and relation. Likewise.32 Ch. and x e A when x is not an element of A. binary operation. A set S is called a subset of a set A if and only if every element x of S is also in A. and so on. such as the set of all integers.x has P} of all those elements of A which have the property P. Likewise. any set is determined by its elements. Sets. We will write this a ~ a4>. Functions. 1 The Integers 1. and Relations called a mapping. a transformation. while not necessarily injective correspondences have been called many-one correspondences. the rule a .. Binary Operations. for any domain D. all a</> for a in A. but rieed not be all of B. For example. b. For example. The image is a subset of the codomain B.Ia I for integers is a function Z ~ Z. and the like.n + 1 is a bijection Z ~ Z and..11 33 Sets. x ~ 2x is an injection Z ~ Z (but is not surjection). Operations on pairs of numbers arise in many contexts-the addition of two integers. . 9}.. The image (or "range") of a function </>: A ~ B is the set of all the "values" of the function. the addition of two residue classes in Zn. For example.. and B its codomain. the multiplication of two real numbers. as in the uniqueness postulate for a commutative ring... with a</> = b.. A function </>: A ~ B is a bijection (or bijective. when to each element b E B there is one and only one a E A which has image b. 
absolute value a .. For example. that is. In such cases we speak of a binary operation. the subtraction of one integer from another. However. n . a binary operation "0" on a set S of elements a. when the image is the whole codomain. . . or a correspondence from A to B. . . the image of the telephone-dial function is the subset {O.I a I also defines a function Z ~ N that is surjective. but is not surjective because the image is the (proper) subset NeZ of all nonnegative integers. Bijections </>: A ~ B are also called one-one correspondences (of A onto B). A function </>: A ~ B is an injection (or one-one into) when different elements of A always have different images-in other words. A function </>: A ~ B is called surjective (or onto) when every element b E B is in the image-that is. 9} of all ten digits. that is. a .§1. Q omitted) to the set {O. the usual telephone dial ABC DEF \11 2 \11 3 GHI JKL \11 \11 4 5 MNO PRS TUV WXy z \11 6 \11 7 \11 8 \11 I 9 ° defines a function on a set A of 25 letters (the alphabet. The set A is called the domain of the function </>. 2. . For example. Here by "uniquely" we mean the substitution property (22) a = a' and b = b' imply aob=a'ob'. . is a rule which assigns to each ordered pair of elements a and b from S a uniquely defined third element c = a 0 b in the same set S. 1. when a<P = a'</> always implies a = a'. To decide whether or not a function is onto. with 1 omitted. In general. Functions.. we must know the intended codomain ..a is a bijection D ~ D. or one-one onto) when it is both injective and surjective. c. there are also nonmathematical relations." or "I. One also writes S2 for the product S x S of a set with itself. Exercises 1." "a == b (mod 7)/~ or "a Ib. aRc for all a. aRb). satisfy the following laws: Reflexive: Symmetric: Transitive: aRa for all a in S. Which of the three properties "reflexive.b. the relation of congruence between triangles in the plane-is such an equivalence relation. 3.b. 
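Injectivity and surjectivity can be checked exhaustively on finite sets. Here is a small illustration of the definitions just given (my own, using the finite rings Zn in place of the infinite Z): x ↦ 2x is a bijection of Z5 but is neither injective nor surjective on Z6.

```python
def classify(f, domain, codomain):
    # Exhaustive test of injectivity and surjectivity on finite sets.
    image = {f(a) for a in domain}
    injective = len(image) == len(set(domain))   # no two inputs collide
    surjective = image == set(codomain)          # image is whole codomain
    return injective, surjective

z5, z6 = range(5), range(6)
on_z5 = classify(lambda x: (2 * x) % 5, z5, z5)  # bijection
on_z6 = classify(lambda x: (2 * x) % 6, z6, z6)  # neither
```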
Especially important in mathematics are the relations R on a set S which. b) with a E S. c in S. such as the relation "is a brother of" between people. To discuss relations in general we introduce a symbol R to stand for any relation ("R" stands for" <. such as "a = b. symmetric." "symmetric. given two elements a and b in the set S. and which ones are commutative? a . Two given integers may be "related" to each other in many ways. and transitive relations are known as equivalence relations.B "is an uncle of/' "is a descendant of. One may readily mention many other relations between other types of mathematical objects. Reflexive. b in S." "a < b. For example. alb. b. bET. a < \b\. Do the same for the following relations on the class of all people: "is a father of/' "is a brother oft "is a friend of. or a does not stand in the relation R to b (in symbols.Ch. aR' b ). HR" denotes a binary relation on a given set S of objects if. Formally. 2(a + b).1 34 The Integers It is convenient to write S x T for the set of all ordered pairs of elements (a. a < b." etc. either a stands in the relation R to b (in symbols. like congruence and equality.). -a . Which of the following binary operations a 0 b on integers a and bare associative. this is called the Cartesian product (or simply "productH) of Sand T." and "transitive" apply to each of the following relations between integers a and b? a <: b. a binary operation is then the same thing as a function 0: S2 ~ S. aRb implies bRa aRb and bRc imply for all a." Would any of your answers be changed if these relations are restricted to apply only to the class of all men? *4." Each of these phrases is said to express a certain "binary relation" between a and b. 2." "=== . How is the relation "is an uncle of" connected with the relations "is a brother of" and "is a parent of"? Can you state any similar general rule for making a new relation out of two given ones? . replacing Z by the class Z+ of positive integers. 
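When S is finite and R is given as a set of ordered pairs, the reflexive, symmetric, and transitive laws are each finitely checkable. A sketch (mine, not the book's): congruence modulo 3 passes all three tests, while the relation "<" fails reflexivity and symmetry.

```python
def laws(S, R):
    # R is a set of ordered pairs (a, b), read "a R b".
    reflexive  = all((a, a) in R for a in S)
    symmetric  = all((b, a) in R for (a, b) in R)
    transitive = all((a, d) in R
                     for (a, b) in R for (c, d) in R if b == c)
    return reflexive, symmetric, transitive

S = range(6)
mod3 = {(a, b) for a in S for b in S if (a - b) % 3 == 0}
less = {(a, b) for a in S for b in S if a < b}
```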
5. What is wrong with the following "proof" that the symmetric and transitive laws for a relation R imply the reflexive law? "By the symmetric law, aRb implies bRa; by the transitive law, aRb and bRa imply aRa."
*6. A relation R is called "circular" if aRb and bRc imply cRa. Show that a relation is reflexive and circular if and only if it is reflexive, symmetric, and transitive.
7. Each of the following rules defines a function f: Z → Z. In each case specify the image, and whether or not the function is injective: (a) a ↦ |a| + 1. (b) a ↦ a^2. (c) a ↦ 2a + 5. (d) a ↦ g.
8. Show that any relation R on a set S can be regarded as a function f: S^2 → {0, 1}.
9. For what integers n is the function x ↦ 6x + 7 bijective on Zn? surjective on Zn?
10. Do Ex. 9, replacing Z by the class Z+ of positive integers.

1.12. Isomorphisms and Automorphisms

One of the most important concepts of modern algebra is that of isomorphism. An appropriate example is the algebra of "even" and "odd," as discussed in §1.10, as compared with the integral domain Z2. The one-one correspondence

even ↔ 0,  odd ↔ 1

is an isomorphism between these domains because corresponding elements are added and multiplied according to the same rules (cf. formula (18)). We now define this concept for commutative rings as follows:

Definition. An isomorphism between two commutative rings R and R' is a one-one correspondence a ↔ a' of the elements a of R with the elements a' of R' which satisfies for all elements a and b the conditions

(23)  (a + b)' = a' + b',  (ab)' = a'b'.

The rings R and R' are called isomorphic if there exists such a correspondence.

On account of the laws (23) one may say that the isomorphism a ↔ a' "preserves sums and products." Loosely speaking, two commutative rings are isomorphic when they differ only in the notation for their elements. In describing the system of integers as an ordered domain in which each set of positive integers has a least element, we claimed that these postulates completely describe the integers for all mathematical purposes.
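Since each of these two rings has only two elements, the even/odd isomorphism can be verified exhaustively. In this sketch (my own; the parity rules are the laws (18)), phi is the correspondence even ↔ 0, odd ↔ 1, and both conditions (23) are checked for all four pairs of elements.

```python
phi = {'even': 0, 'odd': 1}   # the one-one correspondence

def add(x, y):   # parity addition, rules (18)
    return 'even' if x == y else 'odd'

def mul(x, y):   # parity multiplication, rules (18)
    return 'odd' if x == y == 'odd' else 'even'

pairs = [(x, y) for x in phi for y in phi]
# (a + b)' = a' + b' and (ab)' = a'b' in Z_2:
sums_ok  = all(phi[add(x, y)] == (phi[x] + phi[y]) % 2 for x, y in pairs)
prods_ok = all(phi[mul(x, y)] == (phi[x] * phi[y]) % 2 for x, y in pairs)
```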
We can now state this more precisely (it will be proved in §2.b)' = a'.6). We shall see later that the idea of isomorphism applies to algebraic systems in general.J2)(ml + n l J2)]. since for any a = m + n.b'. One may even describe abstract algebra as the study of those properties of algebraic systems which are preserved under isomorphism.2 for m and n in the domain Z of integers..n..2. but also differences. similarly...(mnl + mln)J2. 1 The Integers Many integral domains have important isomorphisms with themselves. By definition. of the equation b + x = a. + 2nnl) + (mnl + mln)J2].b) = a..J2) = (mm. and if S' is another system isomorphic to S. This correspondence is an isomorphism. Since the correspondence preserves sums. In words: the zero (unity) of R corresponds to the zero (unity) of R'. for example.b is the solution. SUCh a characterization of Z "up to isomorphism" is the most that could be achieved with any postulate system of the type we have used. so that b + (a . (ab)' = [(m a'b' = (m . it is isomorphic to itself under the nontrivial correspondence m + nJ2 +-'» m ..1 as the set of all numbers m + n.. a .(mnl + mln)J2 and. = [(mm. = (mm.. this asserts that (a . In describing the system of integers as an ordered domain in which each set of positive integers has a least element.n. Other rules are (24) 0' = 0 .b)' is the (unique) solution of the equation b' + x = a'. in general. + 2nn. they are analogous to symmetries of geometrical figures (see §6. Any ordered domain in which the positive elements are well-ordered is isomorphic to the domain Z of integers.. Any isomorphism a +-'» a' preserves not only sums and products.. that if a system S satisfies such a system of postulates. (a + b)' = a' + b'. Thus if S satisfies . or that (a . for it is clear.. then S' must also satisfy the postulates. Such isomorphisms are called automorphisms . l' = 1.2.nJ2)(ml .) .36 Ch.2 and b = ml + nl. (-a)' = -(a'). + 2nnt) .. Consider.2] described in §1.. b' + (a . we have + n. 
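The automorphism m + n√2 ↦ m − n√2 can be checked on a finite sample without any floating point by representing m + n√2 as the integer pair (m, n). This is my own encoding, not the book's; the multiplication rule below is the one displayed in the text, and conj is the correspondence in question.

```python
from itertools import product

def mul(u, v):
    # (m + n*sqrt2)(m1 + n1*sqrt2) = (m*m1 + 2*n*n1) + (m*n1 + m1*n)*sqrt2
    (m, n), (m1, n1) = u, v
    return (m * m1 + 2 * n * n1, m * n1 + m1 * n)

def conj(u):
    # the correspondence m + n*sqrt2  <->  m - n*sqrt2
    m, n = u
    return (m, -n)

elems = list(product(range(-3, 4), repeat=2))
sums_ok  = all(conj((u[0] + v[0], u[1] + v[1]))
               == (conj(u)[0] + conj(v)[0], conj(u)[1] + conj(v)[1])
               for u in elems for v in elems)
prods_ok = all(conj(mul(u, v)) == mul(conj(u), conj(v))
               for u in elems for v in elems)
```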
Since the isomorphism preserves sums..m + nJ3 is not an isomorphism between the domains Z[J2] and Z[. n E Z./3 for m. 3. 2. a reflexive. then a + b = b + a for all a and b in S. . S. Exercises 1.13]. (a) Prove that under any isomorphism an element x satisfying an equation x 2 = 1 + 1 must correspond to an element y = x' satisfying the equation y2 = l' + 1'. This asserts that the commutative law also holds in 5'.§ 1. Prove that isomorphism is an "equivalence relation" (Le.. so (a + b)' = (b + a)'.12 Isomorphisms and Automorphisms 37 a commutative law for addition. Prove that the correspondence m + nJ2 . a' + b' = b' + a'. Exhibit a nontrivial isomorphism of Z[ .. The corresponding elements in the given isomorphism must be equal...13] with itself. 4. symmetric. and transitive relation) . This argument is of a general character and applies to all our postulates. 7. Prove that the properties (24) hold for any isomor£hism. Show that the domain Z of integers has no nontrivial isomorphisms with itself. *6. Prove that an integral domain with exactly three elements is necessarily isomorphic to Z3. Let Z[J3] be the domain of all numbers m + n. (b) Use (a) to show that no isomorphism is possible between Z[ J2] and Z[J3]. 1. A field F is a commutative ring which contains for each elementa::.t. It is easy to show that the cancellation law Ox) of § 1.t:. 38 .1a = 1. every field is an integral domain. we now show that division is possible and has its familiar properties in any commutative ring where all nonzero elements have nonmultiplicative mverses. The method of extension is illustrated by the standard representation of fractions as quotients of integers. 0 and ca = cb. Commutative rings with this property are called fields.t. then In other words. for if c ::. Conversely. more generally. Definition. Oan "inverse" element a-I satisfying the equation a . in this section and the next we will show that any integral domain can be extended to a field in one and only one minimal way. 
So is every subdomain of a field (and for the same reason). Conversely, in this section and the next we will show that any integral domain can be extended to a field in one and only one minimal way.

Theorem 1. Division (except by zero) is possible and is unique in any field.

Proof. We have to show that for given a ≠ 0 and b in a field F the equation ax = b has one and only one solution x in F. The inverse a⁻¹ may be used to construct an element x = a⁻¹b which on substitution proves to be a solution of ax = b. It is the only solution, for by the cancellation law proved above, ax = b and ay = b together imply x = y if a ≠ 0. Q.E.D.

The solution of ax = b is denoted by b/a (the quotient of b by a); in particular, 1/a = a⁻¹. All the rules for algebraic manipulation listed in §1.2 are satisfied in fields, considered as integral domains. The usual rules for the manipulation of quotients can also be proved from the postulates for a field.

Theorem 2. In any field, quotients obey the following laws (where b ≠ 0 and d ≠ 0):

(i) (a/b) = (c/d) if and only if ad = bc;
(ii) (a/b) ± (c/d) = (ad ± bc)/(bd);
(iii) (a/b)(c/d) = (ac)/(bd);
(iv) (a/b) + (−a/b) = 0;
(v) (a/b)(b/a) = 1, if (a/b) ≠ 0.

Proof of (i). The hypothesis (a/b) = (c/d) means ab⁻¹ = cd⁻¹. This gives ad = (ab⁻¹)(bd) = (cd⁻¹)(bd) = c(d⁻¹d)b = bc. Conversely, if ad = bc, then a/b = b⁻¹a = b⁻¹(ad)d⁻¹ = b⁻¹(bc)d⁻¹ = cd⁻¹ = c/d, as desired.

Proof of (ii). Observe that x = a/b and y = c/d denote the solutions of bx = a and dy = c. These equations may be combined to give bd(x ± y) = ad ± bc. Thus x ± y is the unique solution z = (ad ± bc)/(bd) of the equation bdz = ad ± bc.

Proof of (iii). As above, the equations bx = a and dy = c can be combined to give (bd)(xy) = (bx)(dy) = ac, whence xy = (ac)/(bd).

Proof of (iv). Substituting in (ii), we have (a/b) + (−a/b) = (ab − ba)/b^2 = 0/b^2 = 0 · (b^2)⁻¹ = 0.

Proof of (v). Substituting in (iii), we have (a/b)(b/a) = (ab)/(ba). But (ab)/(ba) is the unique solution of the equation (ba)x = ab, and clearly x = 1 satisfies this equation; hence (ab)/(ba) = 1. Q.E.D.
the equations bx = a and dy = e can be combined to give (bd)(xy) = (bx )(dy) = ae. If a ¥. bd(x ± y) = ad ± be. whence xy = (ae)/(bd) . (a/b) Substituting in (ii).D. It is the only solution.O. Proof of (ii). 1 ad = be.0. Observe that x = a/band y = e/ d denote the solutions of bx = a and dy = e. Theorem 3 does apply. The proofs will be left to the reader as exercises. b.I) = d-1b. In testing a subset S of F for being a subfield.l .. d ¥. 1 = 1 + 0. This follows from the corollary of Theorem 16.O. §1.b. All identities (viz.J2) is another one of the same sort.O. and similarly the product is (a + bJ2)(c + dJ2) = (ac + 2bd) + (bc + ad)J2.J2.J2) contains 0 '= 0 + 0. = ad/bc.9. A subfield of a given field F is a subset of F which is itself a field under the operations of addition and multiplication in F. c.J2 if it contains a + bJ2. Fields exist in great variety. This gives the following result: Theorem 3. c ¥- a/I (-a)/(-b) '= = a. if b. Finally. a(b/c) = ab/c. is a subfield of the field of all real numbers. Q(. Definition.J2.0) its inverse a -I in S.J2. Again. for any prime p. and distributive laws) which hold in F hold a fortiori in any subset of F. This subfield is customarily denoted by Q(. if one assumes that the real numbers form a field.J2r l of any .J2) = -a . 2 40 Rational Numbers and Fields Arguments similar to those just employed can be used to prove such other familiar laws as the following: (1) (2) (bd)-I a ± (b/c) = (ac ± b)/c. Again. and -(a + b. (a/b)/c -(a/b) = (-a)/b = a/(-b). alb. associative. d ¥. the integral domain Zp constructed in § 1.J2). (3) (a/b)/(c/d) (4) (-b)-I = -(b. for the sum of any two numbers of Q(. A subset S of a field F is a sUbfield if S contains the zero and unity of F. one can easily construct other examples of fields by using the notion of a subfield. where Q designates the field of rationals. and if each a of S has its negative and (provided a ¥. 
one can therefore ignore the postulates which are identities and test only those which involve some "existence" assertion. provided the operations in question can be performed. the commutative. with rational coefficients a and b. if S is closed under addition and multiplication. such as the existence of an inverse. o. an inverse (a + b.10 is a field. = a/bc. Thus.Ch. Theorem 3 may now be applied to show that the set of all real numbers of the form a + b. b ¥- o. (Note that since w 3 . 1 a + The new denominator a 2 . c is a field." _ 1-= _ a + bJ2 (a . We may construct still other subfields if we assume that there is a field of complex numbers a + bi. subtraction.1 = (w . for (a + hw) + (c + dw) (a + bw)(c + dw) (a + c) + (b + d)w. the set Q(?s) of all real numbers a + b?s + with rational a. em Similarly.6).0 has an inverse in the set.2b a . y. any a + bw ¥. Addition.1 41 Definition of a Field nonzero element may be found by "rationalizing the denominator.§2. where i = J=i and a and b are real. 2 = ac + (be + ad)w + bdw = (ac .bd) + (be + ad .[3/2)i in the field. (a + b. b rational) form a subfield Q{w) of the field of all complex numbers.rs + cJ25)-1 may be computed by showing that the equation (a + b?s + cm)(x + y?s + zm) = 1 + o· ?s + o· m is equivalent to a system of simultaneous linear equations.rs)3 = 5 is a rational number.1 has been used to get rid of the term in w 2. One may easily verify that this inverse does indeed satisfy the equation (a' + b'J2)(a + bJ2) = 1. Furthermore.2b 2). for 2 -(b .2b . = where the equation w 2 = -w . These equations can always be solved for x. and z.2b 2 is never zero (as is proved in §3.ab (a + bw) [ a2 _ ab + b 2 = a 2 _ ab + b2 + b2 = 1.bd)w.1)(w 2 + w + 1) = 0. and multiplication are performed within this set much as in Q(J2). The quadratic equation w 2 +w+1=0 will have a root w = (-1 + J 3)/2 = -1/2 + (.a + bw)] a . w is an "imaginary" cube root of unity!) All a + bw (a.2b\ b' = -b/(a 2 . unless a = b = c = o. 
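Rationalizing the denominator, as above, gives an explicit formula for inverses in Q(√2), and it can be verified with exact rational arithmetic. In this sketch (my own encoding, not the book's), a + b√2 is stored as a pair (a, b) of Fractions; inv implements the displayed formula, and the product with the original element is checked to equal 1 = 1 + 0·√2.

```python
from fractions import Fraction as F

def mul(u, v):
    # (a + b*sqrt2)(c + d*sqrt2) = (a*c + 2*b*d) + (b*c + a*d)*sqrt2
    (a, b), (c, d) = u, v
    return (a * c + 2 * b * d, b * c + a * d)

def inv(u):
    # (a + b*sqrt2)^(-1) = a/(a^2 - 2b^2) - (b/(a^2 - 2b^2))*sqrt2;
    # the denominator a^2 - 2b^2 is nonzero for rational a, b not both 0,
    # since sqrt2 is irrational.
    a, b = u
    d = a * a - 2 * b * b
    return (a / d, -b / d)

x = (F(3), F(1))       # the element 3 + sqrt2
one = (F(1), F(0))     # the unity 1 + 0*sqrt2
```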
and the resulting inverse does have the required form a' + b'J2 with rational coefficients a' = a/(a 2 . . using this time the fact that (. b. Finally.bJ2) = ( a ) _ ( b )J2 2 2 2 2 bJ2 a -bJ2 a . . Is every integral domain isomorphic to a field itself a field? Why? 7. 8.ab + b 2 = (a 2 + b 2)/2 + (a . 2.b)2 /2 is certainly positive unless a = b = O. Prove that the only subfield of the field Q of rational numbers is Q itself. 10. with a. 3. Show that the law a + b = b + a is implied by postulates (i). *13.) *5. show that the set of elements common to Sand S' is also a subfield. Construction of the Rationals We will now prove rigorously that the (ordered) field Q of rational numbers can be constructed from the well-ordered domain Z of all integers. with a. 2. 6.1 for each c¥-O in ZII. (e) all numbers a + and b rational. Clearly. assuming that 1 + 1 = 0 (addition is mod 2) and that there is an element x such that x 2 = x + 1.ab + b 2 appearing in this inverse is never zero. If Sand S' are two subfields of a given field F. 2 42 Rational Numbers and Fields The denominator a 2 . Make a table which exhibits c./3. together with (viii') For each a in R. Show that in Theorem 3 the conditions 0 E Sand 1 E S can be replaced by the condition "S contains at least two elements." (Hint: Consider ax = a.1. Indeed. which of the followil1B subsets of reals are fields? (a) all positive inte~rs. Prove formulas (1)-(4) from the postulates for a field. 11.. 1. State and prove an analogue of Theorem 3 for subdomains. we will prove more: that a similar construction can be applied to any integral domain. the construction of the rational numbers from the integers is essentially just the construction of a field which will contain the integers. b rational (d) all with a rational numbers which are not integers. and (iv)-(vii) of §1. Exercises 1.O. the equations a + x = 0 and y + a = 0 have solutions x and y in R.Ch.J2) is either Q itself or the whole field Q(J2) . 
Find all subfields of the field of Ex. (b) all numbers a + b. The integers alone do not form a field.2. b rational. Can you state a general theorem on the possible subdomains of Z? of Z" ? *12. brs. 4. for a 2 . this field must also contain solutions for all equations bx = a with integral coefficients a and b ¥. If the set of real numbers is assumed to be a field. 12. Construct addition and multiplication tables for a field of four elements. 9. Show that a subfield of Q(. whose existence was postulated in Chap. (ii). (c) all numbers a + b45. b). Since this relation is not formal identity b) identical to (ai. r'. in the following way. symmetric. Theorem 1). for the distributive law one can reduce each side of the law systematically. (a. For instance. b') = (ab ' + a'b. b) = (ai. each of which is intended to stand for a solution of an equation bx = a. Definition. It can be formulated precisely as follows .2 43 Construction of the Rationals To construct abstractly the "rational numbers" which solve these equations. For each sum in the conclusion is given by a formula like (6). we simply introduce certain new symbols (or couples) r = (a. But this equation follows from the hypothesis (a. and «a. And then. (a. (i)-(iii». b') if and only if ab ' = a'b. according to definitions (6) and (7). In the first place. b') = (aa ' . b) = (a I. b) = (ai. b") = (ai. b') would mean a = a ' and b = b'). respectively. To realize this intention we must specify that these new objects shall be added.1.2 (for formal identity these properties would have been trivial). the sum and product are uniquely determined in the sense of this congruence. b') implies (a.e. by (6) (7) (a. b) + (a". Note that since D contains no "divisors of zero" (§ 1. The preceding specification makes good sense whether we start with the domain of integers Z. while sums and products are defined. bED and b :f: O. We conclude that the equality defined by (5) has the desired properties. 
The "equality" of such couples is governed by the convention that (5) (a. and transitive. b') + (a". Various algebraic laws in Q(D) may now be checked. We wish to regard the relation" =" of "congruence" between couples as an equality. b) with a. A similar uniqueness assertion holds for the product.. Thus. and equated exactly as are the quotients a/bin a field (Theorem 2. b"). we may check by straightforward argument that" =" is reflexive. Let D be any integral domain. b)' (ai. the product bb ' :f: 0 in (6) and (7). we must prove that this congruence has the properties of equality listed in §. The field of quotients Q(D) of D consists of all couples (a. multiplied. or from some other integral domain D.2.§2. and so Q is closed under addition and multiplication. and these two results are congruent in the sense (5) if and only if (ab" + a"b)b'b" = (a'b" + a"b')bb". bb ' ). b) + (ai. bb ' ). ab' = a'b). . where r. b') (i. Ch. The hypothesis (a. bb'b") . r(r' These two results give equal couples in the sense of (5). But this is easy.D. hence that (x. y) == (c.1 for an integral domain. 1).. Such an extra factor in a couple always gives an equal couple. for by (5) this equality amounts simply to the identity bxy = byx. b )[(a'. one proves the associative and commutative laws.. namely. b) = (0· b + 1 . b).in Q(D) of an inverse for r. b)(x. b' b") (aa'b" + aa"b'. a. b )(bc. Theorem 4. as the second result differs from the first only in the presence of an extra nonzero factor b in all terms. and (abc. By the same straightforward use of the definitions and the laws for D. 1) insures that a ¥. b) is -(a. b )(a' b" + a"b'. b )(a'. It remains only to prove that every equation rx = 1 with r ¥. We now wish to show that Q(D) actually contains our original integral domain D as a subdomain-in other words. for (0. This verifies all the postulates listed in §1. ad). b) ~ (0. b) = (a. b).0. d) because abed = bade. d) with (a. y) = (be. any equation ° (8) ° (a. 
1) is an identity for multiplication. by) == (x. An identity element for addition (a zero) is the couple (0. y) has a second term ad not zero. Proof. the existence for every r ¥. that Q(D) is actually an . as required by our definition of a rational number. For by direct substitution (a. b) ~ (0. 2 44 Rational Numbers and Fields r" are any three couples: + r") (a. b")] (a. ad) = (abc. bad)." (a.. bb") (aa'bb" + aa"bb'. bad) == (c. This explicit proof of the distributive law in Q(D) is but an illustration. b") (aa'. bb') + (aa".1) has a solution suggested by (3). bb'bb"). b') + (a". The field of quotients Q(D) is a field for any integral domain D.E. 1) + (a.has a solution x in Q(D)-that is. (bx. b') + (a. The negative of (a. b )(a". and the couple (1.' + . Q. (8') (x. y). 1 . b) = (-a. The cancellation law holds. more generally. and multiplication exactly like a itself. each element of which is a quotient of two elements ofD.1 + b . if and only if a = b. Let an integral domain D be contained as a subdomain in any field F Then the set of all those elements of F of the form alb. This proves Theorem 5.2 45 Construction of the Rationals extension of D. is a subfield S of F. 1 . 1) . b) is the quotient alb. a. this subfield S is isomorphic to Q(D) under the correspondence alb ~ (a. 1). We will. Theorem 6. We now show that the rational field Q = Q(Z) is in fact exactly characterized (up to isomorphism) by the preceding statement. 1) = (a . The integral domain Z can be embedded as a subdomain in a field Q = Q(Z). as shown by (a. (b. or br = a. we can associate with each a E D a couple (a. Moreover. Hence we have the Corollary. One may conclude that the one-one correspondence a ~ (a. 1 . prove the analogous result for any domain D. b E D. Note. 1) = (a + b. 1). b . moreover. Specifically.§2. 1) is an isomorphism of the given integral domain D to a subdomain of the field Q(D) = F. hence r = (a. Any integral domain D can be embedded isomorphically in a field Q(D). 
Indeed, it is suggestive to follow through the preceding arguments thinking of the special case D = Z, so that Q(D) = Q(Z) is the set of all ordinary fractions.

We should like to regard D as actually contained in Q(D). This is not strictly possible, since a couple (a, b) can't be the same thing as an element of D. However, equations (8) and (8') show that any couple r = (a, b) ∈ Q(D) is the solution of an equation (b, 1)r = (a, 1), or br = a; hence, if each a in D is identified with the couple (a, 1), then r = (a, b) is the quotient a/b. Since Z is defined by its postulates only up to an isomorphism, this is as complete a characterization as we can hope for.

An isomorphism between two fields F and F' means an isomorphism between F and F' regarded as commutative rings; in fact, it is a one-one correspondence x ↔ x', y ↔ y' between F and F' such that (x + y) ↔ (x' + y') and (xy) ↔ (x'y'). Theorem 5 applies in particular to the domain Z.

Proof of Theorem 6. The field F contains quotients a/b, which are solutions of equations bx = a with coefficients a and b ≠ 0 in D. The way in which these quotients a/b add, multiply, and become equal is described by (i)-(iii) of Theorem 2; exactly the same rules (5)-(7) are used for the couples (a, b) of Q(D). The set S of all these quotients contains all of D, since a/1 = a. By the laws of Theorem 2, S is closed under addition, subtraction, multiplication, and division (the divisor being nonzero), so that S might be described as the closure of D under these operations in F; hence S is a field (Theorem 3). Therefore the correspondence a/b ↔ (a, b) is an isomorphism of the closure S of D onto Q(D). Q.E.D.

Observe, in particular, that this correspondence maps each a in D onto a/1 ↔ (a, 1).

Combining Theorem 6 with the preceding corollary, we get

Theorem 7. The integral domain Z can be embedded in one and only one way in a field Q = Q(Z) so that each element of Q is a quotient of integers.

This completes the construction of the rational field Q from the integers.

Exercises

1. Prove that the "equality" relation defined by (5) is reflexive, symmetric, and transitive.
2. Prove in detail the commutative and the associative laws for multiplication of couples.
3. Let Z[i] be the set of all complex numbers a + bi, where a and b are integers and i² = -1. (a) State explicitly how to add and multiply two such numbers. (b) Prove that they form an integral domain. (c) Describe its quotient field.
4. Can the ring Z6 of integers modulo 6 be embedded in a field? Why?
5. Describe the field of quotients of the ring Z5 of integers modulo 5.
6. What is the field of quotients of the field Q? Generalize.
7. Prove that the correspondence a + b√7 ↦ a + b√11 (a, b rational) is not an isomorphism.
8. Show that under any isomorphism F → F' between two fields, a ↦ a', b ↦ b', and c ↦ c' imply (a - b)/c ↦ (a' - b')/c', provided c ≠ 0.
*9. Prove that any rational number not 0 or ±1 can be expressed uniquely in the form (±1)p1^e1 ··· pk^ek, where the pi are positive primes with p1 < p2 < ··· < pk, and the exponents ei are positive or negative integers.
10. For a fixed prime p, show that the set Z(p) of all rationals m/n with n prime to p is an integral domain. Identify its field of quotients.
*11. Prove that there is no isomorphism between the field Q(√7) of numbers of the form a + b√7 and that of numbers of the form a + b√11 (a, b rational). (Hint: Show that nothing can correspond to √7.)
*12. Prove that any rational number r/s ≠ 0 can be expressed uniquely in the form r/s = b1 + b2/2! + b3/3! + ··· + bn/n!, where n is a suitable integer, each bk is an integer, and 0 ≤ bk < k if k > 1.
13. Find the smallest subdomain of Q containing the rational numbers 1/6 and 1/5.
14. Describe all possible integral domains which are subdomains of Q.
*15. Show that the integral domain Z[√3], consisting of all a + b√3 for integers a and b, has a field of quotients isomorphic to the set of all real numbers of the form r + s√3, with r and s rational, and obtain an explicit isomorphism.
16. What can one say about the fields of quotients of isomorphic integral domains D and D'? Prove your statements.
17. Show that any field with exactly two elements is isomorphic to Z2.

2.3. Simultaneous Linear Equations

A field need not consist of ordinary "numbers". For instance, if p is a prime, the integers modulo p form a field containing only a finite number of distinct (i.e., incongruent) elements. The fact that the domain Zp is a field is a corollary of

Theorem 8. Any finite integral domain D is a field.

Proof. The assumption that D is finite means that the elements of D can be completely enumerated in a list b1, b2, ..., bn, where n is some positive integer (a discussion of finite sets in general appears in Chap. 12). To prove D a field, we need only provide an inverse for any specified element a ≠ 0 in D. Try all the products

(9)  ab1, ab2, ..., abn  (b1, ..., bn the elements of D).

This gives n elements in D which are all distinct, because abi = abj for i ≠ j would by the cancellation law entail bi = bj, counter to the assumption that the b's are distinct. Since this list (9) exhausts all of D, the unity element 1 of D must somewhere appear in the list, say as 1 = abi. The corresponding element bi is then the desired inverse of a. Q.E.D.

To actually find the inverse in Zp by this proof, one proceeds by trial of all possible numbers bi in Zp. Inverses can also be computed directly.
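Both procedures for inverting a nonzero element of Zp, trial of all products as in the proof above and direct computation by the Euclidean algorithm, can be sketched as follows (a Python illustration; the function names are ours, not the text's).

```python
def inverse_by_trial(a, p):
    # The method of the proof of Theorem 8: run through the list
    # b = 1, ..., p - 1 until the product ab is the unity 1 of Zp.
    for b in range(1, p):
        if (a * b) % p == 1:
            return b
    raise ValueError("a has no inverse mod p")

def inverse_by_euclid(a, p):
    # Direct computation: solve ax = 1 (mod p) by the extended
    # Euclidean algorithm applied to a and p.
    old_r, r = a % p, p      # remainders
    old_s, s = 1, 0          # coefficients of a
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
    return old_s % p

# 3 * 4 = 12, and 12 leaves remainder 1 mod 11:
assert inverse_by_trial(3, 11) == inverse_by_euclid(3, 11) == 4
```

The trial method takes up to p - 1 multiplications, while the Euclidean method needs only about log p division steps, which is why it is preferred for large p.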
For the equation ax = 1 with a ≠ 0 in Zp is simply another form of the congruence ax ≡ 1 (mod p), and the latter can be solved for the integer x by the Euclidean algorithm methods, as in Theorem 16 of Chap. 1.

It is a remarkable fact that the entire theory of simultaneous linear equations applies to fields in general. Thus, consider the two simultaneous equations

(10)  ax + by = e,  cx + dy = f,

where the letters a, b, c, d, e, f stand for arbitrary elements of the field F. Multiplying the first equation by d, the second by b, and subtracting, we get (ad - bc)x = de - bf. Multiplying the second equation by a, the first by c, and subtracting, we get (ad - bc)y = af - ce. Hence, if we define the determinant of the coefficients of (10) as Δ = ad - bc, and if Δ ≠ 0, then equations (10) have the solution

(10')  x = (de - bf)/Δ,  y = (af - ce)/Δ,

and no other solution. Whereas if Δ = ad - bc = 0, then equations (10) have either no solution or many solutions (the latter eventuality arises when c = ka, d = kb, f = ke, so that the two equations are "proportional").

Gauss Elimination. The preceding device of elimination can be extended to m simultaneous linear equations in n unknowns x1, ..., xn. We will now describe a general process, known as Gauss elimination, for finding all solutions of the given set (system) of equations, of the form

(11)  a11x1 + a12x2 + ··· + a1nxn = b1,
      a21x1 + a22x2 + ··· + a2nxn = b2,
      ·····
      am1x1 + am2x2 + ··· + amnxn = bm.

Here both the known coefficients aij, bi and the unknowns xj are restricted to a specified field F. (In a more compact notation, we write down only the ith equation, indicating its form by a sample term, and state that the equation is to be summed over j = 1, ..., n, by writing

(11')  Σ_{j=1}^{n} aij xj = bi  for i = 1, ..., m, all aij ∈ F.)

The idea is to replace the given system by a simpler system, which is equivalent to the given system in the sense of having precisely the same solutions. We distinguish two cases.

Case 1. Some ai1 ≠ 0. By interchanging two equations (if necessary), we get an equivalent system with a11 ≠ 0. Multiplying the first equation by a11⁻¹, and then subtracting ai1 times the new first equation from each ith equation in turn (i = 2, ..., m), we get an equivalent system of the form

(12)  x1 + a12'x2 + a13'x3 + ··· + a1n'xn = b1',
      a22'x2 + a23'x3 + ··· + a2n'xn = b2',
      ·····

For example, over the field Z11 this would reduce

3x + 5y + 7z = 6
5x + 9y + 6z = 7
2x + y + 4z = 3

to

x + 9y + 6z = 2
8y + 9z = 8
5y + 3z = 10,

where all equations are understood to be modulo 11.

Case 2. Every ai1 = 0. Then x1 is arbitrary for any solution, and, trivially, the system (11') is equivalent to a "smaller" system of m equations in the n - 1 unknowns x2, ..., xn.

We argue by induction on n, the number of unknowns. Proceeding in this way, we obtain

Theorem 9. Any system (11) of m simultaneous linear equations in n unknowns can be reduced to an equivalent system whose ith equation has the form

(13)  xi + c_{i,i+1}x_{i+1} + c_{i,i+2}x_{i+2} + ··· + c_{in}xn = di

for some subset of r of the integers i = 1, ..., n (r ≤ m), plus m - r equations of the form 0 = dk.

Proof. If Case 1 arises, we get m equations of the form (12), and the last m - 1 of these involve only the n - 1 unknowns x2, ..., xn; by the induction assumption they can be reduced to the stated form. If Case 2 always arises, we may get degenerate equations of the form 0 = dk; the degenerate equation 0 · x1 + ··· + 0 · xn = dk is "equivalent" to 0 = dk. Q.E.D.

A system of the form (13) is said to be in echelon form. If one dk ≠ 0, the original system (11) is incompatible (has no solutions), since the equation 0 = dk cannot be satisfied. If all dk = 0, these equations can be ignored, and the given system is said to be compatible.

In the numerical example displayed, 8y + 9z ≡ 8 (mod 11) would first be reduced to y + 8z ≡ 1 (mod 11). Subtracting five times this equation from 5y + 3z ≡ 10 (mod 11), we get 7z ≡ 5 (mod 11), whence z ≡ 7 (mod 11). The echelon form of the given system is thus

x + 9y + 6z = 2
y + 8z = 1
z = 7  (mod 11).

Solutions of any system of the echelon form (13) are easily described. Consider xn, x_{n-1}, x_{n-2}, ..., x1 in succession. If a given xi in this sequence is the first variable in an equation of (13), then it is determined by x_{i+1}, ..., xn from the relation

(13')  xi = di - c_{i,i+1}x_{i+1} - c_{i,i+2}x_{i+2} - ··· - c_{in}xn.

If it is not, then this xi can be chosen arbitrarily. This proves the

Corollary. In the compatible case of Theorem 9, the set of all solutions of (11) is determined as follows. The n - r variables xk not occurring as first variables in (13) can be chosen arbitrarily (they are free parameters); the remaining xi can be computed recursively by substituting in (13').

Solving in the example, we get y ≡ 1 - 8z ≡ 0 (mod 11) and x ≡ 2 - 9y - 6z ≡ 4 (mod 11). The solution x = 4, y = 0, z = 7 can be checked by substituting into the original equations.
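The reduction to echelon form and the back-substitution (13') are entirely mechanical. The following sketch (Python; our own function name, and assuming a square system for which Case 1 always arises, so that the solution is unique) reproduces the worked example over Z11.

```python
def solve_mod_p(A, b, p):
    # Gauss elimination over Zp for a square, nonsingular system:
    # reduce to the echelon form (13), then back-substitute by (13').
    n = len(A)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for i in range(n):
        # Case 1: find an equation with a nonzero coefficient of x_i
        # and interchange it with the ith equation.
        pivot = next(r for r in range(i, n) if M[r][i] % p != 0)
        M[i], M[pivot] = M[pivot], M[i]
        inv = pow(M[i][i], p - 2, p)           # a^(p-2) = a^(-1) mod p (Fermat)
        M[i] = [(inv * v) % p for v in M[i]]   # make the leading coefficient 1
        for r in range(i + 1, n):              # clear the column below
            c = M[r][i]
            M[r] = [(v - c * w) % p for v, w in zip(M[r], M[i])]
    x = [0] * n
    for i in range(n - 1, -1, -1):             # back-substitution (13')
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) % p
    return x

# The example of the text, over Z11:
A = [[3, 5, 7], [5, 9, 6], [2, 1, 4]]
assert solve_mod_p(A, [6, 7, 3], 11) == [4, 0, 7]
```

The intermediate matrices produced by this routine are exactly the displays of the text: after the first column is cleared one has the rows (1, 9, 6 | 2), (0, 8, 9 | 8), (0, 5, 3 | 10) modulo 11.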
A system of equations (11) is homogeneous if the constants bi on the right are all zero. Such a system always has a (trivial) solution x1 = x2 = ··· = xn = 0. There may be no further solutions; but if the number of variables exceeds the number of equations, we have

Theorem 10. A system of m homogeneous linear equations in n variables, with m < n, always has a solution in which not all the unknowns are zero.

For the possible inconsistent equations 0 = d with d ≠ 0 can never arise for homogeneous equations, so the system is compatible; and since r ≤ m < n in the echelon form, the last equation of (13) will always contain an extra variable which can be chosen at will.

Exercises

1. Solve the following simultaneous congruences:
(a) 3x + 2y ≡ 1, 4x + 6y ≡ 3 (all mod 7);
(b) 2x + 7y ≡ 3, 3x + 4z ≡ 6, 4x + 7y + z ≡ 0 (all mod 11);
(c) x - 2y + z ≡ 5, x - 3y + 4z ≡ 1, 2x + 2y ≡ 7 (all mod 13).
2. Find all incongruent solutions of the simultaneous congruences:
(a) x + y + z ≡ 0, 3x + 2y + 4z ≡ 0 (mod 5);
(b) x + y + z ≡ 1, 3x + 3y + 3z ≡ 4 (mod 5).
3. Solve equations (a) and (b) in Ex. 1, with moduli deleted, in the field Q of rational numbers.
4. Solve, in the field Q of rational numbers, the simultaneous equations 2x + 5y + z + 2t = 1, x + 3y + 2s + 6t = 2, x - z + 5t = 4.
5. Solve in Q(√2) the simultaneous equations (1 + √2)x + (1 - √2)y = 2, (2 - √2)x + (3 - √2)y = 1.
6. Prove that if (x1, ..., xn) is any solution of a system of homogeneous linear equations, then so is (-x1, ..., -xn). What can be said about the sum of two solutions?
7. Prove that two equations a1x1 + ··· + anxn = c, b1x1 + ··· + bnxn = d always have a common solution for coefficients in a given field, provided there are no constants k ≠ 0 and m ≠ 0 with kai = mbi for i = 1, ..., n.
*8. (a) Prove that the three simultaneous equations

ax + by + cz = d,
a'x + b'y + c'z = d',
a''x + b''y + c''z = d''

have one and only one solution in any field F if the 3 × 3 determinant

Δ = ab'c'' + a'b''c + a''bc' - ab''c' - a''b'c - a'bc''

is not 0. (b) Compute a formula for x in (a), and use it to show that x = 4 for the three simultaneous linear equations over Z11 displayed below (12).

2.4. Ordered Fields

A field F is said to be ordered if it contains a set P of "positive" elements with the additive, multiplicative, and trichotomic properties listed in §1.3; in other words, a field is ordered if, when considered as a domain, it is an ordered integral domain.

We know by experience that the rational numbers do constitute such an ordered field. Since we have defined equality by convention, we shall now prove this from our construction of rationals as couples of integers, and shall show further that the "natural" method of ordering is the only way of making the rational numbers into an ordered field.

First recall that in any ordered domain a nonzero square b² is always positive. Now the rational number (a, b) was intended to represent the quotient a/b. If a quotient a/b is positive, the product (a/b)b² = ab must therefore also be positive, and conversely. Hence we define a rational number (a, b) to be positive if and only if the product ab is positive in Z; in other words,

(14)  a/b > 0 if and only if ab > 0.

Theorem 11. The rational numbers form an ordered field if (a, b) > 0 is defined to mean that the integer ab is positive.

Proof. Since we have defined equality by convention, we must prove that equals of positive elements are positive: (a, b) = (c, d) and (a, b) > 0 imply (c, d) > 0. This is true, since ab has the same sign as abd², while abd² = b²cd in virtue of the hypothesis ad = bc; hence cd has the same sign as b²cd, and so the same sign as ab.

Positiveness also has the requisite additive, multiplicative, and trichotomic properties. For instance, the sum of two positive couples (a, b) and (c, d) is (ad + bc, bd), and this is positive: since ab > 0 and cd > 0 imply d²ab > 0 and b²cd > 0, we get (ad + bc)bd = abd² + b²cd > 0. Q.E.D.

Since the proof of Theorem 11 involves only the assumption that the integers are an ordered domain, it in fact establishes the following more general result.

Theorem 12. The field of quotients Q(D) of an ordered integral domain D may be ordered by the stipulation that a quotient a/b of elements a and b of D is positive if and only if ab is positive. This is the only way in which the order of D may be extended to make Q(D) an ordered field.

The uniqueness holds because the definition (14) of "positive" for fractions agrees with the natural order of the special fractions (a, 1) which represent integers; and in any ordered field, if a quotient a/b is positive, then the product (a/b)b² = ab must also be positive, and conversely.

There are many other ordered fields: the field of real numbers, the field Q(√2) of numbers a + b√2, and other subfields of the real number field. In any such field an absolute value can be introduced as in §1.3, and the properties of inequalities established there will hold. In any ordered field, in addition to the rules valid in any ordered domain, one may prove:

(15)  0 < 1/a if and only if a > 0;
(16)  a/b < c/d if and only if abd² < b²cd;
(17)  0 < a < b implies 0 < 1/b < 1/a;
(18)  a < b < 0 implies 0 > 1/a > 1/b;
(19)  a1² + a2² + ··· + an² ≥ 0.

The two rules (17) and (18) are the usual ones for the division of inequalities. The rule (19) that a sum of squares is never negative (Theorem 2, §1.3) is especially useful.

Note. In any ordered field, if a ≠ b, then (a - b)² > 0, whence a² - 2ab + b² > 0, which gives a² + b² > 2ab. Setting x = a², y = b², and dividing by 2, we get

(x + y)/2 > √(xy)  (x ≠ y).

This states that the arithmetic mean of two distinct positive real numbers exceeds their geometric mean √(xy).
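Rules such as (14), (17), and (19) can be spot-checked in Q by exact rational arithmetic. A small Python illustration (the helper `positive` is ours, not part of the text) follows.

```python
from fractions import Fraction as F

def positive(a, b):
    # (14): the rational couple (a, b) is positive exactly when ab > 0 in Z.
    return a * b > 0

# (14) agrees with the usual order of Q, even with both terms negative:
assert positive(-2, -3) and F(-2, -3) > 0
assert not positive(2, -3) and F(2, -3) < 0

# (17): 0 < a < b implies 0 < 1/b < 1/a, checked on a sample:
a, b = F(2, 5), F(7, 3)
assert 0 < a < b and 0 < 1 / b < 1 / a

# (19): a sum of squares is never negative:
assert sum(t * t for t in (F(-1, 2), F(3, 4), F(0))) >= 0
```

Of course such checks on samples prove nothing; Ex. 2 below asks for actual proofs of (15)-(19).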
Exercises

1. Assuming that the integers form an ordered domain, prove that the product of two positive rational numbers is positive.
2. Prove formulas (15)-(19) of the text.
3. A common mistake in arithmetic is the assumption that a/b + a/c = a/(b + c). (a) Show that in any field, a/b + a/c = a/(b + c) implies a = 0 or b² + bc + c² = 0. (b) Show that in an ordered field, it implies a = 0.
4. For the rational numbers (or, more generally, for the field of quotients Q(D) of an ordered domain D), prove that if a < b, there are infinitely many x satisfying a < x < b. (Hint: Consider (a + b)/2.)
5. Prove that in no ordered field do the positive elements form a well-ordered set.
6. (a) Prove: any subfield of an ordered field is an ordered field. (b) Is any subdomain of an ordered field an ordered domain?
7. If n is a positive integer and a and b positive rational numbers, prove that (aⁿ + bⁿ)/2 ≥ ((a + b)/2)ⁿ.
8. Prove |xx' + yy'| ≤ √((x² + y²)(x'² + y'²)) in any ordered field in which all positive numbers have square roots. (Hint: Square both sides.)
9. Prove similarly that if D is an ordered domain, then for each (a, b) ≠ 0 in Q(D) just one of the two alternatives (a, b) > 0, -(a, b) > 0 holds.

*2.5. Postulates for the Positive Integers

Although we have used the domain Z of all integers as the starting point for our review of the basic number systems of mathematics, this procedure is really quite sophisticated, because it assumes that negative numbers exist. In the rest of this chapter we shall show how this assumption can be avoided, by showing how to derive the negative integers and their properties from familiar facts about positive integers alone.

For consistency, we begin by listing some basic properties of the system Z+ of all positive integers that follow easily from the results of Chap. 1.

* Sections which are starred may be omitted without loss of continuity.

Theorem 13. The system Z+ of all positive integers in Z has the following properties:
(i) It is closed under uniquely defined binary operations of addition and multiplication, which are associative, commutative, and distributive.
(ii) There exists a multiplicative identity 1 in Z+, such that m · 1 = m for all m in Z+.
(iii) Furthermore, the following cancellation law holds in Z+:

(20)  if mx = nx, then m = n.

(iv) Again, for any two elements m and n of Z+, exactly one of the following alternatives holds: m = n, or m + x = n has a solution x in Z+, or m = n + y has a solution y in Z+.
(v) Finally, the Principle of Finite Induction holds in Z+: any subset of Z+ which contains 1, and n + 1 whenever it contains n, contains every element of Z+.
We leave the proof of these properties of Z+ as an exercise. Conversely, if the properties (i)-(v) stated in this theorem are viewed as postulates, they do completely characterize the positive integers, in the sense that the positive integers as we previously defined them do have these properties, and that any other system satisfying these postulates can be proved to be isomorphic to this system of positive integers.

Note in particular that, by (iv), n = m + x is incompatible with m + z = n + z: for if n = m + x, then, appealing a third time to property (iv),

n + z = (m + x) + z = m + (x + z) = m + (z + x) = (m + z) + x,

whence m + z = n + z is impossible. Similarly, m = n + y is incompatible with m + z = n + z. Therefore we obtain

(21)  if m + z = n + z, then m = n.

Starting with the positive integers, one may reconstruct the system Z of all integers. The object of this construction is to get a system larger than Z+ in which subtraction will always be possible. Hence we introduce as new elements certain couples (m, n) of positive integers, where each couple is to behave as if it were the solution of the equation n + x = m. The details of this construction resemble the construction of the rationals from the integers (§2.2); the three alternatives of (iv) about the equations m + x = n take the place of some of the order properties of the positive integers.

Definition. An integer is a couple (m, n) of positive integers m and n. "Equality" of couples is defined by the convention

(22)  (m, n) = (r, s)  means  m + s = n + r,

while sums and products are defined by

(23)  (m, n) + (r, s) = (m + r, n + s),
(24)  (m, n) · (r, s) = (mr + ns, ms + nr).

Furthermore, a couple (m, n) is "positive" if and only if n + x = m for some positive integer x. Note in particular that if n + x = m in Z+, the couple (m, n) = (n + x, n) behaves as the positive integer x.

The couples as introduced by these definitions do in fact satisfy all the postulates we have given for the integers. One must first verify that the equality introduced by (22) is reflexive, symmetric, and transitive, and that sums and products as given by (23) and (24) are uniquely determined in the sense of this equality. With this proved, if "congruent" couples are actually identified, we know that the couples form an integral domain; the various formal laws follow by a systematic application of the definitions (23) and (24), much as in the discussion of rational numbers. The cancellation law for the multiplication of couples is harder to prove; one proof uses condition (iv) of Theorem 13.

The couple (1, 1) is a zero for the system just defined, and additive inverses exist, since

(m, n) + (n, m) = (m + n, n + m) = (1, 1);

hence subtraction is always possible. By the postulate (iv) in Theorem 13, every couple can be written in just one of the three forms (m, m), (n + x, n), (m, m + y). Those of the first form are equal to the zero (1, 1); those of the second form (n + x, n) are the positive couples, and may be shown to have the additive, multiplicative, and trichotomic properties required in the definition of an ordered integral domain (§1.3), since by definitions (23) and (24),

(m + x, m) + (n + y, n) = (m + n + x + y, m + n),
(m + x, m)(n + y, n) = (mn + my + nx + xy + mn, mn + nx + mn + my).

Moreover, the correspondence x ↦ (n + x, n), where (n + x, n) = (n + y, n) if and only if x = y, is a one-one correspondence between the given positive integers x and the new positive couples. It is even a monomorphism, since by (23) and (24) it carries sums to sums and products to products. Hence the new "positive" couples satisfy the law of finite induction; in particular, this result implies the well-ordering principle.

We have thus sketched a proof of the following result:

Theorem 14. The system Z+ of positive integers can be embedded in a larger system Z in which subtraction is possible, in such a way that any element of Z is a difference of two positive integers from Z+. The system Z thus constructed is an ordered domain whose positive elements satisfy the Principle of Finite Induction.

It should be noticed that the proof just sketched involves only our postulates on Z+. Conversely, in any integral domain containing Z+, the differences (a - b) of elements of Z+ must satisfy the definitions (22)-(24).
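The couple arithmetic (22)-(24) can likewise be checked mechanically. In the following Python sketch (the names are ours), each couple (m, n) of positive integers stands for the intended difference m - n.

```python
# Couples (m, n) of positive integers, intended as "m - n",
# with equality (22), sum (23), and product (24).

def eq(c, d):
    (m, n), (r, s) = c, d
    return m + s == n + r                    # (22)

def add(c, d):
    (m, n), (r, s) = c, d
    return (m + r, n + s)                    # (23)

def mul(c, d):
    (m, n), (r, s) = c, d
    return (m * r + n * s, m * s + n * r)    # (24)

def is_positive(c):
    m, n = c
    return m > n        # (m, n) is positive iff n + x = m for some x in Z+

zero = (1, 1)
# The additive inverse of (m, n) is (n, m):
assert eq(add((5, 2), (2, 5)), zero)
# (m + 1, m) acts as a multiplicative identity, e.g. (2, 1):
assert eq(mul((2, 1), (7, 3)), (7, 3))
# (5, 2) represents 3 and is positive; (1, 4) represents -3 and is not:
assert is_positive((5, 2)) and not is_positive((1, 4))
```

As a further check, (3, 1) · (1, 3), representing 2 · (-2), is congruent to (1, 5), representing -4; the sign rules for negative integers thus fall out of (24) instead of being postulated.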
With this, we also obtain

Theorem 15. Any integral domain containing the system Z+ contains a subdomain isomorphic to the domain Z of all integers.

Exercises

1. Prove Theorem 13 in detail.
2. Prove that the relation defined by (22) is reflexive, symmetric, and transitive.
3. Prove that the "addition" defined by (23) is commutative and associative.
4. Prove the same for the "multiplication" defined by (24).
5. Prove the distributive law.
6. Prove that (m, m) is the same for all m, and is an additive zero.
7. Prove that (m + 1, m) is a multiplicative identity.
8. Prove the cancellation law for multiplication.
9. Prove that if (m, n) · (r, s) = (m', n') · (r, s) for all (r, s), then (m, n) = (m', n').
10. What properties of Z+ have been used in Exs. 1-8?
11. Show that Theorem 14 would not hold for any definition of "positiveness" of couples (m, n) other than that stated after (24).
*12. Show that postulate (iv) of Theorem 13 may be replaced by the requirement that m + 1 ≠ 1 for every m of Z+.
13. In Z+, define m < n to mean that m + x = n for some x ∈ Z+. Prove: (a) m < n and n < r imply m < r; (b) m < m for no m; (c) m < n implies m + r < n + r for all r; (d) m < n implies mr < nr for all r.
*14. Show that conditions (c) and (d) of Ex. 13 may be used to replace the cancellation laws (20) and (21) in the list of postulates for Z+.
15. Show that repetition of the process used to obtain Z from Z+ yields no new extension of Z. Can you generalize this result?
16. State a theorem bearing the same relation to Theorems 14 and 15 that Theorem 7 bears to Theorem 5.

*2.6. Peano Postulates

Instead of regarding addition and multiplication as undefined operations on the set P = Z+ of positive integers, one can define them in terms of the successor function

(25)  S(n) = n + 1.
Theorem 16. The set P of positive integers and the successor function S have the following properties:
(i) 1 ∈ P;
(ii) if n ∈ P, then S(n) ∈ P;
(iii) for no n in P is S(n) = 1;
(iv) for n and m in P, S(n) = S(m) implies n = m;
(v) a subset of P which contains 1, and which contains S(n) whenever it contains n, must equal P.

Proof. The cited properties are immediate from Theorem 13. Note, in particular, that (v) is the Principle of Finite Induction. Q.E.D.

The properties (i)-(v) are known as the Peano postulates for the positive integers. They suffice, as will be shown below, to prove all the properties of the positive integers. We shall now use them to show that our original postulates for the integers determine the integers up to isomorphism.

Theorem 17. In any ordered domain D, there is a unique subset P' which satisfies the Peano postulates with respect to the unity 1' and the successor function S'(a) = a + 1'.

Proof. The set D+ of all positive elements of D clearly contains 1' and satisfies (i) and (ii). Now let T range over the class of all subsets of D+ which have the properties (i) and (ii) of P; we define P' to be the intersection of all these sets T. By definition, a ∈ P' if and only if a is in every such set T. Clearly (i) and (ii) hold for P'. Since P' consists only of positive elements, and since 1' < 1' + 1' < 1' + 1' + 1' < ···, (iii) holds. Since a + 1' = b + 1' implies a = b, (iv) holds. To prove (v), let A be a subset of P' which contains 1' and contains S'(a) whenever it contains a. Then A is one of the sets T used above; hence P' is contained in A, and therefore P' = A. This proves (v) for P', and (v) shows that P' is the only possible such set. Q.E.D.

Remark. Intuitively, it is clear that the sequence 1', 2', 3', ···, defined by 2' = 1' + 1', 3' = 1' + 1' + 1', etc., should yield the required isomorphism 1 ↔ 1', 2 ↔ 2', ···; in particular, this correspondence should preserve order. But we wish a formal proof, based on our postulates for ordered domains.

Theorem 18. The subset P' of Theorem 17 is isomorphic to the set P of positive integers with respect to addition, multiplication, and order.

Proof. First let Q(n) be the proposition that there is a unique correspondence x ↦ φn(x) between the integers 1 ≤ x ≤ n in P and elements φn(x) in P' under which

(26)  φn(1) = 1',  φn(S(x)) = S'(φn(x)) for 1 ≤ x < n.

Clearly Q(1) holds. Given Q(n), and hence a φn, we can construct a unique φ_{n+1} by setting φ_{n+1}(x) = φn(x) for 1 ≤ x ≤ n and φ_{n+1}(n + 1) = S'(φn(n)). Hence Q(n) implies Q(n + 1). This proves Q(n) for all n, by induction. Moreover, if 1 ≤ x ≤ n ≤ m, one can prove by induction on x that φn(x) = φm(x); hence φn(x) is independent of n, provided that x ≤ n. Let φ(x) denote this element of P'. This gives a correspondence x ↦ φ(x) of P to P' with the properties

(27)  φ(1) = 1',  φ(S(x)) = S'(φ(x)).

Every element of P' is the correspondent φ(x) of some x ∈ P: for the set of elements φ(x) includes 1', and includes with any φ(x) its successor; hence the set is all of P' by property (v) of P'.

φ is an isomorphism with respect to addition and multiplication. Both in P and in P' we have

(28)  n + 1 = S(n),  n + S(m) = S(n + m),
(29)  n · 1 = n,  n · S(m) = n · m + n.

From these equations and (27), one can easily prove by induction on m that φ(n + m) = φ(n) + φ(m) and φ(nm) = φ(n)φ(m).

φ preserves order. Both in P and in P', m < n means that n - m is positive; that is,

(30)  m < n if and only if n = m + k for some k in P.

Hence m < n yields n = m + k, hence φ(n) = φ(m) + φ(k); since φ(k) is positive in D, this proves φ(m) < φ(n). Indeed, φ is a bijection of P to P': since we already know that the φ(x) include all of P', we need only show that n ≠ m implies φ(n) ≠ φ(m). But n ≠ m means, say, that m < n; hence φ(m) < φ(n), and therefore φ(n) ≠ φ(m). Q.E.D.

To summarize our conclusions, we define an order-isomorphism between two ordered domains to be an isomorphism which preserves order. In view of Theorem 15, we get from Theorem 18 the following corollary.

Corollary 1. Any ordered domain D contains a subdomain order-isomorphic with Z.
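That the recursions (28) and (29) really do determine complete addition and multiplication tables can be seen concretely. The following Python sketch (an illustration, not the formal argument) defines + and · from the successor function alone.

```python
# Addition and multiplication on the positive integers, defined only
# from the successor function S(n) = n + 1 by the recursions (28), (29).

def S(n):
    return n + 1

def plus(n, m):
    # (28): n + 1 = S(n), and n + S(m) = S(n + m)
    if m == 1:
        return S(n)
    return S(plus(n, m - 1))

def times(n, m):
    # (29): n * 1 = n, and n * S(m) = n * m + n
    if m == 1:
        return n
    return plus(times(n, m - 1), n)

assert plus(3, 4) == 7
assert times(3, 4) == 12
```

The recursion terminates because the second argument decreases to 1, mirroring the inductions on m in the proof of Theorem 18; the exercises below ask for proofs that the operations so defined are associative, commutative, and distributive.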
Combining this result with Theorems 6 and 7, we have

Corollary 2. Any ordered field contains a subfield order-isomorphic with the field Q of rational numbers.

This result gives an abstract characterization of the rational field as the smallest ordered field.

Finally, in case the positive elements of D are well-ordered, the set P' of Theorem 17 can easily be shown to consist of all the positive elements of D. This shows that the postulates we have used for the integers determine the integers uniquely, up to order-isomorphism. This proves:

Corollary 3. There is, up to order-isomorphism, only one ordered domain Z whose positive elements form a well-ordered set.

But conversely, the treatment of the integers can begin, not with the postulates for a well-ordered domain, but with the Peano postulates. The essential point is the observation that the recursive equations (28) and (29) can be used to define complete addition and multiplication tables. Formally, one can establish, much as in the proof of Theorem 15, that there is one and only one binary operation + satisfying (28), and similarly for multiplication. The various properties listed in Theorem 13 can then be established by induction, and the construction of couples given in §2.5 then yields the integers from the Peano postulates.

Exercises

In the following exercises, assume only the Peano postulates, with addition and multiplication defined by (28) and (29).

1. Prove that addition is associative.
2. Show by induction that n + 1 = 1 + n.
3. Using Ex. 2, show that addition is commutative.
4. Prove the distributive law.
5. Prove that multiplication is associative.

3
Polynomials

3.1. Polynomial Forms

Let D be any integral domain, and let x be any element of a larger integral domain E which contains D as a subdomain. In E one can form sums, differences, and products of x with the elements of D and with itself. By performing these operations repeatedly, one evidently gets all expressions of the form

(1)  a0 + a1x + a2x² + ··· + anx^n,  a0, a1, ..., an ∈ D,

where x^n (n any positive integer) is defined as xx ··· x to n factors.
The integer n is called the degree of the form (1), provided an ≠ 0. Since nothing is assumed known about the symbol x, the expression (1) is also often called a polynomial form (to distinguish it from a polynomial function; see §3.2), and the symbol x itself is called an indeterminate.

Theorem 2. If addition and multiplication are defined by formulas (2) and (3), then the different polynomial forms in x over any integral domain D form a new integral domain D[x] containing D.

Proof. To prove the commutative, associative, and distributive laws it is convenient to introduce "dummy" zero coefficients (ai = 0 if i > m, bj = 0 if j > n). This changes (2) and (3) to the simpler forms

(2')  p(x) ± q(x) = sum over k of (ak ± bk)x^k,
(3')  p(x)q(x) = sum over k of (sum over i + j = k of aibj)x^k,

where all but a finite number of coefficients are zero. Any law such as the distributive law may then be verified simply by multiplying out both sides of the law by rules (2') and (3'), and showing that the coefficient of each power x^k of x is the same in both expressions. The properties of 0 and 1 and the existence of additive inverses follow readily from (2) and (3). The absence of zero-divisors (cancellation law of multiplication) follows from (3), since the leading coefficient ambn of the product of two nonzero polynomial forms is the (nonzero) product of the nonzero leading coefficients am and bn of its factors. Similar arguments complete the proof of Theorem 2.

Now recalling Theorem 7 of Chap. 2, we see that if we define a rational form in the indeterminate x over D as a formal quotient

p(x)/q(x) = (a0 + a1x + ... + amx^m)/(b0 + b1x + ... + bnx^n)  (ai, bj in D, bn ≠ 0)

of polynomial forms with non-zero denominator, we get a field. This field is denoted by D(x). We have thus:

Corollary. The rational forms in an indeterminate x over any integral domain D constitute a field.
Exercises

1. Reduce (1 + x + 2x^2 + 3x^3) - (0 + x + x^2 + 3x^3) to the form (1), stating which postulates are used at each step.
2. Reduce to the form (1): (x^2 + 5x - 4)(x^2 - 2x + 3).
3. Compute similarly (3x^3 + 5x - 1/2)(x^3 - x/2) and (3x^2 + 7x - 4)(4x^3 - 5x(3x + 7)^2).
4. (a) Is 1/2 + 3x^(1/2) + 5x a polynomial form over the rational field? (b) Why is x^3 · x^4 not equal to x^2 in the domain of polynomial forms with coefficients in Z5?
5. Discuss the following statements: (a) The degree of the product of two polynomial forms is the sum of the degrees of the factors. (b) The degree of the sum of two polynomials is the larger of the degrees of the summands.
6. Prove that the associative laws for addition and multiplication hold in D[x].
7. The "formal derivative" of p(x) = a0 + a1x + ... + anx^n is defined as p'(x) = a1 + 2a2x + ... + nanx^(n-1). Prove, over any integral domain: (a) (cp)' = cp'; (b) (p + q)' = p' + q'; (c) (pq)' = pq' + p'q; (d) (p^n)' = np^(n-1)p'.
8. If p(y) and q(x) are polynomial forms in indeterminates y and x, show that the substitution y = q(x) yields a polynomial p(q(x)).
*9. For the formal derivative of Ex. 7, prove that [p(q(x))]' = p'(q(x)) · q'(x).
*10. (a) If D is an ordered domain, show that the polynomial forms (1) constitute an ordered domain D[x] if p(x) > 0 is defined to mean that the first nonzero coefficient ak in p(x) is positive in D. (b) Show that D[x] is also an ordered domain if we define p(x) > 0 to mean that an > 0 in (1). Setting D = Z in Ex. 10(b), show that 1 is the least "positive" polynomial in Z[x], although Z[x] fails to satisfy the well-ordering principle.
*11. For given D, show how to construct an integral domain D{t} consisting of all "formal" infinite power series a0 + a1t + a2t^2 + ..., in a symbol t, with coefficients ai in D, and define equality, addition, and multiplication by (5), (6), and (7) of the Definition of §2.1.
3.2. Polynomial Functions

As before, let D be any integral domain, and let

f(x) = a0 + a1x + ... + amx^m

be any polynomial form in x over D. If the indeterminate x is replaced by an element c in D, f(x) becomes an ordinary function: "If x is given (as c), then f(x) is determined (as f(c))." By abstraction, we shall define generally a "function" f of a variable on D as a rule assigning to each element x of D a "value" f(x), also in D. A constant function is one whose value b is independent of x; the identity function is the function j with j(x) = x for all x. We shall define two such functions to be equal (in symbols, f = g) if and only if f(x) = g(x) for all x. The sum h = f + g, the difference q = f - g, and the product p = fg of two functions are defined by the rules

h(x) = f(x) + g(x),  q(x) = f(x) - g(x),  p(x) = f(x)g(x),  for all x.

Since the only rules used in deriving formulas (2) and (3) are valid in any integral domain, sums and products of polynomial functions can also be computed by formulas (2) and (3). It follows that the polynomial functions over D constitute a commutative ring in the sense defined in §1.10. Therefore there is certainly a mapping, from the polynomial forms to the polynomial functions over any given integral domain D, which preserves sums and products. (Such correspondences are called homomorphisms onto, or epimorphisms; see §3.3.) If we could be certain that the mapping was one-one, we would know that it was an isomorphism, and from the point of view of abstract algebra it would be permissible to forget the distinction between polynomial forms and polynomial functions. Unfortunately, such is not the case: equality has an effectively different meaning for functions than it does for forms.
Formulas (2) and (3) are identities: they hold no matter what value c (in D) is assigned to the indeterminate* x. Hence each form (1) determines a unique polynomial function, for f(x) no longer remains an empty expression: it can be evaluated as a definite member a0 + a1c + ... + amc^m of D. Conversely, each polynomial function is determined by at least one such form.

* Indeed, this is the secret of solving equations by letting "x be the unknown quantity": every manipulation allowed on x must be true for every possible value of x, if x is regarded as an independent variable in the sense of the calculus.

Definition. A polynomial function is a function which can be written in the form (1).

For example, over the field Z3 of integers mod 3, the distinct forms f(x) = x^3 - x and g(x) = 0 determine the same function, namely the function which is identically zero. By Fermat's theorem (Theorem 18 of Chap. 1), the same is true over Zp for the forms x^p - x and 0. We could not construct such an example over the field of rationals; we shall now show that it is no accident that the domain of coefficients in the preceding example is finite. But before doing this, we recall some elementary definitions.

By the degree of a nonzero form (1) we mean n, its biggest exponent. The term anx^n of biggest degree is called its leading term, and an its leading coefficient. If an = 1, the polynomial is termed monic.
Theorem 3. A polynomial form r(x) over a domain D is divisible by x - a if and only if r(a) = 0.

Here the statement "r(x) is divisible by x - a" means that r(x) = (x - a)s(x) for some polynomial form s(x) over D.

Proof. If r(x) = (x - a)s(x), then substituting a for x gives r(a) = 0. Conversely, set r(x) = c0 + c1x + ... + cnx^n (cn ≠ 0). Then, by high school algebra,

r(x) - r(a) = sum over k of ckx^k - sum over k of cka^k
            = sum over k of ck(x^k - a^k)
            = sum over k of ck[(x - a)(x^(k-1) + x^(k-2)a + ... + a^(k-1))].

Hence, if r(a) = 0, then r(x) = (x - a)s(x), where s(x) is a polynomial form of degree n - 1.

Corollary. A polynomial form r(x) of degree n over an integral domain D has at most n zeros in D. (By a zero of r(x) is meant an element a in D such that r(a) = 0, that is, a root of the equation r(x) = 0.)

Proof. If a is a zero, then by the theorem, r(x) = (x - a)s(x), where s(x) has degree n - 1. Since D is an integral domain, r(b) = (b - a)s(b) = 0 if and only if b = a or s(b) = 0. By induction, s(x) has at most n - 1 zeros. Hence r(x) = 0 has at most n zeros.

Theorem 4. If an integral domain D is infinite, then two polynomial forms over D which define the same function have identical coefficients.

Note that Theorem 4 never holds if D is a finite integral domain: if D is finite, with elements a1, ..., an, then the monic polynomial form (x - a1)(x - a2) ... (x - an) of degree n determines the same function as the form 0.
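The Z3 example can be checked mechanically. The following sketch (the helper name and the coefficient-list representation are our own conventions, not the book's) confirms that the distinct forms x^3 - x and 0 determine the same function on Z3, and that x^p - x vanishes identically on Zp:

```python
def evaluate(coeffs, c, p):
    """Value at c of the form given by a coefficient list [a0, a1, ...], computed in Zp."""
    return sum(a * c**k for k, a in enumerate(coeffs)) % p

f = [0, -1, 0, 1]                               # the form x^3 - x
assert [evaluate(f, c, 3) for c in range(3)] == [0, 0, 0]   # equals the zero function on Z3

# By Fermat's theorem, x^p - x is identically zero on Zp for every prime p:
for p in (2, 3, 5, 7):
    g = [0, -1] + [0] * (p - 2) + [1]           # the form x^p - x
    assert all(evaluate(g, c, p) == 0 for c in range(p))
print("x^3 - x and 0 are distinct forms but equal functions on Z3")
```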
Proof of Theorem 4. Let p(x) and q(x) be two given forms in the indeterminate x. If they determine the same function, then p(a) = q(a) for every element a chosen from D. In terms of the difference r(x) = p(x) - q(x), this is to say that

r(a) = c0 + c1a + ... + cna^n = 0  for all a in D;

the desired conclusion is then that p(x) and q(x) have the same degree and have corresponding coefficients equal, that is, that c0 = c1 = ... = cn = 0. But unless the coefficients ci are all zero, the polynomial r(x) is zero for at most n values of x, by the corollary of Theorem 3; whence, since D is infinite, there will be remaining values of x on which r(x) ≠ 0. This conclusion proves the theorem.

Thus, if D is infinite, the concepts of polynomial function and polynomial form are equivalent (technically, the ring of polynomial functions is isomorphic to that of polynomial forms). Since any system isomorphic with an integral domain is itself an integral domain, Theorem 4 implies the following corollary:

Corollary. The polynomial functions on any infinite integral domain themselves form an integral domain. If D is an infinite field, the rational functions on D form a field; in this case, distinct rational forms define distinct rational functions.

(Caution: A rational function is not defined at all points, but only where the denominator is not zero. Since the denominator has at most a finite number of zeros, a rational function is defined at all but a finite number of points.)

It is often desired to find a polynomial p(x) of minimum degree which assumes given values y0, y1, ..., yn in a field F at n + 1 given points a0, a1, ..., an in F, so that

(4)  p(ai) = yi  (i = 0, 1, ..., n).

This is called the problem of polynomial interpolation. To solve this problem, consider the polynomials

qi(x) = (x - a0) ... (x - a(i-1))(x - a(i+1)) ... (x - an),

the product of all the factors x - aj with j ≠ i. Then qi(aj) = 0 for j ≠ i, while qi(ai), the product of the differences ai - aj with j ≠ i, is not zero, since ai ≠ aj if i ≠ j.
Hence ci = yi/qi(ai) makes ciqi(x) assume the value yi at ai and 0 at every other aj, and the following polynomial of degree n or less

(5)  p(x) = sum over i = 0, ..., n of yi qi(x)/qi(ai)

satisfies equations (4). Formula (5) is called Lagrange's interpolation formula. In view of Theorem 3, at most one polynomial of degree n or less can satisfy equations (4): the difference of two such polynomials would have n + 1 zeros, and so would be the polynomial form zero. This proves the following result.

Theorem 5. There is exactly one polynomial form of degree n or less which assumes given values at n + 1 distinct points.

Exercises

1. In the domain Z5, find a second polynomial form determining the same function as x^2 - x + 1.
2. Show that x^2 - 1 has four zeros over Z15. Why doesn't this contradict the corollary of Theorem 3?
3. Find a cubic polynomial f(x) = a + bx + cx^2 + dx^3 satisfying f(0) = 0, f(1) = 1, f(2) = 0, f(3) = 1, by treating a, b, c, d as unknowns in four equations, of which the last is a + 3b + 9c + 27d = 1. (This is the method of undetermined coefficients.)
4. Show that if a0 = a1 - h and a2 = a1 + h, then (4) can be solved for n = 2 by the parabolic interpolation formula

p(x) = y1 + (1/2)(y2 - y0)((x - a1)/h) + (1/2)(y2 - 2y1 + y0)((x - a1)/h)^2.

5. Use the interpolation formula (5) to show that every function on any finite field (such as Zp) is equal to some polynomial function.
*6. Let D be a finite integral domain with n elements a1, ..., an, and let m(x) denote the fixed polynomial form (x - a1) ... (x - an). (a) Show that if two polynomial forms f(x) and g(x) determine the same function, then m(x) is a divisor of the form f(x) - g(x). (b) Compute m(x) for the domains Z3 and Z5. (c) Show that m(x) = x^p - x in case D = Zp. (Hint: Use Fermat's theorem.)
7. Prove that over an infinite field, distinct rational forms which determine the same functions are formally equal in the sense of §2.1.
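Over a finite field Zp, formula (5) is completely mechanical, since division by qi(ai) becomes multiplication by its inverse mod p. The sketch below (function names ours; it assumes p prime and distinct points ai) builds the interpolant as a coefficient list:

```python
def lagrange_interpolate(points, p):
    """Coefficient list (constant term first) of the unique polynomial of
    degree < n through the n given (a_i, y_i) pairs, over the field Zp."""
    n = len(points)
    total = [0] * n
    for i, (ai, yi) in enumerate(points):
        qi = [1]                                  # q_i(x) = product over j != i of (x - a_j)
        for j, (aj, _) in enumerate(points):
            if j != i:
                new = [0] * (len(qi) + 1)         # multiply qi by (x - aj)
                for k, c in enumerate(qi):
                    new[k] = (new[k] - aj * c) % p
                    new[k + 1] = (new[k + 1] + c) % p
                qi = new
        qi_at_ai = sum(c * ai**k for k, c in enumerate(qi)) % p
        scale = yi * pow(qi_at_ai, p - 2, p)      # divide by q_i(a_i) in Zp
        total = [(t + scale * c) % p for t, c in zip(total, qi + [0] * n)]
    return total

def evaluate(coeffs, c, p):
    return sum(a * c**k for k, a in enumerate(coeffs)) % p

pts = [(0, 2), (1, 3), (2, 1)]
f = lagrange_interpolate(pts, 5)
print(f)   # [2, 0, 1], i.e. x^2 + 2 over Z5
assert all(evaluate(f, a, 5) == y for a, y in pts)
```

This also makes Ex. 5 concrete: any table of values on Zp can be matched by such a polynomial.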
One can prove each law for D* from the corresponding law for D by the simple t Some authors omit condition (iii) in defining commutative rings.O. identity elements exist for addition and multiplication.1. Homomorphisms of Commutative Rings Let D be any given integral domain. 0 + f(x) = f(x).. Definition. where V[x] and V'[y] are the domains of polynomial forms in indeterminates x and y over V and V'.a\)(x . (ii)" an additive identity (zero) 0 and additive inverses exist. and in which further: (i) multiplication is distributive over addition.10.0. called addition and multiplica tion. an interesting family of finite commutative rings Zm was constructed in § 1. D(x) satisfies all the postulates for an integral domain except the cancellation law of multiplication. g ¥. §2. For convenience. associative. A commutative ring is a set closed under two binary. (x -an) of nonzero factors.x 2 = 0 for all x. if D is any ordered domain.. For all xED. and distributive.a2) . and if we define f(x) = Ix I + x and g(x) = Ix 1. 1· f(x) = f(x). where addition and mUltiplication are defined as in §3. for any fixed modulus m. they become <I>(a + b) == <I>(a) + <I>(b) and <I>(ab) == <I>(a)<I>(b) instead. and which also contains the unity of A.Ch. Evidently. so that e is a mUltiplicative identity (unity) of D* . This gives a simple algebraic characterization of the concept of a polynomial function. Definition. (See why the cancellation law of multiplication cannot be proved in this way. (2) contains all constant functions and the identity function. is a homomorphism Z -+ Z. b E R. Now let us define (by analogy with "subdomain") a subring of a commutative ring A as a subset of A which contains. an isomorphism is just a homomorphism which is bijective (one-one and onto) . also f ± g and fg. One easily verifies that the function from n to the residue class containing n. 3 70 Polynomials device of writing "for all x" in the right places. 
One can prove each law for D* from the corresponding law for D by the simple device of writing "for all x" in the right places. Thus f(x) + g(x) = g(x) + f(x) for all x implies f + g = g + f. Again, if we define e as the constant function e(x) = 1 for all x, then e(x)f(x) = 1 · f(x) = f(x) for all x and f, whence ef = f for all f, so that e is a multiplicative identity (unity) of D*. (See why the cancellation law of multiplication cannot be proved in this way.) Since the cancellation law for multiplication was nowhere used in the above, we may assert:

Lemma 1. The functions on any commutative ring A themselves form a commutative ring.

Now let us define (by analogy with "subdomain") a subring of a commutative ring A as a subset of A which contains, with any two elements f and g, also f ± g and fg, and which also contains the unity of A. Evidently the set D(x) of polynomial functions on any integral domain D (1) is a subring of the ring D* of all functions on D, (2) contains all constant functions and the identity function, and (3) is contained in any other such subring. In this sense D(x) is the subring of D* generated by the constant functions and the identity function. This gives a simple algebraic characterization of the concept of a polynomial function.

Deeper insight into commutative rings can be gained by generalizing the notion of isomorphism as follows.

Definition. A function φ, a ↦ aφ, from a commutative ring R into a commutative ring R' is called a homomorphism if and only if it satisfies, for all a, b in R,

(6)  (a + b)φ = aφ + bφ,
(7)  (ab)φ = (aφ)(bφ),

and carries the unity of R into the unity of R'.

These conditions state that the homomorphism preserves addition and multiplication. They have been written in the compact notation of §§1.11-1.12, whereby aφ signifies the transform of a by φ; if we write φ(a) instead of aφ, they become φ(a + b) = φ(a) + φ(b) and φ(ab) = φ(a)φ(b) instead. Evidently, an isomorphism is just a homomorphism which is bijective (one-one and onto).
One easily verifies that the function from n to the residue class containing n, mapping the domain of integers onto the ring Zm of §1.10, is a homomorphism Z → Zm. We now prove another easy result.

Theorem 6. The correspondence p(x) ↦ f(x) from the domain D[x] of polynomial forms over any integral domain D to the ring D(x) of polynomial functions over D is a homomorphism.

Proof. The addition and multiplication of the values p(c) and q(c) in D must conform to identities (2) and (3), since the derivation of these identities in §3.1 used only the postulates for an integral domain.

The result of Theorem 4 states that if D is infinite, then the homomorphism of Theorem 6 is an isomorphism.

Lemma 2. Let ψ be a homomorphism from a commutative ring R into a commutative ring R'. Then 0ψ is the zero of R', and (a - b)ψ = aψ - bψ for all a, b in R.

Proof. By (6), 0ψ = (0 + 0)ψ = 0ψ + 0ψ, which proves that 0ψ is the zero of R'. Likewise, if x = a - b, then b + x = a and so aψ = (b + x)ψ = bψ + xψ, whence xψ = aψ - bψ in R'.

Exercises

1. (a) Show that there are only four different functions on the field Z2, and write out addition and multiplication tables for this ring of functions. (b) Express each of these functions as a polynomial function. (c) Is this ring of functions isomorphic with the ring of integers modulo 4?
2. How many different functions are there on the ring Zn of integers modulo n?
3. Are the following sets of functions commutative rings with unity? (a) all functions f on a domain D for which f(0) = 0; (b) all functions f on D with f(0) = f(1); (c) all functions f on D with f(0) ≠ 0; (d) all functions f on Q (the rational field) with -7 < f(x) < 7 for all x; (e) all f on Q with f(x + 1) = f(x) for all x (such an f is periodic).
4. Construct two commutative rings of functions not included in the examples of Ex. 3.
5. Let D* be defined as in the text. Prove the associative law for sums and products in D*.
6. Show that one cannot embed in a field the ring Zp(x) of all polynomial functions over Zp.
7. (a) If D and D' are isomorphic domains, prove that D(x) and D'(x) are isomorphic. (b) How about D* and (D')*?
8. Show that if a homomorphism φ maps a commutative ring R onto a commutative ring R', then the unity of R is carried by φ into the unity of R'.
9. Show that if φ: R → R' is any homomorphism of rings, then the set K of those elements in R which are mapped onto 0 in R' is a subring.
10. (a) If V and V' are isomorphic domains, prove that V[x] is isomorphic to V'[y], where V[x] and V'[y] are the domains of polynomial forms in indeterminates x and y over V and V', respectively. (b) How about V(x) and V'(y)?
11. If Q is the field of quotients of a domain V (Theorem 4, §2.2), prove that the field V(x) is isomorphic to the field Q(x).
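Conditions (6) and (7) for the residue-class map n ↦ n mod m can be spot-checked by machine. The small sketch below is our own illustration (the modulus m = 6 is chosen arbitrarily), verifying both conditions on a range of integers:

```python
import itertools

M = 6

def phi(n):
    """The residue-class map Z -> Z6."""
    return n % M

for a, b in itertools.product(range(-20, 21), repeat=2):
    assert phi(a + b) == (phi(a) + phi(b)) % M   # (6): sums are preserved
    assert phi(a * b) == (phi(a) * phi(b)) % M   # (7): products are preserved

assert phi(1) == 1                               # unity is carried into unity
print("phi(n) = n mod 6 preserves sums and products on the sampled range")
```

Note that phi is onto but not one-one, so it is a homomorphism (indeed an epimorphism) that is not an isomorphism.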
3.4. Polynomials in Several Variables

The discussion of §§3.1-3.3 dealt with polynomials in a single variable (indeterminate) x. But most of the results extend without difficulty to the case of several variables (or indeterminates) x1, ..., xn. Consider the case of two indeterminates x, y. For the domain D'' = D[x, y], one such form would be

p(x, y) = (3 + x^2) + 0 · y + (2x - x^3)y^2,

usually written in the more flexible form 3 + x^2 + 2xy^2 - x^3y^2.

Definition. A polynomial form over D in indeterminates x1, ..., xn is defined recursively as a form in xn over the domain D[x1, ..., x(n-1)] of polynomial forms in x1, ..., x(n-1) (in short, D[x1, ..., xn] = D[x1, ..., x(n-1)][xn]). A polynomial function of variables x1, ..., xn on an integral domain D is one which can be built up by addition, subtraction, and multiplication from the constant functions f(x1, ..., xn) = c and the n identity functions fi(x1, ..., xn) = xi (i = 1, ..., n).

Whether D is infinite or not, D[x1, ..., xn] is an integral domain, since rules (2) and (3) for addition and multiplication can be deduced from the postulates for an integral domain. A corollary of Theorem 4 and induction on n is

Theorem 7. Each polynomial function in x1, ..., xn can be expressed in one and only one way as a polynomial form if D is infinite.

It is obvious from the definition that every permutation of the subscripts induces a natural automorphism of the commutative ring D(x1, ..., xn) of polynomial functions of n variables. It follows by Theorem 7 that if D is infinite, the same is true of polynomial forms (whose definition is not symmetrical in the variables). We shall now show that this result is true for any integral domain D.

Theorem 8. The domain D[x1, ..., xn] in fact depends symmetrically on x1, ..., xn; every permutation of the subscripts induces an automorphism of D[x1, ..., xn].

Proof. This may be done in the case n = 2 as follows. Each form

p(y, x) = sum over i of (sum over j of aij y^j) x^i

of D[y, x] (first y, then x) can be rearranged by the distributive, commutative, and associative laws to give an expression of the form

p(y, x) = sum over j of (sum over i of aij x^i) y^j,

and so can be interpreted as if it were a polynomial p'(x, y) in the domain D[x, y]. The correspondence p(y, x) ↦ p'(x, y) thus set up is one-one: every finite set of nonzero coefficients aij corresponds to just one element of D[y, x] and just one of D[x, y]. Since rules (2) and (3) for addition and multiplication can be deduced from the postulates for an integral domain, which both D[y, x] and D[x, y] satisfy, we see that the correspondence preserves sums and products. The case of n indeterminates can be treated similarly with a more elaborate general notation, or deduced by induction from the case of two variables.
This suggests framing a definition of D[x1, ..., xn] from which the symmetry is immediately apparent, roughly as follows. Firstly, D'' = D[x, y] (first x, then y) is generated by x, y, and elements of D: every element of D'' may be obtained from x, y, and D by repeated sums and products. In the second place, the generators x and y are simultaneous indeterminates over D, or algebraically independent over D. By this we mean that a finite sum of terms aij x^i y^j, with coefficients aij in D, can be zero if and only if all coefficients aij are zero. These two properties uniquely determine the domain D[x, y] in a symmetrical manner (see Ex. 9 below).

Exercises

1. Rearrange the following expression as a polynomial in x with coefficients which are polynomials in y (as in the proof of Theorem 8):
(3x^2 + 2x + 1)y^3 + (x^4 + 2)y^2 + (2x - 3)y + x^4 - 3x^2 + 2x.
2. Represent as polynomials in y with coefficients in D[x]: (a) p(x, y) = (x + y)^3 - 3yx(x^2 + x - 1); (b) q(x, y) = y^3x + (x^2 - xy)^2.
3. Compute the number of possible functions of two variables x, y on the domain Z2.
4. Is the correspondence p(x) ↦ p(x + c), where c is a constant, an automorphism of D[x]? Illustrate by numerical examples. Is it also one of D(x)?
5. Prove that the correspondence which carries each p(x) into p(-x) is an automorphism of D[x].
6. If F is a field, show that the correspondence p(x) ↦ p(ax) is an automorphism of F[x] for any constant a ≠ 0.
7. Exhibit automorphisms of D[x, y] other than those described in Theorem 8.
8. Prove Theorem 7: (a) for n = 2; (b) for any n.
9. (a) Prove in detail that the domain D[x, y] (first x, then y) is indeed generated over D by two "simultaneous indeterminates" x and y. (b) Let D' and D'' be two domains each generated over D by two simultaneous indeterminates x', y' and x'', y'', respectively. Prove that D' is isomorphic to D'' under a correspondence which maps x' on x'', y' on y''. (c) Use parts (a) and (b) to give another proof of Theorem 8.
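The rearrangement in the proof of Theorem 8 amounts to regrouping a single set of coefficients aij in two different ways. A minimal sketch (the dictionary representation keyed by exponent pairs is our own device, not the book's):

```python
def as_poly_in_x(coeffs):
    """Group a_ij x^i y^j terms by the exponent of x: an element of D[y][x]."""
    by_x = {}
    for (i, j), a in coeffs.items():
        by_x.setdefault(i, {})[j] = a
    return by_x

def as_poly_in_y(coeffs):
    """Group the same terms by the exponent of y: an element of D[x][y]."""
    by_y = {}
    for (i, j), a in coeffs.items():
        by_y.setdefault(j, {})[i] = a
    return by_y

# 3 + x^2 + 2xy^2 - x^3y^2, the example of the text, keyed by (i, j):
p = {(0, 0): 3, (2, 0): 1, (1, 2): 2, (3, 2): -1}
print(as_poly_in_x(p))   # coefficients in D[y], indexed by powers of x
print(as_poly_in_y(p))   # the same coefficients regrouped, indexed by powers of y
```

Both groupings are recovered from one finite set of nonzero coefficients, which is why the correspondence is one-one.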
3.5. The Division Algorithm

The Division Algorithm for polynomials (sometimes called "polynomial long division") provides a standard scheme for dividing one polynomial b(x) by a second one a(x), so as to get a quotient q(x) and a remainder r(x) of degree less than that of the divisor a(x). We shall now show that this Division Algorithm, although usually carried out with rational coefficients, is actually possible for polynomials with coefficients in any field.

Theorem 9. If F is a field, and if a(x) ≠ 0 and b(x) are any polynomials over F, then we can find polynomials q(x) and r(x) over F so that

(8)  b(x) = q(x)a(x) + r(x),

where r(x) is either zero or has a degree less than that of a(x).

Informal proof. Eliminate successively the highest terms of the dividend b(x) by subtracting from it products of the divisor a(x) by suitable monomials cx^k. If a(x) = a0 + a1x + ... + amx^m (am ≠ 0) and b(x) = b0 + b1x + ... + bnx^n (bn ≠ 0), and if the degree n of b(x) is not already less than the degree m of a(x), we can form the difference

(9)  b1(x) = b(x) - (bn/am)x^(n-m)a(x) = 0 · x^n + (b(n-1) - a(m-1)bn/am)x^(n-1) + ...,

which will be of degree less than n. We can then repeat this process until the degree of the remainder is less than m.
A formal proof for this Division Algorithm can be based on the Second Induction Principle, as formulated in §1.5. Let m be the degree of a(x). Any polynomial b(x) of degree n < m then has a representation b(x) = 0 · a(x) + b(x), with a quotient q(x) = 0. For a polynomial b(x) of degree n ≥ m, transposition of (9) gives

(10)  b(x) = (bn/am)x^(n-m)a(x) + b1(x),

where the degree k of b1(x) is less than n unless b1(x) = 0. By the second induction principle, we can assume the expansion (8) to be possible for all polynomials of degree k < n, so that we have

(11)  b1(x) = q1(x)a(x) + r(x),

where the degree of r(x) is less than m, unless r(x) = 0. Substituting (11) in (10), we get the desired equation (8).

When the remainder r(x) in (8) is zero, we say that b(x) is divisible by a(x). More exactly, if a(x) and b(x) are two polynomial forms over an integral domain D, then b(x) is divisible by a(x) over D, or in D[x], if and only if b(x) = q(x)a(x) for some polynomial form q(x) in D[x].

In particular, if the polynomial a(x) = x - c is monic and linear, then the remainder r(x) in (8) is a constant r, so that b(x) = (x - c)q(x) + r. If we set x = c, this equation gives r = b(c) - 0 · q(c) = b(c). Hence we have

Corollary 1. The remainder of a polynomial p(x), when divided by x - c, is p(c) (Remainder Theorem).

Exercises

1. Compute q(x), r(x) if b(x) = x^5 - x^3 + 3x - 5 and a(x) = x^2 + 7.
2. The same as Ex. 1 if a(x) is, respectively, x + 2 and x^3 + x.
3. Show that q(x) and r(x) are unique for given a(x) and b(x) in (8).
4. (a) Do Ex. 2 for the field Z5. (b) Do Ex. 3 for the field Z3.
5. Is x^3 + x^2 + x + 1 divisible by x^2 + 3x + 2 over any of the domains Z3, Z5, Z7?
6. Given distinct numbers a0, a1, ..., an in a field F, let a(x) be the product of the factors x - aj for j = 0, ..., n. Show that the remainder r(x) of any polynomial f(x) over F upon division by a(x) is precisely the Lagrange interpolant to f(x) at these points.
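The informal proof of Theorem 9 translates directly into a procedure. Below is a sketch over the field Zp (the function name is ours; it assumes p prime and a(x) not the zero polynomial), which repeatedly subtracts the monomial multiple of a(x) described in (9):

```python
def poly_divmod(b, a, p):
    """Division algorithm (8) over Zp: return (q, r) with b = q*a + r and
    deg r < deg a. Polynomials are coefficient lists, constant term first.
    Assumes p is prime and a is not the zero polynomial."""
    a = [c % p for c in a]
    while a[-1] == 0:
        a.pop()
    r = [c % p for c in b]
    while r and r[-1] == 0:
        r.pop()
    q = [0] * max(len(r) - len(a) + 1, 1)
    inv_lead = pow(a[-1], p - 2, p)          # inverse of the leading coefficient in Zp
    while len(r) >= len(a):
        shift = len(r) - len(a)
        c = (r[-1] * inv_lead) % p           # the monomial coefficient of (9)
        q[shift] = c
        for k, ak in enumerate(a):           # subtract c * x^shift * a(x)
            r[k + shift] = (r[k + shift] - c * ak) % p
        while r and r[-1] == 0:
            r.pop()
    return q, r

# Ex. 1 over Z11: x^5 - x^3 + 3x - 5 divided by x^2 + 7
q, r = poly_divmod([-5, 3, 0, -1, 0, 1], [7, 0, 1], 11)
print(q, r)   # [0, 3, 0, 1] and [6, 4]: quotient x^3 + 3x, remainder 4x + 6
```

Dividing by a monic linear x - c returns the constant remainder b(c), illustrating Corollary 1.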
7. (a) If a polynomial f(x) over any domain has f(a) = 0 = f(b), where a ≠ b, show that f(x) is divisible by (x - a)(x - b). (b) Generalize this result.
8. Find all possible rings Zn over which x^5 - 10x + 12 is divisible by x^2 + 2.
9. In the application of the Second Induction Principle to the Division Algorithm, what specifically is P(n) (see §1.5)?

3.6. Units and Associates

One can get a complete analogue for polynomials of the fundamental theorem of arithmetic. In this analogue, the role of prime numbers is played by "irreducible" polynomials, defined as follows.

Definition. A polynomial form is called reducible over a field F if it can be factored into polynomials of lower degree with coefficients in F; otherwise, it is called irreducible over F.

To clarify the analogy between irreducible polynomials and prime numbers, we now define certain divisibility concepts for an arbitrary integral domain D, be it the polynomial domain Q[x], the domain Z of integers, or something else.

Definition. An element a of D is divisible by b (in symbols, b | a) if there exists some c in D such that a = cb. Two elements a and b are associates if both b | a and a | b. An associate of the unity element 1 is called a unit.

If a and b are associates, then a = cb and b = c'a for suitable c and c'; hence a = cc'a. The cancellation law gives 1 = cc', so both c and c' are units. Hence two elements are associates if and only if each may be obtained from the other by introducing a unit factor: a = ub is an associate of b if u is a unit. Again, an element u is a unit in D if and only if it has in D a multiplicative inverse u^(-1) with 1 = uu^(-1). Elements with this property are also called invertible.

EXAMPLE 1. In the domain Z of integers, the units are ±1; hence the associates of any a are ±a.

EXAMPLE 2. In a field, every a ≠ 0 is a unit.
Thus the polynomial x^2 + 4 is irreducible over the field of rationals. For suppose instead that x^2 + 4 = (x + a)(x + b). Substituting x = -b, this gives (-b)^2 + 4 = (-b + a)(-b + b) = 0; hence (-b)^2 = -4. This is clearly impossible, as a square cannot be negative. Since the same reasoning holds in any ordered field, we conclude that x^2 + 4 is also irreducible over the real field or any other ordered field.

EXAMPLE 3. Over any field F, the units of the polynomial domain F[x] are exactly the nonzero constants of F. Indeed, the degree of a product f(x) · g(x) is the sum of the degrees of the factors. Hence any element b(x) with a polynomial inverse a(x), so that a(x)b(x) = 1, must be a polynomial b(x) = b of degree zero. More generally, in a polynomial domain D[x] in an indeterminate x, such a constant polynomial b has an inverse only if b already has an inverse in D; therefore the units of D[x] are the units of D. It follows that two polynomials f(x) and g(x) are associates in F[x] if and only if each is a constant multiple of the other.

EXAMPLE 4. If F is a field, a linear polynomial ax + b with a ≠ 0 is irreducible, for its only factors are constants (units) or constant multiples of itself (associates).

EXAMPLE 5. Consider the domain Z[√-1] of "Gaussian integers" of the form a + b√-1, with a, b in Z. If a + b√-1 is a unit, then 1 = (a + b√-1)(c + d√-1) = (ac - bd) + (ad + bc)√-1 for some c + d√-1. Hence ac - bd = 1, ad + bc = 0, and

1 = (ac - bd)^2 + (ad + bc)^2 = (a^2 + b^2)(c^2 + d^2).

Since a^2 + b^2 and c^2 + d^2 are nonnegative integers, we infer a^2 + b^2 = c^2 + d^2 = 1. The only possibilities are thus 1, -1, √-1, and -√-1, giving four units.

EXAMPLE 6. In the domain Z[√2] of all numbers a + b√2 (a, b integers), (a + b√2)(x + y√2) = 1 implies x = a/(a^2 - 2b^2) and y = -b/(a^2 - 2b^2), and these are integers if and only if a^2 - 2b^2 = ±1. Thus 1 ± √2 and 3 ± 2√2 are units, as can easily be checked, whereas 2 + √2 is not a unit in Z[√2].

Lemma. An element b of an arbitrary integral domain D is divisible by all its associates and by all units.

These are called "improper" divisors of b. An element not a unit with no proper divisors is called prime or irreducible in D. In any integral domain D, the relation "a and b are associates" is an equivalence relation; the proof will be left to the reader. (See also Exs. 1-3 below.)
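Example 6 can be checked with the norm N(a + b√2) = a^2 - 2b^2: as the argument there shows, a + b√2 is a unit exactly when N = ±1. A small sketch of this test (function names ours):

```python
def norm(a, b):
    """N(a + b*sqrt(2)) = a^2 - 2b^2."""
    return a * a - 2 * b * b

def is_unit(a, b):
    """a + b*sqrt(2) is a unit in Z[sqrt 2] iff its norm is +1 or -1."""
    return norm(a, b) in (1, -1)

assert is_unit(1, 1)       # 1 + sqrt(2):   N = -1
assert is_unit(3, 2)       # 3 + 2sqrt(2):  N = 1
assert not is_unit(2, 1)   # 2 + sqrt(2):   N = 2, not a unit
print(sorted((a, b) for a in range(4) for b in range(4) if is_unit(a, b)))
```

The same norm idea underlies the Z[√3] and Z[√5] exercises below.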
Exercises

1. Prove in detail the lemma of the text.
2. Find all the units in the domain D which consists of all rational numbers m/n, with m and n integers such that n is not divisible by 7.
3. Prove that the units of Zm are the integers relatively prime to m.
4. Find all units in the domain D[x, y] of polynomials in two indeterminates.
5. In any integral domain D, prove that (a) the relation "b | a" is reflexive and transitive; (b) if c ≠ 0, then b | a if and only if bc | ac; (c) any two elements have a common divisor and a common multiple; (d) if a | b and a | c, then a | (b ± c).
6. Prove the "generalized law of cancellation": if a ≠ 0 and ax | ay, then x | y.
7. For which elements a of an integral domain D is the correspondence p(x) ↦ p(ax) an automorphism of D[x]?
8. List all associates of x^2 + 2x - 1 in Z5[x].
9. Let "a ~ b" mean "a is associate to b." Prove that (a) if a ~ a' and b ~ b', then ab ~ a'b', whereas in general a + b ~ a' + b' fails; (b) if a ~ b, then a | c if and only if b | c; (c) if a ~ b, then c | a if and only if c | b; (d) if p is prime and p ~ q, then q is prime.
10. Where α = a + b√3 in Z[√3], define N(α) = a^2 - 3b^2. Prove: (a) N(αα') = N(α)N(α'); (b) if α is a unit in Z[√3], then N(α) = ±1; (c) generally, α is a unit if and only if N(α) = ±1; (d) if N(α) is a prime in Z, then α is a prime in Z[√3].
Show that if a .J5)(3 .. let "a . show that a is a prime in Z[.b. (a) Prove that 9 + 4. For which elements a of an integral domain D is the correspondence p(x) ~ p(ax) an automorphism of D[x]? 9. and a ~ 0.b. then e 1a if and only if e I b. (ef.a' + b' fails. 8. prove that (a) the relation "b Ia" is reflexive and transitive. List all associates of x 2 + 2x . 10)... Theorem 10. Corollary. Again. Proof... Yet x 2 + 1 is irreducible over the real field R.28 = (x . By dividing out the g.7. Proof.J28)(x +. since the degree of a product of polynomials is the sum of the degrees of the factors. x 2 = 28 implies that x = rls is an integer.28 > 0 if Ix2 I :> 6.28 < 0 if Ix I -< 5..28 is reducible over the real field.3.28 is irreducible over the rational field. + ans n. Any rational root of the equation p(x) = 0 must have the form rl s.c. But x 2 . There is no easy general test for irreducibility of polynomials over the rational field Q (but see §3. Lemma. one can express bl c in "lowest terms" as a quotient rls of relatively prime integers rand s. In fact. the polynomial x 2 + 1 can be factored as x 2 + 1 == (x + r-i)(x . the only irreducible polynomials of C[x] are linear. and x 2 . S"l Iml ar Iy.I + .d. . r) = 1. It is now easy to prove that x 2 . Suppose p(x) == 0 for some fraction x = blc..28 is irreducible over Q.7 79 Irreducible Polynomials complex field C. as -ans n == r ( aor n-I + . as will be shown in §5. A quadratic or cubic polynomial p(x) is irreducible over a field. of band c. by successive applications of Theorem 10 of §1.28 = 0. one factor must be linear. The same polynomial is irreducible over the rational field. hence. By the Corollary. where r Ian and s lao. Any rational root of a monic polynomial having integral coefficients is an integer. r Ian.§3.. F. + an-Is n-I) .s IQo. S Iaor n-I . whence and But (s. Let p(x) = aoxn + alx n. as we shall now prove rigorously.2 Hence no integer can be a root of x . 
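One of the exercises above uses the identity 2 · 2 = (3 + √5)(3 − √5) to show that Z[√5] is not a unique factorization domain, with the hint that x² ≡ 2 (mod 5) is impossible. A small numerical sketch of that hint (the function name is mine, not the text's), using N(a + b√5) = a² − 5b²:

```python
def norm(a, b):
    """N(a + b*sqrt(5)) = a^2 - 5*b^2, a multiplicative norm on Z[sqrt(5)]."""
    return a * a - 5 * b * b

# Both sides of 2 * 2 = (3 + sqrt(5))(3 - sqrt(5)) = 4 have norm 16:
assert norm(2, 0) == norm(3, 1) == norm(3, -1) == 4

# Yet none of the factors splits further: a proper divisor would need
# norm +-2, which would force x^2 = +-2 (mod 5), impossible since the
# squares mod 5 are only 0, 1, 4.
assert sorted({(x * x) % 5 for x in range(5)}) == [0, 1, 4]
assert all(norm(a, b) not in (2, -2)
           for a in range(-30, 31) for b in range(-30, 31))
```

So 2, 3 + √5, and 3 − √5 are all irreducible, and 4 has two genuinely different factorizations.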
Substitution of this value in p(x) gIVes (12) 0 == snp(rls) == aorn + alrn-Is + . .J28).. and (by the Lemma) x . since x 2 . unless p(c) = 0 for some c E F. In any factorization of p(x) into polynomials of lower degree. the polynomial x 2 . .r-i). + an be a polynomial with integral coefficients.. (b) 5x J + x 2 + X = 4.3x = 18. since For any a E ra ± sa = (r ± s)a R. Such an ideal is called a principal ideal. Is x 2 + 1 irreducible over ZJ? over Zs? How about x J + x + 2? 6. + anx n is irreducible. Show that if 4ac > b 2 . (Hint: Use the fundamental theorem of arithmetic. For which rational numbers x is 3x 2 .7x an integer? Find necessary and sufficient conditions. . 13. 10.. + aoX n. r E R imply ra E C. Find a finite field over which x 2 . then ax 2 + bx + c is irreducible over any ordered field. the set of all multiples ra of a is an ideal..Ch. then so is an + an_IX + an_2 x 2 + . 4. (d) 6x J . Unique Factorization Theorem Throughout this section we shall be considering factorization in the domain F[x] of polynomial forms in one indeterminate x over a field F. Decompose into irreducible factors the polynomial X4 . • 3. Find all monic irreducible quadratic polynomials over the field Zs. Prove that 30x n = 91 has a rational root for no integer n > 1. and a E C. and s(ra) = (sr)a. (b) irreducible. A nonvoid subset C of a commutative ring R is called an ideal when a E C and bEe imply (a ± b) E C. (c) 8x s + 3x 2 == 17. Test the following equations for rational roots: (a) 3x 3 . Find all monic irreducible cubic polynomials over ZJ' 9.5x 2 + 6 over the field of rationals.7x = 5..) 3. s..8. For what integers a between 0 and 250 does 30x n = a have a rational root for some n > I? 5. Definition. The main result is that factorization into irreducible (prime) factors is unique. 7. and over the field of reals. 1). We will now show that all ideals in any F[x] are principal.2 is (a) reducible. Remark. Prove that if ao + alx + a 2 x 2 + . over the field Q(J2) of §2. 
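The rational-root test of Theorem 10 is a finite procedure: a root r/s in lowest terms of a₀xⁿ + ··· + aₙ must have r | aₙ and s | a₀, so only finitely many candidates need checking. A sketch with exact rational arithmetic (the function name is mine):

```python
from fractions import Fraction

def rational_roots(coeffs):
    """All rational roots of a_0 x^n + ... + a_n (integer coefficients,
    a_0 != 0), by testing the finitely many candidates of Theorem 10."""
    def divisors(m):
        m = abs(m)
        return [d for d in range(1, m + 1) if m % d == 0] or [1]
    candidates = {Fraction(0)} | {Fraction(sign * r, s)
                                  for r in divisors(coeffs[-1])
                                  for s in divisors(coeffs[0])
                                  for sign in (1, -1)}
    def value(x):          # Horner evaluation, exact in Q
        v = Fraction(0)
        for c in coeffs:
            v = v * x + c
        return v
    return sorted(x for x in candidates if value(x) == 0)

assert rational_roots([1, 0, -28]) == []                    # x^2 - 28
assert rational_roots([2, -1, -1]) == [Fraction(-1, 2), 1]  # 2x^2 - x - 1
```

The first assertion reproduces the argument of the text: x² − 28 has no rational root, hence (being quadratic) is irreducible over Q.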
the proof being a virtual repetition of that of the analogous fundamental theorem of arithmetic (Chap. which will be considered systematically in Chap. 8. The analogy involves the following fundamental notion. *11. 3 80 Polynomials Exercises 1. r E R.1. . 2. so is every associate of d. and by construction it contains no nonzero polynomial of degree less than d(a). for if d and d' are two greatest common divisors of the same polynomials a and b. difference. In F[x].§3. all its multiples q(x)a(x). did' and d' Id. Unless C = 0. since (in abbreviated notation) (sa + tb) ± + t'b) = (s ± s')a + (t ± q(sa + tb) = (qs)a + (qt)b. Hence the set C is an ideal and so. Hence r(x) = 0 and b(x) = q(x)a(x). In this case. since d(x) = so(x)a(x) + to(x)b(x). or multiple of its members. The g. Now let a(x) and b(x) be any two polynomials. any ideal C of F[x] consists either (i) of 0 alone. so that d and d' are indeed associates. But by hypothesis C contains r(x). (ii) d is a "linear combination" d = sa + tb of a and b. and contains any sum. (i') c Ia and c Ib imply c Id.d.c. and will be divisible by any common divisor of a(x) and b(x). if d satisfies (i). and (ii). d(x) is unique except for unit factors. It is sometimes convenient to speak of the unique monic polynomial associate to d as "th" d e g.. This polynomial d(x) will divide both a(x) = 1· a(x) + O· b(x) and b(x) = O· a(x) + 1· b(x). Our conclusion is Theorem 12. if d is a g. This means that . and. or (ii) of the set of multiples q(x)a(x) of any nonzero member a(x) of least degree. consists of the multiples of some polynomial d(x) of least degree. described in detail in §1. proving the theorem.q(x)a(x) has degree less than d(a). Incidentally.. We remark that the Euclidean algorithm.) Also. This set Cis obviously nonvoid. (s'a t')b. by Theorem 9 some r(x) = b(x) .d. then so do all associates of d. (i) and (ii) imply (i'). Conversely.8 81 Unique Factorization Theorem Theorem 11. 
and consider the set C of all the "linear combinations" s(x)a(x) + t(x)b(x) which can be formed from them with any polynomial coefficients s(x) and t(x). Two polynomials a(x) a~d b(x) are said to be relatively prime if their greatest common divisors are unity and its associates. Moreover. (i'). it contains a nonzero polynomial a(x) of least degree d(a).c. any two polynomials a and b have a "greatest common divisor" d satisfying (i) d Ia and d Ib. Over any field F. by Theorem 11. can be used to compute d explicitly from a and b. if b(x) is any polynomial of C. with a(x). (This is because our Division Algorithm allows us to compute remainders of polynomials explicitly. then by (i) and (i').c .7. Proof. i]. Since p(x) divides the product a(x)b(x).8. This expression is unique except for the order in which the factors occur. b'(x) = C'Pl'(X) . Pm(X)Pl'(X) ' . Therefore. Pm(x) equals the product of the qk(X) [k ¥..Ch. Otherwise. p(x)la(x). pn'(x). Hence Pl(X) = qi(X) . the Pj(x) [j ¥. P2(X) . it must by Theorem 13 divide some (nonconstant) factor qi(X). since qi(X) is irreducible. and since Pl(X) and qi(X) are both monic.. §1. this is trivial. a(x) is the product a(x) = b(x)b'(x) of factors of lower degree. 3 82 Polynomials polynomials are relatively prime if and only if their only common factors are the nonzero constants of F (the units of the domain F[x]). and has a lower degree than a(x). In the former case.. we can assume b(x) = CPl(X)' . Pm(x). By the Second Induction Principle.1] and qdx) [k ¥.. If p(x) is irreducible. suppose a(x) has two possible such "prime" factorizations. such a factorization is possible.d. If a(x) is a constant or irreducible.c. last paragraph) that the exponent ei to which each (monic) irreducible polynomial Pi(X) occurs as a factor of . Pn'(x). since Pl(X) divides C'ql(X)··· qn(x) = a(x). c = c' will be the leading coefficient of a(x) (since the latter is the product of the leading coefficients of its factors). 
it divides both terms on the right. Proof. Clearly. First. of p(x) and a(x) is either p(x) or the unity 1. the g. Again. completing the proof. it must be 1. Any nonconstant polynomial a(x) in F[x] can be expressed as a constant c times a product of monic irreducible polynomials. Cancelling. then p(x)la(x)b(x) implies that p(x)la(x) or p(x)lb(x) . It is a corollary (ct..i] are equal in pairs. Because p(x) is irreducible. in the latter case. again by the Second Induction Principle. where cc' is a constant and the Pi(X) and p/(x) are irreducible and monic polynomials. whence a(x) = (CC')Pl(X)' . we can write 1 = s(x)p(x) + t(x)a(x) and so b(x) = 1· b(x) = s(x)p(x)b(x) + t(x)[a(x)b(x)].. To prove the uniqueness. as required for the theorem. Theorem 14. hence does divide b(x). Theorem 13. the quotient qi(X)/Pl(X) must be a constant.. ) 10. 3 3 (d) X4 + x + X + 1. 11. Do Ex. Do the same for x 6 .§3. Prove that two polynomial forms q(x) and rex) over Z represent the same function on Zp if and only if (x P .1 is associate to one on your list. .2.l.c. If a given polynomial p(x) over F has the property that p(x)la(x)b(x) always implies either p(x) Ia (x) or p(x) Ib(x) . (b) x 3 + x + 2. (b) Infer that the polynomials have a l. Hence any two such factorizations can be made to agree with each other simply by reordering terms and replacing each factor by a suitable associate factor.d. as in Theorem 14. X4 + 4x 3 + 3x 2 + X + 1. Factor the following polynomials in Z3: (a) x 2 + x + 1. as a linear combination d(x) = s(x)a(x) + t(x)b(x) of the given polynomials. (a) Prove that the set of all common multiples of any two given polynomials over a field is an ideal.1. This situation is summarized by the statement that the decomposition of a polynomial in F[x] is unique to within order and unit factors (or to within order and replacement of factors by associates).m. List (to within associates) all divisors of X4 .c.m. 
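Because the Division Algorithm computes remainders of polynomials explicitly, the g.c.d. d(x) of Theorem 12 can be found by the Euclidean algorithm. A minimal sketch over Q (coefficient lists run from highest degree down; the names are mine, not the text's):

```python
from fractions import Fraction

def strip(p):
    """Drop leading zero coefficients, keeping at least one entry."""
    i = 0
    while i < len(p) - 1 and p[i] == 0:
        i += 1
    return p[i:]

def poly_rem(a, b):
    """Remainder on dividing a by b (the Division Algorithm over Q)."""
    a = [Fraction(c) for c in a]
    b = [Fraction(c) for c in b]
    while len(a) >= len(b) and a != [0]:
        q = a[0] / b[0]              # cancel the leading term
        for i in range(len(b)):
            a[i] -= q * b[i]
        a = strip(a)
    return a

def poly_gcd(a, b):
    """Monic greatest common divisor, by the Euclidean algorithm."""
    a, b = strip(list(a)), strip(list(b))
    while b != [0]:
        a, b = b, poly_rem(a, b)
    return [Fraction(c) / a[0] for c in a]   # normalize to the monic associate

# gcd(x^2 + 3x + 2, (x + 1)^2) = x + 1, as one of the exercises asks
assert poly_gcd([1, 3, 2], [1, 2, 1]) == [1, 1]
```

Dividing out the leading coefficient at the end picks the unique monic associate, "the" g.c.d. in the sense of the text.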
7.8 Unique Factorization Theorem 83 a(x) is uniquely determined by a(x).c.d.. (b) Express this g.) (c) The same for X I8 . assuming that the polynomials have coefficients in Z3. However. 4. (Caution: The coefficients need not be integers. illustrate by finding the I. proving that every divisor of X4 . of x 3 . each factor Pi(X) divided by its leading coefficient gives a (unique) monic irreducible factor. of 2x 3 + 6x 2 . and is the biggest e such that Pi(XY la(x). prove p(x) irreducible over F. X 33 .c. then the antecedents of the additive zero of R' form an ideal in R.1 in the domain of polynomials with rational coefficients.d.c. (a) Find the g. the factors are no longer absolutely unique.d. 5. Prove that any finite set of polynomials over a field has a g. Find the g. 3. and therefore is associate to this irreducible in F[x]. Show that x 3 + x + 1 is irreducible modulo 5.X .. If a polynomial a(x) is decomposed into irreducible factors Pi(X) which are not necessarily monic. which is a linear combination of the given polynomials. *(e) X4 + x + X + 2.1. (c) 2x 3 + 2X2 + X + 1.rex)]. of x 2 + 3x + 2 and (x + 1)2. 6 of §3. x 8 .c.1 and X4 + x 3 + 2X2 + X + 1. Show that if 4> is any homomorphism from a commutative ring R to a commutative ring R'. 9.1. 8. 6. 3. Exercises 1. 12. 2.x)l[q(x) .3. (Hint: Use Ex. one can show that in both cases. prove p(x) irreducible. prove h (x) relatively prime to I(x )g(x). 17. yet s(x) . b(6) = b(7). Show that S is an ideal. y] of polynomial forms in two indeterminates over the rational field Q. yet there are 0 polynomials s (x.c.d. x) = 1. If m(x) is a power of an irreducible polynomial. 16. g. Thus Theorem 12 does not hold in either domain: Nevertheless. y) and t(x. y) = x and b(x. 2 + t(x) . Other Domains with Unique Factorization Consider the domain Q[x. in the domain Z[x] of polynomials with integral coefficients.. The following descriptions give certain sets of polynomials with rational coefficients.. 
By a unique factorization domain (sometimes called a "Gaussian domain") is meant an integral domain in which (i) any element not a unit can be factored into primes. factorization into primes is possible and unique (Theorem 14 holds). then so is any domain G[Xb . If I(x) and g(x) are relatively prime polynomials in F[x]. y) = y2 + X are 1 and its associates. If h (x) is relatively prime to both I(x) and g(x). *18. prove that I(x) and g(x) are relatively prime also in K[x].Ch. Definition. (a) all b(x) with b(3) = b(5) = 0. *3. whatever the choice of sand t. y) = 1 since the polynomial xs + (y2 + x)t would never have a constant term not zero. (ii) this factorization is unique to within order and unit factors. 19. y) + (/ + x )t(x. show that m(x)la(x)b(x) implies either m(x)la(x) or m(x)ib(x)y for some e.xn ] of polynomial forms over G. Our main result will be that if G is any unique factorization domain. 3 Polynomials 84 13. The only common divisors of ~x. 20. (b) all b(x) with b(3) ~ 0 and b(2) = 0. (c) all b(x) with b(3) = 0. Similarly. and if F is a subfield of K. prove that they have a common divisor with rational coefficients which is not a constant. x = 1 has no solution. 15. Which of these sets are ideals? When the set is an ideal. If h(x)l/(x)g(x) and hex) is relatively prime to I(x). If two polynomials with rational coefficients have a real root in common. find in it a polynomial of least degree.9. If p(x) is a given polynomial such that any other polynomial is either relatively prime to p(x) or divisible by p(x). Using . prove that h(x)lg(x). y) such that xs (x. (2. Let S be any set of polynomials over F which contains the difference of any two of its members and contains with any b(x) both xb(x) and ab(x) for each constant a in F. (d) all b(x) such that some power of b(x) is divisible by (x + 1)4(x + 2). . 14. since all the terms on the right are so divisible. 
This means that the prime p must appear in the unique decomposition of one of the factors am or bn> in contradiction to the choice of am and bn as not divisible by p. bi E G ("integers"). where g(x) has coefficients in G. + (b n/ an)x n. since the unique factorization theorem holds in G). We may typically imagine G as the domain of the integers and F correspondingly as that of the rationals. First write f(x) = (b o/ ao) + (b l / al)x + . But let j am . Proof.2. Lemma 1 (Gauss). and f(x) = (cc/)f*(x). where cf is in F and f*(x) is primitive. Now let c ' be a greatest common divisor of the coefficients of g(x) (this exists. Then the formula (3) tor the coefficient Cm + n in the product gives so that the product ambn is divisible by p. for a given f(x). it suffices to show that f* is unique to within units in G. since the polynomials are primitive). To this end. Theorem 4).9 85 Other Domains. To prove the uniqueness of cf and f*. Second. with cf = CC'. the constant cf and the primitive polynomial f*(x) are unique except for a possible unit factor from G. suppose f*(x) = cg*(x). If C = l/aoal··· an> we have f(x) = cg(x). one can evidently reduce the problem to the case G[x] of a single indeterminate. L bjx .5x 2 is primitive. Any nonzero polynomial f(x) of F[x] can be written as f(x) = cff*(x).and bn be the first coefficients not divisible by p in L aiV i and L bjx . Lemma 2. Proof. 3 . k i j if this is not primitive. The product of any two primitive polynomials is itself primitive. we shall call a polynomial of F[x] primitive when its coefficients (i) are in G ("integers") and (ii) have no common divisors except units in G.3. and we shall consider F[x] along with G[x].6x 2 is not. Moreover. ai. Write j L Ck Xk = L aixi . Thus 3 ... This is the first result. we shall embed G in the field F = O( G) of its formal quotients (§2. j j respectively (they certainly exist. with Unique Factorization induction on n. and it is this case which we shall consider. First. Clearly. 
f*(x) = g(x)/c' is primitive. then some prime pEG will divide every Ck. where . hence v is a unit in G. hence a prime element f(x) in G[x] must have one of these factors cf or t* a unit of G[x]. 3 Polynomials 86 f*(x) and g*(x) are primitive and c E F. g*(x)h*(x) is pnmltIve. and hence is associate to a product of primitive irreducibles of G[x]. so is G[x].D. where u. v divides every coefficient of g*(x). What is more important. A polynomial with integral coefficients which can be factored into polynomials with rational coefficients can already be factored into polynumials of the same degrees with integral coefficients. qm(x).E.. The former takes place i G and so by hypothesis is possible and unique. then cf . Now consider any polynomial f(x) in G[x]. hence by Lemma 2 the two differ by a unit factor u in G (are associate). by Lemma 3 the factorization of ~y f(x) in G[x] splits into independent parts: the factorization of its " ontent" cf and that of its "primitive part" t*(x). f(x) . as f(x) . so that ug*(x) = vt*(x). It is a corollary that if f(x) is in G[x] and reducible in F[x].ql(X) .CgCh and t*(x) .g*(x)h *(x). whence. The constant cf of Lemma 2 is called the content of f(x). any polynomial f(x) has a factorization f(x) = Ct/*(x) . Therefore the primes of G[x] are of two types: the primes p of G. and the primitive polynomials q(x) which are irreducible. qm(x). But g*(x) is primitive. where the element d of G can be factored into irreducibles Pi of G. v E G are relatively prime. Proof. u is a unit. it is unique to within associateness in G.. By symmetry. where" -" denotes the relation of being associate in G[x]. which is possible and unique by Theorem 14. If G is a unique factorization domain. The coefficients of ug*(x) will then have v as a common factor. If f(x) = g(x)h(x) in G[x] or even F[x]. the latter is essentially e~uivalent to factorization in F[x]. both in G[x] and (Theorem 15) in F[x]. hence cf = U-1CgCh' Q. Theorem 15. 
it is also clearly a constant multiple of t*(x). This suggests Lemma 4. Lemma 3. and so u/v is a unit in G. Proof. By Lemma 2. By Lemma 1. This gives the following generalization of the Corollary of Theorem 10. All told.Ch. Write c = u/v. By Lemma 3. since u and v are relatively prime.. This completes the proof. then f(x) = uCfg*(x)h*(x).. It has a factorization in F[x]. Thus f(x) = dql(X) . b]. §3. ll(g».. the product PI . 2. Q. Describe a systematic method for finding all linear factors ax + b of a polynomial I(x) in Z[x]. that a "prime" of G[x] which divides a product a(x)b(x) must divide a (x) or b (x). Prove that ab r-. 6. Represent each of the following as a product of a constant by a primitive polynomial of Z[x]: 3x 2 + 6x + 9. [a.8 hold in every unique factorization domain? 9.xn ] over G. Therefore the Pi are the (essentially) unique factors of cf in the given domain G.9 87 Other Domains with Unique Factorization has the decomposition where each Pi is a prime of G.6. Ex. 8. For what integers n is 2X2 + nx . 4. (a. . y]. Fom Lemma 4 and an induction on n one concludes Theorem 16. If I(x) and g(x) are relatively prime in F[x]. If G is any unique factorization domain. This shows that G[x] is a unique factorization domain.c. 7.d.3 in Z[x]. each qj(x) a primitive irreducible of G[x]. x 2 /2 + xl3 + 7. b] in any unique factorization domain.§3. Pr is the essentially unique content cf of f(x). so polynomial domain G[xJ. 10.1O. In this factorization the polynomials qj(x) which appear are uniquely determined. b lea. b) and an I.7 reducible in Q[x]? 5. Exercises 1. we shall exhibit an integral domain which is not a unique factorization domain. Do the properties of "relatively prime" elements as stated in Exs. prove that yl(x)'+ g(x) is irreducible in F[x. .c. IS every In §14. List all the divisors of 6x 2 + 3x .1 cg in G and t*(x) Ig*(x) in F[x].D. In the notation of the text. (b) using (a).E. to within units in G.m.. 
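For G = Z, the splitting f(x) = c_f · f*(x) of Lemma 2 into content and primitive part is a one-line computation for a nonzero polynomial. A sketch (the names are mine; coefficient lists run from highest degree down, and the sign is absorbed into the content so that f* has positive leading coefficient):

```python
from math import gcd
from functools import reduce

def content_and_primitive(coeffs):
    """Split a nonzero f in Z[x] as (content, primitive part), f = c_f * f*."""
    c = reduce(gcd, (abs(a) for a in coeffs))
    if coeffs[0] < 0:       # put the sign into the content
        c = -c
    return c, [a // c for a in coeffs]

# 3x^2 + 6x + 9 = 3(x^2 + 2x + 3), as in one of the exercises above
assert content_and_primitive([3, 6, 9]) == (3, [1, 2, 3])
# 3 - 6x^2 (written -6x^2 + 3) has content -3 and primitive part 2x^2 - 1
assert content_and_primitive([-6, 0, 3]) == (-3, [2, 0, -1])
```

By Gauss's Lemma the primitive parts multiply: the content of a product is the product of the contents, which is the computational heart of Theorem 15.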
show directly (a) that ctf*(x) I cgg*(x) in G[x] if and only if c. 15 and 16 of §3.. Since the qj(x) are primitive.' . Find the prime factors of the following polynomials in Q[x]: *3.(a. as the primitive parts of the unique irreducible factors of f(x) in F[x]. in which neither Theorem 12 nor Theorem 14 holds (d. Prove that two elements a and b in a unique factorization domain always have a g. . (d) x 7 + 2x 3 y + 3x 2 + 9y. Let D be the set of all rational numbers which can be written as fractions alb with a denominator b relatively prime to 6.1 + with integral coefficients.2) a systematic method of finding in a finite number of steps all factors of given degree of any f(x) of Z[x]. (b) X4 . + co) .1 has no monic linear factors over Q except x-I. This result will be deduced from the following sufficient condition for irreducibility due to Eisensteiil: Theorem 17.) If f(x)ig(x) in Z[x]. y]. It follows that xn . *15. Find all irreducible polynomials of degree 2 or less in Z2[X. + ao be a polynomial an -ijiE 0 (mod p). has no rational root except x = 1.. prime p. Indeed. an-l == a n-2 == Then a(x) is irreducible over the Proof. (Kronecker. t S ) irreducible in F[t]. such that •. 13. prove tha~ f(c)ig(c) for each c in Z.1 + . Develop from this fact (and the interpolation formula (5) of §3..l)cp(x) gives the (unique) factorization of x P .3). n appearing as the exponents of some term xmyn of f. y] if there is a substitution x ~ 1'.2) +t(x.1 n-l n-2 =x +x +"'+x+l x-I is irreducible. ao -ijiE 0 (modp2).1 into monic irreducible factors. this polynomial is reducible unless n is a prime. Show that there exist in Q[x. In any possible factorization (n = m + k) a(x) = (bmx m + bm_1x m.. so that x P . y].. then the cyclotomic polynomial cp(x) defined by (13) is irreducible.1 = (x . We now show that if n = p is a prime..y6. Decompose each of the following into irreducible factors in Q[x. y ~ t S which yields a polynomial f(t'.Ch. 14. 
y] no polynomial solutions for the equation 1 1 = s(x.y)(x .. field of rationals. n odd. == ao == 0 (modp). + bo)(ckxk + Ck~IXk-l + . *3. (c) x 6 .10. provided the degree of f(t'. y) is -irreducible in F[x. Eisenstein's Irreducibility Criterion It is obvious that the equation xn = 1.y2. Show that the polynomial f(x. 3 88 Polynomials 11. Prove that D is a unique factorization domain. 16. let a(x) = anx" + an_lx n. 12. But this does not show that the quotient (13) () = cpx xn . t S ) is the maximum of the integers mr and ns for all pairs m.y)(x + y . For a gwen . and prove that your factors are actually irreducible: (a) X 3 _ y 3. for p is a prime.I worls:s. Exercises 1. But bo ~ 0 and c.1) = [(y + I)P . +b. 3. Pick the smallest index r <: k for which c.. If f(x) is irreducible over a field F.D. 1·2 The binomial coefficients which appear on the right are all integers divisible by the prime p. If a polynomial f(x) of degree n > k satisfies the hypotheses an ~ 0. show that f(x) has an irreducible factor of degree at least k. ~ 0. ak-l = .1)/(x .. for any a in F... . ak ~ 0. = ao = 0 (modp) and ao ~ 0 (modp2). Since ao = boco. By the hypothesis. the third hypothesis ao ~ 0 (mod p2) means that not both bo and Co are divisible by p. so Ck ~ 0 (mod p). + x + 1. Which of the following polynomials are irreducible over the field of rationals? X4 + 15. show thatf(x + a) also is.E.1]/y p-l = y + py p-2 + p(p . so that the polynomial f(x) is indeed irreducible.co == boc.2 + . for the binomial expansion gives (x P . with C. Q. = boc. The polynomial in y thus satisfies the hypotheses of the Eisenstein criterion. (modp). 2.. it gives the cyclotomic polynomial (13') cfJ(x) = (x p - l)/(x . and can never be cancelled out by the (smaller) integers in the denominator. But bmCk = an ~ 0 (mod p). ~ 0 (mod p). + b1c. == Co == 0 (mod p). for which this is possible is an> so r = n: the degree of the second of the proposed factors must be n. but a simple change of variable y = x . 
To fix our ideas. ~ 0. suppose that b o ~ 0 (mod p).-1 + .1) = XP-l + x p . + p. Then a..10 89 Eisenstein's Irreducibility Criterion we may assume by Theorem 15 that both factors have integral coefficients bi and 9. for p occurs in each numerator as a factor. while Co == 0 (mod p). give a..-l = . Use Eisenstein's criterion to show x 2 + 1 irreducible over the rationals. hence is irreducible.§3. the only coefficient a. This criterion may be applied to the polynomial (13) when n = p...1) Y p-3 + . . The Eisenstein criterion does not apply to (13') as it stands. 4. this entails the irreducibility of the original cyclotomic polynomial cfJ(x) of (13'). an+1 0 (mod p). an == an-I == . then divide the quo-. (a) If I(x) is a monic polynomial with integral coefficients. Instead. Partial Fractions The unique decomposition theorem for polynomials can be applied to rational functions to obtain certain simplified representations. (b) Show that every factor of I(x) over Z must reduce modulo p to a factor of the same degree over Zp. If the denominator a(x) is a power a(x) = [c(x)r. x]. b(x) = qo(x)c(x) + ro(x). 6. ao . hence (14) b(x)/[c(x)d(x)] = [s(x)b(x)]/d(x) + [t(x)b(x)]/c(x).¢ 0 (modp).11.. State and prove an analogue of the Eisenstein Theorem for polynomials I(x) with coefficients in F[t]. a2n == . A rational form in which the denominator is the product of relatively prime polynomials c(x) and d(x) can be expressed as a sum of two quotients with denominators c(x) and d(x). this process does not apply directly. assuming throughout that the polynomials and rational forms used have coefficients in some fixed field F.¢ 0 (mod p3).) (b) Use this to prove that x 3 + 3t 2x 2 + 2tx 2 + t 4x + 7t + t 2 is irreducible in the domain F[t. *3. Theorem 12 gives polynomials sex) and t(x) with 1 = sc + td.. (Hint: Use t in place of p. Combined. (a) Let F[t] be the domain of all polynomials in an indeterminate t.Ch. to get qo(x) = llI(X)C(X) + rl(x). 3 90 Polynomials *5. 
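The hypotheses of Theorem 17 are simple congruence checks, so Eisenstein's criterion is easy to apply mechanically. A sketch (the function name is mine), including the change of variable used above for the cyclotomic polynomial:

```python
def eisenstein_applies(coeffs, p):
    """Theorem 17 for a(x) = a_n x^n + ... + a_0, coefficients listed
    highest degree first: p must not divide a_n, p must divide every
    other coefficient, and p^2 must not divide a_0."""
    a_n, a_0 = coeffs[0], coeffs[-1]
    return (a_n % p != 0
            and all(c % p == 0 for c in coeffs[1:])
            and a_0 % (p * p) != 0)

assert eisenstein_applies([1, 0, -2], 2)    # x^2 - 2 is irreducible over Q

# The criterion fails for x^2 + 1 as it stands, but the substitution
# x -> x + 1 gives x^2 + 2x + 2, to which it applies with p = 2:
assert not eisenstein_applies([1, 0, 1], 2)
assert eisenstein_applies([1, 2, 2], 2)

# Likewise (x^3 - 1)/(x - 1) = x^2 + x + 1 becomes y^2 + 3y + 3 under
# y = x - 1, which satisfies the hypotheses with p = 3:
assert eisenstein_applies([1, 3, 3], 3)
```

Since irreducibility is unaffected by the substitution x → x + a, success after a shift still establishes irreducibility of the original polynomial.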
show that the irreducibility of I(x) modulo p implies its irreducibility over Q. "" ao = 0 (mod p2).. The result in words is Lemma 1. . respectively. these give . (c) Use this to test (using small p) the irreducibility over Q of = = 7. m > 1. divide the numerator by c(x) as in the division algorithm. tient qo(x) again by c(x). Consider first a rational form b(x)/a(x) in which the denominator has a factorization a(x) = c(x)d(x) with relatively prime factors c(x) and d(x). like the partial fraction decomposition used in integral calculus. This we now discuss. Show that the irreducibility of a polynomial of odd degree 2n + 1 is enforced by the conditions a2n+l. decompose an arbitrary given denominator a(x) into a product of monic irreducibles.11 91 Partial Fractions Repeating this process (this phrase disguises an induction. hence are relatively prime. . r'. where p(x) is irreducible and r(x) has degree less than that of p(x). each with a denominator [Pi(X)r'. 11. Lemma 1 can therefore be applied to that factorization of the denominator in which one factor is c I(x) = [PI (x) while the other factor is all the rest of (17). in abbreviated notation. has degree less than that of c(x). Such a proof. §l.§3. one has (17) with integral exponents mi' Any two distinct monic irreducibles PI(X) and P2(X) are certainly relatively prime. if not zero. To these denominators the reduction of (16) may be applied. so that the powers [PI(x)r' and [P2(X)r 2 have no common factors except units. one finds. If equal irreducibles are grouped together. the successive steps of the proof of Theorem 18 may be carried out to get the explicit result.S. The denominators [p(x)r which occur are all factors of the original denominator a(x). Theorem 18. To combine these results. that t where each polynomial ri = rj(x). Any rational form b(x)/a(x) can be expressed as a polynomial in x plus a sum of ("partial") fractions of the form r(x)/[p(x)r. remove the camouflage!). 
Repetition gives b/ a as a sum of fractions. The rational form b(x)/a(x) now becomes This proves Lemma 2. t This is the analogue of the decimal expansion of an integer presented in Ex. A rational form with a power [c(x)r as denominator can be expressed as a polynomial plus a sum of rational forms with denominators which are powers of c(x) and numerators which have degrees less than that of c(x). If the explicit partial fraction decomposition of a given rational function b(x)/a(x) is to be found. 1).e.7 (f) (x _ 2)2' tCompare the directness of this method with that often used in texts on calculus.. . (This statement will be proved in §5. and exponential functions.1) and (Bx + C)/(x 2 + X + 1). Theorem 7. and their inverses). For example. The denominator is (x . we get 3(x + 1) + 1) = (x + 1)(x 2 + X + 1) + 1 x 2 + 3x + 2 3(x x 3 = x -1 x-I 2 x +x+l (x 2 + 3x + 2)(x . consider. Each of the resulting fractions may be simplified by a further long division.t to give 3(x x3 + 1) - 1 - 2x + 1 2 x-I 2 x +X +1 Over the field R of real numbers the only irreducible polynomials are the linear ones and the quadratic polynomials ax 2 + bx + c with b 2 4ac < O. trigonometric.1) + 3. By Theorem 18. B.Ch. over the field Q. Multiplying this equation by the numerator x + 1 of the original equation.l)(x 2 + X + 1). .1). is known as a "constructive" proof. Decompose into partial fractions (over the rational field): (a) (d) +4 x + 3x + 2' 3x 2 a2 x 3 - a 3 (b) 1 x 2 (e) ' X4 2 (c) - a + 3 5x 2 ' + 4' 1 x 3 + x' 3x . and the second factor is irreducible. (x + 1)/(x 3 . 3 92 Polynomials which can always be used for actual computation of · the objects concerned.) Therefore over R any rational function can be expressed as a sum of terms with denominators which are powers of linear and quadratic expressions. algebraic. This fact is used in calculus to prove that the indefinite integral of any rational function can be expressed in terms of "elementary functions" (i. 
where one ~ust solve for the "unknown" coefficients A. C which occur in the terms A/(x . The Division Algorithm gives x 2 + x + 1 = (x + 2)(x .5. Exercises 1. the rational form to be integrated is essentially a sum of terms of the types c(x + a)-m and c(x + d)(x 2 + ax + b)-m. Hence the proposition on integrals will be proved if one can integrate these two types by elementary functions (which can be done). ).aj ).3. ai' .) = 1 by Lagrange's interpolation formula. *(b) Using Ex. the polynomial domain D[x] becomes ordered if we choose for "positive" polynomials those having a positive leading coefficient. show that the representation is unique (a) in Lemma 1. (a) If (x .= (x . j~i (Hint: Expand pea. .a). . 8.a) is not a factor of f(x). . *12. (b) in Lemma 2. prove that . 3. For example.) I r x C' .. (c) in Theorem 18.) = Jdz/(z + a. Develop a method of representing any rational number as a sum of "partial fractions" of the special form a/pn (p prime. 7.1 ---. prove that 1 -:::---. . (This means that further partial fraction decompositions of b(x)/p(x) are out of the question.a)' g(x) + -----'''-'---=-- (x . 9. an are distinct. so that an > 0 in (1). 0 < a < p).) (b) Can the same be said for b(x)/[p(x)r? *10. Assuming Theorem 6 of §5. 5. *11.at' where G i = II (at . + 2n[(x + 2n)(x + 2n+l)rl .11 93 Partial Fractions 2. Find the sum of [(x + l)(x + 2)rl + 2[(x + 2)(x + 4)]-1 + . Show that over any ordered domain D. (b) the field Q of rational numbers. If ao. 6. Decompose (4x + 2)/(x 3 + 2X2 + 4x + 8) over (a) the field Zs of integers mod 5.1/3. 1/6 = 1/2 .§3.= II (x 4. deduce a canonical form for those rational functions whose denominators can be factored into linear factors.. *13. 3. (a) Prove that any rational form not a polynomial can be represented as a polynomial plus another rational form in which the numerator is 0 or has lower degree than the denominator. 
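The worked decomposition above, 3(x + 1)/(x³ − 1) = 2/(x − 1) − (2x + 1)/(x² + x + 1), can be confirmed with exact rational arithmetic: two rational forms of bounded degree that agree at more points than that bound are identical. A small sketch:

```python
from fractions import Fraction

def lhs(x):
    """3(x + 1) / (x^3 - 1), evaluated exactly at an integer x > 1."""
    return Fraction(3 * (x + 1), x ** 3 - 1)

def rhs(x):
    """2/(x - 1) - (2x + 1)/(x^2 + x + 1), the partial fraction form."""
    return Fraction(2, x - 1) - Fraction(2 * x + 1, x ** 2 + x + 1)

# Agreement at more points than the degrees involved forces identity.
assert all(lhs(x) == rhs(x) for x in (2, 3, 4, 5, 7, 10))
```

Using `Fraction` avoids the floating-point round-off that would otherwise make an equality test meaningless.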
show that the indefinite integral of any complex rational function is the sum of a rational function and a linear combination of complex logarithms log (z + a. Give a detailed proof by induction of Theorem 18. 8(a) or Ex. (b) Is this representation unique? If all fractions (including the partial fractions) are restricted to have numerators of degrees lower than the respective denominators. (a) If p(x) is irreducible.) Prove equation (13) by induction on m.-If(x)' where C = l/f(a) and g(x) is a suitable polynomial. prove that any representation of a fraction b(x)/p(x) (with b relatively prime to p) as a sum of fractions must involve at least one fraction with a denominator divisible by p(x).a.a)'f(x) C (x . . Di lemma of Pythagoras Although "modern" algebra properly stresses the wealth of prop-erties holding in general fields and integral domains. A completely geometric approach to real numbers was used by the Greeks. a number was simply a ratio (a: b) between two line segments a and b. 7). there is a "number" r satisfying r2 = 1 + 1 = 2.4 Real Numbers 4. they also have unique algebraic properties. the real and complex fields are indispensable for describing quantitatively the world in which we live. these two fields are crucial in the relation of algebra to geometry. both in elementary analytic geometry and in the further development of vectors and vector analysis (Chap. which will be exploited in later chapters of this book. We shall devote the next two chapters to these completeness properties and their algebraic implications. Especially important are the order completeness of the real field R and the algebraic completeness of the complex field C. to be proved from postulates for plane geometry (including the parallel postulate). 1. mUltiplication. The ancient Greek philosopher Pythagoras knew that the ratio r = d/ s between the length d of a diagonal of a square and the length s of its side must satisfy the equation (1) So. subtraction. For them. Moreover. 
4
Real Numbers

4.1. Dilemma of Pythagoras

Although "modern" algebra properly stresses the wealth of properties holding in general fields and integral domains, the real and complex fields are indispensable for describing quantitatively the world in which we live. Moreover, these two fields are crucial in the relation of algebra to geometry, both in elementary analytic geometry and in the further development of vectors and vector analysis (Chap. 7). They also have unique algebraic properties. Especially important are the order completeness of the real field R and the algebraic completeness of the complex field C, which will be exploited in later chapters of this book. We shall devote the next two chapters to these completeness properties and their algebraic implications.

A completely geometric approach to real numbers was used by the Greeks. For them, a number was simply a ratio (a : b) between two line segments a and b. They gave direct geometric constructions for equality between ratios and for addition, subtraction, multiplication, and division of ratios. The postulates stating that the real numbers form an ordered field (§2.4) appeared to the Greeks as a series of geometric theorems, to be proved from postulates for plane geometry (including the parallel postulate).

The ancient Greek philosopher Pythagoras knew that the ratio r = d/s between the length d of a diagonal of a square and the length s of its side must satisfy the equation

(1) r² = 1 + 1 = 2.

So, he reasoned, there is a "number" r satisfying r² = 2. On the other hand, he found that r could not be represented as a quotient r = a/b of integers, for (a/b)² = 2 would imply a² = 2b². But a² = 2b² has no solution in integers: by the prime factorization theorem, 2 divides a² just twice as often as it divides a, hence an even number of times; similarly, it divides 2b² an odd number of times.

From this "dilemma of Pythagoras" one can escape only by creating irrational numbers: numbers which are not quotients of integers. Similar arguments show that both the ratio √3 of the length of a diagonal of a cube C to the length of its side, and the ratio ∛2 of the length of a side of C to the side of a cube having half as much volume, are irrational numbers. These results are special cases of Theorem 10 of §3: ⁿ√a is irrational unless the integer a is the nth power of some integer. Further irrational numbers are π (which thus cannot be exactly 22/7 or even 3.1416), √5, and many others. In Chap. 14 we shall prove that the vast majority of real numbers not only are irrational, but also (unlike √2) even fail to satisfy any algebraic equation, and so are nonalgebraic.

To answer the fundamental question "what is a real number?" we shall need to use entirely new ideas. One such idea is that of continuity: the idea that if the real axis is divided into two segments, then these segments must touch at a common frontier point.

A second such idea is that the ordered field Q of rational numbers is dense in the real field, so that every real number is a limit of one or more sequences of rational numbers (e.g., of finite decimal approximations correct to n places). This idea can also be expressed in the statement

(2) If x < y, then there exists m/n ∈ Q such that x < m/n < y.

Intuitively, think of the real numbers as the points of a continuous line (the x-axis), and imagine the rational numbers as sprinkled densely on this line in their natural positions. From this picture, one readily concludes that every real number a can be characterized as the least upper bound of the set S of all rational numbers r = m/n (n > 0) such that r < a.

This property of real numbers was first recognized by the Greek mathematician Eudoxus. Thinking of x = a : b and y = c : d as ratios of lengths of line segments, integral multiples n·a of which could be formed geometrically, Eudoxus stipulated that (a : b) = (c : d) if and only if, for all positive integers m and n,

(3) na > mb implies nc > md, and na < mb implies nc < md.

The two preceding ideas can be combined into a single postulate of completeness, which also permits one to construct the real field as a natural extension of the ordered field Q. This "completeness" postulate is analogous to the well-ordering postulate for the integers (§1.4): both deal with properties of infinite sets. As we shall see, this completeness postulate is needed to establish certain essential algebraic properties of the real field (e.g., that every positive number has a square root).

Exercises

1. Prove that √3 is irrational.
2. Prove that √5 is irrational.
3. Prove that ∛2 is irrational.
4. Show that, if a ≠ 0 and b are rational, then au + b is rational if and only if u is rational.
*5. Prove that √2 + √5 is irrational. (Hint: Find a polynomial equation for x = √2 + √5, starting by squaring both sides of x − √2 = √5.)
6. Prove that log₁₀ 3 is irrational. (Hint: Use the definition of the logarithm.)
*6. Prove that e, as defined by the convergent series Σ_{k=0}^∞ 1/k!, is irrational. (Hint: If e were rational, then (n!)e would be an integer for some n.)

4.2. Upper and Lower Bounds

The real field can be most simply characterized as an ordered field in which arbitrary bounded sets have greatest lower and least upper bounds. We now define these two notions, which are analogous to the concepts of greatest common divisor and least common multiple in the theory of divisibility.

Definition. By an upper bound to a set S of elements of an ordered domain D is meant an element b (which need not itself be in S) such that b ≥ x for every x in S. An upper bound b of S is a least upper bound if no smaller element of D is an upper bound for S; equivalently, if for any b′ < b there is an x in S with b′ < x. The concepts of lower bound and greatest lower bound of S are defined dually, by interchanging > with < throughout.

For example, the number √2 is the least upper bound of the set of positive rational numbers m/n such that m² < 2n². Intuitively, √2 is the least real number greater than all ratios m/n (m > 0, n > 0) such that m² < 2n². It follows directly from the definition that a subset S of D has at most one least upper bound and at most one greatest lower bound (why?).

The concept of real numbers as least upper bounds of sets of rationals is directly involved in the familiar representation of real numbers by unlimited decimals. Thus we can write √2 = 1.414⋯ as both a least upper bound (l.u.b.) and a greatest lower bound (g.l.b.) of sets of finite decimals:

(4) √2 = l.u.b. (1, 1.4, 1.41, 1.414, 1.4142, ⋯) = g.l.b. (2, 1.5, 1.42, 1.415, 1.4143, ⋯).

Indeed, if the real numbers are defined as unlimited decimals, it is very easy to "see" that every nonempty set T of positive real numbers has a greatest lower bound, as follows. Consider the n-place decimals which express members of T to the first n places: there will be a least among them, because there are only a finite number of nonnegative n-place decimals less than any given member of T. Let this least n-place decimal be k + 0.d₁d₂ ⋯ dₙ, where k is some integer and each dᵢ is a digit; this is a lower bound to T. The least (n + 1)-place decimal coincides with this through the first n places, so has the form k + 0.d₁d₂ ⋯ dₙd_{n+1}, with one added digit. Our construction hence defines a certain unlimited decimal c = k + 0.d₁d₂d₃⋯. By construction, this is a lower bound to T (since its decimal expansion is greater than that of no x in T), and even a greatest lower bound (any bigger decimal would lose this property). However, it is very hard to prove what is implicitly assumed in high-school algebra: that the system of unlimited decimals is an ordered field. The difficulty begins with equations like .19999⋯ = .20000⋯, between different decimals.†

† For details, see J. F. Ritt, Theory of Functions (New York: King's Crown Press, 1947).

Exercises

1. Find three successive approximations to √2 in the domain of all rational numbers with denominators powers of 3.
2. Prove that x = .23672367⋯ represents a rational number. (Hint: Compute 1000x − x.)
3. Do the same for y = 1.12437437437⋯.
*4. Prove that any "repeating decimal," like those of Exs. 2 and 3, represents a rational number. Define your terms carefully.
*5. Does the result of Ex. 4 hold in the duodecimal scale?
6. Prove conversely that the decimal expansion of any rational number is "repeating." (Suggestion: Show that if the same remainder occurs after m as after m − k divisions by 10, the block of k digits between gets repeated indefinitely.)
7. Describe two different sets of rational numbers which both have the same l.u.b.
*8. Define the sequence x₁, x₂, x₃, ⋯ recursively by x₁ = 2, xₖ₊₁ = (xₖ/2) + (1/xₖ), so that x₂ = 3/2, x₃ = 17/12, x₄ = 577/408, ⋯. (a) Show that for k > 1, xₖ = mₖ/nₖ, where mₖ² = 2nₖ² + 1. (b) Defining εₖ = xₖ − √2, show that εₖ₊₁ < εₖ²/(2√2). (c) Show that g.l.b. (x₁, x₂, x₃, ⋯) = √2.
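The recursion of the last exercise is the classical Babylonian method for √2; a short sketch with exact rational arithmetic (the function name `sqrt2_bounds` is ours):

```python
from fractions import Fraction

def sqrt2_bounds(steps):
    """Iterate x_{k+1} = x_k/2 + 1/x_k starting from x_1 = 2.
    Each term is a rational upper bound of sqrt(2)."""
    x = Fraction(2)
    seq = [x]
    for _ in range(steps):
        x = x / 2 + 1 / x
        seq.append(x)
    return seq
```

Running three steps produces 2, 3/2, 17/12, 577/408; for k > 1 the numerators and denominators satisfy mₖ² = 2nₖ² + 1, as the exercise asserts.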
4.3. Postulates for Real Numbers

We shall now describe the real numbers by a brief set of postulates. Subsequently we shall see (Theorem 6) that these postulates determine the real numbers uniquely, up to an isomorphism.

Postulate for the real numbers. The real numbers form a complete ordered field R.

Definition. An ordered domain D is complete if and only if every nonempty set S of positive elements of D has a greatest lower bound in D.

From the properties of the real numbers given by this postulate, one can actually deduce all the known properties of the real numbers, including such a result as Rolle's theorem, which is known to be fundamental in the proof of Taylor's theorem and elsewhere in the calculus. However, we shall confine our attention to a few simple applications.

Theorem 1. In the field R of real numbers, every nonempty subset S which has a lower bound has a greatest lower bound; dually, every nonempty subset T which has an upper bound has a least upper bound.

Proof. Suppose S has a lower bound b. If 1 − b is added to each number x of S, there results a set S′ of positive numbers x − b + 1. By our postulate, this set S′ has a g.l.b. c′; the number c = c′ + b − 1 is then a g.l.b. for the original set S. Dually, if the set T has an upper bound a, the set of all negatives −y of elements of T has the lower bound −a; by the previous proof, this set has a greatest lower bound b*. The number a* = −b* then proves to be a least upper bound of the given set T, as may be readily verified. Q.E.D.

Our postulates make the real numbers an ordered field R, so Corollary 2 of Theorem 18 in §2.6 shows that R must contain a subfield isomorphic to the field Q of rationals. Since Q is defined in Chap. 2 only up to isomorphism, we can just as well assume that the field R of real numbers does contain all the rationals and hence all the integers. This convention adjusts our postulates to fit ordinary usage, and enables us to prove the following property of the reals (often called the Archimedean law).

Theorem 2. For any two numbers a > 0 and b > 0 in the field R of all real numbers (as defined by our postulates), there exists an integer n for which na > b.

Proof. Suppose the conclusion false for two particular real numbers a and b, so that, for every n, b ≥ na. Thus b is an upper bound for the set S of all the multiples na, so that S has also a least upper bound b*. Therefore b* ≥ na for every n, so that also b* ≥ (m + 1)a for every m. This implies b* − a ≥ ma, so that b* − a is an upper bound for the set S of all multiples of a, although it is smaller than the given least upper bound b*, a contradiction.

The so-established "Archimedean property" may be used to justify the condition of Eudoxus (cf. (3)).

Corollary. Given real numbers a and b, with b > 0, there exists an integer q such that a = bq + r, 0 ≤ r < b. The proof of this extension of the Division Algorithm will be left to the reader.

Theorem 3. Between any two real numbers c > d, there exists a rational number m/n such that c > m/n > d.

Proof. By hypothesis, c − d > 0, so the Archimedean law yields a positive integer n such that n(c − d) > 1, or 1/n < c − d. Now let m be the smallest integer such that m > nd, so that (m − 1)/n ≤ d. Since m/n > d, and m/n = (m − 1)/n + 1/n ≤ d + 1/n < d + (c − d) = c, this completes the proof.

We can visualize the above proof as follows. The various fractions 0, ±1/n, ±2/n, ⋯ with a fixed denominator n are spaced along the real axis at intervals of length 1/n. To be sure that one such point falls between c and d, we need only make the spacing 1/n less than the given difference c − d.

This theorem may be used to substantiate formally the idea used intuitively in a representation like (4) of a real number as a l.u.b. of rationals.

Corollary. Every real number is the l.u.b. of a set of rationals.

Proof. For a given real number c, let S denote the set of all rationals m/n < c. Then c is an upper bound of S, and by the theorem no smaller real number d could be an upper bound of S; hence c is the least upper bound of S.

Exercises

1. Exhibit the l.u.b. and g.l.b. of each of the following sets of rational numbers: (a) 1/3, 4/9, 13/27, 40/81, ⋯; (b) 1/2, 3/4, 7/8, 15/16, ⋯.
*2. Prove that there is no ordered domain D in which every nonempty set has a l.u.b. (Hint: Show that D itself can have no upper bound.)
3. Show that an element a* in an ordered field is a least upper bound for a set S if and only if (i) x ≤ a* for all x ∈ S and (ii) for each positive e in the field, there is an x in S with |x − a*| < e.
4. Let a set S have a l.u.b. a* and a g.l.b. b*. (a) Show in detail why the set of all numbers −3x, for x in S, has the l.u.b. −3b* and the g.l.b. −3a*. (b) In the same way, find the l.u.b. and g.l.b. of the set of all numbers x + 5, for x in S.
5. In Ex. 4, find the l.u.b. and g.l.b. (a) of the set of all numbers 7x + 2 for x in S; (b) of the set of all numbers 1/x for x ≠ 0 in S. What is the l.u.b., if b* > 0?
6. Prove in detail the corollary to Theorem 2.
7. If h > 1 is an integer, prove that between any two real numbers c > d there lies a rational number of the form m/hᵏ, where m and k are suitable integers.
*8. Show that between any two real numbers c < d, there exists a rational cube (m/n)³ such that c < (m/n)³ < d.
9. Show that the ordered domain Z is complete.
10. Every real number is the l.u.b. of a set of rationals. Is this true for rational squares?
11. Let S₁ and S₂ be sets of real numbers with the respective least upper bounds b₁ and b₂. What is the least upper bound (a) of the set S₁ + S₂ of all sums s₁ + s₂ (for s₁ in S₁ and s₂ in S₂); (b) of the set of all elements belonging either to S₁ or to S₂?
12. Collect in one list a complete set of postulates for the real numbers.
13. Construct a system of postulates for the positive real numbers.
14. State in geometrical language a postulate on points of the real axis which asserts that bounded sets have l.u.b. (use the words "left" and "right").
15. Show that a/b = c/d if and only if the condition (3) of Eudoxus is satisfied.
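The proof of Theorem 3 is constructive and can be carried out mechanically; a sketch (`rational_between` is our name, not the text's). Exact `Fraction` inputs are used, since floating-point inputs that are nearly equal can spoil the strict inequalities.

```python
from fractions import Fraction
from math import floor

def rational_between(c, d):
    """Given c > d, return a rational m/n with d < m/n < c, as in Theorem 3:
    choose n with n*(c - d) > 1, then let m be the smallest integer > n*d."""
    assert c > d
    n = floor(1 / (c - d)) + 1      # Archimedean choice: n*(c - d) > 1
    m = floor(n * d) + 1            # smallest integer m with m > n*d
    return Fraction(m, n)
```

For example, between 1 and 101/100 the construction yields n = 101 and m = 102, i.e. the rational 102/101.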
4.4. Roots of Polynomial Equations

We shall now show how to use the existence of least upper bounds to prove various properties of the real number system R, including first the existence of solutions for equations such as x² = 2. The proof depends upon two lemmas.

Lemma 1. For any real x and h,

p(x + h) − p(x) = hg(x, h),

where g(x, h) is a polynomial depending only on p(x).

Proof. For each monomial term aₖxᵏ of p(x), this is true by the binomial theorem (formula (3) of §1), by §3.3 and induction. Summing over k and taking out the common factor h, we get the desired result.

Lemma 2. If p(x) is a polynomial with real coefficients, then for given a, b there exists a real constant M, depending only on the coefficients of p(x) and the interval a ≤ x ≤ b, such that |p(x + h) − p(x)| ≤ Mh for all x and all positive h satisfying a ≤ x ≤ b, a ≤ x + h ≤ b.

Proof. By Lemma 1, it suffices to show that |g(x, h)| ≤ M whenever |x| ≤ |a| + |b| and |h| ≤ |b − a|. But if we replace each term in g(x, h) by its absolute value, we increase |g(x, h)| or leave it undiminished; we do the same again if we replace |x| and |h| by |a| + |b| and |b − a|, respectively. This substitution gives us a real constant M with the desired property.

Having Lemma 2 at our disposal, we are ready to prove

Theorem 4. If p(x) is a polynomial with real coefficients, if a < b, and if p(a) < p(b), then for every constant C satisfying p(a) < C < p(b), the equation p(x) = C has a root between a and b.

Geometrically, the hypothesis means that the graph of y = p(x) meets the horizontal line y = p(a) at x = a and the line y = p(b) at x = b; the conclusion asserts that the graph must also meet each intermediate horizontal line† y = C at some point with an x-coordinate between a and b.

† There is a general theorem of analysis which asserts this conclusion, not only for polynomial functions p(x), but for any continuous function.

Proof. Let S denote the set of real numbers between a and b satisfying p(x) < C. Since p(a) < C, S is nonvoid, and it has b as upper bound; hence it has a real least upper bound c. We shall show that p(c) = C; for this purpose, it is clearly sufficient to exclude the possibilities p(c) < C and p(c) > C.

But p(c) < C would imply that p(c + h) < C for h = [C − p(c)]/(2M), by Lemma 2, whence (c + h) ∈ S. This would contradict our definition of c as an upper bound of S. (Lemma 2 applies because c + h ≥ b is evidently also impossible.) Again, p(c) > C would imply p(c − h) > C for all positive h ≤ [p(c) − C]/(2M), by Lemma 2. This would contradict our definition of c as the least upper bound of S: c − [p(c) − C]/(2M) would give a smaller upper bound. There remains only the possibility p(c) = C. Q.E.D.

From the theorem one readily proves:

Corollary 1. If p(x) is of odd degree, then p(x) = C has a real root for every real number C.

Corollary 2. If p(x) is a polynomial with positive coefficients and no constant term, and if C > 0, then p(x) = C has a positive real root.

Theorem 4 does not give a construction for actually computing a root of p(x) = C in decimal form, but this is easy to do. For example, if a < b and p(a) < C < p(b), one can let c₁ = (a + b)/2. Then p(c₁) = C or p(c₁) > C or p(c₁) < C. In the first case the root is found; in the second and third cases there is a root in an interval (either a ≤ x ≤ c₁ or c₁ ≤ x ≤ b) half as long as before. By repeating this construction, a root of p(x) = C can be found to any desired approximation. Convergence would be much faster if one used linear interpolation and set c₁ = a + [C − p(a)][b − a][p(b) − p(a)]⁻¹. Other efficient methods of calculating roots of equations are studied in analysis.

In special cases, the solution is immediate. For example, if |x| < 1 and C > 0, one may use the infinite series

(5) √(1 + x) = 1 + (1/2)x + (1/2)(−1/2)(x²/2!) + (1/2)(−1/2)(−3/2)(x³/3!) + ⋯ .

Appendix. Trigonometric Solution of Cubics. In the case of a cubic equation

(6) a₃x³ + a₂x² + a₁x + a₀ = 0,

the real roots can be found as follows. Dividing through by a₃, we reduce (6) to the case a₃ = 1. Then, by making the substitution x = y − a₂/3 and transposing the constant term, we reduce (6) to

(7) y³ + py = q.

If p = 0, the solution is immediate. Otherwise, setting y = hz with h = √(4|p|/3), and multiplying (7) through by k = 3/(h|p|), we can reduce it to one of the forms

(8) 4z³ + 3z = C or 4z³ − 3z = C.

To solve the first equation, one can use the familiar identity sinh 3θ = 4 sinh³ θ + 3 sinh θ, whence

(9a) z = sinh [(1/3) sinh⁻¹ C].

To solve the second equation, if C ≥ 1, we use the analogous formula cosh 3θ = 4 cosh³ θ − 3 cosh θ, to get

(9b) z = cosh [(1/3) cosh⁻¹ C].

If C ≤ −1, the same method applies after changing the sign of z. To solve the second equation when |C| < 1 (this is the so-called irreducible case of §15), use similarly cos 3θ = 4 cos³ θ − 3 cos θ, to get

(9c) z = cos [(1/3) cos⁻¹ C].

In this case z assumes three values, because cos⁻¹ C has three values differing by multiples of 120°.

Exercises

1. Prove that every positive real number has a real square root.
2. Show that for any positive real number a and any integer n > 0, the equation xⁿ = a has one and only one positive real root ⁿ√a.
3. Find √2 to six places using (5) and (5√2/7)² = 1 + 1/49.
4. Find √5 to four decimal places, using (5) and (√5/2)² = 1 + 1/4.
5. Find to three decimal places the real roots of (a) 3x³ − x = 1/9; (b) x³ − 3x² + 6x = 7; (c) x³ + 3x² + 2 = 0.
6. Show that x⁴ − x = C has two real roots for every C > −3/8.
7. Show that a monic polynomial of even degree assumes a least value K, and every value C > K twice.
8. (a) If a and b are positive reals, show that axⁿ⁺¹ > bxⁿ for all sufficiently large positive values of x. (b) Given a polynomial p(x) with a positive leading coefficient, find a real number M such that p(x) > 0 for all x > M.
9. Prove Corollary 1.
10. Prove Corollary 2.
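Formulas (9a)-(9c) translate directly into code; a sketch (the function names are ours), resting on the identities cos 3θ = 4 cos³ θ − 3 cos θ and their hyperbolic analogues:

```python
from math import cos, acos, cosh, acosh, sinh, asinh, pi

def solve_first(C):
    """The unique real solution z of 4*z**3 + 3*z = C, following (9a)."""
    return sinh(asinh(C) / 3)

def solve_second(C):
    """Real solutions z of 4*z**3 - 3*z = C, following (9b)-(9c):
    |C| <= 1 gives three roots (the 'irreducible case'); otherwise one."""
    if abs(C) <= 1:
        return [cos((acos(C) + 2 * pi * k) / 3) for k in range(3)]
    if C >= 1:
        return [cosh(acosh(C) / 3)]
    return [-cosh(acosh(-C) / 3)]   # C <= -1: change the sign of z
```

A general cubic would first be reduced to (8) by the substitutions of the Appendix; here we only verify the reduced forms.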
4.5. Dedekind Cuts

Imagine the rational numbers sprinkled in their natural position on the x-axis. Cutting the x-axis (say with scissors) divides the rational numbers into two classes, L on the left and U on the right. Every rational number falls into one of these two classes, while a rational number m/n is in both only if the axis is cut exactly at the point x = m/n. This leads to the idea of a Dedekind cut.

By a "Dedekind cut" in an ordered field F, we mean a pair of nonvoid subsets L and U such that (i) L is the set of all lower bounds to the elements of U, and (ii) U is the set of all upper bounds to the elements of L. Clearly, there is a cut (La, Ua) through every a, if La is the set of x ≤ a, and Ua the set of x ≥ a. If L and U have an element a in common, the cut will be said to go through a.

Dedekind Cut Axiom (on an ordered field F). Every cut (L, U) goes through some element a.

Lemma 1. The lower and upper halves of a Dedekind cut taken together include all elements; they have at most one element in common.

Proof. Let (L, U) be any cut, and let x ∈ F be given. If x < y for every y of U, then x is a lower bound of the elements of U, whence x ∈ L. Otherwise, by the trichotomy law, x ≥ y for some y ∈ U; since y is an upper bound of L, so is x, and so x ∈ U. This proves the first assertion: every element of F is in either L or U. Next, let a and b each be both in L and in U. Then a ≥ b (since a ∈ U, b ∈ L) and a ≤ b (since a ∈ L, b ∈ U), whence a = b, proving the second assertion. Observe especially that if x < a for some a ∈ L, then x ∈ L; dually, if x > a for some a ∈ U, then x ∈ U.

Theorem 5. The Dedekind cut axiom holds in an ordered field F if and only if F is a complete ordered field.

Proof. Suppose first that the Dedekind axiom holds, and that S is a nonempty bounded set. Let U be the set of all upper bounds of S, and L the set of all lower bounds of U (clearly, L contains S). To prove (L, U) a cut, one need only establish that U is the set of all upper bounds of L. Indeed, every element of U is an upper bound of L (x ≤ y for all x ∈ L, y ∈ U); while since L contains S, U includes all such upper bounds. Now by the Dedekind axiom, the cut (L, U) goes through some element a, which is an upper bound of S qua an element of U, and a least upper bound (i.e., a ≤ x for all x ∈ U) qua an element of L.

Conversely, suppose the existence of least upper bounds is given, and let (L, U) be any cut. L has a least upper bound a. Since a is an upper bound of L, it must lie in U. Again, since it is a least upper bound, it is a lower bound for all the upper bounds of L, and so for all the elements of U; whence a ∈ L. So the given cut does go through the element a. Q.E.D.

We shall now sketch a proof of the categorical nature of our postulate (§4.3) that the real number system is a complete ordered field.

Theorem 6. Any two complete ordered fields are isomorphic.

Proof. Let F′ and F″ be any two such fields. By Corollary 2 of Theorem 18 in §2.6, they will contain isomorphic "rational" subfields Q′ and Q″, respectively. We shall extend the isomorphism between Q′ and Q″ (an isomorphism which preserves order as well as sums and products) to an isomorphism between F′ and F″.

First, every a′ ∈ F′ defines a cut in F′, and thereby a cut in Q′ (the subfield of rationals). But by Theorem 3, a′ is determined by this cut in Q′, as a′ = l.u.b. L_R; and every cut (L_R, U_R) in Q′ determines an a′ = l.u.b. L_R in this way. Cuts in Q″ behave similarly, whence the elements of F′ and F″ are bijective to the cuts in Q′ and Q″, respectively. This bijection clearly preserves order.

Finally, the operations in F′ and F″ can be defined from those of Q′ and Q″, so as to extend the isomorphism. More precisely, let a and b correspond to cuts (La, Ua) and (Lb, Ub). Then a + b corresponds to the cut† (La + Lb, Ua + Ub), where La + Lb is the set of sums x + y (x ∈ La, y ∈ Lb), while Ua + Ub is similarly described. To multiply positive elements a and b, form similar cuts in the system of positive rationals. Then ab corresponds to the cut (LaLb, UaUb), where LaLb is the set of products xy (x ∈ La, y ∈ Lb), and similarly for UaUb. Since (−a)b = a(−b) = −ab and (−a)(−b) = ab, this extends to all products. We omit the details.

† In certain cases, (La + Lb, Ua + Ub) fails to be a cut because the number a + b appears in neither half, but one then obtains a cut if the missing number is adjoined to both halves. A similar remark applies to LaLb below.

Conversely, one may use the cuts to "construct" the real numbers from the integers or positive integers. One first proves that the rationals form an ordered field Q having the Archimedean property stated in Theorem 2. By defining the addition and multiplication of cuts in Q in the way sketched in the previous paragraph, one can show that the cuts in Q form an ordered field satisfying the Dedekind cut axiom, hence giving a complete ordered field. But the proof is long and would lead us far afield, so that we shall just state the result.†

Theorem 7. There is one and (except for isomorphic fields) only one complete ordered field.

Instead of using Dedekind cuts, it is also possible to construct the real numbers from the rationals as limits of sequences of rationals.

† See the treatment in Chapter VI of C. C. MacDuffee, Introduction to Abstract Algebra (New York: Wiley, 1940).

Exercises

1. Show that if (L, U) and (L′, U′) are cuts in the rational field, every rational number with one exception at most can be written either as x + y (x ∈ L, y ∈ L′) or as u + v (u ∈ U, v ∈ U′).
2. State and prove an analogous theorem for the positive rational numbers under multiplication. Why does this theorem fail for negative rationals?
3. A Dedekind cut in an ordered field F is sometimes defined as a pair of subsets L′ and U′ of F such that every element of F lies either in L′ or in U′ and such that x < y whenever x ∈ L′ and y ∈ U′. By adding and deleting suitable single numbers, show that every cut (L′, U′) of this type gives a cut (L, U) in the sense of the text, and conversely.
*4. Show that if F is an ordered field and if, for each rational function R(x) = (b₀ + b₁x + ⋯ + bₘxᵐ)/(a₀ + a₁x + ⋯ + aₙxⁿ) with bₘ ≠ 0 ≠ aₙ, we define R(x) > 0 to mean that aₙbₘ > 0, then F(x) becomes an ordered field, in which R(x) > 0 if and only if R(t) > 0 for all sufficiently large t in F.
5. Show that for every ε > 0 there is an n so large that 10⁻ⁿ < ε.
6. Let D be a "complete" ordered domain not isomorphic to Z. Show that D contains an element t with 0 < t < 1.
7. If t is an element in an ordered domain D with 0 < t < 1, and b and c are any positive elements of D, show that tⁿb < c for some n.
*8. In Ex. 6, show that s = 2 − t has the properties s > 1, st < 1. (Hint: To find the inverse of b > 1, consider all x with xb < 1.)
*9. Use Exs. 6 and 7 to show that any "complete" ordered domain is isomorphic either to Z or to R.
*10. (a) Prove that any isomorphism of R with itself preserves the relation x < y. (Hint: x < y if and only if z² = y − x has a root z ≠ 0.) (b) Using (a), prove that the only isomorphism of R with itself is the trivial isomorphism x ↦ x.
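A Dedekind cut of a real number can be represented computationally by a membership test for its lower half; the following sketch (names ours, not the text's) recovers decimal digits of √2 from the cut whose lower half is L = {q : q ≤ 0 or q² < 2}, illustrating how the cut alone determines the number.

```python
from fractions import Fraction

def sqrt2_lower(q):
    """Membership test for the lower half L of the cut of sqrt(2)."""
    return q <= 0 or q * q < 2

def decimals_from_cut(in_lower, places):
    """Recover the decimal expansion of the positive real the cut defines:
    at each place keep the largest digit whose trial decimal stays in L."""
    approx = Fraction(0)
    digits = []
    for n in range(places + 1):
        step = Fraction(1, 10 ** n)
        d = 0
        while d < 10 and in_lower(approx + (d + 1) * step):
            d += 1
        approx += d * step
        digits.append(d)
    return digits   # [integer part, first decimal place, second, ...]
```

The same routine applied to the cut L = {q : q < 1/3} of a rational number yields the repeating expansion 0.3333⋯, tying this section back to the unlimited decimals of §4.2.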
5
Complex Numbers

5.1. Definition

Especially in algebra, but also in the theory of analytic functions and differential equations, many algebraic theorems have much simpler statements if one extends the real number system R to a larger field C of "complex" numbers. This we shall now define, and show that it is what one gets from the real field if one desires to make every polynomial equation have a root.

Definition. A complex number is a couple (x, y) of real numbers, x being called the real and y the imaginary component of (x, y). Complex numbers are added and multiplied by the rules:

(1) (x, y) + (x′, y′) = (x + x′, y + y′),
(2) (x, y)·(x′, y′) = (xx′ − yy′, xy′ + yx′).

The system of complex numbers so defined is denoted by C.

We owe the above definition not to divine revelation, but to simple algebraic experimentation. It was observed that the equation x² = −1 had no real root (x² being never negative). This suggested inventing an imaginary number i satisfying i² = −1, and otherwise satisfying the ordinary laws of algebra. Stated in precise language, it suggested the plausible hypothesis that there was an integral domain D containing such an element i and the real field R as well. In D, any expression of the form x + yi (x, y real numbers) would represent an element, and by the laws of ordinary algebra,

(1′) (x + yi) ± (x′ + y′i) = (x ± x′) + (y ± y′)i,
(2′) (x + yi)·(x′ + y′i) = xx′ + (xy′ + yx′)i + yy′i².

Since i² = −1, we get from (2′)

(2″) (x + yi)·(x′ + y′i) = (xx′ − yy′) + (xy′ + yx′)i.

Moreover, distinct couples (x, y) of real numbers determine distinct elements x + yi of D. For (x + yi) = (x′ + y′i) implies (x − x′) = (y′ − y)i; hence, squaring both sides, (x − x′)² = −(y′ − y)². Since (x − x′)² ≥ 0 and −(y′ − y)² ≤ 0, this is impossible, by the definition of an integral domain, unless x = x′ and y = y′.

Theorem 1. The complex number system C, as defined above, is a field containing a subfield isomorphic to R and a root of x² + 1 = 0.

Proof. For the couples (x, y), the commutative and associative laws of addition are immediate consequences of the fact that real and imaginary components are added independently; so are the fact that (0, 0) is an additive identity and the fact that (−x, −y) is an additive inverse of (x, y). For the commutative and associative laws of multiplication, it is preferable to check by direct substitution in the definition (2); only the calculation for the associative law is long-winded. Similarly, the distributive law z(z′ + z″) = zz′ + zz″ can be checked directly. Thus let z = (x, y), z′ = (x′, y′), z″ = (x″, y″). Then

z(z′ + z″) = (x, y)·(x′ + x″, y′ + y″)
  = (x(x′ + x″) − y(y′ + y″), x(y′ + y″) + y(x′ + x″))
  = (xx′ − yy′ + xx″ − yy″, xy′ + yx′ + xy″ + yx″)
  = zz′ + zz″.

We omit the details of the remaining laws. Finally, the facts that (1, 0) is a multiplicative identity and that every (x, y) ≠ (0, 0) has a multiplicative inverse

(3) (x, y)⁻¹ = (x/(x² + y²), −y/(x² + y²))

may likewise be checked by direct substitution in (2).

In this field C of couples of real numbers one may find a subfield of real numbers by exploiting the correspondence (x, y) ↦ x + yi. Specifically, if the second components y and y′ in the definitions (1) and (2) are both zero, then the first components x and x′ add and multiply just as do the real numbers x and x′. This is just the recognition that the correspondence x ↦ (x, 0) is an isomorphism of the field R of reals to a subset of C. We agree, as in previous cases, that each such special complex number (x, 0) is simply to be identified with the corresponding real number x. Again, a special case of the definition (2) shows that (0, 1)·(0, 1) = (−1, 0) = −1; the desired square root of −1 is thus the couple (0, 1). Hence we define i to be the couple (0, 1). Any couple (x, y) then has the form

(4) (x, y) = (x, 0) + (y, 0)·(0, 1) = x + yi.

This proves Theorem 1. We now prove our conjecture that there does indeed exist, up to isomorphism, only one integral domain which contains the real numbers and a square root of −1 and is generated by them.

Theorem 2. Let D be any integral domain containing the real number system R and a square root i of −1. Then the subdomain of D generated by R and i is isomorphic with C.

Proof. As shown above, distinct couples (x, y) of real numbers determine distinct elements x + yi of D. This establishes a one-one correspondence of the form (x, y) ↦ x + yi between the elements of C and those of the subdomain of D generated by R and i. Comparing formulas (1′)-(2″) with (1)-(2), we see that the correspondence preserves sums and products, hence is an isomorphism. It is a corollary that the subdomain of D generated by R and i contains all elements of the form x + yi and no others.

The notation x + yi is so suggestive that we shall usually employ it instead of (x, y), though we shall also often write z = (x, y) in the sequel. For brevity, we use a single letter to denote a complex number, and the two immediately preceding letters of the alphabet for its real and imaginary components:

z = (x, y) = x + yi, w = (u, v) = u + vi, c = (a, b) = a + bi.

Exercises

1. Check that complex multiplication is commutative and associative.
2. Check that (x, y)·(x, y)⁻¹ = (1, 0) holds if formula (3) is used.
3. Find complex numbers z = x + yi and w = u + vi which satisfy (a) z + iw = 1, iz + w = 1 + i; (b) (1 + i)z − iw = 3 + i, (2 + i)z + (2 − i)w = 2i.
4. Solve (1, 1)·(x, y) = (2, 1) (a) as a pair of simultaneous linear equations in x and y; (b) using (3).
5. Describe the subfield of C which is generated by i and the rational numbers.
6. Show that there is no possible definition of "positive complex number" which would make C an ordered field. Justify your answer.
7. Is Theorem 1 still true if D is a commutative ring? Give details.
8. Using the methods of Theorems 1 and 2, show without recourse to the real numbers that the rational field Q can be extended to a larger field Q(√2) containing Q and a square root of 2.
*9. Show that if F is any ordered field,
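Rules (1), (2), and formula (3) are easily mechanized; a minimal sketch (the class name `Couple` is ours, chosen only for illustration):

```python
class Couple:
    """Complex numbers as couples (x, y) of reals, per rules (1)-(2)."""
    def __init__(self, x, y):
        self.x, self.y = x, y
    def __add__(self, w):
        return Couple(self.x + w.x, self.y + w.y)            # rule (1)
    def __mul__(self, w):
        return Couple(self.x * w.x - self.y * w.y,           # rule (2)
                      self.x * w.y + self.y * w.x)
    def inverse(self):
        n = self.x ** 2 + self.y ** 2                        # formula (3)
        return Couple(self.x / n, -self.y / n)
    def __eq__(self, w):
        return (self.x, self.y) == (w.x, w.y)
```

In particular `Couple(0, 1) * Couple(0, 1)` evaluates to `Couple(-1, 0)`, the computation behind the identification i² = −1 in the text.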
They determine x and y, by the usual laws for the transformation from polar to rectangular coordinates, as

(6)  x = r cos θ,  y = r sin θ;  hence  z = r(cos θ + i sin θ).

One also writes (6) in the form z = re^(iθ), since the usual Taylor series expansion gives

e^(iθ) = 1 + iθ + (-1)θ²/2! + (-i)θ³/3! + ··· = cos θ + i sin θ.

The importance of the absolute values and arguments rests largely on de Moivre's formulas, which may be stated as follows:

Theorem 3. The absolute value of a product of complex numbers is the product of the absolute values of the factors; the argument is the sum of the arguments of the factors. In other words,

(7)  |zz'| = |z|·|z'|,  arg zz' = arg z + arg z'.

Proof. As in (6), write z = r(cos θ + i sin θ), z' = r'(cos θ' + i sin θ'). Substituting in the definition (2), we get

zz' = rr'[(cos θ cos θ' - sin θ sin θ') + i(cos θ sin θ' + sin θ cos θ')].

By well-known trigonometric formulas, this is equivalent to zz' = rr'[cos(θ + θ') + i sin(θ + θ')]. This gives the result (7). Q.E.D.

Theorem 3 shows that "arguments" and "absolute values" of complex numbers combine independently under multiplication, and themselves satisfy the same laws.

Not only the multiplicative, but the additive properties of (inequalities on) absolute values are valid for complex as well as real numbers:

(8)  |z| > 0 unless z = 0, and |0| = 0;
(9)  |z + z'| ≤ |z| + |z'|.

To prove these, note that formula (1) means that the sum z + z' may be found by drawing (Figure 2) the parallelogram with three vertices at z, 0, and z'; the fourth vertex will be z + z'. Formulas (8) and (9) now follow from the identity between absolute values and geometrical lengths. From the de Moivre formulas (7) one sees immediately that
[r(cos θ + i sin θ)]⁻¹ = (1/r)[cos(-θ) + i sin(-θ)].

Complex nth roots of unity may be found using trigonometry. Since z = r(cos θ + i sin θ) implies zⁿ = rⁿ(cos nθ + i sin nθ), by de Moivre's formulas (7), one sees that zⁿ = 1 if and only if |z|ⁿ = 1 and n arg z is an integral multiple 2kπ of 2π. Since |z| > 0, this gives |z| = 1 and arg z = 2kπ/n. Since arg z is single-valued on 0 ≤ θ < 2π, there are thus precisely n solutions of zⁿ = 1. If we denote cos 2π/n + i sin 2π/n by ω, these solutions are 1, ω, ω², ..., ω^(n-1); in rectangular coordinates they are 1, cos 2π/n + i sin 2π/n, ..., cos 2π(n-1)/n + i sin 2π(n-1)/n. Geometrically stated, this is

Theorem 4. The complex nth roots of unity are the vertices of a regular polygon of n sides inscribed in the unit circle |z| = 1.

Using the exponential notation z = re^(iθ), we obtain another representation of these nth roots of unity, as e^(2πik/n) for k = 0, 1, ..., n - 1.

Consider more generally the equation zⁿ = c, where c ≠ 0 is any complex number. In polar coordinates, one solution of this is z₀ = |c|^(1/n)(cos θ₀ + i sin θ₀), with θ₀ = (1/n) arg c. Moreover, wz₀ is a root of zⁿ = c if and only if c = (wz₀)ⁿ = wⁿz₀ⁿ = wⁿc, whence wⁿ = 1. Thus the nth roots of c are z₀, ωz₀, ω²z₀, ..., ω^(n-1)z₀, where ω is as defined above; they are also represented by the vertices of a regular polygon.

One can easily compute the nth roots z₀, ωz₀, ..., ω^(n-1)z₀ of c = a + bi numerically, with the aid of logarithmic and trigonometric tables. From the identity |z₀|ⁿ = |c| one can compute |z₀|; arg z₀ equals (1/n) tan⁻¹(b/a), and arg ω^k z₀ = (1/n) tan⁻¹(b/a) + 360k/n in degrees. The computation is completed by the formula z = r(cos θ + i sin θ) = |z| cos(arg z) + i|z| sin(arg z).
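The recipe just described can be sketched in a few lines of Python, with the cmath library playing the role of the trigonometric tables (the function name nth_roots is ours):

```python
import cmath

def nth_roots(c, n):
    # The n solutions of z**n = c: modulus |c|**(1/n),
    # arguments (arg c + 2*pi*k)/n for k = 0, 1, ..., n-1.
    r = abs(c) ** (1.0 / n)
    theta = cmath.phase(c)
    return [cmath.rect(r, (theta + 2 * cmath.pi * k) / n) for k in range(n)]

roots = nth_roots(1, 6)                                # the sixth roots of unity
assert all(abs(z ** 6 - 1) < 1e-9 for z in roots)      # each satisfies z⁶ = 1
assert all(abs(abs(z) - 1) < 1e-9 for z in roots)      # all lie on |z| = 1

roots2 = nth_roots(2 + 2j, 3)                          # the cube roots of 2 + 2i
assert all(abs(z ** 3 - (2 + 2j)) < 1e-9 for z in roots2)
```

Plotting the six roots of unity exhibits the regular hexagon of Theorem 4.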
In general, every nth root of unity satisfies zⁿ - 1 = 0; all except z = 1 satisfy

(10)  φₙ(z) = (zⁿ - 1)/(z - 1) = z^(n-1) + z^(n-2) + ··· + z + 1 = 0.

In §3.10 Eisenstein's criterion was used to show that φₚ(z) is irreducible over Q (the rationals) if n = p is a prime. If n is not a prime, the facts become less simple; we can factor out from (10) the cyclotomic polynomials satisfied by kth roots of unity, where k runs through the proper divisors of n. Thus z³ + z² + z + 1 = (z + 1)(z² + 1) is reducible. These equations, known as the "cyclotomic" equations, play an important role in the theory of equations.

By definition, the nth roots of unity which are not also kth roots of unity for some k < n are called primitive nth roots of unity. (Thus, if n = 4, the primitive fourth roots of unity are i and -i.) They are the ω^m with m relatively prime to n, and they all satisfy the same irreducible equation over the rational field. Each complex nth root of unity satisfies a polynomial equation with rational coefficients irreducible over the field of rationals. But the proof of this result, and the computation of the degree of this equation, involve more number theory than is desirable here.

Exercises

1. Find to 4 decimal places (using trigonometric tables) the real and imaginary components of the cube roots and the fifth roots of unity, and plot them on graph paper.
2. Find to 4 decimal places the cube and fourth roots of 2 + 2i.
3. List the primitive twelfth roots of unity.
4. Find the irreducible factors of z⁶ - 1 over Q (the rationals).
5. (a) Prove that ω = cos(2π/n) + i sin(2π/n) is a primitive nth root of unity. (b) Prove that ω^m is a primitive nth root of unity if and only if m is relatively prime to n.
6. Describe geometrically the correspondence z ↦ zi.
7. Describe geometrically the effect of transformations z ↦ cz + d (c, d ∈ C, c ≠ 0). What if |c| = 1? (Hint: Use the words "translation," "rotation," and "expansion.")
8. Prove the commutative and associative laws of multiplication and the existence of multiplicative inverses from de Moivre's formulas.

5.3. Fundamental Theorem of Algebra

We saw in §5.1 that the complex number system is obtained by adjoining to the real number system R an imaginary root i of the equation z² + 1 = 0. But why stop here?
Why not try to add "imaginary" roots of other polynomial equations so as to get still larger fields? The answer is contained in the so-called Fundamental Theorem of Algebra: as soon as i is adjoined, every polynomial equation has actual (complex) roots, so that one does not need to invent imaginary ones to solve equations.

Theorem 5 (Euler-Gauss). Every polynomial p(z) of positive degree with complex coefficients has a complex root.

Many proofs of this celebrated theorem are known.† Of these, we have selected one whose nonalgebraic part is especially plausible intuitively.

† All proofs involve nonalgebraic concepts like those introduced in Chap. 4. See, for example, L. E. Dickson, New First Course in the Theory of Equations (New York: Wiley, 1939), p. 145, or L. Weisner, Introduction to the Theory of Equations (New York: Macmillan, 1938), Appendix.

Proof. Since p(z) = aₘz^m + a_(m-1)z^(m-1) + ··· + a₀, with aₘ ≠ 0, has the same roots as the monic polynomial obtained on dividing through by aₘ, only the case where the leading coefficient is unity need be discussed.

In this case let us picture two complex planes, labeling one the "z-plane" and the other the "w-plane." The given function q(z) maps each point z₀ = (x₀, y₀) of the z-plane onto a point w₀ = q(z₀) of the w-plane. Moreover, if z describes a continuous curve on the z-plane, then q(z) (being differentiable) will describe a continuous curve on the w-plane. Our object is to show that the origin 0 of the w-plane is the "image" q(z) of some z on the z-plane--or, what is the same thing, that the image of some circle on the z-plane passes through w = 0.

For each fixed r > 0, the function w = q(re^(iθ)) defines a closed curve γᵣ' in the w-plane: the image of the circle γᵣ: |z| = r (z = re^(iθ)) of radius r and center 0 in the z-plane.
(If γᵣ' passes through the origin w = 0, the conclusion of Theorem 5 is immediate.) Otherwise, consider the line integral‡

φ(r, θ) = ∫ d(arg w) = ∫ (u dv - v du)/(u² + v²),

taken along γᵣ'; since arg w = arctan(v/u), the identity d(arg w) = (u dv - v du)/(u² + v²) holds, and the integral is defined for any γᵣ' not passing through the origin w = 0. It is geometrically obvious that φ(r, 2π) = 2πn(r), where the winding number n(r) is the number of times that γᵣ' winds counterclockwise around the origin as z describes the circle γᵣ counterclockwise. Thus n(r) = 1 in the imaginary example depicted in Figure 3.

‡ In proving the existence of line integrals, essential use is made of the completeness of R. We do not prove this nonalgebraic part in detail from the relevant axioms of Chap. 4.

We shall now show that if r is large enough, n(r) is the degree m of q(z). Indeed, let q(z) = z^m + c_(m-1)z^(m-1) + ··· + c₁z + c₀; then

q(z) = z^m (1 + Σ c_(m-k) z^(-k)),  the sum taken over k = 1, ..., m.

By de Moivre's formulas (7),

arg q(z) = m arg z + arg(1 + Σ c_(m-k) z^(-k)).

Hence, as z describes the circle γᵣ counterclockwise, the net change in arg q(z) is the sum of m times the change in arg z (which is m · 2π) plus the change in arg(1 + Σ c_(m-k) z^(-k)).
But if |z| = r is sufficiently large, then by formulas (8) and (9) the factor

u = 1 + Σ c_(m-k) z^(-k)

stays in the circle |u - 1| < 1/2, and so goes around the origin zero times (make a figure to illustrate this). We conclude that if r is large enough, n(r) = m: the total change in arg q(z) is 2πm.

Now assume c₀ ≠ 0 (in the contrary case z = 0 is a root, and there is nothing to prove), and consider the variation of n(r) with r. For small r the image stays near c₀, so n(0) = 0. As r changes, γᵣ' is deformed continuously (since q(z) is continuous), and n(r) varies continuously with r except when γᵣ' passes through the origin. But it is geometrically evident† that a curve which winds around the origin n ≠ 0 times cannot be continuously deformed into a point without being made to pass through the origin at some stage of the deformation. Since n(r) = m ≠ 0 for r large enough, it follows that, for some r, γᵣ' must pass through the origin: q(z) = 0! Q.E.D.

† This is proved as a theorem in plane topology; see S. Lefschetz, Introduction to Topology (Princeton University Press, 1949), p. 127.
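The winding number n(r) used in the proof can be estimated numerically. The sketch below tracks the net change of arg q(z) around the circle |z| = r; the sample cubic and the step count are our own choices for illustration:

```python
import cmath

def winding_number(p, r, steps=2000):
    # Net change of arg p(z), divided by 2*pi, as z = r*e^{i*theta}
    # traverses the circle |z| = r counterclockwise.
    total = 0.0
    prev = cmath.phase(p(complex(r)))
    for k in range(1, steps + 1):
        z = cmath.rect(r, 2 * cmath.pi * k / steps)
        cur = cmath.phase(p(z))
        d = cur - prev
        while d > cmath.pi:          # unwrap jumps across the branch cut
            d -= 2 * cmath.pi
        while d < -cmath.pi:
            d += 2 * cmath.pi
        total += d
        prev = cur
    return round(total / (2 * cmath.pi))

q = lambda z: z ** 3 - 2 * z + 2        # a sample monic cubic, m = 3
assert winding_number(q, 10.0) == 3     # large circle: n(r) = m
assert winding_number(q, 0.01) == 0     # tiny circle: n(r) = 0, since q(0) != 0
```

Between r = 0.01 and r = 10 the winding number jumps from 0 to 3, so some intermediate circle's image must pass through the origin, exactly as the proof argues.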
As a corollary, we note that if p(z₁) = 0, then by the Remainder Theorem (§3.5) we can write p(z) = (z - z₁)r(z). If the degree m of p(z) exceeds 1, the quotient r(z) has positive degree, hence also has a complex root z = z₂. Proceeding thus, we find m linear factors for p(z), as

(11)  p(z) = c(z - z₁)(z - z₂)···(z - zₘ).

It follows that the only irreducible polynomials over C are linear. A corollary of this and the unique factorization theorem of Chap. 3 is

Theorem 6. Any polynomial with complex coefficients can be written in one and only one way in the form (11).

The roots of p(z) are evidently the zⱼ in (11)--since a product vanishes if and only if one of its factors is zero. If a factor (z - zᵢ) occurs repeatedly, the number of its occurrences is called the multiplicity of the root zᵢ. It can also be defined, using the calculus, as the "order" to which p(z) vanishes at zᵢ: the greatest integer ν such that p(z) and its first (ν - 1) derivatives all vanish at zᵢ.

Exercises

1. Prove the uniqueness of the decomposition (11) without using the general uniqueness theorem of §3.8.
2. (a) Using the MacLaurin series, show formally that e^(ix) = cos x + i sin x. (b) Show that every complex number can be written as re^(iθ). (c) Derive the identities cos z = (e^(iz) + e^(-iz))/2, sin z = (e^(iz) - e^(-iz))/2i.
3. Do couples (w, z) of complex numbers, when added and multiplied by rules (1) and (2), form a commutative ring with unity? a field?
4. Factor z² + z + 1 + i.
5. Show that any quadratic polynomial can be brought to one of the forms cz(z - 1) or cz² by a suitable automorphism of C[z].
6. Prove that any rational complex function which is finite for all z is a polynomial.
7. Use partial fractions to show that any rational function over the field C can be written as a sum of a polynomial plus rational functions in which each numerator is a constant and each denominator a power of a linear function.

5.4. Conjugate Numbers and Real Polynomials

In the complex field C, the equation z² = -1 has two roots i and -i = 0 + (-1)i. The correspondence x + yi ↦ x + y(-i) = x - yi carries the first of these roots into the second and conversely, while leaving all real numbers unchanged. Furthermore, this correspondence carries sums into sums and products into products, as may be checked either by direct substitution in formulas (1) and (2) or by application of Theorem 1. In other words, the correspondence is an automorphism of C (an isomorphism of C with itself).

We can state this more compactly as follows. By the "conjugate" z* of a complex number z = x + yi, we mean the number x - yi. The correspondence z ↦ z* is an automorphism of period two of C, in the sense that (z*)* = z. It amounts geometrically to a reflection of the complex plane in the x-axis; the only numbers which are equal to their conjugates are the real numbers.

Conjugate complex numbers are very useful in mathematics and physics (especially in wave mechanics).
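That conjugation is an automorphism fixing the reals can be spot-checked directly with Python's built-in complex type (a sketch; the sample numbers and polynomial are our own):

```python
z, w = 3 + 4j, -2 + 1j

# conjugation carries sums into sums and products into products:
assert (z + w).conjugate() == z.conjugate() + w.conjugate()
assert (z * w).conjugate() == z.conjugate() * w.conjugate()
assert z.conjugate().conjugate() == z            # an automorphism of period two
assert (z * z.conjugate()).real == abs(z) ** 2   # zz* = |z|²

# for a polynomial with real coefficients, nonreal roots pair off:
p = lambda t: t**3 - 2*t**2 + t - 2              # = (t - 2)(t² + 1)
assert p(1j) == 0 and p(-1j) == 0                # i and -i are conjugate roots
```

The last two assertions anticipate the Lemma below: since p has real coefficients, p(z*) = p(z)*, so the roots i and -i come as a conjugate pair.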
In using them, it is convenient to memorize such simple formulas as

|z|² = zz*,  (z + w)* = z* + w*,  (zw)* = z*w*.

Their use enables one to derive the factorization theory of real polynomials easily out of Theorem 6.

Lemma. The nonreal complex roots of a polynomial equation with real coefficients occur in conjugate pairs.

This generalizes the well-known fact that a quadratic ax² + bx + c with discriminant b² - 4ac < 0 has two roots x = (-b ± √(b² - 4ac))/2a which are complex conjugates.

Proof. Let p(z) be the given polynomial; we can write it in the form (11), where the zᵢ are complex (not usually real). Since the correspondence zᵢ ↦ zᵢ* applied to these roots zᵢ is an automorphism, it carries p(z) into another polynomial

p*(z) = c*(z - z₁*)(z - z₂*)···(z - zₙ*)

in which each coefficient is the conjugate of the corresponding coefficient of p(z). But since the coefficients of p(z) are real, p(z) = p*(z). Hence, the factorization (11) being unique, c = c* is real, and the zᵢ are also real or complex conjugate in pairs.

Theorem 7. Any polynomial with real coefficients can be factored into (real) linear polynomials and (real) quadratic polynomials with negative discriminant.

Proof. The real zᵢ in the lemma give (real) linear factors (z - zᵢ). A pair of conjugate complex roots a + bi and a - bi with b ≠ 0 may be combined as in

(z - (a + bi))(z - (a - bi)) = z² - 2az + (a² + b²)

to give a quadratic factor of p(z) with real coefficients and with a negative discriminant 4a² - 4(a² + b²) = -4b² < 0. Q.E.D.

Conversely, linear polynomials and quadratic polynomials with negative discriminant are irreducible over the real field (the latter since they have only complex roots, and hence no linear factors). It is a corollary that the factorization described in Theorem 7 is unique.

Exercises

1. Solve: (a) (1 + i)z + 3iz* = 2 + i, (b) zz* + 2z = 3 + i, (c) zz* + 3(z - z*) = 4 - 3i.
2. Solve: (a) zz* + 3(z + z*) = 7, (b) zz* + 3(z + z*) = 3i.
3.
Solve simultaneously: iz + (1 + i)w = 3 + i, (1 + i)z* - (6 + i)w* = 4.
4. Give an independent proof of Corollary 2 of Theorem 4 (§4.4).
5. Show that if one adjoins to the real number system an imaginary root of any irreducible nonlinear real polynomial, one gets a field isomorphic with C.
6. Show that over any ordered field ax² + bx + c is irreducible if b² - 4ac < 0.
7. Show that every automorphism of C in which the real numbers are all left fixed is either the identity automorphism (z ↦ z) or the automorphism z ↦ z*.

*5.5. Quadratic and Cubic Equations

In §5.3 we proved the existence of roots for any polynomial equation with complex coefficients, but did not show how to calculate roots effectively. We shall show, in §§5.5-5.6, how to do this for polynomials of degrees two, three, and four. The procedures will involve only the four rational operations (addition, multiplication, subtraction, and division) and the extraction of nth roots. We showed how to perform these operations on complex numbers in §§5.1-5.2; the procedure to be used now will also apply to any other field in which nth roots of arbitrary numbers can be constructed and in which 1 + 1 ≠ 0 and 1 + 1 + 1 ≠ 0.

Quadratic equations can be solved by "completing the square" as in high-school algebra. Such an equation

(13)  az² + bz + c = 0  (a ≠ 0)

is equivalent to (has the same roots as) the simpler equation

(14)  z² + Bz + C = 0  (B = b/a, C = c/a).

If one sets w = z + B/2 (i.e., z = w - B/2), so as to complete the square, one sees that (14) is equivalent to

(15)  w² = B²/4 - C.

Substituting back z, a, b, c for w, B, C, this gives

(16)  z = w - B/2 = (-b + √(b² - 4ac))/(2a)

and so yields two solutions, by §5.2.

Cubic equations can be solved similarly. First reduce the cubic, as in §4.4, to the form

(17)  z³ + pz + q = 0.

Then make Vieta's substitution z = w - p/(3w).
The result (after cancellation) is

(18)  w³ - p³/(27w³) + q = 0.

Multiplying through by w³, we get a quadratic in w³, which can be solved by (16), giving

(19)  w³ = -q/2 ± √(q²/4 + p³/27)  (two values).

This gives six solutions for w in the form of cube roots. Substituting these in the formula z = w - p/(3w), we get three pairs of solutions for z, paired solutions being equal.

It is interesting to relate the preceding formulas to Theorem 6. Thus, in the quadratic case, writing z² + Bz + C = (z - z₁)(z - z₂), we have

(20)  B = -(z₁ + z₂),  C = z₁z₂,

whence

(z₁ - z₂)² = (z₁ + z₂)² - 4z₁z₂ = B² - 4C.

The quantity B² - 4C = D is the discriminant of (14). In terms of the original coefficients of (13), D = (b² - 4ac)/a².

Similarly, if z₁, z₂, z₃ are the roots of the reduced cubic equation (17),

(21)  z₁ + z₂ + z₃ = 0,  z₁z₂ + z₁z₃ + z₂z₃ = p,  z₁z₂z₃ = -q.

Combining the first two relations, we get the formulas

(22)  p = z₁z₂ - z₃²,  (z₁ - z₂)² = -4p - 3z₃²,  z₁² + z₂² + z₃² = -2p.

We now define the discriminant of a cubic equation by

(23)  D = [(z₁ - z₂)(z₁ - z₃)(z₂ - z₃)]².

Squaring P = (z₁ - z₂)(z₁ - z₃)(z₂ - z₃) and using the second relation of (22), we get after some calculation

(24)  D = -4p³ - 27q²,

which can be used to simplify (19) to w³ = -q/2 + √(-D/108).

Theorem 8. A quadratic or cubic equation with real coefficients has real roots if its discriminant is nonnegative, and two imaginary roots if its discriminant is negative.

Proof. By the Corollary of Theorem 7, either all roots are real or there are two conjugate imaginary roots z₁ = x₁ + iy and z₂ = x₁ - iy. If all roots are real, (zᵢ - zⱼ)² ≥ 0 for all i ≠ j, and so D ≥ 0. In the opposite case (z₁ - z₂)² = -4y² < 0, and since z₃ = x₃ is real, (z₁ - z₃)(z₂ - z₃) = (x₁ - x₃)² + y² > 0, so that D < 0. Q.E.D.

By (23), the condition D = 0 gives a simple test for multiple roots. Unfortunately, precisely in the case D > 0 that z³ + pz + q = 0 has all real roots, formula (19) expresses them in terms of complex numbers. We shall show in §15.6 that this cannot be helped!
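Formula (19) together with Vieta's substitution gives a working procedure for the reduced cubic (17). Here is a hedged sketch: it assumes p ≠ 0 (so that w ≠ 0, and (19) does not vanish), and the sample cubic is our own choice:

```python
import cmath

def depressed_cubic_roots(p, q):
    # Roots of z**3 + p*z + q = 0 via Vieta's substitution z = w - p/(3w);
    # by (19), w**3 = -q/2 + sqrt(q**2/4 + p**3/27).  Assumes p != 0.
    w3 = -q / 2 + cmath.sqrt(q * q / 4 + p ** 3 / 27)
    w0 = w3 ** (1 / 3)                        # one of the cube roots of w³
    omega = cmath.exp(2j * cmath.pi / 3)      # primitive cube root of unity
    return [w0 * omega ** k - p / (3 * w0 * omega ** k) for k in range(3)]

# sample: z³ - 7z + 6 = (z - 1)(z - 2)(z + 3)
roots = depressed_cubic_roots(p=-7, q=6)
assert all(abs(z ** 3 - 7 * z + 6) < 1e-8 for z in roots)
```

Note that even though all three roots here are real (D = -4(-343) - 27·36 = 400 > 0), the intermediate quantity w³ is complex, illustrating the remark above about the case D > 0.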
Exercises

1. Prove that for any (complex) y, p there exists a z satisfying y = z - p/(3z). How many z exist?
2. Solve in radicals: (a) z² + iz = 2, (b) z³ + 3iz = 1 + i, (c) z³ + 3iz² = 10i.
3. Convert one root in each of Exs. 2(a)-(c) into decimal form.
4. (a) Prove (22). (b) Prove (24).
*5. (a) Show that sinh 3y = sinh(3y + 2πi). (b) Using formula (9a) in §4.4, show that 4z³ + 3z = c has, in addition to the real root sinh[(1/3) sinh⁻¹ c] = sinh y, also the complex roots -(1/2) sinh y ± i(√3/2) cosh y.
6. Let ω = e^(2πi/5) be a primitive fifth root of unity, and let ζ = ω + 1/ω. (a) Show that ζ² + ζ = 1. (b) Infer that in a regular pentagon with center at (0, 0) and one vertex at (1, 0), the x-coordinate of either adjacent vertex is (√5 - 1)/4.
7. Using the formula cos θ = (e^(iθ) + e^(-iθ))/2, show that cos nθ = Tₙ(cos θ) for a suitable polynomial Tₙ of degree n, and compute T₁, T₂, T₃, T₄.

*5.6. Solution of Quartic by Radicals

Any method which reduces the solution of an algebraic equation to a sequence of rational operations and extractions of nth roots of quantities already known is called a "solution by radicals."

Theorem 9. Any polynomial equation of degree n ≤ 4 with real or complex coefficients is solvable by radicals.

Proof. Since the case n = 1 is solvable over any field, while the cases n = 2, 3 were treated in §5.5, we need only consider

ax⁴ + bx³ + cx² + dx + e = 0  (a ≠ 0).

Again, dividing through by a, and replacing x by z = x + b/4a (so as to "complete" the quartic), we get the equation

(25)  z⁴ + pz² + qz + r = 0,

whose roots differ from those of the original equation by b/4a. But for all u, (25) is equivalent to

(26)  (z⁴ + z²u + u²/4) - (z²u + u²/4 - pz² - qz - r) = 0,  or  (z² + u/2)² - [(u - p)z² - qz + (u²/4 - r)] = 0.

The first term is a perfect square P², with P = z² + u/2. The term in square brackets is a perfect square Q² for those u such that (equating the discriminant to zero)

(27)  q² = 4(u - p)(u²/4 - r) = (u - p)(u² - 4r).

This cubic equation in u can be solved by radicals, using Theorem 8.
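The role of the resolvent cubic (27) can be illustrated on a sample quartic where a root u of (27) is easy to spot (the numbers below are our own example, not the text's):

```python
# Ferrari's reduction on z⁴ + 2z² + z + 2 = 0, i.e. (p, q, r) = (2, 1, 2).
# The resolvent (27), q² = (u - p)(u² - 4r), is satisfied by u = 3:
p, q, r, u = 2, 1, 2, 3
assert q**2 == (u - p) * (u**2 - 4*r)          # 1 = (3 - 2)(9 - 8)

A = (u - p) ** 0.5                              # A = sqrt(u - p) = 1
B = q / (2 * A)                                 # B = q/(2A) = 1/2

def quartic(z):
    return z**4 + p*z**2 + q*z + r

def factored(z):
    # (P + Q)(P - Q) with P = z² + u/2 and Q = Az - B
    P, Q = z*z + u/2, A*z - B
    return (P + Q) * (P - Q)

for z in [0.0, 1.0, -2.0, 0.5 + 1.5j]:          # spot-check the identity (28)
    assert abs(quartic(z) - factored(z)) < 1e-9
```

Here the two quadratic factors come out as z² + z + 1 and z² - z + 2, each solvable by (16), so all four roots of the quartic are obtained by radicals.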
If the coefficients of (25) are real, one can even show that at least one real number u₁ ≥ p satisfies (27), for the right side of (27) is zero if u = p and becomes larger than q², or any other preassigned constant, when u is sufficiently large and positive. Hence, by Theorem 4 of §4.4, (27) has the desired real root u₁.

Substituting this constant u₁ into (26), the left side of (25) assumes the form P² - Q² = (P + Q)(P - Q), or

(28)  z⁴ + pz² + qz + r = (z² + Az + u₁/2 - B)(z² - Az + u₁/2 + B),

where

(29)  Q = Az - B,  A = √(u₁ - p),  B = q/(2A).

The roots of (25) are clearly those of the two quadratic factors of (28), which can be found by (16). Note that these factors are real if the coefficients a, b, c, d, e of the original equation were real.

It is interesting to recall the history of the solution of equations by radicals. The solution of the quadratic was known to the Hindus and in its geometric form (§4.1) to the Greeks. The cubic and quartic were solved by the Renaissance Italian mathematicians Scipio del Ferro (1515) and Ferrari (1545). However, not until the nineteenth century did Abel and Galois prove the impossibility of solving all polynomial equations of degree n ≥ 5 in the same way (§15.9).

Exercises

1. Solve by radicals: z⁴ - 4z³ + (1 + i)z = 3i.
2. Prove, without using the Fundamental Theorem of Algebra, that every real polynomial of degree n < 6 has a complex root.
3. Solve the simultaneous equations: zw = 1 + i, z² + w² = 3 - i.

*5.7. Equations of Stable Type

Many physical systems are stable if and only if all roots of an appropriate polynomial equation have negative real parts. Hence equations with this property may be called "of stable type."

In the case of real quadratic equations z² + Bz + C = 0, it is easy to test for stability. If 4C < B², both roots are real. They have the same
sign if and only if z₁z₂ = C > 0, the sign being negative if and only if B = -(z₁ + z₂) > 0. If 4C > B², the roots are two conjugate complex numbers. They both have negative real parts x₁ = x₂ if and only if B = -2x₁ = -2x₂ > 0; in this case also C > B²/4 > 0. Hence in both cases the condition for "stability" is B > 0, C > 0.

In the case of real cubic equations z³ + Az² + Bz + C = 0, conditions for stability are also not hard to find. (It is not, of course, sufficient to consider the reduced form (17).) Indeed, if all roots have negative real parts, then, since one root z = -a is real, we have a factorization

(30)  z³ + Az² + Bz + C = (z + a)(z² + bz + c).

Here a > 0, and by the previous case b > 0 and c > 0. Therefore A = a + b > 0, B = (ab + c) > 0, and C = ac > 0 are necessary for stability. Furthermore, AB - C = b(a² + ab + c) > 0.

Conversely, suppose that A > 0, B > 0, C > 0, and AB > C, and consider the real factorization (30), which always exists by Theorem 7. Since ac = C > 0, a and c have the same sign. But, if they were both negative, then b would have to be negative to make ab + c > 0, and so A = a + b < 0, contrary to hypothesis. Hence a > 0 and c > 0, implying a² + ab + c = a(a + b) + c > 0. But this implies b = (AB - C)/(a² + ab + c) > 0, whence both factors of (30) are "stable." Hence we have proved the following result.

Theorem 10. The real quadratic equation z² + Bz + C = 0 is of stable type if and only if B > 0 and C > 0. The real cubic equation z³ + Az² + Bz + C = 0 is of stable type if and only if A > 0, B > 0, C > 0, and AB > C.

Exercises

1. Test the following polynomials for stability: (a) z³ + z² + 2z + 1, (b) z³ + z² + 2z + 2.
2. Show that for a monic real polynomial of degree n to be of stable type, all its coefficients must be positive.
*3. Show that z⁴ + Az³ + Bz² + Cz + D with real coefficients is of stable type if and only if all its coefficients are positive, and ABC > A²D + C².
*4. Assuming Ex. 3, obtain necessary and sufficient conditions for a complex quadratic equation z² + Bz + C = 0 to be of stable type. (Hint: Consider (z² + Bz + C)(z² + B*z + C*) = 0.)
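Theorem 10's tests are immediate to apply in code; a small sketch, on sample polynomials of our own choosing:

```python
def quadratic_is_stable(B, C):
    # Theorem 10: z² + Bz + C has both roots in the open left half-plane
    # if and only if B > 0 and C > 0.
    return B > 0 and C > 0

def cubic_is_stable(A, B, C):
    # Theorem 10: z³ + Az² + Bz + C is of stable type
    # if and only if A > 0, B > 0, C > 0, and AB > C.
    return A > 0 and B > 0 and C > 0 and A * B > C

assert quadratic_is_stable(3, 2)        # z² + 3z + 2 = (z + 1)(z + 2): roots -1, -2
assert not quadratic_is_stable(-3, 2)   # z² - 3z + 2: roots +1, +2
assert cubic_is_stable(2, 3, 1)         # AB = 6 > 1 = C
assert not cubic_is_stable(1, 1, 1)     # z³ + z² + z + 1 = (z + 1)(z² + 1): roots ±i
```

The last example shows why the inequality AB > C must be strict: with AB = C the cubic acquires a pair of purely imaginary roots.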
6
Groups

6.1. Symmetries of the Square

The idea of "symmetry" is familiar to every educated person. But fewer people realize that there is a consequential algebra of symmetry. This algebra will now be introduced in the concrete case of the symmetries of the square.

Imagine a cardboard square laid on a plane with fixed axes, so that the center of the square falls on the origin of coordinates, and one side is horizontal. It is clear that the square has rotational symmetry: it is carried into itself by the following rigid motions:

R: a 90° rotation clockwise around its center O;
R', R'': similar rotations through 180° and 270°.

The square also has reflective symmetry; it can be carried into itself by the following rigid reflections:

H: a reflection in the horizontal axis through O;
V: a reflection in the vertical axis through O;
D: a reflection in the diagonal in quadrants I and III;
D': a reflection in the diagonal in quadrants II and IV.

Our list thus includes seven symmetries so far.

The algebra of symmetries has its genesis in the fact that we can multiply two motions by performing them in succession. Thus, the product HR is obtained by first reflecting the square in a horizontal axis, then rotating clockwise through 90°. By experimenting with a square
piece of cardboard, one can verify that this has the same net effect as D', reflection about the diagonal from the upper left- to the lower right-hand corner. Alternatively, the equation HR = D' can be checked by noting that both sides have the same effect on each vertex of the square. Thus, in Figure 1, HR sends 1 into 4 by H and then 4 into 3 by R--hence 1 into 3, just as does D'. Similarly, RH is defined as the result of a clockwise rotation through 90°, followed by reflection in a horizontal axis. (Caution: The plane of Figure 1, which contains the axes of reflection, is not imagined as rotated with the square.)

A computation shows that RH = D ≠ HR, from which we conclude incidentally that our "multiplication" is not in general commutative! It is, however, associative, as we shall see in §6.2.

The reader will find it instructive to compute other products of symmetries of the square (a complete list is given in Table 1, §6.4). If he does this, he will discover one exception to the principle that successive applications of any two symmetries yield a third symmetry. If, for example, he multiplies R with R'', he will see that their product is a motion which leaves every point fixed: it is the so-called "identity" motion I. This is not usually considered a symmetry by nonmathematicians; nevertheless, we shall consider it a (degenerate) symmetry, in order to be able to multiply all pairs of symmetries.

In general, a symmetry of a geometrical figure is, by definition, a one-one transformation of its points which preserves distance. It can be readily seen that any symmetry of the square must carry the vertex 1 into one of the four possible vertices, and that for each such choice there are exactly two symmetries. Thus all told there are only eight symmetries, which are those we have listed.

Not only the square, but every regular polygon and regular solid (e.g., the cube and regular icosahedron) has an interesting group of symmetries, which may be found by the elementary method sketched above. Similarly, many ornaments have interesting symmetries.
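The products just described can be checked by representing each motion by its effect on the vertices 1, 2, 3, 4 of Figure 1 (1 upper right, 2 upper left, 3 lower left, 4 lower right). A sketch, with products read left to right as in the text; the dictionary encoding is ours:

```python
# Each symmetry is a dict: vertex -> image vertex.
R  = {1: 4, 2: 1, 3: 2, 4: 3}   # 90° clockwise rotation
H  = {1: 4, 2: 3, 3: 2, 4: 1}   # reflection in the horizontal axis
D  = {1: 1, 2: 4, 3: 3, 4: 2}   # reflection in the diagonal of quadrants I, III
Dp = {1: 3, 2: 2, 3: 1, 4: 4}   # reflection in the diagonal of quadrants II, IV (D')

def compose(f, g):
    # Product fg: first apply f, then g (the text's left-to-right convention).
    return {v: g[f[v]] for v in f}

assert compose(H, R) == Dp              # HR = D' (1 -> 4 by H, then 4 -> 3 by R)
assert compose(R, H) == D               # RH = D
assert compose(H, R) != compose(R, H)   # the multiplication is not commutative
```

The full eight-element multiplication table (Table 1, §6.4) can be generated the same way by adding R', R'', V, and the identity I.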
Thus consider the infinite ornamental pattern

) ) ) )

in which the arrowheads are spaced uniformly one inch apart along a line. Three simple symmetries of this figure are T, a translation to the right by one inch; T', a translation to the left by one inch; and H, a reflection in the horizontal axis of the figure. Others (in fact, all others!) may be found by multiplying these together repeatedly.

Exercises

1. Compute HV, HD', D'H, R'D', D'R', R'R''.
2. Describe TH and HT in the ornamental "arrowhead" pattern.
3. List the symmetries of an equilateral triangle, and compute five typical products.
4. List the symmetries of a general rectangle, and compute all their products.
*5. How many symmetries are possessed by the regular tetrahedron? by the regular octahedron? Draw figures.
*6. Show that any symmetry of the ornamental pattern of the text can be obtained by repeatedly multiplying H, T, and T'.

6.2. Groups of Transformations

The algebra of symmetry can be extended to one-one transformations of any set S of elements whatever. Although it is often suggestive to think of the set S as a "space" (e.g., a plane or a sphere), its elements as "points," and the bijections as "symmetries" of S with respect to suitable properties, the bijections of S satisfy some nontrivial algebraic laws in any case. To understand these laws, one must have clearly in mind the definitions of function, injection, surjection, and bijection made in §1.11. To illustrate these afresh, we give some new examples; as in §1.11, we will usually abbreviate f(x) to xf (read "the transform of x by f"), g(x) to xg, etc.

The function f(x) = e^(2πix) maps the field R of all real numbers into the field C of all complex numbers; its range (image) is the unit circle. Similarly, g(z) = |z| is a function g: C → R whose image is the set of all nonnegative real numbers. Again, consider the following functions φ₀: Z → Z and ψ₀: Z → Z on the domain Z of all integers to itself:

nφ₀ = 2n,  and  mψ₀ = m/2 if m is even, 0 if m is odd.
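The asymmetry between φ₀ and ψ₀ can be verified on a sample of integers (a sketch; the text's postfix xφ is written here as a function call):

```python
def phi0(n):
    # n -> 2n: one-one, but its range is only the even integers
    return 2 * n

def psi0(m):
    # m -> m/2 if m is even, 0 if m is odd: onto, but not one-one
    return m // 2 if m % 2 == 0 else 0

sample = range(-10, 11)
assert all(psi0(phi0(n)) == n for n in sample)   # φ₀ψ₀ = I: ψ₀ is a right-inverse of φ₀
assert phi0(psi0(3)) == 0 != 3                   # ψ₀φ₀ ≠ I: ψ₀ is not a left-inverse
assert psi0(1) == psi0(3) == 0                   # ψ₀ is not one-one
```

These two functions will reappear below as the standard example distinguishing right-inverses from left-inverses.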
By the cancellation law of multiplication, φ₀ is one-one; yet its range consists only of even integers, so that φ₀ does not transform Z onto Z. On the other hand, ψ₀ is not one-one, since all odd integers are mapped onto zero; but it does map Z onto Z, because mφ₀ψ₀ = m for all m ∈ Z. Thus ψ₀ is surjective but not injective. These definitions are closely related to the concepts of being "one-one" (injective) and "onto" (surjective), as defined earlier.

We turn now to the algebra of transformations. Two transformations φ: S → T and φ': S → T with the same domain S and the same codomain T are called equal if they have the same effect upon every point of S; that is,

(1)  φ = φ' means that pφ = pφ' for every p ∈ S.

The product or composite φψ of two transformations is again defined as the result of performing them in succession: first φ, then ψ--provided, however, that the codomain of φ is the domain of ψ. Formally, if φ: S → T and ψ: T → U, then φψ is the transformation of S into U given by the equation

(2)  p(φψ) = (pφ)ψ,

which defines the effect of φψ upon any point p ∈ S. In particular, the product of two transformations of S (into itself) is always defined. We shall now restrict our attention to this case, although almost all the identities proved below apply also to the general case.

Multiplication of transformations conforms to the

Associative law: (φψ)θ = φ(ψθ), provided that the products involved are defined.

This is obvious intuitively: both (φψ)θ and φ(ψθ) amount to performing first φ, then ψ, and finally θ, in that order. It can also be proved from the definitions: we have, for each p ∈ S,

p[(φψ)θ] = [p(φψ)]θ = [(pφ)ψ]θ = (pφ)(ψθ) = p[φ(ψθ)],

where each step depends on applying the definition (2) of multiplication to the product indicated. By the definition (1) of equality for transformations, this proves the associative law φ(ψθ) = (φψ)θ, whenever the products involved are defined.

The identity transformation I = I_S on the set S is that transformation I: S → S which leaves every point of S fixed. This is stated algebraically
The identity transformation I = I_S on the set S is that transformation I: S → S which leaves every point of S fixed; it appears in the identity

(3)  pI = p for every p ∈ S.

From the above definitions there follows directly the

Identity law: Iφ = φI = φ for all φ.

To see this, note that p(Iφ) = (pI)φ = pφ for all p and, similarly, that p(φI) = (pφ)I = pφ.

These definitions are closely related to the concepts of being "one-one" (injective) and "onto" (surjective), as defined earlier. Return now to the special transformations φ₀ and ψ₀ defined above on the set Z. Clearly mφ₀ψ₀ = m for all m ∈ Z, hence φ₀ψ₀ = I. On the other hand, mψ₀φ₀ = m if m is even, and 0 if m is odd, hence ψ₀φ₀ ≠ I. We may thus call ψ₀ a right-inverse (but not a left-inverse) of φ₀. In general, if the transformations φ: S → S and ψ: S → S have the product φψ = I: S → S, then φ is called a left-inverse of ψ, and ψ a right-inverse of φ.

Theorem 1. A transformation φ: S → S is one-one if and only if it has a right-inverse; it is onto if and only if it has a left-inverse.

Proof. If φ has a right-inverse ψ, then pφ = p'φ implies

    p = p(φψ) = (pφ)ψ = (p'φ)ψ = p'(φψ) = p',

so that φ is one-one. Conversely, if φ is one-one, we construct a transformation ψ: S → S as follows. Each point q in the range of φ is the image q = pφ of a unique antecedent p; let qψ be this p. Let ψ map the remaining points q of S in any way whatever, say on some fixed point of the (nonempty) set S. Then (pφ)ψ is the unique antecedent p of q = pφ for each p; hence φψ = I, and ψ is a right-inverse of φ, as asserted.

On the other hand, if φ has a left-inverse ψ', then ψ'φ = I. Hence any q in S can be written q = qI = q(ψ'φ) = (qψ')φ, as the φ-image of a suitable point p = qψ'. Therefore φ is onto. Conversely, if φ is onto, every q has the form q = pφ. For each q in S, which is the image under φ of one or more points p of S, choose† as image qψ any one of these points p. Then q(ψφ) = (qψ)φ = pφ = q for every q, and hence ψφ = I, so that φ has ψ as left-inverse.

In the special case when S is finite, the theorem has an immediate direct proof.

† In case the set of such points q is infinite, the Axiom of Choice (cf. §12.2) asserts the possibility of making such an infinite number of choices of p, one for each q.
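For a bijection, the antecedent construction in the proof of Theorem 1 yields a two-sided inverse; here is a small editorial sketch (the map chosen is our own example, not from the text).

```python
# Sketch (not from the text): building the inverse of a bijection by sending
# each image q back to its unique antecedent, with maps written on the right.

def compose(phi, psi):          # p(phi psi) = (p phi)psi
    return lambda p: psi(phi(p))

S = list(range(12))             # a small finite "space" S

phi = lambda p: (5 * p) % 12    # a bijection of S (5 is invertible mod 12)

# Since phi is one-one onto S, sending each image q back to its unique
# antecedent gives a transformation psi with phi*psi = psi*phi = I.
antecedent = {phi(p): p for p in S}
psi = lambda q: antecedent[q]

identity = lambda p: p
assert all(compose(phi, psi)(p) == identity(p) for p in S)
assert all(compose(psi, phi)(p) == identity(p) for p in S)
```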
Theorem 1 and its corollaries, together with their proofs, also hold for functions φ: S → T on a set S into another set T. One need only observe that a left-inverse ψ or a right-inverse θ is then a transformation of the second set T into S, and that ψφ = I_T: T → T, φθ = I_S: S → S. Here I_S and I_T are the identity transformations on S and T, respectively.

Corollary 1. A transformation φ: S → S is a bijection if and only if it has both a right-inverse and a left-inverse. When this is the case, any right-inverse of φ is equal to any left-inverse of φ.

Indeed, if φ has a right-inverse θ and a left-inverse ψ, then

    θ = Iθ = (ψφ)θ = ψ(φθ) = ψI = ψ.

Define a (two-sided) inverse of φ: S → S as any transformation φ⁻¹ which satisfies the

Inverse law:  (4)  φφ⁻¹ = φ⁻¹φ = I.

These equations also state that φ is a two-sided inverse of φ⁻¹. Corollary 1 yields the further corollary:

Corollary 2. A transformation φ: S → S is bijective if and only if φ has a (two-sided) inverse φ⁻¹. When this is the case, any two inverses of φ are equal, for φ⁻¹ is simply that transformation of S which takes each point q = pφ back into its unique antecedent p.

This corollary is what will be used below. In the special case when S is finite, φ is one-one if and only if it is onto, so that the more elaborate discussion of left- and right-inverses is pointless in this case.

Remark. The functional notation y = φ(x) of the calculus suggests writing y = φx where we have written y = xφ above. In this notation, the composite of φ and z = ψ(y) is naturally written z = (ψφ)x, as an abbreviation for z = ψ(φ(x)), instead of z = xφψ. Hence ψφ means "perform first φ, then ψ," and the notions of right- and left-inverse become interchanged; the meaning of two-sided inverse stays the same. Either notation by itself is satisfactory, but confusion between them must be avoided.
We are now ready to define the important concept of a group of transformations. By a group of transformations on a "space" S is meant any set G of one-one transformations φ of S onto S such that (i) the identity transformation of S is in G; (ii) if φ is in G, so is its inverse φ⁻¹; (iii) if φ and ψ are in G, so is their product φψ.

Theorem 2. The set G of all bijections of any space S onto itself is a group of transformations.

Proof. Since II = I, the identity I is bijective, hence is in the set G, as required by condition (i) above. If φ is one-one onto, Corollary 2 above shows that φ⁻¹ is also one-one onto, hence is in the set G, as in (ii). Finally, the product of any two one-one transformations φ and ψ of S onto S has an inverse, for by hypothesis

    (φψ)(ψ⁻¹φ⁻¹) = φ(ψψ⁻¹)φ⁻¹ = φIφ⁻¹ = φφ⁻¹ = I,
    (ψ⁻¹φ⁻¹)(φψ) = ψ⁻¹(φ⁻¹φ)ψ = ψ⁻¹Iψ = ψ⁻¹ψ = I.

Therefore φψ is also bijective (one-one onto), hence is likewise in G. Q.E.D.

The computation above shows that φψ has as inverse

(5)  (φψ)⁻¹ = ψ⁻¹φ⁻¹.

In words: the inverse of a product is the product of the inverses, taken in the opposite order.

A bijection of a finite set S to itself is usually called a permutation of S. The group of all permutations of n elements is called the symmetric group of degree n; it evidently contains n! permutations, for the image k₁ of the first element can be chosen in n ways, that of the second element can then be chosen in n − 1 ways from the elements not k₁, and so on.

Exercises

1. Compute VD, (VD)R", V(DR") in the group of the square.
2. Compute similarly HR, R'H, (R'H)R, R'(HR), DR".
3. Let S consist of all real numbers (or all points x on a line), while the transformations considered have the form xφ = ax + b. In each of the following cases, find when the set of all possible φ's with coefficients a and b of the type indicated is a group of transformations. Give reasons.
(a) a and b rational numbers; (b) a = 1, b an odd integer; (c) a = 1, b an even integer; (d) a = 1, b a real number; (e) a an integer, b = 0; (f) a ≠ 0, a and b real numbers; (g) a ≠ 0, a an integer, b a positive integer or 0;
(h) a ≠ 0, a a rational number, b an integer; (i) a ≠ 0, a an integer, b a real number; (j) a ≠ 0, a a real number, b an irrational number.

4. In which of these groups is "multiplication" commutative?
5. Find the inverse of every symmetry of the rectangle, and test the rule (5).
6. Check that, in the group of the square, (RH)⁻¹ = H⁻¹R⁻¹ ≠ R⁻¹H⁻¹.
7. Solve the equation RXR' = D for the group of the square.
8. Compute (R⁻¹(VR))⁻¹((R⁻¹D)R) for the group of the square.
9. Find all the transformations on a "space" S of exactly three "points." How many are there? How many of these are one-one?
10. Show that the transformation n ↦ n² on the set of positive integers has no left-inverse, and exhibit explicitly two right-inverses.
11. Exhibit two distinct left-inverses of the transformation ψ₀: Z → Z defined in the text, and two right-inverses of φ₀.
12. Show that if φ and ψ both have right-inverses, then so does φψ.
13. If φ₁, ···, φₙ are one-one, prove that so is φ₁φ₂ ··· φₙ, with (φ₁φ₂ ··· φₙ)⁻¹ = φₙ⁻¹φₙ₋₁⁻¹ ··· φ₁⁻¹.
*14. Show that for any φ: S → S, the transformation ψ constructed in the second part of the proof of Theorem 1 satisfies φψφ = φ.
*15. Show that a transformation φ: S → S which has a unique right-inverse or a unique left-inverse is necessarily a one-one transformation of S onto S.

6.3. Further Examples

The symmetries of a cube form another interesting group. Geometrically speaking, these symmetries are the one-one transformations which preserve distances on the cube; they are known as "isometries," and are 48 in number. To see this, note that any initial vertex can be carried into any one of the eight vertices. After the transform of any one vertex has been fixed, the three adjacent vertices can be permuted in any of six ways, giving 6 · 8 = 48 possibilities. When one vertex and the three adjacent vertices occupy known positions, every point of the cube is in fixed position, so the whole symmetry is known. Hence the cube has exactly 48 symmetries. Many of them have special geometrical properties, such as the one which reflects each point into the diametrically opposite point.
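The count 6 · 8 = 48 can be confirmed computationally by realizing the cube's symmetries as the signed permutations of the coordinate axes; this coordinate model is our own editorial sketch, not the book's argument.

```python
# Sketch (not from the text): counting the cube's 48 symmetries as the signed
# permutation matrices, i.e. the isometries permuting the coordinate axes.
from itertools import permutations, product

symmetries = [
    (perm, signs)
    for perm in permutations(range(3))       # where the three axes go
    for signs in product((1, -1), repeat=3)  # possible reflection of each
]
assert len(symmetries) == 48

# Each symmetry maps the vertex set {-1, 1}^3 of the cube onto itself.
def apply(sym, v):
    perm, signs = sym
    return tuple(signs[i] * v[perm[i]] for i in range(3))

vertices = set(product((-1, 1), repeat=3))
assert all({apply(s, v) for v in vertices} == vertices for s in symmetries)
```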
A familiar group containing an infinite number of transformations is the so-called Euclidean group. This consists of the "isometries" of the plane, or, in the language of elementary geometry, of the transformations under which the plane is congruent to itself. It is made up of products of translations, rigid rotations, and reflections; it will be discussed in greater detail in Chap. 9. The isometries of the plane leaving invariant a regular hexagonal network (Figure 2) form another interesting group. The rigid motions of the surface of any sphere into itself again constitute a group.

Another group consists of the "similarity" transformations of space: those one-one transformations which multiply all distances by a constant factor k > 0 (a factor of proportionality).

Generally speaking, those one-one transformations of any set of elements which preserve any given property or properties of these elements form a group. Thus Euclidean geometry deals with those properties of space preserved under all isometries; "projective" and "affine" geometry deals with the properties which are preserved under the "projective" and "affine" groups to be defined in Chap. 9; and topology deals with those which are preserved under all homeomorphisms. For instance, a rubber band held in a straight line between fixed endpoints P and Q may be deformed in many ways along this line; all such deformations form a group (the group of so-called "homeomorphisms" of the segment PQ). Felix Klein (Erlanger Programm, 1872) has eloquently described how the different branches of geometry can be regarded as the study of those properties of suitable spaces which are preserved under appropriate groups of transformations.

Exercises

1. Describe the six symmetries of a cube with one vertex held fixed, and relate this to the group of Ex. 7.
2. Let S, T be reflections of a cube in planes parallel to distinct faces. Describe ST geometrically.
3. Describe some isometries of the plane which carry the hexagonal network of Figure 2 onto itself. Can you enumerate all such transformations (this is difficult)?
4. Do the same for the network of equilateral triangles.
5. Do the same for a network of squares.
6. Describe some symmetries of a finite cylinder; do the same for an infinite cylinder, and for a helix wound around the cylinder making a constant angle with the axis of the cylinder.
7. Describe all the symmetries of a wheel with six equally spaced radial spokes.
*8. Show that the transformations x → x' = (ax + b)/(cx + d), with ad − bc = 1 and with coefficients in any field F, constitute a group acting on the set consisting of the elements of F and a symbolic element ∞.

6.4. Abstract Groups

Groups of transformations are by no means the only systems with a multiplication which satisfies the associative, identity, and inverse laws of §6.2. For instance, the nonzero numbers of any field (e.g., of the rational, real, or complex field) satisfy them: the product of any two nonzero numbers is a nonzero number, the unit 1 of the field satisfies the identity law, and 1/x = x⁻¹ satisfies the inverse law. Similarly, the elements (including zero, this time) of any integral domain satisfy our laws when combined under addition: addition is associative, any two elements have a uniquely determinate sum, zero satisfies the identity law, and −x satisfies the inverse law. Thus the elements of any integral domain form a group under addition. It is convenient to introduce the abstract concept of a group to include these and other instances.
Groups can be defined abstractly, without reference to transformations, in many ways; groups so defined are often called abstract groups.

Definition. A group G is a system of elements with a binary operation which (i) is associative, (ii) admits an identity satisfying the identity law, and (iii) admits for each element a an element a⁻¹ (called its inverse) satisfying the inverse law.

In discussing abstract groups, elements will be denoted by small Latin letters a, b, c, ···, with "e" for the identity. The product notation "ab" will ordinarily be used to denote the result of applying the group operation to two elements a and b of G, but other notations, such as "a + b" and "a ∘ b," are equally valid. In the product notation, the three laws defining groups become

    Associative law:  a(bc) = (ab)c      for all a, b, c;
    Identity law:     ae = ea = a        for all a;
    Inverse law:      aa⁻¹ = a⁻¹a = e    for each a and some a⁻¹.

A group whose operation satisfies the commutative law is called a "commutative" or "Abelian" group.
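As a concrete check of the abstract laws, the following editorial sketch (not from the text) verifies them for the nonzero elements of the field Z₇ under multiplication.

```python
# Sketch (not from the text): verifying the abstract group laws for the
# nonzero elements of the field Z_7 under multiplication.
G = [1, 2, 3, 4, 5, 6]
op = lambda a, b: (a * b) % 7

# Closure: the product of two nonzero elements mod 7 is nonzero.
assert all(op(a, b) in G for a in G for b in G)
# Associative law: a(bc) = (ab)c.
assert all(op(a, op(b, c)) == op(op(a, b), c)
           for a in G for b in G for c in G)
# Identity law: ae = ea = a, with e = 1.
assert all(op(a, 1) == op(1, a) == a for a in G)
# Inverse law: each a has some a' with a a' = e.
assert all(any(op(a, x) == 1 for x in G) for a in G)
# This group is moreover Abelian:
assert all(op(a, b) == op(b, a) for a in G for b in G)
```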
Using this concept, we can simplify the definition of a field as follows.

Definition. A field is a system F of elements closed under two uniquely defined binary operations, addition and multiplication, such that (i) under addition, F is a commutative group with identity 0; (ii) under multiplication, the nonzero elements form a commutative group; and (iii) both distributive laws hold: a(b + c) = ab + ac and (b + c)a = ba + ca.

To see that this definition is equivalent to that given in §2.1, observe that the postulates just given include all those previously stated for a field, except for the associative law for products with a factor 0; this can be verified in detail.

Some of the results of the first sections of Chaps. 1 and 2 will now appear as corollaries of the following theorem on groups.

Theorem 3. In any group, ca = cb implies a = b; that is, cancellation on the left is possible. Given a, b in G, xa = b and ay = b have the unique solutions x = ba⁻¹ and y = a⁻¹b.

Proof. For the cancellation we need only premultiply each side of ca = cb by c⁻¹ and apply the associative law to get (c⁻¹c)a = (c⁻¹c)b, which is ea = eb, and gives a = b. Again, xa = b implies x = xe = x(aa⁻¹) = (xa)a⁻¹ = ba⁻¹; conversely, (ba⁻¹)a = b(a⁻¹a) = be = b, so x = ba⁻¹ is the one and only solution of xa = b. Similarly, a(a⁻¹b) = (aa⁻¹)b = eb = b, and ay = b implies y = a⁻¹b. Cancellation on the right is possible in the same way: ca = da implies c = d, and so does ac = ad (cancellation law).

Since in any group G the equations xe = e and ay = e have by Theorem 3 the unique solutions x = e and y = a⁻¹, we get the

Corollary. A group has only one identity element, and only one inverse a⁻¹ for each element a.

In particular, a⁻¹ is the only element such that ay = e; it is also the only element satisfying xa = e, since xa = e implies x = xe = x(aa⁻¹) = (xa)a⁻¹ = ea⁻¹ = a⁻¹.

In the preceding definition of a group, the identity and inverse laws can be replaced by the weaker laws:

    Left-identity: For some e, ea = a for all a.
    Left-inverse:  For each a and some a⁻¹, a⁻¹a = e.

Theorem 4. Given these weaker laws, the given left-identity e is also a right-identity, ae = a for all a, and each given left-inverse a⁻¹ is also a right-inverse.

Proof. Left-cancellation is already possible, for ca = cb implies a = ea = (c⁻¹c)a = c⁻¹(ca) = c⁻¹(cb) = (c⁻¹c)b = eb = b. The given left-identity is also a right-identity, for a⁻¹(ae) = (a⁻¹a)e = ee = e = a⁻¹a, whence, by left-cancellation, ae = a. Note that in this proof a⁻¹ is not assumed to be the only element satisfying xa = e.
Finally, since the left-identity is a right-identity, left-inverses are also right-inverses; for a⁻¹(aa⁻¹) = (a⁻¹a)a⁻¹ = ea⁻¹ = a⁻¹ = a⁻¹e, whence, by left-cancellation, aa⁻¹ = e. This completes our proof.

There are many other postulate systems for groups. A useful one may be set up in terms of the possibility of division, as follows:

Theorem 5. If G is a nonvoid system closed under an associative multiplication for which all equations xa = b and ay = b have solutions x and y in G, then G is a group.

The proof is left as an exercise (Ex. 12).

Besides systematizing the algebraic laws governing multiplication in any group G, we may list in a "multiplication table" the special rules for forming the product of any two elements of G, provided the number of elements in G is finite. This is a square array of entries, headed both to the left and above by a list of the elements of the group. The entry opposite a on the left and headed above by b is the product ab (in that order). In Table 1 we have tabulated for illustration the multiplication table for the group of symmetries of the square.

Table 1. Group of the Square

         I    R    R'   R"   H    V    D    D'
    I    I    R    R'   R"   H    V    D    D'
    R    R    R'   R"   I    D    D'   V    H
    R'   R'   R"   I    R    V    H    D'   D
    R"   R"   I    R    R'   D'   D    H    V
    H    H    D'   V    D    I    R'   R"   R
    V    V    D    H    D'   R'   I    R    R"
    D    D    H    D'   V    R    R"   I    R'
    D'   D'   V    D    H    R"   R    R'   I
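A table like this can also be generated and sanity-checked mechanically. The sketch below (ours, not the book's; the vertex labeling and choice of reflection axis are our own assumptions) realizes the eight symmetries as vertex permutations and verifies that every row and column of the table is a rearrangement of the group, as Theorem 3 requires.

```python
# Sketch (not from the text): the eight symmetries of the square as
# permutations of its vertices 0, 1, 2, 3 (our own labeling), with
# products composed left-to-right as in the text.

def compose(p, q):                       # first p, then q
    return tuple(q[i] for i in p)

I   = (0, 1, 2, 3)
R   = (1, 2, 3, 0)                       # rotation by 90 degrees
Rp  = compose(R, R)                      # R'  = rotation by 180 degrees
Rpp = compose(Rp, R)                     # R'' = rotation by 270 degrees
H   = (1, 0, 3, 2)                       # a reflection (assumed axis)
group = [I, R, Rp, Rpp,
         H, compose(R, H), compose(Rp, H), compose(Rpp, H)]
assert len(set(group)) == 8              # eight distinct symmetries

# Each row and each column of the multiplication table contains every
# element exactly once, since xa = b and ay = b are uniquely solvable.
for a in group:
    row    = [compose(a, b) for b in group]
    column = [compose(b, a) for b in group]
    assert sorted(row) == sorted(group) == sorted(column)
```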
Unfortunately.8). (c) (1.6 Groups for the group of symmetries of the square. b.1 in proving that HR = D' and RH = D.2 are valid in any commutative group. Another method is described in §6. a. Thus. The computations can be modeled on those made in §6. In the field Zll of integers modulo 11. Prove that a group with 4 or fewer elements is necessarily Abelian.136 Ch. In a group of 2n elements. This is indeed the main practical use of logarithms! Next. y: we can replace computations of products by parallel computations with sums. that is. If I. 120°. a(be) = (ab)e.10).137 §6. Isomorphism Consider the transformation x ~ log x on the domain of real numbers. prove that S can be embedded in a group. (Hint: If ax = a. 2 + 2 = 1 (mod 3). under the operation z z' = 1z I· z'. then G is commutative. prove that S is a group. R'R' = R. under multiplication. (e) All integers under the operation of subtraction. (c) All complex numbers of absolute value 1. (iii) the "left-inverse" postulate of Theorem 4. 0 6.5. and R' are the rotations through 0°. (0 The "units" (§3. (d) All complex numbers z with Iz 1= 1. Moreover log (xy) = log x + log y for all x.S Isomorphism 9. e. Which of the following sets of numbers are groups? Why? (a) All rational numbers under addition. *11. (ii) the "left-identity" postulate of Theorem 4. For instance. 2 ++ R' associating integers with rotations is one which carries sums in Z3 into products of the corresponding rotations. 10. and ax = ay implies x = y. and 240°. respectively.) *13.6) of any integral domain under multiplication. *12. These are instances of the general concept of "isomorphism" mentioned in §1. the correspondence is One-One between the system of positive real numbers and the system of all real numbers (the inverse transformation being y ~ eY ). R. then x is a right-identity. (b) If S is finite or infinite. . 1 ++ R. This concept is simpler and also more important for groups than for integral domains. 
9. Which of the following sets of numbers are groups? Why?
(a) All rational numbers under addition. (b) All irrational numbers under multiplication. (c) All complex numbers of absolute value 1, under multiplication. (d) All complex numbers z with |z| = 1, under the operation z ∘ z' = |z| · z'. (e) All integers under the operation of subtraction. (f) The "units" (§3.6) of any integral domain under multiplication.
10. Prove that the following postulates describe an Abelian group: (i) (ab)c = a(cb) for all a, b, c; (ii) the "left-identity" postulate of Theorem 4; (iii) the "left-inverse" postulate of Theorem 4.
*11. Prove that if x² = e for all elements x of a group G, then G is commutative.
*12. Prove Theorem 5. (Hint: If ax = a, then x is a right-identity, and ax = ay implies x = y; any right-identity equals any left-identity.)
*13. Let S be a nonvoid set closed under a multiplication such that ab = ba, a(bc) = (ab)c, and ax = ay implies x = y. (a) If S is finite, prove that S is a group. (b) If S is finite or infinite, prove that S can be embedded in a group.

6.5. Isomorphism

Consider the transformation x ↦ log x on the domain of positive real numbers. It is well known that as x increases in the interval 0 < x < +∞, log x increases continuously in the interval −∞ < y < +∞. Hence the correspondence is one-one between the system of positive real numbers and the system of all real numbers (the inverse transformation being y ↦ eʸ). Moreover, log (xy) = log x + log y for all x, y: we can replace computations of products by parallel computations with sums. This is indeed the main practical use of logarithms!

Next, let Z₃ be the field of the integers mod 3 (Chap. 1), and let G be the group of the rigid rotations of an equilateral triangle into itself. If I, R, and R' are the rotations through 0°, 120°, and 240°, respectively, then RR' = I, R'R' = R, and so on; similarly, 1 + 2 = 0 (mod 3) and 2 + 2 = 1 (mod 3). The bijection 0 ↔ I, 1 ↔ R, 2 ↔ R' associating integers with rotations is one which carries sums in Z₃ into products of the corresponding rotations.

These are instances of the general concept of "isomorphism" mentioned in Chap. 1. This concept is simpler and also more important for groups than for integral domains.
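The logarithm correspondence can be tested numerically; the following sketch is an editorial illustration, not part of the text.

```python
# Sketch (not from the text): x -> log x carries products to sums, giving an
# isomorphism from positive reals under multiplication to reals under addition.
import math

xs = [0.5, 1.0, 2.0, 3.7, 10.0]
for x in xs:
    for y in xs:
        assert math.isclose(math.log(x * y), math.log(x) + math.log(y))

# The inverse correspondence y -> e^y undoes it:
assert all(math.isclose(math.exp(math.log(x)), x) for x in xs)
```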
Definition. By an isomorphism between two groups G and G' is meant a bijection a ↔ a' between their elements which preserves group multiplication; that is, which is such that if a ↔ a' and b ↔ b', then ab ↔ a'b'.

Thus, in the first example, we have described an isomorphism between the group of positive real numbers under multiplication and that of all real numbers under addition. In the second, we have pointed out the isomorphism of the additive group of the integers mod 3 with the group of rotational symmetries of the equilateral triangle. Again, the group of the integers under addition modulo 4 is isomorphic with the group of rotational symmetries of the square. In turn, the mapping 0 ↦ 1, 1 ↦ 2, 2 ↦ 4, 3 ↦ 3 is an isomorphism from the group of integers under addition mod 4 to the group of nonzero integers mod 5 under multiplication. That the bijection 0 ↔ I, 1 ↔ R, 2 ↔ R', 3 ↔ R" is an isomorphism can be checked by comparing Tables 2 and 3 with part of Table 1 (§6.4); it is convenient to check the second result by comparing the group table for the integers under addition mod 4 (Table 2) with that for the nonzero elements of Z₅ under multiplication (Table 3).

    Table 2                Table 3
    +  0  1  2  3          ×  1  2  4  3
    0  0  1  2  3          1  1  2  4  3
    1  1  2  3  0          2  2  4  3  1
    2  2  3  0  1          4  4  3  1  2
    3  3  0  1  2          3  3  1  2  4

The notion of isomorphism is technically valuable because it gives form to the recognition that the same abstract group-theoretic situation can arise in entirely different contexts. The fact that isomorphic groups are abstractly the same (and differ only in the notation for their elements) can be seen in a number of ways. Thus, by definition, two finite groups G and G' are isomorphic if and only if every group table for G yields, by appropriate substitution, a group table for G'. Again, any isomorphic image of a finite Abelian group is Abelian; it follows from the next to the last sentence of §6.4 that G' is Abelian if and only if G is.
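The comparison of Tables 2 and 3 can be automated; this small sketch (ours, not the book's) checks that the stated correspondence preserves the operations.

```python
# Sketch (not from the text): checking that 0 -> 1, 1 -> 2, 2 -> 4, 3 -> 3
# carries addition mod 4 into multiplication mod 5, as in Tables 2 and 3.
T = {0: 1, 1: 2, 2: 4, 3: 3}

for a in range(4):
    for b in range(4):
        assert T[(a + b) % 4] == (T[a] * T[b]) % 5
```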
Isomorphism behaves like equality in another respect:

Theorem 6. The relation "G is isomorphic to G'" is a reflexive, symmetric, and transitive relation between groups.

Proof. The reflexive property is trivial (every group is isomorphic to itself by the identity transformation). As for the symmetric property, let a ↔ aᵀ be any isomorphic correspondence between G and G'; since T is bijective, it has an inverse T⁻¹, which is an isomorphism of G' onto G. Finally, if T maps G isomorphically on G', while T' maps G' isomorphically on G'', then TT' is an isomorphism of G with G''; this completes the proof.

It is worth observing that Theorem 6 and its proof hold equally for isomorphisms between integral domains, and indeed for isomorphisms between algebraic systems of any kind whatever. The same is true of the following result.

Theorem 7. Under an isomorphism between two groups, the identity elements correspond, and the inverses of corresponding elements correspond.

Proof. The unique solution e of ax = a goes into the unique solution e' of a'x = a'; hence the identities correspond. Consequently, the unique solution a⁻¹ of the equation ax = e in G goes into the unique solution a'⁻¹ of a'x = e' in G'.

We shall finally prove a remarkable result of Cayley, which can be interpreted as demonstrating the completeness of our postulates on the multiplication of transformations.

Theorem 8. Any abstract group G is isomorphic with a group of transformations.

Proof. Associate with each element a ∈ G the transformation φ_a: x ↦ xa = xφ_a on the "space" of all elements x of G. Since eφ_a = eφ_b implies a = ea = eb = b, distinct elements of G correspond to distinct transformations. The set G' of all φ_a contains, with any two transformations, their product, for the product φ_aφ_b is φ_ab:

(6)  x(φ_aφ_b) = (xa)b = x(ab) = xφ_ab for all x.

Since xφ_e = xe = x for all x, G' contains the identity. One can similarly show that (φ_a)⁻¹ exists and is in G' for all a, being in fact φ_(a⁻¹). Hence G' is a group of transformations, and since (6) holds for all x, the correspondence a ↦ φ_a is an isomorphism of G with G'. Q.E.D.
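Cayley's construction in Theorem 8 can be carried out explicitly for a small group; the sketch below (an editorial illustration, not from the text) does so for the additive group Z₄.

```python
# Sketch (not from the text): Cayley's construction (Theorem 8) for the
# additive group Z_4, associating to each a the translation x -> x + a.
n = 4

def phi(a):                       # the transformation phi_a, as a tuple
    return tuple((x + a) % n for x in range(n))

Gprime = [phi(a) for a in range(n)]

# Distinct elements give distinct transformations, and phi_a phi_b = phi_(a+b):
assert len(set(Gprime)) == n
for a in range(n):
    for b in range(n):
        composed = tuple(phi(b)[x] for x in phi(a))   # first phi_a, then phi_b
        assert composed == phi((a + b) % n)
```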
Exercises

1. Are any two of the following groups isomorphic: (a) the group of symmetries of an equilateral triangle; (b) the group of symmetries of a square; (c) the group of rotations of a regular hexagon; (d) the additive group of integers mod 6?
2. Is the multiplicative group of nonzero real numbers isomorphic with the additive group of all real numbers?
3. (a) Exhibit an isomorphism between the group of the square and a group of transformations on the four vertices 1, 2, 3, 4 of the square. (b) Show explicitly how inverses correspond under this isomorphism.
4. Determine all the isomorphisms between the additive group of Z₄ and the group of rotations of the square.
5. Do the same for the group of all rotations of a regular hexagon.
6. The same question for (a) the group of rotations of a square; (b) the group of symmetries of a rectangle; (c) the group of symmetries of a rhombus (equilateral parallelogram).
*7. (a) Prove that the additive group of "Gaussian" integers m + n√−1 (m, n ∈ Z) is isomorphic to the multiplicative group of rational fractions of the form 2ⁿ3ᵐ (m, n ∈ Z). (b) Show that both are isomorphic to the group of all translations of a rectangular network.
8. Illustrate Theorem 8 by exhibiting a group of transformations isomorphic with each of the following groups: (a) the additive group of all real numbers; (b) the multiplicative group of all nonzero real numbers; (c) the additive group of integers mod 8; (d) the multiplicative group of 1, 5, 8, 12, mod 13; (e) the multiplicative group of 1, 5, 7, 11, mod 12.

6.6. Cyclic Groups

In any group, the integral powers aᵐ of the group element a can be defined separately for positive, zero, and negative exponents. If m > 0, we define

(7)  aᵐ = a · a ··· a (to m factors),   a⁰ = e,   a⁻ᵐ = (a⁻¹)ᵐ.

Two of the usual laws of exponents hold,

(8)  aʳaˢ = aʳ⁺ˢ,   (aʳ)ˢ = aʳˢ,

but (ab)ʳ ≠ aʳbʳ in general (cf. Ex. 2 below). If both exponents r and s are positive, the laws (8) follow directly† from the definition (7).

† Thus r factors "a" followed by s factors "a" give all told r + s factors; similarly, s sets of r factors "a" each give all told rs factors.
In the other cases for the first law of (8), one of r or s may be zero, in which case the result is immediate, or both r and s may be negative, in which case it comes directly from the last part of the definition (7). There remains the case when one exponent is negative and one positive, say r = −m and s = n, with m > 0 and n > 0. Then

    a⁻ᵐaⁿ = (a⁻¹)ᵐaⁿ = (a⁻¹ ··· a⁻¹)(a ··· a).

By the associative law we can cancel successive a's against the inverses a⁻¹. In case n ≥ m we have left aⁿ⁻ᵐ, while if n < m we have some inverses left, (a⁻¹)ᵐ⁻ⁿ or a⁻⁽ᵐ⁻ⁿ⁾. In both cases we have the desired law a⁻ᵐaⁿ = aⁿ⁺⁽⁻ᵐ⁾. The second half of (8) can be established even more simply, noting that (aʳ)⁻¹ = a⁻ʳ whether r is positive, zero, or negative.

The order of an element a in a group G is the least positive integer† m such that aᵐ = e; if no positive power of a equals the identity e, a has order infinity.

† The well-ordering principle of §1.4 guarantees the existence of this m.

Definition. The group G is cyclic if it contains some one element x whose powers exhaust G; this element is said to generate the group.

For example, the group of all rotations of a square into itself consists of the four powers R, R², R³, and R⁴ = I of the clockwise rotation R through 90°. This group might equally well have been generated by R³, which is a counterclockwise rotation through 90°, since R² = (R³)², R = (R³)³, and I = (R³)⁴, so that the powers of R³ exhaust the group.
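Orders of elements are easy to compute in a concrete group; the following editorial sketch (not from the text) does so in the multiplicative group of nonzero residues mod 13.

```python
# Sketch (not from the text): computing orders of elements in the
# multiplicative group of nonzero residues mod 13.
def order(a, n):
    """Least positive m with a^m = 1 (mod n)."""
    m, x = 1, a % n
    while x != 1:
        x = (x * a) % n
        m += 1
    return m

assert order(5, 13) == 4          # powers 5, 12, 8, 1
# a^t = e exactly when the order of a divides t:
a, n = 2, 13
m = order(a, n)                   # 2 has order 12 mod 13
assert all((pow(a, t, n) == 1) == (t % m == 0) for t in range(1, 40))
```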
Theorem 9. If an element a generates the cyclic group G, then the order of a determines G to within isomorphism. If the order of a is infinite, G is isomorphic with the additive group of the integers; if the order of a is some finite integer n, G is isomorphic with the additive group of the integers modulo n.

Proof. Since aʳ(aˢ)⁻¹ = aʳ⁻ˢ by (8), we have

(9)  aʳ = aˢ if and only if aʳ⁻ˢ = e.

First, if the order of a is infinite, then aʳ⁻ˢ = e for no r > s, so that no two powers of a are equal; since aʳaˢ = aʳ⁺ˢ by (8), the correspondence aˢ ↔ s therefore makes G isomorphic with the additive group of the integers. If instead the order of a is finite, the set of those integers t with aᵗ = e contains 0 and, by (8), contains the sum and difference of any two of its members; hence, by Theorem 6 of Chap. 1, it consists of the multiples of the order n of a. That is, aᵗ = e if and only if t is a multiple of the order n of a, and so, by (9), aʳ = aˢ if and only if n | (r − s), that is, if and only if r ≡ s (mod n), proving our first assertion. Consequently, by (8) again, the function aʳ ↦ r is an isomorphism of G to the additive group of the integers modulo n. Q.E.D.

It is a corollary that the number of elements in any cyclic group G is equal to the order of any generator of G, and that any two cyclic groups of the same order are isomorphic.

The group of the square is not cyclic, but is generated by the two elements R and H; indeed, Table 1 (§6.4) shows that each of its elements can be written in the form HⁱRʲ, since R = R, H = H, HR = D', and so on. Moreover, H and R satisfy

    R⁴ = I,   H² = I,   RH = HR³.

These are called "defining relations" because they suffice to put the product of any two elements HⁱRʲ (i = 0, 1 and j = 0, 1, 2, 3) into the same form; the elements of the group are thus represented uniquely as HⁱRʲ with i = 0, 1 and j = 0, 1, 2, 3. Similar calculations will give the whole multiplication table for the group of the square (Table 1).

Exercises

1. Using the definitions a¹ = a, aᵐ⁺¹ = aᵐa, prove laws (8), for positive exponents, by induction.
2. Prove that if (ab)ⁿ = aⁿbⁿ for all a and b in G and all positive integers n, then G is commutative.
3. Find the order of every element in the group of the square.
4. Make similar studies for the groups described in Exs. 2, 3, and 5 of §6.5.
5. How many different generators has a cyclic group of order 6?
6. Is the multiplicative group of 1, 2, ···, 6 mod 7 cyclic? of 1, 3, 5, 7 mod 8? of 1, 2, 4, 5, 7, 8 mod 9?
7. If a cyclic group G is generated by a of order m, find the order of any element aᵏ of G.
8. Under the hypotheses of Ex. 7, prove that aᵏ generates G if and only if g.c.d. (k, m) = 1.
9. The dihedral group Dₙ is the group of all symmetries of a regular polygon of n sides (if n = 4, Dₙ is the group of the square). Show that Dₙ contains 2n elements and is generated by two elements R and H with Rⁿ = I, H² = I, and RH = HRⁿ⁻¹.
10. Give the elements and the multiplication table of the group generated by two elements x and y subject to the defining relations x² = y² = e, xy = yx.
*11. Obtain generators and defining relations for the groups of symmetries of the three infinite patterns

    ) ) )        7 / 7 / 7 /        v V V

imagined as extended to infinity in both directions.
*12. Are any two of these three groups isomorphic?

6.7. Subgroups

Many groups are contained in larger groups. Thus the group of rotations of the square is a part of the group of all symmetries of the square, and the group of the even integers under addition is a part of the group of all integers under addition. Again, the group of the eight permutations of the vertices of the square induced by symmetries is a part of the group of all 4! = 24 permutations of these vertices. These examples suggest the concept of a subgroup.

Definition. A subset S of a group G is called a subgroup of G if S is itself a group with respect to the binary operation (multiplication) of G.

In any group G, the set consisting of the identity e alone is a subgroup; the whole group G is also a subgroup of itself. Subgroups of G other than the trivial ("improper") subgroups e and G are called proper subgroups.

Theorem 10. A nonvoid subset S of a group G is a subgroup if and only if (i) a and b in S imply ab in S, and (ii) a in S implies a⁻¹ in S.

Proof. Under these hypotheses, clearly S is a subgroup: the associativity is trivial, since the other group postulates are assumed in G; and the identity e = aa⁻¹ of G is in S, for there is at
Consequently.6 Groups least one element a in S. one finds the (proper) subgroups leaving invariant each of the eight following configurations: A diagonal An axis A/ace An axis and a diagonal [I. R'] [I. also holds if the set S consists of 0 alone. The set of integers s for which as is in S is therefore closed under addition and subtraction. Q. Proof Let G consist of the powers of an element a. S. only those r > 0 which are divisors of n determine subgroups in this manner-but again these subgroups are all distinct.1 for the operations of this 'group. S .144 Ch. In case G is infinite. and so is the identity of G (Ex.E. the inverse of any element a in the subgroup is the same as its inverse in G. every r > 0 determines a different subgroup.D. A nonvoid subset S of a finite group G is a subgroup of G if and only if the product of any two elements in S is itself in S. If G has n elements.1• Hence one has the following simplified condition. We leave to the reader the verification that the center is. Theorem 11. so (ii) holds.4). in fact. D'. Condition (i) is obvious. Therefore S itself consists of the powers a kr = (aT)k. Theorem 12. D. V. The identity x = e' of any subgroup of G satisfies xx == x. For elements a of finite order m. always a subgroup of G. and so 1 a. We shall now solve it in the case that G is a cyclic group. then since an = e is surely in S. with r = 0. § 1. R'] [I. sot consists of the mUltiples of some least positive exponent r (Theorem 6. D] [I. 7). R' .= a m . This is defined as the set of all elements a E G such that ax = xa for all x E G. Among the subgroups of a given non-Abelian group G. hence is cyclic with generator aT. one of the most important is its center. ab in T.R'.8 why they all contain either 2 or 4 elements. observe that any subgroup must contain not only two cyclic subgroups {a} and {b}. All these subgroups may be displayed in their relation to each other in a table. 
but this can never happen unless the number of elements in the group is a product of at least four distinct primes.fI. but also the set {a. The intersection S n T of two subgroups Sand T of a group G is a subgroup of G. Observe first that if a subgroup S of G contains an element a. c} generated by three or more elements. as follows. likewise it implies a-I in T. (Prove. using Theorem 11.VJ :--l~ / Rgure3 Without using geometry we could still find all these subgroups.~. Hence by Theorem . we understand those which do not turn the square over. this gives us all but the first two of the subgroups listed.R] ~ [I. Next. hence a-I in S. this procedure gives us the remaining subgroups.D.7 145 Subgroups By the transformations leaving a face invariant. In the present case. Proof By Theorem 10. it also contains the "cyclic" subgroup {a} (prove it is a subgroup!) consisting of all the powers of a. a and b in S n T implies ab in S. [I'D')~ [I. a in S n T implies a in S. (We shall see in §6. we may have to test further for subgroups {a. Indeed. . of any two sets!) S and T is the set of all elements which belong both to S and to T.)------. .D) ~.R"] [J H V R'] ".D. of powers of a and b. where each group is joined to all of its subgroups by a descending line or sequence of lines. and so a-I in S n T.Ht. b} of all products. as shown in Figure 3. such as a 2 b.§6.R. Similarly. The intersection S n T of two subgroups (indeed. that these form a subgroup!) In the present case.~I>---G [J.) In general.3 a. [J. the determination of all the subgroups of a specified finite group G is most efficiently handled by considering the group elements purely abstractly. and so ab in S n T. b. Theorem 13.'--fI. Show that the additive group of Zp[x] is such a group. Do the same for the group of a regular polygon of n sides. 3. 7. prove T is a subgroup of G. Prove that the center of any group 0 is a subgroup of G. S nTis a subgroup. there exists a least subgroup containing both Sand T. 4. 
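The closure test of Theorem 11 and the generation of subgroups from elements are easy to carry out by machine. The following Python sketch is not from the text: it encodes each motion of the square as a tuple of vertex images (the particular tuples chosen for R and H are our own convention) and closes a generating set under products, which by Theorem 11 suffices in a finite group.

```python
# Theorem 11 in action: in a finite group, closing a nonvoid set under
# products already yields a subgroup.  Motions of the square are modeled
# as permutations of the vertices 1..4, written as tuples (p[0] is the
# image of vertex 1, and so on); the encodings of R and H are our own.

def compose(p, q):
    """Apply p first, then q."""
    return tuple(q[p[i] - 1] for i in range(4))

I = (1, 2, 3, 4)          # identity
R = (2, 3, 4, 1)          # rotation through 90 degrees
H = (2, 1, 4, 3)          # a reflection

def generate(gens):
    """Close a set of group elements under products (enough, by Thm 11)."""
    elems = set(gens)
    while True:
        new = {compose(a, b) for a in elems for b in elems} - elems
        if not new:
            return elems
        elems |= new

D4 = generate({I, R, H})        # the whole group of the square
cyclic_R = generate({R})        # the cyclic subgroup {R}
assert len(D4) == 8 and len(cyclic_R) == 4
```

Running `generate` on each pair of elements of `D4` reproduces the eight proper subgroups enumerated above.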
Clearly, S ∩ T is the largest subgroup contained in both S and T. Dually, there exists a least subgroup containing both S and T. It consists of all the products of positive and negative powers of elements of S and T; it is called the join of S and T, and denoted S ∪ T. We shall return to these concepts in Chap. 11.

Exercises

1. In the group of symmetries of the regular hexagon, what is the subgroup leaving a diagonal fixed?
2. Tabulate all the subgroups of the following groups: (a) the additive group mod 12; (b) the group of a regular pentagon; (c) the group of a regular hexagon; *(d) the group of all permutations of four letters.
3. In the group of all permutations φ of the four digits 1, 2, 3, 4, find the following subgroups: (a) all φ carrying the set {1, 2} into the set {1, 2}; (b) all φ such that a ≡ b (mod 2) implies aφ ≡ bφ (mod 2) for any digits a, b of the set 1, 2, 3, 4.
4. Find the center of the group (a) of the square; (b) of the equilateral triangle.
5. Prove that Theorem 11 still holds if G is infinite but all elements of G have finite order.
*6. Show that the elements of finite order in any commutative group G form a subgroup. Show that the additive group of Zp[x] is an infinite group all of whose elements have finite order.
7. If T is a subgroup of S, and S a subgroup of G, prove that T is a subgroup of G.
8. Prove that the center of any group G is a subgroup of G.
*9. Let a ↔ a' be an isomorphism between two groups G and G' of permutations, and let S consist of those permutations of G leaving one letter fixed. Does the set S' of all elements of G' corresponding to elements a in S necessarily form a subgroup of G'? Must the set S' leave a letter fixed? Illustrate.

6.8. Lagrange's Theorem

We now come to a far-reaching concept of abstract group theory: the idea that any subgroup S of a group G decomposes G into cosets.

Definition. By a right coset (left coset) of a subgroup S of a group G is meant any set Sa (or aS) consisting of all the right-multiples sa (left-multiples as) of the elements s of S by a fixed element a in G. By the order of a group or subgroup is meant the number of its elements. The number of distinct right cosets of S is called the "index" of S in G.

Since Se = S, S is a right coset of itself. Since any right coset Sa always contains a = ea, any group G is exhausted by its right cosets.

Lemma 1. If S is finite, each right coset Sa of S has exactly as many elements as S does.

For, the transformation s → sa is bijective: each element t = sa of the coset Sa is the image of one and only one element s = ta^-1 of S.
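The coset decomposition can be checked mechanically. The sketch below (not from the text; the tuple encodings of the motions of the square are our own, so the letter names of the cosets may come out in a different order than in the worked example that follows) computes the right cosets of a two-element subgroup in the group of the square and confirms Lemma 1.

```python
# Right cosets of S = {I, H} in the group of the square, with motions
# modeled as vertex permutations (our own encoding).

def compose(p, q):
    """Apply p first, then q, so compose(s, a) is the product sa."""
    return tuple(q[p[i] - 1] for i in range(4))

I = (1, 2, 3, 4)
R = (2, 3, 4, 1)
H = (2, 1, 4, 3)

# build the whole group of the square from R and H
G = {I}
while True:
    new = {compose(a, b) for a in G for b in G | {R, H}} - G
    if not new:
        break
    G |= new

S = {I, H}
cosets = {frozenset(compose(s, a) for s in S) for a in G}
assert len(cosets) == 4                       # the index of S in G
assert all(len(c) == len(S) for c in cosets)  # Lemma 1
assert set().union(*cosets) == G              # G is exhausted by its cosets
```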
Lemma 2. Two right cosets Sa and Sb of S are either identical or without common elements.

For, suppose Sa and Sb have an element c = s'a = s''b (s', s'' in S) in common. Then Sb contains every element sa = ss'^-1 s'a = (ss'^-1 s'')b of Sa, and similarly Sa contains every element of Sb. Hence Sa = Sb. (Cf. also Theorem 8.)

Therefore G is decomposed by S into nonoverlapping subsets, each of which has exactly as many elements as S. If G is finite,† the conclusion is:

Theorem 14 (Lagrange). The order of a finite group G is a multiple of the order of every one of its subgroups.

† The extension to the infinite case follows immediately from the discussion of Chap. 12, but the importance of the result disappears.

It is easy to illustrate these results. Thus, if G is the group of symmetries of the square, the subgroup S = [I, H] has the four right cosets

  [I, H],  [I, H]R = [R, HR] = [R, D'],  [I, H]R' = [R', HR'] = [R', V],  [I, H]R'' = [R'', HR''] = [R'', D],

each of which has two elements, and every element of the group falls into one of the four right cosets. Again, let G be the symmetric group of all permutations of the six symbols 1, 2, ⋯, 6, while S is the subgroup leaving the symbol 1 fixed. Then 1φ = k implies, for all ψ in S, that 1(ψφ) = (1ψ)φ = 1φ = k. Hence the coset Sφ contains only (and so, by Lemma 1, all) the 5! permutations carrying 1 → k. Therefore the right cosets of S are the subsets carrying 1 → 1, 1 → 2, ⋯, 1 → 6. Finally, if G is the additive group of the integers, the subgroup of multiples ±5n of 5 has for right cosets the different residue classes modulo 5.

Each element a of G generates a cyclic subgroup, whose order is (Theorem 9) simply the order of a. Therefore we have

Corollary 1. Every element of a finite group G has as order a divisor of the order of G.

Corollary 2. Every group G of prime order p is cyclic.

For, by Corollary 1, the cyclic subgroup A generated by any element a ≠ e in such a group has an order n > 1 dividing p. But this implies n = p, and so G = A is cyclic.

More generally, Lagrange's theorem can be applied to determine (up to isomorphism) all abstract groups of any low order. As an example, define the four group as the group with four commuting elements: e (the identity) and a, b, c, with ex = xe = x for all x, together with a^2 = b^2 = c^2 = e, ab = ba = c, bc = cb = a, ac = ca = b. It will be shown in §6.9 that this group is isomorphic to the group of symmetries of a rectangle. We now prove

Corollary 3. The only abstract groups of order four are the cyclic group of that order and the four group.

Proof. If a group G of order 4 contains an element of order 4, it is cyclic. Otherwise, by Corollary 1, all elements of G except e must have order 2. Call them a, b, c. By the cancellation law, ab cannot be ae = a or eb = b; and ab = e = aa would give b = a; hence ab = c. Similarly, ba = c, bc = cb = a, ac = ca = b. In other words, G is the four group. Therefore every group of order four is isomorphic to either the cyclic group of that order or the four group. Q.E.D.

Lagrange's theorem can also be applied to number theory. The multiplicative group mod p (excluding zero) has p − 1 elements. The order of any element a of this group is then a divisor of p − 1, so that a^{p-1} ≡ 1 (mod p) whenever a ≢ 0 (mod p). If we multiply by a on both sides, we obtain the desired congruence below, except for the case a ≡ 0 (mod p), for which the conclusion is trivially true.

Corollary 4 (Fermat). If a is an integer and p a prime, then a^p ≡ a (mod p).

Exercises

1. Give the multiplication table of the four group.
2. Determine the cosets of the subgroup [I, D] of the group of the square.
3. Let G be the group of a regular hexagon, S the subgroup leaving one vertex fixed. Find the right and left cosets of S.
*4. Prove that the only abstract groups of order 6 are the cyclic group and the symmetric group on three letters.
5. Prove: the number of right cosets of any subgroup of a finite group equals the number of its left cosets. (Hint: Use the correspondence x → x^-1.)
6. (a) If G is the group of all transformations x → ax + b, where a ≠ 0 and b are real, while S is the subgroup of all such transformations with a = 1, describe the right and left cosets of S in G. (b) Do the same for the subgroup T of all transformations with b = 0.
7. Check Fermat's theorem for p = 7 and a = 2.
8. Prove that a group of order p^m, where p is a prime, must contain a subgroup of order p.
9. If S and T are subgroups of orders s and t of a group G, and if u and v are the orders of S ∩ T and S ∪ T, prove that st ≦ uv.
10. (a) Enumerate the subgroups of the dihedral group (§6.6, Ex. 10) of order 26. How many are there? (b) Generalize your result.
11. If S is any subgroup of a group G, let x ≡ y (mod S) be defined to mean xy^-1 ∈ S. (a) Prove that this relation is reflexive, symmetric, and transitive, and show that x ≡ y (mod S) if and only if x and y lie in the same right coset of S. (b) Show that x ≡ y (mod S) implies xa ≡ ya (mod S) for all a.
*12. For a subgroup S, let SaS denote the set of all products sas' for s, s' in S. Prove that for any a, b in G, either SaS ∩ SbS is void or SaS = SbS.
*13. Let 2^h + 1 be a prime p. (a) Prove that in the multiplicative group mod p, the order of 2 is 2h. (b) Using Lagrange's theorem, infer that 2h divides p − 1 = 2^h. (c) Conclude that h is a power of 2.
*14. (a) Show that in any commutative ring R, the units (those elements with multiplicative inverses) form a group G under multiplication. (b) Show that in case R = Zn, then G consists of the positive integers k < n relatively prime to n. The order of this G is denoted φ(n) and called Euler's φ-function. (c) Show that φ(p) = p − 1 if n = p is a prime, and compute φ(12), φ(16), φ(30). (d) Infer, generalizing Fermat's theorem, that if (k, n) = 1, then k^{φ(n)} ≡ 1 (mod n).
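Corollaries 1 and 4 invite a numerical check. The short Python sketch below is our own illustration (the helper name `order_mod` is not from the text): it verifies, for a small prime, that every element's order divides the group order p − 1, and that Fermat's congruence holds for every integer a.

```python
# Checking Corollary 1 in the multiplicative group mod p, and Fermat's
# congruence a^p = a (mod p).  The helper name is ours.

def order_mod(a, p):
    """Order of a in the multiplicative group mod p (a not divisible by p)."""
    k, x = 1, a % p
    while x != 1:
        x = (x * a) % p
        k += 1
    return k

p = 7
# every element's order divides the group order p - 1 (Corollary 1):
assert all((p - 1) % order_mod(a, p) == 0 for a in range(1, p))
# Fermat (Corollary 4): a^p = a (mod p) for every integer a, zero included:
assert all(pow(a, p, p) == a % p for a in range(-20, 50))
```

Here `pow(a, p, p)` is Python's built-in modular exponentiation, so the test runs instantly even for large primes.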
6.9. Permutation Groups

A permutation is a one-one transformation of a finite set onto itself. For instance, the set might consist of the five digits 1, 2, 3, 4, 5. One permutation might be the transformation φ with

(10)  1φ = 2,  2φ = 3,  3φ = 4,  4φ = 5,  5φ = 1.

Another might be the transformation φ' with

(11)  1φ' = 2,  2φ' = 3,  3φ' = 1,  4φ' = 5,  5φ' = 4.

Permutations which, like the permutation φ defined above, give a circular rearrangement of the symbols permuted (Figure 4) are called cyclic permutations or cycles. [Figure 4: the five digits arranged in a circle, each carried into the next.] There is a suggestive notation for cyclic permutations: simply write down inside parentheses first any letter involved, then its transform, then the transform of that letter, and so on, ending with the letter transformed into the original letter. For example, the permutation φ of (10) might be written in any one of the equivalent forms (12345), (23451), (34512), (45123), or (51234).

Theorem 15. A cyclic permutation of n symbols has order n.

Proof. The cyclic permutation γ = (a_1 a_2 ⋯ a_n) carries a_i into a_{i+1}. Hence γ^2 has the doubled effect of carrying each a_i into a_{i+2}, and generally γ^k carries a_i into a_{i+k}, where all subscripts are to be reduced modulo n. We have in γ^k the identity I if and only if a_{i+k} equals a_i, that is, if and only if k ≡ 0 (mod n). The smallest positive k with γ^k = I is then n itself, so γ does have the order n (see the definition in §6.6). The cycle γ is said to have length n.

The notation for a cyclic permutation can be extended to any permutation. For instance, the permutation φ' in (11) cyclically permutes the digits 1, 2, and 3, and cyclically permutes 4 and 5; it is the product (123)(45) of these two cycles. This product may be written in either order, (123)(45) = (45)(123), since the symbols permuted by (123) are left unchanged by (45). The reader will find it instructive to compute φφ' and φ'φ, and to note that φφ' ≠ φ'φ.

Theorem 16. Any permutation φ can be written as a product of disjoint† cycles.

† Two sets are called disjoint when they have no element in common.

Proof. Select any symbol and denote it by a_1. Denote a_1 φ by a_2, a_2 φ by a_3, and so on, until a_n φ = a_i is some element already named. Since the antecedent of any a_i (i > 1) is a_{i-1}, a_n φ must be a_1. Thus the effect of φ on the letters a_1, ⋯, a_n is the cycle (a_1 a_2 ⋯ a_n); hence φ permutes the remaining symbols among themselves. The result now follows by induction on the number of symbols. Q.E.D.
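The proof of Theorem 16 is in fact an algorithm: starting from any symbol, follow its images under φ until the cycle closes, then repeat on the symbols not yet visited. A minimal Python sketch (our own; permutations are modeled as dicts, an encoding not used in the text):

```python
# Decomposing a permutation into disjoint cycles, exactly as in the
# proof of Theorem 16.  A permutation is a dict mapping each symbol to
# its image; phi_prime below is the permutation phi' of display (11).

def disjoint_cycles(phi):
    """Return the disjoint cycles of phi as a list of tuples."""
    cycles, seen = [], set()
    for start in sorted(phi):
        if start in seen:
            continue
        cycle, x = [], start
        while x not in seen:        # follow a1 -> a2 -> ... until it closes
            seen.add(x)
            cycle.append(x)
            x = phi[x]
        cycles.append(tuple(cycle))
    return cycles

phi_prime = {1: 2, 2: 3, 3: 1, 4: 5, 5: 4}
assert disjoint_cycles(phi_prime) == [(1, 2, 3), (4, 5)]
```

Note that fixed symbols come out as cycles of length one, matching the remark below that the identity on m letters is a product of m such "cycles."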
Conversely, evidently any product of disjoint cycles represents a permutation. In particular, the identity permutation on m letters is represented by m "cycles," each of length one.

Theorem 17. The order of any permutation φ is the least common multiple of the lengths of its disjoint cycles.

Proof. Write the permutation φ as the product φ = γ_1 ⋯ γ_r of disjoint cycles γ_i. If i ≠ j, then γ_i and γ_j are disjoint, acting on disjoint sets of symbols; hence γ_i γ_j = γ_j γ_i, which means that successive application of these permutations in either order gives the same result. The factors γ_i may therefore be rearranged in φ and in its powers to give φ^n = γ_1^n ⋯ γ_r^n for all n. Moreover, φ^n = I if and only if every γ_i^n is the identity. But by Theorem 15, this means that φ^n = I if and only if n is a common multiple of the lengths of the γ_i, from which the conclusion of Theorem 17 follows immediately. Q.E.D.

Every finite group is isomorphic with one or more groups of permutations. In particular, this is true of finite groups of symmetries of geometrical figures, as we now illustrate by two examples.

Consider the group of symmetries of the rectangle (Figure 5). [Figure 5: a rectangle with vertices numbered 1, 2, 3, 4.] Under it, the vertices are transformed by the four permutations

  I = (1)(2)(3)(4),  R = (14)(23),  V = (12)(34),  H = (13)(24).

This group is known as the four group. Moreover, we can also represent it as a group of permutations of the four symbols I, R, V, H which denote the elements of the group: it is isomorphic with the group of permutations

  φ_I = (I)(R)(V)(H),  φ_R = (IR)(HV),  φ_V = (IV)(RH),  φ_H = (IH)(RV).

The group of symmetries of the square (§6.1) can similarly be represented as a group of permutations of the four vertices (Figure 6). [Figure 6: a square with vertices numbered 1 to 4.] Moreover, using Theorem 8, we can also represent it as a group of permutations of the eight symbols which represent the elements of the group. Thus R corresponds to the permutation effected on right-multiplication of these symbols by "R": from the column headed "R" in the group table (Table 1), one sees that this permutation is (IRR'R'')(HD'VD). Similarly, H corresponds to (IH)(RD)(R'V)(R''D').

Two cycles of the same length are closely related. For example, if γ = (1234) and γ' = (2143), then one may compute that γ' = φ^-1 γφ, where φ = (12)(34) is the permutation taking each digit of the cycle γ into the corresponding digit in γ'. This is a special case of the following result.

Theorem 18. Let φ and γ be permutations of n letters, where γ is a cyclic permutation γ = (a_1 ⋯ a_m), and denote by γ' = (a_1 φ, ⋯, a_m φ) the cycle obtained by replacing each letter a_i in the representation of γ by its image under φ. Then φ^-1 γφ = γ'.

Proof. The product φ^-1 γφ carries each letter a_i φ in succession into a_i φφ^-1 = a_i, then to a_i γ = a_{i+1}, then to a_{i+1} φ, and hence has the same effect upon a_i φ as does γ' (call a_{m+1} = a_1). Similarly, one computes that φ^-1 γφ and γ' both carry any letter b not of the form a_i φ into itself. Hence φ^-1 γφ = γ', as asserted.

Corollary. For any permutations φ and ψ, if ψ = γ_1 ⋯ γ_r is written as a product of cycles, we have φ^-1 ψφ = γ_1' ⋯ γ_r', where the γ_i' are obtained from the γ_i as in Theorem 18.
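Both Theorem 17 and Theorem 18 can be verified directly on small permutations. The sketch below is our own (permutations as dicts on the digits 1 to 5, composed left to right as in the text's xφψ convention):

```python
from math import lcm

# Checks of Theorems 17 and 18 on small permutations, modeled as dicts.

def compose(p, q):
    """x -> (x p) q, matching the text's left-to-right convention."""
    return {x: q[p[x]] for x in p}

def inverse(p):
    return {v: k for k, v in p.items()}

def order(p):
    k, q = 1, dict(p)
    while any(q[x] != x for x in q):
        q = compose(q, p)
        k += 1
    return k

# phi' = (123)(45): its order is lcm(3, 2) = 6  (Theorem 17)
phi_prime = {1: 2, 2: 3, 3: 1, 4: 5, 5: 4}
assert order(phi_prime) == lcm(3, 2) == 6

# Theorem 18: with gamma = (1234) and phi = (12)(34), the conjugate
# phi^-1 gamma phi is gamma' = (2143), i.e. gamma relabeled by phi.
gamma = {1: 2, 2: 3, 3: 4, 4: 1, 5: 5}
phi   = {1: 2, 2: 1, 3: 4, 4: 3, 5: 5}
conj = compose(compose(inverse(phi), gamma), phi)
gamma_prime = {2: 1, 1: 4, 4: 3, 3: 2, 5: 5}   # the cycle (2143)
assert conj == gamma_prime
```

(`math.lcm` requires Python 3.9 or later.)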
Exercises

1. Express as products of disjoint cycles the following permutations of the digits 1, 2, ⋯, 6:
  (a) 1φ = 4, 2φ = 5, 3φ = 2, 4φ = 6, 5φ = 3, 6φ = 1;
  (b) 1φ = 5, 2φ = 6, 3φ = 3, 4φ = 1, 5φ = 4, 6φ = 2;
  (c) 1φ = 3, 2φ = 5, 3φ = 6, 4φ = 4, 5φ = 1, 6φ = 2.
  Find the order of each of these permutations.
2. Represent the following products as products of disjoint cycles: (1234)(567)(261)(47), (12345)(67)(1357)(163), (14)(123)(45)(14). Find the order of each product.
3. Find the order of (abcdef)(ghij)(klm) and of (abcdef)(abcd)(abc).
4. Represent the group of the rhombus (equilateral parallelogram) as a group of permutations of its vertices.
5. Represent the group of symmetries of the equilateral triangle as a group of permutations of (a) three and (b) six letters.
6. Which symmetric groups are Abelian?
7. Let G be the group of all symmetries of the cube leaving one vertex fixed. Represent G as a group of permutations of the vertices.
8. In what sense is the representation of Theorem 16 unique? Prove your answer.
9. Describe the right and left cosets of the subgroup of all those permutations of x_1, ⋯, x_6 which carry the set {x_1, x_2} into itself.
*10. (a) Prove that every permutation can be written as a product of (not in general disjoint) cycles of length two ("transpositions"). *(b) How does this relate to the proof of the "generalized commutative law" from the law ab = ba (§1.5)? *(c) Do (b) in two essentially different ways.
*11. Prove that the symmetric group of degree n is generated by the cycles (1, 2, ⋯, n − 1) and (n − 1, n).

6.10. Even and Odd Permutations

An important classification of permutations may be found by considering the homogeneous polynomial form

  P = Π (x_i − x_j),  i < j,

where i and j run from 1 to n. P is a polynomial of degree n(n − 1)/2. If n = 3, P is (x_1 − x_2)(x_1 − x_3)(x_2 − x_3), and P^2 is the discriminant discussed in Chap. 5. Clearly, any permutation of the subscripts in P leaves the set of factors of P, and hence P itself, unchanged except as to sign. Thus, the transposition (x_1 x_2) changes (x_1 − x_2) into its negative (x_2 − x_1), interchanges the factors (x_1 − x_j) and (x_2 − x_j) for each j > 2, and leaves the other factors unchanged. Hence it does change P to −P. The n! permutations of the subscripts are therefore of two kinds: the even permutations, leaving P (and −P) invariant, and the odd permutations, interchanging P and −P. Considering the effect of two permutations performed in succession, we obtain the rules

(12)  even × even = odd × odd = even,  even × odd = odd × even = odd.

It is a corollary of (12) and of Theorem 11 that the even permutations form a subgroup A_n of the symmetric group of degree n. This subgroup is usually called the "alternating group" of degree n. Moreover, if β is a fixed and φ a variable odd permutation, then φβ^-1 is even, and so φ = (φβ^-1)β is in the right coset A_n β. Thus the odd permutations form a single right coset of A_n. Hence, by Lagrange's theorem, the "alternating group" on n symbols contains just (n!)/2 elements.

A polynomial p(x_1, ⋯, x_n) in n indeterminates x_i is called "symmetric" if it is invariant under the symmetric group of all permutations of its subscripts. Particular symmetric polynomials are (for n = 3)

(15)  u_1 = Σ x_i,  u_2 = Σ x_i x_j (i < j),  u_3 = Σ x_i x_j x_k (i < j < k).

They are the coefficients in the expansion of (t − x_1)(t − x_2)(t − x_3). In general, since (−1)^k u_k is the coefficient of t^{n-k} in the expansion of p(t) = Π_k (t − x_k) as a polynomial in t, the expressions u_i give the coefficients of p(t) as functions of its roots; we call such polynomials elementary symmetric polynomials (in n variables). They derive much of their importance from the so-called "fundamental theorem on symmetric polynomials," which we shall state without proof.†

† See L. Weisner, Introduction to the Theory of Equations (New York: Macmillan, 1938), p. 108. Also see Chap. 15.

Theorem 19. Any symmetric polynomial p(x_1, ⋯, x_n) can be expressed as a polynomial in the elementary symmetric polynomials.

Thus, in the case of two variables x and y,

  x^2 + y^2 = (x + y)^2 − 2xy = u_1^2 − 2u_2,
  x^3 + y^3 = (x + y)^3 − 3xy(x + y) = u_1(u_1^2 − 3u_2) = u_1^3 − 3u_1 u_2.

Even if a polynomial q(x_1, ⋯, x_n) is not symmetric, one can at least ask for the set of all those permutations of the indices which leave the polynomial unchanged. It is clear that this set is a group; it is called the group of the polynomial.
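The definition of parity by the sign of P is directly computable: evaluating P before and after permuting subscripts amounts to counting the factors (x_i − x_j) that change sign. The Python sketch below is our own illustration of this, together with a check of the multiplication rules (12).

```python
from itertools import permutations

# Parity via the polynomial P = prod_{i<j} (x_i - x_j): a permutation is
# even or odd according to the number of factors whose sign it reverses.

def sign(phi):
    """phi is a tuple: phi[i-1] is the image of i.  Returns +1 or -1."""
    n, s = len(phi), 1
    for i in range(n):
        for j in range(i + 1, n):
            if phi[i] > phi[j]:     # the factor (x_i - x_j) changed sign
                s = -s
    return s

# a transposition is odd:
assert sign((2, 1, 3)) == -1
# the even permutations of 3 letters form A3, of order 3!/2 = 3:
assert sum(1 for p in permutations((1, 2, 3)) if sign(p) == 1) == 3
# sign is multiplicative, restating rules (12):
for p in permutations((1, 2, 3)):
    for q in permutations((1, 2, 3)):
        pq = tuple(q[p[i] - 1] for i in range(3))
        assert sign(pq) == sign(p) * sign(q)
```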
Exercises

1. (a) Construct sample even and odd permutations of order 14 on 11 letters. (b) Prove that every permutation of order 10 on 8 letters is odd.
2. For which positive integers n is a cycle of length n even? odd?
3. (a) Show that a product of not necessarily disjoint cycles is odd if and only if it contains an odd number of cycles of even length. (b) Are the permutations (123)(246)(5432) and (12)(345)(67)(891) odd or even?
4. List the odd permutations (a) of three letters; (b) of four letters.
5. Represent each of the following polynomials in terms of the elementary symmetric polynomials: [display formulas missing in this copy].
6. Show that a permutation is even if and only if it can be written as a product of an even number of transpositions (Ex. 10, §6.9).
7. Find the group of each of the following polynomials: [display formulas missing in this copy].
8. Show that every even permutation can be written as a product of cycles of length three.

6.11. Homomorphisms

A single-valued transformation from a group G to a group G' may preserve multiplication without being one-one (i.e., without being an isomorphism). Thus, consider the correspondence between the symmetric group of degree n and the multiplicative group of the numbers ±1, which carries even permutations into +1 and odd ones into −1. By (12), it carries products into products, but the correspondence is many-one. Or consider the correspondence n → i^n, where i = √−1, between the additive group of the integers and the multiplicative group of the fourth roots of unity. Again the group operation is preserved: i^{m+n} = i^m i^n. These and other examples lead to the following concept.

Definition. A homomorphism of a group G to a group G' is a single-valued transformation x → x' mapping G into G', such that (xy)' = x'y' for all x, y in G.

Theorem 20. Under any homomorphism G → G', the identity e of G goes into the identity e' of G', and inverses go into inverses.

Proof. Since e^2 = e, the image f of e satisfies f^2 = f = fe'; hence, by cancellation, f = e'. Again, if a goes into a', then aa^-1 = e must go into a'(a^-1)' = e'; since G' has but one inverse for a', this makes (a^-1)' the inverse of a'. Q.E.D.

Corollary 1. Any homomorphic image† of a cyclic group is cyclic.

† Homomorphisms onto are sometimes called epimorphisms, and correspondingly homomorphic images are called epimorphic images.

Proof. By Theorem 20 and hypothesis, (a^m)' = (a')^m whether m is positive, zero, or negative. Hence, if the powers a^m exhaust G, the powers (a')^m = (a^m)' of a' exhaust G'.

Corollary 2. The set N of all elements of G mapped on the identity e' of G', under a homomorphism of G to G', is a subgroup of G.

Proof. Since e → e', N is nonvoid. Moreover, a → e' and b → e' imply ab → e'e' = e' and a^-1 → (a')^-1 = e'^-1 = e'; hence N is a subgroup. This set N is called the kernel of the homomorphism.

Direct products. Any two groups G and H have a direct product G × H. The elements of G × H are the ordered pairs (g, h) with g ∈ G, h ∈ H; multiplication in G × H is defined by the formula

(15a)  (g, h)(g', h') = (gg', hh').

Evidently, this multiplication is associative, (e, e) acts as an identity in G × H, and (g^-1, h^-1) is an inverse of (g, h); hence G × H is a group. Moreover, the function α(g, h) = g defines a homomorphism α from G × H onto G, and the function β(g, h) = h is a homomorphism from G × H onto H.

It can be proved that every Abelian group of finite order is isomorphic to a direct product of cyclic groups of prime-power orders. We content ourselves here with the following, much weaker result.

Theorem 21. If m and n are relatively prime, then the direct product of cyclic groups of orders m and n is itself a cyclic group, of order mn.

Proof. Let a and b generate the cyclic groups A and B, of orders m and n, respectively. Then, in C = A × B, (a, b)^k = (a^k, b^k) is the identity (e, e) if and only if k ≡ 0 (mod m) and k ≡ 0 (mod n). Since m and n are relatively prime, this implies k ≡ 0 (mod mn) (Theorem 17, Chap. 1). Hence (a, b) = c is of order mn in C, which contains only mn elements, and is therefore cyclic. Q.E.D.
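The two central notions of this section, the kernel of a homomorphism and the direct product, can both be checked numerically. The sketch below is our own (the helper names and the additive encoding of Zm × Zn are not from the text): it verifies that the kernel of n → i^n is the set of multiples of 4, and that Z2 × Z3 is cyclic while Z2 × Z2 (the four group) is not.

```python
# First: the homomorphism n -> i^n from the additive integers onto the
# fourth roots of unity, and its kernel 4Z.  Second: Theorem 21 on small
# direct products, written additively.

def image(n):
    return 1j ** n                  # i^n as a complex number

# the operation is preserved: i^(m+n) = i^m * i^n
assert all(abs(image(m + n) - image(m) * image(n)) < 1e-9
           for m in range(-8, 8) for n in range(-8, 8))
# the kernel is exactly the multiples of 4
assert all((abs(image(n) - 1) < 1e-9) == (n % 4 == 0) for n in range(-8, 8))

def element_order(g, op, e):
    k, x = 1, g
    while x != e:
        x, k = op(x, g), k + 1
    return k

def prod_op(m, n):
    """Componentwise addition in Zm x Zn."""
    return lambda a, b: ((a[0] + b[0]) % m, (a[1] + b[1]) % n)

assert element_order((1, 1), prod_op(2, 3), (0, 0)) == 6   # gcd(2,3)=1: cyclic
assert element_order((1, 1), prod_op(2, 2), (0, 0)) == 2   # four group: not
```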
Exercises

1. Which of the following correspondences map the multiplicative group of all nonzero real numbers homomorphically into itself? If the correspondence is a homomorphism, find the kernel.
  (a) x → |x|, (b) x → 2x, (c) x → x^2, (d) x → 1/x, (e) x → −x, (f) x → x^3, (g) x → −1/x, (h) x → √x.
2. In the homomorphism n → i^n, where i = √−1 and n ∈ Z, what is its image G', and what is the kernel?
3. Is the correspondence mapping each x on the complex number e^{2πix} a homomorphism of the additive group of real numbers x? If so, identify the homomorphic image G' and the kernel.
4. Show that a cyclic group of order 8 has as homomorphic images (a) a cyclic group of order 4, (b) a cyclic group of order 2.
5. Show that the four group is the direct product of two cyclic groups of order 2.
6. If G is homomorphic to G' and G' to G'', prove that G is homomorphic to G''.
7. Prove that for any groups G, H, K: G × H is isomorphic to H × G, and G × (H × K) is isomorphic to (G × H) × K.
8. Show that the multiplicative group of all nonzero complex numbers is the direct product of the group of rotations of the unit circle and the group of all real numbers under addition. (Hint: Let z = re^{iθ}.)
9. In a square let the two diagonals be d and d', the axes h and v. Show that there is a homomorphism φ → φ* in which each motion φ in the group of the square induces a permutation φ* on d, d', h, v. Exhibit the correspondence φ → φ* in detail. What is the kernel?
*10. If G is a group of permutations of the n letters 1, 2, ⋯, n in which each permutation φ of G carries the subset of letters 1, ⋯, k into itself, show that G is homomorphic onto the group G' of the permutations φ* induced on 1, ⋯, k, and find the kernel.

6.12. Automorphisms; Conjugate Elements

Definition. An isomorphism of a group G with itself is called an automorphism of G.

Thus an automorphism α of G is a one-one transformation of G onto itself (a bijection of G) such that

(16)  (xy)α = (xα)(yα)  for all x, y in G.

Theorem 22. The automorphisms of any group G themselves form a group A.

Proof. It is obvious that the identity transformation is an automorphism, and that so is the product of any two automorphisms. Finally, if x → xα is an automorphism, then by (16)

  (xy)α^-1 = [(xα^-1 α)(yα^-1 α)]α^-1 = ([(xα^-1)(yα^-1)]α)α^-1 = (xα^-1)(yα^-1),

so that α^-1 is an automorphism. Q.E.D.

One can fruitfully regard an "automorphism" of an abstract algebra A as just a symmetry of A. (Cf. Theorem 6.)

Theorem 23. For any fixed element a of the group G, the conjugation T_a: x → a^-1 xa is an automorphism of G.

Proof. T_a preserves products, since (a^-1 xa)(a^-1 ya) = a^-1 (xy)a for all x, y. Moreover, T_a is a bijection, since (a^-1)^-1 (a^-1 xa)(a^-1) = x shows that T_{a^-1} inverts it. Q.E.D.

In any group G, a^-1 xa is called the conjugate of x under "conjugation" by a. Automorphisms T_a of the form x → a^-1 xa are called inner automorphisms; all other automorphisms are outer. For example, the cyclic group of order three has no inner automorphisms except the identity, but it has the "outer" automorphism x ↔ x^2. It may be checked that the group of symmetries of the square has four distinct inner automorphisms; it also has four outer automorphisms.

In Theorem 18 we have already seen that the conjugate of any cycle in a permutation group is another cycle of the same length. A similar interpretation applies to any group of transformations, and indeed to abstract algebras in general. Thus, if α and ψ are one-one transformations of a space S onto itself, then ψ' = α^-1 ψα is related to ψ much as in Theorem 18. Specifically, any point q in S can be written as q = pα for some p ∈ S, and

  (pα)ψ' = pα(α^-1 ψα) = (pαα^-1)ψα = (pψ)α.

Thus ψ' is the transformation pα → (pψ)α; in other words, the conjugate ψ' = α^-1 ψα is obtained from ψ by replacing each point p and its image r = pψ by pα and rα, respectively. For example, in the group of the square, reflection in the vertical axis is conjugate under R to reflection in the horizontal axis, because R carries the horizontal axis into the vertical axis: V = R^-1 HR.
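Both halves of the last example are easy to verify by machine. The following Python sketch is our own (writing the cyclic group of order three additively as Z3, so that the text's x ↔ x^2 becomes the doubling map x → 2x): it checks that doubling is an automorphism of Z3 and that every conjugation in this Abelian group is the identity, so the automorphism is outer.

```python
# The cyclic group of order 3, written additively as Z3 (our encoding).
# The map x -> 2x is the automorphism x <-> x^2 of the text, and every
# conjugation in an Abelian group is the identity map.

n = 3
Zn = range(n)

def is_automorphism(f):
    one_one = len({f(x) for x in Zn}) == n
    preserves = all(f((x + y) % n) == (f(x) + f(y)) % n
                    for x in Zn for y in Zn)
    return one_one and preserves

double = lambda x: (2 * x) % n
assert is_automorphism(double)
assert any(double(x) != x for x in Zn)       # it is not the identity

# conjugation T_a in an Abelian group fixes every element:
for a in Zn:
    assert all((-a + x + a) % n == x for x in Zn)
```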
Since b-l(a-lxa)b=(abrlx(ab). the elements not in S form the right and left coset of S. The definition then states that S is normal if and only if the set a-I Sa equals S for every a in G. since (a-1)-1(a-1xa)(a. hence the set Sa of sa (s E S) is the same as the set (aSa -l)a of (asa -1) a = as (s E S). The ·group of translations of the plane is also a normal subgroup of the Euclidean group of all rigid motions of the plane (d.l = (a-l)-l = S for all s.l ) = x. Thus. similarly. Theorem 25. aSa. Hence the alternating group is a normal subgroup of the symmetric group of degree n. It is a subgroup. In general. A subgroup S is normal if and only if all its right cosets are left cosets.8) is eS = S. Conjugate Elements Theorem 24. If S is normal. Chap. Again. A subgroup S of a group G is normal (in G) if and only if it is invariant under all inner automorphisms of G (i. the group of rotations of the square is a normal subgroup of the group of all its symmetries. Conversely. 9). if the right coset Sa is a left coset bS.I ). Sa = as for all a. R2]. A normal subgroup is sometimes. then a-lSa = a-lbS contains e = a-lea and so (Lemma 2. called a "self-conjugate" or an "invariant" subgroup.. Proof. Proof. § 6. the inverse of the conjugation Ta is T(a.xa = a-I ax = x for all a. if a EN and bEG. Consider the correspondence between elements a of a group G and the inner automorphisms Ta which they induce. every subgroup of an Abelian group is normal since a . 14. Exercises 1.6. Prove that. sets of conjugate elements. the number of different isomorphisms between G and H is the number of automorphisms of G. it is a homomorphism. Prove that an element a of a group induces the inner automorphism identity if and only if it is in the center. and A its group of automorphisms. Enumerate the inner automorphisms. Prove that if G and H are isomorphic groups. 12. Show that G either is cyclic or contains an element of order p (or q).) 8. Let G be a group of order pq (p. Which are inner? 
3.all rational numbers. g') = (aa'. and normal subgroups of the group of the square. (a) Find an automorphism a of the group of the square such that Ra = R and Ha = D. *7. (a) Show that the holomorph of the cyclic group of order three is the symmetric group of degree three. then so is their intersection. Prove that the inner automorphisms of any group G are a normal subgroup of the group of all automorphisms of G. List all the automorphisms of the four group. (Hint: Represent the group of the square by the generators Rand H discussed in § 6. *15. Show that the automorphisms of the cyclic group of order m are the correspondences a k >-+ a. How many automorphisms has a cyclic group of order p? of order pq1 (p. YEN) is a normal s~bgroup of G. 13. the set MN of all products xy (x E M. q distinct primes). (b) Show that the holomorph of the cyclic group of order four is the group of the square. In the second case. Prove that if M and N are normal subgroups of a group G. Ta Tb = Tab: it preserves multiplication. it is usually not one-one (R 2 and I induce the same inner automorphism). *16. where r is a unit of the ring Zm.160 Ch. 9. g)(a'. One easily verifies that the kernel of this homomorphism is precisely the center of G. under the hypotheses of Ex. 5. 12. In the latter case. the relation "x is conjugate to y" is an equivalence relation. (a) Show that for every rational c ~ 0. 6. Prove that in any group. as in the case of the group of symmetries of the square. Show that the couples (a. the correspondence x >-+ xc is an automorphism of the additive group of . 2.k. 4. (b) Show that a is an outer automorphism. g) with a E A and g E G form a group (the "holomorph" of G) under the multiplication (a. Find all the automorphisms of the cyclic group of order eight. (b) Show that this group has no other automorphisms. (ga')g').6 Groups proof of Theorem 24. q primes). *11. Yet. Let G be any group. *10. prove G contains either 1 normal or q conjugate subgroups of order p. . bEN). 
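As a small worked check of Theorem 22 in the group of the square (assuming the usual presentation in which R is the 90° rotation and H, V the horizontal and vertical reflections, with relations R⁴ = H² = I and HR = R³H; the labeling V = R²H is an assumption of this sketch), conjugation by R carries one reflection into another:

```latex
T_R(H) \;=\; R^{-1}HR \;=\; R^{3}(HR) \;=\; R^{3}(R^{3}H) \;=\; R^{6}H \;=\; R^{2}H \;=\; V .
```

Thus the inner automorphism T_R permutes the reflections among themselves, as any automorphism must, since it carries each element to a conjugate of the same order.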
*17. Under the hypotheses of Ex. 12, show that in the latter case the pq − q(p − 1) = q elements not of order p form a normal subgroup. Infer that G always has a proper normal subgroup. Using this analysis, find all groups of order 6 and all groups of order 15.
*18. (a) Show that the defining relations a^m = b^n = e, b⁻¹ab = a^k define a group of order mn with a normal subgroup of order m, if k^n ≡ 1 (mod m). (b) Show that there are only two nonisomorphic groups of any given prime-square order.
*19. Using the analysis of Ex. 17, find all possible groups of orders (a) 10 and (b) 14.

6.13. Quotient Groups

Now we shall show how to construct isomorphic replicas of all the homomorphic images G′ of a specified abstract group G. Indeed, let x → x′ be any homomorphism of G onto a group G′, and let N be the kernel of this homomorphism.

Lemma 1. Two elements of G have the same image in G′ if and only if they are in the same coset Nx = xN of the kernel N.

Proof. If b = at (t ∈ N), then b′ = a′t′ = a′e′ = a′. Conversely, if b′ = a′, we can write b = at, so that b′ = a′t′. But by the cancellation law, a′t′ = a′ if and only if t′ = e′ — that is, if and only if t ∈ N.

This establishes a one-one correspondence between the elements of G′ and the cosets of N in G. Hence the order of G′ is the number of cosets (or "index") of N in G.

Lemma 2. Let x′ and y′ be elements of G′, and let Nx and Ny correspond to x′ and y′, respectively. Then x′y′ corresponds to the (unique) coset of N containing the set NxNy of all products uv (u ∈ Nx, v ∈ Ny).

Proof. If u = ax, v = by (a, b ∈ N), then (uv)′ = a′x′b′y′ = e′x′e′y′ = x′y′. Since distinct cosets are nonoverlapping and the set NxNy is nonvoid, there cannot be two different cosets each containing NxNy.

Thus G′ is determined to within isomorphism by G and N: it is isomorphic with the system of cosets of N in G, multiplied by the rule that the "product" Nx ∘ Ny of two cosets is the (unique) coset containing all products uv (u ∈ Nx, v ∈ Ny).

Conversely, let there be given any normal subgroup N of G, not associated a priori with any homomorphism. One can construct from N a homomorphic image G′ of G as follows.

Definition. The elements of G′ are defined as the different cosets Nx of N. The product [Nx] ∘ [Ny] of any two cosets Nx and Ny of N is defined as the coset (if any) containing the set NxNy of all products uv (u ∈ Nx, v ∈ Ny). The group of cosets of N is called the quotient-group (or factor group) of G by N and is denoted by† G/N.

† If G is an Abelian group in which the binary operation is denoted by "+," then every subgroup N is normal in G, and the quotient-group, written G − N, is often called the difference group.

If u = ax and v = by for a, b in N, then uv = axby = ab′xy, where b′ = xbx⁻¹ is also in N because N is normal. Therefore N(xy) is a coset containing NxNy; since distinct cosets are nonoverlapping and NxNy is nonvoid, there cannot be two different cosets each containing NxNy. We have thus defined a single-valued binary operation on the elements of G′ (alias cosets of G), which may be written as

(17) [Nx] ∘ [Ny] = N(xy).

In words: the product of any two cosets is found by multiplying in G any pair of "representatives" x and y, and forming the coset containing the product xy.

Both ([Nx] ∘ [Ny]) ∘ [Nz] and [Nx] ∘ ([Ny] ∘ [Nz]) contain (xy)z = x(yz), so the multiplication of cosets is associative. The product [Ne] ∘ [Ny] = N(ey) = Ny, so the coset N = Ne is a left identity for the system of cosets. Finally, the coset [Nx⁻¹] ∘ [Nx] contains x⁻¹x = e, so must be Ne = N; left-inverses of cosets exist. These results, with Theorem 4, prove the following

Lemma 3. The cosets of any normal subgroup N of G form a group under multiplication. The correspondence x → Nx is, therefore, a homomorphism of G onto G/N, and the kernel of this homomorphism is N.

We can illustrate the preceding discussion by considering the homomorphism between the group G of the symmetries of the square and the "four group" G′: [e, a, b, c] (§6.8), under which [I, R²] → e, [R, R³] → a, [H, V] → b, [D, D′] → c. (Check from the group table that this is a homomorphism!) The antecedents of e form the normal subgroup [I, R²], and those of the other elements are the cosets of [I, R²]. For example, by (17), we can derive the sample rule ab = c by computing the products [RH, RV, R³H, R³V] — which lie in (in fact, form) the coset [D, D′] of antecedents of c.

We conclude

Theorem 27. The homomorphic images of a given abstract group G are the quotient-groups G/N by its different normal subgroups, multiplication of cosets of N being defined by (17).

For, we have already seen (following Lemma 2) that for any homomorphism of G onto a group G′ in which the kernel is N, the image G′ is isomorphic with the quotient-group G/N; conversely, by Lemma 3, every quotient-group G/N is a homomorphic image of G.

Remark. The preceding "construction" of quotient-groups from groups and normal subgroups is analogous to the construction of the ring of integers "mod n" from the integral domain of all integers (§§1.9–1.10). The cosets of N are the analogues of the residue classes mod n, and the relation x ≡ y (mod n) can be paralleled by defining x ≡ y (mod N) as the relation xy⁻¹ ∈ N — which is equivalent to the assertion that x and y are in the same coset of N (see Ex. 6 of §6.8).

Exercises

1. List all abstract groups which are homomorphic images of the group of symmetries of a square. Do the same for the group of a regular hexagon.
2. Prove that the cosets of a nonnormal subgroup do not form a group under the multiplication (17).
3. If G is the group of all rational numbers of the form 2^k 3^m 5^n, with integral exponents k, m, and n, while S is the multiplicative subgroup of all numbers 2^k, describe (a) the cosets of S, (b) G/S.
4. Show that x ≡ y (mod S) implies ax ≡ ay (mod S) for all a if and only if S is a normal subgroup.
5. Prove that the center Z of any group G is a normal subgroup of G, and that G/Z is isomorphic with the group of inner automorphisms of G.
6. In a group G, elements of the form x⁻¹y⁻¹xy are called commutators. (a) Prove that the set C of all products of such commutators forms a normal subgroup of G. (b) Prove that G/C is Abelian. (c) Prove that if N is a normal subgroup of G and G/N is Abelian, then N contains C.
7. Let G → G′ be a homomorphism. Show that the set of all antecedents of any subgroup S′ of G′ is a subgroup S of G, and that if S′ is normal, then so is S.
*8. If S is a subgroup and N a normal subgroup of a group G, such that S ∩ N = e and S ∪ N = G, prove that G/N is isomorphic to S.
*9. (a) Show that if M and N are normal subgroups of G with M ∩ N = 1, then ab = ba for all a ∈ M, b ∈ N. (Hint: Show that aba⁻¹b⁻¹ ∈ M ∩ N.) *(b) Show that, in (a), if M ∪ N = G, then G = M × N.
*10. Two subgroups S and T of a group G are called conjugate if a⁻¹Sa = T for some a ∈ G. Prove that the intersection of any subgroup S of G with its conjugates is a normal subgroup of G.
*11. Let G be any group, S any subgroup of G. For each a ∈ G, let T_a be the permutation (Sx) → (Sxa) on the right cosets Sx of S. Prove: (a) The correspondence a → T_a is a homomorphism. (b) The kernel is the normal subgroup of Ex. 10.

6.14. Equivalence and Congruence Relations

In defining the relation a ≡ b (mod n) between integers, in setting up the rational numbers in terms of a congruence of number pairs (a, b) ≡ (a′, b′) — which was defined to mean that ab′ = a′b — and elsewhere, we have asserted that any reflexive, symmetric, and transitive relation might be regarded as a kind of equality. We shall now formulate the significance of this assertion.

A relation R which has the reflexive (aRa), symmetric (aRb implies bRa), and transitive (aRb and bRc imply aRc) properties, for all the members a, b, c of a set S, will be called an equivalence relation on S.

For convenience, we may denote by R(a) the set of all elements b equivalent to a: b ∈ R(a) if and only if bRa. These R-subsets have various simple properties. In the particular case when R is the relation of congruence modulo n between integers, the class R(a) determined by an integer a is simply the residue class containing a, mod n (cf. §1.9). Other illustrations are given as exercises.

Lemma 1. aRb implies R(a) = R(b), and conversely.

Proof. Suppose first that aRb, and let c be any element of R(a). Then by definition cRa,
hence by the transitive law cRb, which means that c ∈ R(b). Thus c ∈ R(a) implies c ∈ R(b); similarly (since the symmetric law gives bRa), c ∈ R(b) implies c ∈ R(a). Hence the two classes R(a) and R(b) have the same members and so are equal. Suppose now that R(a) = R(b). By the reflexive law, bRb, so that b ∈ R(b); since R(a) = R(b), this implies b ∈ R(a), and so aRb. This completes the proof.

Lemma 1 here specializes to the assertion that a ≡ b (mod n) if and only if a and b lie in the same residue class.

Again, the residue classes mod n divide the whole set Z of integers into nonoverlapping subclasses, and hence may be said to form a "partition" of Z. In general, a partition π of a class S is any collection of subclasses A, B, C, … of S such that each element of S belongs to one and only one of the subclasses (subsets) of the collection. The R-subsets always provide such a partition:

Lemma 2. Two R-subsets are either identical or have no elements in common, and the collection of all R-subsets is a partition of S.

Proof. If R(a) and R(b) contain an element c in common, then cRa and cRb; by the symmetric and transitive laws, aRb. By Lemma 1 this implies R(a) = R(b). Consequently, if R(a) ≠ R(b), the two classes cannot overlap. Finally, by the reflexive law, cRc, so c ∈ R(c): every element c of the set S is in the particular R-subset R(c).

The converse of Lemmas 1 and 2 is immediate. If a set S is divided by a partition π into nonoverlapping subclasses A, B, C, …, then a relation aRb may be defined to mean that a and b lie in one and the same subclass of this partition, and this does give an abstract equivalence relation R on S. Moreover, the R-subset R(a) determined by each element a for this relation is exactly that subclass of the partition π which contains a. Consequently, each partition of S yields an equivalence relation R, and conversely. These conclusions may be summarized as follows:

Theorem 28. Every equivalence relation R on a set S determines a partition π of S into nonoverlapping R-classes, such that elements a and b of S lie in the same subclass of the partition π if and only if aRb; and conversely, every partition of S determines such an equivalence relation. There is thus a one-one correspondence R ↔ π between the equivalence relations R on S and the partitions π of S.

In discussing the requisites for an admissible equality relation (§1.11), we also demanded a certain "substitution property" relative to binary operations. In terms of the equivalence relation R and the binary operation a ∘ b = c on the set S, this property takes the form

(18) aRa′ and bRb′ imply (a ∘ b)R(a′ ∘ b′).

This condition also has a definite theoretical content. Just as with quotient-groups (or residue classes mod n), we are willing to treat suitable subsets of S as elements. Thus, let R be any equivalence relation on S, and let us regard the R-subsets as the elements of a new system Σ = S/R. Just as with cosets (§6.13), we may try to define a binary operation on Σ from that in S:

(19) A ∘ B = C means that a ∈ A and b ∈ B imply (a ∘ b) ∈ C in S.

Property (18) asserts that if a and a′ are both in an R-subset A (i.e., if aRa′) and if b and b′ are in an R-subset B, then (a ∘ b) and (a′ ∘ b′) both lie in the same R-subset. This resulting R-subset C is thus uniquely determined by A and B and is the "product" A ∘ B in the sense of (19). In other words, the substitution property (18) is equivalent to the assertion that definition (19) yields a (single-valued) binary operation on R-subsets (i.e., on Σ). This proves:

Theorem 29. Given an equivalence relation R on a set S, any binary operation defined on S and having the substitution property (18) yields a (single-valued) binary operation on the R-subsets of S, as defined by (19).

For example, if R is the relation of congruence mod n on the set of integers, both addition and multiplication have the substitution property (18), and the theorem yields the addition and multiplication of residue classes in Z. More generally, Theorem 29 can be applied to the relation a ≡ b (mod C), where C is any ideal in any commutative ring. Relations satisfying the conditions of Theorem 29 may be called "congruence relations."

In general, the concepts of isomorphism, automorphism, and homomorphism can be applied to general algebraic systems, and can even be extended to algebraic systems with operations which need not be binary. Thus if G and H are algebraic systems with a ternary operation (a, b, c), a homomorphism of G onto H is a map θ of G on H with the property that (a, b, c)θ = (aθ, bθ, cθ) for all a, b, c in G.

Exercises

1. Which of the following relations R are equivalence relations? In case they are, describe the R-subsets. (a) G is a group, S a subgroup, and aRb means a⁻¹b ∈ S. (b) G, S as in (a); aRb means ba⁻¹ ∈ S. (c) Z is the domain of integers, and aRb means that a − b is odd. (d) Z as in (c); aRb means that a − b is even. (e) Z as in (c); aRb means that a − b is a prime.
2. Let G be a group of permutations of the letters x₁, …, xₙ; let x_iRx_j mean that x_iφ = x_j for some φ ∈ G. Is R an equivalence relation? How does G act on each R-subset?
3. Let u: S² → S be a binary operation, written a ∘ b. Show that if aRa′ implies (a ∘ b)R(a′ ∘ b) and (b ∘ a)R(b ∘ a′) for all b, then (18) holds.
4. With real numbers a and b, let aRb mean that a − b is an integral multiple of 360. (a) Is R an equivalence relation? (b) Is it a congruence relation for addition? (c) Is it one for multiplication? (d) What does this imply regarding addition and multiplication of angles?
5. In Ex. 1(a), show that half the substitution rule (18) holds for any S, and that the other half holds if and only if S is normal.
6. (a) Let C be any ideal in a commutative ring. Show that the relation (a − b) ∈ C is a congruence relation for addition and multiplication. (b) Prove that if R is any congruence relation on a commutative ring, the R-subsets form another commutative ring if addition and multiplication are defined by (19).
7. Let G consist of the transformations (x, y) → (x + a, y) of the points (x, y) of the plane, and let (x, y)R(x′, y′) mean that (x, y)φ = (x′, y′) for some φ ∈ G. Is R an equivalence relation? What are the R-subsets in this case?

7
Vectors and Vector Spaces

7.1. Vectors in a Plane

In physics there arise quantities called vectors which are not merely numbers, but which have direction as well as magnitude. Thus a parallel displacement in the plane depends for its effect not only on the distance but also on the direction of displacement. A displacement may conveniently be represented by an arrow α† of the proper length and direction (Figure 1). The combined effect of two such displacements α and β, executed one after another, is a third "total" displacement γ. If β is applied after α by placing the origin of the arrow β at the terminus of α, then the combined displacement γ = α + β is the arrow leading from the origin of α to the terminus of β (Figure 1). This γ is the diagonal of the parallelogram with sides α and β; this rule for finding α + β is the so-called parallelogram law for the addition of vectors. Forces acting on a point in a plane, velocities, and accelerations have similar representations by means of vectors — and in all cases the parallelogram law of vector addition holds. This illustrates the general principle that various physical situations may have the same mathematical representation.

† We shall systematically use small Greek letters such as α, β, γ, ξ, η, … to denote vectors and small Latin letters to denote scalars.

In general, a vector α may be multiplied by any real number c to form a new displacement c·α. A displacement α may be tripled to give a new displacement 3α, or halved to give a displacement ½α. One may even form a negative multiple such as −2α, representing a displacement twice as large as α in the direction opposite to α. If c is positive, cα has the direction of α and a magnitude c times as large; while if c is negative, the direction must be reversed. The numbers c are called scalars, and the product cα a "scalar" product.

Analytical geometry suggests representing vectors in a plane by pairs of real numbers. We may represent any such vector by an arrow α with origin at (0, 0) and terminus at a suitable point (a₁, a₂), where the coordinates a₁, a₂ are real numbers. Then vector sums and scalar products may be computed coordinate by coordinate, using the rules

(1) (a₁, a₂) + (b₁, b₂) = (a₁ + b₁, a₂ + b₂),
(2) c(a₁, a₂) = (ca₁, ca₂).

From these rules we easily get the various laws of vector algebra, such as

(3) α + β = β + α, α + (β + γ) = (α + β) + γ,
(4) c(α + β) = cα + cβ (distributive law).

Many of these (notably the commutative law of vector addition) also correspond to geometrical principles. A complete list of postulates for vector algebra will be given in §7.3.

Vector operations may be used to express many familiar geometric ideas. For example, the midpoint of the line joining the terminus of the vector α = (a₁, a₂) to that of β = (b₁, b₂) is given by the formulas ((a₁ + b₁)/2, (a₂ + b₂)/2), hence by the vector sum ½(α + β). The resulting vector is also known as the center of gravity of α and β.

Exercises

1. Show that the vectors in the plane form a group under addition.
2. Prove the laws (3) and (4) of vector algebra, using the rules (1) and (2).
3. Illustrate the distributive law (4) by a diagram.
4. Show that every vector α in the plane can be represented uniquely as a sum α = β + γ, where β is a vector along the x-axis and γ a vector along the y-axis.
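As a worked illustration of Exercise 2, the distributive law (4) can be verified directly from the coordinate rules; each step below uses only rule (1), rule (2), or the distributive law for real numbers:

```latex
c(\alpha + \beta) \;=\; c\,(a_1 + b_1,\; a_2 + b_2)
  \;=\; \bigl(c(a_1 + b_1),\; c(a_2 + b_2)\bigr)
  \;=\; (ca_1 + cb_1,\; ca_2 + cb_2)
  \;=\; (ca_1,\, ca_2) + (cb_1,\, cb_2)
  \;=\; c\alpha + c\beta .
```

The laws (3) are checked in the same coordinatewise manner.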
7.2. Generalizations

The example just described can be generalized in two ways. First, the number of dimensions (which was two in §7.1) can be arbitrary. The first hint of this is seen in the possibility of treating forces and displacements in space in the same way that plane displacements and forces were treated in §7.1; the only difference is that in the case of space, vectors have three components (x₁, x₂, x₃) instead of two. Again, it is shown in the theory of statics that the forces acting on a rigid solid can be resolved into six components: three pulling the center of gravity in perpendicular directions, and three of torque causing rotation about perpendicular axes. The sum of two forces may again be computed component by component, while multiplication by scalars (real numbers) has the same significance as before.

A second line of generalization begins with the observation that, so far as algebraic properties are concerned, the components of vectors and the scalars need not be real numbers, but can be elements of any field. Thus, vectors with complex components are constantly used in the theory of electric circuits and in electromagnetism, while we shall in Chap. 14 base the theory of algebraic numbers on the study of vectors with rational scalars.

The generalizations described in the last two paragraphs can be combined into a single formulation, valid for any positive integer n (the dimension) and any field F of scalars.

EXAMPLE. The vector space Fⁿ has as elements all n-tuples α = (a₁, …, aₙ), β = (b₁, …, bₙ), …, with components aᵢ and bᵢ in F. Addition and scalar multiplication in Fⁿ are defined as follows:

(5) (a₁, …, aₙ) + (b₁, …, bₙ) = (a₁ + b₁, …, aₙ + bₙ),
(6) c(a₁, …, aₙ) = (ca₁, …, caₙ).

Thus, for any positive whole number n, the n-tuples α = (a₁, …, aₙ) of real numbers form an n-dimensional vector space which may be regarded as an n-dimensional geometry. In the vector space V = Fⁿ, straight lines are the sets of elements of the form α + tβ (α, β fixed, β ≠ 0, t variable); the center of gravity of α₁, …, αₘ is (1/m)(α₁ + ⋯ + αₘ); and so on (this will be developed in §9.13). To get a complete geometrical theory, one need only introduce distance as in §7.10.

Theorem 1. In the vector space Fⁿ, vector addition and scalar multiplication have the following properties:

(7) V is an Abelian group under addition;
(8) c(α + β) = cα + cβ, (c + c′)α = cα + c′α (distributive laws);
(9) (cc′)α = c(c′α), 1·α = α.

(Rules (8) and (9) are to hold for all vectors α and β and all scalars c and c′.)

Exercises

1. Let α = (1, 1, 0), β = (0, 1, 2), γ = (1, 2, 4). Compute: (a) α + 2β + 3γ; (b) 3(α + β) − 2(β + γ). (c) What is the center of gravity of α, β, γ? (d) Solve 6β + 5ξ = α.
2. Let α = (1, i, 2i), β = (−1/2, 1/4, 2/3), γ = (0, 1, i). Compute: (a) 2α − iβ; (b) iα + (1 + i)β − iγ. (c) Solve α − iξ = β.
3. Divide the line segment αβ in the ratio 2:1 in Exs. 1 and 2.
*4. In Ex. 2, can you "divide the line segment αβ in the ratio 1:2i"? Explain.
5. Let Z₃ⁿ consist of vectors with n components in the field of integers mod 3. (a) How many vectors are then in Z₃ⁿ? (b) What can you say about α + α + α in Z₃ⁿ?
*6. Can you define a "midpoint" between two arbitrary points in Z₃ⁿ? A center of gravity for three — for four — arbitrary points? (Hint: Try numerical examples.)

Proof (of Theorem 1). We first verify the postulates for a group. Vector addition is associative, since for any vectors α and β as defined above and any γ = (c₁, …, cₙ) we have

(α + β) + γ = (a₁ + b₁ + c₁, …, aₙ + bₙ + cₙ) = α + (β + γ),

because (aᵢ + bᵢ) + cᵢ = aᵢ + (bᵢ + cᵢ) for each i by the associative postulate for addition in a field (§6.1). The group is commutative because aᵢ + bᵢ = bᵢ + aᵢ for each i. The special vector 0 = (0, …, 0) acts as identity,
while −α = (−a₁, …, −aₙ) is inverse to α in the sense that α + (−α) = (−α) + α = 0. Likewise, the definitions (5) and (6) reduce each side of the distributive laws (8) to a corresponding distributive law for fields, which holds component by component; similarly for (9). Q.E.D.

Note that −α = (−1)α is also the product of the vector α by the scalar −1.

7.3. Vector Spaces and Subspaces

We now define the general notion of a vector space.

Definition. A vector space V over a field F is a set of elements, called vectors, such that any two vectors α and β of V determine a (unique) vector α + β as sum, and that any vector α ∈ V and any scalar c ∈ F determine a scalar product c·α in V, with the properties (7)–(9).

Theorem 1 essentially stated that, for any positive integer n and any field F, Fⁿ is a vector space.

There are also many infinite-dimensional vector spaces; they play a fundamental role in modern mathematical analysis. For example, let S denote the set of all functions f(x) of a real variable x which are single-valued and continuous on the interval 0 ≤ x ≤ 1. Two such functions f(x) and g(x) have as sum a function h(x) = f(x) + g(x) in S, and the "scalar" product of f(x) by a real constant c is also such a function cf(x). These functions cannot be represented by arrows, but their operations of addition and scalar multiplication have the same formal algebraic properties as in our other examples. Vectors in this set S may even be regarded as having one "component" (the value of the function!) at each point x of the interval 0 ≤ x ≤ 1.

More generally, consider the functions f whose domain is any set S whatever (say, any plane region), with the field F as codomain, so that f assigns to each x ∈ S a value f(x) ∈ F. The set of all such functions f forms a vector space over F, if the sum h = f + g and the scalar product h′ = c·f are the functions defined for each x ∈ S by the equations h(x) = f(x) + g(x) and h′(x) = c·f(x).

Conforming with our use of the additive notation for the group operation in a vector space, we shall denote by 0 the identity element of the group; it is the unique "null" or "zero" vector satisfying

(10) α + 0 = 0 + α = α for all α.

The null vector 0 is not to be confused with the zero scalar 0. However, the two are connected by an identity. Indeed, the two distributive laws give, for all c and α,

cα + 0α = (c + 0)α = cα = cα + 0, cα + c0 = c(α + 0) = cα = cα + 0.

Cancelling cα on both sides, we get the two laws

(11) 0α = 0 for all α, c0 = 0 for all c.

Again, 1·α + (−1)α = (1 + (−1))α = 0α = 0, hence

(12) the (additive) group inverse of any vector α is (−1)α.

Thus the scalar multiple (−1)α acts as the inverse of any given vector α in the group. It follows from (11) and (12) that the cyclic subgroup of the "powers" of any vector α consists of the multiples of α by the different integers n.

Definition. A subspace S of a vector space V is a subset of V which is itself a vector space with respect to the operations of addition and scalar multiplication in V.

The analogy with the earlier definitions of a subfield and a subgroup is obvious. A nonvoid subset S is a subspace if and only if the sum of any two vectors of S lies in S and any product of any vector of S by a scalar lies in S. This statement may be easily checked from the definition.

Geometrically, a "subspace" is simply a linear subspace (line, plane, etc.) through the origin O. In ordinary three-dimensional vector space R³, the set S of all vectors lying on a fixed line through the origin is closed under the operations of addition and multiplication by scalars. Similarly, the vectors which lie in a fixed plane through the origin form by themselves a "two-dimensional" vector space which is part of the whole space; hence this set is also a "subspace" of R³. Again, the null vector 0 alone is a subspace of any vector space. The set of polynomials of degree at most seven is a subspace of the vector space of all polynomials — whether the base field is real or not. Similarly, the set of all continuous functions f(x) defined for 0 ≤ x ≤ 1 is a subspace of the linear space of all functions defined on the same domain. Also, the vectors of the form (0, x₂, x₃, x₄) constitute a subspace of F⁴ for any field F.
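As a sample verification of this criterion (the particular equation is chosen only for illustration), the solutions of a single homogeneous linear equation form a subspace. In F³, let S consist of all vectors ξ = (x₁, x₂, x₃) with x₁ + x₂ + x₃ = 0; for ξ, η ∈ S,

```latex
\xi + \eta = (x_1 + y_1,\; x_2 + y_2,\; x_3 + y_3), \qquad
(x_1 + y_1) + (x_2 + y_2) + (x_3 + y_3) = 0 + 0 = 0,
```

and likewise cx₁ + cx₂ + cx₃ = c(x₁ + x₂ + x₃) = c·0 = 0. Hence S is closed under vector addition and under multiplication by scalars, and so is a subspace of F³.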
For given vectors α₁, …, αₘ in a vector space V, the set of all linear combinations c₁α₁ + ⋯ + cₘαₘ (each cᵢ a scalar) of the αᵢ is a subspace. This is because of the identities

(13) (c₁α₁ + ⋯ + cₘαₘ) + (c₁′α₁ + ⋯ + cₘ′αₘ) = (c₁ + c₁′)α₁ + ⋯ + (cₘ + cₘ′)αₘ,
(14) c′(c₁α₁ + ⋯ + cₘαₘ) = (c′c₁)α₁ + ⋯ + (c′cₘ)αₘ,

which hold, by the commutative, associative, and distributive laws, for all vectors αᵢ and for all scalars cᵢ and c′. This subspace is evidently the smallest subspace containing all the given vectors; hence it is called the subspace generated or spanned by them. The subspace spanned by a single vector α₁ ≠ 0 is the set S₁ of all scalar multiples cα₁; geometrically, this is simply the line through the origin and α₁. Similarly, the subspace spanned by two noncollinear vectors α₁ and α₂ turns out to be the plane passing through the origin, α₁, and α₂. This proves

Theorem 2. The set of all linear combinations of any set of vectors in a space V is a subspace of V.

The intersection of two given subspaces S and T is defined to be the set S ∩ T of all those vectors belonging both to S and to T (cf. Theorem 17 of §6.9, on the intersection of two subgroups).

Theorem 3. The intersection S ∩ T of any two subspaces of a vector space V is itself a subspace of V.

Proof. If α and β are two such vectors, their sum α + β must be in S (since S is a subspace containing α and β) and likewise in T; hence it is also in the intersection S ∩ T. Similarly, any scalar multiple cα of α is in S ∩ T. Q.E.D.

Again, any two subspaces S and T of a vector space V determine a set S + T, consisting of all sums α + β for α in S and β in T; it is called the linear sum or span of S and T, and is itself a subspace. It clearly contains S and T, and is contained in any other subspace R containing both S and T. These properties of S + T may be stated as

(15) S ⊂ S + T and T ⊂ S + T; S ⊂ R and T ⊂ R imply S + T ⊂ R,

where S ⊂ R means that the subspace S is contained in the subspace R. Hence the concept of linear sum is analogous to that of the join (cf. §6.8) of two subgroups.

Exercises

1. Construct an addition table for Z₂² and list its subspaces.
2. Construct Z₃² and tabulate its subspaces.
3. In Z₃², how many vectors are spanned by (1, 1) and (2, 2)? By (1, 1) and (1, 2)?
4. In Z₇³, compute 7(2(α − γ) + 5β) − 8α − 2(α − 3β) + 3(3β − 6γ).
5. In C³, compute (1 + 2i)[(2α − 3β) − i(α − 2β)].
6. Which of the following subsets of Qⁿ (n ≥ 2) constitute subspaces (here ξ denotes (x₁, …, xₙ))? (a) all ξ with x₁ an integer; (b) all ξ with x₂ = 0; (c) all ξ with either x₁ or x₂ zero; (d) all ξ such that 3x₁ + 4x₂ = 1; (e) all ξ such that 7x₁ − x₂ = 0.
7. Which of the following sets of real functions f(x) defined on 0 ≤ x ≤ 1 are subspaces of the vector space of all such functions? (a) all polynomials of degree four; (b) all polynomials of degree at most four (including f(x) = 0); (c) all functions f such that 2f(0) = f(1); (d) all functions such that f(1) = f(0) + 1; (e) all positive functions; (f) all functions satisfying f(x) = f(1 − x) for all x.
8. Which of the sets of functions described in Ex. 7 form vector spaces when the domain D is taken to be a field F?
9. Prove that the set of all solutions (x₁, …, xₙ) of a pair of homogeneous linear equations a₁x₁ + ⋯ + aₙxₙ = 0, b₁x₁ + ⋯ + bₙxₙ = 0 is a subspace of Fⁿ.
10. Let S be the subspace of Q³ consisting of all vectors of the form (0, x₂, x₃), and T the subspace spanned by (1, 1, 0) and (2, 0, 1). Which vectors are in S ∩ T? In S + T?
11. In Q³ show that the plane x₃ = 0 may be spanned by each of the following pairs of vectors: (1, 1, 0) and (2, 1, 0); (3, 1, 0) and (3, 2, 0); (4, 3, 0) and (−3, 2, 0).
12. If S is spanned by ξ₁ and ξ₂, and T by η₁, η₂, η₃, show that S + T is spanned by ξ₁, ξ₂, η₁, η₂, η₃. Generalize this result.
13. Prove that in any vector space, cα = 0 implies c = 0 or α = 0.
*14. Show that the postulate of commutativity for vector addition is redundant. (Hint: Expand (1 + 1)(α + β) in two ways.)
*15. Prove that the vector space postulate 1·α = α cannot be proved from the other postulates. (Hint: Construct in the plane a pseudo-scalar product c ⊗ α, the projection of c·α on a fixed line.)
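Before turning to dimension, a small worked instance of intersection and linear sum in Q³ (the two lines are chosen only for illustration):

```latex
S = \{\,c(1,0,0)\,\}, \quad T = \{\,d(0,1,0)\,\} \quad (c, d \in \mathbb{Q});
\qquad S \cap T = \{(0,0,0)\}, \qquad S + T = \{\,(c, d, 0)\,\}.
```

Thus S + T is the plane x₃ = 0, the smallest subspace containing both lines, while S ∩ T is the zero subspace — illustrating the two halves of (15).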
dependent if and only if some one of the vectors αₖ is a linear combination of the preceding ones. (Suppose that the vectors are linearly dependent, say d₁α₁ + ⋯ + dₘαₘ = 0; deleting a suitable αₖ, one can show that the remaining vectors generate the same subspace, and we obtain Corollary 2.)

7.4 Linear Independence and Dimension

The important geometric notion of the dimension of a vector space or subspace remains to be defined abstractly. It will be described as the minimum number of vectors spanning the space (or subspace). Ordinary space R³ can be spanned by the three vectors (1, 0, 0), (0, 1, 0), and (0, 0, 1) of unit length lying along the three coordinate axes, but by no set of two vectors (a set of two noncollinear vectors spans a plane through the origin). Hence its dimension is three. More generally, any Fⁿ is spanned by n unit vectors ε₁ = (1, 0, ⋯ , 0), ε₂ = (0, 1, 0, ⋯ , 0), ⋯ , εₙ = (0, ⋯ , 0, 1). Not only do ε₁, ⋯ , εₙ generate the whole of Fⁿ; any vector of Fⁿ is a linear combination of these, because (17). We shall prove in Corollary 2 of Theorem 5 that Fⁿ cannot be spanned by fewer than n vectors. This justifies calling Fⁿ an n-dimensional vector space over the field F. This means that the unit vectors are "linearly independent" in the following sense. Indeed,

(16)  x₁ε₁ + ⋯ + xₙεₙ = 0 if and only if (x₁, ⋯ , xₙ) = (0, ⋯ , 0).

Definition. The vectors α₁, ⋯ , αₘ are linearly independent (over F) if and only if, for all scalars cᵢ in F,

(18)  c₁α₁ + c₂α₂ + ⋯ + cₘαₘ = 0 implies c₁ = c₂ = ⋯ = cₘ = 0.

Vectors which are not linearly independent are called linearly dependent. However, the following relation of dependence to linear combinations is more important. We can now state the fundamental theorem on linear dependence.
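Over a finite field such as Z₂ there are only finitely many choices of scalars, so Definition (18) can be checked by brute force. A sketch (the dependent triple of vectors is my own illustrative choice):

```python
from itertools import product

def gf2_dependent(vectors):
    """Definition (18) verbatim over Z2: dependent iff some choice of scalars
    c_i, not all zero, makes c_1*a_1 + ... + c_m*a_m the zero vector."""
    n = len(vectors[0])
    for coeffs in product((0, 1), repeat=len(vectors)):
        if any(coeffs):
            combo = [sum(c * v[k] for c, v in zip(coeffs, vectors)) % 2
                     for k in range(n)]
            if all(x == 0 for x in combo):
                return True
    return False

units = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]   # the unit vectors of (Z2)^3
unit_dep = gf2_dependent(units)              # only the trivial combination vanishes
ring = [(1, 1, 0), (0, 1, 1), (1, 0, 1)]     # hypothetical: these sum to 0 mod 2
ring_dep = gf2_dependent(ring)
```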
We can express this linear dependence either by the relation f31 .0. f33)' This illustrates ° Corollary 1. 0) do not span the whole of ordinary space R3 because they all lie in one plane.3f33 = or (solving for f31) by f31 = 2f32 + 3f33' Thus. Let Ao = [at. with d l ~ 0. with at least one coefficient. Q. . ° Namely. contrary to the hypothesis that none of the given vectors equals zero. except in the case k = 1. Let n vectors span a vector space V containing r linearly independent vectors.§7.E.0). Conversely. an] be a sequence of n vectors spanning V. In case the vector ak is a linear combination ak = Clal + . we can delete from the set anyone vector which is or which is a linear combination of the preceding ones. One can then solve for ak as the linear combination ak = (-dk-Idl)al + .. A set of vectors is linearly dependent if and only if it contains a proper (i. so al = 0. we have at once a linear relation Proof..3. f32 = (1.D. . by (18) . . so that d I a I + d2a2 + . (-1). the set (f3h f32. -2. For instance. and let X = [gh .. smaller) subset spanning the same subspace.. + (-dk-Idk_l)ak_l' This gives ak as a combination of preceding vectors. Proof. and choose the last subscript k for which d k ~ O.. + Ck-Iak-l of the preceding ones..] be a sequence of r linearly independent . the three vectors f31 = (2.0). and f33 = (0. Theorem 5. A vector space is finitedimensional if and only if it has a finite basis. Construct the sequence B2 = [~2' AI] = 16. the unit vectors Et. i (say. Hence as before. All bases of any finite-dimensional vector space V have the same finite number of elements. En of (16) are a basis of F".an] both spans V and is linearly dependent. even though the full significance of the concepts of "basis" and "dimension". A basis of a vector space is a linearly independent subset which generates (spans) the whole space. Hence Ao must have originally contained at least r elements.Ch. . at. Theorem 5 has several important consequences. 
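The deletion procedure of the Corollary (delete each vector that is a linear combination of its predecessors, keeping the span unchanged) can be sketched with a rank test over Q. The sample vectors below are hypothetical, built so that β₁ = 2β₂ + 3β₃ as in the pattern of the text's example:

```python
from fractions import Fraction

def rref(rows):
    """Gauss-Jordan reduction over the rationals (exact arithmetic)."""
    M = [[Fraction(x) for x in row] for row in rows]
    lead = 0
    for col in range(len(M[0]) if M else 0):
        piv = next((r for r in range(lead, len(M)) if M[r][col] != 0), None)
        if piv is None:
            continue
        M[lead], M[piv] = M[piv], M[lead]
        M[lead] = [x / M[lead][col] for x in M[lead]]
        for r in range(len(M)):
            if r != lead and M[r][col] != 0:
                M[r] = [a - M[r][col] * b for a, b in zip(M[r], M[lead])]
        lead += 1
    return M

def rank(rows):
    return sum(1 for row in rref(rows) if any(x != 0 for x in row))

def independent_subset(vectors):
    """Keep each vector that is NOT a linear combination of the kept predecessors;
    the kept vectors span the same subspace (Theorem 4, Corollary 2)."""
    kept = []
    for v in vectors:
        if rank(kept + [v]) > rank(kept):  # v lies outside the span so far
            kept.append(v)
    return kept

b2, b3 = (1, 1, 0), (0, 1, 1)
b1 = tuple(2 * x + 3 * y for x, y in zip(b2, b3))  # b1 = 2*b2 + 3*b3 = (2, 5, 3)
kept = independent_subset([b1, b2, b3])            # b3 depends on b1 and b2
```

With this ordering the procedure keeps b1 and b2 and deletes b3, and the kept pair has the same rank (2) as the full triple.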
Since V is finite-dimensional. ai-l in B 1 • Deleting this term. at. let B be any other basis of V. Definition.' .D. . some vector of B2 is a linear combination of its predecessors. we have Proof. . We shall prove these now for convenience.an]. and is denoted by d[ V). as in Corollary 1. If a vector space V has dimension n. ~t.an]. . since ~l belongs to a set X of independent vectors. . . Corollary 2. so must be some aj> with a subscript j . By Theorem 4. . ~l is a linear combination of the ai. this vector cannot be 6 or ~I. . ai+I.an] which still spans V. at. it has a finite basis A = [at.8. Hence n = r. then (i) any n +1 . 7 178 Vectors and Vector Spaces vectors of V. and that n :> r.' . with j > i).' . . Since Ao spans V. Deletion of this aj leaves a new sequence of n vectors spanning V. ai+t. until the elements of X are exhausted. will not become apparent until § 7..ai-I. SO that the sequence BI = [~t.. Since A spans V and B is linearly independent. Like Bb B2 spans V and is linearly dependent. we obtain. B spans V.. Hence some vector ai is dependent on its predecessors ~t. say with r elements.. . . so r :> n. . This cannot be ~t. a subsequence Al = [~I. . which they involve. . For example. The number of elements in any basis of a finite-dimensional vector space V is called the dimension of V.:. Each time. .ai-I. . Corollary 1.. Q. at. . On the other hand. and A is linearly independent. This argument can be repeated r times.E. By Theorem 5. an element of Ao is thrown out.. Theorem 5 shows that B is finite. . proving n :> r. Because the ~i are linearly independent. . some vector of BI must be dependent on its predecessors.· . Now repeat the argument. . For n vectors ah' .. then it is a part of a basis by Theorem 6. 1) form a basis of Q3? Why? 3. Proof. g" and let a h . show that the vector {3 is a linear combination of aJ. Since the gi are independent. if A is independent. {3 are linearly dependent. Show that the vectors (aJ. g" ah . . Again. 
Do the vectors (1... . 8. . Theorem 6. and (ii) no set of n ..an of an n-dimensional vector space to be a basis. Prove that if {3 is not in the subspace S.§7. Proof. 1. ~2' ~3 are independent in R". since the dimension of V is n..an]' We can extract (Theorem 4. How many elements are in each subspace spanned by four linearly independent elements of Z/? Generalize your result. . . If the vectors aJ.an Je a basis for V. Prove: Three vectors with rational coordinates are linearly independent in Q3 if and only if they are linearly independent in R3.0) and (0. ..1 elements can span V. If A = {ah . . 4.. Let the independent set be gh . and this basis has n elements by Corollary 1 of Theorem 5. 6. no gi will be deleted..an} spans Vi it contains a subset A' which is a basis of V (Theorem 4. ... this subset A' must have n elements (Theorem 5. Hence A' = A. Corollary 2) an independent subsequence of C which also spans V (hence is a basis for V) by deleting every term which is a linear combination of its predecessors.am are linearly independent... . a2) and (bl> b 2) in p2 are linearly dependent if and only if a l b 2 . Any independent set of elements of a finite-dimensional vector space V is part of a basis. Corollary 1).am if and only if the vectors aJ. 1.4 Linear Independence and Dimension 179 elements of V are linearly dependent. ..a2bl = O. . but is in the subspace spanned by S and a.. Which of the postulates and theorems discussed so far fail to hold in this more general case? *7. 2. . . Generalize this result in two ways. Exercises 1. and A is a basis of V.am. Define a "vector space" over an integral domain D.. Corollary 2). then so are ~l + ~2' ~l + ~3' ~2 + ~3' Is this true in every F"? 5... Prove that if ~J. it is sufficient that they span V or that they be linearly independent. then a is in the subspace spanned by Sand {3. and so must be A itself. Form the sequence C = [gJ. and so the resulting basis will include every gi' Corollary. .···. B. 11.. 
which centers around the fundamental concept of matrices and their row-equivalence. 12. The row space of the matrix A is that subspace of F" which is spanned by the rows of A. Evidently.. where CI C3 i' 0. If cIa + C2{3 + C3Y = 0.. .3.am define the m x n matrix (19) A== whose ith rOW consists of the components ail.. am and {31. . having m rows and n columns. is called an m x n matrix over F. The matrix (19) may be written compactly as I aij II. Show that the real numbers 1. and J5 are linearly independent over the field of rational numbers. . Matrices and Row-equivalence Problems concerning sets of vectors in F" with given numerical coordinates can almost always be formulated as problems in simultaneous linear equations. We first define the former concept.Ch. {3" span the same subspace. *13. Find four vectors of C which together span a subspace of two dimensions. they usually can be solved by the process of elimination described in §2. . any two vectors being linearly independent. Remark. 7 Vectors and Vector Spaces 180 *9. C Over any field F form an mn-dimensional vector space under the two operations of (i) multiplying all entries by the same scalar c and (ij) adding corresponding components. We will now begin a systematic study of this process../2. Clearly. 10. *14. the vectors ai. al. show that a and {3 generate the same subspace as do {3 and y. If two subspaces Sand T of a vector space V have the same dimension.···.ajn of the vector aj. (a) How many linearly independent sets of two elements has Z/? How many of three elements? of four elements? (b) Generalize your formula to Z2" and to Zp". Definition. . the m x n matrices A. A rectangular array of elements of a field F. How many different k-dimensional subspaces has Z/? 7.5. prove that SeT implies S = T. We now use the concept of a matrix to determine when two sets of vectors in F". As such. The row space of A is then the set of all vectors of the form clal + . 
+ c₂α₂ + ⋯ + cₘαₘ. We now ask: when do two m × n matrices have the same row space? That is, when do their rows span the same subspace of Fⁿ? A partial answer to this question is provided by the concept of row-equivalence, which we now define. We now consider the effect on the matrix A in (19) of the following three types of steps, called elementary row operations: (i) The interchange of any two rows. (ii) Multiplication of a row by any nonzero constant c in F. (iii) The addition of any multiple of one row to any other row. Denote the successive row vectors of the m × n matrix A by α₁, ⋯ , αₘ; the elementary row operations then become: (i) Interchanging any αᵢ with any αⱼ (i ≠ j). (ii) Replacing αᵢ by cαᵢ for any scalar c ≠ 0. (iii) Replacing αᵢ by αᵢ + dαⱼ for any j ≠ i and any scalar d. The m × n matrix B is called row-equivalent to the m × n matrix A if B can be obtained from A by a finite succession of elementary row operations. Since the effect of each such operation can be undone by another operation of the same type, we have the following Lemma. Hence, if B is row-equivalent to A, then A is row-equivalent to B; that is, the relation of row-equivalence is symmetric. It is clearly also reflexive and transitive; hence it is an equivalence relation.

Theorem 7. Row-equivalent matrices have the same row space.

Proof. It suffices to consider the effect on the row space of a single elementary row operation of each type. Since operations of types (i) and (ii) clearly do not alter the row space, we shall confine our attention to the case of a single elementary operation of type (iii). Take the typical case of the addition of a multiple of the second row to the first row, which replaces the rows α₁, ⋯ , αₘ of A by the new rows (20) of the row-equivalent matrix B. Any vector γ in the row space of B has the form γ = c₁β₁ + ⋯ + cₘβₘ; hence on substitution from (20) we have γ = c₁(α₁ + dα₂) + c₂α₂ + ⋯
The inverse of any elementary row operation is itself an elementary row operation. Ch. the rows of A can be expressed in terms of the rows of B as so that the same argument shows that the row space of A is contained in the row space of B. . We wish to know which solution vectors ~ = (Xl>' ..3. + b1nxn b21 x 1 + b22 x2 + . It is easy to verify that the set of solution vectors ~ satisfying (21) is invariant under each of the following operations: (i) the interchange of any two equations. these are just the three elementary row operations defined earlier... This proof gives at once Corollary 1.n+l. if any. . We next apply the concept of row-equivalent matrices to reinterpret the process of "Gauss elimination" described in § 2. 7 182 Vectors and Vector Spaces which shows that l' is in the row space of A. (iii) Addition of any mUltiple of one equation to any other equation... If A and B are row-equivalent m x (n + 1) matrices over the same field F. .n+l. Any sequence of elementary row operations reducing a matrix A to a row-equivalent matrix B yields explicit expressions for the rows of B as linear combinations of the rows of A.. by the Lemma. + a2n Xn = al. satisfy the given system of equations (21).xn). + b2nxn bllXl (21') = b 1•n + 1 . (ii) multiplication of an equation by any nonzero constant c in F. a2.... . This proves Corollary 2.. Consider the system of simultaneous linear equations allxl a21 X l + a12 X2 + .n+l. Simultaneous Linear Equations.then the system of simultaneous linear equations (21) has the same set of solution vectors ~ = (Xl> .xn) as the system + b 12 X2 + .. thus these row spaces are equal. (21) where the coefficients aij are given constants in the field F. But as applied to the m x (n + 1) matrix of constants aij in (21). = b2. + alnX n = + a22 X2 + . Conversely. Proof. If A and B are row-equivalent matrices. Any matrix A is row-equivalent to a row-reduced matrix. In any nonzero row of A. 
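The Lemma (each elementary row operation is undone by another of the same type) can be verified directly; a minimal sketch on an arbitrary 2 × 2 matrix:

```python
from fractions import Fraction

def swap(M, i, j):
    """Type (i): interchange rows i and j."""
    M = [row[:] for row in M]
    M[i], M[j] = M[j], M[i]
    return M

def scale(M, i, c):
    """Type (ii): multiply row i by a nonzero scalar c."""
    M = [row[:] for row in M]
    M[i] = [c * x for x in M[i]]
    return M

def add_multiple(M, i, j, d):
    """Type (iii): add d times row j to row i (i != j)."""
    M = [row[:] for row in M]
    M[i] = [a + d * b for a, b in zip(M[i], M[j])]
    return M

A = [[Fraction(x) for x in row] for row in [[1, 2], [3, 4]]]
assert swap(swap(A, 0, 1), 0, 1) == A                                 # (i) is its own inverse
assert scale(scale(A, 0, Fraction(5)), 0, Fraction(1, 5)) == A        # inverse of c is 1/c
assert add_multiple(add_multiple(A, 0, 1, Fraction(7)), 0, 1, Fraction(-7)) == A  # d -> -d
```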
the first nonzero entry may be called the "leading" entry of that row.

Exercises

1. If A and B are row-equivalent matrices, prove that the rows of A are linearly independent if and only if those of B are.
2. Show that the meaning of row-equivalence is unchanged if one replaces the operation (iii) by (iii′) The addition of any row to any other row.
*3. Show that any elementary row operation of type (i) can be effected by a succession of four operations of types (ii) and (iii). (Hint: Try 2 × 2 matrices.)

7.6 Tests for Linear Dependence

We now aim to use elementary row operations to simplify a given m × n matrix A as much as possible. We say that a matrix A is row-reduced if: (a) Every leading entry (of a nonzero row) is 1; (b) Every column containing such a leading entry 1 has all its other entries zero. Sample 4 × 6 row-reduced matrices are

(22)    0 0 1 r₁₄ r₁₅ r₁₆          1 d₁₂ 0 d₁₄ 0 d₁₆
        1 0 0 r₂₄ r₂₅ r₂₆          0 0   1 d₂₄ 0 d₂₆
        0 1 0 r₃₄ r₃₅ r₃₆          0 0   0 0   1 d₃₆
        0 0 0 0   0   0            0 0   0 0   0 0

Theorem 8. Any matrix A is row-equivalent to a row-reduced matrix.

Proof. Suppose that the given matrix A with entries aᵢⱼ has a nonzero first row with leading entry a₁ₜ located in the t-th column. Multiply the first row by a₁ₜ⁻¹; this leading entry becomes one. Now subtract aᵢₜ times the first row from the i-th row for every i ≠ 1. This reduces every other entry in column t to zero, by elementary row operations of types (ii) and (iii), so that conditions (a) and (b) are satisfied as regards the first row. Now let the same construction be applied to the other rows in succession. The application involving row k does not alter the columns
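The reduction procedure in the proof of Theorem 8 is exactly Gauss-Jordan elimination. A minimal Python sketch (the 2 × 3 sample matrix is hypothetical), together with a check of conditions (a) and (b):

```python
from fractions import Fraction

def rref(rows):
    """Row-reduce by the construction of Theorem 8: scale each pivot row so its
    leading entry is 1, then clear the rest of that column."""
    M = [[Fraction(x) for x in row] for row in rows]
    lead = 0
    for col in range(len(M[0]) if M else 0):
        piv = next((r for r in range(lead, len(M)) if M[r][col] != 0), None)
        if piv is None:
            continue
        M[lead], M[piv] = M[piv], M[lead]
        M[lead] = [x / M[lead][col] for x in M[lead]]
        for r in range(len(M)):
            if r != lead and M[r][col] != 0:
                M[r] = [a - M[r][col] * b for a, b in zip(M[r], M[lead])]
        lead += 1
    return M

def is_row_reduced(M):
    """Conditions (a) and (b): every leading entry is 1 and stands alone in its column."""
    for i, row in enumerate(M):
        lead = next((j for j, x in enumerate(row) if x != 0), None)
        if lead is None:
            continue
        if row[lead] != 1 or any(M[r][lead] != 0 for r in range(len(M)) if r != i):
            return False
    return True

A = [[2, 4, 1], [1, 2, 0]]
R = rref(A)   # [[1, 2, 0], [0, 0, 1]]
```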
we can then arrange R so that (d) tl < t2 < . . 7 184 Vectors and Vector Spaces contammg the leading entries of rows 1. the first matrix of (22) is not. Since any such column has all its other entries zero. + y. For if {3 = 0. Hence. the tj-th component of {3 must be Yi . we can evidently rearrange the rows of a row-reduced matrix R so that (c) Each zero row of R comes below all nonzero rows of R. . i... . but can be brought to this form by placing the first row after the third row. Corollary 1. The nonzero rows of a row -reduced matrix are linearly independent..j. By permutations of rows (i. Let E be a row-reduced matrix with nonzero rows 1'1. t.· . We have proved the Corollary.. r.e. ... < t.e.1'. a succession of elementary row operations of type (i)).···.. By a further permutation of rows. except that of 1'i> which is one. For example. the tj -th entry of {3. 1. Theorem 8 now follows by induction on k. 1'" and leading entries 1 in columns tb . A row-reduced matrix which also satisfies (c) and (d) is called a (row) reduced echelon matrix (the leading entries appear "in echelon"). Then. k .. because row k already had entry zero in each such column. Proof.tj whenever i ¥.. . (leading entry of row i in column tJ. we have a matrix satisfying conditions (a) and (b) at its first k rows. Any matrix is row-equivalent to a reduced echelon matrix. for any vector {3 = Y1'Yl + .Ch. Because of the ordering of the rows (condition (d) above). They are thus a basis of this row space. The rank r of a matrix A is defined as the dimension of the row space of A. which by Theorem 7 is identical to the row space of A. hence n leading entries 1 in n different columns and no other nonzero entries in these columns (which include all the columns). we see that the rank of A can also be described as the maximum number of linearly independent rows of A. 
One such matrix is the n × n identity matrix Iₙ, which has entries 1 along the main diagonal (upper left to lower right) and zeros elsewhere.

Corollary 2. Let the m × n matrix A be row-equivalent to a row-reduced matrix R. Then the nonzero rows of R form a basis of the row space of A.

Proof. These rows of R are linearly independent, by Corollary 1, and span the row space of R. Q.E.D.

Since this space is spanned by the rows of A, which must contain a linearly independent set of rows spanning the row space, we see that the rank of A can also be described as the maximum number of linearly independent rows of A. By Theorem 7, row-equivalent matrices have the same rank. In particular, an n × n (square!) matrix A has rank n if and only if all its rows are linearly independent.

Corollary 3. An n × n matrix A has rank n if and only if it is row-equivalent to the n × n identity matrix Iₙ.

Proof. A is row-equivalent to a reduced echelon matrix E which also has rank n. This matrix E then has n nonzero rows; E is then just the identity matrix.

In testing vectors for linear independence, or more generally in computing the dimension of a subspace (= rank of a matrix), it is needless to use the reduced echelon form. It is sufficient to bring the matrix to any echelon form, such as the form of the following 4 × 7 matrix: Such an echelon matrix may be defined by the condition that the leading entry in each nonzero row is 1, and that in each row after the first the number of zeros preceding this entry 1 is larger than the corresponding number in the preceding row. The rank of a matrix can then be found immediately by applying the following theorem.
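The rank criterion and Corollary 3 can be illustrated numerically: a square matrix with independent rows reduces to the identity, while a matrix with a repeated row direction loses rank. A sketch with hypothetical sample matrices:

```python
from fractions import Fraction

def rref(rows):
    """Gauss-Jordan reduction over the rationals (exact arithmetic)."""
    M = [[Fraction(x) for x in row] for row in rows]
    lead = 0
    for col in range(len(M[0]) if M else 0):
        piv = next((r for r in range(lead, len(M)) if M[r][col] != 0), None)
        if piv is None:
            continue
        M[lead], M[piv] = M[piv], M[lead]
        M[lead] = [x / M[lead][col] for x in M[lead]]
        for r in range(len(M)):
            if r != lead and M[r][col] != 0:
                M[r] = [a - M[r][col] * b for a, b in zip(M[r], M[lead])]
        lead += 1
    return M

def rank(rows):
    """Rank = number of nonzero rows of any row-equivalent echelon matrix."""
    return sum(1 for row in rref(rows) if any(x != 0 for x in row))

A = [[2, 1], [1, 1]]   # rows independent: rank 2, reduces to the identity
B = [[1, 2], [2, 4]]   # second row is twice the first: rank 1
```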
hence the row space S determines the indices tt. Appendix on Row-equivalence. Proof.2a1 = (0. -3. 1) for independence.. . t 2 • Each of these columns occurs (as the leading entry of a 'YJ. < tn the leading entries of the remaining 'Ys+ 1t ••• . then f3 = y~'Y.. .... obtain the new rows (31 = at..'Y.. The rank of any matrix A is the number of nonzero rows in any echelon matrix row-equivalent to A. (33 = a3 . 'Yn where ')'i has leading entry 1 in column Ii' By condition (d). 1.. By transformations of type (iii). f3 has entry Yi in column tl.1. Let the reduced echelon matrix E with row space S have the nonzero rows 'Y1. Finally.. < I. If Ys is the first nonzero y. Let f3 = Y1 'Y1 + . sketched below. set 'Y1 = f3b 'Y2 = -(1/3)(32. It < 12 < . -8). every vector f3 of S has leading entry in one of the columns tll . There results the echelon matrix C with rows ')'1. Theorem 11.3.. so that ~ has Ys as its leading entry in column ts' In other words.ny nonzero vector in the row space of E. 7 186 Vectors and Vector Spaces Thus. 3. + y. + Yr'Yr be a. 'Y2. and a3 = (3. (32 = a2 . -1.3). . 1 -1 C= -1/3 0 1 -1/3 o 0 0 3 0 dependent.10).6')'2 = f33 + 2f32 = O. Reduced echelon matrices provide a convenient test for row-equivalence.. t. By substitution m the definition of 'Y3 = 0.. -5. ..4). by Theorem 9. 'Yr of E have leading entry 1 and entries zero in all but one of the columns tl. respectively. Every m x n matrix A is row-equivalent to one and only one reduced echelon matrix. if A and B have the same row space. Proof. they are identical. may be summarized by saying that the reduced echelon matrices provide a canonical form for matrices under row-equivalence: every matrix is row-equivalent to one and only one matrix of the specified canonical form. Two m x n matrices A and B are row-equivalent if and only if they have the same row space.E') to B. by Theorem 7.6 Tests for Linear Dependence 187 The rows 'Yl. This result.•. ••. by Theorem 11. then A and B have the same row space. 
If A is row-equivalent to B. .' . 'Yr of E. Corollary 1.. Since E and E' have the same row space. by Theorem 9. which is immediate. These results emphasize again the fact that the row-equivalence of matrices is just another language for the study of subspaces of F". as was required. they are row-equivalent to reduced echelon matrices E and E'. tr • If f3 is any vector of S with leading entry 1 in some column Ij and entries zero in the other columns tjl then.D.E. . . Conversely. Hence A is indeed row-equivalent (through E --:. Corollary 2.§7. Q. (3 must be 'Yi' Thus the row space and the column indices uniquely determine the rows 'Yb . . (1. 7.4. Find the ranks and bases for the row spaces of the following matrices: (a) 12 23 43) . (0. 2 . Prove: The rank of an m x n matrix exceeds neither m nor n. -1.0.. (-1. List all possible forms for a 2 x 4 reduced echelon matrix with two nonzero rows. .4.j).0.am of F" and various vectors A.Ch. (d) (1.. (3. (1. (0. 1. -1. -1. taken together. (1. 11. 7 188 Vectors and Vector Spaces 3.0). In Ex. (1. (0.6. express the rows of each associated echelon matrix as linear combinations of the rows of the original matrix. a3 be as in the Example of § 7. 1. a2. -1).3. Having reduced the matrix A to echelon form. Homogeneous Equations It is especially advantageous to use elementary row operations on a matrix in place of linear equations (21) when one wishes to solve several vector equations of the form (23) for fixed vectors ah .0).3. find a basis for the subspace spanned by the two sets of vectors.3). 1). 1. 4. In Q6. test each of the following sets of vectors for independence. let ai. 5. -2. (a) (2. 0.) 9.0. extract a linearly independent subset which generates the same subspace. -1. 7). Prove directly (without appeal to Theo~em 8) that any matrix A is rowequivalent to a (not necessarily reduced) echelon matrix. 1. -3. 7.2. -6). and let A = (2. (i.6. 5. 2.2). and find a basis for the subspace spanned.7. 1. i. 6. 1) in R3. 3) in C 3 . 
(c) (1. 1) in Q3 and C\ (b) (0. In Ex. 1 + i). (These yield a cell decomposition of the "Grassmann manifold" whose points are the planes through the origin in 4-space. (0. (0. 5. then rank (A) < rank (B). For instance. -1. If the m x (n + k) matrix B is formed by adding k new columns to the m x n matrix A.3.7. Test the following sets of vectors for linear dependence: (a) (1. 1. 1). In every case of linear dependence. we .0.2. 1).0). 1).2). 10. 0. 4. -2. 1. Vector Equations.2. (b) (2. 1) in Z/ and Z/. (3 4 5 (b) 1 2 1 2 3 2· 3 2 -1 -3 0 4 0 4 -1 -3 245 234 -2 0 2 8.3. 0.7. -1. 1'. additions. + Xmam can be solved (if a solution exists) by a finite number of rational operations in F. Since 1'3 = -7al + 2a2 + a3 = 0. at. and since a given matrix can be transformed to reduced echelon form after a finite number of elementary row operations. then since the rows of C will be known lin.. it is usually best to first transform the m x n matrix whose rows are at.7 Vector Equations.am in F". = Ylelj + . . other solutions of (23) in this case are for arbitrary y.···. For given vectors A. we will obtain a solution for (23) j-l in the form A = I YieiPb whence we have Xj proves the following result. If it is satisfied. then A is not in the row space of A. + Y. Let Sand T be subspaces of F" spanned by vectors at. Corollary... Computing the third and fourth components of Sa 1 .13k respective/yo Then the relations S => T.. Had the vector A been A' = (2.. we then get 7 = -Yl + Y2 or Y2 = 9. One can then apply Theorem 9 to get the only possible coefficients which will make A := YI1'1 + . we get 2 = Yl. . . + y.. T => S.ear I m combinations 1'i = I eijaj of the a's. . . This is actually the most general solution of (23). -6). the above procedure would have shown that A cannot be expressed as a linear combination of the a's at all. this can be done after a finite number of rational operations (Le..I!'j' This Theorem 12.. . and (23) has no solution.. . the vector equation A = Xlal + .3a2. 
am to reduced echelon form C with nonzero rows 1't. we see that A is indeed such a linear combination.1.. Since each elementary row operation on a matrix involves only a finite number of rational operations. . Homogeneous Equations first solve the equation A YI1'1 + Y21'2 + Y31'3 = = Yt'Y1 + Y21'2' Equating first components. and S = T can be tested by a finite number of rational operations. Hence we must have if A is a linear combination of the a's at all. If this equation is not satisfied by all components of 1'.. subtractions. am and {3 t. when several vectors A are involved.. and divisions).1'.. multiplications... equating second components.7.. In fact.189 §7.. . . + C2. for S = T holds if and only if these two reduced forms have the same nonzero vectors.. Together these two procedures test for S = T. with leading entries 1 in the columns It.. .+1X.+1 + . We shall now show how to determine a basis for this subspace. one can construct from the a's. Now bring A to a reduced echelon form. First observe. assume that the leading entries appear in the first r columns (this actually can always be brought about by a suitable permutation. The corresponding system of equations has r nonzero equations. . and the ith equation is the only one containing the unknown Xli' To simplify the notation. It is easy to show that S is a subspace. . . One then tests as above whether all the (3's are linear combinations of the 1"s. a sequence of nonzero vectors 1'l> .+l> ••• . (25) In this simplified form. and solving (25) for Xt. .. applied to the unknowns Xi and thus to the columns of A). .1'..xn ).ain) in the ith equation of (24). 7 Vectors and Vector Spaces 190 For. we can clearly obtain all solutions by choosing arbitrary values for X. + ClnXn = 0.X" to give . by elementary operations. Specifically. + C2nXn = 0.. one determines whether or not T :::> S. that elementary row operations on the system (24) transform it into an equivalent system of equations. 
The reduced equations then have the form 'XI X2 + CI.. Reduced echelon matrices are also useful for determining the solutions of systems of homogeneous linear equations of the form (24) Thus let S be the set of all vectors ~ = (Xl>' •• .. . By reversing the preceding process. . forming the rows of a reduced echelon matrix and also spanning S. these operations carry A into another matrix having the same set S of "solution vectors" ~ = (Xl> ••• .' . .+1 + .Xn .+IX. which is clearly necessary and sufficient for S :::> T. as applied to the m x n matrix A which has as ith row the coefficients (ail. as in §2. alternatively S = T may be tested by transforming the matrix with rows a and the matrix with rows {3 to reduced echelon form. .Ch. I.xn ) ()f F" satisfying (24)...3. X3 = 1. of these n .0.. .r basic solutions. In particular. a basis for the space of solutions is provided by the cases X2 = 0. . = (-Cl m 0. 1 -2 -1 -1) 1 -2 -+ (1 1 -1 -1) 2 -1 0-3 -+ (1 4 -3 0) 0 -3 2 -1' The final matrix (except for sign and column order) is in reduced echelon form.' •• .~.).. The only solution of a system of n linearly independent homogeneous linear equations in n unknowns Xl = X2 = . The "solution space" of all solutions of a system of r linearly independent homogeneous linear equations in n unknowns has dimension n . . . . -3X2 + 2X3 . . + x.X.. we obtain n . X r+ 1.3X3 = 0. Let S be defined by the equations Xl + X2 = X3 + X4 and Xl + X3 = 2(X2 + X4)' Thus. -3X2 + 2X3). ..r+l. It yields the equivalent system of equations Xl + 4X2 . ..§7.. The matrix of these equations reduces as follows: EXAMPLE. . ••• .. X. Equation (26) states that the general solution ~ is just the linear combination ~ = Xr+l~r+l + . = X.) Corollary. equal to 1 and the remaining parameters equal to zero.. ..X. geometrically. X3 = 0. Xl.. . ••• . -Crn.r+l. thereby proving Theorem 13.0. 
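The construction of the n − r basic solutions in (26) can be sketched directly from the reduced echelon form: set one free unknown equal to 1, the others to 0, and read the pivot unknowns off the reduced equations. The coefficient matrix below is the system of the Example as I read it (x₁ + x₂ = x₃ + x₄ and x₁ + x₃ = 2(x₂ + x₄)):

```python
from fractions import Fraction

def rref(rows):
    """Gauss-Jordan reduction over the rationals (exact arithmetic)."""
    M = [[Fraction(x) for x in row] for row in rows]
    lead = 0
    for col in range(len(M[0]) if M else 0):
        piv = next((r for r in range(lead, len(M)) if M[r][col] != 0), None)
        if piv is None:
            continue
        M[lead], M[piv] = M[piv], M[lead]
        M[lead] = [x / M[lead][col] for x in M[lead]]
        for r in range(len(M)):
            if r != lead and M[r][col] != 0:
                M[r] = [a - M[r][col] * b for a, b in zip(M[r], M[lead])]
        lead += 1
    return M

def basic_solutions(A):
    """One solution per free column, as in (26): that free unknown is 1, the
    other free unknowns are 0, and each pivot unknown is read off its equation."""
    R = [row for row in rref(A) if any(x != 0 for x in row)]
    n = len(A[0])
    pivots = [next(j for j, x in enumerate(row) if x != 0) for row in R]
    sols = []
    for f in (j for j in range(n) if j not in pivots):
        x = [Fraction(0)] * n
        x[f] = Fraction(1)
        for row, p in zip(R, pivots):
            x[p] = -row[f]
        sols.append(x)
    return sols

A = [[1, 1, -1, -1], [1, -2, 1, -2]]   # x1+x2-x3-x4 = 0,  x1-2x2+x3-2x4 = 0
sols = basic_solutions(A)              # n - r = 4 - 2 = 2 basic solutions
```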
giving the solutions ~r+l = 1...r solution vectors are linearly independent (since they are independent even if one neglects entirely the first r coordinates!). or (3...X.. S is the intersection of two three-dimensional "hyperplanes" in four-space. . These n . {. . is = O. (Xl.X4 = 0. Homogeneous Equations the solution vector ~ (26) = (- f c ljXj.0. -Cr. and X2 = 1.4xl> Xl> X3.0) (-Cl.0. ••• .r. 1). with the general solution ~ = (3X3 . ••• . 1. - j=r+l f j=r+l CrjXj. We have thus found a basis for the space S of solution vectors of the given system of equations (24).2) and (-4. 1.. -3).7 191 Vector Equations.r solutions by setting one of the parameters Xr+l. -2). 3 for the vectors satisfying 3x) . (-1. y + z + t = 0.4z = 0. 1) + x2(2.X3 . ~4 = (4. a2.2X2 + X3 . 5 if the equations are taken to be congruences modulo 5. Do Ex.Ch. 5. -2) = Xt(1. The linear equations of our example. are equivalent to the vector equation The solutions are thus all relations of linear dependence between the four vectors (1. Thus. . 1). with x's in place of a's. 6.3i). 7 192 Vectors and Vector Spaces By duality. 1. 2) + x2(2. Let 111 = (1 + i.6. 7) . 2). r.3X3 . (b) (1. 2.1.X4 = 0 and Xl . 112 = (2. -1. Let ~1 = (1. Find a basis of the proper number of linearly independent solutions for each of the following four systems of equations: (a) x + y + 3z = 0.X4 = O. 2x + 2y + 6z = 0. (a) (1. (c) x + 2y . 1. Find all complex numbers Ci such that C)11) + C2112 + C3113 = O. (b) x + y + z = 0.4. and when this is the case. -1. 7. 1. 0. 2i). + x3(1. 3x + y . 1.z + t = 0. ~2 = (2. -1. -2. 1). (1. X3. 2. and (-1. one can obtain a basis for the linear equations satisfied by all the vectors of any subspace. ~3 = (3.1) = x)(2.2X4 = O. let T be the subspace of p4 spanned by the vectors (1. -1). -2).2z = 0. 2x + 3y . Find numbers Ci not all zero such that c)~) + C2~2 + C3~3 + C4~4 = O. 3) + x3(3. -2) of the two-dimensional space This could also be solved as in §7. 
-1.5 by reducing to echelon form the 4 x 2 matrix having these vectors as rows. Xl + X2 . Find two vectors which span the subspace of all vectors (Xl> x 2. 3. -1). Then the homogeneous linear equation l:aixi = 0 is satisfied identically for (x h X2. 4. 3x + 4y + 2t = O. find one solution.2X2 + 4X3 + X4 = x) + X2 . . Exercise. 3 + 4i). 1. 3). this matrix being obtained from that displayed above by transposing rows and columns. Determine whether each of the following vector equations (over the rational numbers) has a solution. 0). 113 = (2i.3) (c) (2. 1). X3. Do Ex. a3. 1. X4) in T if and only if al + a2 = a3 + a4 and al + a3 = 2(a2 + a4)' A basis for the set of coefficient vectors (at. X4) satisfying x) + X2 = X3 . a4) satisfying these equations has been found above. -1) and (1.2X4 = 0.2) + x3(1. 1. .1) = x)(1. ld) x + y + z + t = 0. . 2. . so that each Xi :. + x~an' then subtraction from (27) and recombination gives Since the ai are a basis. (c) (0.x D = .· .4).. Theorem 14. If is a second vector of V.1. State and prove an analogue of Ex..2).1. -1). a .3. whence the expression (27) is unique.: (x n .1. a3 = (0. (Do not count computations like aa.3.1 = 1. Bases and Coordinate Systems A basis of a space V was defined to be an independent set of vectors spanning V. We shall call the scalars Xi in (27) the coordinates of the vector ~ relative to the basis at.x~) = 0. (d) (2. 42 additions and subtractions. (b) (3. and 4 formations of reciprocals.2).a = 0.0. an form a basis for V. a2 = (1. an.8 Bases and Coordinate Systems 193 8. with coordinates Yh. Show that an m x n matrix can be put in row-reduced form by at most m 2 elementary operations on its rows.) *11..0). :. hence every vector ~ in V has at least one expression of the form (27). -2.0. The proof depends on the following theorem. and the preceding equation implies that (x 1 .2. under a suitably chosen coordinate system.. 1.. . Since the ai form a basis. . 10 for n x n matrices. they span V. 
then every vector ~ has a unique expression E V (27) as a linear combination of the ai. The real significance of a basis lies in the fact that the vectors of any basis of F n may be regarded as the unit vectors of the space.. Proof. 10. Express each of the following four vectors in the form Xlal + X2a2 + X3a3 + x 4 a 4 : (1. -2).. they are independent. and a4 = (-1. 9. Show that a 4 x 6 matrix can be put in row-reduced form after at most 56 multiplications. . 1. 1). Ym then by the identities .0). or Oa = 0. If some ~ E V has a second such expression ~ = x~al + . .2. -2.1. (a) 7.8.. -1. In Q4 let al = (1. . If al.: x.§7. 1.Ch. Equations (28) and (29) then show that each basis a b ••• . Likewise. Thus by Theorem 7. at = (1. In particular. A vector space can have many different bases. as in (31) Since the number n of vectors in a basis is determined by the dimension nl which is an invariant (Theorem 5. over the same field F. This isomorphism is the correspondence Ca which assigns to each vector ~ of V the n-tuple of its coordinates relative to a. What is more. 7 194 Vectors and Vector Spaces of vector algebra In words. We have thus solved the problem of determining (up to isomorphism) all finite-dimensional vector spaces. Any finite-dimensional vector space over a field F is isomorphic to one and only one space F".1'/ in V and all scalars c in F. any sequence of vectors of F n obtained from E1> ••• . any three noncoplanar vectors in ordinary three-space define a basis of vectors for "oblique coordinates. By analogy with the corresponding definitions for integral domains and groups." . we have shown that all bases of the same vector space V are equivalent. 1). to be a one-one correspondence ~ -+ ~C of V onto W such that (30) and (c~)C = c(~C) for all vectors ~.0). in the sense that there is an automorphism of V carrying any basis into any other basis.0. and a3 = (1. an in a vector space V over F provides an isomorphism of V onto F". a2 = (0. 
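Finding the coordinates x_i of a vector ξ relative to a basis α_1, ..., α_n amounts to solving the linear system ξ = x_1 α_1 + ... + x_n α_n. A minimal sketch (illustrative, not from the text; the basis and vector below are chosen only for the example, and the basis is assumed independent):

```python
from fractions import Fraction

def coordinates(basis, xi):
    """Solve xi = x1*a1 + ... + xn*an for the coordinates (x1, ..., xn)."""
    n = len(basis)
    # augmented system: the columns are the basis vectors, right side is xi
    M = [[Fraction(basis[j][i]) for j in range(n)] + [Fraction(xi[i])]
         for i in range(n)]
    for c in range(n):               # Gauss-Jordan elimination with pivoting
        p = next(i for i in range(c, n) if M[i][c] != 0)
        M[c], M[p] = M[p], M[c]
        M[c] = [a / M[c][c] for a in M[c]]
        for i in range(n):
            if i != c and M[i][c] != 0:
                M[i] = [a - M[i][c] * b for a, b in zip(M[i], M[c])]
    return [M[i][n] for i in range(n)]

# Coordinates of (2, 3, 4) relative to the basis (1,1,0), (1,0,1), (0,1,1) of Q^3
x = coordinates([(1, 1, 0), (1, 0, 1), (0, 1, 1)], (2, 3, 4))
print(x)   # (1/2, 3/2, 5/2)
```

The resulting coordinates (1/2, 3/2, 5/2) are unique, as Theorem 14 asserts; the map ξ → (x_1, ..., x_n) is exactly the isomorphism of V onto F^n described in Theorem 15.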
1) are a basis for F3 for any field F in which 1 + 1 ¥. En by a succession of elementary row operations is a basis for F". 1.O. Corollary 1). Similarly. the product of the vector ~ of (27) by a scalar c is so that each coordinate of c~ is the product of c and the corresponding coordinate of ~. we have proved Theorem 15. let us now define an isomorphism C: V -+ W between two vector spaces V and W. the coordinates of a vector sum relative to any basis are found by adding corresponding coordinates of the summands. . in symbols. the domain F[x] of all polynomial forms in an indeterminate X over a field F is a vector space over F. x F (n factors). () = (c1'/.. respectively. Since (U' + 7) + (U" + 7') = (U' + U") + (7 + 7'). Conversely. for all the postulates for a vector space are satisfied in F[x]. x 3 . The two numbers 1 + i and 1 . F" is the direct product (as an additive group) of n copies of the additive group of F. 7) ~ (U' + 7) is an isomorphism from the additive group of the vector space V onto the direct product (§6. the field C of all complex numbers may be considered as a vector space over the field R of real numbers. The definition of equality applied to the equation p(x) = 0 implies that the powers 1. Or consider the homogeneous linear differential equation d 2x/ dt 2 3dx/ dt + 2x = O. and that the product of a solution by any (real) constant is a solution. V is their direct sum in the sense defined above. moreover. x 2. we say that a vector space V is the direct sum of two subspaces Sand T if every vector ~ of V has one and only one expression (32) ~ = U' + 7. scalar mUltiplication being defined by the formula c (1'/. F" = F x . the subsets of (1'/. which means precisely that the most general solution can be expressed in the form X = cle ' + c2e 21. Hence F[x] has an infinite basis consisting of these powers. if Sand T are any two given vector spaces over the same field F. in one and only one way. 
sometimes called the "solution space" of the differential equation.. 7 E T as a sum of a vector of S and a vector of T. but less convenient. One verifies readily that the sum Xl (t) + X2(t) of two solutions is itself a solution.S 195 Bases and Coordinate Systems Again. basis for Cover R. In R 3 . c() for any c E F. More generally. Therefore the set V of all solutions of this differential equation is a vector space. the correspondence (U'. This space has the dimension 2.11) of the additive groups of Sand T. The easiest way to describe this space is to say that e' and e 21 form a basis of solutions. X. for 1 and i form a basis. () constitute subspaces isomorphic to Sand T. a plane S and a line T not in S. generating respectively the "subspaces" of real and pure imaginary numbers. if one ignores all the algebraic operations in C except the addition of complex numbers and the ("scalar") mUltiplication of complex numbers by reals. and any vector in the space can be expressed uniquely as a sum of a vector in the plane and a vector in the line. More generally. span the whole space. both through the origin. = U' E S. Finally.0) and (0. ' •• are linearly independent over F.§7. one can define a new vector space V = S EB T whose additive group is the direct product of those of Sand T.i form another. In this V. for any vector (polynomial form) can be expressed as a linear combination of a finite subset of this basis. if ~I is any vector common to Sand T. so that b i = . the representation is unique. Secondly. then ~I has two representations ~I = ~I + 0 and ~I = 0 + ~I of the form (32). Corollary.13k. Q. . Secondly. If the finite-dimensional vector space V is the direct sum of its subspaces Sand T. S n T = O. I'm are indeed linearly independent. Indeed. so that 0 = 1'/0 = I bi{3i and 0 = I cm. 1'1> . then d[V] = k'+ m. First. But 0 = 0 + 0 is another representation of 0 as a sum of a vector in S and one in T. then (34) d[V] = d[S] + d[T]. 
for if then 0 is represented as a sum of the vector 1'/0 = I bi{3i in Sand (0 = I cm in T. By assumption. We then have (35) S+T=V.. If the finite-dimensional space V is the direct sum of its subspaces Sand T.. respectively. But the {3's and the 1"s are separately linearly independent. I'm. then the union of any basis of S with any basis of T is a basis of V. so that the {31>"'... Let Sand T have the bases {3l>"'.. When V is the direct sum of Sand T. 13k and 1'1>"'. 13k> 1'1> . since these two representations must be the same. . I'm is a basis of V. .. the above proof shows that when d[S] = k and d[T] = m.. these vectors span V. these vectors are linearly independent. (32) states that V is the linear sum of the subspaces Sand T. for any ~ in V can be written as ~ = 1'/ + (.D.Ch. Since the dimension of a space is the number of vectors in (any) basis.. Proof. Theorem 16. Proof. so that the intersection . = bk = 0 and CI = . This theorem and its proof can readily be extended to the case of a direct sum of a finite number of subspaces. 7 196 Vectors and Vector Spaces One also speaks of S EB T as the direct sum of the given vector spaces S and T. = Cm = O. we call Sand T complementary subspaces of V.. where 1'/ is a linear combination of the (3's and ( of the 1"s. we wish to prove that {31> .E. ~I = 0. The relation (33) thus holds only when all the scalar coefficients are zero.. §7.8 197 Bases and Coordinate Systems S nTis zero, as asserted. Conversely, we can prove that under the conditions (35), V is the direct sum of Sand T. Thus in this case equation (34) reduces to d[V] = d[S + 11 + d[S n 11 = d[S] + d[11. This latter result holds for any two subspaces. Theorem 17. Let Sand T be any two finite-dimensional subspaces of a vector space V. Then (36) d[S] + d[T] = d[S n T] + d[S + T]. Proof. Let ~I> ••• , ~n be a basis for S n T; by Theorem 6, Sand T have bases ~I> ••• ,~"' 1/1> ... , 1/, and ~J, • • • ,~"' (I> ••• , (s respectively. 
Clearly, the ~i' 1/j, and (k together span S + T. They are even a basis, since implies that I bj 1/j = - I ai~i - I Cdk is in T, whence I bj 1/j is in S n T and so I bj 1/j = I di~i for some scalars d i • Hence (the ~i and 1/j being independent) every bj is O. Similarly, every Ck = 0; substituting, I ai~i = 0, and every ai = O. This shows that the ~i' 1/j and (k are a basis for S + T. But having proved this, we see that the conclusion of the theorem reduces to the arithmetic rule (n + r) + (n + s) = n + (n + r + s). Exercises 1. In Ex. 4 of §7.6, which of the indicated sets of vectors are bases for the spaces involved there? 2. In Q4, find the coordinates of the unit vectors £1' £2' £3' £4 relative to the basis a1 = (1, 1, 0, 0), a2 = (0, 0, 1, 1), a3 = (1, 0, 0, 4), a4 = (0, 0, 0, 2). 3. Find the coordinates of (1,0, 1) relative to the following basis in (2i, 1,0), (2, -i, 1), C: (0,1 + i,1 - i). 4. In Q4, find (a) a basis which contains the vector (1,2, 1,1); (b) a basis containing the vectors (1, 1,0,2) and (1, -1, 2, 0); (c) a basis containing the vectors (1, 1,0,0), (0,0,2,2), (0,2,3,0). 5. Show that the numbers a + b .J2 + c .J3 + d .J6 + e with rational a, ... , e form a commutative ring, and that this ring is a vector space over the rational field Q. Find a basis for this space. m Ch. 7 198 Vectors and Vector Spaces 6. In Q4, two subspaces Sand T are spanned respectively by the vectors: S: (1, -1, 2, -3), T: (0, -2, 0, -3), (1,1,2,0), (1,0,1,0). (3, -1, 6, -6), Find the dimensions of S, of T, or SliT, and of S + T. *7. Solve Ex. 6 for the most general Z/. 8. Find the greatest possible dimension of S + T and the least possible dimension of SliT, where Sand T are variable subspaces of fixed dimensions sand t in P". Prove your result. *9. Prove that, for subspaces, SliT = SliT', S + T = S + T, and T c T imply T = T'. 10. 
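Formula (36) can be checked numerically: the dimension of a subspace spanned by given vectors is the rank of the matrix whose rows are those vectors, the linear sum S + T is spanned by the two spanning sets together, and d[S ∩ T] then follows by subtraction. An illustrative sketch (not from the text), applied to two subspaces of Q^4:

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix over Q, by Gaussian elimination."""
    A = [[Fraction(a) for a in row] for row in rows]
    m, n, r = len(A), len(A[0]), 0
    for c in range(n):
        p = next((i for i in range(r, m) if A[i][c] != 0), None)
        if p is None:
            continue
        A[r], A[p] = A[p], A[r]
        for i in range(r + 1, m):
            if A[i][c] != 0:
                f = A[i][c] / A[r][c]
                A[i] = [a - f * b for a, b in zip(A[i], A[r])]
        r += 1
    return r

S = [(1, -1, 2, -3), (1, 1, 2, 0), (3, -1, 6, -6)]
T = [(0, -2, 0, -3), (1, 0, 1, 0)]
dS, dT, dSum = rank(S), rank(T), rank(S + T)   # S + T concatenates spanning sets
dInt = dS + dT - dSum                          # d[S n T], by formula (36)
print(dS, dT, dSum, dInt)                      # 2 2 3 1
```

Here d[S] = d[T] = 2 (the third spanning vector of S is 2·(1,−1,2,−3) + (1,1,2,0)), d[S + T] = 3, and so d[S ∩ T] = 1, in agreement with (36).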
If S is a subspace of the finite-dimensional vector space V, show that there exists a subspace T of V such that V is the direct sum of Sand T. *11. V is called the direct sum of its subspaces SI> ... , Sp if each vector ~ of V has a unique expression ~ = 111 + ... + 11p, with 11. E S•. State and prove the analogue of Theorem 16 for such direct sums. 12. Prove that V is the direct sum of Sand T if and only if (35) holds. *13. State and prove the analogue of Ex. 12 for the direct sum of p subspaces. 14. By an "automorphism" of a vector space V is meant an isomorphism of V with itself. (a) Show that the correspondence (XI> X 2 , x 3)>-+ (X2, -XI> X3) is an automorphism of p3. (b) Show that the set of all automorphisms of V is a group of transformations on V. 15. An automorphism of p2 carries (1,0) into (0,1) and (0,1) into (-1, -1). What is its order? Does your answer depend on the base field? *16. Establish a one-one correspondence between the automorphisms (ct. Ex. 14) of a finite-dimensional vector space and its ordered bases. How many automorphisms does ~ have? What about Z/? 7.9. Inner Products Ordinary space is a three-dimensional vector space over the real field; it is R3. In it one can define lengths of vectors and angles between vectors (including right -angles) by formulas which generalize very nicely not only to R n , but even to infinite-dimensional real vector spaces (see Example 2 of §1.1O). This generalization will be the theme of §§7.9-7.11. To set up the relevant formulas, one needs an additional operation. The most convenient such operation for this purpose is that of forming inner products. By the "inner product" of two vectors ~ = (XI> ••• ,xn ) and 1'/ = (Yt. ... , Yn), with real components, is meant the quantity (37) §7.9 199 Inner Products (Since this is a scalar, physicists often speak of our inner product as a "scalar product" of two vectors.) 
Inner products have four important properties, which are immediate consequences of the definition (37):

(38) (ξ + ξ′, η) = (ξ, η) + (ξ′, η),   (cξ, η) = c(ξ, η),   (ξ, η) = (η, ξ);

(39) (ξ, ξ) > 0 unless ξ = 0.

The first two laws assert that inner products are linear in the left-hand factor; the third is the symmetric law, and gives with the first two the linearity of inner products in both factors (bilinearity); the fourth is that of positiveness. Thus, the Cartesian formula for the length (also called the "absolute value" or the "norm") |ξ| of a vector ξ in the plane R² gives the length as the square root of an inner product,

(40) |ξ| = (x1² + x2²)^(1/2) = (ξ, ξ)^(1/2).

A similar formula is used for length in three-dimensional space. Again, if α and β are any two vectors, then for the triangle with sides α, β, γ = β − α (Figure 2), the trigonometric law of cosines gives

|β − α|² = |α|² + |β|² − 2|α| · |β| · cos C,   (C = ∠(α, β)).

But by (38) and (40),

|β − α|² = (β − α, β − α) = (β, β) − 2(α, β) + (α, α).

Combining and canceling, we get

(41) cos ∠(α, β) = (α, β)/(|α| · |β|).

In words, the cosine of the angle ∠(α, β) between two vectors α and β is the quotient of their inner product by the product of their lengths. It follows that α and β are geometrically orthogonal (or "perpendicular") if and only if the inner product (α, β) vanishes.

In view of the ease with which the concepts of vector addition and scalar multiplication extend to spaces of an arbitrary dimension over an arbitrary field, it is natural to try to generalize the concepts of length and angle similarly. When we do this, however, we find that although the dimension number can be arbitrary, trouble arises with most fields. Inner products can be defined by (37); but lengths

(42) |ξ| = (x1² + ⋯ + xn²)^(1/2)

are not definable unless every sum of n squares has a square root. The same applies to distances, while angles cause even more trouble.
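For vectors over R, the inner product (37) and the angle formula (41) can be evaluated directly. A brief illustrative sketch (not from the text; the vectors are chosen only for the example):

```python
import math

def inner(x, y):
    """Inner product (37): sum of products of corresponding components."""
    return sum(a * b for a, b in zip(x, y))

def length(x):
    """Length as in (40)/(42): the square root of (x, x)."""
    return math.sqrt(inner(x, x))

a, b = (1, 0, 1), (0, 1, 1)
cos_angle = inner(a, b) / (length(a) * length(b))   # formula (41)
print(cos_angle)                                    # 0.5, so the angle is 60 degrees
assert inner((1, 2), (2, -1)) == 0                  # vanishing inner product: orthogonal
```

Note that `length` already illustrates the trouble with other fields: the value √2 of |(1, 0, 1)| exists in R but not in Q.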
For these reasons we shall at present confine our discussion of lengths, angles, and related topics to vector spaces over the real field. In §9.12 the correspondiRg notions for the complex field will be treated. Exerci ... 1. In the plane, show by analytic geometry that the square of the distance between ~ = (x h X2) and 11 = (Yh Y2) is given by 1~ 12 + 11112 - 2(~, 11). 2. Use direction cosines in three-dimensional space to show that two vectors ~ and 11 are orthogonal if and only if (~, 11) = o. 3. If length is defined by the formula (42) for vectors ~ with complex numbers as components, show that there will exist nonzero vectors with zero length. 4. Show that there is a sum of two squares which has no square root in the fields Z3 and Q. 5. Prove formulas (38) and (39) from the definition (37). 6. Prove formulas analogous to (38) asserting that the inner product is linear in the right-hand factor. 7. Prove that the sum of the squares of the lengths of the diagonals of any parallelogram is the sum of the squares of the lengths of its four sides. *8. In R\ define outer products by (a) Prove that (~ x 11, ( x T) = (~, ()(11, T) - (~, T)(11, (). (b) Setting ~ = (,11 = T, infer the Schwarz inequality in R3 as a corollary. (Cf. Theorem 18.) (c) Prove that ~ x (11 x () = (~, ()11 - (~, 11)(. 7.10. Euclidean Vector Spaces Our discussion of geometry without restriction on dimension will be based on the following definition, suggested by the considerations of §7.9. Definition. A Euclidean vector space is a vector space E with real scalars, such that to any vectors ~ and T/ in E corresponds a (real) "inner §7.10 201 Euclidean Vector Spaces product" (~, TI) which is symmetric, bilinear, and positive in the sense of (38) and (39). EXAMPLE 1. . Any R n is an n-dimensional Euclidean vector space if (~, TI) is defined by equation (37). EXAMPLE 2. 
The continuous real functions φ(x) on the domain 0 ≤ x ≤ 1 form an infinite-dimensional Euclidean vector space, if we make the definition

(φ, ψ) = ∫₀¹ φ(x)ψ(x) dx.

The "length" |ξ| of a vector ξ of a Euclidean vector space E may be defined in terms of the inner product as the positive square root (ξ, ξ)^(1/2), the existence of the root being guaranteed by the positiveness condition of (39).

Theorem 18. In any Euclidean vector space, length has the following properties:

(i) |cξ| = |c| · |ξ|.
(ii) |ξ| > 0 unless ξ = 0.
(iii) |(ξ, η)| ≤ |ξ| · |η|   (Schwarz inequality).
(iv) |ξ + η| ≤ |ξ| + |η|   (triangle inequality).

Proof. Since (cξ, cξ) = c²(ξ, ξ), we have (i). Property (ii) is a corollary of the condition of positiveness required in the definition of a Euclidean vector space. The proof of (iii) is less immediate. If ξ = 0 or η = 0, then (iii) reduces to the trivial inequality 0 ≤ 0. Otherwise,

0 ≤ (aξ ± bη, aξ ± bη) = a²(ξ, ξ) ± 2ab(ξ, η) + b²(η, η).

Set a = |η| and b = |ξ|, so that a² = (η, η) and b² = (ξ, ξ). Transposing, we then have

∓2|ξ| · |η|(ξ, η) ≤ 2|ξ|²|η|².

Dividing through by 2|ξ| · |η| > 0, we get (iii). From (iii) we now get (iv) easily, for

|ξ + η|² = (ξ + η, ξ + η) = (ξ, ξ) + 2(ξ, η) + (η, η) ≤ |ξ|² + 2|ξ| · |η| + |η|² = (|ξ| + |η|)².

Now, if we define the distance between any two vectors ξ and η of E as |ξ − η|, we can show that it has the so-called "metric" properties of ordinary distance, first considered abstractly by Fréchet (1906).

Theorem 19. Distance has the properties:

(M1) |ξ − ξ| = 0, while |ξ − η| > 0 if ξ ≠ η.
(M2) Distance is symmetric: |ξ − η| = |η − ξ|.
(M3) |ξ − η| + |η − ζ| ≥ |ξ − ζ|.

Proof. First, |ξ − ξ| = |0| = |0 · ξ| = 0 · |ξ| = 0 by (i), while |ξ − η| > 0 if ξ − η ≠ 0 (that is, ξ ≠ η) by (ii), proving M1. Secondly, |ξ − η| = |(−1)(η − ξ)| = |−1| · |η − ξ| = |η − ξ| by (i), proving M2. Finally, M3 follows from (iv), because |ξ − ζ| = |(ξ − η) + (η − ζ)| ≤ |ξ − η| + |η − ζ|.
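The Schwarz inequality (iii), the triangle inequality (iv), and the metric laws can be spot-checked numerically. An illustrative sketch (not from the text), using the vectors ξ = (1, 2, 3, 4) and η = (0, 3, −2, 1) that appear in the exercises:

```python
import math

def inner(x, y):
    return sum(a * b for a, b in zip(x, y))

def length(x):
    return math.sqrt(inner(x, x))

def dist(x, y):
    return length(tuple(a - b for a, b in zip(x, y)))

xi, eta = (1, 2, 3, 4), (0, 3, -2, 1)
# (iii) Schwarz inequality: |(xi, eta)| <= |xi| * |eta|
assert abs(inner(xi, eta)) <= length(xi) * length(eta)
# (iv) triangle inequality: |xi + eta| <= |xi| + |eta|
s = tuple(a + b for a, b in zip(xi, eta))
assert length(s) <= length(xi) + length(eta)
# M2: distance is symmetric
assert dist(xi, eta) == dist(eta, xi)
```

Here (ξ, η) = 4 while |ξ| · |η| = √30 · √14 ≈ 20.5, so the Schwarz bound holds with plenty of room.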
From Schwarz's inequality, we deduce in particular that for any g, TI not 0, we have -1 <: (g, TI )/1 gI· ITil <: 1. Hence (g, TI Mig I . I Til is the cosine of one and only one angle between 0" and 180°, which we can define as the angle between the vectors g and TI (compare the special case (41». We shall not prove except in the case of right angles that the angles so defined have any properties (could you prove that L(g, TI) + L(TI, () :> L(g, ()?). Two vectors g and TI will be called orthogonal (in symbols, g J. TI) whenever (g, TI) = O. This definition, applied to Example 2 above, yields an instance of the important analytical concept of orthogonal functions. It is easy to prove that if g ..1 TI, then TI ..1 g (the orthogonality relation is "symmetric"), and cg ..1 C'TI for all c, c'. Also, 0 is the only vector orthogonal to itself. Furthermore, whenever (TI, ~l) = ... = (TI, gm) = 0, then for any scalars Cj, (TI,Clgl + ... + cmgm) = Cl(TI,gl) + ... + cm(TI,gm) = Cl . 0 + ... + C m. 0 = 0, so that TI is also orthogonal to every linear combination of the gj. This proves Theorem 20. If a vector is orthogonal to g..... , gm, then it is orthogonal to every vector in the subspace spanned by g..... , gm· Exercises n 1. Set ~ = (1,2, 3, 4), 71 == (0,3, -2, 1). Compute (~, 71), 1 IT/I, L(~, 71)· 2. If ~ and 71 are as in Ex. 1, find a vector of the form (1, 1,0,0) + Cl~ + C271 orthogonal to both ~ and 71· §7.11 203 Normal Orthogonal Bases 3. (a) Are sin 21TX and cos 21TX "orthogonal" in Example 2 of the text? (b) Are sin 2m1Tx and sin 2nm orthogonal? (c) Find a polynomial of degree two orthogonal to 1 and x. 4. Prove that It" - '1112 + It" + '1112 = 2(1 t" 12 + 1'1112). 5. Prove that in R\ there are precisely two vectors of length one perpendicular to two given linearly independent vectors. 6. Prove that there is a vector with rational coordinates in R3 perpendicular to any two given vectors with rational coordinates. 7. 
If a and P "" 0 are fixed vectors of a Euclidean vector space, find the shortest vector of the form 'Y = a + tp. Is this orthogonal to P? Draw a figure. *8. If a is equidistant from p and 'Y, prove that the midpoint of the segment P'Y is the foot of the perpendicular from a to P'Y. 9. Prove that if It" I = Ia I in a Euclidean vector space, then t" - a .L t" + a. Interpret this geometrically. *10. (a) Show that the discriminant B2 - 4AC of the quadratic equation (t", t")t 2 + 2(t", 'I1)t + ('11, '11) = j It" + '1112 = 0 is four times (t", '11? - (t", t")( '11, '11). (b) Using this fact, prove the Schwarz inequality. (Hint: IIt" + '111 = 0 cannot have two distinct real solutions t unless t" = 0.) 11. Prove II t" I - I'1111 < It" - '111, in any Euclidean vector space. 12. Show that R3 becomes' a Euclidean vector space if inner products are defined by (t", '11) = (Xl + xz)(Y. + Y2) + X2Y2 + (X2 + 2X3)(Y2 + 2Y3)' 7.11. Normal Orthogonal Bases In Example 1 of §7.10, the "unit vectors" E. = (1,0, ... ,0), ... , En .:.... (0,0, ... , 1) have unit length and are mutually orthogonal. This is an instance of what is called a "normal orthogonal basis." Definition. Vectors at. ... , an, are called normal orthogonal when (i) Iai I = 1 for all i, (ii) ai ..1 aj if i ¥ j. Lemma 1. Nonzero orthogonal vectors at.···, am of a Euclidean vector space E are linearly independent. Proof. o= + ... + Xmam = 0, then for k = 1,· .. , m (0, ak) = x.(at. ak) + ... + xm(a m, ak) = xdak> ak), If x.a. where the last equality comes from the orthogonality assumption. But ak ¥ 0 by assumption; hence (ak> ak) > 0 and Xk = O. Q.E.D. 204 Ch. 7 Vectors and Vector Spaces Corollary. Normal orthogonal vectors spanning E are a basis for E (a so-called "normal orthogonal basis"). We shall now show how to orthogonalize any basis of a Euclidean vector space, using only rational operations. This is called the GramSchmidt orthogonalization process. Lemma 2. From any finite sequence of independent vectors 'YI"· .. 
, 'Ym of a finite-dimensional Euclidean vector space E, an orthogonal sequence of nonzero vectors (44) ai = 'Yi - L dik'Yk (i = 1,· .. , m) k <i can be constructed, which spans the same subspace of E as the sequence 'YI> ..• , 'Ym· Proof. By induction on m, we can assume that orthogonal nonzero vectors al , · . . ,am-I have been constructed which span the same subspace S as 'VI> .. • , 'Vm- I. We now split 'Ym into a part Pm "parallel" to S, and a part am perpendicular to S. To do this, set (44') am = 'Ym - L where Cmkak, k <m Then for j = 1,· .. ,m - 1, we have m-I (am, aj) :; ('Ym, L aj) - Cmk (ak, aj) = 0, k=1 ,I since (ak' a) = 0 if k ¥ j by orthogonality, while cmj(aj, aj) = ('Ym, aj) byl (44'). Substituting in (44), we have by induction on m, am = 'Ym - L k <m Cmkak = 'Ym - L Cmk'Yk k<m + L : Cmkdkj'Yj· j<k < m This proves (44), with d mk = Cmk - L k <j < m cmjdjk . Since 'Ym is not dependent on 'Yh· •. ,'Ym-I> it cannot be in S; hence: am ¥ O. Finally, 'Yt. •.. , 'Ym and at.· · ·, am both span the subspace~ spanned by Sand 'Ym. This completes the proof of Lemma 2. §7.11 205 Normal Orthogonal Bases Theorem 21. Every set 'Yl, •.. , 'Ym of normal orthogonal vectors of a finite-dimensional Euclidean vector space E is part of a normal orthogonal basis. By Theorem 6, the 'Yj are part of a basis 'Yi, .•• ,'Yn of E. This basis may be orthogonalized by Lemma 2, and then normalized by setting Pi = aJI aj I; the process will not change the original vectors 'Yl, ... , 'Ym. Proof. Corollary. Any finite-dimensional Euclidean vector space E has a normal orthogonal basis. The Gram-Schmidt orthogonalization process has other implications. Thus let S be any m-dimensional subspace of a Euclidean vector space E; as above, S has a normal orthogonal basis ah ... ,am. If 'Y is any vector not in S, the process represents 'Y as the sum 'Y = a + P of a component P in S and a component a perpendicular to every vector of S. The vector P is called the orthogonal projection of 'Y on S. 
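Lemma 2's construction is the Gram–Schmidt process: from each γ_m, subtract its components along the α_k already found, with the coefficients c_mk = (γ_m, α_k)/(α_k, α_k) determined as in (44′). A sketch in exact rational arithmetic (illustrative, not from the text; normalization is left out, since it introduces square roots and so leaves Q):

```python
from fractions import Fraction

def inner(x, y):
    return sum(a * b for a, b in zip(x, y))

def gram_schmidt(gammas):
    """Orthogonalize a sequence of independent vectors, as in (44')."""
    alphas = []
    for g in gammas:
        g = [Fraction(a) for a in g]
        for a in alphas:
            c = inner(g, a) / inner(a, a)      # c_mk = (gamma_m, a_k)/(a_k, a_k)
            g = [gi - c * ai for gi, ai in zip(g, a)]
        alphas.append(g)
    return alphas

# Exercise 1(a): orthogonalize (1,1,0,0), (0,1,2,0), (0,0,3,4)
A = gram_schmidt([(1, 1, 0, 0), (0, 1, 2, 0), (0, 0, 3, 4)])
for i in range(len(A)):
    for j in range(i):
        assert inner(A[i], A[j]) == 0          # pairwise orthogonal
print(A)
```

Each α_m spans, together with its predecessors, the same subspace as γ_1, ..., γ_m; dividing each α_m by its length (over R) then yields a normal orthogonal basis, as in Theorem 21.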
We shall conclude by determining all inner products on a given (real) finite-dimensional vector space V. Clearly, if ah .•• ,an is any basis for V, then for any vectors g = Xlal + ... + Xnan and 7J = Ylal + ... + Ynan, we have by bilinearity (45) Thus, the inner product of any two vectors is determined by the n 2 real constants (aj, ak) = ajk as a certain "bilinear" form L aikXjYk in the i.k coordinates Xj and Yk. Because (aj, ak) = (ak> a;), this form is called "symmetric. " Conversely, any symmetric bilinear form L aikXjYk (ajk = aki) in F" i,k satisfies the first three conditions of (38) and (39). The fourth condition is that the quadratic form L ajkXjXk be "positive definite"-i.e., be positive unless every Xj = O. An algorithm for determining when a square matrix is positive definite will be derived in §9.9. Relative to a normal orthogonal basis, we have (ai, ak) = 0 if i ~ k, and (aj, a;) = 1; hence (45) reduces to (46) n (g,77) = L j~l XiYi = XIYl + ... + XnYn· This formula enables us to conclude with Theorem 22. Relative to any normal orthogonal basis, an "abstract" inner product assumes the "concrete" form (46). 206 Ch. 7 Vectors and Vector Spaces Thus every finite-dimensional Euclidean vector space is isomorphic to some R". Exercises 1. Find normal orthogonal bases for the subs paces of Euclidean four-space spanned by: (a) (1, 1, 0, 0), (0, 1, 2, 0), and (0, 0, 3, 4); (b) (2, 0, 0, 0), (1, 3, 3, 0), and (0, 4, 6, 1). (Hint: First find orthogonal bases, then normalize.) 2. Draw a figure to illustrate the orthogonal projection of a vector on a one-dimensional subspace. 3. Find the orthogonal projection of {3 = (2, 1,3) on the subspace spanned by a = (1,0, 1). 4. Find the orthogonal projection of (3 = (0,0,0,3) on each of the subspaces of Ex. 1. 5. Let S be any subspace of a Euclidean vector space E. Show that the set S"- of all vectors orthogonal to every ~ in S is a subspace satisfying S + S"- = E, and d[S) + d[S"-) = d[E). 
(The subspace S"- is called the orthogonal complement of S.) 6. Find a basis for the orthogonal complement of the subspace spanned by (2, -1, -2) in Euclidean three-space. 7. Find bases for the orthogonal complements of each of the subspaces of Ex. 1. *8. (a) Exhibit a nontrivial subspace of Q3 which does not contain any vector of unit length. (b) State and prove an analogue of Lemma 2 which is valid for vector spaces with scalars in any ordered field. 7.12. Quotient-spaces We shall now show that the construction of quotient-groups in §6.13 has an easy extension to vector spaces. Let V be any vector space over .a field F, and let S l]e any subspace of V. Under addition, V is a commutative group, and S is a (necessarily normal) subgroup of V. Hence we can form the additive quotient-group VIS. For example, in Euclidean space R 3 , let S consist of the multiples (0, y, 0) of the unit vector (0,1,0). Then the coset of any vector a = (a, b, c) will consist of the vectors (a, b + y, c) having the same xcoordinate a and z -coordinate c as a; they are the vectors (a, ., c), where the dot stands for an arbitrary entry. The sum (a,·, c) + (a ' ,·, c /) of two such vectors in the quotient-group R3 IS is clearly (a + a ' ,·, c + c'). 12 Quotient-Spaces In this example. there exists a quotient-space X = VIS and an epimorphism P: V is S and whose range is X. Now define the sum of two cosets to be the coset (a + S) + (f3 + S) = (a + (3) + S. we can paraphrase the discussion of §6. and it is evident that the quotientgroup R3 ISis a (real) vector space under these operations. If S is a one-dimensional subspace of the space R 3 . Two cosets a + Sand f3 + S are equal (as sets) if and only if (a . a and f3 represent (are members of) the same coset.·. Geometrically. the different cosets of a subspace S are just its "parallel subspaces" under translation.13.13 to obtain a quotient-space VIS = X. 
then P is an epimorphism of vector spaces with kernel exactly S and range all of VIS.product also does not depend on the choice of the representative of the given coset. the elements of the quotientgroup GIN are simply the cosets xN of N in G. Next.(3) E S implies (ca . this sum does not depend on the choice of the representatives a and f3. called the quotient-space of V by S. Given a vector space V over a field F. It is readily verified that these two definitions make the set VIS of all the cosets of S in V into a vector space. V determines a coset of S. define the product of a coset a + S by a scalar c to be the coset c(a + S) = ca + S. Since (a .·. when this holds. ~ X whose kernel Exercises 1. given a subspace S of the vector space V. tc). call it a "representative" of the coset. c) by any scalar t E R to get the new coset (ta. Recall that for any group G and (normal) subgroup N. this. each vector a E. Thus a = a + 0 is one of the vectors in this coset.(3) E S. . we have thus proved Theorem 23.c(3) E S. This transformation P is called the canonical projection of V onto its quotient-space. defined as the set a + S of all sums a + u for variable u E S. Moreover. if the function P is defined by aP = a + S. as in Lemma 2 of §6. show that the cosets of S are the lines parallel to S. Given any subspace S of a vector space V. We shall now show that a similar construction is possible in general.207 §7. we can also multiply each vector (a. Hence. shows at once that Of = O. (b) For F = R. a linear function f is one which preserves linear combinations. z) and (x'. TI E V. and they apply to infinite-dimensional vector spaces (e. *7. 1. y'.. describe S and its cosets geometrically. (a) Show that two vectors (x. and let cP: f(x) >-+ ![f(x) + f(-x)].e.xn ) of the finitedimensional vector space V = F" is a polynomial function of the special form where the Ci terms are arbitrary constants in the field F. Prove in detail that. 4. The first identity. 
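For the example in the text — S the set of multiples of (0, 1, 0) in R³ — each coset (a, ·, c) is determined by the pair (a, c), and the canonical projection P simply forgets the y-coordinate. A minimal sketch of this concrete case (illustrative, not from the text):

```python
def P(v):
    """Canonical projection R^3 -> R^3/S, with S the y-axis.

    The coset (a, ., c) of v = (a, b, c) is represented by the pair (a, c)."""
    a, _, c = v
    return (a, c)

u, w = (1.0, 5.0, 2.0), (4.0, -3.0, 7.0)
u_plus_w = tuple(x + y for x, y in zip(u, w))
# P preserves sums: (a, ., c) + (a', ., c') = (a + a', ., c + c')
assert P(u_plus_w) == (P(u)[0] + P(w)[0], P(u)[1] + P(w)[1])
# the kernel of P is exactly S: the vectors sent to the zero coset
assert P((0.0, 9.0, 0.0)) == (0.0, 0.0)
```

Vectors differing by an element of S (here, in their y-coordinates only) have the same image, so the value of P depends only on the coset, not on the representative chosen.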
(b) Describe its kernel S and the quotient-space VIS. The two identities imply the combined identity (49) ~. a (homogeneous) "linear function" of the coordinates Xi of a variable vector ~ = (Xl> ••• . For V = F\ F any field. this one identity yields the first identity of (48). 3. let S be the subspace spanned by (1.13. TI in V and any scalar a in F.. 7 Vectors and Vector Spaces 2. they do not depend on the choice of a basis in V). The preceding identities have two advantages over the definition by formula (47): they are intrinsic (i. Conversely. y. for a = b :::: 1. Linear Functions and Dual Spaces In elementary algebra. 5. Z/) are in the same coset of S if and only if x + y' = x' + y. hence Of = 0. (a) Show that cP is a homomorphism of vector spaces. for b = O. b E F. Prove that if S is a subspace of V = F" that is isomorphic to F m . for any vectors ~. the cosets of any subspace S of a vector space V do form a vector-space. Let V = R[x] be the space of all real polynomials f(x). .g. One easily verifies that any such function f satisfies the identities· (48) (a~)f = a(lj). and hence the second identity of (48). a. under the operations displayed in the text. Briefly.208 Ch. 1). with TI = 0.0) and (1. We shall therefore define a linear function f on any vector space V over any field F as a function from V to F which satisfies the two identities (48). then VIS is isomorphic to p-m. to function spaces). 1. By induction on n.. For any vector space V... each Xj in Theorem 14 is a linear function of g.)cj = a L XjCj + b L YjCj = a(lj) + b(Tlf). . One verifies readily that f + g and fc are again linear functions on V... This function is linear..§7.. cn) of coefficients in the formula (47). each g has by Theorem 14 a unique expression g = Xlf31 + . then there is one and only one linear function f on V with f3J = Cj. 
define the sum f + g of two linear functions f and g to be the function given by the equation

(51)  ξ(f + g) = ξf + ξg  for all ξ in V,

and the product fc of the linear function f by a scalar c to be the function given by the equation

(52)  ξ(fc) = (ξf)c  for all ξ in V.

The concept of "linear function" just defined is virtually equivalent to that of "coordinate" introduced in §7.8. Indeed, the linear functions on F^n are exactly the functions given by the linear expressions (47); (47) gives that function f which takes the value c_i at the unit vector ε_i of F^n.

Theorem 24. If β1, ..., βn is a basis of the vector space V over F, and if c1, ..., cn are n constants in F, then there is one and only one linear function f on V with βi f = ci, i = 1, ..., n. This function f is given by the formula

(50)  (x1β1 + ··· + xnβn)f = x1c1 + ··· + xncn.

Proof. Every vector ξ of V can be written in one and only one way as ξ = x1β1 + ··· + xnβn; equation (50) therefore defines a single-valued function. This function is linear, because for any ξ = x1β1 + ··· + xnβn and η = y1β1 + ··· + ynβn,

(aξ + bη)f = (Σ (axi + byi)βi)f = Σ (axi + byi)ci = a(ξf) + b(ηf),

so that condition (49) is satisfied. Conversely, equation (50) follows directly from (49) for any linear function f with βi f = ci; hence f is unique. Q.E.D.

Each linear function is thus determined uniquely by the n-tuple (c1, ..., cn) of its values at a basis; this suggests that the linear functions themselves form a vector space.

Theorem 25. If V is a vector space over F, the set V* of all linear functions on V is also a vector space over F, under the operations f + g and fc defined by (51) and (52).

Proof. The proof requires only that we verify that the axioms for a vector space hold for the operations f + g and fc. For example, to prove the distributive law (f + g)c = fc + gc, observe that for any ξ in V,

(53)  ξ[(f + g)c] = [ξ(f + g)]c = [ξf + ξg]c = (ξf)c + (ξg)c = ξ(fc) + ξ(gc) = ξ(fc + gc).

This equation states that the functions (f + g)c and fc + gc have the same value for any argument ξ; hence they are equal. The proof of the other axioms is similar.

This space V* of linear functions on V is called the dual or conjugate vector space to V; it is fundamental in modern mathematics. The following result is "dual" to Theorem 14, in a sense which will be made precise shortly.

If the vector space V has a finite basis β1, ..., βn, then its dual space V* has a basis f1, ..., fn, consisting of the n linear functions fi defined by

(x1β1 + ··· + xnβn)fi = xi,    i = 1, ..., n.

By Theorem 24, these n linear functions fi are uniquely determined by the formulas

(54)  βi fj = 0 if i ≠ j,    βi fj = 1 if i = j.

For n given scalars c1, ..., cn, the linear combination f = f1c1 + ··· + fncn is a linear function, and by (54) its value at any basis vector βi is βi(f1c1 + ··· + fncn) = ci. It follows that the functions f1, ..., fn span V*: any linear function f is determined, by Theorem 24, by its values βi f = ci, and hence f is equal to the combination Σ fj cj formed with these values as coefficients. It also follows that the n linear functions f1, ..., fn are linearly independent in V*, for if f1c1 + ··· + fncn = 0, then its value ci at each βi is zero; hence c1 = c2 = ··· = cn = 0. We thus obtain two corollaries.

Corollary 1. The dual V* of an n-dimensional vector space V has the same dimension n as V.

Corollary 2. If βi f = 0 for each i = 1, ..., n, then f = 0.

The basis f1, ..., fn is called the basis of V* dual to the given basis β1, ..., βn.

If ξ is a vector in V and f a vector in the dual space V*, one can also write the value of f at the argument ξ in the symmetric "inner product" notation (ξ, f). Equation (49) then becomes

(55)  (cξ + dη, f) = c(ξ, f) + d(η, f),

while the definitions (51) and (52) of addition and scalar multiplication become

(56)  (ξ, fc + gd) = (ξ, f)c + (ξ, g)d.

The similarity of these two equations suggests another interpretation: hold ξ fixed and let f vary. Then each ξ in V determines a function F_ξ on the dual space V*, defined by F_ξ(f) = (ξ, f); and (56) states that F_ξ is a linear function on V*.
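The dual-basis formulas (50) and (54) lend themselves to a quick numerical check. The following sketch is not from the text; the basis b1, b2 of F^2 and the scalars c1, c2 are invented for illustration. Expanding ξ = x1·b1 + x2·b2 for b1 = (1, 1), b2 = (0, 1) gives x1 = x and x2 = y − x, so those coordinate functions are the dual basis.

```python
# Hypothetical illustration: the dual basis for b1 = (1, 1), b2 = (0, 1).
b1, b2 = (1, 1), (0, 1)

def f1(v): return v[0]           # coefficient of b1 in the expansion of v
def f2(v): return v[1] - v[0]    # coefficient of b2 in the expansion of v

# Formula (54): b_i f_j = 0 if i != j, 1 if i = j.
assert f1(b1) == 1 and f2(b1) == 0
assert f1(b2) == 0 and f2(b2) == 1

# Theorem 24: the unique linear function with values c1, c2 at b1, b2
# is the combination f = f1*c1 + f2*c2.
c1, c2 = 5, -3
f = lambda v: f1(v) * c1 + f2(v) * c2
assert f(b1) == c1 and f(b2) == c2
```

The same assertions fail for a non-dual pair of functionals, which is one way to see that (54) pins the fi down uniquely.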
The transformation T: V → V* which maps each vector Σ xiβi of V into the function Σ fixi of V* preserves vector addition and scalar multiplication, and is an isomorphism of V onto V*. This isomorphism, however, depends upon the choice of the basis in V.

Theorem 26. Any finite-dimensional vector space V is isomorphic to its second conjugate space (V*)*, under the correspondence mapping each ξ in V onto the function F_ξ defined by F_ξ(f) = (ξ, f).

Proof. By (55), the correspondence T: ξ → F_ξ preserves vector addition and scalar multiplication; that is, the vector operations on the functions F_ξ correspond exactly to the vector operations on the original vectors ξ. We now show that T is one-one. If ξ ≠ η, then ζ = ξ − η ≠ 0, so ζ is part of a basis of V, and by Theorem 24 there is a linear function f0 in V* with ζf0 = 1 ≠ 0. Hence F_ξ(f0) ≠ F_η(f0), and so F_ξ ≠ F_η. This proves that T is one-one, hence an isomorphism of V into (V*)*. But by Corollary 1 of Theorem 25, applied twice, V and (V*)* have the same dimension; hence T is onto, hence an isomorphism. Q.E.D.

This isomorphism ξ → F_ξ, unlike that between V and V* described above, is "natural" in that its definition does not depend upon the choice of a basis in V.

With any subspace S of V we associate the set S' consisting of all those linear functions f in V* such that (σ, f) = 0 for every σ in S. We call S' the annihilator of S. It is clearly a subspace of V*, for (σ, f) = 0 and (σ, g) = 0 imply (σ, fc + gd) = (σ, f)c + (σ, g)d = 0.

Theorem 27. If S is a k-dimensional subspace of the n-dimensional vector space V, then the set S' of all linear functions f annihilating S is an (n − k)-dimensional subspace of V*.

Proof. Choose a basis β1, ..., βk of S and extend it, by Theorem 6, to a basis β1, ..., βn of V. In the dual basis f1, ..., fn of V*, a combination f1c1 + ··· + fncn vanishes on all of S if and only if it vanishes at each of β1, ..., βk, that is, if and only if c1 = ··· = ck = 0. This means precisely that the n − k functions fk+1, ..., fn form a basis of the annihilator S'. Q.E.D.

Theorem 27 is just a reformulation of Theorem 13, about the number of independent solutions of a system of homogeneous linear equations. The correspondence S → S' between subspaces of V and their annihilators in V* has the property that

(57)  S ⊂ T implies S' ⊃ T'  (inclusion is reversed).

For if f ∈ T', then (σ, f) = 0 for every σ in T, and hence for every σ in S ⊂ T; thus f ∈ S'.

Theorem 28. The correspondence S → S' satisfies

(58)  (S')' = S.

Proof. Since (ξ, f) = 0 for all ξ in S and all f in S', each ξ in S annihilates every vector f in S'; hence ξ ∈ (S')', and S ⊂ (S')'. But by Theorem 27, applied twice, the dimension of (S')' is n − (n − k) = k = d[S]; therefore (S')' larger than S is impossible, and (S')' = S. Q.E.D.

This equation states that the correspondence S → S' of a subspace to its annihilator, when applied twice, is the identity correspondence; hence this correspondence has an inverse and is one-one onto. Because it also inverts inclusion by (57), it carries S + T, the smallest subspace containing S and T, into the largest subspace S' ∩ T' contained in S' and T'; that is, (S + T)' = S' ∩ T', and dually (S ∩ T)' = S' + T'. The annihilator of the subspace consisting of 0 alone is the whole dual space V*, and the annihilator of V is the subspace of V* consisting of the zero function alone. The correspondence S → S' leads to the Duality Principle of n-dimensional projective geometry, in which connection the preceding properties are basic.

Corollary 1. Dually, each subspace R of the conjugate space V* determines as its annihilator the subspace R' of V, consisting of all ξ in V with (ξ, f) = 0 for every f in R; and (R')' = R.

Proof. The arguments leading to Theorem 27 and (58) can be repeated to give the desired result.
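The dimension count of Theorem 27 can be spot-checked by brute force over a small finite field. The sketch below is my own illustration, not from the text: it takes F = GF(2), represents a linear function on F^3 by its coefficient triple (c1, c2, c3), and counts the functions annihilating a one-dimensional subspace S. Since dim S' = 3 − 1 = 2, the annihilator should contain 2² = 4 functions.

```python
# Hypothetical check of Theorem 27 over GF(2) = {0, 1}.
from itertools import product

def apply(xi, c):   # value of the linear function c at the vector xi
    return sum(x * y for x, y in zip(xi, c)) % 2

S = [(0, 0, 0), (1, 0, 1)]   # the subspace spanned by (1, 0, 1); k = 1
annihilator = [c for c in product((0, 1), repeat=3)
               if all(apply(xi, c) == 0 for xi in S)]

# dim S' = n - k = 2, so S' has 2**2 = 4 elements over GF(2).
assert len(annihilator) == 4
```

Repeating the count with S spanned by two independent vectors gives 2 annihilating functions, matching n − k = 1.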
Remark 1. In the case of a finite-dimensional Euclidean vector space E, there is a natural isomorphism from E to its dual E*, which can be defined in terms of the intrinsic inner product (ξ, η). The formula ξf_η = (ξ, η) defines, for each vector η in E, a function f_η on E, which is linear since (ξ, η) is bilinear. The correspondence η → f_η can easily be shown to be an isomorphism of E onto E*.

Remark 2. The isomorphism of V to V* does not in general hold for an infinite-dimensional space V. For example, let V be the vector space of all sequences ξ = (x1, x2, ...), with all xi in F and only a finite number of nonzero entries, addition and scalar multiplication being performed termwise. Any linear function on V can still be represented in the form ξf = Σ xjcj, for an arbitrary infinite list of coefficients γ = (c1, c2, ...); hence the dual space V* consists of all such infinite sequences. The spaces V and V* are not isomorphic: for example, if F is a countable field, then V is countable but V* is not. A complete proof would, however, require an appeal to more advanced concepts.

Exercises

1. Let f1, ..., fn be n linearly independent linear functions on an n-dimensional vector space V, and c1, ..., cn given constants. Show that there is one and only one vector ξ in V with ξfi = ci, i = 1, ..., n. Interpret in terms of nonhomogeneous linear equations.

2. Prove (57) and (58).

3. Let any fixed basis β1, ..., βn be chosen in V. For any subspace S of V, let S' be the set of all vectors η = y1β1 + ··· + ynβn such that

(59)  x1y1 + ··· + xnyn = 0  for all ξ = x1β1 + ··· + xnβn in S.

(a) Complete the proof in Remark 1. (b) Show the connection with Corollary 1 of Theorem 25.

4. In F^4, define (ξ, η) = x1y2 − y1x2 + x3y4 − y3x4. For any subspace S, define S' as the set of all vectors η with (ξ, η) = 0 for all ξ in S, and show that if S is one-dimensional, then S ⊂ S'.

5. Let L(V) be the set of all subspaces of a finite-dimensional vector space V over a field. Show that there is a one-one correspondence of L(V) onto itself which inverts inclusion and satisfies (58).
6. Complete the proof of Theorem 25.

8
The Algebra of Matrices

8.1 Linear Transformations and Matrices

There are many ways of mapping a plane into itself linearly, that is, so that any linear combination of vectors is carried into the same linear combination of the transformed vectors. Symbolically this means that

(1)  (cξ + dη)T = c(ξT) + d(ηT).

Equivalently, it means that T preserves sums and scalar products, in the sense that

(2)  (ξ + η)T = ξT + ηT,    (cξ)T = c(ξT).

For example, consider the (counterclockwise) rigid rotation R_θ of the plane about the origin through an angle θ. It is clear geometrically that R_θ transforms the diagonal ξ + η of the parallelogram with sides ξ and η into the diagonal ξR_θ + ηR_θ of the rotated parallelogram with sides ξR_θ and ηR_θ; this is illustrated in Figure 1. That is, (ξ + η)R_θ = ξR_θ + ηR_θ. Likewise, if c is any real scalar, the multiple cξ of ξ is rotated into c(ξR_θ), so that (cξ)R_θ = c(ξR_θ). Hence any rigid rotation of the plane is linear; the same considerations apply to rotations of space about any axis.

Again, consider a simple expansion D_k of the plane away from the origin, under which each point is moved radially to a position k times its original distance from the origin; symbolically,

(3)  ξD_k = kξ  for all ξ.

This transformation again carries parallelograms into parallelograms, so that (ξ + η)D_k = ξD_k + ηD_k. Moreover, (cξ)D_k = kcξ = ckξ = c(ξD_k); hence D_k is linear. Note that if 0 < k < 1, equation (3) defines a simple contraction toward the origin; if k = −1, it defines reflection in the origin (rotation through 180°). Similar transformations exist in any finite-dimensional vector space F^n.

Likewise, let T be the transformation of R³ which carries each vector ξ = (x1, x2, x3) into a vector η = (y1, y2, y3) whose coordinates are given by homogeneous linear functions

(4)  yj = aj x1 + bj x2 + cj x3,    j = 1, 2, 3.

Clearly T is linear. Thus the transform ζ of the sum ξ + ξ' of ξ and the vector ξ' = (x1', x2', x3') may be computed by (4) applied to (x1 + x1', x2 + x2', x3 + x3') to have the coordinates

zj = aj(x1 + x1') + bj(x2 + x2') + cj(x3 + x3') = (aj x1 + bj x2 + cj x3) + (aj x1' + bj x2' + cj x3').

This zj is just yj + yj', where the yj are given by (4) and the yj' are the corresponding primed expressions; hence (ξ + ξ')T = ξT + ξ'T. Also, if the xi are all multiplied by the same constant d, then so are the yj in (4), so that (dξ)T = dη = d(ξT).

Conversely, any linear transformation T of R³ into itself is of the form (4). To see this, denote the transforms of the unit vectors ε1 = (1, 0, 0), ε2 = (0, 1, 0), ε3 = (0, 0, 1) by ε1T = α, ε2T = β, ε3T = γ. Then T must carry each ξ = (x1, x2, x3) = x1ε1 + x2ε2 + x3ε3 into

η = ξT = (x1ε1 + x2ε2 + x3ε3)T = x1(ε1T) + x2(ε2T) + x3(ε3T) = x1α + x2β + x3γ,

whose jth coordinate is a homogeneous linear function of x1, x2, x3. Hence, if T is linear, it has the form (4).

The same idea yields the equations of particular plane transformations. Consider the counterclockwise rotation R_θ about the origin through the angle θ. The very definition of the sine and cosine functions shows that the unit vector ε1 = (1, 0) is rotated into (cos θ, sin θ), while the unit vector ε2 = (0, 1) is rotated into (cos (θ + π/2), sin (θ + π/2)) = (−sin θ, cos θ), so that the equations for R_θ are

(5)  R_θ:  x' = x cos θ − y sin θ,   y' = x sin θ + y cos θ.

Likewise, reflection F_α in a line through the origin making an angle α with the x-axis carries the point whose polar coordinates are (r, θ) into the point with polar coordinates (r, 2α − θ). Hence the effect of F_α is expressed by

(5')  F_α:  x' = x cos 2α + y sin 2α,   y' = x sin 2α − y cos 2α.

The concept of linearity also applies more generally, to transformations between any two vector spaces over the same field.

Definition. A linear transformation T: V → W, of a vector space V to a vector space W over the same field F, is a transformation T of V into W which satisfies (cξ + dη)T = c(ξT) + d(ηT) for all vectors ξ and η in V and all scalars c and d in F.

For example, consider the transformation

(6)  T1: (x, y) → (x + y, x − y, 2x) = (x', y', z'),

defined by the equations x' = x + y, y' = x − y, z' = 2x. It is linear, and transforms the plane linearly into a subset of space; it carries the plane vectors (1, 0) and (0, 1) into the orthogonal space vectors (1, 1, 2) and (1, −1, 0), respectively.
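The linearity condition (1) for the rotation (5) can be spot-checked numerically. The following sketch is my own illustration (the test vectors and scalars are arbitrary, not from the text):

```python
import math

# The rotation R_theta of (5), as a function on R^2.
def rotate(v, theta):
    x, y = v
    return (x * math.cos(theta) - y * math.sin(theta),
            x * math.sin(theta) + y * math.cos(theta))

# Check (c*xi + d*eta)R = c*(xi R) + d*(eta R) at one arbitrary sample.
theta = 0.7
xi, eta, c, d = (1.0, 2.0), (-3.0, 0.5), 2.0, -1.5
lhs = rotate((c * xi[0] + d * eta[0], c * xi[1] + d * eta[1]), theta)
rx, ry = rotate(xi, theta), rotate(eta, theta)
rhs = (c * rx[0] + d * ry[0], c * rx[1] + d * ry[1])
assert all(abs(a - b) < 1e-12 for a, b in zip(lhs, rhs))
```

One sample does not prove linearity, of course; the proof is the geometric parallelogram argument above.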
The finite-dimensional case is most conveniently treated by means of the following principle.

Theorem 1. If β1, ..., βm is any basis of the vector space V, and α1, ..., αm are any m vectors in W, then there is one and only one linear transformation T: V → W with β1T = α1, ..., βmT = αm. This transformation is defined by

(7)  (x1β1 + ··· + xmβm)T = x1α1 + ··· + xmαm.

Proof. Since every vector in V can be expressed uniquely as x1β1 + ··· + xmβm, formula (7) defines a single-valued transformation T of V into W. If T is linear and βiT = αi (i = 1, ..., m), then the definition (1) and induction give the explicit formula (7); hence there can be no other linear transformation of V into W with βiT = αi. To show that T is linear, let η = Σ yiβi be a second vector of V. Then, for any scalars c and d,

(cξ + dη)T = (Σ (cxi + dyi)βi)T = Σ (cxi + dyi)αi = c Σ xiαi + d Σ yiαi = c(ξT) + d(ηT).

Hence T is linear. Q.E.D.

For example, let β1 = (1, 0) and β2 = (0, 1) be the unit vectors of the plane, and let α1 = (1, 0) and α2 = (a, 1). Then Theorem 1 asserts that the horizontal shear transformation

(8)  S_a: (x, y) → (x + ay, y)

is linear and is the only linear transformation satisfying β1S_a = α1, β2S_a = α2. Geometrically, each point is moved parallel to the x-axis through a distance proportional to its altitude above the x-axis, and rectangles with sides parallel to the axes go into parallelograms. (See Figure 2, and picture this with a deck of cards!)

If V = F^m and W = F^n, we obtain a very important application of Theorem 1. In this case we let the βi be the unit vectors ε1 = (1, 0, ..., 0), ..., εm = (0, ..., 0, 1) of F^m, and we give each αi its coordinate representation

(9)  ε1T = α1 = (a11, a12, ..., a1n),
     ε2T = α2 = (a21, a22, ..., a2n),
     ···
     εmT = αm = (am1, am2, ..., amn).

Theorem 1 states that there is just one linear transformation associated with the formula (9): the linear transformation carrying each vector ξ = (x1, ..., xm) = x1ε1 + ··· + xmεm of F^m into the vector

ξT = x1α1 + ··· + xmαm = (x1a11 + ··· + xmam1, ..., x1a1n + ··· + xmamn)

in W = F^n. This transformation is thus determined by the m × n matrix A = ‖aij‖, which has the row of coordinates (ai1, ..., ain) of εiT as its ith row, and aij as the entry in its ith row and jth column. We denote by T_A the linear transformation of F^m into F^n corresponding to A in this fashion. We have proved

Theorem 2. There is a one-one correspondence between the linear transformations T: F^m → F^n and the m × n matrices A with entries in the field F. Given T, the corresponding matrix A is the matrix with ith row the row of coordinates of εiT; given A = ‖aij‖, T is the unique linear transformation carrying each unit vector εi of F^m into the ith row (ai1, ..., ain) of A.

For example, the rotation, similitude, and shear of (5), (3), and (8) correspond respectively to the matrices

R_θ → | cos θ    sin θ |     D_k → | k  0 |     S_a → | 1  0 |
      | −sin θ   cos θ |           | 0  k |           | a  1 |

Hence, if (y1, ..., yn) are the coordinates of the transformed vector η = ξT, then T is given in terms of these coordinates by the homogeneous linear equations

(10)  y1 = x1a11 + x2a21 + ··· + xmam1 = Σi xi ai1,
      y2 = x1a12 + x2a22 + ··· + xmam2 = Σi xi ai2,
      ···
      yn = x1a1n + x2a2n + ··· + xmamn = Σi xi ain.

Hence we have the

Corollary. Any linear transformation T of F^m into F^n can be described by homogeneous linear equations of the form (10), carrying the vector ξ with coordinates x1, ..., xm into the vector η = ξT with coordinates y1, ..., yn.

Caution. The rectangular array of the coefficients of (10) is not the matrix A appearing in (9); it is the matrix of (9) with its rows and columns interchanged. This n × m matrix of coefficients of (10), which is obtained from the m × n matrix A by interchanging rows and columns, is called the transpose of A and is denoted by A^T. Specifically, if A = ‖aij‖ has entry aij in its ith row and jth column, then the transpose B = A^T of the matrix A is defined formally by the equations

(11)  bji = aij    (i = 1, ..., m; j = 1, ..., n).

In this notation, (10) assumes the more familiar form

(11')  b11x1 + b12x2 + ··· + b1mxm = y1,
       b21x1 + b22x2 + ··· + b2mxm = y2,
       ···

The preceding formulas for linear transformations refer to the spaces F^m and F^n of m-tuples and n-tuples, respectively. More generally, if V and W are any two finite-dimensional vector spaces over F, of dimensions m and n respectively, then any linear transformation T: V → W can be represented by a matrix A, once we have chosen a basis β1, ..., βm in V and a basis γ1, ..., γn in W. For then T is determined by the images βiT = Σj aij γj, and we say that T is represented, relative to the given bases, by the m × n matrix A = ‖aij‖ of these coefficients. This amounts to replacing the spaces V and W by the isomorphic spaces of m- and n-tuples, under the isomorphisms Σ xiβi → (x1, ..., xm) and Σ yjγj → (y1, ..., yn).
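Theorem 2's row-vector convention, and the Caution about the transpose, can both be made concrete in a short sketch. This is my own illustration, not from the text: the shear parameter a = 3 and the sample vectors are invented. The rows of the matrix are the images of the unit vectors, a row vector is applied on the left of A, and the coefficient array of the column equations (10) is the transpose A^T.

```python
# Hypothetical sketch of Theorem 2: eta = xi * A, xi a row vector, and
# the ith row of A is the image of the unit vector e_i.
def apply_rows(xi, A):
    return [sum(xi[i] * A[i][k] for i in range(len(A)))
            for k in range(len(A[0]))]

S = [[1, 0],   # image of e1 = (1, 0) under the shear S_a, a = 3
     [3, 1]]   # image of e2 = (0, 1)
assert apply_rows([1, 0], S) == [1, 0]
assert apply_rows([0, 1], S) == [3, 1]
assert apply_rows([2, 5], S) == [17, 5]   # (x + 3y, y) at (2, 5)

# The Caution: the column equations (10) use the transpose (11),
# and give the same numbers.
ST = [list(col) for col in zip(*S)]
col_image = [sum(ST[k][i] * [2, 5][i] for i in range(2)) for k in range(2)]
assert col_image == [17, 5]
```

Keeping track of which convention a book (or library) uses is exactly what the Caution is warning about.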
Exercises

1. Consider the transformation of the plane into itself which carries every point P into a point P' related to P in the way described below. Determine when the transformation is linear, and find its equations.
(a) P' is two units to the right of P and one unit above (a translation).
(b) P' is the projection of P on the line of slope 1/2 through the origin.
(c) P' lies on the half-line OP joining P to the origin, at a distance from O such that OP' = 4/OP.
(d) P' is obtained from P by a rotation through 30° about the origin.
(e) P' is the reflection of P in the line x = 3.

2. Describe the geometric effect of each of the following linear transformations of the plane:
(a) x' = y, y' = x.  (b) x' = 0, y' = x.  (c) x' = x, y' = 3y.  (d) x' = x, y' = ky.  (e) x' = cx, y' = by.

3. Describe the geometric effects of the following linear transformations of space:
(a) x' = ax, y' = by, z' = cz.  (b) x' = 0, y' = y, z' = z.  (c) x' = x + 2y + 5z, y' = 3y, z' = 4z.

4. What is the matrix of the transformation (6) of the text?

5. Find the matrix which represents the linear transformation described:
(a) (1, 0) → (0, 2) and (0, 1) → (1, 2).
(b) (1, 1) → (3, 0) and (0, 1) → (−1, 1).
(c) (2, 1) → (4, 2) and (1, 1) → (3, 3).
(d) (1, 3) → (2, 2) and (−1, 2) → (0, 1).

6. A linear transformation T takes (1, 1) into (0, 1) and (−1, 1) into (2, 1). What matrix represents T?

7. Find the matrices which represent the symmetries of the equilateral triangle with vertices (1, 0) and (−1/2, ±√3/2).

8. By the image of a subspace S of V under a linear transformation T, one means the set (S)T of all vectors ξT for ξ in S. Prove that (S)T is itself a subspace.

8.2 Matrix Addition

The algebra of linear transformations (matrices) involves three operations: addition of two linear transformations (or matrices), multiplication of a linear transformation (or matrix) by a scalar, and multiplication of two linear transformations (or matrices).
Similarly. all m x n matrices over a field F form a vector space over F. The dimension of this space is therefore mn. as (12) This sum obeys the usual commutative and associative laws because the terms aij obey them. so form a basis for the space of all m x n matrices. The scalar product cA of a matrix A by a scalar c is formed by multiplying each entry by c. The sum = (c~ + dl1)T + (c~ + dl1)U = c~T + c~U + dl1T + dl1U = The product cT is also linear. 1· A = A. so that O+A=A+O=A for any m x n matrix A. Theorem 3. the scalar product cT is defined by T + U is linear according to definition (1). c(A + B) = cA + cB. Under addition. m x n matrices thus form an Abelian group. One can define the sum T + U of any two linear transformations from a vector space V to a vector space W by (14) ~(T + U) = ~T + ~U for all ~ in V. the addition of two matrices. These matrices Eij are linearly independent. There is a corresponding algebra of linear transformations. We shall now define the vector operations on matrices.2 221 Matrix Addition transformations (matrices). The m x n matrix 0 which has all entries zero acts as a zero matrix under this addition. One may verify the usual laws for vectors: c(dA) = (cd)A. and the multiplication of a matrix by a scalar. The sum A + B of two m x n matrices A = II ajj II and B = II bjj II is obtained by adding corresponding entries. namely. The additive inverse may be found by simply multiplying each entry by -1. ~(cn = c~(T + U) + dl1(T + U). Under addition and scalar multiplication. Matrix Multiplication The most important combination of two linear transformations T and V is their product TV (first apply T. y) into x" = kx + kay.3Dk . in the sense of being independent of the coordinate systems used in V and W (cf.. which sends (x'. 8 The Algebra of Matrices = pm and W = F".8). That is. if the shear Sa of (8) is followed by a transformation of similitude D k . in the notation introduced following Theorem 2. 
which preserves multiplication by scalars as well. Finally. 3. Sa of §8. 2Sa . that the set of all linear transformations T : V. without reference to matrices. definition (14) implies that Ej(T + V) = EjT + EjV. the vector space of all linear transformations from V into W is often referred to as Hom (V. W is a vector space under the operations defined in and below (14) . then V.3. . V of a vector space V into itself. For the matrices R 8. y') into x" = kx'. as in §6. Theorem 4. D k . (CA)T = cA T. For this reason. the operation of scalar multiplication just defined corresponds to that previously defined fOr m x n matrices. The product of two linear transformations is linear. y" = kyo This product SaDk is s till linear. Prove that (A + B)T = AT + B T.Sa + SDk' 2. y" = ky'.1. In this section. the combined effect is to take (x. Then TV may be defined as that transformation of V into itself with ~(TU) = (~nV for every vector ~. it should be observed that a linear transformation of a vector space V into a vector space W is just a homomorphism of V into W (both being considered as Abelian groups).2). and R8 . They also apply to infinite-dimensional vector spaces. §7. Since c(Ejn = c(£jn. 4. Exercises 1. whence the matrix C corresponding to T + V in Theorem 2 is the sum of the matrices which correspond to T and V. we shall consider only the product of two linear transformations T. compute 2R8 + D k . W). Prove the rults (13). Prove directly. When V (15) and The new definitions have the advantage of being intrinsic. For instance.222 Ch. B.. be followed by a second linear transformation of the plane. let the transformation (17) x' = xal1 y' = xa12 + ya21. U: F" ~ F" must yield a suitable product of their matrices. ~ into ~(TU) = (~n u.E. which is to say that TU also satisfies the defining condition (1) for a linear transformation. found by substituting (17) in (18). mapping (x'. for Theorems 2 and 4 show that the product of the transformations T. 
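The basis of matrices Eij in Theorem 3 can be illustrated directly. In this sketch (my own; the sample matrix is invented) a 2 × 2 matrix is rebuilt as the combination Σ aij·Eij:

```python
# Hypothetical illustration of Theorem 3's basis E_ij for 2 x 2 matrices.
def E(i, j, m=2, n=2):
    # entry 1 in row i, column j; 0 elsewhere
    return [[1 if (r, c) == (i, j) else 0 for c in range(n)] for r in range(m)]

def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def scale(c, A):
    return [[c * a for a in row] for row in A]

A = [[7, -2],
     [0, 4]]
total = [[0, 0], [0, 0]]
for i in range(2):
    for j in range(2):
        total = add(total, scale(A[i][j], E(i, j)))
assert total == A   # A = sum of a_ij * E_ij
```

Since the four Eij are clearly independent, this decomposition shows the space of 2 × 2 matrices has dimension 4 = 2·2, as the theorem asserts.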
Proof. By definition, a product TU maps any ξ into ξ(TU) = (ξT)U. By the linearity of T and U,

(cξ + dη)(TU) = [(cξ + dη)T]U = [c(ξT) + d(ηT)]U = c[(ξT)U] + d[(ηT)U] = c[ξ(TU)] + d[η(TU)],

which is to say that TU also satisfies the defining condition (1) for a linear transformation. Q.E.D.

Theorems 2 and 4 show that the product of two transformations T, U: F^n → F^n must yield a suitable product of their matrices. We shall now compute the "matrix product" AB which corresponds to T_AT_B. To be specific, let the transformation

(17)  x' = x a11 + y a21,   y' = x a12 + y a22,

with matrix A, be followed by a second linear transformation of the plane, with matrix B, mapping (x', y') on (x'', y''), where

(18)  x'' = x' b11 + y' b21,   y'' = x' b12 + y' b22.

The combined transformation, found by substituting (17) in (18), is

(19)  x'' = (a11 b11 + a12 b21)x + (a21 b11 + a22 b21)y,
      y'' = (a11 b12 + a12 b22)x + (a21 b12 + a22 b22)y.

The matrix of coefficients in this product transformation arises from the original matrices A and B by the important rule

(20)  | a11  a12 | | b11  b12 |   =   | a11 b11 + a12 b21    a11 b12 + a12 b22 |
      | a21  a22 | | b21  b22 |       | a21 b11 + a22 b21    a21 b12 + a22 b22 |

The entry in the first row and the second column of this result involves only the first row of a's and the second column of b's. Similar formulas hold for n × n matrices: by definition, εiT_A = Σj aij εj and εjT_B = Σk bjk εk. Hence

εi(T_AT_B) = (εiT_A)T_B = (Σj aij εj)T_B = Σj aij (εjT_B) = Σj aij (Σk bjk εk) = Σk cik εk,

where

(22)  cik = Σj aij bjk = ai1 b1k + ai2 b2k + ··· + ain bnk.

Hence the matrix product C = AB must be defined by (22) in order to make valid the rule

(21)  T_AT_B = T_AB.

Since we wish (21) to hold, we adopt this definition.

Definition. The product AB of the n × n matrix A by the n × n matrix B is defined to be the n × n matrix C having for its entry in the ith row and kth column the sum cik given by (22).

The product of two matrices may also be described verbally: the entry cik in the ith row and the kth column of the product AB is found by multiplying the ith row of A by the kth column of B. To "multiply" a row by a column, one multiplies corresponding entries, then adds the results. This multiplication rule is a labor-saving device which spares one the trouble of using variables in substitutions like (19); it implies that the homogeneous linear equations (10) for T and U may be combined to yield homogeneous linear equations for TU.

It follows immediately from the correspondence (21) between matrix multiplication and transformation multiplication that the multiplication of matrices is associative. In symbols,

(23)  A(BC) = (AB)C,

since these matrices correspond to the transformations T_A(T_BT_C) and (T_AT_B)T_C, which are equal by the associative law for the multiplication of transformations (§6.2).

Not only is matrix multiplication associative; it is distributive on matrix sums:

(24)  (A + B)C = AC + BC,   A(B + C) = AB + AC.

For the matrix (A + B)C has entries dik given by formulas like (22) as

dik = Σj (aij + bij)cjk = Σj aij cjk + Σj bij cjk.

This gives dik as the sum of an entry gik of AC and an entry hik of BC, and so proves the first of the two distributive laws (24).
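Rule (22) can be written out in executable form and checked against the substitution computation (19). The sketch below is my own; the sample matrices are invented.

```python
# The row-by-column rule (22): c_ik = sum_j a_ij * b_jk.
def matmul(A, B):
    n, m, r = len(A), len(B), len(B[0])
    return [[sum(A[i][j] * B[j][k] for j in range(m)) for k in range(r)]
            for i in range(n)]

A = [[1, 2],
     [3, 4]]
B = [[5, 6],
     [7, 8]]
C = matmul(A, B)
# Each entry matches the coefficients obtained by substitution, as in (19):
assert C == [[1*5 + 2*7, 1*6 + 2*8],
             [3*5 + 4*7, 3*6 + 4*8]]   # i.e. [[19, 22], [43, 50]]
```

The triple loop makes the labor-saving point concrete: the rule mechanizes exactly the substitution that produced (19).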
For scalar products by d, one may also verify the laws

(25)  (dA)B = d(AB)   and   A(dB) = d(AB).

The laws (24) and (25) are summarized by the statement that matrix multiplication is bilinear, for the first halves of these laws combine to give

(dA + d*A*)B = d(AB) + d*(A*B).

This is exactly the condition that multiplication by B on the right be a linear transformation X → XB on the vector space of all n × n matrices X. The other laws of (24) and (25) assert that multiplication by A on the left is also a linear transformation.

We may summarize the foregoing as follows:

Theorem 5. The set of all n × n matrices over a field F is closed under multiplication, which is associative, has an identity, and is bilinear with respect to vector addition and scalar multiplication.

Corresponding to the identity transformation T_I of F^n is the n × n identity matrix I, which has entries eii = 1 along the principal diagonal (upper left to lower right) and zeros elsewhere. Since εiT_I = εi for all i = 1, ..., n, I represents the identity transformation; it has the property IA = A = AI for every n × n matrix A.

However, multiplication is not commutative, and the law of cancellation fails, for there are plenty of divisors of zero. Thus, for example,

| 1  1 | |  1  −1 |   | 0  0 |            |  1  −1 | | 1  1 |   |  1   1 |
| 0  0 | | −1   1 | = | 0  0 |,   while   | −1   1 | | 0  0 | = | −1  −1 |.

A product of two nonzero matrices can thus be zero, and the two orders of multiplication can give different results. (What geometric transformations do these matrices induce on the square of §6.1?) This is related to the fact that a linear transformation need not have an inverse; thus the matrix

| 1  1 |
| 0  0 |,

which represents an oblique projection on the x-axis, does not induce a one-one transformation and is not onto; hence (cf. §6.2) it has no left-inverse or right-inverse.
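The failure of commutativity and of cancellation can be verified mechanically. In this sketch (my own; the matrices are chosen for illustration and include an oblique projection onto the x-axis), the product of two nonzero matrices is zero in one order but not in the other:

```python
# Hypothetical check: divisors of zero and noncommutativity in 2 x 2 matrices.
def matmul(A, B):
    return [[sum(A[i][j] * B[j][k] for j in range(len(B)))
             for k in range(len(B[0]))] for i in range(len(A))]

P = [[1, 1], [0, 0]]      # an oblique projection onto the x-axis
Q = [[1, -1], [-1, 1]]
assert matmul(P, Q) == [[0, 0], [0, 0]]    # PQ = 0 although P, Q != 0
assert matmul(Q, P) == [[1, 1], [-1, -1]]  # and QP != PQ
```

Since PQ = 0 = P·0 while Q is not the zero matrix, the cancellation law fails exactly as the text states.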
Formulas (15) and (21) assert the following important principle.

Theorem 6. The algebra of linear transformations of F^n is isomorphic to the algebra of all n × n matrices over F, under the correspondence T_A ↔ A of Theorem 2.

This suggests that the formal laws asserted in Theorem 5 for the algebra of matrices may in fact be valid for the linear transformations of any vector space whatever, but observe that we cannot set up the homogeneous linear equations (10) in an infinite-dimensional space. This conjecture is readily verified: the definition of linearity applies at once, and, when applied to suitable vector spaces of infinite dimensions, leads directly to certain aspects of the "operational calculus."

Example 1. Let V consist of all functions f(x) of a real variable x, and let J be the transformation or "operator" defined by [f(x)]J = f(x + 1). If I is the identity transformation, the operator Δ = J − I is known as a "difference operator"; it carries f(x) into f(x + 1) − f(x). Both J and Δ are linear; for instance, [cf(x) + dg(x)]J = c[f(x)]J + d[g(x)]J.

Example 2. The derivative operator D applies to the space C∞ of all functions f(x) which possess derivatives of all orders; it carries f(x) into f'(x). D is linear. Taylor's theorem may be symbolically written as e^D = J. For fixed a(x), the operation f(x) → a(x)f(x) is also linear.

Example 3. For functions f(x, y) of two variables, there are corresponding linear operators J_x, J_y, D_x, D_y, Δ_x, Δ_y; thus [f(x, y)]J_x = f(x + 1, y) and [f(x, y)]D_x = f_x'(x, y).

Exercises

1. Compute the products indicated for 2 × 2 matrices A, B, C, D, with B = | 1 0 ; 1 −1 | and C = | 1 3 ; 2 1 |:
(a) AB, BA.  (b) (A + B − I)(A − B + I), A² + AB − 2B.  (c) DB, AC, AD.  (d) Test the associative law for the products (AC)D and A(CD).

2. Use matrix products to compute the equations of the following transformations (notation as in §8.1):
(a) D_kS_a.  (b) S_aD_k.  (c) R_θS_a for θ = 45°.  (d) R_θS_aD_k for θ = 30°.  (e) D_kS_aD_k.
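The operators of Example 1 can be modeled as higher-order functions. This sketch is my own illustration (the test polynomial f(x) = x² is invented); it implements the shift J and the difference Δ = J − I:

```python
# Hypothetical model of Example 1's operators on functions of one variable.
def J(f):                      # [f(x)]J = f(x + 1)
    return lambda x: f(x + 1)

def delta(f):                  # [f(x)](J - I) = f(x + 1) - f(x)
    return lambda x: f(x + 1) - f(x)

f = lambda x: x * x
df = delta(f)
# (x + 1)^2 - x^2 = 2x + 1, so delta lowers the degree by one, like D:
assert [df(x) for x in range(4)] == [1, 3, 5, 7]
```

That Δ, like D, lowers the degree of a polynomial by one is the starting point of the finite-difference calculus alluded to above.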
3. Prove the associative law for matrix multiplication directly from definition (22).

4. When is S_aD_k = D_kS_a (notation as in Ex. 2)?

5. Prove the laws (25) and the second half of (24).

6. In Ex. 1 of §8.1, denote by T_n the transformation described in part (n). Compute (using matrices) the following products: (a) T_bT_c. (b) T_aT_c. (c) T_bT_aT_b. (d) T_dT_c. (e) T_bT_d.

7. (a) Expand (A + B)³. (b) Prove that A³A² = A²A³.

8. Show that if R, S, T are any transformations (linear or not) of a vector space V into itself, then R(S + T) = RS + RT, but that (R + S)T = RT + ST does not hold in general, unless T is linear.

9. Let

E1 = | 0 0 1 ; 0 1 0 ; 1 0 0 |,  E2 = | 1 0 k ; 0 1 0 ; 0 0 1 |,  E3 = | a 0 0 ; 0 b 0 ; 0 0 c |.

(a) Compute the products BE1, BE2, BE3, where B = | 1 2 1 ; 1 3 2 |. (b) If A is any 3 × 3 matrix, how is AE3 related to A? (c) Describe the effect caused by multiplying any matrix on the right by E1; by E2.

10. Find all matrices which commute with the matrix E3 of Ex. 9, when a, b, and c are distinct.

11. Prove that every matrix which commutes with the matrix D of Ex. 1 can be expressed in the form aI + bD.

12. Consider a new "product" A × B of two matrices, defined by a "row-by-row" multiplication of A by B. Is this product associative?

*13. If A is any n × n matrix, prove that the set C(A) of all n × n matrices which commute with A is closed under addition and multiplication.

14. Prove the laws R(S + T) = RS + RT, (R + S)T = RT + ST, and S(cT) = c(ST) for any linear transformations R, S, T of a vector space V into itself.

*15. Prove that each n × n matrix A satisfies an equation of the form A^m + c_{m−1}A^{m−1} + ··· + c1A + c0I = O.

*16. (a) Let A = ‖aij‖ be an n × n matrix of real numbers, and let M be the largest of the |aij|. Prove that the entries of A^k are bounded in magnitude by n^{k−1}M^k. (b) Show that the series I + A + A²/2! + A³/3! + ··· is always convergent. (It may be used to define the exponential function e^A of the matrix A.)

In Exs. 17-21, the notation follows that in Examples 1-3 above.

17. (a) Prove D linear. (b) Show why e^D = J.

18. (a) Simplify xD − Dx. (b) Simplify xΔ − Δx and xΔ² − Δ²x.

*19. Prove D_xD_y = D_yD_x.

*20. Define the Laplacian operator ∇² by ∇² = D_x² + D_y², and find x∇² − ∇²x, y(∇²)² − (∇²)²y, and ∇²(x² + y²) − (x² + y²)∇².

*21. Expand Δⁿ = (J − I)ⁿ by a "binomial theorem."
and Triangular Matrices A square matrix D = II d ij II is called diagonal if and only if i "I: j implies d ij = 0.Ch. 1o 00 0)1 . To add or to multiply two diagonal matrices. Define the Laplacian operator V2 by V2 = D/ + D/. Theorem 8. If all the diagonal entries dii of D are nonzero. any such matrix may be obtained from a permutation matrix by replacing . 8 The Algebra of Matrices 228 *20. V2(X 2 + y2) _ (x 2 + y2)V2. 01 00 01) .4. 0o 01 01) . 0o 01 0)1 . a matrix P is a permutation matrix if and only if the corresponding linear transformation Tp of Vn permutes the unit vectors E1o···.V2x. and this correspondence is an isomorphism. Expand ~" = (J . If the equations giving the motions R. For instance. (0 -1)0' R ~ 1 R' ~ (-1o -1' 0) H 0) ~ 0 -1' (1 D ~ (0. The multiplication table of this group. Permutation. Pick an origin at the center of the square and an x-axis parallel to one side of the square. that is. R ' . they will give transformations with the following matrices. and D are written out in terms of x and y (see the description in §6. What is the effect of multiplying an n on the right? X n matrix A by a diagonal matrix D . Any group of linear transformations may be represented by a corresponding group of matrices. and Triangular Matrices the 1's by any nonzero entries. 01). H. A matrix S is strictly triangular if all the entries on or below the main diagonal are zero. 1 The other four elements of the group can be similarly represented. This scheme of prescribing a "pattern" for the nonzero terms of a matrix is not the only method of constructing groups of matrices. where I is the identity. Such matrices are called nonsingular or invertible. 3 0 M2 = 0 7 -30) . as given in §6.1). In other words. Finally. they will be studied systematically in §8. The preceding examples show that a given matrix A may have an inverse A-I. the group of the square is isomorphic to a group of eight 2 x 2 matrices. as for example in (26) M.4 229 Diagonal.4. 
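Theorem 7 can be checked by brute force in the 3×3 case. The following Python sketch is an illustration added to this edition of the text, not part of the original; the helper names perm_matrix, matmul, and compose are ad hoc. It verifies that the correspondence p → P_p turns composition of permutations into multiplication of the corresponding permutation matrices:

```python
from itertools import permutations

def perm_matrix(p):
    """Permutation matrix whose ith row is the unit vector with 1 in column p(i)."""
    n = len(p)
    return [[1 if j == p[i] else 0 for j in range(n)] for i in range(n)]

def matmul(A, B):
    """Row-by-column product of two square matrices."""
    n = len(A)
    return [[sum(A[i][j] * B[j][k] for j in range(n)) for k in range(n)]
            for i in range(n)]

def compose(p, q):
    """Composite permutation: first apply p, then q (right-operator convention)."""
    return tuple(q[p[i]] for i in range(len(p)))

# The correspondence is an isomorphism: P_p P_q = P_(p then q), for all pairs.
for p in permutations(range(3)):
    for q in permutations(range(3)):
        assert matmul(perm_matrix(p), perm_matrix(q)) == perm_matrix(compose(p, q))
```

Since the loop runs over all 36 pairs of permutations of three symbols, this exhausts the claim for n = 3; the general proof is the unit-vector argument given above.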
For example, the 3×3 matrices

(26) M_1 = (0 0 5; −1 0 0; 0 3 0),  M_2 = (0 7 0; 0 0 −30; 4 0 0)

are monomial: in each, every row and every column has just one nonzero entry.

A square matrix T = ||t_ij|| is triangular if all the entries below the diagonal are zero; that is, if t_ij = 0 whenever i > j. A matrix S is strictly triangular if all the entries on or below the main diagonal are zero. These two patterns may be schematically indicated in the 4×4 case by

         (q r s t)         (0 u v x)
(27) T = (0 u v w),    S = (0 0 w y)
         (0 0 x y)         (0 0 0 z)
         (0 0 0 z)         (0 0 0 0)

where the letters denote arbitrary entries.

The preceding examples show that a given matrix A may have an inverse A^(-1), such that AA^(-1) = A^(-1)A = I, where I is the identity. Such matrices are called nonsingular or invertible; they will be studied systematically in §8.6. Finally, a scalar matrix is a matrix which can be written as cI.
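A strictly triangular matrix is nilpotent: each multiplication by it pushes the nonzero band one diagonal higher, so the 4th power of a 4×4 strictly triangular matrix vanishes. A quick Python check (an added illustration, not from the text; the entries 1-6 substituted into the pattern S of (27) are arbitrary):

```python
def matmul(A, B):
    """Row-by-column product of two square matrices."""
    n = len(A)
    return [[sum(A[i][j] * B[j][k] for j in range(n)) for k in range(n)]
            for i in range(n)]

# The 4 x 4 strictly triangular pattern S of (27), with sample entries.
S = [[0, 1, 2, 3],
     [0, 0, 4, 5],
     [0, 0, 0, 6],
     [0, 0, 0, 0]]

# Squaring pushes the nonzero band above the diagonal one step higher,
# so S^2 is nonzero only two diagonals up, and S^4 = 0.
S2 = matmul(S, S)
S4 = matmul(S2, S2)
assert S4 == [[0] * 4 for _ in range(4)]
```

The same band-shifting argument proves nilpotency for a strictly triangular n×n matrix, with S^n = 0.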
Exercises

1. Show that a triangular 2×2 matrix with 1's on the main diagonal represents a shear transformation.
2. (a) What is the effect of multiplying an n×n matrix A on the right by a diagonal matrix D? (b) If D is diagonal and all the terms on the diagonal are distinct, what matrices A commute with D (when is AD = DA)?
3. For the group of symmetries of the square with vertices at (±1, ±1), compute the matrices which represent the symmetries H, V, D. Verify that HD = DV.
4. Represent the group of symmetries of the rectangle as a group of matrices.
5. Exhibit explicitly the isomorphism between the 3×3 permutation matrices and the symmetric group on three letters.
6. Find a description like that of Ex. 5 for monomial matrices.
7. Describe the inverse of a monomial matrix M, and find the inverses of M_1 and M_2 in (26).
8. (a) Prove that a monomial matrix M can be written in one and only one way in the form M = DP, where D is nonsingular and diagonal, and P is a permutation matrix. (b) Write the matrices M_1 and M_2 of the text in the forms DP and PD. *(c) Exhibit a homomorphism mapping the group of monomial matrices onto the group of permutation matrices.
9. Let S_i be the one-dimensional subspace of V_n spanned by the ith unit vector ε_i. Prove that a nonsingular matrix D is diagonal if and only if the corresponding linear transformation T_D maps each subspace S_i onto itself.
10. If P is a permutation matrix and D diagonal, describe explicitly the form of the transform P^(-1)DP. (Hint: Try the 3×3 case.)
11. How are the rows of PA related to those of A, for P as in Ex. 10?
12. If M is monomial and D diagonal, prove M^(-1)DM diagonal.
13. Show that the formula M = DP of Ex. 8 defines a group-homomorphism M → P. Find its kernel.
*14. A matrix A is called nilpotent if some power of A is O. Prove that any strictly triangular matrix is nilpotent.

8.5 Rectangular Matrices

So far we have considered only the multiplication of square (i.e., n×n) matrices; we now discuss the multiplication of rectangular matrices, that is, of m×n matrices where in general m ≠ n.

Definition. An m×n matrix A = ||a_ij|| and an n×r matrix B = ||b_jk||, with the same n, determine as product AB = ||c_ik|| an m×r matrix C with entries

c_ik = Σ_j a_ij b_jk,

where, as always, j runs from 1 to n in the sum. This "row-by-column" product cannot be formed unless each row of A is just as long as each column of B; hence the assumption that the number n of columns of A equals the number n of rows of B.
The algebraic laws for square matrices hold also for rectangular matrices, provided these matrices have the proper dimensions to make all products involved well defined. Here, as before, the product of a transformation T: V → W by a transformation U: W → X is defined by

(28) ξ(TU) = (ξT)U for all ξ in V.

Matrix multiplication is again bilinear, as in (24) and (25), and the m×m identity matrix I_m and the n×n identity I_n satisfy

(29) I_mA = A = AI_n (if A is m×n).

The associative law is

(30) A(BC) = (AB)C (A is m×n, B is n×r, C is r×s).

As in our formulas (21)-(22), it is best proved by appeal to an interpretation of rectangular matrices as transformations: the matrix product AB corresponds under Theorem 2 to the product T_AT_B of the linear transformations T_A: F^m → F^n and T_B: F^n → F^r associated with A and B, respectively.

As in (11), the transpose A^T of an m×n matrix A is the n×m matrix with entries a_ij^T = a_ji. The ith row of this transpose A^T is the ith column of the original A, and vice versa; one may also obtain A^T by reflecting A in its main diagonal. To calculate the transpose C^T of a product AB = C, use

(31) c_ik^T = c_ki = Σ_j a_kj b_ji = Σ_j b_ji a_kj = Σ_j b_ij^T a_jk^T;

the result is just the (i, k) element of the product B^TA^T. This proves the first of the laws

(AB)^T = B^TA^T,  (A^T)^T = A.

(Note the change in order.) The correspondence A ↔ A^T therefore preserves sums and inverts the
This allows us to interpret the equations Yi = I Xiaij of (10) as stating that the row matrix Y is the product of the rOw matrix X by the matrix A. For example. so that (34) X and Y rOw matrices. the inner product XIYl + . . the scalar product cX is just the matrix product of the 1 x 1 matrix c by the 1 x n (row) matrix X. Since (A T)T = A. the kth column of AB arises only from the kth column of B. there results an equation BX = Y of the form of (11').. Changing the notation. X. its entries appear in the display (10) in a single column. . ..1 bs+I.. . .. the ith rOw of AB is by definition the product of the row matrix Ai = (ail. a 6 x 5 matrix B could be considered as a 6 x 2 matrix DI = IIB(I) B(2)11 laid side by side with a 6 x 3 matrix D2 = IIB(3) B(4) B(5)11 to form the whole 6 x 5 matrix B = IIDI D211... Yn) is a row matrix.s+1 • M2 bsl bsr bs+I. . m'. . Thus. the rule for multiplication becomes (39) DI and D2 n-rowed blocks. . . ain) by B. . bnr ) = ylB I + . the product YB is the row matrix YB = (yIb ll + ..S 233 Rectangular Matrices The second rule may be visualized by writing out B as the row of its columns. .. . ... yIb lr + . + Yn(b nh · . These formulas are special instances of a method of multiplying matrices which have been subdivided into "blocks" or submatrices.. The product YB is thus formed by multiplying the row Y by the "column" of rows B i. ... (40) thus each row of AB is a linear combination of the rows of B. + Ynbnl> . for then These columns ioay also be grouped into sets of columns forming larger submatrices..s+1 amn am... ... bll all als ams amI • MI b ir NI ai n al. + YnBn.s . b Ir ) + ..§8. If we decompose the n x r matrix B into n rows Bh . hence i = 1. By (38).r' N2 bnl bnr Let the n columns of a matrix A consist of the s columns of a submatrix MI followed by a submatrix M2 with the remaining n . + Ynbnr) = YI(b l l . Bn> and if Y = (Yh . For example. It is convenient to sketch other instances of this method. 3. 0 (a) Find XA. 
The second rule may be visualized by writing out B as the row of its columns, B = ||B^(1) B^(2) ··· B^(r)||. These columns may also be grouped into sets of columns forming larger submatrices. For example, a 6×5 matrix B could be considered as a 6×2 matrix D_1 = ||B^(1) B^(2)|| laid side by side with a 6×3 matrix D_2 = ||B^(3) B^(4) B^(5)|| to form the whole 6×5 matrix B = ||D_1 D_2||; the rule for multiplication becomes

(39) A||D_1 D_2|| = ||AD_1 AD_2||, D_1 and D_2 n-rowed blocks.

Likewise, the ith row of AB is by definition the product of the row matrix A_i = (a_i1, ···, a_in) by B. If we decompose the n×r matrix B into its n rows B_1, ···, B_n, and if Y = (y_1, ···, y_n) is a row matrix, the product YB is the row matrix

(38) YB = (y_1b_11 + ··· + y_nb_n1, ···, y_1b_1r + ··· + y_nb_nr)
        = y_1(b_11, ···, b_1r) + ··· + y_n(b_n1, ···, b_nr) = y_1B_1 + ··· + y_nB_n.

The product YB is thus formed by multiplying the row Y by the "column" of rows B_i. By (38),

(40) (AB)_i = A_iB = a_i1B_1 + ··· + a_inB_n;

thus each row of AB is a linear combination of the rows of B.

These formulas are special instances of a method of multiplying matrices which have been subdivided into "blocks" or submatrices. Let the n columns of a matrix A consist of the s columns of a submatrix M_1 followed by a submatrix M_2 with the remaining n − s columns, and make a parallel subdivision of the rows of the matrix B, so that B appears as an s×r matrix N_1 on top of an (n − s)×r matrix N_2. The product formula for AB = C then subdivides into two corresponding sections,

(41) c_ik = (a_i1b_1k + ··· + a_isb_sk) + (a_i,s+1b_s+1,k + ··· + a_inb_nk).

The first parenthesis uses only the ith row from the block M_1 of A, and only the kth column from the top block N_1 of B; therefore this first parenthesis is exactly d_ik, the entry in the ith row and kth column of the block product M_1N_1. Likewise, the second parenthesis of (41) is the term d′_ik of the product M_2N_2. Therefore c_ik = d_ik + d′_ik, so the whole product AB is the matrix sum

(42) AB = M_1N_1 + M_2N_2.

This formula is a row-by-column multiplication of blocks. A similar result holds for any fitting subdivision of the columns of A, with a corresponding subdivision of the rows of B. When both rows and columns are subdivided, the rule for multiplication is a combination of (42) and rule (39). Thus

(43) (M_11 M_12; M_21 M_22)(N_11 N_12; N_21 N_22)
        = (M_11N_11 + M_12N_21   M_11N_12 + M_12N_22;
           M_21N_11 + M_22N_21   M_21N_12 + M_22N_22).

This assumes that the subdivisions fit: that the number of columns in M_11 equals the number of rows in N_11, and so on. Rule (43) is exactly the rule for the multiplication of 2×2 matrices, except that the entries M_ij and N_jk are submatrices or "blocks," and not scalars. We conclude that matrix multiplication under any fitting block subdivision proceeds just like ordinary matrix multiplication, with row-by-column multiplication of blocks in place of the row-by-column multiplication of matrix entries.
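Rule (43) can be tested mechanically. The minimal Python sketch below is an added illustration (not from the text; the helpers block and madd, and the sample 2×2 blocks, are ad hoc). It assembles two 4×4 matrices from 2×2 blocks and compares block-by-block multiplication with the ordinary row-by-column product:

```python
def matmul(A, B):
    m, n, r = len(A), len(B), len(B[0])
    return [[sum(A[i][j] * B[j][k] for j in range(n)) for k in range(r)]
            for i in range(m)]

def madd(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def block(M11, M12, M21, M22):
    """Assemble a matrix from four fitting blocks."""
    top = [r1 + r2 for r1, r2 in zip(M11, M12)]
    bot = [r1 + r2 for r1, r2 in zip(M21, M22)]
    return top + bot

M11 = [[1, 2], [3, 4]]; M12 = [[0, 1], [1, 0]]
M21 = [[2, 0], [0, 2]]; M22 = [[1, 1], [1, 1]]
N11 = [[1, 0], [0, 1]]; N12 = [[2, 2], [2, 2]]
N21 = [[1, 1], [0, 0]]; N22 = [[3, 0], [0, 3]]

M = block(M11, M12, M21, M22)
N = block(N11, N12, N21, N22)

# Rule (43): multiply blocks row-by-column, exactly like 2 x 2 matrix entries.
C = block(madd(matmul(M11, N11), matmul(M12, N21)),
          madd(matmul(M11, N12), matmul(M12, N22)),
          madd(matmul(M21, N11), matmul(M22, N21)),
          madd(matmul(M21, N12), matmul(M22, N22)))
assert C == matmul(M, N)
```

The assertion compares all sixteen entries; the subdivisions "fit" here because every block is 2×2.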
Exercises

1. Let A = (…) and B = (…) be 2×3 matrices, and let X = (1, 0) and Y = (i, −1). (a) Find XA, YA, XB, YB. (b) Find 3A − 4B, A + (1 + i)B, and (X − (1 + i)Y)(iA + 5B). (c) Find BA^T, AB^T, BA^TY^T, and XAB^T.
2. Show that if X is any row vector, then XX^T is the inner product of X with itself, while X^TX is the matrix A with a_ij = x_ix_j.
3. Compute AB, BA, AC, and BC, if

A = (2 3 0 0; 5 2 0 0; 0 0 4 0; 0 0 0 2),  B = (1 0 0 0; 0 1 0 0; 1 2 1 0; 3 4 0 1),  C = (1 1; 0 2; 0 0; 2 4).

(Hint: Use blocks.)
4. Prove the "block multiplication rule" (43).
5. Let I* be the (r + n)×n matrix formed by putting an n×n identity matrix on top of an r×n matrix of zeros. What is the effect of multiplying any n×(r + n) matrix by I*?

8.6 Inverses

Linear transformations of a finite-dimensional vector space to itself are of two kinds: either bijective (both one-one and onto) or neither injective nor surjective (neither one-one nor onto). For instance, the oblique projection (x, y, z) → (x, y + z, 0) of three-dimensional Euclidean space onto the (x, y)-plane is neither injective nor surjective.

Definition. A linear transformation T of a vector space V to itself is called nonsingular or invertible when it is bijective from V onto V; otherwise T is called singular.

A nonsingular linear transformation T is a bijection of V onto V which preserves the algebraic operations of sum and scalar product, so it is an isomorphism of the vector space V to itself; hence a nonsingular linear transformation of V may also be called an automorphism of V. In this case T has a (two-sided) linear inverse T^(-1), with TT^(-1) = T^(-1)T = 1.

The most direct way to prove the main facts about singular and nonsingular linear transformations is to apply the theory of linear independence derived in Chap. 7, using a fixed basis α_1, ···, α_n for the vector space V on which the given transformation T operates.

Theorem 9. A linear transformation T of a vector space V with finite basis α_1, ···, α_n is nonsingular if and only if the vectors α_1T, ···, α_nT are linearly independent in V.

Proof. First suppose T is nonsingular. If there is a linear relation
x_1(α_1T) + ··· + x_n(α_nT) = 0, then (x_1α_1 + ··· + x_nα_n)T = 0. Since T is one-one and 0T = 0, this gives x_1α_1 + ··· + x_nα_n = 0, and hence, by the independence of the α's, x_1 = ··· = x_n = 0. Therefore the α_iT are linearly independent.

Conversely, assume that the vectors β_1 = α_1T, ···, β_n = α_nT are linearly independent. Since V is n-dimensional, the n independent vectors β_1, ···, β_n are a basis of V. By Theorem 1 there is a linear transformation S of V with β_iS = α_i for every i. Thus for each i = 1, ···, n, α_i(TS) = β_iS = α_i, so that TS is the identity: TS = 1. Again, β_i(ST) = (β_iS)T = α_iT = β_i; since the β's are a basis, there is by Theorem 1 only one linear transformation R with β_iR = β_i for every i, and that is the identity. Hence ST = 1. Thus S is a two-sided inverse of T, and T is nonsingular. Q.E.D.

Corollary 1. Let T be a linear transformation of a finite-dimensional vector space V. If T is nonsingular, then
(i) T has a two-sided linear inverse;
(ii) ξT = 0 and ξ in V imply ξ = 0;
(iii) T is one-one from V into V;
(iv) T transforms V onto V.
If T is singular, then
(i′) T has neither a left- nor a right-inverse;
(ii′) ξT = 0 for some ξ ≠ 0;
(iii′) T is not one-one;
(iv′) T transforms V into a proper subspace of V.

Proof. Condition (i) was proved in Theorem 9, while (ii), (iii), and (iv) are parts of the definition of "nonsingular"; recall that a transformation T is one-one onto if and only if it has a two-sided inverse (§6.2).

Now suppose T singular. Then for any basis α_1, ···, α_n of V the vectors α_iT are linearly dependent, by Theorem 9; hence ξT = x_1(α_1T) + ··· + x_n(α_nT) = 0 for some coefficients x_i not all zero, that is (the α_i being independent), for some ξ = x_1α_1 + ··· + x_nα_n ≠ 0. This proves (ii′); since also 0T = 0, T is not one-one, proving (iii′). Again, since the α_iT are linearly dependent, they span a proper subspace of V, and this subspace is the image of T; this proves (iv′). Finally, if T had a right-inverse, T would be one-one, contradicting (iii′); if T had a left-inverse, T would be onto, contradicting (iv′). This proves (i′). Q.E.D.

Thus, to test the nonsingularity of T, one may test the linear independence of the images of any finite basis of V, as by the methods of Chap. 7.
Corollary 2. If the product TU of two linear transformations of a finite-dimensional vector space V is the identity, then T and U are both nonsingular, with U = T^(-1) and T = U^(-1).

Proof. Since TU = 1, T has a right-inverse, so by (i′) above T cannot be singular; hence it is nonsingular and has a two-sided inverse T^(-1) by (i). Then U = (T^(-1)T)U = T^(-1)(TU) = T^(-1), and the other conclusions follow. Q.E.D.

Note that since the conditions enumerated in Corollary 1 are incompatible in pairs, all eight conditions are "if and only if" (i.e., necessary and sufficient) conditions.

In view of the multiplicative isomorphism of Theorem 6 between linear transformations of F^n and n×n matrices over F, the preceding results can be translated into results about matrices. We define an n×n matrix A to be nonsingular if and only if it corresponds under Theorem 2 to a nonsingular linear transformation T_A of F^n; otherwise we shall call A singular. Conditions (i) and (i′) of Corollary 1 translate into the following result.

Corollary 3. An n×n matrix A is nonsingular if and only if it has a matrix inverse A^(-1), such that

(44) AA^(-1) = A^(-1)A = I (A, A^(-1), I all n×n).

Moreover, if A is nonsingular, so is A^T; for on taking the transpose of either side of (44), one gets by (31)

(45) (A^(-1))^TA^T = A^T(A^(-1))^T = I.

Thus, if A has an inverse, so does its transpose.

But the transformation T_A is, by Theorem 2, that transformation which takes the unit vectors of F^n into the rows of A. Hence the condition of Theorem 9 becomes (cf. the Corollary of Theorem 6 of §7.4):

Corollary 4. An n×n matrix over a field F is nonsingular if and only if its rows are linearly independent, or, equivalently, if and only if they form a basis for F^n.

By (45), A is nonsingular if and only if A^T is nonsingular, and A^T is nonsingular if and only if its rows are
linearly independent. These rows are precisely the columns of A; hence we have

Corollary 5. A square matrix is nonsingular if and only if its columns are linearly independent.

If matrices A and B both have inverses, so does their product AB, and

(46) (AB)^(-1) = B^(-1)A^(-1) (note the order!),

for (AB)(B^(-1)A^(-1)) = A(BB^(-1))A^(-1) = AA^(-1) = I, and similarly (B^(-1)A^(-1))(AB) = I.

If Corollary 2 is translated from linear transformations to matrices, we obtain

Corollary 6. Every left-inverse of a square matrix is also a right-inverse.

Inverses of nonsingular matrices may be computed by solving suitable simultaneous linear equations. If we write the coordinates of the basis vectors as

(47) I_1 = (1, 0, 0, ···, 0), I_2 = (0, 1, 0, ···, 0), ···, I_n = (0, 0, 0, ···, 1),

then in a given matrix A = ||a_ij|| each row A_i is given as a linear combination

A_i = a_i1I_1 + a_i2I_2 + ··· + a_inI_n

of the basis vectors I_1, ···, I_n. One may try to solve these equations for the "unknowns" I_j in terms of the A_i. If this succeeds, the result will be linear expressions for the I_j,

(48) I_j = c_j1A_1 + ··· + c_jnA_n = Σ_k c_jkA_k.

By (40), this equation states that the matrix C = ||c_jk|| satisfies CA = I; hence, by Corollary 6, C = A^(-1). Another construction for A^(-1) appears in §8.8.

EXAMPLE. To compute the inverse of the matrix

    ( 1  2 −2)
A = (−1  3  0),
    ( 0 −2  1)

write its rows as A_1 = I_1 + 2I_2 − 2I_3, A_2 = −I_1 + 3I_2, A_3 = −2I_2 + I_3.
These three simultaneous equations have a solution

I_1 = 3A_1 + 2A_2 + 6A_3,  I_2 = A_1 + A_2 + 2A_3,  I_3 = 2A_1 + 2A_2 + 5A_3.

The coefficients c_jk in these linear combinations give the inverse matrix, for one may verify that

(3 2 6) ( 1  2 −2)   (1 0 0)
(1 1 2) (−1  3  0) = (0 1 0).
(2 2 5) ( 0 −2  1)   (0 0 1)

Linear transformations from a finite-dimensional vector space V to a second such space W (over the same field) can well be one-one but not onto, or onto but not one-one. The same is true of linear transformations from an infinite-dimensional vector space to itself. For example, the linear transformation (x_1, x_2, x_3, ···) → (0, x_1, x_2, x_3, ···) on the space of infinite sequences of real numbers is one-one but not onto; it has many (linear) right-inverses, but no left-inverse.

A one-one linear transformation T of V onto W is an isomorphism of V to W, in the sense of Chap. 7. Its inverse, when it exists, is necessarily linear, even if V is a space of infinite dimensions:

Theorem 10. If the linear transformation T: V → W is a one-one transformation of V onto W, its inverse is linear.

Proof. Let ψ denote the unique inverse transformation of W onto V, not assumed to be linear. Take vectors ξ and η in W and scalars c and d. Since ψT is the identity transformation of W, and T is linear,

(c(ξψ) + d(ηψ))T = c(ξψ)T + d(ηψ)T = cξ + dη.

Apply ψ to both sides; one finds that

(49) (cξ + dη)ψ = c(ξψ) + d(ηψ),

an equation which asserts that ψ is linear. Q.E.D.

Corollary 1. An isomorphism T of V onto W carries any set of independent vectors α_1, ···, α_r of V into independent vectors in W, and any set β_1, ···, β_s of vectors spanning V into vectors spanning W.

Proof. If there is a linear relation between the α_iT, say x_1(α_1T) + ··· + x_r(α_rT) = 0, then (x_1α_1 + ··· + x_rα_r)T = 0, and we may apply T^(-1) to find x_1α_1 + ··· + x_rα_r = 0; hence x_1 = ··· = x_r = 0, since the α_i are independent. Therefore the α_iT are independent. The proof of the second half is similar. Q.E.D.

For any transformation T: V → W, the image or transform S′ = (S)T under T of a subspace S of V is defined as the set of all transforms ξT of vectors in S. This image is always a subspace of W, for each linear combination c(ξT) + d(ηT) = (cξ + dη)T of vectors ξT and ηT in S′ is again in S′. Thus a nonsingular T carries lines into lines, and planes into planes.

Corollary 2. For an isomorphism T: V → W, the image under T of any finite-dimensional subspace S of V has the same dimension as S.
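The arithmetic of the inverse computed in the example above can be confirmed directly. In this illustrative Python fragment (added here, not part of the text), A and A_inv are the matrices of the example:

```python
def matmul(A, B):
    """Row-by-column product of two square matrices."""
    n = len(A)
    return [[sum(A[i][j] * B[j][k] for j in range(n)) for k in range(n)]
            for i in range(n)]

A = [[ 1,  2, -2],
     [-1,  3,  0],
     [ 0, -2,  1]]

# Rows of A_inv are the coefficients c_jk read off from I_j = sum_k c_jk A_k.
A_inv = [[3, 2, 6],
         [1, 1, 2],
         [2, 2, 5]]

I = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

# (44): the inverse works on both sides.
assert matmul(A_inv, A) == I
assert matmul(A, A_inv) == I
```

The second assertion also illustrates Corollary 6: once CA = I holds for the square matrix C, the product in the other order is forced to be I as well.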
Exercises

1. (a) Prove that A = (a b; c d) is nonsingular if and only if ad − bc ≠ 0. (b) Show that if Δ = ad − bc ≠ 0, its inverse is Δ^(-1)(d −b; −c a).
2. Find the multiplicative inverse of (a) (…), (b) (…), (c) (…).
3. (a) If θ = 45°, compute the matrix of the transformation R_θ^(-1)S_aR_θ. (b) Describe geometrically the effect of this transformation. (c) Do the same for R_θ^(-1)V_bR_θ, where V_b is the transformation x′ = bx, y′ = y.
4. Find inverses for the linear transformations R_θ, D_k, S_a of §8.1.
5. Find inverses (if any) for the linear transformations of Ex. 1, §8.1.
6. Find inverses for the matrices A and B of Ex. 1, §8.3.
7. Find inverses for the matrices A, B, C, D of Ex. 3.
8. Find inverses for the matrices E_1, E_2, and E_3 of Ex. 9, §8.3.
9. (a) Compute a formula for the inverse of a 2×2 triangular matrix. (b) The same for a 3×3. (Hint: Try a triangular inverse.) (c) Prove that every triangular matrix with no zero terms on the diagonal has a triangular inverse.
10. If A satisfies A^2 − A + I = 0, prove that A^(-1) exists and is I − A.
11. If a product AB of square matrices is nonsingular, prove that both factors A and B are nonsingular.
12. Prove that all nonsingular n×n matrices form a group with respect to matrix multiplication.
13. Prove without appeal to linear transformations that a matrix A has a left-inverse if and only if its rows are linearly independent.
14. If a linear transformation T: V → W has a right-inverse, prove that it has a right-inverse which is linear (without assuming finite-dimensionality).
15. Exhibit a linear transformation of the space of sequences (x_1, x_2, ···) onto itself which is not one-one.
16. Prove Corollary 2 of Theorem 10.

8.7 Rank and Nullity

In general (see §6.2), each transformation (function) T: S → S_1 has given sets S and S_1 as domain and codomain; the range of T is the set of transforms (the image of the domain under T). In case T is a linear transformation of a vector space V into a second vector space W, the image (the set of all ξT) cannot be an arbitrary subset of W:

Lemma 1. The image of a linear transformation T: V → W is itself a vector space (hence a subspace of W).

Proof. Since c(ξT) = (cξ)T and ξT + ηT = (ξ + η)T, the set of transforms is closed under the vector operations. Q.E.D.

Lemma 2. Let T_A be the linear transformation corresponding to the m×n matrix A. Then the image of T_A is the row space of A.

Proof. The transformation T_A: F^m → F^n carries each vector X = (x_1, ···, x_m) of F^m into Y = XA in F^n, so that the image of T_A consists of all n-tuples of the form

Y = XA = (Σ_i x_i a_i1, ···, Σ_i x_i a_in) = Σ_i x_i(a_i1, ···, a_in).

These are exactly the different linear combinations of the rows A_i = (a_i1, ···, a_in) of A; the range of T_A is thus the set of all linear combinations of the rows of A. But this is the row space of A. Q.E.D.
The rank of a matrix A has been defined (§7.6) as the (linear) dimension of the row space of A; it is therefore the dimension of the range of T_A. More generally, the rank of any linear transformation T is defined as the dimension (finite or infinite) of the image of T. Since the dimension of the subspace spanned by m given vectors is the maximum number of linearly independent vectors in the set, the rank of A is also the maximum number of linearly independent rows of A. For this reason, the rank of A as defined above is often called the row-rank of A, as distinguished from the column-rank, which is the maximum number of linearly independent columns of A.

Dual to the concept of the row space of a matrix, or range of a linear transformation, is that of its null-space.

Definition. The null-space of a linear transformation T is the set of all vectors ξ such that ξT = 0. The null-space of a matrix A is the set of all row matrices X which satisfy the homogeneous linear equations XA = 0.

Lemma 3. The null-space of any linear transformation (or matrix) is a subspace of its domain.

Proof. If ξT = 0 and ηT = 0, then for all scalars c, c′, (cξ + c′η)T = c(ξT) + c′(ηT) = 0. Hence cξ + c′η is in the null-space, which is therefore a subspace. Q.E.D.

The dimension of the null-space of a given matrix A or linear transformation T is called the nullity of A or T. Nullity and rank are connected by a fundamental equation, valid for both matrices and linear transformations.

Theorem 11. Rank + nullity = dimension of domain.

Because of the correspondence between matrices and linear transformations, we need supply the proof only for one case. Thus, for an m×n matrix, (row) rank plus (row) nullity equals m.

Proof. If the nullity of T is s, its null-space N has a basis α_1, ···, α_s of s elements, which can be extended to a basis α_1, ···, α_s, β_1, ···, β_r for the whole domain of T. Since every α_jT = 0, the vectors β_jT span the image (range) R of T. Moreover, x_1(β_1T) + ··· + x_r(β_rT) = 0 implies (x_1β_1 + ··· + x_rβ_r)T = 0, and so puts x_1β_1 + ··· + x_rβ_r in N; since the α's and β's together are independent, this forces x_1 = ··· = x_r = 0. Hence the vectors β_jT are independent and form a basis for R. We conclude that the dimension m = s + r of the domain is the sum of the dimensions s of N and r of R. Q.E.D.
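Theorem 11 can be illustrated numerically. The sketch below is an added illustration (not from the text): the 3×4 matrix is a made-up example whose third row is the sum of the first two, and the row rank is computed by Gauss elimination over exact rationals.

```python
from fractions import Fraction

def rank(A):
    """Row rank via Gauss elimination with exact rational arithmetic."""
    M = [[Fraction(x) for x in row] for row in A]
    rows, cols = len(M), len(M[0])
    r = col = 0
    while r < rows and col < cols:
        pivot = next((i for i in range(r, rows) if M[i][col] != 0), None)
        if pivot is None:
            col += 1
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, rows):
            f = M[i][col] / M[r][col]
            M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
        col += 1
    return r

A = [[1, 0, 2, 1],
     [0, 1, 1, 3],
     [1, 1, 3, 4]]        # row 3 = row 1 + row 2, so the rank is 2
m = len(A)                # dimension of the domain of T_A : F^3 -> F^4
assert rank(A) == 2

# A nonzero vector of the null-space: X A = 0, so the nullity is m - rank = 1.
X = [1, 1, -1]
assert all(sum(X[i] * A[i][k] for i in range(3)) == 0 for k in range(4))
```

Here rank 2 plus nullity 1 equals m = 3, the dimension of the domain, as Theorem 11 requires.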
Theorem 12. For a linear transformation T: F^n → F^n to be nonsingular, each of the following conditions is necessary and sufficient: (a) rank T = n; (b) nullity T = 0.

Proof. Condition (a) states that T carries F^n onto itself, while condition (b) states that ξT = 0 implies ξ = 0 in F^n. Thus Theorem 12 is just a restatement of conditions (iv) and (ii) of Corollary 1 to Theorem 9.

Exercises

1. Find the ranges, ranks, null-spaces, and nullities of the transformations given in Exs. 1(a)-1(d), 4(a), and 4(b).
2. Construct a transformation of R^3 into itself which will have its range spanned by the vectors (1, …, 2) and (3, …, 4).
3. Construct a transformation of R^4 into itself which has a null-space spanned by (1, −1, …, 1).
4. If the n×n matrix A is nonsingular, show that for every n×n matrix B the matrices B, AB, and BA all have the same rank.
5. Prove that the row-rank of a product AB never exceeds the row-rank of B.
6. Prove that rank (A + B) ≤ rank (A) + rank (B).
7. Given the ranks of A and B, what is the rank of (A 0; 0 B)?

8.8 Elementary Matrices

The elementary row operations on a matrix A, introduced in Chap. 7, may be interpreted as premultiplications of A by suitable factors. Thus

(0 1)(a_1 a_2)   (0·a_1 + 1·b_1  0·a_2 + 1·b_2)   (b_1 b_2)
(1 0)(b_1 b_2) = (1·a_1 + 0·b_1  1·a_2 + 0·b_2) = (a_1 a_2);

two rows in a matrix may be permuted by premultiplying the matrix by a matrix obtained by permuting the same rows of the identity matrix I. To add the second row to the first row, or to multiply the second row by c, simply do the same for an identity factor in front:

(1 1)(a_1 a_2)   (a_1 + b_1  a_2 + b_2)     (1 0)(a_1 a_2)   (a_1   a_2)
(0 1)(b_1 b_2) = (b_1        b_2      ),    (0 c)(b_1 b_2) = (cb_1  cb_2).

Similar results hold for m×n matrices; the prefactors used to represent the operations are known as elementary matrices. There are thus three types of elementary matrices, samples of which are

            (1 0 0 0)              (1 0 0 0)              (1 0 0 0)
(50) H_24 = (0 0 0 1), I + 2E_33 = (0 1 0 0), I + dE_21 = (d 1 0 0).
            (0 0 1 0)              (0 0 3 0)              (0 0 1 0)
            (0 1 0 0)              (0 0 0 1)              (0 0 0 1)
In general, let I_k denote the kth row of the m×m identity matrix I. Then the interchange in I of row i with row j gives the elementary permutation matrix H = ||h_ij|| whose rows H_k are

(51) H_i = I_j, H_j = I_i, H_k = I_k (k ≠ i, j).

Similarly, multiplication of row i in I by a nonzero scalar c gives the matrix M whose rows M_k are

(52) M_i = cI_i (c ≠ 0), M_k = I_k (k ≠ i).

If E_ij is the matrix having the single entry 1 in the ith row and the jth column, and all other entries zero, this matrix M can be written as M = I + (c − 1)E_ii. Finally, the elementary operation of adding d times the ith row to the jth row, when applied to I, gives the elementary matrix F = I + dE_ji, whose rows F_k are

(53) F_j = I_j + dI_i, F_k = I_k (k ≠ j).

Definition. An elementary m×m matrix E is any matrix obtained from the m×m identity matrix I by one elementary row operation.

Theorem 13. Each elementary row operation on an m×n matrix A amounts to premultiplication by the corresponding elementary m×m matrix E.

This may be proved easily by direct computation of the product EA. Consider, for example, the elementary operation of adding the ith row to the jth row of A. The rows F_k of the corresponding elementary matrix F are then given by (53), with d = 1. The rows of any product EA are always found
from the rows of the first factor: by formula (40), each row of EA is the linear combination of the rows of A whose coefficients form the corresponding row of E. So here

(FA)_j = F_jA = (I_j + I_i)A = (IA)_j + (IA)_i = A_j + A_i,
(FA)_k = F_kA = I_kA = (IA)_k = A_k (k ≠ j).

These equations state that the rows of FA are obtained from the rows of IA = A by adding the ith row to the jth row. In other words, the elementary operation in question does carry A into FA, as asserted in Theorem 13; the other two types of operation are handled similarly.

The equivalence between row operations and premultiplication gives to Gauss elimination another useful interpretation. In the elimination, subtracting multiples of any ith row from later rows amounts to premultiplication by a lower triangular matrix L_k; hence, if s is the number of such steps (s ≤ n(n − 1)/2), not only is the coefficient matrix A reduced to upper triangular form U (which is obvious), but U = LA, where L = L_sL_(s-1) ··· L_1 is lower triangular. Therefore Ax = b is equivalent to Ux = Lb. Moreover, in the usual case that no zeros are produced on the main diagonal, we can write A = L^(-1)U, where L^(-1) is lower and U is upper triangular; this is referred to as the "LU-decomposition" of A.

Theorem 14. Every elementary matrix E is nonsingular.

Proof. E is obtained from I by one elementary row operation. The reverse operation corresponds to some elementary matrix E*, and carries E back into I; by Theorem 13, it carries E into E*E. Hence E*E = I: E has a left-inverse E*, and so is nonsingular. Q.E.D.

Theorem 15. A square matrix P is nonsingular if and only if it can be written as a product of elementary matrices.

Proof. Every product of elementary matrices is nonsingular, since each factor is nonsingular (Theorem 14) and products of nonsingular matrices are nonsingular. Conversely, by Corollary 3 of Theorem 9 of Chap. 7, a nonsingular square matrix A can be reduced to I by elementary row operations, say

(54) E_sE_(s-1) ··· E_1A = I.

Multiply each side of this equation by A^(-1) on the right: E_sE_(s-1) ··· E_1 = A^(-1). Hence A^(-1) is a product of elementary matrices; and any nonsingular matrix P, being the inverse (P^(-1))^(-1) of another nonsingular matrix, can likewise be so written. Q.E.D.

Corollary 1. Two m×n matrices A and B are row-equivalent if and only if B = PA for some nonsingular matrix P.

For B is row-equivalent to A if and only if (by Theorem 13) B = E_nE_(n-1) ··· E_1A, where the E_i are elementary; and by Theorem 15 this amounts to B = PA, with P nonsingular.

Matrix inverses can be calculated using elementary matrices. If a square matrix A is reduced to the identity by a sequence of row operations, the same sequence of operations applied to the identity matrix I will give a matrix which is the inverse of A: for by (54), the matrix E_sE_(s-1) ··· E_1 = A^(-1) is the result of applying to the identity I the sequence of elementary operations E_1, ···, E_s. Given any A, this procedure will, by a finite sequence of rational operations, either produce an inverse for A or reduce A to an equivalent singular matrix; in the latter event A has no inverse. This is an efficient construction for the inverse. Incidentally, for matrices larger than 3×3, this method is more efficient than the devices from determinant theory sometimes used to find A^(-1) (cf. Chap. 10).
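Theorem 13 and the three types of elementary matrix in (51)-(53) can be verified by direct computation, as in this Python sketch (an added illustration, not from the text; rows are numbered from 1 in the prose but indexed from 0 in the code):

```python
def matmul(A, B):
    m, n, r = len(A), len(B), len(B[0])
    return [[sum(A[i][j] * B[j][k] for j in range(n)) for k in range(r)]
            for i in range(m)]

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

A = [[1, 2],
     [3, 4],
     [5, 6]]

# Type (51) -- interchange rows 1 and 3 of A: premultiply by H, the identity
# with the same two rows interchanged.
H = identity(3)
H[0], H[2] = H[2], H[0]
assert matmul(H, A) == [[5, 6], [3, 4], [1, 2]]

# Type (52) -- multiply row 2 by c = 5: premultiply by M = I + (c - 1)E_22.
M = identity(3)
M[1][1] = 5
assert matmul(M, A) == [[1, 2], [15, 20], [5, 6]]

# Type (53) -- add d = 2 times row 1 to row 3: premultiply by F = I + dE_31.
F = identity(3)
F[2][0] = 2
assert matmul(F, A) == [[1, 2], [3, 4], [7, 10]]
```

Each assertion compares the premultiplied product EA with the result of performing the row operation on A directly, for a 3×2 rectangular A.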
Theorem 15 has a simple geometrical interpretation in the two-dimensional case. The corresponding linear transformations are, when the field is that of all real numbers:

(H_12) a reflection of the plane in the 45° line through the origin;
(M_i, for c positive) a compression (or elongation) parallel to the x- or y-axis;
(M_i, for c negative) a compression followed by a reflection in the axis;
(F_ij) a shear parallel to one of the axes.

Theorem 16. Any nonsingular homogeneous linear transformation of the plane may be represented as a product of shears, one-dimensional compressions (or elongations), and reflections.

This primarily geometric conclusion has been obtained by algebraic argument on matrices. Analogous results may be stated for a space of three or more dimensions.

The elementary row operations on a matrix involve only manipulations within the given field F. Thus, if the elements of a matrix A are rational numbers, the elementary operations can be carried out just as if the field contained only the rational numbers. In either field we get the same echelon form, hence the same number of independent rows. This gives

Corollary. If a matrix A over a field F has its entries all contained in a smaller field F', then the rank of A relative to F is the same as the rank of A relative to the smaller field F'.

The operations with row-equivalence are exactly those used to solve simultaneous linear equations (§2.3 and §7.5). To state the connection, consider m equations

Σ_j a_ij x_j = b_i  (i = 1, ···, m; j = 1, ···, n)

in the n unknowns x_j. The coefficients of the unknowns form an m x n matrix A, while the constant terms b_i constitute a column vector B^T. The equations may be written in matrix form as AX^T = B^T, where X^T is the column vector of unknowns (the transpose of the vector X = (x_1, ···, x_n)). The column of constants B^T may be adjoined to the given matrix A to form an m x (n + 1) matrix (A, B^T), the so-called augmented matrix of the given system of equations. Operations on the rows of this matrix correspond to operations carrying the given equations into equivalent equations.
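The Corollary above rests on the fact that elimination uses only field operations on the entries. A short sketch (illustrative, not from the text): computing rank by row reduction with exact rationals, so that a matrix of rational entries is handled entirely inside the rationals.

```python
from fractions import Fraction

def rank(A):
    """Rank via reduction to echelon form, using only the four field
    operations on the entries (here, exact rational arithmetic)."""
    M = [[Fraction(x) for x in row] for row in A]
    rows, ncols = len(M), len(M[0])
    rk, col = 0, 0
    while rk < rows and col < ncols:
        p = next((r for r in range(rk, rows) if M[r][col] != 0), None)
        if p is None:
            col += 1                      # no pivot in this column
            continue
        M[rk], M[p] = M[p], M[rk]         # row interchange
        piv = M[rk][col]
        M[rk] = [x / piv for x in M[rk]]  # scale pivot row
        for r in range(rows):
            if r != rk and M[r][col] != 0:
                m = M[r][col]
                M[r] = [x - m * y for x, y in zip(M[r], M[rk])]
        rk += 1
        col += 1
    return rk
```

Since no step ever leaves the field generated by the entries, the rank computed this way is the same whether the matrix is viewed over Q or over R.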
Thus two systems of equations AX^T = B^T and A*X^T = B*^T have the same solutions X^T if their augmented matrices are row-equivalent.

Exercises

1. (a) Display the possible 3 x 3 elementary matrices.
   (b) Draw a diagram to represent each n x n elementary matrix of the form (51)-(53).
2. Find the inverse of each of the 4 x 4 elementary matrices H_24, I + 2E_33, I + dE_21 displayed in the text.
3. Write each of the following matrices as a product of elementary matrices:
   (a) (3 6; 2 1),  (b) (4 -2; 3 -5),  (c) the first matrix of Ex. 9(a) of §8.2.
4. Find the row-equivalent echelon form for each of the matrices displayed in Ex. 3.
5. Find the inverses of the matrices (1 2 0; 1 3 0; …) and (0 1 …; 1 0 …; 2 2 …).
6. Represent the transformation x' = 2x - 5y, y' = -3x + y as a product of shears, compressions, and reflections.
7. Prove that any nonsingular 2 x 2 matrix can be represented as a product of the matrices (1 0; d 1), (c 0; 0 1), and (0 1; 1 0).
8. Prove Theorem 13 for 2 x 2 matrices by direct computation for the five matrices displayed after Theorem 15.
9. Show that the rank of a product never exceeds the rank of either factor. Using Ex. …, sharpen your result.
10. State and prove an analogue of Corollary 2 to Theorem 15.
11. Prove that a system of linear equations AX^T = B^T has a solution if and only if the rank of A equals the rank of the augmented matrix (A, B^T). What does this result mean geometrically?
12. Let AX^T = B^T be a system of nonhomogeneous linear equations with a particular solution X^T = X_0^T. Prove that every solution X^T can be written as X^T = X_0^T + Y^T, where Y^T is a solution of the homogeneous equations AY^T = 0, and conversely.
13. Prove: If a system of linear equations with coefficients in a field F has no solutions in F, it has no solutions in any larger field.

*8.9 Equivalence and Canonical Form

Operations analogous to elementary row operations can also be applied to columns. An elementary column operation on an m x n matrix A thus means either (i) the interchange of any two columns of A, (ii) the multiplication of any column by a nonzero scalar c, or (iii) the addition of any multiple of one column to another column.
The replacement of A by its transpose A^T changes elementary column operations into elementary row operations, and vice versa. Hence A can be transformed into B by a succession of elementary column operations if and only if the transpose A^T can be transformed into B^T by a succession of elementary row operations. By Theorem 15 this means that B^T = PA^T with P nonsingular, or

B = (B^T)^T = (PA^T)^T = AP^T = AQ,

where Q = P^T is nonsingular. Conversely, a nonsingular Q makes B = AQ column-equivalent to A. In other words, application of column operations is equivalent to postmultiplication by nonsingular factors, much as in Theorem 13. The explicit postfactors corresponding to each elementary operation may be found by applying this operation to the identity matrix.

Column and row operations may be applied jointly. We may define two m x n matrices A and B to be equivalent if and only if A can be changed to B by a succession of elementary row and column operations. Using simultaneous row and column operations, we can reduce matrices to a very simple canonical form (see §9.5).

Theorem 17. Any m x n matrix A is equivalent to a diagonal matrix D in which the diagonal entries are either 0 or 1, all the 1's preceding all the 0's on the diagonal. Explicitly, if r is the number of nonzero entries in D, where clearly r ≤ m and r ≤ n, then D may be displayed in block form as

(56)  D = ( I_r        O_(r,n-r)   )
           ( O_(m-r,r)  O_(m-r,n-r) ),

where I_r is the r x r identity matrix and O_ij denotes the i x j matrix of zeros.

The proof is by induction on the number m of rows of A. If all entries of A are zero, there is nothing to prove. Otherwise, by permuting rows and columns we can bring some nonzero entry c to the a_11 position. After the first row is multiplied by c^(-1), the entry a_11 is 1. All other entries in the first column can be made zero by adding suitable multiples of the first row to each other row, and the same may be done with the other entries of the first row, using column operations. This reduces A to an equivalent matrix of the form

(57)  B = (1  0)
           (0  C),

where C is an (m - 1) x (n - 1) matrix. Upon applying the induction assumption to C we are done.
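The reduction of Theorem 17 is entirely constructive. Here is a minimal sketch (our own illustrative code) that carries a matrix into the block form (56) by simultaneous row and column operations, returning the canonical matrix and the number r of units on the diagonal:

```python
from fractions import Fraction

def canonical_form(A):
    """Reduce A by row and column operations to the form (56):
    r ones on the diagonal followed by zeros. Returns (D, r)."""
    M = [[Fraction(x) for x in row] for row in A]
    m, n = len(M), len(M[0])
    r = 0
    while r < m and r < n:
        # bring some remaining nonzero entry into the (r, r) position
        pivot = next(((i, j) for i in range(r, m) for j in range(r, n)
                      if M[i][j] != 0), None)
        if pivot is None:
            break
        i, j = pivot
        M[r], M[i] = M[i], M[r]                     # row interchange
        for row in M:                               # column interchange
            row[r], row[j] = row[j], row[r]
        piv = M[r][r]
        M[r] = [x / piv for x in M[r]]              # scale row: entry becomes 1
        for i in range(m):                          # clear the rest of column r
            if i != r and M[i][r] != 0:
                f = M[i][r]
                M[i] = [x - f * y for x, y in zip(M[i], M[r])]
        for j in range(n):                          # clear the rest of row r
            if j != r and M[r][j] != 0:
                f = M[r][j]
                for i in range(m):
                    M[i][j] -= f * M[i][r]
        r += 1
    return M, r
```

The count r returned is, by the corollaries below, exactly the rank of A.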
Since row operations correspond to premultiplication and column operations to postmultiplication by nonsingular factors, we then get the following result, much as in Theorem 15.

Theorem 18. An m x n matrix A is equivalent to a matrix B if and only if B = PAQ for suitable nonsingular m x m and n x n matrices P and Q.

Theorem 19. Equivalent matrices have the same rank.

Proof. We already know (§7.5, Theorem 7) that row-equivalent matrices have the same row space, hence the same rank. Hence we need only show that column-equivalent matrices A and B = AQ (Q nonsingular) have the same rank. By Theorem 11, this is true if A and B have the same nullity, which is certainly true if they have the same null-space. But XA = 0 clearly implies XB = XAQ = 0Q = 0; conversely, XB = 0 implies XA = XAQQ^(-1) = XBQ^(-1) = 0Q^(-1) = 0. That is, column-equivalent matrices have the same null-space, and the proof is complete.

Corollary 1. Equivalent matrices have the same column rank.

For the equivalence of A to B entails the equivalence of the transposes A^T and B^T; hence A^T and B^T have the same rank, and so A and B have the same column rank.

Corollary 2. Two m x n matrices are equivalent if and only if they have the same rank.

If equivalent, they have the same rank (Theorem 19). Conversely, if they have the same rank, both are equivalent to the same canonical matrix D, hence to each other.

Corollary 3. The (row) rank of a matrix always equals its column rank.

In the canonical form (56) the rank is the same as the column rank. Since both ranks are unaltered by equivalence, we deduce Corollary 3.

Corollary 4. The column rank of A (the maximum number of independent columns of A) equals the (row) rank of its transpose A^T.

Corollary 5. An n x n matrix A is nonsingular if and only if it is equivalent to the identity matrix I.

For A is equivalent to I if and only if it has rank n; by Theorem 12, this is true if and only if A is nonsingular.
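Corollary 3 is easy to check numerically: the rank of a matrix and the rank of its transpose always agree. A small self-contained sketch (illustrative, not from the text):

```python
from fractions import Fraction

def row_rank(A):
    """Rank as the number of nonzero rows of an echelon form."""
    M = [[Fraction(x) for x in row] for row in A]
    rk = 0
    for col in range(len(M[0])):
        p = next((r for r in range(rk, len(M)) if M[r][col] != 0), None)
        if p is None:
            continue
        M[rk], M[p] = M[p], M[rk]
        for r in range(len(M)):
            if r != rk and M[r][col] != 0:
                f = M[r][col] / M[rk][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[rk])]
        rk += 1
    return rk

def transpose(A):
    return [list(col) for col in zip(*A)]

A = [[1, 2, 3],
     [4, 5, 6]]        # a 2 x 3 matrix with independent rows
assert row_rank(A) == row_rank(transpose(A)) == 2
```

The agreement of the two counts for every test matrix is exactly the content of Corollary 3.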
Finally, since both ranks are unaltered by equivalence, the rank r of A determines the number r of units on the diagonal in (56). This gives

Corollary 6. An m x n matrix A is equivalent to one and only one diagonal matrix of the form (56).

Exercises

1. Find an equivalent diagonal matrix of the form (56) for each matrix of Ex. 3.
2. Do the same for the matrices of Ex. 7(b) of §7.6.
3. Check Corollary 3 of Theorem 19 by computing both the row and the column ranks (a) in Ex. 7(a) of §7.6, (b) in Exs. ….
4. Let T be a linear transformation of an m-dimensional vector space V into an n-space W. Show that by suitably choosing bases in V and in W the equations of T take on the form y_i = x_i (i = 1, ···, r), y_j = 0 (j = r + 1, ···, n).
5. (a) Prove that the transpose of any elementary matrix is elementary.
   (b) Use this to give an independent proof of the fact that the transpose of any nonsingular matrix is nonsingular.
6. If A and B are n x n matrices of ranks r and s, prove that the rank of AB is never less than (r + s) - n. (Hint: Use the canonical form for A.)
7. (a) Prove Sylvester's law of nullity: the nullity of a product AB never exceeds the sum of the nullities of the factors, and is never less than the nullity of A; if A is square, it is also at least the nullity of B.
   (b) Give examples to show that both of these limits can be attained by the nullity of AB.
8. Prove: An m x n matrix A has rank at most 1 if and only if it can be represented as a product A = BC, where B is m x 1 and C is 1 x n.
9. Prove that any matrix of rank r is the sum of r matrices of rank 1.
10. Show that any nonsingular n x n matrix P with nonzero diagonal entries can be written as P = TU, where T and U are triangular matrices.
11. Let the sequence E_1, ···, E_r of elementary row operations, suitably interspersed with the elementary column operations E_1', ···, E_s', reduce A to I. Show that A^(-1) = QP, where P = E_r ··· E_1 I and Q = I E_1' ··· E_s' are obtained from I by the same sequences of elementary operations.
12. Show that if PAQ = D, as in Theorem 18, then the system AX^T = B^T of simultaneous linear equations (§8.8) may be solved by solving DY^T = PB^T and then computing X^T = QY^T.

*8.10 Bilinear Functions and Tensor Products

Now let V and W be any vector spaces over the same field F. A bilinear function f(ξ, η) of the two variables ξ ∈ V and η ∈ W is defined as a function with values in F such that

(58)   f(aξ + bξ', η) = af(ξ, η) + bf(ξ', η),
(58')  f(ξ, cη + dη') = cf(ξ, η) + df(ξ, η'),

for all ξ, ξ' ∈ V and η, η' ∈ W.
There are many such functions.

Theorem 20. If V and W have finite bases β_1, ···, β_m and γ_1, ···, γ_n respectively, then every bilinear function f(ξ, η) of the variables ξ = x_1β_1 + ··· + x_mβ_m ∈ V and η = y_1γ_1 + ··· + y_nγ_n ∈ W has the form

(59)  f(ξ, η) = Σ_i Σ_j x_i y_j a_ij,  where a_ij = f(β_i, γ_j).

Note that the two equations of (59) describe inverse functions A ↦ f and f ↦ A between m x n matrices A over F and bilinear functions f: F^m x F^n → F, where F^m x F^n is the Cartesian product of F^m and F^n defined in Chap. 1. Hence (59) is a bijection.

We can also define bilinear functions h(ξ, η) of variables ξ and η from vector spaces V and W with values in a third vector space U (U, V, and W all over the same field F); such a function h: V x W → U is bilinear when it satisfies (58) and (58'). For example, the outer product ξ x η of two vectors from R^3 is bilinear, with U = V = W = R^3. Likewise, if we let U = V = W = M_n be the vector space of all n x n matrices over F, the "matrix product" function p(A, B) = AB is bilinear from M_n x M_n to M_n, as stated in Theorems 3 and 5. The preceding bijection can be generalized to this setting; repeating the argument used to prove Theorem 20, one easily obtains the following result.

Theorem 21. Let the vector spaces V and W over F have finite bases β_1, ···, β_m and γ_1, ···, γ_n respectively. Then any mn vectors B_ij in a third vector space U over F determine a bilinear function h: V x W → U by the formula

(60)  h(ξ, η) = Σ_i Σ_j x_i y_j B_ij,

for ξ and η expressed as before. Conversely, any bilinear h: V x W → U has this form, in terms of the mn vectors B_ij = h(β_i, γ_j). Moreover, H ↦ h is a bijection from the set of m x n matrices H with entries B_ij in U to the set of bilinear functions h: V x W → U. The proof is similar.
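In coordinates, the correspondence of Theorem 20 says that evaluating f amounts to the matrix product X A Y^T. A minimal sketch (our own illustrative code, not the book's notation):

```python
def bilinear_from_matrix(A):
    """Theorem 20 in coordinates: the matrix A with entries
    a_ij = f(beta_i, gamma_j) determines f by
    f(xi, eta) = sum_i sum_j x_i y_j a_ij, i.e. f = X A Y^T."""
    def f(X, Y):
        return sum(x * a * y
                   for x, row in zip(X, A)
                   for a, y in zip(row, Y))
    return f

# the ordinary dot product on F^2 corresponds to A = I
dot = bilinear_from_matrix([[1, 0], [0, 1]])
assert dot((1, 2), (3, 4)) == 11

# bilinearity (58) in the first argument: f(a xi + b xi', eta) = a f(xi, eta) + b f(xi', eta)
f = bilinear_from_matrix([[1, 2], [3, 4]])
xi, xi2, eta = (1, 0), (0, 1), (5, 7)
a, b = 2, 3
combo = (a * xi[0] + b * xi2[0], a * xi[1] + b * xi2[1])
assert f(combo, eta) == a * f(xi, eta) + b * f(xi2, eta)
```

Conversely, reading off a_ij = f(β_i, γ_j) recovers the matrix from the function, which is the bijection asserted after (59).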
This theorem suggests a way of getting a single standard or "most general" bilinear function ⊗ on V x W, with the symbol ⊗ usually written between its arguments as ξ ⊗ η = ⊗(ξ, η). The values of this function ⊗ lie in a new vector space called V ⊗ W. Namely, we construct this space so as to have a basis of mn vectors α_ij, which serve as the values α_ij = β_i ⊗ γ_j of ⊗ on the given basis elements of V and W. This means that the function ⊗ can be defined by

(61)  (x_1β_1 + ··· + x_mβ_m) ⊗ (y_1γ_1 + ··· + y_nγ_n) = Σ_i Σ_j x_i y_j α_ij,

for i = 1, ···, m and j = 1, ···, n, as in (60). Indeed, this new space V ⊗ W is best described by an intrinsic property not referring to any choice of bases in V and W, as follows:
Theorem 22. For any given finite-dimensional vector spaces V and W over a field F, there exist a vector space V ⊗ W and a bilinear function ⊗: V x W → V ⊗ W with the following property: any bilinear function h: V x W → U to any vector space U over F can be expressed in terms of ⊗ as h(ξ, η) = (ξ ⊗ η)T for a unique linear function T: V ⊗ W → U.

Proof. We first construct ⊗ as above. Then any bilinear h can be expressed, as in (60), in terms of the mn vectors B_ij = h(β_i, γ_j). The parallel between (60) and (61) leads to the linear transformation T: V ⊗ W → U, which is uniquely determined as that transformation which takes each basis vector α_ij of V ⊗ W to B_ij in U; for it, h(ξ, η) = (ξ ⊗ η)T. On the other hand, if T: V ⊗ W → U is any linear transformation with h(ξ, η) = (ξ ⊗ η)T for all ξ, η, then B_ij = h(β_i, γ_j) = (α_ij)T, so T must be the T used above. Therefore T is unique, as asserted in the theorem.

EXAMPLE. Let V = F^m, W = F^n, and let the β_i and γ_j be the standard unit vectors ε_i and ε_j' in these spaces. Then V ⊗ W = F^mn can be the space of all m x n matrices, while ⊗ maps each pair (ξ, η) ∈ V x W into the rank one matrix with entries x_i y_j. Each bilinear h: V x W → U is then determined by the mn vectors h(ε_i, ε_j') = B_ij; h is the composite ⊗T of ⊗ as defined above and the linear function T: V ⊗ W → U defined by the formula T applied to a matrix with entries a_ij giving Σ a_ij B_ij, because for all ξ ∈ V, η ∈ W, (ξ ⊗ η)T = Σ x_i y_j B_ij.

Universality. This theorem can be represented by a diagram in which the top row is the "standard" bilinear function ⊗ and the bottom row is any bilinear function h; the theorem states that there is always exactly one linear transformation T so that the diagram "commutes" as ⊗T = h. For this reason, ⊗ is called the universal bilinear function: any other h can be obtained from it. The space V ⊗ W with this universal property is called the tensor product of the spaces V and W.

In particular, if we constructed any other standard bilinear function ⊗' with the same "universal" property — say, by using different bases for V and W — we would have a diagram with ⊗T = ⊗' and ⊗'T' = ⊗. This means that ⊗TT' = ⊗ = ⊗I, with I the identity; since the linear transformation making the diagram commute is unique, TT' = I. Similarly T'T = I, so T is invertible with inverse T' and so is an isomorphism V ⊗ W ≅ V ⊗' W.
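The Example can be computed directly: ξ ⊗ η is the rank one matrix with entries x_i y_j, and the universal T sends a matrix to Σ a_ij B_ij. A small sketch (our own illustrative code; here we take U = F and let h be the dot product):

```python
def tensor(xi, eta):
    """xi ⊗ eta as the rank one m x n matrix with entries x_i * y_j."""
    return [[x * y for y in eta] for x in xi]

def universal_T(B):
    """The unique linear T: V ⊗ W -> U determined by the values
    B[i][j] = h(e_i, e_j'); here T(matrix a) = sum_ij a_ij * B_ij."""
    def T(M):
        return sum(a * b
                   for row_m, row_b in zip(M, B)
                   for a, b in zip(row_m, row_b))
    return T

# factor the bilinear function h = dot product through ⊗:
# B_ij = h(e_i, e_j') is 1 when i = j and 0 otherwise
B = [[1, 0], [0, 1]]
T = universal_T(B)
xi, eta = (1, 2), (3, 4)
h = sum(x * y for x, y in zip(xi, eta))      # the dot product
assert T(tensor(xi, eta)) == h               # h(xi, eta) = (xi ⊗ eta) T
```

This is the commuting diagram ⊗T = h of the text, verified on one concrete h.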
Had we constructed V ⊗ W not from the bases β_1, ···, β_m and γ_1, ···, γ_n but from some different bases for V and W, we would therefore have obtained an isomorphic space. That is, this last result shows that the "universal" property determines the space V ⊗ W uniquely up to an isomorphism. For that matter, this tensor product space can be constructed in other ways, without using any bases (or using infinitely many basis vectors for infinite-dimensional spaces V and W); it always has the same "universal" property. Our particular construction with its basis β_i ⊗ γ_j shows that its dimension is

dim (V ⊗ W) = (dim V)(dim W).

In turn, given one space V and its dual space V*, we can construct various tensor products:

V ⊗ V,  V ⊗ V*,  V ⊗ V ⊗ V,  V ⊗ V* ⊗ V,  ···.

These are the spaces of tensors used in differential geometry and relativity theory.

Exercises

1. Show that the mapping f ↦ A defined by (59) is an isomorphism of vector spaces from the space of all bilinear functions on V x W to the space of all m x n matrices over F.
2. Show that the function p(A, B) = AB is bilinear from V x W to U, where V and W are the spaces of all m x r and all r x n matrices over F, respectively. What is U?
3. Show that the formula q(x) = a(x)p'(x) defines a bilinear function φ(a, p) = q from the Cartesian product R[x] x R[x] of two copies of the space of all real polynomials to R[x].

In Exs. 4 and 5, let U, V, and W be any vector spaces over a field F.

4. Establish the following natural isomorphisms:
   V ⊗ F ≅ V,  V ⊗ W ≅ W ⊗ V,  U ⊗ (V ⊗ W) ≅ (U ⊗ V) ⊗ W.
5. Show that Hom (V ⊗ W, U) ≅ Hom (V, Hom (W, U)).
*6. Every vector in V ⊗ W is a sum of terms ξ ⊗ η. Show that there are vectors not representable as a single summand ξ ⊗ η. (Hint: Take V = F^2 = W.)
*7. The Kronecker product A ⊗ B of an m x m matrix A and an n x n matrix B is the matrix C with entries c_pq = a_ik b_jl, where the p and q are the pairs (i, j) and (k, l), suitably ordered. To what linear transformation on V ⊗ W does A ⊗ B naturally correspond?

*8.11 Quaternions

The algebraic laws valid for square matrices apply to other algebraic systems, such as the quaternions of Hamilton. These quaternions constitute a four-dimensional vector space over the field of real numbers, with a basis of four special vectors denoted by 1, i, j, k.
A quaternion is a vector

x = x_0 + x_1 i + x_2 j + x_3 k,

with real coefficients x_0, x_1, x_2, x_3. The algebraic operations for quaternions are the usual two vector operations (vector addition and scalar multiplication), plus a new operation of quaternion multiplication. The product of any two of the quaternions 1, i, j, k is defined by the requirement that 1 act as an identity and by the table

(62)  i^2 = j^2 = k^2 = -1,  ij = -ji = k,  jk = -kj = i,  ki = -ik = j.

If c and d are any scalars and l, m are any two of 1, i, j, k, the product (cl)(dm) is defined as (cd)(lm). These rules, with the distributive law, determine the product of any two quaternions. Thus, if x = x_0 + x_1 i + x_2 j + x_3 k and y = y_0 + y_1 i + y_2 j + y_3 k are any two quaternions, then their product is

(63)  xy = x_0 y_0 - x_1 y_1 - x_2 y_2 - x_3 y_3
            + (x_0 y_1 + x_1 y_0 + x_2 y_3 - x_3 y_2) i
            + (x_0 y_2 + x_2 y_0 + x_3 y_1 - x_1 y_3) j
            + (x_0 y_3 + x_3 y_0 + x_1 y_2 - x_2 y_1) k.

Though the multiplication of quaternions is noncommutative, they satisfy every other postulate for a field. Number systems sharing this property are called division rings.

Definition. A division ring is a system R of elements closed under two single-valued binary operations, addition and multiplication, such that (i) under addition, R is a commutative group with an identity 0; (ii) under multiplication, the elements other than 0 form a group; (iii) both distributive laws hold: a(b + c) = ab + ac and (a + b)c = ac + bc.

From these postulates, the rule a0 = 0a = 0 and thence the associative law for products with a factor 0 can be deduced easily. It follows that any commutative division ring is a field.

We note also that analogues of the results of §§8.1-8.7 are valid over division rings if one is careful about the side on which the scalar factors appear. Thus for the product cξ of a vector ξ by a scalar, we write the scalar to the left; but in defining (§8.2) the product of a transformation T by a scalar, we write the scalar on the right, ξ(Tc) = (cξ)T, and we likewise multiply a matrix by a scalar on the right.
The space of linear transformations T of a left vector space over a division ring R is thus a right vector space over R.

Any quaternion x = x_0 + x_1 i + x_2 j + x_3 k can be decomposed into its real part x_0 and its "pure quaternion" part x_1 i + x_2 j + x_3 k. We define the conjugate of x as x* = x_0 - x_1 i - x_2 j - x_3 k. It is then easily shown that the norm of x, defined by N(x) = xx*, is a real number which satisfies

(64)  N(x) = xx* = x*x = x_0^2 + x_1^2 + x_2^2 + x_3^2 > 0  if x ≠ 0.

It is easily seen that the quaternions x = x_0 + x_1 i with x_2 = x_3 = 0 constitute a subsystem isomorphic with the field of complex numbers. The proof of the associative law is most easily accomplished using complex numbers: any quaternion can be written in the form

(65)  x = z_1 + z_2 j,  z_1 = x_0 + x_1 i,  z_2 = x_2 + x_3 i,

where z_1 and z_2 behave like ordinary complex numbers, and where z_1* = x_0 - x_1 i is the complex (and quaternion!) conjugate of z_1 = x_0 + x_1 i.

Theorem 23. The quaternions form a division ring.

The proof of every postulate, except for the existence of multiplicative inverses (which implies the cancellation law, cf. Theorem 3 of §6), is trivial. To prove that every nonzero quaternion x = x_0 + x_1 i + x_2 j + x_3 k has an inverse, note from (64) that N(x) = xx* = x*x ≠ 0; hence x has the inverse x*/N(x).

Every quaternion x satisfies a quadratic equation f(t) = 0 with roots x and x*, and with real coefficients.
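The formula (63), the conjugate, the norm (64), and the inverse x*/N(x) can all be checked mechanically. A minimal sketch (our own illustrative code, representing a quaternion as the 4-tuple of its coefficients; the identity (67) below for pure quaternions is checked as well):

```python
from fractions import Fraction

def qmul(x, y):
    """Quaternion product of 4-tuples (x0, x1, x2, x3), following (63)."""
    x0, x1, x2, x3 = x
    y0, y1, y2, y3 = y
    return (x0*y0 - x1*y1 - x2*y2 - x3*y3,
            x0*y1 + x1*y0 + x2*y3 - x3*y2,
            x0*y2 + x2*y0 + x3*y1 - x1*y3,
            x0*y3 + x3*y0 + x1*y2 - x2*y1)

def conj(x):                       # x* = x0 - x1 i - x2 j - x3 k
    return (x[0], -x[1], -x[2], -x[3])

def norm(x):                       # N(x) = x x* = x0^2 + x1^2 + x2^2 + x3^2
    return sum(c * c for c in x)

def qinv(x):                       # x^-1 = x* / N(x), defined for x != 0
    n = Fraction(norm(x))
    return tuple(Fraction(c) / n for c in conj(x))

one, i, j, k = (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)

assert qmul(i, j) == k and qmul(j, i) == (0, 0, 0, -1)   # ij = k = -ji
assert qmul(i, i) == (-1, 0, 0, 0)                       # i^2 = -1
x = (1, 2, 3, 4)
assert qmul(x, qinv(x)) == one == qmul(qinv(x), x)       # inverses exist
y = (2, 0, 1, 1)
assert norm(qmul(x, y)) == norm(x) * norm(y)             # the norm is multiplicative
# for pure quaternions the real part of the product is -(xi, eta)
xi, eta = (0, 1, 2, 3), (0, 4, 5, 6)
assert qmul(xi, eta)[0] == -(1*4 + 2*5 + 3*6)
```

The multiplicativity of the norm checked here is what makes the inverse construction of Theorem 23 work.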
This equation is

f(t) = (t - x)(t - x*) = t^2 - (x + x*)t + xx* = t^2 - 2x_0 t + N(x).

Indeed, by definition, the product of two quaternions in the form (65) is

(66)  (z_1 + z_2 j)(w_1 + w_2 j) = (z_1 w_1 - z_2 w_2*) + (z_1 w_2 + z_2 w_1*) j.

Actually, all the rules of (62) are contained in the expansion (66). Using this formula, we can readily verify the associative law.

Of the many curious properties of quaternion multiplication, one of the most curious concerns the multiplication of the pure quaternions ξ = x_1 i + x_2 j + x_3 k and η = y_1 i + y_2 j + y_3 k. By (63), their product is

(67)  ξη = -(ξ, η) + ξ x η,

where (ξ, η) = x_1 y_1 + x_2 y_2 + x_3 y_3 is the "inner product" defined in Chap. 7, and ξ x η = (x_2 y_3 - x_3 y_2)i + (x_3 y_1 - x_1 y_3)j + (x_1 y_2 - x_2 y_1)k is the usual outer product (or "vector product") of ξ and η. Largely because of the identity (67), much of present-day three-dimensional vector analysis was couched in the language of quaternions in the half-century 1850-1900.

It was proved by Eilenberg and Niven in 1944 that any polynomial equation f(x) = a_0 + a_1 x + ··· + a_n x^n = 0, with quaternion coefficients, a_n ≠ 0, and n > 0, has a quaternion solution x.

Exercises

1. Derive the multiplication table (66) from (62).
2. Prove that the multiplication of quaternions is associative. (Hint: Use (65) and (66).)
3. Let a = 1 + i + j, b = 1 + j + k.
   (a) Find a + b, a - b, ab, 2b, a*, aa*, ia - 2b.
   (b) Solve ax = b, xa = b, x^2 = b, bx + (2j + k) = a.
4. Solve xc = d for (a) c = i, d = 1 + j, and (b) c = 2 + j, d = 3 + k.
5. Prove that the solution of a quaternion equation xa = b is uniquely determined if a ≠ 0.
6. Show that in the group of nonzero quaternions under multiplication, the "center" consists precisely of the real nonzero quaternions.
7. (a) Show that the norm N(x) = xx* of x is x_0^2 + x_1^2 + x_2^2 + x_3^2.
   (b) Show that x*y* = (yx)*.
8. (a) Prove that x^2 = -1 has an infinity of quaternions x as solutions.
   (b) Show why this does not contradict the Corollary of Theorem 3, §3, on the number of roots of a polynomial.
   (c) Show that the real quaternions are those whose squares are positive, while the pure quaternions are those whose squares are negative real numbers.
   (d) Show that if q is not real, x^2 = q has exactly two quaternion solutions.
9. If a quaternion x satisfies a quadratic equation x^2 + a_0 x + b_0 = 0, with real coefficients a_0 and b_0, prove that every quaternion q^(-1)xq satisfies the same quadratic equation (if q ≠ 0).
10. In the algebra of quaternions prove that the elements ±1, ±i, ±j, ±k form a multiplicative group. (This group, which could be defined directly, is known as the quaternion group.)
11. Infer from Ex. 8(c) that the set of quaternions satisfying x^2 < 0 is closed under addition and subtraction.
12. Derive all of the rules (62) from i^2 = j^2 = k^2 = ijk = -1.
13. In a division ring, show that the commutative law for addition follows from the other postulates. (Hint: Expand (a + b)(1 + 1) in two different ways.)
14. How many of the conditions of Theorem 2, §2, can you prove in a general division ring, if a/b is interpreted as ab^(-1)?
15. If the integers a and b are both sums of four squares of integers, show that the product ab is also a sum of four squares. (Hint: Use the norm (64).)
16. (a) Prove that the quaternions x_0 + x_1 i + x_2 j + x_3 k with rational coefficients x_l form a division ring.
    (b) Show that this is not the case for quaternions with complex coefficients. (Note: Do not confuse the scalar √-1 ∈ C with the quaternion unit i.)
17. (a) Enumerate the subgroups of the quaternion group (Ex. 10), and show that they are all normal.
    (b) Show that the quaternion group is not isomorphic with the group of the square.
18. Show that the "outer product" of two vectors is not associative.
19. Does (AB)^T = B^T A^T hold for matrices with quaternion entries?

9 Linear Groups

9.1 Change of Basis

The coordinates of a vector ξ in a space V depend upon the choice of a basis in V (see §7.8); hence any change in the basis will cause a change in the coordinates of ξ. For example, in the real plane R^2, the vector β = 4ε_1 + 2ε_2 has by definition the coordinates (4, 2) relative to the basis of unit vectors ε_1, ε_2. The vectors

(1)  α_1 = 2ε_1,  α_2 = ε_1 + ε_2

also form a basis; in terms of this basis, β is expressed as β = α_1 + 2α_2. The coefficients 1 and 2 are the coordinates of β relative to this new basis (i.e., relative to the oblique coordinate system shown in Figure 1).

More generally, the coordinates x_1*, x_2* of any vector ξ relative to the "new" basis α_1, α_2 may be found from the "old" coordinates x_1, x_2 of ξ as follows. By definition, these coordinates are the coefficients in the expressions of ξ in terms of the two bases.
Solving the vector equations (1) for ε_1 and ε_2, and substituting these values in the expression ξ = x_1 ε_1 + x_2 ε_2, we find that the new coordinates of ξ are given by the linear homogeneous equations

(2)  x_1* = (x_1 - x_2)/2,  x_2* = x_2.

Conversely, the old coordinates can be expressed in terms of the new, as

x_1 = 2x_1* + x_2*,  x_2 = x_2*.

Similar relations hold in n dimensions. If α_1, ···, α_n is any basis of the vector space V, and α_1*, ···, α_n* a new (ordered) basis, then each vector α_i* of the new basis can be expressed as a linear combination of the vectors of the old basis, in the form

(3)  α_i* = p_i1 α_1 + ··· + p_in α_n = Σ_j p_ij α_j,  i = 1, ···, n.

The matrix P of the coefficients in these expressions has as its ith row the row of old coordinates (p_i1, ···, p_in) of the vector α_i*. Since the vectors α_i* form a basis, the rows of P are linearly independent, and hence the matrix P is nonsingular (§8.6, Theorem 9). Formally, the expression (3) can be written as the matrix equation α* = αP^T, where P^T is the transpose of P.

One may also express the old basis in terms of the new basis by equations α_k = Σ_i q_ki α_i*, with a coefficient matrix Q. Upon substituting the values of α_i* in terms of the α's, we obtain

α_k = Σ_i q_ki Σ_j p_ij α_j = Σ_j (Σ_i q_ki p_ij) α_j.

But there is only one such expression for the vectors α_k in terms of themselves, namely α_k = α_k. Hence the coefficient Σ_i q_ki p_ij of α_j here must be 0 or 1, according as k ≠ j or k = j. These coefficients are exactly the (k, j) entries in the matrix product QP; hence QP = I, and Q = P^(-1) is the inverse of P.

Conversely, if P is any nonsingular matrix, the n vectors α_i* determined as in (3) by P are linearly independent, and
hence form a new basis of V. This proves

Theorem 1. Let α_1, ···, α_n be a given (ordered) basis of the vector space V. Then for each nonsingular n x n matrix P, the n vectors α_i* = Σ_j p_ij α_j, with i = 1, ···, n, constitute a new basis of V; and every basis of V can be obtained in this way from exactly one nonsingular n x n matrix P.

The parallel result for change of coordinates is as follows:

Theorem 2. If the basis α_1, ···, α_n of the vector space V is changed to a new basis α_1*, ···, α_n*, expressed in the form α_i* = Σ_j p_ij α_j, then the coordinates x_i of any vector ξ relative to the old basis α_i determine the new coordinates x_i* of ξ relative to the α_i* by the linear homogeneous equations

(4)  x_j = x_1* p_1j + ··· + x_n* p_nj,  j = 1, ···, n.

Proof. By definition (§7.8), the coordinates x_i* of ξ relative to the basis α_i* are the coefficients in the expression ξ = Σ_i x_i* α_i* of ξ as a linear combination of the α_i*. Substitution of the formula (3) for α_i* yields

ξ = Σ_i x_i* Σ_j p_ij α_j = Σ_j (Σ_i x_i* p_ij) α_j.

The coefficient of each α_j here is the old coordinate x_j of ξ; hence the equations (4).

The equations (4) may be written in matrix form as X = X*P, where X = (x_1, ···, x_n) is the row matrix of old coordinates and X* = (x_1*, ···, x_n*) is the row matrix of new coordinates. Since P is nonsingular, one may solve for X* in terms of X as X* = XP^(-1). If one compares this matrix equation with the matrix formulation α* = αP^T of (3) already mentioned, one gets the interesting relation

(5)  bases: α* = αP^T;  coordinates: X* = XP^(-1).

The matrix P^(-1) of the second equation is the transposed inverse of the matrix P^T of the first.
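For the plane example above (α_1 = 2ε_1, α_2 = ε_1 + ε_2, so that the rows of P are (2, 0) and (1, 1)), the relation X* = XP^(-1) of (5) can be checked numerically. A minimal sketch (our own illustrative code):

```python
from fractions import Fraction

def inv_2x2(P):
    """Inverse of a nonsingular 2 x 2 matrix."""
    (a, b), (c, d) = P
    det = Fraction(a * d - b * c)
    assert det != 0
    return [[d / det, -b / det], [-c / det, a / det]]

def vec_mat(X, M):
    """Row vector times matrix."""
    return tuple(sum(X[i] * M[i][j] for i in range(len(X)))
                 for j in range(len(M[0])))

# rows of P = old coordinates of the new basis vectors (formula (3)):
# alpha1* = 2 e1, alpha2* = e1 + e2
P = [[2, 0], [1, 1]]
X = (4, 2)                            # old coordinates of beta = 4 e1 + 2 e2
X_new = vec_mat(X, inv_2x2(P))        # coordinates: X* = X P^-1
assert X_new == (1, 2)                # beta = alpha1* + 2 alpha2*
assert vec_mat(X_new, P) == X         # and back again: X = X* P
```

The result (1, 2) agrees with the expression β = α_1 + 2α_2 found at the start of the section.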
(This situation is sometimes summarized by the statement that the change of coordinates is contragredient to the corresponding change of basis.)

Exercises

1. In each of the following cases, show that the vectors α_i form a new basis, and find the corresponding equations for the new coordinates in terms of the old coordinates, and for the old coordinates in terms of the new coordinates:
   (a) α_1 = (1, -1), α_2 = (-2, 1);
   (b) α_1 = (2, 3), α_2 = (0, -1);
   (c) α_1 = (1, 1, 1), α_2 = (0, 1, 1), α_3 = (0, 0, 1);
   (d) α_1 = (i, 1), α_2 = (1, i), where i^2 = -1.
   In cases (a) and (b) draw a figure.
2. Give the equations for the transformation of coordinates due to rotation of axes in the plane through an angle θ.
3. If a new basis α_j* is given indirectly by equations of the form α_j = Σ_i q_ji α_i*, work out the equations for the corresponding change of coordinates.

9.2 Similar Matrices and Eigenvectors

A linear transformation T: V → V of a vector space V may be represented by various matrices, depending on the choice of a basis (coordinate system) in V. Thus, in the plane, the transformation defined by ε_1 ↦ 3ε_1, ε_2 ↦ -ε_1 + 2ε_2 is represented, in the usual coordinate system of R^2, by the matrix A whose rows are the coordinates of the transforms of ε_1 and ε_2 (in general, if T carries the usual unit vectors ε_j of V_2 or V_3 into the vectors α_j, its matrix has the coordinates of the α_j as its rows). But relative to the new basis α_1 = 2ε_1, α_2 = ε_1 + ε_2 discussed in §9.1, the transformation is α_1 ↦ 3α_1 and α_2 ↦ 2α_2; hence it is represented by the simpler diagonal matrix D. The two matrices are displayed below:

A = ( 3  0)      D = (3  0)
    (-1  2),         (0  2).

We shall say that two such matrices A and D are similar, in a sense specified below.

To generalize this result, let us recall how a matrix represents a transformation. Take any (ordered) basis α_1, ···, α_n of a vector space V and any linear transformation T: V → V. Then the images under T of the basis vectors α_i may be written, by formula (9) of §8.1, as

(6)  α_i T = Σ_j a_ij α_j.

Hence T is represented, relative to the basis α = {α_1, ···, α_n}, by the n x n matrix A. Let ξ = Σ_i x_i α_i be a vector of V, with the n-tuple X = (x_1, ···, x_n)
Let T carry the usual unit vectors E j (in V2 or V 3 ) into the vectors a j .. 2... a3 = (0. a2 = £1 + £2 discussed in §9. Find the corresponding equations for the new coordinates in terms of the old coordinates. 1. an of a vector space V and any linear transformation T: V ~ V. and for the old coordinates in terms of the new coordinates.0. a2 = (1. The equivalence relation B = PAP. Two n x n matrices A and B with entries in a field Fare. (7) Y= XA. the new basis is expressed in terms of the old one by a nonsingular n x n matrix P as in (3). . and called the relation of similarity.I is' formally like that of conjugate elements in a group (§6.Ch. similar (over F) if and only if there is a nonsingular n x n matrix P over F with B = PAP-I.an * be a second basis.a. . the old coordinates.. . if and only if the matrices A and B are similar. Then the image ~T = (Lx. } } are then just the coefficients of Y}· is ~~X. Two n x n matrices A and B over a field F represent the same linear transformation T. . relative to the basis a.' V ~ Von an n-dimensional vector space V over F. Y = a -coordinates of 'T/ = ~T. where X = a -coordinates of ~. Either of the equivalent statements (6) or (7) means that T is represented.x"}' ·a·· j 'T/ is just the matrix product Y = XA. 9 264 Linear Groups of coordinates relative to the basis a. relative to (usually) different coordinate systems. by Theorem 2.12).ai)T = LXj(ajT) = 'T/ and the coordinate vector Y of Briefly. by the matrix A. and the new coordinates of ~ and 'T/ = U are given in terms of.iaj = ~(~Xjajj)aj' I The coordinates Yi of 'T/ = ~T I ai> so that = ""'. Then by (7) Hence. Definition. it is of fundamental importance. Our discussion above proves Theorem 3. as X* = Xp-I and Y* = yp-l. by Theorem 1. Then. by (7) again.. the matrix B representing T in the new coordinate system has the form PAP-I. Now let al *. Let the linear transformation T: V -+ V be represented by a matrix A re/~tive to a basis at>"'. 
Explicitly, let P = ||pij|| be a nonsingular matrix, and αi* = Σj pijαj the corresponding new basis of V. Then T is represented relative to the new basis by the matrix PAP⁻¹.

For this and other reasons, it is important to know which matrices are similar to diagonal matrices, and also which pairs of diagonal matrices are similar to each other. The answer to these questions involves the notions of characteristic vector and characteristic root, also called eigenvector and eigenvalue.

Definition. An eigenvector of a linear transformation T: V → V is a nonzero vector ξ ∈ V such that ξT = cξ for some scalar c. An eigenvalue of T is a scalar c such that ξT = cξ for some vector ξ ≠ 0.

Thus each eigenvector ξ of T determines an eigenvalue c, and each eigenvalue belongs to at least one eigenvector. Note also that any nonzero scalar multiple of an eigenvector is an eigenvector. The set of all eigenvalues of T (or of TA) is called its spectrum.

More explicitly, an eigenvector or eigenvalue of a square matrix A is an eigenvector or eigenvalue of the transformation TA which it represents: a vector X = (x1, ···, xn) ≠ 0 is an eigenvector of the n × n matrix A if XA = cX for some scalar c, and this scalar c is then an eigenvalue of A.

If the matrix B = PAP⁻¹ is similar to A, and if XA = cX with X ≠ 0, then (XP⁻¹)B = XP⁻¹PAP⁻¹ = (XA)P⁻¹ = c(XP⁻¹), so that XP⁻¹ is an eigenvector of B belonging to the same eigenvalue c. Hence similar matrices have the same eigenvalues. Since similar matrices correspond to the same linear transformation under different choices of basis, we may restate this as

Theorem 3′. All the matrices representing a given linear transformation T relative to different bases have the same eigenvalues, namely the eigenvalues of T.
The algebra of matrices applies especially smoothly to diagonal matrices: to add or multiply any two diagonal matrices, one simply adds (or multiplies) the corresponding diagonal entries.

The eigenvalues of a diagonal matrix D with diagonal entries d1, ···, dn are easy to find. The unit vectors ε1 = (1, 0, ···, 0), ···, εn = (0, ···, 0, 1) are characteristic vectors for D, since ε1D = d1ε1, ···, εnD = dnεn. The di are the only eigenvalues: for let X = (x1, ···, xn) ≠ 0 be any characteristic vector of D, so that XD = cX for a suitable eigenvalue c. Now XD = (d1x1, ···, dnxn), so that dixi = cxi for all i. Since some xi ≠ 0, this proves that di = c for this i, and the eigenvalue c is indeed some di. In other words, the eigenvalues of a diagonal matrix are the entries on its diagonal.

The connection between eigenvectors and diagonal matrices is provided by

Theorem 4. An n × n matrix A is similar to a diagonal matrix D if and only if the eigenvectors of A span Fⁿ; when this is the case, the eigenvalues of A are the diagonal entries in D.

Proof. Suppose first that A is similar to a diagonal matrix D with diagonal entries d1, ···, dn, say D = PAP⁻¹. By Theorem 3′, A and D have the same eigenvalues, which are the diagonal entries d1, ···, dn just found; moreover, the vectors εiP are eigenvectors of A, and they span Fⁿ because P is nonsingular. Conversely, suppose there are enough eigenvectors of the matrix A to span the whole space Fⁿ on which TA operates. Then (Chap. 7, Corollary 2) we can extract a subset of eigenvectors β1, ···, βn which is a basis for Fⁿ. Since each βi is an eigenvector, β1TA = c1β1, ···, βnTA = cnβn for eigenvalues c1, ···, cn; hence, by Theorem 9 of Chap. 8, TA is represented relative to the basis β1, ···, βn, as in (6), by the diagonal matrix D with diagonal entries c1, ···, cn. By Theorem 2, A is similar to this matrix D. Q.E.D.

Corollary. If P is a matrix whose rows are n linearly independent eigenvectors of the n × n matrix A, then P is nonsingular and PAP⁻¹ = D, where D is the diagonal matrix whose diagonal entries are the corresponding eigenvalues c1, ···, cn.

Proof. We are given n linearly independent n-tuples X1, ···, Xn which are eigenvectors of A, so that XiA = ciXi for characteristic roots c1, ···, cn. The matrix P with rows X1, ···, Xn is nonsingular because its rows are linearly independent. By the block multiplication rule,

(8)  PA = (X1A; ···; XnA) = (c1X1; ···; cnXn) = DP.

This asserts that PA = DP, and hence that PAP⁻¹ = D. The matrix P is, in fact, exactly the matrix required for the change of basis involved in the direct proof of Theorem 4.

To explicitly construct a diagonal matrix (if one exists!) similar to a given matrix, one thus searches for eigenvalues and eigenvectors. This search is greatly facilitated by the following consideration. If a scalar λ is an eigenvalue of the n × n matrix A, and if I is the n × n identity matrix, then XA = λX = λXI for some X ≠ 0, and consequently X(A − λI) = 0 for some nonzero n-tuple X. The n homogeneous linear equations with matrix A − λI thus have a nontrivial solution.
On the other hand, there are matrices which are not similar to any diagonal matrix (cf. Ex. 5 below).

Theorem 5. The scalar λ is an eigenvalue of the matrix A if and only if the matrix A − λI is singular.

For example, the 2 × 2 matrix

(9)  A − λI = ( a11 − λ    a12     )
              ( a21        a22 − λ )

is readily seen to be singular if and only if

(10)  (a11 − λ)(a22 − λ) − a12a21 = 0.

(This merely states that the determinant of A − λI is zero.) Hence we find all eigenvalues by solving this equation. Moreover, for each root λ there is at least one eigenvector, found by solving

     x1a11 + x2a21 = λx1,    x1a12 + x2a22 = λx2.

EXAMPLE. Find a diagonal matrix similar to the matrix

     A = ( -3   4 )
         (  2  -1 ).

The polynomial (10) is λ² + 4λ − 5. The roots of this are 1 and −5; hence the eigenvectors satisfy one or the other of the systems of homogeneous equations

     -3x + 2y =  x,         -3x + 2y = -5x,
      4x -  y =  y,    or    4x -  y = -5y.

Solving these, we get eigenvectors (1, 2) and (1, −1). Using these as a new basis, the transformation takes on a diagonal form; by the Corollary of Theorem 4, the new diagonal matrix may be written as a product:

     ( 1   2 ) ( -3   4 ) ( 1   2 )⁻¹     ( 1   0 )
     ( 1  -1 ) (  2  -1 ) ( 1  -1 )    =  ( 0  -5 ).
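The worked example above can be verified exactly by a short computation; this sketch (not part of the text) assumes only the matrix A and the eigenvectors found in the example:

```python
from fractions import Fraction as F

def matmul(M, N):
    # row-by-column matrix product
    return [[sum(M[i][k] * N[k][j] for k in range(len(N)))
             for j in range(len(N[0]))] for i in range(len(M))]

def inv2(M):
    # inverse of a nonsingular 2 x 2 matrix
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

A = [[F(-3), F(4)], [F(2), F(-1)]]

def char_poly(t):
    # the left side of equation (10): (a11 - t)(a22 - t) - a12 a21
    return (A[0][0] - t) * (A[1][1] - t) - A[0][1] * A[1][0]

roots = [F(1), F(-5)]               # roots of t^2 + 4t - 5 = (t - 1)(t + 5)

# rows of P are the eigenvectors (1, 2) and (1, -1) found in the example
P = [[F(1), F(2)], [F(1), F(-1)]]
D = matmul(matmul(P, A), inv2(P))   # should be the diagonal matrix diag(1, -5)
```

The product D comes out as (1 0; 0 −5), confirming the displayed computation.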
Exercises

1. Show that the equations 2x′ = (1 + b)x + (1 − b)y, 2y′ = (1 − b)x + (1 + b)y represent a compression toward the 45° line through the origin. Compute the eigenvalues and the eigenvectors of the transformation and interpret them geometrically.
2. Compute the eigenvalues and the eigenvectors of the following matrices over the complex field:
(a) …   (b) …   (c) …   (d) …
3. For each matrix A given in Ex. 2 find, when possible, a nonsingular matrix P for which PAP⁻¹ is diagonal.
4. Prove that the set of all eigenvectors belonging to a fixed eigenvalue of a given matrix constitutes a subspace when 0 is included among the eigenvectors.
5. Prove that no matrix (1 c; 0 1) with c ≠ 0 is similar to a real or complex diagonal matrix.
6. Show that the slopes y of the eigenvectors of a 2 × 2 matrix A satisfy the quadratic equation a21y² + (a11 − a22)y − a12 = 0.
7. Prove that any 2 × 2 real symmetric matrix not a scalar matrix has two distinct real eigenvectors. Interpret the result geometrically.
8. (a) Find the complex eigenvalues of the matrix representing a rotation of the plane through an angle θ.
(b) Prove that the matrix representing a rotation of the plane through an angle θ (0 < θ < π) is not similar to any real diagonal matrix.
9. (a) Show that if A = (a b; c d) with ad − bc = 1, then A is similar to an orthogonal matrix if and only if A = ±I or −2 < a + d < 2.
(b) Show that if ad − bc = −1, then A is similar to an orthogonal matrix if and only if a + d = 0. (For the definition of orthogonal, see §9.4.)
*10. (a) Show that two m × n matrices A and B are equivalent if and only if they represent the same linear transformation T: V → W of an m-dimensional vector space V into an n-dimensional vector space W, relative to different bases in V and in W.
(b) Interpret Theorem 18 of Chap. 8 in the light of this remark.
*11. Let both A and B be similar to diagonal matrices. Prove that AB = BA if and only if A and B have a common basis of eigenvectors (Frobenius).
9.3. The Full Linear and Affine Groups

All nonsingular linear transformations of an n-dimensional vector space Fⁿ form a group, because the products and inverses of such transformations are again linear and nonsingular (Chap. 8, Theorem 8). This group is called the full linear group Ln = Ln(F). In the one-one correspondence of linear transformations to matrices, products correspond to products, so the full linear group Ln(F) is isomorphic to the group of all nonsingular n × n matrices with entries in the field F. This proves

Theorem 6. All nonsingular linear transformations of an n-dimensional vector space Fⁿ form a group Ln(F), isomorphic to the group of all nonsingular n × n matrices over F.

A translation in any space Fⁿ is a transformation ξ → ξ + κ, for κ fixed. A translation of the plane moves all the points of the plane the same distance in a specified direction; the distance and direction may be represented by a vector κ of the appropriate magnitude and direction, and the translation then carries the end-point of each vector ξ into the end-point of ξ + κ. Relative to any coordinate system, the coordinates yi of the translated vector are y1 = x1 + k1, ···, yn = xn + kn, where the ki are the coordinates of κ.

The product of a translation ξ → η = ξ + κ by a translation η → ζ = η + λ is found by substitution to be the translation ξ → ζ = ξ + (κ + λ); it corresponds exactly to the sum of the vectors κ and λ. Every translation is one-one and onto; hence it has an inverse, and the inverse of the translation ξ → ξ + κ is η → η − κ. Thus we have proved the following special case of Cayley's theorem (§6.5):

Theorem 7. All the translations ξ → ξ + κ of Fⁿ form an Abelian group, isomorphic to the additive group of the vectors κ of Fⁿ.

A linear transformation T followed by a translation yields a transformation

(11)  ξ → ξT + κ    (T linear, κ a fixed vector).

An affine transformation H of Fⁿ is any transformation of this form. The affine transformations include the linear transformations (with κ = 0) and the translations (with T = I). Since every translation is one-one and onto, the affine transformation (11) will be one-one and onto if and only if its linear part T is one-one. Its inverse will consequently be the affine transformation η → ξ = ηT⁻¹ − κT⁻¹, found by solving (11) for ξ.

If one affine transformation (11) is followed by a second, η → ηU + λ, the product is

(12)  ξ → (ξT + κ)U + λ = ξ(TU) + (κU + λ).

The result is again affine, because κU + λ is a fixed vector of Fⁿ. Hence the set of all nonsingular affine transformations of Fⁿ constitutes a group, the affine group An(F). It contains as subgroups the full linear group and the group of translations.

What are the equations of an affine transformation relative to a basis? The linear part T yields a matrix A = ||aij||; the translation vector has as coordinates a row K = (k1, ···, kn). The affine transformation thus carries a vector with coordinates X = (x1, ···, xn) into a vector with coordinates

(13)  Y = XA + K,    yj = Σi xiaij + kj    (j = 1, ···, n).

A transformation is affine if and only if it is expressed, relative to some basis, by nonhomogeneous linear equations such as these. The product of the transformation (13) by Z = YB + L is

(14)  Z = X(AB) + (KB + L)    (K, L row matrices);

the formula is parallel to (12). The same multiplication rule holds for a matrix of order n + 1 constructed from the transformation (13) by bordering the matrix A to the right by a column of zeros, below by the row K, and below and to the right by the single entry 1:

(15)  {Y = XA + K}  ↔  ( A  0 )    (0 is n × 1; K is 1 × n).
                       ( K  1 )

The rule for block multiplication (Chap. 8, (43)) gives

(16)  ( A  0 ) ( B  0 )   =   ( AB      0 )
      ( K  1 ) ( L  1 )       ( KB + L  1 ),

since A·0 + 0·1 = 0 and K·0 + 1·1 = 1; the result is precisely the bordered matrix belonging to the product transformation (14). The isomorphism is explicitly given by the correspondence (15). This proves

Theorem 8. The group of all nonsingular affine transformations of an n-dimensional space is isomorphic to the group of all those nonsingular (n + 1) × (n + 1) matrices in which the last column is (0, ···, 0, 1).

Each affine transformation ξH = ξT + κ determines a unique linear transformation T, and the product of two affine transformations determines, as in (12), the product of the corresponding linear parts. This correspondence H → T maps the group of nonsingular affine transformations H onto the full linear group, and is a homomorphism in the sense of group theory (§6). In any homomorphism the objects mapped on the identity form a normal subgroup; in this case the affine transformations H with T = I are exactly the translations. This proves

Theorem 9. The group of translations is a normal subgroup of the affine group.

Equation (13) was interpreted above as a transformation of points (vectors), which carried each point X = (x1, ···, xn) into a new point Y having coordinates (y1, ···, yn) in the same coordinate system. We could equally well have interpreted equation (13) as a change of coordinates. Thus, in the plane, the equations y1 = x1 + 2, y2 = x2 − 1 can be interpreted as a point transformation which translates the whole plane two units east and one unit south, or as a change of coordinates, in which the original coordinate network is replaced by a parallel network with new origin two units west and one unit north of the given origin. We call the first interpretation an alibi (the point is moved elsewhere) and the second an alias (the point is renamed). A similar double interpretation applies to all groups of substitutions.

Exercises

1. (a) Represent each of the following affine transformations by a matrix:
H1: x′ = 3x + 6y + 2, y′ = 3y − 1;    H2: x′ = x + y + 3, y′ = x − y + 5.
(b) Compute the products H1H2 and H2H1.
(c) Find the inverses of H1 and H2.
2. Prove that the set of all affine transformations x′ = ax + by + e, y′ = cx + dy + f, with ad − bc = 1, is a normal subgroup of the affine group A2(F).
*3. Given the circle x² + y² = 1, prove that every nonsingular affine transformation of the plane carries this circle into an ellipse or a circle.
4. Which of the following sets of n × n matrices over a field are subgroups of the full linear group?
(a) All scalar matrices cI.
(b) All diagonal matrices.
(c) All nonsingular diagonal matrices.
(d) All permutation matrices.
(e) All monomial matrices.
(f) All triangular matrices.
(g) All strictly triangular matrices.
(h) All matrices with zeros in the second row.
(i) All matrices in which at least one row consists of zeros.
5. Exhibit a group of matrices isomorphic with the group of all translations of Fⁿ.
6. (a) If Z2 is the field of integers modulo 2, list all matrices in L2(Z2).
(b) Construct a multiplication table for this group L2(Z2).
*7. What is the order of the full linear group L2(Zp) when Zp is the field of integers mod p?
8. If n < m, prove that Ln(F) is isomorphic to a subgroup of Lm(F).
9. If two fields F and K are isomorphic, prove that the groups Ln(F) and Ln(K) are isomorphic.
10. Map the group of nonsingular 3 × 3 triangular matrices homomorphically on the nonsingular 2 × 2 triangular matrices. (Hint: Use blocks.)
11. Prove that the quotient group An(F)/Tn(F) is isomorphic to Ln(F), where An denotes the affine group and Tn the group of translations.
12. If Ln(F) is the full linear group, show that two affine transformations H1 and H2 fall into the same right coset of Ln(F) if and only if OH1 = OH2 (O is the origin!).
13. (a) Prove that the center of the linear group Ln(F) consists of the scalar matrices cI (c ≠ 0). (Hint: They must commute with every I + Eij.)
(b) Prove that the identity is the only affine transformation that commutes with every affine transformation.
14. Let G be the group of all matrices A = (a b; 0 d) with ad ≠ 0. Show that the correspondence A → a is a homomorphism.
15. (a) Show that the set of all nonsingular matrices of the form (A 0; 0 B), with A r × r and B s × s, is a group isomorphic with the direct product Lr(F) × Ls(F).
(b) What is the geometric character of the linear transformations of R³ determined by such a matrix if r = 2, s = 1?
16. (a) Show that all one-one transformations y = (ax + b)/(cx + d) with ad ≠ bc form a group (called the linear fractional group).
(b) Prove that this group is isomorphic to the quotient group of the full linear group L2(F) modulo the subgroup of nonzero scalar matrices.
*(c) Extend the results to matrices larger than 2 × 2.
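Returning to the bordered matrices of (15) and (16), the isomorphism of Theorem 8 can be illustrated by composing two affine transformations of the plane both directly and through their 3 × 3 bordered matrices (a sketch with arbitrary sample entries, not from the text):

```python
def matmul(M, N):
    # row-by-column matrix product
    return [[sum(M[i][k] * N[k][j] for k in range(len(N)))
             for j in range(len(N[0]))] for i in range(len(M))]

def vecmat(X, A):
    # row vector times matrix
    return matmul([X], A)[0]

def affine(X, A, K):
    # the affine transformation (13): Y = X A + K
    return [y + k for y, k in zip(vecmat(X, A), K)]

def bordered(A, K):
    # the (n+1) x (n+1) matrix of (15): A bordered by a zero column and the row K
    n = len(A)
    return [A[i] + [0] for i in range(n)] + [K + [1]]

# two sample affine transformations of the plane (illustrative values only)
A, K = [[3, 6], [0, 3]], [2, -1]
B, L = [[1, 1], [0, 1]], [3, 5]

X = [1, 2]
Z_direct = affine(affine(X, A, K), B, L)      # apply (13) twice, as in (14)
M = matmul(bordered(A, K), bordered(B, L))    # product of the bordered matrices
Z_bordered = vecmat(X + [1], M)[:2]           # (X, 1) times the product matrix
```

Both routes give the same point, and the product matrix M has last column (0, 0, 1), as Theorem 8 requires.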
9.4. The Orthogonal and Euclidean Groups

In Euclidean geometry, length plays an essential role. Hence we seek those linear transformations of Euclidean vector spaces which preserve the lengths |ξ| of all vectors ξ.

Definition. A linear transformation T of a Euclidean vector space is orthogonal if it preserves the length of every vector ξ, so that |ξT| = |ξ|.

Theorem 10. An orthogonal transformation T has, for every pair of vectors ξ, η, the properties:
(i) T preserves distance: |ξ − η| = |ξT − ηT|;
(ii) T preserves inner products: (ξ, η) = (ξT, ηT);
(iii) T preserves orthogonality: ξ ⊥ η implies ξT ⊥ ηT;
(iv) T preserves magnitude of angles: cos ∠(ξ, η) = cos ∠(ξT, ηT).

Proof. Since T is linear, the definition gives (i): |ξ − η| = |(ξ − η)T| = |ξT − ηT|. As for (ii), the "bilinearity" of the inner product proves that (ξ + η, ξ + η) = (ξ, ξ) + 2(ξ, η) + (η, η). This equation may be solved for (ξ, η) in terms of "lengths" such as |ξ| = (ξ, ξ)^(1/2), in the form

(20)  (ξ, η) = (|ξ + η|² − |ξ|² − |η|²)/2.

The orthogonal transformation T leaves invariant the lengths on the right, hence also the inner product on the left of this equation. This proves (ii). Since ξ ⊥ η means (ξ, η) = 0, and since angle is also definable in terms of inner products (§7, (41)), properties (iii) and (iv) follow immediately from (ii). Conversely, a transformation T known to preserve all inner products must preserve length, and hence be orthogonal, for length is defined in terms of an inner product. Q.E.D.

We now determine all orthogonal transformations Y = XA of the Euclidean plane. The transforms

(17)  ε1T = (a1, a2),    ε2T = (b1, b2)

of the unit vectors (1, 0) and (0, 1) must, like these unit vectors, have the length 1. According to the Pythagorean formula for length, this means

(18)  a1² + a2² = 1,    b1² + b2² = 1.

In addition, the vector (1, 1) has a transform (a1 + b1, a2 + b2) of length √2, so (a1 + b1)² + (a2 + b2)² = 2. Expanding, and subtracting (18), we find

(18′)  a1b1 + a2b2 = 0.

By (18) there is an angle θ with cos θ = a1, sin θ = a2. Then by (18′), tan θ = a2/a1 = −b1/b2, whence by (18) b2 = ±cos θ and b1 = ∓sin θ. The two choices of sign give exactly the two matrices

(19)  (  cos θ   sin θ )        ( cos θ    sin θ )
      ( -sin θ   cos θ ),       ( sin θ   -cos θ ).

By Chap. 8, formulas (5) and (5′), these represent, respectively, rotation through an angle θ and reflection in a line making an angle α = θ/2 with the x-axis. Hence every orthogonal transformation of the plane is a rotation or a reflection. Geometrically, the inverse of the first orthogonal transformation (19) is obtained by replacing θ by −θ; its matrix is then the transpose of the original. This fact (unlike the trigonometric formulas) generalizes to n × n orthogonal matrices.
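The two matrix types displayed in (19) can be tested numerically; this sketch (the sample angle and vector are arbitrary choices, not from the text) checks that both preserve length and that the inverse of the rotation is its transpose:

```python
import math

def rotation(t):
    # first matrix of (19): rotation through the angle t
    return [[math.cos(t), math.sin(t)],
            [-math.sin(t), math.cos(t)]]

def reflection(t):
    # second matrix of (19): reflection in the line at angle t/2
    return [[math.cos(t), math.sin(t)],
            [math.sin(t), -math.cos(t)]]

def vecmat(X, A):
    # row vector times matrix, Y = X A as in (7)
    return [sum(X[k] * A[k][j] for k in range(len(A))) for j in range(len(A[0]))]

def length(X):
    return math.sqrt(sum(x * x for x in X))

theta = 0.73                       # arbitrary sample angle
R, S = rotation(theta), reflection(theta)

X = [2.0, -5.0]                    # arbitrary sample vector
lR = length(vecmat(X, R))          # length preserved by the rotation
lS = length(vecmat(X, S))          # length preserved by the reflection

# the inverse of the rotation is the rotation through -theta, i.e. the transpose
Rinv = rotation(-theta)
transpose_ok = all(abs(Rinv[i][j] - R[j][i]) < 1e-12
                   for i in range(2) for j in range(2))
```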
Next we ask: which matrices correspond to orthogonal linear transformations? The question is easily answered, at least relative to normal orthogonal bases.

Theorem 11. Relative to a normal orthogonal basis, a real n × n matrix A represents an orthogonal linear transformation if and only if each row of A has length one, and any two rows, regarded as vectors, are orthogonal.

Proof. Any orthogonal transformation T must, by Theorem 10, carry the given basis ε1, ···, εn into a basis α1 = ε1T, ···, αn = εnT which is normal and orthogonal. Conversely, if T has this property, then for any vector ξ = x1ε1 + ··· + xnεn, with the transform ξT = x1α1 + ··· + xnαn, we know by Theorem 22 of §7.11 that the length is given by the usual formula as |ξT| = (x1² + ··· + xn²)^(1/2) = |ξ|, whence T is orthogonal. The proof is completed by the remark (cf. §8.1) that the ith row of A represents the coordinates of αi = εiTA relative to the original basis ε1, ···, εn. Q.E.D.

In coordinate form, the conditions on A stated in the theorem are equivalent to the equations

(21)  Σk aik² = 1 for all i;    Σk aikajk = 0 if i ≠ j.

The conclusions (21) are exactly those already found explicitly in (18) and (18′) for a two-rowed matrix.

If we write Ai for the ith row of the matrix A and Ajᵗ for the jth column of the transpose Aᵀ, the inner product of Ai by Aj is the matrix product AiAjᵗ (see (34), Chap. 8), so the conditions (21) may be written as

(21′)  AiAjᵗ = δij,

where δij is the element in the ith row and the jth column of the identity matrix I = ||δij||, with diagonal entries δii = 1 and all nondiagonal entries zero. (The symbol δij is called the Kronecker delta.) In the row-by-column product AAᵀ of A by its transpose, the equations (21′) state that the ith row times the jth column is δij. We have proved

Theorem 12. A real n × n matrix represents an orthogonal transformation if and only if AAᵀ = I.

The equation AAᵀ = I has meaning over any field, so that the concept of an orthogonal matrix can be defined in general.

Definition. A square matrix A over any field is called orthogonal if and only if AAᵀ = I.

This means that the transpose Aᵀ of an orthogonal matrix A is a right-inverse of A; hence, by Theorem 9 of Chap. 8, every orthogonal matrix A is nonsingular, with A⁻¹ = Aᵀ. Therefore AᵀA = I. This equation may be written as Aᵀ(Aᵀ)ᵀ = I, whence Aᵀ is orthogonal: the transpose of any orthogonal matrix A is also orthogonal. From this it also follows that a matrix A is orthogonal if and only if each column of A has length one, and any two columns are orthogonal:

(22)  Σk aki² = 1 for all i;    Σk akiakj = 0 if i ≠ j.

Definition. By a rigid motion of a Euclidean vector space E is meant a nonsingular transformation U of E which preserves distance, that is, which satisfies |ξU − ηU| = |ξ − η| for all vectors ξ, η.
Any translation of E preserves vector differences ξ − η, hence their lengths, and so is a rigid motion. By Theorem 10, a linear transformation is rigid if and only if it is orthogonal. Moreover, if an affine transformation ξ → ξT + κ is rigid, so is ξ → (ξT + κ) − κ = ξT, so that T is then orthogonal; and conversely, if T is orthogonal, then ξ → η = ξT + κ is rigid. Hence an affine transformation (11) is a rigid motion if and only if its linear part T is orthogonal.

All n × n orthogonal matrices form a group, since the inverse A⁻¹ = Aᵀ of an orthogonal matrix is orthogonal and the product of two orthogonal matrices A and B is orthogonal:

     (AB)ᵀ = BᵀAᵀ = B⁻¹A⁻¹ = (AB)⁻¹.

This subgroup of the full linear group Ln(F) is called the orthogonal group On(F); if F = R, it is isomorphic to the group of all orthogonal transformations of the given Euclidean space. We conclude, since the orthogonal transformations form a group, that the totality of rigid affine transformations constitutes a subgroup of the affine group, called the Euclidean group.† It is the basis of Euclidean geometry.

Various other geometrical groups exist. A familiar one is the group of similarity transformations, consisting of those linear transformations T which alter all lengths by a fixed numerical factor cT > 0, so that |ξT| = cT|ξ|. The "extended" similarity group consists of all affine transformations ξ → ξT + κ in which T is a similarity transformation. One may prove that these do in fact form a group, which contains the orthogonal group as a subgroup.

† It is a fact that any rigid motion is necessarily affine; hence the Euclidean group is actually the group of all rigid motions.
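As a numerical sketch of the Euclidean-group discussion (the angle, translation, and points below are arbitrary sample values, not from the text), an affine transformation with an orthogonal linear part preserves distances, while one with a non-orthogonal linear part does not:

```python
import math

def vecmat(X, A):
    # row vector times matrix
    return [sum(X[k] * A[k][j] for k in range(len(A))) for j in range(len(A[0]))]

def affine(X, T, K):
    # the affine transformation (11): X -> X T + K
    return [y + k for y, k in zip(vecmat(X, T), K)]

def dist(X, Y):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(X, Y)))

t = 1.1                                   # sample angle
T = [[math.cos(t), math.sin(t)],          # orthogonal linear part (a rotation)
     [-math.sin(t), math.cos(t)]]
K = [4.0, -2.0]                           # sample translation vector

P, Q = [1.0, 3.0], [-2.0, 0.5]            # two sample points
d_before = dist(P, Q)
d_after = dist(affine(P, T, K), affine(Q, T, K))

T_bad = [[2.0, 0.0], [0.0, 1.0]]          # non-orthogonal linear part (a stretch)
d_scaled = dist(affine(P, T_bad, K), affine(Q, T_bad, K))
```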
Exercises

1. Test the following matrices for orthogonality:
(a) …   (b) …   (c) …
2. If A and B are orthogonal, prove that the matrices (A 0; 0 B) and (0 A; B 0) are also orthogonal.
3. Find an orthogonal matrix whose first row is a scalar multiple of (5, 12).
4. If the columns of an orthogonal matrix are permuted, prove that the result is still orthogonal.
5. Given that the following matrix is orthogonal, find its inverse: …
6. Multiply the following two matrices, and test the resulting product for orthogonality:

     (  cos φ   sin φ   0 )     ( 1     0        0    )
     ( -sin φ   cos φ   0 ),    ( 0   cos θ    sin θ  )
     (    0       0     1 )     ( 0  -sin θ    cos θ  ).

7. Show that an affine transformation H commutes with every translation if and only if H is itself a translation.
8. Prove that all translations form a normal subgroup of the Euclidean group.
9. Show that the Euclidean group is isomorphic to a group of matrices.
10. Prove that any similarity transformation S can be written in the form S = cT, as the product of a positive scalar c and an orthogonal transformation T, in one and only one way.
11. As an alternative proof of (ii) in Theorem 10, show from first principles that 4(ξ, η) = |ξ + η|² − |ξ − η|².
12. (a) Show that the correspondence A → Ω(A) = (A⁻¹)ᵀ is an automorphism of the full linear group Ln(F).
(b) Show that Ω²(A) = A for all A.
(c) For which matrices does Ω(A) = A?
13. How many 3 × 3 orthogonal matrices are there with coefficients in Z2? in Z3?
14. Give necessary and sufficient conditions that a matrix A represent a similarity transformation relative to a normal orthogonal basis (cf. Theorems 11 and 12).
15. (a) Prove that all similarity transformations form a group Sn.
(b) Prove that On is a normal subgroup of Sn.
(c) Prove that the quotient group Sn/On is isomorphic to the multiplicative group of all positive real numbers.

9.5. Invariants and Canonical Forms

The notions of a simplified "canonical form" and "invariant" can be formulated in great generality,† as follows. Let G be a group of transformations (§6.2) on any set or "space" S. Call two elements x and y of S equivalent under G (in symbols, x E_G y) if and only if there is some transformation T of G which carries x into y. Then T⁻¹ carries y back into x, so y E_G x, and the relation of equivalence is symmetric. Similarly, using the other group properties, one proves that equivalence under any G is also a reflexive and transitive relation, hence an equivalence relation.

A subset C of S is called a set of canonical forms under G if each x in S is equivalent under G to one and only one element c in C; this element c is then the canonical form of x.

A function F(x), defined for all elements x of S and with values in some convenient other set, say a set of numbers, is an invariant under G if F(xT) = F(x) for every point x in S and every transformation T in G. In other words, F must have the same value at all equivalent elements. A collection of invariants F1, ···, Fn is a complete set of invariants under G if F1(x) = F1(y), ···, Fn(x) = Fn(y) imply that x is equivalent to y.

The full linear, the affine, the orthogonal, and the Euclidean groups are examples of linear groups. Another is the unitary group (§9.12). In the following sections, we shall see how far one can go in "simplifying" polynomials, quadratic forms, and various geometrical figures by applying suitable transformations from these groups. These simplifications will be analogous to the simplification made in reducing a general matrix to row-equivalent reduced echelon form.

† The reader is advised to return again to this discussion when he has finished the chapter.
For example, let the space S be the set Mn of all n × n matrices over some field. We already have at hand three different equivalence relations between such matrices; these are listed below, together with three new cases which will be discussed in subsequent sections (§9.9, §9.10, and §9.12, respectively):

     A row-equivalent to B:           B = PA,       P nonsingular;
     A equivalent to B:               B = PAQ,      P, Q nonsingular;
     A similar to B:                  B = PAP⁻¹,    P nonsingular;
     A congruent to B:                B = PAPᵀ,     P nonsingular;
     A orthogonally equivalent to B:  B = PAP⁻¹,    P orthogonal;
     A unitary equivalent to B:       B = PAP⁻¹,    P unitary.

The first line is to be read "A is row-equivalent to B if and only if there exists P such that B = PA," and similarly for the other lines. Each of these equivalence relations is the equivalence relation E_G determined by a suitable group G acting on Mn. In the case of row equivalence, the full linear group of matrices P acts on A by A → PA; in the case of similarity, the full linear group acts on A by A → PAP⁻¹.

The first relation, that of row equivalence, arises from the study of a fixed subspace of Fⁿ represented as the row space of a matrix A. The rank of A is a (numerical) invariant under this group, and the reduced echelon form is a canonical form under this group, by Theorem 18 of Chap. 8. The rank is not a complete system of invariants under row equivalence, since two matrices A and B with the same rank need not be row-equivalent. The set of all eigenvalues of a matrix is an invariant under similarity, but it too does not give a complete system of invariants. Rank is even invariant under equivalence, since two similar matrices are certainly equivalent; under equivalence, the set of all diagonal matrices with entries 1 and 0 along the diagonal, the 1's preceding the 0's, is a set of canonical forms. Note that we might equally well have chosen a different set of canonical forms, say the same sort of diagonal matrices but with the 0's preceding the 1's on the diagonal.
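The invariance of rank under the groups A → PA and A → PAQ can be illustrated with a small exact computation (the sample matrices below are arbitrary choices, with P and Q nonsingular):

```python
from fractions import Fraction as F

def rank(M):
    # rank via Gaussian elimination, exact over the rationals
    M = [[F(x) for x in row] for row in M]
    rows, cols, r = len(M), len(M[0]), 0
    for c in range(cols):
        pivot = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(rows):
            if i != r and M[i][c] != 0:
                factor = M[i][c] / M[r][c]
                M[i] = [a - factor * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

def matmul(M, N):
    # row-by-column matrix product
    return [[sum(M[i][k] * N[k][j] for k in range(len(N)))
             for j in range(len(N[0]))] for i in range(len(M))]

A = [[1, 2, 3], [2, 4, 6], [0, 1, 1]]      # a sample matrix of rank 2
P = [[1, 1, 0], [0, 1, 0], [2, 0, 1]]      # sample nonsingular matrices
Q = [[0, 1, 0], [1, 0, 0], [1, 1, 1]]

r_A = rank(A)
r_PA = rank(matmul(P, A))                  # row equivalence: A -> P A
r_PAQ = rank(matmul(matmul(P, A), Q))      # equivalence: A -> P A Q
```

All three ranks agree, as the invariance asserts.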
On the other hand, the rank is a complete system of invariants under the group A → PAQ.

The second relation, that of equivalence (in the technical sense B = PAQ, not to be confused with the general notion of an equivalence relation), arises when we are studying the various matrix representations of a linear transformation of one vector space into a second such space (cf. Ex. 10, §9.2). The relation of similarity arises when we study the various matrix representations of a linear transformation of a vector space into itself. Under similarity the rank of the matrix A is an invariant, since two similar matrices are certainly equivalent; but the rank is not a complete system of invariants under similarity. The formulation of a complete set of canonical forms under similarity is one of the major problems of matrix theory; it gives rise to the Jordan canonical form of a matrix (see Chap. 10). The relation of congruence (B = PAPᵀ) arises from the representation of a quadratic form by a (symmetric) matrix, as will appear subsequently.

To give a last example, recall that the full linear group Ln(F) is a group of transformations on the vector space Fⁿ. Each transformation of this group carries a subspace S of Fⁿ into another subspace, and the dimension of any subspace S is an invariant under the full linear group. This one invariant is actually a complete set of invariants for subspaces of Fⁿ under the full linear group (see the exercises below).

As still another example of equivalence under a group, consider the simplification of a quadratic polynomial f(x) = ax² + bx + c, with a ≠ 0 to exclude the trivial case, by the group of all translations y = x + k. Substituting x = y − k, we find the result of translating f(x) to be

     g(y) = a(y − k)² + b(y − k) + c = ay² + (b − 2ak)y + (ak² − bk + c).

We obtain the familiar "completion of the square": the new polynomial will have no linear term if and only if k = b/2a, and in this case the polynomial is

(23)  g(y) = ay² − d/(4a),    where d = b² − 4ac.

Thus f(x) is equivalent under the group of translations to one and only one polynomial of the form ay² + h, so that the quadratic polynomials without linear terms are canonical forms under this group. On the other hand, any transformed polynomial has the same leading coefficient a and the same discriminant d = (b − 2ak)² − 4a(ak² − bk + c) = b² − 4ac as the original polynomial f(x). Hence the first coefficient and the discriminant of f(x) are invariants under the group of translations; they constitute a complete set of invariants, because the canonical form can be expressed in terms of them, as shown in (23). In particular, for the field of complex numbers, any quartic polynomial is equivalent under translation to a polynomial in which the cubic term is absent.
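The translation computation above can be checked exactly; in this sketch (not part of the text) the sample coefficients are arbitrary, and the choice k = b/2a reproduces the canonical form (23):

```python
from fractions import Fraction as F

def translate(a, b, c, k):
    # coefficients of g(y) = f(y - k) for f(x) = a x^2 + b x + c
    return (a, b - 2 * a * k, a * k * k - b * k + c)

def discriminant(a, b, c):
    return b * b - 4 * a * c

a, b, c = F(3), F(5), F(-2)                # sample polynomial 3x^2 + 5x - 2
d = discriminant(a, b, c)                  # here d = 49

# an arbitrary translation leaves a and the discriminant unchanged
a1, b1, c1 = translate(a, b, c, F(7))

# the choice k = b/(2a) completes the square, giving (23): a y^2 - d/(4a)
a2, b2, c2 = translate(a, b, c, b / (2 * a))
```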
show that d/ a = b 2 / a . II aij II nonsingular.. xn) of Fn. [(0). prove that the degree of f and the number of real roots are both invariant under the affine group. For a polynomial in n variables. Any form (24) may be regarded as a function [(X) of the vector X = (XI> . Consider the set of homogeneous quadratic functions ax 2 with rational a under the group x .Ch. with r ~ 0 and rational. we assume that some coefficient bj is not zero. 7. ••• . Show that a real cubic polynomial is equivalent under the affine group to one and only one polynomial of the form x 3 + ax + b. for the function [(X) determines the coefficients of the form by the formulas [(0. Show that over any field in which 1 + 1 ~ 0.. Find canonical forms for the quadratic functions x 2 + bx + c under the group of translations. Let V be an n-dimensional vector space.0. 5.4c is an affine invariant. 4. show that the coefficient of the term of highest degree is invariant under the group of translations. . Show that the set of integral coefficients a which are products of distinct primes ("square free") provide canonical forms for this set. . . . 8. if there exists such a nonsingular affine transformation carrying [ into g..rx. Consider now the equivalence of real linear forms under the Euclidean group (i. every linear form (24) is equivalent to one and only one of the canonical forms dy. i=lj=l it is determined by the matrix of coefficients A = II aij II. and carries any f with c = 0 to the equivalent function g(Yb' . and (26) shows that in the transformed form the coefficient vector is the transform {3A of the original coefficient vector by the orthogonal matrix II aij II. Hence there is an orthogonal matrix II hij Ilwith this vector as its first row. As before. since dYI = blXI + . with c = 0.§9.Yn is a polynomial of the form (27) b(XI. The transformation Yi = L hijxj is then in the Euclidean group. . We have proved Theorem 13..' . Call d = (b 12 + . + b/)1/2 is an invariant under this group. 
To show this. with A = Ilaijll in (25) an orthogonal matrix). ... with positive d. Xm n L L XiaijYj. since some bj #:. . .... . If this form is written with the variables Xj' the new affine transformation with the equations Yn = Xn is nonsingular. bn / d) is a vector of unit length.. we can remove the constant c by a translation.e. . hence the norm is indeed invariant. Y) = XA yT. A (homogeneous) bilinear form in two sets of variables and Yl> ..0 and c = O. into the form g = dYI' This form dy I is a canonical form for linear forms under the Euclidean group.6 281 Linear and Bilinear Forms A canonical form may be obtained readily. First. where d = (b/ + .c/bj and Xi = Yi for i #:.. . (b l / d.. .... bn ). + bnxn> it carries the form f. . we need only prove that the norm d is invariant under the Euclidean group. The permutation Zl = Yj> Zj = Yb and Zi = Yi for i #:. Under the Euclidean group. Yb . Yn) = m XI. xm) and Y = (YI.• . Now the norm d of f is just tbe length of the coefficient vector {3 = (b h ••• .0. By the choice of d.Xm. . In terms of the vectors X = (Xl> ••. the translation Xj = Yj .j will remove the constant term.' . Yn) the bilinear form may be written as the matrix product (28) b(X.. . + b/)1/2 the norm of the form (24).. it is linear in each argument separately. As a function of X and Y.1 or J will then give a new form like (24) with b l #:. Yn) = YI' Therefore all nonzero linear forms are equivalent under the affine group. . B(g. respectively. Exercises 1. . which is the rank of the matrix of the form.j In other words.(3n in Wand let the scalars ajj be defined as ajj = B(aj.9 on the equivalence of matrices. and let B(g. al'T/l + a2'T/2) = B(g. 'T/) = B(xlal + . with a new matrix PAQT. let V and W be any vector spaces having finite dimensions m and n. Choose a basis a h .Ch.5. we see that two bilinear forms are equivalent (under changes of bases) if and only if their matrices are equivalent. with values in F.. (29) B(algl + a26. . 
any bilinear function B on V and W has a unique expression. + B(g. an m x n matrix B.. Find a canonical form for homogeneous real linear functions under the similarity group. These transformations replace (28) by a new bilinear form X*(PAQT)y*T. 'T/l> 'T/2 E W. is a (complete set of) invariants. Hence. defined for arguments g E V and 'T/ E W. am in V and a basis {31. YI{31 + . g E V. as a bilinear form (27) . by (29) and (29'). Then for any vectors g and 'T/ in V and W.. Since any nonsingular matrix may be written as the transpose Q T of a nonsingular matrix.9 282 Linear Groups More generally. gl. any bilinear form is equivalent to one and only one of the canonical forms The integer r. A change of basis in both spaces corresponds to nonsingular transformations X = X* P and Y = Y*Q of each set of variables. 'T/2)a2. 'T/) be any function. + xma m . (29') B(g. 'T/) = aIB(gl> 'T/) + a2B (g2. 'T/). g2 E V.• . {3j)Yj ij = L xjajjYj' i. Equivalently. in the notation of §8. by Theorem 18 of §8. . expressed in terms of the respective bases. (3j). + ynf3n) and hence. a bilinear form is just the product XBY of a row m-vector X . relative to given bases. we have B(g.. •. over the same field F. which is bilinear in the sense that for al and a2 E F. and a column n-vector Y. . 'T/I)al 'T/ E W. 'T/) = L xjB(aj. in the projective equations for conics in homogeneous coordinates."'. Such conics have equations AX2 + Bxy + Cy2 = 1.. in the formula Ixl 2 = (x/ + x/ + . Quadratic Forms The next four sections are devoted to the study of canonical forms for quadratic functions under various groups of transformations. . The result can be written as a matrix product.7 283 Quadratic Forms 2... y) ( 3 + 2y 3 Y) = 5x 2 + 6xy + 2y 2 .)(cIlYt + . Such quadratic forms can be expressed by matrices. Treat the same question (a) under the diagonal group of transformations Yt = dtx t . y) (5x 3x + The 2 x 2 matrix of coefficients which arises here is symmetric. . 
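The agreement between the polynomial and its matrix product can be checked numerically. The following sketch (Python with numpy; the code and the sample points are of course illustrations added here, not part of the text) evaluates 5x² + 6xy + 2y² both ways:

```python
import numpy as np

# The form 5x^2 + 6xy + 2y^2 with its symmetric matrix: the xy-coefficient
# 6 is split as 3 + 3, giving the matrix with rows (5, 3) and (3, 2).
A = np.array([[5.0, 3.0], [3.0, 2.0]])

def Q(x, y):
    """Evaluate the matrix product (x, y) A (x, y)^T."""
    v = np.array([x, y])
    return v @ A @ v

# Agreement with the polynomial at a few sample points.
for x, y in [(1.0, 0.0), (0.0, 1.0), (2.0, -1.0), (1.5, 0.5)]:
    assert abs(Q(x, y) - (5*x**2 + 6*x*y + 2*y**2)) < 1e-12
```

The splitting of the cross coefficient is exactly what makes A equal to its own transpose.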
In general, a square matrix A is called symmetric if it is equal to its own transpose, Aᵀ = A; that is, ||aij|| is symmetric if and only if aij = aji for all i and j. Similarly, a matrix C is skew-symmetric if Cᵀ = −C. To split a matrix B into symmetric and skew-symmetric parts, write

(30)  B = (B + Bᵀ)/2 + (B − Bᵀ)/2 = S + K,  S = (B + Bᵀ)/2,  K = (B − Bᵀ)/2.

By the laws for the transpose, (B ± Bᵀ)ᵀ = Bᵀ ± B, so S is symmetric and K skew. Hence, provided 1 + 1 ≠ 0, any matrix can be expressed as the sum of a symmetric matrix and a skew-symmetric matrix. No other decomposition B = S₁ + K₁ with S₁ symmetric and K₁ skew is possible, since any such decomposition would give Bᵀ = S₁ᵀ + K₁ᵀ = S₁ − K₁; hence B + Bᵀ = 2S₁ and B − Bᵀ = 2K₁, so that S₁ = S and K₁ = K. Formulas (30) apply over any field in which 2 = 1 + 1 ≠ 0, but are meaningless for matrices over the field Z₂, where 1 + 1 = 0.

A homogeneous quadratic form in n variables x1, ..., xn is by definition a polynomial Σi Σj xi bij xj in which each term is of degree two. This form may be written as a matrix product XBXᵀ. If the matrix B of coefficients is skew-symmetric, then bij = −bji and the form is identically zero. In general, splitting B as in (30) into a symmetric part S and a skew part K, the skew part contributes zero to the form; hence, provided 1 + 1 ≠ 0, any quadratic form may be expressed uniquely, with S denoted by A, as

(31)  Q = XAXᵀ,  Aᵀ = A,  ||aij|| symmetric.

If a vector ξ has coordinates X = (x1, ..., xn) relative to a given basis, each quadratic form thus determines a quadratic function Q(ξ) = XAXᵀ of the vector ξ. A change of basis in the space gives new coordinates X* related to the old coordinates by an equation X = X*P, with P nonsingular. In terms of the new coordinates, the quadratic function becomes X*(PAPᵀ)X*ᵀ; this is another quadratic form, with a new matrix PAPᵀ. The new matrix, like A, is symmetric, for by the laws of the transpose (PAPᵀ)ᵀ = Pᵀᵀ Aᵀ Pᵀ = PAPᵀ. This proves

Theorem 14. A change of coordinates replaces a quadratic form with matrix A by a quadratic form with matrix PAPᵀ, where P is nonsingular.

Symmetric matrices A and B are sometimes called congruent when (as in this case) B = PAPᵀ for some nonsingular P. Theorem 14 asserts that the problem of reducing a homogeneous quadratic form to canonical form, under the full linear group of nonsingular linear homogeneous substitutions on its variables, is equivalent to the problem of finding a canonical form for symmetric matrices A under the group of transformations A ↦ PAPᵀ.

Exercises

1. Find the symmetric matrix associated with each of the following quadratic forms:
(a) 2x² + 3xy + 6y²; (b) 8xy + 4y²; (c) x² + 2xy + 4xz + 3y² + yz + 7z²; (d) 4xy; (e) x² + 4xy + 4y² + 2xz + z² + 4yz.
2. Represent each form of Ex. 1 as a matrix product XAXᵀ.
3. Prove that AᵀA and AAᵀ are always symmetric.
4. Prove: if A is skew-symmetric, then A² is symmetric.
5. Prove: if A and B are symmetric, then AB is symmetric if and only if AB = BA.
6. Describe the symmetry of the matrix AB − BA in the following cases: (a) A and B both symmetric; (b) A and B both skew-symmetric; (c) A symmetric and B skew-symmetric.
7. Let D be a diagonal matrix with no repeated diagonal entries. Show that AD = DA if and only if A is also diagonal.
8. (a) Prove: over the field Z₂ (integers mod 2), every skew-symmetric matrix is symmetric. (b) Exhibit a matrix over Z₂ which is not a sum S + K (cf. (30)).
9. Prove: (a) if S is symmetric and A orthogonal, then A⁻¹SA is symmetric; (b) if K is skew-symmetric and A orthogonal, then A⁻¹KA is skew-symmetric.
10. Show that if the real matrix S is skew-symmetric and I + S is nonsingular, then (I − S)(I + S)⁻¹ is orthogonal.
11. Show that a real n × n matrix A is symmetric if and only if the associated linear transformation T = T_A of Euclidean n-space satisfies (ξT, η) = (ξ, ηT) for any two vectors ξ and η.
12. A bilinear form B(ξ, η), with ξ and η both in V, is symmetric when B(ξ, η) = B(η, ξ). Prove that if B is a symmetric bilinear form, then Q(ξ) = B(ξ, ξ) is a quadratic form, with 2B(ξ, η) = Q(ξ + η) − Q(ξ) − Q(η).
*13. If Q(ξ) is a quadratic function, prove that

  Q(α + β + γ) − Q(α + β) − Q(β + γ) − Q(γ + α) + Q(α) + Q(β) + Q(γ) = 0.

9.8. Quadratic Forms Under the Full Linear Group

The familiar process of "completing the square" may be used as a device for simplifying a quadratic form by linear transformations.
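Before this device is carried out in general, here is a small numerical check of it for two variables (an illustration added here, in Python with numpy; the coefficients 2, 3, 5 are arbitrary samples, not from the text):

```python
import numpy as np

# "Completing the square" in a*x^2 + 2b*xy + c*y^2 with a != 0:
# the substitution x' = x + (b/a)*y, y' = y eliminates the cross term,
#   a*x^2 + 2b*xy + c*y^2  =  a*x'^2 + (c - b^2/a)*y'^2.
a, b, c = 2.0, 3.0, 5.0          # sample coefficients (assumed, for illustration)

rng = np.random.default_rng(0)
for x, y in rng.normal(size=(50, 2)):
    left = a*x*x + 2*b*x*y + c*y*y
    xp, yp = x + (b/a)*y, y      # the new variables x', y'
    right = a*xp*xp + (c - b*b/a)*yp*yp
    assert abs(left - right) < 1e-9
```

The substitution is a nonsingular linear change of variables, so the two sides are the same form in different coordinates.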
For two variables, the procedure gives

  ax² + 2bxy + cy² = a[x² + 2(b/a)xy + (b²/a²)y²] + [c − (b²/a)]y²
                   = a[x + (b/a)y]² + [c − (b²/a)]y².

The term in brackets suggests the new variables x′ = x + (b/a)y, y′ = y. Under this linear change of variables the cross term has been eliminated, and the form becomes ax′² + [c − (b²/a)]y′². This argument requires a ≠ 0. If a = 0 but c ≠ 0, a similar transformation works, with the roles of x and y interchanged. Finally, if a = c = 0, the original form is 2bxy, and the corresponding equation 2bxy = 1 represents an equilateral hyperbola. Just as in the case of the equilateral hyperbola, the transformation x = x′ + y′, y = x′ − y′ will reduce the form to 2b(x′ + y′)(x′ − y′) = 2b(x′² − y′²); the result again contains only square terms.

For the general case, the first step is given by the following lemma.

Lemma. By a nonsingular linear transformation, any quadratic form Σ xi aij xj not identically zero can be reduced to a form with leading coefficient a11 ≠ 0, provided only that 1 + 1 ≠ 0.

Proof. By hypothesis, at least one coefficient aij ≠ 0. If there is a diagonal term aii ≠ 0, one can get a new leading coefficient a11′ ≠ 0 by interchanging the variables x1 and xi (this is a nonsingular transformation, because its matrix is a permutation matrix). In the remaining case, all diagonal terms aii are zero, but there are indices i ≠ j with aij ≠ 0. By permuting the variables, we can make a12 ≠ 0; and by the symmetry of the matrix, a12 = a21. The given quadratic form then contains the terms a12x1x2 + a21x2x1 = 2a12x1x2, plus terms involving other variables. The substitution x1 = y1 + y2, x2 = y1 − y2 yields a form with leading coefficient 2a12 ≠ 0. This transformation is nonsingular, for by elimination one easily shows that it has an inverse. (Query: Where does this argument use the hypothesis 1 + 1 ≠ 0?)

Now for the completion of the square in any quadratic form! By the Lemma, we may assume a11 ≠ 0, so the form can be written as a11(Σ xi bij xj), where bij = aij/a11 and b11 = 1. The terms which actually involve x1 are then

  a11(x1² + 2b12x1x2 + ··· + 2b1nx1xn).

The formation of this "perfect square" suggests the transformation

  y1 = x1 + Σ_{j=2}^{n} b1j xj,  y2 = x2, ..., yn = xn;

then y1 will appear only as y1². The original form is now

  a11y1² + Σ yj cjk yk,

where the indices j and k of the residual part run from 2 to n. This residual part in y2, ..., yn is a quadratic form in n − 1 variables, and to it the same process applies. The process may be repeated (an induction argument!) till the new coefficients in one of the residual quadratic forms are all zero. Hence we have

Theorem 15. By nonsingular linear transformations of the variables, a quadratic form over any field with 1 + 1 ≠ 0 can be reduced to a diagonal quadratic form

(32)  d1y1² + d2y2² + ··· + dryr²,  each di ≠ 0.

In the diagonal form (32) the number r of nonzero diagonal terms is an invariant, but the coefficients are not, since different methods of reducing the form may well yield different sets of coefficients d1, ..., dr. This number r is called the rank of the given form XAXᵀ. Its invariance is immediate, for r is the rank of the diagonal matrix D of the reduced form (32); this rank must equal the rank of the matrix A of the original quadratic form, since by Theorem 14 our transformations reduce A to D = PAPᵀ, and we already know (§8.9, Theorem 19) that rank is invariant under the more general transformation A ↦ PAQ. A quadratic form XAXᵀ in n variables is called nonsingular if its rank is n — that is, if the matrix A is nonsingular.
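The inductive reduction just described can be mimicked numerically by congruence operations on the symmetric matrix. The sketch below (Python with numpy, added here as an illustration; the routine and the sample matrix are assumptions, not the text's notation) carries out completing-the-square row by row, including the Lemma's trick for a zero diagonal entry, and checks that the rank survives:

```python
import numpy as np

def congruence_diagonalize(A, tol=1e-12):
    """Return (P, D) with D = P A P^T diagonal and P nonsingular.
    Mimics 'completing the square' step by step on a symmetric A."""
    A = np.array(A, dtype=float)
    n = A.shape[0]
    P = np.eye(n)
    D = A.copy()
    for k in range(n):
        if abs(D[k, k]) < tol:
            # Lemma's device: create a nonzero diagonal entry from a cross term
            for j in range(k + 1, n):
                if abs(D[k, j]) > tol:
                    E = np.eye(n); E[k, j] = 1.0    # row_k += row_j (and column)
                    D = E @ D @ E.T; P = E @ P
                    break
        if abs(D[k, k]) < tol:
            continue
        E = np.eye(n)
        E[k+1:, k] = -D[k+1:, k] / D[k, k]          # eliminate the cross terms
        D = E @ D @ E.T; P = E @ P
    return P, D

A = np.array([[0.0, 1.0], [1.0, 0.0]])              # the form 2*x1*x2
P, D = congruence_diagonalize(A)
assert np.allclose(P @ A @ P.T, D)
assert np.allclose(D, np.diag(np.diag(D)))          # D is diagonal
# The rank of the diagonal form equals the rank of A (Theorem 15).
assert np.linalg.matrix_rank(D) == np.linalg.matrix_rank(A)
```

Note that over the reals this works for every symmetric A; the Query's hypothesis 1 + 1 ≠ 0 enters in the division by 2 hidden in the zero-diagonal step.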
Exercises

1. Reduce 2x² + xy + 3y² to diagonal form over the field Z₅ of integers modulo 5.
2. Over the field of rational numbers, reduce each of the quadratic forms of Ex. 1 of §9.7 to diagonal form.
3. Find a P such that PAPᵀ is diagonal, for each of the symmetric matrices A given.
4. Show rigorously that the quadratic form xy is not equivalent to a diagonal form under the group L₂(Z₂).
5. Over the field Z₅, prove that every quadratic form may be reduced by linear transformations to a form Σ di yi² in which each coefficient di is 0, 1, or 2.
6. Over the field of rational numbers, show that the quadratic form x1² + x2² can be transformed into both of the distinct diagonal forms 9y1² + 4y2² and 2z1² + 8z2².
7. Find all linear transformations which carry the real quadratic form x1² + ··· + xn² into y1² + ··· + yn².

9.9. Real Quadratic Forms Under the Full Linear Group

Conic sections and quadric surfaces are described in analytic geometry by real quadratic polynomial functions. We shall now get a complete set of invariants for the special case of the real field.

Over the real field, each term of the diagonal form (32) can be simplified further by the substitution yi′ = (±di)^(1/2) yi, the sign being chosen so that ±di > 0; the term di yi² then becomes ±yi′². Carrying out these substitutions simultaneously on all the variables will reduce the quadratic form to Σ ±yi′². In this sum the variables may be permuted so that the positive squares come first. This proves

Theorem 16. Any quadratic function over the field of real numbers can be reduced by a nonsingular linear transformation of the variables to a form

(33)  z1² + ··· + zp² − z_{p+1}² − ··· − zr².

Theorem 17. The number p of positive squares which appear in the reduced form (33) is an invariant of the given function Q, in the sense that p depends only on the function and not on the method used to reduce it (Sylvester's law of inertia).

Proof. Suppose that there were another reduced form

(34)  y1² + ··· + yq² − y_{q+1}² − ··· − yr²,

with q positive terms, r being the rank of the form. Since both are obtained from the same Q by nonsingular transformations, there is a nonsingular transformation carrying (33) into (34). We may regard the equations of this transformation as a change of coordinates ("alias"); then (33) and (34) represent the same quadratic function Q(ξ) of a fixed vector ξ, with coordinates zi relative to one basis and yj relative to another.

Suppose q < p. By (33), Q(ξ) ≥ 0 for every ξ with coordinates z_{p+1} = ··· = zr = 0. The ξ's satisfying these r − p equations form an (n − r + p)-dimensional subspace S₁ (in this subspace there remain the n − (r − p) free coordinates z1, ..., zp, z_{r+1}, ..., zn). Similarly, (34) makes Q(ξ) < 0 for each ξ ≠ 0 with coordinates y1 = ··· = yq = y_{r+1} = ··· = yn = 0; these conditions determine an (r − q)-dimensional subspace S₂. The sum of the dimensions of these subspaces S₁ and S₂ is (n − r + p) + (r − q) = n + (p − q) > n. Hence, according to Theorem 17 of §7, the dimension of the intersection S₁ ∩ S₂ is positive; therefore S₁ and S₂ have a nonzero vector ξ in common. For this common vector, Q(ξ) ≥ 0 by (33) and Q(ξ) < 0 by (34) — a manifest contradiction. The assumption q > p would lead to a similar contradiction; hence q = p, completing the proof.

This result shows that any real quadratic form can be reduced by linear transformations to one and only one form of the type (33), since two forms are equivalent if and only if they reduce to the same canonical form (33). The expressions Σ ±zi² of this type are therefore canonical for quadratic forms under the full linear group. The canonical form itself is uniquely determined by the so-called signature {+, ..., +, −, ..., −}, a set of p positive and r − p negative signs. This set of signs is determined by r and by s = p − (r − p) = 2p − r (s is the number of positive signs diminished by the number of negative signs); sometimes this integer s is itself called the signature. Together, r and s form a complete system of numerical invariants:

Theorem 18. Two real quadratic forms are equivalent under the full linear group if and only if they have the same rank and the same signature.

A real quadratic form Q = XAXᵀ in n variables is called positive definite when X ≠ O implies Q > 0; a real symmetric matrix A is called positive definite under the same conditions. If we consider the canonical reduced form (33), it is evident that this is the case if and only if the reduced form is z1² + ··· + zn². This is because a sum of n squares is positive unless all terms are individually zero, and because for X = En, the nth unit vector, XAXᵀ ≤ 0 in (33) unless p = r = n. This proves

Theorem 19. A real quadratic form in n variables is positive definite if and only if its rank and its signature are both equal to n.

A quadratic form XAXᵀ defines in n-dimensional real space a locus, consisting of all points X satisfying XAXᵀ = 1. The canonical form (33) means that a suitable nonsingular linear transformation will reduce this locus to one with an equation

  z1² + ··· + zp² − z_{p+1}² − ··· − zr² = 1.

In the plane, the reduced equations of rank 2 are x² + y² = 1, x² − y² = 1, and −x² − y² = 1. They represent, respectively, a circle, an equilateral hyperbola, and no locus; those of rank 1 are x² = 1 (which represents the two lines x = ±1) and −x² = 1 (no locus). The only form of rank 0 gives 0 = 1 (no locus). In §8.8 it was proved (Theorem 15, Corollary 2) that any nonsingular linear transformation of the plane can be represented as a product of shears, compressions, and reflections. Hence any "central conic" with an equation ax² + bxy + cy² = 1 can be reduced to one of the forms just listed by a succession of shears, compressions, and reflections. Geometrically, this result is reasonable: an ellipse could be compressed along one axis to make a circle; but, clearly, no sequence of linear transformations could reduce a circle x² + y² = 1 to an equilateral hyperbola x² − y² = 1. This is the geometric significance of the invariance of the signature in this case.

The canonical form also gives the following further result.

Theorem 20. A real symmetric matrix A is positive definite if and only if there exists a real nonsingular matrix P such that A = PPᵀ.

Proof. If A is positive definite, Theorem 14 and the canonical form z1² + ··· + zn² give a nonsingular Q with QAQᵀ = I; this means that A = Q⁻¹(Q⁻¹)ᵀ = PPᵀ with P = Q⁻¹. Conversely, if A = PPᵀ with P nonsingular, then XAXᵀ = (XP)(XP)ᵀ = |XP|² > 0 for every X ≠ O.
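Sylvester's law and Theorem 20 can both be illustrated numerically. The sketch below (Python with numpy, added here; the sample matrices are assumptions for illustration) counts positive and negative squares by eigenvalue signs, and exhibits the factorization A = PPᵀ for a positive definite A via the standard Cholesky routine:

```python
import numpy as np

def inertia(A, tol=1e-9):
    """(p, m, r): numbers of positive and negative squares, and the rank.
    By Sylvester's law these depend only on the form, not on the reduction."""
    w = np.linalg.eigvalsh(A)
    p = int(np.sum(w > tol)); m = int(np.sum(w < -tol))
    return p, m, p + m

A = np.array([[2.0, 1.0], [1.0, 2.0]])   # a positive definite sample form
assert inertia(A) == (2, 0, 2)           # rank 2, signature 2: definite

# Theorem 20: A positive definite  <=>  A = P P^T with P nonsingular.
P = np.linalg.cholesky(A)                # lower-triangular P with A = P P^T
assert np.allclose(P @ P.T, A)

B = np.array([[1.0, 0.0], [0.0, -1.0]])  # x^2 - y^2: the hyperbola case
assert inertia(B) == (1, 1, 2)           # rank 2, signature 0
```

Any other congruence reduction of A would, by Theorem 17, produce the same counts (2, 0).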
The signature is useful in studying the maxima and minima of functions of two variables. Let z = f(x, y) be a smooth function whose first partial derivatives f_x and f_y both vanish at x = x0, y = y0, so that there are no first-degree terms in the Taylor's series expansion of z in powers of h = x − x0 and k = y − y0. This expansion is

  f(x0 + h, y0 + k) = f(x0, y0) + (1/2)[ah² + 2bhk + ck²] + ···,

the coefficients a, b, and c being the second partial derivatives f_xx, f_xy, and f_yy evaluated at (x0, y0). For small values of h and k, the controlling term is the one in brackets; it is a quadratic form in h and k with real coefficients. If this form has rank 2, it can be expressed in terms of transformed variables h′ and k′ as ±h′² ± k′². If both signs are plus, nearby values f(x0 + h, y0 + k) must always exceed f(x0, y0), and z has a relative minimum. If both signs are minus, z has a maximum. If one sign is plus and one sign is minus, the quadratic form may take on both positive and negative values, so x0, y0 is neither a maximum nor a minimum, but a saddle-point (like a saddle, or a pass between two mountain peaks, where motion in one direction increases the altitude z, while motion in another direction decreases it). Maxima, minima, and saddle-points of f are therefore distinguished by the signature of the quadratic form. Similar results hold for critical points of functions of three or more variables.

Exercises

1. Reduce the following real quadratic forms to the canonical form of Theorem 16:
(a) 9x1² + 12x1x2 + 79x2²; (b) 2x1² − 12x1x2 + 18x2²; (c) −2x1² − 4x1x2 + 22x2² + 12x2x3 + 6x3x1 − x3².
2. Find the rank and the signature of each form in Ex. 1.
3. Describe the geometrical loci corresponding to the various possible canonical forms for real quadratic forms in three dimensions.
4. (a) List all the types of nonsingular quadratic forms in four variables. (b) Describe geometrically at least two of the corresponding loci in R⁴.
5. Prove that the real quadratic function ax² + bxy + cy² is positive definite if and only if a > 0 and 4ac − b² > 0.
6. Show that a positive definite symmetric matrix has all positive entries on its main diagonal.
7. A quadratic form is called positive semidefinite if its rank equals its signature. State and prove an analogue of Theorem 19 for such forms.
8. Do the same for Theorem 20.
9. Prove that the bilinear function XAYᵀ is an "inner product" if and only if A is symmetric and positive definite.
10. Prove: a homogeneous quadratic form with complex coefficients is always equivalent under the full complex linear group to a sum of squares z1² + ··· + zr².
11. Prove that two quadratic forms in n variables with complex coefficients are equivalent under the full linear group if and only if they have the same rank.
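The second-derivative test described above amounts to reading off the signature of the bracketed form. A short sketch (Python with numpy, added here; the classifier and samples are illustrations, not the text's):

```python
import numpy as np

# Classify a critical point of f(x, y) by the signature of the quadratic
# form (1/2)(a h^2 + 2b hk + c k^2), where a = f_xx, b = f_xy, c = f_yy
# are the second partials at the critical point.
def classify(a, b, c, tol=1e-9):
    w = np.linalg.eigvalsh(np.array([[a, b], [b, c]], dtype=float))
    if np.all(w > tol):
        return "minimum"          # signature +, +
    if np.all(w < -tol):
        return "maximum"          # signature -, -
    if w[0] < -tol and w[1] > tol:
        return "saddle"           # signature +, -
    return "degenerate"           # rank < 2: the test is inconclusive

# f = x^2 + y^2 has a minimum at the origin; f = x^2 - y^2 a saddle.
assert classify(2, 0, 2) == "minimum"
assert classify(2, 0, -2) == "saddle"
assert classify(-2, 0, -2) == "maximum"
```

The "degenerate" branch corresponds to the rank-deficient case, where the bracketed term no longer controls the behavior of f.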
9.10. Quadratic Forms Under the Orthogonal Group

How far can a real quadratic form be simplified by transformations restricted to be orthogonal? An orthogonal transformation Y = XP changes XAXᵀ into Y(P⁻¹A(P⁻¹)ᵀ)Yᵀ; since P is orthogonal, the new matrix may be written†

  P⁻¹A(P⁻¹)ᵀ = P⁻¹AP.

† Two symmetric matrices A and P⁻¹AP, with P orthogonal, are sometimes called orthogonally congruent.

In the plane, an orthogonal transformation (rotation or reflection) of an ellipse will never yield a circle; at most, one can hope to rotate the axes of the ellipse into standard position. The key is the following lemma; the latter statement of its converse insures the absence of an xy term in Q.

Lemma. If a real quadratic function Q = ax² + 2bxy + cy² has, among all points on the unit circle x² + y² = 1, a maximum value at the point x = 0, y = 1, then b = 0.

Proof. Consider Q as a (two-valued) function of the one variable x, where y is given implicitly in terms of x by x² + y² = 1. Differentiating this relation, we get 2x + 2y(dy/dx) = 0, so the derivative y′ = dy/dx is y′ = −x/y. The derivative of Q is

  Q′ = (ax² + 2bxy + cy²)′ = 2ax + 2by + 2bxy′ + 2cyy′.

At the maximum x = 0, y = 1, this derivative must be zero, as in the calculus. Putting in the value of y′ and setting x = 0, y = 1, one finds Q′ = 2b; hence 2b = 0. Q.E.D.

Conversely, consider any real quadratic function Q(ξ) = ax² + cy² with a ≤ c and no xy term. Then Q(ξ) ≤ cx² + cy² = c(x² + y²); this means that the maximum value assumed by Q at points of the unit circle x² + y² = 1 is c, and this maximum is taken on at the point x = 0, y = 1.

Now return to quadratic forms in n variables. In n-space the unit hypersphere Σ xi² = 1 is a closed and bounded set S; its points are all vectors of length one. On this hypersphere the values taken on by a real quadratic form Q(ξ) = Σ xi aij xj have an upper bound Σ |aij|. Since Q(ξ) is a continuous function of ξ, Q(ξ) has a maximum‡ λ1 on S. In other words, among all vectors ξ of unit length there is one, ξ0, at which Q(ξ) takes on its maximum value λ1.

‡ Here, as in calculus, we assume the fact that a function continuous on a bounded closed set has a maximum value on this set.

Since ξ0 has length 1, we may choose α1 = ξ0 as the first vector of a new normal orthogonal basis α1, ..., αn (Theorem 21, §7.11). In terms of the new coordinates y1, ..., yn of ξ relative to this basis, we have made an orthogonal transformation of the variables, and the quadratic form is now expressed as Q(ξ) = Σ yi bij yj with a new matrix of coefficients bij. The maximum value λ1 of Q is given by the vector α1, whose new coordinates are (1, 0, ..., 0); so, by substitution, the maximum value λ1 is b11. This maximum will remain the maximum if we further restrict the variables so that all but two of them, y1 and yi, are zero; thus λ1 = b11 is the maximum of the form

  b11y1² + 2b1iy1yi + biiyi²,

subject to the condition y1² + yi² = 1, taken on at y1 = 1, yi = 0. The Lemma (with x replaced by yi) then asserts that the cross-product coefficient b1i is zero. This argument applies to each i = 2, ..., n. Therefore Q, in these coordinates yi, loses all cross-product terms involving y1 and becomes

(35)  Q(ξ) = λ1y1² + Σ_{i=2}^{n} Σ_{j=2}^{n} yi bij yj.

The difference Q*(ξ) = Q(ξ) − λ1y1² in (35) is a quadratic form in the n − 1 variables y2, ..., yn. These variables are coordinates in the space S_{n−1} spanned by the n − 1 new basis vectors α2, ..., αn. In this space (which is the orthogonal complement of the first basis vector ξ0), we may reapply the same device, choosing a new normal orthogonal basis which makes Q*(ξ) a maximum for |ξ| = 1; this splits another diagonal term off the form, and so on. One finally finds a basis of principal axes for which

(36)  Q(ξ) = λ1z1² + λ2z2² + ··· + λnzn².

Here z1, ..., zn are the coordinates of ξ relative to a basis α1, β2, γ3, ..., which has been chosen step by step by successive maximum requirements.

The coefficients λi of (36) may thus be characterized as the solutions of certain maximum problems which depend only on Q, and not on a particular coordinate system. The first coefficient λ1 is not a vector but a scalar: the maximum of Q(ξ) on the sphere |ξ| = 1. The first vector α1 gives Q(ξ) this maximum value λ1, subject only to the restriction |ξ| = 1. The second basis vector β2 was chosen as a vector in the space orthogonal to α1: η = β2 makes Q(η) a maximum λ2 among all vectors η for which |η| = 1 and (η, α1) = 0. The third basis vector yields a maximum for Q(ζ) among all vectors ζ with |ζ| = 1 orthogonal to α1 and β2; and so on.

These successive maximum problems may be visualized (in inverted form) on an ellipsoid with three different semiaxes a > b > c > 0. The major axis a might be characterized as the longest diameter; the shortest principal axis c is the minimum diameter; and the next principal axis b is the minimum diameter among all those perpendicular to the shortest axis.

Since the vectors of the new basis are normal and orthogonal, the passage to this basis is an orthogonal transformation. We have proved

Theorem 21. Any real quadratic form in n variables assumes the diagonal form (36) relative to a suitable normal orthogonal basis. Equivalently, any real homogeneous quadratic function of n variables can be reduced to the diagonal form (36) by an orthogonal point-transformation.

Either of these two results is known as the "Principal Axis Theorem."

In the plane, the canonical forms of the equations Q(ξ) = 1 are simply λ1x² + λ2y² = 1; they include the usual standard equations for an ellipse (λ1 ≥ λ2 > 0) and for a hyperbola (λ1 > 0 > λ2), and the coefficients determine the lengths of the axes. In three-space, a similar remark applies to the three coefficients λ1, λ2, λ3. If all are positive, the locus Q = 1 is an ellipsoid; if one is negative, a hyperboloid of one sheet; if two are negative, a hyperboloid of two sheets; if all three are negative, there is no locus. (Note again the role of the signature and of the rank.)

An ambiguity in the reduction process could arise only if the first maximum (or some later maximum) were given by two or more distinct vectors ξ0 and η0 of length 1. Even in this case the λi can still be proved unique (§10.4).

If the quadratic form is replaced by its symmetric matrix, the theorem asserts

Corollary 1. For any real symmetric matrix A there is a real orthogonal matrix P such that PAPᵀ = PAP⁻¹ is diagonal.

In other words, we have shown that any real symmetric matrix is similar to a diagonal matrix. Comparing with Theorem 4, we see that the λi in the canonical form (36) are just the eigenvalues of A, and (for A symmetric) the principal axes of the quadratic function XAXᵀ are precisely the eigenvectors of the linear transformation X ↦ XA. As in Theorem 2, the "alias" result of Theorem 21 may thus be rewritten in "alibi" form as

Corollary 2. For A real symmetric, the linear transformation X ↦ XA has a basis of orthogonal eigenvectors with real eigenvalues.

The new basis α1*, ..., αn* can be expressed in terms of the original basis ε1 = (1, 0, ..., 0), ..., εn = (0, ..., 0, 1) as αi* = Σj pij εj; the old coordinates x1, ..., xn are then expressed in terms of the new coordinates x1*, ..., xn* as xj = Σi xi* pij. Since the vectors α1*, ..., αn* are normal and orthogonal, the matrix P = ||pij|| of coefficients is an orthogonal matrix.
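Numerically, the Principal Axis Theorem is a symmetric eigenvalue computation. A sketch (Python with numpy, added here for illustration; the sample matrix is an assumption):

```python
import numpy as np

# Principal Axis Theorem for a sample symmetric matrix: find an
# orthogonal P with P A P^T diagonal; the diagonal entries are the
# coefficients lambda_i of (36).
A = np.array([[5.0, -3.0], [-3.0, 5.0]])

w, V = np.linalg.eigh(A)      # columns of V: orthonormal eigenvectors
P = V.T                       # rows of P: the new normal orthogonal basis
assert np.allclose(P @ P.T, np.eye(2))        # P is orthogonal
assert np.allclose(P @ A @ P.T, np.diag(w))   # the diagonal form (36)
# here the eigenvalues are 5 - 3 = 2 and 5 + 3 = 8
```

The larger eigenvalue is exactly the maximum of XAXᵀ on the unit circle, attained at the corresponding eigenvector, in accordance with the maximum characterization above.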
Corollary 3. Let A be any real symmetric matrix, and B any positive definite (real) symmetric matrix. Then there exists a real nonsingular matrix P such that PAPᵀ and PBPᵀ are simultaneously diagonal.

Corollary 4. For A real symmetric and B positive definite symmetric, there is a basis of vectors ξj and real numbers λj with Aξj = λjBξj.

We leave the proofs as exercises. To find a basis of vectors ξj such that Aξj = λjBξj is called the generalized eigenvector problem; its solution plays a basic role in vibration theory.

Corollary 5. Every nonsingular real matrix A can be expressed as a product A = SR, where S is a symmetric positive definite matrix and R is orthogonal.

Proof. We already know (Theorem 20) that AAᵀ is symmetric and positive definite. By the present theorem, there is an orthogonal matrix P with P⁻¹AAᵀP diagonal and positive definite. The diagonal entries are thus positive; by extracting their square roots, we obtain a positive definite diagonal matrix T with T² = P⁻¹AAᵀP, and hence a positive definite symmetric matrix S = PTP⁻¹ with S² = AAᵀ. The corollary will be proved if we show that R = S⁻¹A is orthogonal, for then A = SR. But

  RRᵀ = S⁻¹AAᵀ(S⁻¹)ᵀ = S⁻¹S²(S⁻¹)ᵀ = S(S⁻¹)ᵀ = SS⁻¹ = I,

since (S⁻¹)ᵀ = S⁻¹ for symmetric S.

Exercises

1. Reduce the following quadratic forms to diagonal form by orthogonal transformations, following the method given: (a) 5x² − 6xy + 5y²; (b) 2x² + 4√3·xy − 2y².
2. Consider the real quadratic form ax² + 2bxy + cy². (a) Show that a + c and b² − ac are invariant under orthogonal transformations. (b) If cot 2α = (a − c)/2b, show that the form is diagonalized by the orthogonal substitution x = x′ cos α − y′ sin α, y = x′ sin α + y′ cos α. (Hint: How is the transformation used here related to a rotation of the axes of the hyperbola?)
3. Consider the quadratic form ax² + 2bxy + cy² on the unit circle x = cos θ, y = sin θ. Show that its extreme values are (cf. Ex. 1)

  [(a + c) ± √((a − c)² + 4b²)]/2.

Check by the calculus.
4. To the quadratic form 9x1² + 9x2² + 18x3², apply the orthogonal transformation indicated. For the resulting form Q in y1, y2, and y3, show directly that the vector (2/3, 2/3, −1/3) yields the maximum value 18 for Q when y1² + y2² + y3² = 1.
5. Show that there is no orthogonal matrix with rational entries which reduces the form xy to diagonal form.
6. Prove Corollaries 3 and 4 of Theorem 21. (Hint: Consider XAXᵀ as a quadratic function in the Euclidean vector space with inner product XBXᵀ, and write B = PPᵀ by Theorem 20.)
7. (a) If A = SR, with S symmetric and R orthogonal, prove that S² = AAᵀ. *(b) Show that there is only one positive definite symmetric matrix S which satisfies S² = AAᵀ. (Hint: Any eigenvector for S² must be one for S.)
8. Prove that every real skew-symmetric matrix A has the form A = P⁻¹BP, where P is orthogonal and B² is diagonal.
*9. A Lorentz transformation is defined to be a linear transformation leaving x1² + x2² + x3² − x4² invariant. Show that a matrix P defines a Lorentz transformation if and only if P⁻¹ = SPᵀS = SPᵀS⁻¹, where S is the special diagonal matrix with diagonal entries 1, 1, 1, −1.
Quadrics Under the Affine and Euclidean Groups Consider next an arbitrary nonhomogeneous quadratic function of a vector g with coordinates XI> . Check by the calculus. 5. . §9. + b~zn + c/. The product YAK T (row matrix x matrix x column matrix) is a scalar.. a homogeneous linear transformation X = YP changes f(~) to Y(PApT)yT + (BpT)yT + c. (39) f(~) = YA yT + (2KA + B) yT + KAKT + BKT + c. The result is one of the forms (40) f(~) = Aly/ + . A translation leaves unaltered the matrix A of the homogene- ous quadratic part of a quadratic function f(~). d > O.2 :> . just as in the case of transformation of a homogeneous form alone.) The bj associated with nonzero Aj can now be eliminated by the simple device of "completing the square.11 Quadrics Under the Affine and Euclidean Groups A similar computation works for n variables. as in Theorem 13. As in §9. an exact analogue of the formula (38).. P orthogonal! By the remarks above. An no Ai = 0. permuting the variables so that the nonzero A's come first. + ArY/ + c/. all told. :> + . This proves the Lemma." using a translation Yj = Zj + bj/2Aj • Now.. the orthogonal transformation by P alone may be used to simplify the matrix A of the quadratic terms.. . one finds (with new coefficients b. . . . + ArY/ + dYr+b (41) f(~) = Aly/ where Al :> 11. hence equals its transpose KA TyT = KAyT.. in this quadratic function the new matrix of quadratic terms is PAp T. If the linear part of this function is not just the constant c/. Now to reduce the real function f(~) by a rigid motion with equations X = YP + K. exactly as for a homogeneous quadratic form. On the other hand. a translation X X .K (K a row matrix) gives 297 ~ Y = f(~) = (Y + K)A(Y + K)T + B(Y + K)T + c = YAy T + KAyT + YAK T + KAKT + ByT + BKT + c.+lzr+l + .. + ArY/ + b. we get f(~) = Aly/ + . it may be changed by a suitable translation and orthogonal transformation.. to the form dYr+I' This transformation need not affect the first r variables.10. 4. 
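In modern numerical terms, the two steps of this reduction — an orthogonal transformation diagonalizing the quadratic part, then a translation completing the square — can be checked on a small example. The following Python/NumPy sketch uses the matrix of the form 5x² − 6xy + 5y² (Exercise 1(a) above) as the quadratic part; the linear part B and constant c are assumed for illustration, and the library eigensolver stands in for the maximum construction of the text.

```python
import numpy as np

# Quadratic part from Exercise 1(a); linear part B and constant c are assumed.
A = np.array([[5.0, -3.0],
              [-3.0, 5.0]])
B = np.array([2.0, -4.0])
c = 1.0

def f(X):                          # f(xi) = X A X^T + B X^T + c
    return X @ A @ X + B @ X + c

# Step 1: orthogonal transformation X = Y P diagonalizing the quadratic part.
lams, Q = np.linalg.eigh(A)        # eigenvalues [2, 8] for this A
P = Q.T                            # rows of P are orthonormal eigenvectors
assert np.allclose(P @ A @ P.T, np.diag(lams))
B1 = B @ P.T                       # new linear coefficients B P^T

# Step 2: complete the square with the translation y_j = z_j - b'_j / (2 lam_j).
K = -B1 / (2 * lams)
c1 = f(K @ P)                      # new constant term: the value of f there

# In the new coordinates Z, f is lam_1 z_1^2 + lam_2 z_2^2 + c1, as in (40):
Z = np.array([0.7, -1.2])
assert np.isclose(f((Z + K) @ P), lams @ Z**2 + c1)
```

Since both eigenvalues are nonzero here, the linear terms disappear entirely and the reduced function has the form (40).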
each quadratic function f(~) = XAX T + BXT + c defines a figure or locus. An affine transformation Y = XP + K applied to the equation of this surface . < () r n .Ch. . in three-space. the uniqueness of these (including multiplicity) will be proved in § 10... with P nonsingular. it goes as follows. which consists of all those vectors ~ which satisfy the equation f(~) = O. For affine transformations X = YP + K. + YP 2 - Yp+l 2 - ••• - y. Under the Euclidean group of all rigid motions. - y.. Theorem 23. In particular. By an affine transformation (or by an affine change of coordinates) any real quadratic function in n variables may be reduced to one of the forms (4 2) Yl 2 + . 9 298 Linear Groups Theorem 22. the number r of squares in (40) or (41) is an invariant. the coefficients now can all be made ±1. Since the quadratic terms are unaffected by translation. In two-dimensional space. . The invariants d and c' are most simply characterized intuitively using the calculus. In reducing the quadratic part to diagonal form. + YP 2 - Yp+l 2 - • . The Ai are (see § 9. the figure found from such a quadratic equation is simply an ordinary conic section .10) the eigenvalues of the matrix A of (37). but d can be characterized as the minimum of Igrad f I.. = 0.+l ( r -<) n . any real quadratic form (37) is equivalent to one of the forms (40) or (41). In (40). In outline. and in general it may be called a hyperquadric (or a quadric hypersurface). unaltered under A ~ PApT. These reduced forms are actually canonical under the Euclidean group. and c' is the constant value of f(~) on this (invariant) locus.6. it is a quadratic surface. a similar treatment applies. In (41) it is the subspace Yl = .. the locus is empty. but the proof is much more difficult. this minimum can also be proved invariant under the Euclidean group. note that r is also the rank of A.9. Consider the locus where the vector gradf = (~" aYl . since afjay.2 + Y. 
the rank rand the number p of positive terms must be invariants by the law of inertia (Theorem 17).2 + c (43) Yl 2 + . = y.. The linear part is then treated as in §9. as in § 9..+! = d ~ 0.~) aYn is the zero vector. From a geometrical point of view. Observe in particular that the different canonical functions x 2 + l + 1 and x 2 + 1 give the same locus (namely. this equation gives the same locus as does 2 (c -')y. which are thus equivalent. This may be used to simplify the canonical forms such as y.p..y2. = . (45) y. - ••• - Yr - ••• - Yr2 = 0.0. Therefore.Y/ + c = o found above. In general.. that an equation f(~) = 0 and a scalar multiple cf(~) = 0 of the same equation give identical loci. .Yr+l interchanges p and r .(c -')Y2 + 1 = 0. where 0 -< p -< r -< n. +yp -Yp+ . and the new figure is said to be equivalent to the old one under the given affine transformation.j CZi gives a similar result z 2 2 ..0 -x 2 -l + 1 = 0 ±(x2 + l) = 0 x 2 _ y2 = 0 =2 2 no locus ±x hyperbola x2 circle -x 2 one point two intersecting"lines r = 1 + Y = 0 parabola + 1 = 0 no locus +1=0 two parallel lines x 2 = 0 one line. 2 + 1 = O. Observe first. So do the canonical functions x 2 + Y and _x 2 . In (44) distinct forms represent affinely inequivalent loci. and so do x 2 + Y and -x 2 + y. any hyperquadric is equivalent under the affine group to a locus given by an equation of one of the following forms: (44) 2 2 2 2 0 Y. in the plane the possible types of loci with r > 0 are: r x2 + l + 1 = 0 x 2 . 2 . When c #:. when c > 0. however. but in (45) the transformation Yr+l ~ .. 2 + ..§9. + . this device can always be applied to change the constant c which appears in (43) to 1 or O. with r < n in the case of (45). while for c < 0 the transformation Yi =.z/ + 1 = 0. this may be reduced by an affine transformation y. + . Clearly. the results found above for the classification of quadratic functions under equivalence will yield a similar classification of the corresponding figures.2 . 
the figure consisting of no points at all).

An affine transformation applied to the equation of such a surface amounts simply to applying the same transformation to the points of the figure, and the new figure is then said to be equivalent to the old one under the given affine transformation. A substitution yᵢ = kzᵢ with a suitable scalar k ≠ 0 changes the coefficient of yᵢ² by the positive factor k², so each nonzero coefficient can be made ±1; besides (44) and (45), one obtains in this way the forms

(46)  y₁² + ⋯ + y_p² − y_{p+1}² − ⋯ − y_r² + y_{r+1} = 0   (r < n).

Exercises

1. Classify under the Euclidean group the forms
(a) 4xz + 4y² + 8y + 8,
(b) 9x² − 4xy + 6y² + 3z² + 2√5 x + 4√5 y + 12z + 16.
2. Classify under the affine group the forms
(a) x² + 4y² + 9z² + 4xy + 6xz + 12yz + 8x + 16y + 24z + 15,
(b) x² − 6xy + 10y² + 2xz − 20z² − 10yz − 40z − 17,
(c) x² + 4z² + 4xz + 4x + 4z − 6y + 6,
(d) −2x² − 3y² − 7z² + 2xy − 8yz − 6xz − 4x − 6y − 14z − 6.
3. In a quadratic function XAXᵀ + BXᵀ + c with a nonsingular matrix A, prove that the linear terms may be removed by a translation.
4. (a) Show that a nontrivial real quadric XAXᵀ = 1 is a surface of revolution if and only if A has a double eigenvalue.
(b) Describe the quadric xy + yz + zx = 3.
5. Generalize the affine classification of quadratic functions given in Theorem 23 to functions with coefficients in any field in which 1 + 1 ≠ 0.
6. (a) List the possible affine types of quadric surfaces in three-space.
(b) Give a brief geometric description of each type.
7. Classify (a) ellipses, (b) parabolas, and (c) hyperbolas under the extended similarity group (§9.4, end). Find complete sets of numerical invariants in each case.
*8. Classify quadric hypersurfaces in n-dimensional Euclidean space under the group of rigid motions (use Theorem 22).
9. Find a hexagon of maximum area inscribed in the ellipse x² + 3y² = 3.

*9.12. Unitary and Hermitian Matrices

For the complex numbers the orthogonal transformations of real quadratic forms are replaced by "unitary" transformations of certain "hermitian" forms.
A single complex number c = a + ib is defined as a pair of real numbers (a, b) or a vector with components (a, b) in twodimensional real space R2. The norm or absolute value 1c 1 of the complex number is just the length of the real vector (47) 1c 12 = la + ib 12 = a 2 + b 2 = (a + ib)(a - ib) = cc*, where c* denotes the complex conjugate a - ib. On the same grounds, a vector y with n complex components (Cl>···, cn), each of the form Cj = aj + ibj> may be considered as a vector with 2n components (aJ, bl>· .. ,am bn ) in a real space of twice the dimensions. The length of §9.12 301 Unitary and Hermitian Matrices this real vector is given by the square root of + b 12) + ... + (a n 2 + bn 2 ) I(CI, ... ,cn )1 2 = (a1 2 n (48) = L (aj + ibj)(aj - ibj ) j=1 = Clef + ... + cnc!. Since each product clej = a/ + b/ :> 0, this expression has the crucial n property of positive definiteness: the real sum L Cjcj is positive unless all j= I Cj = O. In this respect (48) resembles the usual Pythagorean formula for the length of a real vector. We adopt (48) as the definition of the length of the complex row vector K = (CI, ... ,cn ). The formula L Cjcj may be written in matrix notation as KK*T, where K* is the vector obtained by forming the conjugate of each component of K. en, Definition. In the complex vector space let g and 1/ be vectors with coordinates X = (XI,· .. ,xn ) and Y = (YI,· .. ,Yn), and introduce an inner product (49) Much as in the case of the ordinary inner product, one may then prove the basic properties + d1/,{) Linean'ty: (cg Skew-symmetry: (g,1/) Positiveness: If g ¥- 0, = c(g,{) + d(1/,?). = (1/, g)*. (g, g) is real and (g, g) >0. The skew-symmetry clearly implies a skew-linearity in the second factor: (g, C1/ + d{) = = + d{, g)* = c*( 1/, g)* + d*({, g)* c*(g, 1/) + d*(g, {), (c1/ so that (50) (g, C1/ + d{) = c*(g, 1/) + d*(g, {). 
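The three defining properties, and the skew-linearity (50) derived from them, can be checked numerically. The following sketch uses assumed random sample vectors in C³; the function `inner` is a hypothetical name for the inner product (49).

```python
import numpy as np

# The inner product (49) on C^n: (xi, eta) = x1 y1* + ... + xn yn*.
def inner(x, y):
    return x @ y.conj()

rng = np.random.default_rng(2)
x = rng.normal(size=3) + 1j * rng.normal(size=3)
y = rng.normal(size=3) + 1j * rng.normal(size=3)
z = rng.normal(size=3) + 1j * rng.normal(size=3)
c, d = 2 - 1j, 0.5 + 3j

assert np.isclose(inner(c*x + d*y, z), c*inner(x, z) + d*inner(y, z))  # linearity
assert np.isclose(inner(x, y), np.conj(inner(y, x)))                   # skew-symmetry
assert inner(x, x).real > 0 and np.isclose(inner(x, x).imag, 0.0)      # positiveness
assert np.isclose(inner(x, c*y + d*z),
                  np.conj(c)*inner(x, y) + np.conj(d)*inner(x, z))     # formula (50)
```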
If desired, one may adopt the properties of linearity, skew-symmetry, and positiveness as postulates for an inner product (g, 1/) in an abstract vector space over the complex field; the space is then called a unitary space (compare the Euclidean vector spaces of §7.1O). Ch. 9 302 Linear Groups Two vectors g and 1] are orthogonal (g .1 1]) if (g, 1]) = O. By the skew-symmetry, g .1 1] implies 1] .1 g. A set of n vectors aI,· .. ,an in the (n-dimensional) space is a normal unitary basis of the space if each vector has length one and if any two are orthogonal: (51) (a· a·) = " } 0 (i ¥ j). Such a set is necessarily a basis in the ordinary sense. The original basis vectors El = (1,0,· .. ,0), ... ,En = (0,· .. ,0, 1) do form such a basis. By the methods of §7.1l, one may construct other such bases and prove Theorem 24. Any set of m < n mutually orthogonal vectors of length one of a unitary space forms part of a normal unitary basis of the space. In particular, if a I, . . . ,am are orthogonal nonzero vectors, and Ci = (g, ai)/(aj, ai), then a m+l = g - Clal - ... - cma m is orthogonal to aI, . . . ,am, for any g. An n x n matrix U = I Uij I of complex numbers is called unitary if UU*T = I, where U* denotes the matrix found by taking the conjugate of each entry of U. This is clearly equivalent to the condition that L UikUjk * = Oij' where Oij is the Kronecker delta (§9.4); in other words, k that each row of U has length one, and any two rows of U are orthogonal. This means that the linear transformation of en defined by U carries EI,' .. ,En into a normal unitary basis. It is also equivalent, by Theorem 9, Corollary 6, of §8.6, to the condition U*TU = I, which states that each column of U has length one, and that any two columns are orthogonal. An arbitrary linear transformation X ~ XA of en carries the inner product Xy*T into XAA *Ty*T. This new product is again equal to Xy*T = Xly*T for all vectors X and Y if and only if AA *T = I, i.e., if and only if A is unitary. 
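The criterion AA*ᵀ = I is easy to check numerically. The following sketch uses an assumed 2 × 2 unitary matrix, not one from the text, and verifies that the corresponding transformation X ↦ XU preserves complex inner products and lengths.

```python
import numpy as np

# A sample unitary matrix: U U*^T = I.  (An assumed example.)
U = (1 / np.sqrt(2)) * np.array([[1, 1j],
                                 [1j, 1]])
assert np.allclose(U @ U.conj().T, np.eye(2))

# The transformation X -> X U preserves the complex inner product X Y*^T:
rng = np.random.default_rng(0)
X = rng.normal(size=2) + 1j * rng.normal(size=2)
Y = rng.normal(size=2) + 1j * rng.normal(size=2)
assert np.isclose((X @ U) @ (Y @ U).conj().T, X @ Y.conj().T)

# ... and hence lengths (X X*^T)^(1/2) as well:
assert np.isclose(np.linalg.norm(X @ U), np.linalg.norm(X))
```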
Thus a matrix A is unitary if and only if the corresponding linear transformation TA preserves complex inner products Xy*T. A similar argument shows that A is unitary if and only if TA preserves lengths (XX*T)I/2. Geometrically, a linear transformation T of a unitary space is said to be unitary if T preserves lengths, IgT I = IgI, and hence inner products. The set of all unitary transformations of n -space is thus a group, isomorphic to the group of all n x n unitary matrices. Quadratic forms are now replaced by "hermitian" forms, of which the simplest example is the formula L XiX~ for the length. In general, a hermitian form is an expression with complex coefficients h ij n (52) L x;h ijx1 = XHX*T, i,j=l H = Ilhijll, §9.12 303 Unitary and Hermitian Matrices in which the coefficient matrix H has the property H*T = H. A matrix H of this sort is called hermitian; in the special case when the terms h ij are real, the hermitian matrix is symmetric. The form (52) may be considered as a function h(t) = XHX*T of the vector t with coordinates Xl>' •• ,Xn relative to some basis. The value XHX*T of this function is always a real number. To prove this, it suffices to show that this number is equal to its conjugate (or, equally well, to its conjugate transpose). But since H is hermitian, as asserted. A unitary transformation Y hermitian form, yields = XU, X = YU- l = yU*T, applied to a The coefficient matrix U-lHU is still hermitian, Exactly the same effect on the form may be had by changing to a new normal unitary coordinate system, · for such a change will give new coordinates Y for t related to the old coordinates by an equation Y = XU with a unitary matrix U. Using this interpretation of the substitution, one may transform any hermitian form to principal axes. The new axes are chosen by successive maximum properties exactly as in the discussion of the principal axes of a quadratic form under orthogonal transformations. 
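Two of the claims just made — that a hermitian matrix gives a real-valued form, and (anticipating Theorems 25 and 26 below) that it can be brought to real diagonal form by a unitary matrix — can be checked numerically on an assumed 2 × 2 example; the library eigensolver stands in for the maximum construction described here.

```python
import numpy as np

# An assumed 2x2 hermitian matrix H = H*^T (not an example from the text).
H = np.array([[2.0, 1 - 1j],
              [1 + 1j, 3.0]])
assert np.allclose(H, H.conj().T)

# The value X H X*^T of the hermitian form is always real:
rng = np.random.default_rng(1)
X = rng.normal(size=2) + 1j * rng.normal(size=2)
value = X @ H @ X.conj().T
assert np.isclose(value.imag, 0.0)

# A unitary U with U^{-1} H U = U*^T H U real and diagonal:
lams, U = np.linalg.eigh(H)
assert np.allclose(U.conj().T @ U, np.eye(2))
assert np.allclose(lams, [1.0, 4.0])            # real diagonal coefficients
assert np.allclose(U.conj().T @ H @ U, np.diag(lams))
```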
The first axis al is chosen as a vector of length one which makes h (t) a maximum among all t with It I = 1; one may then find a normal unitary basis involving a 1 by Theorem 24. Relative to this basis the cross product terms XlXj for j ¥ 1 again drop out. Since the values of the form are all real, the successive maxima A/ are real numbers. This process proves the following Principal Axis Theorem. Theorem 25. Any hermitian form XHX*T can be reduced to real diagonal form, (53) by a unitary transformation Y = XU. Ch. 9 304 Linear Groups This theorem may be translated into an assertion about the matrix H of the given form, as follows: Theorem 26. For each hermitian matrix H there exists a unitary matrix U such that U-1HU = u*THU;S a real diagonal matrix. The methods of Chap. 10 will again prove the diagonal coefficients Ai of (53) unique. Exercises 1. Which of the following matrices are unitary or hermitian? (~ ~) . (1 + i)/2 (1 - i)/2), ( (1 - i)/2 (1 + i)/2 2. Find a normal unitary basis for the subspace of vectors orthogonal to (1/2, i/2, (1 + i)/2). 3. Prove that II hij II is hermitian if and only if h t == hji for all i and j. 4. Show that if w is a primitive nth root of unity, then n-1 /2l1wij II is unitary, for i, j = 1,· .. , n. . ( cosh i sinh 5. Show that the complex matriX ., is unitary for any real 8. - I smh 8 cosh 8 Compute its eigenvalues and eigenvectors. 6. Show that all n x n unitary matrices form a group (the unitary group) which is isomorphic with a subgroup of the group of all 2n x 2n real orthogonal matrices. 7. Prove the linearity, skew-symmetry, and positiveness properties of the hermitian inner product (~, T/). 8. Give a detailed proof of Theorem 24 on normal unitary bases. 9. Show that a monomial matrix is unitary if and only if all its nonzero entries have absolute value one. 10. Prove a lemma like that of § 9.10 for a hermitian form in two variables with a maximum at x == 0, y = 1. 
(Hint: Split each variable into its real and imaginary parts.) *11. Give a detailed proof of the principal axis theorem for hermitian forms. 12. Reduce the form xy* + x*y to diagonal form by a unitary transformation of x and y. (Hint: Consider the corresponding real quadratic form.) 13. Reduce zz* - 2ww* + 2i(zw* - wz*) to diagonal form under the unitary group. 14. Show that any real skew-symmetrix matrix A has a basis of complex eigenvectors with pure imaginary characteristic values. (Hint: Show iA is hermitian.) 8 8) 305 §9.13 Affine Geometry 15. Show that the spectrum of any unitary matrix lies on the unit circle in the complex plane. 16. Show that a complex matrix C is positive definite and hermitian if and only if C = pp*T for some nonsingular P. 17. Show that a hermitian matrix is positive 'definite if and only if all its eigenvalues are positive. *9.13. Affine Geometry Affine geometry is the study of properties of figures invariant under the affine group, just as Euclidean geometry treats properties invariant under the Euclidean group. The affine group, acting on a finitedimensional vector. space V, consists as in (11) of the transformations H of V which carry a point (vector) g of V into the point (54) here K is a fixed vector, and T a fixed nonsingular linear transformation of V. We assume that V is a vector space over -a field F in which 1 + 1 -:j:. 0 (e.g., F is not the field Z2). In affine geometry, just as in Euclidean geometry, any two points a and {3 are equivalent, for the translation g ~ g + ({3 - a) carries a into {3. This distinguishes affine geometry from the vector geometry of V (under the full linear group), where the origin 0 plays a special role as the 0 of V. When considering properties preserved under the affine group, one usually refers to vector spaces as affine spaces. 
In plane analytic geometry, the line joining the two points (Xi> Yl) and (X2, Y2) has the equation Y - Yl = Y2 - Yl( X X2 - Xl - ) Xl , Introduce the parameter t = (x - Xl)/(X2 - Xl); then one obtains Y = Yl + t(Y2 - Yl) and X - Xl = t(X2 - Xl); in other words, the line has the parametric equations (55) X = (1 - t)Xl + tX2, Y = (1 - t)Yl + tY2, which may be written in vector form as (x, y) = g = (1 - t)gl + t6. Geometrically, the point (x, y) of (55) is the point dividing the line segment from (XI. Yl) to (X2, Y2) in the ratio t: (1 - t). For t = t this point is the midpoint. Ch. 9 306 Linear Groups In any affine space, the point dividing the "segment" from a to the ratio t: (1 - t) is defined to be the point (56) I' = (1 - t)a and the (affine) line af3 joining a to of all such points for t in F. 13 in + tf3, 13, for a f:. 13, is defined to be the set Theorem 27. Any nonsingular affine transformation carries lines into lines. Proof. I'H By substituting (54) in (56), we have = = (1 - t)aT + tf3T + K (1 - t)(aT + K) + t(f3T + K) = (1 - t)(aH) I'T +K = + t(f3H). Hence H carries the affine line af3 through a and 13 into the affine line through aH and f3H. Q.E.D. If I' = (1 - t)a + tf3 and 0 = (1 - u)a + uf3 are any two distinct points of af3, then, since (1 - v)y + vo = (1 - t + vt - vu)a + (t - vt + vu)f3, af3 contains every point of 1'0. The converse may be proved similarly, whence af3 = 1'0. That is, a straight line is determined by any two of its points. An ordinary plane is sometimes characterized by the property of flatness: it contains with any two points the entire straight line through these points. We may use this property to define an affine subspace of V as any subset M of V with the property that when a and 13 are in M, then the entire line af3 lies in M. Clearly, an affine transformation maps affine subs paces onto affine subspaces. 
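The division formula is easy to experiment with numerically. The following sketch, with assumed sample points and an assumed affine map, also checks that a map X ↦ XT + K carries the point dividing a segment in ratio t : (1 − t) to the point dividing the image segment in the same ratio — the computation behind Theorem 27 below.

```python
import numpy as np

# Assumed endpoints (x1, y1) and (x2, y2) of a segment.
p1 = np.array([1.0, 2.0])
p2 = np.array([5.0, 4.0])

def divide(p, q, t):               # the point dividing pq in ratio t : (1 - t)
    return (1 - t) * p + t * q

assert np.allclose(divide(p1, p2, 0.5), [3.0, 3.0])   # t = 1/2: the midpoint

# An affine map H: X -> X T + K preserves division ratios:
T = np.array([[2.0, 1.0],
              [0.0, 1.0]])
K = np.array([-1.0, 3.0])
H = lambda X: X @ T + K
t = 0.3
assert np.allclose(H(divide(p1, p2, t)), divide(H(p1), H(p2), t))
```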
Furthermore, the affine subspaces of V are exactly the subs paces obtained by translating vector subspaces of V, in the following sense. Theorem 28. If M is any affine subspace of V, then there is a linear subspace S of Vand a vector K such that M consists of all points g + K for g in S. Conversely, any Sand K determine in this wayan affine subspace M =S+ K. Proof. Let K be any point in M, and define S to be the set of all vectors a - K for a in M; in other words, S is obtained by translating M by -K. Clearly, M has the required form in terms of Sand K; it remains §9.13 Affine Geometry 307 only to prove that S is a vector subspace. Since straight lines translate into straight lines, the hypotheses on M insure a like property for S: the line joining any two vectors of S lies in S. For any a in S, the line joining o (in S) to a lies in S, which therefore contains all scalar multiples ca. If S contains a and /3, it contains 2/3 and 2a and all the line g = 2a + t(2/3 - 2a) joining them. (Draw a figure!) In particular, for t = 1/2, it contains g = 2a + (/3 - a) = /3 + a, the sum of the given vectors. Thus, we have demonstrated that S is closed under sum and scalar product, hence is a vector subspace, as desired. Q.E.D. The case F = Z2 is a genuine exception: the triple of vectors (0,0), (1,0), (0, 1) is a "flat" which contains with any two points a and /3, all (1 - t)a + t/3; yet this triple is not an affine subspace. The converse assertion is readily established; it asserts in other words that an affine subspace is just a coset of a vector subspace in the additive group of vectors. In particular, an affine line is a coset (under translation) of a one-dimensional vector subspace. The preceding results involve another concept of affine geometry: that of parallelism. Definition. Two subsets Sand S* of an affine space V are called parallel if and only if there exists a translation L: g ~ g + A of V which maps S onto S*. Theorem 29. 
Any affine transformation of V carries parallel sets into parallel sets.

Proof. Let S and S* = S + λ be the given parallel sets; let U and U* be their transforms under H: ξ ↦ ξT + κ. The theorem asserts that U* is the set of all η + μ for variable η ∈ U and some fixed translation vector μ. By definition, U* is the set of all (σ + λ)T + κ = (σT + κ) + λT for σ ∈ S, and U is the set of all η = σT + κ for σ ∈ S. Setting μ = λT, the conclusion is now obvious. Q.E.D.

Equivalence under the affine group over the real field R has a number of interesting elementary geometrical applications.

Under the affine group any two triangles are equivalent. To prove this, it suffices to show that any triangle αβγ is equivalent to the particular equilateral triangle with vertices at 0 = (0, 0), β₀ = (2, 0), and γ₀ = (1, √3) (see Figure 2). By a translation, the vertex α may be moved to the origin 0; the other vertices then take up positions β′ and γ′. Since these vectors β′ and γ′ are linearly independent, there then exists a linear transformation xβ′ + yγ′ ↦ xβ₀ + yγ₀ carrying β′ into β₀ and γ′ into γ₀. The product of the translation by this linear transformation will carry αβγ into 0β₀γ₀, as desired; hence the two triangles are equivalent.

Thus every triangle is equivalent to an equilateral triangle. But in the latter, the three medians must by symmetry meet in one point (the center of gravity). An affine transformation, however, carries midpoints into midpoints and hence medians into medians. This proves the elementary theorem that the medians of any triangle meet in a point. Again, one may prove very easily that the point of intersection divides the medians of an equilateral triangle in the ratio 1:2; hence the same property holds for any triangle.

Moreover, any ellipse is affine equivalent to a circle. But any diameter through the center of a circle has parallel tangents at opposite extremities; furthermore, the conjugate diameter which is parallel to these tangents bisects all chords parallel to the given diameter. It follows that the same two properties hold for any ellipse, for an affine transformation leaves parallel lines parallel and carries tangents into tangents (but observe that conjugate diameters in an ellipse need not be orthogonal, Figure 3).

Appendix. Centroids and Barycentric Coordinates. The point (56) dividing a line segment in a given ratio is a special case of the notion of a centroid. Given m + 1 points α₀, ⋯, αₘ in V and m + 1 elements x₀, ⋯, xₘ in F such that x₀ + ⋯ + xₘ = 1, the centroid of the points α₀, ⋯, αₘ with the weights x₀, ⋯, xₘ is defined to be the point

(57)  ξ = x₀α₀ + ⋯ + xₘαₘ,   x₀ + ⋯ + xₘ = 1.

(More generally, whenever w = w₀ + ⋯ + wₘ ≠ 0, the "centroid" of the points α₀, ⋯, αₘ with weights w₀, ⋯, wₘ is defined by (57), where xᵢ = wᵢ/w.)

If H is any affine transformation (54), then

ξH = (x₀α₀ + ⋯ + xₘαₘ)T + κ = x₀(α₀T) + ⋯ + xₘ(αₘT) + (Σxⱼ)κ = x₀(α₀H) + ⋯ + xₘ(αₘH).

In other words, an affine transformation carries centroids to centroids with the same weights.

Theorem 30. An affine subspace M contains all centroids of its points.

The proof is by induction on the number m + 1 of points in (57). If m = 0, the result is immediate; and if m = 1, a centroid of α₀ and α₁ is just a point on the line through α₀ and α₁, hence lies in M by definition. Assume m > 1, and consider ξ as in (57). Then some coefficient xⱼ, say xₘ, is not equal to 1. Set t = x₀ + ⋯ + x_{m−1}; then xₘ = 1 − t, t ≠ 0, and the point β = (x₀/t)α₀ + ⋯ + (x_{m−1}/t)α_{m−1} is a centroid of α₀, ⋯, α_{m−1} and lies in M by the induction assumption. Furthermore, ξ = tβ + (1 − t)αₘ is on the line joining β ∈ M to αₘ ∈ M, hence ξ is in M, as asserted.
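Several of the facts above — the concurrence of the medians at the centroid with equal weights 1/3, the 1 : 2 division ratio, and the behavior of centroids under an affine map — can be checked numerically on an assumed triangle:

```python
import numpy as np

# An assumed triangle with vertices alpha, beta, gamma.
a = np.array([0.0, 0.0])
b = np.array([4.0, 1.0])
g = np.array([1.0, 3.0])

def divide(p, q, t):               # the point dividing pq in ratio t : (1 - t)
    return (1 - t) * p + t * q

# The point two-thirds of the way along each median (vertex to opposite
# midpoint) is the same point: the centroid with equal weights 1/3.
m_a = divide(a, divide(b, g, 0.5), 2/3)
m_b = divide(b, divide(a, g, 0.5), 2/3)
m_g = divide(g, divide(a, b, 0.5), 2/3)
centroid = (a + b + g) / 3
assert np.allclose(m_a, centroid) and np.allclose(m_b, centroid) and np.allclose(m_g, centroid)

# An affine map H carries this centroid to the centroid of the image triangle:
T = np.array([[2.0, 1.0],
              [0.0, 1.0]])
K = np.array([-1.0, 3.0])
H = lambda X: X @ T + K
assert np.allclose(H(centroid), (H(a) + H(b) + H(g)) / 3)
```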
Centroids may be used to describe the subspace M spanned by a given set of points ao, ... , am, as follows. Theorem 31. The set of all centroids (57) of m + 1 points ao, ... , am of V is an affine subspace M. This subspace M contains each aj and · is contained in any affine subspace N containing all of ao, ... ,am. Proof. Let the g of (57) and (57') Yo + ... + Ym = 1 be any two centroids. Then (1 - t)g + tT] = [(1 - t)xo + tYo]ao + ... + [(1 - t)xm + tYm]a m is also a centroid of ao,···, am, since the sum of the coefficients (1 - t)Xj + tYj is 1. Hence M is indeed an affine subspace. That it contains each aj is clear. On the other hand, any affine subspace N containing all the aj must, by Theorem 30, contain all of M. Q.E.D. The m + 1 points ao, ... ,am are called affinely independent if the m vectors al - ao, ... ,am - ao are linearly independent. For an affine transformation H, one has (a; - ao)T = ajH - aoH; hence a nonsingu- Theorem 32.x. .' . Suppose that the points ai are independent.ao) = 0 with some coefficient.E. Exercises 1. am to be affinely dependent. the scalars Xo."'. am has a unique representation as a centroid (57) of the ai' Proof. . in virtue of Xo + .. When the points ao. am are called the barycentric coordinates of ~ relative to ao. .x.ao). For each of the following pairs of points. both with LXi = 1 = L x. find the space S I)' .. . In this definition of affine independence.(x~ ..' . . But al has a second such representation as al = 1· al. the initial point ao plays a special role. Then al = -C2a2 .(XI + .' ai as a centroid. .}ai . am are affinely independent if and only if every point ~ in the affine subspace M spanned by ao. + (xm .. but that some point ~ in M has two representations ~ = L Xiaj. find the parametric equations of the line joining the two points and represent the line in the form SI + A (i.. suppose the points ao. The representation of ~ as a centroid is thus unique. Q.. 
The following result will show that affine independence does not depend on the choice of an initial point.'.e.. Secondly. + xm). Since Xo = 1 . . ... The m + 1 points ao. = m L (Xi . j =I Since the vectors ai .. Then xo' .Ch..XI') and the zero vector 0 = 0 has a representation m 0= L (Xi .')(aj . + Cm + l)ao. . we also have Xo = xo'. . not zero. m. am are affinely independent.. ~ = LX.xo)ao i=1 m = L (X... .xDaj .. we conclude that Xi = X. for i = 1. say CJ. Xm appearing in the representation (57) of points in the space spanned by ao. ... a representation of al in which the sum of the coefficients is 1. am. hence the representation as a centroid is not unique..ao are linearly independent.. By division we can assume CI = 1.cma m + (C2 + . There is then a linear relation L Cj(ai .. Note that any m of these coordinates determines the remaining coordinate.D. 9 310 Linear Groups lar affine transformation carries affinely independent points into affinely independent points.· ..Xo = (XI ..x m').-0 + .'.. + Xm = 1. . 0).0).2. prove that every hyperplane has such an equation. (b) Conversely. and 1 + 1 + 1 . 2. find a basis for the parallel plane through the origin. 3). 4.2) and (-1. (1... 9. If an affine subspace M is spanned by m + 1 affinely independent points ao.1.1).4) and (4. 1.3. Show that every parallelogram is affine equivalent to a square. .0.. 0).3). .. (0. (b) (1. (1. ° ° . (1.'"5).t}aT + t({3T). (c) (1. then the medians of any triangle meet in a point. Prove: Through three vectors a. 18.1. Prove.13 311 Affine Geometry (a) (2. 4).3). (c) Find the equation of the hyperplane through (1. Draw a figure. and (3. In each part of Ex. (-1. assuming only the relevant definitions. that any affine transformation carries midpoints into midpoints. .3. Prove that the four diagonals of any parallelepiped have a common midpoint (it is the center of gravity). 16. .0).3.0) into the equilateral triangle with vertices (1. if the first triangle has vertices (1.7. 13. 
Find the parametric equations (in the form of Ex. Prove that the vectors in this plane have the form ~ = a + S (f3 . (b) The same problem.. with four different choices of x. 4. 1). (0. and (1. {3.3) and (4. (c) (2.0.1). 1..1. .am. prove that M is parallel to an m-dimensional vector subspace. Prove by affine methods that in a trapezoid the two diagonals and the line joining the midpoints of the parallel sides go through a point.. 0. 3) for the plane (if any) through each of the following triples of points: (a) (1.1. 12. Represent the line through (1. (a) Find an affine transformation of R2 which will take the triangle with vertices (0. and let {3o.2). 8.0.0). 1).3. .f:. provided the coefficients ai are not all zero. . (1. (a) Show that over any field F any two triangles are equivalent under the affine group. Give an affine proof that the diagonals of a parallelogram always bisect each other. Let ao. By definition. 5. in F. {3n be any n + 1 points in V. into {3. Prove that there is one and only one affine transformation of V carrying each a. *(b) Show that if 1 + 1 . *6. 0. (a) Prove that the set of all vectors ~ whose coordinates satisfy a linear equation a\x\ + .2.1. 0).1) and (5.1. 3. Prove that any parallelepiped is affine equivalent to a cube.an be n + 1 affinely independent points of an n-dimensional vector space V. 10.0. 15. 1). 'Y not on a line there passes one and only one two-dimensional affine subspace (a plane!).t)a + t{3 always implies 'YT = (1 .. Prove that Theorem 28 is valid over every field except Z2' 7. ' . (4. 1. -1).. -1. (3. a hyperplane is F' in an affine subspace of dimension n .0). (2. (0.a) for variables sand t.1). 11.. (0.2) in the form S + A.2). 17. 0.a) + t( 'Y . (0.§9.f:. Show that a one-one transformation T of a vector space V is affine if and only if 'Y = (1 . + anxn = c is a hyperplane.0). (b) (1.1).1). 14. ." plus a minor change in terminology. We call these triples~ . . 9 linear Groups 312 19. 
9.14. Projective Geometry

In the real affine plane, any two distinct points lie on a unique line, and any two nonparallel lines "intersect" in a unique point. We shall now construct a real projective plane, in which

(i) Any two distinct points lie on a unique line.
(ii) Any two distinct lines intersect in a unique point.

The incidence properties (i) and (ii) are clearly dual to each other, in the sense that the interchange of the words "point" and "line," plus a minor change in terminology, changes property (i) into property (ii) and vice versa.

One way to construct the real projective plane P2 = P2(R) is as follows. Take a three-dimensional vector space V3 over the field R of real numbers, and call a one-dimensional vector (not affine) subspace S of V3 a "point" of P2, and a two-dimensional subspace L of V3 a "line" of P2. Furthermore, say that the point S lies on the line L if and only if the subspace S is contained in the subspace L.

We prove that the "points" and "lines" of P2(R) satisfy (i) and (ii). If the points S1 and S2 are the one-dimensional subspaces spanned by the vectors a1 and a2, then S1 and S2 are distinct if and only if a1 and a2 are linearly independent. The unique line L in which both S1 and S2 lie is then, clearly, the two-dimensional vector subspace spanned by a1 and a2; this proves (i). Secondly, if the lines (two-dimensional subspaces) L1 and L2 are distinct, their linear sum L1 + L2 must have a higher dimension and is then the whole three-dimensional space V3. Therefore, by Theorem 17, the intersection L1 and L2 have dimension 2 + 2 - 3 = 1, so that the one-dimensional subspace which is their intersection is the unique point lying on both L1 and L2. This proves (ii).

To obtain suitable projective coordinates in P2 = P2(R), take V3 to be the space R3 of triples (x1, x2, x3) of real numbers. Then each nonzero triple (x1, x2, x3) determines a point S of P2, and the triples (x1, x2, x3) and (cx1, cx2, cx3) determine the same point S if c is not 0. We call these triples homogeneous coordinates of the point S.

Any homogeneous coordinates (x1, x2, x3) of a point S can be normalized, by multiplication with (x1^2 + x2^2 + x3^2)^(-1/2), so that the new coordinates (y1, y2, y3) satisfy y1^2 + y2^2 + y3^2 = 1 and lie on the unit sphere; the two antipodal points (y1, y2, y3) and (-y1, -y2, -y3) on this sphere determine the same point of P2. The real projective plane thus has a very simple geometrical representation: the points of P2 may be obtained by identifying diametrically opposite points on the unit sphere. Since any two-dimensional vector subspace L of V3 cuts the unit sphere in a great circle, we may say that a line of P2 consists of the pairs of antipodal points on a great circle of the unit sphere. It is thus again clear that two projective lines (two great circles) intersect in one projective point (one pair of antipodal points on the sphere).

Since any two-dimensional subspace L of V3 may be described as the set of vector solutions of a single homogeneous linear equation, a line L of P2 is the locus of points whose homogeneous coordinates satisfy an equation

(58)    a1 x1 + a2 x2 + a3 x3 = 0.

We may call (a1, a2, a3) homogeneous coordinates of the line L; clearly, the coordinates (a1, a2, a3) and (ca1, ca2, ca3), with c not 0, determine the same line.

Each one-dimensional vector subspace (cx1, cx2, cx3), with x3 not 0, intersects the affine plane x3 = 1 in exactly one point (x1/x3, x2/x3, 1); the ratios (x1/x3, x2/x3) are called the nonhomogeneous coordinates of the projective point (cx1, cx2, cx3). But the locus x3 = 0 is a projective line, called the "line at infinity." It may be verified that each line of the projective plane P2 is either the line at infinity (if a1 = a2 = 0) or a line a1(x1/x3) + a2(x2/x3) + a3 = 0 of the affine plane, plus one point (a2, -a1, 0) on the line at infinity.

A "projective plane" P2(F) can be defined in just the same way over any field F. More generally, an n-dimensional projective space P can be constructed over any field F. The essential step is to start with a vector space V = F^(n+1) of one greater dimension. Then P = Pn(F) is described as follows: a point of P is a one-dimensional subspace S of V; an m-dimensional subspace of P is the set of all points S of P lying in some (m + 1)-dimensional vector subspace L of V. Clearly, each such subspace is itself isomorphic to the m-dimensional projective space Pm determined in the same fashion by the (m + 1)-dimensional vector space L.

If V is represented (say by coordinates relative to a given basis) as the space of (n + 1)-tuples of elements of F, then each point S of Pn can be given n + 1 homogeneous coordinates (x1, ..., xn+1), not all zero; the coordinates (x1, ..., xn+1) and (cx1, ..., cxn+1), with c not 0, determine the same point. A hyperplane (a subspace of dimension n - 1) in P = Pn(F) is again the locus given by a single homogeneous equation

(59)    a1 x1 + ... + an+1 xn+1 = 0,    (a1, ..., an+1) not all zero.

The numbers (a1, ..., an+1) may be regarded as the homogeneous coordinates of the hyperplane. The relations between the projective space P and the dual projective space whose points are the hyperplanes of P are exactly the same as the relations between the vector space V and the dual space V*. By Theorem 13 in §7.7 about the dimension of the set of solutions of homogeneous linear equations, it follows that a set of r linearly independent equations such as (59) determines a projective subspace of dimension n - r.

Now let T: V -> V be a nonsingular linear transformation. We know (§8, Theorem 10, Corollary 2) that T carries each one-dimensional subspace S of V into a one-dimensional subspace S* of V. Hence T induces a transformation S -> S* = ST* of the points of the projective space P, and this transformation T* carries projective subspaces into projective subspaces, with preservation of the dimension. We call T* a projective transformation of P. If T1 and T2 are two such linear transformations of V, the product T1 T2 induces a transformation (T1 T2)* on P which is the product of the induced transformations. Hence the set of all projective transformations constitutes a group, the n-dimensional projective group, and the correspondence T -> T* is a homomorphism of the full linear group in (n + 1) dimensions onto the projective group in n dimensions over the field F.

Relative to a given system of coordinates in V, the linear transformation T is determined by a nonsingular (n + 1) x (n + 1) matrix ||aij||. The transformation T* then carries the point with homogeneous coordinates (x1, ..., xn+1) into the point with homogeneous coordinates (y1, ..., yn+1) given by

(60)    yj = SUM(i=1..n+1) xi aij    (j = 1, ..., n + 1).
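The identification of proportional coordinate triples, and the passage to nonhomogeneous coordinates, can be sketched with exact rational arithmetic. The sample triples below are arbitrary illustrative choices:

```python
from fractions import Fraction

def affine_part(x):
    """Nonhomogeneous coordinates (x1/x3, x2/x3) of the projective point
    with homogeneous coordinates (x1, x2, x3); None on the line at infinity."""
    x1, x2, x3 = x
    if x3 == 0:
        return None                     # a point on the line at infinity
    return (Fraction(x1, x3), Fraction(x2, x3))

p = (2, 6, 4)
q = (3, 9, 6)    # q = (3/2) * p, so p and q name the same projective point
assert affine_part(p) == affine_part(q) == (Fraction(1, 2), Fraction(3, 2))
assert affine_part((1, -1, 0)) is None  # lies on the line x3 = 0
```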
Theorem 33. The (n + 1) x (n + 1) matrix A determines the identical projective transformation T* of Pn if and only if A is a scalar multiple cI of the identity matrix I.

Proof. If A = cI in (60), then yj = c xj; the homogeneous coordinates (x1, ..., xn+1) and (cx1, ..., cxn+1), with c not 0, determine the same point of P, and T* is indeed the identity. Conversely, suppose that T* is the identity. Then T must carry each of the n + 1 unit vectors Ei into some scalar multiple ci Ei; hence A must be the diagonal matrix with diagonal entries c1, ..., cn+1. But T must also carry the vector (1, 1, ..., 1) into some scalar multiple of itself, while A carries this vector into (c1, ..., cn+1). This is a scalar multiple of (1, ..., 1) if and only if all the ci are equal. Therefore A is indeed a scalar multiple of I. Q.E.D.

It also follows that two matrices A and A' determine the same projective transformation if and only if A' = cA for some scalar c not 0.

Corollary. The projective group in n dimensions over the field F is isomorphic to the quotient-group of the full linear group in n + 1 dimensions by the subgroup of nonzero scalar multiples of the identity.

Proof. The map T -> T* is a homomorphism of the full linear group onto the projective group, and Theorem 33 asserts that the kernel of this homomorphism is precisely the set of scalar multiples of the identity transformation. Hence the result follows by Theorem 28 of §7. Q.E.D.

For the one-dimensional projective line, a projective transformation has the form

(61)    y1 = a x1 + b x2,    y2 = c x1 + d x2,    ad - bc not 0.

In terms of the nonhomogeneous coordinates z = x1/x2 and w = y1/y2, this transformation may be written as a linear fractional substitution

(62)    w = (az + b)/(cz + d),    ad - bc not 0,

obtained by dividing the first equation of (61) by the second. Formula (62) is to be interpreted as follows: if c = 0, then (62) carries the point z = infinity into the point w = infinity; if c is not 0, then (62) carries the point z = infinity into the point w = a/c, and the point z = -d/c into the point w = infinity. The correctness of these symbolic interpretations may be verified by reverting to homogeneous coordinates and using (61). A similar representation (62') of projective transformations by linear fractional substitutions, in terms of the nonhomogeneous coordinates zi = xi/xn+1, is possible in n dimensions.
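Formula (62) can be checked numerically against its homogeneous source (61): composing two linear fractional substitutions corresponds to multiplying their 2 x 2 coefficient matrices. A sketch; the sample matrices are arbitrary:

```python
from fractions import Fraction

def apply(m, z):
    """w = (a z + b)/(c z + d) for the matrix m = ((a, b), (c, d)).
    Assumes z is not the pole -d/c (the finite case of formula (62))."""
    (a, b), (c, d) = m
    return (a * z + b) / (c * z + d)

def matmul(m1, m2):
    (a, b), (c, d) = m1
    (e, f), (g, h) = m2
    return ((a*e + b*g, a*f + b*h), (c*e + d*g, c*f + d*h))

m1 = ((1, 2), (0, 1))    # z -> z + 2
m2 = ((2, 0), (1, 1))    # z -> 2z / (z + 1)
z = Fraction(3)
# composition of substitutions = product of matrices:
assert apply(m1, apply(m2, z)) == apply(matmul(m1, m2), z) == Fraction(7, 2)
```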
We have already seen that projective transformations of Pn(F) carry lines into lines. Conversely, it is a classical result that any one-one transformation of a real projective space Pn(R) which carries lines into lines is projective if n >= 2 (see Ex. 6).

A homogeneous quadratic form in three variables determines a locus

(63)    SUM xi bij xj = 0    (i, j = 1, 2, 3)

in the projective plane. This locus is called a projective conic, and the (projective) rank of the conic is the rank of the matrix B of coefficients. The definition is legitimate, for if the coordinates (x1, x2, x3) satisfy this equation, then any scalar multiple (cx1, cx2, cx3) also satisfies the equation. If the line at infinity is deleted, the projective conic (63) becomes an ordinary conic.

In the real projective plane, any nondegenerate conic (i.e., ellipse, hyperbola, or parabola) is equivalent by §9.9 to one having one of the four equations

(64)     x1^2 + x2^2 + x3^2 = 0,     x1^2 + x2^2 - x3^2 = 0,
(64')   -x1^2 - x2^2 - x3^2 = 0,    -x1^2 - x2^2 + x3^2 = 0.

A change of sign in the whole left-hand side of such an equation does not alter the locus; hence the conics given by (64') are essentially those given by (64). For the first conic of (64) the locus is empty. Hence we conclude that any two nondegenerate conics (with nonvoid locus) are projectively equivalent in the real projective plane.

Exercises

1. In the projective plane over the field Z2, list all points and lines, and the points on each line.
2. In the projective plane over a finite field with n elements, show that there are n^2 + n + 1 points, n^2 + n + 1 lines, and n + 1 points on each line.
3. Generalize Ex. 2 to projective n-space.
4. In the projective three-space over a field F, prove: (a) Any two distinct points lie on one and only one line. (b) Any three points not on a line lie on one and only one plane.
5. The cross-ratio of four distinct numbers z1, z2, z3, z4 is defined as the ratio (z3 - z1)(z4 - z2)/(z3 - z2)(z4 - z1) (with appropriate conventions when one of the zi is infinity). Prove that the cross-ratio is invariant under any linear fractional transformation (62).
6. Show that the transformation (z1, z2, z3) -> (z1*, z2*, z3*) in the complex projective plane carries lines into lines, but is not projective. (Asterisks denote complex conjugates.)
7. Show that, given any two triples of distinct points z1, z2, z3 and w1, w2, w3 in the projective line, there exists a projective transformation (62) which carries each zi into the corresponding wi.
8. What does the projective conic x1^2 = 2 x2 x3 represent in the affine plane if the "line at infinity" x3 = 0 is deleted?
9. Let p1, p2, p3, p4 and q1, q2, q3, q4 be any two quadruples of points in the projective plane, no three points of either quadruple on a line. Show that there exists a projective transformation (62') which carries each pi into the corresponding qi.
10. (a) Show that every nondegenerate real quadric surface is projectively equivalent to a sphere or to a hyperboloid of one sheet. (b) To which of the above is an elliptic paraboloid projectively equivalent? a hyperbolic paraboloid? (c) Show that a sphere is not projectively equivalent to a hyperboloid of one sheet.
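The count n^2 + n + 1 in Exercise 2 can be verified by brute force over small prime fields Zp. A sketch; the construction requires a field, so p is taken prime here:

```python
def projective_points(p):
    """Number of points of the projective plane over Z_p (p prime):
    one-dimensional subspaces of (Z_p)^3, counted as orbits of nonzero
    triples under multiplication by nonzero scalars."""
    seen = set()
    for x in range(p):
        for y in range(p):
            for z in range(p):
                if (x, y, z) == (0, 0, 0):
                    continue
                orbit = frozenset(tuple((c * v) % p for v in (x, y, z))
                                  for c in range(1, p))
                seen.add(orbit)
    return len(seen)

for p in (2, 3, 5):
    assert projective_points(p) == p * p + p + 1
```

For p = 2 this also reproduces the seven points asked for in Exercise 1.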
10
Determinants and Canonical Forms

10.1. Definition and Elementary Properties of Determinants

Over any field, each square matrix A has a determinant. In this chapter we shall define determinants, examine their geometric properties, and show the relation of the characteristic polynomial of a matrix A to its characteristic roots (eigenvalues). These concepts will then be applied to the study of canonical forms for matrices under similarity. Though the determinant can be used in the elementary study of the rank of a matrix and in the solution of simultaneous linear equations, its most essential application in matrix theory is to the definition of the characteristic polynomial of a matrix.

The formulas for the solution of simultaneous linear equations lead naturally to determinants. Two linear equations

    a1 x + b1 y = k1,    a2 x + b2 y = k2

have the unique solution

(1)    x = (k1 b2 - k2 b1)/(a1 b2 - a2 b1),    y = (a1 k2 - a2 k1)/(a1 b2 - a2 b1),

provided a1 b2 - a2 b1 is not 0. The polynomials which appear here in numerator and denominator are known as determinants. Similarly, one may compute the simultaneous solutions of three linear equations SUM aij xj = ki; the denominator of each solution xj turns out to be the same polynomial in the coefficients aij.
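The two-unknown solution formula (1) is easy to check with exact arithmetic. A sketch with arbitrarily chosen coefficients:

```python
from fractions import Fraction

def solve2(a1, b1, k1, a2, b2, k2):
    """Solve a1 x + b1 y = k1, a2 x + b2 y = k2 by formula (1).
    Requires the 2x2 determinant a1 b2 - a2 b1 to be nonzero."""
    d = a1 * b2 - a2 * b1
    return (Fraction(k1 * b2 - k2 * b1, d),
            Fraction(a1 * k2 - a2 * k1, d))

# 2x + 3y = 8 and x - y = -1 have the solution x = 1, y = 2:
x, y = solve2(2, 3, 8, 1, -1, -1)
assert (x, y) == (1, 2)
```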
This common denominator is

(2)    | a11  a12  a13 |
       | a21  a22  a23 |  =  a11 a22 a33 + a12 a23 a31 + a13 a21 a32
       | a31  a32  a33 |     - a11 a23 a32 - a12 a21 a33 - a13 a22 a31.

On the right are six products. Each involves one factor a1i from the first row, one from the second row, and one from the third; each column is also represented in every product. Thus a term of (2) has the form +-a1_ a2_ a3_, with the blanks filled by some permutation of the column indices 1, 2, 3; the general term can be written +-a(1, 1p) a(2, 2p) a(3, 3p), where the blanks are filled in by a variable permutation p of the digits 1, 2, 3. Of the six possible permutations, the three even permutations I, (123), (132) appear in products with a prefix +, while the odd permutations are associated with a minus sign.

Experience has shown that the solutions of n equations in n unknowns are expressed by analogous formulas. Writing aij as a(i, j), and letting ip be the image of i under p, we make the following definition.

Definition. The determinant |A| of an n x n matrix A = ||aij|| is the following polynomial in the entries(+) aij = a(i, j):

(3)    det (A) = |A| = SUM over p of (sgn p) a(1, 1p) a(2, 2p) ... a(n, np),

where the sign +-1, called sgn p (for signum p), is +1 or -1 according as p is an even or an odd permutation. Summation is over the n! different permutations p of the integers 1, 2, ..., n.

Thus the determinant |A| is a sum of n! terms. Each term has exactly one factor from each row and exactly one factor from each column; each row appears once and only once in each term of |A|, which means that |A| is a linear homogeneous function of the entries ai1, ..., ain in the ith row of A. Collecting the coefficients of each such aij, we get an expression

(4)    |A| = Ai1 ai1 + Ai2 ai2 + ... + Ain ain,

where the coefficient Aij of aij is called the cofactor of aij.

(+) The entries are elements of a field F or, more generally, of a commutative ring.
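Definition (3) can be transcribed directly into code. The permutation-sum evaluation below is O(n!) and is illustrative only; elimination (§10.2) is the practical method:

```python
from itertools import permutations

def sgn(perm):
    """Sign of a permutation given as a tuple of 0-based images:
    +1 if the number of inversions is even, -1 if odd."""
    n = len(perm)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n)
                     if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def det(a):
    """|A| = sum over permutations p of sgn(p) * a(1,1p) * ... * a(n,np)."""
    n = len(a)
    total = 0
    for p in permutations(range(n)):
        term = sgn(p)
        for i in range(n):
            term *= a[i][p[i]]
        total += term
    return total

a = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 10]]
assert det(a) == -3
assert sgn((1, 0, 2)) == -1    # a transposition is odd
```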
Since each term of |A| involves each row and each column only once, the cofactor Aij can involve neither the ith row nor the jth column; it is a polynomial in the entries of the remaining rows of A. It contains only entries from the "minor" or submatrix Mij, which is the matrix obtained from A by crossing out the ith row and the jth column. The cofactor can also be described as the partial derivative Aij = d|A|/d aij.

Rows and columns enter symmetrically in |A|:

Theorem 1. If AT is the transpose of A, |AT| = |A|.

Proof. The entry aijT = aji of AT is found by inverting subscripts. A sample term of |AT| is

    (sgn p) PROD over i of aT(i, ip) = (sgn p) PROD over i of a(ip, i) = (sgn p) PROD over j of a(j, jp^(-1)).

This result is a sample term of |A| with j = ip and i = jp^(-1). Since the permutations form a group, every permutation is the inverse p^(-1) of some permutation p. Even the signs (sgn p) = (sgn p^(-1)) agree, for p is even (i.e., in the alternating group) if and only if its inverse p^(-1) is also even (§6.10). Hence |AT| has all the terms of |A|, and |AT| = |A|. Q.E.D.

What is the effect of elementary row operations on a determinant?

RULE 1. To multiply the ith row of A by a scalar c not 0, multiply the determinant |A| by c.

Proof. An extra factor c in each term ai1, ..., ain from the ith row simply gives an extra factor c in |A|, for in the linear homogeneous expression (4) each term contains exactly one entry from the ith row.

Lemma 1. If A has two rows alike, |A| = 0.

Proof. By Theorem 1, it suffices to prove that |A| = 0 if A has two like columns. Let s be the transposition which interchanges the two like columns. The summands (sgn p) PROD a(i, ip) in (3) occur in pairs {p, sp}, consisting of the cosets of the two-element subgroup generated by s. Since s is odd, sgn p = -sgn sp, while PROD a(i, ip) = PROD a(i, isp), since the columns are alike. Hence the paired summands are equal in magnitude and opposite in sign, and their sum is zero, so the determinant is zero. Q.E.D.

RULE 2. To permute two rows of A, change the sign of |A|.

Proof. By symmetry (Theorem 1), we may prove instead that the interchange of two columns changes the sign. This interchange is represented by an odd permutation p0 of the column indices; thus it replaces A by B = ||bij||, where b(i, j) = a(i, jp0). Then

    |B| = SUM over p of (sgn p) PROD b(i, ip) = SUM over p of (sgn p) PROD a(i, ipp0).

Since the products pp0 (with p0 fixed) include all permutations, |B| above has all the terms of |A|. Only the signs of the terms are changed, for pp0 is even when p is odd, and vice versa, since p0 is odd: sgn pp0 = -sgn p. This gives the rule.

RULE 3. The addition of a constant c times row k to row i (i not k) leaves |A| unchanged.

Proof. This operation replaces each aij by aij + c akj. By the linear homogeneous expression (4), the new determinant is

    SUM over j of Aij (aij + c akj) = SUM over j of Aij aij + c SUM over j of Aij akj = |A| + 0,

since the second sum is the determinant obtained by replacing row i by row k in the linear homogeneous expression (4); that determinant has two rows alike, and so is zero by Lemma 1:

(5)    SUM over j of Aij akj = 0    (i not k).

The determinant is indeed unchanged.

These rules may be summarized in terms of elementary matrices. Any elementary row operation carries the identity I into an elementary matrix E, and A into its product EA. The determinant |I| = 1 is thereby changed to |E| = c, -1, or 1 (Rule 1, 2, or 3), while |A| goes to |EA| = c|A|, (-1)|A|, or |A| as the case may be. This proves |EA| = |E| |A|, and establishes

Theorem 2. If E is an elementary matrix,

    |EA| = |E| |A| = |AE|.

By symmetry (Theorem 1), the same applies to postfactors E. For the consideration of adjoints (§10.2), it is convenient to express this rule by the displayed equation.

Another rule explicitly gets the cofactors from the submatrices Mij discussed above.

RULE 4. Each cofactor Aij is found from the determinant of the corresponding submatrix by prefixing the sign (-1)^(i+j):

    Aij = (-1)^(i+j) |Mij|.

This is the sign to be found in the (i, j) position on a +- checkerboard which starts with a plus in the upper left-hand corner.

Proof. First, consider the case i = j = 1. The definition (3) shows at once that the terms involving a11 are exactly the terms belonging to permutations p with 1p = 1, and such a p is actually an even (odd) permutation of the remaining digits 2, ..., n according as p itself is even (odd); so the terms with a11 removed are exactly the terms in the expansion of |M11|, and A11 = |M11|. Any other cofactor Aij may now be reduced to this special case by moving the aij term to the upper left position by means of i - 1 successive interchanges of adjacent rows and j - 1 interchanges of adjacent columns. These operations do not alter |Mij|, because the relative position of the rows and columns in Mij is unaffected, but they change the sign of |A| i + j - 2 times, and hence the sign of the cofactor of aij is (-1)^(i+j-2) = (-1)^(i+j). This reduction proves the rule.
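Rule 4 together with the expansion (4) gives the familiar recursive evaluation by cofactors of the first row. A sketch, with an arbitrary 4 x 4 test matrix:

```python
def minor(a, i, j):
    """The submatrix M_ij: delete row i and column j."""
    return [row[:j] + row[j+1:] for k, row in enumerate(a) if k != i]

def det(a):
    """|A| = sum over j of a_1j * A_1j, with A_1j = (-1)^(1+j) |M_1j| (Rule 4)."""
    n = len(a)
    if n == 1:
        return a[0][0]
    return sum((-1) ** j * a[0][j] * det(minor(a, 0, j)) for j in range(n))

a = [[ 2,  3, 4,  1],
     [ 4, -1, 2,  3],
     [-6,  5, 2,  6],
     [ 8,  5, 7, -2]]
assert det(a) == -266
assert det([[1, 2], [3, 4]]) == -2
```

Like the permutation sum, this recursion is exponential in n; it is useful for hand-sized matrices and for checking the elimination method of §10.2.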
An especially useful case is that in which all of the first row is zero except for the first term. The expansion (4) then need involve only the first cofactor A11 = |M11|, so

(6)    |A| = | a11  0 |  =  a11 |B|,
             |  K   B |

where 0 is 1 x (n - 1), K is (n - 1) x 1, and B is (n - 1) x (n - 1). By this rule and induction, one obtains the following result.

Lemma 2. The determinant of a triangular matrix is the product of its diagonal entries.

The preceding rules provide a system for computing a determinant |A|. Reduce A by elementary operations to a triangular form T, and record t, the number of interchanges of rows (or columns) used, and c1, ..., cs, the various scalars used to multiply rows (or columns) of A. By Theorem 2,

    |A| = (-1)^t (c1 ... cs)^(-1) |T|.

The computation is completed by setting |T| = t11 ... tnn, using Lemma 2.

Exercises

1. (a) If A = (...), compute |A| both by the minors of the first row and by the minors of the first column, and compare the results. (b) Compute |A| on the assumption that the entries of A are integers modulo 2.
2. Compute the determinants of the matrices of Ex. 2 of §7.6.
3. Write out the positive terms in the expansion of a general 4 x 4 determinant.
4. Prove Lemma 2 directly from the definition of a determinant.
5. If n is odd and 1 + 1 is not 0, show that an n x n skew-symmetric matrix A has determinant 0.
6. (a) Deduce the following expansion of the "Vandermonde" determinant:

    | 1  x1  x1^2 |
    | 1  x2  x2^2 |  =  (x2 - x1)(x3 - x1)(x3 - x2).
    | 1  x3  x3^2 |

(b) Generalize the result to the 4 x 4 case. (c) Generalize to the n x n case, by proving that if aij = xi^(j-1), then |A| = PROD over i > j of (xi - xj).
7. In the plane, show that the line joining the point (a1, a2) to the point (b1, b2) has the equation

    |  x   y   1 |
    | a1  a2   1 |  =  0.
    | b1  b2   1 |

8. A real n x n matrix is called diagonally dominant if SUM over j not i of |aij| < aii for i = 1, ..., n. Show that if A is diagonally dominant, then |A| > 0.
9. (a) Show that the determinant of any permutation matrix is +-1. (b) Show that the determinant of a monomial matrix is the product of the nonzero entries, times +-1.
10. (a) If each entry ajk in a matrix A is a function of x, show that

    d|A|/dx = SUM over j, k of Ajk (d ajk/dx).

(b) Use this to verify that Aij = d|A|/d aij.
11. Show that for any 4 x 4 skew-symmetric matrix A,

    |A| = (a12 a34 - a13 a24 + a14 a23)^2.

*12. If W is the n x n matrix ||w^(ij)||, where w is a primitive complex nth root of unity, show that |W| = n^(n/2), provided n = 1 (mod 4).
*13. If A and C are square matrices, prove that

    | A  B |
    | 0  C |  =  |A| |C|.

10.2. Products of Determinants

Under elementary row and column operations, any square matrix A is equivalent to a diagonal matrix D (Theorem 18, §8.9), so A can be obtained from D by pre- and postmultiplication by elementary matrices Ei and Em', as in Theorem 13 of §8.8:

(7)    A = Es ... E1 D E1' ... Et'.

The canonical form D has exactly r entries 1 along the diagonal, where r is the rank of A.
while the determinant ID I is the product of its n diagonal entries.. Hence ID I¥-O if and only if r = n. . El of elementary matrices. the product AB has a determinant which may be computed. this suffices.. I ¥. that is. as IABI == IE. since the determinants of the other elementary matrices used are 1. 10 324 Determinants and Canonical Forms The rules IEA I = IE I. as in (8). The inverse of a matrix A with a determinant IA I¥-o exists and may be found explicitly by using cofactors of A. IA I and IAE I = IA I . where r is the rank of A. The computation above proves this rule only when A and B are both nonsingular. But if A or B is singular.E. Therefore (8) proves Theorem 3. the whole determinant IA I¥-O if and only if ID I ¥. If B = E~ . Computing Determinants. Thus 2 4 3 4 -1 1 2 3 -6 5 2 6 8 5 7 -2 =2 1 3/2 2 1/2 0 -7 -6 1 0 14 14 9 0 -7 -9 -6 1 3/2 == -14 2 1/2 0 1 6/7 -1/7 0 0 2 11 0 0 -3 -7 whence the determinant is (-14)(19) = -266. A square matrix A is nonsingular if and only if IA I ¥. One proceeds as in Gaussian elimination.··· EIE~'" = Eil IE.O. so is AB. Ei isa second such matrix. The determinant of a matrix product is the product of the determinants: lAB I = IA I·IB I· . Formula (8) also provides an efficient algorithm for computing n x n determinants numerically. .D. and is known as the adjoint of A. i) entry of the identity matrix I = 118ki II. this equation .. This proves Theorem 6 (Cramer's Rule). . there is a unique solution (11) j = 1. If A is nonsingular..k. A given system of equations has the form L aijXj = b i.bnf column n-vectors).' .xn ) = A -lB.. i) entry of the product of A = I akj I by the transposed matrix of cofactors. This number 8ki is exactly the (k.remultiplied by A -1 gives the unique vector solution X = (Xl>' . n. On the right of (9) is the (k. if the subscripts of the cofactors Aij are interchanged.0. The equation (9) is much like a matrix product. j) entry in the inverse A -1 is just Aj.0. the left side of (9) gives the (k. 
The inverse of a matrix A with a determinant |A| not 0 exists and may be found explicitly by using cofactors of A. The equations (4) and (5) involving the cofactors may be written as

(9)    SUM over j of akj Aij = dki |A|,    where dki = 1 if i = k, and dki = 0 if i is not k.

This number dki is exactly the (k, i) entry of the identity matrix I = ||dki||. The equation (9) is much like a matrix product: on the right of (9) is the (k, i) entry of the identity multiplied by the scalar |A|, while the left side of (9) gives the (k, i) entry of the product of A = ||akj|| by the transposed matrix of cofactors, so

(10)    A ||Aij||T = |A| I.

The matrix ||Aij||T which appears in this equation is the transposed matrix of the cofactors of elements of A, and is known as the adjoint of A. In case |A| = 1, the equation (10) states that the adjoint is the inverse of A. In general, if |A| is not 0, (10) proves

Theorem 5. If |A| is not 0, the inverse of A is A^(-1) = |A|^(-1) ||Aij||T. In other words, the (i, j) entry in the inverse A^(-1) is just Aji/|A|; the subscripts of the cofactors are interchanged.

Cramer's rule for solving n linear equations in n unknowns is a consequence of this formula for the inverse. A given system of equations has the form SUM over j of aij xj = bi, where i and j range from 1 to n. In matrix notation the equation is AX = B (X and B = (b1, ..., bn) column n-vectors). If A is nonsingular, this equation premultiplied by A^(-1) gives the unique vector solution X = (x1, ..., xn) = A^(-1) B. This solution may be expanded if we observe that the (i, j) entry of A^(-1) is Aji/|A|. This proves

Theorem 6 (Cramer's Rule). If n linear equations SUM over j of aij xj = bi in n unknowns have a nonsingular matrix A = ||aij|| of coefficients, there is a unique solution

(11)    xj = |A|^(-1) (A1j b1 + ... + Anj bn),    j = 1, ..., n,

where Aij is the cofactor of aij in the coefficient matrix A. The numerator of this formula may itself be written as a determinant, for it is the expansion by cofactors of the jth column of a determinant obtained from A by replacing the jth column by the column of constants bi.
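Theorem 6 in code, for n = 3, with exact rational arithmetic; the system solved is an arbitrary example:

```python
from fractions import Fraction

def det3(m):
    """3x3 determinant, expanded by cofactors of the first row."""
    return (m[0][0] * (m[1][1]*m[2][2] - m[1][2]*m[2][1])
          - m[0][1] * (m[1][0]*m[2][2] - m[1][2]*m[2][0])
          + m[0][2] * (m[1][0]*m[2][1] - m[1][1]*m[2][0]))

def cramer3(a, b):
    """Cramer's rule: x_j = |A_j| / |A|, where A_j is A with its jth
    column replaced by the column of constants b."""
    d = det3(a)
    xs = []
    for j in range(3):
        aj = [row[:] for row in a]
        for i in range(3):
            aj[i][j] = b[i]
        xs.append(Fraction(det3(aj), d))
    return xs

A = [[2, 1, 1], [1, 3, 2], [1, 0, 0]]
b = [4, 5, 6]
assert cramer3(A, b) == [6, 15, -23]    # checks: 2*6+15-23 = 4, etc.
```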
Observe, however, that large sets of simultaneous equations may usually be solved more efficiently by reducing the matrix (or augmented matrix) to a row-equivalent "echelon" form, as in §7.3. Cramer's rule evidently applies to any field, and so in particular to all equations discussed in §2; it is especially convenient for solving simultaneous linear equations in 2 and 3 unknowns.

Exercises

1. Write out the adjoint of a 2 x 2 matrix A = (a b; c d) and the product of A by its adjoint.
2. (a) Compute the adjoint of the matrix of Ex. 2(a), and verify in this case the rule for the product of a matrix by its adjoint. (b) Do the same for the matrix of Ex. 2(b).
3. By the adjoint method, find the inverses of the matrices of Ex. 2.
4. Write out Cramer's Rule for three equations in 3 unknowns.
5. Solve the simultaneous congruences of Ex. ..., §2.3, by Cramer's rule (cf. Ex. 9 below).
6. By the adjoint method, find the inverses of the 4 x 4 elementary matrices H24, I + 2E33, and I + dE21 of §8.6.
7. If A is nonsingular, prove that |A^(-1)| = |A|^(-1).
8. Prove that the product of a singular matrix by its adjoint is the zero matrix.
9. Prove that the adjoint of any orthogonal matrix is its transpose (cf. (50), §7.8).
10. (a) Show that the pair of homogeneous linear equations

    a1 x + b1 y + c1 z = 0,    a2 x + b2 y + c2 z = 0

has a simultaneous solution

    x = | b1  c1 |,    y = | c1  a1 |,    z = | a1  b1 |.
        | b2  c2 |         | c2  a2 |         | a2  b2 |

(b) When is this solution a basis for the whole set of solutions? (c) Derive similar formulas for three equations in 4 unknowns.

Determinants and Rank. A submatrix (or "minor") of a rectangular matrix A is any matrix obtained from A by crossing out certain rows and certain columns of A (this is to include the case when no rows or no columns are omitted). A "determinant rank" d for any rectangular matrix A not 0 may be defined as the number of rows in the biggest square minor of A with a nonvanishing determinant; in other words, d has the properties: (i) A has at least one d x d minor M with |M| not 0; (ii) if h > d, every h x h square minor N of A has |N| = 0. It can be shown that the rank of any matrix equals its determinant rank.
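Equation (10), A times its adjoint equals |A| I, can be spot-checked in the 2 x 2 case of Exercise 1; the sample matrix is arbitrary:

```python
def adj2(m):
    """Adjoint of a 2x2 matrix (a b; c d): the transposed matrix of
    cofactors, which is (d -b; -c a)."""
    (a, b), (c, d) = m
    return [[d, -b], [-c, a]]

def mul2(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = [[3, 5], [1, 2]]        # |A| = 3*2 - 5*1 = 1, so adj(A) is the inverse
assert adj2(A) == [[2, -5], [-1, 3]]
assert mul2(A, adj2(A)) == [[1, 0], [0, 1]]   # A * adj(A) = |A| * I
```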
The area of the parallelogram is (12) Figure 1 base x altitude = /al/'/a2/'/ sin CI. if r < n . By the The result looks very much like the determinant of a 2 x 2 matrix..9.1. *17. prove that the rank s of the adjoint of A is determined as follows: If r = n. then s = O. Show that the determinant of the adjoint of a matrix A is IA 1".3.Y2) and conversely. 14. Determinants as Volumes Determinants of real n x n matrices can be interpreted geometrically as volumes in n -dimensional Euclidean space. show that the determinant of any 2 x 2 submatrix of AB is the sum of a number of terms. Prove that the adjoint of the adjoint of A is IA 1"-2A. The connection IS suggested by the formula for the area of a parallelogram. each of which is a product of the determinant of a 2 x 2 submatrix of A by that of a 2 x 2 submatrix of B.3 327 Determinants as Volumes 11.. if r = n . Prove that the determinant of an orthogonal matrix is ± 1. *15. Show directly from the definition of determinant rank that an elementary row operation does not alter the determinant rank. the square of the area is al and a2.. where C denotes the angle between the given vectors cosine formula (41) of §7. with rows a I> ••• .am is the determinant IAA TI. If m = 1. this is always possible by §7. and the "volume" of II is independent of which m . . Since A is an m x n matrix. The square of the volume of the parallelepiped with edges al>· .. Theorem 7. . . am be called the base of II. you will get something affinely equivalent to a cube!) This construction establishes a correspondence between real m x n matrices and m -dimensional parallelepipeds in n-dimensional space.. am.I)-dimensional volume of the base by the length If31 of the altitude. (13) The volume of II is defined as the product of the (m .. We now argue by induction on m.· . .. .am. . Let the parallelepiped with the edges a2.Ch. Proof.3. To establish the generalization.t Note. ." .1 vectors are said to span its "base. (Picture this in case m = n = 3. 
The altitude is the component of al orthogonal to a2. where A is the matrix with the coordinates of aj in the ith row.. where P is an m x m permutation matrix with IPI = IpTI = ±1. 10 328 Determinants and Canonical Forms parallelograms in n-dimensional Euclidean space..am and a component f3 orthogonal to Sm-l (see Figure 1. Since a permutation of the rows of A replaces A by pA. The parallelepiped II in En spanned by the m vectors ai consists of all vectors of the form (0 < ti < 1. the matrix A is a row. . These rows represent vectors issuing from the origin in n-dimensional Euclidean space En.. the ai are called edges of the parallelepiped II. . m). the product AA T is an m x m square matrix. The m-dimensional volume (including as special cases length if m = 1 and area if m = 2) V(II) of this figure can be defined by induction on m. let A be any m x n matrix. it is to be found from the remaining edge a 1 by writing a 1 as the sum of a component'Y in the space Sm-l spanned by a2. These analogues are called parallelepipeds. . i = 1. the coordinates of a vector are taken relative to a fixed normal orthogonal basis.11). Theorem 7 degenerates to the equation 0 = 0 if m > n. and the "inner product" AA T = (a I> al) is the square of the t Throughout §10.. . As in (13). This argument applies also to the formula of Theorem 7. .. Let A be any real n x n matrix with rows al> . The absolute value of a determinant is unaltered by any permutations of the rows. Frame. an. Theorem 9. + cmAm is a linear combination of them.Am (BIA/ = 0). the elementary row operations involved each premultiply A by an elementary matrix of determinant 1. and we have provedt Theorem 8. so this theorem shows also that our definition of the volume of a parallelepiped is independent of the arrangement of the edges in a sequence.. by induction on m... so we have the desired base x altitude formula for AA T..3 329 Determinants as Volumes length. . In the special case when the number of rows is n. 
the scalar BIB/ is the square of the length' of the altitude. .. When m = n..1 rows. its sign is reversed by any odd permutation. the first row Al may be written Al = BI + Cl> where the "altitude" BI is orthogonal to each of the rows A 2..an' The determinant of A is (except possibly for sign) the volume of the paral- lelepiped in En having the vectors al> ..D. and consider the case of m rows. . evidently IAA T I = IA I·IA TI = IA 12 .. Subtract successively Ci times the ith row from the first row of A.§10. an as edges. Suppose that the theorem is true for matrices of m . tThe line of argument in this proof was originally suggested to us by Professor J. . as desired. .. so IDDTI is the square of the volume of the base. the Here D is the matrix whose rows A 2..Am span the base of II. and IA*A*TI = IPAATpTI = IPllAATllpTI = IAATI.1 rows A 2.E. hence A * = PA. then where BIDT = 0 because BIA/ determinant is = 0 for each row Ai of D. Q..Am of A *. furthermore. when m < n. where Ipi = 1. . S. But if D is the block composed of the m . A linear transformation Y = XP of an n-dimensional Euclidean vector space multiplies the volumes of all n -dimensional parallelepipeds by the factor ±IPI. the determinant IA I is often called the "signed" volume of the parallelepiped with edges al> . This changes A into a new matrix A * with first row B I . By (6). . . Furthermore. . while C = C2A2 + . The new signed volume is then IAP I = IA II P I. 1.0). From this it follows that the transformation Y == XP preserves signed volumes if and only if its matrix satisfies IP I = + 1.Ch. Corollary. (Hint: Reduce to the case of an equilateral triangle.0). and (4.5). take the sum L VeIl. its absolute value). (This is commonly done in the integral calculus. . 4) of all these sums for different such sets of parallelepipeds. all transformations which preserve the absolute magnitude of volumes).0. we obtain the following result. . Sometimes this group is enlarged to include all P with IP I = ± 1 (i.An. 
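Theorem 7 lends itself to a small numeric check. In the sketch below (the two vectors are arbitrary illustrative choices, not from the text), the determinant of the 2 x 2 Gram matrix AA^T is compared with the base x altitude computation of (13) for a parallelogram in E^3.

```python
# Numeric check of Theorem 7 for m = 2, n = 3: the square of the area of
# the parallelogram spanned by a1, a2 equals det(A A^T), the determinant
# of the 2 x 2 matrix of inner products.  Vectors chosen arbitrarily.

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

a1 = (1, 2, 2)
a2 = (2, 0, 1)

# Gram matrix A A^T of inner products (a_i, a_j)
g11, g12, g22 = dot(a1, a1), dot(a1, a2), dot(a2, a2)
vol_sq = g11 * g22 - g12 * g12        # det of the Gram matrix

# base x altitude: |a1|^2 times the squared component of a2 orthogonal to a1
base_sq = g11
alt_sq = g22 - g12 * g12 / g11
assert abs(vol_sq - base_sq * alt_sq) < 1e-9
print(vol_sq)  # prints 29
```

The same computation with a1, a2 linearly dependent gives vol_sq = 0, in accordance with the volume-zero exercise below.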
Corollary. A linear transformation Y = XP of an n-dimensional Euclidean vector space multiplies the volumes of all n-dimensional parallelepipeds by the factor ±|P|.

Proof. Consider a parallelepiped which has n edges with respective coordinates A1, ..., An. The row vectors A1, ..., An are transformed into A1P, ..., AnP; the matrix with these new rows is simply the matrix product AP, where A has the rows A1, ..., An. The new signed volume is then |AP| = |A| · |P|, where |A| is the old volume.

From this it follows that the transformation Y = XP preserves signed volumes if and only if its matrix satisfies |P| = +1. The set of all matrices (or of all transformations) with this property is known as the unimodular group. Sometimes this group is enlarged to include all P with |P| = ±1 (i.e., all transformations which preserve the absolute magnitude of volumes).

The volume of any region R in n-dimensional Euclidean space may be defined loosely as follows: circumscribe R by a finite set of parallelepipeds Π1, Π2, ..., the parallelepipeds being cubes with sides parallel to the coordinate axes; take the sum Σ V(Πi); and define the volume of R to be the greatest lower bound (Chap. 4) of all these sums for different such sets of parallelepipeds. (This is commonly done in the integral calculus.) By Theorem 9, a linear transformation with matrix P changes the volume of any parallelepiped in the ratio 1 : |P|; hence it changes the volume of R in the same ratio. Since translations leave volumes unaltered, we obtain the following result.

Corollary. An affine transformation Y = XP + K alters all volumes by the factor |P| (or rather, its absolute value).

Exercises

1. (a) Compute the area of the parallelogram with vertices (0, 0), (1, 0), (3, 4), and (4, 4) in the plane.
   (b) Do the same for the parallelepiped in space with the adjacent vertices (0, 0, 0), (1, 0, 0), (…), and (…).
2. Prove that the diagonals of any parallelogram divide it into four parts of equal area.
3. Show that the medians of any triangle divide it into six parts of equal area. (Hint: Reduce to the case of an equilateral triangle.)
4. Describe three planes which divide a tetrahedron into six parts of equal volume.
5. (a) If P is the intersection of the diagonals of a parallelogram, prove that any line through P bisects the area of the parallelogram.
   (b) Extend this result to three dimensions.
6. Using trigonometry, prove directly that the area A of the parallelogram spanned by the vectors ξ = (x1, x2) and η = (y1, y2) satisfies A^2 = (x1 y2 - x2 y1)^2, the square of the determinant of the matrix with rows ξ and η.
7. (a) Prove that if A is any matrix with rows ai, then AA^T is the matrix of inner products ||(ai, aj)||.
   (b) Using (a), prove that if the ai are orthogonal, then |AA^T| = (|a1| · · · |am|)^2.
8. (a) If m vectors a1, ..., am in E^n are linearly dependent, prove that the parallelepiped which they span has m-dimensional volume zero.
   (b) State and prove the converse of this result.
9. (a) Prove that the volume of a tetrahedron with vertices a1, a2, a3, a4 is (1/6)|BB^T|^{1/2}, where B is the 3 x n matrix with rows (a2 - a1), (a3 - a1), (a4 - a1).
   (b) Show that the area of a triangle with vertices (0, 0, 0), (x1, y1, z1), and (x2, y2, z2) is (1/2)|AA^T|^{1/2}, where A is the matrix with rows (x1, y1, z1) and (x2, y2, z2).
   *(c) Show that the volume of the tetrahedron with three unit edges along the x-, y-, and z-axes is 1/6.
   *(d) Generalize to "tetrahedra" of higher dimensions.
10. (a) If A is a real m x n matrix, use the proof of Theorem 7 to show that |AA^T| ≥ 0.
    (b) Show that the case m = 2 of this result is the Schwarz inequality of §7.10.
11. (a) Show that the correspondence A → |A| maps the full linear group homomorphically onto the multiplicative group of nonzero scalars.
    (b) Infer that the unimodular group is a normal subgroup of the full linear group.
    (c) Is the extended unimodular group (all P with |P| = ±1) a normal subgroup of the full linear group?
12. In the group of orthogonal matrices, show that the matrices with |A| = +1 (the "proper" orthogonal matrices) form a normal subgroup of index 2.
*13. Let -K ≤ aij ≤ K for i, j = 1, ..., n.
    (a) Show that if ai = (ai1, ..., ain), then |ai| ≤ K n^{1/2}.
    (b) Infer that |A| ≤ |a1| · |a2| · · · |an| ≤ K^n n^{n/2} (Hadamard's determinant theorem).

10.4. The Characteristic Polynomial

We have already seen (Theorem 5 of Chap. 9) that λ is a characteristic root (eigenvalue) of the n x n matrix A if and only if the matrix A - λI is singular. By Theorem 3, this is the case if and only if |A - λI| = 0, which proves the following lemma.
Lemma. The characteristic roots (eigenvalues) of a matrix A are the scalars λ such that |A - λI| = 0.

This lemma provides a straightforward means for reducing a matrix to diagonal form, when such a reduction is possible: for each characteristic root there is a characteristic vector of the transformation T_A. (In general, to find the characteristic roots of a 3 x 3 matrix, one must solve a cubic equation.)

Example. Let A be the real symmetric matrix

    ( 1   3   0 )
A = ( 3  -2  -1 )
    ( 0  -1   1 ).

Expanding |A - λI| by minors of the first row, we have

|A - λI| = -λ^3 + 13λ - 12.

Factoring, |A - λI| = -(λ - 1)(λ + 4)(λ - 3), so that the characteristic roots of A are 1, -4, and 3. Since (x, y, z)T_A = (x + 3y, 3x - 2y - z, -y + z), a vector ξ = (x, y, z) is characteristic with characteristic root λ = 1 if and only if x + 3y = x, 3x - 2y - z = y, and -y + z = z; that is, if and only if y = 0 and z = 3x. This is the case only for the scalar multiples of (1, 0, 3). Similarly, ξ is characteristic for λ = 3 if and only if x + 3y = 3x, 3x - 2y - z = 3y, and -y + z = 3z; this holds only for the scalar multiples of (3, 2, -1). The characteristic vectors for λ = -4 are likewise the scalar multiples of (-3, 5, 1).

The three characteristic vectors (1, 0, 3), (3, 2, -1), and (-3, 5, 1) are mutually orthogonal, hence linearly independent. The matrix P with these vectors as rows is nonsingular, and relative to the new basis formed by these three vectors, the transformation T_A has the matrix PAP^{-1} (cf. Theorems 3' and 4 of §9.5). We may also normalize this basis.
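The worked example can be verified by machine. The sketch below (pure Python) uses the matrix and eigenvectors exactly as computed above, and checks both the eigenvector equations and the roots of the characteristic polynomial.

```python
# Check of the worked example: the symmetric matrix A of the text has
# characteristic polynomial -L^3 + 13L - 12 with roots 1, 3, -4, and the
# stated vectors satisfy (x, y, z) T_A = lambda (x, y, z), vectors acting
# on the left as rows, following the text's convention.

A = [[1, 3, 0],
     [3, -2, -1],
     [0, -1, 1]]

def row_times(x, A):
    """Row vector x times the 3 x 3 matrix A."""
    return [sum(x[i] * A[i][j] for i in range(3)) for j in range(3)]

for lam, x in [(1, (1, 0, 3)), (3, (3, 2, -1)), (-4, (-3, 5, 1))]:
    assert row_times(x, A) == [lam * xi for xi in x]   # eigenvector check
    assert -lam**3 + 13 * lam - 12 == 0                # root of char. poly
```

The three eigenvectors are also easily confirmed to be mutually orthogonal, as asserted in the text.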
Normalizing, we obtain the normal orthogonal basis of characteristic vectors

α1 = (1/√10)(1, 0, 3),   α2 = (1/√14)(3, 2, -1),   α3 = (1/√35)(-3, 5, 1).

The matrix Q with these rows is orthogonal, and QAQ^{-1} = QAQ^T is the diagonal matrix with diagonal entries 1, 3, -4.

The 3 x 3 symmetric matrix A displayed above is the matrix of the quadratic form x^2 + 6xy - 2y^2 - 2yz + z^2. The preceding analysis shows that this quadratic form, relative to the normal orthogonal basis ("principal axes") α1, α2, α3, assumes the diagonal form x^2 + 3y^2 - 4z^2.

In general, let A be any n x n matrix. Since a determinant is a polynomial, linear in the entries of each row, |A - λI| is a polynomial of degree n in the indeterminate λ, of the form

(14)  |A - λI| = b0 + b1 λ + ... + b_{n-1} λ^{n-1} + (-1)^n λ^n.

We shall define the characteristic polynomial of A as the polynomial c_A(λ) = |A - λI|, and the characteristic equation of A as the equation |A - λI| = 0. We can now restate the lemma above as follows:

Theorem 10. The characteristic roots (eigenvalues) of a matrix A are the roots of the characteristic equation of A.

Since a complex polynomial of positive degree has at least one root, we infer the following

Corollary. Over the complex field, a linear transformation has at least one (nonzero) characteristic vector.

Theorem 11. Similar matrices have the same characteristic polynomial, hence the same characteristic roots.

Proof. Let the matrices be A and B = P^{-1}AP. Since |P^{-1}| = |P|^{-1} and |P| are scalars, they commute, and so the rule for multiplying determinants gives

|P^{-1}AP - λI| = |P^{-1}(A - λI)P| = |P^{-1}| · |A - λI| · |P| = |A - λI|.

Since |A^T - λI| = |(A - λI)^T| = |A - λI| by Theorem 1, we also have the

Corollary. A matrix A and its transpose A^T have the same characteristic polynomial.

Corollary. The characteristic polynomial of a triangular matrix T with diagonal entries d1, ..., dn is (d1 - λ)(d2 - λ) · · · (dn - λ).

The proof follows from Lemma 2 of §10.1, since T - λI is itself a triangular matrix. It is a corollary that the set of diagonal entries (with multiplicity) consists of the roots (with multiplicity) of the characteristic polynomial. Hence the set of diagonal entries and the number of occurrences of each diagonal entry are the same for any two similar diagonal matrices. This can be stated as follows:

Theorem 12. Two diagonal matrices are similar if and only if they differ only in the order of their diagonal terms.

By Theorem 11, the successive coefficients b0 = |A|, b1, ..., b_{n-1} of |A - λI| are invariants of the matrix A under the group of transformations A → P^{-1}AP. Suitable polynomials in the bi give other useful invariants. One such invariant is

Σ_{i,j=1}^{n} aij aji = Σ_i aii^2 + 2 Σ_{i<j} aij aji = b_{n-1}^2 + (-1)^{n-1} 2 b_{n-2}.

In the case of symmetric matrices (aij = aji), this invariant is simply Σ aij^2.

The properties of similarity throw a new light on the orthogonal transformation of a real quadratic form (§9.10). If a quadratic form XAX^T with matrix A has been reduced by an orthogonal transformation Z = XP to a diagonal form λ1 z1^2 + ... + λn zn^2, the diagonal matrix D of this new form is D = PAP^T. Since P is orthogonal, P^T = P^{-1} and D = PAP^{-1}; hence the new matrix D and the original matrix A are similar. The eigenvalues λ1, ..., λn of D are therefore the same as those of the given matrix A. Since we know that any real symmetric matrix is orthogonally equivalent to a real diagonal matrix, this gives the following sharpened form of Theorem 21 of §9.10.

Corollary. Any real quadratic form XAX^T may be reduced by an orthogonal transformation to a diagonal form λ1 z1^2 + ... + λn zn^2, in which the coefficients λi are the roots of the characteristic equation |A - λI| = 0.
But the characteristic equation, and hence its roots, is uniquely determined by A. This proves the essential uniqueness of the diagonal form and gives a direct way to compute its coefficients. Knowing the coefficients, one can also compute the principal axes as the associated eigenvectors, in the way indicated above.

Theorem 13. All eigenvalues of a real symmetric matrix are real.

Remark. Eigenvectors X1 and X2 of a symmetric matrix A having distinct eigenvalues λ1 ≠ λ2 are necessarily orthogonal, for the bilinear expression X1 A X2^T may be computed in two ways, as

X1 (A X2^T) = λ2 X1 X2^T   and   (X1 A) X2^T = λ1 X1 X2^T.

Since λ1 ≠ λ2, X1 X2^T must be zero, and X1 is therefore orthogonal to X2. Hence if the n x n symmetric matrix A has n distinct eigenvalues λ1, ..., λn, then any n associated eigenvectors X1, ..., Xn will be orthogonal, and the unit vectors X1/|X1|, ..., Xn/|Xn| will form the rows of an orthogonal matrix P such that PAP^T = PAP^{-1} is diagonal.

Exercises

1. Compute the eigenvalues and eigenvectors of the matrices:
   (a) …   (b) …   (c) …
2. Let D be a diagonal matrix with diagonal entries 3, 2, 2, while P is a triangular matrix with rows (1, …, …), (0, …, …), (0, 0, …). Compute the characteristic equation of P^{-1}DP and compare it with that of D.
3. Find the lengths of the principal axes of the quadric xy + yz + zx + x + y + z = 1.
4. Write down a diagonal quadratic form equivalent under orthogonal transformation to each of the expressions given below.
   (a) …
   (b) 3x^2 - y^2 - 5z^2 + 4xy + 16yz + 20xz. (Hint: Show that all integral eigenvalues are multiples of 9.)
5. Exhibit an orthogonal transformation which reduces each form of Ex. 4 to its diagonal equivalent.
6. Find a necessary and sufficient condition that the eigenvalues of a 2 x 2 matrix be equal.
7. Show that if A and B are square matrices, the characteristic polynomial of the block matrix with diagonal blocks A and B (and zero blocks elsewhere) is the product of those for A and B.
8. Prove directly from the definition that all eigenvalues of a real symmetric A are real.
9. Prove the principal axis theorem for a real symmetric matrix A by the following analysis of the linear transformation X → XA:
   (a) The matrix A has an eigenvector α1 of length 1.
   (b) If α1 is chosen as the first vector in a new normal orthogonal basis, the new matrix for the given transformation has zeros in the first column and the first row, except for the first entry.
   (c) The argument is continued by induction.
10. Prove that in (14), b_{n-1} = ±(a11 + ... + ann). (The invariant a11 + ... + ann is called the trace of A.)
11. Prove that in (14), b_{n-2} = (-1)^n Σ_{i<j} (aii ajj - aij aji).
12. Prove the formula Σ_{i,j} aij^2 = b_{n-1}^2 + (-1)^{n-1} 2 b_{n-2} for symmetric matrices.
13. Prove that the volume of the ellipsoid Σ aij xi xj ≤ 1 is (4π/3) · |A|^{-1/2}. (Hint: Transform to principal axes, and use Theorem 9.)
14. Find all 2 x 2 matrices with eigenvalues +1 and -1.
*15. Prove that all eigenvalues of a hermitian matrix are real. (Hint: For X an eigenvector, show that the scalar XAX̄^T equals both λ X X̄^T and λ̄ X X̄^T, where X̄ denotes the complex conjugate of X.)
*16. Show that every unitary matrix U has an eigenvector ξ with ξU = dξ, where |d| = 1.
*17. (a) Show that if a matrix A has r linearly independent eigenvectors with eigenvalue λ1, then c_A(λ) is a multiple of (λ - λ1)^r.
    (b) Construct, for any r, an r x r matrix A with c_A(λ) = (λ - λ1)^r but having no two linearly independent eigenvectors with eigenvalue λ1.

10.5. The Minimal Polynomial

The construction of canonical forms for a matrix under similarity depends upon the study of the polynomial equations satisfied by the matrix or by the corresponding transformation. Specifically, let V be an n-dimensional vector space over a field F, and T: V → V a linear transformation of V. The various powers T^m of T are then also linear
transformations of V. Since transformations can also be added and multiplied by scalars, we can consider, for each polynomial form f(x) = a0 + a1 x + ... + ak x^k with coefficients ai in F, the corresponding polynomial in T,

(15)  f(T) = a0 I + a1 T + ... + ak T^k.

It represents a linear transformation V → V; as a special case, the constant polynomial f(x) = 1 yields the identity transformation I: V → V. These polynomials in T are added and multiplied like the polynomials f(x); in particular, since powers of T are permutable (T^p T^q = T^q T^p = T^{p+q}), any two polynomials in T commute.

Because of the isomorphism A ↔ T_A between n x n matrices and linear transformations of an n-dimensional space V, each n x n matrix A with entries in F yields polynomials

(16)  f(A) = a0 I + a1 A + ... + ak A^k;

these are again n x n matrices with entries in F. Since there are exactly n^2 linearly independent n x n matrices over F, the n^2 + 1 matrices I, A, A^2, ..., A^{n^2} are certainly linearly dependent, and the dependence relation provides a nonzero polynomial f(x) of degree at most n^2 with f(A) = 0. Similarly, there will also exist for each linear transformation T of an n-dimensional vector space V a nonzero polynomial f(x) with f(T) = 0.

Theorem 14. For each linear transformation T of a finite-dimensional vector space V over F, the polynomials f(x) over F such that f(T) = 0 are the multiples of a unique monic polynomial m(x). It is the monic polynomial characterized by the properties

(17)  m(T) = 0;  f(T) = 0 implies m(x) | f(x),

where the symbol m(x) | f(x) means that m(x) divides f(x) in the polynomial ring F[x]. We call m(x) the minimal polynomial of T.

Proof. Consider the set M of all polynomials f(x) over F such that f(T) = 0. M is closed under addition, subtraction, and multiplication by any polynomial g(x): it is an ideal of the ring F[x]. We have just seen that M contains a nonzero polynomial. Hence, by Theorem 11 of Chap. 3, M consists of the multiples of the monic polynomial m(x) of least degree with m(T) = 0.

The minimal polynomial of an n x n matrix A is described similarly; because of the isomorphism above, it is identical with the minimal polynomial of the corresponding transformation T_A. Since similar matrices are different representations of the same linear transformation, we have the

Corollary. Similar matrices over a field F have the same minimal polynomial over F.

As an illustration, consider a nilpotent transformation (or matrix), that is, a linear transformation T with T^m = 0 for some m. Since T satisfies T^m = 0, its minimal polynomial is x^h for some integer h; indeed, h is the least positive integer with T^h = 0. Suppose, in particular, that h = n. Since T^{n-1} ≠ 0, there is a vector α with αT^{n-1} ≠ 0. We assert that the n vectors α, αT, αT^2, ..., αT^{n-1} are then linearly independent. If not, there would be a linear dependence relation

0 = a0 α + a1 αT + ... + a_{n-1} αT^{n-1}

with coefficients aj not all zero. If aj is the first nonzero coefficient, we apply T^{n-j-1} to the equation to get 0 = aj αT^{n-1}, hence aj = 0, a contradiction.

When these independent vectors α, αT, ..., αT^{n-1} are chosen as a basis, T carries each vector of the basis into the next and the last vector into zero, and hence is represented by the n x n matrix C0 whose entries are zero except for entries 1 in the diagonal just above the main diagonal. This matrix, which is clearly nilpotent, is known as the "companion matrix" of the polynomial x^n.

More generally, to each monic polynomial g(x) = c0 + c1 x + ... + c_{n-1} x^{n-1} + x^n of degree n we can construct an n x n matrix with minimal polynomial g(x). This matrix, called the companion matrix of g(x), is, for n = 4,

(18)  Cg = (  0    1    0    0  )
           (  0    0    1    0  )
           (  0    0    0    1  )
           ( -c0  -c1  -c2  -c3 ),

and similarly for any n: the only nonzero entries are 1's along the diagonal just above the principal diagonal and the entries -c0, ..., -c_{n-1} in the last row. Since the rows of the matrix are the coordinates of the transforms of the unit vectors ε1, ..., εn, we have

(19)  εj T = ε_{j+1}  (j = 1, ..., n - 1),   εn T = -c0 ε1 - c1 ε2 - ... - c_{n-1} εn.

In other words, T carries each unit vector of the basis into the next, while

ε1 T^n = εn T = -c0 ε1 - c1 ε1 T - ... - c_{n-1} ε1 T^{n-1},

so that ε1 g(T) = 0.

Theorem 15. For each monic polynomial g(x), the companion matrix Cg has the minimal polynomial g(x) and the characteristic polynomial (-1)^n g(λ).

Proof. Let T be the linear transformation of F^n represented by the companion matrix Cg of (18). By (19), the vectors ε1, ε1T, ..., ε1T^{n-1} are the basis ε1, ..., εn of F^n, so that any vector ξ can be written uniquely as ξ = ε1 f(T), where f(x) = a0 + a1 x + ... + a_{n-1} x^{n-1} is a polynomial of degree at most n - 1. For any such f(x) ≠ 0, ε1 f(T) ≠ 0 by (19), hence f(T) ≠ 0. On the other hand, ε1 g(T) = 0, so that, for any vector ξ = ε1 f(T),

ξ g(T) = ε1 f(T) g(T) = ε1 g(T) f(T) = 0.

Therefore g(T) = 0, which asserts that T satisfies the monic polynomial equation g(T) = 0; thus g(x) is indeed the minimal polynomial of Cg.

The characteristic polynomial of Cg is found by expanding the determinant |Cg - λI| by minors of the last row. Since the minor of each entry -ck is triangular with k diagonal entries -λ and the others 1, |Cg - λI| is exactly (-1)^n g(λ). The sign (-1)^n occurs because the characteristic polynomial of any n x n matrix has (-1)^n λ^n as its leading term.
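Theorem 15 lends itself to a quick numerical check. The cubic g(x) = x^3 + 2x^2 + 3x + 4 and its companion matrix below are illustrative choices, not from the text; the matrix is built in the row form of display (18).

```python
# The companion matrix of g(x) = x^3 + 2x^2 + 3x + 4: 1's just above the
# diagonal and -c0, -c1, -c2 in the last row.  Theorem 15 says g(C) = 0.

def mul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def scal(c, X):
    return [[c * x for x in row] for row in X]

I = [[1 if i == j else 0 for j in range(3)] for i in range(3)]
C = [[0, 1, 0],
     [0, 0, 1],
     [-4, -3, -2]]

C2 = mul(C, C)
C3 = mul(C2, C)
gC = add(add(C3, scal(2, C2)), add(scal(3, C), scal(4, I)))  # g(C)
assert gC == [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
```

No polynomial of smaller degree annihilates C, in accordance with the theorem: g(x) is the minimal polynomial as well as (up to the sign (-1)^n) the characteristic polynomial.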
Exercises

1. Show that every real 2 x 2 matrix whose determinant is negative is similar to a diagonal matrix. Interpret geometrically.
2. (a) Show that any 2 x 2 matrix which satisfies X^2 = 0 is similar to the matrix with rows (0, 0), (1, 0), or to the zero matrix.
   (b) Prove a corresponding result for 3 x 3 matrices.
3. Show that the characteristic polynomial of any diagonal matrix is a multiple of its minimal polynomial.
4. Show that if λ is an eigenvalue of A, and q(x) is any polynomial, then q(λ) is an eigenvalue of q(A).
*5. (a) For any nonsingular n x n matrix P, prove that the correspondence A → PAP^{-1} is an automorphism of the algebra of all n x n matrices A.
   (b) Deduce from (a) a direct proof that similar matrices have the same minimal polynomial.
*6. (a) Show that any real 3 x 3 matrix A has a real eigenvector. (Hint: See Ex. 2 or §9.4.)
   (b) Show that any orthogonal 3 x 3 matrix is similar, under an orthogonal change of basis, to a matrix with ±1 in the upper left corner, zeros elsewhere in the first row and column, and an orthogonal 2 x 2 matrix B in the lower right block.
*7. (a) Show that every real 2 x 2 orthogonal matrix whose determinant is positive is a rigid rotation.
   (b) Show that every 2 x 2 orthogonal matrix whose determinant is negative is a rigid reflection.
*8. Using Exs. 6 and 7, show that if A is an orthogonal 3 x 3 matrix and |A| > 0, then A has an eigenvalue +1 and is a rigid rotation. (This is Euler's theorem.)
*9. (a) Show that the eigenvalues of the matrix

       C = ( 0 1 0 0 )
           ( 0 0 1 0 )
           ( 0 0 0 1 )
           ( 1 0 0 0 )

   are ±1, ±i, the complex fourth roots of unity.
   (b) What are the complex eigenvectors of C?
   (c) To what complex diagonal matrix is C similar?
*10. An n x n matrix A is called a circulant matrix when aij = a_{i+1,j+1} for all i, j, the subscripts being taken modulo n. Show that the eigenvalues λ1, ..., λn of any circulant matrix are

       λp = a11 + a12 ω^p + ... + a1n ω^{(n-1)p},

   where ω is a primitive nth root of unity.

10.6. Cayley-Hamilton Theorem

We shall now show that every square matrix A satisfies its characteristic equation; that is, the minimal polynomial of A divides the characteristic polynomial of A. This is easily proved using the concept of a matric polynomial or λ-matrix. By this is meant a matrix, like A - λI, whose entries are polynomials in a symbol λ. Collecting terms involving like powers of λ, one can write any nonzero λ-matrix B(λ) in the form

B(λ) = B0 + λB1 + ... + λ^k Bk,

where the Bi are matrices of constants and Bk ≠ 0.
Lemma. If the product C = B(λ)(A - λI) is a matrix of constants, then C = 0.

Proof. Expanding B(λ)(A - λI), we get

B(λ)(A - λI) = B0 A + λ(B1 A - B0) + ... + λ^k (Bk A - B_{k-1}) - λ^{k+1} Bk.

The coefficient -Bk of λ^{k+1} is not zero unless B(λ) = 0. (Equality of λ-matrices means equality in every coefficient of every power of λ, in each entry.) The conclusion is now obvious.

In the matrix A - λI the entries are linear polynomials in λ, so its nonzero minors are determinants which are polynomials in λ of degree n - 1 or less. Each entry in the adjoint C of A - λI is such a minor, so that this adjoint may be written as a sum of n matrices,

C = C(λ) = Σ_{i=0}^{n-1} λ^i Ci.

According to (10), the product of A - λI by its adjoint is

(21)  C(λ)(A - λI) = |A - λI| · I = f(λ) · I,

where f(λ) is the characteristic polynomial.

Theorem 16 (Cayley-Hamilton). Every square matrix satisfies its characteristic equation.

This means that if each power λ^i in the characteristic polynomial f(λ) = |A - λI| of (14) is replaced by the same power A^i of the matrix (and if λ^0 is replaced by A^0 = I), the result is zero:

(20)  b0 I + b1 A + ... + b_{n-1} A^{n-1} + (-1)^n A^n = 0.

Here the left side is f(A), the matrix of constants obtained from the characteristic polynomial f(λ) by substituting A for λ.

Proof. Observe that the familiar factorization

A^i - λ^i I = (A^{i-1} + λ A^{i-2} + ... + λ^{i-1} I)(A - λI)    (i ≥ 1)

will give, in terms of the coefficients bi of the characteristic polynomial (14),

(22)  f(λ) · I - f(A) = Σ_i bi (λ^i I - A^i) = -G(λ)(A - λI),

where G(λ) = Σ_i bi (A^{i-1} + λ A^{i-2} + ... + λ^{i-1} I) is a λ-matrix. If we add (22) to (21), we get

[C(λ) + G(λ)](A - λI) = f(A),

where f(A) is a matrix of constants. By the Lemma, this implies f(A) = 0.

Exercises

1. Prove by direct substitution that every 2 x 2 matrix satisfies its characteristic equation.
2. (a) Show by explicit computation that the 4 x 4 companion matrix Cg of (18) satisfies its characteristic equation.
   (b) Do the same for the companion matrix of a polynomial of degree n.
3. (a) Prove the Cayley-Hamilton theorem for strictly triangular matrices, by direct computation.
   *(b) Same problem for triangular matrices.
4. Show that if A is nonsingular and has the characteristic polynomial (14), then the adjoint of A is given by

   adj A = -[b1 I + b2 A + ... + b_{n-1} A^{n-2} + (-1)^n A^{n-1}].

5. In the notation of Ex. 4, show that the characteristic polynomial of A^{-1} is

   (-1)^n [λ^n + (b1/|A|) λ^{n-1} + ... + (-1)^n / |A|].
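Exercise 1 can be carried out by machine for a sample matrix (the matrix below is an arbitrary choice): a 2 x 2 matrix A satisfies A^2 - (a11 + a22)A + |A| I = 0, which is its characteristic equation up to the sign (-1)^n.

```python
# Cayley-Hamilton for one 2 x 2 matrix, chosen arbitrarily:
# A^2 - (trace A) A + |A| I = 0.

A = [[1, 2], [3, 4]]
tr = A[0][0] + A[1][1]                        # 5
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]   # -2

def mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A2 = mul(A, A)
chk = [[A2[i][j] - tr * A[i][j] + (det if i == j else 0) for j in range(2)]
       for i in range(2)]
assert chk == [[0, 0], [0, 0]]
```

The general 2 x 2 case of Exercise 1 is the same computation with literal entries a, b, c, d in place of the numbers.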
10.7. Invariant Subspaces and Reducibility

If a linear transformation T satisfies a polynomial equation which can be factored, the matrix representing T can often be correspondingly simplified. Suppose, for example, that T satisfies T^2 = I (is of period two), so that the factors of (T - I)(T + I) = 0 will be relatively prime. (Here we assume that 1 + 1 ≠ 0 in the base field F.)

The eigenvectors for T include all nonzero vectors η = ξ(T + I) in the range of T + I, for

(ξ(T + I))T = ξ(T^2 + T) = ξ(I + T);

they belong to the eigenvalue +1. All nonzero vectors in the range of T - I are also characteristic, with eigenvalue -1, for

(ξ(T - I))T = ξ(T^2 - T) = ξ(I - T) = -(ξ(T - I)).

But since 1 + 1 ≠ 0, any vector ξ can be written as a sum

ξ = (1/2)[ξ(T + I) - ξ(T - I)];

hence the eigenvectors with eigenvalues ±1 span the whole space. Therefore, by Theorem 4 of §9.2, T can be represented by a diagonal matrix with entries ±1 on the diagonal. If the entries are all +1, T is the identity, and the minimal polynomial of T is x - 1; if the entries are all -1, the minimal polynomial is x + 1; if both +1 and -1 occur, the minimal polynomial of T is x^2 - 1.

This analysis is a special case of

Theorem 17. If the minimal polynomial m(x) of a linear transformation T: V → V can be factored over the base field F of V as m(x) = f(x)g(x), with f(x) and g(x) monic and relatively prime, then any vector ξ in V has a unique expression as a sum

(23)  ξ = η + ζ,    ηf(T) = 0,    ζg(T) = 0.

Proof. Since f and g are relatively prime, the Euclidean algorithm provides polynomials h(x) and k(x) with coefficients in F so that

(24)  1 = h(x)f(x) + k(x)g(x).

Substitution of T for x yields I = h(T)f(T) + k(T)g(T). Hence, for any vector ξ,

ξ = ξh(T)f(T) + ξk(T)g(T) = ζ + η,    η = ξk(T)g(T),    ζ = ξh(T)f(T).

As ηf(T) = ξk(T)g(T)f(T) = ξk(T)m(T) = 0, and similarly ζg(T) = 0, this is the required decomposition. The decomposition (23) is unique, for if ξ = η1 + ζ1 = η2 + ζ2 are two decompositions, then α = η1 - η2 = ζ2 - ζ1 is a vector such that αf(T) = 0 and also αg(T) = 0; hence, by (24),

α = αI = αh(T)f(T) + αk(T)g(T) = 0.

The subspace S1 of all vectors η with ηf(T) = 0 and the subspace S2 of all ζ with ζg(T) = 0 (that is, the null-spaces of f(T) and g(T)) thus have V as their direct sum, in the sense defined in §10.8 below. Each of these subspaces is mapped into itself by T; thus each is an "invariant" subspace in the sense of the following general definition.

Definition. A subspace S of a vector space V is said to be invariant under a linear transformation T: V → V if ξ ∈ S implies ξT ∈ S. In this event, the correspondence ξ → ξT (for ξ in S) is called the transformation induced on S by T.

In these terms, Theorem 17 can be restated in another way.

Theorem 17'. If S1 and S2 are the null-spaces of f(T) and g(T), respectively, then V is the direct sum of S1 and S2, and the transformations T1 and T2 induced by T on S1 and S2 have the minimal polynomials f(x) and g(x), respectively.

Proof. Since ηf(T) = 0 for every η in S1, the minimal polynomial of T1 is a divisor f1(x) of f(x); similarly, the minimal polynomial of T2 is a divisor g2(x) of g(x). Since every ξ in V can be represented as in (23), ξf1(T)g2(T) = 0 for every ξ; hence the product f1(x)g2(x) is divisible by the minimal polynomial m(x) = f(x)g(x). But f1(x) divides f(x) and g2(x) divides g(x); this is possible only if f(x) divides f1(x) and g(x) divides g2(x), so that f1 = f, and likewise g2 = g.
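The proof of Theorem 17 is constructive and can be traced numerically. In the sketch below, T is the plane reflection interchanging the two coordinates (an illustrative choice, not from the text), with m(x) = (x - 1)(x + 1) and the Euclidean pair h(x) = -1/2, k(x) = 1/2 in (24).

```python
# Theorem 17 for T^2 = I: with f(x) = x - 1, g(x) = x + 1 we have
# 1 = (-1/2)(x - 1) + (1/2)(x + 1), so eta = xi k(T) g(T) = (xi + xi T)/2
# and zeta = xi h(T) f(T) = (xi - xi T)/2 give the unique decomposition.
from fractions import Fraction as F

def apply(x, T):                      # row vector times matrix
    return [sum(x[i] * T[i][j] for i in range(2)) for j in range(2)]

T = [[0, 1], [1, 0]]                  # reflection: (x, y) -> (y, x)
xi = [F(3), F(7)]

xiT = apply(xi, T)
eta = [(a + b) / 2 for a, b in zip(xi, xiT)]    # killed by T - I
zeta = [(a - b) / 2 for a, b in zip(xi, xiT)]   # killed by T + I
assert [e + z for e, z in zip(eta, zeta)] == xi
assert apply(eta, T) == eta                      # eigenvalue +1
assert apply(zeta, T) == [-z for z in zeta]      # eigenvalue -1
```

Geometrically, eta and zeta are the components of xi along and across the mirror line y = x, which is exactly the direct-sum decomposition of Theorem 17'.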
Invariant subspaces arise in many ways. Thus if f(x) is any polynomial, then the range of the transformation f(T): V → V (that is, the set of all vectors ξf(T) with ξ in V) is invariant under T, for (ξf(T))T = (ξT)f(T) is in this range. Clearly, also, if S is invariant under T and h(x) is any polynomial, then S is invariant under h(T).

A special class of invariant subspaces are the cyclic subspaces generated by one vector, which we shall now define. Given T: V → V and a vector α in V, consider the sequence α, αT, αT^2, ... of transforms of α under successive powers of T. Since V is finite-dimensional, there is a first transform αT^d which is linearly dependent on its predecessors. We will then have

(26)  αT^d = -c0 α - c1 αT - ... - c_{d-1} αT^{d-1},

where α, αT, ..., αT^{d-1} are linearly independent. Any subspace of V which contains α and is invariant under T must contain all transforms αf(T) of α by polynomials in T. But the set Zα of all such transforms is itself an invariant subspace which contains α; we call it the T-cyclic subspace generated by α. The monic polynomial

mα(x) = x^d + c_{d-1} x^{d-1} + ... + c0

is called the T-order of α. Since αmα(T) = 0, by (26), mα(x) is the minimal polynomial for the transformation Tα induced by T on the T-cyclic subspace Zα.

Corollary. If T: V → V has the minimal polynomial m(x), then the T-order of every vector α in V is a divisor of m(x).

Note that T carries each vector of the basis α, αT, ..., αT^{d-1} of Zα into its immediate successor, except for αT^{d-1}, which is carried into

(27)  (αT^{d-1})T = -c0 α - c1 αT - ... - c_{d-1} αT^{d-1}.

Relative to this basis, Tα is thus represented by the matrix whose rows are the coordinates (0, 1, 0, ..., 0), (0, 0, 1, ..., 0), ..., (0, ..., 0, 1), (-c0, -c1, ..., -c_{d-1}) of the transforms of the basis vectors. This matrix is exactly the companion matrix of the polynomial mα(x), so that we have proved

Theorem 18. The transformation induced by T on a T-cyclic subspace Zα with T-order mα(x) can be represented by the companion matrix of mα(x).

Conversely, as in (19), the companion matrix Cf of a monic polynomial f of degree n represents a transformation T = T_{Cf}: F^n → F^n which carries each unit vector εj of F^n into the next one, and the last unit vector εn into -c0 ε1 - ... - c_{n-1} εn; the whole space F^n is a T-cyclic subspace generated by ε1, with the T-order f(x).

Theorem 19. Two vectors α and β in V span the same T-cyclic subspace Zα = Zβ if and only if β = αg(T), where the polynomial g(x) is relatively prime to the T-order mα(x) of α.

The proof is left as an exercise (Ex. 8).

Exercises

1. (a) Show that any real 2 x 2 matrix satisfying A^2 = -I is similar to the matrix with rows (0, 1), (-1, 0). Interpret geometrically.
   (b) Show that no real 3 x 3 matrix satisfies A^2 = -I.
   (c) What can be said of real 4 x 4 matrices A satisfying A^2 = -I?
show that the range of g(T) is identical with the null-space of f(T).
   (b) Show that if f(T)g(T) = 0, where f(x) and g(x) are relatively prime, then the conclusion of Theorem 17 holds, even if f(x)g(x) is not the minimal polynomial of T. (Hint: Use the result of (a).)

§10.8 First Decomposition Theorem (p. 347)

The construction used in proving Theorems 17 and 17' can be used to decompose a general linear transformation into "primary" components. In this decomposition, the concept of a direct sum of k subspaces plays a central role.

Definition. A vector space V is said to be the direct sum of its subspaces S₁, ⋯, S_k (in symbols, V = S₁ ⊕ ⋯ ⊕ S_k) when every vector ξ in V has a unique representation

(28)  ξ = η₁ + ⋯ + η_k    (η_j ∈ S_j,  j = 1, ⋯, k).

Exactly as in §7, one can prove

Theorem 20. If V is spanned by the subspaces S₁, ⋯, S_k, where each S_i is of dimension n_i and has a basis α_{i1}, ⋯, α_{in_i}, then V is the direct sum of S₁, ⋯, S_k if and only if

(29)  α_{11}, ⋯, α_{1n₁}, α_{21}, ⋯, α_{kn_k}

is a basis of V.

Corollary. The dimension of V is then the sum n₁ + ⋯ + n_k of the dimensions of the direct summands S_i.

Theorem 21. If V is the direct sum of invariant subspaces S₁, ⋯, S_k, on each of which the transformation induced by a given transformation T: V → V is represented by a matrix B_i, then T can be represented on V by the matrix

        [ B₁  0  ⋯  0  ]
(30)  B = [ 0  B₂  ⋯  0  ]
        [ ⋯            ]
        [ 0   0  ⋯  B_k ],

consisting of the blocks B₁, ⋯, B_k arranged along the diagonal, with zeros elsewhere. This matrix B is called the direct sum of the matrices B₁, ⋯, B_k.

Proof. Choose a basis α_{i1}, ⋯, α_{in_i} for each invariant subspace S_i. These basis vectors combine to yield a basis (29) for the whole space. Since each S_i is invariant, T carries the basis vectors α_{i1}, ⋯, α_{in_i} into vectors of the ith subspace, so that B_i is the matrix representing the transformation T on S_i relative to this basis; hence T is represented, relative to the basis (29), by the indicated direct sum matrix (30). Q.E.D.

Observe that any polynomial f(B) in B is the direct sum of f(B₁), ⋯, f(B_k).

A linear transformation T: V → V (or a matrix representing T) is said to be fully reducible if the space V can be represented as a direct sum of proper invariant subspaces.
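The observation that a polynomial in a direct-sum matrix is the direct sum of the polynomials in the blocks can be verified mechanically. The following Python sketch is ours, not the text's (the names `direct_sum`, `mat_mul`, and `mat_poly` are assumptions for illustration): it assembles the block matrix (30) for two blocks and checks f(B) = f(B₁) ⊕ f(B₂).

```python
def direct_sum(B1, B2):
    """Block-diagonal matrix (30): B1 and B2 along the diagonal, zeros elsewhere."""
    n1, n2 = len(B1), len(B2)
    return [row + [0] * n2 for row in B1] + [[0] * n1 + row for row in B2]

def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def mat_poly(coeffs, M):
    """Evaluate the monic polynomial x^d + c_{d-1}x^{d-1} + ... + c_0 at M (Horner)."""
    n = len(M)
    I = [[int(i == j) for j in range(n)] for i in range(n)]
    acc = I
    for c in reversed(coeffs):
        acc = mat_mul(acc, M)
        acc = [[acc[i][j] + c * I[i][j] for j in range(n)] for i in range(n)]
    return acc

B1 = [[2]]                     # a 1 x 1 block
B2 = [[0, 1], [1, 0]]          # a 2 x 2 block
B = direct_sum(B1, B2)
f = [3, 0]                     # f(x) = x^2 + 3

# f(B) is the direct sum of f(B1) and f(B2), as observed after Theorem 21.
assert mat_poly(f, B) == direct_sum(mat_poly(f, B1), mat_poly(f, B2))
```

The check succeeds because multiplication and addition of block-diagonal matrices act block by block.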
Now consider the factorization of the minimal polynomial m(x) of T as a product of powers of distinct monic polynomials p_i(x) irreducible over the base field, in the form

(31)  m(x) = p₁(x)^{e₁} ⋯ p_k(x)^{e_k}.

Since the distinct p_i(x)^{e_i} are relatively prime, repeated use of Theorem 17' will yield

Theorem 22. If the minimal polynomial of the linear transformation T: V → V has the factorization (31) into monic factors p_i(x) irreducible over the base field F, then V is the direct sum of invariant subspaces S₁, ⋯, S_k, where S_j is the null-space of p_j(T)^{e_j}. The transformation T_j induced by T on S_j has the minimal polynomial p_j(x)^{e_j}.

This is our first decomposition theorem. The subspaces S_j are called the "primary components" of V under T; they are uniquely determined by T because the factorization (31) is unique.

An important special case is the

Corollary. A matrix A with entries in F is similar over F to a diagonal matrix if and only if the minimal polynomial m(x) of A is a product of distinct linear factors over F.

Proof. If

(32)  m(x) = (x − λ₁)(x − λ₂) ⋯ (x − λ_k),    λ₁, ⋯, λ_k distinct scalars,

let T = T_A: F^n → F^n be the transformation corresponding to A. The theorem shows that V is the direct sum of spaces S_i, where S_i consists of all vectors η_i with η_iT = λ_iη_i — that is, of all eigenvectors belonging to the eigenvalue λ_i — so that the matrix representing T on S_i is λ_iI. Any basis of S_i must consist of such eigenvectors. Combining these bases as in (29), we have T represented by a diagonal matrix with the entries λ₁, ⋯, λ_k on the diagonal.

Conversely, if D is any diagonal matrix whose distinct diagonal entries are c₁, ⋯, c_k, then the transformation represented by the product f(D) = (D − c₁I) ⋯ (D − c_kI) carries each basis vector into 0, and consequently f(D) = 0. The minimal polynomial of D — and of any other matrix similar to D — is therefore a factor of the product (x − c₁) ⋯ (x − c_k), and hence is a product of distinct linear factors.
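The converse computation in the Corollary's proof can be confirmed by a short machine check. The Python sketch below is ours (the helpers `diag` and `mat_mul` are assumptions, not the book's notation): it forms f(D) = (D − c₁I)⋯(D − c_kI), with one factor per distinct diagonal entry, and verifies that the product is the zero matrix.

```python
def diag(entries):
    n = len(entries)
    return [[entries[i] if i == j else 0 for j in range(n)] for i in range(n)]

def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

entries = [2, 2, 5, 7]          # repeated entries allowed; only distinct ones matter
D = diag(entries)
n = len(D)
I = [[int(i == j) for j in range(n)] for i in range(n)]

prod = I
for c in sorted(set(entries)):  # one factor (D - cI) per *distinct* diagonal entry
    factor = [[D[i][j] - c * I[i][j] for j in range(n)] for i in range(n)]
    prod = mat_mul(prod, factor)

assert prod == [[0] * n for _ in range(n)]   # f(D) = 0
```

Each basis vector is annihilated by the factor belonging to its own diagonal entry, which is exactly the argument of the proof.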
Exercises

1. Show that the minimal polynomial of a matrix A can be factored into linear factors if and only if the characteristic polynomial of A can be so factored.
2. Prove Theorem 20.
*3. Give a direct proof of Theorem 22, not using Theorem 17', considering first the case m(x) = p(x)^e, where p(x) is irreducible.
4. In Theorem 22, set q_i(x) = m(x)/p_i(x)^{e_i}, and prove that the subspace S_i there is the range of q_i(T).
5. If the n × n matrix A is similar to a diagonal matrix D, prove that the number of times an entry λ_i occurs on the diagonal of D is equal to the dimension of the set of eigenvectors belonging to the eigenvalue λ_i.
6. Prove: the minimal polynomial of the direct sum of two matrices B₁ and B₂ is the least common multiple of the minimal polynomials of B₁ and B₂.
*7. Let A be a complex matrix whose minimal polynomial m(x) = (x − λ₁)^{e₁} ⋯ (x − λ_r)^{e_r} equals its characteristic polynomial. Show that A is similar to a direct sum of r triangular e_i × e_i matrices B_i of the form sketched below:

        [ λ_i            0  ]
  B_i = [ 1   λ_i           ]
        [     ⋱    ⋱        ]
        [ 0        1   λ_i  ]

*8. Prove that if m(x) is the minimal polynomial for T, there exists, for given T: V → V, a vector α with T-order exactly m(x).

§10.9 Second Decomposition Theorem (p. 349)

We shall show below that each "primary" component S_j of a linear transformation T: V → V is itself a direct sum of T-cyclic subspaces. In proving this, we shall use the concept of the quotient-space V/S of a vector space V by a subspace S. We recall (§7.12) that the elements of this quotient-space V' = V/S are the cosets ξ + S of S, and that the projection P: V → V/S = V' given by ξP = ξ + S is a linear transformation. In particular, if the subspace S is invariant under T, we define a transformation T' of V' by the formula

(33)  (ξ + S)T' = ξT + S.
The coset ξT + S does not depend upon the choice of the representative ξ of ξ' = ξ + S; for if another representative η = ξ + σ were chosen, then ηT + S = ξT + σT + S = ξT + S, since σ ∈ S implies σT ∈ S. Hence the transformation T': V' → V' defined by (33) is single-valued, and it is easily verified that T' is also linear. We call T' the transformation of V/S = V' induced by T. Moreover, (33) gives, with the formulas of §7.12,

(34)  (ξ + S)f(T') = ξf(T) + S

for any polynomial f(T) in T, so that the T'-order of ξ' in V' divides the T-order of ξ in V. In particular, f(T) = 0 implies f(T') = 0' in V'; hence the minimal polynomial of T' on V' is a divisor of p(x)^e.

We are now ready to prove the second decomposition theorem.

Theorem 23. If the linear transformation T: V → V has a minimal polynomial m(x) = p(x)^e which is a power of a monic polynomial p(x) irreducible over the field F of scalars of V, then V is the direct sum

(35)  V = Z₁ ⊕ ⋯ ⊕ Z_r

of T-cyclic subspaces Z_i with the respective T-orders

(36)  p(x)^{e₁}, p(x)^{e₂}, ⋯, p(x)^{e_r},    e = e₁ ≥ e₂ ≥ ⋯ ≥ e_r.

Any representation of V as a direct sum of T-cyclic subspaces has the same number of component subspaces and the same set (36) of T-orders.

Proof. The existence of the direct sum decomposition will be established by induction on the dimension n of V. In case n = 1, V is itself a cyclic subspace, and the result is immediate. For n > 1, since p(T)^e = 0 but p(T)^{e−1} ≠ 0, V contains a vector α₁ with α₁p(T)^{e−1} ≠ 0. The T-order of α₁ is therefore p(x)^e, and α₁ generates a T-cyclic subspace Z₁. Since Z₁ is invariant under T, T induces a linear transformation T' on V' = V/Z₁. Since evidently p(T')^e = 0', we can use induction on d[V/Z₁] = d[V] − d[Z₁] to decompose V/Z₁ into a direct sum of T'-cyclic subspaces Z₂', ⋯, Z_r', for which the T'-orders are

p(x)^{e₂} ≥ ⋯ ≥ p(x)^{e_r},    e ≥ e₂ ≥ ⋯ ≥ e_r.

Since the T-order of any representative α_i of α_i' = α_i + Z₁ is a multiple of the T'-order p(x)^{d_i} of α_i', it is sufficient to note the following lemma.
Lemma 1. If α_i' generates the T'-cyclic subspace Z_i', then the coset α_i' contains a representative α_i whose T-order is the T'-order of α_i'.

Proof. Let p(x)^d be the T'-order of α_i'. The proof depends on the fact that the T-order p(x)^e of α₁ is a multiple of the T-order of every element of V. For any representative η of α_i', ηp(T)^d lies in Z₁, say ηp(T)^d = α₁f(T), in the T-cyclic subspace generated by α₁. Since ηp(T)^e = 0, this implies that p(x)^e | f(x)p(x)^{e−d}, and hence that f(x) = g(x)p(x)^d for some polynomial g(x). We shall now show that α_i = η − α₁g(T) has a T-order p(x)^d equal to the T'-order of α_i'. Indeed, α_ip(T)^d = ηp(T)^d − α₁g(T)p(T)^d = 0, while the T-order of α_i is a multiple of the T'-order p(x)^d of its coset α_i', as required.

Having proved Lemma 1, we let Z_i be the T-cyclic subspace generated by the representative α_i given by the lemma. Any vector of V' has a unique representation in terms of the Z_i'; it follows that the subspaces Z₁, Z₂, ⋯, Z_r span V. Moreover d[Z_i] = d[Z_i'] for i = 2, ⋯, r, since both dimensions are equal to the degree of the common T-order p(x)^{e_i} of α_i and α_i'. Hence

(37)  d[V] − d[Z₁] = d[V/Z₁] = d[Z₂] + ⋯ + d[Z_r],

and by (37) and the Corollary to Theorem 20, choosing bases, it follows that V is the direct sum V = Z₁ ⊕ ⋯ ⊕ Z_r, as asserted.

It remains to prove the uniqueness of the exponents appearing in any decomposition (36); it will suffice to show that these exponents are determined by T and V. This will be done by the computation of the dimensions of certain subspaces. Observe that for any integer s, the image Z_ip(T)^s of Z_i under p(T)^s is the cyclic subspace generated by β_i = α_ip(T)^s. If d denotes the degree of p(x), it has dimension d(e_i − s) if e_i > s, and dimension 0 if e_i ≤ s.
Any vector ξ of V has a unique representation ξ = η₁ + ⋯ + η_r (η_i ∈ Z_i, i = 1, ⋯, r); hence any vector of the image Vp(T)^s of V under p(T)^s has a unique representation

(38)  ξp(T)^s = η₁p(T)^s + ⋯ + η_rp(T)^s,

with components η_ip(T)^s in the spaces Z_ip(T)^s. Hence Vp(T)^s is the direct sum of the cyclic subspaces Z_{β_i} generated by the β_i = α_ip(T)^s, and, by (38), its dimension is

(39)  d[Vp(T)^s] = d[(e₁ − s) + ⋯ + (e_t − s)],

where the integer s determines the integer t such that e_i > s for i = 1, ⋯, t (t = r if e_r > s). The dimensions on the left are determined by V and T; they, in turn, determine the e_i in succession, as follows. First take s = e − 1 = e₁ − 1; then (39) determines the number of e_i equal to e. Next take s = e − 2; then (39) determines the number of e_i (if any) equal to e − 1, and so forth. This proves the invariance of the exponents e₁, ⋯, e_r and completes the proof of Theorem 23.

Exercises

1. Prove in detail that T': V/Z → V/Z is linear if T: V → V is linear and the subspace Z is invariant under T.
2. Show that if a vector space V is spanned by the T-cyclic subspaces generated by vectors α₁, ⋯, α_k, then the minimal polynomial of T is the l.c.m. of the T-orders of the α_i.
3. Prove, by (38), that Z₁, ⋯, Z_r span V.
4. Find the minimal polynomial of the matrix B of §8, Ex. …

§10.10 Rational and Jordan Canonical Forms (p. 352)

Using Theorems 20 and 23, it is easy to obtain canonical forms for matrices under similarity: one only needs to give a canonical form for transformations on cyclic subspaces! One such form is provided by Theorem 18. If A is any n × n matrix, then a suitable choice of basis in each cyclic subspace represents T_A on that subspace by a companion matrix. Combining all these bases yields a basis of F^n, with respect to which T_A is represented by the direct sum of these companion matrices. Recall also (Theorem 15) that, except for sign, the characteristic polynomial of a companion matrix C_f is f(x), and that the minimal polynomial m(x) divides the characteristic polynomial.
The representation of A as this direct sum of companion matrices is called the primary rational canonical form of A ("primary" because powers of irreducible polynomials are used, and "rational" because the analysis involves only rational operations in the field F). We have proved

Theorem 24. Any matrix A with entries in a field F is similar over F to one and only one direct sum of companion matrices of polynomials

(40)  p_i(x)^{e_{i1}}, ⋯, p_i(x)^{e_{ir_i}},    e_{i1} ≥ e_{i2} ≥ ⋯ ≥ e_{ir_i} > 0,

which are powers of monic irreducible polynomials p₁(x), ⋯, p_k(x).

The uniqueness assertions of Theorems 20 and 23 show that the set of companion matrices so obtained is uniquely determined by A. The set of polynomials (40), which is a complete set of invariants of A under similarity (over F), is called the set of elementary divisors of A. The minimal polynomial of A is m(x) = p₁(x)^{e₁₁}p₂(x)^{e₂₁} ⋯ p_k(x)^{e_{k1}}.

Corollary 1. The characteristic polynomial of an n × n matrix A is (−1)^n times the product of the elementary divisors of A.

Proof. It is readily seen that the characteristic polynomial of a direct sum of matrices B₁, ⋯, B_q is the product of the characteristic polynomials of the B_j. But, by Theorem 15, the characteristic polynomial of a companion matrix C_f is, except for sign, f(x). These two facts prove Corollary 1.

Corollary 2. The eigenvalues of a square matrix are the roots of its minimal polynomial.

Proof. By Corollary 1, any root of the characteristic polynomial must be a root of one of the elementary divisors p_i(x)^{e_{ij}}, hence by the theorem is a root of m(x). Conversely, any root of the minimal polynomial is a root of the characteristic polynomial, hence a characteristic root (eigenvalue).

EXAMPLE. Any 6 × 6 rational matrix with minimal polynomial (x + 3)²(x² + 1) is similar to one of the following direct sums of
companion matrices:

  C_{(x²+1)} ⊕ C_{(x²+1)} ⊕ C_{(x+3)²},
  C_{(x²+1)} ⊕ C_{(x+3)²} ⊕ C_{(x+3)²},
  C_{(x²+1)} ⊕ C_{(x+3)²} ⊕ C_{(x+3)} ⊕ C_{(x+3)}.

In the first case the characteristic polynomial is (x² + 1)²(x + 3)²; in the second and third cases, it is (x² + 1)(x + 3)⁴.

Over the field of complex numbers — or, more generally, for any matrix whose minimal polynomial is a product of powers of linear factors — a different canonical form can be constructed. In this case, the only monic irreducible polynomials are the linear polynomials x − λ; hence each T-cyclic subspace Zα in Theorem 23 will have the T-order (x − λ_i)^e, with λ_i a scalar. Relative to the basis α, αT, ⋯, αT^{e−1} of Zα, T is represented as in Theorem 24 by the companion matrix of (x − λ_i)^e. Instead, consider the vectors β₁ = α, β₂ = αU, ⋯, β_e = αU^{e−1}, where U = T − λ_iI. Since each β_j is αT^{j−1} plus some linear combination of vectors αT^k with k < j − 1, the vectors β₁, ⋯, β_e also constitute a basis of Zα.

To obtain the effect of T upon the β_j, observe that, for j < e,

  β_jT = αU^{j−1}T = αU^{j−1}(U + λ_iI) = λ_iβ_j + β_{j+1};

if j = e, then αU^e = 0, and β_eT = λ_iβ_e. Now T is represented relative to this basis by the matrix whose rows are the coordinates of the β_jT:

  [ λ_i  1              0  ]
  [      λ_i  1            ]
  [           ⋱    ⋱       ]
  [                λ_i  1  ]
  [ 0                  λ_i ]

This is a matrix with entries λ_i on the principal diagonal, entries 1 on the diagonal next above the principal diagonal, and all other entries zero. Call such a matrix an elementary Jordan matrix. Using this observation, if we use the above type of basis instead of that leading to the companion matrix in Theorem 24, we obtain

Theorem 25. If the minimal polynomial for the matrix A over the field F is a product of linear factors

(41)  m(x) = (x − λ₁)^{e₁} ⋯ (x − λ_k)^{e_k},
with λ₁, ⋯, λ_k distinct, then A is similar over F to one and only one direct sum of elementary Jordan matrices, which include at least one e_i × e_i elementary Jordan matrix belonging to the characteristic root (eigenvalue) λ_i, and no larger elementary Jordan matrix belonging to the characteristic root (eigenvalue) λ_i.

The resulting direct sum of elementary Jordan matrices, which is unique except for the order in which these blocks are arranged along the diagonal, is called the Jordan canonical form of A. Note that the number of occurrences of λ_i on the diagonal is the multiplicity of λ_i as a root of the characteristic polynomial of A. Note also that the Jordan canonical form is determined by the set of elementary divisors and, in particular, that if all the e_i in (41) are 1, the Jordan canonical form is a diagonal matrix with the λ_i as the diagonal entries; then, and only then, A is similar to a diagonal matrix. Part of the Corollary of Theorem 22 is thus included as a special case.

Corollary. Any complex matrix is similar to a matrix in Jordan canonical form.

This holds because Theorem 25 applies to any matrix over the field of complex numbers.

Exercises

1. Exhibit all possible Jordan canonical forms for matrices with each of the following characteristic polynomials:
(a) (x − λ₁)³(x − λ₂)⁴;  (b) (x − λ₁)⁴(x − λ₂)²;  (c) (x − λ₁)(x − λ₂)²(x − λ₃)².
2. Express in primary rational canonical form the elementary Jordan matrix displayed in the text.
3. Find all possible primary rational canonical forms over the field of rational numbers for the matrices described as follows:
(a) 5 × 5, minimal polynomial (x − 2)(x − 1);
(b) 7 × 7, minimal polynomial (x² + 4)²(x + 8)²;
(c) 8 × 8, characteristic polynomial (x⁴ − 1)(x² − 1)²;
(d) 6 × 6, minimal polynomial (x² − 1).
4. (a) Show that a complex matrix and its transpose necessarily have the same Jordan canonical form.
   (b) Infer that they are always similar.
5. (a) Two of the Pauli "spin matrices" satisfy ST = −TS, S² = T² = I, and are hermitian. Prove that U = iST is hermitian and satisfies TU = −UT, U² = I.
   (b) Show that, with suitable coordinates, S is similar to
       [ 1   0 ]            [ 0    b ]
       [ 0  −1 ]   and T to [ b⁻¹  0 ],  for some b.
*6. Using the methods of §10.9, show that any linear transformation T: V → V decomposes V into a direct sum of T-cyclic subspaces with T-orders f₁(x), ⋯, f_r(x), where f_i(x) | f_{i−1}(x) for i = 2, ⋯, r, and f₁(x) is the minimal polynomial of T.

11. Boolean Algebras and Lattices

11.1 Basic Definition

We will now analyze more closely, from the standpoint of modern algebra, the fundamental notions of "set" (or class) and "subset," briefly introduced in §1. Suppose that I is any set, while X, Y, Z, ⋯ denote subsets of I. Thus I might be a square, and X, Y, Z three overlapping congruent disks located in I, as in the "Venn diagram" of Figure 1.

One writes X ⊂ Y (or Y ⊃ X) whenever X is a subset of Y — i.e., whenever every element of X is in Y. This relation is also expressed by saying that X is "contained" or "included" in Y.

The relation of inclusion is reflexive: trivially, any set X is a subset of itself. But the inclusion relation is not symmetric. On the contrary, it is antisymmetric: if X ⊂ Y and Y ⊂ X, then X and Y must contain exactly the same elements, so that X = Y. It is also transitive: if every element of X is in Y and every element of Y is in Z, then clearly every element of X is in Z; that is, if X ⊂ Y and Y ⊂ Z, then X ⊂ Z.

In summary, the inclusion relation for sets shares with the inequality relation of arithmetic the following properties:

  Reflexive:      For all X, X ⊂ X.
  Antisymmetric:  If X ⊂ Y and Y ⊂ X, then X = Y.
  Transitive:     If X ⊂ Y and Y ⊂ Z, then X ⊂ Z.
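The three order laws just listed can be checked mechanically on a small universe. The following Python sketch is ours, not the text's: using Python's built-in subset test `<=` for the relation ⊂, it verifies reflexivity, antisymmetry, and transitivity over all subsets of a three-element set I.

```python
from itertools import combinations

I = {1, 2, 3}
subsets = [frozenset(c) for r in range(len(I) + 1) for c in combinations(I, r)]

# Reflexive: X ⊂ X for all X.
assert all(X <= X for X in subsets)

# Antisymmetric: X ⊂ Y and Y ⊂ X imply X = Y.
assert all(not (X <= Y and Y <= X) or X == Y
           for X in subsets for Y in subsets)

# Transitive: X ⊂ Y and Y ⊂ Z imply X ⊂ Z.
assert all(not (X <= Y and Y <= Z) or X <= Z
           for X in subsets for Y in subsets for Z in subsets)
```

With 8 subsets there are only 8³ = 512 triples to examine, so the exhaustive check is immediate; incomparable pairs such as {1} and {2} show that the relation is not a total order.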
It is not true, however, that, given two sets X and Y, either X ⊂ Y or Y ⊂ X. There are, therefore, four possible ways in which two sets X and Y can be related by inclusion. It may be that X ⊂ Y and Y ⊂ X, in which case, by antisymmetry, X = Y. We can have X ⊂ Y but not Y ⊂ X, in which case we say that X is properly contained in Y, and we write X < Y or Y > X. We can have Y ⊂ X but not X ⊂ Y, in which case X properly contains Y. And finally, we may have neither X ⊂ Y nor Y ⊂ X, in which case X and Y are said to be incomparable. It is principally the existence of incomparable sets which distinguishes the inclusion relation from the inequality relation between real numbers.

The subsets of a given set I are not only related by inclusion; they can also be combined by two binary operations of "union" and "intersection," analogous to ordinary "plus" and "times." The extent and importance of this analogy were first clearly recognized by the British mathematician George Boole (1815–64), who founded the algebraic theory of sets little more than a century ago.

Given two sets X and Y, we define the union of X and Y (in symbols, X ∪ Y) as the set of all elements in either X or Y, or both, and we define the intersection of X and Y (written X ∩ Y) as the set of all elements in both X and Y. The symbols ∩ and ∪ are called "cap" and "cup," respectively. Further, we write X' (read, "the complement of X") to signify the set of all elements not in X. This is because we are considering only subsets of I; thus I' is the empty set ∅, which contains no elements at all!

The operations of the algebra of classes can be illustrated graphically by means of the Venn diagram of Figure 1. In this diagram, X, Y, and Z are the interiors of the three overlapping disks, and Y' is the exterior of Y. The disks X, Y, and Z cut the square I into eight nonoverlapping areas, and combinations of these regions can be depicted by shading appropriate areas: thus, in the figure, X ∩ (Y' ∪ Z) is the shaded area.

Exercises

1. On a Venn diagram shade each of the following areas: (X' ∩ Y) ∪ (X ∩ Z'), (X ∪ Y') ∪ Z', (X ∪ Y)' ∩ Z.
2. By shading each of the appropriate areas on a Venn diagram, determine which of the following equations are valid:
(a) (X' ∪ Y)' = X ∩ Y';
(b) X' ∪ Y' = (X ∪ Y)';
(c) (X ∪ Y) ∩ Z = (X ∪ Z) ∩ Y;
(d) X ∪ (Y ∩ Z) = (X ∪ Y') ∩ Z'.
3. Label each of the eight nonoverlapping areas of Figure 1 by an algebraic combination of X, Y, and Z which represents exactly that area.
11.2 Laws: Analogy with Arithmetic

The analogy between the algebra of sets and ordinary arithmetic will now be described in some detail. First, we have the following special properties of ∅ and I, the void (empty) set being denoted by ∅:

  Universal bounds:  ∅ ⊂ X ⊂ I for all X.
  Intersection:      ∅ ∩ X = ∅ and I ∩ X = X.
  Union:             ∅ ∪ X = X and I ∪ X = I.

The first three intersection and union properties are analogous to properties of 0 and 1 in ordinary arithmetic, as postulated in Chap. 1.

Second, the analogy between ∩, ∪ and ordinary ·, + is in part described by the following laws:

  Idempotent:   X ∩ X = X and X ∪ X = X.
  Commutative:  X ∩ Y = Y ∩ X and X ∪ Y = Y ∪ X.
  Associative:  X ∩ (Y ∩ Z) = (X ∩ Y) ∩ Z and X ∪ (Y ∪ Z) = (X ∪ Y) ∪ Z.
  Distributive: X ∩ (Y ∪ Z) = (X ∩ Y) ∪ (X ∩ Z) and X ∪ (Y ∩ Z) = (X ∪ Y) ∩ (X ∪ Z).

Clearly, all of these except for the idempotent laws and the second distributive law correspond to familiar properties of + and ·.

Third, intersection and union are related to each other and to inclusion by a fundamental law of

  Consistency: The three conditions X ⊂ Y, X ∩ Y = X, and X ∪ Y = Y are mutually equivalent.

Finally, complementation is related to intersection and union by three new laws:

  Complementarity: X ∩ X' = ∅ and X ∪ X' = I.
  Dualization:     (X ∩ Y)' = X' ∪ Y' and (X ∪ Y)' = X' ∩ Y'.
  Involution:      (X')' = X.
The truth of the above laws can be established in various ways. First, we can use the verbal definitions of the operations cup and cap to reformulate the laws. Consider the distributive law X ∩ (Y ∪ Z) = (X ∩ Y) ∪ (X ∩ Z). Here "b in X ∩ (Y ∪ Z)" means "b is both in X and in Y or Z," while "b in (X ∩ Y) ∪ (X ∩ Z)" means "b is either in both X and Y or in both X and Z." A little reflection convinces one that these two statements are equivalent, according to the ordinary usage of the connectives "and" and "either · · · or." Such an argument is convincing to our common sense, but it is not permissible technically, since only deductive proofs are allowed in mathematical reasoning. If one assumes these properties of the words "and," "or," and "not" as basic, as one normally does in mathematical reasoning, one can then prove from them all the above laws for classes. This verification of the distributive law may indicate how the laws of the algebra of classes are paraphrases of the properties of the words "and," "or," and "not."

Second, we can test the laws in particular examples. Appropriate examples are furnished by the Venn diagrams. For example, if X and Y are the respective interiors of the left- and right-hand circles in Figure 2, then the area X' is shaded by horizontal lines and Y' by vertical lines. The cross-hatched area is then just the intersection X' ∩ Y', and the figure shows at once that this area is the complement of the sum X ∪ Y, as asserted by the second dualization law (Figure 2).

Third, we can consider separately each of the possible cases for an element of I: first, an element b in X and in Y; second, an element in X but not in Y; and so on. Note that for two classes X and Y, all four possible cases for an element are represented by points in the four areas of the Venn diagram, while for three classes there are eight cases and eight areas (Figure 1). For example, an element of the first type is in X ∩ Y, hence not in (X ∩ Y)' and not in X' ∪ Y', while an element of the second type is in (X ∩ Y)' and in Y', hence in X' ∪ Y'. By looking at the other two cases also, one sees that (X ∩ Y)' and X' ∪ Y' have the same elements, thus verifying the law by "induction" over cases.
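The third method — subdivision into cases — is entirely mechanical, so a machine can carry it out exhaustively. The following Python sketch is ours, not the text's: instead of four or eight cases, it simply checks the distributive law and the second dualization law for every choice of subsets of a three-element universe, which covers all possible case patterns.

```python
from itertools import combinations

U = frozenset({0, 1, 2})
subsets = [frozenset(c) for r in range(len(U) + 1) for c in combinations(U, r)]

def comp(X):
    """Complement relative to the universe U."""
    return U - X

for X in subsets:
    for Y in subsets:
        # Second dualization (De Morgan) law: (X ∪ Y)' = X' ∩ Y'.
        assert comp(X | Y) == comp(X) & comp(Y)
        for Z in subsets:
            # Distributive law: X ∩ (Y ∪ Z) = (X ∩ Y) ∪ (X ∩ Z).
            assert X & (Y | Z) == (X & Y) | (X & Z)
```

Since every element of U independently lies in or out of each of X, Y, Z, the 8³ triples of subsets realize every combination of the eight element-cases, so the exhaustive loop is a complete case analysis.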
Exercises

1. Show that the intersection and union properties of ∅ and I may be derived from the universal bounds property and the consistency principle.
2. Use Venn diagrams to verify the distributive laws.
3. Use the method of subdivision into cases to verify the associative, commutative, and consistency laws.
4. (a) In treating an algebraic expression in four sets by considering all possible cases for their elements, how many cases occur?
   *(b) Draw a diagram for four sets which shows every one of these possible cases as a region.
   (c) Show that no such diagram can be made in which the four given sets are discs.
5. Reformulate the laws of complementarity, dualization, and involution in terms of "and," "or," and "not."
6. Prove that for arbitrary subsets X, Y of I, X ⊂ Y if and only if X' ∪ Y = I.
7. Of the six implications obtained by replacing "equivalent" by "if" and "only if" in the consistency postulate, show that four hold for real numbers x, y if 0 < x, y < 1.
8. Which of the intersection and union properties of ∅, I fails if ∅ is replaced by the number 0, and I by 1? Which of the properties of complements fails if X' is replaced by 1 − x and Y' by 1 − y?

11.3 Boolean Algebra

We will not concern ourselves further with deriving the preceding algebraic laws from the fundamental principles of logic. Instead, we will simply assume the most basic of these laws as postulates, as was done in Chap. 1 for the laws of arithmetic, and then deduce from these postulates as many interesting consequences as possible. Accordingly, we now lay down our basic definition, using a slightly different notation to emphasize the fact that the postulates assumed may apply to other things than sets.

Definition. A Boolean algebra is a set B of elements a, b, c, ⋯ with the following properties:

(i) B has two binary operations, ∧ (wedge) and ∨ (vee), which satisfy the idempotent laws

  a ∧ a = a ∨ a = a,
the commutative laws

  a ∧ b = b ∧ a,    a ∨ b = b ∨ a,

and the associative laws

  a ∧ (b ∧ c) = (a ∧ b) ∧ c,    a ∨ (b ∨ c) = (a ∨ b) ∨ c.

(ii) These operations satisfy the absorption laws

  a ∧ (a ∨ b) = a ∨ (a ∧ b) = a.

(iii) These operations are mutually distributive:

  a ∧ (b ∨ c) = (a ∧ b) ∨ (a ∧ c),    a ∨ (b ∧ c) = (a ∨ b) ∧ (a ∨ c).

(iv) B contains universal bounds 0, I which satisfy

  0 ∧ a = 0,    0 ∨ a = a,    I ∧ a = a,    I ∨ a = I.

(v) B has a unary operation a → a' of complementation, which obeys the laws

  a ∧ a' = 0,    a ∨ a' = I.

It is understood that the preceding laws are assumed to hold for all a, b, c ∈ B. Using this definition, the conclusions of §§11.1–11.2 can be summarized in the following statement.

Theorem 1. Under intersection, union, and complement, the subsets of any set I form a Boolean algebra, the void set being denoted by 0 and I taken as the whole space.

To illustrate more selectively the significance of the preceding postulates, we now describe examples in which some, but not all, of them hold.

EXAMPLE 1. Let L have as "elements" the subspaces of the n-dimensional Euclidean vector space of §7. Define S ∧ T = S ∩ T as the intersection of S and T, S ∨ T = S + T as their linear sum, 0 as the null subspace, I as the whole space, and S' as the orthogonal complement S⊥ of the subspace S. Then postulates (i), (ii), (iv), and (v) are satisfied, although the distributive laws (iii) are not. (Let S, T, U be the subspaces of the plane spanned by (1, 0), (0, 1), and (1, 1), respectively, for example.)
EXAMPLE 2. Let L have as "elements" the normal subgroups M, N, ⋯ of a finite group G. Let M ∧ N = M ∩ N be the intersection of M and N, while M ∨ N = MN is the set of all products xy (x ∈ M, y ∈ N). Then M ∧ N and M ∨ N are normal subgroups of G. If 0 denotes the group identity 1, and I is G itself, then postulates (i), (ii), and (iv) are satisfied, although in general (iii) and (v) are not.

Since the systems constructed in Examples 1 and 2 satisfy postulates (i) and (ii), they are lattices in the following sense.

Definition. A lattice is a set L of elements, with two binary operations† ∧ and ∨ which are idempotent, commutative, and associative and which satisfy the absorption law (ii). If in addition the distributive laws (iii) hold, then L is called a distributive lattice.

† The wedge operation ∧ is also called meet, and the vee operation ∨ called join; we will use these names interchangeably.

For example, the set of positive integers is a distributive lattice, with m ∧ n the greatest common divisor of m and n, and m ∨ n their least common multiple. Again, if the void set ∅ is included, and sets of zero area are neglected, then the set of all polygonal domains is a distributive lattice.

The various laws postulated above have many interesting algebraic consequences, of which we will now derive a few of the simplest. The effect of the idempotent laws is clearly to permit elimination of repeated occurrences of the same term — all but one of the occurrences of a given term can be deleted. The effect of the associative and commutative laws has already been studied in §1.5: in conjunction with the idempotent laws, they show that we can permute terms in any way we like in an expression involving only vees, and the same holds for expressions involving only wedges. In summary, just as in §1.5, we have

Lemma 1. Let f and g be two expressions formed from the letters a₁, ⋯, a_n using only vees ∨ and using all these letters (possibly with some repetitions). Then the idempotent, commutative, and associative laws imply that f = g.

The associative law means essentially that we can form multiple intersections or unions without using parentheses. Hence, if N is the set of subscripts i = 1, 2, ⋯, n, we can without ambiguity write

  ⋀_{i=1}^{n} a_i  or  ⋀_N a_i,    ⋁_{i=1}^{n} a_i  or  ⋁_N a_i

to denote the meet and join of all the a_i, just as in §1.5.
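The gcd/lcm example of a distributive lattice lends itself to a quick machine check. The Python sketch below is ours, not the text's (the helper `lcm` is defined for portability): it spot-checks the absorption laws and the distributive law m ∧ (a ∨ b) = (m ∧ a) ∨ (m ∧ b), with ∧ = gcd and ∨ = lcm, over a range of positive integers.

```python
from math import gcd

def lcm(a, b):
    return a * b // gcd(a, b)

nums = range(1, 31)
for a in nums:
    for b in nums:
        # Absorption laws (ii):
        assert gcd(a, lcm(a, b)) == a      # a ∧ (a ∨ b) = a
        assert lcm(a, gcd(a, b)) == a      # a ∨ (a ∧ b) = a
        for c in nums:
            # Distributive law (iii):
            assert gcd(a, lcm(b, c)) == lcm(gcd(a, b), gcd(a, c))
```

The loop is only a spot-check over 1, ⋯, 30, not a proof; the full statement follows from unique factorization, since gcd takes the minimum and lcm the maximum of each prime exponent, and min(x, max(y, z)) = max(min(x, y), min(x, z)) for integers.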
These notations are analogous to the Σ, Π notations of algebra.

Starting from the commutative, associative, and distributive laws, we can derive by induction generalized distributive laws such as the following:

  x ∧ (y₁ ∨ ⋯ ∨ y_n) = (x ∧ y₁) ∨ ⋯ ∨ (x ∧ y_n),
  x ∨ (y₁ ∧ ⋯ ∧ y_n) = (x ∨ y₁) ∧ ⋯ ∧ (x ∨ y_n),
  (x₁ ∨ ⋯ ∨ x_m) ∧ (y₁ ∨ ⋯ ∨ y_n) = (x₁ ∧ y₁) ∨ (x₁ ∧ y₂) ∨ ⋯ ∨ (x_m ∧ y_n).

Exercises

1. Prove in detail that Example 1 defines a lattice which is not distributive if n > 1.
*2. Prove that Example 2 defines a lattice which is distributive if G is cyclic.
3. Prove that the lattice of all subgroups of the four-group is not distributive.
4. Use induction to prove in detail that

  (a) x ∧ ⋁_{i=1}^{n} y_i = ⋁_{i=1}^{n} (x ∧ y_i),  and (b) dually,

in any distributive lattice.
*5. Prove in detail by induction that, in any distributive lattice, (x₁ ∨ ⋯ ∨ x_m) ∧ (y₁ ∨ ⋯ ∨ y_n) = ⋁ (x_i ∧ y_j).

11.4 Deduction of Other Basic Laws

We now show that the postulates for Boolean algebra listed above imply the other basic formulas of the algebra of classes which were discussed in §§11.1–11.2. For instance, they imply the uniqueness of 0 and I.

Lemma 2. In any Boolean algebra, each of the identities a ∧ x = a and a ∨ x = x (for all x) implies that a = 0; dually, each of the identities a ∨ x = a and a ∧ x = x implies that a = I.

The definition of a Boolean algebra given above did not mention the inclusion relation, which we did not postulate, even though this is the most fundamental concept of all. We shall define this relation below and deduce its basic properties from the postulates stated above.
The power of the absorption laws was exhibited above in the proofs of Lemmas 2 and 3. Actually, the absorption, commutative, and associative laws imply the idempotent laws; the latter are redundant in the definition of a lattice.

Lemma 4. In any lattice, x ∧ x = x and x ∨ x = x follow from the absorption laws.

Proof. Setting z = x ∧ y in the absorption law x = x ∧ (x ∨ z), we infer x = x ∧ (x ∨ (x ∧ y)) for all x, y. Applying the dual absorption law x ∨ (x ∧ y) = x, we conclude x = x ∧ x (one idempotent law). The proof that x ∨ x = x is similar.

Likewise, in the definition of a Boolean algebra, conditions (iv) can be replaced by either of the following postulates:

(iv′) For all x, 0 ∨ x = x and I ∧ x = x.
(iv″) For all x, x ∧ 0 = 0 and x ∨ I = I.

Lemma 5. In any distributive lattice, a ∨ x = a ∨ y and a ∧ x = a ∧ y together imply x = y.

Proof. By substitution of equals for equals, and the absorption and distributive laws, we have successively

x = x ∧ (x ∨ a) = x ∧ (y ∨ a) = (x ∧ y) ∨ (x ∧ a) = (y ∧ x) ∨ (y ∧ a) = y ∧ (x ∨ a) = y ∧ (y ∨ a) = y.

Now recall that the operation a ↦ a′ of complementation satisfies

a ∧ a′ = 0 and a ∨ a′ = I.

But any element x with a ∧ x = 0 and a ∨ x = I must, by Lemma 5, satisfy x = a′. In other words, in any Boolean algebra the complement a′ is uniquely determined by the complementation laws (v) in the definition of a Boolean algebra. We now show that the remaining properties of set-complements also hold in any Boolean algebra.

Lemma 6. In any Boolean algebra,

(1) (x′)′ = x, (x ∧ y)′ = x′ ∨ y′, (x ∨ y)′ = x′ ∧ y′.

Proof. The statement that x′ is a complement of x implies by the commutative law that x is a complement of x′, since x′ ∧ x = x ∧ x′ = 0 and x′ ∨ x = x ∨ x′ = I. But we have just seen that complements are unique; hence x is the unique complement of x′, and (x′)′ = x. Again, by the distributive laws,

(x ∧ y) ∧ (x′ ∨ y′) = (x ∧ y ∧ x′) ∨ (x ∧ y ∧ y′) = ((x ∧ x′) ∧ y) ∨ (x ∧ 0) = (0 ∧ y) ∨ 0 = 0,

(x ∧ y) ∨ (x′ ∨ y′) = (x ∨ x′ ∨ y′) ∧ (y ∨ x′ ∨ y′) = (I ∨ y′) ∧ (I ∨ x′) = I ∧ I = I.

This shows that x′ ∨ y′ is a complement of x ∧ y. Hence, by the uniqueness of complements, (x ∧ y)′ = x′ ∨ y′.
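The laws just derived can be tested by brute force in the concrete Boolean algebra of all subsets of a three-element set. The Python sketch below (an added illustration, not part of the original text) verifies both De Morgan laws and the uniqueness of complements:

```python
from itertools import combinations

# The Boolean algebra of all subsets of I = {0, 1, 2}.
I = frozenset({0, 1, 2})
algebra = [frozenset(s) for r in range(4) for s in combinations(I, r)]

def complement(x):
    return I - x

for x in algebra:
    for y in algebra:
        # De Morgan laws: (x ∧ y)' = x' ∨ y' and (x ∨ y)' = x' ∧ y'
        assert complement(x & y) == complement(x) | complement(y)
        assert complement(x | y) == complement(x) & complement(y)

# Uniqueness of complements: only x' satisfies x ∧ a = 0 and x ∨ a = I.
for x in algebra:
    comps = [a for a in algebra if (x & a) == frozenset() and (x | a) == I]
    assert comps == [complement(x)]
```

Such a check is of course no substitute for the proofs above, which apply to every Boolean algebra.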
The identity (x ∨ y)′ = x′ ∧ y′ can be proved similarly, interchanging ∧ and ∨ throughout.

Corollary. To find the complement of an expression built up from primed and unprimed letters by iterated vees and wedges (but not using primed parentheses), interchange ∨ and ∧ throughout, prime each unprimed letter, and unprime each primed letter.

Thus, by this rule, the complement of (x′ ∧ y) ∨ (z ∧ w′) is (x ∨ y′) ∧ (z′ ∨ w).

Proof. If the number n of letters in the given expression f (counting repetitions) is 1, then the corollary is true, since (x)′ = x′ and (x′)′ = x. Otherwise, since no parentheses are primed,
we can write the expression as f = a ∧ b or f = a ∨ b, giving, respectively, f′ = a′ ∨ b′ or f′ = a′ ∧ b′. But the expressions a and b contain fewer letters than does f, hence by induction on n we can assume the corollary to be true for them. Substituting in the expressions f′ = a′ ∨ b′ or f′ = a′ ∧ b′, we get the desired formula for the complement.

Exercises

Exercises 2–10 refer to Boolean algebras.

1. Prove that the idempotent law x ∨ x = x follows from the commutative, associative, and absorption laws, justifying each step.
2. Prove in detail that (x ∨ y)′ = x′ ∧ y′.
3. Simplify the following Boolean expressions:
 (a) (x′ ∧ y′)′, (b) (a ∨ b) ∨ (c ∨ a) ∨ (b ∨ c), (c) (x ∧ y) ∨ (z ∧ x) ∨ (x′ ∨ y′)′, (d) (x′ ∨ y)′ ∧ (x ∨ y′).
4. Find complements of the following expressions:
 (a) x ∨ y ∨ z′, (b) (x ∨ y′ ∨ z′) ∧ (x ∨ (y ∨ z′)), (c) x ∨ (y ∧ (z ∨ w′)).
5. Prove that (x ∧ y) ∨ (x ∧ y′) ∨ (x′ ∧ y) ∨ (x′ ∧ y′) = I. Interpret in terms of the two-circle analog of Venn's diagram.
6. Prove that (x ∨ y) ∧ (x′ ∨ z) = (x′ ∧ y) ∨ (x ∧ z).
7. Prove that x = y if and only if (x ∧ y′) ∨ (x′ ∧ y) = 0.
8. Prove that (a) y ≺ x′ if and only if x ∧ y = 0; (b) y ≻ x′ if and only if x ∨ y = I.
9. Prove Poretzky's law: given x and t, x = 0 if and only if t = (x ∧ t′) ∨ (x′ ∧ t).
10. Apply the argument of the Corollary above to the expression (x′ ∧ y ∧ z′) ∨ (x ∧ y′).
11. Prove that (x ∧ y) ∨ (y ∧ z) ∨ (z ∧ x) = (x ∨ y) ∧ (y ∨ z) ∧ (z ∨ x) in any distributive lattice.
12. Show that if a and b are complemented elements of a distributive lattice, the same is true of a ∧ b and a ∨ b.

11.5. Canonical Forms of Boolean Polynomials

Various expressions built up from the ∧, ∨, and ′ operations have been studied in the preceding sections. Such expressions are called "Boolean polynomials" (or "Boolean functions"); the analogy with ordinary polynomials (Chap. 3) is obvious.

We now define a subalgebra of a Boolean algebra B as a nonvoid subset S of B which contains, with any two elements x and y, also x ∧ y, x ∨ y, and x′ (and hence 0 = x ∧ x′ and I). Given an arbitrary nonvoid subset X of B, the set of all values p(x₁, ..., xₙ) of Boolean polynomials with arguments xᵢ ∈ X is clearly the smallest subalgebra of B which contains X; this subalgebra is said to be generated by X. Thus, the subalgebra generated by any one element x consists of the four elements x, x′, 0, and I. This is a special instance of the following surprising fact: the number of different Boolean polynomials in n variables x₁, ..., xₙ is 2^(2ⁿ). This we now show.

An element a of a lattice L with universal bounds 0, I is said to be complemented when, for some x ∈ L, a ∧ x = 0 and a ∨ x = I; in a Boolean algebra such an x is unique, as in Lemma 5 of §11.4.

We reduce any Boolean polynomial to a canonical form in several steps, illustrating the argument by the polynomial

f(x, y, z) = [x ∨ z ∨ (y ∨ z)′]′ ∨ (y ∧ x).

First, if any prime occurs outside any parenthesis in the polynomial, it may be moved inside by an application of the dualization law (the Corollary of §11.4). In our example this gives

f = [x′ ∧ z′ ∧ (y ∨ z)] ∨ (y ∧ x).
Secondly, if any ∧ stands outside a parenthesis which contains a ∨, the ∧ can be moved inside by applying the distributive law, as in c ∧ (a ∨ b) = (c ∧ a) ∨ (c ∧ b). There results a polynomial in which all meets ∧ are formed before any join ∨: the expression is a join of terms T₁, ..., Tₖ in which each Tₖ is a meet of primed and unprimed letters. Thus, in our example,

f = (x′ ∧ z′ ∧ y) ∨ (x′ ∧ z′ ∧ z) ∨ (y ∧ x).

Thirdly, certain expressions can be shortened or omitted. If a letter c appears twice in one term, one occurrence can be omitted, since c ∧ c = c. If c appears both primed and unprimed, then the whole term is 0, since c ∧ a ∧ c′ = 0 for all a; hence the term can be omitted, since 0 ∨ b = b for all b. In our example this gives

f = (x′ ∧ z′ ∧ y) ∨ (y ∧ x).

Now if some term Tₖ fails to contain a letter c, we can write Tₖ = (Tₖ ∧ c) ∨ (Tₖ ∧ c′), replacing Tₖ by two terms in each of which c occurs exactly once. Finally, the letters appearing in each term can be rearranged so as to appear in their natural order. Thus, in our example,

f = (x′ ∧ y ∧ z′) ∨ (x ∧ y ∧ z) ∨ (x ∧ y ∧ z′).

This is called the disjunctive canonical form for f. We have proved the following lemma.

Lemma. Any Boolean polynomial in x₁, ..., xₙ can be reduced either to 0 or to some join of terms Tₖ of the form

(2) Tₖ = q₁ ∧ q₂ ∧ ⋯ ∧ qₙ,

that is, to disjunctive canonical form.
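The same canonical form can be found by evaluation: a minimal term is included exactly when the polynomial takes the value 1 at the corresponding assignment of its letters. The Python sketch below (an added illustration, not part of the original text) recovers the canonical form of the worked example f = [x ∨ z ∨ (y ∨ z)′]′ ∨ (y ∧ x):

```python
from itertools import product

def f(x, y, z):
    # after moving primes inside: (x' ∧ z' ∧ y) ∨ (y ∧ x)
    return (not x and not z and y) or (y and x)

def minimal_term(bits):
    # e.g. (False, True, False) → "x' ∧ y ∧ z'"
    names = ["x", "y", "z"]
    return " ∧ ".join(n if b else n + "'" for n, b in zip(names, bits))

# keep one minimal term per assignment at which f is true
terms = [minimal_term(bits)
         for bits in product([True, False], repeat=3) if f(*bits)]
print(" ∨ ".join(terms))
# → x ∧ y ∧ z ∨ x ∧ y ∧ z' ∨ x' ∧ y ∧ z'
```

The three terms printed are exactly those obtained above by the algebraic reduction.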
Since there are two alternatives for each qᵢ (namely xᵢ and xᵢ′), we see that there are exactly 2ⁿ possible Tₖ. Thus when n = 3, our process reduces any Boolean polynomial to 0 or to some join of the terms

(3) x ∧ y ∧ z, x ∧ y ∧ z′, x ∧ y′ ∧ z, x ∧ y′ ∧ z′, x′ ∧ y ∧ z, x′ ∧ y ∧ z′, x′ ∧ y′ ∧ z, x′ ∧ y′ ∧ z′.

It is no accident that these eight polynomials represent the eight regions into which the three circles of Figure 1 divide the square. This means geometrically that any Boolean combination of the three circles X, Y, and Z will be the union of some selection of the eight regions of the diagram.

A minimal polynomial M(x₁, ..., xₙ) in n variables x₁, ..., xₙ is a meet ⋀ᵢ₌₁ⁿ qᵢ of n elements in which each ith element qᵢ is either xᵢ or xᵢ′. Now assign to each M the n-digit binary number η(M) = y₁y₂⋯yₙ, where the digit yᵢ is 1 or 0 according as qᵢ is xᵢ or xᵢ′ in M = ⋀ qᵢ. Different minimal polynomials M are clearly represented by different binary numbers; the function η: M ↦ η(M) is a bijection from the set of minimal polynomials in x₁, ..., xₙ to the set I of all 2ⁿ n-digit binary numbers. Alternatively, η(M) can be thought of as the vector η = (y₁, y₂, ..., yₙ) ∈ Z₂ⁿ, and each Boolean polynomial ⋁_S M_σ(x₁, ..., xₙ) as corresponding to a set S of these vectors. Since the joins of different sets of minimal polynomials represent different subsets of I, this proves the following result.

Theorem 2. Any given Boolean polynomial in x₁, ..., xₙ is equal to 0 or to the join of a set S of minimal polynomials. There are just 2^(2ⁿ) different Boolean functions of n variables.

Corollary. The truth or falsity of any purported equation E₁ = E₂ in Boolean algebra can be settled definitely, simply by reducing each side to disjunctive canonical form.

We can now replace haphazard manipulation of Boolean polynomials by a systematic procedure.
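The count in Theorem 2 can be confirmed by enumeration: a Boolean function of n variables is determined by its truth table, a choice of 0 or 1 at each of the 2ⁿ assignments. A short Python check (an added illustration, not part of the original text):

```python
from itertools import product

# Enumerate all Boolean functions of n variables as truth tables:
# one value (0 or 1) per assignment, giving 2^(2^n) functions in all.
def boolean_functions(n):
    assignments = list(product([0, 1], repeat=n))
    return list(product([0, 1], repeat=len(assignments)))

assert [len(boolean_functions(n)) for n in (1, 2, 3)] == [4, 16, 256]
```

For n = 3 this gives the 2⁸ = 256 functions corresponding to the subsets of the eight regions of the Venn diagram.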
The ultimate terms, such as those listed in (3), will be called minimal Boolean polynomials. Thus in (3), the η(M) as listed are 111, 110, 101, 100, 011, 010, 001, 000. If Sⱼ ⊂ I consists of those binary numbers with yⱼ = 1, then Sⱼ′ will consist of those with yⱼ = 0. Hence (cf. Ex. 9 below) the set representing a given minimal polynomial M(S₁, ..., Sₙ) consists of a single binary number a(M) = a₁a₂⋯aₙ, whose ith digit aᵢ is 0 or 1 according as Sᵢ is primed or unprimed in M.

Exercises

1. Reduce each of the following expressions to canonical form:
 (a) (x ∨ y) ∧ (z′ ∧ y)′, (b) (x ∨ y) ∧ (y ∨ z) ∧ (x ∨ z).
2. Test each of the following proposed equalities by reducing both sides to (disjunctive) canonical form:
 (a) [x ∧ (y ∨ z)]′ = (x ∧ y)′ ∨ (x ∧ z), (b) x = (x′ ∨ y′)′ ∨ [z ∨ (x ∨ y)′].
3. How is Theorem 2 analogous to the theorem that every ordinary polynomial over a field has a unique representation as a product of irreducible polynomials?
4. Prove that the canonical form of f(x, y) is
 f(x, y) = [f(1, 1) ∧ x ∧ y] ∨ [f(1, 0) ∧ x ∧ y′] ∨ [f(0, 1) ∧ x′ ∧ y] ∨ [f(0, 0) ∧ x′ ∧ y′].
5. Expand I = (x₁ ∨ x₁′) ∧ ⋯ ∧ (xₙ ∨ xₙ′) by the generalized distributive law to show that I is the join of the set of all minimal Boolean polynomials.
6. Prove that the meet of any two distinct minimal polynomials is 0.
7. Using only Exs. 5 and 6, give an independent proof that every Boolean polynomial can be written as a join of minimal polynomials.
8. Using Ex. 7 and xᵢ = xᵢ ∧ I, show that each xᵢ is the join of all those minimal polynomials with ith term xᵢ.
*9. Show that every Boolean polynomial has a dual canonical form which is a "meet" of certain "prime" polynomials. Describe these prime polynomials carefully, and show that they are complements of minimal polynomials.
*10. (a) Let ⋁_A M_α denote the join of all minimal polynomials in a nonvoid set A. Prove:

 (⋁_A M_α) ∧ (⋁_B M_β) = ⋁_{A∩B} M_γ, (⋁_A M_α) ∨ (⋁_B M_β) = ⋁_{A∪B} M_γ, (⋁_A M_α)′ = ⋁_{A′} M_δ.

 (Hint: Use Lemma 5 of §11.4.)
 (b) Show that the preceding formulas are valid also if we define the join ⋁ M of the void set of minimal polynomials to be 0.

11.6. Partial Orderings

Little use has been made above of the reflexive, antisymmetric, and transitive laws of inclusion. Yet these are the most fundamental laws of all, and they apply to many systems which are not Boolean algebras. For example, they hold for the subgroups (or the normal subgroups!) of any group, the subfields of any field, the subspaces of any linear space, and so on, even though these do not form Boolean algebras. They also hold for
the less-than-or-equal-to relation x ≤ y between real numbers, for the divisibility relation x | y between positive integers, and, quite generally, for the system of all subsets of a set which are distinguished by any special property (writing either ⊂ or ≺). These examples suggest the abstract concept of a "partial ordering."

Definition. A partially ordered set is a set P with a binary relation ≺ which is reflexive, antisymmetric, and transitive.

For any relation a ≺ b (read "b includes a") of this type, the converse relation ≻ is also reflexive, antisymmetric, and transitive (simply read the postulates from right to left to see this). Furthermore, we may define a < b to mean that a ≺ b but a ≠ b, while b may be said to cover a if a < b and if a < x < b is possible for no x.

The following lemma shows that any lattice can be considered as a partially ordered set (the full significance of this will be explained in the next section).

Lemma. In any lattice, the relation x ≺ y defined to mean x ∧ y = x is a partial ordering; it is equivalent to x ∨ y = y.

Partially ordered sets with a finite number of elements can be conveniently represented by diagrams. Each element of the system is represented by a small circle, so placed that the circle for a is above that for b if a > b; one then draws a descending line from a to b in case a covers b. One can reconstruct the relation a > b from the diagram, for a > b if and only if it is possible to climb from b to a along ascending line segments in the diagram. For example, in Figure 3 the first diagram represents the system of all subgroups of the four-group; the second, the Boolean algebra of all subsets of a set of three points; the third, the numbers 1, ..., 8 under the divisibility relation. Figure 3 also contains a diagram for the partially ordered set of all the subgroups of the group of the square; the others have been constructed at random. Such diagrams show how one can construct abstract partially ordered sets simply by drawing diagrams.
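The three defining laws of a partial ordering are easy to verify mechanically for the divisibility relation. A small Python check (an added illustration, not part of the original text):

```python
# Divisibility a | b on {1, ..., 20} is reflexive, antisymmetric,
# and transitive, hence a partial ordering in the sense just defined.
P = range(1, 21)
divides = lambda a, b: b % a == 0

assert all(divides(a, a) for a in P)                      # reflexive
assert all(a == b for a in P for b in P
           if divides(a, b) and divides(b, a))            # antisymmetric
assert all(divides(a, c) for a in P for b in P for c in P
           if divides(a, b) and divides(b, c))            # transitive
```

The same three checks, with `divides` replaced by set inclusion, verify that the subsets of any finite set form a partially ordered set.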
It is clear that any statement which can be proved from the postulates defining partially ordered sets by using the relation a ≺ b could be established by exactly the same train of reasoning if everywhere a ≺ b were replaced by the converse relation a ≻ b, and vice versa. This is the Duality Principle:

Any theorem which is true in every partially ordered set remains true if the symbols ≺ and ≻ are interchanged throughout the statement of the theorem.

It is to be emphasized that this principle is not a theorem about partially ordered sets in the usual sense, but is a theorem about theorems; as such, it belongs to the domain of "metamathematics."

Exercises

1. Which of the following sets are partially ordered sets?
 (a) all subfields of the field R of real numbers, under the inclusion relation;
 (b) all pairs of real numbers if (a, b) < (a′, b′) means that a < a′ and b < b′;
 (c) all pairs of real numbers if (a, b) < (a′, b′) means that either a < a′, or a = a′ and b ≤ b′;
 (d) all pairs of real numbers if (a, b) < (a′, b′) means that a < a′ and b ≤ b′;
 (e) all subdomains of a given integral domain, under the inclusion relation;
 (f) all polynomials in F[x] if f(x) < g(x) means that f(x) divides g(x).
2. Draw a diagram for each of the following partially ordered sets:
 (a) the Boolean algebra of all subsets of a set of four points;
 (b) the set of all subgroups of a cyclic group of order 12;
 (c) the set of all subgroups of the quaternion group;
 (d) the integers 1, 2, 3, 4, 6, 8, 12, 24 under divisibility;
 (e) the set of all subgroups of a cyclic group of order 54;
 (f) the set of all ideals of the ring Z₄₀ of integers modulo 40.
3. Show in detail how the second diagram of Figure 3 does represent the algebra of all subsets of a set I of three points.
4. Show that the partially ordered sets of parts (d), (e), and (f) of Ex. 2 are all "isomorphic" in a suitably defined sense.
5. Prove the lemma stated in the text.
6. (a) Show that in Example 1 of §11.3 the lattice L is defined by the relation of set-inclusion between subspaces.
 (b) Make and prove a similar statement about Example 2 of §11.3.
 (c) If m | n is used to define a partial ordering of the positive integers, what do ∧ and ∨ signify?
7. Consider a system of elements with a relation a < b which is transitive and irreflexive (a < a is never true). If a ≤ b is defined to mean that either a < b or a = b, prove that one gets a partially ordered set.
11.7. Lattices

The consistency principle shows how to define inclusion in terms of join or meet. Conversely, we shall now show that one can define join and meet in terms of inclusion: x ∨ y is the least thing which contains both x and y, while x ∧ y is the greatest thing contained in both x and y. This observation is due to C. S. Peirce; we shall state it more precisely as follows.

By a "lower bound" to a set X of elements of a partially ordered set P is meant an element a satisfying a ≺ x for all x ∈ X. By a "greatest lower bound" (g.l.b.) is meant a lower bound including every other lower bound: a lower bound c such that c ≻ a for any other lower bound a. Dually, we can define "upper bounds" and "least upper bounds" (l.u.b.) of a set of elements—we are here applying our metamathematical Duality Principle! G.l.b.'s are unique if they exist, for if a and b are both g.l.b.'s of the same set X, then a ≻ b and b ≻ a, whence a = b; l.u.b.'s are unique, when they exist, by the Duality Principle. Hence it is legitimate to speak of the g.l.b. and the l.u.b. of a set of elements.

Lemma 1. In any lattice, the meet x ∧ y and the join x ∨ y are the g.l.b. and the l.u.b., respectively, of the set consisting of the two elements x and y.

Proof. Since x ∧ x ∧ y = x ∧ y and y ∧ x ∧ y = x ∧ y, the consistency principle shows that x ∧ y is a lower bound of x and y. It is a greatest lower bound, since z ≺ x and z ≺ y imply z = z ∧ x = (z ∧ y) ∧ x = z ∧ (x ∧ y), and so z ≺ x ∧ y; thus x ∧ y satisfies the definition of the g.l.b. The proof for x ∨ y is dual.

This shows that any lattice is a partially ordered set having the "lattice property" that any two elements have a g.l.b. and a l.u.b. Conversely, we shall now show that this property completely characterizes lattices.

Theorem 3. Let L be any partially ordered set in which any two elements x, y have a g.l.b. x ∧ y and a l.u.b. x ∨ y. Then L is a lattice under the operations ∧, ∨, in which a ≺ b if and only if a ∧ b = a (or, equivalently, a ∨ b = b).
Proof. It suffices to prove the idempotent, commutative, associative, and absorption laws, together with the consistency principle. The labor of proof is cut in half by the Duality Principle, so it suffices to prove each of the first three laws for g.l.b.'s. The idempotent law is trivial, and the commutative law is obvious from the symmetry of the definition. The associative law holds since both x ∧ (y ∧ z) and (x ∧ y) ∧ z are g.l.b.'s of the three elements x, y, z, and the g.l.b. is unique. For the consistency principle, assume first that x ≺ y; then x is a lower bound of the pair x, y (since x ≺ x and x ≺ y), and any lower bound z of the pair satisfies z ≺ x; hence x = x ∧ y. Conversely, if x = x ∧ y, then x ≺ y by substitution in the definition. The absorption law now follows by a proof similar to that of Lemma 3 in §11.4. The proof is completed by the Duality Principle.

The distributive laws were not mentioned above, since they do not hold in all lattices. For example, they do not hold if x, y, z are chosen to be the three subgroups of order 2 in the four-group (Figure 3, first diagram). However, two related inequalities do hold.

Theorem 4. In any lattice, the semidistributive laws

x ∧ (y ∨ z) ≻ (x ∧ y) ∨ (x ∧ z), x ∨ (y ∧ z) ≺ (x ∨ y) ∧ (x ∨ z)

hold.

Proof. As regards the first semidistributive law, note that x ∧ y and x ∧ z are both contained in x and in y ∨ z; hence the g.l.b. of x and y ∨ z is an upper bound both to x ∧ y and to x ∧ z, and hence to their l.u.b. The proof is completed by duality.

Theorem 5. In any lattice, either distributive law implies its dual.

Proof. Assuming the first distributive law of §11.3, we get by expansion

(x ∨ y) ∧ (x ∨ z) = ((x ∨ y) ∧ x) ∨ ((x ∨ y) ∧ z) = x ∨ (x ∧ z) ∨ (y ∧ z) = x ∨ (y ∧ z),

which is the other distributive law of §11.3.

This also proves the consistency of the following definition. A Boolean algebra is a distributive lattice which contains elements 0 and I with 0 ≺ a ≺ I for all a, and in which each a has a complement a′ satisfying a ∧ a′ = 0, a ∨ a′ = I.

Thus, to prove that the algebra of classes is a Boolean algebra, we need only know that (i) set-inclusion is reflexive, antisymmetric, and transitive; (ii) the union of two sets is the least set which contains both, and dually for the intersection; (iii) S ∩ (T ∪ U) = (S ∩ T) ∪ (S ∩ U) identically; and (iv) each set S has a "complement" S′ satisfying S ∩ S′ = ∅, S ∪ S′ = I. Boolean algebras can also be described by many other systems of postulates; one such is indicated by Ex. 13 below.
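Theorem 3 can be illustrated concretely: in the divisors of 36 ordered by divisibility, computing g.l.b.'s and l.u.b.'s directly from the order alone recovers gcd and lcm. A Python sketch (an added illustration, not part of the original text; the poset and helper names are ours):

```python
from math import gcd

# The divisors of 36, partially ordered by divisibility.
L = [d for d in range(1, 37) if 36 % d == 0]
leq = lambda a, b: b % a == 0          # a ≺ b means a divides b

def glb(x, y):                          # meet, computed from the order
    lower = [z for z in L if leq(z, x) and leq(z, y)]
    greatest = [c for c in lower if all(leq(a, c) for a in lower)]
    assert len(greatest) == 1           # g.l.b.'s are unique (Lemma 1)
    return greatest[0]

def lub(x, y):                          # join, computed from the order
    upper = [z for z in L if leq(x, z) and leq(y, z)]
    least = [c for c in upper if all(leq(c, a) for a in upper)]
    assert len(least) == 1
    return least[0]

# The order-theoretic meet and join agree with gcd and lcm.
assert all(glb(x, y) == gcd(x, y) for x in L for y in L)
assert all(lub(x, y) == x * y // gcd(x, y) for x in L for y in L)
```

Nothing about gcd or lcm is built into `glb` and `lub`; they are recovered purely from the divisibility ordering, as Theorem 3 predicts.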
Exercises

1. Which of the diagrams of Figure 3 represent lattices? Draw two new diagrams for partially ordered sets which are not lattices.
2. Which of the examples of Ex. 1, §11.6, represent lattices?
3. Show that if b ≺ c in a lattice L, then a ∧ b ≺ a ∧ c and a ∨ b ≺ a ∨ c for all a ∈ L.
4. Show that a lattice having only a finite number of elements has elements 0 and I satisfying 0 ≺ x ≺ I for all elements x.
5. Show that a finite partially ordered set with 0 and I is a lattice whenever between any elements a₁, a₂, b₁, b₂ with aᵢ < bⱼ (i, j = 1, 2) there is an element c such that aᵢ < c < bⱼ for all i and j.
6. A chain is a simply ordered set (i.e., a partially ordered set in which any x and y have either x ≻ y or y ≻ x).
 (a) Prove that every chain is a distributive lattice.
 (b) Prove that a lattice is a chain if and only if all of its subsets are sublattices.
7. Illustrate the Duality Principle by writing out the detailed proof of the second half of Lemma 1 from the proof given for the first half.
*8. Let L be a lattice with universal bounds 0 and I, in which each element a has a "complement" a′ with the properties: x ≺ a′ if and only if a ∧ x = 0, and y ≻ a′ if and only if a ∨ y = I. Show that L is a Boolean algebra.
9. State and prove a Duality Principle for Boolean algebras.
*10. A lattice is called modular if and only if x ≻ z always implies x ∧ (y ∨ z) = (x ∧ y) ∨ z.
 (a) Prove that every distributive lattice is modular.
 (b) Construct a diagram for a lattice of five elements which is not modular.
 (c) Prove that each of the following is a modular lattice: (i) all subspaces of a vector space; (ii) all subgroups of an Abelian group; (iii) all normal subgroups of any group.
 (d) Show that in a modular lattice, x ≺ z always implies x ∨ (y ∧ z) = (x ∨ y) ∧ z. Hence infer that the Duality Principle holds for modular lattices.
*11. In any Boolean algebra the symmetric difference of two elements x and y is defined by x + y = (x ∧ y′) ∨ (x′ ∧ y).
 (a) What does this mean if x and y are classes? Draw a figure.
 (b) Show that the operation + is commutative and associative, and has a zero element.
 (c) With the symmetric difference as a sum and the meet as a product, show that every Boolean algebra is a commutative ring.
*12. (a) Show that if, in the vector space Z₂ⁿ, we define multiplication by ξη = (x₁y₁, ..., xₙyₙ), then Z₂ⁿ becomes a commutative ring in which ξ² = ξ for all ξ.
 (b) Show that this ring is a Boolean algebra under the operations ξ ∧ η = ξη, ξ ∨ η = ξ + η + ξη, ξ′ = (1, 1, ..., 1) + ξ.
*13. Show that Boolean algebras can be defined by a system of postulates in which the distributive laws are derived rather than assumed. (Hint: To prove the first distributive law it suffices to prove that e = [a ∧ (b ∨ c)] ∧ [(a ∧ b) ∨ (a ∧ c)]′ = 0. Write e as a meet and consider the individual terms.)

11.8. Representation by Sets

The main conclusion of §11.5 is that the postulates which were assumed for Boolean algebra imply all true identities for the algebra of sets with respect to intersection, union, and complement. In fact, for given n, a suitable family of subsets S₁, ..., Sₙ of a particular set Z₂ⁿ was shown to have the property that p(S₁, ..., Sₙ) = q(S₁, ..., Sₙ) for two Boolean polynomials p, q if and only if these polynomials had the same disjunctive canonical form. The Boolean algebra consisting of all these disjunctive canonical forms is called the free Boolean algebra with n generators. For example, the Boolean algebra generated by the circles X, Y, Z of the Venn diagram (Figure 1) is isomorphic with the algebra of all subsets of Z₂³.

We will now prove a stronger result, showing in passing that the postulates used to define distributive lattices completely characterize the properties of intersections and unions of sets. For this purpose, we will need concepts of homomorphism and isomorphism analogous to those already used for groups.

Definition. A function f: L → M from a lattice L to a lattice M is called a homomorphism when f(x ∧ y) = f(x) ∧ f(y) and f(x ∨ y) = f(x) ∨ f(y) for all x, y ∈ L. A homomorphism which is one-one and onto is called an isomorphism.

Lemma 1. An isomorphism f: A → B between two Boolean algebras (regarded as lattices) necessarily carries the universal bounds 0, I and complements in A into the corresponding bounds and complements in B.
Proof. 0 ∧ x = 0 for all x ∈ A implies f(0) ∧ f(x) = f(0 ∧ x) = f(0) for all f(x) ∈ B; hence f(0) is a universal lower bound in B. The proof that f(I) = I is similar. Therefore x ∧ x′ = 0 in A implies f(x) ∧ f(x′) = f(x ∧ x′) = f(0) = 0 in B, and dually f(x) ∨ f(x′) = I in B, which proves that f(x′) = [f(x)]′, completing the proof of Lemma 1.

Definition. A ring of sets is a family of subsets of a set I which contains, with any two subsets S and T, also their intersection S ∩ T and their union S ∪ T. A field of sets is a ring of sets which contains I, the empty set ∅, and, with any set S, also its complement S′.

In other words, a ring of subsets is just a sublattice of the Boolean algebra A of all subsets of I, considered as a distributive lattice, while a field of subsets of I is just a Boolean subalgebra of A.

We will prove that every finite distributive lattice is isomorphic with a ring of sets and every finite Boolean algebra is isomorphic with the field of all subsets of some (finite) set. These may be regarded as partial analogs of Cayley's theorem for groups. In proving these converses of Theorem 1, we use the Second Principle of Finite Induction. We will also want the following concepts.

Definition. An element a > 0 of a lattice L is join-irreducible if x ∨ y = a implies x = a or y = a; it is meet-irreducible if a < I and x ∧ y = a implies x = a or y = a. An element p is an atom if p > 0 and there is no element x such that p > x > 0.

Now, for each element a of any finite lattice L, let S(a) be the set of all join-irreducible elements pₖ ≺ a in L, and consider the mapping a ↦ S(a). We have:
Lemma 2. In a Boolean algebra, an element is join-irreducible if and only if it is an atom.

Proof. If p is an atom, then p = x ∨ y implies x = p or x = 0 (and likewise for y); in the second case p = 0 ∨ y = y. Hence p is join-irreducible. Conversely, if a is not an atom or 0, then a > x > 0 for some x. Therefore

a = a ∧ I = a ∧ (x ∨ x′) = (a ∧ x) ∨ (a ∧ x′) = x ∨ (a ∧ x′).

Since a ∧ x′ ≺ a, and a ∧ x′ = a would imply x = a ∧ x = a ∧ x′ ∧ x = 0, we have a ∧ x′ < a; also x < a. Hence a is join-reducible. Q.E.D.

Lemma 3. In a finite lattice L, every element a satisfies a = ⋁_{S(a)} pₖ; in words, every element is a join of join-irreducible elements.

Proof. We use the Second Principle of Finite Induction, letting P(n) be the proposition that Lemma 3 holds if the number of elements x ≺ a in L is n. For a = 0, the result is immediate, since 0 is the least upper bound of the void set; and P(n) is trivially true if a is join-irreducible. But if a is not join-irreducible and not 0, then a = x ∨ y, where x < a and y < a, whence n(x) < n(a) and n(y) < n(a). Hence, by induction, it follows that x and y are joins of join-irreducible elements: x = ⋁ pᵢ and y = ⋁ qⱼ. Hence a = ⋁ pᵢ ∨ ⋁ qⱼ is a join of join-irreducible elements. Q.E.D.
Lemma 4. In any finite lattice L, the mapping a ↦ S(a) carries meets in L into set-theoretic intersections: S(a ∧ b) = S(a) ∩ S(b).

Proof. Trivially, p ≺ a ∧ b if and only if both p ≺ a and p ≺ b.

Lemma 5. In a finite distributive lattice L, the correspondence of Lemma 4 carries joins in L into set-theoretic unions: S(a ∨ b) = S(a) ∪ S(b).

Proof. A given join-irreducible p is contained in a ∨ b if and only if

p = p ∧ (a ∨ b) = (p ∧ a) ∨ (p ∧ b).

If p is join-irreducible, this implies either p ∧ a = p (i.e., p ≺ a) or p ∧ b = p (p ≺ b). This shows that S(a ∨ b) contains p if and only if S(a) contains p or S(b) contains p; the converse is obvious in any lattice.

By Lemmas 4 and 5, the mapping a ↦ S(a) is a homomorphism from L onto a ring ℛ of subsets of the set J of join-irreducible elements of L. Lemma 3 shows that it is one-one from L onto ℛ, and so the function a ↦ S(a) is an isomorphism. This proves

Theorem 6. Any finite distributive lattice L is isomorphic with a ring of sets.
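Theorem 6 can be seen at work in a small example: in the distributive lattice of divisors of 12 (with meet = gcd, join = lcm), the map a ↦ S(a) onto sets of join-irreducibles turns meets into intersections and joins into unions. A Python sketch (an added illustration, not part of the original text):

```python
from math import gcd

# The distributive lattice of divisors of 12; its zero element is 1.
L = [d for d in range(1, 13) if 12 % d == 0]
leq = lambda a, b: b % a == 0
meet = gcd
join = lambda x, y: x * y // gcd(x, y)

def join_irreducible(p):
    if p == 1:                          # exclude the zero element
        return False
    return all(join(x, y) != p or p in (x, y) for x in L for y in L)

irr = [p for p in L if join_irreducible(p)]      # the prime powers 2, 3, 4

def S(a):
    return frozenset(p for p in irr if leq(p, a))

for a in L:
    for b in L:
        assert S(meet(a, b)) == S(a) & S(b)      # Lemma 4
        assert S(join(a, b)) == S(a) | S(b)      # Lemma 5
assert len({S(a) for a in L}) == len(L)          # the map is one-one
```

Here the join-irreducibles are 2, 3, 4, and each divisor is the lcm of the join-irreducibles it contains, exactly as Lemma 3 asserts.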
When L is a finite Boolean algebra, Lemmas 2 and 3 tell us that each a ∈ L is the join of the atoms p ≺ a. Moreover, for any a ∈ L,

S(a) ∩ S(a′) = S(a ∧ a′) = S(0) = ∅, S(a) ∪ S(a′) = S(a ∨ a′) = S(I) = J,

so that [S(a)]′ = S(a′). We have shown that the mapping a ↦ S(a) is an isomorphism from any finite Boolean algebra L to a field ℱ of subsets of the set of atoms of L. It only remains to show that ℱ contains all sets of atoms of L: that is, that if S and T are distinct sets of atoms p_σ, then ⋁_S p_σ ≠ ⋁_T p_τ, because ⋁_S p_σ contains the atoms in S and no others. But this is a corollary of the following result.

Lemma 6. If an atom q satisfies q ≺ ⋁_S p_σ, where S is a set of atoms p_σ, then q ∈ S.

Proof of lemma. By the generalized distributive law,

q = q ∧ ⋁_S p_σ = ⋁_S (q ∧ p_σ).

Since q is join-irreducible, it follows that q = q ∧ p_σ for some one σ ∈ S, whence 0 < q ≺ p_σ. Since q and p_σ are atoms, this implies p_σ = q, and so q ∈ S. Q.E.D.

Completion of proof. By Lemma 6, ⋁_S p_σ contains the atoms in S and no others; hence distinct sets of atoms have distinct joins, and every set of atoms is S(a) for a = ⋁_S p_σ. This proves

Theorem 7. Any finite Boolean algebra L is isomorphic with the Boolean algebra of all sets of its atoms.

Exercises

1. (a) Find a lattice homomorphism f: A → B from a Boolean algebra A onto a Boolean algebra B which does not preserve universal bounds or complements.
 (b) Show that such an f preserves complements if and only if it preserves universal bounds.
2. (a) Show that the set Z⁺ of all positive integers is a lattice under the partial ordering m ≺ n if and only if m | n.
 (b) Show that this lattice is distributive.
 (c) Identify its join-irreducibles.
3. Show that if the join-irreducibles of a finite distributive lattice L are a chain C, then L itself is a chain. How many more elements does L have than C?
4. Show that the Boolean algebra of all subsets of a class of n elements has exactly n! automorphisms.
5. Prove that for every positive integer n there is a Boolean algebra with 2ⁿ elements.
6. If two finite sets I and J have the same number of elements, show that the algebra of all subsets of I is isomorphic to the algebra of all subsets of J.

12
Transfinite Arithmetic

12.1. Numbers and Sets

The present chapter will be concerned with the relationship between numbers and sets, and in particular with infinite cardinal numbers, which play a basic role in modern mathematics. Developed with care, the cardinal approach enables one to define numbers in terms of sets, thereby reducing the totality of undefined terms which must be assumed in mathematics. But to carry out this program requires too great a reshuffling of basic ideas to fit neatly into the present book. Instead, we shall assume the reader to be familiar with both the positive integers and the concept of a set, and shall proceed from there. Our object will be to extend the cardinal approach so as to give a precise definition of infinite cardinal numbers, and to show how to add, multiply, and raise to powers arbitrary cardinal numbers, showing in the process that these operations have most (though not all) of the properties possessed by the corresponding operations on positive integers.

The source of the relationship between numbers and sets is the following definition.

Definition. Let n be any positive integer. A set S will be said to have cardinal number n (in symbols, o(S) = n) if and only if there exists a bijection between the elements of S and the integers 1, 2, ..., n.†

This is the cardinal approach to the positive integers, as contrasted with the ordinal approach exemplified by the Peano postulates of §2.6, which regard position in the familiar sequence "one, two, three, four, ..." as basic.

† An empty set is sometimes said to have the cardinal number zero.
by mathematical induction.. m and a proper subset S of the integers 1. a priori.Sno where Sk is the element of S corresponding to the integer k .n} if and only if m -< n...1 < n . counting each element once and only once . but the converse must be analyzed more carefully. hence we can use induction on m.···.. By hypothesis. . n if and only if m < n. There exists a bijection between the set {I. The converse is trivial if m = 1.. if k = n. the integers g(1). adding one to both sides. If m < n."'..m . . m ~ m is of the desired sort. stating first a somewhat more general result.n-1.. so that f(i) is never k for i = 1.1.. g(m . f(i) = n is never true.1 and certain of the integers 1. . n. . m < n. . In either event. the definition (1) shows that no g(i) equals k. so i ~ g(i) is one-one between the integers 1.382 Ch.. n. This half of the proposition is obvious. S3. 12 Transfinite Arithmetic This definition means that the elements of S can be labeled Slo S2. Proof... we can assume m . Let us select the first positive integer k -< n which is not in S.Sn ~ tn.. . . one will not arrive at a different total number of elements. . . .' . . But what is not obvious. Define a new bijection i ~ g(i) [i = 1.. one can count the elements of S by counting up to n. by recounting in a different order. g(i) = f(m) if f(i) = n. There exists a bijection between the set 1. ••• . It is a corollary that if two sets Sand T have the same cardinal number. . since 1 is the least positive integer.1) fail to include all the integers 1. But now suppose there were a bijection 1 ~ f(l).' . . the set S of all integers f(i) is a proper subset of the set 1.n .1] as follows: (1) g(i) = f(i) unless f{i) = n. this means that S does not contain all the integers 1. m....' . is the fact that the same set cannot have two different cardinal numbers-that. n . . . the correspondence i ~ g(i) would be one-one between the integers 1.m ...m and a proper subset of the set 1. 
Let m and n be positive integers.1 and a proper subset of the integers 1. Now. . then there is a bijection between them-namely.. . . Corollary 1.I-whence. Theorem 1.. . In other words..m .. 2 ~ 2. If k < n.' .. . . then the bijections 1 ~ 1. .1. m} and a subset of the set {I." . the correspondence Sl ~ tlo ••• ."'. Since f{i) = n for at most one i.m ~ f(m) between 1.. . n. We shall now prove this fact. so no g(i) equals f(m). . . Proof. If m -< n. if i ~ f(i) is a bijection between {I. 12. If a set S has cardinal n.m} and the set {I. m ~ m is of the desired sort.. the bijection 1 ~ 1.n}. Definition.. Countable Sets A set is called finite if and only if its elements can be counted in the usual way. . Do the same for Corollary 3. For.2 383 Countable Sets Proof.' . show that there exists a bijection between Sand 1. This shows that the same set cannot have two different positive integers for cardinal numbers. then m = n. . n}.. m -< nand n -< m-whence m = n.2. . show that the deletion of a single element from S leaves a set S* of cardinal n . there is no bijection between the set {I... Prove Corollary 1 directly by the method used in the proof of Theorem 1.. Corollary 3.. . If there were such a bijection. .n} and the set S. Let Sand T be any two sets whose cardinal numbers are positive integers m and n. We shall formulate this more precisely as follows. .. . . If a set S has cardinal n. The preceding results immediately imply the following. . m} and certain of the integers 1. . Hence by Theorem 1. so m -< n.n. Exercises 1.. If there exists a bijection between the set {I. m = n if and only if there is a bijection between S and all of T.. .n in which t corresponds to n.. Conversely. . .. 2. Then m -< n if and only if there is a bijection between S and a subset of T. A nonempty set S is called finite if and only if its cardinal number is a positive integer. n + I}.. . . Theorem 1 would prove n < n.. and if t is a particular element of S. a contradiction. by Corollary 1. 
Corollary 2. If there exists a bijection between the set {1, ···, m} and a proper subset of the set {1, ···, n + 1}, then m < n + 1.

Definition. A set which is not empty or finite is called infinite.

We shall now introduce the idea that infinite sets may also be considered to have cardinal numbers.

Definition. A set S is said to be countable or denumerable, or to have the cardinal number d (in symbols, o(S) = d), if it is bijective with the set of all positive integers.

If another set T is bijective with a countable set S, it follows that T is itself countable. It may be shown that d ("countable infinity") is the smallest infinite cardinal number; in essence, this is Theorem 3.

Theorem 2 (Paradox of Galileo). Any denumerable set has a bijection onto a proper subset of itself.

Proof. All the elements of the set (say S) can by hypothesis be written s1, s2, ···, sn, ···, so that each element of S appears once and only once. The bijection s1 → s2, s2 → s3, ···, sj → sj+1, ··· is then one-one between S and the proper subset obtained from S by deleting s1.

Theorem 3. Any infinite set contains a countable subset.

Proof. Let S be the infinite set. Since S is not empty, choose for s1 any element in it; then from S − {s1} choose a second element s2, from S − {s1, s2} a third element s3, and so on. Since S is infinite, S − {s1, s2, ···, sn} can never be empty, hence we can always choose an sn+1 in it, and the process can never stop until we have constructed an infinite sequence of different elements of S. This construction uses a basic principle of set theory known as the Axiom of Choice: that given any set S, there exists a "choice function" γ which chooses from any nonempty set T ⊂ S an element γ(T) ∈ T.

Corollary (Dedekind-Peirce). A set S is infinite if and only if it has a bijection with a proper subset of itself. (It is not hard to prove this, using Theorem 1: if S is a finite set of cardinal number n, then S is bijective with 1, ···, n.)
This is equivalent to the requirement that it be possible to enumerate all the elements of 5 in an ordinary infinite sequence: Sl. from 5 . Since 5 is infinite. ••• .Sj ~ Sj+b ••• is one-one between the set 5 and the set obtained from 5 by deleting Sl' Q. S3.Ch. Proof. •• " with the different positive integers as subscripts. so Corollary 3 to Theorem 1 asserts that 5 cannot have a t The Hebrew letter ~o (aleph-nought) is often used instead of d. 12 Transfinite Arithmetic 384 For example. n. . . Q..4 2/2 3/2 number d).2.. Prove directly that a bijection between the set of all positive integers and a finite set is impossible.... This proves the first assertion . The function associating each element Uj in U with its successor Uj+b and each element of S not in U with itself. establishing a bijection m/n ~ k between Q+ and Z+. . ) (-n) ~ 2n (n = 1" 2 . 0 ~ 0. Delete from this sequence all fractions which are not in their lowest terms (or equivalently. 3 . . where T and U are countable. The first term is 1/1. Examples are given by t i i Theorem 4. . •• ' . 2. 4. But this can be easily extended to a bijection m/n ~ k.1) if m :> n > 1. let S be any infinite set. Conversely. 4. 1/1 3/1 2/1 In practice. . +2 . We shall next prove that the set Q+ of all positive rational numbers is countable.D. . . .2 385 Countable Sets bijection with a proper part of itself. prove that S is countable.§12.E. 1/3 ~ 2/3 - 3/3 . If S = T u U. The set Z of all integers is countable. . Show that the set of all vectors in a finite-dimensional space Q" over the field of rationals is countable. . . it will contain a countable subset U of elements Ub U2. . it follows that Q is. To do this. which are equal to other quotients of integers previously enumerated). the set Q of all rational numbers is countable.. we can then arrange all such quotients in the following ordinary infinite sequence. . as in Figure 1. U3. . . 2. . 
The resulting subsequence enumerates the positive rational numbers in an ordinary sequence. of positive integers. the successor of n/l is 1/(n + 1). ) is one-one between the set 0" -1 +1 -2 . Exercises 1. The correspondence n ~ 2n + 1 (n = 0. . Show that the set of all integral multiples of 7 is countable. Since Z is countable.. 3. is a bijection from S to a proper part of itself. -(m/n) ~ -k between the set Q of all rational numbers and the set Z of all integers. the successor of m/n is (m + 1)/n if m < n. . and is m/(n . 3. By going around the borders of smaller squares in order.. . Figure 1 Proof. 5. " of all integers and the set 1. surprisingly many infinite sets i turn out to be countable (to have the cardinal 1/2 . we first arrange the quotients of positive integers in an infinite square. 1.. . . . §2..Ch. . 12. r' E Q) is countable. is the decimal expansion of a real number b which differs from the nth number Xn of the enumeration in at least the nth decimal place. If S = T u U. let the bth digit bn in the decimal Figure 2 expansion of b be ann . Remark.of all real numbers is not countable. From the X2 = .1 if ann oj. Show that in Figure 1. . " of all real numbers. 12 Transfinite Arithmetic 386 5. . Theorem 5 (Cantor). Prove that every group contains a subgroup which is countable or finite. List the decimal expansions of these numbers after the decimal point in their enumerated order in a square array as in Figure 2. Where ann is the nth diagonal term. struct a new real number b between 0 and 1 as follows ...n + 1)st if m > n. 0 and 1 if ann = O. For instance. Exhibit a bijection between the field of real numbers and a proper subset thereof.a21 a22a23a24 . Then b = .1)2 + m)th term if m < n and the (m 2 . 10. 12.. Thus. This proof is complicated by the circumstance that certain numbers. X3. .000 = 0. Other Cardinal Numbers Not all infinite classes are countable: there is more than one "infinite" cardinal number. We use the so-called "diagonal process. Proof. 
digits along the diagonal of this array conX3 = . . 6. The set R . min is the «n .a31 a32a33a34 . Show that every subset of a countable set is either finite or countable. 13. b is equal to no Xm contradicting our supposition that the enumeration included all real numbers.. 8. Prove that the set of all numbers of the form r + r'~ (r." Suppose there were an enumeration Xlo X2.1). 7. prove that U is countable. 9. where S is countable and T is finite. Prove that the ring Q[xJ of all polynomials with rational coefficients is countable.. Show that the number of decimals ending in an infinite sequence which consists exclusively of 9's is countable. . such as 1. may have two different decimal . Establish specific bijections between the set of all integers and three different proper subsets of itself.3. . 11. *14.b 1b2b3b4 . Prove that the field Q(J2) is countable (ct..999· . Tracing t This concept is like that of a "chemical element. J'he formulation of this principle involves the general concept of a cardinal number. and let t ~ tu be the given injection from T to a subset of S. if there is an injection from S to T and another from T to S. A set S which is bijective to the set R of all real numbers will be said to have the cardinal number c of the continuum (in symbols. Definition. and so we proceed to define this concept now. most of the sets occurring in geometry and analysis have one of the cardinal numbers d or c. Let s ~ ST be the given injection from S to T. Each element s of S is the image to of at most one element t = su -\ of T. o(S) = c). • •• in the original enumeration. and so on. the concept of inequality between cardinal numbers can be defined in a way which is consistent with the ordinary notion of inequality between positive integers." which is likewise an abstraction. Theorem 6 (Schroeder-Bernstein). Bernstein. this (if it exists) in turn has at most one parent s' = tT -\ = su -IT -\ in S. (The converse is trivial. using special constructions. 
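The diagonal rule in the proof of Theorem 5 (take bn = ann − 1 when the diagonal digit ann is not 0, and bn = 1 when ann = 0) can be simulated on a finite array of decimal digits. The sample rows below are invented purely for illustration:

```python
# Cantor's diagonal rule applied to a finite prefix of a hypothetical
# enumeration x_1, x_2, ... of real numbers in (0, 1).
def diagonal_digit(d):
    # b_n = a_nn - 1 if a_nn != 0, and b_n = 1 if a_nn = 0;
    # note this rule never produces the digit 9.
    return d - 1 if d != 0 else 1

# digits after the decimal point of x_1, ..., x_4 (made-up data)
rows = [[1, 4, 1, 5],
        [3, 3, 3, 3],
        [0, 0, 0, 0],
        [9, 8, 9, 8]]

b = [diagonal_digit(rows[n][n]) for n in range(len(rows))]
assert all(b[n] != rows[n][n] for n in range(len(rows)))  # differs from every x_n
assert all(digit != 9 for digit in b)                     # no trailing-9 ambiguity
print("0." + "".join(map(str, b)))  # prints 0.0217
```

However long the list, the constructed number b disagrees with the nth entry in the nth decimal place, which is the whole content of the diagonal argument.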
We denote this by the symbolic equation o(S) = 0(T). In virtue of the last sentence of § 12. one ending in an infinite succession of 9's. Definition. a specified structure).) Proof. We shall say that a set S is cardinally majorizable by a set T-and write o(S) -< o(T)-whenever there is an injection from S to T. referring to all atoms having a specified nuclear charge (i.e. hence the decimal expansion of b is in proper form to be compared with the expansions X n . The construction of b never yields a digit bn = 9. Schroeder and F. It follows that two sets Sand T have the same cardinal number (or are cardinally equivalent) if and only if there exists a bijection between them.3 387 Other Cardinal Numbers expansions. then there is a bijection between all of S and all of T. If o(S) then o(S) = 0(T).§12. .1. The trouble may be avoided by assuming that the former type of expansion (with 9's) is never used for the decimals Xlo X2. The cardinal number of a set S is the class of all setst which have a bijection onto S. Definition. the cardinal number of S is denoted by o(S). But in the long run it is much easier to prove first a general principle due to E. In words.. This can be shown case by case. -< 0(T) and 0(T) -< o(S). In practice. the other in a succession of O's. The sets Sand T are represented by points on two vertical lines. Tb Indeed.E. Sb. consider the mapping (2) between ordered couples of real numbers between 0 and 1. 13). where T is represented by the arrows slanting down to the right and u by those slanting to the left. 12 Transfinite Arithmetic 388 back in this way the ancestry of each element of S (and also of T) as far as possible. we divide S into subsets Sa. y) is oneone between -<X) < x <'+00 and 0 < y < +00. Moreover. The line segment SI: 0 < x < 1.. Sb ~ Tb. (b) elements descended from a parentless ancestor s T in S. while each element t of Ta is the parent of one and only one element tu of Sa. 
while the bijection of Sb to Tb is indicated by the lines without arrowheads. Te. y < 1 in the plane.Ch. we see that there are three alternative cases: (a) elements whose ancestry goes back forever. Combining these three bijections. z < 1 in space. ill us tra ted by Figure 3. Figure 3 which does not show the elements of infinite ancestry. The function x ~ eX = y (with inverse y ~ log. Similarly. Corresponding to these cases. Theorem 7. written in decimal form. we get a bijection between all of S and all of T. Q. Hence the function x ~ eX 1(1 + eX) = z is bijective from -<X) < x < +00 to 0 < z < 1.z) is bijective between 0 < Y < +00 and 0 < z < 1. the unit square S2: o < x. the category containing any element of S or T contains all its ancestors and descendants. Proof. u (also T!) is clearly bijective between Sa and Ta-each element of Sa is the image under u of one and only one element of Tc Ta. the function y ~ y/(l + y) = z with inverse z ~ z/(1 . T. (c) elements descended from a parentless ancestor in T. Tb. To prove the second. and T into subsets Ta. and single real numbers between 0 and 1. Se. The si tua tion is. y.D. This proves the first assertion. Sa ~ Ta. and the unit cube S3: 0 < x. while u (but not T!) bijective between Se and Te. and Se ~ Te. perhaps periodically (see Ex. T (but not uO is bijective between Sb and Tb. all have the cardinal number c. It is injective . . T = N v {a. hence by Theorem 6. *12. 4. b. 3. If o(S) < 0(T) and 0(U) = o(S). u(n) = n + 1 and cyclic on {a. Establish an explicit bijection between the set of all real numbers between 0 and 1.S3 and u the injection ( _ (3." and "transitive" apply to the relation of cardinal equivalence (o(S) = 0(T»? Do they apply to the relation o(S) < 0(T)? Give reasons. What about Sc in this case? 2. whence. Sb> Sc. if S is the set of positive integers. (You may assume the Axiom of Choice. 
it would be bijective between S2 and all of SI' This proves 0(S2) <: O(SI)' But O(SI) <: 0(S2) (by the obvious mapping x ~ (x. inclusive. 5. Sb. 0(S2) '. c}. which is c. Do the properties "reflexive. prove directly that (a) the set of nonnegative real numbers has cardinality c.3 389 Other Cardinal Numbers (although not continuous) between the square S2 and a subset of the line segment SI-if it were not that decimals consisting exclusively of 9's after one point were excluded. C}. similarly.a1a2a3 •. Without using the Schroeder-Bernstein theorem." "symmetric. T is the injection s . Further examples of sets of cardinal number c are listed exercises. Sc.S. Ta. then 0(T) < o(S).E. Tc when Sand T are the intervals -1 < s < 1/2 and -1 < ( < 1/2. and the set of all unlimited decimals . The same exercise. Tb? Tc when S = N v {a. In the Exercises 1. c}. A similar mapping shows that 0(S3) <: o(Sd. and T(n) = n + 2 and identity on {a. The same exercise. for the intervals 0 < x < 1 and 0 < x < 10. (b) the set of those positive real numbers which are not integers has cardinality c. *10. b. Ta. Prove: Any subset of real n -dimensional space which contains a continuous arc has cardinal number c. T is ( . Prove that if there is a surjection from S to all of a second set T. *11. u is s ..) 7. Determine explicitly the sets Sa. *13. 0(S3) = c. Tb. . Prove: There are c n x n matrices with quaternion coefficients.( + 1. *9. b. that if 0(T) < o(S). Q. Why is the Schroeder-Bernstein theorem trivial in case the set Tc is void? . c}. prove that o(U) < 0(T). 1/2)). then there is a surjection of S onto T. conversely.D. O(SI). T that of nonnegative integers. Prove. Determine explicitly the sets Sa. 8. 6. b.§12. Proof.4. The commutative and associative laws of addition are corollaries of the laws of Boolean algebra. the class of all couples (i. Definition. Addition is single-valued.then one can combine these into a bijection between all of S and all of T. 
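The elementary bijections used in the proof of Theorem 7 can be checked numerically. The sketch below (ours, not the book's) verifies on sample points that x → e^x/(1 + e^x) is increasing, lands in (0, 1), and is inverted by z → log(z/(1 − z)):

```python
# Numeric check of the bijection between (-inf, +inf) and (0, 1)
# used in Theorem 7, together with its inverse.
from math import exp, log

def to_unit(x):
    return exp(x) / (1 + exp(x))

def from_unit(z):
    return log(z / (1 - z))

xs = [-5.0, -1.0, 0.0, 2.5, 10.0]
zs = [to_unit(x) for x in xs]
assert all(0 < z < 1 for z in zs)        # image lies in (0, 1)
assert zs == sorted(zs)                  # strictly increasing, hence injective
assert all(abs(from_unit(z) - x) < 1e-9 for x, z in zip(xs, zs))  # inverse works
```

The same pattern of checks applies to the intermediate maps x → e^x and y → y/(1 + y) used in the proof.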
12.4. Addition and Multiplication of Cardinals

Infinite cardinal numbers can be added and multiplied just like finite ones, with preservation of all laws except the cancellation laws. Indeed, most of the laws of ordinary arithmetic apply to infinite as well as to finite numbers.

If m and n are positive integers, one may construct a set with cardinal number m + n by starting with a set S' of cardinal m (say the set 1, ···, m) and adding to it a disjoint set S" of cardinal n (say the set m + 1, m + 2, ···, m + n); the union S' ∪ S" then has cardinal m + n. Similarly, the class of all couples (i, j), where i runs through the integers 1, ···, m and j through the integers 1, ···, n (e.g., the subscripts of an m × n matrix), has the cardinal number mn. We shall not prove these familiar facts; instead, we shall point out that they suggest the following extension of the operations of ordinary addition and multiplication to infinite cardinal numbers.

Definition. Let α and β be arbitrary cardinal numbers. Then α + β is the cardinal number of those sets which are sums of disjoint subsets having α and β elements, respectively, while αβ is the cardinal number of the set of all couples (x, y), where x runs through a set of α elements and y through a set of β elements.

Theorem 8. Addition and multiplication of cardinal numbers are single-valued, commutative, and associative; multiplication is distributive over addition, and 1 is an identity for multiplication.†

Proof. Addition is single-valued, for if S and T are sums of disjoint subsets S' and S", respectively T' and T", and there are bijections between S' and T' and between S" and T", then one can combine these into a bijection between all of S and all of T. Similarly, multiplication is single-valued, whatever the sets S and T. The commutative and associative laws of addition are corollaries of the laws of Boolean algebra. The commutative law of multiplication follows, since the function (x, y) → (y, x) is bijective between the set of all couples (x, y) [x ∈ S, y ∈ T] and that of all couples (y, x) [y ∈ T, x ∈ S]. The associative law of multiplication follows from an obvious bijection from the set of all triples ((x, y), z) [x ∈ S, y ∈ T, z ∈ U] to that of all triples (x, (y, z)). Finally, if T and U are disjoint, o(S)(o(T) + o(U)) is clearly the cardinal number of the set of all couples (x, w) [x ∈ S, w in T or U], while o(S)o(T) + o(S)o(U) is that of the set of all couples (x, y) [x ∈ S, y ∈ T] plus all couples (x, z) [x ∈ S, z ∈ U]. There is an obvious bijection between these two sets, proving the distributive law. The proof that 1·α = α for any cardinal number α is trivial. Q.E.D.

Theorem 9. The cancellation laws of addition and of multiplication do not hold for infinite cardinal numbers.

Proof. The set Z+ of positive integers is divisible into the disjoint subsets of even and odd integers, and these are countable; hence, by Theorem 8, d + d = d. Again, the proof of Theorem 2 shows that d = d + 1. But this implies d + 1 = (d + 1) + 1 = d + 2, although 1 ≠ 2, which violates the cancellation law of addition. Likewise (1 + 1)d = 1·d + 1·d = d + d = d = 1·d, whence 2d = 1d, yet 2 ≠ 1, which violates the cancellation law of multiplication. Q.E.D.

It is a corollary that the system of finite and infinite cardinal numbers cannot be embedded in any system in which subtraction and division are possible. Actually, the equations α = α + 1 and α = α + α hold for all infinite cardinal numbers α, but we shall not prove this; can you prove it?

Exercises

1. Prove in detail (using Boolean algebra) that addition of cardinal numbers is commutative and associative.
2. Prove a ≠ a + 1 for any finite cardinal number a.
3. (a) If n is a finite cardinal, show that d + n = d. (b) Show likewise that dn = d.
4. Prove d + d + d = ddd = d. (Hint: See Figure 1.)
5. Prove dc = c. (Hint: Use Theorem 3.)
6. Prove c + d = c without using Theorem 6.
7. Prove c + c = c·c = c without using §12.5.
8. (a) Prove that if x ≥ d, then x + d = x. (b) Prove that if x + d = c, then x = c.
9. Prove the last statement in §12.4.
10. For a denumerable group G, consider the proof of the Lagrange theorem (Chap. 6) on the orders of possible finite subgroups S of G. (a) Show that the proof puts no restriction on the order of S. (b) Show that there may exist subgroups of any given finite order in a denumerable G.

† Unfortunately, this theorem loses much of its interest in the light of the theorem (which we shall not prove) that the sum or product of any two infinite cardinal numbers is simply the greater of the two.
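Equations like d + d = d and d·d = d come from explicit bijections. The sketch below (our own constructions, in the spirit of Figure 1) checks finite portions of two such bijections: the even/odd splitting of Z+, and a pairing that enumerates couples of positive integers along anti-diagonals:

```python
# Finite checks of the bijections behind d + d = d and d * d = d.
def split(n):
    """Z+ -> (tag, k): evens to one copy of Z+, odds to the other."""
    return (0, n // 2) if n % 2 == 0 else (1, (n + 1) // 2)

def pair(i, j):
    """Cantor-style pairing: a bijection Z+ x Z+ -> Z+ (cf. Figure 1)."""
    diag = i + j - 2               # which anti-diagonal (i, j) lies on
    return diag * (diag + 1) // 2 + i

# split is one-one: 100 inputs give 100 distinct (tag, k) outputs
assert len({split(n) for n in range(1, 101)}) == 100

# pair hits each of 1..55 exactly once on the first ten anti-diagonals
hit = sorted(pair(i, j) for i in range(1, 11) for j in range(1, 11) if i + j <= 11)
assert hit == list(range(1, 56))
```

Extending the same two maps over all of Z+ gives the bijections asserted (but not written out) in the text.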
12.5. Exponentiation

If S and T are finite sets with cardinalities m = o(S) and n = o(T), then the ordinary power n^m = o(T)^o(S) can be described as the number of functions from the set S to T. For any such function x → y determines a function y = f(x) which prescribes for each argument x in S a value y in T. To count the number of different such abstract functions f, observe that the first element x of S has just o(T) possible images; for each of these, there are o(T) choices for the image y of the second element of S, and so on, so that the number of ways of choosing all o(S) images is o(T) multiplied by itself o(S) times, or o(T)^o(S).

This combinatory characterization of o(T)^o(S) can be applied to infinite cardinal numbers.

Definition. Let α and β be arbitrary cardinal numbers. Then β^α is the number of functions from a class of α elements to a class of β elements. (We omit the essentially trivial proof that this defines a univalent operation: that if α = α' and β = β', then β^α = β'^α'.)

Theorem 10. c = 2^d.

Proof. Each real number x between 0 and 1 has a dyadic expansion .x1x2x3··· as an infinite sequence of xi equal to 0 or 1. Distinct real numbers x and y have different expansions (§4.3); hence the function f(x) = (x1, x2, x3, ···) is one-one. But the number of such sequences is by definition the number of functions from a countable domain (namely, the set of all d places in the sequence) to a domain of two elements (namely, 0 and 1). We infer that there are at most 2^d real numbers between 0 and 1, and hence, by Theorem 7, that c ≤ 2^d. On the other hand, 2^d ≤ c; hence, using Theorem 6, we get c = 2^d.

Theorem 11. The following laws on exponentiation hold for arbitrary cardinal numbers α, β, and γ:

(i) α^β α^γ = α^(β+γ);  (ii) (αβ)^γ = α^γ β^γ;  (iii) (α^β)^γ = α^(βγ);  (iv) α^1 = α and 1^α = 1.

The proofs of the two parts of (iv) are trivial. To prove identities (i)-(iii), we suppose that S, T, and U are sets of α, β, and γ elements, respectively, with T and U disjoint.

Proof of (i).
We shall conclude by proving a generalization of Theorem 5. n d = c for any n > 1. Q. But it is also the number a 'Yf3'Y of pairs of functions f(u). u) associates with each fixed u a rule fu(t). Conversely. the number of f(t. t) = (f(u).2 a • Proof.. 2c = 212d = 2d+ 1 = 2d = C . Cd = (2d)d = 2d = 2d = c 2 (d. u) in S. The number of such functions is (af3r. The number of these is by definition a (3 a 'Y. Proof of (iii). But every f(t. each such function determines and is determined by a pair (f(t). g(u» of independent functions. Thus. Proof of (ii).5 Exponentiation Consider the functions h (v) from a set V. g(u)--one from U to S and the other from U to T. Theorem 4).t. g(x). Construct a new function h(x): h(x) = 0 if gAx) = 1 and h(x) = 1 if gAx) = O. by definition. a < 2a • a Explanation. we obtain easily such rules as dd = c. let there be given any bijection x ~ gx between Sand functions with the domain S and with values 0 and 1.gAx) for all gr. u) of two variables t E T and u E U with values in S.E. Consider the functions h (u). that there exists no bijection between S and the set of all functions with domain S and with values 0 and 1. .) 3. (d) y'" < y/3.) 4. How many sets of real numbers are there whose cardinal number is c? 7. *5. a "# 2". then for all y: (a) a + y < /3 + y. Exercises 1. (b) ay < /3y. Prove c" = 2". fT(X) = 0 otherwise. (Hint: Use Ex. with fT(X) = 1 if x E T. 2. What is the number of (a) finite and (b) countable sets of real numbers? *6. and so. If a set S has cardinal a. In symbols. 7 of § 12. Show that the conventions 0 0 = 1 and 0'" = 0 for all a > 0 are consistent with laws (i)-(iv) of Theorem 11. (c) a Y < {3". 12 Transfinite Arithmetic every g". Show that if a < /3.4.394 Ch. prove that the set of all possible subsets of S has cardinal 2"'. Show that the number of subsets of the square is equal to the number of all real functions of a real variable. (Hint: Each subset T < S determines a so-called characteristic function fT(X). 
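The diagonal construction in Theorem 12 (α < 2^α) is concrete enough to run: given any assignment x → g_x of subsets of S to elements of S, the set {x : x ∉ g_x} differs from every g_x. A finite sketch (our own, for illustration only):

```python
# Theorem 12 in miniature: no map from S into subsets of S is onto,
# because the diagonal set h escapes the range of every such map g.
from itertools import product

S = {0, 1, 2}

def diagonal(g):
    """The set h = {x in S : x not in g_x}."""
    return {x for x in S if x not in g[x]}

# try every assignment g of one of these subsets to each element of S
subsets = [set(), {0}, {1, 2}, {0, 1, 2}]
for choice in product(subsets, repeat=len(S)):
    g = dict(zip(sorted(S), choice))
    h = diagonal(g)
    # h disagrees with g_x about membership of x, so h is never g_x
    assert all(h != g[x] for x in S)
```

The finite case merely illustrates the counting fact 2^n > n; the argument itself, as the text shows, works verbatim for infinite S.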
13. Rings and Ideals

13.1. Rings

In this chapter, we shall take up the study of general rings and their homomorphisms, showing how the latter are associated with ideals. We shall then apply the concept of ideals to the geometry of algebraic curves and surfaces, and (in Chap. 14) to the factorization theory of algebraic numbers.

Rings include all the integral domains and other commutative rings studied in Chapters 1-3, such as Zm (the integers modulo m) and the rings A[x] and A[x, y] of polynomials with coefficients in any given commutative ring A. They also include noncommutative rings, such as the quaternion ring of §8.11 and the set Mn(F) of all n × n matrices over any given field F, which is a ring under A + B and AB and which is also noncommutative if n > 1. Our basic postulates will be as follows.

Definition. A ring A is a system of elements which is an Abelian group under an operation of addition, and is closed under an associative operation of multiplication which is distributive with respect to addition:

(1) a(bc) = (ab)c,  a(b + c) = ab + ac,  (a + b)c = ac + bc,

for all a, b, c in the ring A. We shall also assume that every ring A has a unity 1 ≠ 0, such that 1a = a1 = a for all a ∈ A.

Much of the theory of commutative rings extends to the noncommutative case; in particular, §§13.11 and 13.12 apply whether or not ab = ba, and so does the definition of subring given in §3.3. Thus one can prove that a subset S of a ring A is a subring if and only if 1 ∈ S, while b and c in S imply that b − c and bc are in S.

If A and B are any two rings, the set of all pairs (a, b), with a in A and b in B, becomes a ring under the two operations defined by

(2) (a1, b1) + (a2, b2) = (a1 + a2, b1 + b2),  (a1, b1)(a2, b2) = (a1a2, b1b2).

The resulting ring A ⊕ B is called the direct sum of A and B. Thus if Q is the rational field, Z the domain of integers, and O the quaternion ring, then Q ⊕ Z ⊕ O is a ring. This bizarre example gives some indication of the enormous variety of rings!

Matrices and quaternions are important examples of a class of rings having an additional vector space structure.† Such rings were originally constructed as "hypercomplex number systems" more extensive than C; today, they are usually called linear associative algebras.

Definition. A linear algebra over a field F is a set 𝔄 which is a finite-dimensional vector space over F and which admits an associative and bilinear multiplication:

(3) α(βγ) = (αβ)γ  (associative),
(4) α(cβ + dγ) = c(αβ) + d(αγ),  (cα + dβ)γ = c(αγ) + d(βγ)  (bilinear),

where these laws are to hold for all scalars c and d in F and for all α, β, γ in 𝔄. The order of 𝔄 is its dimension as a vector space; 𝔄 has a unity element 1 if 1α = α = α1 for all α in 𝔄. The algebra is called a division algebra if, in addition, it contains with every α ≠ 0 an α⁻¹ for which α·α⁻¹ = α⁻¹·α = 1. In particular, every linear algebra is a ring.

† The material on linear algebras has been included primarily as a source of examples and because of its intrinsic interest; it and §13.6 can be omitted without loss of continuity.
A celebrated theorem of Frobenius (1878) states that the quaternions constitute the only noncommutative division algebra over the field of real numbers.

EXAMPLE 2. The group algebra of the cyclic group of order two with generator a has the basis 1 = a², a, and the multiplication

(x·1 + ya)(u·1 + va) = (xu + yv)·1 + (xv + yu)a.

Relative to the basis β = (1 + a)/2, γ = (1 − a)/2, it has the multiplication table β² = β, γ² = γ, βγ = γβ = 0. This example shows how an algebra may be defined by giving a suitable multiplication table for the basis elements; the requisite postulates, such as the associative law for multiplication, may then be verified.

EXAMPLE 3. Let G be any finite group, with elements a1, ···, an and multiplication ai aj = ak. If F is any field, there exists a linear algebra 𝔄 over F which has the elements of G for a basis, and in which multiplication is determined by bilinearity from the group table for G:

(x1 a1 + ··· + xn an)(y1 a1 + ··· + yn an) = Σ (xi yj)(ai aj).

This algebra is known as the group algebra of G over F.

EXAMPLE 4. The total matrix algebra Mn(F) of all n × n matrices over F has as a basis the matrices Eij, which have entry 1 in the i, j position and zeros elsewhere. The multiplication table for the basis elements is Eij Ejk = Eik and Eij Ekl = 0 (j ≠ k). The set of all those 2n × 2n matrices which have n × n blocks of zeros at the upper right and the lower left forms an algebra.

2. Construct over the real numbers an algebra of "dual numbers" which has two basis elements δ and ε, which multiply according to the rules δε = εδ = δ, ε² = ε, δ² = 0. From these rules the product of any two elements can be found; for example,

(aδ + bε)(cδ + dε) = acδ² + adδε + bcεδ + bdε² = (ad + bc)δ + bdε.
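The multiplication rule for the group algebra of the cyclic group of order two, (x·1 + ya)(u·1 + va) = (xu + yv)·1 + (xv + yu)a, and the idempotents β = (1 + a)/2, γ = (1 − a)/2 can be checked directly. A small sketch (our own encoding, with an element x·1 + ya stored as the pair (x, y)):

```python
# The group algebra of {1, a}, a^2 = 1, over the rationals.
from fractions import Fraction as F

def mul(p, q):
    (x, y), (u, v) = p, q
    return (x * u + y * v, x * v + y * u)   # (xu+yv)*1 + (xv+yu)*a

beta  = (F(1, 2), F(1, 2))    # beta  = (1 + a)/2
gamma = (F(1, 2), F(-1, 2))   # gamma = (1 - a)/2

assert mul(beta, beta) == beta            # beta^2 = beta
assert mul(gamma, gamma) == gamma         # gamma^2 = gamma
assert mul(beta, gamma) == (0, 0)         # beta*gamma = 0
assert mul(gamma, beta) == (0, 0)         # gamma*beta = 0
```

These four identities are precisely the multiplication table stated for the basis β, γ, verifying in passing that multiplication is commutative and associative here.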
Exercises

1. Prove the statement of the text characterizing a subring S.
2. Prove that the direct sum defined by (1) and (2) is actually a ring.
3. Prove that the direct sum of two integral domains is not an integral domain.
4. Define the direct sum of n given rings, and prove it a ring.
5. (a) In any ring, prove that (-a)(-b) = ab and that -(-a) = a.
   (b) Prove that a0 = 0a = 0 for all a.
6. Show that the zero element 0 of a linear algebra satisfies ξ·0 = 0 = 0·ξ for all ξ, and that the unity 1 is unique.
7. Is the algebra of dual numbers a division algebra? Justify.
8. Prove that the direct sum of two linear algebras over a field F can be made into a linear algebra over F, after suitable definition of scalar multiplication.
9. Show that the set of all those elements z in 𝔄 which commute with every element of 𝔄 is a subalgebra of 𝔄. (It is called the center of 𝔄.)
10. Prove that an n × n matrix A which commutes with every n × n matrix is necessarily a scalar matrix. (Hint: A commutes with each E_ij.)
11. Show that if P is any invertible n × n matrix over F, then A → P⁻¹AP is an automorphism of Mn(F).
12. Show that the following systems are linear algebras: (a) a vector space Vn, with ···; (b) all n × n triangular matrices (entries all 0 below the diagonal). Generalize.

13.2 Homomorphisms

Given two rings A and A′, the correspondence a → aH is called a homomorphism of A to A′ if aH is a uniquely defined element of A′ for each element a of A, and if, for all a and b in A,

(6)  (a + b)H = aH + bH,  (ab)H = (aH)(bH),  1H = 1′.

In brief, a homomorphism is a mapping which preserves unity, sums, and products. A homomorphism H from the ring A to A′ is certainly a homomorphism of the additive group of A to that of A′. Therefore, as with groups, H has the properties

(7)  0H = 0′,  (-a)H = -(aH),  (a - b)H = aH - bH.

Here 0′ is the zero element of the ring A′, the identity element of the additive group of A′.
For example, the familiar correspondence a → a_m which carries each integer a into its residue class modulo m is a homomorphism of the ring Z of integers to Z_m. If Q[x] is the ring of polynomials with rational coefficients, the correspondence f(x) → f(√2) is an epimorphism of the polynomial ring Q[x] onto the field of all numbers a + b√2 (see the discussion in §2). More generally, if f(x) is any polynomial with coefficients in an integral domain D, the correspondence f(x) → f(b) found by "substituting" for x a fixed element b of D is a homomorphism of the polynomial domain D[x] to D, just as in the commutative case of §3.3, for the rules for adding and multiplying polynomial forms in an indeterminate x certainly apply to the corresponding polynomial expressions in b. Again, the direct sum A ⊕ B of two rings A and B is mapped epimorphically on the summand B by the correspondence (a, b) → b; this correspondence preserves sums and products by the very definitions (1) and (2) of the operations in a direct sum.

As with groups, a homomorphism onto is also called an epimorphism, and we say that a ring B is an epimorphic image of a ring A under the homomorphism H when H is surjective (an epimorphism), so that every element b in B is the image aH of some a in A under H.
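The substitution homomorphism f(x) → f(b) can be checked concretely. The sketch below is a modern illustration, not from the text; it represents a polynomial by the hypothetical convention of a coefficient list with constant term first, and verifies that evaluation at a fixed b preserves the sums and products required by (6).

```python
# Evaluation f(x) |-> f(b) from Z[x] to Z preserves sums and products, as in (6).
# A polynomial is a list of coefficients, constant term first.

def poly_add(f, g):
    n = max(len(f), len(g))
    f = f + [0] * (n - len(f))
    g = g + [0] * (n - len(g))
    return [a + b for a, b in zip(f, g)]

def poly_mul(f, g):
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out

def ev(f, b):
    """Substitute the fixed element b for the indeterminate x."""
    return sum(a * b**i for i, a in enumerate(f))

f, g, b = [1, -2, 3], [4, 0, 0, 5], 7    # f = 1 - 2x + 3x^2,  g = 4 + 5x^3
assert ev(poly_add(f, g), b) == ev(f, b) + ev(g, b)
assert ev(poly_mul(f, g), b) == ev(f, b) * ev(g, b)
```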
To describe a particular homomorphism explicitly, one would naturally ask when two elements a and b of the first ring have the same image in the second. By the rule (7), this can happen only when their difference is mapped on zero, (a - b)H = 0′. Hence we search for the set of elements mapped by H on the zero element 0′ of A′.

For example, the homomorphism Z → Z_m maps onto zero all multiples km of the modulus m, and no others. The set of all these multiples is closed under subtraction, and also under multiplication by any integer of Z whatever. Similarly, for a fixed b in an integral domain D, the homomorphism f(x) → f(b) maps onto zero all polynomials divisible by (x - b). The set S of all these polynomials is also closed under subtraction and under multiplication by all members of D[x] (whether in S or not). These two examples suggest the following definition and theorem.

Definition. An ideal C in a ring A is a nonvoid subset of A with the properties
(i) c1 and c2 in C imply that c1 - c2 is in C;
(ii) c in C and a in A imply that ac and ca are in C.

Theorem 2. In any homomorphism H of a ring A, the set of all elements mapped on zero is an ideal in A.

Proof. Let C be the set of all elements c in A with cH = 0′. Then c1H = c2H = 0′ gives by (7) (c1 - c2)H = c1H - c2H = 0′, hence property (i). Moreover, for any a whatever in A, (ac)H = (aH)(cH) = (aH)0′ = 0′ and (ca)H = (cH)(aH) = 0′, which proves (ii). Q.E.D.

This result suggests that ideals in a ring are analogous to normal subgroups in a group. To express this analogy, we call the set of all elements mapped on zero by a homomorphism H the kernel of H.
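Theorem 2 can be illustrated on the epimorphism Z → Z_m. The sketch below (plain Python, added for illustration) inspects a finite window of Z, confirming that the elements mapped on zero are exactly the multiples of m and that they satisfy the two defining properties of an ideal.

```python
# Kernel of the epimorphism a |-> a mod m of Z onto Z_m, checked on a
# finite window of Z: it consists exactly of the multiples of m.
m = 6
window = range(-30, 31)
kernel = [a for a in window if a % m == 0]

assert kernel == [k * m for k in range(-5, 6)]     # precisely the multiples of m
for c1 in kernel:
    for c2 in kernel:
        assert (c1 - c2) % m == 0                  # property (i): closed under subtraction
for c in kernel:
    for a in (-7, 2, 11):
        assert (a * c) % m == 0                    # property (ii): absorbs multiplication
```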
Theorem 3. An epimorphic image of a ring A is determined up to isomorphism by its kernel.

Proof. We have to show that if H and K are epimorphisms of A onto rings A′ and A″, respectively, and if aH = 0′ if and only if aK = 0″, then A′ and A″ are isomorphic. It is natural to let an element a′ in A′ correspond to a″ in A″ if and only if these two elements have a common antecedent a in A, with aH = a′ and aK = a″. Note first that each a′ in A′ has at least one antecedent a in A and hence corresponds to at least one a″ = aK in A″. The correspondence also preserves sums and products, for if a′ ↔ a″ and b′ ↔ b″, then

a′ + b′ = (a + b)H ↔ (a + b)K = a″ + b″,   a′b′ = (ab)H ↔ (ab)K = a″b″,

where a is a common antecedent of a′ and a″, and b one for b′ and b″. Finally, the correspondence is one-one: under it each a′ in A′ corresponds to one and only one a″ in A″. For if a′ ↔ a″ and a′ ↔ b″, then aH = a′, bH = a′, aK = a″, bK = b″ for some a, b in A, whence (a - b)H = a′ - a′ = 0′, so that by hypothesis (a - b)K = 0″, implying that 0″ = a″ - b″ and a″ = b″. Q.E.D.

The whole ring A and the subset (0) consisting of 0 alone are always ideals in any ring A. They are called improper ideals of A; any other ideal is called proper. Correspondingly, a proper epimorphism of a ring A is one whose kernel is a proper ideal, so that the epimorphism is not an isomorphism (an isomorphism maps only 0 on 0′). Note that an ideal of A need not be a subring of A, since it may not contain the unity of A.

Theorem 4. A division ring has no proper epimorphic images.

Proof. It suffices to show that a division ring D can have no proper ideals. Let C be any ideal in D which is not the ideal (0), and which thus contains an element c ≠ 0. By (ii), C then contains 1 = c⁻¹c and, by (ii) again, C contains any element a = a·1 of the whole division ring. Therefore C is improper, as asserted. Q.E.D.

The two properties (i) and (ii) of an ideal have several immediate consequences. By property (i), the sum c1 + c2 = c1 - (-c2) of any two elements of C lies in C; also 0 = c - c is in C, and hence (i) shows that -c = 0 - c is in C for any c in C. Thus, for c1 and c2 in C and coefficients a1 and a2 in A, every linear combination a1c1 ± a2c2 and c1a1 ± c2a2 lies in C; conversely, a nonvoid subset C of A with this property is an ideal of A.

If b is an element in a commutative ring A, the set (b) of all multiples xb of b, for variable x in A, is an ideal, for properties (i) and (ii) may be verified; it is the smallest ideal of A containing b. This ideal (b) is known as a principal ideal.
We recall that, by Theorem 6 of §1, every ideal in the domain Z of integers is principal; by Theorem 11 of §3, the same is true in the domain F[x] of polynomials in one indeterminate over any field F.

In the ring Q[x, y] of polynomials in two variables with rational coefficients, the set C of all polynomials with constant term zero is an ideal. It is not a principal ideal, for the two polynomials x and y both lie in C and cannot both be multiples of one and the same polynomial f(x, y). But all its elements can be represented by linear combinations xg(x, y) + yh(x, y) with polynomial coefficients, so the whole ideal is given by the linear combinations of two generating elements x and y.

Consider now the ideal generated by any given finite set of elements c1, ···, cm in a commutative ring A. If an ideal C contains the elements c1, ···, cm, then it must contain all linear combinations Σ_i x_i c_i of these elements with coefficients x_i in A. But the set

(8)  (c1, c2, ···, cm) = [all elements Σ_i x_i c_i for x_i in A]

is itself an ideal, for

Σ x_i c_i - Σ y_i c_i = Σ (x_i - y_i)c_i   and   a(Σ x_i c_i) = Σ (ax_i)c_i,

so the set has the properties (i) and (ii) requisite for an ideal. Since A has a unity element 1, each c_i is necessarily one of the elements

c_i = 0·c1 + ··· + 0·c_{i-1} + 1·c_i + 0·c_{i+1} + ··· + 0·cm

in this set (8). Therefore the set (c1, ···, cm) defined by (8) is an ideal of A containing the c_i and contained in every ideal containing all the c_i. It is called the ideal with the basis c1, ···, cm. (Such basis elements do not resemble bases of vector spaces, because x1c1 + ··· + xmcm = 0 need not imply x1 = ··· = xm = 0.) In most familiar integral domains, every ideal has a finite basis, but there exist domains where this is not the case.
Exercises

1. Which of the following mappings are homomorphisms, and why? If the mapping is a homomorphism, describe the ideal mapped into zero.
(a) a → ···, a an integer in Z;
(b) f(x) → ···, f(x) a polynomial in Q[x];
(c) f(x, y) → f(t, t), mapping F[x, y] into F[t] (x, y, t indeterminates);
(d) f(x) → f(ω), ω a cube root of unity.
2. (a) Find all ideals in Z6. (b) Find all homomorphic images of Z6.
3. (a) Find all ideals in Z_m for every m. (b) Find all epimorphic images of Z_m.
4. Prove in detail that the only proper epimorphic images of Z are the rings Z/(m) = Z_m defined in Chap. 1.
5. Show that every homomorphic image of a commutative ring is commutative.
6. In an integral domain show that (a) = (b) if and only if a and b are associates (§3).
7. Find all ideals in the direct sum Z ⊕ Z.
8. Find all ideals in the direct sum of two fields. Generalize.
*9. In the ring of all rational numbers m/n with denominator relatively prime to a given prime p, prove that every proper ideal has the form (p^k) for some positive integer k.
10. In the ring Q[x, y] of polynomials f(x, y) = a + b1x + b2y + c1x² + c2xy + c3y² + ···, which of the following sets of polynomials are ideals? If the set is an ideal, find a basis for it.
(a) all f(x, y) with constant term zero (a = 0);
(b) all f(x, y) not involving x (b1 = c1 = c2 = ··· = 0);
(c) all polynomials without a quadratic term (c1 = c2 = c3 = 0).
11. If C1 and C2 are ideals in rings A1 and A2, prove that C1 ⊕ C2 is an ideal in the direct sum A1 ⊕ A2, and that every ideal in the direct sum has this form.
*12. If A is a commutative ring in which every ideal is principal, prove that any two elements a and b in A have a g.c.d. d, which has an expression d = ra + sb.
*13. Let A be a ring containing a field F with the same unity element (e.g.,
A might be a ring of polynomials over F). Prove that every proper homomorphic image of A contains a subfield isomorphic to F.

13.3 Quotient-Rings

For every homomorphism of a ring there is a corresponding ideal of elements mapped on zero. Conversely, given an ideal, we shall now construct a corresponding homomorphic image.

An ideal C in a ring A is a subgroup of the additive group of A. Since addition is commutative, C is a normal subgroup of the additive group of A, so the cosets of C form an Abelian quotient-group. Each element a in A belongs to a coset, often called the residue class a̅ = a + C, which consists of all sums a + c for variable c in C. Two elements a1 and a2 belong to the same coset if and only if their difference lies in the ideal C. The sum of two cosets is a third coset found by adding representative elements, as

(9)  (a1 + C) + (a2 + C) = (a1 + a2) + C.

This sum was shown in §6.13 to be independent of the choice of the elements a1 and a2 in the given cosets. To construct the product of two cosets, choose any element a1 + c1 in the first and any element a2 + c2 in the second. The product is always an element in the coset a1a2 + C, for in

(a1 + c1)(a2 + c2) = a1a2 + (a1c2 + c1a2 + c1c2)

the terms a1c2, c1a2, and c1c2 lie in the ideal C by property (ii) of an ideal. Therefore all products of elements in the first coset by elements in the second lie in a single coset; this product coset is

(10)  (a1 + C)(a2 + C) = a1a2 + C.

The associative and the distributive laws follow at once from the corresponding laws in A, so the cosets of C in A form a ring. The zero element is the coset 0 + C, and the coset which contains 1 acts like a unity. These results may be summarized as follows:

Theorem 5. Under the definitions (9) and (10), the cosets of any ideal C in a ring A form a ring, called the quotient-ring† A/C. The function a → a + C which carries each element of A into the coset containing it is an epimorphism of A onto the quotient-ring A/C, and the kernel of this epimorphism is the given ideal C.

† The ring A/C is also often called a residue class ring, since its elements are the residue classes (cosets) of C in A.
Proof. The correspondence a → a′ = a + C which carries each element of A into its coset is an epimorphism by the very definitions (9) and (10) of the operations on cosets, and the elements of C, and no others, are mapped upon zero. Q.E.D.

The relation of ideals to homomorphisms is now complete. In particular:

Corollary 1. If A is commutative, so is A/C.

In particular, the uniqueness assertion of Theorem 3 can be restated thus:

Corollary 2. If an epimorphism H maps A onto A′ and has the kernel C, then A′ is isomorphic to the quotient-ring A/C.

The ring Z_m of integers modulo m can now be described as the quotient-ring Z/(m). In commutative rings, prime ideals play a special role. Call an ideal P in A prime when every product ab which is in P has at least one factor, a or b, in P. Thus, in the ring Z of integers, a (principal) ideal (p) is a prime ideal if and only if p is a prime number, for a product ab of two integers is a multiple of p if and only if one of the factors is a multiple of p — when p is a prime, but not otherwise.

Every property of a quotient-ring is reflected in a corresponding property of its generating ideal C. With this example in mind, call an ideal C < A maximal† when the only ideals of A containing C are C and the ring A itself.

† "Maximal" is sometimes replaced by the term "divisorless."

Theorem 6. If A is a commutative ring, the quotient-ring A/C is an integral domain if and only if C is a prime ideal, and is a field if and only if C is a maximal ideal in A.

Proof. The commutative ring A/C is an integral domain if and only if it has no divisor of zero (§1, Theorem 1). This requirement reads formally

(11)  a′b′ = 0′ only if a′ = 0′ or b′ = 0′,

where a′ and b′ are cosets of elements a and b in A. Now a coset a′ of C is zero if and only if a is in the ideal C, so the requirement (11) may be translated as

(12)  ab in C only if a is in C or b is in C.

This is exactly the definition of a prime ideal C; thus A/C is an integral domain if and only if C is a prime ideal.

Suppose next that C is maximal, and let b be any element of A not in C. Then the set of all elements c + bx, for any c in C and any x in A, can be shown to be an ideal. This ideal contains C and contains an element b not in C; since C is maximal, it must be the whole ring A. In particular, the unity 1 is in this ideal, so for some a in A, 1 = c + ba.
In terms of cosets this equation reads 1′ = b′a′. Thus, for any nonzero coset b′ = b + C, we have found a reciprocal coset a′ = a + C, which is to say that the commutative ring of cosets is a field. Conversely, if A/C is a field, one may prove C maximal (see the Exercises). Q.E.D.

Since every field is an integral domain, Theorem 6 implies that every maximal ideal is prime. Conversely, however, a prime ideal need not be a maximal ideal. For example, consider the homomorphism f(x, y) → f(0, y) which maps the domain F[x, y] of all polynomials in x and y with coefficients in a field F onto the smaller domain F[y]. The ideal thereby mapped onto 0 is the principal ideal (x) of all polynomials which are multiples of x. Since the image ring F[y] is indeed a domain, this ideal (x) is a prime ideal, as one can also verify directly. But F[y] is not a field, so (x) cannot be maximal. It is in fact contained in the larger ideal (x, y), which consists of all polynomials with constant term zero.

In a commutative ring R one often writes a ≡ b (C), and says that a and b are congruent modulo an ideal C, when (a - b) is in C.
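In the ring Z, Theorem 6 says that Z/(m) is a field exactly when (m) is maximal, i.e. when m is prime, and has divisors of zero when m is composite. A brute-force modern sketch (plain Python, added for illustration) confirms this for small moduli:

```python
# Theorem 6 in Z: Z/(m) is a field when m is prime, and has zero divisors
# when m is composite.  Cosets are represented by residues 0, 1, ..., m-1.

def has_zero_divisor(m):
    # some nonzero residues a, b with a*b = 0 in Z/(m)?
    return any(a * b % m == 0 for a in range(1, m) for b in range(1, m))

def every_nonzero_invertible(m):
    # does each nonzero residue have a reciprocal coset, as in the proof?
    return all(any(a * b % m == 1 for b in range(1, m)) for a in range(1, m))

assert every_nonzero_invertible(7) and not has_zero_divisor(7)   # 7 prime: a field
assert has_zero_divisor(6) and not every_nonzero_invertible(6)   # 6 composite: 2*3 = 0
```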
Exercises

1. Find a familiar ring isomorphic to each of the following quotient-rings A/C:
(a) A = Q[x], C = (x - 3); (b) A = Q[x], C = (x² + 1); (c) A = Z, C = (p); (d) A = Z[x], C = (x² + 1); (e) A = Q[x, y], C = (x, y - 1).
2. Prove in detail Corollary 1.
3. Prove the associative and distributive laws for the multiplication of cosets.
4. In the polynomial ring Q[x, y], which of the following ideals are prime and which are maximal?
(a) (x²); (b) (x - 2); (c) (y - 3); (d) (x² + 1); (e) (x² - 3); (f) (x² + 1, y - 3).
5. Find all prime ideals and all maximal ideals in the ring F[x] of polynomials over a field F.
*6. Prove that if a quotient-ring A/C is a field, then C is maximal.
*7. Let congruence modulo an ideal C < A be defined so that a ≡ b (mod C) if and only if a - b is in C. Prove that congruences can be added and multiplied, and show that a coset of C consists of mutually congruent elements.
8. Find all prime ideals in the ring Z of integers.
9. Find a prime ideal which is not maximal in the domain Z[x] of all polynomials with integral coefficients.
10. Prove without using Theorem 6 that every maximal ideal of an integral domain is prime.
11. Show that, in the domain Z[ω] of all numbers a + bω (a, b integers, ω an imaginary cube root of unity), (2) is a prime ideal. Describe Z[ω]/(2).
*12. (The "Second Isomorphism Theorem.") Let C > D be two ideals in a ring A. (a) Prove that the quotient C/D is an ideal in A/D. (b) Prove that A/C is isomorphic to (A/D)/(C/D). (Hint: The product of two homomorphisms is a homomorphism; cf. Ex. 13 of §13.2.)

13.4 Algebra of Ideals

Inclusion between ideals is closely related to divisibility between numbers.
In the ring Z of integers, n | m means that m = an; hence that m is in the principal ideal (n), and that every multiple of m is a multiple of n. Therefore the condition n | m means that (m) is contained in (n). More generally, in any commutative ring R,

(13)  (b) ⊂ (a) if and only if a | b.

For if a | b, then b = ax for some x in R, and so by = axy is in (a) for all by in (b), whence (b) ⊂ (a). Conversely, (b) ⊂ (a) implies in particular that b = ax for some x in R — that is, that a | b. But beware! The "bigger" number corresponds to the "smaller" ideal; for instance, the ideal (6) of all multiples of 6 is properly contained in the ideal (2) of all even integers.

The least common multiple m of integers n and k is a multiple of n and k which is a divisor of every other common multiple. The set (m) of all multiples of m is thus the set of all common multiples of n and k, so is just the set of elements common to the principal ideals (n) and (k). This situation can be generalized to arbitrary ideals in arbitrary (not necessarily commutative) rings, as follows. The intersection B ∩ C of any two ideals B and C of a ring A may be shown to be an ideal. If D is any other ideal of A, the ideal B ∩ C has the three properties

B ∩ C ⊂ B,   B ∩ C ⊂ C,   D ⊂ B and D ⊂ C imply D ⊂ B ∩ C.

The intersection is thus the g.l.b. of B and C in the sense of lattice theory.

Dual to the intersection is the sum of two ideals. If B and C are ideals in A, one may verify that the set

(14)  B + C = [all sums b + c for b in B, c in C]

is an ideal in A. This ideal B + C contains B and C and is contained in every ideal containing B and C, since any ideal containing B and C must contain all sums b + c. Therefore B + C is a l.u.b., or join in the sense of lattice theory, of B and C. This proves

Theorem 7. The ideals in a ring A form a lattice under the ordinary inclusion relation, with the join given by the sum B + C of (14) and the meet by the intersection B ∩ C.
The g.c.d. of two integers also has an ideal-theoretic interpretation. If the integers m and n have d as g.c.d., then the ideal sum (m) + (n) is just the principal ideal (d); that is, (d) is the join of (m) and (n). For, since d has a representation d = rm + sn, d lies in (m) + (n); conversely, d | m and d | n give, by (13), (d) ⊃ (m) and (d) ⊃ (n), so any ideal containing m and n must needs contain d and so all of (d). Therefore (d) = (m) + (n).

The preceding observation can be generalized as follows:

Lemma. If ideals B and C in a commutative ring are generated by bases

(15)  B = (b1, ···, bm),   C = (c1, ···, cn),

then we have

(16)  B + C = (b1, ···, bm, c1, ···, cn).

For, as in (8), any element of B + C has the form b + c = Σ x_i b_i + Σ y_j c_j, so that B + C is generated by the b's and the c's. We leave the detailed proof to the reader. In particular:

Theorem 8. In a commutative ring R, the sum (b) + (c) of two principal ideals is itself a principal ideal (d) if and only if d is a greatest common divisor of b and c.

This rule, in combination with natural transformations of bases, may be used to compute greatest common divisors of integers explicitly. For example, the g.c.d. of 336 and 270 is 6:

(336) + (270) = (336, 270) = (336 - 270, 270) = (66, 270) = (66, 270 - 4 × 66) = (66, 6) = (6).
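The basis reductions just used — replacing one basis element by itself minus a multiple of the other, which leaves the ideal unchanged — are exactly Euclid's algorithm read as transformations of ideal bases. A minimal modern sketch (plain Python, added for illustration):

```python
# Compute the generator d of the principal ideal (b) + (c) in Z by
# repeated basis transformations (b, c) -> (c, b - q*c), i.e. Euclid's
# algorithm; by Theorem 8, d is a g.c.d. of b and c.
def principal_generator(b, c):
    while c != 0:
        b, c = c, b % c        # the ideal (b, c) is unchanged at each step
    return abs(b)

assert principal_generator(336, 270) == 6      # (336) + (270) = (6), as in the text
assert principal_generator(280, 396) == 4      # one of the pairs in the Exercises
```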
In any commutative ring, one can also define the product B·C of any two ideals B and C. It is generated by all the products bc with one factor in B and the other in C:

(17)  BC = (all products bc, b in B, c in C);

it is the smallest ideal containing all these products. More generally, if ideals B and C are determined by bases as in (15), any product bc has the form

bc = (Σ_i x_i b_i)(Σ_j y_j c_j) = Σ_{i,j} (x_i y_j)(b_i c_j).

Hence the product ideal BC has the basis

(18)  BC = (b1c1, b1c2, ···, bmcn).

In particular, the product of two principal ideals (b) and (c) is simply the principal ideal (bc) generated by the product of the given elements b and c. Such products are useful for algebraic number theory (§14).

Exercises

1. Prove in detail that B ∩ C and B + C are always ideals.
2. Prove the following rules for transforming a basis of an ideal:
(c1, c2, ···, cm) = (c1 + xc2, c2, ···, cm) = ···.
3. Compute by ideal bases the g.c.d.'s of the pairs (280, 396) and (8624, 12825).
4. Draw a lattice diagram for all the ideals in Z24.
5. If f(x) and g(x) are polynomials over a field, and d(x) is their g.c.d., prove that (f(x)) + (g(x)) = (d(x)).
6. Prove that the lattice of ideals in any ring is modular in the sense of Ex. ··· of §11.7.
7. (a) In any commutative ring, show that BC ⊂ B ∩ C. (b) Give an example to show that BC < B ∩ C is possible. (c) Prove that B(C + D) = BC + BD.
8. Prove that the product BC of (17) is an ideal.
9. Prove that every ideal in the ring Z of integers can be represented uniquely as a product of prime ideals.
10. Simplify the bases of the following ideals in R[x, y]: ···.
11. In a commutative ring A, let B : C denote the set of all elements x such that xc is in B whenever c is in C. (a) If B and C are ideals, prove that B : C is also an ideal in A. (It is called the "ideal quotient.") (b) Show that (B1 ∩ B2) : C = (B1 : C) ∩ (B2 : C). (c) Prove that B : C is the l.u.b. (join) of all ideals X with CX ⊂ B.
12. Prove that if a ring R contains ideals B and C with B ∩ C = 0 and B + C = R, then R is isomorphic to the direct sum of B and C.

13.5 Polynomial Ideals

The notion of an ideal is fundamental in modern algebraic geometry. The reason for this soon becomes apparent if one considers algebraic curves in three dimensions. Generally, in the n-dimensional vector space Fⁿ, an (affine) algebraic variety is defined as the set V of all points (x1, ···, xn) satisfying a suitable finite system of polynomial equations

(19)  f1(x1, ···, xn) = 0,  ···,  fm(x1, ···, xn) = 0.

For example, the circle C of radius 2 lying in the plane parallel to the (x, y)-plane and two units above it in space is usually described analytically as the set of points (x, y, z) in R³ satisfying the simultaneous equations

(20)  z - 2 = 0,   x² + y² - 4 = 0.

These describe the curve C as the intersection of a circular cylinder and a plane. But C can be described with equal accuracy as the intersection of a sphere with the plane z = 2, by the equivalent simultaneous equations

(21)  x² + y² + z² - 8 = 0,   z - 2 = 0.

Still another description is possible, by the equations

(22)  x² + y² - 4 = 0,   x² + y² - 2z = 0.

These describe C as the intersection of a circular cylinder with the paraboloid of revolution x² + y² = 2z. One can avoid the preceding ambiguity by describing C in terms of all the polynomial equations which its points satisfy.
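That the three systems (20), (21), (22) cut out the same circle can be spot-checked numerically: the points of C are (2 cos t, 2 sin t, 2). A small sketch (plain Python, added for illustration):

```python
# The systems (20), (21), (22) all vanish on the circle C of radius 2 in
# the plane z = 2, whose points are (2 cos t, 2 sin t, 2).
import math

for k in range(8):
    t = k * math.pi / 4
    x, y, z = 2 * math.cos(t), 2 * math.sin(t), 2.0
    assert abs(z - 2) < 1e-9 and abs(x*x + y*y - 4) < 1e-9     # (20) cylinder and plane
    assert abs(x*x + y*y + z*z - 8) < 1e-9                     # (21) sphere
    assert abs(x*x + y*y - 2*z) < 1e-9                         # (22) paraboloid
```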
If f(x, y, z) and g(x, y, z) are any two polynomials whose values are identically zero on C, then their sum and difference also vanish identically on C; so, likewise, does any multiple a(x, y, z)f(x, y, z) of f(x, y, z) by any polynomial a(x, y, z) whatsoever. This means that the set of all polynomials whose values are identically zero on C is an ideal. In the case of the circle C discussed above, J(C) is simply the ideal (x² + y² - 4, z - 2) with basis x² + y² - 4 and z - 2; that is, the ideal of all linear combinations

(23)  h(x, y, z) = a(x, y, z)(x² + y² - 4) + b(x, y, z)(z - 2).

This ideal, and not any special pair of its elements, is the ultimate description of C. The polynomials of (21) generate the same ideal,

(24)  (x² + y² - 4, z - 2) = (x² + y² + z² - 8, z - 2) = (x² + y² - 4, x² + y² - 2z),

for these polynomials are linear combinations of those of (20), while those of (20) can conversely be obtained by combination of the polynomials of (21). The polynomial ideal determined by this curve thus has various bases. The same is true of polynomials which vanish identically on a given set:

Theorem 9. In Fⁿ, the set J(S) of all polynomials which vanish identically on a given set S is an ideal in F[x1, ···, xn].

Namely, if p(x1, ···, xn) vanishes at a given point, then so do all multiples of p, while if p and q vanish there, so do p ± q; in fact, J(S) is just the intersection of the ideals J(ξ) of polynomials which vanish at the different points ξ of S.

The quotient-ring R[x, y, z]/(x² + y² - 4, z - 2) has an important meaning. It is called the ring of polynomial functions on C, for it is isomorphic with the ring of all functions on C (cf. §3) which are definable as polynomials in the variables x, y, z. It is clearly isomorphic with R[x, y]/(x² + y² - 4), and hence to the ring of all trigonometric polynomials p(cos θ, sin θ) with the usual rules of identification. Its extension to a field is called the field of rational functions on C.

The twisted cubic C₃: x = t, y = t², z = t³ is an algebraic curve which (unlike C) can be defined parametrically by polynomial functions of the parameter t. A given point (x, y, z) lies on C₃ if and only if y = x² and z = x³. Hence C₃ is the algebraic curve defined in R³ by the ideal M = (y - x², z - x³). In the further analysis of C₃, the quotient-ring R[x, y, z]/M plays an important role.
a polynomial p(x. the quotientring R[x.. . it is obvious that the locus corresponding to the sum of two ideals is the intersection of the loci determined by the ideals separately. and no others. z)y' + h(x.x 3 • This expresses C 3 as the intersection of a parabolic cylinder and another cylinder. y.an) of n-space such that I(ah . the principal ideal (x 2 + l . Similarly. our ideal M is exactly the ideal (y'. z'). because all the polynomials I(x. z )(z .. Hilbert's Basis Theorem asserts that J has a finite basis II. .2). Conversely. which consists of all points (ah . The fact that (25) defines a homomorphism is not obvious. y. (25) Clearly..2) represents the plane z = 2. in R[x. 0.xn ] determines a corresponding locus. The sum of these two ideals is 2 (x + l .. t 2 . . y'. y. z .x 3 will lie in our ideal M. t 3 ) = 0 for all t E R. Show that any ideal (ax + by + CZ.6 413 Ideals in Linear Algebras Exercises 1. z] determine the same algebraic variety. (b) Show that any ideal and its square determine the same locus. an ideal in our previous sense is called a two-sided ideal. y. it fails to make zero at least one of the polynomials in the first ideal. Find the ideal belonging to the curve with the parametric equations x = t + 1. the matrices in which the first column is all zero form a left ideal but do not form a two-sided ideal. .x 2 .y and ax lie in L whenever x and yare in L and a is in A. (b) If H is an ideal in the polynomial ring C[x. Describe the locus determined by x 2 + y2 = 0 (a) in R3 and (b) in C. (a) Show that the ideals (x. 5. in the ring M2 of all 2 x 2 matrices. V the corresponding locus. 4. xn ] vanishing identically on any locus C is an ideal. z]. (a) What is the locus determined by xy = 0 in three-dimensional space? (b) Prove that the locus determined by the product of two principal ideals is the union of the loci determined by the two ideals individually. 
a'x + b'y + c'z) generated by two linearly independent linear polynomials determines a line in R3 . q polynomials) is a group.6. any such linear algebra A is a ring... (a) Compute the inverse of the "birational" transformation T: x' = x. z = z' + q(x. Show in detail that the set of polynomials in R[xl> . In contrast to these notions. xy. y) (p. These concepts may be profitably applied to a linear algebra A with a unity element 1. prove that l( V) contains JH. the radical of H is the set JH of all x in A with some power xm in H. Z' = Y = x 3 • (b) Prove that the set of all substitutions of the form x' = x. (c) Generalize to arbitrary ideals. y) and (x 2. For example. y' = y + p(x).) 8. y' = y . (a) If H is an ideal in a commutative ring A. A left ideal L in a ring A is a subset of A such that x . In this case.§13. *13. 7. (c) Show that each such substitution induces an automorphism on the ring R[x. z = t 4 + t 2 in R3. Prove that JH is an ideal. 2.. y. (Hint: If a point in the locus of the product is not in the locus determined by the first factor.) (d) What is the locus determined by the intersection of two ideals? 6. y. 3.1. y = t 3 . A right ideal may be similarly defined. y2) in R[x. as observed in §13. (The Hilbert Nullstellensatz asserts that 1(V) = JH. Ideals in Linear Algebras In a noncommutative ring one may consider "one-sided" ideals. z]. any left ideal L or right ideal R is closed also with respect to . A famous theorem of Frobenius asserts that the only division algebras over the real field Rare R.Ch. This converse asserts that. as follows. j) position and zero elsewhere. Using the fundamental theorem of algebra. C.s)-l L EkrEijEskaij = Ekk i. one can prove that the only division algebra over the complex field Cis C itself. and the algebra of quaternions (§8. which have entry 1 in the (i. Each matrix (26) (a. Wedderburn (1908) proved a celebrated converse of Theorem 10. (27) Wedderburn's result is that if F is any field. 
To add or multiply two n × n matrices with coefficients in a division algebra D, apply the ordinary rules; thus

||a_ij|| + ||b_ij|| = ||a_ij + b_ij||.

One can construct in this way a total matrix algebra Mn(D) of any order n over any division ring D. As observed in §13.1, any such linear algebra A is a ring. If A is regarded as a linear space over its field F of scalars, any left ideal L or right ideal R is closed also with respect to scalar multiplication, and so is a subspace: if g is any element in L and c any scalar, then L contains cg, because cg = (c · 1)g is the product of an element in L by some element c · 1 in A.

A linear algebra is said to be simple if it has no proper (two-sided) ideals. In particular, a simple algebra has no proper homomorphic images.

Theorem 10. The algebra of all n × n matrices over a field is simple.

Proof. This algebra Mn has as a basis the n^2 matrices E_ij, which have entry 1 in the (i, j) position and zero elsewhere. A proper ideal B in Mn would contain at least one nonzero matrix A = Σ a_ij E_ij, with a coefficient a_rs ≠ 0. Each matrix

(26)  (a_rs)^(-1) E_kr A E_sk = (a_rs)^(-1) Σ_i,j E_kr E_ij E_sk a_ij = E_kk

then lies in B. Consequently, the identity matrix I = Σ E_kk is in B, so B must be the whole algebra, and is improper. Q.E.D.

To handle the general case, one needs the concept of a division algebra, by which is meant a linear algebra which is a division ring. Using the fundamental theorem of algebra, one can prove that the only division algebra over the complex field C is C itself. Thus, every simple algebra over the field C of complex numbers is isomorphic to the algebra of all n × n matrices over C.

Exercises

1. Find all right ideals in a division algebra.
2. Discuss the algebra of left ideals in a ring, describing sums, intersections, and principal left ideals.
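The one-line computation (26) in the proof of Theorem 10 can be carried out explicitly for 3 × 3 matrices. A Python sketch (the matrix routines are written out for the occasion and are not from the text):

```python
from fractions import Fraction

# A sketch of computation (26): from one nonzero matrix A with entry
# a_rs != 0, the products (a_rs)^(-1) E_kr A E_sk recover every E_kk.

n = 3

def E(i, j):  # matrix unit: 1 in position (i, j), zero elsewhere
    return [[Fraction(int(r == i and c == j)) for c in range(n)] for r in range(n)]

def matmul(A, B):
    return [[sum(A[r][k] * B[k][c] for k in range(n)) for c in range(n)] for r in range(n)]

def scale(s, A):
    return [[s * x for x in row] for row in A]

# a nonzero matrix in a would-be proper ideal, with a_rs != 0 at (r, s) = (1, 2)
A = [[Fraction(0), Fraction(0), Fraction(0)],
     [Fraction(0), Fraction(0), Fraction(5)],
     [Fraction(0), Fraction(0), Fraction(0)]]
r, s = 1, 2

# (a_rs)^(-1) E_kr A E_sk = E_kk for each k, so the ideal contains every
# E_kk, hence the identity I, hence all of M_n.
for k in range(n):
    B = scale(1 / A[r][s], matmul(matmul(E(k, r), A), E(s, k)))
    assert B == E(k, k)
```

The same loop works for any nonzero matrix A once (r, s) is chosen at a nonzero entry, which is exactly how the proof concludes that B is improper.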
Extend Theorem 10 to the total matrix algebra Mn(D) over an arbitrary division ring D.6-7. 4. it is defined for any m E Z and a E R. and principal left ideals. + (-a) (n summands). m x (n x a) = (mn) x a. 5. (a) If S is a subspace of the vector space F". These natural mUltiples of elements in a domain D have all the properties which have been proved in §6. 3.7.. .415 §13. hence.. intersections. 2. (Hint: Show that every row of a matrix of C is the first row of a matrix in C which has its remaining rows all zero. Thus. The cyclic subgroup generated by any a E R consists of the mth powers of a. (29) (-n) x a = n x (-a) = (-a) + (-a) + .) *6. in the mUltiplicative notation. The Characteristic of a Ring Any ring R can be considered as an additive (Abelian) group. (28) mXa=a+a+"'+a (m summands). 13. Prove that every division algebra is simple.7 The Characteristic of a Ring Exercises 1. Show that every quotient-ring of a linear algebra over F is itself a linear algebra. while if m = -n is negative.7. where m ranges over the integers. describing sums. *(b) Show that every left ideal C of Mn(F) is one of those described in part (a). Use the methods of §§7. We call m x a the mth natural multiple of a. 1) = (mn) x 1.Ch. One general distributive law (see § 1. Moreover. Q. The characteristic of a ring R is the number m of distinct natural multiples m X 1 of its unity element 1. setting a = b ::::: 1 in (33).. m X b ::::: 0 if and only if (m x l)b = 0. the unity (multiplicative identity) of R. + a)b = ab + ab + . Finally. Definition. + ab is another gen- (mn) x (ab). I .. the characteristic of D. eral distributive law. the product of b with the m th natural multiple of 1... we obtain (33') (m x l)(n x 1) = (mn) x (1. The mapping m ~ m x 1 is a homomorphism from the ring Z into R for any ring R. for with m = -n the definition (29) gives (-n) X ab = n x (-ab) = [n x (-a)Jb = The rule (a + .. It may be reformulated as (33) (m x a)(n x b) = [(-n) x alb. 
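The laws (30)-(33) for natural multiples, stated on this and the following page, can be checked by brute force in a small commutative ring. A Python sketch (not from the text) working in Z_12, with m × a computed by repeated addition as in definitions (28) and (29):

```python
import itertools

# A sketch, not in the text: brute-force check of the natural-multiple
# laws (30)-(33) in the commutative ring Z_12.

M = 12  # working modulo 12

def nmul(m, a):
    """The natural multiple m x a, by repeated addition of a or -a."""
    s = 0
    for _ in range(abs(m)):
        s = (s + (a if m > 0 else -a)) % M
    return s

for m, n in itertools.product(range(-4, 5), repeat=2):
    for a, b in itertools.product(range(M), repeat=2):
        assert (nmul(m, a) + nmul(n, a)) % M == nmul(m + n, a)            # (30)
        assert nmul(m, nmul(n, a)) == nmul(m * n, a)                      # (31)
        assert (nmul(m, a) * b) % M == nmul(m, (a * b) % M)               # (32)
        assert (nmul(m, a) * nmul(n, b)) % M == nmul(m * n, (a * b) % M)  # (33)
```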
There are further properties which result from the distributive law. One general distributive law (see §1.5) is

(a + a + ··· + a)b = ab + ab + ··· + ab   (m summands).

In terms of natural multiples, this becomes

(32)  (m × a)b = m × (ab) = a(m × b).

This also holds for m = 0 and for negative m, for with m = −n the definition (29) gives

(−n) × (ab) = n × (−(ab)) = [n × (−a)]b = [(−n) × a]b.

The rule (a + ··· + a)(b + ··· + b) = ab + ··· + ab is another general distributive law. It may be reformulated as

(33)  (m × a)(n × b) = (mn) × (ab).

This also is valid for all integers m and n, positive, negative, or zero. Setting a = 1, (32) shows that m × b is just (m × 1)b, the product of b with the mth natural multiple of 1, the unity (multiplicative identity) of R. Setting a = b = 1 in (33), we obtain

(33')  (m × 1)(n × 1) = (mn) × (1 · 1) = (mn) × 1.

Finally, setting a = 1 in (30), we see that the mapping m ↦ m × 1 from Z into R preserves sums; by (33'), the mapping preserves products. This proves

Theorem 11. The mapping m ↦ m × 1 is a homomorphism from the ring Z into R, for any ring R.

Definition. The characteristic of a ring R is the number m of distinct natural multiples m × 1 of its unity element 1.

Corollary 1. The set of natural multiples of 1 in any ring R is a subring isomorphic to Z or to Zm for some integer m > 1.

Corollary 2. In the additive group of an integral domain D, all nonzero elements have the same order, namely the characteristic of D.

Proof. For all nonzero b in D, m × b = 0 if and only if (m × 1)b = 0, which is equivalent by the cancellation law to m × 1 = 0. Q.E.D.

The domain Z of all integers has characteristic ∞,† while the domain Zp has characteristic p. These are the only characteristics possible:

Theorem 12. The characteristic of an integral domain is either ∞ or a positive prime p.

† Most writers use "characteristic 0" in place of "characteristic ∞."

To prove this, suppose, to the contrary, that some domain D had a finite characteristic which was composite, as m = rs. Then by (33'), the ring unity 1 of D satisfies

0 = m × 1 = (rs) × 1 = (r × 1) · (s × 1).

By the cancellation law, either r × 1 = 0 or s × 1 = 0. Hence the characteristic must be a divisor of r or of s, and not m, as assumed.

Corollary. In any domain, the additive subgroup generated by the unity element is a subdomain isomorphic to Z or to Zp.

The binomial formula (9) of §1.5 illustrates the value of natural multiples. In any commutative ring R, the expansion

(a + b)^2 = a^2 + ab + ba + b^2 = a^2 + 2 × (ab) + b^2

has a middle term which is, properly speaking, a natural multiple 2 × (ab). More generally, the proof by induction given in §1.5 of the binomial formula (9) there involves the binomial coefficients as natural multiples, and so we can write

(34)  (a + b)^n = Σ (n over i) × (a^(n−i) b^i),  summed for i = 0, 1, ..., n,

where the coefficients (n over i) are natural integers given by the formulas

(35)  (n over i) = n!/[(n − i)! i!],  i = 0, 1, ..., n,

and where n! = n(n − 1) ··· 3 · 2 · 1 and 0! = 1.

Theorem 13. In any commutative ring R of prime characteristic p, the correspondence a ↦ a^p is a homomorphism; that is, we are required to prove that 1^p = 1, that (ab)^p = a^p b^p, and that (a ± b)^p = a^p ± b^p for all a, b ∈ R.

Proof. The first two equations hold in every commutative ring. As for the third, set n = p in formulas (34) and (35). Since p is a prime, the numerator of (p over i) contains the factor p, while for 0 < i < p the prime p is not divisible by any of the factors of i! or (p − i)!. Hence all the binomial coefficients in (34) with 0 < i < p are multiples of p. But the ring R has characteristic p; hence all terms in (34) with a factor (p over i), 0 < i < p, drop out. There follows the identity

(36)  (a + b)^p = a^p + b^p,

completing the proof.

Corollary. In a finite field F of characteristic p the correspondence a ↦ a^p is an automorphism.

Proof. Since a^p = 0 implies a = 0 in F, the kernel of the homomorphism a ↦ a^p is 0, and the homomorphism is one-one. Since F is finite, this implies that a ↦ a^p is also onto, hence an automorphism.

Exercises

1. Show that the natural multiple m × a can be defined for positive m by the "recursion formulas" 1 × a = a, (m + 1) × a = m × a + a.
2. Prove by induction the rules (30) and (32) for positive natural multiples.
3. Obtain Fermat's theorem (§1.9, Theorem 18) as a corollary of Theorem 13.
4. What can you say about the characteristic of an ordered integral domain?
5. (a) Show that α: a ↦ a^p is one-one (a monomorphism) in any integral domain D of characteristic p.
(b) Show that if D = Zp(x), then the image of α is a proper subdomain of D.
(c) Show that a finite field must have a proper automorphism unless it is one of the "prime" fields Zp.

13.8. Characteristics of Fields

Since a field is defined as an integral domain in which division (except by zero) is possible, the discussion of characteristics applies at once to fields.
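The identity (36) of Theorem 13 and the automorphism property of a ↦ a^p can be checked exhaustively in the prime fields Zp. A Python sketch (not from the text):

```python
# A sketch, not in the text: checking (a + b)^p = a^p + b^p, the
# multiplicativity of a -> a^p, and its bijectivity, in Z_p.

def frob(a, p):
    """The Frobenius map a -> a^p, computed in Z_p."""
    return pow(a, p, p)

for p in (2, 3, 5, 7):
    for a in range(p):
        for b in range(p):
            assert frob((a + b) % p, p) == (frob(a, p) + frob(b, p)) % p
            assert frob((a * b) % p, p) == (frob(a, p) * frob(b, p)) % p
    # one-one and onto; in Z_p itself, Fermat's theorem a^p = a makes
    # the map the identity (cf. Ex. 3, which derives Fermat's theorem
    # from Theorem 13)
    assert sorted(frob(a, p) for a in range(p)) == list(range(p))
    assert all(frob(a, p) == a for a in range(p))
```

In a field of p^n elements with n > 1 the same map is an automorphism but no longer the identity, which is the point of Ex. 5(c).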
If a field F has characteristic p, then by Theorem 12 the additive subgroup of F generated by its unity element is a subfield and is isomorphic to the finite field composed of the integers modulo p. If a field F has characteristic ∞, then by Theorem 12 the subgroup generated by the unity element 1 consists of all multiples m × 1, and so the subfield generated by 1 is composed of all the quotients (m × 1)/(n × 1), with n ≠ 0. This subfield is the field of quotients of the subdomain of all multiples m × 1. As such, it is isomorphic to the field of rational numbers, which is the field of quotients of the domain of the integers, under the correspondence m ↦ m × 1 (cf. Corollary 2 of Theorem 18). Indeed, the map (m × 1)/(n × 1) ↦ m/n is an isomorphism between the subfield generated by 1 and the field of rational numbers, and it preserves all four rational operations in such a field F. In a field of characteristic ∞, then, the subfield generated by the unity element is isomorphic to the field Q of all rational numbers.

In dealing with a single field F, it is thus possible (and convenient) to identify each quotient (m × 1)/(n × 1) with its corresponding rational number m/n. With this convention, each field of characteristic ∞ may be said to contain all the rational numbers m/n. By a similar convention, every field of characteristic p may be said to contain the field Zp. In this sense, every field is an extension of one of the minimal fields (so-called prime fields) Q and Zp. This proves the following result (cf. §2.6):

Theorem 14. Every field is an extension of one of the prime fields Q or Zp.

Therefore it is natural to begin a systematic classification of fields with a survey of the ways of extending a given field. Such a survey will be made in the next chapter.

Exercises

1. Let F4 be any field with exactly four elements.
(a) Show that F4 has characteristic 2.
(b) Show that both elements not in the prime subfield Z2 of F4 satisfy x^2 = x + 1.
(c) Using this fact, show that F4 is isomorphic to the field Z[ω]/(2) of Ex. 8, §2.2.
2. Find all automorphisms of the field F4 of Ex. 1.
3. Show that the conventional formula for the solution of a quadratic equation applies to any field of characteristic not 2.
4. Over which fields is the usual formula (§5.5) for solving a cubic equation valid?

14. Algebraic Number Fields

14.1. Algebraic and Transcendental Extensions

The remaining two chapters are concerned with solutions of polynomial equations p(x) = 0 over a general field F and their properties. It will be shown that any such equation can be solved in a suitable extension of F: p(x) = 0 always has one root in the quotient-ring F[x]/(p) of the polynomial ring F[x] by the principal ideal of multiples of p. After describing general properties of such extensions, we will study specifically the field of all "algebraic numbers" obtained by extending the rational field Q in this way. A brief introduction is given to algebraic number theory, through the problem of proving unique factorization theorems for "integers" in certain quadratic extensions Q[x]/(x^2 − r), r ∈ Z. Thus the Gaussian integers m + n√−1 (the case r = −1) can be uniquely factored into Gaussian primes.

The simplest kind of extension K of a field F (by which is meant a field K containing F as a subfield) is that consisting of all rational expressions p(c)/q(c) = (Σ a_k c^k)/(Σ b_l c^l) of a single element c ∈ K with coefficients a_k, b_l in F. For example, the complex numbers a + bi are generated by the reals and the single complex number i; the field Q(√2) is generated by a root √2 of the equation x^2 − 2 = 0 and consists of all real numbers a + b√2 with rational coefficients a and b (see the example of §2.2); while the field Q(x) of all rational forms (with rational coefficients) in an indeterminate x is generated by the field Q and the element x.
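The four-element field F4 of Ex. 1 above can be realized concretely. A Python sketch (representing elements as pairs (a, b) = a + b·t over Z2, with t a root of t^2 + t + 1 = 0; this representation is our own choice, though any field of four elements is isomorphic to it):

```python
# A sketch, not in the text: F_4 as pairs (a, b) = a + b*t over Z_2,
# with t^2 = t + 1; both elements outside Z_2 satisfy x^2 = x + 1.

def mul(x, y):
    (a, b), (c, d) = x, y
    # (a + bt)(c + dt) = ac + (ad + bc)t + bd*t^2, and t^2 = t + 1
    return ((a * c + b * d) % 2, (a * d + b * c + b * d) % 2)

def add(x, y):
    return ((x[0] + y[0]) % 2, (x[1] + y[1]) % 2)

one, t, t1 = (1, 0), (0, 1), (1, 1)     # t1 is t + 1
for x in (t, t1):                       # the two elements not in Z_2
    assert mul(x, x) == add(x, one)     # x^2 = x + 1
assert add(one, one) == (0, 0)          # characteristic 2
assert mul(t1, t1) == t                 # squaring swaps t and t + 1
```

The last assertion exhibits the nonidentity automorphism a ↦ a^2 asked for in Ex. 2.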
A field K is called a simple extension of its subfield F if K is generated over F by a single element c, so that K = F(c). The fields Q(√2), Q(∜5), and Q(ω) discussed in §2.1 are all instances of simple extensions. It can be proved that any extension of F whatever is obtainable by a finite or (well-ordered) transfinite sequence of simple extensions.

A single field may be generated in several different ways. For example, over the field of rational numbers a different equation x^2 + 4x + 2 = 0 has a root −2 + √2 which generates the same field Q(√2), for any number in the field can be expressed in terms of this new generator as a + b√2 = (a + 2b) + b(−2 + √2). The usual process of completing the square, applied to this equation, gives x^2 + 4x + 2 = (x + 2)^2 − 2, so that y = x + 2 satisfies a new equation y^2 − 2 = 0 with a root generating the same field. The use of a transformation of variables to simplify an equation thus corresponds to the choice of a new generator for the corresponding field.

Let us describe in general the subfield generated by a given element in any extension of a field F. Let K be a given field, F a subfield of K, and c an element of K. Consider those elements of K which are given by polynomial expressions of the form

(1)  a0 + a1·c + a2·c^2 + ··· + ak·c^k   (each ai in F).

By the rules for operating with polynomials, the set of all such expressions is closed under addition, subtraction, and multiplication. Conversely, any subdomain of K containing F and c necessarily contains all such elements f(c). Therefore these expressions (1) constitute the subdomain of K generated by F and c. This subdomain is conventionally denoted by F[c], with square brackets. If f(c) and g(c) ≠ 0 are polynomial expressions like (1), their quotient f(c)/g(c) is an element of K, called a rational expression in c with coefficients in F. The set of all such quotients is a subfield; it is the field generated by F and c and is conventionally denoted by F(c), with round brackets.

Over the field of rational numbers, some complex numbers, such as i and √2 + i, satisfy polynomial equations with rational coefficients. There are other numbers, like π and e = 2.71828..., which can be shown to satisfy no such equations (except trivial ones). The latter numbers are called "transcendental." This important dichotomy applies to elements over any field.

Definition. An element c of K will be called algebraic over F if c satisfies a polynomial equation with coefficients not all zero in F,

(2)  a0 + a1·c + a2·c^2 + ··· + an·c^n = 0   (ai in F, not all 0).

An element c of K which is not algebraic over F is called transcendental over F. A simple extension K = F(c) is said to be algebraic or transcendental over F, according as the generating element c is algebraic or transcendental over F.
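The closure of Q(√2) under the four rational operations can be made concrete by representing a + b√2 as the pair (a, b) of rationals. A Python sketch (not from the text; the inverse is found by "rationalizing the denominator" with the conjugate a − b√2):

```python
from fractions import Fraction as F

# A sketch, not in the text: Q(sqrt(2)) as pairs (a, b) = a + b*sqrt(2)
# with exact rational coefficients.

def mul(x, y):
    (a, b), (c, d) = x, y
    # (a + b*s)(c + d*s) = (ac + 2bd) + (ad + bc)*s, where s^2 = 2
    return (a * c + 2 * b * d, a * d + b * c)

def inv(x):
    a, b = x
    n = a * a - 2 * b * b          # nonzero since sqrt(2) is irrational
    return (a / n, -b / n)

x = (F(1), F(1))                   # 1 + sqrt(2)
assert mul(x, inv(x)) == (F(1), F(0))
# (1 + sqrt(2))(3 - sqrt(2)) = 1 + 2*sqrt(2)
assert mul(x, (F(3), F(-1))) == (F(1), F(2))
```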
(c) Express each such element as a root of a quadratic equation with coefficients in Q. and c ~ x. The structure of a simple transcendental extension is especially easy to describe. describe the field Q(Jd). The extension F(c) clearly contains F and all the rational expressions f(c)/ g(c) with coefficients in F. (a) If d is an integer which is not a square.2 to give the isomorphism f(c)/g(c) ~ f(x)/ g(x) between F(c) and F(x). 14 Algebraic Number Fields coefficients not all zero in F. because otherwise the difference fl(e) . this correspondence is an isomorphism./7. If c is transcendental over F.. If two polynomial expressions fl (c) and fz(c) are equal in F(c).. Therefore the correspondence f(c) ~ f(x) is a bijection between the domain FIc] and the domain FIx] of polynomial forms in an indeterminate x. where f and g are polynomials of smaller degree. that is..1 = 0 for coefficients ai in F if and only if ao = al = .. it is evident that 4>' is .§14. the mapping f(x) ~ f(u) will be shown to be an isomorphism 4>': F[x]/(p) ~ F(u) of fields between the quotient-ring F[x]/(p) and F(u). call it p. then h(u) = 0 if and only if h is a multiple of p in the domain F[x]. = an-l = o.4 = 0. This p is irreducible. Corollary. generated by F and a single element u algebraic over F. Theorem 2. The polynomials h E F[x] with h(u) = 0 constitute an ideal in F[x]. The proof is complete. if and only if h is in the principal ideal.2 Elements Algebraic over a Field 423 14. The minimal polynomial of an element u algebraic over a field F is the (unique) monic irreducible polynomial p E F[x] with p(u) = 0.2x = 0. x .. for otherwise it could be factored as p = fg. If an element u of an extension K of a field F is algebraic over F. x . Elements Algebraic over a Field We next investigate the nature of simple algebraic extensions of a field F. this element must satisfy over F a polynomial equation of degree at least one.2.(p) of F[x]. If the element u has degree n over a field F. 
14.2. Elements Algebraic over a Field

We next investigate the nature of simple algebraic extensions of a field F. If an element u of an extension K of F is algebraic over F, this element must satisfy over F a polynomial equation of degree at least one. The same element u may satisfy many different equations; for example, √2 is a root of x^2 − 2 = 0, of x^3 − 2x = 0, and of x^4 − 4 = 0. But it is the root of just one irreducible and monic polynomial equation (see also Ex. 6 below).

Theorem 2. If an element u of an extension K of a field F is algebraic over F, then u is a zero of one and only one monic polynomial p(x) that is irreducible in the polynomial domain F[x]. Moreover, if h is another polynomial in F[x], then h(u) = 0 if and only if h is a multiple of p in the domain F[x], if and only if h is in the principal ideal (p) of F[x].

Proof. The polynomials h ∈ F[x] with h(u) = 0 constitute an ideal in F[x]; this ideal is just the kernel of the homomorphism φu: F[x] → K defined by the "evaluation map" p ↦ p(u) that assigns to each polynomial p its value at u ∈ K. Like all ideals of F[x], this ideal is principal (§3.8, Theorem 11), and so consists of all multiples of any one of its members of least degree, call it p. This p is irreducible, for otherwise it could be factored as p = fg, where f and g are polynomials of smaller degree, which would imply f(u)g(u) = p(u) = 0, so either f(u) = 0 or g(u) = 0, contrary to the choice of p as a polynomial of least degree with p(u) = 0. Just one of these polynomials of least degree is monic. The proof is complete.

Definition. The minimal polynomial of an element u algebraic over a field F is the (unique) monic irreducible polynomial p ∈ F[x] with p(u) = 0. By definition, the degree n = [u : F] of u over F is the degree of this polynomial.

Corollary. If the element u has degree n over a field F, then one has a0 + a1·u + ··· + a(n−1)·u^(n−1) = 0 for coefficients ai in F if and only if a0 = a1 = ··· = a(n−1) = 0.

We are now in a position to describe the subfield of K generated by F and our algebraic element u. This subfield F(u) clearly contains the subdomain F[u] of all elements expressible as polynomials f(u) with coefficients in F. The rest of this section will be concerned with this result: the mapping f(x) ↦ f(u) will be shown to induce an isomorphism φ': F[x]/(p) → F(u) of fields between the quotient-ring F[x]/(p) and F(u).

Theorem 3. Let K be any field, and u an element of K algebraic over the subfield F of K. Then the mapping φ': f(x) ↦ f(u) from the polynomial domain F[x] to F(u) is an epimorphism with kernel (p(x)), where p is the monic irreducible polynomial of u over F.

Proof. From the formulas for adding and multiplying polynomials, it is evident that φ' is an epimorphism from F[x] to the subdomain F[u]; by Theorem 2, its kernel is the principal ideal (p(x)).

But actually, the domain F[u] is a subfield. To show this, let us find an inverse for any element f(u) ≠ 0 in F[u]. The statement that f(u) ≠ 0 means that u is not a root of f(x), hence by Theorem 2 that f(x) is not a multiple of the irreducible polynomial p(x), hence that f(x) and p(x) are relatively prime. Therefore we can write

(3)  1 = t(x)f(x) + s(x)p(x)

for suitable polynomials t(x) and s(x) in F[x]. The corresponding equation in F[u] is

(3')  1 = t(u)f(u).

This states that the nonzero element f(u) of F[u] does have a reciprocal t(u) which is also a polynomial† in u, and shows that F[u] is a subfield of K. Since, conversely, every subfield of K which contains F and u evidently contains every polynomial f(u) in F[u], we see that F[u] is the subfield of K generated by F and u. We have proved

Theorem 4. If u is algebraic over the subfield F of K, then F[u] = F(u); in Theorem 3, therefore, F(u) is isomorphic to the quotient-ring F[x]/(p).

The quotient-ring F[x]/(p) can be described very simply. Each polynomial f(x) ∈ F[x] is congruent modulo (p) to its remainder r(x) = f(x) − q(x)p(x) when divided by p(x), and this is a unique polynomial

(4)  r(x) = r0 + r1·x + ··· + r(n−1)·x^(n−1)

of degree less than n, with coefficients in F. To add or subtract two such polynomials, just do the same to their coefficients. To multiply them, form their polynomial product as in §3.1 and compute the remainder under division by p(x).

We have thus characterized the subfield of K generated by F and a given u ∈ K in terms of the minimal (i.e., monic irreducible) polynomial p over F such that p(u) = 0. Alternatively, we can start just with F and an irreducible polynomial p and construct a larger field containing a root of p(x) = 0; this will be done in §14.3.

† For example, in Q[√3], 1 + √3 has the multiplicative inverse, found by "rationalization of the denominator," 1/(1 + √3) = (1 − √3)/(1 + √3)(1 − √3) = −(1 − √3)/2 = −1/2 + (1/2)√3.
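The Bezout identity (3) that produces the reciprocal t(u) can be found by the extended Euclidean algorithm for polynomials. A Python sketch (coefficient lists in increasing degree over the rationals; all routine names are our own) that recovers the footnote's inverse of 1 + √3, with p(x) = x^2 − 3:

```python
from fractions import Fraction as F

# A sketch, not in the text: computing t(x) with t(x)f(x) = 1 (mod p(x))
# by the extended Euclidean algorithm, as in equation (3).
# Polynomials are lists of Fractions, lowest degree first.

def polydivmod(f, g):
    f = f[:]
    q = [F(0)] * max(len(f) - len(g) + 1, 1)
    while len(f) >= len(g) and any(f):
        c = f[-1] / g[-1]
        d = len(f) - len(g)
        q[d] = c
        for i, gc in enumerate(g):
            f[i + d] -= c * gc
        while len(f) > 1 and f[-1] == 0:
            f.pop()
    return q, f

def polymul(f, g):
    out = [F(0)] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out

def polysub(f, g):
    m = max(len(f), len(g))
    f = f + [F(0)] * (m - len(f)); g = g + [F(0)] * (m - len(g))
    r = [a - b for a, b in zip(f, g)]
    while len(r) > 1 and r[-1] == 0:
        r.pop()
    return r

def inverse_mod(f, p):
    """t with t*f = 1 (mod p), assuming gcd(f, p) is a nonzero constant."""
    r0, r1 = p, f
    t0, t1 = [F(0)], [F(1)]
    while any(r1):
        q, r = polydivmod(r0, r1)
        r0, r1 = r1, r
        t0, t1 = t1, polysub(t0, polymul(q, t1))
    return [c / r0[0] for c in t0]   # normalize the constant gcd to 1

p = [F(-3), F(0), F(1)]              # p(x) = x^2 - 3, minimal for sqrt(3)
f = [F(1), F(1)]                     # f(x) = 1 + x, i.e. 1 + sqrt(3)
t = inverse_mod(f, p)
assert t == [F(-1, 2), F(1, 2)]      # t(x) = -1/2 + x/2
_, rem = polydivmod(polymul(t, f), p)
assert rem == [F(1)]                 # t(u)f(u) = 1 in Q(sqrt(3))
```

Evaluating t at √3 gives −1/2 + (1/2)√3, exactly the inverse obtained in the footnote by rationalizing the denominator.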
Represent the field Q(Ji) as a quotient-ring from the domain Q[x] of polynomials with rational coefficients.3. 1/u. 4. Hence the algebraic extension F(x)/(p) can also be considered as a commutative linear algebra over F. 5.§14. then the monic polynomial of least degree with root u is irreducible over F.3 425 Adjunction of Roots F = Q by u = J2. Note also that multiplication is bilinear (linear in each factor). (u + 2)/(u + 3). 6. then the set of all polynomials g(x). and F any subfield of K. 14. it is the quotient space of the infinite-dimensional vector space F1:x] by the subspace of multiples of p(x). 2 2 4 4 U (U + 3u + 7u + 5). 3u s . Find five different polynomial equations for J3 and show explicitly that they are all multiples of the monic irreducible equation for J3 (over the field Q). monic irreducible) polynomial p over F such that p(u) = O. Adjunction of Roots So far we have assumed as given an extension K of a field F.. Exercises 1. we have p(x) = x 2 . 1/(u + 1).2. as in (4): u 4 . express each of the following elements in terms of the elements 1. In the simple Q(u) generated by a root u of the irreducible equation u 3 . and (a + bJ2)(c + dJ2) = 2 + (ad + bc)J2 + bd(J2? 2 (a + 2bd) + (ad + bc)J2 = a Formula (4) reveals the quotient-ring F1:x]/(p) as an n-dimensional vector space over F. Alternatively. of which u is a root is an ideal of F[x].6u 2 + 9u + 3 = 0. u 2 . Prove directly from the relevant definitions: If u is algebraic over F. In the simple extension Q(u) generated by a root u of x S + 2x + 2 = 0. 1/(u 2 - 6u + 8). in the sense of § 13. Represent the complex number field as aquotient-ting from the domain R[x] of all polynomials with real coefficients.e. u. For example. we "multiply out" in the natural fashion and then simplify by the proposed equation .x . This simple extension is unique. Hence the quotient-ring F{x]/(P) is a field. generated respectively by roots u and v of the same polynomial p irreducible over F. The polynomial x 2 .3. 
Theorem 5. If F is a field and p a polynomial irreducible over F, there exists a field K ≅ F[x]/(p) which is a simple algebraic extension of F generated by a root u of p(x).

Proof. Since p(x) is irreducible, the principal ideal (p) is maximal in F[x]; hence the quotient-ring F[x]/(p) is a field. It contains F and the residue class x + (p) containing x, call it u, which satisfies p(u) = 0 in F[x]/(p). Moreover, every element of this field can be written uniquely as a polynomial of degree less than n in u, where n is the degree of p.

This simple extension is unique, up to isomorphism:

Theorem 6. If the fields F(u) and F(v) are simple algebraic extensions of the same field F, generated respectively by roots u and v of the same polynomial p irreducible over F, then F(u) and F(v) are isomorphic; there is exactly one isomorphism of F(u) to F(v) in which u corresponds to v and each element of F to itself.

Proof. Take the composite F(u) → F[x]/(p) → F(v) of the isomorphisms provided in Theorem 3.

This "constructive" approach generalizes the procedure used in Chapter 5 to construct the complex field C from the real field R by adjoining an "imaginary" root of the equation x^2 + 1 = 0. The characterizations of Theorems 3 and 4 show how to achieve the same result in general.

Theorem 5 may be used to construct various finite fields. Specifically, start with the field Z3 of integers modulo 3. The polynomial x^2 − x − 1 has none of the three elements 0, 1, or 2 as a zero; hence it is irreducible in Z3[x]. Therefore the quotient-ring Z3[x]/(x^2 − x − 1) is a field K generated by its subfield Z3 and the coset, call it u, of x. Since [u : Z3] = 2, every element of this field K can be written uniquely as a + bu, with a, b ∈ Z3, so K consists of just nine elements of the form a + bu. The sum of two of them is given by the rule

(a + bu) + (c + du) = (a + c) + (b + d)u.

To compute the product of two elements of this type, we "multiply out" in the natural fashion and then simplify by the proposed equation u^2 = u + 1. The inverses of the nonzero elements are given by

x:     1    2    u      2u      1+u     2+u   1+2u   2+2u
1/x:   1    2    2+u    1+2u    2+2u    u     2u     1+u

By its construction, K is generated over Z3 by the single element u.
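The nine-element field just constructed is small enough to check exhaustively. A Python sketch (not from the text) representing a + bu as the pair (a, b) over Z3, with u^2 = u + 1:

```python
from itertools import product

# A sketch, not in the text: the field Z_3[x]/(x^2 - x - 1) of nine
# elements (a, b) = a + b*u, with u^2 = u + 1.

def mul(x, y):
    (a, b), (c, d) = x, y
    # (a + bu)(c + du) = (ac + bd) + (ad + bc + bd)u, using u^2 = u + 1
    return ((a * c + b * d) % 3, (a * d + b * c + b * d) % 3)

elements = list(product(range(3), range(3)))
assert len(elements) == 9

# x^2 - x - 1 has no root in Z_3, so it is irreducible there:
assert all((a * a - a - 1) % 3 != 0 for a in range(3))

# every nonzero element has a multiplicative inverse, so K is a field:
for x in elements:
    if x != (0, 0):
        assert any(mul(x, y) == (1, 0) for y in elements)

assert mul((1, 1), (2, 2)) == (1, 0)   # (1 + u)^(-1) = 2 + 2u, as in the table
```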
we thus have a slight variant of the construction used in Chap. In particular. and if p(x) is some irreducible polynomial over F.1)(Z2 .1)(Z2 . If F is the field R of all real numbers.(Z2 .3). The quotient-ring K = F[t]/p(t» is then a field containing all rational functions and the algebraic function t. It is one of the simplest examples of a finite field (see §15.4). Thus. then the construction yields a field R(u) generated by a quantity u with u 2 = -1. 5 to obtain the complex numbers from the real numbers.J(Z2 .3 = U + 1.4) as an irreducible quadratic polynomial in t with coefficients in C(z). let it be desired to adjoin to F a function t(z) such that t 2 = (Z2 . without having to construct a Riemann surface for it (it is two-valued). + an_1u n. J . There are only a finite number p of choices for each coefficient aj. b E Z3) under these two operations satisfy all the postulates for a field. n.1. = ac One can verify in detail that the nine elements a + bu (a.. The preceding adjunction process may be applied to any base field F whatever. let F = C(z) be the field of all rational complex functions. One can construct algebraic function fields in the same way. this field is clearly the field Z3(U) generated by u from the field Z3 of residue classes. where n is the degree of the polynomial p. If F is the field Zp of integers modulo p. t) = t 2 .4) dz. We can consider the polynomial p(t) = f(z. This quantity u behaves like i = and the field R(u) is actually isomorphic to the field C of complex numbers. hence the field constructed is a finite field of pn elements. The field K is called an elliptic function field because it is generated by the integrand of an elliptic integral. One can study t(z) as an element of K.1)(Z2 . . the construction above will yield a field consisting of elements ao + alu + . The result is (a + bu)(c + du) + (ad + bc)u + bdu 2 = (ac + bd) + (ad + bc + bd)u.u 2 427 Adjunction of Roots §14. how many elements has the resulting field? 
4.I is irreducible over the field Zs of integers modulo 5. Q(v'-3). Hence there is by Theorem 6 an automorphism of C carrying i into -i.5. it can refer equally to the extension of Q by the positive rs or to Q(wrs) where w = (-1 + J3i)/2 is a complex cube root of unity. Exhibit a nonreal field of complex numbers isomorphic to each of the real fields Q(~5). prove that integers modulo 2. irreducible over the field Q of rationals. and that all the algebraic properties of a root u may be derived from the irreducible equation which it satisfies. (b) Exhibit explicitly the isomorphism a _ a 3 for this field. Use Theorem 6 of §13. Prove that x 3 + x . If g(t) is a reducible polynomial. roughly speaking. Q(i).9. 14 Algebraic Number Fields If Theorem 6 is applied to an ordinary polynomial such as x 3 .4)/(x 2 . 9. This automorphism is just the correspondence a + hi ~ a . (b) Construct addition and multiplication tables for a field with four elements. 5. which elements in the quotient-ring F(t]/(g(t» actually have inverses? 10. Prove that the elliptic function field C(x.4) is irreducible in t over the field C(x). that any two roots of an irreducible polynomial p(x) have the same behavior. 6. For instance. This isomorphism means. (a) Show that the field of nine elements constructed in the text has characteristic 3. (a) If K is an extension of degree 2 of the field Q of rationals. 3. the field C = R(i) of complex numbers is generated over the field R of real numbers by either of the two roots ±i of the equation x 2 + 1 = O.) 7.3 to give another proof that F(t]/(P(t» is a field. . If a root of this polynomial is adjoined to Zs. Q(~). 2. (Hint: First show that every element in such a finite field is quadratic over Z3.hi between a number and its ordinary complex conjugate. y) of the text can be generated over C(x) by a root z of the equation t 2 = (x 2 . It shows that these two fields Q(rs) and Q(wrs) are algebraically indistinguishable because they are isomorphic.1)(x 2 .1). 
There are many examples of such an isomorphism. Prove that the polynomial t 2 .(x 2 .428 Ch.) 8. (Hint: Use the results of §3. (a) Find all the irreducible quadratic polynomials over the field Z3. Exhibit an automorphism not the identity of each of the following fields: Q(Ji). (b) Prove that any two fields with nine elements are isomorphic. Exercises 1. For example. If this vector space K has a finite dimension. If two algebraiC elements u and v over a field F generate the same extension F{u) = F{v). In §14.5 we shall show how the vector space approach may be used to analyze extensions of a field F obtained by adjoining several different algebraic elements.4 Degrees and Finite Extensions 429 14.4. u. Theorem 4 on simple algebraic extensions may be testated in terms of dimensions as follows. Corollary.un-I. Theorem 7. . This unique representation closely resembles the representation of a vector in terms of the vectors of a "basis" 1. Indeed. This suggests an application of vector space concepts.. In general. But before discussing such "multiple" extensions we shall first see how the vector space approach enables one to compare the irreducible equations satisfied by different elements in ~he same simple algebraic extension F{u) over F. . . This vector space has a basis 1. the complex field C = RU) is a two-dimensional vector space over the real subfield R (as in §5. The degree of an algebraic element u over a field F is equal to the dimension of the extension F{u).§14. then the field K is called a finite extension of F. then u and v have the same degree over F. and the dimension n of the vector space is known as the degree n = [K :F] of the extension.2)... Degrees and Finite Extensions In a simple extension F{u) generated by an element u of degree n. and use as operations of the vector space the addition of two elements of K and the multiplication (a "scalar" multiplication) of an element of K by an element of F. 
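The automorphism of C carrying i into −i, described above as the correspondence a + bi ↦ a − bi, preserves sums and products; this is easy to check exactly on Gaussian integers. A Python sketch (not from the text), representing a + bi as an integer pair (a, b):

```python
# A sketch, not in the text: complex conjugation a + bi -> a - bi is a ring
# homomorphism fixing every real, checked on a few Gaussian integers.

def mul(z, w):
    (a, b), (c, d) = z, w
    return (a * c - b * d, a * d + b * c)   # (a+bi)(c+di), using i^2 = -1

def add(z, w):
    return (z[0] + w[0], z[1] + w[1])

def conj(z):
    return (z[0], -z[1])

pairs = [((1, 2), (3, -4)), ((0, 1), (0, 1)), ((5, -7), (2, 3))]
for z, w in pairs:
    assert conj(mul(z, w)) == mul(conj(z), conj(w))   # preserves products
    assert conj(add(z, w)) == add(conj(z), conj(w))   # preserves sums
for a in range(-3, 4):
    assert conj((a, 0)) == (a, 0)                     # fixes each real
```

This is exactly the situation of Theorem 6 with F = R and p(x) = x^2 + 1: the two roots ±i of p generate the same field, and the isomorphism interchanging them fixes R pointwise.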
every element w has by formula (4) a unique representation as (5) with coefficients in F. any extension K of a field F may be considered as a vector space over F: simply ignore the multiplication of elements of K. . u. and so on. A fundamentlal fact about vector spaces is the invariance of the dimension (any two bases of a space have the same number of elements). This fact may be applied to the special case of finite extensions of fields. regarded as a vector space over F. . the field ~) generated by the rational numbers and a cube root of 5 is a three-dimensional vector space over the rational subfield Q. un -I .. All the vector space postulates are satisfied by this addition and scalar multiplication. as follows. hence must be linearly dependent over F (§7. . as (6) u = -w 2 /3 + w u 2 = -w + 4/3. Suppose. u. this relation implies that w is algebraic over F. the irreducible polynomial p(x) for u must be used systematically. every finite extension consists of algebraic elements. that Q(u) is an extension of degree 3 over the field Q of rationals. Interpreted as a polynomial. 2 /3 + 2w + 4/3.4. The element w = u 2 . This important conclusion assures us that a transcendental element would never appear in a simple algebraic extension.2. Cerellary. This is done by applying repeatedly the given equation u 3 = 2u . Theorem 5. and u 2 . for by Theorem 2 an element g(u) in the extension is zero if and only if the polynomial g(x) is divisible by p(x). Every element of a simple algebraic extension F(u) is algebraic over F. conversely. In working with a particular simple algebraic extension F(u).2u 3 + u 2 and w3 = u 6 .430 Ch.u in this extension Q(u) must satisfy some polynomial equation of degree at most 3. w. Proof.2 = O. substituted in the expression for w w3 - 4w 2 - 3 . J These. express the powers w2 = u ..2x + 2. as in Theorem 4. w = 16u 2 - 28u + 18.3u 5 + 3u 4 . generated by a root u of x 3 .. one may solve the equations for wand w2 linearly to get u and u 2 . 
There must, therefore, be a linear relation b_0 + b_1 w + b_2 w^2 + b_3 w^3 = 0 with coefficients in Q, not all zero, for the four elements 1, w, w^2, and w^3 lie in the three-dimensional vector space Q(u), hence must be linearly dependent over Q (§7.2). Substituting the expressions found for w, w^2, and w^3 in terms of 1, u, and u^2, these give the desired equation

w^3 - 4w^2 - 4w - 2 = 0.

This polynomial is irreducible by the Eisenstein irreducibility criterion (§3.10, Corollary 2), so w, like u, has degree 3 over Q. Alternatively, one may argue by equation (6) that u is in Q(w), so that Q(u) = Q(w), and u and w generate the same extension, hence by the Corollary to Theorem 7 have the same degree 3 over Q. This means that any equation of degree 3 for w must be irreducible.

A simple algebraic extension is finite. Conversely, every element of a finite extension is algebraic:

Theorem 8. Every element w of a finite extension K of F is algebraic over F and satisfies an equation irreducible over F of degree at most n, where n = [K : F] is the degree of the given extension.

Proof. The n + 1 powers 1, w, w^2, …, w^n of the given element w are elements of the n-dimensional vector space K, hence must be linearly dependent over F (§7.2). There is thus a relation b_0 + b_1 w + · · · + b_n w^n = 0 with coefficients in F, not all zero. Interpreted as a polynomial equation, this relation implies that w is algebraic over F, of degree at most n. Q.E.D.

Corollary. Every finite extension consists of algebraic elements.

This important conclusion assures us that a transcendental element can never appear in a finite extension; in particular, every element of a simple algebraic extension F(u) is algebraic over F (Theorem 5).

Exercises

1. Each of the following numbers is in a simple algebraic extension of Q. Find in each case the monic irreducible equation satisfied by the number.
   (a) 2 + √5;  (b) ∛2 + √3;  (c) … ;  (d) u^2 - u and (e) u^2 + u, where u satisfies u^3 = 2u + 2.
2. Is the field F(x) of rational forms in the indeterminate x a finite extension of F? Why?
3. (a) If K is an extension of degree 2 of the field Q of rationals, prove that K = Q(√d), where d is an integer which is not a square and which has no factors which are squares of integers.
   (b) How much of this result remains true if Q is replaced by a field F of characteristic ∞? by a field F of any characteristic?
4. Prove that the number of elements in a finite field of characteristic p is a power of p.
5. Let F be any field contained in an integral domain D. Prove:
   (a) D is a vector space over F.
   (b) If, as a vector space, D has a finite dimension over F, then D is a field.
6. Prove that every finite extension of the field R of real numbers either is R itself or is isomorphic to the field C of complex numbers.
7. Prove that the field of all complex numbers has no proper finite extensions.
*8. (a) Prove that there are exactly (p^2 - p)/2 monic irreducible quadratic polynomials over the field Z_p of integers modulo p.
    (b) Prove that for each p there exists a field of characteristic p with p^2 elements.
*9. Prove that there are exactly (p^3 - p)/3 monic irreducible cubic polynomials over the field Z_p of integers modulo p.

14.5. Iterated Algebraic Extensions

Finite extensions of a field F may be built up by repeated simple extensions. In general, if K is any extension of F containing elements c_1, c_2, …, c_n, the symbol F(c_1, …, c_n) denotes the subfield of K generated by c_1, …, c_n and the elements of F (the subfield consisting of all elements rationally expressible in terms of c_1, …, c_n over F). Thus F(c_1, c_2) is the simple extension L(c_2) of the simple extension L = F(c_1), so that such a multiple extension may be obtained by iterated simple extensions. Conversely, one may prove that any such iterated extension can be obtained as a simple extension, generated over F by a suitably chosen single element.

Iterated algebraic extensions may arise in the solution of equations, where it is often useful to introduce appropriate auxiliary equations. For example, the equation x^4 - 2x^2 + 9 = 0 may be written as

x^4 - 2x^2 + 9 = (x^2 - 3)^2 + 4x^2 = 0,    hence    [(x^2 - 3)/2x]^2 = -1.

This formula indicates that any field which contains a root u of the given equation also contains a root i = (u^2 - 3)/2u of the equation y^2 = -1. If we adjoin the auxiliary quantity i to the field Q of rationals, the original equation becomes reducible over Q(i):

x^4 - 2x^2 + 9 = (x^2 - 3 + 2xi)(x^2 - 3 - 2xi).

By the usual formula, the factor x^2 - 3 - 2ix has a root u = i + √2. The original equation thus has a root in the field K = Q(i, √2). This field K could have been obtained by adjoining to Q first √2 and then i. The intermediate field Q(√2) consists of real numbers, hence cannot contain i. The quadratic equation y^2 + 1 = 0 for i must therefore remain irreducible over the real field Q(√2), so that the extension Q(√2, i) has over Q(√2) a degree 2 and a basis of two elements 1 and i. The field Q(√2) in turn has a basis 1, √2 over Q. Therefore any element w in the whole field Q(√2, i) can be expressed as

(7)   w = (a + b√2) + (c + d√2)i = a + b√2 + ci + d√2·i,

with rational coefficients a, b, c, and d. The four elements 1, √2, i, √2·i thus form a basis for the whole extension K = Q(√2, i) over Q.
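A quick floating-point check of this example (plain Python; purely illustrative) confirms that u = i + √2 satisfies x^4 - 2x^2 + 9 = 0, that the displayed factorization over Q(i) is an identity, and that i is recovered rationally from u as (u^2 - 3)/2u:

```python
u = 1j + 2 ** 0.5                       # u = i + sqrt(2)

# u is a root of x^4 - 2x^2 + 9:
assert abs(u**4 - 2*u**2 + 9) < 1e-9

# x^4 - 2x^2 + 9 = (x^2 - 3 + 2xi)(x^2 - 3 - 2xi), checked at sample points:
for x in (0.3, -1.7, 2 + 0.5j):
    assert abs((x**4 - 2*x**2 + 9)
               - (x**2 - 3 + 2j*x) * (x**2 - 3 - 2j*x)) < 1e-9

# i lies in Q(u):  i = (u^2 - 3)/2u
assert abs((u**2 - 3) / (2*u) - 1j) < 1e-9
```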
We shall omit this proof and discuss the properties of iterated extensions directly. The method of compounding bases just used can be stated in general, as follows:

Theorem 9. If the elements u_1, …, u_n form a basis for a finite extension K of F, while w_1, …, w_m constitute a basis for an extension L of K, then the mn products u_i w_j for i = 1, …, n and j = 1, …, m form a basis for L over F.

Proof. Any element y in L can be represented as a linear combination y = Σ_j r_j w_j of the given basis, with coefficients r_j in K. Each coefficient r_j is in turn some combination r_j = Σ_i a_ij u_i of the basis elements of K, with each a_ij in F. On substitution of these values, every element y of L appears as a linear combination of the suggested elements u_i w_j, with coefficients a_ij in F. The same type of successive argument proves that these mn elements are linearly independent over F, hence do constitute a basis for L. Q.E.D.

In the first place, one may state the result without reference to the particular bases used, as follows:

Corollary 1. If K is a finite extension of F and L a finite extension of K, then L is a finite extension of F, and its degree is

(8)   [L : F] = [L : K][K : F]    (L ⊇ K ⊇ F).

Many consequences flow from Theorem 9.

Corollary 2. If K is a finite extension of degree n, every element u of K has over F a degree which is a divisor of n.

Proof. The element u generates a simple extension F(u), hence by (8),

n = [K : F(u)][F(u) : F],

where the second factor is the degree of the element u under consideration. Q.E.D.

Corollary 3. An element u of a finite extension K ⊇ F generates the whole extension if and only if [K : F] = [u : F], the degree of u over F.

Proof. If u satisfies over F an irreducible equation of degree n = [K : F], then u generates a subfield F(u) of degree n over F. By (8) this subfield must include all of K. The converse is immediate. Q.E.D.

Corollary 4. If K = F(y_1, y_2, …, y_r) is a field generated by r quantities y_i, where each successive y_i is algebraic over the field F(y_1, …, y_{i-1}) generated by the preceding i - 1 quantities, then K is a finite extension of F, and every element in K is algebraic over F.

Proof. Every degree [F(y_1, …, y_{i-1}, y_i) : F(y_1, …, y_{i-1})] is finite, hence by Corollary 1 the whole degree [K : F] is finite. By Theorem 8, every element in K is then algebraic over F. Q.E.D.
Corollary 5. If p(x) is an irreducible cubic polynomial over a field F, and if K is an extension of F of degree 2^m, then p(x) is irreducible over K.

Proof. Suppose p(x) reducible over the field K of degree 2^m. Then the cubic p(x) must have at least one linear factor x - u, so that K contains a root u of p(x). But such an element u of degree 3 over F cannot be contained in a field K of degree 2^m over F, for by Corollary 2 its degree 3 would have to divide 2^m. This proves p(x) irreducible. Q.E.D.

This corollary means in particular that an irreducible cubic equation could never be solved by successive square roots, for the adjunction of a square root to a field F either will give no extension at all or will give an extension of degree 2, so that the extension K = F(√a, √b, …) obtained by any number of square roots will have as degree some power 2^m of 2. By Corollary 5, this extension will never contain a root of the given irreducible cubic.

This corollary is the algebraic basis of the theorem that it is impossible to solve the classical problems of duplicating a general cube or trisecting a general angle by ruler and compass alone. Any such construction problem may be reduced to analytic terms. The data of the problem consist of a number of points and lines. Relative to some set of axes, the coordinates of these points (and the ratios of the coefficients in the equations for these lines) are a set of real numbers which generates a certain field F of real numbers. Each step in a ruler and compass construction provides certain new points and lines. It can be shown† that the corresponding new field of numbers is either F itself or a quadratic extension of F. Hence repeated constructions yield a set of points and lines corresponding to a field K of degree 2^m over F.

† This depends essentially on the fact that the equation of a circle (compass) is quadratic and the equation of a straight line (ruler) is linear.

Consider now the duplication of the cube. The data consist of a pair of coordinate axes, a unit segment along one of these axes, and a cube with this segment as side. The problem is to construct another cube of double the volume. The side of this new cube will satisfy the equation x^3 - 2 = 0. By Eisenstein's theorem this equation is irreducible over the field Q of rationals (the field associated with the data). Over any field K corresponding to a ruler and compass construction, the polynomial x^3 - 2 will still be irreducible, by Corollary 5. Hence by these methods it is impossible to construct (say along the x-axis) a segment which is the side of the duplicated cube.

The trisection problem is treated in similar fashion; the essential device consists in writing the trigonometric equation for the cosine of one third of an angle in terms of the cosine of the whole angle. For most angles this will again give an irreducible cubic equation.

Exercises

1. In Theorem 9, prove in detail that the mn elements u_i w_j are independent over F.
2. Determine in each of the following cases whether the number u given generates the given extension of the field Q of rational numbers. Give reasons.
   (a) u^2, where u^4 + 6u + 2 = 0, in Q(u);
   (b) u = √2 + √3, in Q(√2, √3);
   (c) u = 2 + ∛2, in Q(∛2);
   (d) u = √2 - 1/(1 + √2), in Q(√2);
   (e) u = v^2 + v + 1, where v^3 + 5v - 5 = 0, in Q(v).
3. Give a basis over Q for each field of Ex. 2.
4. Determine the degree of each of the following multiple extensions of the field Q of rational numbers. Give reasons.
   (a) Q(√3, √2);  (b) Q(∛2, …);  (c) Q(√18, √2);  (d) Q(√5, …);  (e) Q(∛2, 1 + i);  (f) Q(√3, √-2);  (g) Q(√3, …).
5. Prove that the equation x^4 - 2x^2 + 9 = 0 treated in the text is irreducible over Q. (Hint: Use the degree of Q(√2, i).)
6. If K is an extension of F of prime degree, prove that any element in K but not in F generates all of K over F.
7. If p(x) is a polynomial of degree q and is irreducible over F, and if K is a finite extension of F of degree relatively prime to q, prove that p(x) is irreducible over K.
8. (a) Find the cubic equation which gives cos θ in terms of cos 3θ.
   (b) Show that this equation is irreducible over Q when 3θ = 60° (this means that an angle of 60° cannot be trisected with ruler and compass).
9. Determine whether the polynomial given is irreducible over the field indicated. In each case, prove your answer correct.
   (a) x^2 + 3, over Q(√-2);  (b) x^2 + 1, over Q(√2, √-2);  (c) x^3 + 8x - …, over Q(…);  (d) x^5 + 3x^3 - 9x - 6, over Q(√2).
10. Is c = π^6 + 5π^3 + 2π - 14 transcendental or algebraic over the field Q of rational numbers? Why?
14.6. Algebraic Numbers

In discussing extensions of fields we have repeatedly used examples of algebraic numbers, such as i, √2, and ∛3. An algebraic number u is a complex number which satisfies a polynomial equation

(9)   a_n u^n + · · · + a_1 u + a_0 = 0

with rational coefficients a_i, not all a_i = 0. In other words, an algebraic number is any complex number which is algebraic over the field Q of rationals.

Not every real number is algebraic, for:

Theorem 10. The set of all algebraic numbers is countable.

The verification of this statement requires that we describe a method of enumerating, or of listing, all algebraic numbers. First, we list all the equations which they satisfy. Observe that an equation (9) for an algebraic number can be multiplied through by a common denominator for its rational coefficients; there results an equation with integral coefficients not all zero, in which the first coefficient may be assumed to be positive. We know that the possible integral coefficients of these polynomials can be enumerated, as 0, +1, -1, +2, -2, +3, -3, … . The linear polynomials with integral coefficients can be displayed in an array, such as:

x        x + 1      x - 1      x + 2      x - 2     …
-x       -x + 1     -x - 1     -x + 2     -x - 2    …
2x       2x + 1     2x - 1     2x + 2     2x - 2    …
-2x      -2x + 1    -2x - 1    -2x + 2    -2x - 2   …
…

One can then make a single list including them all by taking in succession, as indicated, the diagonals of the above array; the result is a single list x, x + 1, -x, … containing every linear polynomial. We then find a rectangular array of quadratic polynomials by simply adjoining the various second-degree terms mx^2 to each element in this list. Take again the diagonal development of this array; from it we obtain a list of all quadratic polynomials, and so on for higher degrees. When this is done for every degree, there results an array of lists, in which the nth row is the list of all polynomials of degree n. Taking the diagonals of this array of lists, we get a list of all polynomials. In this list replace every polynomial by its roots and drop out any duplications. The result is a list of all the roots of polynomials with integral coefficients; that is, it is the required enumeration of all algebraic numbers.

A consequence is that the real algebraic numbers are countable. But Cantor's diagonal process proves (§12, Theorem 5) that the set of all real numbers is not countable. Hence this set must be larger than the set of real algebraic numbers. The result we state as follows:

Corollary. Not every real number is algebraic.
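The diagonal enumeration can be made completely concrete. The sketch below (plain Python; the "height" ordering is one standard bookkeeping choice, not the book's exact diagonal scheme) lists every nonzero integer polynomial with positive leading coefficient exactly once, ordered by height h = degree + sum of |coefficients|, so each polynomial — and hence each algebraic number, as one of its roots — appears at some finite position.

```python
from itertools import count, product

def polynomials_by_height():
    """Yield every integer polynomial with positive leading coefficient
    exactly once, as a coefficient tuple (constant term first), ordered
    by height h = degree + sum(|coefficients|)."""
    for h in count(1):
        for deg in range(h):
            bound = h - deg
            for coeffs in product(range(-bound, bound + 1), repeat=deg + 1):
                if coeffs[-1] > 0 and deg + sum(abs(c) for c in coeffs) == h:
                    yield coeffs

gen = polynomials_by_height()
first = [next(gen) for _ in range(8)]
# the sequence begins (1,), (2,), (0, 1), ... where (0, 1) encodes x,
# (-1, 1) encodes x - 1, and so on
```

Since only finitely many tuples have a given height, every polynomial is reached after finitely many steps — which is exactly the countability assertion of Theorem 10.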
and the field R of real numbers embedded in t Instead of "algebraically complete. since v is algebraic over Q(u). 10-13 below). product. . Q(u) is a finite extension of Q. since it did not exhibit any specific transcendental real numbers. the only irreducible polynomials over F are linear. and quotient of any two algebraic numbers u and v#-O are again algebraic numbers." The term "complete" seems preferable. so A is algebraically complete. The fundamental theorem of algebra (§5.6 Algebraic Numbers 437 Cantor's argument for this result was at first rejected by many mathematicians.3). Over such a field F every polynomial f(x) has a root c. .3. Proof. QED We now have the field Q embedded in the algebraically complete field A of all algebraic numbers. . + Uo = 0 whose coefficients are algebraic numbers Ui in A. Q. but it is possible to give more explicit proofs of this corollary (see Exs. His argument is now generally accepted. so each of its elements is an algebraic number (Theorem 8). so that K(r) is a finite extension of K and hence of Q. Furthermore. Theorem 11. Take a polynomial equation xn + Un_IX n . Q(u. §5.E. in the field A." some sources use "algebraically closed.1 + . Any complex root r of the given equation is algebraic over the field K. The field A of all algebraic numbers is algebraically complete. Proof. Since u is algebraic over Q. and every polynomial over an algebraically complete field F can be written as a product of linear factors (as in formula (11). Consequently. 7.1. using Ex. (b) If f(x) is any polynomial with real coefficients.. (b) vCi + rs. prove that the degree of u + v never exceeds mn. such that Ixi .. respectively (over Q). (to d terms) = d 2 polynomials of all degrees. (e) u. Give the first sixteen terms in the list of all quadratic polynomials. where u + 7u . (a) If u and v are algebraic numbers of degrees m and n. 
Illustrate Theorem 11 by finding an equation with rational coefficients for each of the following algebraic numbers: (a) Ji + .. which states that any field F whatever has an extension A which is algebraically complete and in which every element is algebraic over F (ct.438 Ch. §15.fi/(1 + Ji).u i I -< N Ix . (b) Show that there exist countably many algebraically independent real numbers../3x . 14 Algebraic Number Fields the algebraically complete field C of complex numbers. without using Theorem 10../3x + 1 + 'V2 = 0. as in the proof of Theorem 10.vCi = 0. where u is a root of u 3 + 5u 2 . that u ~ O. 4. 3 (c) x .14 = O. prove that t + u and tu are transcendental. 6. (d) x 2 + u + 2 = 0. (b) There are d + d + . The theory of algebraic numbers has been elaborately developed. Prove that the set of all algebraic numbers of a fixed degree is countable. (a) Show that there exists a real number transcendental over Q(1T). 2./-3. 3. Prove that any finite extension of a countable field is countable./-2.lOu + 5 = O.u I whenever Ix-ul<1. These results are special cases of a general theorem. Illustrate Theorem 12 by finding an equation with rational coefficients for a root of each of the following equations: 2 (a) x + 3x + Ji = ~?:(b) x 2 + . 3 (d) .. Exercises 1.4. in the latter case. Such a field is known as an algebraic number field. (c) (. + d + . show by factorization of Xi . 8. *10. S.. Show that the set A of all elements of a field F which are algebraic over any countable subfield S of F is countable. show .fi)(rz) . 9.. Appendix). (b) What about the degree of u/v? (c) If t is transcendental and u algebraic. u any real number. Show that the proof given for Theorem 10 implicitly uses the following formulas of transfinite arithmetic: (a) There are d"+' = d polynomials of degree n.u i that there is a constant N(j). (a) If u is any fixed real number. It concerns chiefly fields K of algebraic numbers which are finite extensions of the field Q.. 
*11. Let the real algebraic number u satisfy the polynomial equation f(x) = 0 of degree r with integral coefficients. If m and n are integers such that |m/n - u| < 1/(Mn^r) and |m/n - u| < 1, where M is the constant of Ex. 10, show that f(m/n) = 0. (Hint: By Ex. 10, |f(m/n)| < 1/n^r, while f(m/n) is a rational number of denominator n^r.)
*12. If u is a real number for which an infinite sequence of distinct rational fractions m_k/n_k can be found, such that |u - (m_k/n_k)| < 1/(k·n_k^k) for all k, show that u is transcendental.† (Hint: If the degree of u were r, Ex. 11 would give f(m_k/n_k) = 0 for all sufficiently large k.)
*13. (a) Show that Σ_{k≥1} 10^{-k!} = 0.110001000… is a Liouville number.
     (b) Exhibit two other Liouville numbers.

† Numbers satisfying the hypothesis of Ex. 12 are called "Liouville (transcendental) numbers."

The theory of algebraic numbers has been elaborately developed. It concerns chiefly fields K of algebraic numbers which are finite extensions of the field Q. Such a field is known as an algebraic number field. We consider next the arithmetic properties of such a field.

14.7. Gaussian Integers

A Gaussian integer is a complex number α = a + bi whose components a, b are both integers. The sum, difference, and product of two such integers is again such an integer; hence the Gaussian integers form an integral domain Z[i]. In this domain questions of divisibility and decomposition into primes (irreducibles) may be considered. Recall now the general concepts involving divisibility (§3.6).

It is convenient to introduce the "norm" of any complex number σ (integral or not). If σ = r + si, the norm N(σ) is the product of σ by its conjugate σ* = r - si:

(10)   N(σ) = σσ* = (r + si)(r - si) = r^2 + s^2.

This norm is always nonnegative and is the square of the absolute value of σ. In particular, the norm of a Gaussian integer is a (rational) integer. For any two numbers σ and τ, one has

(11)   N(στ) = N(σ)N(τ).

This equation means that the correspondence σ ↦ N(σ) preserves products; in other words, it is a homomorphic mapping of the multiplicative group of nonzero numbers σ on a multiplicative group of real numbers.

A unit of Z[i] is a Gaussian integer α ≠ 0 with a reciprocal α^{-1} which is also a Gaussian integer. If α is a unit, then N(α)N(α^{-1}) = N(αα^{-1}) = N(1) = 1, so the norm of a unit α must be N(α) = 1.
Inspection of (10) shows that the only possible units are ±1 and ±i; conversely, if N(α) = 1, α is a unit. Two integers are associate in Z[i] if each divides the other. Hence the only associates of α in Z[i] are ±α and ±iα.

The rational prime 3 is also prime in Z[i]. For suppose 3 = αβ, with neither factor a unit; then N(α)N(β) = N(3) = 9 and N(α) | 9, so N(α) would be 3. But if N(α) = N(a + bi) = 3, then a^2 + b^2 = 3, which is impossible for integers a and b. Hence 3 has no proper factor α in the domain of Gaussian integers.

On the other hand, the rational prime number 5 has in Z[i] four different decompositions:

(12)   5 = (1 + 2i)(1 - 2i) = (2 + i)(2 - i) = (2i - 1)(-2i - 1) = (i - 2)(-i - 2).

These decompositions are not essentially different, for corresponding factors are associates; for instance, 2 - i = -i(1 + 2i) and 2 + i = i(1 - 2i). The factors (12) give essentially the only way of decomposing 5, for in any decomposition 5 = γδ into factors neither of which is a unit, N(γ)N(δ) = N(5) = 25, so each factor which is not a unit must have norm 5. By trial one finds that the only integers of norm 5 are those used in (12). Each factor in (12) is prime (irreducible), for if 2 + i had a factorization 2 + i = αβ, then N(α)N(β) = N(2 + i) = 5, so that N(α) (or N(β)) would be 1, hence α (or β) would be a unit.

In general, a Gaussian integer π is said to be prime if it is not 0 or a unit and if its only factors in Z[i] are units and associates of π.

A unique factorization theorem can be proved for the Gaussian integers by developing first a division algorithm, analogous to that used for ordinary integers and for polynomials.

Theorem 13. For given Gaussian integers α and β ≠ 0 there exist Gaussian integers γ and ρ with

(13)   α = βγ + ρ,    N(ρ) < N(β).

Proof. Start with the quotient α/β = r + si and select integers r′ and s′ as close as possible to the rational numbers r and s, so that |r - r′| ≤ 1/2 and |s - s′| ≤ 1/2. Then

γ = r′ + s′i,    α/β = γ + σ,    σ = (r - r′) + (s - s′)i,

where N(σ) = (r - r′)^2 + (s - s′)^2 ≤ 1/4 + 1/4 < 1. The equation may now be written as α = βγ + βσ, where ρ = βσ = α - βγ is a Gaussian integer, and where N(ρ) = N(βσ) = N(β)N(σ) < N(β). Q.E.D.

With this division algorithm, one proves:

Lemma 1. Two Gaussian integers α_1 and α_2 have a greatest common divisor δ which is a Gaussian integer expressible in the form δ = β_1 α_1 + β_2 α_2, where β_1 and β_2 are Gaussian integers.

Proof. By repeated divisions, as in (13), one may construct a Euclidean algorithm, much as in the case of rational integers (§1.8). The successive remainders ρ of (13) decrease in norm, hence the algorithm eventually reaches an end. The last remainder not zero is the desired greatest common divisor; it is expressible in the form β_1 α_1 + β_2 α_2, hence it is a multiple of every common divisor of α_1 and α_2, and therefore it is the required g.c.d.

A more sophisticated proof starts with the ideal (α_1, α_2) generated by α_1 and α_2 in the ring Z[i]. Among the elements of this ideal choose one, δ, of minimum positive norm. Since δ is in the ideal, it has the form δ = β_1 α_1 + β_2 α_2, hence it is a multiple of every common divisor of α_1 and α_2. By division, write α_1 = σ_1 δ + ρ_1 and α_2 = σ_2 δ + ρ_2, as in (13). The remainders ρ_i lie in the ideal and have norm less than N(δ), hence must be zero. Therefore α_1 = σ_1 δ and α_2 = σ_2 δ, so δ is a common divisor, and therefore δ is the required g.c.d. Q.E.D.

One then proves:

Lemma 2. If π is prime, then π | αβ implies that π | α or that π | β.

Theorem 14. Every Gaussian integer α ≠ 0, not a unit, can be expressed as a product α = π_1 ⋯ π_n of prime Gaussian integers. This representation is essentially unique, in the sense that any other decomposition of α into primes has the same number of factors and can be so rearranged that correspondingly placed factors are associates.

The rest of the treatment of the decomposition of Gaussian integers proceeds exactly as in the case of rational integers (§§1.7–1.8) and of polynomials (§3.8), hence we state only the important stages.

In order appropriately to generalize these notions, we first investigate the irreducible polynomial equations satisfied by Gaussian integers. If α = a + bi is a Gaussian integer which is not a rational integer, then b ≠ 0, and α must satisfy an irreducible quadratic equation. This is

[x - (a + bi)][x - (a - bi)] = x^2 - 2ax + (a^2 + b^2) = 0;

it is a monic irreducible equation with rational integers as coefficients. Conversely, it may be shown that if a number r + si in the field Q(i)
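The division algorithm of Theorem 13, and the Euclidean algorithm built on it, are easy to carry out mechanically. The sketch below (plain Python; the function names are ours, not the book's) represents a Gaussian integer as an integer pair (a, b) for a + bi, rounds the exact quotient to a nearest Gaussian integer as in the proof of Theorem 13, and iterates to a greatest common divisor.

```python
def norm(x):
    """N(a + bi) = a^2 + b^2, as in (10)."""
    a, b = x
    return a * a + b * b

def mul(x, y):
    (a, b), (c, d) = x, y
    return (a * c - b * d, a * d + b * c)

def divmod_gauss(alpha, beta):
    """alpha = beta*gamma + rho with N(rho) < N(beta), as in Theorem 13."""
    (a, b), (c, d) = alpha, beta
    n = norm(beta)
    # exact quotient alpha/beta = r + si with r = r_num/n, s = s_num/n
    r_num, s_num = a * c + b * d, b * c - a * d
    # nearest integers to r and s; any tie-breaking keeps |r - r'| <= 1/2
    gamma = ((2 * r_num + n) // (2 * n), (2 * s_num + n) // (2 * n))
    bg = mul(beta, gamma)
    rho = (alpha[0] - bg[0], alpha[1] - bg[1])
    return gamma, rho

def gcd_gauss(x, y):
    """Euclidean algorithm in Z[i]; the answer is unique up to units."""
    while y != (0, 0):
        x, y = y, divmod_gauss(x, y)[1]
    return x
```

For instance, gcd_gauss((3, 6), (12, -3)) — the pair of Exercise 3(a) below — yields an associate of 3, of norm 9, since 3 + 6i = 3(1 + 2i) and 12 - 3i = 3(4 - i) with norms N(1 + 2i) = 5 and N(4 - i) = 17 relatively prime.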
satisfies a monic irreducible equation with integral coefficients, then this number is a Gaussian integer.† This gives:

Theorem 15. A number in the field Q(i) is a Gaussian integer if and only if the monic irreducible equation which it satisfies over Q has integers as coefficients.

† The proof is given in a slightly more general case in the next section (Theorem 16).

Exercises

1. Find the decomposition into primes of each of the following Gaussian integers: 5, 7, 9, 6i, 3 + i.
2. Find all possible factorizations of 13 into Gaussian integers.
3. Find the g.c.d. of each of the following pairs of Gaussian integers α_1 and α_2, and express it as β_1 α_1 + β_2 α_2:
   (a) 3 + 6i and 12 - 3i;
   (b) 5 + 3i and 13 + 18i.
4. (a) Prove Lemma 1.  (b) Prove Lemma 2.
5. Prove Theorem 14 from Lemma 2, and show explicitly that any two factorizations differ only by associates.
6. Prove that every ideal of Gaussian integers is principal.
7. (a) Prove that a rational prime p is prime in Z[i] if and only if the equation x^2 + y^2 = p has no solution in integers x and y.
   (b) Show that any rational prime of the form p = 4n + 3 is prime in Z[i].
   (c) Assuming that the multiplicative group mod p is cyclic (§15, Theorem 6), show that if p = 4n + 1, then x^2 ≡ -1 (mod p) has a solution in Z.
   (d) Conclude that p = 4n + 1 cannot be a prime in Z[i].
*8. (a) Prove that the quotient-ring Z[x]/(p, x^2 + 1) is isomorphic to both Z[i]/(p) and Z_p[x]/(x^2 + 1).
    (b) Prove that the first is an integral domain if and only if p is prime in Z[i], while the second is an integral domain if and only if x^2 ≡ -1 (mod p) has no solution in Z.

Exs. 9–13 all refer to the domain Z[√-2] of numbers a + b√-2, where a and b are integers.

9. Define a norm as N(a + b√-2) = a^2 + 2b^2 and exhibit its properties.
10. Prove a division algorithm in the domain Z[√-2].
11. Prove the existence of greatest common divisors in Z[√-2], using a Euclidean algorithm.
12. State and prove the unique decomposition theorem for Z[√-2].
13. Factor the following numbers in Z[√-2]: 5, 11, 2 + √-2, 1 + 3√-2.
14. (a) Find a unit different from ±1 in Z[√2].
    (b) Show that there is an infinite number of distinct units in Z[√2]. (Hint: Use powers of one unit.)
14.8. Algebraic Integers

In general, an algebraic number u is said to be an algebraic integer if the monic irreducible equation satisfied by u over the field Q of rationals,

(14)   u^n + a_{n-1} u^{n-1} + · · · + a_1 u + a_0 = 0,

has integers a_i as coefficients. The irreducible equation satisfied by a rational number m/n is just the linear equation x - m/n = 0. Therefore a rational number is an algebraic integer if and only if it is an integer in the ordinary sense. Such an (ordinary) integer of Z may be called a rational integer to distinguish it from other algebraic integers.

A number may be an algebraic integer even if it doesn't look the part. For example, u = (1 + √5)/2 looks like a fraction but satisfies an equation

(x - (1 + √5)/2)(x - (1 - √5)/2) = x^2 - x - 1 = 0

which is monic and has integral coefficients. An algebraic number u ≠ 0 is called a unit if both u and u^{-1} are algebraic integers.

In testing whether a given algebraic number is an integer, it is not necessary to appeal to an irreducible equation, by virtue of the following result:

Theorem 16. A number is an algebraic integer if and only if it satisfies over Q a monic polynomial equation with integral coefficients.

Proof. If u is an algebraic integer, its monic irreducible equation (14) is itself an equation of the required kind. Conversely, suppose that u is a root of some monic polynomial f(x) with integral coefficients. Over Q, u also satisfies an irreducible polynomial p(x), which may be taken to have integral coefficients; any common divisor of these coefficients may be removed, so we can assume that the coefficients of p(x) have 1 as g.c.d. This amounts to saying that p(x) is primitive, in the sense of §3.9. By Theorem 2 we know that the polynomial f(x) with root u must be divisible, in Q[x], by the irreducible polynomial p(x) for u, so that f(x) = q(x)p(x). The given polynomial f(x) is monic with integral coefficients, hence is also primitive; Gauss's lemma (§3.9), applied in the domain Z[x] of all polynomials with integral coefficients, asserts that the quotient q(x) also has integral coefficients. The leading coefficient 1 in f(x) is then the product of the leading coefficients in q and p, hence ±p(x) is monic, which means that u is integral according to the definition (14). Q.E.D.

This suggests a systematic search for those numbers in quadratic fields which are algebraic integers.
Theorem 17. Any field K of degree 2 over the field Q of rationals can be expressed as a simple algebraic extension K = Q(√d), where d ≠ 1 is an integer with no square factors. If d ≡ 2 or d ≡ 3 (mod 4), the algebraic integers in Q(√d) are the numbers a + b√d, with (rational) integers a and b as coefficients; if d ≡ 1 (mod 4), the integers of Q(√d) are the numbers a + b(1 + √d)/2.

Proof. Without loss of generality, one may assume that d is an integer and that it has no factor (except 1) which is the square of an integer. As a preliminary, observe that a ≡ 1 (mod 2) means that a = 1 + 2r, hence that a^2 = 1 + 4r + 4r^2 ≡ 1 (mod 4); so, modulo 4,

(15)   a ≡ 1 (mod 2) implies a^2 ≡ 1 (mod 4),
(16)   a ≡ 0 (mod 2) implies a^2 ≡ 0 (mod 4);

a square is always congruent to 0 or 1, modulo 4.

Any number u in Q(√d) may be expressed as u = (a + b√d)/c, where the integers a, b, and c have no factor (except ±1) in common. We assume b ≠ 0, to exclude the trivial case of a rational number. The monic irreducible quadratic equation for u is then

(17)   [x - (a + b√d)/c][x - (a - b√d)/c] = x^2 - (2a/c)x + (a^2 - db^2)/c^2 = 0.

If u is an algebraic integer, these coefficients 2a/c and (a^2 - db^2)/c^2 must be integers. Therefore 4a^2/c^2, 4(a^2 - db^2)/c^2, and hence 4db^2/c^2 must all be integers, so that c | 2a and c^2 | 4db^2. Since d was assumed to contain no square factors, any prime p ≠ 2 contained in c must divide both a and b, contrary to the arrangement that a, b, and c have no factor in common. For similar reasons 4 | c is impossible, so the only choices for c are c = 1 and c = 2.

Consider now the case d ≡ 2 or d ≡ 3 (mod 4), and suppose c = 2, so that a, b, and c do not all share the factor 2. In this case the last coefficient (a^2 - db^2)/4 of (17) must be integral, so a^2 ≡ db^2 (mod 4). If b ≡ 0 (mod 2), then db^2 ≡ 0 (mod 4), hence a^2 ≡ 0 (mod 4) and a ≡ 0 (mod 2), so that a, b, and c have a common factor 2 — contrary to hypothesis. If b ≡ 1 (mod 2), then b^2 ≡ 1 (mod 4), and a^2 ≡ db^2 ≡ 2 or 3 (mod 4), a contradiction to the rules (15) and (16). In either event we conclude that c is 1. Therefore, if d ≡ 2 or d ≡ 3 (mod 4), the algebraic integers in Q(√d) are the numbers a + b√d, with (rational) integers a and b as coefficients. Conversely, the monic equation (17) for a number of this form does have integral coefficients.

The remaining case d ≡ 1 (mod 4) is given a similar treatment, except that a ≡ b ≡ 1 (mod 2), with c = 2, turns out to be possible; the integers of Q(√d) are then the numbers a + b(1 + √d)/2 (Ex. 1).
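Theorem 17's criterion is easy to test by machine: (a + b√d)/c is an algebraic integer exactly when both coefficients of the monic quadratic (17) — namely 2a/c and (a^2 - db^2)/c^2 — are integers. The sketch below (plain Python; the function name is ours) confirms the mod 4 dichotomy on small samples.

```python
from fractions import Fraction

def is_quadratic_integer(a, b, d, c):
    """Is (a + b*sqrt(d))/c an algebraic integer (b != 0, d squarefree)?
    By (17), yes iff 2a/c and (a^2 - d*b^2)/c^2 are both integers."""
    return (Fraction(2 * a, c).denominator == 1
            and Fraction(a * a - d * b * b, c * c).denominator == 1)

# d = 5 = 1 (mod 4): "half-integers" such as (1 + sqrt(5))/2 are integers.
assert is_quadratic_integer(1, 1, 5, 2)

# d = 3 = 3 (mod 4): (1 + sqrt(3))/2 is not an algebraic integer.
assert not is_quadratic_integer(1, 1, 3, 2)

# For d = 2 or 3 (mod 4), c = 2 gives no integers at all in lowest terms:
for d in (2, 3, 6, 7, -1, -2):
    for a in range(-6, 7):
        for b in range(1, 7):
            if a % 2 == 0 and b % 2 == 0:
                continue          # (a + b*sqrt(d))/2 not in lowest terms
            assert not is_quadratic_integer(a, b, d, 2)
```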
Corollary. Sums, differences, and products of integers of Q(√d), represented as in Theorem 17, are again integers of this form.

Corollary. In any field of degree 2 over Q the set of all algebraic integers is an integral domain.

The next task is that of generalizing this corollary to any algebraic number field.

Exercises

1. Complete the proof of the second case of Theorem 17 (d ≡ 1 (mod 4)).
2. Find all the integers in Q(√2, i).
3. (a) Find all integers and all units in Q(w), where w is a complex cube root of unity.
   (b) Prove that every unit in Q(w) is a root of unity.
4. Prove that every root of unity is an algebraic integer.
*5. (a) Prove that any algebraic number can be written as a quotient u/b, where u is an algebraic integer and b a rational integer (i.e., an integer of Z).
   (b) Prove that any field K of algebraic numbers is the field of quotients of the domain of all algebraic integers in K.

14.9. Sums and Products of Integers

This section is devoted to the proof of the following result:

Theorem 18. The set of all algebraic integers is an integral domain.

The following specialization is an immediate consequence:

Corollary. In any field K of algebraic numbers, the algebraic integers form an integral domain.

An instructive proof of Theorem 18 depends on an analysis of the additive groups generated by algebraic integers. If v1, ..., vn are any algebraic numbers, we let G = [v1, ..., vn] denote the subgroup generated by these numbers in the additive group of all complex numbers. (Such an additive group is sometimes called a Z-module because its elements can be multiplied by "scalars" from Z.)
Among the elements of G k which lie in the given subgroup S. (If in every element the coefficient of Vk is zero. 14 Algebraic Number Fields group G simply consists of all numbers representable in the form (18) (ai rational integers). . The product of u by any j j element L aju of G is still an element L aju + 1 of G. vn] generated by the last n . for given any any element W in S. Therefore u satisfies the criterion of Lemma 2.ql WI depends only on V2. its first coefficient bk may be written bk = qkCk + rk. . .qkWk in G k + l • The n selected elements WI. Proof.. •. If u is an integer. Proof. with a nonnegative remainder rk < Ck.446 Ch. Conversely. U " .) If W = bkvk + . For each index k let G k be the subgroup [Vk. u.. one may find ql so that w .. un-I] generated by n smaller powers of u.. . set Wk = 0. Recall that the natural multiple av = a x v is simply a "power" of v in the additive cyclic subgroup generated by v. The difference W .. then lies in the groups G k and S and has a nonnegative first coefficient rk less than the minimum Ck' Therefore rk = 0. U.qlwl '.• .qkWk = rkvk + .D.Vn] can also be generated by n or fewer numbers. This equation expresses un as an element in the group G = [1. of u can be generated by a finite number of elements. Vn of G. + anvn. u. By iteration. Lemma 1.. .. Any subgroup S of the group G == [Vb' . the same equation may be used to express any higher power of u as an element of this group. select an element (19) in which the first coeffiCient Ck has the least positive value possible. . • can be generated by any n numbers Vb ••• .. if Lemma' 2.. it satisfies a monic equation (14) of degree n with integral coefficients. v'" and so on. suppose that the group G generated by 1. w" generate the whole group S. U2. at the end W = L qiWi' Q.. Theorem 13. and therefore. (c) J7 + (1 + J'S)/2. Y". ukvkand (u + V)k lies in the additive group generated by the products l. 
where the coefficients bi are certain polynomials in the integers aij and are thus themselves integers.. This equation (20) meanst that u is an algebraic integer.. .. . Proof. v.uI. The group S generated by I.§14. so the matrix of coefficients must be linearly dependent (§7. its determinant is zero. 10. Show explicitly that each of the following numbers is an algebraic integer by displ~ing an appropriate monic equation with integral coefficients: (a) ~2 + J3. If all the positive powers of an algebraic number u lie in an additive group generated by a finite set of numbers Yb .u.1 and I. where A = I/aiil/. where the aij are integers.' • " u n . . Corollary). U2.. + bn = 0.E.1• Therefore every power (UV)k . U. . By the corollary it follows that uv and u + v are algebraic integers.uv. .9 447 Sums and Products of Integers generators as UVi = L aijVj. of the form =0 . is a subgroup of the group generated by I. the number u is an algebraic integer. =0 . If u and v are algebraic integers. in the sense of Oiap. t Note that (20) is simply the characteristic polynomial of A.7. This system of equations has a set of solutions Vb V2. The matrix of coefficients may be written as A .1.. Return now to the proof of Theorem 18. then u is an algebraic integer..un-1v r . • • . The conclusion of Lemma 2 may be reformulated thus: Corollary. as required in the lemma. Q. Since it is singular. Exercises 1. The hypothesis means that all powers Uk and v k can be expressed in terms of a finite number of powers 1. as required for the theorem. this subgroup S can be generated by a finite number of its members." ·. Vn not all zero. u. (b) i + w.1 + .. we are to show that u + v and uv are integers. Yb . " v r . . These expressions i give n homogeneous equations in the v's. " Yn' Hence by Lemma I. so (20) \A - uI\ = (-I)"u n + bn_1u n.uv 2.D. by Lemma 2. The basic tool for this purpose is the concept of norm. show that. vnJ may be used to compute the index of S in G.10..... V..bJd). . uv = u . . 
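The determinant argument here can be made concrete. For u = √2 + √3, the matrix A of "multiplication by u" on the basis 1, √2, √3, √6 has integer entries, and its characteristic polynomial, as in (20), is a monic equation with integral coefficients satisfied by u. The sketch below (illustrative, not from the text) computes it by the Faddeev–LeVerrier recursion, a standard device not mentioned in the book.

```python
from fractions import Fraction as F

# Multiplication by u = sqrt(2) + sqrt(3) on the basis (1, sqrt2, sqrt3, sqrt6):
A = [[0, 1, 1, 0],   # u * 1     = sqrt2 + sqrt3
     [2, 0, 0, 1],   # u * sqrt2 = 2 + sqrt6
     [3, 0, 0, 1],   # u * sqrt3 = 3 + sqrt6
     [0, 3, 2, 0]]   # u * sqrt6 = 3*sqrt2 + 2*sqrt3

def charpoly(A):
    """Coefficients [1, c_{n-1}, ..., c_0] of det(xI - A), Faddeev-LeVerrier."""
    n = len(A)
    M = [[F(0)] * n for _ in range(n)]
    c = F(1)
    coeffs = [c]
    for k in range(1, n + 1):
        # M <- A (M + c I)
        B = [[M[i][j] + (c if i == j else 0) for j in range(n)] for i in range(n)]
        M = [[sum(F(A[i][l]) * B[l][j] for l in range(n)) for j in range(n)]
             for i in range(n)]
        c = -sum(M[i][i] for i in range(n)) / k
        coeffs.append(c)
    return coeffs

print(charpoly(A))   # [1, 0, -10, 0, 1]: u satisfies x^4 - 10 x^2 + 1 = 0
```

Since the resulting polynomial is monic with integral coefficients, Theorem 16 shows directly that √2 + √3 is an algebraic integer.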
2. If an algebraic number u satisfies a monic polynomial equation in which the other coefficients are algebraic integers, prove that u is also an algebraic integer.
3. If numbers v1, ..., vn are linearly independent over Q, show how the basis found in Lemma 1 for a subgroup S of G = [v1, ..., vn] may be used to compute the index of S in G. (Hint: Find a representative for each coset of S.)
*4. (a) If numbers v1, ..., vn are linearly independent over Q, prove that any subgroup S of finite index in G = [v1, ..., vn] also can be generated by n linearly independent numbers w1, ..., wn.
   (b) Show that any such subgroup S is (group) isomorphic to the whole group G.
5. Show that a group G = [v1, ..., vn] has no infinite ascending chain of distinct subgroups; i.e., given an infinite sequence of subgroups S1 ≤ S2 ≤ S3 ≤ ... ≤ G, there is an index m for which Sm = Sm+1 = Sm+2 = .... (Hint: Apply Lemma 1 to the join of the groups Sk.)
*6. (a) Show that every module contained in the domain Z of ordinary integers is an ideal of Z.
   (b) Exhibit a module contained in the domain Z[i] of Gaussian integers which is not an ideal of Z[i].

14.10. Factorization of Quadratic Integers

To illustrate the factorization theory of algebraic integers, we consider in more detail the simplest case, that of quadratic integers. Specifically, we consider factorizations of the integers of Q(√d), as characterized in Theorem 17. The basic tool for this purpose is the concept of norm. The norm is defined essentially by means of the automorphisms of the field; the formula for the norm depends on the field, but the idea is the same in all cases, even for algebraic number fields of higher degrees. The quadratic field Q(√d) has by Theorem 6 an automorphism u = a + b√d ↦ ū = a − b√d which carries each number u into its "conjugate" ū.

Definition. The norm N(u) of a number u = a + b√d of Q(√d) is the product uū of u by its conjugate ū:

(21) N(u) = uū = (a + b√d)(a − b√d) = a² − db².

Since the correspondence u ↦ ū is an isomorphism, the conjugate of a product uv is the product of the conjugates of u and v; hence

(22) N(uv) = N(u)N(v).
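Formula (22) is easy to verify computationally. The following sketch (an illustrative check, not part of the text) implements N(u) = a² − db² and confirms multiplicativity for d = −5.

```python
def mul(u, v, d):
    # (a + b*sqrt(d))(c + e*sqrt(d)) = (ac + d*be) + (ae + bc)*sqrt(d)
    a, b = u
    c, e = v
    return (a * c + d * b * e, a * e + b * c)

def norm(u, d):
    a, b = u
    return a * a - d * b * b          # N(u) = u * conj(u) = a^2 - d b^2

d = -5
u, v = (1, 1), (1, -1)                # 1 + sqrt(-5) and 1 - sqrt(-5)
print(norm(u, d), norm(v, d))         # 6 6
print(norm(mul(u, v, d), d))          # 36 = N(u) * N(v)
```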
The norm thus transfers any factorization w = uv of an integer in the field into a factorization N(w) = N(u)N(v) of a rational integer N(w). (The norm of an algebraic integer is a rational integer.) The properties of the norm depend basically on whether d is positive or negative, i.e., on whether Q(√d) is a real or complex quadratic field. If d < 0, then N(u) is simply |u|², the square of the absolute value of u, and it is positive unless u = 0; whereas if d > 0, the norm a² − b²d may be positive or negative. This difference shows up in the group U of the units of Q(√d), as we shall now see.

Lemma 1. An integer u ∈ Q(√d) is a unit if and only if N(u) = ±1.

Proof. Trivially, N(1) = 1. Hence if uv = 1 for some other integer v ∈ Q(√d), then N(u)N(v) = N(uv) = 1; since each norm is a rational integer, N(u) = ±1. Conversely, if N(u) = uū = ±1, then u(±ū) = 1 and u is a unit of Q(√d). Q.E.D.

Combining Lemma 1 with Theorem 17, one can determine the units of any complex quadratic number field Q(√−d), d > 0 a square-free integer. The integers of Q(√−d) then have the form u = m + nα (m, n ∈ Z), where

α = √−d if d ≢ 3 (mod 4);  α = (1 + √−d)/2 if d ≡ 3 (mod 4).

Correspondingly, the norm of u satisfies

N(u) = m² + dn² if d ≢ 3 (mod 4);  N(u) = m² + mn + n²(1 + d)/4 if d ≡ 3 (mod 4).

If d ≢ 3 (mod 4) and d > 1, then m² + n²d = 1 is possible only if m = ±1, n = 0, so the only units of Q(√−d) are ±1. Likewise, if d ≡ 3 (mod 4) and d > 3, then d ≥ 7 and N(u) ≥ 7n²/4 > 1 unless n = 0; again the only units are ±1. The only units of Q(√−1) are ±1 and ±i, while those of Q(√−3) are the powers of ω = (1 + √−3)/2, which is a primitive sixth root of unity. This proves

Theorem 19. The only complex quadratic number fields having units other than ±1 are Q(√−1) and Q(√−3).

A similar argument applies to algebraic number fields generally.
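Theorem 19 can also be confirmed by brute force: solve N(u) = 1 over the integers of Theorem 17. The sketch below is an illustrative check, not part of the text, and its parametrization of the integers of Q(√−d) follows the two cases above.

```python
def units(d):
    """Units of Q(sqrt(-d)), d > 0 square-free: the integers are
    (m + n*sqrt(-d))/2 with m ≡ n (mod 2) when d ≡ 3 (mod 4),
    and m + n*sqrt(-d) otherwise."""
    out = []
    for m in range(-3, 4):
        for n in range(-3, 4):
            if d % 4 == 3:
                # N(u) = (m^2 + d n^2)/4, so units satisfy m^2 + d n^2 = 4
                if (m - n) % 2 == 0 and m * m + d * n * n == 4:
                    out.append((m, n))
            elif m * m + d * n * n == 1:
                out.append((m, n))
    return out

print(len(units(1)), len(units(3)), len(units(5)))   # 4 6 2
```

Only d = 1 (the four units ±1, ±i) and d = 3 (the six sixth roots of unity) give more than the two units ±1.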
For example, since N(1 + √2) = −1, 1 + √2 is a unit of the real field Q(√2); hence so are all the powers (1 + √2)^±k.

Though factorization into primes is unique for many rings of quadratic integers, this is not the case in Q(√−5). For example, consider the factorizations of the number 6:

(23) 6 = 2·3 = (1 + √−5)(1 − √−5).

If two integers u and v of Q(√−5) satisfy uv = 6, then N(u)N(v) = N(6) = 36. A proper factor u of 6 will thus have a norm which is a proper factor of 2²3², so only the cases N(u) = 2, 3, 4, 6, 9, 12, 18 require investigation (for instance, N(u) = 2 would force N(v) = 18). One easily sees from N(m + n√−5) = m² + 5n² that all possible factors are listed in (23): since m² + 5n² is never 2 or 3, while N(2) = 4, N(3) = 9, and N(1 ± √−5) = 6, it suffices to consider N(u) = 2, 3, and each of the four factors displayed is irreducible.

One can rescue the unique factorization theorem in the preceding example by considering products of ideals, instead of products of numbers. The relevant prime ideals are the ideals P = (2, 1 + √−5), O = (3, 1 + √−5), and Ō = (3, 1 − √−5), as described by their bases in Z[√−5], as in §13. These ideals are not principal ideals. To show that P is a prime ideal in Z[√−5], we observe that m + n√−5 ∈ P if and only if m + n ≡ 0 (mod 2). Hence the quotient Z[√−5]/P contains only two elements and is the field Z2; therefore P is a prime ideal. Similarly, Z[√−5]/O is Z3, and so O is prime, as is Ō. Squaring and multiplying these ideals, one finds

P² = (4, 2 + 2√−5, −4 + 2√−5) = (2),  OŌ = (9, 3 + 3√−5, 3 − 3√−5, 6) = (3),

showing that the principal ideals (2) and (3) are not prime. Hence, in these cases, we have shown that the ideal (6) of Z[√−5] has the unique factorization (6) = P²OŌ into prime ideals.

By a further development one may establish the "fundamental theorem of ideal theory": In the domain D of all algebraic integers in an algebraic number field K, every ideal can be represented uniquely, except for order, as a product of prime ideals. In particular, every integer u of the domain determines a principal ideal (u) which has such a unique factorization. This unique ideal decomposition which we have derived in the domain Z[√−5] serves merely to indicate how the notion of an ideal may be used systematically to reestablish the unique decomposition theorem in domains of algebraic integers where the ordinary factorization is not unique.
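The key step, that no integer of Z[√−5] has norm 2 or 3, can be checked directly; any solution of m² + 5n² ∈ {2, 3} would have |m|, |n| ≤ 1, so a small search settles it. This sketch is illustrative and not part of the text.

```python
# Norms N(m + n*sqrt(-5)) = m^2 + 5 n^2 attained in a box around 0.
norms = {m * m + 5 * n * n for m in range(-10, 11) for n in range(-10, 11)}

print(2 in norms, 3 in norms)   # False False: no element has norm 2 or 3
# Yet N(2) = 4, N(3) = 9, N(1 ± sqrt(-5)) = 6, so every factor in (23)
# is irreducible, although 2*3 = (1 + sqrt(-5))(1 - sqrt(-5)) = 6.
```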
14 Algebraic Number Fields Real quadratic number fields have infinitely many units. In particular. One finds that the principal ideals (2).J 5).1 +./2 is a unit of Q(.4. squaring them: p2 = (4. In conclusion..J 5]/ P contains only two elements and is the field Z2..1 +. as in §13. Since. where w = (-1 + v'-3)/2. (a) Prove D is a unique factorization domain. 2. Prove that the number of units in a quadratic field Q(v'-d) with d positive is finite. show that the norm of an integer is an integer. S. (iii) given a and {3 -:jo 0. Find all units in Q[v' -7]. (a) In any quadratic field. *4. Let D be any integral domain in which a norm N(a) is defined. 3. . y and ( exist such that a = {3y + (. where (i) N(a) is a positive integer if a -:jo 0. (Hint: The integral multiples of any (3 divide the complex plane into equilateral triangles.) *6. and N(() < N({3). (b) If u = a + bJd is not rational.10 Factorization of Quadratic Integers 451 Exercises 1. (b) Prove every ideal in D is principal. State and prove a division algorithm for Z[w]. and show that every unit is a root of unity. show that N(u) is the constant term in the monic irreducible equation satisfied by u. Prove that the roots of unity which lie in any given algebraic number field form a cyclic group.§14. (ii) N(a{3) = N(a)N({3). we mean. and quartic equations which we derived in Chap.i= 0) is a quadratic polynomial over F with the conjugate rootst Uj = (b ± .2.15 Galois Theory 15. j = 1. Their efforts produced the solutions "by radicals" of the general quadratic. Definition.• (x . This final chapter presents the most essential arguments of Galois in modem form.Ut)' .1. This is the so-called "root field" of p(x). .Jb 2 . Root Fields for Equations Classically. who showed that an equation is solvable by radicals if and only if the group of automorphisms associated with it is "solvable" in a purely grouptheoretic sense. un).un) in N.4ac)/2a. 
An extension N of F is a root field of a polynomial f(x) of degree n > 1 with coefficients in F when (i) f(x) can be factored into linear factors f(x) = c(x . the simple t By a "root" of a polynomial t. a number x such that t(x) = 0.. which we now define formally. If f(x) = ax 2 + bx + c (a ::. as N = F(Ub . The automorphisms in question are those automorphisms of the extension field generated by all the roots of the equation. But repeated attempts to obtain similar formulas which would solve general quintic (fifth-degree) equations proved fruitless. of COurse. . beginning with an examination of the extension field generated by all the roots of a given polynomial p (x) over a given field F. The reason for this was finally discovered by Evariste Galois. 5.. cubic. such an x is also called a "zero" of t(x). algebraists tried to solve real and (later) complex polynomial equations by explicit fonnulas. which leave fixed all the coefficients of the equation. (ii) N is generated over F by the roots of f(x). 1 over K. This is true because U2 = c/ aUi> whence f(x) = a(x . ~5. If a and b are any two elements of . and is an extension of Q of degree six. as follows. w 2rs) = Q(rs.1.5) of the rational field generated by the real cube root of 5 is of degree three over Q. Ex.)/2 is a complex cube root of unity. let F* be the set of all elements that appear in one of the Fn-and hence in all its successors. purely algebraically.3 there exists a simple extension K = F(u) generated by a root U of p(x).1 • Finally..§15. and generally. in a sequence Pl(X). This field N is a root field for f(x). Hence the number of all polynomials over F is countable (ct. Suppose the theorem true for all fields F and for all polynomials of degree n . irreducible over F.Ul)(X . Theorem 1 can be used to construct. 14. and we can arrange these polynomials.5 thus has the basis (1. f(x) has a root U and hence a factor x . Any polynomial over any field has a root field. 
since w satisfies the cyclotomic equation w 2 + w + 1 = O.5 over Q is Q(rs. Over K.1 Root Fields for Equations 453 extension K = F(Ul) == F[x]/(f(x» of F generated by one root Ul of f(x) = 0 is already the root field of f over F. an algebraically complete extension of any finite or countable field F.Qver 2:. ~25.. . P3(X). let Fn be the root field of Pn (x) over Fn . By Theorem 5 of §14. while the smallest extension of Q containing all cube roots of 5 is N = Q(rs. This is of degree two over Q(~'5). The real extension field Q(.· . and let p(x) be a factor. w~5. so f(x) = (x .U2) can be factored into linear factors over in K = F(Ul). and the induction assumption provides a root field N over K generated by n . . so that it is legitimate to speak of the root field of f over F. w). hence we may use induction on the degree n of f(x). let F2 be the root field of pz(x) over F 1 . The number of polynomials of degree n over F is finite or countable. However. Appendix. § 12.1 roots of g(x). w. For a polynomial of first degree. A general existence assertion for root fields may be obtained by using the known existence of simple algebraic extensions. The quotient g(x) is a polynomial of degree n . It will be proved in the next section (Theorem 2) that all root fields of a given polynomial f over a given base field F are isomorphic. w~25). of the given polynomial f(x). as follows: Theorem 1. P2(X).¥5) == Q[x]/(x 3 .2). wrs. "... where w = (-1 + 5. Thus the root field N of x 2 . the root field is just the base field F. being d n + 1 = d (d = countable infinity) if F is countable.u)g(x). Considered as a ~ector space . the root field N of x 3 . this is not generally true of irreducible cubic polynomials. Now let Fl be the root field of Pl(X) over F. w)..U. the above line of argument can be modifiedt so as to apply to any field F.3). Specifically. Therefore a + b. 
The assertion that the root field is unique is essentially a straightforward consequence of the fact that two different roots of the same irreducible polynomial generate isomorphic simple extensions (Theorem 6. Part I. . Using Theorem 9. §14. Any field F has an algebraically complete extension. and (for b rt 0) a/ b must also have the same value in Fn and all its successors.. Uniqueness Theorem We now prove the uniqueness (up to isomorphism) of the root field of Theorem 1.5. But h(x) can certainly be factored into linear factors in its root field Fm over an appropriate Fm-I-hence so can its divisor g(x). Theorem 2. 15 Galois Theory 454 F*. it remains only to extend appropriately this isomorphism to the whole root field. all the coefficients of g(x) will be in some Fm and so algebraic over F. Any two root fields Nand N' of a given polynomial f(x) over F are isomorphic. . 15. Hence there is an isomorphism T of F(UI) to F(u/). Proof. one can then find a nonzero multiple h(x) of g(x) with coefficients in F (see Ex. Hence g(x) can also be factored into linear factors over the larger field F*.2. they must both be in some Fn and hence in all its successors. two root fields N = F(Ub . which is therefore an algebraically complete field of characteristic p. Furthermore. 5 below). Using general well-ordered sets and so-called transfinite induction instead of sequences. If an isomorphism S between fields F and F' carries the coefficients of an irreducible polynomial p(x) into the corresponding coeffit A detailed proof appears in B. 1930 (in some but not all editions). which shows that F* is a field. The isomorphism of N to N' may be so chosen as to leave the elements of F fixed. " un') of an irreducible p(x) contain isomorphic simple extensions F(UI) and F(UI') generated by roots UI and u/ of p(x). Moderne Algebra.. let g(x) be any polynomial over F*. L. The basic procedure for such an extension is given by Lemma 1. 
The modification establishes the following important partial generalization of the Fundamental Theorem of Algebra. van der Waerden. Berlin. every element of F* is algebraic over F. " un) and N' = F(u/. To show that F* is algebraically complete.Ch. §14. ab. Since mid < m. the induction assumption of our lemma therefore asserts that the isomorphism S* of (2) can be extended from F(u) to N.X . § 14. This proves Lemma 2. (b) x 3 . For the same reason.1 for all ai in F. For m = 1 it is trivial. the isomorphism S can be extended to an isomorphism of N to N'. with (2) uS* = u' . + an_lun-1)S* = aoS + (a1S)u' + . Find the degrees of the root fields of the following polynomials over Q: (a) x 3 . Since m > 1. hence take m > 1 and assume the lemma true for all root fields N of degree less than m over some F. This will be established by induction on the degree m = [N: F].. + (an_1S)(U'r. thereby proving Theorem 2. Since N is generated over F by the roots of f(x). Exercises 1.x 2 . In case the two root fields Nand N' are both extensions of the same base field F. Lemma 2 shows that N is isomorphic to N'. and S is the identity mapping of F on itself. [F(u)]S* = F'(u'). If an isomorphism S of F to F' carries f(x) into a polynomial f'(x) and if N ::::> F and N' ::::> F' are. so N is a root field of f(x) over F(u). respectively. Let u be a root of p(x) in N. (d) (x 2 - 2)(x 2 - 5) = O.. the desired extension S* is given explicitly by the formula (1) (ao + alU + . p'(u') = O. respective/y. not all roots of f(x) lie in F. Proof.2 = 0.. then S can be extended to an isomorphism S* of F(u) to F'(u').3. where n is the degree of u over F. root fields of f(x) and f'(x). N is certainly generated over the larger field F(u) by these roots. so there is at least one irreducible factor p(x) in f(x) of degree d > 1. by roots u and u' of these polynomials. The root field N' then contains a root u' of p'(x).2 = 0. N' is a root field of f'(x) over F'(u'). Lemma 2.§15. with a degree mid. 
and by Lemma 1 the given S can be extended to an isomorphism S*. p(u) = 0. while p'(x) is the factor of f'(x) corresponding to p(x) under the given isomorphism S. in which uS* = u'..2 455 Uniqueness Theorem cients of a polynomial p'(x) over F'. . Exactly as in the discussion of Theorem 6. since S is then already extended to N. and if F(u) and F'(u') are simple extensions generated. (c) X4 - 7 = 0. set h(x) = II hj(x). Show that Q(z 1.an). Since a field of characteristic 00 always contains an infinite subfield isomorphic to the rationals (Theorem 14.) 6. x. 3. Zn) is the root field of p over Q. Finite Fields By systematically using the properties of root fields. (Hint: Form a root field of g(x) over F(ao. the non-zero elements form a multiplicative group of order q . prove that g(x) is a divisor of some nonzero h(x) with coefficients in F. + anx n have coefficients algebraic over a field F. Prove: The root field of a polynomial of degree n over a field F has at most the degree n! over F. Prove that any algebraically complete field of characteristic p contains a subfield isomorphic to the field constructed in the Appendix of §15. Corollary. In a finite field F with q = pn elements. *5. The order of every element in this group is then a divisor of q . prove that QW is the root field of xn . ...4.8). we can assume that F contains the field Zp of integers modulo p (see Theorem 12. ••• .al)(x .1. factor g(x) into linear factors (x .. ••• . 5.456 Ch.I = 1. § 13. .aq of F (including zero) satisfy the equation (3) xq - x = 0. . 15 Galois Theory 2. so that every element satisfies the equation x q . the U will be algebraic over F with irreducible equations hj(x).a2) ..uJ in this root field. so there are pn elements in F all told.1. 7). ••. This proves Theorem 3. one can obtain a complete treatment of all fields with a finite number of elements (finite fields).Un over Zp..Zn be its complex roots. Therefore all q elements ai. (a) If ( is a primitive nth root of unity.'" . 
The finite field F is then a finite extension of Zp and so has a basis U I.6. The number q of elements in a finite field is a power pn of its characteristic. Let g(x) = ao + a1x + . . (x . 4. (b) Compute its degree for n = 3. Let P E Q[x] be any monic polynomial with rational coefficients.1 = 0 over Q. and let Z 1.1.3. Without loss of generality. §13. a2. Hence the product (x .a q ) is a divisor of x q being a product of relatively prime polynomials each dividing x q - x. every finite field F has a prime characteristic p. Every element in F has a unique expression as a linear combination L ajUi' Each coefficient here can be chosen in Zp in exactly p ways.. j 15. x over Zp. the sum of any two of the roots Ub· .x had a mUltiple factor (x . is monic and of degree q. Ex. hence Theorem 5.u)g'(x)].uq of q x . This proves the lemma. If x q . The set of all q roots of x q .U)2g(X)]' = (x . a contradiction. Lemma.x over Zp. since this subfield contains all the roots.x is therefore a subfield of the root field N. For any prime p and any positive integer n.u) would be a divisor of -1. Next consider the question: which finite fields really exist? To exhibit a finite field one would naturally form the root field N of the polynomial x q . 7). On the other hand. it must actually be the whole root field N. pn The product ab is also a root. - x has q distinct linear factors in its root The proof will be by contradiction. then (a ± b )pn = a pn ± b pn = a ± b. for (a ± b)P = a P ± b P in any field of characteristic p. we would have (x q - x)' = q X xq- 1 - 1 = -1 [(x . This means that we have constructed a field with q elements. for (ab )pn = apnb = ab. we conclude that (4) q Therefore F is the root field of x . and a similar result holds for a quotient. . there exists a finite field with pn = q elements: the root field of x q = x over Zp. pn pn so that if a = a and b = b.U)2g (X).x is a root. 
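The construction of Theorem 5 can be carried out concretely for q = 4: GF(4) may be realized as Z2[x]/(x² + x + 1). The sketch below (the encoding of elements as coefficient pairs is my own, not the book's) verifies that all four elements satisfy x^q = x, as in equation (3).

```python
P = 2
# GF(4) = Z2[x]/(x^2 + x + 1); element a0 + a1*x stored as the pair (a0, a1).

def gf_mul(a, b):
    a0, a1 = a
    b0, b1 = b
    # (a0 + a1 x)(b0 + b1 x) = a0b0 + (a0b1 + a1b0) x + a1b1 x^2, with x^2 = x + 1
    return ((a0 * b0 + a1 * b1) % P, (a0 * b1 + a1 * b0 + a1 * b1) % P)

def gf_pow(a, n):
    r = (1, 0)
    for _ in range(n):
        r = gf_mul(r, a)
    return r

elems = [(i, j) for i in range(P) for j in range(P)]
print(all(gf_pow(a, 4) == a for a in elems))   # True: every element satisfies x^4 = x
print(gf_pow((0, 1), 3))                       # (1, 0): x generates the cyclic group of order 3
```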
Any other finite field F with the same number of elements is the root field of the same equation. Any two finite fields with the same number of elements are isomorphic.§15. whence (x . hence is isomorphic to F by the uniqueness of the root field (Theorem 2).u). Comparing formal derivatives (§3. This argument proves Theorem 4.1. we could write x q .x = (x .u)[2g(x) + (x . . The polynomial x q field N.3 457 Finite Fields Since it.. like x q - x. We now prove that the desired root field consists precisely of the roots of this polynomial. as follows.Ch. Proof. The structure of the multiplicative group of this field can be described completely. Exercises 1. write q .I = 1. a r Theorem 7.7. . is an element of order q .. Every finite field of characteristic p has an automorphism a P• ~ Proof. In any finite field F. exactly p e . 8 below). hence all lie in F. Theorem 6. 15 458 Galois Theory By Theorems 4 and 5 there is one and essentially only one field with pn elements. the q elements a give exactly q pth powers. Prove that there exists an irreducible polynomial of every positive degree over Zp. Of all the distinct roots of this equation x P ' = 1. in the multiplicative group of F. we must find in F a "primitive" (q . I (q .I = 1. which must then include the whole field F.1 (cf. This element Ci thus has order pt. < p.e . Ex. The product CIC2 ••• c. the powers of the primitive root will then exhaust the group. Theorem 13). In a finite field of characteristic p.1)st root of unity.1). From the general discussion of fields of characteristic p.' = 1. (0 < PI < P2 < . which has no lower power equal to 1.1 as a product of powers of distinct primes q . Therefore a ~ a P maps F on all of F. Some additional properties of finite fields are stated in the exercises. the multiplicative group of all nonzero elements is cyclic. where q is the number of elements in F. in the sense that it satisfies the equation x q . To this end. 
so the roots of x ' = 1 are all roots of x q ..1)st root of unity. P For each P = Pi. Since this correspondence is one-one. every element has a pth root.). therefore F contains p«-t at least one root c = Ci of x = 1 which does not satisfy x = 1. This field is sometimes called the Galois field GF(pn). as desired. Corollary. we know that the correspondence a ~ a Po maps F isomorphic ally into the set of pth powers (§13. To prove the group cyclic.1 = PI e 'P2 e2 r ••• p. Each nonzero element in F is a (q .I satisfypethe equation x P ' . In general." of distinct primes has order exactly Pl" . the set S of perfect squares has cardinality (q + 1)/2 at least. (b) If min. Prove that every finite field containing Zp is a simple extension of Zp. 10. Hence .1). Show that the lattice of all subfields of th. the field C of complex numbers has. 7..S) cannot be void for any a E S. show that any subfield of GF(p") has pm elements.4.bi. then GF(p") has a subfield with pm elements.e Galois field of order p" is isomorphic to the lattice of all positive divisors of n. 15. an automorphism T of a field K is a bijection a ~ aT of the set K with itself such that sums and products are preserved. 4. which maps each number on its complex conjugate. 5. in the sense that for all a and b in K. (a) Using degrees. (Hint: Show the order divides h. (b) Infer that S II (a . of elements Cj whose orders are powers p. 9. where min. if min. but fails to divide hlp. (b) Let ( be a primitive pth root of unity over the field Q of rational numbers. for any i. Such an isomorphism of a field onto itself is known as an automorphism. show that there exists a primitive mth root of unity over F. (c) Conclude that every element is a sum of two squares. In GF(p") show that the automorphism a ~ a P has order n.1. 6. 8.4 459 The Galois Group 2. Use (a) to prove that the Galois group of Q(() over Q is cyclic of order p . prove that (pm .1) I(p" . 3. If m is relatively prime to the characteristic p of F. 
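Theorem 6 can be illustrated in the prime field Z13. The following sketch (an illustrative computation, not from the text) finds the multiplicative order of each nonzero residue and lists the primitive roots, i.e., the elements of order q − 1 = 12.

```python
def order(a, p):
    """Multiplicative order of a (mod p), for gcd(a, p) = 1."""
    k, x = 1, a % p
    while x != 1:
        x = x * a % p
        k += 1
    return k

p = 13
orders = {a: order(a, p) for a in range(1, p)}
primitive = [a for a, k in orders.items() if k == p - 1]
print(primitive)   # [2, 6, 7, 11]: the multiplicative group mod 13 is cyclic
```

Every order divides 12, and the powers of any one primitive root exhaust all twelve nonzero residues.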
The composite ST of two automorphisms Sand T is also an automorphism. (ab)T = (aT)(bT). (a) Show that in any finite field of order q = p". (5) (a + b)T = aT + bT.) (a) Show from first principles that the multiplicative group of nonzero integers mod p (in Zp) is cyclic. (c) Use (b) to show that. For example. (Hint: Apply the method used for Theorem 6. relative to the real numbers. two symmetries. The Galois Group Groups can be used to express the symmetry not only of geometric figures but also of algebraic systems. one is the identity and the other is the isomorphism a + bi ~ a . Does this apply to a field of characteristic 007) Prove: in an Abelian group the product C1C2· •• c..§15. Prove that every finite extension of a finite field is a simple extension. and the inverse of an automorphism is again an automorphism. p/' = h. hence p(u) = 0 gives (un + bn_1u n. they form a subgroup called the automorphism group of Kover F. let us determine the possible images of an algebraic number under an automorphism. Thus the automorphism group of Cover R consists of the two automorphisms a + bi >--+ a + bi and a + bi >--+ a . generated by either of the conjugate roots ±v'2 of x 2 = 2. the conjugate roots and EXAMPLE .. Consider the field K ::= Q(v'2.. + bo)T = (uTr + bn_1(uT)n-l + . there is an automorphism S of K carrying v'2 into -v'2 and leaving the elements of QU) fixed.1 + .5. Definition. That is. but before we consider specific examples. i) of degree four over the field of rationalst generated by v'2 and i = H. which is algebraic over F. Let K be an extension of F and consider those automorphisms T such that aT :::: a for every a in F. This equation states that uT is also a root of p(x). This theorem asserts that u and its image uT both satisfy the same irreducible equation over F. 1. hence that uT is a conjugate of u... The most imp'ortant special case is the automorphism group of a field of algebraic numbers over the field Q of rationals. Theorem 9.. 
Theorem 8. Any automorphism T of a finite extension K over F maps each element u of K on a conjugate uT of u over F.

To prove it, let the given element u satisfy a monic irreducible polynomial equation p(x) = x^n + b_{n-1}x^{n-1} + ... + b_0 = 0 with coefficients b_i in F. The automorphism T preserves all rational relations and leaves each b_i fixed, so by (5)

(uT)^n + b_{n-1}(uT)^{n-1} + ... + b_1(uT) + b_0 = 0,

and uT is a conjugate of u over F.

Definition. The set of all automorphisms of a field K is a group under composition. The automorphism group of a field K over a subfield F is the group of those automorphisms of K which leave every element of F invariant. These are the automorphisms which leave F elementwise invariant.

EXAMPLE 1. Let K = Q(√2, i); one may observe that this field is the root field of x⁴ − 2x² + 9. Over the intermediate field F = Q(i) the whole field K is an extension of degree two, so by Theorem 6 of §14 there is an automorphism S of K leaving the members of Q(i) fixed and carrying √2 into its conjugate −√2. The effect of S on any element u of K is

(6)  (a + b√2 + ci + d√2 i)S = a − b√2 + ci − d√2 i,

where we have written each element of K in terms of the basis 1, √2, i, √2 i. By a similar argument, there is an automorphism T leaving the members of Q(√2) fixed and carrying i into −i, where i = √−1, so T simply maps each number on its complex conjugate. Then

(7)  (a + b√2 + ci + d√2 i)T = a + b√2 − ci − d√2 i.

The product ST is still a third automorphism of K. It is

(8)  (a + b√2 + ci + d√2 i)ST = a − b√2 − ci + d√2 i,  ST = TS.

The effect of these automorphisms on √2 and i may be tabulated as

I: √2 ↦ √2, i ↦ i;  S: √2 ↦ −√2, i ↦ i;  T: √2 ↦ √2, i ↦ −i;  ST: √2 ↦ −√2, i ↦ −i.

We assert that I, S, T, and ST are the only automorphisms of K over Q. By Theorem 8, any other automorphism U must carry √2 into a conjugate ±√2 (for √2 and −√2 are algebraically indistinguishable) and i into a conjugate ±i. These are exactly the four possibilities tabulated above for I, S, T, and ST. Hence U must agree with one of these four automorphisms in its effect upon the generators √2 and i and, therefore, in its effect upon the whole field. Thus U = I, S, T, or ST.

The multiplication table for these automorphisms can be found directly from the tabulation of the effects on √2 and i displayed above. It is exactly like the multiplication table for the elements of the four group, so we conclude that the automorphism group of Q(√2, i) over Q is isomorphic to the four group {I, S, T, ST}.

Definition. If N = F(u₁, ..., u_n) is the root field of a polynomial f(x) = (x − u₁) ... (x − u_n) with coefficients in F, then the automorphism group of N over F is known as the Galois group of the equation f(x) = 0, or as the Galois group of the field N over F.

To see how such a group acts, let f(x) be any polynomial of degree n over F which has exactly k distinct roots u₁, ..., u_k in a root field N = F(u₁, ..., u_k), so that k ≤ n. Each automorphism T of the Galois group G of f(x) maps roots of f(x) onto roots of f(x) (Theorem 9), and distinct roots onto distinct roots. Hence T effects a permutation

(9)  u_i ↦ u_i T

of the distinct roots u₁, ..., u_k of f(x). On the other hand, every element w in the root field is expressible as a polynomial w = h(u₁, ..., u_k) with coefficients in F. Since T leaves these coefficients fixed, the properties (9) of T give

wT = h(u₁T, ..., u_kT).

This formula asserts that the effect of T on w is entirely determined by the effect of T on the roots, or that T is uniquely determined by the permutation (9). The permutations (9) include only those permutations which do preserve all polynomial identities between the roots and so can correspond to automorphisms. Since the product of two permutations is obtained by applying the corresponding automorphisms in succession, the permutations (9) form a group isomorphic to the group of automorphisms. The results so established may be summarized as follows:
Theorem 10. The Galois group of any polynomial is isomorphic to a group of permutations of its roots.

Corollary. The Galois group of a polynomial of degree n has order dividing n!.

To describe explicitly the automorphisms T of a particular Galois group, one proceeds as follows.

EXAMPLE 2. The equation x⁴ − 3 = 0 is irreducible over the field Q, by Eisenstein's Theorem, and has the four distinct roots r, ir, −r, −ir, where i = √−1 and r = ⁴√3 is the real, positive fourth root of 3. The root field N = Q(r, ir, −r, −ir) may be generated as N = Q(r, i). Since r is of degree four over Q and since i is complex, hence of degree two over the real field Q(r), the whole root field N has degree eight over Q. By Theorem 6, this extension N has a basis of 8 elements 1, r, r², r³, i, ir, ir², ir³ over Q.

Since every element in N can be expressed as a linear combination of these basis elements, with rational coefficients, the effect of an automorphism T will be completely determined once rT and iT are known. By Theorem 8, any automorphism must map i into one of its conjugates ±i, and r into a conjugate ±r or ±ir.

Several automorphisms of N may be readily constructed. Since N is an extension of degree two over the real field Q(r), it has an automorphism T which maps each number of N on its complex conjugate, so that rT = r and iT = −i. Again, N is an extension of degree four of the subfield Q(i), generated by the element r, so N has an automorphism S mapping r into its conjugate ir, with rS = ir and iS = i. It follows that S² is an automorphism with rS² = i²r = −r and iS² = i, while rS³ = −ir. By further combinations of S and T, one finds for N eight automorphisms, with the following effects upon the generators i and r:

            I    S    S²   S³   T    TS   TS²  TS³
r mapped to r    ir   −r   −ir  r    ir   −r   −ir
i mapped to i    i    i    i    −i   −i   −i   −i

One may also compute that TS³ = ST. These automorphisms constitute the whole Galois group, for the table above includes all eight possible combinations of the effects on the generators. The product of any two of them is again one of them, with S⁴ = T² = I, so that these eight automorphisms form a group, isomorphic with the group of the square (§6.4).

Many concepts of group theory can be applied to such a Galois group G. Thus G contains the subgroup H = [I, S, S², S³] generated by S and the smaller subgroup L = [I, S²] generated by S². Each automorphism of the subgroup H leaves i fixed, and hence leaves fixed every element in the subfield Q(i). The smaller subgroup L consists of those automorphisms which leave fixed everything in the larger subfield Q(i, r²). Thus the descending sequence of subgroups G ⊃ H ⊃ L ⊃ I corresponds to an ascending sequence of subfields Q ⊂ Q(i) ⊂ Q(i, r²) ⊂ N. Such an ascending sequence of subfields gives a method of solving the given equation by successively adjoining the roots of simpler equations x² = −1, z² = 3, w² = z. This example illustrates the significance of the subgroups of a Galois group for the solution of an equation by radicals.

Homomorphisms of Galois groups arise naturally. Each automorphism U of the group G above carries i into ±i, hence carries each element of the field Q(i) into some element of the same field. This means that U induces an automorphism U* of Q(i), where U* is defined for an element w in Q(i) by the identity wU* = wU. The correspondence U ↦ U* is a homomorphism mapping the group G of all automorphisms U of N on the group G* of automorphisms of Q(i). Furthermore, U* = I* if and only if U leaves Q(i) elementwise fixed, that is, if and only if U is in the subgroup H = [I, S, S², S³]. But G* has only two elements: the identity I* and the automorphism interchanging i and −i. Hence U ↦ U* is that epimorphism of G whose kernel is H, and the group G* is therefore isomorphic to the quotient-group G/H.

Exercises

1. Show from first principles that the following permutation of the roots of x⁴ − 3 cannot possibly correspond to an automorphism: r ↦ ir, ir ↦ −ir, −r ↦ r, −ir ↦ −r.
2. Prove that x⁴ − 3 is irreducible over Q by showing that none of the linear or quadratic factors of x⁴ − 3 have coefficients in Q.
3. (a) Prove that x⁴ − 3 is irreducible over Q(i). (b) Describe the Galois group of x⁴ − 3 over Q(i).
4. Represent each automorphism of the Galois group of x⁴ − 3 as a permutation of the roots.
5. Do the same for x⁵ − 7 over Q(ζ), ζ a primitive fifth root of unity.
6. Draw a lattice diagram for the system of all subfields of Q(i, r).
7. (a) If K is an extension of Q, prove that every automorphism of K leaves each element of Q fixed. (b) State and prove a similar result for fields of characteristic p.
8. Let F = Q(ω) be the field generated by a complex cube root ω of unity. Discuss the Galois group of x³ − 2 over F, including a determination of the degree of the root field, a representation of each automorphism as a permutation, and a description of the Galois group in purely group-theoretic language.
9. If ζ is a primitive nth root of unity, prove that the Galois group of Q(ζ) is Abelian. (Hint: Any automorphism has the form ζ ↦ ζ^c.)
10. Prove that the Galois group of a finite field is cyclic.

15.5. Separable and Inseparable Polynomials

The general discussion of Galois groups is complicated by the presence of so-called inseparable irreducible polynomials, that is, of elements which are algebraic of degree n but have fewer than n conjugates. This complication occurs for some fields of characteristic p, and can be illustrated by a simple example. Let K = Zp(u) denote a simple transcendental extension of the field Zp of integers mod p, and let F denote the subfield Zp(u^p) of K generated
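The eight automorphisms of Example 2 can also be checked mechanically. The sketch below is not from the text: it encodes the automorphism r ↦ i^a·r, i ↦ (−1)^b·i as the pair (a, b) (my own labeling), and verifies that composing the two generators S and T closes up into a group of order eight with S⁴ = T² = I, in agreement with the table above.

```python
# Hypothetical encoding of an automorphism of N = Q(r, i), r = 3**(1/4):
# r -> i**a * r (a mod 4) and i -> (-1)**b * i (b mod 2), written (a, b).
def compose(u, v):
    # apply u first, then v; v acts on the i**a_u produced by u,
    # and (-1) = i**2 accounts for the 2*a_u*b_v term.
    a_u, b_u = u
    a_v, b_v = v
    return ((a_v + a_u + 2 * a_u * b_v) % 4, (b_u + b_v) % 2)

I, S, T = (0, 0), (1, 0), (0, 1)

# Close up the set generated by S and T under composition.
G = {I}
while True:
    new = G | {compose(x, g) for x in G for g in (S, T)}
    if new == G:
        break
    G = new
assert len(G) == 8                       # the whole Galois group

def power(u, n):
    w = I
    for _ in range(n):
        w = compose(w, u)
    return w

assert power(S, 4) == I and power(T, 2) == I   # S^4 = T^2 = I
assert compose(S, T) != compose(T, S)          # not commutative: group of the square
```

The non-commutativity assertion reflects the relation TS³ = ST computed in the text.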
by u^p = t. Thus F consists of all rational forms in an element t = u^p transcendental over Zp. Over F the original element u satisfies an equation f(x) = x^p − t = 0. But f(x) has over K the factorization

(10)  f(x) = x^p − u^p = (x − u)^p.

This polynomial f(x) is actually irreducible over F = Zp(t), for if f were reducible over Zp(t), it would, by Gauss's Lemma (§3), be reducible over the domain Zp[t] of polynomials in t. But x^p − t is linear in t, so such a factorization f(x) = g(x, t)h(x, t) is impossible. Therefore the root u of f(x) has degree p over F. Hence f(x) has only one root, and u (although of degree p > 1) has no conjugates except itself.

We can describe the situation in the following terms:

Definition. A polynomial f(x) of degree n is separable over a field F if it has n distinct roots in some root field N over F; otherwise, f(x) is inseparable. A finite extension K over F is called separable over F if every element in K satisfies over F a separable polynomial equation.

There is an easy test for the separability or inseparability of a given polynomial f(x) = a₀ + a₁x + ... + a_nx^n. First define the formal derivative f′(x) of f(x) by the formula (cf. §13)

(11)  f′(x) = a₁ + (2 × a₂)x + ... + (n × a_n)x^{n−1},

where n × a_n denotes the nth natural multiple of a_n. If the coefficients are in the field of real numbers, this derivative agrees with the ordinary derivative as found by calculus. From the formal definition (11), without any use of limits, one can deduce many of the laws for differentiation, such as (f + g)′ = f′ + g′ and (fg)′ = fg′ + gf′.

Now factor f(x) into powers of distinct linear factors over any root field N:

(12)  f(x) = c(x − u₁)^{e₁}(x − u₂)^{e₂} ... (x − u_k)^{e_k},  (c ≠ 0).

Differentiating both sides of (12) formally, we see that f′(x) is the sum of k terms, the ith term containing (x − u_i)^{e_i − 1} as a factor. Hence if e₁ > 1, (x − u₁) divides f′(x). Repeating the argument for e₂ and so on, we find that f(x) and f′(x) have a common factor of positive degree unless e₁ = e₂ = ... = e_k = 1; while if every e_i = 1, each factor (x − u_i) divides all terms of f′(x) but one, so that f(x) and f′(x) are relatively prime. Hence the polynomial f(x) is separable when factored over N if and only if f(x) and its formal derivative f′(x) are relatively prime.

But the g.c.d. of f(x) and f′(x) can be computed as in Chap. 3, directly by the Euclidean algorithm in F[x]; hence it is not altered if F is extended to a larger field. This gives the test: compute by the Euclidean algorithm the (monic) greatest common divisor d(x) of f(x) and its formal derivative f′(x). If d(x) = 1, then f(x) is separable; otherwise, f(x) is inseparable.

Now let f(x) be irreducible. For any g(x), the g.c.d. (f(x), g(x)) is 1 unless f(x) divides g(x); and f(x) cannot divide any polynomial of lower degree except 0, while f′(x) has degree less than n. We infer:

Theorem 11. An irreducible polynomial is separable unless its formal derivative is 0.

For f′(x) = a₁ + ... + (n × a_n)x^{n−1} ≠ 0 if n > 0, a_n ≠ 0, and F is of characteristic ∞. Hence:

Corollary 1. Any irreducible polynomial over a field of characteristic ∞ is separable.

Thus any algebraic element over a field of characteristic ∞ satisfies an equation which is irreducible and hence separable, so that any algebraic extension of such a field is separable in the sense of the definition above. Furthermore:

Corollary 2. If F is of characteristic ∞, then the root field of any irreducible polynomial f(x) of degree n contains exactly n distinct conjugate roots of f(x).

The result of Corollary 2 does not hold for fields of prime characteristic. For example, the irreducible polynomial f(x) = x^p − t mentioned at the beginning of the section has the formal derivative (x^p − t)′ = p × x^{p−1} = 0.

Exercises

1. Without using Theorem 11, show that the roots of an irreducible quadratic polynomial over Q are distinct.
2. Let f(x) be a polynomial with rational coefficients, and let d(x) be the g.c.d. of f(x) and f′(x). Prove that f(x)/d(x) is a polynomial which has the same roots as f(x), but which has no multiple roots.
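The Euclidean-algorithm test just described can be carried out mechanically. The sketch below assumes the sympy library as a stand-in for the hand computation of the g.c.d.; it is an illustration, not part of the text.

```python
from sympy import symbols, gcd, diff

x = symbols('x')

def is_separable(f):
    # The test of the text: f is separable iff gcd(f, f') = 1,
    # where f' is the (formal) derivative; sympy's gcd uses the
    # Euclidean algorithm over the rationals.
    return gcd(f, diff(f, x)) == 1

assert is_separable(x**4 - 3)            # four distinct roots r, ir, -r, -ir
assert not is_separable((x**2 - 2)**2)   # repeated factor, hence multiple roots
```

Note that this sketch works over Q, where Corollary 1 already guarantees separability of irreducible polynomials; the interesting inseparable examples, such as x^p − t over Zp(t), live in prime characteristic.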
3. Show that x³ − 2u is inseparable over Z₃(u), and that the Galois group of its root field is the identity.
4. Use Theorem 11 to show that x^q − x is separable over Zp if q = p^n.
5. (a) If f(x) is a polynomial over a field F of characteristic p with f′(x) = 0, show that f(x) can be written in the form a₀ + a₁x^p + ... + a_nx^{np}. *(b) Show that if F is finite and f′(x) = 0, then f(x) = [g(x)]^p for some g(x). (c) Use part (b) to show that every irreducible polynomial over a finite field is separable.

15.6. Properties of the Galois Group

The root fields and Galois groups of separable polynomials have two especially elegant properties, which we now state as theorems.

Theorem 12. The order of the Galois group of a separable polynomial over F is exactly the degree [N : F] of its root field N.

Theorem 13. In the root field N over F of a separable polynomial, the elements left invariant by every automorphism of the Galois group of N over F are exactly the elements of F.

This theorem gives us some positive information about the Galois group G, for it asserts that for each element a in N but not in F there is in G an automorphism T with aT ≠ a. In the second example of §15.4 we have already seen that the assertion of Theorem 12 holds for the root field of x⁴ = 3.

For the proof of Theorem 12, refer back to Lemma 2 of §15.2, which concerned the extensibility of isomorphisms between fields. Note that in this lemma (unlike in §15.5) the accent in F′ and N′ does not signify a derivative.

Lemma. If the polynomial f(x) of Lemma 2 in §15.2 is separable, then the given isomorphism S of F to F′ can be extended to N in exactly m = [N : F] different ways.

This result can be proved by mathematical induction on m. Any extension T of the given isomorphism S of F to F′ will map the root u used in (2) into some one of the roots u′ of p′(x). Since f(x) is separable, its factor p(x) of degree d will have exactly d distinct roots u′.
These d choices of u′ give exactly d choices for S* in (2). By the induction assumption, each such S* can then be extended to N in m/d = [N : F(u)] different ways, so there are all told d(m/d) = m extensions; hence every possible extension of S is yielded by one of our constructions, and the lemma is proved.

If f(x) = f′(x) is separable of degree m and we set N = N′ in Lemma 2 of §15.2, our new lemma asserts that the identity automorphism I of F can be extended in exactly m different ways to an automorphism of N. But these automorphisms constitute the Galois group of N over F, proving Theorem 12.

To prove Theorem 13, let K be the set of all elements of N invariant under every automorphism of G. One shows easily that K is a field, and that K ⊃ F. Hence every automorphism in G is an extension to N of the identity automorphism I of K. Since N is a root field over K, there are by our lemma only [N : K] such extensions, while by Theorem 12 there are [N : F] automorphisms. Hence [N : K] = [N : F]. Since K ⊃ F, this implies that K = F, as asserted.

Still another consequence of the extension lemmas is the fact that a root field is always "normal" in the following sense.

Definition. A finite extension N of a field F is said to be normal over F if every polynomial p(x) irreducible over F which has one root in N has all its roots in N; in other words, if every polynomial p(x) which is irreducible over F and has a root in N can be factored into linear factors over N.

Theorem 14. A finite extension of F is normal over F if and only if it is the root field of some polynomial over F.

Proof. First, the root field N of any f(x) over F is normal. Suppose, on the contrary, that there is some polynomial p(x) irreducible over F which has one but not all of its roots in N. Let w be a root of p(x) in N, and adjoin to N another root w′ which is not in N. The simple extension F(w) is isomorphic to F(w′) by a correspondence T with wT = w′. The field N is a root field for f(x) over F(w), while N′ = N(w′) is generated by roots of f(x) over F(w′), hence is a root field for f(x) over F(w′). By Lemma 2, the correspondence T can be extended to an isomorphism of N to N′. Therefore these isomorphic fields N and N′ must have the same degree over F. But we assumed that N′ = N(w′) is a proper extension of N, so that its degree over F is larger than that of N. This contradiction proves that N is normal.

Conversely, suppose that N is normal over F. Choose any element u of N not in F and find the irreducible equation p(x) = 0 satisfied by u. By the definition of normality, N contains all roots of p(x), hence contains the root field M of p(x). If there are elements of N not in M, one of these elements v satisfies an irreducible equation q(x) = 0, and M is contained in the larger root field of p(x)q(x), and so on. Since the degree of N is finite, one of the successive root fields so obtained must be the whole field N, which is therefore a root field, as asserted.

If the first half of this proof is applied to a separable extension (one in which each element satisfies a separable equation), all the polynomials p(x), q(x) used are separable. This proves the

Corollary. Every finite, normal, and separable extension of F is the root field of a separable polynomial.

In particular, every finite and normal extension N of the field Q of rational numbers is automatically separable (Theorem 11, Corollary 2), hence is the root field of some separable polynomial. The order of the automorphism group of N over Q is therefore exactly the degree [N : Q], by Theorem 12.

The Galois group may be used to treat properties of symmetric polynomials. Let g(x₁, ..., x_n) be any polynomial form over F symmetric in n indeterminates x_i; the symmetry of g(x₁, ..., x_n) means that it is unaltered by any permutation of the indeterminates. Let f(x) be a separable polynomial of degree n over F, and let N = F(u₁, ..., u_n) be the field generated by all n roots u₁, ..., u_n of f(x) in a root field. Any automorphism T of the Galois group G of N over F effects a permutation u_i ↦ u_iT of the roots of f(x), by Theorem 10. Since T leaves the elements of the base field F fixed, the element w = g(u₁, ..., u_n) of N satisfies wT = g(u₁T, ..., u_nT) = w. Since w is altered by no automorphism T, it lies, by Theorem 13, in the base field F.
These remarks prove:

Theorem 15. If a separable polynomial of degree n over F has the n roots u₁, ..., u_n in a root field N = F(u₁, ..., u_n), then the value g(u₁, ..., u_n) at these roots of any symmetric polynomial g over F lies in the base field F.

Corollary. Any polynomial (over F) symmetric in n indeterminates x₁, ..., x_n can be expressed as a rational† function (over F) in the n elementary symmetric functions.

† Cf. Theorem 19 of §6.10, which states a stronger result.

Proof. To simplify the formulas, we write out the proof for the case n = 3 only. The elementary symmetric functions are then

(13)  σ₁ = x₁ + x₂ + x₃,  σ₂ = x₁x₂ + x₁x₃ + x₂x₃,  σ₃ = x₁x₂x₃.

Over F they generate a field K = F(σ₁, σ₂, σ₃). The field N = F(x₁, x₂, x₃) generated by the three original indeterminates is a finite extension of K: the generators x_j of N are the roots of a cubic polynomial over K, namely t³ − σ₁t² + σ₂t − σ₃, with coefficients which prove to be exactly the given symmetric functions (13). Hence N is a root field over K. Introduce the Galois group G of the root field N over K. By Theorem 10, every automorphism in G induces a permutation of the x_i; hence, by Theorem 15, any symmetric polynomial of the x_i lies in the base field K. Since K = F(σ₁, σ₂, σ₃), it follows that such a symmetric polynomial is a rational function of σ₁, σ₂, σ₃. This proves the corollary.

Exercises

1. Express x₁² + x₂² + x₃² in terms of the elementary symmetric functions.
2. If a polynomial of degree n has n roots x₁, ..., x_n, its discriminant is D = ∏(x_i − x_j)², where the product is taken over all pairs of subscripts with i < j. (a) Show that the discriminant of a polynomial with rational coefficients is a rational number. (b) For a quadratic polynomial, express D explicitly as a rational function of the coefficients. *(c) The same problem for a cubic polynomial.
3. Show that if K is normal over F, and F ⊂ L ⊂ K, then K is normal over L.
4. (a) Show that there exist a field K and a subfield F such that the Galois group of K over F is the symmetric group of degree n. (Hint: Use n algebraically independent real numbers.) (b) Show that in (a) K may be chosen as a subfield of the field of real numbers.
5. In the proof of the corollary of Theorem 15, show that the Galois group of N = K(x₁, x₂, x₃) over K is exactly the symmetric group on three letters. (Cf. also §6.10, Exs. 7 and 8.)
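The corollary can be spot-checked by expanding both sides of such an identity. The brief sketch below assumes the sympy library and verifies the expression of the first two power sums in terms of the elementary symmetric functions (13); it is an illustration, not part of the text.

```python
from sympy import symbols, expand

x1, x2, x3 = symbols('x1 x2 x3')

# The elementary symmetric functions (13) in three indeterminates:
s1 = x1 + x2 + x3
s2 = x1*x2 + x1*x3 + x2*x3
s3 = x1*x2*x3

# x1^2 + x2^2 + x3^2 = s1^2 - 2*s2  (Exercise 1 above):
assert expand(s1**2 - 2*s2) == expand(x1**2 + x2**2 + x3**2)

# A further symmetric polynomial, the sum of cubes:
# x1^3 + x2^3 + x3^3 = s1^3 - 3*s1*s2 + 3*s3.
assert expand(s1**3 - 3*s1*s2 + 3*s3) == expand(x1**3 + x2**3 + x3**3)
```

In both cases the right-hand side is even a polynomial in σ₁, σ₂, σ₃, in line with the stronger result cited from §6.10.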
15.7. Subgroups and Subfields

If H is any set of automorphisms of a field N, the elements a of N left invariant by all the automorphisms of H (such that aT = a for each T in H) form a subfield of N. In particular, this is true if N is the root field of any polynomial over any base field F, and H is any subgroup of the Galois group of N over F.

Theorem 16. If H is any finite group of automorphisms of a field N, while K is the subfield of all elements of N invariant under H, then the degree [N : K] of N over K is at most the order of H.†

† This proof, which involves the idea of looking on a Galois group simply as a finite group of automorphisms, with no explicit reference to a base field, is due to Professor Artin.

Proof. If H has order n, it will suffice to show that any n + 1 elements c₁, ..., c_{n+1} of N are linearly dependent over K. From the n elements T of H we construct a system of n homogeneous linear equations in n + 1 unknowns y_i:

(14)  (c₁T)y₁ + (c₂T)y₂ + ... + (c_{n+1}T)y_{n+1} = 0,  T ∈ H.

By Theorem 10 of §2, such a system always has in N a solution different from y₁ = y₂ = ... = y_{n+1} = 0. Now pick the smallest integer m such that the n equations (14) have a solution with exactly m of the y_i different from zero. Without loss of generality, we can assume that these are y₁, ..., y_m and that y₁ = 1. This solution consists of elements of N and is unique to within a constant factor of proportionality, for if there were two nonproportional solutions, a suitable linear combination would give a solution of the system with fewer than m nonzero unknowns. Now apply any automorphism S in H to the left side of (14). Since TS runs over all the elements of H as T does, the result is a system identical with (14) except for the arrangement of the equations. Therefore y₁S, ..., y_mS is also a solution of (14), so by the uniqueness y_iS = ty_i, where t is a factor of proportionality. Since y₁ = 1 and S is an automorphism, y₁S = 1 also, and so t = 1. We conclude that y_iS = y_i for every i = 1, ..., m and every S in H, which means that the coefficients y_i lie in the subfield K of invariant elements. Equation (14) with T = I now asserts that the elements c₁, ..., c_m are linearly dependent over the field K. This proves the theorem.
On the basis of this result, we can establish, at least for separable polynomials, a correspondence between the subgroups of a Galois group and the subfields of the corresponding root field. This correspondence provides a systematic way of reducing questions about fields related to a given equation to parallel questions about subgroups of (finite) Galois groups.

Theorem 17 (Fundamental Theorem of Galois Theory). If G is the Galois group of the root field N of a separable polynomial f(x) over F, then there is a bijection H ↔ K between the subgroups H of G and those subfields K of N which contain F. For each K, the corresponding subgroup H = H(K) consists of all automorphisms in G which leave each element of K fixed; for each H, the corresponding subfield K = K(H) consists of all elements in N left invariant by every automorphism of the subgroup H. For a given K, the subgroup H(K) is the Galois group of N over K, and its order is the degree [N : K].

Proof. If K is given, H(K) is described thus:

(15)  T is in H(K) if and only if bT = b for all b in K.

If S and T have this property, so does the product ST, so the set H(K) is a subgroup. The field N is a root field for f(x) over K, and every automorphism of N over K is certainly an automorphism of N over F leaving every element of K fixed, hence is in the subgroup H(K). Therefore H(K) is by definition the Galois group of N over K. If Theorem 12 is applied to this Galois group, it shows that the order of H(K) is exactly the degree of N over K.

Two different intermediate fields K₁ and K₂ determine distinct subgroups H(K₁) and H(K₂). To prove this, choose any a in K₁ but not in K₂, and apply Theorem 13 to the group H(K₂) of N over K₂. It asserts that H(K₂) contains some T with aT ≠ a. Since a is in K₁, this automorphism T does not lie in the group H(K₁), so H(K₁) ≠ H(K₂).

We know now that K ↦ H(K) is a bijection between all of the subfields of N and some of the subgroups of G. In order to establish a bijection between all subfields and all subgroups, we must show that every subgroup appears as an H(K). Let H be a subgroup of order h and K = K(H) be defined as in the statement of Theorem 17:

(16)  b is in K(H) if and only if bS = b for all S in H.

According to Theorem 16, [N : K] ≤ h. By comparing (15) with (16), one sees that the subgroup H(K) corresponding to K = K(H) certainly includes the group H originally given, while by Theorem 12 the order of H(K) is [N : K] ≤ h. This means that the order of the group H(K) does not exceed the order h of its subgroup H. Therefore H(K) = H. This completes the proof. In particular, the subgroup consisting of the identity alone corresponds to the whole field N.

The set of all fields K between N and F is a lattice relative to the ordinary relation of inclusion between subfields. The g.l.b. or meet of two subfields K₁ and K₂ in this lattice is the intersection K₁ ∩ K₂, which consists of all elements common to K₁ and K₂; their l.u.b. or join is K₁ ∨ K₂, the subfield of N generated by the elements of K₁ and K₂ jointly. For instance, if K₁ = F(v₁) and K₂ = F(v₂) are simple extensions, their join is the multiple extension F(v₁, v₂).

Theorem 18. The lattice of all subfields K₁, K₂, ... between N and F is mapped by the correspondence K ↦ H(K) of Theorem 17 onto the lattice of all subgroups of G, in such a way that

(17)  K₁ ⊂ K₂ implies H(K₁) ⊃ H(K₂),

(18)  H(K₁ ∨ K₂) = H(K₁) ∩ H(K₂),

(19)  H(K₁ ∩ K₂) = H(K₁) ∨ H(K₂).

These results state that the correspondence inverts the inclusion relation and carries any meet into the (dual) join and conversely. Any bijection between two lattices which has these properties is called a dual isomorphism.

To prove the theorem, observe first that the definition (15) of the group belonging to a field K shows that for a larger subfield the corresponding group must leave more elements invariant, hence will be smaller. This gives (17). The meet and the join are defined purely in terms of the inclusion relation (see §11.7); hence, by the Duality Principle, a bijection which inverts inclusion must interchange these two, as is asserted in (18) and (19).

We omit the proof of the following further result.

Theorem 19. A field K, with N ⊃ K ⊃ F, is a normal field over F if and only if the corresponding group H(K) is a normal subgroup of the Galois group G of N. If K is normal, the Galois group of K over F is isomorphic to the quotient-group G/H(K).
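As an illustration of Theorems 17 and 18, one can count the subgroups of the Galois group of order eight from Example 2 of §15.4; each subgroup corresponds to exactly one field between Q and the root field Q(r, i). The sketch below reuses the hypothetical (a, b) encoding of the automorphisms r ↦ i^a·r, i ↦ (−1)^b·i (my own labeling, not the book's notation).

```python
from itertools import combinations

def compose(u, v):     # apply u first, then v (same encoding as before)
    a_u, b_u = u
    a_v, b_v = v
    return ((a_v + a_u + 2 * a_u * b_v) % 4, (b_u + b_v) % 2)

# The whole Galois group of x^4 - 3 over Q, of order 8:
G = [(a, b) for a in range(4) for b in range(2)]

def is_subgroup(H):
    # For a finite subset of a group, closure under products
    # (plus the identity) already makes it a subgroup.
    return (0, 0) in H and all(compose(x, y) in H for x in H for y in H)

subgroups = []
for size in (1, 2, 4, 8):              # possible orders divide 8 (Lagrange)
    for H in combinations(G, size):
        if is_subgroup(set(H)):
            subgroups.append(set(H))

# Ten subgroups, hence ten intermediate fields Q ⊆ K ⊆ Q(r, i).
assert len(subgroups) == 10
```

The count 10 breaks down as the trivial subgroup, five subgroups of order two, three of order four (among them H = [I, S, S², S³] and L together with reflections), and G itself; the descending chain G ⊃ H ⊃ L ⊃ I of §15.4 is one maximal chain in this lattice.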
The conclusions of this theorem have already been illustrated in a special case by the example at the end of §15.4.

Exercises

1. (a) Prove that if H is any set of automorphisms of a field N, the elements of N left invariant by all the automorphisms in H form a subfield K of N. (b) Show that N is normal over this subfield K.
2. Exhibit completely the subgroup-subfield correspondence for the field Q(√2, i) over Q.
3. Do the same for the root field of x⁴ − 3.
4. If K is a finite extension of a field F of characteristic ∞, prove that the number of fields between K and F is finite.
5. If N is the root field of a separable polynomial f(x) over F, prove that the number of fields between N and F is finite.
6. Prove that the fields K between N and F form a lattice.
7. Prove that the index of H(K) in G is the degree of K over F.
*8. Two subfields K₁ and K₂ are called conjugate, in the situation of Theorem 17, if there exists an automorphism T of N over F carrying K₁ into K₂. Prove that this is the case if and only if T⁻¹H(K₁)T = H(K₂) (i.e., if and only if H(K₁) and H(K₂) are "conjugate" subgroups of G).
*9. Prove Theorem 19.

15.8. Irreducible Cubic Equations

Galois theory can be applied to show the impossibility of resolving various classical problems about the solution of equations by radicals. As a simple example of this technique, we shall consider the famous "irreducible case" of cubic equations with real roots.

A cubic equation may be taken in the form (see §5)

(20)  f(y) = y³ + py + q = (y − y₁)(y − y₂)(y − y₃),

with real coefficients p and q and with three real or complex roots y₁, y₂, and y₃. The coefficients p and q may be expressed as symmetric functions of the roots, for on multiplying out (20), one finds

(21)  y₁ + y₂ + y₃ = 0,  y₁y₂ + y₁y₃ + y₂y₃ = p,  y₁y₂y₃ = −q.

It is important to introduce the discriminant D of the cubic, defined by the formula

(22)  D = [(y₁ − y₂)(y₁ − y₃)(y₂ − y₃)]².
The permutation of any two roots does not alter D, so that D is a polynomial symmetric in y₁, y₂, y₃. By Theorem 15 it follows that D is expressible as a quantity in the field F = Q(p, q) generated by the coefficients. This expression is

(23)  D = −4p³ − 27q²;

this equation is a polynomial identity in y₁, y₂, and y₃ and may be checked by straightforward use of the equations (21) and (22).

Theorem 20. A real cubic equation with a positive discriminant has three distinct real roots; if D = 0, at least two roots are equal; while if D < 0, two roots are imaginary, while the third root is real.

This may be verified simply by observing how the various types of roots affect the formula (22) for D. If all roots are real and distinct, D is clearly positive, while D = 0 if two roots are equal. Suppose, finally, that one root y₁ = a + bi is an imaginary number (b ≠ 0). The complex conjugate y₂ = a − bi must then also be a root (§5), while the third root y₃ is real. In (22), y₁ − y₂ = (a + bi) − (a − bi) = 2bi is a pure imaginary, while (y₁ − y₃)(y₂ − y₃) is a product of conjugate imaginaries, hence a positive real number. The discriminant D is therefore negative. This gives exactly the alternatives listed in the theorem.

Theorem 21. If the cubic polynomial (20) is irreducible over F = Q(p, q), then its root field F(y₁, y₂, y₃) is F(√D, y₁).

Proof. By the definition (22) of D, √D = (y₁ − y₂)(y₁ − y₃)(y₂ − y₃), so the root field certainly contains √D; hence it remains only to prove that the roots y₂ and y₃ are contained in K = F(√D, y₁). In this field K the cubic has a linear factor (y − y₁), so the remaining quadratic factor

(24)  (y − y₂)(y − y₃) = y² + y₁y + (y₁² + p)

also has its coefficients in K. By substitution of y = y₁ in (24), (y₁ − y₂)(y₁ − y₃) = 3y₁² + p is in K, so that y₂ − y₃ = √D/[(y₁ − y₂)(y₁ − y₃)] is in K. But the coefficient y₂ + y₃ = −y₁ of (24) is also in K. If both y₂ + y₃ and y₂ − y₃ are in K, so are y₂ and y₃. This proves the theorem.

For the proof of Theorem 22 below, we discuss more thoroughly the properties of a radical ᵐ√a = a^(1/m). If m is composite, with m = rs, then a^(1/m) = (a^(1/r))^(1/s), so that any radical may be obtained by a succession of radicals with prime exponents. In the latter case we can determine the degree of the field obtained by adjoining a radical. Adjoin to K a primitive rth root of unity ζ and then a root u of x^r − a. The resulting extension K(ζ, u) contains the r roots u, ζu, ζ²u, ..., ζ^{r−1}u of the polynomial x^r − a, which has the factorization

(x^r − a) = (x − u)(x − ζu)(x − ζ²u) ... (x − ζ^{r−1}u),

hence K(ζ, u) is the root field of x^r − a over K(ζ).

Lemma. A polynomial x^r − a of prime degree r over a real field† K is either irreducible over K or has a root in K.

† A real field is any field all of whose elements are real numbers.

This lemma is true for any field, though the proof must be slightly modified if K has characteristic r.
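The identity (23) and the sign criterion of Theorem 20 can be checked with a computer algebra system. The sketch below assumes the sympy library; the sample cubic y³ − 3y + 1 is my example, not the text's.

```python
from sympy import symbols, discriminant, real_roots, expand

y, p, q = symbols('y p q')

# Formula (23): the discriminant of y^3 + p*y + q is -4p^3 - 27q^2.
D = discriminant(y**3 + p*y + q, y)
assert expand(D - (-4*p**3 - 27*q**2)) == 0

# Theorem 20: a positive discriminant means three distinct real roots.
# For y^3 - 3y + 1, D = -4(-3)^3 - 27(1)^2 = 108 - 27 = 81 > 0.
assert len(real_roots(y**3 - 3*y + 1)) == 3
```

The cubic y³ − 3y + 1 is also irreducible over Q (it has no rational roots ±1), so it is exactly the "irreducible case" with three real roots discussed next.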
If a cubic polynomial has real roots and is i"educible over the field F = Q(P. hence the square root in these formulas is an imaginary number. so that the constant term b in g(x) is a product of m t A real field is any field with elements which are real numbers. . The formula thus gives the real roots Y in terms of complex numbers! For many years this was regarded as a serious blemish in this set of formulas. Before proving this. If both Y2 + Y3 and Y2 . and mathematicians endeavoured to find for the real roots of the cubic other formulas which would involve only real radicals (square roots.Iu ). Wit h m = rs. where Z3 = -q/2 + . Lemma. This search was in vain.(u)(x - (2 U )' •. Consider now a cubic which is irreducible over its coefficient field but which has three real roots.p/3z. Suppose that xr .J-D/108. by reason of the following theorem. Formula (19) of §5 .Y3 are in K.a has over F a proper factor g(x) of positive degree m < r.Ch. cube roots. ..u)(x . (x _ (r. Proof. This factor g(x) is then a product of m of the linear factors of xr . Theorem 22. The resulting extension K«(. In the latter case we can determine the degree of the field obtained by adjoining a radical. . (We have used the expression (23) for D.. which has the factorization (x r .) Since the roots are real. We can now prove Theorem 22.Ji5 is adjoined first.Ji5).n . .ai is irreducible over Ki and hence that the degree of K i + 1 is [Ki+ l : K i ] = rio By assumption.1. To do this. The field K is obtained by a finite number of radicals. Since D is positive. contrary to the fact that K j contains none of the Yi' The extension (27) then has degree r and contains an element YI of degree three over K j • By Theorem 9.5.a over K thus yields a root bSa ' of xr . Corollary 2.Ji5 adjoined to this field gives another real field K = L(. If . suppose the conclusion false. by the lemma r this means that x . By dropping out extra fields. By Theorem 21 the roots of the cubic will all lie in this field. 
so that bSr = a sm = a I-Ir = a/a'r. (13». . and a = (bSa ' )'. 31 r.Ji5). Then some root of the cubic can be expressed by real radicals.. so the prime r must be 3.§15.Yi) over K j . (26) 'i with each ai in Ki and each a prime. for m < r is relatively prime fa r and there exist integers sand t with sm + tr = 1 (§1. . the real radical . so they all can be expressed by formulas involving real radicals. this amounts to saying that K is the last of a finite chain of fields (25) where i = 1. one may assume that the real root a/ l r. we are .8 477 Irreducible Cubic Equations roots (iU . they do not lie in F or in F(.a in K. which is to say that a root YI lies in some field L = F(ra.' . say the root YI' Over the previous field K j the given cubic must be irreducible.. ~ generated over F by real radicals. The assumed reducibility of xr .D. for some integer k. is not in the field K i .7. since the cubic is irreducible over F. Therefore b = (kU m . for otherwise it would have a linear factor (Y . Q.fb. the roots of the cubic lie in K. §14.E. In the chain (25) there is then a first field K j + 1 which contains a root of the cubic. and From this we can find in K an rth root of a.. ' . by the lemma of §15. 6. This violates the assumption that ~+I c K is a real field . and hence is normal over"F. The proof is complete. *3.8.a is irreducible over F. so there is an automorphism S of K carrying the root a I/r into the root (a I / r. Prove: If F is a field of characteristic 00 containing all nth roots of unity. the polynomial xr . hence these powers include all the automorphisms of K over F.Ch. by YI> contains and hence by Theorem 21 contains all roots of the cubic. q). where r is a prime. S. then the degree [F(a 1/") : F] is a divisor of n. The powers I. (r-I a I/r. Suppose K = F(a I/r) is generated by F and a single rth root a I/r of an element a E F. where ( is a primitive rth root of unity. 
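The sign behavior of D and the appearance of imaginary numbers in Cardano's formula are easy to check numerically. The sketch below is a modern illustration, not part of the original text; the helper `discriminant` and the test cubics are our own choices. It uses the reduced cubic y³ + py + q with D = −4p³ − 27q² and exhibits the phenomenon discussed above: when all three roots are real, the radicand −D/108 is negative, yet the formula still reproduces a real root.

```python
import cmath

def discriminant(p, q):
    # D = -4p^3 - 27q^2 for the reduced cubic y^3 + p*y + q
    return -4 * p**3 - 27 * q**2

# y^3 - 7y + 6 = (y - 1)(y - 2)(y + 3) has three real roots, so D > 0
p, q = -7, 6
D = discriminant(p, q)
assert D == 400

# identity (23): D is the square of (y1 - y2)(y1 - y3)(y2 - y3)
y1, y2, y3 = 1, 2, -3
assert ((y1 - y2) * (y1 - y3) * (y2 - y3)) ** 2 == D

# Cardano: y = z - p/(3z) with z^3 = -q/2 + sqrt(q^2/4 + p^3/27); the
# radicand equals -D/108, hence is negative exactly when D > 0, so the
# square root is imaginary although every root of this cubic is real.
radicand = q**2 / 4 + p**3 / 27
assert abs(radicand - (-D / 108)) < 1e-12
z = (-q / 2 + cmath.sqrt(radicand)) ** (1 / 3)     # a complex cube root
y = z - p / (3 * z)
assert abs(y.imag) < 1e-9 and abs(y.real - 2) < 1e-6   # recovers the real root 2

# y^3 - 2 has D = -108 < 0: one real root and two imaginary ones
assert discriminant(0, -2) == -108
```

The complex arithmetic here is unavoidable in principle, which is exactly the content of Theorem 22.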
§15.9 Insolvability of Quintic Equations

Throughout the present section, F will denote a subfield of the field of complex numbers which contains all roots of unity, and K will denote a variable finite extension of F.

Suppose first that K = F(a^{1/r}) is generated by F and a single rth root a^{1/r} of an element a ∈ F, where r is a prime. The other roots of x^r = a are ζa^{1/r}, ζ²a^{1/r}, …, ζ^{r−1}a^{1/r}, where ζ is a primitive rth root of unity; since F contains all roots of unity, these roots all lie in K. Therefore K is the root field of x^r = a over F, and hence is normal over F. Unless K = F, the polynomial x^r − a is irreducible over F, by the lemma of §15.8; so there is an automorphism S of K carrying the root a^{1/r} into the root ζa^{1/r}. The powers I, S, S², …, S^{r−1} of this automorphism carry a^{1/r} respectively into each of the roots a^{1/r}, ζa^{1/r}, …, ζ^{r−1}a^{1/r} of the equation x^r = a; hence these powers include all the automorphisms of K over F. We conclude that the Galois group of K over F is cyclic.

More generally, suppose K is normal over F and can be obtained from F by a sequence of simple extensions, each involving only the adjunction of an n_i-th root to the preceding extension of F. By definition, this amounts to saying that K is the last of a finite chain of fields

    F = K_0 ⊂ K_1 ⊂ K_2 ⊂ ⋯ ⊂ K_s = K,    (28)

such that K_i = K_{i−1}(x_i), where x_i^{n_i} ∈ K_{i−1}. Without loss of generality, we can assume each n_i is prime, so that each step of (28) is an extension of the type just considered.

Since K_1 is normal over F, every automorphism of K over F induces an automorphism of K_1 over F; moreover, every automorphism of K_1 over F can be extended to one of K over F. The multiplication of automorphisms is the same, so this correspondence is an epimorphism from the Galois group of K over F to that of K_1 over F. Under this epimorphism, the elements inducing the identity automorphism on K_1 over F are by definition just the automorphisms of K over K_1. This shows that the Galois group G(K/F) of K over F is mapped epimorphically onto G(K_1/F), with kernel G(K/K_1); the latter group G(K_1/F) is therefore isomorphic to the quotient-group G(K/F)/G(K/K_1). Combining this with the result of the last paragraph, we infer that G(K/K_1) is a normal subgroup of G(K/F) with cyclic quotient-group G(K_1/F).

Now use induction on s. K is an extension of K_1 by radicals. But K_1 is normal over F by the preceding paragraph; and K, since it is normal over F, is the root field of a polynomial f(x) over F, and so the root field of the same f(x) over K_1, and so (Theorem 14) it is also normal over K_1. Hence the preceding argument can be reapplied to G(K/K_1), to prove that G(K/K_2) is a normal subgroup of G(K/K_1) with cyclic quotient-group G(K_2/K_1). Repeating this argument s times and denoting the subgroup G(K/K_i) by S_i, we get the following basic result.

Theorem 23. Let K be any normal extension of F by radicals. Then the Galois group G of K over F contains a sequence of subgroups

    S_0 = G ⊃ S_1 ⊃ S_2 ⊃ ⋯ ⊃ S_s,

each normal in the preceding with cyclic quotient-group S_{i−1}/S_i, and with S_s consisting of I alone.

This states that the Galois group of K over F is solvable in the sense of the following definition.

Definition. A finite group G is solvable if and only if it contains a chain of subgroups S_0 = G ⊃ S_1 ⊃ S_2 ⊃ ⋯ ⊃ S_s = I such that for all k, (i) S_k is normal in S_{k−1}, and (ii) S_{k−1}/S_k is cyclic.
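For a finite group, the chain condition in this definition is equivalent to requiring that the derived series G ⊇ G′ ⊇ G″ ⊇ ⋯ (each term generated by the commutators of the one before) terminate in the identity, since a chain with abelian quotients can always be refined to one with cyclic quotients. The brute-force check below is our own sketch under that equivalent criterion, with permutations written as tuples and all helper names our own:

```python
from itertools import permutations

def compose(p, q):
    # apply p first, then q; a permutation is a tuple with p[i] the image of i
    return tuple(q[p[i]] for i in range(len(p)))

def inverse(p):
    inv = [0] * len(p)
    for i, x in enumerate(p):
        inv[x] = i
    return tuple(inv)

def closure(gens, n):
    # the subgroup generated by gens, as a set of permutation tuples
    group, frontier = {tuple(range(n))} | set(gens), list(gens)
    while frontier:
        g = frontier.pop()
        for h in list(group):
            for prod in (compose(g, h), compose(h, g)):
                if prod not in group:
                    group.add(prod)
                    frontier.append(prod)
    return group

def derived_subgroup(group, n):
    # subgroup generated by all commutators g^-1 h^-1 g h
    comms = {compose(compose(inverse(g), inverse(h)), compose(g, h))
             for g in group for h in group}
    return closure(comms, n)

def is_solvable(group, n):
    # the derived series G > G' > G'' > ... must reach the identity subgroup
    while len(group) > 1:
        derived = derived_subgroup(group, n)
        if len(derived) == len(group):   # the series has stalled above I
            return False
        group = derived
    return True

S4, S5 = set(permutations(range(4))), set(permutations(range(5)))
assert is_solvable(S4, 4)                  # S4 > A4 > V4 > I
assert not is_solvable(S5, 5)              # the series stalls at A5
assert len(derived_subgroup(S5, 5)) == 60  # (S5)' is the alternating group A5
```

The assertions encode the classical facts that S4 has the chain S4 ⊃ A4 ⊃ V4 ⊃ I, while the derived series of S5 stops at the alternating group A5; this is the contrast exploited in the theorems that follow.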
A great deal is known about abstract solvable groups. For example, any group whose order is divisible by fewer than three distinct primes is solvable (Burnside); it is even known (Feit-Thompson) that every group of odd order is solvable. We shall, however, content ourselves with the following meager fact.

Lemma 1. Any epimorphic image G' of a finite solvable group G is itself solvable.

Proof. Let G have the chain of subgroups S_k as described in the definition of solvability, and let S_0' = G', S_1', S_2', …, S_s' = I' be their homomorphic images. With any x' and y' in S_k', also x'y' = (xy)' and x'⁻¹ = (x⁻¹)' are in S_k' (x, y being arbitrary antecedents of x' and y' in S_k); hence each S_k' is a subgroup of G'. Furthermore, if a is in S_{k−1} and x is in S_k, the normality of S_k in S_{k−1} means that a⁻¹xa is in S_k, and hence that a'⁻¹x'a' = (a⁻¹xa)' is in S_k'. Since a' may be any element of S_{k−1}', this proves S_k' normal in S_{k−1}'. Finally, since S_{k−1} consists of the powers (S_k·a)^n = S_k·a^n of some single coset of S_k (S_{k−1}/S_k being cyclic), S_{k−1}' consists of the powers (S_k'·a')^n = S_k'·a'^n of the image of this coset, and so S_{k−1}'/S_k' is also cyclic. The chain of these subgroups S_0' ⊃ S_1' ⊃ S_2' ⊃ ⋯ ⊃ S_s' thus has the properties which make G' solvable. Q.E.D.

Now let us define an equation f(x) = 0 with coefficients in F to be solvable by radicals over F if its roots lie in an extension K of F obtainable by successive adjunctions of nth roots. Such a K we shall call an extension of F by radicals. It should be observed that K is not required to be normal, but only to contain the root field N of f(x) over F.

Theorem 24. If an equation f(x) = 0 with coefficients in F is solvable by radicals, then its Galois group over F is solvable.

Proof. Since any conjugate of an element expressible by radicals is itself expressible by conjugate radicals, the root field N of f(x) must also be contained in a finite extension K* ⊃ K which is normal over F and an extension of F by radicals. This K* contains N as a normal subfield over F. Hence each automorphism S of K* over F induces an automorphism S₁ of N over F, and the Galois group of K* over F is epimorphic to that of N over F. But the former is solvable (by Theorem 23); hence, by Lemma 1, so is the latter. This proves Theorem 24.

This is the case for all quadratic, cubic, and quartic equations. In order to prove that equations of the fifth degree are not always solvable by radicals, we need therefore find only one whose Galois group is not solvable. We shall do this: first we shall prove that the symmetric group of degree five is not solvable, and then we shall exhibit a quintic equation whose Galois group is the symmetric group of degree five.

Theorem 25. The symmetric group G on n letters is not solvable unless n ≤ 4.

Proof. Let G = S_0 ⊃ S_1 ⊃ S_2 ⊃ ⋯ ⊃ S_s be any chain of subgroups, each normal in the preceding with cyclic quotient-group S_{k−1}/S_k. We shall prove by induction on s that S_s must contain every 3-cycle (ijk). This will imply that S_s > I, and so that G cannot be solvable.

Since S_0 = G contains every 3-cycle, it is sufficient by induction to show that if S_{s−1} contains every 3-cycle, then so does S_s. To see this, note that if the permutations φ and ψ are both in S_{s−1}, then their so-called "commutator" γ = φ⁻¹ψ⁻¹φψ is in S_s. For consider the images φ', ψ', and γ' in S_{s−1}/S_s: this quotient-group, being cyclic, is commutative; hence γ' = φ'⁻¹ψ'⁻¹φ'ψ' = I', which implies γ ∈ S_s. But in the special case when φ = (ilj) and ψ = (jkm), where i, j, k are given and l, m are any two other letters (such letters exist unless n ≤ 4), we have

    γ = (jli)(mkj)(ilj)(jkm) = (ijk) ∈ S_s

for all i, j, k, as desired. This proves that S_s contains every 3-cycle, and so completes the induction.

Incidentally, it is possible to prove a more explicit form of this theorem. It is known that the alternating group A_n is a normal subgroup of the symmetric group G, so there is a chain beginning G > A_n. One may then prove that the alternating group A_n (for n > 4) has no normal subgroups whatever except itself and the identity.

Lemma 2. There is a (real) quintic equation whose Galois group is the symmetric group on five letters.

Proof. Let A be the field of all algebraic numbers; it will be countable and contain all roots of unity. Hence we can choose in succession five algebraically independent real numbers x_1, …, x_5 over A, and form the transcendental extension A(x_1, …, x_5). Now let σ_1, …, σ_5 be the elementary symmetric polynomials in the x_i, and let F = A(σ_1, …, σ_5). As in Theorem 15, the Galois group of the polynomial

    f(x) = (x − x_1)(x − x_2) ⋯ (x − x_5)    (29)

over F is the symmetric group on the five letters x_i. Q.E.D.
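The commutator identity for 3-cycles used in the proof of Theorem 25 can be verified mechanically. In the sketch below (all helper names are our own), the five letters i, j, k, l, m are the points 0 through 4, and a product of permutations is applied left factor first, matching the order in which γ is written:

```python
def compose(p, q):
    # apply p first, then q; a permutation is a tuple with p[i] the image of i
    return tuple(q[p[i]] for i in range(len(p)))

def inverse(p):
    inv = [0] * len(p)
    for i, x in enumerate(p):
        inv[x] = i
    return tuple(inv)

def cycle(n, *pts):
    # the cycle (pts[0] pts[1] ...) as a permutation of {0, ..., n-1}
    p = list(range(n))
    for a, b in zip(pts, pts[1:] + (pts[0],)):
        p[a] = b
    return tuple(p)

n = 5
i, j, k, l, m = range(n)                    # five distinct letters: needs n >= 5
phi, psi = cycle(n, i, l, j), cycle(n, j, k, m)
gamma = compose(compose(compose(inverse(phi), inverse(psi)), phi), psi)
assert gamma == cycle(n, i, j, k)           # the commutator is the 3-cycle (i j k)
```

The construction visibly requires the two auxiliary letters l and m, which is why the argument fails for n ≤ 4.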
It follows from Lemma 2 and Theorem 25 that there exists a (real) quintic equation over a field containing all roots of unity, whose Galois group is not solvable. Now applying Theorem 24, we get our final result.

Theorem 26. There exists a (real) quintic equation which is not solvable by radicals.

Exercises

1. If Q is the rational field and f the special polynomial of (29), show that the Galois group of f over the field Q(u_1, …, u_5) is still the symmetric group on five letters.
2. Show explicitly that if K is an extension of F by radicals, then there exists an extension K* of K which is normal over F and which is also an extension of F by radicals. (This fact was used above in the proof of Theorem 24.)
3. Prove that the symmetric group on three letters is solvable.
4. (a) Prove that in the symmetric group on four letters the commutators of 3-cycles form a normal subgroup of order 4. (b) Using this and the alternating subgroup, prove that the symmetric group on four letters is solvable.
5. Prove that any finite commutative group is solvable. (Hint: Show that it contains a (normal) subgroup of prime index.)
*6. Prove that any finite abstract group G is the Galois group of a suitable equation. (Hint: By Cayley's theorem, G is isomorphic with a subgroup of a symmetric group.)
7. Prove that if a finite group G contains a normal subgroup N such that N and G/N are both solvable, then G is solvable.
8. (a) Show that the Galois group of x^n = a is solvable even over a field not containing roots of unity. (b) Show that Theorem 24 holds for any F, whether it contains roots of unity or not.
9. If F contains the nth roots of unity, and if K = F(a^{1/n}), where a is in F, show that the Galois group of K over F is cyclic, even when n is not prime.
10. Show that if n > 4, there exists a real equation of degree n which is not solvable by radicals.
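For an irreducible cubic over F = Q(p, q), the Galois group is the alternating group on three letters when D is a square in F and the symmetric group otherwise (Exercise 5 of §15.8). Over Q this is easy to test, since an integer D is a rational square exactly when it is a perfect square in Z. A sketch, with helper names of our own choosing:

```python
from math import isqrt

def discriminant(p, q):
    # discriminant D = -4p^3 - 27q^2 of the reduced cubic x^3 + p*x + q
    return -4 * p**3 - 27 * q**2

def is_rational_square(D):
    # an integer is a square in Q exactly when it is a perfect square in Z
    return D >= 0 and isqrt(D) ** 2 == D

# x^3 - 3x - 1: irreducible over Q (no rational root) with D = 81 = 9^2,
# so by the criterion its Galois group is the alternating group A3
assert is_rational_square(discriminant(-3, -1))

# x^3 - 2: irreducible over Q with D = -108, not a square, so the group is S3
assert not is_rational_square(discriminant(0, -2))
```

In the first example the group is cyclic of order 3, so the cubic is solvable by radicals over Q even though, having three real roots, it falls under Theorem 22's ban on real radicals.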
List of Special Symbols

A   Matrix (also B, C, etc.)
Aᵀ   Transposed matrix
Ā   Complex conjugate of matrix
A   Linear algebra
Af(F)   Affine group over F
B   Boolean algebra
c   Cardinality of R
C   Complex field
D   Integral domain
D[x]   Polynomial forms in x, coefficients in D
D(x)   Polynomial functions in x, coefficients in D
d   Cardinal number of the set of positive integers
Eⁿ   Euclidean n-space
E_ij   Special matrix, (i, j)-entry 1, others 0
e, 1   Group identity
F   Field
Fⁿ   Space of n-tuples over F
F[x]   Polynomial forms in x, coefficients in F
F(x)   Rational forms in x, coefficients in F
G   Group
g.l.b.   Greatest lower bound
I, 1   Identity matrix or transformation
1   Greatest element of lattice
i = √−1; i, j, k   Quaternion units
J   Ideal in ring (also H, K, etc.)
K   Field
[K : F]   Degree of K over F
L_n(F)   Full linear group over F
l.u.b.   Least upper bound
M_n(F)   Total matric algebra over F
subspace Set-complement Orthogonal complement (of subspace) Linear transformation Transformation given by matrix A Degree of u over F Vector spaces Dual vector space Vector or row matrix Domain or group of integers Integers mod n Semiring of positive integers Vectors Inner (dot) product of vectors Kronecker delta Unit vector Transformations. mappings.q Q(D) Q R R S S' S. W V* X Z Zn Z+ a. v E c < Zero matrix. least element of lattice Orthogonal group Cardinal number of set S Prime ideal (also nonsingular matrix) Positive prime number Field of quotients of domain D Rational field Ring Real field Set.u ". d. II alb (a.488 List of Special Symbols .) • .b] Orthogonal to Direct Product Direct sum Binary operator Goes into (for elements) Goes into (for sets) Conjugate complex number Infinity Associate Congruent Determinant of A Absolute value Matrix a divides b Greatest common divisor (g.c.m.) Least common multiple (l.1 @ EEl 0 ~ ~ * 00 - IAI lal I ai. b) [a.c. 454 -independent. 10 of matrices. 269 -independence.23 Associate. 224 for rings. 415. 2 for transformations. 427 -integer. 438 -variety. 410 Algebraically -closed. 270 Algebraic. 2 for groups. 422 -function field. 99 Argument of complex number. 232 489 . 202 Annihilator. -group. 362 Addition (see also Sum) of cardinal numbers. of polynomials. 170. 306 -transformation. 378 Augmented matrix. 325 Affine -geometry.Index A Abelian group. 107 of inequalities. 127 general-. 437 -complete. 305 -subspace. 170 Additive group. 247 Automorphism. 232 Anti-symmetric law. 309 -space. 13 Atom. 421 --extension. 443 -number. 435 -number field. 133 Absolute value. 154 Angle between vectors. 133 for matrices. Fundamental Theorem of. 110 Arithmetic. . 158 involutory. 62 fI. 73 Alias-alibi. 127. 76 Associative law. 357. 437. 220 fI. 446 Additive inverse. 2 Adjoint of matrix. 36 of a group. 157 inner-. 390 of complex numbers. 211 Anti-automorphism. 372 Archimedean property. 269. of vectors. 10 of complex number. 271 Alternating group. 
305 fI. 110 Absorption law. 391 for cardinal numbers. 133 Commutative law. 34 Cayley's Theorem. 359. 235 Axiom of choice. 1 Codomain of transformation. -root. 265. 277 bilinear-. 249 -operation. 193 Bijection. 176. 144 Centroids. 353 Canonical projection. 187. 3. 376 Characteristic. 187 for quadratic forms. 281 Bilinear function. 333 -polynomial. 134 Canonical form. normal unitary-. 387 Cartesian product. 33. 377 Boolean function. 139. 368 minimal polynomial. 355 primary rational-. 415 iI. 232 Commutative group. 436 Cardinal number. 308 Chain. etc. 377 isomorphism of-. 319 Column --equivalent. 199 Binary operation.250 -vector. upper. 203 iI. 482 Cayley-Hamilton Theorem. 34 Binomial coefficients. 241 Cofactor. 402. 2 general-. 384 '8 Barycentric coordinates. 386. 207 Cantor's diagonal process. 370 -polynomial.Index 490 outer-. 398. 248. 361 free-.). 332 Class (see Set) Closure under an operation. 33 Binary relation.368 Bounds (lower. 248 -rank. --equation. 381. 269. 14 . 411 normal orthogonal-. 282 disjunctive---. 33. 281 for matrices. Jordan-. 260 for ideals. 283 iI. 225 Bilinearity. 193 change of-. 129. 340 Center of a group. 234 Boolean algebra. 369 for linear forms. 129 Bilinear form. 328 Basis. 332 -vector. 251 Bilinear multiplication. 391 for groups. 265. 176. 14 Block multiplication. 331 iI. 310 Base of parallelepiped. 158 of a vector space. 96 c Cancellation law. 302 of vector spaces. 107 ff. 127 Congruence (see also Congruent). 325 Critical points. 163 Companion matrix. 313 of vectors. 362 Complementary subspaces. 119 -irreducible case. . 291 Cubic equation. 164 Congruent. 474 -subgroups. 210 Consistency principle. 450 Decomposition of matrices. 283. 104 Defining relation. 43. 358 Content of polynomial. cardinal number of. linear. 86 Continuum. 98 Complete ordered field. 97 Decomposition of ideal. 62. 308 in a group. 474 -trigonometric solution. 359 Constructive proof. 375 Complementarity law. 491' homogeneous-. 285 Conic. 158 -quaternions. 69 · Commutator. 
41. 140 -permutation. formal.209. 257 -subfields. Complex plane. 110 Composite. 277 Completing the square.262 Coset. 98 Complete set of invariants. 89 D Decimals. Cycle. 111 Denumerable set. 146. 316 Conjugate -algebraic numbers. 150 -subspace. 176 Derivative. 436 Cover (in partial order). 359 Complementation. 25. 465 Determinant. 163 -vector space. 112 -polynomial. 344 Cyclotomic --equation. linear transformations. 372 Cramer's Rule. 193. 405 -matrices. 384. 296. 260 ff. 43 -relation.Index Commutative ring. 119 Complex number. 448 -complex numbers. 1. 387 Coordinates change of-. 66 of extension field. 318 ff. 346 ff. 384 Dependence. 62 De Moivre formulas. 207 Countable set. 338 Complement in a lattice. 142 Degree. 92 Contain (a set). 150 Cyclic -group. 117 -diameter. 104 . 25. 48. Dedekind cut (axiom). 196 Complete ordered domain. Dedekind. 429 of polynomial. 102 Cut. 362 for cardinal numbers. -relation. 277 ff. 277 under a group. 396 of subspaces. 195. 26 of transformations. 154 Elimination. 34. 178 Direct product. 16 Division -algebra. 396. 228 Diagonal process. 243 ff. 353 -matrix. 399 Equality of functions.414 -Algorithm. 47 of stable type. 241 integral-. 16 greatest common-. 195 of matrices. 74. 127 laws of-. 3 ordered-. 359 Divisible. 224 right. 180 Elliptic function field. 94 Dimension of vector space. 323 Diagram. 210 -isomorphism. for partial order. 248. 181 -symmetric polynomial.373 Dualization law. 436 Diagonally dominant. 184.Index -rank. 2. 400 Epimorphism. 390 general. 84 492 Dual -basis. 88 Elementary -column operation. 169. 122 Equivalence. 14 for matrices. 212. 98 unique factorization-. 359 Duplication of cube. 386. 397 -space. 256. 369 Distance.326 Diagonal matrix. 185 Eigenvalue. 3. 170 for Boolean algebra. 33. 401 Divisor. 120 Disjunctive canonical form. 248 -divisors. 18. 48. 117 of cubic.. 473 -numbers. 4 for sets. 164 . 127 Equations (see also Polynomial) simultaneous linear-. 427 Epimorphic image. 156 Direct sum. 372 Dihedral groups. 
265 Eisenstein's irreducibility criterion. Pythagorean. 19 Divisor of zero (see Zero divisor) Domain. 134. 265. 201 Distributive lattice. 9. 434 E Echelon matrix.440 -ring. 210 Duality principle. 347 of rings. 332 Eigenvector. 143 Dilemma. 321 -row operation. 346 Discriminant. 363 Distributive law. 124 ff. 81.. 113 ff. 302 Hermitian matrix. 412 . 200 ff. 374 Group. 14 Extension. 96. 268 Fully reducible. 141 of vector space (see Span) Gram-Schmidt process. 13 second principle of. 465 Four group. 81 Euclidean group. 347 Function. 472 of Ideal theory. 18 for Gaussian integers. 16 -group (see Quotient-group) Fermat Theorem. 421 of group. 48. 441 for polynomials. 23 of Galois theory. of Arithmetic. -set. 369 Generators of a field extension. 378 Finite -dimensional. 126. 392 Exponents. 331 Hermitian form. 456 ff. 439 Generation of sub algebras. 418 Field. 19. 204 Greatest common divisor. 383 Finite induction. 130 H Hadamard determinant theorem. 32 Fundamental Theorem of Algebra. 132 Euclidean Algorithm. 151 Full linear group. 45 Extension field. 300 ff. 276 Euclidean vector space. Even permutation. laws of. 459 ff. 43 of sets. 407 Greatest lower bound. 450 on symmetric polynomials. 140 of transformations. 458 Galois group. 131. 421 iterated-. 133 -algebra. 154 G Galileo paradox. 153 Exponentiation of cardinal numbers. 38. 429 -field. Hilbert Basis Theorem. 28. 133 of quotients. 397 cyclic-. 431 ff. Gauss elimination. 178 -extension. 180 Gauss' Lemma. 85 Gaussian domain (see Unique factorization domain) Gaussian integer. 15 Formal derivative. F Factor. 133 abstract-. 384 Galois field.Index 493 Erlanger Program. 108 -numbers. 65 -element. 10 of cardinal numbers. 383 Inner automorphism. 80. 359 Identity. 313 Homogeneous -coordinates. 127 Image. 402 -quotient. 201 Infinite set. 80. 173 -law. 405 principal-. 372 Incomparable sets. 133 -matrix. 201 triangle-. 146 Induction principle finite-. 65 -law. 63 Index of subgroup. 400 prime-. 51. 128. 133 -function. 133 left-. 
3 Intersection. 225 -transformation. 334 of quadratic functions. 241 of subspace. 128. 176 Indeterminate. 357 for Boolean algebra. 235 of a transformation. 174 Invariant -subgroup.47 additive-. 401 Inclusion. 283 Homomorphism -of a group. 1 algebraic-. 478 Integers. 73 linear-. Ch. Inverse.15 Inequality. 309 algebraic-. 9 Integral domain. 38. 136 . 377 -of a ring. 191 Ideal. 159 -subspace. 240 Imaginary component. 399 Hypercomplex numbers. 229. 387 Schwarz-. 358 of ideals. 358 Independence 494 affine-. 465 Insolvability of quintic. 128 of a matrix. 190. 69 if. 277 of a matrix.Index Hilbert Nullstellensatz. Inseparable polynomial. right-. 443 positive-. 13 second-. 410 Idempotent law.. 344 Invariants. 145 of subspaces. 365 for partial ordering. 158 Inner product. 107 Improper Ideal.219 -quadratic forms. 2. 313 -linear equations. 155 -of a lattice. 33. 198 if. 396 Hyperplane. 407 of subgroups. 287 if. 185. 413 Holomorph of a group. 216 Liouville number. 134 Left inverse. 95 Irreducible element -polynomial. 214 ff. 374 M Majorizable. 354 monomial-. invertible-. 374 ff.. J Join.201. 146 Join-irreducible. Isometry. 354 K Kernel of homomorphism. 407 Least upper bound. 66 Least common multiple. 174 -transformations.301 Linear -algebra. 137 of lattices. 268 -independence. term. 47. 35. 78 ff. 340 companion-. 96. 146 Left ideal. 439 Lorentz transformation. 228 diagonally dominant. 20. 134. 194. 240 Iterated field extension. 400 Kronecker delta. 207. 96. 156. 387 Matric polynomial (A-matrix). 323 hermitian-. 413 Left identity. 275 Kronecker product. 35 of groups. 228 . 131 Isomorphism. 76. 173 --dependence. 255 L Lagrange interpolation formula.Index 495 Invertible. 280 -fractional su bstitu tion. 68 -theorem for groups. 180 ff. 374 Left coset. 396 -combination. 199. 408 of subgroups. 431 ff. 208 -groups. 338 diagonal-. 16. 65 of commutative rings. 340 Matrix. cardinally. 377 of vector spaces. 260 ff. 300 ff. 146 Lattice. Leading coefficient. 238 Length of vector. 
296 Lower bounds. 176 -equations. 128. 315 -function. 229 Jordan-. 232. 364 of ideals. circulant. 176 -space (see vector space) -sum of subspaces. 76.. 378 Jordan matrix. 359 Involutory automorphism. 229 Involution law. 235 Irrational number. 363. 182 -form. 128 Onto. 228 Multiple. 226 Order of a group. 222 ff. 302 Null-space. 59 of a linear algebra. 378 Metamathematics. 33. 176 -group. 135 Multiplicity of roots. -subgroup. 229 unimodular-. 229. 33 Operational calculus. 237. 235 Norm. 153 One-one. 246 N onsingular linear transformation. 423 Boolean. 16 least common-.313 nonsingular-. 128 Operation. 20 Multiplication (see also Product) of matrices. 468 -orthogonal basis. 146 of a group element. 439. 376 Modulo. 396 Ordered domain. 448 of vector (see Length of vector) Normal -field extension. Maximum. 205 -transformation. 203 ff. 229 orthogonal-. binary. 291 Minor of matrix. 116 N Natural multiple. 202 Midpoint. 313 Nonsingular matrix. nullity. 370 Minimum. 330 unitary-. 273 -vector. 52 Orthogonal. 363 -irreducible. 184 row-reduced-.Index 496 nilpotent-. 313 Nonhomogeneous coordinates. 281 of complex number. modulus (of a congruence). 25 ff. 141 -isomorphism. 9. 33. scalar-. 257. 183 triangular-. 119. 305 Minimal polynomial. 415 Nilpotent matrix. 275 -matrix. 373 Metric properties. 275 -projection. 69 Ordered field. 275 . 300 ff. 320 Modular lattice. 242 Null vector. 405 Meet. 172 o Odd permutation. 158 . 291 Maximal ideal. Monic polynomial. 159 -unitary basis. reduced echelon-. 170 -table of group. 66 Monomial matrix. 302 Outer automorphism. 337. 419 -ideal.222 iI. Partial order.Index 497 Outer product of vectors. 312 iI. 294. 76. 316 -geometry. 63 function. 402 Principle of Finite Induction. 81 Primitive polynomial. 166.401 -subgroup. 224 tensor-. inseparable. 314 -line.66 separable-. 312 iI. 313 -transformation. 324 of ideals. 130. 17 -field. 353 Primary component. 127. 303 Principal ideal. 54 Positive semi-definite. 17 relatively-. 328 Parallelogram law. 
143 Properly contain. 154 Positive definite. 390 of cosets. 150 -matrix. 404 of determinants. 222. 78 iI. 65. 307 Parallelepiped. 348 Prime. 371 iI. 22. 384 Parallel. Projective --conic. 90 iI. 358 Q Quadratic --equation. 57 Permutation. 113 Principal axes. -group. 333 Principal Axis Theorem. 410 iI. 119 . of transformations. 13 Second-. -integer. minimal-. 61 iI.423 monic-. 161. 9. 294. -plane. 465 symmetric-. 85 Primitive root of unity. 409 of matrices. 405 . 401 -ideal. 251 iI. 289 Positive integers (postulates). 465 irreducible-. 314 Proper -homomorphism. 150 -group. 168 Partial fractions. form. 140 Primary canonical form. 297 Postulates. Partition. 337. 258 p Paradox of Galileo. 165 Peano postulates. 228 Polynomial. 411 -ideal. 80. 15 Product (see also Multiplication) of cardinal numbers. 313 -subspace. 1 Power (see also Exponent) of group element. 312 -space. 44 -group. 455 normal-. 443 -number. 26 for divisibility. 357 Relations. 34 of congruence. 134. 296 fl. 121 Roots of unity. 273 Reflexive law. 378 Root field. 241 fl. 28. 403 fl. 18. 102. 33 Rank of bilinear form. of quadratic form. 246 . 238 Rigid motion. 1. 101. -integers. 161 fl. 75 Residue class. 206 fl. 16 for equivalence relations. 290. . division-. -space. 401 noncommutative. -ring. 294. -function.402 fl. 411 -integer. 296 fl. 42 fl. 113 Rotation. 74 Remainder Theorem. 128 -inverse.413 -identity. 326 of matrix. 63 -function. 478 Quotient. 112 fl. 256. 146 -ideal. polynomial.. 76 Reflection. 468 Roots of equations.. 3. 335 Rectangular matrix. 230 Recursive definition. 185. 124. 283 fl. 39. 98 Real quadratic form. 258 Quaternions. 79. 34. Quintic equation. 119. 34. 413 Radicals. 166. Quartic equation. Quaternion group. 181 fl. 164 Relatively prime. Real numbers. 395 of sets. 448 Quadric. solution by. 125. 288 fl.29. 121 fl.34 equivalence-. 43 field of -s. 403 Right -coset. 34. 216. 18. 14 Reducible polynomial. 335 Real symmetric matrix.403 -ring. 20. 101 quartic-. 112 primitive-. 395 commutative-. 
Index, pp. 498–500, entries R–Z (the two-column layout was scrambled in scanning; page references are omitted):

R — Radical of an ideal; Range (of transformations); Rational canonical form; Remainder; Ring; Row-equivalent matrices
S — Saddle-point; Scalar; Schroeder–Bernstein theorem; Schwarz inequality; Self-conjugate subgroup; Semidistributive laws; Separable extension; Separable polynomial; Set; Shear transformation; Signature of quadratic form; Signum (of permutation); Similar matrices; Similarity transformation; Simple extension of field; Simple linear algebra; Simultaneous congruences; Simultaneous linear equations; Singular matrix; Skew-linearity; Skew-symmetric matrix; Skew-symmetry; Solution by radicals; Solution space; Solvable group; Span; Spectrum; Square; Stable type (equations); Subalgebra (of Boolean algebra); Subdomain; Subfield; Subgroup (cyclic, invariant, normal); Sublattice; Submatrix; Subring; Subset; Subspace; Substitution property; Successor function; Sum (see also Addition, Direct sum); Surjection; Sylvester's law of inertia; Symmetric difference; Symmetric group; Symmetric matrix; Symmetric polynomial; Symmetries, group of
T — Tensor product; Total matrix algebra; Trace of a matrix; Transcendental extension; Transcendental number; Transformation (affine, linear, one-one, onto); Transforms; Transitive law (for inclusion); Transitive relation; Translation; Transpose of matrix; Triangle inequality; Triangular matrix; Trichotomy (law of); Trigonometric solution of cubic; Trisection of angle; Two-sided ideal
U — Unary operation; Unimodular group; Union (see also Join); Unique factorization (of Gaussian integers, of polynomials); Unit; Unit vectors; Unitary basis; Unitary matrix; Unitary space; Unitary transformations; Unity; Universal bounds; Universality; Upper bounds
V — Vandermonde determinant; Vector (characteristic vector (eigenvector), unit vector, zero vector); Vector space (dimension of); Venn diagram; Vieta substitution; Volume
W — Well-ordering principle; Winding number
Z — Zero (see also Roots of equations); Zero-divisor; Zero matrix