Discrete Mathematics







Incomplete working draft. Do not distribute without permission from the authors.

P. Gouveia, P. Mateus, J. Rasga, C. Sernadas
Instituto Superior Técnico, Departamento de Matemática
2011

Contents

Preface

1 Modular congruences
  1.1 Motivation
  1.2 Divisibility
    1.2.1 Divisors, quotients and remainders
    1.2.2 Euclid's algorithm
    1.2.3 Prime numbers
  1.3 Modular arithmetic
    1.3.1 Congruence modulo n
    1.3.2 The rings Zn
    1.3.3 The Chinese Remainder Theorem
  1.4 RSA revisited
  1.5 Exercises

2 Pseudo-random numbers
  2.1 Motivation: traffic simulation
  2.2 Linear congruential generators
  2.3 Blum-Blum-Shub generators
  2.4 Traffic simulation revisited
  2.5 Exercises

3 Polynomials
  3.1 Motivation
    3.1.1 Digital circuit equivalence
    3.1.2 Inverse kinematics of a robot
  3.2 Basic concepts
    3.2.1 Rings of polynomials
    3.2.2 Monomial orderings
    3.2.3 Division of terms and polynomials
    3.2.4 Reduction modulo a set of polynomials
  3.3 Gröbner bases
    3.3.1 Ring ideals
    3.3.2 Buchberger criterion
    3.3.3 Buchberger algorithm
    3.3.4 Properties of Gröbner bases
  3.4 Motivating examples revisited
    3.4.1 Equivalence of digital circuits
    3.4.2 Inverse kinematics
  3.5 Exercises

4 Euler-Maclaurin formula
  4.1 Motivation
  4.2 Expressions
  4.3 Main results
  4.4 Examples
    4.4.1 Gaussian elimination
    4.4.2 Insertion sort
  4.5 Exercises

5 Discrete Fourier transform
  5.1 Motivation
  5.2 Discrete Fourier transform
    5.2.1 Complex roots of unity
    5.2.2 Discrete Fourier transform
  5.3 Fast Fourier transform
  5.4 Polynomial multiplication revisited
    5.4.1 Coefficient and point-value representations
    5.4.2 Polynomial multiplication and FFT
  5.5 Image processing
  5.6 Exercises

6 Generating functions
  6.1 Motivation
    6.1.1 Search by hashing
    6.1.2 Euclid's algorithm
  6.2 Generating functions
  6.3 Motivating examples revisited
    6.3.1 Search by hashing
    6.3.2 Euclid's algorithm
  6.4 Exercises

References

Subject Index

Table of Symbols

Preface

The material presented herein constitutes a self-contained text on several topics of discrete mathematics for undergraduate students. An outline of this text is as follows.

In Chapter 1 we address modular congruences and their applications. A motivating example from public key cryptography is presented in Section 1.1. In Section 1.2 we introduce several concepts and results related to divisibility. Section 1.3 concentrates on modular arithmetic, including basic properties of the rings Zn. In Section 1.4 we revisit the motivating example.

In Chapter 2 we discuss the generation of pseudo-random numbers. Random numbers are useful in several different fields, ranging from simulation and sampling to cryptography. In Section 2.1 we present a motivating example related to traffic simulation. In Section 2.2 we introduce linear congruential generators. Blum-Blum-Shub generators are presented in Section 2.3. The traffic simulation example is revisited in Section 2.4.

Chapter 3 presents several key concepts and results related to polynomials. In Section 3.1 we first discuss a motivating example illustrating how polynomials can be used to verify equivalence of digital circuits. Then we illustrate the relevance of polynomials in robotics. In Section 3.2 we introduce the notion of polynomial over a field as well as the sum and product of polynomials. We then introduce division of polynomials and several related results. Gröbner bases and their properties are presented in Section 3.3. In Section 3.4 we revisit our motivating examples and show how to use Gröbner bases for finding solutions of systems of nonlinear polynomial equations.

In Chapter 4 we introduce several techniques to compute summations. In Section 4.1 we present a motivating example in Bioinformatics. In Section 4.2 we introduce summation expressions and some of their relevant properties. The Euler-Maclaurin formula is presented in Section 4.3. In Section 4.4 we illustrate the relevance of summations to the analysis of the Gaussian elimination technique and the insertion sort algorithm.
In Chapter 5 we present the discrete Fourier transform. The discrete Fourier transform is widely used in many fields, ranging from image processing to efficient multiplication of polynomials and large integers. In Section 5.1 we discuss a motivating example that illustrates the use of the discrete Fourier transform for efficient polynomial multiplication. In Section 5.2 we introduce the discrete Fourier transform and in Section 5.3 we present an efficient method for computing it, the fast Fourier transform. In Section 5.4 we revisit polynomial multiplication based on the discrete Fourier transform. Image processing using the fast Fourier transform is discussed in Section 5.5.

In Chapter 6 we introduce generating functions. In Section 6.1 we present motivating examples of the use of generating functions in algorithm analysis and recurrence relation solving. Generating functions are introduced in Section 6.2, together with their sum, product, derivative and integral. In Section 6.3 we revisit the motivating examples.

Chapter dependencies are described in Figure 1. At the end of each section several exercises are proposed.

Figure 1: Chapter dependencies (1. Modular congruences, 2. Pseudo-random numbers, 3. Polynomials, 4. Euler-Maclaurin formula, 5. Discrete Fourier transform, 6. Generating functions).

Chapter 1

Modular congruences

In this chapter we address modular congruences and their applications. The chapter is organized as follows. We start in Section 1.1 with a motivating example from public key cryptography. In Section 1.2 we introduce several concepts and results related to divisibility. Section 1.3 concentrates on modular arithmetic, including basic properties of the rings Zn. In Section 1.4 we revisit the motivating example. Finally, in Section 1.5 we propose some exercises.

1.1 Motivation

Consider the problem of sending a secret message through a public channel between two parties, say from Alice to Bob, that have not previously agreed upon a key. The most well-known solution to this problem consists in using a public key protocol to exchange the message. In public key cryptography, each party has a pair of keys: a public key and a corresponding private key. The public key, as the name suggests, is made public whereas the private key is kept secret. Messages are encrypted with the public key and can only be decrypted with the corresponding private key.

Let K be the set of public keys and R a set of private keys, and let X, Y be sets. Encryption is described as a family of maps u = {uk : X → Y}k∈K and decryption as a family of maps v = {vr : Y → X}r∈R such that:

1. for each public key k ∈ K there is a unique private key rk ∈ R such that vrk ◦ uk = id;

2. uk and vrk must be computed efficiently (in polynomial time);

3. it should be hard to invert uk when rk is not known, even in the presence of an encrypted message.

Moreover, it should be hard to get the private key from the public one. To send a message x to Bob, Alice first uses Bob's public key k to obtain the ciphered text uk(x). When Bob receives uk(x) over the public channel, he uses his private key rk to obtain the original message vrk(uk(x)) = x.

In the RSA cryptosystem, presented in the example below, the receiver Bob first chooses two prime numbers p and q and computes n = p × q and (p − 1) × (q − 1).
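As an aside (this illustration is ours, not part of the original text), the key pair and the round trip vrk(uk(x)) = x can be sketched in Mathematica with toy parameters. The names rsaEncrypt and rsaDecrypt are our own, the primes are far too small for real use, and PowerMod[x, a, n] is the built-in function computing mod(x^a, n):

p = 61; q = 53;                                  (* two small illustrative primes *)
n = p*q;                                         (* the public modulus n = p × q *)
a = 17;                                          (* public exponent, coprime to (p-1)(q-1) *)
b = PowerMod[a, -1, (p - 1)*(q - 1)];            (* private exponent: mod(a × b, (p-1)(q-1)) = 1 *)
rsaEncrypt = Function[{x}, PowerMod[x, a, n]];   (* u(n,a)(x) = mod(x^a, n) *)
rsaDecrypt = Function[{y}, PowerMod[y, b, n]];   (* v(n,b)(y) = mod(y^b, n) *)
rsaDecrypt[rsaEncrypt[65]] == 65                 (* the original message is recovered *)

PowerMod performs modular exponentiation without ever building the full power x^a, which is what makes u and v efficiently computable; recovering b from the public pair (n, a) alone essentially amounts to factoring n, which is why the security of RSA, as discussed below, rests on the hardness of factorization.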
the RSA must be abandoned if and when quantum computers become available.1 The RSA cryptosystem is due to R. Let n. 1. .b) (y) = mod(y b. n − 1} Public key: (n. where mod(n. Therefore. Adelman [26]. A. it has been shown by Shor [28] that quantum computers factorize in polynomial time. (p − 1) × (q − 1)) = 1 Message space: X = Y = {0. a). . b.1. . Example 1. it is hard for a third party. b) v(n. 1. say Eve. . The proof uses several notions and results that are presented in this chapter. he chooses a and b such that mod(a × b. 2. that eavesdrops the channel and knows uk (x) and k to obtain x.1: RSA cryptosystem For Alice to send a message to Bob. Indeed. u and v can be efficiently computed and that the equality vrk ◦ uk = idX indeed holds.a) (x) = mod(xa . As we shall see. MODULAR CONGRUENCES Due to Property 3. a and b be natural numbers such that n = p × q where p and q are prime numbers mod(a × b. m) is the remainder of the (integer) division of n by m.10 CHAPTER 1. q. This conjecture may well not be true. n) Figure 1. a) u(n. Let r be the least element of S ∩ N0 .2 Let m. r < m. We say that m divides n.2. The function dividesQ in Figure 1. and write m|n whenever there is k ∈ Z such that n = k × m. it holds that r 6= r ′ and 0 ≤ r ′ = r − m < r.n}. and therefore n = q × m + r with r ≥ 0. quotients and remainders We start by recalling divisor of an integer number as well as remainder and quotient of integer division.  Observe that each integer m divides 0 and that 1 divides any integer n.2. r = n − q × m for some q ∈ Z. we also say that m is a divisor of n and that n is a multiple of m. We first consider the case where m is positive.3 For each n. Hence.2. considering r ′ = n − (q + 1) × m and recalling that m > 0. n ∈ Z.2 returns True if m divides n and False otherwise. We say that q ∈ Z is the quotient and r ∈ Z is the remainder of the integer division of n by m whenever n=q×m+r and 0≤r<m  For simplicity we often only refer to the quotient and remainder of the division of n by m. If r ≥ m.2: Divisor test in Mathematica The remainder of the division of an integer n by an integer m play an important role in the sequel. n ∈ Z with m > 0. n + |n| × m ∈ S.IntegerQ[n/m]] Figure 1.1 Divisors. The following result establishes that they are unique. It uses the Mathematica predicate IntegerQ. DIVISIBILITY 1. Proof: Let S = {n − k × m : k ∈ Z}. Then. . m ∈ Z with m > 0 there are unique integers q and r such that n = q × m + r and 0 ≤ r < m. contradicting the fact that r is the least element in S ∩ N0 . dividesQ=Function[{m.11 1. Definition 1. The set S ∩ N0 is not empty since.2.2. for instance. When m divides n. Definition 1.1 Let m. Proposition 1. Moreover. 1. k) = mod(n. Hence. m > 0. Conversely.3. m) = n. Clearly.2. mod(n. r ′ < m. Assume there are integers q. m). that is. m. r ′ − r = 0 = m × (q − q ′ ). m) = n − jnk m ×m for n. m). If k divides m then mod(mod(n. Proposition 1. m) and n′ = nm × m + mod(n′ . k > 0. given that m > 0. m × (q − q ′ ) = r ′ − r Assuming without loss of generality that q ′ ≤ q and recalling that m > 0. k). Since n  ′ n= m × m + mod(n. m) = mod(n′ . Recall that the map Floor associates with each x that is. n′ . We now present some useful facts about remainders.2. MODULAR CONGRUENCES We now prove the unicity.3. r = r ′ . assume that m divides n − n′ . also q − q ′ = 0. m) = mod(n′ . Proof: 1. q ′ .12 CHAPTER 1.  n   n′  Hence. m ∈ Z. q = q ′ . since 0 ≤ r. 2. m) = n. k ∈ Z with m. n − n′ = ( m − m ) × m and therefore n − n′ is a multiple of m. 
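As a small aside of ours (not from the original text), the defining property of quotient and remainder in Definition 1.2.2 can be checked with the built-in functions Quotient and Mod; the name checkDivision is our own:

checkDivision = Function[{n, m},
  Module[{q = Quotient[n, m], r = Mod[n, m]},
    n == q*m + r && 0 <= r < m]];          (* n = q × m + r with 0 <= r < m *)
checkDivision[17, 5]                       (* True: 17 = 3 × 5 + 2 *)
checkDivision[-17, 5]                      (* True: -17 = (-4) × 5 + 3 *)

For m > 0 the value Quotient[n, m] coincides with ⌊n/m⌋, in line with the characterization of the quotient via the Floor function discussed next.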
We denote the remainder of the division by mod(n. m) we conclude that n  ′ n − n′ = m × m − nm × m. we can conclude that either r ′ − r = 0 or r ′ − r ≥ m. 3. Assume that mod(n. m n . m). mod(n. q is the Floor of m the integer ⌊x⌋. 2. Therefore. the largest integer k such that k ≤ x. it holds that r ′ − r < m.2. n − n′ = q × m for some q ∈ Z and therefore . m) if and only if n − n′ is a multiple of m. Since n = 0 × m + n and 0 ≤ n < m. Observe that jnk q= .4 Let n. QED Let n = q × m + r be the equality in Proposition 1. mod(n. If 0 ≤ n < m then mod(n. But. r and r ′ such that n = q × m + r and n = q ′ × m + r ′ with 0 ≤ r. r ′ < m. that is. by Proposition 1. Hence. Then. we also have n n − mod(n. gcd(m. As a consequence. gcd(0. −n). Noting that k|0 for all k ∈ Z.2. by Proposition 1.  The case m = n = 0 is excluded because every integer divides 0 and therefore there is no greatest integer that divides 0. m)  ′ that is. m) = m ×q×k for some q ∈ Z.5). n). 3. Propositions 1. n) = gcd(−m. we conclude that mod(mod(n. Clearly.3 and 1. m). gcd(m. k divides n − mod(n.2. We have that n − mod(n. k ∈ Z.2. m) < m. Since m = k × d if and only if −m = (−k) × d.2 to the case where m < 0 and requiring that 0 ≤ r < |m|. gcd(n.1. m) = |m| holds. 1. m). k) = mod(n. DIVISIBILITY 13  ′ n = q × m + n′ = q × m + nm × m + mod(n′ . for every d. n) = gcd(m. Hence.2. m) = mod(n.5 Let m. Definition 1. n) = gcd(n. QED We can extend Definition 1. m) = |m| for m 6= 0. 1. Since 0 ≤ mod(n′ . mod(n′ . we conclude that gcd(0.6 Let m. Clearly. 2. m) = gcd(mod(n. k). by 2.2. we have that {k ∈ Z : k|m} = {k ∈ Z : k|(−m)} . n × m. n ∈ Z not simultaneously equal to 0. n) is always a positive integer. since k divides 3. n ∈ Z not simultaneously equal to 0. The greatest common divisor of m and n. m) and gcd(m.2. m). n = (q + nm ) × m + mod(n′ . It is easy to conclude that |m| is the largest element of the set {k ∈ Z : k|m} for m 6= 0. m). Proof: 1. m) and. The following properties of the greatest common divisor are useful in the sequel. m). −n) = gcd(−m. 2.3.2 Euclid’s algorithm We now present the Euclid’s algorithm for computing the greatest common divisor of two integers as well as an extension of the algorithm.2. m) = m m. gcd(m.4 also extend to this case (Exercise 3 in Section 1.2. Proposition 1. is the greatest integer that divides both m and n. m) and let d|m. However. consists of listing all the divisors of n and all the divisors of m picking then the largest element included in both lists.18) = euclid( |{z} 0 . m) and k|m} Therefore.2. 18) = euclid( |{z} 6 mod(24. that is.2. n). when m and n are both different from 0. m). 18) euclid(24.6 shows that for computing the greatest common divisor we can concentrate only on nonnegative integers. n). m). n) = euclid(mod(n. m = k × d for some k ∈ Z. For m. or Euclidean algorithm. n nonnegative integers not both equal to 0: ( n if m = 0 euclid(m. Conversely. 24) mod(18. m). if d|n. this is not efficient. m).7 Let us illustrate the Euclid’s algorithm by computing euclid(24.3. then n = q × k × d + k ′ × d = (q × k + k ′ ) × d for some k ′ ∈ Z. m) for some k ′′ ∈ Z and therefore d|mod(n. n) = gcd(−m. The Euclid’s algorithm uses the results above to compute the greatest common divisor of two nonnegative integers in a more efficient way. gcd(m. If d|mod(n.14 CHAPTER 1. MODULAR CONGRUENCES Therefore. m). 6) mod(18. The first and third statements of Proposition 1. The Euclid’s algorithm. m) otherwise Figure 1. one way of finding gcd(m. gcd(n. 
being included in the 7th book of Euclid’s Elements. Clearly. 3. even if we only list the positive divisors. then k ′′ ×d = q×k×d+mod(n. The other equalities also follow easily. dates from around 300 BC.6 play a crucial role in the Euclid’s algorithm for computing the greatest common divisor.6) = 6 . We then conclude that {k ∈ Z : k|m and k|n} = {k ∈ Z : k|mod(n.3: Euclid’s algorithm Example 1.2.24) . that is. m) = gcd(mod(n. d|n. It can be recursively as described in Figure 1. 18) = euclid( |{z} 18 . QED The second statement of Proposition 1. Assume n = q × m + mod(n. DIVISIBILITY where it is clear the recursive nature of the algorithm. y0) = euclid(x1 .6 we have euclid(xp . since euclid(0. ◭ We now discuss the soundness of the algorithm. the first argument necessarily becomes 0. The gcd(m. x) is an integer between 0 and x − 1. (x0 . when computing euclid(m. That is. . x). yp ) = gcd(xp .8 The Euclid’s algorithm is sound.1) . Proposition 1. n) holds for nonnegative integers m and n. Therefore. So. by 1 of Proposition 1.15 1. yp) = yp = gcd(0. the recursion step can not be applied more than x times. n) = gcd(m. yp ) (1. xi−1 ) and yi = xi−1 . by applying the recursion step euclid(x.2. m) 24 18 18 18 24 6 6 18 0 0 6 − The construction ends whenever we get 0 in column “m”. To compute euclid(24. = euclid(xp . yp) where x ≥ p ≥ 0. n). yp ) = euclid(0. . and. xp = 0 and for 1 ≤ i ≤ p xi = mod(yi−1 . As a consequence. n) is the value in the last line of column“n”. the equality euclid(m.2. that is. at the end. n). x).2. x). 18) we can also make just a simple table including the recursion steps: m n mod(n. y0 ) = (m. Proof: Recall that mod(y. we get a finite number of recursive calls euclid(x0 . y1) = . y) = euclid(mod(y. the first argument strictly decreases. x) = x = gcd(0. n. QED Example 1. If[m==0. euclid[Mod[n. yp ). MODULAR CONGRUENCES and we conclude that euclid(x0 . m) is the smallest positive number of the form a × m + b × n for a. The case m = 0 is clearly similar. a). we have gcd(xi−1 .4 implements the Euclid’s algorithm in the obvious way (see Figure 1.6. m). yi) for all 1 ≤ i ≤ p.n}. gcd(n.2.m]]].2 of Chapter 6.2. y0 ) = gcd(xp . Proposition 1.2. Figure 1. 18) = 6. Then.4: Euclid’s algorithm in Mathematica The time complexity of Euclid’s algorithm is discussed in Section 6. recalling 1 of Proposition 1. b) = gcd(b. ◭ The recursive function euclid in Figure 1. a × m + b × n becomes a × m. (1.6 and the fact that gcd(a.2. euclid=Function[{m. Extended Euclid’s algorithm We start by stating an important property of the greatest common divisor. (1. Then gcd(n. yi−1 ) = gcd(xi .2) and (1. y0 ) = gcd(x0 . Proof: Let us first consider the case n = 0. The smallest positive integer of this form is |m|. b ∈ Z. y0 ) = gcd(m.3).2. We can then conclude that gcd(24. yp ).16 CHAPTER 1. using 3 of Proposition 1. .10 Let m.2) On the other hand.9 From Example 1. n). y0 ) = gcd(xp .3.3) we finally conclude that euclid(m.7 we know that euclid(24. n) = euclid(x0 . hence gcd(x0 . that is. 18) = 6. n ∈ Z not both equal to 0.3) From (1.m]. n) =   n .9: . Let y ∈ Z be another common divisor of m and n. Let q and r be the quotient and remainder of the division of m by x. QED It is possible to modify Euclid’s algorithm in order to obtain values a and b such that gcd(m. Then. 18). S is a nonempty set of positive integers and therefore it has a least element x = a × m + b × n for some a and b. 
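As an aside, the recursion steps of Example 1.2.7 can also be carried out by a small loop; the following iterative variant is ours (the text's own implementation, in Figure 1.4, is recursive). It replaces the pair (m, n) by (mod(n, m), m) until the first component reaches 0:

euclidIter = Function[{m0, n0},
  Module[{m = m0, n = n0, r},
    While[m != 0, r = Mod[n, m]; n = m; m = r];
    n]];
euclidIter[24, 18] == GCD[24, 18]          (* both evaluate to 6 *)

The first component strictly decreases at each step, so the loop terminates, exactly as argued for the recursive formulation.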
this contradict the fact that x is the least element of S.5: Extended Euclid’s algorithm Let us illustrate the extended Euclid’s algorithm. and therefore y ≤ x.17 1. Hence. k ′ ∈ Z. Thus.5. b′ ) = exteuclid(mod(n. 1) if m = 0 exteuclid(m. a′ ) otherwise (b′ − a′ × m where (a′ . n) = a × m + b × n. x is a common divisor of m and n. But. Reasoning in a similar way with respect to x and n we conclude that x divides n.11 Consider the case of exteuclid(24. since r < x. Hence. x = gcd(n. r = 0 and therefore x divides m. Since m and n are not both equal to 0. This algorithm is named the extended Euclid’s algorithm and is recursively defined in Figure 1.2. For m. We can use a table similar to the one presented in Example 1.2. m = q × x + r with 0 ≤ r < x and therefore r = m − q × (a × m + b × n) = (1 − q × a) × m + (q × b) × n If r > 0 then r ∈ S. m). x = a × m + b × n = a × k × y + b × k ′ × y = (a × k + b × k ′ ) × y for some k. Example 1. m). m) Figure 1. respectively. and let S be the set of all positive integers of the form a × m + b × n for integers a and b.2. DIVISIBILITY Assume now that m and n are not both equal to 0. n nonnegative integers not both equal to 0: ( (0. Then. indeed.2. n) = (0. 1). gcd(24. This corresponds to the table in Example 1. in a similar way: m n mod(n. respectively. top down. 100) = 5 = 7 × 15 + (−1) × 100.18 CHAPTER 1. we can fill in the last line of columns “a” and “b” with 0 and 1. Note again that gcd(15. m) 24 18 18 n m a 0 1 (= 1 − (−1) × 18 24 6 18 0 3 − 6 24 18 1   18 (= 1 − 0 × 0  18  −1   24 1 (= 0 − 1 × 6 b − 6 ) −1 1 ) 0 ) 0 1 We start by filling in the first three columns of each line.9. −1) Observe that. as indicated. 18) = 6 = 1 × 24 + (−1) × 18. 18) = (1. for instance. . ◭ Next we discuss the soundness of the extended Euclid’s algorithm. −1). following the second equality of the algorithm. We can now compute exteuclid(15. Afterwards. since exteuclid(m. m) n m a b 6 7 −1 15 100 10 10 15 5 1 −1 1 5 10 0 2 1 0 0 5 − − 0 1 We conclude that exteuclid(15. 100). Once we get 0 in column “m”. bottom up. we fill in the other lines of these columns. Hence. MODULAR CONGRUENCES m n mod(n. exteuclid(24. 100) = (7. 6) . bp ) = (0. ap ) . for 1 ≤ i ≤ p.1) above. 1) (1. observe that. we are required to compute exteuclid(xi . n) = exteuclid(x0 . y0) = (m. bp ) = (0. m). k) for some k. yp = gcd(m. More precisely.5) Following the algorithm. that is. yp ). bi−1 ) = (bi − ai × j j yp−1 xp−1 yi−1 xi−1 for 1 ≤ i ≤ p. xi−1 ). xp = 0 and (xi .2. as remarked therein. Hence.19 1. and therefore k k . that is. v. n). yi). xi−1 ) for each 1 ≤ i ≤ p. we can then compute exteuclid(xp−1 .4) When we get xp = 0 then we get the pair (ap . on one hand u × mod(y. and go on repeating this step until we are required to compute exteuclid(0. bi ) = exteuclid(xi . and therefore. where (x0 . m). yi−1) = (ai−1 .2. n) = (a. b) then gcd(m. y with x > 0. we start by computing exteuclid(mod(n. yp−1) = (ap−1 . b0 ) Moreover. bp−1 ) = (bp − ap × and so on until exteuclid(x0 . yi) = (mod(yi−1 . Note that there are indeed finitely many of these pairs (xi . y0 ) = (a0 . n) (1. if exteuclid(m. n nonnegative integers. ai ) exteuclid(m. yi ) for 1 ≤ i ≤ p and p ≥ 0. because these pairs are exactly as in (1. n). x. Proof: To compute the value of exteuclid(m. that is. n) = a × m + b × n holds for m. we have exteuclid(xi−1 . yp) = (ap . y0 ). we have (1. where (ai . we conclude that exteuclid(xp . with m 6= 0.12 The extended Euclid’s algorithm is sound. 1) corresponding to exteuclid(0. 
x) + v × x = u × (y − y × x) + v × x   = u × y − u × xy × x + v × x   = (v − u × xy ) × x + u × y x for integers u. In fact. yi). after a finite number of steps we get xp = 0 for some p. DIVISIBILITY Proposition 1. Definition 1.5) gcd(m. {0. ◭ .n}.m][[1]]}]]. recalling (1. exteuclid[Mod[n.m]. Example 1. Observe that 2 is the only even prime number.13 An integer number p is said to be prime whenever it is greater than 1 and for any positive integer n if n divides p then n = 1 or n = p. exteuclid=Function[{m.m][[1]]Floor[n/m].m]. MODULAR CONGRUENCES ai × xi + bi × yi = (bi − ai j yi−1 xi−1 k ) × xi−1 + ai × yi−1 = ai−1 × xi−1 + bi−1 × yi−1 As a consequence. indeed.6) we can finally conclude that. We also refer to coprime numbers.3 Prime numbers Herein we introduce prime numbers and the Fundamental Theorem of Arithmetic. Figure 1. QED The recursive function exteuclid in Figure 1. If[m==0. ap × xp + bp × yp = a0 × x0 + b0 × y0 On the other hand. In Mathematica the predicate PrimeQ tests whether a number is prime. b) such that gcd(m.14 The first prime number is 2 and the following prime numbers less than 10 are 3.2. n) = a × m + b × n. n) = yp = 0 × 0 + 1 × yp = ap × xp + bp × yp and therefore gcd(m.20 CHAPTER 1.m]. {exteuclid[Mod[n. n) = (a. n) = a0 × x0 + b0 × y0 Recalling (1. exteuclid(m.4) and (1.6: Extended Euclid’s algorithm in Mathematica 1.1}. 5 and 7.2.m][[2]]exteuclid[Mod[n.2.6 implements the extended Euclid’s algorithm following in Mathematica. Thus. DIVISIBILITY The following proposition presents a result known as Euclid’s lemma. m) = 1. and pi 6= pj for all 1 ≤ i. i 6= j. n is not a prime number and therefore n = a × b for some integers a and b such that a. by contradiction. given that p divides m × n. Proof: We first prove that such a product exists and then that it is unique. both a and b can be written as a product of primes.2. n = a × p × n + b × k × p = (a × b + b × k) × p for some k ∈ Z. Since a. This contradicts the assumption and allows us conclude that every integer greater than 1 can indeed be written a product of primes. j ≤ k. b > 1 and a. Proposition 1. p divides n.16 Every integer number n > 1 can be written as a product of prime numbers. there are a. If p divides m × n then p divides m or p divides n. Proof: Assume that p divides m × n but p does not divide m. b ∈ Z such that 1 = a × p + b × m.21 1.2. as a consequence. that there are integers greater than 1 that cannot be written as a product of primes as in the statement of the Theorem.10. many problems in number theory can then be reduced to problems about prime numbers only. By Proposition 1. QED We now present the Fundamental Theorem of Arithmetic. for every 1 ≤ i ≤ k. and let n be the smallest of such integers. n ∈ N0 . n can also be written as a product of primes. Since p does not divide m then gcd(p.2. As a consequence. This theorem asserts that each integer greater than 1 can be written in a unique way as a product of prime numbers. Let n be the smallest of such integers. This is a very important and useful result since. that is. we get n=a×p×n+b×m×n and therefore. b < n. (2) Uniqueness. Assume now that there are integers greater than 1 that have two distinct factorizations in prime numbers. This factorization into prime numbers is unique apart from permutations of the factors.2. This result is used in the proof of the Fundamental Theorem of Arithmetic below. (1) Existence of factorization. Then n can be written as . b < n. Theorem 1. Assume. Multiplying both sides by n. 
that is k Y n= pi e i i=1 where pi is prime and ei is a positive integer.15 Let p be a prime number and let m. We prove thatcomO p divides n. 22 CHAPTER 1. MODULAR CONGRUENCES n= s Y pi and n= i=1 t Y qi i=1 where pi and qj are prime numbers for 1 ≤ i ≤ s and 1 ≤ j ≤ t, and pi ≤ pi+1 and qi ≤ qi+1 for each 1 ≤ i < s and 1 ≤ j < t. Moreover, assuming without loss of generality that s ≤ t, there is 1 ≤ i ≤ s such that pi 6= qi . Clearly, s, t > 1. Then, since p1 divides n, by Proposition 1.2.15, p1 divides q1 or p1 divides q2 × . . . × qt . On one hand, if p1 divides q1 , then p1 = q1 since they are both prime. Thus, ′ n = s Y i=2 pi and ′ n = t Y qi i=2 that is, n′ < n has two distinct prime factorizations. But this contradicts the assumption of n being the smaller integer greater that 1 satisfying this property. On the other hand, since q2 × . . . × qt < n then q2 × . . . × qt has a unique prime factorization. Hence, if p1 divides q2 × . . . × qt then q2 × . . . × qt = k × p1 and therefore p1 = qj for some 1 ≤ j < t. Removing p1 from the first factorization of n and qj from the second, we again end up with 1 < n′ < n with two distinct prime factorizations thus contradicting once more the assumption regarding n. As a consequence, we conclude that every integer n > 1 has a unique prime factorization. QED Example 1.2.17 The factorizations into primes of 15, 90 and 2205, for instance, are as follows: • 15 = 31 × 51 • 90 = 21 × 32 × 51 • 2205 = 32 × 51 × 72 ◭ As we have already remarked in Section 1.1, factorization of integers into prime numbers is computationally hard and, at present date, no polynomial-time (classical) algorithm is known. Another important result about prime numbers is stated in the Theorem of Euclid presented in the 9th book of Euclid’s Elements (Exercise 6 in Section 1.5). We now present the notion of coprime numbers. Definition 1.2.18 Two integers m and n are coprime, or relatively prime, whenever gcd(m, n) = 1.  For instance, 18 and 35 are coprime, but 35 and 40 are not coprime since gcd(35, 40) = 5. When m and n are coprime we also say that m is coprime to n. We now prove some simple but useful facts about coprime numbers. 1.2. DIVISIBILITY 23 Proposition 1.2.19 Let m, n ∈ Z. 1. If m, n are prime numbers and n 6= m then m and n are coprime. 2. If m, n > 1 then m and n are coprime if and only if their factorizations into prime numbers do not have any prime in common. 3. If n is prime then all the positive integers less than n are coprime to n. Proof: 1. If m is prime then the only positive divisors of m are 1 and m. Similarly, the only positive divisors of n are 1 and n. Since n 6= m, gcd(m, n) = 1. Q Q ′ ′ 2. Assume m and n are coprime. Let m = ki=1 pi ei and n = ki=1 p′i ei be the (unique) factorizations of m and n. If pi = p′j for some 1 ≤ i ≤ k and 1 ≤ j ≤ k ′ then pi both divides m and n. Since pi > 1, gcd(m, n) 6= 1 and therefore m and n are not coprime, contradicting the assumption. Conversely, assume that the factorizations of n and m into prime numbers do not have any prime in common. Hence, if r > 1 divides n then, using Theorem 1.2.16, we can conclude that the factorization of r only includes primes also present in the factorization of n. Similarly, when r > 1 divides m, the factorization of r only includes primes also in the factorization of m. Hence, there is no r > 1 that both divides n and m. As a consequence, gcd(m, n) = 1. 3. Suppose that there is 0 < m < n such that gcd(m, n) > 1. Since gcd(m, n) divides n and n is prime, gcd(m, n) = n. 
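As an aside (ours), the factorizations of Example 1.2.17 and the notion of coprime numbers can be explored with the built-in functions FactorInteger and CoprimeQ:

FactorInteger[2205]       (* {{3, 2}, {5, 1}, {7, 2}}, that is, 2205 = 3^2 × 5 × 7^2 *)
CoprimeQ[18, 35]          (* True: the factorizations share no prime *)
CoprimeQ[35, 40]          (* False: gcd(35, 40) = 5 *)

This is exactly the criterion of Proposition 1.2.19: two integers greater than 1 are coprime precisely when their prime factorizations have no prime in common.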
But this cannot be the case, since gcd(m, n) divides m and therefore gcd(m, n) ≤ m < n. We then conclude that if 0 < m < n then, necessarily, gcd(m, n) = 1. QED The Euler’s phi function associates with each positive integer n the number of positive integers less than or equal to n that are coprime to n. Definition 1.2.20 The Euler’s phi function, or just phi function, is the map φ : N → N such that φ(n) is the number of positive integers less than or equal to n that are coprime to n.  The Euler’s phi function is also known as the totient function. Example 1.2.21 Let φ be the Euler’s phi function. Then, • φ(1) = φ(2) = 1 • φ(3) = φ(4) = φ(6) = 2 • φ(5) = 4 The Euler’s phi function has several interesting properties. ◭ 24 CHAPTER 1. MODULAR CONGRUENCES Proposition 1.2.22 Let φ be the Euler’s phi function. 1. If p is prime and k is a positive integer then φ(pk ) = pk − pk−1 . 2. If n = p ×q where p and q are primes then φ(n) = φ(p)φ(q) = (p −1)(q −1). Proof: 1. An integer r > 0 is coprime to pk if and only if r is not a multiple of p (Exercise 4 in Section 1.5). Hence, φ(pk ) is the number of elements in C = {1, 2, . . . , pk } that are not multiples of p. Observe that the set of multiples of p in C is C ′ = {p, 2p, 3p . . . , zp} where zp is necessarily pk and therefore z = pk−1. Thus, that there are exactly pk−1 multiples of p in C. Finally, φ(pk ) is the number of elements in C\C ′ that is pk − pk−1 . 2. An integer r > 0 is coprime to n if and only if r is not a multiple of p and r is not a multiple of q (Exercise 5 in Section 1.5. Hence, φ(n) is the number of such integers in C = {1, 2, . . . , n}. The set of multiples of p in C is C ′ = {p, 2p, 3p . . . , zp} ⊆ C where zp is necessarily n. Since n = p × q, by Theorem 1.2.16, z = q and we conclude that there are q multiples of p in C. Similarly, we can conclude that the set of multiples of q in C is C ′′ = {q, 2q, 3q . . . , pq} ⊆ C and therefore there are p multiples of q in C. It is an easy exercise to prove that there is only one element in C that it is both multiple of p and multiple of q: the integer p × q. Hence, φ(nk ) is the number of elements in C\(C ′ ∪ C ′′ ) that is n − q − p + 1 = p × q − q − p + 1 = (p − 1)(q − 1). QED In Mathematica the function EulerPhi implements the Euler’s phi function. 1.3 Modular arithmetic This section concentrates on modular arithmetic, that is, where the arithmetic operations are defined modulo n. Modular arithmetic was first introduced by the German mathematician C. Gauss in 1801. 1.3.1 Congruence modulo n Herein we introduce the congruence modulo n relation, for a positive integer n, and some of its properties. Definition 1.3.1 Given a positive integer n, the congruence modulo n relation is the binary relation =n on Z such that a =n b whenever mod(a, n) = mod(b, n). Whenever a =n b we say that a and b are congruent modulo n.  25 1.3. MODULAR ARITHMETIC Let us present some examples. Example 1.3.2 For instance, • 5 =5 15 since mod(5, 5) = mod(15, 5) = 0; • 13 =5 28 since mod(13, 5) = mod(28, 5) = 3; • 11 6=5 32 since mod(11, 5) = 1 6= 2 = mod(32, 5). ◭ It is straightforward to conclude that a =n mod(a, n) and that a =n b if and only if a − b is a multiple of n (Exercise 9 in Section 1.5). Given integers a and b and a positive integer n, the function congrMod in Figure 1.7 returns True if a and b are congruent modulo n and False otherwise. It uses the Mathematica function Mod to compute the integer remainder of the division of two integers. 
congrMod=Function[{a,b,n},Mod[a,n]==Mod[b,n]]; Figure 1.7: Congruence modulo n in Mathematica The relation =n is an equivalence relation, that is, it is reflexive, symmetric and transitive (Exercise 8 in Section 1.5). The next result relates the congruences modulo m and n with the congruence modulo mn. Proposition 1.3.3 Let a, b, m, n ∈ Z with m, n > 0. If m and n are coprime, then a =mn b if and only if a =m b and a =n b. Proof: (→) Assume a =mn b. Using 2 of Proposition 1.2.4, a − b = kmn for some k ∈ Z and therefore both a =m b and a =n b. (←) Assume that a =m b and a =n b. The result is straightforward if m or n is equal to 1. Using the reflexivity of =n , the result is also immediate if a = b. (1) Let m, n > 1 and a > b. Then, ′ ′′ r r Y Y ′′ ′ ′ e′i ′′ a−b=k × (pi ) = k × (p′′i )ei i=1 i=1 A congruence relation on a given set equipped with some operations is an equivalence relation on that set that preserves these operations. We have that ′ r Y ′ ′ a−b=k × (p′i )ei > 1 i=1 and so the factorization of a − b into prime numbers is unique. (ii) a × b =n a′ × b′ whenever a =n a′ and b =n b′ . Proposition 1. (2) Let m. Thus.3. The relation =n is a congruence relation with respect to the usual operations of sum and multiplication in Z.2. Proof: Let a =n a′ and b =n b′ . We can conclude in a similar way that mn divides b − a and therefore b =mn a. we have (a×b)−(a′ ×b′ ) = ((a′ +k1 ×n)×(b′ +k2 ×n))−(a′ ×b′ ) = (k1 k2 n+k1 b′ +k2a′ )×n . By the symmetry of =n we then get a =mn b. on one hand.5). Since m and n are coprime. Hence. a − b = k × m × n and therefore mn divides a − b. with respect to (ii). b. for any a. n ∈ Z and with n > 0. thus establishing (i). (a + b) − (a′ + b′ ) = (a − a′ ) + (b − b′ ) = k1 × n + k2 × n = (k1 + k2 ) × n and therefore a + b =n a′ + b′ . we conclude that r′ r ′′ Y Y ′′ ′ e′i a−b=k× (pi ) × (p′′i )ei i=1 i=1 for some k ∈ Z.4 The equivalence relation =n is a congruence with respect to the usual operations of sum (+) and multiplication (×). k2 ∈ Z. a − a′ = k1 × n and b − b′ = k2 × n for some k1 .19. As a consequence. k ′′ ∈ Z. a =mn b.3 easily extends to the case where two integers are congruent modulo a product of several pairwise coprime positive integers (Exercise 10 in Section 1. Then. n > 1 and a < b. p′i 6= p′′j for all 1 ≤ i ≤ r ′ and 1 ≤ j ≤ r ′′ . by 2 of Proposition 1. On the other hand. QED The result stated in Proposition 1. Thus. respectively. where ri=1 (p′i )ei and ri=1 (p′′i )ei are the factorizations of m and n into prime numbers.26 CHAPTER 1. That is: (i) a + b =n a′ + b′ whenever a =n a′ and b =n b′ . MODULAR CONGRUENCES Q ′′ Q′ ′′ ′ for some k ′ . Recall that =n is an equivalence relation.3. When no confusion arises we can refer to Zn as the set {0. all the integers of the form kn for some k ∈ Z.. −n [a]n = [n − a]n The set Zn equipped with this operations has several important algebraic properties that are studied in Section 1. In the general case.3.2. We can also consider the unary operation −n on Zn . that is. we can consider the equivalence class [a]n = {x ∈ Z : x =n a} induced by =n for each a ∈ Z. [1]n . The set of all this equivalence classes is Z/ =n the quotient set of Z by =n . The class [1]n consists of all the integers that have remainder 1 when divided by n. This set is often denoted by Zn . Given that =n is a congruence relation with respect to the usual sum and multiplication of integer numbers. . [n − 1]n The class [0]n consists of all the integers that have remainder 0 when divided by n.. n − 1} for a simplified notation. 
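As an aside (the specific numbers are ours), the congruence properties established above can be spot-checked with the function congrMod of Figure 1.7:

congrMod[13, 28, 5]                                                 (* True: 13 =5 28 *)
congrMod[7, 22, 15] == (congrMod[7, 22, 3] && congrMod[7, 22, 5])   (* Proposition 1.3.3 with the coprime moduli 3 and 5 *)
congrMod[13 + 11, 28 + 26, 5]                                       (* True: sums of congruent numbers are congruent *)
congrMod[13*11, 28*26, 5]                                           (* True: so are products *)

The second line remains True for any choice of the two numbers being compared, since congruence modulo 15 holds exactly when congruence modulo 3 and modulo 5 both hold.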
We begin with the following theorem.. all the integers of the form kn + r for some k ∈ Z.3. It is easy to conclude that Zn is a finite set with precisely n elements. that is. The interested reader is referred to [15]. . that we do not prove. Observe that Z1 is a singular set. that is. all the integers of the form kn + 1 for some k ∈ Z.27 1. MODULAR ARITHMETIC that is. Since =n is an equivalence relation. the classes [a]n and [b]n of two integers uniquely determine the classes [a]n +n [b]n and [a]n ×n [b]n . that is. The only element of this set is the class [0]1 and [0]1 = Z. known as the Euler’s theorem. [a]n +n [b]n = [a + b]n and [a]n ×n [b]n = [a × b]n are well defined. . The equivalence class [a]n consists of all the integers that have the same remainder as a when divided by n. a × b =n a′ × b′ . . the n distinct equivalence classes [0]n . the class [r]n consists of all the integers that have remainder r when divided by n.. We end this section with some results involving modular congruences that will be useful later on. . QED Note that if a =n b holds then the congruences a + c =n b + c and a × c =n b × c also hold. given 0 ≤ r ≤ n − 1. the binary operations +n and ×n on Zn . Proof: (→) If n = 1 then gcd(a. n). n). n) = 1. n) is a positive integer. MODULAR CONGRUENCES Theorem 1. it holds that a × (−b) =n 1. .7 Let a ∈ Z be coprime to n ∈ N.6 Let a. n). Proof: If a is coprime to n then. Observing that −a > 0 and that gcd(a. Assume then that n > 1 and that a × b =n 1 for some b ∈ Z. n) > 1 then gcd(a × b.3. By 2 of Proposition 1. We have that aj × bi =n aj−i for all i. n) ∈ {0. QED Multiplicative orders modulo n and the Carmichael function will be useful later on in chapter 2. 1 − a × b is a multiple of n. .4 we conclude that mod(1.2. n) and therefore a × b =n 1 holds. n) = 1.  When an integer a is coprime to n then there always exists an integer b such that a × b =n 1. Hence. Then there is k ∈ N such that ak =n 1. Then aφ(n) =n 1.3. n) = gcd(mod(a × b. n) = 1 and therefore a and n are coprime. n − 1} for all k ∈ N0 . Let us consider a < 0. n) cannot be 1. that is. contradicting the above conclusion.6. n) also divides a × b. Proposition 1.5 Let a and n be coprime positive integers. . there are k1 .2. 1 − a × b = n × b′ . k2 ∈ N0 such that ak1 =n ak2 and k1 6= k2 . if gcd(a. (←) Assume that a is coprime to n. n) = (b.3. There is b ∈ Z such that a × b =n 1 if and only if a is coprime to n. n) = gcd(−a. . n) = mod(a × b. n) = gcd(1. by Proposition 1. . since gcd(a. it holds that gcd(a. Since mod(ak . By 3 of Proposition 1. and therefore a and n are coprime. b′ ) then (−a) × b =n 1. If a ≥ 0 and exteuclid(a. there is b ∈ Z such that a×b =n 1. n) = 1.28 CHAPTER 1. Hence. Hence.5). n) = (b.3. b′ ) then a × b + n × b′ = gcd(a. Proposition 1. n) = 1 Since gcd(a. we can reason as above and conclude that if exteuclid((−a). and therefore gcd(a × b. mod(a × b.6 it holds the equality gcd(a × b. j ∈ N0 such that j ≥ i (Exercise 12 in Section 1. Since (−a) × b = a × (−b). n ∈ Z with n > 0. Then ak1 × bk2 =n ak2 × bk2 . 5). that the order of a modulo n is some k ≤ φ(n) that does not divide φ(n). We can then conclude that k divides φ(n). n)k for all k ∈ N0 .3. As a consequence ak1 −k2 =n 1. Example 1.10 Let us consider n = 5. ak1 × bk2 =n ak1 −k2 assuming that k1 > k2 . k) < k.29 1. The multiplicative order of a modulo n is the least k ∈ N such that ak =n 1. Proposition 1. 
The integer a is a primitive element modulo n if there is no other b ∈ Z coprime to n such that the multiplicative order of b modulo n is greater than the multiplicative order of a modulo n.  The Carmichael function associates any positive integer n with the order of the primitive elements modulo n.9 Let a ∈ Z be coprime to n ∈ N. Since aφ(n) = ak⌊ φ(n) ⌋+mod(φ(n). Hence. 3 =5 3. k Definition 1. 32 =5 9. As an example note that mod(2k .3. QED Observe that if a is not coprime to n then the existence of k ∈ N such that a =n 1 is not ensured. 4) is either 2 or 0 for all k ∈ N (Exercise 11 in Section 1. .k) then 1 =n amod(φ(n). The order of 4 modulo 5 is 2 since 4 =5 4 and 42 =5 1. The order of a modulo n divides φ(n). We now prove that the order of any coprime to n divides φ(n). but 4 is not a primitive element modulo 5 since the order of 3 modulo 5 is 4. on one hand ak2 × bk2 =n 1 and. MODULAR ARITHMETIC Hence. Proof: The Euler’s theorem ensures that aφ(n) =n 1. the order of any a > n coprime to n is always less than or equal to the order of any a′ < n coprime to n.k) k = (ak )⌊ φ(n) ⌋ k × amod(φ(n). 33 =5 2 and 34 =5 1. by contradiction. In fact.3. Then ak =n 1 and 0 < mod(φ(n). Hence.3.  When there is no ambiguity we may omit the word multiplicative. QED Given that ak = mod(a. Assume.8 Let a ∈ Z be coprime to n ∈ N. on the other hand. Note that if a and b are primitive elements modulo n then their orders modulo n are equal. there is always a primitive number modulo n less than n when n > 1. we conclude that 2 and 3 are primitive elements modulo 5. the order of a modulo n is always less than or equal to φ(n). Noting that the order of 2 modulo 5 is also 4.k) But the above congruence contradicts the assumption that k is the order modulo n of a. −n and ×n can be implemented in Mathematica using the function Mod. −n a is equal to n − a when a 6= 0 and it is 0 otherwise. For simplicity. ×n .13 Let n be an integer greater than 2. . 0.3. Let us consider n = 5.3.10. n) × mod(b. 1. and the element 0 constitutes a ring as we shall see below.30 CHAPTER 1. n) + mod(b.3. MODULAR CONGRUENCES Definition 1.  Example 1. +.3.  1..3.15 A ring is a tuple A = (A. n) and a ×n b =n a × b =n mod(a..2 The rings Zn In this section we endow the sets Zn with some algebraic structure. Clearly.  In Mathematica the order of a modulo n and λ(n) can be computed using the functions MultiplicativeOrder and CarmichaelLambda. In particular.10. we will consider Zn = {0. The notion of ring is useful herein and also. but. n). in Chapter 3. respectively. ×) where • A is a set • + : A2 → A and × : A2 → A are binary operations on A • − : A → A is a unary operation on A . n − 1}. The following notion of quadratic residue modulo n is also useful later on in Chapter 2. The operations +n . Then the operations +n .3.3. Definition 1. n) for each 0 ≤ a. later on. ×n and −n on Zn presented at the end of Section 1. Since 1 2 =5 = 4 2 =5 1 and 22 =5 = 3 2 =5 4 we can conclude that 1 and 4 are quadratic residue modulo 5..11 The Carmichael function λ : N → N is such that λ(n) is the order modulo n of the primitive elements modulo n for all for all n ∈ N. −.3 become a +n b = mod(a + b. We have that λ(5) = 4. for instance.  Example 1.3. The set Zn equipped with the operations +n . a +n b =n a + b =n mod(a.12 Recall Example 1. Definition 1. The integer a is a quadratic residue modulo n if a is coprime to n and there is an integer x such that x2 =n a. −n . 2 and 3 are not quadratic residue modulo 5. n) a ×n b = mod(a × b.14 Recall Example 1. 
n) −n a = mod(n − a. b ≤ n − 1. 5). When A is a unitary ring we can refer to the inverse with respect to ×. ×) is a unitary commutative ring but it is not a field (Exercise 15 in Section 1.17 A field is a unitary commutative ring A = (A. ×) where the multiplicative unity is distinct from 0 and every a ∈ A\{0} has a multiplicative inverse.5). +. Definition 1. .3. −. 0. additive inverses are unique (Exercise 13 in Section 1. +.  It is straightforward to conclude that −(−a) = a for every element a of a ring. +.31 1. a2 ∈ A • a1 × a2 = a2 × a1 (commutativity of ×) The ring A is unitary if there exists an element 1 ∈ A such that • a×1= 1×a= a (multiplicative identity) for every a ∈ A. a1 . 0.16 Let A = (A. A unitary commutative ring where every nonzero element has a multiplicative inverse has a special name. whereas (Z. −. Then A is a commutative ring if for every a1 . 0. Definition 1.3. −. −.3. whenever it exists. b ∈ A is a multiplicative inverse of a ∈ A if a × b = b × a = 1. Moreover. a3 ∈ A • a1 + (a2 + a3 ) = (a1 + a2 ) + a3 • a1 + a2 = a2 + a1 • a+0=a • a + (−a) = 0 • a1 × (a2 × a3 ) = (a1 × a2 ) × a3 (associativity of +) (commutativity of +) (additive identity) (−a additive inverse of a) (associativity of ×) • a1 × (a2 + a3 ) = (a1 × a2 ) + (a1 × a3 ) (left distributivity of × over +) • (a1 + a2 ) × a3 = (a1 × a3 ) + (a2 × a3 ) (right distributivity of × over +) The set A is the carrier set of A. a2 . ×) is a field. multiplicative inverses are unique (Exercise 13 in Section 1. for every a. such that.  Clearly. Then. Such algebraic structure is a field. 0.5). In the following example we show that endowing Zn with the operation +n and ×n we obtain a unitary commutative ring.  It is easy to conclude that (R. We use a−1 to denote the multiplicative inverse a. MODULAR ARITHMETIC • 0 ∈ A. ×) be a ring. +. that mod(a × mod(b + c. the unique element of Z1 . n). −n .4. n). n). n) + mod(a × c.3. n). n) + c. n) From the distributivity of × over + and the transitivity of =n it easily follows mod(a × mod(b + c. n) and (a + b) + c =n mod(a + b. n) = mod(mod(a × b. n) = mod(mod(a + b. n) = mod(mod(a × b.18 Let n be a positive integer. n) holds. • Left distributivity of ×n over +n Let us prove the equality a ×n (b +n c) = (a ×n b) +n (a ×n c). n). c ∈ Zn . • Commutativity of +n We have a +n b = mod(a + b. Hence a × (b + c) =n a × mod(b + c. n) = mod(a. b. let us prove that the the equality mod(a + mod(b + c. Then (Zn . n) and a × c =n mod(a × c. 0. n) holds as well as the congruences a × b =n mod(a × b. n) and (a × b) + (a × c) =n mod(a × b. n). Let a. n) = a. n) + c. n) + mod(a × c. ×n ) is a unitary commutative ring. n) then a + (b + c) =n a + mod(b + c. n) = mod(mod(a + b. n) holds.32 CHAPTER 1. +n . n) and b + c =n mod(b + c. n) holds. n) + mod(a × c. . MODULAR CONGRUENCES Example 1. using the fact that 0 is the additive identity in the ring of integers and Proposition 1. When n = 1 the identity is 0. 1 is the multiplicative identity. it is straightforward to conclude that mod(a + mod(b + c. that is. Since a + b =n mod(a + b. n) = b +n a. n). that is.2. • Associativity of +n Let us prove that a +n (b +n c) = (a +n b) +n c. n) + c Using the associativity of + and the transitivity of =n . n) = mod(b + a. The congruence b + c =n mod(b + c. using the commutativity of + • 0 additive identity We have a +n 0 = mod(a + 0. If n > 1. n). −5 .19 Recall that Z5 = {0. The reference to n can be omitted if no ambiguity arises. 3 also has multiplicative inverse and 3−5 1 is 2. Hence. +5 . 
Proving that −n a is the additive inverse of a is also easy and it is left as an exercise to the reader. the case 1 ×n a = a is similar. n) = mod(a.  The elements of Zn that have multiplicative inverse are precisely the elements of Zn that are coprime to n.3. for n > 1 We have a ×n 1 = mod(a × 1. +6 . 0. 6) = 1 Only 1 and 5 have multiplicative inverses. 3. when considering the ring (Zn . In the ring (Z5 . +n . The proofs of the associativity and commutativity of ×n are similar to the ones for +n . 5) = 1 Clearly. also the reference to n can be omitted in the additive inverse −n a. ×5 ) is also a field. 1. MODULAR ARITHMETIC 33 • 1 multiplicative identity.  Example 1. 2. +6 . .  For simplicity. 5}. Right distributivity follows from the left distributivity and the commutativity of +n and ×n .20 Recall that Z6 = {0. the additive inverse of 4. ×6 ): • −6 2. 0. is 3 • −5 4. 0. the additive inverse of 3. Only 0 has no multiplicative inverse. using the fact that 1 is the multiplicative identity in the ring of integers and Proposition 1. 4}. (Z5 . or simply a−1 .3. A element a ∈ Zn with multiplicative inverse is also said to be a unit of Zn and the corresponding inverse is denoted by a−n 1 . −5 . In the ring (Z6 . the additive inverse of 2. 4. the ring (Z6 . −n . −6 ×6 ) is not a field.2.6. the additive inverse of 2. is 4 • −6 3. Hence.3. is 1 • 1 has multiplicative inverse and 1−5 1 is 1 since 1 ×5 1 = 1 • 2 has multiplicative inverse and 2−5 1 is 3 since 2 ×5 3 = mod(2 × 3. is 3 • 1 has multiplicative inverse and 1−6 1 is 1 since 1 ×6 1 = 1 • 5 has multiplicative inverse and 5−6 1 is 5 since 5 ×6 5 = mod(5 × 5. These is a consequence of Proposition 1. Example 1. 0.4. 2.3. Similarly. +5 . ×5 ): • −5 2. 3. −6 . Clearly. 5) = 1 • 4 has multiplicative inverse and 4−5 1 is 4 since 4 ×5 4 = mod(4 × 4. n) = a. ×n ) we often just refer to the ring Zn .1. 1. 0. If a ∈ Zn has multiplicative inverse in Zn and exteuclid(a. Then Zn is a field. n) ≤ n − 1 holds. ◭ A simple corollary of Proposition 1. 42 (= 61 − 19). then a × b =n a × mod(b.3. 61) = (−19. (←) Let a ∈ Zn be coprime to n. 5) the multiplicative inverse of 16 in Z61 is mod(−19. We can also get the multiplicative inverse of 16 in Z61 looking at −19 as the additive inverse of 19 in Z61 . n). n).6. QED In some situations it is useful to consider an extension of the notion of multiplicative inverse in Zn . that is. n). By 3 of Proposition 1. Proof: If a ∈ Zn \{0} then 0 < a < n. n) and therefore a × mod(b.3.6 and 1. n) = 1 since 1 ≤ n. n) = (c. b ∈ Z.21 a has multiplicative inverse in Zn . Since exteuclid(16. that is. d) then mod(c.2.5).22 Consider the ring Z61 . Thus. Proof: (→) By Proposition 1.3. Note that 0 ≤ b ≤ n − 1 may not hold. if n is prime then Zn is a field.21 Let n be a positive integer. Corollary 1. 61) = 42. Then a ∈ Zn has multiplicative inverse if and only if a is coprime to n. we say that b is a multiplicative inverse of a modulo n whenever a × b =n 1.19 and Proposition 1. The multiplicative inverse of a is then mod(b. By Proposition 1.23 Let n be a prime number. n) = mod(1. Given a positive integer n and a.34 CHAPTER 1. a ×n mod(b.3. But 0 ≤ mod(b.6. Example 1. Given that b =n mod(b. .21 suggest an algorithm for computing inverses in Zn using the extended Euclid’s algorithm. n) is that multiplicative inverse (Exercise 17a in Section 1. MODULAR CONGRUENCES Proposition 1.3. QED The proofs of Propositions 1. Since 61 is prime all nonzero elements of Z61 have multiplicative inverse. that is. for instance. mod(a × mod(b. 
there is b ∈ Z such that a × b =n 1.3. n) =n 1.3. Let us compute the multiplicative inverse of 16. n).3.21 states that if n is prime all the elements of Zn apart from 0 have multiplicative inverse.3. ×3 ). +. • 0 = (0′ .1. a′′ ) × (b′ . • (a. −′′ . • (a′ . a′′ ×′′ b′′ ). Definition 1. −′′ a′′ ). 1). 2). +. b′′ ) = (a′ +′ b′ . is the ring (A. As expected. 9) = 2 and 2−9 1 = 5 then 5 is a multiplicative inverse of 20 modulo 9. b′ ) = (a +2 a′ . Furthermore. multiplicative inverses modulo n are not unique: if the integer b is a multiplicative inverse of a modulo n then the integer c is a multiplicative inverse of a modulo n if and only if c =n b.3. A ring product is a binary operation that takes to rings and returns the product of the rings. −. ×) where • Z2 × Z3 = {(0. 2)}. b) + (a′ . 0. We can use a−n 1 to denote a multiplicative inverse of a modulo n. 0). +2 . b′′ ) = (a′ ×′ b′ . a′′ +′′ b′′ ). a′′ ) + (b′ . −′ . (0.3. • −(a. +3 .3. ×′′ ) be two rings. n) in Zn is a multiplicative inverse of a modulo n (Exercise 18 in Section 1. • (a′ .5) Example 1. +′ . for all a′ ∈ A′ and a′′ ∈ A′′ .3. (0. Since mod(20. (1. if a is coprime to n then the multiplicative inverse of mod(a. As an example let us consider a = 20 and n = 9. a′′ ) = (−′ a′ . 0. −3 b). 0. 0′′ ). 1). ×′ ) and A′′ = (A′′ . MODULAR ARITHMETIC 35 By Proposition 1.  The product of rings is well defined since A′ × A′′ is indeed a ring (Exercise 20 in Section 1.25 Consider the rings (Z2 .6. denoted by A′ × A′′ . • −(a′ . To end this section we introduce the notions of ring product and ring homomorphism. 0). +′′ . 0). b) = (−2 a. −3 . b +3 b′ ). −. 0′′ . The product of A′ and A′′ . (0. A is the Cartesian product of the carrier sets of each ring. Their product is the ring (Z2 × Z3 . −2 . (1. (1. such integer b exists if and only if a and n are coprime and from its proof it follows that the extended Euclid’s algorithm computes a multiplicative inverse of a modulo n.24 Let A′ = (A′ . 0′ .5). ×2 ) and (Z3 . that is. ×) where • A = A′ × A′′ . . ×2 ) and (Z3 . −′ . ×) to the ring (R. −. denote by h : A → A′ is a map h : A → A′ such that • h(a + b) = h(a) +′ h(b) • h(0) = 0′ • h(−a) = −′ h(a) • h(a × b) = h(a) ×′ h(b) for every a.  In Exercise 22 in Section 1. +. (a. The map h : Z → Z such that h(z) = −z is not an homomorphism from (Z. 0. +. the reader is asked to present an homomorphism from the ring (Z. −. ◭ It is important to relate rings to each other. An homomorphism from A to A′ . 0. 0′ . ×) to (Z.28 Consider the ring (Z. −3 . +. 0). ×). Such relationship is called ring homomorphism. ×) to (Z3 . −. +2 . +′ . −3 b)) = −3 b = −3 h((a. b ×3 b′ )) = b ×3 b′ = h((a. b′ )) = h((a ×2 a′ . (0. Definition 1. hence not bijective. An homomorphism h : A → A′ is an isomorphism whenever h is a bijection.3. −. 0. 0). 0. ×′ ) be two rings.3. b). In fact. 0. +. The map h : Z2 × Z3 → Z3 such that h((a. b) + (a′ . ×). 0.  Example 1.5. It is a map between the carrier sets that preserve the operations. +. +. −. −. b)) = h((−2 a. b′ ) = (a ×2 a′ . ×3 ) and their product (Z2 × Z3 .26 Let A = (A. b) × (a′ . b)) ×′3 h((a′ . b) × (a′ . This homomorphism is not an isomorphism since h is not injective. b′ )) for every (a. ×3 ). . ×). −2 .36 CHAPTER 1. 0. −.  Example 1.27 Consider the rings (Z2 . b ∈ A. b)) • h((a. b +3 b′ )) = b +3 b′ = h((a. b′ )) = h((a +2 a′ . In fact • h((a. b′ )) • h((0. (0. 0) = 0 • h(−(a. b′ ) ∈ Z2 × Z3 . ×). 0. b)) +′3 h((a′ .3. −. b ×3 b′ ). 0. −3 . h(a × b) = −(a × b) 6= (−a) × (−b) = h(a) × h(b). +. 
b)) = b is an homomorphism from (Z2 × Z3 . +. ×) and A′ = (A′ . MODULAR CONGRUENCES • (a. +3 . +3 . . this result is useful for solving some systems of linear congruences. by 2 of Proposition 1. . we can conclude that Ni is coprime to ni . Ni is the product of all those nj with j 6= i.×nr → Zn1 × . We now state the Chinese Remainder Theorem in Proposition 1.. we only have to prove that h is injective and surjective.. . Proof: As we have stated above. . .7) is an isomorphism.. mod(x.9) . (1) h is surjective. MODULAR ARITHMETIC 1. by Proposition 1. × Znr there exists x ∈ Zn1 ×. Proposition 1.2. . N ∈ Zn1 ×. . h is a ring homomorphism. .19 and Theorem 1. Therefore..37 1. . −ni 1 xi × Ni × Ni =nj 0 for every xi ∈ Zni and 1 ≤ j ≤ r. nr be positive integers pairwise coprime.3. . × nr and Ni = N ni for all 1 ≤ i ≤ r For each 1 ≤ i ≤ r. nr )) for each x ∈ Zn1 ×. −ni 1 xi × Ni × Ni =ni xi for every xi ∈ Zni (1. × Znr (1.×nr i=1 −ni 1 where Ni is a multiplicative inverse of Ni modulo ni .×nr This map is a ring homomorphism between the rings Zn1 ×.7) such that h(x) = (mod(x.. .. .8) and. This theorem is nearly 2000 years old and was established by Chinese scholars. . . Let r be a positive integer and let n1 . on the other hand. . . xr ) ∈ Zn1 × . .29 The map (1.. .5).3. We have to prove that given (x1 . . .2.6. .3.×nr such that h(x) = (x1 . Consider the map h : Zn1 ×. j 6= i (1.16. Ni has a multiplicative inverse modulo ni . n1 ).29..3 The Chinese Remainder Theorem Herein we present the Chinese Remainder Theorem.×nr and Zn1 ×. . since ni is coprime to all such nj . xr ). In particular.3. Hence. One one hand.×Znr (Exercise 26 in Section 1.3... . Hence. Let us then consider ! r X −ni 1 x = mod (xi × Ni × Ni ). Let N = n1 × . . 38 CHAPTER 1. Thus.7) can also be proved observing that the sets Zn1 ×.2. xj = mod(x. N). using the above result. Moreover.. Let us briefly see how. h is surjective. MODULAR CONGRUENCES given that nj divides Ni for each j 6= i..5). × Znr are finite and have the same number of elements. Finally. we can proof the Chinese Remainder Theorem without using the result stated in Exercise 10.×nr b if and only if a =ni b for all 1 ≤ i ≤ r (Exercise 10 in Section 1.9). −ni 1 x =nj mod(xj × Nj × Nj . By 3 of Proposition 1. N) = mod(y. . . . y ∈ ZN such that h(x) = h(y). nj ) =nj xi ×Ni i ×Ni and using the congruence properties of the modular equality. nj ) only the term −n 1 mod(xj × Nj i × Nj . −n 1 Given that mod(xi ×Ni i ×Ni . Therefore. nr pairwise coprime. Let x. . using (1. where N = n1 × . (2) h is injective. N) = mod(y. .4. we conclude that x =nj xj that is. Given positive integers n1 .2.8) and Proposition 1. Hence. Hence.. h is injective because every surjective map between finite sets with same cardinality is necessarily injective. We have to prove that xj = mod(x. we can use the Chinese Remainder Theorem to prove this result. × nr . Then mod(x.×nr and Zn1 × . nj ) for each 1 ≤ j ≤ r. P −n 1 By (1. . nj ). QED Note that the injectivity of (1. mod(x. we can conclude that x =n j r X i=1 −ni 1 mod(xi × Ni × Ni . we have x = mod(x. nj ) is not necessarily equal to 0. in the above summation ri=1 mod(xi × Ni i × Ni . we have that a =n1 ×.. y ∈ ZN . x =n j r X i=1 −ni 1 (xi × Ni −n 1 × Ni ). nj ). ni ) = mod(y. Hence. since x. nj ) as intended. . ni) for all 1 ≤ i ≤ r and therefore. N) = y .4. . . 4. MODULAR ARITHMETIC Consider positive integers n1 . N i=1 where Ni = nNi for each 1 ≤ i ≤ r. 
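As an aside (the helper crtSolve is ours), the constructive formula used in the surjectivity part of the proof can be evaluated directly and, in recent Mathematica versions, compared with the built-in ChineseRemainder; the residues 4 and 3 with the coprime moduli 7 and 9 are the ones of Example 1.3.31 below:

crtSolve = Function[{xs, ns},
  Module[{bigN = Times @@ ns, terms},
    (* term i is xi × (inverse of Ni modulo ni) × Ni, where Ni = N/ni *)
    terms = MapThread[
      Function[{x, m}, Module[{Ni = bigN/m}, x*PowerMod[Ni, -1, m]*Ni]],
      {xs, ns}];
    Mod[Total[terms], bigN]]];
crtSolve[{4, 3}, {7, 9}]                                       (* 39: indeed 39 =7 4 and 39 =9 3 *)
crtSolve[{4, 3}, {7, 9}] == ChineseRemainder[{4, 3}, {7, 9}]   (* True *)

This matches the value s = mod(4 × 9 × 4 + 3 × 7 × 4, 63) = 39 computed by hand in Example 1.3.31 below.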
The following corollary shows how to use the Chinese Remainder Theorem for solving some systems of linear congruences. using 2 of Proposition 1. nr are positive integers pairwise coprime. . Hence. since h is injective. kr ∈ Z and n1 . . .2. Therefore.3. that is. Since. by the congruence properties it is easy to conclude that given x ∈ Z.3. that is. Conversely. . . . is a solution of the system. . ni ) for all 1 ≤ i ≤ r. Corollary 1. from now on we only refer to the system S ′ . by 3 of Proposition 1. N) ∈ ZN . . × nr . x′ = y ′. Then. nr pairwise coprime. Proof: Let S be the given system and consider the following system S ′  ′   x =n 1 k 1 . .. x =ni y. . if x =N y then. since ! r X −ni 1 s = mod (ki × Ni × Ni ).   x = ′ nr k r where ki′ = mod(ki ..30 Consider the system of r > 1 congruences    x =n 1 k 1 . . x′ =ni y ′ for all 1 ≤ i ≤ r where x′ = mod(x. Let x. x =N y. .. . and N = n1 × . . and x ∈ Z is a solution of the system if and only if x =N s. y ∈ Z such that x =n i y for all 1 ≤ i ≤ r. .4.2.   x = nr k r where k1 .39 1. N) ∈ ZN and y ′ = mod(y. . h(x′ ) = h(y ′ ). × nr . ki′ =ni ki for all 1 ≤ i ≤ r. for all 1 ≤ i ≤ r. The system has a unique solution modulo N = n1 × . x is a solution of S if and only if x is a solution of S ′ . x − y is also a multiple of ni . . using the Chinese Remainder Theorem. Hence.. x − y is a multiple of N and therefore. using again Proposition 1. . Reasoning as in the proof of the Chinese Remainder Theorem. x =N s′ . We now prove that given x ∈ Z then x =N s′ if and only if x is a solution of (→) If x =N s′ then. kr′ ) ∈ Zn1 × . there exists a s′ ∈ ZN such that s′ =ni ki′ for all 1 ≤ i ≤ r (1. by the Chinese Remainder Theorem there exists s′ ∈ ZN such that h(s′ ) = (mod(s′ . x − s′ is a multiple of N. s =N s′ as intended. i=1 S ′. n1 ). . then x =ni ki′ for all 1 ≤ i ≤ r. that is. we know that the modular equations (1. Considering x′ = mod(x. . To prove this equivalence. . . s =nj kj and s for all 1 ≤ j ≤ r.10) hold for ! r X 1 − n s′ = mod (ki′ × Ni i × Ni ). Therefore. . Then. Given that x′ ∈ ZN . we only have to prove that s =N s′ . . We finally prove that given x ∈ Z then x =N s if and only if x is a solution of S ′ . from the above unicity of s′ we conclude that s′ = x′ . Hence. . we conclude that s =nj s′ for all 1 ≤ i ≤ r. we can conclude that for each 1 ≤ j ≤ r ! r X −n 1 s =nj mod (ki × Ni i × Ni ). nr )) = (k1′ .3.40 CHAPTER 1.2. . . thus a multiple of ni for all 1 ≤ i ≤ r. and therefore x =ni s′ for all 1 ≤ i ≤ r. . . × Znr . by 2 of Proposition 1. x′ =ni ki′ for all 1 ≤ i ≤ r. . Hence. Given that kj′ =nj kj for all 1 ≤ i ≤ r.5. mod(s′ .10). we have x =N x′ and. N i=1 ′ ! =nj kj′ =nj kj′ Hence. kr′ ) that is. QED Example 1.2.10) Recalling the proof of the Chinese Remainder Theorem. (←) If x is a solution of S ′ . MODULAR CONGRUENCES Given (k1′ .31 Consider the following system of congruences (or modular equations) . . x′ =ni x for all 1 ≤ i ≤ r. x =ni ki′ for all 1 ≤ i ≤ r. using the result stated in the Exercise 10 in Section 1. using (1. N =nj kj i=1 and s′ =n j r X −n 1 mod (ki′ × Ni i × Ni ). N .4. x is a solution of S ′ . N).4. Hence. given that 6−7 1 = 6 since 6 × 6 = 36 =7 1.3. 6−7 1 .4.3. to obtain an equivalent congruence in the intended form. 6x =7 3 Since 6 is coprime to 7.3. MODULAR ARITHMETIC  13x + 1 =7 4 −4x − 2 =9 −5 We want to find all the integer solutions of this system using the Chinese Remainder Theorem (Corollary 1.41 1. 
(i) We first transform the given system into an equivalent one where each congruence ax + b =k c is replaced by a congruence of the form x =k c′ .21. 13x + 1 − 1 =7 4 − 1 that is 13x =7 3 Given that 13 =7 6 then 13x =7 6x. by Proposition 1. we now have the system . Let us first consider the congruence 13x + 1 =7 4. We can use its inverse.3.30). Hence. using the congruence properties of =7 . taking again into account the congruence properties of =7 : 6−7 1 × 6x =7 6−7 1 × 3 that is 1 × x =7 6 − 7 1 × 3 and. we finally obtain x =7 18 Considering now the second congruence and reasoning in a similar way we have −4x − 2 =9 −5 ⇔ ⇔ ⇔ ⇔ ⇔ ⇔ 4x − 2 + 2 =9 −5 + 2 −4x =9 −3 4x =9 3 4−9 1 × 4x =9 4−9 1 × 3 x =9 7 × 3 x =9 21 Recall that inverses in Zn can be computed using the extended Euclid’s algorithm. by Proposition 1. it has multiplicative inverse in Z7 . Since −1 =7 −1. . the function isoCRT in Figure 1. MODULAR CONGRUENCES  x =7 18 x =9 21 Although we could already use Corollary 1. we get the equivalent system  x =7 4 x =9 3 (ii) Since 7 and 9 are coprime we can use Corollary 1. . . We refer the interested reader to. ◭ This technique for solving systems of congruences can be extended to cases where the ni ’s are not necessarily pairwise coprime.3. we prove that when Bob decrypts the ciphered text sent by Alice he obtains the original message. isoCRT returns an error message. 63) = 39 (iii) Finally. Assuming n1 = 7 and n2 = 9. × nr or the ni ’s are not pairwise coprime. Given a positive integer n. nr } of positive integers and a in Zn . .22).b) ◦ u(n.8 returns the image of a by the isomorphism of the Chinese Remainder Theorem. It first checks if elements of w are pairwise coprime using the Mathematica function GCD. 1.30 to find the solutions. a list w = {n1 . then k1 = 4 k2 = 3 N = 7 × 9 = 63 N1 = 63 7 =9 N2 = 63 9 =7 Hence.a) = idZn . described in Section 1.42 CHAPTER 1. Then it computes each component of the image of a.3.9 where φ(n) = (p − 1)(q − 1) (see Proposition 1. for instance. . [15]. . Recall the RSA cryptosystem in Figure 1.30.29). If n 6= n1 × . .4 RSA revisited At the light of the results presented in this chapter we show that the RSA cryptosystem.1 is sound.3.2. 63) = mod(144 + 84. we can further simplify this system observing that 18 =7 4 and 21 =9 3. The proof applies Euler’s theorem and the Chinese Remainder Theorem (Proposition 1. We now prove that v(n. that is. Hence. the solutions of the system are the integers x such that x =63 39. given that N1−7 1 = 4 and N2−9 1 = 4 we have s = mod(4 × 9 × 4 + 3 × 7 × 4. 9: RSA cryptosystem . .w.coprime}. 1. n − 1} Public key: (n.8: Chinese remainder theorem in Mathematica Let n. a) u(n.a}.Length[w]}]]]]].b) (y) = mod(y b.w]!=n.w[[k]]].{k. Print["Error"]. j=i+1. While[i<Length[w]&&coprime. j=j+1]. n) Private key: (n.j. 2. . . b) v(n.43 1.w[[j]]]==1). Table[Mod[a.1. . φ(n)) = 1 Message space: Zn = {0. RSA REVISITED isoCRT=Function[{n. Print["Erro"]. q. If[!coprime. i=i+1]. If[Apply[Times. p. a and b be natural numbers such that n = p × q where p and q are prime numbers mod(a × b. Figure 1. coprime=True.Module[{i.4.a) (x) = mod(xa . n) Figure 1. While[j<=Length[w]&&coprime. coprime=(GCD[w[[i]]. i=1. q). p) = mod(xab . we obtain mod(v(n. p). Then.7) at Section 1. Therefore. we conclude that h(v(n. φ(n)) = 1. since p is prime. p).19. (1) We first prove that h(v(n.12) By definition of RSA cryptosystem (see Figure 1. Using the fact that u(n.a) (x))b . x and p are coprime. q) = mod(xab . Let h be the map presented in (1.a) (x)) = mod((u(n. 
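The hand computations in step (i) can be checked in Mathematica: PowerMod[a, -1, m] returns a multiplicative inverse modulo m, and the built-in ChineseRemainder anticipates the value that steps (ii) and (iii) below produce. This is only a check, not part of the method itself.

PowerMod[6, -1, 7]                  (* 6, the inverse of 6 modulo 7 used above *)
Mod[6*3, 7]                         (* 4, so x =7 18, that is, x =7 4 *)
PowerMod[4, -1, 9]                  (* 7, the inverse of 4 modulo 9 *)
Mod[7*3, 9]                         (* 3, so x =9 21, that is, x =9 3 *)
ChineseRemainder[{4, 3}, {7, 9}]    (* 39, the combined solution obtained in steps (ii)-(iii) below *)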
mod(xab .a) (x))). q)).b) (u(n.a) (x))). p and q are coprime.b) (u(n.9) we have that mod(ab. we conclude that xp−1 =p 1. From (1. p). q)) for each x ∈ Zn .3. MODULAR CONGRUENCES Proposition 1. .b) (u(n. p). p). by replacing p with q.a) (x)))) = (mod(xab . n) it is straightforward to conclude that xab =n (mod(xa .1 Consider the RSA cryptosystem with public key (n. p) = 0 = mod(xab .5). Proof: Recall that n = p × q where p and q are distinct prime numbers and therefore. mod(xab . n))b using the congruence properties of =n (Exercise 9 in Section 1.5).b) (u(n. xφ(p) =p 1. p) = mod(mod(xab . we get mod(v(n.a) (x) = mod(xa . Similarly. by 1 of Proposition 1.44 CHAPTER 1. (1.22. q)) for each x ∈ Zn . (1.2. we get (xp−1 )k(q−1) =p 1 and then xxk(p−1)(q−1) =p x. we have v(n. Since xa =n mod(xa . n) = mod(xab . a) and private key (n.2.b) (u(n.2. We start by showing that mod(x. n). by the Euler’s theorem (Theorem 1. b).4.11) (2) We now prove that h(x) = (mod(xab . n). and therefore ab = kφ(n) + 1 for some k ∈ Z. Then v(n. p) = mod(xab .b) ◦ u(n. Using 1 of Proposition 1. mod(xab . If p does not divide x. If p divides x then p also divides xab and so mod(x. Hence. using the congruence properties of =p .3.3.12). by 3 of Proposition 1. n). p).a) = idZn .a) (x))) = (mod(xab .4. 2 Consider the RSA cryptosystem and. p) = mod(xab . p). h is injective. let us assume that Bob has chosen the primes p = 13 and q = 7. Given p and q. we conclude that xxk(p−1)(q−1) = xxkφ(n) = xab and therefore x =p xab . QED We now present an example that illustrates the RSA encryption and decryption of messages. (3) From (1) and (2). by Proposition 1. Bob first picks up an element a in Z72 that has multiplicative inverse. that is. the extended Euclid’s algorithm can be used to compute its inverse b: x y mod(y.b) (u(n. we conclude that h(v(n. RSA REVISITED Using 2 of Proposition 1.3.22. p).4.4. x) 5 72 2 2 5 1 0 y x c d 14 29 −2 1 2 −2 1 2 0 2 1 0 1 − − 0 1 . that is. The proof of the equality mod(xab . In practice.a) (x))) = h(x) and therefore v(n. mod(x.45 1. mod(xab .29. h(x) = (mod(xab .2. p and q should be very large primes (with a few hundreds of digits). Hence. q) is similar. Example 1. q) = mod(x. Let us consider a = 5. just for illustration purposes. Bob can now compute n and φ(n): n = 13 × 7 = 91 and φ(91) = 12 × 6 = 72 To choose the exponents a and b. a coprime to 72. q)) for each x ∈ Zn .a) (x)) = x since.b) (u(n. to ensure that n can not be easily factorized. Then. −2). Bob gets the original message 2. 91) = mod(32 × 74 × 54763. that is. 91) = mod(32 × 74 × 163 . 91) = mod(32 × 5297. then b = 29. 91) = 32 that she sends through the channel. 91) = mod(32 × 74 × 16 × 162 . Using the Euclid’s algorithm compute (a) gcd(32. Bob’s public key is (91. 91) = mod(32 × (322 )14 .46 CHAPTER 1. When Bob receives the encrypted message 32 he decrypts it using the decryption rule associated with his private key. 91) = mod(32 × 747 . 91) = mod(32 × 74 × 16 × 74. v(91. 91) = mod(32 × 16 × 16.29) (32) = mod(3229 . Hence. 72) = (29. Then she uses Bob public key and the corresponding encryption rule and obtains the encrypted message u(91. 5) and his private key is (91. 91) = mod(32 × 74 × (742 )3 . Assume that Alice wants to send the message 2 ∈ Z91 to Bob. 91) = mod(32 × 2314 . 91) = 2 As expected. MODULAR CONGRUENCES Given that exteuclid(a. 91) = mod(32 × 102414 . ◭ 1. 91) = mod(32 × (232 )7 . 63) .5) (2) = mod(25 . φ(n)) = exteuclid(5.5 Exercises 1. 91) = mod(2. 29). 91) = mod(32 × 74. 
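The arithmetic of this example can be checked with the built-in PowerMod (PowerMod[a, -1, m] gives a multiplicative inverse and PowerMod[x, e, m] a fast modular power). The last line below checks the soundness property of Proposition 1.4.1 for this particular key pair; it is a check only, not part of the original development.

PowerMod[5, -1, 72]     (* 29, the private exponent b obtained with the extended Euclid's algorithm *)
PowerMod[2, 5, 91]      (* 32, the encryption of the message 2 *)
PowerMod[32, 29, 91]    (* 2, decryption recovers the original message *)

(* v(91,29) composed with u(91,5) is the identity on Z91, as Proposition 1.4.1 guarantees *)
Table[PowerMod[PowerMod[x, 5, 91], 29, 91], {x, 0, 90}] == Range[0, 90]    (* True *)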
Note that we repeatedly compute squares and reduced it to elements in Z91 . it fulfills the following properties. Let n be a positive integer. Prove that mod(2k . c ∈ Z: (i) a =n a (reflexivity). (b) a =n b if and only if a − b is a multiple of n. Hint: assume that all primes are smaller than n and find a contradiction with the prime factorization of n! + 1 7.. Let p be a prime number and k a positive integer. n ∈ Z with n > 0. 5. Let a. for all nonnegative integer k. 72) 2. 6. Prove that a =n1 ×. . 4) is either 2 or 0 for all k ∈ N. b. Apply the extended Euclid’s algorithm to (a) 32 and 63 (b) 81 and 22 (c) 105 and 72 3. 4. 22) (c) gcd(105. n ∈ Z with n > 0. Let a. where a. . EXERCISES 47 (b) gcd(81.2 can be extended to the case where m < 0 by requiring that 0 ≤ r < |m|. Prove the extensions of Propositions 1. Definition 1.5. Prove that (a) a =n mod(a. j ∈ N0 such that j ≥ i. . (iii) if a =n b and b =n c then a =n c (transitivity). Prove that the relation =n is an equivalence relation.2. 12. nr be positive integers pairwise coprime. . b.. Prove that if b is a multiplicative inverse of a then aj × bi =n aj−i for all i. n). 10.×nr b if and only if a =ni b for all 1 ≤ i ≤ r. 8. Let n1 . Prove the Euclid’s Theorem: there are infinitely many prime numbers.1. 11. Prove that if n = p × q where p and q are coprime then φ(n) = φ(p)φ(q).2.4 to this case.2. Let p and q be two prime numbers. Prove that a positive integer r is coprime to p × q if and only if r is not a multiple of p and r is not a multiple of q. (ii) if a =n b then b =n a (symmetry).3 and 1. b. . Prove that a positive integer r is coprime to pk if and only if r is not a multiple of p. that is. 9. (c) if a =n b then ak =n bk . Prove that f (a) = a−n 1 is a well defined map from I. +. ×) is a unitary commutative ring but it is not a field. ×) is a field. b. 0. +. n) is the multiplicative inverse of a in Zn . a is coprime to n and b is a multiplicative inverse of a modulo n. b. −. (d) if a × b = 1 and a × c = 1 then b = c (multiplicative inverses are unique). (b) (R. (h) if there is y ∈ A such that y + x = x + y = 0 for all x ∈ A then y = 0. (b) c + a = c + b if and only if a = b. 20. Let (A. n ∈ Z such that n > 0. n) = (c. −. 18. b. c ∈ A such that a 6= 0. c. 0. ×) be a field. (g) (−a) × b = −(a × b). Let A′ and A′′ be two rings. Prove that f is a Q I to 2 bijection and use this fact to conclude that a∈I a =n 1. (e) (a−1 )−1 = a. 0. a × b = a × c and b 6= c. (a) Prove that if exteuclid(a. c ∈ A. Let (A. −. MODULAR CONGRUENCES 13. (b) mod(a. Prove that there exists a ring (A. ×) be a ring and a. Prove that A′ × A′′ is a ring. c ∈ A. +. Prove that (a) c is a multiplicative inverse of a modulo n if and only if c =n b. 14. Prove that (a) if a + b = 0 then b = −a (additive inverses are unique). 0. +. d) then mod(c. (d) Compute the multiplicative inverses of 32 and 45 in Z63 . +. −.48 CHAPTER 1. 19. 16. 15. (f) 0 × a = a × 0 = 0. . ×) and a. −. Prove that (a) (Z. Prove that if a 6= 0 and a × b = a × c then b = c for all a. n)−n 1 is a multiplicative inverse of a modulo n. (b) Compute the multiplicative inverses of 3 and 5 in Z13 . (c) −(−a) = a. 0. (c) Compute the multiplicative inverses of 18 and 22 in Z35 . Let n be a positive integer and let a ∈ Zn have multiplicative inverse in Zn . b. 17. Let a. Let n be a positive integer and I = {a ∈ Zn : a is coprime to n}. 25.5.7) is a ring homomorphism. 0. .. Do these properties also hold for ring homomorphisms? 26. −. Prove that h : Z → Zn such that h(a) = mod(a. 
Consider the properties of ring isomorphisms stated in Exercise 24. . .. (b) A has a multiplicative inverse in A if and only if h(a) has a multiplicative inverse in A′ . xr ) ∈ Zn1 × . 23.. Show that proving the surjectivity of the map (1. . Find all the integer solutions of the following systems of congruences  3x − 2 =7 4 (a) 13x =9 −2 (b)  2x + 4 =9 −1 12x − 2 =5 6   5x + 10 =9 −1 (c) 5x − 4 =7 6  4x − 2 =5 6   3x + 1 =7 10 (d) 4x − 2 =9 −3  x + 3 =4 1 . Prove that (a) h(an ) = h(a)n for every a in the carrier set and n ∈ N. +. × Znr the system of congruences has a solution in Zn1 ×. ×). 0. EXERCISES 21. 27.. Present a ring homomorphism from (Z.   x = nr xr 28. Prove that (a) A is unitary if and only if A′ is unitary. +. Prove that the map (1. 22. ×i is a ring. Let h : A → A′ be a ring isomomorphism. 0. . (b) h{a : h(a) = 0′ }.7) amounts to prove that given any (x1 . −.    x =n1 x1 . ×) to (R. −. n) is a ring homomorphism. Let h : A → A′ be a ring homomorphism.×nr . .49 1. 24. +. start by assigning to the result variable the value abk and then do a cycle from k − 1 until 0 such that in the i-iteration.) 35. 36. Choose an appropriate public key and a corresponding private key. Assume that the RSA cryptographic system is being used with Zn as the message space and (n. Assume that the RSA cryptographic system is being used with Zn as the message space and (n. bk−1 . 33. Hint: assuming that b is the exponent and that its binary representation is {bk . a) corresponding to the public key (n. Show that if a and the prime factors of n are known then it is possible to obtain b. Consider the cryptographic system RSA with prime numbers p = 7 and q = 11. . . (This explains the reason why factoring is considered to be the Achilles’ heel of the RSA. a) as the public key. Compute the corresponding private key and decrypt 9 to obtain the original message x. and public key (143. then the private key can be found in polynomial time. u can be inverted in polynomial time. otherwise square the result variable and reduce it modulo n. 32. 7). multiply it by a and reduce it modulo n. public key (33. that is. returns the last bit of x. and explain how it works encrypting the message 2 and decrypting the resulting message. The RSA cryptographic system requires a fast modular exponentiation. 3) and private key (33. b0 }. 7) and let 9 be an encrypted message. a) as the public key. if he knows φ(n). Consider the cryptographic system RSA with prime numbers p = 7 and q = 13. 31. Show that it is feasible for an attacker to know the private key (n. given u(x). . . if bk−i = 1 then square the result variable. MODULAR CONGRUENCES 29. and explain how it works encrypting the message 3 and decrypting the resulting message. b1 . Choose an appropriate public key and a corresponding private key. Develop in Mathematica an efficient algorithm for modular exponentiation using the binary representation of the exponent.7 (x) = 9. Consider the cryptographic system RSA with prime numbers p = 3 and q = 11. Confirm that u(143.50 CHAPTER 1. Consider the cryptographic system RSA with prime number p = 13. This algorithm is known as Repeated Squaring Algorithm. . Explain how it works encrypting the message 2 and decrypting the resulting message. Prove that if there exists an algorithm that in polynomial time. a) used. 34. 30. But it is not easy to get both at the same time. but it is of utmost relevance in cryptography. For many purposes.5 we propose some exercises. 
to generate strings of bits and to random generate keys in a given key space. Physical methods based in entropy theory can be used to generate sequences of numbers that can be considered close to truly random number sequences. In section 2.1 we present a motivating example related to traffic simulation.3. Blum Blum Shub generators [4] are slow but their security properties makes them suitable for cryptographic applications. Others generators. it is enough to use some suitable number sequence generating algorithms. but they are not secure enough for cryptographic applications. for instance. Blum-Blum-Shub generators are presented in in Section 2. random numbers are used to create representative real world scenarios. Suitable statistical pattern detection tests can then be used. Linear congruential generators [17]. 51 . they are used. for instance. In Section 2. In Section 2. the number sequences are completely determined by the initial value (the seed). the pseudorandom number generators. are fast and therefore useful in simulation applications. Proving a pseudo-random number generator secure is more difficult. In Section 2. In this case. for example.Chapter 2 Pseudo-random numbers In this chapter we discuss the generation of pseudo-random numbers. There are several features of pseudo-random number generators that can be measured. In cryptography.4 we revisit the traffic simulation example.2 we introduce linear congruential generators. like. Random numbers are useful in several different fields such as simulation. sampling and cryptography. A pseudo-random number generator should be fast and secure. But they can be expensive and slow for many applications. but a careful choice of the appropriate algorithms often yields useful number sequences for many applications. In simulation and sampling. they stay in the queue until the payment is done. Economics and even Sociology. This is an example of discrete event simulation. This list is also known as the pending event set. Computer simulation is then an important tool in many different fields such as Physics. it is also necessary to indicate for each event the vehicle with which it is associated. The system is represented by a sequence of events and the occurrence of an event corresponds to a change of state in the system.1 offers an intuitive and user-friendly collection of services that includes the creation of an event and the access to their different attributes. In more complex simulations more attributes may be considered. we present a traffic problem simulation example. the second one to the arrival of a vehicle to the toll gate. After arriving to the toll booth (with only one toll gate). and then they leave the road. that is. In this case there are the following three kinds of events (named according to the usual designations in queue simulation): arr (arrival). if we want to study not only the time spent at the queue but also the time spent since the vehicle enters the road until it reaches the toll queue. ess (end of self-service) and dep (departure). The problem can be described as follows. Discrete models can be used when the systems we want to study can be described by events and their consequences. Continuous models usually use differential equations that describe the evolution of relevant continuous variables. In the later case. the variables follow random laws. Simulation can also be classified as deterministic or stochastic. the events that will have to simulated. Biology. 
the first step is the identification of the kinds of relevant events in the system being considered.52 2. Chemistry. The goal is to study the evolution of the number of vehicles in the toll queue. Using suitable models we can study their behaviour and predict their evolution. The first one corresponds to the arrival of a vehicle to the toll road.1 CHAPTER 2. and the third one to the departure of a vehicle after the payment. depending on the particular application. arrival of a vehicle to the toll gate and departure after payment. In computer simulation we can use continuous or discrete models. There is a list of simulation events listing the pending events. PSEUDO-RANDOM NUMBERS Motivation: traffic simulation Many complex systems can be studied using computer simulation. Engineering. The relevant ones herein are the following: time (time of the event occurrence) and kind (the kind of the event). . Each event is characterized by several attributes. According to this technique. Herein. in terms of some given random laws of the intervals between arrivals. For example. Vehicles randomly come in to a given toll road. The Mathematica package in Figure 2. " time::usage = "time[e] returns the time of event e. The third step consists in defining procedures for simulating the observation of the random variables of the system.k}.2 includes a collection of services providing the procedure exprandom. MOTIVATION: TRAFFIC SIMULATION 53 BeginPackage["trafficSim`des`eventsP`"] eventsP::usage = "Operations on events. With respect to arrivals we assume herein that the interval of time between consecutive arrivals is a random variable following a exponential distribution with average value ba (between arrivals). we consider that the time that a vehicle takes to cross the road is a random variable following an exponential distribution with average value ss (self-service).k]: the event on time t of kind k.e[[2]]] End[] EndPackage[] Figure 2. That is. with respect to payments." kind::usage = "kind[e] returns the kind of event e. the value of 1 − e m is approximately 0. for m = 2 t and t = 6.95. Observe that in order to define the function exprandom we use the Mathematica function Random that generates a pseudo-random number in the interval [0.k}] time = Function[e." evt::usage = "evt[t.e[[1]]] kind = Function[e." Begin["`Private`"] evt = Function[{t. Recall that all random variable distributions . Finally.{t. The Mathematica package in Figure 2.1. the probability that the observed value is greater than 3 times the average value is less than 5%. 1] following an uniform distribution.2. it is assumed that the time that a vehicle takes to pay (since the beginning of the payment to its departure) is a random variable following an exponential distribution with average value st (service time). Recall that a random variable following an exponential distribution with avt erage value m has distribution function F (t) = 1 − e m . For example. This value can be interpreted as follows: the probability that the observed value is less than or equal to 3 times the average value is a little bit more than 95%.1: Mathematica package for events The second step is the definition of the random laws followed by the events. With respect to self-service. 2: Mathematica package for random numbers can be obtained from the uniform random variable. 8. c = 2 and m = 11 4. 3.2. c < m.54 CHAPTER 2. 5. 0. s0 . 4. 8. PSEUDO-RANDOM NUMBERS BeginPackage["trafficSim`des`randomnumbersP`"] randomnumbersP::usage = "Exponential random numbers. . 10. 7. 
c ∈ N0 and s0 . -(m*Log[Random[]])] End[] EndPackage[] Figure 2. We introduce linear congruential sequences and the conditions its parameters should meet in order to ensure maximum length period. . 6. 9." Begin["`Private`"] exprandom = Function[{m}. 8. Definition 2. c and m are the seed. 6. 1. 2." exprandom::usage = "exprandom[m] returns an observation of the exponential random variable with mean value m. Example 2. 5. a. 3. where m.2. All examples start with the term s1 (i) s0 = 2. 2. . the increment and the modulus of the linear congruential sequence. 7. 4. 0.1 A linear congruential sequence is a sequence {sn }n∈N0 such that sn+1 = mod(asn + c. a. The parameters s0 . a. 10. respectively. the multiplier. 1. So it is of utmost importance to know how to obtain uniform pseudo-random generators in order to get other types of generators.  A linear congruential sequence {sn }n∈N0 is a sequence of elements of Zm for some positive integer m.2 Linear congruential generators In this section we present linear congruential generators. 10.2 Let us look at several consecutives terms of some linear congruential sequences {sn }n∈N0 . m) for all n ∈ N0 . 2. a = 1. 1. 6. 9. . 6. . 2. 0. 3. (iii) s0 = 2. 5. a = 5. . . LINEAR CONGRUENTIAL GENERATORS 55 (ii) s0 = 2. 6. 0. c = 5 and m = 9 1. 3. . 4. 12. . 7. can say that some sequences look more random than others. 5. .  The following proposition states that the seed indeed occurs more than once in a linear congruential sequence whenever the multiplier and the modulus are coprime. 12. 2. In fact. (iv) s0 = 2. 3. Then there is k ∈ N such that sk = s0 . Proof: First of all note that for i. 8. Using the word “random” in a rather informal way. 8.2. k ∈ N0 such that si+k = si and. 0. . 2. 7. 12. 8. 1. m) = si−1 and mod(sj−1 . 1. 8. 4. c = 2 and m = 12 then we have 8. a has an inverse a−1 in Zm . 25) = mod(62. 3. 7. 7. recalling Definition 2. 2. 2 . QED In any linear congruential sequence {sn }n∈N0 there is a finite sequence of numbers that is repeated an infinite number of times. that is. 2. 7. c = 2 and m = 11 then we have 6. 7. 8. In fact. 2. 8. Hence. it may be the case that not all the elements of Zm occur in the sequence. 4. 5.2. . 1. 0. Such a k always exists because every term of s is in Zm and this set is finite. m) = sj−1. Assume that k ∈ N is the least index such that sk = si for some 0 ≤ i < k. 2. 4. j > 0 if si = sj then si−1 = sj−1. 3. Since a and m are coprimes.1. 12. 12. there are i. 5. The last sequence is even constant for n ≥ 1. 2. 2. 2. a = 2. 8. 2. . 1. 8.2. 12. 12. since mod(5 × 12 + 2. 2. 0.2. 8. 5. 12. 6. 8. 8. we get si−1 = sj−1. 10. since the set Zm is finite and each term uniquely determines the next one. 5. 12. a = 3. 12. a = 7. 4. 1. 8. 6. c = 2 and m = 25 then we have 12. 2. 12. 10. It can not be the case that i ≥ 1 since then si−1 = sk−1 and therefore k would not be the least index satisfying the conditions above. Note that the seed may or may not occur again in the sequence as a term of index greater than 0. i = 0 and we conclude that sk = si . 4. . 12. 3. 8. multiplying the left and the right hand sides by a−1 and noting that mod(si−1 . once si+k = si . the terms following si+k are exactly the same following si . 1. we have asi−1 + c =m asj−1 + c. 2. Proposition 2. Moreover. . 8. 12. 8. (v) s0 = 2. asi−1 =m asj−1 .3 Let {sn }n∈N0 be a linear congruential sequence such that its multiplier a and its modulus m are coprime. Hence. . 25) = 12. 6. 12. 
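A linear congruential sequence as in Definition 2.2.1 can be generated with a one-line Mathematica function; the sketch below is only an illustration and the helper name lcs is ours. The two calls use parameter sets appearing in the examples above.

lcs = Function[{s0, a, c, m, len},                      (* the first len+1 terms s0, s1, ..., slen *)
  NestList[Function[s, Mod[a*s + c, m]], s0, len]];

lcs[2, 1, 2, 11, 11]    (* the a = 1 example: 2, 4, 6, 8, 10, 1, 3, 5, 7, 9, 0, 2 *)
lcs[2, 5, 2, 25, 5]     (* the constant example (a = 5, c = 2, m = 25): 2, 12, 12, 12, 12, 12 *)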
PSEUDO-RANDOM NUMBERS The period length of a linear congruential sequence {sn }n∈N0 is the least k ∈ N such that there is i ∈ N0 such that si+k = si .6 Recall again the linear congruential sequences presented in Example 2. Finally.5. The sequence presented in (ii) also satisfies those conditions.2. Example 2. For the proof of Proposition 2.5 we refer the reader to [17].56 CHAPTER 2. In the sequence presented in (iii).4 Recall the linear congruential sequences presented in Example 2. Example 2.2. Similarly. The sequence s has period length m if and only if all the following conditions hold (i) c and m are coprime. The sequence presented in (iii) has a period of length 10 (note that 9 does not occur in the sequence). Observe that a − 1 = 0 and 0 a multiple of any integer. establishes several conditions on the parameters that ensure maximum period length. although c and m are coprime. a − 1 = 1 and 1 is not a multiple of 11 (note that 11 the only prime number that divides m in this case) and therefore the sequence does not verify the conditions of Proposition 2.  The choice of the multiplier m determines an upper bound for the period of a linear congruential sequence.2. increment c and modulus m. The sequence presented in (i) satisfies the conditions of Proposition 2.5.  Note that the seed is not relevant to ensure maximum period length. the period is always less than or equal to the modulus m of s and therefore we can always determine the period of s after computing m + 1 terms at the most. The maximum period lentght of a linear congruential sequence is the modulus m of that sequence.2. c and m are the multiplier. (iii) a − 1 is a multiple of 4 whenever m is a multiple of 4.2.2. The period length of the sequence presented in (i) is 11.2. the increment and the modulus of s respectively. the period length of the sequence presented in (iv) is 2 and the period length of the sequence presented in (v) is 1. known as the Maximum Period Length Theorem. the sequence presented in (ii) also has maximum period length (9 in this case). It is the maximum period lenght. Proposition 2. Note that m is not a multiple of 4 and therefore a−1 is not required to be a multiple of 4.2. The others parameters have to be carefully chosen in order to ensure that the period is as long as possible.2. (ii) a − 1 is a multiple of every prime divisor of m. for instance.  . where a.2. The following proposition.5 Let {sn }n∈N0 be linear congruential sequence with multiplier a. Clearly. m) and therefore once si = 0 for some i ∈ N0 then sj = 0 for all j > i. the period length is at most φ(m).2.11 that λ(m) is the order (modulo m) of the primitive elements modulo m.2. informally speaking. a period of length m is no longer advisable since sn+1 = mod(asn .2. The Mathematica function MCL in Figure 2.3.3: Linear congruential sequence in Mathematica When computation time is an issue we often consider linear congruential sequences {sn }n∈N0 with c = 0. Another relevant fact in this case is that given a common divisor d ∈ N of si and m we have j as k  j as k  i i si+1 = mod(asi . λ(m) is the maximum possible period length of any linear congruential sequence with modulus m and increment 0.20). MCL=Function[{}. has maximum period length but. where φ is the Euler function (see Definition 1. s0 .2.3. m) = asi − m = d ksi − k′ m m for some integers 0 < k. that is.m]]. computes the terms of linear congruential sequences.7 Let {sn }n∈N0 be a linear congruential sequence with multiplier a. 
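The period length and the three conditions of the theorem can be checked mechanically. The sketch below is a naive implementation (the names period and maxPeriodQ are ours); period simply iterates until a state repeats, and maxPeriodQ tests conditions (i)-(iii).

period = Function[{s0, a, c, m},
  Module[{seen = {}, s = s0},
    While[!MemberQ[seen, s], AppendTo[seen, s]; s = Mod[a*s + c, m]];
    Length[seen] - Position[seen, s][[1, 1]] + 1]];     (* distance back to the repeated state *)

maxPeriodQ = Function[{a, c, m},
  CoprimeQ[c, m] &&
  And @@ Map[Function[p, Mod[a - 1, p] == 0], FactorInteger[m][[All, 1]]] &&
  (Mod[m, 4] != 0 || Mod[a - 1, 4] == 0)];

period[2, 1, 2, 11]        (* 11 = m, the maximum *)
maxPeriodQ[1, 2, 11]       (* True *)
period[2, 5, 2, 25]        (* 1: the constant sequence *)
maxPeriodQ[5, 2, 25]       (* False: a - 1 = 4 is not a multiple of 5 *)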
The following proposition establishes some conditions that ensure the maximum possible period length when the increment is 0. we can also conclude that sj is a multiple of d for all j ≥ i. LINEAR CONGRUENTIAL GENERATORS 57 It is worth noticing that a long period is not the only requirement for a good choice of the parameters of a linear congruential sequence.2. However. The difference of two consecutive terms is almost always equal to 2 because a = 1 and therefore in most cases sn+1 − sn = c = 2. In this case the period length is at most the number of positive integers coprime to m. Moreover. si+1 a multiple of si . that is. If s0 and m are coprime and a is a primitive element modulo m then the period length of s is λ(m). modulus m and increment 0. it is not much “random” since even and odd numbers alternate. k ′ ≤ m.s=Mod[a*s+c. c and m. for instance. assuming that suitable integer values have already being assigned to the variables a . Moreover. Figure 2.2. The sequence (i) in Example 2. Proposition 2. For the proof we refer the reader to [17].  . in this case. It is then advisable to ensure that m and sn are coprime for all n ∈ N0 . Recall from Definition 1. Note that in f (s0 ) we are assuming the binary representation of the seed of s that. .10 that 2 is a primitive element modulo 5. We can the define a (j. the sequence s satisfies the conditions of Proposition 2.2.9 Let j. k ∈ N such that k > j. the string f (r) is the bit string generated by r. k)linear congruential generator as follows: given j = 1 + ⌊log2 m⌋ and j < k < m.58 CHAPTER 2.3.  As we have already remarked in Section 2. we define bit generator functions. Linear congruential sequences allow us to obtain such sequences of pseudo-random numbers in [0. it is of utmost importance to have uniform pseudo-random generators in order to get other types of generators. Definition 2.3. . Let s = (sn )n∈N0 be a linear congruential sequence with modulus m. Recall from Example 1. k is obtained as a polynomial function of j. 1] as follows [17]: given a linear congruential sequence (sn )n∈N0 with modulus m we just consider the sequence u = (un )n∈N0 where sn un = m for each n ∈ N0 . zk ) where zi = mod(si .2 with respect to the generation of pseudo-random numbers for simulation purposes. . then f : Zj2 → Zk2 is such that f (s0 ) = (z1 .7 and therefore it has period λ(5). given j. A (j. k)-bit generator is a functiom f : Zj2 → Zk2 that can be computed in polynomial time with respect to j. First.2. z2 .8 Let us consider a linear congruential sequence {sn }n∈N0 such that m = 5. s0 = 4 and a = 2. We now describe how linear congruential sequences can be use as bit generators.  In practice. .12 that λ(5) = 4. capitalizing on the fact that every distribution can be generated from the uniform distribution in the interval [0.2. PSEUDO-RANDOM NUMBERS Example 2. has a length less than or equal to j. Since the seed and the modulus are coprime. For each r ∈ Zj2 . 2) for each 1 ≤ i ≤ k. by hypothesis. In cryptographic applications we often have to randomly generate strings of bits. 1]. Recall from Example 1. . the toll queue should be inspected. each time a departure event occurs. Finally. as well as. vehicles in the toll queue. can be immediately placed in the pending event list. Note that the events are kept ordered in the list according to its time attribute."dep"]. If it is empty then the toll gate should be set to not occupied. that is.4). 
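For the multiplicative case c = 0, the period claim can be checked with the built-in MultiplicativeOrder, and the parity map gives the bit string produced by the bit generator just described. A small illustration with the m = 5, a = 2, s0 = 4 sequence of Example 2.2.8 (the variable name terms is ours):

MultiplicativeOrder[2, 5]                            (* 4 = λ(5), the period of the sequence *)
terms = NestList[Function[s, Mod[2*s, 5]], 4, 7]     (* 4, 3, 1, 2, 4, 3, 1, 2 *)
Map[Function[s, Mod[s, 2]], Rest[terms]]             (* the bits z1, ..., z7: 1, 1, 0, 0, 1, 1, 0 *)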
the event evt[ct+exprandom[st].4 Traffic simulation revisited 59 Having discussed the problem pseudo-random numbers generation we proceed herein the example of traffic simulation. First. each time a vehicle leaves the toll road after paying. The pending event list (schedule) is. where by the next event in the pending occurrence list we mean the event in the list with the least time value. BLUM-BLUM-SCHUB GENERATORS 2."dep"] can already be place in the list. A Mathematica package for the scheduling is presented in Figure 2. For example."arr"]. the central part of the simulator is the pending event list.2. In general. The answer to this question is not so simple. Otherwise the first element of the toll queue should be removed from the queue and the event for the conclusion of its payment can already be placed in the pending event list. the next function simply returns the first element of the . or the conclusion of the payment of the vehicle that is paying (dep). that is. therefore. The next event to be simulated can be the arrival of a vehicle to the road (arr). Hence. note that the list has to contain the events that we already know that have to be simulated in the future. Each time an arrival to the toll booth is simulated.3 Blum-Blum-Schub generators 2. the event evt[ct+exprandom[ba]."ess"]. that is. Another important issue is the way the pending event list is fed with new events. the arrival of the vehicle to the toll booth.3. Consider the traffic simulation assuming that there are vehicles in the road. and that have not already been simulated because their time has not come yet. its inclusion in the payment queue (ess). the event evt[ct+exprandom[ss]. since it will occur for certain. a collection of events together with some operations. the next event to be simulated is the next event in the pending event list. each time an arrival is simulated. that is. the event evt[ct+exprandom[st]. Otherwise the vehicle should begin the payment phase and we can then generate the event for the end of the payment. and a vehicle is paying in the toll gate. the arrival of the next vehicle to the road. the arrival of a vehicle already in the road to the toll booth. that is. If it is not empty then the toll queue should be incremented with another vehicle. According to the discrete event simulation technique we are using. that is. How can the simulation proceed from this situation? Note that the simulation is mainly a loop where in each step the occurrence of an event is simulated. where ct is the variable of the simulator holding the current time. it should be checked if the toll gate is empty or not. Prepend[ add[e.s}." Begin["`Private`"] empty={} next=Function[s. that is. and the variables under simulation (namely." add::usage="add[e. nss (number of clients in self-service. {e}. that is. trace (it records the simulated events and the values of some variables). state of the toll gate.s[[1]]]]]] delete=Function[s. sch (schedule). The simulator is no more than a loop." delete::usage="delete[s] removes the next event from schedule s. ct (the time of the event currently being simulated). .60 CHAPTER 2. The function add is more complicated since it inserts the received event in the right place in the list. In each step of the loop the next event in the pending event list is simulated." empty::usage="The empty schedule. s[[1]]] add=Function[{e. the number of vehicles in the toll queue). Prepend[s. the length of the toll queue) are observed.4: Mathematica package for scheduling list.5. 
ce (the current event being simulated). ck (the kind of the event currently under simulation). The simulator is presented in Figure 2. that is. the total number of vehicles in the toll).s] inserts event e into schedule s. the list and the state variables (length of the toll queue.Rest[s]]. If[s==empty." next::usage="next[s] returns the next event of schedule s. The local variables of the function sim are the following: busy (it indicates whether the toll gate is occupied or not).e]. If[time[e]<time[s[[1]]]. nwc number of waiting clients. etc) are changed accordingly to the event being simulated. tnc (total number of clients. PSEUDO-RANDOM NUMBERS BeginPackage["trafficSim`des`schedulesP`"] Needs["trafficSim`des`eventsP`"] schedulesP::usage="Operations on schedules.Rest[s]] End[] EndPackage[] Figure 2. In that simulation. the number of vehicles in the toll road). nss=nss+1].PlotJoined->True]]] End[] EndPackage[] Figure 2.nss. ck=kind[ce]. sch=add[evt[ct+exprandom[ss]. and halting time ht.simDep[]]. ct=time[ce]. nwc=nwc+1.simArr[].nwc.sch. tnc=0."dep"].sch]. tnc=tnc+1."arr"]. While[ct<=ht.st."dep". nwc=0.sch].{ct. busy=1]]. nss=0. simArr=Function[{}.2. sch=add[evt[ct+exprandom[st]." Begin["`Private`"] sim=Function[{ba. ce=next[sch]. trace=Append[trace.nwc}].5: Mathematica package for the simulation . simDep=Function[{}.ht}.4."arr"]."dep"]. ck=kind[ce]. nss=nss-1. ce=evt[exprandom[ba]. nwc=nwc-1. Module[{busy."ess"]. busy=0.ss."arr". sch=add[evt[ct+exprandom[ba]. If[nwc==0.ck}. sch=add[evt[ct+exprandom[st]. simEss=Function[{}.simEss[]. If[busy==1.ct. sch=delete[sch]]. ct=time[ce].tnc." sim::usage = "sim[ba. busy=0. ListPlot[trace.sch]]]. average service time st.sch].trace. sch=empty.ht] runs the simulation with: average time between arrivals ba. TRAFFIC SIMULATION REVISITED 61 BeginPackage["trafficSim`des`simulationP`"] Needs["acs`des`eventsP`"] Needs["acs`des`schedulesP`"] Needs["acs`des`randomnumbersP`"] simulationP::usage = "Discrete event simulation. Switch[ck.st.ss. trace={}.ce. average time of selfservice ss."ess". the average interval of time between consecutive arrivals. Figure 2. is 0. is 1. In both cases. Let s = {sn }n∈N0 be a linear congruential sequence with parameters s0 . the average time a vehicle takes to pay. is 50. that ss. and that st.6 depicts the graphic corresponding to a similar simulation but 25 20 15 10 5 200 400 600 800 1000 Figure 2.6: Evolution of the toll queue length assuming ba=1. Prove that if a = 1 then sn = mod(s0 + nc.9 assuming that st is 2.6 depicts the graphic corresponding to a simulation assuming that ba. c and m. that is the value of the variable nwc. m). ss=50 and st=2 2. the average time a vehicle takes to cross a road.9. ss=50 and st=0. PSEUDO-RANDOM NUMBERS The output of the simulation is a graphic displaying the evolution of the length of the toll queue. the starting time is 0 and the halting time is 1000 400 300 200 100 200 400 600 800 1000 Figure 2.5 Exercises 1.62 CHAPTER 2. a. .7: Evolution of the toll queue length assuming ba=1. Figure 2. 20) = 1. such that m = 10e for some integer e ≥ 2. Prove that s has maximum length period if and only if mod(a. c and m. m . 5. a) Prove that s′k = mod(sk . r ∈ N. Let s = {sn }n∈N0 be a linear congruential sequence with parameters s0 . a = a′ . 6. Let s = {sn }n∈N0 be a linear congruential sequence with parameters s0 . a′ . What is the value of sre−1 ? 7. c and m. 
Determine all the primitive elements modulo 6 in Z6 and define a linear congruential sequence with maximum possible period length with modulus 6 and increment 0. c′ and m′ . Let s = {sn }n∈N0 be a linear congruential sequence with parameters s0 . c and m.5. respectively. a. b) Prove that implication in a) is an equivalence whenever e ≥ 2. m′ ) for every nonnegative integer k. c and m. a. c = c′ and m = r e and m′ = r e−1 for some e. EXERCISES 63 2. e ∈ N. Find all the values of a that satisfy the conditions of the maximum period theorem when m = 106 −1 (note that 106 −1 = 33 ×7×11×13×37). and s′0 . 4) = 1 then s has maximum length period. Define a linear congruential sequence with maximum period length with (a) m = 162 (b) m = 402 9. 8. 10. . a−1 3. Let s = {sn }n∈N0 be a linear congruential sequence with maximum period length. a) Prove that if c is odd and mod(a. Let s = {sn }n∈N0 be a linear congruential sequence with parameters s0 . a 6= 1.2. c and m and assume that m = 2e some positive integer e. b) Prove that if s has maximum period length then s′ has also maximum period length. seed s0 = 0 and modulus m = r e with r. 4. Let s = {sn }n∈N0 and s′ = {s′n }n∈N0 be linear congruential sequences with parameters s0 . and that neither 2 nor 5 divide c. a. Determine all the primitive elements modulo 5 in Z5 and define a linear congruential sequence with maximum possible period length with modulus 5 and increment 0. a. such that s0 = s′0 . Prove that for every nonnegative integer k   ak − 1 k sn+k = mod a sn + c. 4 that distinguishes between two kinds of vehicles. Define linear congruential sequences with maximum possible period length with increment 0 and (a) modulus 15. (c) modulus 402. Determine all the primitive elements modulo 8 in Z8 and define a linear congruential sequence with maximum possible period length with modulus 8 and increment 0. Determine the period length of each sequence. . such that each kind has a different average time between arrivals to the toll road. 14. 13.64 CHAPTER 2.4 that traces the average number of vehicles during a specific period of time and the maximum length of the toll queue. Make an histogram. Develop an enriched version of the simulator presented in Subsection 2. Note that 5 is a primitive element modulo 162 and 7 is a primitive element modulo 402. by each kind of vehicle. (b) modulus 162. 12. PSEUDO-RANDOM NUMBERS 11. the light vehicles and the heavy vehicles. of the number of vehicles in the poll queue during the simulation time. Develop an enriched version of the simulator presented in Subsection 2. a combinational circuit can be specified by a truth table that lists the output values for each combination of input values.4 we revisit our motivating examples and show how to use Gr¨obner bases for checking equivalence of digital circuits and for finding solutions of systems of nonlinear polynomial equations. We then introduce division of polynomials and several related results.1 Digital circuit equivalence Digital circuits are hardware components that manipulate binary information [23]. If there is only one output variable.1 we start by motivating that polynomials can be used to verify equivalence of digital circuits.1 Motivation In this section. the combinational circuit computes a Boolean function and can also be 65 . 3. They accept binary signals from the inputs and generate binary signals at the outputs. we present several key concepts and results related to polynomials. combinational circuits consist of input variables. 
we present two motivating examples using polynomial equations. In Section 3.Chapter 3 Polynomials Polynomials and polynomial equations are widely used in science and engineering. Then we illustrate the relevance of polynomials in robotics.3. Hence.5 we propose some exercises. The second one illustrates why solving systems of nonlinear equations is important in robotics. In Section 3. Herein. In Section 3. In particular. The first one is related to the validity of propositional formulas and the problem of checking the equivalence of digital circuits.1. output variables and interconnected logic gates. 3. Gr¨obner bases and their properties are presented in Section 3.2 we introduce the notion of polynomial over a field as well as the sum and product of polynomials. Outputs are determined by combining the input values using the logic operations associated with the logic gates. In Section 3. they may have more gates than strictly needed. But. It represents a robot consisting of three arm links a1 . one for each variable. of course. There are several algorithms for checking the satisfiability of propositional formulas known as SAT-algorithms [16.2 Inverse kinematics of a robot Consider the robot arm depicted in Figure 3. In Section 3. we have to ensure that the original version of the circuit. but the technique can be extended to circuits with more outputs. Hence. an hand E and a base A that supports the robot. two joints J1 and J2 . POLYNOMIALS represented by a propositional formula. the two produce the same outputs for the same inputs. or equivalently. it computes n Boolean functions. B. the resulting circuits are not always as simple/efficient as they could be. That is.1. For instance. A .4 we will see how polynomials can also be used for this purpose. the propositional formula ϕA ⇔ ϕB is valid. its negation is not satisfiable. and can be represented by n propositional formulas. There are algorithms for designing combinational circuits that compute the Boolean function(s) corresponding to a given truth table. is equivalent to the new version.66 CHAPTER 3.1: Sketch of a robot arm To study the robot movement we assume that the arm links are represented by line segments and the joints and hand by points in the 3-dimensional Euclidean space. this amounts to say that the propositional formula ϕA induced by circuit A is equivalent to the propositional formula ϕB induced by circuit B. the task of checking the equivalence of these circuits corresponds to the task of checking whether two propositional formulas are equivalent. a2 and a3 with fixed length. If there are n > 1 output variables. J2 a2 a3 E J1 a1 A Figure 3. Therefore. We consider a Cartesian coordinate system 0xyz with origin in J1 . For simplification herein we only illustrate circuit with one output. If there is only one output. A. 21]. we may have to introduce modifications in the design. That is.1. 3. However. 67 3. c) of hand E. That is. respectively.2 (the x-axis points toward the observer). Then. This task is often difficult. The coordinates can be easily computed using simple calculations involving trigonometric functions. there is only one possible solution. given the angle β at the base and the angles θ1 and θ2 of the joints. given the intended coordinates of the hand we want to determine what are the suitable angles at the base and joints for reaching that position.3. θ2 z 0 6 - θ1 (a. a2 and a3 . l2 and l3 be the length of the arm links a1 . c) y A Figure 3. Joint J2 works similarly. b. That is. b. 
in the coordinate system 0xyz above we have that    a = (l2 cos θ1 + l3 cos θ2 ) cos β b = (l2 cos θ1 + l3 cos θ2 ) sin β   c = l sin θ + l sin θ 2 1 3 2 . that the three arm links always lie in the same plane. y - 0 β x ? Figure 3. A projection on the xy-plane of the arm link a2 is depicted in Figure 3. we assume that the arm link a1 always lies in the z-axis and. Furthermore. Clearly. we assume that base A is fixed but the arm a1 might rotate around the z-axis. Let l1 . its position in space. Inverse kinematics is the reverse task (see for instance [25. The joint J1 only rotates around the axis that contains J1 and it is perpendicular to the plane containing the arms. 3]). It can be the case that no solution or more that one solution exist. The task becomes harder if the degrees of freedom of the joints increase and in some cases infinitely many solutions may exist.1. moreover. MOTIVATION projection on the yz plane of the robot arm is depicted in Figure 3.3: Projection of the arm link a2 on the xy-plane Kinematics is the task of computing the coordinates (a.2: Projection on the yz-plane For simplicity. one of the following equivalent systems  a = (l2 v2 + l3 v3 )v1      b = (l2 v2 + l3 v3 )u1     c = l u +l u 2 2 3 3 2 2  1 = u1 + v1      1 = u22 + v22    1 = u23 + v32 or  l2 v2 v1 + l3 v3 v1 − a      l2 v2 u1 + l3 v3 u1 − b     l2 u2 + l3 u3 − c  u21 + v12 − 1      u22 + v22 − 1    u23 + v32 − 1 = = = = = = 0 0 0 0 0 0 The systems have 6 variables. The last three equations express the well known Pythagorean trigonometric identity that involves the values of the sine and cosine of an angle. we have to solve this system. At the end we refer to polynomial reduction modulo a set of polynomials. where u1 = sin β. b. We can consider. . Note that taking into account the Pythagorean trigonometric identity we could also have considered only the three variables u1 . We first introduce the notion of monomial in n variables. for instance. u2 and u3 .2 Basic concepts In this section we first present the notion of polynomial (over a field) as well as sums and products of polynomials [20.3 of Chapter 1).1) solves systems of polynomial equations where each term of the polynomial involves at most one variable and the corresponding exponent is always less or equal than 1 (linear equations). In Section 3. Each vi would then be replaced by 1 − u2i and the three last equations omitted. 2. other algorithms have be considered. to determine the angles β. and ui = sin θi−1 and vi = cos θi−1 for i = 2. We have to introduce the notion of polynomial in n variables with coefficients in a field C and the associated operations of sum and product. 3. ui and vi for i = 1. In order to solve systems of nonlinear polynomial equations. One possibility consists in converting these equations into polynomial equations where the variables are the sines and cosines. The Gaussian elimination algorithm (see Section 4. θ1 and θ2 given the coordinates (a.4 we will see how Gr¨obner bases can be used for solving these systems. as the one we have above.68 CHAPTER 3. 3. c).1 Rings of polynomials Our goal is to define a particular kind of ring: a ring of polynomials. 3. v1 = cos β. 10]. Next we introduce the notion of ordered polynomial and then the division of polynomials.2. 3. Polynomials over a field together with the sum and product of polynomials constitute a ring (see Section 1. POLYNOMIALS Hence. 1 The set of monomials in the variables x1 . . +. is denoted by Mx1 . . ... that is. The . 
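As a quick numerical aside to the kinematics equations above: the forward map from angles to hand coordinates is easy to evaluate, which is useful for checking candidate solutions of the inverse problem later on. The helper name forwardKin and the sample arm lengths below are ours, purely for illustration.

forwardKin = Function[{l2, l3, beta, theta1, theta2},
  Module[{r = l2*Cos[theta1] + l3*Cos[theta2]},       (* the common factor of a and b *)
    {r*Cos[beta], r*Sin[beta], l2*Sin[theta1] + l3*Sin[theta2]}]];

N[forwardKin[1, 1, Pi/2, Pi/4, 0]]    (* approximately {0, 1.707, 0.707} *)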
.3 Let C = (C.2. we often write x21 x12 x14 instead of x21 x12 x03 x14 and 1 instead x01 x02 x03 . . .  For simplicity. A polynomial in the variables x1 . . is the integer max{deg(m) : m ∈ M and p(m) 6= 0}.. . . ... . Polynomials in C[x1 . the coefficient of the monomial m in the polynomial p. .xn → C such that p(m) = 0 for all but a finite number of elements of Mx1 . . We may even write M for Mx1 .. ..xn . Given a monomial xα1 1 . If p is such that p(m) = 0 for each monomial m then p is the zero polynomial.2. .xn ] . .. If p is a nonzero polynomial the degree of p. For instance. . .. xαnn we often omit xαi i when αi = 0 and write 1 when α1 = . The subscript is omitted when no confusion arises. . we can refer just to monomials in x1 .2. Definition 3. . The degree of the monomial xα1 1 ... . . . x2 with degree 5.. xn ]. + αn . . with n ∈ N and n > 0. = αn = 0. . 1) be a field and let n ∈ N. . xαnn where αi ∈ N0 for 1 ≤ i ≤ n. xn . xn over a field.. .69 3.. .xn and is the set of sequences xα1 1 . We can also write xi for xαi i when αi = 1.  Example 3.. denoted by deg(xα1 1 .. . ×. .  Each p(m) ∈ C is a coefficient of p.2 • x21 is a monomial in the variable x1 with degree 2. −. . . . . . . we can write x21 x2 x4 instead of x21 x12 x03 x14 . . The set of all polynomial in the variables x1 .. .. xn over C is a function p : Mx1 .xn when no confusion arises. xn ] are called univariate polynomials when n = 1.. 0. xn . xn ] is denoted by 0C[x1 . xαnn . denoted by deg(p). . We now introduce the notion of polynomial in the variables x1 . . The zero polynomial in C[x1 . . . • x21 x32 is a monomial in the variables x1 . BASIC CONCEPTS Definition 3. and multivariate polynomials when n > 1.. xαnn ) is the natural α1 + . . . xn over C is denoted by C[x1 .2. . . . . . . . . . 3x21 x22 and 1x01 x02 . . . The coefficient of 21 x31 x22 is 2 and its monomial is x31 x22 . Assuming that q is the term cxα1 1 . denoted by deg(t) is 0 if t is a zero term and is the degree of its monomial otherwise. . The subscript is again omitted when no confusion arises. • p(x01 x02 ) = 1.70 CHAPTER 3. X p(m)m. A monic term is a term whose coefficient is the multiplicative identity of C and a zero term is a term whose coefficient is the additive identity of C.4 Consider the polynomial p in Z5 [x1 . .. xn is denoted by 1C[x1. Given p in C[x1 . . xn ]. xα1 1 . The degree of a term t. the function first creates the list {c. x2 . The function degmon uses the built-in Mathematica function PolynomialMod that. . xn ] and m ∈ M . . • p(m) = 0 for all the other monomials in x1 . . . . . we say that p(m)m is a term of p. xn ] is often presented as a sum of all monomials weighted with their nonzero coefficients. • p(x21 x22 ) = 3. . and returns the degree of q. unless deg(mi ) = 0. xn over C without mention any particular polynomial in C[x1 . x0n ) = 1 and p(m) = 0 for all the other monomials in x1 . given a polynomial . . When no confusion arises. . Then it removes the coefficient c and adds the exponent of each variable.. xn ] such that p(x01 .4 receives as input a term q and positive integer n. x2 ] such that • p(x31 x22 ) = 2.. we can refer just to terms in x1 . xαmm . For simplicity. . . . . .2. {m∈M :p(m)6=0} Note that using this notation the same polynomial can be referred to in different ways. A polynomial p in C[x1 . . As expected. Moreover. Example 3.xn ] .  The Mathematica function degmon in Figure 3. 
a monomial mi can be omitted when deg(mi ) = 0 and a coefficient ci can be omitted whenever it is the multiplicative identity of C. That is. . POLYNOMIALS polynomial p in C[x1 . . . the coefficient of the term is p(m) and its monomial is m. all the conventions introduced above for monomials can also be used. . We can present p as 2x31 x22 + 3x21 x22 + 1 Observe that p has degree 5 and its terms are 2x31 x22 . xαmm }.  . The sum of p1 and p2 is the polynomial p1 + p2 in C[x1 . Let µ : {x1 . w=Apply[List. . . p=PolynomialMod[q. . . If[Head[p]===Times. xαnn ) = µ(x1 )α1 × .5 Let p = 2x21 x22 + x1 x22 + 1 be a polynomial in Z5 [x1 . . × µ(xn )αn for each monomial xα1 1 . Example 3.2. .6 Let p1 and p2 be polynomials in C[x1 . cn ) for evalµ (p) when µ(xi ) = ci for each 1 ≤ i ≤ n.  We now define sum and multiplication of polynomials. returns the polynomial that results from poly by replacing each coefficient c by a new coefficient c′ ∈ Zn such that c′ =n c.1]]. Definition 3. xn . Figure 3. . . Then.71 3. . .. This map can be extended to monomials considering the map µ : Mx1 . . .xn → C such that µ(xα1 1 . . BASIC CONCEPTS degmon = Function[{q.w}. .. . Then.. If[Head[m]===Power. .n}. . . . Polynomials in x1 . If[NumberQ[First[w]]. ..2. . . the µ-evaluation of a polynomial p in C[x1 . Apply[Plus. xn ] such that p1 + p2 (m) = p1 (m) + p2 (m) for each monomial m in x1 . xαnn in Mx1 .4: Degree of a term over Zn poly and a positive integer n.. . . . . x2 } → Z5 such that µ(x1 ) = 3 and µ(x2 ) = 4. . . x2 ] and consider the map µ : {x1 .Map[Function[m. we also write p(c1 .w]]]].w=Rest[w]]. xn ] is X evalµ (p) = p(m) × µ(m).. . xn over C can be evaluated.n].. Module[{p. .m[[2]]..p]. . evalµ (p) = 2 × 32 × 42 + 3 × 42 + 1 = 2. . . xn ]. . xn } → C be a map.xn . w={p}].2. {m∈M :p(m)6=0} For simplicity. xn over C.2.72 CHAPTER 3. xnαn +βn in x1 . .  As usual we may omit the symbol × in a product of monomials or polynomials and write p1 p2 instead of p1 × p2 . . Let m1 = xα1 1 . x2 ]. . .8 Let p1 and p2 be polynomials in C[x1 . xn ] . the ring of polynomials in x1 . xn whose product is m. Example 3. . . . . . m21 ). . Then. Clearly. . . xn ]. . p1 + p2 is 2x31 x2 + 2x21 x22 + 3x1 x22 + x32 + 1. The product of p1 and p2 is the polynomial p1 × p2 in C[x1 . The product of m1 and m2 is the monomial xα1 1 +β1 . Let us now introduce the multiplication of polynomials. given a field C and n ∈ N. m2r ) be all the distinct pairs of monomials in x1 .5). . for each monomial m in x1 . POLYNOMIALS In the polynomial p1 + p2 the coefficient of each monomial m is the sum (in C) of the coefficients of m in p1 and p2 . . xn  0 if deg(m) > deg(p1 ) + deg(p2 ) p1 × p2 (m) = c11 × c21 + . xβnn be two monomials in x1 . . .  It is easy to conclude that deg(p1 × p2 ) = deg(p1 ) + deg(p2 ). x2 ]. . . . deg(p2 )}. . . Definition 3. . . . . . .2. deg(p1 + p2 ) ≤ max{deg(p1 ). xn . . In the sequel.9 Let p1 = 3x31 x2 + 2x21 x22 + x1 x22 + 1 and p2 = 4x31 x2 + 2x1 x22 + x32 be polynomials in Z5 [x1 . p1 × p2 is 2x61 x22 + x31 x42 + 3x51 x32 + 2x21 x52 + 2x21 x42 + x1 x52 + 4x31 x2 + 2x1 x22 + x32 . . Then. xαnn and m2 = xβ1 1 . .7 Let p1 = 3x31 x2 + 2x21 x22 + x1 x22 + 1 and p2 = 4x31 x2 + 2x1 x22 + x32 be polynomials in Z5 [x1 .  The symmetric of a polynomial p is the polynomial −p such that (−p)(m) is the symmetric of p(m) in the field C. . + c1r × c2r if deg(m) ≤ deg(p1 ) + deg(p2 ) where. . .2. . . . . . xn . 
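The evaluation in Example 3.2.5 can be reproduced by plain substitution followed by reduction modulo 5; the lines below are only a check and do not use the formalism of this section.

p = 2 x1^2 x2^2 + x1 x2^2 + 1;
Mod[p /. {x1 -> 3, x2 -> 4}, 5]         (* 2, as computed in Example 3.2.5 *)

q = 2 x1^3 x2^2 + 3 x1^2 x2^2 + 1;
Exponent[q /. {x1 -> t, x2 -> t}, t]    (* 5, the degree observed in Example 3.2.4 *)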
It is easy to conclude that the above operations of sum and multiplication of polynomials satisfy all the relevant properties of a ring (Exercise 5 in Section 3. . (m1r . cij is the coefficient of mij in pi for each 1 ≤ i ≤ 2 and 1 ≤ j ≤ r. . denoted by m1 × m2 . denoted by C[x1 . . letting (m11 . . xn ] such that. Example 3. . 11 Let us consider monomials in x1 . xn . . . . the polynomial 2x1 in R[x1 ]. There are several orders that fulfill the above conditions. see. in order to define this operation. 0. +. . x2 . Definition 3. A total order > in the set of all monomials in x1 .5). . . xαnn >lx xβ1 1 . . . Example 3. [11]. . Ww will also write just p1 − p2 for p1 + (−p2 ). . xαnn . It is easy to conclude that there is no polynomial p in R[x1 ] such that 2x1 × p = 1. we will assume that × takes precedence over +. in general. The multiplicative identity is the polynomial 1 (Exercise 6 in Section 3. .2. As usual.10 The lexicographic order >lx on monomials in x1 . similar to the usual ordering of words in a dictionary. . we have to introduce first an ordering on monomials.2. for instance. xn ]. there is no infinite sequence m1 . that is. We first introduce the notion of lexicographic order on monomials in the variables x1 . xβnn ) ∈ >lx . mj . Consider. In this section we just introduce monomial orderings and the induced term orderings. However. Herein. xβnn . of monomials such that mj > mj+1 for all j ∈ N.  As usual. . Then . denoted by. x3 . Note that. for instance. . . . .2. postponing division to the next section. . . . Each of them induces a different division. . . a ring of polynomials is not a field. . . xα1 1 .73 3. xn is a monomial order providing that: (i) > preserves products of monomials. . if there is 1 ≤ i ≤ n such that αi > βi and αj = βj for all 1 ≤ j < i. . . ×). . BASIC CONCEPTS is the ring (C[x1 .2 Monomial orderings Our next goal is to define division of polynomials. . . xβ1 1 . For more details on orders. given two monomials m and m′ we write m 6>lx m′ to denote that it is not the case that m >lx m′ and we write m ≥lx m′ to denote that m >lx m′ or m = m′ . that is if m′ > m′′ then m′ × m > m′′ × m.2. x4 . we consider the lexicographic order and the graded lexicographic order. Every ring of polynomials is unitary. 3. −. . xn is such that (xα1 1 . (ii) > is well founded. The order >lx on monomials can be used for defining division of polynomials.2. .13 The graded lexicographic order on monomials in x1 . . since it satisfies the requirements above. by Definition 3. we also have m × m1 >lx m × m2 . . Then. . Then. there is 1 ≤ j ≤ n such that αj1 > αj2 and αi1 = αi2 for all 1 ≤ i ≤ i. x32 x21 is not a monomial in x1 . But. different orderings on the variables can also be considered. Proposition 3. xαnni . xβnn . . x2 .  Note that we assume that in a monomial the powers of the variables always occur by the order of the variables in x1 . . Proof: The order >lx is total and well founded (Exercise 7 in Section 3. for i = 1. For instance.74 CHAPTER 3. that is. POLYNOMIALS • x31 x22 x3 x24 >lx x21 x42 x23 x54 taking i = 1.1. We now prove that >lx is preserved by the product of monomials. .2. of course. 2 are monomials such that m1 >lx m2 . xn is denoted by >glx and defined as follows: given the monomials m and m′ m >glx m′ whenever one of the following conditions holds: • deg(m) > deg(m′ ) . xn .5). QED Next. . To obtain the order >lx above the variable ordering is. . This order also takes into account the degree of monomials.12 The order >lx is a monomial order. > xn . x2 . 2. . 
Assume that mi = xα1 1i . an ordering of the variables has to be previously fixed and the monomials written accordingly. xαnni +βn for i = 1. to introduce a definition of lexicographic order on monomials as the one above. . and αj1 + βj > αj2 + βj and αi1 + βi = αi2 + βi for all 1 ≤ i ≤ i. Definition 3. . we remark that it is not mandatory to impose a particular order when defining monomials. . . • x1 x54 6>lx x1 x2 x24 since the only exponent of a variable in x1 x54 that is greater than the corresponding one in x1 x2 x24 is the exponent of x4 and the exponents of x2 in the two polynomials are not equal. . As a side comment. Let m = xβ1 1 . we introduce the graded lexicographic order. This property of monomials has been implicitly used in the definition of >lx above. both x21 x32 and x32 x21 can be considered monomials in x1 . leading to different lexicographic orderings of monomials. x1 > . Since m × mi = x1α1i +βi .2. . either (i) deg(m1 ) > deg(m2) or (ii) deg(m1 ) = deg(m2 ) and m1 >lx m2 . The function monorderQ uses the function degmon already presented in Figure 3. clearly deg(m1 + m) > deg(m2 + m). given any finite set M of monomials we can determine the maximum max(M) with respect to >lx . Moreover. we conclude that m × m1 >glx m × m2 . . we have deg(m1 + m) = deg(m2 + m) and. Term coefficients are irrelevant and just the monomials are compared. . Any order > on monomials induces an order on the set of terms of a polynomial. The function index receives as input a term q and a positive integer n. Then • x1 x54 >glx x1 x2 x24 since deg(x1 x54 ) > deg(x1 x2 x24 ). The Mathematica function monorderQ in Figure 3. Then.5. • x21 x32 x3 >glx x21 x22 x24 since (i) deg(x21 x32 x3 ) = deg(x21 x22 x24 ) = 6.75 3. Assume m1 >glx m2 . 2. since >lx is also well founded we can determine the minimum min(M) of any set M of monomials. (ii) x21 x32 x3 >lx x21 x22 x24 .5). m × m1 >lx m × m2 . for i = 1. .2. x3 .14 Let us consider again monomials in x1 . .4 and the function index depicted in Figure 3. xαnni +βn and deg(m × mi ) = deg(m) + deg(mi) for i = 1. Hence.2. by Proposition 3.  We again write m 6>glx m′ to denote the fact that it is not the case that m >glx m′ .2. a term m2 and a positive integer n. . . QED Observe that since >lx is a total order. It returns the index of the first variable in the monomial of q that has a nonzero . Proof: The order >glx is total and well founded (Exercise 7 in Section 3. The properties of monomial orders clearly extend to the orders induced on the terms of a polynomial. x4 . Example 3. xαnni . Let us prove that >glx is preserved by the product of monomials.15 The order >glx is a monomial order.6 receives as input a term m1. Similarly with respect to >glx . x2 . and returns a Boolean value. We have that m × mi = x1α1i +βi .12. 2. respectively. In case (i). In case (ii). xβnn . Hence. It returns True if m1 >glx m2 and False otherwise.  Proposition 3. Let mi = xα1 1i . and consider m = xβ1 1 . BASIC CONCEPTS • deg(m) = deg(m′ ) and m >lx m′ . t > t′ whenever mt > mt′ where mt and mt′ are the monomials of t and t′ .2. If[PolynomialMod[m1.p]. . If[Head[p]===Times. . .n] . xαmm } and then removes c. . p=PolynomialMod[q.m[[2]]]]. . . the function returns True if the index of the first variable in m1 with a nonzero exponent is less than the index of the first variable in m2 with a nonzero exponent. If[degmon[m1. First[ Map[Function[m.n]. + ts . Otherwise.Module[{p. Figure 3.False. .w]]]]. Figure 3.n].76 CHAPTER 3. .n]>index[m2. 
it first tests if they are equal returning False if this is the case.n]===PolynomialMod[m2. POLYNOMIALS exponent. If[index[m1. when presenting a polynomial in C[x1 .m2/xindex[m1. If[degmon[m1.m2. n]]]]]]].w}. xαmm . If[NumberQ[First[w]]. When deg(m1) = deg(m2).False. it compares their degrees returning True if deg(m1) > deg(m2) and False if deg(m1) < deg(m2). .n].False.n] .m[[1.n]>degmon[m2. Otherwise. .n]. If[index[m1.n}.w=Rest[w]]. Given the input terms m1 and m2.5: Index of the first variable with a nonzero exponent The function monorderQ is recursively defined.n}.6: Checking whether m1 is greater than m2 In the sequel. monorderQ=Function[{m1.n]. xn ] as s X i=1 ti or t1 + . xα1 1 . the function decrements by 1 the exponents of these variables and recursively checks the the resulting terms.n]<index[m2. If[Head[m]===Power.2]]. it creates the list with the indexes of the variables with a nonzero exponent and then returns its first element.n]. . If it is greater it returns False. monorderQ[m1/xindex[m1.w=Apply[List.True. . the function index first creates the list {c. Afterwards.w={p}]. Assuming that q is cxα1 1 . index=Function[{q.n]<degmon[m2.True. 2. Its ordered presentation is 6x1 3 + 3x1 2 x2 + 4x1 2 x3 + 2x1 x2 2 + 2x1 x2 x3 + 6x2 2 x3 + 5x1 x2 + x2 x3  The next notions are useful in the sequel. ti > ti+1 for all 1 ≤ i < s.  Example 3. terms occur according to their ordering (induced by the monomial order > we are considering).6). . denoted by lt(p) is the nonzero term t of p such that t > t′ for each nonzero term t′ of p distinct from t.18 The leading term of the polynomial in R[x1 . we will often assume that this presentation is ordered. over a field C. for some n ∈ N. . .16 The presentation of the polynomial in R[x1 . . and returns the leading term of p. . We consider the >glx order on monomials. Recall that each polynomial has a finite number of nonzero terms and that all the monomial orders we are considering induce a a total order on the terms of a polynomial.2. xn . The integer n indicates that we are considering polynomials over Zn .3.2. The polynomial p is said to be monic if lt(p) is a monic term. The Mathematica function lt in Figure 3. This polynomial is not monic. x3 ] 6x1 3 + 3x1 2 x2 + 4x1 2 x3 + 2x1 x2 2 + 2x1 x2 x3 + 6x2 2 x3 + 5x1 x2 + x2 x3 is 6x1 3 . The function polsort is used to get the ordered list of the terms of p. . x3 ] 6x1 3 + 5x1 x2 + 3x1 2 x2 + 2x1 x2 2 + 4x1 2 x3 + x2 x3 + 2x1 x2 x3 + 6x2 2 x3 is not ordered.  The Mathematica function polsort presented in Figure 3. Hence.8 receives as input a polynomial p an a positive integer n.3 Division of terms and polynomials In this section we introduce division of terms an division of (ordered) polynomials in x1 . and returns the ordered list of the terms of q. ts are the nonzero terms of the polynomial. . Definition 3. x2 . . BASIC CONCEPTS 77 where t1 . . Example 3.7 receives as input a polynomial q and a positive integer n. . The function first creates the list of the terms of q and then orders this list using the function monorderQ (see Figure 3. x2 . .17 Given a nonzero polynomial p in C[x1 . xn ] the leading term of p. . 3. . that is.2.2. n}.Module[{p}. when t1 is divisible by t2 the term t is unique.8: Leading term of a polynomial Terms We start with the notion of divisibility of terms. if t1 is not divisible by t2 then it may be the case that deg(t1 ) < deg(t2 ) but it is also possible that deg(t1 ) = deg(t2 ) or even deg(t1 ) > deg(t2 ). .n]]]. . POLYNOMIALS polsort=Function[{q. 
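The built-in function MonomialList offers an independent way of checking the ordered presentations discussed above. The sketch below assumes that its third argument accepts the ordering names "Lexicographic" and "DegreeLexicographic", which are expected to correspond to >lx and >glx; it is a cross-check only and does not use polsort or monorderQ.

p = 6 x1^3 + 5 x1 x2 + 3 x1^2 x2 + 2 x1 x2^2 + 4 x1^2 x3 + x2 x3 + 2 x1 x2 x3 + 6 x2^2 x3;

(* terms of p listed from greatest to smallest with respect to >glx;
   the first element is expected to be lt(p) = 6 x1^3, as in Example 3.2.18 *)
MonomialList[p, {x1, x2, x3}, "DegreeLexicographic"]

(* the same polynomial ordered with respect to >lx *)
MonomialList[p, {x1, x2, x3}, "Lexicographic"]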
whenever there is a term t in x1 . .  When a term t1 is divisible by a given nonzero term t2 then deg(t1 ) ≥ deg(t2 ). We say that t1 is divisible by t2 .78 CHAPTER 3. • t1 = x1 x42 is not divisible by t2 = x21 x2 since the exponent of x1 is 1 in t1 and 2 in t2 and therefore there is no term t such that t1 = t × t2 . . xn over C such that t1 = t × t2 . Figure 3. Then • t1 = 3x31 x22 is divisible by t2 = 2x1 x22 since t1 = 4x21 × t2 . .2. The term t is the quotient of the division of t1 by t2 and it is denoted by Example 3.First[polsort[p. or that t2 divides t1 .20 Let us consider terms in x1 .h2}. .n]. t2 As we will see below. A nonzero term t1 is divisible by a nonzero term t2 if and only if the exponent of each variable in the t1 monomial is greater or equal to the exponent of that variable in the t2 monomial.n}.monorderQ[h1. Sort[Apply[List. p=PolynomialMod[q. x2 over Z5 . {p}]]]. a zero term is divisible by any term.19 Let t1 and t2 be terms in x1 . hence the above notion of quotient is well defined. Figure 3. We can also say that t1 is a multiple of t2 whenever t1 is divisible by t2 . xn over C where t2 is a nonzero term. However. Definition 3.2.p]. .7: Ordered list of the terms of q lt=Function[{p.h2. Function[{h1. Hence. If[Head[p]===Plus.  t1 . . Clearly.n]]]. . −1 −1 ′ ′ c1 = c′ × c2 and therefore c1 × c−1 QED 2 = c × c2 × c2 . c2 6= 0. Conversely. xm are nonnegative and False otherwise. xαnni . xn over C. . xγnn +αn2 . . xαnn1 −αn2 is the only 2 ) x1 term in x1 . xαm2m . αj1 = βj + αj2 for every 1 ≤ j ≤ n. . . . Proof: Note that c1 . . . returning True if this is the case. . .21 Let ti = ci xα1 1i . we have that αj1 − αj2 ≥ 0 for every 1 ≤ j ≤ n and conclude that t1 = t × t2 considering α11 −α12 t = (c1 × c−1 . t1 is divisible by t2 if and only if αj1 ≥ αj2 for every 1 ≤ j ≤ n. . xγnn . βj ≥ 0 for every 1 ≤ j ≤ n. . Clearly.3. The function first checks whether t1 = 0. . . αj1 = γj + αj2 .2. Assume now that also t1 = t′ × t2 with t′ = c′ xγ11 . γj = αj1 − αj2 for every 1 ≤ j ≤ n. . xαnn1 = (c′ × c2 ) xγ11 +α12 . Hence. xαnn1 = (c × c2 ) xβ1 1 +α12 . a term t2 and a positive integer n. 2 ) x1 2. . xα1 11 −α21 . Otherwise. . If t1 is divisible by t2 then t = (c1 × c−1 . .2. Then. . . be two nonzero terms in x1 . . Assume t1 = t × t2 with t = c xβ1 1 . . . and returns a Boolean value. c1 xα1 11 . Moreover. . It receives as input a term t1. . . It returns True if t1 is divisible by t2 and False otherwise. for i = 1. assuming that t1 = c1 xα1 11 . 2. xnβn +αn2 . then αj1 ≥ αj2 for every 1 ≤ j ≤ n. it creates the α1m −α2m } and returns True if all the exponents α1i − α2i list {c1 /c2 . Otherwise. . α11 −α12 2. The Mathematica function divisibleQ presented in Figure 3. . . c1 × c2 = c . . xβnn . xαnn1 −αn2 . we have that t1 = t × t2 . xαm1m and t2 = c2 xα1 21 . BASIC CONCEPTS 79 divisibility of nonzero terms only depends on the monomials of the terms. . Recall that nonzero term coefficients are nonzero elements of a field C and therefore have multiplicative inverse. it checks whether t2 = 0 returning False is this is the case. that is. Since. Lemma 3. . As a consequence. . Thus. 1. .9 checks whether a term is divisible by another term in the ring of polynomials over Zn . . t1 = c1 xα1 11 . xn over C such that t1 = t × t2 . 1. . that is. . Proposition 3. Proof (sketch): Let p and d be nonzero polynomials in C[x1 .n]. . POLYNOMIALS divisibleQ=Function[{t1.80 CHAPTER 3. such that • p0 = p. Then p can be written as p= q×d+r where q and r are polynomials in C[x1 . 
To simplify the presentation we denote by gdt(p. q and r are unique. r1=PolynomialMod[t1. . + ts we are always assumed to be ordered. . r2=PolynomialMod[t2. xn ]. xn ]. Function[p.If[Head[p]===Power.n}. consider a finite sequence of polynomials p0 . . . .9: Checking whether term t1 is divisible by term t2 Polynomials We Ps now consider division of polynomials.False]]]]==0. pk with k ∈ N0 . .2. In the sequel. .True]]]]]]. presentations such as i=1 ti or t1 + . t) the greatest nonzero term of a polynomial p that is divisible by a term t. xn ]. If[Head[r1/r2]===Times.r2}. at least one nonzero term of pi is divisible by lt(d) for each 0 ≤ i < k. Furthermore.t2. and r = 0 or r is a nonzero polynomial whose nonzero terms are not divisible by lt(d). . If[r2===0. p[[2]]<0. If[r1===0. . Figure 3. . The >glx order on monomials is assumed by default. Then r = pk and q is obtained from d and the polynomials in the sequence.False. and pk = 0 or there are no nonzero terms of pk divisible by lt(d). . .Module[{r1. . Length[Select[Apply[List. . (r1/r2)[[2]]>0. .r1/r2]. If[Head[r1/r2]===Power. . .n]. pk of polynomials where p0 = p. .True.22 Let p and d be nonzero polynomials in C[x1 . . . More precisely. . To get the polynomials q and r we build a suitable sequence p0 . Otherwise. p is divisible by d. Then. Proposition 3. to get the polynomials q and r we proceed as follows (note that the leading term of any ordered polynomial is always its greatest nonzero term): i) lt(p) is divisible by lt(d) thus lt(p) .2. QED The steps described above to get polynomials q and r such that p = q × d + r are the starting point for the division algorithm. that is 3x21 x2 . We do not present herein the details of the proof and refer the reader to [10]. lt(d))  if k > 0  lt(d) and r = pk . We have that p = q × d + r where q = 3x21 x2 + 5x1 x2 + 2x1 and r= 0 Hence. If r is a nonzero polynomial then p is not divisible by d. is a term of q. lt(d)) pi+1 = pi − × d. lt(d) • pk = 0 or pk is a nonzero polynomial whose nonzero terms are not divisible by lt(d). Clearly. let  0 if k = 0   k−1 X q= gdt(pi . The polynomial d is the divisor. Then q is said to be the quotient of the division of p by d and r is said to be the remainder of the division of p by d. Example 3.2. lt(d) .2. any zero polynomial p is divisible by any nonzero polynomial d. since p = 0 × d. x2 ].23 Let us consider the polynomials p = 3x31 x32 + 5x21 x32 − 6x31 x2 + 2x21 x22 − 10x21 x2 − 4x21 d = x1 x22 − 2x1 in R[x1 . Following the division algorithm described above. p is divisible by d. i=0 We can prove that indeed p = q × d + r and that q and r are unique in the above sense.81 3.22 ensures that such q and r are unique when we assume that r = 0 or r is a nonzero polynomial whose nonzero terms are not divisible by lt(d). BASIC CONCEPTS • for each 0 ≤ i < k lt(d) divides at least one nonzero term of pi and gdt(pi . is a term of q. lt(d) ii) lt(p1 ) is divisible by lt(d) thus lt(p1 ) . We have that p = q × d + r where q = 3x21 x32 x3 + 2x22 x23 and since using the division algorithm we get r = 2x32 x53 . lt(d) p3 = p − lt(p2 ) × d = 0. lt(d) p2 = p − lt(p1 ) × d = 2x21 x22 − 4x21 . POLYNOMIALS p1 = p − lt(p) × d = 5x21 x32 + 2x21 x22 − 10x21 x2 − 4x21 . that is 2x1 . that is 5x1 x2 . is a term of q. x2 .2. Example 3. lt(d) The above computations can also be presented in the following way where. 
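Divisions such as the one in Example 3.2.23 can be cross-checked with the built-in function PolynomialReduce, which performs multivariate division and returns the list of quotients together with the remainder. The call below is only a sanity check against that built-in routine; since this particular division is exact, the result does not depend on the monomial order.

p = 3 x1^3 x2^3 + 5 x1^2 x2^3 - 6 x1^3 x2 + 2 x1^2 x2^2 - 10 x1^2 x2 - 4 x1^2;
d = x1 x2^2 - 2 x1;

PolynomialReduce[p, {d}, {x1, x2}]
(* expected: {{3 x1^2 x2 + 5 x1 x2 + 2 x1}, 0}, as in Example 3.2.23,
   although the printed order of the terms of the quotient may differ *)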
we have also included the names of the polynomials: p 3x31 x32 + 5x21 x32 − 6x31 x2 + 2x21 x22 − 10x21 x2 − 4x21 x1 x22 − 2x1 d 2 3 3 3 3x1 x2 + 5x1 x2 + 2x1 q −3x1 x2 + 6x1 x2 p1 5x21 x32 + 2x21 x22 − 10x21 x2 − 4x21 −5x21 x32 + 10x21 x2 p2 2x21 x22 − 4x21 −2x21 x22 + 4x21 p3 0  We now present another example.82 CHAPTER 3. x3 ].24 Let us consider the polynomials p = 2x41 x32 x23 + 2x32 x53 + 3x21 x22 x33 + x21 x32 x3 + 4x22 x23 d = 4x21 x3 + 2 in Z5 [x1 . lt(d) iii) lt(p2 ) is divisible by lt(d) thus lt(p2 ) . for illustration purposes. x2 ]: • assuming the order >glx we have p = q × d + r where q = 4x21 + 2x2 . x1 + x22 4x1 x22 − 4x42 . Example 3. r = 4x62 + 2x32 . since the leading term of d depends on the particular monomial order. x22 + x1 4x21 + 2x2 • assuming the order >lx we get p = q × d + r where q = 4x1 x22 − 4x42 . Different orders may lead to different quotients and remainders.  It is worthwhile noticing that when we divide a nonzero polynomial p by a nonzero polynomial d the uniqueness of the quotient and remainder polynomials depends on the particular ordering of monomials we are considering.83 3. since using the division algorithm we get the following 4x21 x22 + 2x32 −4x21 x22 − 4x31 −4x31 + 2x32 −2x32 − 2x1 x2 −4x31 − 2x1 x2 .2. The following example illustrates this situation.2. r = −4x31 − 2x1 x2 . given that we now have 4x21 x22 + 2x32 −4x1 x22 − 4x1 x42 −4x1 x42 + 2x32 4x1 x42 + 4x62 4x62 + 2x32 . BASIC CONCEPTS 2x41 x32 x23 + 2x32 x53 + 3x21 x22 x33 + x21 x32 x3 + 4x22 x23 3x41 x32 x23 + 4x21 x32 x3 2x32 x53 + 3x21 x22 x33 + 4x22 x23 2x21 x22 x33 + x22 x23 2x32 x53 4x21 x3 + 2 3x21 x32 x3 + 2x22 x23 Recall that in Z5 the equalities −2 = 3 and −1 = 4 hold and the equality 3+2 = 0 also holds.25 Let us consider the polynomial p = 4x21 x22 + 2x32 and the polynomial d = x22 + x1 in R[x1 . dm }.84 CHAPTER 3. . . .4 Reduction modulo a set of polynomials We now discuss the reduction of a polynomial modulo a set of polynomials so that we can define certain polynomials as a combination a1 × d1 + . . . Definition 3. am are also polynomials. Example 3. since p− 5x21 x32 × d = 3x31 x32 − 6x31 x2 + 2x21 x22 − 4x21 lt(d) we can also conclude that d p −→ 3x31 x32 − 6x31 x2 + 2x21 x22 − 4x21 . . am × dm of a given finite set of polynomials {d1 . POLYNOMIALS Note that the ordered form of d is different in both cases. .2.2. . . . x2 ] where p = 3x31 x32 + 5x21 x32 − 6x31 x2 + 2x21 x22 − 10x21 x2 − 4x21 d = x1 x22 − 2x1 Since lt(d) divides lt(p) and p− lt(p) × d = 5x21 x32 + 2x21 x22 − 10x21 x2 − 4x21 lt(d) we have that d p −→ 5x21 x32 + 2x21 x22 − 10x21 x2 − 4x21 . However lt(d) also divides the term 5x21 x32 of p and therefore. thus the leading term is also different. where a1 . xn ]. if p′ = p − t ×d lt(d) for some nonzero term t of p divisible by lt(d). . Then p reduces to p′ modulo d in one step. . .  . .2.  3.26 Let p and d be nonzero polynomials in C[x1 . written d p −→ p′ . We start by considering the one step reduction.27 Let us consider the polynomials p and d in R[x1 . . Moreover. we always get rid of the term t divisible by lt(d) that we choose to compute the reduction. We consider two cases.2. .28 Let p and d be nonzero polynomials in C[x1 .2.2. Then. Lemma 3. lt(p) >glx lt(p ) whenever t = lt(p) and lt(p) = lt(p′ ) whenever t 6= lt(p). Assume t that t is a term of p divisible by lt(d) and that p′ = p − lt(d) × d is a nonzero ′ polynomial. In particular. Similarly. . This term is replaced in p by a multiple of the polynomial that results from d by removing lt(d). . 
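The dependence of quotient and remainder on the monomial order illustrated in Example 3.2.25 can also be reproduced with PolynomialReduce through its MonomialOrder option. The option values DegreeLexicographic and Lexicographic are assumed to correspond to >glx and >lx with x1 > x2; as before, this is a cross-check and not the division algorithm of this chapter.

p = 4 x1^2 x2^2 + 2 x2^3;
d = x2^2 + x1;

PolynomialReduce[p, {d}, {x1, x2}, MonomialOrder -> DegreeLexicographic]
(* expected: quotient 4 x1^2 + 2 x2 and remainder -4 x1^3 - 2 x1 x2 *)

PolynomialReduce[p, {d}, {x1, x2}, MonomialOrder -> Lexicographic]
(* expected: quotient 4 x1 x2^2 - 4 x2^4 and remainder 4 x2^6 + 2 x2^3 *)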
t (1) t = lt(p). tpi i=1 Hence.  rd  X t d × ti . xn ]. It is easy to see that if we reduce p1 modulo d in one step using the term lt(p1 ) we get the polynomial p2 therein. t >glx t × tdi lt(d) for all 1 < i ≤ rd .  When reducing p modulo d in one step. BASIC CONCEPTS Another reduction of p modulo d is also possible given that the term 2x21 x22 is also divisible by lt(d). The only difference is that in the division algorithm we always choose the greatest nonzero term that is divisible by the leading term of d and herein we can choose any term divisible by the leading term of d.15. − lt(d) i=1 Since p and d are ordered polynomials.85 3. we get 0 by reducing p2 modulo d. The following result is useful in the sequel. It is also easy to conclude that the reduction of p modulo d in one step may correspond to a step of the division algorithm presented in Section 3. lt(d) = td1 . − lt(d) i=2 .2.2.23.3. The terms tp1 and lt(d) × td1 cancel each other and ′ p = rp X i=2 tpi  rd  X t d × ti . Proof: Let p = Prp p i=1 ti and d = ′ p = rp X Prd d i=1 ti . Note that the first reduction above is just the polynomial p1 we got in the first step of the division algorithm in Example 3. lt(p) = tp1 . given 1 < i ≤ rp and 1 < i′ ≤ rd we have that lt(p) >glx tpi and lt(d) >glx tdi′ . t t × lt(d) >glx × tdi lt(d) lt(d) that is. by Proposition 3. Clearly.r=w[[i]]].n].86 CHAPTER 3. in this case we have that lt(p) >lx lt(p′ ). a term u and a positive integer n. The function first creates the list of the terms of q and then passes through the list using the function divisibleQ (see Figure 3.u.n}. and therefore tpk is a term of p′ for all 1 ≤ k < j.10. r=0.t. it uses the Mathematica function selterm in Figure 3. It returns a term of q that is divisible by u. the terms tpj t and lt(d) × lt(d) cancel each other. if such a term exists.10: Selecting a term of q divisible by u The Mathematica function redone in Figure 3. Then. We lt(d) can then conclude that lt(p′ ) = tp1 = lt(p).Module[{p. tpk >glx t for all 1 ≤ k < j and. t × tdi tpk >glx lt(d) for all 1 < i ≤ rd .28 also holds when polynomials are ordered using the order >lx instead of >glx . t=PolynomialMod[u. selterm=Function[{q. It returns as output the polynomial that results from a reduction of f modulo g in one step. we Since lt(p) >glx tpi for all 1 < i ≤ rp and lt(p) >glx lt(d) conclude that lt(p) >glx lt(p′ ). The function selterm receives as input a polynomial q. r]]. no term tpk . i=i+1]. POLYNOMIALS t × tdi for all 1 < i ≤ rd . as a consequence.p]. . Figure 3. When there is no such term it returns 0.n].11 receives as input two polynomials f and g and a positive integer n. with 1 < i ≤ rd .t. when computing p′ . i=1. and 0 otherwise.w. Besides the function lt (see Figure 3.8). The integer n indicates that we are considering polynomials over Zn .2.r.w=p]. QED Note that Lemma 3. p=PolynomialMod[q. If[divisibleQ[w[[i]]. Hence. cancels with a term t × tdi . If[Head[p]===Plus.i}. t = tpj for some j > 1 and.9) to pick the first term divisible by u. While[i<=Length[w]&&r===0. Moreover.w=Apply[List. with 1 ≤ k < j.n]. (2) t 6= lt(p). 5). Expand[PolynomialMod[f(selterm[f. . . . written D p −→ p′ .n]/lt[g. Figure 3.2.n]]]. such that • p0 = p and pm = p′ .n].  .n}. .2.n])*g. . If p cannot be reduced in one step modulo an element of D we say that p is irreducible modulo D or that p is D-irreducible.30 Let us consider the polynomial p and the set of polynomials D = {d1 . . redone=Function[{f. 
Definition 3.29 Let p be a nonzero polynomial and D a finite set of nonzero polynomials in C[x1 . BASIC CONCEPTS To obtain a reduction of f modulo g in one step. pm where m ∈ N0 . p reduces to p′ modulo D in several steps. (Exercise 15 in Section 3. if a nonzero polynomial p is irreducible modulo D then p∈ / D.g. d2 } in Z7 [x1 .87 3. . otherwise. for each 0 ≤ i < m. Example 3. Then.lt[g.2. if there is a sequence p0 . xn ].11: One step reduction The notion of reduction can now be introduced capitalizing on the one step reduction. . d • pi −→ pi+1 for some d ∈ D. x2 ] where p = 6x42 + 3x21 x2 + 5x1 We have that d1 = 3x22 + 2 d2 = x21 D p −→ 3x22 + 5x1 d d 2 1 since p −→ 6x42 + 5x1 −→ 3x22 + 5x1 . Clearly. the function redone uses selterm and lt to select a term of f divisible by the leading term of g and then computes the reduction as expected. p reduces to p′ modulo D.  When m = 0 we say that p reduces to p′ modulo D in zero steps and. . if p reduces to 0. .. These properties are useful later on. . . xn ] for each 1 ≤ i ≤ k. . pm+1 be such that. . . k ∈ N0 . dk .2. . .. . . . pm . Lemma 3. . .dk } C[x1 . . m ∈ N0 . Proof: The proof is by induction on m. pi −→ pi+1 for some 1 ≤ j ≤ k. . .   k X t t a′i di × dj = (p0 − pm ) + × dj = p0 − pm+1 = p0 − pm − lt(dj ) lt(dj ) i=1 where a′i = ai + t if i = j and a′i = ai otherwise. . . i=1 dj Step: Let p0 . xn ]. Hence. xn ] such that. POLYNOMIALS When p reduces to p′ modulo a finite set D of polynomials. k ∈ N0 . . p0 − pm = k X ai di i=1 dj where ai is a polynomial in C[x1 . . . . In this case. for all 1 ≤ i ≤ k. p0 − pm = k X ai di i=1 where ai is a polynomial in C[x1 . lt(dj ) QED Proposition 3. then t pm+1 = pm − × dj lt(dj ) for some term t of pm .. . xn ] for each 1 ≤ i ≤ k. . for each 0 ≤ i < m + 1. pi reduces to pi+1 modulo dj for some 0 ≤ j ≤ k. Assume that p0 . then the polynomial p − p′ can be expressed in terms of the polynomials in D. be a nonzero polynomials in {d1 . be nonzero polynomials in C[x1 .2. . . for each 0 ≤ i < m. . Then. . . xn ] for each 1 ≤ i ≤ k.32 Let p and d1 . . In particular. . Basis: m = 0.88 CHAPTER 3. are polynomials in C[x1 . . Since pm −→ pm+1 for some 1 ≤ j ≤ k. . dk . .31 Let d1 .. If p −−−−→ p′ then p − p′ = k X ai di i=1 where ai is a polynomial in C[x1 . . then we can express p in terms of the polynomials in the set. xn ]. . . . . pm = p0 and therefore p0 − pm = 0 = k X 0di . . By the induction hypothesis. 31. = x1 × d2 + x1 × d3 + x2 × d3 p = x1 × d2 + (x1 + x2 ) × d3 . QED The proof of Lemma P 3. lt(d2 ) Therefore. d3 }. These reductions allow us to express p in terms of the polynomials in D. Example 3. dn }.33 Consider the polynomials in Z5 [x1 .2. x2 ] d1 = 2x21 x2 + x2 d2 = x32 + x1 d3 = 4x31 + 2x1 and p = 4x31 x2 + 4x41 + x1 x32 + 3x21 + 2x1 x2 .29 and Lemma 3.12 receives as input two polynomials f and g and a positive integer n. we have that D p −→ 0 since d d d 3 3 2 p −→ 4x41 + x1 x32 + 3x21 −→ x1 x32 + x21 −→ 0. . Assuming D = {d1 . The function repeatedly uses the function redone (see Figure 3. In fact.2. lt(d3 ) d 3 • p1 −→ p2 where p2 = p1 − d 2 • p2 −→ 0 where 0 = p2 − lt(p1 ) × d3 = p1 − x1 × d3 lt(d3 ) lt(p2 ) × g2 = p2 − x1 × d2 .2.2.2. .  The Mathematica function red in Figure 3. .31 sketches a technique for obtaining polynomials ai such that p − p′ = ni=1 ai × di whenever p reduces to p′ modulo {d1 . that is or p = p1 + x2 × d3 = p2 + x1 × d3 + x2 × d3 . d2 . BASIC CONCEPTS Proof: By Definition 3. It returns a polynomial irreducible modulo {g}. 
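Assuming that the definitions of Figures 3.4 to 3.11 have been evaluated in the current session, the first reduction step of Example 3.2.30 can be reproduced with the function redone, called with the polynomial, the divisor and the modulus. The call is a usage sketch; note that 3x1^2 x2 is the only term of p divisible by lt(d2) = x1^2, so this particular one step reduction is forced.

p  = 6 x2^4 + 3 x1^2 x2 + 5 x1;
d2 = x1^2;

redone[p, d2, 7]
(* expected: 6 x2^4 + 5 x1, the first step of Example 3.2.30,
   possibly with the terms printed in a different order *)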
We present in the sequel an illustrative example. assuming that p1 = 4x41 + x1 x32 + 3x21 and p2 = x1 x32 + x21 we have d 3 • p −→ p1 where p1 = p − lt(p) × d3 = p − x2 × d3 .11) to reduce f modulo g in one step until an irreducible polynomial is obtained. .89 3. . p = 0 × d1 + x1 × d2 + (x1 + x2 ) × d3 when considering all the polynomials in D. r}.Module[{i. i=i+1]. 5. Figure 3. 8] and were originally proposed as an algorithmic solution to some important problems in polynomial ring theory and algebraic geometry (for more technical details see [1]). This function extends the function red to a set of polynomials G = {g1 . Figure 3. 6. 27]. gm }.FixedPoint[Function[h. 32. obtains a polynomial h1 by reducing h modulo g1 . r=h. redmod=Function[{f.G[[i]].90 CHAPTER 3. finally returning hm . .13 receives as input a polynomial f.n}. . Gr¨obner bases were first introduced by B. It is the fixed point of the function that. FixedPoint[ Function[h. f]].n}. a polynomial h2 by reducing h1 modulo g2 and so on.n].n]]. The interested reader can consult [30. . r=red[r. . POLYNOMIALS red=Function[{f.g. Buchberger [7. It returns a polynomial G-irreducible that results from the reduction of f modulo G.12: Reduction of polynomial f modulo g The Mathematica function redmod in Figure 3. a list of polynomials G and a positive integer n.redone[h. r]]. . i=1.g.G. Since then many applications and generalizations have been proposed. given a polynomial h.3 Gr¨ obner bases In this section we introduce the notion of Gr¨obner basis and some related concepts and properties.13: Reduction of polynomial f modulo G 3. While[i<=Length[G]. Gr¨obner bases are often seen as a multivariate generalization of the Euclidean algorithm for computing the greatest common divisor for univariate polynomials.f]]. as a consequence. Herein we do not detail this subject.3. Since the transformation from S to G can be algorithmically performed. −.1 An ideal I over a commutative ring (A. if we transform S into a Gr¨obner basis G that generates the same ideal the problem is easily solved. this problem is not easy to solve.2 we discuss how to use Gr¨obner bases to solve systems of nonlinear polynomial equations. a2 ∈ I then a1 + a2 ∈ I. 3. • if a1 ∈ A and a2 ∈ I then a1 × a2 ∈ I. This new set G satisfies some good properties and. many problems involving finite sets of polynomials become algorithmically solvable. prime ideals and coprime ideals can be defined as a generalization of prime and coprime numbers and there is even a Chinese remainder theorem involving ideals. 0. In Section 3.2 we then discuss Gr¨obner bases. 0. There are several concepts that have to be presented before defining Gr¨obner bases. the idea behind the Gr¨obner basis approach is as follows. +. S can be transformed into a set G of polynomials (the Gr¨obner basis) which is equivalent to S in a sense to be discussed later on.1 Ring ideals In a nutshell. GROBNER BASES 91 They are also seen as a nonlinear generalization of the Gauss elimination algorithm for linear systems. ideals of polynomials and some relevant related notions and properties. that is. Definition 3.3. The ideal membership problem consists in checking whether or not some polynomial is a member of the ideal of polynomials generated by a given finite set S of polynomials.3. In this section we introduce ideals. In general. many problems and questions that were difficult to handle when considering the arbitrary set S become easier when we work with G.4. ×) (Exercise 8 in Section 3. .5). 
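Assuming again that the definitions of Figures 3.4 to 3.13 are available in the session, the reduction of Example 3.2.33 can be reproduced with redmod, called with the polynomial, the list of divisors and the modulus. Since the set D turns out to be a Gröbner basis (see Example 3.3.18 below), the irreducible polynomial reached does not depend on the order in which the one step reductions are performed, so the expected result is 0.

p  = 4 x1^3 x2 + 4 x1^4 + x1 x2^3 + 3 x1^2 + 2 x1 x2;
d1 = 2 x1^2 x2 + x2;
d2 = x2^3 + x1;
d3 = 4 x1^3 + 2 x1;

redmod[p, {d1, d2, d3}, 5]     (* expected: 0, as in Example 3.2.33 *)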
A typical example of this situation is the ideal membership problem. a2 ∈ A • if a1 .¨ 3. We are mainly concerned with ideals of polynomials. An ideal is a subset of a ring with some interesting properties.  Observe that 0 ∈ I for every ideal I over (A. ideals over rings of polynomials (hence over unitary commutative rings). −. Given a (finite) set S of polynomials in some ring of polynomials (that depends on the particular problem at hand). For example. +. However. ×) is a nonempty subset of A such that for all a1 . In Section 3. Using ideals important properties of integer numbers can be generalized to other rings. The interested reader is referred to [20].3. 7 Let (A. It is easy to conclude that given g1 . Then (i) {g1 . It is easy to conclude that • {0} and A are ideals over A • if 1 ∈ I then I = A.2 Consider the ring Z of integer numbers with the usual operations and n ∈ Z.5). .  Example 3. 0.  We say that an ideal I over A is proper whenever I 6= A. . 0. −.4 Let (A. . . . gk ) (ii) if {g1 . . . . k ∈ N. Ze is finitely generated and {2} is a basis of Ze . ×) be a unitary commutative ring. the set {p ∈ A : p = a1 × g1 + .3 Let (A. . . . . . An ideal I is finitely generated if it has finite set G of generators. .3. This ideal is usually denoted by (g1 . . gk } ⊆ I is a set of generators of an ideal I then I = (g1 . −. gk ∈ A. Then. +. .3. Indeed any even number d ∈ Z can be written as d = a × 2 for some a ∈ Z. such that p = a1 × g1 + . . . . 0. . . gk } is a basis of the ideal (g1 . + ak × gk with a1 . +. . . with k ∈ N. (ii) the sum of multiples of n is a multiple of n and (iii) the product of an integer number by a multiple of n is again a multiple of n. with k ∈ N. . . + ak × gk . . . . +. −. POLYNOMIALS Example 3. . Hence. . the set G is a basis of I. +. ×). When G ⊆ I is a set of generators of I. . .3.  Definition 3.  Example 3. gk ). . In the second case note that given any a ∈ A we have that a × 1 = a ∈ I. . .3.3. A set G ⊆ I is a set of generators of I if for any p ∈ I there exist g1 . we also say that I is generated by G. The set {2} is a set of generators of Ze . . . . gk ∈ G and a1 .5 Let I be an ideal over a commutative unitary ring (A. gk ). .  Proposition 3. −.6 The set Ze of even integer numbers is an ideal over Z (Exercise 9 in Section 3. . ×) be a unitary commutative ring and consider g1 . . gk ∈ A. Example 3. ak ∈ A} is an ideal over A. The set of the multiples of n is an ideal over Z since (i) this set is not empty. . ak ∈ A. .3. . ×) be a unitary commutative ring. 0. .92 CHAPTER 3. where agj′ ∈ A for every 1 ≤ j ≤ m. . . . .32. . Otherwise. As a P consequence. . since p − p′ ∈ I and I is an ideal. for instance.4. . Proof: 1. If 1 ∈ G then p −→ 0.¨ 3.2. then using the definition of ideal. . . . that is. . Proposition 3.3. if p −→ 0.3. . . The fact that p − p′ ∈ I follows from Proposition 3. gk ∈ I. If p −→ p′ then p − p′ ∈ I and p ∈ I if and only if p′ ∈ I. . If p′ ∈ I then. .2. . . gk ) from Example 3. . . GROBNER BASES 93 Proof: Recall the ideal (g1 . . . With respect to statement (ii). . .3. . . gm } and ai = 0 otherwise. . . In fact.32 and Definition 3. gr ) and therefore I is finitely generated. . using Proposition 3. . Moreover. G 2. it is easy to conclude that p − p′ ∈ I and that p ∈ I if and G only if p′ ∈ I. Hence. . J is finitely generated. These properties are useful in the sequel. let p = ki=1 ai gi ∈ (g1 . gk ). . QED The following result states that every ideal over a ring of polynomials is generated by some finite set of polynomials.5. . then. . . . . . 
The converse is similar. . . gm ∈ {g1 . Proposition 3. gk }. Conversely. I = (g1 . G When G is a basis of an ideal I of polynomials and p −→ p′ . . . . .3. . . . gr } ⊆ I such that J = (lt(g1 ). In particular. For the details of the proof we refer the reader to [10]. and therefore (g1 . gk ). p= k X ai gi i=1 ′ where ai = agi if gi ∈ {g1′ . we also have p = p′ + (p − p′ ) ∈ I. if I = {0} then I = (0). . . letting J be the ideal generated by the leading terms of the polynomials in I.3. . gk ) ⊆ I.8 Every ideal I over C[x1 . Since g1 . lt(gr )). . xn ] is finitely generated. . Statement (i) follows easily from Definition 3. . note that if p ∈ I then m X agj′ gj′ p= j=1 ′ for some g1′ . it is easy to conclude that also p ∈ I. I ⊆ (g1 . there is a finite set {g1 . . . .3.9 Let G be a basis of an ideal I over C[x1 .5. then p ∈ I. xn ] and let p be a polynomial in C[x1 . xn ]. G 1. . for every 1 ≤ i ≤ k. . 3. Consider the polynomial p = 4x31 x2 + 4x41 + x1 x32 + 3x21 + 2x1 x2 inZ5 [x1 . The set G = {g1 . Otherwise. xn ]. we conclude that p − 0 ∈ I and therefore p ∈ I. Hence. Moreover. we have reduced p to a polynomial that results from p by removing lt(p).9. POLYNOMIALS 2. in Z5 [x1 . note that lt(1) divides lt(p) and therefore lt(p) 1 p −→ p − × 1. .9 we use Proposition 3.2.3.10 Consider the polynomials g1 = 2x21 x2 + x2 g2 = x32 + x1 g3 = 4x31 + 2x1 .94 CHAPTER 3.  There is an infinite number of possible bases of an ideal I over a ring C[x1 . QED In the proof of Proposition 3. or even p = 0 × d1 + x1 × d2 + (x1 + x2 ) × d3 if we want to explicitly consider all the polynomials in the basis. . note that the proof of Proposition 3. . x2 ]. by Proposition 3. we have that p = x1 × g2 + (x1 + x2 ) × g3 . recalling again Example 3. x2 ]. g2 . . Hence. However. .32 to conclude that G p − p′ ∈ I whenever p −→ p′ . g3 ) over Z5 [x1 . lt(1) 1 That is p −→ p − lt(p). Among them we can distinguish the Gr¨obner bases.33. In particular. It is then easy to conclude that with a suitable number of reductions in one step modulo 1 we get 0. g2 . g3 } is a basis of the ideal I = (g1 . Recall from Example 3.32 not only ensures that p − p′ ∈ I but also provides a technique d for expressing G p − p′ in terms of the polynomials in the basis G. x2 ]. if p −→ 0. A rigorous proof of this result uses induction on the number of terms of p and is left as an exercise.2.2. then we can easily express p in terms of the polynomials of the basis of the ideal. Example 3.2. If p is 0 the result is immediate.3.33 that G p −→ 0. 3. there is an algorithm to solve the ideal membership problem if there is algorithm to obtain from G a new set G′ of generators such that G′ is a Gr¨obner basis. . . . First of all. Proposition 3.3.. Hence. The choice of a particular monomial order may depend on the particular application of Gr¨obner bases we are interested in. . xn ] then {g} is a Gr¨obner basis of the ideal (g) (Exercise 15 in Section 3. Gr¨obner bases provide an algorithmic solution for the ideal membership problem. . . Moreover. . . . any other order on monomials will do provided it is suitable for polynomial division (see Section 3. gk ). .12 Let I be an ideal over C[x1 . . some properties of Gr¨obner bases depend on the monomial order we are considering. The following result constitutes an alternative way of characterizing a Gr¨obner basis G of an ideal I.2 95 Buchberger criterion Given polynomials g1 . . As already stated. then Proposition 3. 
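In practice, membership in an ideal of polynomials can also be tested with the built-in functions GroebnerBasis and PolynomialReduce: a polynomial belongs to the ideal exactly when its remainder on division by a Gröbner basis of the ideal, taken with respect to the same monomial order, is 0. The sketch below checks over Z5 that 4x1^3 + 2x1 belongs to the ideal generated by 2x1^2 x2 + x2 and x2^3 + x1, a fact that also follows from Example 3.3.21 below; the options Modulus and MonomialOrder are assumed to behave as in the Mathematica documentation.

g1 = 2 x1^2 x2 + x2;
g2 = x2^3 + x1;
gb = GroebnerBasis[{g1, g2}, {x1, x2},
       Modulus -> 5, MonomialOrder -> DegreeLexicographic];

(* remainder 0 means that the polynomial belongs to (g1, g2) *)
Last[PolynomialReduce[4 x1^3 + 2 x1, gb, {x1, x2},
       Modulus -> 5, MonomialOrder -> DegreeLexicographic]]
(* expected: 0 *)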
The set G is a Gr¨obner basis of I whenever for any nonzero p ∈ I there is g ∈ G such that lt(p) is divisible by lt(g). then p can . . Proof: (→) Assume that G is a Gr¨obner basis of I and let p ∈ I. . . . .  Note that if 1 is an element of a basis G of I then G is a Gr¨obner basis.3. . .¨ 3. .3.3).3. ak in C[x1 . gk }. we consider the order >glx on monomials. . gk in C[x1 .9 immediately ensures that p is an element of (g1 . xn ] and let G be a basis of I. . .5).11 Let I be an ideal over C[x1 . xn ] and a polynomial p in C[x1 . . if there is no reduction of p to 0 modulo G nothing can be concluded from that result.8 ensures that every ideal of polynomials is generated by a finite set of polynomials.3.2. If p = 0. this problem is usually known as the ideal membership problem and it has relevant applications in several domains. In this section we introduce the notion of Gr¨obner basis and in Section 3. . then p is in the ideal generated by G if and only if p can be reduced to the zero polynomial using the polynomials in G. G Note that if p −→ 0. G Then G is a Gr¨obner basis of I if and only if p −→ 0 for all p ∈ I. . xn ] it can be a hard task to determine whether or not p ∈ (g1 . . . In the sequel we always consider ideals of polynomials that are generated by some given finite basis. whenever G is a Gr¨obner basis. . given a set G of generators. However. . . . gk ). . . GROBNER BASES 3. where G = {g1 . . But. As we have already remarked above. Observe also that given any polynomial g in C[x1 . . . xn ].3. . . . xn ] and G a basis of I. This is not a restriction since Proposition 3. Recall that we have to check whether p = a1 ×g1 +. . . Definition 3. .3 we will present an algorithm to obtain a Gr¨obner basis of an ideal I from any finite set of generators of I. . . .+gk ×gk for some a1 . we are done. . (←) We prove the result by contraposition.96 CHAPTER 3. Definition 3. p′ = p − lt(g) If p′ = 0. lt(p) >glx lt(p′ ) and.11.3. If p is irreducible modulo G. or even Proposition 3. p′ ∈ I. and therefore G p −→ 0. lt(p′ ) is not divisible by lt(g) for all g ∈ G and so we can reason in a similar way. as we have already remarked. The polynomial p can then be reduced to lt(p) × g. Note also that if 1 ∈ G and G is basis such that G 6= {1}. Again. Then there is a nonzero polynomial p ∈ I such that lt(p) is not divisible by lt(g) for all g ∈ G. it is the case that every ideal over C[x1 . Hence p′ 6= 0.3. • lt(g) does not divide the nonzero terms of g ′ for all g ′ ∈ G\{g}.28. . Otherwise. .3. In fact.11.3. by Proposition 3. then G is Gr¨obner basis.28 ensures that lt(p′ ) = lt(p). Since >glx is well founded. Otherwise. QED Although not obvious from the Definition 3. . at some point we will have to reduce to 0. Assume that G is not a Gr¨obner basis.13 A Gr¨obner basis G of an ideal I over the ring C[x1 . but G is not a reduced Gr¨obner basis. is often a very hard task. it is easy to conclude that it is not possible to reduce p to 0 modulo G.  Observe that if g is a monic polynomial then {g} is a reduced Gr¨obner basis of the ideal (g).3. . it can have many Gr¨obner bases. by Definition 3.3. we conclude that p cannot be reduced to 0 modulo G. Therefore. . . .11. by Lemma 3. Observe that we have to check all the polynomials of the ideal. Checking whether a set of polynomials G is a Gr¨obner basis using Definition 3. Similarly. POLYNOMIALS be reduced to 0 modulo G and we are done. 
Reduced Gr¨obner bases play an important role since every polynomial ideal has only one reduced Gr¨obner basis (see Proposition 3.30). Lemma 3. xn ] has a Gr¨obner basis. xn ] is said to be a reduced Gr¨obner basis if for each g ∈ G • lt(g) is monic.3. we can now reduce p′ modulo G either to 0 or to p′′ ∈ I such that lt(p′ ) >glx lt(p′′ ).12. since lt(1) divides any term. We now present the Buchberger criterion that provides an alternative way of checking whether G is a Gr¨obner .9. since p 6= 0. there is g ′ ∈ G such that lt(p) is divisible by lt(g ′). If p reduces to p′ in one step modulo an element of G.2.2. 8) to obtain the leading term of p. . p2 ) = lcm(lt(p1 ).3. q=PolynomialMod[p.14 computes the least common multiple of two terms using the built-in Mathematica function PolynomialLCM and the auxiliary function coeflt. . βi } for each 1 ≤ i ≤ n. Definition 3. PolynomialLCM[t1/coeflt[t1]. .14: Least common multiple of terms t1 and t2 The function polLCM receives as input two terms t1 and t2.n]]===Power. . xβnn be nonzero terms in x1 .¨ 3. .q. If[NumberQ[q]. p2 ). The function coeflt receives as input a polynomial p and a positive integer n. We first introduce the notion of least common multiple of two terms. Otherwise.n][[1]]. . Figure 3. It returns the coefficient of the leading term of p. is the monic term xγ11 . . . lt(p2 )) × p1 − × p2 lt(p1 ) lt(p2 ) . The least common multiple of t1 and t2 . .Module[{q}. denoted lcm(t1 .3.lt[q. The terms t1 and t2 are first divided by the corresponding leading term coefficient. . xγnn where γi = max{αi . xαnn and t2 = k2 xβ1 1 .14 Let t1 = k1 xα1 1 . . .n][[1]]]. t2).  The Mathematica function polLCM in Figure 3. polLCM=Function[{t1. xn over C.3. We now present Buchberger polynomials also called S-polynomials. If[Head[lt[q. lt(p2 )) lcm(lt(p1 ). The function first checks whether p has only a nonzero term in which case it returns its coefficient. Definition 3.t2}. If[NumberQ[lt[q. The integer n indicates that we are considering polynomials over Zn . Then the function PolynomialLCM is used to obtain their least common multiple. xn ]. t2 ).t2/coeflt[t2]]]. and returns lcm(t1. it uses the function lt (see Figure 3.n]. The Buchberger polynomial of p1 and p2 . GROBNER BASES 97 basis. is the polynomial B(p1 . denoted B(p1 . . Using the Buchberger criterion we only have to consider a finite number of polynomials.1.1]]]]].n}. . coeflt=Function[{p.15 Let p1 and p2 be nonzero polynomials in C[x1 . 16 Considering the polynomials in R[x1 .lt[g. .3. p2 ) ∈ I.n]]/lt[g. and. xn ]. p1 ).98 CHAPTER 3. . . . POLYNOMIALS in C[x1 . x3 ] p1 = 3x21 x32 x3 + 2x21 x2 + x2 x3 + 1 p2 = 5x32 x43 + 3x22 x23 then lcm(lt(p1 ). Note also that if p1 . g ′) −→ 0 for all distinct polynomials g.n}.n]*f -polLCM[lt[f. where I is an ideal of polynomials. p1 ) (Exercise 14 in Section 3.n]. p2 ) = −B(p2 . The Mathematica function polBuch in Figure 3. For the complete proof we refer the interested reader to [10]. .g. lt(p2 )) = x21 x32 x43 B(p1 . 3 3 3 5  Observe that the leading terms of the polynomials p1 and p2 are canceled when we compute their Buchberger polynomial. We only sketch the proof of the theorem. p2 ) = x21 x32 x43 x21 x32 x43 × p2 × p − 1 3x21 x32 x3 5x32 x43 2 1 1 3 = x21 x2 x33 + x2 x43 + x33 − x21 x22 x23 . p2 ) reduces to 0 modulo a set of polynomials so does B(p2 . Then G is Gr¨obner basis if and only if G B(g. The Buchberger polynomial is computed as expected using the function polLCM (see Figure 3.17 Let G be a basis of an ideal I over the ring C[x1 .lt[g. 
It returns the Buchberger polynomial of f and g. Figure 3. The following theorem is known as the Buchberger theorem.15 receives as input two polynomials f and g and a positive integer n.n]]].n]*g. therefore.3. . g ′ ∈ G.n]]/lt[f.14). Moreover.  Example 3. Expand[PolynomialMod[polLCM[lt[f. xn ]. B(p1 . . . p2 ∈ I then B(p1 . x2 . it is easy to conclude that if B(p1 .5). polBuch=Function[{f.15: Buchberger polynomial of f and g We can now introduce the Buchberger criterion for checking whether a given basis of an ideal is a Gr¨obner basis. Theorem 3.n]. . ak ) because the lt(ai gi )’s whose monomial is m(a1 . The monomial of lt(p) may differ from m(a1 . .. Example 3.. g ′) reduces to 0 modulo G for all distinct g. Then there is 1 ≤ i ≤ k such that lt(gi ) divides lt(p) as required. g2 . ak ) ∈ A let M(a1 . g ′) −→ 0.. ...ak ) be the set of the monomials of lt(ai gi ) for all 1 ≤ i ≤ k and let m(a1 . Let A be Pkthe set of all tuples (a1 . Let us use the Buchberger criterion to check whether G = {g1. either m is the monomial of lt(p) or m >glx lt(p). to check whether a set of polynomials is a Gr¨obner basis we just have to consider a finite number of polynomials: the Buchberger polynomials of the pairs of distinct elements of G. .. . Note that either m(a1 .. .ak ) >glx lt(p). . assume that all the polynomials in G are monic.ak ) with (a1 .ak ) may cancel each other. But this contradicts the assumption m = min(M) and. g ′ ∈ G. QED Taking into account Theorem 3. if m >glx lt(p) then it would be possible to get a′1 . .ak ) is the monomial of lt(p) or m(a1 .. Let M be the set of all m(a1 .3. g2 .. As remarked above... g3): • B(g1 . g ′ ∈ G. GROBNER BASES 99 Proof (sketch): (→) Since B(g. • B(g2 . xn ] such that p = i=1 ai gi ... g3 } constitutes a Gr¨obner basis of the ideal (g1 . . .. . . Using the hypothesis that B(g. . . as a consequence. (←) Conversely. . g2 ) = 4x31 + 3x32 and G 4x31 + 3x32 −→ 0 g3 g2 since 4x31 + 3x32 −→ 4x31 + 2x1 −→ 0..... .. xn ] such that P p = ki=1 a′i gi and m >glx lt(a′i gi ) for each 1 ≤ i ≤ k. . if G is Gr¨obner basis G then Proposition 3. .3.¨ 3. . .3. g ′) ∈ I for all distinct polynomials g.3.. ....18 Consider the polynomials g1 = 2x21 x2 + x2 g2 = x32 + x1 g3 = 4x31 + 2x1 in Z5 [x1 . x2 ].. a′k ∈ C[x1 . . ak ) ∈ A and let m = min(M). g3 ) = 0... . ak ) of polynomials in C[x1 . we have to prove that for each p ∈ I there is g ∈ G such that lt(g) divides lt(p).. we can conclude that m is the monomial of lt(p).. Consider p ∈ I and assume that G consists of k polynomials g1 .. .. .ak ) = max(M(a1 .12 ensures that B(g.. • B(g1 . gk . . . For each (a1 . .. .17. g3 ) = x41 + 2x1 x32 and G x41 + 2x1 x32 −→ 0 g3 g2 since x41 + 2x1 x32 −→ 2x1 x32 + 2x21 −→ 0. .ak ) ). Without loss of generality. 17.17. we can conclude that B(g1 . Since we have not proven that reductions modulo a set of polynomials are unique.1. G is a Gr¨obner basis. x2 ] presented in Example 3.{j.n].1. By Theorem 3. and returns False otherwise. Flatten[Table[ redmod[polBuch[G[[i]]. Figure 3.n]===0. We have that B(g1 .15) according to Theorem 3. It returns True if G is a Gr¨obner basis of the ideal generated by the polynomials in G.G[[j]]. g2 ) = x21 x32 x21 x32 × g2 = 4x31 + 3x32 × g − 1 2x21 x2 x32 and G 4x31 + 3x32 −→ 4x31 + 2x1 since g2 4x31 + 3x32 −→ 4x31 + 2x1 .3.16 receives as input a list of polynomials G and returns a Boolean value. i 6= j. gj ) −→ 0 for all 1 ≤ i.3.Length[G]}]]]]. 
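The Buchberger polynomial of Example 3.3.16 can be recomputed directly from Definition 3.3.15. The sketch below writes lcm(lt(p1), lt(p2)) explicitly, as given in the example, and lets the built-in Expand perform the cancellation of the leading terms; it does not use polLCM or polBuch, which are meant for polynomials over Zn.

p1 = 3 x1^2 x2^3 x3 + 2 x1^2 x2 + x2 x3 + 1;
p2 = 5 x2^3 x3^4 + 3 x2^2 x3^2;
m  = x1^2 x2^3 x3^4;    (* lcm(lt(p1), lt(p2)) from Example 3.3.16 *)

Expand[(m/(3 x1^2 x2^3 x3)) p1 - (m/(5 x2^3 x3^4)) p2]
(* expected: (2/3) x1^2 x2 x3^3 + (1/3) x2 x3^4 + (1/3) x3^3 - (3/5) x1^2 x2^2 x3^2 *)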
It is not a reduced Gr¨obner basis since lt(g1 ) and lt(g3 ) are not monic.19 Consider the polynomials g1 and g2 in Z5 [x1 .  Example 3. Note that the nonzero polynomial 4x31 + 2x1 cannot be reduced modulo g1 or g2 since neither of its terms are divisible by lt(g1 ) or lt(g2 ). It is easy to conclude that this is indeed the case.16: Checking whether G is a Gr¨obner basis .17 to conclude that G is not a Gr¨obner basis. j ≤ 3. We have to ensure that there is no other possible reductions of B(g1 .G.n}. g2 ) cannot be reduced to a zero polynomial modulo G. GrobnerQ=Function[{G. we cannot use just yet Theorem 3. p′′ are irreducible modulo G.17. POLYNOMIALS G Given that B(gi .3. Thus. G is not a Gr¨obner basis. {i. since lt(g1 ) does not divide any term of 4x31 + 3x32 and lt(g2 ) only divides the term 3x32 . respectively.  The Mathematica function GrobnerQ in Figure 3. g2 } constitutes a Gr¨obner basis of the ideal (g1 .3.3.100 CHAPTER 3. by Theorem 3. in G G the sense that p′ = p′′ whenever p −→ p′ and p −→ p′′ and p′ .Length[G]}. The predicate uses the functions redmod (see Figure 3. Apply[And. g2 ).3.18.13) and polBuch (see Figure 3. Let us use again the Buchberger criterion to check whether the set G = {g1 . g2 ) modulo G. given any basis G of I. .3). g ′) −→ p and we add it to G.20 Let G be a basis of an ideal I over C[x1 . xn ]. The set Gm is a Gr¨obner basis of I. g ′) −→ 0 for every pair of distinct polynomials g. Note that no term t ∈ LTi .3 101 Buchberger algorithm In this section we present the Buchberger algorithm. we consider the order >glx on monomials but the Buchberger algorithm can be used assuming any other order on monomials will do provided it is suitable for polynomial division (see Section 3. . A possible implementation of the Buchberger algorithm in Mathematica is also presented. Otherwise. . Given a finite set of polynomials G as input. we look for pairs of polynomials ′ g. We first address the problem of obtaining a Gr¨obner basis of an ideal I of polynomials from any given set of generators of I. g ′) reduces to pi modulo Gi for some g. we consider an appropriated nonzero G polynomial p such that B(g. In this first step. that is. g ′ ∈ Gm . . g ∈ G whose Buchberger polynomial does not reduce to 0 modulo G. If there are none. for each i ∈ N. We go on adding such new polynomials until the Buchberger polynomial of any pair of elements of the current basis reduces to 0. Gm such that • G0 = G.17).3. there is a finite sequence of sets of polynomials G0 . g ′ ∈ Gi .¨ 3. Unless otherwise stated. G is a Gr¨obner basis. • Gi+1 = Gi ∪ {pi } where pi is a nonzero Gi -irreducible polynomial such that Gi B(g. we show how to obtain a reduced Gr¨obner basis of I from a given Gr¨obner basis of I. . It is based on the Buchberger theorem (Theorem 3. Proposition 3. The first step of the Buchberger algorithm consists of a technique for obtaining a Gr¨obner basis of an ideal I from a given basis of I. the current basis satisfies the condition of the Buchberger theorem (Theorem 3. for each 0 ≤ i < m. g ′ ∈ Gi and pi is Gi -irreducible.3. Let LT0 = {lt(g) : g ∈ G0 } and. Let Ji be the ideal generated by LTi for each i ∈ N0 . the Buchberger algorithm outputs a finite set of polynomials that constitutes a reduced Gr¨obner basis of the ideal generated by the polynomials in G. Then. . Then.3. thus obtaining a new basis G ∪ {p} of I. g ′) −→ pi for some g. Proof (sketch): Assume that for each i ∈ N0 there is a set Gi+1 = Gi ∪ {pi } where pi 6= 0 is such that B(g.17). G m • B(g. 
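Assuming that the definitions of Figures 3.4 to 3.16 have been evaluated, the conclusions of Examples 3.3.18 and 3.3.19 can be reproduced with GrobnerQ. The calls below follow the argument list that appears in Figure 3.16, namely the list of polynomials followed by the modulus; they are a usage sketch only.

g1 = 2 x1^2 x2 + x2;  g2 = x2^3 + x1;  g3 = 4 x1^3 + 2 x1;

GrobnerQ[{g1, g2, g3}, 5]     (* expected: True, as in Example 3.3.18 *)
GrobnerQ[{g1, g2}, 5]         (* expected: False, as in Example 3.3.19 *)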
The following proposition ensures that this can indeed be achieved in a finite number of steps. Note that B(g. GROBNER BASES 3.2. this basis is a Gr¨obner basis of I. g ′) reduces to 0 modulo G ∪ {p}. let LTi = LTi−1 ∪ {lt(pi−1 )}.3.3. Then. if B(g. Hence. G0 = {g1 . x2 ].3. g3 ) −→ 0 (see Example 3. Hence. the set Gm is also a basis of I. for all g.18) We can then conclude that G1 = {g1 . Using the functions redmod and polBuch (see Figures 3.102 CHAPTER 3.17 constitutes a possible implementation in Mathematica of the first step of the Buchberger algorithm. g2 ): 1. storing in H the nonzero polynomials resulting from their reduction modulo K∪H. g2) −→ 0. At the end . that is. there is some m ∈ N0 such that. B(g2 . Since p is Gm -irreducible. Since G ⊆ Gm and G is a basis of T . g2 . POLYNOMIALS divides lt(pi ).20 sketches a technique for obtaining a Gr¨obner basis of an ideal I from any basis of I.3. . B(g1 .17 ensures that it is a Gr¨obner basis.3.  The function calcBuch in Figure 3. The function consists of a loop that works with two lists K and H of polynomials. The following example illustrates that construction of a Gr¨obner basis. g3 } with g3 = 4x31 + 2x1 G G G 1 1 1 3.15). the function calcBuch then successively computes the Buchberger polynomials of all pairs of elements of K. The list K is initially set to G. g ′ ∈ Gm .13 and 3. At the beginning of each iteration H is empty. g2 ). QED Proposition 3. p = 0. it can not be the case that p ∈ Gm . g2 } 2. B(g1 .21 Consider the polynomials g1 = 2x21 x2 + x2 and g2 = x32 + x1 in Z5 [x1 . Example 3. for each i ∈ N0 and therefore J0 ⊂ J1 ⊂ J2 ⊂ . We now follow Proposition 3. g2 ) = 4x31 + 3x32 and g2 4x31 + 3x32 −→ 4x31 + 2x1 and 4x31 + 2x1 is irreducible modulo G0 therefore G1 = {g1 . g ′) reduces to p modulo Gm and p is Gm -irreducible then either p = 0 or p ∈ Gm . Theorem 3. Given that each B(g. The function calcBuch receives as input a list G of polynomials and a positive integer n. Gm of set of polynomial indeed exists. the inclusion Ji ⊂ Ji+1 holds for each i ∈ N0 . . g ′) can always be reduced to a Gm -irreducible polynomial we conclude that such a finite sequence G0 ..3. B(g1 . g3 } is a Gr¨obner basis for (g1 .20 to compute a Gr¨obner basis for the ideal (g1 .3. However. g3 ) −→ 0. g2 . . one can prove that no such increasing chain of ideals exists. . The integer n indicates that we consider polynomials over Zn . f0 f1 f2 fr h −→ h1 −→ h2 −→ . . First.H. . g=True. Union[K. {i. we remove from G polynomials g such that lt(g) is divisible by lt(g ′) for some g ′ 6= g in the basis. . If[h[i. K=Union[Join[K. Proof: Since lt(g) is divisible by lt(g ′ ). .3. Module[{K.22 Let G be a Gr¨obner basis of an ideal I over C[x1 .3.12.i.c-1}.{j.H=Union[Append[H.g. hr −→ 0 where f0 . g=(Length[H]!=0)].28.j]=!=0.2.H]. h ∈ I since g ∈ I and therefore. Do[h[i. by G Proposition 3. While[g. the polynomial g can be reduced modulo lt(g) ′ G to h = g − lt(g ′ ) × g .H]]. . xn ] and let g ∈ G be such that lt(g) is divisible by lt(g ′ ) for some g ′ ∈ G\{g}.n]. . By Lemma 3. Proposition 3. .h}.n].3. h −→ 0. . Otherwise K is updated with the polynomials in H and H is reseted to the empty list. fr ∈ G and hr 6= 0. .9. H={}. moreover. GROBNER BASES 103 calcBuch=Function[{G. Hence.h[i. we make the remaining polynomials monic by multiplying the terms of each polynomial by the inverse of its leading term coefficient.1. that lt(hi ) >glx lt(hi+1 ) or lt(hi ) = lt(hi+1 ) for all . K=G. By Proposition 3.j]]]].c}]. 
there are three more steps.c.j]=redmod[polBuch[K[[i]]. . The next propositions ensure that we indeed end up with a reduced Gr¨obner basis after performing these steps. K]].3.17: First step of the Buchberger algorithm in Mathematica of an iteration if H is empty the loop ends and the function returns K.¨ 3. we can conclude that lt(g) >glx lt(h) and. Then. To this end.n}. Finally. c=Length[K].K[[j]]. The set G\{g} is a Gr¨obner basis of I. Figure 3. The next goal of the Buchberger algorithm is to obtain a reduced Gr¨obner basis of an ideal I from a given Gr¨obner basis G of I. f1 . we replace each monic polynomial h by h′ where h′ is obtained from h reducing it as much as possible. Hence.3. QED Proposition 3.104 CHAPTER 3. lt(g) × g′ g =h+ lt(g ′ ) so can g. The set G\{g} is then also a basis of I. H is also a basis of I. lt(g) >glx lt(hi ) for all 0 ≤ i ≤ r and therefore no term of hi is divisible by lt(g). . Then h is monic and G′ = (G\{g}) ∪ {h} is also a Gr¨obner basis of I. the terms lt(hg ) and lt(g) have the same monomial. . given any p ∈ I there is some g ∈ G such that lt(g) divides lt(p). G\{g} Let g ∈ G and assume that g −→ h where h is (G\{g})-irreducible.24 Let G be a Gr¨obner basis of an ideal I over C[x1 . for 0 ≤ i ≤ r. Hence. lt(hg ) also divides lt(p). The set H = {hg : g ∈ G} is also a Gr¨obner basis of I and every polynomial in H is monic. Since G is a Gr¨obner basis of I. . POLYNOMIALS 0 ≤ i < r. and assuming that G consists of k polynomials g1 . . k X p= ai × ci × hgi i=1 because cg × hg = g for each g ∈ G.32. .3. the polynomial h can be expressed in terms of the polynomials in G\{g} and. Then. Since G is a basis of I. .2. Proposition 3. . . . xn ] such that 0 ∈ / G. . xn ] such that for each g ∈ G • g is monic • lt(g ′) does not divide lt(g) for all g ′ ∈ G such that g ′ 6= g. g cannot be used in the above reductions. assuming h0 = h. . . the terms lt(hg ) and lt(g) have the same monomial. since. . xn ] such that p = ki=1 ai × gi . For each g ∈ G. g 6= fi for all 0 ≤ i ≤ r. hg = (cg )−1 × g is a monic polynomial for each g ∈ G. let hg = (cg )−1 × g. then it is also divisible by the term lt(g ′ ).23 Let G be a Gr¨obner basis of an ideal I over C[x1 . lt(g ′) does not divide any nonzero term of h for all g ′ ∈ G\{g}. that is. given p ∈ I there are polynomials a1 . . As a consequence. ak in C[x1 . Moreover. QED By removing from a Gr¨obner basis G the polynomials g such that lt(g) is divisible by lt(g ′ ) for some g ′ 6= g in the basis we can obtain a smaller Gr¨obner basis G′ . where cg is the coefficient of lt(g). It is also a Gr¨obner basis because if lt(p) is divisible by lt(g) for some p ∈ I. Moreover. . . H is also a Gr¨obner basis of I. Hence. By Proposition 3. Since. . . . Proof: Clearly.P . gk . . .2. . .¨ 3. If h = g. Then. . Hence. g1 can be expressed in terms of the polynomials in G′ . f1 f2 f3 fr g −→ h1 −→ h2 −→ . . The above propositions ensure the correction of Buchberger algorithm. some other term t of g is used to reduce g to h1 modulo f1 . fr ∈ G\{g}. Since lt(g) is not divisible by lt(g ′ ) for all g ′ ∈ G\{g}. h is monic. If p ∈ I then p can be expressed in terms of the P (G\{g}) polynomials in G. Similarly. then we indeed get a reduced Gr¨obner basis of I. because lt(h) = lt(g). . and. lt(hi ) = lt(g) for all 1 ≤ i ≤ r. Observe that Buch(G) is well defined since reduced Gr¨obner basis are unique (see Proposition 3. . Proposition 3. Note that it is easy to obtain a Gr¨obner basis for the ideal {0}. 
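As a usage example for calcBuch, and assuming that all the auxiliary functions of this chapter are loaded, the first step of the Buchberger algorithm for the ideal of Example 3.3.21 can be reproduced as shown below. Since the implementation assembles its result with Union, the order of the elements in the returned list may differ from the one used in the text.

calcBuch[{2 x1^2 x2 + x2, x2^3 + x1}, 5]
(* expected, as a set: {2 x1^2 x2 + x2, x2^3 + x1, 4 x1^3 + 2 x1},
   that is, the basis G1 obtained in Example 3.3.21 *)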
It remains to prove that h is monic and that G′ is a Gr¨obner basis of I. lt(p) is divisible by lt(g ′) for some g ′ ∈ G. GROBNER BASES 105 Proof: Assume that G consists of k polynomials g1 . . p can also be expressed in terms of the polynomials in G′ . The input set G is any basis of an ideal I 6= {0} such that 0 6∈ G.2. where g ′ results from g by successive reductions modulo all the other polynomials until no more reductions are possible. it is also easy to conclude that if G is a basis of an ideal I 6= {0} and 0 ∈ G then G\{0} is also a basis of I. by Lemma 3. . . in fact. assume that g is g1 . g1 = h + k X a′i gi i=2 that is. xn ].3. lt(h1 ) = lt(g). there is no loss of generality by assuming that the input set of the Buchberger algorithm does not include 0.18. we can conclude that. that is. The output set G4 obtained in the last step is a reduced Gr¨obner basis of I. Assume now that h 6= g.3. clearly. f2 . by Proposition 3. In the sequel.24 ensures that if we replace each polynomial g by g ′. Since g is monic. g1 − h = ki=2 a′i gi for some a′2 . Therefore. so is hr . The Buchberger algorithm can be sketched as in Figure 3. Since g −→ h. . in fact. QED Note that given a Gr¨obner basis G of I obtained following the steps described above. . . Hence. . without loss of generality. the only basis of this ideal and it is a Gr¨obner basis. Since G is a Gr¨obner basis. We then conclude that G′ is a basis of I.30). Moreover. The set {0} is. . Therefore. . If lt(p) is divisible by lt(g) then it is also divisible by lt(h).28. a′k in C[x1 . that is the case.32. . gk . . we use Buch(G) to denote the output of the algorithm when it receives G as input. consider any nonzero p ∈ I. r ≥ 1 and hr = h.3. then. the set G′ is a Gr¨obner basis of I. given a finite set G of nonzero polynomials. The fact that lt(g ′ ) does not divide any nonzero term of h for all g ′ ∈ G\{g} follows from the assumption that h is (G\{g})-irreducible. −→ hr where f1 . Hence. Finally. . g ′) −→ p for some distinct g. for each 0 ≤ i < m. Hm of sets of polynomials such that • H0 = G. step 4: Assume that G3 consists of k ∈ N polynomials g1 . step 2: Assume that G1 consists of k ∈ N polynomials g1 . Let G4 = Hk . output: G4 Figure 3. for each 0 ≤ i < k. g ′ ∈ Hi . . and Hi+1 = Hi otherwise. . • Hi+1 = Hi \{gi+1 } if lt(gj ) divides lt(gi+1 ) for some i + 1 < j ≤ k. . . . . g ′) −→ 0 for any distinct polynomials ′ g.18: Buchberger algorithm . . . . step 3: Let G3 = {c−1 g × g : g ∈ G2 } where cg is the coefficient of the leading term of g for each g ∈ G2 . gk and compute a sequence H0 .106 CHAPTER 3. Hk of sets of polynomials such that • H 0 = G3 . . gk and compute a sequence H0 . . . g ∈ Hm . H m and Hm is the first set such that B(g. Let G1 = Hm . • Hi+1 = Hi ∪ {p} where p is a nonzero Hi -irreducible polynomial such H i that B(g. for each 0 ≤ i < k − 1 Let G2 = Hk . . Hi \{gi+1 } • Hi+1 = (Hi \{gi+1 }) ∪ {h} where gi+1 −−−−→ h and h is (Hi \{gi+1 })- irreducible. . POLYNOMIALS Buchberger algorithm input: Finite nonempty set G of nonzero polynomials step 1: Compute a sequence H0 . Hk−1 of sets of polynomials such that • H 0 = G1 . clearly. g4}. g2 .3. G4 = H0 = {1}. Let us compute Buch({g1 . g4 } • H1 = {g2 .3. g4 } since lt(g4 ) divides lt(g2 ) • H3 = {g4 } since lt(g4 ) divides lt(g3 ) Hence. G1 = H1 = {g1 . g2 . g3}): step 1: • H0 = {g1 . g ′) −→ 0 for all polynomials g. g4 } since lt(g4 ) divides lt(g1 ) • H2 = {g3 . g ′ ∈ H1 such that g 6= g ′ . g4} with g4 = 1 H 1 Moreover B(g.  . g2 . 1 −→ 1 and 1 is H0 \{1}- ireducible. 
g3 .¨ 3. g3 .25 Consider the set of polynomials G = {g1 . step 3: G3 = {1 × 1} = {1}. Hence. g2 ) = x2 H 0 x2 −→ 1 and 1 is irreducible modulo H0 H1 = {g1 . x2 over Z2 where g1 = x1 g2 = x1 x2 − x2 g3 = x2 + 1 We use the Buchberger algorithm to compute a reduced Gr¨obner basis of the ideal (g1 . GROBNER BASES 107 Next we present two illustrative examples. We conclude that Buch({g1 . step 4: • H0 = {1} H0 \{1} • H1 = H0 since H0 \{1} = ∅ hence. g3 ). g2 . step 2: • H0 = {g1 . g3 . g2 . g2 . g2 . g3 } in x1 . g2 . g2 . g3 } • B(g1 . g3). g3 . g3 }) = {1} and therefore {1} is a reduced Gr¨obner basis of the ideal (g1 . G2 = {g4 } = {1}. Example 3. Hence. g3′ } H1 = H0 since g1′ is irreducible modulo {g2′ . g3 }. Note that this is not just a consequence of this particular example. B(g.108 CHAPTER 3. then we immediately have a Gr¨obner basis. g2 . g2 } • H1 = {g1 . 4 × g3 } = {x21 x2 + 3x2 .3. since lt(1) divides any term. In general. g2 . H 1 Moreover. x2 ]. Then. 1 × g2 .21. H0 = {g1′ . g2′ = x32 + x1 and g3′ = x31 + 3x1 . g3 } with g3 = 4x31 + 2x1 . Hence. then g can never be reduced to a distint polynomial modulo G3 \{g} since this set is empty. step 2: • H0 = {g1 . g3 }. Let us compute Buch({g1 . We use the Buchberger algorithm to compute a reduced Gr¨obner basis of the ideal (g1 . Then. g3′ } H3 = H2 since g3′ is irreducible modulo {g1′ . G2 = H0 = {g1 . given that their leading terms are divisible by lt(1).3.26 Consider the polynomials g1 = 2x21 x2 + x2 and g2 = x32 + x1 . x31 + 3x1 }. Similarly with respect to step 4 since the basis has now only one element. whenever the input set G3 for step 4 is a singular set {g}. g2′ . Step 3 does not modify anything because 1 is monic. Note that. G1 = H1 = {g1 . all the elements of the basis are removed except the polynomial 1. g ′) −→ 0 for all polynomials g.3.25 we added the polynomial 1 to the basis in the first step of the algorithm and at the end we got {1} as reduced Gr¨obner basis. g3 } • H1 = H0 since neither lt(g2 ) nor lt(g3 ) divide lt(g1 ) • H2 = H1 since lt(g3 ) does not divide lt(g1 ) Hence. in Z5 [x1 . step 3: G3 = {3 × g1 . x32 + x1 . in step 2. g2 . in general. g ′ ∈ H1 such that g ′ 6= g. step 4: Let g1′ = x21 x2 + 3x2 . g2}): step 1: From Example 3. g3′ } H2 = H1 since g2′ is irreducible modulo {g2′ . g2 ). we have that • H0 = {g1 . if we add the polynomial 1 to the basis in step 1 of the Buchberger algorithm. Example 3. g2 . g2′ } . and therefore no changes occur in step 4. POLYNOMIALS In Example 3. Map[Function[h.Module[{ntddivQ. In Figure 3. x32 + x1 . . We now present implementations of the other steps. letting k be the number of polynomials in G. x31 + 3x1 }.K]]].j.n]]]. x31 + 3x1 }.19. While[j<=Length[G2].n]. The function rempol uses the auxiliary function ntddivQ that given a list K of polynomials and a polynomial p returns True if the leading term of each polynomial in K does not divide the leading term of .G2[[j]]]].y}. ntddivQ=Function[{K. We conclude that Buch({g1 .n]<degmon[lt[y. In Figure 3. Not[divisibleQ[lt[p.Function[{x. Although some software packages already provide functions to compute Gr¨obner bases. g2 ). degmon[lt[x. .n].G2}. .22 we present a possible implementation of the Buchberger algorithm in Mathematica.19: Second step of the Buchberger algorithm in Mathematica The second step of the Buchberger algorithm can be implemented using the function rempol in Figure 3.5). Figure 3. This set of polynomials is then a reduced Gr¨obner basis of the ideal (g1 . 
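The two computations above can be cross-checked against the built-in GroebnerBasis function. This is only a sanity check, not part of the implementation developed in this chapter: it assumes that the option MonomialOrder -> DegreeLexicographic plays the role of the order >glx used here, and the returned list may differ from the hand computation in the order of its elements and in scalar normalization.

(* cross-check of Example 3.3.25: the ideal (x1, x1 x2 - x2, x2 + 1) over Z2 *)
GroebnerBasis[{x1, x1 x2 - x2, x2 + 1}, {x1, x2}, Modulus -> 2]
(* expected: {1} *)

(* cross-check of Example 3.3.26: the ideal (2 x1^2 x2 + x2, x2^3 + x1) over Z5 *)
GroebnerBasis[{2 x1^2 x2 + x2, x2^3 + x1}, {x1, x2},
  Modulus -> 5, MonomialOrder -> DegreeLexicographic]
(* expected to generate the same ideal as {x1^2 x2 + 3 x2, x2^3 + x1, x1^3 + 3 x1} *)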
H]].17 we have already presented an implementation of the first step of the algorithm. GROBNER BASES 109 Hence.n]]]. j=1. j=j+1].p}. rempol=Function[{G1. G4 = H0 = {x21 x2 + 3x2 . . . .n].lt[h. H={}. it is interesting to look at some of the details of the computation. Apply[And.G2[[j]]].¨ 3. It returns a list that results from G1 by removing all the polynomials whose leading term is divisible by the leading term of another polynomial in the list. Moreover.  If G and H are reduced Gr¨obner bases of I then they have the same number of polynomials. g2 }) = {x21 x2 + 3x2 .3. The function receives has input a list G1 of polynomials and a positive integer n. there is an enumeration g1 . x32 + x1 . H=Append[H. gk of the elements of G and an enumeration h1 . .n]. G2=Sort[G1. If[ntddivQ[H. hk of the elements of H such that lt(gi) = lt(hi ) for each 1 ≤ i ≤ k (Exercise 30 in Section 3.n}.H. . K=G. The resulting polynomial is inserted in the output list U.14.n}. h=redmod[First[K]. It receives as input a list G of polynomials and a positive integer n.21: Fourth step of the Buchberger algorithm in Mathematica Finally. Figure 3.h]]]. The function returns a list of monic polynomials that results from G by multiplying each polynomial in the list by the inverse of its leading term coefficient. and sets H to the empty list. The function reduce uses the function redmod presented Figure 3. removes the first polynomial in K and reduces it modulo the remaining polynomials in the list and the polynomials in already in U. K=Rest[K]. POLYNOMIALS p.n]. The function makemonic in Figure 3.20 implements the third step of the Buchberger algorithm.13. U]]. After that the function passes through G2 appending to H each element of G2 whose leading term is not divisible by the leading terms of the polynomials already in H.110 CHAPTER 3. the function reducedGrobnerbase in Figure 3.h}. and the output list U is set to the empty list.Join[Rest[K].21 implements the fourth step of the Buchberger algorithm. Then there is a loop that.U]. reduce=Function[{G.n]]].n].n]. It receives as input a list G of polynomials and a positive integer n. If[h=!=0. U={}. It uses the function coeflt presented in Figure 3.22 is a possible implementation of the Buchberger algorithm. The loop stops when K becomes the empty list. at each step. Expand[PolynomialMod[p*PowerMod[coeflt[p. It receives as input a list G of polynomials .-1. makemonic=Function[{G. While[Length[K]!=0.20: Third step of the Buchberger algorithm in Mathematica The function reduce in Figure 3.n}.Module[{K.Map[Function[p.U=Append[U. Figure 3.G]]. The function rempol first creates the list G2 that results from ordering the polynomials in G1 according to the degree of their leading terms.U. A copy K of the input list G is first created. 3. . by Proposition 3. p ∈ I.28 Consider the polynomials g1 = 2x21 x2 + x2 g2 = x32 + x1 g3 = 4x31 + 2x1 . QED Example 3. Let us consider the polynomial p = 2x21 x2 + x22 . K=reduce[K. rempol. . some important questions about I can be answered in an easy way.n]. Proposition 3. to solve the membership problem.Module[{K}.3. g2 . and returns the reduced Gr¨obner basis of the ideal generated by the polynomials in G. In this section we present some properties of Gr¨obner basis. .22: Buchberger algorithm in Mathematica 3. First of all note that given an ideal I over C[x1 .n].n}.3. . .3. Figure 3. that is. . .n].27 Let I be an ideal over C[x1 .3. K=calcBuch[G. Then p −→ 0 if and only if p ∈ I. Since G is a Gr¨obner basis. (←) Conversely. K=rempol[K. 
x2 ].4 Properties of Gr¨ obner basis When we have a Gr¨obner basis of an ideal I of polynomials. makemonic and reduce that implement the four steps of the algorithm are used as expected. let us assume that p ∈ I. g3 } constitutes a Gr¨obner basis of the ideal I = (g1 . . K]]. . xn ]. The functions calcBuch. g2 .3. reducedGrobnerbase=Function[{G. From Example 3. xn ] it is quite easy to determine whether or not p ∈ I. . . g3 ). . from G Proposition 3. . . GROBNER BASES 111 and a positive integer n. Proof: G (→) If p −→ 0 then.3. . in Z5 [x1 . . xn ] with a Gr¨obner basis G and a polynomial p in C[x1 . K=makemonic[K.12 we conclude that p −→ 0. xn ] with a Gr¨obner basis G G and let p be a polynomial in C[x1 .18 we know that G = {g1 .9.¨ 3.n]. if we reduce a polynomial modulo a Gr¨obner basis G of an ideal I until no more reductions in one step are possible. that is. given two ideals I1 and I2 with Gr¨obner bases G1 and G2 respectively. x2 ]. . p′ − p′′ ∈ I. xn ] G G we have that p′ = p′′ whenever p −→ p′ . Proposition 3. xn ] with a Gr¨obner basis G G G and let p be a polynomial in C[x1 . so is p′ −p′′ . . p − p′ ∈ I and p − p′′ ∈ I. . we can determine whether or not I1 ⊆ I2 . The fact that we can solve the ideal membership problem. As a consequence. Proposition 3.3.  Note that reductions modulo a Gr¨obner basis G are unique. recall that G e H have the same number of polynomials and. . Hence. xn ] such that p −→ p′ and p −→ p′′ . g2 or g3 . then we always get the same polynomial. . that is.9. if for all p ∈ C[x1 . . If p′ and p′′ are G-irreducible then p′ = p′′ .5). Hence.29 also holds.3. . .30 Let I 6= {0} be an ideal over C[x1 . . then G is a Gr¨obner basis (Exercise 26 in Section 3. Proof (sketch): We start by recalling that by the Buchberger algorithm we can always get a reduced Gr¨obner basis of I.3. that is. letting k be the number . By Proposition 3.3. (Exercise 27 in Section 3. For instance.3. p −→ p′′ and p′ and p′′ are G-irreducible. . that is. Let G and H be reduced Gr¨obner bases of I. The uniqueness result in Proposition 3. Hence p cannot be reduced modulo G to 0. Different orders may lead to different reduced Gr¨obner basis of a given ideal. Proof: By Proposition 3. By Proposition 3. POLYNOMIALS in Z5 [x1 .5) Reduced Gr¨obner basis have a very important property: each ideal as a unique reduced Gr¨obner basis.3. we have that G (p − p′′ ) − (p − p′ ) ∈ I.3. p′ − p′′ = 0. We cannot reduce p modulo g2 or g3 but we can reduce it modulo g1 : g1 p −→ x22 − x2 The polynomial x22 − x2 cannot be reduced modulo g1 . . p′ = p′′ . . QED The converse of Proposition 3. .30 below assumes that some order on monomials is fixed. p′ − p′′ −→ 0. But. Recall that we are considering a particular order on monomials.112 CHAPTER 3. allows us to solve other problems involving ideals.27. .27. Then I has a unique reduced Gr¨obner basis. . since p′ and p′′ are G-irreducible. . xn ]. This property is sometimes used to define the notion of Gr¨obner basis. p ∈ / I. we can also determine whether or not I1 = I2 . moreover. As a consequence.29 Let I be an ideal over C[x1 . Then either a nonzero term of gi is divisible by lt(gj ) or a nonzero term of hi is divisible by lt(hj ). . . . . xn ].3. QED A consequence of the above result is that we can use ideals and ideal bases to check whether two systems of polynomial equations have the same solutions. given a set S of polynomials in C[x1 . Definition 3. . assume that there is some 1 ≤ i ≤ k such that gi 6= hi . . In fact. un ) ∈ Z({pP 1 . . . . . . . 
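Proposition 3.3.27 turns ideal membership into a reduction problem: p belongs to I exactly when p reduces to 0 modulo a Gröbner basis of I. The conclusion of Example 3.3.28 can be cross-checked with the built-in PolynomialReduce, assuming its Modulus and MonomialOrder options behave as expected here; the second part of its result is the remainder, that is, the normal form of p modulo the basis.

G = {2 x1^2 x2 + x2, x2^3 + x1, 4 x1^3 + 2 x1};   (* the Groebner basis of Example 3.3.28 *)
{quotients, remainder} =
  PolynomialReduce[2 x1^2 x2 + x2^2, G, {x1, x2},
    Modulus -> 5, MonomialOrder -> DegreeLexicographic];
remainder === 0
(* expected: False, confirming that p = 2 x1^2 x2 + x2^2 is not in the ideal (g1, g2, g3) *)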
un ) = 0 for all p ∈ S}. un ) ∈ C n : p (u1. Since (p1 . Proposition 3. un ) × pi (u1 . .30 is that the reduced Gr¨obner basis of the nonproper ideal C[x1 . . . . . The following propositions illustrate the relevance of Gr¨obner basis for solving systems of (nonlinear) polynomial equations. Then. . gk of the elements of G and an enumeration h1 . pm }) ⊆ Z(I). . . . .5). xn ] with m ∈ N. . . . . Hence. . . . Then I1 = I2 if and only if their reduced Gr¨obner bases are equal. . Proposition 3. pm }). . Corollary 3. Using the Buchberger algorithm we compute the reduced Gr¨obner bases of I1 from its basis and similarly with respect to I2 . . . . that is G = H. .¨ 3. . given any p ∈ I we have that p = m i=1 ai × pi for some a1 . In the sequel. thus contradicting the fact that G and H are reduced Gr¨obner bases of I. . . gi = hi for each 1 ≤ i ≤ k.30 ensures that it is the only one. .3. .3. . pr } ⊆ I. . j 6= i. un ) = ai (u1 .32 Let p1 . . Note that lt(gi ) and lt(hi ) cancel each other in gi − hi and therefore lt(gi ) >glx lt(gi − hi ). . By contradiction. . m X p(u1 . pm ) is a basis of I. . . . xn ] is {1}. . pm }) = Z(I).  Another consequence of Proposition 3. . . Z({p1 . . . and {1} is a reduced Gr¨obner basis. . .3. We first introduce some notation and present a preliminary result. un ) = 0. . . pm ). . xn ]. . Hence. {1} is a Gr¨obner basis. . . The inclusion Z(I) ⊆ Z({p1 . . . since gi − hi ∈ I. . . . . . we assume that Z(S) = {(u1 .31 Let I1 6= {0} and I2 6= {0} be ideals over C[x1 . . . . . . . am in C[x1 . Hence. Then. . has we have already remarked above. . Then we just have to compare them. . pm }) is immediate since {p1 . . . QED Proposition 3. hk of the elements of H such that lt(gi ) = lt(hi ) for each 1 ≤ i ≤ k (Exercise 30 in Section 3. pm be polynomials in C[x1 . (u1 .30 provides another method to determine whether two ideals I1 and I2 are equal. . and let I = (p1 . . . i=1 Therefore. .11 ensures that lt(gi − hi ) is divisible by lt(gj ) (and lt(hj )) for some 1 ≤ j ≤ k. . xn ]. . GROBNER BASES 113 of polynomials in G. .3. Proof: Let (u1 . We have that Z({p1 . un ) ∈ Z(I). .3. .3. . . . there is an enumeration g1 . . . . . . POLYNOMIALS Proposition 3..   q =0 s Proof: Assume that I = (p1 . for r. . . .3. pr ) = (q1 .. . x2 ]. to check whether two ideals are equal it is enough to verify that their reduced Gr¨obner bases are equal. . .. Z({p1 . pr and q1 . . a good starting point is to compute the reduced Gr¨obner basis of the ideal corresponding to each system. pr }) = Z({q1 . . qs ) then the systems of polynomial equations    p1 = 0 . pr ) and J = (q1 . Hence.. qs }) that is. s ∈ N. . . . If the bases are equal the systems have the same solutions.   p =0 r and have the same solutions. . . g2 }): .33 Let p1 .1).34 Consider the systems of polynomial equations ( x1 x2 − x1 − x2 + 1 = 0 x21 x2 − 2x21 + 2x1 x2 − 4x1 + x2 − 2 = 0 (3. if we want to determine whether two systems of polynomial equations have the same solutions. . Example 3. Let us present an illustrative example. . If (p1 . . . . By Proposition 3. .2) in R[x1 . . .114 CHAPTER 3. We briefly sketch the computation of Buch({g1 . Therefore. . . qs ). . . Z({p1 .1) and  2   x1 x2 + x2 − x1 − 4x2 + 3 = 0 x21 + x1 x2 + x1 − 5x2 + 6 = 0   2 x1 + x22 + 2x1 − 7x2 + 7 = 0 (3. xn ]. . . g2 ) where g1 = x1 x2 − x1 − x2 + 1 and g2 = x21 x2 − 2x21 + 2x1 x2 − 4x1 + x2 − 2. . QED As remarked above. . pr }) = Z(I) and Z({q1 . qs be polynomials in C[x1 . qs }) = Z(J). . . . . .32. 
We first compute the reduced Gr¨obner basis of the ideal generated by the polynomials involved in (3.    q1 = 0 .3.3. that is. the two systems have the same solutions. . . . . . . . the ideal (g1 . where r1 = x1 x2 + x22 − x1 − 4x2 + 3. g4} with g4 = 4x22 − 12x2 + 8 H 2 Moreover. g ′) −→ 0 for all polynomials g. r4 } . g2 . G1 = H1 = {r1 . g3 . g3 } with g3 = x21 + 2x1 − 4x2 + 5 • B(g1 . In this case we have to compute the reduced Gr¨obner basis of the ideal (r1 . Hence. r2 . r2 . step 2: G2 = {r1 . r3 ). r2 . B(g. We conclude that Buch({g1 . x22 − 3x2 + 2} and therefore this set of polynomials is the reduced Gr¨obner basis of (g1 . r4 }. g2 } • B(g1 .3. g4′ } where g4′ = x22 − 3x2 + 2 step 4: G4 = G3 . r2 ) = −x21 − 5x1 x2 + 5x22 + 3x1 − 6x2 H 0 B(r1 . r3 .¨ 3. g3 . g ′) −→ 0 for all polynomials g. g4 } step 3: G3 = {g1 . g2 . r2 . g2 ) −→ x21 + 2x1 − 4x2 + 5 H1 = {g1 . r2 ) −→ 9x22 − 27x2 + 18 H1 = {r1 . We now consider the system (3. g2 }) = {x1 x2 − x1 − x2 + 1. g2 ) = x21 − 3x1 x2 + 5x1 − x2 + 2 H 0 B(g1 . g3 ) = −x21 − 3x1 x2 + 4x22 + x1 − 5x2 H 1 B(g1 . x21 + 2x1 − 4x2 + 5. step 2: G2 = {g1 . g2 . r3 } • B(r1 . g ′ ∈ H2 such that g ′ 6= g. r3 . r2 = x21 + x1 x2 + x1 − 5x2 + 6 and r3 = x21 + x22 + 2x1 − 7x2 + 7. r3 . r3 }): step 1: • H0 = {r1 . g2 ). G1 = H2 = {g1 . GROBNER BASES 115 step 1: • H0 = {g1 . B(g.2). g ′ ∈ H1 such that g ′ 6= g. g3 . Hence. We briefly sketch the computation of Buch({r1 . r4 } with r4 = 9x22 − 27x2 + 18 H 1 Moreover. r2 . g3 ) −→ 4x22 − 12x2 + 8 H2 = {g1 . g4}. g3 . 116 CHAPTER 3. POLYNOMIALS step 3: G3 = {r1 , r3 , r4′ } where r4′ = x22 − 3x2 + 2 r′ r′ 4 4 x21 + 2x1 − 4x2 + 5 x1 x2 − x1 − x2 + 1 and r3 −→ step 4: since r1 −→ then G4 = {x1 x2 − x1 − x2 + 1, x21 + 2x1 − 4x2 + 5, x22 − 3x2 + 2}. We conclude that Buch({r1 , r2 , r3 }) = {x1 x2 − x1 − x2 + 1, x21 + 2x1 − 4x2 + 5, x22 − 3x2 + 2} and therefore this set is the reduced Gr¨obner basis of (r1 , r2 , r3 ). Since the reduced Gr¨obner basis of the ideals (g1 , g2 ) and (r1 , r2 , r3 ) is the same, the ideals are equal and the systems have the same solutions.  Another consequence of Proposition 3.3.32 is the following. Given polynomials p1 , . . . , pm in C[x1 , . . . , xn ] we can conclude that the system p1 = 0, . . . , pm = 0 has no solutions in C if 1 ∈ (p1 , . . . , pm ) (Exercise 31 in Section 3.5). We can also conclude that the system has no solution whenever {1} is the reduced Gr¨obner basis of (p1 , . . . , pm ). Proposition 3.3.35 Let p1 , . . . , pm be polynomials in C[x1 , . . . , xn ], for m ∈ N. If {1} is the reduced Gr¨obner basis of the ideal (p1 , . . . , pm ) then the system has no solution in C.    p1 = 0 ...   p =0 m Proof: Since {1} is the reduced Gr¨obner basis of the ideal (p1 , . . . , pm ), then (p1 , . . . , pm ) = C[x1 , . . . , xn ]. So, Z({p1 , . . . , pm }) = Z(C[x1 , . . . , xn ]) by Proposition 3.3.32. But, Z(C[x1 , . . . , xn ]) = ∅, since, for instance, p1 = x1 −1 and p2 = x1 are both polynomials in C[x1 , . . . , xn ] and there is no c ∈ C such that p1 (c) = 0 and p2 (c) = 0, given that 1 6= 0 in any field. QED Let us present an illustrative example. Example 3.3.36 Consider the system of polynomial equations  2   x1 x2 + x1 x2 + 2 = 0 x22 − x1 = 0   x21 + x1 x2 + 1 = 0 ¨ 3.3. GROBNER BASES 117 in R[x1 , x2 ]. We have to compute the reduced Gr¨obner basis of the ideal (p1 , p2 , p3 ) where p1 = x1 x22 + x1 x2 + 2, p2 = x22 − x1 and p3 = x21 + x1 x2 + 1. We have that B(p1 , p2 ) = x21 + x1 x2 + 2 and p3 x21 + x1 x2 + 2 −→ 1. 
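Returning for a moment to Example 3.3.34 (the derivation of Example 3.3.36 continues below), its conclusion can be cross-checked with the built-in GroebnerBasis: two generating sets determine the same ideal exactly when their Gröbner bases for a fixed monomial order coincide. The built-in normalization differs from the monic convention used in the text, but both calls below are normalized the same way, so comparing the two outputs is still meaningful.

gbA = GroebnerBasis[
   {x1 x2 - x1 - x2 + 1, x1^2 x2 - 2 x1^2 + 2 x1 x2 - 4 x1 + x2 - 2},
   {x1, x2}, MonomialOrder -> DegreeLexicographic];
gbB = GroebnerBasis[
   {x1 x2 + x2^2 - x1 - 4 x2 + 3, x1^2 + x1 x2 + x1 - 5 x2 + 6,
    x1^2 + x2^2 + 2 x1 - 7 x2 + 7},
   {x1, x2}, MonomialOrder -> DegreeLexicographic];
Sort[gbA] === Sort[gbB]
(* expected: True, so (g1, g2) = (r1, r2, r3) and systems (3.1) and (3.2) have the same solutions *)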
Assuming p4 = 1, we are able to conclude that B(pi , pj ) reduces to 0 modulo {p1 , p2 , p3 , p4 }, for all 1 ≤ i, j ≤ 4, i 6= j. Hence, a Gr¨obner basis of (p1 , p2 , p3 ) is G = {p1 , p2 , p3 , p4 }. The other steps of the Buchberger algorithm are not necessary. In fact, noting the remarks after Example 3.3.25, we can already conclude that {1} is the reduced Gr¨obner basis of the ideal (p1 , p2 , p3 ). Hence, the system has no solutions in R.  The converse of Proposition 3.3.35 also holds but only when the coefficient field is algebraically closed. A field C is said to be algebraically closed [20] whenever for each polynomial p of degree m ∈ N in C[x1 ] there is c ∈ C such that p(c) = 0. The field C of complex numbers, for instance, is algebraically closed. The field R of real numbers is not algebraically closed, since, for instance, x21 + 1 = 0 has no solution in R. Moreover, no finite field is algebraically closed. An example that illustrates the fact that the converse of Proposition 3.3.35 does not hold for arbitrary fields is the following. Consider the equation x21 +1 = 0. This equation has no solution in R. However, the reduced Gr¨obner basis of the / {x21 + 1}. ideal (x21 + 1) over R[x1 ] is {x21 + 1} and 1 ∈ As the following example illustrates, a consequence of Proposition 3.3.33 is that we can often use Gr¨obner bases, or reduced Gr¨obner bases, to find the solutions of systems of polynomial equations. Example 3.3.37 Consider the system (3.1) in Example 3.3.34. We know that the reduced Gr¨obner basis of the ideal generated by the two polynomials involved is {x1 x2 − x1 − x2 + 1, x21 + 2x1 − 4x2 + 5, x22 − 3x2 + 2} and that the corresponding system    x1 x2 − x1 − x2 + 1 = 0 x21 + 2x1 − 4x2 + 5 = 0   x22 − 3x2 + 2 = 0 (3.3) and the system (3.1) have the same solutions. But note that the last equation of the system (3.3) involves only one variable, x2 , thus it is easier to solve. With the solutions for x2 we can transform the other two equations in equations involving only the variable x1 that can also be easily solved. 118 CHAPTER 3. POLYNOMIALS From the equation x22 − 3x2 + 2 = 0 we get x2 = 1 or x2 = 2. For x2 = 1, we the get the system ( 0x1 = 0 x21 + 2x1 + 1 = 0 and therefore x1 = −1. For x2 = 2 we get ( x1 − 1 = 0 x21 + 2x1 − 3 = 0 and therefore x = 1. Thus, the system (3.3), and therefore the system (3.1), has two solutions x1 = −1, x2 = 1 and x1 = 1, x2 = 2.  If a system has three or more variables, once we have eliminated one of the variables as above, we can repeat the process with the remaining equations and try to eliminate a new variable. Observe that this process of eliminating variables is similar to Gauss elimination algorithm for systems of linear equations. Note that it may not be possible to obtain Gr¨obner bases where all the variables but one have been eliminated from some equation. However, recall that the uniqueness of reduced Gr¨obner bases depend on the monomial order considered. Hence, if in the reduced Gr¨obner basis with respect to the order >glx , for instance, no equation with only one variable exists, it may be the case that considering another order, such as the order >lx for example, such equation occurs. In fact, whenever the system satisfies some suitable conditions, using the order >lx ensures that in the corresponding reduced Gr¨obner basis an equation with only one variable always exists. We do not further develop this subject herein and refer the reader to [10], for instance. 
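Since the last polynomial of the reduced Gröbner basis in Example 3.3.37 involves only x2, the solutions are obtained by back-substitution. As a quick cross-check (not part of the method itself), a lexicographic Gröbner basis computed directly from (3.1) already eliminates x1, and the built-in Solve applied to system (3.3) recovers the same two solutions.

(* with the lexicographic order, the basis of the ideal of (3.1) should contain a
   polynomial in x2 alone, playing the role of x2^2 - 3 x2 + 2 *)
GroebnerBasis[{x1 x2 - x1 - x2 + 1, x1^2 x2 - 2 x1^2 + 2 x1 x2 - 4 x1 + x2 - 2},
  {x1, x2}, MonomialOrder -> Lexicographic]

Solve[{x1 x2 - x1 - x2 + 1 == 0, x1^2 + 2 x1 - 4 x2 + 5 == 0, x2^2 - 3 x2 + 2 == 0},
  {x1, x2}]
(* expected (possibly listed in another order): {{x1 -> -1, x2 -> 1}, {x1 -> 1, x2 -> 2}} *)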
3.4 3.4.1 Motivating examples revisited Equivalence of digital circuits We address the problem of how Gr¨obner bases can be used for checking the equivalence of two combinational circuits. As we have discussed in Section 3.1.1, we can check the equivalence of two combinational circuits checking whether two suitable propositional formulas are 3.4. MOTIVATING EXAMPLES REVISITED 119 equivalent. A combinational circuit with one output variable computes a Boolean function and can be represented by a propositional formula. Let P be a set (of propositional symbols). The set FP of propositional formulas over P is inductively defined as follows: • p ∈ FP for every p ∈ P ; • if ϕ1 , ϕ2 ∈ FP then (¬ϕ1 ) ∈ FP and (ϕ1 ⇒ ϕ2 ) ∈ FP . The connectives ∧, ∨ and ⇔ can be defined as abbreviations as expected (for details see [9]). A valuation over P is a map v : P → {0, 1}. A valuation v over P can be extended to propositional formulas over P considering the map v : FP → {0, 1} such that v(p) = v(p) for every p ∈ P , and such that v(¬ϕ1 ) = 1 − v(ϕ1 ) and v(ϕ1 ⇒ ϕ2 ) = 1 − v(ϕ1 ) + v(ϕ1 ) × v(ϕ2 ) for every ϕ1 , ϕ2 ∈ FP . Given ϕ ∈ FP , we say that ϕ is satisfiable whenever there is a valuation v over P such that v(ϕ) = 1, and we say that ϕ is valid whenever v(ϕ) = 1 for all valuations v over P . We first describe how a propositional formula is converted into a polynomial in Z2 [x1 , . . . , xn ]. For simplicity, in the sequel we use x1 , x2 , . . . as propositional symbols. Definition 3.4.1 Let P = {x1 , . . . , xn } with n ∈ N. The function conv : FP → Z2 [x1 , . . . , xn ] is inductively defined as follows: • conv(xi ) = xi for each 1 ≤ i ≤ n; • conv(¬ϕ) = 1 − conv(ϕ); • conv(ϕ1 ⇒ ϕ2 ) = conv(ϕ1 ) × conv(ϕ2 ) − conv(ϕ2 ).  It is straightforward to conclude that conv(ϕ1 ∨ ϕ2 ) = conv(ϕ1 ) × conv(ϕ2 ) and conv(ϕ1 ∧ ϕ2 ) = conv(ϕ1 ) + conv(ϕ2 ) − conv(ϕ1 ) × conv(ϕ2 ). Moreover, conv(ϕ1 ⇔ ϕ2 ) = conv((ϕ1 ⇒ ϕ2 ) ∧ (ϕ2 ⇒ ϕ1 )) (see Exercise 36 in Section 3.5). Example 3.4.2 Consider ϕ1 = x1 ∨ (¬x1 ) and ϕ2 = (¬x1 ) ∨ (x1 ∧ x2 ). Then, conv(ϕ1 ) = = = = conv(x1 ) × conv(¬x1 ) conv(x1 ) × (1 − conv(x1 )) x1 × (1 − x1 ) −x21 + x1 . Proof: If 1 is in the reduced Gr¨obner basis of (p. . . . . If 1 is in the reduced Gr¨obner basis of the ideal (p. . The following proposition establishes how Gr¨obner bases can be used to check the validity of a propositional formula.x2 ]]:=conv[x2]conv[x1]-conv[x2]. pn ). As a consequence. . .x2 ]]:=conv[x1]conv[x2]. un ∈ Z2 . . . .    pm = 0 has no solutions.35. .4. .3. .23: Converting propositional formulas into polynomials Let ϕ be a propositional formula and let p ∈ Z2 [x1 . xn ] then ϕ is valid. . ¬ϕ is not satisfiable . x1]]]. conv[neg[x ]]:=1-conv[x]. . . conv[and[x1 . pn ) over Z2 [x1 . . un ) 6= 0 for all u1 . . .x2 ]]:=conv[neg[or[neg[x1]. by Proposition 3. .imp[x2. . . Given that pi (u) = 0 for each 1 ≤ i ≤ n and u ∈ Z2 . the system  p=0     p =0 1  .3 Consider P = {x1 .23 convert propositional formulas into polynomials. . . Let p = conv(¬ϕ) and pi = conv(xi ∨ (¬xi ))) for each 1 ≤ i ≤ n. . then this reduced Gr¨obner basis is {1} and therefore. Figure 3.5). . x2]. un ) = 0 (see Exercise 37 in Section 3. ϕ1 also corresponds to x21 + x1 and conv(ϕ2 ) to x21 x2 + x21 + x1 + x2 .neg[x2]]]]. .x2 ]]:=conv[and[imp[x1. then p(u1 . Proposition 3. . . . It is straightforward to conclude that ϕ is satisfiable if and only if there are u1 . conv[imp[x1 . xn } and ϕ ∈ FP . 
POLYNOMIALS conv(ϕ2 ) = = = = conv(¬x1 ) × conv(x1 ∧ x2 ) (1− conv(x1 ))×(conv(x1 )+ conv(x2 )− conv(x1 )× conv(x2 )) (1 − x1 ) × (x1 + x2 − x1 × x2 ) −x21 x2 − x21 + x1 + x2 Given the properties of the field Z2 . . p1 . . .  The Mathematica rewriting rules in Figure 3. . un ∈ Z2 such that p(u1. conv[eqv[x1 .120 CHAPTER 3. . p1 . . xn ] be the polynomial conv(ϕ). conv[or[x1 . . After some computations we get conv(¬(ϕ1 ⇔ ϕ2 )) = x61 x42 + x61 x22 + x51 x32 + x41 x42 + x51 x2 + x41 x22 + x21 x42 + x41 x2 + x21 x22 + x1 x32 + x42 + x21 + x22 + x1 + 1. g2 . we check whether ϕ1 ⇔ ϕ2 is valid . the circuits are equivalent. Assuming g4 = 1. g3 ). g3 ) over Z2 [x1 . g2 ) −→ 1. B(gi . respectively. We can already conclude that {1} is the reduced Gr¨obner basis of the ideal (g1 .121 3. Let us illustrate equivalence of circuits with the following example. g4 } is a Gr¨obner basis of (g1 . From Example 3. g2 . Therefore. g3 . Herein. The other steps of the Buchberger algorithm are not necessary. i 6= j. j ≤ 4.2 we have that conv(x1 ∨ (¬x1 )) = x21 + x1 and. g3 ). g2 . g2 . To this end we can check if 1 is in the reduced Gr¨obner basis of the ideal over Z2 [x1 . The interested reader is referred to [23]. g2 ) = x51 x42 + x61 x22 + x51 x32 + x41 x42 + x51 x2 + x41 x22 + x21 x42 + x41 x2 + x21 x22 + x1 x32 + x42 + x21 + x22 + x1 + 1 and H 0 B(g1 . similarly conv(x2 ∨ (¬x2 )) = x22 + x2 . g2 . we have to convert the formulas ¬(ϕ1 ⇔ ϕ2 ). for all 1 ≤ i.  .4 In this example we are going to test the equivalence of two circuits using Gr¨obner basis.4. g3 . . x1 ∨ (¬x1 ) and x2 ∨ (¬x2 ) into polynomials in Z2 [x1 . . we do not give details on how to convert combinational circuits into propositional formulas. Hence. gj ) reduces to 0 modulo {g1 . x2 ] where g1 = conv(¬(ϕ1 ⇔ ϕ2 ) Observe that g2 = conv(x1 ∨ (¬x1 )) g3 = conv(x2 ∨ (¬x)) B(g1 . Consider circuits 1 and 2 corresponding to propositional formulas ϕ1 and ϕ2 . To check the equivalence of the two circuits we are going to check if 1 is in the reduced Gr¨obner basis of the ideal (g1 .4. MOTIVATING EXAMPLES REVISITED and therefore ϕ is valid. where ϕ1 = (¬x1 ) ∨ (x1 ∧ x2 ) and ϕ2 = (¬x1 ) ∨ x2 . g4}. xn ] generated by the polynomial corresponding to ¬(ϕ1 ⇔ ϕ2 ) and the polynomials corresponding to the formulas xi ∨ (¬x1 ) for each propositional symbol xi in ϕ1 ⇔ ϕ2 . . x2 ]. . that is.4. Example 3. QED To check whether two circuits corresponding to the formulas ϕ1 and ϕ2 are equivalent we check whether the formulas are equivalent. First. G = {g1 . x6 ]. v1 = cos β. b.4. To this end we have to find the solutions of the system of polynomial equations  l2 v2 v1 + l3 v3 v1 − a = 0      l2 v2 u1 + l3 v3 u1 − b = 0     l2 u2 + l3 u3 − c = 0  u21 + v12 − 1 = 0      u22 + v22 − 1 = 0    u23 + v32 − 1 = 0 where l2 and l3 are the lengths of the arm links. x5 = u3 and x6 = v3 . Thus. . b. v2 = cos θ1 . ui and vi for i = 1.4. For simplicity. The system has 6 variables. p2 . where p1 = x4 x2 + x6 x2 − 1 p4 = x21 + x22 − 1 p2 = x4 x1 + x6 x1 − 1 p5 = x23 + x24 − 1 p3 = x3 + x5 − 1 p6 = x25 + x26 − 1 considering the order >lx . x4 − x6 . p5 . POLYNOMIALS Inverse kinematics In Section 3. 3. x25 − 21 . x4 = v2 . u3 = sin θ3 .122 3. and u2 = sin θ1 . we want to determine what are the suitable angles β. p3 . p4 . Let us assume that a = b = c = l2 = l3 = 1. We also assume a change of variables: x1 = u1 .3. Let us compute the reduced Gr¨obner basis of the ideal (p1 . 
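Going back briefly to the circuit-equivalence test of Example 3.4.4 (the kinematics system is taken up again below), its conclusion can be cross-checked with the built-in GroebnerBasis over Z2, using the polynomial conv(¬(ϕ1 ⇔ ϕ2)) computed above together with conv(x1 ∨ (¬x1)) = x1^2 + x1 and conv(x2 ∨ (¬x2)) = x2^2 + x2. A reduced Gröbner basis equal to {1} certifies the equivalence.

g1 = x1^6 x2^4 + x1^6 x2^2 + x1^5 x2^3 + x1^4 x2^4 + x1^5 x2 + x1^4 x2^2 +
     x1^2 x2^4 + x1^4 x2 + x1^2 x2^2 + x1 x2^3 + x2^4 + x1^2 + x2^2 + x1 + 1;
GroebnerBasis[{g1, x1^2 + x1, x2^2 + x2}, {x1, x2}, Modulus -> 2]
(* expected: {1}, so the two circuits are equivalent; the polynomial g1 can also be
   produced automatically with the conv rules of Figure 3.23 *)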
the system becomes  x2 x4 + x2 x6 − 1 = 0      x1 x4 + x1 x6 − 1 = 0     x3 + x5 − 1 = 0  x21 + x22 − 1 = 0      x23 + x24 − 1 = 0    x25 + x26 − 1 = 0 where the polynomials are already ordered. x26 − 12 }. x3 . 2. c) of the end effector of a particular robot. c. We use Gr¨obner bases as described in Section 3. x2 = v1 .2 we have described the inverse kinematics task: given the intended coordinates (a. v3 = cos θ3 . x2 − 2x6 .1. x3 = u2 . x1 − 2x6 .2 CHAPTER 3. where u1 = sin β. Note that the polynomials are already ordered according to this order. p6 ) over R[x1 . we are going to solve this system for particular values of a. l2 and l3 . x5 . x4 . θ1 and θ2 at the base and joints for reaching that position. x2 . Using the Buchberger algorithm we get {x3 + x5 . Compute the quotient and the remainder of the division of p by d where (a) p = 2x1 6 + x1 5 + x1 4 + x1 and d = x1 2 − x1 are polynomials in Q[x1 ]. +. 8. x2 ]. . EXERCISES Hence. xn ]. d1 = x31 + x3 and {d1 . x3 ]. 0. +. Prove that >lx and >glx are well founded total orders on monomials in x1 .5. . 4. (c) p = 3x21 x2 −x21 +x1 x2 and d = 2x1 x2 + x1 are polynomials in Z5 [x1 . where + and × are the sum and product of polynomials and −p is the symmetric of p for ecah p ∈ C[x1 . can be computed just solving the system  x3 + x5 = 0      x1 − 2x6 = 0     x − 2x = 0 2 6 . Prove that p −→ 0. assuming a = b = c = l2 = l3 = 1.  x4 − x6 = 0      x25 − 12 = 0    x26 − 21 = 0 3. Prove that p −→ 2x3 . x2 ]. the solutions of the original system. . (d) p = x1 x2 x33 + 4x22 x33 + 2x21 x33 + x1 x2 x23 and d = 2x1 x3 + x2 x3 are polynomials in Z7 [x1 . Prove that every ring of polynomials is unitary. .5 Exercises 1. xn ]. . . 6. x2 ]. . 5. . . (b) p1 = 6x21 x32 + 4x22 and p2 = 5x31 + 3x51 x2 are polynomials in Z7 [x1 . x2 . −. Compute p1 + p2 and p1 × p2 where (a) p1 = 3x2 + 4x and p2 = 2x3 + 3x2 are polynomials in Z5 [x].d2 } d2 = x32 + x3 + 2 in Z5 [x1 . ×). . ×). 2. x2 ]. x2 . 0. −. (b) p = 2x21 x2 − x21 + x1 x2 + x1 and d = x1 x2 + x1 are polynomials in Z3 [x1 . Consider the polynomials p = 3x31 x32 + 5x21 x32 − 6x31 x2 + 2x21 x22 − 10x21 x2 − 4x21 {d} and d = x1 x22 − 2x1 in R[x1 . 7. xn . Prove that 0 ∈ I for every ideal I over a ring (A. x3 ]. . Prove that (C[x1 . is a ring over C. Consider the polynomials p = 2x31 x32 + x31 x3 + x32 x3 . . 3.123 3. Let C be a field and n ∈ N. . Consider the polynomials g1 = x31 +x3 and g2 = x32 +x3 +2 in Z5 [{x1 . b ∈ J} (c) I − J = {i − j : i ∈ I. xn ]. x3 }]. g2 = 4x22 x3 +x3 and g3 = 2x22 +3 in Z5 [{x1 . POLYNOMIALS 9. (a) Prove that {g1 . +. . x2 . g2 . 9) over Z is the set of multiples of 3. (c) if g is a G-irreducible nonzero polynomial then g ∈ / G. . g2 = x1 x3 + x22 + x1 + 6x2 and g3 = 4x1 x22 + 4x21 in Z7 [{x1 . . b2 ∈ I and a1 . 10. Consider the polynomials g1 = 2x22 + 2x1 . ×). Consider the polynomials g1 = x1 x3 +x22 +2. x2 . 12. . . . Prove that {g1 . . −. p1 ) −→ 0. . Prove that B(p1 . g3 ). (c) Compute the reduced Gr¨obner basis of (g1 . g2 ) × n : n ∈ Z}. 0. Prove that the set Zeven of the integer numbers that are even is an ideal over Z. . a2 ∈ A. . Let (A. g3 } is a Gr¨obner basis of the ideal (g1 . . g2 . x2 . g2 . g2 . ×) be a commutative unitary ring and let I ⊆ A. 11. . Prove that (a) I ∩ J (b) I + J = {a + b : a ∈ I. xn ] and g ∈ C[x1 . . g3 ) 18. −. x3 }]. g3 ) and check whether x1 x22 x3 + x1 x3 + 1 ∈ (g1 . Let I and J be ideals over the same ring A = (A. . xn ]. x3 }]. Let p1 . p2 ) −→ 0 if and only if B(p2 . . . . g2 . 
(b) Check whether 2x21 x3 + 6x1 x22 + 6x21 + 5x1 x2 ∈ (g1 .124 CHAPTER 3. g2 . . g2 ) and check whether 2x31 x32 + x31 x3 + x32 x3 ∈ (g1 . g3 ). p2 be two polynomials in C[x1 . g3 ). Let I be an ideal over C[x1 . j ∈ J} are also ideals over A. 15. . Prove that {g1 . . Prove that the ideal (6. 0. xn ] and let D be a finite subset D D of C[x1 . xn ]. g3} is a Gr¨obner basis of the ideal (g1 . +. G a finite subset of C[x1 . 17. 14. Prove that (a) {g} is a Gr¨obner basis of the ideal (g). g2. 13. (b) if G is a basis of I and 1 ∈ G then G is a Gr¨obner basis. 16. Prove that I is an ideal over the ring if and only if I 6= ∅ and ((a1 × b1 ) + (a2 × b2 )) ∈ I for every b1 . g2 } is a Gr¨obner basis of the ideal (g1 . Prove that the ideal (g1 . g2 ). g2 ) over Z is the set {gcd(g1 . . Let I1 = (x1 .3. x2 ]. . p −→ p′′ and p′ and p′′ are G-irreducible. 21. g2 . x3 ]. g2 ). g2 ). Prove that G is a Gr¨obner basis. −x2 +x3 . . (a) Check whether {g1 . x2 ]. −x2 2 +x3 x4 . Consider the polynomials g1 = x1 . Consider the polynomials g1 = x21 + 2 and g2 = x21 + 4x2 in Z5 [x1 . (b) Compute the reduced Gr¨obner basis of (g1 . Let G be a set of polynomials in C[x1 . 20. . x2 ]. −x1 −x4 . g2 } is a Gr¨obner basis of the ideal (g1 . 3x2 ) and I2 = (4x1 . x2 ]. Check whether x21 x23 + x22 x3 ∈ (g1 . . 26. −x21 + 2x2 ) be ideals over Z7 [x1 . −x21 + 2x2 ) be ideals over Z7 [x1 . g2 . (c) Check whether 4x1 2 x2 + 3x2 3 ∈ (g1 . . x2 . . x3 + x4 } be a Gr¨obner basis of the ideal generated by G. EXERCISES 125 19. −x2 x3 x4 + x3 x4 2 . g2 } is a Gr¨obner basis of the ideal (g1 . Let I1 = (4x1 + 2x2 . Compute the reduced Gr¨obner basis of the ideal generated by G. xn ] we have that p′ = p′′ whenever p −→ p′ . Consider the polynomials g1 = x22 + x23 . g3 ). g2 . g2 ). g2 ). (c) Check whether 4x1 2 x2 2 + 3x1 ∈ (g1 . (b) Compute the reduced Gr¨obner basis of (g1 . 3x2 ) and I2 = (x21 . x3 +x1 . Prove that I1 ∩ I2 = I2 .Compute the reduced Gr¨obner basis of the ideal (g1 . g2 = x21 x2 + x2 x3 and g3 = x33 + x1 x2 in Z2 [x1 . xn ] and assume that for all G G p ∈ C[x1 . . x2 ]. . Compute the reduced Gr¨obner basis of the ideal (g1 . x2 ]. . Let G be a finite set of polynomials in Z2 [{x1 . Check whether x21 + x32 ∈ (g1 . Let G1 and G2 be Gr¨obner bases of the ideals I1 and I2 over C[x1 . xn ]. 27.5. x2 . x2 ]. 3x21 − x1. 29. g2 ). . 3x21 − x2 . x4 }] and let the set G′ = {x1 −x2 . g2 ). x3 . 24. g2 ). . respectively. Check whether x1 2 + x2 x4 is in the ideal generated by G. Consider the polynomials g1 = 3x1 x2 − 2x1 and g2 = −2x2 2 + 3x1 in Z5 [x1 . Use the properties of Gr¨obner bases to determine whether I1 ⊆ I2 . 22. (a) Check whether {g1 . . g3 ). g3 ). Compute the reduced Gr¨obner basis of the ideal (g1 . . g2 ). x2 +x4 . Compute the reduced Gr¨obner basis of the ideal (g1 . g3 ). Consider the polynomials g1 = −3x1 x2 + 2 and g2 = 3x1 − 1 in Z5 [x1 . 25. g2. Prove that I1 ∪ I2 = I1 ∩ I2 . Consider the polynomials g1 = x31 and g2 = x41 + 4x2 in Z5 [x1 . g2 = x1 x2 − x2 and g3 = x2 + 1 in Z2 [x1 . 23. 28. Solve the following system of polynomial equations in R using Gr¨obner bases ( 2 x1 + x1 x2 = 1 x1 x2 + x22 = 2 33. xn ]. Solve the following system of polynomial equations in R using Gr¨obner bases  2   x1 + x2 + x3 − x3 − 1 = 0 x1 + 2x2 + x23 − 1 = 0   x1 + x2 + x23 − 3 = 0 35. . . . . Prove that the system p1 = 0. . 31. 37. xn ] be the polynomial conv(ϕ). Let p1 . . . . . . . . . . . . . . . . . Let G and H be reduced Gr¨obner bases of an ideal I over C[x1 . . 
(b) conv(ϕ1 ∧ ϕ2 ) = conv(ϕ1 ) + conv(ϕ2 ) − conv(ϕ1 ) × conv(ϕ2 ). pm ∈ C[x1 . 32. POLYNOMIALS 30. pm = 0 has no solutions in C if 1 ∈ (p1 . . . there is an enumeration g1 . un ) = 0. gk of the elements of G and an enumeration h1 . xn ]. . hk of the elements of H such that lt(gi ) = lt(hi ) for each 1 ≤ i ≤ k. . . moreover. un ∈ Z2 such that p(u1 . . . . . Solve the following system of polynomial equations in Z5 using Gr¨obner bases ( x21 − x22 = 1 x21 x2 − x32 − x2 = 1 34. Solve the following system of polynomial bases  2   x1 x2 x3 + x2 x3 + x2 = x1 x3 + x2 x3 − x3 =   x22 + x2 x3 = equations in Z3 using Gr¨obner 0 −1 2 36.126 CHAPTER 3. pm ). Prove that (a) conv(ϕ1 ∨ ϕ2 ) = conv(ϕ1 ) × conv(ϕ2 ). . . Then G and H have the same number of polynomials and. . Let ϕ be a propositional formula and let p ∈ Z2 [x1 . (c) conv(ϕ1 ⇔ ϕ2 ) = conv((ϕ1 ⇒ ϕ2 ) ∧ (ϕ2 ⇒ ϕ1 )). . . . . Prove that ϕ is satisfiable if and only if there are u1 . . . letting k be the number of polynomials in G. . x2 }] resulting from their conversion. . (a) Present the polynomials in Z2 [{x1 . EXERCISES 38.127 3. 39. Check whether they are equivalent using Gr¨obner bases. x2 }] corresponds to the formula ¬(((x1 ∨ (¬x2 )) ∧ (x1 ∨ (¬x1 ))) ⇔ ((¬x1 ) ⇒ (¬x2 ))). Consider the digital circuits corresponding to the formulas (¬x1 ) ∧ x2 and ¬((¬x2 ) ∨ x1 ). Consider the digital circuits corresponding to the formulas (x1 ∨ (¬x2 )) ∧ (x1 ∨ (¬x1 )) and (¬x1 ) ⇒ (¬x2 ). assuming that the polynomial 1 + x1 + x31 + x41 + x51 + x71 + x81 + x31 x2 + x41 x2 + x51 x2 + x71 x2 + x21 x22 + x61 x22 + x71 x22 + x41 x32 + x71 x32 + x41 x42 + x61 x42 + x81 x42 in Z2 [{x1 .5. (b) Check whether they are equivalent using Gr¨obner basis. POLYNOMIALS .128 CHAPTER 3. is denoted by X f (k) (4. . In Section 4. The goal of this chapter is to provide several techniques for computing summations.1) k∈A and is the sum of all images f (k) for every k ∈ A. can be written as b X f (k). that is. when A is an integer interval [a. In Section 4. of f on A. Moreover. linear algebra and sorting. We give illustrations on analysis of algorithms in Bioinformatics. Given a set of integers A. the set of all integers between a and b. . the sum. In Section 4.1 we present a motivating example in Bioinformatics. we illustrate the relevance of summations to the analysis of the Gauss elimination technique and the insertion sort algorithm. The Euler-Maclaurin formula is presented in Section 4. .b]. then X k∈A f (k) = f (a1 ) + · · · + f (an )..1). we present several techniques to compute summations. In Section 4. say. then the summation (4. an }.3. k=a Summations play an important role in several different fields from science to engineering. a set of numbers B. . and a map f : A → B. A = {a1 . When A is a finite set.5 we present some exercises.4. or summation.Chapter 4 Euler-Maclaurin formula In this chapter. 129 .2 we introduce summation expressions and some of their relevant properties. A DNA string. This means that the element in the k-th position of the list w differs from the one in the k-th position of the list p. r]]. w[[m + 1]]. . If they all match we are done. .Module[{i. a prefix of w). . where the symbols represent the molecules: adenine (A). w[[3]]. If the value of r is True then the pattern p is a subsequence of the word w. This step consists in comparing the pattern p with w[[2]. SPatMatch=Function[{w. Figure 4. . Pattern matching of DNA is related to the search a sequence of letters in a larger sequence. 
r=False.1: Naive pattern matching algorithm The function SPatMatch in Figure 4. cytosine (C). Otherwise. At this point the first iteration of the outer loop ends and we start the next iteration. s=True. C. for instance.130 4. While[j<=Length[p]&&s. guanine (G) and thymine (T). we necessarily reach a position 1 ≤ k ≤ m such that w[[k]] 6= p[[k]]. Let n be the length of the word w and let m be the length of the pattern p. i=0. the genome. j=1.1 receives two lists.1 CHAPTER 4. s=s&&w[[i+j]]===p[[j]]. the problem we have to address is to determine if a given sequence (pattern) is a subsequence of another sequence (word). respectively. or gene. corresponding to the word and the pattern.s}. EULER-MACLAURIN FORMULA Motivation Bioinformatics is related to searching and data mining DNA sequences. A naive algorithm to solve this problem is presented in Figure 4. is a sequence in the alphabet {A.j. From the computational point of view. While[i<=Length[w]-Length[p]&&!r. r=r||s.1. stored in w and p. The first iteration of the outer loop consists in comparing the pattern p with the first m elements of w. i=i+1]. the pattern p is a subsequence of w (in fact. j=j+1]. .r. that is. It returns a Boolean value stored in variable r. T }.p}. G. C. it yields (n − m + 1)m. is inductively defined as follows: . w = {C. that is. the case when the mismatch occurs always in the last iteration of the inner loop (eg. It will finish if either a pattern is found or k + (m − 1) > n. i=0 j=1 In this case the summation is very simple to compute. at the k-th step the algorithm compares the pattern p with w[[k]. . . j=1 Moreover. C.131 4. w[[k + 1]]. Let I be a nonempty set of integer variables. We can consider. w[[k + (m − 1)]]. C. in particular. For instance. the comparisons w[[i+j]] === p[[j]]. the total number of comparisons is given by the following summation: ! n−m m X X 1 . C} and p = {C. In this case the number of comparisons executed in the inner loop is m X 1. In worst-case analysis. C. . 4. A}). Observe that the maximum number of such comparisons occurs when the algorithm always executes m comparisons at each iteration of the outer loop. C. C. we concentrate on characterizing the worst possible situation.2. C. This number should be expressed as a function of the input sizes of both w and p. Syntax We introduce integer and real expressions. The set of integer expressions EI is the smallest set containing all integer numbers and variables that is closed under addition and multiplication.2 Expressions In this section. We now analyze the complexity of the algorithm in Figure 4. EI . EXPRESSIONS If no pattern is found.1 in terms of the number of comparisons performed between the input word w and pattern p. worst-case analysis. C. on evaluating the maximum number of comparisons needed to determine wether or not the pattern p occurs in the word w. since the inner loop is executed when i ranges from 0 to n − m. That is. we introduce the syntax and semantics of summation expressions. that is. . m. logarithm and summation. (e1 × e2 ). An assignment ρ is a map that assigns to each variable a real number. We need to compare two expressions e1 and e2 . j. we use the letters i. z. Semantics We will interpret integer expressions as integers and real expressions as real numbers. multiplication. • (e1 + e2 ). Hence. the denotation is a partial function [[·]]ρ : EX 6→ R defined as follows: . for real variables. the denotation is a function that associates an expression with a real number. 
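The worst-case bound (n − m + 1)m can also be observed directly by instrumenting the matcher. The sketch below is a hypothetical variant of SPatMatch that returns the number of element comparisons instead of the Boolean result; on a word containing no occurrence of the pattern, with the mismatch always at the last position of the pattern, the count is exactly (n − m + 1)m.

countSPatMatch = Function[{w, p},
   Module[{i = 0, j, r = False, s, c = 0, n = Length[w], m = Length[p]},
     While[i <= n - m && !r,
       j = 1; s = True;
       While[j <= m && s,
         c = c + 1;                        (* one comparison w[[i+j]] === p[[j]] *)
         s = s && w[[i + j]] === p[[j]];
         j = j + 1];
       r = r || s;
       i = i + 1];
     c]];

countSPatMatch[ConstantArray["C", 20], Append[ConstantArray["C", 4], "A"]]
(* expected: (20 - 5 + 1)*5 = 80 *)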
We use the following abbreviations: • e1 for e1 × e−1 2 . namely. checking whether e1 = e2 or e2 ≤ e2 . That is. for integer variables and the letters x. For this purpose we need the notion of assignment. i1 . EULER-MACLAURIN FORMULA • Z ∪ I ⊆ EI . (ee12 ). ! d2 X • e for every d1 . • −d for (−1) × d. . n. As before we can omit parentheses when no ambiguity occurs. that is. . loge1 (e2 ) ∈ EX for every e1 . • (d1 + d2 ). x1 . Given an assignment. y1. d2 ∈ EI and e ∈ EX . Otherwise stated. . some expressions can not be interpreted.g. . i2 . . x2 .132 CHAPTER 4. y. exponentiation. . ρ : X → R such that ρ(i) ∈ Z for all i ∈ I. The set of real expressions EX is the smallest set containing all real numbers and variables and is closed under addition. In the sequel we can omit parentheses when no ambiguity occurs. The following usual simplifications will be used: • d1 d2 for d1 × d2 . k. • e1 e2 for e1 × e2 . However. k=d1 Observe that EI is contained in EX . j1 . e. Let X be a set of real variables such that I ⊆ X. (d1 × d2 ) ∈ EI for every d1 . e2 • −e for (−1) × e. e2 ∈ EX . 01 . d2 ∈ EI . EX is inductively defined as follows: • R ∪ X ⊆ EX . Similarly for e1 ≤ e2 . Example 4. undefined ( otherwise.2. undefined otherwise. if [[d1 ]]ρ ≤ [[d2 ]]ρ and both terms are defined.  0      "" d ##   "" d ## 2  X 2  X [[e]]ρ′ + e • e =  k=d1 +1 ρ  k=d1  ρ       undefined where ( ρ(x) ρ′ (x) = [[d1 ]]ρ if [[d1 ]]ρ > [[d2 ]]ρ . otherwise. To check if e1 = e2 amounts to verify if [[e1 ]]ρ = [[e2 ]]ρ for all ρ such that both denotations are defined. [[e1 ]]ρ × [[e2 ]]ρ  [[e2 ]]ρ    [[e1 ]]ρ    if [[e1 ]]ρ and [[e2 ]]ρ are defined. ([[e1 ]]ρ = 0 and[[e2 ]]ρ > 0) or ([[e1 ]]ρ < 0 and [[e2 ]]ρ ∈ Z). if x 6= k otherwise. EXPRESSIONS • [[a]]ρ = a for each a ∈ R.2. if [[e2 ]]ρ is defined and ([[e1 ]]ρ > 0 or undefined • [[loge1 (e2 )]]ρ = otherwise. log[[e1 ]]ρ ([[e2 ]]ρ ) if [[e1 ]]ρ > 0 and [[e2 ]]ρ > 0.133 4.1 In order to check that 2 X k2 = 5 k=1 holds. • [[x]]ρ = ρ(x) for each x ∈ X. ( [[e1 ]]ρ + [[e2 ]]ρ • [[e1 + e2 ]]ρ = undefined • [[e1 × e2 ]]ρ = • [[ee12 ]]ρ = ( if [[e1 ]]ρ and [[e2 ]]ρ are defined. we have to establish that "" 2 X k=1 k2 ## ρ = [[5]]ρ . otherwise. Indeed: ## ## "" 2 "" 2 X X k2 = 12 + k2 k=1 k=1+1 ρ = 1 + 22 + "" ρ 2 X k2 k=1+1+1 ## ρ = 1 + 4 + 0 = 5 = [[5]]ρ . by the induction hypothesis. Let ρ. Then also [[d1 ]]ρ′ > [[d2 ]]ρ′ and therefore "" d2 X k=d1 e′ ## ρ =0= "" d2 X k=d1 e′ ## ρ′ .134 CHAPTER 4. Step: We only consider the cases where e is e1 + e2 and where e is a summation. Otherwise. If e ∈ R then [[e]]ρ = e = [[e]]ρ′ . 2. ρ′ be such that ρ(x) = ρ′ (x) for every variable x occurring (2) e is dk=d 1 in e and [[e]]ρ and [[e]]ρ′ are both defined. e ∈ X and then [[e]]ρ = ρ(e) = ρ′ (e) = [[e]]ρ′ . 2. Let ρ.1) [[d1 ]]ρ > [[d2 ]]ρ .2 Let e ∈ EX . Then [[e]]ρ = [[e]]ρ′ for all assignments ρ. [[e1 + e2 ]]ρ = [[e1 ]]ρ + [[e2 ]]ρ = [[e1 ]]ρ′ + [[e2 ]]ρ′ = [[e1 + e2 ]]ρ′ . P2 e′ . [[di ]]ρ = [[di ]]ρ′ for i = 1. ρ′ such that ρ(x) = ρ′ (x) for every variable x occurring in e when both denotations are defined. Proof: The proof follows by structural induction Basis: e ∈ R ∪ X. ρ′ be such that ρ(x) = ρ′ (x) for every variable x occurring in e and [[e]]ρ and [[e]]ρ′ are both defined. ◭ The denotation of an expression only depends on the values assigned to the variables occurring in the expression. (2. (1) e is e1 + e2 . By the induction hypothesis. Let ρ. Lemma 4. ρ′ be assignments such that ρ(x) = ρ′ (x) for every x ∈ R occurring in e. 
Then [[ei ]]ρ = [[ei ]]ρ′ for i = 1. EULER-MACLAURIN FORMULA for every assignment ρ. Hence.2. As a consequence. by the induction hypothesis of the structural induction. Proposition 4. = [[e′ ]]ρ′′ = [[e′ ]]ρ′′′ = e′ k=d2 k=d2 ρ ρ′ Induction hypothesis: "" d2 X k=d2 −n+1 ′ e ## ρ = "" d2 X ′ e k=d2 −n+1 ## (4. using also (4. d1 . ρ′′ (x) = ρ′′′ (x) for every x ∈ X occurring in e.3 Let e. Basis: n = 0. Hence. Let ρ′′ be the assignment such that ρ′′ (x) = ρ(x) for x 6= k and ρ′′ (k) = [[d2 ]]ρ and let ρ′′′ be the assignment such that ρ′′′ (x) = ρ′ (x) for x 6= k and ρ′′′ (k) = [[d2 ]]ρ′ . Again ρ′′ (x) = ρ′′′ (x) for every x ∈ X occurring in e′ . Note that proving (4.3). We have to prove that ## ## "" d "" d 2 2 X X = e′ e′ k=d1 k=d1 ρ (4. c ∈ EX .2.2) [[d1 ]]ρ ≤ [[d2 ]]ρ .2) is equivalent to proving "" d ## "" d ## 2 2 X X = e′ e′ k=d2 −n k=d2 −n ρ ρ′ The proof follows by induction on n.2. d′1 .3) ρ′ Step: Let ρ′′ be the assignment such that ρ′′ (x) = ρ(x) for x 6= k and ρ′′ (k) = [[d2 − n]]ρ and let ρ′′′ be the assignment such that ρ′′′ (x) = ρ′ (x) for x 6= k and ρ′′′ (k) = [[d2 − n]]ρ′ . [[e′ ]]ρ′′ = [[e′ ]]ρ′′′ . and therefore in e′ . EXPRESSIONS (2. e′ .135 4. by the induction hypothesis of the structural induction. ## ## "" d "" d ## "" d ## "" d 2 2 2 2 X X X X e′ = e′ = [[e′ ]]ρ′′′ + e′ = [[e′ ]]ρ′′ + e′ k=d2 −n ρ k=d2 −n+1 ρ k=d2 −n+1 ρ′ k=d2 −n ρ′ QED We now establish some properties of summation that we will use latter on for symbolically reasoning with summations.2) ρ′ Let n = [[d2 ]]ρ − [[d1 ]]ρ (= [[d2 ]]ρ′ − [[d1 ]]ρ′ ). ## "" d ## "" d 2 2 X X e′ . Then the following properties hold: . and therefore. Clearly. Then also [[d1 ]]ρ′ ≤ [[d2 ]]ρ′ . Hence. d2 . d′2 ∈ EI and k ∈ I such that k does not occur in c. [[e′ ]]ρ′′ = [[e′ ]]ρ′′′ . k=d1 d2 X c= k=d1 ( 0 if d1 < d2 c(d2 − d1 + 1) otherwise. Distributivity d2 X ce = c k=d1 d2 X e.4) ρ for every assignment ρ where the denotation is defined. 5. Associativity d2 X (e + e′ ) = k=d1 d2 X e+ k=d1 3. .136 CHAPTER 4. Change of variable d2 X e= d+d X2 ek(k−d) = ek(d−k) k=d−d2 k=d+d1 k=d1 d−d X1 where ekd′ is the expression obtained from e by replacing the occurrences of k by d′ . We have to prove that "" d2 X k=d1 ## ce ρ = "" c d2 X k=d1 ## e (4. Proof: We only present the proof of first property. Constant d2 X e′ . 4. k=d1 2. EULER-MACLAURIN FORMULA 1. leaving the proofs of the other properties as exercises. Additivity of indices   0    d2 d3 Pd2 e X X k=d e+ e = Pd3 1   k=d2 +1 e k=d1 k=d2 +1   P   d3 e k=d1 if d1 > d2 and d2 + 1 > d3 if d1 ≤ d2 and d2 + 1 > d3 if d1 > d2 and d2 + 1 ≤ d3 otherwise. We consider two cases. 4) is equivalent to showing "" d2 X k=d2 −n ## ce = ρ "" c d2 X ## e k=d2 −n . Then. Then. "" d2 X ## ce k=d1 ρ = 0 = [[c]]ρ × 0 = [[c]]ρ × "" ## d2 X e k=d1 = ρ "" c d2 X k=d1 ## e . let n = [[d2 ]]ρ − [[d1 ]]ρ .137 4. "" d2 X k=d2 ## ce ρ = [[c e]]ρ′ = [[c]]ρ′ × [[e]]ρ′ = [[c]]ρ × "" d2 X ## e k=d2 = ρ "" c d2 X k=d2 ## e ρ since [[c]]ρ′ = [[c]]ρ given that k does not occur in c. Take ρ′ to be the assignment such that ρ′ (x) = ρ(x) for x 6= k and ρ′ (k) = [[d2 ]]ρ . . ρ The proof follows by induction on n. Then. ρ (2) [[d1 ]]ρ ≤ [[d2 ]]ρ . EXPRESSIONS (1) [[d1 ]]ρ > [[d2 ]]ρ . Induction Hypothesis: "" d2 X k=d2 −n+1 ## ce ρ = "" c d2 X k=d2 −n+1 ## e . Observe that showing (4. Basis: n = 0. ρ Step: Take ρ′ to be the assignment such that ρ′ (x) = ρ(x) for x 6= k and ρ′ (k) = [[d2 − n]]ρ .2. to reason about summations in a symbolic way (that is. not invoking semantic arguments). . 
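Before the symbolic manipulations that follow, the properties of Proposition 4.2.3 can also be spot-checked numerically with the built-in Sum; the sample map f below is a hypothetical choice.

f[k_] := k^2 + 3 k;
Sum[5 f[k], {k, 2, 9}] == 5 Sum[f[k], {k, 2, 9}]                       (* distributivity *)
Sum[f[k] + k, {k, 2, 9}] == Sum[f[k], {k, 2, 9}] + Sum[k, {k, 2, 9}]   (* associativity *)
Sum[f[k], {k, 2, 9}] == Sum[f[k], {k, 2, 6}] + Sum[f[k], {k, 7, 9}]    (* additivity of indices *)
Sum[f[k], {k, 2, 9}] == Sum[f[11 - k], {k, 2, 9}]                      (* change of variable, d = 11 *)
(* each comparison is expected to yield True *)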
We will check the following property 5 X j=3 Clearly. EULER-MACLAURIN FORMULA "" d2 X k=d2 −n ## ce "" = [[c e]]ρ′ + d2 X ## ce k=d2 −n+1 ρ = [[c]]ρ′ × [[e]]ρ′ + "" ρ d2 X c = [[c]]ρ × [[e]]ρ′ + [[c]]ρ ×  = [[c]]ρ [[e]]ρ′ + = [[c]]ρ × = "" c "" d2 X k=d′ "" d2 X k=d2 −n ρ d2 X k=d2 −n+1 d2 X k=d2 −n+1 (induction hypothesis) e k=d2 −n+1 "" ## ## e ρ ##  e  ρ ## e ρ ## e ρ QED Symbolic reasoning We will use the properties described in Proposition 4. e+ 7 X j=4 e= 7 X j=3 e+ 5 X j=4 e. together with the wellknown properties of real expressions.3.138 CHAPTER 4.2. 2. We will consider closed forms for the summations of members of both arithmetic and geometric progressions.4 motivating example revisited In Section 4. In fact. In many cases.3.1.139 4. we only need the constant sequence property. ◭ We will use symbolic reasoning to simplify summations. .1 we had to compute the summation ! n−m m X X 1 i=0 j=1 which can now be computed as follows: n−m X i=0 m X j=1 1 ! = n−m X m (constant) = m(n − m + 1) (constant) i=0 using Proposition 4. EXPRESSIONS 5 X j=3 e+ 7 X e= j=4 5 X 5 X e+ j=3 = 7 X e+ j=3 = e+ j=4 5 X 7 X 7 X j=5 e j=5 e+ j=3 5 X ! + ! e 5 X (additivity) e j=4 e (additivity) j=4 Using the above properties of summations we can now compute the worst-case number of comparisons carried out by the algorithm presented in Section 4.2. e′ is called a closed form for the summation. In this case. given a summation d2 X e k=d1 ′ we want to obtain an expression e such that d2 X e = e′ k=d1 and e′ does not have summations.2. Example 4. 2 k=0 The equality is established as follows. EULER-MACLAURIN FORMULA Example 4. All arithmetic progressions are of the form uk = (c + rk). We would like to consider the summation of the first n + 1 members of {uk }k∈N0 .140 CHAPTER 4.5 We start by finding a closed form for the sum of members of an arithmetic progression. Recall that an arithmetic progression is a sequence {uk }k∈N0 such that uk+1 −uk is the same for all k ∈ N0 . k=0 The following equality holds for n ≥ 0 n X  rn  (c + rk) = c + (n + 1). (1) n X (c + rk) = k=0 n−0 X (c + r(n − k)) (change of variable) k=n−n = n X k=0 (c + rn − rk) (2) 2 n X (c + rk) = k=0 n X (c + rk) + k=0 = n X k=0 = n X n X k=0 (c + rn − rk) (c + rk + c + rn − rk) (associativity) (2c + rn) k=0 = (2c + rn) n X 1 (distributivity) k=0 = (2c + rn)(n + 1) (constant) . that is n X (c + rk).2. 2. k=0 The following equality holds n X cr k = k=0 c − cr n+1 . 1−r In fact: (1) n X k=0 cr k ! + cr n+1 = n+1 X cr k (additivity) k=0 = cr 0 + n+1 X cr k (additivity) k=1 0 = cr + n X cr k+1 (change of variable) k=1 =c+r n X k=1 cr k (distributivity) . Recall that an geometric progression is a sequence {uk }k∈N0 such that uk+1 uk is the same for every k ∈ N0 . All geometric progressions are of the form uk = cr k . We would like to consider the summation of the first n + 1 members of {uk }k∈N0 .2.6 We now find a closed form for the sum of members of an geometric progression.141 4. that is n X cr k . EXPRESSIONS (3) n X  rn  (c + rk) = c + (n + 1) 2 k=0  Example 4. on the other hand. It works as follows.5 and 4. an+1 . when we rewrite both the expressions for sn+1 in terms of sn we obtain an equation for sn whose solution is a closed form of the summation. and another by splitting off its first term. The perturbation technique is an useful way of finding closed forms of summations. In fact. EULER-MACLAURIN FORMULA (2) n X cr k = k=0 c − cr n+1 1−r  As a consequence of Examples 4.142 CHAPTER 4. 
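The closed forms for the sums of arithmetic and geometric progressions discussed in this section (Examples 4.2.5 and 4.2.6) can be confirmed symbolically with the built-in Sum and Simplify; the perturbation argument for sn is developed next.

Simplify[Sum[c + r k, {k, 0, n}] == (c + r n/2) (n + 1)]         (* arithmetic: expected True *)
Simplify[Sum[c r^k, {k, 0, n}] == (c - c r^(n + 1))/(1 - r)]     (* geometric, r != 1: expected True *)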
on one hand we have sn+1 = n+1 X k=0 ak = n X ak + an+1 k=0 and. Thus. sn+1 = n+1 X ak = a0 + n+1 X ak k=1 k=0 = a0 + n X ak+1 k=0 using also a change of variable. a0 . using additivity. In many cases. Assume we want to find a closed form for the summation n X ak k=0 P where n > 0. We can rewrite sn+1 in two different ways: one by splitting off its last term.2.6 we get the equalities n X n(n + 1) k= 2 k=0 n X and rk = k=0 1 − r n+1 1−r where n ≥ 0 and r 6= 1. For simplicity let sn abbreviate nk=0 ak .2. we get the equation n X k=0 ak + an+1 = a0 + n X k=0 ak+1 . . . n .5 we get n+1 sn + (n + 1)a 0 = 0a + n X (k + 1)ak+1 k=0 and. Let sn Pn technique k abbreviate k=0 ka .6 for sn we obtain the intended closed form − 1 − an+1 1 n+1 a (n + 1) + a . Note that this is not always possible. 1−a (1 − a)2  Summations over countable sets can also be considered. using associativity and distributivity for rewriting the right-hand side in terms of sn . Example 4.7 Consider the summation n X kak k=0 where n is a nonnegative integer and a is an integer not equal to 1.2.5 in terms of sn in such a way that solving the equation for sn results in a closed form for sn .143 4.5) k=0 The goal is now to express the right-hand side of 4. Using equation 4. a∈N0 f (a) = lim na=0 f (a). Let us illustrate this method with the following example. in particualr. . an } is a finite set then n X X f (a) = f (ai ).2. the following equation involving the unknown sum sn sn + (n + 1)an+1 = asn + a 1 − an+1 1−a (4. If A = {a1 . P whenever this real number L exists. Let us use the perturbation to find a closed form for this summation. Note that. EXPRESSIONS that is sn + an+1 = a0 + n X ak+1 (4. Let f : A → B be a map where A is a countable set and B is set of real numbers. . n n X X n+1 k sn + (n + 1)a =a ka + a ak k=0 k=0 that is.6) Solving equation 4. i=1 a∈A If A is an infinite countable set then X f (a) = L a∈A P where L = min({L ∈ R : a∈C f (a) ≤ L forPall finite C ⊂ A}). 7) where {Bk }k∈N0 is the sequence of Bernoulli numbers inductively defined as follows  k  1 X k+2 B0 = 1 and Bk+1 = − Bj (4. Then the Euler-Maclaurin formula is as follows: n X f (k) = k=0 Z 0 n f (x) dx − B1 (f (n) + f (0)) + p X Bk k=2 k!  f (k−1) (n) − f (k−1) (0) + Rp . p ≥ 2. The central result was discovered independently by Leonhard Euler and Colin Maclaurin in the XVIII century. is the remainder term such that Z 2ζ(p) n . (4.3 Main results One of the most important techniques to obtain a closed formula for a sum is by approximation to an integral. Let f : [0. EULER-MACLAURIN FORMULA 4.8) j k + 2 j=0 and Rp . Denote by f (p) the pth derivative of f .144 CHAPTER 4. n] → R be p times differentiable. . (p) . . 9) Given a non-negative integer n and an integer k the binomial coefficient is defined as follows:  n!     if 0 ≤ k ≤ n n (n − k)!k! = k   0 otherwise. Hence. we can see the binomial coefficient as representing the number of k-element subsets of an n-element set. f (x) dx. it represents the number of ways that k objects can be chosen among n objects when order is irrelevant. that is. Informally. (1 + x) = k k=0 There is also an alternative recursive definition of binomial coefficient:     n n = =1 0 n where  n k  =  n−1 k−1  +  n−1 k  .  ∞  X n n xk . |Rp | ≤ (2π)p 0 (4. Its name is derived from the fact that they constitute the coefficients of the series expansion of a power of a binomial. 
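Solving equation (4.6) for sn gives the closed form sn = a(1 − a^(n+1))/(1 − a)^2 − (n + 1)a^(n+1)/(1 − a), and the recursive characterization of the binomial coefficients is Pascal's rule. Both can be spot-checked exactly for small concrete values (the choices n = 6 and a = 3 below are arbitrary).

With[{n = 6, a = 3},
  Sum[k a^k, {k, 0, n}] == a (1 - a^(n + 1))/(1 - a)^2 - (n + 1) a^(n + 1)/(1 - a)]
(* expected: True *)

Table[Binomial[n, k] == Binomial[n - 1, k - 1] + Binomial[n - 1, k], {n, 1, 6}, {k, 1, n - 1}]
(* expected: only True entries (Pascal's rule) *)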
Bernoulli numbers are associated to the interesting fact that the first computer program is recognized to be a Bernoulli calculation program for specialized calculus operations in the Babbage’s engine.4. ζ is the Riemmann zeta function.9). program proposed by Ada Byron to Charles Babbage approximately in the year 1850. Since 1 < ζ(p) < 2 for every integer p ≥ 2.145 4. In (4. EXAMPLES which is the reason why the Pascal’s triangle is valid. it holds that 4 |Rp | ≤ (2π)p Z n 0 . (p) . . f (x). 3 2 6 k=0  Observe that the Euler-Maclaurin formula gives an exact value for the summation whenever f (k) is a polynomial k p . . Proposition 4. 4.  The Euler-Maclaurin formula is useful when obtaining a closed formula of a Pn summation k=0 f (k) where f (p) is 0 for some natural number p.4 Examples We further illustrate the relevance of summations in algorithm analysis.8) we get that B1 = −1/2.1 Consider the map f such that f (k) = k 2 and take p = 3. Then f (p) is such that f (p) (x) = 0 and so Rp = 0.3. then we analyze the insertion sort algorithm. B2 = 1/6 and B3 = 0. dx. Example 4. Using (4.7) we get n X 1 1 1 k 2 = n3 + n2 + n. in this case f (p+1) is such that f (p+1) (x) = 0 and therefore Rp+1 is 0.2 For each n ∈ N and p ∈ N0 n X 1 kp = p+1 k=0  p  X p+1 k=0 k Bk (n + 1)p+1−k ! .3. The proof of the validity of the Euler-Maclaurin formula can be seen in [24]. for any p ∈ N. We first consider the Gaussian elimination technique. Moreover f (1) (n) = 2n and f (2) (n) = 2 and so from the formula (4. In fact. The Mathematica function GaussElimination in Figure 4. j ≤ n.10). The objective is to obtain an equivalent system of equations A′ x = b′ .. the map f : N → N0 such that f (n) is the number of arithmetic operations involving matrix entries performed by the algorithm when it receives as input a n × n matrix is in O(λn. we only need to say how to obtain aij and bi (k−1) b(k−1) for 0 < k < i. It is straightforward to prove that A(k) x = b(k) is equivalent to Ax = b for k = 0.n3 ). det(A) 6= 0. EULER-MACLAURIN FORMULA 4. that is. A(k) is a triangular matrix up to column k. Hence the system has a unique solution x = A−1 b. (4. Assuming that akk 6= 0 then (k−1) (k−1) (k) aij = (k−1) aij − aik (k−1) a (k−1) kj akk from A(k−1) and (k−1) bi = (k−1) bi − aik (k−1) akk (k−1) bk .2 is cubic in the order of the input matrix.146 CHAPTER 4. b(0) = b. Assume that A is nonsingular. 2.. (k) (k) From 2 and 3. Proposition 4. The output is a triangular matrix Aux and a vector baux such that the linear equation systems Ax = b and Aux x = baux are equivalent.. Moreover. .10) The Gaussian elimination technique described above can be easily implemented in Mathematica as depicted in Figure 4. . that is. .n−1} such that 1. 3. We now analyze the algorithm in terms of the number of arithmetic operations involving matrix elements. The Gaussian elimination technique computes the solution x. The first k rows of A(k−1) are equal to the first k rows of A(k) . Note that this number only depends on the number of rows and columns of the given matrix.2. that is. A is such that the denominators in (4. A(0) = A. For this purpose.1 The algorithm for Gaussian elimination presented in Figure 4. . A(n−1) = A′ and b(n−1) = b′ . The most inner loop computes the elements of Aux and baux as described in (4. a system of equations with the same solution as Ax = b and where the computation of the solution is much easier.10) are always nonzero. It consists of three nested loops that implement the Gaussian elimination technique described above.. 
we define a sequence of systems of equations {A(k) x = b(k) }k∈{0.4. n − 1.1 Gaussian elimination We now briefly describe Gaussian elimination technique (for a general description see [29] and [13]).4. Consider a system Ax = b of n linear equations with n variables where n > 0. .2 receives as input a nonsingular square matrix A and a vector b such that the dimension of b is equal to the order of A. Module[{k. Aux=A.i.k]]. j=j+1].n.2: Gaussian elimination in Mathematica 147 . i=k+1. baux[[i]]=baux[[i]]-m*baux[[k]]. Figure 4.j]].j. Aux[[i.b}.Aux. i=i+1]. m=Aux[[i. While[k<=n-1.j]]=Aux[[i.4. Aux[[i. baux=b. k=k+1].k]]=0.j]]-m*Aux[[k.baux}.baux}]]. While[j<=n.m.4. j=k+1.k]]/Aux[[k. n=Length[b]. {Aux. EXAMPLES GaussElimination= Function[{A. While[i<=n. k=1. there are ! n n X X 1 1+ j=k+1 i=k+1 subtractions. the execution of each iteration performs n X 1 j=k+1 subtractions plus the subtraction baux[[i]]-m*baux[[k]]. the total number of sums and subtractions is ! n−1 X n n X X 1 1+ k=1 i=k+1 j=k+1 We can use the techniques in Section 4.j]]-m*Aux[[k. More precisely. n−1 X n X k=1 i=k+1 1+ n X j=k+1 1 ! = n−1 X n X (1 + (n − k)) (constant) k=1 i=k+1 = n−1 X ((n − k)2 + (n − k)) (constant) k=1 n−1 X = (k 2 + k) (change of variable) k=1 = n−1 X k=1 k2 ! + n−1 X k=1 k ! (associativity) .j]] and therefore the execution of the loop involves exactly n X 1 j=k+1 subtractions. Looking now at the next loop.148 CHAPTER 4.11) 3 j=k+1 k=1 i=k+1 In fact. The execution of each iteration of the most inner loop performs the subtraction Aux[[i. EULER-MACLAURIN FORMULA Proof: Consider a n × n input matrix. Thus. we prove that ! n n−1 X n X X n3 − n 1 = 1+ (4. Finally.2 to compute this summation. (1) To begin with let us count the number of sums and subtractions involving matrix elements. 4. given a list w of numbers. 6 j=1 j=1 j=k+1 k=1 i=k+1 (3) From (1) and (2) we conclude that the map f is such that f (n) = n3 − n 2n3 + 3n2 − 5n 5n3 + 3n2 − 7n + = 3 6 6 for each n ∈ N. . among them.149 4. . we can reason as in the case of sums and subtractions and conclude that the total number of such multiplications and divisions is ! n n−1 X n X X 2+ 1 .5 n−1 X k= k=1 n(n − 1) 2 (2) With respect to multiplications and divisions involving matrix elements. .n3 ). EXAMPLES The equality (4. . 4. The function consists of two nested loops. wk are already ordered and the goal is to put wk+1 in the proper position with respect to w1 . .3 receives as input a list w to be ordered and it gives as output the corresponding ordered list v. wk .2 QED Insertion sort The problem of sorting consists in.4. At the kth step of the insertion sort algorithm the list w is such that the elements w1 . k=1 i=k+1 j=k+1 The computation of this sum is similar to the one presented in (1) and we get ! ! ! n n−1 n−1 n n−1 X X X X X 2n3 + 3n2 − 5n 2+ 1 = j2 + 2 j = . The Mathematica function InsertSort in Figure 4. At the kth . An implementation in Mathematica of the insertion sort algorithm is given in Figure 4.11) then follows since from Example 4.2. There are many sorting algorithms. obtaining an ordered list that is a permutation of w. At the beginning v is set to w. the reader can consult [2]).3. and therefore f ∈ O(λn. . the insertion sort algorithm (for more details on sort algorithms.1 n−1 X (n − 1)3 (n − 1)2 n − 1 k = + + 3 2 6 k=1 2 and from Example 4.3. . . Figure 4. v=w. Then. In the last two steps −10 and 8 are inserted in the correct positions. The first k elements of v are the first k elements of w already ordered. 
an upper bound for this value.4. Then.Module[{v. −5 is compared with 2 and becomes the first element of v ′ . m=v[[i]].j. j=j-1]. at least. v[[j+1]]=v[[j]]. 43. i=2. i=i+1].3: Insertion sort algorithm in Mathematica iteration of the outer loop the list v is considered to be divided in two parts. A run of the insertion sort algorithm for the input list w = {2.m}.150 CHAPTER 4. This run has four steps. While[i<=Length[w]. 43 is compared with the last element of v ′ and remains in the original position since it is greater than 2. v is the intended output list. The ordered sublist is extended to the k + 1th position by inserting the k + 1th element of w in the correct position The loop ends when there are no more elements to sort. The elements of v and w from position k + 1 to the last position are the same. In the analysis of a sorting algorithm the relevant operations are the comparisons between list elements carried out by the algorithm to sort the list.i. While[j>0&&v[[j]]>m. In the first step. We denote by v ′ the ordered part of the list and by v ′′ the remaining part. 8} is depicted in Figure 4. v[[j+1]]=m. v]]. −10. This number depends on the length n of the list and also on how “unsorted” the list is. j=i-1. −5. EULER-MACLAURIN FORMULA InsertSort=Function[w. . Worst-case analysis In worst-case analysis we concentrate on finding the maximum number of comparisons needed to sort an input list of length n or. We now analyze the insertion sort algorithm in terms of the number of comparisons v[[j]] > m performed by the algorithm to sort a list v of length n. 4. −10. 8. 43 . 8} | ′ {z } |′′ {z } v v ⇓ v = {−10. −5. 2. 43. 2.4: A run of the insertion sort algorithm Proposition 4. Proof: Consider an input list w = {w1 . to insert the i-th element in the correct position we have to compare it with all the elements to its left. n X i=2 (i − 1) = n−1 X j=1 j= (n − 1)n . . Note that Mathematica evaluates conjunctions in order. 2 .4 is. for instance. −5. the map f : N → N0 such that f (n) is the maximum number of comparisons between list elements performed by the algorithm when it receives as input a list of length n is in O(λn. wn } of length n ≥ 1. In fact.4. 43 .2. Since 2 ≤ i ≤ n. 8} | {z } | ′′ {z } ′ v v ⇓ v = {−5. Hence. The maximum number of comparisons occurs whenever the input list is sorted by (strict) decreasing order. wi > wi+1 holds for all 1 ≤ i < n. in this case. that is.n2 ). that is. −5. −10. in the worst-case. −10. 43 |{z}} {z } ′′ | ′ v v Figure 4.151 4. . 2. i=2 Using a change of variable and Example 4. 43. 2 .5.2 The insertion sort algorithm presented in Figure 4. when evaluating j>0&&v[[j]]>m. quadratic in the length of the input list. . |{z} 8 } | {z } ′′ ′ v v ⇓ v = {−10. EXAMPLES v = {|{z} 2 . i − 1 comparisons are executed. the total number of comparisons to sort the input list is given by the value of the summation n X (i − 1). 8} {z } | ′ ′′ v v ⇓ v = {−5. if j>0 is false then v[[j]]>m is not evaluated. that is. Harmonic numbers constitute the discrete version of the natural logarithm.3 For each n ∈ N ln(n) ≤ Hn ≤ ln(n) + 1.4 below. Recall that the (i − 1)-th step of the outer loop of the function in Figure 4. Hn = n X 1 i=1 i ≥ Z 1 n 1 dx = ln(n). EULER-MACLAURIN FORMULA Hence.12) The sum (4. i 1 x i=2 On the other hand. ranging from 1 to i. but it is easy to establish upper and lower bounds for the value of Hn . (4. the probability of the element to be placed in any position is the same (uniform distribution).4. On one hand. 
x QED We now present the average-case analysis of the insertion sort algorithm presented in Figure 4. In the sequel we assume that the probability of the element to be placed in any position is 1 i that is.4. Proposition 4.12) is the n-th harmonic number and it is usually denoted by Hn .3.152 CHAPTER 4. . There is no closed form for Hn . the average-case analysis of the insertion sort algorithm involves sums such as n X 1 i=1 i . the map f is such that n2 − n (n − 1)n = 2 2 for each n ∈ N.3 places the i-th element of the list in the right ordered position.n2 ). Z n n X 1 1 Hn − 1 = ≤ dx = ln(n). Proof: The result follows by definition of the Riemann integral. As we will see in the proof of Proposition 4. and therefore f ∈ O(λn. f (n) = QED Average-case analysis The average-case analysis is in general more elaborated than the worst-case analysis since it involves probabilistic hypothesis. the average-case number of comparisons to place the i-th element in the correct position is ! i X 1 1 (i − 1) + (i − k + 1) . Furthermore.5) . the number of required comparisons is i − k + 1. ranging from 1 to i.4. Recall that we are assuming that the probability of this element to be placed in any position is the same.3 the i-th element of the list is placed in the right ordered position. quadratic in the length of the input list. that is. Moreover. Since i i X 1 k=2 1X (i − k + 1) (i − k + 1) = i i k=2 (distributivity) i−2 = 1X (j + 1) i j=0 = 1 i(i − 1) i 2 = i−1 2 (change of variable) (Example 4. if it is placed in the first position the number of required comparisons is i − 1. the map f : N → N0 such that f (n) is the average number of comparisons between list elements performed by the algorithm when it receives as input a list of length n ≥ 1 is in O(λn. that is.4 is. (1) At the (i − 1)-th step of the outer loop of the function in Figure 4. i If the i-th element is placed in position k ≥ 2. i i i=2 k=2 Let us find a closed form of the above summation.4 The insertion sort algorithm presented in Figure 4. i i k=2 Hence. 1 .153 4.4. EXAMPLES Proposition 4. in the average case. the average-case number of required comparisons to sort the input list is given by the summation !! i n X X 1 1 (i − 1) + (i − k + 1) .n2 ).2. Proof: Consider an input list of length n ≥ 1. B2 and B3 .5) (2) The map f is then such that f (n) = n X i=2 1 (i − 1) + i i X 1 k=2 i !! (i − k + 1) n2 + 3n = − 4 n X 1 i=1 i ! for each n ∈ N.3 it follows that f (n) ≤ n2 + 3n − ln(n) 4 for each n ∈ N. Compute the Bernoulli numbers B1 .2. Prove that Pn k n n (a) k=1 3k2 = 6(1 − 2 + 2 n) .154 CHAPTER 4. 2. EULER-MACLAURIN FORMULA we have that n  X 1 i=2 i i X 1 (i − 1) + = k=2 i !! (i − k + 1) n  X 1 i−1 (i − 1) + i 2 i=2 = n X 1+i 2 i=2 1 = 2 n X ! − ! (i + 1) i=0 i=1 ! n X 1 i=2 i i n X 1 − n X 1 n2 + 3n − = 4  i=1 i (associativity) ! − ! 1 2 (distributivity) (Example 4. 4.n2 ).4. and therefore f ∈ O(λn. 3. Prove that the Bernoulli number Bk is equal to 0 for every odd integer k greater than 1. From Proposition 4.5 QED Exercises 1. ncol}. Prove that k=0 n is a positive integer. j=1. a=a*h[m.j.nlin. . Consider the Mathematica function f that receives as input a matrix m of integer numbers and computes an integer a using a function h. 
Compute the following sums where n is a positive integer (a) (b) (c) (d) (e) (f) (g) (h) Pn k Pn 2 k=0 (n2 Pn + 3k) k=−2 n(k k=0 (6k Pn k=0 3 k k=1 (3k 2 Pn k=0 (6(n k=−1 ((n Pn+2 − k) + n2k + 3k) k Pn+1 Pn+1 2 k=−3 k + k3k + 5k) − k)2 + k2k ) − k)5n−k+2 + 2(k − 4)) 3 Pn−1 Pn−1 (bk+1 − bk )ak+1 ) where (ak+1 − ak )bk = an bn − a0 b0 − ( k=0 5.j]. ncol=Length[First[m]]. j=j+1]. 6. a=1. 4.5.a.155 4. i=2. nlin=Length[m]. While[j<=ncol.Module[{i.i. f=Function[{m}. While[i<nlin. EXERCISES (b) (c) (d) (e) (f) Pn k k=1 2k(−2) Pn k=1 k Pn i=1 2 n3 3 = = 49 (−1 + (−2)n + 3(−2)n n) + n2 2 + n 6 n2 (n+1)2 4 i3 = Pn 2 i=1 i(i+1) = Pn 1 k=1 3(k+k 2 ) 2n 1+n = n 3n+3 where n is a positive integer. j]]==x.found}. j=j+1]. a]] Compute the number of multiplications performed by an evaluation of f[m].j. 8.i. While[j≤n.j]]==x is evaluated during an evaluation of matrixMemberQ[m.i. j=1. j=1.j] performs i + j multiplications.i]. n=Length[w]. a=1.n}. i=i+1]. i=1. found]] Assuming that the integer x is uniformly distributed in the matrix of integers m. a=a+p[i+1]. Consider the Mathematica function matrixMemberQ implementing an algorithm for linear search.j. j=j+1]. matrixMemberQ=Function[{m. EULER-MACLAURIN FORMULA i=i+1]. f=Function[{w}. If[m[[i. where n > 1.Module[{i. Compute the number of multiplications performed by an evaluation of f[w]. i=1. a=w[[i. a]]. .found=True].156 CHAPTER 4. While[!found&&i<=Length[m]. While[i≤n-1.j]]*a*t[j. Consider the Mathematica function f that receives as input a n × n matrix w of integer numbers.y] performs x + y multiplications. While[!found&&j<=Length[m[[1]]]. i=i+1]. analyze the average-case number of times the expression m[[i. assuming that the evaluation of p[x] performs 3x multiplications and the evaluation of t[x.x}. 7. and computes an integer a using a function p and a function h.Module[{a. found=False. assuming that the evaluation of h[m.x]. i=i+1].menoresQ=False]. j=1.4.j]]<x.j]]<x is evaluated assuming that the rows of m are ordered in increasing order. j=j+1].menoresQ}. r]] Consider an evaluation of f[m.5.Module[{i. menoresQ=True. While[i<=Length[m]. If[m[[i. Consider the following Mathematica function f f=Function[{m. While[menoresQ&&j<=Length[m[[1]]].x}. (a) Analyze the average-case number of times the expression m[[i.x] where m is a matrix of integers and x is an integer.r. (b) Analyze the average-case number of times the expression m[[i.r=r+1. EXERCISES 157 9.j. i=.j]]<x is evaluated assuming that the rows of m are ordered in increasing order and that x occurs in each row exactly once. r=0. . 158 CHAPTER 4. EULER-MACLAURIN FORMULA . In Section 5.Chapter 5 Discrete Fourier transform In this chapter we introduce the discrete Fourier transform The discrete Fourier transform is widely used in many fields. for instance.3 we present the fast Fourier transform. In Section 5. 5.6.5. an efficient method for computing the discrete Fourier transform. and recall from Chapter 3 that their product is the polynomial 2n−2 X ck xk p×q = k=0 where. In section 5.1 we present a motivating example that illustrates the use of the discrete Fourier transform for efficient polynomial multiplication. Several exercises are proposed in Section 5. In Section 5.2 we introduce the discrete Fourier transform. Consider the polynomials of degree n − 1 p= n−1 X ai xi and i=0 q= n−1 X bj xj j=0 in R[x].1 Motivation Polynomial multiplication is crucial in many tasks. Image processing using the fast Fourier transfer is discussed in Section 5. 
for each 0 ≤ k ≤ 2n − 2 159 .4 we revisit polynomial multiplication based on the discrete Fourier transform. from signal processing to large integer multiplication and therefore efficient algorithms for polynomial multiplication are of utmost importance. ranging from image processing to efficient multiplication of polynomials and large integers. j ≤ n − 1. As Moreover. The main idea is depicted in Figure 5. it also involves computing (n − 1) + 2 i=1 a consequence. interpolation O(n log n) pointwise multiplication O(n) p×q point-value rep. It is based on the discrete Fourier transform and it also uses the point-value representation of polynomials. Finally. The interpolation can also be computed using FFT again with a O(n log n) number of sums and multiplications. p.160 CHAPTER 5. evaluation O(n log n) p. Then. It first involves the evaluation of p and q at suitable values (the complex roots of unity). computing n2 multiplications of real numbers. The celebrated Shor algorithm for factorizing integers in polynomial time in a quantum computer uses this technique. . thus obtaining point-value representations of p and q. we conclude that the naive way of computing the product of two polynomials of degree n − 1 in R[x] involves a O(n2 ) number of sums and multiplications of real numbers. p×q coefficient rep. named fast Fourier transform (FFT). from these point-value representations we get a point-value representation of p × q. the method can be described as follows. The naive way of computing the polynomial p × q involves computing ai × bj for each 0 ≤ i. q point-value rep. There is however a very efficient technic for computing the product of two polynomials in R[x] (or C[x]) that only involves a O(n log n) number of sums and multiplications. DISCRETE FOURIER TRANSFORM ck =              k X i=0 n−1 X ai × bk−i i=k−n+1 ai × bk−i if 0 ≤ k ≤ n − 1 if n − 1 < k ≤ 2n − 1. The evaluations of p and q can be computed with a O(n log n) number of sums and multiplications using a particular way of computing the discrete Fourier transform (DFT). Pn−2 i = (n − 1)2 sums.1. interpolation is used to obtain the coefficients of p × q from its point-value representation. Figure 5. that is. q coefficient rep.1: Polynomial multiplication Roughly speaking. .6). The set of complex numbers endowed with their addition and multiplication constitutes a field. Addition and multiplication of complex numbers are defined in the usual way. also in this case znk = cis( 2kπ n The n-th roots of unity enjoy several interesting properties. each integer number k is considered as the value of the evaluation of a suitable polynomial at a positive integer b. the set of complex number is called the algebraic closure of the set of real numbers. that is. also refer to znk for k ∈ Z. n − 1} The n-th root of unity zn = zn1 = ei n is the principal root. • (a + b i) + (c + d i) = (a + c) + (b + d)i. . 1.5.2 Discrete Fourier transform In this section we first recall some properties of complex numbers. . Herein. As expected. Then we introduce discrete Fourier transform and some related properties. of the nth-roots of unity. . The following properties are useful in the sequel (see Exercises 1 and 2 in Section 5. 1. For this purpose. Any complex number can be expressed as a + b i where a and b are real numbers and i is called the imaginary unit. .2. namely. given n ∈ N. we are particularly interested in the distinct n-th roots of the unity znk = ei 2kπ n 2π for k ∈ {0. . taking into account that i2 = −1. 
Recall that in the field of complex numbers C. • (a + b i) × (c + d i) = (ac − bd) + (ad + bc)i. . The coefficients of the polynomial correspond to the digits the representation of k in base b. 5. corresponding to the base we are considering. In the sequel we may 2kπ ) = ei n . 5. n − 1}.1 Complex roots of unity The set of complex number C is the extension of the set of real numbers for which all polynomials with real coefficients of degree n have precisely n roots (including multiplicity). . . that is (a + bi) = reiθ = r(cos(θ) + i sin(θ)) √ where r = a2 + b2 is called the modulus and θ = neg(a)π + arctan b/a is called the phase where neg(a) = 0 iff a > 0 and neg(a) = 1 otherwise. For this reason. DISCRETE FOURIER TRANSFORM 161 This technique can also be used for efficient multiplication of large integers.2. A useful representation of complex numbers is their polar form. each nonzero complex number reiθ has exactly n distinct n-th roots √ θ+2kπ n r ei n for k ∈ {0. Proposition 5.2 Let k ∈ N0 and n. .2. that is n zn0 .  Proposition 5..162 CHAPTER 5. DISCRETE FOURIER TRANSFORM Proposition 5.3 Let k ∈ N0 and n ∈ N such that k is not divisible by n. note that once we have computed the first n2 roots of unity..2. n +d d 2  zn = −zn and zn2k = z kn . i=0  . zn1 . Then znk = znn+k . 2 Using the above properties of the complex roots of unity we can speed up the computation of the n-th roots of unity when n is even. . Then.6).1 Let n ∈ N and k ∈ Z. +0 n 2 −1 roots: we have just to consider the = −zn0 = −zn1 n • znn−1 = zn2 +( n −1) 2 n = −zn2 −1 On the other hand... when we have already computed the n2 -th roots of unity. we can use those values to obtain half of the n-th roots: • zn0 = zn2×0 = z 0n 2 • zn2 = zn2×1 = z 1n 2 . −1) 2( n 2 • znn−2 = zn n = z n2 −1 2 Another relevant property is the following (see Exercise 3 in Section 5. On one hand. d ∈ N where n is an even number.2.. . Then n−1 X (znk )i = 0. zn2 then we can easily compute the other symmetric of the above since n n • zn2 = zn2 n +1 • zn2 .  Note that. (zn1 )n−1 . . . an−1 ). . . An m × n Vandermonde matrix for (α1 . . an−1 ) ∈ Cn where n ∈ N. a1 . a1 . . .. . This matrix is invertible and therefore the matrix product .2. .. . . αm ) is such that its entry at row i column j is αij . a1 . . . Therefore. . . . . . . −2. and it is denoted by DFTn (a0 . Definition 5. 1 znn−1 (znn−1 )2 . .4 Let ~a = (a0 . denoted by Vn (α1 . (zn0 )n−1 . . . for each 1 ≤ k ≤ n.2. Fourier transform of ~a is ~b = (b0 . 2 − 2i. 1. DFTn (a0 . an−1 )k denotes the kth component of the tuple DFTn (a0 . for this normalized version. zn1 . Moreover U = √1n V is unitary and so.. −2. . Example 5. .. an−1 ). the DFT can be seen as a change of basis preserving the norm. . in particular. . . . z42 = −1 and z43 = −i. . .5 Consider ~a = (3. In the sequel. and so each row i is a sequence of a geometric progression with ratio αi . Indeed DFTn (~a) = V · ~a where    V =  1 zn0 (zn0 )2 1 zn1 (zn1 )2 .2. 2 + 2i). . 1. . znn−1 ). . 0).  Observe that the discrete Fourier transform is a particular linear transformation described by a Vandermonde matrix. . DFTn (a) = a for each a ∈ Cn . a1 .2 Discrete Fourier transform We now present the discrete Fourier transform and some related properties.2. . . P P • b3 = 3j=0 aj z43j = 3j=0 aj (−i)j = 3(−i)0 + (−2)(−i)1 + (−i)2 = 2 + 2i. DISCRETE FOURIER TRANSFORM 5. P P • b1 = 3j=0 aj z4j = 3j=0 aj ij = 3i0 + (−2)i1 + i2 = 2 − 2i. . 6. . . (znn−1 )n−1      is the square Vandermonde matrix for (zn0 . . Observe that z40 = 1. . 
z41 = i. P P • b2 = 3j=0 aj z42j = 3j=0 aj (−1)j = 3(−1)0 + (−2)(−1)1 + (−1)2 = 6. bn−1 ) where bk = n−1 X The discrete aj znkj j=0 for each 1 ≤ k ≤ n − 1. . . .. P P • b0 = 3j=0 aj z40 = 3j=0 aj = 2. DFT4 (3. Then. since z10 = 1. αm ). .163 5. . . 0) = (2. 2. y1 . . . .164 CHAPTER 5. . . . ..7 Let Vn be the Vandermonde matrix V (zn0 . y1 . znn−1 ). . . Note that −(n − 1) < j ′ − j < n − 1 and therefore j ′ − j is not divisible by n..6 The inverse of  1 zn0  1 z1 n  V = . . Then   y0  y1  1   Vn−1  . Proposition 5. . . . yn−1 ) as the discrete Fourier transform of a suitable tuple.. . n−1 n−1 . .. zn1 .  = DFTn (y0 .  yn−1 .. yn−1. it equals 0 otherwise.. zn        Proof: For each 0 ≤ j... .3. . . . . . . . . Hence. .. 1 znn−1 the Vandermond matrix (zn0 )2 (zn1 )2 . . . the inverse of of the discrete Fourier transform can be computed also using the discrete Fourier transform. . V −1 × V is an identity matrix. yn−1 ) = (a0 ... Proposition 5.2. . yn−2. . n−1 2 (zn ) . .. an−1      −1 corresponds to DFT−1 can n (y0 . (zn ) is the matrix V −1   1  =  n  1 zn−1×0 zn−2×0 . (zn0 )n−1 ..      −(n−1)×0 zn −(n−1)×1 1 zn−1×1 zn−2×1 . a1 . Hence. the above summation equals 1 when j ′ = j and. an−1 ). .2. . .. . .. by Proposition 5.   V −1   yn−1       =   a0 a1 . DISCRETE FOURIER TRANSFORM  y0 y1 . . .. −1×(n−1) −2×(n−1) −(n−1)×(n−1) 1 zn zn . (zn1 )n−1 . . . . . ..  . . QED We can now write DFT−1 n (y0 . zn . . j ′ ≤ n − 1 the entry in row j + 1 and column j ′ + 1 of V −1 × V is n−1 1 X −kj kj ′ z z n k=0 n n ′ k(j−j ′ ) Since zn−kj znkj = zn . The matrix V be characterized as follows. y1 ) n  . DFT−1 n (y0 . p(znn−1 )). z42 = −1 and z43 = −i. y1 ) written a column matrix. we may associate to a tuple (a0 . . yn−2. that is. .2.2. p(zn1 ).1. . . a1 . . −2. . . zn−k = znn−k . Proof: Considering Vn−1 it holds that       y0 y1 . z41 = i. and therefore n−1 1X ak = yj zn(n−k)j . a1 . an−1      n−1 ak = 1 X −kj yj zn n j=0 for each 0 ≤ k ≤ n − 1. yn−1 ) = 1 DFTn (y0 . x2 − 2x + 3. . . DISCRETE FOURIER TRANSFORM where in the above equality we assume DFTn (y0 . P In fact. . yn−1      =   a0 a1 . .1) that is. yn−1 . yn−2.8 Consider again the tuple (3. . (5. yn−1. Another relevant way of presenting the discrete Fourier transform is based on polynomials. the inverse of the discrete Fourier transform can be computed using the discrete Fourier transform itself. . . . n j=0 QED Hence. . . . an−1 ) ∈ Cn the n−1 polynomial p = j=0 aj xj in C[x].. By Proposition 5. . 0). Recall that z40 = 1. y1 . . an−1 ) = (p(zn0 ). . Example 5.165 5. . Since • p(1) = 1 − 2 + 3 = 2 • p(i) = i2 − 2i + 3 = 2 − 2i • p(−1) = 1 + 2 + 3 = 6 . .2) We illustrate this fact with a simple example. . 1. Then DFTn (a0 .2.. . y1 ) n (5. . The polynomial p is in this case 0x3 + x2 − 2x + 3. Let ai be the coefficient of xi for each 0 ≤ i ≤ 3.2). computing qn−1 (u) takes n − 1 multiplications and n − 1 sums.9). n − 1 defined as follows: • q0 = an−1 • qj = qj−1 × x + an−(j+1) . Moreover. • q3 = q2 × x + a0 = (x2 − 4) × x + 2 = x3 − 4x + 2. Consider the sequence of polynomials qj in C[x] with j = 0. Hence. The naive way to evaluate the polynomial p used in (5.2. DISCRETE FOURIER TRANSFORM • p(−i) = (−i)2 + 2i + 3 = 2 + 2i we conclude that DFT4 (3. Step: Assuming that the degree is n. • q0 = a3 = 1. 1.9.. • q2 = q1 × x + a1 = x2 − 4. a1 = −4 and a0 = 2. 6. a3 = 1.9 Let p = i=0 ai xi be a polynomial in C[x]. 
Then p = qn−1 and the evaluation of qn−1 using the above sequence of polynomials involves n − 1 multiplications and n − 1 sums. . it would require O(n3 ) multiplications and sums. Pn−1 Proposition 5. Using Proposition 5.2. 2 − 2i. • q1 = q0 × x + a2 = x + 0 = x. 2 + 2i). By induction hypothesis. Proof: The proof follows by induction on the degree n − 1 of p.2. a2 = 0. This bound can be improved using the Horner’s rule (see Proposition 5. Basis: the degree is 0. −2. computing DFT reduces to evaluating a polynomial in the roots of the unity. we have a O(n2) number i=1 i = 2 of multiplications and sums.2). we do not take advantage of the previously computed value of ui−1 to get ui = ui−1 × u). taking into account Equality (5. If we use this naive way to evaluate DFTn . . QED Example 5. n − 1 = 0 and no multiplications or sums are performed.2.10 Consider the polynomial x3 − 4x + 2 of degree 3.2) at some value u consists of computing ai × ui for each 1 ≤ i ≤ n − 1 and then add all these values to a0 ..166 CHAPTER 5.. So computing p(u) takes n multiplication and n sums. then p(u) = qn (u) = qn−1 (u) ∗ u + a0 . if to compute ui we always perform i − 1 multiplications (that is. this evaluation involves Pn−1 n(n−1) multiplications and n − 1 sums.  Taking into account the equality (5. which uses only a O(n) number of multiplications and sums to evaluate p at u. Then. Then. 0) = (2. .1 j let p = j=0 aj x . . Consider the polynomials n • p0 = −1 2 X • p1 = −1 2 X a2j xj j=0 n a2j+1 xj j=0 Then p(u) = p0 (u2 ) + up1 (u2) for all u ∈ Cn . This method relies on some properties of the n-th roots of unity and it is usually referred to as fast Fourier transform (FFT). a1 .  The evaluation of DFTn needs O(n2 ) multiplications and sums using Horner’s rule and Equality (5. Note that we are not taking into account that the polynomial p to be evaluated is always the same throughout all components of DFTn and that p is evaluated at the roots of the unity. These two facts combined allow to further improve the complexity of computing DFTn to O(n log(n)) multiplications and sums. . an−1 ) whenever n is a power of 2. . As a consequence n a2j u2j and p1 (u2 ) = −1 2 X j=0 a2j+1 u2j . .3.3 Fast Fourier transform In this section we describe an efficient method to evaluate the discrete Fourier transform DFTn (a0 . In the following section we present the algorithm that achieves this complexity.167 5. Proposition Let (a0 . a1 . Proof: We equalities n p0 (u2 ) = −1 2 X j=0 hold. an−1 ) ∈ Cn where n is an even number and Pn−1 5.3. . . FAST FOURIER TRANSFORM Let us compute p(3): • q0 = a3 = 1 • q1 (3) = q0 × 3 + a2 = 3 + 0 = 3 • q2 (3) = q1 (3) × 3 + a1 = 33 − 4 = 5 • p(3) = q3 (3) = q2 (3) × 3 + a0 = 5 × 3 + 2 = 17. .2). 5. known as the Fast Fourier Transform. . . . DFT1 (a1 ) . DFT1 (an−1 ) Figure 5. we have that p(znk ) = p0 (zn2k ) + znk p1 (zn2k ) = p0 (z kn ) + znk p1 (z kn ) 2 for each 0 ≤ k ≤ n p(zn2 +k n 2 2 − 1. DFT1 (an−2 ) DFT n2 (a1 . . . . DFTn (a0 . an−2 ) .. where n is an even number. an−1 ) .. . Moreover. . b( n2 −1) + zn 2 c( n2 −1) . we have that n2 is also an even number and we can reason in a similar way with respect to p0 and p1 . . . a1 . .2).. n4 is also a even number and we can keep reasoning in this way until we only to have compute DFT1 (aj ) for 0 ≤ j ≤ an−1 (see Figure 5. b( n2 −1) − zn 2 c( n2 −1) ) where . . .. an−1 ) can be computed from DFT n2 (a0 . .168 CHAPTER 5. the evaluation at u of a polynomial p = j=0 aj xj of degree n − 1. an−1 ) = ( n −1) ( n −1) (b0 + zn0 c0 . .. . . 
DISCRETE FOURIER TRANSFORM n up1 (u2 ) = −1 2 X a2j+1 u2j+1 j=0 and therefore p(u) = p0 (u2 ) + up1 (u2 ). Hence. . . an−2 ) and DFT n2 (a1 . DFT1 (a0 ) . When considering the n-th roots of unity. a1 . Similarly. DFTn (a0 . . can be computed using the evaluation of two polynomials of degree less than or equal to n2 − 1 at u2 . a3 . the evaluations of p at the n-th roots of unity can be computed from znk for 0 ≤ k ≤ n2 − 1 and the evaluations of p0 and p1 at the n2 -th roots of unity.. a2 . . that is.2: Discrete Fourier transform Hence. . . an−1 ).. . .. . . a2 . . . Since n is a power of 2. ) = p(−znk ) = p0 (zn2k ) − znk p1 (zn2k ) = p0 (z kn ) − znk p1 (z kn ) 2 2 n 2 for each 0 ≤ k ≤ − 1. . . . . . . a3 . the discrete Fourier transform can be computed recursively as follows: DFTn (a0 . . b0 − zn0 c0 . a1 . QED Pn−1 Hence. . . an−1 ) DFT n2 (a0 . . . a3 .k.ptu.n/2-1}]. . Figure 5.0. • DFT n2 (a1 . au=Table[w[[2i+2]]. . + 2log2 (n) O(n/2log2 (n) ) log2 (n)O(n) O(n log2 (n)).r}. Let oFFT(n) be the number of sums and multiplications used in FFT for an input of length n. b( n2 −1) ). r[[k+1]]=ptz[[k+1]]+z*ptu[[k+1]].ptz. The algorithm to compute the DFT using this recursion is called Fast Fourier Transform (FFT). an−2 ) = (b0 . r=Table[0.3: FFT in Mathematica The analysis of the FFT algorithm follows straightforwardly. FAST FOURIER TRANSFORM 169 • DFT n2 (a0 .n/2-1}].1. a2 . r]]].au.{i. .az. 2π zp=EI n . . ptz=FFT[az]. r[[k+n/2+1]]=ptz[[k+1]]-z*ptu[[k+1]].{i. If[n==1. .{i. . . . . In Figure 5. n=Length[w].zp. z=1. . FFT=Function[{w}. az=Table[w[[2i+1]]. . For such input. z=z*zp]. c( n2 −1) ). an−1 ) = (c0 .k=k+1.0.Module[{n. . . So we have to find the solution for oFFT(n) = 2oFFT(n/2) + O(n) that is: oFFT(n) = = = O(n) + 2O(n/2) + 4O(n/4) + . . the FFT performs O(n) multiplications and sums. In the following example we illustrate the computation a discrete Fourier transform using the FFT algorithm. . ptu=FFT[au]. .z. assuming that the lenght of w is a power of 2.3 a Mathematica implementation of the FFT is given. . and makes two recursive calls of order n2 .5. .3.k<=n/2-1. For[k=0.n}]. . w. 1. −2) ) DFT1 (−2) = −2 DFT1 (0) = 0  . 0)1 = DFT1 (−2) + z20 DFT1 (0) = −2 + 0 = −2. 0)1 = DFT2 (3. 0)2 = DFT2 (3. 1) = (4. 2 + 2i) ) DFT2 (3. −2. 1)1 = DFT1 (3) + z20 DFT1 (1) = 3 + 1 = 4. 2) ) DFT1 (3) = 3 DFT1 (1) = 1 DFT2 (−2. 6. 1. 2 + z41 . DFT1 (1).0. −2 − z20 . 2) since DFT2 (3. 3 − z20 . 0)2 = 2 + 2i. 1. and it can also be briefly sketched as follows DFT4 (3. −2. 4 − z40 . The computation involves 2throots of unity and 4th-roots of unity. 1)1 + z40 DFT2 (−2. −2.1. 1. 1. 0) = (−2 + z20 . 1. 2 − 2i. DISCRETE FOURIER TRANSFORM Example 5. 1)2 = DFT1 (3) − z20 DFT1 (1) = 3 − 1 = 2. 0)4 = DFT2 (3. 6. 1)1 − z40 DFT2 (−2. DFT4 (3. DFT4 (3.(−2). 0) = (−2. 1. The computation proceeds as follows: DFT2 (3. −2.3. −2. 0)2 = DFT1 (−2) − z20 DFT1 (0) = −2 − 0 = −2. 1)2 + z41 DFT2 (−2.(−2). DFT1 (−2) and DFT1 (0). 0). DFT2 (3. 2 − z41 .2 Consider the tuple (3. This tranform can be computed from DFT2 (3. z20 and z41 . 1)2 − z41 DFT2 (−2. 2 + 2i) since DFT4 (3. 0) = (2. 1.(−2)) ( = (2. 2 − 2i. 0)2 = 2 − 2i.1) ( = (4. and therefore DFT4 (3. DFT4 (3. −2. −2. have to be computed since z20 = z40 = 1 z21 = −z20 = z42 = −1 z41 = i z43 = −z41 = −i. Only two of them.170 CHAPTER 5. and these from DFT1 (3). 1) = (3 + z20 . The goal is to compute the discrete Fourier transform DFT4 (3. 0)1 = 6. DFT2 (−2. DFT2 (−2. 0). −2. −2) since DFT2 (−2. 0)3 = DFT2 (3. 0) = (4 + z40 . 0) using the FFT algorithm.0) ( = (−2. 0)1 = 2.(−2). 
1) and DFT2 (−2. . 2).4 171 Polynomial multiplication revisited We now detail the polynomial multiplication technique briefly skteched in Section 5. .4.5.1 Let p ∈ C[x] be a polynomial with degree m and let n ∈ N0 be such that m ≤ n. 17)} of pairs of real numbers. −4.4. . j ≤ n such that i 6= j. Definition 5. (un . . Given (u0 . 1). .   . . unn Recall that the coefficient matrix is the Vandermonde matrix Q Vn (u0 . (u1 .4. . The coefficient representation of degree n of the polynomial p is the tuple (a0 . .. starting by introducing some remarks related to coefficient and point-wise representation of polynomials. since its determinant. un   a1   v1  1 1        . . 2. . 1. an ) where ai is the coefficient of xi in p for all 0 ≤ i ≤ n. . 0). 5. a1 . This matriz is invertible. that is.1 Coefficient and point-value representations Herein we refer to coefficient and point-value representation of polynomials. .2 Consider the polynomial x3 − 4x + 2 of degree 3 in C[x]. (3. a1 . . an whose matrix form is      1 u0 u20 . . by solving a system of n + 1 linear equations on the unknowns a0 . (1..1. . We get the coefficients ai of xi for each 0 ≤ i ≤ n by interpolation. −4.  . POLYNOMIAL MULTIPLICATION REVISITED 5. |Vn |. v1 ). un ). . . there is one and only one polynomial p in C[x] with degree less than or equal to n such that p(ui ) = vi for all 0 ≤ i ≤ n. . . v0 ). There is one (and only one) polynomial p in R[x] with degree less than or equal to 3 such that p(0) = 2 p(1) = −1 p(2) = 2 p(3) = 17 since solving the system . Its coefficient representation of degree 4 is the tuple (2. an vn 1 un u2n . . 0.. u1 . Otherwise the degree is less than n.   . . Example 5. Its coefficient representation of degree 3 is the tuple (2. un0 a0 v0  1 u1 u2 . . . .  . . 0.3 Consider the set {(0..  Example 5...   . −1).  We now refer to the point-value representation of polynomials in C[x].4. . If an 6= 0 then p has degree n.4.  =  . (2. . vn ) ∈ C2 where n ∈ N0 and ui 6= uj for all 0 ≤ i. . . . is 0≤i<j≤n (uj − ui ) and therefore |Vn | is different from 0 whenever ui 6= uj . 172 CHAPTER 5. DISCRETE FOURIER TRANSFORM  we get 1  1   1 1 0 1 2 3  a0 0 0   1 1   a1 4 8   a2 a3 9 27  2   −1   =   2  17   a0 = 2 a1 = −4 a2 = 0 a3 = 1 and therefore x3 − 4x + 2 is the intended polynomial p. In this case the degree of p is 3.  Hence, each polynomial p of degree n ∈ N0 in C[x] can be represented by a set of suitable n + 1 pairs of complex numbers, obtained by evaluating p at distinct n + 1 complex numbers. Definition 5.4.4 Let p = an xn + . . . + a0 ∈ C[x] be a polynomial with degree n ∈ N0 and let u0 , . . . , un ∈ C where ui 6= uj for all 0 ≤ i, j ≤ n such that i 6= j. The set {(u0 , p(u0)), . . . , (un , p(un ))} is a point-value representation of p, more precisely, the point-value representation of p at u0 , . . . , un .  Clearly, point-value representation is not unique, in the sense that any set of n + 1 pairs of complex numbers {(v0 , p(v0 )), . . . , (vn , p(vn ))} with distinct first components is also a point-value representation of p. In certain situations it is useful to consider extended point-value representations of a polynomial p with degree n: any set of m > n + 1 pairs of real values {(u0 , p(u0 )), (u1, p(u1 )), . . . , (um , p(um ))} where ui 6= uj for all 0 ≤ i, j ≤ m such that i 6= j, is an extended point-value representation of p. Note that the solution of the system of m linear equations on the n + 1 unknowns a0 , a1 , . . . 
, a_n that we get from the above extended representation of p is equal to the solution of the system of n + 1 linear equations on the unknowns a_0, a_1, ..., a_n that results from the interpolation with the first n + 1 pairs (u_0, p(u_0)), (u_1, p(u_1)), ..., (u_n, p(u_n)) of the extended representation. For simplicity, in the sequel we often refer only to point-value representations even in the case of extended point-value representations.

Example 5.4.5 Let p = x^3 - 4x + 2 be a polynomial in C[x]. Since p(0) = 2, p(1) = -1, p(2) = 2 and p(3) = 17, the set {(0, 2), (1, -1), (2, 2), (3, 17)} is a point-value representation of p. Given that p(-2) = 2 and p(-1) = 5, the set {(-2, 2), (-1, 5), (0, 2), (1, -1)} is another possible point-value representation of p. The set {(-2, 2), (-1, 5), (0, 2), (1, -1), (2, 2), (3, 17)} is an extended point-value representation of the polynomial p.

Note that using the discrete Fourier transform we get a point-value representation of a polynomial p = \sum_{i=0}^{n-1} a_i x^i of degree n - 1: the point-value representation at the n-th roots of unity. In fact, from

    DFT_n(a_0, a_1, ..., a_{n-1}) = (p(z_n^0), p(z_n^1), ..., p(z_n^{n-1}))

we get the set {(z_n^0, p(z_n^0)), (z_n^1, p(z_n^1)), ..., (z_n^{n-1}, p(z_n^{n-1}))}. Conversely, given a point-value representation of a polynomial p at the n-th roots of unity

    {(z_n^0, y_0), (z_n^1, y_1), ..., (z_n^{n-1}, y_{n-1})}

the inverse of the discrete Fourier transform is used to perform interpolation. Since DFT_n(a_0, a_1, ..., a_{n-1}) = (y_0, y_1, ..., y_{n-1}), getting the coefficients a_0, a_1, ..., a_{n-1} from y_0, y_1, ..., y_{n-1} corresponds to computing the inverse of the discrete Fourier transform, that is, DFT_n^{-1}(y_0, y_1, ..., y_{n-1}) = (a_0, a_1, ..., a_{n-1}).

We now refer to sum and multiplication of polynomials using only their point-value representations. Given two polynomials p and q with degree n, from point-value representations of p and q at the same complex numbers u_0, ..., u_n we easily get a point-value representation of p + q at u_0, ..., u_n. Recall that deg(p + q) <= max{deg(p), deg(q)}.

Proposition 5.4.6 Consider the polynomials p and q in C[x] with degree n. If {(u_0, v_0), ..., (u_n, v_n)} and {(u_0, w_0), ..., (u_n, w_n)} are point-value representations of p and q, respectively, then the pointwise sum {(u_0, v_0 + w_0), ..., (u_n, v_n + w_n)} is a (possibly extended) point-value representation of p + q.

If the polynomials p and q do not have the same degree, let us say, for instance, deg(q) < deg(p), then deg(p + q) = deg(p) and it is easy to conclude that, taking a suitable extended point-value representation of q, we can also obtain a point-value representation of p + q as described in Proposition 5.4.6.

Example 5.4.7 Consider the polynomials p = x^3 - 4x + 2 and q = x^2 - 2x + 1. Since {(0, 2), (1, -1), (2, 2), (3, 17)} and {(0, 1), (1, 0), (2, 1), (3, 4)} are a point-value representation of p and an extended point-value representation of q, respectively, the set {(0, 3), (1, -1), (2, 3), (3, 21)} is a point-value representation of p + q.

Extended point-value representations are also useful when multiplying polynomials in point-value representation. Recall that deg(p x q) = deg(p) + deg(q). Hence, if p and q have degree n, any point-value representation of p x q has 2n + 1 elements.
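Before turning to products, here is a quick numerical check of the connection between DFT_n and point-value representations at the roots of unity, using the polynomial p = x^3 - 4x + 2 of Example 5.4.5. The sketch below is ours: dft is a direct O(n^2) transcription of the definition of the discrete Fourier transform, not the FFT of Figure 5.3, and the names p, coeffs, roots are chosen here for illustration.

    (* DFT of the coefficient tuple = values of p at the n-th roots of unity *)
    dft[a_] := With[{n = Length[a]},
      Table[Sum[a[[j + 1]] Exp[2 Pi I k j/n], {j, 0, n - 1}], {k, 0, n - 1}]]

    p[x_] := x^3 - 4 x + 2;
    coeffs = {2, -4, 0, 1};                     (* coefficient representation of degree 3 *)
    roots  = Table[Exp[2 Pi I k/4], {k, 0, 3}]; (* 1, i, -1, -i *)

    Simplify[dft[coeffs] - (p /@ roots)]
    (* {0, 0, 0, 0}: the two point-value representations coincide *)

In particular, for the product of two polynomials of degree n one needs 2n + 1 (or more) such values, which is where extended representations enter.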
As a consequence, to obtain a point value representation for p × q we always have to consider extended point-values representations of p and q whenever deg(p) and deg(q) are not both 0. Proposition 5.4.8 Let p and q be polynomials in C[x] with degree n > 0. If {(u0 , v0 ), . . . , (u2n , v2n ))} and {(u0, w0 ), . . . , (u2n , w2n )} are extended point-value representations of p and q, respectively, then the pointwise multiplication {(u0, v0 × w0 ), . . . , (u2n , v2n × w2n )} is a point-value representations of p × q.  If deg(p) = deg(q) = 0 then deg(p × q) is also 0 and if {(u0 , v0 )} and {(u0, w0 )} are point-value representations of p and q, respectively, then {(u0, v0 × w0 )} is a point-value representations of p × q. If p and q do not have the same degree we just have to consider extended point-value representations of p and q with deg(p) + deg(q) + 1 elements. Example 5.4.9 Consider p = x3 − 4x + 2 and q = x2 − 2x + 1. Given that {(−2, 2), (−1, 5), (0, 2), (1, −1), (2, 2), (3, 17)} is an extended point-value representation of p and {(−2, 9), (−1, 4), (0, 1), (1, 0), (2, 1), (3, 4)} is an extended pointvalue representation of q, then {(−2, 18), (−1, 20), (0, 2), (1, 0), (2, 2), (3, 68)} is a point-value representation of p × q.  Note that the pointwise multiplication in Proposition 5.4.8 only involves a O(n) number of multiplications of real numbers. 5.4.2 Polynomial multiplication and FFT We now revisit the polynomial multiplication technique briefly sketched in Section 5.1. The main steps are depicted in Figure 5.4. We use two discrete Fourier transforms to obtain point-value representations of p and q (in fact, only the second component of each pair is relevant). Then, using pointwise multiplication, we can compute a point-value representations of p × q. Finally, using a discrete Fourier transform to interpolate we get the coefficients of p × q. The fast Fourier transform is used to compute the discrete Fourier transforms involved. Given that deg(p × q) = deg(p) + deg(q), we need a point-value representation of the polynomial p × q with deg(p) + deg(q) + 1 elements. Since we want to use the fast Fourier transform to efficiently compute the coefficients of p × q, if deg(p) + deg(q) + 1 is not a power of 2, we have to consider an extended pointvalue representation of p × q with n elements, where n = 2k and k is such that 175 5.4. POLYNOMIAL MULTIPLICATION REVISITED p×q coefficient rep. p, q coefficient rep. DFTn for p DFTn for q p, q point-value rep. DFT−1 n pointwise multiplication p×q point-value rep. Figure 5.4: Polynomial multiplication and DFT 2k−1 < deg(p) + deg(q) + 1 ≤ 2k . Therefore, we need (possibly extended) pointvalues representations of p and q with n elements and, as a consequence, we have to consider n coefficients of p and q, that is, we have to consider coefficient representations of degree n − 1 of p and q, in order to compute the corresponding discrete Fourier transforms. In Section 5.3 we concluded that DFTn (~a) can be computed using a O(n log2 n) number of sums and multiplications where n is a power of 2. The computation of DFT−1 a) can also be performed using a O(n log2 n) number of sums and multin (~ plications since, in Section 5.2.2, we proved that to get DFT−1 a) we only have n (~ −1 ~ to compute DFTn (b), where ~b results from ~a just by changing the order of the components. It is also trivial to conclude that the pointwise multiplication of the point-value representations of p and q mentioned above just involves a O(n) number of multiplications. 
Therefore, as the following proposition ensures, polynomial multiplication computed as described above involves a O(n log n) number of sums and multiplications. Proposition 5.4.10 Let f1 , f2 , f3 : N0 → N0 such that f1 (n), f2 (n) ∈ O(n log2 n) and f3 (n) ∈ O(n). Then f1 (n)+f2 (n) ∈ O(n log2 n) and f1 (n)+f3 (n) ∈ O(n log2 n). Proof: For i = 1, 2, let pi ∈ N0 be such that | fi (n) |≤ ci | n log2 n | for all n ≥ pi for some ci ∈ R+ . Hence, letting p = max({p1 , p2 }) f1 (n) + f2 (n) ≤ c1 n log2 n + c2 n log2 n for all n ∈ N0 such that n > p. Moreover, f1 (n) + f2 (n) ≤ 2cn log2 n considering c = max({c1 , c2 }). Since all the values involved are nonnegative, it holds that | f1 (n) + f2 (n) |≤ 2c | n log2 n | for all n ∈ N0 such that n > p, and therefore f1 (n) + f2 (n) ∈ O(n log2 n). We now compute DFT4 (−1. The goal is to compute p × q using the discrete Fourier transform and the fast Fourier transform. 0). If n > 2 then log2 n > 1 and therefore n < n log2 n. where m is the degree of p and q. 2.2. 0): DFT2 (−1. 2 − 2i. 2}) and c = max({c1 . 0)2 = DFT1 (2) − z20 DFT1 (0) = 2 − 1 × 0 = 2 . 1. 0) = (−1. −2. it holds that DFT4 (3. QED Let us now relate the number of sums and multiplications involved in this computation of p×q with the degrees of p and q. 2 + 2i).11 Let us consider the polynomials p = x2 −2x+3 and q = 2x−1. p3 . (1) Since deg(p × q) = 3 then we need a point-value representation for p × q with 4 (= 22 ) elements and therefore. −2. Assuming that deg(p) = deg(q) = m. Example 5. 1. As a consequence. It is then easy to conclude this computation of p×q indeed involves a O(m log2 m) number of sums and multiplications. Recalling Example 5. 2) since DFT2 (2. c3 }) f1 (n) + f3 (n) ≤ 2cn log2 n for all n ∈ N0 such that n > p. also point-value representations for p and q with 4 elements. −1) since DFT2 (−1. respectively (3. it holds that n2 < 2m + 1 ≤ n. We now present an example that illustrates polynomial multiplication using the discrete Fourier transform and the fast Fourier transform. 0) = (2. and recalling that n is the power of 2 used for computing the discrete Fourer transforms. 1. 0)2 = DFT1 (−1) − z20 DFT1 (0) = −1 − 1 × 0 = −1 DFT2 (2.176 CHAPTER 5. it holds that | f1 (n) + f3 (n) |≤ 2c | n log2 n | for all n ∈ N0 such that n > p and therefore f1 (n) + f3 (n) ∈ O(n log2 n). 2.4. 0) and (−1. Since all the values involved are nonnegative. 0) = (2. 0. DISCRETE FOURIER TRANSFORM Let p1 ∈ N0 be as above and let p3 ∈ N0 be such that | f3 (n) |≤ c3 | n | for all n ≥ p3 for some c3 ∈ R+ . 0)1 = DFT1 (2) + z20 DFT1 (0) = 2 + 1 × 0 = 2 DFT2 (2. that is. 0)1 = DFT1 (−1) + z20 DFT1 (0) = −1 + 1 × 0 = −1 DFT2 (−1. 0) (2) We first compute DFT4 (3. we begin with coefficient representations of degree 3 of p and q. letting p = max({p1 . Hence.3. 0. 6. −2. 2. 2 + 6i) = (−12. 0.4. 2 − 6i. 2 + 2i) ⊗ (1. 2. 2 − 6i) = (−12. only the second components): (2. 2 − 6i)2 = DFT1 (2 + 6i) − z20 DFT1 (2 + 6i) = −12i DFT4 (2. −18. 8) = (−3. 2 − 6i. −18)1 = DFT1 (2) + z20 DFT1 (−18) = −16 DFT2 (2. −20.  . −18. −1 − 2i) since DFT4 (−1. 0. −18)2 = DFT1 (2) − z20 DFT1 (−18) = 20 DFT2 (2 + 6i. −1 + 2i. 0)1 = DFT2 (−1. 2 − 6i) = DFT4 (2. 0. 0)2 + z41 DFT2 (2. 0) and pointwise multiplication (denoted by ⊗) we get a point-value representation for p × q (in fact. 2. −5. −2. 2 + 6i) 4 DFT2 (2. −3. 0)1 − z40 DFT2 (2. 2 − 6i)1 = DFT1 (2 + 6i) + z20 DFT1 (2 − 6i) = 4 DFT2 (2 + 6i. 2 + 6i)1 = −16 + z40 4 = −12 DFT4 (2. 2 − 6i) (4) Finally. −1 − 2i) = (2. 2 + 6i)3 = −16 − z40 4 = −3 DFT4 (2. 0. −18. 0). −20. 2 + 6i. 
4 The product of p and q is therefore the polynomial 2x3 − 5x2 + 8x − 3. DFT4 (−1. 0. 2). 2 + 6i. 2 + 6i)4 = 20 − z41 (−12i) = 8 Hence. 0)1 = −1 + 1 × 2 = 1 DFT4 (−1. 0. 2 + 6i.5. 2 − 6i. 2. 32. −18. −18. we compute 1 DFT−1 4 (2. 32. −18. −18. 20) since DFT2 (2. 2 + 6i)2 = 20 + z41 (−12i) = 32 DFT4 (2. 0)2 = −1 − 2i (3) Using DFT4 (3. 2. 0)1 + z40 DFT2 (2. POLYNOMIAL MULTIPLICATION REVISITED 177 and therefore DFT4 (−1. 2 − 2i. 0)3 = DFT2 (−1. 2 − 6i) = (4. 1 DFT−1 4 (2. −3. 2. −18) = (−16. 0)2 = −1 + 2i DFT4 (−1. 0)2 − z41 DFT2 (2. 2 − 6i. −1 + 2i. 0)2 = DFT2 (−1. 2 − 6i. 0)1 = −1 − 1 × 2 = 1 DFT4 (−1. 1. −18. 6. −18. −12i) since DFT2 (2 + 6i. 8) since DFT4 (2. 2 − 6i. 0)4 = DFT2 (−1. 0) = (1. 8. 1. using the discrete Fourier transform and its inverse. −1. 0. 1. 2. 0. 0.5 Image processing 5. where (a) p = 3x2 − 7x + 4 e q = −2 (b) p = 5x + 3 e q = −2x + 4 (c) p = 3x − 2 e q = −4x2 − 5x + 3 (d) p = 5x e q = −2x2 − 3x (e) p = 5x2 e q = −2x4 − 3x − 1 . 1. 0) (e) DF T8 (2. Let k ∈ N0 and n. 4) (b) DF T4 (0. 1. −7. Prove that zn k = zn n+k where n ∈ N and k ∈ Z. d ∈ N where n is an even number. 1. Prove that the P0 n−1 equality i=0 (znk )i = 0 holds. 1) (f) DF T2−1 (−2. 4. 1) (g) DF T4−1 (1. Compute p × q in a efficient way. −1. −2) 5. DISCRETE FOURIER TRANSFORM 5. 0) (d) DF T4 (4. −1 + i) (i) DF T8−1 (−1. −2. 1. −1) (h) DF T4−1 (4.178 CHAPTER 5. 2. 2.6 Exercises 1. −1. −2. Prove that n (a) zn 2 = −1 n (b) zn 2 +d = −zn d (c) zn 2k = z kn 2 3. Using the Fast Fourier transform compute (a) DF T2 (3. 1. 0. Let k ∈ N and n ∈ N such that k is not divisible by n. −1 − i. 2) (c) DF T4 (2. 1. 3. 6. To this end we can then consider a table Key for storing keys and a table Data for storing the corresponding data. 179 .4 we propose some exercises. There are several kinds of generating functions for a sequence.1 Search by hashing Assume we want store information in a computer using of a set of records.2. Key[2] is the key of second record that has been stored. The value of a variable rstored indicates the number of records that have already been stored. we only consider ordinary generating functions. In section 6. exponential generating functions and Poisson generating functions [19. for each 1 ≤ j ≤ nr . We first refer to the average case analysis of a search algorithm involving a hash function. Herein. The tables are filled sequentially. it should be easy to get the corresponding data D(K). In Section 6. in the sense that Key[1] is the key of first record that has been stored (and therefore Data[1] is the corresponding data). 31]. a series involving all the terms of the sequence. 18. that is. 6. we refer to the use of generating functions in algorithm analysis. Assuming that nr is the maximum number of records that can be stored. we can associate with it a generating function. Given a key K.3 we revisit the motivating examples and in Section 6.1 we present motivating examples in algorithm analysis.Chapter 6 Generating functions In this chapter we introduce generating functions. such as ordinary generating functions. where each record has a key K and some data D(K). if Key[j] is some key K then Data[j] is the corresponding data D(K). etc. Generating functions are introduced in section 6. Then we refer to the Euclid’s algorithm The second example also illustrates the relevance of generating functions for solving recurrence relations.1.1 Motivation In this section. Given any sequence of real or complex numbers. the university students. 
we assign its key value to Key[rstored + 1] and its data to Data[rstored + 1] and then increment rstored. This can be rather slow when a large number of records have already been stored. Key[3] = 43289. 1. Example 6.. We can sketch the key distribution as follows L1 L2 L3 15367 L4 L5 L6 35346 43289 32128 46238 38532 L7 L8 L9 L10 that is.. a nonnegative integer number less than 100000.. Key[5] = 38532 and Key[6] = 46238. that is m = 10. L4 and L5 and the other lists are empty.. Key[4] = 32128. 1. The integer h(K) indicates the list where to search for the key K.. 99999} → {1. . lets say. To keep things easy. and that the hash function h : {1. . nr } indicates the position in table Key of the key that follows Key[j] in the list h(Key[j]). Next[j] ∈ {0. This situation corresponds to . comparing each Key[j] to the given key K. and consider a hash function h that transforms each key K into an integer h(K) ∈ {1. The database library uses these identification numbers as record keys. there are keys in the lists L2 . m}. In order to borrow books. .1 Consider an university library maintaining a database storing relevant data about its readers. . The university sequentially assigns an identification number to each of its students. . each student has to register first as a reader at the library.1. assume in this example that we have only 10 lists.. .. . The value −1 indicates that the list is empty. The value 0 indicates that Key[j] is the last key in its list. 10} is such that   K h(K) = +1 10000 Moreover. We have to consider also the tables F irst and Next.. The task of searching for some key K among the keys already stored can be accomplished going through the table Key sequentially. Key[2] = 15367. let us assume that only 6 students have registered at the library so far and that Key[1] = 35346. for simplicity. F irst[i] ∈ {−1. For each 1 ≤ j ≤ nr . For each 1 ≤ i ≤ m. .180 CHAPTER 6.. . . GENERATING FUNCTIONS To insert a new record. . In order to improve this situation we can use the hashing technique that involves splitting the storing memory space for keys into m lists. nr } indicates the position in table Key of the first key of list i. 8. F irst[5] = 3 • Next[i] = 0 for i ∈ {2. F irst and Next. assuming that the hash function h is already known and that the lists key. If[key[[j]]==k. Using the auxiliary lists first and next.j=next[[j]]]]. keySearch=Function[{k}. first and next record the tables Key. 3. r=False. F irst[4] = 1.j. i=h[k]. 9. respectively. 6.1. It returns the position of the key k in key if k has already been stored and the string “the key has not been stored” otherwise. If[r. function keySearch compares k with all the elements in key whose hash function value equals that of k.1: Key search function in Mathematica If the key we are searching for has been already stored we say that the search is successful. Figure 6.6. The function keySearch in Figure 6. . MOTIVATION 181 • rstored=6 • F irst[i] = −1 for i ∈ {1. Our goal is to determine this average case number of comparisons.r=True. the task of searching for keys can be performed faster. if the hash function h is such that h(K) = h(K ′ ) for all the keys K ′ that have already been stored. Clearly. While[j>0&&!r. 10} and F irst[2] = 2.Print["the key has not been stored"]]]].j. 7. Otherwise. j=first[[i]]. the worst case number of comparisons is equal to the worst case number of comparisons when no hash function is involved. Therefore. the average case number of comparisons is smaller when the hashing technique is used. Next[3] = 6. 
6} and Next[1] = 4.Module[{i. 5. However. Next[4] = 5  Using the hashing technique described above. since when looking for a given key K we only have to compare it with the stored keys K ′ such that h(K ′ ) = h(K).1 determines whether a given key k has already been stored. the search is unsuccessful.r}. then all the lists are empty but one and K has to be compared with all the stored keys. It states that for k.m]. Generating functions are useful for computing the mean and the variance of discrete random variables whose values are nonnegative. or mean.17). with m < n. the evaluation of euclid[m. we can reason as in the previous case and conclude that again the first argument is going to be less than the second in all the following recursive calls. there are no recursive calls if m = 0 and if m = n there is just one recursive call. there is again a first recursive call to euclid[Mod[n. In fact. the average number of comparisons is the expected value. again.sk+1 ]. there is one recursive call to euclid[Mod[n. but the first argument is now less than the second. Hence. The characterization of the random variable NC depends on the kind of search: successful or unsuccessful.n] involves less than k recursive calls whenever m < sk+1. in both cases. To this end we have to consider a discrete random variable NC whose values correspond to the possible numbers of such comparisons.1) Moreover (see Proposition 6.1 we discuss the relevance of generating functions in the average case analysis of the above key search algorithm. it is easy to conclude that the first argument is going to be also less than the second in all the following recursive calls. GENERATING FUNCTIONS We want to determine the average number of comparisons key[[j]]==k that are executed when searching for some given key k. But. what is the probability that h(k) = i for each 1 ≤ i ≤ m.n]. (6.n] is evaluated.3.n] is evaluated. such as.m] where. The analysis of the Euclid’s algorithm often assumes that the first argument is less than the second. Reasoning in a similar way. When evaluating euclid[m.182 CHAPTER 6. Recall that the sequence of Fibonacci numbers is the sequence s = {sn }n∈N0 such that s0 = 0 s1 = 1 and sn = sn−1 + sn−2 for n ≥ 2. m.3. We also have to assume some probabilistic hypothesis.16) establishes an upper bound for the number of recursive calls that are performed when euclid[m. . since mod(n.m]. the first argument is less than the second. note that if m 6= 0 and m < n.m]. for instance.3. The Lam´e theorem (see Theorem 6. m) < m. n ∈ N. In Section 6. the worst case number of recursive calls occurs when evaluating euclid[sk . The analysis of this algorithm involves counting the number of recursive calls performed when euclid[m. of NC. 6.1. In this case there are k − 1 recursive calls.2 Euclid’s algorithm Recall the Euclid’s algorithm for computing the greatest common divisor of two nonnegative integers presented in Figure 1.4. where sk+1 is the (k + 2)th Fibonacci number. If m > n. A generating function associates a formal power series with each sequence of elements of a field. The analysis of recursive algorithms often involves recurrence relations. in particular. but the closed form   √ !k √ !k 1 1− 5  1+ 5 sk = √  (6.3) i=0  A generating function for a sequence s records all the elements of s. The generating function for s is +∞ X si z i . for some k ∈ N0 .1 Let s = {sn }n∈N0 be a sequence of real or complex numbers. For simplicity.1).5) . If si = 1 or si = −1 for all i ∈ N0 .1). 
can be computed using (6.4) i=0 to denote Gs (z). it is often convenient to write +∞ X i=k si z i or +∞ X si+k z i+k (6. If s is such that sn = 0 for all n > k.2) − 2 2 5 is often useful. for some k ∈ N0 . If for some k ∈ N it holds that si = 0 for each 0 ≤ i ≤ k − 1. (6.3.2. in (6. respectively. Herein. The subscript may be omitted when no confusion arises.2 Generating functions In this section we introduce the notion of generating function and some related properties. In the sequel.183 6.3) we may just use z i or −z i . . we refer to sequences of real or complex numbers. we can write s0 + s1 z 1 + s2 z 2 + . In Section 6. we often use Gs (z) or Gs to denote the generating function for a sequence s. it is usual to introduce the following notations to represent the generating function for s = {sn }n∈N0 . 6. . To get the equality (6.1).2.2) we have to solve the recurrence relation (6.2 we discuss how generating functions can be used for solving. Definition 6. the recurrence relation in (6. GENERATING FUNCTIONS The Fibonacci number sk . + sk z k (6. Note that we can associate a polynomial in R[z] (or C[z]) to each generating function Gs (z) for a sequence s such that sn = 0 for all n > k.2 The following are examples of generating functions. the coefficient of z n in the polynomial associated to Gs (z) is sn . for some k ∈ N0 (and vice-versa). GENERATING FUNCTIONS together with the usual conventions regarding polynomials: using z instead of z 1 . If.6) i=0 to denote Gs (z). Example 6. for each n ∈ N0 . etc. . for some k ∈ N. then we can use +∞ X z ki (6. q1 = 0. s is such that sn = 1 when n is a multiple of k and sn = 0 otherwise. i=0 (ii) The generation function for v = (vn )n∈N0 where vn = 1 for each n ∈ N0 is +∞ X zi . As expected. (iv) We can write 1 − z 2 for the generating function for q = (qn )n∈N0 where q0 = 1. taking into account the observations above. i=0 (iii) The generation function for r = (rn )n∈N0 where rn = n for each n ∈ N0 is +∞ X iz i i=0 and. q2 = −1 and qn = 0 for each n > 2. we can also write +∞ X iz i i=1 or +∞ X (i + 1)z i+1 i=0 for this generation function.184 CHAPTER 6. (i) The generation function for s = (sn )n∈N0 where sn = 2n + 1 for each n ∈ N0 is +∞ X (2i + 1)z i .2. omitting s0 when s0 = 0 and si z i when si = 0 for some 1 ≤ i ≤ k. Similar results also hold for sequences of complex numbers. we have the map f that associates each real or complex number c to the real or complex number f (c). in general. Generating functions represent sequences as formal power series. after some manipulations we may end up with a power series that indeed converges in some interval (disc) I and therefore it defines a map with domain I. the point of view we are interested in along this chapter. The sum of s and t is the sequence s + t = {(s + t)n }n∈N0 where (s + t)n = sn + tn for each n ∈ N0 . Convergence issues are clearly relevant P+∞ in ithis case since the domain of f is the interval or disc of convergence of i=0 si z . Moreover. Then we can take advantage of this fact and use it as a map. .  P i Let s = {sn }n∈N0 be a sequence. P+∞ i When we consider i=0 si z as a formal power series. In the first case. However.2. where a is any real or complex number (we just write −s when a = −1). Note that the expression +∞ i=0 si z can be seen as defining a map but it can also be seen as a formal power series. Sum and product of generating functions We now introduce several operations over generating functions for sequences of real numbers and some related properties. 
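For a sequence with only finitely many nonzero terms the generating function can be handled directly as the associated polynomial. The short Mathematica session below is only an illustration of this correspondence for the sequence q of Example 6.2.2(iv); CoefficientList recovers the sequence from the polynomial.

    q = {1, 0, -1};                          (* q0, q1, q2; qn = 0 for n > 2 *)
    Gq = Sum[q[[i + 1]] z^i, {i, 0, 2}]      (* 1 - z^2 *)
    CoefficientList[Gq, z]                   (* {1, 0, -1} *)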
Consider two sequences s = {sn }n∈N0 and t = {tn }n∈N0 . the result is useful for several purposes as we will discuss in the sequel. as = {(as)n }n∈N0 is the sequence such that (as)n = asn for each n ∈ N0 . We now introduce the sum of generating functions. for n ∈ N0 . They are algebraic objects that we can manipulate using some suitable (ring) operations.185 6. The Pn convolution of s and t is the sequence s ∗ t = {(s ∗ t)n }n∈N0 where (s ∗ t)n = k=0 sk tn−k for all n ∈ N0 . where f (c) is the sum of the series P +∞ i i=0 si c . when we refer to a sequence we always assume that it is a sequence of real numbers. GENERATING FUNCTIONS (v) We can write +∞ X z 3i i=0 to denote the generation function for w = (wn )n∈N0 where wn = 1 if n is a multiple of 3 and wn = 0 otherwise. convergence issues are P i s not relevant and +∞ i=0 i z is only seen as a way of recording all the elements of the sequence s. This is. To begin with it is relevant to recall several notions regarding sequences of real numbers. But even when this is not the case. without taking into account intervals (discs) of convergence. In the sequel. denoted by Gs (z) × Gt (z). +∞ +∞ X X i Gr (z) + Gv (z) = Gr+v (z) = (ri + vi )z = (i + 1)z i . where a ∈ R. denoted by Gs (z) + Gt (z). i=0 i=0  The product of generating functions is defined as follows. Definition 6. For simplicity we often just write Gs (z)Gt (z) for Gs (z) × Gt (z). that is. the generating function for the sequence s ∗ t.4 Let r and v be the sequences presented in Example 6.2.  . s = (sn )n∈N0 is such that s0 = a and sn = 0 for all n > 0. 0. we get Gs (z) × Gt (z) = +∞ X ati z i i=0 and therefore Gs (z) × Gt (z) is the generation function for the sequence at. Gs (z) × Gt (z) = +∞ i X X i=0 sk ti−k k=0 ! zi . is Gs+t (z) that is.2. Since i X sk ti−k = s0 ti = ati k=0 for each i ∈ N0 . is Gs∗t (z). Then.3 Consider two sequences s = {sn }n∈N0 and t = {tn }n∈N0 . . Example 6. GENERATING FUNCTIONS Definition 6.  Hence.6 Let t = {tn }n∈N0 be any sequence and let s be the sequence a.5 Consider two sequences s = {sn }n∈N0 and t = {tn }n∈N0 . .2.  Example 6.2. The product of Gs (z) and Gt (z). Note that we can also write a × Gt (z) = +∞ X i=0 ati z i or a × Gt (z) = Gat (z) since in this case we can write just a for Gs (z). . 0. the generating function for the sequence s + t.186 CHAPTER 6. The sum of Gs (z) and Gt (z). 0.2. that is.2. 187 6.2. GENERATING FUNCTIONS Example 6.2.7 Let t = {tn }n∈N0 be any sequence and let u be the sequence 0, 0, 0, 1, 0, 0, 0, 0, 0, . . . that is, u = {un }n∈N0 is such that u3 = 1 and un = 0 for all n ∈ N0 \{3}. Note that i i X X uk ti−k = 0ti−k = 0 k=0 k=0 for i < 3, and that i X uk ti−k = u3 ti−3 = ti−3 k=0 for i ≥ 3. Hence, Gu (z) × Gt (z) is the generating function for the sequence 0, 0, 0, t0, t1 , t2 , t3 , t4 . . . that is, the sequence t′ = {t′n }n∈N0 such that t′0 = t′1 = t′2 = 0 and t′n = tn−3 for n ≥ 3, and therefore Gu (z) × Gt (z) = +∞ X t′i z i = Gt′ (z). (6.7) i=0 Taking into account the notation introduced above and the fact that ti = t′i+3 for each i ≥ 0, we can also write 3 z × Gt (z) = +∞ X i=3 t′i z i or 3 z × Gt (z) = +∞ X ti z i+3 (6.8) i=0 Equalities similar to (6.7) hold for sequences u such that um = 1 is the only nonzero term of the sequence, where m is any nonnegative integer. The product Gu (z) × Gt (z) can then be also denoted as in (6.8) with the integer 3 substituted for m.  
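Definition 6.2.5 says that multiplying generating functions amounts to convolving the underlying sequences. The following short check (a sketch, not part of the original text) truncates the generating functions of the sequences r (rn = n) and v (vn = 1) from Example 6.2.2 and compares the coefficients of the product with the convolution (r ∗ v)n = Σ_{k=0}^{n} rk vn−k computed directly.

    r = Table[n, {n, 0, 4}];     (* 0, 1, 2, 3, 4 *)
    v = Table[1, {n, 0, 4}];     (* 1, 1, 1, 1, 1 *)
    Table[Sum[r[[k + 1]] v[[n - k + 1]], {k, 0, n}], {n, 0, 4}]
    (* {0, 1, 3, 6, 10} *)
    CoefficientList[Normal[Series[
      Sum[r[[i + 1]] z^i, {i, 0, 4}] Sum[v[[i + 1]] z^i, {i, 0, 4}], {z, 0, 4}]], z]
    (* {0, 1, 3, 6, 10} *)

Both computations give the partial sums 0, 1, 3, 6, 10, as expected, since (r ∗ v)n = Σ_{k=0}^{n} k = n(n + 1)/2.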
Observe that the sum and product of generating functions corresponding to polynomials indeed correspond to the sum and product of polynomials. Note also that the sum and product of generating functions coincide with the sum and product of real functions admitting a power series expansion within its interval of convergence. Let G denote the set of generating functions for sequences of real numbers. It is easy to conclude that the operation + : G 2 → G that associates to each pair of generating functions their sum is a commutative and associative operation. Moreover, Gs (z) + 0 = Gs (z) and Gs (z) + G−s (z) = 0, for all Gs (z) ∈ G (recall 188 CHAPTER 6. GENERATING FUNCTIONS that 0 denotes the generating function for s = {sn }n∈N0 such that sn = 0 for each n ∈ N0 ). P P Since ik=0 sk ti−k = ik=0 si−k tk for all i ∈ N0 , it is also easy to conclude that the operation × : G 2 → G that associates to each pair of generating functions their product is a commutative and associative operation. It also holds that Gs (z) × 1 = Gs (z) for all Gs (z) ∈ G (recall that 1 denotes the generating function for s = {sn }n∈N0 such that s0 = 1 and sn = 0 for each n > 0). Moreover, the product of generating functions is distributive with respect to their sum. Hence, the set G endowed with the operations defined above and − : G → G such that −(Gs ) = G−s constitutes a unitary commutative ring. The multiplicative identity is the generating function 1. Proposition 6.2.8 The tuple (G, +, 0, −, ×) constitutes a unitary commutative ring. Not all the generating functions have multiplicative inverses. If s = {sn }n∈N0 is such that s0 6= 0 then Gs (z) has multiplicative inverse (and vice-versa). Proposition 6.2.9 The generating function Gs (z) for the sequence s = {sn }n∈N0 has a multiplicative inverse if and only if s0 6= 0. Proof: P ti z i is the multiplicative inverse of Gs (z). Then, (→) Assume that Gt (z) = +∞ i=0P P Gs (z) × Gt (z) = 1. Therefore, 0k=0 sk t0−k = s0 t0 = 1 and ik=0 sk ti−k = 0 for i > 0. In particular, s0 × t0 = 1 and, as a consequence, s0 6= 0. (←) Assume that s0 6= 0. Let t = {tn }n∈N0 be such that t0 = and 1 s0 n 1 X tn = − sk tn−k s0 k=1 for n > 0. Then, 0 X sk t0−k = s0 t0 = s0 k=0 (6.9) 1 =1 s0 and, taking into account (6.9), n X k=0 sk tn−k = s0 tn + n X k=1 sk tn−k = s0 n 1 X sk tn−k − s0 k=1 ! + for n > 0. Hence, Gs (z) × Gt (z) is the generating function 1. n X sk tn−k = 0 k=1 QED Besides Gs (z)−1 we may also use Gs1(z) to denote the multiplicative inverse of a generating function Gs (z), when it exits. 6.2. GENERATING FUNCTIONS 189 Example 6.2.10 Let v be the sequence 1, 1, 1, . . . presented in Example 6.2.2. Since v0 6= 0 the multiplicative inverse of Gv (z) exists. The proof of Proposition 6.2.9 describes a method for obtaining Gv (z)−1 . Assuming that Gv (z)−1 is the generating function for a sequence t it holds that: • t0 = 1 1 = =1 v0 1 1 1 X 1 • t1 = − vk t1−k = − v1 t0 = −1 v0 k=1 v0 2 1 1 X vk t2−k = − (v1 t1 + v2 t0 ) = − (−1 + 1) = 0 • t2 = − v0 k=1 v0 • t3 = − • ... 3 1 1 X vk t3−k = − (v1 t2 + v2 t1 + v3 t0 ) = − (0 + −1 + 1) = 0 v0 k=1 v0 It can be easily proved by induction that tn = 0 for all n ≥ 2. Hence, we conclude that Gv (z)−1 = 1 − z. ◭ Example 6.2.11 Let a be a real number and let ga be the sequence 1, a, a2 , a3 , . . ., that is, ga = {(ga )n }n∈N0 with (ga )n = an for all n ∈ N0 (geometric progression with ratio a and first term 1). Since (ga )0 6= 0 the multiplicative inverse of Gga (z) exists. 
Using again the proof of Proposition 6.2.9, and assuming that Gga (z)−1 is the generating function for a sequence t it holds that: • t0 = 1 1 = =1 (ga )0 1 1 1 1 X (ga )k t1−k = − (ga )1 t0 = −a • t1 = − (ga )0 (ga )0 k=1 • t2 = − • ... 2  1 1 X (ga )k t2−k = − ((ga )1 t1 + (ga )2 t0 ) = − −a2 + a2 = 0 (ga )0 k=1 (ga )0 It can be easily proved by induction that tn = 0 for all n ≥ 2. Hence, we conclude that Gga (z)−1 = 1 − az. Note that the sequence s considered in Example 6.2.10 is just g1 and, as expected, the inverse of Gs (z) computed therein complies with the general expression obtained above for the inverse of generation functions Gga (z) for geometric progressions with ratio a and first term 1. ◭ 190 CHAPTER 6. GENERATING FUNCTIONS Derivative and integral of generating functions We can also define the derivative and the integral of a generating function. Definition 6.2.12 Consider the sequence s = {sn }n∈N0 . The derivative of Gs (z), denoted by G′s (z), is the generating function for the sequence t = {tn }n∈N0 where tn = (n + 1)sn+1 for all n ∈ N0 .  Hence, the derivative of Gs (z) is G′s (z) = +∞ X (i + 1)si+1 z i . i=0 Example 6.2.13 Let v be the sequence 1, 1, 1, . . . in Example 6.2.2. The derivative of Gv (z) is the generating function for 1v1 , 2v2 , 3v3 , . . .. Hence, G′v (z) is the generating function for 1, 2, 3, 4, . . ., the sequence x = (xn )n∈N0 where xn = n + 1 for each n ∈ N0 , and therefore G′v (z) = +∞ X (i + 1)z i . i=0 ◭ Example 6.2.14 Let t = (tn )n∈N0 be such that tn = 0 for all n > k for some k ∈ N0 , and recall that we can use t0 + t1 z + t2 z 2 + . . . + tk z k to denote Gt (z). The derivative G′t (z) is the generating function for de 1t1 , 2t2 , . . ., ktk , 0, 0,. . ., the sequence x = (xn )n∈N0 where xn = (n + 1)tn+1 for 0 ≤ n ≤ k − 1 and xn = 0 for n ≥ k, and therefore G′t (z) = t1 + 2t2 z + 3t3 z 2 + . . . + ktk z k−1 . Note that the derivative of t0 + t1 z + t2 z 2 + . . . + tk z k is just the derivative of the corresponding polynomial function. Considering the particular case of 1, 0, −1, 0, 0 . . ., the sequence q in Example 6.2.2, it holds G′q (z) = (1 − z 2 )′ = −2z. ◭ We now define the integral of a generating function. Definition 6.2.15 Consider the sequence s = {sn }n∈N0 . The integral of Gs (z), Rz denoted by 0 Gs (z), is the generating function for the sequence t = {tn }n∈N0 where t0 = 0 and tn = sn−1 for all n ∈ N.  n . . it holds Z z Z z 1 Gq (z) = (1 − z 2 ) = z − z 3 . The integral of Gt (z) is the generating function for 0. the sequence y = (yn )n∈N0 where y0 = 0.191 6.2. + 2 3 k+1 0 In the particular case of the sequence q in Example 6. . the sequence v in Example 6. . the sequence whose terms are all equal to 1.16 Consider again 1. . 1. 3 × 13 . . i+1 Rz Rz Note that ( 0 Gv (z))′ = Gv (z) since the derivative of 0 Gv (z) is the generating function for 1 × 1. .14 and the generating function Gt (z) = t0 + t1 z + t2 z 2 + . ( 0 Gs (z)) = Gs (z). .11) and Moreover. that is. . + tk z k . 1 .2. Hence. Z z t2 tk k+1 t1 z . v10 . 13 ..2. that is. 1. 2 × 21 . Gs (z) (Gs (z)) (6..17 Recall again the sequence t = (tn )n∈N0 in Example 6. that is. v21 . t0 t1 t2 . 21 .2. . Similarly. RThe z Gv (z) is the generating function for 0.12) The derivative of the of a generating function is the original generating R z integral ′ function. Gt (z) = t0 z + z 2 + z 3 + . . 3. ◭ Example 6. given generating functions Gs (z) e Gt (z) (Gs (z) + Gt (z))′ = G′s (z) + G′t (z) (6.2. .. with respect to the integral. .2.. . (6. Hence. 
and therefore Z z Gv (z) = 0 +∞ X i=0 1 i+1 z . 3 0 0 ◭ The usual properties regarding the derivative of the sum and of the product hold. Observe that the notion of derivative of a generating function coincides with the usual notion of derivative of a function admitting a power series expansion. . within its domain. 0. . v32 . . .2.. k+1 0. 2. .2. . the sequence y = (yn )n∈N0 0 where y0 = 0 and yn = n1 for each n > 0. The proofs of these properties are left as an exercise to the reader. integral of Gv (z) is the generating function for 0. tk . . 1. if Gs (z) has a multiplicative inverse then ′  1 G′ (z) = − s 2. . GENERATING FUNCTIONS Example 6. yn = tn−1 for each 1 ≤ n ≤ k + 1 and n yn = 0 para n ≥ k + 2.10) (Gs (z) × Gt (z))′ = G′s (z) × Gt (z) + Gs (z) × G′t (z). for Gga (z) we conclude that Gga (z) = 1 1 − az (6. to get an equality Gs (z) = e where the expression e does not explicitly involve power series.2. ◭ Example 6.13). 1.18 The equalities Let v be the sequence 1. +. .2. Example 6. presented in Example 6. Solving zGv (z) = Gv (z) − 1 for Gv (z) (within the ring (G.. This technique can be generalized to conclude that a Gs (z) = Gav (z) = 1−z where s is a sequence whose terms are all equal to a real number a.19 Let a be a real number and let ga be the sequence presented in Example 6.2. zGv (z) = +∞ X z i+1 = +∞ X zi i=0 i=0 ! − 1 = Gv (z) − 1 hold. as a consequence. GENERATING FUNCTIONS Closed forms It is often useful to get a closed form for a generating function Gs (z). 1.2.13) thus obtaining a closed form for Gv (z). that is. that is.14) .2. From Gga (z) = +∞ X ai z i i=0 we get azGga (z) = +∞ X ai+1 z i+1 = i=0 Solving +∞ X ai z i i=0 ! − 1 = Gga (z) − 1 azGga (z) = Gga (z) − 1. −×)) we get Gv (z) = 1 1−z (6. s = av.11 (geometric progression with ratio a and first term 1). . .2. The following examples illustrate how we can obtain closed forms for some generating functions.192 CHAPTER 6.10) to conclude that Gv (z) = (1 − z)−1 and. 0. the equality (6. Observe that we can also use the fact that Gv (z)−1 = 1 − z (see Example 6. and therefore the equality (6. 2. 1. .16) = 1−z (1 − z)2 ◭ . The resulting closed form is analogous to (6.12) and the closed form (6. 4. the sequence p = {pn }n∈N0 where pn = 2 when n is even and pn = 0 otherwise.193 6.. we get the closed form Gw (z) = 1 .14). and v is the sequence 1. (6. 3. We can get a closed form for Gx (z) using (6. Note that p = v + g−1 where v is the sequence 1.2. 0. 2. . 1.13 that Gx (z) = G′v (z) where x is the sequence 1. recalling the closed forms (6.2. that is.13) and (6. .15) We can reason as above when for some k ∈ N the sequence w = (wn )n∈N0 is such that wn = 1 if n is multiple of k and wn = 0 otherwise. we conclude that Gp (z) = Gv (z) + Gg−1 (z) = 1 1 + . Again we could have used use the fact that Gga (z)−1 = 1 − az (see Example 6. . 0. Then 3 z Gw (z) = +∞ X i=0 z 3i+3 = Gw (z) − 1 and therefore. solving z 3 Gw (z) = Gw (z) − 1 for Gw (z). Note also that (6.. 0. 1 .2. Example 6. 0. 1 − z3 (6.14). Hence.2. .2.2. ◭ Example 6.21 Let p be the sequence 2.2. 0. 1. 1. since v is g1 . 1.2. . 1..18 and g−1 is the geometric progression with ratio −1 and first term 1.13) is a particular case of (6. . presented in Example 6. .11) to conclude that Gga (z) = (1 − az)−1 . 0.15) with k instead of 3. 0.22 Recall from Example 6.. 1. . . in Example 6. . GENERATING FUNCTIONS thus obtaining a closed form for Gga (z). for each n ∈ N0 . 0.2. . 1−z 1+z ◭ In the next example we use the derivative of generating functions to obtain a closed form. 
Example 6.13):  ′ 1 1 ′ Gx (z) = Gv (z) = . ◭ The following example uses the sum of generating functions. 1.20 Let w be the sequence 1.14). We can associate a probability function with each discrete random variable. For instance.2. Definition 6.  Given a discrete probability space (Ω.2 A random variable over a discrete probability space (Ω. . if X(Ω) ⊆ N0 we say that X takes only nonnegative integer values. 1] is a map such that ω∈Ω p(ω) = 1.1. . Definition 6. Discrete random variables and probability generating functions We briefly recall several basic notions concerning discrete random variables. From (6. p). 1] such that PX (x) = p({w ∈ Ω : X(w) = x}).3.1 Motivating examples revisited Search by hashing We first describe how to use generating functions for computing the expected value and the variance of some discrete random variables. Each element of Ω is an elementary event. The probability function associated with X is the map PX : R → [0.17) ◭ 6.2. GENERATING FUNCTIONS From the closed form obtained in Example 6. We then present the average case analysis of the key search algorithm discussed in Section 6. When there is no ambiguity we just use p for p. .3. (1 − z)2 (6. we say that the random variable X takes only values in a set C whenever X(Ω) ⊆ C.3.1 A discrete probability space isPa pair (Ω.3 6. the map p canPbe extended to subsets of Ω considering the map p : 2Ω → [0. p).3 Let X : Ω → R be a discrete random variable over (Ω. In the sequel.16) we conclude that Gr (z) = zGt (z) = z . Example 6. 3. 1] such that p(A) = ω∈A p(ω).22 we can obtain closed forms for other generating function.3. 1. beginning with the notion of discrete probability space. p) is a map X : Ω → R. in Example 6.194 CHAPTER 6. X(Ω) denotes the set {X(w) : w ∈ Ω}. p) where Ω is a countable set and p : Ω → [0.  . We now introduce discrete random variables.  A random variable over a discrete probability space is said to be a discrete random variable.2.2. Moreover.. 2. Definition 6.23 Let r be the sequence 0.  Some parameters are useful to characterize the probability function of a random variable. of X. but we just say which values X takes.3. The variance of X is V (X) = E((X − E(X))2 ).6. or mean. is +∞ X k m PX (k) k=0 wheneverP this summation is a real number. . We can consider the sequence PX (0). Let X be a discrete random variable taking only nonnegative integer values. The joint probability function associated with X and Y is the map PXY : R2 → [0. we just define X(Ω). E(X) = +∞ k=0 kPX (k) is the expected value.4 Let X and Y be discrete random variables over the same probability space (Ω. We now introduce probability generating functions. For each m ∈ N. the probability generating function associated with X. p) we often do not explicitly define X(w) for each w ∈ Ω. The two variables are independent whenever PXY (x. the m-th moment of X. denoted by E(X m ). The first moment of X.3. . that is. Note that Definition 6.5 Let X be a discrete random variable over (Ω. Since in the sequel we only refer to discrete random variables taking only nonnegative integer values. 1] such that PXY (x. and therefore the corresponding the generating function. . When characterizing a discrete random variable X over a probability space (Ω. Definition 6.18) and therefore the variance of X can be computed using the first and the second moments of X.3.3. MOTIVATING EXAMPLES REVISITED 195 We also write P (X = x) to denote PX (x). Moreover. p) taking only nonnegative integer values. we may not explicitly refer to the the function p. 
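Before moving on to random variables, note that closed forms such as (6.13), (6.14) and (6.17) are easy to cross-check by expanding them back into truncated power series; the short session below is only such a sanity check, not part of the original text, and instantiates (6.14) with a = 3.

    Series[1/(1 - z), {z, 0, 5}]        (* 1 + z + z^2 + z^3 + z^4 + z^5 + O[z]^6 *)
    Series[1/(1 - 3 z), {z, 0, 4}]      (* 1 + 3 z + 9 z^2 + 27 z^3 + 81 z^4 + O[z]^5 *)
    Series[z/(1 - z)^2, {z, 0, 5}]      (* z + 2 z^2 + 3 z^3 + 4 z^4 + 5 z^5 + O[z]^6 *)

The coefficients obtained are exactly the sequences 1, 1, 1, . . ., the geometric progression with ratio 3, and 0, 1, 2, 3, . . ., respectively.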
defining instead the values of PX (x) for each x ∈ X(Ω) in a suitable way. we just introduce expected value and variance of such random variables. that is.  It can be easily proved that V (X) = E((X − E(X))2 ) = E(X 2 ) − (E(X))2 (6. y) = PX (x)PY (y). Definition 6. y) = p({w ∈ Ω : X(w) = x} ∩ {w ∈ Ω : Y (w) = y}). such as the expected value and the variance. . In some situations it is relevant to consider several random variables over the same probability space.3 implies that PX (x) is always 0 whenever x ∈ / X(Ω).. p). PX (1). PX (2). GENERATING FUNCTIONS Definition 6. The expected value of a discrete random variable X taking only nonnegative integer values can be computed using the derivative of the probability generating function of X. GX (z) = k=0  The sets {w ∈ Ω S : X(w) = k} and {w ∈ Ω : X(w) = k ′ } are disjoint for distinct k. and k∈N0 ({w ∈ Ω : X(w) = k}) = Ω. Proposition 6. GX (1) = 1. that is. is the generating function sX = {(sX )n }n∈N0 such that (sX )n = PX (n) for all n ∈ N0 .6 Let X be a discrete random variable over (Ω. As a consequence. p) taking only nonnegative integer values.1.19) k=0 Therefore.3.3. E(X 2 ) = G′′X (1) + G′X (1) 2. The probability generating function of X. V (X) = G′′X (1) + G′X (1) − (G′X (1))2 . QED The second moment of X can be computed using the first and the second derivatives of GX (z). the equalities +∞ P (k) = X k=0 w∈Ω p(w) = 1 hold. Proposition 6. recallP P ing also Definition 6.3.7 Let X be a discrete random variable over (Ω. (6.8 Let X be a discrete random variable over (Ω. denoted by GX (z). that is. Then E(X) = G′X (1). Proof: The first derivative of the probability generating function of X is G′X (z) = +∞ X (k + 1)PX (k + 1)z k . +∞ X PX (k)z k .196 CHAPTER 6. 1. k=0 Note that E(X) is a real number if and only if G′X (1) is a real number.3. p) taking only nonnegative integer values. p) taking only nonnegative integer values. k ′ ∈ N0 . G′X (1) = +∞ X (k + 1)PX (k + 1) = k=0 +∞ X kPX (k) = E(X). 3.3. Proof: The probability generating function of X + Y is GX+Y (z) = +∞ X PX+Y (k)z k . k − i) = for each k ∈ N0 . GX+Y (z) = +∞ X k=0 PX (k)z k ! +∞ X k=0 k X i=0 PX (i)PY (n − i) PY (k)z k ! = GX (z)GY (z).18). Hence. PX+Y (k) = k X i=0 PXY (i. . MOTIVATING EXAMPLES REVISITED Proof: 1. If X and Y are independent random variables then GX+Y (z) = GX (z)GY (z). It follows from 1.3. G′′X (z) + G′X (z) = +∞ X k=0 = +∞ X k=0 = k (k − 1)kPX (k)z + +∞ X kPX (k)z k k=0 ((k − 1)kPX (k) + kPX (k))z k +∞ X k 2 PX (k)z k k=0 Hence. The second derivative of the probability generating function of X is G′′X (z) +∞ X = (k + 1)(k + 2)PX (k + 2)z k k=0 Therefore.197 6. Proposition 6. k=0 Since X and Y are independent. Proposition 6. k=0 2. whenever G′′X (z) and G′X (z) converge at z = 1 it holds that G′′X (1) + G′X (1) = +∞ X k 2 PX (k) = E(X 2 ).9 Let X and Y be two discrete random variables over (Ω. QED The following result is useful in the sequel. p) taking only nonnegative integer values.7 and the equality (6. Average case analysis: unsuccessful search In this section we return to the motivating example presented in section 6. m}. We want to determine the average number of comparisons key[[j]]==k that are performed when searching for k.1 to determine whether k has already been stored.1. 2.1 and show how to use probability generating functions in the average-case analysis of the key searching algorithm. . K 6= Ki for each 1 ≤ i ≤ n. Assume we are searching for a key K and assume that n ∈ N keys have already been stored in table Key. . 
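Propositions 6.3.7 and 6.3.8 reduce the computation of the mean and the variance to derivatives of the probability generating function evaluated at 1. As a quick illustration (not taken from the text), consider a fair six-sided die, whose probability generating function is (z + z² + · · · + z⁶)/6:

    G[z_] := (z + z^2 + z^3 + z^4 + z^5 + z^6)/6
    G'[1]                          (* mean: 7/2 *)
    G''[1] + G'[1] - G'[1]^2       (* variance: 35/12 *)

The values 7/2 and 35/12 agree with the usual direct computation of E(X) and V(X) for a fair die.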
We first consider unsuccessful search. Hence. GXn (z) holds. . the key K is not yet been stored. To this end some probabilistic hypothesis have to be considered. we have two lists L1 and L2 . and therefore there are m different lists where to search for K.. .198 CHAPTER 6. we denote by Ki the key Key[i]. For illustration purposes suppose m = 2 that is.+Xn (z) = GX1 (z)GX2 (z) . . the equality GX1 +X2 +. The average case analysis depends on whether the key has already been stored (successful search) or not (unsuccessful search). • 1 comparison key[[j]]==k if scenario (3). Suppose we have a hash function h : K → {1. that is. where m ∈ N and K is the key space. Since we are analyzing unsuccessful searches. For simplicity.. There are 23 = 8 possible scenarios that can be sketched as follows: L1 L2 K1 K2 K3 L1 L2 K1 K2 K3 (1) (2) L1 L2 K1 K2 K3 L1 L2 K2 K1 K3 L1 L2 K3 K1 K2 L1 L2 K1 K3 K2 L1 L2 K1 K2 K3 L1 L2 K2 K1 K3 (3) (4) (5) (6) (7) (8) Observe that if h(K) = 1 then we have to perform • 0 comparisons key[[j]]==k if scenario (2) is the case. . Recall the function keySearch in Figure 6. . and suppose n = 3. (4) or (5) is the case. GENERATING FUNCTIONS QED The above result can be extended to the sum of a finite number of random variables. Furthermore. To compute the average number of the intended comparisons we reason as follows. . . r3 . . 2. with respect to the probability space involved. To characterize the number of comparisons we first consider the random variables X1 . Letting Ai = {(r1 . . In the uniform case. (7) or (8) is the case. . . r4 ) ∈ {1. Note that in every scenario each stored key Ki is compared to K at most once and it is only compared to K whenever h(K) = h(Ki ). . . .3. Thus. rn+1 ) ∈ Ω : ri = rn+1 } for each 1 ≤ i ≤ n. . corresponding to the number of possible comparisons between K and Ki . We can use a tuple (r1 . all the elementary 1 events have the same probability and therefore p(w) = 16 for each elementary event w. . each elementary event represents one of the mn+1 possible situations corresponding to the mn possible ways the n keys can be distributed by the m lists and the m possible values of h(K). As described above. r2 . . MOTIVATING EXAMPLES REVISITED 199 • 2 comparisons key[[j]]==k if scenario (6). Similarly. . 1. (1) The discrete probability space is (Ω. (7) or (8). (4) or (5). 2 or 3 comparisons have to be performed when we have scenario (1). • 3 comparisons key[[j]]==k if scenario (1) is the case. scenarios (3). and scenario (2) respectively. m}n+1 .6. . if h(K) = 2 then 0. m} with m ∈ N. scenarios (6). we can consider in this example 16 (=8×2) elementary events w corresponding to the 8 possible ways K1 . 1} and Xi (r1 . K is always compared with all the keys in the list h(K). We now return to the general case with n ∈ N keys already stored and a hash function h : K → {1. where each Xi is a random variable that only takes values in {0. 1}. p) such that Xi (Ω) = {0. PXi (1) = X w∈Ai p(w) and PXi (0) = 1 − PXi (1). K2 and K3 can be distributed by the 2 lists and the 2 possible values of h(K). (2) For each 1 ≤ i ≤ n we have a random variable Xi over the probability space (Ω. p). p) where Ω = {1. the number of comparisons NC can be defined as NC = X1 + X2 + X3 . rn+1 ) = ( 1 if rn+1 = ri 0 otherwise The values of Xi correspond to number of possible comparisons between K and Ki . X2 and X3 over (Ω. Then. 2}4 to represent such an event letting ri = h(Ki ) for 1 ≤ i ≤ 3 and r4 = h(K). . . . . n}. .3. 
we consider the discrete random variable NC = n X Xi i=1 Note that NC(Ω) = {0. i=1 Then we just compute E(NC) = G′N C (1). . n ∈ N. Xn are pairwise independent. . . The average case number of comparisons is then E(NC). m}n+1 and p(w) = where m.10 Assuming an uniform distribution. .200 CHAPTER 6. . a situation where every elementary event w is equally likely: • (Ω. (3) Finally. • for 1 ≤ i ≤ n PXi (1) = m × mn−1 × 1 mn+1 = 1 m 1 mn+1 for each w ∈ Ω. . Example 6. p) is such that Ω = {1. . that is. . . every elementary event is equally likely. GENERATING FUNCTIONS The random variables X1 . Since GXi (z) = +∞ X PXi (r)z r = PXi (0) + PXi (1)z r=0 we get GN C (z) = n Y (PXi (0) + PXi (1)z). We can also compute the variance V (NC) = G′′N C (1) + G′N C (1) − (G′N C (1))2 . . 1. the expected value of NC. Xn are pairwise independent GN C (z) = n Y GXi (z) i=1 and therefore we have to compute GXi (z) for each 1 ≤ i ≤ n. . . The following example illustrates the average case analysis assuming an uniform distribution. The values of NC correspond to the possible number of the intended comparisons. . . that is. . (4) We use the probability generating function of NC to compute E(NC). Since X1 . 1]. being both probabilities equal to m1 .10. m m m m r=1 . as expected m X 1 1 1 1 PXi (1) = × =m× 2 = . An alternative way of characterizing Xi without explicitly involving the elementary events and their probabilities is as follows:   1 if h(K) = h(Ki ) • Xi =  0 otherwise and • PXi (1) = m X P (h(K) = r e h(Ki ) = r) r=1 PXi (0) = 1 − PXi (1) where P (h(K) = r e h(Ki ) = r) ∈ [0.3. P (h(K) = r e h(Ki ) = r) = P (h(K) = r) × P (h(Ki ) = r) where P (h(K) = r) is the probability that h(K) = r and P (h(Ki ) = r) is the probability that h(Ki ) = r.6. is the probability that both h(K) = r and h(Ki ) = r. ◭ The variance of NC is V (NC) = G′′N C (1) + G′N C (1) − (G′N C (1)) = m2 To end this section we return to the characterization of the random variables Xi for 1 ≤ i ≤ n. MOTIVATING EXAMPLES REVISITED PXi (0) = 1 − PXi (1) = 201 m−1 m m−1 1 + z m m  n n Y m−1 1 • GN C (z) = GXi (z) = + z m m i=1 GXi (z) =  n−1 m−1 1 • + z m m  n−2 1 n(n − 1) m − 1 ′′ + z • GN C (z) = m2 m m G′N C (z) n = m The average case number of comparisons between the stored keys and the given key K is then n E(NC) = G′N C (1) = m n(m − 1) 2 . In the uniform case described in Example 6.3. This probability depends on probability space involved. for each 1 ≤ r ≤ m. Hence. Recall that Ki denotes the key Key[i] and that K1 was the first key to be stored. . . . This situation differs from the case of unsuccessful searches in several aspects. n}. . p) such that Y (Ω) = {1. . .202 CHAPTER 6. rn+1) = rn+1 . . . Finally. whereas in a successful search this may not be the case since we need no more comparisons once we find K. .. Recall the set Ai defined above for each 1 ≤ i ≤ n. To begin with note that the minimum number of comparisons in the case of an unsuccessful search is 0 but it is 1 in the case of a successful search. in an unsuccessful search. Observe also that if K = Kj for some 1 ≤ j ≤ n then there are no comparisons between K and Kj ′ for any j ′ > j. K2 . . . rn+1 ) ∈ Ω where ri = h(Ki ) for each 1 ≤ i ≤ n and rn+1 ∈ {1. GENERATING FUNCTIONS Let us give a closer look to the relationship between the two definitions we have presented for PXi (1). The . . . . To compute the average number of the intended comparisons we reason as follows. . . We r have that Ai = A1i ∪ . . 
Each possible scenario consists of a particular distribution of the keys K1 . . since list h(K) is never empty. Moreover. . K is compared with all the keys in the list h(K). and therefore the maximum number of comparisons is j. rn+1 ) ∈ Ω : ri = rn+1 = r} for each 1 ≤ r ≤ m. n}. . that n ∈ N keys have already been stored. m}n × {1. 2. ∪ Am i where Ai = {(r1 . we are analyzing successful searches. and that the hash function is h : K → {1. Average case analysis: successful search We now address the successful search case. note that in a successful search we have to take also into account the probability that K = Kj for each 1 ≤ j ≤ n. . (1) The discrete probability space is (Ω. Kn over the m lists and K = Ki for some 1 ≤ i ≤ n. with Y (r1 . . n} indicates which of the n keys is K. This is the case because when j ′ > j then Kj ′ occurs after Kj in table Key. . . m}. . . Since. These sets are pairwise disjoint and therefore PXi (1) = X p(w) = w∈Ai m X p(Ari ) r=1 Note that p(Ari ) is the probability that h(K) = h(Ki ) = r which just corresponds to the probability P (h(K) = r e h(Ki ) = r) mentioned above. . . . . . We again assume we are searching for a key K. K2 the second and so on. . whit m ∈ N. . . (2) We consider a discrete random variable Y over the discrete probability space (Ω. r2 . the key K we are looking for is one of the keys already stored. each elementary event is a tuple (r1 . . . Thus. p) where in this case Ω = {1. . Y is defined just stating that Y =j when K = Kj for 1 ≤ j ≤ n. assuming that K is Kj .  1 when i = j   m X PXij (1) =  P (h(ki) = r e h(kj ) = r) when i 6= j  r=1 and PXij (0) = 1 − PXij (1) where P (h(K) = r e h(Ki ) = r) ∈ [0. X2j . for each 1 ≤ r ≤ m. 1] is the probability that both h(K) = r and h(Ki ) = r. we have PY (j) = w∈Bj p(w) for each 1 ≤ j ≤ n. . Recall that when K = Kj there are no comparisons between K and Kj ′ for any j ′ > j. hence for each j we only need Xij for each 1 ≤ i ≤ j. . Xjj over (Ω. 1} for each 1 ≤ i ≤ j.3. X2j . (4) For each 1 ≤ j ≤ n we consider the random variable NCj = j X Xij i=1 Note that NC(Ω) = {1. j}. Thus. Dropping the reference to the elementary events. . MOTIVATING EXAMPLES REVISITED 203 values of Y correspond to the n possible values of K. The random variables X1j . . Moreover.. . there is only a comparison between Kj and Ki whenever h(Ki ) = h(Kj ). (3) For each 1 ≤ j ≤ n we also consider the discrete random variables X1j . Letting Bj ⊆ Ω be thePset of all elementary events whose last component is j. . p) such that Xij (Ω) = {0.6. . . . .. Dropping the reference to the elementary events Xij is such that   1 if h(Ki ) = h(Kj ) Xij =  0 otherwise Note that Xjj is always equal to 1. Xjj are pairwise independent. assuming that K is Kj . The values of Xij correspond to the possible number of comparisons between K and Ki . . The values of NCj correspond to the possible number of comparisons between the stored keys and K. 3. .11 Assuming an uniform distribution. . with m. . X2j . The variance V (NC) can also be computed as expected. . p) is such that Ω = {1. Example 6. . we get GN Cj (z) = j Y i=0 j Y (PXij (0) + PXij (1)z). . The following example illustrates the average case analysis assuming that every elementary event is equally likely. n} and p(w) = w ∈ Ω. since X1j . the average number of comparisons is E(NC) = G′ (1). j=1 The average case number of comparisons is E(NC). m}n × {1. that is. n ∈ N. The values of NC correspond the possible number of comparisons between the stored keys and K.. the expected value of NC. 
For each 1 ≤ nc ≤ n we have that PN C (nc) = n X PY (j)PN Cj (nc). . . . . (6) The probability generating function of NC is used to compute E(NC). n}. . • PY (j) = mn × mn 1 1 = ×n n • for 1 ≤ j ≤ n and 1 ≤ i ≤ j for each 1 ≤ j ≤ n 1 mn ×n for each . GN C (z) = j=1 For each 1 ≤ j ≤ n and 1 ≤ i ≤ j we have GXij (z) = +∞ X PXij (r)z r = PXij (0) + PXij (1)z r=0 and. GENERATING FUNCTIONS (5) We then have the random variable NC over (Ω. Xjj are pairwise independent. We can prove that n X PY (j)GN Cj (z). a situation where every elementary event w is equally likely: • (Ω. p) such that NC(Ω) = {1.204 CHAPTER 6. GXij (z) = i=0 Finally. . . . . After presenting a first example. 2m ◭ 6. we consider the case of the Fibonacci sequence. We then return to the analysis of the Euclid’s algorithm already introduced in Section 6.1.  when i = j  z GXij (z) = m−1 1  + z when i 6= j m m • GN Cj (z) = z  1 m−1 + z m m j−1 for each 1 ≤ j ≤ n j−1 j−1   n  n X m−1 1 1 z X m−1 1 + + z • GN C (z) = z = n m m n j=1 m m j=1 The average case number of comparisons between the stored keys and the given key K is then E(NC) = G′N C (1) = n−1 + 1.3.   1 when i = j PXij (1) =  1 when i 6= j m and therefore  when i = j  0 PXij (0) = 1 − PXij (1) =  m − 1 when i 6= j m Moreover. .3. MOTIVATING EXAMPLES REVISITED 205  1 when i = j   m X PXij (1) = 1 1  × when i 6= j  m m r=1 that is.2 Euclid’s algorithm We first discuss how generating functions can be used for solving recurrence relations.6.2. 20.20) We will use the generating function G(z) = +∞ X ti z i i=0 for t to solve the recurrence relation 6.2. GENERATING FUNCTIONS Solving recurrence relations: a first example Consider the sequence t = {tn }n∈N0 such that t0 = 0 tn = 3tn−1 − 2 and for n ≥ 1. Let us proceed as follows. (i) First of all note that G(z) = +∞ X ti z i = i=0 +∞ X ti z i i=1 = +∞ X i=1 =3 (3ti−1 − 2)z i +∞ X i=1 =3 +∞ X ti−1 z i − 2 ti z i+1 i=0 = 3z +∞ X i=0 = 3z +∞ X i=0 +∞ X i=1 − 2(−1 + ti z i − 2(−1 + ti z i − 2z 1−z G(z) = 3zG(z) − 2z 1−z (ii) Solving G(z) = +∞ X zi ) i=0 1 ) 1−z 2z 1−z = 3zG(z) − for G(z) we get zi −2z (1 − z)(1 − 3z) (6. recall the notations and closed forms presented in Section 6. (6.21) . In the sequel. in order to get an expression for tn that does not depend on other elements of the sequence.206 CHAPTER 6. Solving the Fibonacci sequence The Fibonacci numbers are present in several different situations. since then the coefficient ai is an expression for ti .22) . respectively. and therefore A = 1 and B = −1. Fibonacci numbers are used in some pseudorandom generators and. for each i ∈ N0 .2. MOTIVATING EXAMPLES REVISITED (iii) P Now. For z = 0 and z = −1 we get A + B = 0 and 4A + 2B = 2.23) into a power series +∞ i i=0 ai z . −2z 1 −1 = + . from −2z A B = + (1 − z)(1 − 3z) 1 − z 1 − 3z we get A(1 − 3z) + B(1 − z) = −2z. in the analysis of the Euclid’s algorithm. −1 1 + 1 − z 1 − 3z = +∞ X i=0 = +∞ X i z − +∞ X 3i z i i=0 (−3i + 1)z i i=0 (iv) From (ii) and (iii) we get G(z) = +∞ X (−3i + 1)z i i=0 and therefore t = {tn }n∈N0 is such that tn = −3n + 1 for each n ∈ N0 .1. from Biology (eg the arrangement of leaves on a stem and the patterns of some pine cones) to Economics (eg trading algorithms and strategies in financial markets). the goal is to expand the right hand side of (6. Recall that the Fibonacci sequence is the sequence s = {sn }n∈N0 such that s0 = 0 s1 = 1 and sn = sn−1 + sn−2 for n ≥ 2.207 6.1) On one hand. (6. (1 − z)(1 − 3z) 1 − z 1 − 3z (iii.3. (iii.2) On the other hand. Thus. 
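The closed form tn = −3ⁿ + 1 obtained in step (iv) can be checked against the recurrence t0 = 0, tn = 3tn−1 − 2 itself; the comparison below is only such a check (RecurrenceTable and the explicit formula must produce the same values).

    RecurrenceTable[{t[n] == 3 t[n - 1] - 2, t[0] == 0}, t, {n, 0, 5}]
    (* {0, -2, -8, -26, -80, -242} *)
    Table[-3^n + 1, {n, 0, 5}]
    (* {0, -2, -8, -26, -80, -242} *)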
as we have already mentioned in Section 6. thus obtaining an expression for sn that does not depend on other elements of the sequence. (ii) Solving G(z) = z + zG(z) + z 2 G(z) for G(z) we get G(z) = z 1 − z − z2 (6.1) To make things easier√we first rewrite it as follows. GENERATING FUNCTIONS Using the generating function G(z) = +∞ X si z i i=0 of the Fibonacci sequence we are able to solve the recurrence relation 6.208 CHAPTER 6. of 1 − z − z are β1 = 2 . (i) Note that +∞ X si z i =z+ i=0 +∞ X si z i i=2 =z+ +∞ X (si−1 + si−2 )z i i=2 =z+ +∞ X i si−1 z + =z+ si z i+1 + i=0 =z+z si−2 z i i=2 i=2 +∞ X +∞ X +∞ X +∞ X si z i+2 i=0 i si z + z i=0 2 +∞ X si z i .22. (iii.23) into a power series +∞ i i=0 ai z . since then the coefficient ai is the Fibonacci number si . . Note that the roots √ −1− 5 −1+ 5 2 and note that β1 β2 = −1. G(z) = z + zG(z) + z 2 G(z). i=0 Hence. and β2 = 2 Hence.23) (iii) P The goal is to expand the right hand side of (6. for each i ∈ N0 . The following steps are similar to the ones presented in the previous example. 2) P+∞ Wei now expand the right hand side of (6. MOTIVATING EXAMPLES REVISITED 1 − z − z2 = −(z − β1 )(z − β2 )    z z −1 −1 = −β1 β2 β1 β2    z z = 1− 1− β1 β2 and therefore we get z z = 2 1−z−z (1 − α1 z)(1 − α2 z) where α1 = From 1 β1 = 2√ −1+ 5 = √ 1+ 5 2 and α2 = 1 β2 = 2√ −1− 5 = √ 1− 5 . 2 A B z = + (1 − α1 z)(1 − α2 z) 1 − α1 z 1 − α2 z we get the equation A(1 − α2 z) + B(1 − α1 z) = z. As a consequence.23) into a power series i=0 ai z as follows   z 1 1 1 =√ − 1 − z − z2 5 1 − α1 z 1 − α2 z ! +∞ +∞ X 1 X (α1 z)i − (α2 z)i =√ 5 i=0 i=0   +∞ X α1i − α2i √ zi = 5 i=0 (iv) From (ii) and (iii) we conclude that the generating function for the Fibonacci sequence is  +∞  i X α1 − α2i √ G(z) = zi 5 i=0 and therefore the Fibonacci sequence s = {sn }n∈N0 is such that . 1 z =√ (1 − α1 z)(1 − α2 z) 5  1 1 − 1 − α1 z 1 − α2 z  (iii.3. For z = 0 and z = 1 this equation yields A + B = 0 and A(1 − α2 ) + B(1 − α1 ) = 1 respectively and therefore A = √15 and B = − √15 .209 6. 3. for instance.  The following proposition states that all sequences involving a linear recurrence relation have indeed generating functions given by the quotient of two polynomial functions. Pn Pm i i Proposition 6.3. −ρj p(1/ρj ) q ′ (1/ρj )  To find the real numbers ρ1 .210 CHAPTER 6. . .12 it is often useful to use the following property. (1 − ρm z) if and only if q r = q0 (z − ρ1 ) . GENERATING FUNCTIONS √ !n 1+ 5 − 2 1 αn − αn sn = 1 √ 2 = √ 5 5 √ !n ! 1− 5 2 for all n ∈ N0 . . . . For simplicity. We do not further develop this subject herein.  Many recurrence relations can also be solved reasoning as above for the Fibonacci sequence. The function r = such that r(z) = +∞ X p is q ai z i i=0 where.12 Let p = i=0 pi z and q = i=0 qi z be polynomials in R[z] such that deg(p) < deg(q) and q = q0 (1 − ρ1 z) . Let q = i=0 qm−i z .3. in the sequel we also use p to denote the polynomial function associated with a polynomial p in R[z]. . We just note that step (iii) may be not so easy as in the case of the Fibonacci sequence and the following theorem. (1 − ρm z) where ρi 6= 0 and ρi 6= ρj for all 1 ≤ i. may be helpful. Then q = q0 (1 − ρ1 z) . the rational expansion theorem for distinct roots. The interested reader is referred. ρm referred to in Proposition 6. to [31]. (z − ρm ).13 P Let q = m i=0 qi z be a polynomial of degree m in R[z] such m i r that q0 6= 0. . . . P i Proposition 6. . for each i ∈ N0 ai = m X bj ρij j=1 with bj = for each 1 ≤ j ≤ m. j ≤ m with i 6= j. . 15. . . .3. 
Proposition 6.m] ≥ sk+2 + sk+1 = sk+3 .2 it is enough to study the case where m < n.3. Since there are at least two recursive calls. Theorem 6. It establishes an upper bound for the number of recursive calls that take place when evaluating euclid[m.m] ≥ sk+1 .3. .n].m]. m ≥ sk+2 and Mod[n.n] with n > m ≥ 1 there are k ∈ N recursive calls then n ≥ sk+2 and m ≥ sk+1 where s = {sn }n∈N0 is the Fibonacci sequence. + ck sn where c1 . It easily follows from Proposition 6. .3. if m < sk+1 . As explained in Section 6.m].  Euclid’s algorithm analysis We now return to the analysis of the Euclid’s algorithm (see Figure Figure 1. Since n > m. sk−1 are explicitly given and the other terms are defined by the recurrence relation sn+k = c1 sn+k−1 + c2 sn+k−2 + . . n ∈ N.3.15 If when evaluating euclid[m. Then Gs (z) = +∞ X i=0 si z i = p q where p and q are polynomials in R[z] such that deg(p) = k and deg(p) ≤ k. But jnk m + Mod[n. then m ≥ 1.1.n] involves less than k recursive calls. The analysis of this algorithm involves counting the number of recursive calls performed when euclid[m.1.m]. . we have n ≥ 2 = s3 . . since m < n. by the induction hypothesis.14 Let s = {sn }n∈N0 be a sequence given by a linear recurrence relation of order k ∈ N. Therefore. where s = {sn }n∈N0 is the Fibonacci sequence. as consequence.16 For each k. QED The following theorem is known as the Lam´e theorem. ⌊ m ⌋m ≥ m. . Step: We have to prove that n ≥ sk+3 and m ≥ sk+2 when euclid[m. the first recursive call is euclid[Mod[n. MOTIVATING EXAMPLES REVISITED Proposition 6. that is. We can then conclude that n ≥ m + Mod[n. there k > 1 recursive calls involved in the evaluation of euclid[Mod[n.m] n= m n n and.n] is evaluated. . Proof: We use induction on k. then the evaluation of euclid[m.m]. the inequality ⌊ m ⌋ ≥ 1 holds and. Basis: If k = 1 then m ≥ 1 = s2 . Hence.2. Moreover.4) introduced in Section 6.211 6. ck ∈ R. .n] evaluation involves k + 1 ≥ 2 recursive calls. the real numbers s0 . m. sk+1 ].4 Exercises 1.sk+1 ] involves k − 1 recursive calls. Proof: We use induction on k ≥ 2.3. Consider the sequences s = {sn }n∈N0 and t = {tn }n∈N0 .sk ]. √ given that | 1−2 5 | < 1. Since s2 = 1 and s3 = 2. Then sk−1 < sk and therefore ⌊ k skk−1 ⌋ = 1. that is. that is k − 1 recursive calls. we would have m ≥ sk+1 .17 Let s = {sn }n∈N0 be the Fibonacci sequence.sk+1 ] yelds a call to euclid[Mod[sk+1 .sk ] involves exactly k − 2 recursive calls and therefore the evaluation of euclid[sk . By the induction hypothesis.     sk+1 sk + sk−1 mod(sk+1 . Hence. Pi P 1 i Gs (z) = ∞ (b) 1−z j=0 sj )z . the evaluation of euclid[s2 . Reof the  definition √ n  √ n 1− 1 5 5 1+ for all n ∈ N0 . The following proposition states that if the conditions of the theorem hold.212 CHAPTER 6.sk ]. the worst case number of recursive calls is indeed close to k when evaluating euclid[sk . Basis: Let k = 2. the evaluation of euclid[sk−1 . that is. s +s Step: Let k > 2. sk ) = sk+1 − sk = (sk + sk−1 ) − sk = sk−1 sk sk The evaluation of euclid[sk . GENERATING FUNCTIONS Proof: If there were k or more recursive calls. QED The nonrecursive Fibonacci sequence s is relevant herein. when computing the greatest common divisor of two consecutive Fibonacci numbers. i=0 ( .sk+1 ] involves exactly k − 1 recursive calls. Hence.3. In this case there are k − 1 recursive calls. but we do not know how close the real number of calls is to this upper bound. Proposition 6.s3 ] involves exactly one recursive call. For each k ∈ N0 such that k ≥ 2 the evaluation of euclid[sk . 
The Lam´e’s theorem establishes an upper bound for the number of recursive calls that take place when evaluating euclid[m. we can conclude − call that sn = √5 2 2  √ k that k ∈ O(loge m) since sk is approximately √15 1+2 5 for large integers k. We can prove that the function Mod involves a O(log2e (m)) number of bit operations and therefore there are a O(log33 (m)) number of bit operations. by Proposition 6. Prove that P i (a) aGs (z) + bGt (z) = ∞ i=0 (asi + bti )z .15. QED 6.n]. to euclid[sk−1 .sk ]. 3.4. k+1 (e) PX (k) = 32 . Let (Ω. Find a closed form for the probability generating function of the random variable X that describes a loaded dice such that the probability of seeing an even number of spots in a roll is half the probability of seeing an odd number. n (g) sn = 3 if n ≥ m and sn = 0 otherwise. for some n ∈ N. (c) sn = 2n + 1. k (d) PX (k) = 32 13 . Let h be the hashing function and let NC be the random variable corresponding to the number of comparisons between keys involved in a unsuccessful search of a key K. Compute the mean and the variance of NC when (a) there are 2 lists and the probability that h(K) = 1 is 14 . EXERCISES 2.213 6. (f) sn = 3 2 if n ≥ m and sn = 0 otherwise. (b) sn = 2 if n is odd and sn = 1 otherwise. for some m ∈ N0 . 7. 5. for some m ∈ N0 . (b) PX (k) = 4. 8. Find the mean and variance of the random variable X in Exercise 4. Let h be the hashing function and let NC be the random variable corresponding to the number of comparisons between keys involved in a successful search . (d) sn = 5n + 5. 6. k (c) PX (k) = 15 . p) be a discrete probability space. Find a closed form for the probability generating function of the random variable X : Ω → N0 such that (a) PX (k) = 1 6 1 n if k < 6 and PX (k) = 0 otherwise. if k < n and PX (k) = 0 otherwise. Consider the hashing technique for storing and retrieving information. Find the mean and variance of the random variables X in Exercise 3. (e) sn = 3n + 2n . (b) there are 3 lists and the probability of that h(K) = 1 and that h(K) = 2 is p1 and p2 respectively. Consider the hashing technique for storing and retrieving information. Find a closed form for the generating function for the sequence s = {sn }n∈N0 where (a) sn = 1 if n is a multiple of 3 and sn = 0 otherwise. Compute the mean and the variance of NC when there are 3 lists and (i) the probability that h(K) = 1 is 13 and the probability that h(K) = 2 is 12 . Compute the mean and the variance of NC when there are 2 lists and the probability that h(K) = 1 is p1 . 9. (ii) the probability that K is the first key inserted is 43 and it is uniformly distributed otherwise. GENERATING FUNCTIONS of a key K. 10. . Consider the hashing technique for storing and retrieving information. s1 = 1 and sn = 3sn−1 + 4sn−2 for n ≥ 2. Let h be the hashing function and let NC be the random variable corresponding to the number of comparisons between keys involved in a successful search of a key K.214 CHAPTER 6. Solve the following recurrence relations using generating functions (a) s = {sn }n∈N0 where s0 = 0 and sn = 3sn−1 − 2 for n ≥ 1. (c) s = {sn }n∈N0 where s0 = 0. (b) s = {sn }n∈N0 where s0 = 2 and sn = 6sn−1 − 5 for n ≥ 1. Appl. P. Appl. volume 3 of Graduate Studies in Mathematics. Adams and P. J.-M. Lascar. Addison-Wesley Series in Computer Science and Information Processing. Ideals. and O. SIAM Journal on Computing. and D. 213(8):1612–1635. Mart´ınezMoro. Blum. M. M. Wienand. E. PhD thesis. Algebra. 1994. Algebra Engrg.. University of Innsbruck. 2008. 
A course with exercises. 1986. Buchberger. third edition. Buchberger. Wedler. Math.. In Robotics. Dreyer. 4(3):374–383. A. [6] M. Mathematical logic. Comput. pages 49–89. [9] R. Oxford University Press. Brickenstein. American Mathematical Society. Amer. [4] L. J. Ullman. Sympos. Loustaunau. 2007. Providence. 19(5):393–411. Blum. 15:364–383. Ein Algorithmus zum Auffinden der Basiselemente des Restklassenringes nach einem nulldimensionalen Polynomideal. 215 . Soc. New developments in the theory of Gr¨obner bases and applications to formal verification. P. Math. Providence. A. Cori and D. A. Springer.Bibliography [1] W. 1983. Fitzpatrick. Borges-Quintana. M. [3] J. Resolution of kinematic redundancy. RI. 2000. Shub. RI. Varieties and Algorithms. 1990. Little. A simple unpredictable random number generator. Ein algorithmisches kriterium f¨ ur die l¨osbarkeit eines algebraischen gleichungssystems. Cox. [2] A. Part I. Addison-Wesley. Greuel. Gr¨obner bases and combinatorics for binary codes. [7] B. 1965. Hopcroft.. [5] M. and E. 1970. [10] D. An Introduction to Gr¨obner Bases. G. Baillieul and D. Data Structures and Algorithms. [8] B. O’Shea. Borges-Trenard. Aho. and M. Aequationes mathematicae. and J. Pure Appl. V. volume 41 of Proc. Martin. Comm. D. 2009. J. 1974. Discrete Appl. Cambridge University Press. 2007. Volume 2. and Hybrid Petri Nets. Van Loan. C. Math. 1999. Algebra. K. A. [15] G. L. SpringerVerlag. 2004. third edition. 1996. and Control. 2007. 1998. Mayr and A. Vaughan. Paul. volume 97 of Cambridge Studies in Advanced Mathematics. 46:305–329.216 BIBLIOGRAPHY [11] B. M. [20] S. 1998. W. Elementary Number Theory. The Art of Computer Programming. Continuous. Lectures on Generating Functions. Montgomery and R. 155(12):1514–1524. Davey and H. volume 23 of Student Mathematical Library. [14] P. MD. [25] R. third edition. H. Lang. 2005. Springer. Reason. Lang. Johns Hopkins Studies in the Mathematical Sciences. Springer-Verlag London. Cambridge. Meyer. 1982. Henrici. Molitor and J. Jones and J. volume 211 of Graduate Texts in Mathematics. Principles. Multiplicative Number Theory. Advances in Mathematics. [12] R. Applied and Computational Complex Analysis. Jones. Methods. Addison-Wesley. second edition. Programming. Introduction to Lattices and Order. Golub and C. 43(1):81–119. Springer. [22] E. Complex Analysis. Matrix Computations. Robot Manipulators: Mathematics. Classical Theory. Lando. Mohnke. Discrete. [18] S. A. Formalization and implementation of modern SAT solvers.. Selman. Alla. Springer Undergraduate Mathematics Series. [16] H. Mari´c. R. I. John Wiley & Sons. The state of SAT. [23] P. MIT Press. F. [17] D. [24] H. Springer. forth edition. Automat. Priestley. New York. Volume 1. P. David and H. 2002. 2003. [13] G. Equivalence Checking of Digital Circuits: Fundamentals. London. 2009. third edition. Baltimore. . Knuth. Cambridge University Press. Johns Hopkins University Press. 1981. 2002. Kautz and B. J.. [21] F. American Mathematical Society. The complexity of the word problems for commutative semigroups and polynomial ideals. A. [19] S. S. Traverso. I. third edition. Perret. 1994. and L. Pap. 48(2):405–421. [31] H. 1978. S. BIT. K. Shamir. Wellesley Cambridge Press.. and N. Sakata. Sala. 2009. L. Zhiping. Circuits Syst. A tutorial on Gr¨obner bases with applications in signals and systems. Adleman. A method for obtaining digital signatures and public key cryptosystems. fourth edition. Tuomela. Rivest. Bose. A. [27] M. [30] J. [32] Z. Generatingfunctionology. L. 
Commununications of the ACM. Gr¨obner Bases. 55(1):445–461. In IEEE Symposium on Foundations of Computer Science. Linear Algebra and its Applications. IEEE Trans. 2009. Algorithms for quantum computation: Discrete logarithms and factoring. Strang. [28] P. Xu.BIBLIOGRAPHY 217 [26] R. and C. T. 2008. Kinematic analysis of multibody systems. Mora. Coding. Wilf. (21):120–126. and Cryptography. A. Shor. 2008. 2006. Regul. [29] G. . Peters. Springer. K. 218 BIBLIOGRAPHY . 186 for summation. 68. 22 Euler phi function. 31 fundamental theorem of arithmetic. 42 closed form for generating function. 188 integral. 118 reduced. 97 theorem. 137 congruence modulo n. 165 Fibonacci number. 13 harmonic number.Subject Index lemma. 69 of term. 17. 142 evaluation of polynomial. 78 divisor. 24 relation. 42 Euler-Maclaurin formula. 101 application to inverse kinematics. 95. 180 sequence. 98 fast Fourier transform. 69 of polynomial. 164 Euclid algorithm. 36. 27. 26 coprime numbers. 144 generating function. 183 greatest common divisor. 30 unity. 70 discrete Fourier transform. 91 219 . 23 theorem. 80 of terms. 106 polynomial. 66. 105 degree of monomial. 30 Bernoulli number. 180. 163 division of polynomials. 96. 183 ring. 20 theorem. 22 Gaussian elimination technique. 186 derivative. 188 product. 181 closed form. 33 ideal. 150 Horner’s rule. 142 binomial coefficient. 184 sum. 13 Gr¨obner basis. 11 greatest common. 204 extended algorithm. 201 field. 160 inverse. 144 analysis. 14. 71 additive inverse. 29 Chinese remainder theorem. 121 SAT. 142 Buchberger algorithm. 20 Carmichael function. 30 homomorphism. 21 probability generating function. 54 modulus. 69 degree of. 34 quadratic residue modulo n. 87 reduction in one step. 77 term. 11 isomorphism of rings. 54 increment. 70 monomial. 157 reduction. 31 of generating functions. 191 product of monomials. 84 ring. 69 graded lexicographic order. 72. 73 order. 75 lexicographic order. 97 coefficent of. 80 evaluation of. 54 modular congruence application to pseudo-random numbers. 36 linear congruential sequence. 70 zero polynomial. 22 harmonic. 31 multiplicative order. 69 degree of. 72 multiple. 87 remainder. 73 sum. 30 additive inverse. 51 order graded lexicographic. 70 univariate. 34 . 150 worst-case analysis. 10 monic polynomial. 184 of polynomials. 73 product. 92 Gr¨obner basis. 69 division. 95 proper. 69 prime number. 77 multivariate. 142 coprime to. 35 isomorphism. 20 pseudo-random. 92 insertion sort. 148 integer division. 11 multiplicative inverse. 73 product. 72 product of rings. 77 monic. 28 number Bernoulli.220 basis of. 92 set of generators. 54 seed. 147 average-case analysis. 69 Buchberger. 140 polynomial. 72 term of. 36 multiplicative unity. 72 symmetric. 11 ring. 29 quotient. 31 unity. 54 public key cryptography. 170 product. 70 point-value representation. 150 prime. 20 factorization. 92 finitely generated. 73 perturbation technique. 31 multiplivative inverse. 30 additive unity. 71 leading term. 75 SUBJECT INDEX lexicographic. 12 reduction of polynomials. 10. 161 221 . 42 Euclid. 134 change of variable. 159 RSA cryptosystem. 134 distributivity. 20 Vandermonde matrix. 70 theorem Buchberger. 138 members of geometric progression.SUBJECT INDEX unitary. 144 analysis of insertion sort algorithm. 22 Euler. 134 application to analysis of Gaussian elimination technique. 134 members of arithmetic progression. 38 term. 78 leading term. 70 division. 127. 70 coefficient of. 140 systems of linear congruences. 77 least common multiple. 70 zero term. 97 monic. 36. 134 closed form. 
42 summation. 70 monomial of. 70 degree of. 27. 98 Chinese remainder theorem. 42 fundamental theorem of arithmetic. 31 roots of unity. 147 associativity. 137 constant. 130 additivity of indices. 139 perturbation technique. 222 SUBJECT INDEX . a1 . . 161 FFT. 73 B(p1 . 27. . p2 ). 70 C[x1 . . 14 exteuclid(m. a1 . 11 D p −→ p′ . t2 ). 181 Mx1 . an−1 ). 127 k∈A gcd(m. . 127 k=a d2 X 0. . an−1 ). . 87 d p −→ p′ . n).. 30 −n . n). 12 e. . 27 Buch(G). 17 DFTn (a0 . 159 223 . . n). 97 ⌊x⌋. . 130 k=d1 X f (k). 13 lcm(t1 . 77 Zn . 160 DFTn−1 (a0 . 32 euclid(m. 75 >lx . 97 Bk . 27. 105 deg(p)... 165 ×n . .m). 24 >glx . 142 Gs (z). 69 [a]n . 30. 30 =n . 12 lt(p). 84 znk . . 26. 30 m | n. 69 mod(n. xn ]. 69 b X f (k)..Table of Symbols +n . .xn . 27.