Approximation Methods
Physics 130B, UCSD Fall 2009
Joel Broida
November 15, 2009

Contents

1 The Variation Method
  1.1 The Variation Theorem
  1.2 Excited States
  1.3 Linear Variation Functions
    1.3.1 Proof that the Roots of the Secular Equation are Real
2 Time-Independent Perturbation Theory
  2.1 Perturbation Theory for a Nondegenerate Energy Level
  2.2 Perturbation Theory for a Degenerate Energy Level
  2.3 Perturbation Treatment of the First Excited States of Helium
  2.4 Spin–Orbit Coupling and the Hydrogen Atom Fine Structure
    2.4.1 Supplement: Miscellaneous Proofs
  2.5 The Zeeman Effect
    2.5.1 Strong External Field
    2.5.2 Weak External Field
    2.5.3 Intermediate-Field Case
    2.5.4 Supplement: The Electromagnetic Hamiltonian
3 Time-Dependent Perturbation Theory
  3.1 Transitions Between Two Discrete States
  3.2 Transitions to a Continuum of States

1 The Variation Method

1.1 The Variation Theorem

The variation method is one approach to approximating the ground state energy of a system without actually solving the Schrödinger equation. It is based on the following theorem, sometimes called the variation theorem.

Theorem 1.1. Let a system be described by a time-independent Hamiltonian $H$, and let $\varphi$ be any normalized well-behaved function that satisfies the boundary conditions of the problem. If $E_0$ is the true ground state energy of the system, then
$$\langle\varphi|H\varphi\rangle \ge E_0. \tag{1.1}$$
Proof. Consider the integral $I = \langle\varphi|(H - E_0)\varphi\rangle$. Then
$$I = \langle\varphi|H\varphi\rangle - E_0\langle\varphi|\varphi\rangle = \langle\varphi|H\varphi\rangle - E_0,$$
so we must show that $I \ge 0$. Let $\{\psi_n\}$ be the true (stationary state) solutions to the Schrödinger equation, so that $H\psi_n = E_n\psi_n$. By assumption, the $\psi_n$ form a complete, orthonormal set, so we can write
$$\varphi = \sum_n a_n \psi_n$$
where $\langle\psi_n|\psi_m\rangle = \delta_{nm}$. Then
$$I = \sum_{n,m} a_n^* a_m \langle\psi_n|(H - E_0)\psi_m\rangle
  = \sum_{n,m} a_n^* a_m \left(\langle\psi_n|H\psi_m\rangle - E_0\delta_{nm}\right)
  = \sum_{n,m} a_n^* a_m (E_m - E_0)\delta_{nm}
  = \sum_n |a_n|^2 (E_n - E_0).$$
But $|a_n|^2 \ge 0$ and $E_n \ge E_0$ for all $n$ because $E_0$ is the ground state energy. Therefore $I \ge 0$ as claimed.

Suppose we have a trial function $\varphi$ that is not normalized. Multiplying by a normalization constant $N$, equation (1.1) becomes $|N|^2\langle\varphi|H\varphi\rangle \ge E_0$. But by definition $1 = \langle N\varphi|N\varphi\rangle = |N|^2\langle\varphi|\varphi\rangle$, so that $|N|^2 = 1/\langle\varphi|\varphi\rangle$, and hence our variation theorem becomes
$$\frac{\langle\varphi|H\varphi\rangle}{\langle\varphi|\varphi\rangle} \ge E_0. \tag{1.2}$$
The integral in (1.1) (or the ratio of integrals in (1.2)) is called the variational integral. So the idea is to try a number of different trial functions, and see how low we can get the variational integral to go. Fortunately, the variational integral approaches $E_0$ a lot faster than $\varphi$ approaches $\psi_0$, so it is possible to get a good approximation to $E_0$ even with a rather poor $\varphi$. A common approach is to introduce adjustable parameters into the trial function and minimize the energy with respect to them.

Before continuing with an example, there are two points I need to make. First, I state without proof that the bound stationary states of a one-dimensional system are characterized by having no nodes interior to the boundary points in the ground state (i.e., the wavefunction is never zero there), and the number of nodes increases by one for each successive excited state. While the proof of this statement is not particularly difficult (it's really a statement about Sturm–Liouville type differential equations), it would take us too far astray at the moment.
If you are interested, a proof may be found in Messiah, Quantum Mechanics, Chapter III, Sections 8–12.

A related issue is the following: in one dimension, the bound states are nondegenerate. To prove this, suppose we have two degenerate states $\psi_1$ and $\psi_2$, both with the same energy $E$. Multiply the Schrödinger equation for $\psi_1$ by $\psi_2$:
$$-\frac{\hbar^2}{2m}\,\psi_2\frac{d^2\psi_1}{dx^2} + V\psi_1\psi_2 = E\psi_1\psi_2$$
and multiply the Schrödinger equation for $\psi_2$ by $\psi_1$:
$$-\frac{\hbar^2}{2m}\,\psi_1\frac{d^2\psi_2}{dx^2} + V\psi_1\psi_2 = E\psi_1\psi_2.$$
Subtracting, we obtain
$$\psi_2\frac{d^2\psi_1}{dx^2} - \psi_1\frac{d^2\psi_2}{dx^2} = 0.$$
But then
$$\frac{d}{dx}\left(\psi_2\frac{d\psi_1}{dx} - \psi_1\frac{d\psi_2}{dx}\right) = \psi_2\frac{d^2\psi_1}{dx^2} - \psi_1\frac{d^2\psi_2}{dx^2} = 0$$
so that
$$\psi_2\frac{d\psi_1}{dx} - \psi_1\frac{d\psi_2}{dx} = \text{const}.$$
However, we know that $\psi \to 0$ as $x \to \pm\infty$, and hence the constant must equal zero. Rewriting this result as $d\psi_1/\psi_1 = d\psi_2/\psi_2$, we now have $d\ln\psi_1 = d\ln\psi_2$, or $\ln\psi_1 = \ln\psi_2 + \ln k$ where $\ln k$ is an integration constant. This is equivalent to $\psi_1 = k\psi_2$, so that $\psi_1$ and $\psi_2$ are linearly dependent — they describe the same physical state — and hence the level is nondegenerate as claimed.

The second topic I need to address is the notion of classification by symmetry. So, let us consider the time-independent Schrödinger equation $H\psi = E\psi$, and suppose that the potential energy function $V(x)$ is symmetric, i.e.,
$$V(-x) = V(x).$$
Under these conditions, the total Hamiltonian is also symmetric: $H(-x) = H(x)$. To understand the consequences of this, let us introduce an operator $\Pi$ called the parity operator, defined by
$$\Pi f(x) = f(-x)$$
where $f(x)$ is an arbitrary function. It is easy to see that $\Pi$ is Hermitian because
$$\langle f|\Pi g\rangle = \int_{-\infty}^\infty f(x)^*\,\Pi g(x)\,dx = \int_{-\infty}^\infty f(x)^* g(-x)\,dx = \int_{-\infty}^\infty f(-x)^* g(x)\,dx = \int_{-\infty}^\infty [\Pi f(x)]^* g(x)\,dx = \langle\Pi f|g\rangle$$
where in going from the second integral to the third we simply changed variables $x \to -x$. (I will use the symbol $dx$ to denote the volume element in whatever $n$-dimensional space is under consideration.)

Now what can we say about the eigenvalues of $\Pi$? Well, if $\Pi f = \lambda f$, then
$$\Pi^2 f = \Pi(\Pi f) = \lambda\Pi f = \lambda^2 f.$$
On the other hand, it is clear that
$$\Pi^2 f(x) = \Pi(\Pi f(x)) = \Pi f(-x) = f(x)$$
and hence we must have $\lambda^2 = 1$, so the eigenvalues of $\Pi$ are $\pm 1$. Let us denote the corresponding eigenfunctions by $f_\pm$:
$$\Pi f_+ = f_+ \qquad\text{and}\qquad \Pi f_- = -f_-.$$
In other words,
$$f_+(-x) = f_+(x) \qquad\text{and}\qquad f_-(-x) = -f_-(x).$$
Thus $f_+$ is any even function, and $f_-$ is any odd function. Note that what we have shown is the existence of a Hermitian operator with only two eigenvalues, each of which is infinitely degenerate. (I leave it as an easy exercise for you to show that $f_+$ and $f_-$ are orthogonal, as they should be.)

Next, note that any $f(x)$ can always be written in the form $f(x) = f_+(x) + f_-(x)$ where
$$f_+(x) = \frac{f(x) + f(-x)}{2} \qquad\text{and}\qquad f_-(x) = \frac{f(x) - f(-x)}{2}$$
are obviously symmetric and antisymmetric, respectively. Thus the eigenfunctions of the parity operator are complete, i.e., any function can be written as the sum of a symmetric function and an antisymmetric function.

It will be extremely convenient to now introduce the operators $\Pi_\pm$ defined by
$$\Pi_\pm = \frac{1 \pm \Pi}{2}.$$
In terms of these operators, we can write $\Pi_\pm f = f_\pm$. It is easy to see that the operators $\Pi_\pm$ satisfy the three properties
$$\Pi_\pm^2 = \Pi_\pm \qquad \Pi_+\Pi_- = \Pi_-\Pi_+ = 0 \qquad \Pi_+ + \Pi_- = 1.$$
The operators $\Pi_\pm$ are called projection operators.

Returning to our symmetric Hamiltonian, we observe that
$$\Pi(H(x)\psi(x)) = H(-x)\psi(-x) = H(x)\psi(-x) = H(x)\Pi\psi(x)$$
and thus the Hamiltonian commutes with the parity operator. But if $[H, \Pi] = 0$, then it is trivial to see that $[H, \Pi_\pm] = 0$ also, and therefore acting on $H\psi_E = E\psi_E$ with $\Pi_\pm$ we see that
$$H\psi_{E+} = E\psi_{E+} \qquad\text{and}\qquad H\psi_{E-} = E\psi_{E-}.$$
Thus the stationary states in a symmetric potential can always be classified according to their parity, i.e., they can always be chosen to have a definite symmetry. Moreover, since, as we saw above, the bound states in one dimension are nondegenerate, it follows that each bound state in a one-dimensional symmetric potential must be either even or odd.

Example 1.1.
Let us find a trial function for a particle in a one-dimensional box of length $l$. Since the true wavefunction vanishes at the ends $x = 0$ and $x = l$, our trial function must also have this property. A simple (un-normalized) function that obeys these boundary conditions is
$$\varphi = x(l - x) \quad\text{for } 0 \le x \le l$$
and $\varphi = 0$ outside the box. The integrals in equation (1.2) are
$$\langle\varphi|H\varphi\rangle = -\frac{\hbar^2}{2m}\int_0^l x(l-x)\,\frac{d^2}{dx^2}\,x(l-x)\,dx = \frac{\hbar^2}{m}\int_0^l x(l-x)\,dx = \frac{\hbar^2 l^3}{6m}$$
and
$$\langle\varphi|\varphi\rangle = \int_0^l x^2(l-x)^2\,dx = \frac{l^5}{30}.$$
Therefore
$$E_0 \le \frac{\langle\varphi|H\varphi\rangle}{\langle\varphi|\varphi\rangle} = 5\,\frac{\hbar^2}{ml^2}.$$
For comparison, the exact solution has energy levels
$$E_n = \frac{n^2\hbar^2\pi^2}{2ml^2} \qquad n = 1, 2, \ldots$$
so the ground state ($n = 1$) has energy
$$\frac{\pi^2\hbar^2}{2ml^2} = 4.9348\,\frac{\hbar^2}{ml^2}$$
for an error of 1.3%. Figure 1 is a plot of the exact normalized ground state solution to the particle in a box together with the normalized trial function. You can see how close the trial function is to the exact solution.

Figure 1: Plot of $\sqrt{2}\sin\pi x$ and $\sqrt{30}\,x(1-x)$ (in units where $l = 1$).

Example 1.2. Let us construct a variation function with a parameter for the one-dimensional harmonic oscillator, and find the optimal value for that parameter. What do we know in general? First, the wavefunction must vanish as $x \to \pm\infty$. The most obvious function that satisfies this is $e^{-x^2}$. However, $x$ has units of length, and we can only take the exponential of a dimensionless quantity (think of the power series expansion for $e^{-x^2}$). However, if we include a constant $\alpha$ with dimensions of length$^{-2}$, then $e^{-\alpha x^2}$ is satisfactory from a dimensional standpoint. In addition, since the potential $V = \frac{1}{2}kx^2$ is symmetric, we know that the eigenstates will have a definite parity. And since the ground state has no nodes, it must be an even function (since an odd function has a node at the origin). Thus the trial function
$$\varphi = e^{-\alpha x^2}$$
has all of our desired properties. Since $\varphi$ is unnormalized, we use equation (1.2).
The Hamiltonian is
$$H = -\frac{\hbar^2}{2m}\frac{d^2}{dx^2} + \frac{1}{2}m\omega^2 x^2$$
and hence
$$\langle\varphi|H\varphi\rangle = -\frac{\hbar^2}{2m}\int_{-\infty}^\infty e^{-\alpha x^2}\,\frac{d^2 e^{-\alpha x^2}}{dx^2}\,dx + \frac{1}{2}m\omega^2\int_{-\infty}^\infty x^2 e^{-2\alpha x^2}\,dx$$
$$= -\frac{\hbar^2}{2m}\int_{-\infty}^\infty \left[4\alpha^2 x^2 e^{-2\alpha x^2} - 2\alpha e^{-2\alpha x^2}\right]dx + \frac{1}{2}m\omega^2\int_{-\infty}^\infty x^2 e^{-2\alpha x^2}\,dx$$
$$= \left(\frac{-2\hbar^2\alpha^2}{m} + \frac{1}{2}m\omega^2\right)\int_{-\infty}^\infty x^2 e^{-2\alpha x^2}\,dx + \frac{\hbar^2\alpha}{m}\int_{-\infty}^\infty e^{-2\alpha x^2}\,dx.$$
The second integral is easy (and you should already know the answer):
$$\int_{-\infty}^\infty e^{-2\alpha x^2}\,dx = \sqrt{\frac{\pi}{2\alpha}}.$$
Using this, the first integral is also easy. Letting $\beta = 2\alpha$ we have
$$\int_{-\infty}^\infty x^2 e^{-2\alpha x^2}\,dx = \int_{-\infty}^\infty x^2 e^{-\beta x^2}\,dx = -\frac{\partial}{\partial\beta}\int_{-\infty}^\infty e^{-\beta x^2}\,dx = -\frac{\partial}{\partial\beta}\left(\frac{\pi}{\beta}\right)^{1/2} = \frac{1}{2}\frac{\pi^{1/2}}{\beta^{3/2}} = \frac{1}{2}\frac{\pi^{1/2}}{(2\alpha)^{3/2}}.$$
After a little algebra, we now arrive at
$$\langle\varphi|H\varphi\rangle = \frac{\hbar^2\pi^{1/2}\alpha^{1/2}}{2^{3/2}m} + \frac{m\omega^2\pi^{1/2}\alpha^{-3/2}}{2^{7/2}}.$$
And the denominator in equation (1.2) is just
$$\langle\varphi|\varphi\rangle = \int_{-\infty}^\infty e^{-2\alpha x^2}\,dx = \sqrt{\frac{\pi}{2\alpha}}.$$
Thus our variational integral becomes
$$W := \frac{\langle\varphi|H\varphi\rangle}{\langle\varphi|\varphi\rangle} = \frac{\hbar^2\alpha}{2m} + \frac{m\omega^2}{8\alpha}.$$
To minimize this with respect to $\alpha$ we set $dW/d\alpha = 0$ and solve for $\alpha$:
$$\frac{\hbar^2}{2m} - \frac{m\omega^2}{8\alpha^2} = 0$$
or
$$\alpha = \pm\frac{m\omega}{2\hbar}.$$
The negative root must be rejected because otherwise $\varphi = e^{-\alpha x^2}$ would be divergent. Substituting the positive root for $\alpha$ into our expression for $W$ yields
$$W = \frac{1}{2}\hbar\omega$$
which is the exact ground state harmonic oscillator energy. This isn't surprising, because up to normalization, our $\varphi$ with $\alpha = m\omega/2\hbar$ is just the exact ground state harmonic oscillator wave function.

1.2 Excited States

So far all we have discussed is how to approximate the ground-state energy of a system. Now we want to take a look at how to go about approximating the energy of an excited state. Let us assume that the stationary states of our system are numbered so that $E_0 \le E_1 \le E_2 \le \cdots$. If $\{\psi_n\}$ is a complete set of orthonormal eigenstates of $H$, then our normalized trial function can be written $\varphi = \sum_n a_n\psi_n$ where $a_n = \langle\psi_n|\varphi\rangle$. Then as we have seen
$$\langle\varphi|H\varphi\rangle = \sum_{n,m} a_n^* a_m E_m \langle\psi_n|\psi_m\rangle = \sum_{n,m} a_n^* a_m E_m \delta_{nm} = \sum_{n=0}^\infty |a_n|^2 E_n$$
and
$$\langle\varphi|\varphi\rangle = \sum_{n=0}^\infty |a_n|^2 = 1.$$
Suppose we restrict ourselves to trial functions that are orthogonal to the true ground-state wavefunction $\psi_0$.
Then $a_0 = \langle\psi_0|\varphi\rangle = 0$ and we are left with
$$\langle\varphi|H\varphi\rangle = \sum_{n=1}^\infty |a_n|^2 E_n \qquad\text{and}\qquad \langle\varphi|\varphi\rangle = \sum_{n=1}^\infty |a_n|^2 = 1.$$
For $n \ge 1$ we have $E_n \ge E_1$ so that $|a_n|^2 E_n \ge |a_n|^2 E_1$ and hence
$$\sum_{n=1}^\infty |a_n|^2 E_n \ge \sum_{n=1}^\infty |a_n|^2 E_1 = E_1\sum_{n=1}^\infty |a_n|^2 = E_1.$$
This gives us our desired result
$$\langle\varphi|H\varphi\rangle \ge E_1 \quad\text{if } \langle\psi_0|\varphi\rangle = 0 \text{ and } \langle\varphi|\varphi\rangle = 1. \tag{1.3}$$
While equation (1.3) gives an upper bound on the energy $E_1$ of the first excited state, it depends on the restriction $\langle\psi_0|\varphi\rangle = 0$, which can be problematic. However, for some systems this is not a difficult requirement to achieve even though we don't know the exact ground-state wavefunction. For example, a one-dimensional problem with a symmetric potential has a ground-state wavefunction that is always even, while the first excited state is always odd. This means that any (normalized) trial function $\varphi$ that is an odd function will automatically satisfy $\langle\psi_0|\varphi\rangle = 0$.

It is also possible to extend this approach to approximating the energy levels of higher excited states. In particular, if we somehow choose the trial function $\varphi$ so that $\langle\psi_0|\varphi\rangle = \langle\psi_1|\varphi\rangle = \cdots = \langle\psi_n|\varphi\rangle = 0$, then, following exactly the same argument as above, it is easy to see that if $\langle\varphi|\varphi\rangle = 1$ we have $\langle\varphi|H\varphi\rangle \ge E_{n+1}$.

For example, consider any particle moving under a central potential $V(r)$ (e.g., the hydrogen atom). Then the Schrödinger equation factors into a radial equation that depends on $V(r)$ times an angular equation (that is independent of $V$) with solutions that are just the spherical harmonics $Y_l^m(\theta,\phi)$. It may very well be that we can't solve the radial equation with this potential, but we know that spherical harmonics with different values of $l$ are orthogonal. Thus, we can get an upper bound to the energy of the lowest state with a particular angular momentum $l$ by choosing a trial function that contains the factor $Y_l^m$.
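The two bounds discussed so far — $\langle\varphi|H\varphi\rangle/\langle\varphi|\varphi\rangle \ge E_0$ for any trial function, and $\ge E_1$ for one orthogonal to the ground state — can be checked symbolically for the particle in a box of Example 1.1. The following sketch is my own illustration (not part of the original notes) and assumes sympy is available:

```python
import sympy as sp

x, l, hbar, m = sp.symbols('x l hbar m', positive=True)

def variational_bound(phi):
    # W = <phi|H phi>/<phi|phi> for H = -(hbar^2/2m) d^2/dx^2 inside the box,
    # with phi vanishing at x = 0 and x = l
    num = sp.integrate(phi * (-hbar**2/(2*m)) * sp.diff(phi, x, 2), (x, 0, l))
    den = sp.integrate(phi**2, (x, 0, l))
    return sp.simplify(num / den)

# Even trial function of Example 1.1: an upper bound on E0
W_ground = variational_bound(x*(l - x))            # 5 hbar^2/(m l^2)

# An odd trial function (vanishes at the centre) is orthogonal to the even
# ground state, so by eq. (1.3) it bounds the first excited energy E1
W_excited = variational_bound(x*(l - x)*(l/2 - x))  # 21 hbar^2/(m l^2)

print(W_ground, W_excited)
```

The even bound $5\,\hbar^2/ml^2$ lies above the exact $\pi^2/2 \approx 4.9348$, and the odd bound $21\,\hbar^2/ml^2$ lies above the exact first-excited value $2\pi^2 \approx 19.739$, as the theorem requires.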
1.3 Linear Variation Functions

The approach that we are now going to describe is probably the most common method of finding approximate molecular wave functions. A linear variation function $\varphi$ is a linear combination of $n$ linearly independent functions $f_i$:
$$\varphi = \sum_{i=1}^n c_i f_i.$$
The functions $f_i$ are called basis functions, and they must obey the boundary conditions of the problem. The coefficients $c_i$ are to be determined by minimizing the variational integral.

We shall restrict ourselves to a real $\varphi$, so the functions $f_i$ and coefficients $c_i$ are taken to be real. Later we will remove this requirement. Furthermore, note that the basis functions are not generally orthogonal, since they are not necessarily the eigenfunctions of any operator. Let us define the overlap integrals $S_{ij}$ by
$$S_{ij} := \langle f_i|f_j\rangle = \int f_i^* f_j\,dx$$
(where the asterisk on $f_i$ isn't necessary because we are assuming that our basis functions are real). Then (remember that the $c_i$ are real)
$$\langle\varphi|\varphi\rangle = \sum_{i,j=1}^n c_i c_j \langle f_i|f_j\rangle = \sum_{i,j=1}^n c_i c_j S_{ij}.$$
Next, we define the integrals
$$H_{ij} := \langle f_i|Hf_j\rangle = \int f_i^* H f_j\,dx$$
so that
$$\langle\varphi|H\varphi\rangle = \sum_{i,j=1}^n c_i c_j \langle f_i|Hf_j\rangle = \sum_{i,j=1}^n c_i c_j H_{ij}.$$
Then the variation theorem (1.2) becomes
$$W = \frac{\langle\varphi|H\varphi\rangle}{\langle\varphi|\varphi\rangle} = \frac{\sum_{i,j=1}^n c_i c_j H_{ij}}{\sum_{i,j=1}^n c_i c_j S_{ij}}$$
or
$$W\sum_{i,j=1}^n c_i c_j S_{ij} = \sum_{i,j=1}^n c_i c_j H_{ij}. \tag{1.4}$$
Now $W$ is a function of the $n$ $c_i$'s, and we know that $W \ge E_0$. In order to minimize $W$ with respect to all of the $c_k$'s, we must require that at the minimum we have
$$\frac{\partial W}{\partial c_k} = 0; \qquad k = 1, \ldots, n.$$
Taking the derivative of (1.4) with respect to $c_k$ and using
$$\frac{\partial c_i}{\partial c_k} = \delta_{ik}$$
we have
$$\frac{\partial W}{\partial c_k}\sum_{i,j=1}^n c_i c_j S_{ij} + W\sum_{i,j=1}^n (\delta_{ik}c_j + c_i\delta_{jk})S_{ij} = \sum_{i,j=1}^n (\delta_{ik}c_j + c_i\delta_{jk})H_{ij}$$
or (since $\partial W/\partial c_k = 0$)
$$W\sum_{j=1}^n c_j S_{kj} + W\sum_{i=1}^n c_i S_{ik} = \sum_{j=1}^n c_j H_{kj} + \sum_{i=1}^n c_i H_{ik}.$$
However, the basis functions $f_i$ are real so we have
$$S_{ik} = \int f_i f_k\,dx = S_{ki}$$
and since $H$ is Hermitian (and $H(x)$ is real) we also have
$$H_{ik} = \langle f_i|Hf_k\rangle = \langle Hf_i|f_k\rangle = \langle f_k|Hf_i\rangle^* = \langle f_k|Hf_i\rangle = H_{ki}.$$
Therefore, because the summation indices are dummy indices, we see that the two terms on each side of the last equation are identical, and we are left with
$$W\sum_{j=1}^n c_j S_{kj} = \sum_{j=1}^n c_j H_{kj}$$
or
$$\sum_{j=1}^n (H_{kj} - W S_{kj})c_j = 0; \qquad k = 1, \ldots, n. \tag{1.5}$$
This is just a system of $n$ homogeneous linear equations in $n$ unknowns (the $n$ coefficients $c_j$), and hence for a nontrivial solution to exist (we don't want all of the $c_j$'s to be zero) we must have the secular equation
$$\det(H_{kj} - W S_{kj}) = 0. \tag{1.6}$$
(You can think of this as a system of the form $\sum_j a_{kj}x_j = 0$ where the matrix $A = (a_{kj})$ must be singular, or else $A^{-1}$ would exist and then the equation $Ax = 0$ would imply that $x = 0$. The requirement that $A$ be singular is equivalent to the requirement that $\det A = 0$.) Written out, equation (1.6) looks like
$$\begin{vmatrix}
H_{11} - WS_{11} & H_{12} - WS_{12} & \cdots & H_{1n} - WS_{1n} \\
H_{21} - WS_{21} & H_{22} - WS_{22} & \cdots & H_{2n} - WS_{2n} \\
\vdots & \vdots & & \vdots \\
H_{n1} - WS_{n1} & H_{n2} - WS_{n2} & \cdots & H_{nn} - WS_{nn}
\end{vmatrix} = 0.$$
The determinant in (1.6) is a polynomial in $W$ of degree $n$, and it can be proved that all $n$ roots of this equation are real. (The proof is given at the end of this section for those who are interested.) Let us arrange the roots in order of increasing value as
$$W_0 \le W_1 \le \cdots \le W_{n-1}.$$
Similarly, we number the bound states of the system so that the corresponding true energies of these bound states are also arranged in increasing order:
$$E_0 \le E_1 \le \cdots \le E_{n-1} \le E_n \le \cdots.$$
From the variation theorem we know that $E_0 \le W_0$. Furthermore, it can also be proved (see the homework) that
$$E_i \le W_i \quad\text{for each } i = 0, \ldots, n-1.$$
In other words, the linear variation method provides upper bounds for the energies of the lowest $n$ bound states of the system. It can also be shown that increasing the number of basis functions used (and hence the number of states whose energies are approximated) improves the accuracy of the previously calculated energies.
Once we have found the $n$ roots $W_i$, we can substitute them one at a time back into equation (1.5) and solve for the coefficients $c_j^{(i)}$, where the superscript denotes the fact that this particular set of coefficients applies to the root $W_i$. (Again, this is just like finding the eigenvector corresponding to a given eigenvalue.) Note also that all we can really find is the ratios of the coefficients, say relative to $c_1$, and then fix $c_1$ by normalization.

There are some tricks that can simplify the solution of equation (1.6). For example, if we choose the basis functions to be orthonormal, then $S_{kj} = \delta_{kj}$. If the originally chosen set of basis functions isn't orthonormal, we can always use the Gram-Schmidt process to construct an orthonormal set. Also, we can make some of the off-diagonal $H_{kj}$'s vanish if we choose our basis functions to be eigenfunctions of some other Hermitian operator $A$ that commutes with $H$. This is because of the following theorem:

Theorem 1.2. Let $f_i$ and $f_j$ be eigenfunctions of a Hermitian operator $A$ corresponding to the eigenvalues $a_i \ne a_j$. If $H$ is an operator that commutes with $A$, then
$$H_{ji} = \langle f_j|Hf_i\rangle = 0.$$

Proof. Let us first assume that the eigenvalue $a_i$ is nondegenerate. Then $Af_i = a_i f_i$ and
$$A(Hf_i) = HAf_i = a_i(Hf_i).$$
Thus $Hf_i$ is in the eigenspace $V_{a_i}$ of $A$ corresponding to the eigenvalue $a_i$. But $a_i$ is nondegenerate, so the eigenspace is one-dimensional and spanned by $f_i$. Hence we must have $Hf_i = b_i f_i$ for some scalar $b_i$. Recalling that eigenfunctions belonging to distinct eigenvalues of a Hermitian operator are orthogonal, we have
$$\langle f_j|Hf_i\rangle = b_i\langle f_j|f_i\rangle = 0.$$
Now assume that the eigenvalue $a_i$ is degenerate. This means that the eigenspace $V_{a_i}$ has dimension greater than one, say $\dim V_{a_i} = n$. Then $V_{a_i}$ has a basis $g_1, \ldots, g_n$ consisting of eigenvectors of $A$ corresponding to the eigenvalue $a_i$, i.e., $Ag_k = a_i g_k$ for each $k = 1, \ldots, n$. Since $Hf_i$ is in $V_{a_i}$, we can write $Hf_i = \sum_{k=1}^n c_k g_k$ for some expansion coefficients $c_k$.
But then we again have
$$\langle f_j|Hf_i\rangle = \sum_{k=1}^n c_k\langle f_j|g_k\rangle = 0$$
because the eigenfunctions $f_j$ and $g_k$ belong to the distinct eigenvalues $a_j$ and $a_i$ respectively.

Another (possibly easier) way to prove Theorem 1.2 is this. Let $Af_i = a_i f_i$ and $Af_j = a_j f_j$ where $a_i \ne a_j$. (In other words, $f_i$ and $f_j$ belong to different eigenspaces of $A$.) Then on the one hand we have
$$\langle f_j|HAf_i\rangle = a_i\langle f_j|Hf_i\rangle$$
while on the other hand, we can use the fact that $H$ and $A$ commute, along with the fact that $A$ is Hermitian and hence has real eigenvalues, to write
$$\langle f_j|HAf_i\rangle = \langle f_j|AHf_i\rangle = \langle Af_j|Hf_i\rangle = a_j\langle f_j|Hf_i\rangle.$$
Equating these results shows that $(a_i - a_j)\langle f_j|Hf_i\rangle = 0$. Therefore, if $a_i \ne a_j$, we must have $\langle f_j|Hf_i\rangle = 0$.

Finally, it is left as a homework problem to show that equations (1.5) and (1.6) also hold if the variation function is in fact allowed to be complex.

Example 1.3. In Example 1.1 we constructed the trial function $\varphi = x(l-x)$ for the ground state of the one-dimensional particle in a box. Let us now construct a linear variation function $\varphi = \sum_i c_i f_i$ to approximate the energies of the first four states. This means that we need at least four independent functions $f_i$ that obey the boundary conditions of vanishing at the ends of the box. While there are an infinite number of possibilities, we want to limit ourselves to integrals that are easy to evaluate. We begin by taking
$$f_1 = x(l - x),$$
and another simple function that obeys the proper boundary conditions is
$$f_2 = x^2(l - x)^2.$$
If the origin were chosen to be at the center of the box, we know that the exact solutions would have a definite parity, alternating between even and odd functions, starting with the even ground state. To see that both $f_1$ and $f_2$ are even functions, we shift the origin to the center of the box by changing variables to $x' = x - l/2$. Then $x = x' + l/2$ and we find
$$f_1 = (x' + l/2)(l/2 - x') \qquad\text{and}\qquad f_2 = (x' + l/2)^2(l/2 - x')^2$$
which shows that $f_1$ and $f_2$ are both clearly even functions of $x'$.
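The parity claims here can be checked symbolically without changing variables: reflecting through the center of the box via $x \to l - x$ should leave the even functions unchanged and flip the sign of the odd combinations $f_3 = x(l-x)(l/2-x)$ and $f_4 = x^2(l-x)^2(l/2-x)$ used shortly. A quick sketch (my own check, assuming sympy):

```python
import sympy as sp

x, l = sp.symbols('x l', positive=True)

f1 = x*(l - x)
f2 = x**2*(l - x)**2
f3 = x*(l - x)*(l/2 - x)
f4 = x**2*(l - x)**2*(l/2 - x)

def reflect(f):
    # reflection through the centre of the box: x -> l - x
    return sp.expand(f.subs(x, l - x))

# f1 and f2 are even about the centre; f3 and f4 are odd
assert reflect(f1) == sp.expand(f1)
assert reflect(f2) == sp.expand(f2)
assert reflect(f3) == sp.expand(-f3)
assert reflect(f4) == sp.expand(-f4)
print("parities verified")
```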
Since both $f_1$ and $f_2$ are even functions, if we took $\varphi = c_1f_1 + c_2f_2$ we would end up with an upper bound for the two lowest energy even states (the $n = 1$ and $n = 3$ states). In order to also approximate the odd $n = 2$ and $n = 4$ states, we must add in two odd functions. Thus we need two functions that vanish at $x = 0$, $x = l$ and $x = l/2$. Two functions that satisfy these requirements are
$$f_3 = x(l - x)(l/2 - x) \qquad\text{and}\qquad f_4 = x^2(l - x)^2(l/2 - x).$$
By again changing variables as we did for $f_1$ and $f_2$, you can easily show that $f_3$ and $f_4$ are indeed odd functions. Note also that the four functions we have chosen are linearly independent, as they must be.

One of the advantages in choosing our functions to have a definite parity is that many of the integrals that occur in equation (1.6) will vanish. In particular, since any integral of an odd function over an even interval is identically zero, and since the product of an even function with an odd function is odd, it should be clear that
$$S_{13} = S_{31} = 0 \qquad S_{14} = S_{41} = 0 \qquad S_{23} = S_{32} = 0 \qquad S_{24} = S_{42} = 0.$$
Furthermore, since the functions have a definite parity, they are eigenfunctions of the parity operator $\Pi$ with $\Pi f_{1,2} = +f_{1,2}$ and $\Pi f_{3,4} = -f_{3,4}$. And since the potential is symmetric, we have $[\Pi, H] = 0$, so that by Theorem 1.2 we know that $H_{ij} = 0$ if one index refers to an even function and the other refers to an odd function:
$$H_{13} = H_{31} = 0 \qquad H_{14} = H_{41} = 0 \qquad H_{23} = H_{32} = 0 \qquad H_{24} = H_{42} = 0.$$
With these simplifications, (1.6) becomes
$$\begin{vmatrix}
H_{11} - WS_{11} & H_{12} - WS_{12} & 0 & 0 \\
H_{21} - WS_{21} & H_{22} - WS_{22} & 0 & 0 \\
0 & 0 & H_{33} - WS_{33} & H_{34} - WS_{34} \\
0 & 0 & H_{43} - WS_{43} & H_{44} - WS_{44}
\end{vmatrix} = 0.$$
Since the determinant of a block diagonal matrix is the product of the determinants of the blocks, we can find all four roots by finding the two roots of each of the following equations:
$$\begin{vmatrix} H_{11} - WS_{11} & H_{12} - WS_{12} \\ H_{21} - WS_{21} & H_{22} - WS_{22} \end{vmatrix} = 0 \tag{1.7a}$$
$$\begin{vmatrix} H_{33} - WS_{33} & H_{34} - WS_{34} \\ H_{43} - WS_{43} & H_{44} - WS_{44} \end{vmatrix} = 0. \tag{1.7b}$$
Let the roots of (1.7a) be denoted $W_1$, $W_3$.
These are the approximations to the energies of the $n = 1$ and $n = 3$ even states. Similarly, the roots $W_2$, $W_4$ of (1.7b) are the approximations to the odd energy states $n = 2$ and $n = 4$. Once we have the roots $W_i$, we substitute them one at a time back into equation (1.5) to determine the set of coefficients $c_j^{(i)}$ corresponding to that particular root. In the particular case of $W_1$, this yields the set of equations
$$(H_{11} - W_1S_{11})c_1^{(1)} + (H_{12} - W_1S_{12})c_2^{(1)} = 0$$
$$(H_{21} - W_1S_{21})c_1^{(1)} + (H_{22} - W_1S_{22})c_2^{(1)} = 0 \tag{1.8a}$$
$$(H_{33} - W_1S_{33})c_3^{(1)} + (H_{34} - W_1S_{34})c_4^{(1)} = 0$$
$$(H_{43} - W_1S_{43})c_3^{(1)} + (H_{44} - W_1S_{44})c_4^{(1)} = 0. \tag{1.8b}$$
Now, $W_1$ was a root of (1.7a), so the determinant of the coefficients in (1.8a) must vanish, and we have a nontrivial solution for $c_1^{(1)}$ and $c_2^{(1)}$. However, $W_1$ was not a root of (1.7b), so the determinant of the coefficients in (1.8b) does not vanish, and hence there is only the trivial solution $c_3^{(1)} = c_4^{(1)} = 0$. Thus the trial function for $W_1$ is $\varphi_1 = c_1^{(1)}f_1 + c_2^{(1)}f_2$. Exactly the same reasoning applies to the other three roots, and we have the trial functions
$$\varphi_1 = c_1^{(1)}f_1 + c_2^{(1)}f_2 \qquad \varphi_3 = c_1^{(3)}f_1 + c_2^{(3)}f_2$$
$$\varphi_2 = c_3^{(2)}f_3 + c_4^{(2)}f_4 \qquad \varphi_4 = c_3^{(4)}f_3 + c_4^{(4)}f_4.$$
So we see that the even states $\psi_1$ and $\psi_3$ are approximated by the trial functions $\varphi_1$ and $\varphi_3$ consisting of linear combinations of the even functions $f_1$ and $f_2$. Similarly, the odd states $\psi_2$ and $\psi_4$ are approximated by the trial functions $\varphi_2$ and $\varphi_4$ that are linear combinations of the odd functions $f_3$ and $f_4$.

To proceed any further, we need to evaluate the non-zero integrals $H_{ij}$ and $S_{ij}$. From Example 1.1 we can immediately write down $H_{11}$ and $S_{11}$. The rest of the integrals are also straightforward to evaluate, and the result is
$$H_{11} = \hbar^2l^3/6m \qquad H_{12} = H_{21} = \hbar^2l^5/30m \qquad H_{22} = \hbar^2l^7/105m$$
$$H_{33} = \hbar^2l^5/40m \qquad H_{34} = H_{43} = \hbar^2l^7/280m \qquad H_{44} = \hbar^2l^9/1260m$$
$$S_{11} = l^5/30 \qquad S_{12} = S_{21} = l^7/140 \qquad S_{22} = l^9/630$$
$$S_{33} = l^7/840 \qquad S_{34} = S_{43} = l^9/5040 \qquad S_{44} = l^{11}/27720.$$
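With these matrix elements in hand, the secular equations (1.7a) and (1.7b) are just generalized symmetric eigenvalue problems $Hc = WSc$, which can be solved numerically. The sketch below is my own check (not part of the original notes), assuming numpy and scipy are available, in units where $\hbar = m = l = 1$:

```python
import numpy as np
from scipy.linalg import eigh

# Matrix elements from the table above, in units hbar = m = l = 1
H_even = np.array([[1/6,  1/30 ],
                   [1/30, 1/105]])
S_even = np.array([[1/30,  1/140],
                   [1/140, 1/630]])
H_odd = np.array([[1/40,  1/280 ],
                  [1/280, 1/1260]])
S_odd = np.array([[1/840,  1/5040 ],
                  [1/5040, 1/27720]])

# det(H - W S) = 0 is the generalized eigenvalue problem H c = W S c
W_even = eigh(H_even, S_even, eigvals_only=True)  # roots of (1.7a)
W_odd = eigh(H_odd, S_odd, eigvals_only=True)     # roots of (1.7b)
print(W_even, W_odd)
```

The even roots come out near $4.93487$ and $51.0651$ (in units of $\hbar^2/ml^2$) and the odd roots near $19.7508$ and $100.249$, each lying above the corresponding exact level $n^2\pi^2/2$.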
Substituting these results into equation (1.7a) to determine $W_1$ and $W_3$ we have
$$\begin{vmatrix}
\dfrac{\hbar^2l^3}{6m} - \dfrac{l^5}{30}W & \dfrac{\hbar^2l^5}{30m} - \dfrac{l^7}{140}W \\[2ex]
\dfrac{\hbar^2l^5}{30m} - \dfrac{l^7}{140}W & \dfrac{\hbar^2l^7}{105m} - \dfrac{l^9}{630}W
\end{vmatrix} = 0.$$
To evaluate this, it is easiest to recall that multiplying any single row of a determinant by some scalar is the same as multiplying the original determinant by that same scalar. (This is an obvious consequence of the definition
$$\det A = \sum_{i_1,\ldots,i_n=1}^n \varepsilon_{i_1\cdots i_n}\,a_{1i_1}\cdots a_{ni_n}.)$$
Since the right-hand side of this equation is zero, we don't change anything by multiplying any row in this determinant by some constant. Multiplying the first row by $420m/l^3$ and the second row by $1260m/l^5$ we obtain
$$\begin{vmatrix}
70\hbar^2 - 14ml^2W & 14\hbar^2l^2 - 3ml^4W \\
42\hbar^2 - 9ml^2W & 12\hbar^2l^2 - 2ml^4W
\end{vmatrix} = 0 \tag{1.9}$$
or
$$m^2l^4W^2 - 56m\hbar^2l^2W + 252\hbar^4 = 0.$$
The roots of this quadratic are
$$W_{1,3} = \frac{\hbar^2}{ml^2}\left(28 \pm \sqrt{532}\right) = 4.93487\,\frac{\hbar^2}{ml^2},\quad 51.0651\,\frac{\hbar^2}{ml^2}.$$
Similarly, substituting the values for $H_{ij}$ and $S_{ij}$ into (1.7b) results in
$$W_{2,4} = \frac{\hbar^2}{ml^2}\left(60 \pm \sqrt{1620}\right) = 19.7508\,\frac{\hbar^2}{ml^2},\quad 100.249\,\frac{\hbar^2}{ml^2}.$$
For comparison, the first four exact solutions $E_n = n^2\hbar^2\pi^2/2ml^2$ are
$$E_n = 4.9348\,\frac{\hbar^2}{ml^2},\quad 19.7392\,\frac{\hbar^2}{ml^2},\quad 44.4132\,\frac{\hbar^2}{ml^2},\quad 78.9568\,\frac{\hbar^2}{ml^2}$$
so the errors are (in order of increasing energy levels) 0.0014%, 0.059%, 15.0% and 27.0%. As expected, we did great for $n = 1$ and $n = 2$, but not so great for $n = 3$ and $n = 4$.

We still have to find the approximate wave functions that correspond to each of the $W_i$'s. We want to substitute $W_1 = 4.93487\,\hbar^2/ml^2$ into equations (1.8a) and use the integrals we have already evaluated. However, it is somewhat easier to note that the coefficients of $c_{1,2}^{(1)}$ in equations (1.8a) are equivalent to the entries in equation (1.9). Furthermore, as we have already noted, all we can find is the ratio of the $c_i$'s, so the two equations in (1.9) are equivalent, and we only need to use either one of them. (That the equations are equivalent is a consequence of the fact that the determinant (1.9) is zero, so the rows must be linearly dependent.
Hence we get no new information by using both rows.)

So choosing the first row we have
$$70\hbar^2 - 14ml^2W_1 = 70\hbar^2 - 14ml^2(4.93487\,\hbar^2/ml^2) = 0.91182\,\hbar^2$$
$$14\hbar^2l^2 - 3ml^4W_1 = 14\hbar^2l^2 - 3ml^4(4.93487\,\hbar^2/ml^2) = -0.80461\,\hbar^2l^2$$
so that
$$c_2^{(1)} = \frac{0.91182\,\hbar^2}{0.80461\,\hbar^2l^2}\,c_1^{(1)} = 1.133\,c_1^{(1)}/l^2.$$
To fix the value of $c_1^{(1)}$ we use the normalization condition:
$$1 = \langle\varphi_1|\varphi_1\rangle = \langle c_1^{(1)}f_1 + c_2^{(1)}f_2\,|\,c_1^{(1)}f_1 + c_2^{(1)}f_2\rangle = [c_1^{(1)}]^2S_{11} + 2c_1^{(1)}c_2^{(1)}S_{12} + [c_2^{(1)}]^2S_{22}$$
$$= [c_1^{(1)}]^2\left(S_{11} + 2\cdot\frac{1.133}{l^2}S_{12} + \frac{(1.133)^2}{l^4}S_{22}\right) = [c_1^{(1)}]^2\left(\frac{l^5}{30} + 2\cdot\frac{1.133}{l^2}\frac{l^7}{140} + \frac{(1.133)^2}{l^4}\frac{l^9}{630}\right) = 0.05156\,[c_1^{(1)}]^2\,l^5$$
and hence $c_1^{(1)} = 4.404\,l^{-5/2}$. Putting this all together we finally obtain
$$\varphi_1 = 4.404\,l^{-5/2}f_1 + 4.990\,l^{-9/2}f_2 = 4.404\,l^{-5/2}x(l-x) + 4.990\,l^{-9/2}x^2(l-x)^2$$
$$= l^{-1/2}\left[4.404(x/l)(1 - x/l) + 4.990(x/l)^2(1 - x/l)^2\right].$$
As you can see from Figure 2, the function $\varphi_1$ is almost identical to the exact solution $\psi_1 = \sqrt{2/l}\,\sin(\pi x/l)$.

Figure 2: Plot of $\psi_1$ and $\varphi_1$ vs $x/l$.

Repeating all of this with the other roots $W_2$, $W_3$ and $W_4$, we eventually arrive at
$$\varphi_2 = l^{-1/2}\left[16.78(x/l)(1-x/l)(1/2-x/l) + 71.85(x/l)^2(1-x/l)^2(1/2-x/l)\right]$$
$$\varphi_3 = l^{-1/2}\left[28.65(x/l)(1-x/l) - 132.7(x/l)^2(1-x/l)^2\right]$$
$$\varphi_4 = l^{-1/2}\left[98.99(x/l)(1-x/l)(1/2-x/l) - 572.3(x/l)^2(1-x/l)^2(1/2-x/l)\right].$$

1.3.1 Proof that the Roots of the Secular Equation are Real

In this section we will prove that the roots of the polynomial in $W$ defined by equation (1.6) are in fact real. In order to show this, we must first review some basic linear algebra.

Let $V$ be a vector space over $\mathbb{C}$. By an inner product on $V$ (sometimes called the Hermitian inner product), we mean a mapping $\langle\cdot\,,\cdot\rangle : V \times V \to \mathbb{C}$ such that for all $u, v, w \in V$ and $a, b \in \mathbb{C}$ we have

(IP1) $\langle au + bv, w\rangle = a^*\langle u, w\rangle + b^*\langle v, w\rangle$;
(IP2) $\langle u, v\rangle = \langle v, u\rangle^*$;
(IP3) $\langle u, u\rangle \ge 0$ and $\langle u, u\rangle = 0$ if and only if $u = 0$.
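These axioms are easy to exercise concretely. The sketch below (my own illustration, not part of the original notes; it assumes numpy) builds an inner product $\langle u, v\rangle = u^{*T}Gv$ from a positive definite Hermitian matrix $G$ — anticipating the Gram matrix construction later in this section — and checks (IP1)–(IP3) on random complex vectors:

```python
import numpy as np

rng = np.random.default_rng(0)

# A positive definite Hermitian matrix G = M^dagger M from a (generically
# nonsingular) random complex M -- the Gram matrix construction
M = rng.normal(size=(3, 3)) + 1j*rng.normal(size=(3, 3))
G = M.conj().T @ M

def inner(u, v):
    # <u, v> = u*^T G v -- conjugate-linear in the first slot, per (IP1)
    return u.conj() @ G @ v

u = rng.normal(size=3) + 1j*rng.normal(size=3)
v = rng.normal(size=3) + 1j*rng.normal(size=3)
a, b = 2 - 1j, 0.5 + 3j

# (IP1) antilinearity in the first argument
assert np.isclose(inner(a*u + b*v, v),
                  a.conjugate()*inner(u, v) + b.conjugate()*inner(v, v))
# (IP2) conjugate symmetry
assert np.isclose(inner(u, v), inner(v, u).conjugate())
# (IP3) positivity: <u, u> is real and strictly positive for u != 0
assert inner(u, u).real > 0 and np.isclose(inner(u, u).imag, 0)
print("inner product axioms verified")
```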
If $\{e_i\}$ is a basis for $V$, then in terms of components we have
$$\langle u, v\rangle = \sum_{i,j} u_i^* v_j\,\langle e_i, e_j\rangle := \sum_{i,j} u_i^* v_j\,g_{ij}$$
where we have defined the (square) matrix $G = (g_{ij})$ with $g_{ij} = \langle e_i, e_j\rangle$. As a matrix product, we may write
$$\langle u, v\rangle = u^{*T}Gv.$$
I emphasize that this is the most general inner product on $V$, and any inner product can be written in this form. (For example, if $V$ is a real space and $g_{ij} = \langle e_i, e_j\rangle = \delta_{ij}$, then we obtain the usual Euclidean inner product on $V$.) Notice that
$$g_{ij} = \langle e_i, e_j\rangle = \langle e_j, e_i\rangle^* = g_{ji}^*$$
and hence $G = G^\dagger$, so that $G$ is in fact a Hermitian matrix. (Some of you may realize that in the case where $V$ is a real vector space, the matrix $G$ is just the usual metric on $V$.)

Now, given an inner product, we may define a norm on $V$ by $\|u\| = \langle u, u\rangle^{1/2}$. Note that because of condition (IP3), we have $\|u\| \ge 0$ and $\|u\| = 0$ if and only if $u = 0$. This imposes a condition on $G$ because
$$\|u\|^2 = \langle u, u\rangle = u^{*T}Gu = \sum_{i,j} u_i^* u_j\,g_{ij} \ge 0$$
and equality holds if and only if $u = 0$. A Hermitian matrix $G$ with the property that $u^{*T}Gu > 0$ for all $u \ne 0$ is said to be positive definite. It is important to realize that, conversely, given a positive definite Hermitian matrix $G$, we can define an inner product by $\langle u, v\rangle = u^{*T}Gv$. That this is true follows easily by reversing the above steps.

Another fundamental concept is that of the kernel of a linear transformation (or matrix). If $T$ is a linear transformation, we define the kernel of $T$ to be the set
$$\operatorname{Ker}T = \{u \in V : Tu = 0\}.$$
A linear transformation whose kernel is zero is said to be nonsingular. The reason the kernel is so useful is that it allows us to determine whether or not a linear transformation is an isomorphism (i.e., one-to-one). A linear transformation $T$ on $V$ is said to be one-to-one if $u \ne v$ implies $Tu \ne Tv$. An equivalent way to say this is that $Tu = Tv$ implies $u = v$ (this is the contrapositive statement).
Thus, if $Tu = Tv$, then using the linearity of $T$ we see that $0 = Tu - Tv = T(u - v)$ and hence $u - v \in \operatorname{Ker}T$. But if $\operatorname{Ker}T = \{0\}$, then we in fact have $u = v$, so that $T$ is an isomorphism. Conversely, if $T$ is an isomorphism, then we must have $\operatorname{Ker}T = \{0\}$. This is because $T$ is one-to-one, and any linear transformation has the property that $T0 = 0$. (Because $Tu = T(u + 0) = Tu + T0$, so that $T0 = 0$.)

Now suppose that $T$ is a nonsingular surjective (i.e., onto) linear transformation on $V$. Such a $T$ is said to be a bijection. You should already know that the matrix representation $A = (a_{ij})$ of $T$ with respect to the basis $\{e_i\}$ for $V$ is defined by
$$Te_i = \sum_j e_j\,a_{ji}.$$
This is frequently written as $A = [T]_e$. Then the fact that $T$ is a bijection simply means that the matrix $A$ is invertible (i.e., that $A^{-1}$ exists).

(Actually, if $T : U \to V$ is a nonsingular (one-to-one) linear transformation between two finite-dimensional vector spaces of equal dimensions, then it is automatically surjective. This is a consequence of the well-known rank theorem, which says
$$\operatorname{rank}T + \dim\operatorname{Ker}T = \dim U$$
where $\operatorname{rank}T$ is another term for the dimension of the image of $T$. Therefore, if $\operatorname{Ker}T = \{0\}$ we have $\dim\operatorname{Ker}T = 0$, so that $\operatorname{rank}T = \dim U = \dim V$. The proof of the rank theorem is also not hard: let $\dim U = n$, and let $\{w_1, \ldots, w_k\}$ be a basis for $\operatorname{Ker}T$. Extend this to a basis $\{w_1, \ldots, w_n\}$ for $U$. Then $\operatorname{Im}T$ is spanned by $\{Tw_{k+1}, \ldots, Tw_n\}$, and it is easy to see that these are linearly independent. Thus $\dim U = n = k + (n - k) = \dim\operatorname{Ker}T + \dim\operatorname{Im}T$.)

Note that if $G$ is positive definite, then we must have $\operatorname{Ker}G = \{0\}$. This is because if $u \ne 0$ and $Gu = 0$, we would have $\langle u, u\rangle = u^{*T}Gu = 0$, in contradiction to the assumed positive definiteness of $G$. Thus a positive definite matrix is necessarily nonsingular.

Let us take a more careful look at $S_{ij} = \langle f_i|f_j\rangle$. I claim that the matrix $S = (S_{ij})$ is positive definite. To show this, I will prove a general result.
Suppose I have $n$ linearly independent (complex) vectors $v_1, \ldots, v_n$, and I construct the nonsingular matrix $M$ whose columns are just the vectors $v_i$. Letting $v_{ij}$ denote the $j$th component of the vector $v_i$, we have
$$M = \begin{pmatrix} v_{11} & v_{21} & \cdots & v_{n1} \\ v_{12} & v_{22} & \cdots & v_{n2} \\ \vdots & \vdots & & \vdots \\ v_{1n} & v_{2n} & \cdots & v_{nn} \end{pmatrix}.$$
From this we see that
$$M^\dagger = \begin{pmatrix} v_{11}^* & v_{12}^* & \cdots & v_{1n}^* \\ v_{21}^* & v_{22}^* & \cdots & v_{2n}^* \\ \vdots & \vdots & & \vdots \\ v_{n1}^* & v_{n2}^* & \cdots & v_{nn}^* \end{pmatrix}$$
and therefore
$$M^\dagger M = \begin{pmatrix} \langle v_1|v_1\rangle & \langle v_1|v_2\rangle & \cdots & \langle v_1|v_n\rangle \\ \langle v_2|v_1\rangle & \langle v_2|v_2\rangle & \cdots & \langle v_2|v_n\rangle \\ \vdots & \vdots & & \vdots \\ \langle v_n|v_1\rangle & \langle v_n|v_2\rangle & \cdots & \langle v_n|v_n\rangle \end{pmatrix}. \tag{1.10}$$
A matrix of this form is called a Gram matrix. If I denote the Hermitian matrix $M^\dagger M$ by $S$, then for any vector $c \ne 0$ we have
$$\langle c|Sc\rangle = \langle c|M^\dagger M c\rangle = \langle Mc|Mc\rangle = \|Mc\|^2 > 0$$
so that $S$ is positive definite. That this is strictly greater than zero (and not merely greater than or equal to zero) follows from the fact that $M$ is nonsingular, so its kernel is $\{0\}$, together with the assumption that $c \ne 0$. In other words, any matrix of the form (1.10) is positive definite.

But this is exactly what we had when we defined $S_{ij} = \langle f_i|f_j\rangle = \langle i|j\rangle$, where the linearly independent functions $f_i$ define a basis for a vector space. In other words, what we really have is $f_i = v_i$, so that the matrix $M^\dagger M$ defined above is exactly the matrix $S$ defined by $S_{ij} = \langle i|j\rangle$.

With all of this formalism out of the way, it is now easy to show that the roots of the secular equation are real. Let us write equation (1.5) in matrix form as
$$Hc = WSc$$
so that $\langle c|Hc\rangle = W\langle c|Sc\rangle$. On the other hand, using the fact that $H$ is Hermitian and $S$ is real and symmetric, we can write
$$\langle c|Hc\rangle = \langle Hc|c\rangle = \langle WSc|c\rangle = W^*\langle Sc|c\rangle = W^*\langle c|Sc\rangle.$$
Thus we have
$$(W - W^*)\langle c|Sc\rangle = 0$$
which implies $W = W^*$ because $c \ne 0$, so that $\langle c|Sc\rangle > 0$. Note that this proof is also valid in the case where $\varphi$ is complex, because (1.5) still holds, and $S = M^\dagger M$ is Hermitian so that $\langle Sc|c\rangle = \langle c|Sc\rangle$.
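The argument that the roots $W$ of $Hc = WSc$ are real can be sketched numerically. The following minimal example (with hypothetical random matrices, not any particular variation problem) builds a positive definite Gram matrix $S$, reduces the generalized problem to a standard symmetric one with the Cholesky factor $S = LL^T$, and checks that each resulting root satisfies the secular equation:

```python
import numpy as np

# Generalized eigenvalue problem Hc = W Sc with S a Gram matrix: reduce
# to a standard symmetric problem via the Cholesky factor S = L L^T, so
# the roots W come out real. Hypothetical 4-dimensional test data.
rng = np.random.default_rng(2)
n = 4

M = rng.normal(size=(n, n))
S = M.T @ M                      # positive definite overlap (Gram) matrix
B = rng.normal(size=(n, n))
H = B + B.T                      # real symmetric "Hamiltonian" matrix

L = np.linalg.cholesky(S)
Linv = np.linalg.inv(L)
W = np.linalg.eigvalsh(Linv @ H @ Linv.T)   # real by construction

# each root should satisfy det(H - W S) = 0; check via the smallest
# singular value of H - w S, relative to the largest
rel_res = []
for w in W:
    sv = np.linalg.svd(H - w * S, compute_uv=False)
    rel_res.append(sv[-1] / sv[0])
```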
2 Time-Independent Perturbation Theory

2.1 Perturbation Theory for a Nondegenerate Energy Level

Suppose that we want to solve the time-independent Schrödinger equation $H\psi_n = E_n\psi_n$, but the Hamiltonian is too complicated for us to find an exact solution. However, let us suppose that the Hamiltonian can be written in the form
$$H = H^0 + \lambda H'$$
where we know the exact solutions to $H^0\psi_n^{(0)} = E_n^{(0)}\psi_n^{(0)}$. (We will use a superscript 0 to denote the energies and eigenstates of the unperturbed Hamiltonian $H^0$.) The additional term $H'$ is called a perturbation, and it must in some sense be considered small relative to $H^0$. The dimensionless parameter $\lambda$ is redundant, but is introduced for mathematical convenience; it will not remain a part of our final solution. For example, the unperturbed Hamiltonian $H^0$ could be the (free) hydrogen atom, and the perturbation $H'$ could represent the interaction energy $e\mathbf{E}\cdot\mathbf{r}$ of the electron with an electric field $\mathbf{E}$. (This leads to an energy level shift called the Stark effect.)

The full (i.e., interacting or perturbed) Schrödinger equation is written
$$H\psi_n = (H^0 + \lambda H')\psi_n = E_n\psi_n \tag{2.1}$$
and the unperturbed equation is
$$H^0\psi_n^{(0)} = E_n^{(0)}\psi_n^{(0)}. \tag{2.2}$$
We think of the parameter $\lambda$ as varying from 0 to 1, taking the system smoothly from the unperturbed system described by $H^0$ to the fully interacting system described by $H$. And as long as we are discussing nondegenerate states, we can think of each unperturbed state $\psi_n^{(0)}$ as undergoing a smooth transition to the exact state $\psi_n$. In other words,
$$\lim_{\lambda\to 0}\psi_n = \psi_n^{(0)} \qquad\text{and}\qquad \lim_{\lambda\to 0}E_n = E_n^{(0)}.$$
Since the states $\psi_n = \psi_n(\lambda, x)$ and energies $E_n = E_n(\lambda)$ depend on $\lambda$, let us expand both in a Taylor series about $\lambda = 0$:
$$\psi_n = \psi_n^{(0)} + \lambda\left(\frac{\partial\psi_n}{\partial\lambda}\right)_{\lambda=0} + \frac{\lambda^2}{2!}\left(\frac{\partial^2\psi_n}{\partial\lambda^2}\right)_{\lambda=0} + \cdots$$
$$E_n = E_n^{(0)} + \lambda\left(\frac{dE_n}{d\lambda}\right)_{\lambda=0} + \frac{\lambda^2}{2!}\left(\frac{d^2E_n}{d\lambda^2}\right)_{\lambda=0} + \cdots.$$
Now introduce the notation
$$\psi_n^{(k)} = \frac{1}{k!}\left.\frac{\partial^k\psi_n}{\partial\lambda^k}\right|_{\lambda=0} \qquad\qquad E_n^{(k)} = \frac{1}{k!}\left.\frac{d^kE_n}{d\lambda^k}\right|_{\lambda=0}$$
so we can write
$$\psi_n = \psi_n^{(0)} + \lambda\psi_n^{(1)} + \lambda^2\psi_n^{(2)} + \cdots \tag{2.3a}$$
$$E_n = E_n^{(0)} + \lambda E_n^{(1)} + \lambda^2 E_n^{(2)} + \cdots. \tag{2.3b}$$
For each $k = 1, 2, \ldots$ we call $\psi_n^{(k)}$ and $E_n^{(k)}$ the $k$th-order correction to the wavefunction and energy. We assume that the series converges for $\lambda = 1$, and that the first few terms give a good approximation to the exact solutions.

It will be convenient to simplify some of our notation, so integrals such as $\langle\psi_n^{(j)}|\psi_n^{(k)}\rangle$ will simply be written $\langle n^{(j)}|n^{(k)}\rangle$. We assume that the unperturbed states are orthonormal so that
$$\langle m^{(0)}|n^{(0)}\rangle = \delta_{mn}$$
and we also choose our normalization so that
$$\langle n^{(0)}|n\rangle = 1. \tag{2.4}$$
If this last condition on $\psi_n$ isn't satisfied, then multiplying $\psi_n$ by $\langle n^{(0)}|n\rangle^{-1}$ will ensure that it is. Since multiplying the Schrödinger equation $H\psi_n = E_n\psi_n$ by a constant doesn't change $E_n$, this has no effect on the energy levels. If so desired, at the end of the calculation we can always re-normalize $\psi_n$ in the usual way.

Substituting (2.3a) into (2.4) yields
$$1 = \langle n^{(0)}|n^{(0)}\rangle + \lambda\langle n^{(0)}|n^{(1)}\rangle + \lambda^2\langle n^{(0)}|n^{(2)}\rangle + \cdots.$$
Now, it is a general result that if you have a power series equation of the form $\sum_{n=0}^\infty a_n x^n = 0$ for all $x$, then $a_n = 0$ for all $n$. That $a_0 = 0$ follows by letting $x = 0$. Now take the derivative with respect to $x$ and let $x = 0$ to obtain $a_1 = 0$. Taking the derivative again and letting $x = 0$ yields $a_2 = 0$. Clearly we can continue this procedure to arrive at $a_n = 0$ for all $n$. Applying this result to the above power series in $\lambda$ and using the fact that $\langle n^{(0)}|n^{(0)}\rangle = 1$, we conclude that
$$\langle n^{(0)}|n^{(k)}\rangle = 0 \quad\text{for all } k = 1, 2, \ldots. \tag{2.5}$$

We now substitute equations (2.3) into the Schrödinger equation (2.1):
$$(H^0 + \lambda H')(\psi_n^{(0)} + \lambda\psi_n^{(1)} + \lambda^2\psi_n^{(2)} + \cdots) = (E_n^{(0)} + \lambda E_n^{(1)} + \lambda^2 E_n^{(2)} + \cdots)(\psi_n^{(0)} + \lambda\psi_n^{(1)} + \lambda^2\psi_n^{(2)} + \cdots)$$
or, grouping powers of $\lambda$,
$$H^0\psi_n^{(0)} + \lambda(H^0\psi_n^{(1)} + H'\psi_n^{(0)}) + \lambda^2(H^0\psi_n^{(2)} + H'\psi_n^{(1)}) + \cdots$$
$$= E_n^{(0)}\psi_n^{(0)} + \lambda(E_n^{(0)}\psi_n^{(1)} + E_n^{(1)}\psi_n^{(0)}) + \lambda^2(E_n^{(0)}\psi_n^{(2)} + E_n^{(1)}\psi_n^{(1)} + E_n^{(2)}\psi_n^{(0)}) + \cdots.$$
Again ignoring questions of convergence, we can equate powers of $\lambda$ on both sides of this equation. For $\lambda^0$ we simply have
$$H^0\psi_n^{(0)} = E_n^{(0)}\psi_n^{(0)} \tag{2.6a}$$
which doesn't tell us anything new. For $\lambda^1$ we have $H^0\psi_n^{(1)} + H'\psi_n^{(0)} = E_n^{(0)}\psi_n^{(1)} + E_n^{(1)}\psi_n^{(0)}$, or
$$(H^0 - E_n^{(0)})\psi_n^{(1)} = (E_n^{(1)} - H')\psi_n^{(0)}. \tag{2.6b}$$
For $\lambda^2$ we have $H^0\psi_n^{(2)} + H'\psi_n^{(1)} = E_n^{(0)}\psi_n^{(2)} + E_n^{(1)}\psi_n^{(1)} + E_n^{(2)}\psi_n^{(0)}$, or
$$(H^0 - E_n^{(0)})\psi_n^{(2)} = (E_n^{(1)} - H')\psi_n^{(1)} + E_n^{(2)}\psi_n^{(0)}. \tag{2.6c}$$
And in general we have for $k \ge 1$
$$(H^0 - E_n^{(0)})\psi_n^{(k)} = (E_n^{(1)} - H')\psi_n^{(k-1)} + E_n^{(2)}\psi_n^{(k-2)} + \cdots + E_n^{(k)}\psi_n^{(0)}. \tag{2.6d}$$
Notice that at each step along the way, $\psi_n^{(k)}$ is determined by $\psi_n^{(k-1)}, \psi_n^{(k-2)}, \ldots, \psi_n^{(0)}$. We can also add an arbitrary multiple of $\psi_n^{(0)}$ to each $\psi_n^{(k)}$ without affecting the left side of these equations. Hence we can choose this multiple so that $\langle n^{(0)}|n^{(k)}\rangle = 0$ for $k \ge 1$, which is the same result as we had in (2.5).

Now using the hermiticity of $H^0$ and the fact that $E_n^{(0)}$ is real, we have
$$\langle n^{(0)}|H^0 n^{(k)}\rangle = \langle H^0 n^{(0)}|n^{(k)}\rangle = E_n^{(0)}\langle n^{(0)}|n^{(k)}\rangle = 0 \quad\text{for } k \ge 1.$$
Then multiplying (2.6d) from the left by $\psi_n^{(0)*}$ and integrating, we see that the left-hand side vanishes, and we are left with (since $\langle n^{(0)}|n^{(0)}\rangle = 1$)
$$0 = -\langle n^{(0)}|H'n^{(k-1)}\rangle + E_n^{(k)}$$
or
$$E_n^{(k)} = \langle n^{(0)}|H'n^{(k-1)}\rangle \quad\text{for } k \ge 1. \tag{2.7}$$
In particular, we have the extremely important result for the first-order energy correction to the $n$th state
$$E_n^{(1)} = \langle n^{(0)}|H'n^{(0)}\rangle = \int \psi_n^{(0)*} H'\psi_n^{(0)}\,dx. \tag{2.8}$$
Letting $\lambda = 1$ in (2.3b), we see that to first order, the energy of the $n$th state is given by
$$E_n \approx E_n^{(0)} + E_n^{(1)} = E_n^{(0)} + \int \psi_n^{(0)*} H'\psi_n^{(0)}\,dx.$$

Example 2.1. Let the unperturbed system be the free harmonic oscillator, with ground-state wavefunction
$$\psi_0^{(0)} = \left(\frac{m\omega}{\pi\hbar}\right)^{1/4} e^{-m\omega x^2/2\hbar}$$
and energy levels
$$E_n^{(0)} = \left(n + \frac{1}{2}\right)\hbar\omega.$$
Now consider the anharmonic oscillator with Hamiltonian
$$H = H^0 + H' := H^0 + ax^3 + bx^4.$$
The first-order energy correction to the ground state is given by
$$E_0^{(1)} = \langle 0^{(0)}|H'0^{(0)}\rangle = \left(\frac{m\omega}{\pi\hbar}\right)^{1/2}\int_{-\infty}^\infty e^{-m\omega x^2/\hbar}(ax^3 + bx^4)\,dx.$$
However, the integral over $x^3$ vanishes by symmetry (the integral of an odd function over a symmetric interval), and writing $\alpha = m\omega/\hbar$ we are left with
$$E_0^{(1)} = b\left(\frac{\alpha}{\pi}\right)^{1/2}\int_{-\infty}^\infty x^4 e^{-\alpha x^2}\,dx = b\left(\frac{\alpha}{\pi}\right)^{1/2}\frac{\partial^2}{\partial\alpha^2}\int_{-\infty}^\infty e^{-\alpha x^2}\,dx = b\left(\frac{\alpha}{\pi}\right)^{1/2}\frac{\partial^2}{\partial\alpha^2}\left(\frac{\pi}{\alpha}\right)^{1/2} = \frac{3b}{4\alpha^2} = \frac{3b}{4}\frac{\hbar^2}{m^2\omega^2}.$$
Thus, to first order, the ground state energy of the anharmonic oscillator is given by
$$E_0 \approx E_0^{(0)} + E_0^{(1)} = \frac{1}{2}\hbar\omega + \frac{3b}{4}\frac{\hbar^2}{m^2\omega^2}.$$

Now let's find the first-order correction to the wavefunction. Since the unperturbed states $\psi_m^{(0)}$ form a complete orthonormal set, we may expand $\psi_n^{(1)}$ in terms of them as
$$\psi_n^{(1)} = \sum_m a_m\psi_m^{(0)} \qquad\text{where}\qquad a_m = \langle m^{(0)}|n^{(1)}\rangle.$$
(It would be way too cluttered to try to label these expansion coefficients to denote the fact that they also refer to the first-order correction of the $n$th state.) Then for $m \ne n$, we multiply (2.6b) from the left by $\psi_m^{(0)*}$ and integrate:
$$\langle m^{(0)}|(H^0 - E_n^{(0)})n^{(1)}\rangle = E_n^{(1)}\langle m^{(0)}|n^{(0)}\rangle - \langle m^{(0)}|H'n^{(0)}\rangle$$
or (since $H^0\psi_m^{(0)} = E_m^{(0)}\psi_m^{(0)}$ and $\langle m^{(0)}|n^{(0)}\rangle = 0$ for $m \ne n$)
$$(E_m^{(0)} - E_n^{(0)})\langle m^{(0)}|n^{(1)}\rangle = -\langle m^{(0)}|H'n^{(0)}\rangle.$$
Therefore
$$a_m = \langle m^{(0)}|n^{(1)}\rangle = \frac{\langle m^{(0)}|H'n^{(0)}\rangle}{E_n^{(0)} - E_m^{(0)}} \quad\text{for } m \ne n. \tag{2.9}$$
You should realize that this last step was where the assumed nondegeneracy of the states came in. In order for us to divide by $E_n^{(0)} - E_m^{(0)}$, we must assume that it is nonzero. This is true as long as $m \ne n$ implies that $E_n^{(0)} \ne E_m^{(0)}$.
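The closed-form answer in Example 2.1 is easy to verify by direct numerical integration. A minimal sketch in units where $\hbar = m = \omega = 1$ (so $\alpha = 1$), with arbitrary illustrative values for $a$ and $b$:

```python
import numpy as np

# Numerical check of Example 2.1 in units hbar = m = omega = 1, so that
# alpha = m*omega/hbar = 1: the first-order ground-state shift of
# H' = a x^3 + b x^4 should equal 3b/(4 alpha^2), the a x^3 piece
# vanishing by parity. The values of a and b are arbitrary test data.
a, b, alpha = 0.3, 0.2, 1.0

x = np.linspace(-10.0, 10.0, 200001)
dx = x[1] - x[0]
prob0 = np.sqrt(alpha / np.pi) * np.exp(-alpha * x**2)   # |psi_0|^2

E1_numeric = np.sum((a * x**3 + b * x**4) * prob0) * dx  # <0|H'|0>
E1_exact = 3 * b / (4 * alpha**2)
```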
Since $a_n = \langle n^{(0)}|n^{(1)}\rangle = 0$ (this is equation (2.5)), we finally obtain
$$\psi_n^{(1)} = \sum_{m\ne n}\frac{\langle m^{(0)}|H'n^{(0)}\rangle}{E_n^{(0)} - E_m^{(0)}}\,\psi_m^{(0)}. \tag{2.10}$$
Now that we have the first-order correction to the wavefunction, it is easy to get the second-order correction to the energy. Using (2.10) in (2.7) with $k = 2$, we immediately have
$$E_n^{(2)} = \sum_{m\ne n}\frac{\left|\langle n^{(0)}|H'm^{(0)}\rangle\right|^2}{E_n^{(0)} - E_m^{(0)}} = \sum_{m\ne n}\frac{\langle m^{(0)}|H'n^{(0)}\rangle\langle n^{(0)}|H'm^{(0)}\rangle}{E_n^{(0)} - E_m^{(0)}}. \tag{2.11}$$
The last term we will compute is the second-order correction to the wavefunction. We again expand in terms of the $\psi_m^{(0)}$ as
$$\psi_n^{(2)} = \sum_m b_m\psi_m^{(0)}$$
where $b_m = \langle m^{(0)}|n^{(2)}\rangle$. Multiplying (2.6c) from the left by $\psi_m^{(0)*}$ and integrating, we have (assuming $m \ne n$)
$$(E_m^{(0)} - E_n^{(0)})\langle m^{(0)}|n^{(2)}\rangle = E_n^{(1)}\langle m^{(0)}|n^{(1)}\rangle - \langle m^{(0)}|H'n^{(1)}\rangle$$
or
$$b_m = \langle m^{(0)}|n^{(2)}\rangle = \frac{E_n^{(1)}}{E_m^{(0)} - E_n^{(0)}}\langle m^{(0)}|n^{(1)}\rangle - \frac{\langle m^{(0)}|H'n^{(1)}\rangle}{E_m^{(0)} - E_n^{(0)}}.$$
Now use (2.9) in the first term on the right-hand side and use (2.10) in the second term to write
$$b_m = -\frac{E_n^{(1)}\langle m^{(0)}|H'n^{(0)}\rangle}{(E_n^{(0)} - E_m^{(0)})^2} - \sum_{k\ne n}\frac{\langle m^{(0)}|H'k^{(0)}\rangle\langle k^{(0)}|H'n^{(0)}\rangle}{(E_m^{(0)} - E_n^{(0)})(E_n^{(0)} - E_k^{(0)})}.$$
Using (2.8) we finally obtain
$$\psi_n^{(2)} = \sum_{m\ne n}\sum_{k\ne n}\frac{\langle m^{(0)}|H'k^{(0)}\rangle\langle k^{(0)}|H'n^{(0)}\rangle}{(E_n^{(0)} - E_m^{(0)})(E_n^{(0)} - E_k^{(0)})}\,\psi_m^{(0)} - \sum_{m\ne n}\frac{\langle m^{(0)}|H'n^{(0)}\rangle\langle n^{(0)}|H'n^{(0)}\rangle}{(E_n^{(0)} - E_m^{(0)})^2}\,\psi_m^{(0)}. \tag{2.12}$$

Let me make several points. First, recall that because of equation (2.4), our states are not normalized. Second, be sure to realize that the sums in equations (2.10), (2.11) and (2.12) are over states, and not energy levels. If some of the energy levels other than the $n$th are degenerate, then we must include a term in each of these sums for each linearly independent wavefunction corresponding to the degenerate energy level. The reason for this is that the expansions of $\psi_n^{(1)}$ and $\psi_n^{(2)}$ were in terms of a complete set of functions, and hence we must be sure to include all linearly independent states in the sums.
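Formulas (2.8) and (2.11) can be sanity-checked against exact diagonalization in a finite matrix model, where $H^0$ is a diagonal matrix and $H'$ an arbitrary Hermitian matrix. The numbers below are hypothetical test data; the point is only that the perturbative energy through second order agrees with the exact eigenvalue up to $O(\lambda^3)$:

```python
import numpy as np

# Matrix-model check of eqs. (2.8) and (2.11): for H = H0 + lam*V with
# nondegenerate H0, the energy through second order should agree with
# exact diagonalization up to O(lam^3). Hypothetical numbers throughout.
E0 = np.array([0.0, 1.0, 2.5])           # nondegenerate unperturbed levels
V = np.array([[0.3, 0.1, 0.2],
              [0.1, -0.2, 0.4],
              [0.2, 0.4, 0.1]])           # real symmetric perturbation
lam = 0.01
n = 0                                     # ground state

E1 = V[n, n]                                            # eq. (2.8)
E2 = sum(V[m, n] ** 2 / (E0[n] - E0[m])                 # eq. (2.11)
         for m in range(3) if m != n)

E_pert = E0[n] + lam * E1 + lam**2 * E2
E_exact = np.linalg.eigvalsh(np.diag(E0) + lam * V)[0]  # lowest eigenvalue
```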
Furthermore, if there happens to be a continuum of states in the unperturbed system, then we must also include an integral over these, so that we have included all linearly independent states in our expansion.

2.2 Perturbation Theory for a Degenerate Energy Level

We now turn to the perturbation treatment of a degenerate energy level, meaning that there are multiple unperturbed states that all have the same energy. If we let $d$ be the degree of degeneracy, then we have states $\psi_1^{(0)}, \ldots, \psi_d^{(0)}$ satisfying the unperturbed Schrödinger equation
$$H^0\psi_n^{(0)} = E_n^{(0)}\psi_n^{(0)} \tag{2.13a}$$
with
$$E_1^{(0)} = E_2^{(0)} = \cdots = E_d^{(0)}. \tag{2.13b}$$
You must be careful with the notation here, because we don't want to clutter it up with too many indices. Even though we write $E_1^{(0)}, \ldots, E_d^{(0)}$, this does not mean that these are necessarily the $d$ lowest-lying states that satisfy the unperturbed Schrödinger equation. We are referring here to a single degenerate energy level.

The interacting (or perturbed) Schrödinger equation is
$$H\psi_n = (H^0 + \lambda H')\psi_n = E_n\psi_n.$$
In our treatment of a nondegenerate energy level, we assumed that $\lim_{\lambda\to 0}E_n = E_n^{(0)}$ and $\lim_{\lambda\to 0}\psi_n = \psi_n^{(0)}$, where the state $\psi_n^{(0)}$ was unique. However, in the case of degeneracy, the second of these does not hold. While it is true that as $\lambda$ goes to zero we still have
$$\lim_{\lambda\to 0}E_n = E_n^{(0)}$$
the presence of the perturbation generally splits the degenerate energy level into multiple distinct levels. However, there are varying degrees of splitting, and while the perturbation may completely remove the degeneracy, it may also only partially remove it or have no effect at all. This is illustrated in the figure below.

[Figure 3: Splitting of energy levels due to a perturbation, as λ runs from 0 to 1.]
The important point to realize here is that in the limit $\lambda \to 0$, the state $\psi_n$ does not necessarily go to a unique $\psi_n^{(0)}$, but rather only to some linear combination of the normalized degenerate states $\psi_1^{(0)}, \ldots, \psi_d^{(0)}$. This is because any such linear combination
$$c_1\psi_1^{(0)} + c_2\psi_2^{(0)} + \cdots + c_d\psi_d^{(0)}$$
will satisfy (2.13a) with the same eigenvalue $E_n^{(0)}$. Thus there are an infinite number of such linear combinations made up of these $d$ linearly independent normalized eigenfunctions, and any of them will work as the unperturbed state.

For example, recall that the hydrogen atom states are labeled $\psi_{nlm}$, where the energy depends only on $n$, and the factor $e^{im\varphi}$ makes the wave function complex for $m \ne 0$. The 2p states correspond to $n = 2$ and $l = 1$, and the $m = \pm 1$ states are the complex wave functions $2p_1$ and $2p_{-1}$. However, instead of these complex wave functions, we can take the real linear combinations defined by
$$\psi_{2p_x} = \frac{1}{\sqrt{2}}(\psi_{2p_1} + \psi_{2p_{-1}})$$
and
$$\psi_{2p_y} = \frac{1}{i\sqrt{2}}(\psi_{2p_1} - \psi_{2p_{-1}})$$
which have the same energies. For most purposes in chemistry, these real wave functions are much more convenient to work with. And while the $2p_0$, $2p_1$ and $2p_{-1}$ states are degenerate, the presence of an electric or magnetic field will split the degeneracy because the interaction term in the Hamiltonian depends on the $m$ value.

Returning to our problem, all we can say is that
$$\lim_{\lambda\to 0}\psi_n = \sum_{i=1}^d c_i\psi_i^{(0)}, \qquad 1 \le n \le d.$$
Hence the first thing we must do is determine the correct zeroth-order wave functions, which we denote by $\phi_n^{(0)}$. In other words,
$$\phi_n^{(0)} := \lim_{\lambda\to 0}\psi_n = \sum_{i=1}^d c_i\psi_i^{(0)}, \qquad 1 \le n \le d \tag{2.14}$$
where each $\phi_n^{(0)}$ has a different set of coefficients $c_i$. (These should be labeled $c_i^{(n)}$, but I'm trying to keep it simple.) Note that since $H^0\psi_i^{(0)} = E_d^{(0)}\psi_i^{(0)}$ for each $i = 1, \ldots, d$, it follows that
$$H^0\phi_n^{(0)} = E_d^{(0)}\phi_n^{(0)}. \tag{2.15}$$

For the $d$-fold degenerate case, we proceed as in the nondegenerate case, except that now we use $\phi_n^{(0)}$ instead of $\psi_n^{(0)}$ for the zeroth-order wave function. Then equations (2.3) become
$$\psi_n = \phi_n^{(0)} + \lambda\psi_n^{(1)} + \lambda^2\psi_n^{(2)} + \cdots \tag{2.16a}$$
$$E_n = E_d^{(0)} + \lambda E_n^{(1)} + \lambda^2 E_n^{(2)} + \cdots \tag{2.16b}$$
where we have used (2.13b). Equations (2.16) apply for each $n = 1, \ldots, d$. As in the nondegenerate case, we substitute these into the Schrödinger equation $H\psi_n = E_n\psi_n$ and equate powers of $\lambda$. This is exactly the same as we had before, except that now we have $\phi_n^{(0)}$ instead of $\psi_n^{(0)}$, so we can immediately write down the results from equations (2.6).

Equating the coefficients of $\lambda^0$, we have $H^0\phi_n^{(0)} = E_d^{(0)}\phi_n^{(0)}$. Since for each $n = 1, \ldots, d$ the linear combination $\phi_n^{(0)}$ is an eigenstate of $H^0$ with eigenvalue $E_d^{(0)}$ (this is just the statement of equation (2.15)), this doesn't give us any new information. From the coefficients of $\lambda^1$ we have (for each $n = 1, \ldots, d$)
$$(H^0 - E_d^{(0)})\psi_n^{(1)} = (E_n^{(1)} - H')\phi_n^{(0)}. \tag{2.17}$$
Multiplying this from the left by $\phi_n^{(0)*}$ and integrating, we have (here I'm not using $n^{(0)}$ as a shorthand for $\psi_n^{(0)}$, to make sure there is no confusion with $\phi_n^{(0)}$)
$$\langle\phi_n^{(0)}|H^0\psi_n^{(1)}\rangle - E_d^{(0)}\langle\phi_n^{(0)}|\psi_n^{(1)}\rangle = E_n^{(1)}\langle\phi_n^{(0)}|\phi_n^{(0)}\rangle - \langle\phi_n^{(0)}|H'\phi_n^{(0)}\rangle.$$
Using (2.15) we see that the left-hand side of this equation vanishes, so assuming that the correct zeroth-order wave functions are normalized, we arrive at the first-order correction to the energy
$$E_n^{(1)} = \langle\phi_n^{(0)}|H'\phi_n^{(0)}\rangle. \tag{2.18}$$
This is similar to the nondegenerate result (2.8), except that now we use the correct zeroth-order wave functions. Of course, in order to evaluate these integrals, we must know the functions $\phi_n^{(0)}$, which, so far, we don't.
So, for any $1 \le m \le d$, we multiply (2.17) from the left by one of the $d$-fold degenerate unperturbed wave functions $\psi_m^{(0)}$ and integrate to obtain
$$\langle\psi_m^{(0)}|H^0\psi_n^{(1)}\rangle - E_d^{(0)}\langle\psi_m^{(0)}|\psi_n^{(1)}\rangle = E_n^{(1)}\langle\psi_m^{(0)}|\phi_n^{(0)}\rangle - \langle\psi_m^{(0)}|H'\phi_n^{(0)}\rangle.$$
Since $H^0\psi_m^{(0)} = E_d^{(0)}\psi_m^{(0)}$, we see that the left-hand side of this equation vanishes, and we are left with
$$\langle\psi_m^{(0)}|H'\phi_n^{(0)}\rangle - E_n^{(1)}\langle\psi_m^{(0)}|\phi_n^{(0)}\rangle = 0, \qquad m = 1, \ldots, d.$$
There is no loss of generality in assuming that the zeroth-order wave functions $\psi_i^{(0)}$ of the degenerate level are orthonormal, so we take
$$\langle\psi_m^{(0)}|\psi_i^{(0)}\rangle = \delta_{mi} \quad\text{for } m, i = 1, \ldots, d. \tag{2.19}$$
(If the zeroth-order wave functions $\psi_i^{(0)}$ aren't orthonormal, then apply the Gram-Schmidt process to construct an orthonormal set. Since the new orthonormal functions are just linear combinations of the original set, and the correct zeroth-order functions $\phi_n^{(0)}$ are linear combinations of the $\psi_i^{(0)}$, the $\phi_n^{(0)}$ will just be different linear combinations of the new orthonormal functions.) Then substituting the definition (2.14) for $\phi_n^{(0)}$, we have
$$\sum_{i=1}^d c_i\langle\psi_m^{(0)}|H'\psi_i^{(0)}\rangle - E_n^{(1)}\sum_{i=1}^d c_i\langle\psi_m^{(0)}|\psi_i^{(0)}\rangle = 0$$
or
$$\sum_{i=1}^d \left(H'_{mi} - E_n^{(1)}\delta_{mi}\right)c_i = 0, \qquad m = 1, \ldots, d \tag{2.20a}$$
where
$$H'_{mi} = \langle\psi_m^{(0)}|H'\psi_i^{(0)}\rangle.$$
This is just another homogeneous system of $d$ equations in the $d$ unknowns $c_i$. In fact, if we let $c$ be the vector with components $c_i$, then we can write (2.20a) in matrix form as
$$H'c = E_n^{(1)}c \tag{2.20b}$$
which shows that this is nothing more than an eigenvalue equation for the matrix $H'$ acting on the $d$-dimensional eigenspace of degenerate wave functions. As usual, if (2.20a) is to have a nontrivial solution, we must have the secular equation
$$\det\left(H'_{mi} - E_n^{(1)}\delta_{mi}\right) = 0. \tag{2.21}$$
Written out, this looks like
$$\begin{vmatrix} H'_{11} - E_n^{(1)} & H'_{12} & \cdots & H'_{1d} \\ H'_{21} & H'_{22} - E_n^{(1)} & \cdots & H'_{2d} \\ \vdots & \vdots & & \vdots \\ H'_{d1} & H'_{d2} & \cdots & H'_{dd} - E_n^{(1)} \end{vmatrix} = 0.$$
This is a polynomial of degree $d$ in $E_n^{(1)}$, and the $d$ roots $E_1^{(1)}, E_2^{(1)}, \ldots, E_d^{(1)}$ are the first-order corrections to the energy of the $d$-fold degenerate unperturbed state.

So, we solve (2.21) for the eigenvalues $E_n^{(1)}$, and use these in (2.20b) to solve for the eigenvectors $c$. These then define the correct zeroth-order wave functions according to (2.14). Again, note that all we are doing is finding the eigenvalues and eigenvectors of the matrix $H'_{mi}$. And since $H'$ is Hermitian, eigenvectors belonging to distinct eigenvalues are orthogonal. But each eigenvector $c$ has components that are just the expansion coefficients in (2.14), and therefore (reverting to a more complete notation)
$$\langle\phi_m^{(0)}|H'\phi_n^{(0)}\rangle = \sum_{i,j=1}^d c_i^{(m)*}\langle\psi_i^{(0)}|H'\psi_j^{(0)}\rangle c_j^{(n)} = \sum_{i,j=1}^d c_i^{(m)*}H'_{ij}c_j^{(n)} = c^{(m)\dagger}H'c^{(n)} = E_n^{(1)}c^{(m)\dagger}c^{(n)} = E_n^{(1)}\langle c^{(m)}|c^{(n)}\rangle$$
or
$$\langle\phi_m^{(0)}|H'\phi_n^{(0)}\rangle = E_n^{(1)}\delta_{mn} \tag{2.22}$$
where we assume that the eigenvectors are normalized. In the case where $m = n$, we arrive back at (2.18).

What about the case $m \ne n$? Recall that in our treatment of nondegenerate perturbation theory, the reason we had to assume nondegeneracy was that equations (2.10) and (2.11) would blow up if there were another state $\psi_m^{(0)}$ with the same energy as $\psi_n^{(0)}$. However, in that case, we would be saved if the numerator also went to zero, and that is precisely what happens if we use the correct zeroth-order wave functions. Essentially then, the degenerate case proceeds just like the nondegenerate case, except that we must use the correct zeroth-order wave functions.

Returning to (2.21), if all $d$ roots are distinct, then we have completely split the degeneracy into $d$ distinct levels
$$E_d^{(0)} + E_1^{(1)}, \quad E_d^{(0)} + E_2^{(1)}, \quad \ldots, \quad E_d^{(0)} + E_d^{(1)}.$$
If not all of the roots are distinct, then we have only partly removed the degeneracy (at least to first order). We will assume that all $d$ roots are distinct, and hence that the degeneracy has been completely lifted in first order.
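Equation (2.20b) says the whole first-order degenerate problem is just a matrix diagonalization, which is easy to sketch numerically. The $3\times 3$ matrix below is a hypothetical $H'_{mi}$ for a 3-fold degenerate level; the eigenvalues are the first-order shifts, and the columns of eigenvectors give the coefficients defining the correct zeroth-order combinations, which then satisfy (2.22):

```python
import numpy as np

# Degenerate first order as the eigenvalue problem (2.20b), for a
# hypothetical 3-fold degenerate level with H'_mi given below.
Hp = np.array([[0.0, 0.2, 0.0],
               [0.2, 0.0, 0.0],
               [0.0, 0.0, 0.5]])    # H'_mi = <psi_m|H' psi_i>, Hermitian

E1, C = np.linalg.eigh(Hp)          # first-order shifts and coefficients c

# The columns of C are the coefficient vectors c of eq. (2.14); they
# diagonalize H' in the degenerate subspace, as in eq. (2.22).
diag_check = C.T @ Hp @ C
```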
Now that we have the $d$ roots $E_n^{(1)}$, we can take them one at a time and plug back into the system of equations (2.20a) and solve for $c_2, \ldots, c_d$ in terms of $c_1$. (Recall that because the determinant of the coefficient matrix of the system (2.20a) is zero, the $d$ equations in (2.20a) are linearly dependent, and hence we can only find $d - 1$ of the unknowns in terms of one of them.) Finally, we fix $c_1$ by normalization, using equations (2.14) and (2.19):
$$1 = \langle\phi_n^{(0)}|\phi_n^{(0)}\rangle = \sum_{i,j=1}^d c_i^*c_j\langle\psi_i^{(0)}|\psi_j^{(0)}\rangle = \sum_{i,j=1}^d c_i^*c_j\,\delta_{ij} = \sum_{i=1}^d |c_i|^2. \tag{2.23}$$
Also be sure to realize that we obtain a separate set of coefficients $c_i$ for each root $E_n^{(1)}$. This is how we get the $d$ independent zeroth-order wave functions.

Obviously, finding the roots of (2.21) is a difficult problem in general. However, under some special conditions, the problem may be much more tractable. The best situation would be if all off-diagonal elements $H'_{mi}$, $m \ne i$, vanished. Then the determinant is just the product of the diagonal elements, and the $d$ roots are simply $E_n^{(1)} = H'_{mm}$ for $m = 1, \ldots, d$, or
$$E_1^{(1)} = H'_{11}, \quad E_2^{(1)} = H'_{22}, \quad \ldots, \quad E_d^{(1)} = H'_{dd}.$$
Let us assume that all $d$ roots are distinct. Taking the root $E_n^{(1)} = E_1^{(1)} = H'_{11}$ as a specific example, (2.20a) becomes the set of $d - 1$ equations
$$(H'_{22} - E_1^{(1)})c_2 = 0$$
$$(H'_{33} - E_1^{(1)})c_3 = 0$$
$$\vdots$$
$$(H'_{dd} - E_1^{(1)})c_d = 0.$$
Since $E_1^{(1)} = H'_{11} \ne H'_{mm}$ for $m = 2, 3, \ldots, d$, it follows that $c_2 = c_3 = \cdots = c_d = 0$. Normalization then implies that $c_1 = 1$, and the corresponding zeroth-order wave function defined by (2.14) is $\phi_1^{(0)} = \psi_1^{(0)}$. Clearly this applies to any of the $d$ roots, so we have
$$\phi_i^{(0)} = \psi_i^{(0)}, \qquad i = 1, \ldots, d.$$
Thus we have shown that when the secular determinant is diagonal and the $d$ matrix elements $H'_{mm}$ are all distinct, the initial wave functions $\psi_i^{(0)}$ are the correct zeroth-order wave functions $\phi_i^{(0)}$.
Another situation that lends itself to a relatively simple solution is when the secular determinant is block diagonal. For example, in the case where $d = 4$ we would have
$$\begin{vmatrix} H'_{11} - E_n^{(1)} & H'_{12} & 0 & 0 \\ H'_{21} & H'_{22} - E_n^{(1)} & 0 & 0 \\ 0 & 0 & H'_{33} - E_n^{(1)} & H'_{34} \\ 0 & 0 & H'_{43} & H'_{44} - E_n^{(1)} \end{vmatrix} = 0.$$
This is of the same form as we had in Example 1.3 (except with $S_{ij} = \delta_{ij}$). Exactly the same reasoning we used to show that two of the variation functions were linear combinations of $f_1$ and $f_2$, and two of the variation functions were linear combinations of $f_3$ and $f_4$, now shows that the correct zeroth-order wave functions are of the form
$$\phi_1^{(0)} = c_1^{(1)}\psi_1^{(0)} + c_2^{(1)}\psi_2^{(0)} \qquad\qquad \phi_2^{(0)} = c_1^{(2)}\psi_1^{(0)} + c_2^{(2)}\psi_2^{(0)}$$
$$\phi_3^{(0)} = c_3^{(3)}\psi_3^{(0)} + c_4^{(3)}\psi_4^{(0)} \qquad\qquad \phi_4^{(0)} = c_3^{(4)}\psi_3^{(0)} + c_4^{(4)}\psi_4^{(0)}$$

Is there any way we can choose our initial wave functions $\psi_i^{(0)}$ to make things easier? Well, referring back to Theorem 1.2, suppose we have a Hermitian operator $A$ that commutes with both $H^0$ and $H'$. If we choose our initial wave functions to be eigenfunctions of both $A$ and $H^0$, then the off-diagonal matrix elements $H'_{ij} = \langle\psi_i^{(0)}|H'\psi_j^{(0)}\rangle$ will vanish if $\psi_i^{(0)}$ and $\psi_j^{(0)}$ belong to different eigenspaces of $A$. Therefore, if the functions $\psi_i^{(0)}$ all have different eigenvalues of $A$, the secular determinant will be diagonal, so that $\phi_i^{(0)} = \psi_i^{(0)}$.

If more than one $\psi_i^{(0)}$ belongs to a given eigenvalue $a_k$ of $A$ (in other words, $\dim V_{a_k} > 1$), then this subcollection will form a block in the secular determinant. So in general, we will have a secular determinant that is block diagonal, where each block has size $\dim V_{a_k}$. In this case, each correct zeroth-order wave function will be a linear combination of those $\psi_i^{(0)}$ that belong to the same eigenvalue of $A$.

Before proceeding with an example, let me prove a very important and useful property of the spherical harmonics. The parity operation is $\mathbf{r} \to -\mathbf{r}$, and in spherical coordinates this is equivalent to $\theta \to \pi - \theta$ and $\varphi \to \varphi + \pi$.
[Figure: the spherical coordinate system, with polar angle θ measured from the z-axis and azimuthal angle ϕ measured from the x-axis.]

Indeed, we know that (for the unit sphere) $z = \cos\theta$, and from the figure we see that $-z$ would be at $\pi - \theta$. Similarly, a point on the $x$-axis at $\varphi = 0$ goes to the point $-x$ at $\varphi = \pi$. Alternatively, letting $\theta \to \pi - \theta$ in $x = \sin\theta\cos\varphi$ doesn't change $x$, so in order to have $x \to -x$ we need $\cos\varphi \to -\cos\varphi$, which is accomplished by letting $\varphi \to \varphi + \pi$.

Now observe that under parity, $\mathbf{r} \to -\mathbf{r}$ and $\mathbf{p} \to -\mathbf{p}$, so that $\mathbf{L} = \mathbf{r}\times\mathbf{p}$ is unchanged. Thus angular momentum is a pseudo-vector, as you probably already knew. But this means that the parity operation $\Pi$ commutes with the quantum mechanical operator $\mathbf{L}$, so that the three operators $L^2$, $L_z$ and $\Pi$ are mutually commuting, and the eigenfunctions $Y_l^m(\theta, \varphi)$ of angular momentum can be chosen to have a definite parity. Note also that since $\Pi$ and $\mathbf{L}$ commute, it follows that $\Pi$ and $L_\pm$ commute, so acting on any $Y_l^m$ with $L_\pm$ won't change its parity.

Look at the explicit form of the state $Y_l^l$:
$$Y_l^l(\theta, \varphi) = \frac{(-1)^l}{2^l\, l!}\left[\frac{(2l+1)!}{4\pi}\right]^{1/2}(\sin\theta)^l\, e^{il\varphi}.$$
Letting $\theta \to \pi - \theta$, we have $(\sin\theta)^l \to (\sin\theta)^l$, but under $\varphi \to \varphi + \pi$ we have $e^{il\varphi} \to e^{il\pi}e^{il\varphi} = (-1)^l e^{il\varphi}$. Therefore, under parity we see that $Y_l^l \to (-1)^l Y_l^l$. But we can get to any $Y_l^m$ by repeatedly applying $L_-$ to $Y_l^l$, and since this doesn't change the parity of $Y_l^m$, we have the extremely useful result
$$\Pi\, Y_l^m(\theta, \varphi) = (-1)^l\, Y_l^m(\theta, \varphi). \tag{2.24}$$

Example 2.2 (Stark Effect). In this example we will take a look at the effect of a uniform electric field $\mathbf{E} = \mathcal{E}\hat z$ on a hydrogen atom, where the unperturbed Hamiltonian is given by
$$H^0 = \frac{p^2}{2m} - \frac{e^2}{r}$$
and $\mathbf{r} = \mathbf{r}_1 - \mathbf{r}_2$ is the relative position vector from the proton to the electron.

We first need to find the perturbing potential energy. The force on a particle of charge $q$ in an electric field $\mathbf{E} = -\nabla\phi$ is $\mathbf{F} = q\mathbf{E} = -q\nabla\phi$, where $\phi(\mathbf{r})$ is the electric potential. On the other hand, the force is also given in terms of the potential energy $V(\mathbf{r})$ by $\mathbf{F} = -\nabla V$, and hence $\nabla V = q\nabla\phi$, so that
$$\int_0^{\mathbf{r}}\nabla V\cdot d\mathbf{r} = q\int_0^{\mathbf{r}}\nabla\phi\cdot d\mathbf{r}$$
or
$$V(\mathbf{r}) - V(0) = q[\phi(\mathbf{r}) - \phi(0)].$$
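The parity result (2.24) can be checked directly for $l = 1$. A minimal sketch using the explicit $l = 1$ spherical harmonics written out by hand (standard Condon-Shortley phase convention; the test angles are arbitrary):

```python
import numpy as np

# Check of eq. (2.24), Pi Y_l^m = (-1)^l Y_l^m, for l = 1 using the
# explicit l = 1 spherical harmonics (Condon-Shortley convention).
def Y1(m, theta, phi):
    if m == 0:
        return np.sqrt(3 / (4 * np.pi)) * np.cos(theta)
    if m == 1:
        return -np.sqrt(3 / (8 * np.pi)) * np.sin(theta) * np.exp(1j * phi)
    if m == -1:
        return np.sqrt(3 / (8 * np.pi)) * np.sin(theta) * np.exp(-1j * phi)

theta, phi = 0.7, 1.3    # arbitrary test point
# parity: theta -> pi - theta, phi -> phi + pi should give (-1)^1 Y_1^m
parity_ok = all(
    np.isclose(Y1(m, np.pi - theta, phi + np.pi), -Y1(m, theta, phi))
    for m in (-1, 0, 1)
)
```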
If we take $V(0) = \phi(0) = 0$, then we have $V(\mathbf{r}) = q\phi(\mathbf{r})$. Thus the interaction Hamiltonian $H'$ consists of both the energy $e\phi(\mathbf{r}_2)$ of the proton and the energy $-e\phi(\mathbf{r}_1)$ of the electron, and therefore
$$H' = e[\phi(\mathbf{r}_2) - \phi(\mathbf{r}_1)].$$
But the electric field is constant, so that
$$\int_{\mathbf{r}_1}^{\mathbf{r}_2}\mathbf{E}\cdot d\mathbf{r} = \mathbf{E}\cdot(\mathbf{r}_2 - \mathbf{r}_1) = -\mathbf{E}\cdot(\mathbf{r}_1 - \mathbf{r}_2) = -\mathbf{E}\cdot\mathbf{r} = -\mathcal{E}z$$
while we also have
$$\int_{\mathbf{r}_1}^{\mathbf{r}_2}\mathbf{E}\cdot d\mathbf{r} = -\int_{\mathbf{r}_1}^{\mathbf{r}_2}\nabla\phi\cdot d\mathbf{r} = -[\phi(\mathbf{r}_2) - \phi(\mathbf{r}_1)].$$
Hence the final form of our perturbation is $H' = e\mathbf{E}\cdot\mathbf{r}$, or
$$H' = e\mathcal{E}z.$$
Note also that if we define the electric dipole moment $\boldsymbol{\mu}_e = e(\mathbf{r}_2 - \mathbf{r}_1) = -e\mathbf{r}$, then $H'$ can be called a dipole interaction because $H' = -\boldsymbol{\mu}_e\cdot\mathbf{E}$.

Let us first consider the ground state $\psi_{100}$ of the hydrogen atom. This state is nondegenerate, so the first-order energy correction to the ground state is, from equation (2.8),
$$E_{100}^{(1)} = \langle\psi_{100}|e\mathcal{E}z|\psi_{100}\rangle = e\mathcal{E}\langle\psi_{100}|z|\psi_{100}\rangle.$$
But $H^0$ is parity invariant, so the states $\psi_{nlm}$ all have a definite parity $(-1)^l$. Then $E_{100}^{(1)}$ is the integral of an odd function over all space, and hence it vanishes:
$$E_{100}^{(1)} = 0.$$
In fact, this shows that any nondegenerate state of the hydrogen atom has no first-order Stark effect.

Now consider the $n = 2$ levels of hydrogen. This is a four-fold degenerate level consisting of the wave functions $\psi_{200}$, $\psi_{210}$, $\psi_{211}$ and $\psi_{21-1}$. Since the parity of the states is given by $(-1)^l$, we see that the $l = 0$ state has even parity while the $l = 1$ states are odd.

However, it is not hard to see that $[H', L_z] = 0$. This is either a consequence of the fact that $H'$ is a function of $z = r\cos\theta$ while $L_z = -i\hbar\,\partial/\partial\varphi$, or you can note that $[L_i, r_j] = i\hbar\sum_k\varepsilon_{ijk}r_k$, so that $[L_z, z] = 0$. Either way, we have
$$0 = \langle\psi_{nl'm'}|[H', L_z]|\psi_{nlm}\rangle = \langle\psi_{nl'm'}|H'L_z - L_zH'|\psi_{nlm}\rangle = \hbar(m - m')\langle\psi_{nl'm'}|H'|\psi_{nlm}\rangle$$
and hence we have the selection rule
$$\langle\psi_{nl'm'}|H'|\psi_{nlm}\rangle = 0 \quad\text{if } m \ne m'.$$
(This is an example of Theorem 1.2.) This shows that $H'$ can only connect states with the same $m$ values.
And since $H'$ has odd parity, it can only connect states with opposite parities; i.e., in the present case it can only connect an $l = 0$ state with an $l = 1$ state.

Suppressing the index $n = 2$, we order our basis states $\psi_{lm}$ as $\{\psi_{00}, \psi_{10}, \psi_{11}, \psi_{1-1}\}$. (In other words, the rows and columns are labeled by these functions in this order.) Then the secular equation (2.21) becomes (also writing $E$ instead of $E^{(1)}$ for simplicity)
$$\begin{vmatrix} -E & \langle\psi_{00}|H'|\psi_{10}\rangle & 0 & 0 \\ \langle\psi_{10}|H'|\psi_{00}\rangle & -E & 0 & 0 \\ 0 & 0 & -E & 0 \\ 0 & 0 & 0 & -E \end{vmatrix} = 0$$
or (since it's block diagonal)
$$[E^2 - (H'_{12})^2]E^2 = 0$$
where
$$H'_{12} = \langle\psi_{00}|H'|\psi_{10}\rangle = \langle\psi_{10}|H'|\psi_{00}\rangle = H'_{21}$$
because both $H'$ and the wave functions are real. Therefore the roots of the secular equation are
$$E_n^{(1)} = \pm H'_{12},\ 0,\ 0.$$
For our wave functions we have $\psi_{nlm} = R_{nl}Y_l^m$, or
$$\psi_{200} = \left(\frac{1}{2a_0^3}\right)^{1/2}\left(1 - \frac{r}{2a_0}\right)e^{-r/2a_0}\,Y_0^0$$
$$\psi_{210} = \left(\frac{1}{24a_0^3}\right)^{1/2}\frac{r}{a_0}\,e^{-r/2a_0}\,Y_1^0$$
where $a_0$ is the Bohr radius defined by $a_0 = \hbar^2/m_ee^2$, and hence
$$H'_{12} = \langle\psi_{200}|e\mathcal{E}z|\psi_{210}\rangle = e\mathcal{E}(2a_0)^{-3}\frac{2}{\sqrt{3}}\int e^{-r/a_0}\,\frac{r}{a_0}\left(1 - \frac{r}{2a_0}\right)z\,Y_0^{0*}Y_1^0\,r^2\,dr\,d\Omega.$$
But
$$Y_0^{0*} = \frac{1}{\sqrt{4\pi}} \qquad\text{and}\qquad z = r\cos\theta = r\sqrt{\frac{4\pi}{3}}\,Y_1^{0*}$$
so that using
$$\int d\Omega\, Y_l^{m*}Y_{l'}^{m'} = \delta_{ll'}\delta_{mm'}$$
we have
$$H'_{12} = e\mathcal{E}(2a_0)^{-3}\frac{2}{3a_0}\int_0^\infty \left(r^4 - \frac{r^5}{2a_0}\right)e^{-r/a_0}\,dr.$$
Using the general result
$$\int_0^\infty r^n e^{-\alpha r}\,dr = (-1)^n\frac{\partial^n}{\partial\alpha^n}\int_0^\infty e^{-\alpha r}\,dr = (-1)^n\frac{\partial^n}{\partial\alpha^n}\,\alpha^{-1} = \frac{n!}{\alpha^{n+1}}$$
we finally arrive at
$$H'_{12} = -3e\mathcal{E}a_0.$$

Now we need to find the corresponding eigenvectors $c$ that will specify the correct zeroth-order wave functions. These are the solutions to the system of equations $H'c = E_n^{(1)}c$ for each value of $E_n^{(1)}$ (see equation (2.20b)). Let $E_1^{(1)} = H'_{12}$. Then the eigenvector $c^{(1)}$ satisfies
$$\begin{pmatrix} -H'_{12} & H'_{12} & 0 & 0 \\ H'_{12} & -H'_{12} & 0 & 0 \\ 0 & 0 & -H'_{12} & 0 \\ 0 & 0 & 0 & -H'_{12} \end{pmatrix}\begin{pmatrix} c_1 \\ c_2 \\ c_3 \\ c_4 \end{pmatrix} = 0.$$
This implies that $c_1 = c_2$ and $c_3 = c_4 = 0$. Normalizing, we have $c_1 = c_2 = 1/\sqrt{2}$, so that
$$\phi_1^{(0)} = \frac{1}{\sqrt{2}}(\psi_{200} + \psi_{210}).$$
Next we let $E_2^{(1)} = -H'_{12}$. Now the eigenvector $c^{(2)}$ satisfies
$$\begin{pmatrix} H'_{12} & H'_{12} & 0 & 0 \\ H'_{12} & H'_{12} & 0 & 0 \\ 0 & 0 & H'_{12} & 0 \\ 0 & 0 & 0 & H'_{12} \end{pmatrix}\begin{pmatrix} c_1 \\ c_2 \\ c_3 \\ c_4 \end{pmatrix} = 0$$
so that $c_1 = -c_2$ and $c_3 = c_4 = 0$. Again, normalization yields $c_1 = -c_2 = 1/\sqrt{2}$, and hence
$$\phi_2^{(0)} = \frac{1}{\sqrt{2}}(\psi_{200} - \psi_{210}).$$
Finally, for the two degenerate roots $E_3^{(1)} = E_4^{(1)} = 0$ we have
$$\begin{pmatrix} 0 & H'_{12} & 0 & 0 \\ H'_{12} & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}\begin{pmatrix} c_1 \\ c_2 \\ c_3 \\ c_4 \end{pmatrix} = 0$$
so that $c_1 = c_2 = 0$ while $c_3$ and $c_4$ are completely arbitrary. Thus we can simply choose
$$\phi_3^{(0)} = \psi_{211} \qquad\text{and}\qquad \phi_4^{(0)} = \psi_{21-1}.$$
In summary, the correct zeroth-order wave functions for treating the Stark effect are $\phi_1^{(0)}$, which gets a first-order energy shift of $-3e\mathcal{E}a_0$; the wave function $\phi_2^{(0)}$, which gets a first-order energy shift of $+3e\mathcal{E}a_0$; and the original degenerate states $\phi_3^{(0)} = \psi_{211}$ and $\phi_4^{(0)} = \psi_{21-1}$, which remain degenerate to this order.

2.3 Perturbation Treatment of the First Excited States of Helium

The helium atom consists of a nucleus with two protons and two neutrons, and two orbiting electrons. If we take the nuclear charge to be $+Ze$ instead of $+2e$, then our discussion will apply equally well to helium-like ions such as H⁻, Li⁺ or Be²⁺. Neglecting terms such as spin-orbit coupling, the Hamiltonian is
$$H = -\frac{\hbar^2}{2m_e}\nabla_1^2 - \frac{\hbar^2}{2m_e}\nabla_2^2 - \frac{Ze^2}{r_1} - \frac{Ze^2}{r_2} + \frac{e^2}{r_{12}} \tag{2.25}$$
where $r_i$ is the distance from the nucleus to electron $i$, $r_{12}$ is the distance from electron 1 to electron 2, and $\nabla_i^2$ is the Laplacian with respect to the coordinates of electron $i$. The Schrödinger equation is thus a function of six variables, the three coordinates for each of the two electrons. (Technically, the electron mass $m_e$ should be the reduced mass $m = m_eM/(m_e + M)$, where $M$ is the mass of the nucleus. But $M \gg m_e$, so that $m \approx m_e$. If this isn't familiar to you, we will treat two-body problems such as this in detail when we discuss identical particles.) Because of the term $e^2/r_{12}$, the Schrödinger equation isn't separable, and we must resort to approximation methods.
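As a numerical cross-check on Example 2.2, the following sketch (in units $e = \mathcal{E} = a_0 = 1$) evaluates the radial integral with the $\int_0^\infty r^n e^{-r}\,dr = n!$ result and then diagonalizes the $4\times 4$ secular matrix, recovering the shifts $\pm 3e\mathcal{E}a_0$ and the equal-weight mixing of $\psi_{200}$ and $\psi_{210}$:

```python
import numpy as np
from math import factorial

# Units e = E(field) = a0 = 1. Radial integral for H'_12:
# H'_12 = (2a0)^{-3} (2/(3a0)) [ 4! a0^5 - 5! a0^6/(2a0) ]  ->  -3
H12 = (2.0 ** -3) * (2.0 / 3.0) * (factorial(4) - factorial(5) / 2.0)

# 4x4 secular problem in the basis {psi200, psi210, psi211, psi21-1}
Hp = np.zeros((4, 4))
Hp[0, 1] = Hp[1, 0] = H12

E1, C = np.linalg.eigh(Hp)      # shifts -3, 0, 0, +3 (times e E a0)
c_shifted = C[:, 0]             # eigenvector of the -3 e E a0 level
mix = np.abs(c_shifted[:2])     # |c1|, |c2| for psi200, psi210
```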
We write H = H0 + H′ 37 where ~2 2 Ze2 ~2 2 Ze2 H 0 = H10 + H20 = − ∇1 − − ∇ − (2.26) 2me r1 2me 2 r2 is the sum of two independent hydrogen atom Hamiltonians, and e2 H′ = . (2.27) r12 We can now use separation of variables to write the unperturbed wave function Ψ(r1 , r2 ) as a product Ψ(r1 , r2 ) = ψ1 (r1 )ψ2 (r2 ) . In this case we have the time-independent equation H 0 Ψ = (H10 + H20 )ψ1 ψ2 = ψ2 H10 ψ1 + ψ1 H20 ψ2 = Eψ1 ψ2 so that dividing by ψ1 ψ2 yields H10 ψ1 H 0 ψ2 =E− 2 . ψ1 ψ2 Since the left side of this equation is a function of r1 only, and the right side is a function of r2 only, each side must in fact be equal to a constant, and we can write E = E1 + E2 where each Ei is the energy of a hydrogenlike wave function: Z 2 e2 Z 2 e2 E1 = − E2 = − n21 2a0 n22 2a0 and a0 is the Bohr radius ~2 a0 = = 0.529 Å . me e 2 In other words, we have the unperturbed zeroth-order energies   2 1 1 e E (0) = −Z 2 + , n1 = 1, 2, . . . , n2 = 1, 2, . . . . (2.28) n21 n22 2a0 Correspondingly, the zeroth-order wave functions are products of the usual hydro- genlike wave functions. The lowest excited states of helium have n1 = 1, n2 = 2 or n1 = 2, n2 = 1. Then from (2.28) we have (for Z = 2)   2 1 1 e E (0) = −22 2 + 2 = −5(13.606 eV) = −68.03 eV . 1 2 2a0 38 For n = 2, the possible values of l are l = 0, 1, and since there are 2l + 1 values of ml , we see that the n = 2 level of a hydrogenlike atom is fourfold degenerate. (This just says that the 2s and 2p states have the same energy.) Thus the first excited unperturbed state of He is eightfold degenerate, and the eight unperturbed wave functions are (0) (0) (0) (0) ψ1 = 1s(1)2s(2) ψ2 = 2s(1)1s(2) ψ3 = 1s(1)2px(2) ψ4 = 2px (1)1s(2) (0) (0) (0) (0) ψ5 = 1s(1)2py (2) ψ6 = 2py (1)1s(2) ψ7 = 1s(1)2pz (2) ψ8 = 2pz (1)1s(2) Here the notation 1s(1)2s(2) means, for example, that electron 1 is in the 1s state and electron 2 is in the 2s state. 
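The zeroth-order energies (2.28) are simple to tabulate numerically. Here is a minimal sketch, assuming only the value $e^2/2a_0 = 13.606$ eV used in the text:

```python
# Zeroth-order two-electron energies from equation (2.28):
#   E0 = -Z**2 * (1/n1**2 + 1/n2**2) * e**2/(2*a0)
# with e**2/(2*a0) = 13.606 eV, the value used in the text.
RY = 13.606  # eV

def E0(Z, n1, n2):
    """Unperturbed energy of the configuration (n1, n2), in eV."""
    return -Z**2 * (1/n1**2 + 1/n2**2) * RY

print(E0(2, 1, 1))   # ground configuration 1s^2:         -108.848 eV
print(E0(2, 1, 2))   # excited configurations 1s2s/1s2p:  about -68.03 eV
```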
I have also chosen to use the real hydrogenlike wave functions $2p_x$, $2p_y$ and $2p_z$, which are defined as linear combinations of the complex wave functions $2p_0$, $2p_1$ and $2p_{-1}$:

$$2p_x := \frac{1}{\sqrt{2}}(2p_1 + 2p_{-1}) = \frac{1}{4\sqrt{2\pi}}\left(\frac{Z}{a_0}\right)^{5/2} r\,e^{-Zr/2a_0}\sin\theta\cos\phi = \frac{1}{4\sqrt{2\pi}}\left(\frac{Z}{a_0}\right)^{5/2} x\,e^{-Zr/2a_0} \qquad (2.29a)$$

$$2p_y := \frac{1}{i\sqrt{2}}(2p_1 - 2p_{-1}) = \frac{1}{4\sqrt{2\pi}}\left(\frac{Z}{a_0}\right)^{5/2} r\,e^{-Zr/2a_0}\sin\theta\sin\phi = \frac{1}{4\sqrt{2\pi}}\left(\frac{Z}{a_0}\right)^{5/2} y\,e^{-Zr/2a_0} \qquad (2.29b)$$

$$2p_z := 2p_0 = \frac{1}{4\sqrt{2\pi}}\left(\frac{Z}{a_0}\right)^{5/2} r\,e^{-Zr/2a_0}\cos\theta = \frac{1}{4\sqrt{2\pi}}\left(\frac{Z}{a_0}\right)^{5/2} z\,e^{-Zr/2a_0} \qquad (2.29c)$$

This is perfectly valid since any linear combination of solutions with a given energy is also a solution with that energy. (However, the $2p_x$ and $2p_y$ functions are not eigenfunctions of $L_z$ since they are linear combinations of eigenfunctions with different values of $m_l$.) These real hydrogenlike wave functions are more convenient for many purposes in constructing chemical bonds and molecular wave functions. In fact, you have probably seen these wave functions in more elementary chemistry courses. For example, a contour plot in the plane (i.e., a cross section) of a real $2p$ wave function is shown in Figure 4 below. (Let $\phi = \pi/2$ in any of equations (2.29).) The three-dimensional orbital is obtained by rotating this plot about the horizontal axis, so we see that the actual shape of a real $2p$ orbital (i.e., a one-electron wave function) is two separated, distorted ellipsoids.

Figure 4: Contour plot in the plane of a real 2p wave function.

It is not hard to verify that the real $2p$ wave functions are orthonormal, and hence the eight degenerate wave functions $\psi_i^{(0)}$ are also orthonormal as required by equation (2.19). The secular determinant contains $8^2 = 64$ elements. However, $H'$ is real, as are the $\psi_i^{(0)}$, so that $H'_{ij} = H'_{ji}$ and the determinant is symmetric about the main diagonal. This cuts the number of integrals almost in half. Even better, by using parity we can easily show that most of the $H'_{ij}$ are zero.
′ 2 Indeed, the perturbing Hamiltonian H = e /r12 is an even function of r since r12 = [(x1 − x2 )2 + (y1 − y2 )2 + (z1 − z2 )2 ]1/2 and this is unchanged if r1 → −r1 and r2 → −r2 . Also, the hydrogenlike s- wave functions depend only on r = |r| and hence are invariant under r → −r. Furthermore, you can see from the above forms that the 2p wave functions are odd under parity since they depend on r and either x, y or z. Hence, since we are integrating over all space, any integral with only a single factor of 2p must vanish: ′ ′ ′ ′ ′ ′ H13 = H14 = H15 = H16 = H17 = H18 =0 and ′ ′ ′ ′ ′ ′ H23 = H24 = H25 = H26 = H27 = H28 = 0. Now consider an integral such as Z ∞ ′ e2 H35 = 1s(1)2px (2) 1s(1)2py (2) dr1 dr2 . −∞ r12 If we let x1 → −x1 and x2 → −x2 , then r12 is unchanged as are 1s(1) and 2py (2). However, 2px (2) changes sign, and the net result is that the integrand is an odd function under this transformation. Hence it is not hard to see that the integral vanishes. This lets us conclude that ′ ′ ′ ′ H35 = H36 = H37 = H38 =0 40 and ′ ′ ′ ′ H45 = H46 = H47 = H48 = 0. Similarly, by considering the transformation y1 → −y1 and y2 → −y2 , it follows that ′ ′ ′ ′ H57 = H58 = H67 = H68 = 0. With these simplifications, the secular equation becomes ′ b11 H12 0 0 0 0 0 0 ′ H12 b22 0 0 0 0 0 0 ′ 0 0 b 33 H 34 0 0 0 0 ′ 0 0 H 34 b 44 0 0 0 0 0 ′ =0 (2.30) 0 0 0 b 55 H 56 0 0 ′ 0 0 0 0 H56 b66 0 0 ′ 0 0 0 0 0 0 b 77 H 78 ′ 0 0 0 0 0 0 H78 b88 where bii = Hii′ − E (1) , i = 1, 2, . . . , 8 . 
Since the secular determinant is in block-diagonal form with 2 × 2 blocks on the diagonal, the same logic that we used in Example 1.3 would seem to tell us that the correct zeroth-order wave functions have the form (0) (0) (0) (0) (0) (0) φ1 = c1 ψ1 + c2 ψ2 φ2 = c̄1 ψ1 + c̄2 ψ2 (0) (0) (0) (0) (0) (0) φ3 = c3 ψ3 + c4 ψ4 φ4 = c̄3 ψ3 + c̄4 ψ4 (0) (0) (0) (0) (0) (0) φ5 = c5 ψ5 + c6 ψ6 φ6 = c̄5 ψ5 + c̄6 ψ6 (0) (0) (0) (0) (0) (0) φ7 = c7 ψ7 + c8 ψ8 φ8 = c̄7 ψ7 + c̄8 ψ8 where the barred and unbarred coefficients distinguish between the two roots of each second-order determinant. However, while that argument applies to the upper 2 × 2 determinant (i.e., the first two equations of the system), it doesn’t apply to the whole determinant in this case. This is because it turns out (as we will see below) that the lower three 2 × 2 determinants are identical. Therefore, their pairs of roots are the same, and all we can say is that there are two six-dimensional eigenspaces. In other words, all we can say is that for each of the two roots and (0) (0) for each n = 3, 4, . . . 8, the function φn will be a linear combination of ψ3 , . . . , (0) ψ8 . However, we can choose any basis we wish for this six-dimensional space, so (0) we choose the three two-dimensional orthonormal φn ’s as shown above. The first determinant is H ′ − E (1) ′ H12 11 =0 (2.31) ′ ′ (1) H12 H22 − E 41 where e2 e2 Z Z ′ H11 = 1s(1)2s(2) 1s(1)2s(2) dr1 dr2 = [1s(1)]2 [2s(2)]2 dr1 dr2 r12 r12 e2 Z ′ H22 = [1s(2)]2 [2s(1)]2 dr1 dr2 . r12 Since the integration variables are just dummy variables, it is pretty obvious that letting r1 ↔ r2 shows that ′ ′ H11 = H22 . Similarly, it is easy to see that ′ ′ ′ ′ ′ ′ H33 = H44 H55 = H66 H77 = H88 . ′ The integral H11 is sometimes denoted by J1s2s and called a Coulomb integral: e2 Z ′ H11 = J1s2s = [1s(1)]2 [2s(2)]2 dr1 dr2 . 
r12 The reason for the name is that this represents the electrostatic energy of repulsion between an electron with the probability density function [1s]2 and an electron with probability density function [2s]2 . The integral H12 ′ is denoted by K1s2s and called an exchange integral: e2 Z ′ H12 = K1s2s = 1s(1)2s(2) 2s(1)1s(2) dr1 dr2 . r12 Here the functions to the left and right of H ′ differ from each other by the exchange of electrons 1 and 2. The general definitions of the Coulomb and exchange integrals are Jij = hfi (1)fj (2)|e2 /r12 |fi (1)fj (2)i Kij = hfi (1)fj (2)|e2 /r12 |fj (1)fi (2)i where the range of integration is over the full range of spatial coordinates of particles 1 and 2, and the functions fi , fj are spatial orbitals. Substituting these integrals into (2.31) we have J1s2s − E (1) K1s2s =0 (2.32) J1s2s − E (1) K1s2s or J1s2s − E (1) = ±K1s2s and hence the two roots are (1) (1) E1 = J1s2s − K1s2s and E2 = J1s2s + K1s2s . 42 (1) Just as in Example 1.3, we substitute E1 back into (2.20a) to write K1s2s c1 + K1s2s c2 = 0 K1s2s c1 + K1s2s c2 = 0 (0) and hence c2 = −c1 . Normalizing φ1 we have (using the orthonormality of the (0) ψi ) (0) (0) (0) (0) (0) (0) 2 2 hφ1 |φ1 i = hc1 ψ1 − c1 ψ2 |c1 ψ1 − c1 ψ2 i = |c1 | + |c2 | = 1 √ (1) so that c1 = 1/ 2. Thus the zeroth-order wave function corresponding to E1 is (0) (0) (0) φ1 = 2−1/2 [ψ1 − ψ2 ] = 2−1/2 [1s(1)2s(2) − 2s(1)1s(2)] . (1) Similarly, the wave function corresponding to E2 is easily found to be (0) (0) (0) φ2 = 2−1/2 [ψ1 + ψ2 ] = 2−1/2 [1s(1)2s(2) + 2s(1)1s(2)] . This takes care of the first determinant in (2.30), but we still have the remaining three to handle. ′ ′ First look at the integrals H33 and H55 : e2 Z ′ H33 = 1s(1)2px(2) 1s(1)2px (2) dr1 dr2 r12 e2 Z ′ H55 = 1s(1)2py (2) 1s(1)2py (2) dr1 dr2 . r12 The only difference between these is the 2p(2) orbital, and the only difference be- tween the 2px and 2py orbitals is their spatial orientation. 
Since the 1s orbitals are spherically symmetric, it should be clear that these integrals are the same. For- ′ mally, in H33 we can change variables by letting x1 → y1 , y1 → x1 , x2 → y2 and ′ ′ y2 → x2 . This leaves r12 unchanged, and transforms H33 into H55 . The same ′ ′ argument shows that H77 = H33 also. Hence we have e2 Z ′ ′ ′ H33 = H55 = H77 = 1s(1)2pz (2) 1s(1)2pz (2) dr1 dr2 := J1s2p . r12 A similar argument shows that we also have equal exchange integrals: e2 Z ′ ′ ′ H34 = H56 = H78 = 1s(1)2pz 2pz (1)1s(2) dr1 dr2 := K1s2p . r12 Thus the remaining three determinants in (2.30) are the same and have the form J1s2p − E (1) K1s2p = 0. (1) K1s2p J1s2p − E 43 But this is the same as (2.32) if we replace 2s by 2p, and hence we can immediately write down the solutions: (1) (1) (1) E3 = E5 = E7 = J1s2p − K1s2p (1) (1) (1) E4 = E6 = E8 = J1s2p + K1s2p and (0) φ3 = 2−1/2 [1s(1)2px (2) − 1s(2)2px (1)] (0) φ4 = 2−1/2 [1s(1)2px (2) + 1s(2)2px (1)] (0) φ5 = 2−1/2 [1s(1)2py (2) − 1s(2)2py (1)] (0) φ6 = 2−1/2 [1s(1)2py (2) + 1s(2)2py (1)] (0) φ7 = 2−1/2 [1s(1)2pz (2) − 1s(2)2pz (1)] (0) φ8 = 2−1/2 [1s(1)2pz (2) + 1s(2)2pz (1)] So what has happened? Starting from the eight degenerate (unperturbed) states (0) ψi that would exist in the absence of electron-electron repulsion, we find that in- cluding this repulsion term splits the degenerate states into two nondegenerate levels associated with the configuration 1s2s, and two triply degenerate levels associated with the configuration 1s2p. Interestingly, going to higher-order energy corrections will not completely remove the degeneracy, and in fact it takes the application of an external magnetic field to do so. In order to evaluate the Coulomb and exchange integrals in the expressions for E (1) we need to use the expansion ∞ X l l 1 X 4π r< = l+1 [Ylm (θ1 , ϕ1 )]∗ Ylm (θ2 , ϕ2 ) (2.33) r12 2l + 1 r> l=0 m=−l where r< means the smaller of r1 and r2 and r> is the larger of these. 
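When both charge densities are spherically symmetric, only the $l = 0$ term $1/r_>$ of the expansion (2.33) survives, and the Coulomb integral reduces to a double radial integral. The following sympy sketch (units with $a_0 = e = 1$, $Z$ symbolic; not the homework solution itself) reproduces the standard result $J_{1s2s} = \frac{17}{81}Z\,e^2/a_0$:

```python
import sympy as sp

r1, r2, Z = sp.symbols('r1 r2 Z', positive=True)

# Hydrogenlike radial functions, in units with a0 = e = 1
R10 = 2 * Z**sp.Rational(3, 2) * sp.exp(-Z*r1)
R20 = (Z**sp.Rational(3, 2)/sp.sqrt(2)) * (1 - Z*r2/2) * sp.exp(-Z*r2/2)

# Both densities are spherical, so only the l = 0 term 1/r_> of the
# expansion (2.33) contributes.  Split the r1 integral at r1 = r2:
inner = (sp.integrate(R10**2 * r1**2, (r1, 0, r2)) / r2
         + sp.integrate(R10**2 * r1, (r1, r2, sp.oo)))
J = sp.simplify(sp.integrate(R20**2 * r2**2 * inner, (r2, 0, sp.oo)))

print(J)   # 17*Z/81
```

The exchange integrals require the higher-$l$ terms of (2.33) as well, which is why they are left to the homework.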
The details of this type of integral are left to the homework, and the results are

$$J_{1s2s} = \frac{17}{81}\frac{Ze^2}{a_0} = 11.42\text{ eV} \qquad\qquad J_{1s2p} = \frac{59}{243}\frac{Ze^2}{a_0} = 13.21\text{ eV}$$

$$K_{1s2s} = \frac{16}{729}\frac{Ze^2}{a_0} = 1.19\text{ eV} \qquad\qquad K_{1s2p} = \frac{112}{6561}\frac{Ze^2}{a_0} = 0.93\text{ eV}$$

where we used $Z = 2$ and $e^2/2a_0 = 13.606$ eV. Recalling that $E^{(0)} = -68.03$ eV we obtain

$$E^{(0)} + E_1^{(1)} = E^{(0)} + J_{1s2s} - K_{1s2s} = -57.8\text{ eV}$$
$$E^{(0)} + E_2^{(1)} = E^{(0)} + J_{1s2s} + K_{1s2s} = -55.4\text{ eV}$$
$$E^{(0)} + E_3^{(1)} = E^{(0)} + J_{1s2p} - K_{1s2p} = -55.7\text{ eV}$$
$$E^{(0)} + E_4^{(1)} = E^{(0)} + J_{1s2p} + K_{1s2p} = -53.9\text{ eV}\,.$$

Figure 5: The first excited levels of the helium atom. The unperturbed level at $E^{(0)} = -68.0$ eV splits into the $1s2s$ levels at $-57.8$ and $-55.4$ eV and the $1s2p$ levels at $-55.7$ and $-53.9$ eV.

(See Figure 5 above.) The first-order energy corrections place the lower $1s2p$ level below the upper $1s2s$ level, which disagrees with the actual helium spectrum. This is due to the neglect of higher-order corrections. Since the electron–electron repulsion is not a small quantity, this is not surprising.

Finally, let us look at the sources of the degeneracy of the original eight zeroth-order wave functions and the reason for the partial lifting of this degeneracy. There are three types of degeneracy to consider: (1) The degeneracy between states with the same $n$ but different values of $l$: the $2s$ and $2p$ functions have the same energy. (2) The degeneracy between wave functions with the same $n$ and $l$ but different values of $m_l$: the $2p_x$, $2p_y$ and $2p_z$ functions have the same energy. (This could just as well have been the $2p_0$, $2p_1$ and $2p_{-1}$ complex functions.) (3) There is an exchange degeneracy between functions that differ only in the exchange of electrons between the orbitals. For example, $\psi_1^{(0)} = 1s(1)2s(2)$ and $\psi_2^{(0)} = 1s(2)2s(1)$ have the same energy. By introducing the electron–electron perturbation $H' = e^2/r_{12}$ we removed the degeneracy associated with $l$ and the exchange degeneracy, but not the degeneracy due to $m_l$.
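The first-order level positions are easy to check numerically from the closed-form fractions and the constants quoted in the text:

```python
# First-order energies of the helium 1s2s and 1s2p levels, using the
# closed-form Coulomb and exchange integrals quoted in the text.
e2_over_a0 = 2 * 13.606   # e^2/a0 in eV
Z = 2

J_1s2s = (17/81)    * Z * e2_over_a0   # Coulomb integrals
J_1s2p = (59/243)   * Z * e2_over_a0
K_1s2s = (16/729)   * Z * e2_over_a0   # exchange integrals
K_1s2p = (112/6561) * Z * e2_over_a0

E0 = -5 * 13.606          # unperturbed first excited energy, eq. (2.28)

for label, J, K in [("1s2s", J_1s2s, K_1s2s), ("1s2p", J_1s2p, K_1s2p)]:
    print(f"{label}: lower {E0 + J - K:.1f} eV, upper {E0 + J + K:.1f} eV")
```

This reproduces the four levels $-57.8$, $-55.4$, $-55.7$ and $-53.9$ eV shown in Figure 5.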
To understand the reason for the lifting of the l degeneracy, realize that a 2s electron has a greater probability than a 2p electron of being closer to the nucleus than a 1s electron, and hence a 2s electron is not as effectively shielded from the nucleus by the 1s electrons as a 2p electron is. Since the energy levels are given by Z 2 e2 E=− 2 n 2a0 we see that a larger nuclear charge means a lower energy, and hence the 2s electron has a lower energy than the 2p electron. This is also evident from the Coulomb integrals, where we see that J1s2s is less than J1s2p . These integrals represent the 45 electrostatic repulsion of their respective charge distributions: when the 2s electron penetrates the 1s charge distribution it only feels a repulsion due to the unpene- trated portion of the 1s distribution. Therefore the 1s-2s electrostatic repulsion is less than the 1s-2p repulsion, and the 1s2s levels lies below the 1s2p levels. So we see that the interelectronic repulsion in many-electron atoms lifts the l degeneracy, and the orbital energies for the same value of n increase with increasing l. To understand the removal of the exchange degeneracy, note that the origi- nal zeroth-order wave functions specified which electron went into which orbital. Since the secular determinant wasn’t diagonal, these couldn’t have been the correct zeroth-order wave functions. In fact, the correct zeroth-order wave functions do not assign a specific electron to a specific orbital, as is evident from the form of each (0) φi . This is a consequence of the indistinguishability of identical particles, and will (0) (0) be discussed at length a little later in this course. Since, for example, φ1 and φ2 have different energies, the exchange degeneracy is removed by using the correct zeroth-order wave functions. 
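Every $2\times 2$ block of the secular problem above has the same structure $\bigl(\begin{smallmatrix} J & K \\ K & J \end{smallmatrix}\bigr)$, with eigenvalues $J \pm K$ and symmetric/antisymmetric eigenvectors. A quick numerical illustration (the $1s2s$ values from the text are used only as sample numbers):

```python
import numpy as np

# Generic 2x2 block of the secular problem: H' = [[J, K], [K, J]].
# Sample numbers (eV) from the 1s2s case; any J, K > 0 behave the same way.
J, K = 11.42, 1.19
H = np.array([[J, K], [K, J]])

vals, vecs = np.linalg.eigh(H)   # eigenvalues returned in ascending order
print(vals)                      # [J - K, J + K]
# Column 0 is (up to sign) the antisymmetric combination (1, -1)/sqrt(2)
# belonging to J - K; column 1 is the symmetric (1, 1)/sqrt(2) for J + K.
print(np.abs(vecs))
```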
2.4 Spin–Orbit Coupling and the Hydrogen Atom Fine Struc- ture The Hamiltonian ~2 ∂ 2 e2   2 ∂ 1 H0 = − + + L2 − (2.34) 2m ∂r2 r ∂r 2mr2 r used to derive the hydrogen atom wave functions ψnlm that we have worked with so far consists of the kinetic energy of the electron plus the potential energy of the Coulomb force binding the electron and proton together. (Recall that in this equation, m is really the reduced mass m = me Mp /(me + Mp ) ≈ me .) While this works very well, the actual Hamiltonian is somewhat more complicated than this. In this section we derive an additional term in the Hamiltonian that is due to a coupling between the orbital angular momentum L and the spin angular momentum S. The discussion that follows is a somewhat heuristic approach to deriving an interaction term that agrees with experiment. You shouldn’t take the physical picture too seriously. However, the basic idea is simple enough. From the point of view of the electron, the moving nucleus (i.e., a proton) generates a current that is the source of a magnetic field B. This current is proportional to the electron’s angular momentum L. The interaction energy of a magnetic moment µ with this magnetic field is −µ · B. Since the magnetic moment of an electron is proportional to its spin S, we see that the interaction energy will be proportional to L · S. With the above disclaimer, the interaction term we are looking for is due to the fact that from the point of view of the electron, the moving hydrogen nucleus (the proton) forms a current, and thus generates a magnetic field. From special relativity, we know that the electric and magnetic fields are related by a Lorentz transformation so that B⊥ = γ(B′ ⊥ + β × E′ ) Bk = B′ k 46 E⊥ = γ(E′ ⊥ − β × B′ ) Ek = E′ k where β = v/c is the velocity of the primed frame with respect to the unprimed frame, γ = (1 − β 2 )−1/2 and ⊥, k refer to the components perpendicular or parallel to β. 
[Figure: the electron rest frame $O$ moving with velocity $\boldsymbol{\beta}$ relative to the proton rest frame $O'$.]

We let the primed frame be the proton rest frame, and note that there is no $\mathbf{B}'$ field in the proton's frame due to the proton itself. Also, if $\beta \ll 1$, then $\gamma \approx 1$ and we then have $\mathbf{B} = \boldsymbol{\beta}\times\mathbf{E}'$ and $\mathbf{E} = \mathbf{E}'$. If $\mathbf{v}$ is the electron's velocity with respect to the lab (or the proton), then $\boldsymbol{\beta} = -\mathbf{v}/c$ so the field felt by the electron is

$$\mathbf{B} = -\frac{\mathbf{v}}{c}\times\mathbf{E}'\,. \qquad (2.35)$$

The electric field $\mathbf{E}'$ due to the proton is

$$\mathbf{E}' = \frac{e}{r^2}\hat{\mathbf{r}} = \frac{e}{r^3}\mathbf{r} \qquad (2.36)$$

where $e > 0$ and $\mathbf{r}$ is the position vector from the proton to the electron. From basic electrodynamics, we know that the energy of a particle with magnetic moment $\boldsymbol{\mu}$ in a magnetic field $\mathbf{B}$ is given by (see the end of this section)

$$W = -\boldsymbol{\mu}\cdot\mathbf{B} \qquad (2.37)$$

so we need to know $\boldsymbol{\mu}$. Consider a particle of charge $q$ moving in a circular orbit. It forms an effective current

$$I = \frac{\Delta q}{\Delta t} = \frac{q}{2\pi r/v} = \frac{qv}{2\pi r}\,.$$

By definition, the magnetic moment has magnitude

$$\mu = \frac{I}{c}\times\text{area} = \frac{qv}{2\pi rc}\cdot\pi r^2 = \frac{qvr}{2c}\,.$$

But the angular momentum of the particle is $L = mvr$, so we conclude that the magnetic moment due to orbital motion is

$$\boldsymbol{\mu}_l = \frac{q}{2mc}\mathbf{L}\,. \qquad (2.38)$$

The ratio of $\mu$ to $L$ is called the gyromagnetic ratio. While the above derivation of (2.38) was purely classical, we know that the electron also possesses an intrinsic spin angular momentum. Let us hypothesize that the electron magnetic moment associated with this spin is of the form

$$\boldsymbol{\mu}_s = -g\frac{e}{2mc}\mathbf{S}\,.$$

The constant $g$ is found by experiment to be very close to 2. (However, the relativistic Dirac equation predicts that $g$ is exactly 2. Higher-order corrections in quantum electrodynamics predict a slightly different value, and the measurement of $g - 2$ is one of the most accurate experimental results in all of physics.) So we now have the electron magnetic moment given by

$$\boldsymbol{\mu}_s = -\frac{e}{mc}\mathbf{S} \qquad (2.39)$$

and hence the interaction energy of the electron with the magnetic field of the proton is (using equations (2.35) and (2.36))

$$W = -\boldsymbol{\mu}_s\cdot\mathbf{B} = +\frac{e}{mc}\,\mathbf{S}\cdot\mathbf{B} = -\frac{e^2}{mc^2 r^3}\,\mathbf{S}\cdot(\mathbf{v}\times\mathbf{r}) = \frac{e^2}{m^2c^2r^3}\,\mathbf{S}\cdot(\mathbf{r}\times\mathbf{p})$$

or

$$W = \frac{e^2}{m^2c^2r^3}\,\mathbf{S}\cdot\mathbf{L}\,.$$
(2.40) m 2 c2 r 3 Alternatively, we can write W in another form as follows. If we assume that the electron moves in a spherically symmetric potential field, then the force −eE on the electron may be written as the negative gradient of this potential energy: dV r dV −eE = −∇V (r) = − r̂ = − . dr r dr Using this in (2.35) we have v 1 dV 1 1 dV B=− ×r = r×p c er dr mc er dr and hence e 1 dV W = −µs · B = S · (r × p) m 2 c2 er dr or 1 1 dV W = S · L. (2.41) m2 c2 r dr However, we have made one major mistake. The classical equation that leads to (2.37) is dL =N=µ×B (2.42) dt where L is the angular momentum of the particle in its rest frame, N is the applied torque, and B is the magnetic field in that frame. But this only applies if the 48 electron’s rest frame isn’t rotating. If it is, then the left side of this equation isn’t valid (i.e., it isn’t equal to only the applied torque), and we must use the correct (operator) expression from classical mechanics:     d d = +ω× . (2.43) dt lab dt rot (If you don’t know this result, I will derive it at the end of this section so you can see what is going on and why.) For the electron, (2.42) gives dS/dt in the lab frame, so in the electron’s frame we must use     dS dS = − ωT × S (2.44) dt rot dt lab where ωT is called the Thomas precessional frequency. Thus we see that the change in the spin angular momentum of the electron, (dS/dt)rot , is given by the change due to the applied torque µ × B minus an effect due to the rotation of the coordinate system:   dS e = µ × B − ωT × S = − S × B + S × ωT dt rot mc or     dS eB =S× − + ωT . (2.45) dt rot mc This is the analogue of (2.42), so the analogue of (2.37) is   eB e W = −S · − + ωT = S · B − S · ωT . (2.46) mc mc Note that the first term is what we already calculated in equation (2.40). What we need to know is the Thomas factor S·ω T . This is not a particularly easy calculation to do exactly, so we will give a very simplified derivation. 
(See Jackson, Classical Electrodynamics, Chapter 11 if you want a careful derivation.) Basically, Thomas precession can be attributed to time dilation, i.e., observers on the electron and proton disagree on the time required for one particle to make a revolution about the other. Let T be the time required for a revolution according to the electron, and let it be T ′ according to the proton. Then T ′ = γT where γ = (1 − β 2 )−1/2 . (Note that a circular orbit means an acceleration, so even this isn’t really correct.) Then the electron and proton each measure orbital angular velocities of 2π/T and 2π/T ′ respectively. To the electron, its spin S maintains its direction in space, but to the proton, it appears to precess at a rate equal to the difference in angular velocities, or     2π 2π 1 1 γ 1 ωT = − ′ = 2π − ′ = 2π − T T T T T′ T′ 2π β 2   2π = ′ (1 − β 2 )−1/2 − 1 ≈ ′ . T T 2 49 But in general we know that ω = v/r and hence 2π v mvr L ′ = = 2 = T r mr mr2 and therefore L β2 L v2 1 L 1 mv 2 ωT = = = . mr2 2 mr2 2c2 2 m 2 c2 r r We also know that F = ma, where for circular motion we have an inward directed acceleration a = v 2 /r. Since F = −∇V , we have mv 2 dV F=− r̂ = − r̂ r dr and we can write 1 1 1 dV ωT = L. (2.47) 2 m2 c2 r dr From this we see that S · ω T is just one-half the energy given by equation (2.41), and equation (2.46) shows that it is subtracted off. Therefore the correct spin–orbit energy is given by 1 1 dV W = L·S (2.48a) 2m2 c2 r dr or, from (2.40) with a slight change of notation, e2 Hso = L ·S. (2.48b) 2m2 c2 r3 Calculating the spin–orbit interaction energy Eso by finding the eigenfunctions and eigenvalues of the Hamiltonian H = H 0 + Hso is a difficult problem. Since the effect of Hso is small compared to H 0 (at least for the lighter atoms), we will estimate the value of Eso by using first-order perturbation theory. 
Then first-order energy shifts for the hydrogen atom will be the integrals (1) Eso ≈ hΨ|Hso Ψi where the hydrogen atom wave functions including spin are of the form Ψ = Rnl (r)Ylm (θ, ϕ)χ(s) . From J = L + S, we have J 2 = L2 + S 2 + 2L · S so that 1 2 L·S = (J − L2 − S 2 ) . (2.49) 2 Note that neither L nor S separately commutes with L · S, but you can easily show that J = L + S does in fact commute with L · S. Because of this, we can choose our states to be simultaneous eigenfunctions of J 2 , Jz , L2 and S 2 , all of which commute with H. 50 Since Ylm is an eigenfunction of Lz and χ is an eigenfunction of Sz , the wave function Ylm χ is an eigenfunction of Jz = Lz + Sz but not of J 2 . However, by the usual addition of angular momentum problem, in this case L and S, we can construct simultaneous eigenfunctions ψ of J 2 , Jz , L2 and S 2 . In this case we have s = 1/2, so we know that the resulting possible j values are j = l − 1/2, l + 1/2. The reason we want to do this is because there are 2(2l + 1) degenerate levels for a given n and l, where the additional factor of 2 comes from the two possible spin orientations. Let us assume that we have constructed these eigenfunctions, and we now denote the hydrogen atom wave functions by Ψ = Rnl (r)ψ(θ, ϕ, s) where, by (2.49) ~2 L · Sψ = [j(j + 1) − l(l + 1) − s(s + 1)]ψ 2 ~2   3 = j(j + 1) − l(l + 1) − ψ . 2 4 Using this, our first-order energy estimate becomes  2  (1) e 1 Eso ≈ Rnl ψ 2 2 3 L · S Rnl ψ 2m c r e 2 ~2    3 1 = j(j + 1) − l(l + 1) − R R nl 3 nl (2.50) 4m2 c2 4 r where     1 1 Rnl ψ 3 Rnl ψ = Rnl 3 Rnl r r because hψ|ψi = 1. The integral in (2.50) is not at all hard to do if you use some clever tricks. I will show how to do it at the end of this section, and the answer is   1 1 Rnl 3 Rnl = 3 3 (2.51) r a0 n l(l + 1/2)(l + 1) where the Bohr radius is ~2 ~ a0 = = (2.52) me2 mcα and the fine structure constant is e2 1 α= ≈ . 
(2.53) ~c 137 Note that for l = 0 we also have L · S = 0 anyway, so there is no spin–orbit energy. 51 Recall that the energy corresponding to H 0 is me4 mc2 α2 En(0) = − 2 2 =− . (2.54a) 2~ n 2n2 or (0) E1 −13.6 eV En(0) = 2 = . (2.54b) n n2 Combining (2.50) and (2.51) we have e 2 ~2   (1) [j(j + 1) − l(l + 1) − 3/4] Eso = 4m2 c2 a30 n3 l(l + 1/2)(l + 1) (0) 2  En α [j(j + 1) − l(l + 1) − 3/4]  = . (2.55) 2n l(l + 1/2)(l + 1) Since j = l ± 1/2, this gives us the two corrections to the energy (0) 2  En α 1  (1) Eso = for j = l + 1/2 and l 6= 0 (2.56a) n (2l + 1)(l + 1) (0) 2  En α 1  (1) Eso = − for j = l − 1/2 and l 6= 0 . (2.56b) n l(2l + 1) There is yet another correction to the hydrogen atom energy levels due to the relativistic contribution to the kinetic energy of the electron. The kinetic energy is really the difference between the total relativistic energy E = (p2 c2 + m2 c4 )1/2 and the rest energy mc2 . To order p4 this is p2 p4 T = (p2 c2 + m2 c4 )1/2 − mc2 ≈ − . 2m 8m3 c2 Since the Hamiltonian is the sum of kinetic and potential energies, we see from this that the term p4 Hrel = − (2.57) 8m3 c2 may be treated as a perturbation to the states ψnlm . While the states ψnlm are in general degenerate, in this case we don’t have to worry about it. The reason is that Hrel is rotationally invariant, so it’s already diagonal in the ψnlm basis, and that is precisely what the zeroth-order wavefunc- (0) tions ϕn accomplish (see equation (2.22)). Therefore we can use simple first-order perturbation theory so that (1) 1 Erel = − hψnlm |p4 |ψnlm i . 8m3 c2 Using H 0 = p2 /2m − e2 /r we can write  2 2 2 e2  4 2 p 2 0 p = 4m = 4m H + 2m r 52 and therefore      (1) 1 (0) 2 (0) 2 1 4 1 Erel = − (En ) + 2En e + e 2mc2 r r2 where h·i is shorthand for hψnlm | · |ψnlm i. 
These integrals are not hard to evaluate (see the end of this section), and the result (in different forms) is (0)  (En )2  (1) 4n Erel =− −3+ 2mc2 l + 1/2 (0) 2  En α 3 n  =− − + (2.58) n2 4 l + 1/2   1 3 1 = − mc2 α4 − 4 + 3 . 2 4n n (l + 1/2) Adding equations (2.56) and (2.58) we obtain the fine structure energy shift mc2 α4   (1) 3 1 Efs = − − + 2n3 4n j + 1/2 (0) 2  (2.59) En α 3 n  =− − + n2 4 j + 1/2 which is valid for both j = l ± 1/2. This is the first-order energy correction due to the “fine structure Hamiltonian” Hfs = Hso + Hrel . (2.60) 2.4.1 Supplement: Miscellaneous Proofs Now let’s go back and prove several miscellaneous results stated in this section. The first thing we want to show is that the energy of a magnetic moment in a uniform magnetic field is given by −µ · B where µ for a loop of area A carrying current I is defined to have magnitude IA and pointing perpendicular to the loop in the direction of your thumb if the fingers of your right hand are along the direction of the current. To see this, we simply calculate the work required to rotate a current loop from its equilibrium position to the desired orientation. Consider Figure 6 below, where the current flows counterclockwise out of the page at the bottom and into the page at the top. Let the loop have length a on the sides and b across the top and bottom, so its area is ab. The magnetic force on a current-carrying wire is Z FB = Idl × B and hence the forces on the opposite “a sides” of the loop cancel, and the force on the top and bottom “b sides” is FB = IbB. The equilibrium position of the loop is 53 B µ a/2 θ θ θ FB FB B a/2 B Figure 6: A current loop in a uniform magnetic field horizontal, so the potential energy of the loop is theR work required to rotate it from θ = 0 to some value θ. This work is given by W = F · dr where F is the force that I must apply against the magnetic field to rotate the loop. 
Since the loop is rotating, the force I must apply at the top of the loop is in the direction of µ and perpendicular to the loop, and hence has magnitude FB cos θ. Then the work I do is (the factor of 2 takes into account both the top and bottom sides) Z Z Z θ W = F · dr = 2 FB cos θ(a/2)dθ = IabB cos θ dθ = µB sin θ . 0 But note that µ · B = µB cos(90 + θ) = −µB sin θ, and therefore W = −µ · B . (2.61) In this derivation, I never explicitly mentioned the torque on the loop due to B. However, we see that kNk = kr × FB k = 2(a/2)FB sin(90 + θ) = IabB sin(90 + θ) = µB sin(90 + θ) = kµ × Bk and therefore N = µ× B. (2.62) R Note that W = kNk dθ. Next I will prove equation (2.43). Let A be a vector as seen in both the rotating and lab frames, and let {ei } be a fixed basis in the rotating frame. Then (using the summation convention) A = Ai ei so that dA d dAi dei = (Ai ei ) = ei + Ai . dt dt dt dt 54 Now (dAi /dt)ei is the rate of change of A with respect to the rotating frame, so we have dAi   dA ei = . dt dt rot And ei is a fixed basis vector in the frame that is rotating with respect to the lab frame. Then, just like any vector rotating in the lab with angular velocity ω, we have dei = ω × ei . dt (See the figure below. Here ω = dφ/dt, and dv = v sin θ dφ so dv/dt = v sin θ ω or dv/dt = ω × v.) ω dφ dv θ v Then dei Ai = Ai ω × ei = ω × Ai ei = ω × A . dt Putting this all together we have     dA dA = + ω × A. dt lab dt rot Equation (2.43) is just the ‘operator’ version of this result. Finally, let me show how to evaluate the integrals h1/ri, h1/r2 i and h1/r3 i where the expectation values are taken with respect to the hydrogen atom wave functions ψnlm . First, instead of h1/ri, consider hλ/ri. This can be interpreted as the first-order correction to the energy due to the perturbation λ/r. But H 0 = T + V = T − e2 /r, so H = H 0 + H ′ = H 0 + λ/r = T − (e2 − λ)/r, and this is just our original problem if we replace e2 by e2 − λ everywhere. 
In particular, the exact energy solution is then m(e2 − λ)2 me4 me2 m En (λ) = − 2 2 = − 2 2 + λ 2 2 − λ2 2 2 . 2~ n 2~ n ~ n 2~ n But another way of looking at this is as the expansion of En (λ) given in (2.3b): λ2 d2 En     (0) dEn En = En + λ + + ··· dλ λ=0 2! dλ2 λ=0 = En(0) + λEn(1) + λ2 En(2) + · · · 55 (1) where the first-order correction En = hH ′ i is just the term linear in λ. Therefore, letting λ → 1, we have h1/ri = hH ′ i = me2 /~2 n2 or   1 1 = . (2.63) r a0 n 2 (1) Note that if you have the exact solution En (λ), you can obtain En by simply evaluating λ(dEn /dλ)λ=0 . Before continuing, let me rewrite the hydrogen atom Hamiltonian as follows: ~2 ∂ 2 e2   2 ∂ 1 H0 = − 2 + + 2 L2 − 2m ∂r r ∂r 2mr r (2.64) p2r L2 e2 = + − 2m 2mr2 r where I have defined the “radial momentum” pr by   ∂ 1 pr = −i~ + . ∂r r Now consider hλ/r2 i. Again, letting H = H 0 + H ′ = H 0 + λ/r2 , we can still solve the problem exactly because all we are doing is modifying the centrifugal term L2 L2 + 2mλ ~2 l(l + 1) + 2mλ ~2 l′ (l′ + 1) 2 → 2 → 2 = 2mr 2mr 2mr 2mr2 where l′ = l′ (λ) is a function of λ. (Just write ~2 l′ (l′ + 1) = ~2 l(l + 1) + 2mλ and use the quadratic formula to find l′ as a function of λ.) Recall that the exact energies were defined by me4 me4 En = − 2 2 =− 2 2~ n 2~ (k + l + 1)2 where k = 0, 1, 2, . . . was the integer that terminated the power series solution of the radial equation. Now what we have is me4 E(l′ ) = − = E(λ) = E (0) + λE (1) + · · · 2~2 (k + l′ + 1)2 where (note λ = 0 implies l′ = l) dl′ (1) dE dE E = = . dλ λ=0 dλ l′ =l dl′ l′ =l Then from the explicit form of E(l′ ) and the definition of n we have me4 me4 dE ′ = 2 3 = 2 3 dl l′ =l ~ (k + l + 1) ~ n 56 and taking the derivative of ~2 l′ (l′ + 1) = ~2 l(l + 1) + 2mλ with respect to λ yields dl′ 2m 1 m 1 = 2 = 2 . dλ l′ =l ~ 2l + 1 ~ (l + 1/2) Therefore (me2 /~2 )2 E (1) = (l + 1/2)n3 and hλ/r2 i = λE (1) so that   1 1 = . (2.65) r2 a20 (l + 1/2)n3 The last integral to evaluate is h1/r3 i. 
Since there is no term in H 0 that goes like 1/r3 , we have to try something else. Note that H 0 ψnlm = En ψnlm so that h[H 0 , pr ]i = hψnlm |H 0 pr − pr H 0 |ψnlm i = En hpr i − hpr iEn = 0 . Using     1 ∂ 1 1 ∂ 2 , = 2 and , = 3 r ∂r r r2 ∂r r (recall [ab, c] = a[b, c] + [a, c]b), it is easy to use (2.64) and show that i~ L2 i~e2 [H 0 , pr ] = − + . m r3 r2 But now i~ L2     1 0 = h[H 0 , pr ]i = − + i~e 2 m r3 r2 i~3 l(l + 1) 1     2 1 =− + i~e m r3 r2 and therefore me2     1 1 = 2 r3 ~ l(l + 1) r2 or     1 1 1 = . (2.66) r3 a0 l(l + 1) r2 Combining this with (2.65) we have   1 1 = 3 (2.67) r3 a0 l(l + 1)(l + 1/2)n3 57 2.5 The Zeeman Effect In the previous section we studied the effect of an atomic electron’s magnetic mo- ment interacting with the magnetic field generated by the nucleus (a proton). In this section, I want to investigate what happens when a hydrogen atom is placed in a uniform external magnetic field B. These types of interactions are generally referred to as the Zeeman effect, and they were instrumental in the discovery of spin. (Pieter Zeeman and H.A. Lorentz shared the second Nobel prize in physics in 1902. For a very interesting summary of the history of spin, read Chapter 10 in the text Quantum Mechanics by Hendrik Hameka.) The hydrogen atom Hamiltonian, including fine structure, is given by H = H 0 + Hfs = H 0 + Hso + Hrel where ~2 ∂ 2 e2   2 ∂ 1 H0 = − 2 + + 2 L2 − (equation (2.34)) 2m ∂r r ∂r 2mr r e2 Hso = L·S (equation (2.48b)) 2m2 c2 r3 p4 Hrel = − (equation (2.57)) . 8m3 c2 (And where I’m approximating the reduced mass by the electron mass me .) The easy way to include the presence of an external field B is to simply add an interaction energy Hmag = −µtot · B where, from equations (2.38) and (2.39), we know that the total magnetic moment for a hydrogenic electron is e e µtot = µl + µs = − (L + 2S) = − (J + S) . 
µtot = µl + µs = −(e/2me c)(L + 2S) = −(e/2me c)(J + S) .   (2.68)

However, the correct way to arrive at this is to rewrite the Hamiltonian taking into account the presence of an electromagnetic field. For those who are interested, I work through this approach at the end of this section. In any case, the Hamiltonian for a hydrogen atom in an external uniform magnetic field is then

H = H0 + Hso + Hrel + Hmag .

There are really three cases to consider. (I'll ignore Hrel for now because it's a correction to the kinetic energy and irrelevant to this discussion.) The first is when B is strong enough that Hmag is large relative to Hso. In this case we can treat Hso as a perturbation on the states defined by H0 + Hmag, where these states are simultaneous eigenfunctions of L², S², Lz and Sz (rather than J² and Jz). The reason that J is not a good quantum number is that the external field exerts a torque µtot × B on the total magnetic moment, and this is equivalent to a changing total angular momentum dJ/dt. Thus J is not conserved, and in fact precesses about B. In addition, if there is a spin–orbit interaction, then this internal field causes L and S to precess about J.

The second case is when B is weak and Hso dominates Hmag. In this situation, Hmag is treated as a perturbation on the states defined by H0 + Hso. As we saw in our discussion of Hso, in this case we must choose our states to be eigenfunctions of L², S², J² and Jz, because L and S are not conserved separately, even though J = L + S is conserved. (Neither L nor S alone commutes with L·S, but [Ji, L·S] = 0 and hence J² commutes with H.)

The third and most difficult case is when Hso and Hmag are of roughly equal strength. In this "intermediate-field" situation, we must take them together and use degenerate perturbation theory to break the degeneracies of the basis states.
2.5.1 Strong External Field Let us first consider the case where the external magnetic field is much stronger than the internal field felt by the electron and due to its orbital motion. Taking B = Bẑ we have eB Hmag = (Lz + 2Sz ) . (2.69) 2me c If we first ignore spin, then the first-order correction to the hydrogen atom energy levels is   (1) eB e~ Enlm = ψnlm Lz ψnlm = Bm := µB Bm 2me c 2me c where e~ µB = = 5.79 × 10−9 eV/gauss = 9.29 × 10−21 erg/gauss 2me c is called the (electron) Bohr magneton. Thus we see that for a given l, the (2l+1)- fold degeneracy is lifted. For example, the 3-fold degenerate l = 1 state is split into three states, with an energy difference of µB B between states: m=1 µB B l=1 m=0 µB B m = −1 This strong field case is sometimes called the Paschen-Back effect. If we now include spin, then (1) Enlml ms = µB B(ml + 2ms ) (2.70) where ms = ±1/2. This yields the further splitting (or lifting of degeneracies) sometimes called the anomalous Zeeman effect: 59 ms = 1/2 µB B ml = 1 ms = 1/2 l=1 ml = 0 ms = 1/2, −1/2 ml = −1 ms = −1/2 ms = −1/2 (0) (1) (0) This gives us the energy levels En + Enlml ms where En is given by (2.54a). However, since the basis states we used here are just the usual hydrogen atom wave functions, it is easy to include further corrections due to both Hso and the relativistic correction Hrel discussed in Section 2.4. We simply apply first-order perturbation theory using these as the perturbing potentials. For Hrel , we can simply use the result (2.58). However, we can’t just use equations (2.56) for Hso because they were derived using the eigenfunctions of J 2 which don’t apply when there is a strong external magnetic field. To get around this problem, we simply calculate hψnlml ms |L · S|ψnlml ms i. We have L · S = L x Sx + L y Sy + L z Sz where Lx = (L+ + L− )/2 and Ly = (L+ − L− )/2i with similar results for Sx and Sy . 
Using these, it is quite easy to see that the orthogonality of the eigenfunctions yields hψ|Lx Sx |ψi = hψ|Ly Sy |ψi = 0 while hψnlml ms |Lz Sz |ψnlml ms i = ~2 ml ms . (2.71) Combining the results for Hrel and Hso we obtain the following corrections to (0) (1) the “unperturbed” energies En + Enlml ms : mc2 α4 3 e2   (1) (1) 1 1 Erel + Eso = − + ~ 2 ml ms 3 3 2n3 4n l + 1/2 2m2 c2 a0 n l(l + 1)(l + 1/2) where we used equations (2.58), (2.48b), (2.67) and (2.71). After a little algebra, which I leave to you, we arrive at me4 α2    (1) (1) 3 l(l + 1) − ml ms Erel + Eso = − 2~2 n3 4n l(l + 1)(l + 1/2) 2    (0) α 3 l(l + 1) − ml ms = −E1 3 − . (2.72) n 4n l(l + 1)(l + 1/2) 60 2.5.2 Weak External Field Now we turn to the second case where the external field is weak relative to the spin–orbit term. As we discussed above, now we must take our basis states to be eigenfunctions of L2 , S 2 , J 2 and Jz . For a many-electron atom, there P are basicallyPtwo ways to calculate the total J. The first way is to calculate L = Li and S = Si and then evaluate J = L + S. This is called L–S or Russel-Saunders coupling. It is applicable to the lighter elements where interelectronic repulsion energies are significantly greater than the spin–orbit interaction energies. This is because if the spin–orbit coupling is weak, then L and S “almost” commute with H 0 + Hso . P The second way is to first calculate Ji = Li + Si so that J = Ji . This is called j–j coupling. It is used for heavier elements where the electrons are moving very rapidly, and hence there is a strong spin–orbit interaction. Because of this, L and S no longer commute with H, even though J does so. This type of coupling is also more difficult to use, so we will deal only with the L–S scheme. Here is the physical situation: B S S L J −µtot ∼ J + S = L + 2S Since J commutes with H 0 + Hso , it is conserved (and hence is fixed in space), even though L and S are not. This means that L and S both precess about J. 
If the applied external B field is much weaker than the internal field, then J will precess much more slowly about B than L and S precess about J. We need to evaluate the correction (2.69) in first-order perturbation theory. Since our basis states are eigenfunctions of J 2 and Jz but not Lz and Sz , we can’t directly evaluate the expectation value of Lz + 2Sz = Jz + Sz . The correct way to handle this is to use the Wigner-Eckart theorem, which is rather beyond the scope of this course. Instead, we will use a physical argument that gets us to the same answer. We note that since L and S (and hence µtot ) precess rapidly about J, the time av average of the Hamiltonian Hmag = −hµtot · Bi will be the same as −hµtot i · B. But 61 the average of µtot is just its component along J, which is µtot · J hµtot i = (µtot · Ĵ)Ĵ = J. J2 Using L = J − S so that L2 = J 2 + S 2 − 2S · J we have 1 (J + S) · J = J 2 + S · J = J 2 + (J 2 + S 2 − L2 ) . 2 Then since B = Bẑ, we now have av eB (J + S) · J Hmag = −Bhµtot i · ẑ = Jz 2me c J2 J 2 + S 2 − L2   eBJz = 1+ . 2me c 2J 2 Our basis states are simultaneous eigenstates of L2 , S 2 , J 2 and Jz , so the average av energy Emag is given by the first-order correction   av e~Bmj j(j + 1) + s(s + 1) − l(l + 1) Emag = 1+ 2me c 2j(j + 1)   e~Bmj 3 3/4 − l(l + 1) (2.73) = + 2me c 2 2j(j + 1) := µB Bmj gJ where the Landé g-factor gJ is defined by j(j + 1) + s(s + 1) − l(l + 1) gJ = 1 + . 2j(j + 1) The total energy of a hydrogen atom in a uniform magnetic field is now given (0) by the sum of the ground state energy En (equation (2.54a)), the fine-structure (1) av correction Ef s (equation (2.59)) and Emag (equation (2.73)). 2.5.3 Intermediate-Field Case Finally, we consider the intermediate-field case where the internal and external magnetic fields are approximately the same. In this situation, we must apply de- generate perturbation theory to the degenerate “unperturbed” states ψnlml ms by treating H ′ = Hfs + Hmag as a perturbation. 
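The Landé g-factors following from (2.73) are easy to tabulate. Here is a short sketch for the hydrogen n = 2 levels (the term labels in the comments are standard spectroscopic notation):

```python
from fractions import Fraction as F

def g_lande(j, l, s=F(1, 2)):
    """Lande g-factor g_J = 1 + [j(j+1) + s(s+1) - l(l+1)] / (2 j (j+1))."""
    return 1 + (j*(j+1) + s*(s+1) - l*(l+1)) / (2*j*(j+1))

# Hydrogen n = 2 levels:
assert g_lande(F(1, 2), 0) == 2          # 2S_1/2
assert g_lande(F(1, 2), 1) == F(2, 3)    # 2P_1/2
assert g_lande(F(3, 2), 1) == F(4, 3)    # 2P_3/2
```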
It is easiest to simply work out an example. As we saw in our discussion of spin–orbit coupling, it is best to work in the basis in which our states are simultaneous eigenstates of L², S², J² and Jz. (The choice of basis has no effect on the eigenvalues of Hfs + Hmag, and those eigenvalues are just what we are looking for when we solve (2.21).) Let us consider the hydrogen atom states with n = 2, so that l = 0, 1. Since s = 1/2, the possible j values follow from

0 ⊗ 1/2 + 1 ⊗ 1/2 = 1/2 + (3/2 ⊕ 1/2)

or j = 1/2, 3/2, 1/2. Our basis states |l s j mj⟩ are given in terms of the states |l s ml ms⟩ using the appropriate Clebsch–Gordan coefficients (which you can look up or calculate for yourself). For l = 0 we have j = 1/2, so mj = ±1/2 and we have the two states

ψ1 := |0 1/2 1/2 1/2⟩ = |0 1/2 0 1/2⟩
ψ2 := |0 1/2 1/2 −1/2⟩ = |0 1/2 0 −1/2⟩

where the first ket in each line is the state |l s j mj⟩, and the right-hand side is the linear combination of states |l s ml ms⟩ with Clebsch–Gordan coefficients. (For l = 0 the C–G coefficients are just 1.) For l = 1 we have the four states with j = 3/2 and the two states with j = 1/2 (which we order with a little hindsight so the determinant (2.21) turns out block diagonal):

ψ3 := |1 1/2 3/2 3/2⟩ = |1 1/2 1 1/2⟩
ψ4 := |1 1/2 3/2 −3/2⟩ = |1 1/2 −1 −1/2⟩
ψ5 := |1 1/2 3/2 1/2⟩ = √(2/3)|1 1/2 0 1/2⟩ + √(1/3)|1 1/2 1 −1/2⟩
ψ6 := |1 1/2 1/2 1/2⟩ = −√(1/3)|1 1/2 0 1/2⟩ + √(2/3)|1 1/2 1 −1/2⟩
ψ7 := |1 1/2 3/2 −1/2⟩ = √(1/3)|1 1/2 −1 1/2⟩ + √(2/3)|1 1/2 0 −1/2⟩
ψ8 := |1 1/2 1/2 −1/2⟩ = −√(2/3)|1 1/2 −1 1/2⟩ + √(1/3)|1 1/2 0 −1/2⟩ .

Now we need to evaluate the matrices of Hfs = Hso + Hrel and Hmag in the |j mj⟩ basis {ψi}. Since Hrel ∼ p⁴, it's already diagonal in the |j mj⟩ basis. And since Hso ∼ S·L = (1/2)(J² − L² − S²), it's also diagonal in the |j mj⟩ basis. Therefore Hfs is diagonal and its contribution is given by (2.59):

⟨j mj|Hfs|j mj⟩ = −(En⁽⁰⁾α²/n²)[n/(j + 1/2) − 3/4] = −(E1⁽⁰⁾α²/16)[2/(j + 1/2) − 3/4]

where I used En⁽⁰⁾ = E1⁽⁰⁾/n² and let n = 2.
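The Clebsch–Gordan coefficients quoted above for ψ5 and ψ6 can be checked with sympy, whose CG(j1, m1, j2, m2, j3, m3) is ⟨j1 m1; j2 m2|j3 m3⟩ in the Condon–Shortley convention:

```python
from sympy import S, sqrt
from sympy.physics.quantum.cg import CG

# Clebsch-Gordan coefficients for l = 1 coupled to s = 1/2.
half = S(1) / 2
# psi_5 = |3/2, 1/2> = sqrt(2/3)|ml=0, ms=+1/2> + sqrt(1/3)|ml=1, ms=-1/2>
assert CG(1, 0, half, half, S(3)/2, half).doit() == sqrt(S(2)/3)
assert CG(1, 1, half, -half, S(3)/2, half).doit() == sqrt(S(1)/3)
# psi_6 = |1/2, 1/2> = -sqrt(1/3)|ml=0, ms=+1/2> + sqrt(2/3)|ml=1, ms=-1/2>
assert CG(1, 0, half, half, half, half).doit() == -sqrt(S(1)/3)
assert CG(1, 1, half, -half, half, half).doit() == sqrt(S(2)/3)
```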
For states with j = 1/2, this gives a contribution (0) 5 E1 α2 hψi |Hfs |ψi i = − := −5ξ for i = 1, 2, 6, 8 (2.74a) 64 63 and for states with j = 3/2 this is (0) 2 E α 1 hψi |Hfs |ψi i = − := −ξ for i = 3, 4, 5, 7 . (2.74b) 64 Next, we easily see that the first four states ψ1 –ψ4 are eigenstates of Hmag ∼ Lz + 2Sz (since they each contain only a single factor |l s ml ms i). Hence Hmag is already diagonal in this 4 × 4 block, and so contributes the diagonal terms hψi |Hmag |ψi i = µB B(ml + 2ms ) := β(ml + 2ms ) for i = 1, 2, 3, 4 . For the remaining four states ψ5 –ψ8 we must explicitly evaluate the matrix elements. For example, (r  r ) µB B 2 1 1 1 1 1 Hmag |ψ5 i = (Lz + 2Sz ) 1 0 + 1 1− ~ 3 2 2 3 2 2 ( r  r ) 2 1 1 1 1 1 = µB B 1· 1 0 +0· 1 1− 3 2 2 3 2 2 r  2 1 1 = µB B 1 0 3 2 2 and therefore (using the orthonormality of the states |l s ml ms i) 2 2 hψ5 |Hmag |ψ5 i = µB B := β 3 3 and √ √ 2 2 hψ6 |Hmag |ψ5 i = hψ5 |Hmag |ψ6 i = − µB B := − β. 3 3 Also, * r + 1 1 1 1 1 hψ6 |Hmag |ψ6 i = ψ6 − µB B 1 0 = µB B := β . 3 2 2 3 3 Since all other matrix elements with ψ5 and ψ6 vanish, there is a 2 × 2 block corresponding to the subspace spanned by ψ5 and ψ6 . Similarly, there is a 2 × 2 block corresponding to the subspace spanned by ψ7 and ψ8 with 2 hψ7 |Hmag |ψ7 i = − β 3 √ 2 hψ8 |Hmag |ψ7 i = hψ7 |Hmag |ψ8 i = − β 3 1 hψ8 |Hmag |ψ8 i = − β . 3 64 Combining all of these matrix elements, the matrix of H ′ = Hfs + Hmag used in (2.21) becomes 2 −5ξ + β 3 −5ξ − β 6 7 6 7 6 7 −ξ + 2β 6 7 6 7 6 7 −ξ − 2β 6 7 6 7 6 √ 7 7. −ξ + 32 β 2 6 6 6 − 3 β 7 7 6 √ 7 − 32 β −5ξ + 13 β 6 7 6 7 6 √ 7 −ξ − 23 β 2 6 7 6 4 − 3 β 7 5 √ 2 − 3 β −5ξ − 31 β Now we need to find the eigenvalues of this matrix (which are the first-order energy corrections). Since it’s block diagonal, the first four diagonal entries are precisely the first four eigenvalues. For the remaining four eigenvalues, we must diagonalize the two 2 × 2 submatrices. 
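As a numerical sanity check on this block structure, the eigenvalues of the {ψ5, ψ6} submatrix can be compared against the closed-form roots λ± = −3ξ + β/2 ± (4ξ² + (2/3)ξβ + β²/4)^(1/2) of its characteristic polynomial (the values of ξ and β below are arbitrary test values):

```python
import numpy as np

# Eigenvalues of the {psi_5, psi_6} 2x2 block of H' = Hfs + Hmag versus
# the quadratic-formula roots lambda_pm.
xi, beta = 1.3, 0.7   # arbitrary positive test values
block = np.array([[-xi + 2*beta/3,     -np.sqrt(2)*beta/3],
                  [-np.sqrt(2)*beta/3, -5*xi + beta/3]])
roots = -3*xi + beta/2 + np.array([-1, 1]) * np.sqrt(4*xi**2 + 2*xi*beta/3 + beta**2/4)
print(np.allclose(np.linalg.eigvalsh(block), np.sort(roots)))  # True
```

Letting β → −β in `block` gives the {ψ7, ψ8} submatrix, which is the same check for the other pair of roots.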
Calling the eigenvalues λ, the characteristic equation for the {ψ5 , ψ6 } block is √ −ξ + 2 β − λ 3 − 32 β 11 = λ2 + λ(6ξ − β) + 5ξ 2 − ξβ = 0 . √ 3 − 32 β −5ξ + 31 β − λ From the quadratic formula we find the roots r {5,6} β 2 1 λ± = −3ξ + ± 4ξ 2 + ξβ + β 2 . 2 3 4 Looking at the {ψ7 , ψ8 } block, we see that we can just let β → −β and use the same equation for the roots: r {7,8} β 2 1 λ± = −3ξ − ± 4ξ 2 − ξβ + β 2 . 2 3 4 (1) The energy Ei of each of these eight states is then given by (1) (0) E1 = E2 − 5ξ + β (1) (0) E2 = E2 − 5ξ − β (1) (0) E3 = E2 − ξ + 2β (1) (0) E4 = E2 − ξ − 2β r (1) (0) β 2 1 E5 = E2 − 3ξ + + 4ξ 2 + ξβ + β 2 2 3 4 65 r (1) (0) β 2 1 2 E6 = E2 − 3ξ + − 4ξ 2 + ξβ + β 2 3 4 r (1) (0) β 2 1 2 E7 = E2 − 3ξ − + 4ξ 2 − ξβ + β 2 3 4 r (1) (0) β 2 1 2 E8 = E2 − 3ξ − − 4ξ 2 − ξβ + β 2 3 4 (1) (1) For i = 1, 2, 3, 4 the energy Ei corresponds to ψi . But for i = 5, 6 the energy Ei corresponds to some linear combination of ψ5 and ψ6 , and similarly for i = 7, 8 the energy Ei corresponds to a linear combination of ψ7 and ψ8 . (This is the essential content of Section 2.2.) It is easy to see that for β = 0 (i.e., B = 0), these energies reduce to Efs given by (2.74), and for very large β, we obtain the Paschen-Back energies given by (2.70). Thus our results have the correct limiting behavior. See Figure 7 below. E 10 Β 2 4 6 8 10 -10 -20 Figure 7: Intermediate-field energy corrections as a function of B for n = 2. 2.5.4 Supplement: The Electromagnetic Hamiltonian In a proper derivation of the Lagrange equations of motion, one starts from d’Alembert’s principle of virtual work, and derives Lagrange’s equations d ∂T ∂T − = Qi (2.75) dt ∂ q̇i ∂qi where theP qi are generalized coordinates, T = T (qi , q̇i ) is the kinetic energy and Qi = j Fj (∂xj /∂qi ) is a generalized force. In the particular case that Qi is derivable from a conservative force Fj = −∂V /∂xj , then we have Qi = −∂V /∂qi . 
Since the potential energy V is assumed to be independent of q̇i , we can replace ∂T /∂ q̇i by ∂(T − V )/∂ q̇i and we arrive at the usual Lagrange’s equations d ∂L ∂L − =0 (2.76) dt ∂ q̇i ∂qi 66 where L = T − V . However, even if there is no potential function V , we can still arrive at this result if there exists a function U = U (qi , q̇i ) such that the generalized forces may be written as ∂U d ∂U Qi = − + ∂ q̇i dt ∂ q̇i because defining L = T − U we again arrive at equation (2.76). The function U is called a generalized potential or a velocity dependent potential. We now seek such a function to describe the force on a charged particle in an electromagnetic field. Recall from electromagnetism that the Lorentz force law is given by  v  F =q E+ ×B c or   1 ∂A v F = q −∇φ − + × (∇ × A) c ∂t c where E = −∇φ − (1/c)∂A/∂t and B = ∇ × A. Our goal is to write this in the form ∂U d ∂U Fi = − + ∂xi dt ∂ ẋi for a suitable U . All it takes is some vector algebra. We have [v × (∇ × A)]i = εijk εklm v j ∂l Am = (δil δjm − δim δjl )v j ∂l Am = v j ∂i Aj − v j ∂j Ai = v j ∂i Aj − (v · ∇)Ai . But xi and ẋj are independent variables (in other words, ẋj has no explicit depen- dence on xi ) so that ∂Aj ∂ ∂ v j ∂i Aj = ẋj = (ẋj Aj ) = (v · A) ∂xi ∂xi ∂xi and we have ∂ [v × (∇ × A)]i = (v · A) − (v · ∇)Ai . ∂xi But we also have dAi ∂Ai dxj ∂Ai ∂Ai ∂Ai ∂Ai = j + = vj j + = (v · ∇)Ai + dt ∂x dt ∂t ∂x ∂t ∂t so that dAi ∂Ai (v · ∇)Ai = − dt ∂t and therefore ∂ dAi ∂Ai [v × (∇ × A)]i = (v · A) − + . ∂xi dt ∂t 67 But we can write Ai = ∂(v j Aj )/∂v i = ∂(v · A)/∂v i which gives us ∂ d ∂ ∂Ai [v × (∇ × A)]i = (v · A) − (v · A) + . ∂xi dt ∂v i ∂t The Lorentz force law can now be written in the form   ∂φ 1 ∂Ai 1 Fi = q − i − + [v × (∇ × A)]i ∂x c ∂t c   ∂φ 1 ∂Ai 1 ∂ 1 d ∂ 1 ∂Ai =q − i − + (v · A) − (v · A) + ∂x c ∂t c ∂xi c dt ∂v i c ∂t   ∂  v  d ∂ v  =q − i φ− ·A − i ·A . 
∂x c dt ∂v c Since φ is independent of v we can write d ∂ v  d ∂  v  − · A = φ − · A dt ∂v i c dt ∂v i c so that   ∂  v  d ∂  v Fi = q − i φ − · A + φ − · A ∂x c dt ∂v i c or ∂U d ∂U Fi = − + ∂xi dt ∂ ẋi where U = q(φ − v/c · A). This shows that U is a generalized potential and that the Lagrangian for a particle of charge q in an electromagnetic field is q L = T − qφ + v · A (2.77a) c or 1 q L= mv 2 − qφ + v · A. (2.77b) 2 c From this, the canonical momentum is defined by pi = ∂L/∂ ẋi = ∂L/∂vi so that q p = mv + A . c Using this, the Hamiltonian is then given by X H= pi ẋi − L = p · v − L q 1 q = mv 2 + A · v − mv 2 + qφ − A · v c 2 c 1 = mv 2 + qφ 2  2 1 q = p − A + qφ . 2m c 68 This is the basis for the oft heard statement that to include electromagnetic forces, you need to make the replacement p → p − (q/c)A. Including any other additional potential energy terms, the Hamiltonian becomes  2 1 q H= p − A + qφ + V (r) . (2.78) 2m c Let’s evaluate (2.78) for the case of a uniform magnetic field. Since B = ∇ × A, it is not hard to verify that 1 A=− r×B 2 will work (I’ll work it out, but you could also just plug into a vector identity if you take the time to look it up): [∇ × (r × B)]i = εijk εklm ∂j (xl Bm ) = (δil δjm − δim δjl )[δjl Bm + xl ∂j Bm ] = Bi − 3Bi = −2Bi where I used ∂j xl = δjl , δjl δlj = δjj = 3 and ∂j Bm = 0 since B is uniform. This shows that B = (−1/2)[∇ × (r × B)] = ∇ × A as claimed. Note also that for this B we have −2∇ · A = ∇ · (r × B) = εijk ∂i (xj Bk ) = εijk δij Bk = 0 because εijk δij = εiik = 0. Hence ∇ · A = 0. Before writing out (2.78), let me use this last result to show that (p · A)ψ = −i~∇ · (Aψ) = −i~(∇ · A)ψ + i~A · ∇ψ = (A · p)ψ and hence p · A = A · p. (Note this shows that p · A = A · p even if B is not uniform if we are using the Coulomb gauge ∇ · A = 0.) Now using this, we have 2 q2    1 q 1 q p− A = p2 − (p · A + A · p) + 2 A2 2m c 2m c c p2 q q2 = − A·p+ A2 . 
2m mc 2mc2 But (thinking of the scalar triple product as a determinant and switching two rows) q q q A·p= − (r × B) · p = + B · (r × p) mc 2mc 2mc q = B · L. 2mc And using (I’ll leave the proof to you) 1 1 A2 = (r × B) · (r × B) = [r2 B 2 − (r · B)2 ] 4 4 69 we obtain 2 p2 q2  1 q q p− A = − B·L+ [r2 B 2 − (r · B)2 ] . 2m c 2m 2mc 8mc2 Let’s compare the relative magnitudes of the B · L term and the quadratic (last) term for an electron. Taking r2 ≈ a20 and L ∼ ~, we have (e2 /8mc2 )r2 B 2 (e2 /8mc2 )a20 B 2 1 e2 B = = (e/2mc)B · L (e/2mc)~B 4 ~c e/a20 1 1 B = 4 137 (4.8 × 10−10 esu)/(0.5 × 10−8 cm)2 B = . 9 × 109 gauss Since magnetic fields in the lab are of order 104 gauss or less, we see that the quadratic term is negligible in comparison. Referring back to (2.38), we see that q L = µl 2mc where, for an electron, we have q = −e. And as we have also seen, for spin we must postulate a magnetic moment of the form q µs = g S 2mc where g = 2 for an electron (and g = 5.59 for a proton). Therefore, an electron has a total magnetic moment e µtot = − (L + 2S) 2me c as we stated in (2.68). Combining our results, the Hamiltonian for a hydrogen atom in a uniform ex- ternal magnetic field is then given by p2 e2 H= − − µtot · B = H 0 − µtot · B = H 0 + H ′ 2me r where we are taking qφ + V (r) = 0 − e2 /r, and me in this equation is really the reduced mass, which is approximately the same as the electron mass. 70 3 Time-Dependent Perturbation Theory 3.1 Transitions Between Two Discrete States We now turn our attention to the situation where the perturbation depends on time. In this situation, we assume that the system is originally in some definite state, and that applying a time-dependent external force then induces a transition to another state. For example, shining electromagnetic radiation on an atom in its ground state will (may) cause it to undergo a transition to a higher energy state. We assume that the external force is weak enough that perturbation theory applies. 
There are several ways to deal with this problem, and everyone seems to have their own approach. We shall follow a method that is closely related to the time- independent method that we employed. To begin, suppose H = H 0 + H ′ (t) and that we have the orthonormal solutions H 0 ϕn = En ϕn with ϕn (t) = ϕn e−iEn t/~ . Note that we no longer need to add a superscript 0 to the energies, because with a time-dependent Hamiltonian there is no energy conservation and hence we are not looking for energy corrections. We would like to solve the time-dependent Schrödinger equation ∂ψ(t) Hψ(t) = [H 0 + H ′ (t)]ψ(t) = i~ . (3.1) ∂t In this case, the solutions ϕn still form a complete set (they describe every possible state available to the system), the difference being that now the state ψ(t) that results from the perturbation will depend on time. So let us write X ψ(t) = ck (t)e−iEk t/~ ϕk . (3.2) k The reason for this form is that we want the time-dependent coefficients cn (t) to reduce to constants if H ′ (t) = 0. In other words, so H ′ (t) → 0 implies ψ(t) → ϕ(t). Our goal is to find the probability that if the system is in an eigenstate ϕi = ψ(0) at time t = 0, it will be found in the eigenstate ϕf at a later time t. This probability is given by 2 2 Pif (t) = |hϕf |ψ(t)i| = |cf (t)| (3.3) where hψ(t)|ψ(t)i = 1 implies 2 X |ck (t)| = 1 . k 71 Using (3.2) in (3.1) we obtain   X −iEk t/~ ′ X iEk ck (t)e [Ek + H (t)]ϕk = i~ ċk (t) − ck (t) e−iEk t/~ ϕk ~ k k or X X i~ ċk (t)e−iEk t/~ ϕk = H ′ (t)ck (t)e−iEk t/~ ϕk . (3.4) k k But hϕn |ϕk i = δnk so that X i~ċn (t)e−iEn t/~ = hϕn |H ′ (t)|ϕk ick (t)e−iEk t/~ . k Defining the Bohr angular frequency En − Ek ωnk = (3.5) ~ we can write 1 X ċn (t) = hϕn |H ′ (t)|ϕk ick (t)eiωnk t . (3.6a) i~ k This set of equations for cn (t) is exact and completely equivalent to the original Schrödinger equation (3.1). 
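Since (3.6a) is exact, it can also be integrated numerically. A minimal sketch for an assumed two-level model (the values of V, ω and ω21 are arbitrary, and ℏ = 1) verifies that the exact evolution preserves Σk |ck(t)|² = 1:

```python
import numpy as np

# Integrate the exact coupled equations (3.6a) for two states with
# H'_{12}(t) = H'_{21}(t) = V cos(w t) and H'_{11} = H'_{22} = 0,
# then check that the total probability sum_k |c_k|^2 stays 1.
V, w, w21 = 0.2, 1.5, 1.0   # matrix element, drive and Bohr frequencies (hbar = 1)

def cdot(t, c):
    Hp = V * np.cos(w * t)
    # c1' = -i H'_{12} e^{i w12 t} c2, c2' = -i H'_{21} e^{i w21 t} c1, w12 = -w21
    return np.array([-1j * Hp * np.exp(-1j * w21 * t) * c[1],
                     -1j * Hp * np.exp(+1j * w21 * t) * c[0]])

c = np.array([1.0 + 0j, 0.0 + 0j])   # system starts in state 1
t, dt = 0.0, 1e-3
for _ in range(20000):               # evolve to t = 20 with RK4
    k1 = cdot(t, c)
    k2 = cdot(t + dt/2, c + dt*k1/2)
    k3 = cdot(t + dt/2, c + dt*k2/2)
    k4 = cdot(t + dt, c + dt*k3)
    c += dt * (k1 + 2*k2 + 2*k3 + k4) / 6
    t += dt
print(abs(np.sum(np.abs(c)**2) - 1) < 1e-8)   # norm conserved -> True
```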
Defining ′ Hnk (t) = hϕn |H ′ (t)|ϕk i we may write out (3.6a) in matrix form as (for a finite number of terms)   ′ ′ iω12 t ′ ċ1 (t) H11 H12 e ··· H1n eiω1n t c1 (t)      ′ iω21 t ′ ′  ċ2 (t)   H21 e H22 ··· H2n eiω2n t   c2 (t)     i~   ..  =  ..   .. ..   ..  .    (3.6b)  .   . . .  .  ′ ċn (t) Hn1 eiωn1 t ′ Hn2 eiωn2 t ··· ′ Hnn cn (t) As we did in the time-independent case, we now let H ′ (t) → λH ′ (t), and expand ck (t) in a power series in λ: (0) (1) ck (t) = ck (t) + λck (t) + · · · . (3.7) Inserting this into (3.6a) yields ċ(0) (1) 2 (2) n (t) + λċn (t) + λ ċn (t) + · · · 1 X ′ (0) (1) (2) = Hnk (t)[λck (t) + λ2 ck (t) + λ3 ck (t) + · · · ]eiωnk t . i~ k Equating powers of λ, for λ0 we have ċ(0) n (t) = 0 (3.8a) 72 and for λs+1 with s ≥ 0 we have 1 X ′ (s) ċ(s+1) n (t) = Hnk (t)ck (t)eiωnk t . (3.8b) i~ k (0) In principle, these may be solved successively. Solving (3.8a) gives ck (t), and using (1) this in (3.8b) then gives cn (t). Then putting these back into (3.8b) again yields (2) cn (t), and in principle this can be continued to any desired order. Let us assume that the system is initially in the state ϕi , so that cn (0) = δni . (3.9a) Since this must be true for all λ, we have c(0) n (0) = δni (3.9b) and c(s) n (0) = 0 for s ≥ 1 . (3.9c) From (3.8a) we see that the zeroth-order coefficients are constant in time, so we have c(0) n (t) = δni (3.9d) and the zeroth-order solutions are completely determined. Using (3.9b) in (3.8b) we obtain, to first order, 1 X ′ 1 ′ ċ(1) n (t) = Hnk (t)δki eiωnk t = Hni (t)eiωni t i~ i~ k so that t 1 Z ′ c(1) n (t) = ′ Hni (t′ )eiωni t dt′ (3.10) i~ 0 where the constant of integration is zero by (3.9c). Using (3.9d) and (3.10) in (3.2) yields ψ(t) to first order: X 1 Z t  −iEi t/~ ′ ′ iωki t′ /~ ψ(t) = ϕi e +λ H (t )e dt e−iEk t ϕk . 
′ i~ 0 ki k From (3.3) we know that the transition probability to the state ϕf is given by Pif (t) = |hϕf |ψ(t)i|2 = |cf (t)|2 (0) (1) where cf (t) = cf (t) + λcf (t) + · · · . We will only consider transitions to states ϕf (0) that are distinct from the initial state ϕi , and hence cf (t) = 0. Then the first-order transition probability is (1) 2 Pif (t) = λ2 cf (t) 73 or, from (3.10) and letting λ → 1, 2 1 t ′ ′ iωf i t′ ′ Z Pif (t) = H (t )e dt . (3.11) ~2 0 f i A minor point is that our initial conditions could equally well be defined at t → −∞. In this case, the lower limit on the above integrals would obviously be −∞ rather than 0. Example 3.1. Consider a one-dimensional harmonic oscillator of a particle of charge q with characteristic frequency ω. Let this oscillator be placed in an electric field that is turned on and off so that its potential energy is given by 2 /τ 2 H ′ (t) = qE xe−t where τ is a constant. If the particle starts out in its ground state, let us find the probability that it will be in its first excited state after a time t ≫ τ . Since t ≫ τ , we may as well take t → ±∞ as limits. From (3.11), we see that we must evaluate the integral Z ∞ ′ ′ I= H10 (t′ )eiω10 t dt′ −∞ where 2 ′ /τ 2 H10 (t) = qE e−t hψ1 |x|ψ0 i and En = ~ω(n + 1/2) so that ω10 = (E1 − E0 )/~ω = 1. Then (keeping ω10 for generality at this point) Z ∞ 2 2 I = qE hψ1 |x|ψ0 i e−t /τ eiω10 t dt −∞ Z ∞ 2 )(t2 −iω10 τ 2 t) = qE hψ1 |x|ψ0 i e−(1/τ dt −∞ Z ∞ 2 −ω10 τ 2 /4 2 )(t−iω10 τ 2 /2)2 = qE hψ1 |x|ψ0 ie e−(1/τ dt −∞ Z ∞ 2 2 2 )u2 = qE hψ1 |x|ψ0 ie−ω10 τ /4 e−(1/τ du −∞ 2 2 √ = qE hψ1 |x|ψ0 ie−ω10 τ /4 πτ 2 . The easy way to do the spatial integral is to use the harmonic oscillator ladder operators. From r ~ x= (a + a† ) 2mω 74 where √ √ aψn = nψn−1 and a† ψn = n + 1ψn+1 we have r r r ~ ~ ~ hψ1 |x|ψ0 i = hψ1 |a† ψ0 i = hψ1 |ψ1 i = . 2mω 2mω 2mω Therefore r π~ −ω10 2 τ 2 /4 I = qE τ e 2mω so that πq 2 E 2 τ 2 −ω10 2 τ 2 /2 πq 2 E 2 τ 2 −τ 2 /2 P01 (t → ∞) = e = e . 
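The Gaussian integral evaluated in Example 3.1 can be confirmed with sympy:

```python
from sympy import symbols, integrate, sqrt, pi, exp, I, oo, simplify

# Check Int exp(-t^2/tau^2) exp(i w t) dt = sqrt(pi) tau exp(-w^2 tau^2 / 4)
# over t in (-oo, oo), the integral used in Example 3.1.
t = symbols('t', real=True)
tau, w = symbols('tau omega', positive=True)

Iexact = integrate(exp(-t**2 / tau**2) * exp(I * w * t), (t, -oo, oo))
assert simplify(Iexact - sqrt(pi) * tau * exp(-w**2 * tau**2 / 4)) == 0
```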
P0→1(t → ∞) = (πq²E²τ²/2mℏω) e^(−ω10²τ²/2) = (πq²E²τ²/2mℏω) e^(−ω²τ²/2)

since ω10 = ω for the oscillator. Note that as τ → ∞ (i.e., the electric field is turned on and off very slowly), we have P0→1 → 0. This shows that the system adjusts "adiabatically" to the field and is not shocked into a transition.

Example 3.2. Let us consider a harmonic perturbation of the form

H′(t) = V0(r) cos ωt ,  t ≥ 0 .

Note that letting ω = 0 we obtain the constant perturbation H′(t) = V0(r) as a special case. It just isn't much harder to treat the more general situation, which represents the interaction of the system with an electromagnetic wave of frequency ω. If we define

Vfi = ⟨ϕf|V0(r)|ϕi⟩

then

H′fi(t) = ⟨ϕf|V0(r) cos ωt|ϕi⟩ = Vfi cos ωt .

Using cos ωt = (e^(iωt) + e^(−iωt))/2, we then have

∫₀ᵗ H′fi(t′) e^(iωfi t′) dt′ = (Vfi/2) ∫₀ᵗ (e^(iωt′) + e^(−iωt′)) e^(iωfi t′) dt′
  = (Vfi/2) ∫₀ᵗ (e^(i(ωfi+ω)t′) + e^(i(ωfi−ω)t′)) dt′
  = (Vfi/2i) [ (e^(i(ωfi+ω)t) − 1)/(ωfi + ω) + (e^(i(ωfi−ω)t) − 1)/(ωfi − ω) ] .

Inserting this into (3.11), we can write

Pif(t; ω) = (|Vfi|²/4ℏ²) | (1 − e^(i(ωfi+ω)t))/(ωfi + ω) + (1 − e^(i(ωfi−ω)t))/(ωfi − ω) |²   (3.12)

where I'm specifically including ω as an argument of Pif because the transition probability depends on ω.

Let us consider the special case of a constant (i.e., time-independent) perturbation, ω = 0. In this case, (3.12) reduces to

Pif(t; 0) = (|Vfi|²/ℏ²ωfi²) |1 − e^(iωfi t)|² = (|Vfi|²/ℏ²ωfi²) 2(1 − cos ωfi t) .

Using the elementary identity

cos A = cos(A/2 + A/2) = cos²(A/2) − sin²(A/2) = 1 − 2 sin²(A/2)

we can write the transition probability as

Pif(t; 0) = (|Vfi|²/ℏ²) [sin(ωfi t/2)/(ωfi/2)]² := (|Vfi|²/ℏ²) F(t; ωfi) .   (3.13)

The function

F(t; ωfi) = [sin(ωfi t/2)/(ωfi/2)]² = t² [sin(ωfi t/2)/(ωfi t/2)]²

has amplitude equal to t², and zeros at ωfi = 2πn/t. See Figure 8 below.

[Figure 8: Plot of F(t; ωfi) vs ωfi for t = 2.]

The main peak lies between zeros at ±2π/t, so its width goes like 1/t while its height goes like t², and hence its area grows like t.
It is also interesting to see how the transition probability depends on time. 76 1.0 0.8 0.6 0.4 0.2 0.0 2 4 6 8 10 12 14 Figure 9: Plot of F (t; ωf i ) vs t for ωf i = 2. Here we see clearly that for times t = 2πn/ωf i the transition probability is zero, and the system is certain to be in its initial state. Because of this oscillatory behavior, the greatest probability for a transition is to allow the perturbation to act only for a short time π/ωf i . For future reference, let me make a (very un-rigorous but useful) mathemat- ical observation. From Figure 8, we see that as t → ∞, the function F (t, ω) = t2 [(sin ωt/2)/(ωt/2)]2 has an amplitude t2 that also goes to infinity, and a width 4π/t centered at ω = 0 that goes to zero. Then if we include F (t, ω) inside the integral of a smooth function f (ω), the only contribution to the integral will come where ω = 0. Using the well-known result Z ∞ sin2 x 2 dx = π −∞ x we have (with x = ωt/2 so dx = (t/2)dω) ∞ 2 ∞ sin2 x  sin ωt/2 Z Z lim f (ω)t2 dω = 2tf (0) dx = 2πtf (0) t→∞ −∞ ωt/2 −∞ x2 and hence we conclude that  2  2 sin ωt/2 2 sin ωt/2 t→∞ F (t; ω) = =t −−−→ 2πtδ(ω) . (3.14) ω/2 ωt/2 Example 3.3. Let us take a look at equation (3.12) when ω ≈ ωf i . This is called a resonance phenomenon. We will assume that ω ≥ 0 by definition, and we will consider the case where ωf i > 0. The alternative case where ωf i < 0 can be treated in an analogous manner. 77 We begin by rewriting the two complex terms in (3.12). For the first we have 1 − ei(ωf i +ω)t  −i(ωf i +ω)t/2 − ei(ωf i +ω)t/2  i(ωf i +ω)t/2 e A+ := =e ωf i + ω ωf i + ω   sin(ωf i + ω)t/2 = −iei(ωf i +ω)t/2 (ωf i + ω)/2 and similarly for the second 1 − ei(ωf i −ω)t   sin(ωf i − ω)t/2 A− := = −iei(ωf i −ω)t/2 ωf i − ω (ωf i − ω)/2 If ω ≈ ωf i , then A− dominates and is called the resonant term, while the term A+ is called the anti-resonant term. (These terms would be switched if we were considering the case ωf i < 0.) 
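The claim leading up to (3.14), that the area under F(t; ω) equals 2πt, can be checked numerically (the grid and cutoff below are arbitrary choices for the test):

```python
import numpy as np

# Numerical check that Int F(t; w) dw ~ 2 pi t, where
# F(t; w) = t^2 [sin(wt/2)/(wt/2)]^2, consistent with F -> 2 pi t delta(w).
def area(t, wmax=2000.0, n=2_000_001):
    w = np.linspace(-wmax, wmax, n)
    F = t**2 * np.sinc(w * t / (2 * np.pi))**2   # np.sinc(u) = sin(pi u)/(pi u)
    return np.sum(F) * (w[1] - w[0])             # simple Riemann sum

for t in (5.0, 20.0):
    print(area(t) / (2 * np.pi * t))             # -> approximately 1
```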
We are considering the case where |ω − ωf i | ≪ |ωf i |, so A+ can be neglected in comparison to A− . Under these conditions, (3.12) becomes 2 2 2 |Vf i | 2 |Vf i | sin(ωf i − ω)t/2 Pif (t; ω) = |A− | = 4~2 4~2 (ωf i − ω)/2 2 |Vf i | := F (t; ωf i − ω) . (3.15) 4~2 A plot of F (t; ωf i − ω) as a function of ω would be identical to Figure 8 except that the peak would be centered over the point ω = ωf i . In particular, F (t; ωf i − ω) has a maximum value of t2 , and a width between its first two zeros of 4π ∆ω = . (3.16) t Here is another way to view Example 3.3. Let us consider a time-dependent potential of the form H ′ (t) = V0 (r)e±iωt . (3.17) Then t t ei(ωf i ±ω)t − 1 Z Z ′ ′ Hf′ i (t′ )eiωf i t dt′ = Vf i ei(ωf i ±ω)t dt′ = Vf i 0 0 i(ωf i ± ω) sin(ωf i ± ω)t/2 = Vf i ei(ωf i ±ω)t/2 (ωf i ± ω)/2 and (3.11) becomes 2 2 |Vf i | sin(ωf i ± ω)t/2 Pif (t) = . (3.18) ~2 (ωf i ± ω)/2 78 As t → ∞, we can use (3.14) to write 2π 2 lim Pif (t) = |Vf i | δ(Ef − Ei ± ~ω)t t→∞ ~ where we used the general result δ(ax) = (1/ |a|)δ(x) so that δ(ω) = δ(E/~) = ~δ(E). Note that the transition probability grows linearly with time. We can write this as Pif (t → ∞) = Γi→f t (3.19a) where the transition rate (i.e., the transition probability per unit time) is defined by 2π 2 Γi→f = |Vf i | δ(Ef − Ei ± ~ω) . (3.19b) ~ (The result (3.19b) differs from (3.15) by a factor of 4 in the denominator. This is because in Example 3.2 we used cos ωt which contains the terms (1/2)e±iωt .) Because of the delta function, we only get transitions in those cases where |Ef − Ei | = ~ω, which is simply a statement of energy conservation. Assuming that Ef > Ei , in the case of a potential of the form V0 e+iωt , we have Ef = Ei − ~ω so the system has emitted a quantum of energy. And in the case where we have a potential of the form V0 e−iωt , we have Ef = Ei + ~ω so the system has absorbed a quantum of energy. In Example 3.3, we saw that resonance occurs when ω = ωf i . 
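The resonant and anti-resonant terms are mirror images of each other: |A₊(ω)|² = |A₋(−ω)|². A small numerical check (with assumed parameter values) of this symmetry:

```python
import numpy as np

# Check the symmetry |A_+(w)|^2 = |A_-(-w)|^2 between the anti-resonant
# and resonant terms of (3.12).
def A_plus(w, wfi, t):
    return (1 - np.exp(1j * (wfi + w) * t)) / (wfi + w)

def A_minus(w, wfi, t):
    return (1 - np.exp(1j * (wfi - w) * t)) / (wfi - w)

wfi, t = 20.0, 2.0
w = np.linspace(-30, 30, 611)   # grid chosen to avoid w = +/- wfi exactly
assert np.allclose(np.abs(A_plus(w, wfi, t))**2,
                   np.abs(A_minus(-w, wfi, t))**2)
```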
Since we are considering the case where $\omega_{fi} = (E_f - E_i)/\hbar \ge 0$, this means that resonance occurs at the point where $E_f = E_i + \hbar\omega$. In other words, a system with energy $E_i$ undergoes a resonant absorption of a quantum of energy $\hbar\omega$ to transition to a state with energy $E_f$. Had we started with the case where $\omega_{fi} < 0$, we would have found that the system underwent a resonant induced emission of the same quantum of energy $\hbar\omega$, so that $E_f = E_i - \hbar\omega$.

Also recall that in Example 3.3 we neglected $A_+$ relative to $A_-$. Noting that $|A_+(\omega)|^2 = |A_-(-\omega)|^2$, it is easy to see that a plot of $|A_+|^2$ is exactly the same as a plot of $|A_-|^2$ reflected about the vertical axis $\omega = 0$. See Figure 10 below. Note that both of these curves have a width $\Delta\omega = 4\pi/t$ that narrows as time increases.

[Figure 10: Plot of $|A_+|^2$ and $|A_-|^2$ vs $\omega$ for $t = 2$ and $\omega_{fi} = 20$.]

In addition, we see that $A_+$ will be negligible relative to $A_-$ as long as the two peaks are well separated, in other words, as long as
\[
2|\omega_{fi}| \gg \Delta\omega\,.
\]
Since $\Delta\omega = 4\pi/t$, this is equivalent to requiring
\[
t \gg \frac{1}{|\omega_{fi}|} \approx \frac{1}{\omega}\,.
\]
Physically, this means that the perturbation must act over a long enough time interval $t$ for the system to oscillate enough that it indeed appears sinusoidal.

On the other hand, in both Examples 3.2 and 3.3 the transition probability $P_{i\to f}(t;\omega)$ has a maximum value proportional to $t^2$. Since this approaches infinity as $t\to\infty$, and since a probability must always be less than or equal to 1, there is clearly something wrong. The answer is that the first-order approximation we are using has a limited range of validity in time. In Example 3.3, resonance occurs when $\omega = \omega_{fi}$, in which case
\[
P_{i\to f}(t;\omega = \omega_{fi}) = \frac{|V_{fi}|^2}{4\hbar^2}\,t^2.
\]
So in order for our first-order approximation to be valid, we must have
\[
t \ll \frac{\hbar}{|V_{fi}|}\,.
\]
Combining this with the previous paragraph, we conclude that
\[
\frac{1}{|\omega_{fi}|} \ll \frac{\hbar}{|V_{fi}|}\,.
\]
This is the same as
\[
\hbar|\omega_{fi}| = |E_f - E_i| \gg |V_{fi}| = |\langle\varphi_f|V_0|\varphi_i\rangle|
\]
and hence the energy difference between the initial and final states must be much larger than the matrix element $V_{fi}$ between these states.

3.2 Transitions to a Continuum of States

In the previous section we considered the transition probability $P_{i\to f}(t)$ from an initial state $\varphi_i$ to a final state $\varphi_f$. But in the real experimental world, detectors generally observe transitions over an (at least) small range of energies and over a finite range of incident angles. Thus we should treat not a single final state $\varphi_f$, but rather a group (or continuum) of closely spaced states centered about some $\varphi_f$. Since the area under the curve in Figure 8 grows like $t$, we expect the transition probability to a set of states with approximately the same energy as $\varphi_f$ to grow linearly with time. (We saw this for a transition to a single state in equation (3.19a).)

Let us now generalize (3.19b) to a more physically realistic detector. After all, no physical transition rate can behave like a delta function. To get a good idea of what to expect, we first consider the perturbation (3.17) and the resulting transition probability (3.18). For a physically realistic detector, instead of a transition to a single final state we must consider all transitions to a group of final states centered about $E_f$:
\[
P(t) = \sum_{E_f\in\Delta E_f} \frac{|V_{fi}|^2}{\hbar^2}\left[\frac{\sin(\omega_{fi}\pm\omega)t/2}{(\omega_{fi}\pm\omega)/2}\right]^2
= |V_{fi}|^2 \sum_{E_f\in\Delta E_f}\left[\frac{\sin(E_f - E_i \pm \hbar\omega)t/2\hbar}{(E_f - E_i \pm \hbar\omega)/2}\right]^2
\]
where the sum is over all states with energies in the range $\Delta E_f$. We assume that the final states are very closely spaced, and hence may be treated as a continuum of states. In that case, the sum may be converted to an integral over the interval $\Delta E_f$ by writing the number of states with energy between $E_f$ and $E_f + dE_f$ as $\rho(E_f)\,dE_f$, where $\rho(E_f)$ is called the density of final states. It is just the number of states per unit energy.
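The expected linear growth of this sum can be checked numerically before doing the integral. A minimal sketch (Python; $\hbar = 1$ and the values of $V_{fi}$, the level density, and $\hbar\omega$ are hypothetical illustrative choices) sums the single-state result (3.18) over a dense, uniform grid of final energies centered on the resonance:

```python
import numpy as np

hbar = 1.0
Vfi = 0.01         # constant matrix element V_fi (hypothetical value)
rho = 50.0         # density of final states; level spacing = 1/rho
Ei, hw = 0.0, 5.0  # initial energy and quantum hbar*omega (absorption branch)

# final-state energies on a uniform grid around the resonance Ef = Ei + hw
Ef = Ei + hw + np.arange(-2500, 2501) / rho

def P_total(t):
    # sum of eq. (3.18) over the discrete final states
    x = (Ef - Ei - hw) * t / (2 * hbar)               # dimensionless argument
    single = (Vfi * t / hbar)**2 * np.sinc(x / np.pi)**2
    return single.sum()

for t in (5.0, 10.0, 20.0):
    print(t, P_total(t) / t)    # ratio is (nearly) t-independent
```

The ratio $P(t)/t$ settles at $2\pi\rho|V_{fi}|^2/\hbar \approx 0.0314$ for these parameters, which is exactly the golden-rule rate derived next; the check is valid as long as the peak width $4\pi\hbar/t$ in energy stays large compared to the level spacing $1/\rho$.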
Then
\[
P(t) = \int_{E_f-\Delta E_f/2}^{E_f+\Delta E_f/2} \rho(E_f)\,dE_f\,|V_{fi}|^2
\left[\frac{\sin(E_f - E_i \pm \hbar\omega)t/2\hbar}{(E_f - E_i \pm \hbar\omega)/2}\right]^2. \tag{3.20}
\]
As $t$ becomes very large, we have seen that the term in brackets becomes sharply peaked about $E_f = E_i \mp \hbar\omega$, and hence we may assume that $\rho(E_f)$ and $|V_{fi}|$ are essentially constant over the region of integration, which we may also let go to $\pm\infty$. Changing variables to $x = (E_f - E_i \pm \hbar\omega)t/2\hbar$ we then have
\[
P(t) = \frac{2t}{\hbar}\,\rho(E_f)\,|V_{fi}|^2\int_{-\infty}^{\infty}\frac{\sin^2 x}{x^2}\,dx
= \frac{2\pi}{\hbar}\,\rho(E_f)\,|V_{fi}|^2\,t\,.
\]
Defining the transition rate $\Gamma = dP/dt$ we finally arrive at
\[
\Gamma = \frac{2\pi}{\hbar}\,\rho(E_f)\,|V_{fi}|^2\,\Big|_{E_f = E_i \mp \hbar\omega} \tag{3.21}
\]
which is called Fermi's golden rule.

A completely equivalent way to write this is to take equations (3.19) and write
\[
P(t) = \sum_{\text{final states}} P_{i\to f}(t) = \sum_{\text{final states}} \Gamma_{i\to f}\,t = \Gamma t
\]
where
\[
\Gamma_{i\to f} = \frac{2\pi}{\hbar}\,|V_{fi}|^2\,\delta(E_f - E_i \pm \hbar\omega)
\qquad\text{and}\qquad
\Gamma = \sum_{\text{final states}} \Gamma_{i\to f}\,.
\]
If you wish, you can then replace the sum over states by an integral over energies if you include a density of states factor $\rho(E)$. This has the same effect as simply using (3.14) in (3.20) to write
\[
P(t) = \int_{E_f-\Delta E_f/2}^{E_f+\Delta E_f/2} \rho(E_f)\,dE_f\,\frac{2\pi}{\hbar}\,|V_{fi}|^2\,t\,\delta(E_f - E_i \pm \hbar\omega)
= \left[\frac{2\pi}{\hbar}\,|V_{fi}|^2\int_{E_f-\Delta E_f/2}^{E_f+\Delta E_f/2} \rho(E_f)\,\delta(E_f - E_i \pm \hbar\omega)\,dE_f\right] t
= \Gamma t\,.
\]

Example 3.4. Let us consider a simple, one-dimensional model of photo-ionization, in which a particle of charge $e$ in its ground state $\psi_0$ in a potential $U(x)$ is irradiated by light of frequency $\omega$, and hence is ejected into the continuum. To keep things simple, we first assume that the wavelength of the incident light is much longer than atomic dimensions. Under these conditions, the electric field of the light may be considered uniform in space, but harmonic in time. (The magnetic field of the light exerts a force that is of order $v/c$ smaller than the electric force, and may be neglected.) Since we are treating the absorption of energy, we write the electric field as $\mathbf{E} = \mathcal{E}\,e^{-i\omega t}\,\hat{\mathbf{x}}$. Using $\mathbf{E} = -\nabla\phi$ we have
\[
\int \mathbf{E}\cdot d\mathbf{x} = \mathcal{E}\,e^{-i\omega t}\int dx = \mathcal{E}\,e^{-i\omega t}\,x
= -\int \nabla\phi\cdot d\mathbf{x} = -\phi(x)
\]
so that $\phi(x) = -\mathcal{E}\,e^{-i\omega t}\,x$.
From Example 2.2 we know that the interaction energy of the particle in the electric field is given by $e\phi(x)$, and hence the perturbation is
\[
H'(x,t) = -e\mathcal{E}\,x\,e^{-i\omega t} = V_0(x)\,e^{-i\omega t}.
\]
The second assumption we shall make is that the frequency $\omega$ is large enough that the final state energy $E_f$ is very large compared to $U(x)$, and therefore we may treat the final state of the ejected particle as a plane wave (i.e., a free particle of definite energy and momentum).

We need to find the density of final states and the normalization of these states. The standard trick for accomplishing this is to consider our system to be in a box of length $L$, and then let $L\to\infty$. By a proper choice of boundary conditions, this will give us a discrete set of normalizable states. However, we can't treat this like a "particle in a box," because such states must vanish at the walls, and a state of definite momentum can't vanish. Therefore, we employ the mathematical (but non-physical) trick of assuming periodic boundary conditions, whereby the walls are taken to lie at $x_0$ and $x_0 + L$ together with $\psi(x_0 + L) = \psi(x_0)$. The free particle plane waves are of the form $e^{ipx/\hbar}$, so our periodic boundary conditions become
\[
e^{ip(x_0+L)/\hbar} = e^{ipx_0/\hbar}
\]
so that $e^{ipL/\hbar} = 1$ and hence
\[
p = \sqrt{2mE} = \frac{2\pi n\hbar}{L}\,;\qquad n = 0,\pm 1,\pm 2,\ldots\,.
\]
This shows that the momentum (and hence the energy) of the particle takes on discrete values. Note that as $L$ gets larger and larger, the spacing of the states becomes closer and closer, and in the limit $L\to\infty$ they become the usual free particle continuum states of definite momentum. This is the justification for using periodic boundary conditions. Finally, the normalization condition $\int_{x_0}^{x_0+L}|\psi|^2\,dx = 1$ implies that the normalized wave functions are
\[
\psi_E = \frac{1}{\sqrt{L}}\,e^{i\sqrt{2mE}\,x/\hbar}.
\]
The next thing we need to do is find the density of states $\rho(E)$, which is defined as the number of states with energy between $E$ and $E + dE$, i.e., $\rho(E) = dN/dE$.
Consider a state with energy $E$ defined by
\[
\sqrt{2mE} = \frac{2\pi N\hbar}{L}
\qquad\text{so that}\qquad
N = \frac{L}{2\pi\hbar}\sqrt{2mE}\,.
\]
From $n = 0,\pm 1,\pm 2,\ldots,\pm N$, we see that there are $2N + 1$ states with energy less than or equal to $E$. Calling this number $N(E)$, we have
\[
N(E) = 2N + 1 = \frac{L}{\pi\hbar}\sqrt{2mE} + 1\,.
\]
But then
\[
N(E + dE) = \frac{L}{\pi\hbar}\sqrt{2m(E+dE)} + 1
= \frac{L}{\pi\hbar}\sqrt{2mE}\,\sqrt{1 + dE/E} + 1
\approx \frac{L}{\pi\hbar}\sqrt{2mE}\,(1 + dE/2E) + 1
= N(E) + \frac{L}{2\pi\hbar}\sqrt{\frac{2m}{E}}\,dE
\]
and hence
\[
dN = N(E + dE) - N(E) = \frac{L}{2\pi\hbar}\sqrt{\frac{2m}{E}}\,dE\,.
\]
Directly from the definition of $\rho(E)$ we then have
\[
\rho(E) = \frac{L}{2\pi\hbar}\sqrt{\frac{2m}{E}}\,. \tag{3.22}
\]

Now we turn to the matrix element $V_{fi}$. The initial state is the normalized wave function $\psi_0$ with energy $E_0 = -\epsilon$, where $\epsilon$ is the binding energy. The final state is the normalized free particle state $\psi_{E_f}$ with energy $E_f = E_0 + \hbar\omega = \hbar\omega - \epsilon$. Then
\[
V_{fi} = -\mathcal{E}\,\langle\psi_{E_f}|ex|\psi_0\rangle
= -\mathcal{E}\,\frac{1}{\sqrt{L}}\int e^{-i\sqrt{2mE_f}\,x/\hbar}\,ex\,\psi_0\,dx\,.
\]
Note that this is the quantum mechanical average of the energy of an electric dipole in a uniform electric field $\mathcal{E}$. Putting all of this together in (3.21), we have the transition rate
\[
\Gamma = \frac{2\pi}{\hbar}\,\frac{L}{2\pi\hbar}\sqrt{\frac{2m}{E_f}}\;e^2\mathcal{E}^2\,\frac{1}{L}
\left|\int e^{-i\sqrt{2mE_f}\,x/\hbar}\,x\,\psi_0\,dx\right|^2
= \frac{e^2\mathcal{E}^2}{\hbar^2}\sqrt{\frac{2m}{E_f}}
\left|\int e^{-i\sqrt{2mE_f}\,x/\hbar}\,x\,\psi_0\,dx\right|^2. \tag{3.23}
\]
Note that the box size $L$ has canceled out of the final result, as it must.

Let's actually evaluate the integral in (3.23) for the specific example of a particle in a square well potential. Recall that the solutions to this problem consist of sines and cosines inside the well, and exponentially decaying solutions outside. To simplify the calculation, we assume first that the well is so narrow that the ground state is the only bound state (a cosine wave function), and second, that this state is only very slightly bound, so that its wave function extends far beyond the edges of the well. Because the well is so narrow, we can simply replace the cosine wave function inside the well by extending the exponential wave functions back to the origin.
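The counting argument behind (3.22) can be verified by brute force. A small sketch (Python; units with $\hbar = m = 1$ and an arbitrary large box $L$, chosen purely for illustration) counts the allowed momenta $p = 2\pi n\hbar/L$ below a given energy and differentiates numerically:

```python
import numpy as np

hbar, m, L = 1.0, 1.0, 10_000.0   # illustrative units and box size

def N_states(E):
    # number of integers n with |2 pi n hbar / L| <= sqrt(2 m E)
    n_max = int(L * np.sqrt(2 * m * E) / (2 * np.pi * hbar))
    return 2 * n_max + 1

E, dE = 5.0, 0.5
rho_counted = (N_states(E + dE) - N_states(E)) / dE
# eq. (3.22), evaluated at the interval midpoint since rho varies over [E, E+dE]
rho_formula = (L / (2 * np.pi * hbar)) * np.sqrt(2 * m / (E + dE / 2))
print(rho_counted, rho_formula)
```

The two values agree to well under a percent once $L$ is large enough that many states fall inside the interval $dE$.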
With these additional simplifications, the normalized ground state wave function is
\[
\psi_0 = \left(\frac{2m\epsilon}{\hbar^2}\right)^{1/4} e^{-\sqrt{2m\epsilon}\,|x|/\hbar}
\]
where $\epsilon$ is the binding energy. Then the integral in (3.23) becomes
\[
\int_{-\infty}^{\infty} e^{-i\sqrt{2mE_f}\,x/\hbar}\,x\,\psi_0\,dx
= \left(\frac{2m\epsilon}{\hbar^2}\right)^{1/4}\int_{-\infty}^{\infty}
e^{-\sqrt{2m}\,(\sqrt{\epsilon}\,|x| + i\sqrt{E_f}\,x)/\hbar}\,x\,dx
\]
\[
= \left(\frac{2m\epsilon}{\hbar^2}\right)^{1/4}
\left[\int_{-\infty}^{0} e^{\sqrt{2m}\,(\sqrt{\epsilon} - i\sqrt{E_f})\,x/\hbar}\,x\,dx
+ \int_{0}^{\infty} e^{-\sqrt{2m}\,(\sqrt{\epsilon} + i\sqrt{E_f})\,x/\hbar}\,x\,dx\right].
\]
Using
\[
\int_{-\infty}^{0} e^{ax}\,x\,dx = \frac{\partial}{\partial a}\int_{-\infty}^{0} e^{ax}\,dx
= \frac{\partial}{\partial a}\,\frac{1}{a} = -\frac{1}{a^2}
\]
and
\[
\int_{0}^{\infty} e^{-bx}\,x\,dx = -\frac{\partial}{\partial b}\int_{0}^{\infty} e^{-bx}\,dx
= -\frac{\partial}{\partial b}\,\frac{1}{b} = \frac{1}{b^2}
\]
we have
\[
\int_{-\infty}^{\infty} e^{-i\sqrt{2mE_f}\,x/\hbar}\,x\,\psi_0\,dx
= \left(\frac{2m\epsilon}{\hbar^2}\right)^{1/4}\frac{\hbar^2}{2m}
\left[\frac{1}{(\sqrt{\epsilon} + i\sqrt{E_f})^2} - \frac{1}{(\sqrt{\epsilon} - i\sqrt{E_f})^2}\right]
= \left(\frac{2m\epsilon}{\hbar^2}\right)^{1/4}\frac{\hbar^2}{2m}\,
\frac{(-4i)\sqrt{\epsilon E_f}}{(\epsilon + E_f)^2}\,.
\]
Hence equation (3.23) becomes
\[
\Gamma = \frac{8\hbar e^2\mathcal{E}^2}{m}\,\frac{\epsilon^{3/2}E_f^{1/2}}{(\epsilon + E_f)^4}
\]
where $E_f = \hbar\omega - \epsilon$, or $\epsilon + E_f = \hbar\omega$. Since our second initial assumption was essentially that $\hbar\omega \gg \epsilon$, we can replace $E_f$ in the numerator by $\hbar\omega$, leaving us with the final result
\[
\Gamma = \frac{8e^2\mathcal{E}^2\,\epsilon^{3/2}}{m\hbar^{5/2}\omega^{7/2}}\,.
\]
What this means is that if we have a collection of $N$ particles of charge $e$ and mass $m$ in their ground state in a potential well with binding energy $\epsilon$, and they are placed in an electromagnetic wave of frequency $\omega$ and electric vector $\mathcal{E}$, then the number of photoelectrons with energy $\hbar\omega - \epsilon$ produced per second is $N\Gamma$.

Now that we have an idea of what the density of states means and how to use the golden rule, let us consider a somewhat more general three-dimensional problem. We will consider an atomic decay $\varphi_i \to \varphi_f$, with the emission of a particle (photon, electron, etc.) whose detection is far from the atom, and hence may be described by a plane wave
\[
\psi(\mathbf{r},t) = \frac{1}{\sqrt{V}}\,e^{i(\mathbf{p}\cdot\mathbf{r} - \omega_p t)}.
\]
(At the end of our derivation, we will generalize to multiple particles in the final state.) Here $V$ is the volume of a box that contains the entire system, and the factor $1/\sqrt{V}$ is necessary to normalize the wave function. If we take the box to be very large, its shape doesn't matter, so we take it to be a cube of side $L$.
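The closed form for the dipole integral is easy to check by direct quadrature. A minimal sketch (Python; units with $\hbar = m = 1$, and $\epsilon = 0.5$, $E_f = 20$ are arbitrary illustrative values satisfying $E_f \gg \epsilon$):

```python
import numpy as np

hbar, m = 1.0, 1.0
eps, Ef = 0.5, 20.0                        # binding and final energies (Ef >> eps)
kappa = np.sqrt(2 * m * eps) / hbar        # decay constant of psi_0
k = np.sqrt(2 * m * Ef) / hbar             # final-state wavenumber
A = (2 * m * eps / hbar**2)**0.25          # normalization of psi_0

# direct numerical integration of exp(-i k x) x psi_0(x) on a fine grid
x = np.linspace(-60.0, 60.0, 1_200_001)
psi0 = A * np.exp(-kappa * np.abs(x))
numeric = np.sum(np.exp(-1j * k * x) * x * psi0) * (x[1] - x[0])

# closed form: (2 m eps/hbar^2)^{1/4} (hbar^2/2m) (-4i) sqrt(eps Ef)/(eps+Ef)^2
closed = A * (hbar**2 / (2 * m)) * (-4j) * np.sqrt(eps * Ef) / (eps + Ef)**2
print(numeric, closed)
```

The numerical result is purely imaginary (the real part vanishes by parity, since $\cos(kx)\,x\,\psi_0$ is odd) and matches the closed form to several digits.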
In order to determine the allowed momenta, we impose periodic boundary conditions:
\[
\psi(x + L, y, z) = \psi(x, y, z)
\]
and similarly for $y$ and $z$. Then
\[
e^{ip_x L/\hbar} = e^{ip_y L/\hbar} = e^{ip_z L/\hbar} = 1
\]
so that we must have
\[
p_x = \frac{2\pi\hbar}{L}\,n_x\,;\qquad
p_y = \frac{2\pi\hbar}{L}\,n_y\,;\qquad
p_z = \frac{2\pi\hbar}{L}\,n_z
\]
where each $n_i = 0,\pm 1,\pm 2,\ldots$. Our real detector will measure all incoming momenta in a range $\mathbf{p}$ to $\mathbf{p} + \delta\mathbf{p}$, and hence we want to calculate the transition rate to all final states in this range. Thus we want
\[
\Gamma = \sum_{\delta\mathbf{p}} \Gamma_{i\to f}(\mathbf{p})
\]
where $\Gamma_{i\to f}(\mathbf{p})$ is given by (3.19b). Since each momentum state is described by the triple of integers $(n_x, n_y, n_z)$, this is equivalent to the sum
\[
\Gamma = \sum_{\delta n_x,\,\delta n_y,\,\delta n_z} \Gamma_{i\to f}(\mathbf{n})
\to \int d^3n\;\Gamma_{i\to f}(\mathbf{n})
\]
where we have gone over to an integral in the limit of a very large box, so that compared to $L$, each $\delta n_i$ becomes an infinitesimal $dn_i$. Noting that
\[
d^3n = dn_x\,dn_y\,dn_z = \left(\frac{L}{2\pi\hbar}\right)^3 dp_x\,dp_y\,dp_z
= \frac{V}{(2\pi\hbar)^3}\,d^3p \tag{3.24}
\]
we then have (from (3.19b))
\[
\Gamma = \frac{2\pi}{\hbar}\,\frac{V}{(2\pi\hbar)^3}\int d^3p\;|M_{fi}|^2\,\delta(E_f - E_i + E) \tag{3.25}
\]
where we have assumed that the emitted particle has energy $E$ (which is essentially the integration variable), and we changed notation slightly to $|M_{fi}|^2 = |\langle\varphi_f|H'(t)|\varphi_i\rangle|^2$ where $H'(t) = V_0(\mathbf{r})\,e^{+i\omega t}$ as in (3.17). If we let $d\Omega_p = d\cos\theta_p\,d\phi_p$ be the element of solid angle about the direction defined by $\mathbf{p}$, then
\[
\Gamma = \frac{2\pi}{\hbar}\,\frac{V}{(2\pi\hbar)^3}\int d\Omega_p \int p^2\,dp\;|M_{fi}|^2\,\delta(E_f - E_i + E)
= \frac{2\pi}{\hbar}\int d\Omega_p \int \frac{V}{(2\pi\hbar)^3}\,p^2\,\frac{dp}{dE}\,dE\;|M_{fi}|^2\,\delta(E_f - E_i + E)
\]
\[
= \frac{2\pi}{\hbar}\int d\Omega_p \left[\frac{V}{(2\pi\hbar)^3}\,p^2\,\frac{dp}{dE}\,|M_{fi}|^2\right]_{E = E_i - E_f}. \tag{3.26}
\]
Here the integral is over $\Omega_p$, and is to cover whatever solid angle range we wish to include. This could be just a small detector angle, or as large as $4\pi$ to include all emitted particles. The quantity in brackets is evaluated at $E = E_i - E_f$ as required by the energy-conserving delta function. And the factor of $V$ in the numerator will be canceled by the normalization factor $(1/\sqrt{V})^2$ coming from $|M_{fi}|^2$ and due to the outgoing plane wave particle.
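The replacement of the sum over $(n_x, n_y, n_z)$ by an integral with the measure (3.24) can also be verified by direct counting. A sketch (Python; $\hbar = 1$ and the box size, shell radius, and thickness are arbitrary illustrative values) counts lattice momenta in a spherical shell and compares with $V/(2\pi\hbar)^3$ times the shell's momentum-space volume:

```python
import numpy as np

hbar, L = 1.0, 50.0
V = L**3
dp_lat = 2 * np.pi * hbar / L        # lattice spacing of the allowed momenta

# count lattice points p = dp_lat * (nx, ny, nz) in the shell p0 <= |p| < p0 + dp
p0, dp = 2.0, 0.5
n = np.arange(-30, 31)
nx, ny, nz = np.meshgrid(n, n, n, indexing="ij")
p = dp_lat * np.sqrt(nx**2 + ny**2 + nz**2)
counted = np.count_nonzero((p >= p0) & (p < p0 + dp))

# prediction from d^3 n = V d^3 p / (2 pi hbar)^3, eq. (3.24):
# integrate over the shell to get its momentum-space volume
predicted = V / (2 * np.pi * hbar)**3 * (4 * np.pi / 3) * ((p0 + dp)**3 - p0**3)
print(counted, predicted)
```

The agreement improves as $L$ grows, since the relative surface error of the lattice count shrinks with the shell radius measured in units of $2\pi\hbar/L$.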
From (3.24) we see that
\[
\frac{V}{(2\pi\hbar)^3}\,p^2\,\frac{dp}{dE}\,d\Omega_p
= \frac{V}{(2\pi\hbar)^3}\,\frac{d^3p}{dE} = \frac{d^3n}{dE} := \rho(E) \tag{3.27}
\]
where the density of states $\rho(E)$ is defined as the number of states per unit energy. Note that in the case of a photon (i.e., a massless particle) we have $E = pc$ so that
\[
p^2\,\frac{dp}{dE} = \frac{p^2}{c} = \frac{E^2}{c^3} = \frac{\hbar^2}{c^3}\,\omega^2
\]
where we used the alternative relation $E = \hbar\omega$. And in the case of a massive particle, we have $E = p^2/2m$ and
\[
p^2\,\frac{dp}{dE} = p^2\,\frac{m}{p} = mp = m\sqrt{2mE}\,.
\]
You should compare (3.27), using these results, with the one-dimensional result (3.22); in each case the density of states is simply the number of momentum states per unit energy.

In terms of the density of states, (3.26) may be written
\[
\Gamma = \frac{2\pi}{\hbar}\,\rho(E)\,|M_{fi}|^2\,\Big|_{E = E_i - E_f}. \tag{3.28}
\]
This is the golden rule for the emission of a particle of energy $E$.

If the final state contains several particles labeled by $k$, then (3.25) becomes
\[
\Gamma = \frac{2\pi}{\hbar}\int_{\text{indep }\mathbf{p}_k} \prod_k \frac{V\,d^3p_k}{(2\pi\hbar)^3}\;|M_{fi}|^2\,
\delta\Big(E_f - E_i + \sum_k E_k\Big)
\]
where the integral is over all independent momenta, since the energy-conserving delta function is a condition on the total momenta of the emitted particles, and hence eliminates a degree of freedom. However, the product of phase space factors $V\,d^3p_k/(2\pi\hbar)^3$ runs over all particles in the final state. Alternatively, we may leave the integral over all momenta if we include a momentum-conserving delta function in addition:
\[
\Gamma = \frac{2\pi}{\hbar}\int \prod_k \frac{V\,d^3p_k}{(2\pi\hbar)^3}\;|M_{fi}|^2\,
\delta\Big(E_f + \sum_k E_k - E_i\Big)\,
\delta\Big(\mathbf{p}_f + \sum_k \mathbf{p}_k - \mathbf{p}_i\Big)\,.
\]