Chapter 7: The Variation Theorem

Chapter 7: The Variation Theorem#

Prof. Eugene DePrince, Florida State University, Prof. Jay Foley, UNC Charlotte

The preceding notebooks have considered analytically solvable model problems that culminated in the hydrogenic atom problem. For other, more complicated problems, it will not be possible, in general, to find exact analytic solutions to the time-independent Schrödinger equation. As such, this notebook describes one strategy to find approximate solutions to the Schrödinger equation using the variation theorem.

The Variation Theorem for Ground States#

Consider a system with a time-independent Hamiltonian, $\hat{H},$ and energy, $E_1$. The variation theorem states that, if $\phi$ is any well-behaved function that satisfies the boundary conditions for the system, then the expectation value of $\hat{H}$ with respect to $\phi$ is guaranteed to be an upper-bound to $E_1,$ i.e.,

\[\begin{align} \frac{\langle \phi | \hat{H} | \phi\rangle}{\langle \phi| \phi\rangle} \ge E_1 \end{align}\]

If $\phi$ is normalized, then we have

\[\begin{align} \langle \phi | \hat{H} | \phi\rangle \ge E_1 \end{align}\]

Proof. Let us expand $\phi$ in the basis of eigenfunctions of $\hat{H},$ which form a complete set, which we can choose to be orthonormal. We have

\[\begin{align} |\phi\rangle = \sum_k c_k |\psi_k\rangle \end{align}\]

where

\[\begin{align} \hat{H} | \psi_k \rangle = E_k | \psi_k \rangle \end{align}\]

$\phi$ is chosen to be normalized, which leads to

\[\begin{align} \langle \phi | \phi \rangle = \sum_k |c_k|^2 = 1 \end{align}\]

We take the expectation value of $\hat{H}$ with respect to $\phi,$ which we will call $W$

\[\begin{split}\begin{align} W &= \langle \phi | \hat{H} | \phi \rangle \\ &= \sum_{kl} c_k^* c_l \langle \psi_k | \hat{H} | \psi_l \rangle \\ &= \sum_{kl} c_k^* c_l E_l \langle \psi_k | \psi_l \rangle \\ &= \sum_{kl} c_k^* c_l E_l \delta_{kl} \\ &= \sum_k |c_k|^2 E_k \end{align}\end{split}\]

Now, by definition, $E_k \ge E_1$ for all $k$ because $E_1$ is the ground-state energy. As a result, we can state that

\[\begin{align} |c_k|^2 E_k \ge |c_k|^2 E_1 \end{align}\]

Combining this result with the expectation value evaluated above leads to

\[\begin{split}\begin{align} W &= \sum_k |c_k|^2 E_k \\ &\ge \sum_k |c_k|^2 E_1 = E_1 \end{align}\end{split}\]

Therefore,

\[\begin{align} W = \langle \phi | \hat{H} | \phi \rangle \ge E_1 \end{align}\]

This result has important consequences for finding approximate solutions to the Schrödinger equation. First, we could simply test a set of trial functions to see which one is the “best” in the sense that it gives the lowest energy (the closest to $E_1$). Alternatively, we could choose a single trial function that contains parameters, and the best set of parameters could be chosen by varying them such that they minimize the expectation value of the Hamiltonian. In either case, we need to start with a good trial function.

As an example, consider the one-dimensional particle-in-a-box (PIB) problem. We have a potential defined by

\[\begin{split}\begin{align} V(x) = \begin{cases} \infty\text{, } &x < 0 \\ 0\text{, } & 0 \le x \le L \\ \infty\text{, } & x \ge L \end{cases} \end{align}\end{split}\]

As such, we know that the boundary conditions for the problem require that the wave function go to zero at the edges of the box. The following trial wave function satisfies these conditions

\[\begin{split}\begin{align} \phi(x) = \begin{cases} 0\text{, } &x < 0 \\ N x(L-x)\text{, } &0 \le x \le L \\ 0\text{, }&x \ge L \end{cases} \end{align}\end{split}\]

where $N$ is a normalization coefficient. The following Python code visualizes this trial function alongside the actual ground-state wave function for the PIB problem

\[\begin{split}\begin{align} \psi_1(x) = \begin{cases} 0\text{, } &x < 0 \\ \left (\frac{2}{L} \right ) ^{1/2}\sin\left (\frac{\pi x}{L}\right )\text{, } &0 \le x \le L \\ 0\text{, }&x \ge L \end{cases} \end{align}\end{split}\]

The code also evaluates the expectation value of the Hamiltonian with respect to $\phi,$ which, according to the variation theorem, should be greater than or equal to the exact ground-state energy for the PIB problem. To simplify matters, we will work in atomic units. We choose the mass of the particle to be that of an electron ($m = m_\text{e} = 1$), and the length of the box to be $L = 1 a_0$. Because

\[\begin{align} \langle \phi | \hat{H} | \phi \rangle = -\frac{\hbar^2}{2m} \left \langle \phi \left | \frac{d^2}{dx^2}\right | \phi \right \rangle \end{align}\]

we will also need to know the second derivative of $\phi$ with respect to $x$

\[\begin{split}\begin{align} \frac{d^2}{dx^2} \phi(x) &= \frac{d^2}{dx^2} N x(L-x) \\ &= \frac{d}{dx} N(L - 2x) \\ &= -2N \end{align}\end{split}\]

The code below will also normalize $\phi(x),$ so we need to remember to include the normalization constant in this derivative expression.

import numpy as np
import matplotlib.pyplot as plt

L = 1

x = np.linspace(0, L)
phi = x * (L-x)
psi_1 = np.sqrt(2.0 / L) * np.sin(np.pi * x / L)

# normalize phi 
# phi -> N * phi
# N = 1/sqrt(<phi|phi>)
N = 1.0 / np.sqrt(np.trapz(phi**2, x))
phi *= N

fig = plt.figure()

plt.xlim(0, L)
plt.xlabel(r'x ($a_0$)')
plt.ylabel(r'wave function')

plt.plot(x, phi, label = r'$\phi(x)$')
plt.plot(x, psi_1, label = r'$\psi_1(x)$')
plt.legend()
plt.show()

# evaluate <phi|H|phi> = -1/2<phi|d^2/dx^2|phi> = <phi|N>
W = N * np.trapz(phi, x)

print('W = %10.5f Eh' % (W / np.trapz(phi**2, x)))

How does this result compare to the true ground-state energy for the PIB problem? Recall

\[\begin{align} E_n = \frac{\hbar^2n^2\pi^2}{2mL^2} \end{align}\]

so

\[\begin{align} E_1 = \frac{\pi^2}{2} {\rm E}_h \approx 4.93480 ~{\rm E}_h \end{align}\]

Sure enough, the average energy associated with the trial wave function is higher than the exact ground-state energy, by about 1.3%.

Practice

Question 1

Verify the results above by evaluating the expectation value of the Hamiltonian with respect to $\phi$ analytically, rather than numerically.

How could this result be improved? As mentioned above, we could choose a trial function that contains parameters and vary these parameters such that they minimize the variational integral, $W$. Let us consider the following trial function

\[\begin{align} \phi(x) = c_1 \phi_1(x) + c_2 \phi_2(x) \end{align}\]

where the basis functions, $\phi_1$ and $\phi_2,$ are both functions that satisfy the boundary conditions for the problem, i.e.,

\[\begin{split}\begin{align} \phi_1(x) &= N_1 x(L-x) \\ \phi_2(x) &= N_2 x^2(L-x)^2 \end{align}\end{split}\]

when $0 \le x \le L,$ and both functions are zero elsewhere. Here, $N_1$ and $N_2$ are normalization constants for the basis functions. As will be discussed below, the problem of finding the optimal expansion coefficients $c_1$ and $c_2$ can actually be expressed as a matrix eigenvalue problem, but, for now, let us determine these coefficients by direct numerical minimization in Python. As above, we will work in atomic units and choose $m = m_\text{e}$ and $L = 1a_0$. Also like above, we will need to evaluate an integral involving the second derivatives of $\phi_1$ (which we determined above) and $\phi_2$

\[\begin{split}\begin{align} \frac{d^2}{dx^2} \phi_2 &= N_2 \frac{d^2}{dx^2} x^2 (L-x)^2 \\ &= N_2 \frac{d}{dx}\left [ 2x(L-x)^2 - 2x^2(L-x)\right ] \\ &= N_2 \left [ 2(L-x)^2 - 8x(L-x) + 2x^2 L \right ] \end{align}\end{split}\]

So, we will write a function that evaluates

\[\begin{align} W = \frac{\langle \phi | \hat{H} \phi \rangle}{\langle \phi |\phi \rangle} \end{align}\]

and use a Python library function to minimize $W$ with respect to $c_1$ and $c_2.$ For the numerator, we have

\[\begin{align} \langle \phi | \hat{H} \phi \rangle &= -\frac{1}{2} \left ( c_1^2 \langle \phi_1 | \phi_1^{\prime\prime}\rangle + c_1c_2 [ \langle \phi_1 | \phi_2^{\prime\prime}\rangle + \langle \phi_2 | \phi_1^{\prime\prime}\rangle ] + c_2^2 \langle \phi_2 | \phi_2^{\prime\prime}\rangle \right) \end{align}\]

and, for the denominator, we have

\[\begin{align} \langle \phi |\phi \rangle &= c_1^2 \langle \phi_1 | \phi_1 \rangle + 2 c_1c_2 \langle \phi_1 | \phi_2\rangle + c_2^2 \langle \phi_2 | \phi_2 \rangle \end{align}\]

where, in both expressions, we have assumed that $c_1$ and $c_2$ are real valued.

def evaluate_w(var, H11, H12, H22, S12):
    """
    evaluate the variational integral W = <phi|H|phi>/<phi|phi>

    :param var: the parameters, c1 and c2
    :param H11: <phi_1 | H | phi_1>
    :param H12: <phi_1 | H | phi_2>
    :param H22: <phi_2 | H | phi_2>
    :param S12: <phi_1 | phi_1>
    :return W: the variational integral
    """

    c1, c2 = var

    numerator = c1**2 * H11 + 2 * c1*c2 * H12 + c2**2 * H22
    denominator = c1**2 + 2 * S12 * c1*c2 + c2**2
    
    W = numerator / denominator
                
    return W
    
# minimize W
from scipy.optimize import minimize

L = 1.0

# use a fine-grain grid to be sure numerical integrals are accurate
x = np.linspace(0, L, 5001)

# normalization constant
phi_1 = x * (L-x)
N1 = 1.0 / np.sqrt(np.trapz(phi_1**2, x))
phi_1 *= N1

# second derivative
phi_1pp = -2 * N1

# normalization constant
phi_2 = x**2 * (L-x)**2
N2 = 1.0 / np.sqrt(np.trapz(phi_2**2, x))
phi_2 *= N2

# second derivative
phi_2pp = N2 * ( 2 * (L-x)**2 - 8 * x * (L-x) + 2 * x**2 * L )

# <phi1|H|phi1>
H11 = -0.5 * np.trapz(phi_1 * phi_1pp, x)
    
# <phi1|H|phi2>
# <phi2|H|phi1> = <phi1|H|phi2>* because H is Hermitian
H12 = -0.5 * np.trapz(phi_1 * phi_2pp, x)

 # <phi2|H|phi2>
H22 = -0.5 * np.trapz(phi_2 * phi_2pp, x)

# S11 = <phi_1|phi_1> = 1 because phi_1 is normalized
# S22 = <phi_2|phi_2> = 1 because phi_2 is normalized
# S12 = S21 = <phi_1 | phi_2>
S12 = np.trapz(phi_1 * phi_2, x)
    
res = minimize(evaluate_w, [1, 1], args=(H11, H12, H22, S12) )

c1 = res.x[0]
c2 = res.x[1]
w = evaluate_w([c1, c2], H11, H12, H22, S12)
print("c1 = %10.5f" % (res.x[0]))
print("c2 = %10.5f" % (res.x[1]))
print("w  = %10.5f Eh" % (w))

Wow! The variational integral, $W,$ shows only a 0.001% error relative to the exact ground-state energy for the PIB problem, 4.93480 $E_\text{h}$! Now, let’s normalize $\phi$ and visualize it relative to the exact ground-state wave function.

psi_1 = np.sqrt(2.0 / L) * np.sin(np.pi * x / L)

fig = plt.figure()

plt.xlim(0, L)
plt.xlabel(r'x ($a_0$)')
plt.ylabel(r'wave function')

# normalize phi
N = 1.0 / np.sqrt(c1**2 + c2**2 + 2 * S12 * c1 * c2)
phi = N * (c1 * phi_1 + c2 * phi_2)
plt.plot(x, phi, label = r'$\phi(x)$')

plt.plot(x, psi_1, label = r'$\psi_1(x)$')
plt.legend()
plt.show()

We can see that the approximate wave function, $\phi,$ is indistinguishable from $\psi_1$ on the scale of this figure.

Before moving on, let us consider one more example of a trial function that contains a variable parameter. Recall the one-dimensional quantum harmonic oscillator (QHO) problem. The QHO potential has the form

\[\begin{align} V(x) = 2 \pi^2 \nu^2 m x^2 \end{align}\]

where $m$ is the mass of the oscillator, and $\nu$ is the frequency at which it oscillates. What boundary conditions should a trial wave function for this problem satisfy? Because

\[\begin{align} \lim_{x \to \pm \infty} V(x) = \infty \end{align}\]

we know that the wave function should go to zero in these limits. As such, we should pick a trial function that satisfies this requirement. We choose

\[\begin{align} \phi(x) = e^{-c x^2} \end{align}\]

where $c$ is a variable parameter that is related to how quickly the wave function decays to zero. For this example, let us determine the optimal parameter analytically, rather than numerically. The optimal value for $c$ should satisfy

\[\begin{align} \frac{d}{dc} W = \frac{d}{dc} \frac{\langle \phi | \hat{H} | \phi \rangle}{\langle \phi | \phi \rangle } = 0 \end{align}\]

For the numerator of the variational integral, we have

\[\begin{split}\begin{align} \langle \phi | \hat{H} | \phi \rangle &= -\frac{\hbar^2}{2m} \int_{-\infty}^\infty e^{-cx^2} \frac{d^2}{dx^2} e^{-cx^2} dx + 2\pi^2\nu^2 m \int_{-\infty}^{\infty} x^2 e^{-2cx^2} dx \\ &= \left ( \frac{\hbar^2}{m} \right ) \left ( \frac{\pi}{8} \right )^{1/2} c^{1/2} + \nu^2 m\left ( \frac{\pi^5}{8} \right )^{1/2} c^{-3/2} \end{align}\end{split}\]

The denominator of the variational integral is

\[\begin{split}\begin{align} \langle \phi | \phi \rangle &= \int_{-\infty}^\infty e^{-2cx^2} dx \\ &= \left ( \frac{\pi}{2} \right )^{1/2} c^{-1/2} \end{align}\end{split}\]

Combining these results,

\[\begin{split}\begin{align} W &= \frac{\langle \phi | \hat{H} \phi \rangle}{\langle \phi | \phi \rangle } \\ &= \frac{\hbar^2}{2m} c + \frac{\pi^2}{2} \nu^2 m c^{-1} \end{align}\end{split}\]

and

\[\begin{align} \frac{d}{dc} W = \frac{\hbar^2}{2m} - \frac{\pi^2}{2} \nu^2 m c^{-2} = 0 \end{align}\]

Solving for $c,$ we obtain

\[\begin{align} c = \pm \left ( \frac{\pi m \nu}{\hbar} \right ) \end{align}\]

and we reject the solution with the minus sign because

\[\begin{align} \phi(x) = e^{+\pi\nu m x^2/\hbar} \end{align}\]

is not a square integrable function. So, the approximate wave function is

\[\begin{align} \phi(x) = e^{-m\pi\nu x^2/\hbar} \end{align}\]

and, the variational integral is

\[\begin{split}\begin{align} W &= \frac{\hbar^2}{2m} c + \frac{\pi^2}{2} \nu^2 m c^{-1} \\ &= \frac{\hbar^2}{2m} \frac{\pi m \nu}{\hbar} + \frac{\pi^2}{2} \nu^2 m \frac{\hbar} {\pi m \nu} \\ &= \frac{1}{2} h\nu \end{align}\end{split}\]

which is exactly equal to the ground-state energy for the one-dimensional QHO! Clearly, we chose an excellent trial function because the optimal variational parameter led to exactly the correct ground-state energy and wave function.

The Variation Theorem for Excited States#

As expressed above, the variation theorem appears to be useful only for approximating the wave function for the lowest-energy eigenfunction of a given Hamiltonian. However, it is possible to extend this theorem to excited states. We proceed in the same manner as before, by expanding the trial function, $\phi,$ in the basis of eigenstate of the Hamiltonian. We have

\[\begin{align} |\phi\rangle = \sum_k c_k |\psi_k\rangle \end{align}\]

where

\[\begin{align} \hat{H} | \psi_k \rangle = E_k | \psi_k \rangle \end{align}\]

and, again, we choose $\{\psi_k\}$ to be an orthonormal set, and we also choose $\phi$ to be normalized, so

\[\begin{align} \langle \phi | \phi \rangle = \sum_k |c_k|^2 = 1 \end{align}\]

What happens if $\phi$ is orthogonal to the true ground-state wave function? In that case, we have

\[\begin{split}\begin{align} \langle \phi | \psi_1 \rangle &= 0 \\ &= \sum c_k^* \langle \psi_k | \phi_1 \rangle \\ &= c_1^* \end{align}\end{split}\]

So, if $\langle \phi | \psi_1 \rangle = 0,$ then the expansion coefficient corresponding to that basis function is zero. As a result, we can restrict the sum that defines $\phi$ as

\[\begin{align} |\phi\rangle = \sum_{k=2} c_k |\psi_k\rangle \end{align}\]

Working through the expectation value of the Hamiltonian, we would find that the variational integral would then be

\[\begin{align} W = \sum_{k=2} |c_k|^2 E_k \end{align}\]

and, because all $E_k$ in that sum satisfy $E_k \ge E_2,$ we find

\[\begin{align} W \ge E_2 \end{align}\]

This analysis suggests that, if we can guarantee that $\langle \phi | \psi_1 \rangle = 0,$ then we can use the variation theorem to find approximations to excited-state energies and wave functions. Without knowledge of the actual form of the ground-state wave function, enforcing the orthogonality requirement might seem like a tall order. However, in many cases, we can rely on symmetry arguments to make our lives easier. Consider the following examples.

Case 1. First, recall that, in the one-dimensional PIB problem, the wave functions with alternate even or odd symmetry about the center of the box. The ground-state wave function has even symmetry (which is actually the case for any problem for which the potential has even symmetry), so the first excited-state wave function must have odd symmetry. A suitable trial function would be

\[\begin{split}\begin{align} \phi(x) = \begin{cases} 0\text{, } &x < 0 \\ N x(L-x)(L/2 - x) \text{, } &0 \le x \le L \\ 0\text{, }&x \ge L \end{cases} \end{align}\end{split}\]

where $N$ is a normalization constant. The following code visualizes this trial function and the true wave function for the first excited-state (with $n=2$) of the PIB problem. Again, we work with atomic units and choose $L = 1a_0$.

import numpy as np
import matplotlib.pyplot as plt

L = 1

x = np.linspace(0, L)
phi = x * (L-x) * (L/2 - x)
psi_2 = np.sqrt(2.0 / L) * np.sin(2 * np.pi * x / L)

# normalize phi 
# phi -> N * phi
# N = 1/sqrt(<phi|phi>)
N = 1.0 / np.sqrt(np.trapz(phi**2, x))
phi *= N

fig = plt.figure()

plt.xlim(0, L)
plt.xlabel(r'x ($a_0$)')
plt.ylabel(r'wave function')

plt.plot(x, phi, label = r'$\phi(x)$')
plt.plot(x, psi_2, label = r'$\psi_2(x)$')
plt.legend()
plt.show()

As can be seen, this trial function seems like it will be a reasonable approximation to the true wave function for the $n=2$ state.

Practice

Question 2

Consider the trial function $$\begin{align} \phi(x) = \begin{cases} 0\text{, } &x < 0 \\ N x(L-x)(L/2 - x) \text{, } &0 \le x \le L \\ 0\text{, }&x \ge L \end{cases} \end{align}$$ Evaluate the expecation value of the one-dimensional PIB Hamiltonian with respect to this function numerically and analytically, and show that this expectation value is an upper bound to the true energy of the $n=2$ state for the PIB problem.

Case 2. As a second example, let’s revisit the one-dimensional QHO problem. Above, we found that the exact form of the ground-state wave function could be obtained from the trial function,

\[\begin{align} \phi(x) = e^{-cx^2} \end{align}\]

Recall that, like the PIB problem, the QHO wave functions for different states alternate between even and odd symmetry. The ground-state has even symmetry, so, if we would like to approximate the first-excited state, we should choose a trial function that has odd symmetry. Consider

\[\begin{align} \phi(x) = x e^{-cx^2} \end{align}\]

It turns out that this is also an excellent choice for a trial function, because the optimal value for $c$ will recover the exact energy and wave function for the $v=1$ state!

Practice

Question 3

Show that the trial function

\[\begin{align} \phi(x) = x e^{-cx^2} \end{align}\]

will recover the exact energy for the first excited state of the one-dimensional QHO problem.

Case 3. The hydrogenic atom problem is a central force problem, and all central force problems have solutions of the form

\[\begin{align} \psi(r, \theta, \phi) &= R(r) Y^m_l(\theta, \phi) \end{align}\]

where $Y^m_l(\theta, \phi)$ are the spherical harmonics. Recall that the spherical harmonics with different $l$ values are orthogonal, i.e.,

\[\begin{align} \langle Y^m_l(\theta, \phi) | Y^m_{l^\prime}(\theta, \phi) \rangle = 0\text{, }~~~l \neq l^\prime \end{align}\]

One could choose as trial functions

\[\begin{split}\begin{align} \phi_1(r, \theta, \phi) &= R(r) Y^0_0(\theta, \phi) \\ \phi_2(r, \theta, \phi) &= R(r) Y^0_1(\theta, \phi) \\ \phi_3(r, \theta, \phi) &= R(r) Y^0_2(\theta, \phi) \\ \text{etc.} \nonumber \end{align}\end{split}\]

where $R(r)$ is an as-of-yet undetermined function that contains the variational parameters and should satisfy the boundary conditions for bound states

\[\begin{align} \lim_{r\to\infty} R(r) = 0 \end{align}\]

The trial function $\phi_1$ could be used to find an approximate radial function and energy for the lowest-energy state with $l=0,$ which would be the $n=1$ state. The trial functions $\phi_2,$ $\phi_3,$ etc., could then be used to find approximate radial functions and energies for the lowest-energy states with $l=1, 2,$ etc. (the $n=2, 3,$ etc. states).

Case 4. In a later notebook, we will discuss Hartree-Fock theory, which is a variational approach for finding approximate wave functions for many-electron systems. In the Hartree-Fock approach, the trial wave function is chosen to be an antisymmetrized product of “spin-orbitals,” which are one-electron wave functions that include both spin and spatial coordinates. As an example, for two electrons, such a wave function would look like

\[\begin{align} \phi({\bf x}_1, {\bf x_2}) = \frac{1}{\sqrt{2}} [ \chi_1({\bf x}_1) \chi_2({\bf x}_2) - \chi_1({\bf x}_2) \chi_2({\bf x}_1)] \end{align}\]

where ${\bf x}_1$ and ${\bf x}_2$ represent the spin and spatial coordinates for electrons 1 and 2, respectively, and $\chi_1$ and $\chi_2$ are the spin-orbitals that contains the variational parameters. For molecules having point-group symmetry beyond $\text{C}_1,$ the spin-orbitals can be defined such that they belong to a specific irreducible representation of the point-group. States of different total symmetry can then be constructed where the overall symmetry is defined by the products of the symmetries of the underlying spin-orbitals. Consider a water molecule, which belongs to the C$_{2v}$ point group, which itself has four irreducible representations ($A_1,$ $A_2,$ $B_1,$ and $B_2$). As such, the Hatree-Fock approach could be used to find wave functions and energies approximating those for the four lowest-energy states with each of these overall symmetries.

Case 5. Lastly, consider again the antisymmetrized product wave function from Hartree-Fock theory. The spin degrees of freedom could be chosen such that the trial wave function has a specific spin symmetry (e.g., singlet, triplet, etc.). In this way, the Hartree-Fock approach could be used to find wave functions and energies approximating those for the lowest-energy states of a given spin symmetry.

The Linear Variation Method#

One of the main challenges in applications of the variation theorem is choosing a good trial function. On the one hand, we could simply inspect the boundary conditions and choose an appropriate form that respects those conditions (as we did above for the PIB and QHO problems). A more systematic approach is to expand the wave function in a basis of known functions and let the expansion coefficients be the variational parameters. We touched on this idea above when considering the PIB problem. In that case, we chose a trial function that contained two basis functions,

\[\begin{align} \phi(x) = c_1 \phi_1(x) + c_2 \phi_2(x) \end{align}\]

and we found the optimal coefficients numerically using a Python minimization routine. It turns out that we could have also solved this problem analytically using a formalism that involves some linear algebra. Before doing so, we should review some relevant mathematical concepts.

Matrices in Quantum Mechanics#

Given a set $n$ of orthonormal basis functions, $\{f_i\},$ we can define the matrix representation of an operator, $\hat{A},$ as

\[\begin{split}\begin{align} {\bf A} = \begin{pmatrix} A_{11} & A_{12} & ... \\ A_{21} & \ddots & \\ \vdots & & A_{nn}\\ \end{pmatrix} \end{align}\end{split}\]

where each matrix element is defined by

\[\begin{align} A_{ij} = \langle f_i | \hat{A} | f_j \rangle \end{align}\]

We will find that matrices have the same mathematical properties as linear Hermitian operators that we discussed in an earlier notebook. Consider first the equivalence of the matrix representation of two operators

\[\begin{align} \hat{R} = \hat{S} \end{align}\]

in the basis $\{f_i\}$. We multiply this expression by $\langle f_i|$ on the left and by $|f_j\rangle$ on the right and then integrate over all space to obtain

\[\begin{align} R_{ij} = S_{ij} \end{align}\]

From this expression, we conclude that the matrix representations of $\hat{R}$ and $\hat{S}$ are equal if the matrix elements are equivalent, for all $i$ and $j$.

Now, consider the addition of two operators

\[\begin{align} \hat{C} = \hat{A} + \hat{B} \end{align}\]

We can evaluate the matrix elements for $\hat{C}$ in the basis $\{f_i\}$ as

\[\begin{split}\begin{align} C_{ij} &= \langle f_i |\hat{C} | f_j \rangle \\ &= \langle f_i |\hat{A} + \hat{B} | f_j \rangle \\ &= \langle f_i |\hat{A} | f_j \rangle + \langle f_i | \hat{B} | f_j \rangle \\ &= A_{ij} + B_{ij} \end{align}\end{split}\]

which is the rule for matrix addition!

Let us try something slightly more complicated. Consider the product of operators

\[\begin{align} \hat{R} = \hat{S}\hat{T} \end{align}\]

The matrix representation of this product is

\[\begin{split}\begin{align} R_{ij} &= \langle f_i | \hat{R} | f_j \rangle \\ &= \langle f_i |\hat{S}\hat{T} | f_j \rangle \end{align}\end{split}\]

If the basis $\{f_i\}$ is complete, then we can expand

\[\begin{align} \hat{T} | f_j \rangle = \sum_k c_k | f_k \rangle \end{align}\]

so

\[\begin{split}\begin{align} R_{ij} &= \sum_k c_k \langle f_i |\hat{S} | f_k \rangle \\ &= \sum_k c_k S_{ik} \end{align}\end{split}\]

How can we determine the expansion coefficients? You may recall from an earlier notebook that one of the expansion coefficients, $c_m,$ can be obtained by projecting $\hat{T} | f_j \rangle = \sum_k c_k | f_k \rangle$ onto the basis function, $\langle f_m|,$ i.e.

\[\begin{split}\begin{align} \langle f_m | \hat{T} | f_j \rangle &= \sum_k c_k \langle f_m | f_k \rangle \\ &= \sum_k c_k \delta_{mk} \\ &= c_m \end{align}\end{split}\]

Now, we find that

\[\begin{align} R_{ij} &= \sum_{k} S_{ik} T_{kj} \end{align}\]

which is the rule for matrix multiplication! We could also have obtained this result more directly, by inserting the identity operator,

\[\begin{align} \hat{I} = \sum_k | f_k \rangle \langle f_k | \end{align}\]

directy between the product of operators in

\[\begin{split}\begin{align} R_{ij} &= \langle f_i | \hat{R} | f_j \rangle \\ &= \langle f_i |\hat{S}\hat{T} | f_j \rangle \\ &= \langle f_i |\hat{S}\hat{I}\hat{T} | f_j \rangle \\ &= \sum_k \langle f_i | \hat{S}|f_k \rangle \langle f_k | \hat{T} | f_j \rangle \\ &= \sum_{k} S_{ik} T_{kj} \end{align}\end{split}\]

Vectors in Quantum Mechanics#

These exercises should convince us that we can view operators as matrices once we have chosen a basis. It turns out that we can also view functions expanded in this basis as vectors. Consider the action of the operator $\hat{A}$ on a function

\[\begin{align} \hat{A} | u \rangle = | w \rangle \end{align}\]

Now, we select a basis, $\{f_i\},$ in which we expand the function $|u\rangle$

\[\begin{align} |u\rangle = \sum_i u_i | f_i \rangle \end{align}\]

with

\[\begin{align} u_i = \langle f_i | u \rangle \end{align}\]

or

\[\begin{align} |u\rangle = \sum_i | f_i \rangle \langle f_i | u \rangle \end{align}\]

We can think of the expansion coefficients, $u_i,$ as being components of a vector

\[\begin{split}\begin{align} {\bf u } = \begin{pmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{pmatrix} \end{align}\end{split}\]

where $n$ is the dimension of the basis. Let us insert $|u\rangle = \sum_i | f_i \rangle \langle f_i | u \rangle$ into the expression for the matrix vector product

\[\begin{align} \sum_i \hat{A} | f_i \rangle \langle f_i | u \rangle = | w \rangle \end{align}\]

Next, we can expand the function $\hat{A} | f_i \rangle$ in the same basis

\[\begin{split}\begin{align} \hat{A} | f_i \rangle &= \sum_k c_k | f_k \rangle \\ &= \sum_k \langle f_k | \hat{A} | f_i \rangle | f_k \rangle \end{align}\end{split}\]

which gives us

\[\begin{align} \sum_{ik} \langle f_k | \hat{A} | f_i \rangle \langle f_i | u \rangle | f_k \rangle = | w \rangle \end{align}\]

We now can expand the function $|w\rangle$ in the same basis to obtain

\[\begin{split}\begin{align} |w\rangle &= \sum_i w_i | f_i \rangle \\ &= \sum_i |f_i \rangle \langle f_i | w\rangle \end{align}\end{split}\]

and

\[\begin{align} \sum_{ik} \langle f_k | \hat{A} | f_i \rangle \langle f_i | u \rangle | f_k \rangle = \sum_j |f_j \rangle \langle f_j | w\rangle \end{align}\]

or

\[\begin{align} \sum_{ik} A_{ki} u_i | f_k \rangle = \sum_j w_j |f_j \rangle \end{align}\]

We can move all terms to the left-hand side to give

\[\begin{align} \sum_{k}\left [ \sum_i A_{ki} u_i - w_k \right ] | f_k \rangle = 0 \end{align}\]

At this point, we put on our linear algebra hats and note that, if $|f_k\rangle \neq 0$ and all $|f_k\rangle$ are linearly independent, then this equation only has a solution if

\[\begin{align} \sum_i A_{ki} u_i - w_k = 0\text{, }~~~\forall ~k \end{align}\]

Put another way,

\[\begin{align} \sum_i A_{ki} u_i = w_k\text{, }~~~\forall ~k \end{align}\]

which is the rule for a matrix-vector product!

The Secular Equation#

In the linear variational method, the trial wave function is expanded in a basis of known functions

\[\begin{align} |\phi\rangle = \sum_k c_k |f_k\rangle \end{align}\]

where the set of functions $\{|f_k\rangle\}$ is not necessarily orthonormalized. While not required, the following analysis will be simpler if we assume that $\{c_k\}$ and $\{|f_k\rangle\}$ are real-valued. We would like to minimize the variational integral

\[\begin{align} W = \frac{\langle \phi | \hat{H} | \phi \rangle}{\langle \phi | \phi \rangle} \end{align}\]

with respect to the expansion coefficients. The optimal coefficients should satisfy

\[\begin{align} \frac{\partial W}{\partial c_i} = 0 \end{align}\]

for all $i$. Let’s do some math! We have

\[\begin{split}\begin{align} \frac{\partial W}{\partial c_i} &= \frac{\partial}{\partial c_i} \langle \phi | \hat{H} | \phi \rangle \langle \phi | \phi \rangle^{-1} \\ &= \left [ \left \langle \left . \left . \frac{ \partial \phi}{\partial c_i}\right | \hat{H} \right | \phi \right \rangle + \left \langle \phi \left | \hat{H} \left | \frac{\partial \phi}{\partial c_i} \right . \right . \right \rangle \right ] \langle \phi | \phi \rangle^{-1} \nonumber \\ &- \langle \phi | \hat{H} | \phi \rangle \left [ \left \langle \left . \frac{ \partial \phi}{\partial c_i} \right | \phi \right \rangle + \left \langle \phi \left | \frac{\partial \phi}{\partial c_i}\right . \right \rangle \right ] \langle \phi | \phi \rangle^{-2} \end{align}\end{split}\]

The derivative of the trial wave function with respect to $c_i$ is

\[\begin{split}\begin{align} \frac{\partial}{c_i} | \phi \rangle &= \sum_k \frac{\partial}{\partial c_i} c_k | f_k \rangle \\ &= \sum_k \delta_{ik} | f_k \rangle \\ &= |f_i\rangle \end{align}\end{split}\]

so the gradient of $W$ becomes

\[\begin{align} \frac{\partial W}{\partial c_i} &= \frac{\langle f_i | \hat{H} | \phi \rangle + \langle \phi | \hat{H} | f_i \rangle}{\langle \phi | \phi \rangle} - \frac{\langle \phi | \hat{H} | \phi \rangle}{\langle \phi |\phi\rangle^2} \left [ \langle f_i | \phi \rangle + \langle \phi | f_i \rangle \right ] \end{align}\]

By multiplying through by a factor of $\langle \phi | \phi \rangle$ and recognizing that $\frac{\langle \phi | \hat{H} | \phi \rangle}{\langle \phi | \phi \rangle} = W,$ this expression simplifies to

\[\begin{split}\begin{align} 0 &= \langle f_i | \hat{H} | \phi \rangle + \langle \phi | \hat{H} | f_i \rangle - W \left [ \langle f_i | \phi \rangle + \langle \phi | f_i \rangle \right ] \\ &= \sum_k c_k \langle f_i | \hat{H} | f_k \rangle + \sum_k c_k^* \langle f_k | \hat{H} | f_i \rangle - W \left ( \sum_k c_k \langle f_i | f_k \rangle + \sum_k c_k^* \langle f_k | f_i \rangle \right ) \\ &= \sum_k c_k \left ( H_{ik} + H_{ki} \right ) - W \sum_k c_k \left (S_{ik} + S_{ki} \right ) \end{align}\end{split}\]

where, in the last line, we have recognized that $c_k$ is real valued, and we have introduced the overlap integral

\[\begin{align} S_{ik} = \langle f_i | f_k \rangle \end{align}\]

Now, recall that $\hat{H}$ is a Hermitian operator, so

\[\begin{split}\begin{align} \langle f_i | \hat{H} | f_k \rangle &= \langle f_k | \hat{H} | f_i \rangle^* \\ H_{ik} &= H_{ki}^* \\ H_{ik} &= H_{ik}^\dagger \\ \end{align}\end{split}\]

where we have introduced the “dagger” notation, which refers to the adjoint, or conjugate transpose, of the matrix. Hence, we say that Hermitian matrices are “self adjoint.” If the basis functions are real-valued and the Hamiltonian contains only real-valued quantities (e.g., no spin-orbit coupling or complex external fields), then the elements of the matrix representation of the Hamiltonian will be real-valued, as will the overlap integrals, so

\[\begin{split}\begin{align} H_{ik} &= H_{ki} \\ S_{ik} &= S_{ki} \end{align}\end{split}\]

We now have

\[\begin{align} \sum_k \left [ H_{ik} - S_{ik}W\right ] c_k = 0\text{, }~~~\forall~i \end{align}\]

It will be easiest to understand how to solve this set of linear equations if we consider the specific case where trial function is expanded in a basis of only two functions, in which case we would have

\[\begin{split}\begin{align} i &= 1: ~~~ (H_{11} - S_{11} W) c_1 + (H_{12} - W S_{12}) c_2 = 0 \\ i &= 2: ~~~ (H_{21} - S_{21} W) c_1 + (H_{22} - W S_{22}) c_2 = 0 \end{align}\end{split}\]

We have three options for solving these equations:

Use Gaussian / Gauss-Jordon elimination. This procedure is essentially what one would do when solving for $n$ unknown coefficients with $n$ equations by hand.
Solve the non-linear equation resulting from

\[\begin{align} \det\left ( {\bf H} - W{\bf S} \right ) = 0 \end{align}\]

where $\det$ refers to the determinant. It turns out that the original set of equations has a non-trivial solution if and only if this determinant is equal to zero. This determinant equation is sometimes called the secular equation.
Use numerical matrix methods. The linear equations are equivalent to what is called a generalized eigenvalue problem. If the basis is orthonormalized, then the linear equations would be equivalent to matrix eigenvalue problem, which could be solved by finding a transformation that brings the matrix to diagonal form. This approach is most similar to how some problems in quantum chemistry are solved, in practice, so we will return to this idea later.

For now, let us proceed with option 2. Again, consider the two basis function case, where the secular equation is

\[\begin{split}\begin{align} \det ( {\bf H} - W {\bf S} ) &= 0 \\ &= \begin{vmatrix} H_{11} - S_{11}W & H_{12} - S_{12}W \\ H_{21} - S_{21}W & H_{22} - S_{22}W \\ \end{vmatrix} \\ &= (H_{11} - S_{11}W)(H_{22} - S_{22}W) - (H_{12} - S_{12}W)(H_{21} - S_{21}W) \end{align}\end{split}\]

In the last line, we have followed the rule for evaluating a $2\times 2$ determinant. This equation is quadratic in $W,$ which implies the existence of two solutions, or two estimates of the energy. With $n$ basis functions, we would have an $n$th-order polynomial that would give $n$ estimates for energies with

\[\begin{align} W_1 \le W_2 \le ... \le W_n \end{align}\]

It can be proved that each of these values are bounded by the true eigenvalues of the Hamiltonian as

\[\begin{split}\begin{align} E_1 \le W_1 \\ E_2 \le W_2 \\ \text{etc}. \nonumber \end{align}\end{split}\]

How does this work in practice? Let’s revisit the one-dimensional PIB problem, where the trial function is expanded in a basis of four functions:

\[\begin{align} |\phi\rangle = \sum_{k=1}^4 c_k |f_k\rangle \end{align}\]

where

\[\begin{split}\begin{align} |f_1\rangle &= x(l-x) \\ |f_2\rangle &= x^2(l-x)^2 \\ |f_3\rangle &= x(l-x)(l/2-x) \\ |f_4\rangle &= x^2(l-x)^2(l/2-x) \\ \end{align}\end{split}\]

Wait, does the use of four basis functions imply that we will need to deal with a $4\times 4$ determinant? In general, yes, but for this specific example, not quite. The actual structure of the problem will be much simpler, and the reason is symmetry.

First, let us think about the symmetry of the basis functions. $|f_1\rangle$ and $|f_2\rangle$ are even about the middle of the box, while $|f_3\rangle$ and $|f_4\rangle$ are odd. As such,

\[\begin{split}\begin{align} \langle f_1 | f_3 \rangle = S_{13} &= 0 \\ &= S_{14} \\ &= S_{23} \\ & = S_{24} \end{align}\end{split}\]

So, overall, the overlap matrix is block diagonal

\[\begin{split}\begin{align} {\bf S} = \begin{pmatrix} S_{11} & S_{12} & 0 & 0 \\ S_{21} & S_{22} & 0 & 0 \\ 0 & 0 & S_{33} & S_{34} \\ 0 & 0 & S_{43} & S_{44} \\ \end{pmatrix} \end{align}\end{split}\]

What about the matrix representation of the Hamiltonian? Does it have a similar structure? The answer is, “yes,” but first, in order to understand why, we must introduce the parity operator and explore its properties.

The parity operator, $\hat{\Pi},$ is an operator that flips the sign of spatial coordinates. In three dimensions, we have

\[\begin{align} \hat{\Pi} f(x, y, z) = f(-x, -y, -z) \end{align}\]

What are the eigenvalues of this operator? It may be useful to consider what happens when acting on both sides of this equation with the parity operator

\[\begin{split}\begin{align} \hat{\Pi}^2 f(x, y, z) &= \hat{\Pi} f(-x, -y, -z) \\ &= f(x, y, z) \end{align}\end{split}\]

which implies that

\[\begin{align} \hat{\Pi}^2 = 1 \end{align}\]

Now, consider the eigenvalue equation

\[\begin{align} \hat{\Pi} g_i = c_i g_i \end{align}\]

Acting on both sides of the equation with $\hat{\Pi}$ leads to

\[\begin{split}\begin{align} \hat{\Pi}^2 g_i &= c_i \hat{\Pi} g_i \\ &= c_i^2 g_i \end{align}\end{split}\]

which, when combined with the previous result, gives

\[\begin{split}\begin{align} g_i &= c_i^2 g_i \\ \end{align}\end{split}\]

and, thus,

\[\begin{align} c_i = \pm 1 \end{align}\]

What are the eigenfunctions of the parity operator? Given that the eigenvalues are $\pm 1,$ the eigenfunctions should satisfy

\[\begin{align} \hat{\Pi} g_i(x, y, z) &= \pm g_i(x, y, z) \end{align}\]

but we also know that the parity operator should flip the sign of the coordinates, i.e.,

\[\begin{align} \hat{\Pi} g_i(x, y, z) &= g_i(-x, -y, -z) \end{align}\]

Combining these results

\[\begin{align} g_i(x, y, z) = \pm g_i(-x, -y, -z) \end{align}\]

which implies that the eigenfunctions of the parity operator are all (well-behaved) functions with even and odd parity.

Does the parity operator commute with the Hamiltonian for the one-dimensional PIB problem? Consider

\[\begin{align} [\hat{H}, \hat{\Pi}] = [\hat{T}, \hat{\Pi}] + [V, \hat{\Pi}] \end{align}\]

In one-dimension

\[\begin{split}\begin{align} [\hat{T}, \hat{\Pi}] f(x) &= \hat{T}\hat{\Pi}f(x) - \hat{\Pi}\hat{T}f(x) \\ &= -\frac{\hbar^2}{2m} \left [ \frac{d^2}{dx^2}\hat{\Pi} f(x) - \hat{\Pi} \frac{d^2}{dx^2} f(x) \right ] \\ &= -\frac{\hbar^2}{2m} \left [ \frac{d^2}{dx^2} f(-x) - \frac{d}{d(-x)} \frac{d}{d(-x)}f(-x) \right ] \\ &= -\frac{\hbar^2}{2m} \left [ \frac{d^2}{dx^2} f(-x) - \frac{d}{dx} \frac{d}{dx}f(-x) \right ] \\ &= 0 \end{align}\end{split}\]

Therefore, $[\hat{T}, \hat{\Pi}] = 0$.

Practice

Question 4

Convince yourself that

\[\begin{align} \hat{\Pi} \left [ \frac{d^2}{dx^2} f(x) \right ] = \frac{d^2}{dx^2} f(-x) \end{align}\]

using the centered second-order finite-difference expression for the second derivative

\[\begin{align} \frac{d^2}{dx^2} f(x) \approx \frac{f(x+h) - 2f(x) + f(x-h)}{h^2} \end{align}\]

Does the parity operator commute with the potential operator? In one-dimension, we have

\[\begin{align} [V(x), \hat{\Pi}]f(x) = [V(x) - V(-x)]f(-x) \end{align}\]

If the potential is even, then $V(x) = V(-x)$, and

\[\begin{align} [V(x), \hat{\Pi}] = 0 \end{align}\]

and

\[\begin{align} [\hat{H}, \hat{\Pi}] = 0 \end{align}\]

More generally, for a multidimensional problem, if the potential has even parity, then the Hamiltonian will commute with the parity operator. In such cases, it should be possible to define simultaneous eigenfunctions of both operators. In other words, the eigenfunctions of the Hamiltonian could be chosen such that they have even or odd parity.

Before returning to the secular equation, we require one additional theorem for Hermitian operators:

Theorem If $g_i$ and $g_j$ are eigenfunctions of a Hermitian operator $\hat{A}$ with different eigenvalues, then

\[\langle g_i | \hat{B} | g_j \rangle = 0\]

If $\hat{A}$ and $\hat{B}$ commute.

If the potential has even parity, then $[\hat{\Pi}, \hat{H}] = 0$. If the Hamiltonian is represented in a basis of functions with well-defined parity, this theorem suggests that elements $H_{ij} = \langle f_i | \hat{H} | f_j \rangle$ will be zero, if basis functions $f_i$ and $f_j$ have different parities. For the PIB problem with four basis functions, the potential is even, and the basis functions have either even or odd parity (with respect to the center of the box). As such, the Hamiltonian has the same block diagonal structure as the overlap matrix:

\[\begin{split}\begin{align} {\bf H} = \begin{pmatrix} H_{11} & H_{12} & 0 & 0 \\ H_{21} & H_{22} & 0 & 0 \\ 0 & 0 & H_{33} & H_{34} \\ 0 & 0 & H_{43} & H_{44} \\ \end{pmatrix} \end{align}\end{split}\]

Practice

Question 5

Proove that

\[\begin{align} \langle f_i | \hat{B} | f_j \rangle = 0 \end{align}\]

if $f_i$ and $f_j$ are eigenfunctions of linear Hermitian operator $\hat{A}$ with different eigenvalues and $[\hat{A}, \hat{B}] = 0.$

Armed with this new knowledge, we can see that the $4\times 4$ determinant for the PIB problem has a special structure, block-diagonal structure:

\[\begin{split}\begin{align} 0 = \begin{vmatrix} H_{11} - W S_{11} & H_{12} - W S_{12} & 0 & 0 \\ H_{21} - W S_{21} & H_{22} - W S_{22} & 0 & 0 \\ 0 & 0 & H_{33} - W S_{33} & H_{34} - W S_{34} \\ 0 & 0 & H_{43} - W S_{43} & H_{44} - W S_{44} \\ \end{vmatrix} \end{align}\end{split}\]

It turns out that the determinant of a block-diagonal matrix is expressible as a product of determinants of the individual subblocks, i.e.,

\[\begin{split}\begin{align} \begin{vmatrix} {\bf A} & 0 \\ 0 & {\bf B} \\ \end{vmatrix} = \begin{vmatrix} {\bf A} \end{vmatrix}\begin{vmatrix} {\bf B} \end{vmatrix} \end{align}\end{split}\]

So, if we are solving

\[\begin{align} \det({\bf A})\det({\bf B}) = 0 \end{align}\]

we can obtain separate sets of roots by separately solving

\[\begin{align} \det({\bf A}) = 0 \end{align}\]

and

\[\begin{align} \det({\bf B}) = 0 \end{align}\]

To actually solve these equations, we need integrals, $H_{11},$ $S_{11},$ etc., some of which we evaluated numerically above. Let us reuse some of that code to solve the secular equation for the even parity part of the four-basis-function PIB problem:

\[\begin{split}\begin{align} 0 &= \begin{vmatrix} H_{11} - W S_{11} & H_{12} - W S_{12} \\ H_{21} - W S_{21} & H_{22} - W S_{22} \\ \end{vmatrix} \\ &= (H_{11} - W S_{11})(H_{22} - W S_{22}) - (H_{12} - W S_{12})(H_{21} - W S_{21}) \\ &= aW^2 + b W + c \end{align}\end{split}\]

with

\[\begin{split}\begin{align} a &= S_{11} S_{22} - S_{12} S_{21} \\ b &= - (H_{11}S_{22} + H_{22}S_{11} - H_{12}S_{21} - H_{21}S_{12}) \\ c &= H_{11} H_{22} - H_{12}H_{21} \end{align}\end{split}\]

As above, we will work in atomic units and choose $L=1a_0,$ and $m = m_\text{e}\text{.}$

L = 1.0

# use a fine-grain grid to be sure numerical integrals are accurate
x = np.linspace(0, L, 5001)

# normalization constant
phi_1 = x * (L-x)
N1 = 1.0 / np.sqrt(np.trapz(phi_1**2, x))
phi_1 *= N1

# second derivative
phi_1pp = -2 * N1

# normalization constant
phi_2 = x**2 * (L-x)**2
N2 = 1.0 / np.sqrt(np.trapz(phi_2**2, x))
phi_2 *= N2

# second derivative
phi_2pp = N2 * ( 2 * (L-x)**2 - 8 * x * (L-x) + 2 * x**2 * L )

# <phi1|H|phi1>
H11 = -0.5 * np.trapz(phi_1 * phi_1pp, x)
    
# <phi1|H|phi2>
# <phi2|H|phi1> = <phi1|H|phi2>* because H is Hermitian
H12 = -0.5 * np.trapz(phi_1 * phi_2pp, x)

 # <phi2|H|phi2>
H22 = -0.5 * np.trapz(phi_2 * phi_2pp, x)

# S11 = <phi_1|phi_1> = 1 because phi_1 is normalized
# S22 = <phi_2|phi_2> = 1 because phi_2 is normalized
# S12 = S21 = <phi_1 | phi_2>
S12 = np.trapz(phi_1 * phi_2, x)
    
# quadratic formula:
# ax^2 + bx + c = 0
a = 1 - S12**2
b = - (H11 + H22 - 2*H12*S12)
c = H11 * H22 - H12**2

W1 = (-b - np.sqrt(b**2 - 4*a*c))/(2*a)
W3 = (-b + np.sqrt(b**2 - 4*a*c))/(2*a)

print("W1  = %10.5f Eh" % (W1))
print("W3  = %10.5f Eh" % (W3))

Wow! $W_1$ calculcated in this way, which provides an estimate for the exact ground-state energy, $E_1,$ is identical to the estimate we found earlier by explicit numerical minimization of the variational integral. As a reminder, this estimate shows only a 0.001% error relative to the exact ground-state energy for the PIB problem, 4.93480 $E_\text{h}$! Moreover, we also have an estimate to the energy of the first excited state of even parity, which corresponds to the second excited-state energy overall, $E_3$. Recall

\[\begin{align} E_n = \frac{\hbar^2n^2\pi^2}{2mL^2} \end{align}\]

so

\[\begin{align} E_3 = \frac{9\pi^2}{2} {\rm E}_h \approx 44.41322 ~{\rm E}_h \end{align}\]

As can be seen, the description of this state is a bit worse than that of the ground-state, with $W_3$ agreeing with $E_3$ to only $\approx 15\%\text{.}$ Note that $W_3 > E_3,$ as asserted above.

Solving

\[\begin{split}\begin{align} 0 &= \begin{vmatrix} H_{33} - W S_{33} & H_{34} - W S_{34} \\ H_{43} - W S_{43} & H_{44} - W S_{44} \\ \end{vmatrix} \\ \end{align}\end{split}\]

will give us the odd parity solutions, which provide upper-bounds to $E_2$ and $E_4$ of the one-dimensional PIB problem. As with the even-parity states, solving this equation requires knowledge of the matrix elements of the Hamiltonian for basis functions $|f_3\rangle$ and $|f_4\rangle\text{.}$ We can simplify matters if we note that the Hamiltonian is purely kinetic within the box, and

\[\begin{align} \hat{H} = \hat{T} = \frac{\hat{p}^2}{2m} \end{align}\]

Because $\hat{p}$ is Hermitian,

\[\begin{split}\begin{align} H_{ij} &= \langle f_i | \hat{H} | f_j \rangle \\ &= \frac{1}{2m} \langle f_i | \hat{p}^2 | f_j \rangle \\ &= -\frac{i\hbar}{2m} \left \langle f_i \left | \hat{p} \left | \frac{ d f_j}{dx} \right \rangle \right . \right . \\ &= -\frac{i\hbar}{2m} \left . \left \langle \left . \frac{ d f_j}{dx} \right | \hat{p} \right | f_i \right \rangle^* \\ &= \frac{\hbar^2}{2m} \left . \left \langle \frac{ d f_j}{dx} \right | \frac{ d f_i}{dx} \right \rangle^* \\ &= \frac{\hbar^2}{2m} \left . \left \langle \frac{ d f_i}{dx} \right | \frac{ d f_j}{dx} \right \rangle \end{align}\end{split}\]

As such, we only need to evaluate first derivatives of the basis functions. The Python code below calculates these gradients numerically.

L = 1.0

# use a fine-grain grid to be sure numerical integrals are accurate
x = np.linspace(0, L, 5001)

# normalization constant
phi_3 = x * (L-x) * (0.5*L - x)
N3 = 1.0 / np.sqrt(np.trapz(phi_3**2, x))
phi_3 *= N3

# first derivative
phi_3p = np.gradient(phi_3, x)

# normalization constant
phi_4 = x**2 * (L-x)**2 * (0.5 * L-x)
N4 = 1.0 / np.sqrt(np.trapz(phi_4**2, x))
phi_4 *= N4

# first derivative
phi_4p = np.gradient(phi_4, x)

# <phi3|H|phi3>
H33 = 0.5 * np.trapz(phi_3p * phi_3p, x)
    
# <phi3|H|phi4>
# <phi4|H|phi3> = <phi3|H|phi4>* because H is Hermitian
H34 = 0.5 * np.trapz(phi_3p * phi_4p, x)

 # <phi2|H|phi2>
H44 = 0.5 * np.trapz(phi_4p * phi_4p, x)

# S33 = <phi_3|phi_3> = 1 because phi_3 is normalized
# S44 = <phi_4|phi_4> = 1 because phi_4 is normalized
# S34 = S43 = <phi_3 | phi_4>
S34 = np.trapz(phi_3 * phi_4, x)
    
# quadratic formula:
# ax^2 + bx + c = 0
a = 1 - S34**2
b = - (H33 + H44 - 2*H34*S34)
c = H33 * H44 - H34**2

W2 = (-b - np.sqrt(b**2 - 4*a*c))/(2*a)
W4 = (-b + np.sqrt(b**2 - 4*a*c))/(2*a)

print("W2  = %10.5f Eh" % (W2))
print("W4  = %10.5f Eh" % (W4))

Given the exact energies, $E_2$ and $E_4,$

\[\begin{split}\begin{align} E_2 &= 2\pi^2 \approx 19.73921 E_\text{h} \\ E_4 &= 8\pi^2 \approx 78.95684 E_\text{h} \end{align}\end{split}\]

we can see that $W_2$ and $W_4$ agree with these values to within 0.059% and 27%, respectively. Again, note that $W_2 > E_2$ and $W_4 > E_4,$ as asserted above.

So far, we have seen that estimates for the eigenvalues of the Hamiltonian can be obtained from the secular equation. What about the corresponding estimates of the eigenfunctions? Once the $W_n$ are known, we can substitute them back into the original linear equations

\[\begin{align} \sum_j (H_{ij} - W_n S_{ij}) c_j^{(n)} = 0 \end{align}\]

and solve for the expansion coefficients, $c_j^{(n)},$ where the superscript indicates that these coefficients are different for each state. It is important to note that, in cases where the basis functions have well-defined parity, coefficients corresponding to even parity basis functions will only contribute to the states with even parity, and coefficients corresponding to odd parity basis functions will only contribute to the states with odd parity.

The Variation Problem as an Eigenvalue Problem#

If the basis in which the trial function is expanded is orthonormalized, the problem of solving the secular equation is equivalent to the problem of finding the eigenvalues of the matrix representation of the Hamiltonian. To make this apparent, first consider a linear trial function of the form

\[\begin{align} |\phi \rangle = \sum_k c_k | f_k \rangle \end{align}\]

For the two basis function case, the stationarity of $W$ with respect to variations in the coefficients led to

\[\begin{split}\begin{align} (H_{11} - S_{11} W) c_1 + (H_{12} - S_{12} W )c_2 &= 0\\(H_{21} - S_{21} W) c_1 + (H_{22} - S_{22} W )c_2 &= 0 \end{align}\end{split}\]

If the basis is orthonormalized, then

\[\begin{align} S_{ij} = \langle f_i | f_j \rangle = \delta_{ij} \end{align}\]

and

\[\begin{split}\begin{align} H_{11} c_1 + H_{12}c_2 &= W c_1 \\ H_{21} c_1 + H_{22}c_2 &= W c_2 \end{align}\end{split}\]

which can be re-expressed as

\[\begin{split}\begin{align} \begin{pmatrix} H_{11} & H_{12} \\ H_{21} & H_{22} \\ \end{pmatrix} \begin{pmatrix} c_{1} \\ c_{2} \\ \end{pmatrix} = W \begin{pmatrix} c_{1} \\ c_{2} \\ \end{pmatrix} \end{align}\end{split}\]

or

\[\begin{align} {\bf H} {\bf c} = W {\bf c} \end{align}\]

The structure of this equation is similar to that of the eigenvalue equations we have encountered throughout these notebooks. We have a matrix (${\bf H}$) whose action on a vector (${\bf c}$) gives the same vector, multiplied by a scalar. Indeed, this is an example of a matrix eigenvalue equation, and we refer to ${\bf c}$ as an eigenvector of ${\bf H}\text{.}$ This result is generalizable to the case where the trial function is expanded as a linear combination of more than two basis functions.

Recall that the linear equations that we are now representing as an eigenvalue equation only have a solution if

\[\begin{align} \det({\bf H} - W{\bf I}) = 0 \end{align}\]

where ${\bf I}$ is the identity matrix. The determinant is an $n$th-order polynomial, where $n$ is the number of basis functions, and there are $n$ roots to the secular equation, that give $n$ different estimates for the eigenvalues of the Hamiltonian. It follows that the matrix ${\bf H}$ should have $n$ distinct eigenvalue and eigenvector pairs. For a Hermitian matrix (like the Hamiltonian), the eigenvectors are linearly independent, which means that they fully span the same space as the basis functions. This property is reminiscent of the postulate stating that the eigenfunctions of Hermitian operators form a complete set.

Because there are $n$ eigenvalues and eigenvectors, we have

\[\begin{align} {\bf H}{\bf c}^{(i)} = W_i {\bf c}^{(i)} \end{align}\]

for $i = 1, 2, ..., n\text{.}$ If we collect each of the eigenvectors into a single matrix

\[\begin{split}\begin{align} {\bf C} = \begin{pmatrix} c_1^{(1)} & c_1^{(2)} & ... &c_1^{(n)} \\ c_2^{(1)} & c_2^{(2)} & ... &c_2^{(n)} \\ \vdots & \vdots & \ddots & \vdots \\ c_n^{(1)} & c_n^{(2)} & ... &c_n^{(n)} \end{pmatrix} \end{align}\end{split}\]

and define a diagonal matrix with the eigenvalues on the diagonal

\[\begin{split}\begin{align} {\bf W} = \begin{pmatrix} W_1 & 0 & 0 & ... \\ 0 & W_2 & 0 & ... \\ 0 & 0 & \ddots & \\ \vdots & \vdots & &W_n \end{pmatrix} \end{align}\end{split}\]

then we can write

\[\begin{align} {\bf H} {\bf C} = {\bf C}{\bf W} \end{align}\]

Now, we define the inverse of a matrix , ${\bf C}^{-1}\text{,}$ such that

\[\begin{align} {\bf C}{\bf C}^{-1} = {\bf C}^{-1}{\bf C} = {\bf I} \end{align}\]

The inverse matrix, ${\bf C}^{-1}\text{,}$ exists, provided that

\[\begin{align} \det({\bf C}) \neq 0 \end{align}\]

Assuming that ${\bf C}^{-1}$ does exist, we can apply it to both sides of the equation above to give

\[\begin{align} {\bf C}^{-1} {\bf H} {\bf C} &= {\bf C}^{-1} {\bf C} {\bf W} \end{align}\]

or

\[\begin{align} {\bf C}^{-1} {\bf H} {\bf C} &= {\bf W} \end{align}\]

This expression indicates that the eigenvectors of ${\bf H}$ form a matrix that bring the Hamiltonian to diagonal form. In other words, if the basis used to represent the Hamiltonian is orthonormal, then the linear variational problem reduces to the search for the set of vectors that diagonalize ${\bf H}$.

We will apply the Linear Variational Method to the particle in a box of length $10$ atomic units with a delta function potential centered at $x=5$ atomic units. We will plot the delta function potential and the infinite potential walls at $x = 0$ and $x = 10$ in the next cell using the python library matplotlib and the special function from the scipy library called signal.unit_impulse.

import numpy as np
from scipy import signal
from matplotlib import pyplot as plt

# create an array of x-values between 0 and 10
x = np.linspace(0, 10, 500)

# create the delta function potential.  Need to specify that it will be in the middle 
# of 500 samples along x.  We will scale this by the amplitude of the normalized energy 
# eigenstates so that this potential appears on the same scale as the eigenfunctions
V_x = signal.unit_impulse(500, idx='mid') * 2/10
V_0 = signal.unit_impulse(500, 0) * 5
V_L = signal.unit_impulse(500, 499) * 5
V = V_x + V_0 + V_L

# now plot V_x against x
plt.plot(x, V, color='red', label="Potential")
plt.xlabel("Position (atomic units)")
plt.ylabel("Potential (atomic units)")
plt.ylim(0, np.sqrt(2/10))
plt.legend()
plt.show()

Approach#

We can write the Hamiltonian for this system between $x=0$ and $x=10$ as follows:

\[\hat{H} = -\frac{\hbar^2}{2m} \frac{d^2}{dx^2} + \delta(x-5), \]

and we notice that this looks very similar to the Hamiltonian we know from the ordinary particle in a box except for the presence of this peculiar spikey potential right in the middle of the box. We could try to find the exact eigenfunctions for this Hamiltonian, but we can also obtain insights and reasonably good results utilizing approximate methods, such as the Variational Method or Perturbation Theory. Here we will use the Variational Method.

Form of the trial wavefunction#

In particular, we will optimize the trial wavefunction given by \begin{equation} \Phi(x) = \sum_{n=1}^N c_n \psi_n(x) \end{equation} where the expansion coefficients $c_n$ are real numbers and $\psi_n(x)$ are the energy eigenfunctions of the ordinary particle in a box that has no potential between $x=0$ and $x=L$. In particular, these eigenfunctions have the form \begin{equation} \psi_n(x) = \sqrt{\frac{2}{10} } {\rm sin}\left(\frac{n \pi x}{10} \right). \end{equation}

Form of the energy functional of the trial wavefunction#

We will seek to minimize the energy functional of the trial wavefunction through the expansion coefficients, where the energy functional of the trial wavefunction can be written as \begin{equation} E_{trial} = \frac{\int_0^{10} \Phi^* (x) : \hat{H} : \Phi(x) dx }{\int_0^{10} \Phi^* (x) : \Phi(x) dx }, \end{equation} where we have recognized that the boundaries of the box are at $x=0$ and $x=10$, so our range of integration goes between these boundaries.

Atomic Units#

We will express our Hamiltonian in atomic units where $\hbar = 1$ and $m = 1$, so we can write the Hamiltonian in this unit system as: \begin{equation} \hat{H} = -\frac{1}{2} \frac{d^2}{dx^2} + \delta(x-5). \end{equation} We further note that the Hamiltonian can be written as a sum of kinetic and potential operators, $\hat{H} = \hat{T} + \hat{V}$ where $\hat{T} = -\frac{1}{2} \frac{d^2}{dx^2}$ and $\hat{V} = \delta(x-5)$. For the ordinary particle in a box, the Hamiltonian only contains $\hat{T}$, so we can note that the particle in a box energy eigenfunctions obey the following eigenvalue equation with the kinetic energy operator (in atomic units) \begin{equation} \hat{T} \psi_n(x) = \frac{\pi^2 n^2}{200} \psi_n(x) \equiv E_n \psi_n(x).
\end{equation}

Solving the Linear Variational Problem#

The Linear Variational Method is carried out by minimizing $E_{trial}$ with respect to the expansion coefficients in $\Phi(x)$. This leads to the condition that the partial derivative of the trial energy with respect to each expansion coefficient is zero: \begin{equation} \frac{\partial}{\partial c_m} E_{trial} = 0 ; ; \forall ; m. \end{equation} When this is true, the trial energy and the expansion coefficients satisfy the following equations: \begin{equation} E_{trial} c_m = \sum_{n=1}^N H_{nm} c_n, \end{equation} where \begin{equation} H_{nm} = \int_0^{10} \psi^*n(x) \hat{H} \psi_m(x) dx. \end{equation} This can be written as an eigenvalue equation \begin{equation} {\bf H} {\bf c} = E{trial} {\bf c}, \end{equation} where ${\bf H}$ is the matrix whose elements are given by $H_{nm}$ and ${\bf c}$ is the vector of coefficients.

Questions Part 1:#

The matrix element $H_{nm}$ can be expressed as a sum of kinetic and potential matrix elements, $T_{nm} + V_{nm}$ where \begin{equation} T_{nm} = \int_0^{10} \psi^n(x) \hat{T} \psi_m(x) dx \end{equation} and \begin{equation} V{nm} = \int_0^{10} \psi^_n(x) \hat{V} \psi_m(x) dx. \end{equation} Write a general expression for the elements of $T_{nm}$ and $V_{nm}$ for our particle in a box with a delta potential.
Write two python functions Kinetic_matrix_element(n, m), Potential_matrix_element(n, m) that take the indices $n$ and $m$ and return the corresponding value of $T_{nm}$ and $V_{nm}$. Skeleton code for this function follows. Three helper functions are provided:
- def energy_eigenvalue(n, L, m) that can provide the energy eigenvalues of the ordinary particle in a box for energy eigenstate $n$, length $L$, and mass $m$
- def energy_eigenfunction(n, L, x) that can provide the value(s) of the energy eigenfunction of the ordinary particle in a box for energy eigenstate $n$ with length $L$ evaluated at x-value(s) $x$.
- def Hamiltonian_matrix_element(n, m) that will call the Kinetic_matrix_element and Potential_matrix_element functions to return the total Hamiltonian matrix element value.

import numpy as np

def energy_eigenvalue(n, L, M):
    """Calculate the energy eigenvalue for a particle in a box.
    
    Arguments
    ---------
    n : int
        Quantum number (positive integer) for the particle in a box state
    L : float
        Length of the box in atomic units
    M : float
        Mass of the particle in atomic units
        
    Returns
    -------
    E_n : float
        Energy eigenvalue in atomic units
        
    Notes
    -----
    Uses the formula: E_n = n² π² / (2 M L²)
    """
    return n**2 * np.pi**2 / (2 * M * L**2)


def energy_eigenfunction(n, L, x):
    """Evaluate the normalized eigenfunction for a particle in a box.
    
    Arguments
    ---------
    n : int
        Quantum number (positive integer) for the particle in a box state
    L : float
        Length of the box in atomic units
    x : float or numpy.ndarray
        Position(s) where the eigenfunction is evaluated (atomic units)
        
    Returns
    -------
    psi_n : float or numpy.ndarray
        Value(s) of the normalized eigenfunction ψ_n(x)
        
    Notes
    -----
    Uses the formula: ψ_n(x) = √(2/L) sin(nπx/L)
    """
    return np.sqrt(2 / L) * np.sin(n * np.pi * x / L)


def Hamiltonian_matrix_element(n, m, L, M, x_delta):
    """Calculate the Hamiltonian matrix element <n|H|m>.
    
    Arguments
    ---------
    n : int
        Quantum number for the bra state |n⟩
    m : int
        Quantum number for the ket state |m⟩
    L : float
        Length of the box in atomic units
    M : float
        Mass of the particle in atomic units
    x_delta : float
        Position of the delta function potential in atomic units
        
    Returns
    -------
    H_nm : float
        Matrix element ⟨n|H|m⟩ = ⟨n|T|m⟩ + ⟨n|V|m⟩
        
    Example
    -------
    >>> H_12 = Hamiltonian_matrix_element(1, 2, L=10.0, M=1.0, x_delta=5.0)
    >>> H_33 = Hamiltonian_matrix_element(3, 3, L=10.0, M=1.0, x_delta=5.0)
    
    Notes
    -----
    For n == m: ⟨n|H|n⟩ = E_n + ⟨n|V|n⟩
    For n != m: ⟨n|H|m⟩ = ⟨n|V|m⟩
    """
    T_nm = Kinetic_matrix_element(n, m, L, M)
    V_nm = Potential_matrix_element(n, m, L, x_delta)
    return T_nm + V_nm


def Kinetic_matrix_element(n, m, L, M):
    """Calculate the kinetic energy matrix element <n|T|m>.
    
    Arguments
    ---------
    n : int
        Quantum number for the bra state |n⟩
    m : int
        Quantum number for the ket state |m⟩
    L : float
        Length of the box in atomic units
    M : float
        Mass of the particle in atomic units
        
    Returns
    -------
    T_nm : float
        Matrix element ⟨n|T|m⟩ = E_m δ_nm
        
    Example
    -------
    >>> T_12 = Kinetic_matrix_element(1, 2, L=10.0, M=1.0)  # Returns 0
    >>> T_33 = Kinetic_matrix_element(3, 3, L=10.0, M=1.0)  # Returns E_3
    
    Notes
    -----
    The kinetic energy matrix is diagonal: ⟨n|T|m⟩ = E_m if n == m, else 0
    """
    if n == m:
        return energy_eigenvalue(m, L, M)
    else:
        return 0.0


def Potential_matrix_element(n, m, L, x_delta):
    """Calculate the potential energy matrix element <n|V|m> for V = δ(x - x_delta).
    
    Arguments
    ---------
    n : int
        Quantum number for the bra state |n⟩
    m : int
        Quantum number for the ket state |m⟩
    L : float
        Length of the box in atomic units
    x_delta : float
        Position of the delta function potential in atomic units
        
    Returns
    -------
    V_nm : float
        Matrix element ⟨n|V|m⟩ = ψ_n(x_delta) * ψ_m(x_delta)
        
    Example
    -------
    >>> V_12 = Potential_matrix_element(1, 2, L=10.0, x_delta=5.0)
    >>> V_33 = Potential_matrix_element(3, 3, L=10.0, x_delta=5.0)
    
    Notes
    -----
    For a delta function potential V(x) = V_0 δ(x - x_delta), the matrix element is
    ⟨n|V|m⟩ = V_0 ψ_n(x_delta) * ψ_m(x_delta)
    This implementation assumes V_0 = 1.
    """
    return energy_eigenfunction(n, L, x_delta) * energy_eigenfunction(m, L, x_delta)

Variational optimization#

Since we see that the linear Variational Method can be cast as an eigenvalue equation ${\bf H} {\bf c} = E {\bf c}$, we can find the Variational ground-state energy and wavefunction through the following steps:

Build the matrix ${\bf H}$; we will do this in the basis of energy eigenfunctions of the ordinary particle in a box
Diagonalize ${\bf H}$; the lowest eigenvalue will be the variational approximation to the ground-state energy and the corresponding eigenvector will represent the expansion coeefficients for the variational ground-state wavefunction in terms of the basis functions.

We can use np.linalg.eigh() to diagonalize the Hamiltonian matrix.

The following code block contains a helper function called build_matrices(n_basis) that will build and return the Hamiltonian, Kinetic Energy, and Potential Energy matrices defined above.
The argument n_basis will be used to determine the number of basis functions, and the size of the resulsing matrices will be n_basis x n_basis.

def build_matrices(n_basis, L, M, x_delta):
    """Build the Hamiltonian, Kinetic, and Potential energy matrices.
    
    Arguments
    ---------
    n_basis : int
        Number of basis functions used to build the matrices
    L : float
        Length of the box in atomic units
    M : float
        Mass of the particle in atomic units
    x_delta : float
        Position of the delta function potential in atomic units
        
    Returns
    -------
    H : numpy.ndarray
        Total Hamiltonian matrix (n_basis × n_basis)
    T : numpy.ndarray
        Kinetic energy matrix (n_basis × n_basis)
    V : numpy.ndarray
        Potential energy matrix (n_basis × n_basis)
         
    Example
    -------
    >>> H, T, V = build_matrices(n_basis=3, L=10.0, M=1.0, x_delta=5.0)
    
    Notes
    -----
    Matrix indices run from 0 to n_basis-1, corresponding to quantum numbers
    n, m from 1 to n_basis.
    """
    H = np.zeros((n_basis, n_basis))
    T = np.zeros((n_basis, n_basis))
    V = np.zeros((n_basis, n_basis))
    
    for i in range(n_basis):
        n = i + 1  # i starts from 0 but quantum number n starts at 1
        for j in range(n_basis):
            m = j + 1  # j starts at 0 but quantum number m starts at 1
            H[i, j] = Hamiltonian_matrix_element(n, m, L, M, x_delta)
            T[i, j] = Kinetic_matrix_element(n, m, L, M)
            V[i, j] = Potential_matrix_element(n, m, L, x_delta)
    
    return H, T, V

The following line calls our build_matrices function to build ${\bf H}$, ${\bf T}$, and ${\bf V}$ matrices. We will call the array that stores ${\bf H}$ Hamiltonian_matrix.

n_basis = 3
Hamiltonian_matrix, Kinetic_matrix, Potential_matrix = build_matrices(n_basis=3, L=10.0, M=1.0, x_delta=5.0)

The following lines will find the eigenvalues and eigenvectors of ${\bf H}$. The lowest energy eigenvalue will be the variational estimate of the ground-state energy and the associated eigenvector will be the variational estimate of the ground-state wavefunction.

# compute eigenvalues and eigenvectors of Hamiltonian_matrix
# store eigenvalues to E_opt and eigenvectors to c_opt
E_opt, c_opt = np.linalg.eigh(Hamiltonian_matrix)

# print lowest eigenvalues corresponding to the 
# variational estimate of the ground state energy
print(F'The variational ground state energy with {n_basis} basis functions is {E_opt[0]:.6f} atomic units')

Visualizing the variational ground-state#

Now that we have the variational eigenvectors (along with the eigenvalues), we can visualize the variational ground-state wavefunction by expanding it in terms of the basis functions $\psi_n(x)$ using the expansion coefficients from the ground-state eigenvector: $$ \Phi(x) = \sum_n c_n \psi_n(x). $$ The ground-state eigenvector can be accessed as a slice of c_opt as follows: c_opt[:,0]

We will build a numpy array of values from this expansion and plot it using pyplot.

L = 10
# Initialize a numpy array for Phi_gs as an array of zeros with the same length as x
Phi_gs = np.zeros_like(x)

# Loop through the basis states and add c_n ψ_n(x) to Phi_gs 
for i in range(n_basis):
    n = i + 1  # Quantum number starts at 1
    Phi_gs += c_opt[i, 0] * energy_eigenfunction(n, L, x)
    
# Plot the variational ground state along with the ground state of the ordinary PIB
plt.plot(x, Phi_gs**2, "blue", label="Variational Ground-State Probability Density")
plt.plot(x, energy_eigenfunction(1, L, x)**2, "red", label="PIB Ground State Probability Density")

# Optionally plot the delta potential
plt.plot(x, V, "purple", label="Potential")

# Set y-axis limits (scaled by box length)
plt.ylim(0, 2 / L * 1.5)

# Add labels and legend
plt.xlabel("Position (a.u.)")
plt.ylabel("Probability Density (a.u.⁻¹)")
plt.legend()
plt.show()

Questions Part 2:#

Is the energy you calculated above higher or lower than the ground state energy of the ordinary particle in a box system (that is, without the delta function potential)?
Why do you think mixing in functions that correspond to excited states in the ordinary particle in a box system actually helped to improve (i.e. lower) your energy in the system with the delta function potential?
Increase the number of basis functions to 50 (so that ${\bf H}$ is a 50x50 matrix and ${\bf c}$ is a vector with 50 entries) and repeat your calculation of the variational estimate of the ground state energy. Does the energy improve (lower) compared to what it was when 3 basis functions were used? Do you notice any difference in the plot of the variational estimate of the ground-state probability density?

Behavior of Total Energy, Kinetic Energy, and Potential Energy functionals with basis set size.#

Typically speaking, the quality of the variational ground-state energy improves with the basis set size. Here we will look at the behavior of the total energy, the kinetic energy, and the potential energy of the trial wavefunction as a function of basis set size.

For a given trial wavefunction (as determined by the variationally determined ground-state eigenvector ${\bf c}$, we can define the total energy as \begin{equation} E = {\bf c}^t {\bf H} {\bf c} \end{equation} the kinetic energy as \begin{equation} T = {\bf c}^t {\bf T} {\bf c}, \end{equation} and the potential energy as \begin{equation} V = {\bf c}^t {\bf V} {\bf c}. \end{equation}

where ${\bf c}^t$ is just the transpose of ${\bf c}$. We will perform this computation for basis set sizes spanning n_basis = 2 to n_basis = 202, and we will compare to the ground state and first excited state of the ordinary particle in a box ($E_1$ and $E_2$).

# Define physical parameters
L = 10.0      # Length of the box in atomic units
M = 1.0       # Mass of the particle in atomic units
x_delta = 5.0 # Position of the delta function potential in atomic units

# Initialize lists to store results
n_basis_list = []
E_list = []
T_list = []
V_list = []
E_1_list = []
E_2_list = []

# Loop over different basis set sizes
for i in range(2, 202, 2):
    n_basis_list.append(i)
    
    # Build matrices with proper parameters
    H, T, V = build_matrices(i, L, M, x_delta)
    
    # Diagonalize Hamiltonian to get eigenvalues and eigenvectors
    e_t, c_t = np.linalg.eigh(H)
    
    # Store ground state energy
    E_list.append(e_t[0])
    
    # Calculate kinetic energy expectation value: <Ψ|T|Ψ>
    T_c = np.dot(T, c_t[:, 0])
    T_list.append(np.dot(np.transpose(c_t[:, 0]), T_c))
    
    # Calculate potential energy expectation value: <Ψ|V|Ψ>
    V_c = np.dot(V, c_t[:, 0])
    V_list.append(np.dot(np.transpose(c_t[:, 0]), V_c))
    
    # Store unperturbed PIB energies for first two states
    E_1_list.append(T[0, 0])  # E_1 = n=1 eigenvalue
    E_2_list.append(T[1, 1])  # E_2 = n=2 eigenvalue

# Convert to numpy arrays for easier manipulation
n_basis_list = np.array(n_basis_list)
E_list = np.array(E_list)
T_list = np.array(T_list)
V_list = np.array(V_list)
E_1_list = np.array(E_1_list)
E_2_list = np.array(E_2_list)
    

    
    

plt.plot(n_basis_list, E_list, label="E")
plt.plot(n_basis_list, T_list, label="T")
plt.plot(n_basis_list, V_list, label="V")
plt.plot(n_basis_list, E_1_list, label="E_1")
plt.plot(n_basis_list, E_2_list, label="E_2")
plt.legend()

Finite Well#

As a second example, let us revisit the one-dimensional finite square well potential problem. For this problem, the potential has the form

\[\begin{split}\begin{align} V(x) = \begin{cases} V_0\text{, } &x < 0 \\ 0\text{, } & 0 \le x \le L \\ V_0\text{, } & x \ge L \end{cases} \end{align}\end{split}\]

We want to expand the trial function as a linear combination of orthogonal basis functions,

\[\begin{align} | \phi \rangle = \sum_n c_n |f_n \rangle \end{align}\]

Let us choose $f_n$ to be one-dimensional PIB wave functions, which satisfy this orthogonality requirement and have the form

\[\begin{split}\begin{align} f_n(x) = \begin{cases} 0\text{, } &x < 0 \\ \left ( \frac{2}{L^\prime} \right) \sin\left(\frac{n\pi (x-x_\text{min})}{L^\prime}\right)\text{, } &0 \le x \le L^\prime \\ 0\text{, }&x \ge L^\prime \end{cases} \end{align}\end{split}\]

Here, we have chosen a box with a length $L' > L$ because the finite square well potential wave functions should penetrate into the classically forbidden regions where $x > L$ and $x < -L\text{.}$ Note that the position in the sine function is shifted because the PIB functions are usually defined on the interval $x = [0, L^\prime]\text{,}$ but if they are defined in this way, the basis functions would not be centered about the middle of the finite square well. Hence, the PIB basis function should be defined on the interval $x = [x_\text{min}, x_\text{max}]\text{,}$ where

\[\begin{split}\begin{align} x_\text{min} &= \frac{1}{2} L - \frac{1}{2} L^\prime \\ x_\text{max} &= \frac{1}{2} L + \frac{1}{2} L^\prime \\ \end{align}\end{split}\]

The following Python code defines the potential and 100 basis functions on the interval $x = [x_\text{min}, x_\text{max}]\text{.}$

# set some parameters (in atomic units, where hbar = 1)
# these should be consistent with those used in the previous notebook
hbar = 1.0
m = 1.0 # mass of particle
L = 1.0 # length of box (from 0 to L)
V0 = 10.0

# box width for PIB basis functions
Lprime = 10 * L

# number of basis functions
nbf = 100

# min x value ...
minx = 0.5 * L - 0.5 * Lprime

# max x value
maxx = 0.5 * L + 0.5 * Lprime

dx = 0.01
x = np.linspace(minx, maxx, int((maxx-minx)/dx+1))

# define the potential over this interval
V = []
for myx in x:
    if myx < 0:
        V.append(V0)
    elif myx < L:
        V.append(0.0)
    else:
        V.append(V0)

# define basis functions, f
f = []
for n in range (0, nbf):
    
    # define basis function n over all space
    fn = []
    for myx in x:
        # don't forget to shift myx by minx because the PIB functions are supposed to go from 0 to L'
        fn.append( np.sqrt(2.0 / Lprime) * np.sin((n+1) * np.pi * (myx-minx) / Lprime) )

    # add this basis function to our list of basis functions
    f.append(fn)

f = np.array(f)

Having defined our basis functions, we can now evaluate the Hamiltonian matrix elements,

\[\begin{split}\begin{align} H_{ij} &= \langle f_i | \hat{H} | f_j \rangle \\ &= \langle f_i | \hat{T} | f_j \rangle + \langle f_i | V(x) | f_j \rangle \\ &= \frac{\hbar^2}{2m} \langle f_i^\prime | f_j^\prime \rangle + \langle f_i | V(x) | f_j \rangle \\ \end{align}\end{split}\]

where on the third line, we have again taken advantage of the fact that the expectation value of the kinetic energy operator can be expressed in terms of an overlap between gradients of the basis functions. The following Python code evaluates these matrix elements, with the gradients and integrals handled numerically.

H = np.zeros( [nbf, nbf], dtype = np.float64)

# gradient of basis functions
df = []
for i in range (0, nbf):
    df.append(np.gradient(f[i], x))

for i in range (0, nbf):
    for j in range (0, nbf):

        # potential energy term: <fi|V|fj>
        H[i, j] = np.trapz(f[i] * V * f[j], x = x)
        
        # kinetic energy term
        H[i, j] += hbar**2 / (2.0 * m) * np.trapz(df[i] * df[j], x = x)

We now can use NumPy to solve the eigenvalue problem

\[\begin{align} {\bf H} {\bf C} = {\bf C}{\bf W} \end{align}\]

w, c = np.linalg.eigh(H)

print('W1 = %8.5f Eh' % (w[0]))
print('W2 = %8.5f Eh' % (w[1]))

Recall that the exact energies for this problem that we determined in an earlier notebook were

\[\begin{split}\begin{align} E_1 &= 2.29499 E_\text{h} \\ E_2 &= 8.13715 E_\text{h} \end{align}\end{split}\]

Wow! $W_1$ and $W_2$ agree with these values to within less that 0.1%! However, note that they are not strict upper bounds to the exact energy. There are several possibilities for why this is the case:

The PIB basis functions do not satisfy the boundary conditions for the finite square well potential problem because they have a derivative discontinuity at $x = x_\text{min}$ and $x = x_\text{max}\text{.}$ As a result, there will not be a strict guarantee that the energy estimates obtained by diagonalizing the matrix representation of the Hamiltonian will be upper bounds to the true energies for this problem.
The potential and kinetic energy matrix elements are evaluated numerically, which introduces some numerical error.
The kinetic energy matrix elements were also evaluated using gradients that were determined numerically, which introduces additional numerical error.

Lastly, we should visualize the variationally optimized wave functions. To do so, we will need to contract the eigenvectors with the basis functions as

\[\begin{align} |\phi^{(i)}\rangle = \sum_j c^{(i)}_{j} |f_j\rangle \end{align}\]

which can be achieved using NumPy’s einsum function.

phi = np.einsum('ji,jk->ik', c, f)

plt.figure()
plt.plot(x, phi[0], label = r'$\phi_1$')
plt.plot(x, phi[1], label = r'$\phi_2$')
plt.ylabel('wave function')
plt.xlabel(r'x ($a_0$)')
plt.legend()
plt.show()

If you plot these approximate wave functions alongside the exact wave functions determined in a previous notebook, you will find that they are indistinguishable on the scale of this figure.

Chapter 7: The Variation Theorem

Contents

Chapter 7: The Variation Theorem#

The Variation Theorem for Ground States#

The Variation Theorem for Excited States#

The Linear Variation Method#

Matrices in Quantum Mechanics#

Vectors in Quantum Mechanics#

The Secular Equation#

The Variation Problem as an Eigenvalue Problem#

Approach#

Form of the trial wavefunction#

Form of the energy functional of the trial wavefunction#

Atomic Units#

Solving the Linear Variational Problem#

Questions Part 1:#

Variational optimization#

Visualizing the variational ground-state#

Questions Part 2:#

Behavior of Total Energy, Kinetic Energy, and Potential Energy functionals with basis set size.#

Finite Well#