2.5 Hermitian Operators

The goal of this section is to think a little bit more about what kinds of operators we can have in quantum mechanics. We have seen that operators represent observable properties of a quantum system, with their eigenvalues representing possible measurement outcomes, but is this the case for all operators? Do all operators represent observable quantities? If not, how can we tell if an operator does represent an observable?

It turns out that only a special subset of operators represent observables. These are the so-called Hermitian operators; being Hermitian is a mathematical property of the operator. We will go over what this means and why Hermitian operators are observables below, but first we need to introduce another property of operators, the adjoint.

2.5.1 The Adjoint

For every operator $\hat{O}$ we can define a generalisation of the complex conjugate of a scalar called the adjoint, or Hermitian conjugate, denoted with $\hat{O}^{\dagger}$ . But how do we define the complex conjugate of an operator? In a finite dimensional Hilbert space where operators are matrices, the generalisation of the complex conjugate is the conjugate transpose. But how do we define the conjugate transpose of an operator in an infinite dimensional Hilbert space?

Instead of trying to wrap our heads around what different types of operators in infinite dimensions can look like and trying to define a generalisation of conjugate transpose for them, let’s look at how the conjugate transpose acts on vectors. In finite dimensions, we have that for a complex vector $\boldsymbol{u}$ and a matrix $A$ :

(A\boldsymbol{u})^{\dagger}=\boldsymbol{u}^{\dagger}A^{\dagger},

(2.91)

where the $\dagger$ denotes conjugate transpose. This means for the inner product of $A\boldsymbol{u}$ with another complex vector $\boldsymbol{v}$ we have

A\boldsymbol{u}\cdot\boldsymbol{v}=(A\boldsymbol{u})^{\dagger}\boldsymbol{v}=\boldsymbol{u}^{\dagger}A^{\dagger}\boldsymbol{v}=\boldsymbol{u}\cdot(A^{\dagger}\boldsymbol{v}).

(2.92)

In Dirac notation this would be written as

\langle\hat{A}u\mathclose{}|\mathopen{}v\rangle=\langle u\mathclose{}|\mathopen{}\hat{A}^{\dagger}\mathclose{}|\mathopen{}v\rangle.

(2.93)

We can now simply define the adjoint of an operator in an infinite dimensional Hilbert space to satisfy this relation.

Definition 2.1.

For an operator $\hat{O}$ , the adjoint, or Hermitian conjugate, of $\hat{O}$ , denoted $\hat{O}^{\dagger}$ , is the operator which for any states $\lvert\psi\rangle$ and $\lvert\phi\rangle$ satisfies:

\langle\hat{O}\psi\mathclose{}|\mathopen{}\phi\rangle=\langle\psi\mathclose{}|\mathopen{}\hat{O}^{\dagger}\mathclose{}|\mathopen{}\phi\rangle,

(2.94)

or in the position basis:

\int_{-\infty}^{\infty}(\hat{O}\psi)^{\ast}\phi\operatorname{\mathrm{d}}\!x=\int_{-\infty}^{\infty}\psi^{\ast}\hat{O}^{\dagger}\phi\operatorname{\mathrm{d}}\!x.

(2.95)
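In finite dimensions the defining relation 2.94 can be checked directly. The following is a minimal sketch in plain Python, assuming a 2×2 complex matrix stands in for the operator; the helper names `inner`, `matvec` and `dagger` and the sample numbers are illustrative choices, not part of the notes.

```python
# Check <A u|v> = <u|A^dagger v> for a small complex matrix.
# All names and numbers here are illustrative assumptions.

def inner(u, v):
    """Inner product <u|v>, conjugating the first argument."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

def matvec(A, u):
    """Matrix-vector product A u, with A stored as a list of rows."""
    return [sum(a * x for a, x in zip(row, u)) for row in A]

def dagger(A):
    """Conjugate transpose of a matrix stored as a list of rows."""
    return [[A[j][i].conjugate() for j in range(len(A))]
            for i in range(len(A[0]))]

A = [[1 + 2j, 3j], [4 + 0j, 5 - 1j]]
u = [1 + 1j, 2 - 1j]
v = [0.5j, 1 + 3j]

lhs = inner(matvec(A, u), v)          # <A u|v>
rhs = inner(u, matvec(dagger(A), v))  # <u|A^dagger v>
print(abs(lhs - rhs) < 1e-12)
```

The two sides agree up to rounding for any choice of `A`, `u` and `v`, which is what makes the relation usable as a definition when no matrix is available.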

Some operators are equal to their own adjoint:

\hat{O}^{\dagger}=\hat{O}.

(2.96)

These operators are called self-adjoint, or more commonly among physicists, Hermitian.

Definition 2.2.

A Hermitian operator is an operator $\hat{O}$ that is equal to its own adjoint $\hat{O}^{\dagger}$ . Hermitian operators can therefore be defined by the following identity:

\langle\hat{O}\psi\mathclose{}|\mathopen{}\phi\rangle=\langle\psi\mathclose{}|\mathopen{}\hat{O}\mathclose{}|\mathopen{}\phi\rangle,

(2.97)

or in the position basis:

\int_{-\infty}^{\infty}(\hat{O}\psi)^{\ast}\phi\operatorname{\mathrm{d}}\!x=\int_{-\infty}^{\infty}\psi^{\ast}\hat{O}\phi\operatorname{\mathrm{d}}\!x.

(2.98)

Hermitian operators can be viewed as the “real numbers” of operators.

Before moving on to defining what an observable is, we will calculate the adjoint of a product of two operators. By definition of the adjoint (2.94), we have

\langle\hat{A}\hat{B}\psi\mathclose{}|\mathopen{}\phi\rangle=\langle\psi\mathclose{}|\mathopen{}(\hat{A}\hat{B})^{\dagger}\mathclose{}|\mathopen{}\phi\rangle.

(2.99)

However, by defining $\lvert\chi\rangle=\hat{B}\lvert\psi\rangle$ , we can apply the definition of the adjoint differently to get

\langle\hat{A}\chi\mathclose{}|\mathopen{}\phi\rangle=\langle\chi\mathclose{}|\mathopen{}\hat{A}^{\dagger}\mathclose{}|\mathopen{}\phi\rangle=\langle\hat{B}\psi\mathclose{}|\mathopen{}\hat{A}^{\dagger}\mathclose{}|\mathopen{}\phi\rangle.

(2.100)

Then, defining $\lvert\upsilon\rangle=\hat{A}^{\dagger}\lvert\phi\rangle$ and applying the definition of the adjoint one more time, we have

\langle\hat{B}\psi\mathclose{}|\mathopen{}\upsilon\rangle=\langle\psi\mathclose{}|\mathopen{}\hat{B}^{\dagger}\mathclose{}|\mathopen{}\upsilon\rangle=\langle\psi\mathclose{}|\mathopen{}\hat{B}^{\dagger}\hat{A}^{\dagger}\mathclose{}|\mathopen{}\phi\rangle.

(2.101)

Comparing this to equation 2.99, we have shown that

(\hat{A}\hat{B})^{\dagger}=\hat{B}^{\dagger}\hat{A}^{\dagger}.

(2.102)

So the adjoint of the product of two operators is equal to the product of the adjoints in the opposite order. Note that these equations could equivalently be written with the position basis integral notation, but we will use Dirac notation from now on because it is less cumbersome.
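The reversal of order in equation 2.102 is easy to see with matrices. The sketch below, in plain Python, assumes two arbitrary 2×2 complex matrices in place of $\hat{A}$ and $\hat{B}$; the helpers `dagger`, `matmul` and `close` are hypothetical names introduced only for this check.

```python
# Compare (AB)^dagger with B^dagger A^dagger and with A^dagger B^dagger.
# Matrices are lists of rows; the entries are arbitrary sample values.

def dagger(A):
    """Conjugate transpose."""
    return [[A[j][i].conjugate() for j in range(len(A))]
            for i in range(len(A[0]))]

def matmul(A, B):
    """Matrix product A B."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def close(M, N, tol=1e-12):
    """True if two matrices agree entry by entry within tol."""
    return all(abs(m - n) < tol
               for row_m, row_n in zip(M, N)
               for m, n in zip(row_m, row_n))

A = [[1 + 1j, 2], [0, 3 - 2j]]
B = [[1j, 1], [2 - 1j, 4]]

lhs = dagger(matmul(A, B))                   # (AB)^dagger
reversed_order = matmul(dagger(B), dagger(A))  # B^dagger A^dagger
same_order = matmul(dagger(A), dagger(B))      # A^dagger B^dagger

print(close(lhs, reversed_order))  # the identity derived above
print(close(lhs, same_order))      # fails for these non-commuting matrices
```

For this pair of matrices only the reversed order reproduces $(\hat{A}\hat{B})^{\dagger}$; the same-order product differs because the two matrices do not commute.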

2.5.2 What are Observables?

Let us think about what properties we require of operators for them to represent physical observables. We would like that all the eigenvalues are real, since they represent measurement outcomes. Additionally, we also require that eigenvectors corresponding to distinct eigenvalues are orthogonal, otherwise our probabilistic interpretation of state vectors in Hilbert space wouldn’t work. Finally, we would like that the eigenvectors of an observable form a basis for Hilbert space, so that any valid state of the system can be expressed as a superposition of eigenstates.

Looking at the first property, we will first show that all eigenvalues of an operator being real is equivalent to the expectation value of the operator for any valid state being real. Let $\hat{O}$ be an operator with eigenstates $\lvert i\rangle$ and eigenvalues $O_{i}$ , so

\hat{O}\lvert i\rangle=O_{i}\lvert i\rangle\quad\forall i.

(2.103)

Suppose the expectation value $\langle\hat{O}\rangle_{\psi}$ is real for any state $\lvert\psi\rangle$ . Then, in particular, the expectation value of $\hat{O}$ for each normalised eigenstate is

\langle\hat{O}\rangle_{i}=\langle i\mathclose{}|\mathopen{}\hat{O}\mathclose{}|\mathopen{}i\rangle=O_{i}\langle i\mathclose{}|\mathopen{}i\rangle=O_{i},

(2.104)

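which implies that $O_{i}$ must be real. Conversely, if we assume that all of the $O_{i}$ are real and that the eigenstates form an orthonormal basis, so that an arbitrary state can be expanded as $\lvert\psi\rangle=\sum_{i}c_{i}\lvert i\rangle$ , then the expectation value of $\hat{O}$ for $\lvert\psi\rangle$ is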

\langle\hat{O}\rangle_{\psi}=\langle\psi\mathclose{}|\mathopen{}\hat{O}\mathclose{}|\mathopen{}\psi\rangle=\sum_{i,j}c_{i}^{\ast}c_{j}\langle i\mathclose{}|\mathopen{}\hat{O}\mathclose{}|\mathopen{}j\rangle=\sum_{i,j}c_{i}^{\ast}c_{j}O_{j}\langle i\mathclose{}|\mathopen{}j\rangle=\sum_{i,j}c_{i}^{\ast}c_{j}O_{j}\delta_{i,j}=\sum_{i}\lvert c_{i}\rvert^{2}O_{i},

(2.105)

which is a sum of real numbers and is therefore itself real.

Now, note that the complex conjugate of an expectation value is given by

\langle\hat{O}\rangle_{\psi}^{\ast}=(\langle\psi\mathclose{}|\mathopen{}\hat{O}\mathclose{}|\mathopen{}\psi\rangle)^{\ast}=\langle\hat{O}\psi\mathclose{}|\mathopen{}\psi\rangle.

(2.106)

Finally, for the expectation value of an operator $\hat{O}$ to be real, we must have

\langle\hat{O}\rangle_{\psi}^{\ast}=\langle\hat{O}\rangle_{\psi}

(2.107)

\implies\langle\hat{O}\psi\mathclose{}|\mathopen{}\psi\rangle=\langle\psi\mathclose{}|\mathopen{}\hat{O}\mathclose{}|\mathopen{}\psi\rangle,

(2.108)

and this is exactly the defining identity of a Hermitian operator (2.97) with $\lvert\phi\rangle=\lvert\psi\rangle$ ! So if $\hat{O}$ is Hermitian, all of its expectation values, and hence all of its eigenvalues, are real, and it seems to be a valid candidate for an observable. The converse also holds under the assumption used in equation 2.105: if an operator has an orthonormal basis of eigenstates and all of its eigenvalues are real, then that operator is Hermitian. This equivalence leads us to believe that all observables are Hermitian operators (and vice versa!).
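The link between Hermiticity and real expectation values can be illustrated with 2×2 matrices. This is a minimal sketch in plain Python, assuming a hand-picked Hermitian matrix `H`, a non-Hermitian matrix `N` and a normalised state `psi`; none of these specific values come from the notes.

```python
# Expectation values <psi|A|psi> for a Hermitian and a non-Hermitian matrix.
# H, N and psi are illustrative choices.

def expectation(A, psi):
    """<psi|A|psi> for a normalised state psi (A is a list of rows)."""
    A_psi = [sum(a * x for a, x in zip(row, psi)) for row in A]
    return sum(p.conjugate() * y for p, y in zip(psi, A_psi))

s = 2 ** -0.5
psi = [s, 1j * s]                # normalised: |s|^2 + |i s|^2 = 1

H = [[2, 1 - 1j], [1 + 1j, 3]]   # equal to its own conjugate transpose
N = [[0, 1], [0, 0]]             # not equal to its conjugate transpose

exp_H = expectation(H, psi)
exp_N = expectation(N, psi)
print(abs(exp_H.imag) < 1e-12)   # real for the Hermitian matrix
print(abs(exp_N.imag) < 1e-12)   # not real for the non-Hermitian one
```

For this state the Hermitian matrix gives an expectation value of 7/2 up to rounding, while `N` gives i/2. Note that the only eigenvalue of `N` is 0, which is real, yet `N` is not Hermitian: it has no orthonormal basis of eigenvectors, so the expansion used in equation 2.105 is not available for it.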

What about the other properties? It turns out that Hermitian operators fulfill them too.

Consider a Hermitian operator $\hat{O}$ and let $\lvert i\rangle$ and $\lvert j\rangle$ be eigenstates of $\hat{O}$ with eigenvalues $O_{i}$ and $O_{j}$ respectively. Then because $\hat{O}$ is Hermitian, we have that

\langle i\mathclose{}|\mathopen{}\hat{O}\mathclose{}|\mathopen{}j\rangle-\langle j\mathclose{}|\mathopen{}\hat{O}\mathclose{}|\mathopen{}i\rangle^{\ast}=0,

(2.109)

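but by expanding these inner products, using that $O_{i}$ is real and that $\langle j\mathclose{}|\mathopen{}i\rangle^{\ast}=\langle i\mathclose{}|\mathopen{}j\rangle$ , we find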

\langle i\mathclose{}|\mathopen{}\hat{O}\mathclose{}|\mathopen{}j\rangle-\langle j\mathclose{}|\mathopen{}\hat{O}\mathclose{}|\mathopen{}i\rangle^{\ast}=O_{j}\langle i\mathclose{}|\mathopen{}j\rangle-O_{i}\langle j\mathclose{}|\mathopen{}i\rangle^{\ast}

(2.110)

=(O_{j}-O_{i})\langle i\mathclose{}|\mathopen{}j\rangle.

(2.111)

Thus if $O_{i}\neq O_{j}$ then $\langle i\mathclose{}|\mathopen{}j\rangle=0$ , so the eigenstates must be orthogonal. What if $O_{i}=O_{j}$ ? Then any linear combination of $\lvert i\rangle$ and $\lvert j\rangle$ is also an eigenvector with the same eigenvalue! Luckily, it is always possible to find two linear combinations that are orthogonal. We will come back to study this case in more detail in chapter 4.
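The orthogonality argument can be checked on a concrete case. The sketch below, in plain Python, assumes a sample 2×2 Hermitian matrix `H` whose eigenvectors were worked out by hand (eigenvalues 1 and 4); the matrix, the vectors and the helper names are illustrative, not taken from the notes.

```python
# Eigenvectors of a Hermitian matrix with distinct eigenvalues are orthogonal.
# H, v1 and v4 are hand-picked illustrative values.

def inner(u, v):
    """Inner product <u|v>, conjugating the first argument."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

def matvec(A, u):
    """Matrix-vector product A u, with A stored as a list of rows."""
    return [sum(a * x for a, x in zip(row, u)) for row in A]

H = [[2, 1 - 1j], [1 + 1j, 3]]   # Hermitian
v1 = [-1 + 1j, 1]                # eigenvector with eigenvalue 1
v4 = [1 - 1j, 2]                 # eigenvector with eigenvalue 4

# Confirm the eigenvalue equations H v = lambda v.
is_eig1 = all(abs(a - 1 * b) < 1e-12 for a, b in zip(matvec(H, v1), v1))
is_eig4 = all(abs(a - 4 * b) < 1e-12 for a, b in zip(matvec(H, v4), v4))

overlap = inner(v1, v4)          # <v1|v4>
print(is_eig1, is_eig4, abs(overlap) < 1e-12)
```

The eigenvectors are left unnormalised here, since orthogonality does not depend on their length; the overlap evaluates to zero as the argument above requires.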

Since the Hamiltonian is Hermitian (it represents total energy which is observable), this implies that the energy eigenfunctions (with distinct eigenvalues) are mutually orthogonal. We already saw this was the case for the infinite square well, but this implies that it is true for any system.

Finally, it can be shown that eigenvectors of Hermitian operators form a basis for the Hilbert space. This means that for a set of eigenvectors $\lvert i\rangle$ of a Hermitian operator $\hat{O}$ , any state $\lvert\psi\rangle$ can be written

\lvert\psi\rangle=\sum_{i}c_{i}\lvert i\rangle=\sum_{i}\lvert i\rangle\langle i\mathclose{}|\mathopen{}\psi\rangle.

(2.112)

Here we have used the orthonormality of the eigenvectors to show that the coefficients are $c_{i}=\langle i\mathclose{}|\mathopen{}\psi\rangle$ (as we saw when we were expanding an arbitrary state in terms of the energy eigenfunctions for the infinite square well in section 2.4.3).

We can pull an interesting fact from this equation. If we put brackets around all the terms including $i$ ,

\lvert\psi\rangle=\left(\sum_{i}\lvert i\rangle\langle i\rvert\right)\lvert\psi\rangle,

(2.113)

then since this equation holds for any state $\lvert\psi\rangle$ , the term in the brackets must be equal to the identity operator!

\sum_{i}\lvert i\rangle\langle i\rvert=I.

(2.114)

This is called a resolution of the identity. This is a fact that comes in useful sometimes for evaluating complicated expressions and we will use it later on. In particular, it lets us write down the diagonal representation of $\hat{O}$ , which is given by

\hat{O}=\sum_{i}O_{i}\lvert i\rangle\langle i\rvert.

(2.115)
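Equations 2.114 and 2.115 can be verified for a small matrix. The following plain-Python sketch assumes a sample 2×2 Hermitian matrix `H` with hand-computed normalised eigenvectors `e1` and `e4` (eigenvalues 1 and 4); these values and the helper `outer` are illustrative assumptions.

```python
# Resolution of the identity and diagonal representation for a 2x2 example.
from math import sqrt

def outer(u):
    """Projector |u><u| as a list of rows."""
    return [[a * b.conjugate() for b in u] for a in u]

H = [[2, 1 - 1j], [1 + 1j, 3]]                # Hermitian sample matrix

# Normalised eigenvectors of H with eigenvalues 1 and 4.
e1 = [x / sqrt(3) for x in (-1 + 1j, 1)]
e4 = [x / sqrt(6) for x in (1 - 1j, 2)]
P1, P4 = outer(e1), outer(e4)

# sum_i |i><i| should be the identity matrix.
identity = [[P1[i][j] + P4[i][j] for j in range(2)] for i in range(2)]

# sum_i O_i |i><i| should reproduce H.
rebuilt = [[1 * P1[i][j] + 4 * P4[i][j] for j in range(2)] for i in range(2)]

print(identity)
print(rebuilt)
```

Up to rounding, `identity` is the 2×2 identity matrix and `rebuilt` matches `H` entry by entry, so the eigenvalues and projectors together carry all the information in the operator.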

Example 2.1.

Consider the operator $\hat{D}=\frac{\partial}{\partial x}$ . Does this operator represent an observable?

The easiest way to check this is to calculate the adjoint $\hat{D}^{\dagger}$ using the position basis definition 2.95. We will start from the left hand side and integrate by parts:

\int_{-\infty}^{\infty}(\hat{D}\psi)^{\ast}\phi\operatorname{\mathrm{d}}\!x=\int_{-\infty}^{\infty}\frac{\partial\psi^{\ast}}{\partial x}\phi\operatorname{\mathrm{d}}\!x

(2.116)

=\left.\psi^{\ast}\phi\right|_{-\infty}^{\infty}-\int_{-\infty}^{\infty}\psi^{\ast}\frac{\partial\phi}{\partial x}\operatorname{\mathrm{d}}\!x

(2.117)

=\int_{-\infty}^{\infty}\psi^{\ast}(-\hat{D})\phi\operatorname{\mathrm{d}}\!x,

(2.118)

where the boundary term vanishes because normalisable wavefunctions go to zero as $x\to\pm\infty$ .

By the definition of the adjoint, this is equal to the right hand side of equation 2.95, therefore we have

\hat{D}^{\dagger}=-\hat{D}=-\frac{\partial}{\partial x},

(2.119)

so $\hat{D}$ is not Hermitian and therefore cannot be an observable.
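A discretised version of this result can be checked numerically. The sketch below, in plain Python, assumes a central-difference approximation to $\partial/\partial x$ on a uniform grid with the wavefunction taken to vanish outside the grid; this discretisation, the function `central_difference` and the grid parameters are assumptions made for illustration, not part of the example above.

```python
# Central-difference matrix for d/dx and its conjugate transpose.
# Assumes a uniform grid and zero values outside the grid.

def central_difference(n, h):
    """n x n matrix D with (D f)_i = (f_{i+1} - f_{i-1}) / (2 h)."""
    D = [[0.0] * n for _ in range(n)]
    for i in range(n):
        if i + 1 < n:
            D[i][i + 1] = 1 / (2 * h)
        if i - 1 >= 0:
            D[i][i - 1] = -1 / (2 * h)
    return D

def dagger(A):
    """Conjugate transpose of a matrix stored as a list of rows."""
    return [[A[j][i].conjugate() for j in range(len(A))]
            for i in range(len(A[0]))]

n = 6
D = central_difference(n, 0.1)
D_dag = dagger(D)

# D^dagger = -D mirrors equation 2.119; D^dagger = D would mean Hermitian.
equals_minus_D = all(D_dag[i][j] == -D[i][j]
                     for i in range(n) for j in range(n))
equals_D = D_dag == D
print(equals_minus_D, equals_D)
```

The matrix satisfies $D^{\dagger}=-D$ rather than $D^{\dagger}=D$ , in line with the integration-by-parts result; the zero-boundary assumption plays the role of the vanishing boundary term.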
