There is a clear relationship between pdf and cdf. Moreover, we can derive a normalization condition for the pdf.
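Concretely, the pdf is the derivative of the cdf, and it must integrate to one over the whole real line: $$f_X(x)=\frac{dF_X(x)}{dx}, \qquad \int^{+\infty}_{-\infty}f_X(x)dx=1$$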
The expected value (sometimes expectation value or stochastic mean value) of $X$ is defined as:
$$E[X] = \sum_i P[X=x_i]x_i$$or in the continuous case: $$E[X] = \int^{+\infty}_{-\infty}xdF_X = \int^{+\infty}_{-\infty}xf_X(x)dx$$
A function of a random variable is a random variable itself, so:
$$E[g(X)]=\int^{+\infty}_{-\infty}g(x)f_X(x)dx$$This last result is known as the law of the unconscious statistician, and it is essentially a nice application of the change-of-variables rule.
Stochastic quantities are meaningless without a proper estimation of the error sources. The square root of the variance is a measure of the dispersion of a random variable. In the continuous case, the variance can be computed as:
$$\sigma^2 = \text{Var}[X]=E[(X-E[X])^2]=E[X^2]-E[X]^2$$Prove the above relationship.
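This is not a proof, but a quick numerical sanity check is easy to run (assuming NumPy; the exponential distribution below is just an arbitrary example):

import numpy as np

rng = np.random.default_rng(0)
x = rng.exponential(scale=2.0, size=1_000_000)  # arbitrary example distribution, true variance = 4
print(np.mean((x - np.mean(x)) ** 2))           # E[(X - E[X])^2]
print(np.mean(x ** 2) - np.mean(x) ** 2)        # E[X^2] - E[X]^2, numerically the same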
Draw samples by constructing a Markov chain whose stationary distribution is the desired target distribution $P(x)$. Provided that our transition probability $P(x_2 \mid x_1)$ can be separated into the product of a proposal probability $g(x_2\mid x_1)$ and an acceptance probability $A(x_2, x_1)$, and that detailed balance is verified:
$$P(x_2 \mid x_1)P(x_1)=P(x_1 \mid x_2)P(x_2)$$The algorithm proceeds as follows: starting from the current point $x_1$, propose a new point $x_2$ with probability $g(x_2\mid x_1)$ and accept it with probability $A(x_2, x_1)=\min\left(1,\ \frac{P(x_2)\,g(x_1\mid x_2)}{P(x_1)\,g(x_2\mid x_1)}\right)$; if the move is rejected, keep $x_1$ as the next sample.
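A minimal Python sketch of such a sampler (not from the original notes), assuming a symmetric Gaussian proposal $g$ so that the proposal factors cancel and the acceptance reduces to the ratio of target densities, and taking an unnormalized target density p as input:

import numpy as np

def metropolis(p, x0, nsteps, step=0.5, seed=None):
    """Sample a 1D target p(x) with a symmetric Gaussian proposal and Metropolis acceptance."""
    rng = np.random.default_rng(seed)
    x = x0
    samples = np.empty(nsteps)
    for i in range(nsteps):
        x_new = x + rng.normal(scale=step)              # proposal g(x2 | x1), symmetric
        if rng.uniform() < min(1.0, p(x_new) / p(x)):   # acceptance A(x2, x1)
            x = x_new                                   # accept the move
        samples[i] = x                                  # otherwise keep the current point
    return samples

# Example: sample from an (unnormalized) standard normal target
chain = metropolis(lambda x: np.exp(-0.5 * x**2), x0=0.0, nsteps=10_000, seed=0)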
Let $X_1, X_2, \ldots, X_N$ be a sequence of independent, identically distributed (i.i.d.) random variables with $E[X_i]=\mu$. For large $N$, the average of the $X_i$:
$$\bar{X}_N=\frac{1}{N}\sum_{i=1}^N X_i$$converges almost surely to $\mu$:
$$P\left[\lim_{N\to\infty}\bar{X}_N=\mu \right]=1$$Let $X_1, X_2, \ldots$ be a sequence of independent, identically distributed (i.i.d.) random variables with $E[X_i]=\mu$ and variance $\sigma^2 < \infty$. As the sample size $N$ increases we get:
$$\sqrt{N}\,(\bar{X}_N-\mu)\rightarrow \mathcal{N}(0, \sigma^2)$$In other words, regardless of the original distribution of the random variables, the pdf of the average tends to a normal distribution.
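A quick numerical illustration of both results (assuming NumPy; uniform variables on $[0,1]$, for which $\sigma^2=1/12$):

import numpy as np

rng = np.random.default_rng(0)
N, trials = 100, 10_000
xbar = rng.uniform(0, 1, size=(trials, N)).mean(axis=1)  # 10_000 independent averages of N samples
print(xbar.mean())                      # close to mu = 0.5 (law of large numbers)
print(xbar.std(), np.sqrt(1 / 12 / N))  # spread of the averages vs sigma / sqrt(N)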
The expected value of an observable $O$ in a canonical discrete ensemble is given by:
$$\langle O\rangle = \frac{\sum_i O_i e^{-\beta E_i}}{\sum_i e^{-\beta E_i}}$$How can we compute this in practice?
Let's recast the expression into something more familiar:
$$\langle O\rangle = \sum_i O_i \frac{e^{-\beta E_i}}{\sum_j e^{-\beta E_j}}=\sum_i p_i O_i$$Then, if we sample long enough, the law of large numbers offers us an estimator of the expected value:
$$\langle O\rangle_p = \frac{1}{N}\sum_{i=1}^N O_i$$and the averages of the samples drawn with probability $p_i$ will have a normal distribution, according to the central limit theorem.
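As a toy check of this estimator (the three-level system below is made up purely for illustration), we can draw states directly with probability $p_i$ and compare the sample mean with the exact weighted average:

import numpy as np

rng = np.random.default_rng(0)
E_levels = np.array([0.0, 1.0, 2.0])   # energies E_i (illustrative values)
O_values = np.array([1.0, -1.0, 0.5])  # observable O_i (illustrative values)
beta = 1.0

w = np.exp(-beta * E_levels)
p = w / w.sum()                        # Boltzmann probabilities p_i
print(np.sum(p * O_values))            # exact <O> = sum_i p_i O_i

samples = rng.choice(O_values, size=100_000, p=p)  # O_i drawn with probability p_i
print(samples.mean())                  # the sample average converges to <O>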
We will focus on variational Monte Carlo (VMC) in this course, but you should know that other flavors exist, e.g. diffusion Monte Carlo (DMC). For VMC, we need three main ingredients: a trial wave function, a way to sample from its probability density, and an estimator of the energy (the local energy defined below).
The main idea is to calculate the multidimensional integrals appearing in quantum mechanics using Monte Carlo numerical integration. The most important quantity is the variational energy $E_v$, which is associated with a Hamiltonian $H$ and a wave function $\Psi$:
$$E_v=\frac{\langle \Psi|H|\Psi \rangle}{\langle \Psi|\Psi \rangle}=\frac{\int dR \Psi(R)^2 E_L(R)}{\int dR \Psi(R)^2}=\int dR E_L(R) \rho(R)$$where we have defined the local energy: $$E_L(R)=\frac{H\Psi(R)}{\Psi(R)}$$ and the normalized probability density: $$\rho(R)=\frac{\Psi(R)^2}{\int dR \Psi(R)^2}$$
In VMC the variance can be computed as: $$\sigma^2=\int dR (E_L(R)-E_v)^2\rho(R)$$ The corresponding estimator over $M$ samples $R_k$ is: $$\sigma^2\approx\bar{\sigma}^2=\frac{1}{M}\sum_{k=1}^M (E_L(R_k)-\bar{E}_v)^2$$
A very powerful result then follows from the central limit theorem. Using $M$ Monte Carlo samples, the integration error is given by:
$$E_v\pm\delta E_v$$$$\delta E_v \sim \frac{\sigma}{\sqrt{M}}$$independent of the dimensionality of the problem!
Since we do not need to evaluate the integrals analytically, we have much more freedom in the choice of the wave function (WF). A popular choice is the so-called Jastrow-Slater form:
$$\Psi(R)=J(R)\Phi(R)$$where $J(R)$ is the Jastrow factor and $\Phi(R)$ is a Slater determinant or a linear combination of them.
The Jastrow factor has the general functional form:
$$J(R)=e^{f(R)}$$where $f(R)$ is a function of the interparticle distances, and it captures the dynamic correlation effects of the wave function. In practice, the Jastrow factor is separated into three components:
$$J(R)=J_{ee}(R_{ij})J_{en}(R_{i\alpha})J_{een}(R_{i\alpha},R_{j\alpha},R_{ij})$$A good Jastrow factor is critical for the correct treatment of the so-called cusp conditions.
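As an illustration only (the notes do not fix a specific functional form), a common choice for the electron-electron term is a Padé form, $J_{ee}(r_{ij})=\exp\left(\frac{b_1 r_{ij}}{1+b_2 r_{ij}}\right)$, where setting $b_1=1/2$ reproduces the electron-electron cusp condition discussed next; a minimal sketch with a hypothetical parameter $b_2$:

import numpy as np

def jastrow_ee(r_ij, b1=0.5, b2=1.0):
    """Pade-type electron-electron Jastrow term exp(b1*r / (1 + b2*r)).
    The derivative of the exponent at r = 0 equals b1, which sets the e-e cusp."""
    return np.exp(b1 * r_ij / (1.0 + b2 * r_ij))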
We want the local energy $E_L(R)$ to be a finite quantity:
$$\frac{H\Psi}{\Psi}=-\frac{1}{2}\sum_i\frac{\nabla_i^2 \Psi}{\Psi}-\sum_{i,\alpha}\frac{Z_\alpha}{R_{i\alpha}}+\sum_{i<j}\frac{1}{R_{ij}}$$For small interparticle distances the potential energy diverges, so we need to ensure that (Kato's cusp conditions):
$$\frac{\Psi'}{\Psi}\Bigr|_{R_{i\alpha}=0}=-Z_\alpha, \qquad \frac{\Psi'}{\Psi}\Bigr|_{R_{ij}=0}=\frac{1}{2}$$As an example, let's compute the variational energy of the hydrogen atom. Recall the expression:
$$E_v=\frac{\int dR \Psi(R)^2 E_L(R)}{\int dR \Psi(R)^2}$$We can then use importance sampling:
$$E_v=\frac{\int dR \frac{\Psi(R)^2}{P(R)}P(R) E_L(R)}{\int dR \frac{\Psi(R)^2}{P(R)}P(R)}$$What is a natural choice for $P(R)$?
The uniform probability density in a cube of edge $L$, centered on the origin:
$$P(R)=\frac{1}{L^3}$$The recipe is then:
1. Run $N_{MC}$ steps: at each step, draw a point $R=R(x, y, z)$ uniformly inside the cube and accumulate $\Psi(R)^2$ and $\Psi(R)^2 E_L(R)$ evaluated at that point.
2. Compute the variational energy and the corresponding statistical error.

Ready, set, ...
# Code!
import numpy as np
The Coulomb potential for the electron is:
$$V(R)=-\frac{1}{\sqrt{x^2+y^2+z^2}}$$

def pot(R):
    """Computes the Coulomb potential for the electron"""
    return -1 / np.sqrt(np.dot(R, R))
We consider the following model wave function:
$$\Psi(R)=e^{-a|R|}$$

def psi(a, R):
    """Computes hydrogen wave function"""
    return np.exp(-a * np.sqrt(np.dot(R, R)))
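Note, in passing, that for this trial function $\Psi'/\Psi=-a$ at the nucleus, so the nuclear cusp condition above is satisfied exactly only when $a=Z=1$.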
Differentiate the expression in the middle twice to get the last one:
$$T_L(R)=-\frac{1}{2}\frac{\nabla^2 \Psi(R)}{\Psi(R)}=-\frac{1}{2}\left(a^2 - \frac{2a}{|R|} \right)$$

def kin(a, R):
    """Computes local kinetic energy"""
    dist = np.sqrt(np.dot(R, R))
    return a * (1 / dist - 0.5 * a)
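For reference, the intermediate step uses the Laplacian of a purely radial function, with $r=|R|$: $$\nabla^2 e^{-ar}=\left(\frac{d^2}{dr^2}+\frac{2}{r}\frac{d}{dr}\right)e^{-ar}=\left(a^2-\frac{2a}{r}\right)e^{-ar}$$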
Add the pieces together:
$$E_L(R)=T_L(R)+V(R)$$

def e_loc(a, R):
    """Computes the local energy for the hydrogen atom"""
    return kin(a, R) + pot(R)
A function that computes the average and the standard deviation of a sample:

def stats(E):
    """Computes the average and standard deviation of a sample."""
    ave = np.mean(E)
    err = np.std(E, ddof=1)  # ddof=1 applies Bessel's correction for an unbiased variance estimate
    return ave, err
Let's set $a = 1.2$, $nmc = 10000$, and $nsa = 40$ in our calculation:
def monte_carlo(a, nmc):
    """Uniform-sampling Monte Carlo simulation of the hydrogen atom"""
    ener, norm = 0, 0
    for _ in range(nmc):
        R = np.random.uniform(-5, 5, 3)
        WFSQ = psi(a, R) * psi(a, R)
        norm += WFSQ
        ener += WFSQ * e_loc(a, R)
    return ener / norm
# Set simulation conditions
a = 1.2
nmc = 10000
nsa = 40
# Draw samples
X = np.zeros(nsa)
for idx in range(nsa):
    X[idx] = monte_carlo(a, nmc)
# Compute estimator of variational energy and error
E, err = stats(X)
print(f"E = {E} +/- {err}")
E = -0.46868167113798515 +/- 0.04823011269107362
Although computationally harmless, there is a serious theoretical omission in the above code (unless you omitted it intentionally, but did you?). Can you find it?
Why quantum Monte Carlo?
"If God has made the world a perfect mechanism, He has at least conceded so much to our imperfect intellects that in order to predict little parts of it, we need not solve innumerable differential equations, but can use dice with fair success."
Max Born