There is a clear relationship between pdf and cdf. Moreover, we can derive a normalization condition for the pdf.
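Concretely, the pdf is the derivative of the cdf, and it must integrate to one over the whole real line: $$f_X(x)=\frac{dF_X(x)}{dx}, \qquad \int^{+\infty}_{-\infty}f_X(x)dx=1$$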
The expected value (sometimes expectation value or stochastic mean value) of $X$ is defined as:
$$E[X] = \sum_i P[X=x_i]x_i$$or in the continuous case: $$E[X] = \int^{+\infty}_{-\infty}xdF_X = \int^{+\infty}_{-\infty}xf_X(x)dx$$
A function of a random variable is a random variable itself, so:
$$E[g(X)]=\int^{+\infty}_{-\infty}g(x)f_X(x)dx$$This last result is known as the law of the unconscious statistician, and it is essentially a nice application of the change-of-variables rule.
Stochastic quantities are meaningless without a proper estimation of the error sources. The square root of the variance is a measure of the dispersion of a random variable. In the continuous case, the variance can be computed as:
$$\sigma^2 = \text{Var}[X]=E[(X-E[X])^2]=E[X^2]-E[X]^2$$Prove the above relationship.
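This is not a proof, but a quick numerical sanity check is easy to run (assuming NumPy; the exponential distribution below is just an arbitrary example):

import numpy as np

rng = np.random.default_rng(0)
x = rng.exponential(scale=2.0, size=1_000_000)  # arbitrary example distribution, true variance = 4
print(np.mean((x - np.mean(x)) ** 2))           # E[(X - E[X])^2]
print(np.mean(x ** 2) - np.mean(x) ** 2)        # E[X^2] - E[X]^2, numerically the same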
Draw samples by constructing a Markov chain whose stationary distribution is the desired target distribution $P(x)$. Provided that our transition probability $P(x_2 \mid x_1)$ can be separated into the product of a proposal probability $g(x_2\mid x_1)$ and an acceptance probability $A(x_2, x_1)$, and that detailed balance is verified:
$$P(x_2 \mid x_1)P(x_1)=P(x_1 \mid x_2)P(x_2)$$The algorithm proceeds as follows: starting from the current point $x_1$, propose a new point $x_2$ with probability $g(x_2\mid x_1)$ and accept it with probability $A(x_2, x_1)=\min\left(1,\ \frac{P(x_2)\,g(x_1\mid x_2)}{P(x_1)\,g(x_2\mid x_1)}\right)$; if the move is rejected, keep $x_1$ as the next sample.
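A minimal Python sketch of such a sampler (not from the original notes), assuming a symmetric Gaussian proposal $g$ so that the proposal factors cancel and the acceptance reduces to the ratio of target densities, and taking an unnormalized target density p as input:

import numpy as np

def metropolis(p, x0, nsteps, step=0.5, seed=None):
    """Sample a 1D target p(x) with a symmetric Gaussian proposal and Metropolis acceptance."""
    rng = np.random.default_rng(seed)
    x = x0
    samples = np.empty(nsteps)
    for i in range(nsteps):
        x_new = x + rng.normal(scale=step)              # proposal g(x2 | x1), symmetric
        if rng.uniform() < min(1.0, p(x_new) / p(x)):   # acceptance A(x2, x1)
            x = x_new                                   # accept the move
        samples[i] = x                                  # otherwise keep the current point
    return samples

# Example: sample from an (unnormalized) standard normal target
chain = metropolis(lambda x: np.exp(-0.5 * x**2), x0=0.0, nsteps=10_000, seed=0)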
Let $X_1, X_2, \ldots, X_N$ be a sequence of independent, identically distributed (i.i.d.) random variables with $E[X_i]=\mu$. For large $N$, the average of the $X_i$:
$$\bar{X}_N=\frac{1}{N}\sum_{i=1}^N X_i$$converges almost surely to $\mu$:
$$P\left[\lim_{N\to\infty}\bar{X}_N=\mu \right]=1$$Let $X_1, X_2, \ldots$ be a sequence of independent, identically distributed (i.i.d.) random variables with $E[X_i]=\mu$ and variance $\sigma^2 < \infty$. As the sample size $N$ increases we get:
$$\sqrt{N}\,(\bar{X}_N-\mu)\rightarrow \mathcal{N}(0, \sigma^2)$$In other words, regardless of the original distribution of the random variables, the pdf of the average tends to a normal distribution.
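A quick numerical illustration of both results (assuming NumPy; uniform variables on $[0,1]$, for which $\sigma^2=1/12$):

import numpy as np

rng = np.random.default_rng(0)
N, trials = 100, 10_000
xbar = rng.uniform(0, 1, size=(trials, N)).mean(axis=1)  # 10_000 independent averages of N samples
print(xbar.mean())                      # close to mu = 0.5 (law of large numbers)
print(xbar.std(), np.sqrt(1 / 12 / N))  # spread of the averages vs sigma / sqrt(N)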
The expected value of an observable $O$ in a canonical discrete ensemble is given by:
$$\langle O\rangle = \frac{\sum_i O_i e^{-\beta E_i}}{\sum_i e^{-\beta E_i}}$$How can we compute this in practice?
Let's recast the expression into something more familiar:
$$\langle O\rangle = \sum_i O_i \frac{e^{-\beta E_i}}{\sum_j e^{-\beta E_j}}=\sum_i p_i O_i$$Then, if we sample long enough, the law of large numbers offers us an estimator of the expected value:
$$\langle O\rangle_p = \frac{1}{N}\sum_{i=1}^N O_i$$and the averages of the samples drawn with probability $p_i$ will have a normal distribution, according to the central limit theorem.
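As a toy check of this estimator (the three-level system below is made up purely for illustration), we can draw states directly with probability $p_i$ and compare the sample mean with the exact weighted average:

import numpy as np

rng = np.random.default_rng(0)
E_levels = np.array([0.0, 1.0, 2.0])   # energies E_i (illustrative values)
O_values = np.array([1.0, -1.0, 0.5])  # observable O_i (illustrative values)
beta = 1.0

w = np.exp(-beta * E_levels)
p = w / w.sum()                        # Boltzmann probabilities p_i
print(np.sum(p * O_values))            # exact <O> = sum_i p_i O_i

samples = rng.choice(O_values, size=100_000, p=p)  # O_i drawn with probability p_i
print(samples.mean())                  # the sample average converges to <O>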
We will focus on variational Monte Carlo (VMC) in this course, but you should know that other flavors exist, e.g. diffusion Monte Carlo (DMC). For VMC, we need three main ingredients: a trial wave function, a way to sample from its probability density, and an estimator of the energy (the local energy defined below).
The main idea is to calculate the multidimensional integrals appearing in quantum mechanics using Monte Carlo numerical integration. The most important quantity is the variational energy $E_v$, which is associated with a Hamiltonian $H$ and a wave function $\Psi$:
$$E_v=\frac{\langle \Psi|H|\Psi \rangle}{\langle \Psi|\Psi \rangle}=\frac{\int dR \Psi(R)^2 E_L(R)}{\int dR \Psi(R)^2}=\int dR E_L(R) \rho(R)$$where we have defined the local energy: $$E_L(R)=\frac{H\Psi(R)}{\Psi(R)}$$ and the normalized probability density: $$\rho(R)=\frac{\Psi(R)^2}{\int dR \Psi(R)^2}$$
In VMC the variance can be computed as: $$\sigma^2=\int dR (E_L(R)-E_v)^2\rho(R)$$ The corresponding estimator over $M$ samples $R_k$ is: $$\sigma^2\approx\bar{\sigma}^2=\frac{1}{M}\sum_{k=1}^M (E_L(R_k)-\bar{E}_v)^2$$
A very powerful result then follows from the central limit theorem. Using $M$ Monte Carlo samples, the integration error is given by:
$$E_v\pm\delta E_v$$$$\delta E_v \sim \frac{\sigma}{\sqrt{M}}$$independent of the dimensionality of the problem!
Since we do not need to evaluate the integrals analytically, we have much more freedom in the choice of the wave function (WF). A popular choice is the so-called Jastrow-Slater form:
$$\Psi(R)=J(R)\Phi(R)$$where $J(R)$ is the Jastrow factor and $\Phi(R)$ is a Slater determinant or a linear combination of them.
The Jastrow factor has the general functional form:
$$J(R)=e^{f(R)}$$where $f(R)$ is a function of the interparticle distances, and it captures the dynamic correlation effects of the wave function. In practice, the Jastrow factor is separated into three components:
$$J(R)=J_{ee}(R_{ij})J_{en}(R_{i\alpha})J_{een}(R_{i\alpha},R_{j\alpha},R_{ij})$$A good Jastrow factor is critical for the correct treatment of the so-called cusp conditions.
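As an illustration only (the notes do not fix a specific functional form), a common choice for the electron-electron term is a Padé form, $J_{ee}(r_{ij})=\exp\left(\frac{b_1 r_{ij}}{1+b_2 r_{ij}}\right)$, where setting $b_1=1/2$ reproduces the electron-electron cusp condition discussed next; a minimal sketch with a hypothetical parameter $b_2$:

import numpy as np

def jastrow_ee(r_ij, b1=0.5, b2=1.0):
    """Pade-type electron-electron Jastrow term exp(b1*r / (1 + b2*r)).
    The derivative of the exponent at r = 0 equals b1, which sets the e-e cusp."""
    return np.exp(b1 * r_ij / (1.0 + b2 * r_ij))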
We want the local energy $E_L(R)$ to be a finite quantity:
$$\frac{H\Psi}{\Psi}=-\frac{1}{2}\sum_i\frac{\nabla_i^2 \Psi}{\Psi}-\sum_{i,\alpha}\frac{Z_\alpha}{R_{i\alpha}}+\sum_{i<j}\frac{1}{R_{ij}}$$For small interparticle distances the potential energy diverges, so we need to ensure that (Kato's cusp conditions):
$$\frac{\Psi'}{\Psi}\Bigr|_{R_{i\alpha}=0}=-Z_\alpha, \qquad \frac{\Psi'}{\Psi}\Bigr|_{R_{ij}=0}=\frac{1}{2}$$As an example, let's compute the variational energy of the hydrogen atom. Recall the expression:
$$E_v=\frac{\int dR \Psi(R)^2 E_L(R)}{\int dR \Psi(R)^2}$$We can then use importance sampling:
$$E_v=\frac{\int dR \frac{\Psi(R)^2}{P(R)}P(R) E_L(R)}{\int dR \frac{\Psi(R)^2}{P(R)}P(R)}$$What is a natural choice for $P(R)$?
The uniform probability density in a cube of edge $L$, centered on the origin:
$$P(R)=\frac{1}{L^3}$$The recipe is then:
1. Run $N_{MC}$ steps: at each step, draw a point $R=R(x, y, z)$ uniformly inside the cube and accumulate $\Psi(R)^2$ and $\Psi(R)^2 E_L(R)$ evaluated at that point.
2. Compute the variational energy and the corresponding statistical error.

Ready, set, ...
# Code!
import numpy as np
The Coulomb potential for the electron is:
$$V(R)=-\frac{1}{\sqrt{x^2+y^2+z^2}}$$

def pot(R):
    """Computes the Coulomb potential for the electron"""
    return -1 / np.sqrt(np.dot(R, R))
We consider the following model wave function:
$$\Psi(R)=e^{-a|R|}$$

def psi(a, R):
    """Computes hydrogen wave function"""
    return np.exp(-a * np.sqrt(np.dot(R, R)))
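Note, in passing, that for this trial function $\Psi'/\Psi=-a$ at the nucleus, so the nuclear cusp condition above is satisfied exactly only when $a=Z=1$.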
Differentiate the expression in the middle twice to get the last one:
$$T_L(R)=-\frac{1}{2}\frac{\nabla^2 \Psi(R)}{\Psi(R)}=-\frac{1}{2}\left(a^2 - \frac{2a}{|R|} \right)$$

def kin(a, R):
    """Computes local kinetic energy"""
    dist = np.sqrt(np.dot(R, R))
    return a * (1 / dist - 0.5 * a)
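For reference, the intermediate step uses the Laplacian of a purely radial function, with $r=|R|$: $$\nabla^2 e^{-ar}=\left(\frac{d^2}{dr^2}+\frac{2}{r}\frac{d}{dr}\right)e^{-ar}=\left(a^2-\frac{2a}{r}\right)e^{-ar}$$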
Add the pieces together:
$$E_L(R)=T_L(R)+V(R)$$

def e_loc(a, R):
    """Computes the local energy for the hydrogen atom"""
    return kin(a, R) + pot(R)
A function that computes the average and the standard deviation of a sample:

def stats(E):
    """Computes the average and standard deviation of a sample."""
    ave = np.mean(E)
    err = np.std(E, ddof=1)  # ddof=1 applies Bessel's correction for an unbiased variance estimate
    return ave, err
Let's set $a = 1.2$, $nmc = 10000$, and $nsa = 40$ in our calculation:
def monte_carlo(a, nmc):
    """Uniform-sampling Monte Carlo simulation of the hydrogen atom"""
    ener, norm = 0, 0
    for _ in range(nmc):
        R = np.random.uniform(-5, 5, 3)
        WFSQ = psi(a, R) * psi(a, R)
        norm += WFSQ
        ener += WFSQ * e_loc(a, R)
    return ener / norm
# Set simulation conditions
a = 1.2
nmc = 10000
nsa = 40
# Draw samples
X = np.zeros(nsa)
for idx in range(nsa):
    X[idx] = monte_carlo(a, nmc)
# Compute estimator of variational energy and error
E, err = stats(X)
print(f"E = {E} +/- {err}")
E = -0.46868167113798515 +/- 0.04823011269107362
Although computationally harmless, there is a serious theoretical omission in the above code (unless you omitted it intentionally, but did you?). Can you find it?
Why quantum Monte Carlo?
"If God has made the world a perfect mechanism, He has at least conceded so much to our imperfect intellects that in order to predict little parts of it, we need not solve innumerable differential equations, but can use dice with fair success."
Max Born