## Bite 333. Metropolis–Hastings Algorithm

In statistics and statistical physics, the Metropolis–Hastings algorithm is a Markov chain Monte Carlo (MCMC) method for obtaining a sequence of random samples from a probability distribution from which direct sampling is difficult.

This sequence can be used to approximate the distribution (e.g. to generate a histogram) or to compute an integral (e.g. an expected value).

Metropolis–Hastings and other MCMC algorithms are generally used for sampling from multi-dimensional distributions, especially when the number of dimensions is high.

#### Metropolis Algorithm

For the purpose of illustration, you will implement a special case of the Metropolis–Hastings algorithm where the proposal function (which candidate to choose next) is symmetric (all candidates at the same distance from the current value are equally likely).

You are given a function `f` from which we want to draw `N` samples to approximate it. `f` accepts a single parameter `x` and returns the function's value at `x`, that is `f(x)`.

1. Initialization: Choose an arbitrary point `x_0` to be the first observation in the list of samples and choose the normal (Gaussian) distribution as a proposal function that suggests a candidate for the next sample value `x_next`, given the previous sample value `x_t`.

The Gaussian distribution is centered at the current sample `x_t`, so that points closer to `x_t` are more likely to be visited next, making the sequence of samples into a random walk.

2. For each iteration `t`:

- Generate a candidate `x_next` for the next sample by drawing a random sample from the proposal distribution that is centered at the current sample `x_t`.

- Calculate the acceptance ratio `α = f(x_next) / f(x_t)`, which is used to decide whether to accept or reject the candidate (the probability of moving to `x_next` is `min(1, α)`).

- Accept or reject:

- Generate a uniform random number `u` drawn from the interval 0 to 1.

- If `u ≤ α`, then accept the candidate `x_next` as the next sample.

- If `u > α`, then reject the candidate and stay at `x_t`, that is, set `x_next = x_t`.
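Put together, the steps above can be sketched as follows. This is one possible implementation, not the bite's official solution; the exact template signature may differ, and the random number setup (`numpy`'s `default_rng`) is a choice made here for illustration:

```python
import numpy as np


def metropolis_hastings(f, x_0=0, n_samples=1000):
    """Draw n_samples from the density f using the Metropolis algorithm."""
    rng = np.random.default_rng()
    samples = np.empty(n_samples)
    samples[0] = x_t = x_0
    for t in range(1, n_samples):
        # draw a candidate from a normal proposal centered at the current sample
        x_next = rng.normal(loc=x_t, scale=1.0)
        # acceptance ratio: decides whether to move to the candidate
        alpha = f(x_next) / f(x_t)
        # accept if a uniform draw falls below alpha, otherwise stay at x_t
        if rng.uniform() <= alpha:
            x_t = x_next
        samples[t] = x_t
    return samples
```

Note that rejected candidates still produce an entry in the sample list: the current value is simply repeated.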

Your function will be used to approximate the mean and standard deviation values for a range of common univariate distributions like the normal distribution.

You are provided with a template for the function `metropolis_hastings`.

You have to implement the algorithm described above in Python.

The function accepts an arbitrary probability density `f` (will be provided, see the tests), a start value `x_0` and the number of samples `n_samples` to draw from `f`.

Step one of the algorithm is already done by passing `x_0` to the function (or setting it to zero by default).

`x_0` is your first sample and from there you will move either to some point in the neighborhood or stay at `x_0`.

For this bite, you use a normal distribution as the proposal distribution, with a standard deviation of one and a mean equal to the current sample (`x_0` at the beginning and `x_t` afterwards).

Refer to the tests to see how the function is called and how the statistics of `f` are tested.

#### Example

To test your function, you can estimate the mean and standard deviation of a normal distribution, as in the following example.

```python
import numpy as np


def norm_dist(x, mean, std):
    """Gaussian normal probability distribution."""
    return np.exp(-0.5 * (x - mean) ** 2 / std ** 2)


samples = metropolis_hastings(f=lambda x: norm_dist(x, mean=0, std=1), x_0=-1, n_samples=100)
print(samples.mean(), samples.std())
```

If your implementation is right, this should return an approximation of the true mean (0) and standard deviation (1).

#### Hints

`numpy` will come in handy in this exercise, as you can use it to sample both from a normal distribution and from a uniform distribution.

See `numpy.random` for further details.
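For instance, the two draws the algorithm needs could look like this with numpy's `Generator` API (the seed and the value of `x_t` are only illustrative):

```python
import numpy as np

rng = np.random.default_rng(seed=0)      # seed only for reproducibility
x_t = 0.0                                # hypothetical current sample
x_next = rng.normal(loc=x_t, scale=1.0)  # candidate from N(x_t, 1)
u = rng.uniform()                        # uniform draw from [0, 1)
```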

If you want to take things up a notch, look into `scipy.stats`, the scipy module that contains a large number of probability distributions, summary and frequency statistics, correlation functions, and statistical tests. For example, `scipy.stats` has a `norm` object that can be used to draw samples from a normal distribution (`norm.rvs(size=100)`), to calculate the probability density at an x-value (`norm.pdf(x)`), and to calculate the cumulative probability of an x-value (`norm.cdf(x)`).
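A quick tour of `norm` (the values in the comments are standard facts about the standard normal distribution):

```python
from scipy.stats import norm

samples = norm.rvs(size=100)  # 100 draws from the standard normal
density = norm.pdf(0.0)       # density at x=0: 1/sqrt(2*pi), about 0.3989
cumulative = norm.cdf(0.0)    # P(X <= 0) = 0.5 for the standard normal
```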

As I said, `scipy.stats` is optional and just for the curious; you can solve this bite with `numpy` alone.

Happy coding!

If you are interested in the topic, I recommend Computational Statistics in Python, which has a lot of code to show the concepts in action!

Be aware that `scipy` is not available on the pybites platform, so you have to try it locally.