# Probabilistic Inference Using Markov Chain Monte Carlo Methods, Part 2

These notes are based on the gnofai tutorial that covers Radford Neal’s MCMC review (pdf).

Sanmi was out of town this week, so I can’t rely on his writeup for keywords to index my notes this time around.

We covered the basics of Markov Chains this week: every fininte chain has at least one (not necessarily unique) invariant distribution.

Detailed balance (time reversibility, $\pi(x)T(x,x') = \pi(x')(T(x',x)$) and ergodicity (convergence to a fixed point, $\pi = \pi T$ independent of starting state, $P_0$) were defined; we showed that detailed balance implies a fixed point, but that the converse isn’t true, and we found conditions for homogeneous (stateless / memoryless) Markov process to be ergodic, as well as how to generalize to a non-homogeneous process (let $T = T_0 T_1 \dots T_{d-1}$).

There was a brief aside where we related the eigenvectors of T to properties of the Markov chain; there was a comment from Meghana that the properties as written weren’t quite right, so we should  see the review for details.

We then covered Gibbs Sampling, which is useful when we have a joint distribution that we want to sample, and the conditionals have simple parametric forms. The result is that we reduce a multivariate to a univariate with a nice form that we can sample. And even if not, we could just use rejection sampling.

To do Gibbs, we fix everything but some $x_k$, and then change $x_k$ when we sample, iterating over all $k$. Since we repeat after N steps, this is homogenous w.r.t. $T = B_1 B_2 \dots B_n$. We proved that we do actually converge to a fixed point; intuitively, it’s because only the kth variable changes, so their marginals don’t change, and the kth just draws from this conditional.

We then covered a simple example of a latent class model with n observed variables which were correlated with some hidden variable