\begin_layout Standard
\begin_inset FormulaMacro


\begin_layout Standard
\begin_inset Formula 
\log P\left(Y\vert\mu,\Sigma\right) & =-\frac{1}{2}\sum_{n}\left(Y_{n}-\mu\right)^{T}\Sigma^{-1}\left(Y_{n}-\mu\right)-\frac{1}{2}N\log\left|\Sigma\right|+C\\
 & =-\frac{1}{2}\sum_{n}\mathrm{trace}\left(\Sigma^{-1}Y_{n}Y_{n}^{T}\right)+\sum_{n}\mathrm{trace}\left(\Sigma^{-1}Y_{n}\mu^{T}\right)-N\frac{1}{2}\mathrm{trace}\left(\Sigma^{-1}\mu\mu^{T}\right)-\frac{1}{2}N\log\left|\Sigma\right|+C\\
\log P\left(\mu\right) & =-\frac{1}{2}\mathrm{trace}\left(\Sigma_{\mu}^{-1}\mu\mu^{T}\right)+C\\
\log P\left(\Sigma\right) & =-\frac{1}{2}\mathrm{trace}\left(V\Sigma^{-1}\right)-\nu\log\left|\Sigma\right|+C
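


\begin_layout Standard
As a sketch, assuming that 
\begin_inset Formula $\Sigma$

 is held fixed, collecting the terms in 
\begin_inset Formula $\mu$

 from the log likelihood and the log prior above and completing the square suggests a normal conditional posterior.
 The symbols 
\begin_inset Formula $\bar{Y}$

, 
\begin_inset Formula $\Lambda$

 and 
\begin_inset Formula $m$

 are shorthand used for this illustration:
\begin_inset Formula 
\bar{Y} & =\frac{1}{N}\sum_{n}Y_{n},\quad\Lambda=N\Sigma^{-1}+\Sigma_{\mu}^{-1},\quad m=N\Lambda^{-1}\Sigma^{-1}\bar{Y}\\
\log P\left(\mu\vert Y,\Sigma\right) & =-\frac{1}{2}\mathrm{trace}\left(\left(N\Sigma^{-1}+\Sigma_{\mu}^{-1}\right)\mu\mu^{T}\right)+N\mathrm{trace}\left(\Sigma^{-1}\bar{Y}\mu^{T}\right)+C\\
 & =-\frac{1}{2}\left(\mu-m\right)^{T}\Lambda\left(\mu-m\right)+C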



\begin_layout Standard
Suppose we have an approximate posterior, 
\begin_inset Formula $q\left(\mu\vert Y\right)$

.
 We wish to calculate an expectation with respect to the true posterior:
\begin_inset Formula 
\mbe\left[g\left(\mu\right)\vert Y\right] & =\int P\left(\mu\vert Y\right)g\left(\mu\right)d\mu\\
P\left(\mu\vert Y\right) & \propto P\left(Y\vert\mu\right)P\left(\mu\right).



\begin_layout Standard
Suppose we knew the true 
\begin_inset Formula $\mu_{0}$

 that generated 
\begin_inset Formula $Y$

.
 We could then draw 
\begin_inset Formula $B$

 new datasets,
\begin_inset Formula 
Y_{b} & \sim P\left(Y_{b}\vert\mu_{0}\right),\textrm{ for }b=1,...,B.



\begin_layout Standard
For each of these, we could draw 
\begin_inset Formula $\mu_{s}\sim q\left(\mu\vert Y_{b}\right)$

.
 The resulting draws follow an over-dispersed version of 
\begin_inset Formula $q\left(\mu\vert Y\right)$

, with
\begin_inset Formula 
P\left(\mu_{s}\vert\mu_{0}\right) & =\int q\left(\mu_{s}\vert Y_{b}\right)P\left(Y_{b}\vert\mu_{0}\right)dY_{b}.



\begin_layout Standard
An importance sampling estimate of 
\begin_inset Formula $\mbe\left[g\left(\mu\right)\vert Y\right]$

 is then
\begin_inset Formula 
\mbe\left[g\left(\mu\right)\vert Y\right] & \approx\sum_{s}g\left(\mu_{s}\right)\tilde{w}_{s}\\
w_{s} & =\frac{P\left(Y\vert\mu_{s}\right)P\left(\mu_{s}\right)}{\int q\left(\mu_{s}\vert Y_{b}\right)P\left(Y_{b}\vert\mu_{0}\right)dY_{b}}\\
\tilde{w}_{s} & =\frac{w_{s}}{\sum_{s'}w_{s'}}.



\begin_layout Standard
Of course, we do not know the denominator of 
\begin_inset Formula $w_{s}$

.
 However, we can approximate it with a non-parametric bootstrap:

\begin_layout Standard
\begin_inset Formula 
\int q\left(\mu_{s}\vert Y_{b}\right)P\left(Y_{b}\vert\mu_{0}\right)dY_{b} & \approx\frac{1}{B}\sum_{b}q\left(\mu_{s}\vert Y_{b}\right)\\
Y_{b} & \sim\mathrm{Bootstrap}\left(Y\right)
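


\begin_layout Standard
As a sketch, indexing the posterior draws by 
\begin_inset Formula $s$

 and plugging the bootstrap average into the denominator gives a working form of the weights.
 The factor 
\begin_inset Formula $1/B$

 and any normalizing constant of 
\begin_inset Formula $P\left(Y\vert\mu\right)P\left(\mu\right)$

 cancel in the normalized weights, so the target only needs to be known up to a constant, although 
\begin_inset Formula $q$

 itself is assumed here to be a normalized density:
\begin_inset Formula 
w_{s} & \approx\frac{P\left(Y\vert\mu_{s}\right)P\left(\mu_{s}\right)}{\frac{1}{B}\sum_{b}q\left(\mu_{s}\vert Y_{b}\right)}\\
\tilde{w}_{s} & =\frac{P\left(Y\vert\mu_{s}\right)P\left(\mu_{s}\right)/\sum_{b}q\left(\mu_{s}\vert Y_{b}\right)}{\sum_{s'}P\left(Y\vert\mu_{s'}\right)P\left(\mu_{s'}\right)/\sum_{b}q\left(\mu_{s'}\vert Y_{b}\right)}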



\begin_layout Standard
This is a bit fuzzy, since the bootstrap distribution does not approximate
 the distribution under the true 
\begin_inset Formula $\mu_{0}$

.
 Typically the bootstrap is used with pivots.
 This seems tricky, but let us not get hung up on it for now.
 Note that it is important to draw again from an approximation to the posterior
 so that we are doing importance sampling on a mixture of distributions
 in the 
\begin_inset Formula $\mu$

 space.
 Point estimates will not, I think, do in general, since the sample space
 of the bootstrap (the data, 
\begin_inset Formula $Y$

) is generally different from the sample space of 
\begin_inset Formula $\mu$

, so importance sampling does not make sense.

\begin_layout Standard
An (under-dispersed) variational approximation to the normal model is given by
\begin_inset Formula 
\bar{Y} & =\frac{1}{N}\sum_{n}Y_{n}\\
\Lambda & :=N\Sigma^{-1}+\Sigma_{\mu}^{-1}\\
q\left(\mu\vert Y\right) & =\prod_{k}q\left(\mu_{k}\vert Y\right)\\
q\left(\mu_{k}\vert Y\right) & =\mathcal{N}\left(\mu_{k};\left(N\Lambda^{-1}\Sigma^{-1}\bar{Y}\right)_{k},1/\Lambda_{kk}\right)
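


\begin_layout Standard
As a hedged illustration, suppose that 
\begin_inset Formula $\Sigma$

 is known and that each 
\begin_inset Formula $Y_{b}$

 consists of 
\begin_inset Formula $N$

 independent normal draws under 
\begin_inset Formula $\mu_{0}$

 (the parametric case, not the bootstrap).
 Under these assumptions the over-dispersion of the draws can be written out.
 The symbols 
\begin_inset Formula $A$

 and 
\begin_inset Formula $D$

 are introduced only for this sketch.
 The marginal covariance exceeds that of 
\begin_inset Formula $q\left(\mu\vert Y\right)$

, which is 
\begin_inset Formula $D$

, by the positive semi-definite term 
\begin_inset Formula $N\Lambda^{-1}\Sigma^{-1}\Lambda^{-1}$

:
\begin_inset Formula 
A & :=N\Lambda^{-1}\Sigma^{-1},\quad D:=\mathrm{diag}\left(1/\Lambda_{kk}\right)\\
\bar{Y}_{b}\vert\mu_{0} & \sim\mathcal{N}\left(\mu_{0},\frac{1}{N}\Sigma\right)\\
\mu_{s}\vert Y_{b} & \sim\mathcal{N}\left(A\bar{Y}_{b},D\right)\\
\mu_{s}\vert\mu_{0} & \sim\mathcal{N}\left(A\mu_{0},D+\frac{1}{N}A\Sigma A^{T}\right)=\mathcal{N}\left(A\mu_{0},D+N\Lambda^{-1}\Sigma^{-1}\Lambda^{-1}\right)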



