Suppose that if \(\theta=1\), then \(y\) has a normal distribution with mean 1 and standard deviation \(\sigma\), and if \(\theta=2\), then \(y\) has a normal distribution with mean 2 and standard deviation \(\sigma\). Also, suppose \(\operatorname{Pr}(\theta=1)=0.5\) and \(\operatorname{Pr}(\theta=2)=0.5\).
1. \(\mathbf{Solution.}\qquad\) Taking \(\sigma=2\), we can compute the marginal probability density for \(y\) as follows: \[\begin{align*} p(y) & =p(y\mid\theta=1)\mathbb{P}(\theta=1)+p(y\mid\theta=2)\mathbb{P}(\theta=2)\\ & =\frac{1}{2}\left[\frac{1}{2\sqrt{2\pi}}\exp\left\{ -\frac{1}{8}(y-1)^{2}\right\} +\frac{1}{2\sqrt{2\pi}}\exp\left\{ -\frac{1}{8}(y-2)^{2}\right\} \right]\\ & =\frac{1}{4\sqrt{2\pi}}\left(\exp\left\{ -\frac{1}{8}(y-1)^{2}\right\} +\exp\left\{ -\frac{1}{8}(y-2)^{2}\right\} \right) \end{align*}\]
Below we show the R code to plot this density.
library(ggplot2)

m = 1000
y_m = seq(-10, 10, length.out = m)  # grid of m evenly spaced points
p_y = (1 / (4 * sqrt(2 * pi))) *
  (exp((-1/8) * (y_m - 1)^2) + exp((-1/8) * (y_m - 2)^2))
df = data.frame("y" = y_m, "density" = p_y)
ggplot(df, aes(x = y, y = density)) +
  geom_line() +
  ggtitle("Plot of marginal density p(y)") +
  xlab("y") +
  ylab("p(y)") +
  theme(plot.title = element_text(hjust = 0.5))
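As a sanity check, the mixture density above should integrate to 1. A quick numerical verification (a sketch using base R's `integrate`):

```r
# The two-component normal mixture density derived above
p_y_fun = function(y) {
  (1 / (4 * sqrt(2 * pi))) *
    (exp((-1 / 8) * (y - 1)^2) + exp((-1 / 8) * (y - 2)^2))
}
# Total probability mass should be 1
total_mass = integrate(p_y_fun, lower = -Inf, upper = Inf)$value
total_mass  # approximately 1
```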
2. \(\mathbf{Solution.}\qquad\) Using Bayes' formula (with \(\sigma=2\)) we can compute the posterior probability as follows: \[\begin{align*} \mathbb{P}(\theta=1\mid y=1) & =\frac{\mathbb{P}(y=1\mid\theta=1)\mathbb{P}(\theta=1)}{\mathbb{P}(y=1\mid\theta=1)\mathbb{P}(\theta=1)+\mathbb{P}(y=1|\theta=2)\mathbb{P}(\theta=2)}\\ & =\frac{\frac{1}{4\sqrt{2\pi}}}{\frac{1}{4\sqrt{2\pi}}\left(1+e^{-1/8}\right)}\\ & =\left(1+e^{-1/8}\right)^{-1} \end{align*}\]
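The closed form can be checked numerically against a direct Bayes computation with `dnorm` (a quick sketch, with \(\sigma=2\)):

```r
sigma = 2
# Direct Bayes computation at y = 1: prior-weighted likelihoods
num = dnorm(1, mean = 1, sd = sigma) * 0.5
den = num + dnorm(1, mean = 2, sd = sigma) * 0.5
posterior = num / den
closed_form = 1 / (1 + exp(-1 / 8))
posterior    # approximately 0.5312
closed_form  # same value up to floating point
```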
3. \(\mathbf{Solution.}\qquad\) Generalizing our result from part 2, we wish to plot the posterior probability of \(\theta=1\) as a function of \(\sigma\), \[ \mathbb{P}(\theta=1\mid y=1)=\left(1+e^{-1/\left(2\sigma^{2}\right)}\right)^{-1} \]
The following R code shows this plot for \(\sigma \in (0,10]\); the grid starts just above \(0\) to avoid dividing by zero.
library(ggplot2)
library(latex2exp)

# Plot posterior probability as a function of sigma
m = 1000
sigma_m = seq(0.01, 10, length.out = m)
p_theta = 1 / (1 + exp(-1 / (2 * sigma_m^2)))
df = data.frame("sigma" = sigma_m, "posterior_prob" = p_theta)
ggplot(df, aes(x = sigma, y = posterior_prob)) +
  geom_line() +
  ggtitle(TeX(paste0("Plot of posterior probability $P(\\theta = 1 | y = 1)$",
                     " as a function of $\\sigma$"))) +
  xlab(TeX("$\\sigma$")) +
  ylab(TeX("$P(\\theta = 1 | y = 1)$")) +
  theme(plot.title = element_text(hjust = 0.5))
We see that as \(\sigma\) decreases toward \(0\), the posterior probability approaches \(1\). As \(\sigma\) increases, the posterior probability approaches \(\frac{1}{2}\), which is equivalent to flipping a fair coin.
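Both limits can be verified numerically from the closed form (a small sketch; the function name `post_prob` is ours):

```r
# Posterior probability of theta = 1 given y = 1, as a function of sigma
post_prob = function(sigma) 1 / (1 + exp(-1 / (2 * sigma^2)))
post_prob(0.1)  # essentially 1: a small sigma makes the data decisive
post_prob(100)  # essentially 0.5: a large sigma makes the data uninformative
```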
A random sample of \(n\) students is drawn from a large population, and their weights are measured. The average weight of the \(n\) sampled students is \(\bar{y}=150\) pounds. Assume the weights in the population are normally distributed with unknown mean \(\theta\) and known standard deviation 20 pounds. Suppose your prior distribution for \(\theta\) is normal with mean 180 and standard deviation \(40\).
1. \(\mathbf{Solution.}\qquad\) First we can construct the product of the likelihood function with the prior density, \[\begin{align*} \mathbb{P}\left(\theta\mid y_{1},\ldots,y_{n}\right) & \propto\mathbb{P}\left(y_{1},\ldots,y_{n}\mid\theta\right)p(\theta)\\ & \propto\prod_{i=1}^{n}\exp\left\{ \frac{-1}{2(20)^{2}}\left(y_{i}-\theta\right)^{2}\right\} \exp\left\{ \frac{-1}{2\left(40\right)^{2}}(\theta-180)^{2}\right\} \end{align*}\]
Using the formula and notation from lecture, we have \[ y_{i}\sim N\left(\theta,\sigma^{2}\right),\quad\theta\sim N\left(\mu_{0},\tau_{0}^{2}\right), \quad\theta\mid y_{1},\ldots,y_{n}\sim N\left(\mu_{n},\tau_{n}^{2}\right) \]
Then, \[ \begin{aligned}\mu_{n} & =\frac{\frac{180}{40^{2}}+\frac{150n}{20^{2}}}{\frac{1}{40^{2}}+\frac{n}{20^{2}}} & \qquad\tau_{n}^{2} & =\frac{1}{\frac{1}{40^{2}}+\frac{n}{20^{2}}}\\ & =\frac{180+600n}{1+4n} & & =\frac{40^{2}}{1+4n} \end{aligned} \]
For simplicity we will write \(y_{1:n}:=y_{1},\ldots,y_{n}\). Therefore, \[\begin{equation} \mathbb{P}\left(\theta\mid y_{1:n}\right)\sim N\left(\frac{180+600n}{1+4n},\frac{40^{2}}{1+4n}\right)\label{eq:1} \end{equation}\]
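The conjugate update can be packaged as a small helper and checked against the closed form above (a sketch; the function name `normal_posterior` is ours, not from lecture):

```r
# Posterior parameters for a normal likelihood with known sigma and a
# normal prior N(mu_0, tau_0^2), given n observations with sample mean ybar
normal_posterior = function(n, ybar, sigma, mu_0, tau_0) {
  tau_n2 = 1 / (1 / tau_0^2 + n / sigma^2)          # posterior variance
  mu_n = tau_n2 * (mu_0 / tau_0^2 + n * ybar / sigma^2)  # posterior mean
  list(mu_n = mu_n, tau_n2 = tau_n2)
}
post = normal_posterior(n = 10, ybar = 150, sigma = 20, mu_0 = 180, tau_0 = 40)
post$mu_n    # matches (180 + 600 * 10) / (1 + 4 * 10) = 150.73...
post$tau_n2  # matches 40^2 / (1 + 4 * 10) = 39.02...
```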
2. \(\mathbf{Solution.}\qquad\) We can compute \(p\left(\tilde{y}\mid y_{1:n}\right)\) as follows, \[ p\left(\tilde{y}\mid y_{1:n}\right)=\int p(\tilde{y}\mid\theta)p\left(\theta\mid y_{1:n}\right)d\theta \]
where, \[ \tilde{y}\mid\theta\sim N\left(\theta,\sigma^{2}\right) \]
and \(p\left(\theta\mid y_{1:n}\right)\) is the posterior density in \(\eqref{eq:1}\). Next, we can compute the mean and variance of \(p\left(\tilde{y}\mid y_{1:n}\right)\). \[\begin{align*} \mathbb{E}\left[\tilde{y}\mid y_{1:n}\right] & =\mathbb{E}\left[\mathbb{E}\left[\tilde{y}\mid\theta,y_{1:n}\right]\mid y_{1:n}\right]\\ & =\mathbb{E}\left[\theta\mid y_{1:n}\right]\\ & =\mu_{n} \end{align*}\]
Also, \[\begin{align*} \mathrm{Var}\left[\tilde{y}\mid y_{1:n}\right] & =\mathrm{Var}\left(\mathbb{E}\left[\tilde{y}\mid\theta,y_{1:n}\right]\mid y_{1:n}\right)+\mathbb{E}\left[\mathrm{Var}\left(\tilde{y}\mid\theta,y_{1:n}\right)\mid y_{1:n}\right]\\ & =\mathrm{Var}\left(\theta\mid y_{1:n}\right)+\sigma^{2}\\ & =\tau_{n}^{2}+\sigma^{2} \end{align*}\]
Therefore, \[ \tilde{y}\mid y_{1:n}\sim N\left(\mu_{n},\tau_{n}^{2}+\sigma^{2}\right) \]
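The variance decomposition can be checked by simulation (a sketch, using the \(n=10\) posterior parameters): draw \(\theta\) from the posterior, then \(\tilde{y}\) given \(\theta\), and compare the sample moments against \(\mu_{n}\) and \(\tau_{n}^{2}+\sigma^{2}\).

```r
set.seed(1)
n_sims = 100000
mu_n = (180 + 600 * 10) / (1 + 4 * 10)  # posterior mean for n = 10
tau_n2 = 40^2 / (1 + 4 * 10)            # posterior variance for n = 10
sigma = 20
# Compose the two stages: theta | data, then y_tilde | theta
theta_draws = rnorm(n_sims, mean = mu_n, sd = sqrt(tau_n2))
y_tilde = rnorm(n_sims, mean = theta_draws, sd = sigma)
mean(y_tilde)  # close to mu_n = 150.73...
var(y_tilde)   # close to tau_n2 + sigma^2 = 439.02...
```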
3. \(\mathbf{Solution.}\qquad\) We can construct the 95% central posterior interval for \(\theta\) by standardizing \(\theta\), \[ \mathbb{P}\left(z_{\frac{0.05}{2}}\leq\frac{\theta-\mu_{n}}{\tau_{n}}\leq z_{1-\frac{0.05}{2}}\right)=0.95 \]
then, \[ \mu_{n}-1.96\tau_{n}\leq\theta\leq\mu_{n}+1.96\tau_{n} \]
Similarly, the interval for \(\tilde{y}\) is, \[ \mu_{n}-1.96\sqrt{\tau_{n}^{2}+\sigma^{2}}\leq\tilde{y}\leq\mu_{n}+1.96\sqrt{\tau_{n}^{2}+\sigma^{2}} \]
Below we use the following R code to compute the 95% central posterior intervals for \(n=10\) and \(n=100\).
tau_0 = 40
sigma = 20
mu_0 = 180
alpha = 0.05
crit = qnorm(1 - alpha / 2)  # 1.96 for a 95% interval
df_theta = data.frame()
df_y = data.frame()
for (n in c(10, 100)) {
  mu_n = (mu_0 + 600 * n) / (1 + 4 * n)    # posterior mean
  t_n2 = 1 / (1 / tau_0^2 + n / sigma^2)   # posterior variance
  theta_CI = c(mu_n - crit * sqrt(t_n2), mu_n + crit * sqrt(t_n2))
  y_tilde_CI = c(mu_n - crit * sqrt(sigma^2 + t_n2),
                 mu_n + crit * sqrt(sigma^2 + t_n2))
  df_theta = rbind(df_theta, t(round(theta_CI, 2)))
  df_y = rbind(df_y, t(round(y_tilde_CI, 2)))
}
For \(n=10\), \(\theta \in [138.49, 162.98]\) and \(\tilde{y} \in [109.66, 191.80]\).
For \(n=100\), \(\theta \in [146.16, 153.99]\) and \(\tilde{y} \in [110.68, 189.47]\).
Suppose you observe \(y=285\) from the model Binomial \((500, \theta),\) where \(\theta\) is an unknown parameter. Assume the prior on \(\theta\) has the following form \[ p(\theta)=\left\{\begin{array}{ll} {8 \theta,} & {0 \leq \theta<0.25} \\ {\frac{8}{3}-\frac{8}{3} \theta,} & {0.25 \leq \theta \leq 1} \\ {0,} & {\text { otherwise }} \end{array}\right. \]
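Before working with this prior, we can verify numerically that the piecewise density integrates to 1 (a sketch using base R's `integrate`):

```r
# Piecewise triangular-style prior on [0, 1], peaking at theta = 0.25
prior = function(theta) {
  ifelse(theta < 0.25, 8 * theta, (8 / 3) * (1 - theta))
}
total = integrate(prior, lower = 0, upper = 1)$value
total  # equals 1, so p(theta) is a valid density
```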
1. \(\mathbf{Solution.}\qquad\) Below we show R code to compute the unnormalized posterior density, and we plot the normalized posterior density as a function of \(\theta\).
library(ggplot2)
library(latex2exp)

m = 100
width = 1 / (m - 1)
theta_m = seq(0, 1, by = width)
# Prior density p(theta)
p_theta = function(theta) {
  if (theta >= 0 & theta < 0.25) {
    return(8 * theta)
  } else if (theta >= 0.25 & theta <= 1) {
    return(8 / 3 - (8 / 3) * theta)
  } else {
    return(0)
  }
}
p_thetas = sapply(theta_m, p_theta)
# Binomial likelihood of y = 285 successes out of 500 at each grid point
y_likelihood = sapply(theta_m, dbinom, x = 285, size = 500)
posterior_unnormalized = y_likelihood * p_thetas
# Normalize so the grid approximation integrates to 1
posterior_normalized = posterior_unnormalized /
  (sum(posterior_unnormalized) * width)
# Plot normalized posterior
x = data.frame("thetas" = theta_m,
               "normalized_posterior" = posterior_normalized)
ggplot(x, aes(x = thetas, y = normalized_posterior)) +
  geom_line() +
  ggtitle("Normalized Posterior Plot") +
  xlab(TeX("$\\theta$")) +
  ylab(TeX("$p(\\theta | y)$")) +
  theme(plot.title = element_text(hjust = 0.5))
2. \(\mathbf{Solution.}\qquad\) Below we show R code for sampling \(\theta\) from the posterior density and plotting a histogram.
draws = 10000
samples = sample(theta_m, draws, prob = posterior_normalized, replace = TRUE)
y = data.frame("samples" = samples)
ggplot(y, aes(x = samples)) +
  geom_histogram(aes(y = after_stat(count / sum(count))), binwidth = width) +
  theme(plot.title = element_text(hjust = 0.5)) +
  ggtitle("Histogram of Posterior Draws") +
  xlab(TeX("$\\theta$")) +
  ylab("Relative frequency")
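The draws can also be summarized numerically (a sketch that rebuilds the grid posterior from part 1 so the snippet is self-contained). With \(y/n = 285/500 = 0.57\) and \(n = 500\), the likelihood dominates the prior, so the posterior mean should land near \(0.57\):

```r
set.seed(1)
m = 100
width = 1 / (m - 1)
theta_m = seq(0, 1, by = width)
# Piecewise prior, binomial likelihood, and normalized grid posterior
prior = ifelse(theta_m < 0.25, 8 * theta_m, (8 / 3) * (1 - theta_m))
likelihood = dbinom(285, size = 500, prob = theta_m)
posterior = prior * likelihood
posterior = posterior / (sum(posterior) * width)
# Sample from the grid posterior and summarize
samples = sample(theta_m, 10000, prob = posterior, replace = TRUE)
mean(samples)                       # posterior mean, near 0.57
quantile(samples, c(0.025, 0.975))  # central 95% posterior interval
```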