Question 1.1

Recall the basic properties of covariance, \(\mathrm{Cov}\left(X,Y\right) = \mathbb{E}\big[(X-\mathbb{E}[X])(Y-\mathbb{E}[Y])\big]\), following the convention that upper case letters are random variables and lower case letters are constants:

P1. \(\quad \mathrm{Cov}(Y,Y)= \mathrm{Var}(Y)\),

P2. \(\quad \mathrm{Cov}(X,Y)=\mathrm{Cov}(Y,X)\),

P3. \(\quad \mathrm{Cov}(aX,bY)= ab\,\mathrm{Cov}(X,Y)\),

P4. \(\quad \mathrm{Cov}\left(\sum_{m=1}^M Y_m,\sum_{n=1}^N Y_n\right)= \sum_{m=1}^M \sum_{n=1}^N \mathrm{Cov}(Y_m, Y_n)\).
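Since covariance is bilinear, the sample covariance computed by R's cov function satisfies P2–P4 exactly (up to floating-point rounding) on any data set, which gives a quick way to sanity-check the properties. A minimal sketch:

    ## P3 and a two-term special case of P4, checked on arbitrary data
    set.seed(0)
    x <- rnorm(10); y <- rnorm(10); z <- rnorm(10)
    cov(2 * x, 3 * y) - 6 * cov(x, y)           # P3: zero up to rounding
    cov(x + z, y) - (cov(x, y) + cov(z, y))     # P4: zero up to rounding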

Let \(Y_{1:N}\) be a covariance stationary time series model with autocovariance function \(\gamma_h\) and constant mean function, \(\mu_n=\mu\). Consider the sample mean as an estimator of \(\mu\), \[\hat{\mu}(y_{1:N}) = \frac{1}{N}\sum_{n=1}^N y_n.\] Show how the basic properties of covariance can be used to derive the expression, \[\mathrm{Var}\big(\hat{\mu}(Y_{1:N})\big) = \frac{1}{N}\gamma_0 + \frac{2}{N^2}\sum_{h=1}^{N-1}(N-h)\gamma_h.\]


\(\textbf{Solution.}\qquad\) Since \(\hat{\mu}\left(Y_{1:N}\right)=\frac{1}{N}\sum_{n=1}^{N}Y_{n}\) by definition, we can compute its variance as,

\[\begin{align} \mathrm{Var}\left(\hat{\mu}\left(Y_{1:N}\right)\right) & =\mathrm{Var}\left(\frac{1}{N}\sum_{n=1}^{N}Y_{n}\right)\nonumber \\ & =\mathrm{Cov}\left(\frac{1}{N}\sum_{m=1}^{N}Y_{m},\frac{1}{N}\sum_{n=1}^{N}Y_{n}\right) && \text{by P1}\nonumber \\ & =\frac{1}{N^{2}}\sum_{m=1}^{N}\sum_{n=1}^{N}\mathrm{Cov}\left(Y_{m},Y_{n}\right) && \text{by P3 and P4}\label{eq:1} \end{align}\]

By covariance stationarity, \(\mathrm{Cov}\left(Y_{n},Y_{n+h}\right)=\gamma_{h}\) depends only on the lag \(h\), so \(\mathrm{Cov}\left(Y_{m},Y_{n}\right)=\gamma_{m-n}\). Then we can write (\ref{eq:1}) as,

\[\begin{align*} \mathrm{Var}\left(\hat{\mu}\left(Y_{1:N}\right)\right) & =\frac{1}{N^{2}}\sum_{m=1}^{N}\sum_{n=1}^{N}\mathrm{Cov}\left(Y_{m},Y_{n}\right)\\ & =\frac{1}{N^{2}}\sum_{m=1}^{N}\sum_{n=1}^{N}\gamma_{m-n}\\ & =\frac{1}{N^{2}}\sum_{h=-(N-1)}^{N-1}\left(N-\left|h\right|\right)\gamma_{h}\\ & =\frac{1}{N^{2}}\left(N\gamma_{0}+2\sum_{h=1}^{N-1}\left(N-h\right)\gamma_{h}\right)\\ & =\frac{1}{N}\gamma_{0}+\frac{2}{N^{2}}\sum_{h=1}^{N-1}\left(N-h\right)\gamma_{h} \end{align*}\]

The third equality groups the \(N^{2}\) pairs \((m,n)\) by their lag \(h=m-n\), each lag occurring exactly \(N-|h|\) times; the fourth uses \(\gamma_{-h}=\gamma_{h}\), which follows from P2.
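This formula can be checked numerically. The following sketch (not part of the derivation) compares it against a Monte Carlo estimate for an AR(1) model, whose autocovariance function \(\gamma_h=\phi^{|h|}/(1-\phi^2)\) (with unit innovation variance) is available in closed form; the model and parameter values are illustrative choices.

    ## Compare the variance formula with a Monte Carlo estimate for AR(1)
    set.seed(1)
    N <- 100; phi <- 0.6
    gamma <- phi^(0:(N - 1)) / (1 - phi^2)   # gamma_h for h = 0, ..., N-1
    formula_var <- gamma[1] / N + (2 / N^2) * sum((N - 1:(N - 1)) * gamma[2:N])
    sim_means <- replicate(5000, mean(arima.sim(list(ar = phi), n = N)))
    c(formula = formula_var, simulation = var(sim_means))

The two numbers should agree up to Monte Carlo error.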

Question 1.2

The sample autocorrelation is perhaps the second most common type of plot in time series analysis, after simply plotting the data. We investigate how R represents chance variation in the plot of the sample autocorrelation function produced by the acf function. We seek to check what R actually does when it constructs the dashed horizontal lines in this plot. What approximation is being made? How should the lines be interpreted statistically?

If you type acf in R, you get the source code for the acf function. You’ll see that the plotting is done by a service function plot.acf. This service function is part of the package, and is not immediately accessible to you. Nevertheless, you can check the source code as follows:

  1. Notice, either from the help documentation ?acf or from the last line of the source code of acf, that this function resides in the package stats.

  2. Now, you can access this namespace directly, to list the source code, by

    stats:::plot.acf

  3. Now we can see how the horizontal dashed lines are constructed. The critical line of code seems to be

    clim0 <- if (with.ci) qnorm((1 + ci)/2)/sqrt(x$n.used)

This appears to correspond to a normal distribution approximation for the sample autocorrelation estimator, with mean zero and standard deviation \(1/\sqrt{N}\).
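One way to confirm this reading of the code (a sketch, assuming the default ci = 0.95) is to recompute clim0 by hand and overlay it on the plot produced by acf:

    ## Recompute the dashed-line height and overlay it for comparison
    set.seed(2)
    y <- rnorm(200)                     # IID data, so the null model holds
    a <- acf(y, plot = FALSE)
    clim0 <- qnorm((1 + 0.95) / 2) / sqrt(a$n.used)    # same computation as plot.acf
    plot(a)                                            # draws the dashed lines
    abline(h = c(-clim0, clim0), col = "red", lty = 3) # should coincide with them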


A. This question investigates the use of \(1/\sqrt{N}\) as an approximation to the standard deviation of the sample autocorrelation estimator under the null hypothesis that the time series is a sequence of independent, identically distributed (IID) mean zero random variables.

Instead of studying the full autocorrelation estimator, you are asked to analyze a simpler situation where we take advantage of the knowledge that the mean is zero and consider \[ \hat\rho_h(Y_{1:N}) = \frac{\frac{1}{N}\sum_{n=1}^{N-h} {Y_n} \, Y_{n+h}} {\frac{1}{N}\sum_{n=1}^{N} Y_{n}^2}\] where \(Y_1,\dots,Y_N\) are IID random variables with zero mean and finite variance. Specifically, find the mean and standard deviation for \(\hat\rho_h(Y_{1:N})\) when \(N\) becomes large.

The actual autocorrelation estimator subtracts a sample mean, and you can analyze that instead if you want an additional challenge.

You will probably want to make an argument based on linearization. You can reason at whatever level of math stat formalization you’re happy with. According to Mathematical Statistics and Data Analysis by John Rice, a textbook used for the undergraduate upper level Math Stats course, STATS 426,

“When confronted with a nonlinear problem we cannot solve, we linearize. In probability and statistics, this method is called propagation of errors or the \(\delta\) method. Linearization is carried out through a Taylor Series expansion.”

Rice then proceeds to describe the delta method in a way very similar to the Wikipedia article on this topic. In summary, suppose \(X\) is a random variable with mean \(\mu^{}_X\) and small variance \(\sigma^2_X\), and \(g(x)\) is a nonlinear function with derivative \(g^\prime(x)=dg/dx\). To study the random variable \(Y=g(X)\) we can make a Taylor series approximation, \[ Y \approx g(\mu^{}_X) + (X-\mu^{}_X) g^\prime(\mu^{}_X).\] This approximates \(Y\) as a linear function of \(X\), so we have

  1. \(\quad \mu^{}_Y = \mathbb{E}[Y]\approx g(\mu^{}_X)\).

  2. \(\quad \sigma^2_Y = \mathrm{Var}(Y) \approx \sigma^2_X \big\{g^\prime(\mu^{}_X)\big\}^2\).

  3. If \(X\sim N\big[\mu^{}_X,\sigma_X^2\big]\), then \(Y\) approximately follows a \(N\big[g(\mu^{}_X), \sigma^2_X \big\{g^\prime(\mu^{}_X)\big\}^2\big]\) distribution.
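As a small numerical illustration of items 1 and 2 (with \(g(x)=e^{x}\) as an arbitrary choice of nonlinear function, not one required by the problem), the delta-method variance \(\sigma^2_X \big\{g^\prime(\mu^{}_X)\big\}^2\) can be compared with a Monte Carlo estimate:

    ## Delta method vs. Monte Carlo for Y = exp(X), X ~ N(mu, sigma^2)
    set.seed(3)
    mu <- 1; sigma <- 0.05              # small variance, as the method assumes
    x <- rnorm(1e5, mu, sigma)
    c(delta_var = (sigma * exp(mu))^2,  # sigma_X^2 * {g'(mu_X)}^2, with g'(x) = exp(x)
      monte_carlo_var = var(exp(x)))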


\(\textbf{Solution.}\qquad\) First we define the following variables,

\[ U:=\sum_{n=1}^{N-h}Y_{n}Y_{n+h},\qquad V:=\sum_{n=1}^{N}Y_{n}^{2},\qquad g\left(U,V\right):=\frac{U}{V} \]

Since \(g\left(U,V\right)=\hat{\rho}_{h}\left(Y_{1:N}\right)\), we can approximate \(g\left(U,V\right)\) by first-order Taylor expansion about \(\textbf{a}=\left(\mu_{U},\mu_{V}\right)\),

\[\begin{align} g\left(\textbf{x}\right) & \approx g\left(\textbf{a}\right)+\left(\textbf{x}-\textbf{a}\right)\cdot\nabla g\left(\textbf{a}\right)\label{eq:2} \end{align}\]

where,

\[ \nabla g\left(\textbf{a}\right)=\left.\left[\begin{array}{c} \frac{1}{V}\\ -\frac{U}{V^{2}} \end{array}\right]\right|_{\left(\mu_{U},\mu_{V}\right)}=\left[\begin{array}{c} \frac{1}{\mu_{V}}\\ -\frac{\mu_{U}}{\mu_{V}^{2}} \end{array}\right] \]

Thus (\ref{eq:2}) becomes,

\[\begin{align} g\left(U,V\right) & \approx\frac{\mu_{U}}{\mu_{V}}+\left(U-\mu_{U}\right)\frac{1}{\mu_{V}}-\left(V-\mu_{V}\right)\frac{\mu_{U}}{\mu_{V}^{2}}\nonumber \\ & =\frac{1}{\mu_{V}}\left(U-\left(V-\mu_{V}\right)\frac{\mu_{U}}{\mu_{V}}\right)\label{eq:3} \end{align}\]

Next we compute \(\mu_{U}\) and \(\mu_{V}\). First \(\mu_{U}\) can be written as,

\[\begin{align} \mathbb{E}[U] & =\mathbb{E}\left[\sum_{n=1}^{N-h}Y_{n}Y_{n+h}\right]\nonumber \\ & =\sum_{n=1}^{N-h}\mathbb{E}\left[Y_{n}Y_{n+h}\right]=0\label{eq:4} \end{align}\]

where in (\ref{eq:4}) we used that, for \(h\geq1\), \(Y_{n}\) and \(Y_{n+h}\) are independent with mean zero, so \(\mathbb{E}\left[Y_{n}Y_{n+h}\right]=\mathbb{E}\left[Y_{n}\right]\mathbb{E}\left[Y_{n+h}\right]=0\). Next we compute \(\mu_{V}\), writing \(\sigma^{2}=\mathrm{Var}\left(Y_{n}\right)<\infty\) for the common variance; since the mean is zero, \(\mathbb{E}\left[Y_{n}^{2}\right]=\sigma^{2}\).

\[ \begin{aligned}\mathbb{E}[V] & =\mathbb{E}\left[\sum_{n=1}^{N}Y_{n}^{2}\right]\\ & =\sum_{n=1}^{N}\mathbb{E}\left[Y_{n}^{2}\right]=N\sigma^{2} \end{aligned} \]

Since \(\mu_{U}=0\) and \(\mu_{V}=N\sigma^{2}\), the second term in (\ref{eq:3}) vanishes and we obtain,

\[\begin{equation} \hat{\rho}_{h}\left(Y_{1:N}\right)\approx\frac{U}{N\sigma^{2}}\label{eq:5} \end{equation}\]

Using (\ref{eq:5}), we can compute the approximate mean of \(\hat{\rho}_{h}\left(Y_{1:N}\right)\),

\[ \mathbb{E}\left[\hat{\rho}_{h}\left(Y_{1:N}\right)\right]\approx\frac{1}{N\sigma^{2}}\mathbb{E}[U]=0 \]

which holds for every \(N\), and in particular as \(N\) becomes large.

We can also compute \(\mathrm{Var}\left(\hat{\rho}_{h}\left(Y_{1:N}\right)\right)\),

\[\begin{align} \mathrm{Var}\left(\hat{\rho}_{h}\left(Y_{1:N}\right)\right) & \approx\mathrm{Var}\left(\frac{U}{N\sigma^{2}}\right)\nonumber \\ & =\frac{1}{N^{2}\sigma^{4}}\mathrm{Var}\left(U\right)\label{eq:6} \end{align}\]

In order to finish computing (\ref{eq:6}), we need to compute \(\mathrm{Var}\left(U\right)\).

\[ \begin{aligned}\mathrm{Var}(U) & =\mathbb{E}\left[\left(\sum_{n=1}^{N-h}Y_{n}Y_{n+h}\right)^{2}\right]-\underbrace{\mathbb{E}\left[\sum_{n=1}^{N-h}Y_{n}Y_{n+h}\right]^{2}}_{=0}\\ & =\mathbb{E}\left[\sum_{n=1}^{N-h}Y_{n}^{2}Y_{n+h}^{2}+2\sum_{1\leq j<i\leq N-h}Y_{i}Y_{j}Y_{i+h}Y_{j+h}\right]\\ & =\sum_{n=1}^{N-h}\mathbb{E}\left[Y_{n}^{2}\right]\mathbb{E}\left[Y_{n+h}^{2}\right]+2\underbrace{\sum_{1\leq j<i\leq N-h}\mathbb{E}\left[Y_{i}Y_{j}Y_{i+h}Y_{j+h}\right]}_{=0}\\ & =(N-h)\sigma^{4} \end{aligned} \]

The first underbrace is zero by (\ref{eq:4}). The cross terms also vanish: for \(i>j\) and \(h\geq1\), at least one of the indices \(i,j,i+h,j+h\) appears exactly once in the product, so independence and the zero mean give \(\mathbb{E}\left[Y_{i}Y_{j}Y_{i+h}Y_{j+h}\right]=0\). The diagonal terms use \(\mathbb{E}\left[Y_{n}^{2}Y_{n+h}^{2}\right]=\mathbb{E}\left[Y_{n}^{2}\right]\mathbb{E}\left[Y_{n+h}^{2}\right]=\sigma^{4}\), again by independence since \(h\geq1\).

Plugging back into (\ref{eq:6}),

\[\begin{align*} \mathrm{Var}\left(\hat{\rho}_{h}\left(Y_{1:N}\right)\right) & \approx\frac{1}{N^{2}\sigma^{4}}(N-h)\sigma^{4}\\ & =\frac{N-h}{N^{2}} \end{align*}\]

When \(N\) becomes large relative to \(h\),

\[ \mathrm{Var}\left(\hat{\rho}_{h}\left(Y_{1:N}\right)\right)\approx\frac{N-h}{N^{2}}\approx\frac{1}{N} \]

so the standard deviation of \(\hat{\rho}_{h}\left(Y_{1:N}\right)\) is approximately \(1/\sqrt{N}\), matching the value used by plot.acf.
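This large-\(N\) behavior can be checked by simulation. The following sketch (with illustrative values of \(N\) and \(h\)) draws many IID samples and compares the empirical mean and standard deviation of \(\hat{\rho}_{h}\) with the theoretical values \(0\) and \(\sqrt{N-h}/N\approx1/\sqrt{N}\):

    ## Monte Carlo check of the mean and standard deviation of rho_hat
    set.seed(4)
    N <- 200; h <- 3
    rho_hat <- replicate(5000, {
      y <- rnorm(N)                                  # IID, mean zero
      sum(y[1:(N - h)] * y[(1 + h):N]) / sum(y^2)    # the simplified estimator
    })
    c(mean = mean(rho_hat), sd = sd(rho_hat), theory_sd = sqrt(N - h) / N)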


B. It is often asserted that the horizontal dashed lines on the sample ACF plot represent a confidence interval. For example, in the documentation produced by ?plot.acf we read

ci: coverage probability for confidence interval.  Plotting of the confidence interval is suppressed if ‘ci’ is zero or negative.

Use a definition of a confidence interval to explain how these lines do, or do not, construct a confidence interval.


\(\textbf{Solution.}\qquad\) From part A, \(\hat{\rho}_{h}\left(Y_{1:N}\right)\) has approximate mean \(0\) and variance \(1/N\) under the null hypothesis that the data are IID, so a central limit argument gives the large-\(N\) approximation,

\[ \sqrt{N}\,\hat{\rho}_{h}\left(Y_{1:N}\right)\overset{\text{approx}}{\sim}N\big[0,1\big] \]

Then, under this null hypothesis, we have the approximate \(95\%\) probability statement for \(\hat{\rho}_{h}\left(Y_{1:N}\right)\),

\[ \mathbb{P}\left(\frac{-1.96}{\sqrt{N}}\leq\hat{\rho}_{h}\left(Y_{1:N}\right)\leq\frac{1.96}{\sqrt{N}}\right)\approx0.95 \]

By the usual definition, a confidence interval is a data-dependent interval that covers the true parameter value with the stated probability, whatever that value is. The dashed lines do not satisfy this definition: they are centered at zero rather than at the estimate, and the probability statement above holds only under the null hypothesis of IID data. The lines are therefore better interpreted as the acceptance region of a level-\(0.05\) test of the IID hypothesis, applied separately at each lag, rather than as a confidence interval for \(\rho_{h}\).
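A quick simulation supports this interpretation (a sketch, with an illustrative sample size, using R's acf, whose estimator subtracts the sample mean but behaves the same way for large \(N\)): under IID data the lag-1 sample autocorrelation falls outside the dashed lines roughly 5% of the time, which is the size of the test rather than the coverage of an interval for a parameter.

    ## Fraction of IID samples whose lag-1 ACF value crosses the dashed lines
    set.seed(5)
    N <- 200
    outside <- replicate(5000, {
      r1 <- acf(rnorm(N), plot = FALSE)$acf[2]   # lag-1 sample autocorrelation
      abs(r1) > 1.96 / sqrt(N)
    })
    mean(outside)                                # close to 0.05 under the null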

Question 1.3

Explain which parts of your responses above made use of a source, meaning anything or anyone you consulted (including classmates or office hours) to help you write or check your answers. All sources are permitted, but failure to attribute material from a source is unethical. See the syllabus for additional information on grading.