Question 1

The Nasdaq index data set from Jan 1, 2008 to Dec 31, 2012 is in the data set Nasdaq_daily_Jan1_2008_Dec31 2012.csv in Data folder on Canvas.

  1. Fit a Generalized Pareto Distribution (GPD for short) to the upper tail of the negative returns and give detailed plots of the fit (along the lines of what is generated in class) and discuss your results.
  2. Compute the relative VaR (expressed in units of the current price) at the level \(=.0025\) utilizing for the case of fitting a normal distribution to the returns (not using any POT).
  3. Compute VaR for 1 million dollars invested in Ford stock at the level \(=.0025\) utilizing the GPD distributional fits generated in part (a). Discuss the comparison of this result from part (b).
  4. Utilizing “quant” command in evir package, give a discussion on the stability of the VaR as function of threshold.
  5. Compute expected shortfall for part (c) using Monte Carlo methods and the estimated GPD distribution and relative VaR in (c).

(a) \(\mathbf{Solution.}\qquad\) For this homework we attempt to recreate most plots from evir library using ggplot2. This will also be useful for creating plots in Question 2. The code is adapted directly from the functions in evir. The code for the custom ggplot functions can be found on the last page. First we build the empirical CDF of the negative returns for the Nasdaq index.

Next we create the shape plot from which we will determine a suitable shape paramter \(\xi\) for our GPD.

From the shape plot above, we determine that a proper threshold is about 0.0121. After having chosen this threshold, we can now generate the tailplot to see how well the points in the tail are fitted to our estimated GPD.

We can see that that our fitted GPD does a good job of fitting the points, with scale parameter \(\hat{\beta}=\) 0.013 and shape paramter \(\hat{\xi}=\) 0.035.

   

(b) \(\mathbf{Solution.}\qquad\) Below we fit a normal distribution to the returns and estimate the relative VaR.

The 0.0025-quantile of the returns is -0.0482 so the relative VaR is 0.0482.

   

(c) \(\mathbf{Solution.}\qquad\) Below we get our VaR estimate using our fitted GPD from problem (a).

##       beta 
## 0.06946913

For the GPD case, our relative VaR is 0.0695 which is higher than the relative VaR estimated by the normal distribution in part (b). Viewing the qq-plot below for Nasdaq returns, we see that the left tail is more heavy-tailed than predicted by a normal distribution. Thus, we can see that the normal distribution underestimates VaR and our fitted GPD provides a better estimate.

   

(d) \(\mathbf{Solution.}\qquad\) Using the quant command from evir library, in the graph below we see that our VaR estimate is stable as we vary the threshold. We find that VaR ranges from [0.0687], [0.0713] which is only a difference of 0.0026.

   

(e) \(\mathbf{Solution.}\qquad\) Below we compute the ES according to our fitted GPD.

We find that \(ES=\) 0.0846 so the shortfall is $84640.74.

Question 2

Redo problems 1 (a), (c), (d), and (e) utilizing an exponential distribution. In particular for part (a) you do not need to consider multiple thresholds, but rather fit an exponential distribution to the tail, based on the one threshold you utilized in problem 1, and provide tail-fit plot and discussion about goodness of fit of this model. Parts (c),(d), and (e) should be done in a totally similar fashion as you did for problem 1. Provide a discussion on the differences/similarities in the results for Problem 1 and 2, and whether they match what would have been expected based on the estimated GPD shape parameter in problem 1.

(a) \(\mathbf{Solution.}\qquad\) Below we construct the tail plot for the negative Nasdaq returns using an exponential distribution.

From the tail plot above it seems that exponential distribution with \(\lambda\) = 76.62 slightly underestimates the upper tail of negative returns. In contrast to our GPD distribution, we belive that GPD provides a better fit.

   

(c) \(\mathbf{Solution.}\qquad\) Below we get our VaR estimate using our fitted exponential distribution from problem (a).

## [1] 0.06725146

For the exponential case, our relative VaR is 0.0673 and VaR is $67251.46 which is lower than the relative VaR estimated by the GPD in part 1 (c) by 0.0022 units.

   

(d) \(\mathbf{Solution.}\qquad\) Below we compute the VaR as a function of threshold.

In the graph above we see that our VaR estimate is stable as we vary the threshold. We find that VaR ranges from 0.0642, 0.0696 which is a difference of 0.0053. This range falls below the range we estimated in our GPD.

   

(e) \(\mathbf{Solution.}\qquad\) Below we compute the expected shortfall using Monte Carlo methods and our exponential distribution.

We find that \(ES=\) 0.0803 so the shortfall is $80296.14. This \(ES\) is lower than the \(ES\) predicted by GPD by 0.0043 units

Question 3

Problem 4 on page 132 in Ruppert/Matteson Let \(X\) be a random variable with mean \(\mu\) and standard deviation \(\sigma\).

  1. Show that the kurtosis of \(X\) is equal to 1 plus the variance of \(\{(X-\mu) / \sigma\}^{2}\)
  2. Show that the kurtosis of any random variable is at least 1.
  3. Show that a random variable \(X\) has a kurtosis equal to 1 if and only if \(P(X=a)=P(X=b)=1 / 2\) for some \(a \neq b\).

(a) \(\mathbf{Solution.}\qquad\) Let \(Z=\frac{X-\mu}{\sigma},\) then we know that, \[\begin{align*} \mathbb{E}[Z] & =\frac{1}{8}(\mathbb{E}[X]-\mu)=0\\ \operatorname{Var}(Z) & =\frac{1}{\sigma^{2}}(\operatorname{Var}(X))=1 \end{align*}\]

Since \(\operatorname{Var}\left(Z^{2}\right)=\mathbb{E}\left[Z^{4}\right]-\mathbb{E}\left[Z^{2}\right]^{2}\), then, \[\begin{align*} \operatorname{Kur}(X) & =\mathbb{E}\left[Z^{4}\right]\\ & =\operatorname{Var}\left(Z^{2}\right)+\mathbb{E}\left[Z^{2}\right]^{2}\\ & =\operatorname{Var}\left(Z^{2}\right)+(1)^{2}\\ & =\operatorname{Var}\left(Z^{2}\right)+1 \end{align*}\]

   

(b) \(\mathbf{Solution.}\qquad\) From question 3 (a) we showed that the \(\mathrm{Kur}\left(X\right)=\operatorname{Var}\left(Z^{2}\right)+1\). Since \(\operatorname{Var}\left(Z^{2}\right)\in[0,\infty)\), then the smallest \(\mathrm{Kur}\left(X\right)\) can be is 1.

   

(c) \(\mathbf{Solution.}\qquad\) (\(\implies\)) First, if \(X\) is a random variable where \(\mathrm{Kur}\left(X\right)=1\), then question 3 (a) implies, \[\begin{align} 1 & =\operatorname{Var}\left\{ Z^{2}\right\} +1\nonumber \\ 0 & =\operatorname{Var}\left\{ \left(\frac{X-\mu}{\sigma}\right)^{2}\right\} \nonumber \\ 0 & =\frac{1}{\sigma^{4}}\operatorname{Var}\left(X^{2}-2\mu X+\mu^{2}\right)\nonumber \\ 0 & =\operatorname{Var}\left(X^{2}-2\mu X\right)\nonumber \\ 0 & =\mathbb{E}\left[\left(\left(X^{2}-2\mu X\right)-\mathbb{E}\left[\left(X^{2}-2\mu X\right)\right]\right)^{2}\right]\label{eq:1} \end{align}\]

Equation (\ref{eq:1}) implies that \(X^{2}-2\mu X=\mathbb{E}\left[X^{2}-2\mu X\right]\). Then we have, \[\begin{align*} X^{2}-2\mu X & =\mathbb{E}\left[X^{2}-2\mu X\right]\\ & =\mathbb{E}\left[X^{2}\right]-2\mu\mathbb{E}\left[X\right]\\ & =\sigma^{2}+\mu^{2}-2\mu^{2}\\ & =\sigma^{2}-\mu^{2} \end{align*}\]

Moving all terms to the left, \[\begin{align} X^{2}-2\mu X-\left(\sigma^{2}-\mu^{2}\right) & =0\nonumber \\ \left(X-(\sigma+\mu)\right)\left(X+(\sigma-\mu)\right) & =0\label{eq:2} \end{align}\]

To satisfy equation (\ref{eq:2}), we need \(X=\left(\sigma+\mu\right)\text{ or }\left(\mu-\sigma\right)\). We have shown that \(X\) can only take on two values. Next we need to show that the two outcomes of \(X\) occur with equal probability. In this case we define the probabilities as, \(p:=\mathbb{P}\left(X=\sigma+\mu\right)\) and \(\left(1-p\right):=\mathbb{P}\left(X=\mu-\sigma\right)\). By computing \(\mathbb{E}\left[X\right]\), \[\begin{align*} \mathbb{E}\left[X\right] & =p(\sigma+\mu)+(1-p)(\mu-\sigma)\\ \mu & =p\sigma+p\mu+\mu-\sigma-p\mu+p\sigma\\ \sigma & =2p\sigma\\ \frac{1}{2} & =p \end{align*}\]

Thus we’ve proven the claim. Next we show the opposite direction.

 

(\(\Longleftarrow\)) Given \(X\) is a random variable where, \[ X=\begin{cases} a & \text{with }p=\frac{1}{2}\\ b & \text{with }\left(1-p\right)=\frac{1}{2} \end{cases} \]

Then the moments of \(X\) are, \[ \mathbb{E}[X]=\frac{1}{2}(a+b) \]

and, \[\begin{align*} \operatorname{Var}(X) & =\frac{1}{2}\left(a^{2}+b^{2}\right)-\left(\frac{1}{2}(a+b)\right)^{2}\\ & =\frac{1}{2}a^{2}+\frac{1}{2}b^{2}-\frac{1}{4}\left(a^{2}+2ab+b^{2}\right)\\ & =\frac{1}{4}\left(a^{2}-2ab+b^{2}\right)\\ & =\frac{1}{4}(a-b)^{2} \end{align*}\]

From question 3 (a) we know, \(\mathrm{Kur}(X)=1+\operatorname{Var}\left(Z^{2}\right)\). We will prove the claim by showing that \(\operatorname{Var}\left(Z^{2}\right)=0\) using direct computation. \[\begin{align} \operatorname{Var}\left(Z^{2}\right) & =\operatorname{Var}\left[\left(\frac{X-\frac{1}{2}\left(a+b\right)}{\frac{1}{2}\left(a-b\right)}\right)^{2}\right] \\ & =\frac{16}{(a-b)^{2}}\operatorname{Var}\left(X^{2}-(a+b)X\right)\nonumber \\ & =\frac{16}{(a-b)^{2}}\left[\operatorname{Var}\left(X^{2}\right)+(a+b)^{2}\operatorname{Var}\left(X\right)-2\left(a+b\right)\operatorname{Cov}\left(X^{2},X\right)\right]\label{eq:3} \end{align}\]

We solve for the terms in equation (\ref{eq:3}) in two parts. First the variance terms, \[\begin{aligned}\operatorname{Var}\left(X^{2}\right)+(a+b)^{2}\operatorname{Var}(X) & =\mathbb{E}\left[X^{4}\right]-\mathbb{E}\left[X^{2}\right]^{2}+(a+b)^{2}\left(\frac{1}{4}(a-b)^{2}\right)\\ & =\frac{1}{2}\left(a^{4}+b^{4}\right)-\frac{1}{4}\left(a^{2}+b^{2}\right)^{2}+\frac{1}{4}(a+b)^{2}(a-b)^{2}\\ & =\frac{1}{4}\left(a^{4}-2a^{2}b^{2}+b^{4}\right)+\frac{1}{4}(a+b)^{2}(a-b)^{2}\\ & =\frac{1}{4}(a+b)^{2}(a-b)^{2}+\frac{1}{4}(a+b)^{2}(a-b)^{2}\\ & =\frac{1}{2}(a+b)^{2}(a-b)^{2} \end{aligned} \]

the second part involves computing the covariance term, \[\begin{align*} -2(a+b)\mathrm{Cov}\left(X^{2},X\right) & =-2(a+b)\mathbb{E}\left[\left(X^{2}-\mathbb{E}\left[X^{2}\right]\right)\left(X-\mathbb{E}\left[X\right]\right)\right]\\ & =-2(a+b)\mathbb{E}\left[X^{3}-X^{2}\mathbb{E}[X]-X\mathbb{E}\left[X^{2}\right]+\mathbb{E}\left[X^{2}\right]\mathbb{E}\left[X\right]\right]\\ & =-2(a+b)\bigg[\frac{1}{2}\left(a^{3}+b^{3}\right)-\frac{1}{2}\left(a^{2}+b^{2}\right)\frac{1}{2}(a+b)\\ & \quad+\underbrace{-\frac{1}{2}(a+b)\frac{1}{2}\left(a^{2}+b^{2}\right)+\frac{1}{2}\left(a^{2}+b^{2}\right)\frac{1}{2}(a+b)}_{=0}\bigg]\\ & =-\left(a^{3}+b^{3}\right)(a+b)+\frac{1}{2}\left(a^{2}+b^{2}\right)(a+b)^{2}\\ & =-(a+b)^{2}\left[a^{2}-ab+b^{2}-\frac{1}{2}a^{2}-\frac{1}{2}b^{2}\right]\\ & =-\frac{1}{2}(a+b)^{2}\left[a^{2}-2ab+b^{2}\right]\\ & =-\frac{1}{2}(a+b)^{2}(a-b)^{2} \end{align*}\]

Thus equation (\ref{eq:3}) becomes, \[\begin{align*} \frac{16}{(a-b)^{2}}\left[\frac{1}{2}(a+b)^{2}(a-b)^{2}-\frac{1}{2}(a+b)^{2}(a-b)^{2}\right] & =0 \end{align*}\]

Therefore \(\operatorname{Kur}(X)=1\).

Additional Functions

gg_shape = function(data, models = 30, start = 15, end = 500, reverse = TRUE, 
                    ci = 0.95, auto.scale = TRUE, labels = TRUE) {
  data <- as.numeric(data)
  qq <- 0
  if (ci) 
    qq <- qnorm(1 - (1 - ci)/2)
  x <- trunc(seq(from = min(end, length(data)), to = start, 
                 length = models))
  
  gpd.dummy <- function(nex, data) {
    out <- gpd(data = data, nextremes = nex, information = "expected")
    c(out$threshold, out$par.ests[1], out$par.ses[1])
  }
  mat <- apply(as.matrix(x), 1, gpd.dummy, data = data)
  mat <- rbind(mat, x)
  dimnames(mat) <- list(c("threshold", "shape", "se", "exceedances"), 
                        NULL)
  thresh <- mat[1, ]
  y <- mat[2, ]
  index <- x
  if (reverse) 
    index <- -x
  df = data.frame("x" = index, "y" = y)
  labels = formatC(thresh[c(TRUE, FALSE, FALSE, FALSE, FALSE)], 
                   format = "e", digits = 2)
  breaks = index[c(TRUE, FALSE, FALSE, FALSE, FALSE)]
  ggplot(df, aes(x = x, y = y)) + 
    geom_line() + 
    scale_y_continuous(breaks = scales::pretty_breaks(n = 8)) + 
    scale_x_continuous(breaks = breaks,
                       labels = as.character(-breaks),
                       sec.axis = sec_axis(~ . * 1,
                       breaks = breaks, 
                       labels =  labels)) + 
    ylab("Shape (xi)") + 
    xlab("Exceedances") + 
    ggtitle("Threshold") + 
    theme(plot.title = element_text(hjust = 0.5, size = 10))
}


gg_tailplot = function(x, extend = 1.5) {
  data <- as.numeric(x$data)
  threshold <- x$threshold
  xi <- x$par.ests["xi"]
  beta <- x$par.ests["beta"]
  plotmin <- threshold
  if (extend <= 1) 
    stop("extend must be > 1")
  plotmax <- max(data) * extend
  xx <- seq(from = 0, to = 1, length = 1000)
  z <- qgpd(xx, xi, threshold, beta)
  z <- pmax(pmin(z, plotmax), plotmin)
  ypoints <- ppoints(sort(data))
  y <- pgpd(z, xi, threshold, beta)
  prob <- x$p.less.thresh
  ypoints <- (1 - prob) * (1 - ypoints)
  y <- (1 - prob) * (1 - y)
  df = data.frame("sortdata" = sort(data), "ypts" = ypoints)
  df2 = data.frame("x" = z[y >= 0] , "y" = y[y >= 0])
  ggplot(df, aes(x = sortdata, y = ypts)) +
    geom_point(shape = 1, size = 2) + 
    scale_x_log10() + 
    scale_y_log10() +  
    geom_line(data = df2, aes(x = x, y = y), inherit.aes = FALSE) + 
    ggtitle(paste0("Tail Plot of GPD for u = ", u)) + 
    xlab('x (on log scale)') + 
    ylab('1-F(x) (on log scale)') + 
    theme(plot.title = element_text(hjust = 0.5, size = 10))
}