Hypothesis Tests and Confidence Intervals Involving
Abstract
This technical report explores some issues left open in Technical Reports 669
and 670
(Geyer and Shaw, 2008a,b): for fitness landscapes fit using an aster models, we
propose
hypothesis tests of whether the landscape has a maximum and confidence regions
for
the location of the maximum.
All analyses are d one in R (R Development Core Team, 2008)
using the aster
contributed package described by Geyer, Wagenius and Shaw (2007) and Shaw,
Geyer,
Wagenius, Hangelbroek, and Etterson (2008). Furthermore, all analyses are done
using
the Sweave function in R, so this entire technical report and all of the
analyses reported
in it are exactly reproducible by anyone who has R with the aster package
installed
and the R noweb file specifying the document.
1 R Package Aster
We use R statistical computing environment (R Development
Core Team, 2008) in our
analysis. It is free software and can be obtained from Precompiled
binaries are available for Windows, Macintosh, and popular Linux distributions.
We use the contributed package aster. If R has been installed, but this package
has not
yet been installed, do
install. packages ("aster")
from the R command line (or do the equivalent using the
GUI menus if on Apple Macintosh
or Microsoft Windows). This may require root or administrator privileges.
As suming the aster package has been installed, we load it
> library (aster)
The version of the package used to make this document is
0.7-7 (which is available on
CRAN). The version of R used to make this document is 2.8.1.
This entire document and all of the calculations shown
were made using the R command
Sweave and hence are exactly reproducible by anyone who has R and the R noweb (RNW)
file from which it was created. Both the RNW file and and the PDF document
produced
from it are available at . For further details on
the use of Sweave and R see Chapter 1 of the technical report by Shaw, et al.
(2007) available
at the same web site.
Not only can one exactly reproduce the results in the
printable document, one can also
modify the parameters of the simulation and get different results.
Finally, we set the seeds of the random number generator
so that we obtain the same
results every time. To get different results, obtain the RNW file, change this
statement ,
and reprocess using Sweave and LATEX.
> set.seed(42)
2 Data Structure
We use the data simulated in Technical Report 669 (Geyer
and Shaw, 2008a, herein after
TR 669), because this simulated data has features not present in any currently
available
real data yet shows the full possibilities of aster modeling.
Temporarily, we load the data from a file. We intend that
these data will be incorporated
in a future version of the aster package.
> load ("sim.rda")
> ls ()
[1] "beta. true" "fam" "ladata" "mu.true"
[5] "phi. true" "pred" "redata" "theta.true"
[9] "vars"
For a full description of the graphical structure of these
data see Section 2.1 of TR 669.
For a full description of the variables and their conditional distributions see
Section 2.2 of
TR 669. For a full description of the sum of fitness components that is deemed
the best
surrogate of fitness for these data see Section 2.3 of TR 669.
3 Hypothesis Tests about Maxima
3.1 Asymptotic
First, we consider asymptotic (large sample, approximate)
tests based on the well known
likelihood ratio test, which has asymptotic chi- square distribution .
We fit the same model fit in TR 669 in which the
unconditional canonical parameter
corresponding to best surrogate of fitness is a quadratic function of phenotype
data. In
short, fitness is quadratic on the canonical parameter scale.
> out6 <- aster (resp ~ varb + 0 + z1 + z2 + I(z1^2) +
+ I(z1 * z2) + I(z2^2), pred, fam, varb, id, root,
+ data = redata)
We also fit the model in which fitness is linear on the
canonical parameter scale.
> out5 <- aster (resp ~ varb + 0 + z1 + z2, pred, fam,
+ varb, id, root, data = redata)
Then we compare these two models using the conventional
likelihood ratio test.
> anova(out5, out6)
Analysis of Deviance Table
Model 1: resp ~ varb + 0 + z1 + z2
Model 2: resp ~ varb + 0 + z1 + z2 + I(z1^2) + I(z1 * z2) + I(z2^2)
Model Df Model Dev Df Deviance P(>|Chi|)

The P- value calculated here is highly statistically significantly (P = 9.6 ×
10-4). It is a
correct asymptotic approximation to the P-value for comparing the linear and
quadratic
models. It says the quadratic model clearly fits better.
A linear model cannot have a stationary point. A quadratic model does have a
stationary
point. Hence this is also a test of whether the fitness landscape has a
stationary point, which
may be a maximum, a minimum, or a saddle point.
If we wish to turn this into a test for the presence of a maximum, we should
make the
alternative be that the model is quadratic and the the quadratic surface has a
maximum.
Since this requires the coefficients of both I(z1^2) and I(z2^2) to be negative,
this restricts
the alternative to 1/4 of the parameter space. Hence the appropriate P-value for
this test
is 1/4 of the P-value for the general quadratic alternative, that is, (P = 2.4 ×
10-4).
This test, although seemingly liberal, dividing the P -value of the conventional
test by
4, or, more generally, by 2p where p is the number of phenotypic variables, is
in fact
(asymptotically) conservative. Our claim that the alternative is only 1/4 of the
parameter
space, or 2-p of the parameter space for general p, is conservative, since it
only takes account
of the diagonal terms of the Hessian matrix of the quadratic function.
3.2 Parametric Bootstrap
Second, we consider tests based on the distribution under the null hypothesis
determined
by simulation. These are still large sample approximate in a weak sense in that
we should
simulate the distribution for the true unknown parameter vector
but cannot and
must
use our best approximation, which is the distribution for the parameter vector
that is
the MLE for the null hypothesis. Since this only makes sense when
is close to
, this
procedure is only approximate. To remind everyone of this fact, we call the
procedure
a parametric bootstrap rather than a simulation test. However, this procedure is
much
less approximate than the procedure of the preceding section, because it does
not use the
chi-square approximation for the distribution of the test statistic but instead
calculates its
exact sampling distribution when
is the true parameter value.
The test in the preceding section also does not fully account for the
restriction that the
Hessian matrix be negative definite . Thus an even more correct P-value can be
obtained
using a parametric bootstrap that goes like this.
> nsim <- 249
> theta. boot <- predict(out5, parm. type = "canonical",
+ model. type = "conditional")
> nind <- length (unique (redata$id))
> theta. boot <- matrix (theta. boot, nrow = nind)
> phi. boot <- out6$coef * 0
> phi.boot[1:length(out5$coef)] <- out5$coef
> pvalsim <- double (nsim)
> eigmaxsim <- double (nsim)
> save. time <- proc. time ()
> for (i in 1:nsim) {
+ ystar <- raster (theta.boot, pred, fam, root = theta. boot^0)
+ redatastar <- redata
+ redatastar$resp <- as.vector (ystar)
+ out6star <- aster (resp ~ varb + 0 + z1 + z2 +
+ I(z1^2) + I(z1 * z2) + I(z2^2), pred, fam,
+ varb, id, root, parm = phi.boot, data = redatastar)
+ out5star <- aster (resp ~ varb + 0 + z1 + z2,
+ pred, fam, varb, id, root, parm = out5$coef,
+ data = redatastar)
+ Afoo <- matrix (NA, 2, 2)
+ Afoo[1, 1] <- out6star$coef["I(z1^2)"]
+ Afoo[2, 2] <- out6star$coef["I(z2^2)"]
+ Afoo[1, 2] <- out6star$coef["I(z1 * z2)"]/2
+ Afoo[2, 1] <- out6star$coef["I(z1 * z2)"]/2
+ pvalsim[i] <- anova(out5star, out6star)[2, 5]
+ eigmaxsim[i] <- max (eigen(Afoo, symmetric = TRUE,
+ only. values = TRUE) $values)
+ }
> elapsed. time <- proc. time() - save. time
> pval. obs <- anova(out5, out6)[2, 5]
> pvalsim. corr <- pvalsim
> pvalsim. corr[eigmaxsim > 0] <- 1
> mean (c(pvalsim.corr, pval. obs) <= pval. obs)
[1] 0.004
The parametric bootstrap P-value, here P = 0.004, cannot be lower than 1/(n +
1), where
n is the number of simulations, here nsim = 249. We have gotten the lowest
bootstrap
P-value we could have with this number of simulations, which took 24 minutes and
52.1
seconds.
Hence there is little point in using the parametric bootstrap here where the
asymptotic
P-value (P = 2.4×10-4) is so small. If the asymptotic P-value were equivocal,
somewhere
in the vicinity of 0.05, then there would be much more reason to calculate a
parametric
bootstrap P-value, and the code above shows how to do it right.
We can see that the correction of dividing the conventional P-value by 2p is
conservative
here. The fraction of the sample in which the matrix A is negative definite is
0.145, a good
deal less than 2-p = 0.25.
4 Confidence Regions about Maxima
Now we consider the MLE of the location of the maximum, which is calculated in
TR 669
as follows
> Afoo <- matrix (NA, 2, 2)
> Afoo[1, 1] <- out6$coef["I(z1^2)"]
> Afoo[2, 2] <- out6$coef["I(z2^2)"]
> Afoo[1, 2] <- out6$coef["I(z1 * z2)"]/2
> Afoo[2, 1] <- out6$coef["I(z1 * z2)"]/2
> bfoo <- rep (NA, 2)
> bfoo[1] <- out6$coef["z1"]
> bfoo[2] <- out6$coef["z2"]
> cfoo <- solve (-2 * Afoo, bfoo)
> cfoo
[1] 3.335738 1.626314
The explanation for this is that the estimated regression function, mapped to
the natural
parameter scale, is

where c is an arbitrary constant, b is the R vector bfoo above and A is the R
matrix Afoo
above. The first derivative vector is

and setting this equal to zero and solving for z gives

For future reference, we also compute the maximum of the simulation truth
fitness
landscape.
> Abar <- matrix (NA, 2, 2)
> Abar[1, 1] <- beta. true ["I(z1^2)"]
> Abar[2, 2] <- beta. true ["I(z2^2)"]
> Abar[1, 2] <- beta. true ["I(z1 * z2)"]/2
> Abar[2, 1] <- beta. true ["I(z1 * z2)"]/2
> bbar <- rep (NA, 2)
> bbar[1] <- beta. true ["z1"]
> bbar[2] <- beta. true ["z2"]
> cbar <- solve(-2 * Abar, bbar)
In order to apply the multivariable delta method, we need to differentiate the
function
of the parameter
that gives the maximum,

where we have now written the matrix A and the vector b as functions of the
regression
coefficient vector
, which they are, each component of A and each component of
b being
a component of
. The partial derivatives are

The asymptotic variance of the components of
is the submatrix of the inverse
Fisher
information matrix corresponding to the components of
that enter into A(
) and
b(
)
> beta. sub. names <- c("I(z1^2)", "I(z2^2)", "I(z1 * z2)",
+ "z1", "z2")
> beta. sub. idx <- match (beta.sub.names, names(out6$coef))
> asymp. var <- solve(out6$fisher)
> asymp. var <- asymp. var [beta. sub. idx, ]
> asymp. var <- asymp. var [, beta. sub. idx]
The derivative matrix is set up as follows
> jack <- matrix (NA, 2, 5)
> jack[, 1] <- (1/2) * solve (Afoo) %*% matrix (c(1,
+ 0, 0, 0), 2, 2) %*% solve (Afoo) %*% cbind (bfoo)
> jack[, 2] <- (1/2) * solve (Afoo) %*% matrix(c(0,
+ 0, 0, 1), 2, 2) %*% solve (Afoo) %*% cbind (bfoo)
> jack[, 3] <- (1/2) * solve (Afoo) %*% matrix(c(0,
+ 1, 1, 0), 2, 2) %*% solve (Afoo) %*% cbind (bfoo)
> jack[, 4] <- (-(1/2) * solve (Afoo) %*% matrix(c(1,
+ 0), 2, 1))
> jack[, 5] <- (-(1/2) * solve (Afoo) %*% matrix(c(0,
+ 1), 2, 1))
Finally, we finish applying the delta method
> asymp. var <- jack %*% asymp. var %*% t (jack)
> asymp. var

So now we plot a confidence region for the maximum based on the delta method
calculation
above. The following R statements make Figure 1 (page 8)
> par (mar = c(2, 2, 1, 1) + 0.1)
> plot(ladata$z1, ladata$z2, xlab = "", ylab = "",
+ pch = 20, axes = FALSE, xlim = range(ladata$z1,
+ cfoo[1]), ylim = range(ladata$z2, cfoo[2]))
> title (xlab = "z1", line = 1)
> title (ylab = "z2", line = 1)
> box()
> z1 <- cos(seq(0, 2 * pi, length = 101))
> z2 <- sin(seq(0, 2 * pi, length = 101))
> z <- rbind(z1, z2)
> points(cfoo[1], cfoo[2], col = "blue", pch = 19)
> fred <- eigen (asymp.var)
> sally <- fred $ vectors %*% diag (sqrt (fred $ values)) %*%
+ t(fred $ vectors)
> points(cbar[1], cbar[2], col = "green3", pch = 19)
> jane <- qchisq(0.5, 2) * sally %*% z
> lines(cfoo[1] + jane[1, ], cfoo[2] + jane[2, ], col = "blue",
+ lwd = 2)
> jane <- qchisq(0.75, 2) * sally %*% z
> lines(cfoo[1] + jane[1, ], cfoo[2] + jane[2, ], col = "blue",
+ lwd = 2, lty = "dotted")
> jane <- qchisq(0.9, 2) * sally %*% z
> lines(cfoo[1] + jane[1, ], cfoo[2] + jane[2, ], col = "blue",
+ lwd = 2, lty = "dashed")
The confidence regions are huge, the 90% confidence region containing most of
the range of
the data. One would need much larger sample sizes than the 500 used here to get
precise
confidence regions.
One might think we should have a section on how to calculate a confidence region
based
on the parametric bootstrap rather than asymptotic normality, and this would be
expected
for a confidence interval. However, it is an open research question how best to
make a
confidence region in this situation. The elliptical (large sample, approximate,
delta method)
confidence regions shown in Figure 1 get their shape from the asymptotic
bivariate normal
distribution of the two-dimensional vector (location of the maximum) being
estimated.
A bivariate normal distribution has elliptical contours of its probability
density function,
hence elliptical confidence regions make sense. If we drop the “assumption”of
normality (not
really an assumption but an asymptotic approximation), then there is no reason
to make
elliptical confidence regions. In fact, the main point of parametric bootstrap
confidence
intervals is to drop the “assumption” of normality and use intervals that are
not centered
at the MLE and reflect the skewness of the simulation distribution of the
estimates. So in
order to do a parametric bootstrap correctly in this situation, we should also
use a nonelliptical
confidence region that reflects the non-normality of the simulation distribution
of
the estimates. But how? That is the open research question. All of the bootstrap
literature
known to us is about confidence intervals, not about confidence regions.