Brief Overview of the Monte.Carlo.se Package

2019-05-01

The R package Monte.Carlo.se provides R code that easily produces standard errors for Monte Carlo simulation summaries using either jackknife or bootstrap resampling. (“Monte Carlo” methods essentially refer to any use of random simulation. David (1998) reports that the name was coined by the mathematician and computer scientist John von Neumann and his Los Alamos colleague S.M. Ulam.)

The functions and vignettes of the Monte.Carlo.se package give many examples; more details may be found in Boos and Osborne (2015) and Boos and Stefanski (2013, Ch. 9).

The main functions in this package are

They are explained in the vignettes

To fix ideas concretely, we generate 10,000 normal samples of size n=15 (taken from the Example 1 vignette).

N <- 10000
set.seed(346)                   # sets the random number seed
z <- matrix(rnorm(N*15),nrow=N) # N rows of N(0,1) samples, n=15

Then create vectors of N=10,000 means, 20% trimmed means, and medians computed from these samples,

out.m.15   <- apply(z,1,mean)             # mean for each sample
out.t20.15 <- apply(z,1,mean,trim=0.20)   # 20% trimmed mean for each sample
out.med.15 <- apply(z,1,median)           # median for each sample

and combine them into a Monte Carlo output matrix X

> X <- cbind(out.m.15,out.t20.15,out.med.15)
> dim(X)
[1] 10000     3
> X[c(1:4,9997:10000),]
           out.m.15  out.t20.15  out.med.15
    [1,] -0.2016663 -0.30957261 -0.23881327
    [2,]  0.4069637  0.27808734  0.09589171
    [3,]  0.2799703  0.51686132  0.47694372
    [4,]  0.1133106  0.05632255  0.11780811
       .          .           .           .
       .          .           .           .
       .          .           .           .
 [9997,] -0.1150505 -0.1225642  -0.38207995
 [9998,] -0.2972992 -0.3700191  -0.43463496
 [9999,]  0.3470409  0.4545897   0.57967180
[10000,]  0.4045499  0.4045008  -0.01031273

X is used to compute table entries (summaries) and their Monte Carlo standard errors. Examples of Monte Carlo summaries (= Monte Carlo estimates), often appearing in tables and plots, are

• the estimated bias and variance of an estimator;
• the estimated percentiles of a test statistic or pivotal quantity;
• the estimated power function of a hypothesis test;
• the estimated mean length and coverage probability of a confidence interval.
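As a minimal sketch of the first bullet, the estimated bias and variance of each estimator are column-wise summaries of a Monte Carlo output matrix. The block regenerates a matrix like X so that it stands alone; the names true.mean, bias.hat, and var.hat are illustrative and are not part of the package.

```r
# Regenerate a Monte Carlo output matrix like X above (base R only).
N <- 10000
set.seed(346)
z <- matrix(rnorm(N * 15), nrow = N)        # N rows of N(0,1) samples, n = 15
X <- cbind(out.m.15   = apply(z, 1, mean),
           out.t20.15 = apply(z, 1, mean, trim = 0.20),
           out.med.15 = apply(z, 1, median))

true.mean <- 0                              # all three estimators target 0 under N(0,1)
bias.hat  <- colMeans(X) - true.mean        # estimated bias of each estimator
var.hat   <- apply(X, 2, var)               # estimated variance of each estimator

round(rbind(bias.hat, var.hat), 4)
```

These are summaries only; they carry no standard errors yet.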

To further clarify statistical language, several definitions are important. Let $$Y$$ be any random quantity computed from a random sample or process.

• the mean of $$Y$$, denoted $$E(Y)=\mu$$, is the expected value (or average) of $$Y$$;
• the variance of $$Y$$ is the expected (or average) value of $$\{Y-E(Y)\}^2$$;
• the standard deviation (SD) is $$\sqrt{\mbox{variance}}$$ for any random quantity;
• the standard error (SE) is an estimate of the SD.

We find that using the above definitions for standard deviation and standard error leads to clarity.

When Monte Carlo precedes any of these definitions, as in Monte Carlo SE, we mean the standard error computed from $$N$$ independent replicates of random quantities, typically computed from $$N$$ Monte Carlo simulated samples. For example, suppose $$N$$ samples of size $$n$$ are generated, and the sample median (MD) is computed from each sample, resulting in $$MD_1, \ldots, MD_N$$, a Monte Carlo sample of sample medians (out.med.15 created above is an example). A Monte Carlo estimate of the bias of the sample median would be $\frac{1}{N}\sum_{i=1}^N MD_i - \theta,$ where $$\theta$$ is the population median. The Monte Carlo SE of this bias estimate is simply $$s/\sqrt{N}$$, where $$s$$ is the sample standard deviation of the $$N$$ sample medians, $s=\left\{\frac{1}{N-1}\sum_{i=1}^N (MD_i-\overline{MD})^2\right\}^{1/2}.$
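The two formulas above translate directly into base R. The following is a minimal sketch, assuming standard normal samples so that the population median is $$\theta = 0$$; the names md, theta, bias.hat, and mc.se are illustrative only.

```r
N <- 10000
set.seed(346)
z  <- matrix(rnorm(N * 15), nrow = N)   # N samples of size n = 15
md <- apply(z, 1, median)               # MD_1, ..., MD_N

theta    <- 0                           # population median of N(0,1)
bias.hat <- mean(md) - theta            # Monte Carlo estimate of the bias
mc.se    <- sd(md) / sqrt(N)            # Monte Carlo SE of the bias estimate: s / sqrt(N)

c(bias = bias.hat, se = mc.se)
```

Because the bias estimate is a sample mean over the replications, no resampling is needed for its SE.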

As explained in the summary of Boos and Osborne (2015):

“Good statistical practice dictates that summaries in Monte Carlo studies should always be accompanied by standard errors. Those standard errors are easy to provide for summaries that are sample means over the replications of the Monte Carlo output: for example, bias estimates, power estimates for tests and mean squared error estimates. But often more complex summaries are of interest: medians (often displayed in boxplots), sample variances, ratios of sample variances and non-normality measures such as skewness and kurtosis. In principle, standard errors for most of these latter summaries may be derived from the Delta Method, but that extra step is often a barrier for standard errors to be provided.”

The purpose of the package is to provide Monte Carlo SEs for both simple and complex summaries from Monte Carlo output.
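For a summary that is not a sample mean, such as the sample variance of the $$N$$ medians, resampling avoids the Delta Method derivation. The block below is a hand-rolled, base-R sketch of the delete-one jackknife SE, not the package's own implementation. jack.se is a hypothetical helper name introduced for illustration, and $$N$$ is kept small so the leave-one-out loop runs quickly.

```r
# Delete-one jackknife SE of an arbitrary summary of a Monte Carlo vector.
# jack.se is an illustrative helper, not a function from Monte.Carlo.se.
jack.se <- function(x, summary.f) {
  N   <- length(x)
  # Recompute the summary with each replicate left out in turn.
  loo <- vapply(seq_len(N), function(i) summary.f(x[-i]), numeric(1))
  sqrt((N - 1) / N * sum((loo - mean(loo))^2))
}

N <- 2000
set.seed(346)
z  <- matrix(rnorm(N * 15), nrow = N)   # N samples of size n = 15
md <- apply(z, 1, median)               # Monte Carlo sample of medians

var(md)                  # complex summary: sample variance of the medians
jack.se(md, var)         # its jackknife Monte Carlo SE

# Sanity check: for the sample mean the jackknife SE reduces to s / sqrt(N).
all.equal(jack.se(md, mean), sd(md) / sqrt(N))
```

The final line checks the sketch against the closed-form SE of a sample mean, where the jackknife and $$s/\sqrt{N}$$ agree exactly.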