Compute a Monte Carlo estimate of the standard normal cdf \(\Phi(x) = \int_{-\infty}^x \frac{1}{\sqrt{2\pi}} \text{e}^{-t^2/2}\, dt\).
x <- seq(.1, 2.5, length = 10)  # evaluation points
n <- 10000
u <- runif(n)                   # U ~ Uniform(0,1)
cdf <- numeric(length(x))
for (i in 1:length(x)) {
  g <- x[i] * exp(-(u * x[i])^2 / 2)      # integrand after the substitution t = u * x[i]
  cdf[i] <- mean(g) / sqrt(2 * pi) + 0.5  # add Phi(0) = 0.5 for the lower half
}
Hint: Break this problem into two cases: \(x \geq 0\) and \(x < 0\), and use the symmetry of the normal density to handle the second case. Make the substitution \(u = t/x\), so that \(dt = x\, du\).
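Carrying the substitution through makes the estimator explicit: for \(x > 0\),
\[ \Phi(x) = \frac{1}{2} + \frac{1}{\sqrt{2\pi}} \int_0^1 x \, \text{e}^{-(ux)^2/2} \, du = \frac{1}{2} + \frac{1}{\sqrt{2\pi}} \, E\left[ x \, \text{e}^{-(Ux)^2/2} \right], \quad U \sim \text{Uniform}(0,1), \]
which is exactly what mean(g) / sqrt(2 * pi) + 0.5 computes in the loop above; for \(x < 0\), use the symmetry \(\Phi(x) = 1 - \Phi(-x)\).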
x <- seq(.1, 2.5, length = 10)
n <- 10000
z <- rnorm(n)
dim(x) <- length(x)  # coerce x to a one-dimensional array so apply() works
p <- apply(x, MARGIN = 1,
           FUN = function(x, z) {mean(z < x)}, z = z)  # proportion of z below each x
Phi <- pnorm(x)      # exact values for comparison
print(round(rbind(x, cdf, p, Phi), 4))
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
## x 0.1000 0.3667 0.6333 0.9000 1.1667 1.4333 1.7000 1.9667 2.2333 2.5000
## cdf 0.5398 0.6430 0.7366 0.8155 0.8775 0.9228 0.9535 0.9729 0.9842 0.9902
## p 0.5433 0.6437 0.7338 0.8105 0.8749 0.9242 0.9563 0.9755 0.9882 0.9938
## Phi 0.5398 0.6431 0.7367 0.8159 0.8783 0.9241 0.9554 0.9754 0.9872 0.9938
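The row p is the sample-proportion (hit-or-miss) estimator
\[ \hat{\Phi}(x) = \frac{1}{n} \sum_{i=1}^{n} I(z_i < x), \quad z_i \sim N(0, 1), \]
computed from one normal sample reused at every value of \(x\).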
Estimate the variance of the estimator in Example 3, and build an approximate 95% confidence interval for the estimate of \(\Phi(2)\).
x <- 2
n <- 10000
z <- rnorm(n)
g <- (z < x)                    # Bernoulli indicator I(Z < x)
v <- mean((g - mean(g))^2) / n  # estimated variance of the sample mean
cdf <- mean(g)
c(cdf, v)
## [1] 9.771000e-01 2.237559e-06
c(cdf - 1.96 * sqrt(v), cdf + 1.96 * sqrt(v))
## [1] 0.9741681 0.9800319
pnorm(2)
## [1] 0.9772499
pnorm(2) * (1 - pnorm(2)) / n  # exact variance of the sample proportion
## [1] 2.223256e-06
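The two variance figures agree closely because \(I(Z < x)\) is a Bernoulli(\(\Phi(x)\)) variable, so the exact variance of the sample proportion is
\[ \mathrm{Var}\left( \hat{\Phi}(2) \right) = \frac{\Phi(2)\left(1 - \Phi(2)\right)}{n}, \]
which is what pnorm(2) * (1 - pnorm(2)) / n evaluates above.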
Referring to Example 3, repeat the estimation using antithetic variables, and find the approximate reduction in standard error.
MC.Phi <- function(x, R = 10000, antithetic = TRUE) {
  u <- runif(R/2)
  if (!antithetic) v <- runif(R/2) else v <- 1 - u  # antithetic pair 1 - u
  u <- c(u, v)
  cdf <- numeric(length(x))
  for (i in 1:length(x)) {
    g <- x[i] * exp(-(u * x[i])^2 / 2)
    cdf[i] <- mean(g) / sqrt(2 * pi) + 0.5
  }
  cdf
}
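The gain from antithetic sampling comes from negative correlation between the two halves of the sample: \(g(u) = x \, \text{e}^{-(ux)^2/2}\) is monotone in \(u\), so \(g(U)\) and \(g(1-U)\) are negatively correlated and
\[ \mathrm{Var}\left( \frac{g(U) + g(1-U)}{2} \right) = \frac{\mathrm{Var}(g(U))}{2} \left(1 + \rho\right), \quad \rho = \mathrm{Cor}\left(g(U), g(1-U)\right) < 0 . \]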
Show a comparison of estimates obtained from a single Monte Carlo experiment.
x <- seq(.1, 2.5, length=5)
Phi <- pnorm(x)
set.seed(123)
MC1 <- MC.Phi(x, antithetic = FALSE)
set.seed(123)
MC2 <- MC.Phi(x)
print(round(rbind(x, MC1, MC2, Phi), 5))
## [,1] [,2] [,3] [,4] [,5]
## x 0.10000 0.70000 1.30000 1.90000 2.50000
## MC1 0.53983 0.75825 0.90418 0.97311 0.99594
## MC2 0.53983 0.75805 0.90325 0.97132 0.99370
## Phi 0.53983 0.75804 0.90320 0.97128 0.99379
Can the approximate reduction in variance be estimated for a given \(x\) by simulating under both methods, the simple Monte Carlo integration approach and the antithetic variable approach?
n <- 1000
MC1 <- MC2 <- numeric(n)
x <- 2
for (i in 1:n) {
  MC1[i] <- MC.Phi(x, R = 1000, antithetic = FALSE)
  MC2[i] <- MC.Phi(x, R = 1000)
}
print(sd(MC1))
## [1] 0.007198348
print(sd(MC2))
## [1] 0.0003669758
print((var(MC1) - var(MC2))/var(MC1))
## [1] 0.997401
The Beta(2,2) density is \(\pi(x) = 6 x (1 - x)\), \(0 < x < 1\).
Let \(p_u(x)\) be the Uniform(0,1) density. Then \(\pi(x)/p_u(x) \leq 6\) for all \(0 < x < 1\), so we may take \(M = 6\).
A random \(x\) from \(p_u(x)\) is accepted if \[ u < \frac{\pi(x)}{M p_u(x)} = \frac{6 x (1 - x)}{6 \times 1} = x (1 - x) . \]
On average, \(Mn = 6000\) iterations (\(12000\) random numbers) are required for a sample of size \(1000\). This cost is governed by the bound \(M\) on \(\pi(x)/p_u(x)\), which in turn depends on the choice of the proposal density \(p_u(x)\); the tightest bound for the uniform proposal is \(M = \max_x 6x(1-x) = 3/2\), attained at \(x = 1/2\), which would cut the expected count to \(1500\) iterations.
n <- 1000
k <- 0  # counter for accepted draws
j <- 0  # iterations
x <- numeric(n)
while (k < n) {
  u <- runif(1)
  j <- j + 1
  y <- runif(1)  # random variate from p_u(x)
  if (y * (1 - y) > u) {
    k <- k + 1   # accept y
    x[k] <- y
  }
}
info <- cbind(j, 2 * j)
colnames(info) <- c("iterations", "random numbers")
print(info)
## iterations random numbers
## [1,] 5973 11946
p <- seq(.1, .9, .1)
Qhat <- quantile(x, p) #quantiles of sample
Q <- qbeta(p, 2, 2) #theoretical quantiles
round(rbind(Qhat, Q), 3)
## 10% 20% 30% 40% 50% 60% 70% 80% 90%
## Qhat 0.189 0.276 0.356 0.436 0.491 0.556 0.616 0.693 0.782
## Q 0.196 0.287 0.363 0.433 0.500 0.567 0.637 0.713 0.804
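A minimal vectorized sketch of the same sampler under the tight bound \(M = 3/2\) noted above; the batch size 3 * n and the name xv are illustrative choices, with \(3n\) picked so that the expected number of accepted draws, \(2n\), comfortably exceeds \(n\):
y <- runif(3 * n)  # candidates from p_u; acceptance probability is 1/M = 2/3
u <- runif(3 * n)
xv <- y[u < 4 * y * (1 - y)]  # accept when u < pi(y) / (M p_u(y)) = 4 y (1 - y)
xv <- xv[1:n]  # keep the first n accepted draws (a rare short batch would leave NAs)
A production version would loop until \(n\) draws are accepted; a single generous batch suffices here with overwhelming probability.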
n <- 10000  # repeat with a larger sample size
k <- 0
j <- 0
x <- numeric(n)
while (k < n) {
  u <- runif(1)
  j <- j + 1
  y <- runif(1)
  if (y * (1 - y) > u) {
    k <- k + 1
    x[k] <- y
  }
}
p <- seq(.1, .9, .1)
Qhat <- quantile(x, p)
Q <- qbeta(p, 2, 2)
round(rbind(Qhat, Q), 3)
## 10% 20% 30% 40% 50% 60% 70% 80% 90%
## Qhat 0.198 0.291 0.367 0.439 0.505 0.567 0.635 0.711 0.803
## Q 0.196 0.287 0.363 0.433 0.500 0.567 0.637 0.713 0.804
To use importance sampling to estimate \[ \int_0^1 \frac{\text{e}^{-x}}{1 + x^2}\, dx , \] several possible choices of importance function are compared here. The candidates are \[ \begin{array}{ll} p_0(x) = 1, & 0 < x < 1, \\ p_1(x) = \text{e}^{-x}, & 0 < x < \infty, \\ p_2(x) = (1+x^2)^{-1}/\pi, & -\infty < x < \infty, \\ p_3(x) = \text{e}^{-x}/(1 - \text{e}^{-1}), & 0 < x < 1, \\ p_4(x) = 4(1+x^2)^{-1}/\pi, & 0 < x < 1. \end{array} \] The integrand is \[ g(x) = \begin{cases} \text{e}^{-x}/(1 + x^2), & \text{if}\ 0 < x < 1, \\ 0, & \text{otherwise}. \end{cases} \] While all five importance functions are positive on the set \(0 < x < 1\) where \(g(x) > 0\), \(p_1\) and \(p_2\) have larger support, so many of the simulated values contribute zeros to the sum, which is inefficient.
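For any density \(p\) that is positive on \((0,1)\), the importance sampling estimator is
\[ \theta = \int_0^1 g(x) \, dx = E_p\left[ \frac{g(X)}{p(X)} \right] \approx \frac{1}{m} \sum_{i=1}^{m} \frac{g(X_i)}{p(X_i)}, \quad X_i \sim p, \]
and its standard error is proportional to the standard deviation of \(g(X)/p(X)\), which is small when \(p\) is nearly proportional to \(g\).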
All of these distributions are easy to simulate; \(p_2\) is the standard Cauchy, i.e. Student's \(t\) with one degree of freedom. The densities are plotted on \((0,1)\) for comparison with \(g(x)\) in Figure 1.
The function corresponding to the most nearly constant ratio \(g(x)/p(x)\) appears to be \(p_3\), which can be seen more clearly in Figure 2. From the graphs, might we prefer \(p_3\) for the smallest variance?
x <- seq(0, 1, .01)
w <- 2  # line width for plotting
p1 <- exp(-x)
p2 <- (1 / pi) / (1 + x^2)
p3 <- exp(-x) / (1 - exp(-1))
p4 <- 4 / ((1 + x^2) * pi)
g <- exp(-x) / (1 + x^2)
#figure 1: importance functions plotted with g
plot(x, g, type = "l", main = "", ylab = "g(x)", ylim = c(0, 2), lwd = w)
lines(x, g/g, lty = 2, col = 2, lwd = w)  # p0 = 1
lines(x, p1, lty = 3, col = 3, lwd = w)
lines(x, p2, lty = 4, col = 4, lwd = w)
lines(x, p3, lty = 5, col = 5, lwd = w)
lines(x, p4, lty = 6, col = 6, lwd = w)
legend("topright", legend = c("g", 0:4), lty = 1:6, col = 1:6, lwd = w, inset = 0.02)
#figure 2: ratios g(x)/p(x)
plot(x, g, type = "l", main = "", ylab = "g(x)/p(x)", ylim = c(0, 3.2), lwd = w, lty = 2, col = 2)  # g/p0 = g
lines(x, g/p1, lty = 3, col = 3, lwd = w)
lines(x, g/p2, lty = 4, col = 4, lwd = w)
lines(x, g/p3, lty = 5, col = 5, lwd = w)
lines(x, g/p4, lty = 6, col = 6, lwd = w)
legend("topright", legend = c(0:4), lty = 2:6, col = 2:6, lwd = w, inset = 0.02)
Figures 1 and 2 show the importance functions plotted with \(g(x)\) and the ratios \(g(x)/p(x)\), respectively.
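The samplers for \(p_3\) and \(p_4\) below use the inverse transform method; on \((0,1)\) the CDFs and their inverses are
\[ F_3(x) = \frac{1 - \text{e}^{-x}}{1 - \text{e}^{-1}} \Rightarrow x = -\log\left(1 - u\,(1 - \text{e}^{-1})\right), \qquad F_4(x) = \frac{4}{\pi} \arctan(x) \Rightarrow x = \tan\left(\frac{\pi u}{4}\right). \]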
m <- 10000
theta.hat <- se <- numeric(5)
g <- function(x) {
  exp(-x - log(1 + x^2)) * (x > 0) * (x < 1)  # g(x); zero outside (0,1)
}
x <- runif(m) #using p0
fg <- g(x)
theta.hat[1] <- mean(fg)
se[1] <- sd(fg)
x <- rexp(m, 1) #using p1
fg <- g(x) / exp(-x)
theta.hat[2] <- mean(fg)
se[2] <- sd(fg)
x <- rcauchy(m) #using p2
i <- c(which(x > 1), which(x < 0))
x[i] <- 2  # g is zero outside (0,1); reset to 2 so exp(-x) cannot overflow in g(x)
fg <- g(x) / dcauchy(x)
theta.hat[3] <- mean(fg)
se[3] <- sd(fg)
u <- runif(m) #p3, inverse transform method
x <- - log(1 - u * (1 - exp(-1)))
fg <- g(x) / (exp(-x) / (1 - exp(-1)))
theta.hat[4] <- mean(fg)
se[4] <- sd(fg)
u <- runif(m) #p4, inverse transform method
x <- tan(pi * u / 4)
fg <- g(x) / (4 / ((1 + x^2) * pi))
theta.hat[5] <- mean(fg)
se[5] <- sd(fg)
rbind(theta.hat, se)
## [,1] [,2] [,3] [,4] [,5]
## theta.hat 0.5232240 0.5245760 0.5369783 0.52491761 0.5240453
## se 0.2458656 0.4184484 0.9623768 0.09676078 0.1409299
Can we make the following observations? i) The simulation indicates that \(p_3\) and possibly \(p_4\) produce the smallest variance among these five importance functions, while \(p_2\) produces the highest. ii) The standard Monte Carlo estimate without importance sampling (\(p_0 = 1\)) has \(\widehat{se} \approx 0.246\). The importance functions \(p_1\) and \(p_2\) do not reduce the error, but \(p_3\) and \(p_4\) each reduce the standard error of the estimate of \(\theta\).
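As a sanity check (not part of the comparison above), the target integral can also be evaluated deterministically; R's integrate returns a value of approximately 0.5248, consistent with all five estimates.
integrate(function(x) exp(-x) / (1 + x^2), lower = 0, upper = 1)  # approximately 0.5248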