3 Hypothesis testing

3.1 Tests of hypotheses

A statistical hypothesis is a statement about the distribution of some observable quantity in a population.

The parametric case

Given \(\mathbf{x}\in\mathcal{X}\) from some \(F\) in \(\mathcal{F}_\theta\) test a null hypothesis \[H_0: \theta \in \Theta_0\] against an alternative hypothesis \[H_1: \theta \in \Theta_1\] in which \(\Theta_0\cap\Theta_1=\emptyset\) and \(\Theta_0\cup\Theta_1=\Theta\).

A hypothesis testing procedure is a partition of \(\mathcal{X}\) that leads to one of two possible decisions:

  1. to reject \(H_0\)

  2. to not reject \(H_0\)

Note

  • \(H_0\) is usually chosen in a conservative way – it should only be rejected if strong evidence against it is found.

  • As a consequence, the two hypotheses are not permutable.

The procedure

Define a statistic \(T:\mathcal{X} \rightarrow [0,1]\) such that

\[T(\mathbf{x})=P(\text{Reject}\, H_0\mid \mathbf{x})=\begin{cases} 1, & \mathbf{x}\in\mathcal{X}_c\\[0.1cm] \varepsilon, & \mathbf{x}\in\mathcal{X}_r\\[0.1cm] 0, & \text{otherwise} \end{cases},\] with \(\mathcal{X}_c,\, \mathcal{X}_r\subset\mathcal{X}\), \(\mathcal{X}_c\cap\mathcal{X}_r=\emptyset\) and \(\varepsilon\in [0,1[\).

Note

  • \(\mathcal{X}_c\) is called the critical region;

  • If \(\varepsilon>0\) then the test is called a randomized test.

The evaluation of tests

Any test can lead to one of two possible wrong decisions:

Type I error

rejecting \(H_0\) given that \(H_0\) is true

Type II error

not rejecting \(H_0\) given that \(H_0\) is false

The function \[\beta_T(\theta)=P(\text{Reject}\, H_0\mid \theta)=E[T(\mathbf{X})\mid\theta]\] is called the power function of the test \(T\).

\[\beta_T(\theta)=\begin{cases}P(\text{Type I error}), & \theta\in\Theta_0\\[0.2cm]1-P(\text{Type II error}), & \theta\in\Theta_1\end{cases}\]

The ideal test: \(\beta_T(\theta)=I_{\Theta_1}(\theta)\)

Theorem

For any test \(T\) and any sufficient statistic \(S\) there is some other test \(U\) such that:

  1. \(U\) is a function of \(S\);

  2. \(\beta_U(\theta) = \beta_T(\theta),\ \forall \theta \in \Theta\).

Let \((X_1,\ldots,X_n)\) be a random sample from

\[\mathcal{F}=\{N(\mu,1):\mu\in \mathbb{R}\}.\]

To test the hypotheses \(H_0:\mu=0\) against \(H_1:\mu\neq 0\)

at a significance level \(\alpha\) we can use the known test \(T(\mathbf{X})=I_{\mathcal{X}_{c}}(\mathbf{X})\) with \[\mathcal{X}_{c}=\left\{\mathbf{x}\in\mathcal{X}: |\sqrt{n}\bar{x}| > \Phi^{-1}(1-\alpha/2)\right\}.\]

Obtain the power function for this test.
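As a numerical check on this exercise: since \(\sqrt{n}\bar{X}\sim N(\sqrt{n}\mu,1)\), the power function has the closed form \(\beta_T(\mu)=1-\Phi\left(\Phi^{-1}(1-\alpha/2)-\sqrt{n}\mu\right)+\Phi\left(-\Phi^{-1}(1-\alpha/2)-\sqrt{n}\mu\right)\). A minimal sketch using only the standard library (the sample size \(n=25\) and level \(\alpha=0.05\) are arbitrary choices, not fixed by the text):

```python
from statistics import NormalDist

def power_two_sided(mu, n=25, alpha=0.05):
    """beta_T(mu) for the test rejecting H0: mu = 0 when
    |sqrt(n) * xbar| > Phi^{-1}(1 - alpha/2); n = 25 is illustrative."""
    Phi = NormalDist().cdf
    z = NormalDist().inv_cdf(1 - alpha / 2)
    shift = n ** 0.5 * mu  # sqrt(n) * Xbar ~ N(sqrt(n) * mu, 1)
    return 1 - Phi(z - shift) + Phi(-z - shift)

print(round(power_two_sided(0.0), 4))  # at mu = 0 the power equals the size alpha
```

Note that \(\beta_T(0)=\alpha\) exactly, and the power rises towards 1 as \(|\mu|\) grows.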

Reality check

It is usually impossible to get arbitrarily close to the ideal test since the two probabilities of error vary in opposite directions.

So, in practice some compromise is required.

For instance, we could try to minimize

\[k \beta_T(\theta\mid H_0) + (1-k) \left(1-\beta_T(\theta\mid H_1)\right),\]

for some fixed \(k\in]0,1[\).

3.2 Uniformly most powerful tests

For a test \(T\), \[\alpha=\sup\limits_{\theta\in\Theta_0}\beta_T(\theta)\] is called the size of the test, and \(T\) is said to be an \(\alpha\)-size test.

Any test \(U\) with \(\sup\limits_{\theta\in\Theta_0}\beta_U(\theta)\leq\alpha\) is said to be an \(\alpha\)-level test.

Note

When fixing or limiting the size of tests we are controlling the Type I error.

A test \(T\) is a uniformly most powerful (UMP) test in the class of \(\alpha\)-level tests \(\mathcal{C}_{\alpha}\) if \[\beta_T(\theta)\geq\beta_{T^*}(\theta),\,\forall\theta\in\Theta_1,\] for any other test \(T^*\) in \(\mathcal{C}_{\alpha}\).

How can we find UMP tests?

Neyman-Pearson lemma

To test \(H_0:\theta=\theta_0\) against \(H_1:\theta=\theta_1\) in the model

\[\mathcal{F}=\left\{f(x\mid\theta):\theta\in\{\theta_0,\theta_1\}\right\}\]

the test

\[T(\mathbf{x})=\begin{cases}1, & f(\mathbf{x}\mid\theta_1)>kf(\mathbf{x}\mid\theta_0)\\[0.2cm]\varepsilon, & f(\mathbf{x}\mid\theta_1)=kf(\mathbf{x}\mid\theta_0)\\[0.2cm]0, & f(\mathbf{x}\mid\theta_1)<kf(\mathbf{x}\mid\theta_0)\end{cases},\]

for some \(k > 0\) and \(\varepsilon\in[0,1[\), is the essentially unique MP test of its level.
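A minimal sketch of the lemma's test function in Python; the names `np_test`, `f0`, `f1` are illustrative, and the simple-vs-simple example (one observation from \(N(0,1)\) vs \(N(1,1)\)) is an assumption made for the demonstration:

```python
from statistics import NormalDist

def np_test(x, f0, f1, k, eps=0.0):
    """Neyman-Pearson test: returns P(reject H0 | x) given densities
    f0, f1 under theta0 and theta1 and a threshold k."""
    r0, r1 = f0(x), f1(x)
    if r1 > k * r0:
        return 1.0
    if r1 == k * r0:
        return eps
    return 0.0

# One observation, N(0,1) vs N(1,1): f1(x)/f0(x) = exp(x - 1/2),
# so k = 1 rejects exactly when x > 1/2.
f0 = NormalDist(0, 1).pdf
f1 = NormalDist(1, 1).pdf
print(np_test(2.0, f0, f1, k=1.0), np_test(0.0, f0, f1, k=1.0))  # 1.0 0.0
```

In continuous models the boundary \(f(\mathbf{x}\mid\theta_1)=kf(\mathbf{x}\mid\theta_0)\) has probability zero, so \(\varepsilon\) only matters for discrete models.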

Let \((X_1,\ldots,X_n)\) be a random sample from

\[\mathcal{F}=\{N(\mu,\sigma_0^2):\mu\in \{\mu_0,\mu_1\}\}.\]

Find the MP \(\alpha\)-size test for the hypotheses

\[H_0:\mu=\mu_0\ \text{against}\ H_1:\mu=\mu_1(>\mu_0).\]

Let \((X_1,\ldots,X_n)\) be a random sample from \[\mathcal{F}=\{Exp(\lambda):\lambda\in \{\lambda_0,\lambda_1\}\},\, \text{with}\, \lambda_1>\lambda_0.\] Find the MP \(\alpha\)-size test for the hypotheses

\[H_0:\lambda=\lambda_0\ \text{against}\ H_1:\lambda=\lambda_1.\]

Let \((X_1,\ldots,X_{10})\) be a random sample from \[\mathcal{F}=\{Poi(\lambda):\lambda\in \{1,2\}\}.\] Find the MP test with size 0.05 for the hypotheses

\[H_0:\lambda=2\ \text{against}\ H_1:\lambda=1.\]
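A sketch of how the randomization constant for this exercise can be found numerically. Under \(H_0\), \(S=\sum_{i=1}^{10} X_i\sim Poi(20)\), and the likelihood ratio \(f(\mathbf{x}\mid 1)/f(\mathbf{x}\mid 2)=e^{10}\,2^{-S}\) is decreasing in \(S\), so the MP test rejects for small \(S\) and randomizes at the boundary to hit size 0.05 exactly:

```python
from math import exp, factorial

def poisson_pmf(s, lam):
    return exp(-lam) * lam ** s / factorial(s)

n, lam0, alpha = 10, 2.0, 0.05
mu = n * lam0  # S ~ Poi(20) under H0

# Reject if S < c; reject with probability eps if S == c.
# Grow c while the cumulative rejection probability stays below alpha.
cum, c = 0.0, 0
while cum + poisson_pmf(c, mu) <= alpha:
    cum += poisson_pmf(c, mu)
    c += 1
eps = (alpha - cum) / poisson_pmf(c, mu)
print(c, round(eps, 3))  # boundary point and randomization probability
```

By construction the size is \(P(S<c)+\varepsilon\,P(S=c)=0.05\) exactly, which is impossible to achieve with a non-randomized test here.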

As we have seen, the test \[T(\mathbf{x})=\begin{cases} 1, & Z_0>\Phi^{-1}(1-\alpha)\\[0.2cm] 0, & \text{otherwise} \end{cases},\] with \(Z_0=\sqrt{n}\frac{\bar{X}-\mu_0}{\sigma_0}\), is the MP \(\alpha\)-size test for \(H_0:\mu=\mu_0\) against \(H_1:\mu=\mu_1(>\mu_0)\).

Note that \(T\) does not depend on the actual value of \(\mu_1\).

What can be concluded from that?

This test is therefore the UMP \(\alpha\)-size test for \(H_0:\mu=\mu_0\) against \(H_1:\mu>\mu_0\).

Can it also be the UMP test for

\[H_0:\mu\leq\mu_0\ \text{against}\ H_1:\mu>\mu_0?\]

Let’s look at \[\beta_T(\mu)=1-\Phi\left(\Phi^{-1}(1-\alpha)- \sqrt{n}\frac{\mu-\mu_0}{\sigma_0}\right).\]
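This can be checked numerically: \(\beta_T\) is increasing in \(\mu\), so the supremum over \(\{\mu\leq\mu_0\}\) is attained at \(\mu_0\), where it equals \(\alpha\). A sketch with illustrative values (\(\mu_0=0\), \(\sigma_0=1\), \(n=25\) are assumptions, not fixed by the text):

```python
from statistics import NormalDist

def beta(mu, mu0=0.0, sigma0=1.0, n=25, alpha=0.05):
    """Power function of the one-sided z-test; parameter values are illustrative."""
    Phi = NormalDist().cdf
    z = NormalDist().inv_cdf(1 - alpha)
    return 1 - Phi(z - n ** 0.5 * (mu - mu0) / sigma0)

# beta is increasing in mu, so the test keeps size alpha for H0: mu <= mu0.
vals = [beta(m) for m in (-0.5, -0.2, 0.0, 0.2, 0.5)]
print([round(v, 4) for v in vals])
```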

Consider now the problem of testing \[H_0:\mu=\mu_0\ \text{against}\ H_1:\mu\neq\mu_0\] in \(\mathcal{F}=\{N(\mu,\sigma_0^2):\mu\in \mathbb{R}\}\).

  1. Check whether the test in the previous example can still be the UMP \(\alpha\)-size test for these new hypotheses.

  2. Is there a UMP test for this case?

    Hint: consider the MP test for \(H_0:\mu=\mu_0\) against \(H_1:\mu=\mu_1(<\mu_0)\).

Where can we find UMP tests?

A model \(\mathcal{F}=\{f(x\mid\theta):\theta\in \Theta\subset\mathbb{R}\}\) has a monotone likelihood ratio in a real-valued statistic \(S\) if for all \(\theta_2>\theta_1\) the likelihood ratio \(\frac{f(\mathbf{x}\mid\theta_2)}{f(\mathbf{x}\mid\theta_1)}\) is a nondecreasing function of \(S\) in \(\left\{\mathbf{x}\in\mathcal{X}:f(\mathbf{x}\mid\theta_1)>0\,\text{or}\,f(\mathbf{x}\mid\theta_2)>0 \right\}\).

Note

  1. We will consider \(c/0=+\infty\) for \(c>0\).

  2. If the likelihood ratio is a nonincreasing function of \(S\) then the model has a MLR in \(-S\).

Let \(\mathcal{F}\) be a member of the uniparametric exponential family of distributions.

Under which conditions can this model have a MLR?

Show that the model \(\mathcal{F}=\{U(0,\theta):\theta\in\mathbb{R}^+\}\) has a MLR in some statistic.

Lemma If a model \(\mathcal{F}_{\theta}\) has a MLR in a statistic \(S\) and \(g\) is a nondecreasing function then \(E[g(S)\mid \theta]\) is a nondecreasing function of \(\theta\).

Karlin-Rubin’s theorem

If the model \(\mathcal{F}=\{f(x\mid\theta):\theta\in \Theta\subset\mathbb{R}\}\) has a monotone likelihood ratio in a real-valued statistic \(S(\mathbf{X})\) then the test \[T(\mathbf{x})=\begin{cases} 1, & S(\mathbf{x})>k\\[0.1cm] \varepsilon, & S(\mathbf{x})=k\\[0.1cm] 0, & S(\mathbf{x})<k \end{cases},\] for some \(k\) and \(\varepsilon\in[0,1[\), is a UMP test of its size to test \(H_0:\theta\leq\theta_0\) against \(H_1:\theta>\theta_0\).

Note

For the hypotheses \(H_0:\theta\geq\theta_0\) against \(H_1:\theta<\theta_0\) we can use the reparametrization \(\lambda=-\theta\).

Consider a random sample of size 20 from the model \[\mathcal{F}=\{U(0,\theta):\theta\in\mathbb{R}^+\}\] and find the UMP test with size 0.05 for the hypotheses \(H_0:\theta\geq 1\) against \(H_1:\theta<1\).
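A numerical check on this exercise: with \(M=\max_i X_i\), the test rejects for small \(M\); under \(\theta=1\), \(P(M<c)=c^{20}\), and this rejection probability decreases in \(\theta\), so the size is attained at \(\theta=1\) and \(c=\alpha^{1/20}\). A short sketch:

```python
# UMP test for H0: theta >= 1 vs H1: theta < 1 in U(0, theta), n = 20.
# The model has a MLR in M = max X_i; the test rejects for small M.
# Under theta = 1, P(M < c) = c**n, so c**n = alpha fixes the size.
n, alpha = 20, 0.05
c = alpha ** (1 / n)
print(round(c, 4))  # ~0.8609: reject H0 when max(x) < c
```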

3.3 Likelihood ratio tests

For many problems there is no UMP test among all tests of a given size!

We could keep applying the same optimality criterion in restricted classes of tests:

  1. unbiased tests (UMPU)
  2. invariant tests (UMPI)
  3. . . .

For the hypotheses \(H_0: \theta \in \Theta_0\) against \(H_1: \theta \in \Theta_1\) with \(\Theta_0\cup\Theta_1=\Theta\), the test \(T(\mathbf{X})=I_{[0,k[}\left(\Lambda(\mathbf{X})\right)\) where \(k\in[0,1]\) and \[\Lambda(\mathbf{X})=\frac{\sup\limits_{\theta\in\Theta_0}\mathcal{L}(\theta\mid\mathbf{X})}{\sup\limits_{\theta\in\Theta}\mathcal{L}(\theta\mid\mathbf{X})}\] is called a likelihood ratio test.

Note

  1. \(0\leq\Lambda(\mathbf{x})\leq 1,\,\forall \mathbf{x}\in\mathcal{X}\).

  2. We can also write \(\Lambda(\mathbf{X})=\dfrac{\mathcal{L}(\hat{\theta}_0\mid\mathbf{X})}{\mathcal{L}(\hat{\theta}\mid\mathbf{X})}\), where \(\hat{\theta}\) is the MLE of \(\theta\) and \(\hat{\theta}_0\) is the MLE of \(\theta\) restricted to \(\Theta_0\).

  3. If \(S\) is a sufficient statistic for \(\theta\) then \(\Lambda(\mathbf{X})\) can be written as a function of \(S\).

Let \((X_1,\ldots,X_n)\) be a random sample from \[\mathcal{F}=\{N(\mu,\sigma_0^2):\mu\in \mathbb{R}\}.\] Find the LR test for the hypotheses \(H_0:\mu=\mu_0\) against \(H_1:\mu\neq\mu_0\).
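For this exercise the ratio simplifies by a standard computation: the restricted MLE is \(\mu_0\), the unrestricted MLE is \(\bar{X}\), and \(\Lambda(\mathbf{X})=\exp\left(-n(\bar{X}-\mu_0)^2/(2\sigma_0^2)\right)\), so \(-2\log\Lambda=Z_0^2\) and the LR test is the usual two-sided z-test. A sketch checking this identity on made-up data:

```python
from math import exp, log
from statistics import mean

def lr_statistic(xs, mu0, sigma0):
    """Lambda(x) for H0: mu = mu0 in N(mu, sigma0^2); the ratio of
    likelihoods at the restricted and unrestricted MLEs simplifies to
    exp(-n * (xbar - mu0)^2 / (2 * sigma0^2))."""
    n, xbar = len(xs), mean(xs)
    return exp(-n * (xbar - mu0) ** 2 / (2 * sigma0 ** 2))

xs = [0.3, -0.1, 0.8, 0.4, 0.2]  # illustrative data
lam = lr_statistic(xs, mu0=0.0, sigma0=1.0)
z0 = len(xs) ** 0.5 * mean(xs) / 1.0
# -2 log Lambda equals Z0^2: rejecting for small Lambda is rejecting for large |Z0|
print(round(-2 * log(lam), 6), round(z0 ** 2, 6))
```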

Let \((X_1,\ldots,X_n)\) be a random sample from \[\mathcal{F}=\left\{f(x\mid\theta) = e^{-(x-\theta)}I_{[\theta,+\infty[}(x):\theta\in \mathbb{R}\right\}.\] Find the LR test for the hypotheses \(H_0:\theta<\theta_0\) against \(H_1:\theta\geq\theta_0\).

It is possible to construct LR tests provided:

  1. the distribution of \(\Lambda(\mathbf{X})\) under \(H_0\) is known, or

  2. the LR test can equivalently be written as a function of a statistic \(S(\mathbf{X})\) whose distribution under \(H_0\) is known.

Otherwise it may be difficult to find a LR test!

Wilks’s LR test statistic

Under some regularity conditions we have that \[W(\mathbf{X})=-2\log\Lambda(\mathbf{X}) \overset{\mathcal{D}}{\underset{H_0}{\longrightarrow}} \chi_{(r)}^2,\]

where \(r=\dim(\Theta)-\dim(\Theta_0)\) and

\[T(\mathbf{X})=I_{[c,+\infty[}\left(W(\mathbf{X})\right),\]

with \(c=F_{\chi_{(r)}^2}^{-1}(1-\alpha)\), is the \(\alpha\)-size Wilks’s asymptotic LR test.
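A sketch of the asymptotic test for the normal-mean example above, where \(r=1\); since \(Z^2\sim\chi^2_{(1)}\) when \(Z\sim N(0,1)\), the \(\chi^2_{(1)}\) quantile is the square of a normal quantile, which keeps the example in the standard library. The data are made up for illustration:

```python
from statistics import NormalDist, mean

# Wilks's test for H0: mu = mu0 in N(mu, 1), with mu0 = 0 (assumed values).
# Here r = 1 and F^{-1}_{chi2(1)}(1 - alpha) = (Phi^{-1}(1 - alpha/2))^2.
alpha = 0.05
c = NormalDist().inv_cdf(1 - alpha / 2) ** 2  # ~3.84

xs = [1.2, 0.7, 1.9, 0.4, 1.1, 0.9, 1.5, 0.8]  # illustrative sample
n, xbar = len(xs), mean(xs)
W = n * xbar ** 2      # -2 log Lambda for this model (sigma0 = 1, mu0 = 0)
print(W > c)           # reject H0 iff W exceeds the chi-square quantile
```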

  1. Let \((X_1,X_2)\) be a random sample from the model \(\{Poi(\theta),\,\theta>0\}\). We want to test \(H_0:\theta\leq 1\) against \(H_1:\theta >1\).

    1. Show that the size of the test \(T_1(X_1,X_2)=1-I_{\{0,1\}}(X_1)\) is approximately 0.26.

    2. Define and interpret the test \[T_2(X_1,X_2)=E[T_1(X_1,X_2)\mid X_1+X_2=t].\] Should we prefer \(T_2\) over \(T_1\)?

    3. Identify the UMP test with size \(\alpha=E[T_1(X_1,X_2)\mid \theta = 1]\).

  2. Let \(X_1,\ldots,X_n\) be a random sample from the model \(\{Ber(\theta),\,0<\theta<1\}\).

    1. Define the LRT for the hypotheses \(H_0:\theta\leq \theta_0\) against \(H_1:\theta >\theta_0\).

    2. Is the previous test a UMP test? Justify.
