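Consider the linear regression model
$$y = X\beta + u, \qquad \mathrm{E}(u \mid X) = 0, \qquad u \mid X \sim N(0, \sigma^2 I).$$
Let $H_0: \sigma = \sigma_0$ vs $H_A: \sigma \neq \sigma_0$.
We can deduce that $y^\top M_X y / \sigma_0^2 \sim \chi^2_{n-k}$ under $H_0$, where $X$ is $n \times k$. Here $M_X$ is the typical notation for the annihilator matrix, $M_X = I - X(X^\top X)^{-1}X^\top$, so $M_X y$ is the vector of residuals from regressing the dependent variable $y$ on $X$.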
The book I'm reading states the following: [image not reproduced]
I previously asked what criteria should be used to define a rejection region (RR). See the answers to that question; the main one was to choose the RR that makes the test as powerful as possible.
In this case, with the alternative being a two-sided composite hypothesis, there is usually no UMP test. Moreover, going by the answer given in the book, the authors don't show whether they studied the power of their RR. Nevertheless, they chose a two-tailed RR. Why is that, given that the hypothesis does not 'unilaterally' determine the RR?
Edit: This image is in the solutions manual of this book as the solution to exercise 4.14.
So a good q., IMO. It's a bit broad, but I think a good answer would survey several approaches & considerations, & a motivating example helps a lot. I would, though, have chosen the simplest possible example: tests on the variance of a normal distribution with known mean, or on the mean of an exponential distribution.

Answers:
It's easier first to work with the case where the regression coefficients are known and the null hypothesis is therefore simple. Then the sufficient statistic is $T = \sum z^2$, where $z$ is the residual; its distribution under the null is also a chi-squared, scaled by $\sigma_0^2$ and with degrees of freedom equal to the sample size $n$.
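Write down the ratio of the likelihoods $f(T;\sigma)$ under $\sigma = \sigma_1$ & $\sigma = \sigma_2$ and confirm that it is an increasing function of $T$ for any $\sigma_2 > \sigma_1$:
$$\frac{f(T;\sigma_2)}{f(T;\sigma_1)} = \left(\frac{\sigma_1}{\sigma_2}\right)^{n} \exp\left[\frac{T}{2}\left(\frac{1}{\sigma_1^2} - \frac{1}{\sigma_2^2}\right)\right]$$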
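So by the Karlin–Rubin theorem each of the one-tailed tests $H_0: \sigma = \sigma_0$ vs $H_A: \sigma < \sigma_0$ & $H_0: \sigma = \sigma_0$ vs $H_A: \sigma > \sigma_0$ is uniformly most powerful. Clearly there's no UMP test of $H_0: \sigma = \sigma_0$ vs $H_A: \sigma \neq \sigma_0$. As discussed here, carrying out both one-tailed tests & applying a multiple-comparisons correction leads to the commonly used test with equal rejection regions of size $\alpha/2$ in each tail, & it's quite reasonable when you're going to claim either that $\sigma > \sigma_0$ or that $\sigma < \sigma_0$ when you reject the null.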
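Next consider the ratio of the likelihood $f(T;\sigma)$ at the maximum-likelihood estimate $\hat\sigma^2 = T/n$ to the likelihood under the null:
$$\frac{f(T;\hat\sigma)}{f(T;\sigma_0)} = \left(\frac{n\sigma_0^2}{T}\right)^{n/2} \exp\left[\frac{T}{2\sigma_0^2} - \frac{n}{2}\right]$$
This is a fine statistic for quantifying how much the data support $H_A: \sigma \neq \sigma_0$ over $H_0: \sigma = \sigma_0$. And confidence intervals formed from inverting the likelihood-ratio test have the appealing property that all parameter values inside the interval have higher likelihood than those outside. The asymptotic distribution of twice the log-likelihood ratio is well known, but for an exact test you needn't try to work out its distribution: just use the tail probabilities of the corresponding values of $T$ in each tail.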
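A minimal sketch of that exact likelihood-ratio test, assuming the known-coefficients setup where $T/\sigma_0^2 \sim \chi^2_n$ under the null. The function names (`chi2_cdf`, `log_lr`, `matching_point`, `lrt_exact_p_value`) are made up for illustration, and the chi-squared CDF is a hand-rolled series so that the block needs only the standard library. It finds the value in the opposite tail with the same likelihood ratio as the observed $T$, then adds the two tail probabilities.

```python
import math

def chi2_cdf(x, df):
    """P(X <= x) for X ~ chi-squared(df), from the power series of the
    regularized lower incomplete gamma function."""
    if x <= 0:
        return 0.0
    a, h = df / 2.0, x / 2.0
    term = math.exp(a * math.log(h) - h - math.lgamma(a + 1.0))
    if term == 0.0:                      # underflow far out in a tail
        return 1.0 if h > a else 0.0
    total, k = term, 1
    while term > 1e-17 * total:
        term *= h / (a + k)
        total += term
        k += 1
    return min(total, 1.0)

def log_lr(s, n):
    """Log likelihood ratio (MLE vs null) as a function of s = T / sigma0**2.
    It is zero at s = n and increases as s moves away from n in either direction."""
    return 0.5 * (s - n) - 0.5 * n * math.log(s / n)

def matching_point(s, n):
    """Value on the other side of n with the same likelihood ratio as s."""
    target = log_lr(s, n)
    if s < n:                            # observed in the lower tail: search above n
        lo, hi, increasing = float(n), 2.0 * n, True
        while log_lr(hi, n) < target:
            hi *= 2.0
    else:                                # observed in the upper tail: search below n
        lo, hi, increasing = n / 2.0, float(n), False
        while log_lr(lo, n) < target:
            lo /= 2.0
    for _ in range(200):                 # plain bisection on a monotone branch
        mid = 0.5 * (lo + hi)
        if (log_lr(mid, n) < target) == increasing:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def lrt_exact_p_value(T, n, sigma0):
    """Exact two-sided p-value of the likelihood-ratio test; assumes T > 0."""
    s = T / sigma0 ** 2
    other = matching_point(s, n)
    low, high = min(s, other), max(s, other)
    return chi2_cdf(low, n) + 1.0 - chi2_cdf(high, n)
```

For example, `lrt_exact_p_value(8.0, 2, 1.0)` adds the upper-tail probability beyond 8 to the lower-tail probability below the matching point near 0.16. The two tail areas are unequal, which is the difference from the equal-tailed test.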
If you can't have a uniformly most powerful test, you might want one that's most powerful against the alternatives closest to the null. Find the derivative of the log-likelihood function with respect to $\sigma$, the score function:
$$\frac{\mathrm{d}\ell(\sigma;T)}{\mathrm{d}\sigma} = \frac{T}{\sigma^3} - \frac{n}{\sigma}$$
Evaluating its magnitude at $\sigma_0$ gives a locally most powerful test of $H_0: \sigma = \sigma_0$ vs $H_A: \sigma \neq \sigma_0$. Because the test statistic's bounded below, with small samples the rejection region may be confined to the upper tail. Again, the asymptotic distribution of the squared score is well known, but you can get an exact test in the same way as for the LRT.
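To make the shape of this rejection region explicit, here is a short worked step. It assumes the known-coefficients setup, with log-likelihood $\ell(\sigma;T) = -n\log\sigma - T/(2\sigma^2)$ up to a constant; the symbols $U$ for the score and $I$ for the Fisher information are notation introduced here.

```latex
U(\sigma_0) = \frac{T}{\sigma_0^3} - \frac{n}{\sigma_0}
            = \frac{T - n\sigma_0^2}{\sigma_0^3},
\qquad
I(\sigma_0) = -\mathrm{E}_{\sigma_0}\!\left[\frac{\mathrm{d}^2\ell}{\mathrm{d}\sigma^2}\right]
            = \frac{2n}{\sigma_0^2},
\qquad
\frac{U(\sigma_0)^2}{I(\sigma_0)} = \frac{(T - n\sigma_0^2)^2}{2n\sigma_0^4}
```

So rejecting for large $|U(\sigma_0)|$ means rejecting when $T$ is further than some distance $d$ from $n\sigma_0^2$ in either direction. Since $T \geq 0$, a lower cutoff $n\sigma_0^2 - d$ exists only if $d < n\sigma_0^2$; otherwise the whole rejection region sits in the upper tail.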
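Another approach is to restrict your attention to unbiased tests, viz those for which the power under any alternative is at least the size. Check your sufficient statistic has a distribution in the exponential family; then for a size $\alpha$ test, $\phi(T) = 1$ if $T < c_1$ or $T > c_2$, else $\phi(T) = 0$, you can find the uniformly most powerful unbiased test by solving
$$\mathrm{E}_{\sigma_0}[\phi(T)] = \alpha, \qquad \mathrm{E}_{\sigma_0}[T\phi(T)] = \alpha\,\mathrm{E}_{\sigma_0}[T]$$
for $c_1$ & $c_2$.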
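The cutoffs $c_1$ and $c_2$ have no closed form, but they are easy to get numerically. The sketch below assumes the known-coefficients case with $T/\sigma_0^2 \sim \chi^2_n$ under the null, and uses the identity $t\,g_n(t) = n\,g_{n+2}(t)$ between chi-squared densities to turn the unbiasedness condition into a statement about a $\chi^2_{n+2}$ probability. The function names are hypothetical, and the chi-squared CDF and quantile are simple standard-library stand-ins, not a library API.

```python
import math

def chi2_cdf(x, df):
    """P(X <= x) for X ~ chi-squared(df), via the incomplete-gamma series."""
    if x <= 0:
        return 0.0
    a, h = df / 2.0, x / 2.0
    term = math.exp(a * math.log(h) - h - math.lgamma(a + 1.0))
    if term == 0.0:                      # underflow far out in a tail
        return 1.0 if h > a else 0.0
    total, k = term, 1
    while term > 1e-17 * total:
        term *= h / (a + k)
        total += term
        k += 1
    return min(total, 1.0)

def chi2_ppf(p, df):
    """Quantile of chi-squared(df) for 0 < p < 1, by bisection on the CDF."""
    lo, hi = 0.0, max(float(df), 1.0)
    while chi2_cdf(hi, df) < p:
        hi *= 2.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if chi2_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def umpu_cutoffs(n, sigma0, alpha):
    """Cutoffs (c1, c2) on the scale of T for the size-alpha unbiased test."""
    def upper_for(c1):
        # Size condition: P(c1 <= X <= c2) = 1 - alpha for X ~ chi2(n).
        return chi2_ppf(chi2_cdf(c1, n) + 1.0 - alpha, n)

    def gap(c1):
        # Unbiasedness condition: P(c1 <= Y <= c2) = 1 - alpha for Y ~ chi2(n + 2).
        c2 = upper_for(c1)
        return chi2_cdf(c2, n + 2) - chi2_cdf(c1, n + 2) - (1.0 - alpha)

    # gap is negative as c1 -> 0 and positive as c1 -> the alpha-quantile.
    lo, hi = 0.0, chi2_ppf(alpha, n)
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if gap(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    c1 = 0.5 * (lo + hi)
    return sigma0 ** 2 * c1, sigma0 ** 2 * upper_for(c1)
```

Calling `umpu_cutoffs(10, 1.0, 0.05)` returns a pair in which both cutoffs are larger than the equal-tailed ones, so the unbiased test puts more than $\alpha/2$ in the lower tail and less in the upper tail.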
A plot helps show the bias in the equal-tail-areas test & how it arises:
At values of $\sigma$ a little under $\sigma_0$ the increased probability of the test statistic's falling in the lower-tail rejection region doesn't compensate for the reduced probability of its falling in the upper-tail rejection region & the power of the test drops below its size.
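The bias can also be checked numerically. This is a sketch under the same assumptions as before (known coefficients, $T/\sigma^2 \sim \chi^2_n$), with hypothetical function names and a standard-library stand-in for the chi-squared CDF. It evaluates the power of the equal-tailed test on a grid of $\sigma$ values around $\sigma_0$ and flags where the power falls below $\alpha$.

```python
import math

def chi2_cdf(x, df):
    """P(X <= x) for X ~ chi-squared(df), via the incomplete-gamma series."""
    if x <= 0:
        return 0.0
    a, h = df / 2.0, x / 2.0
    term = math.exp(a * math.log(h) - h - math.lgamma(a + 1.0))
    if term == 0.0:                      # underflow far out in a tail
        return 1.0 if h > a else 0.0
    total, k = term, 1
    while term > 1e-17 * total:
        term *= h / (a + k)
        total += term
        k += 1
    return min(total, 1.0)

def chi2_ppf(p, df):
    """Quantile of chi-squared(df) for 0 < p < 1, by bisection on the CDF."""
    lo, hi = 0.0, max(float(df), 1.0)
    while chi2_cdf(hi, df) < p:
        hi *= 2.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if chi2_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def equal_tailed_power(sigma, n, sigma0, alpha):
    """Power at the true value sigma of the test that rejects when T falls
    below the alpha/2 quantile or above the 1 - alpha/2 quantile under sigma0."""
    c1 = sigma0 ** 2 * chi2_ppf(alpha / 2.0, n)
    c2 = sigma0 ** 2 * chi2_ppf(1.0 - alpha / 2.0, n)
    lower = chi2_cdf(c1 / sigma ** 2, n)          # P(T < c1) when the truth is sigma
    upper = 1.0 - chi2_cdf(c2 / sigma ** 2, n)    # P(T > c2) when the truth is sigma
    return lower + upper

if __name__ == "__main__":
    n, sigma0, alpha = 10, 1.0, 0.05
    for step in range(90, 111, 2):
        sigma = step / 100.0
        power = equal_tailed_power(sigma, n, sigma0, alpha)
        flag = "  <- below size" if power < alpha - 1e-12 else ""
        print(f"sigma = {sigma:.2f}  power = {power:.4f}{flag}")
```

The values of `n`, `sigma0` and `alpha` in the demo loop are arbitrary choices; the dip is shallow and confined to a narrow range of $\sigma$ next to $\sigma_0$.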
Being unbiased is good; but it's not self-evident that having a power slightly lower than the size over a small region of the parameter space within the alternative is so bad as to rule out a test altogether.
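Two of the above two-tailed tests coincide (for this case, not in general): the likelihood-ratio test & the uniformly most powerful unbiased test both use cutoffs satisfying $c_1^{n/2} e^{-c_1/(2\sigma_0^2)} = c_2^{n/2} e^{-c_2/(2\sigma_0^2)}$.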
I think all, even the one-tailed tests, are admissible, i.e. there's no test more powerful or as powerful under all alternatives—you can make the test more powerful against alternatives in one direction only by making it less powerful against alternatives in the other direction. As the sample size increases, the chi-squared distribution becomes more & more symmetric, & all the two-tailed tests will end up being much the same (another reason for using the easy equal-tailed test).
With the composite null hypothesis, the arguments become a little more complicated, but I think you can get practically the same results, mutatis mutandis. Note that one but not the other of the one-tailed tests is UMP!
I am not sure if that is true in general. Certainly, a lot of the classical results (Neyman–Pearson, Karlin–Rubin) are based on either simple or one-sided hypotheses, but generalizations to two-sided composite hypotheses do exist. You can find some notes on that here, and more discussion in the textbook here.
For your problem specifically, I don't know whether a UMP test exists or not. But intuitively, it seems to me that under 0-1 loss a one-sided test will probably be inadmissible, and thus the class of admissible tests will be all two-sided tests. Given the class of two-sided tests, the goal is to find the one with the largest power, which should automatically happen by choosing quantiles around the one mode of the $\chi^2$. (This is all based on intuition.)