
# Multinomial test


 Title: Multinomial test Author: World Heritage Encyclopedia Language: English Subject: List of statistics articles Collection: Publisher: World Heritage Encyclopedia Publication Date:


In statistics, the multinomial test is the test of the null hypothesis that the parameters of a multinomial distribution equal specified values. It is used for categorical data; see Read and Cressie.[1]

We begin with a sample of N items, each of which has been observed to fall into one of k categories. We define \mathbf{x} = (x_1, x_2, \dots, x_k) as the observed numbers of items in each category, so that \textstyle \sum_{i=1}^k x_{i} = N.

Next, we define a vector of parameters H_0: \boldsymbol{\pi} = (\pi_{1}, \pi_{2}, \dots, \pi_{k}), where \textstyle \sum_{i=1}^k \pi_{i} = 1. These are the parameter values under the null hypothesis.

The exact probability of the observed configuration \mathbf{x} under the null hypothesis is given by

\Pr(\mathbf{x})_0 = N! \prod_{i=1}^k \frac{\pi_{i}^{x_i}}{x_i!}.
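
The formula above can be evaluated directly for small samples. The following is a minimal sketch, not a reference implementation; the function name `multinomial_pmf` and its argument names are illustrative choices that do not come from the article.

```python
from math import factorial, prod

def multinomial_pmf(x, pi):
    """Probability of the count vector x under category probabilities pi.

    Implements N! * prod(pi_i ** x_i / x_i!) with N = sum(x).
    """
    n = sum(x)
    # Build the multinomial coefficient N! / (x_1! ... x_k!) in exact
    # integer arithmetic; every intermediate quotient is an integer.
    coef = factorial(n)
    for xi in x:
        coef //= factorial(xi)
    return coef * prod(p ** xi for p, xi in zip(pi, x))

# Illustrative call: 3 items over 3 equally likely categories.
prob = multinomial_pmf([2, 1, 0], [1 / 3, 1 / 3, 1 / 3])
```

For large N the factorials grow quickly, so a production version would more likely work with logarithms; that refinement is left out here to keep the sketch close to the formula.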

The significance probability for the test is the probability of occurrence of the data set observed, or of a data set less likely than that observed, if the null hypothesis is true. Using an exact test, this is calculated as

\Pr(\mathrm{sig}) = \sum_{\mathbf{y}:\, \Pr(\mathbf{y}) \le \Pr(\mathbf{x})_0} \Pr(\mathbf{y})

where the sum ranges over all outcomes \mathbf{y} that are as likely as, or less likely than, the one observed. The number of possible outcomes is \binom{N+k-1}{k-1}, so the enumeration becomes computationally onerous as k and N increase, and exact tests are usually practical only for small samples. For larger samples, asymptotic approximations are accurate enough and easier to calculate.
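
The exact significance probability can be sketched by brute-force enumeration of every way to distribute N items over k categories. The names `compositions` and `exact_multinomial_pvalue` are hypothetical, and the small relative tolerance `rel_tol` is an added assumption that keeps floating-point rounding from excluding outcomes exactly as likely as the observed one.

```python
from math import factorial, prod

def multinomial_pmf(x, pi):
    """N! * prod(pi_i ** x_i / x_i!) for the count vector x."""
    coef = factorial(sum(x))
    for xi in x:
        coef //= factorial(xi)
    return coef * prod(p ** xi for p, xi in zip(pi, x))

def compositions(n, k):
    """Yield every tuple of k non-negative integers that sums to n."""
    if k == 1:
        yield (n,)
        return
    for first in range(n + 1):
        for rest in compositions(n - first, k - 1):
            yield (first,) + rest

def exact_multinomial_pvalue(x, pi, rel_tol=1e-9):
    """Sum the probabilities of all outcomes no more likely than x."""
    cutoff = multinomial_pmf(x, pi) * (1.0 + rel_tol)
    total = 0.0
    for y in compositions(sum(x), len(x)):
        p = multinomial_pmf(y, pi)
        if p <= cutoff:
            total += p
    return total

# Illustrative call: all 3 items fall in the first of 2 equal categories.
p_value = exact_multinomial_pvalue((3, 0), (0.5, 0.5))
```

In this small example only the outcomes (3, 0) and (0, 3) are as unlikely as the observed one, each with probability 1/8, so the sum is 1/4. The enumeration cost grows with the number of compositions, which is why the approach suits only small k and N.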

One of these approximations is the likelihood ratio test. We set up an alternative hypothesis under which each value \pi_{i} is replaced by its maximum likelihood estimate p_{i}=x_{i}/N. The exact probability of the observed configuration \mathbf{x} under the alternative hypothesis is given by

\Pr(\mathbf{x})_A = N! \prod_{i=1}^k \frac{p_{i}^{x_i}}{x_i!}.

Writing LR = \Pr(\mathbf{x})_0 / \Pr(\mathbf{x})_A for the ratio between these two probabilities, the statistic for the likelihood ratio test is its natural logarithm multiplied by -2:

-2\ln(LR) = -2\sum_{i=1}^k x_{i}\ln(\pi_{i}/p_{i}).
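
As a rough illustration, the statistic can be computed from the counts alone, because the factorial terms cancel in the ratio. The function name `lr_statistic` is an illustrative choice, and skipping categories with x_i = 0 relies on the usual convention that such terms contribute zero to the sum.

```python
from math import log

def lr_statistic(x, pi):
    """Return -2 * sum(x_i * ln(pi_i / p_i)) with p_i = x_i / N."""
    n = sum(x)
    # pi_i / p_i equals pi_i * N / x_i; empty categories are skipped
    # because x_i * ln(...) is taken to be 0 when x_i = 0.
    return -2.0 * sum(
        xi * log(p * n / xi) for xi, p in zip(x, pi) if xi > 0
    )

# Illustrative call: 10 items, all in the first of 2 equal categories.
stat = lr_statistic([10, 0], [0.5, 0.5])
```

When the observed proportions match the null values exactly, every logarithm is zero and so is the statistic; larger departures from the null give larger values.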

If the null hypothesis is true, then as N increases, the distribution of -2\ln(LR) converges to that of chi-squared with k-1 degrees of freedom. However, it has long been known (e.g. Lawley 1956) that for finite sample sizes the moments of -2\ln(LR) are greater than those of chi-squared, which inflates the probability of type I errors (false positives). The difference between the moments of chi-squared and those of the test statistic is a function of N^{-1}. Williams (1976) showed that the first moment can be matched as far as N^{-2} if the test statistic is divided by a factor given by

q_1 = 1+\frac{\sum_{i=1}^k \pi_{i}^{-1}-1}{6N(k-1)}.

In the special case where the null hypothesis is that all the values \pi_{i} are equal to 1/k (i.e. it stipulates a uniform distribution), this simplifies to

q_1 = 1+\frac{k+1}{6N}.

Subsequently, Smith et al. (1981) derived a dividing factor which matches the first moment as far as N^{-3}. For the case of equal values of \pi_{i}, this factor is

q_2 = 1+\frac{k+1}{6N}+\frac{k^2}{6N^2}.
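
The correction factors are simple enough to compute by hand, but a short sketch makes the relationship between the general and uniform forms of q_1 easy to check numerically. The function names `williams_q1` and `smith_q2_uniform` are hypothetical labels for the formulas above.

```python
def williams_q1(pi, n):
    """General factor: 1 + (sum(1 / pi_i) - 1) / (6 * N * (k - 1))."""
    k = len(pi)
    return 1.0 + (sum(1.0 / p for p in pi) - 1.0) / (6.0 * n * (k - 1))

def smith_q2_uniform(k, n):
    """Uniform-case factor: 1 + (k + 1) / (6N) + k^2 / (6N^2)."""
    return 1.0 + (k + 1) / (6.0 * n) + k ** 2 / (6.0 * n ** 2)

# With a uniform null, sum(1 / pi_i) = k^2, so the general factor
# reduces to 1 + (k^2 - 1) / (6N(k - 1)) = 1 + (k + 1) / (6N).
k, n = 4, 20
q1_general = williams_q1([1.0 / k] * k, n)
q1_uniform = 1.0 + (k + 1) / (6.0 * n)
q2 = smith_q2_uniform(k, n)
```

Dividing -2\ln(LR) by either factor shrinks the statistic slightly, and the shrinkage vanishes as N grows.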

The null hypothesis can also be tested by using Pearson's chi-squared test

\chi^2 = \sum_{i=1}^{k} {(x_i - E_i)^2 \over E_i}

where E_i=N\pi_i is the expected number of cases in category i under the null hypothesis. This statistic also converges to a chi-squared distribution with k-1 degrees of freedom when the null hypothesis is true, but does so from below rather than from above as -2\ln(LR) does, so it may be preferable to the uncorrected version of -2\ln(LR) for small samples.
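
Pearson's statistic can be sketched in the same way as the likelihood ratio statistic. The function name `pearson_chi2` is an illustrative choice, and the sketch assumes every \pi_i is strictly positive so that no expected count is zero.

```python
def pearson_chi2(x, pi):
    """Return sum((x_i - E_i)^2 / E_i) with E_i = N * pi_i."""
    n = sum(x)
    return sum((xi - n * p) ** 2 / (n * p) for xi, p in zip(x, pi))

# Illustrative call: 10 items, all in the first of 2 equal categories.
# Each expected count is 5, so each term is (5 ** 2) / 5 = 5.
stat = pearson_chi2([10, 0], [0.5, 0.5])
```

For this extreme configuration the Pearson statistic is 10, smaller than the corresponding -2\ln(LR) value of 20 ln 2 (about 13.86). This is one instance of the two statistics differing in a small sample, not a general proof of the direction of convergence.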

## References

1. Read, T. R. C. and Cressie, N. A. C. (1988). Goodness-of-fit statistics for discrete multivariate data. New York: Springer-Verlag. ISBN 0-387-96682-X.