First of all, and this is important, a nonsignificant statistical test is
never convincing evidence of anything. Not. Ever. Another way to say this is that absence of proof is not the same as proof of absence. Nonsignificance is absence of proof.
The way to address this is to do a formal test of equivalence -- a topic that, unfortunately, is missing from almost all statistics courses. In your situation, the problem could be stated in terms of a parameter theta, where N*theta is the noncentrality parameter
of the chi-square statistic and N is the total sample size. The hypotheses for the equivalence test are:
H0: theta >= theta0
H1: theta < theta0
where theta0 is a specified equivalence threshold (i.e., if theta < theta0, we consider the groups close enough to be equivalent). With this framework, if we reject H0, we then have strong evidence in favor of H1, that theta is small.
To do the test, proceed as follows:
- Create a 2 x 13 table of hypothetical proportions that reflects a pattern of proportions that you consider at the threshold between equivalence and nonequivalence. Each row should sum to 1. Do not use the observed data. Do this carefully in consideration
of the scientific issues and opinions of experts, and without looking ahead to see if it yields the P value you want.
- Multiply each row by the corresponding group's sample size (33 million and 142 million?), giving a table of hypothetical counts.
- Compute the chi-square statistic for this hypothetical table. Call this value chi0^2. That will be the value of N*theta0.
- Compute the chi-square statistic for the observed data. Call this value chi^2.
- The P value for the equivalence test is the cumulative (lower-tail) probability at chi^2, that is, the probability of a value no larger than chi^2, computed from the noncentral chi^2 distribution with (13-1)*(2-1)=12 d.f. and noncentrality parameter chi0^2. (Programs like SAS, R, and Minitab have such a function.)
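As a rough illustration of the steps above, here is a minimal Python sketch that uses scipy.stats.ncx2 for the noncentral chi-square distribution. The function names pearson_chi2 and equivalence_p_value are hypothetical, and the small 2 x 3 table, the group sizes, and the threshold proportions are made-up placeholders, not your data and not a recommended threshold. Your case would use a 2 x 13 table and your actual sample sizes.

```python
import numpy as np
from scipy.stats import ncx2


def pearson_chi2(counts):
    """Pearson chi-square statistic for a two-way table of counts."""
    counts = np.asarray(counts, dtype=float)
    row_totals = counts.sum(axis=1, keepdims=True)
    col_totals = counts.sum(axis=0, keepdims=True)
    expected = row_totals * col_totals / counts.sum()
    return float(((counts - expected) ** 2 / expected).sum())


def equivalence_p_value(observed_counts, threshold_props, group_sizes):
    """Lower-tail noncentral chi-square P value for H0: theta >= theta0."""
    threshold_props = np.asarray(threshold_props, dtype=float)
    group_sizes = np.asarray(group_sizes, dtype=float)
    # Scale each row of threshold proportions by its group's sample size.
    threshold_counts = threshold_props * group_sizes[:, None]
    chi2_0 = pearson_chi2(threshold_counts)    # plays the role of N * theta0
    chi2_obs = pearson_chi2(observed_counts)   # statistic for the real data
    rows, cols = threshold_props.shape
    df = (rows - 1) * (cols - 1)
    # Cumulative probability of the observed statistic under the
    # noncentral chi-square with noncentrality parameter chi2_0.
    p_value = float(ncx2.cdf(chi2_obs, df, chi2_0))
    return chi2_obs, chi2_0, p_value


# Made-up example: 2 groups, 3 categories (assumption, for illustration only).
threshold_props = [[0.30, 0.30, 0.40],
                   [0.35, 0.30, 0.35]]   # each row sums to 1
group_sizes = [1000, 2000]
observed_counts = [[310, 295, 395],
                   [640, 590, 770]]

chi2_obs, chi2_0, p_value = equivalence_p_value(
    observed_counts, threshold_props, group_sizes
)
print(f"chi^2 = {chi2_obs:.3f}, chi0^2 = {chi2_0:.3f}, P = {p_value:.4g}")
```

A small P value here means the observed chi^2 is well below what one would expect at the threshold, which is the evidence for equivalence. In R, the same lower-tail probability should be available from pchisq with its ncp argument.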
For more on equivalence tests, there is a paper by Schuirmann and a book by Wellek. See also the 2001 paper in The American Statistician by Hoenig and Heisey, "The Abuse of Power", which explains more about why nonsignificant tests and power calculations are
inappropriate methods for assessing equivalence. There is also an R package named 'equivalence' that looks promising, but I have not tried it myself.
Russ
Russell V. Lenth - Professor Emeritus
Department of Statistics and Actuarial Science
The University of Iowa - Iowa City, IA 52242 USA
Voice (319)335-0712 (Dept. office) - FAX (319)335-3017