Validate Few Jackknife Replicates

Purpose - Vitalnet BRFSS Query & Mapping Software (VB-QMS) makes BRFSS data analysis easier and more reliable. Analyses that take hours or days with generic statistics software, and are often subject to error, can be done reliably in minutes. VB-QMS uses Jackknife Replication (JR) to calculate confidence intervals (CI), and in most cases the JR method runs fast. To speed things up further, the user may choose a lower number of replicates. The purpose of this study was to estimate how much a lower number of replicates affects the accuracy of the resulting CI.

Methods - A series of analyses was run on 2005 Texas BRFSS data for the binge drinker variable (% Yes), by sex, with 95% CI calculated by the jackknife replicate method, as shown in this sample table. Each analysis used a different maximum number of replicates (Max_Reps). If Max_Reps is greater than the number of valid surveys in a cell (in this case, 2386 male and 4047 female), the number of replicates is reduced to the number of valid surveys. Each "diff" in the table is the absolute difference between the confidence limit at the given Max_Reps and the confidence limit with unlimited Max_Reps. For example, for the male lower limit at Max_Reps = 50, 21.01 - 20.95 = 0.06 percentage points, a very small difference. Small differences mean acceptable accuracy.
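To illustrate how a replicate cap can work, here is a minimal sketch of a delete-a-group jackknife for a percentage. It is not the VB-QMS implementation: the function name `jackknife_ci`, the random assignment of surveys to groups, the normal-approximation multiplier `z`, and the absence of survey weights and strata are all assumptions made for illustration. The one detail taken from the text above is that the number of replicates is capped at the number of valid surveys in the cell.

```python
import math
import random


def jackknife_ci(values, max_reps=None, z=1.96, seed=0):
    """Return (lo, hi) percent limits for the share of 1s in `values`.

    Hypothetical, unweighted sketch: `values` holds 1 (Yes) or 0 (No)
    for each valid survey in a cell.
    """
    n = len(values)
    # Cap the replicate count at the number of valid surveys in the cell.
    reps = n if max_reps is None else min(max_reps, n)
    if reps < 2:
        raise ValueError("need at least two replicates")

    total = sum(values)
    estimate = total / n

    # Assumption: surveys are assigned at random to `reps` near-equal groups.
    order = list(range(n))
    random.Random(seed).shuffle(order)
    group_sum = [0.0] * reps
    group_n = [0] * reps
    for pos, idx in enumerate(order):
        g = pos % reps
        group_sum[g] += values[idx]
        group_n[g] += 1

    # Each replicate estimate leaves one whole group out.
    rep_est = [(total - group_sum[g]) / (n - group_n[g]) for g in range(reps)]

    # Delete-a-group jackknife variance around the full-sample estimate.
    variance = (reps - 1) / reps * sum((r - estimate) ** 2 for r in rep_est)
    half_width = z * math.sqrt(variance)
    return 100 * (estimate - half_width), 100 * (estimate + half_width)
```

With `max_reps=None` every survey forms its own group, which is the ordinary delete-one jackknife. A smaller `max_reps` needs fewer replicate estimates, at the cost of a noisier variance estimate.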

Max_Reps	Male				Female
Max_Reps	CI lo	diff	CI hi	diff	CI lo	diff	CI hi	diff
50	21.01	0.06	25.36	0.06	4.85	0.12	7.03	0.13
100	21.08	0.13	25.29	0.13	4.87	0.10	7.01	0.11
200	20.99	0.04	25.38	0.04	4.96	0.01	6.92	0.02
400	20.96	0.01	25.41	0.01	4.97	0.00	6.90	0.00
800	20.92	0.03	25.45	0.03	4.98	0.01	6.92	0.02
1600	20.93	0.02	25.44	0.02	4.96	0.01	6.90	0.00
3200	20.95	0.00	25.42	0.00	4.98	0.01	6.91	0.01
6400	20.95	0.00	25.42	0.00	4.97	0.00	6.90	0.00
Unlimited	20.95	0.00	25.42	0.00	4.97	0.00	6.90	0.00
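The "diff" columns can be recomputed from the confidence limits alone. The short check below copies the limits from the table and takes the absolute difference from the unlimited row, rounded to two decimals. The variable names are illustrative only.

```python
# Confidence limits (percent) copied from the table above.
# Tuple order: male CI lo, male CI hi, female CI lo, female CI hi.
limits_by_max_reps = {
    50: (21.01, 25.36, 4.85, 7.03),
    100: (21.08, 25.29, 4.87, 7.01),
    200: (20.99, 25.38, 4.96, 6.92),
    400: (20.96, 25.41, 4.97, 6.90),
    800: (20.92, 25.45, 4.98, 6.92),
    1600: (20.93, 25.44, 4.96, 6.90),
    3200: (20.95, 25.42, 4.98, 6.91),
    6400: (20.95, 25.42, 4.97, 6.90),
}
unlimited = (20.95, 25.42, 4.97, 6.90)

# Absolute difference from the unlimited limits, in percentage points.
diffs = {
    max_reps: tuple(round(abs(v - u), 2) for v, u in zip(limits, unlimited))
    for max_reps, limits in limits_by_max_reps.items()
}
largest = max(max(d) for d in diffs.values())

for max_reps, d in diffs.items():
    print(max_reps, d)
print("largest diff:", largest)
```

The largest difference across all four limits is 0.13 percentage points, and it occurs only at Max_Reps of 50 and 100.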

Results - The jackknife CIs for small Max_Reps are very similar to those for unlimited Max_Reps. The largest differences were only about 0.1 percentage points (0.10 to 0.13) and occurred only at the lowest Max_Reps values, 50 and 100. From Max_Reps = 200 upward, no difference exceeded 0.04. So even the largest differences were less than or about the same as the differences between CI calculation methods as implemented in various BRFSS analysis systems, which are about 0.1 for large numbers and about 0.5 for small numbers.

Conclusions - For exploratory data analysis, the user may select a low Max_Reps to speed up the jackknife CI calculation, with very little loss in accuracy. For published results, we recommend selecting unlimited Max_Reps, as this gives the most accurate result.

Contact us if you have suggestions / comments, or to license Vitalnet BRFSS Query & Mapping Software to make better use of your data.