Validate Low # of Jackknife Replicates
Purpose -
We recently developed a Vitalnet module for analyzing BRFSS data.
The new program makes BRFSS data analysis much better,
easier, and more reliable.
Due to the complex survey design,
confidence intervals (CI) for BRFSS data are non-trivial.
Several CI methods are available for BRFSS data.
We chose "Jackknife Replication" (JR) mainly because
1. JR can calculate a CI for any outcome, including medians, and
2. JR seemed easier to understand and explain.
Probably because the computation-intensive JR method is hard to run fast enough,
most BRFSS analyses done with SAS and SUDAAN use Taylor Series linearization (TS) instead.
We have figured out how to make the JR method run fast, in most cases.
Nevertheless, to speed things up further, we would like to allow the user
to select a lower number of replicates.
The purpose of this study was to estimate what effect this has on the accuracy of the method.
Methods -
A series of analyses were done for
2005,
Texas,
_RFBING3 (binge drinker),
% Yes,
by sex,
95% CI,
jackknife replicate method,
as shown in this
sample table.
Each analysis used a different maximum number of replicates (max_reps). If
max_reps is greater than the number of valid surveys in a cell, the
number of replicates is reduced to the number of valid surveys.
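The capping rule can be sketched in Python as a delete-a-group jackknife. This is a minimal illustration under stated assumptions, not the Vitalnet implementation: the function name jackknife_ci, the round-robin assignment of records to groups, the variance formula, and the normal-approximation multiplier z = 1.96 are all hypothetical choices, and the sketch ignores BRFSS strata.

```python
import math

def jackknife_ci(values, weights, max_reps=None, z=1.96):
    """Delete-a-group jackknife CI for a weighted percentage (illustrative only).

    values   -- 0/1 outcomes, one per valid survey record
    weights  -- positive survey weights, same length as values
    max_reps -- cap on the number of replicates; None means unlimited
    """
    n = len(values)
    # Cap the replicate count at the number of valid records in the cell.
    reps = n if max_reps is None else min(max_reps, n)
    if reps < 2:
        raise ValueError("need at least two replicates")

    total_w = sum(weights)
    total_wy = sum(w * y for w, y in zip(weights, values))
    estimate = 100.0 * total_wy / total_w

    # Assumption: records are assigned to groups round-robin.
    group_w = [0.0] * reps
    group_wy = [0.0] * reps
    for i, (w, y) in enumerate(zip(weights, values)):
        group_w[i % reps] += w
        group_wy[i % reps] += w * y

    # Each replicate re-estimates the percentage with one group left out.
    rep_estimates = [
        100.0 * (total_wy - gwy) / (total_w - gw)
        for gw, gwy in zip(group_w, group_wy)
    ]

    # Jackknife variance, centered on the full-sample estimate.
    variance = (reps - 1) / reps * sum((r - estimate) ** 2 for r in rep_estimates)
    half_width = z * math.sqrt(variance)
    return estimate - half_width, estimate + half_width, reps
```

With max_reps=None, or with max_reps larger than the number of records, this reduces to a delete-one jackknife with one replicate per record. A smaller max_reps does less work because fewer replicate estimates are computed.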
Each "diff" in the table is the absolute difference between the confidence limit
with reduced max_reps and the
confidence limit with unlimited max_reps.
For example, for the male lower limit at max_reps = 50, 21.01 - 20.95 = 0.06 (a very small difference).
If the differences are small enough,
then it is OK to use a lower number of replicates.
max_reps | Male CI lo | diff | Male CI hi | diff | Female CI lo | diff | Female CI hi | diff |
50 | 21.01 | 0.06 | 25.36 | 0.06 | 4.85 | 0.12 | 7.03 | 0.12 |
100 | 21.08 | 0.13 | 25.29 | 0.13 | 4.87 | 0.10 | 7.01 | 0.10 |
200 | 20.99 | 0.04 | 25.38 | 0.04 | 4.96 | 0.01 | 6.92 | 0.01 |
400 | 20.96 | 0.01 | 25.41 | 0.01 | 4.97 | 0.00 | 6.90 | 0.00 |
800 | 20.92 | 0.03 | 25.45 | 0.03 | 4.98 | 0.01 | 6.92 | 0.01 |
1600 | 20.93 | 0.02 | 25.44 | 0.02 | 4.96 | 0.01 | 6.90 | 0.01 |
3200 | 20.95 | 0.00 | 25.42 | 0.00 | 4.98 | 0.01 | 6.91 | 0.01 |
6400 | 20.95 | 0.00 | 25.42 | 0.00 | 4.97 | 0.00 | 6.91 | 0.00 |
Unlimited | 20.95 | 0.00 | 25.42 | 0.00 | 4.97 | 0.00 | 6.91 | 0.00 |
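The "diff" columns can be recomputed from the confidence limits in the table. The Python sketch below does this; the dictionary layout and the function name diffs_vs_unlimited are assumptions made for illustration, and the numbers are copied from the table.

```python
# CI limits copied from the table: (male lo, male hi, female lo, female hi)
limits = {
    "50": (21.01, 25.36, 4.85, 7.03),
    "100": (21.08, 25.29, 4.87, 7.01),
    "200": (20.99, 25.38, 4.96, 6.92),
    "400": (20.96, 25.41, 4.97, 6.90),
    "800": (20.92, 25.45, 4.98, 6.92),
    "1600": (20.93, 25.44, 4.96, 6.90),
    "3200": (20.95, 25.42, 4.98, 6.91),
    "6400": (20.95, 25.42, 4.97, 6.91),
    "Unlimited": (20.95, 25.42, 4.97, 6.91),
}

def diffs_vs_unlimited(limits):
    """Absolute difference of each limit from the 'Unlimited' row."""
    ref = limits["Unlimited"]
    return {
        reps: tuple(round(abs(v - r), 2) for v, r in zip(vals, ref))
        for reps, vals in limits.items()
    }

diffs = diffs_vs_unlimited(limits)
largest = max(max(d) for d in diffs.values())
for reps, d in diffs.items():
    print(reps, d)
print("largest diff:", largest)
```

Because the limits in the table are already rounded to two decimals, a recomputed diff can differ from the tabulated one by 0.01 (for example, the female upper limit at 400 and 3200 replicates).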
Results -
The Jackknife CIs for small numbers of replicates are
very similar to those for unlimited replicates.
The largest differences were only about 0.1 (0.13 for males at 100 replicates, 0.12 for females at 50 replicates) and occurred with low replicate counts.
These differences are less than or about the same as the
differences between CI calculation methods,
as implemented in various BRFSS analysis systems, which are about 0.1
for large numbers and about 0.5 for small numbers.
Conclusions -
For exploratory data analysis, to speed up Jackknife
CI calculation, the user may select a low number of replicates,
with very little loss in accuracy. For published results, it is recommended
to select more replicates, as this gives a more accurate result.