Validation of Vitalnet® Jackknife Replication


 
Introduction and Purpose - We recently developed a Vitalnet module for analyzing BRFSS data. The new program makes BRFSS data analysis much better, easier, and more reliable.
 
Due to the complex survey design, calculating confidence intervals (CI) for BRFSS data is non-trivial. Several CI methods are available. We chose "Jackknife Replication" (JR) mainly because 1) JR can calculate a CI for any outcome, including medians, and 2) JR seemed easier to understand and explain. Most BRFSS analyses using SAS and SUDAAN use Taylor series linearization (TS) instead, probably because the computation-intensive JR method could not be run fast enough.

Confidence intervals - Calculating confidence intervals (CI) for BRFSS data is complex, and several CI methods are available. We chose "Jackknife Replication" (JR) mainly because 1) JR can calculate confidence intervals for any outcome, including medians, and 2) JR is better documented and easier to understand.

Jackknife Replication - JR recalculates the outcome, such as "% Yes", many times, each time leaving out one or a few observations and re-weighting the remaining observations. Each recalculation is called a "replicate". JR then calculates the confidence interval based on the distribution of the replicate outcomes. The method is called "jackknife" because it is so generally useful, like a jackknife.

Taylor Series Linearization - The other main option, "Taylor series linearization" (TS), converts each observation to a "linearized variable". It then calculates the confidence interval based on the distribution of the linearized variables. Based on comparisons we have done, TS and JR produce essentially the same results. However, TS cannot calculate confidence intervals for medians, and there is no way around this disadvantage. Also, the details of the TS algorithm are very poorly documented. Currently, TS is essentially a "black box".

Vitalnet implementation - JR is computation-intensive and can be slow. However, we have optimized JR to make it much faster (typically a few seconds). We have also determined that a JR CI with a smaller number of replicates is close to one with an unlimited number of replicates. For exploratory data analysis, the user can select a smaller number of replicates. For published results, more replicates are recommended. If it becomes possible to obtain details of the TS algorithm, we may add TS as an option in the future, to speed up CI analyses in some cases with very large numbers of records.
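The replicate-and-recompute idea behind JR can be illustrated with a minimal Python sketch. It is not the Vitalnet code. It assumes a common textbook variant (delete one primary sampling unit per replicate within each stratum, re-weighting the rest of that stratum) and a normal-approximation multiplier of 1.96 for a 95% CI. The record keys 'stratum', 'psu', 'weight', and 'yes' and the function names are hypothetical, chosen only for the illustration.

```python
import math
from collections import defaultdict


def weighted_percent(records, weights):
    """Weighted percent of records answering Yes."""
    total = sum(weights)
    yes = sum(w for r, w in zip(records, weights) if r["yes"])
    return 100.0 * yes / total


def jackknife_ci(records, z=1.96):
    """Delete-one-PSU stratified jackknife CI for a weighted percent.

    Assumption: each record is a dict with the hypothetical keys
    'stratum', 'psu', 'weight', and 'yes' (True/False).
    """
    base_weights = [r["weight"] for r in records]
    estimate = weighted_percent(records, base_weights)

    # Collect the PSU identifiers that belong to each stratum.
    psus = defaultdict(set)
    for r in records:
        psus[r["stratum"]].add(r["psu"])

    variance = 0.0
    for stratum, psu_ids in psus.items():
        n_h = len(psu_ids)
        if n_h < 2:
            continue  # a single-PSU stratum yields no replicates in this sketch
        scale = n_h / (n_h - 1)
        for dropped in psu_ids:
            # One replicate: drop one PSU, re-weight the rest of its stratum.
            rep_weights = []
            for r, w in zip(records, base_weights):
                if r["stratum"] != stratum:
                    rep_weights.append(w)
                elif r["psu"] == dropped:
                    rep_weights.append(0.0)
                else:
                    rep_weights.append(w * scale)
            replicate = weighted_percent(records, rep_weights)
            variance += (n_h - 1) / n_h * (replicate - estimate) ** 2

    half_width = z * math.sqrt(variance)
    return estimate, estimate - half_width, estimate + half_width


# Toy data (invented for illustration): one stratum, four PSUs.
sample = [
    {"stratum": 1, "psu": i, "weight": 1.0, "yes": i % 2 == 0}
    for i in range(4)
]
print(jackknife_ci(sample))
```

Because each replicate simply recomputes the outcome, the same loop would work for a median or any other statistic by swapping out `weighted_percent`, which is the flexibility the text attributes to JR.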

Because we are using a different (though apparently equally valid) method than the commonly used TS, and because we needed to validate the complex JR programming as well as possible, we felt it was necessary to systematically compare a series of Vitalnet CI calculations with those calculated by CDC WEAT and by several prominent State systems. We also felt it necessary to do a "reality check", to verify that JR and TS give the same or very similar results in practice. We considered WEAT to be the current "gold standard", and used it as a benchmark.



Methods - For each comparison, a CI was calculated for "% Yes" (or "% No") for a particular BRFSS variable, as analyzed by Vitalnet, WEAT, and one of two State systems, for 2005. Each CI was calculated at the 95% level. To provide more tests, and because this breakdown was available in each system, the analysis was carried out separately for males and females. The percents used ended up ranging from small to large, which would expose any difference specific to larger or smaller percents.

Since WEAT and some State systems were limited to one digit after the decimal point, one digit of precision was used for all tests. An unlimited number of replicates was used for the JR method. The BRFSS variables were picked at random, subject to the requirement that all three systems (Vitalnet, WEAT, and the State system) included the variable.

To select the systems included in the report, we started with two that we knew existed (A System and B System). We tried to include C System, but it said "under reconstruction", as it had for at least some weeks. Then, we did an internet search for "brfss data query", and picked the ones highest in the results at the time (October, 2008). We were not able to use the D System, which showed up in the Google search; it allowed some selection of parameters, but there was consistently a "page not found" error when we clicked the button to run the query. The next system found in the Google search was the E System, which we initially included. Later, we left out the E System, when we received information that the State re-weights their data to account for differences between State population estimates and those used by the CDC. The re-weighting made it impossible to directly compare weighted percents and confidence intervals with their results. Otherwise, once a comparison was done, it was included, regardless of the results.

The purpose of the report is to compare statistical methods, not to compare software systems. Therefore, the States are not identified in this report. Full information, including output with confidence intervals, is available on request.



Results and Discussion - Each of Tables A-D below shows a comparison between WEAT, Vitalnet, and a particular State system. When only one system had a different CI, or when a system had a markedly different result, the differing number is marked in red. Each of Tables E-F below shows a comparison between WEAT, Vitalnet, and the B System, for small numbers. The B System was the only State system where it was obvious how to do a sub-analysis.

Very Similar Vitalnet and WEAT CI - We found very close results between Vitalnet and WEAT. For Tables A-D (large numbers), only 3 of 24 confidence limits differed, and only by 0.1% in each case. Also for Tables A-D, Vitalnet (and WEAT) were the "odd man out" (colored red) only 1 of 16 possible times.
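The Vitalnet and WEAT comparison for Tables A-D can be re-checked with a short Python sketch. The lower and upper limits are copied from the tables below (male then female). The variable and function names are hypothetical, and `Decimal` is used so that one-decimal values subtract exactly.

```python
from decimal import Decimal

# Limits copied from Tables A-D, in the order:
# male LL, male UL, female LL, female UL.
VITALNET = {
    "A": ["20.9", "25.4", "5.0", "6.9"],
    "B": ["70.1", "74.8", "53.5", "57.6"],
    "C": ["11.7", "15.7", "8.0", "10.5"],
    "D": ["83.3", "87.0", "68.7", "73.0"],
}
WEAT = {
    "A": ["21.0", "25.4", "5.0", "6.9"],
    "B": ["70.1", "74.8", "53.6", "57.6"],
    "C": ["11.8", "15.7", "8.0", "10.5"],
    "D": ["83.3", "87.0", "68.7", "73.0"],
}


def limit_differences(a, b):
    """Absolute difference for every matching confidence limit."""
    diffs = []
    for table in a:
        for x, y in zip(a[table], b[table]):
            diffs.append(abs(Decimal(x) - Decimal(y)))
    return diffs


diffs = limit_differences(VITALNET, WEAT)
nonzero = [d for d in diffs if d != 0]
print(len(nonzero), "limits differ; largest gap:", max(diffs))
```

With the table values as transcribed here, the sketch finds three differing limits, each 0.1 apart.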

State Systems CI not quite as Similar - As shown in Tables A-D, the A System and B System were close to Vitalnet and WEAT, but differed somewhat. The A System was "odd man out" (colored red) 4 of 8 possible times. The B System was "odd man out" (colored red) 5 of 8 possible times. In practice, with large numbers, the differences in Tables A-D with the A System and B System would not affect the interpretation of results.

Why the Differences? - Vitalnet uses jackknife replicates (JR) to calculate the CI, and the procedure is fully documented. It is not surprising that Vitalnet differs a little from WEAT, since WEAT uses Taylor series linearization (TS), a totally different method. If anything, the surprise is that WEAT and Vitalnet are so close. Since none of the State programs give any details concerning how they calculate a CI, there is no way to figure out why they differ somewhat more. It is likely the State programs also use TS, since they are likely based on SAS or SUDAAN, but there is no way to know without further investigation.
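For readers who want a feel for how different TS is from JR, the following Python sketch shows a generic textbook form of linearization for a weighted proportion. It is not the WEAT, SAS, or SUDAAN implementation, whose details we do not have. It assumes a stratified design with PSUs treated as sampled with replacement, a normal multiplier of 1.96, and hypothetical record keys 'stratum', 'psu', 'weight', and 'yes'.

```python
import math
from collections import defaultdict


def linearized_ci(records, z=1.96):
    """Textbook Taylor-linearization CI for a weighted percent (sketch)."""
    total_w = sum(r["weight"] for r in records)
    p = sum(r["weight"] for r in records if r["yes"]) / total_w

    # Linearized value for each record, summed within each PSU.
    psu_totals = defaultdict(lambda: defaultdict(float))
    for r in records:
        y = 1.0 if r["yes"] else 0.0
        value = r["weight"] * (y - p) / total_w
        psu_totals[r["stratum"]][r["psu"]] += value

    # Between-PSU variance of the linearized totals, stratum by stratum.
    variance = 0.0
    for totals in psu_totals.values():
        n_h = len(totals)
        if n_h < 2:
            continue  # a single-PSU stratum is skipped in this sketch
        mean = sum(totals.values()) / n_h
        spread = sum((t - mean) ** 2 for t in totals.values())
        variance += n_h / (n_h - 1) * spread

    half_width = 100.0 * z * math.sqrt(variance)
    return 100.0 * p, 100.0 * p - half_width, 100.0 * p + half_width


# Toy data (invented for illustration): one stratum, four PSUs.
sample = [
    {"stratum": 1, "psu": i, "weight": 1.0, "yes": i % 2 == 0}
    for i in range(4)
]
print(linearized_ci(sample))
```

Unlike JR, this approach never recomputes the outcome. It needs a formula for the linearized value of each statistic, which is why it does not extend directly to medians.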

Small Numbers - As shown in Tables E-F, each Vitalnet and WEAT CI based on small numbers is also very close. The difference in each limit varies from 0.0 to 0.3. Each WEAT small-number CI was somewhat narrower than the Vitalnet CI. Or one could say the Vitalnet CI was somewhat wider. At this point, there is no way to say which is "right". From a theoretical point of view, JR would seem to be a more valid method, since it is an exact non-parametric method. However, the differences are so small as to not seem significant. The B System small-number CIs were not as close to Vitalnet or WEAT, as shown in Tables E-F. It is possible that the small-number CI differences between WEAT and Vitalnet also occur with larger numbers of responses, but are obscured by the relatively smaller size of the CI and by the fact that WEAT only allows one digit after the decimal point.
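The width comparison for small numbers can be reproduced with a short Python sketch. The limits are copied from Tables E and F below. The dictionary layout and the `width` helper are hypothetical names used only for this illustration.

```python
from decimal import Decimal

# (LL, UL) copied from Tables E and F; keys are (table, sex).
VITALNET = {
    ("E", "male"): ("65.8", "80.8"),
    ("E", "female"): ("73.4", "84.5"),
    ("F", "male"): ("20.5", "36.9"),
    ("F", "female"): ("25.0", "40.6"),
}
WEAT = {
    ("E", "male"): ("65.9", "80.7"),
    ("E", "female"): ("73.4", "84.4"),
    ("F", "male"): ("20.8", "36.7"),
    ("F", "female"): ("25.2", "40.4"),
}


def width(ci):
    """Width of a confidence interval given as (LL, UL) strings."""
    low, high = ci
    return Decimal(high) - Decimal(low)


for key in VITALNET:
    v = width(VITALNET[key])
    w = width(WEAT[key])
    print(key, "Vitalnet width", v, "WEAT width", w, "difference", v - w)
```

With the table values as transcribed here, every Vitalnet interval comes out wider than the corresponding WEAT interval, by between 0.1 and 0.5 percentage points.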

Conclusion - The JR method produces valid confidence intervals for BRFSS data. In practice, in terms of accuracy, JR and TS appear to be equivalent.




Table A) _RFBING3 (Binge Drinker)

System      Results
            Male                            Female
            % Yes   LL     UL     Yes       % Yes   LL     UL     Yes
A System    23.2    21.0   25.5   < 500     5.9     5.0    7.0    < 500
Vitalnet    23.2    20.9   25.4   < 500     5.9     5.0    6.9    < 500
WEAT        23.2    21.0   25.4   < 500     5.9     5.0    6.9    < 500



Table B) _BMI4CAT (Overweight)

System      Results
            Male                            Female
            % Yes   LL     UL     Yes       % Yes   LL     UL     Yes
A System    72.4    70.0   74.7   > 1000    55.6    53.5   57.6   > 1000
Vitalnet    72.4    70.1   74.8   > 1000    55.6    53.5   57.6   > 1000
WEAT        72.4    70.1   74.8   > 1000    55.6    53.6   57.6   > 1000






Table C) _RFSMOK3 (Current Smoker)

System      Results
            Male                            Female
            % Yes   LL     UL     Yes       % Yes   LL     UL     Yes
B System    13.7    11.9   15.8   < 500     9.3     8.1    10.6   < 500
Vitalnet    13.7    11.7   15.7   < 500     9.3     8.0    10.5   < 500
WEAT        13.7    11.8   15.7   < 500     9.3     8.0    10.5   < 500



Table D) _FV5SRV (Five a Day)

System      Results
            Male                            Female
            % No    LL     UL     No        % No    LL     UL     No
B System    85.1    83.2   86.9   > 1000    70.9    68.7   73.0   > 1000
Vitalnet    85.1    83.3   87.0   > 1000    70.9    68.7   73.0   > 1000
WEAT        85.1    83.3   87.0   > 1000    70.9    68.7   73.0   > 1000






Table E) _CHOLCHK (Chol. past 5 years), 45-49

System      Results
            Male                            Female
            % Yes   LL     UL     Yes       % Yes   LL     UL     Yes
B System    73.3    65.3   80.0   < 500     78.9    72.9   83.9   < 500
Vitalnet    73.3    65.8   80.8   < 500     78.9    73.4   84.5   < 500
WEAT        73.3    65.9   80.7   < 500     78.9    73.4   84.4   < 500



Table F) _CHOLCHK (Chol. past 5 years), 18-24

System      Results
            Male                            Female
            % Yes   LL     UL     Yes       % Yes   LL     UL     Yes
B System    28.7    21.5   37.3   < 100     32.8    25.7   40.8   < 100
Vitalnet    28.7    20.5   36.9   < 100     32.8    25.0   40.6   < 100
WEAT        28.7    20.8   36.7   < 100     32.8    25.2   40.4   < 100


References consulted concerning TS and JR methods include the following:

Abbreviations used in the tables and text are:

This information last updated: Jan 12, 2014