A. Introduction
High value data -
The BRFSS is a rich source of public health data.
From the CDC BRFSS web site:
The Behavioral Risk Factor Surveillance System (BRFSS) is the world's largest,
on-going telephone health survey system, tracking health conditions and risk
behaviors in the United States yearly since 1984.
Currently, data are collected monthly in all 50 states, the
District of Columbia, Puerto Rico, the U.S. Virgin Islands, and Guam.
Underutilized data -
However, BRFSS data are currently underutilized, mainly because they
are quite complex to analyze.
Sources of the complexity include:
1) Complex nature of survey design and weighting.
2) Large number of questions asked.
3) States can add or not ask certain questions.
4) Questions can change from year to year.
Vitalnet BRFSS -
We have developed Vitalnet BRFSS for analyzing and making better use of BRFSS data.
Vitalnet BRFSS provides the needed combination of analytical flexibility,
ease-of-use, and output capabilities, in a professional software system.
As explained below, other analysis alternatives do not even come close
to providing this needed combination.
Greatly improved data use -
Vitalnet makes much better use of BRFSS data for improving US public health.
Vitalnet will make it much easier for data analysts, researchers,
health agencies and public users to analyze BRFSS data.
B. BRFSS Vitalnet Development and Current Status
Background research -
We have appreciated the usefulness of BRFSS data for many years.
As a first step for developing the Vitalnet BRFSS module,
we carefully reviewed CDC and State BRFSS reports,
and the extensive technical documentation on the
BRFSS web site.
We also looked at existing home-grown web-based systems.
These reviews helped us better understand the nature of BRFSS data,
and how the data are currently used.
Database design -
Next, we designed a novel record-level database architecture
specially optimized for BRFSS data.
This was a key step.
The special database design results in low disk space requirements, rapid analyses, and retention of record-level detail.
Data importing -
Next, Vitalnet data importing routines were modified,
to rapidly and reliably import BRFSS data into the Vitalnet data warehouse.
New data can be imported in less than one day,
including extensive error checking.
Database engine -
Next, the Vitalnet database engine (the functions that read the database and
produce the needed results) was modified to read the new database architecture.
New functions were written to take into account BRFSS weighting and produce
the "primary statistics" needed for BRFSS data, such as "weighted % Yes".
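As a rough illustration of what a "weighted % Yes" statistic means, here is a minimal Python sketch. It is not Vitalnet code. The function name `weighted_percent_yes` and the toy answers and weights are hypothetical, and it assumes each valid interview carries one final survey weight.

```python
from typing import Sequence


def weighted_percent_yes(answers: Sequence[bool], weights: Sequence[float]) -> float:
    """Weighted % Yes = 100 * (sum of weights for Yes) / (sum of weights for all valid answers)."""
    if len(answers) != len(weights):
        raise ValueError("answers and weights must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("total weight must be positive")
    yes_weight = sum(w for a, w in zip(answers, weights) if a)
    return 100.0 * yes_weight / total_weight


# Hypothetical toy data: 2 of 4 respondents said Yes (50% unweighted),
# but the Yes respondents represent fewer people in the population.
answers = [True, False, True, False]
weights = [100.0, 300.0, 200.0, 400.0]
pct = weighted_percent_yes(answers, weights)
print(f"Weighted % Yes: {pct:.1f}")
```

The weighted result differs from the unweighted 50% because each respondent counts in proportion to the number of people that respondent represents.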
Interface and outputs -
Next, we modified the existing Vitalnet desktop and web
interfaces to provide additional BRFSS-specific menu options.
The BRFSS module uses the same "look and feel" as other Vitalnet modules,
making it easy for users to switch between data sets.
Finally, Vitalnet output formats were modified to handle the
BRFSS numerical results, and to include the BRFSS-specific
documentation needed to prevent misinterpretation of results.
Confidence intervals -
Calculating confidence intervals (CI) for BRFSS data is complex.
Several CI methods are available for BRFSS data.
We chose "Jackknife Replication" (JR) mainly because 1) JR can
calculate confidence intervals for any outcome, including medians,
and 2) JR is better documented and easier to understand.
Jackknife Replication -
JR recalculates the outcome, such as "% Yes", many times, each time leaving out one or a few observations, and re-weighting the remaining observations.
Each recalculation is called a "replicate".
Then, it calculates the confidence interval based on the distribution of the replicate outcomes.
The method is called "jackknife" because it is so generally useful, like a jackknife.
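The replicate idea described above can be sketched in Python as a textbook delete-one jackknife. This is only an illustration under simplifying assumptions, not the Vitalnet implementation: it ignores strata and sampling units, treats every interview as its own unit, and uses a normal approximation for the interval. The names `weighted_pct_yes` and `jackknife_ci` are hypothetical.

```python
import math
from typing import Sequence, Tuple


def weighted_pct_yes(answers: Sequence[bool], weights: Sequence[float]) -> float:
    """Weighted % Yes over valid responses."""
    return 100.0 * sum(w for a, w in zip(answers, weights) if a) / sum(weights)


def jackknife_ci(answers: Sequence[bool], weights: Sequence[float],
                 z: float = 1.96) -> Tuple[float, float, float]:
    """Return (estimate, lower, upper) from a delete-one jackknife."""
    answers, weights = list(answers), list(weights)
    n = len(answers)
    if n < 2:
        raise ValueError("need at least two observations")
    estimate = weighted_pct_yes(answers, weights)
    scale = n / (n - 1)  # re-weighting factor for the remaining observations
    replicates = []
    for i in range(n):
        kept_answers = answers[:i] + answers[i + 1:]
        # For a ratio such as % Yes the factor cancels, but it is shown
        # here because totals and counts would need it.
        kept_weights = [w * scale for w in weights[:i] + weights[i + 1:]]
        replicates.append(weighted_pct_yes(kept_answers, kept_weights))
    mean_rep = sum(replicates) / n
    # Delete-one jackknife variance of the replicate outcomes.
    variance = (n - 1) / n * sum((r - mean_rep) ** 2 for r in replicates)
    half_width = z * math.sqrt(variance)
    return estimate, estimate - half_width, estimate + half_width


# Hypothetical toy data with equal weights.
demo_answers = [True, False, True, False, True, False]
demo_weights = [1.0] * 6
est, lower, upper = jackknife_ci(demo_answers, demo_weights)
print(f"{est:.1f}% (95% CI {lower:.1f} to {upper:.1f})")
```

Because the outcome function is passed the kept observations unchanged, the same loop would work for other outcomes, such as a weighted median, by swapping `weighted_pct_yes` for a different function.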
Taylor Series Linearization -
The other main option, "Taylor series linearization" (TS),
converts each observation to a "linearized variable".
Then, it calculates the confidence interval based on the distribution of the linearized variables.
Based on comparisons we have done, TS and JR produce essentially
the same results.
However, TS cannot calculate confidence intervals for medians.
There is no way around this disadvantage.
Also, the details of the TS algorithm are very poorly documented.
Currently, TS is essentially a "black box".
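For orientation only, the following Python sketch shows the textbook linearization of a weighted proportion. It is not a description of how any particular package implements TS. It assumes a single stratum with each interview treated as its own sampling unit, which real BRFSS designs do not satisfy, and the name `taylor_ci` is hypothetical.

```python
import math
from typing import Sequence, Tuple


def taylor_ci(answers: Sequence[bool], weights: Sequence[float],
              z: float = 1.96) -> Tuple[float, float, float]:
    """Return (estimate, lower, upper) for weighted % Yes via linearization."""
    n = len(answers)
    if n < 2:
        raise ValueError("need at least two observations")
    total_weight = sum(weights)
    p = sum(w for a, w in zip(answers, weights) if a) / total_weight
    # Linearized variable for the ratio p = sum(w*y) / sum(w):
    # z_i = w_i * (y_i - p) / sum(w)
    linearized = [w * ((1.0 if a else 0.0) - p) / total_weight
                  for a, w in zip(answers, weights)]
    mean_lin = sum(linearized) / n  # zero apart from rounding
    # With-replacement variance estimate from the linearized values
    # (simplifying assumption: one stratum, one interview per unit).
    variance = n / (n - 1) * sum((v - mean_lin) ** 2 for v in linearized)
    estimate = 100.0 * p
    half_width = z * 100.0 * math.sqrt(variance)
    return estimate, estimate - half_width, estimate + half_width


# Hypothetical toy data with equal weights.
ts_answers = [True, False, True, False, True, False]
ts_weights = [1.0] * 6
ts_est, ts_lower, ts_upper = taylor_ci(ts_answers, ts_weights)
print(f"{ts_est:.1f}% (95% CI {ts_lower:.1f} to {ts_upper:.1f})")
```

The linearized variable is built from the derivative of the statistic, which is why this approach has no direct analogue for a median.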
Vitalnet implementation -
JR is computation-intensive and can be slow.
However, we have optimized JR to make it much faster (typically a few seconds).
Also, we have determined
that a JR CI with smaller numbers of replicates is close to one
with an unlimited number of replicates.
For exploratory data analysis, the user can select a smaller number of replicates.
For published results, "unlimited replicates" is recommended.
If we can obtain details of the TS algorithm,
we may add TS as an option in the future,
to speed up CI analyses in some cases with very large numbers of records.
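One common way to limit the number of replicates is a "delete-a-group" jackknife, sketched below in Python. This is an assumption about what a smaller number of replicates could look like, not a description of Vitalnet internals. The data are simulated, the name `grouped_jackknife_se` is hypothetical, and the round-robin grouping assumes record order is unrelated to the answers.

```python
import math
import random
from typing import Sequence


def weighted_pct_yes(answers: Sequence[bool], weights: Sequence[float]) -> float:
    """Weighted % Yes over valid responses."""
    return 100.0 * sum(w for a, w in zip(answers, weights) if a) / sum(weights)


def grouped_jackknife_se(answers: Sequence[bool], weights: Sequence[float],
                         n_groups: int) -> float:
    """Standard error of weighted % Yes using n_groups replicates."""
    n = len(answers)
    g = min(n_groups, n)  # g == n reduces to the delete-one jackknife
    if g < 2:
        raise ValueError("need at least two groups")
    replicates = []
    for k in range(g):
        # Drop every observation assigned to group k (round-robin assignment).
        kept = [i for i in range(n) if i % g != k]
        # Re-weighting the kept records by g/(g-1) cancels for a ratio,
        # so it is omitted here.
        replicates.append(weighted_pct_yes([answers[i] for i in kept],
                                           [weights[i] for i in kept]))
    mean_rep = sum(replicates) / g
    return math.sqrt((g - 1) / g * sum((r - mean_rep) ** 2 for r in replicates))


# Simulated data, for illustration only.
rng = random.Random(0)
sim_answers = [rng.random() < 0.3 for _ in range(600)]
sim_weights = [rng.uniform(50.0, 500.0) for _ in range(600)]
se_few = grouped_jackknife_se(sim_answers, sim_weights, 50)
se_all = grouped_jackknife_se(sim_answers, sim_weights, len(sim_answers))
print(f"SE with 50 replicates: {se_few:.2f}; SE with one per record: {se_all:.2f}")
```

Fewer groups means fewer recalculations, at the cost of a noisier variance estimate, which fits the advice to use the full number of replicates for published results.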
Validation of Vitalnet JR confidence intervals.
Verification that a low number of JR replicates gives similar confidence intervals.
Current status -
Currently, over 70 BRFSS variables are incorporated into the Vitalnet BRFSS module.
Any additional variable can be added at the licensee's request.
Several years of data have been added,
and the program does time trend analyses.
Mechanisms have been added to take into account changes in
questions asked over the years, and questions not asked in some years.
The program has been extensively verified and tested,
both for internal consistency and in comparison with printed
reports from the BRFSS web site and results from CDC WEAT.
Future plans -
We plan to add age-adjusted outcome measures.
Also, maps will be added, similar to the mapping in other Vitalnet modules.
Currently, the Vitalnet BRFSS module produces descriptive statistics, which are
what most users need most of the time, and are easiest to understand.
In the future, we may add multiple regression analysis.
If you have thoughts on that, or other suggestions for future directions,
please contact us.
C. Examples of BRFSS module output
Below are links to examples of output produced by the Vitalnet BRFSS module.
Each example took a minute or less to design and produce.
The time taken is mostly figuring out what you want to do.
The actual parameter selection and program execution is fast.
The examples are just a sample of what is possible with Vitalnet BRFSS.
The program lets you quickly and reliably "compare just about anything with anything".
We think we've provided the descriptive statistics almost any user would want.
But if you think we've missed something you find useful,
let us know.
· 95% Confidence intervals
· 98% Confidence intervals
· Omit Confidence intervals
· Summary time trend analysis
· Detailed time trend analysis
· Cross-tabulation
· Combining areas
· Comparing areas
· Comparing regions
· Body mass index analysis
· Last checkup analysis
· Bar chart
· Line chart
· Pie chart
· Valid interviews
· Number yes for question
· Number no for question
· Weighted valid interviews (prevalence)
· Weighted number yes for question (prevalence)
· Weighted number no for question (prevalence)
· Weighted percent yes for question
· Weighted percent no for question
· Column total = 100%
· Row total = 100%
· Try out VitalWeb (Vitalnet browser platform) for free.
D. Vitalnet Compared with Generic Stat Packages (GSP)
Motivation for GSP -
BRFSS data have historically been analyzed using
certain commercial generic stat packages (GSP),
because there was no alternative.
"Generic stat packages" include SAS, SPSS, SUDAAN, and Stata.
They are powerful, general purpose, and flexible.
GSP shortcomings -
However, using a GSP is far beyond the capability of the
great majority of users.
Even in expert hands, using a GSP is quite complex,
error-prone, awkward, and time-consuming, especially for
complex survey data such as BRFSS data.
The user needs to understand many details about BRFSS file layouts,
BRFSS variables, and how to use the complex software.
Also, a GSP is typically not cheap, and can easily cost thousands of dollars.
Also, the needed training is expensive and time-consuming.
Vitalnet is much better -
Vitalnet is a new and fundamentally different kind of stats package.
We've taken a novel approach to greatly simplify data analysis.
Vitalnet is specifically and totally customized for
analyzing BRFSS data (or some other data set of interest to you),
for a particular jurisdiction.
Vitalnet "knows" all about the data set.
For both occasional and expert users,
Vitalnet is much easier, more reliable, and more useful.
Instead of the confusing array of statistical tests offered by a GSP,
Vitalnet offers exactly the options the user might need, in a fully menu-driven format.
Instead of the arduous, error-prone task of setting up a GSP analysis,
you merely choose options from self-explanatory menus and press "Go".
Training needs are minimal.
Vitalnet training is mostly focused on understanding data analysis in general
(e.g., what a confidence interval is),
and explaining what capabilities are available in Vitalnet.
Here are some key ways Vitalnet BRFSS is much better than a GSP
for analyzing BRFSS data:
| Vitalnet BRFSS Module | Generic Stat Package (GSP) |
|---|---|
| Produces results in seconds or minutes. | Takes hours or days. |
| Benefits both casual and expert users. | Requires expert user. |
| Is easy-to-use and reliable. | Is difficult and error-prone. |
| "Knows" details of BRFSS files. | User must learn file details. |
| Output is publication-ready. | Output needs reformatting. |
| Directly makes needed tables. | Only makes subset of needed tables. |
Home-grown systems (HGS) were produced in an effort to make BRFSS and other data easier
to access and analyze.
| Vitalnet BRFSS Module | Home-Grown System (HGS) |
|---|---|
| Has clean, professional interface. | Interface is kludgy and awkward. |
| Makes almost any result needed. | Only makes subset of needed results. |
| Allows needed customization of output. | Output customization capacity is minimal. |
| User never needs to "back out". | User must "back out" to select year or variable. |
| Is a professional software system. | Has rough edges, missing parts, and glitches. |
| Allows data groupings to be customized. | Data groupings are fixed or not accessible. |
| Makes customizable charts and graphs. | Charting is missing or poorly implemented. |
| Always produces valid output. | Output results in error or misinterpretation. |
| Output is publication-ready. | Output is rough, and needs reformatting. |