Time Trend Analysis
The main purpose of this web page
is to seek help from the statistical community
in finding a better method to determine the significance of a linear time trend.
If someone is plotting results over time,
they normally want to know if there is a time trend.
For example: "Is the death rate for disease X in County Y steady, going up, or going down?"
This is a very basic statistical task,
and applies to many situations.
Vitalnet uses the
least squares regression method
to determine the time trend line.
The significance of the trend is determined by calculating the confidence interval (CI)
of the slope, at some confidence level (e.g., 95%). If the CI includes
0, there is no significant trend. Otherwise, there is a significant
trend. The following table shows the three general trend cases:
| 95% CI of Slope || Significant trend?
| -0.6 to -0.2 (- to -) || yes, downward trend
| -0.2 to +0.2 (- to +) || no significant trend
| +0.2 to +0.6 (+ to +) || yes, upward trend
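The slope CI test described above can be sketched in a few lines. This is a minimal illustration of the textbook least squares slope CI, not Vitalnet's actual code. The function names `slope_ci` and `classify` are hypothetical. The critical value 4.303 is the standard two-sided 95% t value for 2 degrees of freedom (four points minus two fitted parameters), taken from a t table. The data are the four-year rates used in the county example further down this page.

```python
import math

def slope_ci(x, y, t_crit):
    """Return (slope, ci_low, ci_high) for an ordinary least squares line.

    t_crit is the two-sided t critical value for n - 2 degrees of freedom;
    it is passed in so that no statistics library is needed.
    """
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    # Residuals computed in centered form to avoid a large intercept term.
    sse = sum((yi - y_mean - slope * (xi - x_mean)) ** 2 for xi, yi in zip(x, y))
    se_slope = math.sqrt(sse / (n - 2) / sxx)  # standard error of the slope
    half_width = t_crit * se_slope
    return slope, slope - half_width, slope + half_width

def classify(ci_low, ci_high):
    """Map a slope CI onto the three general trend cases."""
    if ci_high < 0:
        return "yes, downward trend"
    if ci_low > 0:
        return "yes, upward trend"
    return "no significant trend"

years = [2000, 2001, 2002, 2003]
rates = [150, 200, 200, 250]
T_CRIT_95_DF2 = 4.303  # from a standard t table (assumption: 95% level)

slope, low, high = slope_ci(years, rates, T_CRIT_95_DF2)
print(f"slope = {slope:.1f}, 95% CI = {low:.2f} to {high:.2f}")
print(classify(low, high))
```

Note that only the rates enter the calculation. Nothing in `slope_ci` knows how many cases or people each rate is based on.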
Least squares works fine if each data point is known with certainty.
The problem is that least squares analysis takes no account of the variability at each data point.
All least squares knows is the rate at each point.
However, a rate based on many observations is certainly more stable than one based on just a few.
For example, suppose there are two counties, Adams with 2,000 people
and Brown with 200,000 people, and the population counts stay the same over time.
Further, suppose the rates and
case counts for disease X in the two counties are as follows:
| Year || Rate per 100,000 || Adams Cty cases || Brown Cty cases
| 2000 || 150 || 3 || 300
| 2001 || 200 || 4 || 400
| 2002 || 200 || 4 || 400
| 2003 || 250 || 5 || 500
Both counties would (appropriately) have exactly the same least squares line for the rates,
since the least squares analysis is based on the rate data.
Also, any trend (upward in this case) would be the same for both counties.
The problem is that the CI for the slope would also be
identical for both counties. However, this doesn't make sense, because
the rates in Adams County are obviously less stable,
so the Adams slope CI should accordingly be larger.
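To make the difference in stability concrete, here is a rough sketch using the county figures above. It first confirms that the two rate series are identical, so any method that sees only the rates must return the same slope and the same CI. It then estimates a standard error for each rate under the assumption that the case counts are Poisson distributed. That assumption is mine, added for illustration, and the names `rates_for` and `poisson_rate_se` are hypothetical.

```python
import math

PER = 100_000  # rates are expressed per 100,000 population

populations = {"Adams": 2_000, "Brown": 200_000}
cases = {
    "Adams": [3, 4, 4, 5],          # years 2000-2003
    "Brown": [300, 400, 400, 500],
}

def rates_for(county):
    """Rate per 100,000 for each year, from case counts and population."""
    pop = populations[county]
    return [c * PER / pop for c in cases[county]]

def poisson_rate_se(county):
    """Approximate standard error of each rate.

    ASSUMPTION (not from the page): counts are Poisson, so the standard
    deviation of a count is roughly sqrt(count).
    """
    pop = populations[county]
    return [math.sqrt(c) * PER / pop for c in cases[county]]

adams_rates = rates_for("Adams")
brown_rates = rates_for("Brown")
print(adams_rates == brown_rates)  # True: least squares sees identical input

for county in populations:
    print(county, [round(se, 1) for se in poisson_rate_se(county)])
```

Under this assumption, each Adams rate carries about ten times the standard error of the matching Brown rate, which is the information the slope CI currently ignores.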
I would think there is an existing method to determine the CI of the slope,
taking into account the number of observations.
However, I have read through standard statistical textbooks,
as well as books focusing on trend analysis,
and could not find this addressed,
at least in a way that could be translated into a practical algorithm.
Also, I have asked a few statisticians this basic question.
So far, nobody has known the answer.
Please note that "use procedure X in SPSS / SAS / STATA" is
probably not helpful, because a black box does not explain the method.
All too often, statisticians select from a "menu" of statistics
without understanding the underlying algorithm.
Instead, I am looking for the basic math algorithm,
with a practical example worked out,
such as the
practical example showing least squares.
The same four-year data set used in the least squares example could be used.
If you know how to construct a practical example showing how to take the number of cases into account when calculating the slope CI,
or have any other suggestions or insights, please contact me.
I will certainly acknowledge your assistance,
and will owe you a debt of gratitude.
If it turns out there is currently no known method to carry out this seemingly basic task,
or if the knowledge of that method is too well hidden,
I'm interested in devising and publishing a non-parametric method.
I already have outlined to myself how to do that.
If you are interested in possibly collaborating on this non-parametric method,