
Basic and Clinical Biostatistics, Chapter 8. Research Questions About Relationships among Variables

Key Concepts

- Correlation and regression are statistical methods for examining the linear relationship between two numerical variables measured on the same subjects. Correlation describes a relationship; regression both describes a relationship and predicts an outcome.
- Correlation coefficients range from –1 to +1; values at either extreme indicate a perfect relationship between the two variables. A correlation equal to 0 indicates no relationship.
- Scatterplots provide a visual display of the relationship between two numerical variables and are recommended for checking for a linear relationship and extreme values.
- The coefficient of determination, or r², is simply the squared correlation; it is the preferred statistic for describing the strength of the relationship between two numerical variables.
- The t test can be used to test the hypothesis that the population correlation is zero.
- The Fisher z transformation is used to form confidence intervals for the correlation or to test hypotheses about any specified value of the correlation.
- The Fisher z transformation can also be used to form confidence intervals for the difference between correlations in two independent groups.
- It is possible to test whether the correlation between one variable and a second is the same as the correlation between a third variable and that second variable.
- When one or both of the variables in a correlation is skewed, the Spearman rho nonparametric correlation is advised.
- Linear regression is called linear because it measures only straight-line relationships.
- The least squares method is the one used in almost all regression examples in medicine.
- With one independent and one dependent variable, the regression equation can be given as a straight line.
- The standard error of the estimate is a statistic that can be used to test hypotheses or form confidence intervals about both the intercept and the regression coefficient (slope).
- One important use of regression is predicting outcomes in a future group of subjects.
- When predicting outcomes, the confidence limits are called confidence bands about the regression line. The most accurate predictions are for outcomes close to the mean of the independent variable X; they become less precise as the outcome departs from the mean.
- It is possible to test whether the regression line is the same (ie, has the same slope and intercept) in two different groups.
- A residual is the difference between the actual and the predicted outcome; examining the distribution of residuals helps statisticians decide whether the linear regression model is the best approach to analyzing the data.
- Regression toward the mean can make a treatment or procedure appear to be of value when it has had no actual effect; having a control group helps guard against this problem.
- Correlation and regression should not be used unless observations are independent; it is not appropriate to include multiple measurements on the same subjects.
- Mixing two populations can also cause the correlation and regression coefficients to be larger than they should be.
- The choice between correlation and regression should be dictated by the purpose of the research: whether it is to establish a relationship or to predict an outcome.
- The regression model can be extended to accommodate two or more independent variables; this model is called multiple regression.
- Determining the needed sample size for correlation and regression is not difficult using one of the power analysis statistical programs.
Presenting Problems

Presenting Problem 1

In the United States, according to World Health Organization (WHO) standards, 42% of men and 28% of women are overweight, and an additional 21% of men and 28% of women are obese. Body mass index (BMI) has become the measure used to define standards of overweight and obesity. The WHO defines overweight as a BMI between 25 and 29.9 kg/m² and obesity as a BMI greater than or equal to 30 kg/m². Jackson and colleagues (2002) point out that the use of BMI as a single standard of obesity for all adults has been recommended because it is assumed to be independent of variables such as age, sex, ethnicity, and physical activity. Their goal was to examine this assumption by evaluating the effects of sex, age, and race on the relation between BMI and measured percent fat. They studied 655 black and white men and women who ranged in age from 17 to 65 years. Each participant was carefully measured for height and weight to calculate BMI and body density. Relative body fat (%fat) was estimated from body density using previously published equations. The independent variables examined were BMI, sex, age, and race. We examine these data to learn whether a relationship exists and, if so, whether it is linear. Data are on the CD [available only with the book] in a folder entitled "Jackson."

Presenting Problem 2

Hypertension, defined as systolic pressure greater than 140 mm Hg or diastolic pressure greater than 90 mm Hg, is present in 20–30% of the U.S. population. Recognition and treatment of hypertension have significantly reduced the morbidity and mortality associated with the complications of hypertension. A number of finger blood pressure devices are marketed for home use by patients as an easy and convenient way for them to monitor their own blood pressure.

How accurate are these finger blood pressure devices? Nesselroad and colleagues (1996) studied these devices to determine their accuracy. They measured blood pressure in 100 consecutive patients presenting to a family practice office who consented to participate. After each patient had been seated for 5 min, blood pressure was measured using a standard blood pressure cuff of appropriate size and with each of three automated finger blood pressure devices. The data were analyzed by calculating the correlation coefficient between the value obtained with the blood pressure cuff and each of the three finger devices and by calculating the percentage of measurements with each automated device that fell within the ± 4 mm Hg margin of error of the blood pressure cuff.

We use the data to illustrate correlation and scatterplots. We also illustrate a test of hypothesis about two dependent, or related, correlation coefficients. Data are given in the section titled "Spearman's Rho" and on the CD-ROM [available only with the book] in a folder called "Nesselroad."

Presenting Problem 3

Symptoms of forgetfulness and loss of concentration can result from natural aging and are often aggravated by fatigue, illness, depression, visual or hearing loss, or certain medications. Hodgson and Cutler (1997) wished to examine the consequences of anticipatory dementia, a phenomenon characterized as the fear that normal, age-associated memory change may be the harbinger of Alzheimer's disease.

They studied 25 men and women having a living parent with a probable diagnosis of Alzheimer's disease, a condition in which genetic factors are known to be important. A control group of 25 men and women who did not have a parent with dementia was selected for comparison. A directed interview and questionnaire were used to measure concern about developing Alzheimer's disease and to assess subjective memory functioning. Four measures of each individual's sense of well-being were used, in the areas of depression, psychiatric symptomatology, life satisfaction, and subjective health status.

We use this study to illustrate biserial correlation and show its concordance with the t test. Observations from the study are given in files on the CD-ROM [available only with the book] entitled "Hodgson."

Presenting Problem 4

The study of hyperthyroid women by Gonzalo and coinvestigators (1996) was a presenting problem in Chapter 7. Recall that the study reported the effect of excess body weight in hyperthyroid patients on glucose tolerance, insulin secretion, and insulin sensitivity. The study included 14 hyperthyroid women, 6 of whom were overweight, and 19 volunteers with normal thyroid levels and of similar ages and weights. The investigators in this study also examined the relationship between insulin sensitivity and body mass index for hyperthyroid and control women. (See Figure 3 in the Gonzalo article.) We revisit this study to calculate and compare two regression lines. Original observations are given in Chapter 7, Table 7–8.
An Overview of Correlation & Regression

In Chapter 3 we introduced methods to describe the association or relationship between two variables. In this chapter we review those concepts and extend the idea to predicting the value of one characteristic from the other. We also present the statistical procedures used to test whether a relationship between two characteristics is significant. Two probability distributions introduced previously, the t distribution and the chi-square distribution, can be used for statistical tests in correlation and regression. As a result, you will be pleased to learn that much of the material in this chapter will be familiar to you.

When the goal is merely to establish a relationship (or association) between two measures, as in these studies, the correlation coefficient (introduced in Chapter 3) is the statistic most often used. Recall that correlation is a measure of the linear relationship between two variables measured on a numerical scale.

In addition to establishing a relationship, investigators sometimes want to predict an outcome (dependent, or response) variable from an independent (explanatory) variable. Generally, the explanatory characteristic is the one that occurs first or is easier or less costly to measure. The statistical method of linear regression is used for this purpose; the technique involves determining an equation for predicting the value of the outcome from values of the explanatory variable. One of the major differences between correlation and regression is the purpose of the analysis: whether it is merely to describe a relationship or to predict a value. Several important similarities also exist, including the direct relationship between the correlation coefficient and the regression coefficient. Many of the same assumptions are required for correlation and regression, and both measure the extent of a linear relationship between the two characteristics.

Correlation

Figure 8–1 illustrates several hypothetical scatterplots of data to demonstrate the relationship between the size of the correlation coefficient r and the shape of the scatterplot. When the correlation is near zero, as in Figure 8–1E, the pattern of plotted points is somewhat circular. When the degree of relationship is small, the pattern is more like an oval, as in Figures 8–1D and 8–1B. As the value of the correlation gets closer to either +1 or –1, as in Figure 8–1C, the plot has a long, narrow shape; at +1 and –1, the observations fall directly on a line, as for r = +1.0 in Figure 8–1A.

The scatterplot in Figure 8–1F illustrates a situation in which a strong but nonlinear relationship exists. For example, with temperatures less than 10–15°C, a cold nerve fiber discharges few impulses; as the temperature increases, so do numbers of impulses per second until the temperature reaches about 25°C. As the temperature increases beyond 25°C, the numbers of impulses per second decrease once again, until they cease at 40–45°C. The correlation coefficient, however, measures only a linear relationship, and it has a value close to zero in this situation.

One of the reasons for producing scatterplots of data as part of the initial analysis is to identify nonlinear relationships when they occur. Otherwise, if researchers calculate the correlation coefficient without examining the data, they can miss a strong, but nonlinear, relationship, such as the one between temperature and number of cold nerve fiber impulses.

Calculating the Correlation Coefficient

We use the study by Jackson and colleagues (2002) to extend our understanding of correlation. We assume that anyone interested in actually calculating the correlation coefficient will use a computer program, as we do in this chapter. If you are interested in a detailed illustration of the calculations, refer to Chapter 3, in the section titled, "Describing the Relationship between Two Characteristics," and the study by Hébert and colleagues (1997).

Recall that the formula for the Pearson product moment correlation coefficient, symbolized by r, is

r = Σ(X − X̄)(Y − Ȳ) / √[Σ(X − X̄)² · Σ(Y − Ȳ)²]

where X stands for the independent variable, Y for the outcome variable, and X̄ and Ȳ for their means.

A highly recommended first step in looking at the relationship between two numerical characteristics is to examine the relationship graphically. Figure 8–2 is a scatterplot of the data, with body mass index (BMI) on the X-axis and percent body fat on the Y-axis. We see from Figure 8–2 that a positive relationship exists between these two characteristics: Small values for BMI are associated with small values for percent body fat. The question of interest is whether the observed relationship is statistically significant. (A large number of duplicate or overlapping data points occur in this plot because the sample size is so large.)

The extent of the relationship can be found by calculating the correlation coefficient. Using a statistical program, the correlation between BMI and percent body fat is 0.73, indicating a strong relationship between these two measures. Use the CD-ROM [available only with the book] to confirm our calculations. Also, see Chapter 3, in the section titled, "Describing the Relationship between Two Characteristics," for a review of the properties of the correlation coefficient.
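For readers who want to see the computation itself, here is a minimal Python sketch of the Pearson formula applied to a small, entirely hypothetical set of BMI and percent-fat values (the numbers and the function name pearson_r are illustrative, not the Jackson data or a library routine):

```python
import math

def pearson_r(x, y):
    # Pearson product moment correlation: the sum of cross-products of
    # deviations over the square root of the product of summed squares.
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical BMI and percent-fat values for five subjects
# (illustrative only; not the Jackson data)
bmi = [21.0, 24.5, 27.0, 30.5, 33.0]
pct_fat = [14.0, 20.0, 24.5, 31.0, 33.5]
r = pearson_r(bmi, pct_fat)
```

With these made-up values the correlation comes out close to +1, as the scatter of the five points would suggest.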

Interpreting the Size of r

The size of the correlation required for statistical significance is, of course, related to the sample size. With a very large sample of subjects, such as 2000, even small correlations, such as 0.06, are significant. A better way to interpret the size of the correlation is to consider what it tells us about the strength of the relationship.

The Coefficient of Determination

The correlation coefficient can be squared to form the statistic called the coefficient of determination. For the subjects in the study by Jackson, the coefficient of determination is (0.73)2, or 0.53. This means that 53% of the variation in the values for one of the measures, such as percent body fat, may be accounted for by knowing the BMI. This concept is demonstrated by the Venn diagrams in Figure 8–3. For the left diagram, r2 = 0.25; so 25% of the variation in A is accounted for by knowing B (or vice versa). The middle diagram illustrates r2 = 0.50, similar to the value we observed, and the diagram on the right represents r2 = 0.80.

The coefficient of determination tells us how strong the relationship really is. In the health literature, confidence limits or results of a statistical test for significance of the correlation coefficient are also commonly presented.

The t Test for Correlation

The symbol for the correlation coefficient in the population (the population parameter) is ρ (the lowercase Greek letter rho). In a random sample, ρ is estimated by r. If several random samples of the same size are selected from a given population and the correlation coefficient r is calculated for each, we expect r to vary from one sample to another but to follow some sort of distribution about the population value ρ. Unfortunately, the sampling distribution of the correlation does not behave as nicely as the sampling distribution of the mean, which is normally distributed for large samples.

Part of the problem is a ceiling effect when the correlation approaches either –1 or +1. If the value of the population parameter is, say, 0.8, the sample values can exceed 0.8 only up to 1.0, but they can be less than 0.8 all the way to –1.0. The maximum value of 1.0 acts like a ceiling, keeping the sample values from varying as much above 0.8 as below it, and the result is a skewed distribution. When the population parameter is hypothesized to be zero, however, the ceiling effects are equal, and the sample values are approximately distributed according to the t distribution, which can be used to test the hypothesis that the true value of the population parameter ρ is equal to zero. The following expression involving the correlation coefficient, often called the t ratio, has been found to have a t distribution with n – 2 degrees of freedom:

t = r √(n − 2) / √(1 − r²)
Let us use this t ratio to test whether the observed value of r = 0.73 is sufficient evidence with 655 observations to conclude that the true population value of the correlation is different from zero.

Step 1: H0: No relationship exists between BMI and percent body fat; or, the true correlation is zero: ρ = 0.

H1: A relationship does exist between BMI and percent body fat; or, the true correlation is not zero: ρ ≠ 0.

Step 2: Because the null hypothesis is a test of whether ρ is zero, the t ratio may be used when the assumptions for correlation (see the section titled, "Assumptions in Correlation") are met.

Step 3: Let us use α = 0.01 for this example.

Step 4: The degrees of freedom are n – 2 = 655 – 2 = 653. The value of a t distribution with 653 degrees of freedom that divides the area into the central 99% and the upper and lower 1% is approximately 2.617 (using the value for 120 df in Table A–3). We therefore reject the null hypothesis of zero correlation if (the absolute value of) the observed value of t is greater than 2.617.

Step 5: The calculation is

t = 0.73 √(655 − 2) / √(1 − 0.73²) = 0.73 × 25.55 / √0.467 = 18.65 / 0.683 = 27.29
Step 6: The observed value of the t ratio with 653 degrees of freedom is 27.29, far greater than 2.617. The null hypothesis of zero correlation is therefore rejected, and we conclude that BMI and percent body fat are associated.
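The t ratio is easy to compute directly. The following Python sketch (the function name t_ratio is ours) reproduces the statistic for r = 0.73 and n = 655:

```python
import math

def t_ratio(r, n):
    # t statistic for H0: rho = 0, with n - 2 degrees of freedom
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

t = t_ratio(0.73, 655)  # Jackson data: r = 0.73, n = 655 subjects
```

The result, about 27.29, far exceeds the critical value of 2.617, matching the conclusion in Step 6.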

Fisher's z Transformation to Test the Correlation

Investigators generally want to know whether ρ = 0, and this test can easily be done with computer programs. Occasionally, however, interest lies in whether the correlation is equal to a specific value other than zero. For example, consider a diagnostic test that gives accurate numerical values but is invasive and somewhat risky for the patient. If someone develops an alternative testing procedure, it is important to show that the new procedure is as accurate as the test in current use. The approach is to select a sample of patients and perform both the current test and the new procedure on each patient and then calculate the correlation coefficient between the two testing procedures.

Either a test of hypothesis can be performed to show that the correlation is greater than a given value, or a confidence interval about the observed correlation can be calculated. In either case, we use a procedure called Fisher's z transformation to test any null hypothesis about the correlation as well as to form confidence intervals.

To use Fisher's z transformation, we first transform the correlation and then use the standard normal (z) distribution. We need to transform the correlation because, as we mentioned earlier, the distribution of sample values of the correlation is skewed when ρ ≠ 0. Although this method is a bit complicated, it is actually more flexible than the t test, because it permits us to test any null hypothesis, not simply that the correlation is zero. Fisher's z transformation was proposed by the same statistician (Ronald Fisher) who developed Fisher's exact test for 2 x 2 contingency tables (discussed in Chapter 6).

Fisher's z transformation is

z(r) = (1/2) ln[(1 + r)/(1 − r)]

where ln represents the natural logarithm. Table A–6 gives the z transformation for different values of r, so we do not actually need to use the formula. With moderate-sized samples, this transformed value follows a normal distribution, and the following expression for the z test can be used:

z = [z(r) − z(ρ)] / (1/√(n − 3)) = [z(r) − z(ρ)] √(n − 3)
To illustrate Fisher's z transformation for testing the significance of ρ, we evaluate the relationship between BMI and percent body fat (Jackson et al, 2002). The observed correlation between these two measures was 0.73. Jackson and his colleagues may have expected a sizable correlation between these two measures; let us suppose they want to know whether the correlation is significantly greater than 0.65. A one-tailed test of the null hypothesis that ρ ≤ 0.65, which they hope to reject, may be carried out as follows.

Step 1: H0: The relationship between BMI and percent body fat is 0.65 or less; or, the true correlation ρ ≤ 0.65.

H1: The relationship between BMI and percent body fat is greater than 0.65; or, the true correlation ρ > 0.65.

Step 2: Fisher's z transformation may be used with the correlation coefficient to test any hypothesis.

Step 3: Let us again use α = 0.01 for this example.

Step 4: The alternative hypothesis specifies a one-tailed test. The value of the z distribution that divides the area into the lower 99% and the upper 1% is approximately 2.326 (from Table A–2). We therefore reject the null hypothesis that the correlation is 0.65 if the observed value of z is > 2.326.

Step 5: The first step is to find the transformed values for r = 0.73 and ρ = 0.65 from Table A–6; these values are 0.929 and 0.775, respectively. Then, the calculation for the z test is

z = (0.929 − 0.775) √(655 − 3) = 0.154 × 25.53 = 3.93
Step 6: The observed value of the z statistic, 3.93, exceeds 2.326. The null hypothesis that the correlation is 0.65 or less is rejected, and the investigators can conclude that the relationship between BMI and body fat is greater than 0.65.
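The same test can be sketched in Python using the exact transformation formula rather than the rounded table values; because of the rounding in Table A–6, the statistic comes out slightly below the 3.93 shown above (the function names fisher_z and z_statistic are ours):

```python
import math

def fisher_z(r):
    # Fisher's z transformation of a correlation coefficient
    return 0.5 * math.log((1 + r) / (1 - r))

def z_statistic(r, rho0, n):
    # z test of H0: rho = rho0, using the standard error 1/sqrt(n - 3)
    return (fisher_z(r) - fisher_z(rho0)) * math.sqrt(n - 3)

z = z_statistic(0.73, 0.65, 655)  # observed r, hypothesized rho, sample size
```

Either way, the statistic is well beyond the one-tailed critical value of 2.326.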

Confidence Interval for the Correlation

A major advantage of Fisher's z transformation is that confidence intervals can be formed. The transformed value of the correlation is used to calculate confidence limits in the usual manner, and then they are transformed back to values corresponding to the correlation coefficient.

To illustrate, we calculate a 95% confidence interval for the correlation coefficient 0.73 in Jackson and colleagues (2002). We use Fisher's z transformation of 0.73, which is 0.929, and the z distribution in Table A–2 to find the critical value for 95%, which is 1.96. The confidence interval is

0.929 ± 1.96 (1/√652) = 0.929 ± 1.96 × 0.0392 = 0.929 ± 0.077, or 0.852 to 1.006
Transforming the limits 0.852 and 1.006 back to correlations using Table A–6 in reverse gives approximately r = 0.69 and r = 0.77 (using conservative values). Therefore, we are 95% confident that the true value of the correlation in the population is contained within this interval. Note that 0.65 is not in this interval, which is consistent with our conclusion that the observed correlation of 0.73 is different from 0.65.

Surprisingly, computer programs do not always contain routines for finding confidence limits for a correlation. We have included a Microsoft Excel program in the Calculations folder on the CD-ROM [available only with the book] that calculates the 95% CI for a correlation.
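As a sketch of what such a routine does, the following Python function (our own correlation_ci, not a library call) forms the limits on the transformed scale and back-transforms them with the hyperbolic tangent, which is the inverse of Fisher's z transformation:

```python
import math

def fisher_z(r):
    # Fisher's z transformation of a correlation coefficient
    return 0.5 * math.log((1 + r) / (1 - r))

def correlation_ci(r, n, z_crit=1.96):
    # Limits are formed on the transformed scale, then back-transformed
    # with tanh, the inverse of Fisher's z transformation.
    center = fisher_z(r)
    halfwidth = z_crit / math.sqrt(n - 3)
    return math.tanh(center - halfwidth), math.tanh(center + halfwidth)

low, high = correlation_ci(0.73, 655)
```

Using the exact transformation gives limits of about 0.69 and 0.76, in close agreement with the conservative table-based limits of 0.69 and 0.77 quoted above.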

Assumptions in Correlation

The assumptions needed to draw valid conclusions about the correlation coefficient are that the sample was randomly selected and the two variables, X and Y, vary together in a joint distribution that is normally distributed, called the bivariate normal distribution. Just because each variable is normally distributed when examined separately, however, does not guarantee that, jointly, they have a bivariate normal distribution. Some guidance is available: If either of the two variables is not normally distributed, Pearson's product moment correlation coefficient is not the most appropriate method. Instead, either one or both of the variables may be transformed so that they more closely follow a normal distribution, as discussed in Chapter 5, or the Spearman rank correlation may be calculated. This topic is discussed in the section titled, "Other Measures of Correlation."

Comparing Two Correlation Coefficients

On occasion, investigators want to know if a difference exists between two correlation coefficients. Here are two specific instances: (1) comparing the correlations between the same two variables that have been measured in two independent groups of subjects and (2) comparing two correlations that involve a variable in common in the same group of individuals. These situations are not extremely common and not always contained in statistical programs. We designed Microsoft Excel programs; see the folder "Calculations" on the CD-ROM [available only with the book].

Comparing Correlations in Two Independent Groups

Fisher's z transformation can be used to test hypotheses or form confidence intervals about the difference between the correlations between the same two variables in two independent groups; such correlations are called independent correlations. For example, Gonzalo and colleagues (1996) in Presenting Problem 4 wanted to compare the correlation between BMI and insulin sensitivity in the 14 hyperthyroid women (r = –0.775) with the correlation between BMI and insulin sensitivity in the 19 control women (r = –0.456). See Figure 8–4.

In this situation, the transformed value for the second group replaces z(ρ) in the numerator of the z test shown in the previous section, and 1/(n – 3) is found for each group and the two values are added before taking the square root in the denominator. The test statistic is

z = [z(r1) − z(r2)] / √[1/(n1 − 3) + 1/(n2 − 3)]
To illustrate, the values of z from Fisher's z transformation table (A–6) for –0.775 and –0.456 are approximately 1.033 and 0.492 (with interpolation), respectively. Note that Fisher's z transformation is the same, regardless of whether the correlation is positive or negative. Using these values, we obtain

z = (1.033 − 0.492) / √(1/11 + 1/16) = 0.541 / √0.153 = 0.541 / 0.392 = 1.38
Assuming we choose the traditional significance level of 0.05, the value of the test statistic, 1.38, is less than the critical value, 1.96, so we do not reject the null hypothesis of equal correlations. We decide that the evidence is insufficient to conclude that the relationship between BMI and insulin sensitivity is different for hyperthyroid women from that for controls. What is a possible explanation for the lack of statistical significance? It is possible that there is no difference in the relationships between these two variables in the population. When sample sizes are small, however, as they are in this study, it is always advisable to keep in mind that the study may have low power.
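A Python sketch of this z test, using the Gonzalo correlations and sample sizes (the function names are ours):

```python
import math

def fisher_z(r):
    # Fisher's z transformation of a correlation coefficient
    return 0.5 * math.log((1 + r) / (1 - r))

def compare_independent_r(r1, n1, r2, n2):
    # z test of H0: rho1 = rho2 for correlations from two independent groups
    num = fisher_z(r1) - fisher_z(r2)
    den = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    return num / den

# Hyperthyroid women: r = -0.775, n = 14; controls: r = -0.456, n = 19
z = compare_independent_r(-0.775, 14, -0.456, 19)
```

The magnitude of the statistic is about 1.38, below the two-tailed critical value of 1.96, so the null hypothesis of equal correlations is not rejected.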

Comparing Correlations with Variables in Common in the Same Group

The second situation occurs when the research question involves correlations that contain the same variable (also called dependent correlations). For example, a very natural question for Nesselroad and colleagues (1996) was whether one of the finger devices was more highly correlated with the blood pressure cuff—considered to be the gold standard—than the other two. If so, this would be a product they might wish to recommend for patients to use at home. To illustrate, we compare the diastolic reading with device 1 and the cuff (rXY = 0.32) to the diastolic reading with device 2 and the cuff (rXZ = 0.45).

There are several formulas for testing the difference between two dependent correlations. We present the simplest one, developed by Hotelling (1940) and described by Glass and Stanley (1970) on pages 310–311 of their book. We will show the calculations for this example but, as always, suggest that you use a computer program. The test statistic follows the t distribution with n – 3 degrees of freedom; it looks rather forbidding and requires the calculation of several correlations:

t = (rXY − rXZ) √[ (n − 3)(1 + rYZ) / (2(1 − rXY² − rXZ² − rYZ² + 2 rXY rXZ rYZ)) ]
We designate the cuff reading as X, device 1 as Y, and device 2 as Z. We therefore want to compare rXY with rXZ. Both correlations involve the X, or cuff, reading, so these correlations are dependent. To use the formula, we also need to calculate the correlation between device 1 and device 2, which is rYZ = 0.54. Table 8–1 shows the correlations needed for this formula.

Table 8–1. Correlation Matrix of Diastolic Blood Pressures in All 100 Subjects.

Pearson Correlations Section (each cell shows the correlation with its p-value beneath; n = 100 for every pair)

                      Cuff      Device 1  Device 2  Device 3
Cuff Diastolic        1.0000    0.3209a   0.4450    0.3592
                      0.0000    0.0011    0.0000    0.0002
Device 1 Diastolic    0.3210    1.0000    0.5364    0.5392
                      0.0011    0.0000    0.0000    0.0000
Device 2 Diastolic    0.4450    0.5364    1.0000    0.5629
                      0.0000    0.0000    0.0000    0.0000
Device 3 Diastolic    0.3592    0.5392    0.5629    1.0000
                      0.0002    0.0000    0.0000    0.0000

aBolded values are needed for comparing two dependent correlations.

Source: Data, used with permission, from Nesselroad JM, Flacco VA, Phillips DM, Kruse J: Accuracy of automated finger blood pressure devices. Fam Med 1996;28:189–192. Output produced using NCSS; used with permission.

The calculations are

t = (0.32 − 0.45) √[ (97)(1 + 0.54) / (2(1 − 0.32² − 0.45² − 0.54² + 2 × 0.32 × 0.45 × 0.54)) ]
  = −0.13 √(149.4 / 1.118) = −0.13 × 11.56 = −1.50
You know by now that the difference between these two correlations is not statistically significant because the observed value of t is –1.50, and |–1.50| = 1.50 is less than the critical value of t with 97 degrees of freedom, 1.99. This conclusion is consistent with that of Nesselroad and his colleagues, who recommended that patients be cautioned that the finger blood pressure devices may not perform as marketed.
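The Hotelling test is simple to program. A Python sketch using the correlations from Table 8–1 (the function name hotelling_t is ours):

```python
import math

def hotelling_t(r_xy, r_xz, r_yz, n):
    # Hotelling (1940) t test for two dependent correlations that share
    # variable X; the statistic has n - 3 degrees of freedom.
    det = 1 - r_xy**2 - r_xz**2 - r_yz**2 + 2 * r_xy * r_xz * r_yz
    return (r_xy - r_xz) * math.sqrt((n - 3) * (1 + r_yz) / (2 * det))

# Cuff vs device 1 (r = 0.32), cuff vs device 2 (r = 0.45),
# device 1 vs device 2 (r = 0.54), n = 100 subjects
t = hotelling_t(0.32, 0.45, 0.54, 100)
```

The statistic is about −1.50, smaller in absolute value than the critical value of 1.99, so the two dependent correlations do not differ significantly.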

We designed a Microsoft Excel program for these calculations as well. It is included on the CD-ROM [available only with the book] in a folder called "Calculations" and is entitled "z for 2 dept r's."

Other Measures of Correlation

Several other measures of correlation are often found in the medical literature. Spearman's rho, the rank correlation introduced in Chapter 3, is used with ordinal data or in situations in which the numerical variables are not normally distributed. When a research question involves one numerical and one nominal variable, a correlation called the point–biserial correlation is used. With nominal data, the risk ratio or kappa (κ), discussed in Chapter 5, can be used.

Spearman's Rho

Recall that the value of the correlation coefficient is markedly influenced by extreme values and thus does not provide a good description of the relationship between two variables when their distributions are skewed or contain outlying values. For example, consider the relationships among the various finger devices and the standard cuff device for measuring blood pressure from Presenting Problem 2. To illustrate, we use the first 25 subjects from this study, listed in Table 8–2 (see the file entitled "Nesselroad25" on the CD-ROM [available only with the book]).

Table 8–2. Data on Diastolic Blood Pressure for the First 25 Subjects.

Subject  Cuff  Device 1  Device 2  Device 3
1        80    58        51        38
2        65    79        61        47
3        70    66        61        50
4        80    93        75        53
5        60    75        76        54
6        82    71        75        56
7        70    58        60        58
8        70    73        74        58
9        60    72        67        59
10       70    88        70        60
11       48    70        88        60
12       100   114       82        62
13       70    74        56        64
14       70    75        79        67
15       70    89        62        69
16       60    95        75        70
17       80    87        89        72
18       80    57        74        73
19       90    69        90        73
20       80    60        85        75
21       70    72        75        77
22       85    85        61        79
23       100   102       99        89
24       70    113       83        94
25       90    127       108       99

Source: Data, used with permission, from Nesselroad JM, Flacco VA, Phillips DM, Kruse J: Accuracy of automated finger blood pressure devices. Fam Med 1996;28:189–192. Output produced using NCSS; used with permission.

It is difficult to tell if the observations are normally distributed without looking at graphs of the data. Some statistical programs have routines to plot values against a normal distribution to help researchers decide whether a nonparametric procedure should be used. A normal probability plot for the cuff diastolic measurement is given in Figure 8–5. Use the CD-ROM [available only with the book] to produce similar plots for the finger device measurements.

When the observations are plotted on a graph, as in Figure 8–5, it appears that the data are not unduly skewed. This conclusion is consistent with the tests given for the normality of a distribution by NCSS. In the normal probability plot, if observations fall within the curved lines, the data can be assumed to be normally distributed.

As we indicated in Chapter 3, a simple method for dealing with the problem of extreme observations in correlation is to rank order the data and then recalculate the correlation on ranks to obtain the nonparametric correlation called Spearman's rho, or rank correlation. To illustrate this procedure, we continue to use data on the first 25 subjects in the study on blood pressure devices (Presenting Problem 2), even though the distribution of the values does not require this procedure. Let us focus on the correlation between the cuff and device 2, which we learned was 0.45 in the section titled, "Comparing Correlations with Variables in Common in the Same Group."

Table 8–3 illustrates the ranks of the diastolic readings on the first 25 subjects. Note that each variable is ranked separately; when ties occur, the average of the ranks of the tied values is used.

Table 8–3. Rank Order of the Diastolic Blood Pressure for the First 25 Subjects.
Row   Cuff Diastolic   Device 1 Diastolic   Device 2 Diastolic   Device 3 Diastolic
1     17.0             2.5                  1.0                  1.0
2     5.0              15.0                 5.0                  2.0
3     10.0             5.0                  5.0                  3.0
4     17.0             20.0                 13.5                 4.0
5     3.0              13.5                 16.0                 5.0
6     20.0             8.0                  13.5                 6.0
7     10.0             2.5                  3.0                  7.5
8     10.0             11.0                 10.5                 7.5
9     3.0              9.5                  8.0                  9.0
10    10.0             18.0                 9.0                  10.5
11    1.0              7.0                  21.0                 10.5
12    24.5             24.0                 18.0                 12.0
13    10.0             12.0                 2.0                  13.0
14    10.0             13.5                 17.0                 14.0
15    10.0             19.0                 7.0                  15.0
16    3.0              21.0                 13.5                 16.0
17    17.0             17.0                 22.0                 17.0
18    17.0             1.0                  10.5                 18.5
19    22.5             6.0                  23.0                 18.5
20    17.0             4.0                  20.0                 20.0
21    10.0             9.5                  13.5                 21.0
22    21.0             16.0                 5.0                  22.0
23    24.5             22.0                 24.0                 23.0
24    10.0             23.0                 19.0                 24.0
25    22.5             25.0                 25.0                 25.0

Source: Data, used with permission, from Nesselroad JM, Flacco VA, Phillips DM, Kruse J: Accuracy of automated finger blood pressure devices. Fam Med 1996;28:189–192. Output produced using NCSS; used with permission.

The ranks of the variables are used in the equation for the correlation coefficient, and the resulting calculation gives Spearman's rank correlation (rS), also called Spearman's rho:

rS = Σ(RX − R̄X)(RY − R̄Y) / √[Σ(RX − R̄X)² Σ(RY − R̄Y)²]

where RX is the rank of the X variable, RY is the rank of the Y variable, and R̄X and R̄Y are the mean ranks for the X and Y variables, respectively. The rank correlation rS may also be calculated by using other formulas, but this approximate procedure is quite good (Conover and Iman, 1981).

Calculating rS for the ranked observations in Table 8–3 gives rS = 0.33.

The value of rS is smaller than the value of Pearson's correlation; this may occur when the bivariate distribution of the two variables is not normal. The t test, as illustrated for the Pearson correlation, can be used to determine whether the Spearman rank correlation is significantly different from zero. For example, the following procedure tests whether the value of Spearman's rho in the population, symbolized ρS (the Greek letter rho with a subscript S denoting Spearman), differs from zero.

Step 1: H0: The population value of Spearman's rho is zero; that is, ρS = 0.

H1: The population value of Spearman's rho is not zero; that is, ρS ≠ 0.

Step 2: Because the null hypothesis is a test of whether ρS is zero, the t ratio may be used.

Step 3: Let us use α = 0.05 for this example.

Step 4: The degrees of freedom are n – 2 = 25 – 2 = 23. The value of the t distribution with 23 degrees of freedom that divides the area into the central 95% and the upper and lower 2½% is 2.069 (Table A–3), so we will reject the null hypothesis if (the absolute value of) the observed value of t is greater than 2.069.

Step 5: The calculation is

t = rS √(n − 2) / √(1 − rS²) = 0.33 √23 / √(1 − 0.33²) = 1.677

Step 6: The observed value of the t ratio with 23 degrees of freedom is 1.677, less than 2.069, so we do not reject the null hypothesis and conclude there is insufficient evidence that a nonparametric correlation exists between the diastolic pressure measurements made by the cuff and finger device 2.

Of course, if investigators want to test only whether Spearman's rho is greater than zero—that there is a significantly positive relationship—they can use a one-tailed test. For a one-tailed test with α = 0.05 and 23 degrees of freedom, the critical value is 1.714, and the conclusion is the same.

It is easy to demonstrate that performing the above-mentioned test on ranked data gives approximately the same results as the Spearman rho calculated the traditional way. We just used the Pearson formula on ranks and found that Spearman's rho for the sample of 25 subjects was 0.33 between the cuff measurement of diastolic pressure and finger device 2. Use the CD-ROM [available only with the book], and calculate Spearman's rho on the original data. You should also find 0.33 using the traditional methods of calculation.

To summarize, Spearman's rho is appropriate when investigators want to measure the relationship between: (1) two ordinal variables, or (2) two numerical variables when one or both are not normally distributed and investigators choose not to use a data transformation (such as taking the logarithm). Spearman's rank correlation is especially appropriate when outlying values occur among the observations.
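Because Spearman's rho is simply the Pearson formula applied to ranks, the entire procedure is easy to verify with a short program. The following sketch (in Python; illustrative, not part of the original text) uses the cuff and finger device 2 diastolic readings transcribed from Table 8–2; the average-rank handling of ties matches the ranks shown in Table 8–3.

```python
import math

# Diastolic readings for the 25 subjects in Table 8-2 (cuff and finger device 2)
cuff = [80, 65, 70, 80, 60, 82, 70, 70, 60, 70, 48, 100, 70,
        70, 70, 60, 80, 80, 90, 80, 70, 85, 100, 70, 90]
device2 = [51, 61, 61, 75, 76, 75, 60, 74, 67, 70, 88, 82, 56,
           79, 62, 75, 89, 74, 90, 85, 75, 61, 99, 83, 108]

def ranks(values):
    """Rank the values, assigning tied observations the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1          # mean of ranks i+1 .. j+1
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def pearson(x, y):
    """Pearson product moment correlation."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Spearman's rho is the Pearson correlation computed on the ranks
rho = pearson(ranks(cuff), ranks(device2))

# t test of H0: rho_S = 0 with n - 2 degrees of freedom
n = len(cuff)
t = rho * math.sqrt(n - 2) / math.sqrt(1 - rho ** 2)

print(round(rho, 2), round(t, 2))   # rho ≈ 0.33; t ≈ 1.68 < 2.069, so H0 is not rejected
```

The computed rho of 0.33 and the t ratio below the critical value 2.069 reproduce the conclusion reached in the steps above.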

Confidence Interval for the Odds Ratio & the Relative Risk

Chapter 3 introduced the relative risk (or risk ratio) and the odds ratio as measures of relationship between two nominal characteristics. Developed by epidemiologists, these statistics are used for studies examining risks that may result in disease. To discuss the odds ratio, recall the study discussed in Chapter 3 by Ballard and colleagues (1998) that examined the use of antenatal thyrotropin-releasing hormone (TRH). Data from this study were given in Chapter 3, Table 3–21. We calculated the odds ratio as 1.1, meaning that the odds of developing respiratory distress syndrome for an infant in the TRH group are 1.1 times the odds for an infant in the placebo group. This finding is the opposite of what the investigators expected, and it is important to learn whether the increased risk is statistically significant.

Significance can be determined in several ways. For instance, to test the significance of the relationship between treatment (TRH versus placebo) and the development of respiratory distress syndrome, investigators may use the chi-square test discussed in Chapter 6. The chi-square test for this example is left as an exercise (see Exercise 2). An alternative chi-square test, based on the natural logarithm of the odds ratio, is also available, and it results in values close to the chi-square test illustrated in Chapter 6 (Fleiss, 1999).

More often, articles in the medical literature use confidence intervals for risk ratios or odds ratios. Ballard and colleagues reported a 95% confidence interval for the odds ratio as (0.8 to 1.5). Let us see how they found this confidence interval.

Finding confidence intervals for odds ratios is a bit more complicated than usual because these ratios are not normally distributed, so the calculations require natural logarithms and antilogarithms. The formula for a 95% confidence interval for the odds ratio is

exp[ln(OR) ± 1.96 √(1/a + 1/b + 1/c + 1/d)]

where exp denotes the exponential function, or antilogarithm, of the natural logarithm, ln, and a, b, c, d are the cells in a 2 × 2 table (see Table 6–9 in Chapter 6). For the risk of respiratory distress syndrome in infants who were given TRH (Table 3–21), this calculation yields the interval of 0.8 to 1.5 reported by the investigators.

This interval contains the value of the true odds ratio with 95% confidence. If the odds are the same in each group, the value of the odds ratio is approximately 1, indicating similar risks in each group. Because the interval contains 1, we may be 95% confident that the odds ratio risk may in fact be 1; that is, insufficient evidence exists to conclude that the risk of respiratory distress increases in infants who received TRH. By the same logic, this treatment has no protective effect. Of course, 90% or 99% confidence intervals can be formed by using 1.645 or 2.575 instead of 1.96 in the preceding equation.
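The logarithmic calculation is easy to script. The sketch below (Python; illustrative only) implements the 95% confidence interval for the odds ratio from the four cells of a 2 × 2 table using the ln(OR) ± 1.96 × SE formula; the counts shown are hypothetical and are not the TRH study data.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """95% CI for the odds ratio from a 2 x 2 table (logit method).
    a, b = first row (e.g., treated with/without the outcome);
    c, d = second row (e.g., control with/without the outcome)."""
    or_ = (a * d) / (b * c)
    se_ln = math.sqrt(1/a + 1/b + 1/c + 1/d)   # SE of ln(OR)
    lo = math.exp(math.log(or_) - z * se_ln)
    hi = math.exp(math.log(or_) + z * se_ln)
    return or_, lo, hi

# Hypothetical 2 x 2 counts for illustration (not the TRH study data)
or_, lo, hi = odds_ratio_ci(10, 20, 15, 30)
print(round(or_, 2), round(lo, 2), round(hi, 2))   # 1.0 0.38 2.66
```

Because the hypothetical interval contains 1, the same interpretation used in the text would apply: insufficient evidence of a difference in risk between the groups.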

To illustrate the confidence interval for the relative risk, we refer to the physicians' health study (Steering Committee of the Physicians' Health Study Research Group, 1989) summarized in Chapter 3 and Table 3–19. Recall that the relative risk for an MI in physicians taking aspirin was 0.581. The 95% confidence interval for the true value of the relative risk also involves logarithms:

exp[ln(RR) ± 1.96 √(1/a − 1/(a + b) + 1/c − 1/(c + d))]

Again, the values for a, b, c, d are the cells in the 2 × 2 table illustrated in Table 6–9. Although it is possible to include a continuity correction for the relative risk or odds ratio, this is not commonly done. Substituting values from Table 3–19 gives the 95% confidence interval for the relative risk of 0.581.
The 95% confidence interval does not contain 1, so the evidence indicates that the use of aspirin resulted in a reduced risk for MI. For a detailed and insightful discussion of the odds ratio and its advantages and disadvantages, see Feinstein (1985, Chapter 20) and Fleiss (1999, Chapter 5); for a discussion of the odds ratio and the risk ratio, see Greenberg and coworkers (2002, Chapters 8 and 9).
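The relative risk interval can be scripted the same way. The sketch below (Python; illustrative only) applies the log-based formula for the 95% confidence interval of the relative risk; the counts are hypothetical, not the Physicians' Health Study data.

```python
import math

def relative_risk_ci(a, b, c, d, z=1.96):
    """95% CI for the relative risk from a 2 x 2 table (log method).
    a, b = treated with/without the event; c, d = control with/without the event."""
    rr = (a / (a + b)) / (c / (c + d))
    se_ln = math.sqrt(1/a - 1/(a + b) + 1/c - 1/(c + d))   # SE of ln(RR)
    lo = math.exp(math.log(rr) - z * se_ln)
    hi = math.exp(math.log(rr) + z * se_ln)
    return rr, lo, hi

# Hypothetical counts for illustration (not the Physicians' Health Study data)
rr, lo, hi = relative_risk_ci(10, 20, 15, 30)
print(round(rr, 2), round(lo, 2), round(hi, 2))   # 1.0 0.52 1.92
```

With these invented counts the interval contains 1, so no difference in risk would be claimed; in the aspirin study, the entire interval fell below 1.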

The folder containing Microsoft Excel equations on the CD-ROM [available only with the book] describes two routines for finding the 95% confidence limits; they are called "CI for OR" and "CI for RR." You may find these routines helpful if you wish to find 95% confidence limits for odds ratios or relative risks for published studies that contain the summary data for these statistics.

Measuring Relationships in Other Situations

We have discussed how to measure and test the significance of relationships by using Pearson's product moment correlation coefficient, Spearman's nonparametric procedure based on ranks, and risk or odds ratios. Not all situations are covered by these procedures, however, such as when one variable is measured on a nominal scale and the other is numerical but has been classified into categories, when one variable is nominal and the other is ordinal, or when both are ordinal but only a few categories occur. In these cases, a contingency table is formed and the chi-square test is used, as illustrated in Chapters 6 and 7.

On other occasions, the numerical variable is not collapsed into categories. For example, Hodgson and Cutler (1997) studied 25 subjects who had a living parent with Alzheimer's disease and a matched group who had no family history of dementia. Subjects answered questions about their concern about developing Alzheimer's disease and completed a questionnaire designed to evaluate their concerns about memory, the Memory Assessment Index (MAI). Data are given in Table 8–4.

Table 8–4. Data on 50 Subjects in the Study on Anticipatory Dementia.
Sampleᵃ   Sex   Concernedᵃ   MAIᵇ   Life Satisfactionᵃ   Health Statusᵃ
1         F     1            6      0                    1
2         F     1            8      1                    1
1         F     0            0      1                    1
1         F     0            2      0                    1
1         F     1            4      0                    1
1         F     1            10     0                    0
1         M     0            3      1                    1
1         F     1            12     0                    0
1         F     1            8      0                    1
1         F     1            9      1                    1
1         F     1            8      0                    1
1         F     0            2      1                    1
1         F     1            6      1                    1
1         M     0            2      1                    1
1         M     0            2      1                    0
1         M     1            5      1                    1
1         M     0            3      0                    0
2         F     0            3      0                    1
2         F     0            0      1                    1
1         M     1            5      0                    1
2         F     0            1      1                    0
2         F     0            2      0                    1
1         F     1            7      0                    1
1         M     1            5      0                    1
1         F     1            7      0                    0
1         F     1            9      0                    1
2         F     1            10     0                    0
2         F     0            3      1                    1
2         F     1            5      0                    1
2         F     0            3      1                    1
2         F     1            9      0                    0
1         F     1            4      1                    1
1         F     1            8      1                    1
2         F     1            4      1                    1
2         F     0            2      0                    1
2         F     1            9      0                    0
2         F     0            3      1                    1
2         M     1            8      1                    1
2         F     0            3      1                    1
2         F     0            4      0                    1
2         M     1            7      0                    0
2         M     0            2      0                    1
2         F     0            0      1                    1
2         M     0            2      1                    1
1         F     0            5      0                    0
2         F     0            2      1                    1
1         F     1            8      0                    1
2         M     1            6      1                    1
2         F     0            2      0                    0
2         M     1            4      0                    1

ᵃSample: 1 = Alzheimer, 2 = Control. Concerned: 0 = No, 1 = Yes. Life satisfaction: 0 = Not satisfied, 1 = Satisfied. Health status: 0 = Excellent, 1 = Not excellent.

ᵇMAI (Memory Assessment Index) on a scale of 0 = no memory problems to 12 = negative perceptions of memory and very concerned about developing dementia.

Source: Data, used with permission, from Hodgson LG, Cutler SJ: Anticipatory dementia and well-being. Am J Alzheimer's Dis 1997;12:62–66. Output produced using NCSS; used with permission.

The investigators were interested in the relationship between life satisfaction and performance on the MAI. Life satisfaction was measured as yes or no, and the MAI was measured on a scale from 0 = no memory problems to 12 = negative perceptions of memory and concern about developing dementia. When one variable is binary and the other is numerical, it is possible to evaluate the relationship using a special correlation, called the point–biserial correlation. If the binary variable is coded as 0 and 1, the Pearson correlation procedure can be used to find the point–biserial correlation. Box 8–1A gives the results of the correlation procedure using life satisfaction and MAI. The correlation is –0.37, and the P value is 0.008633.

Box 8–1. Correlation and t Test for Life Satisfaction and Anticipatory Dementia as Measured by MAI.
A. Correlation Matrix (each cell gives the correlation, the P value, and n)

                         Anticipatory Dementia   Life Satisfaction
Anticipatory Dementia     1.000000               –0.367601
                          0.000000                0.008633
                         50                      50
Life Satisfaction        –0.367601                1.000000
                          0.008633                0.000000
                         50                      50

B. t Test

            Count   Mean       Standard Deviation
LIFESAT=0   27      5.851852   2.931312
LIFESAT=1   23      3.652174   2.70704

Alternative Hypothesis   t Value   Probability Level   Decision (5%)   Power (α = 0.05)
Difference <> 0          2.7386    0.008633            Reject H0       0.765296
C. Box Plot

Source: Data, used with permission, from Hodgson LG, Cutler SJ: Anticipatory dementia and well-being. Am J Alzheimer's Dis 1997;12:62–66. Output produced using NCSS; used with permission.

Did you wonder why a t test was not used to see if a difference existed in mean MAI for those who were satisfied with their life versus those who were not satisfied? If so, you are right on target because a t test is another way to look at the research question. It simply depends on whether interest focuses on a relationship or a difference. What do you think the results of a t test would show? The output from the NCSS t test procedure is given in Box 8–1B. Of special interest is the P value (0.008633); it is the same as for the correlation. This illustrates an important principle: The point–biserial correlation between a binary variable and a numerical variable has the same level of significance as does a t test in which the groups are defined by the binary variable.
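This equivalence can be verified directly from the data in Table 8–4. The sketch below (Python; illustrative, not part of the text) computes the point–biserial correlation as a Pearson correlation with life satisfaction coded 0/1, and then runs a pooled-variance t test comparing mean MAI between the two groups; the two approaches produce the same t statistic in absolute value, and hence the same P value.

```python
import math

# (MAI score, life satisfaction) pairs transcribed from Table 8-4
data = [(6,0),(8,1),(0,1),(2,0),(4,0),(10,0),(3,1),(12,0),(8,0),(9,1),
        (8,0),(2,1),(6,1),(2,1),(2,1),(5,1),(3,0),(3,0),(0,1),(5,0),
        (1,1),(2,0),(7,0),(5,0),(7,0),(9,0),(10,0),(3,1),(5,0),(3,1),
        (9,0),(4,1),(8,1),(4,1),(2,0),(9,0),(3,1),(8,1),(3,1),(4,0),
        (7,0),(2,0),(0,1),(2,1),(5,0),(2,1),(8,0),(6,1),(2,0),(4,0)]
mai = [m for m, s in data]
lifesat = [s for m, s in data]
n = len(data)

# Point-biserial correlation = Pearson correlation with the binary variable coded 0/1
mx, my = sum(mai) / n, sum(lifesat) / n
sxy = sum((x - mx) * (y - my) for x, y in zip(mai, lifesat))
sxx = sum((x - mx) ** 2 for x in mai)
syy = sum((y - my) ** 2 for y in lifesat)
r = sxy / math.sqrt(sxx * syy)

# t statistic implied by the correlation: t = r * sqrt(n - 2) / sqrt(1 - r^2)
t_from_r = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# Pooled-variance t test comparing mean MAI between the two groups
g0 = [m for m, s in data if s == 0]
g1 = [m for m, s in data if s == 1]
m0, m1 = sum(g0) / len(g0), sum(g1) / len(g1)
sp2 = (sum((x - m0) ** 2 for x in g0) + sum((x - m1) ** 2 for x in g1)) / (n - 2)
t_test = (m0 - m1) / math.sqrt(sp2 * (1 / len(g0) + 1 / len(g1)))

print(round(r, 3), round(abs(t_from_r), 3), round(t_test, 3))   # r ≈ -0.368, |t| ≈ 2.74 both ways
```

The correlation reproduces the –0.3676 in Box 8–1A, and both t statistics match the 2.7386 in Box 8–1B.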

The point–biserial correlation is often used by test developers to help evaluate the questions on the test. For example, the National Board of Medical Examiners determines the point–biserial correlation between whether examinees get an item correct (a binary variable) and the examinee's score on the entire exam (a numerical variable). A positive point–biserial indicates that examinees who answer the question correctly tend to score high on the exam as a whole, whereas examinees missing the question tend to score low generally. Similarly, a negative point–biserial correlation indicates that examinees who answer the question correctly tend to score low on the exam—certainly not a desirable situation. It may be that the question is tricky or poorly worded because the better examinees are more likely to miss the question; you can see why this statistic is useful for test developers.

Linear Regression

Remember that when the goal is to predict the value of one characteristic from knowledge of another, the statistical method used is regression analysis. This method is also called linear regression, simple linear regression, or least squares regression. A brief review of the history of these terms is interesting and sheds some light on the nature of regression analysis.

The concepts of correlation and regression were developed by Sir Francis Galton, a cousin of Charles Darwin, who studied both mathematics and medicine in the mid-19th century (Walker, 1931). Galton was interested in heredity and wanted to understand why a population remains more or less the same over many generations, with the "average" offspring resembling their parents; that is, why successive generations do not become more diverse. By growing sweet peas and observing the average size of seeds from parent plants of different sizes, he discovered regression, which he termed the "tendency of the ideal mean filial type to depart from the parental type, reverting to what may be roughly and perhaps fairly described as the average ancestral type." This phenomenon is more typically known as regression toward the mean. The term "correlation" was used by Galton in his work on inheritance in terms of the "co-relation" between such characteristics as heights of fathers and sons. The mathematician Karl Pearson went on to work out the theory of correlation and regression, and the correlation coefficient is named after him for this reason.

The term linear regression refers to the fact that correlation and regression measure only a straight-line, or linear, relationship between two variables. The term "simple regression" means that only one explanatory (independent) variable is used to predict an outcome. In multiple regression, more than one independent variable is included in the prediction equation.

Least squares regression describes the mathematical method for obtaining the regression equation. The important thing to remember is that when the term "regression" is used alone, it generally means linear regression based on the least squares method. The concept behind least squares regression is described in the next section and its application is discussed in the section after that.

Least Squares Method

Several times previously in this text, we mentioned the linear nature of the pattern of points in a scatterplot. For example, in Figure 8–2, a straight line can be drawn through the points representing the values of BMI and percent body fat to indicate the direction of the relationship. The least squares method is a way to determine the equation of the line that provides a good fit to the points.

To illustrate the method, consider the straight line in Figure 8–6. Elementary geometry can be used to determine the equation for any straight line. If the point where the line crosses, or intercepts, the Y-axis is denoted by a and the slope of the line by b, then the equation is

Y = a + bX

The slope of the line measures the amount Y changes each time X changes by 1 unit: if the slope is positive, Y increases as X increases; if the slope is negative, Y decreases as X increases. In the regression model, the slope in the population is generally symbolized by β₁, called the regression coefficient, and β₀ denotes the intercept of the regression line; that is, β₁ and β₀ are the population parameters in regression. In most applications, the points do not fall exactly along a straight line. For this reason, the regression model contains an error term, e, which is the distance the actual values of Y depart from the regression line. Putting all this together, the regression equation is given by

Y = β₀ + β₁X + e
When the regression equation is used to describe the relationship in the sample, it is often written as

Y′ = a + bX

where Y′ (read "Y prime") denotes the predicted value of Y.
For a given value of X, say X*, the predicted value Y*′ is found by extending a vertical line from X* to the regression line and then a horizontal line from the regression line to the Y-axis, as in Figure 8–7. The difference between the actual value Y* and the predicted value, e* = Y* − Y*′, can be used to judge how well the line fits the data points. The least squares method determines the line that minimizes the sum of the squared vertical differences between the actual and predicted values of the Y variable; that is, β₀ and β₁ are determined so that Σ(Y − Y′)² is minimized. The formulas for β₀ and β₁ are found using calculus,ᵃ and in terms of the sample estimates b and a, these formulas are

b = Σ(X − X̄)(Y − Ȳ) / Σ(X − X̄)²

a = Ȳ − bX̄
ᵃThe procedure for finding β₀ and β₁ involves the use of differential calculus. The partial derivatives of the sum of squared errors are found with respect to β₀ and β₁; the two resulting equations are set equal to zero to locate the minimum values; and these two equations in two unknowns, β₀ and β₁, are solved simultaneously to obtain the formulas for β₀ and β₁.
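The least squares estimates are easy to compute directly from the sums of squares and cross-products: b = Σ(X − X̄)(Y − Ȳ)/Σ(X − X̄)² and a = Ȳ − bX̄. The sketch below (Python, using an invented toy data set rather than the chapter's data) illustrates the computation.

```python
# Least squares slope and intercept from an invented toy data set
# (hypothetical values for illustration, not the chapter's BMI data)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.9]

n = len(x)
xbar = sum(x) / n
ybar = sum(y) / n

# b = sum((X - Xbar)(Y - Ybar)) / sum((X - Xbar)^2)
b = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / \
    sum((xi - xbar) ** 2 for xi in x)
# a = Ybar - b * Xbar, so the fitted line passes through (Xbar, Ybar)
a = ybar - b * xbar

print(round(a, 3), round(b, 3))   # 0.1 1.98
```

Note that a is defined so that the fitted line always passes through the point (X̄, Ȳ), a property used later in the discussion of confidence bands.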

Calculating the Regression Equation

In the study described in Presenting Problem 4, the investigators wanted to predict insulin sensitivity from BMI in a group of women. Original observations were given in Chapter 7, Table 7–8. For now we ignore the different groups of women and examine the entire sample regardless of thyroid and weight levels.

Before calculating the regression equation for these data, let us create a scatterplot and practice "guesstimating" the value of the correlation coefficient from the plot (although it is difficult to estimate the size of r accurately when the sample size is small). Figure 8–8 is a scatterplot with BMI score as the explanatory X variable and insulin sensitivity as the response Y variable. How large do you think the correlation is?

If we knew the correlation between BMI and insulin sensitivity, we could use it to calculate the regression equation. Because we do not, we assume the needed terms (the means of X and Y and the sums of squares and cross-products) have been calculated. Substituting them in the formulas for the slope and intercept gives

b = −0.0433 and a = 1.5817
In this example, the insulin sensitivity scores are said to be regressed on BMI scores, and the regression equation is written as Y' = 1.5817 – 0.0433X, where Y' is the predicted insulin sensitivity score, and X is the BMI.

Figure 8–9 illustrates the regression line drawn through the observations. The regression equation has a positive intercept of +1.58, so that theoretically a patient with zero BMI would have an insulin sensitivity of 1.58, even though, in the present example, a zero BMI is not possible. The slope of –0.043 indicates that each time a woman's BMI increases by 1, her predicted insulin sensitivity decreases by approximately 0.043. For example, as the BMI increases from 20 to 30, insulin sensitivity decreases from about 0.73 to about 0.3. Whether the relationship between BMI and insulin sensitivity is significant is discussed in the next section.
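A one-line function applies the fitted equation Y′ = 1.5817 − 0.0433X at any BMI; the sketch below (Python; illustrative) evaluates it at the two BMI values discussed above. Small differences from the values quoted in the text reflect rounding.

```python
# Predicted insulin sensitivity from the fitted equation in the text:
# Y' = 1.5817 - 0.0433 * BMI
def predict(bmi):
    return 1.5817 - 0.0433 * bmi

print(round(predict(20), 2), round(predict(30), 2))   # 0.72 0.28
```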

Assumptions & Inferences in Regression

In the previous section, we worked with a sample of observations instead of the population of observations. Just as the sample mean X̄ is an estimate of the population mean μ, the regression line determined from the formulas for a and b in the previous section is an estimate of the regression equation for the underlying population.

As in Chapters 6 and 7, in which we used statistical tests to determine how likely it was that the observed differences between two means occurred by chance, in regression analysis we must perform statistical tests to determine the likelihood of any observed relationship between X and Y variables. Again, the question can be approached in two ways: using hypothesis tests or forming confidence intervals. Before discussing these approaches, however, we briefly discuss the assumptions required in regression analysis.

If we are to use a regression equation, the observations must have certain properties. Thus, for each value of the X variable, the Y variable is assumed to have a normal distribution, and the mean of the distribution is assumed to be the predicted value, Y'. In addition, no matter the value of the X variable, the standard deviation of Y is assumed to be the same. These assumptions are rather like imagining a large number of individual normal distributions of the Y variable, all of the same size, one for each value of X. The assumption of this equal variation in the Y's across the entire range of the X's is called homogeneity, or homoscedasticity. It is analogous to the assumption of equal variances (homogeneous variances) in the t test for independent groups, as discussed in Chapter 6.

The straight-line, or linear, assumption requires that the mean values of Y corresponding to various values of X fall on a straight line. The values of Y are assumed to be independent of one another. This assumption is not met when repeated measurements are made on the same subjects; that is, a subject's measure at one time is not independent from the measure of that same subject at another time. Finally, as with other statistical procedures, we assume the observations constitute a random sample from the population of interest.

Regression is a robust procedure and may be used in many situations in which the assumptions are not met, as long as the measurements are fairly reliable and the correct regression model is used. (Other regression models are discussed in Chapter 10.) Meeting the regression assumptions generally causes fewer problems in experiments or clinical trials than in observational studies because reliability of the measurements tends to be greater in experimental studies. Special procedures can be used when the assumptions are seriously violated, however; and as in ANOVA, researchers should seek a statistician's advice before using regression if questions arise about its applicability.

The Standard Error of the Estimate

Regression lines, like other statistics, can vary. After all, the regression equation computed for any one sample of observations is only an estimate of the true population regression equation. If other samples are chosen from the population and a regression equation is calculated for each sample, these equations will vary from one sample to another with respect to both their slopes and their intercepts. An estimate of this variation is symbolized SY.X (read "s of Y given X") and is called the standard error of regression, or the standard error of the estimate. It is based on the squared deviations of the predicted Y's from the actual Y's and is found as follows:

SY.X = √[Σ(Y − Y′)² / (n − 2)]
The computation of this formula is quite tedious; and although more user-friendly computational forms exist, we assume that you will use a computer program to calculate the standard error of the estimate. In testing both the slope and the intercept, a t test can be used, and the standard error of the estimate is part of the formula. It is also used in determining confidence limits. To present these formulas and the logic involved in testing the slope and the intercept, we illustrate the test of hypothesis for the intercept and the calculation of a confidence interval for the slope, using the BMI–insulin sensitivity regression equation.

To test the hypothesis that the intercept departs significantly from zero, we use the following procedure:

Step 1: H0: β₀ = 0 (The intercept is zero)

H1: β₀ ≠ 0 (The intercept is not zero)

Step 2: Because the null hypothesis is a test of whether the intercept is zero, the t ratio may be used if the assumptions are met. The t ratio uses the standard error of the estimate to calculate the standard error of the intercept (the denominator of the t ratio):

t = a / [SY.X √(1/n + X̄²/Σ(X − X̄)²)]
Step 3: Let us use α equal to 0.05.

Step 4: The degrees of freedom are n – 2 = 33 – 2 = 31. The value of the t distribution with 31 degrees of freedom that divides the area into the central 95% and the combined upper and lower 5% is approximately 2.040 (from Table A–3). We therefore reject the null hypothesis of a zero intercept if (the absolute value of) the observed value of t is greater than 2.040.

Step 5: The calculation follows; we used a spreadsheet (Microsoft Excel) to calculate SY.X = 0.256 and Σ(X − X̄)² = 468.015:

t = 1.5817 / [0.256 √(1/33 + (24.921)²/468.015)] = 1.5817 / 0.298 = 5.30
Step 6: The absolute value of the observed t ratio is 5.30, which is greater than 2.040. The null hypothesis of a zero intercept is therefore rejected. We conclude that the evidence is sufficient to show that the intercept is significantly different from zero for the regression of insulin sensitivity on BMI.
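Steps 5 and 6 can be reproduced from the summary values quoted in the text (SY.X = 0.256, n = 33, Σ(X − X̄)² = 468.015, and the mean BMI of 24.921 given in Table 8–6). The sketch below (Python; illustrative) recomputes the standard error of the intercept and the t ratio; the small difference from the SPSS standard error of 0.299 in Table 8–5 reflects rounding of SY.X.

```python
import math

# Summary values quoted in the text for the insulin sensitivity regression
a = 1.5817       # intercept
syx = 0.256      # standard error of the estimate, S_{Y.X}
n = 33
xbar = 24.921    # mean BMI
ssx = 468.015    # sum((X - Xbar)^2)

se_a = syx * math.sqrt(1 / n + xbar ** 2 / ssx)   # standard error of the intercept
t = a / se_a

print(round(se_a, 3), round(t, 2))   # ≈ 0.298 and 5.3
```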

As you know by now, it is also possible to form confidence limits for the intercept using the observed value and adding or subtracting the critical value from the t distribution multiplied by the standard error of the intercept.

Instead of illustrating the hypothesis test for the population regression coefficient, let us find a 95% confidence interval for β₁. The interval is given by

b ± t(n − 2) × SY.X/√Σ(X − X̄)² = −0.0433 ± (2.040)(0.256/√468.015) = −0.0433 ± 0.0241
Because the interval excludes zero, we can be 95% confident that the regression coefficient is not zero but that it is between –0.0674 and –0.0192 or between about –0.07 and –0.02. Because the regression coefficient is significantly less than zero, can the correlation coefficient be equal to zero? (see Exercise 3.) The relationship between b and r illustrated earlier and Exercise 3 should convince you of the equivalence of the results obtained with testing the significance of correlation and the regression coefficient. In fact, authors in the medical literature often perform a regression analysis and then report the P values to indicate a significant correlation coefficient.
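The same summary values reproduce this interval. In the sketch below (Python; illustrative), the standard error of the slope is SY.X/√Σ(X − X̄)², and the 95% limits use the critical value 2.040 with 31 degrees of freedom.

```python
import math

# Summary values quoted in the text for the insulin sensitivity regression
b = -0.0433      # slope (regression coefficient)
syx = 0.256      # standard error of the estimate, S_{Y.X}
ssx = 468.015    # sum((X - Xbar)^2)
t_crit = 2.040   # t(31 df) for a two-sided 95% interval

se_b = syx / math.sqrt(ssx)          # standard error of the slope
lo = b - t_crit * se_b
hi = b + t_crit * se_b

print(round(lo, 4), round(hi, 4))    # ≈ -0.0674 to -0.0192
```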

The output from the SPSS regression program is given in Table 8–5. The program produces the value of t and the associated P value, as well as 95% confidence limits. Do the results agree with those we found earlier? To become familiar with using regression, we suggest you replicate these results using the CD-ROM [available only with the book].

Table 8–5. Computer Output of Regression of Insulin Sensitivity on Body Mass Index.
Coefficientsᵃ

                      Unstandardized Coefficients   Standardized Coefficients                           95% Confidence Interval for B
Model                 B         Std. Error          Beta                        t        Significance   Lower Bound   Upper Bound
1   (Constant)        1.582     0.299                                           5.294    0.000          0.972         2.191
    Body mass index   –0.043    0.012               –0.548                      –3.652   0.001          –0.067        –0.019

Source: Data, used with permission, from Gonzalo MA, Grant C, Moreno I, Garcia FJ, Suarez AI, Herrera-Pombo JL, et al: Glucose tolerance, insulin secretion, insulin sensitivity and glucose effectiveness in normal and overweight hyperthyroid women. Clin Endocrinol 1996;45:689–697. Output produced using SPSS, a registered trademark of SPSS, Inc; used with permission.

Predicting with the Regression Equation: Individual and Mean Values

One of the important reasons for obtaining a regression equation is to predict future values for a group of subjects (or for individual subjects). For example, a clinician may want to predict insulin sensitivity from BMI for a group of women with newly diagnosed diabetes. Or the clinician may wish to predict the sensitivity for a particular woman. In either case, the variability associated with the regression line must be reflected in the prediction. The 95% confidence interval for a predicted mean Y in a group of subjects is

Y′ ± t(n − 2) SY.X √[1/n + (X − X̄)²/Σ(X − X̄)²]

The 95% confidence interval for predicting a single observation is

Y′ ± t(n − 2) SY.X √[1 + 1/n + (X − X̄)²/Σ(X − X̄)²]
Comparing these two formulas, we see that the confidence interval predicting a single observation is wider than the interval for the mean of a group of individuals; 1 is added to the standard error term for the individual case. This result makes sense, because for a given value of X, the variation in the scores of individuals is greater than that in the mean scores of groups of individuals. Note also that the numerator of the third term in the standard error is the squared deviation of X from X̄. The size of the standard error therefore depends on how close the observation is to the mean; the closer X is to its mean, the more accurate the prediction of Y. For values of X quite far from the mean, the variability in predicting the Y score is considerable. You can appreciate why it is difficult for economists and others who wish to predict future events to be very accurate!

Table 8–6 gives 95% confidence intervals associated with predicted mean insulin sensitivity levels and predicted insulin sensitivity levels for an individual corresponding to several different BMI values (and for the mean BMI in this sample of 33 women). Several insights about regression analysis can be gained by examining this table. First, note the differences in magnitude between the standard errors associated with the predicted mean insulin sensitivity and those associated with individual insulin sensitivity levels: The standard errors are much larger when we predict individual values than when we predict the mean value. In fact, the standard error for individuals is always larger than the standard error for means because of the additional 1 in the formula. Also note that the standard errors take on their smallest values when the observation of interest is the mean (BMI of 24.921 in our example). As the observation departs in either direction from the mean, the standard errors and confidence intervals become increasingly larger, reflecting the squared difference between the observation and the mean. If the confidence intervals are plotted as confidence bands about the regression line, they are closest to the line at the mean of X and curve away from it in both directions on each side of X̄. Figure 8–10 shows the graph of the confidence bands.

Table 8–6 illustrates another interesting feature of the regression equation. When the mean of X is used in the regression equation, the predicted Y' is the mean of Y. The regression line therefore goes through the mean of X and the mean of Y.

Table 8–6. 95% Confidence Intervals for Predicted Mean Insulin Sensitivity Levels and Predicted Individual Insulin Sensitivity Levels.
                                           Predicting Means                Predicting Individuals
BMI      Insulin Sensitivity   Predicted   SEᵃ     Confidence Interval     SEᵇ     Confidence Interval
18.100   0.970                 0.798       0.092   0.610 to 0.986          0.273   0.242 to 1.354
23.600   0.880                 0.560       0.047   0.463 to 0.656          0.261   0.028 to 1.092
24.000   0.660                 0.543       0.046   0.449 to 0.636          0.261   0.011 to 1.074
20.400   0.520                 0.698       0.070   0.556 to 0.841          0.266   0.156 to 1.241
21.500   0.380                 0.651       0.060   0.528 to 0.774          0.263   0.113 to 1.188
24.921   0.503                 0.503       0.044   0.413 to 0.593          0.260   –0.027 to 1.033

aStandard error for means.

bStandard error for individuals.

BMI = body mass index.

Now we can see why confidence bands about the regression line are curved. The error in the intercept means that the true regression line can be either above or below the line calculated for the sample observations, although it maintains the same orientation (slope). The error in measuring the slope means that the true regression line can rotate about the point (mean of X, mean of Y) to a certain degree. The combination of these two errors results in the curved confidence bands illustrated in Figure 8–10. Sometimes journal articles have regression lines with confidence bands that are parallel rather than curved. These confidence bands are incorrect, although they may correspond to standard errors or to confidence intervals at their narrowest distance from the regression line.

Comparing Two Regression Lines

Sometimes investigators wish to compare two regression lines to see whether they are the same. For example, the investigators in Presenting Problem 4 were particularly interested in the relationship between BMI and insulin sensitivity in women who were hyperthyroid versus those whose thyroid levels were normal. The investigators determined separate regression lines for these two groups of women and reported them in Figure 3 of their article. We reproduced their regression lines in Figure 8–11.

As you might guess, researchers are often interested in comparing regression lines to learn whether the relationships are the same in different groups of subjects. When we compare two regression lines, four situations can occur, as illustrated in Figure 8–12. In Figure 8–12A, the slopes of the regression lines are the same, but the intercepts differ. This situation occurs, for instance, in blood pressure measurements regressed on age in men and women; that is, the relationship between blood pressure and age is similar for men and women (equal slopes), but men tend to have higher blood pressure levels at all ages than women (higher intercept for men).

In Figure 8–12B, the intercepts are equal, but the slopes differ. This pattern may describe, say, the regression of platelet count on number of days following bone marrow transplantation in two groups of patients: those for whom adjuvant therapy results in remission of the underlying disease and those for whom the disease remains active. In other words, prior to and immediately after transplantation, the platelet count is similar for both groups (equal intercepts), but at some time after transplantation, the platelet count remains steady for patients in remission and begins to decrease for patients not in remission (more negative slope for patients with active disease).

In Figure 8–12C, both the intercepts and the slopes of the regression lines differ. The investigators in Presenting Problem 4 reported a steeper decline in the slope of insulin sensitivity as the BMI increased in the hyperthyroid women than in the control group.b Although they did not specifically address any difference in intercepts, the relationship between BMI and insulin sensitivity resembles the situation in Figure 8–12C.

bGonzalo and colleagues presented regression equations after adjusting for age. We briefly discuss this procedure in the next section under Multiple Regression.

If no differences exist in the relationships between the predictor and outcome variables, the regression lines are similar to Figure 8–12D, in which the lines are coincident: Both intercepts and slopes are equal. This situation occurs in many situations in medicine and is considered to be the expected pattern (the null hypothesis) until it is shown not to apply by testing hypotheses or forming confidence limits for the intercept or the slope (or both).

From the four situations illustrated in Figure 8–12, we can see that three statistical questions need to be asked:

1. Are the slopes equal?
2. Are the intercepts equal?
3. Are both the slopes and the intercepts equal?

Statistical tests based on the t distribution can be used to answer the first two questions; these tests are illustrated in Kleinbaum and associates (1997). The authors point out, however, that the preferred approach is to use regression models for more than one independent variable—a procedure called multiple regression—to answer these questions. The procedure consists of pooling observations from both samples of subjects (eg, observations on both hyperthyroid and control women) and computing one regression line for the combined data. Other regression coefficients indicate whether it matters to which group the observations belong. The simplest model is then selected. Because the regression lines were statistically different, Gonzalo and colleagues reported two separate regression equations.
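The pooled multiple-regression approach can be sketched as follows: a group indicator and a group-by-X interaction term are added to the model, so that their coefficients estimate the difference in intercepts and the difference in slopes, respectively. The data here are simulated for illustration, not the Gonzalo data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data: two groups whose true slopes differ (-0.03 vs -0.08).
n = 40
x = rng.uniform(18, 32, size=2 * n)
g = np.repeat([0.0, 1.0], n)                          # group indicator
y = 2.0 - 0.03 * x - 0.05 * x * g + rng.normal(0, 0.05, 2 * n)

# Pooled model: Y = b0 + b1*X + b2*G + b3*(X*G).
# b2 estimates the intercept difference, b3 the slope difference;
# if both are negligible, a single regression line suffices.
X = np.column_stack([np.ones_like(x), x, g, x * g])
b0, b1, b2, b3 = np.linalg.lstsq(X, y, rcond=None)[0]
```
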

Use of Correlation & Regression

Some of the characteristics of correlation and regression have been noted throughout the discussions in this chapter, and we recapitulate them here as well as mention other features. An important point to reemphasize is that correlation and regression describe only linear relationships. If correlation coefficients or regression equations are calculated blindly, without examining plots of the data, investigators can miss very strong, but nonlinear relationships.
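A minimal illustration of why plots matter: below, Y is perfectly determined by X, yet the Pearson correlation is 0 because the relationship is not linear (the data are contrived for the purpose, with X symmetric about zero).

```python
import numpy as np

# Y is a perfect function of X (Y = X^2), but the relationship is not
# linear, so the Pearson correlation coefficient is 0.
x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
y = x ** 2

r = np.corrcoef(x, y)[0, 1]
```

A scatterplot would reveal the parabola immediately; the correlation coefficient alone would suggest no relationship at all.
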

Analysis of Residuals

A procedure useful in evaluating the fit of the regression equation is the analysis of residuals (Pedhazur, 1997). We calculated residuals when we found the difference between the actual value Y and the predicted value Y′, or Y – Y′, although we did not use the term. A residual is the part of Y that is not predicted by X (the part left over, or the residual). The residual values on the Y-axis are plotted against the X values on the X-axis. The mean of the residuals is zero, and, because the slope has been subtracted in the process of calculating the residuals, the correlation between them and the X values should be zero.

Stated another way, if the regression model provides a good fit to the data, as in Figure 8–13A, the values of the residuals are not related to the values of X. A plot of the residuals and the X values in this situation should resemble a scatter of points corresponding to Figure 8–13B in which no correlation exists between the residuals and the values of X. If, in contrast, a curvilinear relationship occurs between Y and X, such as in Figure 8–13C, the residuals are negative for both small values and large values of X, because the corresponding values of Y fall below a regression line drawn through the data. They are positive, however, for midsized values of X because the corresponding values of Y fall above the regression line. In this case, instead of obtaining a random scatter, we get a plot like the curve in Figure 8–13D, with the values of the residuals being related to the values of X. Other patterns can be used by statisticians to help diagnose problems, such as a lack of equal variances or various types of nonlinearity.
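A sketch of such a residual analysis—fitting a line, computing the residuals, and checking the properties just described—using hypothetical curvilinear data:

```python
import numpy as np

def regression_residuals(x, y):
    """Fit Y on X by least squares and return the residuals.

    For a least-squares fit the residuals have mean 0 and are
    uncorrelated with X; a visible pattern in a residual-versus-X
    plot signals a poor (e.g., nonlinear) fit.
    """
    b, a = np.polyfit(x, y, 1)
    return y - (a + b * x)

# A curvilinear relationship: the residuals still average 0 and are
# uncorrelated with X, but they are negative at the extremes of X and
# positive in the middle (the pattern of Figure 8-13D).
x = np.linspace(-3, 3, 31)
y = -(x ** 2)
res = regression_residuals(x, y)
```
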

Use the CD-ROM [available only with the book] and the regression program to produce a graph of residuals for the data in Presenting Problem 4. Which of the four situations in Figure 8–13 is most likely? See Exercise 8.

Dealing with Nonlinear Observations

Several alternative actions can be taken if serious problems arise with nonlinearity of data. As we discussed previously, a transformation may make the relationship linear, and regular regression methods can then be used on the transformed data. Another possibility, especially for a curve, is to fit a straight line to one part of the curve and a second straight line to another part of the curve, a procedure called piecewise linear regression. In this situation, one regression equation is used with all values of X less than a given value, and the second equation is used with all values of X greater than the given value. A third strategy, also useful for curves, is to perform polynomial regression; this technique is discussed in Chapter 10. Finally, more complex approaches called nonlinear regression may be used (Snedecor and Cochran, 1989).
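A minimal sketch of piecewise linear regression, assuming the break point (the value of X separating the two pieces) is known in advance; the data are contrived so that one line fits each segment exactly.

```python
import numpy as np

def piecewise_fit(x, y, knot):
    """Fit one straight line to observations with x <= knot and a second
    line to observations with x > knot (simple piecewise regression).
    Returns the two (slope, intercept) pairs.
    """
    left = x <= knot
    b1, a1 = np.polyfit(x[left], y[left], 1)
    b2, a2 = np.polyfit(x[~left], y[~left], 1)
    return (b1, a1), (b2, a2)

# Hypothetical response that rises and then flattens: a single straight
# line fits poorly, but the two pieces capture the pattern well.
x = np.arange(0.0, 10.0)
y = np.where(x <= 5, 2.0 * x, 10.0)
left_fit, right_fit = piecewise_fit(x, y, 5.0)
```

In practice the knot is chosen from subject-matter knowledge or by inspecting the scatterplot, and more refined methods constrain the two lines to meet at the knot.
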

Regression Toward the Mean

The phenomenon called regression toward the mean often occurs in applied research and may go unrecognized. A good illustration of regression toward the mean occurred in the MRFIT study (Multiple Risk Factor Intervention Trial Research Group, 1982), which was designed to evaluate the effect of diet and exercise on blood pressure in men with mild hypertension. To be eligible to participate in the study, men had to have a diastolic blood pressure of at least 90 mm Hg. The eligible subjects were then assigned to either the treatment arm of the study, consisting of programs to encourage appropriate diet and exercise, or the control arm, consisting of typical care. This study has been called a landmark trial and was reprinted in 1997 in the Journal of the American Medical Association. See Exercise 13.

To illustrate the concept of regression toward the mean, we consider the hypothetical data in Table 8–7 for diastolic blood pressure in 12 men. If these men were being screened for the MRFIT study, only subjects 7 through 12 would be accepted; subjects 1 through 6 would not be eligible because their baseline diastolic pressure is < 90 mm Hg. Suppose all subjects had another blood pressure measurement some time later. Because a person's blood pressure varies considerably from one reading to another, about half the men can be expected to have higher blood pressures and about half to have lower blood pressures, owing to random variation. Regression toward the mean tells us that those men who had lower pressures on the first reading are more likely to have higher pressures on the second reading. Similarly, men who had a diastolic blood pressure ≥ 90 mm Hg on the first reading are more likely to have lower pressures on the second reading. If the entire sample of men is remeasured, the increases and decreases tend to cancel each other. If, however, only a subset of the subjects is examined again, for example, the men with initial diastolic pressures ≥ 90 mm Hg, the blood pressures will appear to have dropped, when in fact they have not.

Table 8–7. Hypothetical Data on Diastolic Blood Pressure to Illustrate Regression Toward the Mean.

Subject   Baseline   Repeat
1         78         80
2         80         81
3         82         82
4         84         86
5         86         85
6         88         90
7         90         88
8         92         91
9         94         95
10        96         95
11        98         97
12        100        98

Regression toward the mean can result in a treatment or procedure appearing to be of value when it has had no actual effect; the use of a control group helps to guard against this effect. The investigators in the MRFIT study were aware of the problem of regression toward the mean and discussed precautions they took to reduce its effect.
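The effect is easy to verify directly from the hypothetical data in Table 8–7: averaged over all 12 men, the changes cancel exactly, but among only the men who met the ≥ 90 mm Hg entry criterion, blood pressure appears to drop.

```python
# Diastolic blood pressures from Table 8-7 (baseline, repeat).
baseline = [78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100]
repeat = [80, 81, 82, 86, 85, 90, 88, 91, 95, 95, 97, 98]

# Over the whole sample the random ups and downs cancel ...
mean_change_all = sum(r - b for b, r in zip(baseline, repeat)) / len(baseline)

# ... but among only the men selected for high baseline readings
# (>= 90 mm Hg), the repeat readings appear to have dropped,
# even though no treatment was given.
selected = [(b, r) for b, r in zip(baseline, repeat) if b >= 90]
mean_change_selected = sum(r - b for b, r in selected) / len(selected)
```
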

Common Errors in Regression

One error in regression analysis occurs when multiple observations on the same subject are treated as though they were independent. For example, consider ten patients who have their weight and skinfold measurements recorded prior to beginning a low-calorie diet. We may reasonably expect a moderately positive relationship between weight and skinfold thickness. Now suppose that the same ten patients are weighed and measured again after 6 weeks on the diet. If all 20 pairs of weight and skinfold measurements are treated as though they were independent, several problems occur. First, the sample size will appear to be 20 instead of 10, and we are more likely to conclude significance. Second, because the relationship between weight and skinfold thickness in the same person is somewhat stable across minor shifts in weight, using both before and after diet observations has the same effect as using duplicate measures, and this results in a correlation larger than it should be.

The magnitude of the correlation can also be erroneously increased by combining two different groups. For example, consider the relationship between height and weight. Suppose the heights and weights of ten men and ten women are recorded, and the correlation between height and weight is calculated for the combined samples. Figure 8–14 illustrates how the scatterplot might look and indicates the problem that results from combining men and women in one sample. The relationship between height and weight appears to be more significant in the combined sample than it is when measured in men and women separately. Much of the apparent significance results because men tend both to weigh more and to be taller than women. Inappropriate conclusions may result from mixing two different populations—a rather common error to watch for in the medical literature.
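A simulation along these lines shows how pooling two groups that differ on both variables inflates the correlation; all means, standard deviations, and the deliberately weak within-group relationship below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

# Within each group, height (cm) and weight (kg) are only weakly related
# here; but the second group is taller and heavier on average, so pooling
# the groups manufactures a much larger correlation.
n = 200
h_w = rng.normal(162, 6, n)
w_w = 55 + 0.1 * (h_w - 162) + rng.normal(0, 6, n)   # "women"
h_m = rng.normal(178, 6, n)
w_m = 80 + 0.1 * (h_m - 178) + rng.normal(0, 6, n)   # "men"

r_women = np.corrcoef(h_w, w_w)[0, 1]
r_men = np.corrcoef(h_m, w_m)[0, 1]
r_pooled = np.corrcoef(np.concatenate([h_w, h_m]),
                       np.concatenate([w_w, w_m]))[0, 1]
```
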

Comparing Correlation & Regression

Correlation and regression have some similarities and some differences. First, correlation is scale-independent, but regression is not; that is, the correlation between two characteristics, such as height and weight, is the same whether height is measured in centimeters or inches and weight in kilograms or pounds. The regression equation predicting weight from height, however, depends on which scales are being used; that is, predicting weight measured in kilograms from height measured in centimeters gives different values for a and b than if predicting weight in pounds from height in inches.

An important consequence of scale independence in correlation is that the correlation between X and Y is the same as the correlation between Y′ and Y. They are equal because the regression equation itself, Y′ = a + bX, is a simple rescaling of the X variable; that is, each value of X is multiplied by the constant b and then the constant a is added. The fact that the correlation between the original variables X and Y is equal to the correlation between Y and Y′ provides a useful alternative for testing the significance of the regression, as we will see in Chapter 10. Finally, the slope of the regression line has the same sign (+ or –) as the correlation coefficient (see Exercise 10). If the correlation is zero, the regression line is horizontal with a slope of zero. Thus, the formulas for the correlation coefficient and the regression coefficient are closely related. If r has already been calculated, it can be multiplied by the ratio of the standard deviation of Y to the standard deviation of X, SDY/SDX, to obtain b (see Exercise 9). Thus,

b = r × (SDY/SDX)

Similarly, if the regression coefficient is known, r can be found by

r = b × (SDX/SDY)
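Both conversions between b and r can be checked on any small data set; the values below are arbitrary.

```python
import numpy as np

# For any data set, b = r * (SD_Y / SD_X) and, equivalently,
# r = b * (SD_X / SD_Y). These numbers are arbitrary.
x = np.array([3.2, 4.0, 5.0, 6.0, 8.9, 9.1, 10.7, 13.0])
y = np.array([1.0, 0.5, 1.0, 0.9, 0.8, 0.6, 1.0, 1.5])

r = np.corrcoef(x, y)[0, 1]
b = np.polyfit(x, y, 1)[0]                 # least-squares slope
sd_x, sd_y = x.std(ddof=1), y.std(ddof=1)  # sample standard deviations
```
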

Multiple Regression

Multiple regression analysis is a straightforward generalization of simple regression for applications in which two or more independent (explanatory) variables are used to predict an outcome. For example, in the study described in Presenting Problem 4, the investigators wanted to predict a woman's insulin sensitivity level based on her BMI. They also wanted to control for the age of the woman, however. The results from two analyses are given in Table 8–8. First, regression was done using the BMI to predict insulin sensitivity among hyperthyroid women; the resulting equation was

Insulin sensitivity = 2.336 – 0.077 × BMI
Table 8–8. Regression Equations for Hyperthyroid Women Using BMI versus BMI and Age As Predictor Variables.

Regression Equation Section
Independent Variable   Regression Coefficient   Standard Error   t Value (H0: B = 0)   Probability Level   Decision (5%)
Intercept              2.336                    0.462            5.054                 0.0003              Reject H0
BMI                    –0.077                   1.807E–02        –4.248                0.0011              Reject H0
R2 = 0.601

Regression Equation Section
Independent Variable   Regression Coefficient   Standard Error   t Value (H0: B = 0)   Probability Level   Decision (5%)
Intercept              2.2905                   0.461            4.973                 0.0004              Reject H0
Age                    –4.463E–03               4.103E–03        –1.088                0.3000              Accept H0
BMI                    –6.782E–02               1.972E–02        –3.439                0.0055              Reject H0
R2 = 0.639

BMI = body mass index.

Source: Data, used with permission, from Gonzalo MA, Grant C, Moreno I, Garcia FJ, Suarez AI, Herrera-Pombo JL, et al: Glucose tolerance, insulin secretion, insulin sensitivity and glucose effectiveness in normal and overweight hyperthyroid women. Clin Endocrinol 1996;45:689–697. Output produced using NCSS; used with permission.

Next, the regression was repeated using both BMI and age as independent variables. The results were

Insulin sensitivity = 2.2905 – 0.00446 × Age – 0.0678 × BMI

As you can see, the addition of the age variable has relatively little effect; in fact, the P value for age is 0.30, indicating that age is not significantly associated with insulin sensitivity in this group of hyperthyroid women.

As an additional point, note that R2 (called R-squared) is 0.601 for the first regression equation in Table 8–8. R2 is interpreted in the same manner as the coefficient of determination, r2, discussed in the section titled, "Interpreting the Size of r." This topic, along with multiple regression and other statistical methods based on regression, is discussed in detail in Chapter 10.
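A bare-bones sketch of multiple regression with two independent variables, using ordinary least squares and computing R2 as the squared correlation between the actual and predicted outcomes; the data are simulated for illustration, not those of Table 8–8.

```python
import numpy as np

def multiple_regression(X, y):
    """Least-squares fit of y = b0 + b1*x1 + ... + bk*xk.

    Returns the coefficient vector (intercept first) and R-squared,
    the squared correlation between actual and predicted y.
    """
    A = np.column_stack([np.ones(len(y)), X])
    coefs = np.linalg.lstsq(A, y, rcond=None)[0]
    y_hat = A @ coefs
    r = np.corrcoef(y, y_hat)[0, 1]
    return coefs, r ** 2

# Simulated data with two predictors (labeled bmi and age only for flavor).
rng = np.random.default_rng(2)
bmi = rng.uniform(20, 35, 30)
age = rng.uniform(25, 60, 30)
y = 2.3 - 0.07 * bmi - 0.004 * age + rng.normal(0, 0.1, 30)

coefs, r_squared = multiple_regression(np.column_stack([bmi, age]), y)
```
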

Sample Sizes for Correlation & Regression

As with other statistical procedures, it is important to have an adequate number of subjects in any study that involves correlation or regression. Complex formulas are required to estimate sample sizes for these procedures, but fortunately we can use statistical power programs to do the calculations.

Suppose that Jackson and colleagues (2002) wanted to know what sample size would be necessary to produce a confidence interval for the correlation of BMI and percent body fat that would be within ±0.10 of an expected correlation coefficient of 0.75. In other words, how many subjects are needed for a 95% confidence interval from 0.65 to 0.85, assuming they observe a correlation of 0.75 (recall they actually found a correlation of 0.73)? We used the nQuery Advisor program to illustrate the sample size needed in this situation; the output is given in Figure 8–15. A sample of 102 patients would be necessary. nQuery produces only a one-sided interval, so we used 97.5% to obtain a 95% two-sided interval. We could have used the upper limit of 0.85 instead of the lower limit 0.65 (line 3 of the nQuery table). Do you think the sample size would be the same? Try it and see.
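The calculation can be reproduced with the Fisher z transformation: the half-width of the interval on the z scale, 1.96/√(n – 3), must cover the distance from z(0.75) to z(0.65). This sketch uses the lower confidence limit, as in the example above.

```python
import math

def n_for_correlation_ci(r, half_width, z_crit=1.96):
    """Smallest n such that a two-sided 95% Fisher-z confidence interval
    for r reaches down to (r - half_width): find n with
    z_crit / sqrt(n - 3) <= z(r) - z(r - half_width).
    """
    fisher_z = lambda rho: 0.5 * math.log((1 + rho) / (1 - rho))
    gap = fisher_z(r) - fisher_z(r - half_width)
    return math.ceil(z_crit ** 2 / gap ** 2 + 3)

# Expected correlation 0.75, desired lower limit 0.65.
n_needed = n_for_correlation_ci(0.75, 0.10)
```

The result agrees with the nQuery output quoted in the text: 102 subjects.
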

To illustrate the power analysis for regression, consider the regression equation to predict insulin sensitivity from BMI (Gonzalo et al, 1996). Recall that we found that a 95% confidence interval for the regression coefficient was between –0.0674 and –0.0192 in the entire sample of 33 women. Suppose Gonzalo and colleagues wanted to know how many women would be needed for the regression. The power program PASS finds the sample size by estimating the number needed to obtain a given value for R2 (or r2 when only one independent variable is used). We assume they want the correlation between the actual insulin sensitivity and the predicted sensitivity to be at least 0.50, producing an r2 of 0.25. The setup and output from the PASS program are given in Figures 8–16 and 8–17. From Figure 8–17, we see that a sample size of about 26 is needed in each group for which a regression equation is to be determined.

Summary

Four presenting problems were used in this chapter to illustrate the application of correlation and regression in medical studies. The findings from the study described in Presenting Problem 1 demonstrate the relationship between BMI and percent body fat, a correlation equal to 0.73. The authors reported that the relationship was nonlinear, which can be seen in Figure 8–2. Several factors other than BMI affected the relationship. The authors concluded that BMI is only a moderate predictor of percent body fat and that it is important to consider age and gender when defining the prevalence of obesity with BMI for populations of American men and women.

In Presenting Problem 2, Nesselroad and colleagues (1996) evaluated three automated finger blood pressure devices marketed as being accurate devices for monitoring blood pressure. We examined the relationship among these devices and the standard method using a blood pressure cuff. The observed correlations were quite low, ranging from 0.32 to 0.45. We compared these two correlation coefficients and concluded that no statistical difference exists between them. Nesselroad also reported that the automated finger device measurements were outside of the ±4 mm Hg range obtained with the standard blood pressure cuff 75–81% of the time. These researchers appropriately concluded that people who want to monitor their blood pressure cannot trust these devices to be accurate.

Hodgson and Cutler (1997) reported results from their study of people's fears that normal age-associated memory change is a precursor of dementia. We examined the relationship between memory scores and whether people reported they were satisfied with their life. We demonstrated that the conclusions from computing the biserial correlation (the correlation between a numerical and a binary measure) and performing a t test are the same. Other results showed that the sense of well-being in these individuals is related to anticipatory dementia. Those with higher levels of anticipatory dementia are more depressed, have more psychiatric symptoms, have lower life satisfaction, and describe their health as being poorer than individuals not concerned about memory loss and Alzheimer's disease. Furthermore, women in the study demonstrated a relationship between anticipatory dementia and well-being that was not observed in men.

Data from Gonzalo and colleagues (1996) were used to illustrate regression, specifically the relationship between insulin sensitivity and BMI for hyperthyroid and control women. We found separate regression lines for hyperthyroid and for control women and observed that the relationships between insulin sensitivity and BMI are different in these two groups of women. The investigators also reported that overall glucose tolerance was not affected by hyperthyroidism in normal-weight women.

The flowcharts for Appendix C summarize the methods for measuring an association between two characteristics measured on the same subjects. Flowchart C–4 indicates how the methods depend on the scale of measurement for the variables, and flowchart C–5 shows applicable methods for testing differences in correlations and in regression lines.

Exercises


1. The extent to which stool energy losses are normalized in cystic fibrosis patients receiving pancreatic enzyme replacement therapy prompted a study by Murphy and colleagues (1991). They determined the amount of energy within the stools of 20 healthy children and 20 patients with cystic fibrosis who were comparatively asymptomatic while taking capsules of pancreatin, an enzyme replacement. Weighed food intake was recorded daily for 7 days for all study participants. Over the final 3 days of the study, all stools were collected. Measures of lipid content, total nitrogen content, bacterial content, and total energy content of the stools were recorded. Data for the cystic fibrosis children are given in Table 8–9 and on the CD-ROM in a folder entitled "Murphy."

Table 8–9. Observations on Stool Lipid and Stool Energy Losses in Children with Cystic Fibrosis.

Subject   Fecal Lipid (g/day)   Fecal Energy (MJ/day)
1         10.0                  2.1
2         11.0                  1.1
3         9.9                   1.1
4         9.8                   0.9
5         15.5                  0.7
6         5.0                   0.4
7         10.7                  1.0
8         13.0                  1.5
9         13.8                  1.2
10        16.7                  1.4
11        3.2                   1.0
12        4.0                   0.5
13        6.0                   0.9
14        8.9                   0.8
15        9.1                   0.6
16        4.1                   0.5
17        17.0                  1.2
18        22.2                  1.1
19        2.9                   0.9
20        5.0                   1.0
Source: Modified and reproduced, with permission, from the table and Figure 3 in Murphy JL, Wooton SA, Bond SA, Jackson AA: Energy content of stools in normal healthy controls and patients with cystic fibrosis. Arch Dis Child 1991;66:495–500.

a. Find and interpret the correlation between stool lipid and stool energy.
b. Figure 8–18 is from the study by Murphy. What is the authors' purpose in displaying this graph? What can be interpreted about the relationship between fecal lipid and fecal energy for control patients? How does that relationship compare with the relationship in patients with cystic fibrosis?
2.   a. Perform a chi-square test of the significance of the relationship between TRH and placebo and the subsequent development of respiratory distress syndrome using the data in Chapter 3, Table 3–21.
b. Determine 95% confidence limits for the relative risk of 2.3 for the risk of death within 28 days of delivery among infants not at risk using the data in Table 3–20. What is your conclusion?

3. Calculate the correlation between BMI and insulin sensitivity for the entire sample of 33 women, using the results in the section titled, "Calculating the Regression Equation," for b. The standard deviation of BMI is 3.82 and of insulin sensitivity is 0.030.

4. Goldsmith and colleagues (1985) examined 35 patients with hemophilia to determine whether a relationship exists between impaired cell-mediated immunity and the amount of factor concentrate used. In one of their studies, the ratio of OKT4 (helper T cells) to OKT8 (suppressor/cytotoxic T cells) was formed, and the logarithm of this ratio was regressed on the logarithm of lifetime concentrate use (Figure 8–19).

a. Why is the logarithm scale used for both variables?
b. Interpret the correlation.
c. What do the confidence bands mean?

5. Helmrich and coworkers (1987) conducted a study to assess the risk of deep vein thrombosis and pulmonary embolism in relation to the use of oral contraceptives. They were especially interested in the risk associated with low dosage (<50 μg estrogen) and confined their study to women under the age of 50 years. They administered standard questionnaires to women admitted to the hospital for deep vein thrombosis or pulmonary embolism as well as to a control set of women admitted for trauma and upper respiratory infections to determine their history and current use of oral contraceptives. Twenty of the 61 cases and 121 of the 1278 controls had used oral contraceptives in the previous month.

a. What research design was used in this study?
b. Find 95% confidence limits for the odds ratio for these data.
c. The authors reported an age-adjusted odds ratio of 8.1 with 95% confidence limits of 3.7 and 18. Interpret these results.

6. Presenting Problem 2 in Chapter 3 by Hébert and colleagues (1997) measured disability and functional changes in 655 residents of a community in Quebec, Canada. The Functional Autonomy Measurement System (SMAF), a 29-item rating scale measuring functional disability in five areas, was a major instrument used in the study. We used observations on mental ability for women 85 years or older at baseline and 2 years later to illustrate the correlation coefficient in Chapter 3 and found it to be 0.58. Use the data on the CD-ROM and select or filter those subjects with sex = 0 and age ≥ 85; 51 subjects should remain for the analysis.

a. Form a 95% confidence interval for this correlation.
b. Calculate the sample size needed to produce a confidence interval for the correlation of the mental ability scores at times 1 and 3 that would be within ±0.10 from the observed correlation coefficient. In other words, how many subjects are needed for a 95% confidence interval from 0.48 to 0.68 around the correlation of 0.58 found in their study?

7. The graphs in Figure 8–20 were published in the study by Einarsson and associates (1985).

a. Which graph exhibits the strongest relationship with age?
b. Which variable would be best predicted from a patient's age?
c. Do the relationships between the variables and age appear to be the same for men and women; that is, is it appropriate to combine the observations for men and women in the same figure?

8. Use the CD-ROM regression program to produce a graph of residuals for the data from Gonzalo and coworkers (1996). Which of the four situations in Figure 8–13 is most likely?

9. Explain why the mean of the predicted values, Y′, is equal to the mean of the observed values, Y.

10. Develop an intuitive argument to explain why the sign of the correlation coefficient and the sign of the slope of the regression line are the same.

11. Use the data from the "Bossi" file (Presenting Problem in Chapter 3) to form a 2 x 2 contingency table for the frequencies of hematuria (hematur) and whether patients had RBC units > 5 (gt5rbc). The odds ratio is 1.90. Form 95% confidence limits for the odds ratio and compare them to those calculated by the statistical program. What is the conclusion?

12. Group Exercise. The causes and pathogenesis of steroid-responsive nephrotic syndrome (also known as minimal-change disease) are unknown. Levinsky and colleagues (1978) postulated that this disease might have an immunologic basis because it may be associated with atopy, recent immunizations, or a recent upper respiratory infection. It is also responsive to corticosteroid treatment. They analyzed the serum from children with steroid-responsive nephrotic syndrome for the presence of IgG-containing immune complexes and the complement-binding properties (C1q-binding) of these complexes. For purposes of comparison, they also studied these two variables in patients with systemic lupus erythematosus. You will need to consult the published article for details of the study; a graph from the study is reproduced in Figure 8–21.

a. What were the study's basic research questions?
b. What was the study design? Is it the best one for the study's purposes?
c. What was the rationale in defining the kinds of patients to be studied? How were subjects obtained?
d. Interpret the correlations for the two sets of patients in Figure 8–21. What conclusions do you draw about the relationships between C1q-binding and IgG complexes in patients with systemic lupus erythematosus? In patients with steroid-responsive nephrotic syndrome?
e. Discuss the use of the parallel lines surrounding the regression line; do they refer to means or individuals? (Hint: The standard error of regression is 11.95 and Σ(X – X̄)² is 21,429.37.)
f. Do you think the regression lines for the two sets of patients will differ?
g. Would the results from this study generalize? If so, to what patient populations, and what cautions should be taken? If not, what features of the study limit its generalizability?

13. Group Exercise. The MRFIT study (Multiple Risk Factor Intervention Trial Research Group, 1982), has been called a landmark trial; it was the first large-scale clinical trial, and it is rare to have a study that follows more than 300,000 men who were screened for the trial for a number of years. The Journal of the American Medical Association reprinted this article in 1997. In addition, the journal published a comment in the Landmark Perspective section by Gotto (1997). Obtain a copy of both articles.

a. What research design was used in the study?
b. Discuss the eligibility criteria. Are these criteria still relevant today?
c. What were the treatment arms? Are these treatments still relevant today?
d. What statistical methods were used? Were they appropriate? One method, the Kaplan–Meier product-limit method, is discussed in Chapter 9.
e. Refer to Figure 1 in the original study. What do the lines in the figure indicate?
f. Examine the distribution of deaths given in the article's Table 4. What statistical method is relevant to analyzing these results?
g. The perspective by Gotto discusses the issue of power in the MRFIT study. How was the power of the study affected by the initial assumptions made in the study design?