Moderated Polynomial Regression

Researchers are often interested in testing whether the effects of congruence are moderated by another variable.  Moderation can be tested by supplementing polynomial regression equations with moderator variables and building on principles of moderated regression.  As a starting point, consider the following polynomial regression equation:

(1)      Z = b0 + b1X + b2Y + b3X2 + b4XY + b5Y2 + e

We will now add the moderator variable W to Eq. 1.  Following principles of moderated regression (e.g., Aiken & West, 1991), we will add W and the product of W with each term in Eq. 1.  This results in the following expression:

(2)     Z = b0 + b1X + b2Y + b3X2 + b4XY + b5Y2 + b6W + b7WX + b8WY + b9WX2 + b10WXY + b11WY2 + e

The moderating effect of W is captured by the five product terms WX, WY, WX2, WXY, and WY2 as a set.  Moderation is tested by assessing the increment in R2 yielded by these five terms, which amounts to comparing the R2 from Eq. 2 with the R2 from an equation that contains the terms in Eq. 1 along with W itself.
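The increment test can be sketched in Python using only numpy.  The data below are simulated purely for illustration, and the variable names and data-generating values are assumptions, not part of any study; the baseline equation includes W itself so that only the five product terms are tested:

```python
import numpy as np

# Simulated data for illustration; in practice X, Y, W, and Z come from the study.
rng = np.random.default_rng(0)
n = 500
x, y, w = rng.normal(size=(3, n))
z = (1 + 0.4*x - 0.4*y - 0.3*x**2 + 0.5*x*y - 0.3*y**2
     + 0.2*w + 0.3*w*x*y + rng.normal(scale=0.5, size=n))

def r_squared(cols, z):
    """R2 from an OLS fit of z on an intercept plus the given columns."""
    X = np.column_stack([np.ones(len(z))] + cols)
    b, *_ = np.linalg.lstsq(X, z, rcond=None)
    resid = z - X @ b
    tss = (z - z.mean()) @ (z - z.mean())
    return 1 - resid @ resid / tss

base = [x, y, x**2, x*y, y**2, w]                   # terms of Eq. 1 plus W
full = base + [w*x, w*y, w*x**2, w*x*y, w*y**2]     # Eq. 2
r2_base, r2_full = r_squared(base, z), r_squared(full, z)

q = 5                # number of product terms being tested
k = len(full)        # coefficients in Eq. 2, excluding the intercept
F = ((r2_full - r2_base) / q) / ((1 - r2_full) / (n - k - 1))
print(r2_base, r2_full, F)
```

The F statistic has q and N – k – 1 degrees of freedom, matching the hierarchical test described above.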

If the increment in R2 yielded by the five terms WX, WY, WX2, WXY, and WY2 is statistically significant and sufficiently large from a substantive perspective, then it is appropriate to interpret the form of the moderating effect yielded by W.  This can be accomplished by rewriting Eq. 2 to show simple quadratic functions at selected levels of W, analogous to simple slopes in moderated regression:

(3)     Z = (b0 + b6W) + (b1 + b7W)X + (b2 + b8W)Y + (b3 + b9W)X2 + (b4 + b10W)XY + (b5 + b11W)Y2 + e

The simple quadratic functions indicated by Eq. 3 can be derived by substituting selected values of W.  For instance, assume W is a dichotomous variable representing gender in which W = 0 for men and W = 1 for women.  The equation for men is derived by substituting W = 0 into Eq. 3, which yields:

(4)     Z = b0 + b1X + b2Y + b3X2 + b4XY + b5Y2 + e

Note that Eq. 4 reduces to Eq. 1, given that each term involving W becomes zero and therefore drops out.  The equation for women is found by substituting W = 1 into Eq. 3, which produces:

(5)     Z = (b0 + b6) + (b1 + b7)X + (b2 + b8)Y + (b3 + b9)X2 + (b4 + b10)XY + (b5 + b11)Y2 + e

When W is a continuous variable, meaningful values of W, such as the mean of W and one standard deviation above and below the mean, can be substituted into Eq. 3.  For example, if W is measured on a 7-point scale and yields a mean of 4, the quadratic function at the mean of W is found by substituting W = 4 into Eq. 3, which produces:

(6)     Z = (b0 + 4b6) + (b1 + 4b7)X + (b2 + 4b8)Y + (b3 + 4b9)X2 + (b4 + 4b10)XY + (b5 + 4b11)Y2 + e
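The substitution in Eq. 3 is simple arithmetic on the estimated coefficients.  The following Python helper makes it concrete; the coefficient values shown are hypothetical, chosen only to illustrate the computation:

```python
def simple_surface(b, w):
    """Compound coefficients of Eq. 3 at a chosen value of the moderator W.
    b holds (b0, ..., b11) from Eq. 2; returns the six coefficients of the
    simple quadratic function in X and Y."""
    b0, b1, b2, b3, b4, b5, b6, b7, b8, b9, b10, b11 = b
    return (b0 + b6*w, b1 + b7*w, b2 + b8*w,
            b3 + b9*w, b4 + b10*w, b5 + b11*w)

# Hypothetical estimates of b0 through b11, for illustration only
b = (0.5, 0.4, -0.4, -0.3, 0.5, -0.3, 0.2, 0.1, -0.1, 0.05, 0.15, -0.05)
print(simple_surface(b, 4))   # simple surface at W = 4 (cf. Eq. 6)
print(simple_surface(b, 0))   # reduces to (b0, b1, ..., b5), as in Eq. 4
```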

The terms yielded by substituting selected values of W into Eq. 3 can be tested in various ways.  One approach is to use procedures for testing weighted linear combinations of regression coefficients, which involves dividing the term to be tested by an estimate of its standard error.  For instance, to test the coefficient on X in Eq. 3, which is (b1 + b7W), this term would be divided by the square root of the variance of (b1 + b7W).  The variance of (b1 + b7W) is derived by applying conventional rules for computing the variance of a weighted linear combination of random variables, as follows:

(7)     V[(b1 + b7W)] = V(b1) + W2V(b7) + 2WC(b1,b7)

where V(.) and C(.) are variance and covariance operators, respectively.  The variance of a regression coefficient can be obtained by squaring its standard error.  The covariance of two regression coefficients can be obtained by requesting regression diagnostics that include the correlations among the regression coefficients and multiplying each correlation by the standard errors of the two coefficients involved.  Dividing (b1 + b7W) by the square root of V[(b1 + b7W)] as given by Eq. 7 yields a t-test of (b1 + b7W) with N – k – 1 degrees of freedom, where k is the number of coefficients in Eq. 2, excluding the intercept.
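Eq. 7 can be computed directly from the coefficient covariance matrix of the fitted model.  The Python sketch below (simulated data; names and values are assumptions) fits Eq. 2 by ordinary least squares, forms the covariance matrix of the coefficients, and applies Eq. 7 at W = 4:

```python
import numpy as np

# Simulated data for illustration only
rng = np.random.default_rng(1)
n = 400
x, y, w = rng.normal(size=(3, n))
z = 1 + 0.4*x - 0.4*y + 0.3*w*x + rng.normal(scale=0.5, size=n)

# Design matrix for Eq. 2: intercept, X, Y, X2, XY, Y2, W, WX, WY, WX2, WXY, WY2
X = np.column_stack([np.ones(n), x, y, x**2, x*y, y**2,
                     w, w*x, w*y, w*x**2, w*x*y, w*y**2])
b, *_ = np.linalg.lstsq(X, z, rcond=None)
resid = z - X @ b
k = X.shape[1] - 1                   # coefficients excluding the intercept
s2 = resid @ resid / (n - k - 1)     # residual variance
V = s2 * np.linalg.inv(X.T @ X)      # covariance matrix of the coefficients

W0 = 4                               # level of W at which to test (b1 + b7*W)
est = b[1] + W0 * b[7]               # compound coefficient on X
var = V[1, 1] + W0**2 * V[7, 7] + 2 * W0 * V[1, 7]   # Eq. 7
t = est / np.sqrt(var)
print(est, np.sqrt(var), t)
```

The t statistic is referred to a t distribution with N – k – 1 degrees of freedom.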

Another approach is to use functions in statistical packages that provide tests of weighted linear combinations of regression coefficients.  For instance, suppose we wanted to test the coefficient on X in Eq. 3 at W = 4.  This test would be produced by the following SPSS syntax:

GLM
z WITH x y x2 xy y2 w wx wy wx2 wxy wy2
/INTERCEPT = INCLUDE
/PRINT = DESCRIPTIVE PARAMETER
/LMATRIX = x 1 wx 4
/DESIGN = x y x2 xy y2 w wx wy wx2 wxy wy2 .

The /LMATRIX subcommand assigns a weight of 1 to the coefficient on x and a weight of 4 to the coefficient on wx, corresponding to the expression (b1 + 4b7), and tests whether this weighted linear combination differs from zero.  Other statistical packages that provide tests of weighted linear combinations of regression coefficients include SAS, Stata, and SYSTAT.
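The computation behind such a test is the general contrast c'b divided by the square root of c'Vc, where c is the weight vector.  A minimal Python sketch, using a hypothetical coefficient vector and a placeholder diagonal covariance matrix rather than output from any real analysis:

```python
import numpy as np

def contrast_test(b, V, c):
    """Estimate, standard error, and t statistic for the weighted linear
    combination c'b, where V is the covariance matrix of the coefficients.
    This is the computation a subcommand like /LMATRIX performs."""
    est = c @ b
    se = np.sqrt(c @ V @ c)
    return est, se, est / se

# Hypothetical coefficients for Eq. 2 (order b0, b1, ..., b11), illustration only
b = np.array([0.5, 0.4, -0.4, -0.3, 0.5, -0.3,
              0.2, 0.1, -0.1, 0.05, 0.15, -0.05])
V = np.diag(np.full(12, 0.01))   # placeholder: SE of 0.1 each, zero covariances

c = np.zeros(12)
c[1], c[7] = 1, 4                # weight 1 on b1, weight 4 on b7: (b1 + 4b7)
print(contrast_test(b, V, c))
```

The same weight vector generalizes to any compound coefficient in Eq. 3, such as the five-term curvature combinations discussed below.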

The simple surfaces indicated by Eq. 3 can be analyzed using standard procedures (Edwards, 2002; Edwards & Parry, 1993), with the caveat that the individual coefficients typically used to analyze surface features are replaced with the compound coefficients in Eq. 3.  For instance, recall that the shape of a surface along the Y = X line can be examined by setting Y = X in Eq. 1, which yields the following equation:

(8)             Z = b0 + (b1 + b2)X + (b3 + b4 + b5)X2 + e

In similar fashion, setting Y = X in Eq. 2, which includes W as a moderator variable, yields the following equation:

(9)     Z = b0 + (b1 + b2)X + (b3 + b4 + b5)X2 + b6W + (b7 + b8)WX + (b9 + b10 + b11)WX2 + e

Analogous to Eq. 3, Eq. 9 can be rewritten in terms of simple shapes along the Y = X line:

(10)    Z = (b0 + b6W) + [b1 + b2 + (b7 + b8)W]X + [b3 + b4 + b5 + (b9 + b10 + b11)W]X2 + e

Along the Y = -X line, the moderated polynomial regression equation is as follows:

(11)    Z = b0 + (b1 – b2)X + (b3 – b4 + b5)X2 + b6W + (b7 – b8)WX + (b9 – b10 + b11)WX2 + e

Rewriting Eq. 11 in terms of simple shapes yields:

(12)    Z = (b0 + b6W) + [b1 – b2 + (b7 – b8)W]X + [b3 – b4 + b5 + (b9 – b10 + b11)W]X2 + e
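Eqs. 10 and 12 can be computed with a single helper that combines the coefficients of Eq. 2 at a chosen level of W.  The Python sketch below is illustrative, and the sample coefficient values are hypothetical:

```python
def simple_shape(b, w, congruence=True):
    """Intercept, slope, and curvature of the simple quadratic in X along
    the Y = X line (Eq. 10) or, when congruence=False, the Y = -X line
    (Eq. 12), at a chosen value of the moderator W.  b holds (b0, ..., b11)
    from Eq. 2."""
    b0, b1, b2, b3, b4, b5, b6, b7, b8, b9, b10, b11 = b
    s = 1 if congruence else -1        # sign flips on b2, b4, b8, b10
    intercept = b0 + b6*w
    slope = b1 + s*b2 + (b7 + s*b8)*w
    curvature = b3 + s*b4 + b5 + (b9 + s*b10 + b11)*w
    return intercept, slope, curvature

# Hypothetical coefficients, for illustration only
b = (0.5, 0.4, -0.4, -0.3, 0.5, -0.3, 0.2, 0.1, -0.1, 0.05, 0.15, -0.05)
print(simple_shape(b, 1))                     # along Y = X at W = 1
print(simple_shape(b, 1, congruence=False))   # along Y = -X at W = 1
```

A negative curvature along Y = -X combined with a flat Y = X line, for example, would indicate that the simple surface at that level of W follows the classic congruence pattern.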

The compound terms in Eq. 3 can also be substituted into expressions for computing stationary points and principal axes given by Edwards (2002) and Edwards and Parry (1993).  In addition, the bootstrap can be applied to Eq. 2 to obtain a large number (e.g., 10,000) of bootstrap estimates of its coefficients, which can then be used to construct bias-corrected confidence intervals for surface features at selected levels of W.  This is accomplished by applying Eq. 3 to the coefficients yielded by each bootstrap sample and using the resulting compound coefficients in the same manner as simple coefficients are used to test surface features when moderation is not involved.  For additional information on the bootstrap, see Efron and Tibshirani (1993), Mooney and Duval (1993), and Stine (1989).
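The bootstrap procedure can be sketched as follows in Python.  The data are simulated for illustration, the number of resamples is reduced for speed, and a simple percentile interval stands in for the bias-corrected interval described above:

```python
import numpy as np

# Simulated data for illustration only
rng = np.random.default_rng(2)
n = 300
x, y, w = rng.normal(size=(3, n))
z = (1 + 0.4*x - 0.4*y - 0.2*x**2 + 0.4*x*y - 0.2*y**2
     + 0.3*w*x*y + rng.normal(scale=0.5, size=n))

def design(x, y, w):
    """Design matrix for Eq. 2 (intercept first)."""
    return np.column_stack([np.ones(len(x)), x, y, x**2, x*y, y**2,
                            w, w*x, w*y, w*x**2, w*x*y, w*y**2])

W0 = 1.0              # level of W at which to examine the simple surface
boot = []
for _ in range(2000):                    # use 10,000 or more in practice
    idx = rng.integers(0, n, n)          # resample cases with replacement
    b, *_ = np.linalg.lstsq(design(x[idx], y[idx], w[idx]), z[idx], rcond=None)
    # Curvature along Y = X at W0, per Eq. 10: b3 + b4 + b5 + (b9 + b10 + b11)*W0
    boot.append(b[3] + b[4] + b[5] + (b[9] + b[10] + b[11]) * W0)

lo, hi = np.percentile(boot, [2.5, 97.5])    # percentile 95% interval
print(lo, hi)
```

Any other surface feature at a selected level of W can be bootstrapped the same way by replacing the compound expression inside the loop.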

Aiken, L. S., & West, S. G.  (1991).  Multiple regression: Testing and interpreting interactions.  Newbury Park, CA: Sage.

Edwards, J. R.  (2002).  Alternatives to difference scores: Polynomial regression analysis and response surface methodology. In F. Drasgow & N. W. Schmitt (Eds.), Advances in measurement and data analysis (pp. 350-400).  San Francisco: Jossey-Bass.

Edwards, J. R., & Parry, M. E.  (1993).  On the use of polynomial regression equations as an alternative to difference scores in organizational research. Academy of Management Journal, 36, 1577-1613.

Efron, B., & Tibshirani, R. (1993). An introduction to the bootstrap. New York: Chapman & Hall.

Mooney, C. Z., & Duval, R. D. (1993). Bootstrapping: A nonparametric approach to statistical inference. Newbury Park, CA: Sage.

Stine, R. (1989). An introduction to bootstrap methods. Sociological Methods & Research, 18, 243-291.