Journal of Statistics Education, Volume 19, Number 3 (2011)

Key Results of Interaction Models With Centering

David Afshartous

Vanderbilt University

Richard A. Preston

University of Miami

http://www.amstat.org/publications/jse/v19n3/afshartous.pdf

Copyright © 2011 by David Afshartous and Richard A. Preston, all rights reserved. This text may be freely shared among individuals, but it may not be republished in any medium without express written consent from the authors and advance notification of the editor.

Key Words: Beta coefficients; Introductory statistics; Medical statistics; Misspecification bias; Multicollinearity; Multiple regression.

Abstract

We consider the effect on estimation of simultaneous variable centering and interaction effects in linear regression. We technically define, review, and amplify many of the statistical issues for interaction models with centering in order to create a useful and compact reference for teachers, students, and applied researchers. In addition, we investigate a sequence of models that have an interaction effect and/or variable centering and derive expressions for the change in the regression coefficients between models from both an intuitive and mathematical perspective. We demonstrate how these topics may be employed to motivate discussion of other important areas, e.g., misspecification bias, multicollinearity, design of experiments, and regression surfaces. This paper presents a number of results also given elsewhere but in a form that gives a unified view of the topic. The examples cited are from the area of medical statistics.

1. Introduction

We consider the case of simultaneous variable centering and interaction effects in linear regression. The goal is to create a useful reference for teachers and students of statistics, as well as applied researchers. Thus, we technically define, review, and amplify many of the statistical issues for interaction models with centering and provide a comprehensive summary and discussion of key points. While many of the points we raise have been made elsewhere, they are somewhat scattered across a voluminous literature. The examples cited are from the area of medical statistics.

By the term variable centering we mean subtracting either the mean value or a meaningful constant from an independent variable. It is well-known that variable centering can often increase the interpretability of regression coefficients as well as reduce multicollinearity between lower and higher-order predictor variables.
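The multicollinearity claim can be illustrated with a short Python sketch. The data values are hypothetical and were chosen to be symmetric about their mean, which makes the correlation between the centered predictor and its square exactly zero; for asymmetric data, centering typically reduces this correlation without eliminating it. The helper name `pearson` is introduced here only for illustration.

```python
from statistics import mean

def pearson(u, v):
    """Sample Pearson correlation of two equal-length sequences."""
    mu, mv = mean(u), mean(v)
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / (suu * svv) ** 0.5

# Hypothetical predictor values, all far from zero (a GFR-like scale).
x = [60.0, 70.0, 80.0, 90.0, 100.0, 110.0, 120.0]
x_centered = [xi - mean(x) for xi in x]

# Correlation between the predictor and its square, before and after centering.
r_raw = pearson(x, [xi ** 2 for xi in x])
r_centered = pearson(x_centered, [xi ** 2 for xi in x_centered])

print(f"corr(X, X^2) raw:      {r_raw:.4f}")
print(f"corr(X, X^2) centered: {r_centered:.4f}")
```

The same comparison can be made for a product term X1 × X2 by passing the product as the second argument of `pearson`.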

To discuss characteristics of interaction effects, we consider a model with two predictors and their cross-product term. For ease of illustration we assume that continuous predictors are linearly related to the dependent variable.¹ Interaction effects arise when the effects of predictor variables are not additive, i.e., the effect of one predictor variable depends on the value of another variable. For example, consider Figure 1 in the context of a potassium challenge experiment² where Y represents urinary potassium excretion, X1 represents serum potassium level, and X2 represents glomerular filtration rate (GFR). As GFR is a measure of kidney function, one might expect that the slope of the response Y against serum potassium level X1 would increase for higher GFR levels X2. This is often referred to as a reinforcement or synergistic interaction, whereas an offsetting or interference interaction effect occurs when the slope of the response decreases for higher GFR levels X2. Moreover, centering of GFR and serum potassium could enhance the interpretability of the regression coefficients given that it is not meaningful to consider a subject with a zero value for GFR or serum potassium.

When adding centering to a model that includes an interaction term, the magnitude and standard error of certain estimated coefficients change. Indeed, as researchers often sift through several different models, many of which yield the same fitted values merely under different parameterizations, the potential for confusion is high. In this paper we attempt to provide a compact guide to help reduce such confusion. In Section 2, we provide separate overviews of variable centering and interaction effects. In Section 3, we consider simultaneous centering and interaction effects via a sequence of models. We derive expressions for the change in the regression coefficients for the new models from both an intuitive and mathematical perspective. In Section 4, we provide a list of key points to guide both teaching and applied work with interaction models and centering.³ We conclude with a brief summary in Section 5.

¹ For a discussion of relaxing the linearity assumption see Harrell (2001), p. 16.

² Potassium challenge experiments involve the administration of a potassium load to experimental subjects in order to investigate the physiology of potassium handling.

Figure 1. Illustration of reinforcement and interference interaction effects. In the additive model (a), the relationship between Y and X1 does not depend on the value of X2. In a reinforcement interaction effect (b), the slope between Y and X1 increases for higher X2 values, while in an interference interaction effect (c) the slope between Y and X1 decreases for higher X2 values.

2. Variable Centering and Interaction Effects

2.1 Variable Centering

Motivations for employing variable centering include enhanced interpretability of coefficients and reduced numerical instability for estimation associated with multicollinearity.

Consider the standard bivariate linear regression model where scalars Xi and Yi represent the predictor and response variables, respectively, for the ith observation, and scalar εi represents the corresponding random error term where the standard assumption is that εi ∼ N(0, σ²). Omitting the subscript without loss of generality, the “true” population model is:⁴

$$Y = \alpha + \beta X + \varepsilon. \tag{1}$$

³ For assessments of the methodology to detect interaction effects in certain fields (that also attempt to identify key points) see Carte and Russell (2003); Champoux and Peters (1987).

⁴ Throughout the paper only scalar notation is employed. Greek letters are employed for population parameters while the corresponding English lower-case letter represents the corresponding estimator, e.g., (α, β) versus (a, b).

Now consider the model in which the predictor is shifted by a constant k,

$$Y = \alpha^{*} + \beta^{*}(X - k) + \varepsilon,$$

where one may consider this as the regression of Y on the transformed predictor variable X* = X − k. For instance, consider k = µX = E(X), the population mean of X.⁵ Although this change of location of the predictor variable shifts the 0 point to µX, other changes of location to another meaningful value k are possible as well. Since E(Y) = α* + β*(X − µX), the new intercept α* represents the expected value of Y when X = µX, i.e., the expected value of Y for the average predictor value. If the X variable is a physiological variable such as weight or blood pressure, the centered model provides a much more meaningful intercept. Since both population models must yield the same expected values for the same given X values, it follows that α* = α + βµX and β* = β. For instance, E(Y | X = µX) = α + βµX = α* and E(Y | X = 0) = α = α* − β*µX, from which both results follow. Since correlation properties between variables do not change under linear transformations, the fact that the estimated slope should not change is also intuitive. It also follows that centering (or any linear transformation) does not alter the coefficient of determination R² (Arnold and Evans 1979; Allison 1977).
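The sample analogue of these centering results can be checked numerically. The following is a minimal Python sketch on hypothetical data (the values and the helper `simple_ols` are illustrative assumptions, not taken from the paper): it fits the bivariate regression before and after centering the predictor at its sample mean and compares the intercept, slope, and R².

```python
from statistics import mean

def simple_ols(x, y):
    """Return (intercept, slope, r_squared) for the least-squares fit of y on x."""
    xbar, ybar = mean(x), mean(y)
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    syy = sum((yi - ybar) ** 2 for yi in y)
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    return intercept, slope, sxy ** 2 / (sxx * syy)

# Hypothetical data: x could be body weight (kg), y some response.
x = [62.0, 70.0, 75.0, 81.0, 88.0, 94.0]
y = [3.1, 3.9, 4.2, 5.0, 5.4, 6.3]
xbar = mean(x)

a, b, r2 = simple_ols(x, y)                                       # original model
a_star, b_star, r2_star = simple_ols([xi - xbar for xi in x], y)  # centered model

print(f"original: a  = {a:.4f}, b  = {b:.4f}, R^2 = {r2:.4f}")
print(f"centered: a* = {a_star:.4f}, b* = {b_star:.4f}, R^2 = {r2_star:.4f}")
print(f"a* - b*xbar = {a_star - b_star * xbar:.4f}  (equals a)")
print(f"mean(y)     = {mean(y):.4f}  (equals a*)")
```

Only the intercept changes: the centered intercept equals the sample mean of the response, which is the fitted value at the average predictor value.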

In practice, the population parameters are unknown and must be estimated via sampled data (Xi, Yi), i = 1, ..., n, yielding the analogous equations for the estimated regression coefficients, e.g., a = a* − bX̄ and b* = b, where X̄ is the sample mean of X. Note that centering predictors by their sample mean also has the beneficial effect of making the estimate of the intercept independent of the estimate of the slope.⁶

⁵ Asterisks are employed to denote corresponding parameters and estimators in a transformed model versus the original model, e.g., α* is the intercept in the centered model while α is the intercept in the original model.

⁶ This result no longer holds if one centers via k where k is not the sample mean.

In multiple regression, variable centering is often touted as a potential solution to reduce numerical instability associated with multicollinearity, and a common cause of multicollinearity is a model with interaction term X1X2 or other higher-order terms such as X² or X³. For the case of two predictor variables X1 and X2, when X1 and X2 are uncorrelated in the sample data the estimated regression coefficient b1 is the same regardless of whether X2 is included in the model or not (similarly for b2 and X1). This may be seen from the following algebraic expression for b1 in the standard multiple regression model with two predictors:

$$b_1 = \frac{r_{Y1} - r_{12}\, r_{Y2}}{1 - r_{12}^{2}} \cdot \frac{s_Y}{s_1},$$

where rY1 and rY2 represent the sample correlation coefficients between Y and X1 and between Y and X2, respectively, r12 represents the sample correlation coefficient between X1 and X2, and sY and s1 represent the sample standard deviations of Y and X1.⁷ However, if the predictor variables are (perfectly) uncorrelated we have r12 = 0 and it immediately follows that

$$b_1 = r_{Y1}\, \frac{s_Y}{s_1},$$

which by definition is the estimated slope in the bivariate regression of Y on X1 alone.
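A small numerical check of this statement, as a hedged Python sketch: the data are hypothetical, and the helper names (`cross`, `slope_with_x2`, `slope_alone`) are assumptions made for illustration. The two-predictor slope is computed from the closed-form solution of the normal equations, which is algebraically equivalent to the correlation-based expression for b1.

```python
from statistics import mean

def cross(u, v):
    """Sum of cross-products of deviations from the sample means."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v))

def slope_with_x2(x1, x2, y):
    """Least-squares coefficient of x1 when x2 is also in the model."""
    s11, s22, s12 = cross(x1, x1), cross(x2, x2), cross(x1, x2)
    return (s22 * cross(x1, y) - s12 * cross(x2, y)) / (s11 * s22 - s12 ** 2)

def slope_alone(x1, y):
    """Least-squares slope in the bivariate regression of y on x1."""
    return cross(x1, y) / cross(x1, x1)

# Hypothetical designed experiment: balanced 2x2 layout, so r12 = 0.
x1 = [-1.0, -1.0, 1.0, 1.0, -1.0, -1.0, 1.0, 1.0]
x2 = [-1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0]
y = [2.1, 3.0, 4.2, 4.8, 1.9, 3.2, 3.9, 5.1]

# Hypothetical observational-style second predictor, correlated with x1.
x2_corr = [-1.0, -0.5, 0.5, 1.0, -0.8, -0.6, 0.7, 1.2]

print(f"b1 alone:                 {slope_alone(x1, y):.4f}")
print(f"b1 with uncorrelated X2:  {slope_with_x2(x1, x2, y):.4f}")
print(f"b1 with correlated X2:    {slope_with_x2(x1, x2_corr, y):.4f}")
```

With the balanced design the first two printed slopes coincide; with the correlated predictor the coefficient of X1 changes once X2 enters the model.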

Note that predictors are often correlated, except for designed experiments where the experimenter may choose the levels of the predictor variables.

When predictor variables are perfectly correlated, infinitely many estimated coefficients provide the same predicted values and fit to the data. Perfect correlation, however, is not as troublesome as near perfect correlation. Under perfect correlation, the simple solution is to remove one of the variables since doing so does not remove any information. On the other hand, if |Cor(X1, X2)| < 1, removing one of the variables entails a loss of information.

Practically, the estimated coefficient b1 changes depending on whether the predictor variable X2 is included in the model or not. This change may be quantified and is commonly referred to as specification bias. Specifically, if Cov(X1, X2) = σ12 and Var(X1) = σ1², and one estimates a model without X2 when the model should include X2, one may show that the resulting estimate for the regression coefficient of X1 has E(b1) = β1 + β2 σ12/σ1², i.e., the expected bias in b1 is thus β2 σ12/σ1² (Goldberger 1964). Even if both variables are included, inference becomes more difficult in the presence of inflated standard errors, i.e., estimation uncertainty, where a small change to the data can result in a large change to the estimated coefficients. The more advanced reader may find further details regarding multicollinearity in Christensen (2002).
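The bias expression has an exact sample counterpart that is easy to verify: in any data set, the slope from the regression that omits X2 equals the full-model coefficient b1 plus b2 times the ratio of the sample covariance of X1 and X2 to the sample variance of X1. The Python sketch below checks this identity on hypothetical data. It illustrates the least-squares algebra behind the result, not the expectation statement itself, and the variable names are assumptions made for illustration.

```python
from statistics import mean

def cross(u, v):
    """Sum of cross-products of deviations from the sample means."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v))

# Hypothetical data with positively correlated predictors.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
y = [3.2, 2.9, 6.1, 5.8, 9.3, 8.6]

s11, s22, s12 = cross(x1, x1), cross(x2, x2), cross(x1, x2)
s1y, s2y = cross(x1, y), cross(x2, y)

# Full model: y on x1 and x2 (closed-form solution of the normal equations).
det = s11 * s22 - s12 ** 2
b1_full = (s22 * s1y - s12 * s2y) / det
b2_full = (s11 * s2y - s12 * s1y) / det

# Misspecified model: y on x1 only.
b1_short = s1y / s11

# Sample analogue of beta2 * sigma12 / sigma1^2.
shift = b2_full * s12 / s11

print(f"b1 (full model):     {b1_full:.4f}")
print(f"b1 (X2 omitted):     {b1_short:.4f}")
print(f"b1 (full) + shift:   {b1_full + shift:.4f}")
```

The last two printed values agree, showing how the omitted predictor's effect is absorbed into the coefficient of X1 in proportion to the association between the two predictors.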

⁷ Note that there exists the distinction between the population correlation and the sample correlation, and correlated in sample does not necessarily imply correlated in population, and vice versa.

2.2 Interaction Effects

Consider multiple regression with two predictor variables. An interaction effect may be modeled by including the product term X1 × X2 as an additional variable in the regression, known as a two-way interaction term. If there are k predictor variables in the multiple regression, there are k!/(2!(k−2)!) potential two-way interactions, and analogously for three-way and higher-order interactions. For a simple model with two-way interactions only, the population model is:

$$
\begin{aligned}
Y &= \alpha + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_1 X_2 + \varepsilon \\
  &= \alpha + (\beta_1 + \beta_3 X_2)\, X_1 + \beta_2 X_2 + \varepsilon \\
  &= \alpha + \beta_1 X_1 + (\beta_2 + \beta_3 X_1)\, X_2 + \varepsilon
\end{aligned}
$$

The re-arrangement of terms in Equations 6–8 demonstrates the meaning of an interaction effect, i.e., the slope associated with X1 is no longer simply a constant β1, but rather (β1 + β3 X2), which clearly depends on the value of X2, and similarly the slope associated with X2 is now (β2 + β3 X1). The coefficient β1 now represents the effect of X1 on Y when X2 = 0, whereas β1 in a model without interaction represents the effect of X1 on Y for all levels of X2. The effect of X1 on Y for non-zero values of X2 is affected by the magnitude and sign of β3, e.g., if β3 < 0, the effect of X1 on Y is less for higher values of X2 and greater for smaller values of X2 (interference or offsetting interaction, Figure 1), and vice versa for β3 > 0 (synergistic or reinforcing interaction, Figure 1).

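For instance, for X2 = 0, 1, 2, we have three different lines for the effect of X1 on Y:

$$
\begin{aligned}
E(Y \mid X_2 = 0) &= \alpha + \beta_1 X_1, \\
E(Y \mid X_2 = 1) &= (\alpha + \beta_2) + (\beta_1 + \beta_3)\, X_1, \\
E(Y \mid X_2 = 2) &= (\alpha + 2\beta_2) + (\beta_1 + 2\beta_3)\, X_1,
\end{aligned}
$$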

and the bivariate relationship between Y and X1 depends on X2. Note that β3 in isolation lacks information about the relative strength of the interaction. For instance, β1 may be so large that even for a seemingly large β3 there is not a substantial impact over the range of X2 values considered.
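The conditional lines can be tabulated directly from the coefficients. This is a minimal Python sketch with hypothetical coefficient values (α = 1, β1 = 2, β2 = 0.5, β3 = 0.25); the function name `conditional_line` is an assumption made for illustration.

```python
def conditional_line(alpha, beta1, beta2, beta3, x2):
    """Intercept and slope of E(Y) as a function of X1 at a fixed value of X2."""
    return alpha + beta2 * x2, beta1 + beta3 * x2

# Hypothetical population coefficients.
alpha, beta1, beta2, beta3 = 1.0, 2.0, 0.5, 0.25

for x2 in (0, 1, 2):
    intercept, slope = conditional_line(alpha, beta1, beta2, beta3, x2)
    # Relative change in the X1 slope compared with its value at X2 = 0.
    rel_change = (slope - beta1) / beta1
    print(f"X2 = {x2}: E(Y) = {intercept:.2f} + {slope:.2f} * X1 "
          f"(slope change vs. X2 = 0: {rel_change:+.1%})")
```

The relative slope change shows why β3 alone does not convey the strength of the interaction: the same β3 produces a smaller proportional change when β1 is large.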

Interaction effects are sometimes called joint effects, where the focus (instead of the conditional focus above) is more on how the two variables interact when accounting for the variance in Y over and above the contributions of the individual additive effects. Indeed, the interaction term does not assess the combined effect, e.g., a positive interaction coefficient β3 > 0 only provides slope change information: higher values of X2 correspond to a greater slope between Y and X1. On the other hand, β3 > 0 provides no information whatsoever regarding whether Y achieves its highest values for the highest values of X1 and X2 (Hartmann and Moers 1999). For example, in Figure 2a and Figure 2b the sign and magnitude of the interaction coefficient β3 are the same. However, for the range of X1 shown in Figure 2a, Y is higher when both predictors are high, while in Figure 2b Y is higher when X1 is high and X2 is low.⁸

Figure 2. Interaction coefficient does not provide information with respect to where the dependent variable is higher. In both (a) and (b), the sign and magnitude of the interaction is the same. In (a), Y is higher when both predictors are high, while in (b) Y is higher when X1 is high and X2 is low.
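The point can be reproduced numerically. The Python sketch below uses two hypothetical coefficient sets that share the same β3 = 0.5 but differ in β2, and evaluates the expected response at the four low/high combinations of the predictors. The coefficient values, the range 1 to 4, and the function names are assumptions chosen for illustration; they are not the values behind the paper's Figure 2.

```python
from itertools import product

def expected_y(alpha, beta1, beta2, beta3, x1, x2):
    """Expected response under the two-predictor interaction model."""
    return alpha + beta1 * x1 + beta2 * x2 + beta3 * x1 * x2

def best_corner(coefs, low=1.0, high=4.0):
    """Return the (x1, x2) low/high combination with the largest expected response."""
    corners = product((low, high), repeat=2)
    return max(corners, key=lambda c: expected_y(*coefs, *c))

# Same interaction coefficient (beta3 = 0.5), different main effect of X2.
model_a = (0.0, 1.0, 1.0, 0.5)    # alpha, beta1, beta2, beta3
model_b = (0.0, 1.0, -3.0, 0.5)

for name, coefs in (("a", model_a), ("b", model_b)):
    x1, x2 = best_corner(coefs)
    print(f"model {name}: highest E(Y) at X1 = {x1}, X2 = {x2}")
```

In the first model the response peaks when both predictors are high; in the second it peaks when X1 is high and X2 is low, even though the interaction coefficient is identical.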
