Multilevel model
Regression analysis 

Models 

Estimation 
Background 
Multilevel models (also hierarchical linear models, nested models, mixed models, random coefficient, randomeffects models, random parameter models, or splitplot designs) are statistical models of parameters that vary at more than one level.^{1} These models can be seen as generalizations of linear models (in particular, linear regression), although they can also extend to nonlinear models. These models became much more popular after sufficient computing power and software became available.^{1}
Multilevel models are particularly appropriate for research designs where data for participants are organized at more than one level (i.e., nested data).^{2} The units of analysis are usually individuals (at a lower level) who are nested within contextual/aggregate units (at a higher level).^{3} While the lowest level of data in multilevel models is usually an individual, repeated measurements of individuals may also be examined.^{2} As such, multilevel models provide an alternative type of analysis for univariate or multivariate analysis of repeated measures. Individual differences in growth curves may be examined (see growth model).^{2} Furthermore, multilevel models can be used as an alternative to ANCOVA, where scores on the dependent variable are adjusted for covariates (i.e., individual differences) before testing treatment differences.^{4} Multilevel models are able to analyze these experiments without the assumptions of homogeneityofregression slopes that is required by ANCOVA.^{2}
Multilevel models can be used on data with many levels, although 2level models are the most common. The dependent variable must be examined at the lowest level of analysis.^{1}
Contents
Level 1 regression equation
Y_{ij} = β_{0j} + β_{1j}(X_{ij}) + e_{ij}
 Y_{ij} refers to the score on the dependent variable for an individual observation at Level 1 (subscript i refers to individual case, subscript j refers to the group).
 X_{ij} refers to the Level 1 predictor.
 β_{0j} refers to the intercept of the dependent variable in group j (Level 2).
 β_{1j} refers to the slope for the relationship in group j (Level 2) between the Level 1 predictor and the dependent variable.
 e_{ij} refers to the random errors of prediction for the Level 1 equation (it is also sometimes referred to as r_{ij}). At Level 1, both the intercepts and slopes in the groups can be either fixed (meaning that all groups have the same values, although in the real world this would be a rare occurrence), nonrandomly varying (meaning that the intercepts and/or slopes are predictable from an independent variable at Level 2), or randomly varying (meaning that the intercepts and/or slopes are different in the different groups, and that each have their own overall mean and variance).^{2}
Level 2 regression equation
The dependent variables are the intercepts and the slopes for the independent variables at Level 1 in the groups of Level 2.
β_{0j} = γ_{00} + γ_{01}W_{j} +u_{0j}
β_{1j} = γ_{10} + u_{1j}
 γ_{00} refers to the overall intercept. This is the grand mean of the scores on the dependent variable across all the groups when all the predictors are equal to 0.
 W_{j} refers to the Level 2 predictor.
 γ_{01} refers to the overall regression coefficient, or the slope, between the dependent variable and the Level 2 predictor.
 u_{0j} refers to the random error component for the deviation of the intercept of a group from the overall intercept.
 γ_{10} refers to the overall regression coefficient, or the slope, between the dependent variable and the Level 1 predictor.
 u_{1j} refers to the error component for the slope (meaning the deviation of the group slopes from the overall slope).^{2}
Types of models
Before conducting a multilevel model analysis, a researcher must decide on several aspects, including which predictors are to be included in the analysis, if any. Second, the researcher must decide whether parameter values (i.e., the elements that will be estimated) will be fixed or random.^{2}^{4} Fixed parameters are composed of a constant over all the groups, whereas a random parameter has a different value for each of the groups. Additionally, the researcher must decide whether to employ a maximum likelihood estimation or a restricted maximum likelihood estimation type.^{2}
Random intercepts model
A random intercepts model is a model in which intercepts are allowed to vary, and therefore, the scores on the dependent variable for each individual observation are predicted by the intercept that varies across groups.^{4}^{5} This model assumes that slopes are fixed (the same across different contexts). In addition, this model provides information about intraclass correlations, which are helpful in determining whether multilevel models are required in the first place.^{2}
Random slopes model
A random slopes model is a model in which slopes are allowed to vary, and therefore, the slopes are different across groups. This model assumes that intercepts are fixed (the same across different contexts).^{4}
Random intercepts and slopes model
A model that includes both random intercepts and random slopes is likely the most realistic type of model, although it is also the most complex. In this model, both intercepts and slopes are allowed to vary across groups, meaning that they are different in different contexts.^{4}
Developing a multilevel model
In order to conduct a multilevel model analysis, one would start with fixed coefficients (slopes and intercepts). One aspect would be allowed to vary at a time (that is, would be changed), and compared with the previous model in order to assess better model fit.^{1} Thus, there are three different questions that a researcher would ask in assessing a model. First, is it a good model? Second, is a more complex model better? Third, what contribution do individual predictors make to the model? In order to assess models, different model fit statistics would be examined.^{2} One such statistic is the chisquare likelihoodratio test, which assesses the difference between models. The chisquare test can be employed for model building in general, for examining what happens when effects in a model are allowed to vary, and when testing a dummycoded categorical variable as a single effect.^{2} However, this test can only be used when models are nested (meaning that a more complex model includes all of the effects of a simpler model). When testing nonnested models, comparisons between models can be made using Akaike's Information Criterion (AIC), Hurvich and Tsai's Criterion (AICC), Bozdogan's Criterion (CAIC) or Schwarz's Bayesian Criterion (BIC).^{1}^{2}^{4}
Assumptions
Multilevel models have the same assumptions as other major general linear models (e.g., ANOVA, regression), but some of the assumptions are modified for the hierarchical nature of the design (i.e., nested data).
Linearity
The assumption of linearity states that there is a rectilinear (straightline, as opposed to nonlinear or Ushaped) relationship between variables.^{6}
Normality
The assumption of normality states that the error terms at every level of the model are normally distributed.^{6}
Homoscedasticity
The assumption of homoscedasticity, also known as homogeneity of variance, assumes equality of population variances.^{6}
Independence of observations
Independence is an assumption of general linear models, which states that cases are random samples from the population and that scores on the dependent variable are independent of each other.^{6}
Modification of assumptions
The assumptions of linearity and normality do not pose any problems for multilevel models, and are thus retained. However, the assumptions of homoscedasticity and independence of observations must be adapted in order to proceed with this type of analysis. The latter two assumptions pose several problems. First, units of observations in the same group are more similar than those in different groups. Second, while groups are independent of each others, observations within a group share values on variables, and thus, they are not independent. However, one advantage of employing multilevel models over other types of analyses is that independence is not required, because it is violated at each level of the analysis.^{2} Additionally, multilevel models are designed to deal with this intraclass correlation, which assumes that data from the same context are more similar than data from different contexts. Multilevel modelling analysis serves to measure the variability within contexts.
Statistical tests
The type of statistical tests that are employed in multilevel models depend on whether one is examining fixed effects or variance components. When examining fixed effects, the tests are compared with the standard error of the fixed effect, which results in a Ztest.^{4} A ttest can also be computed. When computing a ttest, it is important to keep in mind the degrees of freedom, which will depend on the level of the predictor (e.g., level 1 predictor or level 2 predictor).^{4} For a level 1 predictor, the degrees of freedom are based on the number of level 1 predictors, the number of groups and the number of individual observations. For a level 2 predictor, the degrees of freedom are based on the number of level 2 predictors and the number of groups.^{4}
Statistical power
Statistical power for multilevel models differs depending on whether it is level 1 or level 2 effects that are being examined. Power for level 1 effects is dependent upon the number of individual observations, whereas the power for level 2 effects is dependent upon the number of groups.^{7} To conduct research with sufficient power, large sample sizes are required in multilevel models. However, the number of individual observations in groups is not as important as the number of groups in a study. In order to detect crosslevel interactions, given that the group sizes are not too small, recommendations have been made that at least 20 groups are needed.^{7} The issue of statistical power in multilevel models is complicated by the fact that power varies as a function of effect size and intraclass correlations, it differs for fixed effects versus random effects, and it changes depending on the number of groups and the number of individual observations per group.^{7}
Applications of multilevel models
Level
The concept of level is the keystone of this approach. In an educational research example, the levels might be:
 pupil
 class
 school
 district
The researcher must establish for each variable the level at which it was measured. In this example "test score" might be measured at pupil level, "teacher experience" at class level, "school funding" at school level, and "urban" at district level.
Example
As a simple example, consider a basic linear regression model that predicts income as a function of age, class, gender and race. It might then be observed that income levels also vary depending on the city and state of residence. A simple way to incorporate this into the regression model would be to add an additional independent categorical variable to account for the location (i.e. a set of additional binary predictors and associated regression coefficients, one per location). This would have the effect of shifting the mean income up or down — but it would still assume, for example, that the effect of race and gender on income is the same everywhere. In reality, this is unlikely to be the case — different local laws, different retirement policies, differences in level of racial prejudice, etc. are likely to cause all of the predictors to have different sorts of effects in different locales.
In other words, a simple linear regression model might, for example, predict that a given randomly sampled person in Seattle would have an average yearly income $10,000 higher than a similar person in Mobile, Alabama. However, it would also predict, for example, that a white person might have an average income $7,000 above a black person, and a 65yearold might have an income $3,000 below a 45yearold, in both cases regardless of location. A multilevel model, however, would allow for different regression coefficients for each predictor in each location. Essentially, it would assume that people in a given location have correlated incomes generated by a single set of regression coefficients, whereas people in another location have incomes generated by a different set of coefficients. Meanwhile, the coefficients themselves are assumed to be correlated and generated from a single set of hyperparameters. Additional levels are possible: For example, people might be grouped by cities, and the citylevel regression coefficients grouped by state, and the statelevel coefficients generated from a single hyperhyperparameter.
Multilevel models are a subclass of hierarchical Bayesian models, which are general models with multiple levels of random variables and arbitrary relationships among the different variables. Multilevel analysis has been extended to include multilevel structural equation modeling, multilevel latent class modeling, and other more general models.
Uses of multilevel models
Multilevel models have been used in education research or geographical research, to estimate separately the variance between pupils within the same school, and the variance between schools. In psychological applications, the multiple levels are items in an instrument, individuals, and families. In sociological applications, multilevel models are used to examine individuals embedded within regions or countries. In organizational psychology research, data from individuals must often be nested within teams or other functional units.
Different covariables may be relevant on different levels. They can be used for longitudinal studies, as with growth studies, to separate changes within one individual and differences between individuals.
Crosslevel interactions may also be of substantive interest; for example, when a slope is allowed to vary randomly, a level2 predictor may be included in the slope formula for the level1 covariate. For example, one may estimate the interaction of race and neighborhood so that an estimate of the interaction between an individual's characteristics and the context.
Applications to longitudinal (repeated measures) data
Alternative ways of analyzing hierarchical data
There are several alternative ways of analyzing hierarchical data, although most of them have some problems. First, traditional statistical techniques can be used. One could disaggregate higherorder variables to the individual level, and thus conduct an analysis on this individual level (for example, assign class variables to the individual level). The problem with this approach is that it would violate the assumption of independence, and thus could bias our results. This is known as atomistic fallacy.^{8} Another way to analyze the data using traditional statistical approaches is to aggregate individual level variables to higherorder variables and then to conduct an analysis on this higher level. The problem with this approach is that it discards all withingroup information (because it takes the average of the individual level variables). As much as 8090% of the variance could be wasted, and the relationship between aggregated variables is inflated, and thus distorted.^{9} This is known as ecological fallacy, and statistically, this type of analysis results in decreased power in addition to the loss of information.^{2}
Another way to analyze hierarchical data would be through a randomcoefficients model. This model assumes that each group has a different regression model  with its own intercept and slope.^{4} Because groups are sampled, the model assumes that the intercepts and slopes are also randomly sampled from a population of group intercepts and slopes. This allows for an analysis in which one can assume that slopes are fixed but intercepts are allowed to vary.^{4} However this presents a problem, as individual components are independent but group components are independent between groups, but dependent within groups. This also allows for an analysis in which the slopes are random; however, the correlations of the error terms (disturbances) are dependent on the values of the individuallevel variables.^{4} Thus, the problem with using a randomcoefficients model in order to analyze hierarchical data is that is still not possible to incorporate higher order variables.
Error terms
Multilevel models have two error terms, which are also known as disturbances. The individual components are all independent, but there are also group components, which are independent between groups but correlated within groups. However, variance components can differ, as some groups are more homogeneous than others.^{9}
See also
References
 ^ ^{a} ^{b} ^{c} ^{d} ^{e} Bryk, Stephen W. Raudenbush, Anthony S. (2002). Hierarchical linear models : applications and data analysis methods (2. ed., [3. Dr.] ed.). Thousand Oaks, CA [u.a.]: Sage Publications. ISBN 076191904X.
 ^ ^{a} ^{b} ^{c} ^{d} ^{e} ^{f} ^{g} ^{h} ^{i} ^{j} ^{k} ^{l} ^{m} ^{n} Fidell, Barbara G. Tabachnick, Linda S. (2007). Using multivariate statistics (5th ed. ed.). Boston ; Montreal: Pearson/A & B. ISBN 0205459382.
 ^ Luke, Douglas A. (2004). Multilevel modeling (3. repr. ed.). Thousand Oaks, CA: Sage. ISBN 9780761928799.
 ^ ^{a} ^{b} ^{c} ^{d} ^{e} ^{f} ^{g} ^{h} ^{i} ^{j} ^{k} ^{l} Applied multiple regression/correlation analysis for the behavioral sciences (3. ed. ed.). Mahwah, NJ [u.a.]: Erlbaum. ISBN 9780805822236.
 ^ editor, G. David Garson,. Hierarchical linear modeling : guide and applications. Thousand Oaks, Calif.: Sage Publications. ISBN 1412998859.
 ^ ^{a} ^{b} ^{c} ^{d} Salkind, Samuel B. Green, Neil J. (2004). Using SPSS for Windows and Macintosh : analyzing and understanding data (4th ed. ed.). Upper Saddle River, NJ: Pearson Education. ISBN 013146597X.
 ^ ^{a} ^{b} ^{c} Leeuw, Ita Kreft, Jan de (1998). Introducing multilevel modeling. (Repr. ed.). London: Sage Publications Ltd. ISBN 0761951415.
 ^ Hox, Joop (2002). Multilevel analysis : techniques and applications (Reprint. ed.). Mahwah, NJ [u.a.]: Erlbaum. ISBN 080583219X.
 ^ ^{a} ^{b} Bryk, Anthony S.; Raudenbush, Stephen W. (1 January 1988). "Heterogeneity of variance in experimental studies: A challenge to conventional interpretations.". Psychological Bulletin 104 (3): 396–404. doi:10.1037/00332909.104.3.396.
Books
 Goldstein, H. (2011). Multilevel Statistical Models. 4th ed. London: Wiley.
 Hox, J. J. (2010). Multilevel analysis: Techniques and applications. 2nd ed. Hogrefe and Huber.
 Raudenbush, S. W. and Bryk, A. S. (2002). Hierarchical linear models: Applications and data analysis methods. 2nd ed. Thousand Oaks, CA: Sage.
 Snijders, T. A. B. and Bosker, R. J. (2011). Multilevel Analysis: an Introduction to Basic and Advanced Multilevel Modeling. 2nd ed. London: Sage.
External links
Software
Specialised multilevel statistical software
Standard statistical software
 R
 SAS (software) using the MIXED procedure
 SPSS using the Advanced Statistics product
 Stata
 gllamm user written command
 runmlwin user written command (to run MLwiN from within Stata)
 xtmixed command
 xtmelogit command
 xtmepoisson command
 WinBUGS
 JAGS
 STAN
Other resources
Return to Fuhz Home  This article covering Hierarchical linear modeling is enhanced for the visually impaired.
The text of this Fuhz article is released under the GNU Free Documentation License