Structural Equation Modeling (SEM)

    Structural equations comprehensively represent the complex multidimensional relations among research variables in a theory. Structural equation modeling (SEM) is a class of multivariate statistical techniques used to examine the underlying relationships, or structure, among variables in a model. SEM allows the researcher to model, test, and reduce hypothesized relationships among a set of observed variables. It seeks to represent hypotheses about the means, variances, and covariances of observed data in terms of parameters defined by a hypothesized underlying model.

    A structural equation model implies a structure for the variance-covariance matrix of the measures. Researchers test whether variables are interrelated through a set of linear relationships by examining the variances and covariances of the variables. SEM helps answer questions about whether sample data are consistent with the hypothesized model.
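    As a minimal sketch of what it means for a model to imply a covariance structure, the Python snippet below computes the model-implied covariance matrix for a hypothetical one-factor model with three indicators. The function name and all numeric values (loadings, factor variance, error variances) are made up for illustration and are not taken from any study.

```python
import numpy as np

def implied_covariance(loadings, factor_cov, error_var):
    """Model-implied covariance: Sigma = Lambda @ Phi @ Lambda.T + Theta."""
    lam = np.asarray(loadings, dtype=float)               # indicators x factors
    phi = np.asarray(factor_cov, dtype=float)             # factors x factors
    theta = np.diag(np.asarray(error_var, dtype=float))   # uncorrelated errors
    return lam @ phi @ lam.T + theta

# Hypothetical parameter values, chosen only for illustration.
loadings = [[1.0], [0.8], [0.6]]   # first loading fixed to 1 to set the scale
factor_cov = [[2.0]]               # variance of the single latent factor
error_var = [0.5, 0.7, 0.9]        # unique (error) variance of each indicator

sigma = implied_covariance(loadings, factor_cov, error_var)
print(np.round(sigma, 2))
```

    Fitting a model amounts to choosing parameter values so that this implied matrix is as close as possible to the covariance matrix observed in the sample.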

    SEM can clearly summarize results from studies that generate a large number of interrelated measures. A variable can be treated as both an independent variable and a dependent variable. SEM allows examination of a set of relationships between one or more independent variables, either continuous or discrete, and one or more dependent variables, either continuous or discrete. Independent variables are called exogenous or upstream variables; dependent or mediating variables are called endogenous or downstream variables.

    SEM deals with both observed and latent variables. An observed or manifest variable is a variable that can be observed directly and is measurable. A latent or unobserved variable is a variable that cannot be observed directly (such as intelligence or attitude) and must be inferred from measured variables. Latent variables (or factors) are implied by the covariances among two or more measured variables. In SEM, the focus is usually on latent variables rather than on the observed variables used to measure these constructs. SEM allows multiple measures to be associated with a single latent construct.

    SEM is a hybrid of multiple regression and factor analysis techniques, belonging to the general linear model family. SEM analyzes relationships among latent variables by combining the strengths of factor analysis and multiple regression into a single model that can be tested statistically. Like multiple regression, this model allows for the evaluation of direct and indirect effects of variables in a model. Unlike multiple regression, SEM allows all variables to be examined simultaneously, testing an entire hypothesized multivariate model. SEM allows simultaneous assessment of the strength and direction of the interrelationships among multiple dependent and independent variables, examining the direct and indirect effects of one variable upon another.

    Structural equation modeling encompasses such diverse statistical techniques as path analysis, confirmatory factor analysis, causal modeling with latent variables, and even analysis of variance and multiple linear regression. Major applications of SEM include:

    - causal modeling or path analysis
    - confirmatory factor analysis
    - second order factor analysis
    - regression models
    - covariance structure models
    - correlation structure models

    Most structural equation models can be expressed as path diagrams. Path diagrams are similar to flowcharts, with lines, arrows, and geometric figures. They show the way observed and unobserved variables are interrelated, as well as which variables are hypothesized to cause changes in other variables. Ovals or circles represent latent variables, while rectangles or squares represent measured variables. Residuals are unobserved, so they are also represented by ovals or circles. Correlations and covariances are represented by bidirectional arrows, which represent relationships without an explicitly defined causal direction.

    [Figure not reproduced: an example of a path diagram used in structural equation modeling.]

    Path analysis is commonly used to evaluate direct and indirect associations among observed variables. SEM goes beyond the information provided by path analysis by allowing a more precise estimation of the indirect effects of independent variables on all dependent variables. SEM allows researchers to test theories and assumptions directly by specifying which variables are related to other variables. That is, the researcher can test some paths (or relationships) but not others in the analysis.
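    To make the idea of direct and indirect effects concrete, here is a small Python sketch for the simplest mediation path model, X -> M -> Y with an additional direct path X -> Y. It assumes a linear model in which the indirect effect is the product of the two path coefficients along the mediated route. The function name and the coefficient values are hypothetical.

```python
def decompose_effects(a, b, direct):
    """Split the effect of X on Y in a linear path model X -> M -> Y.

    a      : path coefficient for X -> M
    b      : path coefficient for M -> Y (controlling for X)
    direct : path coefficient for X -> Y (controlling for M)
    """
    indirect = a * b  # effect of X on Y that is transmitted through M
    return {"direct": direct, "indirect": indirect, "total": direct + indirect}

# Hypothetical standardized path coefficients, for illustration only.
effects = decompose_effects(a=0.5, b=0.4, direct=0.3)
print(effects)
```

    In larger models the same logic applies along every chain of arrows in the path diagram: each indirect route contributes the product of its coefficients, and the total effect is the sum over all routes.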

    SEM has distinct advantages over an ordinary least squares multiple regression approach. SEM tests both conceptual and measurement models simultaneously, tests latent variable structure, allows for multiple measures of independent variables, and adjusts for measurement error, using the measurement model to identify the errors of measurement. With SEM, there is no assumption that the observed variables are measured without error. Thus, SEM is more flexible and realistic in that it allows for measurement error and does not require perfect reliability. SEM allows researchers to examine relationships among latent variables with multiple observed variables. The researcher can evaluate the real-world scenario of observed variables' simultaneous impact on one another, without having to make artificial decisions about blocking or order of entry. The relationships among latent variables are purged of measurement error, leading to more accurate and often stronger relationships between latent variables than would be observed using multivariate methods that consider observed variables only (such as MANOVA or multiple regression).
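    The claim that relationships purged of measurement error are often stronger can be illustrated with the classical correction for attenuation. This is a simpler stand-in from classical test theory, not what SEM software actually computes, and the reliabilities and correlation below are hypothetical values.

```python
import math

def attenuated_correlation(true_r, reliability_x, reliability_y):
    """Correlation expected between two error-laden observed scores."""
    return true_r * math.sqrt(reliability_x * reliability_y)

def disattenuated_correlation(observed_r, reliability_x, reliability_y):
    """Estimate of the correlation between the underlying true scores."""
    return observed_r / math.sqrt(reliability_x * reliability_y)

# Hypothetical values: a latent correlation of .60 measured with
# instruments whose reliabilities are .80 and .70.
observed = attenuated_correlation(0.60, 0.80, 0.70)
recovered = disattenuated_correlation(observed, 0.80, 0.70)
print(round(observed, 3), round(recovered, 3))
```

    The observed correlation is weaker than the latent one whenever either reliability is below 1, which is why analyses restricted to observed variables tend to understate relationships among constructs.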

    SEM is suited to theory testing rather than theory development. The researcher first specifies a model based on theory, then determines how to measure the constructs of interest (i.e., how to operationalize them with a reliable and valid measurement instrument), collects data, and enters the data into the SEM software package. The package fits the specified model to the data and produces the results, which include overall model fit statistics and parameter estimates. The researcher then modifies the model if necessary. All SEM analyses follow a logical sequence of five steps: model specification, model identification, model estimation, model testing, and model modification.
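    The identification step can be partly checked by counting. A minimal Python sketch, assuming a covariance-only model: the number of distinct variances and covariances among p observed variables is p(p + 1)/2, and the model's degrees of freedom are that count minus the number of freely estimated parameters. The function name is hypothetical, and a non-negative result is a necessary condition for identification, not a sufficient one.

```python
def sem_degrees_of_freedom(n_observed, n_free_params):
    """Degrees of freedom for a covariance-structure model (counting rule)."""
    n_moments = n_observed * (n_observed + 1) // 2  # distinct variances + covariances
    return n_moments - n_free_params

# Hypothetical one-factor model with 4 indicators: 3 free loadings
# (one fixed to 1), 1 factor variance, and 4 error variances = 8 parameters.
df_four = sem_degrees_of_freedom(n_observed=4, n_free_params=8)

# The same model with 3 indicators has 2 + 1 + 3 = 6 free parameters.
df_three = sem_degrees_of_freedom(n_observed=3, n_free_params=6)

print(df_four, df_three)
```

    A model with zero degrees of freedom reproduces the sample covariances exactly and so cannot be tested for fit; a negative value means the model asks for more parameters than the data can supply.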

    SEM is sometimes referred to as causal modeling. It allows for assessment of indirect causal paths to outcomes, as well as the testing of alternative models. SEM is used to test the reasonableness of alternative hypotheses regarding the causal relationships between various measures, and their relationships to underlying dimensions or latent variables. SEM can be used to analyze causal models involving latent variables. With SEM, researchers can pose complex models that evaluate the direct and indirect impact of several variables on one or more outcome variables. These complex models can predict various types of outcomes. The researcher must keep in mind that correlation is not causation, even if the correlation is complex and multivariate. What causal modeling does allow is examination of the extent to which data agree or fail to agree with a model of causality.

    SEM models have two basic elements: a measurement model and a structural model. The measurement model describes the indicators (observed measures) of the latent variables, specifying how the latent variables or hypothetical constructs are measured in terms of the observed variables. It also describes the measurement properties (the validities and reliabilities) of the observed variables. This corresponds to a confirmatory factor analysis, in which a measurement model is tested. The structural model delineates the direct and indirect effects among the latent variables. SEM also gives the researcher more options for estimation, including the most commonly used maximum likelihood methods, along with numerous statistical indices for evaluating model fit.
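    For readers curious about what maximum likelihood estimation minimizes, the sketch below codes the standard ML discrepancy function for covariance structures, F = ln|Sigma| + tr(S Sigma^-1) - ln|S| - p, where S is the sample covariance matrix and Sigma is the model-implied one. The matrices are hypothetical, and real SEM software searches over parameter values rather than evaluating a single candidate as done here.

```python
import numpy as np

def ml_discrepancy(sample_cov, implied_cov):
    """ML fit function: zero when the implied matrix equals the sample matrix."""
    s = np.asarray(sample_cov, dtype=float)
    sigma = np.asarray(implied_cov, dtype=float)
    p = s.shape[0]
    _, logdet_sigma = np.linalg.slogdet(sigma)
    _, logdet_s = np.linalg.slogdet(s)
    return logdet_sigma + np.trace(s @ np.linalg.inv(sigma)) - logdet_s - p

# Hypothetical sample covariance matrix for three observed variables.
S = np.array([[2.50, 1.60, 1.20],
              [1.60, 1.98, 0.96],
              [1.20, 0.96, 1.62]])

# Candidate 1: a model that reproduces S exactly.
f_perfect = ml_discrepancy(S, S)

# Candidate 2: an "independence" model that keeps only the variances.
f_independence = ml_discrepancy(S, np.diag(np.diag(S)))

print(round(f_perfect, 6), round(f_independence, 4))
```

    Multiplying the minimized value by (N - 1), where N is the sample size, gives the model chi-square statistic reported by most SEM programs.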

    When there is evidence of an adequate fit of the data to the hypothesized measurement model, the theoretical causal model is tested by structural equation modeling. The structural equation model specifies the causal relationships among the latent variables and describes the underlying effects and the amount of unexplained variance. In this part of the analysis, SEM yields information about the hypothesized causal parameters, that is, the path coefficients, which are presented as beta weights. The coefficients indicate the expected amount of change in the latent endogenous variable that is caused by a change in the latent causal variable. SEM programs provide information on the significance of individual paths. The residual terms (the amount of unexplained variance for the latent endogenous variables) can also be calculated from the SEM analysis. The overall fit of the causal model to the research data can be tested by means of several alternative statistics. Two such statistics are the goodness-of-fit index (GFI) and the adjusted goodness-of-fit index (AGFI). For both indexes, a value of .90 or greater is conventionally taken to indicate a good fit of the model to the data.
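    As a rough illustration of how the two fit indexes are obtained, the Python sketch below uses the commonly cited formulas for maximum likelihood estimation: GFI = 1 - tr[(Sigma^-1 S - I)^2] / tr[(Sigma^-1 S)^2], and AGFI = 1 - [p(p + 1) / (2 df)] (1 - GFI). The function names and the data are hypothetical, and individual software packages may differ in detail.

```python
import numpy as np

def gfi_ml(sample_cov, implied_cov):
    """Goodness-of-fit index under maximum likelihood estimation."""
    s = np.asarray(sample_cov, dtype=float)
    sigma = np.asarray(implied_cov, dtype=float)
    ratio = np.linalg.solve(sigma, s)        # Sigma^-1 @ S
    resid = ratio - np.eye(s.shape[0])
    return 1.0 - np.trace(resid @ resid) / np.trace(ratio @ ratio)

def agfi(gfi, n_observed, df):
    """GFI adjusted for model complexity via the degrees of freedom."""
    n_moments = n_observed * (n_observed + 1) / 2
    return 1.0 - (n_moments / df) * (1.0 - gfi)

# Hypothetical sample covariance matrix and a deliberately poor model
# (independence: variances only, 3 free parameters, so df = 6 - 3 = 3).
S = np.array([[2.50, 1.60, 1.20],
              [1.60, 1.98, 0.96],
              [1.20, 0.96, 1.62]])
poor_model = np.diag(np.diag(S))

gfi_poor = gfi_ml(S, poor_model)
agfi_poor = agfi(gfi_poor, n_observed=3, df=3)
print(round(gfi_poor, 3), round(agfi_poor, 3))
```

    Both values fall well below .90 here because the independence model ignores the substantial covariances in the data; a model that reproduced the sample matrix exactly would give a GFI of 1.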

    Three key requirements of SEM are as follows: thorough knowledge of the theory; adequate assessment of statistical criteria; and parsimony (the ability to predict the greatest amount of variance in the outcome variable or variables using the smallest number of predictor variables).

    Assumptions: Any analysis in SEM assumes that the model has been specified correctly, that the sample size is sufficiently large (e.g., N > 200), that the observations are independent, that the data follow a multivariate normal distribution, that the relationships among the observed variables are linear, and that no observed variables are very highly correlated with one another (e.g., r > .90). Examining residuals after a model has been reduced helps the researcher to determine the extent to which the errors in prediction are distributed normally within acceptable ranges.
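    Two of these assumptions, sample size and the absence of very highly correlated variables, can be screened before any model is fitted. The Python sketch below is one hypothetical way to do so, using the rule-of-thumb thresholds quoted above as defaults. The variable names and data are invented, and the other assumptions (normality, linearity, independence) need separate checks.

```python
import math
from itertools import combinations

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def screen_data(data, min_n=200, max_abs_r=0.90):
    """Flag a small sample and any pair of variables with |r| above the cutoff."""
    n = len(next(iter(data.values())))
    flagged = []
    for name_a, name_b in combinations(data, 2):
        r = pearson_r(data[name_a], data[name_b])
        if abs(r) > max_abs_r:
            flagged.append((name_a, name_b, round(r, 3)))
    return {"n": n, "n_large_enough": n > min_n, "highly_correlated": flagged}

# Tiny made-up data set; y is nearly a multiple of x, so that pair is flagged.
data = {
    "x": [1, 2, 3, 4, 5],
    "y": [2.1, 3.9, 6.2, 8.0, 9.9],
    "z": [5, 3, 4, 1, 2],
}
report = screen_data(data)
print(report)
```

    When a pair is flagged, a common remedy is to drop one of the two variables or to combine them into a single measure before fitting the model.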

    Interpreting the results of SEM depends on the quality of the measured data and the generalizability of the sample. SEM allows the researcher to evaluate the importance of each independent variable in the model and to test the overall fit of the model to the data. A good fit of the specified measurement or structural model to the observed data indicates that the model is consistent with the relationships within the observed data. The researcher asks, "Does it fit well enough to usefully approximate reality and to furnish a reasonable explanation of the data trends?" Once the researcher obtains a model that fits well, is theoretically consistent, and provides statistically significant parameter estimates, the researcher must interpret it in the light of the research questions and then distill the results in written form for publication. The fact that the model fits the data does not necessarily imply that the model is the correct one. There may be other equivalent models that fit the data equally well. There may also be non-equivalent alternative models that fit the data better than this model. Researchers should strive to test and rule out likely alternative models whenever possible.

    SPSS does not have a built-in structural equation modeling module, but it does support an add-on called AMOS (Analysis of Moment Structures). LISREL (Linear Structural Relations) and EQS are other statistical packages for doing SEM.

    Research example: Structural equation modeling being used to examine the hypothesized causal and correlational links among racism, chronic stress emotions, and blood pressure.


    Last edit by VickyRN on May 8, '09