    WGCI - Seminar
    Sequential Regression – A Method for Multiple Imputations of Missing Data
    Seminar of the Working Group on Composite Indices
    27 March 2002
    Tanja Srebotnjak, United Nations Statistics Division, 2002
    A multivariate technique for multiply imputing missing data
    Uses a sequence of regression models
    Developed by T. E. Raghunathan, J. M. Lepkowski, Peter Solenberger and John Van Hoewyk, [University of Michigan]
    Procedure:
    Assume a dataset of dimension (n × p) with item non-response/missingness
    Partition the dataset into n1 variables with no missing observations, say X=(X1,X2,…,Xn1), and (n-n1) variables with missing values, say Y=(Y1,Y2,…,Yn-n1)
    Y is ordered by degree of missingness, from least to most
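The partitioning and ordering step can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `partition_by_missingness`, the toy column names, and the use of `None` to mark a missing value are all assumptions made for the example.

```python
def partition_by_missingness(data):
    """Split columns into complete ones (X) and incomplete ones (Y).

    `data` maps a column name to a list of values, with None marking a
    missing entry (an assumption of this sketch). Incomplete columns are
    returned ordered from least to most missing.
    """
    n_missing = {name: sum(v is None for v in col) for name, col in data.items()}
    complete = [name for name, k in n_missing.items() if k == 0]
    incomplete = sorted(
        (name for name, k in n_missing.items() if k > 0),
        key=lambda name: n_missing[name],
    )
    return complete, incomplete


# Hypothetical toy dataset: two complete columns, two with gaps.
data = {
    "x1": [1.0, 2.0, 3.0, 4.0],
    "a": [None, None, 0.5, None],
    "x2": [0.3, 0.1, 0.4, 0.2],
    "b": [2.0, None, 1.0, 3.0],
}
complete, incomplete = partition_by_missingness(data)
print(complete)    # columns playing the role of X
print(incomplete)  # columns playing the role of Y, least missing first
```

Here `b` (one missing value) is placed before `a` (three missing values), which corresponds to the ordering Y1, Y2, … used in the procedure.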
    Procedure (contd.):
    Then the conditional distribution of each Yi, i=1,2,…,n-n1, given the observed values is modeled as a regression of Yi on X, e.g. Yi = Xβ + e, so that E(Yi|X) = Xβ
    Missing values are imputed using this model
    Once Y1 is imputed, it is used as a predictor for Y2, i.e. X=(X1,X2,…,Xn1,Y1)
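A first pass through the sequence of regressions might look as follows. This is a simplified sketch under stated assumptions: it uses ordinary least squares for every Yi, marks missing values with NaN, and adds normally distributed residual noise to the predictions without drawing the regression parameters themselves. The names `impute_one` and `first_pass` and the simulated data are hypothetical.

```python
import numpy as np


def impute_one(predictors, y, rng):
    """Fit OLS of y on the predictors using rows where y is observed,
    then fill the missing entries with prediction plus residual noise."""
    miss = np.isnan(y)
    A = np.column_stack([np.ones(len(y)), predictors])  # add an intercept
    beta, *_ = np.linalg.lstsq(A[~miss], y[~miss], rcond=None)
    resid = y[~miss] - A[~miss] @ beta
    dof = max(int((~miss).sum()) - A.shape[1], 1)
    sigma = np.sqrt(resid @ resid / dof)
    filled = y.copy()
    filled[miss] = A[miss] @ beta + rng.normal(0.0, sigma, int(miss.sum()))
    return filled


def first_pass(X, Y, rng):
    """Impute the columns of Y in order (assumed sorted from least to
    most missing); each completed column joins the predictor set."""
    predictors = X.copy()
    completed = []
    for j in range(Y.shape[1]):
        filled = impute_one(predictors, Y[:, j], rng)
        completed.append(filled)
        predictors = np.column_stack([predictors, filled])
    return np.column_stack(completed)


# Simulated example data (purely illustrative).
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 2))
Y_obs = np.column_stack([
    X @ np.array([1.0, -1.0]) + rng.normal(scale=0.1, size=60),
    X @ np.array([0.5, 2.0]) + rng.normal(scale=0.1, size=60),
])
Y_obs[:5, 0] = np.nan    # Y1: fewest missing values
Y_obs[40:, 1] = np.nan   # Y2: more missing values
completed = first_pass(X, Y_obs, rng)
```

Observed entries are left untouched; only the NaN positions are replaced.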
    Procedure (contd.):
    The algorithm continues cycling through this series of regression models, using updated predictor sets, until X=(X1,X2,…,Xn1,Y1,…,Yn-n1)
    A new round then begins, using all other variables in the completed dataset as predictors for Y1 again, thus updating the regression coefficients
    The algorithm is repeated until the regression coefficients converge, i.e. until their change between rounds falls below a specified tolerance
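The cycling and the stopping rule can be illustrated with the sketch below. To make the convergence check deterministic, it imputes the conditional mean (the fitted value) rather than a random draw, which is a simplification of this example and not part of the method as presented. The function name `cycle_until_converged`, the tolerance `tol`, the cap `max_rounds`, and the simulated data are assumptions.

```python
import numpy as np


def cycle_until_converged(X, Y, tol=1e-6, max_rounds=200):
    """Cycle through OLS regressions of each Y column on X and the other
    Y columns until the coefficients change by less than `tol`."""
    n, k = Y.shape
    missing = np.isnan(Y)
    filled = Y.copy()
    previous = None
    for round_no in range(1, max_rounds + 1):
        betas = []
        for j in range(k):
            if round_no == 1:
                others = filled[:, :j]  # only the columns imputed so far
            else:
                others = np.delete(filled, j, axis=1)  # all other Y columns
            A = np.column_stack([np.ones(n), X, others])
            obs = ~missing[:, j]
            beta, *_ = np.linalg.lstsq(A[obs], Y[obs, j], rcond=None)
            # Re-impute only the entries that were missing originally.
            filled[missing[:, j], j] = A[missing[:, j]] @ beta
            betas.append(beta)
        current = np.concatenate(betas)
        # Round 1 uses smaller predictor sets, so comparisons start at round 3.
        if round_no > 2 and np.max(np.abs(current - previous)) < tol:
            return filled, round_no
        previous = current
    return filled, max_rounds


# Simulated example data (purely illustrative).
rng = np.random.default_rng(1)
X = rng.normal(size=(80, 2))
Y_obs = np.column_stack([
    X @ np.array([1.0, -1.0]) + rng.normal(scale=0.2, size=80),
    X @ np.array([0.5, 2.0]) + rng.normal(scale=0.2, size=80),
])
Y_obs[:10, 0] = np.nan   # Y1: fewer missing values
Y_obs[5:30, 1] = np.nan  # Y2: more missing values, overlapping with Y1
completed, rounds = cycle_until_converged(X, Y_obs)
```

The cap `max_rounds` guards against the case where the coefficient changes never fall below the tolerance.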
    Procedure (contd.):
    Finally, the missing values for each Yi are imputed using the corresponding converged regression model
    In order to yield multiple imputations, the complete algorithm is repeated m times, resulting in m completed datasets
    The m datasets are analyzed and the results combined to yield final parameter estimates
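    Most common regression model types can be fitted, chosen to suit each Yi (e.g. linear for continuous variables, logistic for binary variables)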
    Stepwise regression is possible, to ensure that only the most important predictors enter each model
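The slides do not say how the results from the m completed datasets are combined. A common choice is Rubin's combining rules, sketched below on the assumption that each analysis yields a point estimate and its variance. The function name `combine_rubin` and the numbers in the example are hypothetical.

```python
from statistics import mean, variance


def combine_rubin(estimates, variances):
    """Pool m point estimates and their variances with Rubin's rules."""
    m = len(estimates)
    q_bar = mean(estimates)      # pooled point estimate
    within = mean(variances)     # average within-imputation variance
    between = variance(estimates)  # between-imputation variance (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return q_bar, total


# Hypothetical results from m = 3 completed datasets.
estimates = [1.0, 1.2, 0.8]
variances = [0.04, 0.05, 0.03]
pooled, total_var = combine_rubin(estimates, variances)
print(pooled, total_var)
```

The total variance exceeds the average within-imputation variance whenever the m estimates differ, reflecting the extra uncertainty due to the missing data.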


