Chapter 18, Section 18.4

 

Treatment of Lead Exposed Children Trial (TLC)

Analysis of Response Profiles with Multiple Imputation of Missing Data

 

 

use "tlcmiss.dta"

 

mi set flong

 

mi misstable summarize, all

                                                               Obs<.

                                                +------------------------------

               |                                | Unique

      Variable |     Obs=.     Obs>.     Obs<.  | values        Min         Max

  -------------+--------------------------------+------------------------------

            id |                           100  |    100          1         100

           trt |                           100  |      2          0           1

            y1 |                           100  |     75       19.7        41.1

            y2 |        12                  88  |     75        2.8        40.8

            y3 |        19                  81  |     67          3        38.6

            y4 |        26                  74  |     63        4.1        63.9

  -----------------------------------------------------------------------------

 

mi register imputed trt y1 y2 y3 y4

(40 m=0 obs. now marked as incomplete)

 

mi set M = 50

(50 imputations added; M = 50)

 

mi impute mvn y1 y2 y3 y4 trtreplace rseed(57313131) /// 

     burnin(5000) burnbetween(500) prior(jeffreys)

 

note: variables y1 trt contain no soft missing (.) values; imputing nothing

 

Performing EM optimization:

  observed log posterior = -740.52173 at iteration 18

 

Performing MCMC data augmentation ... 

 

Multivariate imputation                 Imputations =       50

Multivariate normal regression                added =        0

Imputed: m=1 through m=50                   updated =       50

 

Prior: Jeffreys                          Iterations =    29500

                                            burn-in =     5000

                                            between =      500

 

               |              Observations per m              

               |----------------------------------------------

      Variable |   complete   incomplete   imputed |     total

---------------+-----------------------------------+----------

            y1 |        100            0         0 |       100

            y2 |         88           12        12 |       100

            y3 |         81           19        19 |       100

            y4 |         74           26        26 |       100

           trt |        100            0         0 |       100

--------------------------------------------------------------

(complete + incomplete = total; imputed is the minimum across m

 of the number of filled in observations.)

 

mi reshape long y, i(id) j(time)

 

reshaping m=0 data ...

(note: j = 1 2 3 4)

 

Data                               wide   ->   long

-----------------------------------------------------------------------------

Number of obs.                      100   ->     400

Number of variables                   6   ->       4

j variable (4 values)                     ->   time

xij variables:

                           y1 y2 ... y4   ->   y

-----------------------------------------------------------------------------

 

reshaping m=1 data ...

 

reshaping m=2 data ...

 

reshaping m=3 data ...

 

reshaping m=4 data ...

 

 

<output omitted> 

 

 

reshaping m=46 data ...

 

reshaping m=47 data ...

 

reshaping m=48 data ...

 

reshaping m=49 data ...

 

reshaping m=50 data ...

 

assembling results ...

 

mi xtset id time

       panel variable:  id (strongly balanced)

        time variable:  time, 1 to 4

                delta:  1 unit

 

mi estimatextmixed y i.trt##i.time || id:, noconstant ///

     residuals(unstructured,t(time))

 

Multiple-imputation estimates                   Imputations        =        50

Mixed-effects REML regression                   Number of obs      =       400

 

Group variable: id                              Number of groups   =       100

                                                Obs per group: min =         4

                                                               avg =       4.0

                                                               max =         4

 

                                                Average RVI        =    0.3763

DF adjustment:   Large sample                   DF:     min        =    215.58

                                                        avg        =  1.84e+53

                                                        max        =  1.67e+54

Model F test:       Equal FMI                   F(   7, 5622.8)    =     30.63

                                                Prob > F           =    0.0000

 

------------------------------------------------------------------------------

           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]

-------------+----------------------------------------------------------------

       1.trt |       .268   1.004504     0.27   0.790    -1.700791    2.236791

             |

        time |

          2  |  -1.618012   .8353496    -1.94   0.053     -3.25531    .0192857

          3  |  -2.246293   .7995542    -2.81   0.005    -3.813435   -.6791499

          4  |  -2.528538   .9334718    -2.71   0.007    -4.358145   -.6989314

             |

    trt#time |

        1 2  |  -11.25636   1.204945    -9.34   0.000     -13.6183   -8.894428

        1 3  |  -8.963268   1.176418    -7.62   0.000    -11.26967   -6.656871

        1 4  |  -2.930642   1.666068    -1.76   0.080    -6.208377    .3470938

             |

       _cons |     26.272   .7102914    36.99   0.000     24.87985    27.66415

------------------------------------------------------------------------------

 

------------------------------------------------------------------------------

  Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]

-----------------------------+------------------------------------------------

id:                  (empty) |

-----------------------------+------------------------------------------------

Residual: Unstructured       |

                      sd(e1) |   5.022519   .3587503      4.366381    5.777255

                      sd(e2) |   6.860414    .524996

                      sd(e3) |   6.679675   .5315782

                      sd(e4) |   7.998199   .7371925

                 corr(e1,e2) |   .5601737    .074164      .3979179    .6884483

                 corr(e1,e3) |   .5823588   .0704424      .4276799    .7039033

                 corr(e1,e4) |   .5881536   .0828851      .4016811    .7277961

                 corr(e2,e3) |   .8133749   .0388493      .7219197    .8769025

                 corr(e2,e4) |   .5036239   .1042274       .271888    .6801527

                 corr(e3,e4) |   .4468578   .1117488      .2027052    .6387043

------------------------------------------------------------------------------

Note: confidence intervals are not computable for some random-effects

      parameters because MI degrees of freedom cannot be estimated.

 

 

mi test 1.trt#2.time 1.trt#3.time 1.trt#4.time

note: assuming equal fractions of missing information

 

 ( 1)  [y]1.trt#2.time = 0

 ( 2)  [y]1.trt#3.time = 0

 ( 3)  [y]1.trt#4.time = 0

 

       F(  3,2052.6) =   23.60

            Prob > F =    0.0000

 

 

 

Clinical Trial of Contracepting Women

IPW-GEE Estimation of Marginal Logistic Regression Model

 

 

 

use "contracep.dta"

 

sort id time

 

. * Logistic Regression Model for Probability of Remaining in the Study

glm r i.time dose prevy 1.dose#c.prevy if time != 0, /// 

     family(binomial 1) link(logit)

 

Iteration 0:   log likelihood = -1210.4909  

Iteration 1:   log likelihood = -1206.5609  

Iteration 2:   log likelihood = -1206.5443  

Iteration 3:   log likelihood = -1206.5443  

 

Generalized linear models                          No. of obs      =      2902

Optimization     : ML                              Residual df     =      2896

                                                   Scale parameter =         1

Deviance         =  2413.088514                    (1/df) Deviance =  .8332488

Pearson          =  2902.658701                    (1/df) Pearson  =  1.002299

 

Variance functionV(u) = u*(1-u)                  [Bernoulli]

Link function    : g(u) = ln(u/(1-u))              [Logit]

 

                                                   AIC             =  .8356611

Log likelihood   = -1206.544257                    BIC             = -20677.17

 

------------------------------------------------------------------------------

             |                 OIM

           r |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]

-------------+----------------------------------------------------------------

        time |

          2  |   .1367107    .119009     1.15   0.251    -.0965427     .369964

          3  |   .7286143   .1438758     5.06   0.000     .4466228    1.010606

             |

        dose |    .067969   .1312701     0.52   0.605    -.1893157    .3252536

       prevy |  -.4513959   .1619033    -2.79   0.005    -.7687205   -.1340712

             |

dose#c.prevy |

          1  |  -.2381024   .2195724    -1.08   0.278    -.6684565    .1922516

             |

       _cons |   1.668112   .1043739    15.98   0.000     1.463543    1.872681

------------------------------------------------------------------------------

 

 

predict prob

(option mu assumed; predicted mean r)

(1151 missing values generated)

 

. * Calculation of Inverse Probability Weights (IPW)

gen cumprob=1

 

by id: replace cumprob=prob*cumprob[_n-1] if  _n != 1

(2902 real changes made)

 

gen ipw=1/cumprob

 

summ ipw if r == 1

 

    Variable |       Obs        Mean    Std. Dev.       Min        Max

-------------+--------------------------------------------------------

         ipw |      3616    1.273165    .2492687          1   2.064016

 

 

 

. * IPW-GEE Estimation of Marginal Logistic Regression Model for 

. * Odds of Amenorrhea

logit y c.dose##(c.time##c.time) [pw=ipw], cluster(id) 

 

Iteration 0:   log pseudolikelihood = -3045.0594  

Iteration 1:   log pseudolikelihood = -2855.0847  

Iteration 2:   log pseudolikelihood = -2852.8533  

Iteration 3:   log pseudolikelihood = -2852.8501  

Iteration 4:   log pseudolikelihood = -2852.8501  

 

Logistic regression                               Number of obs   =       3616

                                                  Wald chi2(5)    =     324.06

                                                  Prob > chi2     =     0.0000

Log pseudolikelihood = -2852.8501                 Pseudo R2       =     0.0631

 

                                  (Std. Err. adjusted for 1151 clusters in id)

------------------------------------------------------------------------------

             |               Robust

           y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]

-------------+----------------------------------------------------------------

        dose |   .1081157   .1492941     0.72   0.469    -.1844954    .4007268

        time |   .5379267    .133738     4.02   0.000     .2758051    .8000483

             |

      c.time#|

      c.time |  -.0036828   .0405379    -0.09   0.928    -.0831357    .0757701

             |

      c.dose#|

      c.time |   .4088951   .1909164     2.14   0.032     .0347059    .7830844

             |

      c.dose#|

      c.time#|

      c.time |  -.1263461   .0576979    -2.19   0.029    -.2394319   -.0132603

             |

       _cons |  -1.496422   .1075297   -13.92   0.000    -1.707177   -1.285668

------------------------------------------------------------------------------