Chapter 18, Section 18.4

Treatment of Lead Exposed Children Trial (TLC)

Analysis of Response Profiles with Multiple Imputation of Missing Data

. use "tlcmiss.dta"

. mi set flong

. mi misstable summarize, all

Obs<.

+------------------------------

| | Unique

Variable | Obs=. Obs>. Obs<. | values Min Max

-------------+--------------------------------+------------------------------

id | 100 | 100 1 100

trt | 100 | 2 0 1

y1 | 100 | 75 19.7 41.1

y2 | 12 88 | 75 2.8 40.8

y3 | 19 81 | 67 3 38.6

y4 | 26 74 | 63 4.1 63.9

-----------------------------------------------------------------------------

. mi register imputed trt y1 y2 y3 y4

(40 m=0 obs. now marked as incomplete)

. mi set M = 50

(50 imputations added; M = 50)

. mi impute mvn y1 y2 y3 y4 trt, replace rseed(57313131) ///

burnin(5000) burnbetween(500) prior(jeffreys)

note: variables y1 trt contain no soft missing (.) values; imputing nothing

Performing EM optimization:

observed log posterior = -740.52173 at iteration 18

Performing MCMC data augmentation ...

Multivariate imputation Imputations = 50

Multivariate normal regression added = 0

Imputed: m=1 through m=50 updated = 50

Prior: Jeffreys Iterations = 29500

burn-in = 5000

between = 500

| Observations per m

|----------------------------------------------

Variable | complete incomplete imputed | total

---------------+-----------------------------------+----------

y1 | 100 0 0 | 100

y2 | 88 12 12 | 100

y3 | 81 19 19 | 100

y4 | 74 26 26 | 100

trt | 100 0 0 | 100

--------------------------------------------------------------

(complete + incomplete = total; imputed is the minimum across m

of the number of filled in observations.)

. mi reshape long y, i(id) j(time)

reshaping m=0 data ...

(note: j = 1 2 3 4)

Data wide -> long

-----------------------------------------------------------------------------

Number of obs. 100 -> 400

Number of variables 6 -> 4

j variable (4 values) -> time

xij variables:

y1 y2 ... y4 -> y

-----------------------------------------------------------------------------

reshaping m=1 data ...

reshaping m=2 data ...

reshaping m=3 data ...

reshaping m=4 data ...

reshaping m=46 data ...

reshaping m=47 data ...

reshaping m=48 data ...

reshaping m=49 data ...

reshaping m=50 data ...

assembling results ...

. mi xtset id time

panel variable: id (strongly balanced)

time variable: time, 1 to 4

delta: 1 unit

. mi estimate: xtmixed y i.trt##i.time || id:, noconstant ///

residuals(unstructured,t(time))

Multiple-imputation estimates Imputations = 50

Mixed-effects REML regression Number of obs = 400

Group variable: id Number of groups = 100

Obs per group: min = 4

avg = 4.0

max = 4

Average RVI = 0.3763

DF adjustment: Large sample DF: min = 215.58

avg = 1.84e+53

max = 1.67e+54

Model F test: Equal FMI F( 7, 5622.8) = 30.63

Prob > F = 0.0000

------------------------------------------------------------------------------

y | Coef. Std. Err. t P>|t| [95% Conf. Interval]

-------------+----------------------------------------------------------------

1.trt | .268 1.004504 0.27 0.790 -1.700791 2.236791

time |

2 | -1.618012 .8353496 -1.94 0.053 -3.25531 .0192857

3 | -2.246293 .7995542 -2.81 0.005 -3.813435 -.6791499

4 | -2.528538 .9334718 -2.71 0.007 -4.358145 -.6989314

trt#time |

1 2 | -11.25636 1.204945 -9.34 0.000 -13.6183 -8.894428

1 3 | -8.963268 1.176418 -7.62 0.000 -11.26967 -6.656871

1 4 | -2.930642 1.666068 -1.76 0.080 -6.208377 .3470938

_cons | 26.272 .7102914 36.99 0.000 24.87985 27.66415

------------------------------------------------------------------------------

Random-effects Parameters | Estimate Std. Err. [95% Conf. Interval]

-----------------------------+------------------------------------------------

id: (empty) |

-----------------------------+------------------------------------------------

Residual: Unstructured |

sd(e1) | 5.022519 .3587503 4.366381 5.777255

sd(e2) | 6.860414 .524996

sd(e3) | 6.679675 .5315782

sd(e4) | 7.998199 .7371925

corr(e1,e2) | .5601737 .074164 .3979179 .6884483

corr(e1,e3) | .5823588 .0704424 .4276799 .7039033

corr(e1,e4) | .5881536 .0828851 .4016811 .7277961

corr(e2,e3) | .8133749 .0388493 .7219197 .8769025

corr(e2,e4) | .5036239 .1042274 .271888 .6801527

corr(e3,e4) | .4468578 .1117488 .2027052 .6387043

------------------------------------------------------------------------------

Note: confidence intervals are not computable for some random-effects

parameters because MI degrees of freedom cannot be estimated.

. mi test 1.trt#2.time 1.trt#3.time 1.trt#4.time

note: assuming equal fractions of missing information

( 1) [y]1.trt#2.time = 0

( 2) [y]1.trt#3.time = 0

( 3) [y]1.trt#4.time = 0

F( 3,2052.6) = 23.60

Prob > F = 0.0000

Clinical Trial of Contracepting Women

IPW-GEE Estimation of Marginal Logistic Regression Model

. use "contracep.dta"

. sort id time

. * Logistic Regression Model for Probability of Remaining in the Study

. glm r i.time dose prevy 1.dose#c.prevy if time != 0, ///

family(binomial 1) link(logit)

Iteration 0: log likelihood = -1210.4909

Iteration 1: log likelihood = -1206.5609

Iteration 2: log likelihood = -1206.5443

Iteration 3: log likelihood = -1206.5443

Generalized linear models No. of obs = 2902

Optimization : ML Residual df = 2896

Scale parameter = 1

Deviance = 2413.088514 (1/df) Deviance = .8332488

Pearson = 2902.658701 (1/df) Pearson = 1.002299

Variance function: V(u) = u*(1-u) [Bernoulli]

Link function : g(u) = ln(u/(1-u)) [Logit]

AIC = .8356611

Log likelihood = -1206.544257 BIC = -20677.17

------------------------------------------------------------------------------

| OIM

r | Coef. Std. Err. z P>|z| [95% Conf. Interval]

-------------+----------------------------------------------------------------

time |

2 | .1367107 .119009 1.15 0.251 -.0965427 .369964

3 | .7286143 .1438758 5.06 0.000 .4466228 1.010606

dose | .067969 .1312701 0.52 0.605 -.1893157 .3252536

prevy | -.4513959 .1619033 -2.79 0.005 -.7687205 -.1340712

dose#c.prevy |

1 | -.2381024 .2195724 -1.08 0.278 -.6684565 .1922516

_cons | 1.668112 .1043739 15.98 0.000 1.463543 1.872681

------------------------------------------------------------------------------

. predict prob

(option mu assumed; predicted mean r)

(1151 missing values generated)

. * Calculation of Inverse Probability Weights (IPW)

. gen cumprob=1

. by id: replace cumprob=prob*cumprob[_n-1] if _n != 1

(2902 real changes made)

. gen ipw=1/cumprob

. summ ipw if r == 1

Variable | Obs Mean Std. Dev. Min Max

-------------+--------------------------------------------------------

ipw | 3616 1.273165 .2492687 1 2.064016

. * IPW-GEE Estimation of Marginal Logistic Regression Model for

. * Odds of Amenorrhea

. logit y c.dose##(c.time##c.time) [pw=ipw], cluster(id)

Iteration 0: log pseudolikelihood = -3045.0594

Iteration 1: log pseudolikelihood = -2855.0847

Iteration 2: log pseudolikelihood = -2852.8533

Iteration 3: log pseudolikelihood = -2852.8501

Iteration 4: log pseudolikelihood = -2852.8501

Logistic regression Number of obs = 3616

Wald chi2(5) = 324.06

Prob > chi2 = 0.0000

Log pseudolikelihood = -2852.8501 Pseudo R2 = 0.0631

(Std. Err. adjusted for 1151 clusters in id)

------------------------------------------------------------------------------

| Robust

y | Coef. Std. Err. z P>|z| [95% Conf. Interval]

-------------+----------------------------------------------------------------

dose | .1081157 .1492941 0.72 0.469 -.1844954 .4007268

time | .5379267 .133738 4.02 0.000 .2758051 .8000483

c.time#|

c.time | -.0036828 .0405379 -0.09 0.928 -.0831357 .0757701

c.dose#|

c.time | .4088951 .1909164 2.14 0.032 .0347059 .7830844

c.dose#|

c.time#|

c.time | -.1263461 .0576979 -2.19 0.029 -.2394319 -.0132603

_cons | -1.496422 .1075297 -13.92 0.000 -1.707177 -1.285668

------------------------------------------------------------------------------