Possible bug in R2MLwiN
Posted: Thu Mar 20, 2014 9:31 am
Comparing the output for my model in MLwiN and R, I have found serious discrepancies. In R, estimates were printed as belonging to different variables, even though the numbers themselves were identical. Here is a re-run of my model with `debugmode=TRUE`.
The R version is as follows:
And the MLwiN output is this:

Notice how the values are switched. The estimate of -1.601 related to "avejobgm" variable in MLwiN (which is actually correct) is related to StartCareer60:HT interaction in R output. The value of -0.144 pertaining to "carlengthgm" in the MLwiN output corresponds to "avejobgm" in R output. Here is the R call I use:
I don't do anything else to the model, just call it from R using `debugmode=TRUE` option, take a screenshot in MLwiN, close MLwiN, and then see model output in R by calling it via the object name. There is a warning in MLwiN saying that my variable names may cause problem in Stata (which I don't use). To me it looks like the estimates are sent back to R in the proper order (if you compare R and MLwiN output without looking at variable names, the order of the estimates is identical), but the order of the variable names is changed. The interaction terms are grouped together after main terms in MLwiN, whereas R retains the order from the original formula.
Additional info: MLwiN version is 2.28, R2MLwiN version is 0.1-8, R version is 3.0.2.
The R version is as follows:
Code: Select all
-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-
MLwiN multilevel model (Normal)
Estimation algorithm: MCMC Elapsed time : 829.38s
Number of obs: 16673 (from total 18841 ) Number of iter.: 20000 Burn-in: 10000
Bayesian Deviance Information Criterion (DIC)
Dbar D(thetabar) pD DIC
115547.648 115506.289 41.353 115589.000
---------------------------------------------------------------------------------------------------
The model formula:
unemp100 ~ (0 | cons + educationgm + HT + gender + StartCareer60 +
StartCareer60:HT + avejobgm + avejobgm:HT + carlengthgm) +
(1 | cons + StartCareer60 + StartCareer60:HT) + (2 | cons +
StartCareer60 + StartCareer60:HT)
Level 2: country Level 1: MERGEID
---------------------------------------------------------------------------------------------------
The fixed part estimates:
Coef. Std. Err. z Pr(>|z|) [95% Cred. Interval] ESS
cons 3.60620 0.52371 6.84 7.826e-12 *** 2.55728 4.65196 10000
educationgm -0.14639 0.01522 -9.61 7.384e-22 *** -0.17663 -0.11658 9372
HT 1.48281 0.21762 6.78 1.24e-11 *** 1.06275 1.91989 10000
gender 1.01506 0.12152 8.38 5.144e-17 *** 0.78305 1.25600 10000
StartCareer60 -0.10891 0.02355 -4.61 4.033e-06 *** -0.15567 -0.06341 10000
StartCareer60:HT -1.60088 0.61555 -2.62 0.008886 ** -2.82614 -0.40949 7913
avejobgm -0.14369 0.02075 -6.94 3.941e-12 *** -0.18491 -0.10401 10668
avejobgm:HT 0.13507 0.04394 3.10 0.001946 ** 0.05060 0.22456 9714
carlengthgm 0.77653 0.24015 3.25 0.001134 ** 0.30106 1.24803 10000
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Notice how the values are switched. The estimate of -1.601 related to "avejobgm" variable in MLwiN (which is actually correct) is related to StartCareer60:HT interaction in R output. The value of -0.144 pertaining to "carlengthgm" in the MLwiN output corresponds to "avejobgm" in R output. Here is the R call I use:
Code: Select all
mlmod <- runmw(unemp100~(0|cons+educationgm+HT+gender+StartCareer60+StartCareer60:HT + avejobgm + avejobgm:HT + carlengthgm) + (1|cons+StartCareer60+StartCareer60:HT) + (2|cons+StartCareer60+StartCareer60:HT),
levID=c("country","MERGEID"),data=ml.data,
estoptions=list(EstM=1,debugmode=T,clre=covmatrix,mcmcOptions=list(hcen=2),
mcmcMeth=list(iterations=20000,burnin=10000,thinning=2,Lev1VarM=3)))
Additional info: MLwiN version is 2.28, R2MLwiN version is 0.1-8, R version is 3.0.2.