Coefficients and variance partition differ considerably between runmlwin and melogit

Post by **lisagruene** » Fri Feb 17, 2017 11:55 am

Dear all,

I have estimated a multilevel logistic model with both runmlwin and meqrlogit. The coefficients and the variance partition differ considerably between the two commands. I know that a small difference is normal and due to the different estimation methods. However, my coefficient is more than twice as large with meqrlogit (-0.93 vs. -0.37), and the level-2 variance is about six times as large (14.1 vs. 2.3). Moreover, if I estimate a model with clustered standard errors, the results are comparable to the runmlwin model, which is why I assume that I made a mistake in the runmlwin specification and that it does not account for the second level. I have checked my model several times, but I just cannot see where I went wrong. Can anybody help me with that? I'll attach the two outputs (and the syntax) below.

Thank you all!

```stata
***syntax runmlwin***

use "C:\UXX\data", clear
global MLwiN_path "C:\Program Files (x86)\MLwiN trial\i386\mlwin.exe"

* level-1 identifier and constant term used by runmlwin
gen level1indicator = _n
gen cons = 1

* sort by the level identifiers, highest level first
sort level2indicator level1indicator

* two-level random-intercept logit model
runmlwin depvar cons i.varl1, level2(level2indicator: cons) level1(level1indicator) discrete(distribution(binomial) link(logit) denom(cons)) nopause
```

```
***output runmlwin***

Run time (seconds)   =       5.07
Number of iterations =          5
------------------------------------------------------------------------------
stellefrei_sci|     Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        cons |   1.380566   .1180445    11.70   0.000     1.149203    1.611929
    _1_varl1 |  -.0186297   .1457571    -0.13   0.898    -.3043084    .2670489
    _2_varl1 |  -.3663916   .1389778    -2.64   0.008     -.638783   -.0940001
------------------------------------------------------------------------------

------------------------------------------------------------------------------
   Random-effects Parameters |   Estimate   Std. Err.     [95% Conf. Interval]
-----------------------------+------------------------------------------------
Level 2: indicator           |
                   var(cons) |   2.293732   .2425133      1.818414    2.769049
------------------------------------------------------------------------------
```

```stata
***Syntax meqrlogit***
meqrlogit depvar i.varl1 || level2indicator:
```

```
***output meqrlogit***

Integration points =   7                        Wald chi2(2)       =     20.24
Log likelihood = -746.48846                     Prob > chi2        =    0.0000

------------------------------------------------------------------------------
      depvar |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
             |
    _1_varl1 |   -.025434   .2384883    -0.11   0.915    -.4928625    .4419946
    _2_varl1 |  -.9323545   .2362082    -3.95   0.000    -1.395314   -.4693948
             |
       _cons |   3.527563   .2945633    11.98   0.000      2.95023    4.104897
------------------------------------------------------------------------------

------------------------------------------------------------------------------
  Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
-----------------------------+------------------------------------------------
level2indicator: Identity    |
                  var(_cons) |   14.11532   1.947718      10.77052    18.49885
------------------------------------------------------------------------------
LR test vs. logistic regression: chibar2(01) =   456.45 Prob>=chibar2 = 0.0000
```

Post by **GeorgeLeckie** » Sun Feb 19, 2017 1:31 pm

Dear Lisa,

The differences you see relate to the severe clustering exhibited in your data.
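One way to gauge how severe the clustering is would be to convert each level-2 variance estimate into a variance partition coefficient (VPC). The sketch below assumes the usual latent-response formulation, in which the level-1 residual variance of a logit model is fixed at π²/3. That formulation is an assumption of this sketch and is not something either command reports by default. The two variance estimates are simply copied from the outputs posted above.

```stata
* VPC = var_u / (var_u + pi^2/3), latent-response formulation (an assumption)
* Level-2 variance as estimated by runmlwin (default quasilikelihood)
display "VPC, runmlwin:  " %5.3f (2.293732 / (2.293732 + _pi^2/3))
* Level-2 variance as estimated by meqrlogit (7 integration points)
display "VPC, meqrlogit: " %5.3f (14.11532 / (14.11532 + _pi^2/3))
```

With these inputs the two VPCs come out at roughly 0.41 and 0.81, so under either fit a large share of the latent variation lies between level-2 units.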

MLwiN's default estimation method for binary and other categorical and count response models is quasilikelihood estimation, which comes in four flavours of increasing accuracy: MQL1 (the default), MQL2, PQL1 and PQL2.

Stata's default is maximum likelihood estimation via adaptive quadrature, where accuracy increases as you specify more quadrature points.

When there is only mild clustering, especially when clusters are large (e.g., most cross-sectional applications), model results are similar across all of the above estimation methods.

However, when there is severe clustering, especially when clusters are small (e.g., many longitudinal applications), model results can differ across the above estimation methods. The quasilikelihood estimates, especially the default MQL1 estimates which you are using, will be biased downwards: the estimated variance will be too small, and so will the estimated regression coefficients. PQL2 will be the least biased of the quasilikelihood methods, so you should at least switch to that. However, its estimates will likely still be biased downwards, given how large the estimated variance component is in Stata.

My advice would therefore be to use the MCMC estimation methods implemented in MLwiN, as these do not suffer from the biases described here. If you choose to use Stata over MLwiN, then note that you should increase the number of adaptive quadrature points until the model estimates stop changing. I suspect that the default of seven used by Stata is not enough, given how severe the clustering is.
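The options discussed above could be tried along the following lines. This is only a sketch: it assumes the data have already been prepared as in the original syntax (cons and level1indicator created, data sorted), and the numbers of integration points and the stored-estimate names (q7, q15, q30) are arbitrary choices for illustration.

```stata
* 1. Refit in MLwiN with PQL2 instead of the default MQL1
runmlwin depvar cons i.varl1, level2(level2indicator: cons) level1(level1indicator) ///
    discrete(distribution(binomial) link(logit) denominator(cons) pql2) nopause

* 2. Refit by MCMC, using the PQL2 estimates as starting values
runmlwin depvar cons i.varl1, level2(level2indicator: cons) level1(level1indicator) ///
    discrete(distribution(binomial) link(logit) denominator(cons)) ///
    mcmc(on) initsprevious nopause

* 3. In Stata, raise the number of quadrature points until the estimates settle
foreach q in 7 15 30 {
    quietly meqrlogit depvar i.varl1 || level2indicator:, intpoints(`q')
    estimates store q`q'
}
* meqrlogit stores the random-intercept parameter as a log standard deviation
estimates table q7 q15 q30, b(%9.4f) se
```

If the last two columns of the comparison table agree to the precision you care about, the larger number of integration points is probably sufficient. Otherwise, increase it further.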

I hope that helps

Best wishes

George

Post by **lisagruene** » Mon Feb 20, 2017 1:30 pm

Dear George,

Perfect, thank you so much! I tried my models again with the mcmc option and now the results look much better.

If I may ask a follow-up question (it's OK if I can't): I'm also working with imputed data and would like to estimate average marginal effects (AMEs). Working with imputed data seems to be no problem: I specified the cmdok option (mi est, post cmdok: runmlwin...), and that generally seems to work. I also found your post on how to calculate the AMEs, but I was unable to do so with imputed data. I think this is because, as I "forced" Stata to run runmlwin with mi est, it doesn't save the individual results from each imputation, so the results I need are not stored. Is there a way around that, or a general way to deal with it? Could it be solved by writing a program?

Sorry for the question assault & thank you so much,

Lisa

www.cmm.bristol.ac.uk/forum
