Hi Francesca,
First, a general piece of advice:
The default estimation method for discrete response models in MLwiN is quasilikelihood. Quasilikelihood methods provide only approximate maximum likelihood estimates, so we recommend that users always use the MCMC methods in MLwiN for any final discrete response model.
You have stumbled on a rather interesting, more subtle limitation of the quasilikelihood methods in MLwiN:
It turns out that when fitting multinomial models in MLwiN by quasilikelihood methods, the degree of approximation is itself sensitive to the choice of base category. Note that this problem does not manifest itself when you use the recommended MCMC methods.
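To see why the base category should not matter under exact maximum likelihood, here is a brief sketch. The notation is introduced purely for illustration and is not MLwiN or Stata naming: $\pi_k$ is the probability of response category $k$, and $\beta^{(k\text{ vs }j)}$ is a coefficient in the contrast of category $k$ against base category $j$. The log-odds against different base categories are linked by a simple identity:

```latex
\log\frac{\pi_3}{\pi_2}
  = \log\frac{\pi_3}{\pi_1} - \log\frac{\pi_2}{\pi_1}
\quad\Longrightarrow\quad
\beta^{(3\text{ vs }2)} = \beta^{(3\text{ vs }1)} - \beta^{(2\text{ vs }1)}
```

Changing the base category therefore only reparameterises the same model. Exact maximum likelihood estimates obey this identity, and a joint Wald test of a covariate's coefficients is unchanged, whereas approximate quasilikelihood estimates need not obey it.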
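An example is given below. For simplicity it uses a single-level multinomial model, but the same issue applies to multilevel multinomial models. In the output, the joint Wald test of the male coefficients gives chi2(2) = 7.88 from mlogit whichever base category is used, but 12.65, 9.04 and 2.45 from the runmlwin IGLS (MQL1) fits with base categories 1, 2 and 3 respectively. The MCMC fits give 8.06, 7.86 and 7.81.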
Best wishes
George
Syntax:
*-------------------------------------------------------------------------------
* Prepare the data
*-------------------------------------------------------------------------------
* Load the data
webuse sysdsn1, clear
* Generate the variables
generate cons = 1
drop if insure==.
drop if age==.
tabulate site, gen(site)
*-------------------------------------------------------------------------------
* Fit models using mlogit
*-------------------------------------------------------------------------------
* Base category is 1
mlogit insure age male nonwhite site2 site3, base(1)
estimates store m1ml
test male
* Base category is 2
mlogit insure age male nonwhite site2 site3, base(2)
estimates store m2ml
test male
* Base category is 3
mlogit insure age male nonwhite site2 site3, base(3)
estimates store m3ml
test male
*-------------------------------------------------------------------------------
* Fit models using runmlwin - IGLS
*-------------------------------------------------------------------------------
* Base category is 1
runmlwin insure cons age male nonwhite site2 site3, ///
level1(patid:) ///
discrete(distribution(multinomial) link(mlogit) denom(cons) basecategory(1)) ///
maxiter(100) ///
nopause
estimates store m1igls
test [FP1]male_2 [FP2]male_3
* Base category is 2
runmlwin insure cons age male nonwhite site2 site3, ///
level1(patid:) ///
discrete(distribution(multinomial) link(mlogit) denom(cons) basecategory(2)) ///
maxiter(100) ///
nopause
estimates store m2igls
test [FP1]male_1 [FP2]male_3
* Base category is 3
runmlwin insure cons age male nonwhite site2 site3, ///
level1(patid:) ///
discrete(distribution(multinomial) link(mlogit) denom(cons) basecategory(3)) ///
maxiter(100) ///
nopause
estimates store m3igls
test [FP1]male_1 [FP2]male_2
*-------------------------------------------------------------------------------
* Fit models using runmlwin - MCMC
*-------------------------------------------------------------------------------
* Base category is 1
runmlwin insure cons age male nonwhite site2 site3, ///
level1(patid:) ///
discrete(distribution(multinomial) link(mlogit) denom(cons) basecategory(1)) ///
tolerance(5) maxiter(100) ///
mcmc(burnin(1000) chain(10000)) initsmodel(m1igls) ///
nopause
estimates store m1mcmc
test [FP1]male_2 [FP2]male_3
* Base category is 2
runmlwin insure cons age male nonwhite site2 site3, ///
level1(patid:) ///
discrete(distribution(multinomial) link(mlogit) denom(cons) basecategory(2)) ///
tolerance(5) maxiter(100) ///
mcmc(burnin(1000) chain(10000)) initsmodel(m2igls) ///
nopause
estimates store m2mcmc
test [FP1]male_1 [FP2]male_3
* Base category is 3
runmlwin insure cons age male nonwhite site2 site3, ///
level1(patid:) ///
discrete(distribution(multinomial) link(mlogit) denom(cons) basecategory(3)) ///
tolerance(5) maxiter(100) ///
mcmc(burnin(1000) chain(10000)) initsmodel(m3igls) ///
nopause
estimates store m3mcmc
test [FP1]male_1 [FP2]male_2
Output:
. *-------------------------------------------------------------------------------
. * Prepare the data
. *-------------------------------------------------------------------------------
.
. * Load the data
. webuse sysdsn1, clear
(Health insurance data)
.
. * Generate the variables
. generate cons = 1
. drop if insure==.
(28 observations deleted)
. drop if age==.
(1 observation deleted)
. tabulate site, gen(site)
site | Freq. Percent Cum.
------------+-----------------------------------
1 | 194 31.54 31.54
2 | 228 37.07 68.62
3 | 193 31.38 100.00
------------+-----------------------------------
Total | 615 100.00
.
.
.
. *-------------------------------------------------------------------------------
. * Fit models using mlogit
. *-------------------------------------------------------------------------------
.
. * Base category is 1
. mlogit insure age male nonwhite site2 site3, base(1)
Iteration 0: log likelihood = -555.85446
Iteration 1: log likelihood = -534.67443
Iteration 2: log likelihood = -534.36284
Iteration 3: log likelihood = -534.36165
Iteration 4: log likelihood = -534.36165
Multinomial logistic regression Number of obs = 615
LR chi2(10) = 42.99
Prob > chi2 = 0.0000
Log likelihood = -534.36165 Pseudo R2 = 0.0387
------------------------------------------------------------------------------
insure | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
Indemnity | (base outcome)
-------------+----------------------------------------------------------------
Prepaid |
age | -.011745 .0061946 -1.90 0.058 -.0238862 .0003962
male | .5616934 .2027465 2.77 0.006 .1643175 .9590693
nonwhite | .9747768 .2363213 4.12 0.000 .5115955 1.437958
site2 | .1130359 .2101903 0.54 0.591 -.2989296 .5250013
site3 | -.5879879 .2279351 -2.58 0.010 -1.034733 -.1412433
_cons | .2697127 .3284422 0.82 0.412 -.3740222 .9134476
-------------+----------------------------------------------------------------
Uninsure |
age | -.0077961 .0114418 -0.68 0.496 -.0302217 .0146294
male | .4518496 .3674867 1.23 0.219 -.268411 1.17211
nonwhite | .2170589 .4256361 0.51 0.610 -.6171725 1.05129
site2 | -1.211563 .4705127 -2.57 0.010 -2.133751 -.2893747
site3 | -.2078123 .3662926 -0.57 0.570 -.9257327 .510108
_cons | -1.286943 .5923219 -2.17 0.030 -2.447872 -.1260134
------------------------------------------------------------------------------
. estimates store m1ml
. test male
( 1) [Indemnity]o.male = 0
( 2) [Prepaid]male = 0
( 3) [Uninsure]male = 0
Constraint 1 dropped
chi2( 2) = 7.88
Prob > chi2 = 0.0194
.
. * Base category is 2
. mlogit insure age male nonwhite site2 site3, base(2)
Iteration 0: log likelihood = -555.85446
Iteration 1: log likelihood = -534.67443
Iteration 2: log likelihood = -534.36284
Iteration 3: log likelihood = -534.36165
Iteration 4: log likelihood = -534.36165
Multinomial logistic regression Number of obs = 615
LR chi2(10) = 42.99
Prob > chi2 = 0.0000
Log likelihood = -534.36165 Pseudo R2 = 0.0387
------------------------------------------------------------------------------
insure | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
Indemnity |
age | .011745 .0061946 1.90 0.058 -.0003962 .0238862
male | -.5616934 .2027465 -2.77 0.006 -.9590693 -.1643175
nonwhite | -.9747768 .2363213 -4.12 0.000 -1.437958 -.5115955
site2 | -.1130359 .2101903 -0.54 0.591 -.5250013 .2989296
site3 | .5879879 .2279351 2.58 0.010 .1412433 1.034733
_cons | -.2697127 .3284422 -0.82 0.412 -.9134476 .3740222
-------------+----------------------------------------------------------------
Prepaid | (base outcome)
-------------+----------------------------------------------------------------
Uninsure |
age | .0039489 .0115994 0.34 0.734 -.0187855 .0266832
male | -.1098438 .3651883 -0.30 0.764 -.8255998 .6059122
nonwhite | -.7577178 .4195759 -1.81 0.071 -1.580071 .0646357
site2 | -1.324599 .4697954 -2.82 0.005 -2.245381 -.4038165
site3 | .3801756 .3728188 1.02 0.308 -.3505358 1.110887
_cons | -1.556656 .5963286 -2.61 0.009 -2.725438 -.387873
------------------------------------------------------------------------------
. estimates store m2ml
. test male
( 1) [Indemnity]male = 0
( 2) [Prepaid]o.male = 0
( 3) [Uninsure]male = 0
Constraint 2 dropped
chi2( 2) = 7.88
Prob > chi2 = 0.0194
.
. * Base category is 3
. mlogit insure age male nonwhite site2 site3, base(3)
Iteration 0: log likelihood = -555.85446
Iteration 1: log likelihood = -534.67443
Iteration 2: log likelihood = -534.36284
Iteration 3: log likelihood = -534.36165
Iteration 4: log likelihood = -534.36165
Multinomial logistic regression Number of obs = 615
LR chi2(10) = 42.99
Prob > chi2 = 0.0000
Log likelihood = -534.36165 Pseudo R2 = 0.0387
------------------------------------------------------------------------------
insure | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
Indemnity |
age | .0077961 .0114418 0.68 0.496 -.0146294 .0302217
male | -.4518496 .3674867 -1.23 0.219 -1.17211 .268411
nonwhite | -.2170589 .4256361 -0.51 0.610 -1.05129 .6171725
site2 | 1.211563 .4705127 2.57 0.010 .2893747 2.133751
site3 | .2078123 .3662926 0.57 0.570 -.510108 .9257327
_cons | 1.286943 .5923219 2.17 0.030 .1260134 2.447872
-------------+----------------------------------------------------------------
Prepaid |
age | -.0039489 .0115994 -0.34 0.734 -.0266832 .0187855
male | .1098438 .3651883 0.30 0.764 -.6059122 .8255998
nonwhite | .7577178 .4195759 1.81 0.071 -.0646357 1.580071
site2 | 1.324599 .4697954 2.82 0.005 .4038165 2.245381
site3 | -.3801756 .3728188 -1.02 0.308 -1.110887 .3505358
_cons | 1.556656 .5963286 2.61 0.009 .387873 2.725438
-------------+----------------------------------------------------------------
Uninsure | (base outcome)
------------------------------------------------------------------------------
. estimates store m3ml
. test male
( 1) [Indemnity]male = 0
( 2) [Prepaid]male = 0
( 3) [Uninsure]o.male = 0
Constraint 3 dropped
chi2( 2) = 7.88
Prob > chi2 = 0.0194
.
.
.
. *-------------------------------------------------------------------------------
. * Fit models using runmlwin - IGLS
. *-------------------------------------------------------------------------------
.
. * Base category is 1
. runmlwin insure cons age male nonwhite site2 site3, ///
> level1(patid:) ///
> discrete(distribution(multinomial) link(mlogit) denom(cons) basecategory(1)) ///
> maxiter(100) ///
> nopause
MLwiN 2.30 multilevel model Number of obs = 615
Unordered multinomial logit response model
Estimation algorithm: IGLS, MQL1
----------------------------------
Contrast | Log-odds
-------------+--------------------
1 | 2 vs. 1
2 | 3 vs. 1
----------------------------------
Run time (seconds) = 2.58
Number of iterations = 9
------------------------------------------------------------------------------
| Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
Contrast 1 |
cons_2 | .2252591 .3154893 0.71 0.475 -.3930887 .8436068
age_2 | -.0107379 .0059806 -1.80 0.073 -.0224596 .0009839
male_2 | .5582314 .1935448 2.88 0.004 .1788905 .9375723
nonwhite_2 | .9535601 .2243779 4.25 0.000 .5137874 1.393333
site2_2 | .1158666 .2026885 0.57 0.568 -.2813956 .5131288
site3_2 | -.5801533 .2177574 -2.66 0.008 -1.00695 -.1533567
-------------+----------------------------------------------------------------
Contrast 2 |
cons_3 | -1.453731 .5697818 -2.55 0.011 -2.570482 -.3369788
age_3 | -.0041833 .0109225 -0.38 0.702 -.025591 .0172245
male_3 | .4109438 .3514573 1.17 0.242 -.2778999 1.099787
nonwhite_3 | .1834563 .404807 0.45 0.650 -.6099509 .9768636
site2_3 | -1.175968 .4566931 -2.57 0.010 -2.07107 -.2808663
site3_3 | -.1783079 .3514771 -0.51 0.612 -.8671902 .5105745
------------------------------------------------------------------------------
. estimates store m1igls
. test [FP1]male_2 [FP2]male_3
( 1) [FP1]male_2 = 0
( 2) [FP2]male_3 = 0
chi2( 2) = 12.65
Prob > chi2 = 0.0018
.
. * Base category is 2
. runmlwin insure cons age male nonwhite site2 site3, ///
> level1(patid:) ///
> discrete(distribution(multinomial) link(mlogit) denom(cons) basecategory(2)) ///
> maxiter(100) ///
> nopause
MLwiN 2.30 multilevel model Number of obs = 615
Unordered multinomial logit response model
Estimation algorithm: IGLS, MQL1
----------------------------------
Contrast | Log-odds
-------------+--------------------
1 | 1 vs. 2
2 | 3 vs. 2
----------------------------------
Run time (seconds) = 2.16
Number of iterations = 10
------------------------------------------------------------------------------
| Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
Contrast 1 |
cons_1 | -.2310948 .3137796 -0.74 0.461 -.8460915 .383902
age_1 | .0104725 .0059061 1.77 0.076 -.0011033 .0220482
male_1 | -.5557387 .1950257 -2.85 0.004 -.9379821 -.1734954
nonwhite_1 | -.9664936 .2270495 -4.26 0.000 -1.411502 -.5214848
site2_1 | -.0957603 .2029076 -0.47 0.637 -.4934518 .3019312
site3_1 | .6150761 .2140144 2.87 0.004 .1956155 1.034537
-------------+----------------------------------------------------------------
Contrast 2 |
cons_3 | -1.388619 .5652334 -2.46 0.014 -2.496456 -.2807814
age_3 | -.0007419 .0110818 -0.07 0.947 -.0224617 .020978
male_3 | -.0502273 .3452459 -0.15 0.884 -.7268969 .6264422
nonwhite_3 | -.7764147 .4047265 -1.92 0.055 -1.569664 .0168347
site2_3 | -1.304454 .4585573 -2.84 0.004 -2.20321 -.4056979
site3_3 | .438271 .3494704 1.25 0.210 -.2466783 1.12322
------------------------------------------------------------------------------
. estimates store m2igls
. test [FP1]male_1 [FP2]male_3
( 1) [FP1]male_1 = 0
( 2) [FP2]male_3 = 0
chi2( 2) = 9.04
Prob > chi2 = 0.0109
.
. * Base category is 3
. runmlwin insure cons age male nonwhite site2 site3, ///
> level1(patid:) ///
> discrete(distribution(multinomial) link(mlogit) denom(cons) basecategory(3)) ///
> maxiter(100) ///
> nopause
MLwiN 2.30 multilevel model Number of obs = 615
Unordered multinomial logit response model
Estimation algorithm: IGLS, MQL1
----------------------------------
Contrast | Log-odds
-------------+--------------------
1 | 1 vs. 3
2 | 2 vs. 3
----------------------------------
Run time (seconds) = 6.16
Number of iterations = 62
------------------------------------------------------------------------------
| Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
Contrast 1 |
cons_1 | 1.832884 .3120234 5.87 0.000 1.221329 2.444438
age_1 | -.0046236 .0058567 -0.79 0.430 -.0161026 .0068554
male_1 | -.3018351 .1932547 -1.56 0.118 -.6806073 .0769371
nonwhite_1 | -.2804579 .2263649 -1.24 0.215 -.7241249 .1632092
site2_1 | 1.171023 .2028789 5.77 0.000 .7733874 1.568658
site3_1 | .1718383 .2140424 0.80 0.422 -.2476771 .5913538
-------------+----------------------------------------------------------------
Contrast 2 |
cons_2 | 2.1153 .3131501 6.75 0.000 1.501538 2.729063
age_2 | -.0167806 .0059208 -2.83 0.005 -.028385 -.0051761
male_2 | .2546102 .1917686 1.33 0.184 -.1212494 .6304698
nonwhite_2 | .7097989 .2230472 3.18 0.001 .2726344 1.146964
site2_2 | 1.288371 .2027133 6.36 0.000 .8910607 1.685682
site3_2 | -.4195967 .2175539 -1.93 0.054 -.8459946 .0068012
------------------------------------------------------------------------------
. estimates store m3igls
. test [FP1]male_1 [FP2]male_2
( 1) [FP1]male_1 = 0
( 2) [FP2]male_2 = 0
chi2( 2) = 2.45
Prob > chi2 = 0.2931
.
.
.
. *-------------------------------------------------------------------------------
. * Fit models using runmlwin - MCMC
. *-------------------------------------------------------------------------------
.
. * Base category is 1
. runmlwin insure cons age male nonwhite site2 site3, ///
> level1(patid:) ///
> discrete(distribution(multinomial) link(mlogit) denom(cons) basecategory(1)) ///
> tolerance(5) maxiter(100) ///
> mcmc(burnin(1000) chain(10000)) initsmodel(m1igls) ///
> nopause
MLwiN 2.30 multilevel model Number of obs = 615
Unordered multinomial logit response model
Estimation algorithm: MCMC
----------------------------------
Contrast | Log-odds
-------------+--------------------
1 | 2 vs. 1
2 | 3 vs. 1
----------------------------------
Burnin = 1000
Chain = 10000
Thinning = 1
Run time (seconds) = 38
Deviance (dbar) = 1080.83
Deviance (thetabar) = 1068.89
Effective no. of pars (pd) = 11.95
Bayesian DIC = 1092.78
------------------------------------------------------------------------------
| Mean Std. Dev. ESS P [95% Cred. Interval]
-------------+----------------------------------------------------------------
Contrast 1 |
cons_2 | .2439661 .3306381 100 0.225 -.4377796 .8944722
age_2 | -.0116693 .0062352 106 0.035 -.0235667 .001046
male_2 | .5774286 .2050881 1217 0.001 .1878866 .9868808
nonwhite_2 | .9938435 .2456638 1025 0.000 .5157954 1.487963
site2_2 | .136753 .2092072 505 0.252 -.2779506 .5478277
site3_2 | -.5754883 .2323951 597 0.009 -1.028104 -.1193915
-------------+----------------------------------------------------------------
Contrast 2 |
cons_3 | -1.266102 .5665138 112 0.018 -2.364103 -.1250768
age_3 | -.0089878 .010662 119 0.194 -.0317096 .0106432
male_3 | .4349701 .3721229 1234 0.115 -.3101985 1.164274
nonwhite_3 | .2144592 .43907 1264 0.305 -.6578132 1.055963
site2_3 | -1.257277 .48655 916 0.003 -2.238394 -.3342115
site3_3 | -.2279715 .3668159 610 0.268 -.9427684 .458814
------------------------------------------------------------------------------
. estimates store m1mcmc
. test [FP1]male_2 [FP2]male_3
( 1) [FP1]male_2 = 0
( 2) [FP2]male_3 = 0
chi2( 2) = 8.06
Prob > chi2 = 0.0178
.
. * Base category is 2
. runmlwin insure cons age male nonwhite site2 site3, ///
> level1(patid:) ///
> discrete(distribution(multinomial) link(mlogit) denom(cons) basecategory(2)) ///
> tolerance(5) maxiter(100) ///
> mcmc(burnin(1000) chain(10000)) initsmodel(m2igls) ///
> nopause
MLwiN 2.30 multilevel model Number of obs = 615
Unordered multinomial logit response model
Estimation algorithm: MCMC
----------------------------------
Contrast | Log-odds
-------------+--------------------
1 | 1 vs. 2
2 | 3 vs. 2
----------------------------------
Burnin = 1000
Chain = 10000
Thinning = 1
Run time (seconds) = 37.9
Deviance (dbar) = 1081.00
Deviance (thetabar) = 1068.89
Effective no. of pars (pd) = 12.11
Bayesian DIC = 1093.11
------------------------------------------------------------------------------
| Mean Std. Dev. ESS P [95% Cred. Interval]
-------------+----------------------------------------------------------------
Contrast 1 |
cons_1 | -.2858514 .3604522 75 0.217 -.9505032 .4229137
age_1 | .0122968 .0068222 82 0.041 -.0010685 .0250168
male_1 | -.5763975 .2075412 1256 0.003 -.9714403 -.1700981
nonwhite_1 | -.9925631 .2332351 1392 0.000 -1.442944 -.5384932
site2_1 | -.120517 .2096018 467 0.275 -.5333347 .2939599
site3_1 | .5926808 .2300952 458 0.005 .1455985 1.061441
-------------+----------------------------------------------------------------
Contrast 2 |
cons_3 | -1.570087 .5960569 84 0.005 -2.722909 -.4105062
age_3 | .0034625 .0120921 87 0.377 -.020258 .0264747
male_3 | -.1283098 .372871 1250 0.370 -.8742442 .5816209
nonwhite_3 | -.7938632 .4275478 1282 0.027 -1.680918 .0183959
site2_3 | -1.37527 .4716171 996 0.000 -2.338544 -.4711227
site3_3 | .3826549 .3790009 695 0.152 -.373991 1.165254
------------------------------------------------------------------------------
. estimates store m2mcmc
. test [FP1]male_1 [FP2]male_3
( 1) [FP1]male_1 = 0
( 2) [FP2]male_3 = 0
chi2( 2) = 7.86
Prob > chi2 = 0.0197
.
. * Base category is 3
. runmlwin insure cons age male nonwhite site2 site3, ///
> level1(patid:) ///
> discrete(distribution(multinomial) link(mlogit) denom(cons) basecategory(3)) ///
> tolerance(5) maxiter(100) ///
> mcmc(burnin(1000) chain(10000)) initsmodel(m3igls) ///
> nopause
MLwiN 2.30 multilevel model Number of obs = 615
Unordered multinomial logit response model
Estimation algorithm: MCMC
----------------------------------
Contrast | Log-odds
-------------+--------------------
1 | 1 vs. 3
2 | 2 vs. 3
----------------------------------
Burnin = 1000
Chain = 10000
Thinning = 1
Run time (seconds) = 39.1
Deviance (dbar) = 1080.65
Deviance (thetabar) = 1068.91
Effective no. of pars (pd) = 11.74
Bayesian DIC = 1092.39
------------------------------------------------------------------------------
| Mean Std. Dev. ESS P [95% Cred. Interval]
-------------+----------------------------------------------------------------
Contrast 1 |
cons_1 | 1.341938 .4810038 24 0.000 .3858214 2.298268
age_1 | .0063641 .0090425 28 0.235 -.0111701 .0253759
male_1 | -.4179802 .3716217 266 0.132 -1.117263 .3421187
nonwhite_1 | -.1475168 .4374826 234 0.369 -.9869192 .746083
site2_1 | 1.312172 .4777177 71 0.003 .4427745 2.275522
site3_1 | .2385831 .375573 74 0.261 -.5183207 .9483098
-------------+----------------------------------------------------------------
Contrast 2 |
cons_2 | 1.635101 .5199141 22 0.000 .660411 2.622971
age_2 | -.0058798 .009519 26 0.276 -.0251208 .0121648
male_2 | .1500722 .3760808 256 0.347 -.6029025 .8947665
nonwhite_2 | .833674 .4362051 230 0.029 -.0243578 1.738333
site2_2 | 1.419851 .5008552 64 0.001 .4945272 2.461509
site3_2 | -.356654 .3892435 67 0.173 -1.094442 .4072808
------------------------------------------------------------------------------
. estimates store m3mcmc
. test [FP1]male_1 [FP2]male_2
( 1) [FP1]male_1 = 0
( 2) [FP2]male_2 = 0
chi2( 2) = 7.81
Prob > chi2 = 0.0202
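As a rough check on the output above, the male coefficients can be differenced by hand. This is only a sketch: the numbers are copied from the printed tables, and the scalar names are made up for illustration (they are not created by mlogit or runmlwin). If the estimates did not depend on the base category, the "3 vs. 2" coefficient from the base(2) fit would equal the "3 vs. 1" coefficient minus the "2 vs. 1" coefficient from the base(1) fit.

```stata
* Male coefficients copied from the base category 1 fits above
* mlogit: Prepaid vs Indemnity, Uninsure vs Indemnity
scalar ml_2v1 = .5616934
scalar ml_3v1 = .4518496
* runmlwin IGLS (MQL1): 2 vs. 1, 3 vs. 1
scalar igls_2v1 = .5582314
scalar igls_3v1 = .4109438

* Male coefficients for "3 vs. 2" copied from the base category 2 fits
scalar ml_3v2   = -.1098438
scalar igls_3v2 = -.0502273

* Implied "3 vs. 2" coefficient = (3 vs. 1) - (2 vs. 1)
display "mlogit implied 3 vs 2: " ml_3v1 - ml_2v1
display "mlogit fitted  3 vs 2: " ml_3v2
display "IGLS   implied 3 vs 2: " igls_3v1 - igls_2v1
display "IGLS   fitted  3 vs 2: " igls_3v2
```

For mlogit the implied and fitted values agree (both about -0.110). For IGLS they do not (about -0.147 implied against -0.050 fitted), which is the base-category sensitivity described at the top of this post.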