Wald test following multinomial logistic regression
Posted: Wed Jul 02, 2014 1:41 pm
by mlwinnewbie
Hi everyone,
I am interested in assessing whether the topic selected (social vs. personal vs. clinical) is a predictor of involvement in therapy (yes, medium and low).
I wrote the following script in Stata:
Code: Select all
global MLwiN_path "C:\Program Files (x86)\MLwiN v2.29\i386\mlwin.exe"
capture drop cons
sort IDD wave
gen cons=1
gen id=_n
* create 2 dummy variables so that topic level 1 is the reference category
g topic2 = (topic3g_w==2)
g topic3 = (topic3g_w==3)
set matsize 800
set more off
runmlwin invw topic2 topic3 cons, ///
level2(IDD: cons) ///
level1(id) ///
discrete(distribution(multinomial) ///
link(mlogit) ///
denominator(cons) ///
basecategory(3)) ///
nopause
I then wanted to run a Wald test to assess if overall topic is a predictor of involvement. If I were to use the mlogit command, I could then type:
In order to obtain an equivalent Wald test, following the runmlwin command above I then typed:
Code: Select all
test [FP1]topic2_1 [FP1]topic3_1 [FP2]topic2_2 [FP2]topic3_2
Is this correct?
Also, depending on the base category I select, the overall Wald test produces different results. This is not the case when I use mlogit, so I am wondering if I am doing something wrong.
Thanks for your help.
F
Re: Wald test following multinomial logistic regression
Posted: Fri Jul 04, 2014 1:20 pm
by GeorgeLeckie
Hi F,
In terms of your first query, yes, that is how you go about doing an overall test for whether the predictor has any explanatory power. I have given an example below.
Best wishes
George
Syntax
Code: Select all
* Load the data
use http://www.bristol.ac.uk/cmm/media/runmlwin/bang, clear
* Generate dummy variables
generate lc1 = (lc==1)
generate lc2 = (lc==2)
generate lc3plus = (lc>=3)
* Fit the model
runmlwin use4 cons lc1 lc2 lc3plus, ///
level1(woman) ///
discrete(distribution(multinomial) link(mlogit) denom(cons) basecategory(4)) ///
nopause
* Perform the Wald test
test ///
[FP1]lc1_1 [FP1]lc2_1 [FP1]lc3plus_1 ///
[FP2]lc1_2 [FP2]lc2_2 [FP2]lc3plus_2 ///
[FP3]lc1_3 [FP3]lc2_3 [FP3]lc3plus_3
Output
Code: Select all
. * Load the data
. use http://www.bristol.ac.uk/cmm/media/runmlwin/bang, clear
.
. * Generate dummy variables
. generate lc1 = (lc==1)
. generate lc2 = (lc==2)
. generate lc3plus = (lc>=3)
.
. * Fit the model
. runmlwin use4 cons lc1 lc2 lc3plus, ///
> level1(woman) ///
> discrete(distribution(multinomial) link(mlogit) denom(cons) basecategory(4)) ///
> nopause
MLwiN 2.30 multilevel model Number of obs = 2867
Unordered multinomial logit response model
Estimation algorithm: IGLS, MQL1
----------------------------------
Contrast | Log-odds
-------------+--------------------
1 | 1 vs. 4
2 | 2 vs. 4
3 | 3 vs. 4
----------------------------------
Run time (seconds) = 5.17
Number of iterations = 10
------------------------------------------------------------------------------
| Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
Contrast 1 |
cons_1 | -3.884983 .2909287 -13.35 0.000 -4.455193 -3.314773
lc1_1 | 2.1912 .325586 6.73 0.000 1.553063 2.829337
lc2_1 | 2.664649 .3188518 8.36 0.000 2.039711 3.289586
lc3plus_1 | 2.574364 .302671 8.51 0.000 1.98114 3.167589
-------------+----------------------------------------------------------------
Contrast 2 |
cons_2 | -1.472055 .0949997 -15.50 0.000 -1.658251 -1.285859
lc1_2 | .7469173 .1376688 5.43 0.000 .4770915 1.016743
lc2_2 | .6903579 .1455586 4.74 0.000 .4050683 .9756474
lc3plus_2 | .2076819 .1254473 1.66 0.098 -.0381903 .453554
-------------+----------------------------------------------------------------
Contrast 3 |
cons_3 | -2.585702 .1552284 -16.66 0.000 -2.889944 -2.28146
lc1_3 | .7473474 .2200436 3.40 0.001 .3160699 1.178625
lc2_3 | 1.063134 .2147492 4.95 0.000 .6422336 1.484035
lc3plus_3 | 1.101028 .179334 6.14 0.000 .7495399 1.452516
------------------------------------------------------------------------------
.
. * Perform the Wald test
. test ///
> [FP1]lc1_1 [FP1]lc2_1 [FP1]lc3plus_1 ///
> [FP2]lc1_2 [FP2]lc2_2 [FP2]lc3plus_2 ///
> [FP3]lc1_3 [FP3]lc2_3 [FP3]lc3plus_3
( 1) [FP1]lc1_1 = 0
( 2) [FP1]lc2_1 = 0
( 3) [FP1]lc3plus_1 = 0
( 4) [FP2]lc1_2 = 0
( 5) [FP2]lc2_2 = 0
( 6) [FP2]lc3plus_2 = 0
( 7) [FP3]lc1_3 = 0
( 8) [FP3]lc2_3 = 0
( 9) [FP3]lc3plus_3 = 0
chi2( 9) = 178.34
Prob > chi2 = 0.0000
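For comparison, the same single-level model can also be fitted with Stata's own mlogit command, where the overall test needs only the variable names. The sketch below assumes the bang dataset and the lc dummies from the example above. The resulting chi2 need not match the runmlwin figure exactly, because the runmlwin fit above used MQL1 quasilikelihood rather than maximum likelihood.

```stata
* Load the data and generate the dummies as in the example above
use http://www.bristol.ac.uk/cmm/media/runmlwin/bang, clear
generate lc1 = (lc==1)
generate lc2 = (lc==2)
generate lc3plus = (lc>=3)

* Fit the single-level model by maximum likelihood, base category 4
mlogit use4 lc1 lc2 lc3plus, base(4)

* Overall Wald test: a variable named without an equation is tested in every equation
test lc1 lc2 lc3plus
```

After mlogit the base-category constraints are dropped automatically, so this is a 9 degree of freedom test, like the runmlwin version.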
Re: Wald test following multinomial logistic regression
Posted: Fri Jul 04, 2014 1:46 pm
by mlwinnewbie
Hi George,
Thank you very much for your reply - it is really helpful.
I was hoping you could comment on my second query - apologies if this is a very basic question:
Depending on the base category I select, the overall Wald test produces different results. This is not the case when I use mlogit, so I am wondering if I am doing something wrong.
Does this make sense? And if so, I should select the base category based on theory and NOT present all possible comparisons - correct?
Thanks again,
Francesca
Re: Wald test following multinomial logistic regression
Posted: Fri Jul 04, 2014 2:39 pm
by GeorgeLeckie
Hi Francesca,
First, a general piece of advice:
Default estimation for discrete response models in MLwiN is by quasilikelihood methods. These methods only provide approximate maximum likelihood estimates and so we recommend users always use the MCMC methods in MLwiN for any final discrete response model.
You have stumbled on a rather interesting second order estimation limitation of the quasilikelihood methods in MLwiN:
It turns out that when fitting multinomial models in MLwiN by quasilikelihood methods, the degree of approximation is itself sensitive to the choice of base category. Note that this problem does not manifest itself when you use the recommended MCMC methods.
An example is given below. For simplicity this is for a single-level multinomial model, but the same issues apply for multilevel multinomial models.
Best wishes
George
Syntax:
Code: Select all
*-------------------------------------------------------------------------------
* Prepare the data
*-------------------------------------------------------------------------------
* Load the data
webuse sysdsn1, clear
* Generate the variables
generate cons = 1
drop if insure==.
drop if age==.
tabulate site, gen(site)
*-------------------------------------------------------------------------------
* Fit models using mlogit
*-------------------------------------------------------------------------------
* Base category is 1
mlogit insure age male nonwhite site2 site3, base(1)
estimates store m1ml
test male
* Base category is 2
mlogit insure age male nonwhite site2 site3, base(2)
estimates store m2ml
test male
* Base category is 3
mlogit insure age male nonwhite site2 site3, base(3)
estimates store m3ml
test male
*-------------------------------------------------------------------------------
* Fit models using runmlwin - IGLS
*-------------------------------------------------------------------------------
* Base category is 1
runmlwin insure cons age male nonwhite site2 site3, ///
level1(patid:) ///
discrete(distribution(multinomial) link(mlogit) denom(cons) basecategory(1)) ///
maxiter(100) ///
nopause
estimates store m1igls
test [FP1]male_2 [FP2]male_3
* Base category is 2
runmlwin insure cons age male nonwhite site2 site3, ///
level1(patid:) ///
discrete(distribution(multinomial) link(mlogit) denom(cons) basecategory(2)) ///
maxiter(100) ///
nopause
estimates store m2igls
test [FP1]male_1 [FP2]male_3
* Base category is 3
runmlwin insure cons age male nonwhite site2 site3, ///
level1(patid:) ///
discrete(distribution(multinomial) link(mlogit) denom(cons) basecategory(3)) ///
maxiter(100) ///
nopause
estimates store m3igls
test [FP1]male_1 [FP2]male_2
*-------------------------------------------------------------------------------
* Fit models using runmlwin - MCMC
*-------------------------------------------------------------------------------
* Base category is 1
runmlwin insure cons age male nonwhite site2 site3, ///
level1(patid:) ///
discrete(distribution(multinomial) link(mlogit) denom(cons) basecategory(1)) ///
tolerance(5) maxiter(100) ///
mcmc(burnin(1000) chain(10000)) initsmodel(m1igls) ///
nopause
estimates store m1mcmc
test [FP1]male_2 [FP2]male_3
* Base category is 2
runmlwin insure cons age male nonwhite site2 site3, ///
level1(patid:) ///
discrete(distribution(multinomial) link(mlogit) denom(cons) basecategory(2)) ///
tolerance(5) maxiter(100) ///
mcmc(burnin(1000) chain(10000)) initsmodel(m2igls) ///
nopause
estimates store m2mcmc
test [FP1]male_1 [FP2]male_3
* Base category is 3
runmlwin insure cons age male nonwhite site2 site3, ///
level1(patid:) ///
discrete(distribution(multinomial) link(mlogit) denom(cons) basecategory(3)) ///
tolerance(5) maxiter(100) ///
mcmc(burnin(1000) chain(10000)) initsmodel(m3igls) ///
nopause
estimates store m3mcmc
test [FP1]male_1 [FP2]male_2
Output:
Code: Select all
. *-------------------------------------------------------------------------------
. * Prepare the data
. *-------------------------------------------------------------------------------
.
. * Load the data
. webuse sysdsn1, clear
(Health insurance data)
.
. * Generate the variables
. generate cons = 1
. drop if insure==.
(28 observations deleted)
. drop if age==.
(1 observation deleted)
. tabulate site, gen(site)
site | Freq. Percent Cum.
------------+-----------------------------------
1 | 194 31.54 31.54
2 | 228 37.07 68.62
3 | 193 31.38 100.00
------------+-----------------------------------
Total | 615 100.00
.
.
.
. *-------------------------------------------------------------------------------
. * Fit models using mlogit
. *-------------------------------------------------------------------------------
.
. * Base category is 1
. mlogit insure age male nonwhite site2 site3, base(1)
Iteration 0: log likelihood = -555.85446
Iteration 1: log likelihood = -534.67443
Iteration 2: log likelihood = -534.36284
Iteration 3: log likelihood = -534.36165
Iteration 4: log likelihood = -534.36165
Multinomial logistic regression Number of obs = 615
LR chi2(10) = 42.99
Prob > chi2 = 0.0000
Log likelihood = -534.36165 Pseudo R2 = 0.0387
------------------------------------------------------------------------------
insure | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
Indemnity | (base outcome)
-------------+----------------------------------------------------------------
Prepaid |
age | -.011745 .0061946 -1.90 0.058 -.0238862 .0003962
male | .5616934 .2027465 2.77 0.006 .1643175 .9590693
nonwhite | .9747768 .2363213 4.12 0.000 .5115955 1.437958
site2 | .1130359 .2101903 0.54 0.591 -.2989296 .5250013
site3 | -.5879879 .2279351 -2.58 0.010 -1.034733 -.1412433
_cons | .2697127 .3284422 0.82 0.412 -.3740222 .9134476
-------------+----------------------------------------------------------------
Uninsure |
age | -.0077961 .0114418 -0.68 0.496 -.0302217 .0146294
male | .4518496 .3674867 1.23 0.219 -.268411 1.17211
nonwhite | .2170589 .4256361 0.51 0.610 -.6171725 1.05129
site2 | -1.211563 .4705127 -2.57 0.010 -2.133751 -.2893747
site3 | -.2078123 .3662926 -0.57 0.570 -.9257327 .510108
_cons | -1.286943 .5923219 -2.17 0.030 -2.447872 -.1260134
------------------------------------------------------------------------------
. estimates store m1ml
. test male
( 1) [Indemnity]o.male = 0
( 2) [Prepaid]male = 0
( 3) [Uninsure]male = 0
Constraint 1 dropped
chi2( 2) = 7.88
Prob > chi2 = 0.0194
.
. * Base category is 2
. mlogit insure age male nonwhite site2 site3, base(2)
Iteration 0: log likelihood = -555.85446
Iteration 1: log likelihood = -534.67443
Iteration 2: log likelihood = -534.36284
Iteration 3: log likelihood = -534.36165
Iteration 4: log likelihood = -534.36165
Multinomial logistic regression Number of obs = 615
LR chi2(10) = 42.99
Prob > chi2 = 0.0000
Log likelihood = -534.36165 Pseudo R2 = 0.0387
------------------------------------------------------------------------------
insure | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
Indemnity |
age | .011745 .0061946 1.90 0.058 -.0003962 .0238862
male | -.5616934 .2027465 -2.77 0.006 -.9590693 -.1643175
nonwhite | -.9747768 .2363213 -4.12 0.000 -1.437958 -.5115955
site2 | -.1130359 .2101903 -0.54 0.591 -.5250013 .2989296
site3 | .5879879 .2279351 2.58 0.010 .1412433 1.034733
_cons | -.2697127 .3284422 -0.82 0.412 -.9134476 .3740222
-------------+----------------------------------------------------------------
Prepaid | (base outcome)
-------------+----------------------------------------------------------------
Uninsure |
age | .0039489 .0115994 0.34 0.734 -.0187855 .0266832
male | -.1098438 .3651883 -0.30 0.764 -.8255998 .6059122
nonwhite | -.7577178 .4195759 -1.81 0.071 -1.580071 .0646357
site2 | -1.324599 .4697954 -2.82 0.005 -2.245381 -.4038165
site3 | .3801756 .3728188 1.02 0.308 -.3505358 1.110887
_cons | -1.556656 .5963286 -2.61 0.009 -2.725438 -.387873
------------------------------------------------------------------------------
. estimates store m2ml
. test male
( 1) [Indemnity]male = 0
( 2) [Prepaid]o.male = 0
( 3) [Uninsure]male = 0
Constraint 2 dropped
chi2( 2) = 7.88
Prob > chi2 = 0.0194
.
. * Base category is 3
. mlogit insure age male nonwhite site2 site3, base(3)
Iteration 0: log likelihood = -555.85446
Iteration 1: log likelihood = -534.67443
Iteration 2: log likelihood = -534.36284
Iteration 3: log likelihood = -534.36165
Iteration 4: log likelihood = -534.36165
Multinomial logistic regression Number of obs = 615
LR chi2(10) = 42.99
Prob > chi2 = 0.0000
Log likelihood = -534.36165 Pseudo R2 = 0.0387
------------------------------------------------------------------------------
insure | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
Indemnity |
age | .0077961 .0114418 0.68 0.496 -.0146294 .0302217
male | -.4518496 .3674867 -1.23 0.219 -1.17211 .268411
nonwhite | -.2170589 .4256361 -0.51 0.610 -1.05129 .6171725
site2 | 1.211563 .4705127 2.57 0.010 .2893747 2.133751
site3 | .2078123 .3662926 0.57 0.570 -.510108 .9257327
_cons | 1.286943 .5923219 2.17 0.030 .1260134 2.447872
-------------+----------------------------------------------------------------
Prepaid |
age | -.0039489 .0115994 -0.34 0.734 -.0266832 .0187855
male | .1098438 .3651883 0.30 0.764 -.6059122 .8255998
nonwhite | .7577178 .4195759 1.81 0.071 -.0646357 1.580071
site2 | 1.324599 .4697954 2.82 0.005 .4038165 2.245381
site3 | -.3801756 .3728188 -1.02 0.308 -1.110887 .3505358
_cons | 1.556656 .5963286 2.61 0.009 .387873 2.725438
-------------+----------------------------------------------------------------
Uninsure | (base outcome)
------------------------------------------------------------------------------
. estimates store m3ml
. test male
( 1) [Indemnity]male = 0
( 2) [Prepaid]male = 0
( 3) [Uninsure]o.male = 0
Constraint 3 dropped
chi2( 2) = 7.88
Prob > chi2 = 0.0194
.
.
.
. *-------------------------------------------------------------------------------
. * Fit models using runmlwin - IGLS
. *-------------------------------------------------------------------------------
.
. * Base category is 1
. runmlwin insure cons age male nonwhite site2 site3, ///
> level1(patid:) ///
> discrete(distribution(multinomial) link(mlogit) denom(cons) basecategory(1)) ///
> maxiter(100) ///
> nopause
MLwiN 2.30 multilevel model Number of obs = 615
Unordered multinomial logit response model
Estimation algorithm: IGLS, MQL1
----------------------------------
Contrast | Log-odds
-------------+--------------------
1 | 2 vs. 1
2 | 3 vs. 1
----------------------------------
Run time (seconds) = 2.58
Number of iterations = 9
------------------------------------------------------------------------------
| Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
Contrast 1 |
cons_2 | .2252591 .3154893 0.71 0.475 -.3930887 .8436068
age_2 | -.0107379 .0059806 -1.80 0.073 -.0224596 .0009839
male_2 | .5582314 .1935448 2.88 0.004 .1788905 .9375723
nonwhite_2 | .9535601 .2243779 4.25 0.000 .5137874 1.393333
site2_2 | .1158666 .2026885 0.57 0.568 -.2813956 .5131288
site3_2 | -.5801533 .2177574 -2.66 0.008 -1.00695 -.1533567
-------------+----------------------------------------------------------------
Contrast 2 |
cons_3 | -1.453731 .5697818 -2.55 0.011 -2.570482 -.3369788
age_3 | -.0041833 .0109225 -0.38 0.702 -.025591 .0172245
male_3 | .4109438 .3514573 1.17 0.242 -.2778999 1.099787
nonwhite_3 | .1834563 .404807 0.45 0.650 -.6099509 .9768636
site2_3 | -1.175968 .4566931 -2.57 0.010 -2.07107 -.2808663
site3_3 | -.1783079 .3514771 -0.51 0.612 -.8671902 .5105745
------------------------------------------------------------------------------
. estimates store m1igls
. test [FP1]male_2 [FP2]male_3
( 1) [FP1]male_2 = 0
( 2) [FP2]male_3 = 0
chi2( 2) = 12.65
Prob > chi2 = 0.0018
.
. * Base category is 2
. runmlwin insure cons age male nonwhite site2 site3, ///
> level1(patid:) ///
> discrete(distribution(multinomial) link(mlogit) denom(cons) basecategory(2)) ///
> maxiter(100) ///
> nopause
MLwiN 2.30 multilevel model Number of obs = 615
Unordered multinomial logit response model
Estimation algorithm: IGLS, MQL1
----------------------------------
Contrast | Log-odds
-------------+--------------------
1 | 1 vs. 2
2 | 3 vs. 2
----------------------------------
Run time (seconds) = 2.16
Number of iterations = 10
------------------------------------------------------------------------------
| Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
Contrast 1 |
cons_1 | -.2310948 .3137796 -0.74 0.461 -.8460915 .383902
age_1 | .0104725 .0059061 1.77 0.076 -.0011033 .0220482
male_1 | -.5557387 .1950257 -2.85 0.004 -.9379821 -.1734954
nonwhite_1 | -.9664936 .2270495 -4.26 0.000 -1.411502 -.5214848
site2_1 | -.0957603 .2029076 -0.47 0.637 -.4934518 .3019312
site3_1 | .6150761 .2140144 2.87 0.004 .1956155 1.034537
-------------+----------------------------------------------------------------
Contrast 2 |
cons_3 | -1.388619 .5652334 -2.46 0.014 -2.496456 -.2807814
age_3 | -.0007419 .0110818 -0.07 0.947 -.0224617 .020978
male_3 | -.0502273 .3452459 -0.15 0.884 -.7268969 .6264422
nonwhite_3 | -.7764147 .4047265 -1.92 0.055 -1.569664 .0168347
site2_3 | -1.304454 .4585573 -2.84 0.004 -2.20321 -.4056979
site3_3 | .438271 .3494704 1.25 0.210 -.2466783 1.12322
------------------------------------------------------------------------------
. estimates store m2igls
. test [FP1]male_1 [FP2]male_3
( 1) [FP1]male_1 = 0
( 2) [FP2]male_3 = 0
chi2( 2) = 9.04
Prob > chi2 = 0.0109
.
. * Base category is 3
. runmlwin insure cons age male nonwhite site2 site3, ///
> level1(patid:) ///
> discrete(distribution(multinomial) link(mlogit) denom(cons) basecategory(3)) ///
> maxiter(100) ///
> nopause
MLwiN 2.30 multilevel model Number of obs = 615
Unordered multinomial logit response model
Estimation algorithm: IGLS, MQL1
----------------------------------
Contrast | Log-odds
-------------+--------------------
1 | 1 vs. 3
2 | 2 vs. 3
----------------------------------
Run time (seconds) = 6.16
Number of iterations = 62
------------------------------------------------------------------------------
| Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
Contrast 1 |
cons_1 | 1.832884 .3120234 5.87 0.000 1.221329 2.444438
age_1 | -.0046236 .0058567 -0.79 0.430 -.0161026 .0068554
male_1 | -.3018351 .1932547 -1.56 0.118 -.6806073 .0769371
nonwhite_1 | -.2804579 .2263649 -1.24 0.215 -.7241249 .1632092
site2_1 | 1.171023 .2028789 5.77 0.000 .7733874 1.568658
site3_1 | .1718383 .2140424 0.80 0.422 -.2476771 .5913538
-------------+----------------------------------------------------------------
Contrast 2 |
cons_2 | 2.1153 .3131501 6.75 0.000 1.501538 2.729063
age_2 | -.0167806 .0059208 -2.83 0.005 -.028385 -.0051761
male_2 | .2546102 .1917686 1.33 0.184 -.1212494 .6304698
nonwhite_2 | .7097989 .2230472 3.18 0.001 .2726344 1.146964
site2_2 | 1.288371 .2027133 6.36 0.000 .8910607 1.685682
site3_2 | -.4195967 .2175539 -1.93 0.054 -.8459946 .0068012
------------------------------------------------------------------------------
. estimates store m3igls
. test [FP1]male_1 [FP2]male_2
( 1) [FP1]male_1 = 0
( 2) [FP2]male_2 = 0
chi2( 2) = 2.45
Prob > chi2 = 0.2931
.
.
.
. *-------------------------------------------------------------------------------
. * Fit models using runmlwin - MCMC
. *-------------------------------------------------------------------------------
.
. * Base category is 1
. runmlwin insure cons age male nonwhite site2 site3, ///
> level1(patid:) ///
> discrete(distribution(multinomial) link(mlogit) denom(cons) basecategory(1)) ///
> tolerance(5) maxiter(100) ///
> mcmc(burnin(1000) chain(10000)) initsmodel(m1igls) ///
> nopause
MLwiN 2.30 multilevel model Number of obs = 615
Unordered multinomial logit response model
Estimation algorithm: MCMC
----------------------------------
Contrast | Log-odds
-------------+--------------------
1 | 2 vs. 1
2 | 3 vs. 1
----------------------------------
Burnin = 1000
Chain = 10000
Thinning = 1
Run time (seconds) = 38
Deviance (dbar) = 1080.83
Deviance (thetabar) = 1068.89
Effective no. of pars (pd) = 11.95
Bayesian DIC = 1092.78
------------------------------------------------------------------------------
| Mean Std. Dev. ESS P [95% Cred. Interval]
-------------+----------------------------------------------------------------
Contrast 1 |
cons_2 | .2439661 .3306381 100 0.225 -.4377796 .8944722
age_2 | -.0116693 .0062352 106 0.035 -.0235667 .001046
male_2 | .5774286 .2050881 1217 0.001 .1878866 .9868808
nonwhite_2 | .9938435 .2456638 1025 0.000 .5157954 1.487963
site2_2 | .136753 .2092072 505 0.252 -.2779506 .5478277
site3_2 | -.5754883 .2323951 597 0.009 -1.028104 -.1193915
-------------+----------------------------------------------------------------
Contrast 2 |
cons_3 | -1.266102 .5665138 112 0.018 -2.364103 -.1250768
age_3 | -.0089878 .010662 119 0.194 -.0317096 .0106432
male_3 | .4349701 .3721229 1234 0.115 -.3101985 1.164274
nonwhite_3 | .2144592 .43907 1264 0.305 -.6578132 1.055963
site2_3 | -1.257277 .48655 916 0.003 -2.238394 -.3342115
site3_3 | -.2279715 .3668159 610 0.268 -.9427684 .458814
------------------------------------------------------------------------------
. estimates store m1mcmc
. test [FP1]male_2 [FP2]male_3
( 1) [FP1]male_2 = 0
( 2) [FP2]male_3 = 0
chi2( 2) = 8.06
Prob > chi2 = 0.0178
.
. * Base category is 2
. runmlwin insure cons age male nonwhite site2 site3, ///
> level1(patid:) ///
> discrete(distribution(multinomial) link(mlogit) denom(cons) basecategory(2)) ///
> tolerance(5) maxiter(100) ///
> mcmc(burnin(1000) chain(10000)) initsmodel(m2igls) ///
> nopause
MLwiN 2.30 multilevel model Number of obs = 615
Unordered multinomial logit response model
Estimation algorithm: MCMC
----------------------------------
Contrast | Log-odds
-------------+--------------------
1 | 1 vs. 2
2 | 3 vs. 2
----------------------------------
Burnin = 1000
Chain = 10000
Thinning = 1
Run time (seconds) = 37.9
Deviance (dbar) = 1081.00
Deviance (thetabar) = 1068.89
Effective no. of pars (pd) = 12.11
Bayesian DIC = 1093.11
------------------------------------------------------------------------------
| Mean Std. Dev. ESS P [95% Cred. Interval]
-------------+----------------------------------------------------------------
Contrast 1 |
cons_1 | -.2858514 .3604522 75 0.217 -.9505032 .4229137
age_1 | .0122968 .0068222 82 0.041 -.0010685 .0250168
male_1 | -.5763975 .2075412 1256 0.003 -.9714403 -.1700981
nonwhite_1 | -.9925631 .2332351 1392 0.000 -1.442944 -.5384932
site2_1 | -.120517 .2096018 467 0.275 -.5333347 .2939599
site3_1 | .5926808 .2300952 458 0.005 .1455985 1.061441
-------------+----------------------------------------------------------------
Contrast 2 |
cons_3 | -1.570087 .5960569 84 0.005 -2.722909 -.4105062
age_3 | .0034625 .0120921 87 0.377 -.020258 .0264747
male_3 | -.1283098 .372871 1250 0.370 -.8742442 .5816209
nonwhite_3 | -.7938632 .4275478 1282 0.027 -1.680918 .0183959
site2_3 | -1.37527 .4716171 996 0.000 -2.338544 -.4711227
site3_3 | .3826549 .3790009 695 0.152 -.373991 1.165254
------------------------------------------------------------------------------
. estimates store m2mcmc
. test [FP1]male_1 [FP2]male_3
( 1) [FP1]male_1 = 0
( 2) [FP2]male_3 = 0
chi2( 2) = 7.86
Prob > chi2 = 0.0197
.
. * Base category is 3
. runmlwin insure cons age male nonwhite site2 site3, ///
> level1(patid:) ///
> discrete(distribution(multinomial) link(mlogit) denom(cons) basecategory(3)) ///
> tolerance(5) maxiter(100) ///
> mcmc(burnin(1000) chain(10000)) initsmodel(m3igls) ///
> nopause
MLwiN 2.30 multilevel model Number of obs = 615
Unordered multinomial logit response model
Estimation algorithm: MCMC
----------------------------------
Contrast | Log-odds
-------------+--------------------
1 | 1 vs. 3
2 | 2 vs. 3
----------------------------------
Burnin = 1000
Chain = 10000
Thinning = 1
Run time (seconds) = 39.1
Deviance (dbar) = 1080.65
Deviance (thetabar) = 1068.91
Effective no. of pars (pd) = 11.74
Bayesian DIC = 1092.39
------------------------------------------------------------------------------
| Mean Std. Dev. ESS P [95% Cred. Interval]
-------------+----------------------------------------------------------------
Contrast 1 |
cons_1 | 1.341938 .4810038 24 0.000 .3858214 2.298268
age_1 | .0063641 .0090425 28 0.235 -.0111701 .0253759
male_1 | -.4179802 .3716217 266 0.132 -1.117263 .3421187
nonwhite_1 | -.1475168 .4374826 234 0.369 -.9869192 .746083
site2_1 | 1.312172 .4777177 71 0.003 .4427745 2.275522
site3_1 | .2385831 .375573 74 0.261 -.5183207 .9483098
-------------+----------------------------------------------------------------
Contrast 2 |
cons_2 | 1.635101 .5199141 22 0.000 .660411 2.622971
age_2 | -.0058798 .009519 26 0.276 -.0251208 .0121648
male_2 | .1500722 .3760808 256 0.347 -.6029025 .8947665
nonwhite_2 | .833674 .4362051 230 0.029 -.0243578 1.738333
site2_2 | 1.419851 .5008552 64 0.001 .4945272 2.461509
site3_2 | -.356654 .3892435 67 0.173 -1.094442 .4072808
------------------------------------------------------------------------------
. estimates store m3mcmc
. test [FP1]male_1 [FP2]male_2
( 1) [FP1]male_1 = 0
( 2) [FP2]male_2 = 0
chi2( 2) = 7.81
Prob > chi2 = 0.0202
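One way to see why the mlogit Wald tests above agree exactly across base categories is that changing the base is only a linear reparameterisation of the same maximum likelihood fit. A short sketch for the male coefficients follows. The notation is introduced here for illustration and does not appear in the posts above: \beta_{j|k} is the male coefficient for category j versus base k, and V is the estimated covariance matrix of b.

```latex
\begin{aligned}
b &= \begin{pmatrix} \beta_{2|1} \\ \beta_{3|1} \end{pmatrix}, \qquad
b^{*} = \begin{pmatrix} \beta_{1|2} \\ \beta_{3|2} \end{pmatrix}
      = \begin{pmatrix} -\beta_{2|1} \\ \beta_{3|1} - \beta_{2|1} \end{pmatrix}
      = A b, \qquad
A = \begin{pmatrix} -1 & 0 \\ -1 & 1 \end{pmatrix}, \\
V^{*} &= A V A^{\top}, \\
W^{*} &= b^{*\top} (V^{*})^{-1} b^{*}
       = b^{\top} A^{\top} (A^{\top})^{-1} V^{-1} A^{-1} A \, b
       = b^{\top} V^{-1} b = W .
\end{aligned}
```

Maximum likelihood estimates obey these relations exactly, which is why chi2 = 7.88 in all three mlogit runs. The IGLS (MQL1) estimates are approximations whose accuracy varies with the base category, so the relations need not hold and the statistic moves (12.65, 9.04, 2.45). The MCMC results (8.06, 7.86, 7.81) differ only slightly, which is presumably Monte Carlo error.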
Re: Wald test following multinomial logistic regression
Posted: Fri Jul 04, 2014 3:32 pm
by mlwinnewbie
Hi George,
Thanks again for your helpful reply. I used the mcmc approach you suggested -
set more off
runmlwin invw topic2 topic3 cons, ///
level2(IDD: cons) ///
level1(id) ///
discrete(distribution(multinomial) ///
link(mlogit) ///
denominator(cons) ///
basecategory(2)) ///
nopause
estimates store m2igls
set more off
eststo m2mcmc: runmlwin invw topic2 topic3 cons, ///
level2(IDD: cons) ///
level1(id) ///
discrete(distribution(multinomial) ///
link(mlogit) ///
denominator(cons) ///
basecategory(2)) ///
tolerance(5) maxiter(100) ///
mcmc(burnin(1000) chain(10000)) initsmodel(m2igls) ///
nopause rrr
estimates store m2mcmc
test [FP1]topic2_1 [FP1]topic3_1 [FP2]topic2_3 [FP2]topic3_3
When I compared the Wald test when using base(2) and base(3) I received different results:
with base 2:
. test [FP1]topic2_1 [FP1]topic3_1 [FP2]topic2_2 [FP2]topic3_2
( 1) [FP1]topic2_1 = 0
( 2) [FP1]topic3_1 = 0
( 3) [FP2]topic2_2 = 0
( 4) [FP2]topic3_2 = 0
chi2( 4) = 91.40
Prob > chi2 = 0.0000
with base 3:
. test [FP1]topic2_1 [FP1]topic3_1 [FP2]topic2_3 [FP2]topic3_3
( 1) [FP1]topic2_1 = 0
( 2) [FP1]topic3_1 = 0
( 3) [FP2]topic2_3 = 0
( 4) [FP2]topic3_3 = 0
chi2( 4) = 103.89
Prob > chi2 = 0.0000
I am assuming this is due to the approximation issue that you mentioned in your reply. If I were interested in looking at comparisons using different bases, would I need to report the different chi2 results? I am just wondering how reviewers would respond to this.
Once again I'd be grateful for your comments.
Many thanks,
Francesca
Re: Wald test following multinomial logistic regression
Posted: Fri Jul 04, 2014 6:17 pm
by GeorgeLeckie
Hi Francesca,
Note that this is an MLwiN-specific query (as opposed to your first query, which was a runmlwin query), so it may well receive more responses if you post it on the MLwiN forum.
The approximation problem I previously described is specific to quasilikelihood methods in MLwiN. It does not apply to MCMC estimation in MLwiN.
Having said that, when using MCMC methods you will typically see small differences in fit between reparameterisations of a model (here, different choices of base category). However, these differences should be trivially small, assuming that the different parameterisations are stable and that you have specified sensible starting values and long enough burnin and chain periods. This appears to be the case in the example I provided: the DIC statistics are effectively the same, differing by less than 1 point (see below). Differences of 5 or more are typically taken as meaningful differences in model fit.
I would just pick the most stable choice of base category (typically the largest category) and present those results. You can always manipulate the parameter estimates to get different log-odds contrasts if they are of interest.
Best wishes
George
Syntax:
Code: Select all
estimates table m1mcmc m2mcmc m3mcmc, stats(dic) b(%4.3f) style(oneline)
Output:
Code: Select all
. estimates table m1mcmc m2mcmc m3mcmc, stats(dic) b(%4.3f) style(oneline)
--------------------------------------------
    Variable |    m1mcmc    m2mcmc    m3mcmc
-------------+------------------------------
FP1          |
      cons_2 |     0.244
       age_2 |    -0.012
      male_2 |     0.577
  nonwhite_2 |     0.994
     site2_2 |     0.137
     site3_2 |    -0.575
      cons_1 |              -0.286     1.342
       age_1 |               0.012     0.006
      male_1 |              -0.576    -0.418
  nonwhite_1 |              -0.993    -0.148
     site2_1 |              -0.121     1.312
     site3_1 |               0.593     0.239
-------------+------------------------------
FP2          |
      cons_3 |    -1.266    -1.570
       age_3 |    -0.009     0.003
      male_3 |     0.435    -0.128
  nonwhite_3 |     0.214    -0.794
     site2_3 |    -1.257    -1.375
     site3_3 |    -0.228     0.383
      cons_2 |                         1.635
       age_2 |                        -0.006
      male_2 |                         0.150
  nonwhite_2 |                         0.834
     site2_2 |                         1.420
     site3_2 |                        -0.357
-------------+------------------------------
OD           |
     bcons_1 |     1.000     1.000     1.000
-------------+------------------------------
Statistics   |
         dic |  1092.781  1093.106  1092.389
--------------------------------------------
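As a quick check of the "less than 1 point" statement, the spread of the three DIC values in the table can be computed directly. This is only arithmetic on the figures reported above, and the scalar names are made up for illustration.

```stata
* DIC values copied from the estimates table above
scalar dic_m1 = 1092.781
scalar dic_m2 = 1093.106
scalar dic_m3 = 1092.389

* Largest pairwise difference in DIC across the three base categories
scalar dic_range = max(dic_m1, dic_m2, dic_m3) - min(dic_m1, dic_m2, dic_m3)
display "DIC range = " %5.3f dic_range
```

The range is about 0.72, well short of the difference of 5 or more that is usually treated as meaningful.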
Re: Wald test following multinomial logistic regression
Posted: Mon Jul 07, 2014 9:17 am
by mlwinnewbie
Hi George,
Thanks a lot for your reply! Can I also double-check what you mean by "You can always manipulate the parameter estimates to get different log-odds contrasts if they are of interest" - how could I achieve that?
Thanks again,
Francesca
Re: Wald test following multinomial logistic regression
Posted: Mon Jul 07, 2014 5:14 pm
by GeorgeLeckie
Hi,
Any decent book which covers multinomial response models should show the algebraic manipulations of the current coefficients to get those which you would obtain directly if you were to change the base category.
The following is very accessible...
Long, J. S. (1997). Regression Models for Categorical and Limited Dependent Variables. Advanced Quantitative Techniques in the Social Sciences. Sage.
Best wishes
George
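For readers without the book to hand, the manipulation amounts to taking differences of the existing coefficients, and Stata's lincom command does this and also returns the standard error. A minimal sketch using the mlogit example from earlier in the thread follows. The equation names Prepaid and Uninsure are the value labels shown in that output. The commented-out lines at the end assume that lincom accepts runmlwin's stored results in the same way that test does, which is an assumption rather than something shown in this thread.

```stata
* Refit the base-1 model from the earlier example
webuse sysdsn1, clear
drop if insure==.
drop if age==.
tabulate site, gen(site)
mlogit insure age male nonwhite site2 site3, base(1)

* Uninsure vs. Prepaid log-odds for male
*   = (Uninsure vs. Indemnity) - (Prepaid vs. Indemnity)
lincom [Uninsure]male - [Prepaid]male

* Assumed analogue after runmlwin with base category 1 (needs the stored m1mcmc results)
* estimates restore m1mcmc
* lincom [FP2]male_3 - [FP1]male_2
```

With maximum likelihood, the lincom result should reproduce the male coefficient in the Uninsure equation of the base(2) output above (-.1098438, standard error .3651883), so the model does not need to be refitted to report that contrast.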