Wald test following multinomial logistic regression

mlwinnewbie · Post by **mlwinnewbie** » Wed Jul 02, 2014 1:41 pm

Hi everyone,

I am interested in assessing whether the topic selected (social vs. personal vs. clinical) is a predictor of involvement in therapy (yes, medium and low).

I wrote the following script in Stata:

Code: Select all

global MLwiN_path "C:\Program Files (x86)\MLwiN v2.29\i386\mlwin.exe"
capture drop cons
sort IDD wave
gen cons=1
gen id=_n

* create 2 dummy variables so that topic level 1 is the reference category
g topic2 = (topic3g_w==2)
g topic3 = (topic3g_w==3)

set matsize 800

set more off
runmlwin invw topic2 topic3 cons, ///
level2(IDD: cons) ///
level1(id) ///
discrete(distribution(multinomial) ///
link(mlogit) ///
denominator(cons) ///
basecategory(3)) ///
nopause

I then wanted to run a Wald test to assess if overall topic is a predictor of involvement. If I were to use the mlogit command, I could then type:

Code: Select all

test 2.topic3g_w 3.topic3g_w

In order to obtain an equivalent Wald test, following the runmlwin command above I then typed:

Code: Select all

test  [FP1]topic2_1 [FP1]topic3_1 [FP2]topic2_2 [FP2]topic3_2

Is this correct?

Also, depending on the base category I select, the overall Wald test produces different results. This is not the case when I use mlogit, so I am wondering if I am doing something wrong.

Thanks for your help.
F
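
A quick way to confirm which names to pass to test is to list the stored coefficient vector straight after fitting the model. This is only a sketch: it assumes the runmlwin model above has just been run, and that runmlwin labels the fixed-part columns as equation:coefficient (for example FP1:topic2_1), as the test syntax in this thread suggests.

```stata
* Run immediately after the runmlwin command, while its results are still in e()
* Column names are shown as equation:coefficient, e.g. FP1:topic2_1
matrix list e(b)

* Illustrative joint test using the names exactly as listed.
* These are the names under basecategory(3); a different base category
* changes the numeric suffixes, so re-check the listing after each fit.
test [FP1]topic2_1 [FP1]topic3_1 [FP2]topic2_2 [FP2]topic3_2
```

If a name in the test command does not appear in the listing, test stops with an error rather than silently testing something else.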

GeorgeLeckie · Post by **GeorgeLeckie** » Fri Jul 04, 2014 1:20 pm

Hi F,

In terms of your first query, yes, that is how you go about doing an overall test for whether the predictor has any explanatory power. I have given an example below.

Best wishes

George

Syntax

Code: Select all

* Load the data
use http://www.bristol.ac.uk/cmm/media/runmlwin/bang, clear

* Generate dummy variables
generate lc1 = (lc==1)
generate lc2 = (lc==2)
generate lc3plus = (lc>=3)

* Fit the model
runmlwin use4 cons lc1 lc2 lc3plus, ///
  level1(woman) ///
  discrete(distribution(multinomial) link(mlogit) denom(cons) basecategory(4)) ///
  nopause

* Perform the Wald test
test ///
  [FP1]lc1_1 [FP1]lc2_1 [FP1]lc3plus_1 ///
  [FP2]lc1_2 [FP2]lc2_2 [FP2]lc3plus_2 ///
  [FP3]lc1_3 [FP3]lc2_3 [FP3]lc3plus_3

Output

Code: Select all

. * Load the data
. use http://www.bristol.ac.uk/cmm/media/runmlwin/bang, clear

. 
. * Generate dummy variables
. generate lc1 = (lc==1)

. generate lc2 = (lc==2)

. generate lc3plus = (lc>=3)

. 
. * Fit the model
. runmlwin use4 cons lc1 lc2 lc3plus, ///
>   level1(woman) ///
>   discrete(distribution(multinomial) link(mlogit) denom(cons) basecategory(4)) ///
>   nopause
 
MLwiN 2.30 multilevel model                     Number of obs      =      2867
Unordered multinomial logit response model
Estimation algorithm: IGLS, MQL1

----------------------------------
    Contrast | Log-odds
-------------+--------------------
           1 | 1 vs. 4
           2 | 2 vs. 4
           3 | 3 vs. 4
----------------------------------

Run time (seconds)   =       5.17
Number of iterations =         10
------------------------------------------------------------------------------
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
Contrast 1   |
      cons_1 |  -3.884983   .2909287   -13.35   0.000    -4.455193   -3.314773
       lc1_1 |     2.1912    .325586     6.73   0.000     1.553063    2.829337
       lc2_1 |   2.664649   .3188518     8.36   0.000     2.039711    3.289586
   lc3plus_1 |   2.574364    .302671     8.51   0.000      1.98114    3.167589
-------------+----------------------------------------------------------------
Contrast 2   |
      cons_2 |  -1.472055   .0949997   -15.50   0.000    -1.658251   -1.285859
       lc1_2 |   .7469173   .1376688     5.43   0.000     .4770915    1.016743
       lc2_2 |   .6903579   .1455586     4.74   0.000     .4050683    .9756474
   lc3plus_2 |   .2076819   .1254473     1.66   0.098    -.0381903     .453554
-------------+----------------------------------------------------------------
Contrast 3   |
      cons_3 |  -2.585702   .1552284   -16.66   0.000    -2.889944    -2.28146
       lc1_3 |   .7473474   .2200436     3.40   0.001     .3160699    1.178625
       lc2_3 |   1.063134   .2147492     4.95   0.000     .6422336    1.484035
   lc3plus_3 |   1.101028    .179334     6.14   0.000     .7495399    1.452516
------------------------------------------------------------------------------


. 
. * Perform the Wald test
. test ///
>   [FP1]lc1_1 [FP1]lc2_1 [FP1]lc3plus_1 ///
>   [FP2]lc1_2 [FP2]lc2_2 [FP2]lc3plus_2 ///
>   [FP3]lc1_3 [FP3]lc2_3 [FP3]lc3plus_3

 ( 1)  [FP1]lc1_1 = 0
 ( 2)  [FP1]lc2_1 = 0
 ( 3)  [FP1]lc3plus_1 = 0
 ( 4)  [FP2]lc1_2 = 0
 ( 5)  [FP2]lc2_2 = 0
 ( 6)  [FP2]lc3plus_2 = 0
 ( 7)  [FP3]lc1_3 = 0
 ( 8)  [FP3]lc2_3 = 0
 ( 9)  [FP3]lc3plus_3 = 0

           chi2(  9) =  178.34
         Prob > chi2 =    0.0000
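
For readers who want the formula behind the chi2(9) figure: Stata's test command computes a Wald statistic from the stored coefficient vector and its estimated covariance matrix. A sketch, in notation introduced here purely for illustration ($\hat{\beta}_S$ for the nine tested coefficients, $\hat{V}_S$ for the matching block of the estimated covariance matrix):

```latex
W = \hat{\beta}_S^{\top} \, \hat{V}_S^{-1} \, \hat{\beta}_S
\;\sim\; \chi^2_q \quad \text{(asymptotically, under } H_0 : \beta_S = 0\text{)},
\qquad q = 3 \text{ dummies} \times 3 \text{ contrasts} = 9 .
```

Here $W = 178.34$ on 9 degrees of freedom, so the hypothesis that lc has no effect on any contrast is rejected.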

mlwinnewbie · Post by **mlwinnewbie** » Fri Jul 04, 2014 1:46 pm

Hi George,

Thank you very much for your reply - it is really helpful.

I was hoping you could comment on my second query - apologies if this is a very basic question:
Depending on the base category I select, the overall Wald test produces different results. This is not the case when I use mlogit, so I am wondering if I am doing something wrong.

Does this make sense? And if so, I should select the base category based on theory and NOT present all possible comparisons, correct?

Thanks again,
Francesca

GeorgeLeckie · Post by **GeorgeLeckie** » Fri Jul 04, 2014 2:39 pm

Hi Francesca,

First, a general piece of advice:

Default estimation for discrete response models in MLwiN is by quasilikelihood methods. These methods only provide approximate maximum likelihood estimates and so we recommend users always use the MCMC methods in MLwiN for any final discrete response model.

You have stumbled on a rather interesting second order estimation limitation of the quasilikelihood methods in MLwiN:

It turns out that when fitting multinomial models in MLwiN by quasilikelihood methods, the degree of approximation is itself sensitive to the choice of base category. Note that this problem does not manifest itself when you use the recommended MCMC methods.

An example is given below. For simplicity this is for a single-level multinomial model, but the same issues apply for multilevel multinomial models.

Best wishes

George

Syntax:

Code: Select all

*-------------------------------------------------------------------------------
* Prepare the data
*-------------------------------------------------------------------------------

* Load the data
webuse sysdsn1, clear

* Generate the variables
generate cons = 1
drop if insure==.
drop if age==.
tabulate site, gen(site)



*-------------------------------------------------------------------------------
* Fit models using mlogit
*-------------------------------------------------------------------------------

* Base category is 1
mlogit insure age male nonwhite site2 site3, base(1)
estimates store m1ml
test male

* Base category is 2
mlogit insure age male nonwhite site2 site3, base(2)
estimates store m2ml
test male

* Base category is 3
mlogit insure age male nonwhite site2 site3, base(3)
estimates store m3ml
test male



*-------------------------------------------------------------------------------
* Fit models using runmlwin - IGLS
*-------------------------------------------------------------------------------

* Base category is 1
runmlwin insure cons age male nonwhite site2 site3, ///
  level1(patid:) ///
  discrete(distribution(multinomial) link(mlogit) denom(cons) basecategory(1)) ///
  maxiter(100) ///
  nopause
estimates store m1igls
test [FP1]male_2 [FP2]male_3

* Base category is 2
runmlwin insure cons age male nonwhite site2 site3, ///
  level1(patid:) ///
  discrete(distribution(multinomial) link(mlogit) denom(cons) basecategory(2)) ///
  maxiter(100) ///
  nopause
estimates store m2igls
test [FP1]male_1 [FP2]male_3

* Base category is 3
runmlwin insure cons age male nonwhite site2 site3, ///
  level1(patid:) ///
  discrete(distribution(multinomial) link(mlogit) denom(cons) basecategory(3)) ///
  maxiter(100) ///
  nopause
estimates store m3igls
test [FP1]male_1 [FP2]male_2



*-------------------------------------------------------------------------------
* Fit models using runmlwin - MCMC
*-------------------------------------------------------------------------------

* Base category is 1
runmlwin insure cons age male nonwhite site2 site3, ///
  level1(patid:) ///
  discrete(distribution(multinomial) link(mlogit) denom(cons) basecategory(1)) ///
  tolerance(5) maxiter(100) ///
  mcmc(burnin(1000) chain(10000)) initsmodel(m1igls) ///
  nopause
estimates store m1mcmc
test [FP1]male_2 [FP2]male_3

* Base category is 2
runmlwin insure cons age male nonwhite site2 site3, ///
  level1(patid:) ///
  discrete(distribution(multinomial) link(mlogit) denom(cons) basecategory(2)) ///
  tolerance(5) maxiter(100) ///
  mcmc(burnin(1000) chain(10000)) initsmodel(m2igls) ///
  nopause
estimates store m2mcmc
test [FP1]male_1 [FP2]male_3

* Base category is 3
runmlwin insure cons age male nonwhite site2 site3, ///
  level1(patid:) ///
  discrete(distribution(multinomial) link(mlogit) denom(cons) basecategory(3)) ///
  tolerance(5) maxiter(100) ///
  mcmc(burnin(1000) chain(10000)) initsmodel(m3igls) ///
  nopause
estimates store m3mcmc
test [FP1]male_1 [FP2]male_2

Output:

Code: Select all

. *-------------------------------------------------------------------------------
. * Prepare the data
. *-------------------------------------------------------------------------------
. 
. * Load the data
. webuse sysdsn1, clear
(Health insurance data)

. 
. * Generate the variables
. generate cons = 1

. drop if insure==.
(28 observations deleted)

. drop if age==.
(1 observation deleted)

. tabulate site, gen(site)

       site |      Freq.     Percent        Cum.
------------+-----------------------------------
          1 |        194       31.54       31.54
          2 |        228       37.07       68.62
          3 |        193       31.38      100.00
------------+-----------------------------------
      Total |        615      100.00

. 
. 
. 
. *-------------------------------------------------------------------------------
. * Fit models using mlogit
. *-------------------------------------------------------------------------------
. 
. * Base category is 1
. mlogit insure age male nonwhite site2 site3, base(1)

Iteration 0:   log likelihood = -555.85446  
Iteration 1:   log likelihood = -534.67443  
Iteration 2:   log likelihood = -534.36284  
Iteration 3:   log likelihood = -534.36165  
Iteration 4:   log likelihood = -534.36165  

Multinomial logistic regression                   Number of obs   =        615
                                                  LR chi2(10)     =      42.99
                                                  Prob > chi2     =     0.0000
Log likelihood = -534.36165                       Pseudo R2       =     0.0387

------------------------------------------------------------------------------
      insure |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
Indemnity    |  (base outcome)
-------------+----------------------------------------------------------------
Prepaid      |
         age |   -.011745   .0061946    -1.90   0.058    -.0238862    .0003962
        male |   .5616934   .2027465     2.77   0.006     .1643175    .9590693
    nonwhite |   .9747768   .2363213     4.12   0.000     .5115955    1.437958
       site2 |   .1130359   .2101903     0.54   0.591    -.2989296    .5250013
       site3 |  -.5879879   .2279351    -2.58   0.010    -1.034733   -.1412433
       _cons |   .2697127   .3284422     0.82   0.412    -.3740222    .9134476
-------------+----------------------------------------------------------------
Uninsure     |
         age |  -.0077961   .0114418    -0.68   0.496    -.0302217    .0146294
        male |   .4518496   .3674867     1.23   0.219     -.268411     1.17211
    nonwhite |   .2170589   .4256361     0.51   0.610    -.6171725     1.05129
       site2 |  -1.211563   .4705127    -2.57   0.010    -2.133751   -.2893747
       site3 |  -.2078123   .3662926    -0.57   0.570    -.9257327     .510108
       _cons |  -1.286943   .5923219    -2.17   0.030    -2.447872   -.1260134
------------------------------------------------------------------------------

. estimates store m1ml

. test male

 ( 1)  [Indemnity]o.male = 0
 ( 2)  [Prepaid]male = 0
 ( 3)  [Uninsure]male = 0
       Constraint 1 dropped

           chi2(  2) =    7.88
         Prob > chi2 =    0.0194

. 
. * Base category is 2
. mlogit insure age male nonwhite site2 site3, base(2)

Iteration 0:   log likelihood = -555.85446  
Iteration 1:   log likelihood = -534.67443  
Iteration 2:   log likelihood = -534.36284  
Iteration 3:   log likelihood = -534.36165  
Iteration 4:   log likelihood = -534.36165  

Multinomial logistic regression                   Number of obs   =        615
                                                  LR chi2(10)     =      42.99
                                                  Prob > chi2     =     0.0000
Log likelihood = -534.36165                       Pseudo R2       =     0.0387

------------------------------------------------------------------------------
      insure |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
Indemnity    |
         age |    .011745   .0061946     1.90   0.058    -.0003962    .0238862
        male |  -.5616934   .2027465    -2.77   0.006    -.9590693   -.1643175
    nonwhite |  -.9747768   .2363213    -4.12   0.000    -1.437958   -.5115955
       site2 |  -.1130359   .2101903    -0.54   0.591    -.5250013    .2989296
       site3 |   .5879879   .2279351     2.58   0.010     .1412433    1.034733
       _cons |  -.2697127   .3284422    -0.82   0.412    -.9134476    .3740222
-------------+----------------------------------------------------------------
Prepaid      |  (base outcome)
-------------+----------------------------------------------------------------
Uninsure     |
         age |   .0039489   .0115994     0.34   0.734    -.0187855    .0266832
        male |  -.1098438   .3651883    -0.30   0.764    -.8255998    .6059122
    nonwhite |  -.7577178   .4195759    -1.81   0.071    -1.580071    .0646357
       site2 |  -1.324599   .4697954    -2.82   0.005    -2.245381   -.4038165
       site3 |   .3801756   .3728188     1.02   0.308    -.3505358    1.110887
       _cons |  -1.556656   .5963286    -2.61   0.009    -2.725438    -.387873
------------------------------------------------------------------------------

. estimates store m2ml

. test male

 ( 1)  [Indemnity]male = 0
 ( 2)  [Prepaid]o.male = 0
 ( 3)  [Uninsure]male = 0
       Constraint 2 dropped

           chi2(  2) =    7.88
         Prob > chi2 =    0.0194

. 
. * Base category is 3
. mlogit insure age male nonwhite site2 site3, base(3)

Iteration 0:   log likelihood = -555.85446  
Iteration 1:   log likelihood = -534.67443  
Iteration 2:   log likelihood = -534.36284  
Iteration 3:   log likelihood = -534.36165  
Iteration 4:   log likelihood = -534.36165  

Multinomial logistic regression                   Number of obs   =        615
                                                  LR chi2(10)     =      42.99
                                                  Prob > chi2     =     0.0000
Log likelihood = -534.36165                       Pseudo R2       =     0.0387

------------------------------------------------------------------------------
      insure |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
Indemnity    |
         age |   .0077961   .0114418     0.68   0.496    -.0146294    .0302217
        male |  -.4518496   .3674867    -1.23   0.219     -1.17211     .268411
    nonwhite |  -.2170589   .4256361    -0.51   0.610     -1.05129    .6171725
       site2 |   1.211563   .4705127     2.57   0.010     .2893747    2.133751
       site3 |   .2078123   .3662926     0.57   0.570     -.510108    .9257327
       _cons |   1.286943   .5923219     2.17   0.030     .1260134    2.447872
-------------+----------------------------------------------------------------
Prepaid      |
         age |  -.0039489   .0115994    -0.34   0.734    -.0266832    .0187855
        male |   .1098438   .3651883     0.30   0.764    -.6059122    .8255998
    nonwhite |   .7577178   .4195759     1.81   0.071    -.0646357    1.580071
       site2 |   1.324599   .4697954     2.82   0.005     .4038165    2.245381
       site3 |  -.3801756   .3728188    -1.02   0.308    -1.110887    .3505358
       _cons |   1.556656   .5963286     2.61   0.009      .387873    2.725438
-------------+----------------------------------------------------------------
Uninsure     |  (base outcome)
------------------------------------------------------------------------------

. estimates store m3ml

. test male

 ( 1)  [Indemnity]male = 0
 ( 2)  [Prepaid]male = 0
 ( 3)  [Uninsure]o.male = 0
       Constraint 3 dropped

           chi2(  2) =    7.88
         Prob > chi2 =    0.0194

. 
. 
. 
. *-------------------------------------------------------------------------------
. * Fit models using runmlwin - IGLS
. *-------------------------------------------------------------------------------
. 
. * Base category is 1
. runmlwin insure cons age male nonwhite site2 site3, ///
>   level1(patid:) ///
>   discrete(distribution(multinomial) link(mlogit) denom(cons) basecategory(1)) ///
>   maxiter(100) ///
>   nopause
 
MLwiN 2.30 multilevel model                     Number of obs      =       615
Unordered multinomial logit response model
Estimation algorithm: IGLS, MQL1

----------------------------------
    Contrast | Log-odds
-------------+--------------------
           1 | 2 vs. 1
           2 | 3 vs. 1
----------------------------------

Run time (seconds)   =       2.58
Number of iterations =          9
------------------------------------------------------------------------------
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
Contrast 1   |
      cons_2 |   .2252591   .3154893     0.71   0.475    -.3930887    .8436068
       age_2 |  -.0107379   .0059806    -1.80   0.073    -.0224596    .0009839
      male_2 |   .5582314   .1935448     2.88   0.004     .1788905    .9375723
  nonwhite_2 |   .9535601   .2243779     4.25   0.000     .5137874    1.393333
     site2_2 |   .1158666   .2026885     0.57   0.568    -.2813956    .5131288
     site3_2 |  -.5801533   .2177574    -2.66   0.008     -1.00695   -.1533567
-------------+----------------------------------------------------------------
Contrast 2   |
      cons_3 |  -1.453731   .5697818    -2.55   0.011    -2.570482   -.3369788
       age_3 |  -.0041833   .0109225    -0.38   0.702     -.025591    .0172245
      male_3 |   .4109438   .3514573     1.17   0.242    -.2778999    1.099787
  nonwhite_3 |   .1834563    .404807     0.45   0.650    -.6099509    .9768636
     site2_3 |  -1.175968   .4566931    -2.57   0.010     -2.07107   -.2808663
     site3_3 |  -.1783079   .3514771    -0.51   0.612    -.8671902    .5105745
------------------------------------------------------------------------------


. estimates store m1igls

. test [FP1]male_2 [FP2]male_3

 ( 1)  [FP1]male_2 = 0
 ( 2)  [FP2]male_3 = 0

           chi2(  2) =   12.65
         Prob > chi2 =    0.0018

. 
. * Base category is 2
. runmlwin insure cons age male nonwhite site2 site3, ///
>   level1(patid:) ///
>   discrete(distribution(multinomial) link(mlogit) denom(cons) basecategory(2)) ///
>   maxiter(100) ///
>   nopause
 
MLwiN 2.30 multilevel model                     Number of obs      =       615
Unordered multinomial logit response model
Estimation algorithm: IGLS, MQL1

----------------------------------
    Contrast | Log-odds
-------------+--------------------
           1 | 1 vs. 2
           2 | 3 vs. 2
----------------------------------

Run time (seconds)   =       2.16
Number of iterations =         10
------------------------------------------------------------------------------
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
Contrast 1   |
      cons_1 |  -.2310948   .3137796    -0.74   0.461    -.8460915     .383902
       age_1 |   .0104725   .0059061     1.77   0.076    -.0011033    .0220482
      male_1 |  -.5557387   .1950257    -2.85   0.004    -.9379821   -.1734954
  nonwhite_1 |  -.9664936   .2270495    -4.26   0.000    -1.411502   -.5214848
     site2_1 |  -.0957603   .2029076    -0.47   0.637    -.4934518    .3019312
     site3_1 |   .6150761   .2140144     2.87   0.004     .1956155    1.034537
-------------+----------------------------------------------------------------
Contrast 2   |
      cons_3 |  -1.388619   .5652334    -2.46   0.014    -2.496456   -.2807814
       age_3 |  -.0007419   .0110818    -0.07   0.947    -.0224617     .020978
      male_3 |  -.0502273   .3452459    -0.15   0.884    -.7268969    .6264422
  nonwhite_3 |  -.7764147   .4047265    -1.92   0.055    -1.569664    .0168347
     site2_3 |  -1.304454   .4585573    -2.84   0.004     -2.20321   -.4056979
     site3_3 |    .438271   .3494704     1.25   0.210    -.2466783     1.12322
------------------------------------------------------------------------------


. estimates store m2igls

. test [FP1]male_1 [FP2]male_3

 ( 1)  [FP1]male_1 = 0
 ( 2)  [FP2]male_3 = 0

           chi2(  2) =    9.04
         Prob > chi2 =    0.0109

. 
. * Base category is 3
. runmlwin insure cons age male nonwhite site2 site3, ///
>   level1(patid:) ///
>   discrete(distribution(multinomial) link(mlogit) denom(cons) basecategory(3)) ///
>   maxiter(100) ///
>   nopause
 
MLwiN 2.30 multilevel model                     Number of obs      =       615
Unordered multinomial logit response model
Estimation algorithm: IGLS, MQL1

----------------------------------
    Contrast | Log-odds
-------------+--------------------
           1 | 1 vs. 3
           2 | 2 vs. 3
----------------------------------

Run time (seconds)   =       6.16
Number of iterations =         62
------------------------------------------------------------------------------
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
Contrast 1   |
      cons_1 |   1.832884   .3120234     5.87   0.000     1.221329    2.444438
       age_1 |  -.0046236   .0058567    -0.79   0.430    -.0161026    .0068554
      male_1 |  -.3018351   .1932547    -1.56   0.118    -.6806073    .0769371
  nonwhite_1 |  -.2804579   .2263649    -1.24   0.215    -.7241249    .1632092
     site2_1 |   1.171023   .2028789     5.77   0.000     .7733874    1.568658
     site3_1 |   .1718383   .2140424     0.80   0.422    -.2476771    .5913538
-------------+----------------------------------------------------------------
Contrast 2   |
      cons_2 |     2.1153   .3131501     6.75   0.000     1.501538    2.729063
       age_2 |  -.0167806   .0059208    -2.83   0.005     -.028385   -.0051761
      male_2 |   .2546102   .1917686     1.33   0.184    -.1212494    .6304698
  nonwhite_2 |   .7097989   .2230472     3.18   0.001     .2726344    1.146964
     site2_2 |   1.288371   .2027133     6.36   0.000     .8910607    1.685682
     site3_2 |  -.4195967   .2175539    -1.93   0.054    -.8459946    .0068012
------------------------------------------------------------------------------


. estimates store m3igls

. test [FP1]male_1 [FP2]male_2

 ( 1)  [FP1]male_1 = 0
 ( 2)  [FP2]male_2 = 0

           chi2(  2) =    2.45
         Prob > chi2 =    0.2931

. 
. 
. 
. *-------------------------------------------------------------------------------
. * Fit models using runmlwin - MCMC
. *-------------------------------------------------------------------------------
. 
. * Base category is 1
. runmlwin insure cons age male nonwhite site2 site3, ///
>   level1(patid:) ///
>   discrete(distribution(multinomial) link(mlogit) denom(cons) basecategory(1)) ///
>   tolerance(5) maxiter(100) ///
>   mcmc(burnin(1000) chain(10000)) initsmodel(m1igls) ///
>   nopause
 
MLwiN 2.30 multilevel model                     Number of obs      =       615
Unordered multinomial logit response model
Estimation algorithm: MCMC

----------------------------------
    Contrast | Log-odds
-------------+--------------------
           1 | 2 vs. 1
           2 | 3 vs. 1
----------------------------------

Burnin                     =       1000
Chain                      =      10000
Thinning                   =          1
Run time (seconds)         =         38
Deviance (dbar)            =    1080.83
Deviance (thetabar)        =    1068.89
Effective no. of pars (pd) =      11.95
Bayesian DIC               =    1092.78
------------------------------------------------------------------------------
             |      Mean    Std. Dev.     ESS     P       [95% Cred. Interval]
-------------+----------------------------------------------------------------
Contrast 1   |
      cons_2 |   .2439661   .3306381      100   0.225    -.4377796    .8944722
       age_2 |  -.0116693   .0062352      106   0.035    -.0235667     .001046
      male_2 |   .5774286   .2050881     1217   0.001     .1878866    .9868808
  nonwhite_2 |   .9938435   .2456638     1025   0.000     .5157954    1.487963
     site2_2 |    .136753   .2092072      505   0.252    -.2779506    .5478277
     site3_2 |  -.5754883   .2323951      597   0.009    -1.028104   -.1193915
-------------+----------------------------------------------------------------
Contrast 2   |
      cons_3 |  -1.266102   .5665138      112   0.018    -2.364103   -.1250768
       age_3 |  -.0089878    .010662      119   0.194    -.0317096    .0106432
      male_3 |   .4349701   .3721229     1234   0.115    -.3101985    1.164274
  nonwhite_3 |   .2144592     .43907     1264   0.305    -.6578132    1.055963
     site2_3 |  -1.257277     .48655      916   0.003    -2.238394   -.3342115
     site3_3 |  -.2279715   .3668159      610   0.268    -.9427684     .458814
------------------------------------------------------------------------------


. estimates store m1mcmc

. test [FP1]male_2 [FP2]male_3

 ( 1)  [FP1]male_2 = 0
 ( 2)  [FP2]male_3 = 0

           chi2(  2) =    8.06
         Prob > chi2 =    0.0178

. 
. * Base category is 2
. runmlwin insure cons age male nonwhite site2 site3, ///
>   level1(patid:) ///
>   discrete(distribution(multinomial) link(mlogit) denom(cons) basecategory(2)) ///
>   tolerance(5) maxiter(100) ///
>   mcmc(burnin(1000) chain(10000)) initsmodel(m2igls) ///
>   nopause
 
MLwiN 2.30 multilevel model                     Number of obs      =       615
Unordered multinomial logit response model
Estimation algorithm: MCMC

----------------------------------
    Contrast | Log-odds
-------------+--------------------
           1 | 1 vs. 2
           2 | 3 vs. 2
----------------------------------

Burnin                     =       1000
Chain                      =      10000
Thinning                   =          1
Run time (seconds)         =       37.9
Deviance (dbar)            =    1081.00
Deviance (thetabar)        =    1068.89
Effective no. of pars (pd) =      12.11
Bayesian DIC               =    1093.11
------------------------------------------------------------------------------
             |      Mean    Std. Dev.     ESS     P       [95% Cred. Interval]
-------------+----------------------------------------------------------------
Contrast 1   |
      cons_1 |  -.2858514   .3604522       75   0.217    -.9505032    .4229137
       age_1 |   .0122968   .0068222       82   0.041    -.0010685    .0250168
      male_1 |  -.5763975   .2075412     1256   0.003    -.9714403   -.1700981
  nonwhite_1 |  -.9925631   .2332351     1392   0.000    -1.442944   -.5384932
     site2_1 |   -.120517   .2096018      467   0.275    -.5333347    .2939599
     site3_1 |   .5926808   .2300952      458   0.005     .1455985    1.061441
-------------+----------------------------------------------------------------
Contrast 2   |
      cons_3 |  -1.570087   .5960569       84   0.005    -2.722909   -.4105062
       age_3 |   .0034625   .0120921       87   0.377     -.020258    .0264747
      male_3 |  -.1283098    .372871     1250   0.370    -.8742442    .5816209
  nonwhite_3 |  -.7938632   .4275478     1282   0.027    -1.680918    .0183959
     site2_3 |   -1.37527   .4716171      996   0.000    -2.338544   -.4711227
     site3_3 |   .3826549   .3790009      695   0.152     -.373991    1.165254
------------------------------------------------------------------------------


. estimates store m2mcmc

. test [FP1]male_1 [FP2]male_3

 ( 1)  [FP1]male_1 = 0
 ( 2)  [FP2]male_3 = 0

           chi2(  2) =    7.86
         Prob > chi2 =    0.0197

. 
. * Base category is 3
. runmlwin insure cons age male nonwhite site2 site3, ///
>   level1(patid:) ///
>   discrete(distribution(multinomial) link(mlogit) denom(cons) basecategory(3)) ///
>   tolerance(5) maxiter(100) ///
>   mcmc(burnin(1000) chain(10000)) initsmodel(m3igls) ///
>   nopause
 
MLwiN 2.30 multilevel model                     Number of obs      =       615
Unordered multinomial logit response model
Estimation algorithm: MCMC

----------------------------------
    Contrast | Log-odds
-------------+--------------------
           1 | 1 vs. 3
           2 | 2 vs. 3
----------------------------------

Burnin                     =       1000
Chain                      =      10000
Thinning                   =          1
Run time (seconds)         =       39.1
Deviance (dbar)            =    1080.65
Deviance (thetabar)        =    1068.91
Effective no. of pars (pd) =      11.74
Bayesian DIC               =    1092.39
------------------------------------------------------------------------------
             |      Mean    Std. Dev.     ESS     P       [95% Cred. Interval]
-------------+----------------------------------------------------------------
Contrast 1   |
      cons_1 |   1.341938   .4810038       24   0.000     .3858214    2.298268
       age_1 |   .0063641   .0090425       28   0.235    -.0111701    .0253759
      male_1 |  -.4179802   .3716217      266   0.132    -1.117263    .3421187
  nonwhite_1 |  -.1475168   .4374826      234   0.369    -.9869192     .746083
     site2_1 |   1.312172   .4777177       71   0.003     .4427745    2.275522
     site3_1 |   .2385831    .375573       74   0.261    -.5183207    .9483098
-------------+----------------------------------------------------------------
Contrast 2   |
      cons_2 |   1.635101   .5199141       22   0.000      .660411    2.622971
       age_2 |  -.0058798    .009519       26   0.276    -.0251208    .0121648
      male_2 |   .1500722   .3760808      256   0.347    -.6029025    .8947665
  nonwhite_2 |    .833674   .4362051      230   0.029    -.0243578    1.738333
     site2_2 |   1.419851   .5008552       64   0.001     .4945272    2.461509
     site3_2 |   -.356654   .3892435       67   0.173    -1.094442    .4072808
------------------------------------------------------------------------------


. estimates store m3mcmc

. test [FP1]male_1 [FP2]male_2

 ( 1)  [FP1]male_1 = 0
 ( 2)  [FP2]male_2 = 0

           chi2(  2) =    7.81
         Prob > chi2 =    0.0202
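
A sketch of why the mlogit test is identical for every base category while the IGLS test is not. The notation is introduced here for illustration only: $\beta^{(b)}_k$ is the male coefficient for category $k$ versus base $b$, and $\hat{V}^{(b)}$ is the estimated covariance matrix of the two male coefficients under base $b$. Under maximum likelihood, moving the base from category 1 to category 2 only re-expresses the same fit through an invertible linear map, so the Wald statistic is unchanged:

```latex
\begin{aligned}
\beta^{(2)}_{1} &= -\beta^{(1)}_{2}, \qquad
\beta^{(2)}_{3} = \beta^{(1)}_{3} - \beta^{(1)}_{2}
\quad\Longrightarrow\quad
\beta^{(2)} = A\,\beta^{(1)}, \quad
A = \begin{pmatrix} -1 & 0 \\ -1 & 1 \end{pmatrix}, \\[4pt]
W^{(2)} &= \bigl(A\hat{\beta}^{(1)}\bigr)^{\top}
\bigl(A \hat{V}^{(1)} A^{\top}\bigr)^{-1}
\bigl(A\hat{\beta}^{(1)}\bigr)
= \hat{\beta}^{(1)\top} \bigl(\hat{V}^{(1)}\bigr)^{-1} \hat{\beta}^{(1)}
= W^{(1)} .
\end{aligned}
```

The mlogit output above satisfies the identity exactly for male: 0.4518496 - 0.5616934 = -0.1098438, which is the Uninsure coefficient under base(2). The IGLS (MQL1) estimates do not: 0.4109438 - 0.5582314 = -0.1472876, whereas the basecategory(2) fit reports male_3 = -0.0502273. Because the approximate estimates are not exact reparameterisations of one another, the Wald statistic varies with the base category (12.65, 9.04 and 2.45 above).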

mlwinnewbie · Post by **mlwinnewbie** » Fri Jul 04, 2014 3:32 pm

Hi George,

Thanks again for your helpful reply. I used the MCMC approach you suggested:

set more off
runmlwin invw topic2 topic3 cons, ///
level2(IDD: cons) ///
level1(id) ///
discrete(distribution(multinomial) ///
link(mlogit) ///
denominator(cons) ///
basecategory(2)) ///
nopause
estimates store m2igls

set more off
eststo m2mcmc: runmlwin invw topic2 topic3 cons, ///
level2(IDD: cons) ///
level1(id) ///
discrete(distribution(multinomial) ///
link(mlogit) ///
denominator(cons) ///
basecategory(2)) ///
tolerance(5) maxiter(100) ///
mcmc(burnin(1000) chain(10000)) initsmodel(m2igls) ///
nopause rrr

estimates store m2mcmc

test [FP1]topic2_1 [FP1]topic3_1 [FP2]topic2_3 [FP2]topic3_3

When I compared the Wald tests obtained with basecategory(2) and basecategory(3), I got different results:

with base 2:

. test [FP1]topic2_1 [FP1]topic3_1 [FP2]topic2_2 [FP2]topic3_2

( 1) [FP1]topic2_1 = 0
( 2) [FP1]topic3_1 = 0
( 3) [FP2]topic2_2 = 0
( 4) [FP2]topic3_2 = 0

chi2( 4) = 91.40
Prob > chi2 = 0.0000

with base 3:

. test [FP1]topic2_1 [FP1]topic3_1 [FP2]topic2_3 [FP2]topic3_3

( 1) [FP1]topic2_1 = 0
( 2) [FP1]topic3_1 = 0
( 3) [FP2]topic2_3 = 0
( 4) [FP2]topic3_3 = 0

chi2( 4) = 103.89
Prob > chi2 = 0.0000

I am assuming this is due to the approximation issue that you mentioned in your reply. If I were interested in looking at comparisons using different bases, would I need to report the different chi2 results? I am just wondering how reviewers would respond to this.

Once again I'd be grateful for your comments.

Many thanks,
Francesca

GeorgeLeckie · Post by **GeorgeLeckie** » Fri Jul 04, 2014 6:17 pm

Hi Francesca,

Note that this is an MLwiN-specific query (as opposed to your first query, which was a runmlwin query), so it may well receive more responses if you post it on the MLwiN forum.

The approximation problem I previously described is specific to quasilikelihood methods in MLwiN. It does not apply to MCMC estimation in MLwiN.

Having said that, when using MCMC methods you will typically see small differences in fit between reparameterisations of a model (here, different choices of base category). However, these differences should be trivially small (assuming that the different parameterisations are stable and that you have specified sensible starting values and long enough burnin and chain periods). This appears to be the case in the example I provided: the DIC statistics are effectively the same, differing by less than 1 point (see below). Differences of 5 or more are typically taken as meaningful differences in model fit.

I would just pick the most stable choice of base category (typically the largest category) and present those results. You can always manipulate the parameter estimates to get different log-odds contrasts if they are of interest.

Best wishes

George

Syntax:

Code: Select all

estimates table m1mcmc  m2mcmc  m3mcmc, stats(dic) b(%4.3f) style(oneline)

Output:

Code: Select all

. estimates table m1mcmc  m2mcmc  m3mcmc, stats(dic) b(%4.3f) style(oneline)

--------------------------------------------
    Variable | m1mcmc    m2mcmc    m3mcmc   
-------------+------------------------------
FP1          |
      cons_2 |   0.244                      
       age_2 |  -0.012                      
      male_2 |   0.577                      
  nonwhite_2 |   0.994                      
     site2_2 |   0.137                      
     site3_2 |  -0.575                      
      cons_1 |            -0.286     1.342  
       age_1 |             0.012     0.006  
      male_1 |            -0.576    -0.418  
  nonwhite_1 |            -0.993    -0.148  
     site2_1 |            -0.121     1.312  
     site3_1 |             0.593     0.239  
-------------+------------------------------
FP2          |
      cons_3 |  -1.266    -1.570            
       age_3 |  -0.009     0.003            
      male_3 |   0.435    -0.128            
  nonwhite_3 |   0.214    -0.794            
     site2_3 |  -1.257    -1.375            
     site3_3 |  -0.228     0.383            
      cons_2 |                       1.635  
       age_2 |                      -0.006  
      male_2 |                       0.150  
  nonwhite_2 |                       0.834  
     site2_2 |                       1.420  
     site3_2 |                      -0.357  
-------------+------------------------------
OD           |
     bcons_1 |   1.000     1.000     1.000  
-------------+------------------------------
Statistics   |                              
         dic | 1092.781   1093.106   1092.389  
--------------------------------------------
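
The DIC figures in this table can be checked by hand against the pieces runmlwin prints in the model header. As a worked example, using the values shown for m3mcmc (Deviance (dbar), Deviance (thetabar) and pd) and the standard definition of the DIC:

```latex
\begin{aligned}
p_D &= \bar{D} - D(\bar{\theta}) = 1080.65 - 1068.91 = 11.74 \\
\mathrm{DIC} &= \bar{D} + p_D = 1080.65 + 11.74 = 1092.39 \\
\max \mathrm{DIC} - \min \mathrm{DIC} &= 1093.106 - 1092.389 = 0.717
\end{aligned}
```

The spread across the three base-category choices is therefore about 0.7, well under the 5-point rule of thumb.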

mlwinnewbie · Post by **mlwinnewbie** » Mon Jul 07, 2014 9:17 am

Hi George,

Thanks a lot for your reply! Can I also double-check what you mean by "You can always manipulate the parameter estimates to get different log-odds contrasts if they are of interest" - how could I achieve that?

Thanks again,
Francesca

GeorgeLeckie · Post by **GeorgeLeckie** » Mon Jul 07, 2014 5:14 pm

Hi,

Any decent book which covers multinomial response models should show the algebraic manipulations of the current coefficients to get those which you would obtain directly if you were to change the base category.

The following is very accessible...

Long, J. S. (1997). Regression Models for Categorical and Limited Dependent Variables. Advanced Quantitative Techniques in the Social Sciences. Sage.
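
As a rough illustration of the manipulation using the example models in this thread: with base category 3, the two contrasts are log(p1/p3) and log(p2/p3), so the category 1 versus category 2 log-odds coefficient is simply their difference. The sketch below assumes the base-3 model is still stored as m3mcmc and that `lincom` is usable after runmlwin in the same way `test` is; treat it as a starting point rather than a definitive recipe.

```stata
* Sketch only: assumes the basecategory(3) model was stored as m3mcmc
estimates restore m3mcmc

* Category 1 vs category 2 contrast for male:
* log(p1/p2) = log(p1/p3) - log(p2/p3)
lincom [FP1]male_1 - [FP2]male_2
```

From the table above, this gives roughly -0.418 - 0.150 = -0.568, close to the -0.576 reported for male_1 in m2mcmc (the model fitted directly with base category 2). Likewise, the category 3 versus category 2 coefficient is just the negative of male_2 from the base-3 model (-0.150, against -0.128 in m2mcmc). The small discrepancies reflect MCMC sampling variation. Note that `lincom` works from the posterior means and covariance stored in e(b) and e(V), so its interval is an approximation to what you would get from the chains.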

Best wishes

George

www.cmm.bristol.ac.uk/forum
