Welcome to the forum for runmlwin users. Feel free to post your question about runmlwin here. The Centre for Multilevel Modelling takes no responsibility for the accuracy of these posts, as we are unable to monitor them closely. Do go ahead and post your question, and thank you in advance if you find the time to post any answers!
I am fitting a three-level multinomial logit model with the following structure: individuals nested in regions nested in countries.
I am trying to compare the estimates of a null model with a three-level structure against those of a single-level model. I first fitted the three-level model using IGLS, and I intend to use those estimates as initial values for a PQL2 run.
The IGLS model fits fine, but when I tell it to use the previous estimates as initial values I get this message:
"error while obeying batch file C:\... at line number 183: NEXT model specified does not match last model run ."
I have also tried storing the results and using initsb(). I cannot work out what I am doing wrong.
I think it somehow sorted itself out; it was probably just giving the wrong error message. I ran your command in another instance of Stata and it works perfectly. In the same instance I loaded my dataset and ran it again, and now it says:
"WARNING: IGLS algorithm failed to converge. Increase the number of iterations. See the maxiterations() option."
. runmlwin parti cons, level3(tnscntry: cons)level2(p7: cons) level1(id) discrete(distribution(multinomial) link(mlogit) denominator(cons) basecategor
> y(0)) nopause
MLwiN 2.32 multilevel model                     Number of obs      =     25146
Unordered multinomial logit response model
Estimation algorithm: IGLS, MQL1
-----------------------------------------------------------
                |   No. of       Observations per Group
 Level Variable |   Groups    Minimum    Average    Maximum
----------------+------------------------------------------
       tnscntry |       30        290      838.2        991
             p7 |      312          1       80.6        469
-----------------------------------------------------------
----------------------------------
    Contrast | Log-odds
-------------+--------------------
           1 | 1 vs. 0
           2 | 2 vs. 0
----------------------------------
Run time (seconds)   =      31.41
Number of iterations =         11
------------------------------------------------------------------------------
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
Contrast 1   |
      cons_1 |  -4.909793   .2211221   -22.20   0.000    -5.343185   -4.476402
-------------+----------------------------------------------------------------
Contrast 2   |
      cons_2 |   -3.07217   .1077873   -28.50   0.000    -3.283429   -2.860911
------------------------------------------------------------------------------
------------------------------------------------------------------------------
   Random-effects Parameters |   Estimate   Std. Err.     [95% Conf. Interval]
-----------------------------+------------------------------------------------
Level 3: tnscntry            |
                 var(cons_1) |   1.087117   .3782067       .345845    1.828388
          cov(cons_1,cons_2) |   .0755327   .1307052     -.1806448    .3317102
                 var(cons_2) |   .2738921   .0898127      .0978625    .4499218
-----------------------------+------------------------------------------------
Level 2: p7                  |
                 var(cons_1) |   1.259003   .2755601      .7189149     1.79909
          cov(cons_1,cons_2) |   -.137653   .0867556      -.307691    .0323849
                 var(cons_2) |   .2841612   .0537555      .1788023      .38952
------------------------------------------------------------------------------
. estimates store m5, title(3-IGLS)
. runmlwin parti cons, level3(tnscntry: cons)level2(p7: cons) level1(id) discrete(distribution(multinomial) link(mlogit) denominator(cons) basecategor
> y(0)pql2) initsmodel(m5) nopause
Model fitted using initial values specified as parameter estimates from saved estimates in m5
MLwiN 2.32 multilevel model                     Number of obs      =         0
Unordered multinomial logit response model
Estimation algorithm: IGLS, PQL2
-----------------------------------------------------------
                |   No. of       Observations per Group
 Level Variable |   Groups    Minimum    Average    Maximum
----------------+------------------------------------------
       tnscntry |        0          .          .          .
             p7 |        0          .          .          .
-----------------------------------------------------------
----------------------------------
    Contrast | Log-odds
-------------+--------------------
           1 | 1 vs. 0
           2 | 2 vs. 0
----------------------------------
Run time (seconds)   =       9.04
Number of iterations =          3
------------------------------------------------------------------------------
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
Contrast 1   |
      cons_1 |  -4.909793   .2211221   -22.20   0.000    -5.343185   -4.476402
-------------+----------------------------------------------------------------
Contrast 2   |
      cons_2 |   -3.07217   .1077873   -28.50   0.000    -3.283429   -2.860911
------------------------------------------------------------------------------
------------------------------------------------------------------------------
   Random-effects Parameters |   Estimate   Std. Err.     [95% Conf. Interval]
-----------------------------+------------------------------------------------
Level 3: tnscntry            |
                 var(cons_1) |   1.087117   .3782067       .345845    1.828388
          cov(cons_1,cons_2) |   .0755327   .1307052     -.1806448    .3317102
                 var(cons_2) |   .2738921   .0898127      .0978625    .4499218
-----------------------------+------------------------------------------------
Level 2: p7                  |
                 var(cons_1) |   1.259003   .2755601      .7189149     1.79909
          cov(cons_1,cons_2) |   -.137653   .0867556      -.307691    .0323849
                 var(cons_2) |   .2841612   .0537555      .1788023      .38952
------------------------------------------------------------------------------
WARNING: IGLS algorithm failed to converge. Increase the number of iterations. See the maxiterations() option.
So I set maxiterations() to 30, but it still performs only 3 iterations according to the number of iterations reported in the output. Is this because the model is not going to converge even if I allow that many iterations, or because it is simply ignoring that it can iterate further? I think I read in the help file that 20 iterations is the default. Thank you in advance, and sorry for the weird error message.
command in Stata after making the change so that it would reload the .ado file.
Looking at the output that you posted, your second model appears to state that there are no observations in the data used, and it also gives empty grouping information. If this were the case it would explain the behaviour that you are seeing, as well as why your results are identical to the starting values. Could you try running the command without the nopause option and checking what the data look like within MLwiN?
I have done so, and the data are complete and sorted correctly. The only thing that catches my attention, and I do not know whether it is significant, is that a few variables are created for the estimation: ~P, ~H... and also some cxxx variables.
Of those variables (whose meaning I am not very sure of), c1203 (50292), c1488 (312) and c1489 (312) appear as missing. My dataset has 25146 observations, and there are three categories in the categorical variables (Yes, No, DK).
It seems to input the initial values correctly: when it writes the model, it shows all the initial values from the IGLS estimation and the cases in use (50292 out of 50292). But, as you have seen, once I estimate the model it says 0 cases were used...
It sounds as if there may be numeric issues related to the PQL2 estimation. Could you see whether either MQL2 or PQL1 work, and if so whether using the starting values from these cause your model to fit any better?
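A rough sketch of that check, reusing the variable names from the commands posted earlier in this thread. It assumes the pql1 and pql2 keywords inside discrete() and the initsprevious option behave as described in the runmlwin help file, and that the commands are run from a do-file (the /// line continuations do not work interactively). Swap pql1 for mql2 in the second call to try MQL2 instead.

```stata
* Sketch only: variable names are taken from the commands posted above.
* Data must be sorted by the model hierarchy before calling runmlwin.
sort tnscntry p7 id

* Step 1: MQL1 fit (the algorithm reported in the first output) for starting values
runmlwin parti cons, level3(tnscntry: cons) level2(p7: cons) level1(id) ///
    discrete(distribution(multinomial) link(mlogit) denominator(cons) basecategory(0)) ///
    nopause

* Step 2: PQL1, starting from the estimates of the model just fitted
runmlwin parti cons, level3(tnscntry: cons) level2(p7: cons) level1(id) ///
    discrete(distribution(multinomial) link(mlogit) denominator(cons) basecategory(0) pql1) ///
    initsprevious nopause

* Step 3: PQL2, starting from the PQL1 estimates if step 2 converged
runmlwin parti cons, level3(tnscntry: cons) level2(p7: cons) level1(id) ///
    discrete(distribution(multinomial) link(mlogit) denominator(cons) basecategory(0) pql2) ///
    initsprevious nopause
```

Checking "Number of obs" and the group counts in each output shows at which step, if any, the data stop being picked up.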
When I use MQL2 I get a numerical error: 1.#QO(0.000)
It will let me estimate the model with PQL1, but using those estimates as initial values for PQL2 does not improve the situation. It still appears that 0 cases are used.
[1] Do stare at your data. Estimation will be easiest when: (1) there are lots of units at each level (and within each cluster); (2) the data are balanced; (3) there is a healthy proportion of individuals in each response category within each cluster; (4) the degree of clustering is not too severe. The more your data depart from this ideal, the harder estimation will be. In particular, small clusters where all individuals are in the same response category can make estimation tough. You could try temporarily reducing your sample to a better-behaved subset to get things working. For example, you might consider temporarily dropping countries with fewer than, say, five regions, and dropping regions with fewer than, say, 20 individuals. The more you play around with your data, the more you will understand which features of the data are likely to be making estimation difficult.
[2] Perhaps simplify the model. First get the two simpler two-level multinomial logistic regression models working: (1) individuals within regions (i.e., temporarily ignore country-level clustering); (2) individuals within countries (i.e., temporarily ignore region-level clustering).
[3] Alternatively, try manually specifying starting values for the three-level model and going straight to MCMC estimation.
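The three suggestions above could be tried along the following lines. This is only a sketch under several assumptions: the thresholds (20 individuals, 5 regions) are the illustrative values from [1]; n_ind, sd_parti, region_tag and n_reg are hypothetical variable names; the p7 region codes are assumed to be unique across countries; and in step [3] the stored estimates m5 from earlier in the thread stand in for manually chosen starting values (runmlwin also documents initsb() and initsv() for supplying them as matrices). Run it from a do-file.

```stata
* Sketch only: hypothetical helper variables, illustrative thresholds.
preserve

* [1] Inspect the data and temporarily trim small clusters
bysort tnscntry p7: generate n_ind = _N           // individuals per region
bysort tnscntry p7: egen sd_parti = sd(parti)     // 0 when a region has a single response category
count if sd_parti == 0                            // individuals living in such regions
drop if n_ind < 20                                // drop small regions
egen region_tag = tag(tnscntry p7)                // flags one row per remaining region
bysort tnscntry: egen n_reg = total(region_tag)   // regions per country
drop if n_reg < 5                                 // drop countries with few regions
tabulate parti                                    // response distribution in the subset

* [2] Two simpler two-level models
sort p7 id
runmlwin parti cons, level2(p7: cons) level1(id) ///
    discrete(distribution(multinomial) link(mlogit) denominator(cons) basecategory(0)) ///
    nopause

sort tnscntry id
runmlwin parti cons, level2(tnscntry: cons) level1(id) ///
    discrete(distribution(multinomial) link(mlogit) denominator(cons) basecategory(0)) ///
    nopause

* [3] Three-level model by MCMC, starting from the stored estimates m5
sort tnscntry p7 id
runmlwin parti cons, level3(tnscntry: cons) level2(p7: cons) level1(id) ///
    discrete(distribution(multinomial) link(mlogit) denominator(cons) basecategory(0)) ///
    mcmc(on) initsmodel(m5) nopause

restore
```

Because of preserve and restore, the trimmed subset is discarded at the end and the full dataset comes back. To run steps [2] and [3] on the full sample instead, leave out the drop lines.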