3-level ordered logistic - problem with PQL2 + weights

zuzannap · Post by **zuzannap** » Fri Jan 30, 2015 10:33 am

Dear all,

I’ve got a problem with fitting a 3-level ordered logistic regression model using PQL2. I managed to get results for MQL1, MQL2 and PQL1 (each time using the estimates from the previous model as starting values, via the initsprevious or initsmodel(name) option). However, when I proceed to PQL2, I get the following error:

error while obeying batch file C:\Docume….\ST_00000007.tmp at line number 506:
design vector at level 2 is the wrong length

I use the following syntax:

runmlwin diseng_simpl cons (ethnic age sex edu income income_reg urban1 urban2 urban3 fed1 fed2 fed3, contrast(1/6)), level3 (regid: (cons, contrast(1/6))) level2(housid: (cons, contrast(1/6))) level1(indid) discrete(distribution(multinomial) link(ologit) denominator(cons) basecategory(6) pql2) initsmodel(m4) nopause

I tried running the same code without the nopause option, and MLwiN seems to produce some estimates, but when I click Resume macro twice I get the same message as above (the only difference is that the error now occurs at line number 521).

What might be the reason for that?

By the way, I also tried fitting the model by MCMC using the estimates from the MQL1 model, and I managed to get results.

..................

My second doubt concerns weighting. I should actually use poststratification weights in my analysis. I know that I cannot use them with MCMC. I thought of using them in the PQL2 model and checking whether PQL2 with weights gives me results similar to the unweighted MCMC (as recommended by George in this post: http://www.cmm.bristol.ac.uk/forum/view ... ?f=3&t=456). However, the runmlwin help says that

Sampling weights should therefore only be used for continuous response variables as the quasilikelihood procedures available for (R)IGLS estimation of discrete response variables are only approximate.

Does this mean that there is actually no way to apply weights in case of ordered logistic regression using runmlwin? What can I do with that? Should I try treating my response variable as interval (although it is ordinal)?

Best regards,
Zuza

GeorgeLeckie · Post by **GeorgeLeckie** » Sat Jan 31, 2015 4:32 am

Hi Zuza,

Sampling weights are implemented for IGLS estimation, but not MCMC estimation. This is potentially problematic when you are fitting models with discrete responses, as here IGLS provides quasilikelihood estimation (MQL1, MQL2, PQL1, or PQL2 depending on what you choose), which is an approximation to maximum likelihood. In many cases the degree of approximation is minimal and so you can stick with PQL2 and everything will be fine, but in other cases the degree of approximation can be severe (typically when you have small clusters and extreme clustering, e.g., longitudinal data) and you need to abandon PQL2 in favour of MCMC, which does not suffer from the same approximation problem. Unfortunately, and as you point out, you then run into the problem that sampling weights are not implemented for MCMC.

If you can get away with using PQL2...

In terms of model convergence difficulties, try using the PQL1 estimates for the identical model specification as starting values for refitting the model by PQL2. If this does not work then you could specify starting values manually. Or you could try altering the model specification to see which part of the model is causing the convergence problems. If one of your ordinal categories is very rare you might consider combining it with an adjacent one. I'm afraid you just have to play around a bit; sometimes getting complex models to converge is a bit of an art. Most importantly, always think carefully about how sensible the model specification is given the data.
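As a rough illustration of the category check mentioned above, the following Stata sketch tabulates the response and folds one category into its neighbour. The variable name diseng_simpl comes from the command quoted earlier in the thread; the coding 1 to 6, the choice of merging category 6 into 5, and the new variable name diseng_5cat are assumptions made purely for illustration.

```stata
* Sketch only: assumes diseng_simpl is coded 1, 2, ..., 6.

* How many cases fall in each response category (including missing)?
tabulate diseng_simpl, missing

* Hypothetical merge: fold category 6 into the adjacent category 5,
* keeping the original variable untouched.
recode diseng_simpl (6 = 5), generate(diseng_5cat)

* Cross-check that only the intended category has moved.
tabulate diseng_simpl diseng_5cat
```

If a merged response like this were used in the model, the basecategory() and contrast() settings would need adjusting to the new number of categories.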

The reason you see what appear to be model estimates when you omit the nopause option, before you click the resume button, is that these are the starting values you provide from model m4 rather than the desired estimates.

If you can't use PQL2...

Then yes, you might consider treating the ordinal response as continuous (especially if you have many categories) so that you can use sampling weights.
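A minimal sketch of what that continuous-response fit might look like, reusing the variable names from the command quoted earlier in the thread. The weight variable pswt is a hypothetical name, and the sketch assumes the weight is defined at the individual level and supplied through runmlwin's weightvar() option inside the level 1 specification; it is worth checking the runmlwin help file for how weights at the higher levels are treated and for the available standardisation options.

```stata
* Sketch only: treats the ordinal response as continuous (default IGLS),
* with random intercepts for regions and households.
* Assumption: pswt is a hypothetical individual-level poststratification weight.
runmlwin diseng_simpl cons ethnic age sex edu income income_reg ///
    urban1 urban2 urban3 fed1 fed2 fed3, ///
    level3(regid: cons) ///
    level2(housid: cons) ///
    level1(indid: cons, weightvar(pswt)) ///
    nopause
```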

Best wishes

George

zuzannap · Post by **zuzannap** » Mon Feb 02, 2015 8:38 am

Dear George,

Thank you for the prompt reply! You are right, playing around is probably the best option in this case. However, I wonder what may cause the convergence problem. I do not suspect the degree of approximation to be severe: I have about 10 000 adults in about 5 000 households within 38 regions. What is more, none of the categories of my ordinal response seems particularly rare: the two extreme categories (which are also the smallest) have 480 and 557 cases respectively, compared with about 1500-3200 cases in each of the remaining four middle categories.
As I wrote before, I have already tried fitting the model by PQL2 using the PQL1 estimates as starting values, with no success: I still get the same error about the design vector at level 2 being the wrong length.
It also seems strange to me that I get MCMC estimates pretty easily, while the meologit command in Stata does not work for my data at all (I get 'not concave' and 'flat regions' messages, and the model does not converge even after running for a veeery long time). Any other ideas about what may cause that?
Note that the model is not very complicated; I have not added any random slopes yet...

Bests,
Zuza

GeorgeLeckie · Post by **GeorgeLeckie** » Mon Feb 02, 2015 11:34 am

Dear Zuza,

Very small clusters (households in your case) are certainly going to make computation harder whatever the estimation method. You may even have a lot of households which only have one adult, which isn't so problematic conceptually (as long as you have a healthy number with 2 or more adults), but will again make things more difficult computationally. One strategy might be to use the MCMC estimates as starting values for PQL2 (as you could then allow sampling weights). You might even explore using the MCMC estimates as starting values for meologit to speed up estimation (as, again, I think meologit allows sampling weights). You might also want to play around with the many different estimation options for meologit, but using a small subsample of your data to speed this exploration up, to see whether alternative estimation settings are more fruitful for your data.
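To make the two data checks above concrete, here is a small Stata sketch that counts adults per household and then draws a random subsample of whole households for quicker experimentation. It assumes housid uniquely identifies households across regions; the new variable names, the seed, and the 20% sampling fraction are arbitrary choices for illustration.

```stata
* Sketch only: assumes housid uniquely identifies households.

* Distribution of household sizes (number of sampled adults per household).
bysort housid: generate n_adults = _N
egen byte hh_tag = tag(housid)      // flags one record per household
tabulate n_adults if hh_tag

* Draw roughly 20% of households, keeping every adult in a selected
* household so that clusters stay intact.
preserve
set seed 12345
generate double u = runiform() if hh_tag
bysort housid (u): replace u = u[1]  // copy the household's draw to all members
keep if u < 0.20

* ... experiment with estimation settings on the subsample here ...

restore
```

Sampling households rather than individuals avoids artificially turning multi-adult households into single-adult ones, which would make the clusters even smaller than they already are.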

Good luck!

Best wishes

George

zuzannap · Post by **zuzannap** » Mon Feb 02, 2015 7:46 pm

Thanks George for all the hints!

Zuza
