
Informative priors

Posted: Fri Sep 01, 2017 2:44 pm
by rdmcdowell
Hello. I would be interested to read others' thoughts on this matter. I am fitting a multilevel cross-classified model using MCMC estimation in MLwiN. I have found that the various non-informative priors I have tried do not work well, so I am leaning towards informative priors.

For computational reasons I will probably only use a random sample of my original dataset, and I wondered about the appropriateness of using the unused data to construct informative priors (ignoring the cross-classification, or just using the non-cross-classified observations). Is that a bit pointless, or could I instead construct informative priors from the results of IGLS estimation on the original sample? Ultimately I am not interested in increasing the precision of my estimates; I simply wish to obtain appropriate priors so that, when I rerun the model to take account of the cross-classification, I am working in the correct region of the parameter space. As a non-Bayesian I would be interested in anyone's experiences with this, thanks.
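
For concreteness, the kind of split I have in mind is sketched below in Python (a minimal sketch only; the ordinary logistic fit from statsmodels stands in for IGLS, and the variable names and toy data are purely illustrative):

import numpy as np
import pandas as pd
import statsmodels.api as sm

# Illustrative data: 'df' stands in for the real dataset (y is 0/1).
rng = np.random.default_rng(1)
n = 2000
df = pd.DataFrame({"x1": rng.normal(size=n)})
df["y"] = rng.binomial(1, 1 / (1 + np.exp(-(-1.2 + 0.5 * df["x1"]))))

fit_half = df.sample(frac=0.5, random_state=1)   # kept for the MCMC run
prior_half = df.drop(fit_half.index)             # "unused" half -> priors

# Ordinary logistic fit on the unused half (a stand-in for IGLS here),
# ignoring the cross-classification for the purpose of the prior.
X = sm.add_constant(prior_half[["x1"]])
res = sm.Logit(prior_half["y"], X).fit(disp=0)

# Normal priors for the fixed effects: centred on these estimates, with
# (deliberately inflated) squared standard errors as prior variances.
prior_means = res.params
prior_vars = (2 * res.bse) ** 2
print(prior_means, prior_vars)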

Re: Informative priors

Posted: Mon Sep 04, 2017 9:31 am
by billb
Hi Rdmcdowell,
When you say they do not work well, do you mean simply that the chains mix poorly? My only real experience here is in my paper Browne et al. (2007) in Statistical Modelling - https://scholar.google.com/citations?vi ... h67rFs4hoC

That paper fits a slightly more complex multivariate (MV) response model, and MLwiN uses inverse Wishart priors for the variance matrices, where I forced the prior estimates to reflect the variability in each response. I then did a prior sensitivity analysis by splitting the data into two parts temporally, i.e. one part was the earlier data and the other the later data. I used the estimates from the earlier data in the prior for the later data, to show that the estimates were not too sensitive to the prior.
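
If it helps, here is the idea in code form (a minimal Python/scipy sketch, not MLwiN itself; the residual covariance and degrees of freedom are made up for illustration):

import numpy as np
from scipy.stats import invwishart

# Sigma_hat stands for the covariance matrix estimated from the earlier
# (or held-out) data; here a made-up 2x2 example for two responses.
Sigma_hat = np.array([[1.0, 0.3],
                      [0.3, 2.0]])
p = Sigma_hat.shape[0]

# The mean of an inverse Wishart IW(nu, S) is S / (nu - p - 1), so choosing
# S = (nu - p - 1) * Sigma_hat centres the prior on Sigma_hat. A small nu
# keeps the prior weak; a larger nu makes it more informative.
nu = p + 3
S = (nu - p - 1) * Sigma_hat

draws = invwishart(df=nu, scale=S).rvs(size=10000)
print(draws.mean(axis=0))  # close to Sigma_hat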
Does that help?
Bill.

Re: Informative priors

Posted: Mon Sep 04, 2017 11:14 am
by rdmcdowell
Hello Bill
Thanks for your response and for that helpful citation. What I mean is that the estimates I obtain using MCMC estimation in no way correspond to those obtained using IGLS. Consider a simple example of a logistic model with a random intercept only. Using IGLS estimation I may get an estimate of -1.2 for the intercept, with intercept variance 4.5. If I use these as starting values for MCMC estimation, then regardless of the default diffuse priors used or how long I let the chains run (e.g. 500,000 iterations, enough to satisfy the MCMC diagnostic criteria), the estimates never come remotely close, e.g. an intercept of -10.00 with variance 200. I have tried the usual strategies to help with mixing, but these do not resolve the estimation problem. Hence my query as to whether I should really be looking at informative priors.

I wondered about creating two random samples from the dataset, one used to create informative priors which could then be used for the MCMC estimation on the other half of the data. The data are longitudinal, so an earlier/later split would not really be useful. I was also wondering about the appropriateness of using the IGLS estimates to create informative priors (such as the WinBUGS priors of Section 6.3 in the MLwiN MCMC manual), or using the WinBUGS code generated from MLwiN to try alternative non-informative priors.
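
To make that concrete, the sketch below shows roughly what I have in mind, written in Python with PyMC rather than MLwiN/WinBUGS (a minimal sketch only; the toy data and the prior values, taken from my IGLS-style example above, are purely illustrative):

import numpy as np
import pymc as pm

# Toy data: 100 clusters of size 10 with a large random-intercept variance.
rng = np.random.default_rng(2)
n_clusters, m = 100, 10
cluster = np.repeat(np.arange(n_clusters), m)
u_true = rng.normal(0, np.sqrt(4.5), n_clusters)
y = rng.binomial(1, 1 / (1 + np.exp(-(-1.2 + u_true[cluster]))))

with pm.Model():
    # Informative priors built from the IGLS-style estimates:
    # intercept centred at -1.2, random-intercept SD centred near sqrt(4.5).
    b0 = pm.Normal("b0", mu=-1.2, sigma=0.5)
    sigma_u = pm.TruncatedNormal("sigma_u", mu=np.sqrt(4.5), sigma=1.0, lower=0)
    u = pm.Normal("u", mu=0, sigma=sigma_u, shape=n_clusters)
    pm.Bernoulli("y", p=pm.math.invlogit(b0 + u[cluster]), observed=y)
    idata = pm.sample(2000, tune=2000, target_accept=0.9)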
I would add that I am also asking colleagues about this; it is just good to hear the experiences of others who have encountered similar problems!

Re: Informative priors

Posted: Mon Sep 04, 2017 12:19 pm
by billb
Hi Rdmcdowell,
Thanks for the clarification. Good to see people using my MCMC book :) The first thing to note is that a variance of 4.5 is crazily big in a logistic regression model (particularly if you are getting this from the default 1st-order MQL estimates) and would usually indicate that most clusters are all 1s or all 0s. It is well known that most estimation methods struggle in such scenarios; I looked at the differences between estimation methods a long time ago (see Browne and Draper 2000 - https://scholar.google.co.uk/citations? ... CSPb-OGe4C - and in particular the analysis there of the Rodriguez-Goldman datasets).
There we showed how the IGLS methods give biased (low) estimates, although the biases were not as extreme as in your example. I'd personally step back, dig into your data some more, and look at your clusters to see whether you really are getting mostly clusters of all 1s or all 0s, in which case you might be better off collapsing to the cluster level.
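
As a quick first check, something along these lines (a pandas sketch; the column names and toy data are illustrative) will show how widespread the problem is:

import numpy as np
import pandas as pd

# 'df' stands in for the real data: a cluster identifier and a 0/1 response.
rng = np.random.default_rng(3)
df = pd.DataFrame({"cluster": np.repeat(np.arange(200), 8)})
probs = np.clip(rng.normal(0.5, 0.45, 200), 0.02, 0.98)  # per-cluster p
df["y"] = rng.binomial(1, probs[df["cluster"]])

means = df.groupby("cluster")["y"].mean()
degenerate = (means == 0) | (means == 1)
print(f"{degenerate.mean():.1%} of clusters are all 0s or all 1s")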
Best wishes,
Bill.

Re: Informative priors

Posted: Tue Sep 05, 2017 1:59 pm
by rdmcdowell
Hi Bill

Yes, all the materials available for MLwiN/runmlwin are excellent and extremely informative. You are absolutely correct that most clusters are all 1s or all 0s. Thanks to the large size of the dataset I was able to run the analyses on the collapsed (two-level) data without any estimation problems, and although there is no difficulty getting appropriate estimates from the three-level data using IGLS, it is running the three-level models with MCMC estimation where the difficulties arise. It appears, then, that this may not be surprising, and that diffuse priors may not be productive given the instability associated with fitting these types of models to this type of data. Many thanks for your guidance and helpful references.

Ron

Re: Informative priors

Posted: Tue Sep 05, 2017 3:16 pm
by billb
No worries Ron,
Glad to be of help. Basically, the larger the clustering variance, the harder it is to justify fitting a multilevel model rather than simply collapsing the data by aggregating up a level.
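
For example, collapsing to the cluster level and fitting a single-level binomial model might look like this (a statsmodels sketch; the toy data and names are illustrative):

import numpy as np
import pandas as pd
import statsmodels.api as sm

# Toy two-level data: 'df' stands in for the real dataset.
rng = np.random.default_rng(4)
df = pd.DataFrame({"cluster": np.repeat(np.arange(150), 12)})
df["x"] = rng.normal(size=len(df))
df["y"] = rng.binomial(1, 1 / (1 + np.exp(-(-1.0 + 0.4 * df["x"]))))

# Aggregate up a level: successes and trials per cluster, plus the
# cluster means of the covariates.
agg = df.groupby("cluster").agg(successes=("y", "sum"),
                                trials=("y", "size"),
                                x=("x", "mean"))

# Single-level binomial GLM on the collapsed data: the response is the
# (successes, failures) pair for each cluster.
endog = agg[["successes"]].assign(failures=agg["trials"] - agg["successes"])
res = sm.GLM(endog, sm.add_constant(agg[["x"]]),
             family=sm.families.Binomial()).fit()
print(res.summary())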
Best wishes,
Bill.