Informative priors

Welcome to the forum for MLwiN users. Feel free to post your questions about the MLwiN software here. The Centre for Multilevel Modelling takes no responsibility for the accuracy of these posts, as we are unable to monitor them closely. Do go ahead and post your question, and thank you in advance if you find the time to post any answers!

Remember to check out our extensive software FAQs which may answer your question: http://www.bristol.ac.uk/cmm/software/s ... port-faqs/
rdmcdowell
Posts: 31
Joined: Mon Apr 02, 2012 3:26 pm

Informative priors

Post by rdmcdowell »

Hello. I would be interested to read others' thoughts on this matter. I am fitting a multilevel cross-classified model using MCMC estimation in MLwiN. I've found that the various non-informative priors I have tried do not work well, so I am leaning towards informative priors. For computational reasons I will probably use only a random sample of my original dataset, and I wondered about the appropriateness of using the unused data to construct informative priors (ignoring the cross-classification, or just using the non-cross-classified observations). Is that a bit pointless, or could I instead construct informative priors from the results of IGLS estimation on the original sample? Ultimately I am not so interested in increasing the precision of my estimates; I simply wish to obtain appropriate priors so that when I rerun the model to take account of the cross-classification I am working in the correct region of the parameter space. As a non-Bayesian I'd be interested in anyone's experiences of this, thanks.
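(For concreteness, the held-out-data idea might be sketched as follows. This is only a sketch assuming a normal approximation; the notation — estimates \(\tilde\theta\), covariance \(\tilde V\), inflation factor \(c\) — is illustrative rather than MLwiN's.)

```latex
% Fit the (non-cross-classified) model to the unused observations,
% obtaining estimates \tilde\theta with covariance \tilde V; then,
% when fitting the cross-classified model to the analysis sample, use
\theta \sim N\!\bigl(\tilde\theta,\ c\,\tilde V\bigr), \qquad c \ge 1,
% where c > 1 deliberately inflates the variance so the prior
% locates the right region without being overly precise.
```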
billb
Posts: 157
Joined: Fri May 21, 2010 1:21 pm

Re: Informative priors

Post by billb »

Hi Rdmcdowell,
When you say they do not work well, do you mean simply that the chains mix poorly? My only real experience here is in my paper Browne et al. (2007) in Statistical Modelling - https://scholar.google.com/citations?vi ... h67rFs4hoC

That paper fits a slightly more complex multivariate (MV) response model, and MLwiN uses inverse Wishart priors for the variance matrices, where I force the prior estimates to reflect the variability in each response. I then did a prior sensitivity analysis by splitting the data into two parts temporally, i.e. one part was the earlier data and the other the later data. I used the estimates from the earlier data in the prior for the later data, to show that the estimates were not too sensitive to the prior.
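(For readers less familiar with this construction, one standard way to centre an inverse Wishart prior on an estimate is sketched below; MLwiN's exact parameterisation is documented in its MCMC manual.)

```latex
% Let \hat\Omega be the p-by-p variance matrix estimated from the
% earlier data. An inverse Wishart prior centred on it is
\Omega \sim \mathrm{IW}(\nu,\ S), \qquad
\mathbb{E}[\Omega] = \frac{S}{\nu - p - 1} \quad (\nu > p + 1),
% so choosing
S = (\nu - p - 1)\,\hat\Omega
% gives a prior whose mean equals \hat\Omega, with larger \nu
% making the prior more informative.
```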
Does that help?
Bill.
rdmcdowell
Posts: 31
Joined: Mon Apr 02, 2012 3:26 pm

Re: Informative priors

Post by rdmcdowell »

Hello Bill
Thanks for your response and for that helpful citation. What I mean is that the estimates I am obtaining using MCMC estimation in no way correspond to those obtained using IGLS. Consider a simple example of a logistic model with a random intercept only. Using IGLS estimation I may get an estimate of -1.2 for the intercept, with a random-intercept variance of 4.5. If I use these as starting values for MCMC estimation, then regardless of the default diffuse priors used or how long I allow the chains to run (e.g. 500,000 iterations, to satisfy the MCMC diagnostic criteria), the estimates I obtain never come remotely close, e.g. an intercept of -10.00 with a variance of 200. I've tried the usual strategies to help with the mixing, but these do not resolve the estimation problem. Hence my query as to whether I should really be looking at informative priors. I wondered about creating two random samples from the dataset, one of which would be used to create informative priors which could then be used for MCMC estimation on the other half of the data. The data are longitudinal, so an earlier/later split wouldn't really be useful. I was also wondering about the appropriateness of using the IGLS estimates to create informative priors (such as the WinBUGS priors in Section 6.3 of the MLwiN MCMC manual), or of using the WinBUGS code generated from MLwiN to try alternative non-informative priors.
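(To fix notation for the example being discussed, the model is the two-level random-intercept logistic regression sketched below. The diffuse priors shown are common generic defaults and may differ from MLwiN's exact choices.)

```latex
% Random-intercept logistic model for response y_{ij} in cluster j:
y_{ij} \sim \mathrm{Bernoulli}(\pi_{ij}), \qquad
\operatorname{logit}(\pi_{ij}) = \beta_0 + u_j, \qquad
u_j \sim N(0, \sigma_u^2),
% with generic diffuse priors such as
p(\beta_0) \propto 1, \qquad
\sigma_u^{-2} \sim \mathrm{Gamma}(\varepsilon, \varepsilon), \quad \varepsilon \text{ small};
% an informative alternative would centre these on the IGLS
% estimates, e.g. \beta_0 \sim N(-1.2, v) for some chosen v.
```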
I would add that I am also asking colleagues about this; it's just good to hear the experiences of others who have encountered similar problems!
billb
Posts: 157
Joined: Fri May 21, 2010 1:21 pm

Re: Informative priors

Post by billb »

Hi Rdmcdowell,
Thanks for the clarification. Good to see people using my MCMC book :) The first thing to note is that a variance of 4.5 is extremely large for a logistic regression model (particularly if you are getting it from the default 1st-order MQL estimates) and would usually indicate that most clusters are all 1s or all 0s. It is well known that most estimation methods struggle in such scenarios; I looked a long time ago at differences between estimation methods (see Browne and Draper 2000 - https://scholar.google.co.uk/citations? ... CSPb-OGe4C - and in particular the analysis there of the Rodriguez-Goldman datasets).
There we showed how the IGLS methods give biased (low) estimates, though even those were not as extreme as your example. I'd personally step back, dig into your data some more, and look at whether your clusters really are mostly all 1s or all 0s, in which case you might be better off collapsing to the cluster level.
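(If it helps anyone reading later, a quick way to run this check is sketched below, using pandas; the file name and the columns y and cluster are placeholders for your own response and level-2 identifier.)

```python
import pandas as pd

df = pd.read_csv("mydata.csv")  # placeholder for your dataset

# Proportion of 1s within each level-2 cluster
cluster_means = df.groupby("cluster")["y"].mean()

# Count clusters in which the binary response never varies
n_all_zero = (cluster_means == 0).sum()
n_all_one = (cluster_means == 1).sum()
print(f"{n_all_zero} clusters are all 0s and {n_all_one} are all 1s, "
      f"out of {cluster_means.size} clusters")
```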
Best wishes,
Bill.
rdmcdowell
Posts: 31
Joined: Mon Apr 02, 2012 3:26 pm

Re: Informative priors

Post by rdmcdowell »

Hi Bill

Yes, all the materials available for MLwiN/runmlwin are excellent and extremely informative. You are absolutely correct that most clusters are either all 0s or all 1s. I was able to run the analyses on the collapsed (2-level) data without any estimation problems, thanks to the large size of the dataset, and although there is no problem getting appropriate estimates from the 3-level data using IGLS, it is running the 3-level models using MCMC estimation where the difficulties arise. It appears, then, that this may not be surprising, and that the use of diffuse priors may not be productive given the instability associated with fitting these types of models to this type of data. Many thanks for your guidance and helpful references.

Ron
billb
Posts: 157
Joined: Fri May 21, 2010 1:21 pm

Re: Informative priors

Post by billb »

No worries Ron,
Glad to be of help - basically, the larger the clustering variance, the harder it is to justify fitting a multilevel model rather than simply collapsing the data by aggregating up a level.
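(For completeness, collapsing by aggregating up a level might look like the sketch below, with the same placeholder names as in the earlier snippet; the cluster-level data can then be modelled with a single-level binomial regression.)

```python
import pandas as pd

df = pd.read_csv("mydata.csv")  # placeholder for your dataset

# Aggregate the binary response to one row per cluster:
# number of successes and number of trials
collapsed = (df.groupby("cluster")["y"]
               .agg(successes="sum", trials="count")
               .reset_index())

# 'collapsed' can now be fitted with an ordinary (single-level)
# binomial/logistic model instead of a multilevel model
print(collapsed.head())
```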
Best wishes,
Bill.