Predicted probabilities after multilevel logit model

Raphael
Posts: 19
Joined: Wed Oct 12, 2011 2:52 am

Predicted probabilities after multilevel logit model

Post by Raphael »

Hi everyone,
I am currently working on a research project in which I fit 3-level logit models (households nested within municipalities, which are nested within states) to predict out-migration as a function of an environmental variable measured at the state level. I found a pretty interesting quadratic association. I used the following model (this is an abbreviated version; the real model contains many more controls).

Code:

runmlwin mig cons ageh eduh env env2 , ///
  level3(state: cons) ///
  level2(muni: cons) ///
  level1(hhID: ) ///
  discrete(distribution(binomial) link(logit) denominator(cons)) batch
In this equation env is the continuous environmental variable of interest and env2 is its square (env x env). Both regression coefficients are highly significant (b env = -.481, p=.002; b env2 = -.317, p<.001), suggesting a concave association. I would now like to display this association in a graph. I used the following equation to obtain predicted values (here the environmental measures are renamed from env to lraind and from env2 to lraind2).

Code:

gen yhat = (_b[cons]*cons) + (_b[lraind]*lraind) + (_b[lraind2]*lraind2)
I then transformed the yhat values so that the y-axis reflects predicted probabilities instead of the meaningless log odds scale…

Code:

replace yhat=((exp(yhat))/(1+(exp(yhat))))
And finally, I plotted this association using a simple scatter plot.

Code:

twoway (scatter yhat lraind)
However, this is a rather crude way of displaying the association, and I am not sure whether including the constant term (_b[cons]*cons) in my equation for yhat makes sense. I would prefer to use Stata's margins command to obtain predicted probabilities, but it appears that this post-estimation command cannot be used after estimating a logit model with runmlwin. Or am I wrong? Has anyone used Stata's margins command in combination with runmlwin? Or is there another way to correctly calculate and display predicted probabilities (holding all other variables at their means)?
Thanks so much for your help!

Best,
Raphael
Last edited by Raphael on Tue Jun 26, 2012 9:15 pm, edited 1 time in total.
GeorgeLeckie
Site Admin
Posts: 432
Joined: Fri Apr 01, 2011 2:14 pm

Re: Predicted probabilities after multilevel logit model

Post by GeorgeLeckie »

Hi Raphael,

Thanks for your post. I'm afraid that the margins postestimation command does not currently work after runmlwin. What you have done looks correct and is how I would have done this. I have given some comments below...

When you write

Code:

runmlwin mig cons ageh eduh env env2 , ///
  level3(state: cons) ///
  level2(muni: cons) ///
  level1(hhID: ) ///
  discrete(distribution(binomial) link(logit) denominator(cons)) batch
I am sure you know this, but for the benefit of other readers of this post: remember to fit any final discrete response model by PQL2, or ideally by MCMC. MQL1, the default estimation method for discrete response models, underestimates the model parameters, particularly the random part parameters. In data with a high degree of clustering, such as longitudinal data, these biases can be severe. In data with little clustering they can be very small, but you only know for certain by also fitting the model by PQL2 and ideally by MCMC.
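For readers who have not done this before, the sequence could look roughly like the sketch below. It assumes the pql2, initsprevious and mcmc(on) options behave as described in the runmlwin help file (please check help runmlwin for your installed version), and it reuses the variable names from the model above.

```stata
* Sketch only: refit the same model by MQL1, then PQL2, then MCMC.
* Option names (pql2, initsprevious, mcmc(on)) are as I understand them
* from the runmlwin help file; verify them with -help runmlwin-.

* Step 1: MQL1 (the default) to obtain starting values
runmlwin mig cons ageh eduh env env2 , ///
  level3(state: cons) ///
  level2(muni: cons) ///
  level1(hhID: ) ///
  discrete(distribution(binomial) link(logit) denominator(cons)) batch

* Step 2: PQL2, using the MQL1 estimates as starting values
runmlwin mig cons ageh eduh env env2 , ///
  level3(state: cons) ///
  level2(muni: cons) ///
  level1(hhID: ) ///
  discrete(distribution(binomial) link(logit) denominator(cons) pql2) ///
  initsprevious batch

* Step 3: MCMC, using the PQL2 estimates as starting values
runmlwin mig cons ageh eduh env env2 , ///
  level3(state: cons) ///
  level2(muni: cons) ///
  level1(hhID: ) ///
  discrete(distribution(binomial) link(logit) denominator(cons)) ///
  mcmc(on) initsprevious batch
```

Comparing the three sets of estimates, especially the level-2 and level-3 variances, shows how much the MQL1 bias matters in a particular dataset.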

When you write

Code:

gen yhat = (_b[cons]*cons) + (_b[lraind]*lraind) + (_b[lraind2]*lraind2)
You are predicting the log-odds of out-migration as a function of lraind and its square lraind2, holding all other covariates at zero. If you want to do this, it is probably best to centre all your covariates around their grand means, so that holding all other covariates at zero implies making predictions for a typical household. (Centring the covariates affects the magnitude of the intercept and therefore your predictions.)
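A minimal sketch of the centring step, assuming ageh and eduh stand in for the full set of controls; the _c suffix is just a naming choice for illustration.

```stata
* Sketch only: grand-mean centre the controls, refit, then predict.
* ageh and eduh stand in for all controls; the _c names are illustrative.
foreach v of varlist ageh eduh {
    quietly summarize `v'
    generate `v'_c = `v' - r(mean)
}

* Refit with the centred controls
runmlwin mig cons ageh_c eduh_c lraind lraind2 , ///
  level3(state: cons) ///
  level2(muni: cons) ///
  level1(hhID: ) ///
  discrete(distribution(binomial) link(logit) denominator(cons)) batch

* Leaving the centred controls out of the sum now holds them at their means
generate yhat = (_b[cons]*cons) + (_b[lraind]*lraind) + (_b[lraind2]*lraind2)
```

The slope estimates for lraind and lraind2 are unchanged by the centring; only the intercept shifts, so that it refers to a household with average values on the controls.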

When you write

Code:

replace yhat=((exp(yhat))/(1+(exp(yhat))))
you could have equally made use of Stata's invlogit() function.
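For example, the transformation and the plot could be written as in the sketch below. It assumes yhat still holds the log-odds prediction; the axis titles are only suggestions.

```stata
* invlogit(x) returns exp(x)/(1 + exp(x))
replace yhat = invlogit(yhat)

* Sort on the x variable so that the line is drawn from left to right
sort lraind
twoway (line yhat lraind), ///
  ytitle("Predicted probability of out-migration") xtitle("lraind")
```

Because yhat depends only on lraind here, a line plot shows the same curve as the scatter plot, just more cleanly.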

Your use of the constant (_b[cons]*cons) term in your equation for yhat does make sense. But remember that how you centre your covariates affects the estimate of the intercept and therefore its interpretation (see above).

Best wishes

George
Raphael
Posts: 19
Joined: Wed Oct 12, 2011 2:52 am

Re: Predicted probabilities

Post by Raphael »

Hi George,
Thank you so much for this helpful comment! I learned a lot!
Have a nice day!

Best,
Raphael
GeorgeLeckie
Site Admin
Posts: 432
Joined: Fri Apr 01, 2011 2:14 pm

Re: Predicted probabilities after multilevel logit model

Post by GeorgeLeckie »

Hi Raphael,

Another way to do predictions is to

(1) Fit the model
(2) Add some extra observations to the end of your dataset which have the desired covariate values
(3) Use the -predict- command to make fixed part predictions for these out-of-sample observations

Instead of (2) you could simply recode the covariate values of the observations included in your estimation sample and then proceed to (3). You could, for example, replace the values of a covariate x by its mean by simply typing

Code:

runmlwin ...
sum x
replace x = r(mean)
predict yhat
This approach avoids you having to explicitly reference the parameter estimates, but it only allows predictions of the fixed part of the model. You would still have to manually add on the random effects and then do the invlogit() transformation to get predicted probabilities.
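Putting these pieces together for the model in this thread might look like the sketch below. It assumes that the residuals() option stores the estimated state and municipality effects as v0 and u0 (the stubs v and u are my choice), and that ageh and eduh stand in for all controls. Please check help runmlwin before relying on it.

```stata
* Sketch only: fixed-part prediction with controls at their means,
* then add the estimated random effects by hand.
* Assumes residuals(v) and residuals(u) create variables v0 and u0.
runmlwin mig cons ageh eduh lraind lraind2 , ///
  level3(state: cons, residuals(v)) ///
  level2(muni: cons, residuals(u)) ///
  level1(hhID: ) ///
  discrete(distribution(binomial) link(logit) denominator(cons)) batch

preserve                          // replace overwrites the data, so keep a copy
foreach v of varlist ageh eduh {
    quietly summarize `v'
    replace `v' = r(mean)
}
predict xb                        // fixed part, on the log-odds scale
generate p_fixed = invlogit(xb)              // random effects held at zero
generate p_muni  = invlogit(xb + v0 + u0)    // state and municipality effects added
sort lraind
twoway (line p_fixed lraind)
restore                           // bring back the original covariate values
```

p_fixed is the prediction for a household in a typical municipality and state, while p_muni varies across municipalities. Both variables are dropped at restore, so plot or save them before that line.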

Hope that helps

George
Raphael
Posts: 19
Joined: Wed Oct 12, 2011 2:52 am

Re: Predicted probabilities after multilevel logit model

Post by Raphael »

Wow, I never thought about the possibility of obtaining predictions in this nifty way! Thanks so much for sharing these insights! Have a great day!

Best,
Raphael