Bootstrap runmlwin

ChrisCharlton · Post by **ChrisCharlton** » Mon Jul 26, 2021 6:31 pm

To use the MCMC chains you would apply the normal prediction formula, except instead of single values for the parameter estimates you would use the chain of values instead. This would then give you a chain of predicted values where you can use the mean or median of the values as your prediction estimate, and the appropriate percentiles to obtain your credible intervals. There is an example of doing a calculation like this (using Mata) in section 6.1 of the Random Slopes Regression Models MCMC manual replication example.

johannesmueller · Post by **johannesmueller** » Tue Jul 27, 2021 9:58 am

Thanks for sharing this link and the example. There is so much yet to be discovered for me! As far as I can tell, the worked example visually represents the different slopes for individuals nested in different schools.

What I would be keen on achieving is to estimate an interaction across levels of analysis to explain these different slopes (partially) through the fixed part of the model, i.e. using the example from before, I would want to estimate and plot the multiplicative / interactive effect of educ * d_lit (as predicted probabilities and marginal effects).

Would you be aware of anything in that vein?

: Graph.jpg (24.82 KiB) Viewed 11427 times

ChrisCharlton · Post by **ChrisCharlton** » Tue Jul 27, 2021 11:50 am

I've asked George if he has any examples, but in the mean time it might be worth looking through the replication materials for the LEMMA course (https://www.cmm.bris.ac.uk/lemma/), in particular modules 6 (Regression models for binary responses) and 7 (Multilevel models for binary responses) provided at https://www.bristol.ac.uk/cmm/software/ ... /examples/.

At the end of 7.6 (https://www.bristol.ac.uk/cmm/media/runmlwin/7.6.do) you can see an example of creating a prediction for a model with a cross-level interaction. This is not fitted with MCMC, so you would need to make appropriate changes for this if you wanted to create a graph that contained credible intervals (this particular example displays the predictions as a bar chart).

johannesmueller · Post by **johannesmueller** » Tue Jul 27, 2021 3:02 pm

Thank you for your continued help. I have worked through these slides and tutorials a while ago, they are a tremendous resource (and actually also taught me a lot about Stata).
The predicted probabilities in 7.6. are also great.
What is a challenge for me is to get the confidence intervals of these predicted probabilities and especially the marginal effects with confidence intervals. If you would have any guidance on how to set this up that'd be great!

ChrisCharlton · Post by **ChrisCharlton** » Wed Aug 04, 2021 3:55 pm

One way to calculate a prediction with credible intervals, using your previous example would be as follows:

Code: Select all

				 
clear all
use http://www.bristol.ac.uk/cmm/media/runmlwin/bang, clear
generate lc1 = (lc==1)
generate lc2 = (lc==2)
generate lc3plus = (lc>=3)

foreach var of varlist hindu urban {
	bysort district: egen M_`var' =mean(`var')
}
 
gen double educ_X_d_lit = educ*d_lit // new moderation of interest
 
* Create initial values
runmlwin use cons lc1 lc2 lc3plus age d_lit M_urban M_hindu  educ educ_X_d_lit , ///
		level2(district: cons) level1(woman) discrete(distribution(binomial) link(logit) denominator(cons)) ///
		nopause 
		
* Run model with MCMC    
runmlwin use cons lc1 lc2 lc3plus age d_lit M_urban M_hindu  educ educ_X_d_lit , ///
		level2(district: cons, residuals(u, savechains("u.dta", replace))) level1(woman) discrete(distribution(binomial) link(logit) denominator(cons)) ///
		nopause initsprevious mcmc(on)
					 
* Fixed part variables
putmata X=(cons lc1 lc2 lc3plus age d_lit M_urban M_hindu  educ educ_X_d_lit)

* Level-2 Random part variables
putmata XR2=(cons)

* Level-2 identifier in original data
putmata id_x=district

preserve

mcmcsum, getchains

* Fixed part parameter chains
putmata beta=(FP1_cons FP1_lc1 FP1_lc2 FP1_lc3plus FP1_age FP1_d_lit FP1_M_urban FP1_M_hindu  FP1_educ FP1_educ_X_d_lit)

* Load random part residual chains
use "u.dta", clear
drop idnum residual
rename value u

sort district iteration

* Level-2 residual parameter chains
putmata residuals=(u)

* Level-2 identifier in residual chains
putmata id_u=district
restore

* Note only needs to be installed once
ssc install moremata

mata
// Fixed part prediction
xb = X*beta'

// Add random part prediction for each unit
grpnos = uniqrows(id_x);
for (i = 1; i < rows(grpnos); i++) {
	xind = selectindex(id_x :== grpnos[i]);
	uind = selectindex(id_u :== grpnos[i]);
	xgrp = XR2[xind, .];
	ugrp = residuals[uind, .];
	xu = xgrp*ugrp';
	xb[xind, .] = xb[xind, .] + xu;
}

// Exract quantiles
quants = mm_quantile(xb', 1, (0.025 \ 0.5 \ 0.975))'

end

* create dataset variables containing calculated prediction and credible intervals
getmata (predlo predmd predhi)=quants

* Clear temporary mata variables
mata: mata drop X XR2 beta residuals id_x id_u grpnos xind uind xgrp ugrp xb xu i quants

johannesmueller · Post by **johannesmueller** » Thu Aug 05, 2021 12:17 pm

Dear Chris,

This is great, many thanks. Allow me a final and perhaps naive question: Once predlo predmd predhi are predicted, how would I then go about plotting them?

Many thanks in advance!

ChrisCharlton · Post by **ChrisCharlton** » Tue Aug 10, 2021 4:01 pm

George has had a look at this and provided us with an example for linear regression which you should be able to adapt (this is also simpler than mine and avoids using Mata):

Code: Select all

clear all

* Open data
use "http://www.bristol.ac.uk/cmm/media/runmlwin/tutorial", clear

codebook schgend

* Generate dummy
generate singlesex = (schgend!=1)

* Generate interaction
generate singlesexXstandlrt = singlesex*standlrt

* Set MLwiN path
global MLwiN_path "C:\Program Files\MLwiN v3.05\mlwin.exe"

* Fit linear regression by ML
runmlwin normexam cons singlesex standlrt singlesexXstandlrt, ///
  level1(student: cons) ///
  nopause

* Fit model by MCMC
runmlwin normexam cons singlesex standlrt singlesexXstandlrt, ///
  level1(student: cons) ///
  mcmc(thinning(5) savechains("chains.dta", replace)) ///
  initsprevious ///
  nopause

* Keep variables of interest
keep cons singlesex standlrt singlesexXstandlrt

* Drop duplicate observations
duplicates drop

* Merge in MCMC parameter chains
cross using "chains.dta"

* Generate prediction
generate yhat = FP1_cons*cons ///
  + FP1_singlesex*singlesex ///
  + FP1_standlrt*standlrt ///
  + FP1_singlesexXstandlrt*singlesexXstandlrt

* Keep variables of interest
keep iteration yhat singlesex standlrt

* Calculate the means and lower and upper limits of the 95% credible
* interactions of the MCMC chains for the prediction
bysort singlesex standlrt: egen yhatmn = mean(yhat)
bysort singlesex standlrt: egen yhatlo = pctile(yhat), p(2.5)
bysort singlesex standlrt: egen yhathi = pctile(yhat), p(97.5)

* Drop redundant variables
drop yhat iteration

* Drop duplicate observations
duplicates drop

* Plot interaction plot
twoway ///
  (rarea yhathi yhatlo standlrt if singlesex==0) ///
  (line yhatmn standlrt if singlesex==0) ///
  (rarea yhathi yhatlo standlrt if singlesex==1) ///
  (line yhatmn standlrt if singlesex==1), ///
  ytitle("Predicted Age 16 score") ///
  legend(order(2 "Mixed sex school" 4 "Single sex school"))

johannesmueller · Post by **johannesmueller** » Wed Aug 11, 2021 2:59 pm

Dear Chris,

Many thanks, also to George, for having a look at this.

This is very accessible and fantastic.

A few questions to make sure I understand it all:

Why is thinning = 5? I understand that every fifth result is stored but I am not sure I understand why it is done. Should (burnin(#) and chain(#) be increased from the default?

In the example, I have added a control variable and modified the random part of the model slightly. I am wondering whether the control variable girl can be held at its mean values in predicting yhat and whether the way below would be correct? Some literatures prefer holding all controls at the mean when obtaining yhat.

Finally, I am wondering whether the random part of the model contains any further information? Does e.g. the random slope effect of standlrt affect the yhat via its influence on the fixed part of the model, i.e. FP1_girl FP1_singlesex FP1_standlrt FP1_singlesexXstandlrt?

Code: Select all

clear all

* Open data
use "http://www.bristol.ac.uk/cmm/media/runmlwin/tutorial", clear

codebook schgend

* Generate dummy
generate singlesex = (schgend!=1)

* Generate interaction
generate singlesexXstandlrt = singlesex*standlrt

* Set MLwiN path
global MLwiN_path "C:\Program Files\MLwiN v3.05\mlwin.exe"

* Fit linear regression by ML
runmlwin normexam cons girl singlesex standlrt singlesexXstandlrt, ///
  level2(school: cons standlrt) level1(student: cons) ///
  nopause

* Fit model by MCMC
runmlwin normexam cons girl singlesex standlrt singlesexXstandlrt, ///
  level2(school: cons standlrt)  level1(student: cons) ///
  mcmc(thinning(5) savechains("chains.dta", replace)) ///
  initsprevious ///
  nopause
*Question: Why is thinning = 5? I understand that every fifth result is stored but I am not sure I understand why it is done. Should (burnin(#) and chain(#) be increased from default? 


* Keep variables of interest
keep cons singlesex standlrt singlesexXstandlrt girl

* Drop duplicate observations
duplicates drop

* Merge in MCMC parameter chains
cross using "chains.dta"

* Generate prediction
generate yhat = ///
	FP1_cons*cons ///
  + FP1_girl*girl ///
  + FP1_singlesex*singlesex ///
  + FP1_standlrt*standlrt ///
  + FP1_singlesexXstandlrt*singlesexXstandlrt 
  
* Generate prediction where control is set to mean (crudely hardcoded)
* Question: above the control girl is added to the model and set to its values in predicting yhat. 
* Can this be set manually to the mean (.490566; trivial for results here)
* Some literatures prefer holding all controls at the mean when obtaining yhat.  
sum girl

generate yhat2 = ///
	FP1_cons*cons ///
  + FP1_girl*.490566 ///
  + FP1_singlesex*singlesex ///
  + FP1_standlrt*standlrt 
  
  sum yhat*
  corr yhat*
   
*Question: Does the random part of the model contain any further information? 
sum RP2_var_cons_ RP2_cov_cons_standlrt_ RP2_var_standlrt_ RP1_var_cons_
   
* Keep variables of interest
keep iteration yhat* singlesex standlrt  

* Calculate the means and lower and upper limits of the 95% credible
* interactions of the MCMC chains for the prediction
bysort singlesex standlrt: egen yhatmn = mean(yhat)
bysort singlesex standlrt: egen yhatlo = pctile(yhat), p(2.5)
bysort singlesex standlrt: egen yhathi = pctile(yhat), p(97.5)

* Drop redundant variables
drop yhat iteration

* Drop duplicate observations
duplicates drop

* Plot interaction plot
twoway ///
  (rarea yhathi yhatlo standlrt if singlesex==0) ///
  (line yhatmn standlrt if singlesex==0) ///
  (rarea yhathi yhatlo standlrt if singlesex==1) ///
  (line yhatmn standlrt if singlesex==1), ///
  ytitle("Predicted Age 16 score") ///
  legend(order(2 "Mixed sex school" 4 "Single sex school"))

ChrisCharlton · Post by **ChrisCharlton** » Wed Aug 11, 2021 4:21 pm

Here are some answers to your questions from George:

Why is thinning = 5? I understand that every fifth result is stored but I am not sure I understand why it is done. Should (burnin(#) and chain(#) be increased from the default?

To reduce the size of the resulting saved file to make data manipulation and plotting quicker. You want the resulting chain length to be long enough that you feel the mean and 2.5th and 97.5th summaries are reasonably accurate.

In the example, I have added a control variable and modified the random part of the model slightly. I am wondering whether the control variable girl can be held at its mean values in predicting yhat and whether the way below would be correct? Some literatures prefer holding all controls at the mean when obtaining yhat.

Yes you would typically hold control variables at their means.

Finally, I am wondering whether the random part of the model contains any further information? Does e.g. the random slope effect of standlrt affect the yhat via its influence on the fixed part of the model, i.e. FP1_girl FP1_singlesex FP1_standlrt FP1_singlesexXstandlrt?

The inclusion of random intercept and slope will mostly increase the SE of the fixed-part predictors unless there is confounding concerns (correlation between random effects and covariates) in which case adding the random effects can move the regression coefficients about a bit.

johannesmueller · Post by **johannesmueller** » Thu Aug 12, 2021 10:25 am

Great, many thanks for these clarifications.

Given n(level1) = 200,000 and n(level2) =40, would you have any recommendation for the chain length?

This leads to a related issue: The sandbox example works great in the relatively small tutorial dataset. I have been unable to run it on my real dataset with n(level1) = 200,000. This likely is partially driven by the cross option that leads to an explosion of the data. Would there be another workaround that does not involve "cross"?

www.cmm.bristol.ac.uk/forum

Bootstrap runmlwin

Re: Bootstrap runmlwin

Re: Bootstrap runmlwin

Re: Bootstrap runmlwin

Re: Bootstrap runmlwin

Re: Bootstrap runmlwin

Re: Bootstrap runmlwin

Re: Bootstrap runmlwin

Re: Bootstrap runmlwin

Re: Bootstrap runmlwin

Re: Bootstrap runmlwin