Dummy as Random Slopes

Post by **Multimembership** » Wed Jan 15, 2014 12:45 pm

Hi,

I am running a random slopes model with five levels. I want to test whether the variance explained by the level-5 effect is equal in two different groups (groups 1 and 2). To test this, I construct a dummy that takes the value 1 when the group variable equals 2 and 0 when the group variable equals 1. Then I add this dummy at level 5 as a random slope. The DIC of this model is lower than that of the model without the random slope, and COV(effectlevel5, dummy) is negative. I interpret this result as follows: the variance explained by the level-5 effect is significantly lower when the dummy takes the value 1, that is, when the group variable equals 2.

Then I built the dummy the opposite way, but I did not find opposite results. Now the dummy takes the value 1 when the group variable equals 1 and 0 when the group variable equals 2. I would expect results opposite to those reported above, that is, that the variance explained by the level-5 effect is significantly higher when the dummy takes the value 1 (when the group variable equals 1). However, I find the following: the DIC of this model is higher than that of the model without the random slope, and COV(effectlevel5, dummy) is again negative. I interpret this result as follows: the variance explained by the level-5 effect is not significantly lower when the dummy takes the value 1, that is, when the group variable equals 1.

Overall, it seems that the results change depending on the way I specify the dummy.
- Am I interpreting the results correctly?
- What should I change in my model?
- Why do results change depending on the way I specify the dummy?

Thank you in advance for your support.

Best!

Post by **billb** » Wed Jan 15, 2014 1:42 pm

Hi,
I am not sure what is happening here, but I'd be happy to take a look at your spreadsheet. I am presuming that your grouping variable is not a level 5 variable, as this would cause issues? Otherwise, if you have coded your dummies correctly, it shouldn't matter which one is used and both models should have the same DIC. The variance for each group can be calculated via the variance functions window. If the DIC is not the same, it would be worth checking that the method has converged, i.e. that the ESS for the parameters is similar. Finally, the MCMC algorithm also uses the IGLS estimates as part of the prior distribution at level 5, so it might be worth checking that these didn't affect it.
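For reference, the calculation behind the variance function can be sketched as follows. This assumes a level-5 random part of the form $u_0 + u_1 x$ with a 0/1 dummy $x$; the symbols $v_0$, $v_1$ and $c_{01}$ are shorthand introduced here for var(cons), var(dummy) and cov(cons, dummy), not names taken from the model output.

```latex
\begin{aligned}
\operatorname{var}(u_0 + u_1 x) &= v_0 + 2 c_{01}\, x + v_1 x^2, \\
x = 0:&\quad v_0, \\
x = 1:&\quad v_0 + 2 c_{01} + v_1 .
\end{aligned}
```

Under this sketch, the sign of the covariance alone does not say which group has the larger variance; the whole expression has to be evaluated for each group.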
Anyway, if you want to send the model on to me (william.browne@bristol.ac.uk), I'll take a look.
Regards,
Bill.

Post by **Multimembership** » Wed Jan 15, 2014 7:35 pm

Please find below the runmlwin syntax with an annotated explanation of the problem.

I want to test not only whether the level-5 variance is equal across the two groups (flip = 0 versus flip = 1) but also the direction of the difference, that is, whether the variance explained by level 5 increases when the dummy takes the value 1 instead of 0.

I am not sure which one of the two options below is the right one:

Option 1: Look only at the COV and conclude that the variance explained by level 5 increases when COV is positive and decreases when COV is negative.
Option 2: Look at the entire variance at level 5. The variance at level 5 is composed of three terms: two variances and one covariance (let's say v0, c01, v1). I should then compute the total variance in the two groups, that is, when flip = 0 and when flip = 1. In my specific case, v0 = 0.019, v1 = 0.449 and c01 = -0.04 (estimates from the model called "Model with First Dummy Flip"). When flip = 0, the variance explained by the PEfirm1 effect is 0.019 (i.e., equal to v0, because the terms involving c01 and v1 are multiplied by flip = 0 and drop out). When flip = 1, the variance explained by the PEfirm1 effect is 0.019 + 2*(-0.04) + 0.449 = 0.388 (i.e., v0 + 2*c01 + v1). Based on this result, I should conclude that the variance explained by the PEfirm1 effect increases when flip goes from 0 to 1, because it rises from 0.019 to 0.388.

Which option is the correct one: the first or the second?

Best and thank you.

Code: Select all

* RANDOM INTERCEPT MODEL
* Baseline model: DIC 13187.07
sort PEfirm1 industry1 fund1  company1 entry1
runmlwin perfmean cons, level5(PEfirm1: cons) level4(industry1: cons) level3(fund1: cons)  level2(company1: cons) level1(entry1: cons) rigls nopause
matrix b = e(b)
matrix V = e(V)
runmlwin perfmean cons, level5(PEfirm1: cons) level4(industry1: cons) level3(fund1: cons)  level2(company1: cons) level1(entry1: cons) mcmc(on) initsb(b) initsv(V) nopause

* RANDOM SLOPE MODELS
* First Dummy Flip
gen flip = 1 if duration <=2
replace flip = 0 if flip ==.
* Second Dummy Longer (the opposite of Flip)
gen longer = 0 if duration <=2
replace longer = 1 if longer ==.
* The correlation between flip and longer is -1
corr flip longer


* Model with First Dummy Flip: DIC 12989.52 and COV(cons, flip) negative. Given that the DIC is lower than that of the baseline model (i.e., lower than 13187.07),
* I conclude that the variance explained by the PEfirm1 effect is lower when flip equals 1
sort PEfirm1 industry1 fund1  company1 entry1
runmlwin perfmean cons, level5(PEfirm1: cons flip) level4(industry1: cons) level3(fund1: cons)  level2(company1: cons) level1(entry1: cons) rigls nopause
matrix b = e(b)
matrix V = e(V)
runmlwin perfmean cons, level5(PEfirm1: cons flip) level4(industry1: cons) level3(fund1: cons)  level2(company1: cons) level1(entry1: cons) mcmc(on) initsb(b) initsv(V) nopause


* Model with Second Dummy Longer: DIC 12966.74 and COV(cons, longer) negative. Given that the DIC is lower than that of the baseline model (i.e., lower than 13187.07),
* I conclude that the variance explained by the PEfirm1 effect is lower when longer equals 1. I would have expected the opposite, since this model uses
* a dummy (longer) that is the opposite of the dummy used in the model above (flip)
sort PEfirm1 industry1 fund1  company1 entry1
runmlwin perfmean cons, level5(PEfirm1: cons longer) level4(industry1: cons) level3(fund1: cons)  level2(company1: cons) level1(entry1: cons) rigls nopause
matrix b = e(b)
matrix V = e(V)
runmlwin perfmean cons, level5(PEfirm1: cons longer) level4(industry1: cons) level3(fund1: cons)  level2(company1: cons) level1(entry1: cons) mcmc(on) initsb(b) initsv(V) nopause


* IN SUM, THE TWO DUMMIES, WHICH ARE OPPOSITES OF EACH OTHER, GIVE DIFFERENT RESULTS.

* For your information, we obtain exactly the opposite results to flip only when we construct the opposite of the flip dummy as in the model below. In this case
* COV(cons, oppositeflip) is positive (+0.04), exactly the negative of its value when we use flip (-0.04)
gen oppositeflip = flip * -1
corr flip oppositeflip
sort PEfirm1 industry1 fund1  company1 entry1
runmlwin perfmean cons, level5(PEfirm1: cons oppositeflip) level4(industry1: cons) level3(fund1: cons)  level2(company1: cons) level1(entry1: cons) rigls nopause
matrix b = e(b)
matrix V = e(V)
runmlwin perfmean cons, level5(PEfirm1: cons oppositeflip) level4(industry1: cons) level3(fund1: cons)  level2(company1: cons) level1(entry1: cons) mcmc(on) initsb(b) initsv(V) nopause

Post by **GeorgeLeckie** » Wed Jan 15, 2014 7:42 pm

Hi,

Your query relates to understanding the equivalency between different parametrizations of the same underlying statistical model.

For simplicity, consider a two-level students-within-schools model for student attainment and suppose that we want to allow gender-specific between-school variances. We can achieve this by adding gender to both the fixed part of the model and the random part at the school level. It turns out that there are three possible parametrizations which we might entertain.

(1) Enter girl as a covariate and allow its coefficient to vary across schools
(2) Enter boy as a covariate and allow its coefficient to vary across schools
(3) Remove the intercept and enter both girl and boy as covariates and allow both their coefficients to vary across schools

What follows demonstrates that the three parametrizations really give the same gender-specific between-school variances.

Now suppose that you want to test whether you actually need to allow for gender-specific variances. To do this, simply use an LR test (if using IGLS) to compare the current model (parametrization 1, 2 or 3) with the restricted model where gender is not entered into the random part at the school level. If you are using MCMC, examine the drop in the DIC associated with moving from the simpler model to the current model (parametrization 1, 2 or 3). Note that if you see non-trivial differences in the DIC across the three parametrizations, this suggests that you should run the model with a longer burn-in and chain. Use the burnin() and chain() options to alter these estimation settings.
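To see why the sign of the covariance cannot be compared across parametrizations, here is a short derivation. It is only a sketch, and the notation is assumed for illustration: $u_0, u_1$ are the school effects in parametrization 1, with variances $v_0, v_1$ and covariance $c_{01}$, and starred symbols refer to parametrization 2, where boy = 1 - girl.

```latex
\begin{aligned}
u_0 + u_1\,\text{girl} &= u_0 + u_1\,(1 - \text{boy}) = (u_0 + u_1) - u_1\,\text{boy}, \\
u_0^{*} &= u_0 + u_1, \qquad u_1^{*} = -u_1, \\
v_0^{*} &= \operatorname{var}(u_0 + u_1) = v_0 + 2c_{01} + v_1, \\
v_1^{*} &= \operatorname{var}(-u_1) = v_1, \\
c_{01}^{*} &= \operatorname{cov}(u_0 + u_1,\, -u_1) = -(c_{01} + v_1).
\end{aligned}
```

The covariance in parametrization 2 is therefore $-(c_{01} + v_1)$, not $-c_{01}$, so it can be negative under both codings. By contrast, multiplying the dummy by -1 leaves the intercept effect unchanged and only flips the sign of the covariance.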

Best wishes

George

Code: Select all

********************************************************************************
* Fit three different parameterisations by IGLS
********************************************************************************

* Load the data
use "http://www.bristol.ac.uk/cmm/media/runmlwin/tutorial.dta", clear

* Generate the girl and boy dummy variables
* (this overwrites the dataset's girl variable with an indicator for standlrt > 1)
drop girl
generate girl = (standlrt>1)
generate boy = 1 - girl



***************************************
* Parameterisation 1
***************************************

* Fit model by IGLS
runmlwin normexam cons girl, ///
	level2(school: cons girl) ///
	level1(student: cons) ///
	nopause

* Store model results
estimates store igls1

* Display the gender-specific means
display "girl mean   = " %4.3f _b[cons] + _b[girl]
display "boy mean    = " %4.3f _b[cons]

* Display the implied intercept variances and covariances for the two groups
display "girl var    = " %4.3f [RP2]var(cons) + 2*[RP2]cov(cons\girl) + [RP2]var(girl)
display "boy girl cov = " %4.3f [RP2]var(cons) + [RP2]cov(cons\girl)
display "boy var    = " %4.3f [RP2]var(cons)



***************************************
* Parameterisation 2
***************************************

runmlwin normexam cons boy, ///
	level2(school: cons boy) ///
	level1(student: cons) ///
	nopause
estimates store igls2
display "girl mean   = " %4.3f _b[cons]
display "boy mean    = " %4.3f _b[cons] + _b[boy]
display "girl var    = " %4.3f [RP2]var(cons)
display "boy girl cov = " %4.3f [RP2]var(cons) + [RP2]cov(cons\boy)
display "boy var    = " %4.3f [RP2]var(cons) + 2*[RP2]cov(cons\boy) + [RP2]var(boy)



***************************************
* Parameterisation 3
***************************************

runmlwin normexam girl boy, ///
	level2(school: girl boy) ///
	level1(student: cons) ///
	nopause
estimates store igls3
display "girl mean   = " %4.3f _b[girl]
display "boy mean    = " %4.3f _b[boy]
display "girl var    = " %4.3f [RP2]var(girl)
display "boy girl cov = " %4.3f [RP2]cov(girl\boy)
display "boy var    = " %4.3f [RP2]var(boy)



***************************************
* Compare IGLS results
***************************************

* Compare the IGLS results (the DIC is only available after MCMC, so the log-likelihood is shown here)
esttab igls1 igls2 igls3 , wide b(%3.2f) stats(ll)


********************************************************************************
* Fit three different parameterisations by MCMC
********************************************************************************

***************************************
* Parameterisation 1
***************************************

* Fit model by MCMC
runmlwin normexam cons girl, ///
	level2(school: cons girl) ///
	level1(student: cons) ///
	mcmc(on) initsmodel(igls1) ///
	nopause

* Store model results
estimates store mcmc1

* Display the implied intercept variances and covariances for the two groups
display "girl mean   = " %4.3f _b[cons] + _b[girl]
display "boy mean    = " %4.3f _b[cons]
display "girl var    = " %4.3f [RP2]var(cons) + 2*[RP2]cov(cons\girl) + [RP2]var(girl)
display "boy girl cov = " %4.3f [RP2]var(cons) + [RP2]cov(cons\girl)
display "boy var    = " %4.3f [RP2]var(cons)



***************************************
* Parameterisation 2
***************************************

runmlwin normexam cons boy, ///
	level2(school: cons boy) ///
	level1(student: cons) ///
	mcmc(on) initsmodel(igls2) ///
	nopause
estimates store mcmc2
display "girl mean   = " %4.3f _b[cons]
display "boy mean    = " %4.3f _b[cons] + _b[boy]
display "girl var    = " %4.3f [RP2]var(cons)
display "boy girl cov = " %4.3f [RP2]var(cons) + [RP2]cov(cons\boy)
display "boy var    = " %4.3f [RP2]var(cons) + 2*[RP2]cov(cons\boy) + [RP2]var(boy)



***************************************
* Parameterisation 3
***************************************

runmlwin normexam girl boy, ///
	level2(school: girl boy) ///
	level1(student: cons) ///
	mcmc(on) initsmodel(igls3) ///
	nopause
estimates store mcmc3
display "girl mean   = " %4.3f _b[girl]
display "boy mean    = " %4.3f _b[boy]
display "girl var    = " %4.3f [RP2]var(girl)
display "boy girl cov = " %4.3f [RP2]cov(girl\boy)
display "boy var    = " %4.3f [RP2]var(boy)



***************************************
* Compare MCMC results
***************************************

esttab mcmc1 mcmc2 mcmc3 , wide b(%3.2f) stats(dic)
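
As a rough check on the numbers reported earlier in the thread (v0 = 0.019, c01 = -0.04, v1 = 0.449 from the model with flip), the following sketch converts them into what the coding longer = 1 - flip would imply. It uses only scalar arithmetic and does not call runmlwin. The scalar names are made up for illustration, and MCMC estimates from an actual run would only match these values approximately.

```stata
* Reported level-5 estimates from the model with flip (values quoted in the thread)
* s_v0 = var(cons), s_c01 = cov(cons, flip), s_v1 = var(flip)
scalar s_v0  = 0.019
scalar s_c01 = -0.04
scalar s_v1  = 0.449

* Implied level-5 variance in each group
scalar s_var0 = s_v0
scalar s_var1 = s_v0 + 2*s_c01 + s_v1
display "var when flip = 0  = " %5.3f s_var0
display "var when flip = 1  = " %5.3f s_var1

* Implied random-part parameters under the coding longer = 1 - flip
scalar s_cov_longer = -(s_c01 + s_v1)
display "var(cons)          = " %5.3f s_var1
display "cov(cons, longer)  = " %5.3f s_cov_longer
display "var(longer)        = " %5.3f s_v1
```

With these inputs the implied covariance under the longer coding is about -0.409, so it is negative under both codings even though the group-specific variances (0.019 and 0.388) are unchanged.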

Post by **Multimembership** » Fri Jan 17, 2014 7:36 pm

Hi George,

this is very clear. Thank you for your prompt and thorough support!

Best.

www.cmm.bristol.ac.uk/forum
