indicator functions for logistic models
Posted: Sat Aug 04, 2012 5:43 am
Hi.
Using panel data, I am trying to estimate a logistic model of the probability of a residential move in which the coefficients vary with an individual's marital status (single vs. in a cohabiting relationship); please see the attached file equation.jpg. I have 35 explanatory variables. Since an individual may move from being single to being in a cohabiting relationship over time, I suspect that the random effects for singles will be correlated with the random effects for those in cohabiting relationships.
To allow the effects to vary by marital status, I created two sets of explanatory variables using the Stata code below. Hence, instead of the original 35 explanatory variables, I now have 70 X's.
foreach x of varlist x1 - x35 {
g `x'_C = `x' if couple == 1
replace `x'_C = 0 if couple == 0
g `x'_S = `x' if couple == 0
replace `x'_S = 0 if couple == 1
}
g Xsingle = couple == 0
g Xcouple = couple == 1
runmlwin y cons x1_S - x35_S x1_C - x35_C, level2(individual_id: Xsingle Xcouple) level1(survey_round:) discrete(distribution(binomial) link(logit) denominator(cons)) nopause
When I estimate the model, I get sensible results for the fixed part. However, I get the following estimates for the random part:
cov(Xsingle,Xcouple) = .1861189
var(Xsingle) = .1041276
var(Xcouple) = .2092139
display (.1861189 )/sqrt(.1041276*.2092139) = 1.26
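Written out, the check above is the following. The symbols $\hat{\sigma}^2_S$, $\hat{\sigma}^2_C$ and $\hat{\sigma}_{SC}$ are just my shorthand for var(Xsingle), var(Xcouple) and cov(Xsingle,Xcouple), not notation taken from the attached equation. The second line states the same thing another way: the squared covariance exceeds the product of the two variances, so the estimated level-2 covariance matrix is not positive semi-definite.

```latex
\[
\hat{\rho}
  = \frac{\hat{\sigma}_{SC}}{\sqrt{\hat{\sigma}^2_S \, \hat{\sigma}^2_C}}
  = \frac{0.1861189}{\sqrt{0.1041276 \times 0.2092139}}
  \approx \frac{0.1861}{0.1476}
  \approx 1.26 > 1
\]
\[
\hat{\sigma}_{SC}^{2} \approx 0.0346
  \;>\;
\hat{\sigma}^2_S \, \hat{\sigma}^2_C \approx 0.0218
\]
```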
As you can see, the implied correlation falls outside the [-1, 1] range. Naturally, I cannot proceed to MCMC using these estimates as initial values. While I can specify different initial values, I am wondering whether my approach to creating the explanatory variables could be causing this problem. Alternatively, is there a direct way of putting an indicator function in the model specification instead of creating two sets of X's?
Thank you very much for your insights.