Confidence ellipse for EB estimates of bivariate random var
Posted: Fri Mar 07, 2014 2:33 pm
Hello
I'm looking for ways to test a joint hypothesis about (or create a confidence ellipse around) the cluster-specific estimates of two random effects. I'm estimating a bivariate two-level model in which the error terms at each level are assumed to be drawn from a bivariate normal (BVN) distribution with an unstructured variance-covariance matrix. The model is given by
Y1_ij = a + u1_j + e1_ij
Y2_ij = a + u2_j + e2_ij
where u1_j and u2_j are BVN with mean zero, variances var_1 and var_2, and covariance covar_12 = rho*sqrt(var_1)*sqrt(var_2) [or, alternatively, rho = covar_12 / (sqrt(var_1)*sqrt(var_2))]; i is the lower-level unit and j is the higher-level (cluster) unit. Although this is a SUR model, I think the same setup and questions apply to a model with a random intercept and a random coefficient that are BVN.
I'm trying to test the hypothesis that u1_j > 0 and u2_j > 0. IGLS or MCMC gives me (Empirical) Bayes estimates of u1_j and u2_j as well as their posterior standard deviations. Using these estimates, and ignoring the correlation between the two intercepts, I can calculate a rectangular (box-shaped) confidence region from the two separate CIs, but this has incorrect coverage. So I need a way to incorporate the correlation, i.e. the resulting confidence region should be an ellipse in which the correlation determines the orientation of the major axis.
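To make the ellipse idea concrete, here is a minimal sketch of how such a region could be computed for one cluster, assuming the two effects are treated as bivariate normal with a known 2x2 covariance matrix. The function names and the example numbers are hypothetical; in particular, the off-diagonal element of `cov` has to come from somewhere, and which value is appropriate is part of my question.

```python
import numpy as np

def confidence_ellipse(mean, cov, level=0.95):
    """Return (centre, semi_axes, angle_deg) of the `level` ellipse of a BVN.

    mean : length-2 sequence, e.g. the EB estimates (u1_j, u2_j) of one cluster
    cov  : 2x2 covariance matrix (ASSUMED known; how to fill it is the open issue)
    """
    mean = np.asarray(mean, dtype=float)
    cov = np.asarray(cov, dtype=float)
    # The chi-square quantile with 2 degrees of freedom has a closed form.
    q = -2.0 * np.log(1.0 - level)
    # eigh returns the eigenvalues of a symmetric matrix in ascending order.
    eigvals, eigvecs = np.linalg.eigh(cov)
    semi_axes = np.sqrt(q * eigvals[::-1])  # major axis first, then minor
    major = eigvecs[:, -1]                  # direction of the major axis
    angle_deg = np.degrees(np.arctan2(major[1], major[0])) % 180.0
    return mean, semi_axes, angle_deg

def inside_ellipse(point, mean, cov, level=0.95):
    """True if `point` lies inside the `level` ellipse (Mahalanobis test)."""
    d = np.asarray(point, dtype=float) - np.asarray(mean, dtype=float)
    m2 = d @ np.linalg.solve(np.asarray(cov, dtype=float), d)
    return bool(m2 <= -2.0 * np.log(1.0 - level))

# Hypothetical numbers for a single cluster j.
centre, axes, angle = confidence_ellipse([0.4, 0.3], [[0.04, 0.02], [0.02, 0.04]])
print(centre, axes, angle)
print(inside_ellipse([0.0, 0.0], [0.4, 0.3], [[0.04, 0.02], [0.02, 0.04]]))
```

With equal variances and a positive correlation the major axis lies along the 45-degree line, which is the 'longer side' behaviour described above.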
I thought one way to test my hypothesis would be to run the model in MCMC, record the chains for the u1's and u2's, and check what proportion of the MCMC iterations fulfil the hypothesis. But this is very slow. An alternative may be to estimate the model using IGLS and then sample from a bivariate normal distribution for each cluster, with the Empirical Bayes estimates of u1_j and u2_j plugged in as the means and the squared posterior standard deviations plugged in as the variances. In other words, I treat the cluster-specific estimates and their squared posterior SDs as describing a bivariate normal distribution from which I can sample. I would then count how many of these simulated values satisfy my hypothesis, as above.
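The second approach could be sketched roughly as follows for a single cluster. This is only an illustration under stated assumptions: the function name is hypothetical, and `post_corr` (the correlation used for the off-diagonal of the sampling covariance) is an input I have to supply; whether the IGLS estimate of rho is a valid choice for it is exactly what I am unsure about.

```python
import numpy as np

def prob_both_positive(u1_hat, u2_hat, sd1, sd2, post_corr,
                       n_draws=100_000, seed=0):
    """Monte Carlo estimate of P(u1_j > 0 and u2_j > 0) for one cluster.

    u1_hat, u2_hat : EB estimates of the two random effects (used as means)
    sd1, sd2       : their posterior standard deviations
    post_corr      : ASSUMED correlation between the two effects for this
                     cluster (hypothetical input, not something IGLS hands
                     over directly as far as I can tell)
    """
    rng = np.random.default_rng(seed)
    cov12 = post_corr * sd1 * sd2
    cov = np.array([[sd1**2, cov12],
                    [cov12, sd2**2]])
    draws = rng.multivariate_normal([u1_hat, u2_hat], cov, size=n_draws)
    # Share of simulated pairs for which both effects are positive.
    return float(np.mean((draws[:, 0] > 0) & (draws[:, 1] > 0)))

# Hypothetical numbers for a single cluster j.
print(prob_both_positive(0.4, 0.3, 0.2, 0.2, post_corr=0.5))
```

Setting `post_corr=0` reproduces the independent case that underlies the box-shaped region, so comparing the two results shows how much the assumed correlation matters for a given cluster.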
Does this make sense? The second approach would be much faster. But can I use the estimate of covar_12 to construct the BVN to sample from? When I run MCMC and record the chains, I find that the overall correlation of u1_j and u2_j (across all j and all MCMC iterations) is approximately equal to the IGLS estimate of the covariance. But the correlation of the MCMC draws within each cluster j is very different from that overall value.
I'm not actually interested in the priors or any other feature of the 'full' Bayesian approach. I'm perfectly happy with IGLS and Empirical Bayes estimates. So I would like to avoid going down the MCMC route if I don't have to.
Any comments would be highly appreciated.
Cheers
Nils