highly correlated multivariate dependents -> numerical error

ash
Posts: 6
Joined: Wed Jul 13, 2011 7:38 pm

Post by ash »

hi-

It took quite a while to narrow down this issue.

MLwiN fails to estimate multilevel multivariate models with highly correlated dependents. I created an absurd toy example. This works:

. use "http://www.bristol.ac.uk/cmm/media/runm ... ecomp1.dta", clear

. runmlwin ///
(written cons, eq(1)) ///
(csework cons, eq(2)), ///
level1(student: (cons, eq(1)) (cons, eq(2)))


But this doesn't (the result is a numerical error calculating the likelihood):

. gen int iwritten = written
. corr written iwritten

. runmlwin ///
(written cons, eq(1)) ///
(iwritten cons, eq(2)), ///
level1(student: (cons, eq(1)) (cons, eq(2)))


This fails too:

. runmlwin ///
(written cons, eq(1)) ///
(iwritten cons, eq(2)), ///
level1(student: (cons, eq(1 2)) )


I'm guessing it's an over-parameterized model issue, but regular multivariate regression works:

. mvreg written iwritten = cons, nocon

So what is the multilevel equivalent? Is there any type of adjustment?

regards, ash
GeorgeLeckie
Site Admin
Posts: 432
Joined: Fri Apr 01, 2011 2:14 pm

Post by GeorgeLeckie »

Hi Ash,

What has happened is that the 2-by-2 student-level variance-covariance matrix has gone non-positive-definite on one of the iterations. MLwiN handles this by resetting the relevant row of the variance-covariance matrix to zero and then carrying on iterating until convergence. Normally this works and you get to the "right" answers. In your particular example, however, MLwiN gets stuck and you end up with nonsensical results where some of the parameter estimates are stuck at zero.

In your toy example the two response variables effectively have a correlation of 1, which is on the boundary of the parameter space. On one of the iterations the estimated correlation goes slightly higher than 1, which makes the variance-covariance matrix non-positive-definite on that iteration, and so the offending elements of the matrix are reset to zero. However, you would not normally run into this problem, even with very highly correlated response variables (see example below). So I think the real problem is this combined with the fact that the second response variable you create is binary (the model you have specified treats the two response variables as bivariate normal). If you instead create the second response variable as continuous by generating it equal to the first response variable plus a very small amount of noise, runmlwin copes fine and gives the results you would expect.
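
For reference, the positive-definiteness condition discussed above can be written out for the 2-by-2 case. This is only a sketch: the symbols \Omega_e, \sigma_{e1}^2, \sigma_{e2}^2, \sigma_{e12} and \rho are notation assumed here for the student-level variances, covariance and correlation, not names taken from MLwiN output.

```latex
% Student-level variance-covariance matrix and implied correlation
% (notation assumed for illustration)
\Omega_e =
\begin{pmatrix}
\sigma_{e1}^2 & \sigma_{e12} \\
\sigma_{e12}  & \sigma_{e2}^2
\end{pmatrix},
\qquad
\rho = \frac{\sigma_{e12}}{\sigma_{e1}\,\sigma_{e2}}

% Positive definite if and only if both leading principal minors are positive
\sigma_{e1}^2 > 0
\quad\text{and}\quad
\det \Omega_e
  = \sigma_{e1}^2 \sigma_{e2}^2 - \sigma_{e12}^2
  = \sigma_{e1}^2 \sigma_{e2}^2 \,\bigl(1 - \rho^2\bigr) > 0

% With positive variances this is equivalent to
|\rho| < 1
```

At \rho = 1 the determinant is exactly zero, so the matrix sits on the boundary of the parameter space; any iteration that pushes the estimated correlation even slightly above 1 gives a negative determinant and hence a non-positive-definite matrix.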

. use "http://www.bristol.ac.uk/cmm/media/runm ... ecomp1.dta", clear

. gen iwritten = written + rnormal(0,1)

. corr written iwritten

. runmlwin ///
(written cons, eq(1)) ///
(iwritten cons, eq(2)), ///
level1(student: (cons, eq(1)) (cons, eq(2))) ///
nopause

. runmlwin, corr


If you run this model you will see that the estimated correlation is extremely high (0.997), as we added only a very small amount of random noise when generating the second response variable. However, the model fitted fine.

Note the problem is not that the model is over-parametrised.

If what you really want to do is to fit a bivariate response model for a continuous and a binary response variable, then this is possible in MLwiN and is documented in the MLwiN manuals. The relevant runmlwin do-files are also provided on the website.

I hope this helps

George