highly correlated multivariate dependents -> numerical error

Posted: Sat Jul 23, 2011 9:48 am
by ash
hi-

took quite a bit to narrow down this issue.

MLwiN fails to estimate multilevel multivariate models with highly correlated dependents. I created an absurd toy example. This works:

. use "http://www.bristol.ac.uk/cmm/media/runm ... ecomp1.dta", clear

. runmlwin ///
(written cons, eq(1)) ///
(csework cons, eq(2)), ///
level1(student: (cons, eq(1)) (cons, eq(2)))


But this doesn't (the result is a numerical error calculating the likelihood):

. gen int iwritten = written
. corr written iwritten

. runmlwin ///
(written cons, eq(1)) ///
(iwritten cons, eq(2)), ///
level1(student: (cons, eq(1)) (cons, eq(2)))


this fails too:

. runmlwin ///
(written cons, eq(1)) ///
(iwritten cons, eq(2)), ///
level1(student: (cons, eq(1 2)) )


I'm guessing it's an over-parameterized model issue, but regular multivariate regression works:

. mvreg written iwritten = cons, nocon

So what is the multilevel equivalent? Is there any type of adjustment?

regards, ash

Re: highly correlated multivariate dependents -> numerical er

Posted: Mon Jul 25, 2011 2:49 pm
by GeorgeLeckie
Hi Ash,

What has happened is that the 2-by-2 student-level variance-covariance matrix has gone non-positive-definite on one of the iterations. What MLwiN does to solve this problem is to reset the relevant row of the variance-covariance matrix to zero and then to carry on iterating until convergence. Normally this works and you get to the "right" answers. However, what happens in your particular example is that MLwiN gets stuck and you end up with nonsensical results where some of the parameter estimates are stuck at zero.
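
To spell out what "non-positive-definite" means for a 2-by-2 matrix, here is a short sketch. The symbols (\Omega for the student-level variance-covariance matrix, \sigma_1^2 and \sigma_2^2 for the two variances, \sigma_{12} for the covariance, \rho for the implied correlation) are notation introduced here for illustration only, not taken from the MLwiN output:

```latex
\Omega =
\begin{pmatrix}
\sigma_1^2 & \sigma_{12} \\
\sigma_{12} & \sigma_2^2
\end{pmatrix},
\qquad
\rho = \frac{\sigma_{12}}{\sigma_1 \sigma_2},
\qquad
\det \Omega = \sigma_1^2 \sigma_2^2 - \sigma_{12}^2
            = \sigma_1^2 \sigma_2^2 \left(1 - \rho^2\right).

% \Omega is positive definite exactly when
\sigma_1^2 > 0
\quad \text{and} \quad
\det \Omega > 0
\;\Longleftrightarrow\;
|\rho| < 1 .
```

So with non-zero variances the matrix is positive definite only while the implied correlation stays strictly between -1 and 1. At a correlation of exactly 1 or -1 the determinant is zero, and any estimate that overshoots makes the determinant negative.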

In your toy example the two response variables effectively have a correlation of 1, which is on the boundary of the parameter space. On one of the iterations the estimated correlation goes slightly above 1, which makes the variance-covariance matrix non-positive-definite on that iteration, and so the offending elements of the matrix are reset to zero. However, you would not normally run into this problem, even with very highly correlated response variables (see example below). So I think the real problem is this combined with the fact that the second response variable you create is binary (the model you have specified treats the two response variables as bivariate normal). If you instead create the second response variable as continuous, by generating it equal to the first response variable plus a very small amount of noise, runmlwin copes fine and gives the results you would expect.

. use "http://www.bristol.ac.uk/cmm/media/runm ... ecomp1.dta", clear

. gen iwritten = written + rnormal(0,1)

. corr written iwritten

. runmlwin ///
(written cons, eq(1)) ///
(iwritten cons, eq(2)), ///
level1(student: (cons, eq(1)) (cons, eq(2))) ///
nopause

. runmlwin, corr


If you run this model you will see that the estimated correlation is extremely high (0.997), as we added only a very small amount of random noise when generating the second response variable. However, the model fitted fine.

Note the problem is not that the model is over-parametrised.

If what you really want to do is to fit a bivariate response model for a continuous and a binary response variable, then this is possible in MLwiN and is documented in the MLwiN manuals. The relevant runmlwin do-files are also provided on the website.

I hope this helps

George