Removing Multivariate Outliers

Posted: Wed Mar 22, 2017 9:11 pm
by Jillianeh
Hello,

I have a data set of 20,000 students, nested in 1000 classrooms, nested in 300 schools.

I have saved my standardized residuals for level 1, level 2, and level 3. The only thing random in my model is the intercept.

For a sensitivity analysis, I want to remove the observations whose standardized residuals are >= 2 or <= -2 (at the first, second, and third levels).

I can store these residuals. However, the second- and third-level residuals give me only one residual per higher-level unit (i.e. 1000 at the class level and 300 at the school level), rather than a higher-level residual attached to every level-1 observation (i.e. every student). When I export my dataset to SPSS, the second- and third-level residuals do not line up with my level 2 and level 3 IDs, so I cannot aggregate the values.

The only thing I can find in the MLwiN manual about sensitivity analyses or removing outliers is a suggestion to point and click on every outlying observation in the residual plot and "remove from analysis" by hand. This is not feasible given the size of my dataset.

Does anyone know how to remove multivariate outliers in MLwiN?

Thanks in advance,
Jillian

Re: Removing Multivariate Outliers

Posted: Thu Mar 23, 2017 9:48 am
by ChrisCharlton
If you just want the residuals expanded to the same length as level 1, then probably the easiest way to do this is via the predictions window. If you select only the residual term of interest (i.e. u0j) there, without any fixed effects, the prediction will contain only the data for this term, but with a value for each level-1 unit.

An alternative is to use the CALC command to build the expression that determines which data to exclude. If you do this, there is a lev1 operator which automatically expands the column it refers to up to level-1 length. For more information see the help topic for this command.
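
For anyone who would rather do the expansion after exporting, here is a minimal Python sketch of the same idea. It is not MLwiN syntax and not the lev1 operator itself. It assumes that the level-1 rows are sorted so that each classroom's and each school's rows are contiguous, that the stored higher-level residuals are in the same unit order as the sorted data, and that an observation is dropped if its residual is outlying at any level. The function names and the toy values are made up for illustration.

```python
from itertools import groupby


def expand_to_level1(unit_ids, unit_residuals):
    """Repeat each higher-level residual across that unit's level-1 rows.

    unit_ids: the higher-level ID of every level-1 row, sorted so that
        each unit's rows are contiguous (assumption).
    unit_residuals: one residual per unit, in order of first appearance
        of the unit in unit_ids (assumption).
    """
    # Count how many consecutive level-1 rows belong to each unit.
    run_lengths = [len(list(rows)) for _, rows in groupby(unit_ids)]
    if len(run_lengths) != len(unit_residuals):
        raise ValueError("number of units does not match number of residuals")
    expanded = []
    for residual, n_rows in zip(unit_residuals, run_lengths):
        expanded.extend([residual] * n_rows)
    return expanded


def keep_mask(*residual_columns, cutoff=2.0):
    """True for rows whose residuals are strictly inside (-cutoff, cutoff) at every level."""
    return [all(abs(r) < cutoff for r in row) for row in zip(*residual_columns)]


# Toy data: 6 students in 3 classrooms in 2 schools (hypothetical values).
class_id = [1, 1, 1, 2, 2, 3]
school_id = [10, 10, 10, 10, 10, 20]
student_res = [0.5, -2.3, 1.0, 0.2, -0.4, 1.9]  # one per student
class_res = [0.3, 2.1, -0.7]                    # one per classroom
school_res = [-0.2, 1.1]                        # one per school

class_res_l1 = expand_to_level1(class_id, class_res)
school_res_l1 = expand_to_level1(school_id, school_res)
keep = keep_mask(student_res, class_res_l1, school_res_l1)
```

In this toy example the second student is dropped for a level-1 residual of -2.3, and both students in classroom 2 are dropped because that classroom's residual is 2.1. If classroom IDs are reused across schools, the run-counting step can merge adjacent classrooms that share an ID, so the IDs would need to be made unique first.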