MLwiN data structure?
Posted: Thu Mar 04, 2010 1:14 pm
Hi!
I'm reading through the MLwiN Command Manual first, so this may be covered in the User's Guide (although I glanced through it and didn't see it).
My impression is that MLwiN's data structure is more "spreadsheet"-like than "rectangular data set". The statistical software that I'm used to (Stata, SPSS) treats each row (i.e. case, record) as permanently locked together, whereas MLwiN lets the cells in a column slide up and down.
For example, if you have columns (variables) for Age, Income, Years of Education, and Number of Kids, you can use the "choose" or "omit" commands to delete some cells and write the remaining cells to new variables. In that case, you need to be **really** careful that all the variables in an analysis come from the same stage of selection, because otherwise row #5 (for example) will hold information about Bob in some columns and about Fred in others.
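To make the concern concrete, here is a minimal sketch in plain Python. It is not MLwiN command syntax and does not claim to reproduce how "choose" or "omit" work internally; it only mimics the idea of columns stored as independent lists. The names, values, and the -9 missing-value code are all made up for illustration.

```python
# Hypothetical illustration only: plain Python lists standing in for
# worksheet columns. Nothing here is MLwiN syntax.

MISSING = -9  # assumed missing-value code, chosen for this example

# Each "column" is its own list; rows are aligned only by position.
names = ["Ann", "Cat", "Dan", "Eve", "Bob", "Fred"]
age = [34, 29, 46, 38, 51, 60]
income = [40, MISSING, 62, 48, 55, 30]

# Drop the missing cells from ONE column and write the result to a
# new column. The new column is one cell shorter than the others.
income_kept = [v for v in income if v != MISSING]

# To stay aligned, every other column must go through the same filter.
names_kept = [n for n, v in zip(names, income) if v != MISSING]
age_kept = [a for a, v in zip(age, income) if v != MISSING]

row = 4  # "row #5" when counting from 1

# Mixing an unfiltered column with a filtered one misaligns the rows:
print(names[row], age[row])             # Bob's name and age
print(names_kept[row], income_kept[row])  # Fred's name and income

# Using only columns from the same selection stage keeps rows together:
print(names_kept[row], age_kept[row], income_kept[row])
```

In this sketch, row #5 of the original Age column still belongs to Bob, while row #5 of the filtered Income column belongs to Fred, which is exactly the mismatch described above.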
Is this a correct interpretation of the data structure? (I can see the flexibility of this arrangement, but for data analysis it also seems a little dangerous. The alternative would be **two** types of column: the first dedicated to your survey data, and the second behaving more like the "boxes", i.e. for storing results.)
Thanks!