Do I need to sort the data before running runmlwin?
Posted: Fri Oct 28, 2011 3:49 am
by Queena
In MLwiN, the data set needs to be sorted to reflect the data's hierarchical (nested) structure.
Do I need to do the same before using runmlwin in Stata?
Thanks.

Re: Do I need to sort the data before running runmlwin?
Posted: Fri Oct 28, 2011 1:30 pm
by GeorgeLeckie
Great question!
Yes, you do need to sort the data according to the model hierarchy before running the runmlwin command for the model.
However, the nice thing about runmlwin is that it checks the sort order and tells you if the data are sorted incorrectly, so you do not need to worry that you have sent mis-sorted data to MLwiN.
Have a look at the following example. The first time we fit the model, the data are sorted according to the model hierarchy (school student). We then re-sort the data by the response variable normexam. When we try to fit the model a second time, we receive an error message telling us that the data are not sorted according to the model hierarchy.
The Stata commands are
use http://www.bristol.ac.uk/cmm/media/runmlwin/tutorial, clear
runmlwin normexam cons standlrt, level2 (school: cons standlrt) level1 (student: cons) nopause
sort normexam
runmlwin normexam cons standlrt, level2 (school: cons standlrt) level1 (student: cons) nopause
The Stata output window will show:
. use http://www.bristol.ac.uk/cmm/media/runmlwin/tutorial, clear
. runmlwin normexam cons standlrt, level2 (school: cons standlrt) level1 (student: cons) nopause
MLwiN 2.24 multilevel model                     Number of obs      =      4059
Normal response model
Estimation algorithm: IGLS
-----------------------------------------------------------
                |   No. of       Observations per Group
 Level Variable |   Groups    Minimum    Average    Maximum
----------------+------------------------------------------
         school |       65          2       62.4        198
-----------------------------------------------------------
Run time (seconds)   =       1.45
Number of iterations =          4
Log likelihood       = -4658.4351
Deviance             =  9316.8701
------------------------------------------------------------------------------
    normexam |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        cons |  -.0115052    .039783    -0.29   0.772    -.0894784     .066468
    standlrt |   .5567305    .019937    27.92   0.000     .5176548    .5958063
------------------------------------------------------------------------------
------------------------------------------------------------------------------
   Random-effects Parameters |   Estimate   Std. Err.     [95% Conf. Interval]
-----------------------------+------------------------------------------------
Level 2: school              |
                   var(cons) |   .0904446    .017924      .0553143    .1255749
          cov(cons,standlrt) |   .0180414   .0067229      .0048649     .031218
               var(standlrt) |   .0145361   .0044139      .0058851    .0231872
-----------------------------+------------------------------------------------
Level 1: student             |
                   var(cons) |   .5536575   .0124818      .5291936    .5781214
------------------------------------------------------------------------------
. sort normexam
. runmlwin normexam cons standlrt, level2 (school: cons standlrt) level1 (student: cons) nopause
The data must be sorted according to the order of the model hierarchy: school student.
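To recover from this error, the data can be put back into hierarchical order and the model refitted. The lines below are a minimal sketch of that step, reusing the tutorial data set and model from the example above. It assumes that runmlwin is installed and that the MLwiN_path global already points at a working MLwiN executable.

```stata
* Sketch: undo the mis-sort from the example above, then refit the model.
* Assumes runmlwin is installed and the MLwiN_path global is already set.
use http://www.bristol.ac.uk/cmm/media/runmlwin/tutorial, clear
sort normexam                 // mis-sorted, as in the failing run above
sort school student           // level-2 identifier first, then level-1 identifier
runmlwin normexam cons standlrt, level2(school: cons standlrt) level1(student: cons) nopause
```

The sort variables are listed from the highest level down, which is the order given in the error message (school student).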
Best wishes
George

Re: Do I need to sort the data before running runmlwin?
Posted: Sat Jan 06, 2018 9:01 pm
by hamzah734
I ran the commands below in Stata 14:
gen cons = 1
gen id = _n
runmlwin malaria cons, level2(district: cons) level1(id:) discrete(distribution(binomial) link(logit) denominator(cons))
However, I got the following message:
The data must be sorted according to the order of the model hierarchy: district id.
Please advise.
Best wishes,

Re: Do I need to sort the data before running runmlwin?
Posted: Tue Jan 09, 2018 10:46 am
by ChrisCharlton
-runmlwin- also checks that the data are sorted by the district variable. Could you please check whether this is the case in your data and, if not, see whether sorting by district fixes the error?
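One way to make this check is sketched below. It uses a small made-up data set in place of the real file, so the values of district are hypothetical. Only the variable names district and id come from the post above. The `: sortedby` macro function reports which variables Stata currently treats the data as sorted by.

```stata
* Toy data standing in for the poster's file (hypothetical values)
clear
set obs 6
gen district = 3 - mod(_n, 3)    // districts deliberately out of order: 2 1 3 2 1 3
gen id = _n                      // level-1 identifier, created as in the post

* Show which variables the data are currently sorted by (empty if none)
local before : sortedby
display "Sorted by before: `before'"

* Sort by the level-2 identifier first, then the level-1 identifier,
* matching level2(district: ...) level1(id: ...)
sort district id
local after : sortedby
display "Sorted by after: `after'"
```

With the real data, the same `sort district id` line would go immediately before the runmlwin call.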

Re: Do I need to sort the data before running runmlwin?
Posted: Thu Jan 11, 2018 10:40 am
by hamzah734
Dear George,
Thank you very much for your helpful feedback. I have compared several fitted models below.
y is the dependent variable (categorical).
x1 to x21 are categorical independent variables at level 1 (micro level / id level).
x22 to x25 are continuous independent variables at level 2 (macro level / district level).
However, the level 2 variables have been converted to categorical variables at the 25th percentile, also called the first quartile.
I hope the following process is correct for a two-level multilevel analysis:
gen cons = 1
gen id = _n
Model 0 - Null logit model: contains only bound variables, with no compositional or contextual variables
estimates store r1m1
Model i - Simple logistic regression analysis
estimates store r1m2
Model ii - Multilevel logistic regression
estimates store r1m3
Model iii - Multilevel logistic regression with Neighbourhood
estimates store r1m4
ssc install runmlwin
global MLwiN_path "C:\Program Files (x86)\MLwiN trial\i386\mlwin.exe"
. *Model null
. runmlwin y cons, level2(district: cons) level1(id: cons)
I get the message below:
The data must be sorted according to the order of the model hierarchy: district id.
Model i - Simple logistic regression analysis
runmlwin y cons, level1(id: cons x1 x2 x3 x4 x5 x6 x8 x9 x10 x11 x13 x14 x15 x16)
I get the message below:
option level1() required
Model ii - Multilevel logistic regression
*Fitting model
runmlwin y cons, level1(id: cons x1 x2 x3 x4 x5 x6 x8 x9 x10 x11 x13 x14 x15 x16) ///
level2(district:)
Model iii - Multilevel logistic regression with Neighbourhood
*Fitting model
runmlwin y cons, level1(id: cons x1 x2 x3 x4 x5 x6 x8 x9 x10 x11 x13 x14 x15 x16) ///
level2(district: x22_25 x23_25 x24_25 x25_25 x26_25)
Note: the _25 suffix denotes the 25th percentile.
I get the message below:
The data must be sorted according to the order of the model hierarchy: district id.
Are the commands above correct? Please advise.
Thank you very much for your guidance.
Best,
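
The same sort-then-fit pattern applies to a two-level logit model like the ones in this post. The sketch below is not a reply from the runmlwin authors and does not use the poster's data. It reuses the tutorial data set from earlier in the thread and creates a hypothetical binary outcome, pass, purely for illustration. It assumes runmlwin is installed and the MLwiN_path global is set. The discrete() option follows the form used in the malaria command earlier in the thread.

```stata
* Sketch: sort by the model hierarchy, then fit a two-level logit model.
* Assumes runmlwin is installed and the MLwiN_path global is already set.
use http://www.bristol.ac.uk/cmm/media/runmlwin/tutorial, clear

* Hypothetical binary outcome, for illustration only
gen byte pass = normexam > 0

* Level-2 identifier first, then level-1 identifier
sort school student

* Fixed-part predictors go in the main variable list; the level options
* list only the variables given random effects at that level
runmlwin pass cons standlrt, level2(school: cons) level1(student:) ///
    discrete(distribution(binomial) link(logit) denominator(cons)) nopause
```

For the data in this post, the analogous sort would be `sort district id`, run after `gen id = _n` and before each runmlwin call.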