Four-level cross-classified model
-
- Posts: 4
- Joined: Mon Sep 19, 2016 2:21 pm
Four-level cross-classified model
I am trying to fit a four-level cross-classified model for binary responses using the MCMC method. To obtain starting values I use the 1st order MQL estimates, which works fine (and the values definitely make sense). But when I then switch to MCMC and specify the model as cross-classified, the coefficient β keeps increasing with every draw of the monitoring chain. Also, when I try to fit the model with 2nd order MQL or PQL, it fails to converge, with all values turning into 1.#QO(0,000). Reducing the model to three levels does not help. Could it be an issue that I have relatively few units at each level (949, 5460, 22916 and 39599 respectively)? Thanks for any help.
-
- Posts: 1384
- Joined: Mon Oct 19, 2009 10:34 am
Re: Four-level cross-classified model
Could you check the Model->Hierarchy viewer to ensure that, after switching to cross-classified, the unit structure is still what you expect? When you change to cross-classified, the IDs can take on a different meaning: previously, the same ID value appearing within different higher-level units was counted as different units, whereas now it is assumed to refer to the same unit.
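As a rough illustration of the point about IDs (a Python sketch outside MLwiN, using made-up toy data and hypothetical names `higher_id` and `lower_id`), the number of distinct lower-level units can differ depending on whether an ID is read as nested within its higher-level unit or as a global identifier:

```python
# Hypothetical toy data: each row is (higher_id, lower_id).
# Here the lower-level IDs were numbered from 1 within each higher-level unit.
rows = [
    (1, 1), (1, 2),
    (2, 1), (2, 2),
    (3, 1),
]

# Nested reading: a lower-level unit is identified by the pair (higher_id, lower_id).
nested_units = {(higher_id, lower_id) for higher_id, lower_id in rows}

# Cross-classified reading: a lower-level unit is identified by lower_id alone.
crossed_units = {lower_id for _, lower_id in rows}

print(len(nested_units))   # 5 distinct units under the nested reading
print(len(crossed_units))  # 2 distinct units under the cross-classified reading
```

If the two counts differ for a real dataset, the IDs would probably need recoding to be globally unique before the cross-classified specification means what was intended.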
-
- Posts: 4
- Joined: Mon Sep 19, 2016 2:21 pm
Re: Four-level cross-classified model
Yes, the unit structure is as I would expect (I have attached a screenshot).
- Attachments
-
- hierarchy viewer.png (121.16 KiB)
-
- Posts: 1384
- Joined: Mon Oct 19, 2009 10:34 am
Re: Four-level cross-classified model
I suspect that this may be due to the distribution of the zeros and ones in your response variable at each level (i.e. there are many units whose responses show no variation). If you run the following MLwiN macro, it will create columns indicating the proportion of ones for each unit at each level.
To test this, you could try running the model with a response generated from random binomial draws and see whether the model becomes more stable.
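One way such a random response could be generated outside MLwiN is sketched below in Python. The function name `simulate_response` and the probability 0.05 are assumptions made for illustration only; 39599 is the number of lowest-level records mentioned in the thread. The resulting column would then need to be imported into the worksheet in place of the real response.

```python
import random

def simulate_response(n_rows, p, seed=0):
    """Return n_rows independent Bernoulli(p) draws as a list of 0/1 values."""
    rng = random.Random(seed)  # fixed seed so the draws are reproducible
    return [1 if rng.random() < p else 0 for _ in range(n_rows)]

# p = 0.05 is a placeholder, not a value taken from the original data.
y_sim = simulate_response(39599, 0.05)

# Report the number of draws and the observed proportion of ones.
print(len(y_sim), sum(y_sim) / len(y_sim))
```

Because these draws are independent of the unit structure, units with several records will tend to show some variation, which is what makes this a useful comparison against the real response.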
Code:
NOTE Find response column
YVAR b30
NOTE Store number of levels
NLEV b31
NOTE Reserve three free temporary columns
LINK 3 G6
NOTE Look up IDs for each level
IDCOl G6[3]
NOTE Loop over levels
LOOP b32 1 b31
NOTE Reserve two free output columns
LINK 2 G7
NOTE Look up ID for current level
PICK b32 G6[3] b33
NOTE Look up column name for current ID
COLN cb33 s1
NOTE Sort response and current level ID into temporary columns
SORT 1 cb33 cb30 G6[1] G6[2]
NOTE Store unique IDs into first output column
SJOIN s1 "_unique" s2
NAME G7[1] s2
UNIQ G6[1] G7[1]
NOTE Store mean response within each ID into second output column
SJOIN s1 "_proportion" s3
NAME G7[2] s3
TABStore G7[2] G6[2] G6[1]
NOTE unreference output columns
LINK 0 G7
ENDLoop
NOTE erase temporary columns
ERAS G6
NOTE unreference temporary columns
LINK 0 G6
The real problem appears to be that there are 22,916 exhibitions but only 39,599 events within them. Looking at the distribution across the 22,916 exhibitions, 22,076 are all 0, 819 are all 1, and only 21 exhibit any variation. It is therefore not feasible to fit the exhibition level within the model, and it doesn't seem very likely that this level follows a normal distribution. If you fit the simpler cross-classified model with artists crossed with spaces, you will see that the chains converge, although mixing is still poor, largely due to the large clustering variability observed at both of these levels. Hope this helps.
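The kind of count quoted above (units that are all 0, all 1, or mixed) can also be reproduced outside MLwiN. The following is a minimal Python sketch on made-up toy data; the function name `summarise_units` and the variable names are hypothetical and not part of the macro.

```python
from collections import defaultdict

def summarise_units(unit_ids, responses):
    """Count units whose binary responses are all 0, all 1, or mixed."""
    # Group the responses by the unit they belong to.
    by_unit = defaultdict(list)
    for unit, y in zip(unit_ids, responses):
        by_unit[unit].append(y)

    counts = {"all_0": 0, "all_1": 0, "mixed": 0}
    for ys in by_unit.values():
        proportion = sum(ys) / len(ys)  # proportion of ones within the unit
        if proportion == 0:
            counts["all_0"] += 1
        elif proportion == 1:
            counts["all_1"] += 1
        else:
            counts["mixed"] += 1
    return counts

# Hypothetical toy data, not the poster's dataset.
exhibition_id = [10, 10, 11, 12, 12, 13]
response = [0, 0, 1, 0, 1, 0]
print(summarise_units(exhibition_id, response))
```

When nearly every unit falls into the all-0 or all-1 groups, as reported for the exhibitions, the data carry very little information about within-unit variation at that level.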