large dataset & worksheet size limitation

Welcome to the forum for runmlwin users. Feel free to post your question about runmlwin here. The Centre for Multilevel Modelling take no responsibility for the accuracy of these posts, we are unable to monitor them closely. Do go ahead and post your question and thank you in advance if you find the time to post any answers!

Go to runmlwin: Running MLwiN from within Stata >> http://www.bristol.ac.uk/cmm/software/runmlwin/
julia1633
Posts: 30
Joined: Mon Aug 15, 2011 1:54 pm

Post by julia1633 »

Greetings,

I am currently working with a large dataset of about 800,000 observations, with 330 groups at level 2 and 80 groups at level 3. While I managed to fit a multilevel multinomial regression using MQL1, I couldn't then estimate it with MCMC. MLwiN keeps crashing generating a message that a programme has stopped working. After adding an interaction term I cannot even estimate the model using MQL1, and I keep getting the same message that 56637648 free space is needed. I've simplified the model to a 2-level one, with the second level reflecting country-year groupings. However, I still face the same problem. Is there any way I can increase the worksheet size above 250000? (I tried, but MLwiN does not allow me to do this.)

Thanks,
Julia
GeorgeLeckie
Site Admin
Posts: 432
Joined: Fri Apr 01, 2011 2:14 pm

Post by GeorgeLeckie »

Hi Julia,

In terms of your first query, "MLwiN keeps crashing generating a message that a programme has stopped working", you will need to post that one on the MLwiN forum, as it sounds like an MLwiN rather than a runmlwin problem. You can confirm this by checking that you get the same error message when running the model in MLwiN via the traditional point-and-click route.

In terms of your second query, "I keep getting the same message that 56637648 free space is needed", you could try manually increasing the amount of RAM runmlwin allocates to MLwiN (runmlwin occasionally assigns too little RAM to MLwiN).

We only implemented these options a couple of months ago, so the first thing you want to do is make sure you have the latest version of runmlwin:

Code:

ssc install runmlwin, replace
You can now use the new options we added, documented at

Code:

help runmlwin##mlwin_settings
Try ramping up the RAM using the size() option to see whether this solves your problem.

You can see how much RAM runmlwin automatically assigns to MLwiN by first attempting (and failing) to fit this model without this option. From within MLwiN you can then check the settings which runmlwin has assigned by going to

Options --> Worksheet

Best wishes

George
julia1633
Post by julia1633 »

Dear George,

Many thanks for this. I'll follow your suggestions and see if this makes a difference.

Thanks,
Julia
julia1633
Post by julia1633 »

Dear George,

Thanks a lot. Manually increasing the RAM that runmlwin allocates to MLwiN seems to have done the trick: the model runs. However, I get quite divergent results for some of the variables of interest between MQL1 and MCMC. The MCMC model is fitted using initial values obtained from MQL1.

More specifically, with MQL1 stronger property rights protection (xconst) is insignificant in explaining both the choice of self-employment vis-a-vis non-entry into entrepreneurship (contrast 1) and entrepreneurship with growth aspirations vis-a-vis non-entry into entrepreneurship (contrast 2). However, once the model is fitted with MCMC the results seem very odd: stronger property rights protection positively and significantly affects the choice of self-employment, and negatively and significantly affects entrepreneurship with growth aspirations. I would rather expect stronger property rights protection to be insignificantly related to self-employment but positive and significant for entrepreneurship with growth aspirations.

I know that MQL1 does not produce robust results and that the model should be finalised with MCMC. However, I wonder whether such a divergence between the MQL1 and MCMC results should be expected, or whether something is wrong and neither of these two estimators should be trusted.

best wishes,
Julia
julia1633
Post by julia1633 »

Attached is the file with the results, in case you wish to look at them.
Attachments
Results GEM 2001-09 20120225.docx
GeorgeLeckie
Post by GeorgeLeckie »

Hi Julia,

I'm afraid we don't have the scope to offer more substantive support with fitting these models.

However, one thing which you should do is run the model for longer, as your Effective Sample Sizes (ESS) for the parameters which you describe (the coefficients of xconst) are only 5, which is very low. You wouldn't report summary statistics based on a sample of 5 observations, so summary statistics based on a chain which is only equivalent to 5 independent observations aren't great either!
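
To make the ESS point concrete, here is a minimal Python sketch (not MLwiN's own calculation) that estimates ESS as n / (1 + 2 * sum of autocorrelations) for simulated chains. The function names, the AR(1) chains used as stand-ins for MCMC output, and the rule of stopping the sum at the first autocorrelation at or below 0.05 are all assumptions made for illustration; the ESS that MLwiN reports may be computed with a different truncation rule.

```python
import random


def autocorrelation(chain, lag):
    """Sample autocorrelation of `chain` at a non-negative lag."""
    n = len(chain)
    mean = sum(chain) / n
    variance = sum((x - mean) ** 2 for x in chain) / n
    if variance == 0.0:
        return 0.0
    covariance = sum(
        (chain[i] - mean) * (chain[i + lag] - mean) for i in range(n - lag)
    ) / n
    return covariance / variance


def effective_sample_size(chain, cutoff=0.05, max_lag=1000):
    """Estimate ESS = n / (1 + 2 * sum of autocorrelations).

    The sum stops at the first lag whose autocorrelation is at or below
    `cutoff` (an illustrative truncation rule, assumed for this sketch).
    """
    n = len(chain)
    tau = 1.0
    for lag in range(1, min(max_lag, n - 1) + 1):
        rho = autocorrelation(chain, lag)
        if rho <= cutoff:
            break
        tau += 2.0 * rho
    return n / tau


def simulate_ar1(n, phi, seed):
    """Simulate x[t] = phi * x[t-1] + noise as a hypothetical MCMC chain."""
    rng = random.Random(seed)
    chain = [rng.gauss(0.0, 1.0)]
    for _ in range(n - 1):
        chain.append(phi * chain[-1] + rng.gauss(0.0, 1.0))
    return chain


if __name__ == "__main__":
    n = 5000
    well_mixed = simulate_ar1(n, phi=0.0, seed=1)  # nearly independent draws
    sticky = simulate_ar1(n, phi=0.99, seed=1)     # highly autocorrelated draws
    print("ESS, well-mixed chain:", round(effective_sample_size(well_mixed)))
    print("ESS, sticky chain:", round(effective_sample_size(sticky)))
```

The highly autocorrelated chain yields a far smaller ESS than the nearly independent one even though both contain the same number of iterations, which is why a chain with an ESS of about 5 needs to be run for longer before its summary statistics are reported.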

You might want to try specifying the orthogonal and hierarchical centring options as well. Read the relevant chapters of the MCMC manual first, along with the sections on ESS.

Best wishes

George