repeated time-series

camilabvic
Posts: 2
Joined: Sun Oct 11, 2015 5:35 pm

Post by camilabvic »

Hi all,

I am not yet familiar with "runmlwin", which was published in the Journal of Statistical Software, so I do not know whether "runmlwin" works when there are repeated time values within units.

I tried to use the Stata command xtreg and the other -xt- models available in Stata, since I thought my data were panel data. However, "xtset" (which must be run before "xtreg" or any of the other -xt- models) reports that repeated time values are not allowed, and my data contain them. Panel data are defined by an identifier variable and a time variable, and each combination of identifier and time should occur at most once. Therefore, "xtset" cannot be used when time values are repeated within a unit.
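
To make the "at most once" rule concrete, here is a minimal sketch in Python (standard library only) that lists every identifier-time combination occurring more than once. The rows, the unit labels "A" and "B", and the function name are invented for illustration and do not come from my data. Any pair it reports is one that "xtset" would reject.

```python
from collections import Counter

# Hypothetical rows: (unit, year, respondent_id). All values are invented.
rows = [
    ("A", 2002, 1),
    ("A", 2002, 2),  # second observation for unit A in 2002
    ("A", 2008, 3),
    ("B", 2002, 4),
    ("B", 2008, 5),
    ("B", 2008, 6),  # second observation for unit B in 2008
]


def repeated_unit_times(rows):
    """Return {(unit, time): count} for each unit-time pair seen more than once."""
    counts = Counter((unit, time) for unit, time, _ in rows)
    return {pair: n for pair, n in counts.items() if n > 1}


repeats = repeated_unit_times(rows)
print(repeats)
```

If the result is empty, the data have the panel structure that "xtset" expects. If it is not empty, there are several observations per unit and time point.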

So, since "runmlwin" is designed for multilevel (hierarchical or clustered) data structures and not panel data structures (please correct me if I am wrong on this), does "runmlwin" allow repeated time values within each unit (for example, countries)?

Thanks!
GeorgeLeckie
Site Admin
Posts: 432
Joined: Fri Apr 01, 2011 2:14 pm

Re: repeated time-series

Post by GeorgeLeckie »

Hi camilabvic,

It sounds like you have three-level rather than standard two-level data (the xt suite of commands only handles the latter).

Individuals (level-1) within years (level-2) within countries (level-3)

Yes, runmlwin can fit three-level models to three-level data
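
As a rough sketch of what such a model looks like, here is a three-level random-intercept model written out. The single covariate x and the normality assumptions are illustrative choices, not something taken from your data. Here i indexes individuals, j years and k countries.

```latex
y_{ijk} = \beta_0 + \beta_1 x_{ijk} + v_k + u_{jk} + e_{ijk},
\qquad
v_k \sim N(0, \sigma_v^2), \quad
u_{jk} \sim N(0, \sigma_u^2), \quad
e_{ijk} \sim N(0, \sigma_e^2)
```

The country effect v_k and the year-within-country effect u_{jk} absorb the clustering, so several individuals can share the same country and year without any problem.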

Best wishes

George
socgdj
Posts: 9
Joined: Fri Oct 07, 2011 8:36 am

Re: repeated time-series

Post by socgdj »

Here is a question akin to the 'repeated time series' question. George Leckie answered that one, but gave no hints on how to proceed.

Modelling time series data on yields

Originally I had imagined I would be able to model our yield gaps data with MLwiN. However, the time series macros included with this programme don’t work satisfactorily and don’t seem to fit our case. My team and I need urgent advice and assistance on how to proceed. Below I describe the data and our approach to modelling. I would be grateful for any comments.

The data
Data from the MODIS sensor can be developed into a Normalized Difference Vegetation Index (NDVI). This is a standard, widely used procedure, and we apply it to a sample of surveyed villages. I have chosen the variable Sintegral and logged it to make the regression coefficient easier to interpret. Our dataset also includes yield data from three rounds of Afrint village surveys: 2002, 2008 and 2013-14 (in the example here only for Kenya and Ghana). For each round there are estimated yields of maize (tonnes per hectare) for three crop years. This gives a time series for the years 2000 – 2002, 2006 – 2008 and 2012-14. I have used an SPSS routine for missing value substitution to fill the ‘holes’ in the series, which gives us an unbroken series covering the entire period from 2000 to 2014. Here is a screenshot showing some of the data:

[url]https://www.dropbox.com/s/mzw25qusj9g0j ... m.png?dl=0[/url]

The model
There are several ways to model time series data. I had imagined a multilevel model of yields regressed on the NDVI, using an autoregressive approach:

y(it) = β0(it) + β1(it)*y(it-1) + β2(it)*x2(it) + e(it)

In this model the yield for a certain village (i) and year (t) is regressed on lagged yield, y(it-1), which is the autoregressive component, and on x2(it), the NDVI for the same village and year. Add a constant (β0), a residual (e) and a link function (identity) and you have a simple regression model. In more complicated versions you can add other independent variables to the equation. It is not so simple, however, since the model has to be adapted to time series data.
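
To make the autoregressive component concrete, here is a minimal Python sketch (standard library only) of how the lagged yield y(it-1) could be built within each village. The village labels, the yield figures and the function name are invented for illustration, and they are not our Afrint data. The point is that the lag must never be taken across villages, and that the first year of each village has no lag.

```python
def add_lagged_yield(records):
    """Return the records sorted by village and year, each with a 'lag_yield' key.

    'lag_yield' is the same village's yield in the previous year,
    or None when that village has no record for the previous year.
    """
    by_key = {(r["village"], r["year"]): r["yield"] for r in records}
    out = []
    for r in sorted(records, key=lambda r: (r["village"], r["year"])):
        lag = by_key.get((r["village"], r["year"] - 1))
        out.append({**r, "lag_yield": lag})
    return out


# Hypothetical records; yields in tonnes per hectare are invented.
records = [
    {"village": "V1", "year": 2000, "yield": 1.2},
    {"village": "V1", "year": 2001, "yield": 1.5},
    {"village": "V1", "year": 2002, "yield": 1.1},
    {"village": "V2", "year": 2000, "yield": 0.8},
    {"village": "V2", "year": 2001, "yield": 0.9},
]

lagged = add_lagged_yield(records)
for row in lagged:
    print(row)
```

Rows whose lag is missing would drop out of the regression, so each village loses its first year.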

My plan was to use MLwiN for the modelling. This would have had the advantage of dividing the residual into components: a first-level residual u(it), giving the variance between years within a village, and a second-level residual v(i), giving the variance between villages. Moreover, two more levels could have given us region- and country-specific variances and the possibility of adding higher-level data, especially country-specific data, to the model. I think the developers of MLwiN have a point when they say this approach is superior to dealing with higher-level factors through dummy variables, as is the standard approach.
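
Written out, the two-level version I have in mind would look roughly like the sketch below. It assumes fixed coefficients, random intercepts only and normally distributed residuals, which are my simplifying assumptions here.

```latex
y_{it} = \beta_0 + \beta_1 y_{i,t-1} + \beta_2 x_{2,it} + v_i + u_{it},
\qquad
v_i \sim N(0, \sigma_v^2), \quad
u_{it} \sim N(0, \sigma_u^2)
```

The share of the residual variance that lies between villages is then \sigma_v^2 / (\sigma_v^2 + \sigma_u^2). Region and country levels would each add one more residual term and one more variance.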

Software problems
MLwiN contains macros and a manual by Yang et al. for dealing with time series data of various sorts, including the type of independent variable we have (a time-dependent scale). As far as I can see, the macros presume growth curve models, which are common in biology and other sciences. In those approaches growth is modelled using polynomial independent variables (y = f(x, x^2, x^3, x^4, x^5, …)), while I would have preferred the autoregressive approach described above, which is more common in the social sciences.

SPSS also contains modules for analysing time series data, but I haven’t found a good way of dealing with pooled time series data such as our dataset. I am not very proficient with Stata, which is used by many colleagues. I know that this programme contains, for example, an xtreg module, in which I think ‘xt’ stands for cross-sectional time series regression. Judging from the first question in this thread, 'xtreg' will not solve our problems, as I had suspected. If so, how do we deal with the multilevel structure of our data?

Both the colleague who started this thread and I would be very grateful for a more detailed answer from our friends in CMM.

Göran Djurfeldt