Non-linear GMM?

tyler...@sciencespo.fr

Jun 6, 2015, 7:35:33 AM

to pystat...@googlegroups.com

Hello list,

I am new to statsmodels (and development) and would like to contribute if I can. I am working on a non-linear GMM paper and can't find the code within the statsmodels module (nor can I seem to load it into python). Is this because the code is not there? Was the GMM class ever finished? If so, could someone direct me to the source in the file tree? If not, does anyone know where this project stands currently and what a good starting point would be to contribute?

Any guidance would be greatly appreciated!

Tyler

josef...@gmail.com

Jun 6, 2015, 9:23:16 AM

to pystatsmodels

On Sat, Jun 6, 2015 at 5:53 AM, <tyler...@sciencespo.fr> wrote:

Hello list,

I am new to statsmodels (and development) and would like to contribute if I can. I am working on a non-linear GMM paper and can't find the code within the statsmodels module (nor can I seem to load it into python). Is this because the code is not there? Was the GMM class ever finished? If so, could someone direct me to the source in the file tree? If not, does anyone know where this project stands currently and what a good starting point would be to contribute?

Thanks Tyler, getting a new contributor in this area would be great.

documentation and some example notebooks are here

http://statsmodels.sourceforge.net/devel/gmm.html

http://nbviewer.ipython.org/gist/josef-pkt/6895915

http://nbviewer.ipython.org/gist/josef-pkt/6890383

data for the notebooks https://gist.github.com/josef-pkt/8128539 https://gist.github.com/josef-pkt/8128535

I'm not sure the notebooks are for the latest version that is in master.

If you are interested in the development details

here is the merged PR https://github.com/statsmodels/statsmodels/pull/1105

with some comments from my last round when I worked on it

and there are a few open issues

for example adding more models that use GMM

https://github.com/statsmodels/statsmodels/issues/1790

https://github.com/statsmodels/statsmodels/issues/1742

the test suite also has more examples used as test cases

The GMM classes are finished in the sense that they work and have a good set of unit tests against Stata. However, a lot remains to be done, since GMM is a huge topic.

roughly, the tasks that are still left

1) write more models on top of the current GMM class(es)

2) review usability and user interface

3) more post estimation tools and hypothesis/diagnostic tests

4) internal refactoring

5) extension to different data structures.

to 1)

I think this is the best way to get started with GMM. If you have a group of models for your work that use GMM, then it should be "reasonably straightforward" to use the current classes.

Essentially, the GMM base classes should go into statsmodels.base and provide a generic estimation class, similar to LikelihoodModel. But we need to write specific models on top of it to make it directly usable.

The IVGMM classes are written as special cases for models that can be put into linear or linear IV moment conditions.
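As a rough illustration of "writing a model on top of the GMM class", here is a minimal sketch. It assumes the generic class is importable as statsmodels.sandbox.regression.gmm.GMM and that a subclass only needs to override momcond; the model (an exponential mean function), the class name ExpMeanGMM and the simulated data are made up for the example and are not part of statsmodels.

```python
import numpy as np
from statsmodels.sandbox.regression.gmm import GMM


class ExpMeanGMM(GMM):
    """Hypothetical model with moments E[z * (y - exp(x @ beta))] = 0."""

    def momcond(self, params):
        # Residual of the nonlinear mean function, one value per observation.
        resid = self.endog - np.exp(self.exog @ params)
        # One column per instrument: shape (nobs, k_moms).
        return self.instrument * resid[:, None]


# Simulated data, for illustration only.
rng = np.random.default_rng(0)
nobs = 2000
x = np.column_stack([np.ones(nobs), rng.normal(size=nobs)])
z = np.column_stack([x, x[:, 1] ** 2])  # three instruments, two parameters
beta_true = np.array([0.5, 0.3])
y = np.exp(x @ beta_true) + rng.normal(scale=0.5, size=nobs)

model = ExpMeanGMM(y, x, z, k_moms=3, k_params=2)
# maxiter=2 requests two weight-matrix iterations (two-step GMM).
res = model.fit(start_params=np.zeros(2), maxiter=2,
                optim_method="bfgs", optim_args={"disp": 0})
print(res.params)
```

The only model-specific code is momcond, which returns one row per observation and one column per moment condition; the weighting matrix, optimization and results come from the base class.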

to 3)

This is one of the areas that I was looking at after I finished my last round but didn't write much code, and I don't have a good overview.

One area that I had tried to figure out was to add support and tests for the many weak instrument case.

Another

to 4)

This is mainly for me to get back to sandwich robust covariance matrices. I had written the sandwiches for GMM before adding them to the other models, and I haven't gone back to reuse the same code and patterns across GMM and the other models.
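For readers unfamiliar with the term, the sandwich covariance of a GMM estimator can be sketched in a few lines of plain NumPy. This is the textbook formula, not the statsmodels implementation; the function name gmm_sandwich_cov and its arguments are hypothetical.

```python
import numpy as np


def gmm_sandwich_cov(grad, weights, s_moms, nobs):
    """Sandwich covariance of a GMM estimator.

    grad    : (k_moms, k_params) Jacobian of the averaged moment conditions
    weights : (k_moms, k_moms) weighting matrix used in the estimation
    s_moms  : (k_moms, k_moms) covariance of the moment contributions
    """
    bread = np.linalg.inv(grad.T @ weights @ grad)
    meat = grad.T @ weights @ s_moms @ weights @ grad
    return bread @ meat @ bread / nobs


# Made-up inputs: 4 moment conditions, 2 parameters.
rng = np.random.default_rng(2)
grad = rng.normal(size=(4, 2))
a = rng.normal(size=(4, 4))
s_moms = a @ a.T + 4 * np.eye(4)  # symmetric positive definite
nobs = 500

cov_general = gmm_sandwich_cov(grad, np.eye(4), s_moms, nobs)
cov_optimal = gmm_sandwich_cov(grad, np.linalg.inv(s_moms), s_moms, nobs)
# With the efficient weights W = S^{-1} the sandwich collapses to (G' S^{-1} G)^{-1} / n.
cov_simple = np.linalg.inv(grad.T @ np.linalg.inv(s_moms) @ grad) / nobs
print(np.allclose(cov_optimal, cov_simple))
```

The same bread-meat-bread pattern appears in the robust covariances of the other models, which is why sharing the code across them is attractive.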

to 5)

Currently GMM has a completely generic structure where users need to provide the moment conditions. The IV versions assume a single set of moment conditions z (y - f(x)), or something that can be transformed into this.
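To make the z (y - f(x)) form concrete, here is a small NumPy sketch for the linear case f(x) = x @ beta, where two-step GMM has a closed form. It is independent of the statsmodels classes; the function name linear_iv_gmm and the simulated data are assumptions made for the example.

```python
import numpy as np


def linear_iv_gmm(y, x, z):
    """Two-step GMM for the moments E[z * (y - x @ beta)] = 0."""
    nobs = y.shape[0]
    zx = z.T @ x / nobs
    zy = z.T @ y / nobs

    def solve(weights):
        # Minimizer of gbar' W gbar with gbar = zy - zx @ beta.
        lhs = zx.T @ weights @ zx
        rhs = zx.T @ weights @ zy
        return np.linalg.solve(lhs, rhs)

    # Step 1: W = (Z'Z / n)^{-1}, which reproduces 2SLS.
    beta_first = solve(np.linalg.inv(z.T @ z / nobs))
    # Step 2: W = inverse covariance of the moment contributions z_i * u_i.
    g = z * (y - x @ beta_first)[:, None]
    beta_second = solve(np.linalg.inv(g.T @ g / nobs))
    return beta_first, beta_second


# Simulated data with one endogenous regressor and two excluded instruments.
rng = np.random.default_rng(1)
nobs = 5000
z = np.column_stack([np.ones(nobs), rng.normal(size=(nobs, 2))])
u = rng.normal(size=nobs)
x_endog = z[:, 1] + 0.5 * z[:, 2] + 0.8 * u + rng.normal(size=nobs)
x = np.column_stack([np.ones(nobs), x_endog])
beta_true = np.array([1.0, 2.0])
y = x @ beta_true + u

beta_2sls, beta_gmm = linear_iv_gmm(y, x, z)
print(beta_2sls, beta_gmm)
```

For a nonlinear f the closed-form solve step is replaced by a numerical optimizer, but the moment matrix and the weighting step keep the same shape.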

The main extension that is missing compared to Stata or other packages is built-in multi-equation and panel data support. For example, system GMM for dynamic panel data is completely missing.

(Related to GMM: We are also missing empirical likelihood GEL or other higher order approaches in a GMM like setting, but GMM-CUE is available, AFAIR)

We have roughly the equivalent of GMM in Stata (or of the R package, which I have never used), with the exception of system GMM.

However, we don't have all the models where Stata uses GMM internally, and there are several user packages in Stata that provide additional functionality for IV or GMM that we don't have yet.

I can provide more specific information. What are the applications in your GMM paper?

Josef

josef...@gmail.com

Jun 6, 2015, 11:21:04 AM

to pystatsmodels

On Sat, Jun 6, 2015 at 9:23 AM, <josef...@gmail.com> wrote:

On Sat, Jun 6, 2015 at 5:53 AM, <tyler...@sciencespo.fr> wrote:
Hello list,

I am new to statsmodels (and development) and would like to contribute if I can. I am working on a non-linear GMM paper and can't find the code within the statsmodels module (nor can I seem to load it into python). Is this because the code is not there? Was the GMM class ever finished? If so, could someone direct me to the source in the file tree? If not, does anyone know where this project stands currently and what a good starting point would be to contribute?

Thanks Tyler, getting a new contributor in this area would be great.

documentation and some example notebooks are here

http://statsmodels.sourceforge.net/devel/gmm.html
http://nbviewer.ipython.org/gist/josef-pkt/6895915
http://nbviewer.ipython.org/gist/josef-pkt/6890383

data for the notebooks https://gist.github.com/josef-pkt/8128539 https://gist.github.com/josef-pkt/8128535
I'm not sure the notebooks are for the latest version that is in master.

browsing around a bit, the notebooks look nice, but the docstrings are incomplete, and I guess contain outdated comments.

The code also has comments that show that there are still several "hackish" solutions where the overall structure doesn't quite fit.

If you are interested in the development details
here is the merged PR https://github.com/statsmodels/statsmodels/pull/1105
with some comments from my last round when I worked on it

and there are a few open issues

for example adding more models that use GMM
https://github.com/statsmodels/statsmodels/issues/1790
https://github.com/statsmodels/statsmodels/issues/1742

the test suite also has more examples used as test cases

The GMM classes are finished in the sense that they work and have a good set of unit tests against Stata.

The code might also be a bit "ugly" in some parts because it hasn't seen another cleanup round yet. In my last round I kept adding methods and options to handle the specific cases that I added and tested.

I expect that there will be more changes as we add more models that use GMM. And, once we have enough applications, we might be able to streamline some of the code and options. (And remove or privatize some extra methods.)

However, there remains a lot to be done, since GMM is a huge topic.

roughly, the tasks that are still left

1) write more models on top of the current GMM class(es)
2) review usability and user interface
3) more post estimation tools and hypothesis/diagnostic tests
4) internal refactoring
5) extension to different data structures.

to 1)
I think this is the best way to get started with GMM. If you have a group of models for your work that use GMM, then it should be "reasonably straightforward" to use the current classes.

Essentially, the GMM base classes should go into statsmodels.base and provide a generic estimation class, similar to LikelihoodModel. But we need to write specific models on top of it to make it directly usable.
The IVGMM classes are written as special cases for models that can be put into linear or linear IV moment conditions.

to 3)
This is one of the areas that I was looking at after I finished my last round but didn't write much code, and I don't have a good overview.
One area that I had tried to figure out was to add support and tests for the many weak instrument case.
Another

to 4)
This is mainly for me to get back to sandwich robust covariance matrices. I had written the sandwiches for GMM before adding them to the other models, and I haven't gone back to reuse the same code and patterns across GMM and the other models.

There are also still some limitations or extra inherited things that don't apply to GMM, because currently the base classes in statsmodels.base are all maximum likelihood oriented and need to be generalized.

tyler...@sciencespo.fr

Jun 6, 2015, 11:36:43 AM

to pystat...@googlegroups.com

Hi Josef,

I'm actually working on something very similar to the Hansen and Singleton example you provide. I'm a little puzzled though by your results. Where did you get the data? The CRRA parameter should be positive, otherwise the preferences are risk loving => preferences are convex and exhibit increasing marginal utility. I've messed with the example for a bit and it seems to be fine, so I guess you generated the data?

I will work on my problem to get acquainted with the module and maybe work on the documentation for the gmm classes as I go. It can be difficult to find source and determine what does what. Along with that, whatever tests/analysis not already included which I write for myself I will push for the module. When I'm done I could add my model to the classes (if it turns out to be any different from what is already there).

With those I would hit on 1-3 of your points. I am interested in dynamic panel data models, so if I do anything related to this I will be sure to propose it for the module.

Sound good?

Tyler

josef...@gmail.com

Jun 6, 2015, 12:07:12 PM

to pystatsmodels

On Sat, Jun 6, 2015 at 11:36 AM, <tyler...@sciencespo.fr> wrote:

Hi Josef,

I'm actually working on something very similar to the Hansen and Singleton example you provide. I'm a little puzzled though by your results. Where did you get the data? The CRRA parameter should be positive, otherwise the preferences are risk loving => preferences are convex and exhibit increasing marginal utility. I've messed with the example for a bit and it seems to be fine, so I guess you generated the data?

I think the data are a standard consumption dataset. The example is based mostly on some online explanation or lecture notes and on the Stata manual, e.g. http://www.stata.com/manuals13/rgmm.pdf p. 34, where gamma is also estimated as negative.

Maybe it shouldn't be called CRRA if it is defined with the wrong sign.

I haven't done macroeconomics in a very long time, and didn't try to "understand" the Hansen-Singleton model again. It was just an example for me.

corrections are welcome

I will work on my problem to get acquainted with the module and maybe work on the documentation for the gmm classes as I go. It can be difficult to find source and determine what does what. Along with that, whatever tests/analysis not already included which I write for myself I will push for the module. When I'm done I could add my model to the classes (if it turns out to be any different from what is already there).

With those I would hit on 1-3 of your points. I am interested in dynamic panel data models, so if I do anything related to this I will be sure to propose it for the module.

Sound good?

Sounds good.

What would also be very useful from an application viewpoint are additional functions or methods to interpret the results. You might have a better idea based on the subject than I do.

We had a GSOC proposal last year for System-GMM (Arellano, Bond and related) but, unfortunately, lost the student to an internship at an international organization.

Implementing the full System GMM is, I think, quite a large project, but there might be versions that don't use the full set of moment conditions that are more "digestible".
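One candidate for such a reduced version is the Anderson-Hsiao estimator, which first-differences the model and uses a single lagged level as the instrument instead of the full Arellano-Bond instrument set. Picking this estimator is an assumption about what "more digestible" could mean here, and the NumPy sketch below uses simulated data and hypothetical function names.

```python
import numpy as np


def simulate_panel(n_units, n_periods, rho, rng, burn_in=50):
    """y[i, t] = rho * y[i, t-1] + alpha[i] + e[i, t], with unit effects alpha."""
    alpha = 0.5 * rng.normal(size=n_units)
    y = alpha / (1.0 - rho)  # start at the unit-specific long-run mean
    out = np.empty((n_units, n_periods))
    for t in range(burn_in + n_periods):
        y = rho * y + alpha + rng.normal(size=n_units)
        if t >= burn_in:
            out[:, t - burn_in] = y
    return out


def anderson_hsiao(y):
    """IV estimate of rho from first differences, instrumenting with y[t-2]."""
    dy = y[:, 2:] - y[:, 1:-1]       # delta y_t
    dy_lag = y[:, 1:-1] - y[:, :-2]  # delta y_{t-1}, correlated with delta e_t
    z = y[:, :-2]                    # y_{t-2}, uncorrelated with delta e_t
    return np.sum(z * dy) / np.sum(z * dy_lag)


def pooled_ols(y):
    """Regression of y_t on y_{t-1} in levels, ignoring the unit effects."""
    return np.sum(y[:, 1:] * y[:, :-1]) / np.sum(y[:, :-1] ** 2)


rng = np.random.default_rng(5)
rho_true = 0.5
y = simulate_panel(n_units=5000, n_periods=8, rho=rho_true, rng=rng)
rho_ah = anderson_hsiao(y)
rho_ols = pooled_ols(y)
print(rho_ah, rho_ols)
```

Differencing removes the unit effect alpha, which is what biases the pooled regression in levels upward; the full System GMM adds many more lagged levels and differences as instruments on top of this single moment condition.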

Josef

tyler...@sciencespo.fr

Jun 6, 2015, 1:37:25 PM

to pystat...@googlegroups.com

On Saturday, June 6, 2015 at 6:07:12 PM UTC+2, josefpktd wrote:

On Sat, Jun 6, 2015 at 11:36 AM, <tyler...@sciencespo.fr> wrote:
Hi Josef,

I'm actually working on something very similar to the Hansen and Singleton example you provide. I'm a little puzzled though by your results. Where did you get the data? The CRRA parameter should be positive, otherwise the preferences are risk loving => preferences are convex and exhibit increasing marginal utility. I've messed with the example for a bit and it seems to be fine, so I guess you generated the data?

I think the data are a standard dataset for the consumption. The example is based mostly on some online explanation or lecture notes and on the stata manual, e.g. http://www.stata.com/manuals13/rgmm.pdf p. 34 where gamma is also estimated as negative.
Maybe it shouldn't be called CRRA if it is defined with the wrong sign.

I haven't done macroeconomics in a very long time, and didn't try to "understand" the Hansen-Singleton model again. It was just an example for me.

corrections are welcome

I just looked through that example and indeed their estimate is also negative, but they say it "implies risk-loving behavior and therefore a poorly specified model." So, your example is probably working fine, in the sense that you reproduce the same.
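To spell out the sign convention: with CRRA utility u(c) = c**(1 - gamma) / (1 - gamma), marginal utility is c**(-gamma), so gamma > 0 means risk aversion and the Euler equation is E[beta * (c_{t+1}/c_t)**(-gamma) * R_{t+1} - 1] = 0. The sketch below simulates data that satisfy this equation by construction (made-up numbers, not the dataset from the notebook or the Stata manual) and checks the moment at the true gamma and at the wrong sign. The function name euler_residual is hypothetical.

```python
import numpy as np


def euler_residual(params, cons_growth, gross_return):
    """Residual beta * (c_{t+1}/c_t)**(-gamma) * R_{t+1} - 1 under CRRA utility."""
    beta, gamma = params
    return beta * cons_growth ** (-gamma) * gross_return - 1.0


rng = np.random.default_rng(3)
nobs = 5000
beta_true, gamma_true = 0.97, 2.0

# Gross consumption growth c_{t+1}/c_t, lognormal around 2 percent.
cons_growth = np.exp(rng.normal(0.02, 0.02, size=nobs))
# Multiplicative return shock with mean one, independent of consumption growth.
shock = np.exp(rng.normal(-0.5 * 0.05 ** 2, 0.05, size=nobs))
# Returns constructed so that the Euler equation holds at the true parameters.
gross_return = cons_growth ** gamma_true * shock / beta_true

at_truth = euler_residual((beta_true, gamma_true), cons_growth, gross_return).mean()
wrong_sign = euler_residual((beta_true, -gamma_true), cons_growth, gross_return).mean()
print(at_truth, wrong_sign)
```

In an estimation these residuals would be multiplied by instruments such as lagged returns and lagged consumption growth; a negative estimate of gamma on real data then points to a problem with the model or the data rather than with the estimator.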

I will work on my problem to get acquainted with the module and maybe work on the documentation for the gmm classes as I go. It can be difficult to find source and determine what does what. Along with that, whatever tests/analysis not already included which I write for myself I will push for the module. When I'm done I could add my model to the classes (if it turns out to be any different from what is already there).

With those I would hit on 1-3 of your points. I am interested in dynamic panel data models, so if I do anything related to this I will be sure to propose it for the module.

Sound good?

Sounds good.
What would also be very useful from an application viewpoint are additional functions or methods to interpret the results. You might have a better idea based on the subject than I do.

We had a GSOC proposal last year for System-GMM (Arellano, Bond and related) but, unfortunately, lost the student to an internship at an international organization.
Implementing the full System GMM is, I think, quite a large project, but there might be versions that don't use the full set of moment conditions that are more "digestible".

I will definitely work on output and interpretation (e.g., tables, tests, etc.), but for now I won't commit to programming anything too time-consuming. Although there's always next summer.

Thanks for your help and I'll let you know if I have any further questions.

Tyler

josef...@gmail.com

Jun 6, 2015, 3:33:38 PM

to pystatsmodels

I'd like to add one wishlist item here.

While we have still many areas in statistics and econometrics where we need to catch up, there are also newer methods that are more exciting in terms of being hotter and more fashionable and closer to current research.

One area of GMM for which I'm interested in getting full support is regularized estimation of the weight matrix, or penalization, when we have a large number of moment conditions or instruments. A target would be to get better small sample performance when the number of observations is relatively small but the number of moment conditions is relatively large.
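One simple form of such regularization is to shrink the estimated moment covariance toward a scaled identity before inverting it. This is only a sketch of the idea under that assumption, not an existing statsmodels option; the function name shrunk_weight_matrix and the shrinkage value are hypothetical.

```python
import numpy as np


def shrunk_weight_matrix(moms, shrinkage):
    """Inverse of a moment covariance shrunk toward a scaled identity.

    moms      : (nobs, k_moms) moment contributions at a preliminary estimate
    shrinkage : value in [0, 1]; 0 gives the usual efficient weights
    """
    nobs, k_moms = moms.shape
    centered = moms - moms.mean(axis=0)
    s = centered.T @ centered / nobs
    target = np.trace(s) / k_moms * np.eye(k_moms)
    s_reg = (1.0 - shrinkage) * s + shrinkage * target
    return s, s_reg, np.linalg.inv(s_reg)


# Few observations relative to the number of moment conditions.
rng = np.random.default_rng(4)
moms = rng.normal(size=(60, 40))
s, s_reg, weights = shrunk_weight_matrix(moms, shrinkage=0.2)
print(np.linalg.cond(s), np.linalg.cond(s_reg))
```

Shrinking pulls the smallest eigenvalues of the covariance away from zero, so the inverse is less dominated by noise when the number of moment conditions is close to the number of observations. How to choose the amount of shrinkage is the open part.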

We are building up examples and code across models, but it should eventually be "relatively easy" to reuse them for new models or areas.

Josef
