GSoC proposal: Exporting multi-indexed DataFrames to Excel - other projects from the ecosystem?

Dr. Leo

Feb 24, 2016, 1:03:40 AM
to pyd...@googlegroups.com
Hi,

some time ago I saw that this is not implemented yet. Filling this gap
should appeal to students who want to get familiar with MS Office
internals. I could not mentor this.

If there is a chance to accept topics more broadly within the pandas
ecosystem, I would happily mentor one or two relating to pandaSDMX, e.g.

https://github.com/dr-leo/pandaSDMX/issues/7

Leo

Joris Van den Bossche

Feb 27, 2016, 4:40:20 PM
to PyData
2016-02-24 7:03 GMT+01:00 'Dr. Leo' via PyData <pyd...@googlegroups.com>:
> Hi,
>
> some time ago I saw that this is not implemented yet. Filling this gap
> should appeal to students who want to get familiar with MS Office
> internals. I could not mentor this.

Is this still the case? There have been some improvements to writing multi-indexes to Excel in 0.17 (see e.g. http://pandas-docs.github.io/pandas-docs-travis/whatsnew.html#changes-to-excel-with-multiindex).
 

> If there is a chance to accept topics more broadly within the pandas
> ecosystem, I would happily mentor one or two relating to pandaSDMX, e.g.
>
> https://github.com/dr-leo/pandaSDMX/issues/7

I am not sure what the rules are (Jeff, do you know?), but in principle I would certainly be OK with that, provided you think such a topic would be a worthy GSoC proposal, i.e. multiple months of work.

Joris

 



Dr. Leo

Feb 28, 2016, 7:18:17 PM
to pyd...@googlegroups.com
Joris,

you are right: since v0.17, multi-indexed DataFrames can be exported to Excel. I am thrilled to see that. Sorry for the confusion. I've changed the subject accordingly.

I think the "worthiness" of a GSoC project will ultimately be a function of (1) the number of interested students and corresponding mentors, and (2) prioritisation.

In terms of complexity, if the candidate finished the sdmxjson reader (supported by at least the IMF, OECD and ECB) ahead of schedule, I would suggest continuing with some of the other issues, such as support for structure-specific datasets (more complex than the former) and various extensions to the information model, such as refinements for datatypes, XML validation etc. I think even a super-smart student would be busy for a couple of months. It's all pure Python, but the SDMX data model and the pandaSDMX code have a learning curve. You may perceive it as a drawback that this proposal involves no direct exposure to pandas code, as the class that writes SDMX data (and shortly metadata) to pandas is reasonably complete.

In terms of prioritisation, I admit that, when in doubt, implementing a general-purpose pandas feature would generally be preferable to extending a non-core data acquisition tool.
 
Leo