Medcouple again


Jordi Gutiérrez Hermoso

Apr 6, 2015, 9:23:26 AM
to pystatsmodels
(Please CC me manually, as Googlegroups refuses to send emails to my
jor...@octave.org address.)

Hello.

As I promised long ago, I finally got around to writing a medcouple
Wikipedia article:

https://en.wikipedia.org/wiki/Medcouple

Of particular note is the algorithm section, where I tried to write a
language-neutral description of the fast algorithm (which nevertheless
looks kind of Pythonish). I am hoping that you can use this
description to improve the implementation of the medcouple in
statsmodels. I tried to elucidate the fast algorithm as much as I
could. Please let me know how I did and if you can turn my pseudocode
into Python code.

Thank you,
- Jordi G. H.


Daniel Baker

Apr 6, 2015, 8:53:40 PM
to pystat...@googlegroups.com
Have you thought about writing a cython wrapper for your C++ code? That would let you keep the high performance without having to re-code it. I'd use it.

Cheers,

Daniel

Jordi Gutiérrez Hermoso

Apr 6, 2015, 9:18:31 PM
to pystatsmodels
(Please CC me manually, as Googlegroups refuses to send emails to my
jor...@octave.org address.)

On Mon, 2015-04-06 at 19:00 -0600, Daniel Baker wrote:
> Have you thought about writing a cython wrapper for your C++ code?
> That would let you keep the high performance without having to
> re-code it. I'd use it.

My code is GPL'ed. You seem to have found the C++ version. Here's a
Python version:

http://inversethought.com/hg/medcouple/file/default/medcouple.py

You are quite free to use it, of course, that's why it's GPL'ed.
However, if you plan to implement this for statsmodels without
copyleft, don't read my GPL'ed version. The statsmodels developers
consider copyleft to be an unacceptable condition, and reading my code
to implement your own version incurs high risk that your version will
be derivative of mine.

However, algorithms can't be copyrighted, so the high-level pseudocode
description I wrote can be adapted into Python code for statsmodels.

In essence, I am trying to help you clean-room reverse engineer the
fast medcouple algorithm. I wrote the design spec. That's the
Wikipedia article. It is up to you to write the non-copyleft
implementation now.

Or you could just accept the GPL. It certainly hasn't hindered the
success of git, Linux, R, or Octave.

Or you could keep the slow medcouple implementation. It's actually not
that bad for sample sizes up to a couple thousand.
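
For reference, here is a minimal sketch of what a slow, quadratic medcouple
can look like when written straight from the textbook definition (the median
of the kernel h over all pairs x_i >= m >= x_j). It is not taken from the
GPL'ed code or from statsmodels; the name naive_medcouple is hypothetical,
and the handling of values tied with the median is an assumption based on
the usual sign convention. It builds all the pairs in memory, which is why
it only makes sense for samples of a few thousand points.

```python
from statistics import median


def naive_medcouple(data):
    """Naive O(n^2) medcouple computed directly from the definition.

    Hypothetical helper for illustration only; not the statsmodels API.
    """
    xs = sorted(float(v) for v in data)
    if not xs:
        raise ValueError("medcouple of an empty sample is undefined")
    m = median(xs)

    upper = [x for x in xs if x > m]
    lower = [x for x in xs if x < m]
    ties = len(xs) - len(upper) - len(lower)  # values equal to the median

    # Values tied with the median belong to both halves.
    plus = upper + [m] * ties
    minus = lower + [m] * ties

    h = []
    for xi in plus:
        for xj in minus:
            if xi != xj:
                # Kernel: signed difference of the distances to the
                # median, scaled by the spread of the pair.
                h.append(((xi - m) - (m - xj)) / (xi - xj))

    # Assumed tie rule: pairs where both values equal the median get
    # -1, 0 or +1 depending on their position, so the block of ties
    # contributes a balanced set of signs.
    for i in range(ties):
        for j in range(ties):
            s = i + j - (ties - 1)
            h.append((s > 0) - (s < 0))

    return median(h)
```

A symmetric sample such as [1, 2, 3, 4, 5] gives 0, and a sample with a long
right tail such as [1, 2, 4, 8, 16, 32, 64] gives a positive value.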

HTH,
- Jordi G. H.




