MCA (Multiple Correspondence Analysis) feature request


William Hack

Mar 25, 2018, 10:09:45 PM
to gonum-dev
I see that PCA is in the library. Would it be possible to add MCA to the library?

It allows analysis of categorical data sets.

There are a few non-golang implementations out there, mostly in R or Python (the prince and mca packages on pip).


Dan Kortschak

Mar 26, 2018, 2:11:46 AM
to William Hack, gonum-dev
According to the Wikipedia article, that looks trivial to implement
with a call to stat.PC.PrincipalComponents; the indicator matrix (the
complete disjunctive table, CDT) needs to be transformed by
x_{ik} = y_{ik}/p_k - 1, and x is then given to a stat.PC.

I'd think that this probably warrants an example rather than an
exported function, but if it's commonly used, it's not hard to add.

Dan
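
As a rough illustration of the transformation x_{ik} = y_{ik}/p_k - 1 mentioned above, here is a minimal sketch using only the Go standard library. The function name transformCDT and the toy data are assumptions made for this example; they are not part of gonum. The sketch takes p_k to be the fraction of observations that fall in category k.

```go
package main

import (
	"errors"
	"fmt"
)

// transformCDT is a hypothetical helper, not a gonum function. Given an
// indicator matrix y (one row per observation, one 0/1 column per category),
// it returns x with x[i][k] = y[i][k]/p[k] - 1, where p[k] is the fraction
// of observations that fall in category k.
func transformCDT(y [][]float64) ([][]float64, error) {
	n := len(y)
	if n == 0 {
		return nil, errors.New("transformCDT: empty table")
	}
	m := len(y[0])

	// Count how often each category occurs, then convert to proportions.
	p := make([]float64, m)
	for _, row := range y {
		if len(row) != m {
			return nil, errors.New("transformCDT: ragged table")
		}
		for k, v := range row {
			p[k] += v
		}
	}
	for k := range p {
		if p[k] == 0 {
			// A category that never occurs would cause a division by zero.
			return nil, fmt.Errorf("transformCDT: category %d never occurs", k)
		}
		p[k] /= float64(n)
	}

	x := make([][]float64, n)
	for i, row := range y {
		x[i] = make([]float64, m)
		for k, v := range row {
			x[i][k] = v/p[k] - 1
		}
	}
	return x, nil
}

func main() {
	// Toy data. Columns: colour=red, colour=green, colour=blue,
	// answer=yes, answer=no.
	y := [][]float64{
		{1, 0, 0, 1, 0},
		{1, 0, 0, 0, 1},
		{0, 1, 0, 1, 0},
		{0, 0, 1, 0, 1},
	}
	x, err := transformCDT(y)
	if err != nil {
		fmt.Println(err)
		return
	}
	for _, row := range x {
		fmt.Println(row)
	}
}
```

To hand the result to gonum, the rows could be flattened into a matrix with mat.NewDense and passed to PrincipalComponents as described in the message. A side effect of the transform is that every column of x has mean zero.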

William Hack

Mar 27, 2018, 6:11:33 PM
to gonum-dev
It sounds promising! An example would be great. I can't speak for the whole population, but for this sample of 1, it is being used a lot. I'm currently using Python (in particular the prince package), but I'm trying to migrate the projects over to golang because it's way better for concurrent operations, which would dramatically speed up the processing of the projects.

So in short: yes, please add the exported func :-). I think most people know how to use PCA and MCA, but only if a library already supports them. If gonum had the func exported, I think it would be very beneficial.

As far as metrics go when it comes to analyzing categorical data, MCA (or CA, if there are only two variables) is really the main way forward. That being said, for completeness there is also FAMD, which works like PCA and MCA combined, and there's also DCA for categorical data, but having MCA as a library call would be just amazing, and sufficient. It would open the door for a lot of people to leverage it quickly and easily. (From my standpoint it's a lot easier to argue "let's make the switch to golang" when there's a library that supports the functions others in the decision group are concerned about. ;-) )

Dan Kortschak

Mar 27, 2018, 6:30:50 PM
to William Hack, gonum-dev
I think the most likely case would be adding a function that conditions
a CDT for use as an input to PCA if we add an exported function. This
is the minimal API change that gets the functionality in the package.

William Hack

Mar 27, 2018, 9:38:13 PM
to gonum-dev
That sounds reasonable.  One positive, it should allow for using different PCA algorithms depending on the data set size (maybe). 

Would an example how to use it still be possible to add?

May I ask what is the reason for "minimal API change[s]"? (Just curious.)  
Is it due to the maintenance tail for new changes... or is it to limit changes due to a release schedule... or keep the API as stable as possible?

Dan Kortschak

Mar 27, 2018, 10:28:20 PM
to William Hack, gonum-dev
On Tue, 2018-03-27 at 18:38 -0700, William Hack wrote:
> That sounds reasonable.  One positive, it should allow for using different
> PCA algorithms depending on the data set size (maybe).
>
> Would an example how to use it still be possible to add?

Sure.

> May I ask what is the reason for "minimal API change[s]"? (Just curious.)
> Is it due to the maintenance tail for new changes... or is it to limit
> changes due to a release schedule... or keep the API as stable as possible?

I guess it should be s/change/addition/. To completely replicate the
PCA API for MCA would give us a new type and several new functions that
just call the relevant PCA methods. We prefer to make things
composable, so since you can do for example

```
var pc stat.PC
ok := pc.PrincipalComponents(stat.TransformCDT(x), nil)
```

to get the behaviour you want, that feels nicer.

I'd like to get input from others about where this "stat.TransformCDT"
(by whatever name) would actually go, though.

Dan Kortschak

Mar 28, 2018, 1:18:50 AM
to William Hack, gonum-dev
Also, we can add Classical Torgerson’s MDS using the same approach.
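
For context, the conditioning step in classical (Torgerson) MDS is usually described as double centering: the squared distance matrix D2 is turned into B = -1/2 * J * D2 * J, with J = I - (1/n)*ones, and the coordinates then come from the leading eigenvectors of B scaled by the square roots of their eigenvalues. The sketch below covers only the double-centering step, using the standard library. The name doubleCenter and the toy distances are assumptions for illustration, and reading "the same approach" as "condition the input, then reuse an existing decomposition" is an interpretation rather than something stated in the thread.

```go
package main

import "fmt"

// doubleCenter is a hypothetical helper, not a gonum function. Given a
// square, symmetric matrix of pairwise distances d, it returns
// B = -1/2 * J * D2 * J, where D2 holds the squared distances and
// J = I - (1/n)*ones is the centering matrix. The input is assumed to be
// square; no validation is done in this sketch.
func doubleCenter(d [][]float64) [][]float64 {
	n := len(d)
	rowMean := make([]float64, n)
	colMean := make([]float64, n)
	var total float64

	// Accumulate row, column and grand sums of the squared distances.
	for i := range d {
		for j, v := range d[i] {
			sq := v * v
			rowMean[i] += sq
			colMean[j] += sq
			total += sq
		}
	}
	fn := float64(n)
	for i := range rowMean {
		rowMean[i] /= fn
		colMean[i] /= fn
	}
	total /= fn * fn

	// Subtract row and column means, add back the grand mean, scale by -1/2.
	b := make([][]float64, n)
	for i := range d {
		b[i] = make([]float64, n)
		for j, v := range d[i] {
			b[i][j] = -0.5 * (v*v - rowMean[i] - colMean[j] + total)
		}
	}
	return b
}

func main() {
	// Distances between four points on a line at 0, 2, 4 and 6.
	d := [][]float64{
		{0, 2, 4, 6},
		{2, 0, 2, 4},
		{4, 2, 0, 2},
		{6, 4, 2, 0},
	}
	for _, row := range doubleCenter(d) {
		fmt.Println(row)
	}
}
```

For these collinear points the result is the outer product of the centred coordinates (-3, -1, 1, 3) with themselves, so a symmetric eigendecomposition of B (for example with gonum's mat.EigenSym) would recover a one-dimensional configuration.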


William Hack

Mar 28, 2018, 6:07:14 PM
to gonum-dev
I definitely wouldn't say no.

Dan Kortschak

Mar 28, 2018, 10:50:28 PM
to William Hack, gonum-dev
I've sent a WIP PR[1] that adds these, but I don't know what your
handle is on GitHub. Please chime in there. In particular, if you have
some toy examples for, at least, MCA that I can use for the example,
and some test cases, that would be great.

[1]https://github.com/gonum/gonum/pull/449

Dan Kortschak

Apr 26, 2018, 1:07:29 AM
to William Hack, gonum-dev
Are you able to chime in at https://github.com/gonum/gonum/pull/459? I
don't have anything to add there unless you can provide some input.