maximizing feature diversity in moses model ensembles


Michael Duncan

Dec 2, 2018, 3:05:32 PM
to opencog
Can any MOSES gurus comment on using the diversity scoring options? For some bioinformatics work, we are more interested in evolving a minimal ensemble of models that maximally represents the patterns distinguishing two sample sets than in just maximizing out-of-sample prediction accuracy. In other words, we want to maximize the number of unique features in a model ensemble of a given size and accuracy. More generally, are there procedures for choosing optimal ensemble models beyond combining the top n models from different cross-validation runs?

Linas Vepstas

Dec 2, 2018, 10:32:38 PM
to opencog

Wow. At one point, I wanted to add something like that as a feature: Namely, when keeping a deme (ensemble) of some finite size, I wanted it to be as diverse as possible (while still having acceptable fitness). I was excited, ready to implement this, but could not quite figure out a good way of measuring how "different" two models were.  What constitutes diversity, and how do you measure it? Without that, I couldn't move forward.
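One way to make "as diverse as possible while still having acceptable fitness" concrete, outside of MOSES itself, is a greedy pass over candidates that already clear an accuracy cutoff, picking at each step the model that adds the most unseen features. The sketch below is only an illustration under stated assumptions: the candidate names, accuracies, feature sets, and the `min_accuracy` cutoff are all hypothetical, "diversity" is reduced to counting new features, and greedy selection is a heuristic rather than a guaranteed optimum.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

# Hypothetical candidates: name -> (accuracy, features used by the model).
Candidates = Dict[str, Tuple[float, FrozenSet[str]]]


def select_diverse_ensemble(
    candidates: Candidates, size: int, min_accuracy: float
) -> List[str]:
    """Greedily pick up to `size` models, each adding the most unseen features."""
    # Keep only models with acceptable fitness (assumed to be plain accuracy).
    eligible = {
        name: (acc, feats)
        for name, (acc, feats) in candidates.items()
        if acc >= min_accuracy
    }
    chosen: List[str] = []
    covered: Set[str] = set()
    while eligible and len(chosen) < size:
        # Rank by number of new features; break ties by accuracy, then by
        # name so the result is deterministic.
        best = max(
            eligible,
            key=lambda n: (len(eligible[n][1] - covered), eligible[n][0], n),
        )
        chosen.append(best)
        covered |= eligible.pop(best)[1]
    return chosen


candidates: Candidates = {
    "m1": (0.90, frozenset({"geneA", "geneB"})),
    "m2": (0.89, frozenset({"geneA", "geneB", "geneC"})),
    "m3": (0.88, frozenset({"geneD", "geneE"})),
    "m4": (0.70, frozenset({"geneF", "geneG", "geneH"})),  # below the cutoff
}
ensemble = select_diverse_ensemble(candidates, size=2, min_accuracy=0.85)
print(ensemble)
```

With this toy input the most accurate model, `m1`, is passed over because `m2` covers a superset of its features, which is the trade-off the original question describes.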

--linas

--
cassette tapes - analog TV - film cameras - you

Ben Goertzel

Dec 2, 2018, 11:54:36 PM
to opencog
Nil implemented a diversity measure in the Aidyia period, and one can tune the weight of diversity vs. accuracy in the fitness function. Not sure if this does what you need...

Linas Vepstas

Dec 3, 2018, 12:22:55 AM
to opencog

Ah yes, that's right. There are more than a few ways to measure diversity; being who I am, I recall thinking that there should be other, better ways of measuring diversity, but never found the time to try others.

The correct architecture would be to make the diversity measure a plug-in module, so you could code up and try different measures of diversity.
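As a rough illustration of the plug-in idea, the sketch below treats a diversity measure as any function taking two models and returning a distance, so different measures can be swapped in without touching the code that uses them. This is not how MOSES is structured internally; the `Model` type (a model reduced to its feature set), the `DiversityMeasure` alias, and both function names are assumptions made for the example, and Jaccard distance is just one possible plug-in.

```python
from itertools import combinations
from typing import Callable, FrozenSet, Sequence

# Assumption: a "model" is reduced to the set of features it uses.
Model = FrozenSet[str]
# Any function with this shape can be plugged in as a diversity measure.
DiversityMeasure = Callable[[Model, Model], float]


def jaccard_distance(a: Model, b: Model) -> float:
    """1 - |a & b| / |a | b|: 0.0 for identical sets, 1.0 for disjoint ones."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def mean_pairwise_diversity(
    models: Sequence[Model], measure: DiversityMeasure
) -> float:
    """Average the plugged-in measure over all unordered pairs of models."""
    pairs = list(combinations(models, 2))
    if not pairs:
        return 0.0
    return sum(measure(a, b) for a, b in pairs) / len(pairs)


models = [frozenset({"f1", "f2"}), frozenset({"f2", "f3"}), frozenset({"f4"})]
score = mean_pairwise_diversity(models, jaccard_distance)
print(f"mean pairwise Jaccard distance: {score:.3f}")
```

Trying a different notion of diversity then only means passing a different function as `measure`.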

--linas
 

Nil Geisweiller

Dec 3, 2018, 4:56:36 AM
to ope...@googlegroups.com
Hi,

Yes, moses can take into account model diversity when sorting the top n
models of the metapopulation.

I recall that it works very well, but of course that depends on how
moses is being used.

My advice would be to start by setting --diversity-pressure to 0.1, then
double it until you get past 10.

See if you obtain a more diverse population.
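The doubling schedule above can be written out as a small helper. This is only a convenience sketch: the function name and the `<your usual options>` placeholder are made up for illustration, passing the value as a separate argument after the flag is an assumption about the command-line syntax, and only the `--diversity-pressure` flag itself comes from this thread. The list stops at the first value that exceeds the limit, which is one reading of "until you get past 10".

```python
from typing import List


def pressure_schedule(start: float = 0.1, limit: float = 10.0) -> List[float]:
    """Double `start` repeatedly, stopping at the first value above `limit`."""
    values = [start]
    while values[-1] <= limit:
        values.append(values[-1] * 2)
    return values


schedule = pressure_schedule()
for pressure in schedule:
    # Placeholder command line: fill in your own input and output options.
    print(f"moses --diversity-pressure {pressure:g} <your usual options>")
```

That gives eight runs to compare, from 0.1 up to 12.8.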

I think the tool

eval-diversity

may help you to measure the diversity of each run (if moses isn't enough).

If you're not happy with the result, it might mean that you aren't letting
moses evolve enough demes. Remember that diversity works at the
metapopulation level, so it affects the choice of the next deme
exemplar; you really need to let moses explore multiple demes to
build up diversity.

Also, are you looking for feature set diversity? Or are you looking for
candidate diversity (expressing different output behaviors, regardless
of whether they use different features)?
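The difference between the two kinds of diversity can be seen on a toy example. The sketch below is purely illustrative and not taken from MOSES: the data, the two hand-written models, and the choice of disagreement rate as a behavioral distance are all assumptions. It shows two models that share no features yet behave identically on the samples.

```python
from typing import Dict, List

Row = Dict[str, bool]


def disagreement_rate(out_a: List[bool], out_b: List[bool]) -> float:
    """Fraction of samples on which two models give different outputs."""
    if len(out_a) != len(out_b):
        raise ValueError("output vectors must have the same length")
    if not out_a:
        return 0.0
    return sum(x != y for x, y in zip(out_a, out_b)) / len(out_a)


# Toy data in which x3 happens to equal (x1 and x2) on every sample.
rows: List[Row] = [
    {"x1": True, "x2": True, "x3": True},
    {"x1": True, "x2": False, "x3": False},
    {"x1": False, "x2": True, "x3": False},
    {"x1": False, "x2": False, "x3": False},
]


def model_a(row: Row) -> bool:
    return row["x1"] and row["x2"]  # uses features {x1, x2}


def model_b(row: Row) -> bool:
    return row["x3"]  # uses feature {x3}


features_a, features_b = {"x1", "x2"}, {"x3"}
shared_features = features_a & features_b
behavior_dist = disagreement_rate(
    [model_a(r) for r in rows], [model_b(r) for r in rows]
)
print(f"shared features: {sorted(shared_features)}")
print(f"behavioral disagreement: {behavior_dist}")
```

By feature set these two models are maximally diverse; by output behavior they are not diverse at all, so which notion you optimize matters.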

If you're mostly interested in feature set diversity, then I would
recommend enabling diversity at the feature selection stage, see

--fs-diversity-pressure

Of course, it's only going to work if you're using feature selection to
begin with (which I would recommend).

The other option, which doesn't involve using any diversity flag, is to
tune feature selection to be highly sensitive to random
fluctuations. I forgot how to do that, but I could dig it up if you want
me to. The advantage is that it's going to yield diversity at a really low
computational cost.

Anyway, hope it helps. Feel free to send me your moses commands and
data so I can provide you with more guidance.

Nil