measuring diversity

Sophia

unread,

Dec 6, 2019, 7:46:24 PM12/6/19

to deap-users

Hello,

I used both NSGA-II and SPEA2 to solve a multiobjective optimisation problem with 2 objectives. I seem to get better solutions using SPEA2. I would like to measure the diversity of the solutions in the Pareto front for both algorithms. I didn't record the final population, but recorded the hypervolume and pareto front at each generation. I don't know the true Pareto front, I used a slightly worse point as a reference point to calculate the hypervolume. Can I use the deap.benchmarks.tools.diversity(first_front, first, last) to measure the diversity of the pareto front at the final generation; what is the first and last? Here is one example of my Pareto front, is the first item 173.384071,100.0 and the last
is 2.611496,7.00?

173.384071,100.0
45.082696,80.0
34.562327,79.0
24.557862,68.0
14.550288,50.0
13.726517,33.0
13.479549,24.0
13.439238,23.0
13.430936,22.0
13.409498,20.0
9.029356,19.0
7.88063,18.0
2.611496,7.00

Thanks for your help.

Derek Tishler

unread,

Dec 6, 2019, 9:08:40 PM12/6/19

to deap-users

Sophia,

One idea is you can create a histogram, with probability curves, to help illustrate diversity of fitness. Works well on a bigger pop. I did this by taking the fitness values of each individual listed in the pop, hof/pf and using matplotlib.

Here is a link to some example code using your values:

https://gist.github.com/DMTSource/521a84269087d9e21ad516193c0e566f

Example figure:

Derek Tishler

unread,

Dec 7, 2019, 12:34:34 AM12/7/19

to deap-users

Regarding specific use of:

from deap.benchmarks.tools import diversity, convergence

You can check out the commented section of nsga2.py example:
https://github.com/DEAP/deap/blob/f6accf730555c5bbc1c50ac310250ad707353080/examples/ga/nsga2.py#L134

To understand how they work you can check out the source:

convergence
https://github.com/DEAP/deap/blob/f6accf730555c5bbc1c50ac310250ad707353080/deap/benchmarks/tools.py#L271

diversity:
https://github.com/DEAP/deap/blob/f6accf730555c5bbc1c50ac310250ad707353080/deap/benchmarks/tools.py#L249

Sophia

unread,

Dec 7, 2019, 5:29:42 AM12/7/19

to deap-users

Many thanks for your help Derek. NSGA2 uses the optimal_front:

# with open("pareto_front/zdt1_front.json") as optimal_front_data:

# optimal_front = json.load(optimal_front_data)

# Use 500 of the 1000 points in the json file

# optimal_front = sorted(optimal_front[i] for i in range(0, len(optimal_front), 2))

# print("Convergence: ", convergence(pop, optimal_front))

# print("Diversity: ", diversity(pop, optimal_front[0], optimal_front[-1]))

Is this the true pareto front, I do not know the true Pareto front of my problem.

Derek Tishler

unread,

Dec 7, 2019, 8:36:42 AM12/7/19

to deap-users

Are you using, instead of a hall of fame, the ParetoFront:

hof = tools.ParetoFront()

ParetoFront docs:

https://deap.readthedocs.io/en/master/api/tools.html#deap.tools.ParetoFront

Example script:

https://github.com/DEAP/deap/blob/master/examples/ga/knapsack.py

Derek Tishler

unread,

Dec 7, 2019, 8:56:41 AM12/7/19

to deap-users

Sorry missed what you meant, without the optimal front a benchmarking version of diversity may not be what you are looking for. If you could clarify what you need for a diversity measure, for example I found one here on page 188, right column for the Diversity Metric Del which may be able to use just the paretofront you have:
https://www.iitk.ac.in/kangal/Deb_NSGA-II.pdf

Sophia White

unread,

Dec 7, 2019, 9:55:05 AM12/7/19

to deap-...@googlegroups.com

I wanted to use the diversity metric to compare the performance of the two algorithms. I don't know the true optimal pareto front of my problem. My question was can we still use the obtained pareto front, and if so how would I define the first and last? Now, looking at the NSGA-II paper, it seems I can use the obtained Pareto Front but, I need to somehow estimate two extreme points to be able to achieve this. Thanks for your help.

On Sat, Dec 7, 2019 at 1:56 PM Derek Tishler <lstr...@gmail.com> wrote:

Sorry missed what you meant, without the optimal front a benchmarking version of diversity may not be what you are looking for. If you could clarify what you need for a diversity measure, for example I found one here on page 188, right column for the Diversity Metric Del which may be able to use just the paretofront you have:
https://www.iitk.ac.in/kangal/Deb_NSGA-II.pdf

--
You received this message because you are subscribed to the Google Groups "deap-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to deap-users+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/deap-users/7088080d-965c-4a0f-842c-fca31cef7e49%40googlegroups.com.

Derek Tishler

unread,

Dec 7, 2019, 11:06:52 AM12/7/19

to deap-users

The hof, for your 2d case at least(see img), may already be sorted via deap. Here is a quick attempt at equation 1 on page 188. It really needs a review before trusting but it may get you started with a method for comparing both your hofs.

I have updated the code to generate the following image:

code

https://gist.github.com/DMTSource/521a84269087d9e21ad516193c0e566f

new figure:

Sophia White

unread,

Dec 7, 2019, 11:34:34 AM12/7/19

to deap-...@googlegroups.com

Yes, DEAP sorts the hof. Thank you very much for all your help, I will go through the code and the paper.

--
You received this message because you are subscribed to the Google Groups "deap-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to deap-users+...@googlegroups.com.

To view this discussion on the web visit https://groups.google.com/d/msgid/deap-users/defa3c2c-9a44-4d40-981c-faceff13413a%40googlegroups.com.

Derek Tishler

unread,

Dec 7, 2019, 11:43:55 AM12/7/19

to deap-users

My pleasure, I hope it helps!

I updated my example code, after matching the figure's labels to the whitepaper figure I noticed some mistakes:

https://gist.github.com/DMTSource/521a84269087d9e21ad516193c0e566f

Updated figure:

Derek Tishler

unread,

Dec 7, 2019, 12:06:39 PM12/7/19

to deap-users

As for what to use for the extreme points... In your case you may not want to use the method I used above; using the farthest(first/last) points of the frontier.

One estimation method could be using std gaps on 'x_grid' like I showed by accident in the first figure posted to the gist code. You can visualize the distribution via the hist plots and then use those limits to decide on a reasonable extreme point estimate. This is tricky either way as you have a very small hof in this example so these estimation methods are pretty noisy.

https://user-images.githubusercontent.com/8509302/70367148-3d8bfe80-1895-11ea-96ba-f8bcad01ff65.png

Reply all

Reply to author

Forward