This is a class assignment that I have been trying to do. I have to minimize the De Jong's test bed functions. Since GAs are a maximization technique, I would have to alter the function to minimize it
For example the F1: x^2+y^2+z^2 (-5.12 < x,y,z <5.12), I could use -
1) 78.6432 - F1 (bounded function)
2) 1/F1 (converges at infinity)
What is the appropriate value to use?
Whereas both 1 and 2 are technically valid, but as you noted (2) has
the 1/0 problem. So, (1) would be the better option here.
HTH,
Amit
If I catch the exception, 2 shall yield a better result. When I tried 2 in a
python program, it never happened to reach the 1/0 situation, but yielded a result closer to the minima, which is 0 than the 1st objective function.
II was looking for experiences of people who have had to deal with minimizations using GA. And how they decided to alter the Objective function.
In general, the details of a maximization function used in place of a
real function are problem dependent. You may not know in advance that
a function is bounded and you typically don't know in advance how
deceptive (with respect to the distribution of local maximums or
minimums) the function is.
If you know that F1 is bounded above by 78.6432 then (1) is a
perfectly good approach. Similarly (2) should be okay if F1 is
bounded below by 0, but will absolutely not work if F1 can be
negative. Assuming F1 is bounded below by a value of 0, then
typically we might expect (2) to have too little selective pressure in
the initial phase of the genetic search and then way too much as the
values of F1 approach 0. Similarly, if most of the values of F1 are
small compared to the bound in (1), (1) can give too small a selective
pressure and this selective pressure will decrease even more as F1
converges to it's minimum (there's just not enough difference between
right and almost right to make an effective difference in reproductive
success.) We can improve (1) by using the function [Worst value for
this generation]-F1 which substitutes the highest value of F1 this
generation for the upper bound and so will automatically increase the
selective pressure as the distribution of F1 values gets better.
(3) [Largest value of F1 for this generation] - F1.
An advantage of (3) is that we don't have to know an upper bound of
F1. The problem with (3) is that the amount of selective pressure can
still vary enormously depending on the range of F1 values present.
More generally I find that
(4) (F1 - [Worst for Generation]) / ([Best for Generation]-[Worst
for Generation])
will produce a fairly constant level of selective pressure at all
stages of the search. Furthermore since the meaning of Best and Worst
varies accordingly, this will work whether I am trying to maximize or
minimize F1. In either case (4) will give a civilized value in the
interval [0,1].
The problem with (4) is that for deceptive functions this reasonably
constant level of selective pressure may be too great and so we switch
to (5) where r is a value that typically is in the .01 to .5 range.
(5) ( (1-r)([Worst]-F1) / ([Best] - [Worst]) ) + r
Now (5) is in the interval [r,1] The greater r is, the less selective
pressure we have and so we have a tool to avoid premature
convergence. As r approaches 1 the selective pressure goes to 0.
But all of this may be overkill for what you are doing. At least try
(3) and see how it works for you.
Clif
Thank your for the reply. That was helpful. I have a routine that performs a linear scaling on the fitness of my population as -
f'=a*f+b
I believe this should help maintain a uniform selection pressure. I think I must implement the modified evaluation functions you suggested in addition to the linear scaling that I am already doing instead of switching off the scaling and choosing (3), (4) or (5). (Do you disagree?)
If I understand things right, the scaling prevents strong individuals from dominating early and the modified objective functions help maintain a uniform selection pressure as the generations progress. (Did I get this part right?)
Thank you.
Bharath.