geneticAlgorithm¶
Optimizes a user defined fitness function using a tailored genetic algorithm.
Synopsis¶
geneticAlgorithm (fitness, initialPopulation)
Required Arguments¶
- float
fitness
(Imsls_d_individualindividual
) (Input) - The fitness function. Given the data structure for an individual within
the population,
fitness
returns the fitness of that individual. The fitness function must return non-negative values. - Imsls_d_population
initialPopulation
(Input) - The initial population.
Return Value¶
Function geneticAlgorithm
optimizes a user defined fitness function by
evolving an initial population using a tailored genetic algorithm that
searches for the fittest individual. It returns a clone of the fittest
individual in the last generation. Memory allocated to this data structure
can be released using gaFreeIndividual.
Optional Arguments¶
grayEncoding
(Input)- By default, alleles for integer and real phenotypes are encoded using Base-2 encoding. This argument changes that default to Gray encoding for integer and real phenotypes.
noElitism
(Input)- By default, elitism is used to preserve the fittest individual from one generation to the next. This argument disables elitism.
noDecode
(Input)- By default, chromosome information is decoded into the individual’s phenotypes before every call to the user’s fitness function. This argument disables automatic decoding between generations. Decoding is only applied to the last generation, including the fittest individual.
printLevel
, int (Input)- By default, no printing of intermediate and final results occur from this
function. The
printLevel
argument accepts the following values forlevel
:
printLevel |
Enumeration | Description |
---|---|---|
0 | NONE |
Suppresses printing of any results. |
1 | FINAL |
Prints summary of final results. |
2 | TRACE_GEN |
Prints summary of final results plus generation statistics. |
3 | TRACE_ALL |
Prints summary of final results, generation statistics and individual crossovers. |
maxGenerations
, int (Input)The maximum number of generations. Optimization is halted when the number of generations exceeds
maxGenerations
.Default:
maxGenerations
=100.maxFitness
, float (Input)The optimization is halted if the maximum fitness is greater than this value.
Default:
maxFitness=machine(7)
, i.e., optimization is not halted by large fitness values. Optimization only stops when the number of generations exceedsmaxGenerations
.
linearScaling
, float (Input)Specifies an upper limit for the linear fitness scaling constant. Set
linearScaling
= 1 for no scaling. A check is made to ensure that the minimum scaled fitness is non-negative. If it falls below zero, then the scaling constant is automatically reduced to make the minimum scaled fitness equal to zero. For linear scaling the scaling constant is typically between one and two.Default:
linearScaling
=1,
no linear fitness scaling.
sigmaScaling
(Input)- By default, sigma scaling is not used for fitness scaling. This argument enables sigma scaling.
generationGap
, float (Input)The proportion of weakest individuals replaced between generations. If
generationGap
=1, all of the individuals are replaced.Default:
generationGap
=1.mutationProb
, float (Input)The probability of mutation. Although most applications set this to a value between 0.005 and 0.1, any value between 0 and 1 is allowed.
Default:
mutationProb
=0.005.crossoverProb
, float (Input)The probability of crossover.
crossoverProb
can be any value between 0 and 1. Most genetic algorithms use a probability between 0.6 and 0.9.Default:
crossoverProb
= 0.6.crossovers
, int (Input)The number of crossover points. De Jong’s (1975) generalized crossover model R6 is implemented. If
crossovers
is odd, then the chromosome is treated as a string with a default crossover at the beginning of the chromosome. Ifcrossovers
is even, then the chromosome is treated as a ring with no beginning or end, and crossovers are selected using the uniform distribution on a circle. Crossing points occur at the odd crossover points. If thepmxCrossover
optional argument is used, there are always two crossover points within the nominal portion of the chromosome. For partially matched crossovers, this argument is only used to define the number of crossovers within the binary, integer and real portion of the chromosome.Default:
crossovers
=1.pmxCrossover
(Input)- By default this optional argument applies partially matched crossover to
the nominal portion of the chromosome. Crossovers for other phenotypes
are still applied using standard crossover and inversion crossover if
requested. With partially matched crossover, the number of crossovers for
nominal phenotypes is set to 2, and partially matched crossover is
applied only to the nominal phenotype. The number of crossovers for
non-nominal phenotypes is still controlled by the value of
crossovers
. However, if this optional argument is used, crossover points are randomly selected separately for nominal and non-nominal alleles. invertCrossover
(Input)- This option augments standard or partially matched crossover with inversion. Inversion crossover inverts the values of the alleles in every other crossover segment. If this is applied with partially matched crossover, inversion is applied within the matched segment of the alleles for the nominal phenotypes and then within every other segment of any non-nominal phenotype.
selectionModel
, intselectionModel
(Input)The model used for selecting individuals for reproduction. Selection models are described in the following table:
selectionModel
Description DETERMINISTIC
Individuals with highest fitness are selected for reproduction using their expected sampling frequency. See Goldberg (1989) ROULETTE_WITH
Original fitness-proportional selected described by Holland(1975). Sampling is done with replacement. ROULETTE_WITHOUT
The original fitness-proportional selected except that sampling is done without replacement. This is also referred to as De Jong’s (1975) R3 model. REMAINDER_WITH
Remainder selection with replacement. REMAINDER_WITHOUT
Remainder selection without replacement SUS_SELECTION
Stochastic Universal Sampling as described by Baker (1987). RANK_SELECTION
Rank selection. The individuals with the highest fitness are selected once for reproduction. TOURNAMENT_1
Tournament selection as described by Wetzel (1983). Only the fittest individual in a pair is selected. TOURNAMENT_2
Tournament selection as described by Goldberg and Deb (1991). The fittest individual in a pair is selected with probability 0.75. Otherwise the weaker individual is selected. Default: The original selection method described by Holland (1975),
selectionModel
=ROULETTE_WITH.
nGenerations
(Output)- The number of generations used to find the fittest individual.
onLinePerformance
(Output)- An array of length
maxGenerations
containing on-line performance statistics for each generation. offLinePerformance
(Output)- An array of length
maxGenerations
containing off-line performance statistics for each generation. velocity
(Output)- An array of length
maxGenerations
containing velocity statistics for each generation. The velocity for the i-th generation is equal to \(\tfrac{1}{2} \ln\left( f_i\times f_0 \right)\) where \(f_i\) is the maximum fitness for the i-th generation. generationStats
(Output)- An array of size
maxGenerations
× 4 containing the maximum fitness, minimum fitness, average fitness and standard deviation of the fitness for each generation. The i-th row ofgenerationStats
contains the statistics for the i-th generation. WhennGenerations<maxGenerations
, rows greater thannGenerations
- 1 are filled with NaN values. The four columns contain the following statistics calculated for each generation:
Column | Description |
---|---|
1 | Maximum Fitness |
2 | Minimum Fitness |
3 | Fitness Average |
4 | Fitness Standard Deviation |
lastGeneration
(Output)- The last generation produced by the genetic algorithm. Memory allocated
to this data structure can be released using
gaFreePopulation
.
Description¶
Genetic algorithms search for the optimum individual in a population. This
is defined as the individual with the highest fitness. Function
geneticAlgorithm
returns the fittest individual in the last generation.
Mathematically, this is equivalent to finding the values of the phenotypes
that maximize a user provided fitness function. Although there are no
requirements that the fitness function be non-negative, in general,
convergence to optimum fitness is faster when values of the fitness function
are non-negative. Constraints can be applied by incorporating a penalty
function within the fitness calculation. Phenotypes can consist of any
combination of nominal, binary, integer and real values. Integer and real
values must be encoded into a binary representation. This procedure provides
for either Base-2 or Gray encoding. However, users can supply other
encodings within the fitness function.
The function geneticAlgorithm
uses the population data structure and
fitness with simulated genetic processes of reproduction to search for the
optimum individual, i.e., settings of phenotype values. Genetic algorithms
have been successfully applied to a wide variety of optimization and search
problems, see Holland (1975) and Goldberg (1985).
There are many refinements to the basic genetic algorithm originally
described by Holland (1975). His basic algorithm begins with an initial
population of n individuals, a fitness function, and probabilities for
crossover and mutation of crossoverProb
and mutationProb
respectively. The initial population is transformed from one generation to
the next using the following steps:
- Select n individuals from the current population to generate a mating pool.
- Apply crossover with probability
crossoverProb
to pairs of the selected individuals within the mating pool to produce two offspring. - Apply mutation with probability
mutationProb
to the offspring to generate the next generation. - Check stopping criteria. If they are met, stop and report the fittest individual within the last generation.
By default Holland’s approach to these steps are used. However, many variations of these can be selected using optional arguments.
The initial population can be generated automatically using
gaRandomPopulation or it can be created by first
creating individuals using gaIndividual
and then a population for those
individuals using gaPopulation.
By default Holland’s roulette wheel with replacement is used for selecting
the mating pool. The optional argument selectionModel
allows users to
select alternate selection methods including remainder, tournament and
stochastic universal selection. Default crossover and mutation probabilities
are crossoverProb
= 0.8 and mutationProb
= 0.005. These defaults
can be changed using the optional arguments crossoverProb
and
mutationProb
.
In the original algorithm only a single crossover point was randomly
selected. The optional argument crossovers
allows users to designate any
number of crossover points.
Standard crossover proceeds by combining the genes from both parents in the
order found in those parents. Inversion crossover inverts this order for one
of the parents. Inversion crossover is selected using the optional argument
invertCrossover
.
For certain problems, such as the traveling salesman problem, standard
crossover can produce infeasible individuals. One approach is to assign zero
fitness to those solutions, but this can be very inefficient. Partially
matched crossover is an approach that ensures individuals are feasible for a
certain class of problems. If the problem is best represented using nominal
phenotypes with values 0, 1, …, nNominal
-1 where all values must appear
once and only once in the chromosome, then partially matched crossover
preserves that condition. Partially matched crossover is selected using the
optional argument pmxCrossover
.
One issue with some applications of genetic algorithms is premature
convergence or convergence to false local solutions. This can occur when
dominant individuals within early generations take over the population
prematurely reducing population diversity. One approach to this problem is
fitness scaling. This implementation allows users to use either linear or
sigma fitness scaling. By default, no scaling is used. However, the optional
arguments linearScaling
and sigmaScaling
allow users to have fitness
values automatically scaled before selection.
The genetic algorithm is stopped when any one of the stopping criteria is
met. The algorithm is stopped when the number of generations exceeds
maxGenerations
or when the maximum fitness exceeds maxFitness
. By
default maxGenerations
= 100; this can be changed using
maxGenerations
. By default maxFitness
is machine(7)
; this can be
changed using maxFitness
.
On some platforms, geneticAlgorithm
can evaluate the user-supplied
function fitness
in parallel. This is done only if the function
ompOptions is called to flag user-defined functions
as thread-safe. A function is thread-safe if there are no dependencies
between calls. Such dependencies are usually the result of writing to global
or static variables.
Examples¶
Example 1¶
De Jong (1975) examined the performance of a genetic algorithm for finding the maximum of a multivariate function. This is an example of optimizing a variation of De Jong’s R2 function:
where
Since there were only two real phenotypes and the function is easily
calculated, the phenotypes were encoded using discretization with 65,536
values over the interval [-2.048, 2.048]. By default, encoding and decoding
is done within geneticAlgorithm
. This allows the fitness function to
calculate individual fitness using the real phenotypes instead of the
chromosome. Both the chromosome and its phenotype representation are
available within the Imsls_d_individual data structure argument.
The default selection algorithm ROULETTE_WITH
was used, but the number
of crossover probability was set to 0.6. The genetic algorithm was more
efficient using a lower crossover probability and Gray encoding instead of
the defaults 0.7 and Base-2 encoding. Each generation consisted of 40
individuals.
from __future__ import print_function
import sys
from numpy import *
from pyimsl.stat.free import free
from pyimsl.stat.ompOptions import ompOptions
from pyimsl.stat.gaChromosome import gaChromosome
from pyimsl.stat.gaRandomPopulation import gaRandomPopulation
from pyimsl.stat.gaFreePopulation import gaFreePopulation
from pyimsl.stat.geneticAlgorithm import geneticAlgorithm, FINAL
from pyimsl.stat.randomSeedSet import randomSeedSet
######################################################################
# De Jong's R2 Function
######################################################################
def deJongR2(individual):
x1 = individual[0].realPhenotype[0]
x2 = individual[0].realPhenotype[1]
f = 100 * (x1 * x1 - x2) * (x1 * x1 - x2) + (1.0 - x1) * (1.0 - x1)
f = 4000 - f
return f
n = 40 # population size
generations = 0 # final number of generations
n_real = 2 # number of real phenotypes
r_intervals = [65536, 65536]
r_bounds = [[-2.048, 2.048], [-2.048, 2.048]]
genStats = [] # generation statistics
last_generation = [] # last generation
dashes = "**************************************************************"
# In this example the user function is thread safe. Let CNL
# know it is safe, which allows genetic algorithm to run in
# parallel, if that capability exists on the user computer.
ompOptions(setFunctionsThreadSafe=1)
randomSeedSet(12345)
real = {"rIntervals": r_intervals, "rBounds": r_bounds}
chromosome = gaChromosome(real=real)
population = gaRandomPopulation(n, chromosome, grayEncoding=True,
fitnessFcn=deJongR2)
n_generations = []
genStats = []
last_generation = []
best_individual = geneticAlgorithm(deJongR2, population,
printLevel=FINAL,
maxFitness=3999.999,
crossoverProb=0.6,
grayEncoding=True,
maxGenerations=1000,
nGenerations=n_generations,
generationStats=genStats,
lastGeneration=last_generation)
print("*****************GENERATION STATISTICS*****************")
print("Generation Max. Fit. Avg. Fit. Min. Fit. CV")
print("*******************************************************")
for i in range(0, n_generations[0] + 1):
print("Gen. %3d: %11.5f %10.2f %10.2f %9.2f " % (
i, genStats[i][0], genStats[i][2], genStats[i][1], 100 * genStats[i][3] / genStats[i][2]))
print("\nLAST GENERATION")
print(dashes)
print("\nIndv Fitness ")
print("Chromosome X1 X2")
lastGen = last_generation[0]
for i in range(0, lastGen[0].n):
print(" %2d %6.2f " % (i, lastGen[0].fitness[i]), end=' ')
for j in range(0, lastGen[0].chromosome[0].c_length):
sys.stdout.write("%d" % ((lastGen[0].individual[i])[
0].chromosome[0].allele[j]))
sys.stdout.write("%7.3f %6.3f\n" % (
(lastGen[0].individual[i])[0].realPhenotype[0],
(lastGen[0].individual[i])[0].realPhenotype[1]))
print("\nMaximum: %6.2f for Individual %d" % (
lastGen[0].maxFitness,
lastGen[0].indexFittest))
print("Minimum: %6.2f for Individual %d" % (
lastGen[0].minFitness,
lastGen[0].indexWeakest))
print("Average: %6.2f" % (lastGen[0].avgFitness))
print("Std. Dev: %6.2f\n" % (lastGen[0].stdFitness))
print(dashes)
Output¶
In this example, the print level is set to FINAL
in order to print the
optimum solution. The generation statistics are requested using the
generationStats
option, and the last population is requested using the
lastGeneration
option.
Although the maximum number of generations is set to 100 using the
lastGeneration
option, the genetic algorithm halted after 26 generations
when the maximum population fitness exceeded 3999.999.
*****************GENERATION STATISTICS*****************
Generation Max. Fit. Avg. Fit. Min. Fit. CV
*******************************************************
Gen. 0: 3996.12906 3244.16 578.92 28.62
Gen. 1: 3996.25271 3725.90 2770.41 8.24
Gen. 2: 3999.22964 3699.14 1917.26 10.28
Gen. 3: 3999.22964 3683.41 2551.70 9.87
Gen. 4: 3999.73775 3778.83 2551.70 8.35
Gen. 5: 3999.73775 3823.50 3187.72 5.22
Gen. 6: 3999.73775 3796.59 3187.72 5.76
Gen. 7: 3999.73775 3850.94 3302.26 4.28
Gen. 8: 3999.73775 3860.17 3358.92 3.93
Gen. 9: 3999.73775 3886.33 3138.96 4.23
Gen. 10: 3999.74678 3896.13 3292.64 3.85
Gen. 11: 3999.74678 3900.24 3638.17 2.93
Gen. 12: 3999.74678 3899.95 3376.36 3.17
Gen. 13: 3999.74678 3900.57 3476.12 3.19
Gen. 14: 3999.74678 3897.88 3408.28 3.36
Gen. 15: 3999.74678 3908.28 2331.26 3.36
Gen. 16: 3999.99596 3897.28 3301.18 3.94
Gen. 17: 3999.99596 3910.99 3236.92 3.31
Gen. 18: 3999.99596 3953.46 3429.17 1.31
Gen. 19: 3999.99596 3944.98 3764.08 1.41
Gen. 20: 3999.99596 3945.62 3751.01 1.41
Gen. 21: 3999.99596 3934.07 3751.10 1.81
Gen. 22: 3999.99596 3947.08 3739.52 1.62
Gen. 23: 3999.99604 3943.99 3652.86 1.95
Gen. 24: 3999.99604 3942.42 3652.86 1.99
Gen. 25: 3999.99917 3970.69 3845.42 0.92
Gen. 26: 3999.99944 3970.61 3845.42 0.90
LAST GENERATION
**************************************************************
Indv Fitness
Chromosome X1 X2
0 3916.58 11100000000011001100110001010011 1.023 0.134
1 3996.09 01010000000001011100110001010011 -0.512 0.134
2 3965.22 11010000000011101110111110011011 0.511 0.849
3 3993.47 01100101000001011010000001100011 -0.928 1.028
4 3941.02 11010000000001011010000001100011 0.512 1.028
5 3998.11 11110000000001011100110001110011 0.512 0.134
6 3951.04 01100100000101011100110001100011 -0.898 0.132
7 3919.76 01000000000001011110110001110011 -0.000 0.890
8 3965.35 11110000000001011110111110011011 0.512 0.849
9 3913.56 11010000000011101010111101110011 0.511 1.190
10 3948.39 11010000000011101110001110110011 0.511 0.978
11 3996.04 01010000000001011100110001100011 -0.512 0.132
12 3997.44 11100000100111001110111101101001 1.009 0.859
13 3992.45 11110000000001110100000001000011 0.512 -0.008
14 3958.07 01010000000101011110110001110011 -0.510 0.890
15 3951.33 01100100000001011100110001100011 -0.896 0.132
16 3948.39 11010000000011101110001110110011 0.511 0.978
17 4000.00 11100000000001011010000111010001 1.024 1.046
18 3999.96 11100000000001011010000001100011 1.024 1.028
19 3925.31 01010110010001011010000111010001 -0.440 1.046
20 3999.37 11010000000001011101111001101111 0.512 0.325
21 3845.42 01100101010110110101110001100011 -0.921 -0.380
22 3999.64 11010101000011101100111001000011 0.415 0.184
23 3992.37 01100000000011101110111101101001 -1.023 0.859
24 3990.55 01010000000001110100000001110011 -0.512 -0.006
25 3993.26 11110000000011101100000001000011 0.513 0.008
26 3964.15 11010000000011101110111101110011 0.511 0.858
27 3998.36 11010000000011101100110010011011 0.511 0.143
28 3950.63 01100101000011101100111001000011 -0.927 0.184
29 3996.08 01010000000001011100110001110011 -0.512 0.134
30 3997.89 01010110000011101100111001100111 -0.447 0.188
31 3998.11 11110000000001011100110001110011 0.512 0.134
32 3993.47 01100101000001011010000001100011 -0.928 1.028
33 3992.45 11110000000001110100000001000011 0.512 -0.008
34 3916.61 11100000000011011100110001010011 1.023 0.134
35 3929.38 01010110000001011010000001100011 -0.448 1.028
36 3997.89 01010110000011101100111001100011 -0.447 0.188
37 4000.00 11100000000001011010000111010001 1.024 1.046
38 4000.00 11100000000001011010000111000011 1.024 1.048
39 3991.28 01100100000001011010000001110001 -0.896 1.030
Maximum: 4000.00 for Individual 38
Minimum: 3845.42 for Individual 21
Average: 3970.61
Std. Dev: 35.71
**************************************************************
OPTIMUM SOLUTION
Fitness: 3999.999442
Phenotypes:
Real: 2
Function Calculations: 1080
Population Size: 40
Number of Generations: 26
Real Phenotype(s):
1.023594 1.047844
Chromosome (Gray Encoded):
11100000000001011010000111000011
Example 2¶
The traveling salesman problem creates a problem for traditional crossover. In this problem, the objective is to find the shortest route while traveling to each city once. In this example, there are eight cities, labeled using the letters a-h, with distances ranging from 17 to 113 miles.
Traditional crossover would create unfeasible routes; that is some routes after crossover would not visit every city once. Some would not be visited and others would be visited more than once.
Partially matched crossover (PMX) preserves the feasibility of a route. In
the general sense, PMX assumes that the nominal phenotypes consists of a
string of numbers from zero to nNominal-1
, with each number appearing
once and only once in that string. Partially matched crossover uses two
crossover points within the nominal portion of the chromosome and swaps the
middle segment between the parents. The first and third segments are
manipulated to ensure the resulting offspring is feasible.
from __future__ import print_function
import sys
from numpy import *
from pyimsl.stat.free import free
from pyimsl.stat.ompOptions import ompOptions
from pyimsl.stat.gaChromosome import gaChromosome
from pyimsl.stat.gaRandomPopulation import gaRandomPopulation
from pyimsl.stat.gaFreeIndividual import gaFreeIndividual
from pyimsl.stat.gaFreePopulation import gaFreePopulation
from pyimsl.stat.geneticAlgorithm import geneticAlgorithm, FINAL
from pyimsl.stat.randomSeedSet import randomSeedSet
###########################################################################
# FITNESS FUNCTION
###########################################################################
def pmxFitness(individual):
i = 0
k = 0
n_nominal = 8 # number of nominal phenotypes
f = 0.0 # fitness value
# cities: a b c d e f g h
distances = [0, 17, 27, 73, 61, 57, 51, 23, # a
17, 0, 37, 73, 72, 74, 66, 40, # b
27, 37, 0, 48, 35, 49, 65, 50, # c
73, 73, 48, 0, 47, 82, 113, 95, # d
61, 72, 35, 47, 0, 38, 80, 78, # e
57, 74, 49, 82, 38, 0, 48, 65, # f
51, 66, 65, 113, 80, 48, 0, 40, # g
23, 40, 50, 95, 78, 65, 40, 0] # h
n_nominal = individual[0].chromosome[0].n_nominal
k = individual[0].chromosome[0].nominalIndex + 1
for i in range(k, k + n_nominal - 1):
i1 = individual[0].nominalPhenotype[i - 1]
i2 = individual[0].nominalPhenotype[i]
f += distances[i1 * n_nominal + i2]
return 516 - f
n = 50 # population size
n_nominal = 8 # number of nominal phenotypes
n_categories = [8, 8, 8, 8, 8, 8, 8, 8] # nominal category limits
genStats = [] # generation statistics
last_generation = [] # last generation
cities = ["a", "b", "c", "d", "e", "f", "g", "h"]
###########################################################################
# In this example the user function is thread safe. Let CNL know it is
# safe, which allows genetic algorithm to run in parallel, if that
# capability exists on the user computer.
###########################################################################
ompOptions(setFunctionsThreadSafe=1)
randomSeedSet(12345)
chromosome = gaChromosome(nominal=n_categories)
population = gaRandomPopulation(n, chromosome, pmxCrossover=True,
fitnessFcn=pmxFitness)
genStats = []
n_generations = []
last_generation = []
best_individual = geneticAlgorithm(pmxFitness, population,
printLevel=FINAL,
pmxCrossover=True,
invertCrossover=True,
crossoverProb=0.8,
maxGenerations=10,
generationStats=genStats,
nGenerations=n_generations,
lastGeneration=last_generation)
print("GENERATION STATISTICS")
print("Total Number of Generations: %d\n" % (n_generations[0]))
print("Generation Max. Fit. Min. Fit. Avg. Fit. CV")
for i in range(0, n_generations[0] + 1):
print("Gen. %3d: %8.0f %8.0f %12.2f %8.2f" % (
i, genStats[i][0], genStats[i][1], genStats[i][2],
100 * genStats[i][3] / genStats[i][2]))
print("\n\n LAST GENERATION")
print("*************************************")
print("Individual Fitness Phenotype Values ")
lastGen = last_generation[0]
avg = lastGen[0].avgFitness
for i in range(0, lastGen[0].n):
x1 = lastGen[0].fitness[i]
print(" %2d %6.0f " % (i, x1), end=' ')
for j in range(0, lastGen[0].chromosome[0].c_length):
k = (lastGen[0].individual[i])[0].nominalPhenotype[j]
print(cities[k], end=' ')
if (x1 == lastGen[0].maxFitness):
print("***")
else:
print("")
print("Average Fitness: %6.1f\n" % (avg))
print("*************************************")
sys.stdout.write("OPTIMUM SOLUTION:\n\nFitness:%4.0f\n\nChromosome: " % (
pmxFitness(best_individual)))
for i in range(0, best_individual[0].chromosome[0].c_length):
sys.stdout.write("%3d" % (best_individual[0].chromosome[0].allele[i]))
sys.stdout.write("\n\nPhenotype Values: ")
for i in range(0, n_nominal):
k = best_individual[0].nominalPhenotype[i]
sys.stdout.write("->%s" % (cities[k]))
sys.stdout.write("\n\n")
print("freeing best individual")
gaFreeIndividual(best_individual)
print("freeing last generation")
gaFreePopulation(lastGen)
gaFreePopulation(population)
Output¶
This program produced the following output. Since the print level was set to
FINAL
, the optimum solution was printed. The generation statistics were
requested using the generationStats
option, and the last population was
requested using the lastGeneration
option.
The maximum number of generations was set to 10. The genetic algorithm found the optimum route after evaluating the fitness of 550 routes in 10 generations. Generation zero is the initial generation provided to the algorithm and is not counted towards the maximum generation count.
GENERATION STATISTICS
Total Number of Generations: 10
Generation Max. Fit. Min. Fit. Avg. Fit. CV
Gen. 0: 194 29 113.64 33.55
Gen. 1: 251 24 119.10 35.08
Gen. 2: 251 37 131.16 35.99
Gen. 3: 251 38 128.96 33.68
Gen. 4: 251 29 135.08 33.25
Gen. 5: 255 38 136.18 33.44
Gen. 6: 255 28 142.38 33.23
Gen. 7: 255 56 150.56 29.31
Gen. 8: 269 55 155.06 28.88
Gen. 9: 269 56 148.62 26.68
Gen. 10: 269 48 146.32 29.59
LAST GENERATION
*************************************
Individual Fitness Phenotype Values
0 106 a c d g b h e f
1 269 d e c b a h g f ***
2 222 d e c h g f a b
3 154 e a g h b d c f
4 215 e d c a b f g h
5 215 e d c a b f g h
6 158 a h b c d f g e
7 113 c a d b f e h g
8 78 f a d b h c g e
9 125 e d h g b c f a
10 118 c b e g f h a d
11 209 d e c b h g a f
12 161 h b a d c f g e
13 167 g d b a h c e f
14 169 a f h g b c e d
15 137 f g a e d h b c
16 127 h d a b c f e g
17 90 b d h g c e a f
18 216 d h b a c e f g
19 167 g d e c f h a b
20 120 e h f c d a b g
21 86 f a h d b c g e
22 171 a b d f g h c e
23 156 d h g a c e f b
24 146 c e a d b h g f
25 116 f d g e c h a b
26 118 b c a h e f d g
27 95 d g a b f e h c
28 143 d h b a f c e g
29 157 b f c g h a e d
30 175 b e a h g f c d
31 146 c e a d b h g f
32 111 c b e h g f a d
33 140 b e a f g h c d
34 164 b c e d g h a f
35 166 f e b c a h g d
36 199 b a g h f c d e
37 171 e d f a b c g h
38 97 f a c d h b e g
39 222 d e c h g f a b
40 128 d h c g a b e f
41 99 e g d c b a f h
42 112 d h f c b a e g
43 107 h d a b e f c g
44 143 c h a b f d e g
45 144 g a c e d b f h
46 64 a e c g h d f b
47 103 b d f h g c a e
48 129 d a g b f e c h
49 172 f h b g a c d e
Average Fitness: 146.3
*************************************
OPTIMUM SOLUTION:
Fitness: 269
Chromosome: 3 4 2 1 0 7 6 5
Phenotype Values: ->d->e->c->b->a->h->g->f
freeing best individual
freeing last generation
OPTIMUM SOLUTION
Fitness: 269.000000
Phenotypes:
Nominal: 8
Function Calculations: 550
Population Size: 50
Number of Generations: 10
Nominal Phenotype(s):
3 4 2 1 0 7 6 5
Chromosome (Base-2 Encoded):
3 4 2 1 0 7 6 5
Example 3¶
This example uses the N-Queens problem to illustrate the use of a fitness function with parameters in implementing a genetic algorithm. The N-Queens problem is derived from chess. The genetic algorithm provides an efficient search for a valid solution. For this problem the chess board consists of N rows and N columns. The objective is to place N queens on this board with no conflicts. A conflict occurs when one queen can move and capture another. Since queens can move diagonally, vertically and horizontally this problem is challenging when N becomes large.
One solution for N = 4 is displayed in the following table. A valid solution must place every queen in a different row and column. The problem is to ensure the queens are not in conflict because of lying on the same diagonal.
Row/Col | 0 | 1 | 2 | 3 |
---|---|---|---|---|
0 | Q | |||
1 | Q | |||
2 | Q | |||
3 | Q |
Similar to the traveling salesman problem, the N-Queens problem can be expressed using N nominal phenotypes with values 0, 1, …, N-1. The value of the i-th phenotype represents the row number for the queen in the i-th column. This ensures that any arrangement of the phenotype values represents a board with N queens, each placed in a different row and column.
The solution for N queens displayed above can be represented by the phenotype values {1, 3, 0, 2}. The search looks for arrangements that also do not place queens on the same diagonal. Two queens fall on the same diagonal if the absolute value of the difference between their row numbers equals the absolute value of the difference between their column numbers.
This example uses this representation with a fitness function equal to (N – C) where C equals the number of conflict among the queens. With this fitness, a solution to the N-Queens has a fitness of N.
from __future__ import print_function
import sys
from numpy import *
from pyimsl.stat.free import free
from pyimsl.stat.ompOptions import ompOptions
from pyimsl.stat.gaChromosome import gaChromosome
from pyimsl.stat.gaRandomPopulation import gaRandomPopulation
from pyimsl.stat.gaFreeIndividual import gaFreeIndividual
from pyimsl.stat.gaFreePopulation import gaFreePopulation
from pyimsl.stat.geneticAlgorithm import geneticAlgorithm, FINAL
from pyimsl.stat.randomSeedSet import randomSeedSet
# This algorithm seems to run really fast until n_queens is > 10,
# so we'll leave it at 10 for testing.
#
# n_queens = 25 # number of nominal phenotypes
n_queens = 10 # number of nominal phenotypes
n = 500 # population size
n_categories = empty((n_queens), dtype=double)
def queensFitness(individual):
global n_queens
f = 0 # Fitness value
for i in range(0, n_queens - 1):
for j in range(i + 1, n_queens):
k = individual[0].chromosome[0].allele[i] - \
individual[0].chromosome[0].allele[j]
k = abs(k)
if (abs(i - j) == k):
f += 1
f = n_queens - f
return f
###########################################################################
# In this example the user function is thread safe. Let CNL know it is
# safe, which allows genetic algorithm to run in parallel, if that
# capability exists on the user computer.
###########################################################################
ompOptions(setFunctionsThreadSafe=1)
randomSeedSet(12345)
maxFit = n_queens - 0.5
for i in range(0, n_queens):
n_categories[i] = n_queens
chromosome = gaChromosome(nominal=n_categories)
population = gaRandomPopulation(n, chromosome, pmxCrossover=True,
fitnessFcn=queensFitness)
genStats = []
n_generations = []
best_individual = geneticAlgorithm(queensFitness, population,
printLevel=FINAL,
pmxCrossover=True,
linearScaling=2.0,
crossoverProb=0.7,
mutationProb=0.01,
maxGenerations=10000,
maxFitness=maxFit,
generationStats=genStats,
nGenerations=n_generations)
print("GENERATION STATISTICS")
print("Total Number of Generations: %d\n" % (n_generations[0]))
print("Generation Max. Fit. Min. Fit. Avg. Fit. CV")
print("*************************************************")
for i in range(0, n_generations[0] + 1, 25):
print("Gen. %3d: %8.0f %8.0f %12.2f %8.2f" % (
i, genStats[i][0], genStats[i][1], genStats[i][2],
100 * genStats[i][3] / genStats[i][2]))
i = n_generations[0]
print("Gen. %3d: %8.0f %8.0f %12.2f %8.2f" % (
i, genStats[i][0], genStats[i][1], genStats[i][2],
100 * genStats[i][3] / genStats[i][2]))
print("*************************************************\n")
sys.stdout.write("OPTIMUM SOLUTION:\n\nFitness:%4.0f\n\nChromosome: " % (
queensFitness(best_individual)))
for i in range(0, best_individual[0].chromosome[0].c_length):
sys.stdout.write("%3d" % (best_individual[0].chromosome[0].allele[i]))
if(n_queens < 100):
print("\n\nBoard Positions: \n")
for i in range(0, n_queens):
sys.stdout.write("--")
sys.stdout.write("-\n")
for i in range(0, n_queens):
for j in range(0, n_queens):
k = best_individual[0].nominalPhenotype[j]
if(i == k):
sys.stdout.write("|Q")
else:
sys.stdout.write("| ")
sys.stdout.write("|\n")
for k in range(0, n_queens):
sys.stdout.write("--")
sys.stdout.write("-\n")
sys.stdout.write("\n\n")
Output¶
This program produced the following solution to the N-Queens problem with \(N=50\) queens. Notice that some of the minimum fitness values are negative. This alters the random selection of the fittest parents, but if these values are few and small, then the effect is not enough to halt the genetic algorithm.
For 50 queens there are over 1064 ways of placing queens ensuring each is in its own row and column. An exhaustive search of these possible solutions to find a arrangement without diagonal conflicts would be time consuming. The genetic algorithm search found a solution in only 568 generations, requiring 284,500 function evaluations.
GENERATION STATISTICS
Total Number of Generations: 26
Generation Max. Fit. Min. Fit. Avg. Fit. CV
*************************************************
Gen. 0: 9 -9 3.62 66.71
Gen. 25: 10 -8 4.68 43.25
Gen. 26: 10 -4 4.44 46.15
*************************************************
OPTIMUM SOLUTION:
Fitness: 10
Chromosome: 6 2 5 7 0 4 8 1 3 9
Board Positions:
---------------------
| | | | |Q| | | | | |
---------------------
| | | | | | | |Q| | |
---------------------
| |Q| | | | | | | | |
---------------------
| | | | | | | | |Q| |
---------------------
| | | | | |Q| | | | |
---------------------
| | |Q| | | | | | | |
---------------------
|Q| | | | | | | | | |
---------------------
| | | |Q| | | | | | |
---------------------
| | | | | | |Q| | | |
---------------------
| | | | | | | | | |Q|
---------------------
OPTIMUM SOLUTION
Fitness: 10.000000
Phenotypes:
Nominal: 10
Function Calculations: 13500
Population Size: 500
Number of Generations: 26
Nominal Phenotype(s):
6 2 5 7 0 4 8 1 3 9
Chromosome (Base-2 Encoded):
6 2 5 7 0 4 8 1 3 9
Fatal Errors¶
IMSLS_STOP_USER_FCN |
Request from user supplied function to stop algorithm. User flag = “#”. |