geneticAlgorithm

../../_images/OpenMp_27.png

Optimizes a user defined fitness function using a tailored genetic algorithm.

Synopsis

geneticAlgorithm (fitness, initialPopulation)

Required Arguments

float fitness(Imsls_d_individual individual) (Input)
The fitness function. Given the data structure for an individual within the population, fitness returns the fitness of that individual. The fitness function must return non-negative values.
Imsls_d_population initialPopulation (Input)
The initial population.

Return Value

Function geneticAlgorithm optimizes a user defined fitness function by evolving an initial population using a tailored genetic algorithm that searches for the fittest individual. It returns a clone of the fittest individual in the last generation. Memory allocated to this data structure can be released using gaFreeIndividual.

Optional Arguments

grayEncoding (Input)
By default, alleles for integer and real phenotypes are encoded using Base-2 encoding. This argument changes that default to Gray encoding for integer and real phenotypes.
noElitism (Input)
By default, elitism is used to preserve the fittest individual from one generation to the next. This argument disables elitism.
noDecode (Input)
By default, chromosome information is decoded into the individual’s phenotypes before every call to the user’s fitness function. This argument disables automatic decoding between generations. Decoding is only applied to the last generation, including the fittest individual.
printLevel, int (Input)
By default, no printing of intermediate and final results occur from this function. The printLevel argument accepts the following values for level:
printLevel Enumeration Description
0 NONE Suppresses printing of any results.
1 FINAL Prints summary of final results.
2 TRACE_GEN Prints summary of final results plus generation statistics.
3 TRACE_ALL Prints summary of final results, generation statistics and individual crossovers.
maxGenerations, int (Input)

The maximum number of generations. Optimization is halted when the number of generations exceeds maxGenerations.

Default: maxGenerations=100.

maxFitness, float (Input)

The optimization is halted if the maximum fitness is greater than this value.

Default: maxFitness=machine(7), i.e., optimization is not halted by large fitness values. Optimization only stops when the number of generations exceeds maxGenerations.

linearScaling, float (Input)

Specifies an upper limit for the linear fitness scaling constant. Set linearScaling = 1 for no scaling. A check is made to ensure that the minimum scaled fitness is non-negative. If it falls below zero, then the scaling constant is automatically reduced to make the minimum scaled fitness equal to zero. For linear scaling the scaling constant is typically between one and two.

Default: linearScaling =1, no linear fitness scaling.

sigmaScaling (Input)
By default, sigma scaling is not used for fitness scaling. This argument enables sigma scaling.
generationGap, float (Input)

The proportion of weakest individuals replaced between generations. If generationGap=1, all of the individuals are replaced.

Default: generationGap=1.

mutationProb, float (Input)

The probability of mutation. Although most applications set this to a value between 0.005 and 0.1, any value between 0 and 1 is allowed.

Default: mutationProb=0.005.

crossoverProb, float (Input)

The probability of crossover. crossoverProb can be any value between 0 and 1. Most genetic algorithms use a probability between 0.6 and 0.9.

Default: crossoverProb= 0.6.

crossovers, int (Input)

The number of crossover points. De Jong’s (1975) generalized crossover model R6 is implemented. If crossovers is odd, then the chromosome is treated as a string with a default crossover at the beginning of the chromosome. If crossovers is even, then the chromosome is treated as a ring with no beginning or end, and crossovers are selected using the uniform distribution on a circle. Crossing points occur at the odd crossover points. If the pmxCrossover optional argument is used, there are always two crossover points within the nominal portion of the chromosome. For partially matched crossovers, this argument is only used to define the number of crossovers within the binary, integer and real portion of the chromosome.

Default: crossovers=1.

pmxCrossover (Input)
By default this optional argument applies partially matched crossover to the nominal portion of the chromosome. Crossovers for other phenotypes are still applied using standard crossover and inversion crossover if requested. With partially matched crossover, the number of crossovers for nominal phenotypes is set to 2, and partially matched crossover is applied only to the nominal phenotype. The number of crossovers for non-nominal phenotypes is still controlled by the value of crossovers. However, if this optional argument is used, crossover points are randomly selected separately for nominal and non-nominal alleles.
invertCrossover (Input)
This option augments standard or partially matched crossover with inversion. Inversion crossover inverts the values of the alleles in every other crossover segment. If this is applied with partially matched crossover, inversion is applied within the matched segment of the alleles for the nominal phenotypes and then within every other segment of any non-nominal phenotype.
selectionModel, int selectionModel (Input)

The model used for selecting individuals for reproduction. Selection models are described in the following table:

selectionModel Description
DETERMINISTIC Individuals with highest fitness are selected for reproduction using their expected sampling frequency. See Goldberg (1989)
ROULETTE_WITH Original fitness-proportional selected described by Holland(1975). Sampling is done with replacement.
ROULETTE_WITHOUT The original fitness-proportional selected except that sampling is done without replacement. This is also referred to as De Jong’s (1975) R3 model.
REMAINDER_WITH Remainder selection with replacement.
REMAINDER_WITHOUT Remainder selection without replacement
SUS_SELECTION Stochastic Universal Sampling as described by Baker (1987).
RANK_SELECTION Rank selection. The individuals with the highest fitness are selected once for reproduction.
TOURNAMENT_1 Tournament selection as described by Wetzel (1983). Only the fittest individual in a pair is selected.
TOURNAMENT_2 Tournament selection as described by Goldberg and Deb (1991). The fittest individual in a pair is selected with probability 0.75. Otherwise the weaker individual is selected.

Default: The original selection method described by Holland (1975), selectionModel=ROULETTE_WITH.

nGenerations (Output)
The number of generations used to find the fittest individual.
onLinePerformance (Output)
An array of length maxGenerations containing on-line performance statistics for each generation.
offLinePerformance (Output)
An array of length maxGenerations containing off-line performance statistics for each generation.
velocity (Output)
An array of length maxGenerations containing velocity statistics for each generation. The velocity for the i-th generation is equal to \(\tfrac{1}{2} \ln\left( f_i\times f_0 \right)\) where \(f_i\) is the maximum fitness for the i-th generation.
generationStats (Output)
An array of size maxGenerations × 4 containing the maximum fitness, minimum fitness, average fitness and standard deviation of the fitness for each generation. The i-th row of generationStats contains the statistics for the i-th generation. When nGenerations<maxGenerations, rows greater than nGenerations - 1 are filled with NaN values. The four columns contain the following statistics calculated for each generation:
Column Description
1 Maximum Fitness
2 Minimum Fitness
3 Fitness Average
4 Fitness Standard Deviation
lastGeneration (Output)
The last generation produced by the genetic algorithm. Memory allocated to this data structure can be released using gaFreePopulation.

Description

Genetic algorithms search for the optimum individual in a population. This is defined as the individual with the highest fitness. Function geneticAlgorithm returns the fittest individual in the last generation. Mathematically, this is equivalent to finding the values of the phenotypes that maximize a user provided fitness function. Although there are no requirements that the fitness function be non-negative, in general, convergence to optimum fitness is faster when values of the fitness function are non-negative. Constraints can be applied by incorporating a penalty function within the fitness calculation. Phenotypes can consist of any combination of nominal, binary, integer and real values. Integer and real values must be encoded into a binary representation. This procedure provides for either Base-2 or Gray encoding. However, users can supply other encodings within the fitness function.

The function geneticAlgorithm uses the population data structure and fitness with simulated genetic processes of reproduction to search for the optimum individual, i.e., settings of phenotype values. Genetic algorithms have been successfully applied to a wide variety of optimization and search problems, see Holland (1975) and Goldberg (1985).

There are many refinements to the basic genetic algorithm originally described by Holland (1975). His basic algorithm begins with an initial population of n individuals, a fitness function, and probabilities for crossover and mutation of crossoverProb and mutationProb respectively. The initial population is transformed from one generation to the next using the following steps:

  1. Select n individuals from the current population to generate a mating pool.
  2. Apply crossover with probability crossoverProb to pairs of the selected individuals within the mating pool to produce two offspring.
  3. Apply mutation with probability mutationProb to the offspring to generate the next generation.
  4. Check stopping criteria. If they are met, stop and report the fittest individual within the last generation.

By default Holland’s approach to these steps are used. However, many variations of these can be selected using optional arguments.

The initial population can be generated automatically using gaRandomPopulation or it can be created by first creating individuals using gaIndividual and then a population for those individuals using gaPopulation.

By default Holland’s roulette wheel with replacement is used for selecting the mating pool. The optional argument selectionModel allows users to select alternate selection methods including remainder, tournament and stochastic universal selection. Default crossover and mutation probabilities are crossoverProb= 0.8 and mutationProb= 0.005. These defaults can be changed using the optional arguments crossoverProb and mutationProb.

In the original algorithm only a single crossover point was randomly selected. The optional argument crossovers allows users to designate any number of crossover points.

Standard crossover proceeds by combining the genes from both parents in the order found in those parents. Inversion crossover inverts this order for one of the parents. Inversion crossover is selected using the optional argument invertCrossover.

For certain problems, such as the traveling salesman problem, standard crossover can produce infeasible individuals. One approach is to assign zero fitness to those solutions, but this can be very inefficient. Partially matched crossover is an approach that ensures individuals are feasible for a certain class of problems. If the problem is best represented using nominal phenotypes with values 0, 1, …, nNominal-1 where all values must appear once and only once in the chromosome, then partially matched crossover preserves that condition. Partially matched crossover is selected using the optional argument pmxCrossover.

One issue with some applications of genetic algorithms is premature convergence or convergence to false local solutions. This can occur when dominant individuals within early generations take over the population prematurely reducing population diversity. One approach to this problem is fitness scaling. This implementation allows users to use either linear or sigma fitness scaling. By default, no scaling is used. However, the optional arguments linearScaling and sigmaScaling allow users to have fitness values automatically scaled before selection.

The genetic algorithm is stopped when any one of the stopping criteria is met. The algorithm is stopped when the number of generations exceeds maxGenerations or when the maximum fitness exceeds maxFitness. By default maxGenerations= 100; this can be changed using maxGenerations. By default maxFitness is machine(7); this can be changed using maxFitness.

On some platforms, geneticAlgorithm can evaluate the user-supplied function fitness in parallel. This is done only if the function ompOptions is called to flag user-defined functions as thread-safe. A function is thread-safe if there are no dependencies between calls. Such dependencies are usually the result of writing to global or static variables.

Examples

Example 1

De Jong (1975) examined the performance of a genetic algorithm for finding the maximum of a multivariate function. This is an example of optimizing a variation of De Jong’s R2 function:

\[f(x_1,x_2) = 4000 - 100\left(x_1^2 - x_2\right)^2 - \left(1-x_1\right)^2\]

where

\[-2.048 ≤ x_1 ≤ 2.048 \text{ and } -2.048 ≤ x_2 ≤ 2.048\]

Since there were only two real phenotypes and the function is easily calculated, the phenotypes were encoded using discretization with 65,536 values over the interval [-2.048, 2.048]. By default, encoding and decoding is done within geneticAlgorithm. This allows the fitness function to calculate individual fitness using the real phenotypes instead of the chromosome. Both the chromosome and its phenotype representation are available within the Imsls_d_individual data structure argument.

The default selection algorithm ROULETTE_WITH was used, but the number of crossover probability was set to 0.6. The genetic algorithm was more efficient using a lower crossover probability and Gray encoding instead of the defaults 0.7 and Base-2 encoding. Each generation consisted of 40 individuals.

from __future__ import print_function
import sys
from numpy import *
from pyimsl.stat.free import free
from pyimsl.stat.ompOptions import ompOptions
from pyimsl.stat.gaChromosome import gaChromosome
from pyimsl.stat.gaRandomPopulation import gaRandomPopulation
from pyimsl.stat.gaFreePopulation import gaFreePopulation
from pyimsl.stat.geneticAlgorithm import geneticAlgorithm, FINAL

from pyimsl.stat.randomSeedSet import randomSeedSet

######################################################################
# De Jong's R2 Function
######################################################################


def deJongR2(individual):
    x1 = individual[0].realPhenotype[0]
    x2 = individual[0].realPhenotype[1]
    f = 100 * (x1 * x1 - x2) * (x1 * x1 - x2) + (1.0 - x1) * (1.0 - x1)
    f = 4000 - f
    return f


n = 40                         # population size
generations = 0                # final number of generations
n_real = 2                     # number of real phenotypes
r_intervals = [65536, 65536]
r_bounds = [[-2.048, 2.048], [-2.048, 2.048]]
genStats = []               # generation statistics
last_generation = []           # last generation
dashes = "**************************************************************"

# In this example the user function is thread safe.  Let CNL
# know it is safe, which allows genetic algorithm to run in
# parallel, if that capability exists on the user computer.
ompOptions(setFunctionsThreadSafe=1)
randomSeedSet(12345)
real = {"rIntervals": r_intervals, "rBounds": r_bounds}
chromosome = gaChromosome(real=real)
population = gaRandomPopulation(n, chromosome, grayEncoding=True,
                                fitnessFcn=deJongR2)

n_generations = []
genStats = []
last_generation = []
best_individual = geneticAlgorithm(deJongR2, population,
                                   printLevel=FINAL,
                                   maxFitness=3999.999,
                                   crossoverProb=0.6,
                                   grayEncoding=True,
                                   maxGenerations=1000,
                                   nGenerations=n_generations,
                                   generationStats=genStats,
                                   lastGeneration=last_generation)

print("*****************GENERATION STATISTICS*****************")
print("Generation  Max. Fit.   Avg. Fit.  Min. Fit.     CV")
print("*******************************************************")
for i in range(0, n_generations[0] + 1):
    print("Gen. %3d: %11.5f %10.2f %10.2f %9.2f " % (
        i, genStats[i][0], genStats[i][2], genStats[i][1], 100 * genStats[i][3] / genStats[i][2]))
print("\nLAST GENERATION")
print(dashes)
print("\nIndv  Fitness              ")
print("Chromosome              X1     X2")
lastGen = last_generation[0]
for i in range(0, lastGen[0].n):
    print(" %2d   %6.2f  " % (i, lastGen[0].fitness[i]), end=' ')
    for j in range(0, lastGen[0].chromosome[0].c_length):
        sys.stdout.write("%d" % ((lastGen[0].individual[i])[
                         0].chromosome[0].allele[j]))
    sys.stdout.write("%7.3f %6.3f\n" % (
        (lastGen[0].individual[i])[0].realPhenotype[0],
        (lastGen[0].individual[i])[0].realPhenotype[1]))
print("\nMaximum:    %6.2f for Individual %d" % (
    lastGen[0].maxFitness,
    lastGen[0].indexFittest))
print("Minimum:    %6.2f for Individual %d" % (
    lastGen[0].minFitness,
    lastGen[0].indexWeakest))
print("Average:    %6.2f" % (lastGen[0].avgFitness))
print("Std. Dev:    %6.2f\n" % (lastGen[0].stdFitness))
print(dashes)

Output

In this example, the print level is set to FINAL in order to print the optimum solution. The generation statistics are requested using the generationStats option, and the last population is requested using the lastGeneration option.

Although the maximum number of generations is set to 100 using the lastGeneration option, the genetic algorithm halted after 26 generations when the maximum population fitness exceeded 3999.999.

*****************GENERATION STATISTICS*****************
Generation  Max. Fit.   Avg. Fit.  Min. Fit.     CV
*******************************************************
Gen.   0:  3996.12906    3244.16     578.92     28.62 
Gen.   1:  3996.25271    3725.90    2770.41      8.24 
Gen.   2:  3999.22964    3699.14    1917.26     10.28 
Gen.   3:  3999.22964    3683.41    2551.70      9.87 
Gen.   4:  3999.73775    3778.83    2551.70      8.35 
Gen.   5:  3999.73775    3823.50    3187.72      5.22 
Gen.   6:  3999.73775    3796.59    3187.72      5.76 
Gen.   7:  3999.73775    3850.94    3302.26      4.28 
Gen.   8:  3999.73775    3860.17    3358.92      3.93 
Gen.   9:  3999.73775    3886.33    3138.96      4.23 
Gen.  10:  3999.74678    3896.13    3292.64      3.85 
Gen.  11:  3999.74678    3900.24    3638.17      2.93 
Gen.  12:  3999.74678    3899.95    3376.36      3.17 
Gen.  13:  3999.74678    3900.57    3476.12      3.19 
Gen.  14:  3999.74678    3897.88    3408.28      3.36 
Gen.  15:  3999.74678    3908.28    2331.26      3.36 
Gen.  16:  3999.99596    3897.28    3301.18      3.94 
Gen.  17:  3999.99596    3910.99    3236.92      3.31 
Gen.  18:  3999.99596    3953.46    3429.17      1.31 
Gen.  19:  3999.99596    3944.98    3764.08      1.41 
Gen.  20:  3999.99596    3945.62    3751.01      1.41 
Gen.  21:  3999.99596    3934.07    3751.10      1.81 
Gen.  22:  3999.99596    3947.08    3739.52      1.62 
Gen.  23:  3999.99604    3943.99    3652.86      1.95 
Gen.  24:  3999.99604    3942.42    3652.86      1.99 
Gen.  25:  3999.99917    3970.69    3845.42      0.92 
Gen.  26:  3999.99944    3970.61    3845.42      0.90 

LAST GENERATION
**************************************************************

Indv  Fitness              
Chromosome              X1     X2
  0   3916.58   11100000000011001100110001010011  1.023  0.134
  1   3996.09   01010000000001011100110001010011 -0.512  0.134
  2   3965.22   11010000000011101110111110011011  0.511  0.849
  3   3993.47   01100101000001011010000001100011 -0.928  1.028
  4   3941.02   11010000000001011010000001100011  0.512  1.028
  5   3998.11   11110000000001011100110001110011  0.512  0.134
  6   3951.04   01100100000101011100110001100011 -0.898  0.132
  7   3919.76   01000000000001011110110001110011 -0.000  0.890
  8   3965.35   11110000000001011110111110011011  0.512  0.849
  9   3913.56   11010000000011101010111101110011  0.511  1.190
 10   3948.39   11010000000011101110001110110011  0.511  0.978
 11   3996.04   01010000000001011100110001100011 -0.512  0.132
 12   3997.44   11100000100111001110111101101001  1.009  0.859
 13   3992.45   11110000000001110100000001000011  0.512 -0.008
 14   3958.07   01010000000101011110110001110011 -0.510  0.890
 15   3951.33   01100100000001011100110001100011 -0.896  0.132
 16   3948.39   11010000000011101110001110110011  0.511  0.978
 17   4000.00   11100000000001011010000111010001  1.024  1.046
 18   3999.96   11100000000001011010000001100011  1.024  1.028
 19   3925.31   01010110010001011010000111010001 -0.440  1.046
 20   3999.37   11010000000001011101111001101111  0.512  0.325
 21   3845.42   01100101010110110101110001100011 -0.921 -0.380
 22   3999.64   11010101000011101100111001000011  0.415  0.184
 23   3992.37   01100000000011101110111101101001 -1.023  0.859
 24   3990.55   01010000000001110100000001110011 -0.512 -0.006
 25   3993.26   11110000000011101100000001000011  0.513  0.008
 26   3964.15   11010000000011101110111101110011  0.511  0.858
 27   3998.36   11010000000011101100110010011011  0.511  0.143
 28   3950.63   01100101000011101100111001000011 -0.927  0.184
 29   3996.08   01010000000001011100110001110011 -0.512  0.134
 30   3997.89   01010110000011101100111001100111 -0.447  0.188
 31   3998.11   11110000000001011100110001110011  0.512  0.134
 32   3993.47   01100101000001011010000001100011 -0.928  1.028
 33   3992.45   11110000000001110100000001000011  0.512 -0.008
 34   3916.61   11100000000011011100110001010011  1.023  0.134
 35   3929.38   01010110000001011010000001100011 -0.448  1.028
 36   3997.89   01010110000011101100111001100011 -0.447  0.188
 37   4000.00   11100000000001011010000111010001  1.024  1.046
 38   4000.00   11100000000001011010000111000011  1.024  1.048
 39   3991.28   01100100000001011010000001110001 -0.896  1.030

Maximum:    4000.00 for Individual 38
Minimum:    3845.42 for Individual 21
Average:    3970.61
Std. Dev:     35.71

**************************************************************

OPTIMUM SOLUTION

   Fitness: 3999.999442
   Phenotypes:
      Real:     2
   Function Calculations: 1080
   Population Size:       40
   Number of Generations: 26

   Real Phenotype(s): 

      1.023594 1.047844 

   Chromosome (Gray Encoded):   

      11100000000001011010000111000011

Example 2

The traveling salesman problem creates a problem for traditional crossover. In this problem, the objective is to find the shortest route while traveling to each city once. In this example, there are eight cities, labeled using the letters a-h, with distances ranging from 17 to 113 miles.

Traditional crossover would create unfeasible routes; that is some routes after crossover would not visit every city once. Some would not be visited and others would be visited more than once.

Partially matched crossover (PMX) preserves the feasibility of a route. In the general sense, PMX assumes that the nominal phenotypes consists of a string of numbers from zero to nNominal-1, with each number appearing once and only once in that string. Partially matched crossover uses two crossover points within the nominal portion of the chromosome and swaps the middle segment between the parents. The first and third segments are manipulated to ensure the resulting offspring is feasible.

from __future__ import print_function
import sys
from numpy import *
from pyimsl.stat.free import free
from pyimsl.stat.ompOptions import ompOptions
from pyimsl.stat.gaChromosome import gaChromosome
from pyimsl.stat.gaRandomPopulation import gaRandomPopulation
from pyimsl.stat.gaFreeIndividual import gaFreeIndividual
from pyimsl.stat.gaFreePopulation import gaFreePopulation
from pyimsl.stat.geneticAlgorithm import geneticAlgorithm, FINAL

from pyimsl.stat.randomSeedSet import randomSeedSet

###########################################################################
# FITNESS FUNCTION
###########################################################################


def pmxFitness(individual):
    i = 0
    k = 0
    n_nominal = 8    # number of nominal phenotypes
    f = 0.0          # fitness value
    # cities:    a  b   c   d   e   f   g   h
    distances = [0, 17, 27, 73, 61, 57, 51, 23,   # a
                 17, 0, 37, 73, 72, 74, 66, 40,   # b
                 27, 37, 0, 48, 35, 49, 65, 50,   # c
                 73, 73, 48, 0, 47, 82, 113, 95,  # d
                 61, 72, 35, 47, 0, 38, 80, 78,   # e
                 57, 74, 49, 82, 38, 0, 48, 65,   # f
                 51, 66, 65, 113, 80, 48, 0, 40,  # g
                 23, 40, 50, 95, 78, 65, 40, 0]   # h
    n_nominal = individual[0].chromosome[0].n_nominal
    k = individual[0].chromosome[0].nominalIndex + 1
    for i in range(k, k + n_nominal - 1):
        i1 = individual[0].nominalPhenotype[i - 1]
        i2 = individual[0].nominalPhenotype[i]
        f += distances[i1 * n_nominal + i2]
    return 516 - f


n = 50                         # population size
n_nominal = 8                  # number of nominal phenotypes
n_categories = [8, 8, 8, 8, 8, 8, 8, 8]  # nominal category limits
genStats = []                  # generation statistics
last_generation = []           # last generation
cities = ["a", "b", "c", "d", "e", "f", "g", "h"]

###########################################################################
# In this example the user function is thread safe. Let CNL know it is
# safe, which allows genetic algorithm to run in parallel, if that
# capability exists on the user computer.
###########################################################################
ompOptions(setFunctionsThreadSafe=1)
randomSeedSet(12345)
chromosome = gaChromosome(nominal=n_categories)

population = gaRandomPopulation(n, chromosome, pmxCrossover=True,
                                fitnessFcn=pmxFitness)
genStats = []
n_generations = []
last_generation = []
best_individual = geneticAlgorithm(pmxFitness, population,
                                   printLevel=FINAL,
                                   pmxCrossover=True,
                                   invertCrossover=True,
                                   crossoverProb=0.8,
                                   maxGenerations=10,
                                   generationStats=genStats,
                                   nGenerations=n_generations,
                                   lastGeneration=last_generation)

print("GENERATION STATISTICS")
print("Total Number of Generations: %d\n" % (n_generations[0]))
print("Generation  Max. Fit. Min. Fit.  Avg. Fit.    CV")
for i in range(0, n_generations[0] + 1):
    print("Gen. %3d: %8.0f %8.0f %12.2f %8.2f" % (
        i, genStats[i][0], genStats[i][1], genStats[i][2],
        100 * genStats[i][3] / genStats[i][2]))
print("\n\n            LAST GENERATION")
print("*************************************")
print("Individual  Fitness  Phenotype Values ")
lastGen = last_generation[0]
avg = lastGen[0].avgFitness
for i in range(0, lastGen[0].n):
    x1 = lastGen[0].fitness[i]
    print("    %2d     %6.0f    " % (i, x1), end=' ')
    for j in range(0, lastGen[0].chromosome[0].c_length):
        k = (lastGen[0].individual[i])[0].nominalPhenotype[j]
        print(cities[k], end=' ')
    if (x1 == lastGen[0].maxFitness):
        print("***")
    else:
        print("")

print("Average Fitness: %6.1f\n" % (avg))
print("*************************************")
sys.stdout.write("OPTIMUM SOLUTION:\n\nFitness:%4.0f\n\nChromosome:        " % (
    pmxFitness(best_individual)))
for i in range(0, best_individual[0].chromosome[0].c_length):
    sys.stdout.write("%3d" % (best_individual[0].chromosome[0].allele[i]))
sys.stdout.write("\n\nPhenotype Values:  ")
for i in range(0, n_nominal):
    k = best_individual[0].nominalPhenotype[i]
    sys.stdout.write("->%s" % (cities[k]))
sys.stdout.write("\n\n")

print("freeing best individual")
gaFreeIndividual(best_individual)
print("freeing last generation")
gaFreePopulation(lastGen)
gaFreePopulation(population)

Output

This program produced the following output. Since the print level was set to FINAL, the optimum solution was printed. The generation statistics were requested using the generationStats option, and the last population was requested using the lastGeneration option.

The maximum number of generations was set to 10. The genetic algorithm found the optimum route after evaluating the fitness of 550 routes in 10 generations. Generation zero is the initial generation provided to the algorithm and is not counted towards the maximum generation count.

GENERATION STATISTICS
Total Number of Generations: 10

Generation  Max. Fit. Min. Fit.  Avg. Fit.    CV
Gen.   0:      194       29       113.64    33.55
Gen.   1:      251       24       119.10    35.08
Gen.   2:      251       37       131.16    35.99
Gen.   3:      251       38       128.96    33.68
Gen.   4:      251       29       135.08    33.25
Gen.   5:      255       38       136.18    33.44
Gen.   6:      255       28       142.38    33.23
Gen.   7:      255       56       150.56    29.31
Gen.   8:      269       55       155.06    28.88
Gen.   9:      269       56       148.62    26.68
Gen.  10:      269       48       146.32    29.59


            LAST GENERATION
*************************************
Individual  Fitness  Phenotype Values 
     0        106     a c d g b h e f 
     1        269     d e c b a h g f ***
     2        222     d e c h g f a b 
     3        154     e a g h b d c f 
     4        215     e d c a b f g h 
     5        215     e d c a b f g h 
     6        158     a h b c d f g e 
     7        113     c a d b f e h g 
     8         78     f a d b h c g e 
     9        125     e d h g b c f a 
    10        118     c b e g f h a d 
    11        209     d e c b h g a f 
    12        161     h b a d c f g e 
    13        167     g d b a h c e f 
    14        169     a f h g b c e d 
    15        137     f g a e d h b c 
    16        127     h d a b c f e g 
    17         90     b d h g c e a f 
    18        216     d h b a c e f g 
    19        167     g d e c f h a b 
    20        120     e h f c d a b g 
    21         86     f a h d b c g e 
    22        171     a b d f g h c e 
    23        156     d h g a c e f b 
    24        146     c e a d b h g f 
    25        116     f d g e c h a b 
    26        118     b c a h e f d g 
    27         95     d g a b f e h c 
    28        143     d h b a f c e g 
    29        157     b f c g h a e d 
    30        175     b e a h g f c d 
    31        146     c e a d b h g f 
    32        111     c b e h g f a d 
    33        140     b e a f g h c d 
    34        164     b c e d g h a f 
    35        166     f e b c a h g d 
    36        199     b a g h f c d e 
    37        171     e d f a b c g h 
    38         97     f a c d h b e g 
    39        222     d e c h g f a b 
    40        128     d h c g a b e f 
    41         99     e g d c b a f h 
    42        112     d h f c b a e g 
    43        107     h d a b e f c g 
    44        143     c h a b f d e g 
    45        144     g a c e d b f h 
    46         64     a e c g h d f b 
    47        103     b d f h g c a e 
    48        129     d a g b f e c h 
    49        172     f h b g a c d e 
Average Fitness:  146.3

*************************************
OPTIMUM SOLUTION:

Fitness: 269

Chromosome:          3  4  2  1  0  7  6  5

Phenotype Values:  ->d->e->c->b->a->h->g->f

freeing best individual
freeing last generation

OPTIMUM SOLUTION

   Fitness: 269.000000
   Phenotypes:
      Nominal:  8
   Function Calculations: 550
   Population Size:       50
   Number of Generations: 10

   Nominal Phenotype(s): 

        3   4   2   1   0   7   6   5 

   Chromosome (Base-2 Encoded): 

      3 4 2 1 0 7 6 5

Example 3

This example uses the N-Queens problem to illustrate the use of a fitness function with parameters in implementing a genetic algorithm. The N-Queens problem is derived from chess. The genetic algorithm provides an efficient search for a valid solution. For this problem the chess board consists of N rows and N columns. The objective is to place N queens on this board with no conflicts. A conflict occurs when one queen can move and capture another. Since queens can move diagonally, vertically and horizontally this problem is challenging when N becomes large.

One solution for N = 4 is displayed in the following table. A valid solution must place every queen in a different row and column. The problem is to ensure the queens are not in conflict because of lying on the same diagonal.

Row/Col 0 1 2 3
0     Q  
1 Q      
2       Q
3   Q    

Similar to the traveling salesman problem, the N-Queens problem can be expressed using N nominal phenotypes with values 0, 1, …, N-1. The value of the i-th phenotype represents the row number for the queen in the i-th column. This ensures that any arrangement of the phenotype values represents a board with N queens, each placed in a different row and column.

The solution for N queens displayed above can be represented by the phenotype values {1, 3, 0, 2}. The search looks for arrangements that also do not place queens on the same diagonal. Two queens fall on the same diagonal if the absolute value of the difference between their row numbers equals the absolute value of the difference between their column numbers.

This example uses this representation with a fitness function equal to (N – C) where C equals the number of conflict among the queens. With this fitness, a solution to the N-Queens has a fitness of N.

from __future__ import print_function
import sys
from numpy import *
from pyimsl.stat.free import free
from pyimsl.stat.ompOptions import ompOptions
from pyimsl.stat.gaChromosome import gaChromosome
from pyimsl.stat.gaRandomPopulation import gaRandomPopulation
from pyimsl.stat.gaFreeIndividual import gaFreeIndividual
from pyimsl.stat.gaFreePopulation import gaFreePopulation
from pyimsl.stat.geneticAlgorithm import geneticAlgorithm, FINAL

from pyimsl.stat.randomSeedSet import randomSeedSet

# This algorithm seems to run really fast until n_queens is > 10,
# so we'll leave it at 10 for testing.
#
# n_queens = 25  # number of nominal phenotypes
n_queens = 10  # number of nominal phenotypes
n = 500        # population size
n_categories = empty((n_queens), dtype=double)


def queensFitness(individual):
    global n_queens
    f = 0      # Fitness value
    for i in range(0, n_queens - 1):
        for j in range(i + 1, n_queens):
            k = individual[0].chromosome[0].allele[i] - \
                individual[0].chromosome[0].allele[j]
            k = abs(k)
            if (abs(i - j) == k):
                f += 1

    f = n_queens - f
    return f


###########################################################################
# In this example the user function is thread safe. Let CNL know it is
# safe, which allows genetic algorithm to run in parallel, if that
# capability exists on the user computer.
###########################################################################
ompOptions(setFunctionsThreadSafe=1)
randomSeedSet(12345)
maxFit = n_queens - 0.5
for i in range(0, n_queens):
    n_categories[i] = n_queens
chromosome = gaChromosome(nominal=n_categories)

population = gaRandomPopulation(n, chromosome, pmxCrossover=True,
                                fitnessFcn=queensFitness)
genStats = []
n_generations = []
best_individual = geneticAlgorithm(queensFitness, population,
                                   printLevel=FINAL,
                                   pmxCrossover=True,
                                   linearScaling=2.0,
                                   crossoverProb=0.7,
                                   mutationProb=0.01,
                                   maxGenerations=10000,
                                   maxFitness=maxFit,
                                   generationStats=genStats,
                                   nGenerations=n_generations)

print("GENERATION STATISTICS")
print("Total Number of Generations: %d\n" % (n_generations[0]))
print("Generation  Max. Fit. Min. Fit.  Avg. Fit.    CV")
print("*************************************************")
for i in range(0, n_generations[0] + 1, 25):
    print("Gen. %3d: %8.0f %8.0f %12.2f %8.2f" % (
        i, genStats[i][0], genStats[i][1], genStats[i][2],
        100 * genStats[i][3] / genStats[i][2]))
i = n_generations[0]
print("Gen. %3d: %8.0f %8.0f %12.2f %8.2f" % (
    i, genStats[i][0], genStats[i][1], genStats[i][2],
    100 * genStats[i][3] / genStats[i][2]))
print("*************************************************\n")
sys.stdout.write("OPTIMUM SOLUTION:\n\nFitness:%4.0f\n\nChromosome:        " % (
    queensFitness(best_individual)))
for i in range(0, best_individual[0].chromosome[0].c_length):
    sys.stdout.write("%3d" % (best_individual[0].chromosome[0].allele[i]))
if(n_queens < 100):
    print("\n\nBoard Positions:  \n")
    for i in range(0, n_queens):
        sys.stdout.write("--")
    sys.stdout.write("-\n")

    for i in range(0, n_queens):
        for j in range(0, n_queens):
            k = best_individual[0].nominalPhenotype[j]
            if(i == k):
                sys.stdout.write("|Q")
            else:
                sys.stdout.write("| ")
        sys.stdout.write("|\n")
        for k in range(0, n_queens):
            sys.stdout.write("--")
        sys.stdout.write("-\n")
    sys.stdout.write("\n\n")

Output

This program produced the following solution to the N-Queens problem with \(N=50\) queens. Notice that some of the minimum fitness values are negative. This alters the random selection of the fittest parents, but if these values are few and small, then the effect is not enough to halt the genetic algorithm.

For 50 queens there are over 1064 ways of placing queens ensuring each is in its own row and column. An exhaustive search of these possible solutions to find a arrangement without diagonal conflicts would be time consuming. The genetic algorithm search found a solution in only 568 generations, requiring 284,500 function evaluations.

GENERATION STATISTICS
Total Number of Generations: 26

Generation  Max. Fit. Min. Fit.  Avg. Fit.    CV
*************************************************
Gen.   0:        9       -9         3.62    66.71
Gen.  25:       10       -8         4.68    43.25
Gen.  26:       10       -4         4.44    46.15
*************************************************

OPTIMUM SOLUTION:

Fitness:  10

Chromosome:          6  2  5  7  0  4  8  1  3  9

Board Positions:  

---------------------
| | | | |Q| | | | | |
---------------------
| | | | | | | |Q| | |
---------------------
| |Q| | | | | | | | |
---------------------
| | | | | | | | |Q| |
---------------------
| | | | | |Q| | | | |
---------------------
| | |Q| | | | | | | |
---------------------
|Q| | | | | | | | | |
---------------------
| | | |Q| | | | | | |
---------------------
| | | | | | |Q| | | |
---------------------
| | | | | | | | | |Q|
---------------------



OPTIMUM SOLUTION

   Fitness: 10.000000
   Phenotypes:
      Nominal:  10
   Function Calculations: 13500
   Population Size:       500
   Number of Generations: 26

   Nominal Phenotype(s): 

        6   2   5   7   0   4   8   1   3   9 

   Chromosome (Base-2 Encoded): 

      6 2 5 7 0 4 8 1 3 9

Fatal Errors

IMSLS_STOP_USER_FCN

Request from user supplied function to stop algorithm.

User flag = “#”.