gaChromosome¶
Creates an Imsls_d_chromosome data structure containing unencoded and encoded phenotype information.
Synopsis¶
gaChromosome ()
Return Value¶
The function gaChromosome
returns an Imsls_d_chromosome data
structure, which is required input to gaIndividual
, gaPopulation
,
and gaRandomPopulation
. The memory allocated to this data structure can
be released using free.
Optional Arguments¶
t_print
, (Input)- By default, summary results are not printed. This option causes the function to print summary results describing the chromosome structure.
binary
(Input)The number of binary phenotypes.
Default:
binary
= 0.nominal
, intnCategories[]
(Input)The first parameter
nNominal
is equal to the number of nominal phenotypes.nCategories
is an array of lengthnNominal
containing the number of nominal categories for each nominal phenotype. Each value of this array must be at least 2 or greater. If partially matched crossover is used during the genetic algorithm search then the arraynCategories
is ignored since all of its values are assumed equal tonNominal
.Default:
nNominal
= 0.integer
, intiIntervals[]
, intiBounds[]
(Input)The first parameter
nInteger
is equal to the number of integer phenotypes. The second parameter in this argument,iIntervals
, is a one-dimensional array of lengthnInteger
containing the number of discrete intervals used to map each integer into the chromosome loci. For efficiency, each value in this array should be a power of 2 such as 2, 4, 8, 16, etc. The third parameter,iBounds
is an array of sizenInteger
by 2 containing the lower and upper bounds for each integer phenotype. The lower and upper bounds for the i-th integer phenotype are equal toiBounds[
2*i]
andiBounds[
2 *i
+1]
respectively. Each integer value submitted to geneticAlgorithm for the i-th integer phenotype, \(\omega\) must conform to the inequality:\[\mathrm{iBounds}[2i] \leq w_i \leq \mathrm{iBounds}[2i+1]\]Default:
nInteger
= 0.real
, intrIntervals[]
, floatrBounds[]
(Input)The first parameter
nReal
is equal to the number of real phenotypes. Point values are mapped into chromosome loci using discretization. The second parameter in this argument,rIntervals
, is a one-dimensional array of lengthnReal
containing the number of discrete intervals used to map each real value into the chromosome loci. For efficiency, each value in this array should be a power of 2 such as 2, 4, 8, 16, etc. The third parameter,rBounds
is an array of sizenReal
by 2 containing the lower and upper bounds for each integer phenotype. The lower and upper bounds for the i-th real phenotype are equal torBounds[
2*i]
andrBounds[
2*i
+1]
respectively. Hence,rBounds[
2*i
+ 1]
must be greater thanrBounds[
2*i]
. Each real value submitted to geneticAlgorithm for the i-th real phenotype, \(\omega_i\), must conform to the inequality:\[\mathrm{rBounds}[2i] \leq w_i \leq \mathrm{rBounds}[2i+1]\]Default:
nReal
= 0.
Description¶
The genetic algorithm requires a chromosome representation of phenotypes. Most textbook applications of genetic algorithms use phenotypes that have a natural binary encoding. Real world problems often have non-binary phenotypes. Phenotypes are parameters used by the fitness function. Those can include any data type. This function allows for easy encoding of binary, nominal, integer and real phenotypes.
Binary phenotypes are mapped directly into the chromosome as binary bits.
Each binary phenotype is treated as a single allele. If the user specifies
binary>
0, then the first binary
bits in the chromosome are
allocated for encoding this information. When the fitness function is called
during genetic optimization, these bits are translated into zeros and ones
and then sent to the fitness function as an integer array of length
binary
.
Nominal phenotypes are mapped into the chromosome following the binary
phenotypes. The number of bits assigned to each nominal phenotype is
determined from the number of categories for each nominal phenotype. The
value nCategories[
i]
is equal to the number of categories for
the i-th nominal phenotype. The number of bits assigned to this category is
the smallest value of k such that \(2^k\) ≥ nCategories[i]
, i.e.,
\(k=\lceil\log_2 \left( \mathrm{nCategories}[i] \right) \rceil\), where
\(\lceil x \rceil\) is the ceiling of x (least integer of x). A
binary nominal phenotype would be assigned one bit, and one bit would
constitute a single allele. A trinary nominal phenotype would be assigned two
bits since \(2^2=4\geq 3\), and these bits would be treated as a single
allele.
The mapping of multiple bits to a single allele is a key difference between nominal phenotypes and other phenotypes. Alleles for binary, integer and real phenotypes are represented as single bits in the chromosome. The alleles for nominal phenotypes consist of multiple bits. Since crossover occurs between alleles, crossover for nominal phenotypes is treated differently. This ensures that only viable values for nominal phenotypes result from crossover.
It also means that Gray encoding of individual bits has no effect on nominal phenotypes. For many problems Gray encoding is used instead of standard Base-2 encoding to reduce large changes of encoded phenotype values after crossover. As a result, Gray encoding is never applied to nominal phenotypes.
In addition, partially mixed crossover is only an option for nominal phenotypes. Nominal phenotypes combined with partially mixed crossover make it easy to implement search problems similar to the traveling salesman problem.
Both integer and real phenotypes are discretized. Although this is the most common approach to encoding these phenotypes, some problems may benefit from other forms of encoding. If so, users should provide their own encoding, translating the phenotype into a bit representation that can be mapped into binary phenotypes.
Discretization is controlled by two arrays. For integer phenotypes, the array
iIntervals
contains the number of discrete intervals used to represent
each integer. The number of chromosome bits assigned to the i-th integer is
determined by the values in this array. If iIntervals[i]
= \(k=2^m\)
then the i-th integer phenotype is assigned m bits. For example, if
iIntervals[i]
= 4, then this phenotype is assigned two bits.
The array iBounds
contains the upper and lower bounds for each integer
phenotype. The lower bound for the i-th integer phenotype is equal to lb
= iBounds[
2i]
, and the upper bound is equal to ub =
iBounds[
2i+
1]
. The values for the i-th phenotype, w,
must satisfy the inequality lb ≤ w < ub. w is discretized to w’
using the formula:
Where ⌊x⌋ is the floor of x (greatest integer of x). This results in mapping the i-th integer phenotype, w, into one of the integers 0,1,…, k‑1.
Real phenotypes are handled in the same fashion as integer phenotypes using
the values in rIntervals
and rBounds
.
The number of chromosome bits assigned to each phenotype are described in the following table:
PHENOTYPE | NUMBER of BITS |
---|---|
Binary | \(\mathit{bbits}=\mathrm{nBinary}\) |
Nominal | \(\mathit{nbits}=\sum_{i=0}^{\mathrm{nNominal}-1} \lceil\log_2 \left( \mathrm{nCategories[i]} \right) \rceil\) |
Integer | \(\mathit{ibits}=\sum_{i=0}^{\mathrm{nInteger}-1} \log_2 \left( \mathrm{iIntervals[i]} \right)\) |
Real | \(\mathit{rbits}=\sum_{i=0}^{\mathrm{nReal}-1} \log_2 \left( \mathrm{rIntervals[i]} \right)\) |
See Table 13.4 for a description of the allele values (bits) for the chromosome. Chromosome bits are ordered first by binary phenotypes in bits 0 through bbits ‑1, then nominals, integers and reals in that order.
The memory allocated to this data structure can be released using free.
Example¶
This example creates a chromosome with 1 binary, 2 nominal, 3 integer and 2
real phenotypes. The t_print
argument is used to print a description of
the chromosome structure.
from numpy import *
from pyimsl.stat.gaChromosome import gaChromosome
from pyimsl.stat.free import free
n_binary = 1
n_nominal = 2
n_integer = 3
n_real = 2
# number of categories for nominal phenotypes
n_categories = [2, 3]
# number of intervals and boundaries for integer phenotypes
i_intervals = [16, 16, 16]
i_bounds = [[0, 1000], [-10, 10], [-20, 0]]
# number of intervals and boundaries for real phenotypes
r_intervals = [512, 1024]
r_bounds = [[0.0, 20.0], [-20.0, 20.0]]
integer = {"iIntervals": i_intervals, "iBounds": i_bounds}
real = {"rIntervals": r_intervals, "rBounds": r_bounds}
# Chromosome Data Types
chromosome = gaChromosome(binary=n_binary,
nominal=n_categories,
integer=integer,
real=real,
t_print=True)
Output¶
The t_print
option produced the following description of the chromosome.
The data structure uses 304 bytes. The chromosome has 34 alleles. The first
bit is used to represent the binary phenotype.
The next two alleles are assigned to the nominal phenotypes. The first phenotype will be encoded in allele 1 with the integers zero and one since it has only two categories. The second nominal phenotype has 3 categories. It will be encoded with the integers zero, one, and two.
The integer phenotypes are each assigned 4 binary bits. Since the number of intervals is the same for each integer, 16, 4 bits will be used to encode the integers 0-15. If Base-2 encoding is used, the \(16^{th}\) interval will be encoded as \(15=\{1111\}\).
The first real phenotype uses 512 intervals to discretize its value. This is encoded using 9 alleles. The second real phenotype uses 1024 intervals to discretize its value. This requires 10 alleles to properly represent the values 0 to 1023.
*******************************
**** CHROMOSOME STRUCTURE *****
Data Structure length: 352 Bytes
Chromosome length: 34 Bits
******ALLELE ASSIGNMENTS*******
Binary: 0 - 0 n_binary = 1
Nominal: 1 - 2 n_nominal= 2
Integer: 3 - 14 n_integer= 3
Real: 15 - 33 n_real = 2
*******************************
NOMINAL CATEGORIES*************
Variable 0: 2 categories
Variable 1: 3 categories
*******************************
INTEGER BOUNDS*****************
Variable 0: [ 0, 1000]
Variable 1: [ -10, 10]
Variable 2: [ -20, 0]
*******************************
INTEGER BITS*******************
Variable 0: 4 bits
Variable 1: 4 bits
Variable 2: 4 bits
*******************************
INTEGER DISCRETE INTERVALS*****
Variable 0: 16 intervals
Variable 1: 16 intervals
Variable 2: 16 intervals
*******************************
REAL BOUNDS********************
Variable 0: [0,20]
Variable 1: [-20,20]
*******************************
REAL BITS**********************
Variable 0: 9 bits
Variable 1: 10 bits
*******************************
REAL DISCRETE INTERVALS********
Variable 0: 512 intervals
Variable 1: 1024 intervals
*******************************