FGA tutorial


Introduction to genetic algorithms

A genetic algorithm is a search heuristic used to solve optimization problems. It belongs to the paradigm of evolutionary computation and is inspired by biological models of evolution.

The possible solutions of a problem are encoded in chromosomes: firstly, a random population of chromosomes is created; then each chromosome is evaluated by the “fitness” function and the best chromosomes are selected for reproduction. The probability of being chosen is proportional to fitness:

p_i = f_i / (f_1 + f_2 + … + f_N)

(where f_i is the fitness of chromosome i and N is the size of the population)

(note that FGA also offers other ways to select chromosomes).
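For illustration, fitness-proportional selection is often implemented as a “roulette wheel”. The following is a generic sketch of that scheme (not FGA's internal code), assuming non-negative fitness values:

#include <cstdlib>

// Roulette-wheel selection: return index i with probability
// f_i / (f_1 + ... + f_N). Generic sketch; assumes fitness >= 0.
int roulette_select(const float *fitness, int n)
{
  float total = 0.0f;
  for (int i = 0; i < n; ++i)
    total += fitness[i];
  float r = total * (std::rand() / (RAND_MAX + 1.0f)); // point in [0, total)
  float accum = 0.0f;
  for (int i = 0; i < n; ++i) {
    accum += fitness[i];
    if (r < accum)
      return i;
  }
  return n - 1; // guard against floating-point rounding
}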

Two parent chromosomes generate two children chromosomes through the “crossover” operator; FGA implements different types of crossover and lets the user define custom ones. Here are two examples of crossover:





[figure: one-point crossover and two-point crossover]
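As a concrete illustration, one-point crossover on two equal-length strings can be written as follows (a generic sketch, not necessarily FGA's built-in routine):

#include <cstdlib>
#include <algorithm>

// One-point crossover: pick a random cut point and swap the tails
// of the two parents in place, producing the two children.
void one_point_crossover(char *parent1, char *parent2, int length)
{
  int cut = std::rand() % length;
  for (int i = cut; i < length; ++i)
    std::swap(parent1[i], parent2[i]);
}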

Then comes the mutation stage, where each gene of a chromosome has a certain probability of being changed (with FGA it is also possible to define different mutation routines).
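Conceptually, this stage amounts to a loop like the following sketch (mutation_rate and mutate_gene are illustrative names, not part of FGA's public interface):

#include <cstdlib>

// Per-gene mutation: each gene is independently replaced with
// probability mutation_rate (generic sketch).
void mutate(char *chromosome, int length, float mutation_rate,
            char (*mutate_gene)(char))
{
  for (int i = 0; i < length; ++i)
    if (std::rand() / (RAND_MAX + 1.0f) < mutation_rate)
      chromosome[i] = mutate_gene(chromosome[i]);
}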

These steps are repeated until the best chromosome in the population satisfies some minimum criteria, or until the best fitness stabilizes at a certain value.


Using FGA

To compile the examples on Linux/Unix platforms, launch “make” from a shell.

You can also issue "make install" (with root privileges) if you wish to install FGA system-wide; after that, fga.hpp will be located in /usr/local/include, all documentation in /usr/local/share/doc/fga and example source code in /usr/local/share/examples/fga; compiled example binaries will be in /usr/local/bin.

"make uninstall" deletes all files installed with the previous command.

On Windows, you can use the included project file for Visual Studio (“project-win32-visualstudio.sln”) or for Dev-C++ (“project-win32-devcpp.dev”).


The header fga.hpp implements two template classes: Population and PopulationMT. The first implements the sequential genetic algorithm, while the second builds a parallel, multi-threaded version of it.

The user must define the data type of genes, the size of the population, the number of threads to be executed (in the case of PopulationMT) and some functions that are used to generate, evaluate, combine and mutate chromosomes, the most important being the “fitness” function. All the functions are passed as function pointers.

Then run() or cycle() must be called to start the algorithm; there are, of course, functions that return the best chromosome of the population.

For example, let's assume chromosomes are strings of characters:

#include "fga.hpp"

float my_fitness_function(char *chromosome)
{
  // compute fitness value 
  return fitness_of_chromosome; 
}

char my_mutate_gene(char gene)
{
  // compute mutated gene
  return mutated_gene;
}

char my_random_gene()
{
  // compute a random gene
  return random_gene;
}

int main()
{
  Population<char> my_population(number_of_chromosomes, length_of_chromosomes,
                                 my_fitness_function, my_mutate_gene, my_random_gene,
                                 NULL, NULL, NULL, NULL);
  my_population.run(minimum_tolerated_fitness, maximum_number_of_generations);
  float best_score = my_population.get_best_score();
  char *best_chromosome = my_population.get_best();
}

OR (for another version of run())

int main()
{
  // ...
  my_population.run(); // runs until the population converges
  // ...
}

OR

int main()
{
  // ...
  while (some_condition) {
    my_population.cycle();
    float best_score = my_population.get_best_score();
    char *best_chromosome = my_population.get_best();
  }
  // ...
}

OR (for custom crossover operator, chromosome-wide routines and custom selection operator)

float my_fitness_function(char *chromosome)
{
  // compute fitness value
  return fitness_of_chromosome;
}

void my_mutate_chromosome(char *chromosome)
{
  // mutate the chromosome in place
}

void my_random_chromosome(char *chromosome)
{
  // generate a chromosome in the given buffer
}

void my_crossover_operator(char *chromosome1, char *chromosome2)
{
  // cross over the two chromosomes
}

char *my_select_chromosome(char **chromosomes)
{
  // return a pointer to the selected chromosome, chosen from the “chromosomes” array
}

int main()
{
  // ...
  Population<char> my_population(number_of_chromosomes, length_of_chromosomes,
                                 my_fitness_function, NULL, NULL,
                                 my_crossover_operator, my_mutate_chromosome,
                                 my_random_chromosome, my_select_chromosome);
  // ...
}

OR (for multi-threaded algorithm)

int main()
{
  // ...
  PopulationMT<char> my_population(number_of_threads, chromosomes_per_thread,
                                   length_of_chromosomes, my_fitness_function,
                                   my_mutate_gene, my_random_gene,
                                   NULL, NULL, NULL, NULL);
  // ...
}

For a complete list of the available functions refer to the FGA manual.


The maxbit example shows the usage of the FGA library on a very simple problem: maximizing the number of 1-bits in a binary string of fixed length.
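Assuming genes are the characters '0' and '1', a plausible fitness function for maxbit simply counts the 1-bits; here is a sketch (the bundled example may differ in its details):

// Fitness = number of '1' genes in the string.
const int length_of_chromosomes = 32; // illustrative value

float maxbit_fitness(char *chromosome)
{
  float ones = 0.0f;
  for (int i = 0; i < length_of_chromosomes; ++i)
    if (chromosome[i] == '1')
      ones += 1.0f;
  return ones;
}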

The following is a less trivial application of genetic algorithms.


Solving the travelling salesman problem

The “travelling salesman problem” (often called TSP) consists of finding the cheapest Hamiltonian tour that visits all the nodes (“cities”) of a graph.

The TSP is proven to be NP-hard: finding an optimal solution takes O(n!) time by exhaustive search or, using dynamic programming, O(n^2 · 2^n), which is in any case exponential.

This is because such an algorithm involves trying all the possible permutations of the sequence of nodes traversed in the tour.

Using a heuristic based on genetic algorithms, a good solution to the problem can be found in a reasonable amount of time, even though it is not guaranteed to be the optimal one.

The main issue is choosing how to represent a tour in a chromosome and how to implement the crossover operator: if a tour is represented (as is natural) by a permutation of the nodes, the standard operators don't guarantee that the strings they produce represent valid Hamiltonian tours. Different approaches have been devised to solve this problem by changing the representation; for a brief review of them see [1].

My choice was to keep the simplest representation and define new crossover and mutation operators. The crossover selects a random sub-string from parent 1 and joins it with the maximum compatible sub-string from parent 2 (and vice versa for child 2); then, if needed, it completes the tour with the missing nodes in random order. For example, consider a TSP with 5 cities:

parent 1: A – C – E – B – D

parent 2: B – E – A – D – C

random sub-string from parent 1: C – E –

maximum compatible sub-string from parent 2: – E – A – D –

sub-strings joined: C – E – A – D –

child 1 (completed with missing nodes): C – E – A – D – B

This new crossover operator is particularly good at preserving genetic material inherited from both parents, and it always produces well-formed Hamiltonian tours.
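The following sketch conveys the idea in simplified form: it keeps a random slice of parent 1 and fills the remaining positions with the unused cities in the order they appear in parent 2. This is closer to the classical order crossover than to the exact operator used in the TSP example, but it shares the key property of always producing valid tours (cities are assumed to be numbered 0 .. n-1):

#include <cstdlib>
#include <algorithm>
#include <vector>

// Keep a random slice of parent 1, then fill the remaining positions
// of the child with the unused cities in parent 2's order.
void crossover_tour(const int *p1, const int *p2, int *child, int n)
{
  int a = std::rand() % n, b = std::rand() % n;
  if (a > b)
    std::swap(a, b);
  std::vector<bool> used(n, false);
  for (int i = a; i <= b; ++i) {        // copy the slice from parent 1
    child[i] = p1[i];
    used[p1[i]] = true;
  }
  int pos = 0;
  for (int i = 0; i < n; ++i) {         // fill the rest from parent 2
    if (used[p2[i]])
      continue;
    while (pos >= a && pos <= b)        // skip over the copied slice
      ++pos;
    child[pos++] = p2[i];
  }
}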

The mutation operator simply swaps two random nodes:

child 1 (mutated): C – D – A – E – B

Now the mutation rate has a new meaning: it represents the probability that a chromosome will be subject to mutation.
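Such a swap mutation takes only a few lines; here is a sketch, with the rate applied per chromosome as just described:

#include <cstdlib>
#include <algorithm>

// With probability mutation_rate, swap two random cities of the tour.
void mutate_tour(int *tour, int n, float mutation_rate)
{
  if (std::rand() / (RAND_MAX + 1.0f) >= mutation_rate)
    return;
  int i = std::rand() % n;
  int j = std::rand() % n;
  std::swap(tour[i], tour[j]);
}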

The “tournament” selection operator was chosen, with a number of N / 10 candidates (where N is the total size of the population).
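Tournament selection itself is straightforward: draw the candidates at random and keep the fittest. A generic sketch, assuming the fitness scores have been precomputed:

#include <cstdlib>

// Tournament selection: pick n_candidates chromosomes at random and
// return the index of the best one (here n_candidates would be N / 10).
int tournament_select(const float *scores, int n, int n_candidates)
{
  int best = std::rand() % n;
  for (int k = 1; k < n_candidates; ++k) {
    int challenger = std::rand() % n;
    if (scores[challenger] > scores[best])
      best = challenger;
  }
  return best;
}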

Comparing the algorithm with the classical brute-force approach shows the benefit of using genetic algorithms to solve problems that lack a polynomial-time solution.

$ ./graph_gen
L = 5
MAX_COST = 10

$ time ./tsp_bf
Brute force algorithm
Progress: 100% (120 of 120 paths generated)
Best cost: 10
Best path:
2 4 0 1 3

real    0m0.071s
user    0m0.040s
sys     0m0.024s

$ time ./tsp
Generations: 1512
Mutation rate: 0.7
Best cost: 10
Best path:
2 4 0 1 3
Best path was stable for 1500 generations. Algorithm halted.

real    0m2.024s
user    0m1.212s
sys     0m0.687s

With a graph of only 5 nodes, the brute-force algorithm found the optimal solution instantly; the genetic algorithm also found it, but had to wait a fixed number of generations before accepting it as the final result.

If we increase the size of the graph, the exponential growth of the brute-force running time becomes evident:

$ ./graph_gen
L = 25
MAX_COST = 100

$ time ./tsp
Generations: 11000
Mutation rate: 0.8
Best cost: 252
Best path:
15 11 13 17 22 23 24 12 4 7 10 19 9 21 1 2 6 0 20 5 16 8 18 3 14
Best path was stable for 7500 generations. Algorithm halted.

real    0m15.060s
user    0m10.184s
sys     0m4.087s

$ time ./tsp_bf
Brute force algorithm
Progress: 57440000 paths generated
Best cost: 894
Best path:
0 1 2 3 4 5 6 7 8 9 10 11 12 13 17 22 23 15 18 20 24 21 16 19 14

real    1m14.164s
user    0m49.537s
sys     0m19.418s

After more than a minute we had to stop the process: the brute-force solver was still at ~0% of its progress, and the solution found up to that point was visibly worse than the one returned by the genetic algorithm.

More precisely, the brute-force version had generated ~10^8 of the ~10^25 possible paths (25! = 15511210043330985984000000 ≈ 10^25), so we could expect it to terminate in the order of 10^17 minutes, which corresponds to ~10^11 years.

The genetic algorithm took 15 seconds to return a good approximate solution.

Testing the TSP solver on TSPLIB's data showed that it finds solutions very close to optimal even on critical instances of the problem.

A graphical version of the TSP solver, as well as some screen-shots, can be found here: http://fga.sourceforge.net/tsp_graphic.html.



[1] http://www.cs.uml.edu/~giam/91.510/Lectures/Lecture6.ppt


Copyright © 2007 Alessandro Presta