42 Stars: Genetic Programming Details

You build a genetic programming system by first deciding which terminal values and which operators you will use for your problem. Next, write the objective function. The objective function accepts an individual and evaluates that individual over your data set. The data set could be internal, as when you are trying to find an individual that returns the closest approximation to the square root of two, or it could be external.
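To make "terminal values", "operators", and "individual" concrete, here is a minimal Python sketch. It assumes an individual is an expression tree stored as nested tuples; the names `TERMINALS`, `OPERATORS`, `random_individual`, and `evaluate` are hypothetical and chosen only for illustration, and the particular terminals and operators are placeholders for whatever suits your problem.

```python
import operator
import random

# Hypothetical primitive sets; choose your own for your problem.
TERMINALS = [1.0, 2.0, 3.0]
OPERATORS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def random_individual(rng, depth=2):
    """Build a random expression tree: a terminal, or (op, left, right)."""
    if depth == 0 or rng.random() < 0.3:
        return rng.choice(TERMINALS)
    op = rng.choice(sorted(OPERATORS))
    return (op, random_individual(rng, depth - 1), random_individual(rng, depth - 1))


def evaluate(individual):
    """Recursively evaluate an expression tree to a number."""
    if not isinstance(individual, tuple):
        return individual  # a terminal evaluates to itself
    op, left, right = individual
    return OPERATORS[op](evaluate(left), evaluate(right))


rng = random.Random(0)
individual = random_individual(rng)
print(individual, "evaluates to", evaluate(individual))
```

An objective function would call something like `evaluate` on each individual and compare the result against the data set.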

If the data set is external, like planetary orbital locations, your objective function reads the data and evaluates the individual with respect to that data set. When evaluating the individual, your objective function must compute some number that shows the relative "goodness" or "badness" of this individual with respect to the data set. With the square root of two problem, this value might be the absolute difference between the evaluated result of the individual and the fixed value of the square root of two (approximately 1.414). For a problem like planetary orbital data, this value could be the sum of the absolute differences between each evaluation of the individual and the corresponding observed position.

Depending on the magnitude of the values you're working with, sometimes each difference is squared before it is summed. This makes the gap in summed score between individuals greater and the "rightness" of an individual more pronounced with respect to other individuals in the population.
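As a rough illustration of the scoring described above, the sketch below sums the absolute differences between an individual's outputs and the observed values, with an option to square each difference first. The function name `raw_score` and its parameters are my own assumptions for this example, not part of any library.

```python
import math


def raw_score(predictions, observations, squared=False):
    """Sum the absolute (or squared) differences between paired values.

    A lower total means a better fit; zero means a perfect match.
    """
    total = 0.0
    for predicted, observed in zip(predictions, observations):
        error = abs(predicted - observed)
        total += error * error if squared else error
    return total


# Internal data set: a single target, the square root of two.
candidate_output = 1.5  # hypothetical result of evaluating one individual
print(raw_score([candidate_output], [math.sqrt(2)]))

# External data set: several observations, here with squared differences.
print(raw_score([1.0, 2.0], [1.5, 4.0], squared=True))
```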

Once the objective function is written, you decide how frequently you want crossover and mutation to happen within your population. Crossover percentages are normally high, around 70%. I don't have a citation for that number at the moment. Mutation percentages are low, around 3%.

When running, a genetic programming system follows this sequence of events:

randomize the individuals of a population given the sets of terminal values and operators
evaluate every individual using the objective function
sort the individuals by their "correctness"
randomly go through the population deciding which individuals will have crossover and which individuals will be mutated
repeat the evaluation, sorting, and crossover/mutation steps until either the maximum number of generations of the population is reached or an individual is found for which the objective function returns a perfect score
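The sequence above can be sketched in Python as follows. This is only one possible arrangement under several assumptions of mine: individuals are nested-tuple expression trees, the target is the square root of two, partners for crossover are picked at random from the better half, the best individual is carried over unchanged, and trees deeper than `max_depth` are rejected. All function names here are hypothetical, and the selection step is a simple placeholder rather than a recommended method.

```python
import operator
import random

TERMINALS = [1.0, 2.0, 3.0]  # hypothetical terminal set
OPERATORS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def random_tree(rng, depth=3):
    """A terminal, or a tuple (op, left, right)."""
    if depth == 0 or rng.random() < 0.3:
        return rng.choice(TERMINALS)
    return (rng.choice(sorted(OPERATORS)),
            random_tree(rng, depth - 1), random_tree(rng, depth - 1))


def evaluate(tree):
    if not isinstance(tree, tuple):
        return tree
    op, left, right = tree
    return OPERATORS[op](evaluate(left), evaluate(right))


def depth(tree):
    return 1 + max(depth(tree[1]), depth(tree[2])) if isinstance(tree, tuple) else 0


def raw_score(tree, target=2.0 ** 0.5):
    """Objective function: distance from the square root of two."""
    return abs(evaluate(tree) - target)


def random_subtree(rng, tree):
    """Walk down from the root, stopping at a randomly chosen node."""
    while isinstance(tree, tuple) and rng.random() < 0.5:
        tree = rng.choice(tree[1:])
    return tree


def replace_random_child(rng, tree, new_subtree):
    """Return a copy of tree with one randomly chosen child replaced."""
    if not isinstance(tree, tuple):
        return new_subtree
    op, left, right = tree
    return (op, new_subtree, right) if rng.random() < 0.5 else (op, left, new_subtree)


def run(rng, pop_size=50, max_generations=40,
        crossover_rate=0.70, mutation_rate=0.03, max_depth=5):
    # Randomize the individuals of the population.
    population = [random_tree(rng) for _ in range(pop_size)]
    for _ in range(max_generations):
        # Evaluate every individual and sort, best (lowest raw score) first.
        population.sort(key=raw_score)
        if raw_score(population[0]) == 0.0:
            break  # perfect score found
        better_half = population[: pop_size // 2]
        next_population = [population[0]]  # assumption: keep the best unchanged
        # Go through the rest, randomly choosing crossover, mutation, or a plain copy.
        for individual in population[1:]:
            draw = rng.random()
            if draw < crossover_rate:
                partner = rng.choice(better_half)
                child = replace_random_child(rng, individual, random_subtree(rng, partner))
            elif draw < crossover_rate + mutation_rate:
                child = replace_random_child(rng, individual, random_tree(rng, depth=2))
            else:
                child = individual
            # Assumption: reject children that grow past max_depth.
            next_population.append(child if depth(child) <= max_depth else individual)
        population = next_population
    return min(population, key=raw_score)


best = run(random.Random(1))
print(best, "raw score:", raw_score(best))
```

Because the square root of two is irrational, no tree built from these terminals and operators can reach a raw score of exactly zero, so this particular run always stops at the generation limit.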

I'll cover the details of how to select an individual for crossover or mutation in my next post.

For a given problem, how do you know what the perfect score is? Researcher and author John Koza proposed that the objective function in some way sum up the individual's score into what is called the "raw score". This raw score is then converted into a value where one (1) is perfect and values near zero are bad. The formula used is: adjusted score = 1 / (1 + raw score). In this framework the raw score approaches zero as the fitness of the individual increases. If the individual is perfect and has a raw score of zero, the equation is: adjusted score = 1 / (1 + 0) = 1 / 1 = 1. That's great. If the individual has a bad fit and a large raw score, the equation is more like: adjusted score = 1 / (1 + big number) ≈ 1 / big number = tiny number.
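The formula is small enough to write directly. The sketch below uses a hypothetical function name, `adjusted_score`, and assumes the raw score is never negative, which holds when it is a sum of absolute or squared differences.

```python
def adjusted_score(raw_score):
    """Map a raw score in [0, infinity) to an adjusted score in (0, 1]."""
    if raw_score < 0:
        raise ValueError("raw score must be non-negative")
    return 1.0 / (1.0 + raw_score)


for raw in (0.0, 1.0, 1000000.0):
    print(raw, "->", adjusted_score(raw))
```

A perfect individual (raw score 0) maps to exactly 1, a raw score of 1 maps to 0.5, and larger raw scores shrink toward zero without ever reaching it.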

I think John Koza's idea for adjusted scores is really useful.

Thursday, August 30, 2012

Genetic Programming Details - Part 2
