Package ec.eval

Class MetaProblem

java.lang.Object
ec.Problem
ec.eval.MetaProblem
All Implemented Interfaces:
Prototype, Setup, SimpleProblemForm, Serializable, Cloneable

public class MetaProblem extends Problem implements SimpleProblemForm

MetaProblem is a special class for implementing so-called "Meta-Evolutionary Algorithms", a topic related to "HyperHeuristics". In a Meta-EA, an evolutionary system is used to optimize the parameters for another evolutionary system.

We will refer to the EA used to optimize the parameters as the Meta-EA or meta-level EA. The EA whose parameters are getting optimized will be called the Base EA.

In order to optimize the parameters for a base EA, one must be able to test those parameters, and this is done by running a base evolutionary system using those parameters and seeing how well it performs. This means generating a second instance of an ECJ system. ECJ does this in the same Java process as the original system, so you must budget enough memory for both systems.

The way it works is as follows. First, you set up the base-level ECJ system as normal, with a parameter file defining all of its parameters (ideally including default settings for the parameters you'll be optimizing). You will probably want to prevent statistics from writing anything to files or printing anything to the screen, which would be very inefficient during meta-level runs:

     stat.silent = true       

Next you set up the meta-level ECJ system. Here you define the problem class as a MetaProblem:

    eval.problem = ec.eval.MetaProblem    

Next you tell the meta-level ECJ system where the parameter file for the base-level ECJ system is, so it can set up the base-level system from that file:

    eval.problem.file = base-ec.params    

MetaProblem assesses the fitness of its individuals (the parameter settings for the base ECJ system) by running the base ECJ system some N times using those parameter settings, gathering the best-of-run fitness from each of those N runs, and taking the mean of these fitnesses. This means that the base ECJ system must use a fitness facility which can be reduced to a single number. Further, both ECJ systems (meta-level and base) should use the same Fitness class. The value of N is important: if you make it too high, you waste valuable time in testing, but if you make it too low, the fitness estimates will be noisy and inaccurate. To set N to 10 (for example), you say:

    eval.problem.runs = 10    
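
The averaging step described above can be sketched in plain Java. This is only an illustration, not ECJ code: the class MeanOfRuns, the method meanBestOfRun, and the use of a DoubleSupplier as a stand-in for "run the base EA once and return its best-of-run fitness" are all assumptions made for the example.

```java
import java.util.function.DoubleSupplier;

final class MeanOfRuns {
    /** Calls baseRun the given number of times and returns the mean result. */
    public static double meanBestOfRun(DoubleSupplier baseRun, int runs) {
        if (runs < 1) {
            throw new IllegalArgumentException("runs must be >= 1");
        }
        double sum = 0.0;
        for (int i = 0; i < runs; i++) {
            sum += baseRun.getAsDouble();  // best-of-run fitness of one base run
        }
        return sum / runs;
    }

    public static void main(String[] args) {
        // Made-up best-of-run fitnesses standing in for four base runs.
        double[] results = {0.5, 0.75, 0.25, 0.5};
        int[] next = {0};
        double mean = meanBestOfRun(() -> results[next[0]++], results.length);
        System.out.println(mean);  // prints 0.5
    }
}
```

Each call to the supplier corresponds to one complete base-level run, so the cost of evaluating one meta-level individual grows linearly with the number of runs.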

Because evaluations are noisy and random, you will probably want to guarantee that individuals are reevaluated if they show up again in a later generation. To do this, you say:

    eval.problem.reevaluate = true    

Actually you don't need to say this because it's the default value. But if you want to avoid reevaluating individuals, you must explicitly set it to false.

Now we get to specifying the actual parameters which will be optimized. The meta-level ECJ system must use a DoubleVectorIndividual. The genome size of that individual must be exactly the number of parameters you're trying to optimize in the base ECJ system. If there are 5 parameters, for example, you might say:

    pop.subpop.0.species = ec.vector.FloatVectorSpecies    
    pop.subpop.0.species.ind = ec.vector.DoubleVectorIndividual    
    pop.subpop.0.species.genome-size = 5    

We suggest you keep the default min-gene and max-gene values simple, such as 0.0 and 1.0 respectively, and likewise define some default mutation settings, and perhaps a crossover type:

    pop.subpop.0.species.min-gene = 0.0    
    pop.subpop.0.species.max-gene = 1.0    
    pop.subpop.0.species.mutation-prob = 0.25    
    pop.subpop.0.species.mutation-type = gauss    
    pop.subpop.0.species.mutation-stdev = 0.1    
    pop.subpop.0.species.mutation-bounded = true    
    pop.subpop.0.species.out-of-bounds-retries = 100    
    pop.subpop.0.species.crossover-type = one    

The genes in this genome are not necessarily going to be treated as doubles, however. This is because not all parameters are doubles. While something like gaussian mutation variance is a double, population size is an integer. Furthermore, some parameters are booleans, and others are values chosen from a set of possible strings, such as one-point ("one"), two-point ("two"), and uniform ("any") crossover (3 strings). To handle this, we will encode in the double vector any of the following kinds of data:

  • Double values between a min and max value inclusive.
  • Integer values between a min and a max value inclusive.
  • Integer values from 0 through N-1, representing some N possible string values.
  • Boolean values, represented as the integers 0 and 1.
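
As a rough illustration of these four encodings, the sketch below turns a single gene value into the string that would be written into a parameter. It is not MetaProblem's actual map(...) implementation: the class GeneMapping, its method names, and the choice to floor and clamp values are assumptions made for this example.

```java
final class GeneMapping {
    /** Double parameter: clamp the gene into [min, max] and print it. */
    public static String asDouble(double gene, double min, double max) {
        return Double.toString(Math.max(min, Math.min(max, gene)));
    }

    /** Integer parameter: floor the gene, then clamp into [min, max]. */
    public static String asInteger(double gene, long min, long max) {
        long v = (long) Math.floor(gene);
        return Long.toString(Math.max(min, Math.min(max, v)));
    }

    /** Boolean parameter: the integer 0 means false, 1 means true. */
    public static String asBoolean(double gene) {
        return asInteger(gene, 0, 1).equals("1") ? "true" : "false";
    }

    /** String parameter: the gene is an index 0..N-1 into the allowed values. */
    public static String asChoice(double gene, String[] values) {
        int index = Integer.parseInt(asInteger(gene, 0, values.length - 1));
        return values[index];
    }

    public static void main(String[] args) {
        String[] crossover = {"one", "two", "any"};
        System.out.println(asDouble(0.25, 0.0, 1.0));  // prints 0.25
        System.out.println(asInteger(6.0, 0, 10));     // prints 6
        System.out.println(asBoolean(1.0));            // prints true
        System.out.println(asChoice(2.0, crossover));  // prints any
    }
}
```

Every result is a string because parameter values are ultimately stored as text, whatever type the gene represents.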

For example, let's say you're trying to optimize the following parameters.

  1. Mutation probability (a double)
  2. Mutation type (one of "reset", "gauss", or "polynomial")
  3. Mutation standard deviation for gauss mutation (a double)
  4. Mutation distribution index for polynomial mutation (an integer)
  5. Use of the alternative polynomial mutation version (a boolean)

For each of these we need to specify the mutation type used for the gene responsible for that parameter. In our example, we will use our default mutation (gaussian, probability 0.25, stdev 0.1, bounded) for our two double parameters, "reset" mutation for the second and fifth parameters, and integer random walk mutation for the fourth parameter. Declaring these mutation types will also determine the initialization procedure for those genes: see "Heterogeneous Vectors" in the manual for more information.

Finally we need to specify the parameter associated with each gene. We do that with a parameter like this:

    eval.problem.param.0 = pop.subpop.0.species.mutation-prob    

If the parameter value is numerical or boolean, MetaProblem will create the right value for it automatically. If the parameter value is a string, you need to specify which string value corresponds to each number stored in the gene. For example:

    eval.problem.param.1 = pop.subpop.0.species.mutation-type    
    eval.problem.param.1.num-vals = 3    
    eval.problem.param.1.val.0 = reset    
    eval.problem.param.1.val.1 = gauss    
    eval.problem.param.1.val.2 = polynomial    

So we need to specify two things: information about how the gene is mutated (and hence initialized), and information about how it is to be interpreted as a parameter. In the parameters below note that we often omit mutation information when we are relying on some default we defined above:

    pop.subpop.0.species.genome-size = 5    
    eval.problem.num-params = 5    

    eval.problem.param.0 = pop.subpop.0.species.mutation-prob    
    eval.problem.param.0.type = float    

    eval.problem.param.1 = pop.subpop.0.species.mutation-type    
    eval.problem.param.1.num-vals = 3    
    eval.problem.param.1.val.0 = reset    
    eval.problem.param.1.val.1 = gauss    
    eval.problem.param.1.val.2 = polynomial    
    pop.subpop.0.species.max-gene.1 = 2    
    pop.subpop.0.species.mutation-type.1 = integer-reset    

    eval.problem.param.2 = pop.subpop.0.species.mutation-stdev    
    eval.problem.param.2.type = float    

    eval.problem.param.3 = pop.subpop.0.species.mutation-distribution-index    
    eval.problem.param.3.type = integer    
    pop.subpop.0.species.max-gene.3 = 10    
    pop.subpop.0.species.mutation-type.3 = integer-random-walk   
    pop.subpop.0.species.random-walk-probability.3 = 0.8    

    eval.problem.param.4 = pop.subpop.0.species.alternative-polynomial-version    
    eval.problem.param.4.type = boolean    
    pop.subpop.0.species.mutation-type.4 = integer-reset    
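
Since the genome size must match the number of parameters, a quick sanity check before a long run can be useful. The sketch below is an assumption-laden illustration rather than part of ECJ: the class ParamCheck and method sizesAgree are hypothetical, and it treats the parameter text as simple "key = value" lines readable by java.util.Properties, without following any parent files.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

final class ParamCheck {
    /** Returns true if genome-size and num-params are both present and equal. */
    public static boolean sizesAgree(String paramText) {
        Properties p = new Properties();
        try {
            p.load(new StringReader(paramText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        String genome = p.getProperty("pop.subpop.0.species.genome-size");
        String params = p.getProperty("eval.problem.num-params");
        if (genome == null || params == null) {
            return false;
        }
        // Values may carry trailing spaces, so trim before parsing.
        return Integer.parseInt(genome.trim()) == Integer.parseInt(params.trim());
    }

    public static void main(String[] args) {
        String text =
            "pop.subpop.0.species.genome-size = 5    \n" +
            "eval.problem.num-params = 5    \n";
        System.out.println(sizesAgree(text));  // prints true
    }
}
```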

If the mappings above are insufficient for you, you can create your own by overriding two methods:

  • loadDomain(...) sets up the domain from the parameters above. You could override this to interpret your own parameters as you see fit, or simply to turn off parameter loading entirely.

  • map(...) actually maps a gene into a parameter value (a string). You could override this to provide your own mapping, either hard coded, or from some version of loadDomain(...) you created. If you override this method, you'll want to override loadDomain(...) for sure, if only to turn it off.

Preparing for the final run. Once you've got everything working, you probably want to eliminate all output at the base level before starting the big meta-level run. You can do this in a base-level parameter file like this:

    silent = true    

Caveats. A meta-level individual is tested by setting up a base-level EA with its parameters, then running the base-level EA, then extracting the best individual of the run and getting its fitness. This is done some N times, and the fitnesses from these N runs are combined using the method combine(...). By default this method simply does setToMeanOf(...), but you might want to do something else. Note that this means that by default it's going to be difficult to have multiobjective fitness at either level without overriding the combine(...) method. Furthermore, because the fitness is extracted from just the first subpopulation of the base-level EA, you probably only want one subpopulation at the base level, except in the case of competitive coevolution where you ultimately don't care about the fitness of the other subpopulations. If you have more than one subpopulation at the base level, you will receive a one-time warning.
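
To make "something else" concrete, the sketch below contrasts the default mean with a median, which is less sensitive to a single unlucky run. It works on plain double values and is not an override of combine(...): the class CombineSketch and its methods are hypothetical, and applying the idea inside ECJ would mean translating it to whatever Fitness class you use.

```java
import java.util.Arrays;

final class CombineSketch {
    /** Mean of the per-run fitnesses (assumes at least one run). */
    public static double mean(double[] fitnesses) {
        double sum = 0.0;
        for (double f : fitnesses) {
            sum += f;
        }
        return sum / fitnesses.length;
    }

    /** Median of the per-run fitnesses (assumes at least one run). */
    public static double median(double[] fitnesses) {
        double[] sorted = fitnesses.clone();  // leave the caller's array untouched
        Arrays.sort(sorted);
        int mid = sorted.length / 2;
        return (sorted.length % 2 == 1)
            ? sorted[mid]
            : (sorted[mid - 1] + sorted[mid]) / 2.0;
    }

    public static void main(String[] args) {
        // Made-up best-of-run fitnesses; one run failed badly.
        double[] runs = {4.0, 0.0, 4.0, 3.0, 4.0};
        System.out.println(mean(runs));    // prints 3.0
        System.out.println(median(runs));  // prints 4.0
    }
}
```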

MetaProblem gathers two kinds of statistics of interest to you. First, it gathers the best individual of run, meaning the DoubleVectorIndividual whose parameters on average produced the highest best-of-run fitnesses in the base system. This is gathered using the standard statistics procedures. Second, it gathers the best individual discovered among the various base runs. This individual is reported at the end, during the describe(...) method, and appears at the end of the statistics file if you are using SimpleStatistics at the meta-level.
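
Because the second statistic is a maximum over individual base runs rather than an average, it has to be tracked separately, and with multiple evaluation threads the update must be guarded by a lock. The following is a minimal sketch of that pattern only, not MetaProblem's code: the class BestTracker, its offer method, and the use of a String label in place of a real individual are assumptions for illustration.

```java
final class BestTracker {
    private final Object lock = new Object();
    private double bestFitness = Double.NEGATIVE_INFINITY;
    private String bestLabel = null;

    /** Records the candidate if it beats the best seen so far. */
    public void offer(double fitness, String label) {
        synchronized (lock) {
            if (fitness > bestFitness) {
                bestFitness = fitness;
                bestLabel = label;
            }
        }
    }

    public double bestFitness() {
        synchronized (lock) { return bestFitness; }
    }

    public String bestLabel() {
        synchronized (lock) { return bestLabel; }
    }

    public static void main(String[] args) throws InterruptedException {
        BestTracker tracker = new BestTracker();
        Thread[] threads = new Thread[4];
        for (int t = 0; t < threads.length; t++) {
            final int id = t;
            // Each thread reports a made-up best-of-run result.
            threads[t] = new Thread(() -> tracker.offer(id * 0.25, "run-" + id));
            threads[t].start();
        }
        for (Thread th : threads) {
            th.join();
        }
        System.out.println(tracker.bestLabel() + " " + tracker.bestFitness());  // prints run-3 0.75
    }
}
```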

Finally, yes, MetaProblem can be recursive. You can set things up so that you're evolving the parameters for an EC system which evolves the parameters for an EC system which evolves the parameters for an EC system.

Parameters

base.file
filename
(the filename of the "base" (lower-level) parameter file)
base.runs
int >= 1 (default=1)
(the number of base-level evolutionary runs performed to assess the fitness of a meta individual)
base.reevaluate
boolean (default=true)
(when a meta individual has its evaluated flag set, should we reevaluate it anyway?)
base.set-random
boolean (default=false)
(Should we silence the stdout and stderr logs of the Output of the base EA?)
base.num-params
int >= 1
(How many parameters are being evolved? This should match the genome length of the meta-level EA individuals)
base.param.number
String
(The parameter name)
base.param.number.type
String, one of: integer boolean float (or not defined if num-vals is defined)
(The parameter type)
base.param.number.num-vals
int >= 1
(The number of values (Strings) a parameter may take on, if it is a multi-string type)
base.param.number.val.val-number
String
(A possible value that a parameter may take on, if it is a multi-string type)
  • Field Details

    • P_FILE

      public static final String P_FILE
    • P_RUNS

      public static final String P_RUNS
    • P_REEVALUATE_INDIVIDUALS

      public static final String P_REEVALUATE_INDIVIDUALS
    • P_NUM_PARAMS

      public static final String P_NUM_PARAMS
    • P_PARAM

      public static final String P_PARAM
    • P_TYPE

      public static final String P_TYPE
    • V_INTEGER

      public static final String V_INTEGER
    • V_BOOLEAN

      public static final String V_BOOLEAN
    • V_FLOAT

      public static final String V_FLOAT
    • P_NUM_VALS

      public static final String P_NUM_VALS
    • P_VAL

      public static final String P_VAL
    • P_MUZZLE

      public static final String P_MUZZLE
    • P_SET_RANDOM

      public static final String P_SET_RANDOM
    • base

      public Parameter base
      The parameter base from which the MetaProblem was loaded.
    • p_database

      public ParameterDatabase p_database
      A prototypical parameter database for the underlying (base-level) evolutionary computation system. This is never directly used, just cloned.
    • currentDatabase

      public ParameterDatabase currentDatabase
      This points to the database presently used by the underlying (base-level) evolutionary computation system. It is a cloned and modified version of p_database.
    • runs

      public int runs
      The number of base-level evolutionary runs to perform to evaluate an individual.
    • reevaluateIndividuals

      public boolean reevaluateIndividuals
      Whether to reevaluate individuals if and when they appear for evaluation in the future.
    • bestUnderlyingIndividual

      public Individual[] bestUnderlyingIndividual
      The best underlying individual array, one per subpopulation. We retain the best underlying individual here rather than storing it in (say) the associated fitness because fitnesses are *averaged* over trials, so we wouldn't be able to keep track of the *max* fitness and associated individual that way. So we do it here.
    • lock

      public Object lock
      Acquire this lock before accessing bestUnderlyingIndividual
    • domain

      public Object[] domain
      A list of domain information, one per parameter in the genome.
  • Constructor Details

    • MetaProblem

      public MetaProblem()
  • Method Details

    • setup

      public void setup(EvolutionState state, Parameter base)
      Description copied from interface: Prototype
      Sets up the object by reading it from the parameters stored in state, built off of the parameter base base. If an ancestor implements this method, be sure to call super.setup(state,base); before you do anything else.

      For prototypes, setup(...) is typically called once for the prototype instance; cloned instances do not receive the setup(...) call. setup(...) may be called more than once; the only guarantee is that it will get called at least once on an instance or some "parent" object from which it was ultimately cloned.

      Specified by:
      setup in interface Prototype
      Specified by:
      setup in interface Setup
      Overrides:
      setup in class Problem
    • loadDomain

      protected void loadDomain(EvolutionState state, Parameter base)
    • map

      protected String map(EvolutionState state, double[] genome, FloatVectorSpecies species, int index)
    • modifyParameters

      public void modifyParameters(EvolutionState state, ParameterDatabase database, int run, Individual metaIndividual)
      Override this method to revise the provided parameter database to reflect the "parameters" specified in the given meta-individual. 'Run' is the current run number for this individual's evaluation.
    • evaluate

      public void evaluate(EvolutionState state, Individual ind, int subpopulation, int threadnum)
      Description copied from interface: SimpleProblemForm
      Evaluates the individual in ind, if necessary (perhaps not evaluating it if its evaluated flag is true), and sets its fitness appropriately.
      Specified by:
      evaluate in interface SimpleProblemForm
    • combine

      public void combine(EvolutionState state, Fitness[] runs, Fitness finalFitness)
      Combines fitness results from multiple runs into a final Fitness. By default this is done by using setToMeanOf.
    • describe

      public void describe(EvolutionState state, Individual ind, int subpopulation, int threadnum, int log)
      Description copied from class: Problem
      Part of SimpleProblemForm. Included here so you don't have to write the default version, which usually does nothing.
      Specified by:
      describe in interface SimpleProblemForm
      Overrides:
      describe in class Problem