  A generic Genetic Algorithm for feature selection
  These functions allow you to initialize (\code{GenAlg}) and iterate
  (\code{newGeneration}) a genetic algorithm to perform feature
  selection for binary class prediction in the context of gene
  expression microarrays or other high-throughput technologies.
GenAlg(data, fitfun, mutfun, context, pm=0.001, pc=0.5, gen=1)
  \item{data}{
    The initial population of potential solutions, in the form of a data
    matrix with one individual per row.}
  \item{fitfun}{
    A function to compute the fitness of an individual solution. Must take
    two input arguments: a vector of indices into rows of the population
    matrix, and a \code{context} list within which any other items required
    by the function can be resolved. Must return a real number; higher values
    indicate better fitness, with the maximum fitness occurring at the optimal
    solution to the underlying numerical problem.}
  \item{mutfun}{
    A function to mutate individual alleles in the population. Must take two
    arguments: the starting allele and a \code{context} list as in the
    fitness function.}
  \item{context}{
    A list of additional data required to perform mutation or to compute
    fitness. This list is passed along as the second argument when
    \code{fitfun} and \code{mutfun} are called.}
  \item{pm}{
    A real value between \code{0} and \code{1}, representing the probability
    that an individual allele will be mutated.}
  \item{pc}{
    A real value between \code{0} and \code{1}, representing the probability
    that crossover will occur during reproduction.}
  \item{gen}{
    An integer identifying the current generation.}
  \item{object}{
    An object of class \code{GenAlg}}
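
The two-argument contract for \code{fitfun} and \code{mutfun} can be illustrated with a minimal sketch in base R. The names \code{meanDiffFitness} and \code{randomFeatureMutate} are hypothetical and are not part of the package; the sketch also assumes that the context list carries a \code{dataset} matrix (features in rows) and a 0/1 group vector \code{gps}, as in the example further down.

```r
# Hypothetical fitness function: scores a candidate feature set by the mean
# absolute difference between the two group means over the selected features.
# Assumes context$dataset is features x samples and context$gps holds 0/1 labels.
meanDiffFitness <- function(arow, context) {
  sub <- context$dataset[arow, , drop = FALSE]
  g0 <- rowMeans(sub[, context$gps == 0, drop = FALSE])
  g1 <- rowMeans(sub[, context$gps == 1, drop = FALSE])
  mean(abs(g1 - g0))   # a single real number; larger means fitter
}

# Hypothetical mutation function: replaces an allele with a randomly chosen
# feature index, ignoring the starting value.
randomFeatureMutate <- function(allele, context) {
  sample(nrow(context$dataset), 1)
}

# Small check on simulated data
set.seed(1)
ctx <- list(dataset = matrix(rnorm(100 * 20), nrow = 100),
            gps = rep(c(0, 1), each = 10))
individual <- sample(nrow(ctx$dataset), 5)
fit <- meanDiffFitness(individual, ctx)
newAllele <- randomFeatureMutate(individual[1], ctx)
```

Both functions read everything they need from the context list, so no global variables are required.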
  Both the \code{GenAlg} generator and the \code{newGeneration} function
  return a \code{\link{GenAlg-class}} object. The \code{popDiversity} function
  returns a real number representing the average diversity of the population.
  Here diversity is defined as the number of alleles (selected features) that
  differ between two individuals.
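
One possible reading of this definition can be sketched in base R: count the alleles of one individual that are absent from another, then average that count over all pairs. The helpers \code{pairDifference} and \code{averageDiversity} are hypothetical, and the computation actually performed by \code{popDiversity} may differ in detail.

```r
# Number of alleles in 'a' that do not appear in 'b'. Assumes both individuals
# hold the same number of distinct feature indices, so the count is symmetric.
pairDifference <- function(a, b) length(setdiff(a, b))

# Average pairwise difference over all unordered pairs of rows of 'pop'.
averageDiversity <- function(pop) {
  pairs <- combn(nrow(pop), 2)   # 2 x choose(n, 2) matrix of row indices
  mean(apply(pairs, 2, function(ij) pairDifference(pop[ij[1], ], pop[ij[2], ])))
}

pop <- rbind(c(1, 2, 3),
             c(1, 2, 4),
             c(5, 6, 7))
div <- averageDiversity(pop)   # pairwise differences are 1, 3, 3, so the mean is 7/3
```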
  Kevin R. Coombes \email{krc@silicovore.com},
  P. Roebuck \email{proebuck@mdanderson.org}
# generate some fake data
nFeatures <- 1000
nSamples <- 50
fakeData <- matrix(rnorm(nFeatures*nSamples), nrow=nFeatures, ncol=nSamples)
fakeGroups <- sample(c(0,1), nSamples, replace=TRUE)
myContext <- list(dataset=fakeData, gps=fakeGroups)

# initialize population
n.individuals <- 200
n.features <- 9
y <- matrix(0, n.individuals, n.features)
for (i in 1:n.individuals) {
  y[i,] <- sample(1:nrow(fakeData), n.features)
}

# set up the genetic algorithm
my.ga <- GenAlg(y, selectionFitness, selectionMutate, myContext, pm=0.001, pc=0.75)

# advance one generation
my.ga <- newGeneration(my.ga)
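
For intuition about how the \code{pm} and \code{pc} parameters act when a generation is advanced, here is a generic sketch of one genetic-algorithm step in base R: fitness-proportional parent selection, single-point crossover with probability \code{pc}, and per-allele mutation with probability \code{pm}. This is a textbook-style illustration under those assumptions, not the code behind \code{newGeneration}; the names \code{sketchGeneration}, \code{toyFit}, and \code{toyMut} are hypothetical.

```r
# Generic sketch of one GA generation; NOT the package's implementation.
# pop: matrix with one individual per row; fitfun and mutfun take
# (individual or allele, context); pm and pc are probabilities in [0, 1].
sketchGeneration <- function(pop, fitfun, mutfun, context, pm = 0.001, pc = 0.5) {
  n <- nrow(pop)
  k <- ncol(pop)
  fitness <- apply(pop, 1, function(r) fitfun(r, context))
  # Fitness-proportional selection; shift so that all weights are positive
  w <- fitness - min(fitness) + .Machine$double.eps
  mom <- pop[sample(n, n, replace = TRUE, prob = w), , drop = FALSE]
  dad <- pop[sample(n, n, replace = TRUE, prob = w), , drop = FALSE]
  child <- mom
  for (i in seq_len(n)) {
    # With probability pc, splice the tail of the second parent onto the first
    if (k > 1 && runif(1) < pc) {
      cutpoint <- sample(k - 1, 1)
      child[i, (cutpoint + 1):k] <- dad[i, (cutpoint + 1):k]
    }
    # Mutate each allele independently with probability pm
    for (j in seq_len(k)) {
      if (runif(1) < pm) child[i, j] <- mutfun(child[i, j], context)
    }
  }
  child
}

# Toy usage: individuals are sets of 4 feature indices out of 50
set.seed(42)
ctx <- list(nFeatures = 50)
pop <- t(replicate(10, sample(ctx$nFeatures, 4)))
toyFit <- function(arow, context) -sum(arow)   # hypothetical: prefers small indices
toyMut <- function(allele, context) sample(context$nFeatures, 1)
nextPop <- sketchGeneration(pop, toyFit, toyMut, ctx, pm = 0.05, pc = 0.75)
```

Note that this simple crossover can leave the same feature index twice in one individual; a production implementation would need to decide how to handle such duplicates.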

