\name{GenAlg-tools}
\alias{GenAlg-tools}
\alias{simpleMutate}
\alias{selectionFitness}
\alias{selectionMutate}
\title{Utility functions for selection and mutation in genetic algorithms}
\description{
  These functions implement specific forms of mutation and fitness
  that can be used in genetic algorithms for feature selection.
}
\usage{
simpleMutate(allele, context)
selectionMutate(allele, context)
selectionFitness(arow, context)
}
\arguments{
  \item{allele}{
    In the \code{simpleMutate} function, \code{allele} is a binary
    vector filled with 0's and 1's.  In the \code{selectionMutate}
    function, \code{allele} is an integer (which is silently ignored;
    see Details). 
  }
  \item{arow}{
    A vector of integer indices identifying the rows (features) to be
    selected from the \code{context$dataset} matrix.
  }
  \item{context}{
    A list or data frame containing auxiliary information that is needed
    to resolve references from the mutation or fitness code.  In both 
    \code{selectionMutate} and \code{selectionFitness}, \code{context}
    must contain a \code{dataset} component that is either a matrix or a
    data frame.  In \code{selectionFitness}, the \code{context} must
    also include a grouping factor (with two levels) called \code{gps}.
  }
}
\details{
  These functions are callbacks. They are passed to the function
  \code{\link{GenAlg}}, which creates objects of the
  \code{\link{GenAlg-class}}. They are then called repeatedly (for each
  individual in the population) each time the genetic algorithm is
  updated to the next generation.

  The \code{simpleMutate} function assumes that chromosomes are binary
  vectors, so alleles simply take on the value 0 or 1. A mutation of an
  allele, therefore, flips its state between those two possibilities.

  The \code{selectionMutate} and \code{selectionFitness} functions, by
  contrast, are specialized to perform feature selection assuming a
  fixed number K of features, with a goal of learning how to
  distinguish between two different groups of samples. We assume that
  the underlying data consists of a data frame (or matrix), with the
  rows representing features (such as genes) and the columns
  representing samples. In addition, there must be a grouping vector
  (or factor) that assigns all of the sample columns to one of two
  possible groups. These data are collected into a list,
  \code{context}, containing a \code{dataset} matrix and a \code{gps}
  factor. An individual member of the population of potential
  solutions is encoded as a length K vector of indices into the rows
  of the \code{dataset}. An individual \code{allele}, therefore, is a
  single index identifying a row of the \code{dataset}. When mutating
  it, we assume that it can be changed into any other possible allele;
  i.e., any other row number. To compute the fitness, we use the
  Mahalanobis distance between the centers of the two groups defined by
  the \code{gps} factor.
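
  As a sketch of the general form only (the symbols here are
  illustrative and do not come from the package): if \eqn{m_1}{m1} and
  \eqn{m_2}{m2} are the group centers restricted to the K selected
  features, and \eqn{S}{S} is an estimate of their covariance matrix,
  then the squared Mahalanobis distance is
  \deqn{D^2 = (m_1 - m_2)^T S^{-1} (m_1 - m_2).}{D^2 = (m1 - m2)' S^(-1) (m1 - m2).}
  The covariance estimate actually used, and whether the distance or
  its square is returned, are determined by \code{\link{maha}}.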
}
\value{
  Both \code{selectionMutate} and \code{simpleMutate} return an integer
  value. For \code{simpleMutate}, the value is guaranteed to be 0 or 1;
  for \code{selectionMutate}, it is a row index into the \code{dataset}.
  The \code{selectionFitness} function returns a real number.
}
\author{
  Kevin R. Coombes \email{krc@silicovore.com},
  P. Roebuck \email{proebuck@mdanderson.org}
}
\seealso{
  \code{\link{GenAlg}},
  \code{\link{GenAlg-class}},
  \code{\link{maha}}.
}
\examples{
# generate some fake data
nFeatures <- 1000
nSamples <- 50
fakeData <- matrix(rnorm(nFeatures*nSamples), nrow=nFeatures, ncol=nSamples)
fakeGroups <- sample(c(0,1), nSamples, replace=TRUE)
myContext <- list(dataset=fakeData, gps=fakeGroups)

# initialize population
n.individuals <- 200
n.features <- 9
y <- matrix(0, n.individuals, n.features)
for (i in 1:n.individuals) {
  y[i,] <- sample(1:nrow(fakeData), n.features)
}

# set up the genetic algorithm
my.ga <- GenAlg(y, selectionFitness, selectionMutate, myContext, 0.001, 0.75)

# advance one generation
my.ga <- newGeneration(my.ga)
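
# A hedged sketch: the callbacks can also be called directly, outside
# of GenAlg, to see what they return for a single individual. The name
# oneIndividual is illustrative only and is not part of the package.
oneIndividual <- y[1, ]

# fitness of one individual (a vector of row indices): a real number
selectionFitness(oneIndividual, myContext)

# mutate one allele: the result is a row index into the dataset
selectionMutate(oneIndividual[1], myContext)

# simpleMutate expects a binary allele (0 or 1) and flips it
simpleMutate(0, myContext)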

}
\keyword{optimize}
