Skip to main content
  • Home
  • Development
  • Documentation
  • Donate
  • Operational login
  • Browse the archive

swh logo
SoftwareHeritage
Software
Heritage
Archive
Features
  • Search

  • Downloads

  • Save code now

  • Add forge now

  • Help

  • d7cb755
  • /
  • snowfall-b-init.Rd
Raw File Download
Permalinks

To reference or cite the objects present in the Software Heritage archive, permalinks based on SoftWare Hash IDentifiers (SWHIDs) must be used.
Select below a type of object currently browsed in order to display its associated SWHID and permalink.

  • content
  • directory
content badge Iframe embedding
swh:1:cnt:4a20bde6f649ea6f33b442eb0f7c1ce549994805
directory badge Iframe embedding
swh:1:dir:d7cb75514aab51501ba3c5b55375b70ffec0aecd
Citations

This interface enables to generate software citations, provided that the root directory of browsed objects contains a citation.cff or codemeta.json file.
Select below a type of object currently browsed in order to generate citations for them.

  • content
  • directory
Generate software citation in BibTex format (requires biblatex-software package)
Generating citation ...
Generate software citation in BibTex format (requires biblatex-software package)
Generating citation ...
snowfall-b-init.Rd
\name{snowfall-init}
\alias{snowfall-init}

\alias{sfInit}
\alias{sfStop}

\alias{sfParallel}
\alias{sfCpus}
\alias{sfNodes}
\alias{sfType}
\alias{sfIsRunning}
\alias{sfSocketHosts}
\alias{sfGetCluster}
\alias{sfSession}
\alias{sfSetMaxCPUs}

\title{Initialisation of cluster usage}
\usage{
sfInit( parallel=NULL, cpus=NULL, type=NULL, socketHosts=NULL, restore=NULL,
        slaveOutfile=NULL, nostart=FALSE, useRscript=FALSE )
sfStop( nostop=FALSE )

sfParallel()
sfIsRunning()
sfCpus()
sfNodes()
sfGetCluster()
sfType()
sfSession()
sfSocketHosts()
sfSetMaxCPUs( number=32 )
}
\arguments{
  \item{parallel}{Logical determinating parallel or sequential
    execution. If not set values from commandline are taken.}
  \item{cpus}{Numerical amount of CPUs requested for the cluster. If
    not set, values from the commandline are taken.}
  \item{nostart}{Logical determinating if the basic cluster setup should
    be skipped. Needed for nested use of \pkg{snowfall} and usage in
    packages.}
  \item{type}{Type of cluster. Can be 'SOCK', 'MPI', 'PVM' or 'NWS'. Default is 'SOCK'.}
  \item{socketHosts}{Host list for socket clusters. Only needed for
    socketmode (SOCK) and
    if using more than one machines (if using only your local machine
    (localhost) no list is needed).}
  \item{restore}{Globally set the restore behavior in the call
    \code{sfClusterApplySR} to the given value.}
  \item{slaveOutfile}{Write R slave output to this file. Default: no
    output (Unix: \code{/dev/null}, Windows: \code{:nul}). If
    using sfCluster this argument has no function, as slave logs are
    defined using sfCluster.}
  \item{useRscript}{Change startup behavior (snow>0.3 needed): use shell scripts or R-script for startup (R-scripts beeing the new variant, but not working with sfCluster.}
  \item{nostop}{Same as noStart for ending.}
  \item{number}{Amount of maximum CPUs useable.}
}
\description{
  Initialisation and organisation code to use \pkg{snowfall}.
}
\details{
  \code{sfInit} initialisise the usage of the \pkg{snowfall} functions
  and - if running in parallel mode - setup the cluster and
  \pkg{snow}. If using
  \code{sfCluster} management tool, call this without arguments. If
  \code{sfInit} is called with arguments, these overwrite
  \code{sfCluster} settings. If running parallel, \code{sfInit}
  set up the
  cluster by calling \code{makeCluster} from \pkg{snow}. If using with
  \code{sfCluster}, the initialisation also contains management of
  lockfiles. If this function is called more than once and current
  cluster is yet running, \code{sfStop} is called automatically.

  Note that you should call \code{sfInit} before using any other function
  from \pkg{snowfall}, with the only exception \code{sfSetMaxCPUs}.
  If you do not call \code{sfInit} first, on calling any \pkg{snowfall}
  function \code{sfInit} is called without any parameters, which is
  equal to sequential mode in \pkg{snowfall} only mode or the settings from
  sfCluster if used with sfCluster.

  This also means, you cannot check if \code{sfInit} was called from
  within your own program, as any call to a function will initialize
  again. Therefore the function \code{sfIsRunning} gives you a logical
  if a cluster is running. Please note: this will not call \code{sfInit}
  and it also returns true if a previous running cluster was stopped via
  \code{sfStop} in the meantime.

  If you use \pkg{snowfall} in a package argument \code{nostart} is very
  handy if mainprogram uses \pkg{snowfall} as well. If set, cluster
  setup will be skipped and both parts (package and main program) use
  the same cluster.

  If you call \code{sfInit} more than one time in a program without
  explicit calling \code{sfStop}, stopping of the cluster will be
  executed automatically. If your R-environment does not cover required
  libraries, \code{sfInit} automatically switches to sequential mode
  (with a warning). Required libraries for parallel usage are \pkg{snow}
  and depending on argument \code{type} the libraries for the
  cluster mode (none for
  socket clusters, \pkg{Rmpi} for MPI clusters, \pkg{rpvm} for
  PVM clusters and \pkg{nws} for NetWorkSpaces).

  If using Socket or NetWorkSpaces, \code{socketHosts} can be used to
  specify the hosts you want to have your workers running.
  Basically this is a list, where any entry can be a plain character
  string with IP or hostname (depending on your DNS settings). Also
  for real heterogenous clusters for any host pathes are setable. Please
  look to the acccording \pkg{snow} documentation for details.
  If you are not giving an socketlist, a list with the required amount
  of CPUs on your local machine (localhost) is used. This would be the
  easiest way to use parallel computing on a single machine, like a
  laptop.

  Note there is limit on CPUs used in one program (which can be
  configured on package installation). The current limit are 32 CPUs. If
  you need a higher amount of CPUs, call \code{sfSetMaxCPUs}
  \emph{before} the first call to \code{sfInit}. The limit is set to
  prevent inadvertently request by single users affecting the cluster as
  a whole.

  Use \code{slaveOutfile} to define a file where to write the log
  files. The file location must be available on all nodes. Beware of
  taking a location on a shared network drive! Under *nix systems, most
  likely the directories \code{/tmp} and \code{/var/tmp} are not shared
  between the different machines. The default is no output file.
  If you are using \code{sfCluster} this
  argument have no meaning as the slave logs are always created in a
  location of \code{sfClusters} choice (depending on it's configuration).

  \code{sfStop} stop cluster. If running in parallel mode, the LAM/MPI
  cluster is shut down.
  
  \code{sfParallel}, \code{sfCpus} and \code{sfSession} grant access to
  the internal state of the currently used cluster.
  All three can be configured via commandline and especially with
  \code{sfCluster} as well, but given
  arguments in \code{sfInit} always overwrite values on commandline.
  The commandline options are \option{--parallel} (empty option. If missing,
  sequential mode is forced), \option{--cpus=X} (for nodes, where X is a
  numerical value) and \option{--session=X} (with X a string).

  \code{sfParallel} returns a
  logical if program is running in parallel/cluster-mode or sequential
  on a single processor.

  \code{sfCpus} returns the size of the cluster in CPUs
  (equals the CPUs which are useable). In sequential mode \code{sfCpus}
  returns one. \code{sfNodes} is a deprecated similar to \code{sfCpus}.

  \code{sfSession} returns a string with the
  session-identification. It is mainly important if used with the
  \code{sfCluster} tool.
 
  \code{sfGetCluster} gets the \pkg{snow}-cluster handler. Use for
  direct calling of \pkg{snow} functions.

  \code{sfType} returns the type of the current cluster backend (if
  used any). The value can be SOCK, MPI, PVM or NWS for parallel
  modes or "- sequential -" for sequential execution.

  \code{sfSocketHosts} gives the list with currently used hosts for
  socket clusters. Returns empty list if not used in socket mode (means:
  \code{sfType() != 'SOCK'}).
  
  \code{sfSetMaxCPUs} enables to set a higher maximum CPU-count for this
  program. If you need higher limits, call \code{sfSetMaxCPUs} before
  \code{sfInit} with the new maximum amount.
}
\keyword{package}
\seealso{
See snow documentation for details on commands:
\code{link[snow]{snow-cluster}}
}
\examples{
\dontrun{
  # Run program in plain sequential mode.
  sfInit( parallel=FALSE )
  stopifnot( sfParallel() == FALSE )
  sfStop()

  # Run in parallel mode overwriting probably given values on
  # commandline.
  # Executes via Socket-cluster with 4 worker processes on
  # localhost.
  # This is probably the best way to use parallel computing
  # on a single machine, like a notebook, if you are not
  # using sfCluster.
  # Uses Socketcluster (Default) - which can also be stated
  # using type="SOCK".
  sfInit( parallel=TRUE, cpus=4 )
  stopifnot( sfCpus() == 4 )
  stopifnot( sfParallel() == TRUE )
  sfStop()

  # Run parallel mode (socket) with 4 workers on 3 specific machines.
  sfInit( parallel=TRUE, cpus=4, type="SOCK",
          socketHosts=c( "biom7", "biom7", "biom11", "biom12" ) )
  stopifnot( sfCpus() == 4 )
  stopifnot( sfParallel() == TRUE )
  sfStop()

  # Hook into MPI cluster.
  # Note: you can use any kind MPI cluster Rmpi supports.
  sfInit( parallel=TRUE, cpus=4, type="MPI" )
  sfStop()

  # Hook into PVM cluster.
  sfInit( parallel=TRUE, cpus=4, type="PVM" )
  sfStop()

  # Run in sfCluster-mode: settings are taken from commandline:
  # Runmode (sequential or parallel), amount of nodes and hosts which
  # are used.
  sfInit()

  # Session-ID from sfCluster (or XXXXXXXX as default)
  session <- sfSession()

  # Calling a snow function: cluster handler needed.
  parLapply( sfGetCluster(), 1:10, exp )

  # Same using snowfall wrapper, no handler needed.
  sfLapply( 1:10, exp )

  sfStop()
}
}

back to top

Software Heritage — Copyright (C) 2015–2025, The Software Heritage developers. License: GNU AGPLv3+.
The source code of Software Heritage itself is available on our development forge.
The source code files archived by Software Heritage are available under their own copyright and licenses.
Terms of use: Archive access, API— Contact— JavaScript license information— Web API