https://github.com/cran/mapReduce
Tip revision: b33ba5de8421c132f4cf0157f88ec376752ec089 authored by Christopher Brown on 24 June 2012, 00:00:00 UTC
version 1.2.6
TODO:
- Default of 'data' argument should be parent.frame()
attach( iris )
mapReduce( Species, mean( Petal.Length ), ... )
The problem is that the notion of environment versus data.frame is lost.
This still needs to be checked.
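
A minimal sketch of how a parent.frame() default could keep working for both
environments and data frames. groupEval is a hypothetical helper, not part of
the package; it assumes every variable named in the two expressions has the
same length, and it copies those variables into a data frame before splitting.

```r
# Hypothetical helper (not the package's mapReduce): evaluate 'expr' per group
# of 'map', looking variables up in 'data', which defaults to the caller's frame.
groupEval <- function(map, expr, data = parent.frame()) {
  map_expr <- substitute(map)
  val_expr <- substitute(expr)

  # Variable names used by either expression (function names are excluded).
  vars <- unique(c(all.vars(map_expr), all.vars(val_expr)))

  # Fetch the variables from an environment or from a list/data frame.
  cols <- if (is.environment(data)) {
    mget(vars, envir = data, inherits = TRUE)
  } else {
    as.list(data)[vars]
  }

  # Assumption: all fetched variables have equal length.
  df <- as.data.frame(cols, stringsAsFactors = FALSE)

  key <- eval(map_expr, df)
  sapply(split(df, key), function(piece) eval(val_expr, piece))
}

# Same call, with a data frame and with an environment holding the columns.
by_frame <- groupEval(Species, mean(Petal.Length), data = iris)
by_env   <- groupEval(Species, mean(Petal.Length), data = list2env(iris))
```

Copying the referenced variables into a data frame is one way to recover the
row structure that a bare environment does not have.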
- 'data' could be a data structure that is already split. It is very
common to work with the same data and not want to resplit it for each
operation. See e.g. split.frame.
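
A sketch of the pre-split idea, assuming the split data is just the list
returned by split(). reducePieces is a hypothetical name, not an existing
function in the package.

```r
# Split once, then reuse the pieces for several reductions.
pieces <- split(iris, iris$Species)

# Hypothetical helper: evaluate an unquoted expression on each piece.
reducePieces <- function(pieces, expr) {
  e <- substitute(expr)
  caller <- parent.frame()   # lets the expression see the caller's variables
  sapply(pieces, function(piece) eval(e, piece, caller))
}

len <- reducePieces(pieces, mean(Petal.Length))
wid <- reducePieces(pieces, mean(Petal.Width))
```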
- Lazy evaluation
- Accept a featureList instead of ...
Write method for extracting features from a featureList
mapReduce( map, featureList, ... )
Where ... is an unquoted list of strings
Returns a data frame if all features are vectors
This seems to indicate that a package might be in order.
fl <- alist( length=mean(Petal.Length), width=mean(Petal.Width) )
mapReduce( Species, fl, data=iris )
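
A sketch of what a featureList could look like. Since quote() takes a single
expression, this version holds the named, unevaluated features in an alist().
applyFeatures is a hypothetical stand-in for the mapReduce call above, and
passing the key as a string is an assumption made to keep the example short.

```r
# Named, unevaluated feature expressions.
fl <- alist(length = mean(Petal.Length), width = mean(Petal.Width))

# Hypothetical helper: evaluate every feature on every group of 'key'.
applyFeatures <- function(key, fl, data) {
  pieces <- split(data, data[[key]])

  # One list of feature values per group.
  rows <- lapply(pieces, function(piece) lapply(fl, eval, envir = piece))

  # Only bind into a data frame when every feature is a single value.
  all_scalar <- all(vapply(rows, function(r) all(lengths(r) == 1L), logical(1)))
  if (!all_scalar) return(rows)

  do.call(rbind, lapply(rows, as.data.frame))
}

res <- applyFeatures("Species", fl, iris)
```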
- For the reducer, support:
- lists: ??
- functions,
- expressions
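
A sketch of dispatching on the reducer type. applyReducer is a hypothetical
name, and the handling of lists (apply each element in turn) is only a guess
at what the "??" above might become.

```r
# Hypothetical helper: apply one reducer to one piece of split data.
applyReducer <- function(reducer, piece) {
  if (is.function(reducer)) {
    reducer(piece)                                 # function: call it on the piece
  } else if (is.list(reducer)) {
    lapply(reducer, applyReducer, piece = piece)   # list: apply each element
  } else if (is.language(reducer)) {
    eval(reducer, piece)                           # expression: evaluate in the piece
  } else {
    stop("unsupported reducer type")
  }
}

setosa <- iris[iris$Species == "setosa", ]
n_rows  <- applyReducer(nrow, setosa)
avg_len <- applyReducer(quote(mean(Petal.Length)), setosa)
both    <- applyReducer(list(n = nrow, len = quote(mean(Petal.Length))), setosa)
```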
- Lazy evaluation,
return a promise rather than the result. The promise can be used later
to distribute the calculation.
The feature is not evaluated until it is needed.
getFeature( feature_obj, ...names... )
- If the features are there, returns the features
- If the features are not evaluated, evaluates the features
on the data
- newFeatures( data, map, ...features... )
slots for @data = split data
map
features can be a list as well.
- acct_features( list ) : this could be a simple object that
returns a list or loads a list ...
the features are also tied to the object
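
A sketch of the lazy feature object described above. It uses an environment
instead of the "@" slots mentioned in the notes so that results can be cached
in place; that choice, the string-valued 'map' argument, and the exact
signatures of newFeatures and getFeature are all assumptions.

```r
# Hypothetical constructor: store the split data and the unevaluated features.
newFeatures <- function(data, map, ...) {
  obj <- new.env()
  obj$data     <- split(data, data[[map]])
  obj$features <- eval(substitute(alist(...)))   # keep ... unevaluated
  obj$cache    <- list()
  obj
}

# Hypothetical accessor: evaluate a feature only the first time it is requested.
getFeature <- function(obj, ...) {
  nms <- c(...)
  stopifnot(all(nms %in% names(obj$features)))
  for (nm in nms) {
    if (is.null(obj$cache[[nm]])) {
      expr <- obj$features[[nm]]
      obj$cache[[nm]] <- sapply(obj$data, function(piece) eval(expr, piece))
    }
  }
  obj$cache[nms]
}

feats <- newFeatures(iris, "Species",
                     length = mean(Petal.Length), width = mean(Petal.Width))
first  <- getFeature(feats, "length")   # evaluated here
second <- getFeature(feats, "length")   # served from the cache
```

Nothing is computed in newFeatures; the object only records what would be
computed, which is what would allow the work to be distributed later.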
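# f1 <- function( data, ... ) {
#
#   # capture the ... arguments unevaluated, as a named list of calls
#   exprs <- eval( substitute( alist(...) ) )
#   print( exprs )
#
#   # evaluate each captured expression with 'data' as the environment
#   lapply( exprs, eval, envir=data )
# }
#
# f1( test, min=min(amt), max=max(amt) )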