https://github.com/cran/mapReduce
Raw File
Tip revision: b33ba5de8421c132f4cf0157f88ec376752ec089 authored by Christopher Brown on 24 June 2012, 00:00:00 UTC
version 1.2.6
Tip revision: b33ba5d
TODO
TODO:

- Default of 'data' argument should be parent.frame() 
    attach( iris )
    mapReduce( Species, mean( Petal.Length ), ... )  
  The problem is that the notion of environment or data.frame is lost.
  Have to check that 

- 'data' could be a data sructure that is already split.  It is very
  common to work with data and not have to resplit it for each operation.
  See e.g. split.frame. 
  - Lazy evaluation 

- Accept a featureList instead of ... 
  Write method for extracting features from a featureList
  mapReduce( map, featureList, ... )
      Where ... are an unquoted list of strings
      returns a data frame if all features are vectors
  This seems to indicate that a package might be in order.
    fl <- quote( length=mean(Petal.Length), width=mean(Petal.Width) )
    mapReduce( Species, fl, data=iris )

- For the reducer, support:
  - lists: ??
  - functions,
  - expressions 
  

- Lazy evaluation, 
  return a promise, but not the result. This can be used to later
  distribute the calculation.
  The feature is not evaluated until it is needed.
  getFeature( feature_obj, ...names... ) 
    - If the features are there, returns the features
    - If the features are not evaluated, evals the features
      on the data 
  
- newFeatures( data, map, ...features... )
    slots for @data = split data
    map 
    features can be a list as well.
      
- acct_features( list ) : this could be a simple object that 
    returns a list or loads a list ...
    the features are also tied to the object

# f1<-function(data, ... ) {
# 
#     print( substitute(...[]) )
# 
#     innerFun = function( data, ... )
#         eval( ..., data )
#      
#     innerFun( data, substitute( c(...) ) ) 
# }
# 
# f1( test, min=min(amt), max=max(amt) )  

back to top