https://github.com/apache/spark

Revision Message Commit Date
9f20b6b Added reduceByKey operation for RDDs containing pairs 04 October 2010, 03:28:20 UTC
34ecced Fixed a rather bad bug in HDFS files that has been in for a while: caching was not working because Split objects did not have a consistent toString value 03 October 2010, 05:06:06 UTC
b6debf5 Merge branch 'matei-logging' 29 September 2010, 17:59:01 UTC
f50b23b Increase default locality wait to 3s. Fixes #20. 29 September 2010, 17:04:00 UTC
a7c0e2a Made task-finished log messages slightly nicer 29 September 2010, 07:22:11 UTC
40f6914 Made spark-executor output slightly nicer 29 September 2010, 07:22:09 UTC
0d28bdc A couple of minor fixes: - Don't include trailing $'s in class names of Scala objects - Report errors using logError instead of printStackTrace 29 September 2010, 07:10:46 UTC
0fa70a6 Updated log4j.properties to ignore jetty messages below WARN level 29 September 2010, 06:58:19 UTC
7090dea Changed printlns to log statements and fixed a bug in run that was causing it to fail on a Mesos cluster 29 September 2010, 06:54:29 UTC
516248a Added log4j.properties 29 September 2010, 06:22:39 UTC
332c8b8 Removed Hadoop's SLF4J jars 29 September 2010, 06:16:28 UTC
db623de Added Logging trait 29 September 2010, 06:12:23 UTC
c7d233b Added log4j jars and paths 29 September 2010, 06:08:01 UTC
e5e9ede Merge branch 'http-repl-class-serving' 29 September 2010, 05:43:04 UTC
e068f21 More work on HTTP class loading 29 September 2010, 05:32:38 UTC
7ef3a20 Modified the interpreter to serve classes to the executors using a Jetty HTTP server instead of a shared (NFS) file system. 29 September 2010, 00:55:11 UTC
b749f0e fixed typo in printing which task is already finished 29 September 2010, 00:28:54 UTC
366c09c Let's use future instead of actors 13 September 2010, 22:30:22 UTC
0896fd6 Added fork()/join() operations for SparkContext, as well as corresponding changes to MesosScheduler to support multiple ParallelOperations. 12 September 2010, 16:01:44 UTC
6f0d2c1 round robin scheduling of tasks has been added 07 September 2010, 21:03:59 UTC
e9ffe6c now adding the Split object. 01 September 2010, 20:31:06 UTC
7a9ff1c - Got rid of 'Split' type parameter in RDD - Added SampledRDD, SplitRDD and CartesianRDD - Made Split a class rather than a type parameter - Added numCores() to Scheduler to help set default level of parallelism 31 August 2010, 19:08:09 UTC
ea8c278 now we have sampling with replacement (at least on a per-split basis) 18 August 2010, 22:59:35 UTC
156bccb HdfsFile.scala: added a try/catch block to exit gracefully for corrupted gzip files MesosScheduler.scala: formatted the slaveOffer() output to include the serialized task size RDD.scala: added support for aggregating RDDs on a per-split basis (aggregateSplit()) as well as for sampling without replacement (sample()) 18 August 2010, 22:25:57 UTC
75b2ca1 Removed HOD from included Hadoop because it was making the project count as Python on GitHub :|. 17 August 2010, 06:16:35 UTC
1cbffaa Modified Scala interpreter to have it avoid computing string versions of all results when :silent is enabled, so that it is easier to work with large arrays in Spark. (The string version of an array of numbers might not fit in memory even though the array itself does.) 16 August 2010, 01:33:27 UTC
1600c31 Added latest mesos.jar 14 August 2010, 02:03:46 UTC
0b19592 Improved README and added blank templates for config files. 14 August 2010, 01:54:32 UTC
3d8d7fd Bug fix from Justin 13 August 2010, 18:29:19 UTC
a9481c3 Update to work with latest Mesos API changes 13 August 2010, 07:39:36 UTC
4488b3b Fixed a bug where we would incorrectly decide we've finished a parallel operation if Mesos tells us a task is finished twice 09 August 2010, 23:46:14 UTC
f415b07 Change shell framework's name to "Spark shell" 06 August 2010, 19:07:26 UTC
0e6e577 Add Mesos native library to .gitignore 26 July 2010, 03:54:56 UTC
b56ed67 Updated code to work with Nexus->Mesos name change 26 July 2010, 03:53:46 UTC
4239f76 Removed Matei's old start on broadcast code 26 July 2010, 03:46:44 UTC
e240e38 Updated a bunch of libraries, and increased the default memory in run so that unit tests can run successfully. 26 July 2010, 01:10:03 UTC
0435de9 Made it possible to set various Spark options and environment variables in general through a conf/spark-env.sh script. 20 July 2010, 01:00:30 UTC
edad598 Updated Spark to run with latest Mesos build and Scala-2.8.0.final. 19 July 2010, 22:03:49 UTC
7d0eae1 Merge branch 'dev' Conflicts: src/scala/spark/HdfsFile.scala src/scala/spark/NexusScheduler.scala src/test/spark/repl/ReplSuite.scala 27 June 2010, 22:21:54 UTC
6aacaa6 Made Spark shell class directory configurable. 18 June 2010, 23:24:18 UTC
323571a Initial work on union operation. 18 June 2010, 19:54:33 UTC
b541988 Added appropriate hashCode, equals and toString to ParallelArraySplit. 17 June 2010, 20:19:02 UTC
cd247b7 Created common RDD superclass for distributed files and parallel arrays. This also means that parallel arrays now get all the functionality files used to have (filter, map, reduce, cache, etc). 17 June 2010, 19:49:42 UTC
77103ea Fixed README 11 June 2010, 21:55:23 UTC
0d9c51d Added back REPL tests 11 June 2010, 17:03:01 UTC
e58fba2 Fix junk stripper 11 June 2010, 08:18:43 UTC
396f48e New interpreter port for Scala 2.8 interpreter 11 June 2010, 08:10:03 UTC
4eb39e0 New nexus.jar 11 June 2010, 05:41:23 UTC
1473987 Fixed classpath for tests 11 June 2010, 05:36:45 UTC
359e84c Use new Nexus API 11 June 2010, 05:09:13 UTC
92246c8 Initial work on 2.8 port 11 June 2010, 04:50:55 UTC
c177a54 Ignore .DS_Store 11 June 2010, 01:08:59 UTC
1c90a32 Fix native build to use build directory 30 April 2010, 22:41:21 UTC
06aac8a Imported changes from old repository (mostly Mosharaf's work, plus some fault tolerance code). 04 April 2010, 06:44:55 UTC
df29d0e Initial commit 29 March 2010, 23:17:55 UTC