https://github.com/mongodb/mongo-hadoop

Revision Message Commit Date
10d65a8 Update streaming/pymongo_hadoop/output.py to use BSON object 22 February 2012, 23:43:18 UTC
9c1b12c Fixed pymongo_hadoop output to use BSON.encode 22 February 2012, 23:41:58 UTC
8612fd6 Added support to streaming for the -file flag to distribute files out to the cluster if they don't exist. 22 February 2012, 21:31:25 UTC
1100505 Don't build streaming as a dependent 22 February 2012, 21:00:00 UTC
0eb7b95 Fix missing sbt plugins. 22 February 2012, 20:17:32 UTC
19ec395 Make InputFormat and OutputFormat implied on Streaming jobs, defaulting to the Mongo ones. 22 February 2012, 19:57:14 UTC
6896fbd Streaming now builds as a fat assembly jar and works. 22 February 2012, 19:28:24 UTC
289912e Added an 0.23 / cdh4 build. No longer allow raw "cdh" or "cloudera" build artifacts to avoid confusion as to 'which cloudera?' 22 February 2012, 17:59:24 UTC
364e9c2 Added a .23 build, based on Cloudera's current distro (should be binary compatible with stock) 22 February 2012, 17:20:32 UTC
5ae5548 Merge pull request #33 from bpfoster/master Make combiner truly optional 21 February 2012, 17:52:29 UTC
447e236 Merge pull request #35 from rjurney/master HADOOP-20 - Applied cleaned up for the patch for failing to authenticate, given username/pass in connection string 21 February 2012, 17:52:08 UTC
cb13ae8 Applied cleaned up patch for HADOOP-20 16 February 2012, 02:35:28 UTC
14f8261 Updated gitignore to ignore more stuff 16 February 2012, 02:24:10 UTC
eee397b Merge pull request #34 from tychoish/master Documentation/Readme Tweaks 13 February 2012, 20:10:15 UTC
10b41ec MongoHadoop readme/documentation revision 13 February 2012, 19:48:37 UTC
4561759 Update README.md 13 February 2012, 00:29:26 UTC
070e097 Need to include organization in buildSettings in order to properly publish maven 12 February 2012, 21:49:11 UTC
7731682 Release r1.0.0-rc0 12 February 2012, 21:24:30 UTC
3ea15b3 Note HADOOP-19 in Knonw Issues. 12 February 2012, 21:21:38 UTC
16fc640 Versioning 12 February 2012, 21:20:00 UTC
c012e2a Cleanup examples further (strip maven builds) 12 February 2012, 21:19:18 UTC
d73c432 Remove TODO block, as this is managed on JIRA now 12 February 2012, 21:18:33 UTC
1bcafd2 Formatting 12 February 2012, 21:16:51 UTC
eea6968 Further doc cleanup. 12 February 2012, 21:15:59 UTC
6f410e3 Saner CDH Artifact 12 February 2012, 21:12:14 UTC
13fc9df Updated Docs. 12 February 2012, 21:05:31 UTC
3fe38e9 HADOOP-18: Addeda a "default" build mode with no linkins 12 February 2012, 21:04:10 UTC
a6f0801 HADOOP-18: Root project needs dependentSettings so it includes core dependency. 12 February 2012, 20:57:22 UTC
1f12b71 HADOOP-18: Fixed pig dependency resolution 12 February 2012, 20:52:29 UTC
0aa695c HADOOP-18: Aggregate Pig dependency in master project, add dependency on core 12 February 2012, 20:27:01 UTC
44bb93b HADOOP-18: Add Pig build, Remove Maven POMs; also culled some extra examples for the time being 12 February 2012, 20:25:04 UTC
0bc79d7 HADOOP-18: Flume is no longer a "Hadoop Version" dependent build, building and standing alone instead 12 February 2012, 20:21:51 UTC
65273db HADOOP-18: Set Scala Library to be a test only dependency 12 February 2012, 20:10:24 UTC
bbd356c HADOOP-18: Fixed SBT Build to include Hadoop build version 12 February 2012, 19:39:19 UTC
f71c41a If combiner is not specified, do not pass it to Hadoop. While the combiner should be optional, giving Hadoop a null combiner will result in a NullPointerException. 08 February 2012, 16:41:56 UTC
0d5b2fb HADOOP-18: SBT Build now works for Cloudera complete with tests! 06 February 2012, 23:26:51 UTC
8355175 HADOOP-18: Thanks to @jteigen got the def of Hadoop Base fixed 06 February 2012, 22:33:00 UTC
87b2c15 HADOOP-18: Add repos/resolvers 06 February 2012, 21:05:34 UTC
77d62e2 Added Flume 06 February 2012, 20:57:44 UTC
0b4ea84 HADDOP-18: Introducing SBT for build management 06 February 2012, 20:51:17 UTC
8dac711 Updated to disable streaming with Hadoop 1.0. 06 February 2012, 15:52:47 UTC
72976da Update CDH version, link pig version to parent distro 06 February 2012, 15:47:06 UTC
50da29d Migrate Streaming to examples directories; Update POM to use Hadoop 1.0 and default to it 03 February 2012, 22:05:36 UTC
6f5ef9b Merge branch 'master' of github.com:mongodb/mongo-hadoop 02 February 2012, 19:20:05 UTC
c02a855 Some cleanup and added examples in organized directories including new streaming example. 02 February 2012, 19:19:46 UTC
f6dc8e3 Merge pull request #31 from tlockney/master Just a quick fix up to the docs for the flume sink. (@tlockney) 20 January 2012, 20:23:02 UTC
24c4e06 Updated to reflect changes in Flume config and Maven-based build 20 January 2012, 18:42:09 UTC
d31e163 Update Pig to 0.9.1 18 January 2012, 15:40:01 UTC
77590bf HADOOP-3: Basic util and test for reading BSON files from any InputStream 11 January 2012, 19:29:35 UTC
d82273a Fixed a typo in ConfigUtil which broke InputSplits 10 January 2012, 22:23:25 UTC
b707af4 HADOOP-16: Corrected an issue where InputSPlits didn't deserialize correctly. 10 January 2012, 22:13:07 UTC
5591287 Updated Mongo Java Driver to 2.7.2 10 January 2012, 20:14:54 UTC
d308f7c Added the Specs2 testing framework as a dependency for core. 10 January 2012, 20:12:59 UTC
419021e Fixes HADOOP-14 - MongoInputSplit should be BSON Serialized - Wrap split data in a BSON Document and ser/dser as bson appropriately instead of JSON data 03 January 2012, 18:56:36 UTC
0b46e16 Merge pull request #29 from rjurney/master Added Tuples and Bags serialization support to MongoStorage for Pig 03 January 2012, 15:44:18 UTC
b0e3023 Added tuples and bags (one level deep) to MongoStorage for Pig 01 January 2012, 04:19:04 UTC
5507c79 A few minor cleanups; create_input_splits should have defaulted to true, some cleanup of python streaming code. 08 December 2011, 19:05:55 UTC
6a6a2da Made QUERY_NOTIMEOUT a configurable value, defaulting again to false. 07 December 2011, 19:10:15 UTC
b510bc6 HADOOP-2 - Support Splitting on Unsharded Clusters * If splitVector produces no splits, force a fallthrough to a single split. 07 December 2011, 18:09:14 UTC
50bac7f Slight format tweak on docs 07 December 2011, 18:03:21 UTC
de27298 HADOOP-2 - Support Splitting on Unsharded Clusters * Updated documentation 07 December 2011, 18:02:12 UTC
286cfed Merge branch 'master' of https://github.com/mongodb/mongo-hadoop 07 December 2011, 17:29:49 UTC
fd479a6 HADOOP-2 - Support Splitting on Unsharded Clusters * Done, fully works with customisation of several areas * See new fields in /mongo-defaults.xml for docs on each feature and how to use until more formal docs are written up. 07 December 2011, 17:29:17 UTC
f16d6be Updated link to issue tracking location 07 December 2011, 15:25:50 UTC
ebebbd5 Global "default" example defaults doc for Mongo specific hadoop settings. 06 December 2011, 20:57:23 UTC
b0dca46 HADOOP-12 - Allow user to choose the field fed into the Mapper as the "key" * Defaults to _id, but setting mongo.input.key will change the field fed in as your Mapper key 06 December 2011, 20:40:34 UTC
a9e889b HADOOP-12 - Allow user to choose the field fed into the Mapper as the "key" * Config values for input_key 06 December 2011, 19:10:46 UTC
1570040 New "UFO Sightings" example, useful for sharding testing (Just import it a bunch of times for dups) 04 December 2011, 15:13:20 UTC
1553bae Cleaning up some logic and code flow for input splitting 03 December 2011, 22:04:26 UTC
2677455 More corrections to config for Treasury 03 December 2011, 21:22:40 UTC
383a339 Fix jar location 03 December 2011, 20:15:21 UTC
199a1b0 Clean up & Correct README. 03 December 2011, 20:09:19 UTC
7baa7a4 Merge pull request #19 from stackmob/master Support for format strings in mongo url path names 27 October 2011, 16:04:25 UTC
3b99ea2 removed email address... peeps can contact me through github. 27 October 2011, 13:41:56 UTC
b37dea7 ignore target directories 31 August 2011, 00:50:31 UTC
afb7d5b pom fixes 31 August 2011, 00:43:11 UTC
b2eb081 Bucketed MongoDB sink 31 August 2011, 00:42:59 UTC
0af08c0 Merge pull request #14 from dcrosta/master Clean up Mappers for a more uniform interface like the Reducer ones. 04 August 2011, 16:22:50 UTC
dd63006 touch up docs 04 August 2011, 16:21:20 UTC
e1a038d add BSONMapper, KeyValueBSONMapper these classes wrap the input and output classes to allow writing generator pipeline functions for mappers 04 August 2011, 16:13:30 UTC
0d463cc Pom files for examples 02 August 2011, 16:11:50 UTC
f61a48b Rename slave ok, use shards and use chunks options to be more sane/normalized 02 August 2011, 16:11:30 UTC
c5a7c9f Cleaning up packaging of example programs. 02 August 2011, 15:45:34 UTC
fd0fe8e Ignore 01 August 2011, 21:39:28 UTC
0e68798 Include dependencies and Main Class attribute in Streaming Jar; fix run demo scripts to execute against new build. 01 August 2011, 21:31:23 UTC
36e2c05 Fix pathing of Flume 01 August 2011, 21:06:40 UTC
fac7381 Fix README links for streaming 01 August 2011, 20:47:16 UTC
8bf469f Fix links in readme 01 August 2011, 20:46:47 UTC
54007db Refactor project into modules with Maven based profile build. Profile build allows for you to pick what Hadoop distribution you want to build for. 01 August 2011, 20:43:40 UTC
054f4a8 Working demos of both "regular" and "key value" streaming python 01 August 2011, 18:16:44 UTC
3cd9d06 Refactored InputSplit calculation for sharding into a utility class, based all Streaming code for "Classic MapReduce" code against latest "new" mapReduce entries and share one set of "Calculate Splits" code across both UIs. Also fixed code for reading configuration for sharding to conform with the rest of the project standard, encapsulated fully within MongoConfigUtil. 29 July 2011, 19:26:19 UTC
6c3e4f0 Remove setting of cursor size as we ignore that for progress tracking. 29 July 2011, 18:29:50 UTC
69df2f6 Removed ip_location example as it has a bunch of strange "Special" operators hardcoded into the Hadoop library just for it's own usage. Should be readded using proper MongoDB syntax. At this time, we don't support in place updates of Data with the Hadoop driver and hardcoding "special" operators serves only to confuse users. There is an open TODO item to investigate how to support incremental map/reduce, but updating values in place is not an appropriate way to handle a MapReduce approach. 29 July 2011, 18:27:35 UTC
f2cf807 <reformatting> 29 July 2011, 18:15:28 UTC
ba4010d Move Constructor for MongoInputSplit to top of class where it belongs. 29 July 2011, 18:03:27 UTC
b2a2c2a Reformat Examples to Project Standard. 29 July 2011, 17:58:44 UTC
e189109 Cleanup and reformat code to project standard. 29 July 2011, 17:57:05 UTC
18f42b4 Fixes #9, integrate performance and array sizing tweaks from @josephks to BSONWritable 29 July 2011, 17:44:26 UTC
d5a6734 Refs #9, Logging Cleanups from @josephks 29 July 2011, 17:44:09 UTC
edc36f7 Refs #9, merge small fixes to temp table deletion in WebLogAnalyzer example from @josephks 29 July 2011, 17:25:47 UTC