10d65a8 | Steve Francia | 22 February 2012, 23:43:18 UTC | Update streaming/pymongo_hadoop/output.py to use BSON object | 22 February 2012, 23:43:18 UTC |
9c1b12c | Steve Francia | 22 February 2012, 23:41:58 UTC | Fixed pymongo_hadoop output to use BSON.encode | 22 February 2012, 23:41:58 UTC |
8612fd6 | Brendan W. McAdams | 22 February 2012, 21:31:25 UTC | Added support to streaming for the -file flag to distribute files out to the cluster if they don't exist. | 22 February 2012, 21:31:25 UTC |
1100505 | Brendan W. McAdams | 22 February 2012, 21:00:00 UTC | Don't build streaming as a dependent | 22 February 2012, 21:00:00 UTC |
0eb7b95 | Brendan W. McAdams | 22 February 2012, 20:17:32 UTC | Fix missing sbt plugins. | 22 February 2012, 20:17:32 UTC |
19ec395 | Brendan W. McAdams | 22 February 2012, 19:57:14 UTC | Make InputFormat and OutputFormat implied on Streaming jobs, defaulting to the Mongo ones. | 22 February 2012, 19:57:14 UTC |
6896fbd | Brendan W. McAdams | 22 February 2012, 19:28:24 UTC | Streaming now builds as a fat assembly jar and works. | 22 February 2012, 19:28:24 UTC |
289912e | Brendan W. McAdams | 22 February 2012, 17:59:24 UTC | Added an 0.23 / cdh4 build. No longer allow raw "cdh" or "cloudera" build artifacts to avoid confusion as to 'which cloudera?' | 22 February 2012, 17:59:24 UTC |
364e9c2 | Brendan W. McAdams | 22 February 2012, 17:19:46 UTC | Added a .23 build, based on Cloudera's current distro (should be binary compatible with stock) | 22 February 2012, 17:20:32 UTC |
5ae5548 | Brendan W. McAdams | 21 February 2012, 17:52:29 UTC | Merge pull request #33 from bpfoster/master Make combiner truly optional | 21 February 2012, 17:52:29 UTC |
447e236 | Brendan W. McAdams | 21 February 2012, 17:52:08 UTC | Merge pull request #35 from rjurney/master HADOOP-20 - Applied cleaned up for the patch for failing to authenticate, given username/pass in connection string | 21 February 2012, 17:52:08 UTC |
cb13ae8 | Russell Jurney | 16 February 2012, 02:35:28 UTC | Applied cleaned up patch for HADOOP-20 | 16 February 2012, 02:35:28 UTC |
14f8261 | Russell Jurney | 16 February 2012, 02:24:10 UTC | Updated gitignore to ignore more stuff | 16 February 2012, 02:24:10 UTC |
eee397b | Brendan W. McAdams | 13 February 2012, 20:10:15 UTC | Merge pull request #34 from tychoish/master Documentation/Readme Tweaks | 13 February 2012, 20:10:15 UTC |
10b41ec | tycho garen | 13 February 2012, 18:15:18 UTC | MongoHadoop readme/documentation revision | 13 February 2012, 19:48:37 UTC |
4561759 | Brendan W. McAdams | 13 February 2012, 00:29:26 UTC | Update README.md | 13 February 2012, 00:29:26 UTC |
070e097 | Brendan W. McAdams | 12 February 2012, 21:49:11 UTC | Need to include organization in buildSettings in order to properly publish maven | 12 February 2012, 21:49:11 UTC |
7731682 | Brendan W. McAdams | 12 February 2012, 21:24:30 UTC | Release r1.0.0-rc0 | 12 February 2012, 21:24:30 UTC |
3ea15b3 | Brendan W. McAdams | 12 February 2012, 21:21:38 UTC | Note HADOOP-19 in Known Issues. | 12 February 2012, 21:21:38 UTC |
16fc640 | Brendan W. McAdams | 12 February 2012, 21:20:00 UTC | Versioning | 12 February 2012, 21:20:00 UTC |
c012e2a | Brendan W. McAdams | 12 February 2012, 21:19:18 UTC | Cleanup examples further (strip maven builds) | 12 February 2012, 21:19:18 UTC |
d73c432 | Brendan W. McAdams | 12 February 2012, 21:18:33 UTC | Remove TODO block, as this is managed on JIRA now | 12 February 2012, 21:18:33 UTC |
1bcafd2 | Brendan W. McAdams | 12 February 2012, 21:16:51 UTC | Formatting | 12 February 2012, 21:16:51 UTC |
eea6968 | Brendan W. McAdams | 12 February 2012, 21:15:59 UTC | Further doc cleanup. | 12 February 2012, 21:15:59 UTC |
6f410e3 | Brendan W. McAdams | 12 February 2012, 21:12:14 UTC | Saner CDH Artifact | 12 February 2012, 21:12:14 UTC |
13fc9df | Brendan W. McAdams | 12 February 2012, 21:05:31 UTC | Updated Docs. | 12 February 2012, 21:05:31 UTC |
3fe38e9 | Brendan W. McAdams | 12 February 2012, 21:04:10 UTC | HADOOP-18: Added a "default" build mode with no linkins | 12 February 2012, 21:04:10 UTC |
a6f0801 | Brendan W. McAdams | 12 February 2012, 20:57:22 UTC | HADOOP-18: Root project needs dependentSettings so it includes core dependency. | 12 February 2012, 20:57:22 UTC |
1f12b71 | Brendan W. McAdams | 12 February 2012, 20:52:29 UTC | HADOOP-18: Fixed pig dependency resolution | 12 February 2012, 20:52:29 UTC |
0aa695c | Brendan W. McAdams | 12 February 2012, 20:27:01 UTC | HADOOP-18: Aggregate Pig dependency in master project, add dependency on core | 12 February 2012, 20:27:01 UTC |
44bb93b | Brendan W. McAdams | 12 February 2012, 20:25:04 UTC | HADOOP-18: Add Pig build, Remove Maven POMs; also culled some extra examples for the time being | 12 February 2012, 20:25:04 UTC |
0bc79d7 | Brendan W. McAdams | 12 February 2012, 20:21:51 UTC | HADOOP-18: Flume is no longer a "Hadoop Version" dependent build, building and standing alone instead | 12 February 2012, 20:21:51 UTC |
65273db | Brendan W. McAdams | 12 February 2012, 20:10:24 UTC | HADOOP-18: Set Scala Library to be a test only dependency | 12 February 2012, 20:10:24 UTC |
bbd356c | Brendan W. McAdams | 12 February 2012, 19:39:19 UTC | HADOOP-18: Fixed SBT Build to include Hadoop build version | 12 February 2012, 19:39:19 UTC |
f71c41a | bfoster | 08 February 2012, 16:41:56 UTC | If combiner is not specified, do not pass it to Hadoop. While the combiner should be optional, giving Hadoop a null combiner will result in a NullPointerException. | 08 February 2012, 16:41:56 UTC |
0d5b2fb | Brendan W. McAdams | 06 February 2012, 23:26:51 UTC | HADOOP-18: SBT Build now works for Cloudera complete with tests! | 06 February 2012, 23:26:51 UTC |
8355175 | Brendan W. McAdams | 06 February 2012, 22:33:00 UTC | HADOOP-18: Thanks to @jteigen got the def of Hadoop Base fixed | 06 February 2012, 22:33:00 UTC |
87b2c15 | Brendan W. McAdams | 06 February 2012, 21:05:34 UTC | HADOOP-18: Add repos/resolvers | 06 February 2012, 21:05:34 UTC |
77d62e2 | Brendan W. McAdams | 06 February 2012, 20:57:44 UTC | Added Flume | 06 February 2012, 20:57:44 UTC |
0b4ea84 | Brendan W. McAdams | 06 February 2012, 20:51:17 UTC | HADOOP-18: Introducing SBT for build management | 06 February 2012, 20:51:17 UTC |
8dac711 | Brendan W. McAdams | 06 February 2012, 15:52:47 UTC | Updated to disable streaming with Hadoop 1.0. | 06 February 2012, 15:52:47 UTC |
72976da | Brendan W. McAdams | 06 February 2012, 15:47:06 UTC | Update CDH version, link pig version to parent distro | 06 February 2012, 15:47:06 UTC |
50da29d | Brendan W. McAdams | 03 February 2012, 22:05:36 UTC | Migrate Streaming to examples directories; Update POM to use Hadoop 1.0 and default to it | 03 February 2012, 22:05:36 UTC |
6f5ef9b | Brendan W. McAdams | 02 February 2012, 19:20:05 UTC | Merge branch 'master' of github.com:mongodb/mongo-hadoop | 02 February 2012, 19:20:05 UTC |
c02a855 | Brendan W. McAdams | 02 February 2012, 19:19:46 UTC | Some cleanup and added examples in organized directories including new streaming example. | 02 February 2012, 19:19:46 UTC |
f6dc8e3 | Brendan W. McAdams | 20 January 2012, 20:23:02 UTC | Merge pull request #31 from tlockney/master Just a quick fix up to the docs for the flume sink. (@tlockney) | 20 January 2012, 20:23:02 UTC |
24c4e06 | Thomas Lockney | 20 January 2012, 18:42:09 UTC | Updated to reflect changes in Flume config and Maven-based build | 20 January 2012, 18:42:09 UTC |
d31e163 | Brendan W. McAdams | 18 January 2012, 15:40:01 UTC | Update Pig to 0.9.1 | 18 January 2012, 15:40:01 UTC |
77590bf | Brendan W. McAdams | 11 January 2012, 19:29:35 UTC | HADOOP-3: Basic util and test for reading BSON files from any InputStream | 11 January 2012, 19:29:35 UTC |
d82273a | Brendan W. McAdams | 10 January 2012, 22:23:25 UTC | Fixed a typo in ConfigUtil which broke InputSplits | 10 January 2012, 22:23:25 UTC |
b707af4 | Brendan W. McAdams | 10 January 2012, 22:13:07 UTC | HADOOP-16: Corrected an issue where InputSplits didn't deserialize correctly. | 10 January 2012, 22:13:07 UTC |
5591287 | Brendan W. McAdams | 10 January 2012, 20:14:54 UTC | Updated Mongo Java Driver to 2.7.2 | 10 January 2012, 20:14:54 UTC |
d308f7c | Brendan W. McAdams | 10 January 2012, 20:12:59 UTC | Added the Specs2 testing framework as a dependency for core. | 10 January 2012, 20:12:59 UTC |
419021e | Brendan W. McAdams | 03 January 2012, 18:56:36 UTC | Fixes HADOOP-14 - MongoInputSplit should be BSON Serialized - Wrap split data in a BSON Document and ser/dser as bson appropriately instead of JSON data | 03 January 2012, 18:56:36 UTC |
0b46e16 | Brendan W. McAdams | 03 January 2012, 15:44:18 UTC | Merge pull request #29 from rjurney/master Added Tuples and Bags serialization support to MongoStorage for Pig | 03 January 2012, 15:44:18 UTC |
b0e3023 | Russell Jurney | 01 January 2012, 04:19:04 UTC | Added tuples and bags (one level deep) to MongoStorage for Pig | 01 January 2012, 04:19:04 UTC |
5507c79 | Brendan W. McAdams | 08 December 2011, 19:05:55 UTC | A few minor cleanups; create_input_splits should have defaulted to true, some cleanup of python streaming code. | 08 December 2011, 19:05:55 UTC |
6a6a2da | Brendan W. McAdams | 07 December 2011, 19:10:15 UTC | Made QUERY_NOTIMEOUT a configurable value, defaulting again to false. | 07 December 2011, 19:10:15 UTC |
b510bc6 | Brendan W. McAdams | 07 December 2011, 18:09:14 UTC | HADOOP-2 - Support Splitting on Unsharded Clusters * If splitVector produces no splits, force a fallthrough to a single split. | 07 December 2011, 18:09:14 UTC |
50bac7f | Brendan W. McAdams | 07 December 2011, 18:03:21 UTC | Slight format tweak on docs | 07 December 2011, 18:03:21 UTC |
de27298 | Brendan W. McAdams | 07 December 2011, 18:02:12 UTC | HADOOP-2 - Support Splitting on Unsharded Clusters * Updated documentation | 07 December 2011, 18:02:12 UTC |
286cfed | Brendan W. McAdams | 07 December 2011, 17:29:49 UTC | Merge branch 'master' of https://github.com/mongodb/mongo-hadoop | 07 December 2011, 17:29:49 UTC |
fd479a6 | Brendan W. McAdams | 07 December 2011, 17:29:17 UTC | HADOOP-2 - Support Splitting on Unsharded Clusters * Done, fully works with customisation of several areas * See new fields in /mongo-defaults.xml for docs on each feature and how to use until more formal docs are written up. | 07 December 2011, 17:29:17 UTC |
f16d6be | Ian Whalen | 07 December 2011, 15:25:50 UTC | Updated link to issue tracking location | 07 December 2011, 15:25:50 UTC |
ebebbd5 | Brendan W. McAdams | 06 December 2011, 20:57:23 UTC | Global "default" example defaults doc for Mongo specific hadoop settings. | 06 December 2011, 20:57:23 UTC |
b0dca46 | Brendan W. McAdams | 06 December 2011, 20:40:34 UTC | HADOOP-12 - Allow user to choose the field fed into the Mapper as the "key" * Defaults to _id, but setting mongo.input.key will change the field fed in as your Mapper key | 06 December 2011, 20:40:34 UTC |
a9e889b | Brendan W. McAdams | 06 December 2011, 19:10:46 UTC | HADOOP-12 - Allow user to choose the field fed into the Mapper as the "key" * Config values for input_key | 06 December 2011, 19:10:46 UTC |
1570040 | Brendan W. McAdams | 04 December 2011, 15:13:20 UTC | New "UFO Sightings" example, useful for sharding testing (Just import it a bunch of times for dups) | 04 December 2011, 15:13:20 UTC |
1553bae | Brendan W. McAdams | 03 December 2011, 22:04:26 UTC | Cleaning up some logic and code flow for input splitting | 03 December 2011, 22:04:26 UTC |
2677455 | Brendan W. McAdams | 03 December 2011, 21:22:40 UTC | More corrections to config for Treasury | 03 December 2011, 21:22:40 UTC |
383a339 | Brendan W. McAdams | 03 December 2011, 20:15:21 UTC | Fix jar location | 03 December 2011, 20:15:21 UTC |
199a1b0 | Brendan W. McAdams | 03 December 2011, 20:09:19 UTC | Clean up & Correct README. | 03 December 2011, 20:09:19 UTC |
7baa7a4 | Brendan W. McAdams | 27 October 2011, 16:04:25 UTC | Merge pull request #19 from stackmob/master Support for format strings in mongo url path names | 27 October 2011, 16:04:25 UTC |
3b99ea2 | Ryan | 27 October 2011, 13:41:56 UTC | removed email address... peeps can contact me through github. | 27 October 2011, 13:41:56 UTC |
b37dea7 | Alex Yakushev | 31 August 2011, 00:50:31 UTC | ignore target directories | 31 August 2011, 00:50:31 UTC |
afb7d5b | Alex Yakushev | 31 August 2011, 00:43:11 UTC | pom fixes | 31 August 2011, 00:43:11 UTC |
b2eb081 | Alex Yakushev | 31 August 2011, 00:42:59 UTC | Bucketed MongoDB sink | 31 August 2011, 00:42:59 UTC |
0af08c0 | Brendan W. McAdams | 04 August 2011, 16:22:50 UTC | Merge pull request #14 from dcrosta/master Clean up Mappers for a more uniform interface like the Reducer ones. | 04 August 2011, 16:22:50 UTC |
dd63006 | Dan Crosta | 04 August 2011, 16:21:20 UTC | touch up docs | 04 August 2011, 16:21:20 UTC |
e1a038d | Dan Crosta | 04 August 2011, 16:13:30 UTC | add BSONMapper, KeyValueBSONMapper these classes wrap the input and output classes to allow writing generator pipeline functions for mappers | 04 August 2011, 16:13:30 UTC |
0d463cc | Brendan W. McAdams | 02 August 2011, 16:11:50 UTC | Pom files for examples | 02 August 2011, 16:11:50 UTC |
f61a48b | Brendan W. McAdams | 02 August 2011, 16:11:30 UTC | Rename slave ok, use shards and use chunks options to be more sane/normalized | 02 August 2011, 16:11:30 UTC |
c5a7c9f | Brendan W. McAdams | 02 August 2011, 15:45:34 UTC | Cleaning up packaging of example programs. | 02 August 2011, 15:45:34 UTC |
fd0fe8e | Brendan W. McAdams | 01 August 2011, 21:39:28 UTC | Ignore | 01 August 2011, 21:39:28 UTC |
0e68798 | Brendan W. McAdams | 01 August 2011, 21:31:23 UTC | Include dependencies and Main Class attribute in Streaming Jar; fix run demo scripts to execute against new build. | 01 August 2011, 21:31:23 UTC |
36e2c05 | Brendan W. McAdams | 01 August 2011, 21:06:40 UTC | Fix pathing of Flume | 01 August 2011, 21:06:40 UTC |
fac7381 | Brendan W. McAdams | 01 August 2011, 20:47:16 UTC | Fix README links for streaming | 01 August 2011, 20:47:16 UTC |
8bf469f | Brendan W. McAdams | 01 August 2011, 20:46:47 UTC | Fix links in readme | 01 August 2011, 20:46:47 UTC |
54007db | Brendan W. McAdams | 01 August 2011, 20:43:40 UTC | Refactor project into modules with Maven based profile build. Profile build allows for you to pick what Hadoop distribution you want to build for. | 01 August 2011, 20:43:40 UTC |
054f4a8 | Brendan W. McAdams | 01 August 2011, 18:16:44 UTC | Working demos of both "regular" and "key value" streaming python | 01 August 2011, 18:16:44 UTC |
3cd9d06 | Brendan W. McAdams | 29 July 2011, 19:26:19 UTC | Refactored InputSplit calculation for sharding into a utility class, based all Streaming code for "Classic MapReduce" code against latest "new" mapReduce entries and share one set of "Calculate Splits" code across both UIs. Also fixed code for reading configuration for sharding to conform with the rest of the project standard, encapsulated fully within MongoConfigUtil. | 29 July 2011, 19:26:19 UTC |
6c3e4f0 | Brendan W. McAdams | 29 July 2011, 18:29:50 UTC | Remove setting of cursor size as we ignore that for progress tracking. | 29 July 2011, 18:29:50 UTC |
69df2f6 | Brendan W. McAdams | 29 July 2011, 18:27:35 UTC | Removed ip_location example as it has a bunch of strange "Special" operators hardcoded into the Hadoop library just for its own usage. Should be readded using proper MongoDB syntax. At this time, we don't support in place updates of Data with the Hadoop driver and hardcoding "special" operators serves only to confuse users. There is an open TODO item to investigate how to support incremental map/reduce, but updating values in place is not an appropriate way to handle a MapReduce approach. | 29 July 2011, 18:27:35 UTC |
f2cf807 | Brendan W. McAdams | 29 July 2011, 18:15:28 UTC | <reformatting> | 29 July 2011, 18:15:28 UTC |
ba4010d | Brendan W. McAdams | 29 July 2011, 18:03:27 UTC | Move Constructor for MongoInputSplit to top of class where it belongs. | 29 July 2011, 18:03:27 UTC |
b2a2c2a | Brendan W. McAdams | 29 July 2011, 17:58:44 UTC | Reformat Examples to Project Standard. | 29 July 2011, 17:58:44 UTC |
e189109 | Brendan W. McAdams | 29 July 2011, 17:57:05 UTC | Cleanup and reformat code to project standard. | 29 July 2011, 17:57:05 UTC |
18f42b4 | Brendan W. McAdams | 29 July 2011, 17:44:26 UTC | Fixes #9, integrate performance and array sizing tweaks from @josephks to BSONWritable | 29 July 2011, 17:44:26 UTC |
d5a6734 | Brendan W. McAdams | 29 July 2011, 17:44:09 UTC | Refs #9, Logging Cleanups from @josephks | 29 July 2011, 17:44:09 UTC |
edc36f7 | Brendan W. McAdams | 29 July 2011, 17:25:47 UTC | Refs #9, merge small fixes to temp table deletion in WebLogAnalyzer example from @josephks | 29 July 2011, 17:25:47 UTC |