https://github.com/mongodb/mongo-hadoop

20208a0 Merge pull request #160 from mongodb/DRIVERS-2036 28 January 2022, 19:28:02 UTC
2d59cfa DRIVERS-2036: EOL Notice Adding EOL notice before archiving repo. 28 January 2022, 19:25:49 UTC
cdcd0f1 Beautifying markdown headers 04 April 2017, 12:28:21 UTC
d16e574 Support continuous integration in Evergreen 30 March 2017, 17:49:30 UTC
09b824b BUMP 2.0.2 27 January 2017, 17:12:19 UTC
ddb01ff Update History.md for 2.0.2 release. 27 January 2017, 17:10:03 UTC
acc6270 Set split key min and max on MongoInputSplits created with createSplitFromBounds. This reverts some of the changes made by a2f662fe7d66bc5c5a4b8fc7219eb64e76100c39. 27 January 2017, 06:58:36 UTC
977c9a0 HADOOP-303 - Use the correct projection when selecting a column that maps to an embedded field within a document. (#143) Normalize nested field names to lower case. 27 January 2017, 01:12:54 UTC
a2f662f HADOOP-304 - Add a test case to cover constructing MongoInputSplit. Allow min/max split keys to be set from the Configuration in MIS. 27 January 2017, 01:10:34 UTC
b618dc9 Support MongoDB skip when creating MongoInputSplit 27 January 2017, 00:28:48 UTC
a6b9cb7 BUMP 2.0.1 -> 2.0.2.dev 24 January 2017, 22:17:06 UTC
4507e57 BUMP 2.0.1 30 August 2016, 19:50:24 UTC
fd8529d Update History.md for 2.0.1 release. 30 August 2016, 19:50:24 UTC
df080df HADOOP-295 - MongoPaginatingSplitter should set the noTimeout option on its cursor. 30 August 2016, 17:25:02 UTC
f6f74f9 BUMP 2.0.0 15 August 2016, 17:59:28 UTC
c3c6585 BUMP 2.0.0-rc0 28 July 2016, 22:33:16 UTC
d6ebf36 Merge branch '2.0-dev' Conflicts: History.md README.md build.gradle core/src/test/java/com/mongodb/hadoop/testutils/BaseHadoopTest.java 28 July 2016, 22:32:14 UTC
9c354a9 Update History.md for 2.0.0-rc0. 28 July 2016, 22:00:54 UTC
6b16268 Remove temporary directories, in addition to temporary files (HADOOP-292). 28 July 2016, 22:00:34 UTC
fb83e03 Sleep to allow more time for jobtracker to start. 21 July 2016, 20:34:23 UTC
b8d77de Try to use a local mongos host/port on InputSplits produced by ShardChunkMongoSplitter (HADOOP-202). 12 July 2016, 20:38:20 UTC
feced9f Fix MongoSplitterFactoryTest run against a sharded cluster. 20 June 2016, 22:29:05 UTC
4868153 Appease checkstyle. 20 June 2016, 21:08:22 UTC
0ab0d10 Add Powerrr to the list of CONTRIBUTORS. 20 June 2016, 21:08:22 UTC
667f514 use query projection in MongoPaginatingSplitter 20 June 2016, 20:36:57 UTC
d8d63ac Merge pull request #141 from Powerrr/paginating-splitter-projection Use query projection in MongoPaginatingSplitter 20 June 2016, 18:56:40 UTC
33d1560 use query projection in MongoPaginatingSplitter 16 June 2016, 11:16:34 UTC
5b649a5 Add SampleSplitter (HADOOP-283). SampleSplitter creates InputSplits based on the output of the $sample aggregation operator. This is a very inexpensive way to create splits on unsharded MongoDB collections without requiring special privileges as the 'splitVector' command does. SampleSplitter requires MongoDB 3.2+. 15 June 2016, 16:15:05 UTC
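The sample-based strategy described above can be sketched in a few lines of Python: the randomly sampled keys become sorted boundaries, and adjacent boundaries become split ranges. The helper name is hypothetical; the actual SampleSplitter is Java and issues a real `$sample` aggregation.

```python
def splits_from_sampled_keys(sampled_keys):
    """Turn a random sample of _id values into (min, max) split bounds.

    Mirrors the idea behind SampleSplitter: the sorted sample keys become
    split boundaries, with open-ended first/last ranges (None = unbounded).
    Hypothetical sketch, not the actual Java implementation.
    """
    bounds = sorted(sampled_keys)
    edges = [None] + bounds + [None]
    return list(zip(edges[:-1], edges[1:]))

print(splits_from_sampled_keys([40, 10, 30]))
# → [(None, 10), (10, 30), (30, 40), (40, None)]
```

Because `$sample` only reads a handful of documents, this is far cheaper than scanning the collection, which is why the commit notes it needs no special privileges.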
08c45fc HADOOP-236 - Add two new classes to support making updates from Hadoop streaming jobs: - MongoUpdateInputWriter - MongoUpdateOutputReader To use these classes (and thus specify that a job is for making updates), set `-io mongoUpdate` when launching the Hadoop streaming job. 13 June 2016, 16:28:47 UTC
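A streaming job using these classes might be launched roughly as follows. Only the `-io mongoUpdate` flag comes from the commit above; the jar names, script names, and other flags are placeholders that vary by installation.

```shell
# Hypothetical invocation — adjust jar paths and mapper/reducer scripts.
hadoop jar $HADOOP_PREFIX/share/hadoop/tools/lib/hadoop-streaming-*.jar \
    -libjars mongo-hadoop-streaming.jar \
    -io mongoUpdate \
    -mapper mapper.py \
    -reducer reducer.py \
    -inputformat com.mongodb.hadoop.mapred.MongoInputFormat \
    -outputformat com.mongodb.hadoop.mapred.MongoOutputFormat
```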
ae06f2e Make one GridFSInputFormatTest Hadoop-1.2 compatible. 10 June 2016, 23:33:26 UTC
eabb01d Restore support for Hadoop 1.2.X (HADOOP-246). 10 June 2016, 23:17:27 UTC
72626ab Support for reading from GridFS via GridFSInputFormat and GridFSSplit (HADOOP-272). 03 June 2016, 22:52:20 UTC
52f671e Add CONTRIBUTORS.md 06 May 2016, 17:09:02 UTC
b407177 Merge pull request #138 from emanresusername/2.0-dev unorderedBulkOperation support (HADOOP-279) 06 May 2016, 16:58:19 UTC
a4bdaf8 unorderedBulkOperation support 06 May 2016, 02:45:51 UTC
6a662c5 Support document replacement (HADOOP-263). 04 May 2016, 17:42:25 UTC
e99542e Update History.md 13 April 2016, 00:38:38 UTC
469f6e4 Ensure that BSONPickler and custom constructors are registered on every Spark node (HADOOP-273). 13 April 2016, 00:38:07 UTC
a5f869c Fix Spark version in README (HADOOP-275). 13 April 2016, 00:38:07 UTC
dc50df8 Clarify installation instructions and version compatibility in the README (HADOOP-275). 13 April 2016, 00:38:07 UTC
045ed4c Allow datetime.datetime objects to be read and written from Spark (HADOOP-274). 13 April 2016, 00:38:07 UTC
130ba1b BUMP 1.5.2 28 March 2016, 19:39:42 UTC
9764990 Update History.md 28 March 2016, 19:39:37 UTC
22cd303 Ensure that BSONPickler and custom constructors are registered on every Spark node (HADOOP-273). 28 March 2016, 18:36:46 UTC
aecd367 Fix Spark version in README (HADOOP-275). 24 March 2016, 20:58:38 UTC
75038b9 Clarify installation instructions and version compatibility in the README (HADOOP-275). 24 March 2016, 18:23:01 UTC
883b3e0 Allow datetime.datetime objects to be read and written from Spark (HADOOP-274). 18 March 2016, 20:06:44 UTC
2a43478 BUMP 1.5.2-SNAPSHOT 18 March 2016, 20:05:57 UTC
54b5ace BUMP 2.0.0-SNAPSHOT 09 March 2016, 18:40:21 UTC
8f2699a BUMP 1.5.1 09 March 2016, 18:09:22 UTC
d6f84ad Update History.md 09 March 2016, 18:07:09 UTC
f241321 Close MongoClients in MongoRecordWriter and MongoOutputCommitter (HADOOP-265). 03 March 2016, 19:48:42 UTC
0c06361 Don't allow null templates to be passed to JSONPigReplace.replaceAll() (HADOOP-266). 03 March 2016, 19:32:19 UTC
3f3b09c Allow users to set the limit on MongoInputSplits (HADOOP-267). 29 February 2016, 18:43:26 UTC
3f57880 BUMP 2.0.0-SNAPSHOT 23 February 2016, 18:13:33 UTC
c10a614 BUMP 1.5.0 23 February 2016, 18:03:34 UTC
750e52a Fix parameter name in JavaDoc. 23 February 2016, 18:03:34 UTC
02a043a Return null early in getTypeForBSON if input is null (HADOOP-255). 17 February 2016, 20:42:38 UTC
c0e49c5 BUMP 1.5.0-rc1-SNAPSHOT. 01 February 2016, 21:43:24 UTC
9d693eb BUMP 1.5.0-rc0 01 February 2016, 21:10:55 UTC
07fddd6 Update History.md 01 February 2016, 21:10:51 UTC
4b841ea Fix a test that uses a cursor after it is closed. 29 January 2016, 18:57:37 UTC
9dd9fa5 Add UDFs that permit storing BSON from Pig and extracting timestamp information from ObjectIds (HADOOP-76). 28 January 2016, 21:00:21 UTC
352ad53 Fix some tests that were broken with MongoDB < 2.6. 28 January 2016, 20:53:28 UTC
37e9917 Update the project to use the latest versions of Hive, Hadoop, Pig, Spark, and the MongoDB Java Driver (HADOOP-250). 27 January 2016, 22:52:03 UTC
9a46b62 Be able to infer FileSystem implementation from URI (HADOOP-253). 27 January 2016, 19:16:21 UTC
92a923f Close only thread-local clients with MongoConfigUtil.close() (HADOOP-243). 26 January 2016, 23:05:47 UTC
7758903 Create option 'mongo.input.splits.combine' for combining splits. Add MongoPaginatingSplitter (HADOOP-83). 22 January 2016, 19:15:59 UTC
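The pagination idea behind MongoPaginatingSplitter can be sketched as: walk the keys in query order and cut a new split every N documents. This is a hypothetical Python illustration of the bound calculation, not the Java implementation.

```python
def paginate_bounds(sorted_keys, docs_per_split):
    """Emit (lower, upper) split bounds every `docs_per_split` documents.

    None means unbounded on that side, so the first split has no lower
    bound and the last split has no upper bound. Sketch only.
    """
    bounds = []
    lower = None
    for i in range(docs_per_split, len(sorted_keys), docs_per_split):
        upper = sorted_keys[i]
        bounds.append((lower, upper))
        lower = upper
    bounds.append((lower, None))
    return bounds

print(paginate_bounds(list(range(10)), 4))
# keys 0..9, 4 docs per split → [(None, 4), (4, 8), (8, None)]
```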
4271291 Update HADOOP_HOME to HADOOP_PREFIX in README. 08 January 2016, 18:07:17 UTC
96ac8e5 Amend a few minor details about the Enron emails Spark example: - Include output from "core" project in example fat jar. - Use accessor methods for getting tuple data in Java. - Some minor cleanup of comments/style to appease checkstyle. 07 December 2015, 22:18:25 UTC
53467ad Formatting corrections for compliance with style guide 03 December 2015, 22:05:56 UTC
9ef021f New Spark examples, including Dataframes & SparkSQL, using Enron email dataset. 03 December 2015, 21:26:33 UTC
9905150 Add Mariano Semelman to the list of contributors in README.md. 16 November 2015, 22:05:26 UTC
1ff5085 Run the 'splitVector' command on the same database from which we want the splits (HADOOP-238). 16 November 2015, 22:05:18 UTC
9d38854 HADOOP-242 - Support compression in mapred package. 09 November 2015, 18:42:08 UTC
3aa106d Use full path to 'hdfs' binary in enronEmails task. 04 November 2015, 21:18:17 UTC
2b7a0d7 Update pymongo-spark's README to reflect that mongo-hadoop-spark.jar has not yet been released. 04 November 2015, 16:31:04 UTC
103574a Don't use $eq operator in tests, since it's not available in older server versions. 30 October 2015, 17:56:14 UTC
f5d51eb Allow projections to be pushed down to MongoDB from Pig (HADOOP-167). 22 October 2015, 17:18:38 UTC
0dcf8ad Don't need to build a tests jar for the spark subproject in order to run the tests. 22 October 2015, 17:04:38 UTC
9d7516a Throw a RuntimeException if MongoRecordWriter cannot open an OutputStream (HADOOP-235). 16 October 2015, 17:56:27 UTC
6cdb43f Fix a broken link in the README. 07 October 2015, 22:32:38 UTC
affad1b Add support for PySpark (HADOOP-187). This adds a "spark" module to the project, which compiles into "mongo-hadoop-spark.jar". This jar is currently only necessary if you want to use PySpark with mongo-hadoop. Additionally, this adds the pymongo_spark module, which provides the necessary objects and methods to use mongo-hadoop and PySpark together on the Python-side. 07 October 2015, 22:30:13 UTC
5642d65 Support compressed BSON files in BSONFileInputFormat (HADOOP-71). BSONFileInputFormat can now read files compressed with any of the codecs included with Hadoop. Additionally, BSONSplitter can be run as an executable program to split, compress, and upload BSON files to HDFS, or any other file system supported by Hadoop. 07 October 2015, 20:13:50 UTC
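What makes splitting raw .bson files feasible is that every BSON document begins with a little-endian int32 giving its own total length (the prefix and the trailing 0x00 byte included), so a splitter can walk document boundaries without decoding any fields. A minimal Python sketch of that scan, not the actual BSONSplitter code:

```python
import struct

def iter_bson_documents(data: bytes):
    """Yield raw BSON documents from a concatenated .bson byte stream.

    Relies only on the int32 length prefix at the start of each document;
    no field decoding is needed. Sketch only.
    """
    offset = 0
    while offset < len(data):
        (length,) = struct.unpack_from("<i", data, offset)
        yield data[offset:offset + length]
        offset += length

empty_doc = b"\x05\x00\x00\x00\x00"                          # {}: length prefix + terminator
int_doc = b"\x0c\x00\x00\x00\x10a\x00\x01\x00\x00\x00\x00"   # {"a": 1}
assert list(iter_bson_documents(empty_doc + int_doc)) == [empty_doc, int_doc]
```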
75c6e53 Push down some projections and queries from Hive to MongoDB (HADOOP-90). Any binary operator supported by IndexPredicateAnalyzer can be part of a pushdown predicate to MongoDB. Other operators are currently unsupported, so more advanced filtering is done Hadoop-side. 07 October 2015, 20:03:31 UTC
fdc37e5 Specify options in a "properties" file which is read by MongoStorageHandler (HADOOP-216). Set the path to the properties file with the "mongo.properties.path" table property in Hive. 07 October 2015, 18:25:03 UTC
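Based on the commit above, usage would look roughly like this; the storage-handler class and `mongo.properties.path` come from the project, while the example table, file path, and the properties-file contents are illustrative.

```sql
-- Hive table whose MongoDB options live in an external properties file
-- (e.g. /etc/hive/mongo.properties containing a line such as
--  mongo.uri=mongodb://localhost:27017/test.users — illustrative).
CREATE TABLE users (name STRING)
STORED BY 'com.mongodb.hadoop.hive.MongoStorageHandler'
TBLPROPERTIES ('mongo.properties.path' = '/etc/hive/mongo.properties');
```

Keeping options in a file avoids repeating connection details, including credentials, in every DDL statement.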
e4df9eb BUMP 1.5.0-SNAPSHOT 30 September 2015, 00:38:40 UTC
2214f05 BUMP 1.4.1 30 September 2015, 00:20:59 UTC
9000ba0 Update documentation links in build.gradle. 30 September 2015, 00:20:59 UTC
ec1ff3d Update History.md. 30 September 2015, 00:18:19 UTC
9c84080 Fix test that parses Strings into Timestamps (HADOOP-226). 29 September 2015, 22:44:41 UTC
766922b Use zero-args constructors for MongoOutputCommitter, so that they can be provided to -D mapred.output.committer.class=XXX (HADOOP-231) 29 September 2015, 18:57:03 UTC
3dafe6e Appease checkstyle. 15 September 2015, 00:27:40 UTC
f9724fa Fix some merge artifacts related to merge of PR #131 (add min/max parameters to splitVector command). 15 September 2015, 00:22:18 UTC
b377dbd Clean up changes and simplify checks 15 September 2015, 00:22:18 UTC
da2bfb9 Fixed confusion with expected BSONObject 15 September 2015, 00:22:18 UTC
7be32b3 support for min/max parameters for splitter. 15 September 2015, 00:22:18 UTC
0537b44 remove old, non-existent repository checkstyle fixes 09 September 2015, 17:23:31 UTC
bd167a7 Convert Strings that represent Hive Timestamps to Timestamp automatically, if the schema requires (HADOOP-226). 31 August 2015, 17:03:48 UTC
5338040 Do not log the full MongoDB connection string, so that credentials cannot show up in Hadoop logs (HADOOP-219). 14 August 2015, 16:41:26 UTC