https://github.com/voldemort/voldemort

d780f55 Releasing Voldemort 1.10.0 19 October 2015, 19:58:17 UTC
f7303df Protection against data corruption. If index or data files are somehow corrupted, or mismatched (meaning that the index file from one store/version gets used in combination with a data file for another store/version), then the server gets into bad buffer allocation problems, causing IllegalArgumentExceptions and potentially OOMing in the process. This commit makes the issues easier to debug and protects against the OOMs. Changes include:
- Disallow allocation of negative-size buffers.
- Disallow allocation of excessively large value buffers (configurable, max 25 MB by default).
- Disallow allocation of a buffer which would go past the data file's length.
- Server-side logs now print out the file name and key (in hex) which triggered the problem.
- Server-side logs now print out the socket name and stacktrace when SelectorManagerWorker catches a Throwable.
- Client-side will report an "internal server error" when any of the bad allocation scenarios described above happen.
- Slight optimization in the allocation strategy while reading the keys/values in a data file. 19 October 2015, 19:23:46 UTC
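The allocation guards listed in this commit can be sketched roughly as follows. This is a minimal illustration with invented names (BufferGuard, allocate); the actual Voldemort read-only store code is structured differently.

```java
// Hypothetical sketch of the kind of guard described in the commit above;
// class and method names are illustrative, not Voldemort's actual API.
public class BufferGuard {
    // Configurable cap, mirroring the commit's 25 MB default.
    static final int MAX_VALUE_SIZE = 25 * 1024 * 1024;

    /**
     * Validates a value size read from an index/data file before allocating
     * a buffer for it. Rejects negative sizes, sizes over the cap, and sizes
     * that would run past the end of the data file, so that a corrupted or
     * mismatched file produces a clear error instead of an OOM.
     */
    static byte[] allocate(int valueSize, long position, long fileLength) {
        if (valueSize < 0) {
            throw new IllegalStateException("Negative value size: " + valueSize);
        }
        if (valueSize > MAX_VALUE_SIZE) {
            throw new IllegalStateException("Value size " + valueSize
                    + " exceeds configured max of " + MAX_VALUE_SIZE);
        }
        if (position + valueSize > fileLength) {
            throw new IllegalStateException("Value would extend past end of data file");
        }
        return new byte[valueSize];
    }
}
```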
d4e27d9 Test for the ChunkedFileSet collision case. 17 October 2015, 02:02:22 UTC
8b7b4cc Make some modifications to how the benchmarking tool works in local mode, as part of an evaluation of the RocksDB storage engine:
1) Ensure that all the requested warm-up records are created, to eliminate some random errors.
2) Modify the mixed operation in local mode to increment the VectorClock, to prevent all iterations from failing due to ObsoleteVersion exceptions.
3) Modify the mixed operation to perform the write even if the read returns nothing, rather than silently passing without performing a write (I considered making this an error, but preferred this approach).
4) Add a new Warning count, similar to the ReturnCode count, to report how often a mixed operation reads nothing (could be used for other situations).
5) Move the ReturnCode report outside the summaryOnly check so that the error counts are always reported.
6) Make it possible to configure the storage engine in local mode.
7) Make several of the RocksDB parameters configurable. 16 October 2015, 02:08:09 UTC
13d055a Releasing Voldemort 1.9.22 08 October 2015, 18:04:02 UTC
d708069 fix tests 06 October 2015, 23:31:44 UTC
26d9582 Use IOUtils.closeQuietly() to close AdminClient - Add more tests for disk-quota - Reorganize HDFSFetcher code 05 October 2015, 22:42:53 UTC
b21a46a Fix default quota value for new stores
- Make the default quota value for new stores created via BnP configurable.
- A default quota value of -1 indicates no quota restriction.
- Fix unit tests.
- Fix bug: the delete-quota call for setQuotaForNode() now deletes the quota from the quota store instead of the actual store name. 03 October 2015, 00:14:26 UTC
831b809 BnP HA debuggability improvements. 02 October 2015, 14:59:30 UTC
985de33 Add heartbeat for NIO Selector. On the Voldemort server, when there is a disk crash the selector gets hung. The acceptor still keeps accepting sockets, and after a while the file descriptor limit is reached and everything is in a hung state, with no JVM stack or heap dump. Now a selector heartbeat is added, and if a selector is past the max heartbeat time (defaults to 3 minutes) the acceptor will stop assigning connections to it. If all selectors are unhealthy (past the max heartbeat time), the acceptor will close the socket so that the client can recover from these errors faster. 02 October 2015, 01:52:56 UTC
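The heartbeat mechanism described above can be sketched like this. All names here are illustrative; the real SelectorManager/Acceptor classes differ. Each selector thread stamps a timestamp on every loop iteration, and the acceptor skips selectors whose stamp is stale.

```java
// Illustrative sketch of a selector heartbeat check; not Voldemort's
// actual SelectorManager code.
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

public class SelectorHealth {
    // Mirrors the commit's 3-minute default.
    static final long MAX_HEARTBEAT_LAG_MS = 3 * 60 * 1000;

    final AtomicLong lastHeartbeatMs = new AtomicLong(System.currentTimeMillis());

    // Called by the selector thread at the top of each select loop.
    void beat() {
        lastHeartbeatMs.set(System.currentTimeMillis());
    }

    // A hung selector stops calling beat(), so its stamp goes stale.
    boolean isHealthy(long nowMs) {
        return nowMs - lastHeartbeatMs.get() <= MAX_HEARTBEAT_LAG_MS;
    }

    // Acceptor side: pick the first healthy selector, or null if all are
    // hung, in which case the caller closes the socket so the client
    // fails fast instead of waiting on a dead server.
    static SelectorHealth pickHealthy(List<SelectorHealth> selectors, long nowMs) {
        for (SelectorHealth s : selectors) {
            if (s.isHealthy(nowMs)) {
                return s;
            }
        }
        return null;
    }
}
```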
4cc1c14 Merge pull request #309 from jalkjaer/master add additional keepalive settings to avoid client connection leaks 01 October 2015, 05:57:42 UTC
dc50968 Merge pull request #315 from FelixGV/hadoop_utils_clean_ups Trimmed the fat in HadoopUtils. 29 September 2015, 19:23:21 UTC
810129c Trimmed the fat in HadoopUtils. Half the functions were not used anywhere, and it's doubtful we would ever need them in the future. 28 September 2015, 23:19:09 UTC
6050ab9 Remove unnecessary special handling of QuotaExceededExceptions
- Add a new exception type for invalid stores.
- Add a log message when adding quota.
- Minor fix: set quota to 0 only if the store definition needs to be created.
- Minor fix: change the exception type to VoldemortException instead of QuotaExceededException. 25 September 2015, 22:36:11 UTC
2f72ffb Disk Quota support, built up over several iterations:
Iteration 1: Add exception communication from server to client.
Iteration 2: Random function that generates QuotaException, for testing the end-to-end exception flow. Later this will be replaced by appropriate quota checks.
Iteration 3: Compute the bytes written to index and data files in each reducer and add this information to the filename. Estimate the total disk size in bytes needed for a <node, store> pair by iterating through all the index and data files and getting their sizes from the file names, after the job is run.
Iteration 4: In HDFSFetcher, fetch the estimated disk size from the metadata file, and fetch the disk quota limit for the <node, store> pair using the admin client.
Iteration 5: Fix the disk estimation logic. Instead of renaming the .data and .index files, add the file size info to the corresponding '.checksum' files. Introduce a new CheckSumMetadata file that holds the following fields: checksum, data file size, index file size. CheckSumMetadata is similar to the ReadOnlyStorageMetadata file in terms of serialization and deserialization.
Quota checks: Determine if quota needs to be checked for the incoming push; quota is not checked for pre-existing stores (they will be quota-ed in the future). Check if there is sufficient quota left for a new push. Filter out stores that push data mistakenly to the Voldemort cluster; this is based on the assumption that invalid stores get a store definition created with 0 quota during the build phase, while all other stores onboarded through Nuage will have a proper quota set already.
Other fixes: Fix non-quota exception handling and propagate the actual error message to the client side. Change getQuota() to getQuotaForNode() in HDFSFetcher. Handle QuotaException in AdminStoreSwapper. Set zero quota for stores that are created during Build and Push, since we want all new stores to be pre-created with an appropriate quota via Nuage or via the admin client. Exclude an unnecessary "_" in file names. Add debug log messages and clean up debug logs. 25 September 2015, 22:36:11 UTC
0af020f Proper interruption of BnP hooks. 25 September 2015, 15:46:35 UTC
4622df9 Releasing Voldemort 1.9.21 24 September 2015, 23:19:32 UTC
26e1f48 New server config to determine HDFS login interval. Default: fetcher.login.interval.ms=-1 (re-login every time) 24 September 2015, 21:59:21 UTC
df73cd4 added settings nio.connector.keepalive and nio.admin.connector.keepalive for inbound NIO Connections 22 September 2015, 20:29:43 UTC
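In NIO terms, honoring a keepalive setting on accepted connections amounts to setting SO_KEEPALIVE on each channel. A minimal sketch; the config property names come from the commit above, but the surrounding class is invented for illustration.

```java
// Illustrative sketch: applying a keepalive config to an accepted NIO
// channel. The property names (nio.connector.keepalive,
// nio.admin.connector.keepalive) are from the commit; KeepaliveConfig
// itself is hypothetical.
import java.io.IOException;
import java.net.StandardSocketOptions;
import java.nio.channels.SocketChannel;

public class KeepaliveConfig {
    static void configure(SocketChannel channel, boolean keepalive) throws IOException {
        // With keepalive enabled, the OS periodically probes idle peers and
        // eventually closes dead connections, so sockets from vanished
        // clients are reclaimed instead of leaking.
        channel.setOption(StandardSocketOptions.SO_KEEPALIVE, keepalive);
    }
}
```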
711cc40 Releasing Voldemort 1.9.20 15 September 2015, 01:40:58 UTC
ebbc44f Fix the RocksDB iteration logic for keys and entries. The problems addressed include: 1) Key and entry iterators failed to strip the prefix from keys (if required). 2) RocksdbStorageEngineTest failed to test BdbStorageEngine (all tests). 3) Both key iterators would miss the first entry and then iterate off the end. With these changes I was able to remove the Ignore annotation from 2 tests. 15 September 2015, 00:18:45 UTC
2738b7c Fixed HdfsFetcher test code and trimmed some fat. 14 September 2015, 22:54:40 UTC
2aec46a Avoid duplicate JMX registration for Store and Pipeline stats. The current code creates duplicate JMX beans for storeClient and pipeline stats when the getStoreClient method is called more than once. This fix avoids the duplicate JMX registration by caching the first creation and registering only once. The read-only storage engine registers the last swap with the node id in the name; this creates multiple counters, which clutters the log and makes the last swap time across multiple nodes difficult to visualize. 14 September 2015, 18:35:28 UTC
7975559 BnP HA improvements. - HdfsFailedFetchLock now happens on the server-side. - BnP no longer relies on any specific node (the node.id config parameter is eliminated). - Moved some authentication-related code out of HdfsFetcher and into HadoopUtils. 14 September 2015, 04:22:10 UTC
5c3d158 Graceful recovery from incomplete BnP store creation. 08 September 2015, 22:33:46 UTC
8563062 Added long retries to ServerTestUtils.startVoldemortServer It will now retry for up to 120 attempts, and up to 5 minutes before giving up. This is a workaround for the flaky non-deterministic issues we see in a lot of tests. 02 September 2015, 17:33:25 UTC
4b0fc58 Marked some RocksDB tests with @Ignore. The corresponding functionality is not implemented, so these failures are expected. 31 August 2015, 20:58:35 UTC
54164ff Changed routing strategy for system stores to "all-routing". Previously, the metadata_version_persistence and store_quotas stores were using the "local-pref-all-routing" strategy, which leads to nondeterministic behavior. 28 August 2015, 21:23:54 UTC
4280c14 Fixed broken test and marked HTTP Service as deprecated. 25 August 2015, 00:37:19 UTC
9e06265 Added warning messages on the stream admin commands. These commands are not considered production-ready. They are intended for debugging purposes. Also cleaned up a bit of dead code in admin commands. 24 August 2015, 23:22:16 UTC
8997512 Add a 'binary' format to the fetch-entries admin command so that it will create a file compatible with the update-entries command. For consistency's sake, add this same format to the fetch-keys command. 24 August 2015, 22:38:28 UTC
ef29e5d Forklift corrupts the data on schema mismatch
1) If the source and destination schemas do not match, forklift currently corrupts the destination data by streaming in bytes from the source. After this commit, the forklift will fail when a schema mismatch is detected. The old behavior, if required, can be achieved via the undocumented parameter ignore-schema-mismatch.
2) The default of forklift, which forklifts all stores, is changed to fail when a store name is not specified. I can't imagine a situation where you often want to forklift everything from one cluster to another. If an admin forgets to specify this parameter, they are forklifting the entire cluster, which is definitely not the intended default.
3) Added 5 unit tests (3 for key mismatch and 2 for value mismatch).
4) Added pretty-print functions to Compression and SerializerDefinition. 24 August 2015, 22:35:59 UTC
fce83e7 Make Avro utf8 and bytes readable in shell output
1) The fetch-keys and fetch-entries streaming output for utf8 and bytes is not human readable. This is a problem if you want to sample and read the data using the shell.
2) voldemort-shell.sh does not output Avro bytes in a readable format. The previous output was some internal state and did not convey what the output actually was. 24 August 2015, 22:35:59 UTC
eec454a Improving debuggability of read-only fetches.
- All SchedulerService threads now have a unique name, instead of all being called "java.util.concurrent.ThreadPoolExecutor$Worker".
- AsyncOperation instances will now override their current thread's name to provide even more detail about what's running, and then restore the original thread name.
- The fetcher loop in the server will report slightly more useful info to the BnP job via the AsyncOperation's status message.
- The fetcher loop in the server will also print local logs which are more similar to the BnP-side logs, for easier log correlation.
- HdfsCopyStats now flushes lines as they happen, in order to improve debuggability of stalled or abruptly interrupted fetches. 24 August 2015, 22:27:48 UTC
fd345f7 Releasing Voldemort 1.9.19 15 August 2015, 00:20:02 UTC
07f0f72 Removed MemLock class and related configs. It turns out our mlock call always failed and we've been running fine without it anyway, so there's no point in keeping that code around. 14 August 2015, 22:33:10 UTC
9e5fe19 Cleanup formatting of system store schema. 13 August 2015, 14:45:24 UTC
a6294ce Revert "Clean Set Quota Fix" This reverts commit 3ff1cbe47a2206834c2cd6b77878dcccb3f0be67. The unit tests are broken. On a deeper look, the file-backed caching storage engine does not have a version per key, but per file. Any update to the file with a lower version will be rejected, so we have to go with super clocks for now. I am reverting the commit to unblock the release of a new version. 12 August 2015, 22:27:04 UTC
fa0acf6 Ensure that all the AbstractStorageEngineTest tests get run by all the subclasses. Add the @Test annotation to several tests. Without that annotation, it appears that the parameterized subclasses do not run these specific test cases. This is most likely a Gradle issue, perhaps somewhat related to: https://issues.gradle.org/browse/GRADLE-3112 12 August 2015, 19:03:44 UTC
da0ec6d Removed HadoopStoreJobRunner and related scripts. This is to avoid confusion. There is no point in running those scripts. - For building and pushing a read-only store, VoldemortBuildAndPushJobRunner and ./bin/run-bnp.sh should be used. If someone wants to just build without pushing, that can be done by passing push=false to the BnP config. - For swapping a store version, it can be done via the vadmin.sh script. 12 August 2015, 17:54:11 UTC
00ad250 Changes to @elad's shadowJar build, to ensure all dependencies are bundled up. 12 August 2015, 17:54:11 UTC
a4bc78c Minimal config for read-only store, with Hadoop (BnP) hints. 12 August 2015, 17:54:11 UTC
3add1aa Print number of partitions, useful for debugging. 12 August 2015, 17:54:11 UTC
e5eee78 Add BnP job and script from @FelixGV with slight changes. 12 August 2015, 17:54:11 UTC
c974058 Make this work on Hadoop 2.x by shading a few dependencies. Shade avro (to 1.4.0) and protobuf (to 2.3.0) and always include jdom (1.1). Mostly from #274, also see #284. 12 August 2015, 17:54:11 UTC
e3113e3 Addressing stylistic comment. 12 August 2015, 17:41:35 UTC
98cd411 Merge pull request #296 from arunthirupathi/setQuota Clean Set Quota Fix 12 August 2015, 01:40:33 UTC
8c068ac Allow RO servers to run without Kerberos enabled. 12 August 2015, 00:46:46 UTC
3ff1cbe Clean Set Quota Fix Clean up the Set quota to not generate super clocks. Wait for all the nodes to complete, before returning success. 11 August 2015, 19:35:49 UTC
2a0c81e Remove unused members and methods from MemLock public setFile() is unused, and also dangerous because there is no point in the lifetime of the object when it can have a useful effect. The descriptor member is not needed after the ctor. 11 August 2015, 18:30:03 UTC
dc91e04 Factor out some common code in ChunkedFileSet into a new mapAndRememberIndexFile() method. 11 August 2015, 18:30:03 UTC
faa2ca0 Remove unused mapFile() method in ChunkedFileSet. 11 August 2015, 18:30:03 UTC
8647f82 Simplify MappedFileReader, close Unix fd early Remove most of the members and make them local variables in the map() function which is the only place that uses them. This simplifies the c'tor and means it cannot throw IOException anymore. It also means we close the Unix file descriptor used to create the mapping much earlier. This will help reduce the load on file descriptors. 11 August 2015, 18:30:03 UTC
1eafb88 Remove unused fields from MappedFileReader - fadvise was an unused leftover from previous code - offset was used but its value was always zero, so stop pretending otherwise 11 August 2015, 18:30:03 UTC
a4fe60a Merge BaseMappedFile into its only subclass. It had precisely one subclass, did not define any semantically meaningful behavior, and nobody used the base class. It will not be missed. 11 August 2015, 18:30:03 UTC
4b74809 Report mmap/munmap errors using IOException which the signature claims can be thrown but never was. Ensure that the only callchain (MappedFileReader -> MemLock -> mman) will actually handle and log IOException correctly. Also, stop claiming that mlock() and munlock() throw IOException. They never did and it would not be helpful, in fact we want to ignore all errors from those functions because their normal behavior in unprivileged processes is to fail. 11 August 2015, 18:30:03 UTC
9fb2689 Remove all traces of MAP_ALIGN which doesn't exist on Linux (the binary value we were passing into the kernel was some other harmless option) and even if it did exist it has no effect because we're hardcoding the alignment to 0 which means "let the kernel choose" i.e. the default behavior. 11 August 2015, 18:30:03 UTC
be018e3 Remove all traces of MAP_LOCKED which we didn't actually use except in some stale comments, and which is not implemented on any current OS anyway. 11 August 2015, 18:30:03 UTC
9202548 Merge pull request #295 from FelixGV/fixes_to_DataCleanupJobTest Bump EventThrotller window + fixes to data cleanup job test 11 August 2015, 06:22:10 UTC
d25d169 Upgrade to Hadoop 2.3.0-cdh5.1.5 and Kerberos clean up. 11 August 2015, 05:13:55 UTC
9a48c7e Added STORAGE_SPACE quota and cleaned up some vadmin stuff. 11 August 2015, 04:40:56 UTC
e65b0e0 Fixed DataCleanupJobTest. 11 August 2015, 03:03:05 UTC
3c6be4b Bumped up EventThrottler's default window time to 1000 ms. 11 August 2015, 03:02:49 UTC
8d6f06a Remove sleep after deprecated warning 07 August 2015, 00:11:19 UTC
df46fdc Merge pull request #288 from FelixGV/remove_scala_and_ec2_testing Remove scala, ec2 testing and public-lib directory 29 July 2015, 19:28:58 UTC
b38aabe Removed public-lib as it was only used by the deprecated ant build. 29 July 2015, 18:46:01 UTC
3bfc934 Deleted unused cruft (scala shell and ec2-testing contrib). 29 July 2015, 18:39:54 UTC
105d1ed Removed a duplicate log in AdminClient.waitForCompletion 27 July 2015, 17:33:45 UTC
dcbd8c9 Improved BnP logging. 23 July 2015, 20:35:15 UTC
3a935aa Releasing Voldemort 1.9.18 20 July 2015, 22:54:37 UTC
724596d Voldemort BnP pushes to all colos in parallel. Also contains many logging improvements to discriminate between hosts and clusters. 20 July 2015, 19:43:28 UTC
9c61ada Rewrite of the EventThrottler code to use Tehuti. - Makes throttling less vulnerable to spiky traffic sneaking in "between the interval". - Also fixes throttling for the HdfsFetcher when compression is enabled. 16 July 2015, 01:18:31 UTC
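A minimal sketch of windowed rate throttling, the general idea behind an event throttler. Voldemort's actual implementation delegates rate measurement to the Tehuti metrics library; this standalone sketch, with invented names, only illustrates the shape of the approach.

```java
// Hypothetical sketch of bytes-per-second throttling over a time window;
// not the Tehuti-based EventThrottler from the commit above.
public class SimpleThrottler {
    final long maxBytesPerSec;
    long windowStartMs;
    long bytesInWindow;

    SimpleThrottler(long maxBytesPerSec) {
        this.maxBytesPerSec = maxBytesPerSec;
        this.windowStartMs = System.currentTimeMillis();
    }

    /** Records bytes transferred and sleeps if the window's rate is exceeded. */
    void maybeThrottle(int bytes) throws InterruptedException {
        bytesInWindow += bytes;
        long elapsedMs = Math.max(1, System.currentTimeMillis() - windowStartMs);
        long allowed = maxBytesPerSec * elapsedMs / 1000;
        if (bytesInWindow > allowed) {
            // Sleep just long enough for the allowance to catch up.
            Thread.sleep((bytesInWindow - allowed) * 1000 / maxBytesPerSec);
        }
        // Reset the window periodically so old history doesn't mask bursts;
        // a spike "between the intervals" of a naive fixed-interval scheme
        // is what a sliding/windowed measurement is meant to catch.
        if (elapsedMs >= 1000) {
            windowStartMs = System.currentTimeMillis();
            bytesInWindow = 0;
        }
    }
}
```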
82f80b6 Fix the whitespace changes. The previous refactor was done from my MacBook, which did not replace tabs with spaces; this messed up a lot of the editing. Instead of re-doing the change with spaces, I just formatted the code, which is easier and requires no re-verification. You can review the commit by adding ?w=1 to the GitHub URL, or use git diff -w on the command line, to ignore the whitespace; there are not many changes. 10 July 2015, 20:25:23 UTC
168cb69 Pass in additional parameters to fetch
1) Currently the AsyncOperationStatus is set for the HdfsFetcher; if 2 or more fetches are going on, this would produce erroneous results.
2) Add StoreName, Version and Metadatastore for use in future fetches.
3) Enabled the Hadoop* tests; I don't know why they were not run in the ant tests. When I ported them, for parity reasons I disabled them too, but I am now enabling them as the tests seem valid.
4) Made the fetch throw IOException instead of Throwable, which catches more than is intended and seems less reliable. 01 July 2015, 06:41:08 UTC
d70ed85 Refactor the file fetcher to a Strategy interface and class. In the future this lets you modify the file fetching strategy, for example having BuildAndPush build only one copy per partition and chunk, with the fetcher fetching them under different names. There is no logic change; the code is just refactored. 01 July 2015, 06:41:08 UTC
6a42f59 Improved path handling and validation in VoldemortSwapJob 01 July 2015, 01:04:12 UTC
aa51b0b Merge pull request #271 from dallasmarlow/coordinator-class Thanks for the fix, @dallasmarlow. Update coordinator class name in server script 30 June 2015, 21:47:13 UTC
a45fc83 Fixed voldemort.cluster.ClusterTest 30 June 2015, 21:06:29 UTC
c2db8fd First-cut implementation of Build and Push High Availability. This commit introduces a limited form of HA for BnP. The new functionality is disabled by default and can be enabled via the following server-side configurations, all of which are necessary:
push.ha.enabled=true
push.ha.cluster.id=<some arbitrary name which is unique per physical cluster>
push.ha.lock.path=<some arbitrary HDFS path used for shared state>
push.ha.lock.implementation=voldemort.store.readonly.swapper.HdfsFailedFetchLock
push.ha.max.node.failure=1
The Build and Push job will interrogate each cluster it pushes to and honor each cluster's individual settings (i.e. one can enable HA on one cluster at a time, if desired). However, even if the server settings enable HA, this should be considered best-effort behavior, since some BnP users may be running older versions of BnP which will not honor HA settings. Furthermore, up-to-date BnP users can also set the following config to disable HA, regardless of server-side settings: push.ha.enabled=false
Below is a description of the behavior of BnP HA, when enabled. When a Voldemort server fails to do some fetch(es), the BnP job attempts to acquire a lock by moving a file into a shared directory in HDFS. Once the lock is acquired, it will check the state in HDFS to see if any nodes have already been marked as disabled by other BnP jobs. It then determines if the Voldemort node(s) which failed the current BnP job would bring the total number of unique failed nodes above the configured maximum, with the following outcome in each case:
- If the total number of failed nodes is equal to or lower than the max allowed, then metadata is added to HDFS to mark the store/version currently being pushed as disabled on the problematic node. Afterwards, if the Voldemort server that failed the fetch is still online, it will be asked to go into offline mode (this is best effort, as the server could be down). Finally, BnP proceeds with swapping the new data set version on, as if all nodes had fetched properly.
- If, on the other hand, the total number of unique failed nodes is above the configured max, then the BnP job will fail, and the nodes that succeeded the fetch will be asked to delete the new data, just like before.
In either case, BnP will then release the shared lock by moving the lock file outside of the lock directory, so that other BnP instances can go through the same process one at a time, in a globally coordinated (mutually exclusive) fashion. All HA-related HDFS operations are retried every 10 seconds, up to 90 times (thus for a total of 15 minutes). These are configurable in the BnP job via push.ha.lock.hdfs.timeout and push.ha.lock.hdfs.retries respectively.
When a Voldemort server is in offline mode, in order for BnP to continue working properly, the BnP jobs must be configured so that push.cluster points to the admin port, not the socket port. Configured in this way, transient HDFS issues may lead to the Voldemort server being put in offline mode, but wouldn't prevent future pushes from populating the newer data organically.
External systems can be notified of the occurrences of the BnP HA code getting triggered via two new BuildAndPushStatus values passed to the custom BuildAndPushHooks registered with the job: SWAPPED (when things work normally) and SWAPPED_WITH_FAILURES (when a swap occurred despite some failed Voldemort node(s)). BnP jobs that failed because the maximum number of failed Voldemort nodes would have been exceeded still fail normally and trigger the FAILED hook.
Future work:
- Auto-recovery: Transitioning the server from offline to online mode, as well as cleaning up the shared metadata in HDFS, is not handled automatically as part of this commit (which is the main reason why BnP HA should not be enabled by default). The recovery process currently needs to be handled manually, though it could be automated (at least for the common cases) as part of future work.
- Support non-HDFS-based locking mechanisms: the HdfsFailedFetchLock is an implementation of a new FailedFetchLock interface, which can serve as the basis for other distributed state/locking mechanisms (such as Zookeeper, or a native Voldemort-based solution).
Unrelated minor fixes and clean-ups included in this commit:
- Cleaned up some dead code.
- Cleaned up abusive admin client instantiations in BnP.
- Cleaned up the closing of resources at the end of the BnP job.
- Fixed an NPE in the ReadOnlyStorageEngine.
- Fixed a broken sanity check in Cluster.getNumberOfTags().
- Improved some server-side logging statements.
- Fixed the exception type thrown in ConfigurationStorageEngine's and FileBackedCachingStorageEngine's getCapability(). 30 June 2015, 18:11:45 UTC
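The lock acquisition described in the HA commit relies on a file move into a shared directory being atomic. A rough local-filesystem analogue using java.nio.file is sketched below; note that on HDFS a rename fails if the destination exists, which is what gives mutual exclusion there, whereas a local POSIX rename may silently replace the target, so this is an illustration only, and RenameLock is an invented name.

```java
// Hypothetical sketch of mutual exclusion via atomic file moves, the
// mechanism the HdfsFailedFetchLock uses against HDFS; shown here with
// java.nio.file on a local filesystem for illustration.
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class RenameLock {
    final Path lockFile; // file owned by this contender
    final Path lockDir;  // shared directory; holding the lock = our file is inside

    RenameLock(Path lockFile, Path lockDir) {
        this.lockFile = lockFile;
        this.lockDir = lockDir;
    }

    /** Attempts to take the lock; on failure, BnP retries (every 10 s, up to 90 times). */
    boolean tryAcquire() {
        try {
            // On HDFS, a rename to an existing destination fails, so exactly
            // one contender's move succeeds at a time.
            Files.move(lockFile, lockDir.resolve("lock"), StandardCopyOption.ATOMIC_MOVE);
            return true;
        } catch (IOException alreadyHeld) {
            return false;
        }
    }

    /** Releases the lock by moving the file back out of the lock directory. */
    void release() throws IOException {
        Files.move(lockDir.resolve("lock"), lockFile, StandardCopyOption.ATOMIC_MOVE);
    }
}
```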
050ec92 Merge pull request #273 from bitti/master @bitti thanks for the fix, merged it in. Fix SecurityException when running HadoopStoreJobRunner in an oozie java action 29 June 2015, 21:18:22 UTC
fb9cab6 Fix SecurityException when running HadoopStoreJobRunner in oozie 17 June 2015, 16:24:58 UTC
f3801cf Releasing voldemort 1.9.17 12 June 2015, 23:48:22 UTC
88fcf8d ConnectionException is not catastrophic
1) If a connection times out or fails during protocol negotiation, it is treated as a normal error instead of a catastrophic error. The connection timeout was a regression from the NIO connect fix. The protocol negotiation timeout is a new change, to detect failed servers faster.
2) When a node is marked down, the outstanding queued requests are not failed; they are let through the connection creation cycle. When there are no outstanding requests, they can wait indefinitely until the next request comes up.
3) UnreachableStoreException is sometimes double-wrapped, which causes catastrophic errors not to be detected accurately. Created a utility method: when you are not sure whether a thrown exception could be an UnreachableStoreException, use this method, which handles the case correctly.
4) In a non-blocking connect, if the DNS does not resolve, Java throws UnresolvedAddressException instead of UnknownHostException (probably a Java issue). Also, UnresolvedAddressException is not derived from IOException but from IllegalArgumentException, which is weird. Fixed the code to handle this.
5) Tuned the remembered-exceptions timeout to twice the connection timeout. Previously it was hardcoded to 3 seconds, which was too aggressive when the connection timeout for some use cases was set to more than 5 seconds.
Added unit tests to verify all the above cases. 12 June 2015, 23:23:22 UTC
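The utility described in point 3 presumably walks the cause chain so a double-wrapped exception is still detected. A hedged sketch, using a stand-in exception class rather than the real voldemort.store.UnreachableStoreException:

```java
// Illustrative sketch of detecting a wrapped exception; the nested class
// here is a stand-in for voldemort.store.UnreachableStoreException.
public class ExceptionUtil {
    static class UnreachableStoreException extends RuntimeException {
        UnreachableStoreException(String msg) { super(msg); }
        UnreachableStoreException(String msg, Throwable cause) { super(msg, cause); }
    }

    /**
     * Walks the cause chain, so even a double-wrapped
     * UnreachableStoreException is classified correctly.
     */
    static boolean isUnreachable(Throwable t) {
        for (; t != null; t = t.getCause()) {
            if (t instanceof UnreachableStoreException) {
                return true;
            }
        }
        return false;
    }
}
```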
2b95f0d update coordinator class name in server script 12 June 2015, 16:37:50 UTC
d65f7db Releasing Voldemort 1.9.16 09 June 2015, 13:46:34 UTC
c574a37 Standardized recent release_notes formatting. 09 June 2015, 13:41:07 UTC
97d8694 Some more AvroUtils and BnP clean ups. 09 June 2015, 13:33:27 UTC
e13f6a2 Fix error reporting in AvroUtils.getSchemaFromPath() - report errors with an exception - report errors exactly once - provide the failing pathname - don't generate spurious cascading NPE failures 09 June 2015, 01:56:31 UTC
037a0dc Merge pull request #269 from FelixGV/VoldemortConfig_bug Fixed VoldemortConfig bug introduced in 3692fa3. 08 June 2015, 23:01:48 UTC
c7e6cec Fixed VoldemortConfig bug introduced in 3692fa3f493acf717b1431d624af4c997df4f2fd. 08 June 2015, 22:38:57 UTC
5f0cd8b Merge pull request #265 from gnb/VOLDENG-1912 Unregister the "-streaming-stats" mbean correctly 06 June 2015, 00:28:12 UTC
924c72f Unregister the "-streaming-stats" mbean correctly. This avoids littering up the logs with JMX exceptions like this:
2015/06/04 23:55:58.105 ERROR [JmxUtils] [voldemort-admin-server-t21] [voldemort] [] Error unregistering mbean
javax.management.InstanceNotFoundException: voldemort.server.StoreRepository:type=cmp_comparative_insights
    at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getMBean(DefaultMBeanServerInterceptor.java:1095)
    at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.exclusiveUnregisterMBean(DefaultMBeanServerInterceptor.java:427)
    at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.unregisterMBean(DefaultMBeanServerInterceptor.java:415)
    at com.sun.jmx.mbeanserver.JmxMBeanServer.unregisterMBean(JmxMBeanServer.java:546)
    at voldemort.utils.JmxUtils.unregisterMbean(JmxUtils.java:348)
    at voldemort.server.StoreRepository.removeStorageEngine(StoreRepository.java:187)
    at voldemort.server.storage.StorageService.removeEngine(StorageService.java:749)
    at voldemort.server.protocol.admin.AdminServiceRequestHandler.handleDeleteStore(AdminServiceRequestHandler.java:1487)
    at voldemort.server.protocol.admin.AdminServiceRequestHandler.handleRequest(AdminServiceRequestHandler.java:238)
    at voldemort.server.niosocket.AsyncRequestHandler.read(AsyncRequestHandler.java:190)
    at voldemort.common.nio.SelectorManagerWorker.run(SelectorManagerWorker.java:105)
    at voldemort.common.nio.SelectorManager.run(SelectorManager.java:214)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745) 06 June 2015, 00:22:19 UTC
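A defensive unregistration along the lines of this fix checks registration first instead of letting InstanceNotFoundException litter the logs. This sketch uses the standard javax.management API directly; JmxCleanup and unregisterQuietly are invented names, not Voldemort's JmxUtils.

```java
// Hypothetical sketch of quiet mbean unregistration using the standard
// JMX API; not Voldemort's actual JmxUtils code.
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;

public class JmxCleanup {
    static void unregisterQuietly(ObjectName name) {
        try {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            // Check first: unregistering an absent bean throws
            // InstanceNotFoundException, which is harmless noise here.
            if (server.isRegistered(name)) {
                server.unregisterMBean(name);
            }
        } catch (Exception e) {
            // Absence is not fatal; log at debug level rather than ERROR.
        }
    }
}
```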
e63bc53 Releasing Voldemort build 1.9.15 06 June 2015, 00:07:15 UTC
139e441 Fix log message. HdfsFile does not have a toString method, which causes the object id to be printed in the log message; this broke the script we had for collecting the download speed. (The speed can now be calculated better using the stats file, but that is a separate project.) Added the number of directories and files being downloaded, in addition to size. This will help to track some more details, as dummy files are created in place for files that do not exist. Renamed HDFSFetcherAdvancedTest to HdfsFetcherAdvancedTest to keep it in sync with other naming conventions. 05 June 2015, 23:58:49 UTC
1592db0 Merge pull request #263 from FelixGV/hung_async_task_mitigation Added SO_TIMEOUT config (default 30 mins) in ConfigurableSocketFactory. Looks good. 04 June 2015, 18:49:05 UTC
3692fa3 Added SO_TIMEOUT config (default 30 mins) in ConfigurableSocketFactory and VoldemortConfig. Added logging to detect hung async jobs in AdminClient.waitForCompletion 04 June 2015, 18:24:00 UTC
13a4b81 HdfsCopyStatsTest fails intermittently. The OS returns the expected files in random order; use a set instead of a list. 31 May 2015, 16:32:33 UTC
b8d9525 Add more testing for Serialization. I was doing some tests on what the expected inputs and outputs for the serializers are, and thought it would be a good idea, instead of just documenting them, to write unit tests to validate them. Most of the serializers have very poor test coverage, so I decided to add the unit tests. I will add more tests as I start working more on the expected input/output. 27 May 2015, 22:50:09 UTC
b540533 Releasing Voldemort 1.9.14 22 May 2015, 16:49:55 UTC
df12409 RO HDFS fetcher allocates too much memory
1) The HDFS fetcher in Hadoop 1.0.4 uses ByteRangeInputStream. This class does not override the method read(byte[], int, int), so it defaults to the implementation from InputStream, which reads one byte at a time from the input stream. HttpInputStream creates a byte array for each such read, so if you are downloading 2 TB of data, the server will allocate and free 2 TB before the download completes. This creates too much garbage: new gen fills up in a few milliseconds and GC happens. Though each GC is fast, this excessive GC causes the latency to spike and can cause the JVM to run out of memory.
2) http://svn.apache.org/viewvc?view=revision&revision=1330500 fixed this issue in April 2012, knowingly or unknowingly. I tried upgrading to the latest Hadoop, but it brings in Protobuf 2.5.0 and Avro 1.7. When I disabled those dependencies it failed at runtime expecting Protobuf 2.5.0; I enabled only Protobuf, and it has no runtime dependency on Avro 1.7. But I am saving that fix for a later day. The branch is hadoop_Version_Upgrade, which uses Hadoop 2.6.0 and Protobuf 2.6.1. 18 May 2015, 18:49:17 UTC
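The single-byte read behavior described in point 1 is easy to demonstrate: InputStream's default read(byte[], int, int) loops over the one-byte read() method, so a subclass that fails to override it pays a per-byte cost for every bulk read. The CountingStream class below is a self-contained demonstration, not Voldemort or Hadoop code.

```java
// Demonstrates that InputStream's default read(byte[], int, int) calls
// the single-byte read() once per byte when a subclass does not override
// the bulk method, which is the root cause described in the commit above.
import java.io.IOException;
import java.io.InputStream;

public class SingleByteReadDemo {
    static class CountingStream extends InputStream {
        int singleByteReads = 0;
        int remaining;

        CountingStream(int size) {
            this.remaining = size;
        }

        @Override
        public int read() {
            if (remaining == 0) {
                return -1;
            }
            remaining--;
            singleByteReads++;
            return 42; // arbitrary payload byte
        }
        // Note: read(byte[], int, int) is deliberately NOT overridden,
        // mirroring ByteRangeInputStream in Hadoop 1.0.4.
    }

    public static void main(String[] args) throws IOException {
        CountingStream in = new CountingStream(8192);
        byte[] buf = new byte[8192];
        int n = in.read(buf, 0, buf.length);
        // One bulk read from the caller fans out into one read() per byte.
        System.out.println(n + " bytes via " + in.singleByteReads + " read() calls");
    }
}
```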