fd345f7 | Felix GV | 15 August 2015, 00:20:02 UTC | Releasing Voldemort 1.9.19 | 15 August 2015, 00:20:02 UTC |
07f0f72 | Felix GV | 14 August 2015, 22:33:10 UTC | Removed MemLock class and related configs. It turns out our mlock call always failed and we've been running fine without it anyway, so there's no point in keeping that code around. | 14 August 2015, 22:33:10 UTC |
9e5fe19 | Greg Banks | 13 August 2015, 14:45:24 UTC | Cleanup formatting of system store schema. | 13 August 2015, 14:45:24 UTC |
a6294ce | Arunachalam Thirupathi | 12 August 2015, 22:26:15 UTC | Revert "Clean Set Quota Fix" This reverts commit 3ff1cbe47a2206834c2cd6b77878dcccb3f0be67. The unit tests are broken. On a deeper look, the file-backed caching storage engine does not have a version per key, but only at the file level. Any update to the file with a lower version will be rejected. So we have to go with super clocks for now. I am reverting the commit to unblock the release of a new version. | 12 August 2015, 22:27:04 UTC |
fa0acf6 | James Lent | 22 July 2015, 19:19:38 UTC | Ensure that all the AbstractStorageEngineTest tests get run by all the subclasses. Add the @Test annotation to several tests. Without that annotation it appears that those subclasses that are parameterized do not run these specific test cases. This is most likely a base gradle issue. Perhaps somewhat related to: https://issues.gradle.org/browse/GRADLE-3112 | 12 August 2015, 19:03:44 UTC |
da0ec6d | Felix GV | 12 August 2015, 01:35:40 UTC | Removed HadoopStoreJobRunner and related scripts. This is to avoid confusion. There is no point in running those scripts. - For building and pushing a read-only store, VoldemortBuildAndPushJobRunner and ./bin/run-bnp.sh should be used. If someone wants to just build without pushing, that can be done by passing push=false to the BnP config. - For swapping a store version, it can be done via the vadmin.sh script. | 12 August 2015, 17:54:11 UTC |
00ad250 | Felix GV | 12 August 2015, 00:09:29 UTC | Changes to @elad's shadowJar build, to ensure all dependencies are bundled up. | 12 August 2015, 17:54:11 UTC |
a4bc78c | Elad Efrat | 04 August 2015, 13:22:17 UTC | Minimal config for read-only store, with Hadoop (BnP) hints. | 12 August 2015, 17:54:11 UTC |
3add1aa | Elad Efrat | 04 August 2015, 13:16:06 UTC | Print number of partitions, useful for debugging. | 12 August 2015, 17:54:11 UTC |
e5eee78 | Elad Efrat | 04 August 2015, 13:12:45 UTC | Add BnP job and script from @FelixGV with slight changes. | 12 August 2015, 17:54:11 UTC |
c974058 | Elad Efrat | 04 August 2015, 13:05:25 UTC | Make this work on Hadoop 2.x by shading a few dependencies. Shade avro (to 1.4.0) and protobuf (to 2.3.0) and always include jdom (1.1). Mostly from #274, also see #284. | 12 August 2015, 17:54:11 UTC |
e3113e3 | Felix GV | 12 August 2015, 01:56:04 UTC | Addressing stylistic comment. | 12 August 2015, 17:41:35 UTC |
98cd411 | ARUNACHALAM THIRUPATHI | 12 August 2015, 01:40:33 UTC | Merge pull request #296 from arunthirupathi/setQuota Clean Set Quota Fix | 12 August 2015, 01:40:33 UTC |
8c068ac | Felix GV | 12 August 2015, 00:46:46 UTC | Allow RO servers to run without Kerberos enabled. | 12 August 2015, 00:46:46 UTC |
3ff1cbe | Arunachalam Thirupathi | 11 August 2015, 19:35:49 UTC | Clean Set Quota Fix Clean up the Set quota to not generate super clocks. Wait for all the nodes to complete, before returning success. | 11 August 2015, 19:35:49 UTC |
2a0c81e | Greg Banks | 06 August 2015, 15:49:56 UTC | Remove unused members and methods from MemLock public setFile() is unused, and also dangerous because there is no point in the lifetime of the object when it can have a useful effect. The descriptor member is not needed after the ctor. | 11 August 2015, 18:30:03 UTC |
dc91e04 | Greg Banks | 06 August 2015, 15:41:08 UTC | Factor out some common code in ChunkedFileSet into a new mapAndRememberIndexFile() method. | 11 August 2015, 18:30:03 UTC |
faa2ca0 | Greg Banks | 06 August 2015, 15:32:14 UTC | Remove unused mapFile() method in ChunkedFileSet. | 11 August 2015, 18:30:03 UTC |
8647f82 | Greg Banks | 06 August 2015, 15:30:11 UTC | Simplify MappedFileReader, close Unix fd early Remove most of the members and make them local variables in the map() function which is the only place that uses them. This simplifies the c'tor and means it cannot throw IOException anymore. It also means we close the Unix file descriptor used to create the mapping much earlier. This will help reduce the load on file descriptors. | 11 August 2015, 18:30:03 UTC |
1eafb88 | Greg Banks | 06 August 2015, 07:41:41 UTC | Remove unused fields from MappedFileReader - fadvise was an unused leftover from previous code - offset was used but its value was always zero; stop pretending otherwise | 11 August 2015, 18:30:03 UTC |
a4fe60a | Greg Banks | 06 August 2015, 07:38:03 UTC | Merge BaseMappedFile into its only subclass. It had precisely one subclass, did not define any semantically meaningful behavior, and nobody used the base class. It will not be missed. | 11 August 2015, 18:30:03 UTC |
4b74809 | Greg Banks | 06 August 2015, 06:59:32 UTC | Report mmap/munmap errors using IOException which the signature claims can be thrown but never was. Ensure that the only callchain (MappedFileReader -> MemLock -> mman) will actually handle and log IOException correctly. Also, stop claiming that mlock() and munlock() throw IOException. They never did and it would not be helpful, in fact we want to ignore all errors from those functions because their normal behavior in unprivileged processes is to fail. | 11 August 2015, 18:30:03 UTC |
9fb2689 | Greg Banks | 06 August 2015, 05:33:50 UTC | Remove all traces of MAP_ALIGN which doesn't exist on Linux (the binary value we were passing into the kernel was some other harmless option) and even if it did exist it has no effect because we're hardcoding the alignment to 0 which means "let the kernel choose" i.e. the default behavior. | 11 August 2015, 18:30:03 UTC |
be018e3 | Greg Banks | 06 August 2015, 05:31:35 UTC | Remove all traces of MAP_LOCKED which we didn't actually use except in some stale comments, and which is not implemented on any current OS anyway. | 11 August 2015, 18:30:03 UTC |
9202548 | singhsiddharth | 11 August 2015, 06:22:10 UTC | Merge pull request #295 from FelixGV/fixes_to_DataCleanupJobTest Bump EventThrotller window + fixes to data cleanup job test | 11 August 2015, 06:22:10 UTC |
d25d169 | Felix GV | 11 August 2015, 04:51:19 UTC | Upgrade to Hadoop 2.3.0-cdh5.1.5 and Kerberos clean up. | 11 August 2015, 05:13:55 UTC |
9a48c7e | Felix GV | 11 August 2015, 00:51:59 UTC | Added STORAGE_SPACE quota and cleaned up some vadmin stuff. | 11 August 2015, 04:40:56 UTC |
e65b0e0 | Felix GV | 11 August 2015, 03:01:12 UTC | Fixed DataCleanupJobTest. | 11 August 2015, 03:03:05 UTC |
3c6be4b | Felix GV | 11 August 2015, 02:59:28 UTC | Bumped up EventThrottler's default window time to 1000 ms. | 11 August 2015, 03:02:49 UTC |
8d6f06a | singhsiddharth | 07 August 2015, 00:11:19 UTC | Remove sleep after deprecated warning | 07 August 2015, 00:11:19 UTC |
df46fdc | singhsiddharth | 29 July 2015, 19:28:58 UTC | Merge pull request #288 from FelixGV/remove_scala_and_ec2_testing Remove scala, ec2 testing and public-lib directory | 29 July 2015, 19:28:58 UTC |
b38aabe | Felix GV | 29 July 2015, 18:46:01 UTC | Removed public-lib as it was only used by the deprecated ant build. | 29 July 2015, 18:46:01 UTC |
3bfc934 | Felix GV | 29 July 2015, 18:39:54 UTC | Deleted unused cruft (scala shell and ec2-testing contrib). | 29 July 2015, 18:39:54 UTC |
105d1ed | Felix GV | 27 July 2015, 17:33:45 UTC | Removed a duplicate log in AdminClient.waitForCompletion | 27 July 2015, 17:33:45 UTC |
dcbd8c9 | Felix GV | 23 July 2015, 20:35:15 UTC | Improved BnP logging. | 23 July 2015, 20:35:15 UTC |
3a935aa | Felix GV | 20 July 2015, 22:54:37 UTC | Releasing Voldemort 1.9.18 | 20 July 2015, 22:54:37 UTC |
724596d | Felix GV | 20 July 2015, 19:43:28 UTC | Voldemort BnP pushes to all colos in parallel. Also contains many logging improvements to discriminate between hosts and clusters. | 20 July 2015, 19:43:28 UTC |
9c61ada | Felix GV | 14 July 2015, 20:55:01 UTC | Rewrite of the EventThrottler code to use Tehuti. - Makes throttling less vulnerable to spiky traffic sneaking in "between the interval". - Also fixes throttling for the HdfsFetcher when compression is enabled. | 16 July 2015, 01:18:31 UTC |
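The sliding-window idea behind this rewrite can be illustrated with a minimal sketch (hypothetical class name, not the actual Tehuti-based EventThrottler): by evicting timestamps older than the window before counting, a burst can no longer sneak in right at an interval boundary the way it can with a fixed, resetting interval counter.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch of sliding-window throttling; not the actual
// Tehuti-based EventThrottler from this commit.
class SlidingWindowThrottler {
    private final long windowMs;
    private final long maxEventsPerWindow;
    // Timestamps (ms) of events recorded within the current window.
    private final Deque<Long> events = new ArrayDeque<Long>();

    SlidingWindowThrottler(long windowMs, long maxEventsPerWindow) {
        this.windowMs = windowMs;
        this.maxEventsPerWindow = maxEventsPerWindow;
    }

    // Record an event and report whether the caller should back off.
    synchronized boolean shouldThrottle(long nowMs) {
        // Evict events that have fallen out of the sliding window.
        while (!events.isEmpty() && nowMs - events.peekFirst() >= windowMs) {
            events.pollFirst();
        }
        events.addLast(nowMs);
        return events.size() > maxEventsPerWindow;
    }
}
```

Because the window slides instead of resetting, two half-windows of traffic on either side of a boundary are counted together rather than separately.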
82f80b6 | Arunachalam Thirupathi | 10 July 2015, 20:25:23 UTC | Fix the whitespace changes The previous refactor was done from my MacBook, which did not replace tabs with spaces. This messed up a lot of the editing. Instead of re-doing the change with spaces, I just formatted the code, which is easier and requires no re-verification. You can review the commit by adding ?w=1 to the GitHub URL, or by using git diff -w on the command line, to ignore the whitespace; there are not many real changes. | 10 July 2015, 20:25:23 UTC |
168cb69 | ARUNACHALAM THIRUPATHI | 28 June 2015, 05:16:29 UTC | Pass in additional parameters to fetch 1) Currently the AsyncOperationStatus is set for HdfsFetcher; if 2 or more fetches are going on, this produces erroneous results. 2) Add StoreName, Version, Metadatastore for use in future fetches. 3) Enabled the Hadoop* tests; it is unclear why they were not run in the ant tests. When I ported them for parity reasons I disabled them too, but I am enabling them now as the tests seem valid. 4) Made the fetch throw IOException instead of Throwable; catching Throwable is less reliable and catches more than intended. | 01 July 2015, 06:41:08 UTC |
d70ed85 | ARUNACHALAM THIRUPATHI | 28 June 2015, 03:23:40 UTC | Refactor file fetcher to Strategy Interface/class Refactor the file fetcher into a Strategy interface and class. In the future this lets you modify the file fetching strategy, e.g. having BuildAndPush build only one copy per partition and chunk while the fetcher fetches them under different names. There is no logic change; the code is just refactored. | 01 July 2015, 06:41:08 UTC |
6a42f59 | Felix GV | 01 July 2015, 00:04:33 UTC | Improved path handling and validation in VoldemortSwapJob | 01 July 2015, 01:04:12 UTC |
aa51b0b | ARUNACHALAM THIRUPATHI | 30 June 2015, 21:47:13 UTC | Merge pull request #271 from dallasmarlow/coordinator-class Thanks for the fix @dallasmarlow update coordinator class name in server script | 30 June 2015, 21:47:13 UTC |
a45fc83 | Felix GV | 30 June 2015, 21:06:29 UTC | Fixed voldemort.cluster.ClusterTest | 30 June 2015, 21:06:29 UTC |
c2db8fd | Felix GV | 30 June 2015, 18:11:45 UTC | First-cut implementation of Build and Push High Availability. This commit introduces a limited form of HA for BnP. The new functionality is disabled by default and can be enabled via the following server-side configurations, all of which are necessary: push.ha.enabled=true push.ha.cluster.id=<some arbitrary name which is unique per physical cluster> push.ha.lock.path=<some arbitrary HDFS path used for shared state> push.ha.lock.implementation=voldemort.store.readonly.swapper.HdfsFailedFetchLock push.ha.max.node.failure=1 The Build and Push job will interrogate each cluster it pushes to and honor each cluster's individual settings (i.e.: one can enable HA on one cluster at a time, if desired). However, even if the server settings enable HA, this should be considered best-effort behavior, since some BnP users may be running older versions of BnP which will not honor the HA settings. Furthermore, up-to-date BnP users can also set the following config to disable HA, regardless of server-side settings: push.ha.enabled=false Below is a description of the behavior of BnP HA, when enabled. When a Voldemort server fails to do some fetch(es), the BnP job attempts to acquire a lock by moving a file into a shared directory in HDFS. Once the lock is acquired, it will check the state in HDFS to see if any nodes have already been marked as disabled by other BnP jobs. It then determines if the Voldemort node(s) which failed the current BnP job would bring the total number of unique failed nodes above the configured maximum, with the following outcome in each case: - If the total number of failed nodes is equal to or lower than the max allowed, then metadata is added to HDFS to mark the store/version currently being pushed as disabled on the problematic node. Afterwards, if the Voldemort server that failed the fetch is still online, it will be asked to go into offline mode (this is best effort, as the server could be down). 
Finally, BnP proceeds with swapping the new data set version on, as if all nodes had fetched properly. - If, on the other hand, the total number of unique failed nodes is above the configured max, then the BnP job will fail and the nodes that succeeded the fetch will be asked to delete the new data, just like before. In either case, BnP will then release the shared lock by moving the lock file outside of the lock directory, so that other BnP instances can go through the same process one at a time, in a globally coordinated (mutually exclusive) fashion. All HA-related HDFS operations are retried every 10 seconds up to 90 times (thus for a total of 15 minutes). These are configurable in the BnP job via push.ha.lock.hdfs.timeout and push.ha.lock.hdfs.retries respectively. When a Voldemort server is in offline mode, in order for BnP to continue working properly, the BnP jobs must be configured so that push.cluster points to the admin port, not the socket port. Configured in this way, transient HDFS issues may lead to the Voldemort server being put in offline mode, but wouldn't prevent future pushes from populating the newer data organically. External systems can be notified of the occurrences of the BnP HA code getting triggered via two new BuildAndPushStatus values passed to the custom BuildAndPushHooks registered with the job: SWAPPED (when things work normally) and SWAPPED_WITH_FAILURES (when a swap occurred despite some failed Voldemort node(s)). BnP jobs that failed because the maximum number of failed Voldemort nodes would have been exceeded still fail normally and trigger the FAILED hook. Future work: - Auto-recovery: Transitioning the server from offline to online mode, as well as cleaning up the shared metadata in HDFS, is not handled automatically as part of this commit (which is the main reason why BnP HA should not be enabled by default). 
The recovery process currently needs to be handled manually, though it could be automated (at least for the common cases) as part of future work. - Support non-HDFS based locking mechanisms: the HdfsFailedFetchLock is an implementation of a new FailedFetchLock interface, which can serve as the basis for other distributed state/locking mechanisms (such as Zookeeper, or a native Voldemort-based solution). Unrelated minor fixes and clean ups included in this commit: - Cleaned up some dead code. - Cleaned up abusive admin client instantiations in BnP. - Cleaned up the closing of resources at the end of the BnP job. - Fixed an NPE in the ReadOnlyStorageEngine. - Fixed a broken sanity check in Cluster.getNumberOfTags(). - Improved some server-side logging statements. - Fixed the exception type thrown in ConfigurationStorageEngine's and FileBackedCachingStorageEngine's getCapability(). | 30 June 2015, 18:11:45 UTC |
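The lock-by-rename protocol this commit describes can be sketched with local files standing in for HDFS. One caveat: HDFS's rename fails when the destination already exists, which is what the real HdfsFailedFetchLock exploits, whereas a local POSIX rename silently overwrites the destination. This sketch therefore models mutual exclusion with a movable token file instead (all helper names are hypothetical):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical sketch of coordinating via atomic renames, in the spirit of
// the commit's "move a file into a shared directory" lock. A token file
// present at its well-known path means the lock is free; whoever manages to
// rename the token to a private path holds the lock.
class RenameLockSketch {
    // Try to acquire: atomically move the shared token to our private path.
    // If the token is gone, another instance won the race.
    static boolean tryAcquire(Path lockToken, Path privateName) {
        try {
            Files.move(lockToken, privateName, StandardCopyOption.ATOMIC_MOVE);
            return true;
        } catch (IOException lost) {
            return false; // token already taken by someone else
        }
    }

    // Release: move the token back so the next instance can take it.
    static void release(Path privateName, Path lockToken) throws IOException {
        Files.move(privateName, lockToken, StandardCopyOption.ATOMIC_MOVE);
    }
}
```

The real implementation additionally retries these operations (every 10 seconds, up to 90 times) because HDFS calls can fail transiently.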
050ec92 | ARUNACHALAM THIRUPATHI | 29 June 2015, 21:18:22 UTC | Merge pull request #273 from bitti/master @bitti thanks for the fix, merged it in. Fix SecurityException when running HadoopStoreJobRunner in an oozie java action | 29 June 2015, 21:18:22 UTC |
fb9cab6 | David Ongaro | 17 June 2015, 16:24:58 UTC | Fix SecurityException when running HadoopStoreJobRunner in oozie | 17 June 2015, 16:24:58 UTC |
f3801cf | Arunachalam Thirupathi | 12 June 2015, 23:48:22 UTC | Releasing voldemort 1.9.17 | 12 June 2015, 23:48:22 UTC |
88fcf8d | Arunachalam Thirupathi | 31 May 2015, 08:51:47 UTC | ConnectionException is not catastrophic 1) If a connection times out or fails during protocol negotiation, it is treated as a normal error instead of a catastrophic one. The connection timeout was a regression from the NIO connect fix. The protocol negotiation timeout is a new change to detect failed servers faster. 2) When a node is marked down, the outstanding queued requests are not failed; they are allowed to go through the connection creation cycle. When there are no outstanding requests they could otherwise wait infinitely until the next request comes up. 3) UnreachableStoreException is sometimes double wrapped. This prevents catastrophic errors from being detected accurately. Created a utility method: when you are not sure whether the thrown exception could be an UnreachableStoreException, use this method, which handles the case correctly. 4) In a non-blocking connect, if DNS does not resolve, Java throws UnresolvedAddressException instead of UnknownHostException. Probably an issue in Java. Also, UnresolvedAddressException is derived not from IOException but from IllegalArgumentException, which is weird. Fixed the code to handle this. 5) Tuned the remembered-exceptions timeout to twice the connection timeout. Previously it was hardcoded to 3 seconds, which was too aggressive when the connection timeout for some use cases was set to more than 5 seconds. Added unit tests to verify all the above cases. | 12 June 2015, 23:23:22 UTC |
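The double-wrapping fix in item 3 comes down to one guard: only wrap a throwable when it is not already of the target type. A minimal sketch (hypothetical names; the real Voldemort utility and exception hierarchy differ):

```java
// Hypothetical sketch of the "wrap only if needed" utility from item 3.
class ExceptionWrapSketch {
    // Stand-in for voldemort's UnreachableStoreException.
    static class UnreachableStoreException extends RuntimeException {
        UnreachableStoreException(String msg, Throwable cause) {
            super(msg, cause);
        }
    }

    // If the throwable is already an UnreachableStoreException, return it
    // as-is instead of nesting it inside another one. Double wrapping hides
    // the original from instanceof-based catastrophic-error detection.
    static UnreachableStoreException wrapIfNeeded(String msg, Throwable t) {
        if (t instanceof UnreachableStoreException) {
            return (UnreachableStoreException) t;
        }
        return new UnreachableStoreException(msg, t);
    }
}
```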
2b95f0d | Dallas Marlow | 12 June 2015, 16:37:50 UTC | update coordinator class name in server script | 12 June 2015, 16:37:50 UTC |
d65f7db | Felix GV | 09 June 2015, 13:46:34 UTC | Releasing Voldemort 1.9.16 | 09 June 2015, 13:46:34 UTC |
c574a37 | Felix GV | 09 June 2015, 13:41:07 UTC | Standardized recent release_notes formatting. | 09 June 2015, 13:41:07 UTC |
97d8694 | Felix GV | 09 June 2015, 01:34:42 UTC | Some more AvroUtils and BnP clean ups. | 09 June 2015, 13:33:27 UTC |
e13f6a2 | Greg Banks | 07 June 2015, 20:20:21 UTC | Fix error reporting in AvroUtils.getSchemaFromPath() - report errors with an exception - report errors exactly once - provide the failing pathname - don't generate spurious cascading NPE failures | 09 June 2015, 01:56:31 UTC |
037a0dc | ARUNACHALAM THIRUPATHI | 08 June 2015, 23:01:48 UTC | Merge pull request #269 from FelixGV/VoldemortConfig_bug Fixed VoldemortConfig bug introduced in 3692fa3. | 08 June 2015, 23:01:48 UTC |
c7e6cec | Felix GV | 08 June 2015, 22:38:57 UTC | Fixed VoldemortConfig bug introduced in 3692fa3f493acf717b1431d624af4c997df4f2fd. | 08 June 2015, 22:38:57 UTC |
5f0cd8b | ARUNACHALAM THIRUPATHI | 06 June 2015, 00:28:12 UTC | Merge pull request #265 from gnb/VOLDENG-1912 Unregister the "-streaming-stats" mbean correctly | 06 June 2015, 00:28:12 UTC |
924c72f | Greg Banks | 06 June 2015, 00:17:01 UTC | Unregister the "-streaming-stats" mbean correctly This avoids littering up the logs with JMX exceptions like this 2015/06/04 23:55:58.105 ERROR [JmxUtils] [voldemort-admin-server-t21] [voldemort] [] Error unregistering mbean javax.management.InstanceNotFoundException: voldemort.server.StoreRepository:type=cmp_comparative_insights at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getMBean(DefaultMBeanServerInterceptor.java:1095) at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.exclusiveUnregisterMBean(DefaultMBeanServerInterceptor.java:427) at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.unregisterMBean(DefaultMBeanServerInterceptor.java:415) at com.sun.jmx.mbeanserver.JmxMBeanServer.unregisterMBean(JmxMBeanServer.java:546) at voldemort.utils.JmxUtils.unregisterMbean(JmxUtils.java:348) at voldemort.server.StoreRepository.removeStorageEngine(StoreRepository.java:187) at voldemort.server.storage.StorageService.removeEngine(StorageService.java:749) at voldemort.server.protocol.admin.AdminServiceRequestHandler.handleDeleteStore(AdminServiceRequestHandler.java:1487) at voldemort.server.protocol.admin.AdminServiceRequestHandler.handleRequest(AdminServiceRequestHandler.java:238) at voldemort.server.niosocket.AsyncRequestHandler.read(AsyncRequestHandler.java:190) at voldemort.common.nio.SelectorManagerWorker.run(SelectorManagerWorker.java:105) at voldemort.common.nio.SelectorManager.run(SelectorManager.java:214) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) | 06 June 2015, 00:22:19 UTC |
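A common defensive pattern for avoiding this kind of log noise (shown here as a general sketch, not necessarily the exact fix in this commit, which centered on using the correct ObjectName) is to check registration before unregistering:

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;

// Sketch of unregistering an MBean without spraying
// InstanceNotFoundException stack traces into the logs.
class JmxUnregisterSketch {
    // Returns true if an MBean was actually unregistered.
    static boolean unregisterQuietly(ObjectName name) {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        try {
            if (server.isRegistered(name)) {
                server.unregisterMBean(name);
                return true;
            }
            return false; // nothing registered under that name; nothing logged
        } catch (Exception e) {
            return false; // swallow races between isRegistered and unregister
        }
    }
}
```

Note this only suppresses the symptom; if the ObjectName being unregistered does not match the one used at registration time (as in this commit's "-streaming-stats" suffix), the stale MBean would still leak.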
e63bc53 | ARUNACHALAM THIRUPATHI | 06 June 2015, 00:07:15 UTC | Releasing Voldemort build 1.9.15 | 06 June 2015, 00:07:15 UTC |
139e441 | Arunachalam Thirupathi | 05 June 2015, 23:30:17 UTC | Fix log message HdfsFile does not have a toString method, which causes the object id to be printed in the log message; this broke the script we had for collecting the download speed. Speed can now be calculated better using the stats file, but that is a separate project. Added the number of directories being downloaded, and the number of files in addition to size. This will help track some more details, since dummy files are created in place for files that do not exist. Renamed HDFSFetcherAdvancedTest to HdfsFetcherAdvancedTest to keep it in sync with other naming conventions. | 05 June 2015, 23:58:49 UTC |
1592db0 | ARUNACHALAM THIRUPATHI | 04 June 2015, 18:49:05 UTC | Merge pull request #263 from FelixGV/hung_async_task_mitigation Added SO_TIMEOUT config (default 30 mins) in ConfigurableSocketFactory. Looks good. | 04 June 2015, 18:49:05 UTC |
3692fa3 | Felix GV | 03 June 2015, 17:42:52 UTC | Added SO_TIMEOUT config (default 30 mins) in ConfigurableSocketFactory and VoldemortConfig. Added logging to detect hung async jobs in AdminClient.waitForCompletion | 04 June 2015, 18:24:00 UTC |
13a4b81 | ARUNACHALAM THIRUPATHI | 31 May 2015, 16:32:33 UTC | HdfsCopyStatsTest fails intermittently The OS returns the expected files in random order. Use set instead of list. | 31 May 2015, 16:32:33 UTC |
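The fix described here is a standard one for flaky directory-listing tests: neither the OS nor `File.list()` guarantees any ordering, so expected file names should be compared as a set. A small sketch (hypothetical helper, not the actual HdfsCopyStatsTest code):

```java
import java.io.File;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Sketch of the test fix: directory entries come back in no guaranteed
// order, so assertions should compare names as a set, not a list.
class DirListingSketch {
    static Set<String> namesIn(File dir) {
        String[] names = dir.list();
        if (names == null) {
            return new HashSet<String>(); // not a directory, or I/O error
        }
        return new HashSet<String>(Arrays.asList(names));
    }
}
```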
b8d9525 | Arunachalam Thirupathi | 27 May 2015, 22:50:09 UTC | Add more testing for Serialization. Added more testing for Serialization. I was doing some tests on what is the expected input for the serializers and expected output. I thought it will be a good idea instead of just documenting, if i can write unit tests to validate them. Most of them have very poor testing, so decided to add the unit tests. I will add more testing as I start working more on the expected input/output. | 27 May 2015, 22:50:09 UTC |
b540533 | Arunachalam Thirupathi | 22 May 2015, 16:49:55 UTC | Release 1.9.14 Release version 1.9.14 | 22 May 2015, 16:49:55 UTC |
df12409 | Arunachalam Thirupathi | 18 May 2015, 18:14:36 UTC | RO Hdfs fetcher allocates too much memory 1) The Hdfs Fetcher in 1.0.4 uses ByteRangeInputStream. This class does not override the method read(byte[], int, int), so it defaults to the implementation from InputStream, which reads one byte at a time from the input stream. HttpInputStream creates byte arrays for each such read. So if you are downloading 2 TB of data, the server will allocate/free 2 TB of data before the download completes. This creates too much garbage: the new gen gets full in a few milliseconds and GC happens. Though the GCs are fast, so much GC causes the latency to spike and causes the JVM to run out of memory. 2) http://svn.apache.org/viewvc?view=revision&revision=1330500 fixed this issue in April 2012, knowingly or unknowingly. I tried upgrading to the latest Hadoop but it brings in ProtoBuf 2.5.0 and Avro 1.7. When I disabled the dependencies it failed at runtime expecting protobuf 2.5.0. I enabled only protobuf and it has no runtime dependency on Avro 1.7. But I am saving that fix for a later day. The branch is hadoop_Version_Upgrade, which uses Hadoop 2.6.0 and ProtoBuf 2.6.1 | 18 May 2015, 18:49:17 UTC |
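The degradation described in item 1 is easy to demonstrate: `InputStream`'s default `read(byte[], int, int)` loops over the single-byte `read()` once per byte. This sketch counts those calls for a stream that, like the old ByteRangeInputStream, fails to override the bulk-read method (the counting class is a hypothetical stand-in):

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Sketch of the problem: a stream that does not override
// read(byte[], int, int) falls back to InputStream's default, which calls
// the single-byte read() once per byte. In the commit's case each such
// call also allocated a fresh byte array, producing massive GC churn.
class SingleByteReadSketch extends InputStream {
    private final ByteArrayInputStream data;
    int singleByteReads = 0;

    SingleByteReadSketch(byte[] bytes) {
        this.data = new ByteArrayInputStream(bytes);
    }

    @Override
    public int read() throws IOException {
        singleByteReads++; // each call could allocate, as in the old fetcher
        return data.read();
    }
    // Deliberately no read(byte[], int, int) override: bulk reads degrade.
}
```

Overriding the bulk read (or fixing it upstream, as the cited SVN revision did) turns those per-byte calls into a single array copy.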
e2d845c | ARUNACHALAM THIRUPATHI | 14 May 2015, 17:24:42 UTC | Output stats file for RO files download .stats directory will be created and will contain last X (default: 50) stats file. If a version-X is fetched a file with the same name as this directory name will contain the stats for this download. The stats file will contain the individual file name, time it took to download and few other information. Added unit tests for the HdfsCopyStatsTest | 18 May 2015, 18:49:17 UTC |
b5db5ed | Xu Ha | 13 May 2015, 20:49:12 UTC | fix slop pusher unit test | 15 May 2015, 21:22:29 UTC |
45cce9e | Xu Ha | 13 May 2015, 22:54:19 UTC | fix store-delete command | 13 May 2015, 22:54:19 UTC |
705b6ff | Xu Ha | 13 May 2015, 17:34:57 UTC | add admin command for meta get-ro and add test config for readonly-two-nodes-cluster | 13 May 2015, 20:49:30 UTC |
eabb057 | Xu Ha | 23 January 2015, 21:52:59 UTC | Add Admin API to list/stop/enable scheduled jobs | 13 May 2015, 17:48:54 UTC |
20f1037 | Xu Ha | 12 May 2015, 17:52:55 UTC | add storeops.delete and deleteQuotaForNode, fix vector clock for setQuotaForNode | 12 May 2015, 21:39:11 UTC |
b4fa1cb | ARUNACHALAM THIRUPATHI | 11 May 2015, 18:06:21 UTC | Refactor HdfsFetcher 1) Created directory and File class to help me in the future. 2) Cleaned up some code to make for easier readability. | 12 May 2015, 17:54:57 UTC |
3378d6c | Arunachalam Thirupathi | 11 May 2015, 22:30:00 UTC | Code compiled on Java8 fails to run on Java6 Ever witnessed Exception in thread "main" java.lang.NoSuchMethodError: java.util.concurrent.ConcurrentHashMap.keySet()Ljava/util/concurrent/ConcurrentHashMap$KeySetView; at voldemort.store.metadata.MetadataStore.updateRoutingStrategies(MetadataStore.java:855) at voldemort.store.metadata.MetadataStore.init(MetadataStore.java:1189) This is because of the issue documented here https://gist.github.com/AlainODea/1375759b8720a3f9f094 | 11 May 2015, 22:30:00 UTC |
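The pitfall here is binary, not source-level: on JDK 8, `ConcurrentHashMap.keySet()` returns the covariant `KeySetView` type, so javac records that method descriptor at the call site, and an older JRE that only has `Set keySet()` throws `NoSuchMethodError` at runtime. One portable workaround (a sketch, not necessarily what the commit did) is to make the call through the `Map` interface:

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of the cross-compilation pitfall's workaround: on JDK 8, calling
// keySet() on a ConcurrentHashMap reference binds the call site to the
// KeySetView-returning overload, which older JREs do not have. Calling
// through the Map interface keeps the recorded descriptor portable.
class KeySetPortabilitySketch {
    static Set<String> portableKeys(ConcurrentHashMap<String, Integer> chm) {
        Map<String, Integer> asMap = chm; // bind the call site to Map.keySet()
        return asMap.keySet();
    }
}
```

The more robust fix is compiling with the target platform's bootclasspath (or `-release` on later JDKs), so such mismatches fail at compile time.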
20455c7 | Arunachalam Thirupathi | 11 May 2015, 21:45:23 UTC | Releasing Voldemort 1.9.13 | 11 May 2015, 21:45:23 UTC |
b4674b5 | Arunachalam Thirupathi | 11 May 2015, 21:09:02 UTC | Suppress obsoleteVersionException on logs During the refactoring of the server buffers, all errors from the storage engine were logged. The previous code did not log any errors on writes. I looked at the exception stack and could not see other errors that need to be suppressed. Verified that ProtocolBuffer does not log any error, so only the Voldemort native request handler is affected. | 11 May 2015, 21:09:02 UTC |
1c8e0d4 | ARUNACHALAM THIRUPATHI | 07 August 2014, 07:50:57 UTC | NIO style connect Problems: 1) Connect blocks the selector. This causes other operations (read/write) queued on the selector to incur additional latency or timeouts. This is worse when you have data centers that are far away. 2) The ProtocolNegotiation request is done after the connection establishment, which blocks the selector in the same manner. 3) If exceptions are encountered while getting connections from the queue, they are ignored. Solutions: The connection creation is async. The create method is modified to createAsync and it takes in the pool object. For NIO, createAsync triggers an async operation which checks in the connection when it is ready. For blocking connections, createAsync blocks, creates the connection and checks the connection in to the pool before returning. As the connection creation is async now, exceptions are remembered (for 5 seconds) in the pool. When some thread asks for a connection and exceptions are remembered, it will get an exception. There is no ordering in the way connections are handed out: one thread can request a connection and, before it can wait, another thread can steal the connection. This is avoided to a certain extent by having the thread split the blocking wait into two halves and create a connection if required, instead of doing one blocking wait. This should not be a problem in the real world, as once you reach steady state (the required number of connections has been created) this can't happen. Upgrade the source compatibility from Java 5 to 6. Most of the code is written with the assumption of Java 6; I don't believe you can run this code on Java 5. So the impact should be minimal, but if it goes in the Client V2 branch, it will get the benefit of additional testing. | 06 May 2015, 22:06:42 UTC |
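The core NIO mechanics behind "connect without blocking the selector" can be sketched in a few lines (hypothetical helpers, not the actual Voldemort connection-pool code): initiate a non-blocking connect, register interest in OP_CONNECT, and finish the handshake only when the selector says the channel is ready.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.SocketChannel;

// Minimal sketch of the async connect pattern: the selector thread never
// waits on a remote (possibly far-away) data center.
class AsyncConnectSketch {
    static SocketChannel beginConnect(Selector selector, InetSocketAddress addr)
            throws IOException {
        SocketChannel channel = SocketChannel.open();
        channel.configureBlocking(false);
        if (channel.connect(addr)) {
            // Connected immediately (possible on loopback): go straight to reads.
            channel.register(selector, SelectionKey.OP_READ);
        } else {
            // Connection pending: the selector will tell us when to finish it.
            channel.register(selector, SelectionKey.OP_CONNECT);
        }
        return channel;
    }

    // Invoked when the selector reports the key as connectable.
    static boolean completeConnect(SelectionKey key) throws IOException {
        SocketChannel channel = (SocketChannel) key.channel();
        if (channel.finishConnect()) {
            // Connected; protocol negotiation would be queued here, also async.
            key.interestOps(SelectionKey.OP_READ);
            return true;
        }
        return false;
    }
}
```

The commit builds on this by also making protocol negotiation asynchronous and by remembering connect failures in the pool so waiting threads fail fast instead of silently losing the error.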
5c03ea6 | Bhavani Sudha Saktheeswaran | 01 May 2015, 22:21:44 UTC | Releasing Voldemort 1.9.12 | 01 May 2015, 22:27:47 UTC |
706a1f3 | Bhavani Sudha Saktheeswaran | 01 May 2015, 01:54:19 UTC | Add more tests and fix buffer size of GZIP Streams | 01 May 2015, 21:09:59 UTC |
495a234 | ARUNACHALAM THIRUPATHI | 30 April 2015, 18:18:37 UTC | Merge pull request #256 from FelixGV/disable_ant_build Fully disabled the Ant build in favor of the Gradle one. Though the docs task is not yet ported to gradle, we can always fetch the build.xml from an older trunk and generate the docs. Given the amount of confusion it causes, I will merge this change in. | 30 April 2015, 18:18:37 UTC |
d310bf2 | Felix GV | 30 April 2015, 18:11:43 UTC | Fully disabled the Ant build in favor of the Gradle one. | 30 April 2015, 18:11:43 UTC |
8addbf7 | Arunachalam Thirupathi | 30 April 2015, 17:44:09 UTC | Fix the Readme to use Gradle Remove the ant and fix the readme to use Gradle. | 30 April 2015, 17:44:09 UTC |
c01f26e | Arunachalam Thirupathi | 27 April 2015, 21:42:36 UTC | Rebalance unit tests fail intermittently There are 2 issues. 1) Put is asynchronous, so there needs to be wait time before the put is verified on all the nodes. 2) Repeated puts need to generate different vector clocks. | 27 April 2015, 22:01:41 UTC |
9af4da4 | Xu Ha | 24 April 2015, 06:58:42 UTC | turn on reset-quota by default for rebalance-controller-cli | 25 April 2015, 01:22:27 UTC |
a831610 | Xu Ha | 24 April 2015, 06:50:54 UTC | split quota-resetting logic to QuotaResetter class and add unit test | 25 April 2015, 01:22:27 UTC |
8e39e55 | Xu Ha | 21 April 2015, 23:36:21 UTC | add reset-quota logic in RebalanceControllerCLI | 25 April 2015, 01:22:27 UTC |
a103fca | Bhavani Sudha Saktheeswaran | 24 April 2015, 00:43:21 UTC | Releasing Voldemort 1.9.11 | 24 April 2015, 00:43:21 UTC |
2ec72c4 | Bhavani Sudha Saktheeswaran | 15 April 2015, 01:00:16 UTC | Adding compression to RO path - first pass commit VoldemortConfig - Added a new config for compression codec. Default value for this property is GZIP. This is used by the AdminServiceRequestHandler to respond to the VoldemortBuildAndPushJob on what codec is supported. VAdminProto - Added a new request type for getting the supported compression codecs from the RO Voldemort Server AdminServiceRequestHandler - New method to handle the above request type. AdminClient - Provides a method - getSupportedROStorageCompressionCodecs - that supports the above request type. VoldemortBuildAndPushJob - inside run(), immediately after the cluster equality checks, an admin request is issued to the VoldemortServer (specified by the property "push.node") to fetch the RO compression codec supported by the server. - If any of the supported codecs match COMPRESSION_CODEC, then compression specific properties are set. Else no compression is enabled. AbstractHadoopJob - This is where the RO compression specific properties are set in JobConf inside the createJobConf() method HadoopStoreWriter and HadoopStoreWriterPerBucket - Adding dummy test-only constructors - Creating index and value file streams based on compression settings - Got rid of some unused variables - Minor movement of code HdfsFetcher - Changed copyFileWithCheckSum() to check if the files end with ".gz" and create a GZIPInputStream based on that. - The GZIPInputStream (if compression is enabled) wraps the original FSDataInputStream Tests for HadoopStoreWriter and HadoopStoreWriterPerBucket - These are parameterized tests - take in a boolean to either save keys or not - Run two tests - compressed and uncompressed - have tighter assumptions and use the test specific constructors in the corresponding classes | 23 April 2015, 23:14:37 UTC |
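The fetcher-side decision described in the HdfsFetcher bullet, wrapping the raw stream in a GZIPInputStream only when the file name ends with ".gz", can be sketched with plain java.util.zip (the helper name is hypothetical; the real code wraps Hadoop's FSDataInputStream):

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.GZIPInputStream;

// Sketch of suffix-based decompression selection during a fetch.
class CompressionAwareOpenSketch {
    static InputStream maybeDecompress(String fileName, InputStream raw)
            throws IOException {
        // Compressed RO chunk files carry a ".gz" suffix; everything else
        // is read as-is.
        return fileName.endsWith(".gz") ? new GZIPInputStream(raw) : raw;
    }
}
```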
9de1042 | Siddharth Singh | 20 April 2015, 23:55:36 UTC | Fix mode option in cluster fork lift | 21 April 2015, 22:39:15 UTC |
9e21ccb | Xu Ha | 17 April 2015, 18:32:40 UTC | create admin api for quota operations 1. Get quota by node id 2. Set quota by node id 3. Rebalance quota 4. Unit test for the new admin apis | 20 April 2015, 21:26:50 UTC |
b83e3e7 | Xu Ha | 17 April 2015, 01:38:23 UTC | add metadata key for quota.enforcement.enabled | 20 April 2015, 21:26:50 UTC |
c8e583e | Arunachalam Thirupathi | 16 April 2015, 01:10:50 UTC | Releasing Voldemort 1.9.10 | 16 April 2015, 01:10:50 UTC |
32e2e0b | ARUNACHALAM THIRUPATHI | 30 March 2015, 05:44:38 UTC | Client buffer cleanup and isCompleteResponse 1) The client's isCompleteResponse for Get and GetAll used to allocate the entire key and value, only to discard them immediately. Now the byte array is not de-serialized; validity is verified by advancing the pointers. 2) The Put request size is calculated up front and the buffer is grown to the required size to avoid double allocation. | 16 April 2015, 00:00:40 UTC |
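The "advance the pointers instead of deserializing" trick from commit 32e2e0b can be sketched against a simple length-prefixed payload (this is an illustrative format and class, not Voldemort's actual wire protocol):

```java
import java.nio.ByteBuffer;

public class CompleteResponseSketch {

    // Returns true if the buffer holds a complete 4-byte-length-prefixed
    // payload. The value bytes are skipped by moving the position, never
    // copied into a throwaway byte[]; the buffer is left untouched.
    public static boolean isComplete(ByteBuffer buf) {
        int start = buf.position();
        try {
            if (buf.remaining() < 4) {
                return false; // length prefix not fully received yet
            }
            int size = buf.getInt();
            if (buf.remaining() < size) {
                return false; // value bytes still in flight
            }
            buf.position(buf.position() + size); // skip, don't allocate
            return true;
        } finally {
            buf.position(start); // restore for the real parser
        }
    }
}
```

The same walk-without-copy pattern generalizes to multi-field responses: skip each sized field in turn and report incomplete as soon as a size check fails.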
3f425ef | ARUNACHALAM THIRUPATHI | 26 March 2015, 15:06:26 UTC | Vector clock deserializer from InputStream Avoid double-allocating the value for puts, which can potentially be a few kilobytes. The vector clock now has a deserializer that reads from an InputStream, and it is used to avoid the double allocation on the hot path. | 16 April 2015, 00:00:40 UTC |
94be1a5 | Arunachalam Thirupathi | 25 March 2015, 21:04:40 UTC | ShareBuffer Refactoring Refactored the shared buffer code to eliminate the separate read and write buffers. Now a common buffer is used and the code is refactored into its own classes. Verified by running the unit tests. | 16 April 2015, 00:00:40 UTC |
b3becf3 | Arunachalam Thirupathi | 23 March 2015, 18:43:01 UTC | Separate Client and Admin Request Handler Separated the Admin and Client Request Handlers. Currently the client port will answer admin requests and the admin port will answer client requests. You can bootstrap from either of these ports, and the client after bootstrapping sends the queries to the correct ports. This is dangerous, as most security deployments of Voldemort rely on blocking the admin port via a firewall, and an attacker could change the Voldemort source code to send admin requests to the client port. My intention for the fix was to make sure that the client port answers only client requests. This will help me make the client request handler share the read and write buffer without touching the admin request handler. Though it could be done for both client and admin, admin requests are too few and there are too many places to touch, so I will fix only the client request handler. The AdminClient expects both the client and admin request handlers: the admin client does some get-remote-metadata calls which use the Voldemort native v1 requests on the admin port. So the admin request handler is left unchanged; I just moved some code so that the client request handlers are isolated. | 16 April 2015, 00:00:40 UTC |
4a87d69 | ARUNACHALAM THIRUPATHI | 24 August 2014, 19:58:38 UTC | client sharing read/write buffer The client either writes to or reads from the socket, never both at once. So the buffer can be shared, which will bring down the memory requirement for the client by half. But the client has to watch for 2 things: 1) On write, the buffer expands as necessary, so the buffer needs to be reinitialized if it grows. 2) On read, if the buffer can't accommodate the data it grows as necessary; this case also needs to be handled. This works as expected and the unit tests are passing. Will put it through VPL to measure the efficiency of the fixes. Created a new class to hold the buffer reference. This helps to share the buffer between the input and output streams easily. Previously you had to watch out for places where one buffer moved away from the other and call an explicit method to update it. Also moved much of the buffer growing and resetting logic into common code, so it is more readable and understandable. Should I rename ByteBufferContainer to MutableByteBuffer? This fits the MutableInt pattern nicely, where a single int can be shared by multiple classes and an update by one is visible to the others. | 16 April 2015, 00:00:40 UTC |
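The shared-reference idea in commit 4a87d69 can be sketched as a tiny holder class (a hypothetical, stripped-down version of the ByteBufferContainer it describes, not the actual implementation): both streams keep the container, so when a write grows the buffer, the reader sees the replacement through get() without any explicit update call.

```java
import java.nio.ByteBuffer;

public class ByteBufferContainer {

    private ByteBuffer buffer;

    public ByteBufferContainer(int capacity) {
        this.buffer = ByteBuffer.allocate(capacity);
    }

    // Both the input and output stream read the current buffer through
    // this single mutable reference.
    public ByteBuffer get() {
        return buffer;
    }

    // Grow the backing buffer when a write (or an oversized read) does
    // not fit, preserving any bytes already written.
    public void ensureCapacity(int required) {
        if (buffer.capacity() < required) {
            ByteBuffer bigger = ByteBuffer.allocate(Math.max(required, buffer.capacity() * 2));
            buffer.flip();      // expose the written bytes for copying
            bigger.put(buffer); // carry them into the replacement
            buffer = bigger;
        }
    }
}
```

Centralizing the grow-and-copy here is what removes the "one buffer moves away from the other" hazard the commit message describes.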
298bdc1 | Arunachalam Thirupathi | 13 April 2015, 22:01:53 UTC | Increase the heap size for Tests Increase the heap size for tests to 8GB. The ZoneShrinkage tests fail from time to time with errors, as they run out of heap. | 13 April 2015, 22:01:53 UTC |
d546a02 | Felix GV | 10 April 2015, 18:34:22 UTC | Releasing Voldemort 1.9.9 | 10 April 2015, 18:34:22 UTC |
ca08a06 | Greg Banks | 09 April 2015, 21:15:01 UTC | Merge pull request #251 from voldemort/revert-223-master Revert "Steps towards automating cluster zone expansion" | 09 April 2015, 21:15:01 UTC |