swh:1:snp:5115096b921df712aeb2a08114fede57fb3331fb

5273c81 Ability to invoke application hook for every key during compaction. Summary: There are certain use-cases where the application intends to delete older keys after they have expired a certain time period. One option for those applications is to periodically scan the entire database and delete appropriate keys. A better way is to allow the application to hook into the compaction process. This patch allows the application to set a method callback for every key that is being compacted. If this method returns true, then the key is not preserved in the output of the compaction. Test Plan: This is mostly to preview the proposed new public api. Since it is a public api, please do due diligence on reviewing it. I will be writing test cases for this api in my next version of this patch. Reviewers: MarkCallaghan, heyongqiang Reviewed By: heyongqiang CC: sheki, adsharma Differential Revision: https://reviews.facebook.net/D6285 06 November 2012, 00:02:13 UTC
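A minimal sketch of the per-key hook idea described above, assuming a free-function callback; the hook name and signature actually introduced by D6285 may differ. `KeyFilter`, `ExpireOldKeys`, and the hardcoded timestamp (a stand-in for an environment clock) are illustrative.

```cpp
#include <cstdint>
#include <cstring>
#include <string>

// Hypothetical per-key compaction hook: return true to drop the key from the
// compaction output. Illustrative only; the real signature may differ.
using KeyFilter = bool (*)(int level, const std::string& key,
                           const std::string& existing_value);

// Example policy: drop keys whose value starts with an expiry timestamp.
bool ExpireOldKeys(int /*level*/, const std::string& /*key*/,
                   const std::string& value) {
  if (value.size() < sizeof(uint64_t)) return false;  // keep malformed entries
  uint64_t expiry_usecs;
  std::memcpy(&expiry_usecs, value.data(), sizeof(expiry_usecs));
  const uint64_t now_usecs = 1352160000ULL * 1000000;  // stand-in for a clock
  return now_usecs > expiry_usecs;  // true => key is not preserved
}

int main() {
  KeyFilter filter = ExpireOldKeys;
  std::string value(sizeof(uint64_t), '\0');  // expiry 0: already expired
  return filter(0, "some-key", value) ? 0 : 1;
}
```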
f1a7c73 fix compile error Summary: as subject Test Plan: n/a 05 November 2012, 18:30:19 UTC
d55c2ba Add a tool to change number of levels Summary: as subject. Test Plan: manually test it, will add a testcase Reviewers: dhruba, MarkCallaghan Differential Revision: https://reviews.facebook.net/D6345 05 November 2012, 18:17:39 UTC
81f735d Merge branch 'master' into performance Conflicts: db/db_impl.cc util/options.cc 05 November 2012, 17:41:38 UTC
a1bd5b7 Compilation problem introduced by previous commit 854c66b089bef5d27f79750884f70f6e2c8c69da. Summary: Compilation problem introduced by previous commit 854c66b089bef5d27f79750884f70f6e2c8c69da. Test Plan: make check 05 November 2012, 06:04:14 UTC
854c66b Make compression options configurable. These include window-bits, level and strategy for ZlibCompression Summary: Leveldb currently uses windowBits=-14 while using zlib compression. (It was earlier 15.) This makes the setting configurable. Related changes here: https://reviews.facebook.net/D6105 Test Plan: make all check Reviewers: dhruba, MarkCallaghan, sheki, heyongqiang Differential Revision: https://reviews.facebook.net/D6393 02 November 2012, 18:26:39 UTC
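A rough sketch of what per-algorithm compression knobs could look like; the struct and field names here are assumptions for illustration, not necessarily the API added by D6393.

```cpp
// Sketch of making zlib parameters configurable instead of hardcoding
// windowBits = -14. Struct and field names are assumptions.
struct CompressionOptions {
  int window_bits = -14;  // negative => raw deflate stream, no zlib header
  int level = -1;         // -1 lets zlib pick Z_DEFAULT_COMPRESSION
  int strategy = 0;       // Z_DEFAULT_STRATEGY
};

struct Options {
  CompressionOptions compression_opts;
};

int main() {
  Options options;
  options.compression_opts.window_bits = -14;  // was fixed at 15 before D6105
  options.compression_opts.level = 6;
  return 0;
}
```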
3096fa7 Add two more options: disable block cache and make table cache shard number configurable Summary: as subject Test Plan: run db_bench and db_test Reviewers: dhruba Reviewed By: dhruba Differential Revision: https://reviews.facebook.net/D6111 01 November 2012, 20:23:21 UTC
3e7e269 Use timer to measure sleep rather than assume it is 1000 usecs Summary: This makes the stall timers in MakeRoomForWrite more accurate by timing the sleeps. From looking at the logs the real sleep times are usually about 2000 usecs each when SleepForMicros(1000) is called. The modified LOG messages are:
2012/10/29-12:06:33.271984 2b3cc872f700 delaying write 13 usecs for level0_slowdown_writes_trigger
2012/10/29-12:06:34.688939 2b3cc872f700 delaying write 1728 usecs for rate limits with max score 3.83
Task ID: # Blame Rev: Test Plan: run db_bench, look at DB/LOG Revert Plan: Database Impact: Memcache Impact: Other Notes: EImportant: - begin *PUBLIC* platform impact section - Bugzilla: # - end platform impact - Reviewers: dhruba Reviewed By: dhruba Differential Revision: https://reviews.facebook.net/D6297 30 October 2012, 14:21:37 UTC
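The fix is conceptually simple: wrap the sleep in a timer and log the measured elapsed time rather than the requested 1000 usecs. A self-contained sketch using std::chrono (the real code presumably uses the Env clock):

```cpp
#include <chrono>
#include <cstdio>
#include <thread>

int main() {
  auto start = std::chrono::steady_clock::now();
  std::this_thread::sleep_for(std::chrono::microseconds(1000));
  auto elapsed = std::chrono::duration_cast<std::chrono::microseconds>(
                     std::chrono::steady_clock::now() - start)
                     .count();
  // Report what actually happened (often ~2000 usecs), not what was asked for.
  std::printf("delaying write %lld usecs for level0_slowdown_writes_trigger\n",
              static_cast<long long>(elapsed));
  return 0;
}
```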
fb8d437 fix test failure Summary: as subject Test Plan: db_test Reviewers: dhruba, MarkCallaghan Reviewed By: MarkCallaghan Differential Revision: https://reviews.facebook.net/D6309 30 October 2012, 01:55:52 UTC
925f60d add a test case to make sure changing num_levels will fail Summary: as subject Test Plan: db_test Reviewers: dhruba, MarkCallaghan Reviewed By: MarkCallaghan Differential Revision: https://reviews.facebook.net/D6303 29 October 2012, 22:27:07 UTC
53e0431 Merge branch 'master' into performance Conflicts: db/db_bench.cc util/options.cc 29 October 2012, 21:18:00 UTC
321dfdc Allow having different compression algorithms on different levels. Summary: The leveldb API is enhanced to support different compression algorithms at different levels. This adds the option min_level_to_compress to db_bench that specifies the minimum level for which compression should be done when compression is enabled. This can be used to disable compression for levels 0 and 1 which are likely to suffer from stalls because of the CPU load for memtable flushes and (L0,L1) compaction. Level 0 is special as it gets frequent memtable flushes. Level 1 is special as it frequently gets all:all file compactions between it and level 0. But all other levels could be the same. For any level N where N > 1, the rate of sequential IO for that level should be the same. The last level is the exception because it might not be full and because files from it are not read to compact with the next larger level. The same amount of time will be spent doing compaction at any level N excluding N=0, 1 or the last level. By this standard all of those levels should use the same compression. The difference is that the loss (using more disk space) from a faster compression algorithm is less significant for N=2 than for N=3. So we might be willing to trade disk space for faster write rates with no compression for L0 and L1, snappy for L2, zlib for L3. Using a faster compression algorithm for the mid levels also allows us to reclaim some cpu without trading off much loss in disk space overhead. Also note that little is to be gained by compressing levels 0 and 1. For a 4-level tree they account for 10% of the data. For a 5-level tree they account for 1% of the data. With compression enabled: * memtable flush rate is ~18MB/second * (L0,L1) compaction rate is ~30MB/second With compression enabled but min_level_to_compress=2 * memtable flush rate is ~320MB/second * (L0,L1) compaction rate is ~560MB/second This practically takes the same code from https://reviews.facebook.net/D6225 but makes the leveldb api more general purpose with a few additional lines of code. Test Plan: make check Differential Revision: https://reviews.facebook.net/D6261 29 October 2012, 18:48:09 UTC
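The policy described above can be condensed into a small function: no compression below min_level_to_compress, and (following the example in the summary) cheaper compression in the middle levels than at the bottom. A hedged sketch; the real option wiring may differ.

```cpp
enum CompressionType { kNoCompression, kSnappyCompression, kZlibCompression };

// Leave L0/L1 uncompressed to keep memtable flushes and (L0,L1) compactions
// fast; compress deeper levels harder since their data is colder and larger.
// Names mirror the leveldb types loosely; the wiring is illustrative.
CompressionType CompressionForLevel(int level, int min_level_to_compress) {
  if (level < min_level_to_compress) return kNoCompression;
  if (level == min_level_to_compress) return kSnappyCompression;
  return kZlibCompression;  // e.g. none for L0/L1, snappy for L2, zlib for L3+
}
```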
acc8567 Add more rates to db_bench output Summary: Adds the "MB/sec in" and "MB/sec out" to this line: Amplification: 1.7 rate, 0.01 GB in, 0.02 GB out, 8.24 MB/sec in, 13.75 MB/sec out Changes all values to be reported per interval and since test start for this line: ... thread 0: (10000,60000) ops and (19155.6,27307.5) ops/second in (0.522041,2.197198) seconds Task ID: # Blame Rev: Test Plan: run db_bench Revert Plan: Database Impact: Memcache Impact: Other Notes: EImportant: - begin *PUBLIC* platform impact section - Bugzilla: # - end platform impact - Reviewers: dhruba Reviewed By: dhruba Differential Revision: https://reviews.facebook.net/D6291 29 October 2012, 18:30:07 UTC
de7689b Fix unit test failure caused by delaying deleting obsolete files. Summary: A previous commit 4c107587ed47af84633f8c61f65516a504d6cd98 introduced the idea that some version updates might not delete obsolete files. This means that if a unit test blindly counts the number of files in the db directory it might not represent the true state of the database. Use GetLiveFiles() instead to count the number of live files in the database. Test Plan: make check 29 October 2012, 18:12:24 UTC
70c42bf Adds DB::GetNextCompaction and then uses that for rate limiting db_bench Summary: Adds a method that returns the score for the next level that most needs compaction. That method is then used by db_bench to rate limit threads. Threads are put to sleep at the end of each stats interval until the score is less than the limit. The limit is set via the --rate_limit=$double option. The specified value must be > 1.0. Also adds the option --stats_per_interval to enable additional metrics reported every stats interval. Task ID: # Blame Rev: Test Plan: run db_bench Revert Plan: Database Impact: Memcache Impact: Other Notes: EImportant: - begin *PUBLIC* platform impact section - Bugzilla: # - end platform impact - Reviewers: dhruba Reviewed By: dhruba Differential Revision: https://reviews.facebook.net/D6243 29 October 2012, 17:17:43 UTC
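The rate-limit mechanism above amounts to a polling loop: after reporting stats, a thread sleeps until the worst compaction score drops below --rate_limit. A sketch with a stubbed score accessor; the actual method name added by this patch is not reproduced here.

```cpp
#include <chrono>
#include <thread>

// Stub standing in for the accessor this patch adds; the real method asks the
// DB for the score of the level most in need of compaction.
double GetMaxCompactionScore() { return 0.9; }

// db_bench-style stall: called at the end of each stats interval.
void MaybeStall(double rate_limit) {
  // the --rate_limit option requires a value > 1.0
  while (GetMaxCompactionScore() >= rate_limit) {
    std::this_thread::sleep_for(std::chrono::milliseconds(100));
  }
}

int main() {
  MaybeStall(2.5);  // returns immediately with the stub's low score
  return 0;
}
```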
8965c8d Add the missing util/auto_split_logger.h Summary: Test Plan: Reviewers: CC: Task ID: 1803577 Blame Rev: 26 October 2012, 22:23:50 UTC
d50f8eb Enable LevelDb to create a new log file if current log file is too large. Summary: Enable LevelDb to create a new log file if current log file is too large. Test Plan: Write a script and manually check the generated info LOG. Task ID: 1803577 Blame Rev: Reviewers: dhruba, heyongqiang Reviewed By: heyongqiang CC: zshao Differential Revision: https://reviews.facebook.net/D6003 26 October 2012, 21:55:02 UTC
3a91b78 Keep build_detect_platform portable Summary: AFAIK proper /bin/sh does not support "+=". Note that only our changes use "+=". The Google code does A="$A $B" rather than A+=$B. Task ID: # Blame Rev: Test Plan: build Revert Plan: Database Impact: Memcache Impact: Other Notes: EImportant: - begin *PUBLIC* platform impact section - Bugzilla: # - end platform impact - Reviewers: dhruba Reviewed By: dhruba Differential Revision: https://reviews.facebook.net/D6231 26 October 2012, 21:20:04 UTC
65855dd Normalize compaction stats by time in compaction Summary: I used server uptime to compute per-level IO throughput rates. I intended to use time spent doing compaction at that level. This fixes that. Task ID: # Blame Rev: Test Plan: run db_bench, look at results Revert Plan: Database Impact: Memcache Impact: Other Notes: EImportant: - begin *PUBLIC* platform impact section - Bugzilla: # - end platform impact - Reviewers: dhruba Reviewed By: dhruba Differential Revision: https://reviews.facebook.net/D6237 26 October 2012, 21:19:13 UTC
ea9e087 Merge branch 'master' into performance Conflicts: db/db_bench.cc db/db_impl.cc db/db_test.cc 26 October 2012, 15:57:56 UTC
8eedf13 Fix unit test failure caused by delaying deleting obsolete files. Summary: A previous commit 4c107587ed47af84633f8c61f65516a504d6cd98 introduced the idea that some version updates might not delete obsolete files. This means that if a unit test blindly counts the number of files in the db directory it might not represent the true state of the database. Use GetLiveFiles() instead to count the number of live files in the database. Test Plan: make check Reviewers: heyongqiang, MarkCallaghan Reviewed By: MarkCallaghan Differential Revision: https://reviews.facebook.net/D6207 26 October 2012, 15:42:05 UTC
5b0fe6c Greedy algorithm for picking files to compact. Summary: It is best if we pick the largest file to compact in a level. This reduces the write amplification factor for compactions. Each level has an auxiliary data structure called files_by_size_ that sorts all files by their size. This data structure is updated when a new version is created. Test Plan: make check Differential Revision: https://reviews.facebook.net/D6195 26 October 2012, 01:27:53 UTC
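A sketch of the greedy pick with simplified types; in the real patch files_by_size_ is maintained incrementally when a version is created, rather than re-sorted on every pick as done here for brevity.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

struct FileMetaData {
  uint64_t file_size = 0;
  bool being_compacted = false;
};

// Greedily choose the largest file at a level that is not already being
// compacted, which reduces write amplification per compaction.
FileMetaData* PickLargestFile(std::vector<FileMetaData*>& files_by_size) {
  std::sort(files_by_size.begin(), files_by_size.end(),
            [](const FileMetaData* a, const FileMetaData* b) {
              return a->file_size > b->file_size;
            });
  for (FileMetaData* f : files_by_size) {
    if (!f->being_compacted) return f;  // largest eligible file wins
  }
  return nullptr;  // everything at this level is already being compacted
}

int main() {
  FileMetaData a{100 << 20, true}, b{64 << 20, false};
  std::vector<FileMetaData*> files = {&a, &b};
  return PickLargestFile(files) == &b ? 0 : 1;  // a is busy, so b wins
}
```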
8fb5f40 firstIndex fix for multi-threaded compaction code. Summary: Prior to multi-threaded compaction, wrap-around would be done by using current_->files_[level][0]. With this change we should be using the first file for which f->being_compacted is not true. https://github.com/facebook/leveldb/commit/1ca0584345af85d2dccc434f451218119626d36e#commitcomment-2041516 Test Plan: make check Differential Revision: https://reviews.facebook.net/D6165 25 October 2012, 15:44:47 UTC
e7206f4 Improve statistics Summary: This adds more statistics to be reported by GetProperty("leveldb.stats"). The new stats include time spent waiting on stalls in MakeRoomForWrite. This also includes the total amplification rate where that is: (#bytes of sequential IO during compaction) / (#bytes from Put). This also includes a lot more data for the per-level compaction report.
* Rn(MB) - MB read from level N during compaction between levels N and N+1
* Rnp1(MB) - MB read from level N+1 during compaction between levels N and N+1
* Wnew(MB) - new data written to the level during compaction
* Amplify - ( Write(MB) + Rnp1(MB) ) / Rn(MB)
* Rn - files read from level N during compaction between levels N and N+1
* Rnp1 - files read from level N+1 during compaction between levels N and N+1
* Wnp1 - files written to level N+1 during compaction between levels N and N+1
* NewW - new files written to level N+1 during compaction
* Count - number of compactions done for this level
This is the new output from DB::GetProperty("leveldb.stats"). The old output stopped at Write(MB):
Compactions
Level Files Size(MB) Time(sec) Read(MB) Write(MB) Rn(MB) Rnp1(MB) Wnew(MB) Amplify Read(MB/s) Write(MB/s)  Rn Rnp1 Wnp1 NewW Count
-------------------------------------------------------------------------------------------------------------------------------------
  0       3        6        33        0      576      0       0      576    -1.0        0.0         1.3    0    0    0    0   290
  1     127      242       351     5316     5314    570    4747      567    17.0       12.1        12.1  287 2399 2685  286    32
  2     161      328        54      822      824    326     496      328     4.0        1.9         1.9  160  251  411  160   161
Amplification: 22.3 rate, 0.56 GB in, 12.55 GB out
Uptime(secs): 439.8
Stalls(secs): 206.938 level0_slowdown, 0.000 level0_numfiles, 24.129 memtable_compaction
Task ID: # Blame Rev: Test Plan: run db_bench Revert Plan: Database Impact: Memcache Impact: Other Notes: EImportant: - begin *PUBLIC* platform impact section - Bugzilla: # - end platform impact - (cherry picked from commit ecdeead38f86cc02e754d0032600742c4f02fec8) Reviewers: dhruba Differential Revision: https://reviews.facebook.net/D6153 24 October 2012, 21:21:38 UTC
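For orientation, the reported amplification rate can be reproduced from the sample line above: 12.55 GB out divided by 0.56 GB in is roughly 22.3.

```cpp
#include <cstdio>

// Sanity check of "Amplification: 22.3 rate, 0.56 GB in, 12.55 GB out":
// rate = (sequential compaction IO) / (bytes ingested via Put).
double AmplificationRate(double gb_in_from_put, double gb_out) {
  return gb_out / gb_in_from_put;
}

int main() {
  std::printf("%.1f\n", AmplificationRate(0.56, 12.55));  // ~22.4
  return 0;
}
```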
47bce26 Merge branch 'master' into performance 24 October 2012, 05:32:54 UTC
3b06f94 Merge branch 'master' into performance Conflicts: db/db_impl.cc db/db_impl.h db/version_set.cc 24 October 2012, 05:30:07 UTC
51d2adf Fix broken build. Add stdint.h to get uint64_t Summary: I still get failures from this. Not sure whether there was a fix in progress. Task ID: # Blame Rev: Test Plan: compile Revert Plan: Database Impact: Memcache Impact: Other Notes: EImportant: - begin *PUBLIC* platform impact section - Bugzilla: # - end platform impact - Reviewers: dhruba Reviewed By: dhruba Differential Revision: https://reviews.facebook.net/D6147 23 October 2012, 21:58:53 UTC
4c10758 Delete files outside the mutex. Summary: The compaction process deletes a large number of files. This takes quite a bit of time and is best done outside the mutex lock. Test Plan: make check Differential Revision: https://reviews.facebook.net/D6123 22 October 2012, 18:53:23 UTC
5010daa add "seek_compaction" to log for better debug Summary: as subject Test Plan: compile Reviewers: dhruba Reviewed By: dhruba CC: MarkCallaghan Differential Revision: https://reviews.facebook.net/D6117 22 October 2012, 17:00:25 UTC
3489cd6 Merge branch 'master' into performance Conflicts: db/db_impl.cc db/db_impl.h 21 October 2012, 09:15:19 UTC
f95219f Delete files outside the mutex. Summary: The compaction process deletes a large number of files. This takes quite a bit of time and is best done outside the mutex lock. Test Plan: make check Differential Revision: https://reviews.facebook.net/D6123 21 October 2012, 09:03:00 UTC
98f23cf Merge branch 'master' into performance Conflicts: db/db_impl.cc db/db_impl.h 21 October 2012, 08:55:19 UTC
64c4b9f Delete files outside the mutex. Summary: The compaction process deletes a large number of files. This takes quite a bit of time and is best done outside the mutex lock. Test Plan: Reviewers: CC: Task ID: # Blame Rev: 21 October 2012, 08:49:48 UTC
5016699 Merge branch 'master' into performance 19 October 2012, 23:08:04 UTC
507f5aa Do not enable checksums for zlib compression. Summary: Leveldb code already calculates checksums for each block. There is no need to generate checksums inside zlib. This patch switches off checksum generation/checking in the zlib library. (The Inno support for zlib uses windowsBits=14 as well.) phabricator marks this file as binary. But here is the diff:
diff --git a/port/port_posix.h b/port/port_posix.h
index 86a0927..db4e0b8 100644
--- a/port/port_posix.h
+++ b/port/port_posix.h
@@ -163,7 +163,7 @@ inline bool Snappy_Uncompress(const char* input, size_t length,
 }

 inline bool Zlib_Compress(const char* input, size_t length,
-                          ::std::string* output, int windowBits = 15, int level = -1,
+                          ::std::string* output, int windowBits = -14, int level = -1,
                           int strategy = 0) {
 #ifdef ZLIB
   // The memLevel parameter specifies how much memory should be allocated for
@@ -223,7 +223,7 @@ inline bool Zlib_Compress(const char* input, size_t length,
 }

 inline char* Zlib_Uncompress(const char* input_data, size_t input_length,
-                             int* decompress_size, int windowBits = 15) {
+                             int* decompress_size, int windowBits = -14) {
 #ifdef ZLIB
   z_stream _stream;
   memset(&_stream, 0, sizeof(z_stream));
Test Plan: run db_bench with zlib compression. Reviewers: heyongqiang, MarkCallaghan Reviewed By: heyongqiang Differential Revision: https://reviews.facebook.net/D6105 19 October 2012, 23:06:33 UTC
e982f5a Merge branch 'master' into performance Conflicts: util/options.cc 19 October 2012, 22:16:42 UTC
cf5adc8 db_bench was not correctly initializing the value for delete_obsolete_files_period_micros option. Summary: The parameter delete_obsolete_files_period_micros controls the periodicity of deleting obsolete files. db_bench was reading in this parameter into a local variable called 'l' but was incorrectly using another local variable called 'n' while setting it in the db.options data structure. This patch also logs the value of delete_obsolete_files_period_micros in the LOG file at db startup time. I am hoping that this will improve the overall write throughput drastically. Test Plan: run db_bench Reviewers: MarkCallaghan, heyongqiang Reviewed By: MarkCallaghan Differential Revision: https://reviews.facebook.net/D6099 19 October 2012, 22:10:12 UTC
1ca0584 This is the mega-patch multi-threaded compaction published in https://reviews.facebook.net/D5997. Summary: This patch allows compaction to occur in multiple background threads concurrently. If a manual compaction is issued, the system falls back to a single-compaction-thread model. This is done to ensure correctness and simplicity of code. When the manual compaction is finished, the system resumes its concurrent-compaction mode automatically. The updates to the manifest are done via a group-commit approach. Test Plan: run db_bench 19 October 2012, 21:00:53 UTC
cd93e82 Enable SSE when building with fbcode support. Summary: fbcode build now support SSE instructions. Delete older version of the compile-helper fbcode.sh. This is subsumed by fbcode.gcc471.sh. Test Plan: run make check Reviewers: heyongqiang, MarkCallaghan Reviewed By: heyongqiang Differential Revision: https://reviews.facebook.net/D6057 18 October 2012, 15:43:25 UTC
aa73538 The deletion of obsolete files should not occur very frequently. Summary: The method DeleteObsoleteFiles is a very costly method, especially when the number of files in a system is large. It makes a list of all live files and then scans the directory to compute the diff. By default, this method is executed after every compaction run. This patch makes it such that DeleteObsoleteFiles is never invoked twice within a configured period. Test Plan: run all unit tests Reviewers: heyongqiang, MarkCallaghan Reviewed By: MarkCallaghan Differential Revision: https://reviews.facebook.net/D6045 16 October 2012, 17:26:10 UTC
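The throttle itself is just a timestamp comparison. A minimal sketch with illustrative names; the corresponding option, per the cf5adc8 entry above, is delete_obsolete_files_period_micros.

```cpp
#include <cassert>
#include <cstdint>

// Skip the expensive live-file diff and directory scan unless the configured
// period has elapsed since the last purge.
struct PurgeThrottle {
  uint64_t period_micros;
  uint64_t last_run_micros = 0;

  bool ShouldRun(uint64_t now_micros) {
    if (now_micros - last_run_micros < period_micros) return false;
    last_run_micros = now_micros;
    return true;  // caller proceeds with DeleteObsoleteFiles
  }
};

int main() {
  PurgeThrottle t{/*period_micros=*/1000000};
  assert(t.ShouldRun(1000000) && !t.ShouldRun(1500000) && t.ShouldRun(2100000));
  return 0;
}
```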
0230866 Enhance db_bench to allow setting the number of levels in a database. Summary: Enhance db_bench to allow setting the number of levels in a database. Test Plan: run db_bench and look at LOG Reviewers: heyongqiang, MarkCallaghan Reviewed By: MarkCallaghan CC: MarkCallaghan Differential Revision: https://reviews.facebook.net/D6027 15 October 2012, 17:18:49 UTC
5dc784c Fix compilation problem with db_stress when using C11 compiler. Summary: Test Plan: Reviewers: CC: Task ID: # Blame Rev: 13 October 2012, 00:00:25 UTC
24f7983 [tools] Add a tool to stress test concurrent writing to levelDB Summary: Created a tool that runs multiple threads that concurrently read and write to levelDB. All writes to the DB are stored in an in-memory hashtable and verified at the end of the test. All writes for a given key are serialized. Test Plan: - Verified by writing only a few keys and logging all writes and verifying that values read and written are correct. - Verified correctness of value generator. - Ran with various parameters of number of keys, locks, and threads. Reviewers: dhruba, MarkCallaghan, heyongqiang Reviewed By: dhruba Differential Revision: https://reviews.facebook.net/D5829 10 October 2012, 19:12:55 UTC
696b290 Add LevelDb's JNI wrapper Summary: This implements the Java interface using JNI Test Plan: compile test Reviewers: dhruba Reviewed By: dhruba Differential Revision: https://reviews.facebook.net/D5925 05 October 2012, 20:13:49 UTC
fc23714 Add LevelDb's Java interface Summary: See the wiki below https://our.intern.facebook.com/intern/wiki/index.php/Database/leveldb/Java Test Plan: compile test Reviewers: dhruba Reviewed By: dhruba Differential Revision: https://reviews.facebook.net/D5919 05 October 2012, 20:11:31 UTC
f7975ac Implement RowLocks for assoc schema Summary: Each assoc is identified by (id1, assocType). This is the rowkey. Each row has a read/write rowlock. There is a statically allocated array of 2000 read/write locks. A rowkey is murmur-hashed to one of the read/write locks. assocPut and assocDelete acquire the rowlock in Write mode. The key-updates are done within the rowlock with an atomic nosync batch write to leveldb. Then the rowlock is released and a write-with-sync is done to sync the leveldb transaction log. Test Plan: added unit test Reviewers: heyongqiang Reviewed By: heyongqiang Differential Revision: https://reviews.facebook.net/D5859 04 October 2012, 06:19:01 UTC
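This is classic lock striping. A sketch using std::shared_mutex and std::hash as stand-ins for the patch's read/write lock type and murmur hash:

```cpp
#include <array>
#include <cstdint>
#include <mutex>
#include <shared_mutex>
#include <string>

// Hash the rowkey (id1, assocType) onto a fixed array of read/write locks,
// mirroring the 2000-lock scheme described in the commit message.
class RowLocks {
 public:
  std::shared_mutex& LockFor(uint64_t id1, int64_t assoc_type) {
    const std::string rowkey =
        std::to_string(id1) + ":" + std::to_string(assoc_type);
    return locks_[std::hash<std::string>{}(rowkey) % locks_.size()];
  }

 private:
  std::array<std::shared_mutex, 2000> locks_;
};

int main() {
  RowLocks locks;
  std::unique_lock<std::shared_mutex> guard(locks.LockFor(42, 7));  // write mode
  // ... apply the nosync batch write under the lock, then sync after release
  return 0;
}
```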
c1006d4 A configurable option to write data using write instead of mmap. Summary: We have seen that reading data via the pread call (instead of mmap) is much faster on Linux 2.6.x kernels. This patch adds an equivalent option to switch off mmap for the write path as well. db_bench --mmap_write=0 will use write() instead of mmap() to write data to a file. This change is backward compatible; the default option is to continue using mmap for writing to a file. Test Plan: "make check all" Differential Revision: https://reviews.facebook.net/D5781 04 October 2012, 00:08:13 UTC
e678a59 Add --stats_interval option to db_bench Summary: The option is zero by default and in that case reporting is unchanged: the interval at which stats are reported is scaled after each report, and a newline is not issued after each report, so one line is rewritten in place. When non-zero it specifies the constant interval (in operations) at which statistics are reported, and the stats include the rate per interval. This makes it easier to determine whether QPS changes over the duration of the test. Task ID: # Blame Rev: Test Plan: run db_bench Revert Plan: Database Impact: Memcache Impact: Other Notes: EImportant: - begin *PUBLIC* platform impact section - Bugzilla: # - end platform impact - Reviewers: dhruba Reviewed By: dhruba CC: heyongqiang Differential Revision: https://reviews.facebook.net/D5817 03 October 2012, 16:54:33 UTC
d8763ab Fix the bounds check for the --readwritepercent option Summary: see above Task ID: # Blame Rev: Test Plan: run db_bench with invalid value for option Revert Plan: Database Impact: Memcache Impact: Other Notes: EImportant: - begin *PUBLIC* platform impact section - Bugzilla: # - end platform impact - Reviewers: dhruba Reviewed By: dhruba CC: heyongqiang Differential Revision: https://reviews.facebook.net/D5823 03 October 2012, 16:52:26 UTC
98804f9 Fix compiler warnings and errors in ldb.c Summary: stdlib.h is needed for exit() --readhead --> --readahead Task ID: # Blame Rev: Test Plan: compile Revert Plan: Database Impact: Memcache Impact: Other Notes: EImportant: - begin *PUBLIC* platform impact section - Bugzilla: # - end platform impact - fix compiler warnings & errors Reviewers: dhruba Reviewed By: dhruba CC: heyongqiang Differential Revision: https://reviews.facebook.net/D5805 03 October 2012, 13:46:59 UTC
a58d48d Implement ReadWrite locks for leveldb Summary: Implement ReadWrite locks for leveldb. These will be helpful to implement a read-modify-write operation (e.g. atomic increments). Test Plan: does not modify any existing code Reviewers: heyongqiang Reviewed By: heyongqiang CC: MarkCallaghan Differential Revision: https://reviews.facebook.net/D5787 02 October 2012, 05:37:39 UTC
fec8131 Commandline tool to compact LevelDB databases. Summary: A simple CLI which calls DB->CompactRange(). Can take string keys as the range. Test Plan: Inserted data into a table. Waited for a minute, used the compact tool on it. File modification times changed, so Compact did something on the files. Existing unit tests work. Reviewers: heyongqiang, dhruba Reviewed By: dhruba Differential Revision: https://reviews.facebook.net/D5697 01 October 2012, 17:49:19 UTC
a321d5b Implement assocDelete. Summary: Implement assocDelete. Test Plan: unit test attached Reviewers: heyongqiang Reviewed By: heyongqiang CC: MarkCallaghan Differential Revision: https://reviews.facebook.net/D5721 01 October 2012, 16:58:26 UTC
72c45c6 Print the block cache size in the LOG. Summary: Print the block cache size in the LOG. Test Plan: run db_bench and look at LOG. This is helpful while I was debugging one use-case. Reviewers: heyongqiang, MarkCallaghan Reviewed By: heyongqiang Differential Revision: https://reviews.facebook.net/D5739 30 September 2012, 04:39:19 UTC
c1bb32e Trigger read compaction only if seeks to storage are incurred. Summary: In the current code, a Get() call can trigger compaction if it has to look at more than one file. This causes unnecessary compaction because looking at more than one file is a penalty only if the file is not yet in the cache. Also, the current code counts these files before the bloom filter check is applied. This patch counts a 'seek' only if the file fails the bloom filter check and has to read in data block(s) from the storage. This patch also counts a 'seek' if a file is not present in the file-cache, because opening a file means that its index blocks need to be read into cache. Test Plan: unit test attached. I will probably add one more unit test. Reviewers: heyongqiang Reviewed By: heyongqiang CC: MarkCallaghan Differential Revision: https://reviews.facebook.net/D5709 28 September 2012, 18:10:52 UTC
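The new charging rule can be condensed into one predicate. A sketch with illustrative field names, not the actual leveldb internals:

```cpp
// A file is charged a 'seek' only when the lookup actually touches storage.
struct FileLookup {
  bool in_table_cache;       // index blocks already resident?
  bool bloom_filter_passes;  // might the key be in this file?
  bool data_block_cached;    // would the read hit the block cache?
};

bool ChargeSeek(const FileLookup& f) {
  if (!f.in_table_cache) return true;        // opening reads index blocks
  if (!f.bloom_filter_passes) return false;  // filtered out: no storage read
  return !f.data_block_cached;               // charge only for real block IO
}

int main() {
  // file not yet open: charged, since its index blocks must be read
  FileLookup cold{false, true, false};
  return ChargeSeek(cold) ? 0 : 1;
}
```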
92368ab Add db_dump tool to dump DB keys Summary: Create a tool to iterate through keys and dump values. Current options are as follows: db_dump --start=[START_KEY] --end=[END_KEY] --max_keys=[NUM] --stats [PATH] START_KEY: First key to start at END_KEY: Key to end at (not inclusive) NUM: Maximum number of keys to dump PATH: Path to leveldb DB The --stats command line argument prints out the DB stats before dumping the keys. Test Plan: - Tested with invalid args - Tested with invalid path - Used empty DB - Used filled DB - Tried various permutations of command line options Reviewers: dhruba, heyongqiang Reviewed By: dhruba Differential Revision: https://reviews.facebook.net/D5643 27 September 2012, 16:53:58 UTC
eace74d Add -fPIC to the shared library builds. Needed by libleveldbjni. Summary: Add -fPIC to the shared library builds. Needed by libleveldbjni. Test Plan: build Reviewers: heyongqiang Reviewed By: heyongqiang Differential Revision: https://reviews.facebook.net/D5667 25 September 2012, 18:07:35 UTC
24eea93 If ReadCompaction is switched off, then it is better to not even submit background compaction jobs. Summary: If ReadCompaction is switched off, then it is better to not even submit background compaction jobs. I see about 3% increase in read-throughput on a pure memory database. Test Plan: run db_bench Reviewers: heyongqiang Reviewed By: heyongqiang Differential Revision: https://reviews.facebook.net/D5673 25 September 2012, 18:07:01 UTC
26e0ecb Release 1.5.3.fb. Summary: Test Plan: Reviewers: CC: Task ID: # Blame Rev: 25 September 2012, 15:30:46 UTC
ae36e50 The BackupAPI should also list the length of the manifest file. Summary: The GetLiveFiles() api lists the set of sst files and the current MANIFEST file. But the database continues to append new data to the MANIFEST file even when the application is backing it up to the backup location. This means that the database-version that is stored in the MANIFEST file in the backup location does not correspond to the sst files returned by GetLiveFiles. This API adds a new parameter to GetLiveFiles. This new parameter returns the current size of the MANIFEST file. Test Plan: Unit test attached. Reviewers: heyongqiang Reviewed By: heyongqiang Differential Revision: https://reviews.facebook.net/D5631 25 September 2012, 10:13:25 UTC
dd45b8c Keep symbols even for production release. Summary: Keeping symbols in the binary increases the size of the library but makes it easier to debug. The optimization level is still -O2, so this should have no impact on performance. Test Plan: make all Reviewers: heyongqiang, MarkCallaghan Reviewed By: MarkCallaghan CC: MarkCallaghan Differential Revision: https://reviews.facebook.net/D5601 21 September 2012, 22:57:47 UTC
653add3 Release 1.5.2.fb Summary: Test Plan: Reviewers: CC: Task ID: # Blame Rev: 21 September 2012, 18:01:33 UTC
bb2dcd2 Segfault in DoCompactionWork caused by buffer overflow Summary: The code was allocating 200 bytes on the stack but it writes 256 bytes into the array.
@ 0x8a8ea5 std::_Rb_tree<>::erase()
@ 0x7f134bee7eb0 (unknown)
@ 0x8a8ea5 std::_Rb_tree<>::erase()
@ 0x8a35d6 leveldb::DBImpl::CleanupCompaction()
@ 0x8a7810 leveldb::DBImpl::BackgroundCompaction()
@ 0x8a804d leveldb::DBImpl::BackgroundCall()
@ 0x8c4eff leveldb::(anonymous namespace)::PosixEnv::BGThreadWrapper()
@ 0x7f134b3c010d start_thread
@ 0x7f134bf9f10d clone
Test Plan: run db_bench with overwrite option Reviewers: heyongqiang Reviewed By: heyongqiang Differential Revision: https://reviews.facebook.net/D5595 21 September 2012, 17:55:38 UTC
9e84834 Allow a configurable number of background threads. Summary: The background threads are necessary for compaction. For slower storage, it might be necessary to have more than one compaction thread per DB. This patch allows creating a configurable number of worker threads. The default remains at 1 (to maintain backward compatibility). Test Plan: run all unit tests. Changes to db_bench coming in a separate patch. Reviewers: heyongqiang Reviewed By: heyongqiang CC: MarkCallaghan Differential Revision: https://reviews.facebook.net/D5559 19 September 2012, 22:51:08 UTC
fb4b381 Print out the compile version in the LOG. Summary: Print out the compile version in the LOG. Test Plan: run db_bench and verify LOG Reviewers: heyongqiang Reviewed By: heyongqiang Differential Revision: https://reviews.facebook.net/D5529 18 September 2012, 20:24:32 UTC
3662c29 improve comments about target_file_size_base, target_file_size_multiplier, max_bytes_for_level_base, max_bytes_for_level_multiplier Summary: as subject Test Plan: compile Reviewers: MarkCallaghan, dhruba Differential Revision: https://reviews.facebook.net/D5499 17 September 2012, 22:56:11 UTC
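For readers skimming the log, the usual relationship between these four options (my reading of the names; the values below are illustrative, not authoritative defaults) is geometric growth per level:

```cpp
#include <cstdint>
#include <cstdio>

// target file size at level L = target_file_size_base
//                               * target_file_size_multiplier^(L-1)
// max bytes at level L        = max_bytes_for_level_base
//                               * max_bytes_for_level_multiplier^(L-1)
int main() {
  const uint64_t target_file_size_base = 2 << 20;         // 2 MB
  const int target_file_size_multiplier = 1;
  const uint64_t max_bytes_for_level_base = 10ULL << 20;  // 10 MB
  const int max_bytes_for_level_multiplier = 10;

  uint64_t file_sz = target_file_size_base;
  uint64_t level_sz = max_bytes_for_level_base;
  for (int level = 1; level <= 4; level++) {
    std::printf("L%d: target file %llu bytes, level cap %llu bytes\n", level,
                (unsigned long long)file_sz, (unsigned long long)level_sz);
    file_sz *= target_file_size_multiplier;
    level_sz *= max_bytes_for_level_multiplier;
  }
  return 0;
}
```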
aa0426f Use correct version of jemalloc. Summary: Use correct version of jemalloc. Test Plan: run unit tests Reviewers: heyongqiang Reviewed By: heyongqiang Differential Revision: https://reviews.facebook.net/D5487 17 September 2012, 22:00:19 UTC
a8464ed add an option to disable seek compaction Summary: as subject. This diff should be good for benchmarking. Will send another diff to make it better in the case the seek compaction is enabled. In that coming diff, a seek will not be counted if the bloom filter filters out the key. Test Plan: build Reviewers: dhruba, MarkCallaghan Reviewed By: MarkCallaghan Differential Revision: https://reviews.facebook.net/D5481 17 September 2012, 20:59:57 UTC
906f2ee New release 1.5.1.fb Summary: Test Plan: Reviewers: CC: Task ID: # Blame Rev: 17 September 2012, 18:35:06 UTC
1f7850c Build with gcc-4.7.1-glibc-2.14.1. Summary: Test Plan: Reviewers: CC: Task ID: # Blame Rev: 17 September 2012, 17:56:26 UTC
b526342 use 20d3328ac30f633840ce819ad03019f415267a86 as builder Summary: as subject Test Plan: build Reviewers: dhruba Reviewed By: dhruba Differential Revision: https://reviews.facebook.net/D5475 17 September 2012, 17:53:52 UTC
ba55d77 Ability to take a file-level snapshot from leveldb. Summary: A set of apis that allows an application to backup data from the leveldb database based on a set of files. Test Plan: unit test attached. more coming soon. Reviewers: heyongqiang Reviewed By: heyongqiang Differential Revision: https://reviews.facebook.net/D5439 17 September 2012, 16:14:50 UTC
b85cdca add a global var leveldb::useMmapRead to enable mmap Summary: as subject. this can be used for benchmarking. If we want it for some cases, we can do more changes to make this part of the option. Test Plan: db_test Reviewers: dhruba CC: MarkCallaghan Differential Revision: https://reviews.facebook.net/D5451 17 September 2012, 05:07:35 UTC
dcbd6be remove boost Summary: as subject Test Plan: build Reviewers: dhruba Differential Revision: https://reviews.facebook.net/D5469 17 September 2012, 02:33:43 UTC
33323f2 Remove use of mmap for random reads Summary: Reads via mmap on concurrent workloads are much slower than pread. For example on a 24-core server with storage that can do 100k IOPS or more I can get no more than 10k IOPS with mmap reads and 32+ threads. Test Plan: db_bench benchmarks Reviewers: dhruba, heyongqiang Reviewed By: heyongqiang Differential Revision: https://reviews.facebook.net/D5433 14 September 2012, 23:43:50 UTC
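A minimal illustration of the pread path that replaces mmap here: each thread issues an independent positioned read, with no shared mapping to contend on. The file path is a placeholder.

```cpp
#include <fcntl.h>
#include <unistd.h>
#include <cstdio>

int main() {
  int fd = open("/tmp/example.sst", O_RDONLY);  // illustrative path
  if (fd < 0) return 1;
  char buf[4096];
  off_t offset = 8192;  // read one block at an arbitrary position
  ssize_t n = pread(fd, buf, sizeof(buf), offset);  // no mmap, no page faults
  std::printf("read %zd bytes at offset %lld\n", n, (long long)offset);
  close(fd);
  return 0;
}
```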
fa29f82 scan a long for FLAGS_cache_size to fix a compiler warning Summary: FLAGS_cache_size is a long, no need to scan %lld into a size_t for it (which generates a compiler warning) Test Plan: run db_bench Reviewers: dhruba, heyongqiang Reviewed By: heyongqiang CC: heyongqiang Differential Revision: https://reviews.facebook.net/D5427 14 September 2012, 19:45:42 UTC
8371139 Add --compression_type=X option with valid values: snappy (default) none bzip2 zlib Summary: This adds an option to db_bench to specify the compression algorithm to use for LevelDB Test Plan: ran db_bench Reviewers: dhruba Reviewed By: dhruba Differential Revision: https://reviews.facebook.net/D5421 14 September 2012, 19:28:21 UTC
93f4952 Ability to switch off filesystem read-aheads Summary: Ability to switch off filesystem read-aheads. This change is backward-compatible: the default setting is to allow file system read-aheads. Test Plan: run benchmarks Reviewers: heyongqiang, adsharma Reviewed By: heyongqiang Differential Revision: https://reviews.facebook.net/D5391 13 September 2012, 19:09:56 UTC
4028ae7 Do not cache readahead-pages in the OS cache. Summary: When posix_fadvise(offset, offset) is used, it frees up only those pages in that specified range. But the filesystem could have done some read-aheads and those get cached in the OS cache. Do not cache readahead-pages in the OS cache. Test Plan: run db_bench benchmark. Reviewers: vamsi, heyongqiang Reviewed By: heyongqiang Differential Revision: https://reviews.facebook.net/D5379 13 September 2012, 17:56:02 UTC
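A sketch of the eviction call; per POSIX, passing len = 0 to posix_fadvise applies the advice from the offset to the end of the file, which also covers pages the filesystem pulled in via readahead beyond the range actually requested:

```cpp
#include <fcntl.h>
#include <unistd.h>

// Drop cached pages for the whole file, including readahead pages.
void EvictFromOsCache(int fd) {
  (void)posix_fadvise(fd, /*offset=*/0, /*len=*/0, POSIX_FADV_DONTNEED);
}

int main() {
  int fd = open("/tmp/example.sst", O_RDONLY);  // placeholder path
  if (fd >= 0) {
    EvictFromOsCache(fd);
    close(fd);
  }
  return 0;
}
```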
7ecc5d4 Enable db_bench to specify block size. Summary: Enable db_bench to specify block size. Test Plan: compile and run Reviewers: heyongqiang Reviewed By: heyongqiang Differential Revision: https://reviews.facebook.net/D5373 13 September 2012, 17:22:43 UTC
407727b Fix compiler warnings. Use uint64_t instead of uint. Summary: Fix compiler warnings. Use uint64_t instead of uint. Test Plan: build using -Wall Reviewers: heyongqiang Reviewed By: heyongqiang Differential Revision: https://reviews.facebook.net/D5355 12 September 2012, 21:42:36 UTC
0f43aa4 put log in a separate dir Summary: added a new option db_log_dir, which points to the log dir. Inside that dir, in order to make log names unique, the log file name is prefixed with the leveldb data dir's absolute path. Test Plan: db_test Reviewers: dhruba Reviewed By: dhruba Differential Revision: https://reviews.facebook.net/D5205 07 September 2012, 00:52:08 UTC
afb5f22 build scribe with thrift lib Summary: as subject Test Plan: test build Reviewers: dhruba Reviewed By: dhruba Differential Revision: https://reviews.facebook.net/D5145 07 September 2012, 00:41:53 UTC
536ca69 The ReadRandomWriteRandom test was always looping FLAGS_num times. Summary: If neither reads nor writes are specified by the user, then pick FLAGS_num as the number of iterations in the ReadRandomWriteRandom test. If either reads or writes are defined, then use their maximum. Test Plan: run benchmark Reviewers: heyongqiang Reviewed By: heyongqiang Differential Revision: https://reviews.facebook.net/D5217 06 September 2012, 16:13:24 UTC
354a9ea Compile leveldb with gcc 4.7.1 Test Plan: run unit tests Reviewers: heyongqiang Reviewed By: heyongqiang Differential Revision: https://reviews.facebook.net/D5163 05 September 2012, 07:11:35 UTC
7112c93 Do not use scribe for release builds. Summary: Do not use scribe for release builds. Test Plan: build fbcode Reviewers: heyongqiang Reviewed By: heyongqiang Differential Revision: https://reviews.facebook.net/D5139 04 September 2012, 23:33:49 UTC
94208a7 Benchmark with both reads and writes at the same time. Summary: This patch enables the db_bench benchmark to issue both random reads and random writes at the same time. This option can be triggered via ./db_bench --benchmarks=readrandomwriterandom The default percentage of reads is 90. One can change the percentage of reads by specifying --readwritepercent: ./db_bench --benchmarks=readrandomwriterandom --readwritepercent=50 This is a feature request from Jeffro asking for leveldb performance with a 90:10 read:write ratio. Test Plan: run on test machine. Reviewers: heyongqiang Reviewed By: heyongqiang Differential Revision: https://reviews.facebook.net/D5067 04 September 2012, 19:06:26 UTC
8bab056 Release 1.5.0.fb. Summary: Test Plan: Reviewers: CC: Task ID: # Blame Rev: 29 August 2012, 22:29:30 UTC
f0b1654 Add libhdfs.a to the build process. Fix compilation error for hdfs build. Summary: Test Plan: Reviewers: CC: Task ID: # Blame Rev: 29 August 2012, 22:21:56 UTC
fe93631 Clean up compiler warnings generated by -Wall option. Summary: Clean up compiler warnings generated by -Wall option. make clean all OPT=-Wall This is a pre-requisite before making a new release. Test Plan: compile and run unit tests Reviewers: heyongqiang Reviewed By: heyongqiang Differential Revision: https://reviews.facebook.net/D5019 29 August 2012, 21:24:51 UTC
e5fe80e The sharding of the block cache is limited to 2**20 pieces. Summary: The number of shards that the block cache is divided into is configurable. However, if the user specifies that he/she wants the block cache to be divided into more than 2**20 pieces, then the system will try to allocate a huge array of that size, which could fail. It is better to limit the sharding of the block cache to an upper bound. The default sharding is 16 shards (i.e. 2**4) and the maximum is now about 1 million shards (i.e. 2**20). Also, fixed a bug with the LRUCache where numShardBits should be a private member of the LRUCache object rather than a static variable. Test Plan: run db_bench with --cache_numshardbits=64. Task ID: # Blame Rev: Reviewers: heyongqiang Reviewed By: heyongqiang Differential Revision: https://reviews.facebook.net/D5013 29 August 2012, 19:17:59 UTC
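A sketch of the two fixes: clamp the shard-bit count, and select a shard from the top bits of the hash. The constant and function names are illustrative, not the exact LRUCache internals.

```cpp
#include <cstdint>

const int kMaxNumShardBits = 20;  // caps the shard array at 2**20 entries

int ClampShardBits(int requested_bits) {
  return requested_bits > kMaxNumShardBits ? kMaxNumShardBits : requested_bits;
}

// Use the top bits of the hash to choose one of 2**num_shard_bits shards.
uint32_t ShardFor(uint32_t hash, int num_shard_bits) {
  return num_shard_bits > 0 ? (hash >> (32 - num_shard_bits)) : 0;
}

int main() {
  int bits = ClampShardBits(64);  // mirrors the --cache_numshardbits=64 test
  return ShardFor(0xdeadbeefU, bits) < (1u << bits) ? 0 : 1;
}
```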
a4f9b8b merge 1.5 Summary: as subject Test Plan: db_test table_test Reviewers: dhruba 28 August 2012, 18:43:33 UTC
6fee5a7 Do not spin in a tight loop attempting compactions if there is a compaction error Summary: as subject. ported the change from google code leveldb 1.5 Test Plan: run db_test Reviewers: dhruba Differential Revision: https://reviews.facebook.net/D4839 28 August 2012, 18:43:33 UTC
935fdd0 fix filename_test Summary: as subject Test Plan: run filename_test Reviewers: dhruba Differential Revision: https://reviews.facebook.net/D4965 28 August 2012, 18:42:42 UTC
690bf88 in db_stats_logger.cc, hold mutex_ while accessing versions_ Summary: as subject Test Plan: db_test Reviewers: dhruba 28 August 2012, 18:29:30 UTC
d3759ca fix db_test error with scribe logger turned on Summary: as subject Test Plan: db_test Reviewers: dhruba Reviewed By: dhruba Differential Revision: https://reviews.facebook.net/D4929 28 August 2012, 18:22:58 UTC
fc20273 Introduce a new method Env->Fsync() that issues fsync (instead of fdatasync). Summary: Introduce a new method Env->Fsync() that issues fsync (instead of fdatasync). This is needed for data durability when running on ext3 filesystems. Added options to the benchmark db_bench to generate performance numbers with either fsync or fdatasync enabled. Cleaned up Makefile to build leveldb_shell only when building the thrift leveldb server. Test Plan: build and run benchmark Reviewers: heyongqiang Reviewed By: heyongqiang Differential Revision: https://reviews.facebook.net/D4911 28 August 2012, 04:24:17 UTC
e675351 The number of keys to read is not respected Summary: as subject Test Plan: sst_dump --command=scan --file= Reviewers: dhruba Differential Revision: https://reviews.facebook.net/D4887 25 August 2012, 01:29:40 UTC
7cd6440 Merge branch 'master' of https://github.com/facebook/leveldb 24 August 2012, 22:22:47 UTC
1de83cc add more logs Summary: as subject; also add a tool to read sst files. ./sst_reader --command=check --file= ./sst_reader --command=scan --file= Test Plan: db_test; run these commands Reviewers: dhruba Reviewed By: dhruba Differential Revision: https://reviews.facebook.net/D4881 24 August 2012, 22:20:49 UTC