swh:1:snp:5115096b921df712aeb2a08114fede57fb3331fb

1aae609 Use CRC32 SSE4.2 instruction. Load it dynamically. Test Plan: make all check Reviewers: dhruba Reviewed By: dhruba CC: leveldb, zshao Differential Revision: https://reviews.facebook.net/D7503 21 December 2012, 18:20:32 UTC
7521a22 sst_dump: Error message should cover the case where a compression algorithm is not supported. Summary: It took me almost a day to debug this. :( Although I got to learn the file format as a by-product, this time could be saved if we had better error messages. Test Plan: gmake clean all; sst_dump --hex --file=000005.sst Reviewers: dhruba Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D7551 20 December 2012, 23:29:51 UTC
551f01f Unit test to test block format. Summary: This is a standalone unit test to test the format of a block. Test Plan: ./block_test Reviewers: sheki Reviewed By: sheki Differential Revision: https://reviews.facebook.net/D7533 20 December 2012, 22:55:07 UTC
58d1444 Add libevent include and lib directories Summary: Without this fix, I see failures like this: [zshao@dev1049 /data/users/zshao/rocksdb] . fbcode.gcc471.sh; gmake clean libleveldb.a . . . ./thrift/lib/cpp/async/TEventUtil.h:22:32: fatal error: event.h: No such file or directory Test Plan: . fbcode.gcc471.sh; make clean libleveldb.a Reviewers: dhruba, emayanke, sheki Reviewed By: sheki CC: leveldb Differential Revision: https://reviews.facebook.net/D7497 20 December 2012, 22:31:54 UTC
f4c2b7c Enhance ReadOnly mode to process all committed transactions. Summary: Leveldb has an api OpenForReadOnly() that opens the database in readonly mode. This call had an option to not process the transaction log. This patch removes this option and always processes all transactions that had been committed. It has been done in such a way that it does not create/write to any new files in the process. The invariant of "no-writes" to the leveldb data directory is still true. This enhancement allows multiple threads to open the same database in readonly mode and access all transactions that were committed right up to the OpenForReadOnly call. I changed the public API to match the new semantics because there are no users who are currently using this api. Test Plan: make clean check Reviewers: sheki Reviewed By: sheki CC: leveldb Differential Revision: https://reviews.facebook.net/D7479 20 December 2012, 00:30:46 UTC
be9b862 ldb: add "ldb load" command Summary: This command accepts key-value pairs from stdin in the same format as the "ldb dump" command. This allows us to try out different compression algorithms/block sizes easily. Test Plan: dump, load, dump, verify the data is the same. Reviewers: dhruba Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D7443 19 December 2012, 09:45:59 UTC
2585979 Release 1.5.6 for Java code + Script to automate it. Test Plan: it compiles and deploys. Reviewers: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D7341 17 December 2012, 20:11:11 UTC
3d1e92b Enhancements to rocksdb for better support for replication. Summary: 1. The OpenForReadOnly() call should not lock the db. This is useful so that multiple processes can open the same database concurrently for reading. 2. GetUpdatesSince should not error out if the archive directory does not exist. 3. A new constructor for WriteBatch that takes a serialized string as a parameter. Test Plan: make clean check Reviewers: sheki Reviewed By: sheki CC: leveldb Differential Revision: https://reviews.facebook.net/D7449 17 December 2012, 19:40:19 UTC
62d4857 Added meta-database support. Summary: Added kMetaDatabase for meta-databases in db/filename.h along with supporting functions. Fixed switch in DBImpl so that it also handles kMetaDatabase. Fixed DestroyDB() so that it can handle destroying meta-databases. Test Plan: make check Reviewers: sheki, emayanke, vamsi, dhruba Reviewed By: dhruba Differential Revision: https://reviews.facebook.net/D7245 17 December 2012, 19:26:59 UTC
2f0585f Fix a bug where DestroyDB deletes a non-existent archive directory. Summary: C tests would fail sometimes as DestroyDB would return a failure Status when deleting an archival directory which was not created (WAL_ttl_seconds = 0). Fix: ignore the Status returned when deleting the archival directory. Test Plan: * make check Reviewers: dhruba, emayanke Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D7395 17 December 2012, 18:25:26 UTC
3d9ff0e ldb: fix dump command to pad HEX output chars with 0. Summary: The old code was omitting the 0 if the char is less than 16. Test Plan: Tried the following program: int main() { unsigned char c = 1; printf("%X\n", c); printf("%02X\n", c); return 0; } The output is: 1 01 Reviewers: dhruba Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D7437 17 December 2012, 00:55:38 UTC
7dc8bb7 ldb: support --block_size=<4096|65536|...> and --auto_compaction=<0|1> Summary: This allows us to use ldb to do more experiments like block_size changes. Test Plan: run it by hand. Reviewers: dhruba, sheki, emayanke Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D7431 16 December 2012, 17:00:14 UTC
c280975 manifest_dump: Add --hex=1 option Summary: Without this option, manifest_dump does not print binary keys for files in a human-readable way. Test Plan: ./manifest_dump --hex=1 --verbose=0 --file=/data/users/zshao/fdb_comparison/leveldb/fbobj.apprequest-0_0_original/MANIFEST-000002 manifest_file_number 589 next_file_number 590 last_sequence 2311567 log_number 543 prev_log_number 0 --- level 0 --- version# 0 --- 532:1300357['0000455BABE20000' @ 2183973 : 1 .. 'FFFCA5D7ADE20000' @ 2184254 : 1] 536:1308170['000198C75CE30000' @ 2203313 : 1 .. 'FFFCF94A79E30000' @ 2206463 : 1] 542:1321644['0002931AA5E50000' @ 2267055 : 1 .. 'FFF77B31C5E50000' @ 2270754 : 1] 544:1286390['000410A309E60000' @ 2278592 : 1 .. 'FFFE470A73E60000' @ 2289221 : 1] 538:1298778['0006BCF4D8E30000' @ 2217050 : 1 .. 'FFFD77DAF7E30000' @ 2220489 : 1] 540:1282353['00090D5356E40000' @ 2231156 : 1 .. 'FFFFF4625CE40000' @ 2231969 : 1] --- level 1 --- version# 0 --- 510:2112325['000007F9C2D40000' @ 1782099 : 1 .. '146F5B67B8D80000' @ 1905458 : 1] 511:2121742['146F8A3023D60000' @ 1824388 : 1 .. '28BC8FBB9CD40000' @ 1777993 : 1] 512:801631['28BCD396F1DE0000' @ 2080191 : 1 .. '3082DBE9ADDB0000' @ 1989927 : 1] Reviewers: dhruba, sheki, emayanke Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D7425 16 December 2012, 16:58:28 UTC
806d4d9 fixing linters. Summary: old version of linters use "lint_engine" instead of "lint.engine". Some bookkeeping in gitignore. Reviewers: abhishekk 14 December 2012, 22:05:27 UTC
2ba866e GetSequence API in write batch. Summary: WriteBatch is now used by the GetUpdatesSince API. This API is external and will be used by the rocks server. Rocks Server and others will need to know about the Sequence Number in the WriteBatch. This public method will allow for that. Test Plan: make all check. Reviewers: dhruba Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D7293 13 December 2012, 06:21:10 UTC
d0a3093 Expose the serialized string that represents a WriteBatch. Summary: Expose the serialized string that represents a WriteBatch. This is helpful to replicate a writebatch operation from one machine to another. Test Plan: make clean check Reviewers: sheki Reviewed By: sheki CC: leveldb Differential Revision: https://reviews.facebook.net/D7317 13 December 2012, 00:25:52 UTC
f20383f Make Java Client compilable. Summary: Debugged and ported changes from the open-source GitHub repo to our repo. Wrote a script to easily build the Java library; future compilations of the Java lib should just run this script. Test Plan: it compiles. Reviewers: dhruba, leveldb Reviewed By: dhruba Differential Revision: https://reviews.facebook.net/D7323 12 December 2012, 22:07:52 UTC
22c2836 Fix bug in binary search for files containing a seq no., and delete archived log files during DestroyDB. Summary: * Fixed implementation bug in binary search introduced in https://reviews.facebook.net/D7119 * Binary search is also overflow safe. * Delete archive log files and archive dir during DestroyDB Test Plan: make check Reviewers: dhruba CC: kosievdmerwe, emayanke Differential Revision: https://reviews.facebook.net/D7263 12 December 2012, 00:15:02 UTC
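For illustration, a minimal sketch of the overflow-safe binary search mentioned above; the sequence-number list and function name are assumptions, not the actual code from D7263:

    #include <cstdint>
    #include <vector>

    // Find the index of the last log file whose first sequence number is <= target,
    // or -1 if every file starts after target. Computing the midpoint as
    // low + (high - low) / 2 avoids the overflow that (low + high) / 2 can hit.
    int64_t FindFileForSequence(const std::vector<uint64_t>& first_seq_per_file,
                                uint64_t target) {
      int64_t low = 0;
      int64_t high = static_cast<int64_t>(first_seq_per_file.size()) - 1;
      int64_t result = -1;
      while (low <= high) {
        int64_t mid = low + (high - low) / 2;  // overflow-safe midpoint
        if (first_seq_per_file[mid] <= target) {
          result = mid;   // candidate; keep searching to the right
          low = mid + 1;
        } else {
          high = mid - 1;
        }
      }
      return result;
    }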
24fc379 A public API to fetch the latest transaction id. Summary: Implement an interface to retrieve the most current transaction id from the database. Test Plan: Added unit test. Reviewers: sheki Reviewed By: sheki CC: leveldb Differential Revision: https://reviews.facebook.net/D7269 11 December 2012, 00:04:19 UTC
dcd919a Port various compaction options to Java. Summary: Porting various options, mostly related to multi-threaded compaction, to Java. Test Plan: mvn test. No clear plan on how else to test. Reviewers: dhruba Reviewed By: dhruba CC: leveldb, emayanke Differential Revision: https://reviews.facebook.net/D7221 10 December 2012, 18:53:48 UTC
1c6742e Refactor GetArchivalDirectoryName to filename.h Summary: filename.h has functions to do similar things. Moving code away from db_impl.cc Test Plan: make check Reviewers: dhruba Reviewed By: dhruba Differential Revision: https://reviews.facebook.net/D7251 10 December 2012, 18:51:07 UTC
38671c4 Fix a race condition while processing tasks by background threads. Summary: Suppose you submit 100 background tasks one after another. The first enqueued task finds that the queue is empty and wakes up one worker thread. Now suppose that the remaining 99 work items are enqueued; they do not wake up any worker threads because the queue is already non-empty. This causes a situation where there are 99 tasks in the task queue but only one worker thread is processing a task while the remaining worker threads are waiting. The fix is to always wake up one worker thread while enqueuing a task. I also added a check to count the number of elements in the queue to help in debugging. Test Plan: make clean check. Reviewers: chip Reviewed By: chip CC: leveldb Differential Revision: https://reviews.facebook.net/D7203 10 December 2012, 01:15:27 UTC
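As an illustration of the fix described above (not the actual background-thread code), an enqueue that signals a worker on every insertion rather than only when the queue transitions from empty to non-empty:

    #include <condition_variable>
    #include <deque>
    #include <functional>
    #include <mutex>

    class TaskQueue {
     public:
      void Enqueue(std::function<void()> task) {
        {
          std::lock_guard<std::mutex> lock(mu_);
          queue_.push_back(std::move(task));
        }
        // Wake one worker on every enqueue; waking only when the queue was
        // previously empty strands tasks when many arrive back to back.
        cv_.notify_one();
      }

      void WorkerLoop() {  // each worker thread runs this loop
        for (;;) {
          std::unique_lock<std::mutex> lock(mu_);
          cv_.wait(lock, [this] { return !queue_.empty(); });
          std::function<void()> task = std::move(queue_.front());
          queue_.pop_front();
          lock.unlock();
          task();
        }
      }

     private:
      std::mutex mu_;
      std::condition_variable cv_;
      std::deque<std::function<void()>> queue_;
    };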
768edfa ldb: Add compression and bloom filter options. Summary: Added the following two options: [--bloom_bits=<int,e.g.:14>] [--compression_type=<no|snappy|zlib|bzip2>] These options will be used when ldb opens the leveldb database. Test Plan: Tried by hand for both success and failure cases. We do need a test framework. Reviewers: dhruba, emayanke, sheki Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D7197 07 December 2012, 22:26:37 UTC
8055008 GetUpdatesSince API to enable replication. Summary: How it works: * GetUpdatesSince takes a SequenceNumber. * A LogFile with the first SequenceNumber nearest to and less than the requested sequence number is found. * Seek in the log file until the requested SeqNumber is found. * Return an iterator which contains logic to return records one by one. Test Plan: * Test case included to check the good code path. * Will update with more test-cases. * Feedback required on test-cases. Reviewers: dhruba, emayanke Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D7119 07 December 2012, 19:42:13 UTC
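A hedged sketch of the log-file selection step described above: choose the log whose starting sequence number is the largest one not greater than the requested sequence, then scan forward within it. The LogFileInfo type and field names are illustrative assumptions:

    #include <cstdint>
    #include <vector>

    struct LogFileInfo {
      uint64_t file_number;
      uint64_t first_sequence;  // first sequence number recorded in this log
    };

    // 'logs' is assumed sorted by first_sequence. Returns the index of the log
    // file that could contain 'seq', or -1 if no log starts at or before it.
    // A real iterator would then skip records in that file until it reaches 'seq'.
    int FindStartingLog(const std::vector<LogFileInfo>& logs, uint64_t seq) {
      int found = -1;
      for (size_t i = 0; i < logs.size(); ++i) {
        if (logs[i].first_sequence <= seq) {
          found = static_cast<int>(i);  // nearest-and-not-greater so far
        } else {
          break;  // later logs only start after 'seq'
        }
      }
      return found;
    }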
f69e9f3 Fixed off-by-1 in tests. Summary: Added 1 to indices where I shouldn't have, so the code overran an array. Test Plan: make check Reviewers: sheki, emayanke, vamsi, dhruba Reviewed By: dhruba Differential Revision: https://reviews.facebook.net/D7227 07 December 2012, 18:48:46 UTC
0eb0c9b Added methods to write small ints to bit streams. Summary: Added BitStreamPutInt() and BitStreamGetInt() which take a stream of chars and can write integers of arbitrary bit sizes to that stream at arbitrary positions. There are also convenience versions of these functions that take std::strings and leveldb::Slices. Test Plan: make check Reviewers: sheki, vamsi, dhruba, emayanke Reviewed By: vamsi CC: leveldb Differential Revision: https://reviews.facebook.net/D7071 07 December 2012, 18:42:19 UTC
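A rough sketch of the idea behind writing and reading an arbitrary-bit-width integer at an arbitrary bit offset; this illustrates the concept only and is not the committed BitStreamPutInt()/BitStreamGetInt() implementation:

    #include <cstdint>
    #include <string>

    // Write the low 'bits' bits of 'value' into 'buf' starting at bit 'pos'.
    // The buffer is assumed to hold at least (pos + bits + 7) / 8 bytes.
    void PutBits(std::string* buf, size_t pos, int bits, uint64_t value) {
      for (int i = 0; i < bits; ++i) {
        size_t bit = pos + i;
        uint8_t byte = static_cast<uint8_t>((*buf)[bit / 8]);
        uint8_t mask = static_cast<uint8_t>(1u << (bit % 8));
        byte = ((value >> i) & 1) ? static_cast<uint8_t>(byte | mask)
                                  : static_cast<uint8_t>(byte & ~mask);
        (*buf)[bit / 8] = static_cast<char>(byte);
      }
    }

    // Read 'bits' bits starting at bit 'pos' and return them as an integer.
    uint64_t GetBits(const std::string& buf, size_t pos, int bits) {
      uint64_t value = 0;
      for (int i = 0; i < bits; ++i) {
        size_t bit = pos + i;
        if (static_cast<uint8_t>(buf[bit / 8]) & (1u << (bit % 8))) {
          value |= (uint64_t{1} << i);
        }
      }
      return value;
    }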
c847a31 Print compaction score for every compaction run. Summary: A compaction is picked based on its score. It is useful to print the compaction score in the LOG because it aids in debugging. If one looks at the logs, one can find out why a compaction was preferred over another. Test Plan: make clean check Differential Revision: https://reviews.facebook.net/D7137 04 December 2012, 18:03:47 UTC
6eb5ed9 rocksdb README file. Summary: rocksdb README file. Test Plan: Reviewers: CC: Task ID: # Blame Rev: 30 November 2012, 06:39:08 UTC
d4627e6 Move WAL files to archive directory, instead of deleting. Summary: Create a directory "archive" in the DB directory. During DeleteObsoleteFiles, move the WAL files (*.log) to the archive directory instead of deleting them. Test Plan: Created a DB using DB_Bench. Reopened it. Checked that the files moved. Reviewers: dhruba Reviewed By: dhruba Differential Revision: https://reviews.facebook.net/D6975 29 November 2012, 01:28:08 UTC
d29f181 Fix all the lint errors. Summary: Scripted and removed all trailing spaces and converted all tabs to spaces. Also fixed other lint errors. All lint errors from this point of time should be taken seriously. Test Plan: make all check Reviewers: dhruba Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D7059 29 November 2012, 01:18:41 UTC
9b83853 Release 1.5.6.fb Summary: Test Plan: Reviewers: CC: Task ID: # Blame Rev: 29 November 2012, 00:09:41 UTC
9a35784 Delete non-visible keys during a compaction even in the presence of snapshots. Summary: LevelDB should delete almost-new keys when a long-open snapshot exists. The previous behavior is to keep all versions that were created after the oldest open snapshot. This can lead to database size bloat for high-update workloads when there are long-open snapshots, for example when a long-open snapshot is used for logical backup. By "almost new" I mean that the key was updated more than once after the oldest snapshot. If there were two snapshots with seq numbers s1 and s2 (s1 < s2), and if we find two instances of the same key k1 that lie entirely within s1 and s2 (i.e. s1 < k1 < s2), then the earlier version of k1 can be safely deleted because that version is not visible in any snapshot. Test Plan: unit test attached make clean check Differential Revision: https://reviews.facebook.net/D6999 28 November 2012, 23:47:40 UTC
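A simplified sketch of the visibility argument above: an older version of a key is droppable when it and a newer version of the same key fall into the same gap between consecutive snapshots, so no snapshot can ever observe the older one. This is illustrative only, not the actual compaction code:

    #include <algorithm>
    #include <cstdint>
    #include <vector>

    // Smallest snapshot sequence number >= seq, or UINT64_MAX if 'seq' is newer
    // than every open snapshot. 'snapshots' is sorted in ascending order.
    uint64_t EarliestSnapshotAtOrAfter(const std::vector<uint64_t>& snapshots,
                                       uint64_t seq) {
      auto it = std::lower_bound(snapshots.begin(), snapshots.end(), seq);
      return it == snapshots.end() ? UINT64_MAX : *it;
    }

    // Given two versions of the same key with older_seq < newer_seq, the older
    // version can be dropped if both map to the same snapshot boundary: no
    // snapshot then falls between the two versions, so no reader ever needs
    // the older one.
    bool CanDropOlderVersion(const std::vector<uint64_t>& snapshots,
                             uint64_t older_seq, uint64_t newer_seq) {
      return EarliestSnapshotAtOrAfter(snapshots, older_seq) ==
             EarliestSnapshotAtOrAfter(snapshots, newer_seq);
    }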
34487af Moved FBCode linters to LevelDB. Summary: Added FBCODE-like linting support to our codebase. Test Plan: arc lint lints the code. Reviewers: dhruba Reviewed By: dhruba CC: emayanke, leveldb Differential Revision: https://reviews.facebook.net/D7041 28 November 2012, 17:49:01 UTC
3366eda Print out status at the end of a compaction run. Summary: Print out status at the end of a compaction run. This helps in debugging. Test Plan: make clean check Reviewers: sheki Reviewed By: sheki Differential Revision: https://reviews.facebook.net/D7035 28 November 2012, 06:17:38 UTC
43f5a07 Remove unused variables that cause compiler warnings. Test Plan: make check Reviewers: dhruba Reviewed By: dhruba CC: emayanke Differential Revision: https://reviews.facebook.net/D6993 27 November 2012, 04:55:24 UTC
2a39699 Assertion failure while running with unit tests with OPT=-g Summary: When we expand the range of keys for a level 0 compaction, we need to invoke ParentFilesInCompaction() only once for the entire range of keys that is being compacted. We were invoking it for each file that was being compacted, but this triggers an assertion because each file's range were contiguous but non-overlapping. I renamed ParentFilesInCompaction to ParentRangeInCompaction to adequately represent that it is the range-of-keys and not individual files that we compact in a single compaction run. Here is the assertion that is fixed by this patch. db_test: db/version_set.cc:585: void leveldb::Version::ExtendOverlappingInputs(int, const leveldb::Slice&, const leveldb::Slice&, std::vector<leveldb::FileMetaData*, std::allocator<leveldb::FileMetaData*> >*, int): Assertion `user_cmp->Compare(flimit, user_begin) >= 0' failed. Test Plan: make clean check OPT=-g Reviewers: sheki Reviewed By: sheki CC: MarkCallaghan, emayanke, leveldb Differential Revision: https://reviews.facebook.net/D6963 26 November 2012, 22:00:39 UTC
7c6f527 Merge branch 'performance' 26 November 2012, 20:01:55 UTC
e0cd6bf The c_test was sometimes failing with an assertion. Summary: On fast filesystems (e.g. /dev/shm and ext4), the flushing of the memstore to disk was quick, and the background compaction thread was not getting scheduled fast enough to delete obsolete files before the db was closed. This caused the repair method to pick up files that were not part of the db, and the unit test was failing. The fix is to enhance the unit test to run a compaction before closing the database so that all files that are not part of the database are truly deleted from the filesystem. Test Plan: make c_test; ./c_test Reviewers: chip, emayanke, sheki Reviewed By: chip CC: leveldb Differential Revision: https://reviews.facebook.net/D6915 26 November 2012, 19:59:51 UTC
6caf3b8 Fix broken test; some ldb commands can run without a db_ Summary: It would appear our unit tests make use of code from ldb_cmd, and don't always require a valid database handle. D6855 was not aware db_ could sometimes be NULL for such commands, and so it broke reduce_levels_test. This moves the check elsewhere to (at least) fix the 'ldb dump' case of segfaulting when it couldn't open a database. Test Plan: make check Reviewers: dhruba Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D6903 26 November 2012, 19:11:30 UTC
879e45e Fix ldb segfault and use static libsnappy for all builds Summary: Link statically against snappy, using the gvfs one for facebook environments, and the bundled one otherwise. In addition, fix a few minor segfaults in ldb when it couldn't open the database, and update .gitignore to include a few other build artifacts. Test Plan: make check Reviewers: dhruba Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D6855 21 November 2012, 19:07:19 UTC
7632fdb Support taking a configurable number of files from the same level to compact in a single compaction run. Summary: The compaction process takes some files from LevelK and merges them into LevelK+1. The number of files it picks from LevelK was capped in such a way that the total amount of data picked does not exceed the maxfilesize of that level. This essentially meant that only one file from LevelK is picked for a single compaction. For bulkloads, we would like to take many, many files from LevelK and compact them using a single compaction run. This patch introduces an option called 'source_compaction_factor' (similar to expanded_compaction_factor). It is a multiplier that is multiplied by the maxfilesize of that level to arrive at the limit that is used to throttle the number of source files from LevelK. For bulk loads, set source_compaction_factor to a very high number so that multiple files from the same level are picked for compaction in a single compaction. The default value of source_compaction_factor is 1, so that we can keep backward compatibility with existing compaction semantics. Test Plan: make clean check Reviewers: emayanke, sheki Reviewed By: emayanke CC: leveldb Differential Revision: https://reviews.facebook.net/D6867 21 November 2012, 16:37:03 UTC
fbb73a4 Support to disable background compactions on a database. Summary: This option is needed for fast bulk uploads. The goal is to load all the data into files in L0 without any interference from background compactions. Test Plan: make clean check Reviewers: sheki Reviewed By: sheki CC: leveldb Differential Revision: https://reviews.facebook.net/D6849 21 November 2012, 05:12:06 UTC
3754f2f Fix a major bug that ignored the compaction score of the n-1 level. Summary: The method Finalize() recomputes the compaction score of each level and then sorts these scores from largest to smallest. The idea is that the level with the largest compaction score will be a better candidate for compaction. There are usually very few levels, and a bubble sort was used to sort these compaction scores. There existed a bug in the sorting code that skipped looking at the score for the n-1 level. This meant that even if the compaction score of the n-1 level was large, it would not be picked for compaction. This patch fixes the bug and also introduces "asserts" in the code to detect any possible inconsistencies caused by future bugs. This bug existed in the very first code change that introduced multi-threaded compaction to the leveldb code. That version of code was committed on Oct 19th via https://github.com/facebook/leveldb/commit/1ca0584345af85d2dccc434f451218119626d36e Test Plan: make clean check OPT=-g Reviewers: emayanke, sheki, MarkCallaghan Reviewed By: sheki CC: leveldb Differential Revision: https://reviews.facebook.net/D6837 20 November 2012, 23:44:21 UTC
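An illustrative version of the small sort described above, with loop bounds that cover every level including n-1; the function and parameter names are hypothetical, not the actual Finalize() code:

    #include <cassert>
    #include <utility>
    #include <vector>

    // Order (score, level) pairs by descending score so the level most in need
    // of compaction comes first. The level count is tiny, so a simple quadratic
    // sort is fine; the point is that the loops visit every entry, including
    // the next-to-last one.
    void SortByCompactionScore(std::vector<double>* scores,
                               std::vector<int>* levels) {
      const size_t n = scores->size();
      assert(levels->size() == n);
      for (size_t i = 0; i + 1 < n; ++i) {
        for (size_t j = i + 1; j < n; ++j) {
          if ((*scores)[j] > (*scores)[i]) {
            std::swap((*scores)[i], (*scores)[j]);
            std::swap((*levels)[i], (*levels)[j]);
          }
        }
      }
      for (size_t i = 0; i + 1 < n; ++i) {
        assert((*scores)[i] >= (*scores)[i + 1]);  // scores are non-increasing
      }
    }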
dde7089 Fix asserts Summary: make check OPT=-g fails with the following assert. ==== Test DBTest.ApproximateSizes db_test: db/version_set.cc:765: void leveldb::VersionSet::Builder::CheckConsistencyForDeletes(leveldb::VersionEdit*, int, int): Assertion `found' failed. The assertion was that file #7, which was being deleted, did not pre-exist, but actually it did pre-exist, as the manifest dump below shows. The bug was that we did not check for file existence at the same level. *************************Edit[0] = VersionEdit { Comparator: leveldb.BytewiseComparator } *************************Edit[1] = VersionEdit { LogNumber: 8 PrevLogNumber: 0 NextFile: 9 LastSeq: 80 AddFile: 0 7 8005319 'key000000' @ 1 : 1 .. 'key000079' @ 80 : 1 } *************************Edit[2] = VersionEdit { LogNumber: 8 PrevLogNumber: 0 NextFile: 13 LastSeq: 80 CompactPointer: 0 'key000079' @ 80 : 1 DeleteFile: 0 7 AddFile: 1 9 2101425 'key000000' @ 1 : 1 .. 'key000020' @ 21 : 1 AddFile: 1 10 2101425 'key000021' @ 22 : 1 .. 'key000041' @ 42 : 1 AddFile: 1 11 2101425 'key000042' @ 43 : 1 .. 'key000062' @ 63 : 1 AddFile: 1 12 1701165 'key000063' @ 64 : 1 .. 'key000079' @ 80 : 1 } Test Plan: Reviewers: CC: Task ID: # Blame Rev: 19 November 2012, 22:51:22 UTC
a4b79b6 Merge branch 'master' into performance 19 November 2012, 21:20:25 UTC
74054fa Fix compilation error while compiling unit tests with OPT=-g Summary: Fix compilation error while compiling with OPT=-g Test Plan: make clean check OPT=-g Reviewers: CC: Task ID: # Blame Rev: 19 November 2012, 21:16:46 UTC
48dafb2 Fix compilation error introduced by previous commit 7889e094554dc5bba678a0bfa7fb5eca422c34de Summary: Fix compilation error introduced by previous commit 7889e094554dc5bba678a0bfa7fb5eca422c34de Test Plan: make clean check 19 November 2012, 20:16:45 UTC
7889e09 Enhance manifest_dump to print each individual edit. Summary: The manifest file contains a series of edits. If the verbose option is switched on, then print each individual edit in the manifest file. This helps in debugging. Test Plan: make clean manifest_dump Reviewers: emayanke, sheki Reviewed By: sheki CC: leveldb Differential Revision: https://reviews.facebook.net/D6807 19 November 2012, 20:04:35 UTC
661dc15 Fix LDB dumpwal to print the messages as in the file. Summary: StringStream.clear() does not clear the stream. It sets some flags. Who knew? Fixing that stops the same output from being printed again and again. Test Plan: ran it on a local db Reviewers: dhruba, emayanke Reviewed By: dhruba Differential Revision: https://reviews.facebook.net/D6795 19 November 2012, 20:04:35 UTC
65b035a Fix a coding error in db_test.cc Summary: The new function MinLevelToCompress in db_test.cc was incomplete. It needs to tell the calling TEST function whether the test has to be skipped or not. Test Plan: make all;./db_test Reviewers: dhruba, heyongqiang Reviewed By: dhruba CC: sheki Differential Revision: https://reviews.facebook.net/D6771 19 November 2012, 20:04:35 UTC
30742e1 LDB can read WAL. Summary: Add option to read WAL and print a summary for each record. facebook task => #1885013 E.G. Output : ./ldb dump_wal --walfile=/tmp/leveldbtest-5907/dbbench/026122.log --header Sequence,Count,ByteSize 49981,1,100033 49981,1,100033 49982,1,100033 49981,1,100033 49982,1,100033 49983,1,100033 49981,1,100033 49982,1,100033 49983,1,100033 49984,1,100033 49981,1,100033 49982,1,100033 Test Plan: Works run ./ldb read_wal --wal-file=/tmp/leveldbtest-5907/dbbench/000078.log --header Reviewers: dhruba, heyongqiang Reviewed By: dhruba CC: emayanke, leveldb, zshao Differential Revision: https://reviews.facebook.net/D6675 19 November 2012, 20:04:34 UTC
4b622ab Enhance manifest_dump to print each individual edit. Summary: The manifest file contains a series of edits. If the verbose option is switched on, then print each individual edit in the manifest file. This helps in debugging. Test Plan: make clean manifest_dump Reviewers: emayanke, sheki Reviewed By: sheki CC: leveldb Differential Revision: https://reviews.facebook.net/D6807 19 November 2012, 20:02:27 UTC
b648401 Fix LDB dumpwal to print the messages as in the file. Summary: StringStream.clear() does not clear the stream. It sets some flags. Who knew? Fixing that stops the same output from being printed again and again. Test Plan: ran it on a local db Reviewers: dhruba, emayanke Reviewed By: dhruba Differential Revision: https://reviews.facebook.net/D6795 19 November 2012, 19:14:07 UTC
62e7583 enhance dbstress to simulate hard crash Summary: dbstress has an option to reopen the database. Make it such that the previous handle is not closed before we reopen; this simulates a situation similar to a process crash. Added a new API to DBImpl to remove the lock file. Test Plan: run db_stress Reviewers: emayanke Reviewed By: emayanke CC: leveldb Differential Revision: https://reviews.facebook.net/D6777 19 November 2012, 07:16:17 UTC
de278a6 Fix a coding error in db_test.cc Summary: The new function MinLevelToCompress in db_test.cc was incomplete. It needs to tell the calling TEST function whether the test has to be skipped or not. Test Plan: make all;./db_test Reviewers: dhruba, heyongqiang Reviewed By: dhruba CC: sheki Differential Revision: https://reviews.facebook.net/D6771 16 November 2012, 22:56:50 UTC
f5cdf93 LDB can read WAL. Summary: Add option to read WAL and print a summary for each record. facebook task => #1885013 E.G. Output : ./ldb dump_wal --walfile=/tmp/leveldbtest-5907/dbbench/026122.log --header Sequence,Count,ByteSize 49981,1,100033 49981,1,100033 49982,1,100033 49981,1,100033 49982,1,100033 49983,1,100033 49981,1,100033 49982,1,100033 49983,1,100033 49984,1,100033 49981,1,100033 49982,1,100033 Test Plan: Works run ./ldb read_wal --wal-file=/tmp/leveldbtest-5907/dbbench/000078.log --header Reviewers: dhruba, heyongqiang Reviewed By: dhruba CC: emayanke, leveldb, zshao Differential Revision: https://reviews.facebook.net/D6675 16 November 2012, 17:09:00 UTC
c3392c9 The db_stress test should also test multi-threaded compaction. Summary: Create more than one background compaction thread if specified. This code piece is similar to what exists in db_bench. Test Plan: make check Differential Revision: https://reviews.facebook.net/D6753 15 November 2012, 06:01:39 UTC
6c5a4d6 Merge branch 'master' into performance Conflicts: db/db_impl.h 15 November 2012, 05:39:52 UTC
e988c11 Enhance db_bench to be able to specify a grandparent_overlap_factor. Summary: The value specified in max_grandparent_overlap_factor is used to limit the file size in a compaction run. This patch makes it configurable when using db_bench. Test Plan: make clean db_bench Reviewers: MarkCallaghan, heyongqiang Reviewed By: heyongqiang CC: leveldb Differential Revision: https://reviews.facebook.net/D6729 15 November 2012, 00:20:13 UTC
0f590af Push release 1.5.5.fb. Summary: Test Plan: Reviewers: CC: Task ID: # Blame Rev: 14 November 2012, 00:28:11 UTC
33cf6f3 Make sse compilation optional. Summary: The fbcode compilation was always switching on msse by default. This patch keeps the same behaviour but allows the compilation process to switch off msse if needed. If one does not want to use sse, then do the following: export USE_SSE=0 make clean all Test Plan: make clean all Reviewers: heyongqiang Reviewed By: heyongqiang CC: leveldb Differential Revision: https://reviews.facebook.net/D6717 14 November 2012, 00:25:57 UTC
5d16e50 Improved CompactionFilter api: pass in an opaque argument to the CompactionFilter invocation. Summary: There are applications that operate on multiple leveldb instances. These applications would like to pass in an opaque type for each leveldb instance, and this type should be passed back to the application with every invocation of the CompactionFilter api. Test Plan: Enhanced unit test for opaque parameter to CompactionFilter. Reviewers: heyongqiang Reviewed By: heyongqiang CC: MarkCallaghan, sheki, emayanke Differential Revision: https://reviews.facebook.net/D6711 14 November 2012, 00:22:26 UTC
43d9a82 Fix asserts so that "make check OPT=-g" works on performance branch Summary: Compilation used to fail with the error: db/version_set.cc:1773: error: ‘number_of_files_to_sort_’ is not a member of ‘leveldb::VersionSet’ I created a new method called CheckConsistencyForDeletes() so that all the high cost checking is done only when OPT=-g is specified. I also fixed a bug in PickCompactionBySize that was triggered when OPT=-g was switched on. The base_index in the compaction record was not set correctly. Test Plan: make check OPT=-g Differential Revision: https://reviews.facebook.net/D6687 13 November 2012, 18:40:52 UTC
a785e02 The db_bench utility was broken in 1.5.4.fb because of a signed-unsigned comparison. Summary: The db_bench utility was broken in 1.5.4.fb because of a signed-unsigned comparison. The static variable FLAGS_min_level_to_compress was recently changed from int to 'unsigned int' but it is initialized to a negative value, -1. The segfault is of this type: Program received signal SIGSEGV, Segmentation fault. Open (this=0x7fffffffdee0) at db/db_bench.cc:939 939 db/db_bench.cc: No such file or directory. (gdb) where Test Plan: run db_bench with no options. Reviewers: heyongqiang Reviewed By: heyongqiang CC: MarkCallaghan, emayanke, sheki Differential Revision: https://reviews.facebook.net/D6663 12 November 2012, 21:59:35 UTC
e626261 Introducing "database reopens" into the stress test. The database will reopen after a specified (configurable) number of iterations of each thread, at which point the threads wait for the database to reopen. Summary: FLAGS_reopen (configurable) specifies the number of times the database is to be reopened. FLAGS_ops_per_thread is divided into points based on that reopen field. At these points all threads come together to wait for the database to reopen. Each thread "votes" for the database to reopen and when all have voted, the database reopens. Test Plan: make all;./db_stress Reviewers: dhruba, MarkCallaghan, sheki, asad, heyongqiang Reviewed By: dhruba Differential Revision: https://reviews.facebook.net/D6627 12 November 2012, 20:26:32 UTC
c64796f Fix test failure of reduce_num_levels Summary: I changed the reduce_num_levels logic to avoid "compactRange()" call if the current number of levels in use (levels that contain files) is smaller than the new num of levels. And that change breaks the assert in reduce_levels_test Test Plan: run reduce_levels_test Reviewers: dhruba, MarkCallaghan Reviewed By: dhruba CC: emayanke, sheki Differential Revision: https://reviews.facebook.net/D6651 12 November 2012, 20:05:38 UTC
9c6c232 Compilation error while compiling with OPT=-g Summary: make clean check OPT=-g fails leveldb::DBStatistics::getTickerCount(leveldb::Tickers)’: ./db/db_statistics.h:34: error: ‘MAX_NO_TICKERS’ was not declared in this scope util/ldb_cmd.cc:255: warning: left shift count >= width of type Test Plan: make clean check OPT=-g Reviewers: CC: Task ID: # Blame Rev: 11 November 2012, 08:20:40 UTC
0f8e472 Metrics: record compaction drops and bloom filter effectiveness Summary: Record BloomFilter hits and drop-off reasons during compaction. Test Plan: Unit tests work. Reviewers: dhruba, heyongqiang Reviewed By: dhruba Differential Revision: https://reviews.facebook.net/D6591 09 November 2012, 19:38:45 UTC
20d18a8 disable size compaction in ldb reduce_levels and added compression and file size parameters to it Summary: Disable size compaction in ldb reduce_levels; this avoids compactions other than the manual compaction. Added --compression=none|snappy|zlib|bzip2 and --file_size= (per-file size) options to the ldb reduce_levels command Test Plan: run ldb Reviewers: dhruba, MarkCallaghan Reviewed By: dhruba CC: sheki, emayanke Differential Revision: https://reviews.facebook.net/D6597 09 November 2012, 18:14:47 UTC
e00c709 Preparing for new release 1.5.4.fb Summary: Preparing for new release 1.5.4.fb Test Plan: Reviewers: CC: Task ID: # Blame Rev: 09 November 2012, 17:21:11 UTC
9e97bfd Introducing deletes for stress test Summary: Stress test modified to do deletes and later verify them Test Plan: running the test: db_stress Reviewers: dhruba, heyongqiang, asad, sheki, MarkCallaghan Reviewed By: dhruba Differential Revision: https://reviews.facebook.net/D6567 09 November 2012, 00:55:18 UTC
391885c stats collection in leveldb Summary: Prototype stats collection. Diff is a good estimate of what the final code will look like. A few assumptions: * Used a global static instance of the statistics object. Plan to pass it to each internal function. Static allows metrics only at the app level. * The Tickers do not do any locking; they depend on the mutex in each LevelDB function. If we ever remove the mutex, we should change this too. The other option is to use atomic objects anyway, as there won't be any contention since they will always be acquired by only one thread. * The counters are dumb; they increment through the lifecycle. Plan to use ods etc. to get last-5-min stats etc. Test Plan: made changes in db_bench Ran ./db_bench --statistics=1 --num=10000 --cache_size=5000 This will print the cache hit/miss stats. Reviewers: dhruba, heyongqiang Differential Revision: https://reviews.facebook.net/D6441 08 November 2012, 21:55:49 UTC
95dda37 Move filesize-based sorting to outside the mutex Summary: When a new version is created, we sort all the files at every level based on their size. This is necessary because we want to compact the largest file first. The sorting takes quite a bit of CPU. Moved the sorting code to be outside the mutex. Also, the earlier code was sorting files at all levels but we do not need to sort the highest-numbered level because those files are never the cause of any compaction. To reduce sorting costs, we sort only the first few files in each level because it is likely that those are the only files in that level that will be picked for compaction. At steady state, I have seen that this patch increases throughput from 1500 writes/sec to 1700 writes/sec at the end of a 72 hour run. The cpu saving from not sorting the last level was not distinctive in this test run because there were only 100K files in the highest numbered level. I expect the cpu saving to be significant when the number of files is much higher. This is mostly an early preview and not ready for rigorous review. With this patch, the writes/sec is now bottlenecked not by the sorting code but by GetOverlappingInputs. I am working on a patch to optimize GetOverlappingInputs. Test Plan: make check Reviewers: MarkCallaghan, heyongqiang Reviewed By: heyongqiang Differential Revision: https://reviews.facebook.net/D6411 07 November 2012, 23:39:44 UTC
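A hedged sketch of the "sort only the first few files" idea above: bring the largest files of a level to the front without fully sorting the level. The FileMeta type and the limit are assumptions for illustration:

    #include <algorithm>
    #include <cstdint>
    #include <vector>

    struct FileMeta {
      uint64_t number;
      uint64_t file_size;
    };

    // Move the 'limit' largest files to the front, in descending size order.
    // std::partial_sort does roughly O(n log limit) work instead of a full
    // O(n log n) sort, which matters when a level holds very many files.
    void OrderLargestFilesFirst(std::vector<FileMeta*>* files, size_t limit) {
      const size_t k = std::min(limit, files->size());
      std::partial_sort(files->begin(), files->begin() + k, files->end(),
                        [](const FileMeta* a, const FileMeta* b) {
                          return a->file_size > b->file_size;
                        });
    }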
18cb600 Fixed compilation error in previous merge. Summary: Fixed compilation error in previous merge. Test Plan: Reviewers: CC: Task ID: # Blame Rev: 07 November 2012, 23:24:47 UTC
8143062 Merge branch 'master' into performance Conflicts: db/db_impl.cc db/version_set.cc util/options.cc 07 November 2012, 23:11:37 UTC
3fcf533 Add a readonly db Summary: as subject Test Plan: run db_bench readrandom Reviewers: dhruba Reviewed By: dhruba CC: MarkCallaghan, emayanke, sheki Differential Revision: https://reviews.facebook.net/D6495 07 November 2012, 22:19:48 UTC
9b87a2b Avoid doing an exhaustive search when looking for overlapping files. Summary: The Version::GetOverlappingInputs() is called multiple times in the compaction code path. Each invocation does a binary search for overlapping files in the specified key range. This patch remembers the offset of an overlapped file when GetOverlappingInputs() is called the first time within a compaction run. Succeeding calls to GetOverlappingInputs() use the remembered index to avoid the binary search. I measured that 1000 iterations of GetOverlappingInputs takes around 4500 microseconds without this patch. If I use this patch with the hint on every invocation, then 1000 iterations take about 3900 microseconds. Test Plan: make check OPT=-g Reviewers: heyongqiang Reviewed By: heyongqiang CC: MarkCallaghan, emayanke, sheki Differential Revision: https://reviews.facebook.net/D6513 07 November 2012, 19:47:17 UTC
4e413df Flush Data at object destruction if disableWal is used. Summary: Added a conditional flush in ~DBImpl to flush. There is still a chance of writes not being persisted if there is a crash (not a clean shutdown) before the DBImpl instance is destroyed. Test Plan: modified db_test to meet the new expectations. Reviewers: dhruba, heyongqiang Differential Revision: https://reviews.facebook.net/D6519 06 November 2012, 23:04:42 UTC
aa42c66 Fix all warnings generated by the -Wall option to the compiler. Summary: The default compilation process now uses "-Wall" to compile. Fix all compilation errors generated by gcc. Test Plan: make all check Reviewers: heyongqiang, emayanke, sheki Reviewed By: heyongqiang CC: MarkCallaghan Differential Revision: https://reviews.facebook.net/D6525 06 November 2012, 22:07:31 UTC
5f91868 Merge branch 'master' into performance Conflicts: db/version_set.cc util/options.cc 06 November 2012, 00:51:55 UTC
cb7a002 The method GetOverlappingInputs should use binary search. Summary: The method Version::GetOverlappingInputs used a sequential search to map a key-range to a set of files. But the files are arranged in ascending order of key, so a binary search is more effective. This patch implements Version::GetOverlappingInputsBinarySearch, which finds one file that corresponds to the specified key range and then iterates backwards and forwards to find all overlapping files. This patch is critical for making compactions efficient, especially when there are thousands of files in a single level. I measured that 1000 iterations of TEST_MaxNextLevelOverlappingBytes takes 16000 microseconds without this patch. With this patch, the same method takes about 4600 microseconds. Test Plan: Almost all unit tests in db_test use this method to look up keys. Reviewers: heyongqiang Reviewed By: heyongqiang CC: MarkCallaghan, emayanke, sheki Differential Revision: https://reviews.facebook.net/D6465 06 November 2012, 00:08:01 UTC
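An illustrative sketch of the binary-search-then-expand approach described above, for a level whose files are sorted by key and do not overlap; the FileRange type and string comparisons stand in for the real comparator and metadata:

    #include <string>
    #include <vector>

    struct FileRange {
      std::string smallest;  // smallest user key in the file
      std::string largest;   // largest user key in the file
    };

    // 'files' is sorted by key and the ranges are disjoint (levels > 0).
    // Binary-search for one file that overlaps [begin, end], then walk outwards
    // in both directions to collect every overlapping file.
    std::vector<size_t> FindOverlappingFiles(const std::vector<FileRange>& files,
                                             const std::string& begin,
                                             const std::string& end) {
      std::vector<size_t> result;
      size_t lo = 0, hi = files.size(), hit = files.size();
      while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (files[mid].largest < begin) {
          lo = mid + 1;                 // file lies entirely before the range
        } else if (files[mid].smallest > end) {
          hi = mid;                     // file lies entirely after the range
        } else {
          hit = mid;                    // overlap found; expand from here
          break;
        }
      }
      if (hit == files.size()) return result;  // nothing overlaps
      size_t left = hit;
      while (left > 0 && files[left - 1].largest >= begin) --left;
      size_t right = hit;
      while (right + 1 < files.size() && files[right + 1].smallest <= end) ++right;
      for (size_t i = left; i <= right; ++i) result.push_back(i);
      return result;
    }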
5273c81 Ability to invoke application hook for every key during compaction. Summary: There are certain use-cases where the application intends to delete older keys after a certain time period has expired. One option for those applications is to periodically scan the entire database and delete appropriate keys. A better way is to allow the application to hook into the compaction process. This patch allows the application to set a method callback for every key that is being compacted. If this method returns true, then the key is not preserved in the output of the compaction. Test Plan: This is mostly to preview the proposed new public api. Since it is a public api, please do due diligence on reviewing it. I will be writing test cases for this api in my next version of this patch. Reviewers: MarkCallaghan, heyongqiang Reviewed By: heyongqiang CC: sheki, adsharma Differential Revision: https://reviews.facebook.net/D6285 06 November 2012, 00:02:13 UTC
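A hedged sketch of the kind of per-key compaction hook described above: the application supplies a callback (shown here with the opaque argument that the later D6711 change adds) and returning true drops the key from the compaction output. The signature is illustrative, not the exact public API:

    #include <cstdint>
    #include <cstring>
    #include <string>

    // Application-supplied hook: return true to drop the key during compaction.
    // 'arg' is an opaque pointer the application registered when opening the DB.
    typedef bool (*CompactionFilterFn)(void* arg, int level,
                                       const std::string& key,
                                       const std::string& existing_value);

    // Example policy: drop any value whose first 8 bytes encode a timestamp
    // older than the cutoff carried in the opaque argument.
    struct TtlArg { uint64_t expire_before; };

    bool DropExpired(void* arg, int /*level*/, const std::string& /*key*/,
                     const std::string& value) {
      if (value.size() < sizeof(uint64_t)) return false;
      uint64_t ts = 0;
      std::memcpy(&ts, value.data(), sizeof(ts));
      return ts < static_cast<const TtlArg*>(arg)->expire_before;
    }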
f1a7c73 fix compile error Summary: as subject Test Plan: n/a 05 November 2012, 18:30:19 UTC
d55c2ba Add a tool to change number of levels Summary: as subject. Test Plan: manually test it, will add a testcase Reviewers: dhruba, MarkCallaghan Differential Revision: https://reviews.facebook.net/D6345 05 November 2012, 18:17:39 UTC
81f735d Merge branch 'master' into performance Conflicts: db/db_impl.cc util/options.cc 05 November 2012, 17:41:38 UTC
a1bd5b7 Compilation problem introduced by previous commit 854c66b089bef5d27f79750884f70f6e2c8c69da. Summary: Compilation problem introduced by previous commit 854c66b089bef5d27f79750884f70f6e2c8c69da. Test Plan: make check 05 November 2012, 06:04:14 UTC
854c66b Make compression options configurable. These include window-bits, level and strategy for ZlibCompression Summary: Leveldb currently uses windowBits=-14 while using zlib compression.(It was earlier 15). This makes the setting configurable. Related changes here: https://reviews.facebook.net/D6105 Test Plan: make all check Reviewers: dhruba, MarkCallaghan, sheki, heyongqiang Differential Revision: https://reviews.facebook.net/D6393 02 November 2012, 18:26:39 UTC
3096fa7 Add two more options: disable block cache and make table cache shard number configurable Summary: as subject Test Plan: run db_bench and db_test Reviewers: dhruba Reviewed By: dhruba Differential Revision: https://reviews.facebook.net/D6111 01 November 2012, 20:23:21 UTC
3e7e269 Use timer to measure sleep rather than assume it is 1000 usecs Summary: This makes the stall timers in MakeRoomForWrite more accurate by timing the sleeps. From looking at the logs the real sleep times are usually about 2000 usecs each when SleepForMicros(1000) is called. The modified LOG messages are: 2012/10/29-12:06:33.271984 2b3cc872f700 delaying write 13 usecs for level0_slowdown_writes_trigger 2012/10/29-12:06:34.688939 2b3cc872f700 delaying write 1728 usecs for rate limits with max score 3.83 Task ID: # Blame Rev: Test Plan: run db_bench, look at DB/LOG Revert Plan: Database Impact: Memcache Impact: Other Notes: EImportant: - begin *PUBLIC* platform impact section - Bugzilla: # - end platform impact - Reviewers: dhruba Reviewed By: dhruba Differential Revision: https://reviews.facebook.net/D6297 30 October 2012, 14:21:37 UTC
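A small sketch of the measurement idea above: time the sleep with a monotonic clock instead of assuming SleepForMicros(1000) really slept 1000 usecs. The clock choice and log format here are illustrative, not the code in MakeRoomForWrite:

    #include <chrono>
    #include <cstdio>
    #include <thread>

    // Sleep for the requested number of microseconds and report how long the
    // sleep actually took; short sleeps commonly overshoot under load.
    long long TimedSleepMicros(long long requested_usecs) {
      auto start = std::chrono::steady_clock::now();
      std::this_thread::sleep_for(std::chrono::microseconds(requested_usecs));
      auto elapsed = std::chrono::duration_cast<std::chrono::microseconds>(
                         std::chrono::steady_clock::now() - start).count();
      std::printf("delaying write %lld usecs (requested %lld)\n",
                  static_cast<long long>(elapsed), requested_usecs);
      return static_cast<long long>(elapsed);
    }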
fb8d437 fix test failure Summary: as subject Test Plan: db_test Reviewers: dhruba, MarkCallaghan Reviewed By: MarkCallaghan Differential Revision: https://reviews.facebook.net/D6309 30 October 2012, 01:55:52 UTC
925f60d add a test case to make sure changing num_levels will fail Summary: as subject Test Plan: db_test Reviewers: dhruba, MarkCallaghan Reviewed By: MarkCallaghan Differential Revision: https://reviews.facebook.net/D6303 29 October 2012, 22:27:07 UTC
53e0431 Merge branch 'master' into performance Conflicts: db/db_bench.cc util/options.cc 29 October 2012, 21:18:00 UTC
321dfdc Allow having different compression algorithms on different levels. Summary: The leveldb API is enhanced to support different compression algorithms at different levels. This adds the option min_level_to_compress to db_bench that specifies the minimum level for which compression should be done when compression is enabled. This can be used to disable compression for levels 0 and 1 which are likely to suffer from stalls because of the CPU load for memtable flushes and (L0,L1) compaction. Level 0 is special as it gets frequent memtable flushes. Level 1 is special as it frequently gets all:all file compactions between it and level 0. But all other levels could be the same. For any level N where N > 1, the rate of sequential IO for that level should be the same. The last level is the exception because it might not be full and because files from it are not read to compact with the next larger level. The same amount of time will be spent doing compaction at any level N excluding N=0, 1 or the last level. By this standard all of those levels should use the same compression. The difference is that the loss (using more disk space) from a faster compression algorithm is less significant for N=2 than for N=3. So we might be willing to trade disk space for faster write rates with no compression for L0 and L1, snappy for L2, zlib for L3. Using a faster compression algorithm for the mid levels also allows us to reclaim some cpu without trading off much loss in disk space overhead. Also note that little is to be gained by compressing levels 0 and 1. For a 4-level tree they account for 10% of the data. For a 5-level tree they account for 1% of the data. With compression enabled: * memtable flush rate is ~18MB/second * (L0,L1) compaction rate is ~30MB/second With compression enabled but min_level_to_compress=2 * memtable flush rate is ~320MB/second * (L0,L1) compaction rate is ~560MB/second This practically takes the same code from https://reviews.facebook.net/D6225 but makes the leveldb api more general purpose with a few additional lines of code. Test Plan: make check Differential Revision: https://reviews.facebook.net/D6261 29 October 2012, 18:48:09 UTC
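A minimal sketch of how a min_level_to_compress knob could map onto per-level compression choices, as described above; the enum values and function are assumptions used for illustration, not the exact API:

    #include <algorithm>
    #include <vector>

    enum CompressionType { kNoCompression, kSnappyCompression, kZlibCompression };

    // Levels below 'min_level_to_compress' stay uncompressed (cheap memtable
    // flushes and L0->L1 compactions); every level at or above it uses the
    // chosen algorithm.
    std::vector<CompressionType> BuildCompressionPerLevel(
        int num_levels, int min_level_to_compress, CompressionType compressed) {
      std::vector<CompressionType> per_level(num_levels, kNoCompression);
      for (int level = std::max(min_level_to_compress, 0); level < num_levels;
           ++level) {
        per_level[level] = compressed;
      }
      return per_level;
    }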
acc8567 Add more rates to db_bench output Summary: Adds the "MB/sec in" and "MB/sec out" to this line: Amplification: 1.7 rate, 0.01 GB in, 0.02 GB out, 8.24 MB/sec in, 13.75 MB/sec out Changes all values to be reported per interval and since test start for this line: ... thread 0: (10000,60000) ops and (19155.6,27307.5) ops/second in (0.522041,2.197198) seconds Task ID: # Blame Rev: Test Plan: run db_bench Revert Plan: Database Impact: Memcache Impact: Other Notes: EImportant: - begin *PUBLIC* platform impact section - Bugzilla: # - end platform impact - Reviewers: dhruba Reviewed By: dhruba Differential Revision: https://reviews.facebook.net/D6291 29 October 2012, 18:30:07 UTC
de7689b Fix unit test failure caused by delaying deleting obsolete files. Summary: A previous commit 4c107587ed47af84633f8c61f65516a504d6cd98 introduced the idea that some version updates might not delete obsolete files. This means that if a unit test blindly counts the number of files in the db directory it might not represent the true state of the database. Use GetLiveFiles() instead to count the number of live files in the database. Test Plan: make check 29 October 2012, 18:12:24 UTC
70c42bf Adds DB::GetNextCompaction and then uses that for rate limiting db_bench Summary: Adds a method that returns the score for the next level that most needs compaction. That method is then used by db_bench to rate limit threads. Threads are put to sleep at the end of each stats interval until the score is less than the limit. The limit is set via the --rate_limit=$double option. The specified value must be > 1.0. Also adds the option --stats_per_interval to enable additional metrics reported every stats interval. Task ID: # Blame Rev: Test Plan: run db_bench Revert Plan: Database Impact: Memcache Impact: Other Notes: EImportant: - begin *PUBLIC* platform impact section - Bugzilla: # - end platform impact - Reviewers: dhruba Reviewed By: dhruba Differential Revision: https://reviews.facebook.net/D6243 29 October 2012, 17:17:43 UTC
8965c8d Add the missing util/auto_split_logger.h Summary: Test Plan: Reviewers: CC: Task ID: 1803577 Blame Rev: 26 October 2012, 22:23:50 UTC
d50f8eb Enable LevelDb to create a new log file if current log file is too large. Summary: Enable LevelDb to create a new log file if current log file is too large. Test Plan: Write a script and manually check the generated info LOG. Task ID: 1803577 Blame Rev: Reviewers: dhruba, heyongqiang Reviewed By: heyongqiang CC: zshao Differential Revision: https://reviews.facebook.net/D6003 26 October 2012, 21:55:02 UTC
3a91b78 Keep build_detect_platform portable Summary: AFAIK proper /bin/sh does not support "+=". Note that only our changes use "+=". The Google code does A="$A + $B" rather than A+=$B. Task ID: # Blame Rev: Test Plan: build Revert Plan: Database Impact: Memcache Impact: Other Notes: EImportant: - begin *PUBLIC* platform impact section - Bugzilla: # - end platform impact - Reviewers: dhruba Reviewed By: dhruba Differential Revision: https://reviews.facebook.net/D6231 26 October 2012, 21:20:04 UTC
65855dd Normalize compaction stats by time in compaction Summary: I used server uptime to compute per-level IO throughput rates. I intended to use time spent doing compaction at that level. This fixes that. Task ID: # Blame Rev: Test Plan: run db_bench, look at results Revert Plan: Database Impact: Memcache Impact: Other Notes: EImportant: - begin *PUBLIC* platform impact section - Bugzilla: # - end platform impact - Reviewers: dhruba Reviewed By: dhruba Differential Revision: https://reviews.facebook.net/D6237 26 October 2012, 21:19:13 UTC