https://github.com/SoftwareHeritage/swh-storage

Revision Message Commit Date
79cc3f6 Better test coverage for content_missing 29 May 2018, 14:38:50 UTC
70574c5 Don't use a temporary table to fetch info about contents 28 May 2018, 16:04:44 UTC
6d5d999 Make the db_transaction{,_generator} decorators support client options Summary: Allow adding server-side statement timeouts for database operations Test Plan: make test still works Reviewers: #reviewers! Differential Revision: https://forge.softwareheritage.org/D334 28 May 2018, 13:18:26 UTC
d826f75 swh.storage.api.client: Permit to specify the query timeout option Related T1061 24 May 2018, 10:02:11 UTC
77e69ec Use a concurrent.future to parallelize objstorage and storage addition 12 May 2018, 15:49:44 UTC
4a9b623 test_api_client: stop leaking directories Summary: When mkdtemp is called, shutil.rmtree must be called as well Test Plan: Look at /tmp before and after running tests, notice no new directories instead of 60. Reviewers: #reviewers! Differential Revision: https://forge.softwareheritage.org/D331 11 May 2018, 13:42:09 UTC
721d088 Add test to ensure storage/objstorage consistency is kept at all times Summary: The behavior of storage when the underlying objstorage had an exception was never actually tested. This new test weeded out a bug in the threaded implementation for copy_to. Test Plan: the new test passes Reviewers: #reviewers! Differential Revision: https://forge.softwareheritage.org/D330 11 May 2018, 10:13:44 UTC
18c9dad Only instantiate the storage backend once per import This allows connection reuse for postgresql and potential remote backends such as for the object storage, rather than reinitiating all connections on every request. 09 May 2018, 14:43:32 UTC
5a2de1c Use thread-aware psycopg2 connection pooling for database access Summary: This allows to use swh.storage with a modicum of concurrency Test Plan: clearly, make test should still pass Reviewers: #reviewers! Differential Revision: https://forge.softwareheritage.org/D325 07 May 2018, 13:52:35 UTC
6c93693 Stop using the storage.db attribute directly Summary: Add a level of indirection to allow swapping out the implementation of the db attribute Test Plan: once again, make test keeps on working Reviewers: #reviewers! Differential Revision: https://forge.softwareheritage.org/D324 07 May 2018, 13:46:21 UTC
e1e5025 Make the storage test fixture connect to postgres itself Summary: This avoids reusing a potentially stale connection handle. Also allows testing potential connection pooling behavior. This forces us to do proper cursor sanitation as well, a bunch of "transactional" operations weren't actually transactional. Test Plan: another round of make test still working Reviewers: #reviewers! Differential Revision: https://forge.softwareheritage.org/D323 07 May 2018, 13:43:27 UTC
ff2d37d Deallocate storage when the tests teardown Summary: Helps avoid lingering postgresql connections when a test fails Test Plan: make test still works ;) Reviewers: #reviewers! Differential Revision: https://forge.softwareheritage.org/D322 07 May 2018, 13:35:03 UTC
9ca8f7a Make sure schema changes get committed when doing the AlteringSchemaTests 07 May 2018, 13:31:21 UTC
03eea50 tests: Move test fixture to swh.core.tests.server_testing Related T1036 25 April 2018, 13:00:25 UTC
b4147ec fix typo in docstrings/comments (tnx codespell) 12 March 2018, 11:01:52 UTC
916d9cc Remove file that should not have been committed and update gitignore 27 February 2018, 09:40:18 UTC
d8ad992 storage: Add methods to compute directories/revisions diff This commit adds the implementation of an efficient algorithm for comparing two directory trees in order to compute the list of introduced file changes in terms of addition / deletion / modification / renaming. It can be found in the diff module located in the new namespace swh.storage.algos. That algorithm is used to extend the storage API with the following methods: - diff_directories: compute diff between two arbitrary directories - diff_revisions: compute diff between two arbitrary revisions - diff_revision: compute diff between a revision and its first parent Related T921 Closes D295 20 February 2018, 12:19:09 UTC
6b80cff Add a new table for "bucketed" object counts This table allows counting objects by bucket, keeping the transactions for counting objects short (a few dozen seconds at most). Also add a "single_update" boolean field to the main object_counts table to be able to discriminate tables that are counted via buckets and tables counted in one go. The main table is updated every 256 counted buckets to avoid too much churn on the table. Close T962. 19 February 2018, 18:30:41 UTC
86d68a6 doc: update table clusters in SQL diagram 09 February 2018, 11:48:40 UTC
22ee4c5 swh.storage.content_missing: Improve docstring 08 February 2018, 18:27:29 UTC
9082524 api.client: switch back_compat for snapshot_add as well 06 February 2018, 14:27:10 UTC
ada0d38 storage: swap back_compat switch for snapshot_add 06 February 2018, 14:25:36 UTC
9f84cec Add function to retrieve latest snapshot based on origin filtering This function does _not_ have any backwards compatibility provisions. 31 January 2018, 14:19:47 UTC
32e89b0 Add snapshot information when fetching origin_visits from the database 31 January 2018, 10:49:14 UTC
d6c3f65 sql doc Makefile: (try to) dropdb before creating it this way doc (re)building is more robust 19 January 2018, 13:22:11 UTC
b32d2c9 Add a way to remove the backwards-compatibility feature of snapshots 19 January 2018, 10:54:00 UTC
ab0e90c doc Makefile: clean SQL autodoc upon "clean" 19 January 2018, 10:30:42 UTC
d163501 SQL doc Makefile: ensure destdir gets created wherever needed 19 January 2018, 10:28:35 UTC
985a912 sql/Makefile: add missing deps on pdf/svg DB schema generation 16 January 2018, 15:48:53 UTC
2e033d0 sql: don't depend on plpython3u, we don't use it 16 January 2018, 14:36:32 UTC
56e0092 db-init: enforce UTF8 locale for jsonb fanciness Related T918 10 January 2018, 11:16:47 UTC
f5acfeb sql: Fix missing migration instructions Related 08121f7 09 January 2018, 10:46:46 UTC
7b5c111 db-init: actually grant privileges on tables and sequences 06 January 2018, 14:03:33 UTC
ca308c7 sql/bin/db-init: new helper script to setup development DB 06 January 2018, 13:52:20 UTC
cf6cac7 SQL: move definition of hash_sha1() function to swh-func.sql rationale: swh-init.sql should only contain stuff that requires being Postgres superuser 06 January 2018, 09:21:51 UTC
a0ac20a db: use a thread instead of a tmpfile to COPY data to postgresql Summary: Avoids writing a bunch of temporary files for no good reason. T866 was a symptom of this. Test Plan: No regressions in the unit tests Reviewers: #reviewers! Differential Revision: https://forge.softwareheritage.org/D282 21 December 2017, 18:48:16 UTC
419d135 db: move external imports further down 21 December 2017, 18:45:19 UTC
221f8b1 Add 116->117 SQL upgrade script 21 December 2017, 18:40:57 UTC
11fcb82 Refactor swh_snapshot_add to hit more indexes 21 December 2017, 18:38:50 UTC
76e3670 db: remove bare except: 19 December 2017, 15:22:34 UTC
afcbe94 sql/upgrades: add 115 -> 116 swh-enums: Reference new mercurial revision type Related T329 19 December 2017, 13:27:40 UTC
a455fc0 docs: reference original db-schema.svg without symlink indirection the new sphinx seems to be confused by relative symlinks when we merge all docs together 19 December 2017, 11:06:59 UTC
655bcb6 doc generation: use more specific db-schema basename for images was: "swh" this way we can ship the images as is in the global doc, with minimal clash risks 19 December 2017, 11:06:04 UTC
73256e6 Add 114->115 SQL upgrade script 18 December 2017, 15:12:01 UTC
36aee70 Fix documentation formatting for tool_add/tool_get 15 December 2017, 14:48:16 UTC
f4ea97c Add snapshot models Summary: Snapshots are the new, improved occurrences; They're the topmost object in the Software Heritage Merkle Tree, and represent a full picture of an origin at a given time. Snapshots contain a list of named pointers to objects in the Software Heritage archive, as well as an intrinsic identifier. The full specification is supported: pointers to all types of objects, dangling pointers, as well as alias branches. They're implemented with a somewhat classic fully normalised model; Foreign keys use a sha1_git, which makes more sense regarding pointing at non-existent objects, at the expense of some economies of size. Backwards compatibility both ways with occurrences is ensured: when adding a snapshot linked to an origin visit, the corresponding occurrences are created in occurrence_history; when querying the snapshot for an origin visit where we haven't generated the snapshot yet, a virtual snapshot with id None is returned. This lets us migrate to the new tables gently. Close T567. Test Plan: Integration tests are included. Reviewers: zack, #reviewers! Maniphest Tasks: T567 Differential Revision: https://forge.softwareheritage.org/D268 15 December 2017, 14:36:58 UTC
0bfd709 sql/upgrades: add 113 -> 114 Finalize it really. Related T873 07 December 2017, 16:58:55 UTC
20d59c3 swh.storage.api.server: Do not hardcode the config file Related T871 07 December 2017, 13:14:28 UTC
62e8472 d/rules: Fix double export instruction 07 December 2017, 12:18:56 UTC
08121f7 swh-storage: Rename indexer_configuration to tool Related T871 07 December 2017, 08:45:24 UTC
021512b swh-storage: Migrate indexer model to its own model Related T871 07 December 2017, 08:44:26 UTC
63858e5 Add method to storage for searching origins This adds method 'origin_search' to storage enabling to search for origins whose urls contain a string pattern or match a regular expression Related T848 05 December 2017, 10:47:54 UTC
a4abad0 schemata: Add missing __init__.py for build package purposes 24 November 2017, 10:13:48 UTC
c33ad51 sql/upgrades: add 112 -> 113 Related T851 17 November 2017, 14:05:53 UTC
8311a4f swh.storage: Open indexer_configuration_add endpoint Related T851 17 November 2017, 13:46:39 UTC
5a0ddd6 swh.storage.tests: Fix broken content_mimetype tests Due to a new value in db for the same tool. Related 35253443fe0bd792e84f3ce939ee20e7eed52f9b Related T849 17 November 2017, 10:28:49 UTC
3525344 swh-data: Add new content mimetype's indexer configuration Related T849 15 November 2017, 14:36:19 UTC
775a1b8 add __init__.py for tests 10 November 2017, 13:16:40 UTC
c5740fd origin_visit_get: make order repeatable 10 November 2017, 13:16:40 UTC
c4ec821 Add missing visit key for occurrence_add documentation 10 November 2017, 13:16:40 UTC
9b29324 Make unique indices actually unique and vice versa 09 November 2017, 17:11:11 UTC
f87a530 Add 110->111 SQL upgrade script 06 November 2017, 12:46:56 UTC
e7a7b56 Remove unused content provenance cache tables 03 November 2017, 15:31:36 UTC
75f4edd sql/upgrades/110: add metadata tables to SQL schema 03 November 2017, 11:41:28 UTC
477bd44 docs: add absolute anchor to documentation index 02 November 2017, 10:09:27 UTC
d13751a Refactor entry points to origin_metadata table with get_by function: deleting entry points get_all and get_by_provider; creating one unique entry point origin_metadata_get_by; adding entry point provider_get_by name+url 24 October 2017, 12:30:01 UTC
48fc6f7 Refactor origin_metadata and adding provider table and logic added documentation and new version for schema 23 October 2017, 14:33:26 UTC
76c5326 Create origin_metadata tables and logic Summary: adding add and get entry points and tests for origin_metadata test for origin_metadata add and get functions pass Added entry points get_all and get_by_provenance for origin_metadata Closes T737 References P168 Test Plan: tests for add, get, get_all and get_by_provenance Reviewers: ardumont, #reviewers! Maniphest Tasks: T737 Differential Revision: https://forge.softwareheritage.org/D254 23 October 2017, 14:33:26 UTC
17882a9 test for origin_metadata add and get functions pass 23 October 2017, 14:33:26 UTC
3b425af Create origin_metadata tables and logic adding add and get entry points and tests for origin_metadata 23 October 2017, 14:33:26 UTC
7e2e9a9 docs: integrate postgres DB schema in dev docs 21 October 2017, 15:19:40 UTC
cc8a618 sql doc: also generate SVG version of the schema chart 21 October 2017, 15:04:23 UTC
6ecf3d6 doc: convert archiver blueprint to rst and link it from doc index 21 October 2017, 14:47:03 UTC
8915caf db schema chart: add metadata and statistic clusters to place stray tables where they belong 21 October 2017, 13:54:23 UTC
89b9ce4 db schema chart: use more current titles for some clusters 21 October 2017, 13:54:13 UTC
d35e739 Make swh.storage.schemata work on SQLalchemy 1.0 12 October 2017, 17:51:00 UTC
91e1e5e Drop doctests from build as they mess up flask 12 October 2017, 16:41:33 UTC
ecdb993 Cleanup packaging 12 October 2017, 15:16:59 UTC
92e46df Move kafka_python to extra requirements 11 October 2017, 16:51:37 UTC
f56e812 swh.storage.converters: Fix typo in docstring 11 October 2017, 15:42:53 UTC
b8f5018 Flask and doctest are unhappy with each other See also https://github.com/pallets/flask/issues/1680 11 October 2017, 15:39:11 UTC
1964eaf Cleanup kafka-related requirements 11 October 2017, 15:33:01 UTC
13a62b1 swh.storage.listener: drop cyclic dependency on swh.journal 11 October 2017, 15:30:25 UTC
d910b67 add python3-kafka to build-depends 11 October 2017, 15:25:06 UTC
6cf4fc9 Cleanup tests during debian package build 11 October 2017, 15:22:46 UTC
e36a36b Bump dependency on swh.model 11 October 2017, 15:18:34 UTC
2022da2 schemata.distribution: update for reuse by the Debian loader 10 October 2017, 13:59:31 UTC
6a97d92 test_storage: update tests to use DentryPerms instead of raw values 09 October 2017, 10:23:40 UTC
c1bdce3 db: properly handle IntEnums (e.g. DentryPerms) 09 October 2017, 10:15:05 UTC
e67765b schemata: add a new package for ancillary schemata This package is inaugurated by the distribution schemata 14 September 2017, 15:30:39 UTC
82ca9bb debian/control: wrap-and-sort 14 September 2017, 15:29:18 UTC
118a962 sanitize docstrings for sphinx 07 September 2017, 08:21:34 UTC
e14e72d sql: add origin_visit to swh_stat_counters() 04 September 2017, 18:17:03 UTC
4fbc237 sql/upgrades: add 107 to 108 script 01 September 2017, 12:14:21 UTC
20f47dd sql/swh-func: keep a cache of exact object counts as a table Close T719 (cc @rdicosmo) 01 September 2017, 08:18:42 UTC
e1ec2d8 test_storage: move tests that were inadvertently "local-only" to the base class 01 September 2017, 08:18:42 UTC
45b3426 docs/: add sphinx apidoc generation skeleton change cherry picked from python module template commit 71b117ba0cf9f1251b1cac26d0994df03a4c787d 30 August 2017, 10:26:04 UTC
464d5bc storage_testing: leverage reset_db_tables from db_testing 03 August 2017, 16:16:16 UTC
9d416d3 Added revision_metadata table and methods into storage Summary: - testing missing, add and get methods on revision_metadata Reviewers: ardumont, #reviewers! Differential Revision: https://forge.softwareheritage.org/D235 28 July 2017, 10:08:27 UTC
806f511 tests: move teardown reset_tables logic to storage_testing 19 July 2017, 14:56:27 UTC
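
Commit a0ac20a in the log above replaces temporary files with a thread when feeding COPY. The general pattern can be sketched with the standard library alone: a writer thread pushes CSV rows into an OS pipe while the calling thread consumes the read end. The names `copy_via_thread` and `consume` are hypothetical, and `consume` only stands in for a database call that accepts a file-like object (psycopg2's `cursor.copy_expert` is one such call). This is an illustration of the idea under those assumptions, not the code from swh.storage.

```python
import csv
import os
import threading


def copy_via_thread(rows, consume):
    """Stream rows as CSV to consume() through a pipe, without a temp file.

    consume is a hypothetical callback that takes a readable text file
    object; with psycopg2 it could wrap cursor.copy_expert(sql, file).
    """
    read_fd, write_fd = os.pipe()

    def writer():
        try:
            # Closing the write end on exit signals EOF to the reader.
            with os.fdopen(write_fd, "w", newline="") as wf:
                csv.writer(wf).writerows(rows)
        except BrokenPipeError:
            # The reader went away early; there is nothing left to send.
            pass

    thread = threading.Thread(target=writer)
    thread.start()
    try:
        with os.fdopen(read_fd, "r", newline="") as rf:
            return consume(rf)
    finally:
        # Always wait for the writer so no thread outlives the call.
        thread.join()
```

Because the writer blocks whenever the pipe buffer is full, memory use stays bounded regardless of how many rows are streamed.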
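
Commit 77e69ec in the log above mentions using a concurrent.future to parallelize objstorage and storage addition, and commit 721d088 adds a test for what happens when the objstorage raises. A minimal sketch of that pattern, assuming two hypothetical callables `add_to_objstorage` and `add_to_storage` (neither name comes from swh.storage), might look like this:

```python
from concurrent.futures import ThreadPoolExecutor


def add_in_parallel(contents, add_to_objstorage, add_to_storage):
    """Run both additions concurrently; re-raise if either one failed.

    Both callables are hypothetical stand-ins for the real backends.
    """
    with ThreadPoolExecutor(max_workers=2) as executor:
        obj_future = executor.submit(add_to_objstorage, contents)
        db_future = executor.submit(add_to_storage, contents)
        # result() blocks until the task completes and re-raises any
        # exception from the worker, so a failure is not silently dropped.
        return obj_future.result(), db_future.result()
```

Leaving the `with` block waits for both tasks, so the caller never returns while one of the additions is still running. Rolling back the side that succeeded when the other fails is a separate concern that this sketch does not address.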