https://github.com/SoftwareHeritage/swh-storage

89d28d5 Updated backport on buster-swh from debian/0.36.0-1_swh1 (unstable-swh) 24 August 2021, 15:07:34 UTC
dac2450 Merge tag 'debian/0.36.0-1_swh1' into debian/buster-swh 24 August 2021, 15:07:33 UTC
ffe636f Updated debian changelog for version 0.36.0 24 August 2021, 15:01:32 UTC
6924a7e Update upstream source from tag 'debian/upstream/0.36.0' Update to upstream version '0.36.0' with Debian dir 179c1ad6d3ce02e0f64d5944d38e3e3d48e86d89 24 August 2021, 15:01:30 UTC
3a224a9 New upstream version 0.36.0 24 August 2021, 15:01:29 UTC
b110d1b Add cvs as supported revision_type 24 August 2021, 14:39:03 UTC
8f1cdf6 Add test for origin_visit_get_latest in presence of mismatched id and date orders It was unclear this actually worked; I had to write this test to realize the code wasn't buggy. Also replaced a conditional that is always False (because Cassandra always returns results in the order of the clustering key) with an assertion, so the code is less confusing. 24 August 2021, 13:14:39 UTC
cf880db cassandra: Bump next_visit_id when origin_visit_add is called by a replayer When called by a replayer, the visit.visit field is set; but origin.next_visit_id was never incremented, so on the next loader run, the visit id would be 1 even if there is already a visit with that id. 24 August 2021, 13:14:39 UTC
54b5abf cassandra: Make content_missing query in batches Instead of calling content_find() for each object, which needs to make two queries for each. Given the latency of Cassandra queries, this should be a significant speed-up (possibly up to 100 times faster, as this is the value of PARTITION_KEY_RESTRICTION_MAX_SIZE). This also changes the schema, because CQL does not allow doing `IN` queries on compound partition keys. 24 August 2021, 13:14:39 UTC
7113198 backfill: add extra where clause to use the right index for extid requests Related to T3485 24 August 2021, 11:52:32 UTC
55c3f0c Updated backport on buster-swh from debian/0.35.1-1_swh1 (unstable-swh) 20 August 2021, 10:07:02 UTC
8f1e8cd Merge tag 'debian/0.35.1-1_swh1' into debian/buster-swh 20 August 2021, 10:07:02 UTC
ae70564 Updated debian changelog for version 0.35.1 20 August 2021, 10:01:16 UTC
ca5ee3d Update upstream source from tag 'debian/upstream/0.35.1' Update to upstream version '0.35.1' with Debian dir 313295a88c0c4c4f7c924c7c36bea4c310f2cbca 20 August 2021, 10:01:15 UTC
1c038f0 New upstream version 0.35.1 20 August 2021, 10:01:14 UTC
9f00eb9 cassandra: Fix crash when using _missing() functions with more than 100 ids with ScyllaDB. 06 August 2021, 12:59:29 UTC
18418b2 Updated backport on buster-swh from debian/0.35.0-1_swh1 (unstable-swh) 28 July 2021, 13:11:03 UTC
1ac2093 Merge tag 'debian/0.35.0-1_swh1' into debian/buster-swh 28 July 2021, 13:11:03 UTC
5269c40 Updated debian changelog for version 0.35.0 28 July 2021, 08:43:06 UTC
96b0ed6 Update upstream source from tag 'debian/upstream/0.35.0' Update to upstream version '0.35.0' with Debian dir 6ad84ab1c5fb66c4ead922e0b750291db31def2f 28 July 2021, 08:43:05 UTC
3be3f63 New upstream version 0.35.0 28 July 2021, 08:43:04 UTC
912d04e sql: Adapt extid.extid_version comment 27 July 2021, 14:58:02 UTC
7a38045 Implement storage of the ExtID.extid_version field This field allows having multiple versions of the ExtID -> SWHID mapping, for instance when the implementation of a loader changes in a backwards-incompatible way. For now, we don't change the API used to query or store ExtIDs. When querying for the SWHIDs corresponding to a given external object, all versions are returned, and the client is expected to do the filtering. 23 July 2021, 15:37:12 UTC
f224765 Updated backport on buster-swh from debian/0.34.0-1_swh1 (unstable-swh) 07 July 2021, 17:04:51 UTC
3408528 Merge tag 'debian/0.34.0-1_swh1' into debian/buster-swh 07 July 2021, 17:04:50 UTC
f57de45 Updated debian changelog for version 0.34.0 07 July 2021, 16:58:42 UTC
d051810 Update upstream source from tag 'debian/upstream/0.34.0' Update to upstream version '0.34.0' with Debian dir d1489fbcf94a7c6a63233ae05f74d2de68f382bc 07 July 2021, 16:58:39 UTC
b07c48a New upstream version 0.34.0 07 July 2021, 16:58:37 UTC
9747aed cassandra: Allow to configure the consistency level to use The default ONE level is used to keep the previous behaviour Related to T3396 07 July 2021, 12:26:47 UTC
3258a7b Updated backport on buster-swh from debian/0.33.0-1_swh1 (unstable-swh) 05 July 2021, 15:06:08 UTC
78fd007 Merge tag 'debian/0.33.0-1_swh1' into debian/buster-swh 05 July 2021, 15:06:07 UTC
6dd1060 Updated debian changelog for version 0.33.0 05 July 2021, 15:00:12 UTC
09c86b5 Update upstream source from tag 'debian/upstream/0.33.0' Update to upstream version '0.33.0' with Debian dir 47fe726bda16965dd0e91bd305b768a9e6a855a4 05 July 2021, 15:00:11 UTC
9cfa8cd New upstream version 0.33.0 05 July 2021, 15:00:10 UTC
f195434 Updated backport on buster-swh from debian/0.32.0-1_swh1 (unstable-swh) 28 June 2021, 16:26:08 UTC
5a0e97c Merge tag 'debian/0.32.0-1_swh1' into debian/buster-swh 28 June 2021, 16:26:07 UTC
c6adf97 Updated debian changelog for version 0.32.0 28 June 2021, 16:20:21 UTC
a5a17c2 Update upstream source from tag 'debian/upstream/0.32.0' Update to upstream version '0.32.0' with Debian dir 5a82175111d28225036b7409520b88892fc695d2 28 June 2021, 16:20:20 UTC
bb93b16 New upstream version 0.32.0 28 June 2021, 16:20:18 UTC
f1cac4f postgresql: Add type annotation for 'db' argument This allows mypy to actually type-check calls to db methods. This commit also fixes an issue found by mypy. 28 June 2021, 15:28:15 UTC
dd8a590 --amend 28 June 2021, 15:21:18 UTC
c5beb49 Add endpoint raw_extrinsic_metadata_get_authorities This will make it easier for users of swh-web to discover metadata on a given SWHID, as you otherwise need to specify an authority to fetch metadata. 28 June 2021, 13:30:41 UTC
ec2fac4 cassandra: Add support for non-ASCII origin 'URLs'. We agreed a while ago they are IRIs, and we have some of them in the postgresql database already. 25 June 2021, 15:26:53 UTC
f746163 Updated backport on buster-swh from debian/0.31.0-1_swh1 (unstable-swh) 25 June 2021, 09:31:58 UTC
27fb9f1 Merge tag 'debian/0.31.0-1_swh1' into debian/buster-swh 25 June 2021, 09:31:57 UTC
c38d186 Updated debian changelog for version 0.31.0 25 June 2021, 09:26:09 UTC
f3dce00 Update upstream source from tag 'debian/upstream/0.31.0' Update to upstream version '0.31.0' with Debian dir 04dce0f5260ddc7eeb91b361af1581a4c5dea4ec 25 June 2021, 09:26:08 UTC
040bd64 New upstream version 0.31.0 25 June 2021, 09:26:07 UTC
47575a6 Add endpoints to access REMD by id This will be used by swh-web to allow downloading them from a non-JSON endpoint. 15 June 2021, 13:08:23 UTC
036d227 mypy: Fix errors with release >= v0.900 09 June 2021, 12:58:43 UTC
1d880a5 cassandra: Add partial support for ScyllaDB All features work but snapshot_count_branches, because ScyllaDB does not support user-defined aggregates yet. Migration tests hang when run after the regular tests, but I can't figure out why. This should not be an issue for now, as we won't run Scylla tests on the CI. 21 May 2021, 10:14:43 UTC
f6de0ac Updated backport on buster-swh from debian/0.30.1-1_swh1 (unstable-swh) 21 May 2021, 08:28:25 UTC
495e2ca Merge tag 'debian/0.30.1-1_swh1' into debian/buster-swh 21 May 2021, 08:28:24 UTC
a7ebf15 Updated debian changelog for version 0.30.1 21 May 2021, 08:22:33 UTC
d96b4c1 Update upstream source from tag 'debian/upstream/0.30.1' Update to upstream version '0.30.1' with Debian dir c5b4cee51262d233b57e30c0eadf8462f89b1de9 21 May 2021, 08:22:32 UTC
90383fd New upstream version 0.30.1 21 May 2021, 08:22:31 UTC
8e3731a Finalize the config "local" deprecation in favor of "postgresql" This will remove further deprecation warnings from the tests, especially the ones from other modules depending on the storage's pytest-plugin. This also fixes some edge case configuration for the backfill and the storage rpc backend which would have been broken if we switched to that new name prior to this. Related to b487a21f 21 May 2021, 07:38:55 UTC
a92a968 tests: Make test parameters order deterministic, so they don't crash pytest-xdist pytest-xdist expects the parameters to be in the same order in all processes. 19 May 2021, 08:49:09 UTC
5a8d605 test_cassandra: Improve error when the process is started but not listening 19 May 2021, 08:49:01 UTC
f5e852a Updated backport on buster-swh from debian/0.30.0-1_swh1 (unstable-swh) 18 May 2021, 14:52:31 UTC
93f4567 Merge tag 'debian/0.30.0-1_swh1' into debian/buster-swh 18 May 2021, 14:52:31 UTC
6477a88 Updated debian changelog for version 0.30.0 18 May 2021, 14:45:21 UTC
e0fc621 Update upstream source from tag 'debian/upstream/0.30.0' Update to upstream version '0.30.0' with Debian dir bd97b7a393ad1fb6f02d20bb88e4989e57e84535 18 May 2021, 14:45:14 UTC
1ec845d New upstream version 0.30.0 18 May 2021, 14:45:12 UTC
0ed4a97 Make the TenaciousProxyStorage also handle content_add_metadata 18 May 2021, 10:56:15 UTC
f3ce043 Updated backport on buster-swh from debian/0.29.1-1_swh1 (unstable-swh) 14 May 2021, 17:05:44 UTC
9146ddd Merge tag 'debian/0.29.1-1_swh1' into debian/buster-swh 14 May 2021, 17:05:44 UTC
29fa2ad Updated debian changelog for version 0.29.1 14 May 2021, 16:59:42 UTC
72b0f63 Update upstream source from tag 'debian/upstream/0.29.1' Update to upstream version '0.29.1' with Debian dir 7f0734a925b33f8b6d0c2fa241729777d3fbefd4 14 May 2021, 16:59:41 UTC
5b5d0d3 New upstream version 0.29.1 14 May 2021, 16:59:38 UTC
53c21d4 Add missing schema migration for swh_directory_get_entries 14 May 2021, 16:31:00 UTC
00212b1 Updated debian changelog for version 0.29.0 11 May 2021, 13:12:43 UTC
16b2fb4 Update upstream source from tag 'debian/upstream/0.29.0' Update to upstream version '0.29.0' with Debian dir b2c5f0460c76f4ec0b2043303c82c1a0eac711e2 11 May 2021, 13:12:37 UTC
6ae5098 New upstream version 0.29.0 11 May 2021, 13:12:36 UTC
f328367 content_get: Add support for queries by sha1_git Before this commit, the only way to get Content objects from their sha1_git was to call content_find for each object. This was obviously neither convenient nor efficient. Using this endpoint to batch calls reduces the runtime of the git-bare vault cooker by 30%. 11 May 2021, 12:36:30 UTC
e3cbd5e Add endpoint directory_get_entries, to quickly list a directory's entries It spares a join with the content table, which should hopefully make the vault (and possibly other users) faster when they don't need this join. 11 May 2021, 10:00:27 UTC
f140f63 cassandra: Add tests checking directory_add and snapshot_add are atomic. 11 May 2021, 08:22:23 UTC
b487a21 Deprecate the "local" storage cls in favor of "postgresql" 10 May 2021, 12:56:44 UTC
9105253 Move all proxy storages in swh/storage/proxies/ to clean a bit the swh.storage namespace. 10 May 2021, 12:55:07 UTC
7617099 Make the TenaciousProxyStorage retry when a single object add fails give a chance to one-object batches to be ingested, and reduce the number of objects wrongly reported as non-ingested, e.g. during a replayer session, where this situation can occur. 07 May 2021, 11:46:00 UTC
f269f28 Updated debian changelog for version 0.28.0 06 May 2021, 14:06:51 UTC
da53050 Update upstream source from tag 'debian/upstream/0.28.0' Update to upstream version '0.28.0' with Debian dir d8cdc352f7a71a285fd47916abd71279ae22c86a 06 May 2021, 14:06:51 UTC
455191b New upstream version 0.28.0 06 May 2021, 14:06:49 UTC
35ae94a Use swh.core 0.14 It renamed db_name to dbname, which is a breaking change. 06 May 2021, 12:23:09 UTC
652e3d5 tenacious: Document potential issues about objects being dropped 06 May 2021, 09:56:32 UTC
e170fb2 Stop storing authority/fetcher metadata. We still don't have a use for them, and they are causing issues; such as being unable to add an authority/fetcher based only on a REMD object, which is needed by the replayer. 05 May 2021, 10:54:04 UTC
77ef651 Make postgresql's origin_add not raise an error in case of conflict There is no need for a URL insertion in the origin table to result in a uniqueness error. Conflicting insertions of the same URL in this table may happen with concurrent processes (loading or in a replayer session). 05 May 2021, 10:18:44 UTC
ffb38f7 Add a new TenaciousProxyStorage This proxy storage attempts to add buckets of objects, but in case of failure, it splits the bucket in parts so every valid object in the bucket gets a chance to be inserted. Also provides an error rate-limiting feature. This proxy storage is mainly dedicated to help mirroring an archive using the replayer stack. 05 May 2021, 09:57:58 UTC
051b771 cassandra: Add a test of a 'complex' migration, with a PK update 03 May 2021, 15:40:37 UTC
f233461 cassandra: Add 'check_missing' option, to allow updating objects as part of a migration. Also write a first test that simulates how a simple migration would go. 03 May 2021, 15:40:36 UTC
3d6aebd Updated debian changelog for version 0.27.4 29 April 2021, 13:04:47 UTC
e3d9edc Update upstream source from tag 'debian/upstream/0.27.4' Update to upstream version '0.27.4' with Debian dir 71b51f7b5839ceb9cd6c802e5be218d4cd04e29f 29 April 2021, 13:04:41 UTC
2b20af5 New upstream version 0.27.4 29 April 2021, 13:04:39 UTC
92d551a Normalize all Storage.xxx_add() methods to return a summary, except origin_visit_add(), which requires more work to do so. Note that this will change the way 'raw_extrinsic_metadata_add()' reports statsd metrics: the 'method_name' tag will now remain 'raw_extrinsic_metadata_add' instead of a forged '<type_name>_metadata_add'. 29 April 2021, 10:40:33 UTC
ff7ecb4 Properly annotate output of Storage.xxx_add() methods as Dict[str, int] when applicable. 29 April 2021, 10:03:10 UTC
98804f9 Add a fixer for ExtrinsicRawMetadata the 'type' attribute has been removed in swh.model v1.0.0 in favor of an ExtendedSWHID 'target'. 28 April 2021, 12:12:22 UTC
615d719 tox: Add sphinx environments to check sane doc build Enable to check package documentation can be built without producing sphinx warnings. The sphinx environment is designed to be used in continuous integration in order to prevent breaking documentation build when committing changes. The sphinx-dev environment is designed to be used inside a full swh development environment. Related to T3258 27 April 2021, 11:57:23 UTC
2c477ec Fix storage_data hardcoded id values and add a test to check this stays accurate, so that these objects can pass through the validate proxy storage, for example. 23 April 2021, 13:45:35 UTC
eb8c147 cassandra: Deduplicate table names This removes all table names from cassandra/cql.py, and gets them from cassandra/schema.py instead. When possible, this uses existing constants (BaseRow.TABLE), otherwise it uses a function to compute these names. This is needed to support schema migrations, as updating a table's primary key requires creating a new table with a different name. 22 April 2021, 15:22:18 UTC
a1fc5fb cassandra: Use prepared statements in extid_index_* All other statements are, and there is no reason for them not to be too 15 April 2021, 13:56:32 UTC
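
The bucket-splitting behaviour described in commit ffb38f7 (TenaciousProxyStorage) can be illustrated with a minimal Python sketch. This is not the swh-storage implementation: `add_tenaciously`, `add_batch`, and `picky_add` are hypothetical names, halving the bucket is an assumed splitting strategy, and the error rate-limiting feature mentioned in the commit is left out.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def add_tenaciously(
    objects: Sequence[T],
    add_batch: Callable[[Sequence[T]], None],
) -> Tuple[List[T], List[T]]:
    """Try to add `objects` as one bucket; on failure, split and retry.

    Returns (added, rejected). Hypothetical helper, not the swh-storage API.
    """
    if not objects:
        return [], []
    try:
        add_batch(objects)
        return list(objects), []
    except Exception:
        if len(objects) == 1:
            # A one-object bucket that still fails is reported as rejected.
            return [], list(objects)
        # Assumed strategy: split the bucket in two halves and retry each,
        # so valid objects are not dropped along with an invalid one.
        mid = len(objects) // 2
        added_l, rejected_l = add_tenaciously(objects[:mid], add_batch)
        added_r, rejected_r = add_tenaciously(objects[mid:], add_batch)
        return added_l + added_r, rejected_l + rejected_r


# Usage: a stand-in backend that rejects any bucket containing a negative value.
stored: List[int] = []


def picky_add(batch: Sequence[int]) -> None:
    if any(x < 0 for x in batch):
        raise ValueError("invalid object in bucket")
    stored.extend(batch)


added, rejected = add_tenaciously([1, 2, -3, 4], picky_add)
```

In this sketch the invalid object ends up isolated in a one-object bucket and is reported as rejected, while every other object is still inserted.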