874da2d | Antoine R. Dumont (@ardumont) | 13 June 2020, 06:37:57 UTC | Start migrating the validate proxy toward using BaseModel objects This will allow progressing incrementally towards removing it. Once it allows using BaseModel objects everywhere (and tests in test_storage are adapted to use this property), it will be time to remove it entirely (as it's only used in tests). It's preparatory work for future diffs. | 13 June 2020, 06:37:57 UTC |
33efdb0 | Antoine R. Dumont (@ardumont) | 12 June 2020, 16:51:47 UTC | storage*: Do not write twice origin-visit-status in journal Related to T2310 | 12 June 2020, 16:52:19 UTC |
b9f874d | Jenkins for Software Heritage | 12 June 2020, 07:27:26 UTC | Updated backport on buster-swh from debian/0.3.0-1_swh1 (unstable-swh) | 12 June 2020, 07:27:26 UTC |
a097f5c | Jenkins for Software Heritage | 12 June 2020, 07:27:26 UTC | Merge tag 'debian/0.3.0-1_swh1' into debian/buster-swh | 12 June 2020, 07:27:26 UTC |
a6a2258 | Jenkins for Software Heritage | 12 June 2020, 07:22:04 UTC | Updated debian changelog for version 0.3.0 | 12 June 2020, 07:22:04 UTC |
2451932 | Jenkins for Software Heritage | 12 June 2020, 07:22:03 UTC | Update upstream source from tag 'debian/upstream/0.3.0' Update to upstream version '0.3.0' with Debian dir 4877d8c4841fb20259ea4f9afc0ae6fb0ebf0453 | 12 June 2020, 07:22:03 UTC |
1661ef9 | Jenkins for Software Heritage | 12 June 2020, 07:22:01 UTC | New upstream version 0.3.0 | 12 June 2020, 07:22:01 UTC |
37c4530 | Antoine R. Dumont (@ardumont) | 11 June 2020, 12:57:02 UTC | storage*: Align origin-visit-add to take iterable of OriginVisit objects This makes its api consistent with other add endpoints. This is preparatory work towards removing origin-visit-upsert. Related to T2310 | 11 June 2020, 16:55:33 UTC |
5d61633 | Antoine R. Dumont (@ardumont) | 11 June 2020, 12:32:02 UTC | test: Remove dead code 1. obj_type is now origin_visit_status, so this branch is never reached. 2. Those objects now have a storage endpoint anyway. So it's dead code alright. Related to T2310 | 11 June 2020, 12:33:53 UTC |
0cfc7d4 | Jenkins for Software Heritage | 10 June 2020, 10:08:17 UTC | Updated backport on buster-swh from debian/0.2.0-1_swh1 (unstable-swh) | 10 June 2020, 10:08:17 UTC |
ddbbd2f | Jenkins for Software Heritage | 10 June 2020, 10:08:16 UTC | Merge tag 'debian/0.2.0-1_swh1' into debian/buster-swh | 10 June 2020, 10:08:16 UTC |
d252e7b | Jenkins for Software Heritage | 10 June 2020, 10:02:45 UTC | Updated debian changelog for version 0.2.0 | 10 June 2020, 10:02:45 UTC |
9e69fce | Jenkins for Software Heritage | 10 June 2020, 10:02:45 UTC | Update upstream source from tag 'debian/upstream/0.2.0' Update to upstream version '0.2.0' with Debian dir d8da5f658daec84fd358a5f80ec0930e9282a340 | 10 June 2020, 10:02:45 UTC |
3a0e49f | Jenkins for Software Heritage | 10 June 2020, 10:02:43 UTC | New upstream version 0.2.0 | 10 June 2020, 10:02:43 UTC |
d68c7ec | Antoine R. Dumont (@ardumont) | 10 June 2020, 08:29:09 UTC | origin-visit-upsert: Write visit status objects to the journal Related to T2310 | 10 June 2020, 08:51:51 UTC |
86d05fb | Antoine R. Dumont (@ardumont) | 08 June 2020, 14:27:11 UTC | origin-visit-update: Write visit status objects to the journal Related to T2310 | 09 June 2020, 12:33:58 UTC |
0860920 | Antoine R. Dumont (@ardumont) | 08 June 2020, 09:42:13 UTC | origin-visit-add: Write visit status to the journal This also makes the instruction order consistent across the different storage implementations. First, write objects to the journal, then write objects to the storage backend. Related to T2310 | 09 June 2020, 12:29:57 UTC |
7eb44d4 | Valentin Lorentz | 08 June 2020, 09:53:08 UTC | Add pagination to origin_metadata_get. | 08 June 2020, 14:02:15 UTC |
26a8d4f | Valentin Lorentz | 04 June 2020, 15:06:26 UTC | Add SortedList.iter_after. Strict version of iter_from. I'll need it for pagination. | 08 June 2020, 14:01:48 UTC |
6ebdc2f | Valentin Lorentz | 04 June 2020, 10:33:17 UTC | Deduplicate origin-metadata when they have the same authority + discovery_date + fetcher. By replacing the old value with the new one. This will allow an easy implementation of pagination, using the fetcher id as an opaque page_token. Plus, it did not make sense logically to have different metadata from the same authority at the same time (especially with the same fetcher). | 08 June 2020, 14:01:30 UTC |
dcef916 | Antoine R. Dumont (@ardumont) | 03 June 2020, 09:12:34 UTC | Open `origin_visit_status_add` endpoint to add origin visit statuses Related to T2310 | 05 June 2020, 16:32:12 UTC |
88271f8 | David Douard | 07 May 2020, 13:33:35 UTC | Add a replayer test for anonymized journal topics This new test checks the behavior of the storage replayer mechanism when replaying a journal with privileged topics. | 05 June 2020, 16:10:42 UTC |
c75da7a | David Douard | 05 June 2020, 10:25:58 UTC | Small refactoring of the InMemoryStorage to make it more consistent - make self._persons a dict - make self._snapshots value Snapshot only instead of the couple (Snapshot, sorted_branch_names) | 05 June 2020, 10:28:04 UTC |
416988f | Jenkins for Software Heritage | 04 June 2020, 15:01:30 UTC | Updated backport on buster-swh from debian/0.1.1-1_swh1 (unstable-swh) | 04 June 2020, 15:01:30 UTC |
c662872 | Jenkins for Software Heritage | 04 June 2020, 15:01:30 UTC | Merge tag 'debian/0.1.1-1_swh1' into debian/buster-swh | 04 June 2020, 15:01:30 UTC |
3e80cea | Jenkins for Software Heritage | 04 June 2020, 14:56:54 UTC | Updated debian changelog for version 0.1.1 | 04 June 2020, 14:56:54 UTC |
24fd84f | Jenkins for Software Heritage | 04 June 2020, 14:56:53 UTC | Update upstream source from tag 'debian/upstream/0.1.1' Update to upstream version '0.1.1' with Debian dir b2713ef77f0899245bffcf192b65115addf2e771 | 04 June 2020, 14:56:53 UTC |
91c3d65 | Jenkins for Software Heritage | 04 June 2020, 14:56:52 UTC | New upstream version 0.1.1 | 04 June 2020, 14:56:52 UTC |
25f584f | Nicolas Dandrimont | 04 June 2020, 14:14:07 UTC | Use explicit configuration (without journal writer) for algos tests Using the in-memory journal writer sometimes makes the tests hang when (very) large objects are used. This works around the issue. | 04 June 2020, 14:14:07 UTC |
48953aa | David Douard | 04 June 2020, 12:27:22 UTC | d/changelog: fix the release | 04 June 2020, 12:27:22 UTC |
3d665d6 | David Douard | 04 June 2020, 11:41:28 UTC | d/changelog: version 0.1.0-2~swh1 | 04 June 2020, 11:41:28 UTC |
54dadc4 | David Douard | 04 June 2020, 11:10:53 UTC | d/control: update dependencies | 04 June 2020, 11:10:53 UTC |
68aa839 | Jenkins for Software Heritage | 04 June 2020, 10:28:44 UTC | Updated debian changelog for version 0.1.0 | 04 June 2020, 10:28:44 UTC |
c96ef27 | Jenkins for Software Heritage | 04 June 2020, 10:28:43 UTC | Update upstream source from tag 'debian/upstream/0.1.0' Update to upstream version '0.1.0' with Debian dir d026955005c1db88f5c6572238c3278592a3c01c | 04 June 2020, 10:28:43 UTC |
525cf13 | Jenkins for Software Heritage | 04 June 2020, 10:28:41 UTC | New upstream version 0.1.0 | 04 June 2020, 10:28:41 UTC |
f9b2ca3 | David Douard | 29 May 2020, 12:28:31 UTC | Replace MockedJournalClient and MockedKafkaWriter by proper kafka test scaffolding This also kills test_write_replay.py file since it does not test anything more than what is currently tested in test_replay.py. | 04 June 2020, 09:46:12 UTC |
ad9c9bb | David Douard | 02 June 2020, 15:25:18 UTC | Adapt to swh.model 0.3 in which List attributes have been replaced by Tuple ones. This requires a bit of adaptation in the code of the ValidatingProxyStorage to ensure dict representations of revision objects are properly typed. The test_api_client_dicts.py has been removed since it's not really useful any more and would require a fair amount of work to fix. | 04 June 2020, 09:46:12 UTC |
eef4900 | David Douard | 29 May 2020, 12:21:55 UTC | Fix InMemoryStorage.origin_visit_upsert() method: the self._origin_visits[origin_url] list was built one element too big (since visit ids start from 1 and not 0). This is needed to ease writing replayer tests (by comparing these lists). | 29 May 2020, 13:23:28 UTC |
6c6080b | Valentin Lorentz | 28 May 2020, 15:03:13 UTC | Fix type annotation. | 28 May 2020, 15:03:13 UTC |
9332547 | Valentin Lorentz | 28 May 2020, 14:34:53 UTC | Remove function drops from the migration. I committed these two lines by mistake | 28 May 2020, 14:34:53 UTC |
107e2b7 | Jenkins for Software Heritage | 28 May 2020, 12:42:14 UTC | Updated backport on buster-swh from debian/0.0.193-1_swh1 (unstable-swh) | 28 May 2020, 12:42:14 UTC |
e59f1a7 | Jenkins for Software Heritage | 28 May 2020, 12:42:14 UTC | Merge tag 'debian/0.0.193-1_swh1' into debian/buster-swh | 28 May 2020, 12:42:14 UTC |
d39c55f | Jenkins for Software Heritage | 28 May 2020, 12:37:58 UTC | Updated debian changelog for version 0.0.193 | 28 May 2020, 12:37:58 UTC |
79b686f | Jenkins for Software Heritage | 28 May 2020, 12:37:57 UTC | Update upstream source from tag 'debian/upstream/0.0.193' Update to upstream version '0.0.193' with Debian dir 9983b90d222aedc6e3a45d694c9d8dcb2ae4d6cd | 28 May 2020, 12:37:57 UTC |
14b3019 | Jenkins for Software Heritage | 28 May 2020, 12:37:55 UTC | New upstream version 0.0.193 | 28 May 2020, 12:37:55 UTC |
2209d31 | Antoine R. Dumont (@ardumont) | 27 May 2020, 11:41:28 UTC | README: Update necessary dependencies for test purposes This also adds a mention on how to avoid running the cassandra tests. | 28 May 2020, 12:25:31 UTC |
7cb3694 | Antoine R. Dumont (@ardumont) | 28 May 2020, 12:07:04 UTC | 152: Fix typo | 28 May 2020, 12:07:04 UTC |
738d648 | Antoine R. Dumont (@ardumont) | 28 May 2020, 11:32:16 UTC | db: Use query_params instead of mogrify Related to D3180 | 28 May 2020, 11:32:16 UTC |
cada7fc | Antoine R. Dumont (@ardumont) | 27 May 2020, 12:11:22 UTC | pg: Write origin visit updates & status, read from origin_visit_status That still writes new origin visit and origin visit status but reads from origin visit status. This partially reverts commit b0b767b91ca077a14368eaac1f98120261d7460c [1] [1] Related to D3101 | 28 May 2020, 11:08:10 UTC |
fe877ce | Valentin Lorentz | 28 May 2020, 09:14:10 UTC | Make content.blake2s256 not null. There is no reason to allow null anymore; and the production DB doesn't have any. | 28 May 2020, 09:14:10 UTC |
b0bff97 | Valentin Lorentz | 28 May 2020, 08:41:32 UTC | Remove unused SQL functions. | 28 May 2020, 08:50:15 UTC |
1aff3c6 | Valentin Lorentz | 28 May 2020, 08:21:35 UTC | Add a pre-commit hook to check there are version bumps in sql/upgrades/*.sql | 28 May 2020, 08:21:50 UTC |
3613438 | Valentin Lorentz | 28 May 2020, 07:54:12 UTC | Add missing dbversion bump in 150.sql. | 28 May 2020, 07:54:12 UTC |
213f1b1 | Valentin Lorentz | 14 May 2020, 12:50:46 UTC | Add artifact metadata to the extrinsic metadata storage specification. | 26 May 2020, 11:01:52 UTC |
f1b51a7 | Antoine R. Dumont (@ardumont) | 19 May 2020, 20:29:14 UTC | Add not null constraints to metadata_authority/origin_metadata Out of 149, I understood we wanted to make not null constraints on the following tables, so here it goes. This commit fixes that, first migrating the data that does not respect those constraints. Related to D2988 Related to T2075 | 20 May 2020, 10:08:45 UTC |
c69d100 | Antoine R. Dumont (@ardumont) | 19 May 2020, 18:01:33 UTC | Realign schema with latest 149 migration script If we really want to make those not null, please let's make another migration script. | 19 May 2020, 19:59:24 UTC |
8ed0fde | Jenkins for Software Heritage | 19 May 2020, 16:58:29 UTC | Updated backport on buster-swh from debian/0.0.192-1_swh1 (unstable-swh) | 19 May 2020, 16:58:29 UTC |
f2b60a6 | Jenkins for Software Heritage | 19 May 2020, 16:58:29 UTC | Merge tag 'debian/0.0.192-1_swh1' into debian/buster-swh | 19 May 2020, 16:58:29 UTC |
d240458 | Jenkins for Software Heritage | 19 May 2020, 16:54:00 UTC | Updated debian changelog for version 0.0.192 | 19 May 2020, 16:54:00 UTC |
46e8506 | Jenkins for Software Heritage | 19 May 2020, 16:53:59 UTC | Update upstream source from tag 'debian/upstream/0.0.192' Update to upstream version '0.0.192' with Debian dir de8ef24fd1c59087792d6b671fbef01bf9d69216 | 19 May 2020, 16:53:59 UTC |
1523dbc | Jenkins for Software Heritage | 19 May 2020, 16:53:57 UTC | New upstream version 0.0.192 | 19 May 2020, 16:53:57 UTC |
8c2ee70 | Valentin Lorentz | 19 May 2020, 16:24:21 UTC | origin_metadata_add: Reject non-bytes types for 'metadata'. The left-over jsonize() allowed passing a dict under some circumstances, which allowed tests to pass but failed in production. | 19 May 2020, 16:29:24 UTC |
383128a | Jenkins for Software Heritage | 19 May 2020, 11:56:40 UTC | Updated backport on buster-swh from debian/0.0.191-1_swh1 (unstable-swh) | 19 May 2020, 11:56:40 UTC |
150b957 | Jenkins for Software Heritage | 19 May 2020, 11:56:40 UTC | Merge tag 'debian/0.0.191-1_swh1' into debian/buster-swh | 19 May 2020, 11:56:40 UTC |
e74fc42 | Jenkins for Software Heritage | 19 May 2020, 11:52:01 UTC | Updated debian changelog for version 0.0.191 | 19 May 2020, 11:52:01 UTC |
ddc54d7 | Jenkins for Software Heritage | 19 May 2020, 11:52:00 UTC | Update upstream source from tag 'debian/upstream/0.0.191' Update to upstream version '0.0.191' with Debian dir 4cea5d2fec1933eeeaf0a6b6c11cdb15201f3a44 | 19 May 2020, 11:52:00 UTC |
0671780 | Jenkins for Software Heritage | 19 May 2020, 11:51:58 UTC | New upstream version 0.0.191 | 19 May 2020, 11:51:58 UTC |
e645e63 | Valentin Lorentz | 09 April 2020, 09:46:21 UTC | Implement extrinsic origin metadata specification. https://docs.softwareheritage.org/devel/swh-storage/extrinsic-metadata-specification.html | 18 May 2020, 16:26:06 UTC |
60fca42 | Jenkins for Software Heritage | 18 May 2020, 12:22:58 UTC | Updated backport on buster-swh from debian/0.0.190-1_swh1 (unstable-swh) | 18 May 2020, 12:22:58 UTC |
d7347dc | Jenkins for Software Heritage | 18 May 2020, 12:22:57 UTC | Merge tag 'debian/0.0.190-1_swh1' into debian/buster-swh | 18 May 2020, 12:22:57 UTC |
00b50cb | Jenkins for Software Heritage | 18 May 2020, 12:18:10 UTC | Updated debian changelog for version 0.0.190 | 18 May 2020, 12:18:10 UTC |
e8713ec | Jenkins for Software Heritage | 18 May 2020, 12:18:09 UTC | Update upstream source from tag 'debian/upstream/0.0.190' Update to upstream version '0.0.190' with Debian dir 59f1769d77bb59afe16d2f65ad2bfae72fcc4e40 | 18 May 2020, 12:18:09 UTC |
08f99ed | Jenkins for Software Heritage | 18 May 2020, 12:18:07 UTC | New upstream version 0.0.190 | 18 May 2020, 12:18:07 UTC |
6d24ed7 | Antoine R. Dumont (@ardumont) | 18 May 2020, 11:34:45 UTC | storage: metadata_provider: Ensure idempotency when creating provider Currently, in production, we have duplicated entries in origin_metadata and metadata_provider. This is due to a bad primary key on metadata_provider. This commit ensures uniqueness when adding new metadata_provider by adding a new unique primary key on (provider_type, provider_url). This is preparatory work to allow a smooth transition for D2988 (which will need to be rebased). Related to T2075 | 18 May 2020, 11:41:05 UTC |
87f7bee | David Douard | 07 April 2020, 14:20:35 UTC | journal: add a skipped_content topic dedicated to SkippedContent objects instead of mixing them with Content in the content topic. | 13 May 2020, 10:12:41 UTC |
306aa69 | David Douard | 11 May 2020, 13:18:05 UTC | Add missing return annotations on JournalWriter methods | 11 May 2020, 14:25:22 UTC |
b04cb8f | David Douard | 11 May 2020, 09:20:38 UTC | Improve a bit the exception message of JournalWriter.content_update | 11 May 2020, 14:25:22 UTC |
46f9a7a | David Douard | 11 May 2020, 09:17:05 UTC | Refactor the JournalWriter class to normalize its methods make all the add_xxx methods take an Iterable as argument. Also extract the conditional execution of the underlying journal method calls in a dedicated method to prevent repeating the ``if self.journal`` all over the place. Ensure all the storage code that uses this JournalWriter class is updated accordingly. | 11 May 2020, 14:25:20 UTC |
e64944b | David Douard | 11 May 2020, 14:15:51 UTC | tests: fix test_replay; only use aware datetime objects, naive dates now being properly forbidden. | 11 May 2020, 14:21:29 UTC |
4edf9e1 | Antoine R. Dumont (@ardumont) | 07 May 2020, 13:53:25 UTC | test_kafka_writer: Add missing object type skipped_content | 07 May 2020, 14:56:08 UTC |
8412d49 | Stefano Zacchiroli | 07 May 2020, 14:53:54 UTC | swh-schema.sql: improve comments on revision columns Reviewers: #reviewers, vlorentz, ardumont Reviewed By: #reviewers, vlorentz, ardumont Subscribers: ardumont, vlorentz Differential Revision: https://forge.softwareheritage.org/D2825 | 07 May 2020, 14:54:24 UTC |
74cffb7 | Stefano Zacchiroli | 07 May 2020, 14:53:54 UTC | swh-schema.sql: improve comments on revision columns Reviewers: #reviewers, vlorentz, ardumont Reviewed By: #reviewers, vlorentz, ardumont Subscribers: ardumont, vlorentz Differential Revision: https://forge.softwareheritage.org/D2825 | 07 May 2020, 14:53:54 UTC |
68eb6b3 | David Douard | 07 May 2020, 07:36:32 UTC | Update test_kafka_writer for swh.journal 0.1.0 | 07 May 2020, 08:26:24 UTC |
1443fc5 | Jenkins for Software Heritage | 04 May 2020, 16:19:47 UTC | Updated backport on buster-swh from debian/0.0.189-1_swh1 (unstable-swh) | 04 May 2020, 16:19:47 UTC |
c7d97ae | Jenkins for Software Heritage | 04 May 2020, 16:19:47 UTC | Merge tag 'debian/0.0.189-1_swh1' into debian/buster-swh | 04 May 2020, 16:19:47 UTC |
e07d234 | Jenkins for Software Heritage | 04 May 2020, 11:22:30 UTC | Updated backport on buster-swh from debian/0.0.188-1_swh1 (unstable-swh) | 04 May 2020, 11:22:30 UTC |
83e5eb6 | Jenkins for Software Heritage | 04 May 2020, 11:22:29 UTC | Merge tag 'debian/0.0.188-1_swh1' into debian/buster-swh | 04 May 2020, 11:22:29 UTC |
7d9bbfe | Jenkins for Software Heritage | 30 April 2020, 12:58:57 UTC | Updated debian changelog for version 0.0.189 | 30 April 2020, 12:58:57 UTC |
94b6946 | Jenkins for Software Heritage | 30 April 2020, 12:58:56 UTC | Update upstream source from tag 'debian/upstream/0.0.189' Update to upstream version '0.0.189' with Debian dir c15d9ed71a3944e12dd13ce35b89362c2e731d92 | 30 April 2020, 12:58:56 UTC |
b579670 | Jenkins for Software Heritage | 30 April 2020, 12:58:55 UTC | New upstream version 0.0.189 | 30 April 2020, 12:58:55 UTC |
b0b767b | Antoine R. Dumont (@ardumont) | 30 April 2020, 11:20:08 UTC | pg: Write both origin visit updates & status, read from origin_visit This partially reverts commit [1]. That now (still) writes new origin visit status... But, as before [1]: - update origin visit (with same values as origin visit status) - read from origin visit That does not revert the new in-memory (D2937) nor the cassandra (D2939) storage implementations. [1] a720caed6eebbb68a9f9b5be554a52859aa052d6 D2938 Related to D2938 Related to T2310#44043 | 30 April 2020, 11:52:13 UTC |
0e8234f | Antoine R. Dumont (@ardumont) | 29 April 2020, 10:10:17 UTC | pg-storage: Add new created state Related to T2310 | 30 April 2020, 11:43:05 UTC |
2b95dd3 | Stefano Zacchiroli | 29 April 2020, 16:33:40 UTC | setup.py: add documentation link | 29 April 2020, 16:33:40 UTC |
4dc2eb6 | Valentin Lorentz | 29 April 2020, 15:02:50 UTC | metadata spec: Fix title hierarchy | 29 April 2020, 15:02:50 UTC |
707f647 | Valentin Lorentz | 29 April 2020, 11:03:15 UTC | tests: Use aware datetimes instead of naive ones. Production should only use aware datetimes. | 29 April 2020, 11:03:15 UTC |
e3e76c4 | Antoine R. Dumont (@ardumont) | 30 March 2020, 11:08:32 UTC | cassandra: Adapt internal implementations to use origin visit update Related to T2310 | 28 April 2020, 14:46:47 UTC |
a720cae | Antoine R. Dumont (@ardumont) | 26 March 2020, 13:15:17 UTC | pg-storage: Adapt internal implementations to use origin visit update Related to T2310 | 28 April 2020, 14:46:46 UTC |
ead8088 | Antoine R. Dumont (@ardumont) | 25 March 2020, 16:53:48 UTC | in_memory: Adapt internal implementations to use origin visit update (pairing with @vlorentz) Related to T2310 | 28 April 2020, 14:46:43 UTC |
baa127e | Jenkins for Software Heritage | 28 April 2020, 11:52:09 UTC | Updated debian changelog for version 0.0.188 | 28 April 2020, 11:52:09 UTC |
03ad3ba | Jenkins for Software Heritage | 28 April 2020, 11:52:08 UTC | Update upstream source from tag 'debian/upstream/0.0.188' Update to upstream version '0.0.188' with Debian dir 50f1d39b3b3add4e066e19f2a7225c144133f3af | 28 April 2020, 11:52:08 UTC |