03e17c3 | Antoine R. Dumont (@ardumont) | 20 July 2020, 08:12:13 UTC | test_storage: skipped_content/content_missing: Use data model object Related to T2494 | 20 July 2020, 09:01:26 UTC |
7131dcb | Valentin Lorentz | 07 July 2020, 11:07:33 UTC | Make metadata-related endpoints consistent with other endpoints by using Iterables of swh-model objects instead of a dict. | 20 July 2020, 08:48:35 UTC |
997ec1d | Antoine R. Dumont (@ardumont) | 17 July 2020, 15:51:08 UTC | test.storage: content_add_metadata: Use data model object Related to T2494 | 17 July 2020, 15:51:08 UTC |
8c2d635 | Antoine R. Dumont (@ardumont) | 17 July 2020, 15:00:29 UTC | test.storage: content_add: Use data model object Related to T2494 | 17 July 2020, 15:16:20 UTC |
2b239f0 | Antoine R. Dumont (@ardumont) | 17 July 2020, 14:37:31 UTC | test_cassandra: Use data model object Related to T2494 | 17 July 2020, 14:37:31 UTC |
eb2bf8c | Antoine R. Dumont (@ardumont) | 17 July 2020, 12:50:34 UTC | test_db: Drop redundant test This is already tested through the test_storage scenario | 17 July 2020, 12:50:34 UTC |
04d25df | Antoine R. Dumont (@ardumont) | 16 July 2020, 15:31:28 UTC | test_cli: Use snapshot model object within test That commit is not so interesting. But at least we validate the snapshot is correct prior to sending it. Also that removes a bit of duplicated storage configuration. Related to T2494 | 16 July 2020, 15:52:50 UTC |
2d4f727 | Antoine R. Dumont (@ardumont) | 16 July 2020, 13:33:36 UTC | algos.test_origin: Use data model object and drop validate proxy use Related to T2494 | 16 July 2020, 15:16:36 UTC |
97a0721 | Antoine R. Dumont (@ardumont) | 16 July 2020, 13:17:33 UTC | algos.test_snapshot: Use model objects from sample_data_model This opens up origin_visits and adds new snapshots to the fixture. So we can reuse those. Related to T2494 | 16 July 2020, 13:34:23 UTC |
b6971b5 | Antoine R. Dumont (@ardumont) | 16 July 2020, 13:06:55 UTC | pytest_plugin: Ensure fixture instantiates correctly Related to T2484 Should fix [1] [1] https://jenkins.softwareheritage.org/view/Debian%20packages/job/debian/job/packages/job/DLDBASE/job/gbp-buildpackage/154/console | 16 July 2020, 13:13:05 UTC |
3abf6b3 | Antoine R. Dumont (@ardumont) | 16 July 2020, 10:30:55 UTC | pytest_plugin: Do not expose the validate proxy storage Only the storage module needs it. This should fix the debian jenkins build [1] [1] https://jenkins.softwareheritage.org/view/Debian%20packages/job/debian/job/packages/job/DLDBASE/job/gbp-buildpackage/153/console | 16 July 2020, 12:00:46 UTC |
a688e82 | Antoine R. Dumont (@ardumont) | 16 July 2020, 08:26:03 UTC | test_revision_bw_compat: Use revision model object Related to T2494 | 16 July 2020, 10:08:33 UTC |
21efe2a | Antoine R. Dumont (@ardumont) | 15 July 2020, 18:05:46 UTC | test_filter: Use model objects in tests and drop validate proxy | 16 July 2020, 09:28:21 UTC |
2ff4c6f | Antoine R. Dumont (@ardumont) | 15 July 2020, 17:29:45 UTC | test_buffer: Use model objects in tests and drop validate proxy | 16 July 2020, 09:28:21 UTC |
df45641 | Antoine R. Dumont (@ardumont) | 15 July 2020, 17:07:32 UTC | test_retry: Drop validate proxy when we can When we use the sample_data_model (almost all object types except the metadata ones), we can use a storage with no validate proxy. Depends on D3510 | 16 July 2020, 09:28:21 UTC |
14b1648 | Antoine R. Dumont (@ardumont) | 15 July 2020, 15:40:04 UTC | test_retry: Use sample_data_model fixture to manipulate model objects | 16 July 2020, 09:28:20 UTC |
df3f46d | Antoine R. Dumont (@ardumont) | 15 July 2020, 15:03:33 UTC | pytest-plugin: Expose a sample_data_model fixture This is almost the same fixture as sample_data except: - it's BaseModel object instance within - not complete as we cannot convert yet the metadata objects (there is a diff pending which will allow it but right now we cannot). The next commits will use this fixture to allow the switch from dict to model objects. | 16 July 2020, 09:28:20 UTC |
8bc7944 | Antoine R. Dumont (@ardumont) | 16 July 2020, 07:33:00 UTC | pytest_plugin: Avoid fixture client to declare optional dependency Prior to this commit, this would make swh_storage_backend_config fixture clients need to declare an optional dependency on swh.journal. Otherwise, it would not work [1]. This commit fixes it by dropping this configuration in the main pytest plugin. It keeps the storage tests testing with that journal_writer collaborator though by declaring an override which still provides it. This fixes the debian build [1] [1] https://jenkins.softwareheritage.org/view/Debian%20packages/job/debian/job/packages/job/DLDBASE/job/gbp-buildpackage/152/console | 16 July 2020, 07:33:00 UTC |
f5811da | Antoine R. Dumont (@ardumont) | 12 July 2020, 13:57:47 UTC | Allow cassandra binary path to be configured through env variable The current hard-coded value won't work for other distributions not relying on standard conventions (e.g. nixos...). This keeps the original behavior and only allows diverging from it through the environment variable SWH_CASSANDRA_BIN. This also: - fixes an issue where a nonexistent log path raises an error. - renames the other env variable LOG_CASSANDRA to SWH_CASSANDRA_LOG (for consistency) | 15 July 2020, 10:09:55 UTC |
1a8924b | Antoine R. Dumont (@ardumont) | 11 July 2020, 06:42:17 UTC | 158: Make schema and migration converge so the migration works In the end, the order of the revision entry matters whether we select * or not. So the select must match the order defined in the revision_entry type. Otherwise, a type mismatch error occurs [1] [1] psql:sql/upgrades/158.sql:74: ERROR: return type mismatch in function declared to return revision_entry | 11 July 2020, 06:52:26 UTC |
9219a23 | Antoine Lambert | 10 July 2020, 14:02:30 UTC | in_memory: Fix snapshot_get_branches regression with target_types When providing target_types parameter, snapshot branches must be sorted when iterating otherwise wrong branches can be returned. | 10 July 2020, 14:15:54 UTC |
23318c2 | Antoine R. Dumont (@ardumont) | 09 July 2020, 08:11:01 UTC | setup: Do not expose the pytest-plugin any longer Defining the pytest-plugin through the pytest-plugin [1] makes it loaded by default. This creates loading issues on modules depending on storage but not on the pytest plugin storage exposes. It was explained in the doc and I did not realize [2] Instead we'll explicitly define it in modules depending on the pytest plugins, in their root conftest [3]: ``` pytest_plugins = [ "swh.storage.pytest_plugin" ] ``` [1] https://docs.pytest.org/en/stable/writing_plugins.html#setuptools-entry-points [2] https://docs.pytest.org/en/stable/writing_plugins.html#plugin-discovery-order-at-tool-startup [3] https://docs.pytest.org/en/stable/writing_plugins.html#requiring-loading-plugins-in-a-test-module-or-conftest-file Related to T2484 | 10 July 2020, 06:19:39 UTC |
124e76d | Nicolas Dandrimont | 09 July 2020, 17:38:29 UTC | Rework dia -> pdf pipeline for inkscape 1.0 - Use dia directly to convert from .dia to .svg (inkscape would use dia via a plugin anyway) - Add proper runes to detect inkscape >= 1 and use the export options for that. | 09 July 2020, 17:38:29 UTC |
de38cd1 | Valentin Lorentz | 09 July 2020, 16:02:10 UTC | Remove overhead of to_dict/from_dict in test_snapshot_large. This should make it fast enough not to exceed the deadline. | 09 July 2020, 16:02:10 UTC |
e415488 | Valentin Lorentz | 09 July 2020, 15:58:12 UTC | in_memory: Fix quadratic run time in snapshot_get_branches. snapshot.branches is now an ImmutableDict, which is backed by a tuple of tuples; so random accesses now take a linear time instead of a constant time. This commit replaces random accesses with a single scan of all the items, and does existence checks in a set instead. | 09 July 2020, 15:59:48 UTC |
c3803ef | David Douard | 09 July 2020, 09:54:07 UTC | Fix a typo I introduced in previous revision dict(x if x is not None else None) != dict(x) if x is not None else None... | 09 July 2020, 09:56:35 UTC |
8bf3794 | Valentin Lorentz | 09 July 2020, 08:27:12 UTC | Convert ImmutableDict to dict before passing it to json.dumps. To work with the new swh-model version, which uses ImmutableDict in model objects. | 09 July 2020, 09:31:50 UTC |
c21d0e3 | Antoine R. Dumont (@ardumont) | 07 July 2020, 09:09:25 UTC | Move sharable fixtures out of conftest into a dedicated pytest plugin This will allow loaders to reuse those dedicated fixtures within their code base without having to import the swh.storage.tests.conftest module. Related to T2484 | 08 July 2020, 09:50:21 UTC |
e45ca76 | Antoine R. Dumont (@ardumont) | 07 July 2020, 10:07:18 UTC | Migrate from vcversioner to setuptools-scm Related to T2105 | 07 July 2020, 15:42:30 UTC |
5ab7023 | David Douard | 03 July 2020, 09:45:06 UTC | Extract revision's extra_header as a top level attribute Follows swh.model's evolution for the Revision model class. In Postgresql, store the extra_headers as a bytea[][]. Ensure data present in postgres with extra_headers in the metadata field are properly supported by the pg-backed storage. Get rid of the (now useless) git_headers_to_db() converter function. In Cassandra, store them as frozen<list<list<blob>>>. | 07 July 2020, 14:49:14 UTC |
8010848 | Antoine R. Dumont (@ardumont) | 06 July 2020, 07:45:40 UTC | storage: Send metrics from the origin_add endpoint Prior to this commit, since the loaders got migrated to use the main endpoint, no metrics were sent for the origin any longer. This commit fixes it. It also drops the send_metrics call from the deprecated endpoint origin_add_one (which, as an implementation detail, calls the other one). | 06 July 2020, 07:45:40 UTC |
95fd660 | Antoine R. Dumont (@ardumont) | 03 July 2020, 15:54:04 UTC | pg-storage: Add missing cur parameter passing Although, this also pulled a refactoring on the insertion query as the default naive approach ended up with issues on cur already being closed [1] [1] Related to P715 Related to D3416 | 03 July 2020, 15:54:04 UTC |
348bc7b | Antoine R. Dumont (@ardumont) | 03 July 2020, 14:31:22 UTC | storage.db: Drop db.origin_visit_upsert behavior The initial desired behavior was to allow creation of origin-visits if they already had their id set. This is what's needed for the replayer to actually work. But somehow, this left the possibility to update the origin-visit... This commit fixes it by dropping conflicting origin-visits if any. In effect, we can no longer overwrite origin-visits (pg-storage wise). Related to T2310 | 03 July 2020, 14:42:20 UTC |
248c277 | Valentin Lorentz | 30 June 2020, 14:35:59 UTC | Move tests of content_metadata_* next to origin_metadata_* For consistency with the main code. | 02 July 2020, 09:04:43 UTC |
f2619b6 | Antoine R. Dumont (@ardumont) | 01 July 2020, 13:39:59 UTC | Rework 157 migration to ease replication setup Past experience showed that altering tables is more stressful than plain creation. As in here. Related to T2306 Related P707 | 01 July 2020, 13:41:16 UTC |
312127a | Antoine R. Dumont (@ardumont) | 30 June 2020, 13:22:59 UTC | storage*: Drop intermediary conversion step into OriginVisit This is no longer possible as OriginVisit no longer holds the same information as OriginVisitStatus. This will allow dropping those fields entirely from the model. Related to T2310 | 30 June 2020, 13:54:01 UTC |
953bd29 | Valentin Lorentz | 30 June 2020, 13:11:49 UTC | pg: use 'on conflict do nothing' strategy for duplicate metadata rows. "updates are a problem for postgresql logical replication" | 30 June 2020, 13:25:53 UTC |
00f97f0 | Valentin Lorentz | 30 June 2020, 13:06:03 UTC | Document that the behavior of adding a duplicate non-intrinsic object is unspecified. | 30 June 2020, 13:06:03 UTC |
4c2bdad | Valentin Lorentz | 30 June 2020, 12:56:20 UTC | Make the code location of metadata endpoints consistent across backends. | 30 June 2020, 12:56:20 UTC |
ffe6b92 | Valentin Lorentz | 25 June 2020, 15:55:55 UTC | Add content_metadata_{add,get}. | 30 June 2020, 10:31:59 UTC |
869679a | Valentin Lorentz | 25 June 2020, 15:53:06 UTC | Add context columns to object_metadata table and object_metadata_{add,get}. Not used/tested yet; will be used when I introduce content_metadata_{get,add}. | 30 June 2020, 10:31:59 UTC |
27e9426 | Valentin Lorentz | 25 June 2020, 15:42:31 UTC | Generalize origin_metadata to allow support for other object types in the future. | 30 June 2020, 10:31:21 UTC |
1f0e256 | Valentin Lorentz | 30 June 2020, 08:19:38 UTC | Work around the segmentation faults caused by pytest-coverage + multiprocessing. | 30 June 2020, 08:23:25 UTC |
dc1878b | David Douard | 29 June 2020, 14:23:45 UTC | Make release_add support adding the same object twice in the same call This is an edge case, but the mirror infrastructure is apparently hitting it. We modify the SQL query to be properly idempotent. Also ensure in_memory and cassandra backends behave the same. Note: this revision was mostly written by Nicolas Dandrimont <nicolas@dandrimont.eu>. | 29 June 2020, 15:27:21 UTC |
10443b8 | Antoine R. Dumont (@ardumont) | 25 June 2020, 16:37:18 UTC | Iterate over paginated visits in batches to retrieve latest visit/snapshot This should stop the current timeouts on origins with a high number of visits. Related to T2310 | 26 June 2020, 15:38:22 UTC |
182ee49 | Antoine R. Dumont (@ardumont) | 24 June 2020, 16:04:29 UTC | storage*: Open order parameter to origin-visit-get endpoint This allows clients to search from most recent to oldest visit when calling the endpoint with the "order" parameter set to "desc" (visit id desc). This keeps and explicits the existing sorting order as visit id "asc". Related to T2310 | 26 June 2020, 11:22:40 UTC |
f75cd41 | Antoine R. Dumont (@ardumont) | 26 June 2020, 10:28:06 UTC | tests*: Drop obsolete origin visit fields Related to T2310 | 26 June 2020, 10:28:06 UTC |
8620519 | Antoine R. Dumont (@ardumont) | 26 June 2020, 07:47:01 UTC | replayer: Drop obsolete fields from origin-visit Otherwise, we won't be able to replay them. Related T2310 | 26 June 2020, 07:50:38 UTC |
b991e69 | Antoine R. Dumont (@ardumont) | 24 June 2020, 15:11:36 UTC | test_storage: Add missing tests on origin_visit_get method | 25 June 2020, 12:47:11 UTC |
89e9dae | Antoine R. Dumont (@ardumont) | 25 June 2020, 12:37:39 UTC | storage: Give origin-visit index a name to avoid future dev/prod divergence Related to D3342#inline-23217 | 25 June 2020, 12:37:39 UTC |
12d729b | Antoine R. Dumont (@ardumont) | 24 June 2020, 13:06:00 UTC | Relax checks on journal writes regarding origin-visit* | 25 June 2020, 12:35:38 UTC |
c6e6f33 | Antoine R. Dumont (@ardumont) | 25 June 2020, 09:19:55 UTC | replayer: Fix isoformat datetime string for origin-visit We no longer write datetime as strings in the journal. Still, the current journal must have those old values within. Related to D3336 Related to D3345 | 25 June 2020, 09:19:55 UTC |
e5e80ef | Antoine R. Dumont (@ardumont) | 24 June 2020, 08:57:52 UTC | storage*: Drop obsolete fields from origin_visit Related to T2310 | 25 June 2020, 08:35:18 UTC |
621fc8d | David Douard | 22 June 2020, 09:27:54 UTC | Deprecate the origin_add_one() endpoint This endpoint is not really useful since origin_add() can be used instead. Using a single API endpoint would also make the API a bit more consistent (most other endpoints only provide a xxx_add endpoint); having a single endpoint per object_type is enough and makes the whole API simpler. | 23 June 2020, 14:07:09 UTC |
fb603e1 | David Douard | 18 June 2020, 16:28:51 UTC | Make Storage.add_origin() return a summary dict Make it consistent with other add_xxx methods by making it return a summary dict `{"origin:add": int}`. | 23 June 2020, 13:58:54 UTC |
2d497ff | Antoine R. Dumont (@ardumont) | 22 June 2020, 11:13:35 UTC | test_origin: Rename appropriately tests So one can trigger tests separately by name tagging. | 22 June 2020, 12:39:32 UTC |
e9f4554 | Antoine R. Dumont (@ardumont) | 22 June 2020, 11:34:25 UTC | algos: Improve origin visit get latest visit status algorithm Prior to this commit, this looked up only the latest visit information. This now looks across multiple visits (from the most recent visit to the oldest) until a visit which matches the criteria is found. | 22 June 2020, 12:39:32 UTC |
041543d | Antoine R. Dumont (@ardumont) | 22 June 2020, 09:31:52 UTC | test_snapshot: Do not use origin_visit_add returned result This api will be realigned with other add endpoints. | 22 June 2020, 09:33:15 UTC |
32fded1 | Antoine R. Dumont (@ardumont) | 19 June 2020, 16:50:18 UTC | algos.snapshot: Fix edge case when snapshot is not resolved Fixes [1] [1] https://sentry.softwareheritage.org/share/issue/9848d9ea23d94d6ba8855bc7a7d7d297/ | 22 June 2020, 09:19:38 UTC |
53c4392 | David Douard | 18 June 2020, 16:38:39 UTC | Ensure ids are correct in tests' storage_data Also add an "objects" dict to easily retrieve available objects from their object_type. | 22 June 2020, 08:57:47 UTC |
46ac997 | David Douard | 18 June 2020, 16:37:20 UTC | Fix tests' storage_data revisions one of them was actually invalid (extra_header metadata being used in hash computation) | 22 June 2020, 08:57:39 UTC |
19354bc | David Douard | 22 June 2020, 08:05:37 UTC | SQL: replace the hash(url) index by a unique btree(url) on the origin table This ensures unicity of url in the origin table. | 22 June 2020, 08:09:23 UTC |
9514a1d | Nicolas Dandrimont | 19 June 2020, 14:46:42 UTC | Make sure the pagination in swh_snapshot_get_by_id uses the proper indexes | 19 June 2020, 15:14:59 UTC |
1600907 | Antoine R. Dumont (@ardumont) | 18 June 2020, 13:21:29 UTC | Move deprecated endpoint snapshot_get_latest from api endpoint to algos This allows to avoid repeating the same pattern of retrieving the last snapshot for a given origin. Note that this also makes the new function return a Snapshot model object as well. Related to T2310 | 19 June 2020, 09:19:58 UTC |
5480b7b | Antoine R. Dumont (@ardumont) | 18 June 2020, 11:40:16 UTC | algos.origin: Open origin-get-latest-visit-status function This will allow to avoid repeating the same pattern of retrieving the last visit status for a given origin. Related to T2310 | 18 June 2020, 11:40:16 UTC |
c498901 | Antoine R. Dumont (@ardumont) | 18 June 2020, 08:11:47 UTC | storage*: Allow origin-visit-get-latest to filter on type | 18 June 2020, 10:25:12 UTC |
822d96b | Antoine R. Dumont (@ardumont) | 18 June 2020, 06:55:34 UTC | test_origin: Align storage initialization within tests This aligns consistently the storage initialization with other tests. | 18 June 2020, 06:55:34 UTC |
c3d177b | Antoine R. Dumont (@ardumont) | 17 June 2020, 10:55:16 UTC | test_storage: Fix flakiness in round to milliseconds test util method Prior to this commit, the tests would fail [1] for no good reason [2]. This fixes it. [1] https://jenkins.softwareheritage.org/job/DSTO/job/tests/1264/console [2] microseconds would exceed a limit of 999999 from time to time | 17 June 2020, 13:13:41 UTC |
7319495 | Antoine R. Dumont (@ardumont) | 16 June 2020, 16:03:23 UTC | storage*: Add origin-visit-status-get-latest endpoint So we can read the latest origin-visit-status out of a storage Related to T2310 | 17 June 2020, 10:20:48 UTC |
692bfa3 | David Douard | 17 June 2020, 07:23:11 UTC | Fix/update the backfiller The backfiller has not been updated to match recent changes in several places. This has not been detected because there was no proper test of the backfiller function as a whole. This is now done. | 17 June 2020, 09:35:35 UTC |
057c6fd | Nicolas Dandrimont | 17 June 2020, 09:22:13 UTC | validate: accept model objects as well as dicts on all add endpoints This generalizes work by Antoine Dumont to all object addition endpoints, as a further step towards completely dropping the validate proxy in tests. | 17 June 2020, 09:22:52 UTC |
d153a80 | Antoine R. Dumont (@ardumont) | 16 June 2020, 18:11:10 UTC | cql: Fix blackified strings | 16 June 2020, 18:11:31 UTC |
5e053f8 | Antoine R. Dumont (@ardumont) | 16 June 2020, 16:02:09 UTC | storage: Add missing cur parameter | 16 June 2020, 16:11:35 UTC |
c2b673b | David Douard | 16 June 2020, 10:19:13 UTC | Fix db_to_author() converter to return None if all fields are None Fix T2455. | 16 June 2020, 10:32:20 UTC |
8f1ac4c | Antoine R. Dumont (@ardumont) | 15 June 2020, 13:27:32 UTC | storage*: Drop leftover code This is no longer used, it should have been dropped with previous commits. Related to T2310 | 15 June 2020, 13:28:38 UTC |
d6144d2 | Antoine R. Dumont (@ardumont) | 11 June 2020, 15:26:41 UTC | storage*: Drop origin_visit_upsert endpoint Related to T2310 | 15 June 2020, 12:26:42 UTC |
c7f3060 | Antoine R. Dumont (@ardumont) | 12 June 2020, 16:51:25 UTC | storage*: Remove origin-visit-update endpoint Related to T2310 | 15 June 2020, 12:11:52 UTC |
2bcbc82 | Antoine R. Dumont (@ardumont) | 11 June 2020, 15:16:29 UTC | replay: Replay origin-visit and origin-visit-status This now uses the respective origin-visit-add and origin-visit-status-add endpoints. Related to T2310 | 15 June 2020, 12:06:10 UTC |
0183fec | Antoine R. Dumont (@ardumont) | 15 June 2020, 10:38:42 UTC | in_memory: Make origin-visit-status-add respect "on conflict ignore" policy Prior to this commit, that behavior was not properly tested and inconsistent between backends. All backends except in-memory were respecting it. This commit aligns the in-memory backend implementation and tests it. Related to T2310 | 15 June 2020, 11:44:06 UTC |
46a7839 | Antoine R. Dumont (@ardumont) | 15 June 2020, 08:44:10 UTC | test_storage: Add journal behavior coverage for origin-visit-*add This was missing some coverage on origin-visit-add and origin-visit-status-add for the journal part. Related to T2310 | 15 June 2020, 09:50:33 UTC |
874da2d | Antoine R. Dumont (@ardumont) | 13 June 2020, 06:37:57 UTC | Start migrating the validate proxy toward using BaseModel objects This will allow to progress incrementally towards removing it. When it allows to use BaseModel objects everywhere (and tests in test_storage are adapted to use this property), it will be time to remove it entirely (as it's only used in test). It's preparatory work for future diffs. | 13 June 2020, 06:37:57 UTC |
33efdb0 | Antoine R. Dumont (@ardumont) | 12 June 2020, 16:51:47 UTC | storage*: Do not write twice origin-visit-status in journal Related to T2310 | 12 June 2020, 16:52:19 UTC |
37c4530 | Antoine R. Dumont (@ardumont) | 11 June 2020, 12:57:02 UTC | storage*: Align origin-visit-add to take iterable of OriginVisit objects This makes its api consistent with other add endpoints. This is preparatory work towards removing origin-visit-upsert. Related to T2310 | 11 June 2020, 16:55:33 UTC |
5d61633 | Antoine R. Dumont (@ardumont) | 11 June 2020, 12:32:02 UTC | test: Remove dead code 1. obj_type is now origin_visit_status. So this means, we actually never pass here. 2. Those objects have now a storage endpoint anyway. So it's dead code alright. Related to T2310 | 11 June 2020, 12:33:53 UTC |
d68c7ec | Antoine R. Dumont (@ardumont) | 10 June 2020, 08:29:09 UTC | origin-visit-upsert: Write visit status objects to the journal Related to T2310 | 10 June 2020, 08:51:51 UTC |
86d05fb | Antoine R. Dumont (@ardumont) | 08 June 2020, 14:27:11 UTC | origin-visit-update: Write visit status objects to the journal Related to T2310 | 09 June 2020, 12:33:58 UTC |
0860920 | Antoine R. Dumont (@ardumont) | 08 June 2020, 09:42:13 UTC | origin-visit-add: Write visit status to the journal This also makes the instruction order consistent across the different storage implementations. First, write objects to the journal, then write objects to the storage backend. Related to T2310 | 09 June 2020, 12:29:57 UTC |
7eb44d4 | Valentin Lorentz | 08 June 2020, 09:53:08 UTC | Add pagination to origin_metadata_get. | 08 June 2020, 14:02:15 UTC |
26a8d4f | Valentin Lorentz | 04 June 2020, 15:06:26 UTC | Add SortedList.iter_after. Strict version of iter_from. I'll need it for pagination. | 08 June 2020, 14:01:48 UTC |
6ebdc2f | Valentin Lorentz | 04 June 2020, 10:33:17 UTC | Deduplicate origin-metadata when they have the same authority + discovery_date + fetcher. By replacing the old value with the new one. This will allow an easy implementation of pagination, using the fetcher id as an opaque page_token. Plus, it did not make sense logically to have different metadata from the same authority at the same time (especially with the same fetcher). | 08 June 2020, 14:01:30 UTC |
dcef916 | Antoine R. Dumont (@ardumont) | 03 June 2020, 09:12:34 UTC | Open `origin_visit_status_add` endpoint to add origin visit statuses Related to T2310 | 05 June 2020, 16:32:12 UTC |
88271f8 | David Douard | 07 May 2020, 13:33:35 UTC | Add a replayer test for anonymized journal topics This new test checks the behavior of the storage replayer mechanism when replaying a journal with privileged topics. | 05 June 2020, 16:10:42 UTC |
c75da7a | David Douard | 05 June 2020, 10:25:58 UTC | Small refactoring of the InMemoryStorage to make it more consistent - make self._persons a dict - make self._snapshots value Snapshot only instead of the couple (Snapshot, sorted_branch_names) | 05 June 2020, 10:28:04 UTC |
25f584f | Nicolas Dandrimont | 04 June 2020, 14:14:07 UTC | Use explicit configuration (without journal writer) for algos tests Using the in-memory journal writer sometimes makes the tests hang when (very) large objects are used. This works around the issue. | 04 June 2020, 14:14:07 UTC |
f9b2ca3 | David Douard | 29 May 2020, 12:28:31 UTC | Replace MockedJournalClient and MockedKafkaWriter by proper kafka test scaffolding This also kills test_write_replay.py file since it does not test anything more than what is currently tested in test_replay.py. | 04 June 2020, 09:46:12 UTC |
ad9c9bb | David Douard | 02 June 2020, 15:25:18 UTC | Adapt to swh.model 0.3 in which List attributes have been replaced by Tuple ones. This requires a bit of adaptation in the code of the ValidatingProxyStorage to ensure dict representations of revision objects are properly typed. The test_api_client_dicts.py has been removed since it's not really useful any more and would require a fair amount of work to fix. | 04 June 2020, 09:46:12 UTC |
eef4900 | David Douard | 29 May 2020, 12:21:55 UTC | Fix InMemoryStorage.origin_visit_upsert() method The self._origin_visits[origin_url] list was built one element too big (since visit ids start from 1 and not 0). This is needed to ease writing replayer tests (by comparing these lists). | 29 May 2020, 13:23:28 UTC |
6c6080b | Valentin Lorentz | 28 May 2020, 15:03:13 UTC | Fix type annotation. | 28 May 2020, 15:03:13 UTC |
9332547 | Valentin Lorentz | 28 May 2020, 14:34:53 UTC | Remove function drops from the migration. I committed these two lines by mistake | 28 May 2020, 14:34:53 UTC |
2209d31 | Antoine R. Dumont (@ardumont) | 27 May 2020, 11:41:28 UTC | README: Update necessary dependencies for test purposes This also adds a mention on how to avoid running the cassandra tests. | 28 May 2020, 12:25:31 UTC |