d68d03c | Jenkins for Software Heritage | 24 January 2023, 13:26:22 UTC | Updated debian changelog for version 1.5.0 | 24 January 2023, 13:26:22 UTC |
00fd130 | Jenkins for Software Heritage | 24 January 2023, 13:26:21 UTC | Update upstream source from tag 'debian/upstream/1.5.0' Update to upstream version '1.5.0' with Debian dir 5c5abea93e5496c1e4ce76325e777e811f41bb4a | 24 January 2023, 13:26:21 UTC |
5a97137 | Jenkins for Software Heritage | 24 January 2023, 13:26:20 UTC | New upstream version 1.5.0 | 24 January 2023, 13:26:20 UTC |
8f0849a | Antoine R. Dumont (@ardumont) | 16 January 2023, 14:22:12 UTC | Allow logging configuration from configuration yaml file This will allow proper logging configuration for the services which are currently running in the dynamic infrastructure. Their logs are currently written to the wrong elasticsearch indices. Ref. swh/infra/sysadm-environment#4524 | 23 January 2023, 17:03:12 UTC |
fccf944 | Antoine R. Dumont (@ardumont) | 12 December 2022, 13:05:46 UTC | Add missing __init__.py so find_packages keeps finding sql modules Otherwise, at some point, this will get discarded as per the debian build warning [1] [1] https://jenkins.softwareheritage.org/view/swh-debian%20(draft)/job/debian/job/packages/job/DSCH/job/gbp-buildpackage/182/console | 02 January 2023, 09:21:57 UTC |
d521ab7 | Antoine Lambert | 19 December 2022, 14:10:54 UTC | docs: Include module indices only when building standalone package doc In order to remove warnings about /apidoc/*.rst files being included multiple times in toc when building full swh documentation, prefer to include module indices only when building standalone package documentation. Also include them the proper sphinx way. Related to T4496 | 19 December 2022, 14:10:54 UTC |
3ca9293 | Jenkins for Software Heritage | 12 December 2022, 10:51:31 UTC | Updated debian changelog for version 1.4.0 | 12 December 2022, 10:51:31 UTC |
0f46f3a | Jenkins for Software Heritage | 12 December 2022, 10:51:30 UTC | Update upstream source from tag 'debian/upstream/1.4.0' Update to upstream version '1.4.0' with Debian dir 0fc297ff329f9f004f363b713e958febf0acc324 | 12 December 2022, 10:51:30 UTC |
76030a1 | Jenkins for Software Heritage | 12 December 2022, 10:51:30 UTC | New upstream version 1.4.0 | 12 December 2022, 10:51:30 UTC |
8e125f1 | Antoine R. Dumont (@ardumont) | 07 December 2022, 15:57:32 UTC | cli.add_forge_now: Open `register-lister` with sensible defaults This will ease scheduling of new add-forge-now requests, on: - staging: this will list a subset of disabled origins once - production: this will register recurring tasks (full, incremental if any) to list that new forge This also unifies the previous subcommand schedule-first-visits with the --preset flag. So, the following would be enough to list appropriately in staging/production: ``` swh scheduler add-forge-now \ ( --preset [production|staging] \ # to enable a pre-defined set of rules ) register-lister \ gitea \ url=https://git.afpy.org/api/v1/ ``` Related to https://gitlab.softwareheritage.org/infra/sysadm-environment/-/issues/4674 | 08 December 2022, 17:51:45 UTC |
1c34e98 | Antoine R. Dumont (@ardumont) | 07 December 2022, 14:14:32 UTC | cli.add_forge_now: Open `schedule-first-visits` with sensible defaults This should ease scheduling the first visits for add-forge-now request. The following would be enough to fetch and schedule the forge just listed (be it in production or staging): ``` swh scheduler add-forge-now \ schedule-first-visits \ --visit-type git \ (--visit-type svn \ # if a lister lists multiple kinds of visit, we can mention it ) --lister-name gitea \ --lister-instance-name git.afpy.org \ ( --production | --staging ) # to list only enabled | disabled origins ``` Related to https://gitlab.softwareheritage.org/infra/sysadm-environment/-/issues/4674 | 07 December 2022, 15:44:28 UTC |
03c0d1b | Jenkins for Software Heritage | 07 December 2022, 12:50:33 UTC | Updated debian changelog for version 1.3.0 | 07 December 2022, 12:50:33 UTC |
80c1df4 | Jenkins for Software Heritage | 07 December 2022, 12:50:32 UTC | Update upstream source from tag 'debian/upstream/1.3.0' Update to upstream version '1.3.0' with Debian dir 4b8c2ba3ff41fa515c60acf0dd8a33c9e97e7600 | 07 December 2022, 12:50:32 UTC |
354f2d4 | Jenkins for Software Heritage | 07 December 2022, 12:50:31 UTC | New upstream version 1.3.0 | 07 December 2022, 12:50:31 UTC |
e2878b5 | Antoine R. Dumont (@ardumont) | 07 December 2022, 11:38:58 UTC | task add: Ensure the provided task type exists and raise otherwise Related to https://gitlab.softwareheritage.org/infra/sysadm-environment/-/issues/4674 | 07 December 2022, 11:57:04 UTC |
cd16fce | Antoine R. Dumont (@ardumont) | 06 December 2022, 16:01:41 UTC | grab_next_visits: Open lister name and instance name filtering Related to https://gitlab.softwareheritage.org/infra/sysadm-environment/-/issues/4674 | 06 December 2022, 16:03:32 UTC |
a776963 | Antoine R. Dumont (@ardumont) | 06 December 2022, 11:24:33 UTC | send-to-celery: Adapt to schedule from lister name & instance_name This allows bypassing the lister id retrieval step by directly using the name and instance name of the lister to discover the uuid. This also drops the --lister-uuid flag, which is somewhat difficult to use. Related to https://gitlab.softwareheritage.org/infra/sysadm-environment/-/issues/4674 | 06 December 2022, 15:54:02 UTC |
ff75e74 | Nicolas Dandrimont | 25 October 2022, 13:48:55 UTC | Ensure origins are not visited faster than twice a day The scheduled_cooldown only applies to tasks that have not been executed yet. absolute_cooldown avoids archiving objects faster than that. | 25 October 2022, 14:48:51 UTC |
1f9109f | Nicolas Dandrimont | 25 October 2022, 13:47:37 UTC | Refresh task type data from the database every time recurrent tasks are run Avoids inconsistencies between the database state and an ongoing recurrent task scheduler. | 25 October 2022, 14:48:51 UTC |
bde27a9 | Nicolas Dandrimont | 25 October 2022, 13:46:26 UTC | Use json instead of msgpack for serializers Recent celery versions generate serialized messages with mime types incompatible with older versions when using msgpack | 25 October 2022, 13:51:01 UTC |
aeb870a | David Douard | 18 October 2022, 16:21:00 UTC | pre-commit, tox: Bump pre-commit, codespell, black and flake8 - pre-commit from 4.1.0 to 4.3.0, - codespell from 2.2.1 to 2.2.2, - black from 22.3.0 to 22.10.0 and - flake8 from 4.0.1 to 5.0.4. Also freeze flake8 dependencies. Also change flake8's repo config to github (the gitlab mirror being outdated). | 18 October 2022, 16:53:38 UTC |
87ff3db | Jenkins for Software Heritage | 03 October 2022, 12:07:44 UTC | Updated debian changelog for version 1.2.3 | 03 October 2022, 12:07:44 UTC |
edc1a2e | Jenkins for Software Heritage | 03 October 2022, 12:07:43 UTC | Update upstream source from tag 'debian/upstream/1.2.3' Update to upstream version '1.2.3' with Debian dir 692868b76356cffe70463fb90e278de55a0e035c | 03 October 2022, 12:07:43 UTC |
0929c07 | Jenkins for Software Heritage | 03 October 2022, 12:07:42 UTC | New upstream version 1.2.3 | 03 October 2022, 12:07:42 UTC |
17c6d48 | Antoine R. Dumont (@ardumont) | 03 October 2022, 11:43:32 UTC | Fix compatibility issue with latest dependency version This currently fails all swh related builds which depend on the celery/kombu stack due to that dependency's latest version release. | 03 October 2022, 11:58:46 UTC |
6d0b1d1 | Antoine R. Dumont (@ardumont) | 23 September 2022, 07:48:30 UTC | backend: Prevent query exception when lister ids is empty Related to T4545 | 23 September 2022, 07:49:04 UTC |
4604ff8 | Jenkins for Software Heritage | 15 September 2022, 12:00:17 UTC | Updated debian changelog for version 1.2.2 | 15 September 2022, 12:00:17 UTC |
889cf7b | Jenkins for Software Heritage | 15 September 2022, 12:00:16 UTC | Update upstream source from tag 'debian/upstream/1.2.2' Update to upstream version '1.2.2' with Debian dir fb5e868766d116ab788888d47d0f6258440a68b2 | 15 September 2022, 12:00:16 UTC |
cfa5a6f | Jenkins for Software Heritage | 15 September 2022, 12:00:15 UTC | New upstream version 1.2.2 | 15 September 2022, 12:00:15 UTC |
b1afdab | Antoine Lambert | 14 September 2022, 14:18:51 UTC | recurrent_visits: Allow to set no origins scheduled backoff in config The send_visits_for_visit_type function uses a default schedule backoff of 20 minutes when there are no origins to schedule for a given visit type. There are use cases where we would like that schedule backoff to be shorter in order to schedule listed origins for loading into the archive more rapidly, typically in the docker environment. So allow setting that backoff value through configuration. | 15 September 2022, 08:41:20 UTC |
7cfaa98 | Antoine Lambert | 22 August 2022, 13:19:50 UTC | sql/Makefile: Fix swh-scheduler SQL file paths Those files have been renamed so the database could not be filled. | 22 August 2022, 13:19:50 UTC |
fd6df6a | Antoine R. Dumont (@ardumont) | 29 July 2022, 08:12:23 UTC | api/server: Clarify load and check configuration backend This adds types to the function, updates its docstring and clarifies its associated tests as well. | 29 July 2022, 08:12:23 UTC |
4b6972c | Jenkins for Software Heritage | 08 July 2022, 14:53:11 UTC | Updated debian changelog for version 1.2.1 | 08 July 2022, 14:53:11 UTC |
b462eab | Jenkins for Software Heritage | 08 July 2022, 14:53:11 UTC | Update upstream source from tag 'debian/upstream/1.2.1' Update to upstream version '1.2.1' with Debian dir 8876e6480e8b880ad9f16a682166f4ad7951a25b | 08 July 2022, 14:53:11 UTC |
58b365e | Jenkins for Software Heritage | 08 July 2022, 14:53:10 UTC | New upstream version 1.2.1 | 08 July 2022, 14:53:10 UTC |
d847448 | David Douard | 08 July 2022, 12:00:33 UTC | Fix the load_and_check_config() function to support the "postgresql" cls value and replace usage of the "local" scheduler cls with "postgresql" everywhere. | 08 July 2022, 12:23:46 UTC |
d8bc426 | Jenkins for Software Heritage | 03 June 2022, 13:47:47 UTC | Updated debian changelog for version 1.2.0 | 03 June 2022, 13:47:47 UTC |
8ebdc1a | Jenkins for Software Heritage | 03 June 2022, 13:47:45 UTC | Update upstream source from tag 'debian/upstream/1.2.0' Update to upstream version '1.2.0' with Debian dir 5308e2d13ddbbe929fe3a61aa2a483e605787857 | 03 June 2022, 13:47:45 UTC |
ad7ca47 | Jenkins for Software Heritage | 03 June 2022, 13:47:44 UTC | New upstream version 1.2.0 | 03 June 2022, 13:47:44 UTC |
0496c39 | Antoine R. Dumont (@ardumont) | 03 June 2022, 12:41:51 UTC | Remove unused get_current_version method Attribute current_version is already set and directly used by swh db [version|init|upgrade] clis. Related to T4305 | 03 June 2022, 12:44:56 UTC |
ef15385 | David Douard | 31 May 2022, 12:21:31 UTC | tests: use stock pytest_postgresql factory function instead of (soon-to-be-deprecated) swh-core's postgresql_fact one. | 31 May 2022, 14:46:05 UTC |
4e04ccf | Jenkins for Software Heritage | 12 May 2022, 11:55:14 UTC | Updated debian changelog for version 1.1.2 | 12 May 2022, 11:55:14 UTC |
b2e342f | Jenkins for Software Heritage | 12 May 2022, 11:55:13 UTC | Update upstream source from tag 'debian/upstream/1.1.2' Update to upstream version '1.1.2' with Debian dir 77f8815b707fded0e03cae23a4909d7f281a2e97 | 12 May 2022, 11:55:13 UTC |
407dd3d | Jenkins for Software Heritage | 12 May 2022, 11:55:12 UTC | New upstream version 1.1.2 | 12 May 2022, 11:55:12 UTC |
e56fc4d | Antoine Lambert | 12 May 2022, 09:08:09 UTC | interface: Return enabled origins only by default in get_listed_origins Add a new enabled_only parameter set to True by default in get_listed_origins scheduler method. It filters out disabled listed origins by default when requesting the result of a listing and avoids possible errors in lister implementations. | 12 May 2022, 10:07:17 UTC |
c7c53ea | Pratyush Desai | 09 May 2022, 10:13:54 UTC | add strict asyncio_mode in pytest.ini | 09 May 2022, 10:13:54 UTC |
1d50b2e | Antoine Lambert | 06 May 2022, 15:05:20 UTC | cli/task: Fix sphinx >= 4.4 warning Fix "more than one target found for cross-reference 'Origin'" sphinx warning. | 06 May 2022, 15:06:23 UTC |
881b521 | Benoit Chauvet | 28 April 2022, 13:56:01 UTC | Add missing sentry captures | 28 April 2022, 13:59:44 UTC |
f092ed3 | Jenkins for Software Heritage | 28 April 2022, 09:36:24 UTC | Updated debian changelog for version 1.1.1 | 28 April 2022, 09:36:24 UTC |
23ce0d9 | Jenkins for Software Heritage | 28 April 2022, 09:36:23 UTC | Update upstream source from tag 'debian/upstream/1.1.1' Update to upstream version '1.1.1' with Debian dir 51b9198d0925a58c5f477ee300095bd0c9e8f9b6 | 28 April 2022, 09:36:23 UTC |
d9e982e | Jenkins for Software Heritage | 28 April 2022, 09:36:23 UTC | New upstream version 1.1.1 | 28 April 2022, 09:36:23 UTC |
82274c1 | Valentin Lorentz | 27 April 2022, 13:15:28 UTC | cli/utils: Fix parsing of empty strings | 27 April 2022, 13:15:28 UTC |
353cf2a | Valentin Lorentz | 26 April 2022, 11:05:15 UTC | Bump mypy to v0.942 | 26 April 2022, 11:05:15 UTC |
f642da4 | Jenkins for Software Heritage | 26 April 2022, 10:35:52 UTC | Updated debian changelog for version 1.1.0 | 26 April 2022, 10:35:52 UTC |
d912c65 | Jenkins for Software Heritage | 26 April 2022, 10:35:51 UTC | Update upstream source from tag 'debian/upstream/1.1.0' Update to upstream version '1.1.0' with Debian dir 728c35186bf7d46bb2e39efbe69cf3e4981c7311 | 26 April 2022, 10:35:51 UTC |
442fcdb | Jenkins for Software Heritage | 26 April 2022, 10:35:50 UTC | New upstream version 1.1.0 | 26 April 2022, 10:35:50 UTC |
0365b85 | Valentin Lorentz | 21 April 2022, 16:40:55 UTC | Add a 'lister_instance_name' argument to all tasks created from ListedOrigin This will allow loaders to use the right API credentials to fetch extrinsic metadata for the origin from the forge. | 26 April 2022, 10:28:37 UTC |
42e362d | Valentin Lorentz | 21 April 2022, 10:22:03 UTC | Add a 'lister_name' argument to all tasks created from ListedOrigin This will allow loaders to guess the forge type, and use the right API to fetch extrinsic metadata for the origin from the forge. | 26 April 2022, 10:28:33 UTC |
3687931 | David Douard | 25 April 2022, 16:14:29 UTC | Update a bit the documentation for the new origin visit scheduler | 26 April 2022, 08:38:05 UTC |
9483493 | Valentin Lorentz | 21 April 2022, 09:22:48 UTC | Make create_origin_task_dict a standalone function It feels off as an object method; and I am going to make it use joins in a future commit, so it makes more sense this way. | 21 April 2022, 15:15:06 UTC |
5e9ee60 | Valentin Lorentz | 21 April 2022, 09:21:05 UTC | test_utils.py: Convert to pytest-style tests | 21 April 2022, 11:47:58 UTC |
9627e6d | Antoine Lambert | 21 April 2022, 11:39:49 UTC | pre-commit: Remove codespell commit-msg hook That hook can be frustrating as it can discard a long commit message if it finds a typo in it, so it is better to remove it. | 21 April 2022, 11:39:49 UTC |
a76bb02 | David Douard | 15 April 2022, 16:08:49 UTC | Make scheduling policy used in schedule_recurrent configurable Add support for a configuration option "scheduling_policy" in the config file loaded by the 'swh scheduler schedule-recurrent' command. This config entry allows to specify the scheduling policies used by the schedule-recurrent tool, instead of having them hardcoded in the source code. A visit type policy config entry should have at least a 'weight' value for each policy. Default values are unchanged. Eg.: scheduling_policy: git: - policy: already_visited_order_by_lag weight: 55 tablesample: 0.5 - policy: never_visited_oldest_update_first weight: 45 tablesample: 0.5 Note: there may not be configuration entries for all visit types, but if a visit type policy is configured, the config entry should be complete (in other words, the merging of the configuration with the default values is only done at first config level). | 20 April 2022, 14:34:23 UTC |
5302efd | Antoine Lambert | 08 April 2022, 13:15:35 UTC | Add .git-blame-ignore-revs file with automatic reformatting commits | 08 April 2022, 13:15:35 UTC |
3f0843b | Antoine Lambert | 08 April 2022, 13:15:09 UTC | python: Reformat code with black 22.3.0 Related to T3922 | 08 April 2022, 13:15:09 UTC |
d9a2512 | Antoine Lambert | 08 April 2022, 13:13:50 UTC | pre-commit, tox: Bump black from 19.10b0 to 22.3.0 black is considered stable since release 22.1.0 and the version we are currently using is quite outdated and not compatible with click 8.1.0, so it is time to bump it to its latest stable release. Please note that E501 pycodestyle warning related to line length is replaced by B950 one from flake8-bugbear as recommended by black. https://black.readthedocs.io/en/stable/the_black_code_style/current_style.html#line-length Related to T3922 | 08 April 2022, 13:13:50 UTC |
bafe03f | Antoine Lambert | 06 April 2022, 15:14:52 UTC | requirements-test: Remove pytest pinning to < 7 pytest-postgresql 3.1.3 and pytest-redis 2.4.0 added support for pytest >= 7 so we can now drop the pytest pinning. | 06 April 2022, 15:14:52 UTC |
78f5579 | Antoine Lambert | 22 March 2022, 10:58:10 UTC | pytest: Exclude build directory for tests discovery Due to test modules being copied in subdirectories of the build directory by setuptools, it makes pytest fail by raising ImportPathMismatchError exceptions when invoked from root directory of the module. So ignore the build folder to discover tests. | 22 March 2022, 10:58:10 UTC |
fded717 | Jenkins for Software Heritage | 24 February 2022, 16:03:55 UTC | Updated debian changelog for version 1.0.0 | 24 February 2022, 16:03:55 UTC |
87e54e3 | Jenkins for Software Heritage | 24 February 2022, 16:03:54 UTC | Update upstream source from tag 'debian/upstream/1.0.0' Update to upstream version '1.0.0' with Debian dir 7e7d67a960f191f55f41140f0b00c7a1fe6e30fc | 24 February 2022, 16:03:54 UTC |
a63dbac | Jenkins for Software Heritage | 24 February 2022, 16:03:53 UTC | New upstream version 1.0.0 | 24 February 2022, 16:03:53 UTC |
43794aa | David Douard | 24 February 2022, 15:52:44 UTC | Prepare v1: bump dependency to swh.core 2 also match dependency on swh.storage with requirements-swh.txt | 24 February 2022, 15:52:44 UTC |
5cc62be | David Douard | 08 February 2022, 13:59:29 UTC | Adapt to swh.core 2.0.0 - add the `get_datastore` function in `swh.scheduler` - add the `get_current_version` method in `SchedulerBackend`, - remove dbversion management from sql init script - update tests accordingly | 24 February 2022, 14:51:44 UTC |
234e165 | Antoine Lambert | 10 February 2022, 16:23:34 UTC | pre-commit: Bump hooks and add new one to check commit message spelling To install the new hook: $ pre-commit install -t commit-msg | 10 February 2022, 16:23:34 UTC |
fddec02 | Antoine Lambert | 09 February 2022, 13:22:06 UTC | requirements: Remove click version pin Latest versions of celery and flask now support click >= 8.0 so we can remove the version pin. | 09 February 2022, 13:22:46 UTC |
c46ffad | David Douard | 08 February 2022, 16:26:17 UTC | Prefix task types used in tests with 'test-' so that tests do not depend on a lucky guess on what the scheduler db state actually is. DB initialization scripts do create task types for git, hg and svn (used in tests) but these tests depend on the fact that the db fixture has already been called once before, so tables are truncated (especially the task and task_type ones). For example, running a single test involved in task-type creation was failing (eg. 'pytest swh -k test_create_task_type_idempotence'). This commit makes tests not collide with any existing task or task type that initialization scripts may create. Note that this also means that there is actually no test dealing with the scheduler db state after initialization, which is not great and should be addressed. | 08 February 2022, 16:34:10 UTC |
9f601f5 | Antoine R. Dumont (@ardumont) | 07 February 2022, 15:46:47 UTC | requirements-test: Pin pytest to < 7.0.0 Related to T3916 | 07 February 2022, 15:47:00 UTC |
ce11283 | Valentin Lorentz | 21 January 2022, 10:10:48 UTC | Fix ReST syntax | 21 January 2022, 10:14:59 UTC |
b5477ea | Antoine R. Dumont (@ardumont) | 12 January 2022, 09:58:58 UTC | sql: Clean up task/task_run data model This archives current task and task_run tables, creating new ones filtering only necessary tasks (last 2 months' oneshot tasks plus some recurring tasks; lister, indexer, ...). Those filtered tasks are the ones scheduled by the runner and runner priority services. This archiving will allow those services to be faster (the corresponding queries will return results faster without the archived data). Related to T3837 | 12 January 2022, 10:30:36 UTC |
3b6e1d4 | Jenkins for Software Heritage | 06 January 2022, 08:39:47 UTC | Updated debian changelog for version 0.23.0 | 06 January 2022, 08:39:47 UTC |
67e1896 | Jenkins for Software Heritage | 06 January 2022, 08:39:46 UTC | Update upstream source from tag 'debian/upstream/0.23.0' Update to upstream version '0.23.0' with Debian dir f7e1a8a1f5f6dc07dc335a3ea905631cb4f80385 | 06 January 2022, 08:39:46 UTC |
4c9e164 | Jenkins for Software Heritage | 06 January 2022, 08:39:45 UTC | New upstream version 0.23.0 | 06 January 2022, 08:39:45 UTC |
5c836d6 | Vincent SELLIER | 04 January 2022, 23:08:50 UTC | Allow to specify the visit grab parameters per visit type and policy Related to T3827 | 05 January 2022, 17:18:32 UTC |
559f345 | Antoine R. Dumont (@ardumont) | 16 December 2021, 14:47:56 UTC | Pin mypy and drop type annotations which make mypy unhappy This also drops spurious copyright headers from those files if present. Related to T3812 | 16 December 2021, 14:47:56 UTC |
e051b32 | Nicolas Dandrimont | 09 December 2021, 13:54:09 UTC | Use a temporary table to update scheduler metrics When using ``insert into <...> select <...>``, PostgreSQL disables parallel querying. Under some circumstances (in our large production database), this makes updating the scheduler metrics take a (very) long time. Parallel querying is allowed for ``create table <...> as select <...>``, and doing so restores the small(er) runtimes for this query (15 minutes instead of multiple hours). To use that, we have to turn the function into plpgsql instead of plain sql. | 09 December 2021, 14:16:06 UTC |
a8edbdb | Antoine R. Dumont (@ardumont) | 07 December 2021, 13:31:34 UTC | Clean up disabled scheduler archival task related services This is dead code now as this has long been stopped and disabled in production. Related to T3777 | 08 December 2021, 10:12:53 UTC |
0086f5a | Jenkins for Software Heritage | 08 December 2021, 09:06:02 UTC | Updated debian changelog for version 0.22.0 | 08 December 2021, 09:06:02 UTC |
bced01c | Jenkins for Software Heritage | 08 December 2021, 09:05:45 UTC | Update upstream source from tag 'debian/upstream/0.22.0' Update to upstream version '0.22.0' with Debian dir 6ee09dd6732003e781781fea731b4a981ee1d0f1 | 08 December 2021, 09:05:45 UTC |
10d495b | Jenkins for Software Heritage | 08 December 2021, 09:05:44 UTC | New upstream version 0.22.0 | 08 December 2021, 09:05:44 UTC |
5de8ba4 | Nicolas Dandrimont | 07 December 2021, 12:57:51 UTC | Make next_visit_queue_position an integer In visit types with small amounts of origins having no last_update field, we would end up overflowing Python datetimes (which only go up to 31 December 9999) pretty quickly. Making the queue position a 64-bit integer should give us some more leeway. The queue position now defaults to zero instead of an arbitrary point in time. Queue offsets are still commensurate with seconds, but that's mostly to give them some space to be splayed by the fudge factors. | 07 December 2021, 16:39:48 UTC |
c5e514f | Jenkins for Software Heritage | 07 December 2021, 07:45:41 UTC | Updated debian changelog for version 0.21.0 | 07 December 2021, 07:45:41 UTC |
bedb322 | Jenkins for Software Heritage | 07 December 2021, 07:45:40 UTC | Update upstream source from tag 'debian/upstream/0.21.0' Update to upstream version '0.21.0' with Debian dir 25bb4a20c7da58a06e1f9b407b51f2101d951472 | 07 December 2021, 07:45:40 UTC |
a7851ad | Jenkins for Software Heritage | 07 December 2021, 07:45:39 UTC | New upstream version 0.21.0 | 07 December 2021, 07:45:39 UTC |
0a6aac5 | Vincent SELLIER | 06 December 2021, 15:23:49 UTC | Ensure there are no duplicated origins in the insertion batches When a lister tries to insert duplicate origins in the same batch, the insertion fails because the "on conflict do update" instruction cannot handle duplicates in the same transaction Related to T3769 | 06 December 2021, 20:11:40 UTC |
377716e | Jenkins for Software Heritage | 22 November 2021, 15:14:57 UTC | Updated debian changelog for version 0.20.0 | 22 November 2021, 15:14:57 UTC |
3a9bdba | Jenkins for Software Heritage | 22 November 2021, 15:14:56 UTC | Update upstream source from tag 'debian/upstream/0.20.0' Update to upstream version '0.20.0' with Debian dir 648a27f772aa18cbaeb88421ad44cfbf18517068 | 22 November 2021, 15:14:56 UTC |
a42dbf8 | Jenkins for Software Heritage | 22 November 2021, 15:14:55 UTC | New upstream version 0.20.0 | 22 November 2021, 15:14:55 UTC |
2abb393 | Valentin Lorentz | 22 November 2021, 12:32:20 UTC | Fix CardinalityViolation in grab_next_visits on duplicate origins grab_next_visits grabs from `listed_origins`, whose primary key is `(lister_id, url, visit_type)` and uses it to upsert in origin_visit_stats, whose primary key is `(url, visit_type)`. This causes the error `ON CONFLICT DO UPDATE command cannot affect row a second time` when the same (origin, type) pair is grabbed twice. This commit deduplicates the (origin, type) pairs before upserting. | 22 November 2021, 12:36:24 UTC |
00ff02e | Nicolas Dandrimont | 29 October 2021, 13:58:31 UTC | recurrent visits: use policy weights instead of ratios The ratios weren't checked for normalization; using relative weights explicitly ensures that the settings won't be misinterpreted. | 29 October 2021, 13:58:31 UTC |
7f434c3 | Nicolas Dandrimont | 29 October 2021, 13:44:56 UTC | Improve docs rendering for recurrent visits scheduler | 29 October 2021, 13:44:56 UTC |
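The move from ratios to relative weights for recurrent visit policies (commit 00ff02e, and the 'weight' values in the scheduling_policy example of commit a76bb02) can be illustrated with a minimal sketch. This is not the swh-scheduler implementation: the function names normalize_weights and split_batch are hypothetical, and the rule for distributing the rounding remainder is an assumption made only for illustration.

```python
from typing import Dict


def normalize_weights(weights: Dict[str, float]) -> Dict[str, float]:
    """Turn relative weights into ratios that sum to 1 (hypothetical helper)."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return {name: weight / total for name, weight in weights.items()}


def split_batch(weights: Dict[str, float], batch_size: int) -> Dict[str, int]:
    """Split a batch of visits between policies according to their weights.

    Assumption: any remainder left by rounding down goes to the heaviest policy.
    """
    ratios = normalize_weights(weights)
    counts = {name: int(batch_size * ratio) for name, ratio in ratios.items()}
    leftover = batch_size - sum(counts.values())
    heaviest = max(weights, key=lambda name: weights[name])
    counts[heaviest] += leftover
    return counts


if __name__ == "__main__":
    # Policy names and weights taken from the config example in commit a76bb02.
    git_policies = {
        "already_visited_order_by_lag": 55,
        "never_visited_oldest_update_first": 45,
    }
    print(split_batch(git_policies, 200))
```

Because the weights are normalized at use time, values such as 55/45 and 11/9 describe the same split, which is the misinterpretation risk the commit message says unchecked ratios carried.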