https://forge.softwareheritage.org/source/swh-scheduler.git

Revision Message Commit Date
fa9d328 Updated backport on buster-swh from debian/0.18.0-1_swh1 (unstable-swh) 02 September 2021, 09:37:34 UTC
db6bff1 Merge tag 'debian/0.18.0-1_swh1' into debian/buster-swh 02 September 2021, 09:37:34 UTC
1a70fd6 Updated debian changelog for version 0.18.0 02 September 2021, 09:35:32 UTC
22baf3f Update upstream source from tag 'debian/upstream/0.18.0' Update to upstream version '0.18.0' with Debian dir e98b6b91a4c915d0ad9f6ee1286273ed99ee3b5b 02 September 2021, 09:35:31 UTC
66bf492 New upstream version 0.18.0 02 September 2021, 09:35:31 UTC
ecc1400 runner: Improve help message on the task types flag. 02 September 2021, 09:15:36 UTC
63fdda0 send-to-celery: Add more options to allow scheduling of edge cases In the non optimal case, we may want to trigger specific cases (not-yet enabled origins, origins from a specific lister...). Related to T3350 27 August 2021, 11:26:38 UTC
7cc37fa Refine scheduling policy for origins with no known last update For origins that have never been visited, and for which we don't have a queue position yet, we want to visit them in the order they've been added. 26 August 2021, 14:49:37 UTC
2efad28 Add a swh scheduler origin send-to-celery subcommand The subcommand bypasses the legacy task-based mechanism to directly send new origin visits to celery 26 August 2021, 14:48:46 UTC
5e8007f Add table sampling option to grab_next_visits Running common operations on all git origins is pretty intense. Using table sampling gives us the opportunity to at least schedule some jobs in (decently small) time. 26 August 2021, 14:47:52 UTC
cc76a57 journal_client: Only upsert if we have something to upsert 26 August 2021, 09:44:14 UTC
ab15591 Updated backport on buster-swh from debian/0.17.1-1_swh1 (unstable-swh) 26 August 2021, 08:43:42 UTC
395f14f Merge tag 'debian/0.17.1-1_swh1' into debian/buster-swh 26 August 2021, 08:43:41 UTC
4053937 Updated debian changelog for version 0.17.1 26 August 2021, 08:41:41 UTC
c36a724 Update upstream source from tag 'debian/upstream/0.17.1' Update to upstream version '0.17.1' with Debian dir e98b6f8fc3c8547ef7148fd0b9915f432584ab81 26 August 2021, 08:41:40 UTC
d04dbb3 New upstream version 0.17.1 26 August 2021, 08:41:40 UTC
506f78c journal_client: Ensure queue position does not overflow Queue positions are dates, and the next_position_offset used to compute the new queue position was not bounded, which could cause an overflow error. This commit adapts the journal client computations to limit next_position_offset to 10. This value was chosen because above that exponent the dates overflow (and we are way in the future already). Related to T3502 26 August 2021, 08:24:11 UTC
28ae1d8 Replace index-fossology-license-for-range with index-fossology-license-for-partition We changed the task name/interface a while ago 18 August 2021, 09:20:25 UTC
ec56447 Updated backport on buster-swh from debian/0.17.0-1_swh1 (unstable-swh) 06 August 2021, 09:14:02 UTC
ef76af4 Merge tag 'debian/0.17.0-1_swh1' into debian/buster-swh 06 August 2021, 09:14:01 UTC
fa762c5 Updated debian changelog for version 0.17.0 06 August 2021, 09:11:54 UTC
416b2c5 Update upstream source from tag 'debian/upstream/0.17.0' Update to upstream version '0.17.0' with Debian dir c55959218dbe52af5f47cb3541419dfb69d77945 06 August 2021, 09:11:54 UTC
3c61059 New upstream version 0.17.0 06 August 2021, 09:11:53 UTC
8281e35 journal_client: Disable origins when too many visited attempts failed This disables origins whose visit attempts failed or were not found 3 times in a row. It's not definitive though, as it's the lister's responsibility to re-enable origins if they get listed again. Related to T2345 03 August 2021, 11:56:32 UTC
1bcf84d Add a successive_visits counter to origin visit stats This maintains the number of successive visits resulting in the same status. This will help implementing disabling of too many successive failed or not_found visits for a given origin. Related to T2345 03 August 2021, 10:49:45 UTC
4fa29fe journal_client: Update get_last_status docstring Related to T2345 30 July 2021, 13:35:17 UTC
3b929d0 journal_client: Refactor by inlining the update_position_offset This is no longer required as it's called once. Related to T2345 30 July 2021, 13:23:14 UTC
87e66fa Only record last_visited and last_successful in origin_visit_stats After using this schema for a while, all queries can be implemented in terms of these two timestamps, instead of the four original last_eventful, last_uneventful, last_failed and last_notfound timestamps. This ends up simplifying the logic within the journal client, as well as that of the grab_next_visits query builder. To make this change work, we also stop considering out of order messages altogether in journal_client. This welcome simplification is an accuracy tradeoff that is explained in the updated documentation of the journal client: .. [1] Ignoring out of order messages makes the initialization of the origin_visit_status table (from a full journal) less deterministic: only the `last_visit`, `last_visit_state` and `last_successful` fields are guaranteed to be exact, the `next_position_offset` field is a best effort estimate (which should converge once the client has run for a while on in-order messages). 23 July 2021, 09:56:32 UTC
3ca0d65 test_journal_client: Unify test assertion like the rest Related to D5917 23 July 2021, 07:22:46 UTC
8cf2238 test: Refactor assert_visit_stats_ok to ignore_fields This simplifies and unifies properly the utility test function to compare visit stats. 23 July 2021, 07:18:20 UTC
d58776a Introduce new scheduling policy to grab origins without last update This is in charge of scheduling origins without last update. This also updates the global queue position so the journal client can initialize correctly the next position per origin and visit type. Related to T2345 22 July 2021, 10:23:44 UTC
825e8cf grab_next_visits: make the handling of CTEs more modular This allows us to insert extra CTEs if a scheduling policy needs it. 22 July 2021, 10:19:42 UTC
8c4ae9f journal_client: Compute next position for origin visit For origins without any last_update information [1], the journal client is now also in charge of moving their next position in the queue for rescheduling. Depending on their status, the next position offset and next_visit_queue_position are updated after each visit completes: - if the visit has failed, increase the next visit target by the minimal visit interval (to take into account transient loading issues) - if the visit is successful, and records some changes, decrease the visit interval index by 2 (visit the origin *way* more often). - if the visit is successful, and records no changes, increase the visit interval index by 1 (visit the origin less often). We then set the next visit target to its current value + the new visit interval multiplied by a random fudge factor (picked in the -/+ 10% range). The fudge factor allows the visits to spread out, avoiding "bursts" of loaded origins e.g. when a number of origins from a single hoster are processed at once. Note that the computations happen for all origins for simplicity and code maintenance but it will only be used by a new soon-to-be scheduling policy. [1] Lister cannot provide it for some reason. 06 July 2021, 12:35:13 UTC
cb1edf1 Introduce storage for the recurrent visit scheduler queue position 01 July 2021, 08:36:44 UTC
ec6e69f Start handling of recurrent loading tasks in scheduler This deals first and foremost with the next_position_offset update done by the scheduler journal client. 01 July 2021, 08:36:44 UTC
c486b28 journal_client: Explicit docstring 29 June 2021, 13:16:15 UTC
98f99b9 journal_client: Only check last_* fields for some permutation tests In a future commit, we will add new fields whose values will be permutation dependent. 23 June 2021, 15:02:34 UTC
1006f0a journal_client: Auto-generate the empty object from model fields This will help us when adding new fields to the table. 23 June 2021, 14:54:34 UTC
6400cc2 backend: Auto-generate origin visit stats upsert query This will help us when adding new fields to the table. 23 June 2021, 14:54:34 UTC
3762c34 cli/task: Ensure cli output is always in the same order 23 June 2021, 14:54:34 UTC
ed81870 Add a specific cooldown for notfound origins This allows us to avoid repeating visits on them, until a next pass of the lister can mark them as disabled. 23 June 2021, 09:13:00 UTC
651ddcc Add a (longer) specific cooldown for failed origin visits 23 June 2021, 09:13:00 UTC
ce8608d Make the origin visit scheduling cooldown configurable 23 June 2021, 09:13:00 UTC
3b5a8d7 Updated backport on buster-swh from debian/0.16.0-1_swh1 (unstable-swh) 22 June 2021, 15:41:47 UTC
18bb587 Merge tag 'debian/0.16.0-1_swh1' into debian/buster-swh 22 June 2021, 15:41:46 UTC
61259ee Updated debian changelog for version 0.16.0 22 June 2021, 15:39:45 UTC
4ed7ce8 Update upstream source from tag 'debian/upstream/0.16.0' Update to upstream version '0.16.0' with Debian dir 76fdfbde021aeb32390560c3684c46131a969de1 22 June 2021, 15:39:44 UTC
e63d38c New upstream version 0.16.0 22 June 2021, 15:39:43 UTC
7f51f27 interface: Add get_listers method Add new method to scheduler interface returning the full list of listers registered in the database. Related to T3127 22 June 2021, 12:36:08 UTC
9e1b414 Drop duplicate docstring from backend 21 June 2021, 13:46:12 UTC
919334e Updated backport on buster-swh from debian/0.15.0-1_swh1 (unstable-swh) 10 June 2021, 14:51:04 UTC
ab127f6 Merge tag 'debian/0.15.0-1_swh1' into debian/buster-swh 10 June 2021, 14:51:04 UTC
4e410ba Updated debian changelog for version 0.15.0 10 June 2021, 14:48:52 UTC
79e2023 New upstream version 0.15.0 10 June 2021, 14:48:51 UTC
d93dbf7 Update upstream source from tag 'debian/upstream/0.15.0' Update to upstream version '0.15.0' with Debian dir d49b2d7bf8b3e4b7635110179f0d613aeda1d740 10 June 2021, 14:48:51 UTC
c7707b5 runner: Separate scheduling tasks with and without priority concern In effect, this will allow running 2 runners: - one for recurring tasks - one for the save code now This should decrease the probability of the save code now scheduling tasks getting stuck behind the main scheduler runner. Related to T3367 10 June 2021, 12:55:04 UTC
21c4279 Refactor and extract a get_available_slots utility This adds coverage as well. This will be needed for subsidiary diffs. Related to T3367 10 June 2021, 10:15:22 UTC
9d2618d Add typing stubs dependencies for mypy>0.900 This also makes missing dependencies explicit 09 June 2021, 12:13:36 UTC
9f7ab8f pytest_plugin: Explicitly set hostname in broker_url for celery TestApp Since the release of kombu 5.1.0, a warning is now issued when a hostname is not set in the broker_url config value of a celery app. That change makes the test_celery_monitor_ping test fail due to that new unexpected warning. So explicitly add localhost hostname in the broker_url value of the celery TestApp config. 25 May 2021, 11:43:03 UTC
479b38d Updated backport on buster-swh from debian/0.14.2-1_swh1 (unstable-swh) 06 May 2021, 15:15:12 UTC
0035687 Merge tag 'debian/0.14.2-1_swh1' into debian/buster-swh 06 May 2021, 15:15:12 UTC
04db822 Updated debian changelog for version 0.14.2 06 May 2021, 15:13:11 UTC
ebe1d6d Update upstream source from tag 'debian/upstream/0.14.2' Update to upstream version '0.14.2' with Debian dir 50bf236b540cecee0d42d1299ec4f4c9131e450a 06 May 2021, 15:13:10 UTC
628a203 New upstream version 0.14.2 06 May 2021, 15:13:09 UTC
fe9d949 Fix flaky test_grab_ready_* tests 06 May 2021, 14:20:57 UTC
7225578 Updated backport on buster-swh from debian/0.14.1-1_swh1 (unstable-swh) 06 May 2021, 14:19:35 UTC
df7ca99 Merge tag 'debian/0.14.1-1_swh1' into debian/buster-swh 06 May 2021, 14:19:35 UTC
eb86022 Updated debian changelog for version 0.14.1 06 May 2021, 14:17:39 UTC
eacdf09 Update upstream source from tag 'debian/upstream/0.14.1' Update to upstream version '0.14.1' with Debian dir b739d8b0795caaff54e942819f8ab4ea8da8daa7 06 May 2021, 14:17:38 UTC
8981a70 New upstream version 0.14.1 06 May 2021, 14:17:37 UTC
8a892e2 Use swh.core 0.14 It renamed db_name to dbname, which is a breaking change. 06 May 2021, 13:49:47 UTC
bab557e Remove row locking from SQL queries This would only be useful if we had multiple runners running concurrently, but that's not the case. 30 April 2021, 18:13:38 UTC
feff179 tox: Add sphinx environments to check sane doc build This makes it possible to check that the package documentation can be built without producing sphinx warnings. The sphinx environment is designed to be used in continuous integration in order to prevent breaking the documentation build when committing changes. The sphinx-dev environment is designed to be used inside a full swh development environment. Related to T3258 26 April 2021, 16:01:59 UTC
f186910 Add default index to task(type, next_run) in schema The staging scheduler runner was slow when fetching task due to that missing index. Related to T3271#63831 20 April 2021, 13:50:19 UTC
f33f743 Simplify priority computation in tests + improve exhaustivity We no longer need to deal with ratios, so let's count the objects directly instead. Plus, the existing tests did not check tasks with None priority (because they did not have access to it when ratios were given by the backend), so they do now. 20 April 2021, 11:01:33 UTC
f4e6292 sql/updates/27: Fix sql upgrade script Related to T3271 20 April 2021, 10:18:23 UTC
14597f2 Updated backport on buster-swh from debian/0.13.0-1_swh1 (unstable-swh) 20 April 2021, 09:52:53 UTC
2b2b7b7 Merge tag 'debian/0.13.0-1_swh1' into debian/buster-swh 20 April 2021, 09:52:52 UTC
828d288 Updated debian changelog for version 0.13.0 20 April 2021, 09:51:00 UTC
15684cb New upstream version 0.13.0 20 April 2021, 09:50:59 UTC
f2122c2 Update upstream source from tag 'debian/upstream/0.13.0' Update to upstream version '0.13.0' with Debian dir 753ba2cc0646d7be73acfff0f55b36cc7fe12e65 20 April 2021, 09:50:59 UTC
befccb9 scheduler: Clean up priority/ratio task dead code Since [1], tasks with priority are routed to dedicated queues (see tasks for more details). The tasks with priority to be scheduled have their own dedicated endpoints to be called. [1] Related to T3084 Related to T3271 20 April 2021, 09:27:18 UTC
4e06bcd Parse task_ids before calling set_status_tasks. So errors on the CLI side do not trigger an exception on the server 20 April 2021, 09:19:52 UTC
974c0c2 tests: Complete checks on message with priority consumption Related to T3084 15 April 2021, 12:57:25 UTC
8a3a8f9 Updated backport on buster-swh from debian/0.12.0-1_swh1 (unstable-swh) 15 April 2021, 11:38:11 UTC
e4b9ac5 Merge tag 'debian/0.12.0-1_swh1' into debian/buster-swh 15 April 2021, 11:38:11 UTC
38ef60a Updated debian changelog for version 0.12.0 15 April 2021, 11:36:14 UTC
9db86c1 Update upstream source from tag 'debian/upstream/0.12.0' Update to upstream version '0.12.0' with Debian dir 788879e9b7caf6d54d8f4645d720dc7b7859e069 15 April 2021, 11:36:13 UTC
a8b8fde New upstream version 0.12.0 15 April 2021, 11:36:12 UTC
17052c4 Route priority tasks to dedicated save code now queues This splits the calls to read tasks into 2 calls, one for tasks with no priority (standard), another call for tasks with priority. If any tasks with priority are detected, they are routed to dedicated `save_code_now:` prefixed named queues (per task type). Related to T3084 15 April 2021, 11:24:13 UTC
bfc1a87 Fix various Sphinx warnings 15 April 2021, 08:19:50 UTC
641b53f Updated backport on buster-swh from debian/0.11.0-1_swh1 (unstable-swh) 14 April 2021, 16:21:24 UTC
d5b6032 Merge tag 'debian/0.11.0-1_swh1' into debian/buster-swh 14 April 2021, 16:21:23 UTC
3bf2f9a Updated debian changelog for version 0.11.0 14 April 2021, 16:19:31 UTC
77ea590 New upstream version 0.11.0 14 April 2021, 16:19:30 UTC
c32aff0 Update upstream source from tag 'debian/upstream/0.11.0' Update to upstream version '0.11.0' with Debian dir aac6bb6258a43d962787d39b4006589985c90bcb 14 April 2021, 16:19:30 UTC
3e2ae3d backend: Open endpoints to peek/grab tasks with any priority The priority notion becomes a blur. Any task with a non-null priority is considered for reading or grabbing. In a future commit, this should allow the runner to evolve to reroute tasks with priority to other queues. Related to T3084 13 April 2021, 16:05:29 UTC
ecab745 Make origin_visit_stats_get return results from all pages psycopg2.extras.execute_values executes queries in batches of 100 by default. At the end of execute_values, only the last batch of results is available in the cursor; To fetch all results, one needs to set fetch=True instead of using the cursor. 11 February 2021, 18:39:29 UTC
86ada44 journal client: Filter out status messages without type This allows us to support reading the journal from the beginning, ignoring messages with the old schema. 11 February 2021, 18:38:44 UTC
cdb1775 Simplify max_date() The built-in `max` function can take an iterable directly, no need to reimplement it. 11 February 2021, 18:24:01 UTC
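
The queue position update described in commit 8c4ae9f, together with the bound introduced in 506f78c, can be sketched roughly as follows. This is only an illustration of the rules stated in those commit messages, not the code from swh-scheduler: the function name `next_position`, the `VISIT_INTERVALS` table (powers of two, in days), the lower bound of 0 on the offset, and the choice to skip the fudge factor for failed visits are all assumptions made for the example. Only the -2 / +1 offset adjustments, the +/- 10% fudge range and the upper bound of 10 come from the log above.

```python
import random
from datetime import datetime, timedelta, timezone

# Hypothetical interval table: the real intervals are not given in the log.
VISIT_INTERVALS = [timedelta(days=2 ** i) for i in range(11)]  # indexes 0..10
MIN_INTERVAL = VISIT_INTERVALS[0]
MAX_OFFSET = 10  # upper bound mentioned in commit 506f78c


def next_position(current, offset, status, eventful, rng=random):
    """Return (new queue position, new offset) after a visit completes.

    current  -- current queue position (a datetime)
    offset   -- current index into VISIT_INTERVALS
    status   -- "failed" or "successful" (labels assumed for this sketch)
    eventful -- True if a successful visit recorded some changes
    """
    if status == "failed":
        # Failed visit: retry after the minimal interval, offset unchanged.
        return current + MIN_INTERVAL, offset

    if eventful:
        offset -= 2  # changes found: visit the origin much more often
    else:
        offset += 1  # nothing new: visit the origin less often

    # Keep the index within the table (the lower bound is an assumption).
    offset = max(0, min(offset, MAX_OFFSET))

    # Spread visits out with a random fudge factor in the +/- 10% range.
    fudge = rng.uniform(0.9, 1.1)
    return current + VISIT_INTERVALS[offset] * fudge, offset


if __name__ == "__main__":
    now = datetime(2021, 7, 6, tzinfo=timezone.utc)
    print(next_position(now, 5, "successful", eventful=True))
```

Passing a seeded `random.Random` instance as `rng` makes the result reproducible, which is convenient when testing a policy like this.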