c17a34f | Jenkins for Software Heritage | 23 November 2020, 12:44:39 UTC | New upstream version 0.8.0 | 23 November 2020, 12:44:39 UTC |
49ed819 | Antoine R. Dumont (@ardumont) | 20 November 2020, 16:17:14 UTC | requirements-test.txt: Drop no longer needed pytest-postgresql requirement requirements-swh.txt already declares the swh.core[db] dependency which transitively pulls it. Related to T2746 | 23 November 2020, 12:11:02 UTC |
2f9e8ec | Antoine R. Dumont (@ardumont) | 21 October 2020, 14:52:45 UTC | scheduler.pytest_plugin: Make scheduler tests faster Reuse the swh.core.db.pytest_plugin factory | 22 October 2020, 10:09:06 UTC |
f385291 | Jenkins for Software Heritage | 19 October 2020, 07:33:53 UTC | New upstream version 0.7.0 | 19 October 2020, 07:33:53 UTC |
6a4455c | Antoine R. Dumont (@ardumont) | 19 October 2020, 07:25:04 UTC | pytest_plugin: Explicitly name the scheduler test db differently When using tests on modules with different lower level modules (e.g. storage, scheduler, ...) this avoids clashes. | 19 October 2020, 07:25:04 UTC |
13dcadd | Antoine R. Dumont (@ardumont) | 16 October 2020, 11:12:27 UTC | scheduler: Type and unify get_scheduler factory with other factories Related to T1410 | 16 October 2020, 16:24:03 UTC |
dd33cdc | Antoine R. Dumont (@ardumont) | 16 October 2020, 11:35:54 UTC | test_server: Simplify exception manipulations | 16 October 2020, 11:43:54 UTC |
315a2c9 | Stefano Zacchiroli | 02 October 2020, 14:24:01 UTC | tox.ini: pin black to the pre-commit version (19.10b0) to avoid flip-flops | 02 October 2020, 14:24:01 UTC |
b7e5358 | Nicolas Dandrimont | 25 September 2020, 15:19:10 UTC | Drop vcversioner from requirements We stopped using it months ago. | 25 September 2020, 15:19:17 UTC |
4951a23 | Nicolas Dandrimont | 25 September 2020, 12:19:21 UTC | Run isort after the CLI import changes | 25 September 2020, 12:19:21 UTC |
ba781a5 | Jenkins for Software Heritage | 25 September 2020, 10:06:31 UTC | New upstream version 0.6.0 | 25 September 2020, 10:06:31 UTC |
be7a5ae | David Douard | 22 September 2020, 08:39:15 UTC | Rename sql files according to swh.core 0.3 | 25 September 2020, 07:53:53 UTC |
5cc573d | David Douard | 22 September 2020, 08:36:11 UTC | Adapt cli declaration entrypoint to swh.core 0.3 | 25 September 2020, 07:48:38 UTC |
1d40f20 | Jenkins for Software Heritage | 24 September 2020, 15:53:24 UTC | New upstream version 0.5.3 | 24 September 2020, 15:53:24 UTC |
99e5af8 | Nicolas Dandrimont | 24 September 2020, 15:44:00 UTC | Move from kombu.five.monotonic to time.monotonic Looks like kombu finally axed python2 support. | 24 September 2020, 15:44:00 UTC |
7b0d48f | Antoine Lambert | 17 September 2020, 16:01:30 UTC | python: Reorder imports with isort Related to T2610 | 17 September 2020, 16:03:39 UTC |
8d8b58f | Antoine Lambert | 17 September 2020, 15:26:12 UTC | pre-commit: Add isort hook and configuration Related to T2610 | 17 September 2020, 16:03:39 UTC |
4bec5c8 | Antoine Lambert | 17 September 2020, 16:03:24 UTC | pre-commit: Update flake8 hook configuration flake8 hook has been removed from https://github.com/pre-commit/pre-commit-hooks so now use the one from https://gitlab.com/pycqa/flake8 | 17 September 2020, 16:03:39 UTC |
f5c8154 | David Douard | 10 September 2020, 09:25:08 UTC | cli: speed up the `swh` cli command startup time by moving import statements into functions and using conditional import of typechecking modules (especially StorageInterface which triggers the loading of 300+ modules). Related to T2575. | 10 September 2020, 15:46:08 UTC |
b24be0c | Valentin Lorentz | 25 August 2020, 08:41:38 UTC | Tell pytest not to recurse in dotdirs. pytest wastes a lot of time in .hypothesis and .git; this commit excludes them. | 25 August 2020, 08:41:38 UTC |
6426208 | Antoine R. Dumont (@ardumont) | 01 August 2020, 08:03:23 UTC | cli.task: Migrate scheduler cli to latest storage change on iter_origins Related to T645 | 03 August 2020, 10:18:23 UTC |
849d063 | Antoine R. Dumont (@ardumont) | 24 July 2020, 08:16:09 UTC | test_cli: Adapt tests data and drop unsupported "validate" proxy | 24 July 2020, 08:22:07 UTC |
9f52d95 | Antoine R. Dumont (@ardumont) | 21 July 2020, 08:34:34 UTC | cli.task: Fix iter_origin returned types Related to T2494 | 21 July 2020, 08:36:03 UTC |
c26569f | Jenkins for Software Heritage | 10 July 2020, 11:08:29 UTC | New upstream version 0.5.2 | 10 July 2020, 11:08:29 UTC |
254e24a | Antoine R. Dumont (@ardumont) | 10 July 2020, 10:11:23 UTC | Do not expose pytest-plugin through setuptools, let modules require it when needed Defining the pytest-plugin through the setuptools entry point [1] makes it loaded by default. This creates loading issues on modules depending on scheduler but not on the pytest plugin scheduler exposes, as explained in the doc [2]. Instead, modules depending on the pytest plugin will explicitly declare it in their root conftest [3]: pytest_plugins = [ "swh.scheduler.pytest_plugin" ] [1] https://docs.pytest.org/en/stable/writing_plugins.html#setuptools-entry-points [2] https://docs.pytest.org/en/stable/writing_plugins.html#plugin-discovery-order-at-tool-startup [3] https://docs.pytest.org/en/stable/writing_plugins.html#requiring-loading-plugins-in-a-test-module-or-conftest-file Related to D3475 Related to T2484 | 10 July 2020, 10:27:42 UTC |
7a6149f | Jenkins for Software Heritage | 09 July 2020, 09:51:36 UTC | New upstream version 0.5.1 | 09 July 2020, 09:51:36 UTC |
0bc33b2 | Jenkins for Software Heritage | 09 July 2020, 08:20:41 UTC | New upstream version 0.5.0 | 09 July 2020, 08:20:41 UTC |
ece598c | Antoine Lambert | 08 July 2020, 16:12:15 UTC | requirements.txt: Remove future dependency This was needed for celery 4.4.4 but that version is not used anymore. | 08 July 2020, 16:33:25 UTC |
7009c3b | Nicolas Dandrimont | 08 July 2020, 15:55:07 UTC | Move all celery-related fixtures to the swh.scheduler pytest plugin This allows us to reuse these fixtures in other modules without brittle swh.scheduler.tests.conftest star imports. Unfortunately, we can't really override pytest fixtures from one plugin to another. We therefore reimplement the fixtures provided by celery, inlining the static configuration and renaming them to our names in the process. This also adds a backwards-compatibility import from pytest_plugin to conftest, to allow old users of the conftest fixtures to keep working. | 08 July 2020, 15:59:15 UTC |
ce63e6a | Antoine R. Dumont (@ardumont) | 07 July 2020, 10:17:55 UTC | pytest.ini: Drop filterwarnings which never worked | 07 July 2020, 10:18:50 UTC |
7dadc14 | Jenkins for Software Heritage | 06 July 2020, 14:52:41 UTC | New upstream version 0.4.0 | 06 July 2020, 14:52:41 UTC |
b2cbb9b | Nicolas Dandrimont | 06 July 2020, 12:51:41 UTC | Move shareable fixtures out of conftest into a dedicated pytest plugin This avoids having to run `from swh.scheduler.tests.conftest import *` in other modules, e.g. swh.lister, to import and use the swh_scheduler pytest fixture. | 06 July 2020, 14:42:04 UTC |
189d845 | Jenkins for Software Heritage | 06 July 2020, 10:23:30 UTC | New upstream version 0.3.0 | 06 July 2020, 10:23:30 UTC |
5b373ce | Nicolas Dandrimont | 06 July 2020, 07:49:44 UTC | Introduce a get_listed_origins endpoint This paginated endpoint allows retrieving information about the origins recorded by listers. | 06 July 2020, 09:51:10 UTC |
aefc5c9 | Nicolas Dandrimont | 06 July 2020, 07:48:29 UTC | Don't recurse into attrs objects when serializing We need to use our serialization hook recursively to make sure that we can deserialize nested data structures. | 06 July 2020, 07:48:29 UTC |
39d886b | Jenkins for Software Heritage | 22 June 2020, 12:07:03 UTC | New upstream version 0.2.2 | 22 June 2020, 12:07:03 UTC |
cc8fa7f | Nicolas Dandrimont | 22 June 2020, 10:46:09 UTC | Re-introduce the root endpoint for the rpc server | 22 June 2020, 10:55:11 UTC |
fa7357b | Jenkins for Software Heritage | 22 June 2020, 10:12:49 UTC | New upstream version 0.2.1 | 22 June 2020, 10:12:49 UTC |
265bc8b | Nicolas Dandrimont | 22 June 2020, 08:58:09 UTC | The celery-monitor subcommand glob filtering needs celery >= 4.3 | 22 June 2020, 08:58:09 UTC |
da69466 | Jenkins for Software Heritage | 22 June 2020, 08:36:48 UTC | New upstream version 0.2.0 | 22 June 2020, 08:36:48 UTC |
8a1724a | Nicolas Dandrimont | 22 June 2020, 08:26:40 UTC | Add SQL for version 16 of the schema | 22 June 2020, 08:26:40 UTC |
d107a55 | Nicolas Dandrimont | 16 June 2020, 08:25:08 UTC | Implement storage of listed origins This new API endpoint allows listers to record the origins they have seen during their current run. Origins are identified by the lister instance, the url of the origin, and the type of loader that should be used to load this origin. The implementation allows listers to just send the list of origins they've seen (with some lightweight extra information), leaving the backend to handle whether to do an insertion or an update to an existing origin. The current implementation doesn't disable origins that have disappeared when doing a full listing run. This step will be done by a separate "origin garbage collection" endpoint, which will peruse the `last_seen` field. | 16 June 2020, 08:25:08 UTC |
e0fa5c5 | Nicolas Dandrimont | 16 June 2020, 08:24:03 UTC | Move lister addition in scheduler tests to a pytest fixture This lets us keep the tests a little DRYer. | 16 June 2020, 08:24:03 UTC |
04894bd | Nicolas Dandrimont | 16 June 2020, 08:22:23 UTC | Lister.instance_name doesn't need a factory/default value | 16 June 2020, 08:22:23 UTC |
f520108 | Nicolas Dandrimont | 16 June 2020, 08:08:59 UTC | Improve support of primary keys This splits primary keys across "automatic" primary keys (handled by the database) and manual primary keys (managed by the user). Use the opportunity to improve/clarify the documentation of field metadata attributes. | 16 June 2020, 08:22:12 UTC |
1c93e55 | Nicolas Dandrimont | 12 June 2020, 10:24:20 UTC | Implement basic storage and retrieval of lister information This adds a pair of functions to the backend: - `get_or_create_lister` pulls the record for a given lister from the database - `update_lister` updates the record for a given lister in the database This is one of the basic building blocks for the integration of lister information directly in the scheduler database. Related to T2442. | 15 June 2020, 13:41:02 UTC |
466ac59 | Nicolas Dandrimont | 15 June 2020, 12:46:28 UTC | Introduce a SchedulerException base class This allows us to automatically serialize/deserialize exceptions under this base class within our RPC framework. | 15 June 2020, 12:53:30 UTC |
c509a12 | Nicolas Dandrimont | 12 June 2020, 09:03:26 UTC | Introduce some scaffolding for an attrs-based BaseSchedulerModel Alongside swh.model.model, this allows us to define data models for the objects the scheduler is working with, and to serialize/deserialize these objects transparently at the RPC layer. This also introduces some mild ORM-like logic so we can keep the actual SQL a little DRYer. | 15 June 2020, 10:49:25 UTC |
4c0c37b | Nicolas Dandrimont | 10 June 2020, 14:09:53 UTC | Use the automatic RPC client/server generation | 11 June 2020, 09:42:37 UTC |
aedd323 | Nicolas Dandrimont | 10 June 2020, 09:31:45 UTC | Replace swh-worker-control with a swh scheduler celery-monitor subcommand This new subcommand has two commands: - ping: checks whether the given worker instance answers within a given timeout - list-running: lists running tasks on the given worker instance | 10 June 2020, 10:15:54 UTC |
8411335 | Nicolas Dandrimont | 10 June 2020, 09:30:31 UTC | Remove double logging setup in cli The logging module is already initialized by the main swh.core cli; This only creates double logging with no advantages whatsoever. | 10 June 2020, 09:30:31 UTC |
873cdac | Nicolas Dandrimont | 10 June 2020, 09:28:19 UTC | Handle psycopg2 OperationalError in cli initialization When running the cli with default settings (i.e. pointing to a softwareheritage-scheduler-dev database), and the database doesn't exist, an OperationalError is raised. This shouldn't prevent (some of the) cli subcommands from working, so catch this error and ignore it as one of the scheduler backend setup failure modes. | 10 June 2020, 09:28:19 UTC |
28c5b8d | Nicolas Dandrimont | 09 June 2020, 13:47:26 UTC | Replace vcversioner with setuptools-scm | 09 June 2020, 13:49:00 UTC |
14cd5bb | Nicolas Dandrimont | 03 June 2020, 15:17:50 UTC | Blacken for python3.7+ | 03 June 2020, 15:19:00 UTC |
6ac3d56 | Nicolas Dandrimont | 03 June 2020, 10:34:11 UTC | Drop use of pifpaf and the "db" pytest mark We've been using pytest-postgresql for... a year (4117d5a). | 03 June 2020, 10:34:11 UTC |
db7f167 | Jenkins for Software Heritage | 03 June 2020, 09:39:24 UTC | New upstream version 0.1.1 | 03 June 2020, 09:39:24 UTC |
3f42423 | Nicolas Dandrimont | 03 June 2020, 09:29:58 UTC | Add future dependency, missing from celery 4.4.4 Without future, the tests involving celery hang indefinitely. Upstream issue: https://github.com/celery/celery/issues/6145 | 03 June 2020, 09:29:58 UTC |
e06c756 | Jenkins for Software Heritage | 19 May 2020, 09:52:30 UTC | New upstream version 0.1.0 | 19 May 2020, 09:52:30 UTC |
92c0869 | Nicolas Dandrimont | 19 May 2020, 09:30:13 UTC | Celery runner: only schedule tasks when the buffer is less than 80% full The queries to pick up tasks from the scheduler sometimes degenerate when the number of tasks fetched is too low, which hangs the runner for all other tasks. Adding this lower bound helps postgresql use proper optimizations to pull tasks. | 19 May 2020, 09:34:52 UTC |
b839906 | Nicolas Dandrimont | 19 May 2020, 09:12:55 UTC | Disable the azure http logger in the celery worker base config This is suboptimal (we should move all of this to a logconfig where we can set this stuff), but this is consistent with how we do things currently. | 19 May 2020, 09:14:25 UTC |
2ea919c | Nicolas Dandrimont | 19 May 2020, 09:12:26 UTC | Fix black for py37 | 19 May 2020, 09:12:26 UTC |
3a74069 | Antoine R. Dumont (@ardumont) | 12 May 2020, 09:55:09 UTC | test_scheduler: Fix pep8 violation This fixes ci build [1] [1] https://jenkins.softwareheritage.org/job/DSCH/job/tests/859/console | 12 May 2020, 09:55:09 UTC |
2cc8aa0 | Stefano Zacchiroli | 29 April 2020, 16:33:16 UTC | setup.py: add documentation link | 29 April 2020, 16:33:16 UTC |
1abff22 | Antoine R. Dumont (@ardumont) | 20 April 2020, 15:29:49 UTC | setup: Update the minimum required runtime python3 version Related to T2367 | 20 April 2020, 15:29:49 UTC |
551ceac | David Douard | 08 April 2020, 20:16:58 UTC | Add a pyproject.toml file to target py37 for black | 08 April 2020, 20:16:58 UTC |
cc0ef04 | David Douard | 08 April 2020, 14:58:01 UTC | Enable black - blackify all the python files, - enable black in pre-commit, - add a black tox environment. | 08 April 2020, 14:58:01 UTC |
77b2d0b | Antoine R. Dumont (@ardumont) | 27 March 2020, 06:43:03 UTC | tests: Adapt model according to latest change The origin model no longer allows a type. Related to f533f62bbf114cfcc29f7c72307c4dfbe99cf048 | 27 March 2020, 06:43:03 UTC |
75bf007 | Jenkins for Software Heritage | 23 March 2020, 12:11:59 UTC | New upstream version 0.0.72 | 23 March 2020, 12:11:59 UTC |
e6c2a86 | Nicolas Dandrimont | 23 March 2020, 09:45:30 UTC | Implement listener on top of pika instead of celery | 23 March 2020, 11:52:06 UTC |
68c42fb | Antoine R. Dumont (@ardumont) | 03 February 2020, 08:20:57 UTC | scheduler.backend_es: Leave index opened when streaming bulk Prior to this commit, we had the proper behavior of closing index when done streaming. Unfortunately, this created too much gc on es nodes down the line. So for now, we remove that behavior. Note that this implies we need another cog that makes a pass once in a while on indices to close. Also, this has been running on production for 2 weeks now and no more gc issues arose since then. | 26 February 2020, 09:34:09 UTC |
af58466 | Antoine Lambert | 17 February 2020, 15:55:20 UTC | backend: Make create_task_type idempotent There is no reason to raise an error when a task type has already been created, and this stops leaking the psycopg2 IntegrityError exception as part of the scheduler interface. | 18 February 2020, 14:17:02 UTC |
b92e3fd | Valentin Lorentz | 12 February 2020, 12:48:52 UTC | Use swh-storage validation proxy. Required by swh-storage >= v0.0.172. | 12 February 2020, 12:48:52 UTC |
73d1e5e | Antoine R. Dumont (@ardumont) | 31 January 2020, 08:18:25 UTC | cli.task: Change `get_storage` according to latest change | 31 January 2020, 08:18:25 UTC |
1c923aa | Antoine R. Dumont (@ardumont) | 31 January 2020, 08:16:20 UTC | test_cli: Fix storage instantiation following api change Using the `swh.storage.get_storage` function instead of calling directly the class name. This actually fixes the master ci build [1] [1] https://jenkins.softwareheritage.org/job/DSCH/job/tests/743/console | 31 January 2020, 08:16:20 UTC |
cfaa584 | Jenkins for Software Heritage | 23 January 2020, 13:29:32 UTC | New upstream version 0.0.71 | 23 January 2020, 13:29:32 UTC |
f6cc231 | Antoine R. Dumont (@ardumont) | 23 January 2020, 13:21:21 UTC | sentry: Fix initialization init_sentry call API-wise, the `sentry_dsn` is expected to be passed as the first parameter, which in the scheduler's case is not set yet. Forcing it to None for now. | 23 January 2020, 13:21:21 UTC |
6817890 | Jenkins for Software Heritage | 23 January 2020, 12:47:42 UTC | New upstream version 0.0.70 | 23 January 2020, 12:47:42 UTC |
0712207 | Valentin Lorentz | 10 January 2020, 14:13:07 UTC | Use swh.core.sentry instead of calling sentry_sdk.init directly. This adds support for SWH_MAIN_PACKAGE to initialize sentry_sdk with a release. | 10 January 2020, 14:13:07 UTC |
b488d69 | Antoine R. Dumont (@ardumont) | 17 December 2019, 22:23:35 UTC | backend_es: Fix configuration mapping | 17 December 2019, 22:23:35 UTC |
9896f0f | Jenkins for Software Heritage | 17 December 2019, 15:04:47 UTC | New upstream version 0.0.69 | 17 December 2019, 15:04:47 UTC |
cc2de16 | Antoine R. Dumont (@ardumont) | 17 December 2019, 14:57:33 UTC | tests: Try to avoid fixture redefinition Somehow, that messes other tests in the debian build. | 17 December 2019, 14:57:33 UTC |
a901970 | Jenkins for Software Heritage | 17 December 2019, 14:33:32 UTC | New upstream version 0.0.68 | 17 December 2019, 14:33:32 UTC |
73ade78 | Antoine R. Dumont (@ardumont) | 17 December 2019, 14:27:15 UTC | tests: Avoid fixture clash in different purposes fixture Somehow, that fails in the debian build | 17 December 2019, 14:27:50 UTC |
652b583 | Jenkins for Software Heritage | 17 December 2019, 13:38:02 UTC | New upstream version 0.0.67 | 17 December 2019, 13:38:02 UTC |
e9d8a5f | Antoine R. Dumont (@ardumont) | 17 December 2019, 12:28:42 UTC | scheduler.backend: Rename appropriately module elasticsearch_memory | 17 December 2019, 12:33:43 UTC |
2cbfb78 | Antoine R. Dumont (@ardumont) | 17 December 2019, 11:51:28 UTC | Add tests to in memory elasticsearch implementation | 17 December 2019, 12:33:43 UTC |
ba5920d | Antoine R. Dumont (@ardumont) | 17 December 2019, 11:51:13 UTC | backend_es: Add tests around elasticsearch client instantiation | 17 December 2019, 12:33:43 UTC |
38d17de | Antoine R. Dumont (@ardumont) | 17 December 2019, 11:50:13 UTC | tests/common: Remove unneeded behavior | 17 December 2019, 12:33:43 UTC |
ac32b5e | Antoine R. Dumont (@ardumont) | 17 December 2019, 09:59:19 UTC | backend: Add alternate memory elasticsearch implem to allow testing | 17 December 2019, 12:33:43 UTC |
7b1c2d5 | Antoine R. Dumont (@ardumont) | 17 December 2019, 09:57:31 UTC | scheduler.backend_es: Allow using different elasticsearch clients For the moment, only 1 official es client exists | 17 December 2019, 12:33:43 UTC |
ec207fb | Antoine R. Dumont (@ardumont) | 17 December 2019, 09:51:20 UTC | scheduler.backend: Make the returned result a dict | 17 December 2019, 12:33:42 UTC |
f97bff6 | Antoine R. Dumont (@ardumont) | 17 December 2019, 09:50:27 UTC | cli.task: Make page_token actually a string even from the cli That actually makes it consistent with the api | 17 December 2019, 12:33:42 UTC |
d8859d7 | Antoine R. Dumont (@ardumont) | 16 December 2019, 16:15:42 UTC | backend_es: Add initialization endpoint | 17 December 2019, 12:33:42 UTC |
d5cea20 | Antoine R. Dumont (@ardumont) | 16 December 2019, 16:15:24 UTC | backend_es: Remove unused endpoint | 17 December 2019, 12:33:42 UTC |
18df124 | Antoine R. Dumont (@ardumont) | 16 December 2019, 16:14:54 UTC | cli.tasks: Unify logging instruction | 17 December 2019, 12:33:42 UTC |
c5e189b | Antoine R. Dumont (@ardumont) | 16 December 2019, 16:14:08 UTC | test: Allow status definition during task template generation | 17 December 2019, 12:33:42 UTC |
844f3e0 | Antoine R. Dumont (@ardumont) | 16 December 2019, 10:07:27 UTC | tests.scheduler: Extract common utility function and test it | 17 December 2019, 12:33:42 UTC |
2d56669 | Antoine R. Dumont (@ardumont) | 16 December 2019, 09:07:01 UTC | scheduler.cli.task: Rename appropriately backend variable | 17 December 2019, 12:33:42 UTC |
793c233 | Antoine R. Dumont (@ardumont) | 16 December 2019, 09:06:10 UTC | scheduler.backend_es: Rename backend class appropriately | 17 December 2019, 12:33:42 UTC |
d5bf6b1 | Antoine R. Dumont (@ardumont) | 14 December 2019, 17:44:57 UTC | cli.task: Rename internal method appropriately | 17 December 2019, 12:33:42 UTC |