https://forge.softwareheritage.org/source/swh-scheduler.git

Revision Author Date Message Commit Date
3c41cef scheduler.task: Remove no longer used Task class All scheduler tasks have been rewritten to avoid using the inheritance paradigm. The post worker startup initialization no longer creates automatically queues for registered tasks. Queues creation is managed through explicit configuration entries: celery: task_queues: ... task_modules: ... 15 February 2019, 13:59:14 UTC
65d0f73 celery_backend/config: Fix loglevel for amqp module 15 February 2019, 09:43:00 UTC
283cb7c tests: Use hypothesis profile to configure sample generation sizes 14 February 2019, 08:42:26 UTC
c701c88 api/server: Do not read configuration at each request 13 February 2019, 15:19:59 UTC
f0a8c43 listener: make the listener's queue name independent from the hostname the queue being durable and not auto deleted (auto_delete=False), we do not want a new queue to be spawned for each listener instance (eg. in a docker environment). 13 February 2019, 13:37:18 UTC
b423c0b runner: fix task_run configuration bootstrap ensure the task_run is created before sending the celery task; this later task could be executed before the db commit. As a result, the task_run may not have its 'started' field properly set; even the 'ended' and 'status'. 13 February 2019, 13:28:37 UTC
3d761f1 sql: add the swh-lister-bitbucket-* task types 13 February 2019, 13:25:58 UTC
7e3f2fc task: do not send the task-result-exception event in Task.on_failure() since it is unused; the task-failed event being already sent by celery and handled by the listener. 13 February 2019, 13:23:19 UTC
3488c26 requirements-test: Enforce a version for hypothesis Same as other swh modules (swh-web, swh-storage, swh-indexer). Lower versions do not follow the specifications. Related P356 11 February 2019, 11:26:24 UTC
db25694 Fix a bug in the listener: commit() is not defined in the backend but on its connection. 06 February 2019, 13:18:18 UTC
c29d383 Add basic stats to tasks This just increments a counter for started tasks, ended tasks, tasks that failed with an exception. It also registers a timer for every task run. Close T1460. 06 February 2019, 12:06:39 UTC
abfe3db Allow overriding the celery config file name via the SWH_CONFIG_FILENAME env var; this takes precedence over the implicit config file scheme. The config file given via the environment variable is expected to have a [celery] section, which will be used as config for the Celery app created in swh.scheduler.celery_backend.config. Related to T1410 and T826. 06 February 2019, 09:25:07 UTC
0f2f3ff Make (celery) tests immune to environment variables especially CELERY_BROKER_URL... 01 February 2019, 14:32:21 UTC
e2a91fb Remove call to tobytes(), BaseDB now handles conversion. 01 February 2019, 14:03:48 UTC
0c3306b Fix the task_queues Celery config setting in build_app() ensure the config entry contains Queue objects. 31 January 2019, 14:52:41 UTC
f188343 cli: display a sorted list of task-types 31 January 2019, 14:52:41 UTC
b0e6dd8 Fix the listener: accessing the db cnx from the backend has changed 31 January 2019, 14:52:41 UTC
5d40529 Make cli tools output logs on the console by default for other log levels than DEBUG, and add a --no-stdout option flag to disable this. 31 January 2019, 14:52:41 UTC
a8bc684 cli: build the celery app from a celery section of the given configuration file for runner and listener commands. related to T1410 31 January 2019, 14:52:32 UTC
4246286 Add a build_app() function to instantiate a Celery app with controlled config ie. being able to give the celery config dict as parameter. related to T1410 31 January 2019, 14:45:37 UTC
70581b6 Add a /site-map endpoint that lists published routes for this server 31 January 2019, 10:15:50 UTC
5880c52 Activate the support for options from environment variables for swh-scheduler tool so that one can type (typically in a venv, with services running in dockers): (venv) swh-environment$ export SWH_SCHEDULER_URL=http://127.0.0.1:5008 (venv) swh-environment$ swh-scheduler task-type list 31 January 2019, 09:29:43 UTC
9bc5640 Fix 'swh-scheduler runner' command: rollback() has been removed from the SchedulerBackend and is not needed any more there. 31 January 2019, 09:29:43 UTC
174d89b Fix get_scheduler's cls value when using 'swh-scheduler --url' cli option also ensure args dict does not have default db settings (unsupported by the RemoteScheduler class). 31 January 2019, 09:29:43 UTC
6da09a1 Fix a regression introduced in 61c91b82 when deleting a click.option, one would better delete the function argument as well... 30 January 2019, 15:58:11 UTC
b25c7cd Drop 'except Exception', it catches too many errors. eg. ImportErrors when negotiate is not installed. 30 January 2019, 12:55:58 UTC
84cded2 BaseDb.copy_to's default_columns has been renamed as default_values 30 January 2019, 11:22:51 UTC
4fc7a89 Make the prepare_event helper function pre-aggregate the events with same url and strip these urls as well. It makes no sense to register 2 different URLs when they are equal to a trailing ws detail. As a result, we must preaggregate them because the swh_cache_put() sql function won't allow several 'on conflict' for the same id. 30 January 2019, 10:51:22 UTC
4117d5a Rewrite updater/test_backend.py with pytest and use the postgresql fixture also implement the test with more precise expected behavior, especially the content pre-aggregation of events and the url stripping. As is, this test will fail, the pre-aggregation and url stripping being implemented in the following revision. 30 January 2019, 09:57:11 UTC
ebee014 Kill DbBackend class we can now use directly the implementation of the copy_to() method from swh.core's BaseDb, so we just have to extract the format_query as a simple function (which it should have been since the beginning). Adapt updater/backend.py accordingly. 30 January 2019, 09:57:11 UTC
f338b75 Refactor swh/scheduler/updater as well - use the same config system as the main backend, with the same conventions, - do not make SchedulerUpdaterBackend inherit from DbBackend, use a simple association, similar to the SchedulerBackend class, - refactor the ghtorrent the same (explicit config), - same for the updater writer - move their main functions in cli.py - adapt tests accordingly 30 January 2019, 09:49:59 UTC
48e5372 Fix tests for the scheduler and the API adapt the configuration to latest refactorings (especially the 'scheduling_db' config entry which is now just 'db'). Also refactor the scheduler tests to simplify the test class a bit: constants used in the test do not need to be in the setUp of the test class, making this setUp method useless. Doing so, ensure to use simple dicts as test constants instead of copy-then-modify them. This is a (unit) test file: the dumber, the better. 30 January 2019, 09:49:59 UTC
61c91b8 Refactor config handling in cli.py Move the config file loading in the main cli group so that every command in this group have a consistent config loading behavior. This means that some cli commands "signatures" have changed: - every command now accepts a -C/--config-file option - the --cls has been dropped: either you give a config file, or passing a --database or --url option determine the 'class' of backend to use, - the api-server command 'config-path' argument has been dropped (use --config-file instead), 30 January 2019, 09:49:59 UTC
63af874 Move scheduler's default conf in swh.scheduler 30 January 2019, 09:49:59 UTC
a840ec4 Make configuration of SWHElasticSearchClient use an explicit and consistent behavior instead of looking the config file for this entity in a hard to find dedicated file, expect a config object (dict) as constructor kwargs and look for its config under the 'elasticsearch' section. 30 January 2019, 09:49:59 UTC
488b154 Remove the main function from listener.py it's now in cli.py 30 January 2019, 09:49:59 UTC
cebc11d Refactor the scheduler's backend to use swh.core.db.BaseDb (part 2) so it uses a proper connection pool to access the database. 30 January 2019, 09:46:29 UTC
94258bd Refactor the scheduler's backend to use swh.core.db.BaseDb so it uses a proper connection pool to access the database. 30 January 2019, 09:33:54 UTC
35c285f Add debug statements in the runner 30 January 2019, 09:33:54 UTC
7f51159 Make the 'runner' cli command a bit less verbose do only report the number of scheduled tasks at info level, if any, and lower the 'Run ready tasks' log message at debug level. 30 January 2019, 09:33:54 UTC
63e750a Small fix in the 'task respawn' cli command when respawning several tasks at once. 30 January 2019, 09:33:54 UTC
1203544 Do not crash the listener if a message has already been ack'ed Not sure whether we should be concerned by the fact this sometimes occurs (under heavy load in the docker env). 30 January 2019, 09:33:54 UTC
0252051 prevent pytest from displaying gazillions of warnings especially the infamous psycopg2 one. 30 January 2019, 09:33:54 UTC
7b81304 Increase the max queue size of the origin_metadata indexer. It's going very fast now, enough to empty its queue between two scheduler-runner runs. 29 January 2019, 16:12:25 UTC
53136b7 swh.scheduler.tests: Mark db tests as such This will work around the current debian package build failure Related T1498 28 January 2019, 15:22:53 UTC
e26aec7 Force tox environment to C.UTF-8 locale Works around https://github.com/ClearcodeHQ/pytest-postgresql/issues/16 18 January 2019, 16:45:02 UTC
df1cca3 Add debug logging in the SWHTask class instead of replicating the same logging code everywhere. 17 January 2019, 10:24:47 UTC
caaa44b Revert "tox: pifpaf is not needed any more" This reverts commit f267f454fb91b421994f4ceabb50baa70982b76d. Pushed by mistake. 16 January 2019, 08:56:32 UTC
f267f45 tox: pifpaf is not needed any more 15 January 2019, 17:11:48 UTC
99bf79e enforce dep on pytest<4 see https://github.com/pytest-dev/pytest/issues/4641 It should be possible to let this constraint go when celery 4.3 is out 15 January 2019, 17:11:37 UTC
e1c25d2 Replace task tests (in test_task.py) with pytest-based ones also remove now useless celery_testing.py and scheduler_testing.py files. 15 January 2019, 16:39:32 UTC
e741c93 Add a 'swh_scheduler' pytest fixture that creates a swh scheduler usable for tests this uses the pytest-postgresql package to manage the database life cycle. 15 January 2019, 16:39:10 UTC
9725365 Configure the celery result backend (as 'rpc://') by default the result backend must be configured to be able to save (celery) group results (as it is now the case for lister tasks for example). 15 January 2019, 14:02:06 UTC
6cb1813 Add a few debug statements in cli 15 January 2019, 14:02:06 UTC
c0f9320 Add a new 'task respawn' command that allows to respawn any task immediately (or at any later date). 15 January 2019, 14:02:01 UTC
cafcb46 Add a 'next_run' argument to the SchedulerBackend.set_status_tasks method so this method can be used to respawn a task immediately. 15 January 2019, 13:57:44 UTC
1aadc10 Add a 'task list' cli command to list tasks with search criteria and not only pending tasks. 15 January 2019, 13:57:39 UTC
b3fd48e Add a 'full' flag argument to the pretty_print_task function to display also the status and priority fields. 15 January 2019, 13:49:49 UTC
e0f224e Add a SchedulerBackend.search_tasks method 15 January 2019, 09:20:22 UTC
2dbdb6c Add a new 'task-type add' cli command 15 January 2019, 09:20:22 UTC
c3bb48d Make SchedulerBackend.create_task_type work with only a subset of keys Not all keys are mandatory, so do not expect all the possible keys for the task_type table to be given when calling create_task_type(). Note: no validation is made whether the given set of keys fulfill the table constraints. 15 January 2019, 09:20:22 UTC
4c75330 Add a new task-type cli command group and move the task_type listing in there 14 January 2019, 10:38:19 UTC
d658f93 Add a few debug statements in swh.scheduler.backend 14 January 2019, 10:31:14 UTC
6bf1110 Fix swh.scheduler.compute_nb_tasks_from function to return integers instead of float numbers. 14 January 2019, 10:31:14 UTC
996e905 Add a --priority cmdline option to the 'swh-scheduler task add' tool 14 January 2019, 10:31:09 UTC
9f33c05 Make the SWHTask class the default base class for tasks and add some kind of unit tests. 10 January 2019, 14:43:40 UTC
20be147 Kill swh.scheduler.api.server.launch() in favor of the 'swh-scheduler api-server' command. 10 January 2019, 14:43:40 UTC
5c02d51 Add a logging statement in the runner cli command also log an eventual exception instead of crashing the runner service. 10 January 2019, 14:43:39 UTC
292e5dc Kill the CustomCelery class use functions instead of methods. This is required to be able to use celery pytest fixtures so one can really test celery tasks (especially when a task spawns sub tasks). One (get_queue_length) of the 3 methods has been added as (monkeypatched) method on the Celery class for the sake of bw compat, but this should really be removed as well as soon as possible (seems only used in swh-archiver). 10 January 2019, 14:43:39 UTC
96ad58b Move logging configuration into the cli group function so that logging level config can be consistently set for all swh-scheduler commands. 10 January 2019, 14:43:39 UTC
69f1759 Move the scheduler verification from the main cli group definition to subcommands since it's actually the responsibility of each subcommand to decide whether it can run without a properly configured scheduler instance. This is also required so the user can run: swh-scheduler subcommand --help even with a non-properly configured scheduler. 10 January 2019, 14:43:39 UTC
df075ba Add 3 cli commands: runner, listener and api-server These commands do what they say, ie. start a runner, listener or API server process. Note that processes are not daemonized and run in front. Typically used as: swh-scheduler --cls local --database postgresql:///?service=swh api-server --host 127.0.0.1 --port 5008 swh-scheduler --cls remote --url http://127.0.0.1:5008 runner --period 10 swh-scheduler --cls remote --url http://127.0.0.1:5008 listener 10 January 2019, 14:43:39 UTC
6eb80f5 Use a dict as click context obj in cli This is needed to be able to add more context objects (see following revisions). 10 January 2019, 14:43:39 UTC
584724d Add a new SWHTask class to be used as base class for swh celery tasks It is meant to be used to declare swh tasks via the task decorator instead of subclassing the (now deprecated) Task class. It is typically used like: from swh.scheduler.celery_backend.config import app from swh.scheduler.tasks import SWHTask @app.task(base=SWHTask) def ping(): return 'pong' 10 January 2019, 14:43:30 UTC
b812434 Replace the TaskRouter class by a simple function class-based router is a remainder of celery 3. 08 January 2019, 09:39:49 UTC
8559cec Improve logging of the listener - do use a dedicated logger instead of the root logger, - add a couple of logging statements (in perform_action methods), - replace the --verbose cli option by --log-level 08 January 2019, 09:39:49 UTC
be0e938 Improve logging configuration in celery's setup_log_handler function - allow to pass the loglevel as a string, - kill useless tmp variables, - add a filter to prevent amqp's heartbeat_tick debug messages. 08 January 2019, 09:39:49 UTC
b6bc2f2 Explicitly register class-based tasks inheriting from our own class This lets go of the metaclass madness imported from Celery 3, and goes for an explicit task registration mechanism as advised in Celery 4. We do explicit registration of all our tasks in the 'worker ready' signal, just before they automatically subscribe to the task queues. Unfortunately that signal is not emitted by the "test fixture" worker, so we need to explicitly register class-based tasks that are being used in tests. This doesn't show up for functions as the @task decorator handles registration. 19 December 2018, 14:39:10 UTC
05c641b Refactor a bit the 'swh-scheduler task list-pending' cli tool since the task-type is mandatory, make it a command argument instead of an option, and make it variadic, so one can list several pending task types at once. Also do not use the pager. Let the user decide whether she wants paginated results. 18 December 2018, 17:38:59 UTC
2582293 Add a new 'swh-scheduler task add' command that allows to easily insert a new task in the scheduler's database, eg. swh-scheduler --database 'service=swh-scheduler' \ task add swh-lister-debian distribution=stretch policy=oneshot 18 December 2018, 17:38:59 UTC
4317681 Add a new policy argument to SchedulerBackend.create_tasks so that we can expose it to cli tools. 18 December 2018, 17:38:59 UTC
d2ea7b1 Add a --list-types option to the 'swh-scheduler task' command that lists available task types. 18 December 2018, 17:38:47 UTC
4335cfc Automatically subscribe workers to the per-task queues That only works for actual workers (the celeryd_after_setup signal isn't sent by the test fixture), so keep the workaround in the fixture 18 December 2018, 16:16:25 UTC
bbe9e55 Fix typo in celery_testing 18 December 2018, 16:16:25 UTC
4cacd65 Route tasks using their task name rather than a task_queue attribute 18 December 2018, 16:16:25 UTC
f984974 celery_backend: more robust queue length management - Ignore the absence of the rabbitmq management interface - Handle inexistent queues as if they were empty 18 December 2018, 16:16:25 UTC
af53734 Don't run SQL query with an empty tuple Noticed the issue when repeatedly running tests without really letting the database tear down properly... 18 December 2018, 16:16:25 UTC
15262bb Switch celery settings to lowercase names 18 December 2018, 16:16:25 UTC
7daad6f Record requirement on celery 4 The tests scaffolding depends on fixtures introduced in Celery 4 18 December 2018, 16:15:56 UTC
9966e55 Remove pickle from Celery's accepted content type 18 December 2018, 09:36:47 UTC
041f103 Fix Task's docstring 18 December 2018, 09:33:56 UTC
97396ed Update requirements to latest swh.core Related T1444 14 December 2018, 14:50:12 UTC
46b2ba6 Fallback for get_queue_stats() when using memory:// broker. Summary: When running tests, RabbitMQ is not used, so get_queue_stats() does not work. We did not notice this issue before because pytest skipped importing modules that define tasks with a queue, so this function was never actually called while running tests. Reviewers: #reviewers, ardumont Reviewed By: #reviewers, ardumont Subscribers: ardumont, swh-public-ci Differential Revision: https://forge.softwareheritage.org/D763 03 December 2018, 14:53:48 UTC
0f77286 Fallback for get_queue_stats() when using memory:// broker. When running tests, RabbitMQ is not used, so get_queue_stats() does not work. We did not notice this issue before because pytest skipped importing modules that define tasks with a queue, so this function was never actually called while running tests. 03 December 2018, 13:44:54 UTC
042edee README: fix a typo 28 November 2018, 10:54:37 UTC
cea1aed doc: update index to match new swh-doc format related to T1330 23 November 2018, 12:52:54 UTC
5e67d28 data/sql: Insert new task type for origin metadata indexer Related T1376 21 November 2018, 15:35:21 UTC
b7b596f data/sql: Insert new task type for revision metadata indexer Related T1375 21 November 2018, 15:33:23 UTC
5c636ef data/sql: Insert new task type for origin indexer Related T1326 21 November 2018, 12:00:13 UTC
26efaad sql/data: Create new range indexer fossology license task type 20 November 2018, 11:23:35 UTC
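
Revision 4fc7a89 above describes pre-aggregating events that share the same URL once surrounding whitespace is stripped, so that a single insert never carries two rows for the same id. The following is only a minimal sketch of that idea. The helper name and the 'url' and 'cnt' keys are assumptions made for illustration; the actual prepare_event signature and event schema are not shown in this log.

```python
from collections import OrderedDict


def preaggregate_events(events):
    """Merge events whose URLs are equal once surrounding whitespace is stripped.

    Hypothetical helper: the 'url' and 'cnt' keys are assumptions made for
    illustration, not the schema used by swh.scheduler.updater.
    """
    merged = OrderedDict()
    for event in events:
        url = event["url"].strip()  # "repo " and "repo" become the same key
        if url in merged:
            # Same URL seen before: accumulate instead of emitting a second row
            merged[url]["cnt"] += event.get("cnt", 1)
        else:
            merged[url] = {**event, "url": url, "cnt": event.get("cnt", 1)}
    return list(merged.values())


events = [
    {"url": "https://example.org/repo ", "cnt": 1},
    {"url": "https://example.org/repo", "cnt": 2},
    {"url": "https://example.org/other", "cnt": 1},
]
aggregated = preaggregate_events(events)
```

With this input the two example.org/repo entries collapse into one, which is the property an 'on conflict' clause needs when the same id must not appear twice in one statement.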
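
Revisions b812434 and 4cacd65 above replace the class-based TaskRouter with a plain function and route tasks by their name. A sketch of what a Celery 4 function router can look like follows. The function name, the example task name, and the choice of reusing the task name verbatim as the queue name are assumptions, not taken from the repository.

```python
def route_for_task(name, args, kwargs, options, task=None, **kw):
    """Celery 4 style function router: send each task to a queue named after it.

    Assumption for illustration: the queue name is exactly the task name.
    """
    return {"queue": name}


# In a Celery 4 application such a function would typically be registered
# through the lowercase `task_routes` setting, for example:
#   app.conf.task_routes = (route_for_task,)
# "example.tasks.ping" is a made-up task name used only for this demo.
route = route_for_task("example.tasks.ping", (), {}, {})
```

Because the router is an ordinary function, it can be exercised directly without a running broker or worker.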