https://github.com/aiidateam/aiida_core

20fa6c2 Merge pull request #3821 from aiidateam/release/1.1.1 Release `v1.1.1` 03 March 2020, 09:36:18 UTC
4307c74 Release `v1.1.1` 03 March 2020, 06:53:43 UTC
557a6a8 Emit a warning when input port specifies a node instance as default (#3466) Using mutable objects as defaults for `InputPorts` can lead to unexpected results and should be discouraged. Implementing this for all types, including python builtins such as lists and dictionaries, is difficult and not the most pressing problem. The biggest problem is users specifying node instances as defaults. This will cause the backend to be loaded as soon as the process class is imported, which can cause problems during the building of documentation and with unit testing, where in-memory instances may reference deleted nodes, among other things. To warn users of these complications, we override the `InputPort` constructor to check the type of the default and emit a warning if it is a node instance. The `CalcJob` implementation had to be adapted to conform with the new requirements, as the `mpirun_extra_params` and `environment_variables` metadata options were using mutable types for their default. As in certain test process classes, the defaults are now wrapped in a `lambda`. 02 March 2020, 17:11:18 UTC
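For illustration, a minimal sketch of the pattern this commit describes, with a hypothetical `CalcJob` subclass: the node instance is wrapped in a `lambda` so it is only created when the default is actually requested, not at import time.

```python
from aiida import orm
from aiida.engine import CalcJob


class SomeCalculation(CalcJob):  # hypothetical plugin class
    """Example of deferring a node-instance default."""

    @classmethod
    def define(cls, spec):
        super().define(spec)
        # Bad: creates a node at import time, forcing the backend to load
        # spec.input('settings', valid_type=orm.Dict, default=orm.Dict(dict={}))
        # Good: defer instantiation until the default is actually requested
        spec.input('settings', valid_type=orm.Dict, default=lambda: orm.Dict(dict={}))
```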
d5a706f remove deprecation warning from `get_description` (#3819) The `description` property is understood to return the value stored in the description column, while `get_description` may be a compound of various columns/attributes. Reverting incorrect deprecation of `get_description` for the `Code` class. 02 March 2020, 16:51:57 UTC
9389a52 `verdi status`: add the configuration directory path to the output (#3587) 02 March 2020, 15:49:56 UTC
dcd40af Caching: fix configuration spec and validation (#3785) Remove the `node_class` argument in caching config functions and change the `identifier` to always refer to the `process_type` of a node. The identifiers in the caching configuration now have to be either valid entry point names or the process type of a node, which is the join of a python module path and the resource, such as a function or class. The identifiers are now also allowed to contain the wildcard character `*`. This allows globbing to match groups of entry point names. This is technically a breaking change, because `get_use_cache`, `enable_caching` and `disable_caching` no longer accept the `node_class` argument. However, the previous version silently did nothing when this argument was passed, so this just makes the issue more apparent. 02 March 2020, 14:41:14 UTC
358b916 Disable caching for the `Data` node subclass (#3807) Caching was enabled for the `Data` node subclass, even though with the definition of "caching" in AiiDA that doesn't really make sense. It is a mechanism to prevent having to re-execute a calculation that has already been executed before, which clearly does not apply to `Data` nodes. Since it was simply a no-op, it was left in. However, it turns out it can actually create problems when a cloned data node is stored after some mutable attributes were changed. Since those are not included in the hash, the hash of the clone is the same as that of its source, and so caching is activated, replacing the clone with the source and undoing the changes to the mutable attribute. All of this is solved by simply disabling caching for `Data` nodes. 29 February 2020, 18:31:15 UTC
f7101ed `BaseRestartWorkChain`: add method to enable/disable process handlers (#3786) The `process_handler` decorator is updated with a new keyword argument `enabled`, which is `True` by default. By setting it to `False` the process handler is disabled and will always be skipped during the `inspect_process` outline step. This default can be overridden on a per instance basis through a new input called `handler_overrides`. The base spec of `BaseRestartWorkChain` defines this new base input `handler_overrides`, which takes a mapping of process handler names to a boolean: for `True` the process handler is enabled and for `False` it is disabled, meaning that during the `inspect_process` call it is skipped. The validator on the port ensures that the keys correspond to actual instance methods of the work chain that are decorated with `process_handler`. The values specified in `handler_overrides`, as the name suggests, override the defaults specified in the decorator. 29 February 2020, 17:40:59 UTC
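A rough sketch of how the new keyword and input might be used; the import paths are assumed from the v1.1 `aiida.engine` namespace, and the work chain and handler names are hypothetical:

```python
from aiida import orm
from aiida.engine import BaseRestartWorkChain, ProcessHandlerReport, process_handler


class SomeRestartWorkChain(BaseRestartWorkChain):  # hypothetical subclass

    @process_handler(priority=400, enabled=False)  # disabled by default
    def handle_optional_problem(self, node):
        """Skipped during `inspect_process` unless enabled via `handler_overrides`."""
        return ProcessHandlerReport()


# Enable the disabled handler for a single instance only:
# submit(SomeRestartWorkChain, handler_overrides=orm.Dict(dict={'handle_optional_problem': True}), ...)
```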
d44dcbd Reuse `prepend_text` and `append_text` in `verdi computer/code duplicate` (#3788) This required adding a new `ContextualDefaultOption` class to get a default callable that also receives the context. Moreover, a bug was fixed in the function that prompts for the `prepend/append_text`: if the file was not touched/modified, an empty text was used instead of reusing the text provided in input. Co-authored-by: ConradJohnston <40352432+ConradJohnston@users.noreply.github.com> 28 February 2020, 11:27:52 UTC
67d4504 validate label string at code setup stage (#3793) Prevent users from using '@' in code labels when calling `verdi code setup`, instead of validating only in `verdi code relabel`. 25 February 2020, 12:29:09 UTC
9fcb78a Remove unused and obsolete language extensions and utilities (#3801) The `aiida.common.lang` module hosts a variety of utilities that extend the base python language. Many of these were backports for python 2 which, now that we support python 3, have become obsolete. Some were also simply unused: * `abstractclassmethod`: now available in `abc.abstractclassmethod` * `abstractstaticmethod`: now available in `abc.abstractstaticmethod` * `combomethod`: not used at all * `EmptyContextManager`: available since python 3.4 as `contextlib.suppress()` * `protected_decorator`: only used in two places, but with `check=False`, so it would never actually raise when the method was called from outside the class. What is worse, the `report` method is only useful when callable from the outside. All in all, it is best to remove this whole concept. 24 February 2020, 21:39:26 UTC
061f6ae `QueryBuilder`: add support for `datetime.date` objects in filters (#3796) Each query is hashed for efficiency reasons, such that repeated queries can be taken from a cache. This requires the builder instance and all its attributes to be hashable. The filters can reference `datetime.date` objects to express date and time conditions, but these were not supported by the hashing implementation, unlike `datetime.datetime` objects. Here we update the `aiida.common.hashing.make_hash` single dispatch to also support `datetime.date` objects. Note that, unlike for `datetime.datetime` objects, the timezone does not have to be considered, as dates are timezone-unaware data structures. 24 February 2020, 13:50:34 UTC
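For example, a filter on the creation time can now use a bare date directly (a sketch, assuming the standard `QueryBuilder` API):

```python
import datetime

from aiida.orm import Node, QueryBuilder

# `datetime.date` objects are now hashable by `make_hash` and usable in filters
builder = QueryBuilder().append(Node, filters={'ctime': {'>': datetime.date(2020, 1, 1)}})
```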
2f9224a Write migrated config to disk in `Config.from_file` (#3797) When an existing configuration file with an outdated schema was loaded from disk through `Config.from_file`, as happens in the initialization call of `aiida.manage.configuration.load_config`, the content was properly migrated in memory but the changes were not written to file. This caused the migration to be performed each time. 24 February 2020, 11:59:53 UTC
12ed1f0 Remove conda activation from configure-aiida.sh script. (#3791) Now handled globally by the base container. 23 February 2020, 20:26:35 UTC
a2bebb4 pyproject: add build-backend (#3784) If there is a `build-system` section, it should also define the build backend, otherwise a PEP 517-based build (`python -m pep517.build --source --binary --out-dir dist/ .`) will fail. 21 February 2020, 22:19:39 UTC
928c2fc `pytest_fixture.get_code`: raise descriptive exception if exe could not be found (#3774) 21 February 2020, 14:55:02 UTC
5e26bac Fix bugs in `Node._store_from_cache` and `Node.repository.erase` (#3777) The `_store_from_cache` needed to erase the content of its sandbox folder before copying over the content of the cache source. Previously, the existing contents were mixed with the contents to be copied in. In fixing this, we discovered an issue in `Node.repository.erase`, where an unstored node would try erasing the (non-existent) folder in the permanent repository, instead of the `SandboxFolder`. 21 February 2020, 14:40:26 UTC
b28d3cf Add fixtures to clear the database before or after tests (#3783) The current 'clear_database' fixture clears the DB after the test has run. Since clearing it before the tests is also a common use case, this adds a 'clear_database_before_test' fixture. For naming consistency, we rename 'clear_database' to 'clear_database_after_test', and turn 'clear_database' into an alias for that function for backwards compatibility. 21 February 2020, 14:22:45 UTC
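Usage in a plugin's test suite might look as follows (a sketch; the fixtures are provided by the `aiida.manage.tests.pytest_fixtures` plugin module):

```python
# conftest.py
pytest_plugins = ['aiida.manage.tests.pytest_fixtures']


# test_example.py
def test_with_clean_db(clear_database_before_test):
    """The database is emptied before this test runs."""


def test_cleanup_afterwards(clear_database_after_test):
    """The database is emptied after this test has run."""
```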
617b2df `BaseRestartWorkChain`: require process handlers to be instance methods (#3782) The original implementation of `register_process_handler` allowed an unbound method to be bound to a subclass of `BaseRestartWorkChain` outside of its scope. Since this also makes it possible to attach process handlers from outside of the module of the work chain, it runs the risk of loss of provenance. Here we rename the decorator to `process_handler` and it can only be applied to instance methods. This forces the handlers to be defined in the same module as the work chain to which they apply. 21 February 2020, 09:26:37 UTC
11cefed Fix broken imports of `urllib` (#3767) `import urllib` does not automatically make the `urllib.request` and `urllib.error` modules available. 21 February 2020, 08:12:24 UTC
94f91a9 Fix CI postgres issue (#3781) * Set `wrapt` dependency to `~=1.11.1` * Use updated postgresql-action * Set postgres auth-method to trust 20 February 2020, 18:20:06 UTC
12f9641 Conda: define explicit python requirement in `environment.yml` (#3758) It turns out that when creating a conda environment from an `environment.yml` file, it is not possible to specify or override the python version of the environment from the command line. I.e. it turns out that: conda env create -f environment.yml -n test-environment python=3.7 actually sets up a python 3.6 environment. This was because the dependency `plumpy` was not yet marked as being compatible with python 3.7. This has since been fixed and we now explicitly set the python version in the `environment.yml` file. 14 February 2020, 11:03:41 UTC
0a45f40 Match headers with actual output (#3756) For: `verdi data structure list` 13 February 2020, 17:19:30 UTC
a92c1f1 Release `v1.1.0` Forgot to bump the version in previous "release" 12 February 2020, 13:13:14 UTC
86fc153 Merge pull request #3753 from aiidateam/release/1.1.0 Release `v1.1.0` 12 February 2020, 11:33:52 UTC
ea513fe Release `v1.1.0` 12 February 2020, 10:18:40 UTC
b7bcf96 Django backend: limit batch size for `bulk_create` operations (#3713) PostgreSQL has a `MaxAllocSize` that defaults to 1 GB [1]. If you try to insert more than that in one go (e.g. during import of a large AiiDA export file), you encounter the error: psycopg2.errors.ProgramLimitExceeded: out of memory DETAIL: Cannot enlarge string buffer containing 0 bytes by 1257443654 more bytes. This commit avoids this issue by setting a batch size for `bulk_create` operations. The size of the batch is configurable through the new `db.batch_size` configuration option using `verdi config`. [1] https://github.com/postgres/postgres/blob/master/src/include/utils/memutils.h#L40 12 February 2020, 09:10:43 UTC
d08cffa Remove deprecated resources for release `v1.1.0` (#3752) - Removed module `aiida.backends.profile` 11 February 2020, 16:10:16 UTC
67ae49f Add non-zero exit code for `verdi daemon status` (#3729) `verdi daemon status` was always returning exit code 0, which makes it difficult to use the command programmatically (e.g. in ansible). It now returns exit code 3 if the daemon of any of the requested profiles is not running. 11 February 2020, 13:47:04 UTC
1ba87c0 Add `exit_codes` argument to `register_process_handler` The `exit_codes` argument takes a single `ExitCode` instance or a list of them. If defined, the handler will return `None` if the exit code set on the `node` does not appear in `exit_codes`. This is useful to have a handler called only when the process failed with a specific (set of) exit code(s). 11 February 2020, 09:47:58 UTC
021bf12 Force `priority` to be keyword only in `register_process_handler` To clarify the `register_process_handler` decorator, the `priority` argument is made a keyword-only argument such that callers are forced to specify it. Having just an integer in the method arguments can be confusing as to its meaning. This commit also adds type checking to the `priority` keyword. 11 February 2020, 09:47:58 UTC
9c8c8cc Add the `BaseRestartWorkChain` The `BaseRestartWorkChain` is designed to help the writing of a base workchain that wraps the launching of a sub process, for example a calculation job, and provides a framework for easily adding automated error handling and performing sanity checks. The class and its utilities were originally implemented in the `aiida-quantumespresso` plugin but were designed from the get-go to be generically applicable. It was quickly adopted and used in other plugins. Therefore we are now moving it to `aiida-core` so that it can be useful to many plugins without separate copies of the code having to be maintained. Originally it was only designed to wrap around calculation jobs, but the concept is generic enough that here it is generalized to any sub process, such as `WorkChains`. The basic concept is simple: instead of subclassing from `WorkChain` one subclasses the `BaseRestartWorkChain` and uses at a minimum the outline `cls.setup, while_(cls.should_run_process)(cls.run_process, cls.inspect_process), cls.results` (see the sketch after this entry). The logic of those outline methods is implemented on the base class, so the only thing that remains is to specify the process sub class that needs to be used. The sub process will be launched until it is successful, i.e. exit status 0, or the maximum number of iterations is exceeded. The `inspect_process` will loop over the list of registered process handlers. The handlers can check for errors in case the sub process failed, or perform sanity checks even if the sub process was successful. The handlers can return a `ProcessHandlerReport` to control the further flow, for example by breaking out of the process handler call loop or even completely aborting the workchain if an unrecoverable problem was encountered. 11 February 2020, 09:47:58 UTC
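A minimal sketch of such a subclass; the wrapped `SomeCalcJob` is hypothetical, while `while_` and the outline step names come straight from the commit:

```python
from aiida.engine import BaseRestartWorkChain, while_


class SomeRestartWorkChain(BaseRestartWorkChain):
    """Restart wrapper around a hypothetical `SomeCalcJob`."""

    _process_class = SomeCalcJob  # the sub process to launch and monitor

    @classmethod
    def define(cls, spec):
        super().define(spec)
        spec.outline(
            cls.setup,
            while_(cls.should_run_process)(
                cls.run_process,
                cls.inspect_process,
            ),
            cls.results,
        )
```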
b7162d7 `ArithmeticAddCalculation`: correct exit codes The calculation job class used exit codes in the range `100 - 199` which are reserved for scheduler errors. They have been changed to be in the three hundred range. 11 February 2020, 09:47:58 UTC
65ec6d6 Various changes to the docker infrastructure: (#3746) * Wrap bash variable names in curly brackets. * Dockerfile: install aiida with `atomic_tools` (needed to operate on CIF files, for example). * Stop daemon before performing database migration. This might be useful if the container wasn't shut down properly and, at startup, the daemon is considered to be in running state. * Show the daemon status at the end of the Docker test. 10 February 2020, 17:37:48 UTC
53bbc74 Add `provenance_exclude_list` attribute to `CalcInfo` data structure (#3720) This new attribute takes a flat list of relative filepaths, which correspond to files in the `folder` sandbox passed to the `prepare_for_submission` call of the `CalcJob`, that should not be copied to the repository of the `CalcJobNode`. This functionality is useful to avoid the content of input files, that should be copied to the working directory of the calculation, also being stored permanently in the file repository. Example use cases are very large input files or files whose content is proprietary. Both use cases could already be implemented using the `local_copy_list`, but only for an input node in its entirety. The syntax of the `local_copy_list` does not support the exclusion of arbitrary files that are written by the calculation plugin to the sandbox folder. Before the addition of this new feature, the contents of the sandbox folder were added to the repository of the calculation node simply by moving the contents of the sandbox entirely to the repository. This was changed to an explicit loop over the contents, copying only those files that do not appear in the `provenance_exclude_list`. The advantage of recursively looping over the contents of the sandbox folder and *copying* them to the repository, over deleting the excluded files from the sandbox before *moving* the remaining content to the repository, is that the former gives a better guarantee that the excluded files do not accidentally end up in the repository due to an unnoticed problem in the deletion from the sandbox. The moving method is of course a lot more efficient than copying files one by one. However, the moving approach is only possible because the repository is currently still implemented on the same filesystem as the sandbox. Once the new repository interface is fully implemented, where non-filesystem repositories are also possible, moving the sandbox folder to the repository will no longer be possible anyway, so it is acceptable to already make this change now, since it will have to be done at some point anyway. 10 February 2020, 17:10:31 UTC
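A sketch of how a plugin might use the new attribute inside a hypothetical `CalcJob` implementation (the file name is illustrative; `CalcInfo` is the existing data structure from `aiida.common.datastructures`):

```python
from aiida.common.datastructures import CalcInfo
from aiida.engine import CalcJob


class SomeCalculation(CalcJob):  # hypothetical plugin
    def prepare_for_submission(self, folder):
        # written to the sandbox and uploaded to the remote working directory ...
        with folder.open('big_input.dat', 'w') as handle:
            handle.write('...')

        calcinfo = CalcInfo()
        # ... but excluded from the permanent repository of the `CalcJobNode`
        calcinfo.provenance_exclude_list = ['big_input.dat']
        return calcinfo
```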
48573df Docker: install aiida in conda environment, fix .bashrc (#3745) Fixes #3739. In the latest update of the aiida-prerequisites image, python is provided within a conda environment. Accordingly, this PR adapts the aiida-core image to use python from conda. It also fixes the enabling of autocompletion in the user's `.bashrc` file. 07 February 2020, 10:53:36 UTC
5b892ca add numfocus affiliation to README (#3744) * add numfocus affiliation to README and docs * add twitter badge 06 February 2020, 20:00:35 UTC
3eec05d improve docstring of reciprocal_cell (#3741) Clarify that reciprocal cell vectors are stored as rows of the reciprocal cell. 03 February 2020, 13:47:14 UTC
2308176 Improve `restapi.common.identifiers.get_node_namespace` efficiency (#3737) This function serves to build a full hierarchical namespace of all existing node types in the database. It is used by the REST API to populate the side bar accordion with all the available node types, which facilitates easy filtering. However, for big databases this function is very slow. The main cost is the query for node/process type tuples: QueryBuilder().append(Node, project=['node_type', 'process_type']) The function really only needs the unique tuples, which is currently done in python. This can be done straight on the SQL level by using: builder.distinct() This prevents having to load the tuples for all nodes, which saves a lot of computing time for big databases. By adding the `distinct` clause, the query for a database of approximately 3 million nodes, was reduced by a factor of 10, from 50 to roughly 5 seconds. 31 January 2020, 15:48:26 UTC
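The change boils down to the following, taken from the commit message itself (`distinct` is an existing `QueryBuilder` method):

```python
from aiida.orm import Node, QueryBuilder

builder = QueryBuilder().append(Node, project=['node_type', 'process_type'])
builder.distinct()  # deduplicate at the SQL level instead of in python
unique_type_tuples = builder.all()
```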
ffcef0b Remove `recreate_after_fork` trigger for SqlAlchemy scoped session Presumably, the `multiprocessing.recreate_after_fork` hook on the session creation of the SqlAlchemy backend was added because of the design of the old daemon of the v0.* series. There the daemon was implemented using `celery`, which would spawn workers by forking the main process. Since forking a process means cloning it with the exact same state, the forked process would have the same session, which is undesirable. This is why the session had to be reset such that an independent one could be initialized. With the new engine design, there is no forking anymore by the daemon and it is also not supported. Instead, each daemon worker is launched as a completely new process and so will not start with an existing session. Finally, `multiprocessing.recreate_after_fork` is not even documented and according to `https://bugs.python.org/issue21372` this is for a reason, because it is not intended for external use. Combined with the fact that its cross-platform consistent operation is pulled into question, it is better to not use it. 31 January 2020, 11:21:16 UTC
09cb6cb Implement `Backend.get_session` to retrieve scoped session The scoped session is an instance of SqlAlchemy's `Session` class that is used by the query builder to connect to the database, for both the SqlAlchemy database backend as well as for Django. Both database backends need to maintain their own scoped session factory which can be called to get a session instance. Certain applications need access to the session. For example, applications that run AiiDA in a threaded way, such as a REST API server, need to manually close the session after the query has finished, because this is not done automatically when the thread ends. The associated database connection remains open, causing an eventual timeout when a new request comes in. The method `Backend.get_session` provides an official API to access the global scoped session instance, which can then be closed. Additionally, a lot of code that was duplicated across the two implementations of the `QueryBuilder` for the two database backends has been moved to the abstract `BackendQueryBuilder`. Normally this code does indeed belong in the implementations, but since the current implementation for both backends is based on SqlAlchemy, they are both nearly identical. When in the future a new backend is implemented that does not use SqlAlchemy, the current code can be factored out to a specific `SqlAlchemyQueryBuilder` that can be used for both database backends. 31 January 2020, 11:21:16 UTC
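A sketch of the intended use for threaded applications, assuming the `get_manager` accessor from `aiida.manage.manager`:

```python
from aiida.manage.manager import get_manager

backend = get_manager().get_backend()
session = backend.get_session()  # the global scoped session used by the query builder
try:
    ...  # perform queries in the request-handling thread
finally:
    session.close()  # release the database connection explicitly
```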
8cd0b60 Do not hijack the SqlAlchemy session factory in Django `QueryBuilder` Since the `QueryBuilder` implementation uses SqlAlchemy to map the query onto the models in order to generate the SQL to be sent to the database, it requires a session, which is an :class:`sqlalchemy.orm.session.Session` instance. The only purpose is for SqlAlchemy to be able to connect to the database, perform the query and retrieve the results. Even the Django backend implementation will use SqlAlchemy for its `QueryBuilder` and so also needs an SqlAlchemy session. It is important that we do not reuse the scoped session factory of the SqlAlchemy implementation, because that runs the risk of cross-talk once profiles can be switched dynamically in a single python interpreter. Therefore the Django implementation of the `QueryBuilder` should keep its own SqlAlchemy engine and scoped session factory instances that are used to provide the query builder with a session. 31 January 2020, 11:21:16 UTC
8759669 Docs: ensure RTD style is set when building locally (#3734) The statement setting the stylesheet to that of ReadTheDocs when building locally was recently moved to the top of the `docs/source/conf.py` file. However, the statement defining the default theme was now executed afterwards, overriding the custom configuration. Simply commenting it out restores the correct behavior. 29 January 2020, 20:10:48 UTC
5b47bfc Fix local computer setup in the docker container (#3732) Due to erroneous argument to `verdi computer setup` the computer `localhost` was not configured in the `aiidateam/aiida-core` Docker container. This PR fixes the problem and adds a test to the Docker GitHub action. 28 January 2020, 15:22:20 UTC
3848649 Add docker image with minimal running AiiDA instance (#3722) The docker file builds on the `aiida-prerequisites` base image to get a basic install of an AiiDA instance. This image then configures a full profile with a configured localhost computer and the daemon is started. The parameters of the profile can be controlled through environment variables. 25 January 2020, 09:23:12 UTC
5790d0d Acknowledge swissuniversities in the README and in the docs (#3723) Also add the swissuniversities logo 23 January 2020, 14:54:40 UTC
0b21cec Move `CalcJob` spec validator to corresponding namespaces (#3702) The `CalcJob` process had a single validator defined on the top level input namespace, which validated many ports scattered across the namespace, such as `metadata.options.parser_name` as well as `metadata.options.resources`. The validator assumed that all of these ports would always be present, however, this is not necessarily true. The expose functionality allows a wrapping process to expose only part of the namespace, but the validator remains the same. To ameliorate this, the signature of validators is updated to also receive a context in the form of a second argument `port`, in addition to the value passed to the port. This `port` will be the instance of the port to which the validator is assigned. This allows the validator implementation to first check whether a specific port is present before trying to validate the corresponding value. Note that the `port` will represent the port to which the validator is attached and so it will have no knowledge of any namespace it might be embedded in, as it shouldn't, because that would break the portability of the namespace. The `port` argument being passed to the validator call was introduced in `plumpy==0.14.5`, so we upgrade the minimum requirement here. Since that version also requires `pyyaml~=5.1.2`, we also update that explicit version in `aiida-core`. Going to `pyyaml==5.2` would break until we fix the serialization and deserialization of process instances, which currently use the `FullLoader` that is no longer allowed to serialize arbitrary python objects as we do. 23 January 2020, 10:24:08 UTC
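The new two-argument validator signature might be used like this (a hypothetical validator name and port, sketched under the assumption that ports accept a `validator` keyword as in plumpy):

```python
def validate_parser_name(value, port):  # `port` is the new second argument
    """Return an error message string when the value is invalid, else `None`."""
    if value is not None and not isinstance(value, str):
        return 'the parser name must be a string'


# Attached when declaring the port, e.g.:
# spec.input('metadata.options.parser_name', required=False, validator=validate_parser_name)
```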
381d941 Add support for python 3.8 Run CI tests on python 3.8 instead of 3.7. 20 January 2020, 18:35:24 UTC
9d896f5 Fix `aiida.common.folders.Folder` unit test for utf-8 encoding The test was calling `insert_path` from a filepath into itself. When moving to python 3.8 this was triggering a weird recursive condition in the underlying `shutil.copytree` call that kept concatenating the `šaltinis/destination` sub string to the target filepath. This would eventually throw an error that the filepath was too long. Why this only started occurring in python 3.8 is unclear, but most likely the recursive copy in the test was a typo and not intended. 20 January 2020, 18:35:24 UTC
07629a4 Remove `__contains__` tests on `enum.Enum` classes This behaviour is no longer supported in python 3.8: https://github.com/python/cpython/commit/3715176557cf For an enum `SomeEnum`, one can no longer do `if 'SOME_VALUE' not in SomeEnum: raise ValueError()`, but instead one should use `isinstance` directly. After all, an enum is simply a normal python class. 20 January 2020, 18:35:24 UTC
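Concretely, under python 3.8 the membership test raises, so the check becomes (a sketch):

```python
import enum


class SomeEnum(enum.Enum):
    SOME_VALUE = 'some_value'


value = SomeEnum.SOME_VALUE

# Before (raises `TypeError` on python 3.8 for non-member operands):
# if 'SOME_VALUE' not in SomeEnum:
#     raise ValueError()

# After: test the instance directly
if not isinstance(value, SomeEnum):
    raise ValueError()
```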
72d38d5 Remove manipulations of mappings while they are being iterated This behavior will raise starting from python 3.8. 20 January 2020, 18:35:24 UTC
6ce6cd4 Update imports in moved test files The unit tests that were moved in the previous commit often imported utilities from a module that used to be located in the main package `aiida.backends.tests.utils`. This module has also been moved to the `tests` top level module. These utilities should really in most cases become pytest fixtures but that is left for a later time. 16 January 2020, 15:38:10 UTC
e4c7fb5 Move all tests from `aiida.backends.tests` to top level module This will prevent them from being included in the distribution that is uploaded to PyPI, making the download a lot quicker. Note that this commit only moves the files without touching them. This means that the tests won't actually run since, for example, various imports are broken. For clarity these changes will be done in a following commit to not drown out the important changes between files that were just moved around. Note that the `django` and `sqlalchemy` modules in `tests/backend`, which contain backend specific tests, are prefixed with `aiida_`, otherwise imports using either `from django` or `from sqlalchemy` would incorrectly target those modules instead of the actual libraries. 16 January 2020, 15:38:10 UTC
40c0cfa Add `py:class set` to docs nitpick-exceptions (#3711) There was an error when trying to add the intrinsic 'set' to the list of possible `:types:` accepted by a parameter in the docstring. This is apparently a problem of Sphinx compatibility with python's base documentation, so the only current solution is adding an exception for it. 13 January 2020, 16:42:44 UTC
47cfe34 Add traverse_graph / AGE engine for visualization The graph visualization feature now uses the traverse_graph function (with AGE as the main engine) to collect the requested nodes to be visualized. This was implemented in the methods of the graph class: previously, `recurse_descendants` and `recurse_ancestors` used to work by calling `add_incoming` and `add_outgoing` many times, which in turn have to load nodes during the procedure. Now these are all independent and they all call the traverse_graph function, so the information is obtained directly from the query projections and no nodes are loaded. So these changes are not only important as a first step to homogenize graph traversal throughout the whole code: an improvement in the visualization procedure is expected as well. 10 January 2020, 10:11:03 UTC
aa6ca5b Add traverse_graph / AGE as engine for export The export function now uses the get_nodes_delete function (with the traverse_graph underlying interface using AGE as the main engine) to collect the extra nodes that are needed to keep a consistent provenance. This is performed, more specifically, by the 'retrieve_linked_nodes' function. Whereas previously a different query was performed for each new node added in the previous query step, this new implementation should do a single new query for all the nodes that were added in the previous query step. So these changes are not only important as a first step to homogenize graph traversal throughout the whole code: an improvement in the export procedure is expected as well. 10 January 2020, 10:11:03 UTC
d2d7126 Add traverse_graph / AGE as engine for node delete The node deletion function now uses the get_nodes_delete function (with the traverse_graph underlying interface using AGE as main engine) to collect the extra nodes that are needed to keep a consistent provenance. The procedure is not very different from the one that was initially implemented, so no significant performance improvement is expected, but this is an important first step to homogenize graph traversal throughout the whole code. 10 January 2020, 10:11:03 UTC
f5aeaf4 Add feature traverse_graph and others The function traverse_graph works as a simplified interface to interact with the AGE that also removes the need to manually handle the basket and the querybuilder instance: * The price to pay for hiding the basket is that this function can only be used with sets of nodes and links (so, no groups). * The price to pay for hiding the querybuilder is that complex traversal procedures can no longer be specified: the user simply defines which links can be traversed forwards and which backwards, and this criterion is then applied in every iteration (so one could not, in a single call, search only for all called calc nodes of the called work nodes of an initial workflow node, as one will also obtain the calc nodes directly called by that initial workflow). Besides the starting nodes (pks) and links, the user can also provide the maximum number of iterations desired (which by default is None, meaning 'until no new nodes are found') and a boolean that indicates whether the links (edges) should be returned (see the sketch after this entry). Additionally, two other interfaces are included for ease of use when deleting and exporting. These functions only take the starting set of pks and the rules provided by the user (as 'rule_name_dir' = False/True) and can automatically check if the rule is toggleable, set defaults (using aiida.common.links.GraphTraversalRules), and also parse the ruleset into two lists with the links for forward and backward traversal. They will return a dictionary containing the 'nodes' list, the 'links' list (if this was requested, else this will contain `None`) and a dict with the way in which all the rules were applied (using the following format: 'rule_name' = True/False). Co-Authored-By: Leonid Kahle <leonid.kahle@epfl.ch> 10 January 2020, 10:11:03 UTC
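A sketch of a call; the module path `aiida.tools.graph.graph_traversers` and the exact signature are assumptions based on where this feature landed, and the starting pk is hypothetical:

```python
from aiida.common.links import LinkType
from aiida.tools.graph.graph_traversers import traverse_graph  # path is an assumption

result = traverse_graph(
    starting_pks=[1234],            # hypothetical starting node pk
    max_iterations=None,            # iterate until no new nodes are found
    get_links=False,
    links_forward=[LinkType.CALL_CALC, LinkType.CALL_WORK],
    links_backward=[],
)
traversed_pks = result['nodes']     # result['links'] is None since get_links=False
```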
c23d854 Add feature AGE The AiiDA Graph Explorer (AGE) is a general purpose tool to perform graph traversal of AiiDA graphs. It considers AiiDA nodes and groups (eventually even computers and users) as if they were both 'graph nodes' of an 'expanded graph', and generalizes the exploration of said graph. The 'rules' that indicate how to traverse this graph are configured by using generic querybuilder instances (i.e. with information about the connections but without specific initial nodes/groups and without any projections). The initial set of nodes/groups is provided directly to the rule, which then will perform successive applications of the query, each on top of the results of the previous one. This cycle is repeated for a specified number of times, which can be specified to be 'until no new nodes are found'. The current implementation works with the following (public) classes: * Basket: generic container class that can store sets of nodes, groups, node-node edges (aiida links) and group-node edges. These are the objects that the rule-objects receive and return. * UpdateRule: initialized with a querybuilder instance (and optionally a max number of iterations and the option to track edges), it can then be run with an initial set of nodes to obtain the result of the accumulated traversal procedure described by the iterations of the query. * ReplaceRule: same as the update rule, except that at the end of the procedure the returned basket contains not the accumulation of the traversal steps but only the nodes obtained during the last step. This rule is not compatible with the 'until no new nodes are found' end iteration criteria. * RuleSequence: this can concatenate the application of different rules (it basically works like an UpdateRule that iterates over a chain of rules instead of a single querybuilder instance). * RuleSaveWalkers and RuleSetWalkers: rules that can be provided in a chain of rules given to a RuleSequence to save a given state of the current basket (Save) that can later be used to overwrite the content of said working basket (Set). This is useful in the case where one might need to do two operations 'in parallel' (i.e. on the same set of nodes) instead of doing the second on the results of the first one. Co-Authored-By: ramirezfranciscof <ramirezfranciscof@users.noreply.github.com> 10 January 2020, 10:11:03 UTC
d32e2aa Add imports from `urllib` to dbimporters (#3704) Some query functions of the dbimporters used `urlopen` and `urlencode` without importing `urllib.request` and `urllib.parse`. 07 January 2020, 11:42:25 UTC
0e9be49 Fix bug in `upload_calculation` for `CalcJobs` with local codes (#3707) This code path was not tested at all, and so the remaining occurrence of `get_folder_list`, which is part of the old repository interface that was removed in `v1.0.0`, went unnoticed. The fix had to write content to temporary files and flush them instead of using a filelike object, because `Transport.put` only accepts absolute filepaths for the moment. 07 January 2020, 09:50:42 UTC
982b089 Update dependency requirement `circus~=0.16.1` (#3703) This new release of `circus` makes it compatible with `pyzmq>=17`. This in turn allows us to unpin the `ipython` requirement, which required these versions of `pyzmq`. Note that implicitly the upper limit for `ipython` is `7.10`, because it drops support for python 3.5, which we support for another 9 months. Note that we specify `0.16.1` because the wrong tarball was uploaded for `0.16.0`, making installation from source fail. 27 December 2019, 15:34:19 UTC
a6c4704 Make local modules importable when running `verdi run` (#3700) Running a python script through `verdi run` from its local directory that imports from a module in the same directory would yield a `ModuleNotFoundError`. The problem was that the current working directory was not being added to the `sys.path` of the exec'ed file. 24 December 2019, 15:16:16 UTC
dcd0ce4 Ensure correct types for `QueryBuilder().dict()` with multiple projections (#3695) The results returned by the backend implementation of `QueryBuilder` are passed through `get_aiida_entity_res` to convert backend instances to front end class instances. It calls `aiida.orm.convert.get_orm_entity`, which is a singledispatch to convert all known backend types to their corresponding front-end ORM analogue. The registered implementation for `Mapping` contained the bug. It used a simple comprehension and did not catch any `TypeError` that might be thrown from values that could not be converted. This would bubble up to `get_aiida_entity_res`, which would then simply return the original value. If the mapping contains a mixture of backend entities and normal types, the entire conversion would be undone. This surfaced when calling `dict` on a query builder instance with `project=['*', 'id']`. The returned value for each match is a dictionary with one value an integer corresponding to the `id` and the other value a backend node instance. The integer would raise the `TypeError` in the `Mapping` converter and, since it wasn't caught, the backend node was also not converted. 20 December 2019, 16:43:52 UTC
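The failing case from the commit, which now returns properly converted entities (a sketch):

```python
from aiida.orm import Node, QueryBuilder

builder = QueryBuilder().append(Node, tag='node', project=['*', 'id'])
for row in builder.dict():
    entity = row['node']['*']  # now correctly converted to a front-end `Node`
    pk = row['node']['id']     # the plain integer no longer disrupts conversion
```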
c3c9aaf `CalcJob`: do not pause when exception thrown in the `presubmit` (#3699) The `presubmit` call of the `CalcJob` has recently been moved to `aiida.engine.processes.calcjobs.tasks.task_upload_job`, which is wrapped in the exponential backoff mechanism. The latter was introduced to recover from transient problems such as connection problems during the actual upload to a remote machine. However, it should not catch exceptions from the `presubmit` call, which are not actually transient and thus not automatically recoverable. In this case the process should simply except. Here we test this by mocking the presubmit to raise an exception and checking that it is bubbled up and the process does not end up in a paused state. To prevent the test from blocking in case the process gets put in the paused state erroneously, we put a timeout on the test, for which we need the `pytest-timeout` plugin. 20 December 2019, 16:30:42 UTC
b825bbb Remove `aiida.schedulers.plugins` files from pre-commit black list (#3697) Except for the test files. This is a last-ditch effort to get `pylint` to shut up about nonsensical problems with `aiida.schedulers.plugins.slurm`. If this doesn't work: (╯°□°)╯︵ ┻━┻ 20 December 2019, 15:06:11 UTC
0834731 Sphinx extension: skip documenting the outline if it is None (#3690) If no outline is created in the `define` method, `get_outline` will return `None`. Previously, this led to an error in the sphinx directive. This is fixed by explicitly checking for None. This error occurred in an abstract base workchain that does not implement an outline itself, but is still documented. 20 December 2019, 11:19:13 UTC
b597696 `pylint` will shut up one way or the other (#3696) It keeps complaining about `requested_wallclock_time_seconds` being an invalid name in `aiida.schedulers.plugins.slurm` however it is disabled with an inline statement that apparently gets ignored. It could be that another instance of this variable is triggering the warning and the source file is just incorrect. So here we add the disable statement for all instances of `requested_wallclock_time_seconds` and pray to god that this time it works. 20 December 2019, 11:02:52 UTC
a34b356 Update `prospector` to the latest version `v1.2.0` (#3693) This comes with new requirement of `pylint==2.4.4` which comes with a few new warnings that have been addressed in the code base: * no-else-break * no-else-continue * self-assigning-variable * property-with-parameters * invalid-overridden-method * unnecessary-comprehension One additional new warning has been added to the ignore list * import-outside-toplevel PEP8 recommends to only have imports at the toplevel of a module and not in function or class definitions. However, the changes required would be significant and so we leave that for another time. 20 December 2019, 07:09:21 UTC
d3e05c3 Add the `verdi node repo dump` command (#3623) This command can be used to easily dump the entire contents of a node's repository to a folder on the local file system. To prevent accidentally overwriting existing files, the command requires that the path to a non existing directory be specified where the contents are to be stored. If the path already exists, the command will abort. 19 December 2019, 13:59:32 UTC
3402c3e Consider 'AIIDA_TEST_PROFILE' in 'get_test_backend_name'. (#3685) Change the logic of 'get_test_backend_name' to check (if given) the backend set in the configuration for the profile specified in 'AIIDA_TEST_PROFILE'. If a backend is also specified in 'AIIDA_TEST_BACKEND', it checks that the two match, raising ValueError otherwise. If neither is specified, fall back to the default django backend. This addresses an issue with the current pytest setup, where the wrong tests would be discovered when running on a test profile with sqlalchemy backend without explicitly setting the AIIDA_TEST_BACKEND environment variable. 18 December 2019, 15:03:44 UTC
852a895 `QueryBuilder`: fix validation bug and improve message for `in` operator (#3682) The order of validation checks was incorrect and the type was not checked at all. In addition the validation error message clarity has been improved to contain the operator name. 18 December 2019, 09:36:07 UTC
ff02e0a fix performance issue when exporting many groups (#3681) When providing groups to `verdi export`, it was looping over all groups and using the `Group.nodes` iterator to retrieve the nodes contained in the groups. This results in (at least) one query per group, and is therefore very inefficient for large numbers of groups. The new implementation replaces this by two queries, one for Data nodes and one for Process nodes. It also no longer constructs the ORM objects since they are unnecessary. On a test set of 67k groups containing 5 nodes each, this change reduced time spent in getting the node identifiers from ~204s to ~11s. 17 December 2019, 22:00:09 UTC
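A sketch of the batched approach for one node class (the `group_pks` list is hypothetical; `with_group` is a standard `QueryBuilder` relationship keyword):

```python
from aiida.orm import Group, Node, QueryBuilder

group_pks = [1, 2, 3]  # hypothetical group identifiers

# One query over all groups at once, projecting only the node ids,
# instead of iterating `Group.nodes` group by group
builder = QueryBuilder()
builder.append(Group, filters={'id': {'in': group_pks}}, tag='group')
builder.append(Node, with_group='group', project='id')
node_pks = [row[0] for row in builder.all()]
```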
e95d57b Deal with unreachable daemon worker in `get_daemon_status` (#3683) The `aiida.cmdline.utils.daemon.get_daemon_status` utility function used in `verdi daemon status` calls the `DaemonClient.get_worker_info` method, which will ask the circus client to get information on the daemon workers it is managing. Under normal conditions this will return a dictionary of worker pids with a dictionary of their stats. However, sometimes the daemon may fail to retrieve these stats and the dictionary is replaced with a string containing an error message. The `get_daemon_status` method now deals with this elegantly and temporarily prints `-` as placeholder for the unknown statistics. Since this is typically caused by a transient problem, the next time the command is called the correct information will be displayed. 17 December 2019, 16:56:34 UTC
999ae3a Update `pyyaml` to prevent arbitrary code execution (#3675) * Fix various deprecation warnings * Do not use `Test` as prefix for dummy classes * Replace `imp` with `importlib` in REST API * Use `identifier` instead of deprecated `node_class` * Update `pyyaml` to prevent arbitrary code execution Before `pyyaml==5.1` the `yaml.load` function was vulnerable to arbitrary code execution, because it loaded the full YAML language. There was an alternative `safe_load`, but this was not the default and could only load a subset of the markup language. The new version of pyyaml deprecates the old vulnerable code and provides the `FullLoader` that can load the full set without being vulnerable. 16 December 2019, 17:50:02 UTC
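The safe loading pattern the commit refers to (standard `pyyaml>=5.1` API):

```python
import yaml

document = 'a: 1\nb: [2, 3]'
# `FullLoader` loads the full YAML language while refusing the arbitrary
# object construction that made the old default `yaml.load` vulnerable
data = yaml.load(document, Loader=yaml.FullLoader)
```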
a73e4b6 reduce pytest warnings (#3674) * Update docs for pytest. * Bring pytest warnings from 423 down to 31 * fix usage of deprecated unittest api * fix tests skipped because of `__init__` constructor * fix classes incorrectly identified as tests * fix deprecated usage of external modules * fix incorrect escape sequences in regular expressions * ignore unnecessary warnings from external packages **Note:** this could be made more flexible (e.g. narrowing down filters), since such warnings can sometimes be useful, e.g. if we are using a deprecated API of the package (say, django). Most of the time, it is the external package that is using some deprecated API. * add individual test timings 16 December 2019, 09:47:43 UTC
8eeb432 Improve process directive in sphinxext and add `outline` support. (#3670) Adds support for showing the process outline, as a pseudo-code block. In addition, the content of the process description is now wrapped in a `desc_content` node, which means that the entire description will be indented. This is consistent with the behavior of the regular class directive. 15 December 2019, 14:04:28 UTC
e46997e Remove READTHEDOCS specific code from `aiida.manage.configuration` To compile the documentation, the code has to be imported, which includes the Django database models. Unfortunately, Django will raise here if it has not been configured. Normally this is done through the backend, which will load the backend environment for the currently loaded AiiDA profile. Since on readthedocs.org AiiDA is not installed and so there won't be a configuration with a profile to load, the profile management code was monkey patched to provide a dummy profile just for this purpose. Instead of cluttering the main codebase for this one exception, we move the responsibility to the documentation configuration itself. The requirements boil down to the Django settings being called and an AiiDA configuration and profile being loaded. Since loading a profile with a django backend also loads said backend, the former is accomplished automatically. This will now also allow building the documentation locally even if the default profile is not using a Django backend, because the dummy documentation profile will simply be used. The function `aiida.manage.configuration.load_documentation_profile` performs all the required actions. 15 December 2019, 10:50:01 UTC
a33e74c Do not create config file upon import So far, importing aiida would directly create a config file. Such side effects can have problematic consequences e.g. if the default path that AiiDA chooses is not writeable. It also means that you pollute your file system just by importing the package. This PR moves the creation of the config file (if necessary) into the `load_profile` function, which is anyhow called e.g. by the `verdi` command line. The same goes for the configuration of the logger. 15 December 2019, 10:50:01 UTC
b8f40b1 Move `CalcJob.presubmit` call from `CalcJob.run` to `Waiting.execute` (#3666) The `presubmit` calls through to `prepare_for_submission`, which is the place where the `CalcJob` implementation writes the input files, based on the input nodes, in a temporary sandbox folder. The contents of the sandbox are then stored in the repository of the node before the process transitions to the waiting state, awaiting the upload task. This upload task will then copy over the contents of the node repository to the working directory on the remote computer. There is a use case to limit which files are copied to the node's repository from the sandbox folder used by `prepare_for_submission`. This means that the sandbox folder will have to be passed to the `upload_calculation`. However, since its creation happens in the `CalcJob.run` call, which is quite far removed from the eventual use, this opens it up to a lot of problems. The time between folder population by `prepare_for_submission` and eventual use in the upload task can be significant, among other things because the upload task will have to wait for transport. To make this feasible, the creation of the sandbox folder has to be moved closer to the upload task. Here we move the `presubmit` call from `CalcJob.run` to the upload transport task, which will create the sandbox folder and pass it into the `upload_calculation`. This limits the time frame in which there is a chance for the contents of the sandbox to get lost. In addition, this now gives additional freedom in deciding which files from the sandbox are permanently stored in the node's repository for extra provenance. 14 December 2019, 17:25:10 UTC
bb82010 Fix problem with `aiida.schedulers.plugins.pbsbaseclasses` pre-commit (#3668) Since a recent commit, this file was failing in the pre-commit despite it being on the blacklist in the `.pre-commit-config.yaml` file. Even adding explicit pylint inline disable statements seemed to get ignored. Since the attributes are not actually used anywhere, we adapt them to the conforming naming convention. 14 December 2019, 12:22:37 UTC
9092b93 Move `last_job_info` from JSON-serialized string to dictionary (#3651) For historical reasons, this field was stored as a JSON-serialized string in the `last_jobinfo` attribute of a CalcJob. However, this is cumbersome and makes querying very hard. We now replace this with a dictionary, thanks to new methods that directly return a dictionary (with serialized fields, so that the dictionary is JSON-serializable). These (and existing) methods of the JobInfo class are now also tested. Finally, the attribute key has been renamed from `last_jobinfo` to `last_job_info`, for consistency with the key `detailed_job_info` introduced in #3639. By changing the type of the content, the field is anyway not directly usable as before in scripts, so changing the name is not an additional issue. This should not give a real backward-incompatibility problem, since this field was there mostly for debugging reasons. 13 December 2019, 17:44:54 UTC
7895662 Make command line deprecation warnings visible with test profile (#3665) The `deprecated_command` decorator was only printing the warning if the profile was not a test profile. With `verdi devel tests` being deprecated, typically run with a test profile, the warning was being swallowed. This change gave problems with a test of `verdi comment show` that expected no output for no input, but now of course failed due to the output of the deprecation. Since it is a deprecated command anyway I simply remove the test. 13 December 2019, 15:46:58 UTC
ba51ed6 Remove items from the pre-commit black list (#3625) Remove 52 files from `.pre-commit` black list and remove `python_2_unicode_compatible` from all files. Co-Authored-By: Leopold Talirz <leopold.talirz@gmail.com> 13 December 2019, 12:20:12 UTC
dacbeb3 gha: disable faulty repositories (#3660) It seems that the GitHub actions virtual environments contain Microsoft repositories for .NET (and other tools) which can be broken at times: https://github.community/t5/GitHub-Actions/ubuntu-latest-Apt-repository-list-issues/m-p/41122 Disable them since we don't use packages from there for now. 13 December 2019, 09:36:13 UTC
4de836c Set job poll interval to zero in localhost pytest fixture (#3605) 12 December 2019, 22:29:18 UTC
87a7d34 remove old test infrastructure 12 December 2019, 22:01:17 UTC
562716e Switch to pytest for running unit tests Pytest is a powerful test framework that is already used by most aiida plugins. This PR uses the pytest fixtures (actually, only the aiida_profile fixture so far) in order to run the AiiDA unit tests. * Fix errors at collection stage by using pytest.skip() in the setUp or the test function itself instead of using a decorator. Using 'collect_ignore_glob' to select tests by backend. * Run tests with pytest in GitHub actions. * remove unused tearDown_method, setUp_method * fix setUpClass 12 December 2019, 22:01:17 UTC
f28d1c9 Update manifest to contain correct backup template (#3652) This was moved to `manage/backup/backup_info.json.tmpl` but the `MANIFEST` was never updated meaning that anyone installing the package from PyPI would be missing this file and creating a backup according to our instructions in the documentation would fail. 12 December 2019, 20:06:48 UTC
6c8efad Add test that Sphinx process extension raises when process spec raises (#3617) A problem in the implementation of the process spec construction in the `plumpy` library resulted in exceptions being raised during the `spec()` call being swallowed and an incomplete spec being set. As a result the compiled process documentation would be missing parts of the spec with no indication of what went wrong. The `Process.spec()` method was improved to not set the spec and reraise if any exceptions are raised. This allows the Sphinx extension to catch the error and re-raise with a more helpful error message. 12 December 2019, 18:37:57 UTC
66b6f6a Fix pre-commit in two files 12 December 2019, 17:39:12 UTC
2eac63e Fix two issues in `verdi computer configure ssh` The interactive SSH setup was not accepting an empty ssh key_filename, while this should be acceptable. I am also adding a test for the interactive setup of SSH computers. Moreover, when reconfiguring a computer, the value of the cooldown time wasn't reused from the authinfo; instead, the class default was used due to some hardcoding. The code now behaves correctly and is simplified to avoid duplication in base classes and subclasses. Fixes #3633 and fixes #3634 12 December 2019, 17:39:12 UTC
f68b304 Add more methods to control cache invalidation of completed process node (#3637) A new attribute `invalidates_cache` is added to the `ExitCode` named tuple, which is false by default. If a process node has an exit status that corresponds to an exit code with `invalidates_cache=True`, the node should not be considered for caching. This functionality is leveraged in the `ProcessNode.is_valid_cache` method, which will return false if the exit code set on the node is one that invalidates the cache. The `Process.is_valid_cache` method provides an additional hook over the same method on the `ProcessNode` class to allow plugins to add additional custom logic in `Process` sub-classes. Note that this is a backwards-incompatible change, which can break code that uses unpacking of `ExitCode` tuples, for example: `status, message = ExitCode(...)` Finally, the new `Process.is_valid_cache` implementation requires the class to be importable for it to be considered cachable. In the case of process functions this means that the function is importable. This meant that process function definitions in the unit tests had to be moved to the module level. 12 December 2019, 16:02:12 UTC
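A sketch of defining such an exit code in a process spec; the calculation class and exit code label are hypothetical, while `invalidates_cache` is the keyword introduced here:

```python
from aiida.engine import CalcJob


class SomeCalculation(CalcJob):  # hypothetical
    @classmethod
    def define(cls, spec):
        super().define(spec)
        # a node that fails with this exit code will not be used as a cache source
        spec.exit_code(310, 'ERROR_OUTPUT_UNREADABLE', invalidates_cache=True,
                       message='The output file could not be read.')
```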
e74ad1b Filter out `None` timestamps in `get_process_state_change_timestamp` Otherwise the code may raise when trying to get the maximum of all timestamps, if one of the values is None and thus not comparable. 11 December 2019, 05:23:19 UTC
91016b9 Factor out detailed job info fields for SLURM Scheduler implementation This is useful for future implementation of a method that can parse the string output into a dictionary. 10 December 2019, 22:14:30 UTC
97658d8 Move getting completed job accounting to `retrieve` transport task The task of retrieving the detailed accounting info from the scheduler for completed jobs was erroneously placed within the scheduler status update cycle of the `JobsList`. This would have requested the detailed job info each update cycle were it not for a conditional that checked that the scheduler status was `DONE`. However, within the `JobsList` loop, completed jobs would not even appear in the result and hence the conditional would never be hit. This caused the detailed accounting never to be retrieved. A logically better place for this functionality is in the `retrieve` transport task after the `UPDATE` task, i.e. when the scheduler reports that the job has terminated. The job accounting will query the scheduler for the full accounting which, when implemented for the given scheduler plugin, will be stored as an attribute on the calculation job node. The `Scheduler.get_detailed_jobinfo` method is deprecated in favour of `Scheduler.get_detailed_job_info`. The former was returning a string with the return value, stdout and stderr formatted within it, which is not very flexible. The replacing method returns a dictionary. 10 December 2019, 22:14:30 UTC
cc3f4bf Docs: adding documentation on performance tips (#3629) Some parts of the documentation on how to maximise performance of AiiDA were already there. I am now adding additional useful information originating from our experience. I am also adding suggestions from the CSCS staff on how to optimise/tune the performance of SLURM, information that AiiDA users might forward to their cluster administrators. 10 December 2019, 18:55:21 UTC
80db4ef Add option to expand namespaces in sphinx directive (#3631) Adds the `:expand-namespaces:` flag to the sphinx directive, which causes the `<details>` tags to be created with 'open="open"' attributes. This makes them expand by default (collapsible by clicking). Added this case to the test / demo documentation of the sphinx extension, and in the documentation for `CalcJob`, where the options are explained. 10 December 2019, 18:31:48 UTC
f66d0a6 Make sure that datetime conversions ignore `None` (#3628) The functions `datetime_to_isoformat` and `isoformat_to_datetime` assumed to always receive a proper type (string or datetime) as input. However, in some cases, they were called with values that could potentially be `None`, like this in `aiida.engine.utils`: `timezone.isoformat_to_datetime(manager.get(key).value)`. We now directly return `None` if `None` is passed as an input (and it is then up to the caller to decide what to do with the value). 10 December 2019, 13:49:05 UTC
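The new behavior in a nutshell (assuming the functions live in `aiida.common.timezone`):

```python
from aiida.common import timezone  # module location is an assumption

assert timezone.isoformat_to_datetime(None) is None  # previously raised
assert timezone.datetime_to_isoformat(None) is None
```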
29ad71b Timeout a CI job after 30 min. (instead of 360) (#3627) It seems it is currently not possible to set a global timeout, so we set individual timeouts for every job. 10 December 2019, 09:34:26 UTC