Revision Author Date Message Commit Date
eff3c63 pre-commit: Update flake8 hook configuration flake8 hook has been removed from https://github.com/pre-commit/pre-commit-hooks so now use the one from https://gitlab.com/pycqa/flake8 17 September 2020, 11:56:04 UTC
7404486 cli: speed up the `swh` cli command startup time Move most import statements into functions. Related to T2575. 10 September 2020, 14:19:18 UTC
12fe1f7 model: Fix "unused 'type: ignore' comment" error with mypy 0.782 25 August 2020, 13:32:51 UTC
c85990b Tell pytest not to recurse in dotdirs. pytest wastes a lot of time in .hypothesis and .git; this commit excludes them. 25 August 2020, 08:40:13 UTC
6dd6ace model: Raise error on naive datetimes. We may unknowingly pass naive datetimes to the storage through them, causing the underlying DB to assign them a timezone that might not match the actual one. It already happens in swh.model and swh.loader.package tests. 14 August 2020, 12:12:35 UTC
d1db7b9 model.Content.to_dict: Remove ctime entry when it's None Same as for the field data, it helps for code not yet migrated to use model object. 07 August 2020, 07:53:31 UTC
b1a16b1 model: Add Sha1 alias Related to T645 07 August 2020, 07:50:45 UTC
09dcc04 model: Add final object_type field on metadata related model objects 06 August 2020, 17:00:21 UTC
37cdd84 setup.py: Really use the correct keyword Related to T2105 06 August 2020, 16:43:30 UTC
3b2e6c0 setup.py: Use the correct keywords Related to T2105 06 August 2020, 16:13:47 UTC
dab3d72 setup.py: Migrate from vcversioner to setuptools-scm Related to T2105 04 August 2020, 12:06:52 UTC
f9fc106 add ImmutableDict.__repr__ It can help in pytest's diffs 30 July 2020, 13:41:29 UTC
b58d901 Fix incorrectly typed null constants in extra_headers byte strings 29 July 2020, 10:46:18 UTC
8f609e5 Import Mapping from collections.abc instead of collections to fix a deprecation warning. 29 July 2020, 10:46:09 UTC
81f9fbc Declare pytest markers to prevent warnings 29 July 2020, 10:41:12 UTC
3b2d72c Rename MetadataAuthorityType.DEPOSIT to MetadataAuthorityType.DEPOSIT_CLIENT. D3560 20 July 2020, 09:35:27 UTC
bf43536 Rework dia -> pdf pipeline for inkscape 1.0 - Use dia directly to convert from .dia to .svg (inkscape would use dia via a plugin anyway) - Add proper runes to detect inkscape >= 1 and use the export options for that. 09 July 2020, 17:35:21 UTC
0547a51 identifiers: Add to_dict method to SWHID class 08 July 2020, 14:18:40 UTC
52ef52e Use attr instead of NamedTuple to generate SWHID. As NamedTuple inherits from tuple, msgpack serializes it like a tuple, which makes it indistinguishable from a tuple when deserializing, which is an issue for the RPC API. 07 July 2020, 15:34:41 UTC
bea256e Make SWHID immutable and hashable. 07 July 2020, 13:12:44 UTC
06837d5 Implement ImmutableDict.__hash__. 07 July 2020, 13:10:53 UTC
c4dad17 Allow passing an ImmutableDict as argument to ImmutableDict's constructor. It allows easy conversion of Union[ImmutableDict, Dict] to ImmutableDict. 07 July 2020, 13:04:54 UTC
9e475a7 Implement to_dict and from_dict for metadata-related classes. 07 July 2020, 11:31:05 UTC
af0dd1a Add a new ImmutableDict class, and use it in model objects. So they are truly immutable now. 07 July 2020, 11:31:05 UTC
78fc5f7 Add raw metadata to the model. This will allow swh-storage to have a signature for *_metadata_add that is consistent with other *_add endpoints. 07 July 2020, 09:48:19 UTC
a7d9aca Extract the extra_headers from metadata on the Revision model class Add a new extra_headers attribute on Revision and use it for computing the revision's id instead of extracting it from the metadata field. Only accept (bytes, bytes) as extra_header. Add a post init hook to Revision to initialize this new attribute from given metadata, if any, for bw compat. Also amend the revision_d hypothesis strategy to generate extra_headers. 06 July 2020, 09:57:55 UTC
1ff0516 identifiers: Rename some functions and types related to SWHIDs When Software Heritage persistent identifiers were introduced, they were not yet abbreviated as SWHIDs. Now that the abbreviation is gaining adoption, rename some functions and types in swh.model.identifiers for consistency: - PersistentId -> SWHID - persistent_identifier -> swhid - parse_persistent_identifier -> parse_swhid Backward compatibility with previous naming is maintained but deprecation warnings are introduced to encourage the use of the new names. Numerous variables in swh.model codebase have also been renamed accordingly. Also rework and improve documentation. 03 July 2020, 12:11:32 UTC
8863b5c Refactor common loader behavior within from_disk.iter_directory 02 July 2020, 13:09:50 UTC
363b165 Unify object_type some more within the merkle and from_disk modules 02 July 2020, 13:03:04 UTC
40a40f5 model.OriginVisit: Drop obsolete fields Related to T2310 29 June 2020, 09:08:06 UTC
e632abe Tag model entities with their "object_type" This aims at preventing constant usage of isinstance() based dispatch code when writing generic code handling model entities. For example, the "object_type" argument of JournalWriter.write_addition() has become superfluous now that we only pass model entities, etc. This idea comes from olasd's reading of the mypy doc: https://mypy.readthedocs.io/en/latest/literal_types.html#tagged-unions This comes with a refactoring of from_disk.DiskBackedContent to make it *not* inherit from model.Content: object_type being Final, it cannot be overloaded. 24 June 2020, 15:39:02 UTC
661b7c2 OriginVisitStatus: Allow "created" status Related to T2310 24 June 2020, 07:16:50 UTC
636f8c2 model.OriginVisit: Make obsolete fields optional Related to T2310 23 June 2020, 15:29:53 UTC
f349bdc swh.model.model.OriginVisit: Drop the dateutil.parser.parse use 22 June 2020, 08:14:30 UTC
ba0c4e1 model.hypothesis_strategies: Make metadata always none on origin_visit This is not used. This is broken storage wise (origin-visit-add does not deal correctly with it and it so happens there is no test around it). And finally, this will soon go away with T2310. 16 June 2020, 17:10:53 UTC
f723eb1 Fix the model: Revision.message can be None And adapt the revisions_d() strategy accordingly. 16 June 2020, 08:35:02 UTC
b70b281 Fix message generation in hypothesis strategy releases_d() This can be None, according to the model. 16 June 2020, 08:35:02 UTC
5c5f34f Use the optional() strategy instead of one_of(none(), ...) when possible for the sake of consistency. 16 June 2020, 08:34:54 UTC
a427e18 Allow negative_utc to be None in normalize_timestamp() thus in TimestampWithTimezone.from_dict(). This is needed to help consuming existing (invalid) messages from kafka. Warning: tests added in this revision do not cover the whole normalize_timestamp() function. 15 June 2020, 07:40:43 UTC
3d9f694 Use Tuple instead of List in model declarations. This is a step forward having model objects, declared as frozen, immutable. This requires attrs_strict >= 0.0.7. 03 June 2020, 09:32:05 UTC
340656d Fix origin_visit hypothesis strategies the visit attribute is expected to be strictly positive. 03 June 2020, 09:23:00 UTC
a95646f Exclude [Skipped]Content.ctime from hash/eq computation this attribute is not an intrinsic property of a content object, so it should not be used when comparing or hashing. 29 May 2020, 15:14:31 UTC
29312df Add support for model object anonymization Simply add a BaseModel.anonymize() method. Default implementation returns None, meaning the object is not anonymizable. For Person, the method returns a Person with hashed fullname (and unset name and email). For Revision and Release, the method returns an anonymized version of the object, i.e. with instances of Person replaced by anonymized ones. 20 May 2020, 14:28:01 UTC
cce3036 SWHID spec: fix typos ";;" which made some examples fail 14 May 2020, 13:47:46 UTC
091498e Make aware_datetimes() generate only ISO8601-encodable datetimes. 05 May 2020, 10:03:39 UTC
9f5d266 SWHID spec: full reread Reviewers: rdicosmo Reviewed By: rdicosmo Differential Revision: https://forge.softwareheritage.org/D3108 30 April 2020, 17:06:41 UTC
b80b135 setup.py: add documentation link 29 April 2020, 16:32:31 UTC
08fd228 hypothesis_strategies: Generate aware datetimes instead of naive ones. Production should only use aware datetimes. 29 April 2020, 11:02:22 UTC
0fad886 doc: check-in IANA registration template for the "swh" URI scheme Closes T1003 29 April 2020, 07:34:30 UTC
8367eec Restructure SWHID documentation in preparation for T2385 - merge grammars into a single one - explain better that SWHIDs are made up of core identifier + qualifiers - separate qualifiers into context and fragment ones - add reference to swh-identify 28 April 2020, 18:47:50 UTC
f97d216 SWHID spec: bump version to 1.3 and add last modified date 28 April 2020, 14:04:42 UTC
d230938 SWHID spec: make SWHIDs plural where needed 28 April 2020, 14:04:19 UTC
1379385 SWHID spec: simplify and generalize escaping requirements 27 April 2020, 13:17:50 UTC
3ef4843 SWHID spec: add support for IRI Closes T2379 26 April 2020, 14:44:51 UTC
56cf99a SWHID: deal with escaping in origin qualifiers 24 April 2020, 14:56:47 UTC
3f38808 SWHID doc: improve wording of intrinsic parts v. the rest 24 April 2020, 08:11:45 UTC
1037e88 Add a split_content argument to object_dicts() and objects() strategies Make it possible to generate Content and SkippedContent under different object types (namely "content" and "skipped_content"). Default to False to keep backward compat. 21 April 2020, 12:49:14 UTC
ebd3807 Add a blacklist_types argument to object_dicts() and objects() hypothesis strategies so one can choose not to generate some of the object types. Blacklist "origin_visit_status" by default to prevent breaking dependent packages' tests. 21 April 2020, 12:48:33 UTC
bfba3bd Fix hypothesis strategies alias for origin visit update objects 20 April 2020, 15:37:56 UTC
e5227e2 setup: Update the minimum required runtime python3 version Related to T2367 20 April 2020, 15:37:56 UTC
d52549f CLI: add test for swh identify w/o args and use required=True to check that, as it is the preferred way 17 April 2020, 15:42:16 UTC
7b2cc1f CLI: require explicit "-" to identify via stdin 17 April 2020, 15:25:03 UTC
6ac6cb7 SWHID doc: fix minor grammar issue hat tip to @rdicosmo for noticing 17 April 2020, 15:11:38 UTC
098f76a SWHID doc: fix link in CISE paper reference 17 April 2020, 14:42:46 UTC
36f921b identifiers.py: reference to SWHIDs using explicit anchors 17 April 2020, 14:23:13 UTC
94242ca swh identify: embrace SWHID naming in user-facing doc/messages 17 April 2020, 14:22:41 UTC
4c78d47 PID doc: embrace the SWHID naming 17 April 2020, 14:22:11 UTC
0ab482e PID doc: add reference to CISE paper 17 April 2020, 14:21:46 UTC
2ae347d doc: document identify CLI 16 April 2020, 14:25:14 UTC
401bc17 model: Rename OriginVisitUpdate to OriginVisitStatus This also adapts the hypothesis strategies, using the plural form origin_visit_statuses. That plural form is acceptable because in our context, the statuses are countable. Related to T2310 10 April 2020, 08:43:20 UTC
6f8c66c model: Black formatting 10 April 2020, 08:43:04 UTC
94da010 Add a pyproject.toml file to target py37 for black 08 April 2020, 20:16:56 UTC
bf3f1ce Enable black - blackify all the python files, - enable black in pre-commit, - add a black tox environment. 08 April 2020, 14:53:06 UTC
5d6883b from_disk: path parameter to dir_filter functions 08 April 2020, 09:31:22 UTC
c7c1a57 docs/data-model: Update visits chapter definition Hinting at the origin_visit_update model Related to T2310 02 April 2020, 14:32:02 UTC
64a7f62 model: Make message field optional in Release model A release may have an empty message, for instance those derived from a Mercurial repository. So make that field optional to avoid type validation errors. 02 April 2020, 12:00:30 UTC
074c210 hypothesis: Fix some issues in snapshots strategy and add tests Fix keyword parameters transmission to snapshots_d strategy. Ensure max_size constraint is respected when fixing snapshot aliases. 02 April 2020, 09:45:59 UTC
ca0f6a1 model: add support for ctime in [Skipped]Content.from_[data,dict]() With support for str representation of date. Mostly for testing purpose. 01 April 2020, 09:07:24 UTC
414a655 model: small code improvement of SkippedContent.from_dict 01 April 2020, 09:07:24 UTC
6ce0f71 model: fix SkippedContent origin to be a str instead of a reference to an Origin entity. 01 April 2020, 09:07:24 UTC
f513271 hypothesis: split hypothesis strategies as a dict + entity instance for each entity model `Model`, provide a `models_d` strategy that produces dicts suitable for using as argument for the `Model.from_dict` factory method, and reimplement the `models` generator using this former hypothesis generator. This is needed to help writing low level tests for model entities. 01 April 2020, 09:07:24 UTC
10b0699 model: improve a bit the TimestampWithTimezone model - add a validator for negative_utc (can be True iff offset is 0), - update the timestamps_with_timezone hypothesis strategy, - add low-level tests for it. 01 April 2020, 08:57:07 UTC
ac9d4c8 tests: add low level tests for the Timestamp model entity 01 April 2020, 08:57:07 UTC
85ca7d7 model: use attrs_strict to enforce type validation of model objects This ensures all instantiated model entities have valid types for attributes. Related to T2308. 01 April 2020, 08:57:07 UTC
e9a4c75 model: Add new OriginVisitUpdate model object + test strategy (pairing with @vlorentz) Related to T2310 31 March 2020, 16:01:54 UTC
accca60 Typo 30 March 2020, 12:13:59 UTC
b6e92ea Further clarifications in the PID extension 30 March 2020, 12:11:55 UTC
d14883e Clarify ambiguities in PID extensions 28 March 2020, 17:22:11 UTC
0767c81 Extend SWH PID definition with additional context qualifiers. 28 March 2020, 14:16:04 UTC
4a2233c identifiers: encode origin URLs in utf-8 23 March 2020, 18:09:47 UTC
97af886 tests/identifiers: fix 'target', 'directory' and 'parents' object types These are expected to be bytes, not str. 12 March 2020, 13:28:22 UTC
56ae59c test/model: do not test direct instantiation of model objects this does not work in the general case since there is no (recursive) conversion of objects used as model object initialization. We can only check when using the from_dict() factory. 11 March 2020, 14:41:49 UTC
c746960 tests/models: use d.copy() instead of dict(d) for better clarity on the code author's intention. 11 March 2020, 14:39:32 UTC
f533f62 model: kill Origin.type attribute it was still here for bw-compat but should not be necessary any more. 11 March 2020, 14:01:23 UTC
0a6d7e0 Extract the dictify() function from BaseModel.to_dict() this function does not need to be a local function of the to_dict namespace. 11 March 2020, 12:15:32 UTC
a5a9f57 Add classmethod Person.from_address, to parse from 'name <email>' strings. This will allow deduplicating code across loaders. 04 March 2020, 10:52:29 UTC
5ccf8a8 Draw contents from a byte string instead of generating arbitrary hashes This generates more realistic contents and avoids spurious HashCollisions when generating a set of objects using these hypothesis strategies, at the cost of slightly worse "boundary checking" (i.e. we won't check contents with a length > 4096 bytes). 02 March 2020, 15:22:59 UTC
ded150d Add a method to generate Content/SkippedContent from binary data This lets us generate Content objects directly from a bytestring, with the proper set of hashes auto-generated from the contents. 02 March 2020, 15:22:43 UTC
cb075eb model.hypothesis: use the proper strategy name for building `Person`s 27 February 2020, 17:03:18 UTC
a9a42ea model.hypothesis: Fix person generation 27 February 2020, 15:27:40 UTC
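
Note on 6dd6ace (raise error on naive datetimes): the distinction that commit enforces can be sketched with the standard library alone. This is a minimal illustration, not the swh.model code; the helper name ensure_aware is hypothetical and only shows how a naive datetime (one without usable timezone information) can be detected and rejected before it reaches a database.

```python
from datetime import datetime, timezone


def ensure_aware(value: datetime) -> datetime:
    """Return value unchanged if it is timezone-aware, else raise ValueError.

    Hypothetical helper for illustration; not part of swh.model.
    """
    # A datetime is aware only if tzinfo is set and utcoffset() is not None.
    if value.tzinfo is None or value.tzinfo.utcoffset(value) is None:
        raise ValueError(f"naive datetime not allowed: {value!r}")
    return value


# An aware datetime passes through untouched.
aware = ensure_aware(datetime(2020, 8, 14, 12, 12, 35, tzinfo=timezone.utc))

# A naive datetime is rejected instead of being silently assigned a timezone.
try:
    ensure_aware(datetime(2020, 8, 14, 12, 12, 35))
    rejected = False
except ValueError:
    rejected = True
```

Rejecting early, rather than guessing a timezone, keeps the stored value from depending on the configuration of whichever database ends up receiving it.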