122326d | Antoine R. Dumont (@ardumont) | 15 January 2018, 17:57:19 UTC | swh.model.hashutil: Add persistent identifier function Related T335 Related T933 | 15 January 2018, 18:28:22 UTC |
b61c666 | Stefano Zacchiroli | 14 January 2018, 21:30:16 UTC | docs: document the naming scheme for persistent identifiers Closes: T335 | 14 January 2018, 21:30:16 UTC |
a01d81c | Stefano Zacchiroli | 14 January 2018, 21:30:02 UTC | docs: shorter fulltitle for the data model document | 14 January 2018, 21:30:02 UTC |
c79c446 | Stefano Zacchiroli | 14 January 2018, 13:54:58 UTC | swh-hash-file: make sure that paths are passed on as bytes | 14 January 2018, 13:54:58 UTC |
73d5ffb | Stefano Zacchiroli | 14 January 2018, 13:40:18 UTC | bin/swh-hash-file: new binary to compute SWH-style content identifiers; reincarnation of the old swh.model bin/ script, which is now gone | 14 January 2018, 13:47:43 UTC |
a5f7d1e | Stefano Zacchiroli | 14 January 2018, 13:39:45 UTC | improve hash_file() docstring to specify algorithms type | 14 January 2018, 13:39:45 UTC |
91d74ef | Antoine R. Dumont (@ardumont) | 20 December 2017, 09:40:06 UTC | swh.model.hashutil.hash_data: Optionally integrate length in result | 20 December 2017, 09:46:03 UTC |
eff2692 | Nicolas Dandrimont | 13 December 2017, 10:30:21 UTC | Merge branch 'wip/snapshots' | 13 December 2017, 10:30:21 UTC |
1b1cc8d | Nicolas Dandrimont | 12 December 2017, 19:08:47 UTC | hashutil: add `snapshot` object type for git hashes Summary: Add support for snapshot identifiers Close T566. Related to D268. Test Plan: Unit tests included Reviewers: zack, #reviewers! Maniphest Tasks: T566 Differential Revision: https://forge.softwareheritage.org/D277 | 13 December 2017, 10:30:10 UTC |
46ce819 | Nicolas Dandrimont | 12 December 2017, 19:08:30 UTC | hashutil: add `snapshot` object type for git hashes | 12 December 2017, 19:08:30 UTC |
94bd8dd | Stefano Zacchiroli | 02 November 2017, 10:09:21 UTC | docs: add absolute anchor to documentation index | 02 November 2017, 10:09:21 UTC |
0b7f217 | Nicolas Dandrimont | 12 October 2017, 15:16:56 UTC | Cleanup packaging | 12 October 2017, 15:16:56 UTC |
34228c5 | Nicolas Dandrimont | 05 October 2017, 18:45:42 UTC | test_from_disk: use os.fsencode to consistently get tmpfile names as bytes | 05 October 2017, 18:45:42 UTC |
2ab8360 | Nicolas Dandrimont | 05 October 2017, 18:38:07 UTC | mark tests needing the filesystem as such | 05 October 2017, 18:38:07 UTC |
6c30346 | Nicolas Dandrimont | 05 October 2017, 18:31:27 UTC | d/control: add breaks on packages depending on removed APIs | 05 October 2017, 18:31:27 UTC |
f6a4d7e | Nicolas Dandrimont | 05 October 2017, 18:28:11 UTC | Remove swh.model.git Close T709 | 05 October 2017, 18:28:11 UTC |
c67f012 | Nicolas Dandrimont | 04 October 2017, 20:33:45 UTC | from_disk: full test coverage | 04 October 2017, 20:33:45 UTC |
8900a91 | Nicolas Dandrimont | 04 October 2017, 20:33:17 UTC | from_disk.Directory: fix some random bugs found when using the API | 04 October 2017, 20:33:17 UTC |
a8e919a | Nicolas Dandrimont | 04 October 2017, 18:08:19 UTC | test_hashutil: remove temporary file after test | 04 October 2017, 18:08:19 UTC |
c790b85 | Nicolas Dandrimont | 04 October 2017, 17:12:39 UTC | from_disk: add a way to save the path to contents This allows loaders to lazily load data: you can read from disk and transfer only the contents that are really missing. | 04 October 2017, 17:12:39 UTC |
d54c066 | Nicolas Dandrimont | 19 September 2017, 11:40:21 UTC | from_disk: convert on-disk data to Software Heritage archive objects Summary: This module is a reimplementation of swh.model.git, with the underlying goal of replacing it and fixing T709 in the process. Reviewers: #reviewers! Differential Revision: https://forge.softwareheritage.org/D248 | 03 October 2017, 17:47:46 UTC |
f44949a | Nicolas Dandrimont | 22 September 2017, 15:10:25 UTC | Add a Merkle tree data structure | 03 October 2017, 17:47:46 UTC |
ac3df91 | Stefano Zacchiroli | 19 September 2017, 15:07:36 UTC | docs/: add sub document data-model.rst Currently just a stub (but with a hereby fixed anchor) to allow linking to it from other parts of the development documentation. Will be filled in later on with an abstract description of the Software Heritage data model. | 19 September 2017, 15:07:36 UTC |
8bafb40 | Nicolas Dandrimont | 15 September 2017, 17:15:38 UTC | hashutil: improve docstrings | 15 September 2017, 17:15:38 UTC |
bd43a7f | Stefano Zacchiroli | 06 September 2017, 18:27:43 UTC | docstring: drop useless heading ":py" domain in crossrefs | 06 September 2017, 18:27:43 UTC |
1d898f7 | Stefano Zacchiroli | 06 September 2017, 18:22:30 UTC | sanitize docstrings for sphinx | 06 September 2017, 18:22:30 UTC |
b53f5d8 | Stefano Zacchiroli | 30 August 2017, 10:25:57 UTC | docs/: add sphinx apidoc generation skeleton change cherry picked from python module template commit 71b117ba0cf9f1251b1cac26d0994df03a4c787d | 30 August 2017, 10:25:57 UTC |
8aa5c3a | Nicolas Dandrimont | 12 July 2017, 14:12:12 UTC | Load pyblake2 dynamically instead of hardcoding the Python version Summary: for those people with legacy openssls Reviewers: anlambert, #reviewers! Differential Revision: https://forge.softwareheritage.org/D224 | 12 July 2017, 14:12:35 UTC |
11de644 | Nicolas Dandrimont | 26 June 2017, 12:41:11 UTC | requirements: make pyblake2 conditional on Python3.5 | 26 June 2017, 12:41:11 UTC |
d281faf | Nicolas Dandrimont | 19 June 2017, 17:56:07 UTC | d/control: drop pyblake2 if python >= 3.5 | 19 June 2017, 17:56:07 UTC |
6f89adf | Nicolas Dandrimont | 07 April 2017, 10:02:30 UTC | git: make GitPerm an IntEnum rather than bytes Fix T685 out of spite. While we wait for a cleaner refactoring of this code, this fixes the immediate clogging of the database with bogus data issue. | 07 April 2017, 10:02:30 UTC |
4d6d748 | Antoine R. Dumont (@ardumont) | 24 March 2017, 14:17:13 UTC | d/changelog: Fix sbuild warning | 24 March 2017, 14:17:13 UTC |
9812285 | Antoine R. Dumont (@ardumont) | 24 March 2017, 13:08:29 UTC | swh.model.hashutil: Add blake2s256 in default algorithms Related T703 | 24 March 2017, 14:16:36 UTC |
a42c75e | Antoine R. Dumont (@ardumont) | 16 March 2017, 17:10:18 UTC | swh.model.hashutil: Use pyblake2 dependency on python3 <= 3.4 This resolves the caveat mentioned in prior commit about not being able to use blake2 prior to 3.5 Related T692 Closes D192 | 21 March 2017, 09:35:45 UTC |
24f8dd4 | Antoine R. Dumont (@ardumont) | 16 March 2017, 11:19:49 UTC | swh.model.hashutil: Adapt according to latest discussion - Add module docstring - Add blake2s256 and blake2b512 in supported algorithms - Spawn a new variable DEFAULT_ALGORITHMS as default computed algorithms for the main functions Related T692 | 17 March 2017, 08:43:15 UTC |
f75be5a | Antoine R. Dumont (@ardumont) | 15 March 2017, 09:02:28 UTC | swh.model.hashutil: Make unknown variable length algo creation break Remove the limit on the python3 version, this should be transparent. If the hash requested is not available, this will raise with an explanation on the error. Related T692 | 17 March 2017, 08:42:32 UTC |
8776435 | Antoine R. Dumont (@ardumont) | 14 March 2017, 14:51:56 UTC | swh.model.hashutil: Simplify length hash algorithms instantiation The same caveat applies, will only be supported from python3.6 onward. Related T692 | 17 March 2017, 08:42:31 UTC |
9c25f8f | Antoine R. Dumont (@ardumont) | 14 March 2017, 14:15:06 UTC | swh.model.hashutil: Open variable length hash algorithm support The caveat is that it will only be supported when we will be using python3 >= 3.5. Related T692 | 17 March 2017, 08:42:15 UTC |
3e325ca | Antoine R. Dumont (@ardumont) | 15 March 2017, 14:31:54 UTC | Migrate from swh.core.hashutil to swh.model.hashutil Related T700 | 15 March 2017, 15:00:44 UTC |
b0f7f06 | Antoine R. Dumont (@ardumont) | 24 February 2017, 07:28:37 UTC | Update docstring to clarify the ambiguity around symlinks | 24 February 2017, 07:29:38 UTC |
c40ab03 | Antoine R. Dumont (@ardumont) | 23 February 2017, 13:20:43 UTC | Consider special files as empty ones when computing content hashes Closes T255 Ref. D179 | 23 February 2017, 14:35:02 UTC |
e0dbae3 | Nicolas Dandrimont | 15 February 2017, 16:45:59 UTC | identifiers: properly escape newlines in author specifications Found by investigating T75 | 15 February 2017, 16:46:22 UTC |
58c5a24 | Nicolas Dandrimont | 14 February 2017, 17:34:48 UTC | git: don't use double underscores for function names | 14 February 2017, 17:35:14 UTC |
7912710 | Nicolas Dandrimont | 14 February 2017, 17:33:14 UTC | identifiers: force timestamps as integers everywhere The subversion loader (T680) has shown that throwing floating point values around for timestamps is a mess waiting to happen. We now coerce all clients to send us timestamps as integer numbers of seconds and microseconds, avoiding data losses everywhere. | 14 February 2017, 17:35:14 UTC |
87444d4 | Antoine Pietri | 09 February 2017, 11:12:08 UTC | requirements: split internal and external requirements in two separate files | 09 February 2017, 13:32:05 UTC |
2594832 | Antoine R. Dumont (@ardumont) | 23 June 2016, 10:30:35 UTC | Fix: echo -n to avoid adding an extra line | 23 June 2016, 10:30:35 UTC |
5c0be62 | Antoine R. Dumont (@ardumont) | 23 June 2016, 09:33:21 UTC | Open tools to check rev hash | 23 June 2016, 09:33:21 UTC |
cec445d | Nicolas Dandrimont | 14 June 2016, 15:00:27 UTC | d/rules: move to build_dir before tests | 14 June 2016, 15:00:27 UTC |
db20b20 | Antoine R. Dumont (@ardumont) | 13 June 2016, 14:23:36 UTC | Remove dead comment | 13 June 2016, 14:23:36 UTC |
b3c17c7 | Antoine R. Dumont (@ardumont) | 13 June 2016, 14:22:35 UTC | Remove print statement | 13 June 2016, 14:22:35 UTC |
5f7c931 | Antoine R. Dumont (@ardumont) | 12 June 2016, 13:34:07 UTC | Add tests on git.compute_hashes_from_directory - default - ignoring empty folders - ignore folder based on pattern in names | 12 June 2016, 14:08:54 UTC |
aa06697 | Antoine R. Dumont (@ardumont) | 12 June 2016, 13:33:34 UTC | Fix hash typos + remove print statement | 12 June 2016, 13:33:34 UTC |
843d814 | Antoine R. Dumont (@ardumont) | 12 June 2016, 09:29:05 UTC | Add missing tests on new api | 12 June 2016, 09:29:05 UTC |
1a2b969 | Antoine R. Dumont (@ardumont) | 11 June 2016, 00:20:22 UTC | Open children_hashes api function | 11 June 2016, 00:20:22 UTC |
05ac3c4 | Antoine R. Dumont (@ardumont) | 10 June 2016, 23:58:18 UTC | Remove dead code | 10 June 2016, 23:58:18 UTC |
8d2bf5a | Antoine R. Dumont (@ardumont) | 10 June 2016, 23:54:08 UTC | Rename walk_and_compute_sha1_from_directory_2 to compute_hashes_from_directory | 10 June 2016, 23:54:08 UTC |
87fcced | Antoine R. Dumont (@ardumont) | 08 June 2016, 13:09:54 UTC | Add objects_per_type api This permits reusing the same logic for different clients (loader-dir, loader-tar, loader-svn) (Tests were lost) | 08 June 2016, 13:43:52 UTC |
17f0493 | Antoine R. Dumont (@ardumont) | 08 June 2016, 13:09:18 UTC | Open a new walk_and_compute_sha1_from_directory_2 api This actually is supposed to replace walk_and_compute_sha1_from_directory. The data structure used here is better at handling updates. (Code that actually got lost and rewritten - Tests are definitely lost though) | 08 June 2016, 13:09:18 UTC |
1af7aed | Antoine R. Dumont (@ardumont) | 08 June 2016, 13:08:08 UTC | Improve internal api regarding directory and tree hash computations Keep the old api (since i don't measure the impacts on other modules yet). + Improve docstring (Code that actually got lost and rewritten) | 08 June 2016, 13:08:08 UTC |
9b9ec94 | Antoine R. Dumont (@ardumont) | 26 May 2016, 10:33:12 UTC | Optimize walk for edge cases | 26 May 2016, 10:56:11 UTC |
a91bf69 | Antoine R. Dumont (@ardumont) | 26 May 2016, 09:57:58 UTC | Add tests about new use cases Combination of: - validation on files - ignore empty folder | 26 May 2016, 09:57:58 UTC |
22b9fca | Antoine R. Dumont (@ardumont) | 25 May 2016, 21:33:14 UTC | Try and detect the next existing parent to lookup from In some corner case, the changed paths can reference a previous ignored folder (thus not existing in the data structure) | 25 May 2016, 21:36:54 UTC |
1f98c67 | Antoine R. Dumont (@ardumont) | 25 May 2016, 21:26:57 UTC | Add optional clean up round-trip to remove empty folders | 25 May 2016, 21:26:57 UTC |
ca235a0 | Antoine R. Dumont (@ardumont) | 24 May 2016, 14:51:37 UTC | d/control: Ignore filesystem tests | 24 May 2016, 15:00:54 UTC |
aae146d | Antoine R. Dumont (@ardumont) | 23 May 2016, 13:28:48 UTC | swh.model.git - update - Deal with edge case about empty folder The empty folder was not previously in the objects structure. So we need to add it as child of its parent for the update. | 24 May 2016, 11:48:13 UTC |
dca0eaf | Antoine R. Dumont (@ardumont) | 23 May 2016, 13:27:07 UTC | swh.model.git - update - Secure paths removal | 23 May 2016, 13:27:44 UTC |
0fbf74e | Nicolas Dandrimont | 08 April 2016, 11:53:31 UTC | identifiers: support authors with only a Full Name field | 08 April 2016, 11:53:31 UTC |
16155c4 | Antoine R. Dumont (@ardumont) | 05 April 2016, 15:08:23 UTC | Fix some edge case on git hash update computation Enforce convention on directory name without trailing /. At the moment, the `git.walk_and_compute_sha1_from_directory` injected the rootdir with a possible trailing / (input from client). | 05 April 2016, 15:14:39 UTC |
3f63877 | Antoine R. Dumont (@ardumont) | 05 April 2016, 12:09:34 UTC | Add real use cases for the git computation update tests | 05 April 2016, 12:09:50 UTC |
d5d5bee | Antoine R. Dumont (@ardumont) | 05 April 2016, 12:08:48 UTC | Improve docstrings | 05 April 2016, 12:08:54 UTC |
0fc6af8 | Antoine R. Dumont (@ardumont) | 02 April 2016, 15:32:06 UTC | Add the length to the data returned Since we compute it anyway, better return it along with the result | 02 April 2016, 15:32:06 UTC |
97b0d9f | Antoine R. Dumont (@ardumont) | 01 April 2016, 12:53:46 UTC | Improve git hash update behavior Decrease the number of paths to compute to 1 common ancestor (if any): - Scan only that directory and rehash with new results (data changed) - Update the resulting objects with those new hashes. - Update from that directory to the rootdir the existing hashes computation | 01 April 2016, 15:26:22 UTC |
cd88163 | Antoine R. Dumont (@ardumont) | 01 April 2016, 09:45:07 UTC | Detect if we need to recompute all from disk anyway (change at the root level for example) | 01 April 2016, 09:56:00 UTC |
02e2357 | Antoine R. Dumont (@ardumont) | 01 April 2016, 07:30:54 UTC | Only compute root_tree_key's directory hash when needed | 01 April 2016, 07:32:15 UTC |
cdf2b70 | Antoine R. Dumont (@ardumont) | 01 April 2016, 07:30:21 UTC | Refactor - Improve test git class definition | 01 April 2016, 07:30:35 UTC |
97be2fd | Antoine R. Dumont (@ardumont) | 31 March 2016, 18:41:14 UTC | Fix: Delete paths below the path removal deletion | 31 March 2016, 18:41:14 UTC |
eb99dbf | Antoine R. Dumont (@ardumont) | 31 March 2016, 14:39:50 UTC | Update git hash computation on changed paths only | 31 March 2016, 17:23:12 UTC |
3cebfce | Antoine R. Dumont (@ardumont) | 31 March 2016, 11:18:52 UTC | Clean up after test | 31 March 2016, 11:18:52 UTC |
4bb18b2 | Antoine R. Dumont (@ardumont) | 31 March 2016, 11:18:39 UTC | Module import order | 31 March 2016, 11:18:39 UTC |
604f4f0 | Antoine R. Dumont (@ardumont) | 31 March 2016, 08:27:11 UTC | Update docstring on swh.model.git module | 31 March 2016, 08:27:56 UTC |
18086c1 | Nicolas Dandrimont | 30 March 2016, 15:59:07 UTC | identifier: Don't break on None metadata for revisions | 30 March 2016, 15:59:07 UTC |
4b13d16 | Nicolas Dandrimont | 30 March 2016, 13:04:28 UTC | test_identifiers: add test for negative UTC | 30 March 2016, 13:04:28 UTC |
a09e9b4 | Nicolas Dandrimont | 30 March 2016, 12:03:04 UTC | identifiers: proper support for negative utc offsets Move timestamp normalization to another function to make it more easily movable. | 30 March 2016, 12:04:31 UTC |
39c61bb | Nicolas Dandrimont | 29 March 2016, 16:07:04 UTC | test_identifiers: add tests for empty vs. null messages | 29 March 2016, 16:07:04 UTC |
e56b58a | Nicolas Dandrimont | 29 March 2016, 15:53:30 UTC | test_identifiers: this gpg signature is not from Linus | 29 March 2016, 15:53:30 UTC |
faf0840 | Nicolas Dandrimont | 29 March 2016, 15:37:06 UTC | identifiers: support None messages in revisions and releases | 29 March 2016, 15:51:54 UTC |
f62bc76 | Nicolas Dandrimont | 29 March 2016, 13:07:44 UTC | identifiers: enhance documentation of the revision_identifier function This function wasn't in sync with what's supposed to be our revision schema | 29 March 2016, 15:35:47 UTC |
dd1c4ba | Nicolas Dandrimont | 29 March 2016, 13:06:29 UTC | test_identifiers: proper revision w/gpgsig test naming and refactor | 29 March 2016, 13:06:29 UTC |
f7bc587 | Nicolas Dandrimont | 29 March 2016, 13:02:10 UTC | identifiers: import symbols from hashutil directly | 29 March 2016, 13:02:10 UTC |
c3d9439 | Antoine R. Dumont (@ardumont) | 24 March 2016, 09:36:21 UTC | 'metadata' entry is expected to be json serializable so no bytes, and we enforce during the checksum computation function | 24 March 2016, 09:37:00 UTC |
aca1e40 | Antoine R. Dumont (@ardumont) | 22 March 2016, 17:58:45 UTC | Use of optional gpgsig in git commit sha1 computation | 22 March 2016, 17:58:45 UTC |
020a555 | Antoine R. Dumont (@ardumont) | 22 March 2016, 16:18:06 UTC | Respect initial key convention (_ delimited) | 22 March 2016, 16:18:06 UTC |
963e393 | Antoine R. Dumont (@ardumont) | 22 March 2016, 16:17:19 UTC | Let the key/value of extra-headers be encoded when needed | 22 March 2016, 16:17:19 UTC |
fbd4e67 | Antoine R. Dumont (@ardumont) | 22 March 2016, 14:25:15 UTC | Use of optional extra-headers in git commit sha1 computation | 22 March 2016, 14:32:15 UTC |
bd5494f | Antoine R. Dumont (@ardumont) | 22 March 2016, 14:21:48 UTC | Add README-dev.md which makes the sha1 computations explicit | 22 March 2016, 14:21:48 UTC |
d84035a | Antoine R. Dumont (@ardumont) | 22 March 2016, 10:01:05 UTC | Fix exception message typos | 22 March 2016, 10:01:05 UTC |
696d23e | Antoine R. Dumont (@ardumont) | 21 March 2016, 13:44:40 UTC | Override locally the default flags | 21 March 2016, 14:17:44 UTC |
5ca1bda | Antoine R. Dumont (@ardumont) | 21 March 2016, 11:09:16 UTC | Allow filtering unwanted directory when computing git hash | 21 March 2016, 14:17:25 UTC |
2a0107e | Antoine R. Dumont (@ardumont) | 21 March 2016, 11:04:53 UTC | Move from swh.loader.dir.git to swh.model.git | 21 March 2016, 11:04:53 UTC |
100d537 | Antoine R. Dumont (@ardumont) | 27 January 2016, 13:18:37 UTC | Release name is now in bytes T270 | 27 January 2016, 13:18:37 UTC |
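
Several of the commits above concern content hashing: git-style hashes, adding blake2s256 to the default algorithms, and optionally returning the data length with the result. The sketch below illustrates that idea using only the Python standard library. It is not the code from `swh.model.hashutil`: the function name `sketch_hash_data`, its `with_length` parameter, the dictionary keys, and the hexadecimal output are assumptions made for illustration. It also assumes that the git-style content hash uses git's blob convention, a SHA-1 over a `blob <length>\0` header followed by the data.

```python
import hashlib


def sketch_hash_data(data: bytes, with_length: bool = False) -> dict:
    """Hypothetical helper: compute several checksums of a byte string.

    This is an illustration, not the swh.model.hashutil API.
    """
    # Assumption: git-style blob hashing prefixes the data with a header
    # of the form b"blob <length>\0" before applying SHA-1.
    git_header = b"blob %d\x00" % len(data)

    hashes = {
        "sha1": hashlib.sha1(data).hexdigest(),
        "sha1_git": hashlib.sha1(git_header + data).hexdigest(),
        "sha256": hashlib.sha256(data).hexdigest(),
        # hashlib.blake2s is available in the standard library from Python 3.6.
        "blake2s256": hashlib.blake2s(data, digest_size=32).hexdigest(),
    }
    if with_length:
        # The length is already known, so returning it costs nothing extra.
        hashes["length"] = len(data)
    return hashes


if __name__ == "__main__":
    for name, value in sketch_hash_data(b"hello\n", with_length=True).items():
        print(name, value)
```

Under the blob-header assumption, the `sha1_git` value for a byte string should match what `git hash-object` reports for a file with the same contents.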