12770e2 | Antoine R. Dumont (@ardumont) | 24 November 2016, 13:53:50 UTC | Update indexer configuration data Related T574 Related T610 | 02 December 2016, 12:31:18 UTC |
0ccb413 | Antoine R. Dumont (@ardumont) | 01 December 2016, 09:25:41 UTC | storage: Add missing function in swh-func.sql Since 096 migration update. | 01 December 2016, 09:30:15 UTC |
d11f4e4 | Antoine R. Dumont (@ardumont) | 01 December 2016, 09:21:25 UTC | doc-sql: Add subgraph for content_indexer tables | 01 December 2016, 09:30:07 UTC |
d3ba860 | Antoine R. Dumont (@ardumont) | 01 December 2016, 09:20:39 UTC | doc-sql: Fix arrow from revision(directory) to directory(id) | 01 December 2016, 09:20:39 UTC |
ebcfd49 | Antoine R. Dumont (@ardumont) | 30 November 2016, 13:54:20 UTC | storage: Actually use the index for searching expression | 30 November 2016, 13:54:20 UTC |
ba29d18 | Antoine R. Dumont (@ardumont) | 29 November 2016, 16:20:37 UTC | Add index on ctags' name column | 29 November 2016, 16:20:37 UTC |
641ad5c | Antoine R. Dumont (@ardumont) | 29 November 2016, 16:03:25 UTC | storage: Use strict equality on ctags search | 29 November 2016, 16:03:25 UTC |
fb3722d | Antoine R. Dumont (@ardumont) | 28 November 2016, 15:07:05 UTC | storage: Fix edge case when searching symbols When the query is syntactically wrong, before that commit, we broke the server. Now it raises a bad request (400) error. | 28 November 2016, 15:07:05 UTC |
7f27e14 | Antoine R. Dumont (@ardumont) | 24 November 2016, 10:11:19 UTC | Add pagination to content_ctags_search api endpoint Related T605 | 24 November 2016, 10:11:19 UTC |
1fc21e6 | Antoine R. Dumont (@ardumont) | 23 November 2016, 16:30:54 UTC | storage: Open content_ctags_search for full-text search Related T605 | 23 November 2016, 16:30:54 UTC |
c690359 | Antoine R. Dumont (@ardumont) | 23 November 2016, 15:32:23 UTC | storage: Add fulltext search function on ctags Related T605 | 23 November 2016, 16:12:56 UTC |
3dafd17 | Antoine R. Dumont (@ardumont) | 22 November 2016, 15:55:17 UTC | storage: Fix error in function which reads licenses Related T602 | 22 November 2016, 15:55:17 UTC |
f5ece61 | Antoine R. Dumont (@ardumont) | 18 November 2016, 13:54:43 UTC | storage: Add indexer_configuration table json schema Related T596 | 18 November 2016, 13:55:18 UTC |
f454e44 | Antoine R. Dumont (@ardumont) | 18 November 2016, 12:24:07 UTC | storage: Update recognized fossology licenses Related T596 | 18 November 2016, 12:25:19 UTC |
bc7f776 | Antoine R. Dumont (@ardumont) | 15 November 2016, 17:04:10 UTC | storage: Fix divergent schema upgrade | 15 November 2016, 17:04:10 UTC |
99b09d4 | Antoine R. Dumont (@ardumont) | 15 November 2016, 17:04:01 UTC | Fix pep8 violation | 15 November 2016, 17:04:01 UTC |
3a4616c | Antoine R. Dumont (@ardumont) | 10 November 2016, 16:18:30 UTC | storage: ctags - Align conflict update policy with license endpoints When an update is requested, we first delete all ctags symbols for the impacted contents, then add the ctags information. Otherwise, we simply add new entries and, in case of conflict, do nothing. | 10 November 2016, 16:25:48 UTC |
9a079c5 | Antoine R. Dumont (@ardumont) | 10 November 2016, 15:54:01 UTC | storage: Update fossology_license to latest design Related T596 | 10 November 2016, 16:17:54 UTC |
7528033 | Antoine R. Dumont (@ardumont) | 09 November 2016, 15:58:10 UTC | Update known licenses from fossology's master branch Related T596 Related 09923374e0f321da78faa0b37b2814fea9c5f1c1 | 10 November 2016, 09:44:16 UTC |
3dd9b0f | Antoine R. Dumont (@ardumont) | 09 November 2016, 15:45:12 UTC | storage: Return unknown licenses Related T596 | 09 November 2016, 15:46:47 UTC |
a63cbc7 | Antoine R. Dumont (@ardumont) | 08 November 2016, 16:08:04 UTC | storage: Open content_license endpoint (add/get) Related T596 | 09 November 2016, 11:30:39 UTC |
2fffbd4 | Antoine R. Dumont (@ardumont) | 08 November 2016, 11:42:22 UTC | storage: Add license and content_license tables Related T596 | 08 November 2016, 14:06:12 UTC |
04f2b2d | Antoine R. Dumont (@ardumont) | 08 November 2016, 11:06:28 UTC | storage: Add comments on enum | 08 November 2016, 14:06:12 UTC |
022e985 | Antoine R. Dumont (@ardumont) | 08 November 2016, 11:04:18 UTC | storage: Move enums to new swh-enums.sql namespace | 08 November 2016, 14:06:11 UTC |
51c4896 | Nicolas Dandrimont | 03 November 2016, 14:39:38 UTC | storage: add check_config method The check_config method allows a dynamic check of the configuration for a running storage. We can make sure that we have proper permissions on the object storage as well as the database before running things. | 03 November 2016, 14:39:38 UTC |
5e8bba5 | Antoine R. Dumont (@ardumont) | 20 October 2016, 13:53:19 UTC | storage: Improve index on content_ctags Work on the suggestion message from postgresql psycopg2.OperationalError: index row size 3992 exceeds maximum 2712 for index "content_ctags_id_name_kind_line_lang_idx" HINT: Values larger than 1/3 of a buffer page cannot be indexed. Consider a function index of an MD5 hash of the value, or use full text indexing. Related T589 | 20 October 2016, 13:55:44 UTC |
33043b1 | Antoine R. Dumont (@ardumont) | 20 October 2016, 12:41:22 UTC | storage: ctags - Improve schema Related T589 | 20 October 2016, 13:25:28 UTC |
a74a141 | Antoine R. Dumont (@ardumont) | 19 October 2016, 16:29:43 UTC | storage: Open ctags entry points (missing, add, get) Related T589 | 19 October 2016, 16:33:19 UTC |
71b4a88 | Antoine R. Dumont (@ardumont) | 19 October 2016, 16:28:48 UTC | Remove noisy test attribute 'one' | 19 October 2016, 16:33:18 UTC |
4bd537f | Nicolas Dandrimont | 19 October 2016, 14:47:10 UTC | storage: allow adding several origins at once | 19 October 2016, 14:49:08 UTC |
141afef | Nicolas Dandrimont | 19 October 2016, 14:42:19 UTC | common: allow passing in the cursor for the transaction decorators | 19 October 2016, 14:42:19 UTC |
f7becde | Antoine R. Dumont (@ardumont) | 11 October 2016, 16:13:45 UTC | Add the means to pipe contents to another queue once copied Related T575 | 13 October 2016, 13:24:00 UTC |
d2eb077 | Antoine R. Dumont (@ardumont) | 13 October 2016, 10:20:24 UTC | indexer: Unify function names according to conventions | 13 October 2016, 12:18:21 UTC |
97f610a | Antoine R. Dumont (@ardumont) | 13 October 2016, 09:28:07 UTC | Add tests around the content_{mimetype/language}_add endpoints Related T582 | 13 October 2016, 12:18:13 UTC |
1373667 | Antoine R. Dumont (@ardumont) | 13 October 2016, 09:18:40 UTC | indexer: Open mimetype/language get endpoints | 13 October 2016, 12:18:12 UTC |
54efa89 | Antoine R. Dumont (@ardumont) | 12 October 2016, 16:50:36 UTC | indexer: open drop/skip policy update on duplicates (language/mimetype) This adds the optional conflict_update parameter, which specifies what to do when a conflict on sha1 occurs. conflict_update is false by default, which ignores duplicates. When conflict_update is true, existing data is overwritten. Related T582 | 13 October 2016, 12:18:12 UTC |
dddbc4c | Antoine R. Dumont (@ardumont) | 13 October 2016, 08:23:16 UTC | Fix: Remove nose test attribute 'one' | 13 October 2016, 12:08:00 UTC |
3fcc628 | Antoine R. Dumont (@ardumont) | 12 October 2016, 00:23:25 UTC | Fix provenance storage init function | 12 October 2016, 00:23:25 UTC |
2fd7f72 | Antoine R. Dumont (@ardumont) | 11 October 2016, 23:33:57 UTC | provenance: Rework configuration setup | 11 October 2016, 23:33:57 UTC |
30f7883 | Antoine R. Dumont (@ardumont) | 07 October 2016, 17:08:34 UTC | Open language_mimetype_{missing,add} endpoints Related T578 | 07 October 2016, 18:30:55 UTC |
859860c | Antoine R. Dumont (@ardumont) | 07 October 2016, 16:53:38 UTC | sql/schema: Add content_language table Related T578 | 07 October 2016, 16:53:38 UTC |
fd717f3 | Antoine R. Dumont (@ardumont) | 07 October 2016, 12:36:56 UTC | Open content_mimetype_add endpoint to add missing mimetypes Related T577 | 07 October 2016, 15:08:05 UTC |
a77c187 | Antoine R. Dumont (@ardumont) | 07 October 2016, 12:36:20 UTC | Open content_mimetype_missing endpoint to list missing mimetypes Related T577 | 07 October 2016, 15:08:05 UTC |
5e9244c | Antoine R. Dumont (@ardumont) | 07 October 2016, 09:38:40 UTC | sql/schema: Add content_mimetype table Towards starting computing information on contents Related T577 | 07 October 2016, 15:08:04 UTC |
7add2cd | Stefano Zacchiroli | 07 October 2016, 14:53:49 UTC | DB schema graph: add new "provenance" cluster it includes the cache_* tables that are currently being populated | 07 October 2016, 14:53:49 UTC |
0f29092 | Stefano Zacchiroli | 07 October 2016, 14:53:14 UTC | DB schema graph: add stray origin_visit table | 07 October 2016, 14:53:14 UTC |
a1aa8be | Antoine R. Dumont (@ardumont) | 29 September 2016, 17:27:34 UTC | Align implementation with docstring's contract | 29 September 2016, 18:31:23 UTC |
6c505cc | Antoine R. Dumont (@ardumont) | 29 September 2016, 16:57:40 UTC | Fix: Missing incremented version 5 for archiver.dbversion | 29 September 2016, 16:57:40 UTC |
1afea82 | Antoine R. Dumont (@ardumont) | 29 September 2016, 14:55:44 UTC | Retrieve information on a content cached | 29 September 2016, 16:45:10 UTC |
a43b962 | Antoine R. Dumont (@ardumont) | 29 September 2016, 14:55:06 UTC | Rename to swh_cache_content_get_all | 29 September 2016, 14:55:06 UTC |
f12d9ef | Antoine R. Dumont (@ardumont) | 28 September 2016, 08:20:27 UTC | Fix copyright range | 29 September 2016, 12:42:57 UTC |
1b4aa4f | Antoine R. Dumont (@ardumont) | 25 September 2016, 09:59:39 UTC | archiver: Remove print statement | 29 September 2016, 12:42:57 UTC |
4b5287e | Nicolas Dandrimont | 23 September 2016, 11:39:43 UTC | upgrades/085: add upgrade script | 23 September 2016, 11:39:43 UTC |
005710e | Nicolas Dandrimont | 23 September 2016, 11:38:11 UTC | sql/swh-func: content cache populates lines in deterministic order This should reduce lock contention when parallelizing the operation | 23 September 2016, 11:38:11 UTC |
4d6d3bd | Antoine R. Dumont (@ardumont) | 23 September 2016, 10:16:25 UTC | archiver: Pass the destination as parameter of the worker to backend | 23 September 2016, 10:28:32 UTC |
394bb4d | Antoine R. Dumont (@ardumont) | 23 September 2016, 10:07:19 UTC | archiver: Add missing property for worker to backend | 23 September 2016, 10:28:32 UTC |
718dda6 | Antoine R. Dumont (@ardumont) | 23 September 2016, 10:06:55 UTC | archiver: Complete docstring's information | 23 September 2016, 10:28:32 UTC |
f29c207 | Antoine R. Dumont (@ardumont) | 23 September 2016, 10:01:41 UTC | archiver: Simplify update on content | 23 September 2016, 10:28:32 UTC |
a67aa26 | Antoine R. Dumont (@ardumont) | 23 September 2016, 09:50:53 UTC | archiver: Improve 'unknown sha1' and 'force copy' policies The 'unknown sha1 path' cannot happen in the default archiver since it reads from the archive db (so the fallback code is not necessary in the worker). To the contrary, since 'archiver to backend' reads from stdin (for now), we could have unregistered sha1s from that source. This commit makes the director deal with that before sending sha1 to workers. It's also the director's job to set the state to 'missing' when the force_copy is true before sending sha1 to worker. | 23 September 2016, 10:28:32 UTC |
9b04941 | Antoine R. Dumont (@ardumont) | 23 September 2016, 09:49:44 UTC | archiver: Fix random.choice input to a list | 23 September 2016, 10:28:31 UTC |
7332c31 | Antoine R. Dumont (@ardumont) | 23 September 2016, 09:47:02 UTC | sql/archiver/schema: Filter unknown sha1s from content_archive endpoint | 23 September 2016, 10:28:31 UTC |
de67eb7 | Nicolas Dandrimont | 22 September 2016, 18:37:57 UTC | provenance: fix typo: we have hex in the message, not hashes | 22 September 2016, 18:37:57 UTC |
ff87ac5 | Nicolas Dandrimont | 22 September 2016, 16:49:20 UTC | swh-func: content-revision cache population now takes a list of revs | 22 September 2016, 16:51:14 UTC |
30f5645 | Nicolas Dandrimont | 22 September 2016, 12:35:50 UTC | swh-func: less churn in the cache_content_revision table | 22 September 2016, 12:42:43 UTC |
4c3623c | Antoine R. Dumont (@ardumont) | 22 September 2016, 11:42:19 UTC | Archiver: Fix to copy only to targeted destination Before that, it could for example push copies to other mirrors where the content was missing. | 22 September 2016, 11:43:45 UTC |
cdf11d5 | Antoine R. Dumont (@ardumont) | 22 September 2016, 10:25:06 UTC | d/control: Bump dependency version to latest python3-swh.core | 22 September 2016, 10:37:43 UTC |
57053fe | Antoine R. Dumont (@ardumont) | 22 September 2016, 10:31:46 UTC | Refactor: Align source/destination configuration property names | 22 September 2016, 10:37:43 UTC |
f163c2a | Antoine R. Dumont (@ardumont) | 22 September 2016, 10:24:15 UTC | Handle copies of not registered contents in archiver db Closes T569 | 22 September 2016, 10:37:42 UTC |
df2e00a | Antoine R. Dumont (@ardumont) | 21 September 2016, 07:59:18 UTC | Refactor logging warning/critical message | 21 September 2016, 16:18:11 UTC |
88e0a05 | Antoine R. Dumont (@ardumont) | 21 September 2016, 14:42:02 UTC | Improve on cooking code and docstrings - Fix docstring typos - Some function calls were not renamed. | 21 September 2016, 16:14:44 UTC |
0879adf | Quentin Campos | 21 September 2016, 14:20:18 UTC | Refactor the vault cooker to add new bundle types Summary: Make some updates to the vault in order to prepare the next arrival of the revision cooker. Reviewers: #reviewers, ardumont Reviewed By: #reviewers, ardumont Subscribers: ardumont Maniphest Tasks: T531 Differential Revision: https://forge.softwareheritage.org/D115 Closes D115 | 21 September 2016, 16:14:44 UTC |
6900685 | Antoine R. Dumont (@ardumont) | 20 September 2016, 18:10:16 UTC | Be defensive against potential not found content | 20 September 2016, 18:10:16 UTC |
189e9c1 | Antoine R. Dumont (@ardumont) | 20 September 2016, 14:37:38 UTC | d/control: Fix typo in version | 20 September 2016, 14:37:38 UTC |
9ad3aef | Antoine R. Dumont (@ardumont) | 20 September 2016, 14:35:34 UTC | d/control: Bump dependency version to latest python3-swh.objstorage | 20 September 2016, 14:35:34 UTC |
0476b9c | Antoine R. Dumont (@ardumont) | 20 September 2016, 14:23:45 UTC | Fix objstorage instantiation in tests | 20 September 2016, 14:31:23 UTC |
2b18a7a | Antoine R. Dumont (@ardumont) | 20 September 2016, 10:35:10 UTC | Remove optional dependency on swh.objstorage.cloud If you want the archiver to have cloud abilities, install the package python3-swh.objstorage.cloud on the server as well. | 20 September 2016, 14:31:23 UTC |
f7665b3 | Nicolas Dandrimont | 19 September 2016, 12:21:38 UTC | sql/swh-schema: content->revision cache only has one line per content | 19 September 2016, 12:31:03 UTC |
941c52c | Antoine R. Dumont (@ardumont) | 17 September 2016, 09:58:55 UTC | Unify configuration property between director/worker | 17 September 2016, 10:47:53 UTC |
76d13b0 | Antoine R. Dumont (@ardumont) | 17 September 2016, 08:38:02 UTC | Deal with potential missing contents in the archiver db Logging an entry about it | 17 September 2016, 10:47:50 UTC |
5281ad1 | Antoine R. Dumont (@ardumont) | 17 September 2016, 08:14:07 UTC | Improve get_contents_error implementation - Only read the storage key once. - Improve the logging error. | 17 September 2016, 10:47:33 UTC |
d26477a | Antoine R. Dumont (@ardumont) | 17 September 2016, 10:28:47 UTC | Remove dead code already moved in archiver/db | 17 September 2016, 10:47:32 UTC |
80b21aa | Antoine R. Dumont (@ardumont) | 16 September 2016, 20:10:46 UTC | Adapt archiver director to read sha1 from stdin Also adds a force_copy flag in the configuration file to skip checking whether the sha1 already exists. This makes the first-time copy to a new backend more efficient. | 16 September 2016, 20:14:38 UTC |
ee4ecd7 | Antoine R. Dumont (@ardumont) | 15 September 2016, 15:01:52 UTC | archiver: Unify configuration file between director/worker The initial director and worker had split configuration files. Now it's unified to be both archiver/worker.yml file | 16 September 2016, 20:12:53 UTC |
5db327a | Antoine R. Dumont (@ardumont) | 15 September 2016, 14:24:53 UTC | Archiver: Adapt ArchiverToBackendDirector to latest storage api | 15 September 2016, 14:30:03 UTC |
57ee3b6 | Antoine R. Dumont (@ardumont) | 15 September 2016, 14:10:03 UTC | content_archive_get: api entry point to list cache contents | 15 September 2016, 14:12:59 UTC |
7d0b963 | Antoine R. Dumont (@ardumont) | 15 September 2016, 11:24:36 UTC | archiver: Add missing instruction about 003 upgrade | 15 September 2016, 13:07:10 UTC |
1228c3f | Antoine R. Dumont (@ardumont) | 15 September 2016, 09:06:04 UTC | Remove print statement | 15 September 2016, 09:06:04 UTC |
315a8eb | Antoine R. Dumont (@ardumont) | 15 September 2016, 08:54:42 UTC | cache_content_get: Fix broken test Related ae99623a1eb7e959944f420c9418ad519ce5bc6e | 15 September 2016, 08:55:50 UTC |
ce2f4e6 | Antoine R. Dumont (@ardumont) | 15 September 2016, 08:21:40 UTC | Improve choose backup contents to use multiple sources | 15 September 2016, 08:21:40 UTC |
02e1f1e | Antoine R. Dumont (@ardumont) | 14 September 2016, 18:18:26 UTC | Add logging ability to copier | 14 September 2016, 18:18:26 UTC |
ae99623 | Antoine R. Dumont (@ardumont) | 14 September 2016, 16:55:56 UTC | Archiver: Filter missing contents before archival | 14 September 2016, 17:48:25 UTC |
791444a | Antoine R. Dumont (@ardumont) | 14 September 2016, 10:01:06 UTC | archive - sql/upgrades/004: Insert new archive id + clean up Drop unused archive.url column. Simplify associated tests setup on the archiver with retention policy. | 14 September 2016, 10:06:21 UTC |
d0cbf0a | Antoine R. Dumont (@ardumont) | 12 September 2016, 16:26:49 UTC | Archiver Director/Worker: Add copy to backend worker implementation Actions storage: - Open cache_content_get to retrieve contents in cache - sql/upgrades/080: Add stored procedure to read contents from cache Actions archiver: - d/control: Add dependency to archiver on python3-swh.objstorage.cloud - Renamed Archiver(Director|Worker) to ArchiverWithRetentionPolicy(Director|Worker) - Add ArchiverToBackend(Director|Worker) - Add new celery task dedicated for new workers - Update docstring details Related T555 | 14 September 2016, 09:31:56 UTC |
a71109b | Quentin Campos | 24 August 2016, 10:50:24 UTC | Http API to access the SWH vault Summary: This API currently only concerns directories, as it uses the first draft of the cooker. Ref T532 Depends on D102 Reviewers: #reviewers! Maniphest Tasks: T532 Differential Revision: https://forge.softwareheritage.org/D108 | 12 September 2016, 12:23:38 UTC |
b11cfe3 | Quentin Campos | 19 August 2016, 13:05:20 UTC | First version of the directory cooker & cache Summary: This first version does create a compressed folder of an archive directory but is not linked to any API or notification system. This diff is submitted for architecture and code review and will evolve. Reviewers: #reviewers! Differential Revision: https://forge.softwareheritage.org/D102 | 12 September 2016, 10:51:20 UTC |
1c90cd4 | Antoine R. Dumont (@ardumont) | 08 September 2016, 12:13:08 UTC | Refactor: Rename adequately swh.storage.db.Db.origin_visit functions In swh.storage.db: - origin_visit_get -> origin_visit_get_all - origin_visit_get_by -> occurrence_by_origin_visit - origin_visit_info -> origin_visit_get | 08 September 2016, 12:13:08 UTC |
598f38c | Antoine R. Dumont (@ardumont) | 08 September 2016, 11:43:37 UTC | origin_visit_get_by: Fix origin_visit output data + format change This endpoint was wrongly returning only 1 result. | 08 September 2016, 11:49:04 UTC |
8c10704 | Antoine R. Dumont (@ardumont) | 06 September 2016, 12:25:18 UTC | Fix typo and remove unused parameter | 06 September 2016, 12:27:27 UTC |
6feddb3 | Antoine R. Dumont (@ardumont) | 06 September 2016, 12:23:31 UTC | Fix and clarify docstring's meaning | 06 September 2016, 12:27:27 UTC |
cf7ff9e | Antoine R. Dumont (@ardumont) | 05 September 2016, 09:10:40 UTC | origin_visit_get_by: Update to retrieve associated occurrence info Closed T559 | 05 September 2016, 09:12:41 UTC |