56366fa | Nicolas Dandrimont | 07 March 2017, 16:02:20 UTC | archiver.director: only yield plain content ids, not dicts | 07 March 2017, 16:02:20 UTC |
223992d | Nicolas Dandrimont | 07 March 2017, 14:44:19 UTC | archiver.worker: allow disabling the task chaining mechanism | 07 March 2017, 14:44:19 UTC |
543c8a4 | Nicolas Dandrimont | 07 March 2017, 14:11:24 UTC | archiver.storage: add a stub archiver only writing data to logfiles | 07 March 2017, 14:11:24 UTC |
5cad6d3 | Nicolas Dandrimont | 07 March 2017, 12:32:14 UTC | archiver.storage: refactor to provide a get_archiver_storage function This will allow us to handle another storage backend for the storage of the archiver data. | 07 March 2017, 12:32:14 UTC |
17c31f1 | Nicolas Dandrimont | 07 March 2017, 12:28:11 UTC | test_archiver: clean up after yourself | 07 March 2017, 12:28:49 UTC |
96c0a21 | Antoine R. Dumont (@ardumont) | 01 March 2017, 16:19:00 UTC | storage: open content_update endpoint Allows batch updates of content rows (with or without optional new columns). Limited to contents (table content only; table skipped_content is not dealt with). Related T692 Closes D185 | 03 March 2017, 09:15:07 UTC |
eb9130c | Nicolas Dandrimont | 02 March 2017, 15:56:54 UTC | archiver.worker: only get copies from the configured object storages By default we would try to copy objects from all the archives, even those for which we didn't have a configuration. | 02 March 2017, 15:56:54 UTC |
9f9570a | Nicolas Dandrimont | 02 March 2017, 15:43:27 UTC | archiver.storage: remove implicit sources_missing from content_archive_add The default value for content copies is "missing", so we don't need to make it explicit. | 02 March 2017, 15:43:27 UTC |
e1225d1 | Nicolas Dandrimont | 02 March 2017, 15:40:46 UTC | archiver.director: the source objstorage for unknown content ids is implicit | 02 March 2017, 15:40:46 UTC |
de9a5c3 | Nicolas Dandrimont | 02 March 2017, 15:34:58 UTC | archiver.director: make the standard input reader more resilient to errors | 02 March 2017, 15:34:58 UTC |
4b27819 | Antoine R. Dumont (@ardumont) | 27 February 2017, 15:41:49 UTC | Refactor: Unify the content_archive_add with swh.storage.content_add Implementation-wise, this uses a COPY statement and drops duplicates if encountered for content_archive insertion. | 02 March 2017, 14:53:44 UTC |
61af747 | Antoine R. Dumont (@ardumont) | 27 February 2017, 14:26:18 UTC | archiver-storage: Improve content_archive_add function Use the same insertion pattern as swh.storage.content_add. | 02 March 2017, 14:53:31 UTC |
ca1c529 | Antoine R. Dumont (@ardumont) | 27 February 2017, 13:50:20 UTC | Refactor: Reuse swh.scheduler.get_task function This also has the benefit of hiding some celery name (which is an implementation detail of swh.scheduler). | 02 March 2017, 14:53:28 UTC |
269a731 | Antoine R. Dumont (@ardumont) | 27 February 2017, 13:39:46 UTC | content_archive_add: Use the right 'missing' status Related: T494 | 02 March 2017, 14:52:21 UTC |
afb423e | Antoine R. Dumont (@ardumont) | 27 February 2017, 13:38:34 UTC | test: Remove impossible and commented test This use case cannot happen with ArchiverWithRetentionPolicyDirector: - If a row entry is referenced in the archiver db, it's present in the objstorage - And if a row entry is not referenced in the archiver db, it won't be listed as missing since it's the archiver db which is read for listing the contents we want to archive. | 02 March 2017, 14:52:21 UTC |
605ca00 | Antoine R. Dumont (@ardumont) | 27 February 2017, 12:12:30 UTC | Refactor: Merge common behavior in director and content updater client Related T494 Related T569 | 02 March 2017, 14:52:14 UTC |
c38a452 | Antoine R. Dumont (@ardumont) | 25 February 2017, 00:04:51 UTC | archiver.storage: Add content_archive_content_add endpoint Related T494 | 02 March 2017, 14:50:56 UTC |
e962cdd | Antoine Pietri | 23 February 2017, 16:37:55 UTC | RevisionVaultCooker: factor out tar creation function | 23 February 2017, 16:37:55 UTC |
327924d | Antoine Pietri | 22 February 2017, 15:34:54 UTC | RevisionVaultCooker: add naive flatten implementation | 22 February 2017, 15:34:54 UTC |
23c1138 | Antoine Pietri | 21 February 2017, 16:35:40 UTC | config: use 5002 as the default storage port | 21 February 2017, 16:35:40 UTC |
2feaa1d | Antoine Pietri | 17 February 2017, 15:52:33 UTC | vault: directory cooker: handle symlinks and executables | 20 February 2017, 14:21:27 UTC |
c2ca9fc | Antoine Pietri | 16 February 2017, 15:28:38 UTC | vault cooker: fix mismatching subclasses method signatures | 16 February 2017, 16:22:08 UTC |
28c3eec | Nicolas Dandrimont | 16 February 2017, 12:21:14 UTC | archiver: fix brown paper bag bug for object counter | 16 February 2017, 12:23:23 UTC |
7e4f780 | Antoine R. Dumont (@ardumont) | 15 February 2017, 13:35:31 UTC | README.dev: Update dev documentation with updated configuration samples | 15 February 2017, 13:41:20 UTC |
5c41ffc | Nicolas Dandrimont | 14 February 2017, 18:37:15 UTC | d/control: remove spurious blank line | 14 February 2017, 18:37:15 UTC |
a5fa26c | Nicolas Dandrimont | 14 February 2017, 18:25:02 UTC | debian/control: update swh-model requirement | 14 February 2017, 18:25:25 UTC |
ec8cebf | Nicolas Dandrimont | 14 February 2017, 18:14:13 UTC | converters: normalize timestamps using swh.model To make sure corruptions such as T680 don't happen again, use the same normalization function as swh.model before inserting timestamps into our database. This makes swh.storage reject non-integer timestamp values as well. Update tests to reflect this change. | 14 February 2017, 18:17:33 UTC |
c6abed2 | Nicolas Dandrimont | 09 February 2017, 17:40:51 UTC | sql/archiver: get the count of objects in each archive Close T672 | 09 February 2017, 17:40:51 UTC |
8ce681c | Nicolas Dandrimont | 09 February 2017, 17:40:24 UTC | sql/archiver: move function defs to the functions file | 09 February 2017, 17:40:24 UTC |
08b0802 | Antoine Pietri | 09 February 2017, 11:12:09 UTC | requirements: split internal and external requirements in two separate files | 09 February 2017, 14:09:28 UTC |
94d72cc | Antoine Pietri | 06 February 2017, 16:47:14 UTC | requirements.txt: s/dateutil/python-dateutil/ | 09 February 2017, 14:09:28 UTC |
76ea627 | Antoine Pietri | 06 February 2017, 16:47:05 UTC | style: test_storage.py: wrap >80 cols line | 09 February 2017, 14:09:28 UTC |
598114c | Nicolas Dandrimont | 07 February 2017, 17:28:05 UTC | sql/archiver: keep archive counts using a bucketed list The buckets are using the last two bytes of the object id, so that we spread the load across different lines on sequential archivings. | 07 February 2017, 17:28:05 UTC |
b5cd7f0 | Nicolas Dandrimont | 01 February 2017, 14:40:06 UTC | sql/upgrades: add 99 → 100 | 01 February 2017, 14:40:06 UTC |
48df525 | Nicolas Dandrimont | 01 February 2017, 14:37:32 UTC | sql/swh-func: in occurrence_get_by: only return data pertaining to one visit By default, return data from the latest visit instead of returning data from all visits, which doesn't make much sense. | 01 February 2017, 14:37:32 UTC |
b5b8fd0 | Nicolas Dandrimont | 01 February 2017, 14:20:01 UTC | sql/swh-func: actually filter swh_visit_find_by_date by origin... | 01 February 2017, 14:21:35 UTC |
fed8afb | Antoine R. Dumont (@ardumont) | 26 January 2017, 13:35:05 UTC | d/control: Update dependencies Closes T646 | 26 January 2017, 14:07:46 UTC |
54cb088 | Antoine R. Dumont (@ardumont) | 26 January 2017, 13:34:43 UTC | Refactor: Unify redundant behavior in api server instantiation Related T646 | 26 January 2017, 13:40:17 UTC |
c92af7a | Antoine R. Dumont (@ardumont) | 26 January 2017, 13:13:30 UTC | Refactor: Unify redundant behavior in SWHRemoteAPI Related T646 | 26 January 2017, 13:16:45 UTC |
4213a0c | Antoine R. Dumont (@ardumont) | 19 January 2017, 13:19:24 UTC | Return page of results for origin visits endpoints Related T636 | 19 January 2017, 13:40:13 UTC |
a8c0e13 | Nicolas Dandrimont | 17 January 2017, 12:11:46 UTC | sql/swh-schema: reorder fields according to production database | 17 January 2017, 12:13:40 UTC |
4893a52 | Nicolas Dandrimont | 11 January 2017, 13:29:01 UTC | sql: refactor to split out indexes and triggers This is the first step towards having different sets of indexes between the master and read-only replicas. | 11 January 2017, 13:43:29 UTC |
e4d5aec | Nicolas Dandrimont | 03 January 2017, 15:30:20 UTC | archiver.worker: fix typo | 03 January 2017, 15:30:20 UTC |
2ff562f | Antoine R. Dumont (@ardumont) | 31 December 2016, 13:46:41 UTC | test: Fix wrong key from base_url to url | 31 December 2016, 13:47:14 UTC |
4bb4246 | Antoine R. Dumont (@ardumont) | 20 December 2016, 08:38:34 UTC | d/control: Update to latest objstorage | 20 December 2016, 08:38:34 UTC |
d5f4640 | Antoine R. Dumont (@ardumont) | 15 December 2016, 15:07:27 UTC | Unify objstorage and storage configuration Related T613 | 15 December 2016, 17:25:53 UTC |
afbdb14 | Antoine R. Dumont (@ardumont) | 15 December 2016, 14:44:19 UTC | Adapt storage's objstorage parameter as a setup property Related T613 | 15 December 2016, 17:25:15 UTC |
ebd1797 | Antoine R. Dumont (@ardumont) | 05 December 2016, 11:00:46 UTC | storage: Fix missing function definition change Related T610 | 05 December 2016, 11:01:57 UTC |
cc7a1ca | Antoine R. Dumont (@ardumont) | 05 December 2016, 10:57:47 UTC | storage: Move hash-function before schema install Related T610 | 05 December 2016, 11:01:40 UTC |
ec0893a | Antoine R. Dumont (@ardumont) | 05 December 2016, 09:12:37 UTC | storage: Adapt ctags' sql schema migration to be faster Related T610 | 05 December 2016, 10:43:43 UTC |
c9e2b15 | Antoine R. Dumont (@ardumont) | 02 December 2016, 16:52:53 UTC | storage: Cleanup only conflictual data for ctags, fossology_license Related T610 | 02 December 2016, 16:52:53 UTC |
648d071 | Antoine R. Dumont (@ardumont) | 02 December 2016, 16:28:12 UTC | storage: Adapt missing endpoint filtering for content_fossology_license Related T610 | 02 December 2016, 16:28:12 UTC |
7ef1700 | Antoine R. Dumont (@ardumont) | 02 December 2016, 15:17:22 UTC | storage: Add tool information on language api endpoints Related T610 | 02 December 2016, 15:17:22 UTC |
6a26397 | Antoine R. Dumont (@ardumont) | 02 December 2016, 14:24:16 UTC | storage: Add tool information on mimetype api endpoints Related T610 | 02 December 2016, 14:24:16 UTC |
2b5a2ff | Antoine R. Dumont (@ardumont) | 02 December 2016, 12:06:40 UTC | storage: Format ctags tool output result Related T610 | 02 December 2016, 12:31:32 UTC |
d35a85a | Antoine R. Dumont (@ardumont) | 02 December 2016, 10:49:02 UTC | storage: Add tool information on ctags api endpoints Related T610 | 02 December 2016, 12:31:32 UTC |
43b37dc | Antoine R. Dumont (@ardumont) | 02 December 2016, 09:20:02 UTC | Update indexer configuration - Index/constraints - pass 2 Related T610 | 02 December 2016, 12:31:25 UTC |
12770e2 | Antoine R. Dumont (@ardumont) | 24 November 2016, 13:53:50 UTC | Update indexer configuration data Related T574 Related T610 | 02 December 2016, 12:31:18 UTC |
0ccb413 | Antoine R. Dumont (@ardumont) | 01 December 2016, 09:25:41 UTC | storage: Add missing function in swh-func.sql Since 096 migration update. | 01 December 2016, 09:30:15 UTC |
d11f4e4 | Antoine R. Dumont (@ardumont) | 01 December 2016, 09:21:25 UTC | doc-sql: Add subgraph for content_indexer tables | 01 December 2016, 09:30:07 UTC |
d3ba860 | Antoine R. Dumont (@ardumont) | 01 December 2016, 09:20:39 UTC | doc-sql: Fix arrow from revision(directory) to directory(id) | 01 December 2016, 09:20:39 UTC |
ebcfd49 | Antoine R. Dumont (@ardumont) | 30 November 2016, 13:54:20 UTC | storage: Actually use the index for searching expression | 30 November 2016, 13:54:20 UTC |
ba29d18 | Antoine R. Dumont (@ardumont) | 29 November 2016, 16:20:37 UTC | Add index on ctags' name column | 29 November 2016, 16:20:37 UTC |
641ad5c | Antoine R. Dumont (@ardumont) | 29 November 2016, 16:03:25 UTC | storage: Use strict equality on ctags search | 29 November 2016, 16:03:25 UTC |
fb3722d | Antoine R. Dumont (@ardumont) | 28 November 2016, 15:07:05 UTC | storage: Fix edge case when searching symbols Before this commit, a syntactically wrong query broke the server. Now it raises a bad input (400) request. | 28 November 2016, 15:07:05 UTC |
7f27e14 | Antoine R. Dumont (@ardumont) | 24 November 2016, 10:11:19 UTC | Add pagination to content_ctags_search api endpoint Related T605 | 24 November 2016, 10:11:19 UTC |
1fc21e6 | Antoine R. Dumont (@ardumont) | 23 November 2016, 16:30:54 UTC | storage: Open content_ctags_search for full-text search Related T605 | 23 November 2016, 16:30:54 UTC |
c690359 | Antoine R. Dumont (@ardumont) | 23 November 2016, 15:32:23 UTC | storage: Add fulltext search function on ctags Related T605 | 23 November 2016, 16:12:56 UTC |
3dafd17 | Antoine R. Dumont (@ardumont) | 22 November 2016, 15:55:17 UTC | storage: Fix error in function which reads licenses Related T602 | 22 November 2016, 15:55:17 UTC |
f5ece61 | Antoine R. Dumont (@ardumont) | 18 November 2016, 13:54:43 UTC | storage: Add indexer_configuration table json schema Related T596 | 18 November 2016, 13:55:18 UTC |
f454e44 | Antoine R. Dumont (@ardumont) | 18 November 2016, 12:24:07 UTC | storage: Update recognized fossology licenses Related T596 | 18 November 2016, 12:25:19 UTC |
bc7f776 | Antoine R. Dumont (@ardumont) | 15 November 2016, 17:04:10 UTC | storage: Fix divergent schema upgrade | 15 November 2016, 17:04:10 UTC |
99b09d4 | Antoine R. Dumont (@ardumont) | 15 November 2016, 17:04:01 UTC | Fix pep8 violation | 15 November 2016, 17:04:01 UTC |
3a4616c | Antoine R. Dumont (@ardumont) | 10 November 2016, 16:18:30 UTC | storage: ctags - Align conflict update policy with license endpoints When updating, we first delete all ctags symbols for the impacted contents, then we add the ctags information. Otherwise, simply add new entries and, in case of conflict, do nothing. | 10 November 2016, 16:25:48 UTC |
9a079c5 | Antoine R. Dumont (@ardumont) | 10 November 2016, 15:54:01 UTC | storage: Update fossology_license to latest design Related T596 | 10 November 2016, 16:17:54 UTC |
7528033 | Antoine R. Dumont (@ardumont) | 09 November 2016, 15:58:10 UTC | Update known licenses from fossology's master branch Related T596 Related 09923374e0f321da78faa0b37b2814fea9c5f1c1 | 10 November 2016, 09:44:16 UTC |
3dd9b0f | Antoine R. Dumont (@ardumont) | 09 November 2016, 15:45:12 UTC | storage: Return unknown licenses Related T596 | 09 November 2016, 15:46:47 UTC |
a63cbc7 | Antoine R. Dumont (@ardumont) | 08 November 2016, 16:08:04 UTC | storage: Open content_license endpoint (add/get) Related T596 | 09 November 2016, 11:30:39 UTC |
2fffbd4 | Antoine R. Dumont (@ardumont) | 08 November 2016, 11:42:22 UTC | storage: Add license and content_license tables Related T596 | 08 November 2016, 14:06:12 UTC |
04f2b2d | Antoine R. Dumont (@ardumont) | 08 November 2016, 11:06:28 UTC | storage: Add comments on enum | 08 November 2016, 14:06:12 UTC |
022e985 | Antoine R. Dumont (@ardumont) | 08 November 2016, 11:04:18 UTC | storage: Move enums to new swh-enums.sql namespace | 08 November 2016, 14:06:11 UTC |
51c4896 | Nicolas Dandrimont | 03 November 2016, 14:39:38 UTC | storage: add check_config method The check_config method allows a dynamic check of the configuration for a running storage. We can make sure that we have proper permissions on the object storage as well as the database before running things. | 03 November 2016, 14:39:38 UTC |
5e8bba5 | Antoine R. Dumont (@ardumont) | 20 October 2016, 13:53:19 UTC | storage: Improve index on content_ctags Work on the suggestion message from postgresql psycopg2.OperationalError: index row size 3992 exceeds maximum 2712 for index "content_ctags_id_name_kind_line_lang_idx" HINT: Values larger than 1/3 of a buffer page cannot be indexed. Consider a function index of an MD5 hash of the value, or use full text indexing. Related T589 | 20 October 2016, 13:55:44 UTC |
33043b1 | Antoine R. Dumont (@ardumont) | 20 October 2016, 12:41:22 UTC | storage: ctags - Improve schema Related T589 | 20 October 2016, 13:25:28 UTC |
a74a141 | Antoine R. Dumont (@ardumont) | 19 October 2016, 16:29:43 UTC | storage: Open ctags entry points (missing, add, get) Related T589 | 19 October 2016, 16:33:19 UTC |
71b4a88 | Antoine R. Dumont (@ardumont) | 19 October 2016, 16:28:48 UTC | Remove noisy test attribute 'one' | 19 October 2016, 16:33:18 UTC |
4bd537f | Nicolas Dandrimont | 19 October 2016, 14:47:10 UTC | storage: allow adding several origins at once | 19 October 2016, 14:49:08 UTC |
141afef | Nicolas Dandrimont | 19 October 2016, 14:42:19 UTC | common: allow passing in the cursor for the transaction decorators | 19 October 2016, 14:42:19 UTC |
f7becde | Antoine R. Dumont (@ardumont) | 11 October 2016, 16:13:45 UTC | Add the means to pipe contents to another queue once copied Related T575 | 13 October 2016, 13:24:00 UTC |
d2eb077 | Antoine R. Dumont (@ardumont) | 13 October 2016, 10:20:24 UTC | indexer: Unify function names according to conventions | 13 October 2016, 12:18:21 UTC |
97f610a | Antoine R. Dumont (@ardumont) | 13 October 2016, 09:28:07 UTC | Add tests around the content_{mimetype/language}_add endpoints Related T582 | 13 October 2016, 12:18:13 UTC |
1373667 | Antoine R. Dumont (@ardumont) | 13 October 2016, 09:18:40 UTC | indexer: Open mimetype/language get endpoints | 13 October 2016, 12:18:12 UTC |
54efa89 | Antoine R. Dumont (@ardumont) | 12 October 2016, 16:50:36 UTC | indexer: open drop/skip policy update on duplicates (language/mimetype) This adds the optional conflict_update parameter, which specifies what to do when a conflict on sha1 occurs. conflict_update is false by default, which ignores duplicates. When conflict_update is true, existing data is overwritten. Related T582 | 13 October 2016, 12:18:12 UTC |
dddbc4c | Antoine R. Dumont (@ardumont) | 13 October 2016, 08:23:16 UTC | Fix: Remove nose test attribute 'one' | 13 October 2016, 12:08:00 UTC |
3fcc628 | Antoine R. Dumont (@ardumont) | 12 October 2016, 00:23:25 UTC | Fix provenance storage init function | 12 October 2016, 00:23:25 UTC |
2fd7f72 | Antoine R. Dumont (@ardumont) | 11 October 2016, 23:33:57 UTC | provenance: Rework configuration setup | 11 October 2016, 23:33:57 UTC |
30f7883 | Antoine R. Dumont (@ardumont) | 07 October 2016, 17:08:34 UTC | Open language_mimetype_{missing,add} endpoints Related T578 | 07 October 2016, 18:30:55 UTC |
859860c | Antoine R. Dumont (@ardumont) | 07 October 2016, 16:53:38 UTC | sql/schema: Add content_language table Related T578 | 07 October 2016, 16:53:38 UTC |
fd717f3 | Antoine R. Dumont (@ardumont) | 07 October 2016, 12:36:56 UTC | Open content_mimetype_add endpoint to add missing mimetypes Related T577 | 07 October 2016, 15:08:05 UTC |
a77c187 | Antoine R. Dumont (@ardumont) | 07 October 2016, 12:36:20 UTC | Open content_mimetype_missing endpoint to list missing mimetypes Related T577 | 07 October 2016, 15:08:05 UTC |