https://github.com/SoftwareHeritage/swh-storage

Revision Author Date Message Commit Date
f5e852a Updated backport on buster-swh from debian/0.30.0-1_swh1 (unstable-swh) 18 May 2021, 14:52:31 UTC
93f4567 Merge tag 'debian/0.30.0-1_swh1' into debian/buster-swh 18 May 2021, 14:52:31 UTC
6477a88 Updated debian changelog for version 0.30.0 18 May 2021, 14:45:21 UTC
e0fc621 Update upstream source from tag 'debian/upstream/0.30.0' Update to upstream version '0.30.0' with Debian dir bd97b7a393ad1fb6f02d20bb88e4989e57e84535 18 May 2021, 14:45:14 UTC
1ec845d New upstream version 0.30.0 18 May 2021, 14:45:12 UTC
0ed4a97 Make the TenaciousProxyStorage also handle content_add_metadata 18 May 2021, 10:56:15 UTC
f3ce043 Updated backport on buster-swh from debian/0.29.1-1_swh1 (unstable-swh) 14 May 2021, 17:05:44 UTC
9146ddd Merge tag 'debian/0.29.1-1_swh1' into debian/buster-swh 14 May 2021, 17:05:44 UTC
29fa2ad Updated debian changelog for version 0.29.1 14 May 2021, 16:59:42 UTC
72b0f63 Update upstream source from tag 'debian/upstream/0.29.1' Update to upstream version '0.29.1' with Debian dir 7f0734a925b33f8b6d0c2fa241729777d3fbefd4 14 May 2021, 16:59:41 UTC
5b5d0d3 New upstream version 0.29.1 14 May 2021, 16:59:38 UTC
53c21d4 Add missing schema migration for swh_directory_get_entries 14 May 2021, 16:31:00 UTC
00212b1 Updated debian changelog for version 0.29.0 11 May 2021, 13:12:43 UTC
16b2fb4 Update upstream source from tag 'debian/upstream/0.29.0' Update to upstream version '0.29.0' with Debian dir b2c5f0460c76f4ec0b2043303c82c1a0eac711e2 11 May 2021, 13:12:37 UTC
6ae5098 New upstream version 0.29.0 11 May 2021, 13:12:36 UTC
f328367 content_get: Add support for queries by sha1_git Before this commit, the only way to get Content objects from their sha1_git was to call content_find for each object. This was obviously neither convenient nor efficient. Using this endpoint to batch calls reduces the runtime of the git-bare vault cooker by 30%. 11 May 2021, 12:36:30 UTC
e3cbd5e Add endpoint directory_get_entries, to quickly list a directory's entries It spares a join with the content table, which should hopefully make the vault (and possibly other users) faster when they don't need this join. 11 May 2021, 10:00:27 UTC
f140f63 cassandra: Add tests checking directory_add and snapshot_add are atomic. 11 May 2021, 08:22:23 UTC
b487a21 Deprecate the "local" storage cls in favor of "postgresql" 10 May 2021, 12:56:44 UTC
9105253 Move all proxy storages in swh/storage/proxies/ to clean a bit the swh.storage namespace. 10 May 2021, 12:55:07 UTC
7617099 Make the TenaciousProxyStorage retry when a single object add fails give a chance to one-object batches to be ingested, and reduce the number of objects wrongly reported as non-ingested, e.g. during a replayer session, where this situation can occur. 07 May 2021, 11:46:00 UTC
f269f28 Updated debian changelog for version 0.28.0 06 May 2021, 14:06:51 UTC
da53050 Update upstream source from tag 'debian/upstream/0.28.0' Update to upstream version '0.28.0' with Debian dir d8cdc352f7a71a285fd47916abd71279ae22c86a 06 May 2021, 14:06:51 UTC
455191b New upstream version 0.28.0 06 May 2021, 14:06:49 UTC
35ae94a Use swh.core 0.14 It renamed db_name to dbname, which is a breaking change. 06 May 2021, 12:23:09 UTC
652e3d5 tenacious: Document potential issues about objects being dropped 06 May 2021, 09:56:32 UTC
e170fb2 Stop storing authority/fetcher metadata. We still don't have a use for them, and they are causing issues; such as being unable to add an authority/fetcher based only on a REMD object, which is needed by the replayer. 05 May 2021, 10:54:04 UTC
77ef651 Make postgresql's origin_add not raise an error in case of conflict there is no need for an url insertion in the origin table to result in a unicity error. Conflicting insertion of the same URL in this table may happen in case of concurrent process (loading or in a replayer session). 05 May 2021, 10:18:44 UTC
ffb38f7 Add a new TenaciousProxyStorage This proxy storage attempts to add buckets of objects, but in case of failure, it splits the bucket in parts so every valid object in the bucket gets a chance to be inserted. Also provides an error rate-limiting feature. This proxy storage is mainly dedicated to help mirroring an archive using the replayer stack. 05 May 2021, 09:57:58 UTC
051b771 cassandra: Add a test of a 'complex' migration, with a PK update 03 May 2021, 15:40:37 UTC
f233461 cassandra: Add 'check_missing' option, to allow updating objects as part of a migration. Also write a first test that simulates how a simple migration would go. 03 May 2021, 15:40:36 UTC
3d6aebd Updated debian changelog for version 0.27.4 29 April 2021, 13:04:47 UTC
e3d9edc Update upstream source from tag 'debian/upstream/0.27.4' Update to upstream version '0.27.4' with Debian dir 71b51f7b5839ceb9cd6c802e5be218d4cd04e29f 29 April 2021, 13:04:41 UTC
2b20af5 New upstream version 0.27.4 29 April 2021, 13:04:39 UTC
92d551a Normalize all Storage.xxx_add() methods to return a summary but origin_visit_add() which requires more work to do so. Note that this will change the way 'raw_extrinsic_metadata_add()' reports statsd metrics: the 'method_name' tag will now remain 'raw_extrinsic_metadata_add' instead of a forged '<type_name>_metadata_add'. 29 April 2021, 10:40:33 UTC
ff7ecb4 Properly annotate output of Storage.xxx_add() methods as Dict[str, int] when applicable. 29 April 2021, 10:03:10 UTC
98804f9 Add a fixer for ExtrinsicRawMetadata the 'type' attribute has been removed in swh.model v1.0.0 in favor of an ExtendedSWHID 'target'. 28 April 2021, 12:12:22 UTC
615d719 tox: Add sphinx environments to check sane doc build Enable to check package documentation can be built without producing sphinx warnings. The sphinx environment is designed to be used in continuous integration in order to prevent breaking documentation build when committing changes. The sphinx-dev environment is designed to be used inside a full swh development environment. Related to T3258 27 April 2021, 11:57:23 UTC
2c477ec Fix storage_data hardcoded id values and add a test to check this stays accurate, so that these objects can pass through the validate proxy storage, for example. 23 April 2021, 13:45:35 UTC
eb8c147 cassandra: Deduplicate table names This removes all table names from cassandra/cql.py, and gets them from cassandra/schema.py instead. When possible, this uses existing constants (BaseRow.TABLE), otherwise it uses a function to compute these names. This is needed to support schema migrations, as updating a table's primary key requires creating a new table with a different name. 22 April 2021, 15:22:18 UTC
a1fc5fb cassandra: Use prepared statements in extid_index_* All other statements are, and there is no reason for them not to be too 15 April 2021, 13:56:32 UTC
3b00e3a Fix various Sphinx warnings 15 April 2021, 08:19:23 UTC
b999952 sql/Makefile: Also call dropdb prior createdb when using pifpaf Now that PGDATABASE value from pifpaf is used, that call is now needed otherwise the overall swh doc build in development mode fails. 14 April 2021, 16:41:20 UTC
1bacea5 docs: Fix db-schema.svg generation to use pifpaf-created database This makes 'tox -e sphinx-dev' not rely on the existence of the database on the system. 13 April 2021, 15:12:13 UTC
c96942b Cassandra: Deduplicate lists passed to *_add endpoints Previously only release_add supported deduplication. This commit aligns other _add endpoints with it 12 April 2021, 11:27:22 UTC
e7292d4 Updated backport on buster-swh from debian/0.27.3-1_swh1 (unstable-swh) 09 April 2021, 13:12:13 UTC
ff84c53 Merge tag 'debian/0.27.3-1_swh1' into debian/buster-swh 09 April 2021, 13:12:12 UTC
933289e Remove last references to no longer used SQLAlchemy package 09 April 2021, 13:07:35 UTC
9210dd4 Updated debian changelog for version 0.27.3 09 April 2021, 13:06:58 UTC
397de2a Update upstream source from tag 'debian/upstream/0.27.3' Update to upstream version '0.27.3' with Debian dir 4c5231bd56135cb6f1a25ff531b96d82b40ef292 09 April 2021, 13:06:57 UTC
a5342f9 New upstream version 0.27.3 09 April 2021, 13:06:55 UTC
50becef docs: Fix db-schema.svg inclusion when building full swh documentation The image was correctly included when building standalone swh-storage documentation but was not when building the full swh one. Closes T3227 09 April 2021, 11:37:53 UTC
5e2b83d Updated backport on buster-swh from debian/0.27.2-1_swh1 (unstable-swh) 08 April 2021, 08:10:45 UTC
fbe2292 Merge tag 'debian/0.27.2-1_swh1' into debian/buster-swh 08 April 2021, 08:10:45 UTC
eff39f0 Updated debian changelog for version 0.27.2 08 April 2021, 08:05:43 UTC
01ef7f4 Update upstream source from tag 'debian/upstream/0.27.2' Update to upstream version '0.27.2' with Debian dir aa26dbc874004024f4beea80edff49a567a5cf54 08 April 2021, 08:05:42 UTC
1562a78 New upstream version 0.27.2 08 April 2021, 08:05:41 UTC
ccaac11 migrate_extrinsic_metadata: Allow 'atom:title' as alternative to 'title' Some revisions use it instead. 07 April 2021, 12:20:19 UTC
39507b2 Make the replayer drop the Revision.metadata this attribute is deprecated and on the verge of being replaced by RawExtrinsicMetadata objects, and the kafka journal currently in production contains a few invalid metadata entries that makes the replayer unhappy. Closes T3201. 06 April 2021, 14:31:49 UTC
84dcbe3 Merge test_replay's _check_replayed and check_replayed in a single function 06 April 2021, 14:01:37 UTC
36a7fd3 Fix pg Storage.extid_add(): write ExtID objects to the journal and explicitly check for extid objects in the journal in TestStorage. 06 April 2021, 14:01:01 UTC
ce15c6f Updated backport on buster-swh from debian/0.27.1-1_swh1 (unstable-swh) 30 March 2021, 16:04:17 UTC
1650c0a Merge tag 'debian/0.27.1-1_swh1' into debian/buster-swh 30 March 2021, 16:04:15 UTC
4d89b30 Updated debian changelog for version 0.27.1 30 March 2021, 15:59:02 UTC
cfc5b44 Update upstream source from tag 'debian/upstream/0.27.1' Update to upstream version '0.27.1' with Debian dir 9adf4901adac3f232b83b38443774b80da3cb6cd 30 March 2021, 15:59:00 UTC
491e920 New upstream version 0.27.1 30 March 2021, 15:58:58 UTC
0a270d1 migrate_extrinsic_metadata: Filter out git revisions They can't have any extrinsic metadata, so fetching git revisions wastes a lot of time. 30 March 2021, 15:51:55 UTC
3309765 buffer: Add support for 'extid' Will be used by the extid migration script, and loaders can probably use it too. 30 March 2021, 15:33:00 UTC
96b82cb Updated backport on buster-swh from debian/0.27.0-1_swh1 (unstable-swh) 29 March 2021, 12:49:21 UTC
23b0457 Merge tag 'debian/0.27.0-1_swh1' into debian/buster-swh 29 March 2021, 12:49:20 UTC
a3b3dff Updated debian changelog for version 0.27.0 29 March 2021, 12:44:14 UTC
b8b8fd7 Update upstream source from tag 'debian/upstream/0.27.0' Update to upstream version '0.27.0' with Debian dir 467749f10e202fe504e9646b2d036afb83f43fa6 29 March 2021, 12:44:13 UTC
29e04cf New upstream version 0.27.0 29 March 2021, 12:44:11 UTC
cfb2417 extid: remove unicity on (extid_type, extid) and (target_type, target) It did not make sense for multiple reasons: 1. two extids can point to the same target (eg. extids with type git and git-sha256; or two package managers with different checksums) 2. inserting two objects with the same target or extid in a single call actually wrote both, but would crash when reading 3. inserting extid1 then extid2 would write both to Kafka, but only extid1 would be inserted. When replaying on a new DB, extid2 may be inserted and extid1 ignored Points 2 and 3 are simply fixable bugs, but 1 is an issue by design, and this commit fixes all of them at once. 26 March 2021, 15:08:13 UTC
ac6f642 origin_visit_status_add: Fix inconsistent/incorrect errors when type is None and visit is missing. 26 March 2021, 14:30:43 UTC
9243ffe Updated backport on buster-swh from debian/0.26.0-1_swh1 (unstable-swh) 22 March 2021, 21:58:45 UTC
13ebf00 Merge tag 'debian/0.26.0-1_swh1' into debian/buster-swh 22 March 2021, 21:58:45 UTC
c3c2a25 Updated debian changelog for version 0.26.0 22 March 2021, 21:53:39 UTC
55a3a0c Update upstream source from tag 'debian/upstream/0.26.0' Update to upstream version '0.26.0' with Debian dir b3dc40b3aa83a6825ebd07819458b8f242ac390e 22 March 2021, 21:53:38 UTC
9a0834b New upstream version 0.26.0 22 March 2021, 21:53:37 UTC
eff2383 raw_extrinsic_metadata: Make (target, authority_id, discovery_date, fetcher_id) non-unique Uniqueness is only based on the id from now on. Also adds the 'id' column to the Cassandra schema (it was already present in postgresql's schema) 22 March 2021, 11:42:46 UTC
2d540b0 Add raw_extrinsic_metadata.id column in postgresql. For now, this has absolutely no effect on the API users, as rows are already deduplicated based on a subset of the fields hashed by the id. 22 March 2021, 08:53:16 UTC
f64d7d0 Updated backport on buster-swh from debian/0.25.0-1_swh1 (unstable-swh) 18 March 2021, 13:07:29 UTC
11a5410 Merge tag 'debian/0.25.0-1_swh1' into debian/buster-swh 18 March 2021, 13:07:26 UTC
991c32b Updated debian changelog for version 0.25.0 18 March 2021, 13:02:03 UTC
62e2773 Update upstream source from tag 'debian/upstream/0.25.0' Update to upstream version '0.25.0' with Debian dir 2c457efbb98596e479421a62d02a64f398cd0c21 18 March 2021, 13:02:01 UTC
7e25bb8 New upstream version 0.25.0 18 March 2021, 13:02:00 UTC
8dd9f7b Document the existing metadata formats 15 March 2021, 14:59:07 UTC
ffc0841 content_add: Write to the objstorage before the DB or Kafka Must add to the objstorage before the DB and journal. Otherwise: 1. in case of a crash the DB may "believe" we have the content, but we didn't have time to write to the objstorage before the crash 2. the objstorage mirroring, which reads from the journal, may attempt to read from the objstorage before we finished writing it This is already done in the postgresql backend unintentionally since 209de5dbaa127dacd114fbbd084f22632982eb77. This commit documents it, makes the cassandra backend behave that way too, and adds a test. 15 March 2021, 11:55:29 UTC
b565201 storage: Allow to filter out branches by prefix when counting them Add an optional branch_name_exclude_prefix parameter to the snapshot_count_branches method of the Storage interface. It enables to filter out branches whose name starts with a given prefix when counting. The purpose is to get accurate counters in swh-web as pull request branches will be filtered out by default. Related to T2782 12 March 2021, 14:23:54 UTC
93301a1 storage: Add branch names filtering support in snapshot_get_branches Add optional branch_name_include_substring parameter to snapshot_get_branches, if provided only branches whose name contains the given substring will be returned. Add optional branch_name_exclude_prefix parameter to snapshot_get_branches, if provided branches whose name starts with the given prefix will not be returned. Purpose of these new features: add a search form in the branches view of swh-web and filter out pull request branches (whose names start with "refs/pull/") by default. Related to T2782 12 March 2021, 14:23:28 UTC
b8e10f0 Add ExtID query support to the Storage These endpoints allow to add and query the storage for known ExtID from SWHID (typically get original VCS' revision intrinsic identifier from SWHID). The underlying data structure is to be filled typically by loaders using the `extid_add()` endpoint. This only provides the Postgresql implementation. Related to T2849. 11 March 2021, 13:20:18 UTC
6a77732 Add hg revisions to the test data set 10 March 2021, 15:25:00 UTC
e83452b Import TEST_OBJECTS from swh.model instead of swh.journal; the latter has been deprecated for a while now. 10 March 2021, 15:25:00 UTC
82ce7bf Make sure test_backfill does not depend on 2 dict keys being miraculously listed the same. 10 March 2021, 14:49:48 UTC
c4fdd6d Add support for raw_extrinsic_metadata in the replayer This also checks the basic raw_extrinsic_metadata codepaths in the backfiller tests. 10 March 2021, 13:07:11 UTC
53a58fa Add basic support for raw_extrinsic_metadata in the backfiller 10 March 2021, 13:00:05 UTC
89ae0a1 Add simple unit test for the backfill.byte_ranges function 10 March 2021, 08:34:27 UTC
0d785d2 Add support for reading RawExtrinsicMetadata with raw URL targets We convert the target attribute to a hashed ExtendedSWHID before returning the object. 10 March 2021, 08:33:53 UTC
82a376c Updated backport on buster-swh from debian/0.24.1-1_swh1 (unstable-swh) 04 March 2021, 22:43:55 UTC