https://github.com/postgres/postgres

sort by:
Revision Author Date Message Commit Date
272d82e Stamp 13.3. 10 May 2021, 20:41:42 UTC
9b93a33 Last-minute updates for release notes. Security: CVE-2021-32027, CVE-2021-32028, CVE-2021-32029 10 May 2021, 17:10:29 UTC
4a8656a Fix mishandling of resjunk columns in ON CONFLICT ... UPDATE tlists. It's unusual to have any resjunk columns in an ON CONFLICT ... UPDATE list, but it can happen when MULTIEXPR_SUBLINK SubPlans are present. If it happens, the ON CONFLICT UPDATE code path would end up storing tuples that include the values of the extra resjunk columns. That's fairly harmless in the short run, but if new columns are added to the table then the values would become accessible, possibly leading to malfunctions if they don't match the datatypes of the new columns. This had escaped notice through a confluence of missing sanity checks, including * There's no cross-check that a tuple presented to heap_insert or heap_update matches the table rowtype. While it's difficult to check that fully at reasonable cost, we can easily add assertions that there aren't too many columns. * The output-column-assignment cases in execExprInterp.c lacked any sanity checks on the output column numbers, which seems like an oversight considering there are plenty of assertion checks on input column numbers. Add assertions there too. * We failed to apply nodeModifyTable's ExecCheckPlanOutput() to the ON CONFLICT UPDATE tlist. That wouldn't have caught this specific error, since that function is chartered to ignore resjunk columns; but it sure seems like a bad omission now that we've seen this bug. In HEAD, the right way to fix this is to make the processing of ON CONFLICT UPDATE tlists work the same as regular UPDATE tlists now do, that is don't add "SET x = x" entries, and use ExecBuildUpdateProjection to evaluate the tlist and combine it with old values of the not-set columns. This adds a little complication to ExecBuildUpdateProjection, but allows removal of a comparable amount of now-dead code from the planner. In the back branches, the most expedient solution seems to be to (a) use an output slot for the ON CONFLICT UPDATE projection that actually matches the target table, and then (b) invent a variant of ExecBuildProjectionInfo that can be told to not store values resulting from resjunk columns, so it doesn't try to store into nonexistent columns of the output slot. (We can't simply ignore the resjunk columns altogether; they have to be evaluated for MULTIEXPR_SUBLINK to work.) This works back to v10. In 9.6, projections work much differently and we can't cheaply give them such an option. The 9.6 version of this patch works by inserting a JunkFilter when it's necessary to get rid of resjunk columns. In addition, v11 and up have the reverse problem when trying to perform ON CONFLICT UPDATE on a partitioned table. Through a further oversight, adjust_partition_tlist() discarded resjunk columns when re-ordering the ON CONFLICT UPDATE tlist to match a partition. This accidentally prevented the storing-bogus-tuples problem, but at the cost that MULTIEXPR_SUBLINK cases didn't work, typically crashing if more than one row has to be updated. Fix by preserving resjunk columns in that routine. (I failed to resist the temptation to add more assertions there too, and to do some minor code beautification.) Per report from Andres Freund. Back-patch to all supported branches. Security: CVE-2021-32028 10 May 2021, 15:02:29 UTC
467395b Prevent integer overflows in array subscripting calculations. While we were (mostly) careful about ensuring that the dimensions of arrays aren't large enough to cause integer overflow, the lower bound values were generally not checked. This allows situations where lower_bound + dimension overflows an integer. It seems that that's harmless so far as array reading is concerned, except that array elements with subscripts notionally exceeding INT_MAX are inaccessible. However, it confuses various array-assignment logic, resulting in a potential for memory stomps. Fix by adding checks that array lower bounds aren't large enough to cause lower_bound + dimension to overflow. (Note: this results in disallowing cases where the last subscript position would be exactly INT_MAX. In principle we could probably allow that, but there's a lot of code that computes lower_bound + dimension and would need adjustment. It seems doubtful that it's worth the trouble/risk to allow it.) Somewhat independently of that, array_set_element() was careless about possible overflow when checking the subscript of a fixed-length array, creating a different route to memory stomps. Fix that too. Security: CVE-2021-32027 10 May 2021, 14:44:38 UTC
4c7ba55 Translation updates Source-Git-URL: git://git.postgresql.org/git/pgtranslation/messages.git Source-Git-Hash: 832086c7a50768dd7a8c548ab063037741530ddf 10 May 2021, 12:32:18 UTC
0d204a4 Emit dummy statements for probes.d probes when disabled When building without --enable-dtrace, emit dummy do {} while (0) statements for the stubbed-out TRACE_POSTGRESQL_foo() macros instead of empty macros that totally elide the original probe statement. This fixes the warning: suggest braces around empty body in an ‘if’ statement [-Wempty-body] introduced by b94409a02f. Author: Craig Ringer <craig.ringer@2ndquadrant.com> Discussion: https://www.postgresql.org/message-id/flat/20210504221531.cfvpmmdfsou6eitb%40alap3.anarazel.de 10 May 2021, 11:56:21 UTC
55fe672 Release notes for 13.3, 12.7, 11.12, 10.17, 9.6.22. 09 May 2021, 17:31:40 UTC
7f4bab7 First-draft release notes for 13.3. As usual, the release notes for older branches will be made by cutting these down, but put them up for community review first. 07 May 2021, 16:19:41 UTC
ef70b6f AlterSubscription_refresh: avoid stomping on global variable This patch replaces use of the global "wrconn" variable in AlterSubscription_refresh with a local variable of the same name, making it consistent with other functions in subscriptioncmds.c (e.g. DropSubscription). The global wrconn is only meant to be used for logical apply/tablesync worker. Abusing it this way is known to cause trouble if an apply worker manages to do a subscription refresh, such as reported by Jeremy Finzel and diagnosed by Andres Freund back in November 2020, at https://www.postgresql.org/message-id/20201111215820.qihhrz7fayu6myfi@alap3.anarazel.de Backpatch to 10. In branch master, also move the connection establishment to occur outside the PG_TRY block; this way we can remove a test for NULL in PG_FINALLY, and it also makes the code more consistent with similar code in the same file. Author: Peter Smith <peter.b.smith@fujitsu.com> Reviewed-by: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> Reviewed-by: Japin Li <japinli@hotmail.com> Discussion: https://postgr.es/m/CAHut+Pu7Jv9L2BOEx_Z0UtJxfDevQSAUW2mJqWU+CtmDrEZVAg@mail.gmail.com 07 May 2021, 15:46:37 UTC
f518c3d Document lock level used by ALTER TABLE VALIDATE CONSTRAINT Backpatch all the way back to 9.6. Author: Simon Riggs <simon.riggs@enterprisedb.com> Discussion: https://postgr.es/m/CANbhV-EwxvdhHuOLdfG2ciYrHOHXV=mm6=fD5aMhqcH09Li3Tg@mail.gmail.com 06 May 2021, 21:17:56 UTC
082d9d6 jit: Fix warning reported by gcc-11 caused by dubious function signature. Reported-By: Erik Rijkers <er@xs4all.nl> Discussion: https://postgr.es/m/833107370.1313189.1619647621213@webmailclassic.xs4all.nl Backpatch: 13, where b059d2f45685 introduced the issue. 06 May 2021, 05:14:52 UTC
923c135 Have ALTER CONSTRAINT recurse on partitioned tables When ALTER TABLE .. ALTER CONSTRAINT changes deferrability properties changed in a partitioned table, we failed to propagate those changes correctly to partitions and to triggers. Repair by adding a recursion mechanism to affect all derived constraints and all derived triggers. (In particular, recurse to partitions even if their respective parents are already in the desired state: it is possible for the partitions to have been altered individually.) Because foreign keys involve tables in two sides, we cannot use the standard ALTER TABLE recursion mechanism, so we invent our own by following pg_constraint.conparentid down. When ALTER TABLE .. ALTER CONSTRAINT is invoked on the derived pg_constraint object that's automaticaly created in a partition as a result of a constraint added to its parent, raise an error instead of pretending to work and then failing to modify all the affected triggers. Before this commit such a command would be allowed but failed to affect all triggers, so it would silently misbehave. (Restoring dumps of existing databases is not affected, because pg_dump does not produce anything for such a derived constraint anyway.) Add some tests for the case. Backpatch to 11, where foreign key support was added to partitioned tables by commit 3de241dba86f. (A related change is commit f56f8f8da6af in pg12 which added support for FKs *referencing* partitioned tables; this is what forces us to use an ad-hoc recursion mechanism for this.) Diagnosed by Tom Lane from bug report from Ron L Johnson. As of this writing, no reviews were offered. Discussion: https://postgr.es/m/75fe0761-a291-86a9-c8d8-4906da077469@gmail.com Discussion: https://postgr.es/m/3144850.1607369633@sss.pgh.pa.us 05 May 2021, 16:14:21 UTC
91a6b38 Fix OID passed to object-alter hook during ALTER CONSTRAINT The OID of the constraint is used instead of the OID of the trigger -- an easy mistake to make. Apparently the object-alter hooks are not very well tested :-( Backpatch to 12, where this typo was introduced by 578b229718e8 Discussion: https://postgr.es/m/20210503231633.GA6994@alvherre.pgsql 04 May 2021, 14:09:11 UTC
a6a3a27 pg_dump: Fix dump of generated columns in partitions The previous fix for dumping of inherited generated columns (0bf83648a52df96f7c8677edbbdf141bfa0cf32b) must not be applied to partitions, since, unlike normal inherited tables, they are always dumped separately and reattached. Reported-by: Santosh Udupi <email@hitha.net> Discussion: https://www.postgresql.org/message-id/flat/CACLRvHZ4a-%2BSM_159%2BtcrHdEqxFrG%3DW4gwTRnwf7Oj0UNj5R2A%40mail.gmail.com 04 May 2021, 12:18:23 UTC
64190d6 Fix ALTER TABLE / INHERIT with generated columns When running ALTER TABLE t2 INHERIT t1, we must check that columns in t2 that correspond to a generated column in t1 are also generated and have the same generation expression. Otherwise, this would allow creating setups that a normal CREATE TABLE sequence would not allow. Discussion: https://www.postgresql.org/message-id/22de27f6-7096-8d96-4619-7b882932ca25@2ndquadrant.com 04 May 2021, 10:10:22 UTC
e48ce7e Prevent lwlock dtrace probes from unnecessary work If dtrace is compiled in but disabled, the lwlock dtrace probes still evaluate their arguments. Since PostgreSQL 13, T_NAME(lock) does nontrivial work, so it should be avoided if not needed. To fix, make these calls conditional on the *_ENABLED() macro corresponding to each probe. Reviewed-by: Craig Ringer <craig.ringer@enterprisedb.com> Discussion: https://www.postgresql.org/message-id/CAGRY4nwxKUS_RvXFW-ugrZBYxPFFM5kjwKT5O+0+Stuga5b4+Q@mail.gmail.com 03 May 2021, 19:01:09 UTC
3eeadc4 Doc: add an example of a self-referential foreign key to ddl.sgml. While we've always allowed such cases, the documentation didn't say you could do it. Discussion: https://postgr.es/m/161969805833.690.13680986983883602407@wrigleys.postgresql.org 30 April 2021, 19:37:56 UTC
7c810bd Doc: update libpq's documentation for PQfn(). Mention specifically that you can't call aggregates, window functions, or procedures this way (the inability to call SRFs was already mentioned). Also, the claim that PQfn doesn't support NULL arguments or results has been a lie since we invented protocol 3.0. Not sure why this text was never updated for that, but do it now. Discussion: https://postgr.es/m/2039442.1615317309@sss.pgh.pa.us 30 April 2021, 19:10:06 UTC
4d225ba Disallow calling anything but plain functions via the fastpath API. Reject aggregates, window functions, and procedures. Aggregates failed anyway, though with a somewhat obscure error message. Window functions would hit an Assert or null-pointer dereference. Procedures seemed to work as long as you didn't try to do transaction control, but (a) transaction control is sort of the point of a procedure, and (b) it's not entirely clear that no bugs lurk in that path. Given the lack of testing of this area, it seems safest to be conservative in what we support. Also reject proretset functions, as the fastpath protocol can't support returning a set. Also remove an easily-triggered assertion that the given OID isn't 0; the subsequent lookups can handle that case themselves. Per report from Theodor-Arsenij Larionov-Trichkin. Back-patch to all supported branches. (The procedure angle only applies in v11+, of course.) Discussion: https://postgr.es/m/2039442.1615317309@sss.pgh.pa.us 30 April 2021, 18:10:26 UTC
bbcfee0 Fix some more omissions in pg_upgrade's tests for non-upgradable types. Commits 29aeda6e4 et al closed up some oversights involving not checking for non-upgradable types within container types, such as arrays and ranges. However, I only looked at version.c, failing to notice that there were substantially-equivalent tests in check.c. (The division of responsibility between those files is less than clear...) In addition, because genbki.pl does not guarantee that auto-generated rowtype OIDs will hold still across versions, we need to consider that the composite type associated with a system catalog or view is non-upgradable. It seems unlikely that someone would have a user column declared that way, but if they did, trying to read it in another PG version would likely draw "no such pg_type OID" failures, thanks to the type OID embedded in composite Datums. To support the composite and reg*-type cases, extend the recursive query that does the search to allow any base query that returns a column of pg_type OIDs, rather than limiting it to exactly one starting type. As before, back-patch to all supported branches. Discussion: https://postgr.es/m/2798740.1619622555@sss.pgh.pa.us 29 April 2021, 19:24:37 UTC
896cedc Improve documentation for default_tablespace on partitioned tables Backpatch to 12, where 87259588d0ab introduced the current behavior. Per note from Justin Pryzby. Co-authored-by: Justin Pryzby <pryzby@telsasoft.com> Discussion: https://postgr.es/m/20210416143135.GI3315@telsasoft.com 29 April 2021, 15:31:24 UTC
7bbcfb4 Doc: fix discussion of how to get real Julian Dates. Somehow I'd convinced myself that rotating to UTC-12 was the way to do this, but upon further review, it's definitely UTC+12. Discussion: https://postgr.es/m/1197050.1619123213@sss.pgh.pa.us 28 April 2021, 14:03:28 UTC
a928297 Fix use-after-release issue with pg_identify_object_as_address() Spotted by buildfarm member prion, with -DRELCACHE_FORCE_RELEASE. Introduced in f7aab36. Discussion: https://postgr.es/m/2759018.1619577848@sss.pgh.pa.us Backpatch-through: 9.6 28 April 2021, 02:58:43 UTC
f3c4537 Fix pg_identify_object_as_address() with event triggers Attempting to use this function with event triggers failed, as, since its introduction in a676201, this code has never associated an object name with event triggers. This addresses the failure by adding the event trigger name to the set defining its object address. Note that regression tests are added within event_trigger and not object_address to avoid issues with concurrent connections in parallel schedules. Author: Joel Jacobson Discussion: https://postgr.es/m/3c905e77-a026-46ae-8835-c3f6cd1d24c8@www.fastmail.com Backpatch-through: 9.6 28 April 2021, 02:18:17 UTC
ec5bab9 Doc: document EXTRACT(JULIAN ...), improve Julian Date explanation. For some reason, the "julian" option for extract()/date_part() has never gotten listed in the manual. Also, while Appendix B mentioned in passing that we don't conform to the usual astronomical definition that a Julian date starts at noon UTC, it was kind of vague about what we do instead. Clarify that, and add an example showing how to get the astronomical definition if you want it. It's been like this for ages, so back-patch to all supported branches. Discussion: https://postgr.es/m/1197050.1619123213@sss.pgh.pa.us 26 April 2021, 15:50:35 UTC
8019fcb doc: Fix obsolete description about pg_basebackup. Previously it was documented that if using "-X none" option there was no guarantee that all required WAL files were archived at the end of pg_basebackup when taking a backup from the standby. But this limitation was removed by commit 52f8a59dd9. Now, even when taking a backup from the standby, pg_basebackup can wait for all required WAL files to be archived. Therefore this commit removes such obsolete description from the docs. Also this commit adds new description about the limitation when taking a backup from the standby, into the docs. The limitation is that pg_basebackup cannot force the standbfy to switch to a new WAL file at the end of backup, which may cause pg_basebackup to wait a long time for the last required WAL file to be switched and archived, especially when write activity on the primary is low. Back-patch to v10 where the issue was introduced. Reported-by: Kyotaro Horiguchi Author: Kyotaro Horiguchi, Fujii Masao Reviewed-by: Kyotaro Horiguchi, Fujii Masao Discussion: https://postgr.es/m/20210420.133235.1342729068750553399.horikyota.ntt@gmail.com 23 April 2021, 06:46:42 UTC
2602ee4 Don't crash on reference to an un-available system column. Adopt a more consistent policy about what slot-type-specific getsysattr functions should do when system attributes are not available. To wit, they should all throw the same user-oriented error, rather than variously crashing or emitting developer-oriented messages. This closes a identifiable problem in commits a71cfc56b and 3fb93103a (in v13 and v12), so back-patch into those branches, along with a test case to try to ensure we don't break it again. It is not known that any of the former crash cases are reachable in HEAD, but this seems like a good safety improvement in any case. Discussion: https://postgr.es/m/141051591267657@mail.yandex.ru 22 April 2021, 21:30:42 UTC
00037d8 Doc: document the tie-breaking behavior of the round() function. Back-patch to v13; the table layout in older branches is unfriendly to adding such details. Laurenz Albe Discussion: https://postgr.es/m/161881920775.685.12293798764864559341@wrigleys.postgresql.org 22 April 2021, 18:47:26 UTC
a71cfc5 Fix bugs in RETURNING in cross-partition UPDATE cases. If the source and destination partitions don't have identical rowtypes (for example, one has dropped columns the other lacks), then the planSlot contents will be different because of that. If the query has a RETURNING list that tries to return resjunk columns out of the planSlot, that is columns from tables that were joined to the target table, we'd get errors or wrong answers. That's because we used the RETURNING list generated for the destination partition, which expects a planSlot matching that partition's subplan. The most practical fix seems to be to convert the updated destination tuple back to the source partition's rowtype, and then apply the RETURNING list generated for the source partition. This avoids making fragile assumptions about whether the per-subpartition subplans generated all the resjunk columns in the same order. This has been broken since v11 introduced cross-partition UPDATE. The lack of field complaints shows that non-identical partitions aren't a common case; therefore, don't stress too hard about making the conversion efficient. There's no such bug in HEAD, because commit 86dc90056 got rid of per-target-relation variance in the contents of the planSlot. Hence, patch v11-v13 only. Amit Langote and Etsuro Fujita, small changes by me Discussion: https://postgr.es/m/CA+HiwqE_UK1jTSNrjb8mpTdivzd3dum6mK--xqKq0Y9VmfwWQA@mail.gmail.com 22 April 2021, 15:46:41 UTC
574a1b8 fix silly perl error in commit d064afc720 21 April 2021, 15:15:54 UTC
d31ae92 Only ever test for non-127.0.0.1 addresses on Windows in PostgresNode This has been found to cause hangs where tcp usage is forced. Alexey Kodratov Discussion: https://postgr.es/m/82e271a9a11928337fcb5b5e57b423c0@postgrespro.ru Backpatch to all live branches 21 April 2021, 14:25:23 UTC
7bfba4f Fix planner failure in some cases of sorting by an aggregate. An oversight introduced by the incremental-sort patches caused "could not find pathkey item to sort" errors in some situations where a sort key involves an aggregate or window function. The basic problem here is that find_em_expr_usable_for_sorting_rel isn't properly modeling what prepare_sort_from_pathkeys will do later. Rather than hoping we can keep those functions in sync, let's refactor so that they actually share the code for identifying a suitable sort expression. With this refactoring, tlist.c's tlist_member_ignore_relabel is unused. I removed it in HEAD but left it in place in v13, in case any extensions are using it. Per report from Luc Vlaming. Back-patch to v13 where the problem arose. James Coleman and Tom Lane Discussion: https://postgr.es/m/91f3ec99-85a4-fa55-ea74-33f85a5c651f@swarm64.com 20 April 2021, 15:32:02 UTC
bf5d1f1 Fix typo in comment Author: Julien Rouhaud Backpatch-through: 11 Discussion: https://postgr.es/m/20210420121659.odjueyd4rpilorn5@nol 20 April 2021, 12:36:41 UTC
e480c6d Allow TestLib::slurp_file to skip contents, and use as needed In order to avoid getting old logfile contents certain functions in PostgresNode were doing one of two things. On Windows it rotated the logfile and restarted the server, while elsewhere it truncated the log file. Both of these are unnecessary. We borrow from the buildfarm which does this instead: note the size of the logfile before we start, and then when fetching the logfile skip to that position before accumulating contents. This is spelled differently on Windows but the effect is the same. This is largely centralized in TestLib's slurp_file function, which has a new optional parameter, the offset to skip to before starting to reading the file. Code in the client becomes much neater. Backpatch to all live branches. Michael Paquier, slightly modified by me. Discussion: https://postgr.es/m/YHajnhcMAI3++pJL@paquier.xyz 16 April 2021, 21:35:20 UTC
0e8acd3 doc: Fix typo in example query of SQL/JSON Author: Erik Rijkers Discussion: https://postgr.es/m/1219476687.20432.1617452918468@webmailclassic.xs4all.nl Backpatch-through: 12 16 April 2021, 07:56:25 UTC
c39aa1e Fix some inappropriately-disallowed uses of ALTER ROLE/DATABASE SET. Most GUC check hooks that inspect database state have special checks that prevent them from throwing hard errors for state-dependent issues when source == PGC_S_TEST. This allows, for example, "ALTER DATABASE d SET default_text_search_config = foo" when the "foo" configuration hasn't been created yet. Without this, we have problems during dump/reload or pg_upgrade, because pg_dump has no idea about possible dependencies of GUC values and can't ensure a safe restore ordering. However, check_role() and check_session_authorization() hadn't gotten the memo about that, and would throw hard errors anyway. It's not entirely clear what is the use-case for "ALTER ROLE x SET role = y", but we've now heard two independent complaints about that bollixing an upgrade, so apparently some people are doing it. Hence, fix these two functions to act more like other check hooks with similar needs. (But I did not change their insistence on being inside a transaction, as it's still not apparent that setting either GUC from the configuration file would be wise.) Also fix check_temp_buffers, which had a different form of the disease of making state-dependent checks without any exception for PGC_S_TEST. A cursory survey of other GUC check hooks did not find any more issues of this ilk. (There are a lot of interdependencies among PGC_POSTMASTER and PGC_SIGHUP GUCs, which may be a bad idea, but they're not relevant to the immediate concern because they can't be set via ALTER ROLE/DATABASE.) Per reports from Charlie Hornsby and Nathan Bossart. Back-patch to all supported branches. Discussion: https://postgr.es/m/HE1P189MB0523B31598B0C772C908088DB7709@HE1P189MB0523.EURP189.PROD.OUTLOOK.COM Discussion: https://postgr.es/m/20160711223641.1426.86096@wrigleys.postgresql.org 13 April 2021, 19:10:18 UTC
97b7ad4 Redesign the caching done by get_cached_rowtype(). Previously, get_cached_rowtype() cached a pointer to a reference-counted tuple descriptor from the typcache, relying on the ExprContextCallback mechanism to release the tupdesc refcount when the expression tree using the tupdesc was destroyed. This worked fine when it was designed, but the introduction of within-DO-block COMMITs broke it. The refcount is logged in a transaction-lifespan resource owner, but plpgsql won't destroy simple expressions made within the DO block (before its first commit) until the DO block is exited. That results in a warning about a leaked tupdesc refcount when the COMMIT destroys the original resource owner, and then an error about the active resource owner not holding a matching refcount when the expression is destroyed. To fix, get rid of the need to have a shutdown callback at all, by instead caching a pointer to the relevant typcache entry. Those survive for the life of the backend, so we needn't worry about the pointer becoming stale. (For registered RECORD types, we can still cache a pointer to the tupdesc, knowing that it won't change for the life of the backend.) This mechanism has been in use in plpgsql and expandedrecord.c since commit 4b93f5799, and seems to work well. This change requires modifying the ExprEvalStep structs used by the relevant expression step types, which is slightly worrisome for back-patching. However, there seems no good reason for extensions to be familiar with the details of these particular sub-structs. Per report from Rohit Bhogate. Back-patch to v11 where within-DO-block COMMITs became a thing. Discussion: https://postgr.es/m/CAAV6ZkQRCVBh8qAY+SZiHnz+U+FqAGBBDaDTjF2yiKa2nJSLKg@mail.gmail.com 13 April 2021, 17:37:07 UTC
37e7654 Avoid improbable PANIC during heap_update. heap_update needs to clear any existing "all visible" flag on the old tuple's page (and on the new page too, if different). Per coding rules, to do this it must acquire pin on the appropriate visibility-map page while not holding exclusive buffer lock; which creates a race condition since someone else could set the flag whenever we're not holding the buffer lock. The code is supposed to handle that by re-checking the flag after acquiring buffer lock and retrying if it became set. However, one code path through heap_update itself, as well as one in its subroutine RelationGetBufferForTuple, failed to do this. The end result, in the unlikely event that a concurrent VACUUM did set the flag while we're transiently not holding lock, is a non-recurring "PANIC: wrong buffer passed to visibilitymap_clear" failure. This has been seen a few times in the buildfarm since recent VACUUM changes that added code paths that could set the all-visible flag while holding only exclusive buffer lock. Previously, the flag was (usually?) set only after doing LockBufferForCleanup, which would insist on buffer pin count zero, thus preventing the flag from becoming set partway through heap_update. However, it's clear that it's heap_update not VACUUM that's at fault here. What's less clear is whether there is any hazard from these bugs in released branches. heap_update is certainly violating API expectations, but if there is no code path that can set all-visible without a cleanup lock then it's only a latent bug. That's not 100% certain though, besides which we should worry about extensions or future back-patch fixes that could introduce such code paths. I chose to back-patch to v12. Fixing RelationGetBufferForTuple before that would require also back-patching portions of older fixes (notably 0d1fe9f74), which is more code churn than seems prudent to fix a hypothetical issue. Discussion: https://postgr.es/m/2247102.1618008027@sss.pgh.pa.us 13 April 2021, 16:17:24 UTC
1388119 Use "-I." in directories holding Bison parsers, for Oracle compilers. With the Oracle Developer Studio 12.6 compiler, #line directives alter the current source file location for purposes of #include "..." directives. Hence, a VPATH build failed with 'cannot find include file: "specscanner.c"'. With two exceptions, parser-containing directories already add "-I. -I$(srcdir)"; eliminate the exceptions. Back-patch to 9.6 (all supported versions). 13 April 2021, 02:24:58 UTC
766c8fc Port regress-python3-mangle.mk to Solaris "sed". It doesn't support "\(foo\)*" like a POSIX "sed" implementation does; see the Autoconf manual. Back-patch to 9.6 (all supported versions). 13 April 2021, 02:24:24 UTC
16175e2 Fix potential SSI hazard in heap_update(). Commit 6f38d4dac38 failed to heed a warning about the stability of the value pointed to by "otid". The caller is allowed to pass in a pointer to newtup->t_self, which will be updated during the execution of the function. Instead, the SSI check should use the value we copy into oldtup.t_self near the top of the function. Not a live bug, because newtup->t_self doesn't really get updated until a bit later, but it was confusing and broke the rule established by the comment. Back-patch to 13. Reported-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/2689164.1618160085%40sss.pgh.pa.us 13 April 2021, 01:03:30 UTC
8a7bd1e Fix old bug with coercing the result of a COLLATE expression. There are hacks in parse_coerce.c to push down a requested coercion to below any CollateExpr that may appear. However, we did that even if the requested data type is non-collatable, leading to an invalid expression tree in which CollateExpr is applied to a non-collatable type. The fix is just to drop the CollateExpr altogether, reasoning that it's useless. This bug is ten years old, dating to the original addition of COLLATE support. The lack of field complaints suggests that there aren't a lot of user-visible consequences. We noticed the problem because it would trigger an assertion in DefineVirtualRelation if the invalid structure appears as an output column of a view; however, in a non-assert build, you don't see a crash just a (subtly incorrect) complaint about applying collation to a non-collatable type. I found that by putting the incorrect structure further down in a view, I could make a view definition that would fail dump/reload, per the added regression test case. But CollateExpr doesn't do anything at run-time, so this likely doesn't lead to any really exciting consequences. Per report from Yulin Pei. Back-patch to all supported branches. Discussion: https://postgr.es/m/HK0PR01MB22744393C474D503E16C8509F4709@HK0PR01MB2274.apcprd01.prod.exchangelabs.com 12 April 2021, 18:37:22 UTC
48d4a8c Allocate access strategy in parallel VACUUM workers. Currently, parallel vacuum workers don't use any buffer access strategy. We can fix it either by propagating the access strategy from a leader or allow each worker to use BAS_VACUUM access strategy type. We don't see much use in propagating this information from leader as both leader and workers are supposed to use the same strategy. We might want to use a different access strategy for leader and workers but that would be a separate optimization not suitable for back-branches. This has been fixed in HEAD as commit f6b8f19a08. Author: Amit Kapila Reviewed-by: Sawada Masahiko, Bharath Rupireddy Discussion: https://postgr.es/m/CAA4eK1KbmJgRV2W3BbzRnKUSrukN7SbqBBriC4RDB5KBhopkGQ@mail.gmail.com 12 April 2021, 03:33:16 UTC
be79deb Fix out-of-bound memory access for interval -> char conversion Using Roman numbers (via "RM" or "rm") for a conversion to calculate a number of months has never considered the case of negative numbers, where a conversion could easily cause out-of-bound memory accesses. The conversions in themselves were not completely consistent either, as specifying 12 would result in NULL, but it should mean XII. This commit reworks the conversion calculation to have a more consistent behavior: - If the number of months and years is 0, return NULL. - If the number of months is positive, return the exact month number. - If the number of months is negative, do a backward calculation, with -1 meaning December, -2 November, etc. Reported-by: Theodor Arsenij Larionov-Trichkin Author: Julien Rouhaud Discussion: https://postgr.es/m/16953-f255a18f8c51f1d5@postgresql.org backpatch-through: 9.6 12 April 2021, 02:31:26 UTC
19b28d6 Fix typo Author: Daniel Westermann Backpatch-through: 9.6 Discussion: https://postgr.es/m/GV0P278MB0483A7AA85BAFCC06D90F453D2739@GV0P278MB0483.CHEP278.PROD.OUTLOOK.COM 09 April 2021, 10:41:01 UTC
dc6d285 Fix typos and grammar in documentation and code comments Comment fixes are applied on HEAD, and documentation improvements are applied on back-branches where needed. Author: Justin Pryzby Discussion: https://postgr.es/m/20210408164008.GJ6592@telsasoft.com Backpatch-through: 9.6 09 April 2021, 04:53:17 UTC
1aad1d1 Don't add non-existent pages to bitmap from BRIN The code in bringetbitmap() simply added the whole matching page range to the TID bitmap, as determined by pages_per_range, even if some of the pages were beyond the end of the heap. The query then might fail with an error like this: ERROR: could not open file "base/20176/20228.2" (target block 262144): previous segment is only 131021 blocks In this case, the relation has 262093 pages (131072 and 131021 pages), but we're trying to acess block 262144, i.e. first block of the 3rd segment. At that point _mdfd_getseg() notices the preceding segment is incomplete, and fails. Hitting this in practice is rather unlikely, because: * Most indexes use power-of-two ranges, so segments and page ranges align perfectly (segment end is also a page range end). * The table size has to be just right, with the last segment being almost full - less than one page range from full segment, so that the last page range actually crosses the segment boundary. * Prefetch has to be enabled. The regular page access checks that pages are not beyond heap end, but prefetch does not. On older releases (before 12) the execution stops after hitting the first non-existent page, so the prefetch distance has to be sufficient to reach the first page in the next segment to trigger the issue. Since 12 it's enough to just have prefetch enabled, the prefetch distance does not matter. Fixed by not adding non-existent pages to the TID bitmap. Backpatch all the way back to 9.6 (BRIN indexes were introduced in 9.5, but that release is EOL). Backpatch-through: 9.6 07 April 2021, 13:59:30 UTC
b556199 Fix potential rare failure in the kerberos TAP tests Instead of writing a query to psql's stdin, which can cause a failure where psql exits before writing, reporting a write failure with a broken pipe, this changes the logic to use -c. This was not seen in the buildfarm as no animals with a sensitive environment are running the kerberos tests, but let's be safe. HEAD is able to handle the situation as of 6d41dd0 for all the test suites doing connection checks. f44b9b6 has fixed the same problem for the LDAP tests. Discussion: https://postgr.es/m/YGu7ceWAiSNQDgH5@paquier.xyz Backpatch-through: 11 07 April 2021, 10:58:54 UTC
e7bcfd7 Shut down transaction tracking at startup process exit. Maxim Orlov reported that the shutdown of standby server could result in the following assertion failure. The cause of this issue was that, when the shutdown caused the startup process to exit, recovery-time transaction tracking was not shut down even if it's already initialized, and some locks the tracked transactions were holding could not be released. At this situation, if other process was invoked and the PGPROC entry that the startup process used was assigned to it, it found such unreleased locks and caused the assertion failure, during the initialization of it. TRAP: FailedAssertion("SHMQueueEmpty(&(MyProc->myProcLocks[i]))" This commit fixes this issue by making the startup process shut down transaction tracking and release all locks, at the exit of it. Back-patch to all supported branches. Reported-by: Maxim Orlov Author: Fujii Masao Reviewed-by: Maxim Orlov Discussion: https://postgr.es/m/ad4ce692cc1d89a093b471ab1d969b0b@postgrespro.ru 05 April 2021, 17:27:11 UTC
fd91fed Fix more confusion in SP-GiST. spg_box_quad_leaf_consistent unconditionally returned the leaf datum as leafValue, even though in its usage for poly_ops that value is of completely the wrong type. In versions before 12, that was harmless because the core code did nothing with leafValue in non-index-only scans ... but since commit 2a6368343, if we were doing a KNN-style scan, spgNewHeapItem would unconditionally try to copy the value using the wrong datatype parameters. Said copying is a waste of time and space if we're not going to return the data, but it accidentally failed to fail until I fixed the datatype confusion in ac9099fc1. Hence, change spgNewHeapItem to not copy the datum unless we're actually going to return it later. This saves cycles and dodges the question of whether lossy opclasses are returning the right type. Also change spg_box_quad_leaf_consistent to not return data that might be of the wrong type, as insurance against somebody introducing a similar bug into the core code in future. It seems like a good idea to back-patch these two changes into v12 and v13, although I'm afraid to change spgNewHeapItem's mistaken idea of which datatype to use in those branches. Per buildfarm results from ac9099fc1. Discussion: https://postgr.es/m/3728741.1617381471@sss.pgh.pa.us 04 April 2021, 21:57:07 UTC
45aea47 Use macro MONTHS_PER_YEAR instead of '12' in /ecpg/pgtypeslib All other places already use MONTHS_PER_YEAR appropriately. Backpatch-through: 9.6 02 April 2021, 20:42:29 UTC
4a1b95f Clarify documentation of RESET ROLE Command-line options, or previous "ALTER (ROLE|DATABASE) ... SET ROLE ..." commands, can change the value of the default role for a session. In the presence of one of these, RESET ROLE will change the current user identifier to the default role rather than the session user identifier. Fix the documentation to reflect this reality. Backpatch to all supported versions. Author: Nathan Bossart Reviewed-By: Laurenz Albe, David G. Johnston, Joe Conway Reported by: Nathan Bossart Discussion: https://postgr.es/m/flat/925134DB-8212-4F60-8AB1-B1231D750CB4%40amazon.com Backpatch-through: 9.6 02 April 2021, 17:48:45 UTC
1041643 pg_checksums: Fix progress reporting. pg_checksums uses two counters, total size and current size, to calculate the progress. Previously the progress that pg_checksums reported could not reach 100% at the end. The cause of this issue was that the sizes of only pages excluding new ones in each file were counted as the current size while the size of each file is counted as the total size. That is, the total size of all new pages could be reported as the difference between the total size and current size. This commit fixes this issue by making pg_checksums count the sizes of all pages including new ones in each file as the current size. Back-patch to v12 where progress reporting was added to pg_checksums. Reported-by: Shinya Kato Author: Shinya Kato Reviewed-by: Fujii Masao Discussion: https://postgr.es/m/TYAPR01MB289656B1ACA0A5E7CAD07BE3C47A9@TYAPR01MB2896.jpnprd01.prod.outlook.com 02 April 2021, 15:07:49 UTC
f8c2d49 doc: Clarify how to generate backup files with non-exclusive backups The current instructions describing how to write the backup_label and tablespace_map files are confusing. For example, opening a file in text mode on Windows and copy-pasting the file's contents would result in a failure at recovery because of the extra CRLF characters generated. The documentation was not stating that clearly, and per discussion this is not considered as a supported scenario. This commit extends a bit the documentation to mention that it may be required to open the file in binary mode before writing its data. Reported-by: Wang Shenhao Author: David Steele Reviewed-by: Andrew Dunstan, Magnus Hagander Discussion: https://postgr.es/m/8373f61426074f2cb6be92e02f838389@G08CNEXMBPEKD06.g08.fujitsu.local Backpatch-through: 9.6 02 April 2021, 07:37:07 UTC
75e66ee doc: mention that intervening major releases can be skipped Also mention that you should read the intervening major releases notes. This change was also applied to the website. Discussion: https://postgr.es/m/20210330144949.GA8259@momjian.us Backpatch-through: 9.6 02 April 2021, 01:17:24 UTC
89937d0 Improve stability of test with vacuum_truncate in reloptions.sql This test has been using a simple VACUUM with pg_relation_size() to check if a relation gets physically truncated or not, but forgot the fact that some concurrent activity, like checkpoint buffer writes, could cause some pages to be skipped. The second test enabling vacuum_truncate could fail, seeing a non-empty relation. The first test would not have failed, but could finish by testing a behavior different than the one aimed for. Both tests gain a FREEZE option, to make the vacuums more aggressive and prevent page skips. This is similar to the issues fixed in c2dc1a7. Author: Arseny Sher Reviewed-by: Masahiko Sawada Discussion: https://postgr.es/m/87tuotr2hh.fsf@ars-thinkpad backpatch-through: 12 02 April 2021, 00:44:50 UTC
35421a4 Fix pg_restore's misdesigned code for detecting archive file format. Despite the clear comments pointing out that the duplicative code segments in ReadHead() and _discoverArchiveFormat() needed to be in sync, they were not: the latter did not bother to apply any of the sanity checks in the former. We'd missed noticing this partly because none of those checks would fail in scenarios we customarily test, and partly because the oversight would be masked if both segments execute, which they would in cases other than needing to autodetect the format of a non-seekable stdin source. However, in a case meeting all these requirements --- for example, trying to read a newer-than-supported archive format from non-seekable stdin --- pg_restore missed applying the version check and would likely dump core or otherwise misbehave. The whole thing is silly anyway, because there seems little reason to duplicate the logic beyond the one-line verification that the file starts with "PGDMP". There seems to have been an undocumented assumption that multiple major formats (major enough to require separate reader modules) would nonetheless share the first half-dozen fields of the custom-format header. This seems unlikely, so let's fix it by just nuking the duplicate logic in _discoverArchiveFormat(). Also get rid of the pointless attempt to seek back to the start of the file after successful autodetection. That wastes cycles and it means we have four behaviors to verify not two. Per bug #16951 from Sergey Koposov. This has been broken for decades, so back-patch to all supported versions. Discussion: https://postgr.es/m/16951-a4dd68cf0de23048@postgresql.org 01 April 2021, 17:34:16 UTC
876ecfb doc: Clarify use of ACCESS EXCLUSIVE lock in various sections Some sections of the documentation used "exclusive lock" to describe that an ACCESS EXCLUSIVE lock is taken during a given operation. This can be confusing to the reader as ACCESS SHARE is allowed with an EXCLUSIVE lock is used, but that would not be the case with what is described on those parts of the documentation. Author: Greg Rychlewski Discussion: https://postgr.es/m/CAKemG7VptD=7fNWckFMsMVZL_zzvgDO6v2yVmQ+ZiBfc_06kCQ@mail.gmail.com Backpatch-through: 9.6 01 April 2021, 06:28:45 UTC
89e383b Add a docs section for obsoleted and renamed functions and settings The new appendix groups information on renamed or removed settings, commands, etc into an out-of-the-way part of the docs. The original id elements are retained in each subsection to ensure that the same filenames are produced for HTML docs. This prevents /current/ links on the web from breaking, and allows users of the web docs to follow links from old version pages to info on the changes in the new version. Prior to this change, a link to /current/ for renamed sections like the recovery.conf docs would just 404. Similarly if someone searched for recovery.conf they would find the pg11 docs, but there would be no /12/ or /current/ link, so they couldn't easily find out that it was removed in pg12 or how to adapt. Index entries are also added so that there's a breadcrumb trail for users to follow when they know the old name, but not what we changed it to. So a user who is trying to find out how to set standby_mode in PostgreSQL 12+, or where pg_resetxlog went, now has more chance of finding that information. Craig Ringer and Stephen Frost Reviewed-by: Euler Taveira Discussion: https://postgr.es/m/CAGRY4nzPNOyYQ_1-pWYToUVqQ0ThqP5jdURnJMZPm539fdizOg%40mail.gmail.com Backpatch-through: 10 31 March 2021, 20:23:18 UTC
78f8b09 Update obsolete comment. Back-patch to all supported branches. Author: Etsuro Fujita Discussion: https://postgr.es/m/CAPmGK17DwzaSf%2BB71dhL2apXdtG-OmD6u2AL9Cq2ZmAR0%2BzapQ%40mail.gmail.com 30 March 2021, 04:00:02 UTC
f50dc2c psql: call clearerr() just before printing We were never doing clearerr() on the output stream, which results in a message being printed after each result once an EOF is seen: could not print result table: Success This message was added by commit b03436994bcc (in the pg13 era); before that, the error indicator would never be examined. So backpatch only that far back, even though the actual bug (to wit: the fact that the error indicator is never cleared) is older. 29 March 2021, 21:34:39 UTC
092d3db doc: Define TLS as an acronym Commit c6763156589 added an acronym reference for "TLS" but the definition was never added. Author: Daniel Gustafsson Reviewed-by: Michael Paquier Backpatch-through: 9.6 Discussion: https://postgr.es/m/27109504-82DB-41A8-8E63-C0498314F5B0@yesql.se 28 March 2021, 15:28:12 UTC
67251c8 Fix ndistinct estimates with system attributes When estimating the number of groups using extended statistics, the code was discarding information about system attributes. This led to strange situation that SELECT 1 FROM t GROUP BY ctid; could have produced higher estimate (equal to pg_class.reltuples) than SELECT 1 FROM t GROUP BY a, b, ctid; with extended statistics on (a,b). Fixed by retaining information about the system attribute. Backpatch all the way to 10, where extended statistics were introduced. Author: Tomas Vondra Backpatch-through: 10 26 March 2021, 21:37:45 UTC
c862299 Document lock obtained during partition detach On partition detach, we acquire a SHARE lock on all tables that reference the partitioned table that we're detaching a partition from, but failed to document this fact. My oversight in commit f56f8f8da6af. Repair. Backpatch to 12. Author: Álvaro Herrera <alvherre@alvh.no-ip.org> Discussion: https://postgr.es/m/20210325180244.GA12738@alvherre.pgsql 25 March 2021, 19:30:22 UTC
a00c306 Remove StoreSingleInheritance reimplementation I introduced this duplicate code in commit 8b08f7d4820f for no good reason. Remove it, and backpatch to 11 where it was introduced. Author: Álvaro Herrera <alvherre@alvh.no-ip.org> 25 March 2021, 13:47:38 UTC
092c077 Fix bug in WAL replay of COMMIT_TS_SETTS record. Previously the WAL replay of COMMIT_TS_SETTS record called TransactionTreeSetCommitTsData() with the argument write_xlog=true, which generated and wrote new COMMIT_TS_SETTS record. This should not be acceptable because it's during recovery. This commit fixes the WAL replay of COMMIT_TS_SETTS record so that it calls TransactionTreeSetCommitTsData() with write_xlog=false and doesn't generate new WAL during recovery. Back-patch to all supported branches. Reported-by: lx zou <zoulx1982@163.com> Author: Fujii Masao Reviewed-by: Alvaro Herrera Discussion: https://postgr.es/m/16931-620d0f2fdc6108f1@postgresql.org 25 March 2021, 02:24:20 UTC
c6eac71 Fix psql's \connect command some more. Jasen Betts reported yet another unintended side effect of commit 85c54287a: reconnecting with "\c service=whatever" did not have the expected results. The reason is that starting from the output of PQconndefaults() effectively allows environment variables (such as PGPORT) to override entries in the service file, whereas the normal priority is the other way around. Not using PQconndefaults at all would require yet a third main code path in do_connect's parameter setup, so I don't really want to fix it that way. But we can have the logic effectively ignore all the default values for just a couple more lines of code. This patch doesn't change the behavior for "\c -reuse-previous=on service=whatever". That remains significantly different from before 85c54287a, because many more parameters will be re-used, and thus not be possible for service entries to replace. But I think this is (mostly?) intentional. In any case, since libpq does not report where it got parameter values from, it's hard to do differently. Per bug #16936 from Jasen Betts. As with the previous patches, back-patch to all supported branches. (9.5 is unfortunately now out of support, so this won't get fixed there.) Discussion: https://postgr.es/m/16936-3f524322a53a29f0@postgresql.org 23 March 2021, 18:27:50 UTC
d4791ac Avoid possible crash while finishing up a heap rewrite. end_heap_rewrite was not careful to ensure that the target relation is open at the smgr level before performing its final smgrimmedsync. In ordinary cases this is no problem, because it would have been opened earlier during the rewrite. However a crash can be reproduced by re-clustering an empty table with CLOBBER_CACHE_ALWAYS enabled. Although that exact scenario does not crash in v13, I think that's a chance result of unrelated planner changes, and the problem is likely still reachable with other test cases. The true proximate cause of this failure is commit c6b92041d, which replaced a call to heap_sync (which was careful about opening smgr) with a direct call to smgrimmedsync. Hence, back-patch to v13. Amul Sul, per report from Neha Sharma; cosmetic changes and test case by me. Discussion: https://postgr.es/m/CANiYTQsU7yMFpQYnv=BrcRVqK_3U3mtAzAsJCaqtzsDHfsUbdQ@mail.gmail.com 23 March 2021, 15:24:16 UTC
1faaad2 Use correct spelling of statistics kind A couple error messages and comments used 'statistic kind', not the correct 'statistics kind'. Fix and backpatch all the way back to 10, where extended statistics were introduced. Backpatch-through: 10 23 March 2021, 04:00:19 UTC
34279fd pg_waldump: Fix bug in per-record statistics. pg_waldump --stats=record identifies a record by a combination of the RmgrId and the four bits of the xl_info field of the record. But XACT records use the first bit of those four bits for an optional flag variable, and the following three bits for the opcode to identify a record. So previously the same type of XACT record could have different four bits (three bits are the same but the first one bit is different), and which could cause pg_waldump --stats=record to show two lines of per-record statistics for the same XACT record. This is a bug. This commit changes pg_waldump --stats=record so that it processes only XACT record differently, i.e., filters the opcode out of xl_info and uses a combination of the RmgrId and those three bits as the identifier of a record, only for XACT record. For other records, the four bits of the xl_info field are still used. Back-patch to all supported branches. Author: Kyotaro Horiguchi Reviewed-by: Shinya Kato, Fujii Masao Discussion: https://postgr.es/m/2020100913412132258847@highgo.ca 23 March 2021, 00:54:38 UTC
78c24e9 Fix concurrency issues with WAL segment recycling on Windows This commit is mostly a revert of aaa3aed, that switched the routine doing the internal renaming of recycled WAL segments to use on Windows a combination of CreateHardLinkA() plus unlink() instead of rename(). As reported by several users of Postgres 13, this is causing concurrency issues when manipulating WAL segments, mostly in the shape of the following error: LOG: could not rename file "pg_wal/000000XX000000YY000000ZZ": Permission denied This moves back to a logic where a single rename() (well, pgrename() for Windows) is used. This issue has proved to be hard to hit when I tested it, facing it only once with an archive_command that was not able to do its work, so it is environment-sensitive. The reporters of this issue have been able to confirm that the situation improved once we switched back to a single rename(). In order to check things, I have provided to the reporters a patched build based on 13.2 with aaa3aed reverted, to test if the error goes away, and an unpatched build of 13.2 to test if the error still showed up (just to make sure that I did not mess up my build process). Extra thanks to Fujii Masao for pointing out what looked like the culprit commit, and to all the reporters for taking the time to test what I have sent them. Reported-by: Andrus, Guy Burgess, Yaroslav Pashinsky, Thomas Trenz Reviewed-by: Tom Lane, Andres Freund Discussion: https://postgr.es/m/3861ff1e-0923-7838-e826-094cc9bef737@hot.ee Discussion: https://postgr.es/m/16874-c3eecd319e36a2bf@postgresql.org Discussion: https://postgr.es/m/095ccf8d-7f58-d928-427c-b17ace23cae6@burgess.co.nz Discussion: https://postgr.es/m/16927-67c570d968c99567%40postgresql.org Discussion: https://postgr.es/m/YFBcRbnBiPdGZvfW@paquier.xyz Backpatch-through: 13 22 March 2021, 05:02:36 UTC
48664e4 Make a test endure log_error_verbosity=verbose. Back-patch to v13, which introduced the test code in question. 22 March 2021, 02:09:35 UTC
7bdc717 Fix new TAP test for 2PC transactions and PITRs on Windows The test added by 595b9cb forgot that on Windows it is necessary to set up pg_hba.conf (see PostgresNode::set_replication_conf) with a specific entry or base backups fail. Any node that requires to support replication just needs to pass down allows_streaming at initialization. This updates the test to do so. Simplify things a bit while on it. Per buildfarm member fairywren. Any Windows hosts running this test would have failed, and I have reproduced the problem as well. Backpatch-through: 10 22 March 2021, 00:51:14 UTC
6e5ce88 Fix timeline assignment in checkpoints with 2PC transactions Any transactions found as still prepared by a checkpoint have their state data read from the WAL records generated by PREPARE TRANSACTION before being moved into their new location within pg_twophase/. While reading such records, the WAL reader uses the callback read_local_xlog_page() to read a page, that is shared across various parts of the system. This callback, since 1148e22a, has introduced an update of ThisTimeLineID when reading a record while in recovery, which is potentially helpful in the context of cascading WAL senders. This update of ThisTimeLineID interacts badly with the checkpointer if a promotion happens while some 2PC data is read from its record, as, by changing ThisTimeLineID, any follow-up WAL records would be written to an timeline older than the promoted one. This results in consistency issues. For instance, a subsequent server restart would cause a failure in finding a valid checkpoint record, resulting in a PANIC, for instance. This commit changes the code reading the 2PC data to reset the timeline once the 2PC record has been read, to prevent messing up with the static state of the checkpointer. It would be tempting to do the same thing directly in read_local_xlog_page(). However, based on the discussion that has led to 1148e22a, users may rely on the updates of ThisTimeLineID when a WAL record page is read in recovery, so changing this callback could break some cases that are working currently. A TAP test reproducing the issue is added, relying on a PITR to precisely trigger a promotion with a prepared transaction still tracked. Per discussion with Heikki Linnakangas, Kyotaro Horiguchi, Fujii Masao and myself. Author: Soumyadeep Chakraborty, Jimmy Yih, Kevin Yeap Discussion: https://postgr.es/m/CAE-ML+_EjH_fzfq1F3RJ1=XaaNG=-Jz-i3JqkNhXiLAsM3z-Ew@mail.gmail.com Backpatch-through: 10 21 March 2021, 23:31:01 UTC
4b41f69 Fix memory leak when rejecting bogus DH parameters. While back-patching e0e569e1d, I noted that there were some other places where we ought to be applying DH_free(); namely, where we load some DH parameters from a file and then reject them as not being sufficiently secure. While it seems really unlikely that anybody would hit these code paths in production, let alone do so repeatedly, let's fix it for consistency. Back-patch to v10 where this code was introduced. Discussion: https://postgr.es/m/16160-18367e56e9a28264@postgresql.org 20 March 2021, 16:47:35 UTC
1235483 Don't leak malloc'd error string in libpqrcv_check_conninfo(). We leaked the error report from PQconninfoParse, when there was one. It seems unlikely that real usage patterns would repeat the failure often enough to create serious bloat, but let's back-patch anyway to keep the code similar in all branches. Found via valgrind testing. Back-patch to v10 where this code was added. Discussion: https://postgr.es/m/3816764.1616104288@sss.pgh.pa.us 19 March 2021, 02:21:58 UTC
642b0b6 Don't leak malloc'd strings when a GUC setting is rejected. Because guc.c prefers to keep all its string values in malloc'd not palloc'd storage, it has to be more careful than usual to avoid leaks. Error exits out of string GUC hook checks failed to clear the proposed value string, and error exits out of ProcessGUCArray() failed to clear the malloc'd results of ParseLongOption(). Found via valgrind testing. This problem is ancient, so back-patch to all supported branches. Discussion: https://postgr.es/m/3816764.1616104288@sss.pgh.pa.us 19 March 2021, 02:09:41 UTC
eba9395 Don't leak compiled regex(es) when an ispell cache entry is dropped. The text search cache mechanisms assume that we can clean up an invalidated dictionary cache entry simply by resetting the associated long-lived memory context. However, that does not work for ispell affixes that make use of regular expressions, because the regex library deals in plain old malloc. Hence, we leaked compiled regex(es) any time we dropped such a cache entry. That could quickly add up, since even a fairly trivial regex can use up tens of kB, and a large one can eat megabytes. Add a memory context callback to ensure that a regex gets freed when its owning cache entry is cleared. Found via valgrind testing. This problem is ancient, so back-patch to all supported branches. Discussion: https://postgr.es/m/3816764.1616104288@sss.pgh.pa.us 19 March 2021, 01:44:43 UTC
ea3989f Don't run RelationInitTableAccessMethod in a long-lived context. Some code paths in this function perform syscache lookups, which can lead to table accesses and possibly leakage of cruft into the caller's context. If said context is CacheMemoryContext, we eventually will have visible bloat. But fixing this is no harder than moving one memory context switch step. (The other callers don't have a problem.) Andres Freund and I independently found this via valgrind testing. Back-patch to v12 where this code was added. Discussion: https://postgr.es/m/20210317023101.anvejcfotwka6gaa@alap3.anarazel.de Discussion: https://postgr.es/m/3816764.1616104288@sss.pgh.pa.us 19 March 2021, 00:50:56 UTC
5368369 Don't leak rd_statlist when a relcache entry is dropped. Although these lists are usually NIL, and even when not empty are unlikely to be large, constant relcache update traffic could eventually result in visible bloat of CacheMemoryContext. Found via valgrind testing. Back-patch to v10 where this field was added. Discussion: https://postgr.es/m/3816764.1616104288@sss.pgh.pa.us 19 March 2021, 00:37:09 UTC
0c3079e Fix function name in error hint pg_read_file() is the function that's in core, pg_file_read() is in adminpack. But when using pg_file_read() in adminpack it calls the *C* level function pg_read_file() in core, which probably threw the original author off. But the error hint should be about the SQL function. Reported-By: Sergei Kornilov Backpatch-through: 11 Discussion: https://postgr.es/m/373021616060475@mail.yandex.ru 18 March 2021, 10:22:37 UTC
b77e5d7 Prevent buffer overrun in read_tablespace_map(). Robert Foggia of Trustwave reported that read_tablespace_map() fails to prevent an overrun of its on-stack input buffer. Since the tablespace map file is presumed trustworthy, this does not seem like an interesting security vulnerability, but still we should fix it just in the name of robustness. While here, document that pg_basebackup's --tablespace-mapping option doesn't work with tar-format output, because it doesn't. To make it work, we'd have to modify the tablespace_map file within the tarball sent by the server, which might be possible but I'm not volunteering. (Less-painful solutions would require changing the basebackup protocol so that the source server could adjust the map. That's not very appetizing either.) 17 March 2021, 20:10:37 UTC
69739ec Revert "Fix race in Parallel Hash Join batch cleanup." This reverts commit 4e0f0995e923948631c4114ab353b256b51b58ad. Discussion: https://postgr.es/m/CA%2BhUKGJmcqAE3MZeDCLLXa62cWM0AJbKmp2JrJYaJ86bz36LFA%40mail.gmail.com 17 March 2021, 12:00:56 UTC
4e0f099 Fix race in Parallel Hash Join batch cleanup. With very unlucky timing and parallel_leader_participation off, PHJ could attempt to access per-batch state just as it was being freed. There was code intended to prevent that by checking for a cleared pointer, but it was buggy. Fix, by introducing an extra barrier phase. The new phase PHJ_BUILD_RUNNING means that it's safe to access the per-batch state to find a batch to help with, and PHJ_BUILD_DONE means that it is too late. The last to detach will free the array of per-batch state as before, but now it will also atomically advance the phase at the same time, so that late attachers can avoid the hazard, without the data race. This mirrors the way per-batch hash tables are freed (see phases PHJ_BATCH_PROBING and PHJ_BATCH_DONE). Revealed by a one-off build farm failure, where BarrierAttach() failed a sanity check assertion, because the memory had been clobbered by dsa_free(). Back-patch to 11, where the code arrived. Reported-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/20200929061142.GA29096%40paquier.xyz 17 March 2021, 05:06:52 UTC
4d072bf Avoid corner-case memory leak in SSL parameter processing. After reading the root cert list from the ssl_ca_file, immediately install it as client CA list of the new SSL context. That gives the SSL context ownership of the list, so that SSL_CTX_free will free it. This avoids a permanent memory leak if we fail further down in be_tls_init(), which could happen if bogus CRL data is offered. The leak could only amount to something if the CRL parameters get broken after server start (else we'd just quit) and then the server is SIGHUP'd many times without fixing the CRL data. That's rather unlikely perhaps, but it seems worth fixing, if only because the code is clearer this way. While we're here, add some comments about the memory management aspects of this logic. Noted by Jelte Fennema and independently by Andres Freund. Back-patch to v10; before commit de41869b6 it doesn't matter, since we'd not re-execute this code during SIGHUP. Discussion: https://postgr.es/m/16160-18367e56e9a28264@postgresql.org 16 March 2021, 20:02:49 UTC
6ed0599 Fix race condition in psql \e's detection of file modification. psql's editing commands decide whether the user has edited the file by checking for change of modification timestamp. This is probably fine for a pre-existing file, but with a temporary file that is created within the command, it's possible for a fast typist to save-and-exit in less than the one-second granularity of stat(2) timestamps. On Windows FAT filesystems the granularity is even worse, 2 seconds, making the race a bit easier to hit. To fix, try to set the temp file's mod time to be two seconds ago. It's unlikely this would fail, but then again the race condition itself is unlikely, so just ignore any error. Also, we might as well check the file size as well as its mod time. While this is a difficult bug to hit, it still seems worth back-patching, to ensure that users' edits aren't lost. Laurenz Albe, per gripe from Jacob Champion; based on fix suggestions from Jacob and myself Discussion: https://postgr.es/m/0ba3f2a658bac6546d9934ab6ba63a805d46a49b.camel@cybertec.at 12 March 2021, 17:20:15 UTC
8a22977 Forbid marking an identity column as nullable. GENERATED ALWAYS AS IDENTITY implies NOT NULL, but the code failed to complain if you overrode that with "GENERATED ALWAYS AS IDENTITY NULL". One might think the old behavior was a feature, but it was inconsistent because the outcome varied depending on the order of the clauses, so it seems to have been just an oversight. Per bug #16913 from Pavel Boev. Back-patch to v10 where identity columns were introduced. Vik Fearing (minor tweaks by me) Discussion: https://postgr.es/m/16913-3b5198410f67d8c6@postgresql.org 12 March 2021, 16:08:42 UTC
3580b4a Re-simplify management of inStart in pqParseInput3's subroutines. Commit 92785dac2 copied some logic related to advancement of inStart from pqParseInput3 into getRowDescriptions and getAnotherTuple, because it wanted to allow user-defined row processor callbacks to potentially longjmp out of the library, and inStart would have to be updated before that happened to avoid an infinite loop. We later decided that that API was impossibly fragile and reverted it, but we didn't undo all of the related code changes, and this bit of messiness survived. Undo it now so that there's just one place in pqParseInput3's processing where inStart is advanced; this will simplify addition of better tracing support. getParamDescriptions had grown similar processing somewhere along the way (not in 92785dac2; I didn't track down just when), but it's actually buggy because its handling of corrupt-message cases seems to have been copied from the v2 logic where we lacked a known message length. The cases where we "goto not_enough_data" should not simply return EOF, because then we won't consume the message, potentially creating an infinite loop. That situation now represents a definitively corrupt message, and we should report it as such. Although no field reports of getParamDescriptions getting stuck in a loop have been seen, it seems appropriate to back-patch that fix. I chose to back-patch all of this to keep the logic looking more alike in supported branches. Discussion: https://postgr.es/m/2217283.1615411989@sss.pgh.pa.us 11 March 2021, 19:43:45 UTC
635048e Doc: B-Tree only has one additional parameter. Oversight in commit 9f3665fb. Backpatch: 13-, just like commit 9f3665fb. 11 March 2021, 06:10:34 UTC
4ecf359 tutorial: land height is "elevation", not "altitude" This is a follow-on patch to 92c12e46d5. In that patch, we renamed "altitude" to "elevation" in the docs, based on these details: https://mapscaping.com/blogs/geo-candy/what-is-the-difference-between-elevation-relief-and-altitude This renames the tutorial SQL files to match the documentation. Reported-by: max1@inbox.ru Discussion: https://postgr.es/m/161512392887.1046.3137472627109459518@wrigleys.postgresql.org Backpatch-through: 9.6 11 March 2021, 01:25:18 UTC
1fc5a57 VACUUM ANALYZE: Always update pg_class.reltuples. vacuumlazy.c sometimes fails to update pg_class entries for each index (to ensure that pg_class.reltuples is current), even though analyze.c assumed that that must have happened during VACUUM ANALYZE. There are at least a couple of reasons for this. For example, vacuumlazy.c could fail to update pg_class when the index AM indicated that its statistics are merely an estimate, per the contract for amvacuumcleanup() routines established by commit e57345975cf back in 2006. Stop assuming that pg_class must have been updated with accurate statistics within VACUUM ANALYZE -- update pg_class for indexes at the same time as the table relation in all cases. That way VACUUM ANALYZE will never fail to keep pg_class.reltuples reasonably accurate. The only downside of this approach (compared to the old approach) is that it might inaccurately set pg_class.reltuples for indexes whose heap relation ends up with the same inaccurate value anyway. This doesn't seem too bad. We already consistently called vac_update_relstats() (to update pg_class) for the heap/table relation twice during any VACUUM ANALYZE -- once in vacuumlazy.c, and once in analyze.c. We now make sure that we call vac_update_relstats() at least once (though often twice) for each index. This is follow up work to commit 9f3665fb, which dealt with issues in btvacuumcleanup(). Technically this fixes an unrelated issue, though. btvacuumcleanup() no longer provides an accurate num_index_tuples value following commit 9f3665fb (when there was no btbulkdelete() call during the VACUUM operation in question), but hashvacuumcleanup() has worked in the same way for many years now. Author: Peter Geoghegan <pg@bowt.ie> Reviewed-By: Masahiko Sawada <sawada.mshk@gmail.com> Discussion: https://postgr.es/m/CAH2-WzknxdComjhqo4SUxVFk_Q1171GJO2ZgHZ1Y6pion6u8rA@mail.gmail.com Backpatch: 13-, just like commit 9f3665fb. 11 March 2021, 01:07:55 UTC
9663d12 Don't consider newly inserted tuples in nbtree VACUUM. Remove the entire idea of "stale stats" within nbtree VACUUM (stop caring about stats involving the number of inserted tuples). Also remove the vacuum_cleanup_index_scale_factor GUC/param on the master branch (though just disable them on postgres 13). The vacuum_cleanup_index_scale_factor/stats interface made the nbtree AM partially responsible for deciding when pg_class.reltuples stats needed to be updated. This seems contrary to the spirit of the index AM API, though -- it is not actually necessary for an index AM's bulk delete and cleanup callbacks to provide accurate stats when it happens to be inconvenient. The core code owns that. (Index AMs have the authority to perform or not perform certain kinds of deferred cleanup based on their own considerations, such as page deletion and recycling, but that has little to do with pg_class.reltuples/num_index_tuples.) This issue was fairly harmless until the introduction of the autovacuum_vacuum_insert_threshold feature by commit b07642db, which had an undesirable interaction with the vacuum_cleanup_index_scale_factor mechanism: it made insert-driven autovacuums perform full index scans, even though there is no real benefit to doing so. This has been tied to a regression with an append-only insert benchmark [1]. Also have remaining cases that perform a full scan of an index during a cleanup-only nbtree VACUUM indicate that the final tuple count is only an estimate. This prevents vacuumlazy.c from setting the index's pg_class.reltuples in those cases (it will now only update pg_class when vacuumlazy.c had TIDs for nbtree to bulk delete). This arguably fixes an oversight in deduplication-related bugfix commit 48e12913. [1] https://smalldatum.blogspot.com/2021/01/insert-benchmark-postgres-is-still.html Author: Peter Geoghegan <pg@bowt.ie> Reviewed-By: Masahiko Sawada <sawada.mshk@gmail.com> Discussion: https://postgr.es/m/CAD21AoA4WHthN5uU6+WScZ7+J_RcEjmcuH94qcoUPuB42ShXzg@mail.gmail.com Backpatch: 13-, where autovacuum_vacuum_insert_threshold was added. 11 March 2021, 00:26:58 UTC
9a4e4af Doc: improve introductory information about procedures. Clarify the discussion in "User-Defined Procedures", by laying out the key differences between functions and procedures in a bulleted list. Notably, this avoids burying the lede about procedures being able to do transaction control. Make the back-link in the CREATE FUNCTION reference page more prominent, and add one in CREATE PROCEDURE. Per gripe from Guyren Howe. Thanks to David Johnston for discussion. Discussion: https://postgr.es/m/BYAPR03MB4903C53A8BB7EFF5EA289674A6949@BYAPR03MB4903.namprd03.prod.outlook.com 10 March 2021, 16:33:50 UTC
fe2b538 Validate the OID argument of pg_import_system_collations(). "SELECT pg_import_system_collations(0)" caused an assertion failure. With a random nonzero argument --- or indeed with zero, in non-assert builds --- it would happily make pg_collation entries with garbage values of collnamespace. These are harmless as far as I can tell (unless maybe the OID happens to become used for a schema, later on?). In any case this isn't a security issue, since the function is superuser-only. But it seems like a gotcha for unwary DBAs, so let's add a check that the given OID belongs to some schema. Back-patch to v10 where this function was introduced. 08 March 2021, 23:21:51 UTC
21d5a06 Clarify the usage of max_replication_slots on the subscriber side. It was not clear in the docs that the max_replication_slots is also used to track replication origins on the subscriber side. Author: Paul Martinez Reviewed-by: Amit Kapila Backpatch-through: 10 where logical replication was introduced Discussion: https://postgr.es/m/CACqFVBZgwCN_pHnW6dMNCrOS7tiHCw6Retf_=U2Vvj3aUSeATw@mail.gmail.com 03 March 2021, 04:47:47 UTC
b52fd1e Use native path separators to pg_ctl in initdb On Windows, CMD.EXE allegedly does not run a command that uses forward slashes, so let's convert the path to use backslashes instead. Backpatch to 10. Author: Nitin Jadhav <nitinjadhavpostgres@gmail.com> Reviewed-by: Juan José Santamaría Flecha <juanjo.santamaria@gmail.com> Discussion: https://postgr.es/m/CAMm1aWaNDuaPYFYMAqDeJrZmPtNvLcJRS++CcZWY8LT6KcoBZw@mail.gmail.com 02 March 2021, 18:39:34 UTC
621e8a6 Fix duplicated test case in TAP tests of reindexdb The same test for REINDEX (VERBOSE) was done twice, while it is clear that the second test should use --concurrently. Issue introduced in 5dc92b8, for what looks like a copy-paste mistake. Reviewed-by: Mark Dilger Discussion: https://postgr.es/m/A7AE97EA-F4B0-4CAB-8FFF-3FECD31F9D63@enterprisedb.com Backpatch-through: 12 02 March 2021, 04:19:07 UTC
2688852 Fix use-after-free bug with AfterTriggersTableData.storeslot AfterTriggerSaveEvent() wrongly allocates the slot in execution-span memory context, whereas the correct thing is to allocate it in a transaction-span context, because that's where the enclosing AfterTriggersTableData instance belongs into. Backpatch to 12 (the test back to 11, where it works well with no code changes, and it's good to have to confirm that the case was previously well supported); this bug seems introduced by commit ff11e7f4b9ae. Reported-by: Bertrand Drouvot <bdrouvot@amazon.com> Author: Amit Langote <amitlangote09@gmail.com> Discussion: https://postgr.es/m/39a71864-b120-5a5c-8cc5-c632b6f16761@amazon.com 27 February 2021, 21:09:15 UTC
5744931 Doc: further clarify libpq's description of connection string URIs. Break the synopsis into named parts to make it less confusing. Make more than zero effort at applying SGML markup. Do a bit of copy-editing of nearby text. The synopsis revision is by Alvaro Herrera and Paul Förster, the rest is my fault. Back-patch to v10 where multi-host connection strings appeared. Discussion: https://postgr.es/m/6E752D6B-487C-463E-B6E2-C32E7FB007EA@gmail.com 26 February 2021, 20:24:00 UTC
49076fd Fix list-manipulation bug in WITH RECURSIVE processing. makeDependencyGraphWalker and checkWellFormedRecursionWalker thought they could hold onto a pointer to a list's first cons cell while the list was modified by recursive calls. That was okay when the cons cell was actually separately palloc'd ... but since commit 1cff1b95a, it's quite unsafe, leading to core dumps or incorrect complaints of faulty WITH nesting. In the field this'd require at least a seven-deep WITH nest to cause an issue, but enabling DEBUG_LIST_MEMORY_USAGE allows the bug to be seen with lesser nesting depths. Per bug #16801 from Alexander Lakhin. Back-patch to v13. Michael Paquier and Tom Lane Discussion: https://postgr.es/m/16801-393c7922143eaa4d@postgresql.org 26 February 2021, 01:47:32 UTC
back to top