https://github.com/postgres/postgres

sort by:
Revision Author Date Message Commit Date
553e576 Stamp 9.3.9. 09 June 2015, 19:31:32 UTC
d7705f7 Release notes for 9.4.4, 9.3.9, 9.2.13, 9.1.18, 9.0.22. 09 June 2015, 18:33:43 UTC
e7da27c Report more information if pg_perm_setlocale() fails at startup. We don't know why a few Windows users have seen this fail, but the taciturnity of the error message certainly isn't helping debug it. Let's at least find out which LC category isn't working. 09 June 2015, 17:37:08 UTC
82f81ba Allow HotStandbyActiveInReplay() to be called in single user mode. HotStandbyActiveInReplay, introduced in 061b079f, only allowed WAL replay to happen in the startup process, missing the single user case. This buglet is fairly harmless as it only causes problems when single user mode in an assertion enabled build is used to replay a btree vacuum record. Backpatch to 9.2. 061b079f was backpatched further, but the assertion was not. 08 June 2015, 12:09:49 UTC
4f2458d Use a safer method for determining whether relcache init file is stale. When we invalidate the relcache entry for a system catalog or index, we must also delete the relcache "init file" if the init file contains a copy of that rel's entry. The old way of doing this relied on a specially maintained list of the OIDs of relations present in the init file: we made the list either when reading the file in, or when writing the file out. The problem is that when writing the file out, we included only rels present in our local relcache, which might have already suffered some deletions due to relcache inval events. In such cases we correctly decided not to overwrite the real init file with incomplete data --- but we still used the incomplete initFileRelationIds list for the rest of the current session. This could result in wrong decisions about whether the session's own actions require deletion of the init file, potentially allowing an init file created by some other concurrent session to be left around even though it's been made stale. Since we don't support changing the schema of a system catalog at runtime, the only likely scenario in which this would cause a problem in the field involves a "vacuum full" on a catalog concurrently with other activity, and even then it's far from easy to provoke. Remarkably, this has been broken since 2002 (in commit 786340441706ac1957a031f11ad1c2e5b6e18314), but we had never seen a reproducible test case until recently. If it did happen in the field, the symptoms would probably involve unexpected "cache lookup failed" errors to begin with, then "could not open file" failures after the next checkpoint, as all accesses to the affected catalog stopped working. Recovery would require manually removing the stale "pg_internal.init" file. To fix, get rid of the initFileRelationIds list, and instead consult syscache.c's list of relations used in catalog caches to decide whether a relation is included in the init file. This should be a tad more efficient anyway, since we're replacing linear search of a list with ~100 entries with a binary search. It's a bit ugly that the init file contents are now so directly tied to the catalog caches, but in practice that won't make much difference. Back-patch to all supported branches. 07 June 2015, 19:32:09 UTC
ac86eda Fix incorrect order of database-locking operations in InitPostgres(). We should set MyProc->databaseId after acquiring the per-database lock, not beforehand. The old way risked deadlock against processes trying to copy or delete the target database, since they would first acquire the lock and then wait for processes with matching databaseId to exit; that left a window wherein an incoming process could set its databaseId and then block on the lock, while the other process had the lock and waited in vain for the incoming process to exit. CountOtherDBBackends() would time out and fail after 5 seconds, so this just resulted in an unexpected failure not a permanent lockup, but it's still annoying when it happens. A real-world example of a use-case is that short-duration connections to a template database should not cause CREATE DATABASE to fail. Doing it in the other order should be fine since the contract has always been that processes searching the ProcArray for a database ID must hold the relevant per-database lock while searching. Thus, this actually removes the former race condition that required an assumption that storing to MyProc->databaseId is atomic. It's been like this for a long time, so back-patch to all active branches. 05 June 2015, 17:22:27 UTC
2a9b019 Cope with possible failure of the oldest MultiXact to exist. Recent commits, mainly b69bf30b9bfacafc733a9ba77c9587cf54d06c0c and 53bb309d2d5a9432d2602c93ed18e58bd2924e15, introduced mechanisms to protect against wraparound of the MultiXact member space: the number of multixacts that can exist at one time is limited to 2^32, but the total number of members in those multixacts is also limited to 2^32, and older code did not take care to enforce the second limit, potentially allowing old data to be overwritten while it was still needed. Unfortunately, these new mechanisms failed to account for the fact that the code paths in which they run might be executed during recovery or while the cluster was in an inconsistent state. Also, they failed to account for the fact that users who used pg_upgrade to upgrade a PostgreSQL version between 9.3.0 and 9.3.4 might have might oldestMultiXid = 1 in the control file despite the true value being larger. To fix these problems, first, avoid unnecessarily examining the mmembers of MultiXacts when the cluster is not known to be consistent. TruncateMultiXact has done this for a long time, and this patch does not fix that. But the new calls used to prevent member wraparound are not needed until we reach normal running, so avoid calling them earlier. (SetMultiXactIdLimit is actually called before InRecovery is set, so we can't rely on that; we invent our own multixact-specific flag instead.) Second, make failure to look up the members of a MultiXact a non-fatal error. Instead, if we're unable to determine the member offset at which wraparound would occur, postpone arming the member wraparound defenses until we are able to do so. If we're unable to determine the member offset that should force autovacuum, force it continuously until we are able to do so. If we're unable to deterine the member offset at which we should truncate the members SLRU, log a message and skip truncation. An important consequence of these changes is that anyone who does have a bogus oldestMultiXid = 1 value in pg_control will experience immediate emergency autovacuuming when upgrading to a release that contains this fix. The release notes should highlight this fact. If a user has no pg_multixact/offsets/0000 file, but has oldestMultiXid = 1 in the control file, they may wish to vacuum any tables with relminmxid = 1 prior to upgrading in order to avoid an immediate emergency autovacuum after the upgrade. This must be done with a PostgreSQL version 9.3.5 or newer and with vacuum_multixact_freeze_min_age and vacuum_multixact_freeze_table_age set to 0. This patch also adds an additional log message at each database server startup, indicating either that protections against member wraparound have been engaged, or that they have not. In the latter case, once autovacuum has advanced oldestMultiXid to a sane value, the message indicating that the guards have been engaged will appear at the next checkpoint. A few additional messages have also been added at the DEBUG1 level so that the correct operation of this code can be properly audited. Along the way, this patch fixes another, related bug in TruncateMultiXact that has existed since PostgreSQL 9.3.0: when no MultiXacts exist at all, the truncation code looks up NextMultiXactId, which doesn't exist yet. This can lead to TruncateMultiXact removing every file in pg_multixact/offsets instead of keeping one around, as it should. This in turn will cause the database server to refuse to start afterwards. Patch by me. Review by Álvaro Herrera, Andres Freund, Noah Misch, and Thomas Munro. 05 June 2015, 13:34:15 UTC
746092a pgindent run on access/transam/multixact.c This file has been patched over and over, and the differences to master caused by pgindent are annoying enough that it seems saner to make the older branches look the same. Backpatch to 9.3, which is as far back as backpatching of bugfixes is necessary. 04 June 2015, 18:20:28 UTC
f051c16 Fix some issues in pg_class.relminmxid and pg_database.datminmxid documentation. - Correct the name of directory which those catalog columns allow to be shrunk. - Correct the name of symbol which is used as the value of pg_class.relminmxid when the relation is not a table. - Fix "ID ID" typo. Backpatch to 9.3 where those cataog columns were introduced. 04 June 2015, 04:24:06 UTC
d3fdec6 Fix planner's cost estimation for SEMI/ANTI joins with inner indexscans. When the inner side of a nestloop SEMI or ANTI join is an indexscan that uses all the join clauses as indexquals, it can be presumed that both matched and unmatched outer rows will be processed very quickly: for matched rows, we'll stop after fetching one row from the indexscan, while for unmatched rows we'll have an indexscan that finds no matching index entries, which should also be quick. The planner already knew about this, but it was nonetheless charging for at least one full run of the inner indexscan, as a consequence of concerns about the behavior of materialized inner scans --- but those concerns don't apply in the fast case. If the inner side has low cardinality (many matching rows) this could make an indexscan plan look far more expensive than it actually is. To fix, rearrange the work in initial_cost_nestloop/final_cost_nestloop so that we don't add the inner scan cost until we've inspected the indexquals, and then we can add either the full-run cost or just the first tuple's cost as appropriate. Experimentation with this fix uncovered another problem: add_path and friends were coded to disregard cheap startup cost when considering parameterized paths. That's usually okay (and desirable, because it thins the path herd faster); but in this fast case for SEMI/ANTI joins, it could result in throwing away the desired plain indexscan path in favor of a bitmap scan path before we ever get to the join costing logic. In the many-matching-rows cases of interest here, a bitmap scan will do a lot more work than required, so this is a problem. To fix, add a per-relation flag consider_param_startup that works like the existing consider_startup flag, but applies to parameterized paths, and set it for relations that are the inside of a SEMI or ANTI join. To make this patch reasonably safe to back-patch, care has been taken to avoid changing the planner's behavior except in the very narrow case of SEMI/ANTI joins with inner indexscans. There are places in compare_path_costs_fuzzily and add_path_precheck that are not terribly consistent with the new approach, but changing them will affect planner decisions at the margins in other cases, so we'll leave that for a HEAD-only fix. Back-patch to 9.3; before that, the consider_startup flag didn't exist, meaning that the second aspect of the patch would be too invasive. Per a complaint from Peter Holzer and analysis by Tomas Vondra. 03 June 2015, 15:58:47 UTC
00ca051 Stamp 9.3.8. 01 June 2015, 19:08:17 UTC
1ed0411 Release notes for 9.4.3, 9.3.8, 9.2.12, 9.1.17, 9.0.21. Also sneak entries for commits 97ff2a564 et al into the sections for the previous releases in the relevant branches. Those fixes did go out in the previous releases, but missed getting documented. 01 June 2015, 17:27:43 UTC
c2b68b1 initdb -S should now have an explicit check that $PGDATA is valid. The fsync code from the backend essentially assumes that somebody's already validated PGDATA, at least to the extent of it being a readable directory. That's safe enough for initdb's normal code path too, but "initdb -S" doesn't have any other processing at all that touches the target directory. To have reasonable error-case behavior, add a pg_check_dir call. Per gripe from Peter E. 29 May 2015, 21:02:58 UTC
35dd1b5 Remove special cases for ETXTBSY from new fsync'ing logic. The argument that this is a sufficiently-expected case to be silently ignored seems pretty thin. Andres had brought it up back when we were still considering that most fsync failures should be hard errors, and it probably would be legit not to fail hard for ETXTBSY --- but the same is true for EROFS and other cases, which is why we gave up on hard failures. ETXTBSY is surely not a normal case, so logging the failure seems fine from here. 29 May 2015, 19:11:36 UTC
52fc948 Adjust initdb to also not consider fsync'ing failures fatal. Make initdb's version of this logic look as much like the backend's as possible. This is much less critical than in the backend since not so many people use "initdb -S", but we want the same corner-case error handling in both cases. Back-patch to 9.3 where initdb -S option was introduced. Before that, initdb only had to deal with freshly-created data directories, wherein no failures should be expected. Abhijit Menon-Sen 29 May 2015, 17:05:16 UTC
81f3d3b Fix fsync-at-startup code to not treat errors as fatal. Commit 2ce439f3379aed857517c8ce207485655000fc8e introduced a rather serious regression, namely that if its scan of the data directory came across any un-fsync-able files, it would fail and thereby prevent database startup. Worse yet, symlinks to such files also caused the problem, which meant that crash restart was guaranteed to fail on certain common installations such as older Debian. After discussion, we agreed that (1) failure to start is worse than any consequence of not fsync'ing is likely to be, therefore treat all errors in this code as nonfatal; (2) we should not chase symlinks other than those that are expected to exist, namely pg_xlog/ and tablespace links under pg_tblspc/. The latter restriction avoids possibly fsync'ing a much larger part of the filesystem than intended, if the user has left random symlinks hanging about in the data directory. This commit takes care of that and also does some code beautification, mainly moving the relevant code into fd.c, which seems a much better place for it than xlog.c, and making sure that the conditional compilation for the pre_sync_fname pass has something to do with whether pg_flush_data works. I also relocated the call site in xlog.c down a few lines; it seems a bit silly to be doing this before ValidateXLOGDirectoryStructure(). The similar logic in initdb.c ought to be made to match this, but that change is noncritical and will be dealt with separately. Back-patch to all active branches, like the prior commit. Abhijit Menon-Sen and Tom Lane 28 May 2015, 21:33:03 UTC
27bae8d Fix pg_get_functiondef() to print a function's LEAKPROOF property. Seems to have been an oversight in the original leakproofness patch. Per report and patch from Jeevan Chalke. In passing, prettify some awkward leakproof-related code in AlterFunction. 28 May 2015, 15:24:37 UTC
5c8e43a Fix portability issue in isolationtester grammar. specparse.y and specscanner.l used "string" as a token name. Now, bison likes to define each token name as a macro for the token code it assigns, which means those names are basically off-limits for any other use within the grammar file or included headers. So names as generic as "string" are dangerous. This is what was causing the recent failures on protosciurus: some versions of Solaris' sys/kstat.h use "string" as a field name. With late-model bison we don't see this problem because the token macros aren't defined till later (that is why castoroides didn't show the problem even though it's on the same machine). But protosciurus uses bison 1.875 which defines the token macros up front. This land mine has been there from day one; we'd have found it sooner except that protosciurus wasn't trying to run the isolation tests till recently. To fix, rename the token to "string_literal" which is hopefully less likely to collide with names used by system headers. Back-patch to all branches containing the isolation tests. 27 May 2015, 23:14:40 UTC
9e980e7 Remove configure check prohibiting threaded libpython on OpenBSD. According to recent tests, this case now works fine, so there's no reason to reject it anymore. (Even if there are still some OpenBSD platforms in the wild where it doesn't work, removing the check won't break any case that worked before.) We can actually remove the entire test that discovers whether libpython is threaded, since without the OpenBSD case there's no need to know that at all. Per report from Davin Potts. Back-patch to all active branches. 27 May 2015, 02:14:59 UTC
605326e Update README.tuplock Multixact truncation is now handled differently, and this file hadn't gotten the memo. Per note from Amit Langote. I didn't use his patch, though. Also update the description of infomask bits, which weren't completely up to date either. This commit also propagates b01a4f6838 back to 9.3 and 9.4, which apparently I failed to do back then. 25 May 2015, 18:09:05 UTC
887e4b7 Rename pg_shdepend.c's typedef "objectType" to SharedDependencyObjectType. The name objectType is widely used as a field name, and it's pure luck that this conflict has not caused pgindent to go crazy before. It messed up pg_audit.c pretty good though. Since pg_shdepend.c doesn't export this typedef and only uses it in three places, changing that seems saner than changing the field usages. Back-patch because we're contemplating using the union of all branch typedefs for future pgindent runs, so this won't fix anything if it stays the same in back branches. 24 May 2015, 17:03:45 UTC
c6b7b9a Back-patch libpq support for TLS versions beyond v1. Since 7.3.2, libpq has been coded in such a way that the only SSL protocol it would allow was TLS v1. That approach is looking increasingly obsolete. In commit 820f08cabdcbb899 we fixed it to allow TLS >= v1, but did not back-patch the change at the time, partly out of caution and partly because the question was confused by a contemporary server-side change to reject the now-obsolete SSL protocol v3. 9.4 has now been out long enough that it seems safe to assume the change is OK; hence, back-patch into 9.0-9.3. (I also chose to back-patch some relevant comments added by commit 326e1d73c476a0b5, but did *not* change the server behavior; hence, pre-9.4 servers will continue to allow SSL v3, even though no remotely modern client will request it.) Per gripe from Jan Bilek. 22 May 2015, 00:41:55 UTC
70f2e3e Last-minute updates for release notes. Revise description of CVE-2015-3166, in line with scaled-back patch. Change release date. Security: CVE-2015-3166 19 May 2015, 22:33:58 UTC
1334127 Revert error-throwing wrappers for the printf family of functions. This reverts commit 16304a013432931e61e623c8d85e9fe24709d9ba, except for its changes in src/port/snprintf.c; as well as commit cac18a76bb6b08f1ecc2a85e46c9d2ab82dd9d23 which is no longer needed. Fujii Masao reported that the previous commit caused failures in psql on OS X, since if one exits the pager program early while viewing a query result, psql sees an EPIPE error from fprintf --- and the wrapper function thought that was reason to panic. (It's a bit surprising that the same does not happen on Linux.) Further discussion among the security list concluded that the risk of other such failures was far too great, and that the one-size-fits-all approach to error handling embodied in the previous patch is unlikely to be workable. This leaves us again exposed to the possibility of the type of failure envisioned in CVE-2015-3166. However, that failure mode is strictly hypothetical at this point: there is no concrete reason to believe that an attacker could trigger information disclosure through the supposed mechanism. In the first place, the attack surface is fairly limited, since so much of what the backend does with format strings goes through stringinfo.c or psprintf(), and those already had adequate defenses. In the second place, even granting that an unprivileged attacker could control the occurrence of ENOMEM with some precision, it's a stretch to believe that he could induce it just where the target buffer contains some valuable information. So we concluded that the risk of non-hypothetical problems induced by the patch greatly outweighs the security risks. We will therefore revert, and instead undertake closer analysis to identify specific calls that may need hardening, rather than attempt a universal solution. We have kept the portion of the previous patch that improved snprintf.c's handling of errors when it calls the platform's sprintf(). That seems to be an unalloyed improvement. Security: CVE-2015-3166 19 May 2015, 22:16:58 UTC
b3288a6 Fix off-by-one error in Assertion. The point of the assertion is to ensure that the arrays allocated in stack are large enough, but the check was one item short. This won't matter in practice because MaxIndexTuplesPerPage is an overestimate, so you can't have that many items on a page in reality. But let's be tidy. Spotted by Anastasia Lubennikova. Backpatch to all supported versions, like the patch that added the assertion. 19 May 2015, 16:25:54 UTC
8c479a8 Stamp 9.3.7. 18 May 2015, 18:31:21 UTC
8388680 Fix error message in pre_sync_fname. The old one didn't include %m anywhere, and required extra translation. Report by Peter Eisentraut. Fix by me. Review by Tom Lane. 18 May 2015, 17:17:01 UTC
32f8d57 Last-minute updates for release notes. Add entries for security issues. Security: CVE-2015-3165 through CVE-2015-3167 18 May 2015, 16:09:02 UTC
7b758b7 pgcrypto: Report errant decryption as "Wrong key or corrupt data". This has been the predominant outcome. When the output of decrypting with a wrong key coincidentally resembled an OpenPGP packet header, pgcrypto could instead report "Corrupt data", "Not text data" or "Unsupported compression algorithm". The distinct "Corrupt data" message added no value. The latter two error messages misled when the decrypted payload also exhibited fundamental integrity problems. Worse, error message variance in other systems has enabled cryptologic attacks; see RFC 4880 section "14. Security Considerations". Whether these pgcrypto behaviors are likewise exploitable is unknown. In passing, document that pgcrypto does not resist side-channel attacks. Back-patch to 9.0 (all supported versions). Security: CVE-2015-3167 18 May 2015, 14:02:37 UTC
c669915 Check return values of sensitive system library calls. PostgreSQL already checked the vast majority of these, missing this handful that nearly cannot fail. If putenv() failed with ENOMEM in pg_GSS_recvauth(), authentication would proceed with the wrong keytab file. If strftime() returned zero in cache_locale_time(), using the unspecified buffer contents could lead to information exposure or a crash. Back-patch to 9.0 (all supported versions). Other unchecked calls to these functions, especially those in frontend code, pose negligible security concern. This patch does not address them. Nonetheless, it is always better to check return values whose specification provides for indicating an error. In passing, fix an off-by-one error in strftime_win32()'s invocation of WideCharToMultiByte(). Upon retrieving a value of exactly MAX_L10N_DATA bytes, strftime_win32() would overrun the caller's buffer by one byte. MAX_L10N_DATA is chosen to exceed the length of every possible value, so the vulnerable scenario probably does not arise. Security: CVE-2015-3166 18 May 2015, 14:02:37 UTC
34d21e7 Add error-throwing wrappers for the printf family of functions. All known standard library implementations of these functions can fail with ENOMEM. A caller neglecting to check for failure would experience missing output, information exposure, or a crash. Check return values within wrappers and code, currently just snprintf.c, that bypasses the wrappers. The wrappers do not return after an error, so their callers need not check. Back-patch to 9.0 (all supported versions). Popular free software standard library implementations do take pains to bypass malloc() in simple cases, but they risk ENOMEM for floating point numbers, positional arguments, large field widths, and large precisions. No specification demands such caution, so this commit regards every call to a printf family function as a potential threat. Injecting the wrappers implicitly is a compromise between patch scope and design goals. I would prefer to edit each call site to name a wrapper explicitly. libpq and the ECPG libraries would, ideally, convey errors to the caller rather than abort(). All that would be painfully invasive for a back-patched security fix, hence this compromise. Security: CVE-2015-3166 18 May 2015, 14:02:36 UTC
d5abbd1 Permit use of vsprintf() in PostgreSQL code. The next commit needs it. Back-patch to 9.0 (all supported versions). 18 May 2015, 14:02:36 UTC
f4c12b4 Prevent a double free by not reentering be_tls_close(). Reentering this function with the right timing caused a double free, typically crashing the backend. By synchronizing a disconnection with the authentication timeout, an unauthenticated attacker could achieve this somewhat consistently. Call be_tls_close() solely from within proc_exit_prepare(). Back-patch to 9.0 (all supported versions). Benkocs Norbert Attila Security: CVE-2015-3165 18 May 2015, 14:02:36 UTC
b9403de Translation updates Source-Git-URL: git://git.postgresql.org/git/pgtranslation/messages.git Source-Git-Hash: 3ce9e5ca72c3948b4c592e82a5ddb9b69b97d14b 18 May 2015, 12:40:50 UTC
271a68b Fix typos 18 May 2015, 02:22:19 UTC
01d42ca Release notes for 9.4.2, 9.3.7, 9.2.11, 9.1.16, 9.0.20. 17 May 2015, 19:54:20 UTC
4e99359 pg_upgrade: properly handle timeline variables There is no behavior change here as we now always set the timeline to one. Report by Tom Lane Backpatch to 9.3 and 9.4 16 May 2015, 19:16:28 UTC
b054732 Fix docs typo I don't think "respectfully" is what was meant here ... 16 May 2015, 17:28:26 UTC
bffbeec pg_upgrade: force timeline 1 in the new cluster Previously, this prevented promoted standby servers from being upgraded because of a missing WAL history file. (Timeline 1 doesn't need a history file, and we don't copy WAL files anyway.) Report by Christian Echerer(?), Alexey Klyukin Backpatch through 9.0 16 May 2015, 04:40:18 UTC
4cfba53 pg_upgrade: only allow template0 to be non-connectable This patch causes pg_upgrade to error out during its check phase if: (1) template0 is marked connectable or (2) any other database is marked non-connectable This is done because, in the first case, pg_upgrade would fail because the pg_dumpall --globals restore would fail, and in the second case, the database would not be restored, leading to data loss. Report by Matt Landry (1), Stephen Frost (2) Backpatch through 9.0 16 May 2015, 04:10:03 UTC
4fd69e4 Update time zone data files to tzdata release 2015d. DST law changes in Egypt, Mongolia, Palestine. Historical corrections for Canada and Chile. Revised zone abbreviation for America/Adak (HST/HDT not HAST/HADT). 15 May 2015, 23:35:58 UTC
13a2b7b Docs: fix erroneous claim about max byte length of GB18030. This encoding has characters up to 4 bytes long, not 2. 14 May 2015, 18:59:00 UTC
96b676c Fix RBM_ZERO_AND_LOCK mode to not acquire lock on local buffers. Commit 81c45081 introduced a new RBM_ZERO_AND_LOCK mode to ReadBuffer, which takes a lock on the buffer before zeroing it. However, you cannot take a lock on a local buffer, and you got a segfault instead. The version of that patch committed to master included a check for !isLocalBuf, and therefore didn't crash, but oddly I missed that in the back-patched versions. This patch adds that check to the back-branches too. RBM_ZERO_AND_LOCK mode is only used during WAL replay, and in hash indexes. WAL replay only deals with shared buffers, so the only way to trigger the bug is with a temporary hash index. Reported by Artem Ignatyev, analysis by Tom Lane. 13 May 2015, 06:54:06 UTC
7d09fdf Fix incorrect checking of deferred exclusion constraint after a HOT update. If a row that potentially violates a deferred exclusion constraint is HOT-updated later in the same transaction, the exclusion constraint would be reported as violated when the check finally occurs, even if the row(s) the new row originally conflicted with have since been removed. This happened because the wrong TID was passed to check_exclusion_constraint(), causing the live HOT-updated row to be seen as a conflicting row rather than recognized as the row-under-test. Per bug #13148 from Evan Martin. It's been broken since exclusion constraints were invented, so back-patch to all supported branches. 11 May 2015, 16:25:45 UTC
ddebd21 Increase threshold for multixact member emergency autovac to 50%. Analysis by Noah Misch shows that the 25% threshold set by commit 53bb309d2d5a9432d2602c93ed18e58bd2924e15 is lower than any other, similar autovac threshold. While we don't know exactly what value will be optimal for all users, it is better to err a little on the high side than on the low side. A higher value increases the risk that users might exhaust the available space and start seeing errors before autovacuum can clean things up sufficiently, but a user who hits that problem can compensate for it by reducing autovacuum_multixact_freeze_max_age to a value dependent on their average multixact size. On the flip side, if the emergency cap imposed by that patch kicks in too early, the user will experience excessive wraparound scanning and will be unable to mitigate that problem by configuration. The new value will hopefully reduce the risk of such bad experiences while still providing enough headroom to avoid multixact member exhaustion for most users. Along the way, adjust the documentation to reflect the effects of commit 04e6d3b877e060d8445eb653b7ea26b1ee5cec6b, which taught autovacuum to run for multixact wraparound even when autovacuum is configured off. 11 May 2015, 16:16:51 UTC
543fbec Even when autovacuum=off, force it for members as we do in other cases. Thomas Munro, with some adjustments by me. 11 May 2015, 14:56:32 UTC
5bbac7e Advance the stop point for multixact offset creation only at checkpoint. Commit b69bf30b9bfacafc733a9ba77c9587cf54d06c0c advanced the stop point at vacuum time, but this has subsequently been shown to be unsafe as a result of analysis by myself and Thomas Munro and testing by Thomas Munro. The crux of the problem is that the SLRU deletion logic may get confused about what to remove if, at exactly the right time during the checkpoint process, the head of the SLRU crosses what used to be the tail. This patch, by me, fixes the problem by advancing the stop point only following a checkpoint. This has the additional advantage of making the removal logic work during recovery more like the way it works during normal running, which is probably good. At least one of the calls to DetermineSafeOldestOffset which this patch removes was already dead, because MultiXactAdvanceOldest is called only during recovery and DetermineSafeOldestOffset was set up to do nothing during recovery. That, however, is inconsistent with the principle that recovery and normal running should work similarly, and was confusing to boot. Along the way, fix some comments that previous patches in this area neglected to update. It's not clear to me whether there's any concrete basis for the decision to use only half of the multixact ID space, but it's neither necessary nor sufficient to prevent multixact member wraparound, so the comments should not say otherwise. 11 May 2015, 02:45:42 UTC
24aa77e Fix DetermineSafeOldestOffset for the case where there are no mxacts. Commit b69bf30b9bfacafc733a9ba77c9587cf54d06c0c failed to take into account the possibility that there might be no multixacts in existence at all. Report by Thomas Munro; patch by me. 11 May 2015, 01:47:41 UTC
3de791e Recommend include_realm=1 in docs As discussed, the default setting of include_realm=0 can be dangerous in multi-realm environments because it is then impossible to differentiate users with the same username but who are from two different realms. Recommend include_realm=1 and note that the default setting may change in a future version of PostgreSQL and therefore users may wish to explicitly set include_realm to avoid issues while upgrading. 08 May 2015, 23:40:06 UTC
596fb5a Teach autovacuum about multixact member wraparound. The logic introduced in commit b69bf30b9bfacafc733a9ba77c9587cf54d06c0c and repaired in commits 669c7d20e6374850593cb430d332e11a3992bbcf and 7be47c56af3d3013955c91c2877c08f2a0e3e6a2 helps to ensure that we don't overwrite old multixact member information while it is still needed, but a user who creates many large multixacts can still exhaust the member space (and thus start getting errors) while autovacuum stands idly by. To fix this, progressively ramp down the effective value (but not the actual contents) of autovacuum_multixact_freeze_max_age as member space utilization increases. This makes autovacuum more aggressive and also reduces the threshold for a manual VACUUM to perform a full-table scan. This patch leaves unsolved the problem of ensuring that emergency autovacuums are triggered even when autovacuum=off. We'll need to fix that via a separate patch. Thomas Munro and Robert Haas 08 May 2015, 16:55:14 UTC
83fbd9b Fix incorrect math in DetermineSafeOldestOffset. The old formula didn't have enough parentheses, so it would do the wrong thing, and it used / rather than % to find a remainder. The effect of these oversights is that the stop point chosen by the logic introduced in commit b69bf30b9bfacafc733a9ba77c9587cf54d06c0c might be rather meaningless. Thomas Munro, reviewed by Kevin Grittner, with a whitespace tweak by me. 07 May 2015, 15:16:41 UTC
ba3caee Properly send SCM status updates when shutting down service on Windows The Service Control Manager should be notified regularly during a shutdown that takes a long time. Previously we would increaes the counter, but forgot to actually send the notification to the system. The loop counter was also incorrectly initalized in the event that the startup of the system took long enough for it to increase, which could cause the shutdown process not to wait as long as expected. Krystian Bigaj, reviewed by Michael Paquier 07 May 2015, 13:09:32 UTC
cf7d5aa citext's regexp_matches() functions weren't documented, either. 05 May 2015, 20:11:13 UTC
ffac9f6 Fix incorrect declaration of citext's regexp_matches() functions. These functions should return SETOF TEXT[], like the core functions they are wrappers for; but they were incorrectly declared as returning just TEXT[]. This mistake had two results: first, if there was no match you got a scalar null result, whereas what you should get is an empty set (zero rows). Second, the 'g' flag was effectively ignored, since you would get only one result array even if there were multiple matches, as reported by Jeff Certain. While ignoring 'g' is a clear bug, the behavior for no matches might well have been thought to be the intended behavior by people who hadn't compared it carefully to the core regexp_matches() functions. So we should tread carefully about introducing this change in the back branches. Still, it clearly is a bug and so providing some fix is desirable. After discussion, the conclusion was to introduce the change in a 1.1 version of the citext extension (as we would need to do anyway); 1.0 still contains the incorrect behavior. 1.1 is the default and only available version in HEAD, but it is optional in the back branches, where 1.0 remains the default version. People wishing to adopt the fix in back branches will need to explicitly do ALTER EXTENSION citext UPDATE TO '1.1'. (I also provided a downgrade script in the back branches, so people could go back to 1.0 if necessary.) This should be called out as an incompatible change in the 9.5 release notes, although we'll also document it in the next set of back-branch release notes. The notes should mention that any views or rules that use citext's regexp_matches() functions will need to be dropped before upgrading to 1.1, and then recreated again afterwards. Back-patch to 9.1. The bug goes all the way back to citext's introduction in 8.4, but pre-9.1 there is no extension mechanism with which to manage the change. Given the lack of previous complaints it seems unnecessary to change this behavior in 9.0, anyway. 05 May 2015, 19:50:53 UTC
6fd6669 Fix some problems with patch to fsync the data directory. pg_win32_is_junction() was a typo for pgwin32_is_junction(). open() was used not only in a two-argument form, which breaks on Windows, but also where BasicOpenFile() should have been used. Per reports from Andrew Dunstan and David Rowley. 05 May 2015, 13:19:39 UTC
14de825 Recursively fsync() the data directory after a crash. Otherwise, if there's another crash, some writes from after the first crash might make it to disk while writes from before the crash fail to make it to disk. This could lead to data corruption. Back-patch to all supported versions. Abhijit Menon-Sen, reviewed by Andres Freund and slightly revised by me. 04 May 2015, 16:27:55 UTC
e60581f Fix pg_upgrade's multixact handling (again) We need to create the pg_multixact/offsets file deleted by pg_upgrade much earlier than we originally were: it was in TrimMultiXact(), which runs after we exit recovery, but it actually needs to run earlier than the first call to SetMultiXactIdLimit (before recovery), because that routine already wants to read the first offset segment. Per pg_upgrade trouble report from Jeff Janes. While at it, silence a compiler warning about a pointless assert that an unsigned variable was being tested non-negative. This was a signed constant in Thomas Munro's patch which I changed to unsigned before commit. Pointed out by Andres Freund. 30 April 2015, 16:55:06 UTC
cf0d888 Code review for multixact bugfix Reword messages, rename a confusingly named function. Per Robert Haas. 28 April 2015, 17:52:29 UTC
e2eda4b Protect against multixact members wraparound Multixact member files are subject to early wraparound overflow and removal: if the average multixact size is above a certain threshold (see note below) the protections against offset overflow are not enough: during multixact truncation at checkpoint time, some pg_multixact/members files would be removed because the server considers them to be old and not needed anymore. This leads to loss of files that are critical to interpret existing tuples's Xmax values. To protect against this, since we don't have enough info in pg_control and we can't modify it in old branches, we maintain shared memory state about the oldest value that we need to keep; we use this during new multixact creation to abort if an old still-needed file would get overwritten. This value is kept up to date by checkpoints, which makes it not completely accurate but should be good enough. We start emitting warnings sometime earlier, so that the eventual multixact-shutdown doesn't take DBAs completely by surprise (more precisely: once 20 members SLRU segments are remaining before shutdown.) On troublesome average multixact size: The threshold size depends on the multixact freeze parameters. The oldest age is related to the greater of multixact_freeze_table_age and multixact_freeze_min_age: anything older than that should be removed promptly by autovacuum. If autovacuum is keeping up with multixact freezing, the troublesome multixact average size is (2^32-1) / Max(freeze table age, freeze min age) or around 28 members per multixact. Having an average multixact size larger than that will eventually cause new multixact data to overwrite the data area for older multixacts. (If autovacuum is not able to keep up, or there are errors in vacuuming, the actual maximum is multixact_freeeze_max_age instead, at which point multixact generation is stopped completely. The default value for this limit is 400 million, which means that the multixact size that would cause trouble is about 10 members). Initial bug report by Timothy Garnett, bug #12990 Backpatch to 9.3, where the problem was introduced. Authors: Álvaro Herrera, Thomas Munro Reviews: Thomas Munro, Amit Kapila, Robert Haas, Kevin Grittner 28 April 2015, 14:32:53 UTC
723613e Build libecpg with -DFRONTEND in all supported versions. Fix an oversight in commit 151e74719b0cc5c040bd3191b51b95f925773dd1 by back-patching commit 44c5d387eafb4ba1a032f8d7b13d85c553d69181 to 9.0. 26 April 2015, 21:20:10 UTC
3e47d0b Prevent improper reordering of antijoins vs. outer joins. An outer join appearing within the RHS of an antijoin can't commute with the antijoin, but somehow I missed teaching make_outerjoininfo() about that. In Teodor Sigaev's recent trouble report, this manifests as a "could not find RelOptInfo for given relids" error within eqjoinsel(); but I think silently wrong query results are possible too, if the planner misorders the joins and doesn't happen to trigger any internal consistency checks. It's broken as far back as we had antijoins, so back-patch to all supported branches. 25 April 2015, 20:44:27 UTC
05c1392 Build every ECPG library with -DFRONTEND. Each of the libraries incorporates src/port files, which often check FRONTEND. Build systems disagreed on whether to build libpgtypes this way. Only libecpg incorporates files that rely on it today. Back-patch to 9.0 (all supported versions) to forestall surprises. 24 April 2015, 23:29:24 UTC
c82e13a Fix obsolete comment in set_rel_size(). The cross-reference to set_append_rel_pathlist() was obsoleted by commit e2fa76d80ba571d4de8992de6386536867250474, which split what had been set_rel_pathlist() and child routines into two sets of functions. But I (tgl) evidently missed updating this comment. Back-patch to 9.2 to avoid unnecessary divergence among branches. Amit Langote 24 April 2015, 19:18:43 UTC
f73ebd7 Fix deadlock at startup, if max_prepared_transactions is too small. When the startup process recovers transactions by scanning pg_twophase directory, it should clear MyLockedGxact after it's done processing each transaction. Like we do during normal operation, at PREPARE TRANSACTION. Otherwise, if the startup process exits due to an error, it will try to clear the locking_backend field of the last recovered transaction. That's usually harmless, but if the error happens in MarkAsPreparing, while holding TwoPhaseStateLock, the shmem-exit hook will try to acquire TwoPhaseStateLock again, and deadlock with itself. This fixes bug #13128 reported by Grant McAlister. The bug was introduced by commit bb38fb0d, so backpatch to all supported versions like that commit. 23 April 2015, 18:36:24 UTC
7954bc5 Fix typo in comment SLRU_SEGMENTS_PER_PAGE -> SLRU_PAGES_PER_SEGMENT I introduced this ancient typo in subtrans.c and later propagated it to multixact.c. I fixed the latter in f741300c, but only back to 9.3; backpatch to all supported branches for consistency. 14 April 2015, 15:12:18 UTC
a800267 Don't archive bogus recycled or preallocated files after timeline switch. After a timeline switch, we would leave behind recycled WAL segments that are in the future, but on the old timeline. After promotion, and after they become old enough to be recycled again, we would notice that they don't have a .ready or .done file, create a .ready file for them, and archive them. That's bogus, because the files contain garbage, recycled from an older timeline (or prealloced as zeros). We shouldn't archive such files. This could happen when we're following a timeline switch during replay, or when we switch to new timeline at end-of-recovery. To fix, whenever we switch to a new timeline, scan the data directory for WAL segments on the old timeline, but with a higher segment number, and remove them. Those don't belong to our timeline history, and are most likely bogus recycled or preallocated files. They could also be valid files that we streamed from the primary ahead of time, but in any case, they're not needed to recover to the new timeline. 13 April 2015, 14:22:35 UTC
8dfddf1 Remove duplicated words in comments. David Rowley 12 April 2015, 07:49:34 UTC
3b4da9a Fix incorrect punctuation Amit Langote 09 April 2015, 11:36:07 UTC
0d6c9e0 Fix autovacuum launcher shutdown sequence It was previously possible to have the launcher re-execute its main loop before shutting down if some other signal was received or an error occurred after getting SIGTERM, as reported by Qingqing Zhou. While investigating, Tom Lane further noticed that if autovacuum had been disabled in the config file, it would misbehave by trying to start a new worker instead of bailing out immediately -- it would consider itself as invoked in emergency mode. Fix both problems by checking the shutdown flag in a few more places. These problems have existed since autovacuum was introduced, so backpatch all the way back. 08 April 2015, 16:19:49 UTC
b1145ca Fix assorted inconsistent function declarations. While gcc doesn't complain if you declare a function "static" and then define it not-static, other compilers do; and in any case the code is highly misleading this way. Add the missing "static" keywords to a couple of recent patches. Per buildfarm member pademelon. 07 April 2015, 20:56:21 UTC
4e3b1e2 Fix typo in libpq.sgml. Back-patch to all supported versions. Michael Paquier 06 April 2015, 03:17:24 UTC
6347bdb Suppress clang's unhelpful gripes about -pthread switch being unused. Considering the number of cases in which "unused" command line arguments are silently ignored by compilers, it's fairly astonishing that anybody thought this warning was useful; it's certainly nothing but an annoyance when building Postgres. One such case is that neither gcc nor clang complain about unrecognized -Wno-foo switches, making it more difficult to figure out whether the switch does anything than one could wish. Back-patch to 9.3, which is as far back as the patch applies conveniently (we'd have to back-patch PGAC_PROG_CC_VAR_OPT to go further, and it doesn't seem worth that). 05 April 2015, 17:01:55 UTC
e105df2 Fix incorrect matching of subexpressions in outer-join plan nodes. Previously we would re-use input subexpressions in all expression trees attached to a Join plan node. However, if it's an outer join and the subexpression appears in the nullable-side input, this is potentially incorrect for apparently-matching subexpressions that came from above the outer join (ie, targetlist and qpqual expressions), because the executor will treat the subexpression value as NULL when maybe it should not be. The case is fairly hard to hit because (a) you need a non-strict subexpression (else NULL is correct), and (b) we don't usually compute expressions in the outputs of non-toplevel plan nodes. But we might do so if the expressions are sort keys for a mergejoin, for example. Probably in the long run we should make a more explicit distinction between Vars appearing above and below an outer join, but that will be a major planner redesign and not at all back-patchable. For the moment, just hack set_join_references so that it will not match any non-Var expressions coming from nullable inputs to expressions that came from above the join. (This is somewhat overkill, in that a strict expression could still be matched, but it doesn't seem worth the effort to check that.) Per report from Qingqing Zhou. The added regression test case is based on his example. This has been broken for a very long time, so back-patch to all active branches. 04 April 2015, 23:55:15 UTC
cbccaf2 Remove unnecessary variables in _hash_splitbucket(). Commit ed9cc2b5df59fdbc50cce37399e26b03ab2c1686 made it unnecessary to pass start_nblkno to _hash_splitbucket(), and for that matter unnecessary to have the internal nblkno variable either. My compiler didn't complain about that, but some did. I also rearranged the use of oblkno a bit to make that case more parallel. Report and initial patch by Petr Jelinek, rearranged a bit by me. Back-patch to all branches, like the previous patch. 03 April 2015, 20:49:11 UTC
f4540ca psql: fix \connect with URIs and conninfo strings psql was already accepting conninfo strings as the first parameter in \connect, but the way it worked wasn't sane; some of the other parameters would get the previous connection's values, causing it to connect to a completely unexpected server or, more likely, not finding any server at all because of completely wrong combinations of parameters. Fix by explicitely checking for a conninfo-looking parameter in the dbname position; if one is found, use its complete specification rather than mix with the other arguments. Also, change tab-completion to not try to complete conninfo/URI-looking "dbnames" and document that conninfos are accepted as first argument. There was a weak consensus to backpatch this, because while the behavior of using the dbname as a conninfo is nowhere documented for \connect, it is reasonable to expect that it works because it does work in many other contexts. Therefore this is backpatched all the way back to 9.0. To implement this, routines previously private to libpq have been duplicated so that psql can decide what looks like a conninfo/URI string. In back branches, just duplicate the same code all the way back to 9.2, where URIs where introduced; 9.0 and 9.1 have a simpler version. In master, the routines are moved to src/common and renamed. Author: David Fetter, Andrew Dunstan. Some editorialization by me (probably earning a Gierth's "Sloppy" badge in the process.) Reviewers: Andrew Gierth, Erik Rijkers, Pavel Stěhule, Stephen Frost, Robert Haas, Andrew Dunstan. 01 April 2015, 23:00:07 UTC
44f8f56 Fix incorrect markup in documentation of window frame clauses. You're required to write either RANGE or ROWS to start a frame clause, but the documentation incorrectly implied this is optional. Noted by David Johnston. 01 April 2015, 00:03:50 UTC
9f06729 Remove spurious semicolons. Petr Jelinek 31 March 2015, 12:15:04 UTC
0904eb3 Run pg_upgrade and pg_resetxlog with restricted token on Windows As with initdb these programs need to run with a restricted token, and if they don't pg_upgrade will fail when run as a user with Adminstrator privileges. Backpatch to all live branches. On the development branch the code is reorganized so that the restricted token code is now in a single location. On the stable bramches a less invasive change is made by simply copying the relevant code to pg_upgrade.c and pg_resetxlog.c. Patches and bug report from Muhammad Asif Naeem, reviewed by Michael Paquier, slightly edited by me. 30 March 2015, 21:17:17 UTC
246bbf6 Fix bogus concurrent use of _hash_getnewbuf() in bucket split code. _hash_splitbucket() obtained the base page of the new bucket by calling _hash_getnewbuf(), but it held no exclusive lock that would prevent some other process from calling _hash_getnewbuf() at the same time. This is contrary to _hash_getnewbuf()'s API spec and could in fact cause failures. In practice, we must only call that function while holding write lock on the hash index's metapage. An additional problem was that we'd already modified the metapage's bucket mapping data, meaning that failure to extend the index would leave us with a corrupt index. Fix both issues by moving the _hash_getnewbuf() call to just before we modify the metapage in _hash_expandtable(). Unfortunately there's still a large problem here, which is that we could also incur ENOSPC while trying to get an overflow page for the new bucket. That would leave the index corrupt in a more subtle way, namely that some index tuples that should be in the new bucket might still be in the old one. Fixing that seems substantially more difficult; even preallocating as many pages as we could possibly need wouldn't entirely guarantee that the bucket split would complete successfully. So for today let's just deal with the base case. Per report from Antonin Houska. Back-patch to all active branches. 30 March 2015, 20:40:05 UTC
995a664 Add vacuum_delay_point call in compute_index_stats's per-sample-row loop. Slow functions in index expressions might cause this loop to take long enough to make it worth being cancellable. Probably it would be enough to call CHECK_FOR_INTERRUPTS here, but for consistency with other per-sample-row loops in this file, let's use vacuum_delay_point. Report and patch by Jeff Janes. Back-patch to all supported branches. 29 March 2015, 19:04:24 UTC
56abebb Make SyncRepWakeQueue to a static function It is only used in src/backend/replication/syncrep.c. Back-patch to all supported branches except 9.1 which declares the function as static. 26 March 2015, 01:39:18 UTC
7cd5498 Fix ExecOpenScanRelation to take a lock on a ROW_MARK_COPY relation. ExecOpenScanRelation assumed that any relation listed in the ExecRowMark list has been locked by InitPlan; but this is not true if the rel's markType is ROW_MARK_COPY, which is possible if it's a foreign table. In most (possibly all) cases, failure to acquire a lock here isn't really problematic because the parser, planner, or plancache would have taken the appropriate lock already. In principle though it might leave us vulnerable to working with a relation that we hold no lock on, and in any case if the executor isn't depending on previously-taken locks otherwise then it should not do so for ROW_MARK_COPY relations. Noted by Etsuro Fujita. Back-patch to all active versions, since the inconsistency has been there a long time. (It's almost certainly irrelevant in 9.0, since that predates foreign tables, but the code's still wrong on its own terms.) 24 March 2015, 19:53:06 UTC
83587a0 Replace insertion sort in contrib/intarray with qsort(). It's all very well to claim that a simplistic sort is fast in easy cases, but O(N^2) in the worst case is not good ... especially if the worst case is as easy to hit as "descending order input". Replace that bit with our standard qsort. Per bug #12866 from Maksym Boguk. Back-patch to all active branches. 16 March 2015, 03:22:03 UTC
2cb76fa Remove workaround for ancient incompatibility between readline and libedit. GNU readline defines the return value of write_history() as "zero if OK, else an errno code". libedit's version of that function used to have a different definition (to wit, "-1 if error, else the number of lines written to the file"). We tried to work around that by checking whether errno had become nonzero, but this method has never been kosher according to the published API of either library. It's reportedly completely broken in recent Ubuntu releases: psql bleats about "No such file or directory" when saving ~/.psql_history, even though the write worked fine. However, libedit has been following the readline definition since somewhere around 2006, so it seems all right to finally break compatibility with ancient libedit releases and trust that the return value is what readline specifies. (I'm not sure when the various Linux distributions incorporated this fix, but I did find that OS X has been shipping fixed versions since 10.5/Leopard.) If anyone is still using such an ancient libedit, they will find that psql complains it can't write ~/.psql_history at exit, even when the file was written correctly. This is no worse than the behavior we're fixing for current releases. Back-patch to all supported branches. 14 March 2015, 17:43:13 UTC
089b371 Fix integer overflow in debug message of walreceiver The message tries to tell the replication apply delay which fails if the first WAL record is not applied yet. Fix is, instead of telling overflowed minus numeric, showing "N/A" which indicates that the delay data is not yet available. Problem reported by me and patch by Fabrízio de Royes Mello. Back patched to 9.4, 9.3 and 9.2 stable branches (9.1 and 9.0 do not have the debug message). 13 March 2015, 23:21:56 UTC
5bdf3cf Ensure tableoid reads correctly in EvalPlanQual-manufactured tuples. The ROW_MARK_COPY path in EvalPlanQualFetchRowMarks() was just setting tableoid to InvalidOid, I think on the assumption that the referenced RTE must be a subquery or other case without a meaningful OID. However, foreign tables also use this code path, and they do have meaningful table OIDs; so failure to set the tuple field can lead to user-visible misbehavior. Fix that by fetching the appropriate OID from the range table. There's still an issue about whether CTID can ever have a meaningful value in this case; at least with postgres_fdw foreign tables, it does. But that is a different problem that seems to require a significantly different patch --- it's debatable whether postgres_fdw really wants to use this code path at all. Simplified version of a patch by Etsuro Fujita, who also noted the problem to begin with. The issue can be demonstrated in all versions having FDWs, so back-patch to 9.1. 12 March 2015, 17:38:49 UTC
d16d821 Cast to (void *) rather than (int *) when passing int64's to PQfn(). This is a possibly-vain effort to silence a Coverity warning about bogus endianness dependency. The code's fine, because it takes care of endianness issues for itself, but Coverity sees an int64 being passed to an int* argument and not unreasonably suspects something's wrong. I'm not sure if putting the void* cast in the way will shut it up; but it can't hurt and seems better from a documentation standpoint anyway, since the pointer is not used as an int* in this code path. Just for a bit of additional safety, verify that the result length is 8 bytes as expected. Back-patch to 9.3 where the code in question was added. 08 March 2015, 17:58:39 UTC
9937f6e Fix documentation for libpq's PQfn(). The SGML docs claimed that 1-byte integers could be sent or received with the "isint" options, but no such behavior has ever been implemented in pqGetInt() or pqPutInt(). The in-code documentation header for PQfn() was even less in tune with reality, and the code itself used parameter names matching neither the SGML docs nor its libpq-fe.h declaration. Do a bit of additional wordsmithing on the SGML docs while at it. Since the business about 1-byte integers is a clear documentation bug, back-patch to all supported branches. 08 March 2015, 17:35:41 UTC
d645273 Rethink function argument sorting in pg_dump. Commit 7b583b20b1c95acb621c71251150beef958bb603 created an unnecessary dump failure hazard by applying pg_get_function_identity_arguments() to every function in the database, even those that won't get dumped. This could result in snapshot-related problems if concurrent sessions are, for example, creating and dropping temporary functions, as noted by Marko Tiikkaja in bug #12832. While this is by no means pg_dump's only such issue with concurrent DDL, it's unfortunate that we added a new failure mode for cases that used to work, and even more so that the failure was created for basically cosmetic reasons (ie, to sort overloaded functions more deterministically). To fix, revert that patch and instead sort function arguments using information that pg_dump has available anyway, namely the names of the argument types. This will produce a slightly different sort ordering for overloaded functions than the previous coding; but applying strcmp directly to the output of pg_get_function_identity_arguments really was a bit odd anyway. The sorting will still be name-based and hence independent of possibly-installation-specific OID assignments. A small additional benefit is that sorting now works regardless of server version. Back-patch to 9.3, where the previous commit appeared. 06 March 2015, 18:27:46 UTC
49bb343 Fix contrib/file_fdw's expected file I forgot to update it on yesterday's cf34e373fcf. 06 March 2015, 14:47:09 UTC
5cf4000 Fix user mapping object description We were using "user mapping for user XYZ" as description for user mappings, but that's ambiguous because users can have mappings on multiple foreign servers; therefore change it to "for user XYZ on server UVW" instead. Object identities for user mappings are also updated in the same way, in branches 9.3 and above. The incomplete description string was introduced together with the whole SQL/MED infrastructure by commit cae565e503 of 8.4 era, so backpatch all the way back. 05 March 2015, 21:03:16 UTC
73f236f Add comment for "is_internal" parameter This was missed in my commit f4c4335 of 9.3 vintage, so backpatch to that. 03 March 2015, 17:04:34 UTC
43d81f1 Fix pg_dump handling of extension config tables Since 9.1, we've provided extensions with a way to denote "configuration" tables- tables created by an extension which the user may modify. By marking these as "configuration" tables, the extension is asking for the data in these tables to be pg_dump'd (tables which are not marked in this way are assumed to be entirely handled during CREATE EXTENSION and are not included at all in a pg_dump). Unfortunately, pg_dump neglected to consider foreign key relationships between extension configuration tables and therefore could end up trying to reload the data in an order which would cause FK violations. This patch teaches pg_dump about these dependencies, so that the data dumped out is done so in the best order possible. Note that there's no way to handle circular dependencies, but those have yet to be seen in the wild. The release notes for this should include a caution to users that existing pg_dump-based backups may be invalid due to this issue. The data is all there, but restoring from it will require extracting the data for the configuration tables and then loading them in the correct order by hand. Discussed initially back in bug #6738, more recently brought up by Gilles Darold, who provided an initial patch which was further reworked by Michael Paquier. Further modifications and documentation updates by me. Back-patch to 9.1 where we added the concept of extension configuration tables. 02 March 2015, 19:12:33 UTC
585f16d Unlink static libraries before rebuilding them. When the library already exists in the build directory, "ar" preserves members not named on its command line. This mattered when, for example, a "configure" rerun dropped a file from $(LIBOBJS). libpgport carried the obsolete member until "make clean". Back-patch to 9.0 (all supported versions). 01 March 2015, 18:06:39 UTC
1b55878 Fix planning of star-schema-style queries. Part of the intent of the parameterized-path mechanism was to handle star-schema queries efficiently, but some overly-restrictive search limiting logic added in commit e2fa76d80ba571d4de8992de6386536867250474 prevented such cases from working as desired. Fix that and add a regression test about it. Per gripe from Marc Cousin. This is arguably a bug rather than a new feature, so back-patch to 9.2 where parameterized paths were introduced. 28 February 2015, 17:43:04 UTC
abce8dc Reconsider when to wait for WAL flushes/syncrep during commit. Up to now RecordTransactionCommit() waited for WAL to be flushed (if synchronous_commit != off) and to be synchronously replicated (if enabled), even if a transaction did not have a xid assigned. The primary reason for that is that sequence's nextval() did not assign a xid, but are worthwhile to wait for on commit. This can be problematic because sometimes read only transactions do write WAL, e.g. HOT page prune records. That then could lead to read only transactions having to wait during commit. Not something people expect in a read only transaction. This lead to such strange symptoms as backends being seemingly stuck during connection establishment when all synchronous replicas are down. Especially annoying when said stuck connection is the standby trying to reconnect to allow syncrep again... This behavior also is involved in a rather complicated <= 9.4 bug where the transaction started by catchup interrupt processing waited for syncrep using latches, but didn't get the wakeup because it was already running inside the same overloaded signal handler. Fix the issue here doesn't properly solve that issue, merely papers over the problems. In 9.5 catchup interrupts aren't processed out of signal handlers anymore. To fix all this, make nextval() acquire a top level xid, and only wait for transaction commit if a transaction both acquired a xid and emitted WAL records. If only a xid has been assigned we don't uselessly want to wait just because of writes to temporary/unlogged tables; if only WAL has been written we don't want to wait just because of HOT prunes. The xid assignment in nextval() is unlikely to cause overhead in real-world workloads. For one it only happens SEQ_LOG_VALS/32 values anyway, for another only usage of nextval() without using the result in an insert or similar is affected. Discussion: 20150223165359.GF30784@awork2.anarazel.de, 369698E947874884A77849D8FE3680C2@maumau, 5CF4ABBA67674088B3941894E22A0D25@maumau Per complaint from maumau and Thom Brown Backpatch all the way back; 9.0 doesn't have syncrep, but it seems better to be consistent behavior across all maintained branches. 26 February 2015, 11:50:07 UTC
4651e37 Free SQLSTATE and SQLERRM no earlier than other PL/pgSQL variables. "RETURN SQLERRM" prompted plpgsql_exec_function() to read from freed memory. Back-patch to 9.0 (all supported versions). Little code ran between the premature free and the read, so non-assert builds are unlikely to witness user-visible consequences. 26 February 2015, 04:48:54 UTC
f864fe0 Fix dumping of views that are just VALUES(...) but have column aliases. The "simple" path for printing VALUES clauses doesn't work if we need to attach nondefault column aliases, because there's noplace to do that in the minimal VALUES() syntax. So modify get_simple_values_rte() to detect nondefault aliases and treat that as a non-simple case. This further exposes that the "non-simple" path never actually worked; it didn't produce valid syntax. Fix that too. Per bug #12789 from Curtis McEnroe, and analysis by Andrew Gierth. Back-patch to all supported branches. Before 9.3, this also requires back-patching the part of commit 092d7ded29f36b0539046b23b81b9f0bf2d637f1 that created get_simple_values_rte() to begin with; inserting the extra test into the old factorization of that logic would've been too messy. 25 February 2015, 17:01:12 UTC
a6ddff8 Guard against spurious signals in LockBufferForCleanup. When LockBufferForCleanup() has to wait for getting a cleanup lock on a buffer it does so by setting a flag in the buffer header and then wait for other backends to signal it using ProcWaitForSignal(). Unfortunately LockBufferForCleanup() missed that ProcWaitForSignal() can return for other reasons than the signal it is hoping for. If such a spurious signal arrives the wait flags on the buffer header will still be set. That then triggers "ERROR: multiple backends attempting to wait for pincount 1". The fix is simple, unset the flag if still set when retrying. That implies an additional spinlock acquisition/release, but that's unlikely to matter given the cost of waiting for a cleanup lock. Alternatively it'd have been possible to move responsibility for maintaining the relevant flag to the waiter all together, but that might have had negative consequences due to possible floods of signals. Besides being more invasive. This looks to be a very longstanding bug. The relevant code in LockBufferForCleanup() hasn't changed materially since its introduction and ProcWaitForSignal() was documented to return for unrelated reasons since 8.2. The master only patch series removing ImmediateInterruptOK made it much easier to hit though, as ProcSendSignal/ProcWaitForSignal now uses a latch shared with other tasks. Per discussion with Kevin Grittner, Tom Lane and me. Backpatch to all supported branches. Discussion: 11553.1423805224@sss.pgh.pa.us 23 February 2015, 15:14:15 UTC
cdf813c Fix potential deadlock with libpq non-blocking mode. If libpq output buffer is full, pqSendSome() function tries to drain any incoming data. This avoids deadlock, if the server e.g. sends a lot of NOTICE messages, and blocks until we read them. However, pqSendSome() only did that in blocking mode. In non-blocking mode, the deadlock could still happen. To fix, take a two-pronged approach: 1. Change the documentation to instruct that when PQflush() returns 1, you should wait for both read- and write-ready, and call PQconsumeInput() if it becomes read-ready. That fixes the deadlock, but applications are not going to change overnight. 2. In pqSendSome(), drain the input buffer before returning 1. This alleviates the problem for applications that only wait for write-ready. In particular, a slow but steady stream of NOTICE messages during COPY FROM STDIN will no longer cause a deadlock. The risk remains that the server attempts to send a large burst of data and fills its output buffer, and at the same time the client also sends enough data to fill its output buffer. The application will deadlock if it goes to sleep, waiting for the socket to become write-ready, before the server's data arrives. In practice, NOTICE messages and such that the server might be sending are usually short, so it's highly unlikely that the server would fill its output buffer so quickly. Backpatch to all supported versions. 23 February 2015, 11:32:42 UTC
back to top