https://github.com/torvalds/linux
Revision a50026bdb867c8caf9d29e18f9fe9e1390312619 authored by Linus Torvalds on 05 March 2024, 13:33:36 UTC, committed by Christian Brauner on 06 March 2024, 09:52:12 UTC
This flag is only set by one single user: the magical core dumping code that looks up user pages one by one, and then writes them out using their kernel addresses (by using a BVEC_ITER). That actually ends up being a huge problem, because while we do use copy_mc_to_kernel() for this case and it is able to handle the possible machine checks involved, nothing else is really ready to handle the failures caused by the machine check. In particular, as reported by Tong Tiangen, we don't actually support fault_in_iov_iter_readable() on a machine check area. As a result, the usual logic for writing things to a file under a filesystem lock, which involves doing a copy with page faults disabled and then if that fails trying to fault pages in without holding the locks with fault_in_iov_iter_readable() does not work at all. We could decide to always just make the MC copy "succeed" (and filling the destination with zeroes), and that would then create a core dump file that just ignores any machine checks. But honestly, this single special case has been problematic before, and means that all the normal iov_iter code ends up slightly more complex and slower. See for example commit c9eec08bac96 ("iov_iter: Don't deal with iter->copy_mc in memcpy_from_iter_mc()") where David Howells re-organized the code just to avoid having to check the 'copy_mc' flags inside the inner iov_iter loops. So considering that we have exactly one user, and that one user is a non-critical special case that doesn't actually ever trigger in real life (Tong found this with manual error injection), the sane solution is to just decide that the onus on handling the machine check lines on that user instead. Ergo, do the copy_mc_to_kernel() in the core dump logic itself, copying the user data to a stable kernel page before writing it out. Fixes: f1982740f5e7 ("iov_iter: Convert iterate*() to inline funcs") Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Tong Tiangen <tongtiangen@huawei.com> Link: https://lore.kernel.org/r/20240305133336.3804360-1-tongtiangen@huawei.com Link: https://lore.kernel.org/all/4e80924d-9c85-f13a-722a-6a5d2b1c225a@huawei.com/ Tested-by: David Howells <dhowells@redhat.com> Reviewed-by: David Howells <dhowells@redhat.com> Reviewed-by: Jens Axboe <axboe@kernel.dk> Reported-by: Tong Tiangen <tongtiangen@huawei.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
1 parent 961ebd1
Tip revision: a50026bdb867c8caf9d29e18f9fe9e1390312619 authored by Linus Torvalds on 05 March 2024, 13:33:36 UTC
iov_iter: get rid of 'copy_mc' flag
iov_iter: get rid of 'copy_mc' flag
Tip revision: a50026b
File | Mode | Size |
---|---|---|
Documentation | ||
LICENSES | ||
arch | ||
block | ||
certs | ||
crypto | ||
drivers | ||
fs | ||
include | ||
init | ||
io_uring | ||
ipc | ||
kernel | ||
lib | ||
mm | ||
net | ||
rust | ||
samples | ||
scripts | ||
security | ||
sound | ||
tools | ||
usr | ||
virt | ||
.clang-format | -rw-r--r-- | 21.7 KB |
.cocciconfig | -rw-r--r-- | 59 bytes |
.editorconfig | -rw-r--r-- | 672 bytes |
.get_maintainer.ignore | -rw-r--r-- | 151 bytes |
.gitattributes | -rw-r--r-- | 105 bytes |
.gitignore | -rw-r--r-- | 2.0 KB |
.mailmap | -rw-r--r-- | 37.3 KB |
.rustfmt.toml | -rw-r--r-- | 369 bytes |
COPYING | -rw-r--r-- | 496 bytes |
CREDITS | -rw-r--r-- | 101.6 KB |
Kbuild | -rw-r--r-- | 2.5 KB |
Kconfig | -rw-r--r-- | 555 bytes |
MAINTAINERS | -rw-r--r-- | 723.8 KB |
Makefile | -rw-r--r-- | 66.1 KB |
README | -rw-r--r-- | 727 bytes |
![swh spinner](/static/img/swh-spinner.gif)
Computing file changes ...