4374cbc spapr: fix device tree properties when using compatibility mode Commit 51f84465dd98 changed the compatibility mode setting logic: - machine reset only sets compatibility mode for the boot CPU - compatibility mode is set for other CPUs when they are put online by the guest with the "start-cpu" RTAS call This causes a regression for machines started with max-cpu-compat: the device tree nodes related to secondary CPU cores contain wrong "cpu-version" and "ibm,pa-features" values, as shown below. Guest started on a POWER8 host with: -smp cores=2 -machine pseries,max-cpu-compat=compat7 ibm,pa-features = [18 00 f6 3f c7 c0 80 f0 80 00 00 00 00 00 00 00 00 00 80 00 80 00 80 00 00 00]; cpu-version = <0x4d0200>; ^^^ second CPU core ibm,pa-features = <0x600f63f 0xc70080c0>; cpu-version = <0xf000003>; ^^^ boot CPU core The second core is advertised in raw POWER8 mode. This happens because CAS assumes all CPUs to have the same compatibility mode. Since the boot CPU already has the requested compatibility mode, the CAS code does not set it for the secondary one, and exposes the bogus device tree properties in the CAS response to the guest. A similar situation is observed when hot-plugging a CPU core. The related device tree properties are generated and exposed to the guest with the "ibm,configure-connector" RTAS call before "start-cpu" is called. The CPU core is advertised to the guest in raw mode as well. In both cases, it boils down to the fact that "start-cpu" happens too late. This can be fixed globally by propagating the compatibility mode of the boot CPU to the other CPUs during reset. For this to work, the compatibility mode of the boot CPU must be set before the machine code actually resets all CPUs. Setting the compatibility mode in "start-cpu" is no longer needed, so that code is dropped. Fixes: 51f84465dd98 Signed-off-by: Greg Kurz <groug@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> (cherry picked from commit 9012a53f067a78022947e18050b145c34a3dc599) Conflicts: hw/ppc/spapr_cpu_core.c hw/ppc/spapr_rtas.c * drop context dep on d6322252b32 Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> 06 February 2018, 00:55:26 UTC
a1f33a5 ppc: Change Power9 compat table to support at most 8 threads/core Increases the max smt mode to 8 for Power9. That's because KVM supports smt emulation on this platform, so QEMU should allow users to use it as well. Today, if we try to pass -smp ...,threads=8, QEMU will silently truncate it to smt4 mode, which may cause a crash if we try to perform a cpu hotplug. Signed-off-by: Jose Ricardo Ziviani <joserz@linux.vnet.ibm.com> [dwg: Added an explanatory comment] Signed-off-by: David Gibson <david@gibson.dropbear.id.au> (cherry picked from commit 03ee51d3548f5f553a3089f466483c1c6d5c666b) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> 06 February 2018, 00:55:26 UTC
6a47136 hw/ppc/spapr_caps: Rework spapr_caps to use uint8 internal representation Currently spapr_caps are tied to boolean values (on or off). This patch reworks the caps so that they can have any uint8 value. This allows more capabilities with various values to be represented in the same way internally. Capabilities are numbered in ascending order. The internal representation of capability values is an array of uint8s in the sPAPRMachineState, indexed by capability number. Capabilities can have their own name, description, options, getter and setter functions, type and allow functions. They also each have their own section in the migration stream. Capabilities are only migrated if they were explicitly set on the command line, with the assumption that otherwise the default will match. On migration we ensure that the capability value on the destination is greater than or equal to the capability value from the source. So long as this remains the case, the migration is considered compatible and allowed to continue. This patch implements generic getter and setter functions for boolean capabilities. It also converts the existing cap-htm, cap-vsx and cap-dfp capabilities to this new format. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> (cherry picked from commit 4e5fe3688e23d61b45cc549ff1322aff8f50ef45) Conflicts: include/hw/ppc/spapr.h *drop context dep on 60c6823b9bc Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> 06 February 2018, 00:54:33 UTC
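The scheme above is easier to see in miniature. Below is a minimal, self-contained C sketch of the idea only: capabilities numbered by an enum, values kept as an array of uint8_t indexed by capability number, a generic boolean setter, and the destination-must-be-greater-or-equal migration rule. All names (CapInfoSketch, spapr_cap_set_bool_sketch, ...) are illustrative stand-ins, not the actual QEMU structures or functions.

    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    /* Capabilities are numbered in ascending order. */
    enum { CAP_HTM, CAP_VSX, CAP_DFP, CAP_MAX };

    typedef struct {
        const char *name;
        const char *description;
    } CapInfoSketch;

    static const CapInfoSketch cap_info[CAP_MAX] = {
        [CAP_HTM] = { "cap-htm", "Hardware Transactional Memory" },
        [CAP_VSX] = { "cap-vsx", "Vector Scalar Extensions (VSX/VMX)" },
        [CAP_DFP] = { "cap-dfp", "Decimal Floating Point" },
    };

    /* Internal representation: one uint8_t value per capability number. */
    static uint8_t caps[CAP_MAX];
    /* Track which caps were set explicitly (only those are migrated). */
    static uint8_t caps_explicit[CAP_MAX];

    /* Generic setter for boolean capabilities. */
    static int spapr_cap_set_bool_sketch(int cap, const char *val)
    {
        if (strcmp(val, "on") == 0) {
            caps[cap] = 1;
        } else if (strcmp(val, "off") == 0) {
            caps[cap] = 0;
        } else {
            return -1;
        }
        caps_explicit[cap] = 1;
        return 0;
    }

    /* Migration rule: destination value must be >= source value. */
    static int spapr_cap_check_migration_sketch(int cap, uint8_t src_val)
    {
        return caps[cap] >= src_val ? 0 : -1;
    }

    int main(void)
    {
        spapr_cap_set_bool_sketch(CAP_HTM, "off");
        printf("%s (%s) = %u, migratable from src=1: %s\n",
               cap_info[CAP_HTM].name, cap_info[CAP_HTM].description,
               (unsigned)caps[CAP_HTM],
               spapr_cap_check_migration_sketch(CAP_HTM, 1) == 0 ? "yes" : "no");
        return 0;
    }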
e4f4fa0 spapr: Handle Decimal Floating Point (DFP) as an optional capability Decimal Floating Point has been available on POWER7 and later (server) cpus. However, it can be disabled on the hypervisor, meaning that it's not available to guests. We currently handle this by conditionally advertising DFP support in the device tree depending on whether the guest CPU model supports it - which can also depend on what's allowed in the host for -cpu host. That can lead to confusion on migration, since host properties are silently affecting guest visible properties. This patch handles it by treating it as an optional capability for the pseries machine type. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Greg Kurz <groug@kaod.org> (cherry picked from commit 2d1fb9bc8e6e78931d8e1bfeb0ed7a4d223b0480) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> 06 February 2018, 00:53:54 UTC
ff6f7e1 spapr: Handle VMX/VSX presence as an spapr capability flag We currently have some conditionals in the spapr device tree code to decide whether or not to advertise the availability of the VMX (aka Altivec) and VSX vector extensions to the guest, based on whether the guest cpu has those features. This can lead to confusion and subtle failures on migration, since it makes a guest visible change based only on host capabilities. We now have a better mechanism for this, in spapr capabilities flags, which explicitly depend on user options rather than host capabilities. Rework the advertisement of VSX and VMX based on a new VSX capability. We no longer bother with a conditional for VMX support, because every CPU that's ever been supported by the pseries machine type supports VMX. NOTE: Some userspace distributions (e.g. RHEL7.4) already rely on availability of VSX in libc, so using cap-vsx=off may lead to a fatal SIGILL in init. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Greg Kurz <groug@kaod.org> (cherry picked from commit 2938664286499c0c30d6e455a7e2e5d3e6c3f63d) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> 06 February 2018, 00:53:46 UTC
7c578cb target/ppc: Clean up probing of VMX, VSX and DFP availability on KVM When constructing the "host" cpu class we modify whether the VMX and VSX vector extensions and DFP (Decimal Floating Point) are available based on whether KVM can support those instructions. This can depend on policy in the host kernel as well as on the actual host cpu capabilities. However, the way we probe for this is not very nice: we explicitly check the host's device tree. That works in practice, but it's not really correct, since the device tree is a property of the host kernel's platform which we don't really know about. We get away with it because the only modern POWER platforms happen to encode VMX, VSX and DFP availability in the device tree in the same way. Arguably we should have an explicit KVM capability for this, but we haven't needed one so far. Barring specific KVM policies which don't yet exist, each of these instruction classes will be available in the guest if and only if they're available in the qemu userspace process. We can determine that from the ELF AUX vector we're supplied with. Once reworked like this, there are no more callers for kvmppc_get_vmx() and kvmppc_get_dfp() so remove them. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Greg Kurz <groug@kaod.org> (cherry picked from commit 3f2ca480eb872b4946baf77f756236b637a5b15a) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> 06 February 2018, 00:53:42 UTC
804e5ea spapr: Validate capabilities on migration Now that the "pseries" machine type implements optional capabilities (well, one so far) there's the possibility of having different capabilities available at either end of a migration. Although arguably a user error, it would be nice to catch this situation and fail as gracefully as we can. This adds code to migrate the capabilities flags. These aren't pulled directly into the destination's configuration since what the user has specified on the destination command line should take precedence. However, they are checked against the destination capabilities. If the source was using a capability which is absent on the destination, we fail the migration, since that could easily cause a guest crash or other bad behaviour. If the source lacked a capability which is present on the destination we warn, but allow the migration to proceed. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Greg Kurz <groug@kaod.org> (cherry picked from commit be85537d654565e35e359a74b46fc08b7956525c) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> 06 February 2018, 00:53:31 UTC
9070f40 spapr: Treat Hardware Transactional Memory (HTM) as an optional capability This adds an spapr capability bit for Hardware Transactional Memory. It is enabled by default for pseries-2.11 and earlier machine types with POWER8 or later CPUs (as it must be, since earlier qemu versions would implicitly allow it). However, it is disabled by default for the latest pseries-2.12 machine type. This means that with the latest machine type, HTM will not be available, regardless of CPU, unless it is explicitly enabled on the command line. That change is made on the basis that: * This way running with -M pseries,accel=tcg will start with whatever cpu and will provide the same guest visible model as with accel=kvm. - More specifically, this means existing make check tests don't have to be modified to use cap-htm=off in order to run with TCG * We hope to add a new "HTM without suspend" feature in the not too distant future which could work on both POWER8 and POWER9 cpus, and could be enabled by default. * Best guesses suggest that future POWER cpus may well only support the HTM-without-suspend model, not the (frankly, horribly overcomplicated) POWER8 style HTM with suspend. * Anecdotal evidence suggests problems with HTM being enabled when it wasn't wanted are more common than being missing when it was. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Greg Kurz <groug@kaod.org> (cherry picked from commit ee76a09fc72cfbfab2bb5529320ef7e460adffd8) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> 06 February 2018, 00:53:23 UTC
78a38cd spapr: Capabilities infrastructure Because PAPR is a paravirtual environment, access to certain CPU (or other) facilities can be blocked by the hypervisor. PAPR provides ways to advertise in the device tree whether or not those features are available to the guest. In some places we automatically determine whether to make a feature available based on whether our host can support it; in most cases this is based on limitations in the available KVM implementation. Although we correctly advertise this to the guest, it means that host factors might make changes to the guest visible environment, which is bad: as well as generally reducing reproducibility, it means that a migration between different host environments can easily go bad. We've mostly gotten away with it because the environments considered mature enough to be well supported (basically, KVM on POWER8) have had consistent feature availability. But, it's still not right, and some limitations on POWER9 are going to make it more of an issue in the future. This introduces an infrastructure for defining "sPAPR capabilities". These are set by default based on the machine version, masked by the capabilities of the chosen cpu, but can be overridden with machine properties. The intention is that at reset time we verify that the requested capabilities can be supported on the host (considering TCG, KVM and/or host cpu limitations). If not, we simply fail, rather than silently modifying the advertised featureset to the guest. This does mean that certain configurations that "worked" may now fail, but such configurations were already more subtly broken. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Greg Kurz <groug@kaod.org> (cherry picked from commit 33face6b8981add8eba1f7cdaf4cf6cede415d2e) Conflicts: include/hw/ppc/spapr.h *drop context dep on 60c6823b9bc Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> 06 February 2018, 00:52:49 UTC
0fac4aa spapr: Add pseries-2.12 machine type While we're at it fix a couple of small errors in the 2.11 and 2.10 models (they didn't have any real effect, but don't quite match the template). Signed-off-by: David Gibson <david@gibson.dropbear.id.au> (cherry picked from commit 2b6154120cbd7f5514cefd3c6084d39922d26d88) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> 06 February 2018, 00:52:01 UTC
97d1755 spapr: don't initialize PATB entry if max-cpu-compat < power9 If KVM is enabled and the KVM MMU radix capability is available, the partition table entry (patb_entry) for radix mode is initialized by default in ppc_spapr_reset(). This is a problem if we want to migrate the guest to a POWER8 host while the kernel has not yet started and set the value to the one expected for a POWER8 CPU. The "-machine max-cpu-compat=power8" option should allow migrating from a POWER9 KVM host to a POWER8 KVM host, but because patb_entry is set, the destination QEMU tries to enable radix mode on the POWER8 host. This fails and cancels the migration: Process table config unsupported by the host error while loading state for instance 0x0 of device 'spapr' load of migration failed: Invalid argument This patch doesn't set the PATB entry if the user provides a CPU compatibility mode that doesn't support radix mode. Signed-off-by: Laurent Vivier <lvivier@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> (cherry picked from commit 1481fe5fcfeb7fcf3c1ebb9d8c0432e3e0188ccf) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> 05 February 2018, 05:38:26 UTC
7e25155 linux-user/signal.c: Rename MC_* defines The SPARC code in linux-user/signal.c defines a set of MC_* constants. On some SPARC hosts these are also defined by sys/ucontext.h, resulting in build failures: linux-user/signal.c:2786:0: error: "MC_NGREG" redefined [-Werror] #define MC_NGREG 19 In file included from /usr/include/signal.h:302:0, from include/qemu/osdep.h:86, from linux-user/signal.c:19: /usr/include/sparc64-linux-gnu/sys/ucontext.h:59:0: note: this is the location of the previous definition # define MC_NGREG __MC_NGREG Rename all these constants to SPARC_MC_* to avoid the clash. Cc: qemu-stable@nongnu.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Message-id: 1517318239-15764-1-git-send-email-peter.maydell@linaro.org (cherry picked from commit 8ebb314b957403c1c9a3f1cf995f73c6ae9d5d10) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> 01 February 2018, 21:22:33 UTC
80277d7 spapr_pci: fix MSI/MSIX selection In various places we don't correctly check if the device supports MSI or MSI-X. This can cause devices to be advertised with MSI support, even if they only support MSI-X (like virtio-pci-* devices for example): ethernet@0 { ibm,req#msi = <0x1>; <--- wrong! . ibm,loc-code = "qemu_virtio-net-pci:0000:00:00.0"; . ibm,req#msi-x = <0x3>; }; Worse, this can also cause the "ibm,change-msi" RTAS call to corrupt the PCI status and cause migration to fail: qemu-system-ppc64: get_pci_config_device: Bad config data: i=0x6 read: 0 device: 10 cmask: 10 wmask: 0 w1cmask:0 ^^ PCI_STATUS_CAP_LIST bit which is assumed to be constant This patch changes spapr_populate_pci_child_dt() to properly check for MSI support using msi_present(): this ensures that PCIDevice::msi_cap was set by msi_init() and that msi_nr_vectors_allocated() will look at the right place in the config space. Checking PCIDevice::msix_entries_nr is enough for MSI-X but let's add a call to msix_present() there as well for consistency. It also changes rtas_ibm_change_msi() to select the appropriate MSI type in Function 1 instead of always selecting plain MSI. This new behaviour is compliant with LoPAPR 1.1, as described in "Table 71. ibm,change-msi Argument Call Buffer": Function 1: If Number Outputs is equal to 3, request to set to a new number of MSIs (including set to 0). If the “ibm,change-msix-capable” property exists and Number Outputs is equal to 4, request is to set to a new number of MSI or MSI-X (platform choice) interrupts (including set to 0). Since MSI is the platform default (LoPAPR 6.2.3 MSI Option), let's check for MSI support first. And finally, it checks that the input parameters are valid, as described in LoPAPR 1.1 "R1–7.3.10.5.1–3": For the MSI option: The platform must return a Status of -3 (Parameter error) from ibm,change-msi, with no change in interrupt assignments if the PCI configuration address does not support MSI and Function 3 was requested (that is, the “ibm,req#msi” property must exist for the PCI configuration address in order to use Function 3), or does not support MSI-X and Function 4 is requested (that is, the “ibm,req#msi-x” property must exist for the PCI configuration address in order to use Function 4), or if neither MSIs nor MSI-Xs are supported and Function 1 is requested. This ensures that the ret_intr_type variable contains a valid MSI type for this device, and that spapr_msi_setmsg() won't corrupt the PCI status. Signed-off-by: Greg Kurz <groug@kaod.org> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> (cherry picked from commit 9cbe305b60cc49cfcd134765b85c28be95b1b57d) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> 01 February 2018, 21:21:29 UTC
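To make the Function 1 selection rule concrete, here is a small hedged C sketch of the decision logic: prefer plain MSI when the device supports it (MSI being the platform default), fall back to MSI-X otherwise, and return -3 (parameter error) when neither is supported. The names below are placeholders standing in for the real msi_present()/msix_present() checks and RTAS status codes, not the actual spapr code.

    #include <stdbool.h>

    enum { IRQ_TYPE_MSI_SKETCH = 1, IRQ_TYPE_MSIX_SKETCH = 2 };
    #define RTAS_OUT_PARAM_ERROR_SKETCH (-3)

    /* Function 1: pick an interrupt type for the device, or fail. */
    static int select_irq_type_sketch(bool dev_has_msi, bool dev_has_msix,
                                      int *ret_intr_type)
    {
        if (dev_has_msi) {
            *ret_intr_type = IRQ_TYPE_MSI_SKETCH;   /* platform default */
        } else if (dev_has_msix) {
            *ret_intr_type = IRQ_TYPE_MSIX_SKETCH;
        } else {
            return RTAS_OUT_PARAM_ERROR_SKETCH;     /* neither supported */
        }
        return 0;
    }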
50c6998 usb-storage: Fix share-rw option parsing Because usb-storage creates an internal scsi device, we should propagate options. We already do so for bootindex etc, but failed to take care of share-rw. Fix it in an apparent way: add a new parameter to scsi_bus_legacy_add_drive and pass in s->conf.share_rw. Cc: qemu-stable@nongnu.org Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Darren Kenny <darren.kenny@oracle.com> Message-id: 20180117005222.4781-1-famz@redhat.com Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> (cherry picked from commit 395b95395934785ca86baafd314d0c31b307d16d) Conflicts: hw/usb/dev-storage.c * dropped context dep on ceff3e1f01e Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> 29 January 2018, 15:27:47 UTC
dccdaac osdep: Retry SETLK upon EINTR We could hit a lock failure if a signal makes fcntl return -1 with errno set to EINTR. In this case we should retry. Cc: qemu-stable@nongnu.org Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> (cherry picked from commit f86428a1f4f91a460ed585682af70d3e8c31dc06) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> 29 January 2018, 15:14:03 UTC
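The fix is the classic EINTR retry loop around fcntl(F_SETLK); the sketch below shows the general shape, assuming plain POSIX file locking (the helper name is hypothetical, not the actual QEMU osdep function).

    #include <errno.h>
    #include <fcntl.h>
    #include <unistd.h>

    static int lock_fd_sketch(int fd, short lock_type, off_t start, off_t len)
    {
        struct flock fl = {
            .l_whence = SEEK_SET,
            .l_start  = start,
            .l_len    = len,
            .l_type   = lock_type,   /* F_RDLCK, F_WRLCK or F_UNLCK */
        };
        int ret;

        /* F_SETLK may be interrupted by a signal; retry instead of failing. */
        do {
            ret = fcntl(fd, F_SETLK, &fl);
        } while (ret == -1 && errno == EINTR);

        return ret == -1 ? -errno : 0;
    }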
5683983 s390x/kvm: provide stfle.81 stfle.81 (ppa15) is a transparent facility that can be passed to the guest without the need to implement hypervisor support. As this feature can be provided by firmware we add it to all full models. Cc: qemu-stable@nongnu.org Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Message-Id: <20180118085628.40798-4-borntraeger@de.ibm.com> Reviewed-by: Halil Pasic <pasic@linux.vnet.ibm.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com> (cherry picked from commit 9f0d13f4f1de3cf9b70435cc4e87a301ee12471f) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> 29 January 2018, 14:33:00 UTC
4d79c0e s390x/kvm: Handle bpb feature We need to handle the bpb control on reset and migration. Normally stfle.82 is transparent (and the normal guest part works without hypervisor activity). To prevent any issues we require full host kernel support for this feature. Cc: qemu-stable@nongnu.org Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Message-Id: <20180118085628.40798-3-borntraeger@de.ibm.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> [CH: 'Branch Prediction Blocking' -> 'Branch prediction blocking'] Signed-off-by: Cornelia Huck <cohuck@redhat.com> (cherry picked from commit b073c87517d4d348c7bac0f0b35e8e83e6354d82) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> 29 January 2018, 14:32:50 UTC
0981323 linux-headers: update Update headers against 4.15-rc9. Signed-off-by: Cornelia Huck <cohuck@redhat.com> (cherry picked from commit 9cbb636270b4df6f0a548e5c34b895330db5df8b) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> 29 January 2018, 14:32:23 UTC
61c8e67 linux-headers: update to 4.15-rc1 Update headers against v4.15-rc1. Signed-off-by: Eric Auger <eric.auger@redhat.com> Message-id: 1511883692-11511-4-git-send-email-eric.auger@redhat.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org> (cherry picked from commit dd8739669f95b30653a3a05cb2e21da3f52894fa) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> 29 January 2018, 14:32:18 UTC
e7857ad s390x: fix storage attributes migration for non-small guests Fix storage attribute migration so that it does not fail for guests with more than a few GB of RAM. With such guests, the index in the buffer would go out of bounds, usually by large amounts, thus receiving -EFAULT from the kernel. Migration itself would be successful, but storage attributes would then not be migrated completely. This patch fixes the out of bounds access, and thus migration of all storage attributes when the guest has large amounts of memory. Cc: qemu-stable@nongnu.org Signed-off-by: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com> Fixes: 903fd80b03243476 ("s390x/migration: Storage attributes device") Message-Id: <1516297904-18188-1-git-send-email-imbrenda@linux.vnet.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com> (cherry picked from commit 46fa893355e0bd88f3c59b886f0d75cbd5f0bbbe) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> 29 January 2018, 14:28:08 UTC
9327a8e linux-user: Fix locking order in fork_start() Our locking order is that the tb lock should be taken inside the mmap_lock, but fork_start() grabs locks the other way around. This means that if a heavily multithreaded guest process (such as Java) calls fork() it can deadlock, with the thread that called fork() stuck in fork_start() with the tb lock and waiting for the mmap lock, but some other thread in tb_find() with the mmap lock and waiting for the tb lock. The cpu_list_lock() should also always be taken last, not first. Fix this by making fork_start() grab the locks in the right order. The order in which we drop locks doesn't matter, so we leave fork_end() the way it is. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Cc: qemu-stable@nongnu.org Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <1512397331-15238-1-git-send-email-peter.maydell@linaro.org> Signed-off-by: Laurent Vivier <laurent@vivier.eu> (cherry picked from commit 024949caf32805f4cc3e7d363a80084b47aac1f6) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> 29 January 2018, 14:27:35 UTC
8ebfafa i386: Add EPYC-IBPB CPU model EPYC-IBPB is a copy of the EPYC CPU model with just CPUID_8000_0008_EBX_IBPB added. Cc: Jiri Denemark <jdenemar@redhat.com> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Brijesh Singh <brijesh.singh@amd.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Message-Id: <20180109154519.25634-7-ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> (cherry picked from commit 6cfbc54e8903a9bcc0346119949162d040c144c1) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> 23 January 2018, 23:08:04 UTC
61efbbf i386: Add new -IBRS versions of Intel CPU models The new IA32_SPEC_CTRL MSR was introduced by a recent Intel microcode update and can be used by OSes to mitigate CVE-2017-5715. Unfortunately we can't change the existing CPU models without breaking existing setups, so users need to explicitly update their VM configuration to use the new *-IBRS CPU model if they want to expose IBRS to guests. The new CPU models are simple copies of the existing CPU models, with just CPUID_7_0_EDX_SPEC_CTRL added and model_id updated. Cc: Jiri Denemark <jdenemar@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Message-Id: <20180109154519.25634-6-ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> (cherry picked from commit ac96c41354b7e4c70b756342d9b686e31ab87458) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> 23 January 2018, 23:08:04 UTC
1ade973 i386: Add FEAT_8000_0008_EBX CPUID feature word Add the new feature word and the "ibpb" feature flag. Based on a patch by Paolo Bonzini. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Message-Id: <20180109154519.25634-5-ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> (cherry picked from commit 1b3420e1c4d523c49866cca4e7544753201cd43d) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> 23 January 2018, 23:08:04 UTC
803d42f i386: Add spec-ctrl CPUID bit Add the feature name and a CPUID_7_0_EDX_SPEC_CTRL macro. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Message-Id: <20180109154519.25634-4-ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> (cherry picked from commit a2381f0934432ef2cd47a335348ba8839632164c) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> 23 January 2018, 23:08:04 UTC
cb2637a i386: Add support for SPEC_CTRL MSR Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Message-Id: <20180109154519.25634-3-ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> (cherry picked from commit a33a2cfe2f771b360b3422f6cdf566a560860bfc) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> 23 January 2018, 23:08:04 UTC
4b220d8 i386: Change X86CPUDefinition::model_id to const char* It is valid to have a 48-character model ID on CPUID, however the definition of X86CPUDefinition::model_id is char[48], which can make the compiler drop the null terminator from the string. If a CPU model happens to have 48 bytes on model_id, "-cpu help" will print garbage and the object_property_set_str() call at x86_cpu_load_def() will read data outside the model_id array. We could increase the array size to 49, but this would mean the compiler would not issue a warning if a 49-char string is used by mistake for model_id. To make things simpler, simply change model_id to be const char*, and validate the string length using an assert() on x86_register_cpudef_type(). Reported-by: "Dr. David Alan Gilbert" <dgilbert@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Message-Id: <20180109154519.25634-2-ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> (cherry picked from commit 807e9869b8c4119b81df902625af818519e01759) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> 23 January 2018, 23:08:04 UTC
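A minimal sketch of the validation idea above, with hypothetical names rather than the actual X86CPUDefinition code: keep model_id as an ordinary NUL-terminated const char * so the compiler never silently drops the terminator, and assert at registration time that it fits in the 48 bytes exposed through CPUID.

    #include <assert.h>
    #include <string.h>

    typedef struct {
        const char *name;
        const char *model_id;   /* was char[48]; now an ordinary C string */
    } CPUDefSketch;

    static void register_cpudef_sketch(const CPUDefSketch *def)
    {
        /* CPUID leaves 0x80000002..0x80000004 hold 3 * 16 = 48 bytes. */
        assert(def->model_id != NULL && strlen(def->model_id) <= 48);
        /* ... registration of the CPU class would continue here ... */
    }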
1027f34 hw/pci-bridge: fix QEMU crash because of pcie-root-port If we try to use more pcie_root_ports than available slots and an IO hint is passed to the port, QEMU crashes because we try to init the "IO hint" capability even if the device is not created. Fix it by checking for errors before adding the capability, so QEMU can fail gracefully. Signed-off-by: Marcel Apfelbaum <marcel@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> (cherry picked from commit fced4d00e68e7559c73746d963265f7fd0b6abf9) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> 23 January 2018, 22:45:21 UTC
ccf82ae scsi-disk: release AioContext in unaligned WRITE SAME case scsi_write_same_complete() can retry the write if the request was unaligned. Make sure to release the AioContext when that code path is taken! This patch fixes a hang when QEMU terminates after an unaligned WRITE SAME request has been processed with dataplane. The hang occurs because iothread_stop_all() cannot acquire the AioContext lock that was leaked by the IOThread in scsi_write_same_complete(). Fixes: b9e413dd37 ("block: explicitly acquire aiocontext in aio callbacks that need it"). Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: qemu-stable@nongnu.org Reported-by: Cong Li <coli@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20180104142502.15175-1-stefanha@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit 24355b79bdaf6ab12f7c610b032fc35ec045cd55) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> 23 January 2018, 22:35:01 UTC
1e14820 hw/sd/ssi-sd: Reset SD card on controller reset Since ssi-sd is still using the legacy SD card API, the SD card created by sd_init() is not plugged into any bus. This means that the controller has to reset it manually. Failing to do this mostly didn't affect the guest since the guest typically does a programmed SD card reset as part of its SD controller driver initialization, but meant that migration failed because it's only in sd_reset() that we set up the wpgrps_size field. In the case of sd-ssi, we have to implement an entire reset function since there wasn't one previously, and that requires a QOM cast macro that got omitted when this device was QOMified. Cc: qemu-stable@nongnu.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-id: 1515506513-31961-4-git-send-email-peter.maydell@linaro.org (cherry picked from commit 8046d44f3c9f67828d3368797d4d314433ee75e9) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> 23 January 2018, 22:34:00 UTC
a22b809 hw/sd/milkymist-memcard: Reset SD card on controller reset Since milkymist-memcard is still using the legacy SD card API, the SD card created by sd_init() is not plugged into any bus. This means that the controller has to reset it manually. Failing to do this mostly didn't affect the guest since the guest typically does a programmed SD card reset as part of its SD controller driver initialization, but meant that migration failed because it's only in sd_reset() that we set up the wpgrps_size field. Cc: qemu-stable@nongnu.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-id: 1515506513-31961-3-git-send-email-peter.maydell@linaro.org (cherry picked from commit 16bf0e0e7aaa8efc0b8ee7e2aecb2fa235f82d38) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> 23 January 2018, 22:33:50 UTC
3f44190 hw/sd/pl181: Reset SD card on controller reset Since pl181 is still using the legacy SD card API, the SD card created by sd_init() is not plugged into any bus. This means that the controller has to reset it manually. Failing to do this mostly didn't affect the guest since the guest typically does a programmed SD card reset as part of its SD controller driver initialization, but meant that migration failed because it's only in sd_reset() that we set up the wpgrps_size field. Cc: qemu-stable@nongnu.org Fixes: https://bugs.launchpad.net/qemu/+bug/1739378 Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-id: 1515506513-31961-2-git-send-email-peter.maydell@linaro.org (cherry picked from commit 0cb57cc701839e7358918d5f2922ccbc04d28d17) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> 23 January 2018, 22:33:39 UTC
88bf4a7 vhost: remove assertion to prevent crash QEMU will assert on vhost-user backed virtio device hotplug if QEMU is using more RAM regions than VHOST_MEMORY_MAX_NREGIONS (for example if it were started with a lot of DIMM devices). Fix it by returning an error instead of asserting and letting callers of vhost_set_mem_table() handle the error condition gracefully. Cc: qemu-stable@nongnu.org Signed-off-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: Jay Zhou <jianjay.zhou@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> (cherry picked from commit f4bf56fb78ed0e9f60fa1ed656c14ff4c494da5a) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> 23 January 2018, 22:30:56 UTC
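The shape of the fix is the usual assert-to-error-return conversion; a hedged sketch with illustrative names (not the actual vhost code) looks like this:

    #define MAX_NREGIONS_SKETCH 8   /* stand-in for VHOST_MEMORY_MAX_NREGIONS */

    /* Before: assert(n_regions <= MAX_NREGIONS_SKETCH) aborted QEMU.
     * After: return an error so the caller can fail the hotplug gracefully. */
    static int set_mem_table_sketch(unsigned n_regions)
    {
        if (n_regions > MAX_NREGIONS_SKETCH) {
            return -1;   /* caller reports the error instead of crashing */
        }
        /* ... build and send the memory table to the backend ... */
        return 0;
    }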
f793121 virtio_error: don't invoke status callbacks Backends don't need to know that the frontend requested a reset, and notifying them from virtio_error is messy because virtio_error itself might be invoked from a backend. Let's just set the status directly. Cc: qemu-stable@nongnu.org Reported-by: Ilya Maximets <i.maximets@samsung.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> (cherry picked from commit 8fc47c876de638353bb635872f2c25bb7f4a3d6e) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> 16 January 2018, 00:18:05 UTC
0af294d hw/intc/arm_gic: reserved register addresses are RAZ/WI The GICv2 specification says that reserved register addresses must be RAZ/WI; now that we implement external abort handling for Arm CPUs, this means we must return MEMTX_OK rather than MEMTX_ERROR, to avoid generating a spurious guest data abort. Cc: qemu-stable@nongnu.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Message-id: 1513183941-24300-3-git-send-email-peter.maydell@linaro.org Reviewed-by: Alistair Francis <alistair.francis@xilinx.com> (cherry picked from commit 0cf09852015e47a5fbb974ff7ac320366afd21ee) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> 11 January 2018, 21:10:57 UTC
6242535 hw/intc/arm_gicv3: Make reserved register addresses RAZ/WI The GICv3 specification says that reserved register addresses should be RAZ/WI. This means we need to return MEMTX_OK, not MEMTX_ERROR, because now that we support generating external aborts the latter will cause an abort on new board models. Cc: qemu-stable@nongnu.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Message-id: 1513183941-24300-2-git-send-email-peter.maydell@linaro.org Reviewed-by: Alistair Francis <alistair.francis@xilinx.com> (cherry picked from commit f1945632b43e36bd9f3e0c2feb0e5b152be7ed91) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> 11 January 2018, 21:10:48 UTC
d6f1448 vfio: Fix vfio-kvm group registration Commit 8c37faa475f3 ("vfio-pci, ppc64/spapr: Reorder group-to-container attaching") moved registration of groups with the vfio-kvm device from vfio_get_group() to vfio_connect_container(), but it missed the case where a group is attached to an existing container and takes an early exit. Perhaps this is a less common case on ppc64/spapr, but on x86 (without viommu) all groups are connected to the same container and thus only the first group gets registered with the vfio-kvm device. This becomes a problem if we then hot-unplug the devices associated with that first group and we end up with KVM being misinformed about any vfio connections that might remain. Fix by including the call to vfio_kvm_device_add_group() in this early exit path. Fixes: 8c37faa475f3 ("vfio-pci, ppc64/spapr: Reorder group-to-container attaching") Cc: qemu-stable@nongnu.org # qemu-2.10+ Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: Peter Xu <peterx@redhat.com> Tested-by: Peter Xu <peterx@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Tested-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> (cherry picked from commit 2016986aedb6ea2839662eb5f60630f3e231bd1a) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> 09 January 2018, 17:09:36 UTC
7f53a81 block: Open backing image in force share mode for size probe Management tools create overlays of running guests with qemu-img: $ qemu-img create -b /image/in/use.qcow2 -f qcow2 /overlay/image.qcow2 but this doesn't work anymore due to image locking: qemu-img: /overlay/image.qcow2: Failed to get shared "write" lock Is another process using the image? Could not open backing image to determine size. Use the force share option to allow this use case again. Cc: qemu-stable@nongnu.org Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> (cherry picked from commit cc954f01e3c004aad081aa36736a17e842b80211) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> 09 January 2018, 16:49:44 UTC
2b0c34c block: Call .drain_begin only once in bdrv_drain_all_begin() bdrv_drain_all_begin() used to call the .bdrv_co_drain_begin() driver callback inside its polling loop. This means that how many times it got called for each node depended on how long it had to poll the event loop. This is obviously not right and results in nodes that stay drained even after bdrv_drain_all_end(), which calls .bdrv_co_drain_begin() once per node. Fix bdrv_drain_all_begin() to call the callback only once, too. Cc: qemu-stable@nongnu.org Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> (cherry picked from commit 2da9b7d456278bccc6ce889ae350f2867155d7e8) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> 09 January 2018, 16:49:13 UTC
057364d block: Make bdrv_drain_invoke() recursive This change separates bdrv_drain_invoke(), which calls the BlockDriver drain callbacks, from bdrv_drain_recurse(). Instead, the function performs its own recursion now. One reason for this is that bdrv_drain_recurse() can be called multiple times by bdrv_drain_all_begin(), but the callbacks may only be called once. The separation is necessary to fix this bug. The other reason is that we intend to go to a model where we call all driver callbacks first, and only then start polling. This is not fully achieved yet with this patch, as bdrv_drain_invoke() contains a BDRV_POLL_WHILE() loop for the block driver callbacks, which can still call callbacks for any unrelated event. It's a step in this direction anyway. Cc: qemu-stable@nongnu.org Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> (cherry picked from commit db0289b9b26cb653d5662f5d6a2a52d70243cd56) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> 09 January 2018, 16:48:39 UTC
b9da3c1 block/nbd: fix segmentation fault when .desc is not null-terminated The find_desc_by_name() from util/qemu-option.c relies on the .name not being NULL to call strcmp(). This check becomes unsafe when the list is not NULL-terminated, which is the case of nbd_runtime_opts in block/nbd.c, and can result in a segmentation fault when strcmp() tries to access invalid memory: #0 0x00007fff8c75f7d4 in __strcmp_power9 () from /lib64/libc.so.6 #1 0x00000000102d3ec8 in find_desc_by_name (desc=0x1036d6f0, name=0x28e46670 "server.path") at util/qemu-option.c:166 #2 0x00000000102d93e0 in qemu_opts_absorb_qdict (opts=0x28e47a80, qdict=0x28e469a0, errp=0x7fffec247c98) at util/qemu-option.c:1026 #3 0x000000001012a2e4 in nbd_open (bs=0x28e42290, options=0x28e469a0, flags=24578, errp=0x7fffec247d80) at block/nbd.c:406 #4 0x00000000100144e8 in bdrv_open_driver (bs=0x28e42290, drv=0x1036e070 <bdrv_nbd_unix>, node_name=0x0, options=0x28e469a0, open_flags=24578, errp=0x7fffec247f50) at block.c:1135 #5 0x0000000010015b04 in bdrv_open_common (bs=0x28e42290, file=0x0, options=0x28e469a0, errp=0x7fffec247f50) at block.c:1395 From gdb, the desc[i].name was not NULL and resulted in strcmp() accessing invalid memory: >>> p desc[5] $8 = { name = 0x1037f098 "R27A", type = 1561964883, help = 0xc0bbb23e <error: Cannot access memory at address 0xc0bbb23e>, def_value_str = 0x2 <error: Cannot access memory at address 0x2> } >>> p desc[6] $9 = { name = 0x103dac78 <__gcov0.do_qemu_init_bdrv_nbd_init> "\001", type = 272101528, help = 0x29ec0b754403e31f <error: Cannot access memory at address 0x29ec0b754403e31f>, def_value_str = 0x81f343b9 <error: Cannot access memory at address 0x81f343b9> } This patch fixes the segmentation fault in strcmp() by adding a NULL element at the end of nbd_runtime_opts.desc list, which is the common practice in most other structs, like runtime_opts in block/null.c. Thus, the desc[i].name != NULL check becomes safe because it will not evaluate to true when the .desc list reaches its end. Reported-by: R. Nageswara Sastry <nasastry@in.ibm.com> Buglink: https://bugs.launchpad.net/qemu/+bug/1727259 Signed-off-by: Murilo Opsfelder Araujo <muriloo@linux.vnet.ibm.com> Message-Id: <20180105133241.14141-2-muriloo@linux.vnet.ibm.com> CC: qemu-stable@nongnu.org Fixes: 7ccc44fd7d1dfa62c4d6f3a680df809d6e7068ce Signed-off-by: Eric Blake <eblake@redhat.com> (cherry picked from commit c4365735a7d38f4355c6f77e6670d3972315f7c2) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> 09 January 2018, 16:41:11 UTC
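The pattern being fixed is easy to see in isolation. The simplified, self-contained sketch below uses stand-in types (not the actual QemuOptDesc/QemuOptsList definitions) to show why the terminating all-zero entry matters: lookup loops stop at .name == NULL, so a list without that sentinel runs off the end of the array.

    #include <stddef.h>
    #include <string.h>

    typedef struct {
        const char *name;
        const char *help;
    } OptDescSketch;

    static const OptDescSketch nbd_opts_sketch[] = {
        { .name = "host",   .help = "TCP host" },
        { .name = "port",   .help = "TCP port" },
        { .name = "export", .help = "NBD export name" },
        { .name = NULL },   /* end of list: without this sentinel the loop below overruns */
    };

    static const OptDescSketch *find_desc_by_name_sketch(const OptDescSketch *desc,
                                                         const char *name)
    {
        for (size_t i = 0; desc[i].name != NULL; i++) {
            if (strcmp(desc[i].name, name) == 0) {
                return &desc[i];
            }
        }
        return NULL;
    }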
c184e17 qemu-pr-helper: miscellaneous fixes 1) Return a generic sense if TEST UNIT READY does not provide one; 2) Fix two mistakes in copying from the spec. Cc: qemu-stable@nongnu.org Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit a4a9b6eaf35dbe4bf0e069854945bf5e45fc7eab) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> 08 January 2018, 13:52:41 UTC
5ba945f qemu-options: Remove stray colons from output of --help Commit 43f187a broke --help: it put colons into blank lines. It removed the colon from DEFHEADING(TITLE:) and added it back in the macro expansion of DEFHEADING(TITLE), so hxtool can emit "@subsection TITLE" more easily. Trouble is it's added back even for the blank lines made with DEFHEADING(). Put the colons back where they were before commit 43f187a, and strip them in hxtool instead. Cc: Paolo Bonzini <pbonzini@redhat.com> CC: qemu-stable@nongnu.org Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20171002140307.5292-2-armbru@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> (cherry picked from commit de6b4f908c300c7e7e0dc057310f5cbdcf1aed78) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> 08 January 2018, 13:52:41 UTC
ea311a9 target/sh4: fix TCG leak during gusa sequence This fixes bug #1735384 while running java under qemu-sh4. When debug was enabled it showed a problem with TCG temps. Once fixed I was able to run java -version normally. Cc: qemu-stable@nongnu.org Reported-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Suggested-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20171206093050.25308-1-alex.bennee@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> (cherry picked from commit 6d56fc6cc372284a4571f09b361a9ccd99318103) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> 08 January 2018, 13:52:41 UTC
fd89d93 block/iscsi: don't leave allocmap in an invalid state on UNMAP failure We forgot to set the allocmap to invalid if an UNMAP call fails. Cc: qemu-stable@nongnu.org Signed-off-by: Peter Lieven <pl@kamp.de> Message-Id: <1512733868-9009-2-git-send-email-pl@kamp.de> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit aef172ffdc2f9c41d9cc043a55f1259e7c07e587) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> 08 January 2018, 13:40:25 UTC
817a9fc target/i386: Fix handling of VEX prefixes In commit e3af7c788b73a6495eb9d94992ef11f6ad6f3c56 we replaced direct calls to cpu_ld*_code() with calls to the x86_ld*_code() wrappers which incorporate an advance of s->pc. Unfortunately we didn't notice that in one place the old code was deliberately not incrementing s->pc: @@ -4501,7 +4528,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu) static const int pp_prefix[4] = { 0, PREFIX_DATA, PREFIX_REPZ, PREFIX_REPNZ }; - int vex3, vex2 = cpu_ldub_code(env, s->pc); + int vex3, vex2 = x86_ldub_code(env, s); if (!CODE64(s) && (vex2 & 0xc0) != 0xc0) { /* 4.1.4.6: In 32-bit mode, bits [7:6] must be 11b, This meant we were mishandling this set of instructions. Remove the manual advance of s->pc for the "is VEX" case (which is now done by x86_ldub_code()) and instead rewind PC in the case where we decide that this isn't really VEX. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Cc: qemu-stable@nongnu.org Reported-by: Alexandro Sanchez Bach <alexandro@phi.nz> Message-Id: <1513163959-17545-1-git-send-email-peter.maydell@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit cfcca361d77142f25fb1128755084cf91faa4db7) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> 08 January 2018, 13:39:13 UTC
0a0dc59 Update version for v2.11.0 release Signed-off-by: Peter Maydell <peter.maydell@linaro.org> 13 December 2017, 14:31:09 UTC
6afd0c1 Update version for v2.11.0-rc5 release Signed-off-by: Peter Maydell <peter.maydell@linaro.org> 11 December 2017, 17:49:53 UTC
7472e2e target/arm: Generate UNDEF for 32-bit Thumb2 insns The refactoring of commit 296e5a0a6c3935 has a nasty bug: it accidentally dropped the generation of code to raise the UNDEF exception when disas_thumb2_insn() returns nonzero. This means that 32-bit Thumb2 instruction patterns that ought to UNDEF just act like nops instead. This is likely to break any number of things, including the kernel's "disable the FPU and use the UNDEF exception to identify when to turn it back on again" trick. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Message-id: 1513006964-3371-1-git-send-email-peter.maydell@linaro.org Reviewed-by: Richard Henderson <richard.henderson@linaro.org> 11 December 2017, 17:11:27 UTC
2babfe0 Update version for v2.11.0-rc4 release Signed-off-by: Peter Maydell <peter.maydell@linaro.org> 05 December 2017, 16:36:46 UTC
2994cb2 vhost-scsi: add missing virtqueue_size parameter Commit 5c0919d02066 ("virtio-scsi: Add virtqueue_size parameter allowing virtqueue size to be set.") introduced a new parameter to virtio-scsi. Later, commit 920036106044 ("vhost-user-scsi: add missing virtqueue_size param") added that parameter to the new vhost-user-scsi interface but neglected the existing vhost-scsi interface it was built on. Apply the same change to vhost-scsi, so that we can boot a guest with a device defined. This also avoids crashing a guest when hotplugging a vhost-scsi device. Signed-off-by: Eric Farman <farman@linux.vnet.ibm.com> Message-id: 20171201151538.6844-2-farman@linux.vnet.ibm.com Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Peter Maydell <peter.maydell@linaro.org> 05 December 2017, 12:38:31 UTC
88f714a Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.11-20171205' into staging ppc patch queue 2017-12-05 Alas, this is yet another fix for ppc that I think is worth squeezing into 2.11. It's a really ugly fix for some pretty ugly code, but it does seem to address a real problem. It's also a problem that's appeared relatively recently, since it was either created by, or made much easier to trigger by, the merge of MTTCG. # gpg: Signature made Tue 05 Dec 2017 05:24:04 GMT # gpg: using RSA key 0x6C38CACA20D9B392 # gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>" # gpg: aka "David Gibson (Red Hat) <dgibson@redhat.com>" # gpg: aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>" # gpg: aka "David Gibson (kernel.org) <dwg@kernel.org>" # Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392 * remotes/dgibson/tags/ppc-for-2.11-20171205: target/ppc: Fix system lockups caused by interrupt_request state corruption Signed-off-by: Peter Maydell <peter.maydell@linaro.org> 05 December 2017, 10:00:48 UTC
044897e target/ppc: Fix system lockups caused by interrupt_request state corruption Occasionally in Linux guests on x86_64 we're seeing logs like: ppc_set_irq: 0x55b4e0d562f0 n_IRQ 8 level 1 => pending 00000100req 00000004 when they should read: ppc_set_irq: 0x55b4e0d562f0 n_IRQ 8 level 1 => pending 00000100req 00000002 The "00000004" is CPU_INTERRUPT_EXITTB yet the code calls cpu_interrupt(cs, CPU_INTERRUPT_HARD) ("00000002") in this function just before the log message. Something is causing the HARD bit setting to get lost. The knock on effect of losing that bit is the decrementer timer interrupts don't get delivered which causes the guest to sit idle in its idle handler and 'hang'. The issue occurs due to races from code which sets CPU_INTERRUPT_EXITTB. Rather than poking directly into cs->interrupt_request, that code needs to: a) hold BQL b) use the cpu_interrupt() helper This patch fixes the call sites to do this, fixing the hang. The calls are made from a variety of contexts so a helper function is added to handle the necessary locking. This can likely be improved and optimised in the future but it ensures the code is correct and doesn't lockup as it stands today. Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> 05 December 2017, 01:28:42 UTC
2a4c7e8 Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging Block layer patches for 2.11.0-rc4 # gpg: Signature made Mon 04 Dec 2017 16:46:07 GMT # gpg: using RSA key 0x7F09B272C88F2FD6 # gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" # Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6 * remotes/kevin/tags/for-upstream: blockjob: Make block_job_pause_all() keep a reference to the jobs Signed-off-by: Peter Maydell <peter.maydell@linaro.org> 04 December 2017, 17:19:04 UTC
3d5d319 blockjob: Make block_job_pause_all() keep a reference to the jobs Starting from commit 40840e419be31e6a32e6ea24511c74b389d5e0e4 we are pausing all block jobs during bdrv_reopen_multiple() to prevent any of them from finishing and removing nodes from the graph while they are being reopened. It turns out that pausing a block job doesn't necessarily prevent it from finishing: a paused block job can still run its exit function from the main loop and call block_job_completed(). The mirror block job in particular always goes to the main loop while it is paused (by virtue of the bdrv_drained_begin() call in mirror_run()). Destroying a paused block job during bdrv_reopen_multiple() has two consequences: 1) The references to the nodes involved in the job are released, possibly destroying some of them. If those nodes were in the reopen queue this would trigger the problem originally described in commit 40840e419be, crashing QEMU. 2) At the end of bdrv_reopen_multiple(), bdrv_drain_all_end() would not be doing all necessary bdrv_parent_drained_end() calls. I can reproduce problem 1) easily with iotest 030 by increasing STREAM_BUFFER_SIZE from 512KB to 8MB in block/stream.c, or by tweaking the iotest like in this example: https://lists.gnu.org/archive/html/qemu-block/2017-11/msg00934.html This patch keeps an additional reference to all block jobs between block_job_pause_all() and block_job_resume_all(), guaranteeing that they are kept alive. Signed-off-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> 04 December 2017, 16:44:51 UTC
e80a256 Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging pc, pci, virtio: fixes for rc3 A bunch of fixes all over the place. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> # gpg: Signature made Fri 01 Dec 2017 17:06:33 GMT # gpg: using RSA key 0x281F0DB8D28D5469 # gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" # gpg: aka "Michael S. Tsirkin <mst@redhat.com>" # Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67 # Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469 * remotes/mst/tags/for_upstream: pc: fix crash on attempted cpu unplug virtio: check VirtQueue Vring object is set vhost: fix error check in vhost_verify_ring_mappings() dump-guest-memory.py: fix No symbol "vmcoreinfo_find" vhost: restore avail index from vring used index on disconnection virtio: Add queue interface to restore avail index from vring used index i386/msi: Correct mask of destination ID in MSI address Signed-off-by: Peter Maydell <peter.maydell@linaro.org> 04 December 2017, 13:08:13 UTC
495566e Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.11-20171204' into staging ppc patch queue 2017-12-04 We are, alas, not yet to the bottom of ppc bugs. This pull request fixes several more. I believe they're important enough to include in 2.11 despite the late date. # gpg: Signature made Mon 04 Dec 2017 03:40:56 GMT # gpg: using RSA key 0x6C38CACA20D9B392 # gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>" # gpg: aka "David Gibson (Red Hat) <dgibson@redhat.com>" # gpg: aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>" # gpg: aka "David Gibson (kernel.org) <dwg@kernel.org>" # Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392 * remotes/dgibson/tags/ppc-for-2.11-20171204: spapr: Include "pre-plugged" DIMMS in ram size calculation at reset target-ppc: Don't invalidate non-supported msr bits pseries: fix TCG migration Signed-off-by: Peter Maydell <peter.maydell@linaro.org> 04 December 2017, 11:27:53 UTC
768a20f spapr: Include "pre-plugged" DIMMS in ram size calculation at reset At guest reset time, we allocate a hash page table (HPT) for the guest based on the guest's RAM size. If dynamic HPT resizing is not available we use the maximum RAM size, if it is we use the current RAM size. But the "current RAM size" calculation is incorrect - we just use the "base" ram_size from the machine structure. This doesn't include any pluggable DIMMs that are already plugged at reset time. This means that if you try to start a 'pseries' machine with a DIMM specified on the command line that's much larger than the "base" RAM size, then the guest will get a woefully inadequate HPT. This can lead to a guest freeze during boot as it runs out of HPT space during initial MMU setup. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Greg Kurz <groug@kaod.org> Tested-by: Greg Kurz <groug@kaod.org> 04 December 2017, 00:31:22 UTC
75ba2dd pc: fix crash on attempted cpu unplug When qemu is started with the '-no-acpi' CLI option, an attempt to unplug a CPU using device_del results in a null pointer dereference at: #0 object_get_class #1 pc_machine_device_unplug_request_cb #2 qmp_marshal_device_del which is caused by pcms->acpi_dev == NULL due to ACPI support being disabled. Considering that ACPI support is necessary for unplug to work, check that it's enabled and fail the unplug request gracefully if no ACPI device was found. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> 01 December 2017, 17:05:58 UTC
758ead3 virtio: check VirtQueue Vring object is set A guest could attempt to use an uninitialised VirtQueue object or an unset Vring.align, leading to an arithmetic exception. Add a check to avoid it. Reported-by: Zhangboxian <zhangboxian@huawei.com> Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> 01 December 2017, 17:05:58 UTC
2fe45ec vhost: fix error check in vhost_verify_ring_mappings() Since commit f1f9e6c5 "vhost: adapt vhost_verify_ring_mappings() to virtio 1 ring layout", we check the mapping of each part (descriptor table, available ring and used ring) of each virtqueue separately. The checking of a part is done by the vhost_verify_ring_part_mapping() function: it returns either 0 on success or a negative errno if the part cannot be mapped at the same place. Unfortunately, the vhost_verify_ring_mappings() function checks its return value the other way round. It means that we either: - only verify the descriptor table of the first virtqueue, and if it is valid we ignore all the other mappings - or ignore all broken mappings until we reach a valid one ie, we only raise an error if all mappings are broken, and we consider all mappings are valid otherwise (false success), which is obviously wrong. This patch ensures that vhost_verify_ring_mappings() only returns success if ALL mappings are okay. Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Greg Kurz <groug@kaod.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> 01 December 2017, 17:05:58 UTC
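The corrected control flow can be summarised with a short hedged sketch (simplified, not the actual vhost code): iterate over every part of every virtqueue and bail out on the first broken mapping, so that success means all mappings were verified.

    /* check_part() returns 0 if that part still maps at the same place,
     * or a negative errno if it does not. */
    static int verify_all_ring_mappings_sketch(int (*check_part)(int vq, int part),
                                               int num_vqs)
    {
        for (int vq = 0; vq < num_vqs; vq++) {
            for (int part = 0; part < 3; part++) {   /* desc, avail, used */
                int r = check_part(vq, part);
                if (r < 0) {
                    return r;   /* one broken mapping fails the whole check */
                }
            }
        }
        return 0;   /* success only when every mapping was verified */
    }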
d36d0a9 dump-guest-memory.py: fix No symbol "vmcoreinfo_find" When qemu is compiled without debug, the dump gdb python script can fail with: Error occurred in Python command: No symbol "vmcoreinfo_find" in current context. Because vmcoreinfo_find() is inlined and not exported. Use the underlying object_resolve_path_type() to get the instance instead. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Laszlo Ersek <lersek@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> 01 December 2017, 17:05:58 UTC
2ae39a1 vhost: restore avail index from vring used index on disconnection vhost_virtqueue_stop() gets the avail index value from the backend, except if the backend is not responding. This happens when the backend crashes, and in this case the internal state of the virtio queue is inconsistent, causing packets to corrupt the vring state. With a Linux guest, it results in the following error messages on backend reconnection: [ 22.444905] virtio_net virtio0: output.0:id 0 is not a head! [ 22.446746] net enp0s3: Unexpected TXQ (0) queue failure: -5 [ 22.476360] net enp0s3: Unexpected TXQ (0) queue failure: -5 Fixes: 283e2c2adcb8 ("net: virtio-net discards TX data after link down") Cc: qemu-stable@nongnu.org Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> 01 December 2017, 17:05:58 UTC
2d4ba6c virtio: Add queue interface to restore avail index from vring used index In case of backend crash, it is not possible to restore internal avail index from the backend value as vhost_get_vring_base callback fails. This patch provides a new interface to restore internal avail index from the vring used index, as done by some vhost-user backend on reconnection. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> 01 December 2017, 17:05:58 UTC
861fec4 i386/msi: Correct mask of destination ID in MSI address According to SDM 10.11.1, only bits [19:12] of the MSI address are the Destination ID, so change the mask to avoid ambiguity, since the VT-d spec uses bit 4 to indicate a remappable interrupt request. Signed-off-by: Chao Gao <chao.gao@intel.com> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com> Reviewed-by: Anthony PERARD <anthony.perard@citrix.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> 01 December 2017, 16:28:15 UTC
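A small illustration of the mask change, assuming the constants follow the commit text (Destination ID in bits [19:12] only); the helper itself is hypothetical.

    /* Sketch: extract the MSI Destination ID without including bit 4,
     * which VT-d uses to mark remappable interrupt requests. */
    #include <stdint.h>
    #include <stdio.h>

    #define MSI_ADDR_DEST_ID_MASK  0x000ff000u            /* bits 19:12 only */
    #define MSI_ADDR_DEST_ID_SHIFT 12

    static uint8_t msi_dest_id(uint32_t msi_addr)
    {
        return (msi_addr & MSI_ADDR_DEST_ID_MASK) >> MSI_ADDR_DEST_ID_SHIFT;
    }

    int main(void)
    {
        uint32_t addr = 0xfee0100cu;   /* example MSI address */
        printf("destination id = 0x%x\n", msi_dest_id(addr));
        return 0;
    }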
be1b21e target-ppc: Don't invalidate non-supported msr bits The msr invalidation code (commits 993eb and 2360b) inverts all bits except MSR_TGPR and MSR_HVB. On non-PowerPC-601 processors this leads to an incorrect change of excp_prefix in the hreg_store_msr() function. The problem is that the new msr value gets multiplied by msr_mask while the inverted msr does not; thus the values of the MSR_EP bit in the new msr value and in the inverted msr can differ, so that excp_prefix changes when it should not. Signed-off-by: Kurban Mallachiev <mallachiev@ispras.ru> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> 30 November 2017, 03:56:42 UTC
0c86b2d pseries: fix TCG migration Migration of pseries is broken with TCG because QEMU tries to restore KVM MMU state unconditionally. The result is a SIGSEGV in kvm_vm_ioctl(): #0 kvm_vm_ioctl (s=0x0, type=-2146390353) at qemu/accel/kvm/kvm-all.c:2032 #1 0x00000001003e3e2c in kvmppc_configure_v3_mmu (cpu=<optimized out>, radix=<optimized out>, gtse=<optimized out>, proc_tbl=<optimized out>) at qemu/target/ppc/kvm.c:396 #2 0x00000001002f8b88 in spapr_post_load (opaque=0x1019103c0, version_id=<optimized out>) at qemu/hw/ppc/spapr.c:1578 #3 0x000000010059e4cc in vmstate_load_state (f=0x106230000, vmsd=0x1009479e0 <vmstate_spapr>, opaque=0x1019103c0, version_id=<optimized out>) at qemu/migration/vmstate.c:165 #4 0x00000001005987e0 in vmstate_load (f=<optimized out>, se=<optimized out>) at qemu/migration/savevm.c:748 This patch fixes the problem by not calling the KVM function with the TCG mode. Fixes: d39c90f5f3 ("spapr: Fix migration of Radix guests") Signed-off-by: Laurent Vivier <lvivier@redhat.com> Reviewed-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> 30 November 2017, 02:57:51 UTC
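A sketch of the shape of the fix, with placeholder functions rather than QEMU APIs: the KVM-only MMU configuration is skipped unless KVM is actually in use.

    /* Sketch: only program the KVM MMU on the incoming side when KVM is
     * in use; under TCG the call is skipped. kvm_in_use() and
     * configure_kvm_mmu() are placeholders, not QEMU functions. */
    #include <stdio.h>

    static int kvm_in_use(void)
    {
        return 0;                    /* pretend we are running under TCG */
    }

    static void configure_kvm_mmu(void)
    {
        /* would issue the KVM ioctl here */
    }

    static int post_load_sketch(void)
    {
        if (kvm_in_use()) {          /* the missing guard */
            configure_kvm_mmu();
        }
        return 0;
    }

    int main(void)
    {
        printf("post_load: %d\n", post_load_sketch());
        return 0;
    }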
c11d612 Update version for v2.11.0-rc3 release Signed-off-by: Peter Maydell <peter.maydell@linaro.org> 29 November 2017, 17:59:34 UTC
915308b Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging Block layer patches for 2.11.0-rc3 # gpg: Signature made Wed 29 Nov 2017 15:25:13 GMT # gpg: using RSA key 0x7F09B272C88F2FD6 # gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" # Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6 * remotes/kevin/tags/for-upstream: block/nfs: fix nfs_client_open for filesize greater than 1TB blockjob: reimplement block_job_sleep_ns to allow cancellation blockjob: introduce block_job_do_yield blockjob: remove clock argument from block_job_sleep_ns block: Expect graph changes in bdrv_parent_drained_begin/end blockjob: Remove the job from the list earlier in block_job_unref() QAPI & interop: Clarify events emitted by 'block-job-cancel' qemu-options: Mention locking option of file driver docs: Add image locking subsection iotests: fix 075 and 078 Signed-off-by: Peter Maydell <peter.maydell@linaro.org> 29 November 2017, 16:25:23 UTC
5591c00 Merge remote-tracking branch 'mreitz/tags/pull-block-2017-11-29' into queue-block One block patch for 2.11.0-rc3 # gpg: Signature made Wed Nov 29 15:28:38 2017 CET # gpg: using RSA key F407DB0061D5CF40 # gpg: Good signature from "Max Reitz <mreitz@redhat.com>" # Primary key fingerprint: 91BE B60A 30DB 3E88 57D1 1829 F407 DB00 61D5 CF40 * mreitz/tags/pull-block-2017-11-29: block/nfs: fix nfs_client_open for filesize greater than 1TB Signed-off-by: Kevin Wolf <kwolf@redhat.com> 29 November 2017, 14:37:31 UTC
f1a7ff7 block/nfs: fix nfs_client_open for filesize greater than 1TB DIV_ROUND_UP(st.st_size, BDRV_SECTOR_SIZE) was overflowing ret (int) if st.st_size is greater than 1TB. Cc: qemu-stable@nongnu.org Signed-off-by: Peter Lieven <pl@kamp.de> Message-id: 1511798407-31129-1-git-send-email-pl@kamp.de Signed-off-by: Max Reitz <mreitz@redhat.com> 29 November 2017, 14:28:15 UTC
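A small self-contained illustration of the overflow: rounding a >1 TB byte count up to 512-byte sectors exceeds what a 32-bit int can hold, so the result has to live in a 64-bit type. DIV_ROUND_UP is written out locally for clarity.

    /* Sketch: the same rounded-up sector count stored in an int versus an
     * int64_t; the 32-bit variable cannot represent it for a 2 TB image. */
    #include <inttypes.h>
    #include <stdint.h>
    #include <stdio.h>

    #define SECTOR_SIZE 512ULL
    #define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

    int main(void)
    {
        int64_t st_size = 2LL * 1024 * 1024 * 1024 * 1024;   /* a 2 TB image */

        int bad = DIV_ROUND_UP(st_size, SECTOR_SIZE);        /* overflows int */
        int64_t good = DIV_ROUND_UP(st_size, SECTOR_SIZE);   /* fits in 64 bits */

        printf("int result: %d, int64_t result: %" PRId64 "\n", bad, good);
        return 0;
    }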
fc24908 blockjob: reimplement block_job_sleep_ns to allow cancellation This reverts the effects of commit 4afeffc857 ("blockjob: do not allow coroutine double entry or entry-after-completion", 2017-11-21) This fixed the symptom of a bug rather than the root cause. Canceling the wait on a sleeping blockjob coroutine is generally fine, we just need to make it work correctly across AioContexts. To do so, use a QEMUTimer that calls block_job_enter. Use a mutex to ensure that block_job_enter synchronizes correctly with block_job_sleep_ns. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Tested-By: Jeff Cody <jcody@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> 29 November 2017, 14:26:21 UTC
356f59b blockjob: introduce block_job_do_yield Hide the clearing of job->busy in a single function, and set it in block_job_enter. This lets block_job_do_yield verify that qemu_coroutine_enter is not used while job->busy = false. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Tested-By: Jeff Cody <jcody@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> 29 November 2017, 14:11:14 UTC
5bf1d5a blockjob: remove clock argument from block_job_sleep_ns All callers are using QEMU_CLOCK_REALTIME, and it will not be possible to support more than one clock when block_job_sleep_ns switches to a single timer stored in the BlockJob struct. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Tested-By: Jeff Cody <jcody@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> 29 November 2017, 14:11:02 UTC
02d2130 block: Expect graph changes in bdrv_parent_drained_begin/end The .drained_begin/end callbacks can (directly or indirectly via aio_poll()) cause block nodes to be removed or the current BdrvChild to point to a different child node. Use QLIST_FOREACH_SAFE() to make sure we don't access invalid BlockDriverStates or accidentally continue iterating the parents of the new child node instead of the node we actually came from. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Tested-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> 29 November 2017, 13:22:03 UTC
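A sketch of why "safe" iteration matters when elements can disappear mid-walk: the next pointer is saved before the current element may be freed. This mirrors the idea behind QLIST_FOREACH_SAFE with a plain singly linked list; it is not the QEMU macro itself.

    /* Sketch: remove elements while walking a list without ever reading
     * a pointer from a node that has already been freed. */
    #include <stdio.h>
    #include <stdlib.h>

    struct node {
        int value;
        struct node *next;
    };

    static void drop_evens(struct node **head)
    {
        struct node **link = head;
        struct node *cur, *next;

        for (cur = *head; cur; cur = next) {
            next = cur->next;              /* saved before cur may be freed */
            if (cur->value % 2 == 0) {
                *link = next;
                free(cur);
            } else {
                link = &cur->next;
            }
        }
    }

    int main(void)
    {
        struct node *head = NULL;
        for (int i = 5; i >= 1; i--) {
            struct node *n = malloc(sizeof(*n));
            n->value = i;
            n->next = head;
            head = n;
        }
        drop_evens(&head);
        for (struct node *n = head; n; n = n->next) {
            printf("%d ", n->value);       /* prints: 1 3 5 */
        }
        printf("\n");
        return 0;
    }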
0a3e155 blockjob: Remove the job from the list earlier in block_job_unref() When destroying a block job in block_job_unref() we should remove it from the job list before calling block_job_remove_all_bdrv(). This is because removing the BDSs can trigger an aio_poll() and wake up other jobs that might attempt to use the block job list. If that happens the job we're currently destroying should not be in that list anymore. Signed-off-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> 28 November 2017, 15:59:24 UTC
844496f Merge remote-tracking branch 'remotes/ericb/tags/pull-nbd-2017-11-28' into staging nbd patches for 2017-11-28 Eric Blake - 0/2 fix two NBD server CVEs # gpg: Signature made Tue 28 Nov 2017 12:58:29 GMT # gpg: using RSA key 0xA7A16B4A2527436A # gpg: Good signature from "Eric Blake <eblake@redhat.com>" # gpg: aka "Eric Blake (Free Software Programmer) <ebb9@byu.net>" # gpg: aka "[jpeg image of size 6874]" # Primary key fingerprint: 71C2 CC22 B1C4 6029 27D2 F3AA A7A1 6B4A 2527 436A * remotes/ericb/tags/pull-nbd-2017-11-28: nbd/server: CVE-2017-15118 Stack smash on large export name nbd/server: CVE-2017-15119 Reject options larger than 32M Signed-off-by: Peter Maydell <peter.maydell@linaro.org> 28 November 2017, 13:12:48 UTC
51ae4f8 nbd/server: CVE-2017-15118 Stack smash on large export name Introduced in commit f37708f6b8 (2.10). The NBD spec says a client can request export names up to 4096 bytes in length, even though they should not expect success on names longer than 256. However, qemu hard-codes the limit of 256, and fails to filter out a client that probes for a longer name; the result is a stack smash that can potentially give an attacker arbitrary control over the qemu process. The smash can be easily demonstrated with this client: $ qemu-io -f raw nbd://localhost:10809/$(printf %3000d 1 | tr ' ' a) If the qemu NBD server binary (whether the standalone qemu-nbd, or the builtin server of QMP nbd-server-start) was compiled with -fstack-protector-strong, the ability to exploit the stack smash into arbitrary execution is a lot more difficult (but still theoretically possible to a determined attacker, perhaps in combination with other CVEs). Still, crashing a running qemu (and losing the VM) is bad enough, even if the attacker did not obtain full execution control. CC: qemu-stable@nongnu.org Signed-off-by: Eric Blake <eblake@redhat.com> 28 November 2017, 12:58:01 UTC
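A minimal sketch of the bounds check implied by the fix, with illustrative names: the advertised name length from the wire is validated against the local buffer before anything is copied. The 256-byte limit follows the commit text.

    /* Sketch: never copy a client-supplied export name into a fixed-size
     * buffer without checking the stated length first. */
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    #define NAME_MAX_LEN 256

    static int read_export_name(const uint8_t *wire, uint32_t wire_len,
                                char *name_out /* NAME_MAX_LEN + 1 bytes */)
    {
        if (wire_len > NAME_MAX_LEN) {
            return -1;                  /* reject instead of smashing the stack */
        }
        memcpy(name_out, wire, wire_len);
        name_out[wire_len] = '\0';
        return 0;
    }

    int main(void)
    {
        uint8_t wire[3000];
        memset(wire, 'a', sizeof(wire));
        char name[NAME_MAX_LEN + 1];
        printf("3000-byte name accepted? %d\n",
               read_export_name(wire, sizeof(wire), name) == 0);
        return 0;
    }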
fdad35e nbd/server: CVE-2017-15119 Reject options larger than 32M The NBD spec gives us permission to abruptly disconnect on clients that send outrageously large option requests, rather than having to spend the time reading to the end of the option. No real option request requires that much data anyways; and meanwhile, we already have the practice of abruptly dropping the connection on any client that sends NBD_CMD_WRITE with a payload larger than 32M. For comparison, nbdkit drops the connection on any request with more than 4096 bytes; however, that limit is probably too low (as the NBD spec states an export name can theoretically be up to 4096 bytes, which means a valid NBD_OPT_INFO could be even longer) - even if qemu doesn't permit exports longer than 256 bytes. It could be argued that a malicious client trying to get us to read nearly 4G of data on a bad request is a form of denial of service. In particular, if the server requires TLS, but a client that does not know the TLS credentials sends any option (other than NBD_OPT_STARTTLS or NBD_OPT_EXPORT_NAME) with a stated payload of nearly 4G, then the server was keeping the connection alive trying to read all the payload, tying up resources that it would rather be spending on a client that can get past the TLS handshake. Hence, this warranted a CVE. Present since at least 2.5 when handling known options, and made worse in 2.6 when fixing support for NBD_FLAG_C_FIXED_NEWSTYLE to handle unknown options. CC: qemu-stable@nongnu.org Signed-off-by: Eric Blake <eblake@redhat.com> 28 November 2017, 12:42:26 UTC
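A minimal sketch of the size cap described, with illustrative constants: any option whose stated payload exceeds 32M is rejected before reading it.

    /* Sketch: drop the connection as soon as the stated option payload is
     * unreasonably large, rather than reading gigabytes of attacker data. */
    #include <stdint.h>
    #include <stdio.h>

    #define MAX_OPT_LEN (32 * 1024 * 1024)   /* 32M, mirroring the write cap */

    static int option_length_ok(uint32_t stated_len)
    {
        return stated_len <= MAX_OPT_LEN;    /* otherwise disconnect abruptly */
    }

    int main(void)
    {
        printf("near-4G option allowed? %d\n", option_length_ok(0xfffffff0u));
        printf("1K option allowed?      %d\n", option_length_ok(1024));
        return 0;
    }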
a914f04 Merge remote-tracking branch 'remotes/berrange/tags/pull-qio-2017-11-28-1' into staging Merge qio 2017/11/28 v1 # gpg: Signature made Tue 28 Nov 2017 10:49:08 GMT # gpg: using RSA key 0xBE86EBB415104FDF # gpg: Good signature from "Daniel P. Berrange <dan@berrange.com>" # gpg: aka "Daniel P. Berrange <berrange@redhat.com>" # Primary key fingerprint: DAF3 A6FD B26B 6291 2D0E 8E3F BE86 EBB4 1510 4FDF * remotes/berrange/tags/pull-qio-2017-11-28-1: sockets: avoid crash when cleaning up sockets for an invalid FD Signed-off-by: Peter Maydell <peter.maydell@linaro.org> 28 November 2017, 11:52:11 UTC
2d7ad7c sockets: avoid crash when cleaning up sockets for an invalid FD If socket_listen_cleanup is passed an invalid FD, then querying the socket local address will fail. We must thus be prepared for the returned addr to be NULL Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Daniel P. Berrange <berrange@redhat.com> 28 November 2017, 10:48:04 UTC
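A sketch of the defensive pattern described, with stand-in names: the cleanup path tolerates a NULL address when the query fails for an invalid FD.

    /* Sketch: a cleanup helper that copes with the address query failing
     * instead of dereferencing a NULL result. */
    #include <stdio.h>
    #include <stdlib.h>

    struct sock_addr { char path[64]; };

    static struct sock_addr *query_local_address(int fd)
    {
        if (fd < 0) {
            return NULL;            /* the query fails for an invalid FD */
        }
        return calloc(1, sizeof(struct sock_addr));
    }

    static void listen_cleanup(int fd)
    {
        struct sock_addr *addr = query_local_address(fd);
        if (!addr) {
            return;                 /* nothing to clean up, and no crash */
        }
        /* ... remove any UNIX socket file, etc. ... */
        free(addr);
    }

    int main(void)
    {
        listen_cleanup(-1);         /* must not crash */
        printf("cleanup survived an invalid FD\n");
        return 0;
    }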
c7e1f82 Merge remote-tracking branch 'remotes/jasowang/tags/net-pull-request' into staging # gpg: Signature made Tue 28 Nov 2017 03:58:11 GMT # gpg: using RSA key 0xEF04965B398D6211 # gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <jasowang@redhat.com>" # gpg: WARNING: This key is not certified with sufficiently trusted signatures! # gpg: It is not certain that the signature belongs to the owner. # Primary key fingerprint: 215D 46F4 8246 689E C77F 3562 EF04 965B 398D 6211 * remotes/jasowang/tags/net-pull-request: virtio-net: don't touch virtqueue if vm is stopped Signed-off-by: Peter Maydell <peter.maydell@linaro.org> 28 November 2017, 10:03:26 UTC
70e53e6 virtio-net: don't touch virtqueue if vm is stopped Guest state should not be touched if the VM is stopped; unfortunately we didn't check the running state and tried to drain the tx queue unconditionally in virtio_net_set_status(). A crash was then noticed on the migration destination when the user typed quit after the virtqueue state was loaded but before the region cache was initialized. In this case, virtio_net_drop_tx_queue_data() tries to access the uninitialized region cache. Fix this by only dropping tx queue data when the vm is running. Fixes: 283e2c2adcb80 ("net: virtio-net discards TX data after link down") Cc: Yuri Benditovich <yuri.benditovich@daynix.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Stefan Hajnoczi <stefanha@redhat.com> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: qemu-stable@nongnu.org Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> 28 November 2017, 03:54:50 UTC
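A sketch of the run-state guard described, with simplified stand-ins for the runstate flag and the drain function: queue state is only touched while the VM is running.

    /* Sketch: drain the tx queue on link-down only if the VM is running,
     * so state loaded by migration is left alone while stopped. */
    #include <stdio.h>

    static int vm_running = 0;          /* destination side, before 'cont' */

    static void drain_tx_queue(void)
    {
        printf("draining tx queue\n");
    }

    static void set_link_status(int link_up)
    {
        if (!link_up && vm_running) {   /* the added condition */
            drain_tx_queue();
        }
    }

    int main(void)
    {
        set_link_status(0);             /* VM stopped: nothing is touched */
        vm_running = 1;
        set_link_status(0);             /* VM running: queue is drained */
        return 0;
    }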
c117bb1 QAPI & interop: Clarify events emitted by 'block-job-cancel' When you cancel an in-progress 'mirror' job (or "active `block-commit`") with QMP `block-job-cancel`, it emits the event: BLOCK_JOB_CANCELLED. However, when `block-job-cancel` is issued *after* `drive-mirror` has indicated (via the event BLOCK_JOB_READY) that the source and destination have reached synchronization: [...] # Snip `drive-mirror` invocation & outputs { "execute":"block-job-cancel", "arguments":{ "device":"virtio0" } } {"return": {}} It (`block-job-cancel`) will counterintuitively emit the event 'BLOCK_JOB_COMPLETED': { "timestamp":{ "seconds":1510678024, "microseconds":526240 }, "event":"BLOCK_JOB_COMPLETED", "data":{ "device":"virtio0", "len":41126400, "offset":41126400, "speed":0, "type":"mirror" } } But this is expected behaviour, where the _COMPLETED event indicates that synchronization has successfully ended (and the destination now has a point-in-time copy, which is at the time of cancel). So add a small note to this effect in 'block-core.json'. While at it, also update the "Live disk synchronization -- drive-mirror and blockdev-mirror" section in 'live-block-operations.rst'. (Thanks: Max Reitz for reminding me of this caveat on IRC.) Signed-off-by: Kashyap Chamarthy <kchamart@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> 27 November 2017, 13:59:35 UTC
5e19aed Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.11-20171127' into staging ppc patch queue 2017-11-27 This series contains a couple of migration fixes for hash guests on POWER9 radix MMU hosts. # gpg: Signature made Mon 27 Nov 2017 04:27:15 GMT # gpg: using RSA key 0x6C38CACA20D9B392 # gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>" # gpg: aka "David Gibson (Red Hat) <dgibson@redhat.com>" # gpg: aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>" # gpg: aka "David Gibson (kernel.org) <dwg@kernel.org>" # Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392 * remotes/dgibson/tags/ppc-for-2.11-20171127: target/ppc: Fix setting of cpu->compat_pvr on incoming migration target/ppc: Move setting of patb_entry on hash table init Signed-off-by: Peter Maydell <peter.maydell@linaro.org> 27 November 2017, 11:16:20 UTC
1878eaf qemu-options: Mention locking option of file driver Signed-off-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> 27 November 2017, 10:25:41 UTC
b1d1cb2 docs: Add image locking subsection This documents the image locking feature and explains when and how related options can be used. Signed-off-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> 27 November 2017, 10:25:41 UTC
45f1882 iotests: fix 075 and 078 Both of these tests are for formats which now stipulate that they are read-only. Adjust the tests to match. Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Lukáš Doktor <ldoktor@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> 27 November 2017, 10:25:41 UTC
e07cc19 target/ppc: Fix setting of cpu->compat_pvr on incoming migration cpu->compat_pvr is used to store the current compat mode of the cpu. On the receiving side during incoming migration we check compatibility with the compat mode by calling ppc_set_compat(). However we fail to set the compat mode with the hypervisor since the "new" compat mode doesn't differ from the current (due to a "cpu->compat_pvr != compat_pvr" check). This means that kvm runs the vcpus without a compat mode, which is the incorrect behaviour. The implication being that a compatibility mode will never be in effect after migration. To fix this so that the compat mode is correctly set with the hypervisor, store the desired compat mode and reset cpu->compat_pvr to zero before calling ppc_set_compat(). Fixes: 5dfaa532 ("ppc: fix ppc_set_compat() with KVM PR") Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> 27 November 2017, 01:20:11 UTC
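A sketch of the ordering fix described, with stand-in types: the desired compat mode is remembered, the cached value is cleared, and only then is the mode applied, so the "no change" check cannot skip programming the hypervisor. The example PVR value is illustrative.

    /* Sketch: force the compat-setting path to run on incoming migration
     * by clearing the cached value before re-applying it. */
    #include <stdint.h>
    #include <stdio.h>

    struct cpu { uint32_t compat_pvr; };

    static void apply_compat(struct cpu *c, uint32_t compat_pvr)
    {
        if (c->compat_pvr == compat_pvr) {
            return;                     /* skipped: looks like no change */
        }
        /* ... tell the hypervisor about the new mode here ... */
        c->compat_pvr = compat_pvr;
        printf("compat mode 0x%x applied\n", compat_pvr);
    }

    static void incoming_migration(struct cpu *c)
    {
        uint32_t wanted = c->compat_pvr;    /* value restored by migration */
        c->compat_pvr = 0;                  /* force apply_compat to act */
        apply_compat(c, wanted);
    }

    int main(void)
    {
        struct cpu c = { .compat_pvr = 0x0f000004 };   /* example compat PVR */
        incoming_migration(&c);
        return 0;
    }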
ee4d9ec target/ppc: Move setting of patb_entry on hash table init The patb_entry is used to store the location of the process table in guest memory. The msb is also used to indicate the mmu mode of the guest, that is patb_entry & 1 << 63 ? radix_mode : hash_mode. Currently we set this to zero in spapr_setup_hpt_and_vrma() since if this function gets called then we know we're hash. However some code paths, such as setting up the hpt on incoming migration of a hash guest, call spapr_reallocate_hpt() directly, bypassing this higher-level function. Since we assume radix if the host is capable, this results in the msb in patb_entry being left set, so in spapr_post_load() we call kvmppc_configure_v3_mmu() and tell the host we're radix, which as expected means addresses cannot be translated once we actually run the cpu. To fix this, move the zeroing of patb_entry into spapr_reallocate_hpt(). Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> 27 November 2017, 01:20:11 UTC
e7b47c2 osdep.h: Make TIME_MAX handle different time_t types In our various supported host OSes, the time_t type may be either 32 or 64 bit, and could in theory also be either signed or unsigned. Notably, in OpenBSD time_t is a 64 bit type even if 'long' is 32 bits, so using LONG_MAX for TIME_MAX is incorrect. Use an approach suggested by Paolo Bonzini which calculates the maximum value of the type rather than hardcoding it; to do this we use the TYPE_MAXIMUM macro from Gnulib. Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-id: 1511452598-6077-1-git-send-email-peter.maydell@linaro.org 24 November 2017, 13:23:36 UTC
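The commit refers to Gnulib's TYPE_MAXIMUM; the sketch below uses that widely known formulation to compute the largest value of time_t at compile time, whatever its width or signedness. It is shown for illustration, not as the exact QEMU patch.

    /* Sketch: compute TIME_MAX from the type itself instead of hardcoding
     * LONG_MAX, so 32/64-bit and signed/unsigned time_t all work. */
    #include <limits.h>
    #include <stdio.h>
    #include <time.h>

    #define TYPE_SIGNED(t)  (!((t)0 < (t)-1))
    #define TYPE_WIDTH(t)   (sizeof(t) * CHAR_BIT)
    #define TYPE_MAXIMUM(t)                                                 \
        ((t)(!TYPE_SIGNED(t) ? (t)-1                                        \
                             : ((((t)1 << (TYPE_WIDTH(t) - 2)) - 1) * 2 + 1)))

    int main(void)
    {
        printf("time_t is %zu bytes; TIME_MAX = %lld\n",
               sizeof(time_t), (long long)TYPE_MAXIMUM(time_t));
        return 0;
    }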
79283dd hw/arm/virt: Add 2.11 machine type Add virt-2.11 machine type. Signed-off-by: Eric Auger <eric.auger@redhat.com> Message-id: 1511516626-21178-1-git-send-email-eric.auger@redhat.com Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org> 24 November 2017, 11:28:56 UTC
38e83b6 Merge remote-tracking branch 'remotes/cohuck/tags/s390x-20171124' into staging Deal with the fallout from the deletion of the old s390 virtio header in Linux master. # gpg: Signature made Fri 24 Nov 2017 09:56:49 GMT # gpg: using RSA key 0xDECF6B93C6F02FAF # gpg: Good signature from "Cornelia Huck <conny@cornelia-huck.de>" # gpg: aka "Cornelia Huck <huckc@linux.vnet.ibm.com>" # gpg: aka "Cornelia Huck <cornelia.huck@de.ibm.com>" # gpg: aka "Cornelia Huck <cohuck@kernel.org>" # gpg: aka "Cornelia Huck <cohuck@redhat.com>" # Primary key fingerprint: C3D0 D66D C362 4FF6 A8C0 18CE DECF 6B93 C6F0 2FAF * remotes/cohuck/tags/s390x-20171124: s390/kvm_virtio/linux-headers: remove traces of old virtio transport Signed-off-by: Peter Maydell <peter.maydell@linaro.org> 24 November 2017, 10:26:20 UTC
c1c4c21 s390/kvm_virtio/linux-headers: remove traces of old virtio transport We no longer support the old s390 transport, neither does the newest Linux kernel. Remove it from the linux header script as well as the s390x virtio code. We still should handle the VIRTIO_NOTIFY hypercall, to tolerate early printk on older guest kernels without an sclp console. We continue to ignore these events. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Message-Id: <20171115154223.109991-1-borntraeger@de.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com> 24 November 2017, 09:52:05 UTC
c65d5e4 configure: Deal with OpenBSD/i386 emulation linker OpenBSD/i386 uses elf_i386_obsd for the emulation linker. Signed-off-by: Brad Smith <brad@comstyle.com> Message-id: 20171107234608.GA395@humpty.home.comstyle.com Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org> 23 November 2017, 16:52:24 UTC
54c85be Merge remote-tracking branch 'remotes/juanquintela/tags/migration/20171122' into staging migration/next for 20171122 # gpg: Signature made Wed 22 Nov 2017 08:43:13 GMT # gpg: using RSA key 0xF487EF185872D723 # gpg: Good signature from "Juan Quintela <quintela@redhat.com>" # gpg: aka "Juan Quintela <quintela@trasno.org>" # Primary key fingerprint: 1899 FF8E DEBF 58CC EE03 4B82 F487 EF18 5872 D723 * remotes/juanquintela/tags/migration/20171122: migration/ram.c: do not set 'postcopy_running' in POSTCOPY_INCOMING_END migration, xen: Fix block image lock issue on live migration Signed-off-by: Peter Maydell <peter.maydell@linaro.org> 23 November 2017, 13:50:00 UTC
1b89975 Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.11-20171122' into staging ppc patch queue 2017-11-22 Several more fixes to merge for qemu-2.11. # gpg: Signature made Wed 22 Nov 2017 04:29:57 GMT # gpg: using RSA key 0x6C38CACA20D9B392 # gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>" # gpg: aka "David Gibson (Red Hat) <dgibson@redhat.com>" # gpg: aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>" # gpg: aka "David Gibson (kernel.org) <dwg@kernel.org>" # Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392 * remotes/dgibson/tags/ppc-for-2.11-20171122: ppc: fix VTB migration spapr: Implement bug in spapr-vty device to be compatible with PowerVM hw/ppc/spapr: Fix virtio-scsi bootindex handling for LUNs >= 256 Signed-off-by: Peter Maydell <peter.maydell@linaro.org> 23 November 2017, 13:15:02 UTC
2fe47fc Fix build of console and GUI executables for Windows It was broken by commit 8ecc89f6e792152496eccb684d6c8c48aba8027d which moved the SDL linker flags from macro libs_softmmu to macro SDL_LIBS. Signed-off-by: Stefan Weil <sw@weilnetz.de> Message-id: 20171116163732.31584-1-sw@weilnetz.de Signed-off-by: Peter Maydell <peter.maydell@linaro.org> 23 November 2017, 10:46:42 UTC
8f2c4cb tcg: Fix compilation without TCG Commit 27266271977c started to use tb_unlock() and tlb_set_dirty() on non TCG code. Add the functions as stubs, so that builds with TCG disabled continue to compile. Signed-off-by: Juan Quintela <quintela@redhat.com> Acked-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> [PMM: tweaked commit message] Signed-off-by: Peter Maydell <peter.maydell@linaro.org> 23 November 2017, 10:02:44 UTC
acab30b migration/ram.c: do not set 'postcopy_running' in POSTCOPY_INCOMING_END When migrating a VM with 'migrate_set_capability postcopy-ram on' a postcopy_state is set during the process, ending up with the state POSTCOPY_INCOMING_END when the migration is over. This postcopy_state is taken into account inside ram_load to check how it will load the memory pages. This same ram_load is called when in a loadvm command. Inside ram_load, the logic to see if we're at postcopy_running state is: postcopy_running = postcopy_state_get() >= POSTCOPY_INCOMING_LISTENING postcopy_state_get() returns this enum type: typedef enum { POSTCOPY_INCOMING_NONE = 0, POSTCOPY_INCOMING_ADVISE, POSTCOPY_INCOMING_DISCARD, POSTCOPY_INCOMING_LISTENING, POSTCOPY_INCOMING_RUNNING, POSTCOPY_INCOMING_END } PostcopyState; In the case where ram_load is executed and postcopy_state is POSTCOPY_INCOMING_END, postcopy_running will be set to 'true' and ram_load will behave like a postcopy is in progress. This scenario isn't achievable in a migration but it is reproducible when executing savevm/loadvm after migrating with 'postcopy-ram on', causing loadvm to fail with Error -22: Source: (qemu) migrate_set_capability postcopy-ram on (qemu) migrate tcp:127.0.0.1:4444 Dest: (qemu) migrate_set_capability postcopy-ram on (qemu) ubuntu1704-intel login: Ubuntu 17.04 ubuntu1704-intel ttyS0 ubuntu1704-intel login: (qemu) (qemu) savevm test1 (qemu) loadvm test1 Unknown combination of migration flags: 0x4 (postcopy mode) error while loading state for instance 0x0 of device 'ram' Error -22 while loading VM state (qemu) This patch fixes this problem by changing the existing logic for postcopy_advised and postcopy_running in ram_load, making them 'false' if we're at POSTCOPY_INCOMING_END state. Signed-off-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com> CC: Juan Quintela <quintela@redhat.com> CC: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reported-by: Balamuruhan S <bala24@linux.vnet.ibm.com> Signed-off-by: Juan Quintela <quintela@redhat.com> 22 November 2017, 07:50:37 UTC
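A small illustration of the corrected predicate: POSTCOPY_INCOMING_END is no longer treated as "postcopy running" when ram_load is reached again via loadvm. The enum mirrors the one quoted in the commit message; the predicate is an illustration of the fix's logic, not the exact patch.

    /* Sketch: the END state must not count as a running postcopy. */
    #include <stdio.h>

    typedef enum {
        POSTCOPY_INCOMING_NONE = 0,
        POSTCOPY_INCOMING_ADVISE,
        POSTCOPY_INCOMING_DISCARD,
        POSTCOPY_INCOMING_LISTENING,
        POSTCOPY_INCOMING_RUNNING,
        POSTCOPY_INCOMING_END
    } PostcopyState;

    static int postcopy_running(PostcopyState s)
    {
        return s >= POSTCOPY_INCOMING_LISTENING && s != POSTCOPY_INCOMING_END;
    }

    int main(void)
    {
        printf("RUNNING -> %d\n", postcopy_running(POSTCOPY_INCOMING_RUNNING));
        printf("END     -> %d\n", postcopy_running(POSTCOPY_INCOMING_END));
        return 0;
    }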