https://github.com/torvalds/linux
Revision 844a5fe219cf472060315971e15cbf97674a3324 authored by Paolo Bonzini on 08 March 2016, 11:13:39 UTC, committed by Paolo Bonzini on 10 March 2016, 10:26:07 UTC
Yes, all of these are needed. :) This is admittedly a bit odd, but
kvm-unit-tests access.flat tests this if you run it with "-cpu host"
and of course ept=0.

KVM runs the guest with CR0.WP=1, so it must handle supervisor writes
specially when pte.u=1/pte.w=0/CR0.WP=0.  Such writes cause a fault
when U=1 and W=0 in the SPTE, but they must succeed because CR0.WP=0.
When KVM gets the fault, it sets U=0 and W=1 in the shadow PTE and
restarts execution.  This will still cause a user write to fault, while
supervisor writes will succeed.  User reads will fault spuriously now,
and KVM will then flip U and W again in the SPTE (U=1, W=0).  User reads
will be enabled and supervisor writes disabled, going back to the
originary situation where supervisor writes fault spuriously.

When SMEP is in effect, however, U=0 will enable kernel execution of
this page.  To avoid this, KVM also sets NX=1 in the shadow PTE together
with U=0.  If the guest has not enabled NX, the result is a continuous
stream of page faults due to the NX bit being reserved.

The fix is to force EFER.NX=1 even if the CPU is taking care of the EFER
switch.  (All machines with SMEP have the CPU_LOAD_IA32_EFER vm-entry
control, so they do not use user-return notifiers for EFER---if they did,
EFER.NX would be forced to the same value as the host).

There is another bug in the reserved bit check, which I've split to a
separate patch for easier application to stable kernels.

Cc: stable@vger.kernel.org
Cc: Andy Lutomirski <luto@amacapital.net>
Reviewed-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Fixes: f6577a5fa15d82217ca73c74cd2dcbc0f6c781dd
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
1 parent 313f636
Raw File
Tip revision: 844a5fe219cf472060315971e15cbf97674a3324 authored by Paolo Bonzini on 08 March 2016, 11:13:39 UTC
KVM: MMU: fix ept=0/pte.u=1/pte.w=0/CR0.WP=0/CR4.SMEP=1/EFER.NX=0 combo
Tip revision: 844a5fe
Kconfig
#
# Security configuration
#

menu "Security options"

source security/keys/Kconfig

config SECURITY_DMESG_RESTRICT
	bool "Restrict unprivileged access to the kernel syslog"
	default n
	help
	  This enforces restrictions on unprivileged users reading the kernel
	  syslog via dmesg(8).

	  If this option is not selected, no restrictions will be enforced
	  unless the dmesg_restrict sysctl is explicitly set to (1).

	  If you are unsure how to answer this question, answer N.

config SECURITY
	bool "Enable different security models"
	depends on SYSFS
	depends on MULTIUSER
	help
	  This allows you to choose different security modules to be
	  configured into your kernel.

	  If this option is not selected, the default Linux security
	  model will be used.

	  If you are unsure how to answer this question, answer N.

config SECURITYFS
	bool "Enable the securityfs filesystem"
	help
	  This will build the securityfs filesystem.  It is currently used by
	  the TPM bios character driver and IMA, an integrity provider.  It is
	  not used by SELinux or SMACK.

	  If you are unsure how to answer this question, answer N.

config SECURITY_NETWORK
	bool "Socket and Networking Security Hooks"
	depends on SECURITY
	help
	  This enables the socket and networking security hooks.
	  If enabled, a security module can use these hooks to
	  implement socket and networking access controls.
	  If you are unsure how to answer this question, answer N.

config SECURITY_NETWORK_XFRM
	bool "XFRM (IPSec) Networking Security Hooks"
	depends on XFRM && SECURITY_NETWORK
	help
	  This enables the XFRM (IPSec) networking security hooks.
	  If enabled, a security module can use these hooks to
	  implement per-packet access controls based on labels
	  derived from IPSec policy.  Non-IPSec communications are
	  designated as unlabelled, and only sockets authorized
	  to communicate unlabelled data can send without using
	  IPSec.
	  If you are unsure how to answer this question, answer N.

config SECURITY_PATH
	bool "Security hooks for pathname based access control"
	depends on SECURITY
	help
	  This enables the security hooks for pathname based access control.
	  If enabled, a security module can use these hooks to
	  implement pathname based access controls.
	  If you are unsure how to answer this question, answer N.

config INTEL_TXT
	bool "Enable Intel(R) Trusted Execution Technology (Intel(R) TXT)"
	depends on HAVE_INTEL_TXT
	help
	  This option enables support for booting the kernel with the
	  Trusted Boot (tboot) module. This will utilize
	  Intel(R) Trusted Execution Technology to perform a measured launch
	  of the kernel. If the system does not support Intel(R) TXT, this
	  will have no effect.

	  Intel TXT will provide higher assurance of system configuration and
	  initial state as well as data reset protection.  This is used to
	  create a robust initial kernel measurement and verification, which
	  helps to ensure that kernel security mechanisms are functioning
	  correctly. This level of protection requires a root of trust outside
	  of the kernel itself.

	  Intel TXT also helps solve real end user concerns about having
	  confidence that their hardware is running the VMM or kernel that
	  it was configured with, especially since they may be responsible for
	  providing such assurances to VMs and services running on it.

	  See <http://www.intel.com/technology/security/> for more information
	  about Intel(R) TXT.
	  See <http://tboot.sourceforge.net> for more information about tboot.
	  See Documentation/intel_txt.txt for a description of how to enable
	  Intel TXT support in a kernel boot.

	  If you are unsure as to whether this is required, answer N.

config LSM_MMAP_MIN_ADDR
	int "Low address space for LSM to protect from user allocation"
	depends on SECURITY && SECURITY_SELINUX
	default 32768 if ARM || (ARM64 && COMPAT)
	default 65536
	help
	  This is the portion of low virtual memory which should be protected
	  from userspace allocation.  Keeping a user from writing to low pages
	  can help reduce the impact of kernel NULL pointer bugs.

	  For most ia64, ppc64 and x86 users with lots of address space
	  a value of 65536 is reasonable and should cause no problems.
	  On arm and other archs it should not be higher than 32768.
	  Programs which use vm86 functionality or have some need to map
	  this low address space will need the permission specific to the
	  systems running LSM.

source security/selinux/Kconfig
source security/smack/Kconfig
source security/tomoyo/Kconfig
source security/apparmor/Kconfig
source security/yama/Kconfig

source security/integrity/Kconfig

choice
	prompt "Default security module"
	default DEFAULT_SECURITY_SELINUX if SECURITY_SELINUX
	default DEFAULT_SECURITY_SMACK if SECURITY_SMACK
	default DEFAULT_SECURITY_TOMOYO if SECURITY_TOMOYO
	default DEFAULT_SECURITY_APPARMOR if SECURITY_APPARMOR
	default DEFAULT_SECURITY_DAC

	help
	  Select the security module that will be used by default if the
	  kernel parameter security= is not specified.

	config DEFAULT_SECURITY_SELINUX
		bool "SELinux" if SECURITY_SELINUX=y

	config DEFAULT_SECURITY_SMACK
		bool "Simplified Mandatory Access Control" if SECURITY_SMACK=y

	config DEFAULT_SECURITY_TOMOYO
		bool "TOMOYO" if SECURITY_TOMOYO=y

	config DEFAULT_SECURITY_APPARMOR
		bool "AppArmor" if SECURITY_APPARMOR=y

	config DEFAULT_SECURITY_DAC
		bool "Unix Discretionary Access Controls"

endchoice

config DEFAULT_SECURITY
	string
	default "selinux" if DEFAULT_SECURITY_SELINUX
	default "smack" if DEFAULT_SECURITY_SMACK
	default "tomoyo" if DEFAULT_SECURITY_TOMOYO
	default "apparmor" if DEFAULT_SECURITY_APPARMOR
	default "" if DEFAULT_SECURITY_DAC

endmenu

back to top