Revision 5b77e95dd7790ff6c8fbf1cd8d0104ebed818a03 authored by Alexander Potapenko on 02 April 2019, 11:28:13 UTC, committed by Ingo Molnar on 06 April 2019, 07:52:02 UTC
There's a number of problems with how arch/x86/include/asm/bitops.h
is currently using assembly constraints for the memory region
bitops are modifying:

1) Use memory clobber in bitops that touch arbitrary memory

Certain bit operations that read/write bits take a base pointer and an
arbitrarily large offset to address the bit relative to that base.
Inline assembly constraints aren't expressive enough to tell the
compiler that the assembly directive is going to touch a specific memory
location of unknown size, therefore we have to use the "memory" clobber
to indicate that the assembly is going to access memory locations other
than those listed in the inputs/outputs.

To indicate that BTR/BTS instructions don't necessarily touch the first
sizeof(long) bytes of the argument, we also move the address to assembly
inputs.

This particular change leads to size increase of 124 kernel functions in
a defconfig build. For some of them the diff is in NOP operations, other
end up re-reading values from memory and may potentially slow down the
execution. But without these clobbers the compiler is free to cache
the contents of the bitmaps and use them as if they weren't changed by
the inline assembly.

2) Use byte-sized arguments for operations touching single bytes.

Passing a long value to ANDB/ORB/XORB instructions makes the compiler
treat sizeof(long) bytes as being clobbered, which isn't the case. This
may theoretically lead to worse code in the case of heavy optimization.

Practical impact:

I've built a defconfig kernel and looked through some of the functions
generated by GCC 7.3.0 with and without this clobber, and didn't spot
any miscompilations.

However there is a (trivial) theoretical case where this code leads to
miscompilation:

  https://lkml.org/lkml/2019/3/28/393

using just GCC 8.3.0 with -O2.  It isn't hard to imagine someone writes
such a function in the kernel someday.

So the primary motivation is to fix an existing misuse of the asm
directive, which happens to work in certain configurations now, but
isn't guaranteed to work under different circumstances.

[ --mingo: Added -stable tag because defconfig only builds a fraction
  of the kernel and the trivial testcase looks normal enough to
  be used in existing or in-development code. ]

Signed-off-by: Alexander Potapenko <glider@google.com>
Cc: <stable@vger.kernel.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: James Y Knight <jyknight@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20190402112813.193378-1-glider@google.com
[ Edited the changelog, tidied up one of the defines. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
1 parent faa3604
Raw File
Kconfig.cpu
# SPDX-License-Identifier: GPL-2.0
menu "Processor features"

choice
	prompt "Endianness selection" 
	default CPU_LITTLE_ENDIAN
	help
	  Some SuperH machines can be configured for either little or big
	  endian byte order. These modes require different kernels.

config CPU_LITTLE_ENDIAN
	bool "Little Endian"

config CPU_BIG_ENDIAN
	bool "Big Endian"
	depends on !CPU_SH5

endchoice

config SH_FPU
	def_bool y
	prompt "FPU support"
	depends on CPU_HAS_FPU
	help
	  Selecting this option will enable support for SH processors that
	  have FPU units (ie, SH77xx).

	  This option must be set in order to enable the FPU.

config SH64_FPU_DENORM_FLUSH
	bool "Flush floating point denorms to zero"
	depends on SH_FPU && SUPERH64

config SH_FPU_EMU
	def_bool n
	prompt "FPU emulation support"
	depends on !SH_FPU
	help
	  Selecting this option will enable support for software FPU emulation.
	  Most SH-3 users will want to say Y here, whereas most SH-4 users will
	  want to say N.

config SH_DSP
	def_bool y
	prompt "DSP support"
	depends on CPU_HAS_DSP
	help
	  Selecting this option will enable support for SH processors that
	  have DSP units (ie, SH2-DSP, SH3-DSP, and SH4AL-DSP).

	  This option must be set in order to enable the DSP.

config SH_ADC
	def_bool y
	prompt "ADC support"
	depends on CPU_SH3
	help
	  Selecting this option will allow the Linux kernel to use SH3 on-chip
	  ADC module.

	  If unsure, say N.

config SH_STORE_QUEUES
	bool "Support for Store Queues"
	depends on CPU_SH4
	help
	  Selecting this option will enable an in-kernel API for manipulating
	  the store queues integrated in the SH-4 processors.

config SPECULATIVE_EXECUTION
	bool "Speculative subroutine return"
	depends on CPU_SUBTYPE_SH7780 || CPU_SUBTYPE_SH7785 || CPU_SUBTYPE_SH7786
	help
	  This enables support for a speculative instruction fetch for
	  subroutine return. There are various pitfalls associated with
	  this, as outlined in the SH7780 hardware manual.

	  If unsure, say N.

config SH64_ID2815_WORKAROUND
	bool "Include workaround for SH5-101 cut2 silicon defect ID2815"
	depends on CPU_SUBTYPE_SH5_101

config CPU_HAS_INTEVT
	bool

config CPU_HAS_IPR_IRQ
	bool

config CPU_HAS_SR_RB
	bool
	help
	  This will enable the use of SR.RB register bank usage. Processors
	  that are lacking this bit must have another method in place for
	  accomplishing what is taken care of by the banked registers.

	  See <file:Documentation/sh/register-banks.txt> for further
	  information on SR.RB and register banking in the kernel in general.

config CPU_HAS_PTEAEX
	bool

config CPU_HAS_DSP
	bool

config CPU_HAS_FPU
	bool

endmenu
back to top