Revision 474095e46cd14421821da3201a9fd6a4c070996b authored by Linus Torvalds on 24 April 2015, 16:28:01 UTC, committed by Linus Torvalds on 24 April 2015, 16:28:01 UTC
Pull md updates from Neil Brown:
 "More updates that usual this time.  A few have performance impacts
  which hould mostly be positive, but RAID5 (in particular) can be very
  work-load ensitive...  We'll have to wait and see.

  Highlights:

   - "experimental" code for managing md/raid1 across a cluster using
     DLM.  Code is not ready for general use and triggers a WARNING if
     used.  However it is looking good and mostly done and having it in
     mainline will help co-ordinate development.

   - RAID5/6 can now batch multiple (4K wide) stripe_heads so as to
     handle a full (chunk wide) stripe as a single unit.

   - RAID6 can now perform read-modify-write cycles which should help
     performance on larger arrays: 6 or more devices.

   - RAID5/6 stripe cache now grows and shrinks dynamically.  The value
     set is used as a minimum.

   - Resync is now allowed to go a little faster than the 'minimum' when
     there is competing IO.  How much faster depends on the speed of the
     devices, so the effective minimum should scale with device speed to
     some extent"

* tag 'md/4.1' of git://neil.brown.name/md: (58 commits)
  md/raid5: don't do chunk aligned read on degraded array.
  md/raid5: allow the stripe_cache to grow and shrink.
  md/raid5: change ->inactive_blocked to a bit-flag.
  md/raid5: move max_nr_stripes management into grow_one_stripe and drop_one_stripe
  md/raid5: pass gfp_t arg to grow_one_stripe()
  md/raid5: introduce configuration option rmw_level
  md/raid5: activate raid6 rmw feature
  md/raid6 algorithms: xor_syndrome() for SSE2
  md/raid6 algorithms: xor_syndrome() for generic int
  md/raid6 algorithms: improve test program
  md/raid6 algorithms: delta syndrome functions
  raid5: handle expansion/resync case with stripe batching
  raid5: handle io error of batch list
  RAID5: batch adjacent full stripe write
  raid5: track overwrite disk count
  raid5: add a new flag to track if a stripe can be batched
  raid5: use flex_array for scribble data
  md raid0: access mddev->queue (request queue member) conditionally because it is not set when accessed from dm-raid
  md: allow resync to go faster when there is competing IO.
  md: remove 'go_faster' option from ->sync_request()
  ...
rcu.txt
RCU Concepts


The basic idea behind RCU (read-copy update) is to split destructive
operations into two parts, one that prevents anyone from seeing the data
item being destroyed, and one that actually carries out the destruction.
A "grace period" must elapse between the two parts, and this grace period
must be long enough that any readers accessing the item being deleted have
since dropped their references.  For example, an RCU-protected deletion
from a linked list would first remove the item from the list, wait for
a grace period to elapse, then free the element.  See the listRCU.txt
file for more information on using RCU with linked lists.
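
A minimal sketch of that two-part deletion, using the Linux list-RCU
primitives, might look like the following.  The names "struct foo",
"foo_list", and "foo_lock" are hypothetical and exist only for this
example; they are not taken from any particular kernel subsystem.

	#include <linux/list.h>
	#include <linux/rculist.h>
	#include <linux/rcupdate.h>
	#include <linux/slab.h>
	#include <linux/spinlock.h>

	struct foo {
		struct list_head list;
		int key;
	};

	static LIST_HEAD(foo_list);
	static DEFINE_SPINLOCK(foo_lock);	/* serializes updaters only */

	void foo_del(struct foo *p)
	{
		spin_lock(&foo_lock);
		list_del_rcu(&p->list);	/* part 1: readers can no longer find p */
		spin_unlock(&foo_lock);

		synchronize_rcu();	/* wait for a grace period to elapse */

		kfree(p);		/* part 2: no reader still holds a reference */
	}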


Frequently Asked Questions

o	Why would anyone want to use RCU?

	The advantage of RCU's two-part approach is that RCU readers need
	not acquire any locks, perform any atomic instructions, write to
	shared memory, or (on CPUs other than Alpha) execute any memory
	barriers.  The fact that these operations are quite expensive
	on modern CPUs is what gives RCU its performance advantages
	in read-mostly situations.  The fact that RCU readers need not
	acquire locks can also greatly simplify deadlock-avoidance code.
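
	As a rough illustration (reusing the hypothetical "struct foo"
	and "foo_list" from the sketch above), an RCU reader that walks
	the list needs nothing more than rcu_read_lock() and
	rcu_read_unlock() around the traversal:

		int foo_find(int key)
		{
			struct foo *p;
			int found = 0;

			/* No lock, no atomic op, no write to shared memory. */
			rcu_read_lock();
			list_for_each_entry_rcu(p, &foo_list, list) {
				if (p->key == key) {
					found = 1;
					break;
				}
			}
			rcu_read_unlock();
			return found;
		}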

o	How can the updater tell when a grace period has completed
	if the RCU readers give no indication when they are done?

	Just as with spinlocks, RCU readers are not permitted to
	block, switch to user-mode execution, or enter the idle loop.
	Therefore, as soon as a CPU is seen passing through any of these
	three states, we know that that CPU has exited any previous RCU
	read-side critical sections.  So, if we remove an item from a
	linked list, and then wait until all CPUs have switched context,
	executed in user mode, or executed in the idle loop, we can
	safely free up that item.

	Preemptible variants of RCU (CONFIG_PREEMPT_RCU) get the
	same effect, but require that the readers manipulate CPU-local
	counters.  These counters allow limited types of blocking within
	RCU read-side critical sections.  SRCU also uses CPU-local
	counters, and permits general blocking within RCU read-side
	critical sections.  These variants of RCU detect grace periods
	by sampling these counters.
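
	From the updater's point of view, this bookkeeping is hidden
	behind synchronize_rcu(), which blocks until a grace period has
	elapsed, and call_rcu(), which invokes a callback after one.
	As a rough sketch, the deletion example above could instead
	reclaim memory asynchronously by embedding an rcu_head in the
	hypothetical "struct foo":

		struct foo {
			struct list_head list;
			struct rcu_head rcu;
			int key;
		};

		static void foo_reclaim(struct rcu_head *head)
		{
			kfree(container_of(head, struct foo, rcu));
		}

		void foo_del_async(struct foo *p)
		{
			/* updater-side locking omitted for brevity */
			list_del_rcu(&p->list);

			/* foo_reclaim() runs only after every CPU has passed
			 * through a quiescent state, i.e. after a grace
			 * period has elapsed. */
			call_rcu(&p->rcu, foo_reclaim);
		}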

o	If I am running on a uniprocessor kernel, which can only do one
	thing at a time, why should I wait for a grace period?

	See the UP.txt file in this directory.

o	How can I see where RCU is currently used in the Linux kernel?

	Search for "rcu_read_lock", "rcu_read_unlock", "call_rcu",
	"rcu_read_lock_bh", "rcu_read_unlock_bh", "call_rcu_bh",
	"srcu_read_lock", "srcu_read_unlock", "synchronize_rcu",
	"synchronize_net", "synchronize_srcu", and the other RCU
	primitives.  Or grab one of the cscope databases from:

	http://www.rdrop.com/users/paulmck/RCU/linuxusage/rculocktab.html

o	What guidelines should I follow when writing code that uses RCU?

	See the checklist.txt file in this directory.

o	Why the name "RCU"?

	"RCU" stands for "read-copy update".  The file listRCU.txt has
	more information on where this name came from, search for
	"read-copy update" to find it.

o	I hear that RCU is patented?  What is with that?

	Yes, it is.  There are several known patents related to RCU;
	search for the string "Patent" in RTFP.txt to find them.
	Of these, one was allowed to lapse by the assignee, and the
	others have been contributed to the Linux kernel under GPL.
	There are now also LGPL implementations of user-level RCU
	available (http://lttng.org/?q=node/18).

o	I hear that RCU needs work in order to support realtime kernels?

	This work is largely completed.  Realtime-friendly RCU can be
	enabled via the CONFIG_PREEMPT_RCU kernel configuration
	parameter.  However, work is in progress for enabling priority
	boosting of preempted RCU read-side critical sections.  This is
	needed if you have CPU-bound realtime threads.

o	Where can I find more information on RCU?

	See the RTFP.txt file in this directory.
	Or point your browser at http://www.rdrop.com/users/paulmck/RCU/.

o	What are all these files in this directory?

	See 00-INDEX for the list.