Revision c61ea31dac0319ec64b33725917bda81fc293a25 authored by David Howells on 11 May 2010, 15:51:39 UTC, committed by Linus Torvalds on 11 May 2010, 17:07:53 UTC
Fix an occasional EIO returned by a call to vfs_unlink():

	[ 4868.465413] CacheFiles: I/O Error: Unlink failed
	[ 4868.465444] FS-Cache: Cache cachefiles stopped due to I/O error
	[ 4947.320011] CacheFiles: File cache on md3 unregistering
	[ 4947.320041] FS-Cache: Withdrawing cache "mycache"
	[ 5127.348683] FS-Cache: Cache "mycache" added (type cachefiles)
	[ 5127.348716] CacheFiles: File cache on md3 registered
	[ 7076.871081] CacheFiles: I/O Error: Unlink failed
	[ 7076.871130] FS-Cache: Cache cachefiles stopped due to I/O error
	[ 7116.780891] CacheFiles: File cache on md3 unregistering
	[ 7116.780937] FS-Cache: Withdrawing cache "mycache"
	[ 7296.813394] FS-Cache: Cache "mycache" added (type cachefiles)
	[ 7296.813432] CacheFiles: File cache on md3 registered

What happens is this:

 (1) A cached NFS file is seen to have become out of date, so NFS retires the
     object and immediately acquires a new object with the same key.

 (2) Retirement of the old object is done asynchronously - so the lookup/create
     to generate the new object may be done first.

     This can be a problem as the old object and the new object must exist at
     the same point in the backing filesystem (i.e. they must have the same
     pathname).

 (3) The lookup for the new object sees that a backing file already exists,
     checks to see whether it is valid and sees that it isn't.  It then deletes
     that file and creates a new one on disk.

 (4) The retirement phase for the old file is then performed.  It tries to
     delete the dentry it has, but ext4_unlink() returns -EIO because the inode
     attached to that dentry no longer matches the inode number associated with
     the filename in the parent directory.
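
The four steps above can be replayed in a small userspace model.  This is only
an illustrative sketch, not kernel code: the `Directory` class, the
`replay_race()` helper, the filename and the inode numbers are all hypothetical,
and the mismatch check merely imitates the behaviour described for
ext4_unlink().

```python
import errno


class Directory:
    """Toy directory: maps a filename to the inode number it currently names."""

    def __init__(self):
        self.entries = {}
        self.next_ino = 100  # arbitrary starting inode number (assumption)

    def create(self, name):
        ino = self.next_ino
        self.next_ino += 1
        self.entries[name] = ino
        return ino

    def unlink(self, name, ino):
        # Imitates the consistency check: the caller's inode must still be
        # the one the directory entry refers to, otherwise fail with EIO.
        if self.entries.get(name) != ino:
            raise OSError(errno.EIO, "inode number mismatch in unlink")
        del self.entries[name]


def replay_race():
    """Replay steps (1)-(4) and return (old_ino, new_ino, errno_seen)."""
    d = Directory()
    name = "cached-file"           # hypothetical backing filename
    old_ino = d.create(name)       # backing file of the old object

    # Steps (2)/(3): the new object's lookup runs before the retirement,
    # finds the stale file, deletes it and creates a replacement.
    d.unlink(name, old_ino)
    new_ino = d.create(name)

    # Step (4): the retirement still holds the old inode and tries to unlink.
    try:
        d.unlink(name, old_ino)
    except OSError as e:
        return old_ino, new_ino, e.errno
    return old_ino, new_ino, 0
```

In this model the second unlink fails because the name now refers to the
replacement inode, which corresponds to the situation in step (4).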

The trace below shows this quite well.

	[md5sum] ==> __fscache_relinquish_cookie(ffff88002d12fb58{NFS.fh,ffff88002ce62100},1)
	[md5sum] ==> __fscache_acquire_cookie({NFS.server},{NFS.fh},ffff88002ce62100)

NFS has retired the old cookie and asked for a new one.

	[kslowd] ==> fscache_object_state_machine({OBJ52,OBJECT_ACTIVE,24})
	[kslowd] <== fscache_object_state_machine() [->OBJECT_DYING]
	[kslowd] ==> fscache_object_state_machine({OBJ53,OBJECT_INIT,0})
	[kslowd] <== fscache_object_state_machine() [->OBJECT_LOOKING_UP]
	[kslowd] ==> fscache_object_state_machine({OBJ52,OBJECT_DYING,24})
	[kslowd] <== fscache_object_state_machine() [->OBJECT_RECYCLING]

The old object (OBJ52) is going through the terminal states to get rid of it,
while the new object (OBJ53) is coming into being.

	[kslowd] ==> fscache_object_state_machine({OBJ53,OBJECT_LOOKING_UP,0})
	[kslowd] ==> cachefiles_walk_to_object({ffff88003029d8b8},OBJ53,@68,)
	[kslowd] lookup '@68'
	[kslowd] next -> ffff88002ce41bd0 positive
	[kslowd] advance
	[kslowd] lookup 'Es0g00og0_Nd_XCYe3BOzvXrsBLMlN6aw16M1htaA'
	[kslowd] next -> ffff8800369faac8 positive

The new object has looked up the subdir in which the file would be (getting
dentry ffff88002ce41bd0) and then looked up the file itself (getting dentry
ffff8800369faac8).

	[kslowd] validate 'Es0g00og0_Nd_XCYe3BOzvXrsBLMlN6aw16M1htaA'
	[kslowd] ==> cachefiles_bury_object(,'@68','Es0g00og0_Nd_XCYe3BOzvXrsBLMlN6aw16M1htaA')
	[kslowd] remove ffff8800369faac8 from ffff88002ce41bd0
	[kslowd] unlink stale object
	[kslowd] <== cachefiles_bury_object() = 0

It then checks the file's xattrs to see if it's valid.  NFS says that the
auxiliary data indicate the file is out of date (obvious to us - that's why NFS
ditched the old version and got a new one).  CacheFiles then deletes the old
file (dentry ffff8800369faac8).

	[kslowd] redo lookup
	[kslowd] lookup 'Es0g00og0_Nd_XCYe3BOzvXrsBLMlN6aw16M1htaA'
	[kslowd] next -> ffff88002cd94288 negative
	[kslowd] create -> ffff88002cd94288{ffff88002cdaf238{ino=148247}}

CacheFiles then redoes the lookup and gets a negative result in a new dentry
(ffff88002cd94288), for which it then creates a file.

	[kslowd] ==> cachefiles_mark_object_active(,OBJ53)
	[kslowd] <== cachefiles_mark_object_active() = 0
	[kslowd] === OBTAINED_OBJECT ===
	[kslowd] <== cachefiles_walk_to_object() = 0 [148247]
	[kslowd] <== fscache_object_state_machine() [->OBJECT_AVAILABLE]

The new object is then marked active and the state machine moves to the
available state - at which point NFS can start filling the object.

	[kslowd] ==> fscache_object_state_machine({OBJ52,OBJECT_RECYCLING,20})
	[kslowd] ==> fscache_release_object()
	[kslowd] ==> cachefiles_drop_object({OBJ52,2})
	[kslowd] ==> cachefiles_delete_object(,OBJ52{ffff8800369faac8})

The old object, meanwhile, goes on with being retired.  If allocation occurs
first, cachefiles_delete_object() has to wait for dir->d_inode->i_mutex to
become available before it can continue.

	[kslowd] ==> cachefiles_bury_object(,'@68','Es0g00og0_Nd_XCYe3BOzvXrsBLMlN6aw16M1htaA')
	[kslowd] remove ffff8800369faac8 from ffff88002ce41bd0
	[kslowd] unlink stale object
	EXT4-fs warning (device sda6): ext4_unlink: Inode number mismatch in unlink (148247!=148193)
	CacheFiles: I/O Error: Unlink failed
	FS-Cache: Cache cachefiles stopped due to I/O error

CacheFiles then tries to delete the file for the old object, but the dentry it
has (ffff8800369faac8) no longer points to a valid inode for that directory
entry, and so ext4_unlink() returns -EIO when de->inode does not match i_ino.

	[kslowd] <== cachefiles_bury_object() = -5
	[kslowd] <== cachefiles_delete_object() = -5
	[kslowd] <== fscache_object_state_machine() [->OBJECT_DEAD]
	[kslowd] ==> fscache_object_state_machine({OBJ53,OBJECT_AVAILABLE,0})
	[kslowd] <== fscache_object_state_machine() [->OBJECT_ACTIVE]

(Note that the above trace includes extra information beyond that produced by
the upstream code).

The fix is to note when an object that is being retired has had its backing
file deleted preemptively by a replacement object that is being created, and to
skip the second removal attempt in such a case.
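
The idea behind the fix can be sketched in the same kind of userspace model.
This is an assumption-laden illustration rather than the patch itself: the
`Cache` and `CacheObject` classes and the `buried` attribute are hypothetical
names, and the model treats every existing backing file as stale, as in the
scenario described above.

```python
import errno


class CacheObject:
    def __init__(self, name, ino):
        self.name = name
        self.ino = ino
        self.buried = False  # hypothetical flag: backing file already removed


class Cache:
    def __init__(self):
        self.entries = {}    # filename -> inode number
        self.active = {}     # filename -> object that last created the file
        self.next_ino = 1    # arbitrary starting inode number (assumption)

    def _unlink(self, name, ino):
        # Same mismatch check as described for ext4_unlink().
        if self.entries.get(name) != ino:
            raise OSError(errno.EIO, "inode number mismatch in unlink")
        del self.entries[name]

    def acquire(self, name):
        """Create a new object, replacing any existing (stale) backing file."""
        stale = self.active.get(name)
        if name in self.entries:
            self._unlink(name, self.entries[name])
            if stale is not None:
                # Note that the old object's file has gone already.
                stale.buried = True
        ino = self.next_ino
        self.next_ino += 1
        self.entries[name] = ino
        obj = CacheObject(name, ino)
        self.active[name] = obj
        return obj

    def retire(self, obj):
        """Return True if the file was unlinked, False if that was skipped."""
        if obj.buried:
            return False     # replacement already deleted it: nothing to do
        self._unlink(obj.name, obj.ino)
        if self.active.get(obj.name) is obj:
            del self.active[obj.name]
        return True
```

With the flag in place, retiring the old object after its replacement has been
acquired no longer reaches the unlink, so no EIO is raised in this model.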

Reported-by: Greg M <gregm@servu.net.au>
Reported-by: Mark Moseley <moseleymark@gmail.com>
Reported-by: Romain DEGEZ <romain.degez@smartjog.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
1 parent 7d6fb7b
Makefile
#
# arch/sh/Makefile
#
# Copyright (C) 1999  Kaz Kojima
# Copyright (C) 2002 - 2008  Paul Mundt
# Copyright (C) 2002  M. R. Brown
#
# This file is subject to the terms and conditions of the GNU General Public
# License.  See the file "COPYING" in the main directory of this archive
# for more details.
#
isa-y					:= any
isa-$(CONFIG_SH_DSP)			:= sh
isa-$(CONFIG_CPU_SH2)			:= sh2
isa-$(CONFIG_CPU_SH2A)			:= sh2a
isa-$(CONFIG_CPU_SH3)			:= sh3
isa-$(CONFIG_CPU_SH4)			:= sh4
isa-$(CONFIG_CPU_SH4A)			:= sh4a
isa-$(CONFIG_CPU_SH4AL_DSP)		:= sh4al
isa-$(CONFIG_CPU_SH5)			:= shmedia

ifeq ($(CONFIG_SUPERH32),y)
isa-$(CONFIG_SH_DSP)			:= $(isa-y)-dsp
isa-y					:= $(isa-y)-up
endif

cflags-$(CONFIG_CPU_SH2)		:= $(call cc-option,-m2,)
cflags-$(CONFIG_CPU_SH2A)		+= $(call cc-option,-m2a,) \
					   $(call cc-option,-m2a-nofpu,)
cflags-$(CONFIG_CPU_SH3)		:= $(call cc-option,-m3,)
cflags-$(CONFIG_CPU_SH4)		:= $(call cc-option,-m4,) \
	$(call cc-option,-mno-implicit-fp,-m4-nofpu)
cflags-$(CONFIG_CPU_SH4A)		+= $(call cc-option,-m4a,) \
					   $(call cc-option,-m4a-nofpu,)
cflags-$(CONFIG_CPU_SH4AL_DSP)		+= $(call cc-option,-m4al,)
cflags-$(CONFIG_CPU_SH5)		:= $(call cc-option,-m5-32media-nofpu,)

ifeq ($(cflags-y),)
#
# In the case where we are stuck with a compiler that has been uselessly
# restricted to a particular ISA, a favourite default of newer GCCs when
# extensive multilib targets are not provided, ensure we get the best fit
# regarding FP generation. This is intentionally stupid (albeit many
# orders of magnitude less than GCC's default behaviour), as anything
# with a large number of multilib targets better have been built
# correctly for the target in mind.
#
cflags-y	+= $(shell $(CC) $(KBUILD_CFLAGS) -print-multi-lib | \
		     grep nofpu | sed q | sed -e 's/^/-/;s/;.*$$//')
# At this point, anything goes.
isaflags-y	:= $(call as-option,-Wa$(comma)-isa=any,)
else
#
# -Wa,-isa= tuning implies -Wa,-dsp for the versions of binutils that
# support it, while -Wa,-dsp by itself limits the range of usable opcodes
# on certain CPU subtypes. Try the ISA variant first, and if that fails,
# fall back on -Wa,-dsp for the old binutils versions. Even without DSP
# opcodes, we always want the best ISA tuning the version of binutils
# will provide.
#
isaflags-y	:= $(call as-option,-Wa$(comma)-isa=$(isa-y),)

isaflags-$(CONFIG_SH_DSP)		:= \
	$(call as-option,-Wa$(comma)-isa=$(isa-y),-Wa$(comma)-dsp)
endif

cflags-$(CONFIG_CPU_BIG_ENDIAN)		+= -mb
cflags-$(CONFIG_CPU_LITTLE_ENDIAN)	+= -ml

cflags-y	+= $(call cc-option,-mno-fdpic)
cflags-y	+= $(isaflags-y) -ffreestanding

OBJCOPYFLAGS	:= -O binary -R .note -R .note.gnu.build-id -R .comment \
		   -R .stab -R .stabstr -S

# Give the various platforms the opportunity to set default image types
defaultimage-$(CONFIG_SUPERH32)			:= zImage
defaultimage-$(CONFIG_SH_SH7785LCR)		:= uImage
defaultimage-$(CONFIG_SH_RSK)			:= uImage
defaultimage-$(CONFIG_SH_URQUELL)		:= uImage
defaultimage-$(CONFIG_SH_MIGOR)			:= uImage
defaultimage-$(CONFIG_SH_AP325RXA)		:= uImage
defaultimage-$(CONFIG_SH_7724_SOLUTION_ENGINE)	:= uImage
defaultimage-$(CONFIG_SH_7206_SOLUTION_ENGINE)	:= vmlinux
defaultimage-$(CONFIG_SH_7619_SOLUTION_ENGINE)	:= vmlinux
defaultimage-$(CONFIG_SH_SDK7786)		:= vmlinux.bin

# Set some sensible Kbuild defaults
KBUILD_IMAGE		:= $(defaultimage-y)

#
# Choosing incompatible machines during configuration will result in
# error messages during linking.
#
ifdef CONFIG_SUPERH32
UTS_MACHINE		:= sh
BITS			:= 32
LDFLAGS_vmlinux		+= -e _stext
KBUILD_DEFCONFIG	:= shx3_defconfig
else
UTS_MACHINE		:= sh64
BITS			:= 64
LDFLAGS_vmlinux		+= --defsym phys_stext=_stext-$(CONFIG_PAGE_OFFSET) \
			   --defsym phys_stext_shmedia=phys_stext+1 \
			   -e phys_stext_shmedia
KBUILD_DEFCONFIG	:= cayman_defconfig
endif

ifneq ($(SUBARCH),$(ARCH))
  ifeq ($(CROSS_COMPILE),)
    CROSS_COMPILE := $(call cc-cross-prefix, $(UTS_MACHINE)-linux-  $(UTS_MACHINE)-linux-gnu-  $(UTS_MACHINE)-unknown-linux-gnu-)
  endif
endif

ifdef CONFIG_CPU_LITTLE_ENDIAN
ld-bfd			:= elf32-$(UTS_MACHINE)-linux
LDFLAGS_vmlinux		+= --defsym 'jiffies=jiffies_64' --oformat $(ld-bfd)
LDFLAGS			+= -EL
else
ld-bfd			:= elf32-$(UTS_MACHINE)big-linux
LDFLAGS_vmlinux		+= --defsym 'jiffies=jiffies_64+4' --oformat $(ld-bfd)
LDFLAGS			+= -EB
endif

export ld-bfd BITS

head-y	:= arch/sh/kernel/init_task.o arch/sh/kernel/head_$(BITS).o

core-y				+= arch/sh/kernel/ arch/sh/mm/ arch/sh/boards/
core-$(CONFIG_SH_FPU_EMU)	+= arch/sh/math-emu/

# Mach groups
machdir-$(CONFIG_SOLUTION_ENGINE)		+= mach-se
machdir-$(CONFIG_SH_HP6XX)			+= mach-hp6xx
machdir-$(CONFIG_SH_DREAMCAST)			+= mach-dreamcast
machdir-$(CONFIG_SH_SH03)			+= mach-sh03
machdir-$(CONFIG_SH_SECUREEDGE5410)		+= mach-snapgear
machdir-$(CONFIG_SH_RTS7751R2D)			+= mach-r2d
machdir-$(CONFIG_SH_7751_SYSTEMH)		+= mach-systemh
machdir-$(CONFIG_SH_EDOSK7705)			+= mach-edosk7705
machdir-$(CONFIG_SH_HIGHLANDER)			+= mach-highlander
machdir-$(CONFIG_SH_MIGOR)			+= mach-migor
machdir-$(CONFIG_SH_AP325RXA)			+= mach-ap325rxa
machdir-$(CONFIG_SH_KFR2R09)			+= mach-kfr2r09
machdir-$(CONFIG_SH_ECOVEC)			+= mach-ecovec24
machdir-$(CONFIG_SH_SDK7780)			+= mach-sdk7780
machdir-$(CONFIG_SH_SDK7786)			+= mach-sdk7786
machdir-$(CONFIG_SH_X3PROTO)			+= mach-x3proto
machdir-$(CONFIG_SH_SH7763RDP)			+= mach-sh7763rdp
machdir-$(CONFIG_SH_SH4202_MICRODEV)		+= mach-microdev
machdir-$(CONFIG_SH_LANDISK)			+= mach-landisk
machdir-$(CONFIG_SH_LBOX_RE2)			+= mach-lboxre2
machdir-$(CONFIG_SH_CAYMAN)			+= mach-cayman
machdir-$(CONFIG_SH_RSK)			+= mach-rsk

ifneq ($(machdir-y),)
core-y	+= $(addprefix arch/sh/boards/, \
	     $(filter-out ., $(patsubst %,%/,$(machdir-y))))
endif

# Common machine type headers. Not part of the arch/sh/boards/ hierarchy.
machdir-y	+= mach-common

# Companion chips
core-$(CONFIG_HD6446X_SERIES)	+= arch/sh/cchips/hd6446x/

#
# CPU header paths
#
# These are ordered by optimization level. A CPU family that is a subset
# of another (i.e., SH-2A / SH-2), is picked up first, with increasing
# levels of genericness if nothing more suitable is situated in the
# hierarchy.
#
# As an example, in order of preference, SH-2A > SH-2 > common definitions.
#
cpuincdir-$(CONFIG_CPU_SH2A)	+= cpu-sh2a
cpuincdir-$(CONFIG_CPU_SH2)	+= cpu-sh2
cpuincdir-$(CONFIG_CPU_SH3)	+= cpu-sh3
cpuincdir-$(CONFIG_CPU_SH4)	+= cpu-sh4
cpuincdir-$(CONFIG_CPU_SH5)	+= cpu-sh5
cpuincdir-y			+= cpu-common	# Must be last

drivers-y			+= arch/sh/drivers/
drivers-$(CONFIG_OPROFILE)	+= arch/sh/oprofile/

boot := arch/sh/boot

cflags-y	+= $(foreach d, $(cpuincdir-y), -Iarch/sh/include/$(d)) \
		   $(foreach d, $(machdir-y), -Iarch/sh/include/$(d))

KBUILD_CFLAGS		+= -pipe $(cflags-y)
KBUILD_CPPFLAGS		+= $(cflags-y)
KBUILD_AFLAGS		+= $(cflags-y)

ifeq ($(CONFIG_MCOUNT),y)
  KBUILD_CFLAGS += -pg
endif

ifeq ($(CONFIG_DWARF_UNWINDER),y)
  KBUILD_CFLAGS += -fasynchronous-unwind-tables
endif

libs-$(CONFIG_SUPERH32)		:= arch/sh/lib/	$(libs-y)
libs-$(CONFIG_SUPERH64)		:= arch/sh/lib64/ $(libs-y)

BOOT_TARGETS = uImage uImage.bz2 uImage.gz uImage.lzma uImage.lzo \
	       uImage.srec uImage.bin zImage vmlinux.bin vmlinux.srec \
	       romImage
PHONY += $(BOOT_TARGETS)

all: $(KBUILD_IMAGE)

$(BOOT_TARGETS): vmlinux
	$(Q)$(MAKE) $(build)=$(boot) $(boot)/$@

compressed: zImage

archprepare:
	$(Q)$(MAKE) $(build)=arch/sh/tools include/generated/machtypes.h

archclean:
	$(Q)$(MAKE) $(clean)=$(boot)
	$(Q)$(MAKE) $(clean)=arch/sh/kernel/vsyscall

define archhelp
	@echo '  zImage 	           - Compressed kernel image'
	@echo '  romImage	           - Compressed ROM image, if supported'
	@echo '  vmlinux.srec	           - Create an ELF S-record'
	@echo '  vmlinux.bin	           - Create an uncompressed binary image'
	@echo '* uImage  	           - Alias to bootable U-Boot image'
	@echo '  uImage.srec	           - Create an S-record for U-Boot'
	@echo '  uImage.bin	           - Kernel-only image for U-Boot (bin)'
	@echo '* uImage.gz	           - Kernel-only image for U-Boot (gzip)'
	@echo '  uImage.bz2	           - Kernel-only image for U-Boot (bzip2)'
	@echo '  uImage.lzma	           - Kernel-only image for U-Boot (lzma)'
	@echo '  uImage.lzo	           - Kernel-only image for U-Boot (lzo)'
endef
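
The `cflags-y` fallback in the Makefile above pipes the compiler's
`-print-multi-lib` output through `grep nofpu | sed q | sed -e 's/^/-/;s/;.*$$//'`.
The text transformation that pipeline performs can be mirrored in a few lines
of Python.  This is a sketch for illustration only: the `nofpu_flag()` helper
is hypothetical, and the sample input is made up rather than captured from a
real toolchain.

```python
def nofpu_flag(multilib_output: str) -> str:
    """Mirror: grep nofpu | sed q | sed -e 's/^/-/;s/;.*$//'."""
    # grep nofpu: keep only lines mentioning "nofpu"
    matches = [line for line in multilib_output.splitlines() if "nofpu" in line]
    if not matches:
        return ""                      # empty pipeline output
    first = matches[0]                 # sed q: keep the first line only
    # s/^/-/ prepends "-"; s/;.*$// drops everything from the first ";"
    return "-" + first.split(";", 1)[0]


# Hypothetical multilib listing in the "directory;@switches" layout.
sample = ".;\nm4;@m4\nm4-nofpu;@m4-nofpu\nm4a-nofpu;@m4a-nofpu\n"
flag = nofpu_flag(sample)
```

For the made-up listing above, the first `nofpu` entry is turned into a single
compiler option, which is what the Makefile appends to `cflags-y`.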