sort by:
Revision Author Date Message Commit Date
13212b5 btrfs: Fix out-of-space bug Btrfs will report NO_SPACE when we create and remove files for several times, and we can't write to filesystem until mount it again. Steps to reproduce: 1: Create a single-dev btrfs fs with default option 2: Write a file into it to take up most fs space 3: Delete above file 4: Wait about 100s to let chunk removed 5: goto 2 Script is like following: #!/bin/bash # Recommend 1.2G space, too large disk will make test slow DEV="/dev/sda16" MNT="/mnt/tmp" dev_size="$(lsblk -bn -o SIZE "$DEV")" || exit 2 file_size_m=$((dev_size * 75 / 100 / 1024 / 1024)) echo "Loop write ${file_size_m}M file on $((dev_size / 1024 / 1024))M dev" for ((i = 0; i < 10; i++)); do umount "$MNT" 2>/dev/null; done echo "mkfs $DEV" mkfs.btrfs -f "$DEV" >/dev/null || exit 2 echo "mount $DEV $MNT" mount "$DEV" "$MNT" || exit 2 for ((loop_i = 0; loop_i < 20; loop_i++)); do echo echo "loop $loop_i" echo "dd file..." cmd=(dd if=/dev/zero of="$MNT"/file0 bs=1M count="$file_size_m") "${cmd[@]}" 2>/dev/null || { # NO_SPACE error triggered echo "dd failed: ${cmd[*]}" exit 1 } echo "rm file..." rm -f "$MNT"/file0 || exit 2 for ((i = 0; i < 10; i++)); do df "$MNT" | tail -1 sleep 10 done done Reason: It is triggered by commit: 47ab2a6c689913db23ccae38349714edf8365e0a which is used to remove empty block groups automatically, but the reason is not in that patch. Code before works well because btrfs don't need to create and delete chunks so many times with high complexity. Above bug is caused by many reason, any of them can trigger it. Reason1: When we remove some continuous chunks but leave other chunks after, these disk space should be used by chunk-recreating, but in current code, only first create will successed. Fixed by Forrest Liu <forrestl@synology.com> in: Btrfs: fix find_free_dev_extent() malfunction in case device tree has hole Reason2: contains_pending_extent() return wrong value in calculation. Fixed by Forrest Liu <forrestl@synology.com> in: Btrfs: fix find_free_dev_extent() malfunction in case device tree has hole Reason3: btrfs_check_data_free_space() try to commit transaction and retry allocating chunk when the first allocating failed, but space_info->full is set in first allocating, and prevent second allocating in retry. Fixed in this patch by clear space_info->full in commit transaction. Tested for severial times by above script. Changelog v3->v4: use light weight int instead of atomic_t to record have_remove_bgs in transaction, suggested by: Josef Bacik <jbacik@fb.com> Changelog v2->v3: v2 fixed the bug by adding more commit-transaction, but we only need to reclaim space when we are really have no space for new chunk, noticed by: Filipe David Manana <fdmanana@gmail.com> Actually, our code already have this type of commit-and-retry, we only need to make it working with removed-bgs. v3 fixed the bug with above way. Changelog v1->v2: v1 will introduce a new bug when delete and create chunk in same disk space in same transaction, noticed by: Filipe David Manana <fdmanana@gmail.com> V2 fix this bug by commit transaction after remove block grops. Reported-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com> Suggested-by: Filipe David Manana <fdmanana@gmail.com> Suggested-by: Josef Bacik <jbacik@fb.com> Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com> Signed-off-by: Chris Mason <clm@fb.com> 14 February 2015, 16:19:14 UTC
f55985f Btrfs: scrub, fix sleep in atomic context My previous patch "Btrfs: fix scrub race leading to use-after-free" introduced the possibility to sleep in an atomic context, which happens when the scrub_lock mutex is held at the time scrub_pending_bio_dec() is called - this function can be called under an atomic context. Chris ran into this in a debug kernel which gave the following trace: [ 1928.950319] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:621 [ 1928.967334] in_atomic(): 1, irqs_disabled(): 0, pid: 149670, name: fsstress [ 1928.981324] INFO: lockdep is turned off. [ 1928.989244] CPU: 24 PID: 149670 Comm: fsstress Tainted: G W 3.19.0-rc7-mason+ #41 [ 1929.006418] Hardware name: ZTSYSTEMS Echo Ridge T4 /A9DRPF-10D, BIOS 1.07 05/10/2012 [ 1929.022207] ffffffff81a22cf8 ffff881076e03b78 ffffffff816b8dd9 ffff881076e03b78 [ 1929.037267] ffff880d8e828710 ffff881076e03ba8 ffffffff810856c4 ffff881076e03bc8 [ 1929.052315] 0000000000000000 000000000000026d ffffffff81a22cf8 ffff881076e03bd8 [ 1929.067381] Call Trace: [ 1929.072344] <IRQ> [<ffffffff816b8dd9>] dump_stack+0x4f/0x6e [ 1929.083968] [<ffffffff810856c4>] ___might_sleep+0x174/0x230 [ 1929.095352] [<ffffffff810857d2>] __might_sleep+0x52/0x90 [ 1929.106223] [<ffffffff816bb68f>] mutex_lock_nested+0x2f/0x3b0 [ 1929.117951] [<ffffffff810ab37d>] ? trace_hardirqs_on+0xd/0x10 [ 1929.129708] [<ffffffffa05dc838>] scrub_pending_bio_dec+0x38/0x70 [btrfs] [ 1929.143370] [<ffffffffa05dd0e0>] scrub_parity_bio_endio+0x50/0x70 [btrfs] [ 1929.157191] [<ffffffff812fa603>] bio_endio+0x53/0xa0 [ 1929.167382] [<ffffffffa05f96bc>] rbio_orig_end_io+0x7c/0xa0 [btrfs] [ 1929.180161] [<ffffffffa05f97ba>] raid_write_parity_end_io+0x5a/0x80 [btrfs] [ 1929.194318] [<ffffffff812fa603>] bio_endio+0x53/0xa0 [ 1929.204496] [<ffffffff8130401b>] blk_update_request+0x1eb/0x450 [ 1929.216569] [<ffffffff81096e58>] ? trigger_load_balance+0x78/0x500 [ 1929.229176] [<ffffffff8144c74d>] scsi_end_request+0x3d/0x1f0 [ 1929.240740] [<ffffffff8144ccac>] scsi_io_completion+0xac/0x5b0 [ 1929.252654] [<ffffffff81441c50>] scsi_finish_command+0xf0/0x150 [ 1929.264725] [<ffffffff8144d317>] scsi_softirq_done+0x147/0x170 [ 1929.276635] [<ffffffff8130ace6>] blk_done_softirq+0x86/0xa0 [ 1929.288014] [<ffffffff8105d92e>] __do_softirq+0xde/0x600 [ 1929.298885] [<ffffffff8105df6d>] irq_exit+0xbd/0xd0 (...) Fix this by using a reference count on the scrub context structure instead of locking the scrub_lock mutex. Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Chris Mason <clm@fb.com> 14 February 2015, 16:19:14 UTC
575849e Btrfs: fix scheduler warning when syncing log We try to lock a mutex while the current task state is not TASK_RUNNING, which results in the following warning when CONFIG_DEBUG_LOCK_ALLOC=y: [30736.772501] ------------[ cut here ]------------ [30736.774545] WARNING: CPU: 9 PID: 19972 at kernel/sched/core.c:7300 __might_sleep+0x8b/0xa8() [30736.783453] do not call blocking ops when !TASK_RUNNING; state=2 set at [<ffffffff8107499b>] prepare_to_wait+0x43/0x89 [30736.786261] Modules linked in: dm_flakey dm_mod crc32c_generic btrfs xor raid6_pq nfsd auth_rpcgss oid_registry nfs_acl nfs lockd grace fscache sunrpc loop parport_pc psmouse parport pcspkr microcode serio_raw evdev processor thermal_sys i2c_piix4 i2c_core button ext4 crc16 jbd2 mbcache sg sr_mod cdrom sd_mod ata_generic virtio_scsi floppy ata_piix libata virtio_pci virtio_ring e1000 virtio scsi_mod [30736.794323] CPU: 9 PID: 19972 Comm: fsstress Not tainted 3.19.0-rc7-btrfs-next-5+ #1 [30736.795821] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014 [30736.798788] 0000000000000009 ffff88042743fbd8 ffffffff814248ed ffff88043d32f2d8 [30736.800504] ffff88042743fc28 ffff88042743fc18 ffffffff81045338 0000000000000001 [30736.802131] ffffffff81064514 ffffffff817c52d1 000000000000026d 0000000000000000 [30736.803676] Call Trace: [30736.804256] [<ffffffff814248ed>] dump_stack+0x4c/0x65 [30736.805245] [<ffffffff81045338>] warn_slowpath_common+0xa1/0xbb [30736.806360] [<ffffffff81064514>] ? __might_sleep+0x8b/0xa8 [30736.807391] [<ffffffff81045398>] warn_slowpath_fmt+0x46/0x48 [30736.808511] [<ffffffff8107499b>] ? prepare_to_wait+0x43/0x89 [30736.809620] [<ffffffff8107499b>] ? prepare_to_wait+0x43/0x89 [30736.810691] [<ffffffff81064514>] __might_sleep+0x8b/0xa8 [30736.811703] [<ffffffff81426eaf>] mutex_lock_nested+0x2f/0x3a0 [30736.812889] [<ffffffff8107bfa1>] ? trace_hardirqs_on_caller+0x18f/0x1ab [30736.814138] [<ffffffff8107bfca>] ? trace_hardirqs_on+0xd/0xf [30736.819878] [<ffffffffa038cfff>] wait_for_writer.isra.12+0x91/0xaa [btrfs] [30736.821260] [<ffffffff810748bd>] ? signal_pending_state+0x31/0x31 [30736.822410] [<ffffffffa0391f0a>] btrfs_sync_log+0x160/0x947 [btrfs] [30736.823574] [<ffffffff8107bfa1>] ? trace_hardirqs_on_caller+0x18f/0x1ab [30736.824847] [<ffffffff8107bfca>] ? trace_hardirqs_on+0xd/0xf [30736.825972] [<ffffffffa036e555>] btrfs_sync_file+0x2b0/0x319 [btrfs] [30736.827684] [<ffffffff8117901a>] vfs_fsync_range+0x21/0x23 [30736.828932] [<ffffffff81179038>] vfs_fsync+0x1c/0x1e [30736.829917] [<ffffffff8117928b>] do_fsync+0x34/0x4e [30736.830862] [<ffffffff811794b3>] SyS_fsync+0x10/0x14 [30736.831819] [<ffffffff8142a512>] system_call_fastpath+0x12/0x17 [30736.832982] ---[ end trace c0b57df60d32ae5c ]--- Fix this my acquiring the mutex after calling finish_wait(), which sets the task's state to TASK_RUNNING. Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: Chris Mason <clm@fb.com> 14 February 2015, 16:19:14 UTC
eb710b1 Btrfs: Remove unnecessary placeholder in btrfs_err_code "notused" is not necessary. Set 1 to the first entry is enough. Signed-off-by: Takeuchi Satoru <takeuchi_satoru@jp.fujitsu.com Cc: Gui Hecheng <guihc.fnst@cn.fujitsu.com> Reviewed-by: David Sterba <dsterba@suse.cz> Signed-off-by: Chris Mason <clm@fb.com> 03 February 2015, 03:25:51 UTC
b76808f btrfs: cleanup init for list in free-space-cache o removed an unecessary INIT_LIST_HEAD after LIST_HEAD o merge a declare & INIT_LIST_HEAD pair into one LIST_HEAD Signed-off-by: Gui Hecheng <guihc.fnst@cn.fujitsu.com> Reviewed-by: David Sterba <dsterba@suse.cz> Signed-off-by: Chris Mason <clm@fb.com> 03 February 2015, 03:25:50 UTC
2f08108 btrfs: delete chunk allocation attemp when setting block group ro Below test will fail currently: mkfs.ext4 -F /dev/sda btrfs-convert /dev/sda mount /dev/sda /mnt btrfs device add -f /dev/sdb /mnt btrfs balance start -v -dconvert=raid1 -mconvert=raid1 /mnt The reason is there are some block groups with usage 0, but the whole disk hasn't free space to allocate new chunk, so we even can't set such block group readonly. This patch deletes the chunk allocation when setting block group ro. For META, we already have reserve. But for SYSTEM, we don't have, so the check_system_chunk is still required. Signed-off-by: Shaohua Li <shli@fb.com> Signed-off-by: Chris Mason <clm@fb.com> 03 February 2015, 03:25:20 UTC
289454a btrfs: clear bio reference after submit_one_bio() After submit_one_bio(), `bio' can go away. However submit_extent_page() leave `bio' referable if submit_one_bio() failed (e.g. -ENOMEM on OOM). It will cause invalid paging request when submit_extent_page() is called next time. I reproduced ENOMEM case with the following script (need CONFIG_FAIL_PAGE_ALLOC, and CONFIG_FAULT_INJECTION_DEBUG_FS). #!/bin/bash dmesgout=dmesg.txt start=100000 end=300000 step=1000 # btrfs options device=/dev/vdb1 directory=/mnt/btrfs # fault-injection options percent=100 times=3 mkdir -p $directory || exit 1 mount -o compress $device $directory || exit 1 rm -f $directory/file || exit 1 dd if=/dev/zero of=$directory/file bs=1M count=512 || exit 1 for interval in `seq $start $step $end`; do dmesg -C echo 1 > /proc/sys/vm/drop_caches sync export FAILCMD_TYPE=fail_page_alloc ./failcmd.sh -p $percent -t $times -i $interval \ --ignore-gfp-highmem=N --ignore-gfp-wait=N --min-order=0 \ -- \ cat $directory/file > /dev/null dmesg > ${dmesgout} if grep -q BUG: ${dmesgout}; then cat ${dmesgout} exit 1 fi done umount $directory exit 0 Signed-off-by: Naohiro Aota <naota@elisp.net> Tested-by: Satoru Takeuchi <takeuchi_satoru@jp.fujitsu.com> Signed-off-by: Chris Mason <clm@fb.com> 03 February 2015, 03:24:51 UTC
de554a4 Btrfs: fix scrub race leading to use-after-free While running a scrub on a kernel with CONFIG_DEBUG_PAGEALLOC=y, I got the following trace: [68127.807663] BUG: unable to handle kernel paging request at ffff8803f8947a50 [68127.807663] IP: [<ffffffff8107da31>] do_raw_spin_lock+0x94/0x122 [68127.807663] PGD 3003067 PUD 43e1f5067 PMD 43e030067 PTE 80000003f8947060 [68127.807663] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC [68127.807663] Modules linked in: dm_flakey dm_mod crc32c_generic btrfs xor raid6_pq nfsd auth_rpcgss oid_registry nfs_acl nfs lockd grace fscache sunrpc loop parport_pc processor parpo [68127.807663] CPU: 2 PID: 3081 Comm: kworker/u8:5 Not tainted 3.18.0-rc6-btrfs-next-3+ #4 [68127.807663] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014 [68127.807663] Workqueue: btrfs-btrfs-scrub btrfs_scrub_helper [btrfs] [68127.807663] task: ffff880101fc5250 ti: ffff8803f097c000 task.ti: ffff8803f097c000 [68127.807663] RIP: 0010:[<ffffffff8107da31>] [<ffffffff8107da31>] do_raw_spin_lock+0x94/0x122 [68127.807663] RSP: 0018:ffff8803f097fbb8 EFLAGS: 00010093 [68127.807663] RAX: 0000000028dd386c RBX: ffff8803f8947a50 RCX: 0000000028dd3854 [68127.807663] RDX: 0000000000000018 RSI: 0000000000000002 RDI: 0000000000000001 [68127.807663] RBP: ffff8803f097fbd8 R08: 0000000000000004 R09: 0000000000000001 [68127.807663] R10: ffff880102620980 R11: ffff8801f3e8c900 R12: 000000000001d390 [68127.807663] R13: 00000000cabd13c8 R14: ffff8803f8947800 R15: ffff88037c574f00 [68127.807663] FS: 0000000000000000(0000) GS:ffff88043dd00000(0000) knlGS:0000000000000000 [68127.807663] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [68127.807663] CR2: ffff8803f8947a50 CR3: 00000000b6481000 CR4: 00000000000006e0 [68127.807663] Stack: [68127.807663] ffffffff823942a8 ffff8803f8947a50 ffff8802a3416f80 0000000000000000 [68127.807663] ffff8803f097fc18 ffffffff8141e7c0 ffffffff81072948 000000000034f314 [68127.807663] ffff8803f097fc08 0000000000000292 ffff8803f097fc48 ffff8803f8947a50 [68127.807663] Call Trace: [68127.807663] [<ffffffff8141e7c0>] _raw_spin_lock_irqsave+0x4b/0x55 [68127.807663] [<ffffffff81072948>] ? __wake_up+0x22/0x4b [68127.807663] [<ffffffff81072948>] __wake_up+0x22/0x4b [68127.807663] [<ffffffffa0392327>] scrub_pending_bio_dec+0x32/0x36 [btrfs] [68127.807663] [<ffffffffa0395e70>] scrub_bio_end_io_worker+0x5a3/0x5c9 [btrfs] [68127.807663] [<ffffffff810e0c7c>] ? time_hardirqs_off+0x15/0x28 [68127.807663] [<ffffffff81078106>] ? trace_hardirqs_off_caller+0x4c/0xb9 [68127.807663] [<ffffffffa0372a7c>] normal_work_helper+0xf1/0x238 [btrfs] [68127.807663] [<ffffffffa0372d3d>] btrfs_scrub_helper+0x12/0x14 [btrfs] [68127.807663] [<ffffffff810582d2>] process_one_work+0x1e4/0x3b6 [68127.807663] [<ffffffff81078180>] ? trace_hardirqs_off+0xd/0xf [68127.807663] [<ffffffff81058dc9>] worker_thread+0x1fb/0x2a8 [68127.807663] [<ffffffff81058bce>] ? rescuer_thread+0x219/0x219 [68127.807663] [<ffffffff8105cd75>] kthread+0xdb/0xe3 [68127.807663] [<ffffffff8105cc9a>] ? __kthread_parkme+0x67/0x67 [68127.807663] [<ffffffff8141f1ec>] ret_from_fork+0x7c/0xb0 [68127.807663] [<ffffffff8105cc9a>] ? __kthread_parkme+0x67/0x67 [68127.807663] Code: 39 c2 75 14 8d 8a 00 00 01 00 89 d0 f0 0f b1 0b 39 d0 0f 84 81 00 00 00 4c 69 2d 27 86 99 00 fa 00 00 00 45 31 e4 4d 39 ec 74 2b <8b> 13 89 d0 c1 e8 10 66 39 c2 75 [68127.807663] RIP [<ffffffff8107da31>] do_raw_spin_lock+0x94/0x122 [68127.807663] RSP <ffff8803f097fbb8> [68127.807663] CR2: ffff8803f8947a50 [68127.807663] ---[ end trace d7045aac00a66cd8 ]--- This is due to a race that can happen in a very tiny time window and is illustrated by the following sequence diagram: CPU 1 CPU 2 btrfs_scrub_dev() scrub_bio_end_io_worker() scrub_pending_bio_dec() atomic_dec(&sctx->bios_in_flight) wait sctx->bios_in_flight == 0 wait sctx->workers_pending == 0 mutex_lock(&fs_info->scrub_lock) (...) mutex_lock(&fs_info->scrub_lock) scrub_free_ctx(sctx) kfree(sctx) wake_up(&sctx->list_wait) __wake_up() spin_lock_irqsave(&sctx->list_wait->lock, flags) Another variation of this scenario that results in the same use-after-free issue is: CPU 1 CPU 2 btrfs_scrub_dev() wait sctx->bios_in_flight == 0 scrub_bio_end_io_worker() scrub_pending_bio_dec() __wake_up(&sctx->list_wait) spin_lock_irqsave(&sctx->list_wait->lock, flags) default_wake_function() wake up task at CPU 2 wait sctx->workers_pending == 0 mutex_lock(&fs_info->scrub_lock) (...) mutex_lock(&fs_info->scrub_lock) scrub_free_ctx(sctx) kfree(sctx) spin_unlock_irqrestore(&sctx->list_wait->lock, flags) Fix this by holding the scrub lock while doing the wakeup. This isn't a recent regression, the issue as been around since the scrub feature was added (2011, commit a2de733c78fa7af51ba9670482fa7d392aa67c57). Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Chris Mason <clm@fb.com> 03 February 2015, 03:24:50 UTC
001a648 Btrfs: add missing cleanup on sysfs init failure If we failed during initialization of sysfs, we weren't unregistering the top level btrfs sysfs entry nor the debugfs stuff. Not unregistering the top level sysfs entry makes future attempts to reload the btrfs module impossible and the following is reported in dmesg: [ 2246.451296] WARNING: CPU: 3 PID: 10999 at fs/sysfs/dir.c:486 sysfs_warn_dup+0x91/0xb0() [ 2246.451298] sysfs: cannot create duplicate filename '/fs/btrfs' [ 2246.451298] Modules linked in: btrfs(+) raid6_pq xor bnep rfcomm bluetooth binfmt_misc nfsd auth_rpcgss oid_registry nfs_acl nfs lockd fscache sunrpc parport_pc parport psmouse serio_raw pcspkr evbug i2c_piix4 e1000 floppy [last unloaded: btrfs] [ 2246.451310] CPU: 3 PID: 10999 Comm: modprobe Tainted: G W 3.13.0-fdm-btrfs-next-24+ #7 [ 2246.451311] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 [ 2246.451312] 0000000000000009 ffff8800d353fa08 ffffffff816f1da6 0000000000000410 [ 2246.451314] ffff8800d353fa58 ffff8800d353fa48 ffffffff8104a32c ffff88020821a290 [ 2246.451316] ffff88020821a290 ffff88020821a290 ffff8802148f0000 ffff8800d353fb80 [ 2246.451318] Call Trace: [ 2246.451322] [<ffffffff816f1da6>] dump_stack+0x4e/0x68 [ 2246.451324] [<ffffffff8104a32c>] warn_slowpath_common+0x8c/0xc0 [ 2246.451325] [<ffffffff8104a416>] warn_slowpath_fmt+0x46/0x50 [ 2246.451328] [<ffffffff81367dc5>] ? strlcat+0x65/0x90 (....) This fixes the following change: btrfs: add simple debugfs interface commit 1bae30982bc86ab66d61ccb6e22792593b45d44d Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Chris Mason <clm@fb.com> 03 February 2015, 03:24:49 UTC
d4b450c Btrfs: fix race between transaction commit and empty block group removal Committing a transaction can race with automatic removal of empty block groups (cleaner kthread), leading to a BUG_ON() in the transaction commit code while running btrfs_finish_extent_commit(). The following sequence diagram shows how it can happen: CPU 1 CPU 2 btrfs_commit_transaction() fs_info->running_transaction = NULL btrfs_finish_extent_commit() find_first_extent_bit() -> found range for block group X in fs_info->freed_extents[] btrfs_delete_unused_bgs() -> found block group X Removed block group X's range from fs_info->freed_extents[] btrfs_remove_chunk() btrfs_remove_block_group(bg X) unpin_extent_range(bg X range) btrfs_lookup_block_group(bg X) -> returns NULL -> BUG_ON() The trace that results from the BUG_ON() is: [48665.187808] ------------[ cut here ]------------ [48665.188032] kernel BUG at fs/btrfs/extent-tree.c:5675! [48665.188032] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC [48665.188032] Modules linked in: dm_flakey dm_mod crc32c_generic btrfs xor raid6_pq nfsd auth_rpcgss oid_registry nfs_acl nfs lockd grace fscache sunrpc loop parport_pc evdev microcode [48665.197388] CPU: 2 PID: 31211 Comm: kworker/u32:16 Tainted: G W 3.19.0-rc5-btrfs-next-4+ #1 [48665.197388] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014 [48665.197388] Workqueue: events_unbound btrfs_async_reclaim_metadata_space [btrfs] [48665.197388] task: ffff880222011810 ti: ffff8801b56a4000 task.ti: ffff8801b56a4000 [48665.197388] RIP: 0010:[<ffffffffa0350d05>] [<ffffffffa0350d05>] unpin_extent_range+0x6a/0x1ba [btrfs] [48665.197388] RSP: 0018:ffff8801b56a7b88 EFLAGS: 00010246 [48665.197388] RAX: 0000000000000000 RBX: ffff8802143a6000 RCX: ffff8802220120c8 [48665.197388] RDX: 0000000000000001 RSI: 0000000000000001 RDI: ffff8800a3c140b0 [48665.197388] RBP: ffff8801b56a7bd8 R08: 0000000000000003 R09: 0000000000000000 [48665.197388] R10: 0000000000000000 R11: 000000000000bbac R12: 0000000012e8e000 [48665.197388] R13: ffff8800a3c14000 R14: 0000000000000000 R15: 0000000000000000 [48665.197388] FS: 0000000000000000(0000) GS:ffff88023ec40000(0000) knlGS:0000000000000000 [48665.197388] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [48665.197388] CR2: 00007f065e42f270 CR3: 0000000206f70000 CR4: 00000000000006e0 [48665.197388] Stack: [48665.197388] ffff8801b56a7bd8 0000000012ea0000 01ff8800a3c14138 0000000012e9ffff [48665.197388] ffff880141df3dd8 ffff8802143a6000 ffff8800a3c14138 ffff880141df3df0 [48665.197388] ffff880141df3dd8 0000000000000000 ffff8801b56a7c08 ffffffffa0354227 [48665.197388] Call Trace: [48665.197388] [<ffffffffa0354227>] btrfs_finish_extent_commit+0xb0/0xd9 [btrfs] [48665.197388] [<ffffffffa0366b4b>] btrfs_commit_transaction+0x791/0x92c [btrfs] [48665.197388] [<ffffffffa0352432>] flush_space+0x43d/0x452 [btrfs] [48665.197388] [<ffffffff814295c3>] ? _raw_spin_unlock+0x28/0x33 [48665.197388] [<ffffffffa035255f>] btrfs_async_reclaim_metadata_space+0x118/0x164 [btrfs] [48665.197388] [<ffffffff81059917>] ? process_one_work+0x14b/0x3ab [48665.197388] [<ffffffff810599ac>] process_one_work+0x1e0/0x3ab [48665.197388] [<ffffffff81079fa9>] ? trace_hardirqs_off+0xd/0xf [48665.197388] [<ffffffff8105a55b>] worker_thread+0x210/0x2d0 [48665.197388] [<ffffffff8105a34b>] ? rescuer_thread+0x2c3/0x2c3 [48665.197388] [<ffffffff8105e5c0>] kthread+0xef/0xf7 [48665.197388] [<ffffffff81429682>] ? _raw_spin_unlock_irq+0x2d/0x39 [48665.197388] [<ffffffff8105e4d1>] ? __kthread_parkme+0xad/0xad [48665.197388] [<ffffffff81429dec>] ret_from_fork+0x7c/0xb0 [48665.197388] [<ffffffff8105e4d1>] ? __kthread_parkme+0xad/0xad [48665.197388] Code: 85 f6 74 14 49 8b 06 49 03 46 09 49 39 c4 72 1d 4c 89 f7 e8 83 ec ff ff 4c 89 e6 4c 89 ef e8 1e f1 ff ff 48 85 c0 49 89 c6 75 02 <0f> 0b 49 8b 1e 49 03 5e 09 48 8b [48665.197388] RIP [<ffffffffa0350d05>] unpin_extent_range+0x6a/0x1ba [btrfs] [48665.197388] RSP <ffff8801b56a7b88> [48665.272246] ---[ end trace b9c6ab9957521376 ]--- Fix this by ensuring that unpining the block group's range in btrfs_finish_extent_commit() is done in a synchronized fashion with removing the block group's range from freed_extents[] in btrfs_delete_unused_bgs() This race got introduced with the change: Btrfs: remove empty block groups automatically commit 47ab2a6c689913db23ccae38349714edf8365e0a Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Chris Mason <clm@fb.com> 03 February 2015, 03:24:48 UTC
e3540ea btrfs: add more checks to btrfs_read_sys_array Verify that the sys_array has enough bytes to read the next item. Signed-off-by: David Sterba <dsterba@suse.cz> Signed-off-by: Chris Mason <clm@fb.com> 03 February 2015, 03:24:39 UTC
1ffb22c btrfs: cleanup, rename a few variables in btrfs_read_sys_array There's a pointer to buffer, integer offset and offset passed as pointer, try to find matching names for them. Signed-off-by: David Sterba <dsterba@suse.cz> Signed-off-by: Chris Mason <clm@fb.com> 03 February 2015, 03:24:28 UTC
ce7fca5 btrfs: add checks for sys_chunk_array sizes Verify that possible minimum and maximum size is set, validity of contents is checked in btrfs_read_sys_array. Signed-off-by: David Sterba <dsterba@suse.cz> Signed-off-by: Chris Mason <clm@fb.com> 03 February 2015, 03:23:43 UTC
75d6ad3 btrfs: more superblock checks, lower bounds on devices and sectorsize/nodesize I received a few crafted images from Jiri, all got through the recently added superblock checks. The lower bounds checks for num_devices and sector/node -sizes were missing and caused a crash during mount. Tools for symbolic code execution were used to prepare the images contents. Reported-by: Jiri Slaby <jslaby@suse.cz> Signed-off-by: David Sterba <dsterba@suse.cz> Signed-off-by: Chris Mason <clm@fb.com> 03 February 2015, 03:20:39 UTC
9cc97d6 Btrfs: Add code to support file creation time This patch adds a new member to the 'struct btrfs_inode' structure to hold the file creation time. Signed-off-by: chandan <chandanrmail@gmail.com> [refreshed, removed btrfs_inode_otime] Signed-off-by: David Sterba <dsterba@suse.cz> Signed-off-by: Chris Mason <clm@fb.com> 03 February 2015, 02:39:16 UTC
a937b97 btrfs: kill btrfs_inode_*time helpers They just opencode taking address of the timespec member. Signed-off-by: David Sterba <dsterba@suse.cz> Signed-off-by: Chris Mason <clm@fb.com> 03 February 2015, 02:39:07 UTC
95449a1 Btrfs: insert_new_root: Fix lock type of the extent buffer. btrfs_alloc_tree_block() returns an extent buffer on which a blocked lock has been taken. Hence assign the appropriate value to path->locks[level]. Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com> Signed-off-by: Chris Mason <clm@fb.com> 22 January 2015, 13:42:23 UTC
78f55e5 Btrfs: fix unused members in struct btrfs_root There isn't any real use of following members of struct btrfs_root so delete them. struct kobject root_kobj; struct completion kobj_unregister; Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: David Sterba <dsterba@suse.cz> Signed-off-by: Chris Mason <clm@fb.com> 22 January 2015, 02:22:37 UTC
0ee13fe btrfs: qgroup: move WARN_ON() to the correct location. In function qgroup_excl_accounting(), we need to WARN when qg->excl is less than what we want to free, same to child and parents. But currently, for parent qgroup, the WARN_ON() is located after freeing qg->excl. It will WARN out even we free it normally. This patch move this WARN_ON() before freeing qg->excl. Signed-off-by: Dongsheng Yang <yangds.fnst@cn.fujitsu.com> Reviewed-by: Satoru Takeuchi <takeuchi_satoru@jp.fujitsu.com> Signed-off-by: Chris Mason <clm@fb.com> 22 January 2015, 02:22:37 UTC
26455d3 Btrfs: cleanup unused run_most "run_most" is not used anymore. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: Satoru Takeuchi <takeuchi_satoru@jp.fujitsu.com> Signed-off-by: Chris Mason <clm@fb.com> 22 January 2015, 02:22:16 UTC
5701934 Rename all ref_count to refs in struct refs is better than ref_count to record a struct's ref count. Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com> Suggested-by: David Sterba <dsterba@suse.cz> Signed-off-by: Chris Mason <clm@fb.com> 22 January 2015, 02:06:50 UTC
ffe2d20 Btrfs: Introduce BTRFS_BLOCK_GROUP_RAID56_MASK to check raid56 simply So we can check raid56 with: (map->type & BTRFS_BLOCK_GROUP_RAID56_MASK) instead of long: (map->type & (BTRFS_BLOCK_GROUP_RAID5 | BTRFS_BLOCK_GROUP_RAID6)) Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com> Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Chris Mason <clm@fb.com> 22 January 2015, 02:06:49 UTC
10f1190 Btrfs: Include map_type in raid_bio Corrent code use many kinds of "clever" way to determine operation target's raid type, as: raid_map != NULL or raid_map[MAX_NR] == RAID[56]_Q_STRIPE To make code easy to maintenance, this patch put raid type into bbio, and we can always get raid type from bbio with a "stupid" way. Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com> Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Chris Mason <clm@fb.com> 22 January 2015, 02:06:49 UTC
be50a8d Btrfs: Simplify scrub_setup_recheck_block()'s argument scrub_setup_recheck_block() have many arguments but most of them can be get from one of them, we can remove them to make code clean. Some other cleanup for that function also included in this patch. Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com> Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Chris Mason <clm@fb.com> 22 January 2015, 02:06:49 UTC
b968fed Btrfs: Combine per-page recover in dev-replace and scrub The code are similar, combine them to make code clean and easy to maintenance. Some lost condition are also completed with benefit of this combination. Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com> Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Chris Mason <clm@fb.com> 22 January 2015, 02:06:49 UTC
8d6738c Btrfs: Separate finding-right-mirror and writing-to-target's process in scrub_handle_errored_block() In corrent code, code of finding-right-mirror and writing-to-target are mixed in logic, if we find a right mirror but failed in writing to target, it will treat as "hadn't found right block", and fill the target with sblock_bad. Actually, "failed in writing to target" does not mean "source block is wrong", this patch separate above two condition in logic, and do some cleanup to make code clean. Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com> Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Chris Mason <clm@fb.com> 22 January 2015, 02:06:49 UTC
dc5f7a3 Btrfs: Break loop when reach BTRFS_MAX_MIRRORS in scrub_setup_recheck_block() Use break instead of useless loop should be more suitable in this case. Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com> Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Chris Mason <clm@fb.com> 22 January 2015, 02:06:48 UTC
7653947 Btrfs: btrfs_rm_dev_replace_blocked(): Use wait_event() Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com> Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Chris Mason <clm@fb.com> 22 January 2015, 02:06:48 UTC
09dd7a0 Btrfs: Cleanup btrfs_bio_counter_inc_blocked() 1: Remove no-need DEFINE_WAIT(wait) 2: Add likely() for BTRFS_FS_STATE_DEV_REPLACING condition 3: Use while loop instead of goto Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com> Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Chris Mason <clm@fb.com> 22 January 2015, 02:06:48 UTC
114ab50 Btrfs: Remove noneed force_write in scrub_write_block_to_dev_replace It is always 1 in this place, because !1 case was already jumped out in previous code. Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com> Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Chris Mason <clm@fb.com> 22 January 2015, 02:06:48 UTC
b25c94c Btrfs: Fix a jump typo of nodatasum_case to avoid wrong WARN_ON() if (sctx->is_dev_replace && !is_metadata && !have_csum) { ... goto nodatasum_case; } ... nodatasum_case: WARN_ON(sctx->is_dev_replace); In above code, nodatasum_case marker should be moved after WARN_ON(). Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com> Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Chris Mason <clm@fb.com> 22 January 2015, 02:06:48 UTC
6e9606d Btrfs: add ref_count and free function for btrfs_bio 1: ref_count is simple than current RBIO_HOLD_BBIO_MAP_BIT flag to keep btrfs_bio's memory in raid56 recovery implement. 2: free function for bbio will make code clean and flexible, plus forced data type checking in compile. Changelog v1->v2: Rename following by David Sterba's suggestion: put_btrfs_bio() -> btrfs_put_bio() get_btrfs_bio() -> btrfs_get_bio() bbio->ref_count -> bbio->refs Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com> Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Chris Mason <clm@fb.com> 22 January 2015, 02:06:48 UTC
8e5cfb5 Btrfs: Make raid_map array be inlined in btrfs_bio structure It can make code more simple and clear, we need not care about free bbio and raid_map together. Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com> Signed-off-by: Chris Mason <clm@fb.com> 22 January 2015, 02:06:47 UTC
cc7539e Btrfs: sort raid_map before adding tgtdev stripes It can avoid complex calculation of real stripes in sort, moreover, we can clean up code of sorting tgtdev_map because it will be in order initially. Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com> Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Chris Mason <clm@fb.com> 22 January 2015, 02:06:47 UTC
e34c330 Btrfs: fix a out-of-bound access of raid_map We add the number of stripes on target devices into bbio->num_stripes if we are under device replacement, and we just sort the raid_map of those stripes that not on the target devices, so if when we need real raid_map, we need skip the stripes on the target devices. Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com> Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Chris Mason <clm@fb.com> 22 January 2015, 02:06:47 UTC
df8d116 Btrfs: fix fsync log replay for inodes with a mix of regular refs and extrefs If we have an inode with a large number of hard links, some of which may be extrefs, turn a regular ref into an extref, fsync the inode and then replay the fsync log (after a crash/reboot), we can endup with an fsync log that makes the replay code always fail with -EOVERFLOW when processing the inode's references. This is easy to reproduce with the test case I made for xfstests. Its steps are the following: _scratch_mkfs "-O extref" >> $seqres.full 2>&1 _init_flakey _mount_flakey # Create a test file with 3001 hard links. This number is large enough to # make btrfs start using extrefs at some point even if the fs has the maximum # possible leaf/node size (64Kb). echo "hello world" > $SCRATCH_MNT/foo for i in `seq 1 3000`; do ln $SCRATCH_MNT/foo $SCRATCH_MNT/foo_link_`printf "%04d" $i` done # Make sure all metadata and data are durably persisted. sync # Now remove one link, add a new one with a new name, add another new one with # the same name as the one we just removed and fsync the inode. rm -f $SCRATCH_MNT/foo_link_0001 ln $SCRATCH_MNT/foo $SCRATCH_MNT/foo_link_3001 ln $SCRATCH_MNT/foo $SCRATCH_MNT/foo_link_0001 rm -f $SCRATCH_MNT/foo_link_0002 ln $SCRATCH_MNT/foo $SCRATCH_MNT/foo_link_3002 ln $SCRATCH_MNT/foo $SCRATCH_MNT/foo_link_3003 $XFS_IO_PROG -c "fsync" $SCRATCH_MNT/foo # Simulate a crash/power loss. This makes sure the next mount # will see an fsync log and will replay that log. _load_flakey_table $FLAKEY_DROP_WRITES _unmount_flakey _load_flakey_table $FLAKEY_ALLOW_WRITES _mount_flakey # Check that the number of hard links is correct, we are able to remove all # the hard links and read the file's data. This is just to verify we don't # get stale file handle errors (due to dangling directory index entries that # point to inodes that no longer exist). echo "Link count: $(stat --format=%h $SCRATCH_MNT/foo)" [ -f $SCRATCH_MNT/foo ] || echo "Link foo is missing" for ((i = 1; i <= 3003; i++)); do name=foo_link_`printf "%04d" $i` if [ $i -eq 2 ]; then [ -f $SCRATCH_MNT/$name ] && echo "Link $name found" else [ -f $SCRATCH_MNT/$name ] || echo "Link $name is missing" fi done rm -f $SCRATCH_MNT/foo_link_* cat $SCRATCH_MNT/foo rm -f $SCRATCH_MNT/foo status=0 exit The fix is simply to correct the overflow condition when overwriting a reference item because it was wrong, trying to increase the item in the fs/subvol tree by an impossible amount. Also ensure that we don't insert one normal ref and one ext ref for the same dentry - this happened because processing a dir index entry from the parent in the log happened when the normal ref item was full, which made the logic insert an extref and later when the normal ref had enough room, it would be inserted again when processing the ref item from the child inode in the log. This issue has been present since the introduction of the extrefs feature (2012). A test case for xfstests follows soon. This test only passes if the previous patch titled "Btrfs: fix fsync when extend references are added to an inode" is applied too. Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Chris Mason <clm@fb.com> 22 January 2015, 02:02:05 UTC
2c2c452 Btrfs: fix fsync when extend references are added to an inode If we added an extended reference to an inode and fsync'ed it, the log replay code would make our inode have an incorrect link count, which was lower then the expected/correct count. This resulted in stale directory index entries after deleting some of the hard links, and any access to the dangling directory entries resulted in -ESTALE errors because the entries pointed to inode items that don't exist anymore. This is easy to reproduce with the test case I made for xfstests, and the bulk of that test is: _scratch_mkfs "-O extref" >> $seqres.full 2>&1 _init_flakey _mount_flakey # Create a test file with 3001 hard links. This number is large enough to # make btrfs start using extrefs at some point even if the fs has the maximum # possible leaf/node size (64Kb). echo "hello world" > $SCRATCH_MNT/foo for i in `seq 1 3000`; do ln $SCRATCH_MNT/foo $SCRATCH_MNT/foo_link_`printf "%04d" $i` done # Make sure all metadata and data are durably persisted. sync # Add one more link to the inode that ends up being a btrfs extref and fsync # the inode. ln $SCRATCH_MNT/foo $SCRATCH_MNT/foo_link_3001 $XFS_IO_PROG -c "fsync" $SCRATCH_MNT/foo # Simulate a crash/power loss. This makes sure the next mount # will see an fsync log and will replay that log. _load_flakey_table $FLAKEY_DROP_WRITES _unmount_flakey _load_flakey_table $FLAKEY_ALLOW_WRITES _mount_flakey # Now after the fsync log replay btrfs left our inode with a wrong link count N, # which was smaller than the correct link count M (N < M). # So after removing N hard links, the remaining M - N directory entries were # still visible to user space but it was impossible to do anything with them # because they pointed to an inode that didn't exist anymore. This resulted in # stale file handle errors (-ESTALE) when accessing those dentries for example. # # So remove all hard links except the first one and then attempt to read the # file, to verify we don't get an -ESTALE error when accessing the inodel # # The btrfs fsck tool also detected the incorrect inode link count and it # reported an error message like the following: # # root 5 inode 257 errors 2001, no inode item, link count wrong # unresolved ref dir 256 index 2978 namelen 13 name foo_link_2976 filetype 1 errors 4, no inode ref # # The fstests framework automatically calls fsck after a test is run, so we # don't need to call fsck explicitly here. rm -f $SCRATCH_MNT/foo_link_* cat $SCRATCH_MNT/foo status=0 exit So make sure an fsync always flushes the delayed inode item, so that the fsync log contains it (needed in order to trigger the link count fixup code) and fix the extref counting function, which always return -ENOENT to its caller (and made it assume there were always 0 extrefs). This issue has been present since the introduction of the extrefs feature (2012). A test case for xfstests follows soon. Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Chris Mason <clm@fb.com> 22 January 2015, 02:02:04 UTC
d36808e Btrfs: fix directory inconsistency after fsync log replay If we have an inode (file) with a link count greater than 1, remove one of its hard links, fsync the inode, power fail/crash and then replay the fsync log on the next mount, we end up getting the parent directory's metadata inconsistent - its i_size still reflects the deleted hard link and has dangling index entries (with no matching inode reference entries). This prevents the directory from ever being deletable, as its i_size can never decrease to BTRFS_EMPTY_DIR_SIZE even if all of its children inodes are deleted, and the dangling index entries can never be removed (as they point to an inode that does not exist anymore). This is easy to reproduce with the following excerpt from the test case for xfstests that I just made: _scratch_mkfs >> $seqres.full 2>&1 _init_flakey _mount_flakey # Create a test file with 2 hard links in the same directory. mkdir -p $SCRATCH_MNT/a/b echo "hello world" > $SCRATCH_MNT/a/b/foo ln $SCRATCH_MNT/a/b/foo $SCRATCH_MNT/a/b/bar # Make sure all metadata and data are durably persisted. sync # Now remove one of the hard links and fsync the inode. rm -f $SCRATCH_MNT/a/b/bar $XFS_IO_PROG -c "fsync" $SCRATCH_MNT/a/b/foo # Simulate a crash/power loss. This makes sure the next mount # will see an fsync log and will replay that log. _load_flakey_table $FLAKEY_DROP_WRITES _unmount_flakey _load_flakey_table $FLAKEY_ALLOW_WRITES _mount_flakey # Remove the last hard link of the file and attempt to remove its parent # directory - this failed in btrfs because the fsync log and replay code # didn't decrement the parent directory's i_size and left dangling directory # index entries - this made the btrfs rmdir implementation always fail with # the error -ENOTEMPTY. # # The dangling directory index entries were visible to user space, but it was # impossible to do anything on them (unlink, open, read, write, stat, etc) # because the inode they pointed to did not exist anymore. # # The parent directory's metadata inconsistency (stale index entries) was # also detected by btrfs' fsck tool, which is run automatically by the fstests # framework when the test finishes. The error message reported by fsck was: # # root 5 inode 259 errors 2001, no inode item, link count wrong # unresolved ref dir 258 index 3 namelen 3 name bar filetype 1 errors 4, no inode ref # rm -f $SCRATCH_MNT/a/b/* rmdir $SCRATCH_MNT/a/b rmdir $SCRATCH_MNT/a To fix this just make sure that after an unlink, if the inode is fsync'ed, he parent inode is fully logged in the fsync log. A test case for xfstests follows soon. Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Chris Mason <clm@fb.com> 22 January 2015, 02:02:04 UTC
6219872 Btrfs: lookup for block group only if needed when freeing a tree block Very often our extent buffer's header generation doesn't match the current transaction's id or it is also referenced by other trees (snapshots), so we don't need the corresponding block group cache object. Therefore only search for it if we are going to use it, so we avoid an unnecessary search in the block groups rbtree (and acquiring and releasing its spinlock). Freeing a tree block is performed when COWing or deleting a node/leaf, which implies we are holding the node/leaf's parent node lock, therefore reducing the amount of time spent when freeing a tree block helps reducing the amount of time we are holding the parent node's lock. For example, for a run of xfstests/generic/083, the block group cache object was needed only 682 times for a total of 226691 calls to free a tree block. Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Chris Mason <clm@fb.com> 22 January 2015, 02:02:04 UTC
730a78c btrfs: remove a no-op unfreeze superbock callback Signed-off-by: David Sterba <dsterba@suse.cz> Signed-off-by: Chris Mason <clm@fb.com> 22 January 2015, 02:02:04 UTC
9ee49a0 btrfs: switch extent_state state to unsigned Currently there's a 4B hole in the structure between refs and state and there are only 16 bits used so we can make it unsigned. This will get a better packing and may save some stack space for local variables. The size of extent_state gets reduced by 8B and there are usually a lot of slab objects. struct extent_state { u64 start; /* 0 8 */ u64 end; /* 8 8 */ struct rb_node rb_node; /* 16 24 */ wait_queue_head_t wq; /* 40 24 */ /* --- cacheline 1 boundary (64 bytes) --- */ atomic_t refs; /* 64 4 */ /* XXX 4 bytes hole, try to pack */ long unsigned int state; /* 72 8 */ u64 private; /* 80 8 */ /* size: 88, cachelines: 2, members: 7 */ /* sum members: 84, holes: 1, sum holes: 4 */ /* last cacheline: 24 bytes */ }; Signed-off-by: David Sterba <dsterba@suse.cz> Signed-off-by: Chris Mason <clm@fb.com> 22 January 2015, 02:02:04 UTC
5efa049 btrfs: set proper message level for skinny metadata This has been confusing people for too long, the message is really just informative. CC: <stable@vger.kernel.org> # 3.10+ Signed-off-by: David Sterba <dsterba@suse.cz> Signed-off-by: Chris Mason <clm@fb.com> 22 January 2015, 02:02:03 UTC
f0954c6 btrfs: update message levels after checksum errors The errors are worth noting and might get missed with INFO level. Signed-off-by: David Sterba <dsterba@suse.cz> Signed-off-by: Chris Mason <clm@fb.com> 22 January 2015, 02:02:03 UTC
aa8ee31 btrfs: update message levels during failed mount All error conditions from open_ctree shall be ERR. Warning would suggest that something's wrong and we can continue. Signed-off-by: David Sterba <dsterba@suse.cz> Signed-off-by: Chris Mason <clm@fb.com> 22 January 2015, 02:02:03 UTC
68b663d btrfs: update message levels for errors Several messages that point to some internal problem, level INFO is wrong here. Signed-off-by: David Sterba <dsterba@suse.cz> Signed-off-by: Chris Mason <clm@fb.com> 22 January 2015, 02:02:03 UTC
a8df6fe Btrfs: fix setup_leaf_for_split() to avoid leaf corruption We were incorrectly detecting when the target key didn't exist anymore after releasing the path and re-searching the tree. This could make us split or duplicate (btrfs_split_item() and btrfs_duplicate_item() are its only callers at the moment) an item when we should not. For the case of duplicating an item, we currently only duplicate checksum items (csum tree) and file extent items (fs/subvol trees). For the checksum items we end up overriding the item completely, but for file extent items we update only some of their fields in the copy (done in __btrfs_drop_extents), which means we can end up having a logical corruption for some values. Also for the case where we duplicate a file extent item it will make us produce a leaf with a wrong key order, as btrfs_duplicate_item() advances us to the next slot and then its caller sets a smaller key on the new item at that slot (like in __btrfs_drop_extents() e.g.). Alternatively if the tree search in setup_leaf_for_split() leaves with path->slots[0] == btrfs_header_nritems(path->nodes[0]), we end up accessing beyond the leaf's end (when we check if the item's size has changed) and make our caller insert an item at the invalid slot btrfs_header_nritems(path->nodes[0]) + 1, causing an invalid memory access if the leaf is full or nearly full. This issue has been present since the introduction of this function in 2009: Btrfs: Add btrfs_duplicate_item commit ad48fd754676bfae4139be1a897b1ea58f9aaf21 Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Chris Mason <clm@fb.com> 22 January 2015, 02:02:03 UTC
57bbddd Merge branch 'cleanup/blocksize-diet-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux into for-linus 22 January 2015, 01:49:35 UTC
d354183 Merge branch 'fix/find-item-path-leak' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux into for-linus 22 January 2015, 01:45:25 UTC
ce93ec5 Btrfs: track dirty block groups on their own list Currently any time we try to update the block groups on disk we will walk _all_ block groups and check for the ->dirty flag to see if it is set. This function can get called several times during a commit. So if you have several terabytes of data you will be a very sad panda as we will loop through _all_ of the block groups several times, which makes the commit take a while which slows down the rest of the file system operations. This patch introduces a dirty list for the block groups that we get added to when we dirty the block group for the first time. Then we simply update any block groups that have been dirtied since the last time we called btrfs_write_dirty_block_groups. This allows us to clean up how we write the free space cache out so it is much cleaner. Thanks, Signed-off-by: Josef Bacik <jbacik@fb.com> Signed-off-by: Chris Mason <clm@fb.com> 22 January 2015, 01:36:52 UTC
e7070be Btrfs: change how we track dirty roots I've been overloading root->dirty_list to keep track of dirty roots and which roots need to have their commit roots switched at transaction commit time. This could cause us to lose an update to the root which could corrupt the file system. To fix this use a state bit to know if the root is dirty, and if it isn't set we go ahead and move the root to the dirty list. This way if we re-dirty the root after adding it to the switch_commit list we make sure to update it. This also makes it so that the extent root is always the last root on the dirty list to try and keep the amount of churn down at this point in the commit. Thanks, Signed-off-by: Josef Bacik <jbacik@fb.com> Signed-off-by: Chris Mason <clm@fb.com> 22 January 2015, 01:35:49 UTC
ec6f34e Linux 3.19-rc5 18 January 2015, 06:02:20 UTC
d0ac5d8 Merge tag 'armsoc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull ARM SoC fixes from Olof Johansson: "We've been sitting on our fixes branch for a while, so this batch is unfortunately on the large side. A lot of these are tweaks and fixes to device trees, fixing various bugs around clocks, reg ranges, etc. There's also a few defconfig updates (which are on the late side, no more of those). All in all the diffstat is bigger than ideal at this time, but nothing in here seems particularly risky" * tag 'armsoc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (31 commits) reset: sunxi: fix spinlock initialization ARM: dts: disable CCI on exynos5420 based arndale-octa drivers: bus: check cci device tree node status ARM: rockchip: disable jtag/sdmmc autoswitching on rk3288 ARM: nomadik: fix up leftover device tree pins ARM: at91: board-dt-sama5: add phy_fixup to override NAND_Tree ARM: at91/dt: sam9263: Add missing clocks to lcdc node ARM: at91: sama5d3: dt: correct the sound route ARM: at91/dt: sama5d4: fix the timer reg length ARM: exynos_defconfig: Enable LM90 driver ARM: exynos_defconfig: Enable options for display panel support arm: dts: Use pmu_system_controller phandle for dp phy ARM: shmobile: sh73a0 legacy: Set .control_parent for all irqpin instances ARM: dts: berlin: correct BG2Q's SM GPIO location. ARM: dts: berlin: add broken-cd and set bus width for eMMC in Marvell DMP DT ARM: dts: berlin: fix io clk and add missing core clk for BG2Q sdhci2 host ARM: dts: Revert disabling of smc91x for n900 ARM: dts: imx51-babbage: Fix ULPI PHY reset modelling ARM: dts: dra7-evm: fix qspi device tree partition size ARM: omap2plus_defconfig: use CONFIG_CPUFREQ_DT ... 18 January 2015, 06:00:40 UTC
12ba857 Merge tag 'clk-fixes-for-linus' of git://git.linaro.org/people/mike.turquette/linux Pull clock driver fixes from Mike Turquette: "Small number of fixes for clock drivers and a single null pointer dereference fix in the framework core code. The driver fixes vary from fixing section mismatch warnings to preventing machines from hanging (and preventing developers from crying)" * tag 'clk-fixes-for-linus' of git://git.linaro.org/people/mike.turquette/linux: clk: fix possible null pointer dereference Revert "clk: ppc-corenet: Fix Section mismatch warning" clk: rockchip: fix deadlock possibility in cpuclk clk: berlin: bg2q: remove non-exist "smemc" gate clock clk: at91: keep slow clk enabled to prevent system hang clk: rockchip: fix rk3288 cpuclk core dividers clk: rockchip: fix rk3066 pll lock bit location clk: rockchip: Fix clock gate for rk3188 hclk_emem_peri clk: rockchip: add CLK_IGNORE_UNUSED flag to fix rk3066/rk3188 USB Host 18 January 2015, 03:29:11 UTC
901b208 Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "This is one fix for a Multiqueue sleeping in invalid context problem and a MAINTAINER file update for Qlogic" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: ->queue_rq can't sleep MAINTAINERS: Update maintainer list for qla4xxx 18 January 2015, 03:26:52 UTC
c7662fc clk: fix possible null pointer dereference The commit 646cafc6 (clk: Change clk_ops->determine_rate to return a clk_hw as the best parent) opens a possibility for null pointer dereference, fix this. Signed-off-by: Stanimir Varbanov <svarbanov@mm-sol.com> Reviewed-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: Michael Turquette <mturquette@linaro.org> 17 January 2015, 19:33:57 UTC
176a107 Revert "clk: ppc-corenet: Fix Section mismatch warning" This reverts commit da788acb28386aa896224e784954bb73c99ff26c. That commit tried to fix the section mismatch warning by moving the ppc_corenet_clk_driver struct to init section. This is definitely wrong because the kernel would free the memories occupied by this struct after boot while this driver is still registered in the driver core. The kernel would panic when accessing this driver struct. Cc: stable@vger.kernel.org # 3.17 Signed-off-by: Kevin Hao <haokexin@gmail.com> Acked-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Michael Turquette <mturquette@linaro.org> 17 January 2015, 19:27:16 UTC
a5e1baf clk: rockchip: fix deadlock possibility in cpuclk Lockdep reported a possible deadlock between the cpuclk lock and for example the i2c driver. CPU0 CPU1 ---- ---- lock(clk_lock); local_irq_disable(); lock(&(&i2c->lock)->rlock); lock(clk_lock); <Interrupt> lock(&(&i2c->lock)->rlock); *** DEADLOCK *** The generic clock-types of the core ccf already use spin_lock_irqsave when touching clock registers, so do the same for the cpuclk. Signed-off-by: Heiko Stuebner <heiko@sntech.de> Reviewed-by: Doug Anderson <dianders@chromium.org> Signed-off-by: Michael Turquette <mturquette@linaro.org> [mturquette@linaro.org: removed initialization of "flags"] 17 January 2015, 19:22:39 UTC
298e320 Merge branch 'fixes' of git://git.infradead.org/users/vkoul/slave-dma Pull dmaengine fixes from Vinod Koul: "Two patches, the first by Andy to fix dw dmac runtime pm and second one by me to fix the dmaengine headers in MAINTAINERS" * 'fixes' of git://git.infradead.org/users/vkoul/slave-dma: dmaengine: dw: balance PM runtime calls MAINTAINERS: dmaengine: fix the header file for dmaengine 17 January 2015, 18:26:24 UTC
59b2858 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "Mostly tooling fixes, but also two PMU driver fixes" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf tools powerpc: Use dwfl_report_elf() instead of offline. perf tools: Fix segfault for symbol annotation on TUI perf test: Fix dwarf unwind using libunwind. perf tools: Avoid build splat for syscall numbers with uclibc perf tools: Elide strlcpy warning with uclibc perf tools: Fix statfs.f_type data type mismatch build error with uclibc tools: Remove bitops/hweight usage of bits in tools/perf perf machine: Fix __machine__findnew_thread() error path perf tools: Fix building error in x86_64 when dwarf unwind is on perf probe: Propagate error code when write(2) failed perf/x86/intel: Fix bug for "cycles:p" and "cycles:pp" on SLM perf/rapl: Fix sysfs_show() initialization for RAPL PMU 17 January 2015, 18:24:30 UTC
d01de23 Merge tag 'perf-urgent-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent Pull perf/urgent fixes from Arnaldo Carvalho de Melo: - Fix segfault when using both the map symtab viewer and annotation in the TUI (Namhyung Kim). - uClibc build fixes (Alexey Brodkin, Vineet Gupta). - bitops/hweight were moved from tools/perf/ too tools/include, move some leftovers (Arnaldo Carvalho de Melo) - Fix dwarf unwind x86_64 build error (Namhyung Kim) - Fix __machine__findnew_thread() error path (Namhyung Kim) - Propagate error code when write(2) failed in 'perf probe' (Namhyung Kim) - Use dwfl_report_elf() instead of offline in powerpc bits to properly handle non prelinked DSOs (Sukadev Bhattiprolu). - Fix dwarf unwind using libunwind in 'perf test' (Wang Nan) Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> 17 January 2015, 10:04:35 UTC
966903a Merge tag 'samsung-fixes-3.19' of git://git.kernel.org/pub/scm/linux/kernel/git/kgene/linux-samsung into fixes Merge "Samsung fixes for v3.19" from Kukjin Kim: Samsung fixes for v3.19 - exynos_defconfig: enable LM90 driver and display panel support - HWMON - SENSORS_LM90 - Direct Rendering Manager (DRM) - DRM bridge registration and lookup framework - Parade ps8622/ps8625 eDP/LVDS bridge - NXP ptn3460 eDP/LVDS bridge - Exynos Fully Interactive Mobile Display controller (FIMD) - Panel registration and lookup framework - Simple panels - Backlight & LCD device support - use pmu_system_controller phandle for dp phy : DP PHY requires pmu_system_controller to handle PMU reg. now * tag 'samsung-fixes-3.19' of git://git.kernel.org/pub/scm/linux/kernel/git/kgene/linux-samsung: ARM: exynos_defconfig: Enable LM90 driver ARM: exynos_defconfig: Enable options for display panel support arm: dts: Use pmu_system_controller phandle for dp phy Signed-off-by: Olof Johansson <olof@lixom.net> 17 January 2015, 03:11:37 UTC
41544f9 reset: sunxi: fix spinlock initialization Call spin_lock_init() before the spinlocks are used, both in early init and probe functions preventing a lockdep splat. I have been observing lockdep complaining [1] during boot on my a80 optimus [2] when CONFIG_PROVE_LOCKING has been enabled. This patch resolves the splat, and has been tested on a few other sunxi platforms without issue. [1] http://storage.kernelci.org/next/next-20150107/arm-multi_v7_defconfig+CONFIG_PROVE_LOCKING=y/lab-tbaker/boot-sun9i-a80-optimus.html [2] http://kernelci.org/boot/?a80-optimus Signed-off-by: Tyler Baker <tyler.baker@linaro.org> Cc: <stable@vger.kernel.org> Acked-by: Philipp Zabel <p.zabel@pengutronix.de> Signed-off-by: Kevin Hilman <khilman@linaro.org> Signed-off-by: Olof Johansson <olof@lixom.net> 17 January 2015, 03:11:31 UTC
a30e931 Merge tag 'renesas-soc-fixes-for-v3.19' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas into fixes Merge "Renesas ARM Based SoC Fixes for v3.19" from Simon Horman: Renesas ARM Based SoC Fixes for v3.19 This pull request is based on the last round of SoC updates for v3.19, Fourth Round of Renesas ARM Based SoC Updates for v3.19, tagged as renesas-soc3-for-v3.19, merged into your next/soc branch and included in v3.19-rc1. - ARM: shmobile: r8a7740: Instantiate GIC from C board code in legacy builds Set .control_parent for all irqpin instances for sh73a0 SoC when booting using legacy C. - ARM: shmobile: r8a7740: Instantiate GIC from C board code in legacy builds This fixes a long standing problem which has been present since the sh73a0 SoC started using the INTC External IRQ pin driver. The patch that introduced the problem is 341eb5465f67437a ("ARM: shmobile: INTC External IRQ pin driver on sh73a0") which was included in v3.10. * tag 'renesas-soc-fixes-for-v3.19' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas: ARM: shmobile: sh73a0 legacy: Set .control_parent for all irqpin instances ARM: shmobile: r8a7740: Instantiate GIC from C board code in legacy builds 17 January 2015, 03:10:43 UTC
25217fe ARM: dts: disable CCI on exynos5420 based arndale-octa The arndale-octa board was giving "imprecise external aborts" during boot-up with MCPM enabled. CCI enablement of the boot cluster was found to be the cause of these aborts (possibly because the secure f/w was not allowing it). Hence, disable CCI for the arndale-octa board. Signed-off-by: Abhilash Kesavan <a.kesavan@samsung.com> Tested-by: Krzysztof Kozlowski <k.kozlowski@samsung.com> Tested-by: Kevin Hilman <khilman@linaro.org> Tested-by: Tyler Baker <tyler.baker@linaro.org> Signed-off-by: Olof Johansson <olof@lixom.net> 17 January 2015, 03:10:41 UTC
896ddd6 drivers: bus: check cci device tree node status The arm-cci driver completes the probe sequence even if the cci node is marked as disabled. Add a check in the driver to honour the cci status in the device tree. Signed-off-by: Abhilash Kesavan <a.kesavan@samsung.com> Acked-by: Sudeep Holla <sudeep.holla@arm.com> Acked-by: Nicolas Pitre <nico@linaro.org> Tested-by: Sudeep Holla <sudeep.holla@arm.com> Tested-by: Kevin Hilman <khilman@linaro.org> Signed-off-by: Olof Johansson <olof@lixom.net> 17 January 2015, 03:10:41 UTC
6fda93b Merge tag 'at91-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nferre/linux-at91 into fixes Merge "at91: fixes for 3.19 #1 (ter)" from Nicolas Ferre: First fixes batch for AT91 on 3.19: - fix some DT entries - correct clock entry for the at91sam9263 LCD - add a phy_fixup for Eth1 on sama5d4 * tag 'at91-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nferre/linux-at91: ARM: at91: board-dt-sama5: add phy_fixup to override NAND_Tree ARM: at91/dt: sam9263: Add missing clocks to lcdc node ARM: at91: sama5d3: dt: correct the sound route ARM: at91/dt: sama5d4: fix the timer reg length Signed-off-by: Olof Johansson <olof@lixom.net> 17 January 2015, 03:10:40 UTC
c9b75d5 ARM: rockchip: disable jtag/sdmmc autoswitching on rk3288 rk3288 SoCs have a function to automatically switch between jtag/sdmmc pinmux settings depending on the card state. This collides with a lot of assumptions. It only works when using the internal card-detect mechanism and breaks horribly when using either the normal card-detect via the slot-gpio function or via any other pin. Also there is of course no link between the mmc and jtag on the software-side, so the jtag clocks may very well be disabled when the card is ejected and the soc switches back to the jtag pinmux. Leaving the switching function enabled did result in mmc timeouts and rcu stalls thus hanging the system on 3.19-rc1. Therefore disable it in all cases, as we expect the devicetree to explicitly select either mmc or jtag pinmuxes anyway. Signed-off-by: Heiko Stuebner <heiko@sntech.de> Signed-off-by: Olof Johansson <olof@lixom.net> 17 January 2015, 03:10:39 UTC
1dbb36b Merge tag 'berlin-fixes-for-3.19-1' of git://git.infradead.org/users/hesselba/linux-berlin into fixes Merge "ARM: berlin: Fixes for v3.19 (round 1)" from Sebastian Hesselbarth: Marvell Berlin fixes for v3.19 round 1: - SDHCI DT fixes for BG2Q and BG2Q reference board - BG2Q SM GPIO DT node relocation * tag 'berlin-fixes-for-3.19-1' of git://git.infradead.org/users/hesselba/linux-berlin: ARM: dts: berlin: correct BG2Q's SM GPIO location. ARM: dts: berlin: add broken-cd and set bus width for eMMC in Marvell DMP DT ARM: dts: berlin: fix io clk and add missing core clk for BG2Q sdhci2 host Signed-off-by: Olof Johansson <olof@lixom.net> 17 January 2015, 03:10:39 UTC
259e438 ARM: nomadik: fix up leftover device tree pins We altered the device tree bindings for the Nomadik family of pin controllers to be standard, this file was merged out-of-order so we missed fixing this. Fix it up. Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Olof Johansson <olof@lixom.net> 17 January 2015, 03:10:38 UTC
e3db221 Merge tag 'omap-for-v3.19/fixes-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into fixes Merge "omap fixes against v3.19-rc1" from Tony Lindgren: Fixes for omaps mostly to deal with dra7 timer issues and hypervisor mode. The other fixes are minor fixes for various boards. The summary of the fixes is: - Fix real-time counter rate typos for some frequencies - Fix counter frequency drift for am572x - Fix booting of secondary CPU in HYP mode - Fix n900 board name for legacy user space - Fix cpufreq in omap2plus_defconfig after Kconfig change - Fix dra7 qspi partitions And also, let's re-enable smc91x on some n900 boards that we have sitting in a few test boot systems after the boot loader dependencies got fixed. * tag 'omap-for-v3.19/fixes-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap: ARM: dts: Revert disabling of smc91x for n900 ARM: dts: dra7-evm: fix qspi device tree partition size ARM: omap2plus_defconfig: use CONFIG_CPUFREQ_DT ARM: OMAP2+: Fix n900 board name for legacy user space ARM: omap5/dra7xx: Enable booting secondary CPU in HYP mode ARM: dra7xx: Fix counter frequency drift for AM572x errata i856 ARM: omap5/dra7xx: Fix frequency typos Signed-off-by: Olof Johansson <olof@lixom.net> 17 January 2015, 03:10:37 UTC
3be8142 Merge tag 'imx-fixes-3.19' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux into fixes Merge "ARM: imx: fixes for 3.19" from Shawn Guo: The i.MX fixes for 3.19: - One fix for incorrect i.MX25 SPI1 clock assignment in device tree, which causes system hang when accessing SPI1. - Correct i.MX6SX QSPI parent clock configuration to fix a kernel Oops. - Fix ULPI PHY reset modelling on imx51-babbage board to remove the dependency on bootloader for USB3317 ULPI PHY reset. - Correct video divider setting on i.MX6Q rev T0 1.0 to fix the issue that HDMI is not working at high resolution on T0 1.0. - One incremental fix for CODA960 VPU enabling in device tree to correct interrupt order. - LS1021A SCFG block works in BE mode, add device tree property big-endian to make it right. * tag 'imx-fixes-3.19' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux: ARM: dts: imx51-babbage: Fix ULPI PHY reset modelling ARM: imx6sx: Set PLL2 as parent of QSPI clocks ARM: dts: imx25: Fix the SPI1 clocks ARM: clk-imx6q: fix video divider for rev T0 1.0 ARM: dts: imx6qdl: Fix CODA960 interrupt order ARM: ls1021a: dtsi: add 'big-endian' property for scfg node Signed-off-by: Olof Johansson <olof@lixom.net> 17 January 2015, 03:10:37 UTC
1591dc4 Merge tag 'v3.19-rockchip-dtsfixes1' of git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip into fixes Merge "ARM: rockchip: dts fix for 3.19" from Heiko Stübner: Increase drive-strength to sdmmc pins on rk3288-evb to fix an issue with the fixed highspeed card detection. * tag 'v3.19-rockchip-dtsfixes1' of git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip: ARM: dts: rockchip: bump sd card pin drive strength up on rk3288-evb Signed-off-by: Olof Johansson <olof@lixom.net> 17 January 2015, 03:10:36 UTC
fc7f0dd kernel: avoid overflow in cmp_range Avoid overflow possibility. [ The overflow is purely theoretical, since this is used for memory ranges that aren't even close to using the full 64 bits, but this is the right thing to do regardless. - Linus ] Signed-off-by: Louis Langholtz <lou_langholtz@me.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Peter Anvin <hpa@linux.intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 16 January 2015, 21:02:23 UTC
6bcf9c1 perf tools powerpc: Use dwfl_report_elf() instead of offline. dwfl_report_offline() works only when libraries are prelinked. Replace dwfl_report_offline() with dwfl_report_elf() so we correctly extract debug info even from libraries that are not prelinked. Reported-by: Jiri Olsa <jolsa@redhat.com> Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Tested-by: Jiri Olsa <jolsa@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Link: http://lkml.kernel.org/r/20150114221045.GA17703@us.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> 16 January 2015, 20:49:30 UTC
813ccd1 perf tools: Fix segfault for symbol annotation on TUI Currently the symbol structure is allocated with symbol_conf.priv_size to carry sideband information like annotation, map browser on TUI and sort-by-name tree node. So retrieving these information from symbol needs to care about the details of such placement. However the annotation code just assumes that the symbol is placed after the struct annotation. But actually there's other info between them. So accessing those struct will lead to an undefined behavior (usually a crash) after they write their info to the same location. To reproduce the problem, please follow the steps below: 1. run perf report (TUI of course) with -v option 2. open map browser (by pressing right arrow key for any entry) 3. search any function (by pressing '/' key and input whatever..) 4. return to the hist browser (by pressing 'q' or left arrow key) 5. open annotation window for the same entry (by pressing 'a' key) Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1421234288-22758-1-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> 16 January 2015, 20:49:29 UTC
b93b096 perf test: Fix dwarf unwind using libunwind. Perf tool fails to unwind user stack if the event raises in a shared object. This patch improves tests/dwarf-unwind.c to demonstrate the problem by utilizing commonly used glibc function "bsearch". If perf is not statically linked, the testcase will try to unwind a mixed call trace. By debugging libunwind I found that there is a bug in unwind-libunwind: it always passes 0 as segbase to libunwind, cause libunwind unable to locate debug_frame entry fir first level ip address (I add some more debugging output into libunwind to make things clear): >_Uarm_dwarf_find_debug_frame: start_ip = 10be98, end_ip = 10c2a4 >_Uarm_dwarf_find_debug_frame: found debug_frame table `/lib/libc-2.18.so': segbase=0x0, len=7, gp=0x0, table_data=0x449388 >_Uarm_dwarf_search_unwind_table: call lookup:ip = b6cd3bcc, segbase = 0, rel_ip = b6cd3bcc >lookup: e->start_ip_offset = bcf18 (rel_ip = b6cd3bcc) >lookup: e->start_ip_offset = 6d314 (rel_ip = b6cd3bcc) >lookup: e->start_ip_offset = 33d0c (rel_ip = b6cd3bcc) ... >lookup: e->start_ip_offset = 15d0c (rel_ip = b6cd3bcc) >lookup: e->start_ip_offset = 15c40 (rel_ip = b6cd3bcc) >_Uarm_dwarf_search_unwind_table: IP b6cd3bcc inside range b6c12000-b6d4c000, but no explicit unwind info found >put_rs_cache: unmasking signals/interrupts and releasing lock >_Uarm_dwarf_step: returning -10 >_Uarm_step: dwarf_step()=-10 This patch passes map->start as segbase to dwarf_find_debug_frame(), so di will be initialized correctly. In addition, dso and executable are different when setting segbase. This patch first check whether the elf is executable, and pass segbase only for shared object. Signed-off-by: Wang Nan <wangnan0@huawei.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Li Zefan <lizefan@huawei.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1421203007-75799-1-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> 16 January 2015, 20:49:29 UTC
ea1fe3a perf tools: Avoid build splat for syscall numbers with uclibc This is due to duplicated unistd inclusion (via uClibc headers + kernel headers) Also seen on ARM uClibc based tools ------- ARC build ---------->8------------- CC util/evlist.o In file included from ~/arc/k.org/arch/arc/include/uapi/asm/unistd.h:25:0, from util/../perf-sys.h:10, from util/../perf.h:15, from util/event.h:7, from util/event.c:3: ~/arc/k.org/include/uapi/asm-generic/unistd.h:906:0: warning: "__NR_fcntl64" redefined [enabled by default] #define __NR_fcntl64 __NR3264_fcntl ^ In file included from ~/arc/gnu/INSTALL_1412-arc-2014.12-rc1/arc-snps-linux-uclibc/sysroot/usr/include/sys/syscall.h:24:0, from util/../perf-sys.h:6, ----------------->8------------------- ------- ARM build ---------->8------------- CC FPIC plugin_scsi.o In file included from util/../perf-sys.h:9:0, from util/../perf.h:15, from util/cache.h:7, from perf.c:12: ~/arc/k.org/arch/arm/include/uapi/asm/unistd.h:28:0: warning: "__NR_restart_syscall" redefined [enabled by default] In file included from ~/buildroot/host/usr/arm-buildroot-linux-uclibcgnueabi/sysroot/usr/include/sys/syscall.h:25:0, from util/../perf-sys.h:6, from util/../perf.h:15, from util/cache.h:7, from perf.c:12: ~/buildroot/host/usr/arm-buildroot-linux-uclibcgnueabi/sysroot/usr/include/bits/sysnum.h:17:0: note: this is the location of the previous definition ----------------->8------------------- Signed-off-by: Vineet Gupta <vgupta@synopsys.com> Cc: Alexey Brodkin <Alexey.Brodkin@synopsys.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1421156604-30603-4-git-send-email-vgupta@synopsys.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> 16 January 2015, 20:49:29 UTC
a83d869 perf tools: Elide strlcpy warning with uclibc ----------------->8------------------ CC bench/sched-pipe.o In file included from builtin-annotate.c:13:0: util/cache.h:76:15: warning: redundant redeclaration of 'strlcpy' [-Wredundant-decls] extern size_t strlcpy(char *dest, const char *src, size_t size); ^ In file included from util/util.h:55:0, from builtin.h:4, from builtin-annotate.c:8: ~/vineetg/arc/gnu/INSTALL_1412-arc-2014.12-rc1/arc-snps-linux-uclibc/sysroot/usr/include/string.h:396:15: note: previous declaration of 'strlcpy' was here extern size_t strlcpy(char *__restrict dst, const char *__restrict src, ----------------->8------------------ Signed-off-by: Vineet Gupta <vgupta@synopsys.com> Cc: Alexey Brodkin <Alexey.Brodkin@synopsys.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1421156604-30603-3-git-send-email-vgupta@synopsys.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> 16 January 2015, 20:49:29 UTC
db1806e perf tools: Fix statfs.f_type data type mismatch build error with uclibc ARC Linux uses the no legacy syscalls abi and corresponding uClibc headers statfs defines f_type to be U32 which causes perf build breakage http://git.uclibc.org/uClibc/tree/libc/sysdeps/linux/common-generic/bits/statfs.h ----------->8--------------- CC fs/fs.o fs/fs.c: In function 'fs__valid_mount': fs/fs.c:82:24: error: comparison between signed and unsigned integer expressions [-Werror=sign-compare] else if (st_fs.f_type != magic) ^ cc1: all warnings being treated as errors ----------->8--------------- Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Borislav Petkov <bp@suse.de> Cc: Cody P Schafer <dev@codyps.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Vineet Gupta <Vineet.Gupta1@synopsys.com> Link: http://lkml.kernel.org/r/1420888254-17504-2-git-send-email-vgupta@synopsys.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> 16 January 2015, 20:49:29 UTC
25cd480 tools: Remove bitops/hweight usage of bits in tools/perf We need to use lib/hweight.c for that, just like we do for lib/rbtree.c, so tools need to link hweight.o. For now do it directly, but we need to have a tools/lib/lk.a or .so that collects these goodies... Reported-by: Jan Beulich <JBeulich@suse.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: David Ahern <dsahern@gmail.com> Cc: Don Zickus <dzickus@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-a1e91dx3apzqw5kbdt7ut21s@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> 16 January 2015, 20:49:29 UTC
260d819 perf machine: Fix __machine__findnew_thread() error path When thread__init_map_groups() fails, a new thread should be removed from the rbtree since it's gonna be freed. Also update last match cache only if the function succeeded. Reported-by: David Ahern <dsahern@gmail.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1420763892-15535-1-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> 16 January 2015, 20:49:28 UTC
c6e5e9f perf tools: Fix building error in x86_64 when dwarf unwind is on When build with 'make ARCH=x86' and dwarf unwind is on, there is a compiling error: CC /home/wn/perf/arch/x86/util/unwind-libdw.o CC /home/wn/perf/arch/x86/tests/regs_load.o arch/x86/tests/regs_load.S: Assembler messages: arch/x86/tests/regs_load.S:65: Error: operand type mismatch for `push' arch/x86/tests/regs_load.S:72: Error: operand type mismatch for `pop' make[1]: *** [/home/wn/perf/arch/x86/tests/regs_load.o] Error 1 make[1]: INTERNAL: Exiting with 25 jobserver tokens available; should be 24! make: *** [all] Error 2 ... Which is caused by incorrectly undefine macro HAVE_ARCH_X86_64_SUPPORT. 'config/Makefile.arch' tests __x86_64__ only when 'ARCH=x86_64'. However, when building x86_64 kernel, ARCH=x86 is valid and commonly used. Build systems, such as yocto, uses x86_64 compiler with 'ARCH=x86' to build x86_64 perf, which causes mismatching. As __LP64__ is defined for x86_64 as well, we can consolidate the __x86_64__ check to the __LP64__ check and get rid of the IS_X86_64 IMHO. (This patch is made by Namhyung Kim when replying my v1 patch: https://lkml.org/lkml/2015/1/7/17 I modified the code to remove dependency on RAW_ARCH: https://lkml.org/lkml/2015/1/7/865 Namhyung Kim didn't provide his SOB in his original email. I add mine only for my modification.) Signed-off-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Wang Nan <wangnan0@huawei.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Li Zefan <lizefan@huawei.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/1421029255-23039-1-git-send-email-wangnan0@huawei.com [ Namhyung provided his S-o-B on a followup to this patch thread on lkml ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> 16 January 2015, 20:49:28 UTC
7949ba1 perf probe: Propagate error code when write(2) failed When it failed to write probe commands to the probe_event file in debugfs, it needs to propagate the error code properly. Current code blindly uses the return value of the write(2) so it always uses -1 (-EPERM) and it might confuse users. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1420886028-15135-4-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> 16 January 2015, 20:49:28 UTC
7ad4b4a Merge tag 'char-misc-3.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char/misc driver fixes from Greg KH: "Here are three small driver fixes for reported issues for 3.19-rc5. All of these have been in linux-next for a while with no reported problems" * tag 'char-misc-3.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: mcb: mcb-pci: Only remap the 1st 0x200 bytes of BAR 0 mei: add ABI documentation for fw_status exported through sysfs mei: clean reset bit before reset 16 January 2015, 19:18:08 UTC
62b1530 Merge tag 'driver-core-3.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core fix from Greg KH: "Here is one kernfs fix for a reported issue for 3.19-rc5. It has been in linux-next for a while" * tag 'driver-core-3.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: kernfs: Fix kernfs_name_compare 16 January 2015, 19:16:52 UTC
7b78de8 Merge tag 'tty-3.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty Pull tty/serial driver fixes from Greg KH: "Here are some tty and serial driver fixes for 3.19-rc5 that resolve some reported issues, and add a new device id to the 8250 serial port driver. All have been in linux-next with no reported problems" * tag 'tty-3.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: serial: samsung: Add the support for Exynos5433 SoC Revert "tty: Fix pty master poll() after slave closes v2" tty: Prevent hw state corruption in exclusive mode reopen tty: Add support for the WCH384 4S multi-IO card serial: fix parisc boot hang 16 January 2015, 19:15:49 UTC
937102f Merge tag 'staging-3.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging Pull staging driver fixes from Greg KH: "Here are 6 staging driver fixes for 3.19-rc5. They fix some reported issues with some IIO drivers, as well as some issues with the vt6655 wireless driver. All have been in linux-next for a while" * tag 'staging-3.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: staging: vt6655: fix sparse warning: argument type staging: vt6655: Fix loss of distant/weak access points on channel change. staging: vt6655: vnt_tx_packet Fix corrupted tx packets. staging: vt6655: fix sparse warnings: incorrect argument type iio: iio: Fix iio_channel_read return if channel havn't info iio: ad799x: Fix ad7991/ad7995/ad7999 config setup 16 January 2015, 19:13:45 UTC
79600d4 Merge tag 'usb-3.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB fixes from Greg KH: "Here is a bunch of USB fixes for 3.19-rc5. Most of these are gadget driver fixes, along with the xhci driver fix that we both reported having problems with, as well as some new device ids and other tiny fixes. All have been in linux-next with no problems" * tag 'usb-3.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (43 commits) usb: dwc3: gadget: Stop TRB preparation after limit is reached usb: dwc3: gadget: Fix TRB preparation during SG usb: phy: mv-usb: fix usb_phy build errors usb: serial: handle -ENODEV quietly in generic_submit_read_urb usb: serial: silence all non-critical read errors USB: console: fix potential use after free USB: console: fix uninitialised ldisc semaphore usb: gadget: udc: atmel: fix possible oops when unloading module usb: gadget: gadgetfs: fix an oops in ep_write() usb: phy: Fix deferred probing OHCI: add a quirk for ULi M5237 blocking on reset uas: Add US_FL_NO_ATA_1X for 2 more Seagate disk enclosures uas: Do not blacklist ASM1153 disk enclosures usb: gadget: udc: avoid dereference before NULL check in ep_queue usb: host: ehci-tegra: request deferred probe when failing to get phy uas: disable UAS on Apricorn SATA dongles uas: Add US_FL_NO_REPORT_OPCODES for JMicron JMS566 with usb-id 0bc2:a013 uas: Add US_FL_NO_ATA_1X for Seagate devices with usb-id 0bc2:a013 xhci: Add broken-streams quirk for Fresco Logic FL1000G xhci controllers USB: EHCI: adjust error return code ... 16 January 2015, 19:02:44 UTC
fa818dc Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Will Deacon: - Wire up compat_sys_execveat for compat (AArch32) tasks - Revert 421520ba9829, as this breaks our side of the boot protocol * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: partially revert "ARM: 8167/1: extend the reserved memory for initrd to be page aligned" arm64: compat: wire up compat_sys_execveat 16 January 2015, 19:01:21 UTC
a2a32cd Merge tag 'nfs-for-3.19-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs Pull NFS client bugfixes from Trond Myklebust: "Highlights include: - Stable fix for a NFSv3/lockd race - Fixes for several NFSv4.1 client id trunking bugs - Remove an incorrect test when checking for delegated opens" * tag 'nfs-for-3.19-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: NFSv4: Remove incorrect check in can_open_delegated() NFS: Ignore transport protocol when detecting server trunking NFSv4/v4.1: Verify the client owner id during trunking detection NFSv4: Cache the NFSv4/v4.1 client owner_id in the struct nfs_client NFSv4.1: Fix client id trunking on Linux LOCKD: Fix a race when initialising nlmsvc_timeout 16 January 2015, 18:59:06 UTC
23aa4b4 Merge tag 'trace-fixes-v3.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull ftrace fixes from Steven Rostedt: "This holds a few fixes to the ftrace infrastructure as well as the mixture of function graph tracing and kprobes. When jprobes and function graph tracing is enabled at the same time it will crash the system: # modprobe jprobe_example # echo function_graph > /sys/kernel/debug/tracing/current_tracer After the first fork (jprobe_example probes it), the system will crash. This is due to the way jprobes copies the stack frame and does not do a normal function return. This messes up with the function graph tracing accounting which hijacks the return address from the stack and replaces it with a hook function. It saves the return addresses in a separate stack to put back the correct return address when done. But because the jprobe functions do not do a normal return, their stack addresses are not put back until the function they probe is called, which means that the probed function will get the return address of the jprobe handler instead of its own. The simple fix here was to disable function graph tracing while the jprobe handler is being called. While debugging this I found two minor bugs with the function graph tracing. The first was about the function graph tracer sharing its function hash with the function tracer (they both get filtered by the same input). The changing of the set_ftrace_filter would not sync the function recording records after a change if the function tracer was disabled but the function graph tracer was enabled. This was due to the update only checking one of the ops instead of the shared ops to see if they were enabled and should perform the sync. This caused the ftrace accounting to break and a ftrace_bug() would be triggered, disabling ftrace until a reboot. The second was that the check to update records only checked one of the filter hashes. It needs to test both the "filter" and "notrace" hashes. The "filter" hash determines what functions to trace where as the "notrace" hash determines what functions not to trace (trace all but these). Both hashes need to be passed to the update code to find out what change is being done during the update. This also broke the ftrace record accounting and triggered a ftrace_bug(). This patch set also include two more fixes that were reported separately from the kprobe issue. One was that init_ftrace_syscalls() was called twice at boot up. This is not a major bug, but that call performed a rather large kmalloc (NR_syscalls * sizeof(*syscalls_metadata)). The second call made the first one a memory leak, and wastes memory. The other fix is a regression caused by an update in the v3.19 merge window. The moving to enable events early, moved the enabling before PID 1 was created. The syscall events require setting the TIF_SYSCALL_TRACEPOINT for all tasks. But for_each_process_thread() does not include the swapper task (PID 0), and ended up being a nop. A suggested fix was to add the init_task() to have its flag set, but I didn't really want to mess with PID 0 for this minor bug. Instead I disable and re-enable events again at early_initcall() where it use to be enabled. This also handles any other event that might have its own reg function that could break at early boot up" * tag 'trace-fixes-v3.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracing: Fix enabling of syscall events on the command line tracing: Remove extra call to init_ftrace_syscalls() ftrace/jprobes/x86: Fix conflict between jprobes and function graph tracing ftrace: Check both notrace and filter for old hash ftrace: Fix updating of filters for shared global_ops filters 16 January 2015, 18:55:52 UTC
0145058 arm64: partially revert "ARM: 8167/1: extend the reserved memory for initrd to be page aligned" This patch partially reverts commit 421520ba98290a73b35b7644e877a48f18e06004 (only the arm64 part). There is no guarantee that the boot-loader places other images like dtb in a different page than initrd start/end, especially when the kernel is built with 64KB pages. When this happens, such pages must not be freed. The free_reserved_area() already takes care of rounding up "start" and rounding down "end" to avoid freeing partially used pages. Cc: <stable@vger.kernel.org> # 3.17+ Reported-by: Peter Maydell <Peter.Maydell@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> 16 January 2015, 13:57:33 UTC
3363673 perf/x86/intel: Fix bug for "cycles:p" and "cycles:pp" on SLM cycles:p and cycles:pp do not work on SLM since commit: 86a04461a99f ("perf/x86: Revamp PEBS event selection") UOPS_RETIRED.ALL is not a PEBS capable event, so it should not be used to count cycle number. Actually SLM calls intel_pebs_aliases_core2() which uses INST_RETIRED.ANY_P to count the number of cycles. It's a PEBS capable event. But inv and cmask must be set to count cycles. Considering SLM allows all events as PEBS with no flags, only INST_RETIRED.ANY_P, inv=1, cmask=16 needs to handled specially. Signed-off-by: Kan Liang <kan.liang@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: http://lkml.kernel.org/r/1421084541-31639-1-git-send-email-kan.liang@intel.com Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> 16 January 2015, 08:06:59 UTC
433678b perf/rapl: Fix sysfs_show() initialization for RAPL PMU This patch fixes a problem with the initialization of the sysfs_show() routine for the RAPL PMU. The current code was wrongly relying on the EVENT_ATTR_STR() macro which uses the events_sysfs_show() function in the x86 PMU code. That function itself was relying on the x86_pmu data structure. Yet RAPL and the core PMU (x86_pmu) have nothing to do with each other. They should therefore not interact with each other. The x86_pmu structure is initialized at boot time based on the host CPU model. When the host CPU is not supported, the x86_pmu remains uninitialized and some of the callbacks it contains are NULL. The false dependency with x86_pmu could potentially cause crashes in case the x86_pmu is not initialized while the RAPL PMU is. This may, for instance, be the case in virtualized environments. This patch fixes the problem by using a private sysfs_show() routine for exporting the RAPL PMU events. Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: http://lkml.kernel.org/r/20150113225953.GA21525@thinkpad Cc: vincent.weaver@maine.edu Cc: jolsa@redhat.com Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> 16 January 2015, 08:06:58 UTC
cb59670 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse Pull fuse fixes from Miklos Szeredi: "This fixes a regression in the latest fuse update plus a fix for a rather theoretical memory ordering issue" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: fuse: add memory barrier to INIT fuse: fix LOOKUP vs INIT compat handling 16 January 2015, 01:58:16 UTC
0b6212e Merge tag 'fbdev-fixes-3.19' of git://git.kernel.org/pub/scm/linux/kernel/git/tomba/linux Pull fbdev fixes from Tomi Valkeinen: - broadsheetfb: fix memory leak - simplefb: fix build failure on sparc * tag 'fbdev-fixes-3.19' of git://git.kernel.org/pub/scm/linux/kernel/git/tomba/linux: fbdev/broadsheetfb: fix memory leak simplefb: Fix build failure on Sparc 16 January 2015, 01:55:47 UTC
7b552bc Merge tag 'mmc-v3.19-4' of git://git.linaro.org/people/ulf.hansson/mmc Pull MMC bugfix from Ulf Hansson: "Fix sdhci regulator regression for Qualcomm and Nvidia boards" * tag 'mmc-v3.19-4' of git://git.linaro.org/people/ulf.hansson/mmc: mmc: sdhci: Set SDHCI_POWER_ON with external vmmc 16 January 2015, 01:53:07 UTC
f8cb395 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k Pull m68k fixlet from Geert Uytterhoeven. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k: m68k: Wire up execveat 16 January 2015, 01:29:21 UTC
3fa116e Merge tag 'powerpc-3.19-4' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux Pull powerpc fixes from Michael Ellerman: "A few powerpc fixes" * tag 'powerpc-3.19-4' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux: powerpc: Work around gcc bug in current_thread_info() cxl: Fix issues when unmapping contexts powernv: Fix OPAL tracepoint code 16 January 2015, 01:28:01 UTC
ce1039b tracing: Fix enabling of syscall events on the command line Commit 5f893b2639b2 "tracing: Move enabling tracepoints to just after rcu_init()" broke the enabling of system call events from the command line. The reason was that the enabling of command line trace events was moved before PID 1 started, and the syscall tracepoints require that all tasks have the TIF_SYSCALL_TRACEPOINT flag set. But the swapper task (pid 0) is not part of that. Since the swapper task is the only task that is running at this early in boot, no task gets the flag set, and the tracepoint never gets reached. Instead of setting the swapper task flag (there should be no reason to do that), re-enabled trace events again after the init thread (PID 1) has been started. It requires disabling all command line events and re-enabling them, as just enabling them again will not reset the logic to set the TIF_SYSCALL_TRACEPOINT flag, as the syscall tracepoint will be fooled into thinking that it was already set, and wont try setting it again. For this reason, we must first disable it and re-enable it. Link: http://lkml.kernel.org/r/1421188517-18312-1-git-send-email-mpe@ellerman.id.au Link: http://lkml.kernel.org/r/20150115040506.216066449@goodmis.org Reported-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> 15 January 2015, 14:42:50 UTC
back to top