https://github.com/torvalds/linux

sort by:
Revision Author Date Message Commit Date
14ccee7 Linux 4.13-rc6 20 August 2017, 21:13:52 UTC
197e7e5 Sanitize 'move_pages()' permission checks The 'move_paghes()' system call was introduced long long ago with the same permission checks as for sending a signal (except using CAP_SYS_NICE instead of CAP_SYS_KILL for the overriding capability). That turns out to not be a great choice - while the system call really only moves physical page allocations around (and you need other capabilities to do a lot of it), you can check the return value to map out some the virtual address choices and defeat ASLR of a binary that still shares your uid. So change the access checks to the more common 'ptrace_may_access()' model instead. This tightens the access checks for the uid, and also effectively changes the CAP_SYS_NICE check to CAP_SYS_PTRACE, but it's unlikely that anybody really _uses_ this legacy system call any more (we hav ebetter NUMA placement models these days), so I expect nobody to notice. Famous last words. Reported-by: Otto Ebeling <otto.ebeling@iki.fi> Acked-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Willy Tarreau <w@1wt.eu> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 20 August 2017, 20:26:27 UTC
7f680d7 Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Thomas Gleixner: "Another pile of small fixes and updates for x86: - Plug a hole in the SMAP implementation which misses to clear AC on NMI entry - Fix the norandmaps/ADDR_NO_RANDOMIZE logic so the command line parameter works correctly again - Use the proper accessor in the startup64 code for next_early_pgt to prevent accessing of invalid addresses and faulting in the early boot code. - Prevent CPU hotplug lock recursion in the MTRR code - Unbreak CPU0 hotplugging - Rename overly long CPUID bits which got introduced in this cycle - Two commits which mark data 'const' and restrict the scope of data and functions to file scope by making them 'static'" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86: Constify attribute_group structures x86/boot/64/clang: Use fixup_pointer() to access 'next_early_pgt' x86/elf: Remove the unnecessary ADDR_NO_RANDOMIZE checks x86: Fix norandmaps/ADDR_NO_RANDOMIZE x86/mtrr: Prevent CPU hotplug lock recursion x86: Mark various structures and functions as 'static' x86/cpufeature, kvm/svm: Rename (shorten) the new "virtualized VMSAVE/VMLOAD" CPUID flag x86/smpboot: Unbreak CPU0 hotplug x86/asm/64: Clear AC on NMI entries 20 August 2017, 16:36:52 UTC
2615a38 Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fixes from Thomas Gleixner: "A few small fixes for timer drivers: - Prevent infinite recursion in the arm architected timer driver with ftrace - Propagate error codes to the caller in case of failure in EM STI driver - Adjust a bogus loop iteration in the arm architected timer driver - Add a missing Kconfig dependency to the pistachio clocksource to prevent build failures - Correctly check for IS_ERR() instead of NULL in the shared timer-of code" * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: clocksource/drivers/arm_arch_timer: Avoid infinite recursion when ftrace is enabled clocksource/drivers/Kconfig: Fix CLKSRC_PISTACHIO dependencies clocksource/drivers/timer-of: Checking for IS_ERR() instead of NULL clocksource/drivers/em_sti: Fix error return codes in em_sti_probe() clocksource/drivers/arm_arch_timer: Fix mem frame loop initialization 20 August 2017, 16:34:24 UTC
e46db8d Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Thomas Gleixner: "Two fixes for the perf subsystem: - Fix an inconsistency of RDPMC mm struct tagging across exec() which causes RDPMC to fault. - Correct the timestamp mechanics across IOC_DISABLE/ENABLE which causes incorrect timestamps and total time calculations" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/core: Fix time on IOC_ENABLE perf/x86: Fix RDPMC vs. mm_struct tracking 20 August 2017, 16:20:57 UTC
9dae41a Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq fixes from Thomas Gleixner: "A pile of smallish changes all over the place: - Add a missing ISB in the GIC V1 driver - Remove an ACPI version check in the GIC V3 ITS driver - Add the missing irq_pm_shutdown function for BRCMSTB-L2 to avoid spurious wakeups - Remove the artifical limitation of ITS instances to the number of NUMA nodes which prevents utilizing the ITS hardware correctly - Prevent a infinite parsing loop in the GIC-V3 ITS/MSI code - Honour the force affinity argument in the GIC-V3 driver which is required to make perf work correctly - Correctly report allocation failures in GIC-V2/V3 to avoid using half allocated and initialized interrupts. - Fixup checks against nr_cpu_ids in the generic IPI code" * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: genirq/ipi: Fixup checks against nr_cpu_ids genirq: Restore trigger settings in irq_modify_status() MAINTAINERS: Remove Jason Cooper's irqchip git tree irqchip/gic-v3-its-platform-msi: Fix msi-parent parsing loop irqchip/gic-v3-its: Allow GIC ITS number more than MAX_NUMNODES irqchip: brcmstb-l2: Define an irq_pm_shutdown function irqchip/gic: Ensure we have an ISB between ack and ->handle_irq irqchip/gic-v3-its: Remove ACPICA version check for ACPI NUMA irqchip/gic-v3: Honor forced affinity setting irqchip/gic-v3: Report failures in gic_irq_domain_alloc irqchip/gic-v2: Report failures in gic_irq_domain_alloc irqchip/atmel-aic: Remove root argument from ->fixup() prototype irqchip/atmel-aic: Fix unbalanced refcount in aic_common_rtc_irq_fixup() irqchip/atmel-aic: Fix unbalanced of_node_put() in aic_common_irq_fixup() 20 August 2017, 16:07:56 UTC
e18a5eb Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull watchdog fix from Thomas Gleixner: "A fix for the hardlockup watchdog to prevent false positives with extreme Turbo-Modes which make the perf/NMI watchdog fire faster than the hrtimer which is used to verify. Slightly larger than the minimal fix, which just would increase the hrtimer frequency, but comes with extra overhead of more watchdog timer interrupts and thread wakeups for all users. With this change we restrict the overhead to the extreme Turbo-Mode systems" * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: kernel/watchdog: Prevent false positives with turbo modes 20 August 2017, 15:54:30 UTC
8fbbe2d genirq/ipi: Fixup checks against nr_cpu_ids Valid CPU ids are [0, nr_cpu_ids-1] inclusive. Fixes: 3b8e29a82dd1 ("genirq: Implement ipi_send_mask/single()") Fixes: f9bce791ae2a ("genirq: Add a new function to get IPI reverse mapping") Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/20170819095751.GB27864@avx2 20 August 2017, 08:49:05 UTC
58d4e45 Merge branch 'akpm' (patches from Andrew) Merge misc fixes from Andrew Morton: "14 fixes" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: mm: revert x86_64 and arm64 ELF_ET_DYN_BASE base changes mm/vmalloc.c: don't unconditonally use __GFP_HIGHMEM mm/mempolicy: fix use after free when calling get_mempolicy mm/cma_debug.c: fix stack corruption due to sprintf usage signal: don't remove SIGNAL_UNKILLABLE for traced tasks. mm, oom: fix potential data corruption when oom_reaper races with writer mm: fix double mmap_sem unlock on MMF_UNSTABLE enforced SIGBUS slub: fix per memcg cache leak on css offline mm: discard memblock data later test_kmod: fix description for -s -and -c parameters kmod: fix wait on recursive loop wait: add wait_event_killable_timeout() kernel/watchdog: fix Kconfig constraints for perf hardlockup watchdog mm: memcontrol: fix NULL pointer crash in test_clear_page_writeback() 18 August 2017, 23:06:33 UTC
c715b72 mm: revert x86_64 and arm64 ELF_ET_DYN_BASE base changes Moving the x86_64 and arm64 PIE base from 0x555555554000 to 0x000100000000 broke AddressSanitizer. This is a partial revert of: eab09532d400 ("binfmt_elf: use ELF_ET_DYN_BASE only for PIE") 02445990a96e ("arm64: move ELF_ET_DYN_BASE to 4GB / 4MB") The AddressSanitizer tool has hard-coded expectations about where executable mappings are loaded. The motivation for changing the PIE base in the above commits was to avoid the Stack-Clash CVEs that allowed executable mappings to get too close to heap and stack. This was mainly a problem on 32-bit, but the 64-bit bases were moved too, in an effort to proactively protect those systems (proofs of concept do exist that show 64-bit collisions, but other recent changes to fix stack accounting and setuid behaviors will minimize the impact). The new 32-bit PIE base is fine for ASan (since it matches the ET_EXEC base), so only the 64-bit PIE base needs to be reverted to let x86 and arm64 ASan binaries run again. Future changes to the 64-bit PIE base on these architectures can be made optional once a more dynamic method for dealing with AddressSanitizer is found. (e.g. always loading PIE into the mmap region for marked binaries.) Link: http://lkml.kernel.org/r/20170807201542.GA21271@beast Fixes: eab09532d400 ("binfmt_elf: use ELF_ET_DYN_BASE only for PIE") Fixes: 02445990a96e ("arm64: move ELF_ET_DYN_BASE to 4GB / 4MB") Signed-off-by: Kees Cook <keescook@chromium.org> Reported-by: Kostya Serebryany <kcc@google.com> Acked-by: Will Deacon <will.deacon@arm.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 18 August 2017, 22:32:02 UTC
704b862 mm/vmalloc.c: don't unconditonally use __GFP_HIGHMEM Commit 19809c2da28a ("mm, vmalloc: use __GFP_HIGHMEM implicitly") added use of __GFP_HIGHMEM for allocations. vmalloc_32 may use GFP_DMA/GFP_DMA32 which does not play nice with __GFP_HIGHMEM and will trigger a BUG in gfp_zone. Only add __GFP_HIGHMEM if we aren't using GFP_DMA/GFP_DMA32. Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1482249 Link: http://lkml.kernel.org/r/20170816220705.31374-1-labbott@redhat.com Fixes: 19809c2da28a ("mm, vmalloc: use __GFP_HIGHMEM implicitly") Signed-off-by: Laura Abbott <labbott@redhat.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 18 August 2017, 22:32:02 UTC
73223e4 mm/mempolicy: fix use after free when calling get_mempolicy I hit a use after free issue when executing trinity and repoduced it with KASAN enabled. The related call trace is as follows. BUG: KASan: use after free in SyS_get_mempolicy+0x3c8/0x960 at addr ffff8801f582d766 Read of size 2 by task syz-executor1/798 INFO: Allocated in mpol_new.part.2+0x74/0x160 age=3 cpu=1 pid=799 __slab_alloc+0x768/0x970 kmem_cache_alloc+0x2e7/0x450 mpol_new.part.2+0x74/0x160 mpol_new+0x66/0x80 SyS_mbind+0x267/0x9f0 system_call_fastpath+0x16/0x1b INFO: Freed in __mpol_put+0x2b/0x40 age=4 cpu=1 pid=799 __slab_free+0x495/0x8e0 kmem_cache_free+0x2f3/0x4c0 __mpol_put+0x2b/0x40 SyS_mbind+0x383/0x9f0 system_call_fastpath+0x16/0x1b INFO: Slab 0xffffea0009cb8dc0 objects=23 used=8 fp=0xffff8801f582de40 flags=0x200000000004080 INFO: Object 0xffff8801f582d760 @offset=5984 fp=0xffff8801f582d600 Bytes b4 ffff8801f582d750: ae 01 ff ff 00 00 00 00 5a 5a 5a 5a 5a 5a 5a 5a ........ZZZZZZZZ Object ffff8801f582d760: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk Object ffff8801f582d770: 6b 6b 6b 6b 6b 6b 6b a5 kkkkkkk. Redzone ffff8801f582d778: bb bb bb bb bb bb bb bb ........ Padding ffff8801f582d8b8: 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZ Memory state around the buggy address: ffff8801f582d600: fb fb fb fc fc fc fc fc fc fc fc fc fc fc fc fc ffff8801f582d680: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc >ffff8801f582d700: fc fc fc fc fc fc fc fc fc fc fc fc fb fb fb fc !shared memory policy is not protected against parallel removal by other thread which is normally protected by the mmap_sem. do_get_mempolicy, however, drops the lock midway while we can still access it later. Early premature up_read is a historical artifact from times when put_user was called in this path see https://lwn.net/Articles/124754/ but that is gone since 8bccd85ffbaf ("[PATCH] Implement sys_* do_* layering in the memory policy layer."). but when we have the the current mempolicy ref count model. The issue was introduced accordingly. Fix the issue by removing the premature release. Link: http://lkml.kernel.org/r/1502950924-27521-1-git-send-email-zhongjiang@huawei.com Signed-off-by: zhong jiang <zhongjiang@huawei.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: David Rientjes <rientjes@google.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: <stable@vger.kernel.org> [2.6+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 18 August 2017, 22:32:02 UTC
da094e4 mm/cma_debug.c: fix stack corruption due to sprintf usage name[] in cma_debugfs_add_one() can only accommodate 16 chars including NULL to store sprintf output. It's common for cma device name to be larger than 15 chars. This can cause stack corrpution. If the gcc stack protector is turned on, this can cause a panic due to stack corruption. Below is one example trace: Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: ffffff8e69a75730 Call trace: dump_backtrace+0x0/0x2c4 show_stack+0x20/0x28 dump_stack+0xb8/0xf4 panic+0x154/0x2b0 print_tainted+0x0/0xc0 cma_debugfs_init+0x274/0x290 do_one_initcall+0x5c/0x168 kernel_init_freeable+0x1c8/0x280 Fix the short sprintf buffer in cma_debugfs_add_one() by using scnprintf() instead of sprintf(). Link: http://lkml.kernel.org/r/1502446217-21840-1-git-send-email-guptap@codeaurora.org Fixes: f318dd083c81 ("cma: Store a name in the cma structure") Signed-off-by: Prakash Gupta <guptap@codeaurora.org> Acked-by: Laura Abbott <labbott@redhat.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 18 August 2017, 22:32:02 UTC
eb61b59 signal: don't remove SIGNAL_UNKILLABLE for traced tasks. When forcing a signal, SIGNAL_UNKILLABLE is removed to prevent recursive faults, but this is undesirable when tracing. For example, debugging an init process (whether global or namespace), hitting a breakpoint and SIGTRAP will force SIGTRAP and then remove SIGNAL_UNKILLABLE. Everything continues fine, but then once debugging has finished, the init process is left killable which is unlikely what the user expects, resulting in either an accidentally killed init or an init that stops reaping zombies. Link: http://lkml.kernel.org/r/20170815112806.10728-1-jamie.iles@oracle.com Signed-off-by: Jamie Iles <jamie.iles@oracle.com> Acked-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 18 August 2017, 22:32:02 UTC
6b31d59 mm, oom: fix potential data corruption when oom_reaper races with writer Wenwei Tao has noticed that our current assumption that the oom victim is dying and never doing any visible changes after it dies, and so the oom_reaper can tear it down, is not entirely true. __task_will_free_mem consider a task dying when SIGNAL_GROUP_EXIT is set but do_group_exit sends SIGKILL to all threads _after_ the flag is set. So there is a race window when some threads won't have fatal_signal_pending while the oom_reaper could start unmapping the address space. Moreover some paths might not check for fatal signals before each PF/g-u-p/copy_from_user. We already have a protection for oom_reaper vs. PF races by checking MMF_UNSTABLE. This has been, however, checked only for kernel threads (use_mm users) which can outlive the oom victim. A simple fix would be to extend the current check in handle_mm_fault for all tasks but that wouldn't be sufficient because the current check assumes that a kernel thread would bail out after EFAULT from get_user*/copy_from_user and never re-read the same address which would succeed because the PF path has established page tables already. This seems to be the case for the only existing use_mm user currently (virtio driver) but it is rather fragile in general. This is even more fragile in general for more complex paths such as generic_perform_write which can re-read the same address more times (e.g. iov_iter_copy_from_user_atomic to fail and then iov_iter_fault_in_readable on retry). Therefore we have to implement MMF_UNSTABLE protection in a robust way and never make a potentially corrupted content visible. That requires to hook deeper into the PF path and check for the flag _every time_ before a pte for anonymous memory is established (that means all !VM_SHARED mappings). The corruption can be triggered artificially (http://lkml.kernel.org/r/201708040646.v746kkhC024636@www262.sakura.ne.jp) but there doesn't seem to be any real life bug report. The race window should be quite tight to trigger most of the time. Link: http://lkml.kernel.org/r/20170807113839.16695-3-mhocko@kernel.org Fixes: aac453635549 ("mm, oom: introduce oom reaper") Signed-off-by: Michal Hocko <mhocko@suse.com> Reported-by: Wenwei Tao <wenwei.tww@alibaba-inc.com> Tested-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: "Kirill A. Shutemov" <kirill@shutemov.name> Cc: Andrea Argangeli <andrea@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 18 August 2017, 22:32:01 UTC
5b53a6e mm: fix double mmap_sem unlock on MMF_UNSTABLE enforced SIGBUS Tetsuo Handa has noticed that MMF_UNSTABLE SIGBUS path in handle_mm_fault causes a lockdep splat Out of memory: Kill process 1056 (a.out) score 603 or sacrifice child Killed process 1056 (a.out) total-vm:4268108kB, anon-rss:2246048kB, file-rss:0kB, shmem-rss:0kB a.out (1169) used greatest stack depth: 11664 bytes left DEBUG_LOCKS_WARN_ON(depth <= 0) ------------[ cut here ]------------ WARNING: CPU: 6 PID: 1339 at kernel/locking/lockdep.c:3617 lock_release+0x172/0x1e0 CPU: 6 PID: 1339 Comm: a.out Not tainted 4.13.0-rc3-next-20170803+ #142 Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/02/2015 RIP: 0010:lock_release+0x172/0x1e0 Call Trace: up_read+0x1a/0x40 __do_page_fault+0x28e/0x4c0 do_page_fault+0x30/0x80 page_fault+0x28/0x30 The reason is that the page fault path might have dropped the mmap_sem and returned with VM_FAULT_RETRY. MMF_UNSTABLE check however rewrites the error path to VM_FAULT_SIGBUS and we always expect mmap_sem taken in that path. Fix this by taking mmap_sem when VM_FAULT_RETRY is held in the MMF_UNSTABLE path. We cannot simply add VM_FAULT_SIGBUS to the existing error code because all arch specific page fault handlers and g-u-p would have to learn a new error code combination. Link: http://lkml.kernel.org/r/20170807113839.16695-2-mhocko@kernel.org Fixes: 3f70dc38cec2 ("mm: make sure that kthreads will not refault oom reaped memory") Reported-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> Signed-off-by: Michal Hocko <mhocko@suse.com> Acked-by: David Rientjes <rientjes@google.com> Cc: Andrea Argangeli <andrea@kernel.org> Cc: "Kirill A. Shutemov" <kirill@shutemov.name> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Wenwei Tao <wenwei.tww@alibaba-inc.com> Cc: <stable@vger.kernel.org> [4.9+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 18 August 2017, 22:32:01 UTC
f6ba488 slub: fix per memcg cache leak on css offline To avoid a possible deadlock, sysfs_slab_remove() schedules an asynchronous work to delete sysfs entries corresponding to the kmem cache. To ensure the cache isn't freed before the work function is called, it takes a reference to the cache kobject. The reference is supposed to be released by the work function. However, the work function (sysfs_slab_remove_workfn()) does nothing in case the cache sysfs entry has already been deleted, leaking the kobject and the corresponding cache. This may happen on a per memcg cache destruction, because sysfs entries of a per memcg cache are deleted on memcg offline if the cache is empty (see __kmemcg_cache_deactivate()). The kmemleak report looks like this: unreferenced object 0xffff9f798a79f540 (size 32): comm "kworker/1:4", pid 15416, jiffies 4307432429 (age 28687.554s) hex dump (first 32 bytes): 6b 6d 61 6c 6c 6f 63 2d 31 36 28 31 35 39 39 3a kmalloc-16(1599: 6e 65 77 72 6f 6f 74 29 00 23 6b c0 ff ff ff ff newroot).#k..... backtrace: kmemleak_alloc+0x4a/0xa0 __kmalloc_track_caller+0x148/0x2c0 kvasprintf+0x66/0xd0 kasprintf+0x49/0x70 memcg_create_kmem_cache+0xe6/0x160 memcg_kmem_cache_create_func+0x20/0x110 process_one_work+0x205/0x5d0 worker_thread+0x4e/0x3a0 kthread+0x109/0x140 ret_from_fork+0x2a/0x40 unreferenced object 0xffff9f79b6136840 (size 416): comm "kworker/1:4", pid 15416, jiffies 4307432429 (age 28687.573s) hex dump (first 32 bytes): 40 fb 80 c2 3e 33 00 00 00 00 00 40 00 00 00 00 @...>3.....@.... 00 00 00 00 00 00 00 00 10 00 00 00 10 00 00 00 ................ backtrace: kmemleak_alloc+0x4a/0xa0 kmem_cache_alloc+0x128/0x280 create_cache+0x3b/0x1e0 memcg_create_kmem_cache+0x118/0x160 memcg_kmem_cache_create_func+0x20/0x110 process_one_work+0x205/0x5d0 worker_thread+0x4e/0x3a0 kthread+0x109/0x140 ret_from_fork+0x2a/0x40 Fix the leak by adding the missing call to kobject_put() to sysfs_slab_remove_workfn(). Link: http://lkml.kernel.org/r/20170812181134.25027-1-vdavydov.dev@gmail.com Fixes: 3b7b314053d02 ("slub: make sysfs file removal asynchronous") Signed-off-by: Vladimir Davydov <vdavydov.dev@gmail.com> Reported-by: Andrei Vagin <avagin@gmail.com> Tested-by: Andrei Vagin <avagin@gmail.com> Acked-by: Tejun Heo <tj@kernel.org> Acked-by: David Rientjes <rientjes@google.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: <stable@vger.kernel.org> [4.12.x] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 18 August 2017, 22:32:01 UTC
3010f87 mm: discard memblock data later There is existing use after free bug when deferred struct pages are enabled: The memblock_add() allocates memory for the memory array if more than 128 entries are needed. See comment in e820__memblock_setup(): * The bootstrap memblock region count maximum is 128 entries * (INIT_MEMBLOCK_REGIONS), but EFI might pass us more E820 entries * than that - so allow memblock resizing. This memblock memory is freed here: free_low_memory_core_early() We access the freed memblock.memory later in boot when deferred pages are initialized in this path: deferred_init_memmap() for_each_mem_pfn_range() __next_mem_pfn_range() type = &memblock.memory; One possible explanation for why this use-after-free hasn't been hit before is that the limit of INIT_MEMBLOCK_REGIONS has never been exceeded at least on systems where deferred struct pages were enabled. Tested by reducing INIT_MEMBLOCK_REGIONS down to 4 from the current 128, and verifying in qemu that this code is getting excuted and that the freed pages are sane. Link: http://lkml.kernel.org/r/1502485554-318703-2-git-send-email-pasha.tatashin@oracle.com Fixes: 7e18adb4f80b ("mm: meminit: initialise remaining struct pages in parallel with kswapd") Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com> Reviewed-by: Steven Sistare <steven.sistare@oracle.com> Reviewed-by: Daniel Jordan <daniel.m.jordan@oracle.com> Reviewed-by: Bob Picco <bob.picco@oracle.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 18 August 2017, 22:32:01 UTC
768dc4e test_kmod: fix description for -s -and -c parameters The descriptions were reversed, correct this. Link: http://lkml.kernel.org/r/20170809234635.13443-4-mcgrof@kernel.org Fixes: 64b671204afd71 ("test_sysctl: add generic script to expand on tests") Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org> Reported-by: Daniel Mentz <danielmentz@google.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Colin Ian King <colin.king@canonical.com> Cc: Dan Carpenter <dan.carpenter@oracle.com> Cc: David Binderman <dcb314@hotmail.com> Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jessica Yu <jeyu@redhat.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Kees Cook <keescook@chromium.org> Cc: Matt Redfearn <matt.redfearn@imgetc.com> Cc: Matt Redfearn <matt.redfearn@imgtec.com> Cc: Michal Marek <mmarek@suse.com> Cc: Miroslav Benes <mbenes@suse.cz> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Petr Mladek <pmladek@suse.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 18 August 2017, 22:32:01 UTC
2ba293c kmod: fix wait on recursive loop Recursive loops with module loading were previously handled in kmod by restricting the number of modprobe calls to 50 and if that limit was breached request_module() would return an error and a user would see the following on their kernel dmesg: request_module: runaway loop modprobe binfmt-464c Starting init:/sbin/init exists but couldn't execute it (error -8) This issue could happen for instance when a 64-bit kernel boots a 32-bit userspace on some architectures and has no 32-bit binary format hanlders. This is visible, for instance, when a CONFIG_MODULES enabled 64-bit MIPS kernel boots a into o32 root filesystem and the binfmt handler for o32 binaries is not built-in. After commit 6d7964a722af ("kmod: throttle kmod thread limit") we now don't have any visible signs of an error and the kernel just waits for the loop to end somehow. Although this *particular* recursive loop could also be addressed by doing a sanity check on search_binary_handler() and disallowing a modular binfmt to be required for modprobe, a generic solution for any recursive kernel kmod issues is still needed. This should catch these loops. We can investigate each loop and address each one separately as they come in, this however puts a stop gap for them as before. Link: http://lkml.kernel.org/r/20170809234635.13443-3-mcgrof@kernel.org Fixes: 6d7964a722af ("kmod: throttle kmod thread limit") Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org> Reported-by: Matt Redfearn <matt.redfearn@imgtec.com> Tested-by: Matt Redfearn <matt.redfearn@imgetc.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Colin Ian King <colin.king@canonical.com> Cc: Dan Carpenter <dan.carpenter@oracle.com> Cc: Daniel Mentz <danielmentz@google.com> Cc: David Binderman <dcb314@hotmail.com> Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jessica Yu <jeyu@redhat.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Kees Cook <keescook@chromium.org> Cc: Michal Marek <mmarek@suse.com> Cc: Miroslav Benes <mbenes@suse.cz> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Petr Mladek <pmladek@suse.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 18 August 2017, 22:32:01 UTC
8ada927 wait: add wait_event_killable_timeout() These are the few pending fixes I have queued up for v4.13-final. One is a a generic regression fix for recursive loops on kmod and the other one is a trivial print out correction. During the v4.13 development we assumed that recursive kmod loops were no longer possible. Clearly that is not true. The regression fix makes use of a new killable wait. We use a killable wait to be paranoid in how signals might be sent to modprobe and only accept a proper SIGKILL. The signal will only be available to userspace to issue *iff* a thread has already entered a wait state, and that happens only if we've already throttled after 50 kmod threads have been hit. Note that although it may seem excessive to trigger a failure afer 5 seconds if all kmod thread remain busy, prior to the series of changes that went into v4.13 we would actually *always* fatally fail any request which came in if the limit was already reached. The new waiting implemented in v4.13 actually gives us *more* breathing room -- the wait for 5 seconds is a wait for *any* kmod thread to finish. We give up and fail *iff* no kmod thread has finished and they're *all* running straight for 5 consecutive seconds. If 50 kmod threads are running consecutively for 5 seconds something else must be really bad. Recursive loops with kmod are bad but they're also hard to implement properly as a selftest without currently fooling current userspace tools like kmod [1]. For instance kmod will complain when you run depmod if it finds a recursive loop with symbol dependency between modules as such this type of recursive loop cannot go upstream as the modules_install target will fail after running depmod. These tests already exist on userspace kmod upstream though (refer to the testsuite/module-playground/mod-loop-*.c files). The same is not true if request_module() is used though, or worst if aliases are used. Likewise the issue with 64-bit kernels booting 32-bit userspace without a binfmt handler built-in is also currently not detected and proactively avoided by userspace kmod tools, or kconfig for all architectures. Although we could complain in the kernel when some of these individual recursive issues creep up, proactively avoiding these situations in userspace at build time is what we should keep striving for. Lastly, since recursive loops could happen with kmod it may mean recursive loops may also be possible with other kernel usermode helpers, this should be investigated and long term if we can come up with a more sensible generic solution even better! [0] https://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux.git/log/?h=20170809-kmod-for-v4.13-final [1] https://git.kernel.org/pub/scm/utils/kernel/kmod/kmod.git This patch (of 3): This wait is similar to wait_event_interruptible_timeout() but only accepts SIGKILL interrupt signal. Other signals are ignored. Link: http://lkml.kernel.org/r/20170809234635.13443-2-mcgrof@kernel.org Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Kees Cook <keescook@chromium.org> Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com> Cc: Jessica Yu <jeyu@redhat.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Michal Marek <mmarek@suse.com> Cc: Petr Mladek <pmladek@suse.com> Cc: Miroslav Benes <mbenes@suse.cz> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Matt Redfearn <matt.redfearn@imgtec.com> Cc: Dan Carpenter <dan.carpenter@oracle.com> Cc: Colin Ian King <colin.king@canonical.com> Cc: Daniel Mentz <danielmentz@google.com> Cc: David Binderman <dcb314@hotmail.com> Cc: Matt Redfearn <matt.redfearn@imgetc.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 18 August 2017, 22:32:01 UTC
92e5aae kernel/watchdog: fix Kconfig constraints for perf hardlockup watchdog Commit 05a4a9527931 ("kernel/watchdog: split up config options") lost the perf-based hardlockup detector's dependency on PERF_EVENTS, which can result in broken builds with some powerpc configurations. Restore the dependency. Add it in for x86 too, despite x86 always selecting PERF_EVENTS it seems reasonable to make the dependency explicit. Link: http://lkml.kernel.org/r/20170810114452.6673-1-npiggin@gmail.com Fixes: 05a4a9527931 ("kernel/watchdog: split up config options") Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Acked-by: Don Zickus <dzickus@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 18 August 2017, 22:32:01 UTC
739f79f mm: memcontrol: fix NULL pointer crash in test_clear_page_writeback() Jaegeuk and Brad report a NULL pointer crash when writeback ending tries to update the memcg stats: BUG: unable to handle kernel NULL pointer dereference at 00000000000003b0 IP: test_clear_page_writeback+0x12e/0x2c0 [...] RIP: 0010:test_clear_page_writeback+0x12e/0x2c0 Call Trace: <IRQ> end_page_writeback+0x47/0x70 f2fs_write_end_io+0x76/0x180 [f2fs] bio_endio+0x9f/0x120 blk_update_request+0xa8/0x2f0 scsi_end_request+0x39/0x1d0 scsi_io_completion+0x211/0x690 scsi_finish_command+0xd9/0x120 scsi_softirq_done+0x127/0x150 __blk_mq_complete_request_remote+0x13/0x20 flush_smp_call_function_queue+0x56/0x110 generic_smp_call_function_single_interrupt+0x13/0x30 smp_call_function_single_interrupt+0x27/0x40 call_function_single_interrupt+0x89/0x90 RIP: 0010:native_safe_halt+0x6/0x10 (gdb) l *(test_clear_page_writeback+0x12e) 0xffffffff811bae3e is in test_clear_page_writeback (./include/linux/memcontrol.h:619). 614 mod_node_page_state(page_pgdat(page), idx, val); 615 if (mem_cgroup_disabled() || !page->mem_cgroup) 616 return; 617 mod_memcg_state(page->mem_cgroup, idx, val); 618 pn = page->mem_cgroup->nodeinfo[page_to_nid(page)]; 619 this_cpu_add(pn->lruvec_stat->count[idx], val); 620 } 621 622 unsigned long mem_cgroup_soft_limit_reclaim(pg_data_t *pgdat, int order, 623 gfp_t gfp_mask, The issue is that writeback doesn't hold a page reference and the page might get freed after PG_writeback is cleared (and the mapping is unlocked) in test_clear_page_writeback(). The stat functions looking up the page's node or zone are safe, as those attributes are static across allocation and free cycles. But page->mem_cgroup is not, and it will get cleared if we race with truncation or migration. It appears this race window has been around for a while, but less likely to trigger when the memcg stats were updated first thing after PG_writeback is cleared. Recent changes reshuffled this code to update the global node stats before the memcg ones, though, stretching the race window out to an extent where people can reproduce the problem. Update test_clear_page_writeback() to look up and pin page->mem_cgroup before clearing PG_writeback, then not use that pointer afterward. It is a partial revert of 62cccb8c8e7a ("mm: simplify lock_page_memcg()") but leaves the pageref-holding callsites that aren't affected alone. Link: http://lkml.kernel.org/r/20170809183825.GA26387@cmpxchg.org Fixes: 62cccb8c8e7a ("mm: simplify lock_page_memcg()") Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Reported-by: Jaegeuk Kim <jaegeuk@kernel.org> Tested-by: Jaegeuk Kim <jaegeuk@kernel.org> Reported-by: Bradley Bolen <bradleybolen@gmail.com> Tested-by: Brad Bolen <bradleybolen@gmail.com> Cc: Vladimir Davydov <vdavydov@virtuozzo.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: <stable@vger.kernel.org> [4.6+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 18 August 2017, 22:32:01 UTC
cc28fcd Merge tag 'xfs-4.13-fixes-5' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux Pull xfs fixes from Darrick Wong: "A handful more bug fixes for you today. Changes since last time: - Don't leak resources when mount fails - Don't accidentally clobber variables when looking for free inodes" * tag 'xfs-4.13-fixes-5' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: xfs: don't leak quotacheck dquots when cow recovery xfs: clear MS_ACTIVE after finishing log recovery iomap: fix integer truncation issues in the zeroing and dirtying helpers xfs: fix inobt inode allocation search optimization 18 August 2017, 21:25:50 UTC
70bfc74 Merge branch 'for-linus' of git://git.kernel.dk/linux-block Pull block fixes from Jens Axboe: "A small set of fixes that should go into this release. This contains: - An NVMe pull request from Christoph, with a few select fixes. One of them fix a polling regression in this series, in which it's trivial to cause the kernel to disable most of the hardware queue interrupts. - Fixup for a blk-mq queue usage imbalance on request allocation, from Keith. - A xen block pull request from Konrad, fixing two issues with xen/xen-blkfront" * 'for-linus' of git://git.kernel.dk/linux-block: blk-mq-pci: add a fallback when pci_irq_get_affinity returns NULL nvme-pci: set cqe_seen on polled completions nvme-fabrics: fix reporting of unrecognized options nvmet-fc: eliminate incorrect static markers on local variables nvmet-fc: correct use after free on list teardown nvmet: don't overwrite identify sn/fr with 0-bytes xen-blkfront: use a right index when checking requests xen: fix bio vec merging blk-mq: Fix queue usage on failed request allocation 18 August 2017, 21:12:39 UTC
edb20a1 Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma Pull rdma fixes from Doug Ledford: "Fourth set of -rc fixes for 4.13 cycle. This is all of the -rc fixes that we know of. I suspect this will be the last rc pull request, but you never know, I could be wrong. Nothing major here. There are the i40iw patches I mentioned in my last pull request minus one that I pulled out because it wasn't a fix and not appropriate for the rc cycle. Then a few other items trickled in and were added to the pull request. It's fairly small aside from those five i40iw patches - Set of five i40iw fixes (the first of these is rather large by line count consideration, but I decided to send it because if fixes a legitimate issue and the line count is because it does so by creating a new function and using it where needed instead of just patching up a few lines...a smaller fix could probably be done, but the larger fix is the better code solution) - One vmw_pvrdma fix - One hns_roce fix (this silences a checker warning, but can't actually happen, I expect a patch to remove this from all drivers that share this same check in for-next) - One iw_cxgb4 fix - Two IB core fixes" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: IB/uverbs: Fix NULL pointer dereference during device removal IB/core: Protect sysfs entry on ib_unregister_device iw_cxgb4: fix misuse of integer variable IB/hns: fix memory leak on ah on error return path i40iw: Fix potential fcn_id_array out of bounds i40iw: Use correct alignment for CQ0 memory i40iw: Fix typecast of tcp_seq_num i40iw: Correct variable names i40iw: Fix parsing of query/commit FPM buffers RDMA/vmw_pvrdma: Report CQ missed events 18 August 2017, 19:35:22 UTC
039a8e3 Merge tag 'powerpc-4.13-7' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: "A bug in the VSX register saving that could cause userspace FP/VMX register corruption. Never seen to happen (that we know of), was found by code inspection, but still tagged for stable given the consequences" * tag 'powerpc-4.13-7' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc: Fix VSX enabling/flushing to also test MSR_FP and MSR_VEC 18 August 2017, 18:11:03 UTC
4283346 Merge tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull ARM SoC fixes from Arnd Bergmann: "A small number of bugfixes, nothing serious this time. Here is a full list. 4.13 regression fix: - imx7d-sdb pinctrl support regressed in 4.13 due to an incomplete patch DT fixes for recently added devices: - badly copied DT entries on imx6qdl-nitrogen6_som broke PCI reset - sama5d2 memory controller had the wrong ID and registers - imx7 power domains did not work correctly with deferred probing (driver added in 4.12) - Allwinner H5 pinctrl (added in 4.12) did not work right with GPIO interrupts Fixes for older bugs that just got noticed: - i.MX25 ADC support (added in 4.6) apparently never worked right due to a missing 'ranges' property in DT. - Renesas Salvador Audio support (added in v4.5) was broken for device repeated bind/unbind due to a naming conflict. - Various allwinner boards are missing an 'ethernet' alias in DT, leading to unstable device naming. Preventive bugfix: - TI Keystone needs a fix to prevent a NULL pointer dereference with an upcoming PM change" * tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: soc: ti: ti_sci_pm_domains: Populate name for genpd ARM: dts: imx6qdl-nitrogen6_som2: fix PCIe reset arm64: allwinner: h5: fix pinctrl IRQs arm64: allwinner: a64: sopine: add missing ethernet0 alias arm64: allwinner: a64: pine64: add missing ethernet0 alias arm64: allwinner: a64: bananapi-m64: add missing ethernet0 alias arm64: renesas: salvator-common: avoid audio_clkout naming conflict ARM: dts: i.MX25: add ranges to tscadc soc: imx: gpcv2: fix regulator deferred probe ARM: dts: at91: sama5d2: fix EBI/NAND controllers declaration ARM: dts: at91: sama5d2: use sama5d2 compatible string for SMC ARM: dts: imx7d-sdb: Put pinctrl_spi4 in the correct location 18 August 2017, 18:08:48 UTC
cb24785 Merge tag 'sound-4.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "A collection of small fixes, mostly for regression fixes (sequencer kconfig and emu10k1 probe) and device-specific quirks (three for USB and one for HD-audio). One significant change is a fix for races in ALSA sequencer core, which covers over the previous incomplete fix" * tag 'sound-4.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ALSA: emu10k1: Fix forgotten user-copy conversion in init code ALSA: usb-audio: add DSD support for new Amanero PID ALSA: usb-audio: Add mute TLV for playback volumes on C-Media devices ALSA: usb-audio: Apply sample rate quirk to Sennheiser headset ALSA: seq: 2nd attempt at fixing race creating a queue ALSA: hda/realtek - Fix pincfg for Dell XPS 13 9370 ALSA: seq: Fix CONFIG_SND_SEQ_MIDI dependency 18 August 2017, 18:02:49 UTC
4478976 Merge tag 'dma-mapping-4.13-3' of git://git.infradead.org/users/hch/dma-mapping Pull dma-mapping fix from Christoph Hellwig: "Another dma-mapping regression fix" * tag 'dma-mapping-4.13-3' of git://git.infradead.org/users/hch/dma-mapping: of: fix DMA mask generation 18 August 2017, 17:51:30 UTC
c005390 blk-mq-pci: add a fallback when pci_irq_get_affinity returns NULL While pci_irq_get_affinity should never fail for SMP kernel that implement the affinity mapping, it will always return NULL in the UP case, so provide a fallback mapping of all queues to CPU 0 in that case. Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: stable@vger.kernel.org Reviewed-by: Omar Sandoval <osandov@fb.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> 18 August 2017, 14:08:14 UTC
6caa050 Merge branch 'nvme-4.13' of git://git.infradead.org/nvme into for-linus Pull NVMe changes from Christoph: "The fixes are getting really small now - two for FC, one for PCI, one for the fabrics layer and one for the target." 18 August 2017, 14:04:15 UTC
7edaeb6 kernel/watchdog: Prevent false positives with turbo modes The hardlockup detector on x86 uses a performance counter based on unhalted CPU cycles and a periodic hrtimer. The hrtimer period is about 2/5 of the performance counter period, so the hrtimer should fire 2-3 times before the performance counter NMI fires. The NMI code checks whether the hrtimer fired since the last invocation. If not, it assumess a hard lockup. The calculation of those periods is based on the nominal CPU frequency. Turbo modes increase the CPU clock frequency and therefore shorten the period of the perf/NMI watchdog. With extreme Turbo-modes (3x nominal frequency) the perf/NMI period is shorter than the hrtimer period which leads to false positives. A simple fix would be to shorten the hrtimer period, but that comes with the side effect of more frequent hrtimer and softlockup thread wakeups, which is not desired. Implement a low pass filter, which checks the perf/NMI period against kernel time. If the perf/NMI fires before 4/5 of the watchdog period has elapsed then the event is ignored and postponed to the next perf/NMI. That solves the problem and avoids the overhead of shorter hrtimer periods and more frequent softlockup thread wakeups. Fixes: 58687acba592 ("lockup_detector: Combine nmi_watchdog and softlockup detector") Reported-and-tested-by: Kan Liang <Kan.liang@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: dzickus@redhat.com Cc: prarit@redhat.com Cc: ak@linux.intel.com Cc: babu.moger@oracle.com Cc: peterz@infradead.org Cc: eranian@google.com Cc: acme@redhat.com Cc: stable@vger.kernel.org Cc: atomlin@redhat.com Cc: akpm@linux-foundation.org Cc: torvalds@linux-foundation.org Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1708150931310.1886@nanos 18 August 2017, 10:35:02 UTC
e8f2418 genirq: Restore trigger settings in irq_modify_status() irq_modify_status starts by clearing the trigger settings from irq_data before applying the new settings, but doesn't restore them, leaving them to IRQ_TYPE_NONE. That's pretty confusing to the potential request_irq() that could follow. Instead, snapshot the settings before clearing them, and restore them if the irq_modify_status() invocation was not changing the trigger. Fixes: 1e2a7d78499e ("irqdomain: Don't set type when mapping an IRQ") Reported-and-tested-by: jeffy <jeffy.chen@rock-chips.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Jon Hunter <jonathanh@nvidia.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/20170818095345.12378-1-marc.zyngier@arm.com 18 August 2017, 10:04:14 UTC
4dd6a99 soc: ti: ti_sci_pm_domains: Populate name for genpd Commit b6a1d093f96b ("PM / Domains: Extend generic power domain debugfs") now creates a debugfs directory for each genpd based on the name of the genpd. Currently no name is given to the genpd created by ti_sci_pm_domains driver so because of this we see a NULL pointer dereferences when it is accessed on boot when the debugfs entry creation is attempted. Give the genpd a name before registering it to avoid this. Fixes: 52835d59fc6c ("soc: ti: Add ti_sci_pm_domains driver") Signed-off-by: Dave Gerlach <d-gerlach@ti.com> Signed-off-by: Santosh Shilimkar <ssantosh@kernel.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de> 18 August 2017, 09:59:53 UTC
9311248 Merge tag 'imx-fixes-4.13-3' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux into fixes Pull "i.MX fixes for 4.13, round 3" from Shawn Guo: - Fix PCIe reset GPIO of imx6qdl-nitrogen6_som2 board, which was a bad copy from nitrogen6_max device tree. * tag 'imx-fixes-4.13-3' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux: ARM: dts: imx6qdl-nitrogen6_som2: fix PCIe reset 18 August 2017, 09:58:38 UTC
552c497 Merge tag 'sunxi-fixes-for-4.13-2' of https://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux into fixes Pull "Allwinner fixes for 4.13, round 2" from Chen-Yu Tsai: Three fixes adding a missing alias for the Ethernet controller on A64 boards. One adding a missing interrupt for the pin controller. * tag 'sunxi-fixes-for-4.13-2' of https://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux: arm64: allwinner: h5: fix pinctrl IRQs arm64: allwinner: a64: sopine: add missing ethernet0 alias arm64: allwinner: a64: pine64: add missing ethernet0 alias arm64: allwinner: a64: bananapi-m64: add missing ethernet0 alias 18 August 2017, 09:55:44 UTC
45bd07a x86: Constify attribute_group structures attribute_groups are not supposed to change at runtime and none of the groups is modified. Mark the non-const structs as const. [ tglx: Folded into one big patch ] Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: tony.luck@intel.com Cc: bp@alien8.de Link: http://lkml.kernel.org/r/1500550238-15655-2-git-send-email-arvind.yadav.cs@gmail.com 18 August 2017, 09:30:35 UTC
7374bfb MAINTAINERS: Remove Jason Cooper's irqchip git tree Jason's irqchip tree does not seem to have been updated for many months now, remove it from the list of trees to avoid any possible confusion. Jason says: "Unfortunately, when I have time for irqchip, I don't always have the time to properly follow up with pull-requests. So, for the time being, I'll stick to reviewing as I can." Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Jason Cooper <jason@lakedaemon.net> Cc: marc.zyngier@arm.com Link: http://lkml.kernel.org/r/20170727224733.8288-1-f.fainelli@gmail.com 18 August 2017, 09:06:35 UTC
0b36f2b ALSA: emu10k1: Fix forgotten user-copy conversion in init code The commit d42fe63d5839 ("ALSA: emu10k1: Get rid of set_fs() usage") converted the user-space copy hack with set_fs() to the direct memcpy(), but one place was forgotten. This resulted in the error from snd_emu10k1_init_efx(), eventually failed to load the driver. Fix the missing piece. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=196687 Fixes: d42fe63d5839 ("ALSA: emu10k1: Get rid of set_fs() usage") Signed-off-by: Takashi Iwai <tiwai@suse.de> 18 August 2017, 08:59:02 UTC
ed993c6 ALSA: usb-audio: add DSD support for new Amanero PID Add DSD support for new Amanero Combo384 firmware version with a new PID. This firmware uses DSD_U32_BE. Fixes: 3eff682d765b ("ALSA: usb-audio: Support both DSD LE/BE Amanero firmware versions") Signed-off-by: Jussi Laako <jussi@sonarnerd.net> Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de> 18 August 2017, 08:13:21 UTC
e9d8a0f nvme-pci: set cqe_seen on polled completions Fixes: 920d13a884 ("nvme-pci: factor out the cqe reading mechanics from __nvme_process_cq") Reported-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Christoph Hellwig <hch@lst.de> 18 August 2017, 07:19:39 UTC
c40bc54 ARM: dts: imx6qdl-nitrogen6_som2: fix PCIe reset Previous value was a bad copy of nitrogen6_max device tree. Signed-off-by: Gary Bisson <gary.bisson@boundarydevices.com> Fixes: 3faa1bb2e89c ("ARM: dts: imx: add Boundary Devices Nitrogen6_SOM2 support") Cc: <stable@vger.kernel.org> Signed-off-by: Shawn Guo <shawnguo@kernel.org> 18 August 2017, 01:40:40 UTC
04d49f3 Merge tag 'drm-fixes-for-v4.13-rc6' of git://people.freedesktop.org/~airlied/linux Pull drm fixes from Dave Airlie: "Seems to be slowing down nicely, just one amdgpu fix, and a bunch of i915 fixes" * tag 'drm-fixes-for-v4.13-rc6' of git://people.freedesktop.org/~airlied/linux: drm/amdgpu: save list length when fence is signaled drm/i915: Avoid the gpu reset vs. modeset deadlock drm/i915: Suppress switch_mm emission between the same aliasing_ppgtt drm/i915: Return correct EDP voltage swing table for 0.85V drm/i915/cnl: Add slice and subslice information to debugfs. drm/i915: Perform an invalidate prior to executing golden renderstate drm/i915: remove unused function declaration 17 August 2017, 23:48:29 UTC
d33a2a9 Merge tag 'pm-4.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fixes from Rafael Wysocki: "These fix two issues related to exposing the current CPU frequency to user space on x86. Specifics: - Disable interrupts around reading IA32_APERF and IA32_MPERF in aperfmperf_snapshot_khz() (introduced recently) to avoid excessive delays between the reads that may result from interrupt handling (Doug Smythies). - Fix the computation of the CPU frequency to be reported through the pstate_sample tracepoint in intel_pstate (Doug Smythies)" * tag 'pm-4.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: cpufreq: x86: Disable interrupts during MSRs reading cpufreq: intel_pstate: report correct CPU frequencies during trace 17 August 2017, 21:21:18 UTC
440105d Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input Pull input fixes from Dmitry Torokhov. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Input: elan_i2c - Add antoher Lenovo ACPI ID for upcoming Lenovo NB Input: elan_i2c - add ELAN0608 to the ACPI table Input: trackpoint - assume 3 buttons when buttons detection fails 17 August 2017, 20:45:44 UTC
28eb462 Merge branch 'drm-fixes-4.13' of git://people.freedesktop.org/~agd5f/linux into drm-fixes single amdgpu fix. * 'drm-fixes-4.13' of git://people.freedesktop.org/~agd5f/linux: drm/amdgpu: save list length when fence is signaled 17 August 2017, 19:45:03 UTC
41d31b5 Merge tag 'drm-intel-fixes-2017-08-16' of git://anongit.freedesktop.org/git/drm-intel into drm-fixes drm/i915 fixes for v4.13-rc6 "Chris' "drm/i915: Perform an invalidate prior to executing golden renderstate" and Daniel's "drm/i915: Avoid the gpu reset vs. modeset deadlock" seem like the most important ones. * tag 'drm-intel-fixes-2017-08-16' of git://anongit.freedesktop.org/git/drm-intel: drm/i915: Avoid the gpu reset vs. modeset deadlock drm/i915: Suppress switch_mm emission between the same aliasing_ppgtt drm/i915: Return correct EDP voltage swing table for 0.85V drm/i915/cnl: Add slice and subslice information to debugfs. drm/i915: Perform an invalidate prior to executing golden renderstate drm/i915: remove unused function declaration 17 August 2017, 19:43:10 UTC
77aff8c xfs: don't leak quotacheck dquots when cow recovery If we fail a mount on account of cow recovery errors, it's possible that a previous quotacheck left some dquots in memory. The bailout clause of xfs_mountfs forgets to purge these, and so we leak them. Fix that. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> 17 August 2017, 19:40:33 UTC
8204f8d xfs: clear MS_ACTIVE after finishing log recovery Way back when we established inode block-map redo log items, it was discovered that we needed to prevent the VFS from evicting inodes during log recovery because any given inode might be have bmap redo items to replay even if the inode has no link count and is ultimately deleted, and any eviction of an unlinked inode causes the inode to be truncated and freed too early. To make this possible, we set MS_ACTIVE so that inodes would not be torn down immediately upon release. Unfortunately, this also results in the quota inodes not being released at all if a later part of the mount process should fail, because we never reclaim the inodes. So, set MS_ACTIVE right before we do the last part of log recovery and clear it immediately after we finish the log recovery so that everything will be torn down properly if we abort the mount. Fixes: 17c12bcd30 ("xfs: when replaying bmap operations, don't let unlinked inodes get reaped") Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> 17 August 2017, 19:40:33 UTC
8179962 Merge branches 'intel_pstate-fix' and 'cpufreq-x86-fix' * intel_pstate-fix: cpufreq: intel_pstate: report correct CPU frequencies during trace * cpufreq-x86-fix: cpufreq: x86: Disable interrupts during MSRs reading 17 August 2017, 19:00:30 UTC
3bc6c90 Merge branch 'parisc-4.13-5' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux Pull parisc fixes from Helge Deller: - Fix PCI memory bar assignments with 64-bit kernels on machines with Dino/Cujo PCI chipsets. This makes PCI graphic cards work on such machines (from Thomas Bogendoerfer). - Fix documentation to be more clear about the difference between %pF and %pS printk format usage. There are still many places in the kernel which have it wrong (from Petr Mladek, Sergey Senozhatsky & me). * 'parisc-4.13-5' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux: printk-formats.txt: Better describe the difference between %pS and %pF parisc: pci memory bar assignment fails with 64bit kernels on dino/cujo 17 August 2017, 18:39:54 UTC
81a0b8d nvme-fabrics: fix reporting of unrecognized options Only print the specified options that are not recognized, instead of the whole list of options. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> 17 August 2017, 16:48:54 UTC
99f781b Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull quota fix from Jan Kara: "A fix of a check for quota limit" * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: quota: correct space limit check 17 August 2017, 16:26:10 UTC
c8c03f1 pty: fix the cached path of the pty slave file descriptor in the master Christian Brauner reported that if you use the TIOCGPTPEER ioctl() to get a slave pty file descriptor, the resulting file descriptor doesn't look right in /proc/<pid>/fd/<fd>. In particular, he wanted to use readlink() on /proc/self/fd/<fd> to get the pathname of the slave pty (basically implementing "ptsname{_r}()"). The reason for that was that we had generated the wrong 'struct path' when we create the pty in ptmx_open(). In particular, the dentry was correct, but the vfsmount pointed to the mount of the ptmx node. That _can_ be correct - in case you use "/dev/pts/ptmx" to open the master - but usually is not. The normal case is to use /dev/ptmx, which then looks up the pts/ directory, and then the vfsmount of the ptmx node is obviously the /dev directory, not the /dev/pts/ directory. We actually did have the right vfsmount available, but in the wrong place (it gets looked up in 'devpts_acquire()' when we get a reference to the pts filesystem), and so ptmx_open() used the wrong mnt pointer. The end result of this confusion was that the pty worked fine, but when if you did TIOCGPTPEER to get the slave side of the pty, end end result would also work, but have that dodgy 'struct path'. And then when doing "d_path()" on to get the pathname, the vfsmount would not match the root of the pts directory, and d_path() would return an empty pathname thinking that the entry had escaped a bind mount into another mount. This fixes the problem by making devpts_acquire() return the vfsmount for the pts filesystem, allowing ptmx_open() to trivially just use the right mount for the pts dentry, and create the proper 'struct path'. Reported-by: Christian Brauner <christian.brauner@ubuntu.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Eric Biederman <ebiederm@xmission.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 17 August 2017, 16:10:48 UTC
0f174b3 ALSA: usb-audio: Add mute TLV for playback volumes on C-Media devices C-Media devices (at least some models) mute the playback stream when volumes are set to the minimum value. But this isn't informed via TLV and the user-space, typically PulseAudio, gets confused as if it's still played in a low volume. This patch adds the new flag, min_mute, to struct usb_mixer_elem_info for indicating that the mixer element is with the minimum-mute volume. This flag is set for known C-Media devices in snd_usb_mixer_fu_apply_quirk() in turn. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=196669 Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de> 17 August 2017, 15:52:16 UTC
872784b Merge tag 'renesas-fixes4-for-v4.13' of https://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas into fixes Pull "Fourth Round of Renesas ARM Based SoC Fixes for v4.13" from Simon Horman: * Avoid audio_clkout naming conflict for salvator boards using Renesas R-Car Gen 3 SoCs Morimoto-san says "The clock name of "audio_clkout" is used by the Renesas sound driver. This duplicated naming breaks its clock registering/unregistering. Especially when unbind/bind it can't handle clkout correctly. This patch renames "audio_clkout" to "audio-clkout" to avoid the naming conflict." * tag 'renesas-fixes4-for-v4.13' of https://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas: arm64: renesas: salvator-common: avoid audio_clkout naming conflict 17 August 2017, 09:00:26 UTC
ee7b1f3 of: fix DMA mask generation Historically, DMA masks have suffered some ambiguity between whether they represent the range of physical memory a device can access, or the address bits a device is capable of driving, particularly since on many platforms the two are equivalent. Whilst there are some stragglers left (dma_max_pfn(), I'm looking at you...), the majority of DMA code has been cleaned up to follow the latter definition, not least since it is the only one which makes sense once IOMMUs are involved. In this respect, of_dma_configure() has always done the wrong thing in how it generates initial masks based on "dma-ranges". Although rounding down did not affect the TI Keystone platform where dma_addr + size is already a power of two, in any other case it results in a mask which is at best unnecessarily constrained and at worst unusable. BCM2837 illustrates the problem nicely, where we have a DMA base of 3GB and a size of 1GB - 16MB, giving dma_addr + size = 0xff000000 and a resultant mask of 0x7fffffff, which is then insufficient to even cover the necessary offset, effectively making all DMA addresses out-of-range. This has been hidden until now (mostly because we don't yet prevent drivers from simply overwriting this initial mask later upon probe), but due to recent changes elsewhere now shows up as USB being broken on Raspberry Pi 3. Make it right by rounding up instead of down, such that the mask correctly correctly describes all possisble bits the device needs to emit. Fixes: 9a6d7298b083 ("of: Calculate device DMA masks based on DT dma-range size") Reported-by: Stefan Wahren <stefan.wahren@i2se.com> Reported-by: Andreas Färber <afaerber@suse.de> Reported-by: Hans Verkuil <hverkuil@xs4all.nl> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Christoph Hellwig <hch@lst.de> 17 August 2017, 08:23:45 UTC
187e91f x86/boot/64/clang: Use fixup_pointer() to access 'next_early_pgt' __startup_64() is normally using fixup_pointer() to access globals in a position-independent fashion. However 'next_early_pgt' was accessed directly, which wasn't guaranteed to work. Luckily GCC was generating a R_X86_64_PC32 PC-relative relocation for 'next_early_pgt', but Clang emitted a R_X86_64_32S, which led to accessing invalid memory and rebooting the kernel. Signed-off-by: Alexander Potapenko <glider@google.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michael Davidson <md@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: c88d71508e36 ("x86/boot/64: Rewrite startup_64() in C") Link: http://lkml.kernel.org/r/20170816190808.131748-1-glider@google.com Signed-off-by: Ingo Molnar <mingo@kernel.org> 17 August 2017, 07:53:00 UTC
369157b nvmet-fc: eliminate incorrect static markers on local variables There were 2 statics introduced that were bogus. Removed the static designations. Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Christoph Hellwig <hch@lst.de> 17 August 2017, 07:35:08 UTC
ac9a409 Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "A couple of minor fixes (st, ses) and some bigger driver fixes for qla2xxx (crash triggered by fw dump) and ipr (lockdep problems with mq)" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: ses: Fix wrong page error scsi: ipr: Fix scsi-mq lockdep issue scsi: st: fix blk_get_queue usage scsi: qla2xxx: Fix system crash while triggering FW dump 17 August 2017, 00:21:20 UTC
422ce07 Merge tag 'audit-pr-20170816' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit Pull audit fixes from Paul Moore: "Two small fixes to the audit code, both explained well in the respective patch descriptions, but the quick summary is one use-after-free fix, and one silly fanotify notification flag fix" * tag 'audit-pr-20170816' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit: audit: Receive unmount event audit: Fix use after free in audit_remove_watch_rule() 16 August 2017, 23:48:34 UTC
d6957f3 printk-formats.txt: Better describe the difference between %pS and %pF Sometimes people seems unclear when to use the %pS or %pF printk format. For example, see commit 51d96dc2e2dc ("random: fix warning message on ia64 and parisc") which fixed such a wrong format string. The documentation should be more clear about the difference. Signed-off-by: Helge Deller <deller@gmx.de> [pmladek@suse.com: Restructure the entire section] Signed-off-by: Petr Mladek <pmladek@suse.com> Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Signed-off-by: Helge Deller <deller@gmx.de> 16 August 2017, 19:09:45 UTC
01578e3 x86/elf: Remove the unnecessary ADDR_NO_RANDOMIZE checks The ADDR_NO_RANDOMIZE checks in stack_maxrandom_size() and randomize_stack_top() are not required. PF_RANDOMIZE is set by load_elf_binary() only if ADDR_NO_RANDOMIZE is not set, no need to re-check after that. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Dmitry Safonov <dsafonov@virtuozzo.com> Cc: stable@vger.kernel.org Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Borislav Petkov <bp@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Link: http://lkml.kernel.org/r/20170815154011.GB1076@redhat.com 16 August 2017, 18:32:02 UTC
47ac548 x86: Fix norandmaps/ADDR_NO_RANDOMIZE Documentation/admin-guide/kernel-parameters.txt says: norandmaps Don't use address space randomization. Equivalent to echo 0 > /proc/sys/kernel/randomize_va_space but it doesn't work because arch_rnd() which is used to randomize mm->mmap_base returns a random value unconditionally. And as Kirill pointed out, ADDR_NO_RANDOMIZE is broken by the same reason. Just shift the PF_RANDOMIZE check from arch_mmap_rnd() to arch_rnd(). Fixes: 1b028f784e8c ("x86/mm: Introduce mmap_compat_base() for 32-bit mmap()") Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Cyrill Gorcunov <gorcunov@openvz.org> Reviewed-by: Dmitry Safonov <dsafonov@virtuozzo.com> Cc: stable@vger.kernel.org Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Borislav Petkov <bp@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/20170815153952.GA1076@redhat.com 16 August 2017, 18:32:01 UTC
870201f IB/uverbs: Fix NULL pointer dereference during device removal As part of ib_uverbs_remove_one which might be triggered upon reset flow, we trigger IB_EVENT_DEVICE_FATAL event to userspace application. If device was removed after uverbs fd was opened but before ib_uverbs_get_context was called, the event file will be accessed before it was allocated, result in NULL pointer dereference: [ 72.325873] BUG: unable to handle kernel NULL pointer dereference at (null) ... [ 72.325984] IP: _raw_spin_lock_irqsave+0x22/0x40 [ 72.327123] Call Trace: [ 72.327168] ib_uverbs_async_handler.isra.8+0x2e/0x160 [ib_uverbs] [ 72.327216] ? synchronize_srcu_expedited+0x27/0x30 [ 72.327269] ib_uverbs_remove_one+0x120/0x2c0 [ib_uverbs] [ 72.327330] ib_unregister_device+0xd0/0x180 [ib_core] [ 72.327373] mlx5_ib_remove+0x74/0x140 [mlx5_ib] [ 72.327422] mlx5_remove_device+0xfb/0x110 [mlx5_core] [ 72.327466] mlx5_unregister_interface+0x3c/0xa0 [mlx5_core] [ 72.327509] mlx5_ib_cleanup+0x10/0x962 [mlx5_ib] [ 72.327546] SyS_delete_module+0x155/0x230 [ 72.328472] ? exit_to_usermode_loop+0x70/0xa6 [ 72.329370] do_syscall_64+0x54/0xc0 [ 72.330262] entry_SYSCALL64_slow_path+0x25/0x25 Fix it by checking that user context was allocated before trigger the event. Fixes: 036b10635739 ('IB/uverbs: Enable device removal when there are active user space applications') Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com> 16 August 2017, 16:53:15 UTC
3e09fc8 Merge branch 'stable/for-jens-4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen into for-linus Pull xen block changes from Konrad: Two fixes, both of them spotted by Amazon: 1) Fix in Xen-blkfront caused by the re-write in 4.8 time-frame. 2) Fix in the xen_biovec_phys_mergeable which allowed guest requests when using NVMe - to slurp up more data than allowed leading to an XSA (which has been made public today). 16 August 2017, 15:56:34 UTC
06f8174 IB/core: Protect sysfs entry on ib_unregister_device ib_unregister_device is not protecting removal of sysfs entries. A call to ib_register_device in that window can result in duplicate sysfs entry warning. Move mutex_unlock to after ib_device_unregister_sysfs to protect against sysfs entry creation. This issue is exposed during driver load/unload stress test. WARNING: CPU: 5 PID: 4445 at fs/sysfs/dir.c:31 sysfs_warn_dup+0x5f/0x70 sysfs: cannot create duplicate filename '/class/infiniband/i40iw0' Hardware name: Gigabyte Technology Co., Ltd. To be filled by O.E.M./Q87M-D2H BIOS F7 01/17/2014 Workqueue: i40e i40e_service_task [i40e] Call Trace: dump_stack+0x67/0x98 __warn+0xcc/0xf0 warn_slowpath_fmt+0x4a/0x50 ? kernfs_path_from_node+0x4b/0x60 sysfs_warn_dup+0x5f/0x70 sysfs_do_create_link_sd.isra.2+0xb7/0xc0 sysfs_create_link+0x20/0x40 device_add+0x28c/0x600 ib_device_register_sysfs+0x58/0x170 [ib_core] ib_register_device+0x325/0x570 [ib_core] ? i40iw_register_rdma_device+0x1f4/0x400 [i40iw] ? kmem_cache_alloc_trace+0x143/0x330 ? __raw_spin_lock_init+0x2d/0x50 i40iw_register_rdma_device+0x2dc/0x400 [i40iw] i40iw_open+0x10a6/0x1950 [i40iw] ? i40iw_open+0xeab/0x1950 [i40iw] ? i40iw_make_cm_node+0x9c0/0x9c0 [i40iw] i40e_client_subtask+0xa4/0x110 [i40e] i40e_service_task+0xc2d/0x1320 [i40e] process_one_work+0x203/0x710 ? process_one_work+0x16f/0x710 worker_thread+0x126/0x4a0 ? trace_hardirqs_on+0xd/0x10 kthread+0x112/0x150 ? process_one_work+0x710/0x710 ? kthread_create_on_node+0x40/0x40 ret_from_fork+0x2e/0x40 ---[ end trace fd11b69e21ea7653 ]--- Couldn't register device i40iw0 with driver model Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Sindhu Devale <sindhu.devale@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com> 16 August 2017, 15:47:55 UTC
d4ba61d iw_cxgb4: fix misuse of integer variable Fixes: ee30f7d507c0 ("iw_cxgb4: Max fastreg depth depends on DSGL support") Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com> 16 August 2017, 15:43:21 UTC
5b59a39 IB/hns: fix memory leak on ah on error return path When dmac is NULL, ah is not being freed on the error return path. Fix this by kfree'ing it. Detected by CoverityScan, CID#1452636 ("Resource Leak") Fixes: d8966fcd4c25 ("IB/core: Use rdma_ah_attr accessor functions") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Doug Ledford <dledford@redhat.com> 16 August 2017, 15:30:33 UTC
aa939c1 i40iw: Fix potential fcn_id_array out of bounds Avoid out of bounds error by utilizing I40IW_MAX_STATS_COUNT instead of I40IW_INVALID_FCN_ID. Signed-off-by: Christopher N Bednarz <christoper.n.bednarz@intel.com> Signed-off-by: Henry Orosco <henry.orosco@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com> 16 August 2017, 15:27:44 UTC
a28f047 i40iw: Use correct alignment for CQ0 memory Utilize correct alignment variable when allocating DMA memory for CQ0. Signed-off-by: Christopher N Bednarz <christopher.n.bednarz@intel.com> Signed-off-by: Henry Orosco <henry.orosco@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com> 16 August 2017, 15:27:44 UTC
29c2415 i40iw: Fix typecast of tcp_seq_num The typecast of tcp_seq_num incorrectly uses u8. Fix by casting to u32. Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: Henry Orosco <henry.orosco@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com> 16 August 2017, 15:27:44 UTC
8129331 i40iw: Correct variable names Fix incorrect naming of status code and struct. Use inline instead of immediate. Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: Henry Orosco <henry.orosco@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com> 16 August 2017, 15:27:44 UTC
f67ace2 i40iw: Fix parsing of query/commit FPM buffers Parsing of commit/query Host Memory Cache Function Private Memory is not skipping over reserved fields and incorrectly assigning those values into object's base/cnt/max_cnt fields. Skip over reserved fields and set correct values. Also correct memory alignment requirement for commit/query FPM buffers. Signed-off-by: Chien Tin Tung <chien.tin.tung@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Christopher N Bednarz <christopher.n.bednarz@intel.com> Signed-off-by: Henry Orosco <henry.orosco@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com> 16 August 2017, 15:27:44 UTC
a7d2e03 RDMA/vmw_pvrdma: Report CQ missed events There is a chance of a race between arming the CQ and receiving completions. By reporting CQ missed events any ULPs should poll again to get the completions. Fixes: 29c8d9eba550 ("IB: Add vmw_pvrdma driver") Acked-by: Aditya Sarwade <asarwade@vmware.com> Signed-off-by: Bryan Tan <bryantan@vmware.com> Signed-off-by: Adit Ranadive <aditr@vmware.com> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com> 16 August 2017, 14:56:04 UTC
5a69aec powerpc: Fix VSX enabling/flushing to also test MSR_FP and MSR_VEC VSX uses a combination of the old vector registers, the old FP registers and new "second halves" of the FP registers. Thus when we need to see the VSX state in the thread struct (flush_vsx_to_thread()) or when we'll use the VSX in the kernel (enable_kernel_vsx()) we need to ensure they are all flushed into the thread struct if either of them is individually enabled. Unfortunately we only tested if the whole VSX was enabled, not if they were individually enabled. Fixes: 72cd7b44bc99 ("powerpc: Uncomment and make enable_kernel_vsx() routine available") Cc: stable@vger.kernel.org # v4.3+ Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> 16 August 2017, 09:35:54 UTC
16a5a48 nvmet-fc: correct use after free on list teardown Use list_for_each_entry_safe to prevent list handling from referencing next pointers directly after list_del's Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Christoph Hellwig <hch@lst.de> 16 August 2017, 08:06:18 UTC
42819eb nvmet: don't overwrite identify sn/fr with 0-bytes The merged version of my patch "nvmet: don't report 0-bytes in serial number" fails to remove two lines which should have been replaced, so that the space-padded strings are overwritten again with 0-bytes. Fix it. Fixes: 42de82a8b544 nvmet: don't report 0-bytes in serial number Signed-off-by: Martin Wilck <mwilck@suse.com> Reviewed-by: Sagi Grimberg <sagi@grimbeg.me> Signed-off-by: Christoph Hellwig <hch@lst.de> 16 August 2017, 08:06:10 UTC
4098116 parisc: pci memory bar assignment fails with 64bit kernels on dino/cujo For 64bit kernels the lmmio_space_offset of the host bridge window isn't set correctly on systems with dino/cujo PCI host bridges. This leads to not assigned memory bars and failing drivers, which need to use these bars. Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: <stable@vger.kernel.org> Acked-by: Helge Deller <deller@gmx.de> Signed-off-by: Helge Deller <deller@gmx.de> 16 August 2017, 07:50:39 UTC
510c8a8 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: 1) Fix TCP checksum offload handling in iwlwifi driver, from Emmanuel Grumbach. 2) In ksz DSA tagging code, free SKB if skb_put_padto() fails. From Vivien Didelot. 3) Fix two regressions with bonding on wireless, from Andreas Born. 4) Fix build when busypoll is disabled, from Daniel Borkmann. 5) Fix copy_linear_skb() wrt. SO_PEEK_OFF, from Eric Dumazet. 6) Set SKB cached route properly in inet_rtm_getroute(), from Florian Westphal. 7) Fix PCI-E relaxed ordering handling in cxgb4 driver, from Ding Tianhong. 8) Fix module refcnt leak in ULP code, from Sabrina Dubroca. 9) Fix use of GFP_KERNEL in atomic contexts in AF_KEY code, from Eric Dumazet. 10) Need to purge socket write queue in dccp_destroy_sock(), also from Eric Dumazet. 11) Make bpf_trace_printk() work properly on 32-bit architectures, from Daniel Borkmann. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (47 commits) bpf: fix bpf_trace_printk on 32 bit archs PCI: fix oops when try to find Root Port for a PCI device sfc: don't try and read ef10 data on non-ef10 NIC net_sched: remove warning from qdisc_hash_add net_sched/sfq: update hierarchical backlog when drop packet net_sched: reset pointers to tcf blocks in classful qdiscs' destructors ipv4: fix NULL dereference in free_fib_info_rcu() net: Fix a typo in comment about sock flags. ipv6: fix NULL dereference in ip6_route_dev_notify() tcp: fix possible deadlock in TCP stack vs BPF filter dccp: purge write queue in dccp_destroy_sock() udp: fix linear skb reception with PEEK_OFF ipv6: release rt6->rt6i_idev properly during ifdown af_key: do not use GFP_KERNEL in atomic contexts tcp: ulp: avoid module refcnt leak in tcp_set_ulp net/cxgb4vf: Use new PCI_DEV_FLAGS_NO_RELAXED_ORDERING flag net/cxgb4: Use new PCI_DEV_FLAGS_NO_RELAXED_ORDERING flag PCI: Disable Relaxed Ordering Attributes for AMD A1100 PCI: Disable Relaxed Ordering for some Intel processors PCI: Disable PCIe Relaxed Ordering if unsupported ... 16 August 2017, 01:52:28 UTC
88a5c69 bpf: fix bpf_trace_printk on 32 bit archs James reported that on MIPS32 bpf_trace_printk() is currently broken while MIPS64 works fine: bpf_trace_printk() uses conditional operators to attempt to pass different types to __trace_printk() depending on the format operators. This doesn't work as intended on 32-bit architectures where u32 and long are passed differently to u64, since the result of C conditional operators follows the "usual arithmetic conversions" rules, such that the values passed to __trace_printk() will always be u64 [causing issues later in the va_list handling for vscnprintf()]. For example the samples/bpf/tracex5 test printed lines like below on MIPS32, where the fd and buf have come from the u64 fd argument, and the size from the buf argument: [...] 1180.941542: 0x00000001: write(fd=1, buf= (null), size=6258688) Instead of this: [...] 1625.616026: 0x00000001: write(fd=1, buf=009e4000, size=512) One way to get it working is to expand various combinations of argument types into 8 different combinations for 32 bit and 64 bit kernels. Fix tested by James on MIPS32 and MIPS64 as well that it resolves the issue. Fixes: 9c959c863f82 ("tracing: Allow BPF programs to call bpf_trace_printk()") Reported-by: James Hogan <james.hogan@imgtec.com> Tested-by: James Hogan <james.hogan@imgtec.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net> 16 August 2017, 00:32:15 UTC
0e40523 PCI: fix oops when try to find Root Port for a PCI device Eric report a oops when booting the system after applying the commit a99b646afa8a ("PCI: Disable PCIe Relaxed..."): [ 4.241029] BUG: unable to handle kernel NULL pointer dereference at 0000000000000050 [ 4.247001] IP: pci_find_pcie_root_port+0x62/0x80 [ 4.253011] PGD 0 [ 4.253011] P4D 0 [ 4.253011] [ 4.258013] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC [ 4.262015] Modules linked in: [ 4.265005] CPU: 31 PID: 1 Comm: swapper/0 Not tainted 4.13.0-dbx-DEV #316 [ 4.271002] Hardware name: Intel RML,PCH/Iota_QC_19, BIOS 2.40.0 06/22/2016 [ 4.279002] task: ffffa2ee38cfa040 task.stack: ffffa51ec0004000 [ 4.285001] RIP: 0010:pci_find_pcie_root_port+0x62/0x80 [ 4.290012] RSP: 0000:ffffa51ec0007ab8 EFLAGS: 00010246 [ 4.295003] RAX: 0000000000000000 RBX: ffffa2ee36bae000 RCX: 0000000000000006 [ 4.303002] RDX: 000000000000081c RSI: ffffa2ee38cfa8c8 RDI: ffffa2ee36bae000 [ 4.310013] RBP: ffffa51ec0007b58 R08: 0000000000000001 R09: 0000000000000000 [ 4.317001] R10: 0000000000000000 R11: 0000000000000000 R12: ffffa51ec0007ad0 [ 4.324005] R13: ffffa2ee36bae098 R14: 0000000000000002 R15: ffffa2ee37204818 [ 4.331002] FS: 0000000000000000(0000) GS:ffffa2ee3fcc0000(0000) knlGS:0000000000000000 [ 4.339002] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 4.345001] CR2: 0000000000000050 CR3: 000000401000f000 CR4: 00000000001406e0 [ 4.351002] Call Trace: [ 4.354012] ? pci_configure_device+0x19f/0x570 [ 4.359002] ? pci_conf1_read+0xb8/0xf0 [ 4.363002] ? raw_pci_read+0x23/0x40 [ 4.366011] ? pci_read+0x2c/0x30 [ 4.370014] ? pci_read_config_word+0x67/0x70 [ 4.374012] pci_device_add+0x28/0x230 [ 4.378012] ? pci_vpd_f0_read+0x50/0x80 [ 4.382014] pci_scan_single_device+0x96/0xc0 [ 4.386012] pci_scan_slot+0x79/0xf0 [ 4.389001] pci_scan_child_bus+0x31/0x180 [ 4.394014] acpi_pci_root_create+0x1c6/0x240 [ 4.398013] pci_acpi_scan_root+0x15f/0x1b0 [ 4.402012] acpi_pci_root_add+0x2e6/0x400 [ 4.406012] ? acpi_evaluate_integer+0x37/0x60 [ 4.411002] acpi_bus_attach+0xdf/0x200 [ 4.415002] acpi_bus_attach+0x6a/0x200 [ 4.418014] acpi_bus_attach+0x6a/0x200 [ 4.422013] acpi_bus_scan+0x38/0x70 [ 4.426011] acpi_scan_init+0x10c/0x271 [ 4.429001] acpi_init+0x2fa/0x348 [ 4.433004] ? acpi_sleep_proc_init+0x2d/0x2d [ 4.437001] do_one_initcall+0x43/0x169 [ 4.441001] kernel_init_freeable+0x1d0/0x258 [ 4.445003] ? rest_init+0xe0/0xe0 [ 4.449001] kernel_init+0xe/0x150 ====================== cut here ============================= It looks like the pci_find_pcie_root_port() was trying to find the Root Port for the PCI device which is the Root Port already, it will return NULL and trigger the problem, so check the highest_pcie_bridge to fix thie problem. Fixes: a99b646afa8a ("PCI: Disable PCIe Relaxed Ordering if unsupported") Fixes: c56d4450eb68 ("PCI: Turn off Request Attributes to avoid Chelsio T5 Completion erratum") Reported-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net> 16 August 2017, 00:25:16 UTC
61deee9 sfc: don't try and read ef10 data on non-ef10 NIC The MAC stats command takes a port ID, which doesn't exist on pre-ef10 NICs (5000- and 6000- series). This is extracted from the NIC specific data; we misinterpret this as the ef10 data structure, causing us to read potentially unallocated data. With a KASAN kernel this can cause errors with: BUG: KASAN: slab-out-of-bounds in efx_mcdi_mac_stats Fixes: 0a2ab4d988d7 ("sfc: set the port-id when calling MC_CMD_MAC_STATS") Reported-by: Stefano Brivio <sbrivio@redhat.com> Tested-by: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: Bert Kenward <bkenward@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net> 16 August 2017, 00:19:34 UTC
c90e951 net_sched: remove warning from qdisc_hash_add It was added in commit e57a784d8cae ("pkt_sched: set root qdisc before change() in attach_default_qdiscs()") to hide duplicates from "tc qdisc show" for incative deivices. After 59cc1f61f ("net: sched: convert qdisc linked list to hashtable") it triggered when classful qdisc is added to inactive device because default qdiscs are added before switching root qdisc. Anyway after commit ea3274695353 ("net: sched: avoid duplicates in qdisc dump") duplicates are filtered right in dumper. Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Signed-off-by: David S. Miller <davem@davemloft.net> 16 August 2017, 00:16:39 UTC
325d5dc net_sched/sfq: update hierarchical backlog when drop packet When sfq_enqueue() drops head packet or packet from another queue it have to update backlog at upper qdiscs too. Fixes: 2ccccf5fb43f ("net_sched: update hierarchical backlog too") Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> 16 August 2017, 00:16:39 UTC
8989042 net_sched: reset pointers to tcf blocks in classful qdiscs' destructors Traffic filters could keep direct pointers to classes in classful qdisc, thus qdisc destruction first removes all filters before freeing classes. Class destruction methods also tries to free attached filters but now this isn't safe because tcf_block_put() unlike to tcf_destroy_chain() cannot be called second time. This patch set class->block to NULL after first tcf_block_put() and turn second call into no-op. Fixes: 6529eaba33f0 ("net: sched: introduce tcf block infractructure") Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> 16 August 2017, 00:16:39 UTC
187e5b3 ipv4: fix NULL dereference in free_fib_info_rcu() If fi->fib_metrics could not be allocated in fib_create_info() we attempt to dereference a NULL pointer in free_fib_info_rcu() : m = fi->fib_metrics; if (m != &dst_default_metrics && atomic_dec_and_test(&m->refcnt)) kfree(m); Before my recent patch, we used to call kfree(NULL) and nothing wrong happened. Instead of using RCU to defer freeing while we are under memory stress, it seems better to take immediate action. This was reported by syzkaller team. Fixes: 3fb07daff8e9 ("ipv4: add reference counting to metrics") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> 16 August 2017, 00:07:52 UTC
b3dc8f7 net: Fix a typo in comment about sock flags. Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> 16 August 2017, 00:07:17 UTC
12d94a8 ipv6: fix NULL dereference in ip6_route_dev_notify() Based on a syzkaller report [1], I found that a per cpu allocation failure in snmp6_alloc_dev() would then lead to NULL dereference in ip6_route_dev_notify(). It seems this is a very old bug, thus no Fixes tag in this submission. Let's add in6_dev_put_clear() helper, as we will probably use it elsewhere (once available/present in net-next) [1] kasan: CONFIG_KASAN_INLINE enabled kasan: GPF could be caused by NULL-ptr deref or user memory access general protection fault: 0000 [#1] SMP KASAN Dumping ftrace buffer: (ftrace buffer empty) Modules linked in: CPU: 1 PID: 17294 Comm: syz-executor6 Not tainted 4.13.0-rc2+ #10 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 task: ffff88019f456680 task.stack: ffff8801c6e58000 RIP: 0010:__read_once_size include/linux/compiler.h:250 [inline] RIP: 0010:atomic_read arch/x86/include/asm/atomic.h:26 [inline] RIP: 0010:refcount_sub_and_test+0x7d/0x1b0 lib/refcount.c:178 RSP: 0018:ffff8801c6e5f1b0 EFLAGS: 00010202 RAX: 0000000000000037 RBX: dffffc0000000000 RCX: ffffc90005d25000 RDX: ffff8801c6e5f218 RSI: ffffffff82342bbf RDI: 0000000000000001 RBP: ffff8801c6e5f240 R08: 0000000000000001 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: 1ffff10038dcbe37 R13: 0000000000000006 R14: 0000000000000001 R15: 00000000000001b8 FS: 00007f21e0429700(0000) GS:ffff8801dc100000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000001ddbc22000 CR3: 00000001d632b000 CR4: 00000000001426e0 DR0: 0000000020000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600 Call Trace: refcount_dec_and_test+0x1a/0x20 lib/refcount.c:211 in6_dev_put include/net/addrconf.h:335 [inline] ip6_route_dev_notify+0x1c9/0x4a0 net/ipv6/route.c:3732 notifier_call_chain+0x136/0x2c0 kernel/notifier.c:93 __raw_notifier_call_chain kernel/notifier.c:394 [inline] raw_notifier_call_chain+0x2d/0x40 kernel/notifier.c:401 call_netdevice_notifiers_info+0x51/0x90 net/core/dev.c:1678 call_netdevice_notifiers net/core/dev.c:1694 [inline] rollback_registered_many+0x91c/0xe80 net/core/dev.c:7107 rollback_registered+0x1be/0x3c0 net/core/dev.c:7149 register_netdevice+0xbcd/0xee0 net/core/dev.c:7587 register_netdev+0x1a/0x30 net/core/dev.c:7669 loopback_net_init+0x76/0x160 drivers/net/loopback.c:214 ops_init+0x10a/0x570 net/core/net_namespace.c:118 setup_net+0x313/0x710 net/core/net_namespace.c:294 copy_net_ns+0x27c/0x580 net/core/net_namespace.c:418 create_new_namespaces+0x425/0x880 kernel/nsproxy.c:107 unshare_nsproxy_namespaces+0xae/0x1e0 kernel/nsproxy.c:206 SYSC_unshare kernel/fork.c:2347 [inline] SyS_unshare+0x653/0xfa0 kernel/fork.c:2297 entry_SYSCALL_64_fastpath+0x1f/0xbe RIP: 0033:0x4512c9 RSP: 002b:00007f21e0428c08 EFLAGS: 00000216 ORIG_RAX: 0000000000000110 RAX: ffffffffffffffda RBX: 0000000000718150 RCX: 00000000004512c9 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000062020200 RBP: 0000000000000086 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000216 R12: 00000000004b973d R13: 00000000ffffffff R14: 000000002001d000 R15: 00000000000002dd Code: 50 2b 34 82 c7 00 f1 f1 f1 f1 c7 40 04 04 f2 f2 f2 c7 40 08 f3 f3 f3 f3 e8 a1 43 39 ff 4c 89 f8 48 8b 95 70 ff ff ff 48 c1 e8 03 <0f> b6 0c 18 4c 89 f8 83 e0 07 83 c0 03 38 c8 7c 08 84 c9 0f 85 RIP: __read_once_size include/linux/compiler.h:250 [inline] RSP: ffff8801c6e5f1b0 RIP: atomic_read arch/x86/include/asm/atomic.h:26 [inline] RSP: ffff8801c6e5f1b0 RIP: refcount_sub_and_test+0x7d/0x1b0 lib/refcount.c:178 RSP: ffff8801c6e5f1b0 ---[ end trace e441d046c6410d31 ]--- Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> 16 August 2017, 00:06:34 UTC
b5fed47 audit: Receive unmount event Although audit_watch_handle_event() can handle FS_UNMOUNT event, it is not part of AUDIT_FS_WATCH mask and thus such event never gets to audit_watch_handle_event(). Thus fsnotify marks are deleted by fsnotify subsystem on unmount without audit being notified about that which leads to a strange state of existing audit rules with dead fsnotify marks. Add FS_UNMOUNT to the mask of events to be received so that audit can clean up its state accordingly. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Paul Moore <paul@paul-moore.com> 15 August 2017, 20:03:00 UTC
d76036a audit: Fix use after free in audit_remove_watch_rule() audit_remove_watch_rule() drops watch's reference to parent but then continues to work with it. That is not safe as parent can get freed once we drop our reference. The following is a trivial reproducer: mount -o loop image /mnt touch /mnt/file auditctl -w /mnt/file -p wax umount /mnt auditctl -D <crash in fsnotify_destroy_mark()> Grab our own reference in audit_remove_watch_rule() earlier to make sure mark does not get freed under us. CC: stable@vger.kernel.org Reported-by: Tony Jones <tonyj@suse.de> Signed-off-by: Jan Kara <jack@suse.cz> Tested-by: Tony Jones <tonyj@suse.de> Signed-off-by: Paul Moore <paul@paul-moore.com> 15 August 2017, 19:58:17 UTC
40c6d1b Merge tag 'linux-kselftest-4.13-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest Pull kselftest fixes from Shuah Khan: "This update consists of important compile and run-time error fixes to timers/freq-step, kmod, and sysctl tests" * tag 'linux-kselftest-4.13-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: selftests: timers: freq-step: fix compile error selftests: futex: fix run_tests target test_sysctl: fix sysctl.sh by making it executable test_kmod: fix kmod.sh by making it executable 15 August 2017, 19:49:43 UTC
7a7c286 drm/amdgpu: save list length when fence is signaled update the list first to avoid redundant checks. Signed-off-by: Chunming Zhou <David1.Zhou@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org 15 August 2017, 18:10:01 UTC
0a6f041 Merge tag 'wireless-drivers-for-davem-2017-08-15' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers Kalle Valo says: ==================== wireless-drivers fixes for 4.13 This time quite a few fixes for iwlwifi and one major regression fix for brcmfmac. For the iwlwifi aggregation bug a small change was needed for mac80211, but as Johannes is still away the mac80211 patch is taken via wireless-drivers tree. brcmfmac * fix firmware crash (a recent regression in bcm4343{0,1,8} iwlwifi * Some simple PCI HW ID fix-ups and additions for family 9000 * Remove a bogus warning message with new FWs (bug #196915) * Don't allow illegal channel options to be used (bug #195299) * A fix for checksum offload in family 9000 * A fix serious throughput degradation in 11ac with multiple streams * An old bug in SMPS where the firmware was not aware of SMPS changes * Fix a memory leak in the SAR code * Fix a stuck queue case in AP mode; * Convert a WARN to a simple debug in a legitimate race case (from which we can recover) * Fix a severe throughput aggregation on 9000-family devices due to aggregation issues, needed a small change in mac80211 ==================== Signed-off-by: David S. Miller <davem@davemloft.net> 15 August 2017, 17:19:14 UTC
4450675 Merge tag 'at91-ab-4.13-dt-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux into fixes Pull "DT fixes for 4.13" from Alexandre Belloni: - Fix NAND flash support for sama5d2 * tag 'at91-ab-4.13-dt-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux: ARM: dts: at91: sama5d2: fix EBI/NAND controllers declaration ARM: dts: at91: sama5d2: use sama5d2 compatible string for SMC 15 August 2017, 15:37:10 UTC
64c2c37 Merge tag 'imx-fixes-4.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux into fixes Pull "i.MX fixes for 4.13, round 2" from Shawn Guo: - Add missing 'ranges' property for i.MX25 device tree TSCADC node, so that it's child nodes ADC and TSC device can be probed by kernel. - Fix i.MX GPCv2 power domain driver to request regulator after power domain initialization, since regulator could defer probing and therefore cause power domain initialized twice. * tag 'imx-fixes-4.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux: ARM: dts: i.MX25: add ranges to tscadc soc: imx: gpcv2: fix regulator deferred probe 15 August 2017, 15:34:52 UTC
69a80d8 Merge tag 'imx-fixes-4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux into fixes Pull "i.MX fixes for 4.13" from Shawn Guo: - A fix for imx7d-sdb board to move pinctrl_spi4 pins from low power iomux controller to normal iomuxc node, as the pins belong to normal iomuxc rather than low power one. * tag 'imx-fixes-4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux: ARM: dts: imx7d-sdb: Put pinctrl_spi4 in the correct location 15 August 2017, 15:34:04 UTC
b15bd8c xen-blkfront: use a right index when checking requests Since commit d05d7f40791c ("Merge branch 'for-4.8/core' of git://git.kernel.dk/linux-block") and 3fc9d690936f ("Merge branch 'for-4.8/drivers' of git://git.kernel.dk/linux-block"), blkfront_resume() has been using an index for iterating ring_info to check request when iterating blk_shadow in an inner loop. This seems to have been accidentally introduced during the massive rewrite of the block layer macros in the commits. This may cause crash like this: [11798.057074] BUG: unable to handle kernel NULL pointer dereference at 0000000000000048 [11798.058832] IP: [<ffffffff814411fa>] blkfront_resume+0x10a/0x610 .... [11798.061063] Call Trace: [11798.061063] [<ffffffff8139ce93>] xenbus_dev_resume+0x53/0x140 [11798.061063] [<ffffffff8139ce40>] ? xenbus_dev_probe+0x150/0x150 [11798.061063] [<ffffffff813f359e>] dpm_run_callback+0x3e/0x110 [11798.061063] [<ffffffff813f3a08>] device_resume+0x88/0x190 [11798.061063] [<ffffffff813f4cc0>] dpm_resume+0x100/0x2d0 [11798.061063] [<ffffffff813f5221>] dpm_resume_end+0x11/0x20 [11798.061063] [<ffffffff813950a8>] do_suspend+0xe8/0x1a0 [11798.061063] [<ffffffff813954bd>] shutdown_handler+0xfd/0x130 [11798.061063] [<ffffffff8139aba0>] ? split+0x110/0x110 [11798.061063] [<ffffffff8139ac26>] xenwatch_thread+0x86/0x120 [11798.061063] [<ffffffff810b4570>] ? prepare_to_wait_event+0x110/0x110 [11798.061063] [<ffffffff8108fe57>] kthread+0xd7/0xf0 [11798.061063] [<ffffffff811da811>] ? kfree+0x121/0x170 [11798.061063] [<ffffffff8108fd80>] ? kthread_park+0x60/0x60 [11798.061063] [<ffffffff810863b0>] ? call_usermodehelper_exec_work+0xb0/0xb0 [11798.061063] [<ffffffff810864ea>] ? call_usermodehelper_exec_async+0x13a/0x140 [11798.061063] [<ffffffff81534a45>] ret_from_fork+0x25/0x30 Use the right index in the inner loop. Fixes: d05d7f40791c ("Merge branch 'for-4.8/core' of git://git.kernel.dk/linux-block") Fixes: 3fc9d690936f ("Merge branch 'for-4.8/drivers' of git://git.kernel.dk/linux-block") Signed-off-by: Munehisa Kamata <kamatam@amazon.com> Reviewed-by: Thomas Friebel <friebelt@amazon.de> Reviewed-by: Eduardo Valentin <eduval@amazon.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Juergen Gross <jgross@suse.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Roger Pau Monne <roger.pau@citrix.com> Cc: xen-devel@lists.xenproject.org Cc: stable@vger.kernel.org Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> 15 August 2017, 14:34:04 UTC
462cdac xen: fix bio vec merging The current test for bio vec merging is not fully accurate and can be tricked into merging bios when certain grant combinations are used. The result of these malicious bio merges is a bio that extends past the memory page used by any of the originating bios. Take into account the following scenario, where a guest creates two grant references that point to the same mfn, ie: grant 1 -> mfn A, grant 2 -> mfn A. These references are then used in a PV block request, and mapped by the backend domain, thus obtaining two different pfns that point to the same mfn, pfn B -> mfn A, pfn C -> mfn A. If those grants happen to be used in two consecutive sectors of a disk IO operation becoming two different bios in the backend domain, the checks in xen_biovec_phys_mergeable will succeed, because bfn1 == bfn2 (they both point to the same mfn). However due to the bio merging, the backend domain will end up with a bio that expands past mfn A into mfn A + 1. Fix this by making sure the check in xen_biovec_phys_mergeable takes into account the offset and the length of the bio, this basically replicates whats done in __BIOVEC_PHYS_MERGEABLE using mfns (bus addresses). While there also remove the usage of __BIOVEC_PHYS_MERGEABLE, since that's already checked by the callers of xen_biovec_phys_mergeable. CC: stable@vger.kernel.org Reported-by: "Jan H. Schönherr" <jschoenh@amazon.de> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> 15 August 2017, 14:32:49 UTC
back to top