https://github.com/torvalds/linux

Revision Author Date Message Commit Date
35897b9 nvme-fabrics: fix and refine state checks in __nvmf_check_ready - make sure we only allow internally generated commands in any non-live state - only allow connect commands on non-live queues when actually in the new or connecting states - treat all other non-live, non-dead states the same as a default catch-all This fixes a regression where we could not shut down a controller in an orderly fashion as we didn't allow the internally generated Property Set command, and also ensures we don't accidentally let a Connect command through in the wrong state. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: James Smart <james.smart@broadcom.com> 15 June 2018, 09:21:00 UTC
278ab37 nvme-fabrics: handle the admin-only case properly in nvmf_check_ready In the ADMIN_ONLY state we don't have any I/O queues, but we should accept all admin commands without further checks. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: James Smart <james.smart@broadcom.com> 15 June 2018, 09:21:00 UTC
3bc32bb nvme-fabrics: refactor queue ready check Move the is_connected check to the fibre channel transport, as it has no meaning for other transports. To facilitate this split out a new nvmf_fail_nonready_command helper that is called by the transport when it is asked to handle a command on a queue that is not ready. Also avoid a function call for the queue live fast path by inlining the check. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: James Smart <james.smart@broadcom.com> 15 June 2018, 09:21:00 UTC
e6c3456 blk-mq: remove blk_mq_tagset_iter Unused now that nvme stopped using it. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jens Axboe <axboe@kernel.dk> 14 June 2018, 15:01:45 UTC
14dfa40 nvme: remove nvme_reinit_tagset Unused now that all transports stopped using it. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jens Axboe <axboe@kernel.dk> 14 June 2018, 15:01:27 UTC
3e493c0 nvme-fc: fix nulling of queue data on reconnect The reconnect path is calling the init routines to clear a queue structure. But the queue structure has state that perhaps needs to persist as long as the controller is live. Remove the nvme_fc_init_queue() calls on reconnect. The nvme_fc_free_queue() calls will clear state bits and reset any relevant queue state for a new connection. Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Christoph Hellwig <hch@lst.de> 14 June 2018, 15:01:01 UTC
587331f nvme-fc: remove reinit_request routine The reinit_request routine is not necessary. Remove support for the op callback. As all that nvme_reinit_tagset() does is iterate and call the reinit routine, it too has no purpose. Remove the call. Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Christoph Hellwig <hch@lst.de> 14 June 2018, 15:00:53 UTC
4c98415 nvme-fc: change controllers first connect to use reconnect path Current code follows the framework that has been in the transports from the beginning where initial link-side controller connect occurs as part of "creating the controller". Thus that first connect fully talks to the controller and obtains values that can then be used for blk-mq setup, etc. It also means that everything about the controller is fully known before the "create controller" call returns. This has several weaknesses: - The initial create_ctrl call made by the cli will block for a long time as wire transactions are performed synchronously. This delay becomes longer if errors occur or connectivity is lost and retries need to be performed. - Code wise, it means there is a separate connect path for initial controller connect vs the (same) steps used in the reconnect path. - And as there are separate paths, it means there is separate error handling and retry logic. It also plays havoc with the NEW state (should transition out of it after successful initial connect) vs the RESETTING and CONNECTING (reconnect) states that want to be transitioned to on error. - As there are separate paths, to recover from errors and disruptions, it requires separate recovery/retry paths as well and can severely convolute the controller state. This patch reworks the fc transport to use the same connect paths for the initial connection as it uses for reconnect. This makes a single path for error recovery and handling. This patch: - Removes the driving of the initial connect and replaces it with a state transition to CONNECTING and initiating the reconnect thread. A dummy state transition of RESETTING had to be traversed as a direct transition of NEW->CONNECTING is not allowed. Given that the controller is "new", the RESETTING transition is a simple no-op. Once in the reconnecting thread, the normal behaviors of ctrl_loss_tmo (max_retries * connect_delay) and dev_loss_tmo will apply before the controller is torn down. - Only if the state transitions couldn't be traversed and the reconnect thread not scheduled, will the controller be torn down while in create_ctrl. - The prior code used the controller state of NEW to indicate whether request queues had been initialized or not. For the admin queue, the request queue is always created, so there's no need to check a state. For IO queues, change to tracking whether a successful io request queue create has occurred (e.g. 1st successful connect). - The initial controller id is initialized to the dynamic controller id used in the initial connect message. It will be overwritten by the real controller id once the controller is connected on the wire. Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Christoph Hellwig <hch@lst.de> 14 June 2018, 12:25:09 UTC
f493af3 nvme: don't rely on the changed namespace list log Don't optimize our namespace rescan based on the changed namespace list log page as userspace might have changed the content through reading it. Suggested-by: Keith Busch <keith.busch@linux.intel.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <keith.busch@linux.intel.com> Reviewed-by: Hannes Reinecke <hare@suse.com> 13 June 2018, 07:24:34 UTC
c42d7a3 nvmet: free smart-log buffer after use Free smart-log buffer allocated in the function after use. Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de> 11 June 2018, 14:18:05 UTC
94423a8 nvme-rdma: fix error flow during mapping request data After dma mapping the sgl, we map the sgl to nvme sgl descriptor. In case of failure during the last mapping we never dma unmap the sgl. Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Christoph Hellwig <hch@lst.de> 11 June 2018, 14:17:58 UTC
2796b56 nvme: add bio remapping tracepoint Adding a tracepoint to trace bio remapping for native nvme multipath. Signed-off-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de> 11 June 2018, 14:17:46 UTC
16001c1 nvme: fix NULL pointer dereference in nvme_init_subsystem When using nvme-pci driver the nvmf_ctrl_options is NULL. There is no need to check for discovery_nqn flag at non-fabrics controller. Fixes: 181303d0 ("nvme-fabrics: allow duplicate connections to the discovery controller") Signed-off-by: Israel Rukshin <israelr@mellanox.com> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Christoph Hellwig <hch@lst.de> 11 June 2018, 14:17:41 UTC
190b02e block: fix use-after-free in block flush handling A recent commit reused the original request flags for the flush queue handling. However, for some of the kick flush cases, the original request was already completed. This caused a use after free, if blk-mq wasn't used. Fixes: 84fca1b0c461 ("block: pass failfast and driver-specific flags to flush requests") Reported-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> 09 June 2018, 12:37:14 UTC
7701619 nvme: cleanup double shift issue The problem here is that set_bit() and test_bit() take a bit number so we should be passing 0 but instead we're passing (1 << 0) which leads to a double shift. It doesn't cause a runtime bug in the current code because it's done consistently and we only set that one bit. I decided to just re-use NVME_AER_NOTICE_NS_CHANGED instead of introducing a new define for this. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk> 08 June 2018, 18:51:11 UTC
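The double shift described in 7701619 can be reproduced in user space. The following is only a sketch: `demo_set_bit()` and `demo_test_bit()` are hypothetical, non-atomic stand-ins for the kernel's `set_bit()`/`test_bit()`, and the `DEMO_*` names are invented for illustration. It shows why passing a mask instead of a bit number sets the wrong bit, and why the mistake stays invisible as long as it is made consistently.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical, non-atomic stand-ins for the kernel's set_bit()/test_bit().
 * Like the real helpers, they take a bit NUMBER, not a mask. */
#define DEMO_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

static void demo_set_bit(unsigned int nr, unsigned long *addr)
{
    addr[nr / DEMO_BITS_PER_LONG] |= 1UL << (nr % DEMO_BITS_PER_LONG);
}

static bool demo_test_bit(unsigned int nr, const unsigned long *addr)
{
    return (addr[nr / DEMO_BITS_PER_LONG] >> (nr % DEMO_BITS_PER_LONG)) & 1UL;
}

/* Illustrative names only; they do not appear in the kernel. */
#define DEMO_EVENT_BIT  0         /* a bit number */
#define DEMO_EVENT_MASK (1 << 0)  /* a mask whose value is 1 */

static void demo_double_shift(void)
{
    unsigned long flags = 0;

    /* The helper shifts the mask again, so bit 1 is set instead of bit 0. */
    demo_set_bit(DEMO_EVENT_MASK, &flags);
    assert(flags == 0x2UL);

    /* Testing with the same wrong argument still "works"... */
    assert(demo_test_bit(DEMO_EVENT_MASK, &flags));

    /* ...but a caller using the real bit number sees nothing. */
    assert(!demo_test_bit(DEMO_EVENT_BIT, &flags));
}
```

Calling `demo_double_shift()` exercises all three cases; none of the assertions should fire.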
69f4eb9 nvme-pci: make CMB SQ mod-param read-only A controller reset after a run time change of the CMB module parameter breaks the driver. An 'on -> off' will have the driver use NULL for the host memory queue, and 'off -> on' will use mismatched queue depth between the device and the host. We could fix both, but there isn't really a good reason to change this at run time anyway, compared to at module load time, so this patch makes the parameter read-only after modprobe. Signed-off-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk> 08 June 2018, 18:51:11 UTC
1d39e69 nvme-pci: unquiesce dead controller queues This patch ensures the nvme namespace request queues are not quiesced on a surprise removal. It's possible the queues were previously killed in a failed reset, so the queues need to be unquiesced to ensure all requests are flushed to completion. Signed-off-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk> 08 June 2018, 18:51:11 UTC
fe76fcf nvme-pci: remove HMB teardown on reset The controller is required to disable its host memory buffer use on controller reset. We don't need to submit an admin command to delete it, so this patch skips sending that command so we don't need to worry about handling a timeout. Signed-off-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk> 08 June 2018, 18:51:11 UTC
ded4550 nvme-pci: queue creation fixes We've been ignoring NVMe error status on queue creations. Fortunately they are uncommon, but we should handle these anyway. This patch adds checks for a positive error return value that indicates an NVMe status. If we do see a negative return, the controller isn't usable, so this patch returns immediately since we can't unwind that failure. Signed-off-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Jens Axboe <axboe@kernel.dk> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk> 08 June 2018, 18:51:11 UTC
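The return-value convention that ded4550 relies on (zero for success, positive for an NVMe status, negative for an unusable controller) can be sketched as a small classifier. The enum and function names below are hypothetical, and treating a positive status as "unwind" is an assumption drawn from the message's remark that only the negative case cannot be unwound.

```c
#include <assert.h>

/* Hypothetical outcome labels; not taken from the driver. */
enum demo_outcome {
    DEMO_OK,      /* ret == 0: queue created */
    DEMO_UNWIND,  /* ret > 0: device reported a status; assumed unwindable */
    DEMO_BAIL     /* ret < 0: controller unusable, return immediately */
};

/* Classify the return value of a queue-creation command. */
static enum demo_outcome demo_classify(int ret)
{
    if (ret < 0)
        return DEMO_BAIL;
    if (ret > 0)
        return DEMO_UNWIND;
    return DEMO_OK;
}

static void demo_classify_check(void)
{
    assert(demo_classify(0) == DEMO_OK);
    assert(demo_classify(2) == DEMO_UNWIND);   /* arbitrary positive status */
    assert(demo_classify(-5) == DEMO_BAIL);    /* arbitrary negative errno */
}
```

The point of the split is that a caller which only tests `ret < 0` silently treats a device-reported status as success, which is the behaviour the commit message says was being fixed.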
397c699 nvme-pci: remove unnecessary completion doorbell check The nvme pci driver never unmaps the doorbell registers while the requests are active, so we can always safely update the completion queue head. Signed-off-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Jens Axboe <axboe@kernel.dk> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk> 08 June 2018, 18:51:11 UTC
0bc8819 nvme-pci: remove unnecessary nested locking The nvme pci driver no longer handles completions under the cq lock, so the nested locking is not necessary. Signed-off-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Jens Axboe <axboe@kernel.dk> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk> 08 June 2018, 18:51:10 UTC
9ba2a5c nvmet: filter newlines from user input We should avoid consuming the newlines in traddr, trsvcid and device_path. Add minimal processing to make sure they are gone. Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk> 08 June 2018, 18:51:10 UTC
d4c68c7 nvme-rdma: correctly check for target keyed sgl support The code was checking bit 20 instead of bit 2. Also fixed the log entry. Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk> 08 June 2018, 18:51:10 UTC
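The off-by-a-digit bit check in d4c68c7 is easy to illustrate. This is a minimal sketch, assuming a hypothetical 32-bit capability word `caps`; only the bit positions 2 and 20 come from the commit message, and the function names are invented.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_KEYED_SGL_BIT 2u  /* bit position named in the commit message */

/* Buggy variant: tests bit 20, so a device advertising bit 2 is rejected. */
static bool demo_keyed_sgl_buggy(uint32_t caps)
{
    return (caps & (UINT32_C(1) << 20)) != 0;
}

/* Fixed variant: tests the intended bit. */
static bool demo_keyed_sgl_fixed(uint32_t caps)
{
    return (caps & (UINT32_C(1) << DEMO_KEYED_SGL_BIT)) != 0;
}

static void demo_keyed_sgl_check(void)
{
    uint32_t caps = UINT32_C(1) << DEMO_KEYED_SGL_BIT;

    assert(demo_keyed_sgl_fixed(caps));
    assert(!demo_keyed_sgl_buggy(caps));
}
```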
12a0b66 nvme: don't hold nvmf_transports_rwsem for more than transport lookups Only take nvmf_transports_rwsem when doing a lookup of registered transports, so that a blocking ->create_ctrl doesn't prevent other actions on /dev/nvme-fabrics. Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de> [hch: increased lock hold time a bit to be safe, added a comment and updated the changelog] Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> 08 June 2018, 18:51:10 UTC
f39ae47 nvmet: return all zeroed buffer when we can't find an active namespace Quote from Figure 106 in NVMe 1.3a: The Identify Namespace data structure is returned to the host for the namespace specified in the Namespace Identifier (CDW1.NSID) field if it is an active NSID. If the specified namespace is not an active NSID, then the controller returns a zero filled data structure. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@rimberg.me> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk> 08 June 2018, 18:51:09 UTC
28dec87 md: Unify mddev destruction paths Previously, mddev_put() had a couple different paths for freeing a mddev, due to the fact that the kobject wasn't initialized when the mddev was first allocated. If we move the kobject_init() to when it's first allocated and just use kobject_add() later, we can clean all this up. This also removes a hack in mddev_put() to avoid freeing biosets under a spinlock, which involved copying biosets on the stack after the reset bioset_init() changes. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> 08 June 2018, 14:41:17 UTC
2a2a4c5 dm: use bioset_init_from_src() to copy bio_set We can't just copy and clear a bio_set, use the bio helper to setup a new bio_set with the settings from another one. Fixes: 6f1c819c219f ("dm: convert to bioset_init()/mempool_init()") Reported-by: Venkat R.B <vrbagal1@linux.vnet.ibm.com> Tested-by: Venkat R.B <vrbagal1@linux.vnet.ibm.com> Tested-by: Li Wang <liwang@redhat.com> Reviewed-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> 08 June 2018, 13:06:29 UTC
28e89fd block: add bioset_init_from_src() helper Add a helper that allows a caller to initialize a new bio_set, using the settings from an existing bio_set. Reported-by: Venkat R.B <vrbagal1@linux.vnet.ibm.com> Tested-by: Venkat R.B <vrbagal1@linux.vnet.ibm.com> Tested-by: Li Wang <liwang@redhat.com> Reviewed-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> 08 June 2018, 13:03:35 UTC
c04fa44 block: always set partition number to '0' in blk_partition_remap() blk_partition_remap() will only clear bi_partno if an actual remapping has happened. But flush requests et al. don't have an actual size, so the remapping doesn't happen and bi_partno is never cleared. So for stacked devices blk_partition_remap() will be called on each level. If (as is the case for native nvme multipathing) one of the lower-level devices does _not_ support partitioning, a spurious I/O error is generated. Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> 07 June 2018, 12:56:01 UTC
84fca1b block: pass failfast and driver-specific flags to flush requests If flush requests are being sent to the device we need to inherit the failfast and driver-specific flags, too, otherwise I/O will fail. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> 06 June 2018, 14:36:03 UTC
07ce213 nbd: set discard_alignment to the granularity Technically we should be able to get away with 0 as the discard_alignment, but there's no way currently for the protocol to indicate different alignments, and in real life most disks have discard_alignment == discard_granularity. Just set our alignment to our blocksize to make sure discards will actually work properly with 4k drives. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> 05 June 2018, 15:50:46 UTC
ee57a05 nbd: Consistently use request pointer in debug messages. Existing dev_dbg messages sometimes identify request using request pointer, sometimes using nbd_cmd pointer. This makes it hard to follow request flow. Consistently use request pointer instead. Reviewed-by: Josef Bacik <jbacik@toxicpanda.com> Signed-off-by: Kevin Vigor <kvigor@fb.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> 05 June 2018, 15:45:01 UTC
645d409 block: add verifier for cmdline partition I recently hit a strange filesystem corruption issue; the cause was overlapping partitions in the cmdline partition argument. This patch adds a verifier for cmdline partitions, so that if partitions overlap, cmdline_partition logs a warning. We don't treat overlapping partitions as an error: " Caizhiyong <caizhiyong@hisilicon.com> said: Partition overlap was intentionally designed in this cmdline partition. reference http://lists.infradead.org/pipermail/linux-mtd/2013-August/048092.html " Signed-off-by: Wang YanQing <udknight@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> 05 June 2018, 15:20:53 UTC
0ec6937 lightnvm: pblk: fix resource leak of invalid_bitmap Currently the error exit path when the emeta could not be interpreted is via fail_free_ws and this fails to free invalid_bitmap. Fix this by adding another exit label and exiting via this to kfree invalid_bitmap. Detected by CoverityScan, CID#1469659 ("Resource leak") Fixes: 48b8d20895f8 ("lightnvm: pblk: garbage collect lines with failed writes") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> 05 June 2018, 15:20:27 UTC
21ff139 lightnvm: pblk: make symbol write_buffer_size static Fixes the following sparse warning: drivers/lightnvm/pblk-init.c:23:14: warning: symbol 'write_buffer_size' was not declared. Should it be static? Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> 05 June 2018, 15:19:08 UTC
d2ac838 loop: add recursion validation to LOOP_CHANGE_FD Refactor the validation code used in LOOP_SET_FD so it is also used in LOOP_CHANGE_FD. Otherwise it is possible to construct a set of loop devices that all refer to each other. This can lead to an infinite loop starting with "while (is_loop_device(f)) .." in loop_set_fd(). Fix this by refactoring out the validation code and using it for LOOP_CHANGE_FD as well as LOOP_SET_FD. Reported-by: syzbot+4349872271ece473a7c91190b68b4bac7c5dbc87@syzkaller.appspotmail.com Reported-by: syzbot+40bd32c4d9a3cc12a339@syzkaller.appspotmail.com Reported-by: syzbot+769c54e66f994b041be7@syzkaller.appspotmail.com Reported-by: syzbot+0a89a9ce473936c57065@syzkaller.appspotmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Jens Axboe <axboe@kernel.dk> 05 June 2018, 15:08:21 UTC
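The reason d2ac838 applies the same validation to both ioctls can be sketched with a toy model. Everything here is an assumption made for illustration: loop devices are array indices, `backing[i]` holds the index of the loop device backing device `i` (or `DEMO_NOT_LOOP` for a regular file), and `demo_validate()` is a hypothetical helper, not the kernel's function. The walk mirrors the "while (is_loop_device(f))" chain traversal quoted in the message.

```c
#include <assert.h>

#define DEMO_NDEV     4
#define DEMO_NOT_LOOP (-1)  /* backing file is not a loop device */

/*
 * Return 0 if device `dev` may be backed by `new_backing`, or -1 if that
 * would make the chain of backing devices lead back to `dev`.
 * The walk terminates provided the existing table is acyclic, which holds
 * only if EVERY path that changes a backing file runs this check first.
 */
static int demo_validate(const int backing[DEMO_NDEV], int dev, int new_backing)
{
    int cur = new_backing;

    while (cur != DEMO_NOT_LOOP) {
        if (cur == dev)
            return -1;
        cur = backing[cur];
    }
    return 0;
}

static void demo_validate_check(void)
{
    /* loop2 -> loop1 -> loop0 -> regular file; loop3 -> regular file */
    const int backing[DEMO_NDEV] = { DEMO_NOT_LOOP, 0, 1, DEMO_NOT_LOOP };

    assert(demo_validate(backing, 0, 2) == -1);  /* would close a cycle */
    assert(demo_validate(backing, 3, 2) == 0);   /* just a longer chain */
}
```

If one entry point skipped the check, a cycle could be installed and the next walk through the table would never reach `DEMO_NOT_LOOP`, which corresponds to the infinite loop the commit message describes.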
d377535 dm: Use kzalloc for all structs with embedded biosets/mempools mempool_init()/bioset_init() require that the mempools/biosets be zeroed first; they probably should not _require_ this, but not allocating those structs with kzalloc is a fairly nonsensical thing to do (calling mempool_exit()/bioset_exit() on an uninitialized mempool/bioset is legal and safe, but only works if said memory was zeroed.) Acked-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> 05 June 2018, 14:47:43 UTC
0196d6b blk-mq: return when hctx is stopped in blk_mq_run_work_fn If a hardware queue is stopped, it should not be run again before explicitly started. Ignore stopped queues in blk_mq_run_work_fn(), fixing a regression recently introduced when the START_ON_RUN bit was removed. Fixes: 15fe8a90bb45 ("blk-mq: remove blk_mq_delay_queue()") Reviewed-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com> Signed-off-by: Jianchao Wang <jianchao.w.wang@oracle.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> 04 June 2018, 18:10:40 UTC
f956d08 Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull misc vfs updates from Al Viro: "Misc bits and pieces not fitting into anything more specific" * 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: vfs: delete unnecessary assignment in vfs_listxattr Documentation: filesystems: update filesystem locking documentation vfs: namei: use path_equal() in follow_dotdot() fs.h: fix outdated comment about file flags __inode_security_revalidate() never gets NULL opt_dentry make xattr_getsecurity() static vfat: simplify checks in vfat_lookup() get rid of dead code in d_find_alias() it's SB_BORN, not MS_BORN... msdos_rmdir(): kill BS comment remove rpc_rmdir() fs: avoid fdput() after failed fdget() in vfs_dedupe_file_range() 04 June 2018, 17:14:28 UTC
cf626b0 Merge branch 'hch.procfs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull procfs updates from Al Viro: "Christoph's proc_create_... cleanups series" * 'hch.procfs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (44 commits) xfs, proc: hide unused xfs procfs helpers isdn/gigaset: add back gigaset_procinfo assignment proc: update SIZEOF_PDE_INLINE_NAME for the new pde fields tty: replace ->proc_fops with ->proc_show ide: replace ->proc_fops with ->proc_show ide: remove ide_driver_proc_write isdn: replace ->proc_fops with ->proc_show atm: switch to proc_create_seq_private atm: simplify procfs code bluetooth: switch to proc_create_seq_data netfilter/x_tables: switch to proc_create_seq_private netfilter/xt_hashlimit: switch to proc_create_{seq,single}_data neigh: switch to proc_create_seq_data hostap: switch to proc_create_{seq,single}_data bonding: switch to proc_create_seq_data rtc/proc: switch to proc_create_single_data drbd: switch to proc_create_single resource: switch to proc_create_seq_data staging/rtl8192u: simplify procfs code jfs: simplify procfs code ... 04 June 2018, 17:00:01 UTC
9c50eaf Merge branch 'work.rmdir' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull rmdir update from Al Viro: "More shrink_dcache_parent()-related stuff - killing the main source of potentially contended calls of that on large subtrees" * 'work.rmdir' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: rmdir(),rename(): do shrink_dcache_parent() only on success 04 June 2018, 16:53:33 UTC
06c86e6 Merge branch 'work.dcache' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull dcache updates from Al Viro: "This is the first part of dealing with livelocks etc around shrink_dcache_parent()." * 'work.dcache' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: restore cond_resched() in shrink_dcache_parent() dput(): turn into explicit while() loop dcache: move cond_resched() into the end of __dentry_kill() d_walk(): kill 'finish' callback d_invalidate(): unhash immediately 04 June 2018, 15:57:36 UTC
f459c34 Merge tag 'for-4.18/block-20180603' of git://git.kernel.dk/linux-block Pull block updates from Jens Axboe: - clean up how we pass around gfp_t and blk_mq_req_flags_t (Christoph) - prepare us to defer scheduler attach (Christoph) - clean up drivers handling of bounce buffers (Christoph) - fix timeout handling corner cases (Christoph/Bart/Keith) - bcache fixes (Coly) - prep work for bcachefs and some block layer optimizations (Kent). - convert users of bio_sets to using embedded structs (Kent). - fixes for the BFQ io scheduler (Paolo/Davide/Filippo) - lightnvm fixes and improvements (Matias, with contributions from Hans and Javier) - adding discard throttling to blk-wbt (me) - sbitmap blk-mq-tag handling (me/Omar/Ming). - remove the sparc jsflash block driver, acked by DaveM. - Kyber scheduler improvement from Jianchao, making it more friendly wrt merging. - conversion of symbolic proc permissions to octal, from Joe Perches. Previously the block parts were a mix of both. - nbd fixes (Josef and Kevin Vigor) - unify how we handle the various kinds of timestamps that the block core and utility code uses (Omar) - three NVMe pull requests from Keith and Christoph, bringing AEN to feature completeness, file backed namespaces, cq/sq lock split, and various fixes - various little fixes and improvements all over the map * tag 'for-4.18/block-20180603' of git://git.kernel.dk/linux-block: (196 commits) blk-mq: update nr_requests when switching to 'none' scheduler block: don't use blocking queue entered for recursive bio submits dm-crypt: fix warning in shutdown path lightnvm: pblk: take bitmap alloc. out of critical section lightnvm: pblk: kick writer on new flush points lightnvm: pblk: only try to recover lines with written smeta lightnvm: pblk: remove unnecessary bio_get/put lightnvm: pblk: add possibility to set write buffer size manually lightnvm: fix partial read error path lightnvm: proper error handling for pblk_bio_add_pages lightnvm: pblk: fix smeta write error path lightnvm: pblk: garbage collect lines with failed writes lightnvm: pblk: rework write error recovery path lightnvm: pblk: remove dead function lightnvm: pass flag on graceful teardown to targets lightnvm: pblk: check for chunk size before allocating it lightnvm: pblk: remove unnecessary argument lightnvm: pblk: remove unnecessary indirection lightnvm: pblk: return NVM_ error on failed submission lightnvm: pblk: warn in case of corrupted write buffer ... 04 June 2018, 14:58:06 UTC
29dcea8 Linux 4.17 03 June 2018, 21:15:21 UTC
325e14f Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs fixes from Al Viro. - fix io_destroy()/aio_complete() race - the vfs_open() change to get rid of open_check_o_direct() boilerplate was nice, but buggy. Al has a patch avoiding a revert, but that's definitely not a last-day fodder, so for now revert it is... * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: Revert "fs: fold open_check_o_direct into do_dentry_open" fix io_destroy()/aio_complete() race 03 June 2018, 18:01:28 UTC
af04fad Revert "fs: fold open_check_o_direct into do_dentry_open" This reverts commit cab64df194667dc5d9d786f0a895f647f5501c0d. Having vfs_open() in some cases drop the reference to struct file combined with error = vfs_open(path, f, cred); if (error) { put_filp(f); return ERR_PTR(error); } return f; is flat-out wrong. It used to be error = vfs_open(path, f, cred); if (!error) { /* from now on we need fput() to dispose of f */ error = open_check_o_direct(f); if (error) { fput(f); f = ERR_PTR(error); } } else { put_filp(f); f = ERR_PTR(error); } and sure, having that open_check_o_direct() boilerplate gotten rid of is nice, but not that way... Worse, another call chain (via finish_open()) is FUBAR now wrt FILE_OPENED handling - in that case we get error returned, with file already hit by fput() *AND* FILE_OPENED not set. Guess what happens in path_openat(), when it hits if (!(opened & FILE_OPENED)) { BUG_ON(!error); put_filp(file); } The root cause of all that crap is that the callers of do_dentry_open() have no way to tell which way did it fail; while that could be fixed up (by passing something like int *opened to do_dentry_open() and have it marked if we'd called ->open()), it's probably much too late in the cycle to do so right now. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 03 June 2018, 17:58:23 UTC
874cd33 Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Thomas Gleixner: - two patches addressing the problem that the scheduler allows under certain conditions user space tasks to be scheduled on CPUs which are not yet fully booted which causes a few subtle and hard to debug issues - add a missing runqueue clock update in the deadline scheduler which triggers a warning under certain circumstances - fix a silly typo in the scheduler header file * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/headers: Fix typo sched/deadline: Fix missing clock update sched/core: Require cpu_active() in select_task_rq(), for user tasks sched/core: Fix rules for running on online && !active CPUs 03 June 2018, 16:01:41 UTC
26bdace Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf tooling fixes from Thomas Gleixner: - fix 'perf test Session topology' segfault on s390 (Thomas Richter) - fix NULL return handling in bpf__prepare_load() (YueHaibing) - fix indexing on Coresight ETM packet queue decoder (Mathieu Poirier) - fix perf.data format description of NRCPUS header (Arnaldo Carvalho de Melo) - update perf.data documentation section on cpu topology - handle uncore event aliases in small groups properly (Kan Liang) - add missing perf_sample.addr into python sample dictionary (Leo Yan) * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf tools: Fix perf.data format description of NRCPUS header perf script python: Add addr into perf sample dict perf data: Update documentation section on cpu topology perf cs-etm: Fix indexing for decoder packet queue perf bpf: Fix NULL return handling in bpf__prepare_load() perf test: "Session topology" dumps core on s390 perf parse-events: Handle uncore event aliases in small groups properly 03 June 2018, 15:58:59 UTC
32a50fa blk-mq: update nr_requests when switching to 'none' scheduler Currently we set up q->nr_requests when switching to a new scheduler, but not when switching to 'none', so q->nr_requests may be incorrect for 'none'. This patch fixes this issue by always updating 'nr_requests' when switching to 'none'. Cc: Marco Patalano <mpatalan@redhat.com> Cc: "Ewan D. Milne" <emilne@redhat.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> 03 June 2018, 02:35:00 UTC
cd4a4ae block: don't use blocking queue entered for recursive bio submits If we end up splitting a bio and the queue goes away between the initial submission and the later split submission, then we can block forever in blk_queue_enter() waiting for the reference to drop to zero. This will never happen, since we already hold a reference. Mark a split bio as already having entered the queue, so we can just use the live non-blocking queue enter variant. Thanks to Tetsuo Handa for the analysis. Reported-by: syzbot+c4f9cebf9d651f6e54de@syzkaller.appspotmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk> 03 June 2018, 02:35:00 UTC
d00a11d dm-crypt: fix warning in shutdown path The counter for the number of allocated pages includes pages in the mempool's reserve, so checking that the number of allocated pages is 0 needs to happen after we exit the mempool. Fixes: 6f1c819c219f ("dm: convert to bioset_init()/mempool_init()") Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Reported-by: Krzysztof Kozlowski <krzk@kernel.org> Acked-by: Mike Snitzer <snitzer@redhat.com> Fixed to always just use percpu_counter_sum() Signed-off-by: Jens Axboe <axboe@kernel.dk> 03 June 2018, 02:35:00 UTC
918fe1b Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: 1) Infinite loop in _decode_session6(), from Eric Dumazet. 2) Pass correct argument to nla_strlcpy() in netfilter, also from Eric Dumazet. 3) Out of bounds memory access in ipv6 srh code, from Mathieu Xhonneux. 4) NULL deref in XDP_REDIRECT handling of tun driver, from Toshiaki Makita. 5) Incorrect idr release in cls_flower, from Paul Blakey. 6) Probe error handling fix in davinci_emac, from Dan Carpenter. 7) Memory leak in XPS configuration, from Alexander Duyck. 8) Use after free with cloned sockets in kcm, from Kirill Tkhai. 9) MTU handling fixes for ip_tunnel and ip6_tunnel, from Nicolas Dichtel. 10) Fix UAPI hole in bpf data structure for 32-bit compat applications, from Daniel Borkmann. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (33 commits) bpf: fix uapi hole for 32 bit compat applications net: usb: cdc_mbim: add flag FLAG_SEND_ZLP ip6_tunnel: remove magic mtu value 0xFFF8 ip_tunnel: restore binding to ifaces with a large mtu net: dsa: b53: Add BCM5389 support kcm: Fix use-after-free caused by clonned sockets net-sysfs: Fix memory leak in XPS configuration ixgbe: fix parsing of TC actions for HW offload net: ethernet: davinci_emac: fix error handling in probe() net/ncsi: Fix array size in dumpit handler cls_flower: Fix incorrect idr release when failing to modify rule net/sonic: Use dma_mapping_error() xfrm Fix potential error pointer dereference in xfrm_bundle_create. vhost_net: flush batched heads before trying to busy polling tun: Fix NULL pointer dereference in XDP redirect be2net: Fix error detection logic for BE3 net: qmi_wwan: Add Netgear Aircard 779S mlxsw: spectrum: Forbid creation of VLAN 1 over port/LAG atm: zatm: fix memcmp casting iwlwifi: pcie: compare with number of IRQs requested for, not number of CPUs ... 03 June 2018, 00:35:53 UTC
e0255ae Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fix from James Bottomley: "Eve of merge window fix: The original code was so bogus as to be casting the wrong generic device to an rport and proceeding to take actions based on the bogus values it found. Fortunately it seems the location that is dereferenced always exists, so the code hasn't oopsed yet, but it certainly annoys the memory checkers" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: scsi_transport_srp: Fix shost to rport translation 02 June 2018, 22:54:49 UTC
ada7339 Merge tag 'drm-fixes-for-v4.17-rc8' of git://people.freedesktop.org/~airlied/linux Pull drm fixes from Dave Airlie: "A few final fixes: i915: - fix for potential Spectre vector in the new query uAPI - fix NULL pointer deref (FDO #106559) - DMI fix to hide LVDS for Radiant P845 (FDO #105468) amdgpu: - suspend/resume DC regression fix - underscan flicker fix on fiji - gamma setting fix after dpms omap: - fix oops regression core: - fix PSR timing dw-hdmi: - fix oops regression" * tag 'drm-fixes-for-v4.17-rc8' of git://people.freedesktop.org/~airlied/linux: drm/amd/display: Update color props when modeset is required drm/amd/display: Make atomic-check validate underscan changes drm/bridge/synopsys: dw-hdmi: fix dw_hdmi_setup_rx_sense drm/amd/display: Fix BUG_ON during CRTC atomic check update drm/i915/query: nospec expects no more than an unsigned long drm/i915/query: Protect tainted function pointer lookup drm/i915/lvds: Move acpi lid notification registration to registration phase drm/i915: Disable LVDS on Radiant P845 drm/omap: fix NULL deref crash with SDI displays drm/psr: Fix missed entry in PSR setup time table. 02 June 2018, 22:24:45 UTC
012cfac Merge branch 'drm-fixes-4.17' of git://people.freedesktop.org/~agd5f/linux into drm-fixes Two last minute DC fixes for 4.17. A fix for underscan on fiji and a fix for gamma settings after dpms. * 'drm-fixes-4.17' of git://people.freedesktop.org/~agd5f/linux: drm/amd/display: Update color props when modeset is required drm/amd/display: Make atomic-check validate underscan changes 02 June 2018, 20:13:57 UTC
4277e6b Merge tag 'mips_fixes_4.17_3' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux Pull MIPS fixes from James Hogan: "A final few MIPS fixes for 4.17: - drop Lantiq gphy reboot/remove reset (4.14) - prctl(PR_SET_FP_MODE): Disallow FRE without FR (4.0) - ptrace(PTRACE_PEEKUSR): Fix 64-bit FGRs (3.15)" * tag 'mips_fixes_4.17_3' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux: MIPS: ptrace: Fix PTRACE_PEEKUSR requests for 64-bit FGRs MIPS: prctl: Disallow FRE without FR with PR_SET_FP_MODE requests MIPS: lantiq: gphy: Drop reboot/remove reset asserts 02 June 2018, 17:12:23 UTC
7172a69 Merge tag 'vfio-v4.17' of git://github.com/awilliam/linux-vfio Pull VFIO fix from Alex Williamson: "Revert a pfn page mapping optimization identified as introducing a bad page state regression (Alex Williamson)" * tag 'vfio-v4.17' of git://github.com/awilliam/linux-vfio: Revert "vfio/type1: Improve memory pinning process for raw PFN mapping" 02 June 2018, 17:08:45 UTC
6ac9f42 Merge tag 'char-misc-4.17-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char/misc driver fixes from Greg KH: "Here are four small bugfixes for some char/misc drivers. Well, really three fixes and one fix for one of those fixes due to problems found by 0-day. This resolves some reported issues with the hwtracing drivers, and a reported regression for the thunderbolt subsystem. All of these have been in linux-next for a while now with no reported problems" * tag 'char-misc-4.17-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: hwtracing: stm: fix build error on some arches intel_th: Use correct device when freeing buffers stm class: Use vmalloc for the master map thunderbolt: Handle NULL boot ACL entries properly 02 June 2018, 17:05:45 UTC
34a8e64 Merge tag 'staging-4.17-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging Pull IIO driver fixes from Greg KH: "Here are some old IIO driver fixes that were sitting in my tree for a few weeks. Sorry about not getting them to you sooner. They fix a number of small IIO driver issues that have been reported. All of these have been in linux-next for a while with no reported problems" * tag 'staging-4.17-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: iio: adc: select buffer for at91-sama5d2_adc iio: hid-sensor-trigger: Fix sometimes not powering up the sensor after resume iio: adc: at91-sama5d2_adc: fix channel configuration for differential channels iio:kfifo_buf: check for uint overflow iio:buffer: make length types match kfifo types iio: adc: stm32-dfsdm: fix sample rate for div2 spi clock iio: adc: stm32-dfsdm: fix successive oversampling settings iio: ad7793: implement IIO_CHAN_INFO_SAMP_FREQ 02 June 2018, 17:02:14 UTC
7fdf3e8 Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma Pull rdma fixes from Jason Gunthorpe: "Just three small last minute regressions that were found in the last week. The Broadcom fix is a bit big for rc7, but since it is fixing driver crash regressions that were merged via netdev into rc1, I am sending it. - bnxt netdev changes merged this cycle caused the bnxt RDMA driver to crash under certain situations - Arnd found (several, unfortunately) kconfig problems with the patches adding INFINIBAND_ADDR_TRANS. Reverting this last part, will fix it more fully outside -rc. - Subtle change in error code for a uapi function caused breakage in userspace. This bug was subtly introduced this cycle" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: IB/core: Fix error code for invalid GID entry IB: Revert "remove redundant INFINIBAND kconfig dependencies" RDMA/bnxt_re: Fix broken RoCE driver due to recent L2 driver changes 02 June 2018, 16:55:44 UTC
a36b796 Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull i2c fixes from Wolfram Sang: "A documentation bugfix and a MAINTAINERS addition" * 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: i2c: ocores: update HDL sources URL i2c: xlp9xx: Add MAINTAINERS entry 02 June 2018, 16:52:22 UTC
0938a8f Merge branch 'akpm' (patches from Andrew) Merge two fixes from Andrew Morton. * emailed patches from Andrew Morton <akpm@linux-foundation.org>: mm: fix the NULL mapping case in __isolate_lru_page() mm/huge_memory.c: __split_huge_page() use atomic ClearPageDirty() 02 June 2018, 16:44:15 UTC
145e1a7 mm: fix the NULL mapping case in __isolate_lru_page() George Boole would have noticed a slight error in 4.16 commit 69d763fc6d3a ("mm: pin address_space before dereferencing it while isolating an LRU page"). Fix it, to match both the comment above it, and the original behaviour. Although anonymous pages are not marked PageDirty at first, we have an old habit of calling SetPageDirty when a page is removed from swap cache: so there's a category of ex-swap pages that are easily migratable, but were inadvertently excluded from compaction's async migration in 4.16. Link: http://lkml.kernel.org/r/alpine.LSU.2.11.1805302014001.12558@eggly.anvils Fixes: 69d763fc6d3a ("mm: pin address_space before dereferencing it while isolating an LRU page") Signed-off-by: Hugh Dickins <hughd@google.com> Acked-by: Minchan Kim <minchan@kernel.org> Acked-by: Mel Gorman <mgorman@techsingularity.net> Reported-by: Ivan Kalvachev <ikalvachev@gmail.com> Cc: "Huang, Ying" <ying.huang@intel.com> Cc: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 02 June 2018, 16:33:47 UTC
2d077d4 mm/huge_memory.c: __split_huge_page() use atomic ClearPageDirty() Swapping load on huge=always tmpfs (with khugepaged tuned up to be very eager, but I'm not sure that is relevant) soon hung uninterruptibly, waiting for page lock in shmem_getpage_gfp()'s find_lock_entry(), most often when "cp -a" was trying to write to a smallish file. Debug showed that the page in question was not locked, and page->mapping NULL by now, but page->index consistent with having been in a huge page before. Reproduced in minutes on a 4.15 kernel, even with 4.17's 605ca5ede764 ("mm/huge_memory.c: reorder operations in __split_huge_page_tail()") added in; but took hours to reproduce on a 4.17 kernel (no idea why). The culprit proved to be the __ClearPageDirty() on tails beyond i_size in __split_huge_page(): the non-atomic __bitoperation may have been safe when 4.8's baa355fd3314 ("thp: file pages support for split_huge_page()") introduced it, but liable to erase PageWaiters after 4.10's 62906027091f ("mm: add PageWaiters indicating tasks are waiting for a page bit"). Link: http://lkml.kernel.org/r/alpine.LSU.2.11.1805291841070.3197@eggly.anvils Fixes: 62906027091f ("mm: add PageWaiters indicating tasks are waiting for a page bit") Signed-off-by: Hugh Dickins <hughd@google.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 02 June 2018, 16:33:47 UTC
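The lost-update hazard described in 2d077d4 can be replayed in user space. The following is a minimal sketch, not kernel code: the flag names and bit positions are hypothetical stand-ins for PageDirty and PageWaiters, and the interleaving of two CPUs is simulated sequentially so that the outcome is deterministic.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical bit positions; the real page flag layout differs. */
#define FLAG_DIRTY   (1ul << 0)
#define FLAG_WAITERS (1ul << 1)

static_assert(FLAG_DIRTY != FLAG_WAITERS, "flags must be distinct bits");

/*
 * Replays a non-atomic clear (separate load and store) with another
 * CPU setting FLAG_WAITERS in between. The stale store wipes the bit.
 */
unsigned long replay_nonatomic_clear(void)
{
    unsigned long flags = FLAG_DIRTY;
    unsigned long snapshot = flags;      /* CPU A: load             */
    flags |= FLAG_WAITERS;               /* CPU B: a waiter arrives */
    flags = snapshot & ~FLAG_DIRTY;      /* CPU A: store stale word */
    return flags;                        /* FLAG_WAITERS is gone    */
}

/*
 * Same sequence with atomic read-modify-write operations. Each one
 * acts on the current value, so the waiter bit survives.
 */
unsigned long replay_atomic_clear(void)
{
    _Atomic unsigned long flags = FLAG_DIRTY;
    atomic_fetch_or(&flags, FLAG_WAITERS);   /* CPU B */
    atomic_fetch_and(&flags, ~FLAG_DIRTY);   /* CPU A */
    return atomic_load(&flags);
}
```

Calling replay_nonatomic_clear() yields 0 (the waiter bit is erased, so a sleeping task would never be woken), while replay_atomic_clear() yields FLAG_WAITERS.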
89c29de Revert "vfio/type1: Improve memory pinning process for raw PFN mapping" Bisection by Amadeusz Sławiński implicates this commit leading to bad page state issues after VM shutdown, likely due to unbalanced page references. The original commit was intended only as a performance improvement, therefore revert for offline rework. Link: https://lkml.org/lkml/2018/6/2/97 Fixes: 356e88ebe447 ("vfio/type1: Improve memory pinning process for raw PFN mapping") Cc: Jason Cai (Xiang Feng) <jason.cai@linux.alibaba.com> Reported-by: Amadeusz Sławiński <amade@asmblr.net> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> 02 June 2018, 14:41:44 UTC
cd075ce Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf Daniel Borkmann says: ==================== pull-request: bpf 2018-06-02 The following pull-request contains BPF updates for your *net* tree. The main changes are: 1) BPF uapi fix in struct bpf_prog_info and struct bpf_map_info in order to fix offsets on 32 bit archs. This will have a minor merge conflict with net-next which has the __u32 gpl_compatible:1 bitfield in struct bpf_prog_info at this location. Resolution is to use the gpl_compatible member. ==================== Signed-off-by: David S. Miller <davem@davemloft.net> 02 June 2018, 12:07:52 UTC
36f9814 bpf: fix uapi hole for 32 bit compat applications In 64 bit, we have a 4 byte hole between ifindex and netns_dev in the case of struct bpf_map_info but also struct bpf_prog_info. In net-next commit b85fab0e67b ("bpf: Add gpl_compatible flag to struct bpf_prog_info") added a bitfield into it to expose some flags related to programs. Thus, add an unnamed __u32 bitfield for both so that alignment keeps the same in both 32 and 64 bit cases, and can be naturally extended from there as in b85fab0e67b. Before: # file test.o test.o: ELF 32-bit LSB relocatable, Intel 80386, version 1 (SYSV), not stripped # pahole test.o struct bpf_map_info { __u32 type; /* 0 4 */ __u32 id; /* 4 4 */ __u32 key_size; /* 8 4 */ __u32 value_size; /* 12 4 */ __u32 max_entries; /* 16 4 */ __u32 map_flags; /* 20 4 */ char name[16]; /* 24 16 */ __u32 ifindex; /* 40 4 */ __u64 netns_dev; /* 44 8 */ __u64 netns_ino; /* 52 8 */ /* size: 64, cachelines: 1, members: 10 */ /* padding: 4 */ }; After (same as on 64 bit): # file test.o test.o: ELF 32-bit LSB relocatable, Intel 80386, version 1 (SYSV), not stripped # pahole test.o struct bpf_map_info { __u32 type; /* 0 4 */ __u32 id; /* 4 4 */ __u32 key_size; /* 8 4 */ __u32 value_size; /* 12 4 */ __u32 max_entries; /* 16 4 */ __u32 map_flags; /* 20 4 */ char name[16]; /* 24 16 */ __u32 ifindex; /* 40 4 */ /* XXX 4 bytes hole, try to pack */ __u64 netns_dev; /* 48 8 */ __u64 netns_ino; /* 56 8 */ /* --- cacheline 1 boundary (64 bytes) --- */ /* size: 64, cachelines: 1, members: 10 */ /* sum members: 60, holes: 1, sum holes: 4 */ }; Reported-by: Dmitry V. Levin <ldv@altlinux.org> Reported-by: Eugene Syromiatnikov <esyr@redhat.com> Fixes: 52775b33bb507 ("bpf: offload: report device information about offloaded maps") Fixes: 675fc275a3a2d ("bpf: offload: report device information for offloaded programs") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> 02 June 2018, 03:41:35 UTC
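The layout issue in 36f9814 can be sketched with a reduced structure. This is an illustration only: the struct names below are hypothetical and keep just the three trailing members, and the real definitions live in the kernel's uapi headers. On ABIs where a 64-bit integer is 8-byte aligned the compiler inserts the 4-byte hole by itself. On i386 it does not, so the two ABIs disagree unless the padding is spelled out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reduced layout: the hole after ifindex is implicit, so
 * netns_dev sits at offset 8 on x86-64 but at offset 4 on i386. */
struct info_implicit_hole {
    uint32_t ifindex;
    uint64_t netns_dev;
    uint64_t netns_ino;
};

/* Same members with an unnamed 32-bit bitfield filling the hole, as
 * the commit message describes. The layout no longer depends on the
 * alignment the ABI gives to 64-bit integers. */
struct info_explicit_pad {
    uint32_t ifindex;
    uint32_t :32;
    uint64_t netns_dev;
    uint64_t netns_ino;
};

/* With explicit padding these hold on both 32-bit and 64-bit x86. */
static_assert(offsetof(struct info_explicit_pad, netns_dev) == 8,
              "netns_dev must start at offset 8");
static_assert(sizeof(struct info_explicit_pad) == 24,
              "padded struct must be 24 bytes");

size_t padded_netns_dev_offset(void)
{
    return offsetof(struct info_explicit_pad, netns_dev);
}
```

Compiling the sketch with and without -m32 and comparing offsetof(struct info_implicit_hole, netns_dev) shows the mismatch the commit fixes. The padded variant gives the same answer either way.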
9f7c728 net: usb: cdc_mbim: add flag FLAG_SEND_ZLP Testing Telit LM940 with ICMP packets > 14552 bytes revealed that the modem needs FLAG_SEND_ZLP to work properly, otherwise the cdc mbim data interface will no longer be responsive. Signed-off-by: Daniele Palmas <dnlplm@gmail.com> Acked-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net> 01 June 2018, 18:01:42 UTC
8a11801 Merge branch 'tunnel-mtus' Nicolas Dichtel says: ==================== ip[6] tunnels: fix mtu calculations The first patch restores the possibility to bind an ip4 tunnel to an interface with a large mtu. The second patch was spotted after the first fix. I also target it to net because it fixes the max mtu value that can be used for ipv6 tunnels. v2: remove the 0xfff8 in ip_tunnel_newlink() ==================== Signed-off-by: David S. Miller <davem@davemloft.net> 01 June 2018, 17:56:31 UTC
f7ff1fd ip6_tunnel: remove magic mtu value 0xFFF8 I don't know where this value comes from (probably a copy and paste and paste and paste ...). Let's use standard values which are a bit greater. Link: https://git.kernel.org/pub/scm/linux/kernel/git/davem/netdev-vger-cvs.git/commit/?id=e5afd356a411a Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net> 01 June 2018, 17:56:30 UTC
82612de ip_tunnel: restore binding to ifaces with a large mtu After commit f6cc9c054e77, the following conf is broken (note that the default loopback mtu is 65536, ie IP_MAX_MTU + 1): $ ip tunnel add gre1 mode gre local 10.125.0.1 remote 10.125.0.2 dev lo add tunnel "gre0" failed: Invalid argument $ ip l a type dummy $ ip l s dummy1 up $ ip l s dummy1 mtu 65535 $ ip tunnel add gre1 mode gre local 10.125.0.1 remote 10.125.0.2 dev dummy1 add tunnel "gre0" failed: Invalid argument dev_set_mtu() doesn't allow to set a mtu which is too large. First, let's cap the mtu returned by ip_tunnel_bind_dev(). Second, remove the magic value 0xFFF8 and use IP_MAX_MTU instead. 0xFFF8 seems to be there for ages, I don't know why this value was used. With a recent kernel, it's also possible to set a mtu > IP_MAX_MTU: $ ip l s dummy1 mtu 66000 After that patch, it's also possible to bind an ip tunnel on that kind of interface. CC: Petr Machata <petrm@mellanox.com> CC: Ido Schimmel <idosch@mellanox.com> Link: https://git.kernel.org/pub/scm/linux/kernel/git/davem/netdev-vger-cvs.git/commit/?id=e5afd356a411a Fixes: f6cc9c054e77 ("ip_tunnel: Emit events for post-register MTU changes") Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> 01 June 2018, 17:56:29 UTC
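The capping idea in 82612de can be sketched as a small helper. This is not the kernel's ip_tunnel_bind_dev(). The function name, the overhead parameter, and the lower bound of 68 bytes (the minimum IPv4 datagram size every host must accept) are assumptions made for illustration. The upper bound follows the commit message, which puts the default loopback mtu of 65536 at IP_MAX_MTU + 1.

```c
#include <assert.h>

#define SKETCH_IP_MAX_MTU   0xFFFFu  /* 65535, per the commit message  */
#define SKETCH_IPV4_MIN_MTU 68u      /* assumed floor for illustration */

static_assert(SKETCH_IPV4_MIN_MTU < SKETCH_IP_MAX_MTU, "bounds are ordered");

/*
 * Hypothetical helper: derive a tunnel mtu from the lower device mtu
 * minus the encapsulation overhead, then clamp it so that the result
 * is always a value the device layer will accept.
 */
unsigned int sketch_tunnel_mtu(unsigned int lower_mtu, unsigned int overhead)
{
    unsigned int mtu;

    /* Guard against unsigned underflow when overhead exceeds the mtu. */
    if (lower_mtu <= overhead)
        return SKETCH_IPV4_MIN_MTU;

    mtu = lower_mtu - overhead;
    if (mtu < SKETCH_IPV4_MIN_MTU)
        mtu = SKETCH_IPV4_MIN_MTU;
    if (mtu > SKETCH_IP_MAX_MTU)
        mtu = SKETCH_IP_MAX_MTU;     /* cap instead of failing */
    return mtu;
}
```

With a lower device mtu of 66000 and 24 bytes of overhead, the helper returns 65535 rather than an out-of-range value, which mirrors the behaviour the commit restores for interfaces with a large mtu.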
ccfde6e Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec Steffen Klassert says: ==================== pull request (net): ipsec 2018-05-31 1) Avoid possible overflow of the offset variable in _decode_session6(), this fixes an infinite loop there. From Eric Dumazet. 2) We may use an error pointer in the error path of xfrm_bundle_create(). Fix this by returning this pointer directly to the caller. Please pull or let me know if there are problems. ==================== Signed-off-by: David S. Miller <davem@davemloft.net> 01 June 2018, 17:25:41 UTC
a95691b net: dsa: b53: Add BCM5389 support This patch adds support for the BCM5389 switch connected through MDIO. Signed-off-by: Damien Thébault <damien.thebault@vitec.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> 01 June 2018, 15:15:42 UTC
9cfd5a9 lightnvm: pblk: take bitmap alloc. out of critical section pblk allocates line bitmaps within the line lock unnecessarily. In order to take pressure out of the fast path, allocate line bitmaps outside of this lock and refactor accordingly. Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk> 01 June 2018, 15:02:53 UTC
cc9c9a0 lightnvm: pblk: kick writer on new flush points Unless we kick the writer directly when setting a new flush point, the user risks having to wait for up to one second (the default timeout for the write thread to be kicked) for the IO to complete. Signed-off-by: Hans Holmberg <hans.holmberg@cnexlabs.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk> 01 June 2018, 15:02:53 UTC
b06be28 lightnvm: pblk: only try to recover lines with written smeta When switching between different lun configurations, there is no guarantee that all lines that contain closed/open chunks have some valid data to recover. Check that the smeta chunk has been written to instead. Also skip bad lines (that do not have enough good chunks). Signed-off-by: Hans Holmberg <hans.holmberg@cnexlabs.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk> 01 June 2018, 15:02:53 UTC
87cc40b lightnvm: pblk: remove unnecessary bio_get/put In the read path, pblk gets a reference to the incoming bio and puts it after ending the bio. Though this behavior is correct, it is unnecessary since pblk is the one putting the bio, therefore, it cannot disappear underneath it. Removing this reference, allows to clean up rqd->bio and avoids pointer bouncing for the different read paths. Now, the incoming bio always resides in the read context and pblk's internal bios (if any) reside in rqd->bio. Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk> 01 June 2018, 15:02:53 UTC
4a82888 lightnvm: pblk: add possibility to set write buffer size manually In some cases, users may want to set the write buffer size manually, e.g. to adjust it to a specific workload. This patch provides the possibility to set the write buffer size via a module parameter. Signed-off-by: Marcin Dziegielewski <marcin.dziegielewski@intel.com> Signed-off-by: Igor Konopko <igor.j.konopko@intel.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk> 01 June 2018, 15:02:53 UTC
fbadca7 lightnvm: fix partial read error path When error occurs during bio_add_page on partial read path, pblk tries to free pages twice. Signed-off-by: Igor Konopko <igor.j.konopko@intel.com> Signed-off-by: Marcin Dziegielewski <marcin.dziegielewski@intel.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk> 01 June 2018, 15:02:53 UTC
f142ac0 lightnvm: proper error handling for pblk_bio_add_pages Currently, when bio_pc_add_page fails in pblk_bio_add_pages, two issues occur when calling from pblk_rb_read_to_bio(). The first is in pblk_bio_free_pages, since we are trying to free pages not allocated from our mempool. The second is the warning from dma_pool_free that we are trying to free a NULL dma pointer. This commit fixes both issues. Signed-off-by: Igor Konopko <igor.j.konopko@intel.com> Signed-off-by: Marcin Dziegielewski <marcin.dziegielewski@intel.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk> 01 June 2018, 15:02:53 UTC
6cf17a2 lightnvm: pblk: fix smeta write error path Smeta write errors were previously ignored. Skip these lines instead and throw them back on the free list, so the chunks will go through a reset cycle before we attempt to use the line again. Signed-off-by: Hans Holmberg <hans.holmberg@cnexlabs.com> Reviewed-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk> 01 June 2018, 15:02:53 UTC
48b8d20 lightnvm: pblk: garbage collect lines with failed writes Write failures should not happen under normal circumstances, so in order to bring the chunk back into a known state as soon as possible, evacuate all the valid data out of the line and let the fw judge if the block can be written to in the next reset cycle. Do this by introducing a new gc list for lines with failed writes, and ensure that the rate limiter allocates a small portion of the write bandwidth to get the job done. The lba list is saved in memory for use during gc as we cannot guarantee that the emeta data is readable if a write error occurred. Signed-off-by: Hans Holmberg <hans.holmberg@cnexlabs.com> Reviewed-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk> 01 June 2018, 15:02:53 UTC
6a3abf5 lightnvm: pblk: rework write error recovery path The write error recovery path is incomplete, so rework the write error recovery handling to do resubmits directly from the write buffer. When a write error occurs, the remaining sectors in the chunk are mapped out and invalidated and the request inserted in a resubmit list. The writer thread checks if there are any requests to resubmit, scans and invalidates any lbas that have been overwritten by later writes and resubmits the failed entries. Signed-off-by: Hans Holmberg <hans.holmberg@cnexlabs.com> Reviewed-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk> 01 June 2018, 15:02:53 UTC
eb7f54b kcm: Fix use-after-free caused by clonned sockets (resend for properly queueing in patchwork) kcm_clone() creates kernel socket, which does not take net counter. Thus, the net may die before the socket is completely destructed, i.e. kcm_exit_net() is executed before kcm_done(). Reported-by: syzbot+5f1a04e374a635efc426@syzkaller.appspotmail.com Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net> 01 June 2018, 14:28:07 UTC
72b6cdb lightnvm: pblk: remove dead function Remove dead function for manual sync. I/O Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk> 01 June 2018, 13:43:53 UTC
a7c9e91 lightnvm: pass flag on graceful teardown to targets If the namespace is unregistered before the LightNVM target is removed (e.g., on hot unplug) it is too late for the target to store any metadata on the device - any attempt to write to the device will fail. In this case, pass on a "graceful teardown" flag to the target to let it know when this happens. In the case of pblk, we pad the open line (close all open chunks) to improve data retention. In the event of an ungraceful shutdown, avoid this part and just clean up. Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk> 01 June 2018, 13:43:53 UTC
6f9c960 lightnvm: pblk: check for chunk size before allocating it Do the check for the chunk state after making sure that the chunk type is supported. Fixes: 32ef9412c114 ("lightnvm: pblk: implement get log report chunk") Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk> 01 June 2018, 13:43:53 UTC
8e55c07 lightnvm: pblk: remove unnecessary argument Remove unnecessary argument on pblk_line_free() Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk> 01 June 2018, 13:43:53 UTC
e13f421 lightnvm: pblk: remove unnecessary indirection Call nvm_submit_io directly and remove an unnecessary indirection on the read path. Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk> 01 June 2018, 13:43:53 UTC
b6730dd lightnvm: pblk: return NVM_ error on failed submission Return a meaningful error when the sanity vector I/O check fails. Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk> 01 June 2018, 13:43:53 UTC
e37d079 lightnvm: pblk: warn in case of corrupted write buffer When cleaning up buffer entries as we wrap up, their state should be "completed". If any of the entries is in "submitted" state, it means that something bad has happened. Trigger a warning immediately instead of waiting for the state flag to eventually be updated, thus hiding the issue. Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk> 01 June 2018, 13:43:53 UTC
03a34b2 lightnvm: pblk: improve error msg on corrupted LBAs In the event of a mismatch between the read LBA and the metadata pointer reported by the device, improve the error message to be able to detect the offending physical address (PPA) mapped to the corrupted LBA. Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk> 01 June 2018, 13:43:53 UTC
310df58 lightnvm: pblk: check read lba on gc path Check that the lba stored in the LBA metadata is correct in the GC path too. This requires a new helper function to check random reads in the vector read. Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk> 01 June 2018, 13:43:53 UTC
1d8b33e lightnvm: pblk: recheck for bad lines at runtime Bad blocks can grow at runtime. Check that the number of valid blocks in a line is within the sanity threshold before allocating the line for new writes. Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk> 01 June 2018, 13:43:53 UTC
2deeefc lightnvm: pblk: fail gracefully on line alloc. failure In the event of a line failing to allocate, fail gracefully and stop the pipeline to avoid more writes failing in the same place. Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk> 01 June 2018, 13:43:53 UTC
84e92c1 Merge branch 'nvme-4.18' of git://git.infradead.org/nvme into for-4.18/block Pull NVMe changes from Christoph: "Below is another set of NVMe updates for 4.18. Besides the usual bug fixes this includes more feature completeness in terms of AEN and log page handling on the target." * 'nvme-4.18' of git://git.infradead.org/nvme: nvme: use the changed namespaces list log to clear ns data changed AENs nvme: mark nvme_queue_scan static nvme: submit AEN event configuration on startup nvmet: mask pending AENs nvmet: add AEN configuration support nvmet: implement the changed namespaces log nvmet: split log page implementation nvmet: add a new nvmet_zero_sgl helper nvme.h: add AEN configuration symbols nvme.h: add the changed namespace list log nvme.h: untangle AEN notice definitions nvmet: fix error return code in nvmet_file_ns_enable() nvmet: fix a typo in nvmet_file_ns_enable() nvme-fabrics: allow internal passthrough command on deleting controllers nvme-loop: add support for multiple ports nvme-pci: simplify __nvme_submit_cmd nvme-pci: Rate limit the nvme timeout warnings nvme: allow duplicate controller if prior controller being deleted 01 June 2018, 13:39:48 UTC
131d08e block: split the blk-mq case from elevator_init There is almost no shared logic, which leads to a very confusing code flow. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com> Tested-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> 01 June 2018, 13:38:21 UTC
acddf3b block: move sysfs_lock into elevator_init Both callers take it just around the function call, so move it in. Also remove the now pointless blk_mq_sched_init wrapper. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com> Tested-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> 01 June 2018, 13:38:19 UTC
ddb7253 block: remove the always unused name argument to elevator_init Reported-by: Damien Le Moal <Damien.LeMoal@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com> Tested-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> 01 June 2018, 13:38:17 UTC
a8a275c block: unexport elevator_init/exit These are only used by the block core. Also move the declarations to block/blk.h. Reported-by: Damien Le Moal <Damien.LeMoal@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com> Tested-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> 01 June 2018, 13:38:16 UTC