summaryrefslogtreecommitdiff
path: root/drivers/block/null_blk.c
AgeCommit message (Collapse)Author
2015-07-22null_blk: fix use-after-free problemMike Krinkin
end_cmd finishes request associated with nullb_cmd struct, so we should save pointer to request_queue in a local variable before calling end_cmd. The problem was causes general protection fault with slab poisoning enabled. Fixes: 8b70f45e2eb2 ("null_blk: restart request processing on completion handler") Tested-by: Akinobu Mita <akinobu.mita@gmail.com> Signed-off-by: Mike Krinkin <krinkin.m.u@gmail.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-07-01Merge tag 'modules-next-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux Pull module updates from Rusty Russell: "Main excitement here is Peter Zijlstra's lockless rbtree optimization to speed module address lookup. He found some abusers of the module lock doing that too. A little bit of parameter work here too; including Dan Streetman's breaking up the big param mutex so writing a parameter can load another module (yeah, really). Unfortunately that broke the usual suspects, !CONFIG_MODULES and !CONFIG_SYSFS, so those fixes were appended too" * tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: (26 commits) modules: only use mod->param_lock if CONFIG_MODULES param: fix module param locks when !CONFIG_SYSFS. rcu: merge fix for Convert ACCESS_ONCE() to READ_ONCE() and WRITE_ONCE() module: add per-module param_lock module: make perm const params: suppress unused variable error, warn once just in case code changes. modules: clarify CONFIG_MODULE_COMPRESS help, suggest 'N'. kernel/module.c: avoid ifdefs for sig_enforce declaration kernel/workqueue.c: remove ifdefs over wq_power_efficient kernel/params.c: export param_ops_bool_enable_only kernel/params.c: generalize bool_enable_only kernel/module.c: use generic module param operaters for sig_enforce kernel/params: constify struct kernel_param_ops uses sysfs: tightened sysfs permission checks module: Rework module_addr_{min,max} module: Use __module_address() for module_address_lookup() module: Make the mod_tree stuff conditional on PERF_EVENTS || TRACING module: Optimize __module_address() using a latched RB-tree rbtree: Implement generic latch_tree seqlock: Introduce raw_read_seqcount_latch() ...
2015-06-01null_blk: restart request processing on completion handlerAkinobu Mita
When irqmode=2 (IRQ completion handler is timer) and queue_mode=1 (Block interface to use is rq), the completion handler should restart request handling for any pending requests on a queue because request processing stops when the number of commands are queued more than hw_queue_depth (null_rq_prep_fn returns BLKPREP_DEFER). Without this change, the following command cannot finish. # modprobe null_blk irqmode=2 queue_mode=1 hw_queue_depth=1 # fio --name=t --rw=read --size=1g --direct=1 \ --ioengine=libaio --iodepth=64 --filename=/dev/nullb0 Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Cc: Jens Axboe <axboe@fb.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-06-01null_blk: prevent timer handler running on a different CPU where startedAkinobu Mita
When irqmode=2 (IRQ completion handler is timer), timer handler should be called on the same CPU where the timer has been started. Since completion_queues are per-cpu and the completion handler only touches completion_queue for local CPU, we need to prevent the handler from running on a different CPU where the timer has been started. Otherwise, the IO cannot be completed until another completion handler is executed on that CPU. Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Cc: Jens Axboe <axboe@fb.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-05-28kernel/params: constify struct kernel_param_ops usesLuis R. Rodriguez
Most code already uses consts for the struct kernel_param_ops, sweep the kernel for the last offending stragglers. Other than include/linux/moduleparam.h and kernel/params.c all other changes were generated with the following Coccinelle SmPL patch. Merge conflicts between trees can be handled with Coccinelle. In the future git could get Coccinelle merge support to deal with patch --> fail --> grammar --> Coccinelle --> new patch conflicts automatically for us on patches where the grammar is available and the patch is of high confidence. Consider this a feature request. Test compiled on x86_64 against: * allnoconfig * allmodconfig * allyesconfig @ const_found @ identifier ops; @@ const struct kernel_param_ops ops = { }; @ const_not_found depends on !const_found @ identifier ops; @@ -struct kernel_param_ops ops = { +const struct kernel_param_ops ops = { }; Generated-by: Coccinelle SmPL Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Junio C Hamano <gitster@pobox.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Kees Cook <keescook@chromium.org> Cc: Tejun Heo <tj@kernel.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: cocci@systeme.lip6.fr Cc: linux-kernel@vger.kernel.org Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2015-01-16null_blk: suppress invalid partition infoJens Axboe
null_blk is partitionable, but it doesn't store any of the info. When it is loaded, you would normally see: [1226739.343608] nullb0: unknown partition table [1226739.343746] nullb1: unknown partition table which can confuse some people. Add the appropriate gendisk flag to suppress this info. Signed-off-by: Jens Axboe <axboe@fb.com>
2015-01-02block: fix checking return value of blk_mq_init_queueMing Lei
Check IS_ERR_OR_NULL(return value) instead of just return value. Signed-off-by: Ming Lei <ming.lei@canonical.com> Reduced to IS_ERR() by me, we never return NULL. Signed-off-by: Jens Axboe <axboe@fb.com>
2014-11-26null_blk: boundary check queue_mode and irqmodeMatias Bjorling
When either queue_mode or irq_mode parameter is set outside its boundaries, the driver will not complete requests. This stalls driver initialization when partitions are probed. Fix by setting out of bound values to the parameters default. Signed-off-by: Matias Bjørling <m@bjorling.me> Updated by me to have the parse+check in just one function. Signed-off-by: Jens Axboe <axboe@fb.com>
2014-11-18Merge branch 'master' into for-3.19/driversJens Axboe
2014-10-29blk-mq: add a 'list' parameter to ->queue_rq()Jens Axboe
Since we have the notion of a 'last' request in a chain, we can use this to have the hardware optimize the issuing of requests. Add a list_head parameter to queue_rq that the driver can use to temporarily store hw commands for issue when 'last' is true. If we are doing a chain of requests, pass in a NULL list for the first request to force issue of that immediately, then batch the remainder for deferred issue until the last request has been sent. Instead of adding yet another argument to the hot ->queue_rq path, encapsulate the passed arguments in a blk_mq_queue_data structure. This is passed as a constant, and has been tested as faster than passing 4 (or even 3) args through ->queue_rq. Update drivers for the new ->queue_rq() prototype. There are no functional changes in this patch for drivers - if they don't use the passed in list, then they will just queue requests individually like before. Signed-off-by: Jens Axboe <axboe@fb.com>
2014-10-22null_blk: Cleanup error recovery in null_add_dev()Jan Kara
When creation of queues fails in init_driver_queues(), we free the queues. But null_add_dev() doesn't test for this failure and continues with the setup leading to strange consequences, likely oops. Fix the problem by testing whether init_driver_queues() failed and do proper error cleanup. Coverity-id: 1148005 Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Jens Axboe <axboe@fb.com>
2014-10-18Merge branch 'for-3.18/drivers' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block layer driver update from Jens Axboe: "This is the block driver pull request for 3.18. Not a lot in there this round, and nothing earth shattering. - A round of drbd fixes from the linbit team, and an improvement in asender performance. - Removal of deprecated (and unused) IRQF_DISABLED flag in rsxx and hd from Michael Opdenacker. - Disable entropy collection from flash devices by default, from Mike Snitzer. - A small collection of xen blkfront/back fixes from Roger Pau Monné and Vitaly Kuznetsov" * 'for-3.18/drivers' of git://git.kernel.dk/linux-block: block: disable entropy contributions for nonrot devices xen, blkfront: factor out flush-related checks from do_blkif_request() xen-blkback: fix leak on grant map error path xen/blkback: unmap all persistent grants when frontend gets disconnected rsxx: Remove deprecated IRQF_DISABLED block: hd: remove deprecated IRQF_DISABLED drbd: use RB_DECLARE_CALLBACKS() to define augment callbacks drbd: compute the end before rb_insert_augmented() drbd: Add missing newline in resync progress display in /proc/drbd drbd: reduce lock contention in drbd_worker drbd: Improve asender performance drbd: Get rid of the WORK_PENDING macro drbd: Get rid of the __no_warn and __cond_lock macros drbd: Avoid inconsistent locking warning drbd: Remove superfluous newline from "resync_extents" debugfs entry. drbd: Use consistent names for all the bi_end_io callbacks drbd: Use better variable names
2014-10-04block: disable entropy contributions for nonrot devicesMike Snitzer
Clear QUEUE_FLAG_ADD_RANDOM in all block drivers that set QUEUE_FLAG_NONROT. Historically, all block devices have automatically made entropy contributions. But as previously stated in commit e2e1a148 ("block: add sysfs knob for turning off disk entropy contributions"): - On SSD disks, the completion times aren't as random as they are for rotational drives. So it's questionable whether they should contribute to the random pool in the first place. - Calling add_disk_randomness() has a lot of overhead. There are more reliable sources for randomness than non-rotational block devices. From a security perspective it is better to err on the side of caution than to allow entropy contributions from unreliable "random" sources. Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2014-09-22blk-mq: rename blk_mq_end_io to blk_mq_end_requestChristoph Hellwig
Now that we've changed the driver API on the submission side use the opportunity to fix up the name on the completion side to fit into the general scheme. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
2014-09-22blk-mq: call blk_mq_start_request from ->queue_rqChristoph Hellwig
When we call blk_mq_start_request from the core blk-mq code before calling into ->queue_rq there is a racy window where the timeout handler can hit before we've fully set up the driver specific part of the command. Move the call to blk_mq_start_request into the driver so the driver can start the request only once it is fully set up. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
2014-09-22blk-mq: remove REQ_ENDChristoph Hellwig
Pass an explicit parameter for the last request in a batch to ->queue_rq instead of using a request flag. Besides being a cleaner and non-stateful interface this is also required for the next patch, which fixes the blk-mq I/O submission code to not start a time too early. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
2014-09-02blk-mq: pass along blk_mq_alloc_tag_set return valuesRobert Elliott
Two of the blk-mq based drivers do not pass back the return value from blk_mq_alloc_tag_set, instead just returning -ENOMEM. blk_mq_alloc_tag_set returns -EINVAL if the number of queues or queue depth is bad. -ENOMEM implies that retrying after freeing some memory might be more successful, but that won't ever change in the -EINVAL cases. Change the null_blk and mtip32xx drivers to pass along the return value. Signed-off-by: Robert Elliott <elliott@hp.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2014-06-16null_blk: fix softirq completions for queue_mode == 1Jens Axboe
Only blk-mq completions have payload attached to the request, for request_fn mode we have stored it in req->special. This fixes an oops with queue_mode=1 and softirq completions. Signed-off-by: Jens Axboe <axboe@fb.com>
2014-06-11null_blk: fix name and description of 'queue_mode' module parameterMike Snitzer
'use_mq' is not the name of the module parameter, 'queue_mode' is. Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2014-05-28Merge branch 'for-3.16/core' into for-3.16/driversJens Axboe
Pull in core changes (again), since we got rid of the alloc/free hctx mq_ops hooks and mtip32xx then needed updating again. Signed-off-by: Jens Axboe <axboe@fb.com>
2014-05-28blk-mq: remove alloc_hctx and free_hctx methodsChristoph Hellwig
There is no need for drivers to control hardware context allocation now that we do the context to node mapping in common code. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
2014-05-28Merge branch 'for-3.16/core' into for-3.16/driversJens Axboe
mtip32xx uses blk_mq_alloc_reserved_request(), so pull in the core changes so we have a properly merged end result. Signed-off-by: Jens Axboe <axboe@fb.com>
2014-05-27blk-mq: pass in suggested NUMA node to ->alloc_hctx()Jens Axboe
Drivers currently have to figure this out on their own, and they are missing information to do it properly. The ones that did attempt to do it, do it wrong. So just pass in the suggested node directly to the alloc function. Signed-off-by: Jens Axboe <axboe@fb.com>
2014-05-01block: null_blk: fix use after freeMing Lei
entry(cmd->ll_list) may belong to new request once end_cmd() returns, so fix the bug with the patch. Without the change, it is easy to observe oops when doing null_blk(timer) test. Signed-off-by: Ming Lei <tom.leiming@gmail.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2014-04-15blk-mq: split out tag initialization, support shared tagsChristoph Hellwig
Add a new blk_mq_tag_set structure that gets set up before we initialize the queue. A single blk_mq_tag_set structure can be shared by multiple queues. Signed-off-by: Christoph Hellwig <hch@lst.de> Modular export of blk_mq_{alloc,free}_tagset added by me. Signed-off-by: Jens Axboe <axboe@fb.com>
2014-04-15blk-mq: do not initialize req->specialChristoph Hellwig
Drivers can reach their private data easily using the blk_mq_rq_to_pdu helper and don't need req->special. By not initializing it code can be simplified nicely, and we also shave off a few more instructions from the I/O path. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
2014-02-10null_blk: use blk_complete_request and blk_mq_complete_requestChristoph Hellwig
Use the block layer helpers for CPU-local completions instead of reimplementing them locally. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
2014-02-07block/null_blk: Fix completion processing from LIFO to FIFOShlomo Pongratz
The completion queue is implemented using lockless list. The llist_add is adds the events to the list head which is a push operation. The processing of the completion elements is done by disconnecting all the pushed elements and iterating over the disconnected list. The problem is that the processing is done in reverse order w.r.t order of the insertion i.e. LIFO processing. By reversing the disconnected list which is done in linear time the desired FIFO processing is achieved. Signed-off-by: Shlomo Pongratz <shlomop@mellanox.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2014-01-30Merge branch 'for-3.14/drivers' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block IO driver changes from Jens Axboe: - bcache update from Kent Overstreet. - two bcache fixes from Nicholas Swenson. - cciss pci init error fix from Andrew. - underflow fix in the parallel IDE pg_write code from Dan Carpenter. I'm sure the 1 (or 0) users of that are now happy. - two PCI related fixes for sx8 from Jingoo Han. - floppy init fix for first block read from Jiri Kosina. - pktcdvd error return miss fix from Julia Lawall. - removal of IRQF_SHARED from the SEGA Dreamcast CD-ROM code from Michael Opdenacker. - comment typo fix for the loop driver from Olaf Hering. - potential oops fix for null_blk from Raghavendra K T. - two fixes from Sam Bradshaw (Micron) for the mtip32xx driver, fixing an OOM problem and a problem with handling security locked conditions * 'for-3.14/drivers' of git://git.kernel.dk/linux-block: (47 commits) mg_disk: Spelling s/finised/finished/ null_blk: Null pointer deference problem in alloc_page_buffers mtip32xx: Correctly handle security locked condition mtip32xx: Make SGL container per-command to eliminate high order dma allocation drivers/block/loop.c: fix comment typo in loop_config_discard drivers/block/cciss.c:cciss_init_one(): use proper errnos drivers/block/paride/pg.c: underflow bug in pg_write() drivers/block/sx8.c: remove unnecessary pci_set_drvdata() drivers/block/sx8.c: use module_pci_driver() floppy: bail out in open() if drive is not responding to block0 read bcache: Fix auxiliary search trees for key size > cacheline size bcache: Don't return -EINTR when insert finished bcache: Improve bucket_prio() calculation bcache: Add bch_bkey_equal_header() bcache: update bch_bkey_try_merge bcache: Move insert_fixup() to btree_keys_ops bcache: Convert sorting to btree_keys bcache: Convert debug code to btree_keys bcache: Convert btree_iter to struct btree_keys bcache: Refactor bset_tree sysfs stats ...
2014-01-21null_blk: Null pointer deference problem in alloc_page_buffersRaghavendra K T
If we load the null_blk module with bs=8k we get following oops: [ 3819.812190] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008 [ 3819.812387] IP: [<ffffffff81170aa5>] create_empty_buffers+0x28/0xaf [ 3819.812527] PGD 219244067 PUD 215a06067 PMD 0 [ 3819.812640] Oops: 0000 [#1] SMP [ 3819.812772] Modules linked in: null_blk(+) Fix that by resetting block size to PAGE_SIZE if it is greater than PAGE_SIZE Reported-by: Sumanth <sumantk2@linux.vnet.ibm.com> Signed-off-by: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com> Reviewed-by: Matias Bjorling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2014-01-12block: null_blk: fix queue leak inside removing deviceMing Lei
When queue_mode is NULL_Q_MQ and null_blk is being removed, blk_cleanup_queue() isn't called to cleanup queue, so the queue allocated won't be freed. This patch calls blk_cleanup_queue() for MQ to drain all pending requests first and release the reference counter of queue kobject, then blk_mq_free_queue() will be called in queue kobject's release handler when queue kobject's reference counter drops to zero. Signed-off-by: Ming Lei <tom.leiming@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-12-24Merge branch 'for-linus' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block fixes from Jens Axboe: - fix for a memory leak on certain unplug events - a collection of bcache fixes from Kent and Nicolas - a few null_blk fixes and updates form Matias - a marking of static of functions in the stec pci-e driver * 'for-linus' of git://git.kernel.dk/linux-block: null_blk: support submit_queues on use_per_node_hctx null_blk: set use_per_node_hctx param to false null_blk: corrections to documentation null_blk: warning on ignored submit_queues param null_blk: refactor init and init errors code paths null_blk: documentation null_blk: mem garbage on NUMA systems during init drivers: block: Mark the functions as static in skd_main.c bcache: New writeback PD controller bcache: bugfix for race between moving_gc and bucket_invalidate bcache: fix for gc and writeback race bcache: bugfix - moving_gc now moves only correct buckets bcache: fix for gc crashing when no sectors are used bcache: Fix heap_peek() macro bcache: Fix for can_attach_cache() bcache: Fix dirty_data accounting bcache: Use uninterruptible sleep in writeback bcache: kthread don't set writeback task to INTERUPTIBLE block: fix memory leaks on unplugging block device bcache: fix sparse non static symbol warning
2013-12-21null_blk: support submit_queues on use_per_node_hctxMatias Bjørling
In the case of both the submit_queues param and use_per_node_hctx param are used. We limit the number af submit_queues to the number of online nodes. If the submit_queues is a multiple of nr_online_nodes, its trivial. Simply map them to the nodes. For example: 8 submit queues are mapped as node0[0,1], node1[2,3], ... If uneven, we are left with an uneven number of submit_queues that must be mapped. These are mapped toward the first node and onward. E.g. 5 submit queues mapped onto 4 nodes are mapped as node0[0,1], node1[2], ... Signed-off-by: Matias Bjorling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-12-21null_blk: set use_per_node_hctx param to falseMatias Bjørling
The defaults for the module is to instantiate itself with blk-mq and a submit queue for each CPU node in the system. To save resources, initialize instead with a single submit queue. Signed-off-by: Matias Bjorling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-12-19null_blk: warning on ignored submit_queues paramMatias Bjorling
Let the user know when the number of submission queues are being ignored. Signed-off-by: Matias Bjorling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-12-19null_blk: refactor init and init errors code pathsMatias Bjorling
Simplify the initialization logic of the three block-layers. - The queue initialization is split into two parts. This allows reuse of code when initializing the sq-, bio- and mq-based layers. - Set submit_queues default value to 0 and always set it at init time. - Simplify the init error code paths. Signed-off-by: Matias Bjorling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-12-19null_blk: mem garbage on NUMA systems during initMatias Bjorling
For NUMA systems, initializing the blk-mq layer and using per node hctx. We initialize submit queues to 1, while blk-mq nr_hw_queues is initialized to the number of NUMA nodes. This makes the null_init_hctx function overwrite memory outside of what it allocated. In my case it lead to writing garbage into struct request_queue's mq_map. Signed-off-by: Matias Bjorling <m@bjorling.me> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-12-15null_blk: mem garbage on NUMA systems during initMatias Bjorling
For NUMA systems, initializing the blk-mq layer and using per node hctx. We initialize submit queues to 1, while blk-mq nr_hw_queues is initialized to the number of NUMA nodes. This makes the null_init_hctx function overwrite memory outside of what it allocated. In my case it lead to writing garbage into struct request_queue's mq_map. Signed-off-by: Matias Bjorling <m@bjorling.me> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-11-21kernel: remove CONFIG_USE_GENERIC_SMP_HELPERS cleanlyYuanhan Liu
Remove CONFIG_USE_GENERIC_SMP_HELPERS left by commit 0a06ff068f12 ("kernel: remove CONFIG_USE_GENERIC_SMP_HELPERS"). Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com> Cc: Christoph Hellwig <hch@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-10-25null_blk: multi queue aware block test driverJens Axboe
A driver that simply completes IO it receives, it does no transfers. Written to fascilitate testing of the blk-mq code. It supports various module options to use either bio queueing, rq queueing, or mq mode. Signed-off-by: Jens Axboe <axboe@kernel.dk>