summaryrefslogtreecommitdiff
path: root/drivers/scsi/sg.c
AgeCommit message (Collapse)Author
2014-08-28block,scsi: fixup blk_get_request dead queue scenariosJoe Lawrence
The blk_get_request function may fail in low-memory conditions or during device removal (even if __GFP_WAIT is set). To distinguish between these errors, modify the blk_get_request call stack to return the appropriate ERR_PTR. Verify that all callers check the return status and consider IS_ERR instead of a simple NULL pointer check. For consistency, make a similar change to the blk_mq_alloc_request leg of blk_get_request. It may fail if the queue is dead, or the caller was unwilling to wait. Signed-off-by: Joe Lawrence <joe.lawrence@stratus.com> Acked-by: Jiri Kosina <jkosina@suse.cz> [for pktdvd] Acked-by: Boaz Harrosh <bharrosh@panasas.com> [for osd] Reviewed-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2014-07-25scsi: convert device_busy to atomic_tChristoph Hellwig
Avoid taking the queue_lock to check the per-device queue limit. Instead we do an atomic_inc_return early on to grab our slot in the queue, and if necessary decrement it after finishing all checks. Unlike the host and target busy counters this doesn't allow us to avoid the queue_lock in the request_fn due to the way the interface works, but it'll allow us to prepare for using the blk-mq code, which doesn't use the queue_lock at all, and it at least avoids a queue_lock round trip in scsi_device_unbusy, which is still important given how busy the queue_lock is. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Webb Scales <webbnh@hp.com> Acked-by: Jens Axboe <axboe@kernel.dk> Tested-by: Bart Van Assche <bvanassche@acm.org> Tested-by: Robert Elliott <elliott@hp.com>
2014-07-17scsi: Implement sg_printk()Hannes Reinecke
Update the sg driver to use dev_printk() variants instead of plain printk(); this will prefix logging messages with the appropriate device. Signed-off-by: Hannes Reinecke <hare@suse.de> Acked-by: Doug Gilbert <dgilbert@interlog.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christoph Hellwig <hch@lst.de>
2014-07-17scsi: use 64-bit LUNsHannes Reinecke
The SCSI standard defines 64-bit values for LUNs, and large arrays employing large or hierarchical LUN numbers become more and more common. So update the linux SCSI stack to use 64-bit LUN numbers. Signed-off-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Christoph Hellwig <hch@infradead.org> Reviewed-by: Ewan Milne <emilne@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2014-07-17sg: O_EXCL and other lock handlingDouglas Gilbert
This addresses a problem reported by Vaughan Cao concerning the correctness of the O_EXCL logic in the sg driver. POSIX doesn't defined O_EXCL semantics on devices but "allow only one open file descriptor at a time per sg device" is a rough definition. The sg driver's semantics have been to wait on an open() when O_NONBLOCK is not given and there are O_EXCL headwinds. Nasty things can happen during that wait such as the device being detached (removed). So multiple locks are reworked in this patch making it large and hard to break down into digestible bits. This patch is against Linus's current git repository which doesn't include any sg patches sent in the last few weeks. Hence this patch touches as little as possible that it doesn't need to and strips out most SCSI_LOG_TIMEOUT() changes in v3 because Hannes said he was going to rework all that stuff. The sg3_utils package has several test programs written to test this patch. See examples/sg_tst_excl*.cpp . Not all the locks and flags in sg have been re-worked in this patch, notably sg_request::done . That can wait for a follow-up patch if this one meets with approval. Signed-off-by: Douglas Gilbert <dgilbert@interlog.com> Reviewed-by: Hannes Reinecke <hare@suse.de>
2014-07-17sg: add SG_FLAG_Q_AT_TAIL flagDouglas Gilbert
When the SG_IO ioctl was copied into the block layer and later into the bsg driver, subtle differences emerged. One difference is the way injected commands are queued through the block layer (i.e. this is not SCSI device queueing nor SATA NCQ). Summarizing: - SG_IO in the block layer: blk_exec*(at_head=false) - sg SG_IO: at_head=true - bsg SG_IO: at_head=true Some time ago Boaz Harrosh introduced a sg v4 flag called BSG_FLAG_Q_AT_TAIL to override the bsg driver default. This patch does the equivalent for the sg driver. ChangeLog: Introduce SG_FLAG_Q_AT_TAIL flag to cause commands to be injected into the block layer with at_head=false. Signed-off-by: Douglas Gilbert <dgilbert@interlog.com> Reviewed-by: Mike Christie <michaelc@cs.wisc.edu> Reviewed-by: Ewan D. Milne <emilne@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2014-07-17sg: relax 16 byte cdb restrictionDouglas Gilbert
- remove the 16 byte CDB (SCSI command) length limit from the sg driver by handling longer CDBs the same way as the bsg driver. Remove comment from sg.h public interface about the cmd_len field being limited to 16 bytes. - remove some dead code caused by this change - cleanup comment block at the top of sg.h, fix urls Signed-off-by: Douglas Gilbert <dgilbert@interlog.com> Reviewed-by: Mike Christie <michaelc@cs.wisc.edu> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de>
2014-07-17sg: prevent integer overflow when converting from sectors to bytesAkinobu Mita
This prevents integer overflow when converting the request queue's max_sectors from sectors to bytes. However, this is a preparation for extending the data type of max_sectors in struct Scsi_Host and scsi_host_template. So, it is impossible to happen this integer overflow for now, because SCSI low-level drivers can not specify max_sectors greater than 0xffff due to the data type limitation. Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Acked by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2014-06-06block: add blk_rq_set_block_pc()Jens Axboe
With the optimizations around not clearing the full request at alloc time, we are leaving some of the needed init for REQ_TYPE_BLOCK_PC up to the user allocating the request. Add a blk_rq_set_block_pc() that sets the command type to REQ_TYPE_BLOCK_PC, and properly initializes the members associated with this type of request. Update callers to use this function instead of manipulating rq->cmd_type directly. Includes fixes from Christoph Hellwig <hch@lst.de> for my half-assed attempt. Signed-off-by: Jens Axboe <axboe@fb.com>
2013-10-25[SCSI] Revert "sg: use rwsem to solve race during exclusive open"James Bottomley
This reverts commit 15b06f9a02406e5460001db6d5af5c738cd3d4e7. This is one of four patches that was causing this bug [ 205.372823] ================================================ [ 205.372901] [ BUG: lock held when returning to user space! ] [ 205.372979] 3.12.0-rc6-hw-debug-pagealloc+ #67 Not tainted [ 205.373055] ------------------------------------------------ [ 205.373132] megarc.bin/5283 is leaving the kernel with locks still held! [ 205.373212] 1 lock held by megarc.bin/5283: [ 205.373285] #0: (&sdp->o_sem){.+.+..}, at: [<ffffffff8161e650>] sg_open+0x3a0/0x4d0 Cc: Vaughan Cao <vaughan.cao@oracle.com> Acked-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2013-10-25[SCSI] Revert "sg: no need sg_open_exclusive_lock"James Bottomley
This reverts commit 00b2d9d6d05b56fc1d77071ff8ccbd2c65b48dec. This is one of four patches that was causing this bug [ 205.372823] ================================================ [ 205.372901] [ BUG: lock held when returning to user space! ] [ 205.372979] 3.12.0-rc6-hw-debug-pagealloc+ #67 Not tainted [ 205.373055] ------------------------------------------------ [ 205.373132] megarc.bin/5283 is leaving the kernel with locks still held! [ 205.373212] 1 lock held by megarc.bin/5283: [ 205.373285] #0: (&sdp->o_sem){.+.+..}, at: [<ffffffff8161e650>] sg_open+0x3a0/0x4d0 Cc: Vaughan Cao <vaughan.cao@oracle.com> Acked-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2013-10-25[SCSI] Revert "sg: checking sdp->detached isn't protected when open"James Bottomley
This reverts commit e32c9e6300e3af659cbfe45e90a1e7dcd3572ada. This is one of four patches that was causing this bug [ 205.372823] ================================================ [ 205.372901] [ BUG: lock held when returning to user space! ] [ 205.372979] 3.12.0-rc6-hw-debug-pagealloc+ #67 Not tainted [ 205.373055] ------------------------------------------------ [ 205.373132] megarc.bin/5283 is leaving the kernel with locks still held! [ 205.373212] 1 lock held by megarc.bin/5283: [ 205.373285] #0: (&sdp->o_sem){.+.+..}, at: [<ffffffff8161e650>] sg_open+0x3a0/0x4d0 Cc: Vaughan Cao <vaughan.cao@oracle.com> Acked-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2013-10-25[SCSI] Revert "sg: push file descriptor list locking down to per-device locking"James Bottomley
This reverts commit 1f962ebcdfa15cede59e9edb299d1330949eec92. This is one of four patches that was causing this bug [ 205.372823] ================================================ [ 205.372901] [ BUG: lock held when returning to user space! ] [ 205.372979] 3.12.0-rc6-hw-debug-pagealloc+ #67 Not tainted [ 205.373055] ------------------------------------------------ [ 205.373132] megarc.bin/5283 is leaving the kernel with locks still held! [ 205.373212] 1 lock held by megarc.bin/5283: [ 205.373285] #0: (&sdp->o_sem){.+.+..}, at: [<ffffffff8161e650>] sg_open+0x3a0/0x4d0 Cc: Vaughan Cao <vaughan.cao@oracle.com> Acked-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2013-09-03[SCSI] sg: push file descriptor list locking down to per-device lockingVaughan Cao
Push file descriptor list locking down to per-device locking. Let sg_index_lock only protect device lookup. sdp->detached is also set and checked with this lock held. Signed-off-by: Vaughan Cao <vaughan.cao@oracle.com> Acked-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2013-09-03[SCSI] sg: checking sdp->detached isn't protected when openVaughan Cao
@detached is set under the protection of sg_index_lock. Without getting the lock, new sfp will be added during sg removal and there is no chance for it to be picked out. So check with sg_index_lock held in sg_add_sfp(). Signed-off-by: Vaughan Cao <vaughan.cao@oracle.com> Acked-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2013-09-03[SCSI] sg: no need sg_open_exclusive_lockVaughan Cao
Open exclusive check is protected by o_sem, no need sg_open_exclusive_lock. @exclude is used to record which type of rwsem we are holding. Signed-off-by: Vaughan Cao <vaughan.cao@oracle.com> Acked-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2013-09-03[SCSI] sg: use rwsem to solve race during exclusive openVaughan Cao
A race condition may happen if two threads are both trying to open the same sg with O_EXCL simultaneously. It's possible that they both find fsds list is empty and get_exclude(sdp) returns 0, then they both call set_exclude() and break out from wait_event_interruptible and resume open. Now use rwsem to protect this process. Exclusive open gets write lock and others get read lock. The lock will be held until file descriptor is closed. This also leads 'exclude' only a status rather than a check mark. Signed-off-by: Vaughan Cao <vaughan.cao@oracle.com> Acked-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2013-05-07aio: don't include aio.h in sched.hKent Overstreet
Faster kernel compiles by way of fewer unnecessary includes. [akpm@linux-foundation.org: fix fallout] [akpm@linux-foundation.org: fix build] Signed-off-by: Kent Overstreet <koverstreet@google.com> Cc: Zach Brown <zab@redhat.com> Cc: Felipe Balbi <balbi@ti.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Jens Axboe <axboe@kernel.dk> Cc: Asai Thambi S P <asamymuthupa@micron.com> Cc: Selvan Mani <smani@micron.com> Cc: Sam Bradshaw <sbradshaw@micron.com> Cc: Jeff Moyer <jmoyer@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Benjamin LaHaise <bcrl@kvack.org> Reviewed-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-27scsi: convert to idr_alloc()Tejun Heo
Convert to the much saner new idr interface. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-10-09mm: kill vma flag VM_RESERVED and mm->reserved_vm counterKonstantin Khlebnikov
A long time ago, in v2.4, VM_RESERVED kept swapout process off VMA, currently it lost original meaning but still has some effects: | effect | alternative flags -+------------------------+--------------------------------------------- 1| account as reserved_vm | VM_IO 2| skip in core dump | VM_IO, VM_DONTDUMP 3| do not merge or expand | VM_IO, VM_DONTEXPAND, VM_HUGETLB, VM_PFNMAP 4| do not mlock | VM_IO, VM_DONTEXPAND, VM_HUGETLB, VM_PFNMAP This patch removes reserved_vm counter from mm_struct. Seems like nobody cares about it, it does not exported into userspace directly, it only reduces total_vm showed in proc. Thus VM_RESERVED can be replaced with VM_IO or pair VM_DONTEXPAND | VM_DONTDUMP. remap_pfn_range() and io_remap_pfn_range() set VM_IO|VM_DONTEXPAND|VM_DONTDUMP. remap_vmalloc_range() set VM_DONTEXPAND | VM_DONTDUMP. [akpm@linux-foundation.org: drivers/vfio/pci/vfio_pci.c fixup] Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Carsten Otte <cotte@de.ibm.com> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Eric Paris <eparis@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Hugh Dickins <hughd@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Morris <james.l.morris@oracle.com> Cc: Jason Baron <jbaron@redhat.com> Cc: Kentaro Takeda <takedakn@nttdata.co.jp> Cc: Matt Helsley <matthltc@us.ibm.com> Cc: Nick Piggin <npiggin@kernel.dk> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Robert Richter <robert.richter@amd.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: Venkatesh Pallipadi <venki@google.com> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-17[SCSI] sg: constify sg_proc_leaf_arrJörn Engel
Signed-off-by: Joern Engel <joern@logfs.org> Acked-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-05-17[SCSI] sg: remove sg_mutexJörn Engel
With the exception of the detached field, sg_mutex no longer adds any locking. detached handling has been broken before and is still broken and this patch does not seem to make things worse than they were to begin with. However, I have observed cases of tasks being blocked for >200s waiting for sg_mutex. So the removal clearly adds value for very little cost. Signed-off-by: Joern Engel <joern@logfs.org> Acked-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-05-17[SCSI] sg: completely protect sfdsJörn Engel
sfds is protected by sg_index_lock - except for sg_open(), where it isn't. Change that and add some documentation. Signed-off-by: Joern Engel <joern@logfs.org> Acked-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-05-17[SCSI] sg: protect sdp->excludeJörn Engel
Changes since v1: set_exclude now returns the new value, which gets rid of the comma expression and the operator precedence bug. Thanks to Douglas for spotting it. sdp->exclude was previously protected by the BKL. The sg_mutex, which replaced the BKL, only semi-protected it, as it was missing from sg_release() and sg_proc_seq_show_debug(). Take an explicit spinlock for it. Signed-off-by: Joern Engel <joern@logfs.org> Acked-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-05-17[SCSI] sg: prevent unwoken sleepJörn Engel
srp->done is protected by sfp->rq_list_lock everywhere, except for this one case. Result can be that the wake-up happens before the cacheline with the changed srp->done has arrived, so the waiter can go back to sleep and never be woken up again. The wait_event_interruptible() means that anyone trying to debug this unlikely race will likely notice everything working fine again, as the next signal will unwedge things. Evil. Signed-off-by: Joern Engel <joern@logfs.org> Acked-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-05-17[SCSI] sg: remove closed flagJörn Engel
After sg_release() has been called, noone should be able to actually use that filedescriptor anymore. So if closed ever made a difference in the past five years or so, it would have meant a bug. Remove it. Signed-off-by: Joern Engel <joern@logfs.org> Acked-by: Douglas Gilbert <dgilbert@interlog.com> [jejb: fix up checkpatch warnings] Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-05-17[SCSI] sg: use wait_event_interruptible()Jörn Engel
Afaics the use of __wait_event_interruptible() as opposed to wait_event_interruptible() is purely historic. So let's follow the rest of the kernel and check the condition before prepare_to_wait() - and also make the code a bit nicer. Signed-off-by: Joern Engel <joern@logfs.org> Acked-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-05-17[SCSI] sg: remove while (1) non-loopJörn Engel
The while (1) construct isn't actually a loop at all. So let's not pretent and obfuscate the code. Signed-off-by: Joern Engel <joern@logfs.org> Acked-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-05-17[SCSI] sg: remove unnecessary indentationJörn Engel
blocking is de-facto a constant and the now-removed comment wasn't all that useful either. Without them and the resulting indentation the code is a bit nicer to read. Signed-off-by: Joern Engel <joern@logfs.org> Acked-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-01-16[SCSI] sg: convert to kstrtoul_from_user()Stephen Boyd
Instead of open coding this function use kstrtoul_from_user() directly. Signed-off-by: Stephen Boyd <sboyd@codeaurora.org> Acked-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-01-03switch procfs to umode_t useAl Viro
both proc_dir_entry ->mode and populating functions Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-09-15scsi/sg: use printk_ratelimited instead of printk_ratelimitChristian Dietrich
Since printk_ratelimit() shouldn't be used anymore (see comment in include/linux/printk.h), replace it with printk_ratelimited. Signed-off-by: Christian Dietrich <christian.dietrich@informatik.uni-erlangen.de> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2010-10-24Merge branch 'for-next' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (39 commits) Update broken web addresses in arch directory. Update broken web addresses in the kernel. Revert "drivers/usb: Remove unnecessary return's from void functions" for musb gadget Revert "Fix typo: configuation => configuration" partially ida: document IDA_BITMAP_LONGS calculation ext2: fix a typo on comment in ext2/inode.c drivers/scsi: Remove unnecessary casts of private_data drivers/s390: Remove unnecessary casts of private_data net/sunrpc/rpc_pipe.c: Remove unnecessary casts of private_data drivers/infiniband: Remove unnecessary casts of private_data drivers/gpu/drm: Remove unnecessary casts of private_data kernel/pm_qos_params.c: Remove unnecessary casts of private_data fs/ecryptfs: Remove unnecessary casts of private_data fs/seq_file.c: Remove unnecessary casts of private_data arm: uengine.c: remove C99 comments arm: scoop.c: remove C99 comments Fix typo configue => configure in comments Fix typo: configuation => configuration Fix typo interrest[ing|ed] => interest[ing|ed] Fix various typos of valid in comments ... Fix up trivial conflicts in: drivers/char/ipmi/ipmi_si_intf.c drivers/usb/gadget/rndis.c net/irda/irnet/irnet_ppp.c
2010-10-22Merge branch 'for-2.6.37/core' of git://git.kernel.dk/linux-2.6-blockLinus Torvalds
* 'for-2.6.37/core' of git://git.kernel.dk/linux-2.6-block: (39 commits) cfq-iosched: Fix a gcc 4.5 warning and put some comments block: Turn bvec_k{un,}map_irq() into static inline functions block: fix accounting bug on cross partition merges block: Make the integrity mapped property a bio flag block: Fix double free in blk_integrity_unregister block: Ensure physical block size is unsigned int blkio-throttle: Fix possible multiplication overflow in iops calculations blkio-throttle: limit max iops value to UINT_MAX blkio-throttle: There is no need to convert jiffies to milli seconds blkio-throttle: Fix link failure failure on i386 blkio: Recalculate the throttled bio dispatch time upon throttle limit change blkio: Add root group to td->tg_list blkio: deletion of a cgroup was causes oops blkio: Do not export throttle files if CONFIG_BLK_DEV_THROTTLING=n block: set the bounce_pfn to the actual DMA limit rather than to max memory block: revert bad fix for memory hotplug causing bounces Fix compile error in blk-exec.c for !CONFIG_DETECT_HUNG_TASK block: set the bounce_pfn to the actual DMA limit rather than to max memory block: Prevent hang_check firing during long I/O cfq: improve fsync performance for small files ... Fix up trivial conflicts due to __rcu sparse annotation in include/linux/genhd.h
2010-10-22Merge branch 'llseek' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bklLinus Torvalds
* 'llseek' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl: vfs: make no_llseek the default vfs: don't use BKL in default_llseek llseek: automatically add .llseek fop libfs: use generic_file_llseek for simple_attr mac80211: disallow seeks in minstrel debug code lirc: make chardev nonseekable viotape: use noop_llseek raw: use explicit llseek file operations ibmasmfs: use generic_file_llseek spufs: use llseek in all file operations arm/omap: use generic_file_llseek in iommu_debug lkdtm: use generic_file_llseek in debugfs net/wireless: use generic_file_llseek in debugfs drm: use noop_llseek
2010-10-15llseek: automatically add .llseek fopArnd Bergmann
All file_operations should get a .llseek operation so we can make nonseekable_open the default for future file operations without a .llseek pointer. The three cases that we can automatically detect are no_llseek, seq_lseek and default_llseek. For cases where we can we can automatically prove that the file offset is always ignored, we use noop_llseek, which maintains the current behavior of not returning an error from a seek. New drivers should normally not use noop_llseek but instead use no_llseek and call nonseekable_open at open time. Existing drivers can be converted to do the same when the maintainer knows for certain that no user code relies on calling seek on the device file. The generated code is often incorrectly indented and right now contains comments that clarify for each added line why a specific variant was chosen. In the version that gets submitted upstream, the comments will be gone and I will manually fix the indentation, because there does not seem to be a way to do that using coccinelle. Some amount of new code is currently sitting in linux-next that should get the same modifications, which I will do at the end of the merge window. Many thanks to Julia Lawall for helping me learn to write a semantic patch that does all this. ===== begin semantic patch ===== // This adds an llseek= method to all file operations, // as a preparation for making no_llseek the default. // // The rules are // - use no_llseek explicitly if we do nonseekable_open // - use seq_lseek for sequential files // - use default_llseek if we know we access f_pos // - use noop_llseek if we know we don't access f_pos, // but we still want to allow users to call lseek // @ open1 exists @ identifier nested_open; @@ nested_open(...) { <+... nonseekable_open(...) ...+> } @ open exists@ identifier open_f; identifier i, f; identifier open1.nested_open; @@ int open_f(struct inode *i, struct file *f) { <+... ( nonseekable_open(...) | nested_open(...) ) ...+> } @ read disable optional_qualifier exists @ identifier read_f; identifier f, p, s, off; type ssize_t, size_t, loff_t; expression E; identifier func; @@ ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off) { <+... ( *off = E | *off += E | func(..., off, ...) | E = *off ) ...+> } @ read_no_fpos disable optional_qualifier exists @ identifier read_f; identifier f, p, s, off; type ssize_t, size_t, loff_t; @@ ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off) { ... when != off } @ write @ identifier write_f; identifier f, p, s, off; type ssize_t, size_t, loff_t; expression E; identifier func; @@ ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off) { <+... ( *off = E | *off += E | func(..., off, ...) | E = *off ) ...+> } @ write_no_fpos @ identifier write_f; identifier f, p, s, off; type ssize_t, size_t, loff_t; @@ ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off) { ... when != off } @ fops0 @ identifier fops; @@ struct file_operations fops = { ... }; @ has_llseek depends on fops0 @ identifier fops0.fops; identifier llseek_f; @@ struct file_operations fops = { ... .llseek = llseek_f, ... }; @ has_read depends on fops0 @ identifier fops0.fops; identifier read_f; @@ struct file_operations fops = { ... .read = read_f, ... }; @ has_write depends on fops0 @ identifier fops0.fops; identifier write_f; @@ struct file_operations fops = { ... .write = write_f, ... }; @ has_open depends on fops0 @ identifier fops0.fops; identifier open_f; @@ struct file_operations fops = { ... .open = open_f, ... }; // use no_llseek if we call nonseekable_open //////////////////////////////////////////// @ nonseekable1 depends on !has_llseek && has_open @ identifier fops0.fops; identifier nso ~= "nonseekable_open"; @@ struct file_operations fops = { ... .open = nso, ... +.llseek = no_llseek, /* nonseekable */ }; @ nonseekable2 depends on !has_llseek @ identifier fops0.fops; identifier open.open_f; @@ struct file_operations fops = { ... .open = open_f, ... +.llseek = no_llseek, /* open uses nonseekable */ }; // use seq_lseek for sequential files ///////////////////////////////////// @ seq depends on !has_llseek @ identifier fops0.fops; identifier sr ~= "seq_read"; @@ struct file_operations fops = { ... .read = sr, ... +.llseek = seq_lseek, /* we have seq_read */ }; // use default_llseek if there is a readdir /////////////////////////////////////////// @ fops1 depends on !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier readdir_e; @@ // any other fop is used that changes pos struct file_operations fops = { ... .readdir = readdir_e, ... +.llseek = default_llseek, /* readdir is present */ }; // use default_llseek if at least one of read/write touches f_pos ///////////////////////////////////////////////////////////////// @ fops2 depends on !fops1 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier read.read_f; @@ // read fops use offset struct file_operations fops = { ... .read = read_f, ... +.llseek = default_llseek, /* read accesses f_pos */ }; @ fops3 depends on !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier write.write_f; @@ // write fops use offset struct file_operations fops = { ... .write = write_f, ... + .llseek = default_llseek, /* write accesses f_pos */ }; // Use noop_llseek if neither read nor write accesses f_pos /////////////////////////////////////////////////////////// @ fops4 depends on !fops1 && !fops2 && !fops3 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier read_no_fpos.read_f; identifier write_no_fpos.write_f; @@ // write fops use offset struct file_operations fops = { ... .write = write_f, .read = read_f, ... +.llseek = noop_llseek, /* read and write both use no f_pos */ }; @ depends on has_write && !has_read && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier write_no_fpos.write_f; @@ struct file_operations fops = { ... .write = write_f, ... +.llseek = noop_llseek, /* write uses no f_pos */ }; @ depends on has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier read_no_fpos.read_f; @@ struct file_operations fops = { ... .read = read_f, ... +.llseek = noop_llseek, /* read uses no f_pos */ }; @ depends on !has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; @@ struct file_operations fops = { ... +.llseek = noop_llseek, /* no read or write fn */ }; ===== End semantic patch ===== Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Julia Lawall <julia@diku.dk> Cc: Christoph Hellwig <hch@infradead.org>
2010-09-23drivers/scsi: Remove unnecessary casts of private_dataJoe Perches
Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2010-09-16sg: fix a warning in blk_rq_aligned() callNamhyung Kim
2nd argument of blk_rq_aligned() has changed to 'unsigned long' by the previous commit 'block: fix an address space warning in blk-map.c'. That commit neglected to update a user of that function. Signed-off-by: Namhyung Kim <namhyung@gmail.com> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-09-15scsi: autoconvert trivial BKL users to private mutexArnd Bergmann
All these files use the big kernel lock in a trivial way to serialize their private file operations, typically resulting from an earlier semi-automatic pushdown from VFS. None of these drivers appears to want to lock against other code, and they all use the BKL as the top-level lock in their file operations, meaning that there is no lock-order inversion problem. Consequently, we can remove the BKL completely, replacing it with a per-file mutex in every case. Using a scripted approach means we can avoid typos. file=$1 name=$2 if grep -q lock_kernel ${file} ; then if grep -q 'include.*linux.mutex.h' ${file} ; then sed -i '/include.*<linux\/smp_lock.h>/d' ${file} else sed -i 's/include.*<linux\/smp_lock.h>.*$/include <linux\/mutex.h>/g' ${file} fi sed -i ${file} \ -e "/^#include.*linux.mutex.h/,$ { 1,/^\(static\|int\|long\)/ { /^\(static\|int\|long\)/istatic DEFINE_MUTEX(${name}_mutex); } }" \ -e "s/\(un\)*lock_kernel\>[ ]*()/mutex_\1lock(\&${name}_mutex)/g" \ -e '/[ ]*cycle_kernel_lock();/d' else sed -i -e '/include.*\<smp_lock.h\>/d' ${file} \ -e '/cycle_kernel_lock()/d' fi Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: linux-scsi@vger.kernel.org Cc: "James E.J. Bottomley" <James.Bottomley@suse.de>
2010-08-11drivers/scsi: use memdup_userJulia Lawall
Use memdup_user when user data is immediately copied into the allocated region. The semantic patch that makes this change is as follows: (http://coccinelle.lip6.fr/) // <smpl> @@ expression from,to,size,flag; position p; identifier l1,l2; @@ - to = \(kmalloc@p\|kzalloc@p\)(size,flag); + to = memdup_user(from,size); if ( - to==NULL + IS_ERR(to) || ...) { <+... when != goto l1; - -ENOMEM + PTR_ERR(to) ...+> } - if (copy_from_user(to, from, size) != 0) { - <+... when != goto l2; - -EFAULT - ...+> - } // </smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Cc: Doug Gilbert <dgilbert@interlog.com> Cc: Boaz Harrosh <bharrosh@panasas.com> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-07-28[SCSI] implement runtime Power ManagementAlan Stern
This patch (as1398b) adds runtime PM support to the SCSI layer. Only the machanism is provided; use of it is up to the various high-level drivers, and the patch doesn't change any of them. Except for sg -- the patch expicitly prevents a device from being runtime-suspended while its sg device file is open. The implementation is simplistic. In general, hosts and targets are automatically suspended when all their children are asleep, but for them the runtime-suspend code doesn't actually do anything. (A host's runtime PM status is propagated up the device tree, though, so a runtime-PM-aware lower-level driver could power down the host adapter hardware at the appropriate times.) There are comments indicating where a transport class might be notified or some other hooks added. LUNs are runtime-suspended by calling the drivers' existing suspend handlers (and likewise for runtime-resume). Somewhat arbitrarily, the implementation delays for 100 ms before suspending an eligible LUN. This is because there typically are occasions during bootup when the same device file is opened and closed several times in quick succession. The way this all works is that the SCSI core increments a device's PM-usage count when it is registered. If a high-level driver does nothing then the device will not be eligible for runtime-suspend because of the elevated usage count. If a high-level driver wants to use runtime PM then it can call scsi_autopm_put_device() in its probe routine to decrement the usage count and scsi_autopm_get_device() in its remove routine to restore the original count. Hosts, targets, and LUNs are not suspended while they are being probed or removed, or while the error handler is running. In fact, a fairly large part of the patch consists of code to make sure that things aren't suspended at such times. [jejb: fix up compile issues in PM config variations] Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-07-28[SCSI] sg: fix bio leak with a detached deviceFUJITA Tomonori
After blk_rq_map_user is successful, if we find that a device is unavailable (was detached), we must call blk_end_request_all to free bio(s) before blk_rq_unmap_user and blk_put_request. Reported-by: "Dailey, Nate" <Nate.Dailey@stratus.com> Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Tested-by: "Dailey, Nate" <Nate.Dailey@stratus.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-05-17scsi: Push down BKL into ioctl functionsArnd Bergmann
Push down the bkl into ioctl functions on the scsi layer. [jkacur: Forward declaration missing ';'. Conflicting declaraction in megaraid.h changed Fixed missing inodes declarations] Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: John Kacur <jkacur@redhat.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2010-03-30include cleanup: Update gfp.h and slab.h includes to prepare for breaking ↵Tejun Heo
implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_*.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). * x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
2010-02-26block: Consolidate phys_segment and hw_segment limitsMartin K. Petersen
Except for SCSI no device drivers distinguish between physical and hardware segment limits. Consolidate the two into a single segment limit. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-10-11Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6: (34 commits) [SCSI] qla2xxx: Fix NULL ptr deref bug in fail path during queue create [SCSI] st: fix possible memory use after free after MTSETBLK ioctl [SCSI] be2iscsi: Moving to pci_pools v3 [SCSI] libiscsi: iscsi_session_setup to allow for private space [SCSI] be2iscsi: add 10Gbps iSCSI - BladeEngine 2 driver [SCSI] zfcp: Fix hang when offlining device with offline chpid [SCSI] zfcp: Fix lockdep warning when offlining device with offline chpid [SCSI] zfcp: Fix oops during shutdown of offline device [SCSI] zfcp: Fix initial device and cfdc for delayed adapter allocation [SCSI] zfcp: correctly initialize unchained requests [SCSI] mpt2sas: Bump version 02.100.03.00 [SCSI] mpt2sas: Support dev remove when phy status is MPI2_EVENT_SAS_TOPO_PHYSTATUS_VACANT [SCSI] mpt2sas: Timeout occurred within the HANDSHAKE logic while waiting on firmware to ACK. [SCSI] mpt2sas: Call init_completion on a per request basis. [SCSI] mpt2sas: Target Reset will be issued from Interrupt context. [SCSI] mpt2sas: Added SCSIIO, Internal and high priority memory pools to support multiple TM [SCSI] mpt2sas: Copyright change to 2009. [SCSI] mpt2sas: Added mpi2_history.txt for MPI2 headers. [SCSI] mpt2sas: Update driver to MPI2 REV K headers. [SCSI] bfa: Brocade BFA FC SCSI driver ...
2009-10-02[SCSI] sg: Free data buffers after calling blk_rq_unmap_userChristof Schmitt
Running sg_luns on s390x with CONFIG_DEBUG_PAGEALLOC enabled fails with EFAULT from the SG_IO ioctl. The EFAULT is the result from copy_to_user failing in this call chain: sg_ioctl sg_new_read sg_finish_rem_req blk_rq_unmap_user __blk_rq_unmap_user bio_uncopy_user __bio_copy_iov copy_to_user The sg driver calls sg_remove_scat to free the memory pages before calling blk_rq_unmap_user that tries to copy the data back to userspace. Change the order to first call blk_rq_unmap_user before freeing the pages in sg_remove_scat. Acked-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: stable@kernel.org Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Acked-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2009-10-01const: constify remaining file_operationsAlexey Dobriyan
[akpm@linux-foundation.org: fix KVM] Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Acked-by: Mike Frysinger <vapier@gentoo.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-09-27const: mark struct vm_struct_operationsAlexey Dobriyan
* mark struct vm_area_struct::vm_ops as const * mark vm_ops in AGP code But leave TTM code alone, something is fishy there with global vm_ops being used. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-09-23seq_file: constify seq_operationsJames Morris
Make all seq_operations structs const, to help mitigate against revectoring user-triggerable function pointers. This is derived from the grsecurity patch, although generated from scratch because it's simpler than extracting the changes from there. Signed-off-by: James Morris <jmorris@namei.org> Acked-by: Serge Hallyn <serue@us.ibm.com> Acked-by: Casey Schaufler <casey@schaufler-ca.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>