Age | Commit message (Collapse) | Author |
|
There is no consistency among filesystems from what bios (or requests)
are marked as being metadata. It's interesting to expose this in traces,
but we shouldn't schedule the requests differently based on whether or
not they're marked as being metadata.
Signed-off-by: Justin TerAvest <teravest@google.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
|
|
I'm often confused why not disable preempt when changing blk_plug list. It
would be better to add comments here in case others have the similar concerns.
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
|
|
When I test fio script with big I/O depth, I found the total throughput drops
compared to some relative small I/O depth. The reason is the thread accumulates
big requests in its plug list and causes some delays (surely this depends
on CPU speed).
I thought we'd better have a threshold for requests. When a threshold reaches,
this means there is no request merge and queue lock contention isn't severe
when pushing per-task requests to queue, so the main advantages of blk plug
don't exist. We can force a plug list flush in this case.
With this, my test throughput actually increases and almost equals to small
I/O depth. Another side effect is irq off time decreases in blk_flush_plug_list()
for big I/O depth.
The BLK_MAX_REQUEST_COUNT is choosen arbitarily, but 16 is efficiently to
reduce lock contention to me. But I'm open here, 32 is ok in my test too.
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
|
|
Fix headers_check error introduced by 390192b30057:
include/linux/fd.h:6: included file 'linux/compat.h' is not exported
Signed-off-by: Johannes Stezenbach <js@sig21.net>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
|
|
Due to the recently identified overflow in read_capacity_16() it was
possible for max_discard_sectors to be zero but still have discards
enabled on the associated device's queue.
Eliminate the possibility for blkdev_issue_discard to infinitely loop.
Interestingly this issue wasn't identified until a device, whose
discard_granularity was 0 due to read_capacity_16 overflow, was consumed
by blk_stack_limits() to construct limits for a higher-level DM
multipath device. The multipath device's resulting limits never had the
discard limits stacked because blk_stack_limits() will only do so if
the bottom device's discard_granularity != 0. This resulted in the
multipath device's limits.max_discard_sectors being 0.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
|
|
On Linux x86_64 host with 32bit userspace, running
qemu or even just "qemu-img create -f qcow2 some.img 1G"
causes a kernel warning:
ioctl32(qemu-img:5296): Unknown cmd fd(3) cmd(00005326){t:'S';sz:0} arg(7fffffff) on some.img
ioctl32(qemu-img:5296): Unknown cmd fd(3) cmd(801c0204){t:02;sz:28} arg(fff77350) on some.img
ioctl 00005326 is CDROM_DRIVE_STATUS,
ioctl 801c0204 is FDGETPRM.
The warning appears because the Linux compat-ioctl handler for these
ioctls only applies to block devices, while qemu also uses the ioctls on
plain files.
Signed-off-by: Johannes Stezenbach <js@sig21.net>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
|
|
Currently, only open(2) is defined as the 'clearing' point. It has
two roles - first, it's an acknowledgement from userland indicating
that the event has been received and kernel can clear pending states
and proceed to generate more events. Secondly, it's passed on to
device drivers as a hint indicating that a synchronization point has
been reached and it might want to take a deeper look at the device.
The latter currently is only used by sr which uses two different
mechanisms - GET_EVENT_MEDIA_STATUS_NOTIFICATION and TEST_UNIT_READY
to discover events, where the former is lighter weight and safe to be
used repeatedly but may not provide full coverage. Among other
things, GET_EVENT can't detect media removal while TUR can.
This patch makes close(2) - blkdev_put() - indicate clearing hint for
MEDIA_CHANGE to drivers. disk_check_events() is renamed to
disk_flush_events() and updated to take @mask for events to flush
which is or'd to ev->clearing and will be passed to the driver on the
next ->check_events() invocation.
This change makes sr generate MEDIA_CHANGE when media is ejected from
userland - e.g. with eject(1).
Note: Given the current usage, it seems @clearing hint is needlessly
complex. disk_clear_events() can simply clear all events and the hint
can be boolean @flush.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
|
|
Conflicts:
block/blk-throttle.c
block/cfq-iosched.c
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
|
|
for-linus
|
|
We used to write these with BIO_RW_BARRIER aka REQ_HARDBARRIER (unless
disabled in the configuration). The correct semantic now would be to
write with FLUSH/FUA.
For example, with activity log transactions, FUA alone is not enough, we
need the corresponding bitmap update (and all related application
updates) on stable storage as well.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
|
|
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
|
|
data socket
If we have an asymetrically congested network, we may send P_PING,
but due to congestion, the corresponding P_PING_ACK would time out,
and we would drop a (congested, but otherwise) healthy connection
("PingAck did not arrive in time.")
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
|
|
If we have a good resync rate, we will frequently update the on-disk
bitmap, which, if not accounted for as resync io, may let an otherwise
idle device appear to be "busy", and cause us to throttle resync.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
|
|
The last commit, drbd: add missing spinlock to bitmap receive,
introduced a cond_resched_lock(), where the lock in question is taken
with irqs disabled.
As we must not schedule with IRQs disabled,
and cond_resched_lock_irq() does not exist, yet,
we re-aquire the spin_lock_irq() for each bitmap page processed in turn.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
|
|
During bitmap exchange, when using the RLE bitmap compression scheme,
we have a code path that can set the whole bitmap at once.
To avoid holding spin_lock_irq() for too long, we used to lock out other
bitmap modifications during bitmap exchange by other means, and then,
knowing we have exclusive access to the bitmap, modify it without
the spinlock, and with IRQs enabled.
Since we now allow local IO to continue, potentially setting additional
bits during the bitmap receive phase, this is no longer true, and we get
uncoordinated updates of bitmap members, causing bm_set to no longer
accurately reflect the total number of set bits.
To actually see this, you'd need to have a large bitmap, use RLE bitmap
compression, and have busy IO during sync handshake and bitmap exchange.
Fix this by taking the spin_lock_irq() in this code path as well, but
calling cond_resched_lock() after each page worth of bits processed.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
|
|
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
|
|
ioc->ioc_data is rcu protectd, so uses correct API to access it.
This doesn't change any behavior, but just make code consistent.
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Cc: stable@kernel.org # after ab4bd22d
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
|
|
I got a rcu warnning at boot. the ioc->ioc_data is rcu_deferenced, but
doesn't hold rcu_read_lock.
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Cc: stable@kernel.org # after ab4bd22d
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
|
|
* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
cifs: mark CONFIG_CIFS_NFSD_EXPORT as BROKEN
cifs: free blkcipher in smbhash
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
cifs: propagate errors from cifs_get_root() to mount(2)
cifs: tidy cifs_do_mount() up a bit
cifs: more breakage on mount failures
cifs: close sget() races
cifs: pull freeing mountdata/dropping nls/freeing cifs_sb into cifs_umount()
cifs: move cifs_umount() call into ->kill_sb()
cifs: pull cifs_mount() call up
sanitize cifs_umount() prototype
cifs: initialize ->tlink_tree in cifs_setup_cifs_sb()
cifs: allocate mountdata earlier
cifs: leak on mount if we share superblock
cifs: don't pass superblock to cifs_mount()
cifs: don't leak nls on mount failure
cifs: double free on mount failure
take bdi setup/destruction into cifs_mount/cifs_umount
Acked-by: Steve French <smfrench@gmail.com>
|
|
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'timer-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
rtc: vt8500: Fix build error & cleanup rtc_class_ops->update_irq_enable()
alarmtimers: Return -ENOTSUPP if no RTC device is present
alarmtimers: Handle late rtc module loading
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6:
ALSA: Remove unneeded version.h includes from sound/
ASoC: pxa-ssp: Correct check for stream presence
ASoC: imx: add missing module informations
ASoC: imx: Remove unused Kconfig SND_MXC_SOC_SSI entry
ALSA: HDA: Pinfix quirk for HP Z200 Workstation
ALSA: VIA HDA: Create a master amplifier control for VT1718S.
ALSA: VIA HDA: Mute/unmute mixer conncted to Headphone for VT1718S.
ALSA: VIA HDA: Modify initial verbs list for VT1718S.
ALSA: hda - Remove ALC268 model override for CPR2000
ALSA: HDA: Remove quirk for an HP device
ASoC: Remove unused and about to be broken SND_SOC_CUSTOM I/O bus
|
|
git://git.linaro.org/people/jstultz/linux into timers/urgent
* rtc: vt8500: Fix build error & cleanup rtc_class_ops->update_irq_enable()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/keithp/linux-2.6
* 'drm-intel-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/keithp/linux-2.6:
drm/i915: save/resume forcewake lock fixes
Revert "drm/i915: Kill GTT mappings when moving from GTT domain"
drm/i915: Apply HWSTAM workaround for BSD ring on SandyBridge
drm/i915: Call intel_enable_plane from i9xx_crtc_mode_set (again)
|
|
... instead of just failing with -EINVAL
Acked-by: Pavel Shilovsky <piastryyy@gmail.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Acked-by: Pavel Shilovsky <piastryyy@gmail.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
if cifs_get_root() fails, we end up with ->mount() returning NULL,
which is not what callers expect. Moreover, in case of superblock
reuse we end up leaking a superblock reference...
Acked-by: Pavel Shilovsky <piastryyy@gmail.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
have ->s_fs_info set by the set() callback passed to sget()
Acked-by: Pavel Shilovsky <piastryyy@gmail.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
all callers of cifs_umount() proceed to do the same thing; pull it into
cifs_umount() itself.
Acked-by: Pavel Shilovsky <piastryyy@gmail.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
instead of calling it manually in case if cifs_read_super() fails
to set ->s_root, just call it from ->kill_sb(). cifs_put_super()
is gone now *and* we have cifs_sb shutdown and destruction done
after the superblock is gone from ->s_instances.
Acked-by: Pavel Shilovsky <piastryyy@gmail.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
... to the point prior to sget(). Now we have cifs_sb set up early
enough.
Acked-by: Pavel Shilovsky <piastryyy@gmail.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
a) superblock argument is unused
b) it always returns 0
Acked-by: Pavel Shilovsky <piastryyy@gmail.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
no need to wait until cifs_read_super() and we need it done
by the time cifs_mount() will be called.
Acked-by: Pavel Shilovsky <piastryyy@gmail.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
pull mountdata allocation up, so that it won't stand in the way when
we lift cifs_mount() to location before sget().
Acked-by: Pavel Shilovsky <piastryyy@gmail.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
cifs_sb and nls end up leaked...
Acked-by: Pavel Shilovsky <piastryyy@gmail.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
To close sget() races we'll need to be able to set cifs_sb up before
we get the superblock, so we'll want to be able to do cifs_mount()
earlier. Fortunately, it's easy to do - setting ->s_maxbytes can
be done in cifs_read_super(), ditto for ->s_time_gran and as for
putting MS_POSIXACL into ->s_flags, we can mirror it in ->mnt_cifs_flags
until cifs_read_super() is called. Kill unused 'devname' argument,
while we are at it...
Acked-by: Pavel Shilovsky <piastryyy@gmail.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
if cifs_sb allocation fails, we still need to drop nls we'd stashed
into volume_info - the one we would've copied to cifs_sb if we could
allocate the latter.
Acked-by: Pavel Shilovsky <piastryyy@gmail.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
if we get to out_super with ->s_root already set (e.g. with
cifs_get_root() failure), we'll end up with cifs_put_super()
called and ->mountdata freed twice. We'll also get cifs_sb
freed twice and cifs_sb->local_nls dropped twice. The problem
is, we can get to out_super both with and without ->s_root,
which makes ->put_super() a bad place for such work.
Switch to ->kill_sb(), have all that work done there after
kill_anon_super(). Unlike ->put_super(), ->kill_sb() is
called by deactivate_locked_super() whether we have ->s_root
or not.
Acked-by: Pavel Shilovsky <piastryyy@gmail.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Acked-by: Pavel Shilovsky <piastryyy@gmail.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
This does not work properly with CIFS as current servers do not
enable support for the FILE_OPEN_BY_FILE_ID on SMB NTCreateX
and not all NFS clients handle ESTALE.
For now, it just plain doesn't work. Mark it BROKEN to discourage
distros from enabling it.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
|
|
This is currently leaked in the rc == 0 case.
Reported-by: J. Bruce Fields <bfields@fieldses.org>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
|
|
* 'for-linus' of git://git.kernel.dk/linux-block:
block: add REQ_SECURE to REQ_COMMON_MASK
block: use the passed in @bdev when claiming if partno is zero
block: Add __attribute__((format(printf...) and fix fallout
block: make disk_block_events() properly wait for work cancellation
block: remove non-syncing __disk_block_events() and fold it into disk_block_events()
block: don't use non-syncing event blocking in disk_check_events()
cfq-iosched: fix locking around ioc->ioc_data assignment
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
pata_marvell: Add support for 88SE91A0, 88SE91A4
libata/sas: only set FROZEN flag if new EH is supported
libata: apply NOSETXFER horkage to the affected Pioneer drives regardless of firmware revision
drivers/ata/sata_dwc_460ex: Fix typo 'corrresponding'
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6
* 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
drm/radeon/kms: handle special cases for vddc
drm/radeon/kms: fix num_banks tiling config for fusion
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/nab/scsi-post-merge-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/scsi-post-merge-2.6:
tcm_fc: Fix conversion spec warning
tcm_fc: Fix possible lock to unlock type deadlock
tcm_fc: Fix ft_send_tm LUN lookup OOPs
target: Fix incorrect strlen() NULL terminator checks
target: Drop bogus ERR_PTR usage in target_fabric_configfs_init
target: Fix ERR_PTR dereferencing bugs
target: Convert transport_deregister_session_configfs nacl_sess_lock to save irq state
target: Fix transport_get_lun_for_tmr failure cases
[SCSI] target: Convert TASK_ATTR to scsi_tcq.h definitions
[SCSI] target: Convert REPORT_LUNs to use int_to_scsilun
[SCSI] target: Fix task->task_execute_queue=1 clear bug + LUN_RESET OOPs
[SCSI] target: Fix bug with task_sg chained transport_free_dev_tasks release
[SCSI] target: Fix interrupt context bug with stats_lock and core_tmr_alloc_req
[SCSI] target: Fix multi task->task_sg[] chaining logic bug
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
x86/PCI/ACPI: fix type mismatch
PCI: fix new kernel-doc warning
PCI: Fix warning in drivers/pci/probe.c on sparc64
|
|
* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
cifs: fix wsize negotiation to respect max buffer size and active signing (try #4)
CIFS: Fix problem with 3.0-rc1 null user mount failure
|
|
It was pointed out by 'make versioncheck' that some includes of
linux/version.h were not needed in fs/ (fs/btrfs/ctree.h and
fs/omfs/file.c).
This patch removes them.
Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Acked-by: Bob Copeland <me@bobcopeland.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
|