Age | Commit message (Collapse) | Author |
|
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
This function creates service if it doesn't exist, or increases usage
counter if it does, and returns a pointer to it. The usage counter will
be droppepd by svc_destroy() later in lockd_up().
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
This patch also replaces svc_rpcb_setup() with svc_bind().
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
The idea is to separate service destruction and per-net operations,
because these are two different things and the mix looks ugly.
Notes:
1) For NFS server this patch looks ugly (sorry for that). But these
place will be rewritten soon during NFSd containerization.
2) LockD per-net counter increase int lockd_up() was moved prior to
make_socks() to make lockd_down_net() call safe in case of error.
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
This new routine is responsible for service registration in a specified
network context.
The idea is to separate service creation from per-net operations.
Note also: since registering service with svc_bind() can fail, the
service will be destroyed and during destruction it will try to
unregister itself from rpcbind. In this case unregistration has to be
skipped.
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
The fs_location->hosts list is split on colons, but this doesn't work when
IPv6 addresses are used (they contain colons).
This patch adds the function nfsd4_encode_components_esc() to
allow the caller to specify escape characters when splitting on 'sep'.
In order to fix referrals, this patch must be used with the mountd patch
that similarly fixes IPv6 [] escaping.
Signed-off-by: Weston Andros Adamson <dros@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
Though actually this doesn't matter much, as NFSv4.0 clients are
required to treat the change attribute as opaque.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
Cc: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
In each of these cases there's a simple unambiguous correct choice, and
no actual bug.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
OK, admittedly I'm mainly just trying to shut sparse up.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
for-linus
Conflicts:
fs/btrfs/ulist.h
Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
|
Use the same mechanism as the block devices are using, but move the
helper functions from fs/direct-io.c into fs/inode.c to remove the
dependency on CONFIG_BLOCK.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
When we rewind REMOVE_WHILE_FREEING operations, there's code that allocates
a fresh buffer instead of cloning the old one. Setting that buffer's level
correctly was missing in this case.
When rewinding a MOVE_KEYS operation, btrfs_node_key_ptr_offset(slot) was
missing for memmove_extent_buffer()'s arguments.
Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
|
|
Logging for del_ptr when we're not deleting the last pointer was wrong. This
fixes both, duplicate log entries and log sequence.
Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
|
|
Replace duplicate code by small inline helper function.
Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
|
|
tree_mod_alloc calls __get_tree_mod_seq and must acquire a spinlock before
doing so.
Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
|
|
We must build up the inode list with the extent lock held after following
indirect refs.
This also requires an extension to ulists, which allows to modify the stored
aux value in case a key already exists in the list.
Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
|
|
... i.e. file-dependent and address-dependent checks.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
The ext4_error() function is missing a call to save_error_info().
Since this is the function which marks the file system as containing
an error, this oversight (which was introduced in 2.6.36) is quite
significant, and should be backported to older stable kernels with
high urgency.
Reported-by: Ken Sumrall <ksumrall@google.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: ksumrall@google.com
Cc: stable@kernel.org
|
|
Make it easy to test whether or not the error handling subsystem in
ext4 is working correctly. This allows us to simulate an ext4_error()
by echoing a string to /sys/fs/ext4/<dev>/trigger_fs_error.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: ksumrall@google.com
|
|
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
No reason to hold ->mmap_sem over the sequence
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
__mnt_make_shortterm() in there undoes the effect of __mnt_make_longterm()
we'd done back when we set ->mnt_ns non-NULL; it should not be done to
vfsmounts that had never gone through commit_tree() and friends. Kudos to
lczerner for catching that one...
Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
As described in commit 07d106d0a ("vfs: fix up ENOIOCTLCMD error
handling"), drivers should return -ENOIOCTLCMD if they receive an ioctl
command which they don't understand. Doing so will result in -ENOTTY
being returned to userspace, which matches the behaviour of the compat
layer if it fails to translate an ioctl command.
This patch fixes the pipe ioctl to return -ENOIOCTLCMD instead of
-EINVAL when passed an unknown ioctl command.
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Nobody sets want_disconn any more.
Reported-by: Peng Tao <bergwolf@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
A directory should never have more than one dentry pointing to it.
But d_splice_alias() will add one if it finds a directory with an
already-existing non-DISCONNECTED dentry.
I can't find an obvious reproducer, but I also can't see what prevents
d_splice_alias() from encountering such a case.
It therefore seems safest to allow d_splice_alias to use any dentry it
finds.
(Prior to the removal of dentry_unhash() from vfs_rmdir(), around v3.0,
this could cause an nfsd deadlock like this:
- Somebody attempts to remove a non-empty directory.
- The dentry_unhash() in vfs_rmdir() unhashes the dentry
pointing to the non-empty directory.
- ->rmdir() then fails with -ENOTEMPTY
- Before the vfs_rmdir() caller reaches dput(), an nfsd process
in rename looks up the directory by filehandle; at the end of
that lookup, this dentry is found by d_alloc_anon(), and a
reference is taken on it, preventing dput() from removing it.
- A regular lookup of the directory calls d_splice_alias(),
finds only an unhashed (not a DISCONNECTED) dentry, and
insteads adds a new one, so the directory now has two
dentries.
- The nfsd process in rename, which was previously looking up
the source directory of the rename, now looks up the target
directory (which is the same), and gets the dentry newly
created by the previous lookup.
- The rename, seeing two different dentries, assumes this is a
cross-directory rename and attempts to take the i_mutex on the
directory twice.
That reproducer no longer exists, but I don't think there was anything
fundamentally incorrect about the vfs_rmdir() behavior there, so I think
the real fault was here in d_splice_alias().)
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
We don't use "mnt" anymore in send_to_group() after 1968f5eed5 ("fanotify:
use both marks when possible") was applied.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
When a file is truncated with truncate()/ftruncate() and then closed,
iversion is not updated. This patch uses ATTR_SIZE flag as an indication
to increment iversion.
Mimi said:
On fput(), i_version is used to detect and flag files that have changed
and need to be re-measured in the IMA measurement policy. When a file
is truncated with truncate()/ftruncate() and then closed, i_version is
not updated. As a result, although the file has changed, it will not be
re-measured and added to the IMA measurement list on subsequent access.
Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com>
Acked-by: Mimi Zohar <zohar@us.ibm.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
bh_cachep is only written to once on initialization, so move it to the
__read_mostly section.
Signed-off-by: Shai Fultheim <shai@scalemp.com>
Signed-off-by: Vlad Zolotarov <vlad@scalemp.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
file_remove_suid() is a generic function operates on struct file,
it almost has no relations with file mapping, so move it to fs/inode.c.
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Currently JFFS2 file-system maps the VFS "superblock" abstraction to the
write-buffer. Namely, it uses VFS services to synchronize the write-buffer
periodically.
The whole "superblock write-out" VFS infrastructure is served by the
'sync_supers()' kernel thread, which wakes up every 5 (by default) seconds and
writes out all dirty superblock using the '->write_super()' call-back. But the
problem with this thread is that it wastes power by waking up the system every
5 seconds no matter what. So we want to kill it completely and thus, we need to
make file-systems to stop using the '->write_super' VFS service, and then
remove it together with the kernel thread.
This patch switches the JFFS2 write-buffer management from
'->write_super()'/'->s_dirt' to a delayed work. Instead of setting the 's_dirt'
flag we just schedule a delayed work for synchronizing the write-buffer.
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
We do not need to call 'jffs2_write_super()' on sync. This function
causes a GC pass to make sure the current contents is pushed out with
the data which we already have on the media.
But this is not needed on unmount and only slows sync down unnecessarily.
It is enough to just sync the write-buffer.
This call was added by one of the generic VFS rework patch-sets,
see d579ed00aa96a7f7486978540a0d7cecaff742ae.
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
We do not need to call 'jffs2_write_super()' on unmount. This function
causes a GC pass to make sure the current contents is pushed out with
the data which we already have on the media.
But this is not needed on unmount and only slows unmount down unnecessarily.
It is enough to just sync the write-buffer.
This call was added by one of the generic VFS rework patch-sets,
see 8c85e125124a473d6f3e9bb187b0b84207f81d91.
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
We do not need 'lock_super()'/'unlock_super()' in JFFS2 - kill them.
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Pull ceph updates from Sage Weil:
"There are some updates and cleanups to the CRUSH placement code, a bug
fix with incremental maps, several cleanups and fixes from Josh Durgin
in the RBD block device code, a series of cleanups and bug fixes from
Alex Elder in the messenger code, and some miscellaneous bounds
checking and gfp cleanups/fixes."
Fix up trivial conflicts in net/ceph/{messenger.c,osdmap.c} due to the
networking people preferring "unsigned int" over just "unsigned".
* git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (45 commits)
libceph: fix pg_temp updates
libceph: avoid unregistering osd request when not registered
ceph: add auth buf in prepare_write_connect()
ceph: rename prepare_connect_authorizer()
ceph: return pointer from prepare_connect_authorizer()
ceph: use info returned by get_authorizer
ceph: have get_authorizer methods return pointers
ceph: ensure auth ops are defined before use
ceph: messenger: reduce args to create_authorizer
ceph: define ceph_auth_handshake type
ceph: messenger: check return from get_authorizer
ceph: messenger: rework prepare_connect_authorizer()
ceph: messenger: check prepare_write_connect() result
ceph: don't set WRITE_PENDING too early
ceph: drop msgr argument from prepare_write_connect()
ceph: messenger: send banner in process_connect()
ceph: messenger: reset connection kvec caller
libceph: don't reset kvec in prepare_write_banner()
ceph: ignore preferred_osd field
ceph: fully initialize new layout
...
|
|
The sequence number for delayed refs is needed to postpone certain delayed
refs for a very short period while walking backrefs. Before the tree
modification log, we thought we'd only have to hold back those references
that don't have a counter operation.
While now we've the tree mod log, we're rewinding fs tree blocks to a
defined consistent state. We cannot know in advance for which tree block
we'll be doing rewind operations later. Therefore, we must postpone all the
delayed refs for fs-tree blocks, even those having a counter operation.
Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/josef/btrfs-next into HEAD
|
|
Merge block/IO core bits from Jens Axboe:
"This is a bit bigger on the core side than usual, but that is purely
because we decided to hold off on parts of Tejun's submission on 3.4
to give it a bit more time to simmer. As a consequence, it's seen a
long cycle in for-next.
It contains:
- Bug fix from Dan, wrong locking type.
- Relax splice gifting restriction from Eric.
- A ton of updates from Tejun, primarily for blkcg. This improves
the code a lot, making the API nicer and cleaner, and also includes
fixes for how we handle and tie policies and re-activate on
switches. The changes also include generic bug fixes.
- A simple fix from Vivek, along with a fix for doing proper delayed
allocation of the blkcg stats."
Fix up annoying conflict just due to different merge resolution in
Documentation/feature-removal-schedule.txt
* 'for-3.5/core' of git://git.kernel.dk/linux-block: (92 commits)
blkcg: tg_stats_alloc_lock is an irq lock
vmsplice: relax alignement requirements for SPLICE_F_GIFT
blkcg: use radix tree to index blkgs from blkcg
blkcg: fix blkcg->css ref leak in __blkg_lookup_create()
block: fix elvpriv allocation failure handling
block: collapse blk_alloc_request() into get_request()
blkcg: collapse blkcg_policy_ops into blkcg_policy
blkcg: embed struct blkg_policy_data in policy specific data
blkcg: mass rename of blkcg API
blkcg: style cleanups for blk-cgroup.h
blkcg: remove blkio_group->path[]
blkcg: blkg_rwstat_read() was missing inline
blkcg: shoot down blkgs if all policies are deactivated
blkcg: drop stuff unused after per-queue policy activation update
blkcg: implement per-queue policy activation
blkcg: add request_queue->root_blkg
blkcg: make request_queue bypassing on allocation
blkcg: make sure blkg_lookup() returns %NULL if @q is bypassing
blkcg: make blkg_conf_prep() take @pol and return with queue lock held
blkcg: remove static policy ID enums
...
|
|
During unmount, it could happen that the integrity checker printed a
warning message "attempt to free ... on umount which is not yet iodone"
which turned out to be a false positive.
Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
|
|
If a file_extent_item was located at the very end of a leaf and there was
not enough space to hold a full item, but there was enough space to hold
one of type BTRFS_FILE_EXTENT_INLINE or PREALLOC, and it was only such a
short item, a warning was printed anyway. This check is now fixed.
Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
|
|
Reduce ioprio class of scrub readahead threads to idle priority.
This setting is fixed. This priority has shown the best performance
during all measurements.
Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
|
|
So dpkg fsync()'s the file and the directory containing the file whenever it
writes to a file which is really slow in btrfs. This is partly because
fsync()'ing a directory _always_ committed the transaction instead of just
going to the tree log. This is because drop_objectid_items() would return 1
since it does a btrfs_search_slot() which returns 1. In tree-log jargon
this means that we have to commit the transaction to be safe. So just check
if ret is greater than 0 and set it to 0 if it does. With this patch we now
use the tree-log instead of committing the entire transaction, which is
twice as fast on my box. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
|
|
We have this check down in the actual logging code, but this is after we
start a transaction and all that good stuff. So move the helper
inode_in_log() out so we can call it in fsync() and avoid starting a
transaction altogether and just exit if we've already fsync()'ed this file
recently. You would notice this issue if you fsync()'ed a file over and
over again until the transaction committed. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
|
|
btrfs_read_buffer() has the possibility of returning the error.
Therefore, I add the code in which the return value of btrfs_read_buffer()
is checked.
Signed-off-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com>
|
|
The device statistics are written into the device tree with each
transaction commit. Only modified statistics are written.
When a filesystem is mounted, the device statistics for each involved
device are read from the device tree and used to initialize the
counters.
Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
|
|
An ioctl interface is added to get the device statistic counters.
A second ioctl is added to atomically get and reset these counters.
Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
|
|
The goal is to detect when drives start to get an increased error rate,
when drives should be replaced soon. Therefore statistic counters are
added that count IO errors (read, write and flush). Additionally, the
software detected errors like checksum errors and corrupted blocks are
counted.
Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
|