summaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)Author
2010-06-24xfs: prevent swapext from operating on write-only filesDan Rosenberg
This patch prevents user "foo" from using the SWAPEXT ioctl to swap a write-only file owned by user "bar" into a file owned by "foo" and subsequently reading it. It does so by checking that the file descriptors passed to the ioctl are also opened for reading. Signed-off-by: Dan Rosenberg <dan.j.rosenberg@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2010-06-22NFSv4: Fix an embarassing typo in encode_attrs()Trond Myklebust
Apparently, we have never been able to set the atime correctly from the NFSv4 client. Reported-by: 小倉一夫 <ka-ogura@bd6.so-net.ne.jp> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@kernel.org
2010-06-22NFSv4: Ensure that /proc/self/mountinfo displays the minor version numberTrond Myklebust
Currently, we do not display the minor version mount parameter in the /proc mount info. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@kernel.org
2010-06-22NFSv4.1: Ensure that we initialise the session when following a referralTrond Myklebust
Put the code that is common to both the referral and ordinary mount cases into a common helper routine. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-06-22nfs4 use mandatory attribute file type in nfs4_get_rootAndy Adamson
S_ISDIR(fsinfo.fattr->mode) checks the file type rather than the mode bits, so we should be checking for the NFS_ATTR_FATTR_TYPE fattr property. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@kernel.org
2010-06-21ceph: delay umount until all mds requests drop inode+dentry refsSage Weil
This fixes a race between handle_reply finishing an mds request, signalling completion, and then dropping the request structing and its dentry+inode refs, and pre_umount function waiting for requests to finish before letting the vfs tear down the dcache. If umount was delayed waiting for mds requests, we could race and BUG in shrink_dcache_for_umount_subtree because of a slow dput. This delays umount until the msgr queue flushes, which means handle_reply will exit and will have dropped the ceph_mds_request struct. I'm assuming the VFS has already ensured that its calls have all completed and those request refs have thus been dropped as well (I haven't seen that race, at least). Signed-off-by: Sage Weil <sage@newdream.net>
2010-06-21ceph: handle splice_dentry/d_materialize_unique error in readdir_prepopulateSage Weil
Handle a splice_dentry failure (due to a d_materialize_unique error) without crashing. (Also, report the error code.) Signed-off-by: Sage Weil <sage@newdream.net>
2010-06-17ceph: fix crush map update decodingSage Weil
If the incremental osdmap has a new crush map, advance the position after decoding so that we can parse the rest of the osdmap properly. Signed-off-by: Sage Weil <sage@newdream.net>
2010-06-16cifs: remove bogus first_time check in NTLMv2 session setup codeJeff Layton
This bug appears to be the result of a cut-and-paste mistake from the NTLMv1 code. The function to generate the MAC key was commented out, but not the conditional above it. The conditional then ended up causing the session setup key not to be copied to the buffer unless this was the first session on the socket, and that made all but the first NTLMv2 session setup fail. Fix this by removing the conditional and all of the commented clutter that made it difficult to see. Cc: Stable <stable@kernel.org> Reported-by: Gunther Deschner <gdeschne@redhat.com> Signed-off-by: Jeff Layton <jlayton@redhat.com>
2010-06-16cifs: don't call cifs_new_fileinfo unless cifs_open succeedsJeff Layton
It's currently possible for cifs_open to fail after it has already called cifs_new_fileinfo. In that situation, the new fileinfo will be leaked as the caller doesn't call fput. That in turn leads to a busy inodes after umount problem since the fileinfo holds an extra inode reference now. Shuffle cifs_open around a bit so that it only calls cifs_new_fileinfo if it's going to succeed. Signed-off-by: Jeff Layton <jlayton@redhat.com> Reviewed-and-Tested-by: Suresh Jayaraman <sjayaraman@suse.de>
2010-06-16cifs: don't ignore cifs_posix_open_inode_helper return valueSuresh Jayaraman
...and ensure that we propagate the error back to avoid any surprises. Signed-off-by: Suresh Jayaraman <sjayaraman@suse.de> Reviewed-and-Tested-by: Jeff Layton <jlayton@redhat.com>
2010-06-16cifs: clean up arguments to cifs_open_inode_helperJeff Layton
...which takes a ton of unneeded arguments and does a lot more pointer dereferencing than is really needed. Signed-off-by: Jeff Layton <jlayton@redhat.com> Reviewed-and-Tested-by: Suresh Jayaraman <sjayaraman@suse.de>
2010-06-16cifs: pass instantiated filp back after open callJeff Layton
The current scheme of sticking open files on a list and assuming that cifs_open will scoop them off of it is broken and leads to "Busy inodes after umount..." errors at unmount time. The problem is that there is no guarantee that cifs_open will always be called after a ->lookup or ->create operation. If there are permissions or other problems, then it's quite likely that it *won't* be called. Fix this by fully instantiating the filp whenever the file is created and pass that filp back to the VFS. If there is a problem, the VFS can clean up the references. Signed-off-by: Jeff Layton <jlayton@redhat.com> Reviewed-and-Tested-by: Suresh Jayaraman <sjayaraman@suse.de>
2010-06-16cifs: move cifs_new_fileinfo call out of cifs_posix_openJeff Layton
Having cifs_posix_open call cifs_new_fileinfo is problematic and inconsistent with how "regular" opens work. It's also buggy as cifs_reopen_file calls this function on a reconnect, which creates a new struct cifsFileInfo that just gets leaked. Push it out into the callers. This also allows us to get rid of the "mnt" arg to cifs_posix_open. Finally, in the event that a cifsFileInfo isn't or can't be created, we always want to close the filehandle out on the server as the client won't have a record of the filehandle and can't actually use it. Make sure that CIFSSMBClose is called in those cases. Signed-off-by: Jeff Layton <jlayton@redhat.com> Reviewed-and-Tested-by: Suresh Jayaraman <sjayaraman@suse.de>
2010-06-16Merge branch 'master' of /pub/scm/linux/kernel/git/torvalds/linux-2.6Steve French
2010-06-15ocfs2: Limit default local alloc size within bitmap range.Tao Ma
In commit 6b82021b9e91cd689fdffadbcdb9a42597bbe764, we increase our local alloc size and calculate how much megabytes we can get according to group size and volume size. But we also need to check the maximum bits a local alloc block bitmap can have. With a bs=512, cs=32K, local volume with 160G, it calculate 96MB while the maximum local alloc size is only 76M. So the bitmap will overflow and corrupt the system truncate log file. See bug http://oss.oracle.com/bugzilla/show_bug.cgi?id=1262 Signed-off-by: Tao Ma <tao.ma@oracle.com> Acked-by: Mark Fasheh <mfasheh@suse.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
2010-06-15ocfs2: Move orphan scan work to ocfs2_wq.Tao Ma
We used to let orphan scan work in the default work queue, but there is a corner case which will make the system deadlock. The scenario is like this: 1. set heartbeat threadshold to 200. this will allow us to have a great chance to have a orphan scan work before our quorum decision. 2. mount node 1. 3. after 1~2 minutes, mount node 2(in order to make the bug easier to reproduce, better add maxcpus=1 to kernel command line). 4. node 1 do orphan scan work. 5. node 2 do orphan scan work. 6. node 1 do orphan scan work. After this, node 1 hold the orphan scan lock while node 2 know node 1 is the master. 7. ifdown eth2 in node 2(eth2 is what we do ocfs2 interconnection). Now when node 2 begins orphan scan, the system queue is blocked. The root cause is that both orphan scan work and quorum decision work will use the system event work queue. orphan scan has a chance of blocking the event work queue(in dlm_wait_for_node_death) so that there is no chance for quorum decision work to proceed. This patch resolve it by moving orphan scan work to ocfs2_wq. Signed-off-by: Tao Ma <tao.ma@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
2010-06-15fs/ocfs2/dlm: Add missing spin_unlockJulia Lawall
Add a spin_unlock missing on the error path. Unlock as in the other code that leads to the leave label. The semantic match that finds this problem is as follows: (http://coccinelle.lip6.fr/) // <smpl> @@ expression E1; @@ * spin_lock(E1,...); <+... when != E1 if (...) { ... when != E1 * return ...; } ...+> * spin_unlock(E1,...); // </smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: Joel Becker <joel.becker@oracle.com>
2010-06-14Merge branch 'for-jens' of git://git.drbd.org/linux-2.6-drbd into for-linusJens Axboe
2010-06-13of: Drop properties with "/" in their nameMichael Ellerman
Some bogus firmwares include properties with "/" in their name. This causes problems when creating the /proc/device-tree file system, because the slash is taken to indicate a directory. We don't care about those properties, and we don't want to encourage them, so just throw them away when creating /proc/device-tree. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Tested-by: Christian Kujau <lists@nerdbynature.de> Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2010-06-13ceph: fix message memory leak, uninitialized variableSage Weil
We need to properly initialize skip, as not all alloc_msg op instances set it. Also, BUG if someone says skip but also allocates a message. Signed-off-by: Sage Weil <sage@newdream.net>
2010-06-13ceph: fix map handler error pathSage Weil
Don't leak message if we receive an unexpected message type. Signed-off-by: Sage Weil <sage@newdream.net>
2010-06-13ceph: some endianity fixesYehuda Sadeh
Fix some problems that came up with sparse. Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net> Signed-off-by: Sage Weil <sage@newdream.net>
2010-06-12cifs: implement drop_inode superblock opJeff Layton
The standard behavior for drop_inode is to delete the inode when the last reference to it is put and the nlink count goes to 0. This helps keep inodes that are still considered "not deleted" in cache as long as possible even when there aren't dentries attached to them. When server inode numbers are disabled, it's not possible for cifs_iget to ever match an existing inode (since inode numbers are generated via iunique). In this situation, cifs can keep a lot of inodes in cache that will never be used again. Implement a drop_inode routine that deletes the inode if server inode numbers are disabled on the mount. This helps keep the cifs inode caches down to a more manageable size when server inode numbers are disabled. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
2010-06-12cifs: don't attempt busy-file rename unless it's in same directoryJeff Layton
Busy-file renames don't actually work across directories, so we need to limit this code to renames within the same dir. This fixes the bug detailed here: https://bugzilla.redhat.com/show_bug.cgi?id=591938 Signed-off-by: Jeff Layton <jlayton@redhat.com> CC: Stable <stable@kernel.org> Signed-off-by: Steve French <sfrench@us.ibm.com>
2010-06-11Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable: Btrfs: The file argument for fsync() is never null Btrfs: handle ERR_PTR from posix_acl_from_xattr() Btrfs: avoid BUG when dropping root and reference in same transaction Btrfs: prohibit a operation of changing acl's mask when noacl mount option used Btrfs: should add a permission check for setfacl Btrfs: btrfs_lookup_dir_item() can return ERR_PTR Btrfs: btrfs_read_fs_root_no_name() returns ERR_PTRs Btrfs: unwind after btrfs_start_transaction() errors Btrfs: btrfs_iget() returns ERR_PTR Btrfs: handle kzalloc() failure in open_ctree() Btrfs: handle error returns from btrfs_lookup_dir_item() Btrfs: Fix BUG_ON for fs converted from extN Btrfs: Fix null dereference in relocation.c Btrfs: fix remap_file_pages error Btrfs: uninitialized data is check_path_shared() Btrfs: fix fallocate regression Btrfs: fix loop device on top of btrfs
2010-06-11Btrfs: The file argument for fsync() is never nullDan Carpenter
The "file" argument for fsync is never null so we can remove this check. What drew my attention here is that 7ea8085910e: "drop unused dentry argument to ->fsync" introduced an unconditional dereference at the start of the function and that generated a smatch warning. Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2010-06-11Btrfs: handle ERR_PTR from posix_acl_from_xattr()Dan Carpenter
posix_acl_from_xattr() returns both ERR_PTRs and null, but it's OK to pass null values to set_cached_acl() Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2010-06-11Btrfs: avoid BUG when dropping root and reference in same transactionSage Weil
If btrfs_ioctl_snap_destroy() deletes a snapshot but finishes with end_transaction(), the cleaner kthread may come in and drop the root in the same transaction. If that's the case, the root's refs still == 1 in the tree when btrfs_del_root() deletes the item, because commit_fs_roots() hasn't updated it yet (that happens during the commit). This wasn't a problem before only because btrfs_ioctl_snap_destroy() would commit the transaction before dropping the dentry reference, so the dead root wouldn't get queued up until after the fs root item was updated in the btree. Since it is not an error to drop the root reference and the root in the same transaction, just drop the BUG_ON() in btrfs_del_root(). Signed-off-by: Sage Weil <sage@newdream.net> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2010-06-11Btrfs: prohibit a operation of changing acl's mask when noacl mount option usedShi Weihua
when used Posix File System Test Suite(pjd-fstest) to test btrfs, some cases about setfacl failed when noacl mount option used. I simplified used commands in pjd-fstest, and the following steps can reproduce it. ------------------------ # cd btrfs-part/ # mkdir aaa # setfacl -m m::rw aaa <- successed, but not expected by pjd-fstest. ------------------------ I checked ext3, a warning message occured, like as: setfacl: aaa/: Operation not supported Certainly, it's expected by pjd-fstest. So, i compared acl.c of btrfs and ext3. Based on that, a patch created. Fortunately, it works. Signed-off-by: Shi Weihua <shiwh@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2010-06-11Btrfs: should add a permission check for setfaclShi Weihua
On btrfs, do the following ------------------ # su user1 # cd btrfs-part/ # touch aaa # getfacl aaa # file: aaa # owner: user1 # group: user1 user::rw- group::rw- other::r-- # su user2 # cd btrfs-part/ # setfacl -m u::rwx aaa # getfacl aaa # file: aaa # owner: user1 # group: user1 user::rwx <- successed to setfacl group::rw- other::r-- ------------------ but we should prohibit it that user2 changing user1's acl. In fact, on ext3 and other fs, a message occurs: setfacl: aaa: Operation not permitted This patch fixed it. Signed-off-by: Shi Weihua <shiwh@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2010-06-11Btrfs: btrfs_lookup_dir_item() can return ERR_PTRDan Carpenter
btrfs_lookup_dir_item() can return either ERR_PTRs or null. Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2010-06-11Btrfs: btrfs_read_fs_root_no_name() returns ERR_PTRsDan Carpenter
btrfs_read_fs_root_no_name() returns ERR_PTRs on error so I added a check for that. It's not clear to me if it can also return NULL pointers or not so I left the original NULL pointer check as is. Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2010-06-11Btrfs: unwind after btrfs_start_transaction() errorsDan Carpenter
This was added by a22285a6a3: "Btrfs: Integrate metadata reservation with start_transaction". If we goto out here then we skip all the unwinding and there are locks still held etc. Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2010-06-11Btrfs: btrfs_iget() returns ERR_PTRDan Carpenter
btrfs_iget() returns an ERR_PTR() on failure and not null. Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2010-06-11Btrfs: handle kzalloc() failure in open_ctree()Dan Carpenter
Unwind and return -ENOMEM if the allocation fails here. Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2010-06-11Btrfs: handle error returns from btrfs_lookup_dir_item()Dan Carpenter
If btrfs_lookup_dir_item() fails, we should can just let the mount fail with an error. Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2010-06-11Btrfs: Fix BUG_ON for fs converted from extNYan, Zheng
Tree blocks can live in data block groups in FS converted from extN. So it's easy to trigger the BUG_ON. Signed-off-by: Yan Zheng <zheng.yan@oracle.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2010-06-11Btrfs: Fix null dereference in relocation.cYan, Zheng
Fix a potential null dereference in relocation.c Signed-off-by: Yan Zheng <zheng.yan@oracle.com> Acked-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2010-06-11Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: ceph: try to send partial cap release on cap message on missing inode ceph: release cap on import if we don't have the inode ceph: fix misleading/incorrect debug message ceph: fix atomic64_t initialization on ia64 ceph: fix lease revocation when seq doesn't match ceph: fix f_namelen reported by statfs ceph: fix memory leak in statfs ceph: fix d_subdirs ordering problem
2010-06-11Btrfs: fix remap_file_pages errorMiao Xie
when we use remap_file_pages() to remap a file, remap_file_pages always return error. It is because btrfs didn't set VM_CAN_NONLINEAR for vma. Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2010-06-11Btrfs: uninitialized data is check_path_shared()Dan Carpenter
refs can be used with uninitialized data if btrfs_lookup_extent_info() fails on the first pass through the loop. In the original code if that happens then check_path_shared() probably returns 1, this patch changes it to return 1 for safety. Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2010-06-11Btrfs: fix fallocate regressionJosef Bacik
Seems that when btrfs_fallocate was converted to use the new ENOSPC stuff we dropped passing the mode to the function that actually does the preallocation. This breaks anybody who wants to use FALLOC_FL_KEEP_SIZE. Thanks, Signed-off-by: Josef Bacik <josef@redhat.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2010-06-11Btrfs: fix loop device on top of btrfsMiao Xie
We cannot use the loop device which has been connected to a file in the btrf The reproduce steps is following: # dd if=/dev/zero of=vdev0 bs=1M count=1024 # losetup /dev/loop0 vdev0 # mkfs.btrfs /dev/loop0 ... failed to zero device start -5 The reason is that the btrfs don't implement either ->write_begin or ->write the VFS API, so we fix it by setting ->write to do_sync_write(). Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2010-06-11writeback: fix pin_sb_for_writebackChristoph Hellwig
We need to check for s_instances to make sure we don't bother working against a filesystem that is beeing unmounted, and we need to call put_super to make sure a superblock is freed when we race against umount. Also no need to keep sb_lock after we got a reference on it. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-06-11writeback: add missing requeue_io in writeback_inodes_wbChristoph Hellwig
In "writeback: fix writeback_inodes_wb from writeback_inodes_sb" I accidentally removed the requeue_io if we need to skip a superblock because we can't pin it. Add it back, otherwise we're getting spurious lockups after multiple xfstests runs. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-06-11writeback: simplify and split bdi_start_writebackChristoph Hellwig
bdi_start_writeback now never gets a superblock passed, so we can just remove that case. And to further untangle the code and flatten the call stack split it into two trivial helpers for it's two callers. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-06-11writeback: simplify wakeup_flusher_threadsChristoph Hellwig
bdi_writeback_all only has one caller, so fold it to simplify the code and flatten the call stack. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-06-11writeback: fix writeback_inodes_wb from writeback_inodes_sbChristoph Hellwig
When we call writeback_inodes_wb from writeback_inodes_sb we always have s_umount held, which currently makes the whole operation a no-op. But if we are called to write out inodes for a specific superblock we always have s_umount held, so replace the incorrect logic checking for WB_SYNC_ALL which only worked by coincidence with the proper check for an explicit superblock argument. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-06-11writeback: enforce s_umount locking in writeback_inodes_sbChristoph Hellwig
Make sure that not only sync_filesystem but all callers of writeback_inodes_sb have the superblock protected against remount. As-is this disables all functionality for these callers, but the next patch relies on this locking to fix writeback_inodes_sb for sync_filesystem. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>