summaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)Author
2010-05-25fuse: support splice() reading from fuse deviceMiklos Szeredi
Allow userspace filesystem implementation to use splice() to read from the fuse device. The userspace filesystem can now transfer data coming from a WRITE request to an arbitrary file descriptor (regular file, block device or socket) without having to go through a userspace buffer. The semantics of using splice() to read messages are: 1) with a single splice() call move the whole message from the fuse device to a temporary pipe 2) read the header from the pipe and determine the message type 3a) if message is a WRITE then splice data from pipe to destination 3b) else read rest of message to userspace buffer Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2010-05-25fuse: allow splice to move pagesMiklos Szeredi
When splicing buffers to the fuse device with SPLICE_F_MOVE, try to move pages from the pipe buffer into the page cache. This allows populating the fuse filesystem's cache without ever touching the page contents, i.e. zero copy read capability. The following steps are performed when trying to move a page into the page cache: - buf->ops->confirm() to make sure the new page is uptodate - buf->ops->steal() to try to remove the new page from it's previous place - remove_from_page_cache() on the old page - add_to_page_cache_locked() on the new page If any of the above steps fail (non fatally) then the code falls back to copying the page. In particular ->steal() will fail if there are external references (other than the page cache and the pipe buffer) to the page. Also since the remove_from_page_cache() + add_to_page_cache_locked() are non-atomic it is possible that the page cache is repopulated in between the two and add_to_page_cache_locked() will fail. This could be fixed by creating a new atomic replace_page_cache_page() function. fuse_readpages_end() needed to be reworked so it works even if page->mapping is NULL for some or all pages which can happen if the add_to_page_cache_locked() failed. A number of sanity checks were added to make sure the stolen pages don't have weird flags set, etc... These could be moved into generic splice/steal code. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2010-05-25fuse: support splice() writing to fuse deviceMiklos Szeredi
Allow userspace filesystem implementation to use splice() to write to the fuse device. The semantics of using splice() are: 1) buffer the message header and data in a temporary pipe 2) with a *single* splice() call move the message from the temporary pipe to the fuse device The READ reply message has the most interesting use for this, since now the data from an arbitrary file descriptor (which could be a regular file, a block device or a socket) can be tranferred into the fuse device without having to go through a userspace buffer. It will also allow zero copy moving of pages. One caveat is that the protocol on the fuse device requires the length of the whole message to be written into the header. But the length of the data transferred into the temporary pipe may not be known in advance. The current library implementation works around this by using vmplice to write the header and modifying the header after splicing the data into the pipe (error handling omitted): struct fuse_out_header out; iov.iov_base = &out; iov.iov_len = sizeof(struct fuse_out_header); vmsplice(pip[1], &iov, 1, 0); len = splice(input_fd, input_offset, pip[1], NULL, len, 0); /* retrospectively modify the header: */ out.len = len + sizeof(struct fuse_out_header); splice(pip[0], NULL, fuse_chan_fd(req->ch), NULL, out.len, flags); This works since vmsplice only saves a pointer to the data, it does not copy the data itself. Since pipes are currently limited to 16 pages and messages need to be spliced atomically, the length of the data is limited to 15 pages (or 60kB for 4k pages). Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2010-05-25fuse: get page reference for readpagesMiklos Szeredi
Acquire a page ref on pages in ->readpages() and release them when the read has finished. Not acquiring a reference didn't seem to cause any trouble since the page is locked and will not be kicked out of the page cache during the read. However the following patches will want to remove the page from the cache so a separate ref is needed. Making the reference in req->pages explicit also makes the code easier to understand. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2010-05-25fuse: use get_user_pages_fast()Miklos Szeredi
Replace uses of get_user_pages() with get_user_pages_fast(). It looks nicer and should be faster in most cases. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2010-05-25fuse: remove unneeded variableDan Carpenter
"map" isn't needed any more after: 0bd87182d3ab18 "fuse: fix kunmap in fuse_ioctl_copy_user" Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2010-05-24Merge branch 'delayed-logging-for-2.6.35' into for-linusAlex Elder
2010-05-24xfs: Ensure inode allocation buffers are fully replayedDave Chinner
With delayed logging, we can get inode allocation buffers in the same transaction inode unlink buffers. We don't currently mark inode allocation buffers in the log, so inode unlink buffers take precedence over allocation buffers. The result is that when they are combined into the same checkpoint, only the unlinked inode chain fields are replayed, resulting in uninitialised inode buffers being detected when the next inode modification is replayed. To fix this, we need to ensure that we do not set the inode buffer flag in the buffer log item format flags if the inode allocation has not already hit the log. To avoid requiring a change to log recovery, we really need to make this a modification that relies only on in-memory sate. We can do this by checking during buffer log formatting (while the CIL cannot be flushed) if we are still in the same sequence when we commit the unlink transaction as the inode allocation transaction. If we are, then we do not add the inode buffer flag to the buffer log format item flags. This means the entire buffer will be replayed, not just the unlinked fields. We do this while CIL flusheѕ are locked out to ensure that we don't race with the sequence numbers changing and hence fail to put the inode buffer flag in the buffer format flags when we really need to. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
2010-05-24xfs: enable background pushing of the CILDave Chinner
If we let the CIL grow without bound, it will grow large enough to violate recovery constraints (must be at least one complete transaction in the log at all times) or take forever to write out through the log buffers. Hence we need a check during asynchronous transactions as to whether the CIL needs to be pushed. We track the amount of log space the CIL consumes, so it is relatively simple to limit it on a pure size basis. Make the limit the minimum of just under half the log size (recovery constraint) or 8MB of log space (which is an awful lot of metadata). Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
2010-05-24xfs: forced unmounts need to push the CILDave Chinner
If the filesystem is being shut down and the there is no log error, the current code forces out the current log buffers. This code now needs to push the CIL before it forces out the log buffers to acheive the same result. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
2010-05-24xfs: Introduce delayed logging core codeDave Chinner
The delayed logging code only changes in-memory structures and as such can be enabled and disabled with a mount option. Add the mount option and emit a warning that this is an experimental feature that should not be used in production yet. We also need infrastructure to track committed items that have not yet been written to the log. This is what the Committed Item List (CIL) is for. The log item also needs to be extended to track the current log vector, the associated memory buffer and it's location in the Commit Item List. Extend the log item and log vector structures to enable this tracking. To maintain the current log format for transactions with delayed logging, we need to introduce a checkpoint transaction and a context for tracking each checkpoint from initiation to transaction completion. This includes adding a log ticket for tracking space log required/used by the context checkpoint. To track all the changes we need an io vector array per log item, rather than a single array for the entire transaction. Using the new log vector structure for this requires two passes - the first to allocate the log vector structures and chain them together, and the second to fill them out. This log vector chain can then be passed to the CIL for formatting, pinning and insertion into the CIL. Formatting of the log vector chain is relatively simple - it's just a loop over the iovecs on each log vector, but it is made slightly more complex because we re-write the iovec after the copy to point back at the memory buffer we just copied into. This code also needs to pin log items. If the log item is not already tracked in this checkpoint context, then it needs to be pinned. Otherwise it is already pinned and we don't need to pin it again. The only other complexity is calculating the amount of new log space the formatting has consumed. This needs to be accounted to the transaction in progress, and the accounting is made more complex becase we need also to steal space from it for log metadata in the checkpoint transaction. Calculate all this at insert time and update all the tickets, counters, etc correctly. Once we've formatted all the log items in the transaction, attach the busy extents to the checkpoint context so the busy extents live until checkpoint completion and can be processed at that point in time. Transactions can then be freed at this point in time. Now we need to issue checkpoints - we are tracking the amount of log space used by the items in the CIL, so we can trigger background checkpoints when the space usage gets to a certain threshold. Otherwise, checkpoints need ot be triggered when a log synchronisation point is reached - a log force event. Because the log write code already handles chained log vectors, writing the transaction is trivial, too. Construct a transaction header, add it to the head of the chain and write it into the log, then issue a commit record write. Then we can release the checkpoint log ticket and attach the context to the log buffer so it can be called during Io completion to complete the checkpoint. We also need to allow for synchronising multiple in-flight checkpoints. This is needed for two things - the first is to ensure that checkpoint commit records appear in the log in the correct sequence order (so they are replayed in the correct order). The second is so that xfs_log_force_lsn() operates correctly and only flushes and/or waits for the specific sequence it was provided with. To do this we need a wait variable and a list tracking the checkpoint commits in progress. We can walk this list and wait for the checkpoints to change state or complete easily, an this provides the necessary synchronisation for correct operation in both cases. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
2010-05-24xfs: Improve scalability of busy extent trackingDave Chinner
When we free a metadata extent, we record it in the per-AG busy extent array so that it is not re-used before the freeing transaction hits the disk. This array is fixed size, so when it overflows we make further allocation transactions synchronous because we cannot track more freed extents until those transactions hit the disk and are completed. Under heavy mixed allocation and freeing workloads with large log buffers, we can overflow this array quite easily. Further, the array is sparsely populated, which means that inserts need to search for a free slot, and array searches often have to search many more slots that are actually used to check all the busy extents. Quite inefficient, really. To enable this aspect of extent freeing to scale better, we need a structure that can grow dynamically. While in other areas of XFS we have used radix trees, the extents being freed are at random locations on disk so are better suited to being indexed by an rbtree. So, use a per-AG rbtree indexed by block number to track busy extents. This incures a memory allocation when marking an extent busy, but should not occur too often in low memory situations. This should scale to an arbitrary number of extents so should not be a limitation for features such as in-memory aggregation of transactions. However, there are still situations where we can't avoid allocating busy extents (such as allocation from the AGFL). To minimise the overhead of such occurences, we need to avoid doing a synchronous log force while holding the AGF locked to ensure that the previous transactions are safely on disk before we use the extent. We can do this by marking the transaction doing the allocation as synchronous rather issuing a log force. Because of the locking involved and the ordering of transactions, the synchronous transaction provides the same guarantees as a synchronous log force because it ensures that all the prior transactions are already on disk when the synchronous transaction hits the disk. i.e. it preserves the free->allocate order of the extent correctly in recovery. By doing this, we avoid holding the AGF locked while log writes are in progress, hence reducing the length of time the lock is held and therefore we increase the rate at which we can allocate and free from the allocation group, thereby increasing overall throughput. The only problem with this approach is that when a metadata buffer is marked stale (e.g. a directory block is removed), then buffer remains pinned and locked until the log goes to disk. The issue here is that if that stale buffer is reallocated in a subsequent transaction, the attempt to lock that buffer in the transaction will hang waiting the log to go to disk to unlock and unpin the buffer. Hence if someone tries to lock a pinned, stale, locked buffer we need to push on the log to get it unlocked ASAP. Effectively we are trading off a guaranteed log force for a much less common trigger for log force to occur. Ideally we should not reallocate busy extents. That is a much more complex fix to the problem as it involves direct intervention in the allocation btree searches in many places. This is left to a future set of modifications. Finally, now that we track busy extents in allocated memory, we don't need the descriptors in the transaction structure to point to them. We can replace the complex busy chunk infrastructure with a simple linked list of busy extents. This allows us to remove a large chunk of code, making the overall change a net reduction in code size. Signed-off-by: Dave Chinner <david@fromorbit.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
2010-05-24xfs: make the log ticket ID available outside the log infrastructureDave Chinner
The ticket ID is needed to uniquely identify transactions when doing busy extent matching. Delayed logging changes the lifecycle of busy extents with respect to the transaction structure lifecycle. Hence we can no longer use the transaction structure as a means of determining the owner of the busy extent as it may be freed and reused while the busy extent is still active. This commit provides the infrastructure to access the xlog_tid_t held in the ticket from a transaction handle. This avoids the need for callers to peek into the transaction and log structures to find this out. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
2010-05-24xfs: clean up log ticket overrun debug outputDave Chinner
Push the error message output when a ticket overrun is detected into the ticket printing functions. Also remove the debug version of the code as the production version will still panic just as effectively on a debug kernel via the panic mask being set. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
2010-05-24xfs: Clean up XFS_BLI_* flag namespaceDave Chinner
Clean up the buffer log format (XFS_BLI_*) flags because they have a polluted namespace. They XFS_BLI_ prefix is used for both in-memory and on-disk flag feilds, but have overlapping values for different flags. Rename the buffer log format flags to use the XFS_BLF_* prefix to avoid confusing them with the in-memory XFS_BLI_* prefixed flags. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
2010-05-24xfs: modify buffer item reference countingDave Chinner
The buffer log item reference counts used to take referenceѕ for every transaction, similar to the pin counting. This is symmetric (like the pin/unpin) with respect to transaction completion, but with dleayed logging becomes assymetric as the pinning becomes assymetric w.r.t. transaction completion. To make both cases the same, allow the buffer pinning to take a reference to the buffer log item and always drop the reference the transaction has on it when being unlocked. This is balanced correctly because the unpin operation always drops a reference to the log item. Hence reference counting becomes symmetric w.r.t. item pinning as well as w.r.t active transactions and as a result the reference counting model remain consistent between normal and delayed logging. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
2010-05-24xfs: allow log ticket allocation to take allocation flagsDave Chinner
Delayed logging currently requires ticket allocation to succeed, so we need to be able to sleep on allocation. It also should not allow memory allocation to recurse into the filesystem. hence we need to pass allocation flags directing the type of allocation the caller requires. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
2010-05-24xfs: Don't reuse the same transaction ID for duplicated transactions.Dave Chinner
The transaction ID is written into the log as the unique identifier for transactions during recover. When duplicating a transaction, we reuse the log ticket, which means it has the same transaction ID as the previous transaction. Rather than regenerating a random transaction ID for the duplicated transaction, just add one to the current ID so that duplicated transaction can be easily spotted in the log and during recovery during problem diagnosis. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
2010-05-24Merge branch 'bkl/ioctl' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing * 'bkl/ioctl' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing: uml: Pushdown the bkl from harddog_kern ioctl sunrpc: Pushdown the bkl from sunrpc cache ioctl sunrpc: Pushdown the bkl from ioctl autofs4: Pushdown the bkl from ioctl uml: Convert to unlocked_ioctls to remove implicit BKL ncpfs: BKL ioctl pushdown coda: Clean-up whitespace problems in pioctl.c coda: BKL ioctl pushdown drivers: Push down BKL into various drivers isdn: Push down BKL into ioctl functions scsi: Push down BKL into ioctl functions dvb: Push down BKL into ioctl functions smbfs: Push down BKL into ioctl function coda/psdev: Remove BKL from ioctl function um/mmapper: Remove BKL usage sn_hwperf: Kill BKL usage hfsplus: Push down BKL into ioctl function
2010-05-24Merge branch 'for-linus' of git://git.open-osd.org/linux-open-osdLinus Torvalds
* 'for-linus' of git://git.open-osd.org/linux-open-osd: exofs: confusion between kmap() and kmap_atomic() api exofs: Add default address_space_operations
2010-05-24Merge git://git.kernel.org/pub/scm/linux/kernel/git/hirofumi/fatfs-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/hirofumi/fatfs-2.6: fat: convert to unlocked_ioctl fat: Cleanup nls_unload() usage fat: use pack_hex_byte() instead of custom one
2010-05-24Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs: 9p: Optimize TCREATE by eliminating a redundant fid clone. 9p: cleanup: remove unneeded assignment 9p: Add mksock support fs/9p: Make sure we properly instantiate dentry. 9p: add 9P2000.L rename operation 9p: add 9P2000.L statfs operation 9p: VFS switches for 9p2000.L: VFS switches 9p: VFS switches for 9p2000.L: protocol and client changes
2010-05-24Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (59 commits) ceph: reuse mon subscribe message instead of allocated anew ceph: avoid resending queued message to monitor ceph: Storage class should be before const qualifier ceph: all allocation functions should get gfp_mask ceph: specify max_bytes on readdir replies ceph: cleanup pool op strings ceph: Use kzalloc ceph: use common helper for aborted dir request invalidation ceph: cope with out of order (unsafe after safe) mds reply ceph: save peer feature bits in connection structure ceph: resync headers with userland ceph: use ceph. prefix for virtual xattrs ceph: throw out dirty caps metadata, data on session teardown ceph: attempt mds reconnect if mds closes our session ceph: clean up send_mds_reconnect interface ceph: wait for mds OPEN reply to indicate reconnect success ceph: only send cap releases when mds is OPEN|HUNG ceph: dicard cap releases on mds restart ceph: make mon client statfs handling more generic ceph: drop src address(es) from message header [new protocol feature] ...
2010-05-24GFS2: Fix permissions checking for setflags ioctl()Steven Whitehouse
We should be checking for the ownership of the file for which flags are being set, rather than just for write access. Reported-by: Dan Rosenberg <dan.j.rosenberg@gmail.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2010-05-24ufs: Remove dead quota codeJan Kara
UFS quota is non-functional at least since 2.6.12 because dq_op was set to NULL. Since the filesystem exists mainly to allow cooperation with Solaris and quota format isn't standard, just remove the dead code. CC: Evgeniy Dushistov <dushistov@mail.ru> Signed-off-by: Jan Kara <jack@suse.cz>
2010-05-24udf: Remove dead quota codeJan Kara
Quota on UDF is non-functional at least since 2.6.16 (I'm too lazy to do more archeology) because it does not provide .quota_write and .quota_read functions and thus quotaon(8) just returns EINVAL. Since nobody complained for all those years and quota support is not even in UDF standard just nuke it. Signed-off-by: Jan Kara <jack@suse.cz>
2010-05-24quota: rename default quotactl methods to dquot_Christoph Hellwig
Follow the dquot_* style used elsewhere in dquot.c. [Jan Kara: Fixed up missing conversion of ext2] Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jan Kara <jack@suse.cz>
2010-05-24quota: explicitly set ->dq_op and ->s_qcopChristoph Hellwig
Only set the quota operation vectors if the filesystem actually supports quota instead of doing it for all filesystems in alloc_super(). [Jan Kara: Export dquot_operations and vfs_quotactl_ops] Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jan Kara <jack@suse.cz>
2010-05-24quota: drop remount argument to ->quota_on and ->quota_offChristoph Hellwig
Remount handling has fully moved into the filesystem, so all this is superflous now. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jan Kara <jack@suse.cz>
2010-05-24quota: move unmount handling into the filesystemChristoph Hellwig
Currently the VFS calls into the quotactl interface for unmounting filesystems. This means filesystems with their own quota handling can't easily distinguish between user-space originating quotaoff and an unount. Instead move the responsibily of the unmount handling into the filesystem to be consistent with all other dquot handling. Note that we do call dquot_disable a lot later now, e.g. after a sync_filesystem. But this is fine as the quota code does all its writes via blockdev's mapping and that is synced even later. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jan Kara <jack@suse.cz>
2010-05-24quota: kill the vfs_dq_off and vfs_dq_quota_on_remount wrappersChristoph Hellwig
Instead of having wrappers in the VFS namespace export the dquot_suspend and dquot_resume helpers directly. Also rename vfs_quota_disable to dquot_disable while we're at it. [Jan Kara: Moved dquot_suspend to quotaops.h and made it inline] Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jan Kara <jack@suse.cz>
2010-05-24quota: move remount handling into the filesystemChristoph Hellwig
Currently do_remount_sb calls into the dquot code to tell it about going from rw to ro and ro to rw. Move this code into the filesystem to not depend on the dquot code in the VFS - note ocfs2 already ignores these calls and handles remount by itself. This gets rid of overloading the quotactl calls and allows to unify the VFS and XFS codepaths in that area later. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jan Kara <jack@suse.cz>
2010-05-24ocfs2: Fix use after free on remount read-onlyJan Kara
We also have to cancel quota syncing thread on remount read only because at that moment quota is being turned off. Otherwise quota syncing thread will try to access already freed quota structures. Signed-off-by: Jan Kara <jack@suse.cz>
2010-05-23squashfs: fix name reading in squashfs_xattr_getPhillip Lougher
Only read potentially matching names into the target buffer, all obviously non matching names don't need to be read into the target buffer. Signed-off-by: Phillip Lougher <phillip@lougher.demon.co.uk>
2010-05-23squashfs: constify xattr handlersPhillip Lougher
Signed-off-by: Phillip Lougher <phillip@lougher.demon.co.uk>
2010-05-229p: Optimize TCREATE by eliminating a redundant fid clone.Venkateswararao Jujjuri
This patch removes a redundant fid clone on the directory fid and hence reduces a server transaction while creating new filesystem object. Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2010-05-229p: cleanup: remove unneeded assignmentDan Carpenter
We never use "v9ses" and so we can remove it. Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2010-05-229p: Add mksock supportVenkateswararao Jujjuri
Without this patch, an attempt to mksock will get an EINVAL. Before this patch: [root@localhost 1dir]# mksock mysock mksock: error making mysock: Invalid argument With this patch: [root@localhost 1dir]# mksock mysock [root@localhost 1dir]# ls -l mysock s--------- 1 root root 0 2010-03-31 17:44 mysock Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2010-05-22fs/9p: Make sure we properly instantiate dentry.Aneesh Kumar K.V
For lookup if we get ENOENT error from the server we still instantiate the dentry. We need to make sure we have dentry operations set in that case so that a later dput on the dentry does the expected. Without the patch we get the below error #ln -sf abc abclink ln: creating symbolic link `abclink': No such file or directory Now on the host do $ touch abclink Guest now gives ENOENT error. # ls ls: cannot access abclink: No such file or directory Debugged-by:Gautham R Shenoy <ego@in.ibm.com> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2010-05-22autofs4: Pushdown the bkl from ioctlFrederic Weisbecker
Pushdown the bkl to autofs4_root_ioctl. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ian Kent <raven@themaw.net> Cc: Autofs <autofs@linux.kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: John Kacur <jkacur@redhat.com> Cc: Arnd Bergmann <arnd@arndb.de>
2010-05-21Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (69 commits) fix handling of offsets in cris eeprom.c, get rid of fake on-stack files get rid of home-grown mutex in cris eeprom.c switch ecryptfs_write() to struct inode *, kill on-stack fake files switch ecryptfs_get_locked_page() to struct inode * simplify access to ecryptfs inodes in ->readpage() and friends AFS: Don't put struct file on the stack Ban ecryptfs over ecryptfs logfs: replace inode uid,gid,mode initialization with helper function ufs: replace inode uid,gid,mode initialization with helper function udf: replace inode uid,gid,mode init with helper ubifs: replace inode uid,gid,mode initialization with helper function sysv: replace inode uid,gid,mode initialization with helper function reiserfs: replace inode uid,gid,mode initialization with helper function ramfs: replace inode uid,gid,mode initialization with helper function omfs: replace inode uid,gid,mode initialization with helper function bfs: replace inode uid,gid,mode initialization with helper function ocfs2: replace inode uid,gid,mode initialization with helper function nilfs2: replace inode uid,gid,mode initialization with helper function minix: replace inode uid,gid,mode init with helper ext4: replace inode uid,gid,mode init with helper ... Trivial conflict in fs/fs-writeback.c (mark bitfields unsigned)
2010-05-21ceph: reuse mon subscribe message instead of allocated anewSage Weil
Use the same message, allocated during startup. No need to reallocate a new one each time around (and potentially ENOMEM). Signed-off-by: Sage Weil <sage@newdream.net>
2010-05-21switch ecryptfs_write() to struct inode *, kill on-stack fake filesAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-05-21switch ecryptfs_get_locked_page() to struct inode *Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-05-21simplify access to ecryptfs inodes in ->readpage() and friendsAl Viro
we can get to them from page->mapping->host, no need to mess with file. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-05-21AFS: Don't put struct file on the stackAl Viro
Don't put struct file on the stack as it takes up quite a lot of space and violates lifetime rules for struct file. Rather than calling afs_readpage() indirectly from the directory routines by way of read_mapping_page(), split afs_readpage() to have afs_page_filler() that's given a key instead of a file and call read_cache_page(), specifying the new function directly. Use it in afs_readpages() as well. Also make use of this in afs_mntpt_check_symlink() too for the same reason. Reported-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David Howells <dhowells@redhat.com>
2010-05-21Ban ecryptfs over ecryptfsAl Viro
This is a seriously simplified patch from Eric Sandeen; copy of rationale follows: === mounting stacked ecryptfs on ecryptfs has been shown to lead to bugs in testing. For crypto info in xattr, there is no mechanism for handling this at all, and for normal file headers, we run into other trouble: BUG: unable to handle kernel NULL pointer dereference at 0000000000000008 IP: [<ffffffffa015b0b3>] ecryptfs_d_revalidate+0x43/0xa0 [ecryptfs] ... There doesn't seem to be any good usecase for this, so I'd suggest just disallowing the configuration. Based on a patch originally, I believe, from Mike Halcrow. === Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-05-21logfs: replace inode uid,gid,mode initialization with helper functionAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-05-21ufs: replace inode uid,gid,mode initialization with helper functionDmitry Monakhov
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-05-21udf: replace inode uid,gid,mode init with helperDmitry Monakhov
Acked-by: Jan Kara <jack@suse.cz> Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>