summaryrefslogtreecommitdiff
path: root/fs/nfsd/vfs.c
AgeCommit message (Collapse)Author
2018-09-24vfs: swap names of {do,vfs}_clone_file_range()Amir Goldstein
Commit 031a072a0b8a ("vfs: call vfs_clone_file_range() under freeze protection") created a wrapper do_clone_file_range() around vfs_clone_file_range() moving the freeze protection to former, so overlayfs could call the latter. The more common vfs practice is to call do_xxx helpers from vfs_xxx helpers, where freeze protecction is taken in the vfs_xxx helper, so this anomality could be a source of confusion. It seems that commit 8ede205541ff ("ovl: add reflink/copyfile/dedup support") may have fallen a victim to this confusion - ovl_clone_file_range() calls the vfs_clone_file_range() helper in the hope of getting freeze protection on upper fs, but in fact results in overlayfs allowing to bypass upper fs freeze protection. Swap the names of the two helpers to conform to common vfs practice and call the correct helpers from overlayfs and nfsd. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-07-12IMA: don't propagate opened through the entire thingAl Viro
just check ->f_mode in ima_appraise_measurement() Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2018-05-21nfsd: vfs_mkdir() might succeed leaving dentry negative unhashedAl Viro
That can (and does, on some filesystems) happen - ->mkdir() (and thus vfs_mkdir()) can legitimately leave its argument negative and just unhash it, counting upon the lookup to pick the object we'd created next time we try to look at that name. Some vfs_mkdir() callers forget about that possibility... Acked-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2018-04-03nfsd: Add I/O trace points in the NFSv4 read procChuck Lever
NFSv4 read compound processing invokes nfsd_splice_read and nfs_readv directly, so the trace points currently in nfsd_read are not invoked for NFSv4 reads. Move the NFSD READ trace points to common helpers so that NFSv4 reads are captured. Also, record any local I/O error that occurs, the total count of bytes that were actually returned, and whether splice or vectored read was used. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-04-03nfsd: Add I/O trace points in the NFSv4 write pathChuck Lever
NFSv4 write compound processing invokes nfsd_vfs_write directly. The trace points currently in nfsd_write are not effective for NFSv4 writes. Move the trace points into the shared nfsd_vfs_write() helper. After the I/O, we also want to record any local I/O error that might have occurred, and the total count of bytes that were actually moved (rather than the requested number). Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-04-03nfsd: Add "nfsd_" to trace point namesChuck Lever
Follow naming convention used in client and in sunrpc layers. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-04-03nfsd: Record request byte count, not count of vectorsChuck Lever
Byte count is more helpful to know than vector count. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2017-11-02License cleanup: add SPDX GPL-2.0 license identifier to files with no licenseGreg Kroah-Hartman
Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a */uapi/* one with no licensing information in it, - file was a */uapi/* one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non */uapi/* files that summary was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a */uapi/* path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the */uapi/* ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-08-31annotate RWF_... flagsChristoph Hellwig
[AV: added missing annotations in syscalls.h/compat.h] Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-07-05Merge branch 'work.read_write' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull read/write updates from Al Viro: "Christoph's fs/read_write.c series - consolidation and cleanups" * 'work.read_write' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: nfsd: remove nfsd_vfs_read nfsd: use vfs_iter_read/write fs: implement vfs_iter_write using do_iter_write fs: implement vfs_iter_read using do_iter_read fs: move more code into do_iter_read/do_iter_write fs: remove __do_readv_writev fs: remove do_compat_readv_writev fs: remove do_readv_writev
2017-06-29nfsd: remove nfsd_vfs_readChristoph Hellwig
Simpler done in the only caller. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-06-29nfsd: use vfs_iter_read/writeChristoph Hellwig
Instead of messing with the address limit to use vfs_read/vfs_writev. Note that this requires that exported file implement ->read_iter and ->write_iter. All currently exportable file systems do this. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-05-27nfsd_readlink(): switch to vfs_get_link()Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-05-10Merge tag 'nfsd-4.12' of git://linux-nfs.org/~bfields/linuxLinus Torvalds
Pull nfsd updates from Bruce Fields: "Another RDMA update from Chuck Lever, and a bunch of miscellaneous bugfixes" * tag 'nfsd-4.12' of git://linux-nfs.org/~bfields/linux: (26 commits) nfsd: Fix up the "supattr_exclcreat" attributes nfsd: encoders mustn't use unitialized values in error cases nfsd: fix undefined behavior in nfsd4_layout_verify lockd: fix lockd shutdown race NFSv4: Fix callback server shutdown SUNRPC: Refactor svc_set_num_threads() NFSv4.x/callback: Create the callback service through svc_create_pooled lockd: remove redundant check on block svcrdma: Clean out old XDR encoders svcrdma: Remove the req_map cache svcrdma: Remove unused RDMA Write completion handler svcrdma: Reduce size of sge array in struct svc_rdma_op_ctxt svcrdma: Clean up RPC-over-RDMA backchannel reply processing svcrdma: Report Write/Reply chunk overruns svcrdma: Clean up RDMA_ERROR path svcrdma: Use rdma_rw API in RPC reply path svcrdma: Introduce local rdma_rw API helpers svcrdma: Clean up svc_rdma_get_inv_rkey() svcrdma: Add helper to save pages under I/O svcrdma: Eliminate RPCRDMA_SQ_DEPTH_MULT ...
2017-04-25NFS: don't try to cross a mountpount when there isn't one there.NeilBrown
consider the sequence of commands: mkdir -p /import/nfs /import/bind /import/etc mount --bind / /import/bind mount --make-private /import/bind mount --bind /import/etc /import/bind/etc exportfs -o rw,no_root_squash,crossmnt,async,no_subtree_check localhost:/ mount -o vers=4 localhost:/ /import/nfs ls -l /import/nfs/etc You would not expect this to report a stale file handle. Yet it does. The manipulations under /import/bind cause the dentry for /etc to get the DCACHE_MOUNTED flag set, even though nothing is mounted on /etc. This causes nfsd to call nfsd_cross_mnt() even though there is no mountpoint. So an upcall to mountd for "/etc" is performed. The 'crossmnt' flag on the export of / causes mountd to report that /etc is exported as it is a descendant of /. It assumes the kernel wouldn't ask about something that wasn't a mountpoint. The filehandle returned identifies the filesystem and the inode number of /etc. When this filehandle is presented to rpc.mountd, via "nfsd.fh", the inode cannot be found associated with any name in /etc/exports, or with any mountpoint listed by getmntent(). So rpc.mountd says the filehandle doesn't exist. Hence ESTALE. This is fixed by teaching nfsd not to trust DCACHE_MOUNTED too much. It is just a hint, not a guarantee. Change nfsd_mountpoint() to return '1' for a certain mountpoint, '2' for a possible mountpoint, and 0 otherwise. Then change nfsd_crossmnt() to check if follow_down() actually found a mountpount and, if not, to avoid performing a lookup if the location is not known to certainly require an export-point. Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2017-04-11sched/core: Remove 'task' parameter and rename tsk_restore_flags() to ↵NeilBrown
current_restore_flags() It is not safe for one thread to modify the ->flags of another thread as there is no locking that can protect the update. So tsk_restore_flags(), which takes a task pointer and modifies the flags, is an invitation to do the wrong thing. All current users pass "current" as the task, so no developers have accepted that invitation. It would be best to ensure it remains that way. So rename tsk_restore_flags() to current_restore_flags() and don't pass in a task_struct pointer. Always operate on current->flags. Signed-off-by: NeilBrown <neilb@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Michal Hocko <mhocko@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-02-21nfsd: special case truncates some moreChristoph Hellwig
Both the NFS protocols and the Linux VFS use a setattr operation with a bitmap of attributes to set to set various file attributes including the file size and the uid/gid. The Linux syscalls never mix size updates with unrelated updates like the uid/gid, and some file systems like XFS and GFS2 rely on the fact that truncates don't update random other attributes, and many other file systems handle the case but do not update the other attributes in the same transaction. NFSD on the other hand passes the attributes it gets on the wire more or less directly through to the VFS, leading to updates the file systems don't expect. XFS at least has an assert on the allowed attributes, which caught an unusual NFS client setting the size and group at the same time. To handle this issue properly this splits the notify_change call in nfsd_setattr into two separate ones. Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: stable@vger.kernel.org Tested-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2017-02-20nfsd: minor nfsd_setattr cleanupChristoph Hellwig
Simplify exit paths, size_change use. Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: stable@kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2017-02-20nfsd: merge stable fix into main nfsd branchJ. Bruce Fields
2017-02-09nfsd: Revert "nfsd: special case truncates some more"J. Bruce Fields
This patch incorrectly attempted nested mnt_want_write, and incorrectly disabled nfsd's owner override for truncate. We'll fix those problems and make another attempt soon, for the moment I think the safest is to revert. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2017-01-31NFSD: Remove unused value inode in nfsd_vfs_writeKinglong Mee
This is just cleanup, no change in functionality. Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2017-01-31NFSD: cleanup dead codes and values in nfsd_writeKinglong Mee
This is just cleanup, no change in functionality. Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2017-01-31NFSD: pass an integer for stable type to nfsd_vfs_writeKinglong Mee
After fae5096ad217 "nfsd: assume writeable exportabled filesystems have f_sync" we no longer modify this argument. This is just cleanup, no change in functionality. Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2017-01-31nfsd: special case truncates some moreChristoph Hellwig
Both the NFS protocols and the Linux VFS use a setattr operation with a bitmap of attributs to set to set various file attributes including the file size and the uid/gid. The Linux syscalls never mixes size updates with unrelated updates like the uid/gid, and some file systems like XFS and GFS2 rely on the fact that truncates might not update random other attributes, and many other file systems handle the case but do not update the different attributes in the same transaction. NFSD on the other hand passes the attributes it gets on the wire more or less directly through to the VFS, leading to updates the file systems don't expect. XFS at least has an assert on the allowed attributes, which caught an unusual NFS client setting the size and group at the same time. To handle this issue properly this switches nfsd to call vfs_truncate for size changes, and then handle all other attributes through notify_change. As a side effect this also means less boilerplace code around the size change as we can now reuse the VFS code. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2016-12-24Replace <asm/uaccess.h> with <linux/uaccess.h> globallyLinus Torvalds
This was entirely automated, using the script by Al: PATT='^[[:blank:]]*#[[:blank:]]*include[[:blank:]]*<asm/uaccess.h>' sed -i -e "s!$PATT!#include <linux/uaccess.h>!" \ $(git grep -l "$PATT"|grep -v ^include/linux/uaccess.h) to do the replacement at the end of the merge window. Requested-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-17Merge uncontroversial parts of branch 'readlink' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs Pull partial readlink cleanups from Miklos Szeredi. This is the uncontroversial part of the readlink cleanup patch-set that simplifies the default readlink handling. Miklos and Al are still discussing the rest of the series. * git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs: vfs: make generic_readlink() static vfs: remove ".readlink = generic_readlink" assignments vfs: default to generic_readlink() vfs: replace calling i_op->readlink with vfs_readlink() proc/self: use generic_readlink ecryptfs: use vfs_get_link() bad_inode: add missing i_op initializers
2016-12-16vfs: call vfs_clone_file_range() under freeze protectionAmir Goldstein
Move sb_start_write()/sb_end_write() out of the vfs helper and up into the ioctl handler. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-12-09vfs: replace calling i_op->readlink with vfs_readlink()Miklos Szeredi
Also check d_is_symlink() in callers instead of inode->i_op->readlink because following patches will allow NULL ->readlink for symlinks. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-10-07NFSD: Implement the COPY callAnna Schumaker
I only implemented the sync version of this call, since it's the easiest. I can simply call vfs_copy_range() and have the vfs do the right thing for the filesystem being exported. Signed-off-by: Anna Schumaker <bjschuma@netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2016-08-11nfsd: fix dentry refcounting on createJosef Bacik
b44061d0b9 introduced a dentry ref counting bug. Previously we were grabbing one ref to dchild in nfsd_create(), but with the creation of nfsd_create_locked() we have a ref for dchild from the lookup in nfsd_create(), and then another ref in nfsd_create_locked(). The ref from the lookup in nfsd_create() is never dropped and results in dentries still in use at unmount. Signed-off-by: Josef Bacik <jbacik@fb.com> Fixes: b44061d0b9 "nfsd: reorganize nfsd_create" Reported-by: kernel test robot <xiaolong.ye@intel.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2016-08-04nfsd: remove some dead code in nfsd_create_locked()Dan Carpenter
We changed this around in f135af1041f ('nfsd: reorganize nfsd_create') so "dchild" can't be an error pointer any more. Also, dchild can't be NULL here (and dput would already handle this even if it was). Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2016-08-04nfsd: drop unnecessary MAY_EXEC check from createJ. Bruce Fields
We need an fh_verify to make sure we at least have a dentry, but actual permission checks happen later. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2016-08-04nfsd: clean up bad-type check in nfsd_create_lockedJ. Bruce Fields
Minor cleanup, no change in behavior. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2016-08-04nfsd: remove unnecessary positive-dentry checkJ. Bruce Fields
vfs_{create,mkdir,mknod} each begin with a call to may_create(), which returns EEXIST if the object already exists. This check is therefore unnecessary. (In the NFSv2 case, nfsd_proc_create also has such a check. Contrary to RFC 1094, our code seems to believe that a CREATE of an existing file should succeed. I'm leaving that behavior alone.) Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2016-08-04nfsd: reorganize nfsd_createJ. Bruce Fields
There's some odd logic in nfsd_create() that allows it to be called with the parent directory either locked or unlocked. The only already-locked caller is NFSv2's nfsd_proc_create(). It's less confusing to split out the unlocked case into a separate function which the NFSv2 code can call directly. Also fix some comments while we're here. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2016-08-04nfsd: check d_can_lookup in fh_verify of directoriesJ. Bruce Fields
Create and other nfsd ops generally assume we can call lookup_one_len on inodes with S_IFDIR set. Al says that this assumption isn't true in general, though it should be for the filesystem objects nfsd sees. Add a check just to make sure our assumption isn't violated. Remove a couple checks for i_op->lookup in create code. Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2016-08-04nfsd: remove redundant zero-length check from createJ. Bruce Fields
lookup_one_len already has this check. The only effect of this patch is to return access instead of perm in the 0-length-filename case. I actually prefer nfserr_perm (or _inval?), but I doubt anyone cares. The isdotent check seems redundant too, but I worry that some client might actually care about that strange nfserr_exist error. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2016-08-04nfsd: Make creates return EEXIST instead of EACCESOleg Drokin
When doing a create (mkdir/mknod) on a name, it's worth checking the name exists first before returning EACCES in case the directory is not writeable by the user. This makes return values on the client more consistent regardless of whenever the entry there is cached in the local cache or not. Another positive side effect is certain programs only expect EEXIST in that case even despite POSIX allowing any valid error to be returned. Signed-off-by: Oleg Drokin <green@linuxhacker.ru> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2016-05-01nfsd: use RWF_SYNCChristoph Hellwig
Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-03-04vfs: pass a flags argument to vfs_readv/vfs_writevChristoph Hellwig
This way we can set kiocb flags also from the sync read/write path for the read_iter/write_iter operations. For now there is no way to pass flags to plain read/write operations as there is no real need for that, and all flags passed are explicitly rejected for these files. Signed-off-by: Milosz Tanski <milosz@adfin.com> [hch: rebased on top of my kiocb changes] Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Stephen Bates <stephen.bates@pmcs.com> Tested-by: Stephen Bates <stephen.bates@pmcs.com> Acked-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-01-22wrappers for ->i_mutex accessAl Viro
parallel to mutex_{lock,unlock,trylock,is_locked,lock_nested}, inode_foo(inode) being mutex_foo(&inode->i_mutex). Please, use those for access to ->i_mutex; over the coming cycle ->i_mutex will become rwsem, with ->lookup() done with it held only shared. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-01-15Merge tag 'nfsd-4.5' of git://linux-nfs.org/~bfields/linuxLinus Torvalds
Pull nfsd updates from Bruce Fields: "Smaller bugfixes and cleanup, including a fix for a failures of kerberized NFSv4.1 mounts, and Scott Mayhew's work addressing ACK storms that can affect some high-availability NFS setups" * tag 'nfsd-4.5' of git://linux-nfs.org/~bfields/linux: nfsd: add new io class tracepoint nfsd: give up on CB_LAYOUTRECALLs after two lease periods nfsd: Fix nfsd leaks sunrpc module references lockd: constify nlmsvc_binding structure lockd: use to_delayed_work nfsd: use to_delayed_work Revert "svcrdma: Do not send XDR roundup bytes for a write chunk" lockd: Register callbacks on the inetaddr_chain and inet6addr_chain nfsd: Register callbacks on the inetaddr_chain and inet6addr_chain sunrpc: Add a function to close temporary transports immediately nfsd: don't base cl_cb_status on stale information nfsd4: fix gss-proxy 4.1 mounts for some AD principals nfsd: fix unlikely NULL deref in mach_creds_match nfsd: minor consolidation of mach_cred handling code nfsd: helper for dup of possibly NULL string svcrpc: move some initialization to common code nfsd: fix a warning message nfsd: constify nfsd4_callback_ops structure nfsd: recover: constify nfsd4_client_tracking_ops structures svcrdma: Do not send XDR roundup bytes for a write chunk
2016-01-14nfsd: add new io class tracepointJeff Layton
Add some new tracepoints in the nfsd read/write codepaths. The idea is that this will give us the ability to measure how long each phase of a read or write operation takes. Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2016-01-12Merge branch 'work.misc' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull misc vfs updates from Al Viro: "All kinds of stuff. That probably should've been 5 or 6 separate branches, but by the time I'd realized how large and mixed that bag had become it had been too close to -final to play with rebasing. Some fs/namei.c cleanups there, memdup_user_nul() introduction and switching open-coded instances, burying long-dead code, whack-a-mole of various kinds, several new helpers for ->llseek(), assorted cleanups and fixes from various people, etc. One piece probably deserves special mention - Neil's lookup_one_len_unlocked(). Similar to lookup_one_len(), but gets called without ->i_mutex and tries to avoid ever taking it. That, of course, means that it's not useful for any directory modifications, but things like getting inode attributes in nfds readdirplus are fine with that. I really should've asked for moratorium on lookup-related changes this cycle, but since I hadn't done that early enough... I *am* asking for that for the coming cycle, though - I'm going to try and get conversion of i_mutex to rwsem with ->lookup() done under lock taken shared. There will be a patch closer to the end of the window, along the lines of the one Linus had posted last May - mechanical conversion of ->i_mutex accesses to inode_lock()/inode_unlock()/inode_trylock()/ inode_is_locked()/inode_lock_nested(). To quote Linus back then: ----- | This is an automated patch using | | sed 's/mutex_lock(&\(.*\)->i_mutex)/inode_lock(\1)/' | sed 's/mutex_unlock(&\(.*\)->i_mutex)/inode_unlock(\1)/' | sed 's/mutex_lock_nested(&\(.*\)->i_mutex,[ ]*I_MUTEX_\([A-Z0-9_]*\))/inode_lock_nested(\1, I_MUTEX_\2)/' | sed 's/mutex_is_locked(&\(.*\)->i_mutex)/inode_is_locked(\1)/' | sed 's/mutex_trylock(&\(.*\)->i_mutex)/inode_trylock(\1)/' | | with a very few manual fixups ----- I'm going to send that once the ->i_mutex-affecting stuff in -next gets mostly merged (or when Linus says he's about to stop taking merges)" * 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (63 commits) nfsd: don't hold i_mutex over userspace upcalls fs:affs:Replace time_t with time64_t fs/9p: use fscache mutex rather than spinlock proc: add a reschedule point in proc_readfd_common() logfs: constify logfs_block_ops structures fcntl: allow to set O_DIRECT flag on pipe fs: __generic_file_splice_read retry lookup on AOP_TRUNCATED_PAGE fs: xattr: Use kvfree() [s390] page_to_phys() always returns a multiple of PAGE_SIZE nbd: use ->compat_ioctl() fs: use block_device name vsprintf helper lib/vsprintf: add %*pg format specifier fs: use gendisk->disk_name where possible poll: plug an unused argument to do_poll amdkfd: don't open-code memdup_user() cdrom: don't open-code memdup_user() rsxx: don't open-code memdup_user() mtip32xx: don't open-code memdup_user() [um] mconsole: don't open-code memdup_user_nul() [um] hostaudio: don't open-code memdup_user() ...
2016-01-09nfsd: don't hold i_mutex over userspace upcallsNeilBrown
We need information about exports when crossing mountpoints during lookup or NFSv4 readdir. If we don't already have that information cached, we may have to ask (and wait for) rpc.mountd. In both cases we currently hold the i_mutex on the parent of the directory we're asking rpc.mountd about. We've seen situations where rpc.mountd performs some operation on that directory that tries to take the i_mutex again, resulting in deadlock. With some care, we may be able to avoid that in rpc.mountd. But it seems better just to avoid holding a mutex while waiting on userspace. It appears that lookup_one_len is pretty much the only operation that needs the i_mutex. So we could just drop the i_mutex elsewhere and do something like mutex_lock() lookup_one_len() mutex_unlock() In many cases though the lookup would have been cached and not required the i_mutex, so it's more efficient to create a lookup_one_len() variant that only takes the i_mutex when necessary. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-12-07nfsd: implement the NFSv4.2 CLONE operationChristoph Hellwig
This is basically a remote version of the btrfs CLONE operation, so the implementation is fairly trivial. Made even more trivial by stealing the XDR code and general framework Anna Schumaker's COPY prototype. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: J. Bruce Fields <bfields@fieldses.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-10-12nfsd: switch unsigned char flags in svc_fh to boolsJeff Layton
...just for clarity. Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-08-31NFSD: Set the attributes used to store the verifier for EXCLUSIVE4_1Kinglong Mee
According to rfc5661 18.16.4, "If EXCLUSIVE4_1 was used, the client determines the attributes used for the verifier by comparing attrset with cva_attrs.attrmask;" So, EXCLUSIVE4_1 also needs those bitmask used to store the verifier. Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-06-22nfsd: take struct file setup fully into nfs4_preprocess_stateid_opChristoph Hellwig
This patch changes nfs4_preprocess_stateid_op so it always returns a valid struct file if it has been asked for that. For that we now allocate a temporary struct file for special stateids, and check permissions if we got the file structure from the stateid. This ensures that all callers will get their handling of special stateids right, and avoids code duplication. There is a little wart in here because the read code needs to know if we allocated a file structure so that it can copy around the read-ahead parameters. In the long run we should probably aim to cache full file structures used with special stateids instead. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-06-19nfsd: clean up raparams handlingChristoph Hellwig
Refactor the raparam hash helpers to just deal with the raparms, and keep opening/closing files separate from that. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>