summaryrefslogtreecommitdiff
path: root/fs/nfsd/xdr4.h
AgeCommit message (Collapse)Author
2017-11-07nfsd4: fix cached replies to solo SEQUENCE compoundsJ. Bruce Fields
Currently our handling of 4.1+ requests without "cachethis" set is confusing and not quite correct. Suppose a client sends a compound consisting of only a single SEQUENCE op, and it matches the seqid in a session slot (so it's a retry), but the previous request with that seqid did not have "cachethis" set. The obvious thing to do might be to return NFS4ERR_RETRY_UNCACHED_REP, but the protocol only allows that to be returned on the op following the SEQUENCE, and there is no such op in this case. The protocol permits us to cache replies even if the client didn't ask us to. And it's easy to do so in the case of solo SEQUENCE compounds. So, when we get a solo SEQUENCE, we can either return the previously cached reply or NFSERR_SEQ_FALSE_RETRY if we notice it differs in some way from the original call. Currently, we're returning a corrupt reply in the case a solo SEQUENCE matches a previous compound with more ops. This actually matters because the Linux client recently started doing this as a way to recover from lost replies to idempotent operations in the case the process doing the original reply was killed: in that case it's difficult to keep the original arguments around to do a real retry, and the client no longer cares what the result is anyway, but it would like to make sure that the slot's sequence id has been incremented, and the solo SEQUENCE assures that: if the server never got the original reply, it will increment the sequence id. If it did get the original reply, it won't increment, and nothing else that about the reply really matters much. But we can at least attempt to return valid xdr! Tested-by: Olga Kornievskaia <aglo@umich.edu> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2017-09-05nfsd: Incoming xdr_bufs may have content in tail bufferChuck Lever
Since the beginning, svcsock has built a received RPC Call message by populating the xdr_buf's head, then placing the remaining message bytes in the xdr_buf's page list. The xdr_buf's tail is never populated. This means that an NFSv4 COMPOUND containing an NFS WRITE operation plus trailing operations has a page list that contains the WRITE data payload followed by the trailing operations. NFSv4 XDR decoders will not look in the xdr_buf's tail, ever, because svcsock never put anything there. To support transports that can pass the write payload in the xdr_buf's pagelist and trailing content in the xdr_buf's tail, introduce logic in READ_BUF that switches to the xdr_buf's tail vec when the decoder runs out of content in rq_arg.pages. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2017-08-24nfsd4: skip encoder in trivial error casesJ. Bruce Fields
Most encoders do nothing in the error case. But they can still screw things up in that case: most errors happen very early in rpc processing, possibly before argument fields are filled in and bounds-tested, so encoders that do anything other than immediately bail on error can easily crash in odd error cases. So just handle errors centrally most of the time to remove the chance of error. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2017-08-24nfsd4: define ->op_release for compound opsJ. Bruce Fields
Run a separate ->op_release function if necessary instead of depending on the xdr encoder to do this. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2017-08-24nfsd4: opdesc will be useful outside nfs4proc.cJ. Bruce Fields
Trivial cleanup, no change in behavior. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2017-08-01nfsd4: move some nfsd4 op definitions to xdr4.hJ. Bruce Fields
I want code in nfs4xdr.c to have access to this stuff. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2017-05-15nfsd4: properly type op_func callbacksChristoph Hellwig
Pass union nfsd4_op_u to the op_func callbacks instead of using unsafe function pointer casts. It also adds two missing structures to struct nfsd4_op.u to facilitate this. Signed-off-by: Christoph Hellwig <hch@lst.de>
2017-05-15nfsd4: properly type op_set_currentstateid callbacksChristoph Hellwig
Given the args union in struct nfsd4_op a name, and pass it to the op_set_currentstateid callbacks instead of using unsafe function pointer casts. Signed-off-by: Christoph Hellwig <hch@lst.de>
2017-05-15sunrpc: properly type pc_encode callbacksChristoph Hellwig
Drop the resp argument as it can trivially be derived from the rqstp argument. With that all functions now have the same prototype, and we can remove the unsafe casting to kxdrproc_t. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-05-15sunrpc: properly type pc_decode callbacksChristoph Hellwig
Drop the argp argument as it can trivially be derived from the rqstp argument. With that all functions now have the same prototype, and we can remove the unsafe casting to kxdrproc_t. Signed-off-by: Christoph Hellwig <hch@lst.de>
2017-05-15sunrpc: properly type pc_release callbacksChristoph Hellwig
Drop the p and resp arguments as they are always NULL or can trivially be derived from the rqstp argument. With that all functions now have the same prototype, and we can remove the unsafe casting to kxdrproc_t. Signed-off-by: Christoph Hellwig <hch@lst.de>
2016-10-07NFSD: Implement the COPY callAnna Schumaker
I only implemented the sync version of this call, since it's the easiest. I can simply call vfs_copy_range() and have the vfs do the right thing for the filesystem being exported. Signed-off-by: Anna Schumaker <bjschuma@netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2016-07-13nfsd: implement machine credential support for some operationsAndrew Elble
This addresses the conundrum referenced in RFC5661 18.35.3, and will allow clients to return state to the server using the machine credentials. The biggest part of the problem is that we need to allow the client to send a compound op with integrity/privacy on mounts that don't have it enabled. Add server support for properly decoding and using spo_must_enforce and spo_must_allow bits. Add support for machine credentials to be used for CLOSE, OPEN_DOWNGRADE, LOCKU, DELEGRETURN, and TEST/FREE STATEID. Implement a check so as to not throw WRONGSEC errors when these operations are used if integrity/privacy isn't turned on. Without this, Linux clients with credentials that expired while holding delegations were getting stuck in an endless loop. Signed-off-by: Andrew Elble <aweits@rit.edu> Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2016-07-13nfsd: allow mach_creds_match to be used more broadlyAndrew Elble
Rename mach_creds_match() to nfsd4_mach_creds_match() and un-staticify Signed-off-by: Andrew Elble <aweits@rit.edu> Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-12-07nfsd: implement the NFSv4.2 CLONE operationChristoph Hellwig
This is basically a remote version of the btrfs CLONE operation, so the implementation is fairly trivial. Made even more trivial by stealing the XDR code and general framework Anna Schumaker's COPY prototype. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: J. Bruce Fields <bfields@fieldses.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-10-12nfsd: switch unsigned char flags in svc_fh to boolsJeff Layton
...just for clarity. Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-06-22nfsd: take struct file setup fully into nfs4_preprocess_stateid_opChristoph Hellwig
This patch changes nfs4_preprocess_stateid_op so it always returns a valid struct file if it has been asked for that. For that we now allocate a temporary struct file for special stateids, and check permissions if we got the file structure from the stateid. This ensures that all callers will get their handling of special stateids right, and avoids code duplication. There is a little wart in here because the read code needs to know if we allocated a file structure so that it can copy around the read-ahead parameters. In the long run we should probably aim to cache full file structures used with special stateids instead. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-05-04nfsd: fix pNFS return on close semanticsSachin Bhamare
For the sake of forgetful clients, the server should return the layouts to the file system on 'last close' of a file (assuming that there are no delegations outstanding to that particular client) or on delegreturn (assuming that there are no opens on a file from that particular client). In theory the information is all there in current data structures, but it's not efficiently available; nfs4_file->fi_ref includes references on the file across all clients, but we need a per-(client, file) count. Walking through lots of stateid's to calculate this on each close or delegreturn would be painful. This patch introduces infrastructure to maintain per-client opens and delegation counters on a per-file basis. [hch: ported to the mainline pNFS support, merged various fixes from Jeff] Signed-off-by: Sachin Bhamare <sachin.bhamare@primarydata.com> Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-04-26Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull fourth vfs update from Al Viro: "d_inode() annotations from David Howells (sat in for-next since before the beginning of merge window) + four assorted fixes" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: RCU pathwalk breakage when running into a symlink overmounting something fix I_DIO_WAKEUP definition direct-io: only inc/dec inode->i_dio_count for file systems fs/9p: fix readdir() VFS: assorted d_backing_inode() annotations VFS: fs/inode.c helpers: d_inode() annotations VFS: fs/cachefiles: d_backing_inode() annotations VFS: fs library helpers: d_inode() annotations VFS: assorted weird filesystems: d_inode() annotations VFS: normal filesystems (and lustre): d_inode() annotations VFS: security/: d_inode() annotations VFS: security/: d_backing_inode() annotations VFS: net/: d_inode() annotations VFS: net/unix: d_backing_inode() annotations VFS: kernel/: d_inode() annotations VFS: audit: d_backing_inode() annotations VFS: Fix up some ->d_inode accesses in the chelsio driver VFS: Cachefiles should perform fs modifications on the top layer only VFS: AF_UNIX sockets should call mknod on the top layer only
2015-04-15VFS: normal filesystems (and lustre): d_inode() annotationsDavid Howells
that's the bulk of filesystem drivers dealing with inodes of their own Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-03-31nfsd: Remove duplicate macro define for max sec label lengthKinglong Mee
NFS4_MAXLABELLEN has defined for sec label max length, use it directly. Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-03-31nfsd: remove unused status arg to nfsd4_cleanup_open_stateJeff Layton
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-02-02nfsd: implement pNFS operationsChristoph Hellwig
Add support for the GETDEVICEINFO, LAYOUTGET, LAYOUTCOMMIT and LAYOUTRETURN NFSv4.1 operations, as well as backing code to manage outstanding layouts and devices. Layout management is very straight forward, with a nfs4_layout_stateid structure that extends nfs4_stid to manage layout stateids as the top-level structure. It is linked into the nfs4_file and nfs4_client structures like the other stateids, and contains a linked list of layouts that hang of the stateid. The actual layout operations are implemented in layout drivers that are not part of this commit, but will be added later. The worst part of this commit is the management of the pNFS device IDs, which suffers from a specification that is not sanely implementable due to the fact that the device-IDs are global and not bound to an export, and have a small enough size so that we can't store the fsid portion of a file handle, and must never be reused. As we still do need perform all export authentication and validation checks on a device ID passed to GETDEVICEINFO we are caught between a rock and a hard place. To work around this issue we add a new hash that maps from a 64-bit integer to a fsid so that we can look up the export to authenticate against it, a 32-bit integer as a generation that we can bump when changing the device, and a currently unused 32-bit integer that could be used in the future to handle more than a single device per export. Entries in this hash table are never deleted as we can't reuse the ids anyway, and would have a severe lifetime problem anyway as Linux export structures are temporary structures that can go away under load. Parts of the XDR data, structures and marshaling/unmarshaling code, as well as many concepts are derived from the old pNFS server implementation from Andy Adamson, Benny Halevy, Dean Hildebrand, Marc Eshel, Fred Isaman, Mike Sager, Ricardo Labiaga and many others. Signed-off-by: Christoph Hellwig <hch@lst.de>
2014-11-07nfsd: Add DEALLOCATE supportAnna Schumaker
DEALLOCATE only returns a status value, meaning we can use the noop() xdr encoder to reply to the client. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-11-07nfsd: Add ALLOCATE supportAnna Schumaker
The ALLOCATE operation is used to preallocate space in a file. I can do this by using vfs_fallocate() to do the actual preallocation. ALLOCATE only returns a status indicator, so we don't need to write a special encode() function. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-09-29NFSD: Implement SEEKAnna Schumaker
This patch adds server support for the NFS v4.2 operation SEEK, which returns the position of the next hole or data segment in a file. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-07-31nfsd: Add a mutex to protect the NFSv4.0 open owner replay cacheJeff Layton
We don't want to rely on the client_mutex for protection in the case of NFSv4 open owners. Instead, we add a mutex that will only be taken for NFSv4.0 state mutating operations, and that will be released once the entire compound is done. Also, ensure that nfsd4_cstate_assign_replay/nfsd4_cstate_clear_replay take a reference to the stateowner when they are using it for NFSv4.0 open and lock replay caching. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-07-09nfsd: Allow struct nfsd4_compound_state to cache the nfs4_clientJeff Layton
We want to use the nfsd4_compound_state to cache the nfs4_client in order to optimise away extra lookups of the clid. In the v4.0 case, we use this to ensure that we only have to look up the client at most once per compound for each call into lookup_clientid. For v4.1+ we set the pointer in the cstate during SEQUENCE processing so we should never need to do a search for it. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-07-08nfsd: Cleanup nfs4svc_encode_compoundresTrond Myklebust
Move the slot return, put session etc into a helper in fs/nfsd/nfs4state.c instead of open coding in nfs4svc_encode_compoundres. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-07-08nfsd4: replace defer_free by svcxdr_tmpallocJ. Bruce Fields
Avoid an extra allocation for the tmpbuf struct itself, and stop ignoring some allocation failures. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-07-08nfsd4: remove unused defer_free argumentJ. Bruce Fields
28e05dd8457c "knfsd: nfsd4: represent nfsv4 acl with array instead of linked list" removed the last user that wanted a custom free function. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-07-08nfsd4: rename cr_linkname->cr_dataJ. Bruce Fields
The name of a link is currently stored in cr_name and cr_namelen, and the content in cr_linkname and cr_linklen. That's confusing. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-05-30nfsd4: allow large readdirsJ. Bruce Fields
Currently we limit readdir results to a single page. This can result in a performance regression compared to NFSv3 when reading large directories. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-05-30nfsd4: more precise nfsd4_max_replyJ. Bruce Fields
It will turn out to be useful to have a more accurate estimate of reply size; so, piggyback on the existing op reply-size estimators. Also move nfsd4_max_reply to nfs4proc.c to get easier access to struct nfsd4_operation and friends. (Thanks to Christoph Hellwig for pointing out that simplification.) Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-05-30nfsd4: convert 4.1 replay encodingJ. Bruce Fields
Limits on maxresp_sz mean that we only ever need to replay rpc's that are contained entirely in the head. The one exception is very small zero-copy reads. That's an odd corner case as clients wouldn't normally ask those to be cached. in any case, this seems a little more robust. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-05-30nfsd4: teach encoders to handle reserve_space failuresJ. Bruce Fields
We've tried to prevent running out of space with COMPOUND_SLACK_SPACE and special checking in those operations (getattr) whose result can vary enormously. However: - COMPOUND_SLACK_SPACE may be difficult to maintain as we add more protocol. - BUG_ON or page faulting on failure seems overly fragile. - Especially in the 4.1 case, we prefer not to fail compounds just because the returned result came *close* to session limits. (Though perfect enforcement here may be difficult.) - I'd prefer encoding to be uniform for all encoders instead of having special exceptions for encoders containing, for example, attributes. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-05-27nfsd4: fix encoding of out-of-space repliesJ. Bruce Fields
If nfsd4_check_resp_size() returns an error then we should really be truncating the reply here, otherwise we may leave extra garbage at the end of the rpc reply. Also add a warning to catch any cases where our reply-size estimates may be wrong in the case of a non-idempotent operation. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-05-23nfsd4: tweak nfsd4_encode_getattr to take xdr_streamJ. Bruce Fields
Just change the nfsd4_encode_getattr api. Not changing any code or adding any new functionality yet. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-05-23nfsd4: embed xdr_stream in nfsd4_compoundresJ. Bruce Fields
This is a mechanical transformation with no change in behavior. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-03-28nfsd4: nfsd4_replay_cache_entry should be staticJ. Bruce Fields
This isn't actually used anywhere else. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-01-06Define op_iattr for nfsd4_open instead using macroKinglong Mee
Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-01-03nfsd: using nfsd4_encode_noop for encoding destroy_session/free_stateidKinglong Mee
Get rid of the extra code, using nfsd4_encode_noop for encoding destroy_session and free_stateid. And, delete unused argument (fr_status) int nfsd4_free_stateid. Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-05-15NFSD: Server implementation of MAC LabelingDavid Quigley
Implement labeled NFS on the server: encoding and decoding, and writing and reading, of file labels. Enabled with CONFIG_NFSD_V4_SECURITY_LABEL. Signed-off-by: Matthew N. Dodd <Matthew.Dodd@sparta.com> Signed-off-by: Miguel Rodel Felipe <Rodel_FM@dsi.a-star.edu.sg> Signed-off-by: Phua Eu Gene <PHUA_Eu_Gene@dsi.a-star.edu.sg> Signed-off-by: Khin Mi Mi Aung <Mi_Mi_AUNG@dsi.a-star.edu.sg> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-08nfsd4: cleanup handling of nfsv4.0 closed stateid'sJ. Bruce Fields
Closed stateid's are kept around a little while to handle close replays in the 4.0 case. So we stash them in the last-used stateid in the oo_last_closed_stateid field of the open owner. We can free that in encode_seqid_op_tail once the seqid on the open owner is next incremented. But we don't want to do that on the close itself; so we set NFS4_OO_PURGE_CLOSE flag set on the open owner, skip freeing it the first time through encode_seqid_op_tail, then when we see that flag set next time we free it. This is unnecessarily baroque. Instead, just move the logic that increments the seqid out of the xdr code and into the operation code itself. The justification given for the current placement is that we need to wait till the last minute to be sure we know whether the status is a sequence-id-mutating error or not, but examination of the code shows that can't actually happen. Reported-by: Yanchuan Nian <ycnian@gmail.com> Tested-by: Yanchuan Nian <ycnian@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-03nfsd: remove unused macro in nfsv4Yanchuan Nian
lk_rflags is never used anywhere, and rflags is not defined in struct nfsd4_lock. Signed-off-by: Yanchuan Nian <ycnian@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-03nfsd4: handle seqid-mutating open errors from xdr decodingJ. Bruce Fields
If a client sets an owner (or group_owner or acl) attribute on open for create, and the mapping of that owner to an id fails, then we return BAD_OWNER. But BAD_OWNER is a seqid-mutating error, so we can't shortcut the open processing that case: we have to at least look up the owner so we can find the seqid to bump. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-01-23nfsd4: simplify nfsd4_encode_fattr interface slightlyJ. Bruce Fields
It seems slightly simpler to make nfsd4_encode_fattr rather than its callers responsible for advancing the write pointer on success. (Also: the count == 0 check in the verify case looks superfluous. Running out of buffer space is really the only reason fattr encoding should fail with eresource.) Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-12-17nfsd4: disable zero-copy on non-final read opsJ. Bruce Fields
To ensure ordering of read data with any following operations, turn off zero copy if the read is not the final operation in the compound. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-26nfsd4: delay filling in write iovec array till after xdr decodingJ. Bruce Fields
Our server rejects compounds containing more than one write operation. It's unclear whether this is really permitted by the spec; with 4.0, it's possibly OK, with 4.1 (which has clearer limits on compound parameters), it's probably not OK. No client that we're aware of has ever done this, but in theory it could be useful. The source of the limitation: we need an array of iovecs to pass to the write operation. In the worst case that array of iovecs could have hundreds of elements (the maximum rwsize divided by the page size), so it's too big to put on the stack, or in each compound op. So we instead keep a single such array in the compound argument. We fill in that array at the time we decode the xdr operation. But we decode every op in the compound before executing any of them. So once we've used that array we can't decode another write. If we instead delay filling in that array till the time we actually perform the write, we can reuse it. Another option might be to switch to decoding compound ops one at a time. I considered doing that, but it has a number of other side effects, and I'd rather fix just this one problem for now. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-26nfsd4: move more write parameters into xdr argumentJ. Bruce Fields
In preparation for moving some of this elsewhere. Signed-off-by: J. Bruce Fields <bfields@redhat.com>