Age | Commit message (Collapse) | Author |
|
Clean up: C-structure style XDR encoding and decoding logic has
been replaced over the past several merge windows on both the
client and server. These data structures are no longer used.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
|
Lift the Send and LocalInv completion handlers out of soft IRQ mode
to make room for other work. Also, move the Send CQ to a different
CPU than the CPU where the Receive CQ is running, for improved
scalability.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
|
The sendctx circular queue now guarantees that xprtrdma cannot
overflow the Send Queue, so remove the remaining bits of the
original Send WQE counting mechanism.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
|
When an RPC Call includes a file data payload, that payload can come
from pages in the page cache, or a user buffer (for direct I/O).
If the payload can fit inline, xprtrdma includes it in the Send
using a scatter-gather technique. xprtrdma mustn't allow the RPC
consumer to re-use the memory where that payload resides before the
Send completes. Otherwise, the new contents of that memory would be
exposed by an HCA retransmit of the Send operation.
So, block RPC completion on Send completion, but only in the case
where a separate file data payload is part of the Send. This
prevents the reuse of that memory while it is still part of a Send
operation without an undue cost to other cases.
Waiting is avoided in the common case because typically the Send
will have completed long before the RPC Reply arrives.
These days, an RPC timeout will trigger a disconnect, which tears
down the QP. The disconnect flushes all waiting Sends. This bounds
the amount of time the reply handler has to wait for a Send
completion.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
|
Invoke a common routine for releasing hardware resources (for
example, invalidating MRs). This needs to be done whether an
RPC Reply has arrived or the RPC was terminated early.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
|
We have one boolean flag in rpcrdma_req today. I'd like to add more
flags, so convert that boolean to a bit flag.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
|
Problem statement:
Recently Sagi Grimberg <sagi@grimberg.me> observed that kernel RDMA-
enabled storage initiators don't handle delayed Send completion
correctly. If Send completion is delayed beyond the end of a ULP
transaction, the ULP may release resources that are still being used
by the HCA to complete a long-running Send operation.
This is a common design trait amongst our initiators. Most Send
operations are faster than the ULP transaction they are part of.
Waiting for a completion for these is typically unnecessary.
Infrequently, a network partition or some other problem crops up
where an ordering problem can occur. In NFS parlance, the RPC Reply
arrives and completes the RPC, but the HCA is still retrying the
Send WR that conveyed the RPC Call. In this case, the HCA can try
to use memory that has been invalidated or DMA unmapped, and the
connection is lost. If that memory has been re-used for something
else (possibly not related to NFS), and the Send retransmission
exposes that data on the wire.
Thus we cannot assume that it is safe to release Send-related
resources just because a ULP reply has arrived.
After some analysis, we have determined that the completion
housekeeping will not be difficult for xprtrdma:
- Inline Send buffers are registered via the local DMA key, and
are already left DMA mapped for the lifetime of a transport
connection, thus no additional handling is necessary for those
- Gathered Sends involving page cache pages _will_ need to
DMA unmap those pages after the Send completes. But like
inline send buffers, they are registered via the local DMA key,
and thus will not need to be invalidated
In addition, RPC completion will need to wait for Send completion
in the latter case. However, nearly always, the Send that conveys
the RPC Call will have completed long before the RPC Reply
arrives, and thus no additional latency will be accrued.
Design notes:
In this patch, the rpcrdma_sendctx object is introduced, and a
lock-free circular queue is added to manage a set of them per
transport.
The RPC client's send path already prevents sending more than one
RPC Call at the same time. This allows us to treat the consumer
side of the queue (rpcrdma_sendctx_get_locked) as if there is a
single consumer thread.
The producer side of the queue (rpcrdma_sendctx_put_locked) is
invoked only from the Send completion handler, which is a single
thread of execution (soft IRQ).
The only care that needs to be taken is with the tail index, which
is shared between the producer and consumer. Only the producer
updates the tail index. The consumer compares the head with the
tail to ensure that the a sendctx that is in use is never handed
out again (or, expressed more conventionally, the queue is empty).
When the sendctx queue empties completely, there are enough Sends
outstanding that posting more Send operations can result in a Send
Queue overflow. In this case, the ULP is told to wait and try again.
This introduces strong Send Queue accounting to xprtrdma.
As a final touch, Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
suggested a mechanism that does not require signaling every Send.
We signal once every N Sends, and perform SGE unmapping of N Send
operations during that one completion.
Reported-by: Sagi Grimberg <sagi@grimberg.me>
Suggested-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
|
Commit 655fec6987be ("xprtrdma: Use gathered Send for large inline
messages") assumed that, since the zeroeth element of the Send SGE
array always pointed to req->rl_rdmabuf, it needed to be initialized
just once. This was a valid assumption because the Send SGE array
and rl_rdmabuf both live in the same rpcrdma_req.
In a subsequent patch, the Send SGE array will be separated from the
rpcrdma_req, so the zeroeth element of the SGE array needs to be
initialized every time.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
|
Clean up: Make rpcrdma_prepare_send_sges() return a negative errno
instead of a bool. Soon callers will want distinct treatments of
different types of failures.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
|
When this function fails, it needs to undo the DMA mappings it's
done so far. Otherwise these are leaked.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
|
Clean up. rpcrdma_prepare_hdr_sge() sets num_sge to one, then
rpcrdma_prepare_msg_sges() sets num_sge again to the count of SGEs
it added, plus one for the header SGE just mapped in
rpcrdma_prepare_hdr_sge(). This is confusing, and nails in an
assumption about when these functions are called.
Instead, maintain a running count that both functions can update
with just the number of SGEs they have added to the SGE array.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
|
We need to decode and save the incoming rdma_credits field _after_
we know that the direction of the message is "forward direction
Reply". Otherwise, the credits value in reverse direction Calls is
also used to update the forward direction credits.
It is safe to decode the rdma_credits field in rpcrdma_reply_handler
now that rpcrdma_reply_handler is single-threaded. Receives complete
in the same order as they were sent on the NFS server.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
|
I noticed that the soft IRQ thread looked pretty busy under heavy
I/O workloads. perf suggested one area that was expensive was the
queue_work() call in rpcrdma_wc_receive. That gave me some ideas.
Instead of scheduling a separate worker to process RPC Replies,
promote the Receive completion handler to IB_POLL_WORKQUEUE, and
invoke rpcrdma_reply_handler directly.
Note that the poll workqueue is single-threaded. In order to keep
memory invalidation from serializing all RPC Replies, handle any
necessary invalidation tasks in a separate multi-threaded workqueue.
This provides a two-tier scheme, similar to OS I/O interrupt
handlers: A fast interrupt handler that schedules the slow handler
and re-enables the interrupt, and a slower handler that is invoked
for any needed heavy lifting.
Benefits include:
- One less context switch for RPCs that don't register memory
- Receive completion handling is moved out of soft IRQ context to
make room for other users of soft IRQ
- The same CPU core now DMA syncs and XDR decodes the Receive buffer
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
|
Clean up: I'd like to be able to invoke the tail of
rpcrdma_reply_handler in two different places. Split the tail out
into its own helper function.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
|
Clean up: Make it easier to pass the decoded XID, vers, credits, and
proc fields around by moving these variables into struct rpcrdma_rep.
Note: the credits field will be handled in a subsequent patch.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
|
A reply with an unrecognized value in the version field means the
transport header is potentially garbled and therefore all the fields
are untrustworthy.
Fixes: 59aa1f9a3cce3 ("xprtrdma: Properly handle RDMA_ERROR ... ")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
|
Clean up: There are no remaining callers of this method.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
|
The "safe" version of ro_unmap is used here to avoid waiting
unnecessarily. However:
- It is safe to wait. After all, we have to wait anyway when using
FMR to register memory.
- This case is rare: it occurs only after a reconnect.
By switching this call site to ro_unmap_sync, the final use of
ro_unmap_safe is removed.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
|
In current kernels, waiting in xprt_release appears to be safe to
do. I had erroneously believed that for ASYNC RPCs, waiting of any
kind in xprt_release->xprt_rdma_free would result in deadlock. I've
done injection testing and consulted with Trond to confirm that
waiting in the RPC release path is safe.
For the very few times where RPC resources haven't yet been released
earlier by the reply handler, it is safe to wait synchronously in
xprt_rdma_free for invalidation rather than defering it to MR
recovery.
Note: When the QP is error state, posting a LocalInvalidate should
flush and mark the MR as bad. There is no way the remote HCA can
access that MR via a QP in error state, so it is effectively already
inaccessible and thus safe for the Upper Layer to access. The next
time the MR is used it should be recognized and cleaned up properly
by frwr_op_map.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
|
The ib_mr->length represents the length of the MR in bytes as per
the IBTA spec 1.3 section 11.2.10.3 (REGISTER PHYSICAL MEMORY REGION).
Currently ib_mr->length field is defined as only 32-bits field.
This might result into truncation and failed WRs of consumers who
registers more than 4GB bytes memory regions and whose WRs accessing
such MRs.
This patch makes the length 64-bit to avoid such truncation.
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: Chuck Lever <chuck.lever@oracle.com>
Cc: Faisal Latif <faisal.latif@intel.com>
Fixes: 4c67e2bfc8b7 ("IB/core: Introduce new fast registration API")
Signed-off-by: Ilya Lesokhin <ilyal@mellanox.com>
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
Pull NFS client updates from Trond Myklebust:
"Hightlights include:
Stable bugfixes:
- Fix mirror allocation in the writeback code to avoid a use after
free
- Fix the O_DSYNC writes to use the correct byte range
- Fix 2 use after free issues in the I/O code
Features:
- Writeback fixes to split up the inode->i_lock in order to reduce
contention
- RPC client receive fixes to reduce the amount of time the
xprt->transport_lock is held when receiving data from a socket into
am XDR buffer.
- Ditto fixes to reduce contention between call side users of the
rdma rb_lock, and its use in rpcrdma_reply_handler.
- Re-arrange rdma stats to reduce false cacheline sharing.
- Various rdma cleanups and optimisations.
- Refactor the NFSv4.1 exchange id code and clean up the code.
- Const-ify all instances of struct rpc_xprt_ops
Bugfixes:
- Fix the NFSv2 'sec=' mount option.
- NFSv4.1: don't use machine credentials for CLOSE when using
'sec=sys'
- Fix the NFSv3 GRANT callback when the port changes on the server.
- Fix livelock issues with COMMIT
- NFSv4: Use correct inode in _nfs4_opendata_to_nfs4_state() when
doing and NFSv4.1 open by filehandle"
* tag 'nfs-for-4.14-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (69 commits)
NFS: Count the bytes of skipped subrequests in nfs_lock_and_join_requests()
NFS: Don't hold the group lock when calling nfs_release_request()
NFS: Remove pnfs_generic_transfer_commit_list()
NFS: nfs_lock_and_join_requests and nfs_scan_commit_list can deadlock
NFS: Fix 2 use after free issues in the I/O code
NFS: Sync the correct byte range during synchronous writes
lockd: Delete an error message for a failed memory allocation in reclaimer()
NFS: remove jiffies field from access cache
NFS: flush data when locking a file to ensure cache coherence for mmap.
SUNRPC: remove some dead code.
NFS: don't expect errors from mempool_alloc().
xprtrdma: Use xprt_pin_rqst in rpcrdma_reply_handler
xprtrdma: Re-arrange struct rx_stats
NFS: Fix NFSv2 security settings
NFSv4.1: don't use machine credentials for CLOSE when using 'sec=sys'
SUNRPC: ECONNREFUSED should cause a rebind.
NFS: Remove unused parameter gfp_flags from nfs_pageio_init()
NFSv4: Fix up mirror allocation
SUNRPC: Add a separate spinlock to protect the RPC request receive list
SUNRPC: Cleanup xs_tcp_read_common()
...
|
|
Adopt the use of xprt_pin_rqst to eliminate contention between
Call-side users of rb_lock and the use of rb_lock in
rpcrdma_reply_handler.
This replaces the mechanism introduced in 431af645cf66 ("xprtrdma:
Fix client lock-up after application signal fires").
Use recv_lock to quickly find the completing rqst, pin it, then
drop the lock. At that point invalidation and pull-up of the Reply
XDR can be done. Both are often expensive operations.
Finally, take recv_lock again to signal completion to the RPC
layer. It also protects adjustment of "cwnd".
This greatly reduces the amount of time a lock is held by the
reply handler. Comparing lock_stat results shows a marked decrease
in contention on rb_lock and recv_lock.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
[trond.myklebust@primarydata.com: Remove call to rpcrdma_buffer_put() from
the "out_norqst:" path in rpcrdma_reply_handler.]
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
|
git://git.linux-nfs.org/projects/anna/linux-nfs into linux-next
NFS-over-RDMA client updates for Linux 4.14
Bugfixes and cleanups:
- Constify rpc_xprt_ops
- Harden RPC call encoding and decoding
- Clean up rpc call decoding to use xdr_streams
- Remove unused variables from various structures
- Refactor code to remove imul instructions
- Rearrange rx_stats structure for better cacheline sharing
|
|
The rdma_rw API adjusts max_send_wr upwards during the
rdma_create_qp() call. If the ULP actually wants to take advantage
of these extra resources, it must increase the size of its send
completion queue (created before rdma_create_qp is called) and
increase its send queue accounting limit.
Use the new rdma_rw_mr_factor API to figure out the correct value
to use for the Send Queue and Send Completion Queue depths.
And, ensure that the chosen Send Queue depth for a newly created
transport does not overrun the QP WR limit of the underlying device.
Lastly, there's no longer a need to carry the Send Queue depth in
struct svcxprt_rdma, since the value is used only in the
svc_rdma_accept() path.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
Ensure that the chosen Receive Queue depth for a newly created
transport does not overrun the QP WR limit of the underlying device.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
So that NFS WRITE payloads can eventually be placed directly into a
file's page cache, enable the RPC-over-RDMA transport to present
these payloads in the xdr_buf's page list, while placing trailing
content (such as a GETATTR operation) in the xdr_buf's tail.
After this change, the RPC-over-RDMA's "copy tail" hack, added by
commit a97c331f9aa9 ("svcrdma: Handle additional inline content"),
is no longer needed and can be removed.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
Dan Carpenter <dan.carpenter@oracle.com> observed that the while()
loop in svc_rdma_build_read_chunk() does not document the assumption
that the loop interior is always executed at least once.
Defensive: the function now returns -EINVAL if this assumption
fails.
Suggested-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
Close an attack vector by moving the arrays of server-side transport
methods to read-only memory.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
To reduce false cacheline sharing, separate counters that are likely
to be accessed in the Call path from those accessed in the Reply
path.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
|
This further reduces contention with the transport_lock, and allows us
to convert to using a non-bh-safe spinlock, since the list is now never
accessed from a bh context.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
|
Re-arrange the pointer arithmetic in the chunk list encoders to
eliminate several more integer multiplication instructions during
Transport Header encoding.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
|
Re-arrange the pointer arithmetic in rpcrdma_convert_iovs() to
eliminate several integer multiplication instructions during
Transport Header encoding.
Also, array overflow does not occur outside development
environments, so replace overflow checking with one spot check
at the end. This reduces the number of conditional branches in
the common case.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
|
Same changes as in rpcrdma_marshal_req(). This removes
C-structure style encoding from the backchannel.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
|
While marshaling chunk lists which are variable-length XDR objects,
check for XDR buffer overflow at every step. Measurements show no
significant changes in CPU utilization.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
|
Initialize an xdr_stream at the top of rpcrdma_marshal_req(), and
use it to encode the fixed transport header fields. This xdr_stream
will be used to encode the chunk lists in a subsequent patch.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
|
Clean up: Remove a variable whose result is no longer used.
Commit 655fec6987be ("xprtrdma: Use gathered Send for large inline
messages") should have removed it.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
|
Clean up: The caller already has rpcrdma_xprt, so pass that directly
instead. And provide a documenting comment for this critical
function.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
|
Clean up: Replace C-structure based XDR decoding for consistency
with other areas.
struct rpcrdma_rep is rearranged slightly so that the relevant fields
are in cache when the Receive completion handler is invoked.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
|
This field is no longer used outside the Receive completion handler.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
|
Clean up: The opcode check is no longer necessary, because since
commit 2fa8f88d8892 ("xprtrdma: Use new CQ API for RPC-over-RDMA
client send CQs"), this completion handler is invoked only for
RECV work requests.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
|
Clean up chunk list decoding by using the xdr_stream set up in
rpcrdma_reply_handler. This hardens decoding by checking for buffer
overflow at every step while unmarshaling variable-length XDR
objects.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
|
Refactor the reply handler's transport header decoding logic to make
it easier to understand and update.
Convert some of the handler to use xdr_streams, which will enable
stricter validation of input data and enable the eventual addition
of support for new combinations of chunks, such as "Write + Reply"
or "PZRC + normal Read".
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
|
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
|
Transport header decoding deals with untrusted input data, therefore
decoding this header needs to be hardened.
Adopt the same infrastructure that is used when XDR decoding NFS
replies. This is slightly more CPU-intensive than the replaced code,
but we're not adding new atomics, locking, or context switches. The
cost is manageable.
Start by initializing an xdr_stream in rpcrdma_reply_handler().
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
|
After transport instance creation, these function pointers never
change. Mark them as constant to prevent their use as an attack
vector for code injections.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
|
Pull NFS client updates from Anna Schumaker:
"Stable bugfixes:
- Fix -EACCESS on commit to DS handling
- Fix initialization of nfs_page_array->npages
- Only invalidate dentries that are actually invalid
Features:
- Enable NFSoRDMA transparent state migration
- Add support for lookup-by-filehandle
- Add support for nfs re-exporting
Other bugfixes and cleanups:
- Christoph cleaned up the way we declare NFS operations
- Clean up various internal structures
- Various cleanups to commits
- Various improvements to error handling
- Set the dt_type of . and .. entries in NFS v4
- Make slot allocation more reliable
- Fix fscache stat printing
- Fix uninitialized variable warnings
- Fix potential list overrun in nfs_atomic_open()
- Fix a race in NFSoRDMA RPC reply handler
- Fix return size for nfs42_proc_copy()
- Fix against MAC forgery timing attacks"
* tag 'nfs-for-4.13-1' of git://git.linux-nfs.org/projects/anna/linux-nfs: (68 commits)
NFS: Don't run wake_up_bit() when nobody is waiting...
nfs: add export operations
nfs4: add NFSv4 LOOKUPP handlers
nfs: add a nfs_ilookup helper
nfs: replace d_add with d_splice_alias in atomic_open
sunrpc: use constant time memory comparison for mac
NFSv4.2 fix size storage for nfs42_proc_copy
xprtrdma: Fix documenting comments in frwr_ops.c
xprtrdma: Replace PAGE_MASK with offset_in_page()
xprtrdma: FMR does not need list_del_init()
xprtrdma: Demote "connect" log messages
NFSv4.1: Use seqid returned by EXCHANGE_ID after state migration
NFSv4.1: Handle EXCHGID4_FLAG_CONFIRMED_R during NFSv4.1 migration
xprtrdma: Don't defer MR recovery if ro_map fails
xprtrdma: Fix FRWR invalidation error recovery
xprtrdma: Fix client lock-up after application signal fires
xprtrdma: Rename rpcrdma_req::rl_free
xprtrdma: Pass only the list of registered MRs to ro_unmap_sync
xprtrdma: Pre-mark remotely invalidated MRs
xprtrdma: On invalidation failure, remove MWs from rl_registered
...
|
|
Clean up.
FASTREG and LOCAL_INV WRs are typically not signaled. localinv_wake
is used for the last LOCAL_INV WR in a chain, which is always
signaled. The documenting comments should reflect that.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
|
Clean up.
Reported by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
|
Clean up.
Commit 38f1932e60ba ("xprtrdma: Remove FMRs from the unmap list
after unmapping") utilized list_del_init() to try to prevent some
list corruption. The corruption was actually caused by the reply
handler racing with a signal. Now that MR invalidation is properly
serialized, list_del_init() can safely be replaced.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
|
Some have complained about the log messages generated when xprtrdma
opens or closes a connection to a server. When an NFS mount is
mostly idle these can appear every few minutes as the client idles
out the connection and reconnects.
Connection and disconnection is a normal part of operation, and not
exceptional, so change these to dprintk's for now. At some point
all of these will be converted to tracepoints, but that's for
another day.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|