author | David Howells <dhowells@redhat.com> | 2020-05-11 14:54:34 +0100 |
---|---|---|
committer | David Howells <dhowells@redhat.com> | 2020-05-11 16:42:28 +0100 |
commit | c410bf01933e5e09d142c66c3df9ad470a7eec13 (patch) | |
tree | 96a0a0b9ddcfa1489b4e34f9e77141b95fc81f30 /fs | |
parent | 42c556fef92361bbc58be22f91b1c49db0963c34 (diff) |
rxrpc: Fix the excessive initial retransmission timeout
rxrpc currently uses a fixed 4s retransmission timeout until the RTT is
sufficiently sampled. This can cause problems with some fileservers: calls
to the cache manager in the afs filesystem get dropped by the fileserver
when a packet goes missing and the retransmission timeout is greater than
the call expiry timeout.
Fix this by:
(1) Copying the RTT/RTO calculation code from Linux's TCP implementation
and altering it to fit rxrpc.
(2) Altering the various users of the RTT to make use of the new SRTT
value.
(3) Replacing the use of rxrpc_resend_timeout with the calculated RTO
value (which is needed in jiffies), along with a backoff.
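As a rough illustration of the kind of RTT/RTO calculation step (1) refers to, the sketch below applies the classic smoothed-RTT and variance update (the RFC 6298 style that TCP estimators are built on) in fixed point, with the SRTT held in 8ths of a microsecond as the notes in this message describe. It is not the code added by this commit: the struct, the function names and the initial/minimum/maximum RTO constants are all hypothetical, and the conversion of the result to jiffies is left out.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative constants only; the real rxrpc values are not shown here. */
#define RTO_INITIAL_US	1000000u	/* used until the first sample arrives */
#define RTO_MIN_US	200000u
#define RTO_MAX_US	60000000u
#define SAMPLE_MAX_US	120000000u	/* clamp samples so the fixed point cannot overflow */
#define BACKOFF_MAX	8u

/* Hypothetical per-peer estimator state. */
struct rtt_state {
	uint32_t srtt_us8;	/* smoothed RTT, in 8ths of a microsecond */
	uint32_t rttvar_us4;	/* RTT variance, in 4ths of a microsecond */
	uint32_t samples;	/* number of samples folded in so far */
	uint32_t backoff;	/* exponential backoff applied to the RTO */
};

/* Fold one RTT sample (in microseconds) into the estimator. */
static void rtt_add_sample(struct rtt_state *s, uint32_t sample_us)
{
	int64_t err;
	uint32_t abs_err;

	if (sample_us > SAMPLE_MAX_US)
		sample_us = SAMPLE_MAX_US;

	if (s->samples++ == 0) {
		/* First sample: srtt = R, rttvar = R / 2. */
		s->srtt_us8 = sample_us * 8;
		s->rttvar_us4 = sample_us * 2;
		return;
	}

	/* err = R - srtt, measured against the old smoothed value. */
	err = (int64_t)sample_us - (int64_t)(s->srtt_us8 >> 3);
	abs_err = (uint32_t)(err < 0 ? -err : err);

	/* rttvar = 3/4 * rttvar + 1/4 * |err|, scaled by 4. */
	s->rttvar_us4 = s->rttvar_us4 - (s->rttvar_us4 >> 2) + abs_err;

	/* srtt = 7/8 * srtt + 1/8 * R, scaled by 8. */
	assert((int64_t)s->srtt_us8 + err >= 0);
	s->srtt_us8 = (uint32_t)((int64_t)s->srtt_us8 + err);
}

/* Current retransmission timeout in microseconds, with backoff and clamping. */
static uint32_t rtt_rto_us(const struct rtt_state *s)
{
	uint32_t shift = s->backoff < BACKOFF_MAX ? s->backoff : BACKOFF_MAX;
	uint64_t rto;

	if (s->samples == 0)
		return RTO_INITIAL_US;

	/* RTO = srtt + 4 * rttvar, then doubled once per backoff step. */
	rto = (uint64_t)(s->srtt_us8 >> 3) + s->rttvar_us4;
	rto <<= shift;

	if (rto < RTO_MIN_US)
		rto = RTO_MIN_US;
	if (rto > RTO_MAX_US)
		rto = RTO_MAX_US;
	return (uint32_t)rto;
}
```

Keeping the scaled values in integers means each update is a shift, an add and a subtract, which is the usual reason for storing the SRTT in 8ths of a unit.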
Notes:
(1) rxrpc provides RTT samples by matching the serial numbers on outgoing
DATA packets that have the RXRPC_REQUEST_ACK set and PING ACK packets
against the reference serial number in incoming REQUESTED ACK and
PING-RESPONSE ACK packets.
(2) Each packet that is transmitted on an rxrpc connection gets a new
per-connection serial number, even for retransmissions, so an ACK can
be cross-referenced to a specific trigger packet. This allows RTT
information to be drawn from retransmitted DATA packets also.
(3) rxrpc maintains the RTT/RTO state on the rxrpc_peer record rather than
on an rxrpc_call because many RPC calls won't live long enough to
generate more than one sample.
(4) The calculated SRTT value is in units of 8ths of a microsecond rather
than nanoseconds.
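Notes (1) and (2) can be pictured with a small sketch: every transmission, including a retransmission, takes a fresh serial number, and an incoming ACK that quotes a reference serial is matched against the recorded send time to yield an RTT sample. The table layout, its size and the function names below are hypothetical and chosen only for illustration; they are not the structures rxrpc uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_PROBES 4	/* arbitrary size for the sketch */

/* One outstanding packet for which an ACK has been requested. */
struct rtt_probe {
	uint32_t serial;
	uint64_t sent_us;
	bool in_use;
};

/* Hypothetical per-connection bookkeeping. */
struct probe_table {
	struct rtt_probe slot[MAX_PROBES];
	uint32_t next_serial;
};

/*
 * Allocate a new serial for a transmission (retransmissions included) and,
 * if an ACK was requested, remember when it was sent.
 */
static uint32_t probe_send(struct probe_table *t, uint64_t now_us, bool want_ack)
{
	uint32_t serial = ++t->next_serial;
	int i;

	if (!want_ack)
		return serial;

	for (i = 0; i < MAX_PROBES; i++) {
		if (!t->slot[i].in_use) {
			t->slot[i].serial = serial;
			t->slot[i].sent_us = now_us;
			t->slot[i].in_use = true;
			break;
		}
	}
	/* If the table is full the packet is still sent, just not sampled. */
	return serial;
}

/*
 * Match an ACK's reference serial against the outstanding probes.
 * Returns true and stores the RTT sample if a match was found.
 */
static bool probe_ack(struct probe_table *t, uint32_t ref_serial,
		      uint64_t now_us, uint32_t *rtt_us)
{
	int i;

	for (i = 0; i < MAX_PROBES; i++) {
		if (t->slot[i].in_use && t->slot[i].serial == ref_serial) {
			assert(now_us >= t->slot[i].sent_us);
			*rtt_us = (uint32_t)(now_us - t->slot[i].sent_us);
			t->slot[i].in_use = false;
			return true;
		}
	}
	return false;
}
```

Because the serial identifies one specific transmission, an ACK for a retransmitted packet is unambiguous, which is why samples can be taken from retransmissions too.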
The (S)RTT and RTO values are displayed in /proc/net/rxrpc/peers.
Fixes: 17926a79320a ("[AF_RXRPC]: Provide secure RxRPC sockets for use by userspace and kernel both")
Signed-off-by: David Howells <dhowells@redhat.com>
Diffstat (limited to 'fs')
-rw-r--r-- | fs/afs/fs_probe.c | 18 |
-rw-r--r-- | fs/afs/vl_probe.c | 18 |
2 files changed, 10 insertions, 26 deletions
diff --git a/fs/afs/fs_probe.c b/fs/afs/fs_probe.c
index a587767b6ae1..237352d3cb53 100644
--- a/fs/afs/fs_probe.c
+++ b/fs/afs/fs_probe.c
@@ -32,9 +32,8 @@ void afs_fileserver_probe_result(struct afs_call *call)
 	struct afs_server *server = call->server;
 	unsigned int server_index = call->server_index;
 	unsigned int index = call->addr_ix;
-	unsigned int rtt = UINT_MAX;
+	unsigned int rtt_us;
 	bool have_result = false;
-	u64 _rtt;
 	int ret = call->error;
 
 	_enter("%pU,%u", &server->uuid, index);
@@ -93,15 +92,9 @@ responded:
 		}
 	}
 
-	/* Get the RTT and scale it to fit into a 32-bit value that represents
-	 * over a minute of time so that we can access it with one instruction
-	 * on a 32-bit system.
-	 */
-	_rtt = rxrpc_kernel_get_rtt(call->net->socket, call->rxcall);
-	_rtt /= 64;
-	rtt = (_rtt > UINT_MAX) ? UINT_MAX : _rtt;
-	if (rtt < server->probe.rtt) {
-		server->probe.rtt = rtt;
+	rtt_us = rxrpc_kernel_get_srtt(call->net->socket, call->rxcall);
+	if (rtt_us < server->probe.rtt) {
+		server->probe.rtt = rtt_us;
 		alist->preferred = index;
 		have_result = true;
 	}
@@ -113,8 +106,7 @@ out:
 	spin_unlock(&server->probe_lock);
 
 	_debug("probe [%u][%u] %pISpc rtt=%u ret=%d",
-	       server_index, index, &alist->addrs[index].transport,
-	       (unsigned int)rtt, ret);
+	       server_index, index, &alist->addrs[index].transport, rtt_us, ret);
 
 	have_result |= afs_fs_probe_done(server);
 	if (have_result)
diff --git a/fs/afs/vl_probe.c b/fs/afs/vl_probe.c
index 858498cc1b05..e3aa013c2177 100644
--- a/fs/afs/vl_probe.c
+++ b/fs/afs/vl_probe.c
@@ -31,10 +31,9 @@ void afs_vlserver_probe_result(struct afs_call *call)
 	struct afs_addr_list *alist = call->alist;
 	struct afs_vlserver *server = call->vlserver;
 	unsigned int server_index = call->server_index;
+	unsigned int rtt_us = 0;
 	unsigned int index = call->addr_ix;
-	unsigned int rtt = UINT_MAX;
 	bool have_result = false;
-	u64 _rtt;
 	int ret = call->error;
 
 	_enter("%s,%u,%u,%d,%d", server->name, server_index, index, ret, call->abort_code);
@@ -93,15 +92,9 @@ responded:
 		}
 	}
 
-	/* Get the RTT and scale it to fit into a 32-bit value that represents
-	 * over a minute of time so that we can access it with one instruction
-	 * on a 32-bit system.
-	 */
-	_rtt = rxrpc_kernel_get_rtt(call->net->socket, call->rxcall);
-	_rtt /= 64;
-	rtt = (_rtt > UINT_MAX) ? UINT_MAX : _rtt;
-	if (rtt < server->probe.rtt) {
-		server->probe.rtt = rtt;
+	rtt_us = rxrpc_kernel_get_srtt(call->net->socket, call->rxcall);
+	if (rtt_us < server->probe.rtt) {
+		server->probe.rtt = rtt_us;
 		alist->preferred = index;
 		have_result = true;
 	}
@@ -113,8 +106,7 @@ out:
 	spin_unlock(&server->probe_lock);
 
 	_debug("probe [%u][%u] %pISpc rtt=%u ret=%d",
-	       server_index, index, &alist->addrs[index].transport,
-	       (unsigned int)rtt, ret);
+	       server_index, index, &alist->addrs[index].transport, rtt_us, ret);
 
 	have_result |= afs_vl_probe_done(server);
 	if (have_result) {
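The pattern both probe functions keep after this change, recording the lowest RTT seen so far and marking that address as preferred, can be sketched in isolation as follows. The struct and function here are hypothetical stand-ins for the afs server and address-list fields in the diff, and starting the best RTT at UINT_MAX is an assumption made for the sketch rather than something the diff shows.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical stand-in for server->probe.rtt and alist->preferred. */
struct probe_best {
	unsigned int rtt_us;	/* lowest smoothed RTT seen, in microseconds */
	int preferred;		/* index of the address that produced it */
};

/*
 * Record a probe response.  Returns 1 if this address became the new
 * preferred one, 0 if an earlier response was at least as fast.
 */
static int probe_note_rtt(struct probe_best *best, int index,
			  unsigned int rtt_us)
{
	assert(index >= 0);

	if (rtt_us < best->rtt_us) {
		best->rtt_us = rtt_us;
		best->preferred = index;
		return 1;
	}
	return 0;
}
```

With the value already in microseconds and held in an unsigned int, the comparison needs no scaling or clamping step, which is what lets the diff drop the old 64-bit intermediate.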