path: root/net/dccp/ccids/ccid2.h
Age | Commit message | Author
2011-08-01 | dccp ccid-2: prevent cwnd > Sequence Window | Samuel Jero
Add a check to prevent CCID-2 from increasing the cwnd greater than the Sequence Window. When the congestion window becomes bigger than the Sequence Window, CCID-2 will attempt to keep more data in the network than the DCCP Sequence Window code considers possible. This results in the Sequence Window code issuing a Sync, thereby inducing needless overhead. Further, if this occurs at the sender, CCID-2 will never detect the problem because the Acks it receives will indicate no losses. I have seen this cause a drop of 1/3rd in throughput for a connection. Also add code to adjust the Sequence Window to be about 5 times the number of packets in the network (RFC 4340, 7.5.2) and to adjust the Ack Ratio so that the remote Sequence Window will hold about 5 times the number of packets in the network. This allows the congestion window to increase correctly without being limited by the Sequence Window. Signed-off-by: Samuel Jero <sj323707@ohio.edu> Acked-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
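The two rules described above (do not grow cwnd past the Sequence Window, and keep the Sequence Window at about 5 times the packets in flight) can be sketched as follows. This is a minimal illustration, not the patch itself: the function and macro names are hypothetical, and the 32 and 2^46-1 bounds are the valid Sequence Window range from RFC 4340, 7.5.2.

```c
#include <stdbool.h>
#include <stdint.h>

#define SEQ_WIN_MIN  32ULL                /* smallest valid Sequence Window */
#define SEQ_WIN_MAX  ((1ULL << 46) - 1)   /* largest valid Sequence Window  */
#define WIN_FACTOR   5ULL                 /* "about 5 times" from the text  */

/* cwnd may only grow while it is still below the Sequence Window. */
static bool cwnd_may_grow(uint32_t cwnd, uint64_t seq_win)
{
	return cwnd < seq_win;
}

/* Sequence Window to aim for, given the current cwnd in packets. */
static uint64_t target_seq_window(uint32_t cwnd)
{
	uint64_t want = (uint64_t)cwnd * WIN_FACTOR;

	if (want < SEQ_WIN_MIN)
		return SEQ_WIN_MIN;
	if (want > SEQ_WIN_MAX)
		return SEQ_WIN_MAX;
	return want;
}
```

With cwnd = 100 packets this asks for a Sequence Window of 500, so the window check stops limiting growth once the new value has been negotiated.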
2011-07-04 | dccp ccid-2: Perform congestion-window validation | Gerrit Renker
CCID-2's cwnd increases like TCP during slow-start, which has implications for
 * the local Sequence Window value (should be > cwnd),
 * the Ack Ratio value.
Hence an exponential growth, if it does not reflect the actual network conditions, can quickly lead to instability.
This patch adds congestion-window validation (RFC2861) to CCID-2:
 * cwnd is constrained if the sender is application limited;
 * cwnd is reduced after a long idle period, as suggested in the '90 paper by Van Jacobson, in RFC 2581 (sec. 4.1);
 * cwnd is never reduced below the RFC 3390 initial window.
As marked in the comments, the code is actually almost a direct copy of the TCP congestion-window-validation algorithms. By continuing this work, it may in future be possible to use the TCP code (not possible at the moment). The mechanism can be turned off using a module parameter. Sampling of the currently-used window (moving-maximum) is however done constantly; this is used to determine the expected window, which can be exploited to regulate DCCP's Sequence Window value. This patch also sets slow-start-after-idle (RFC 4341, 5.1), i.e. it behaves like TCP when net.ipv4.tcp_slow_start_after_idle = 1.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
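The idle-period rule can be sketched as below: cwnd is halved once for every RTO that passed without sending, but never drops below the initial window. This is a simplified reading of RFC 2861, not a copy of the kernel or TCP code; the function name and the millisecond units are assumptions.

```c
#include <stdint.h>

/*
 * Hypothetical helper: decay cwnd after an idle period.
 * cwnd and iwnd are in packets; idle_ms and rto_ms are in milliseconds.
 */
static uint32_t cwnd_after_idle(uint32_t cwnd, uint32_t iwnd,
				uint32_t idle_ms, uint32_t rto_ms)
{
	if (rto_ms == 0)
		return cwnd;		/* no valid RTO yet: leave cwnd alone */

	/* One halving per full RTO of idle time. */
	while (idle_ms >= rto_ms && cwnd > iwnd) {
		cwnd >>= 1;
		idle_ms -= rto_ms;
	}
	/* Never go below the initial (restart) window. */
	return cwnd < iwnd ? iwnd : cwnd;
}
```

For example, a cwnd of 64 with an initial window of 4 becomes 8 after three RTOs of silence, and bottoms out at 4 after a longer pause.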
2011-07-04 | dccp ccid-2: Use existing function to test for data packets | Gerrit Renker
This replaces a switch statement with a test, using the equivalent function dccp_data_packet(skb). It also doubles the range of the field `rx_num_data_pkts' by changing the type from `int' to `u32', avoiding signed/unsigned comparison with the u16 field `dccps_r_ack_ratio'. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
2011-07-04 | dccp ccid-2: move rfc 3390 function into header file | Gerrit Renker
This moves CCID-2's initial window function into the header file, since several parts throughout the CCID-2 code need to call it (CCID-2 still uses RFC 3390). Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Leandro Melo de Sales <leandro@ic.ufal.br>
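For reference, the RFC 3390 initial window is min(4*MSS, max(2*MSS, 4380 bytes)). The sketch below converts that byte count into whole packets. It is written directly from the RFC formula; the function name is hypothetical and the helper in ccid2.h may be coded differently.

```c
#include <stdint.h>

/* Initial window in packets for a given sender MSS in bytes (RFC 3390). */
static uint32_t rfc3390_initial_window(uint32_t mss)
{
	uint32_t upper, lower;

	if (mss == 0)
		return 0;			/* avoid dividing by zero */

	upper = 4 * mss;			/* at most four segments   */
	lower = 2 * mss > 4380 ? 2 * mss : 4380; /* max(2*MSS, 4380 bytes) */

	return (upper < lower ? upper : lower) / mss;
}
```

This yields 4 packets for small segments (MSS up to 1095 bytes), 3 for a typical Ethernet MSS of 1460, and 2 for large segments.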
2010-11-15 | dccp ccid-2: Separate option parsing from CCID processing | Gerrit Renker
This patch replaces an almost identical replication of code: large parts of dccp_parse_options() re-appeared as ccid2_ackvector() in ccid2.c. Apart from the duplication, this caused two more problems:
 1. CCIDs should not need to be concerned with parsing header options;
 2. one can not assume that Ack Vectors appear as a contiguous area within an skb, it is legal to insert other options and/or padding in between. The current code would throw an error and stop reading in such a case.
Since Ack Vectors provide CCID-specific information, they are now processed by the CCID directly, separating this functionality from the main DCCP code.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
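The second problem is easiest to see with a small option walker. The sketch below scans a DCCP option area and sums the Ack Vector payload bytes, skipping padding and unrelated options that sit between Ack Vector chunks. The option layout (single-byte options below type 32, type/length/data otherwise, Ack Vector types 38 and 39) follows RFC 4340; the function itself is hypothetical and is not the kernel's parser.

```c
#include <stddef.h>
#include <stdint.h>

enum {
	OPT_ACK_VECTOR_0 = 38,	/* Ack Vector [Nonce 0] */
	OPT_ACK_VECTOR_1 = 39	/* Ack Vector [Nonce 1] */
};

/*
 * Return the total number of Ack Vector payload bytes in the option area,
 * or -1 if an option is truncated or has an impossible length.
 */
static int count_ackvec_bytes(const uint8_t *opt, size_t len)
{
	size_t i = 0;
	int total = 0;

	while (i < len) {
		uint8_t type = opt[i];
		uint8_t olen;

		if (type < 32) {	/* single-byte option, e.g. padding */
			i++;
			continue;
		}
		if (i + 1 >= len)
			return -1;	/* length byte is missing */
		olen = opt[i + 1];	/* length covers type and length bytes */
		if (olen < 2 || i + olen > len)
			return -1;
		if (type == OPT_ACK_VECTOR_0 || type == OPT_ACK_VECTOR_1)
			total += olen - 2;
		i += olen;
	}
	return total;
}
```

A walker of this shape keeps going when padding or another option separates two Ack Vector options, which is the case the older code rejected.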
2010-10-28 | dccp ccid-2: Stop polling | Gerrit Renker
This updates CCID-2 to use the CCID dequeuing mechanism, converting from previous continuous-polling to a now event-driven mechanism. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-30 | dccp ccid-2: Use u32 timestamps uniformly | Gerrit Renker
Since CCID-2 is de facto a mini implementation of TCP, it makes sense to share as much code as possible. Hence this patch aligns CCID-2 timestamping with TCP timestamping. This also halves the space consumption (on 64-bit systems). The necessary include file <net/tcp.h> is already included by way of net/dccp.h. Redundant includes have been removed. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
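A 32-bit timestamp wraps around, so elapsed time and ordering have to be computed with modular arithmetic. The sketch below shows the usual idiom; the helper names are hypothetical, and the signed cast assumes a two's-complement platform, as the kernel does.

```c
#include <stdbool.h>
#include <stdint.h>

/* Ticks elapsed between two 32-bit timestamps, correct across one wrap. */
static uint32_t ts_elapsed(uint32_t now, uint32_t then)
{
	return now - then;	/* unsigned subtraction is modulo 2^32 */
}

/* True if timestamp a is later than b, assuming they are < 2^31 apart. */
static bool ts_after(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) > 0;
}
```

On a 64-bit system a field of this type takes 4 bytes instead of the 8 bytes of an unsigned long, which is the space saving the message refers to.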
2010-08-23 | dccp ccid-2: Replace broken RTT estimator with better algorithm | Gerrit Renker
The current CCID-2 RTT estimator code is in parts broken and lags behind the suggestions in RFC2988 of using scaled variants for SRTT/RTTVAR. That code is replaced by the present patch, which reuses the Linux TCP RTT estimator code.

Further details:
----------------
 1. The minimum RTO of previously one second has been replaced with TCP's, since RFC4341, sec. 5 says that the minimum of 1 sec. (suggested in RFC2988, 2.4) is not necessary. Instead, the TCP_RTO_MIN is used, which agrees with DCCP's concept of a default RTT (RFC 4340, 3.4).
 2. The maximum RTO has been set to DCCP_RTO_MAX (64 sec), which agrees with RFC2988, (2.5).
 3. De-inlined the function ccid2_new_ack().
 4. Added a FIXME: the RTT is sampled several times per Ack Vector, which will give the wrong estimate. It should be replaced with one sample per Ack. However, at the moment this can not be resolved easily, since
    - it depends on TX history code (which also needs some work),
    - the cleanest solution is not to use the `sent' time at all (saves 4 bytes per entry) and use DCCP timestamps / elapsed time to estimate the RTT, which however is non-trivial to get right (but needs to be done).

Reasons for reusing the Linux TCP estimator algorithm:
------------------------------------------------------
Some time was spent to find a better alternative, using basic RFC2988 as a first step. Further analysis and experimentation showed that the Linux TCP RTO estimator is superior to a basic RFC2988 implementation. A summary is on http://www.erg.abdn.ac.uk/users/gerrit/dccp/notes/ccid2/rto_estimator/
In addition, this estimator fared well in a recent empirical evaluation: Rewaskar, Sushant, Jasleen Kaur and F. Donelson Smith. A Performance Study of Loss Detection/Recovery in Real-world TCP Implementations. Proceedings of 15th IEEE International Conference on Network Protocols (ICNP-07), 2007.
Thus there is significant benefit in reusing the existing TCP code.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
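To make the "scaled variants for SRTT/RTTVAR" concrete, here is a sketch of the basic RFC 2988 estimator in scaled integer form, with the RTO clamped to a floor and a 64-second ceiling. Note that this is the plain RFC 2988 algorithm that the message calls a first step, not the Linux TCP estimator the patch actually reuses. All names are hypothetical, times are in milliseconds, and the 200 ms floor is an assumed stand-in for TCP_RTO_MIN.

```c
#include <stdbool.h>
#include <stdint.h>

#define RTO_MIN_MS  200u	/* assumed floor (stand-in for TCP_RTO_MIN) */
#define RTO_MAX_MS  64000u	/* 64 s ceiling, as for DCCP_RTO_MAX        */

struct rtt_est {
	bool     seeded;	/* false until the first sample arrives */
	uint32_t srtt8;		/* smoothed RTT, scaled by 8            */
	uint32_t rttvar4;	/* RTT variance, scaled by 4            */
};

/* Feed one RTT measurement (RFC 2988, 2.2 and 2.3) into the estimator. */
static void rtt_sample(struct rtt_est *e, uint32_t r_ms)
{
	int32_t err;
	uint32_t abs_err;

	if (!e->seeded) {		/* first sample: SRTT = R, RTTVAR = R/2 */
		e->srtt8 = r_ms << 3;
		e->rttvar4 = r_ms << 1;
		e->seeded = true;
		return;
	}
	err = (int32_t)r_ms - (int32_t)(e->srtt8 >> 3);
	abs_err = err < 0 ? (uint32_t)-err : (uint32_t)err;

	/* RTTVAR = 3/4 * RTTVAR + 1/4 * |SRTT - R|, kept scaled by 4 */
	e->rttvar4 = e->rttvar4 - (e->rttvar4 >> 2) + abs_err;
	/* SRTT = 7/8 * SRTT + 1/8 * R, kept scaled by 8 */
	e->srtt8 += (uint32_t)err;
}

/* RTO = SRTT + 4 * RTTVAR, clamped to [RTO_MIN_MS, RTO_MAX_MS]. */
static uint32_t rtt_rto(const struct rtt_est *e)
{
	uint32_t rto = (e->srtt8 >> 3) + e->rttvar4;

	if (rto < RTO_MIN_MS)
		return RTO_MIN_MS;
	if (rto > RTO_MAX_MS)
		return RTO_MAX_MS;
	return rto;
}
```

Keeping SRTT scaled by 8 and RTTVAR scaled by 4 lets both updates use shifts and additions only, with no division and less rounding loss.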
2009-10-07 | dccp ccid-2: Overhaul CCID naming convention 1/2 | Gerrit Renker
This patch starts a less problematic naming convention for CCID structs. The old naming convention used 'hc{tx,rx}->ccid?hc{tx,rx}->...' as recurring prefixes, which made the code
 * hard to write (not easy to fit into 80 characters);
 * hard to read (most of the space is occupied by prefixes).
The new naming scheme:
 * struct entries for the TX socket are prefixed by 'tx_';
 * and those for the RX socket are prefixed by 'rx_'.
The identifiers then remain distinguishable when grep-ing through the tree:
 (a) RX/TX sockets are distinguished by the naming scheme,
 (b) individual CCIDs are distinguished by filename (ccid{2,3,4}.{c,h}).
This first patch implements the scheme for CCID-2.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-09-14 | net-next-2.6 [PATCH 1/1] dccp: ccids whitespace-cleanup / CodingStyle | Gerrit Renker
No code change, cosmetic changes only:
 * whitespace cleanup via scripts/cleanfile,
 * remove self-references to filename at top of files,
 * fix coding style (extraneous brackets),
 * fix documentation style (kernel-doc-nano-HOWTO).
Thanks are due to Ivo Augusto Calado who raised these issues by submitting good-quality patches.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 | [CCID2]: Remove redundant ack-counting variable | Gerrit Renker
The code used two different variables to count Acks, one of them redundant. This patch reduces the number of Ack counters to one. The type of the Ack counter has also been changed to u32 (twice the range of int); and the variable has been renamed into `packets_acked' - for consistency with RFC 3465 (and similarly named variables are used by TCP and SCTP). Lastly, a slightly less aggressive `maxincr' increment is used (for even Ack Ratios, maxincr was Ack Ratio/2 + 1 instead of Ack Ratio/2). Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
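The difference between the two `maxincr' formulas only shows for even Ack Ratios. The sketch below uses one pair of formulas that is consistent with the description above (halve and add one, versus halve and round up); the function names are hypothetical and the exact kernel expressions are not quoted here.

```c
#include <stdint.h>

/* Older, more aggressive bound: Ack Ratio/2 + 1 for every ratio. */
static uint32_t maxincr_old(uint32_t ack_ratio)
{
	return ack_ratio / 2 + 1;
}

/* Newer bound: Ack Ratio/2, rounded up. */
static uint32_t maxincr_new(uint32_t ack_ratio)
{
	return (ack_ratio + 1) / 2;
}
```

For an Ack Ratio of 4 the old bound is 3 and the new one is 2, while for an odd ratio such as 5 both give 3.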
2008-01-28 | [CCID2]: Remove redundant synchronisation variable | Gerrit Renker
This removes the synchronisation variable `ccid2hctx_sendwait', which is set to 1 when the CCID2 sender may send a new packet, and which is set to 0 otherwise. The variable is redundant, since it is only used in combination with the hc_tx_send_packet/hc_tx_packet_sent function pair. Both functions are called under socket lock, so the following happens when the CCID2 may send a new packet:
 * it sets sendwait = 1 in tx_send_packet and returns 0;
 * the subsequent call to tx_packet_sent clears the sendwait flag;
 * since tx_send_packet returns 0 if and only if sendwait == 1, the BUG_ON condition in tx_packet_sent is never satisfied, since that function is never called when tx_send_packet returns a value different from 0 (cf. dccp_write_xmit);
 * the call to tx_packet_sent clears the flag so that the condition "!sendwait" is true the next time tx_packet_sent is called.
In other words, it is sufficient to just return 0 / not-0 to synchronise tx_send_packet and tx_packet_sent -- which is what the patch does.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
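The resulting protocol between the two callbacks can be sketched without any flag: the caller sends only when the first function returns 0, and only then calls the second. The struct, the "pipe below cwnd" send condition and the driver loop are simplified assumptions for illustration; they mirror the roles of tx_send_packet, tx_packet_sent and dccp_write_xmit but are not the kernel code.

```c
#include <stdint.h>

struct tx_state {
	uint32_t cwnd;	/* congestion window, in packets */
	uint32_t pipe;	/* packets currently in flight   */
};

/* Returns 0 if a packet may be sent now, non-zero otherwise. */
static int tx_send_packet(const struct tx_state *tx)
{
	return tx->pipe < tx->cwnd ? 0 : 1;
}

/* Called only after tx_send_packet() returned 0 and the packet went out. */
static void tx_packet_sent(struct tx_state *tx)
{
	tx->pipe++;
}

/* Simplified caller: send queued packets until the CCID says stop. */
static unsigned int send_burst(struct tx_state *tx, unsigned int queued)
{
	unsigned int sent = 0;

	while (sent < queued && tx_send_packet(tx) == 0) {
		tx_packet_sent(tx);
		sent++;
	}
	return sent;
}
```

Because the caller never reaches tx_packet_sent() after a non-zero return, the return value alone carries the information the flag used to hold.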
2008-01-28 | [CCID2]: Replace pipe assignment-function with assignment | Gerrit Renker
The function ccid2_change_pipe only does an assignment. This patch simplifies the code by replacing the function with the assignment it performs. Furthermore, the type of pipe is promoted from `signed' to unsigned (increasing the range). As a result, a BUG_ON test for negative values now becomes obsolete (for safety not removed, but replaced with a less annoying `DCCP_BUG'). Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 | [CCID2]: Replace cwnd assignment-function with assignment | Gerrit Renker
The current function ccid2_change_cwnd in effect makes only an assignment, as the test whether cwnd has reached 0 is only required when cwnd is halved. This patch simplifies the code by replacing the function with the assignment it performs. Furthermore, since ssthresh derives from cwnd and appears in many assignments and comparisons, the type of ssthresh has also been changed to match that of cwnd. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 | [CCID2]: Replace read-only variable with constant | Gerrit Renker
This replaces the field member `numdupack', which was used as a read-only constant in the code, with a #define. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 | [CCID2]: Remove unused variable | Gerrit Renker
This removes a variable `ccid2hctx_sent' which is incremented but never referenced/read (i.e., dead code). Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 | [CCID2]: Disable broken Ack Ratio adaptation algorithm | Gerrit Renker
This comments out a problematic section comprising a half-finished algorithm:
 - The variable `ccid2hctx_ackloss' is never initialised to a value different from 0 and hence in fact is a read-only constant.
 - The `arsent' variable counts packets other than Acks (it is incremented for every packet), and there is no test for Ack Loss.
 - The concept of counting Acks as such leads to a complex calculation, and the calculation at the moment is inconsistent with this concept. The problem is that the number of Acks - rather than the number of windows - is counted, which leads to a complex (cubic/quadratic) expression - this is not even implemented.
In its current state, the commented-out algorithm interferes with normal processing by changing Ack Ratio incorrectly, and at the wrong times. A new algorithm is necessary, which will not necessarily use the same variables as used by the unfinished one; hence the old variables have been removed.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 | [CCID2]: Remove redundant BUG_ON | Gerrit Renker
This removes a test for `val < 1' which would only have been triggered when val < 0, due to a preceding test for 0. Fixed by using an unsigned type for cwnd (as in TCP) instead. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 | [DCCP] ccid2: Allow window to grow larger | Andrea Bittau
Now that we can stuff bigger ack vectors into options. Signed-off-by: Andrea Bittau <a.bittau@cs.ucl.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-12-02 | [DCCP] CCID2: Code optimizations | Andrea Bittau
These are code optimizations which are relevant when dealing with large windows. They are not coded the way I would like, but they do the job for the short term. This patch should be neater. Committer note: Changed the seqno comparisons to use {after,before}48 to handle wrapping. Signed-off-by: Andrea Bittau <a.bittau@cs.ucl.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
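DCCP sequence numbers are 48 bits wide, so a plain `<' comparison gives the wrong answer once the counter wraps. The sketch below shows the modular comparison that helpers of the {after,before}48 kind perform. The names here are hypothetical stand-ins, and the kernel's own macros may be written differently.

```c
#include <stdbool.h>
#include <stdint.h>

#define SEQ48_MASK  ((1ULL << 48) - 1)
#define SEQ48_SIGN  (1ULL << 47)

/* Signed distance from a to b, computed modulo 2^48. */
static int64_t seq48_delta(uint64_t a, uint64_t b)
{
	uint64_t d = (b - a) & SEQ48_MASK;

	/* Distances of 2^47 or more are interpreted as negative. */
	return (d & SEQ48_SIGN) ? (int64_t)d - (int64_t)(1ULL << 48)
				: (int64_t)d;
}

/* True if a comes before b in sequence space. */
static bool seq48_before(uint64_t a, uint64_t b)
{
	return seq48_delta(a, b) > 0;
}

/* True if a comes after b in sequence space. */
static bool seq48_after(uint64_t a, uint64_t b)
{
	return seq48_before(b, a);
}
```

With this definition the largest 48-bit value counts as "before" 0, which is what a sender sees right after the sequence number wraps.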
2006-09-22 | [DCCP] CCID2: Halve cwnd once upon multiple losses in a single RTT | Andrea Bittau
When multiple losses occur in one RTT, the window should be halved only once [a single "congestion event"]. This is now implemented, although not perfectly. Slightly changed the interface for changing the cwnd: pass hctx instead of dp. This is required in order to allow for change_cwnd to be called from _init(). Signed-off-by: Andrea Bittau <a.bittau@cs.ucl.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>
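One common way to treat several losses within one RTT as a single congestion event is to remember when cwnd was last halved and ignore further losses until an RTT has passed. The sketch below takes that approach; it is an illustration under that assumption, with hypothetical names, and the commit itself notes that its own implementation is imperfect.

```c
#include <stdbool.h>
#include <stdint.h>

struct cong_state {
	uint32_t cwnd;		/* congestion window, in packets        */
	uint32_t ssthresh;	/* slow-start threshold, in packets     */
	uint32_t last_cong;	/* time of the last halving (ticks)     */
	bool     had_cong;	/* false until the first congestion event */
};

/* React to a detected loss at time `now'; rtt is in the same tick unit. */
static void on_loss(struct cong_state *c, uint32_t now, uint32_t rtt)
{
	/* Losses within one RTT of the last halving are the same event. */
	if (c->had_cong && (uint32_t)(now - c->last_cong) < rtt)
		return;

	c->had_cong = true;
	c->last_cong = now;

	c->cwnd /= 2;
	if (c->cwnd == 0)	/* halving must never leave a zero window */
		c->cwnd = 1;
	c->ssthresh = c->cwnd > 2 ? c->cwnd : 2;
}
```

Three losses at times 1000, 1050 and 1100 with an RTT of 100 ticks therefore halve a window of 16 twice (to 8, then 4), not three times.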
2006-09-22 | [DCCP] CCID2: Allocate seq records on demand | Andrea Bittau
Allocate more sequence state on demand. Each time a packet is sent out by CCID2, a record of it needs to be kept. This list of records grows proportionally to cwnd. Previously, the length of this list was hardcoded and therefore the cwnd could only grow to this value (of 128). Now, records are allocated on demand as necessary; cwnd may grow as it wishes. The exceptional case of when memory is not available is not handled gracefully. Perhaps, cwnd should be capped at that point. Signed-off-by: Andrea Bittau <a.bittau@cs.ucl.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>
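A growable record list can be sketched as a circular linked ring that gets a new chunk spliced in whenever it is about to fill up. Everything below is a hypothetical illustration: the struct layout, the chunk size of 4 and the function names are assumptions, and for brevity the chunks are never freed.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define SEQ_CHUNK 4	/* records added per allocation (assumed value) */

struct seq_rec {
	uint64_t seqno;
	struct seq_rec *next;
};

struct seq_ring {
	struct seq_rec *head;	/* next free slot        */
	struct seq_rec *tail;	/* oldest recorded packet */
	size_t capacity;
};

/* Splice a fresh chunk of records into the ring right after head. */
static int ring_grow(struct seq_ring *r)
{
	struct seq_rec *chunk = calloc(SEQ_CHUNK, sizeof(*chunk));
	size_t i;

	if (chunk == NULL)
		return -1;
	for (i = 0; i + 1 < SEQ_CHUNK; i++)
		chunk[i].next = &chunk[i + 1];

	if (r->head == NULL) {		/* first chunk closes on itself */
		chunk[SEQ_CHUNK - 1].next = &chunk[0];
		r->head = r->tail = &chunk[0];
	} else {
		chunk[SEQ_CHUNK - 1].next = r->head->next;
		r->head->next = &chunk[0];
	}
	r->capacity += SEQ_CHUNK;
	return 0;
}

/* Record a sent packet, growing the ring when one free slot is left. */
static int ring_record(struct seq_ring *r, uint64_t seqno)
{
	if ((r->head == NULL || r->head->next == r->tail) && ring_grow(r) != 0)
		return -1;		/* out of memory: caller must cope */
	r->head->seqno = seqno;
	r->head = r->head->next;
	return 0;
}

/* Number of packets currently recorded between tail and head. */
static size_t ring_in_flight(const struct seq_ring *r)
{
	const struct seq_rec *p;
	size_t n = 0;

	for (p = r->tail; p != r->head; p = p->next)
		n++;
	return n;
}
```

The -1 return from ring_record() marks the out-of-memory case that the message says is not handled gracefully; capping cwnd there would be one possible reaction.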
2006-09-22 | [DCCP] CCID2: Initialize ssthresh to infinity | Andrea Bittau
Initialize the slow-start threshold to infinity. This way, upon connection initiation, slow-start will be exited only upon a packet loss. This patch will allow connections to quickly gain speed. Signed-off-by: Andrea Bittau <a.bittau@cs.ucl.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>
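The effect of an "infinite" threshold can be seen in a small TCP-like window update: as long as cwnd is below ssthresh, every acknowledged packet grows cwnd by one. The sketch below is a generic slow-start / congestion-avoidance update with hypothetical names, using the largest 32-bit value as infinity; it is not the CCID-2 code.

```c
#include <stdint.h>

struct win_state {
	uint32_t cwnd;		/* congestion window, in packets            */
	uint32_t ssthresh;	/* slow-start threshold, in packets         */
	uint32_t acked;		/* acks counted towards the next increment  */
};

static void win_init(struct win_state *w)
{
	w->cwnd = 1;
	w->ssthresh = UINT32_MAX;	/* "infinity": only a loss ends slow start */
	w->acked = 0;
}

/* Update the window for one newly acknowledged packet. */
static void win_on_ack(struct win_state *w)
{
	if (w->cwnd < w->ssthresh) {
		w->cwnd++;		/* slow start: +1 per acked packet */
	} else if (++w->acked >= w->cwnd) {
		w->acked = 0;		/* congestion avoidance: +1 per window */
		w->cwnd++;
	}
}
```

With a finite initial threshold the connection would switch to the slow, linear branch as soon as cwnd reached it, even without any loss.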
2006-03-20 | [DCCP] CCID: Improve CCID infrastructure | Arnaldo Carvalho de Melo
 1. No need for ->ccid_init nor ->ccid_exit, this is what module_{init,exit} does and anyway neither ccid2 nor ccid3 were using it.
 2. Rename struct ccid to struct ccid_operations and introduce struct ccid with a pointer to ccid_operations and right after it the rx or tx private state.
 3. Remove the pointer to the state of the half connections from struct dccp_sock, now it's derived through ccid_priv() from the ccid pointer.
Now we also can implement the setsockopt for changing the CCID easily as no ccid init routines can affect struct dccp_sock in any way that prevents other CCIDs from working if a CCID switch operation is asked by apps.
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>
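The layout from point 2 (an operations pointer followed directly by the private state) maps naturally onto a C flexible array member. The sketch below shows the idea in user space. Only the names struct ccid, struct ccid_operations and ccid_priv() come from the message; the fields, the allocator and the demo CCID are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct ccid_operations {
	const char *name;
	size_t obj_size;		/* size of the private state */
	int (*init)(void *priv);	/* optional state initialiser */
};

struct ccid {
	const struct ccid_operations *ops;
	char priv[];			/* private state follows directly */
};

/* Private state of a CCID instance, derived from the ccid pointer. */
static void *ccid_priv(struct ccid *c)
{
	return c->priv;
}

/* Hypothetical allocator: one block holds the header and the state. */
static struct ccid *sketch_ccid_new(const struct ccid_operations *ops)
{
	struct ccid *c = calloc(1, sizeof(*c) + ops->obj_size);

	if (c == NULL)
		return NULL;
	c->ops = ops;
	if (ops->init != NULL && ops->init(ccid_priv(c)) != 0) {
		free(c);
		return NULL;
	}
	return c;
}

/* Demo CCID whose state needs no more than pointer alignment. */
struct demo_tx {
	uint32_t cwnd;
};

static int demo_tx_init(void *priv)
{
	((struct demo_tx *)priv)->cwnd = 1;
	return 0;
}

static const struct ccid_operations demo_ops = {
	.name = "demo",
	.obj_size = sizeof(struct demo_tx),
	.init = demo_tx_init,
};
```

Since the socket then holds only the struct ccid pointer, switching to another CCID amounts to freeing one block and allocating another with different operations.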
2006-03-20 | [DCCP] CCID2: Initial CCID2 (TCP-Like) implementation | Andrea Bittau
Original work by Andrea Bittau, Arnaldo Melo cleaned up and fixed several issues on the merge process. For now CCID2 was turned the default for all SOCK_DCCP connections, but this will be remedied soon with the merge of the feature negotiation code. Signed-off-by: Andrea Bittau <a.bittau@cs.ucl.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>