diff options
author | Sven Van Asbroeck <thesven73@gmail.com> | 2021-02-15 20:08:02 -0500 |
---|---|---|
committer | David S. Miller <davem@davemloft.net> | 2021-02-16 15:08:48 -0800 |
commit | a8db76d40e4d568a9e9cc9fb8d81352b5ff530ee (patch) | |
tree | 6cef744cc3706d774e622904fafcff273b06d0ad /net/ipv6/proc.c | |
parent | 80fea53dbecbaec9dadaa9452564b2314caea0f9 (diff) |
lan743x: boost performance on cpu archs w/o dma cache snooping
The buffers in the lan743x driver's receive ring are always 9K,
even when the largest packet that can be received (the mtu) is
much smaller. This performs particularly badly on cpu archs
without dma cache snooping (such as ARM): each received packet
results in a 9K dma_{map|unmap} operation, which is very expensive
because cpu caches need to be invalidated.
Careful measurement of the driver rx path on armv7 reveals that
the cpu spends the majority of its time waiting for cache
invalidation.
Optimize by keeping the rx ring buffer size as close as possible
to the mtu. This limits the amount of cache that requires
invalidation.
This optimization would normally force us to re-allocate all
ring buffers when the mtu is changed - a disruptive event,
because it can only happen when the network interface is down.
Remove the need to re-allocate all ring buffers by adding support
for multi-buffer frames. Now any combination of mtu and ring
buffer size will work. When the mtu changes from mtu1 to mtu2,
consumed buffers of size mtu1 are lazily replaced by newly
allocated buffers of size mtu2.
These optimizations double the rx performance on armv7.
Third parties report 3x rx speedup on armv8.
Tested with iperf3 on a freescale imx6qp + lan7430, both sides
set to mtu 1500 bytes, measure rx performance:
Before:
[ ID] Interval Transfer Bandwidth Retr
[ 4] 0.00-20.00 sec 550 MBytes 231 Mbits/sec 0
After:
[ ID] Interval Transfer Bandwidth Retr
[ 4] 0.00-20.00 sec 1.33 GBytes 570 Mbits/sec 0
Signed-off-by: Sven Van Asbroeck <thesven73@gmail.com>
Reviewed-by: Bryan Whitehead <Bryan.Whitehead@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Diffstat (limited to 'net/ipv6/proc.c')
0 files changed, 0 insertions, 0 deletions