Age | Commit message (Collapse) | Author |
|
arg.result is sometimes used as fib6_result and sometimes used to
hold the rt6_info. Add rt6_info to fib6_result and make the use
of arg.result consistent through ipv6 rules.
The rt6 entry is filled in for lookups returning a dst_entry, but not
for direct fib_lookups that just want a fib6_info.
Fixes: effda4dd97e8 ("ipv6: Pass fib6_result to fib lookups")
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
fib rule actions should return -EAGAIN for the rules to continue to the
next one. A recent change overwrote err with the lookup always returning
0 (future change will make it more like IPv4) which means the rules
stopped at the first (e.g., local table lookup only). Catch and reset err
to -EAGAIN.
Fixes: effda4dd97e87 ("ipv6: Pass fib6_result to fib lookups")
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Previously BMC's MAC address is calculated by simply adding 1 to the
last byte of network controller's MAC address, and it produces incorrect
result when network controller's MAC address ends with 0xFF.
The problem can be fixed by calling eth_addr_inc() function to increment
MAC address; besides, the MAC address is also validated before assigning
to BMC.
Fixes: cb10c7c0dfd9 ("net/ncsi: Add NCSI Broadcom OEM command")
Signed-off-by: Tao Ren <taoren@fb.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Samuel Mendoza-Jonas <sam@mendozajonas.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Pull drm regression fixes from Dave Airlie:
"We interrupt your regularly scheduled drm fixes for a regression
special.
The first is for a fix in i915 that had unexpected side effects
fallout in the userspace X.org modesetting driver where X would no
longer start. I got tired of the nitpicking and issued a large hammer
on it. The X.org driver is buggy, but blackscreen regressions are
worse.
The second was an oversight that myself and Gerd should have noticed
better, Gerd is trying to fix this properly, but the regression is too
large to leave, even if the original behaviour is bad in some cases,
it's clearly bad to break a bunch of working use cases.
I'll likely have a regular fixes pull later, but I really wanted to
highlight these"
* tag 'drm-fixes-2019-04-24' of git://anongit.freedesktop.org/drm/drm:
Revert "drm/virtio: drop prime import/export callbacks"
Revert "drm/i915/fbdev: Actually configure untiled displays"
|
|
append
mlxsw currently does not support v6 gateways with v4 routes. Commit
19a9d136f198 ("ipv4: Flag fib_info with a fib_nh using IPv6 gateway")
prevents a route from being added, but nothing stops the replace or
append. Add a catch for them too.
$ ip ro add 172.16.2.0/24 via 10.99.1.2
$ ip ro replace 172.16.2.0/24 via inet6 fe80::202:ff:fe00:b dev swp1s0
Error: mlxsw_spectrum: IPv6 gateway with IPv4 route is not supported.
$ ip ro append 172.16.2.0/24 via inet6 fe80::202:ff:fe00:b dev swp1s0
Error: mlxsw_spectrum: IPv6 gateway with IPv4 route is not supported.
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Andre Guedes says:
====================
Taprio qdisc fixes
I'm re-sending this series, now with the "net-next" prefix in the subject.
The only change from the previous version is in patch 3. As suggested by Cong
Wang, it removes the !entry check within should_restart_cycle() since it is
already checked by the caller. As a side effect, that function becomes a dummy
wrapper on list_is_last() so we simply remove it and call list_is_last()
instead.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In case we don't have 'guard' or 'budget' to transmit the skb, we should
continue traversing the qdisc list since the remaining guard/budget
might be enough to transmit a skb from other children qdiscs.
Fixes: 5a781ccbd19e (“tc: Add support for configuring the taprio scheduler”)
Signed-off-by: Andre Guedes <andre.guedes@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
While traversing taprio's children qdisc list, if the gate is closed for
a given traffic class, we should continue traversing the list since the
remaining qdiscs may have skb ready for transmission.
This patch also takes this opportunity and changes the function to use
the TAPRIO_ALL_GATES_OPEN macro instead of the magic number '-1'.
Fixes: 5a781ccbd19e (“tc: Add support for configuring the taprio scheduler”)
Signed-off-by: Andre Guedes <andre.guedes@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The 'entry' argument from should_restart_cycle() cannot be NULL since it
is already checked by the caller so the WARN_ON() within should_
restart_cycle() could be removed. By doing that, that function becomes
a dummy wrapper on list_is_last() so this patch simply gets rid of it
and call list_is_last() within advance_sched() instead.
Signed-off-by: Andre Guedes <andre.guedes@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch does a code refactoring to taprio_get_start_time() function
to improve readability and report error properly.
If 'base' time is later than 'now', the start time is equal to 'base'
and taprio_get_start_time() is done. That's the natural case so we move
that code to the beginning of the function. Also, if 'cycle' calculation
is zero, something went really wrong with taprio and we should log that
internal error properly.
Signed-off-by: Andre Guedes <andre.guedes@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch removes a pointless variable assigment in taprio_change().
The 'err' variable is not used from this assignment to the next one so
this patch removes it.
Signed-off-by: Andre Guedes <andre.guedes@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
nhc_flags holds the RTNH_F flags for a given nexthop (fib{6}_nh).
All of the RTNH_F_ flags fit in an unsigned char, and since the API to
userspace (rtnh_flags and lower byte of rtm_flags) is 1 byte it can not
grow. Make nhc_flags in fib_nh_common an unsigned char and shrink the
size of the struct by 8, from 56 to 48 bytes.
Update the flags arguments for up netdevice events and fib_nexthop_info
which determines the RTNH_F flags to return on a dump/event. The RTNH_F
flags are passed in the lower byte of rtm_flags which is an unsigned int
so use a temp variable for the flags to fib_nexthop_info and combine
with rtm_flags in the caller.
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Currently, lwtunnel_fill_encap hardcodes the encap and encap type
attributes as RTA_ENCAP and RTA_ENCAP_TYPE, respectively. The nexthop
objects want to re-use this code but the encap attributes passed to
userspace as NHA_ENCAP and NHA_ENCAP_TYPE. Since that is the only
difference, change lwtunnel_fill_encap to take the attribute type as
an input.
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The netdev variant is usable on any context since it disables interrupts.
The napi variant of the call should only be used within softirq context.
Replace napi_alloc_frag on driver init with the correct netdev_alloc_frag
call
Changes since v1:
- Adjusted commit message
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Jassi Brar <jaswinder.singh@linaro.org>
Fixes: 4acb20b46214 ("net: socionext: different approach on DMA")
Signed-off-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
There are spelling mistakes in structure elements, fix these.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
According to the R-Car Gen3 Hardware Manual Rev 1.50 of Nov 30, 2018, the
TX clock internal delay mode isn't supported on R-Car E3 (r8a77990) or D3
(r8a77995). And by extension it is also not supported by RZ/G2E (r9a774c0).
This matches all ES versions of the affected SoCs as it is
not clear if this problem will be resolved in newer chips.
This can be revisited, as necessary.
This patch does not error-out if PHY_INTERFACE_MODE_RGMII_ID or
PHY_INTERFACE_MODE_RGMII_TXID are used on SoCs where TX clock delay
mode is not supported as there is a risk of introducing a regression
when used in conjunction with older DT blobs present in the field.
Rather, a warning is logged in such cases.
Based on work by Kazuya Mizuguchi.
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
net/core/sock.c: In function 'sock_gettstamp':
net/core/sock.c:3007:23: error: expected '}' before ';' token
.tv_sec = ts.tv_sec;
^
net/core/sock.c:3011:4: error: expected ')' before 'return'
return -EFAULT;
^~~~~~
net/core/sock.c:3013:2: error: expected expression before '}' token
}
^
Fixes: c7cbdbf29f48 ("net: rework SIOCGSTAMP ioctl handling")
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Pointers should be printed with %p or %px rather than
cast to (u_long) and printed with %lx.
Change %lx to %p to print the pointer.
Change %lx to %pad to print the dma_addr_t.
Signed-off-by: Fuqian Huang <huangfq.daxian@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Pointers should be printed with %p or %px rather than
cast to (u_long) type and printed with %lX.
As the function seems to be for debug purpose.
Change %lX to %px to print the pointer value.
Signed-off-by: Fuqian Huang <huangfq.daxian@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch does more harm than good, as it breaks both Xwayland and
gnome-shell with X11.
Xwayland requires DRI3 & DRI3 requires PRIME.
X11 crash for obscure double-free reason which are hard to debug
(starting X11 by hand doesn't trigger the crash).
I don't see an apparent problem implementing those stub prime
functions, they may return an error at run-time, and it seems to be
handled fine by GNOME at least.
This reverts commit b318e3ff7ca065d6b107e424c85a63d7a6798a69.
[airlied:
This broke userspace for virtio-gpus, and regressed things from DRI3 to DRI2.
This brings back the original problem, but it's better than regressions.]
Fixes: b318e3ff7ca065d6b107e424c85a63d7a6798a ("drm/virtio: drop prime import/export callbacks")
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
This reverts commit d179b88deb3bf6fed4991a31fd6f0f2cad21fab5.
This commit is documented to break userspace X.org modesetting driver in certain configurations.
The X.org modesetting userspace driver is broken. No fixes are available yet. In order for this patch to be applied it either needs a config option or a workaround developed.
This has been reported a few times, saying it's a userspace problem is clearly against the regression rules.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109806
Signed-off-by: Dave Airlie <airlied@redhat.com>
Cc: <stable@vger.kernel.org> # v3.19+
|
|
Eric Dumazet says:
====================
ipv6: fib6_ref conversion to refcount_t
We are chasing use-after-free in IPv6 that could have their origin
in fib6_ref 0 -> 1 transitions.
This patch series should help finding the root causes if these
illegal transitions ever happen.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
We suspect some issues involving fib6_ref 0 -> 1 transitions might
cause strange syzbot reports.
Lets convert fib6_ref to refcount_t to catch them earlier.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Wei Wang <weiwan@google.com>
Acked-by: Wei Wang <weiwan@google.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Instead of using atomic_inc(), prefer fib6_info_hold()
so that upcoming refcount_t conversion is simpler.
Only fib6_info_alloc() is using atomic_set() since we
just allocated a new object.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Wei Wang <weiwan@google.com>
Acked-by: Wei Wang <weiwan@google.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
We do not need to clear f6i->rt6i_exception_bucket right before
freeing f6i.
Note that f6i->rt6i_exception_bucket is properly protected by
f6i->exception_bucket_flushed being set to one in rt6_flush_exceptions()
under the protection of rt6_exception_lock.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Wei Wang <weiwan@google.com>
Acked-by: Wei Wang <weiwan@google.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
mlx5-updates-2019-04-22
This series includes updates to mlx5e driver RX data path and some
significant XDP RX/TX improvements to overcome/mitigate HW and PCIE
bottlenecks.
From Tariq:
1) Some Enhancements in rq->flags
2) Stabilize RX packet rate (on Striding RQ) with
multiple outstanding UMR posts
In this patch, we add support for multiple outstanding UMR posts,
to allow faster gap closure between consuming MPWQEs and reposting
them back into the WQ.
Performance test:
As expected, huge improvement in large-scale (48 cores).
xdp_redirect_map, 64B UDP multi-stream.
Redirect from ConnectX-5 100Gbps to ConnectX-6 100Gbps.
CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz.
Before: Unstable, 7 to 30 Mpps
After: Stable, at 70.5 Mpps
From Shay:
3) XDP, Inline small packets into the TX MPWQE in XDP xmit flow
Upon high packet rate with multiple CPUs TX workloads, much of the HCA's
resources are spent on prefetching TX descriptors, thus affecting
transmission rates.
This patch comes to mitigate this problem by moving some workload to the
CPU and reducing the HW data prefetch overhead for small packets (<= 256B).
When forwarding packets with XDP, a packet that is smaller
than a certain size (set to ~256 bytes) would be sent inline within
its WQE TX descrptor (mem-copied), when the hardware tx queue is congested
beyond a pre-defined water-mark.
Performance:
Tested packet rate for UDP 64Byte multi-stream
over two dual port ConnectX-5 100Gbps NICs.
CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz
* Tested with hyper-threading disabled
XDP_TX:
| | before | after | |
| 24 rings | 51Mpps | 116Mpps | +126% |
| 1 ring | 12Mpps | 12Mpps | same |
XDP_REDIRECT:
** Below is the transmit rate, not the redirection rate
which might be larger, and is not affected by this patch.
| | before | after | |
| 32 rings | 64Mpps | 92Mpps | +43% |
| 1 ring | 6.4Mpps | 6.4Mpps | same |
As we can see, feature significantly improves scaling, without
hurting single ring performance.
From Maxim:
4) Some trivial refactoring and code improvements prior to a larger series
to support AF_XDP.
====================
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Pull nfsd bugfixes from Bruce Fields:
"Fix miscellaneous nfsd bugs, in NFSv4.1 callbacks, NFSv4.1
lock-notification callbacks, NFSv3 readdir encoding, and the
cache/upcall code"
* tag 'nfsd-5.1-1' of git://linux-nfs.org/~bfields/linux:
nfsd: wake blocked file lock waiters before sending callback
nfsd: wake waiters blocked on file_lock before deleting it
nfsd: Don't release the callback slot unless it was actually held
nfsd/nfsd3_proc_readdir: fix buffer count and page pointers
sunrpc: don't mark uninitialised items as VALID.
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic
Pull syscall numbering updates from Arnd Bergmann:
"arch: add pidfd and io_uring syscalls everywhere
This comes a bit late, but should be in 5.1 anyway: we want the newly
added system calls to be synchronized across all architectures in the
release.
I hope that in the future, any newly added system calls can be added
to all architectures at the same time, and tested there while they are
in linux-next, avoiding dependencies between the architecture
maintainer trees and the tree that contains the new system call"
* tag 'syscalls-5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
arch: add pidfd and io_uring syscalls everywhere
|
|
qca_set_baudrate() calls serdev_device_wait_until_sent() assuming that
the HCI is always associated with a serdev device. This isn't true for
ROME controllers instantiated through ldisc, where the call causes a
crash due to a NULL pointer dereferentiation. Only call the function
when we have a serdev device. The timeout for ROME devices at the end
of qca_set_baudrate() is long enough to be reasonably sure that the
command was sent.
Fixes: fa9ad876b8e0 ("Bluetooth: hci_qca: Add support for Qualcomm Bluetooth chip wcn3990")
Reported-by: Balakrishna Godavarthi <bgodavar@codeaurora.org>
Reported-by: Rocky Liao <rjliao@codeaurora.org>
Signed-off-by: Matthias Kaehlcke <mka@chromium.org>
Reviewed-by: Rocky Liao <rjliao@codeaurora.org>
Tested-by: Rocky Liao <rjliao@codeaurora.org>
Reviewed-by: Balakrishna Godavarthi <bgodavar@codeaurora.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
|
|
Create a #define for the timeout of mlx5e_wait_for_min_rx_wqes to
clarify the meaning of a magic number.
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Remove the no longer used page_reuse stat of RQs.
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
mlx5e_trigger_irq posts a NOP to the ICO SQ just to trigger an IRQ and
enter the NAPI poll on the right CPU according to the affinity. Use it
in mlx5e_activate_rq.
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
mdev is unused in mlx5e_rx_is_linear_skb.
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
mlx5e_mpwqe_get_log_rq_size calculates the number of WQEs (N) based on
the requested number of frames in the RQ (F) and the number of packets
per WQE (P). It ensures that N is not less than the minimum number of
WQEs in an RQ (N_min). Arithmetically, it means that F / P >= N_min
should be true. This function deals with logarithms, so it should check
that log(F) - log(P) >= log(N_min). However, if F < P, this expression
will cause an unsigned underflow. Check log(F) >= log(P) + log(N_min)
instead.
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
This commit moves the parameter calculation functions to a separate file
for better modularity and code sharing with future features.
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
If the channels fail to reopen after setting an XDP program, return the
error code instead of 0. A proper fix is still needed, as now any error
while reopening the channels brings the interface down. This patch only
adds error reporting.
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
params is unused in mlx5e_init_di_list.
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Upon high packet rate with multiple CPUs TX workloads, much of the HCA's
resources are spent on prefetching TX descriptors, thus affecting
transmission rates.
This patch comes to mitigate this problem by moving some workload to the
CPU and reducing the HW data prefetch overhead for small packets (<= 256B).
When forwarding packets with XDP, a packet that is smaller
than a certain size (set to ~256 bytes) would be sent inline within
its WQE TX descrptor (mem-copied), when the hardware tx queue is congested
beyond a pre-defined water-mark.
This is added to better utilize the HW resources (which now makes
one less packet data prefetch) and allow better scalability, on the
account of CPU usage (which now 'memcpy's the packet into the WQE).
To load balance between HW and CPU and get max packet rate, we use
watermarks to detect how much the HW is congested and move the work
loads back and forth between HW and CPU.
Performance:
Tested packet rate for UDP 64Byte multi-stream
over two dual port ConnectX-5 100Gbps NICs.
CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz
* Tested with hyper-threading disabled
XDP_TX:
| | before | after | |
| 24 rings | 51Mpps | 116Mpps | +126% |
| 1 ring | 12Mpps | 12Mpps | same |
XDP_REDIRECT:
** Below is the transmit rate, not the redirection rate
which might be larger, and is not affected by this patch.
| | before | after | |
| 32 rings | 64Mpps | 92Mpps | +43% |
| 1 ring | 6.4Mpps | 6.4Mpps | same |
As we can see, feature significantly improves scaling, without
hurting single ring performance.
Signed-off-by: Shay Agroskin <shayag@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
This counter tracks how many TX MPWQE sessions are started in XDP SQ
in XDP TX/REDIRECT flow. It counts per-channel and global stats.
Signed-off-by: Shay Agroskin <shayag@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
The XDP redirect flush indication belongs to the receive queue,
not to its XDP send queue.
For this, use a new bit on rq->flags.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Shay Agroskin <shayag@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Values in enum mlx5e_rq_flag are used as bit indixes.
Intention was to use them with no BIT(i) wrapping.
No functional bug fix here, as the same (shifted)flag bit
is used for all set, test, and clear operations.
Fixes: 121e89275471 ("net/mlx5e: Refactor RQ XDP_TX indication")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Shay Agroskin <shayag@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
The buffers mapping of the Multi-Packet WQEs (of Striding RQ)
is done via UMR posts, one UMR WQE per an RX MPWQE.
A single MPWQE is capable of serving many incoming packets,
usually larger than the budget of a single napi cycle.
Hence, posting a single UMR WQE per napi cycle (and handling its
completion in the next cycle) works fine in many common cases,
but not always.
When an XDP program is loaded, every MPWQE is capable of serving less
packets, to satisfy the packet-per-page requirement.
Thus, for the same number of packets more MPWQEs (and UMR posts)
are needed (twice as much for the default MTU), giving less latency
room for the UMR completions.
In this patch, we add support for multiple outstanding UMR posts,
to allow faster gap closure between consuming MPWQEs and reposting
them back into the WQ.
For better SW and HW locality, we combine the UMR posts in bulks of
(at least) two.
This is expected to improve packet rate in high CPU scale.
Performance test:
As expected, huge improvement in large-scale (48 cores).
xdp_redirect_map, 64B UDP multi-stream.
Redirect from ConnectX-5 100Gbps to ConnectX-6 100Gbps.
CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz.
Before: Unstable, 7 to 30 Mpps
After: Stable, at 70.5 Mpps
No degradation in other tested scenarios.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux
|
|
Kavya Sree Kotagiri says:
====================
net: phy: mscc: Improvements to VSC8514 PHY driver.
The VSC8514 PHY is a 4-ports PHY that is 10/100/1000BASE-T, 100BASE-FX,
1000BASE-X, can communicate with the MAC via QSGMII.
The MAC interface protocol for each port within QSGMII can
be either 1000BASE-X or SGMII, if the QSGMII MAC that the VSC8514 is
connecting to supports this functionality.
VSC8514 also supports SGMII MAC-side autonegotiation on each individual
port, downshifting, can set the blinking pattern of each of its 4 LEDs,
SyncE, 1000BASE-T Ring Resiliency as well as HP Auto-MDIX detection.
This patch series adds support for 10BASE-T, 100BASE-TX, and
1000BASE-T, QSGMII link with the MAC, downshifting, HP Auto-MDIX
detection and blinking pattern for its 4 LEDs.
The GPIO register bank is a set of registers that are common to all
PHYs in the package. So any modification in any register of this bank
affects all PHYs of the package.
If the PHYs haven't been reset before booting the Linux kernel and were
configured to use interrupts for e.g. link status updates, it is
required to clear the interrupts mask register of all PHYs before being
able to use interrupts with any PHY. The first PHY of the package that
will be init will take care of clearing all PHYs interrupts mask
registers. Thus, we need to keep track of the init sequence in the
package, if it's already been done or if it's to be done.
Most of the init sequence of a PHY of the package is common to all PHYs
in the package, thus we use the SMI broadcast feature which enables us
to propagate a write in one register of one PHY to all PHYs in the same
package.
This patch series adds support for VSC8514 in Microsemi driver(mscc.c)
and removes support from Vitesse driver(vitesse.c).
v8
- mscc: Added appropriate code using phy_modify() in vsc8514_config_init().
v7
- mscc: Handled return values in vsc8514_config_init().
v6
- mscc: Added proper return value in vsc85xx_csr_ctrl_phy_read().
- mscc: Replaced __mdiobus_write and__mdiobus_read with __phy_write and __phy_read resp.
- mscc: Replaced register addresses in 8514_config_init() with proper constants.
v5
- mscc: Added return error statements for few function calls.
- mscc: Added comments in vsc85xx_csr_ctrl_phy_read() and vsc85xx_csr_ctrl_phy_write()
v4
- mscc: Removed features settings
- mscc: Removed aneg_done settings.
v3
- mscc: Used BIT(x) for PHY_MCB_S6G_WRITE and PHY_MCB_S6G_READ
instead of hex.
- mscc: Replaced magic numbers with proper constants.
- mscc: Handled delays and timeouts at appropriate points.
- mscc: Added comments/explanation where requested.
v2
- mscc: Sorted variable declarations in reverse christmas tree order.
v1
- Added 0/2 file.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add support for VSC8514 in Microsemi driver (mscc.c)
with more features.
Signed-off-by: Kavya Sree Kotagiri <kavyasree.kotagiri@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The VSC8514 PHY is a 4-ports PHY that is 10/100/1000BASE-T, 100BASE-FX,
1000BASE-X, can communicate with the MAC via QSGMII.
The MAC interface protocol for each port within QSGMII can
be either 1000BASE-X or SGMII, if the QSGMII MAC that the VSC8514 is
connecting to supports this functionality.
VSC8514 also supports SGMII MAC-side autonegotiation on each individual
port, downshifting, can set the blinking pattern of each of its 4 LEDs,
SyncE, 1000BASE-T Ring Resiliency as well as HP Auto-MDIX detection.
This adds support for 10BASE-T, 100BASE-TX, and 1000BASE-T,
QSGMII link with the MAC, downshifting, HP Auto-MDIX detection
and blinking pattern for its 4 LEDs.
The GPIO register bank is a set of registers that are common to all PHYs
in the package. So any modification in any register of this bank affects
all PHYs of the package.
If the PHYs haven't been reset before booting the Linux kernel and were
configured to use interrupts for e.g. link status updates, it is
required to clear the interrupts mask register of all PHYs before being
able to use interrupts with any PHY. The first PHY of the package that
will be init will take care of clearing all PHYs interrupts mask
registers. Thus, we need to keep track of the init sequence in the
package, if it's already been done or if it's to be done.
Most of the init sequence of a PHY of the package is common to all PHYs
in the package, thus we use the SMI broadcast feature which enables us
to propagate a write in one register of one PHY to all PHYs in the same
package.
Signed-off-by: Kavya Sree Kotagiri <kavyasree.kotagiri@microchip.com>
Signed-off-by: Quentin Schulz <quentin.schulz@bootlin.com>
Co-developed-by: Quentin Schulz <quentin.schulz@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add missing <of_device_id> table for SPI driver relying on SPI
device match since compatible is in a DT binding or in a DTS.
Before this patch:
modinfo drivers/nfc/st95hf/st95hf.ko | grep alias
alias: spi:st95hf
After this patch:
modinfo drivers/nfc/st95hf/st95hf.ko | grep alias
alias: spi:st95hf
alias: of:N*T*Cst,st95hfC*
alias: of:N*T*Cst,st95hf
Reported-by: Javier Martinez Canillas <javier@dowhile0.org>
Signed-off-by: Daniel Gomez <dagmcr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add missing <of_device_id> table for SPI driver relying on SPI
device match since compatible is in a DT binding or in a DTS.
Before this patch:
modinfo drivers/net/phy/spi_ks8995.ko | grep alias
alias: spi:ksz8795
alias: spi:ksz8864
alias: spi:ks8995
After this patch:
modinfo drivers/net/phy/spi_ks8995.ko | grep alias
alias: spi:ksz8795
alias: spi:ksz8864
alias: spi:ks8995
alias: of:N*T*Cmicrel,ksz8795C*
alias: of:N*T*Cmicrel,ksz8795
alias: of:N*T*Cmicrel,ksz8864C*
alias: of:N*T*Cmicrel,ksz8864
alias: of:N*T*Cmicrel,ks8995C*
alias: of:N*T*Cmicrel,ks8995
Reported-by: Javier Martinez Canillas <javier@dowhile0.org>
Signed-off-by: Daniel Gomez <dagmcr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The default m88e151x LED configuration is 0x1177, used LED[0]
for 1000M link, LED[1] for 100M link, and LED[2] for active.
But for some boards, which use LED[0] for link, and LED[1] for
active, prefer to be 0x1040. To be compatible with this case,
this patch defines a new dev_flag, and set it before connect
phy in HNS3 driver. When phy initializing, using the new
LED configuration if this dev_flag is set.
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
One of the more common cases of allocation size calculations is finding
the size of a structure that has a zero-sized array at the end, along
with memory for some number of elements for that array. For example:
struct foo {
int stuff;
struct boo entry[];
};
size = sizeof(struct foo) + count * sizeof(struct boo);
Instead of leaving these open-coded and prone to type mistakes, we can
now use the new struct_size() helper:
size = struct_size(instance, entry, count);
This code was detected with the help of Coccinelle.
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
|