Age | Commit message (Collapse) | Author |
|
There are couple new values that PMTM register can return
in module_type field. Add them and map them to module width in
mlxsw_core_module_max_width(). Fix the existing names on the way.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The current codebase makes use of the zero-length array language
extension to the C90 standard, but the preferred mechanism to declare
variable-length types such as these ones is a flexible array member[1][2],
introduced in C99:
struct foo {
int stuff;
struct boo array[];
};
By making use of the mechanism above, we will get a compiler warning
in case the flexible array does not occur last in the structure, which
will help us prevent some kind of undefined behavior bugs from being
inadvertently introduced[3] to the codebase from now on.
Also, notice that, dynamic memory allocations won't be affected by
this change:
"Flexible array members have incomplete type, and so the sizeof operator
may not be applied. As a quirk of the original implementation of
zero-length arrays, sizeof evaluates to zero."[1]
This issue was found with the help of Coccinelle.
[1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
[2] https://github.com/KSPP/linux/issues/21
[3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The current codebase makes use of the zero-length array language
extension to the C90 standard, but the preferred mechanism to declare
variable-length types such as these ones is a flexible array member[1][2],
introduced in C99:
struct foo {
int stuff;
struct boo array[];
};
By making use of the mechanism above, we will get a compiler warning
in case the flexible array does not occur last in the structure, which
will help us prevent some kind of undefined behavior bugs from being
inadvertently introduced[3] to the codebase from now on.
Also, notice that, dynamic memory allocations won't be affected by
this change:
"Flexible array members have incomplete type, and so the sizeof operator
may not be applied. As a quirk of the original implementation of
zero-length arrays, sizeof evaluates to zero."[1]
This issue was found with the help of Coccinelle.
[1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
[2] https://github.com/KSPP/linux/issues/21
[3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Russell King says:
====================
rework phylink interface for split MAC/PCS support
The following series changes the phylink interface to allow us to
better support split MAC / MAC PCS setups. The fundamental change
required for this turns out to be quite simple.
Today, mac_config() is used for everything to do with setting the
parameters for the MAC, and mac_link_up() is used to inform the
MAC driver that the link is now up (and so to allow packet flow.)
mac_config() also has had a few implementation issues, with folk
who believe that members such as "speed" and "duplex" are always
valid, where "link" gets used inappropriately, etc.
With the proposed patches, all this changes subtly - but in a
backwards compatible way at this stage.
We pass the the full resolved link state (speed, duplex, pause) to
mac_link_up(), and it is now guaranteed that these parameters to
this function will always be valid (no more SPEED_UNKNOWN or
DUPLEX_UNKNOWN here - unless phylink is fed with such things.)
Drivers should convert over to using the state in mac_link_up()
rather than configuring the speed, duplex and pause in the
mac_config() method. The patch series includes a number of MAC
drivers which I've thought have been easy targets - I've left the
remainder as I think they need maintainer input. However, *all*
drivers will need conversion for future phylink development.
v2: add ocelot/felix and qca/ar9331 DSA drivers to patch 2, add
received tested-by so far.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Convert the Marvell mvpp2 ethernet driver to use the finalised link
parameters in mac_link_up() rather than the parameters in mac_config().
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Convert the Marvell mvneta ethernet driver to use the finalised link
parameters in mac_link_up() rather than the parameters in mac_config().
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Convert the macb ethernet driver to use the finalised link
parameters in mac_link_up() rather than the parameters in mac_config().
Tested-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Convert the DPAA2 ethernet driver to use the finalised link parameters
in mac_link_up() rather than the parameters in mac_config(), which are
more suited to the needs of the DPAA2 MC firmware than those available
via mac_config().
Tested-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Convert the Xilinx AXI ethernet driver to use the finalised link
parameters in mac_link_up() rather than the parameters in mac_config().
Tested-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Use the resolved link configuration to set the MAC configuration when
mac_link_up() for non-internal-PHY ports.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Propagate the resolved link configuration down via DSA's
phylink_mac_link_up() operation to allow split PCS/MAC to work.
Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Propagate the resolved link parameters via the mac_link_up() call for
MACs that do not automatically track their PCS state. We propagate the
link parameters via function arguments so that inappropriate members
of struct phylink_link_state can't be accessed, and creating a new
structure just for this adds needless complexity to the API.
Tested-by: Andre Przywara <andre.przywara@arm.com>
Tested-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add support for ethtool -r so that PHY negotiation can be restarted.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Read the copper autonegotiation results from the copper specific
status register, rather than decoding the advertisements. Reading
what the link is actually doing will allow us to support downshift
modes.
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Julian Wiedmann says:
====================
s390/qeth: updates 2020-02-27
please apply the following patch series for qeth to netdev's net-next
tree.
This adds support for ETHTOOL_RX_COPYBREAK, along with small cleanups
and fine-tuning.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Implement the ethtool hooks for the ETHTOOL_RX_COPYBREAK tunable.
The copybreak is stored into netdev_priv, so that we automatically go
back to the default value if the netdev is re-allocated.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Trust the napi_disable() in qeth_stop() to handle this.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Once the IDX connection is down, there's no point in trying to issue
more IOs.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This let's us start every new IDX connection with clean seqnos.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Looks like these were never used, ever since the driver was initially
added.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Reviewed-by: Alexandra Winter <wintera@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
It's good practice to not blindly trust what the HW offers.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Reviewed-by: Alexandra Winter <wintera@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Properly define the cmd's struct to get rid of some casts and accesses
at magic offsets.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Reviewed-by: Alexandra Winter <wintera@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
card->info.unique_id is always 0 for IQD devices, so don't bother with
copying it into the 0-initialized cmd.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Reviewed-by: Alexandra Winter <wintera@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Jiri Pirko says:
====================
selftests: updates for mlxsw driver test
This patchset contains tweaks to the existing tests and is also adding
couple of new ones, namely tests for shared buffer and red offload.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The scale test for Spectrum-2 should be invoked for Spectrum-2 and
Spectrum-3. Add the appropriate device ID.
Signed-off-by: Amit Cohen <amitc@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Currently, the test inserts X /32 routes and for each route it is
testing that a packet sent from the first host is received by the second
host, which is very time-consuming.
Instead only validate the offload flag of each route and get the same result.
Wait between the creation of the routes and the offload validation in
order to make sure that all the routes were successfully offloaded.
Signed-off-by: Danielle Ratson <danieller@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
After adding a given number of flower rules for different IPv6
addresses, the test generates traffic and ensures that each packet is
received, which is time-consuming.
Instead, test the offload indication of the tc flower rules and reduce
the running time by half.
Signed-off-by: Danielle Ratson <danieller@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Test the max shared buffer occupancy for port's pool and port's TC's (using
different types of packets).
Signed-off-by: Shalom Toledo <shalomt@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add mlxsw lib for common defines, helpers etc.
Signed-off-by: Shalom Toledo <shalomt@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add two devlink port helpers:
* devlink port get by netdev
* devlink cpu port get
Signed-off-by: Shalom Toledo <shalomt@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Sanity check for devlink info command.
Signed-off-by: Shalom Toledo <shalomt@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Test physical ports' shared buffer configuration options using random
values related to a specific configuration option. There are 3
configuration options: pool, TC bind and portpool.
Each sub-test, test a different configuration option and random the related
values as the follow:
* For pools, pool's size will be randomized.
* For TC bind, pool number and threshold will be randomized.
* For portpools, threshold will be randomized.
Signed-off-by: Shalom Toledo <shalomt@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Rtnetlink test uses offload indication checks.
Use a busywait helper and wait until the offload indication is set or
fail if it reaches timeout.
Signed-off-by: Danielle Ratson <danieller@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Vxlan test uses offload indication checks.
Use a busywait helper and wait until the offload indication is set or
fail if it reaches timeout.
Signed-off-by: Danielle Ratson <danieller@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Blackhole routes test uses offload indication checks.
Use busywait helper and wait until the routes offload indication is set or
fail if it reaches timeout.
Signed-off-by: Danielle Ratson <danieller@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The test checks that packets are trapped when they should egress a
router interface (RIF) that has become disabled. This is a temporary
state in a RIF's deletion sequence.
Currently, the test deletes the RIF by flushing all the IP addresses
configured on the associated netdev (br0). However, this is racy, as
this also flushes all the routes pointing to the netdev and if the
routes are deleted from the device before the RIF is disabled, then no
packets will try to egress the disabled RIF and the trap will not be
triggered.
Instead, trigger the deletion of the RIF by unlinking the mlxsw port
from the bridge that is backing the RIF. Unlike before, this will not
cause the kernel to delete the routes pointing to the bridge.
Note that due to current mlxsw locking scheme the RIF is always deleted
first, but this is going to change.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Include test of forbidding to have multiple mirror actions.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Include test of forbidding to have redirect rule on egress-bound block.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This tests that below the queue minimum length, there is no dropping /
marking, and above max, everything is dropped / marked.
The test is structured as a core file with topology and test code, and
three wrappers: one for RED used as a root Qdisc, and two for
testing (W)RED under PRIO and ETS.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Extract a helper __start_traffic() configurable by protocol type. Allow
passing through extra mausezahn arguments. Add a wrapper,
start_tcp_traffic().
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Russell King says:
====================
VLANs, DSA switches and multiple bridges
This is a repost of the previously posted RFC back in December, which
did not get fully reviewed. I've dropped the RFC tag this time as no
one really found anything too problematical in the RFC posting.
I've been trying to configure DSA for VLANs and not having much success.
The setup is quite simple:
- The main network is untagged
- The wifi network is a vlan tagged with id $VN running over the main
network.
I have an Armada 388 Clearfog with a PCIe wifi card which I'm trying to
setup to provide wifi access to the vlan $VN network, while the switch
is also part of the main network.
However, I'm encountering problems:
1) vlan support in DSA has a different behaviour from the Linux
software bridge implementation.
# bridge vlan
port vlan ids
lan1 1 PVID Egress Untagged
...
shows the default setup - the bridge ports are all configured for
vlan 1, untagged egress, and vlan 1 as the port vid. Issuing:
# ip li set dev br0 type bridge vlan_filtering 1
with no other vlan configuration commands on a Linux software bridge
continues to allow untagged traffic to flow across the bridge.
This difference in behaviour is because the MV88E6xxx VTU is
completely empty - because net/dsa ignores all vlan settings for
a port if br_vlan_enabled(dp->bridge_dev) is false - this reflects
the vlan filtering state of the bridge, not whether the bridge is
vlan aware.
What this means is that attempting to configure the bridge port
vlans before enabling vlan filtering works for Linux software
bridges, but fails for DSA bridges.
2) Assuming the above is sorted, we move on to the next issue, which
is altogether more weird. Let's take a setup where we have a
DSA bridge with lan1..6 in a bridge device, br0, with vlan
filtering enabled. lan1 is the upstream port, lan2 is a downstream
port that also wants to see traffic on vlan id $VN.
Both lan1 and lan2 are configured for that:
# bridge vlan add vid $VN dev lan1
# bridge vlan add vid $VN dev lan2
# ip li set br0 type bridge vlan_filtering 1
Untagged traffic can now pass between all the six lan ports, and
vlan $VN between lan1 and lan2 only. The MV88E6xxx 8021q_mode
debugfs file shows all lan ports are in mode "secure" - this is
important! /sys/class/net/br0/bridge/vlan_filtering contains 1.
tcpdumping from another machine on lan4 shows that no $VN traffic
reaches it. Everything seems to be working correctly...
In order to further bridge vlan $VN traffic to hostapd's wifi
interface, things get a little more complex - we can't add hostapd's
wifi interface to br0 directly, because hostapd will bring up the
wifi interface and leak the main, untagged traffic onto the wifi.
(hostapd does have vlan support, but only as a dynamic per-client
thing, and there's no hooks I can see to allow script-based config
of the network setup before hostapd up's the wifi interface.)
So, what I tried was:
# ip li add link br0 name br0.$VN type vlan id $VN
# bridge vlan add vid $VN dev br0 self
# ip li set dev br0.$VN up
So far so good, we get a vlan interface on top of the bridge, and
tcpdumping it shows we get traffic. The 8021q_mode file has not
changed state. Everything still seems to be correct.
# bridge addbr br1
Still nothing has changed.
# bridge addif br1 br0.$VN
And now the 8021q_mode debugfs file shows that all ports are now in
"disabled" mode, but /sys/class/net/br0/bridge/vlan_filtering still
contains '1'. In other words, br0 still thinks vlan filtering is
enabled, but the hardware has had vlan filtering disabled.
Adding some stack traces to an appropriate point indicates that this
is because __switchdev_handle_port_attr_set() recurses down through
the tree of interfaces, skipping over the vlan interface, applying
br1's configuration to br0's ports.
This surely can not be right - surely
__switchdev_handle_port_attr_set() and similar should stop recursing
down through another master bridge device? There are probably other
network device classes that switchdev shouldn't recurse down too.
I've considered whether switchdev is the right level to do it, and
I think it is - as we want the check/set callbacks to be called for
the top level device even if it is a master bridge device, but we
don't want to recurse through a lower master bridge device.
v2: dropped patch 3, since that has an outstanding issue, and my
question on it has not been answered. Otherwise, these are the
same patches. Maybe we can move forward with just these two?
v3: include DSA ports in patch 2
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When setting VLANs on DSA switches, the VLAN is added to both the port
concerned as well as the CPU port by dsa_slave_vlan_add(), as well as
any DSA ports. If multiple ports are configured with the same VLAN ID,
this triggers a warning on the CPU and DSA ports.
Avoid this warning for CPU and DSA ports.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When configuring a tree of independent bridges, propagating changes
from the upper bridge across a bridge master to the lower bridge
ports brings surprises.
For example, a lower bridge may have vlan filtering enabled. It
may have a vlan interface attached to the bridge master, which may
then be incorporated into another bridge. As soon as the lower
bridge vlan interface is attached to the upper bridge, the lower
bridge has vlan filtering disabled.
This occurs because switchdev recursively applies its changes to
all lower devices no matter what.
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Tested-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The callers only expect NULL pointers, so returning an error pointer
will lead to an Oops.
Fixes: 0c2204a4ad71 ("net: qrtr: Migrate nameservice to kernel from userspace")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch adds a missing shift for the media operation mode selection.
This does not fix the driver as the current operation mode (copper) has
a value of 0, but this wouldn't work for other modes.
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In this commit we revert the part of
commit 1a63443afd70 ("net/amazon: Ensure that driver version is aligned to the linux kernel"),
which breaks the interface between the ENA driver and FW.
We also replace the use of DRIVER_VERSION with DRIVER_GENERATION
when we bring back the deleted constants that are used in interface with
ENA device FW.
This commit does not change the driver version reported to the user via
ethtool, which remains the kernel version.
Fixes: 1a63443afd70 ("net/amazon: Ensure that driver version is aligned to the linux kernel")
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Florian Westphal says:
====================
mptcp: update mptcp ack sequence outside of recv path
This series moves mptcp-level ack sequence update outside of the recvmsg path.
Current approach has two problems:
1. There is delay between arrival of new data and the time we can ack
this data.
2. If userspace doesn't call recv for some time, mptcp ack_seq is not
updated at all, even if this data is queued in the subflow socket
receive queue.
Move skbs from the subflow socket receive queue to the mptcp-level
receive queue, updating the mptcp-level ack sequence and have recv
take skbs from the mptcp-level receive queue.
The first place where we will attempt to update the mptcp level acks
is from the subflows' data_ready callback, even before we make userspace
aware of new data.
Because of possible deadlock (we need to take the mptcp socket lock
while already holding the subflow sockets lock), we may still need to
defer the mptcp-level ack update. In such case, this work will be either
done from work queue or recv path, depending on which runs sooner.
In order to avoid pointless scheduling of the work queue, work
will be queued from the mptcp sockets lock release callback.
This allows to detect when the socket owner did drain the subflow
socket receive queue.
Please see individual patches for more information.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Don't schedule the work queue right away, instead defer this
to the lock release callback.
This has the advantage that it will give recv path a chance to
complete -- this might have moved all pending packets from the
subflow to the mptcp receive queue, which allows to avoid the
schedule_work().
Co-developed-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
We can't lock_sock() the mptcp socket from the subflow data_ready callback,
it would result in ABBA deadlock with the subflow socket lock.
We can however grab the spinlock: if that succeeds and the mptcp socket
is not owned at the moment, we can process the new skbs right away
without deferring this to the work queue.
This avoids the schedule_work and hence the small delay until the
work item is processed.
Signed-off-by: Florian Westphal <fw@strlen.de>
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|