Age  Commit message  Author
2018-06-28  skbuff: preserve sock reference when scrubbing the skb.  Flavio Leitner
The sock reference is lost when scrubbing the packet and that breaks TSQ (TCP Small Queues) and XPS (Transmit Packet Steering) causing performance impacts of about 50% in a single TCP stream when crossing network namespaces. XPS breaks because the queue mapping stored in the socket is not available, so another random queue might be selected when the stack needs to transmit something like a TCP ACK or a TCP retransmission. That causes packet re-ordering and/or performance issues. TSQ breaks because it orphans the packet while it is still in the host, so packets are queued, contributing to the buffer bloat problem. Preserving the sock reference fixes both issues. The socket is orphaned anyway in the receiving path before any relevant action, and on the TX side netfilter checks if the reference is local before using it. Signed-off-by: Flavio Leitner <fbl@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-28  netfilter: check if the socket netns is correct.  Flavio Leitner
Netfilter assumes that if the socket is present in the skb, then it can be used because that reference is cleaned up while the skb is crossing netns. We want to change that to preserve the socket reference in a future patch, so this is a preparation updating netfilter to check if the socket netns matches before using it. Signed-off-by: Flavio Leitner <fbl@redhat.com> Acked-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-28  Merge branch 'net-sched-actions-code-style-cleanup-and-fixes'  David S. Miller
Roman Mashak says:

====================
net sched actions: code style cleanup and fixes

The patchset fixes a few code stylistic issues and typos, as well as one issue detected by the sparse semantic checker tool. No functional changes introduced.

Patch 1 & 2 fix coding style bits caught by the checkpatch.pl script
Patch 3 fixes an issue with a shadowed variable
Patch 4 adds sizeof() operator instead of magic number for buffer length
Patch 5 fixes typos in diagnostics messages
Patch 6 explicitly sets unsigned char for bitwise operation

v2:
- submit for net-next
- added Reviewed-by tags
- use u8* instead of char* as per Davide Caratti suggestion
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-28  net sched actions: avoid bitwise operation on signed value in pedit  Roman Mashak
Since char can be unsigned or signed, and bitwise operators may have implementation-dependent results when performed on signed operands, declare 'u8 *' operand instead. Suggested-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: Roman Mashak <mrv@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
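The signedness concern can be illustrated outside the kernel with a small sketch. The helper names below are hypothetical and are not taken from act_pedit.c; `u8` is a local stand-in for the kernel type. The sketch only shows why a `u8 *` operand gives a well-defined result for a shift, while a signed operand may not.

```c
#include <assert.h>
#include <stdint.h>

typedef uint8_t u8; /* local stand-in for the kernel's u8 type */

/* Hypothetical helper: extract the high nibble of one header byte.
 * The operand is unsigned, so the promoted value is never negative
 * and the result is always in the range 0..15. */
static unsigned int high_nibble(const u8 *p)
{
	assert(p);
	return *p >> 4;
}

/* Same operation through 'signed char'. A byte such as 0xF0 becomes a
 * negative int after promotion, and right-shifting a negative value
 * is implementation-defined in C. */
static int high_nibble_signed(const signed char *p)
{
	assert(p);
	return *p >> 4;
}
```

For bytes below 0x80 both helpers agree; they can diverge only when the top bit is set, which is the case the `u8 *` declaration makes unambiguous.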
2018-06-28  net sched actions: fix misleading text strings in pedit action  Roman Mashak
Change "tc filter pedit .." to "tc actions pedit .." in error messages to clearly refer to pedit action. Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: Roman Mashak <mrv@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-28  net sched actions: use sizeof operator for buffer length  Roman Mashak
Replace constant integer with sizeof() to clearly indicate the destination buffer length in skb_header_pointer() calls. Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: Roman Mashak <mrv@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-28  net sched actions: fix sparse warning  Roman Mashak
The variable _data in include/asm-generic/sections.h defines sections, which causes a sparse warning in pedit:

net/sched/act_pedit.c:293:35: warning: symbol '_data' shadows an earlier one
./include/asm-generic/sections.h:36:13: originally declared here

Therefore rename the variable.

Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-28  net sched actions: fix coding style in pedit headers  Roman Mashak
Fix coding style issues in tc pedit headers detected by the checkpatch script. Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: Roman Mashak <mrv@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-28  net sched actions: fix coding style in pedit action  Roman Mashak
Fix coding style issues in tc pedit action detected by the checkpatch script. Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: Roman Mashak <mrv@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-28  netem: slotting with non-uniform distribution  Yousuk Seung
Extend slotting with support for non-uniform distributions. This is similar to netem's non-uniform distribution delay feature.

Commit f043efeae2f1 ("netem: support delivering packets in delayed time slots") added the slotting feature to approximate the behaviors of media with packet aggregation but only supported a uniform distribution for delays between transmission attempts. Tests with TCP BBR with emulated wifi links with non-uniform distributions produced more useful results.

Syntax:
    slot dist DISTRIBUTION DELAY JITTER [packets MAX_PACKETS] \
        [bytes MAX_BYTES]

The syntax and use of the distribution table is the same as in the non-uniform distribution delay feature. A file DISTRIBUTION must be present in TC_LIB_DIR (e.g. /usr/lib/tc) containing numbers scaled by NETEM_DIST_SCALE. A random value x is selected from the table and it takes DELAY + ( x * JITTER ) as delay. Correlation between values is not supported.

Examples:
Normal distribution delay with mean = 800us and stdev = 100us.
    > tc qdisc add dev eth0 root netem slot dist normal 800us 100us

Optionally set the max slot size in bytes and/or packets.
    > tc qdisc add dev eth0 root netem slot dist normal 800us 100us \
        bytes 64k packets 42

Signed-off-by: Yousuk Seung <ysseung@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
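The delay formula in the message above can be sketched in userspace C. This is a simplified reading, not the kernel's implementation: the function name `slot_delay` is hypothetical, the value 8192 for NETEM_DIST_SCALE is an assumption about the UAPI header, and the division that undoes the table scaling is an interpretation of "numbers scaled by NETEM_DIST_SCALE".

```c
#include <assert.h>
#include <stdint.h>

/* Assumed scale factor of the distribution table entries. */
#define NETEM_DIST_SCALE 8192

/* Hypothetical helper: compute DELAY + x * JITTER, where x is one table
 * entry, i.e. a sample of the distribution multiplied by
 * NETEM_DIST_SCALE. delay and jitter share one time unit (for example
 * microseconds) and the result is in that same unit. */
static int64_t slot_delay(int64_t delay, int64_t jitter, int16_t x)
{
	assert(jitter >= 0);
	return delay + ((int64_t)x * jitter) / NETEM_DIST_SCALE;
}
```

With DELAY = 800 and JITTER = 100, a table entry of 8192 (one standard deviation above the mean under this reading) yields 900, and an entry of 0 yields the plain DELAY of 800.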
2018-06-28  netlink: Return extack message if attribute validation fails  David Ahern
Have one extack message for parsing and validating. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-28  net: phy: xgmiitorgmii: Check read_status results  Brandon Maier
We're ignoring the result of the attached phy device's read_status(). Return it so we can detect errors. Signed-off-by: Brandon Maier <brandon.maier@rockwellcollins.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-28  net: phy: xgmiitorgmii: Use correct mdio bus  Brandon Maier
The xgmiitorgmii is using the mii_bus of the device it's attached to, instead of the bus it was given during probe. Signed-off-by: Brandon Maier <brandon.maier@rockwellcollins.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-28  net: phy: xgmiitorgmii: Check phy_driver ready before accessing  Brandon Maier
A phy_device is added to the global mdio_bus list during phy_device_register(), but its phy_driver doesn't get attached until phy_probe(). It is therefore possible for of_phy_find_device() in xgmiitorgmii to return a valid phy with a NULL phy_driver, leading to a NULL pointer access during the memcpy().

Fixes this Oops:

Unable to handle kernel NULL pointer dereference at virtual address 00000000
pgd = c0004000
[00000000] *pgd=00000000
Internal error: Oops: 5 [#1] PREEMPT SMP ARM
Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.14.40 #1
Hardware name: Xilinx Zynq Platform
task: ce4c8d00 task.stack: ce4ca000
PC is at memcpy+0x48/0x330
LR is at xgmiitorgmii_probe+0x90/0xe8
pc : [<c074bc68>] lr : [<c0529548>] psr: 20000013
sp : ce4cbb54 ip : 00000000 fp : ce4cbb8c
r10: 00000000 r9 : 00000000 r8 : c0c49178
r7 : 00000000 r6 : cdc14718 r5 : ce762800 r4 : cdc14710
r3 : 00000000 r2 : 00000054 r1 : 00000000 r0 : cdc14718
Flags: nzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none
Control: 18c5387d Table: 0000404a DAC: 00000051
Process swapper/0 (pid: 1, stack limit = 0xce4ca210)
...
[<c074bc68>] (memcpy) from [<c0529548>] (xgmiitorgmii_probe+0x90/0xe8)
[<c0529548>] (xgmiitorgmii_probe) from [<c0526a94>] (mdio_probe+0x28/0x34)
[<c0526a94>] (mdio_probe) from [<c04db98c>] (driver_probe_device+0x254/0x414)
[<c04db98c>] (driver_probe_device) from [<c04dbd58>] (__device_attach_driver+0xac/0x10c)
[<c04dbd58>] (__device_attach_driver) from [<c04d96f4>] (bus_for_each_drv+0x84/0xc8)
[<c04d96f4>] (bus_for_each_drv) from [<c04db5bc>] (__device_attach+0xd0/0x134)
[<c04db5bc>] (__device_attach) from [<c04dbdd4>] (device_initial_probe+0x1c/0x20)
[<c04dbdd4>] (device_initial_probe) from [<c04da8fc>] (bus_probe_device+0x98/0xa0)
[<c04da8fc>] (bus_probe_device) from [<c04d8660>] (device_add+0x43c/0x5d0)
[<c04d8660>] (device_add) from [<c0526cb8>] (mdio_device_register+0x34/0x80)
[<c0526cb8>] (mdio_device_register) from [<c0580b48>] (of_mdiobus_register+0x170/0x30c)
[<c0580b48>] (of_mdiobus_register) from [<c05349c4>] (macb_probe+0x710/0xc00)
[<c05349c4>] (macb_probe) from [<c04dd700>] (platform_drv_probe+0x44/0x80)
[<c04dd700>] (platform_drv_probe) from [<c04db98c>] (driver_probe_device+0x254/0x414)
[<c04db98c>] (driver_probe_device) from [<c04dbc58>] (__driver_attach+0x10c/0x118)
[<c04dbc58>] (__driver_attach) from [<c04d9600>] (bus_for_each_dev+0x8c/0xd0)
[<c04d9600>] (bus_for_each_dev) from [<c04db1fc>] (driver_attach+0x2c/0x30)
[<c04db1fc>] (driver_attach) from [<c04daa98>] (bus_add_driver+0x50/0x260)
[<c04daa98>] (bus_add_driver) from [<c04dc440>] (driver_register+0x88/0x108)
[<c04dc440>] (driver_register) from [<c04dd6b4>] (__platform_driver_register+0x50/0x58)
[<c04dd6b4>] (__platform_driver_register) from [<c0b31248>] (macb_driver_init+0x24/0x28)
[<c0b31248>] (macb_driver_init) from [<c010203c>] (do_one_initcall+0x60/0x1a4)
[<c010203c>] (do_one_initcall) from [<c0b00f78>] (kernel_init_freeable+0x15c/0x1f8)
[<c0b00f78>] (kernel_init_freeable) from [<c0763d10>] (kernel_init+0x18/0x124)
[<c0763d10>] (kernel_init) from [<c0112d74>] (ret_from_fork+0x14/0x20)
Code: ba000002 f5d1f03c f5d1f05c f5d1f07c (e8b151f8)
---[ end trace 3e4ec21905820a1f ]---

Signed-off-by: Brandon Maier <brandon.maier@rockwellcollins.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
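The failure mode described in this commit (copying from a driver pointer that has not been attached yet) can be modelled in plain C. Everything below is a hypothetical stand-in: the `_stub` structures, the helper name, and the error constant are assumptions made for illustration and are not the kernel's definitions, nor necessarily the shape of the actual fix.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a "try again later" probe error code. */
#define EPROBE_DEFER_STUB 517

struct phy_driver_stub {
	char name[16];
};

struct phy_device_stub {
	struct phy_driver_stub *drv; /* NULL until a driver is bound */
};

/* Copy the attached driver's description, but only once a driver has
 * actually been bound. Without the NULL check, memcpy() would read
 * from address 0, as in the Oops above. */
static int copy_attached_driver(struct phy_driver_stub *dst,
				const struct phy_device_stub *phydev)
{
	assert(dst);
	if (!phydev || !phydev->drv)
		return -EPROBE_DEFER_STUB;

	memcpy(dst, phydev->drv, sizeof(*dst));
	return 0;
}
```

Returning an error from the probe path, instead of dereferencing the pointer, lets the caller retry once the attached phy has a driver.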
2018-06-28  Merge branch 'ipsec-selftests-updates'  David S. Miller
Shannon Nelson says:

====================
Updates for ipsec selftests

Fix up the existing ipsec selftest and add tests for the ipsec offload driver API.

v2: addressed formatting nits in netdevsim from Jakub Kicinski
v3: a couple more nits from Jakub
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-28  selftests: rtnetlink: add ipsec offload API test  Shannon Nelson
Using the netdevsim as a device for testing, try out the XFRM commands for setting up IPsec hardware offloads. Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-28  netdevsim: add ipsec offload testing  Shannon Nelson
Implement the IPsec/XFRM offload API for testing. Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-28  selftests: rtnetlink: use dummydev as a test device  Shannon Nelson
We really shouldn't mess with local system settings, so let's use the already created dummy device instead for ipsec testing. Oh, and let's put the temp file into a proper directory. Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-28  selftests: rtnetlink: clear the return code at start of ipsec test  Shannon Nelson
Following the custom from the other functions, clear the global ret code before starting the test so as to not have previously failed tests cause us to think this test has failed. Reported-by: Anders Roxell <anders.roxell@linaro.org> Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-28  l2tp: define helper for parsing struct sockaddr_pppol2tp*  Guillaume Nault
'sockaddr_len' is checked against various values when entering pppol2tp_connect(), to verify its validity. It is used again later, to find out which sockaddr structure was passed from user space. This patch combines these two operations into one new function in order to simplify pppol2tp_connect(). A new structure, l2tp_connect_info, is used to pass sockaddr data back to pppol2tp_connect(), to avoid passing too many parameters to l2tp_sockaddr_get_info(). Also, the first parameter is void* in order to avoid casting between all sockaddr_* structures manually. Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-28  tcp: remove one indentation level in tcp_create_openreq_child  Eric Dumazet
Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-28  sh_eth: fix *enum* {A|M}PR_BIT  Sergei Shtylyov
The *enum* {A|M}PR_BIT were declared in the commit 86a74ff21a7a ("net: sh_eth: add support for Renesas SuperH Ethernet") adding SH771x support, however the SH771x manual doesn't have the APR/MPR registers described and the code writing to them for SH7710 was later removed by the commit 380af9e390ec ("net: sh_eth: CPU dependency code collect to "struct sh_eth_cpu_data""). All the newer SoC manuals have these registers documented as having a 16-bit TIME parameter of the PAUSE frame, not 1-bit -- update the *enum* accordingly, fixing up the APR/MPR writes... Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-28  tc-tests: add an extreme-case csum action test  Keara Leibovitz
Added an extreme-case test for all 7 csum action headers. Signed-off-by: Keara Leibovitz <kleib@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-28  Merge branch 'mscc-ocelot-add-more-features'  David S. Miller
Alexandre Belloni says:

====================
net: mscc: ocelot: add more features

This series adds link aggregation and VLAN filtering hardware offload support to the ocelot driver. PTP support will be sent later.

changes in v2:
- rebased on v4.18-rc1
- check for aggregation type and only offload it when type is hash (balance-xor or 802.3ad)
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-28  net: mscc: ocelot: add VLAN filtering  Antoine Tenart
Add hardware VLAN filtering offloading on ocelot. Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com> Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-28  net: mscc: ocelot: add bonding support  Alexandre Belloni
Add link aggregation hardware offload support for Ocelot. ocelot_get_link_ksettings() is not great but it does work until the driver is reworked to switch to phylink. Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-28  cxgb4: Add new T5 PCI device id 0x50ae  Ganesh Goudar
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-28  cxgb4: Add flag tc_flower_initialized  Casey Leedom
Add flag tc_flower_initialized to indicate the completion of tc flower initialization. Signed-off-by: Casey Leedom <leedom@chelsio.com> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-27  neighbour: force neigh_invalidate when NUD_FAILED update is from admin  Roopa Prabhu
In systems where neigh gc thresholds are set to high values, admin-deleted neigh entries (e.g. ip neigh flush or ip neigh del) can linger around in NUD_FAILED state for a long time until periodic gc kicks in. This patch forces neigh_invalidate when NUD_FAILED neigh_update is from an admin. Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-27  Merge branch 'Multipath-tests-for-tunnel-devices'  David S. Miller
Petr Machata says: ==================== Multipath tests for tunnel devices This patchset adds a test for ECMP and weighted ECMP between two GRE tunnels. In patches #1 and #2, the function multipath_eval() is first moved from router_multipath.sh to lib.sh for ease of reuse, and then fixed up. In patch #3, the function tc_rule_stats_get() is parameterized to be useful for egress rules as well. In patch #4, a new function __simple_if_init() is extracted from simple_if_init(). This covers the logic that needs to be done for the usual interface: VRF migration, upping and installation of IP addresses. Patch #5 then adds the test itself. Additionally in patch #6, a requirement to add diagrams to selftests is documented. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-27  selftests: forwarding: README: Require diagrams  Petr Machata
ASCII art diagrams are well suited for presenting the topology that a test uses while being easy to embed directly in the test file itself. They make the information very easy to grasp even for simple topologies, and for more complex ones they are almost essential, as figuring out the interconnects from the script itself proves to be difficult. Therefore state the requirement for topology ASCII art in README. Signed-off-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-27  selftests: forwarding: Test multipath tunneling  Petr Machata
Add a GRE-tunneling test such that there are two tunnels involved, with a multipath route listing both as next hops. Similarly to router_multipath.sh, test that the distribution of traffic to the tunnels honors the configured weights. Signed-off-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-27  selftests: forwarding: lib: Extract interface-init functions  Petr Machata
The function simple_if_init() does two things: it creates a VRF, then moves an interface into this VRF and configures addresses. The latter comes in handy when adding more interfaces into a VRF later on. The situation is similar for simple_if_fini(). Therefore split the interface remastering and address de/initialization logic to a new pair of helpers __simple_if_init() / __simple_if_fini(), and defer to these helpers from simple_if_init() and simple_if_fini(). Signed-off-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-27  selftests: forwarding: tc_rule_stats_get: Parameterize direction  Petr Machata
The GRE multipath tests need stats on an egress counter. Change tc_rule_stats_get() to take direction as an optional argument, with default of ingress. Take the opportunity to change line continuation character from | to \. Move the | to the next line, which indent. Signed-off-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-27  selftests: forwarding: multipath_eval(): Improve style  Petr Machata
- Change the indentation of the function body from 7 spaces to one tab.
- Move initialization of weights_ratio up so that it can be referenced from the error message about packet difference being zero.
- Move |'s consistently to continuation line, which reindent.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-27  selftests: forwarding: Move multipath_eval() to lib.sh  Petr Machata
This function will be useful for the GRE multipath test that is coming later. Signed-off-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-27  net/tls: Remove VLA usage on nonce  Kees Cook
It looks like the prior VLA removal, commit b16520f7493d ("net/tls: Remove VLA usage"), and a new VLA addition, commit c46234ebb4d1e ("tls: RX path for ktls"), passed in the night. This removes the newly added VLA, which happens to have its bounds based on the same max value. Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-26  selftests: forwarding: mirror_gre_vlan_bridge_1q: Unset rp_filter  Petr Machata
The IP addresses of tunnel endpoint at H3 are set at the VLAN device $h3.555. Therefore when test_gretap_untagged_egress() sets vlan 555 to egress untagged at $swp3, $h3's rp_filter rejects these packets. The test then spuriously fails. Therefore turn off net.ipv4.conf.{all, $h3}.rp_filter. Fixes: 9c7c8a82442c ("selftests: forwarding: mirror_gre_vlan_bridge_1q: Add more tests") Signed-off-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-26  mdio-mux-gpio: Remove VLA usage  Kees Cook
In the quest to remove all stack VLA usage from the kernel[1], this allocates the values buffer during the callback instead of putting it on the stack. [1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
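The general pattern of replacing a stack VLA with a buffer allocated inside the callback can be sketched in userspace C. The function below is hypothetical and only loosely modelled on a GPIO mux that expands a child index into one value per line; it uses calloc()/free() where kernel code would use its own allocators.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical callback: expand the bits of desired_child into one
 * int per GPIO line and return how many lines would be driven high.
 * The VLA form would have been 'int values[ndescs];' on the stack. */
static int count_set_lines(unsigned int ndescs, unsigned int desired_child)
{
	int *values;
	unsigned int n;
	int set = 0;

	assert(ndescs <= 8 * sizeof(desired_child));
	if (ndescs == 0)
		return 0;

	/* Heap allocation sized at run time, in place of the VLA. */
	values = calloc(ndescs, sizeof(*values));
	if (!values)
		return -1;

	for (n = 0; n < ndescs; n++)
		values[n] = (desired_child >> n) & 1;

	for (n = 0; n < ndescs; n++)
		set += values[n];

	free(values);
	return set;
}
```

The trade-off is an allocation that can fail, so the callback gains an error path that the stack-based version did not need.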
2018-06-26  Merge branch 'net-sched-support-replay-of-filter-offload-when-binding-to-block'  David S. Miller
Jakub Kicinski says:

====================
net: sched: support replay of filter offload when binding to block

This series from John adds the ability to replay filter offload requests when new offload callback is being registered on a TC block. This is most likely to take place for shared blocks today, when a block which already has rules is bound to another interface. Prior to this patch set if any of the rules were offloaded the block bind would fail.

A new tcf_proto_op is added to generate a filter-specific offload request. The new 'offload' op is supporting extack from day 0, hence we need to propagate extack to .ndo_setup_tc TC_BLOCK_BIND/TC_BLOCK_UNBIND and through tcf_block_cb_register() to tcf_block_playback_offloads().

The immediate use of this patch set is to simplify life of drivers which require duplicating rules when sharing blocks. Switch drivers (mlxsw) can bind ports to rule lists dynamically, NIC drivers generally don't have that ability and need the rules to be duplicated for each ingress they match on. In code terms this means that switch drivers don't register multiple callbacks for each port. NIC drivers do, and get a separate request and hence rule per-port, as if the block was not shared. The registration fails today, however, if some rules were already present.

As John notes in description of patch 7, drivers which register multiple callbacks to shared blocks will likely need to flush the rules on block unbind. This set makes the core not only replay the offload add requests but also offload remove requests when callback is unregistered.

v2:
- name parameters in patch 2;
- use unsigned int instead of u32 for in_hw_coun;
- improve extack message in patch 7.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-26  net: sched: call reoffload op on block callback reg  John Hurley
Call the reoffload tcf_proto_op on all tcf_proto nodes in all chains of a block when a callback tries to register to a block that already has offloaded rules. If all existing rules cannot be offloaded then the registration is rejected. This replaces the previous policy of rejecting such callback registration outright. On unregistration of a callback, the rules are flushed for that given cb. The implementation of block sharing in the NFP driver, for example, duplicates shared rules to all devs bound to a block. This meant that rules could still exist in hw even after a device is unbound from a block (assuming the block still remains active). Signed-off-by: John Hurley <john.hurley@netronome.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
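The replay-on-registration policy can be sketched with plain C callbacks. This is an illustrative model only: the types and names are hypothetical, rules are reduced to integer ids, and the rollback of already replayed rules on failure is an assumption about how a rejected registration would be cleaned up.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback type: add != 0 requests an offload add,
 * add == 0 an offload remove. Returns 0 on success. */
typedef int (*block_cb_t)(int rule_id, int add, void *priv);

/* Replay every existing rule to a newly registering callback. If any
 * rule is rejected, remove the ones already replayed and report the
 * error so the registration can be refused. */
static int replay_rules(const int *rules, size_t n, block_cb_t cb, void *priv)
{
	size_t i;
	int err;

	assert(cb);
	for (i = 0; i < n; i++) {
		err = cb(rules[i], 1, priv);
		if (err) {
			while (i--)
				cb(rules[i], 0, priv);
			return err;
		}
	}
	return 0;
}

/* Example callback: accepts everything; priv counts installed rules. */
static int cb_accept(int rule_id, int add, void *priv)
{
	int *installed = priv;

	(void)rule_id;
	*installed += add ? 1 : -1;
	return 0;
}

/* Example callback: refuses to add rule id 3, accepts the rest. */
static int cb_reject_3(int rule_id, int add, void *priv)
{
	if (add && rule_id == 3)
		return -1;
	return cb_accept(rule_id, add, priv);
}
```

Unregistration would use the same walk with add set to 0, which corresponds to the flush of rules for a given callback described above.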
2018-06-26  net: sched: cls_bpf: implement offload tcf_proto_op  John Hurley
Add the offload tcf_proto_op in cls_bpf to generate an offload message for each bpf prog in the given tcf_proto. Call the specified callback with this new offload message. The function only returns an error if the callback rejects adding a 'hardware only' prog. A prog contains a flag to indicate if it is in hardware or not. To ensure the offload function properly maintains this flag, keep a reference counter for the number of instances of the prog that are in hardware. Only update the flag when this counter changes from or to 0. Signed-off-by: John Hurley <john.hurley@netronome.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-26  net: sched: cls_u32: implement offload tcf_proto_op  John Hurley
Add the offload tcf_proto_op in cls_u32 to generate an offload message for each filter and the hashtable in the given tcf_proto. Call the specified callback with this new offload message. The function only returns an error if the callback rejects adding a 'hardware only' rule. A filter contains a flag to indicate if it is in hardware or not. To ensure the offload function properly maintains this flag, keep a reference counter for the number of instances of the filter that are in hardware. Only update the flag when this counter changes from or to 0. Signed-off-by: John Hurley <john.hurley@netronome.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-26  net: sched: cls_matchall: implement offload tcf_proto_op  John Hurley
Add the reoffload tcf_proto_op in matchall to generate an offload message for each filter in the given tcf_proto. Call the specified callback with this new offload message. The function only returns an error if the callback rejects adding a 'hardware only' rule. Ensure matchall flags correctly report if the rule is in hw by keeping a reference counter for the number of instances of the rule offloaded. Only update the flag when this counter changes from or to 0. Signed-off-by: John Hurley <john.hurley@netronome.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-26  net: sched: cls_flower: implement offload tcf_proto_op  John Hurley
Add the reoffload tcf_proto_op in flower to generate an offload message for each filter in the given tcf_proto. Call the specified callback with this new offload message. The function only returns an error if the callback rejects adding a 'hardware only' rule. A filter contains a flag to indicate if it is in hardware or not. To ensure the reoffload function properly maintains this flag, keep a reference counter for the number of instances of the filter that are in hardware. Only update the flag when this counter changes from or to 0. Add a generic helper function to implement this behaviour. Signed-off-by: John Hurley <john.hurley@netronome.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
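The counter-plus-flag bookkeeping described here can be sketched as a small helper. The structure, flag names, and function name below are hypothetical and are not the kernel's definitions; the sketch only demonstrates updating the flag when the counter moves from or to 0.

```c
#include <assert.h>
#include <stdbool.h>

#define FLAG_IN_HW     (1u << 0) /* hypothetical: rule is in hardware */
#define FLAG_NOT_IN_HW (1u << 1) /* hypothetical: rule is software only */

struct filter_stub {
	unsigned int in_hw_count; /* instances currently offloaded */
	unsigned int flags;
};

/* Record one more (add) or one fewer (!add) hardware instance. The
 * flags change only on the 0 -> 1 and 1 -> 0 transitions. */
static void in_hw_count_update(struct filter_stub *f, bool add)
{
	assert(f);
	if (add) {
		if (f->in_hw_count++ == 0) {
			f->flags &= ~FLAG_NOT_IN_HW;
			f->flags |= FLAG_IN_HW;
		}
	} else {
		assert(f->in_hw_count > 0);
		if (--f->in_hw_count == 0) {
			f->flags &= ~FLAG_IN_HW;
			f->flags |= FLAG_NOT_IN_HW;
		}
	}
}
```

With this shape, a filter offloaded through two callbacks keeps reporting "in hardware" after one of them is unregistered, and flips back only when the last instance is removed.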
2018-06-26  net: sched: add tcf_proto_op to offload a rule  John Hurley
Create a new tcf_proto_op called 'reoffload' that generates a new offload message for each node in a tcf_proto. Pointers to the tcf_proto and whether the offload request is to add or delete the node are included. Also included is a callback function to send the offload message to and the option of priv data to go with the cb. Signed-off-by: John Hurley <john.hurley@netronome.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-26  net: sched: pass extack pointer to block binds and cb registration  John Hurley
Pass the extack struct from a tc qdisc add to the block bind function and, in turn, to the setup_tc ndo of binding device via the tc_block_offload struct. Pass this back to any block callback registrations to allow netlink logging of fails in the bind process. Signed-off-by: John Hurley <john.hurley@netronome.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-26  Merge branch 'sh_eth-RPADIR-related-clean-ups'  David S. Miller
Sergei Shtylyov says: ==================== sh_eth: RPADIR related clean-ups Here's a set of 2 patches against DaveM's 'net-next.git' repo. They are clean-ups related to RPADIR (DMA padding to NET_IP_ALIGN)... ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-26  sh_eth: remove sh_eth_cpu_data::rpadir_value  Sergei Shtylyov
If RPADIR exists, the value written to it is always the same for all SoCs (and derived from NET_IP_ALIGN), so there has not been any need to store it in the *struct* sh_eth_cpu_data... Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-26  sh_eth: fix *enum* RPADIR_BIT  Sergei Shtylyov
The *enum* RPADIR_BIT was declared in the commit 86a74ff21a7a ("net: sh_eth: add support for Renesas SuperH Ethernet") adding SH771x support, however the SH771x manual doesn't have the RPADIR register described and, moreover, tells why the padding insertion must not be used. The newer SoC manuals do have RPADIR documented, though with somewhat different layout -- update the *enum* according to these manuals... Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: David S. Miller <davem@davemloft.net>