Age  Commit message  Author
2016-10-20  net: use core MTU range checking in WAN drivers  Jarod Wilson
    - set min/max_mtu in all hdlc drivers, remove hdlc_change_mtu
    - set max_mtu in lec driver, remove lec_change_mtu
    - set min/max_mtu in x25_asy driver

    CC: netdev@vger.kernel.org
    CC: Krzysztof Halasa <khc@pm.waw.pl>
    CC: Krzysztof Halasa <khalasa@piap.pl>
    CC: Jan "Yenya" Kasprzak <kas@fi.muni.cz>
    CC: Francois Romieu <romieu@fr.zoreil.com>
    CC: Kevin Curtis <kevin.curtis@farsite.co.uk>
    CC: Zhao Qiang <qiang.zhao@nxp.com>
    Signed-off-by: Jarod Wilson <jarod@redhat.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-20  net: use core MTU range checking in wireless drivers  Jarod Wilson
    - set max_mtu in wil6210 driver
    - set max_mtu in atmel driver
    - set min/max_mtu in cisco airo driver, remove airo_change_mtu
    - set min/max_mtu in ipw2100/ipw2200 drivers, remove libipw_change_mtu
    - set min/max_mtu in p80211netdev, remove wlan_change_mtu
    - set min/max_mtu in net/mac80211/iface.c and remove
      ieee80211_change_mtu
    - set min/max_mtu in wimax/i2400m and remove i2400m_change_mtu
    - set min/max_mtu in intersil/hostap and remove prism2_change_mtu
    - set min/max_mtu in intersil/orinoco
    - set min/max_mtu in tty/n_gsm and remove gsm_change_mtu

    CC: netdev@vger.kernel.org
    CC: linux-wireless@vger.kernel.org
    CC: Maya Erez <qca_merez@qca.qualcomm.com>
    CC: Simon Kelley <simon@thekelleys.org.uk>
    CC: Stanislav Yakovlev <stas.yakovlev@gmail.com>
    CC: Johannes Berg <johannes@sipsolutions.net>
    CC: Inaky Perez-Gonzalez <inaky.perez-gonzalez@intel.com>
    Signed-off-by: Jarod Wilson <jarod@redhat.com>
    Acked-by: Johannes Berg <johannes@sipsolutions.net>
    Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-20  net: use core MTU range checking in USB NIC drivers  Jarod Wilson
    usbnet:
    - Remove stale new_mtu <= 0 check in usbnet.c
    - Set min_mtu = 0, max_mtu = 65535 (sub-drivers must set their own
      max_mtu and/or min_mtu as needed)

    r8152:
    - Set appropriate max_mtu for different variants (1500 or 9194)

    lan78xx:
    - Set max_mtu = 9000

    asix_driver:
    - max_mtu = 16384 for ax88178 variant

    ax88179:
    - max_mtu = 4088

    cdc_ncm:
    - max_mtu from hardware

    cdc-phonet:
    - min_mtu = 6, max_mtu = 65541

    sierra_net:
    - max_mtu = 1500, call usbnet_change_mtu directly
    - sierra_net_change_mtu checked for MTU > 1500, then called
      usbnet_change_mtu, but if we set max_mtu to let the network core
      handle the range check, then we can simply call usbnet_change_mtu
      directly

    smsc75xx:
    - max_mtu = 9000

    CC: netdev@vger.kernel.org
    CC: Woojung Huh <woojung.huh@microchip.com>
    CC: Microchip Linux Driver Support <UNGLinuxDriver@microchip.com>
    CC: Hayes Wang <hayeswang@realtek.com>
    CC: Oliver Neukum <oneukum@suse.com>
    CC: Steve Glendinning <steve.glendinning@shawell.net>
    Signed-off-by: Jarod Wilson <jarod@redhat.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-20  ethernet: use net core MTU range checking in more drivers  Jarod Wilson
    Somehow, I missed a healthy number of ethernet drivers in the last
    pass. Most of these drivers were in need of an updated max_mtu so that
    jumbo frames can be enabled again. In a few cases, a different min_mtu
    is also set to match previous lower bounds. There are also a few
    drivers that had no upper bounds checking, so they're getting a brand
    new ETH_MAX_MTU that is identical to IP_MAX_MTU, but accessible via
    includes that all ethernet and ethernet-like drivers already have.

    acenic:
    - min_mtu = 0, max_mtu = 9000

    amazon/ena:
    - min_mtu = 128, max_mtu = adapter->max_mtu

    amd/xgbe:
    - min_mtu = 0, max_mtu = 9000

    sb1250:
    - min_mtu = 0, max_mtu = 1518

    cxgb3:
    - min_mtu = 81, max_mtu = 65535

    cxgb4:
    - min_mtu = 81, max_mtu = 9600

    cxgb4vf:
    - min_mtu = 81, max_mtu = 65535

    benet:
    - min_mtu = 256, max_mtu = 9000

    ibmveth:
    - min_mtu = 68, max_mtu = 65535

    ibmvnic:
    - min_mtu = adapter->min_mtu, max_mtu = adapter->max_mtu
    - remove now redundant ibmvnic_change_mtu

    jme:
    - min_mtu = 1280, max_mtu = 9202

    mv643xx_eth:
    - min_mtu = 64, max_mtu = 9500

    mlxsw:
    - min_mtu = 0, max_mtu = 65535
    - Basically bypassing the core checks, and instead relying on dynamic
      checks in the respective switch drivers' ndo_change_mtu functions

    ns83820:
    - min_mtu = 0
    - remove redundant ns83820_change_mtu, only checked for mtu > 1500

    netxen:
    - min_mtu = 0, max_mtu = 8000 (P2), max_mtu = 9600 (P3)

    qlge:
    - min_mtu = 1500, max_mtu = 9000
    - driver only supports setting mtu to 1500 or 9000, so the core check
      only rules out < 1500 and > 9000, qlge_change_mtu still needs to
      check that the value is 1500 or 9000

    qualcomm/emac:
    - min_mtu = 46, max_mtu = 9194

    xilinx_axienet:
    - min_mtu = 64, max_mtu = 9000

    Fixes: 61e84623ace3 ("net: centralize net_device min/max MTU checking")
    CC: netdev@vger.kernel.org
    CC: Jes Sorensen <jes@trained-monkey.org>
    CC: Netanel Belgazal <netanel@annapurnalabs.com>
    CC: Tom Lendacky <thomas.lendacky@amd.com>
    CC: Santosh Raspatur <santosh@chelsio.com>
    CC: Hariprasad S <hariprasad@chelsio.com>
    CC: Sathya Perla <sathya.perla@broadcom.com>
    CC: Ajit Khaparde <ajit.khaparde@broadcom.com>
    CC: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
    CC: Somnath Kotur <somnath.kotur@broadcom.com>
    CC: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
    CC: John Allen <jallen@linux.vnet.ibm.com>
    CC: Guo-Fu Tseng <cooldavid@cooldavid.org>
    CC: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
    CC: Jiri Pirko <jiri@mellanox.com>
    CC: Ido Schimmel <idosch@mellanox.com>
    CC: Manish Chopra <manish.chopra@qlogic.com>
    CC: Sony Chacko <sony.chacko@qlogic.com>
    CC: Rajesh Borundia <rajesh.borundia@qlogic.com>
    CC: Timur Tabi <timur@codeaurora.org>
    CC: Anirudha Sarangi <anirudh@xilinx.com>
    CC: John Linn <John.Linn@xilinx.com>
    Signed-off-by: Jarod Wilson <jarod@redhat.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
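The division of labour described in the MTU commits above (the core rejects values outside min_mtu..max_mtu, and a driver keeps only the checks the range cannot express, as in the qlge entry) can be sketched in plain userspace C. This is an illustration under assumptions, not the kernel code: `struct mtu_limits`, `core_check_mtu` and `driver_change_mtu` are hypothetical names, and treating max_mtu == 0 as "no upper bound" is an assumption of the sketch.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for the min_mtu/max_mtu fields named in the commits. */
struct mtu_limits {
	unsigned int min_mtu;
	unsigned int max_mtu;	/* assumption: 0 means "no upper bound" */
};

/* Core-style range check: reject anything outside [min_mtu, max_mtu]. */
int core_check_mtu(const struct mtu_limits *lim, int new_mtu)
{
	if (new_mtu < 0 || (unsigned int)new_mtu < lim->min_mtu)
		return -EINVAL;
	if (lim->max_mtu > 0 && (unsigned int)new_mtu > lim->max_mtu)
		return -EINVAL;
	return 0;
}

/*
 * qlge-like driver hook (hypothetical): the range check has already
 * bounded the value, but only the two endpoints are acceptable, so the
 * driver still has to test for them.
 */
int driver_change_mtu(const struct mtu_limits *lim, int new_mtu)
{
	int err = core_check_mtu(lim, new_mtu);

	if (err)
		return err;
	if (new_mtu != 1500 && new_mtu != 9000)
		return -EINVAL;
	return 0;
}

/* Usage example with the qlge limits quoted in the commit message. */
void mtu_example(void)
{
	struct mtu_limits qlge_like = { .min_mtu = 1500, .max_mtu = 9000 };

	assert(core_check_mtu(&qlge_like, 4000) == 0);		/* in range */
	assert(driver_change_mtu(&qlge_like, 4000) == -EINVAL);	/* not an endpoint */
	assert(driver_change_mtu(&qlge_like, 9000) == 0);
	assert(core_check_mtu(&qlge_like, 9001) == -EINVAL);	/* above max_mtu */
}
```

For drivers whose only check was a simple bound, the driver hook disappears entirely, which is why several entries above say "remove ..._change_mtu".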
2016-10-20  myri10ge: fix typo in parameter description  Wei Yongjun
    Fix typo in parameter description.

    Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-20  net: ethernet: mediatek: use dev_kfree_skb_any instead of dev_kfree_skb  Wei Yongjun
    Replace dev_kfree_skb with dev_kfree_skb_any in mtk_start_xmit(), which
    can be called from hard irq context (netpoll) and from other contexts.
    mtk_start_xmit() only frees skbs that it has dropped.

    This is detected by Coccinelle semantic patch.

    Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-20  dwc_eth_qos: use dev_kfree_skb_any instead of dev_kfree_skb  Wei Yongjun
    Replace dev_kfree_skb with dev_kfree_skb_any in dwceqos_start_xmit(),
    which can be called from hard irq context (netpoll) and from other
    contexts. dwceqos_start_xmit() only frees skbs that it has dropped.

    This is detected by Coccinelle semantic patch.

    Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-20  net: phy: aquantia: add PHY ID of AQR106 and AQR107  Shaohui Xie
    The AQR106 and AQR107 can use the existing driver.

    Signed-off-by: Shaohui Xie <Shaohui.Xie@nxp.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-20  net: fec: drop check for clk==NULL before calling clk_*  Uwe Kleine-König
    clk_prepare, clk_enable and their counterparts (at least the common clk
    ones, but also most others) do check for the clk being NULL anyhow (and
    return 0 then), so there is no gain when the caller checks, too.

    Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
    Acked-by: Fugang Duan <fugang.duan@nxp.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
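The reasoning in the fec commit (a callee that already treats NULL as a successful no-op makes the caller's guard redundant) can be shown with a small userspace sketch. `struct clk_sketch`, `clk_enable_sketch` and `start_device_sketch` are hypothetical stand-ins, not the kernel clk API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a clock handle; not the kernel's struct clk. */
struct clk_sketch {
	int enable_count;
};

/* NULL-tolerant enable: a NULL clock is a no-op that reports success. */
int clk_enable_sketch(struct clk_sketch *clk)
{
	if (clk == NULL)
		return 0;
	clk->enable_count++;
	return 0;
}

/* Caller after the cleanup: no "if (clk)" guard around the call. */
int start_device_sketch(struct clk_sketch *optional_clk)
{
	return clk_enable_sketch(optional_clk);
}

/* Usage example: both an absent and a present clock take the same path. */
void clk_example(void)
{
	struct clk_sketch clk = { 0 };

	assert(start_device_sketch(NULL) == 0);	/* absent clock still succeeds */
	assert(start_device_sketch(&clk) == 0);
	assert(clk.enable_count == 1);
}
```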
2016-10-20  tcp: relax listening_hash operations  Eric Dumazet
    softirq handlers use RCU protection to look up listeners, and write
    operations all happen from process context. We do not need to block BH
    for dump operations.

    Also, since SYN_RECV request sockets are stored in the ehash table:

    1) inet_diag_dump_icsk() no longer needs to clear cb->args[3] and
       cb->args[4], which were used as cursors while iterating the old
       per-listener hash table.

    2) Also factorize a test: there is no need to scan listening_hash[] if
       r->id.idiag_dport is not zero.

    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-20  net: smc91x: fix neponset breakage by pxa u16 writes  Robert Jarzmik
    The patch isolating the u16 writes for pxa assumed all machine_is_*()
    calls were removed, and therefore removed the mach-types.h include
    which provided them.

    Unfortunately 2 machine_is_*() calls remained in the smc91x.c file,
    which includes smc91x.h from which the include was removed, triggering
    the error:

      drivers/net/ethernet/smsc/smc91x.c: In function ‘smc_drv_probe’:
      drivers/net/ethernet/smsc/smc91x.c:2380:2: error: implicit declaration of function ‘machine_is_assabet’ [-Werror=implicit-function-declaration]
        if (machine_is_assabet() && machine_has_neponset())

    This adds back the wrongly removed include.

    Fixes: d09d747ae4c2 ("net: smc91x: isolate u16 writes alignment workaround")
    Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr>
    Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-20  ila: Fix tailroom allocation of lwtstate  Thomas Graf
    Tailroom is supposed to be of length sizeof(struct ila_lwt) but
    sizeof(struct ila_params) is currently allocated.

    This leads to the dst_cache and connected member of ila_lwt being
    referenced out of bounds.

      struct ila_lwt {
              struct ila_params p;
              struct dst_cache dst_cache;
              u32 connected : 1;
      };

    Fixes: 65d7ab8de582 ("net: Identifier Locator Addressing module")
    Signed-off-by: Thomas Graf <tgraf@suug.ch>
    Acked-by: Daniel Borkmann <daniel@iogearbox.net>
    Signed-off-by: David S. Miller <davem@davemloft.net>
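Why allocating only sizeof(struct ila_params) is too small follows from the layout quoted in the ila commit: the embedded `p` is the first member, so everything after it lies past that size. A minimal sketch, with placeholder member contents (the `*_sketch` types and their fields are assumptions, not the kernel definitions):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder contents; only the nesting mirrors the commit message. */
struct ila_params_sketch {
	uint64_t locator;
	uint8_t csum_mode;
};

struct dst_cache_sketch {
	void *cache;
	unsigned long reset_ts;
};

struct ila_lwt_sketch {
	struct ila_params_sketch p;
	struct dst_cache_sketch dst_cache;
	uint32_t connected : 1;
};

/* Tailroom as allocated before and after the fix. */
size_t tailroom_before_fix(void) { return sizeof(struct ila_params_sketch); }
size_t tailroom_after_fix(void)  { return sizeof(struct ila_lwt_sketch); }

/* Does the dst_cache member lie entirely inside a tailroom of this size? */
int dst_cache_fits(size_t tailroom)
{
	return offsetof(struct ila_lwt_sketch, dst_cache) +
	       sizeof(struct dst_cache_sketch) <= tailroom;
}

/* Usage example: the old size cannot hold dst_cache, the new one can. */
void ila_example(void)
{
	assert(!dst_cache_fits(tailroom_before_fix()));
	assert(dst_cache_fits(tailroom_after_fix()));
}
```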
2016-10-19  Merge branch 'macb-ethtool-ringparam'  David S. Miller
    Zach Brown says:

    ====================
    macb: Add ethtool get_ringparam and set_ringparam to cadence

    There are use cases like RT that would benefit from being able to tune
    the macb rx/tx ring sizes. The ethtool set_ringparam function is the
    standard way of doing so.

    The first patch changes the hardcoded tx/rx ring sizes to variables
    that are set to a hardcoded default.

    The second patch implements the get_ringparam and set_ringparam
    functions.
    ====================

    Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-19  net: macb: Add ethtool get_ringparam and set_ringparam functionality  Zach Brown
    Some applications want to tune the size of the macb rx/tx ring buffers.
    The ethtool set_ringparam function is the standard way of doing it.

    Signed-off-by: Zach Brown <zach.brown@ni.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-19  net: macb: Use variables with defaults for tx/rx ring sizes instead of hardcoded values  Zach Brown
    The macb driver hardcoded the tx/rx ring sizes. This made it impossible
    to change the sizes at run time. Add tx_ring_size and rx_ring_size
    variables to the macb object, which are initialized with default values
    during macb_init. Change all references to RX_RING_SIZE and
    TX_RING_SIZE to their respective replacements.

    Signed-off-by: Zach Brown <zach.brown@ni.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-19  net: arc_emac: use dev_kfree_skb_any instead of dev_kfree_skb  Wei Yongjun
    Replace dev_kfree_skb with dev_kfree_skb_any in arc_emac_tx(), which
    can be called from hard irq context (netpoll) and from other contexts.
    arc_emac_tx() only frees skbs that it has dropped.

    This is detected by Coccinelle semantic patch.

    Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-19  Merge branch 'ovs-remove-unused'  David S. Miller
    Jiri Benc says:

    ====================
    openvswitch: remove unused code

    Removed unused functions and unnecessary EXPORT_SYMBOLs from
    openvswitch.
    ====================

    Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-19  openvswitch: remove unnecessary EXPORT_SYMBOLs  Jiri Benc
    Some symbols exported to other modules are really used only by
    openvswitch.ko. Remove the exports.

    Tested by loading all 4 openvswitch modules, nothing breaks.

    Signed-off-by: Jiri Benc <jbenc@redhat.com>
    Acked-by: Pravin B Shelar <pshelar@ovn.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-19  openvswitch: remove unused functions  Jiri Benc
    ovs_vport_deferred_free is not used anywhere. It is the only caller of
    free_vport_rcu, so that function can be removed, too.

    Signed-off-by: Jiri Benc <jbenc@redhat.com>
    Acked-by: Pravin B Shelar <pshelar@ovn.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-19  bpf: Detect identical PTR_TO_MAP_VALUE_OR_NULL registers  Thomas Graf
    A BPF program is required to check the return register of a
    map_elem_lookup() call before accessing memory. The verifier keeps
    track of this by converting the type of the result register from
    PTR_TO_MAP_VALUE_OR_NULL to PTR_TO_MAP_VALUE after a conditional jump
    ensures safety. This check is currently exclusively performed for the
    result register 0.

    In the event the compiler reorders instructions, BPF_MOV64_REG
    instructions may be moved before the conditional jump, which causes
    them to keep their type PTR_TO_MAP_VALUE_OR_NULL, to which the verifier
    objects when the register is accessed:

      0: (b7) r1 = 10
      1: (7b) *(u64 *)(r10 -8) = r1
      2: (bf) r2 = r10
      3: (07) r2 += -8
      4: (18) r1 = 0x59c00000
      6: (85) call 1
      7: (bf) r4 = r0
      8: (15) if r0 == 0x0 goto pc+1
       R0=map_value(ks=8,vs=8) R4=map_value_or_null(ks=8,vs=8) R10=fp
      9: (7a) *(u64 *)(r4 +0) = 0
      R4 invalid mem access 'map_value_or_null'

    This commit extends the verifier to keep track of all identical
    PTR_TO_MAP_VALUE_OR_NULL registers after a map_elem_lookup() by
    assigning them an ID and then marking them all when the conditional
    jump is observed.

    Signed-off-by: Thomas Graf <tgraf@suug.ch>
    Reviewed-by: Josef Bacik <jbacik@fb.com>
    Acked-by: Daniel Borkmann <daniel@iogearbox.net>
    Acked-by: Alexei Starovoitov <ast@kernel.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>
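The ID scheme described in the bpf commit can be modelled in a few lines of userspace C: the lookup result gets a fresh ID, register copies carry the ID along, and the NULL check upgrades every register that shares it. This is a simplified model under assumptions, not the verifier's data structures; `struct reg_state`, `mark_lookup_result`, `copy_reg` and `mark_non_null` are hypothetical names.

```c
#include <assert.h>

enum reg_type { NOT_INIT, PTR_TO_MAP_VALUE_OR_NULL, PTR_TO_MAP_VALUE };

/* Simplified per-register state: a type plus the ID of the lookup it came from. */
struct reg_state {
	enum reg_type type;
	unsigned int id;
};

#define NUM_REGS 11

/* After a map lookup call: r0 may be NULL and receives a fresh non-zero ID. */
void mark_lookup_result(struct reg_state *regs, unsigned int *next_id)
{
	regs[0].type = PTR_TO_MAP_VALUE_OR_NULL;
	regs[0].id = ++(*next_id);
}

/* Register move (dst = src): the whole state, including the ID, is copied. */
void copy_reg(struct reg_state *regs, int dst, int src)
{
	regs[dst] = regs[src];
}

/* Non-NULL branch of "if rX == 0": upgrade every register sharing rX's ID. */
void mark_non_null(struct reg_state *regs, int checked)
{
	unsigned int id = regs[checked].id;
	int i;

	if (regs[checked].type != PTR_TO_MAP_VALUE_OR_NULL)
		return;
	for (i = 0; i < NUM_REGS; i++) {
		if (regs[i].type == PTR_TO_MAP_VALUE_OR_NULL && regs[i].id == id) {
			regs[i].type = PTR_TO_MAP_VALUE;
			regs[i].id = 0;
		}
	}
}

/* Usage example following the instruction order in the log: r4 = r0, then check r0. */
void verifier_example(void)
{
	struct reg_state regs[NUM_REGS] = { { NOT_INIT, 0 } };
	unsigned int next_id = 0;

	mark_lookup_result(regs, &next_id);	/* call 1 */
	copy_reg(regs, 4, 0);			/* r4 = r0 */
	mark_non_null(regs, 0);			/* if r0 == 0x0 goto ... */

	assert(regs[0].type == PTR_TO_MAP_VALUE);
	assert(regs[4].type == PTR_TO_MAP_VALUE);	/* the copy is upgraded too */
}
```

A register holding the result of a different lookup carries a different ID and stays PTR_TO_MAP_VALUE_OR_NULL until it is checked itself.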
2016-10-19  net: fs_enet: Use net_device_stats from struct net_device  Tobias Klauser
    Instead of using a private copy of struct net_device_stats in struct
    fs_enet_private, use stats from struct net_device. Also remove the now
    unnecessary .ndo_get_stats function.

    Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
    Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-18  qed: Remove useless set memory to zero use memset()  Wei Yongjun
    The memory returned by kzalloc() has already been set to zero, so
    remove the useless memset(0).

    Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-18  net: dsa: mv88e6xxx: fix non static symbol warning  Wei Yongjun
    Fixes the following sparse warning:

      drivers/net/dsa/mv88e6xxx/chip.c:2866:5: warning:
       symbol 'mv88e6xxx_g1_set_switch_mac' was not declared. Should it be static?

    Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
    Reviewed-by: Andrew Lunn <andrew@lunn.ch>
    Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-18  r8152: add new products of Lenovo  hayeswang
    Add the following four products of Lenovo and sort the order of the
    list.

      VID     PID
      0x17ef  0x3062
      0x17ef  0x3069
      0x17ef  0x720c
      0x17ef  0x7214

    Signed-off-by: Hayes Wang <hayeswang@realtek.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-18  net: vlan: Use sizeof instead of literal number  Gao Feng
    Use sizeof of the variable instead of a literal number to enhance
    readability.

    Signed-off-by: Gao Feng <fgao@ikuai8.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-18  Merge branch 'smc91x-dt'  David S. Miller
    Robert Jarzmik says:

    ====================
    support smc91x on mainstone and devicetree

    This series aims at bringing support to the mainstone board on a
    device-tree based build, matching what is already in place for legacy
    mainstone.

    The bulk of the mainstone "specific" behavior is that a u16 write
    doesn't work on an address of the form 4*n + 2, while it works on 4*n.

    The legacy workaround was in SMC_outw(), with calls to
    machine_is_mainstone(). These calls don't work with a pxa27x-dt machine
    type, which is used when a generic device-tree pxa27x machine is used
    to boot the mainstone board.

    Therefore, this series enables the smc91c111 adapter of the mainstone
    board to work on a device-tree build, exactly as it's been working for
    years with the legacy arch/arm/mach-pxa/mainstone.c definition.

    To sum up, this extends an existing mechanism to device-tree based pxa
    platforms.
    ====================

    Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-18  net: smsc91x: add u16 workaround for pxa platforms  Robert Jarzmik
    Add a workaround for mainstone, idp and stargate2 boards, for u16
    writes which must be aligned on 32-bit addresses.

    Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr>
    Cc: Jeremy Linton <jeremy.linton@arm.com>
    Acked-by: Rob Herring <robh@kernel.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-18  net: smc91x: take into account half-word workaround  Robert Jarzmik
    For device-tree builds, platforms such as mainstone, idp and stargate2
    must have their u16 writes all aligned on 32-bit boundaries. This is
    already enabled in platform data builds, and this patch adds it to
    device-tree builds.

    Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr>
    Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-18  net: smc91x: isolate u16 writes alignment workaround  Robert Jarzmik
    u16 writes have special handling on 3 PXA platforms, where the hardware
    wiring forces these writes to be u32 aligned.

    This patch isolates this handling for PXA platforms as before, but
    enables this "workaround" to be set up dynamically, which will be the
    case in device-tree build types.

    This patch was tested on 2 PXA platforms: mainstone, which relies on
    the workaround, and lubbock, which doesn't.

    Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr>
    Signed-off-by: David S. Miller <davem@davemloft.net>
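One way to picture the workaround in the smc91x commits: a u16 write to an offset of the form 4*n + 2 is replaced by a 32-bit read-modify-write of the aligned word at 4*n, with the decision passed in as a runtime flag. The sketch below simulates the register window with an array. The names (`outw_sketch`, `read32`, `write32`, `write16`) are hypothetical, and it assumes the half at 4*n + 2 is the upper 16 bits of the word; it is not the driver's code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simulated register window: low half at offset 4*n, high half at 4*n + 2. */
static uint32_t regs[4];

uint32_t read32(unsigned int off)
{
	return regs[off / 4];
}

void write32(unsigned int off, uint32_t v)
{
	regs[off / 4] = v;
}

/* Plain 16-bit write, as used on platforms that do not need the workaround. */
void write16(unsigned int off, uint16_t v)
{
	unsigned int shift = (off & 2) ? 16 : 0;

	regs[off / 4] = (regs[off / 4] & ~(0xffffu << shift)) |
			((uint32_t)v << shift);
}

/*
 * u16 write with the alignment workaround selected at run time: for
 * offsets of the form 4*n + 2, read the aligned word, keep its low half,
 * place the new value in the high half and write the whole word back.
 */
void outw_sketch(uint16_t val, unsigned int off, bool align4_workaround)
{
	if (align4_workaround && (off & 2)) {
		unsigned int base = off & ~2u;
		uint32_t v = read32(base) & 0xffffu;

		v |= (uint32_t)val << 16;
		write32(base, v);
	} else {
		write16(off, val);
	}
}

/* Usage example: the write at offset 2 leaves the low half untouched. */
void outw_example(void)
{
	write32(0, 0x11112222u);
	outw_sketch(0xabcd, 2, true);
	assert(read32(0) == 0xabcd2222u);
}
```

Passing the flag as a parameter, rather than testing the machine type inside the accessor, is what lets platform data or device-tree properties decide whether the workaround applies.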
2016-10-18  ARM: pxa: enhance smc91x platform data  Robert Jarzmik
    Instead of having the smc91x driver rely on machine_is_*() calls,
    provide this data through platform data, i.e. for idp, mainstone and
    stargate.

    This way, the driver no longer needs machine_is_*() calls, which would
    not work with a device-tree build.

    Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr>
    Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-18  ethernet/sfc: use core min/max MTU checking  Bert Kenward
    Fixes: 61e84623ace3 ("net: centralize net_device min/max MTU checking")
    Signed-off-by: Bert Kenward <bkenward@solarflare.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-18  Merge branch 'phy-led-triggers'  David S. Miller
    Zach Brown says:

    ====================
    Add support for led triggers on phy link state change

    Fix skge driver that declared enum constants that conflicted with enum
    constants in linux/leds.h

    Create function that encapsulates actions taken during the adjust phy
    link step of phy state changes.

    Create function that provides list of speeds currently supported by the
    phy.

    Add support for led triggers on phy link state changes by adding a
    config option. When set, the config option will create a set of led
    triggers for each phy device. Users can use the led triggers to
    represent link state changes on the phy.

    v2:
    * New patch that creates phy_adjust_link function to encapsulate
      actions taken when adjusting phy link during phy state changes
    * led trigger speed strings changed to match existing phy speed strings
    * New function that maps speeds to led triggers
    * Replace magic constants with definitions when declaring trigger name
      buffer and number of triggers.

    v3:
    * Changed LED_ON to LED_REG_ON in skge driver to avoid possible future
      conflict and improve consistency.
    * Dropped rtl8712 patch that was accepted separately.

    v4:
    * tweaked commit message

    v5:
    * Changed commit message to explain relationship between the new
      triggers and leds driven by phys.
    * Added new patch that creates phy_supported_speeds function.
    * Moved phy_leds_triggers_register and phy_leds_triggers_unregister to
      phy_attach and phy_detach respectively. This change is so the
      phydev->supported field will be filled by the time the triggers are
      registered.
    * Changed hardcoded list of triggers to dynamic list determined by
      speeds returned by phy_supported_speeds.
    ====================

    Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-18  net: phy: leds: add support for led triggers on phy link state change  Zach Brown
    Create an option CONFIG_LED_TRIGGER_PHY (default n), which will create
    a set of led triggers for each instantiated PHY device. There is one
    LED trigger per link-speed, per-phy. The triggers are registered during
    phy_attach and unregistered during phy_detach.

    This allows for a user to configure their system to allow a set of LEDs
    not controlled by the phy to represent link state changes on the phy.
    LEDs controlled by the phy are unaffected.

    For example, we have a board where some of the leds in the RJ45 socket
    are controlled by the phy, but others are not. Using the triggers
    provided by this patch the leds not controlled by the phy can be
    configured to show the current speed of the ethernet connection. The
    leds controlled by the phy are unaffected.

    Signed-off-by: Josh Cartwright <josh.cartwright@ni.com>
    Signed-off-by: Nathan Sullivan <nathan.sullivan@ni.com>
    Signed-off-by: Zach Brown <zach.brown@ni.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-18  net: phy: Create phy_supported_speeds function which lists speeds currently supported by a phydevice  Zach Brown
    phy_supported_speeds provides a means to get a list of all the speeds a
    phy device currently supports.

    Signed-off-by: Zach Brown <zach.brown@ni.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-18  net: phy: Encapsulate actions performed during link state changes into function phy_adjust_link  Zach Brown
    During phy state machine state transitions some set of actions should
    occur whenever the link state changes. These actions should be
    encapsulated into a single function.

    This patch adds the phy_adjust_link function, which is called whenever
    phydev->adjust_link would have been called before. Actions that should
    occur whenever the phy link is adjusted can now be added to the
    phy_adjust_link function.

    Signed-off-by: Zach Brown <zach.brown@ni.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-18  skge: Rename LED_OFF and LED_ON in marvel skge driver to avoid conflicts with leds namespace  Zach Brown
    Adding led support for phy causes namespace conflicts for some phy
    drivers.

    The Marvell skge driver declared an enum for representing the states of
    the Link LED Register. The enum contained the constant LED_OFF, which
    conflicted with the declaration found in linux/leds.h.
    LED_OFF is changed to LED_REG_OFF.

    LED_ON is also changed to LED_REG_ON to avoid a possible future
    conflict and for consistency.

    Signed-off-by: Zach Brown <zach.brown@ni.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-18  Merge branch 'netdev-adjacency'  David S. Miller
    David Ahern says:

    ====================
    net: Fix netdev adjacency tracking

    The netdev adjacency tracking is failing to create proper dependencies
    for some topologies. For example this topology

            +--------+
            |  myvrf |
            +--------+
              |    |
              |  +---------+
              |  | macvlan |
              |  +---------+
              |    |
          +----------+
          |  bridge  |
          +----------+
              |
          +--------+
          |  bond1 |
          +--------+
              |
          +--------+
          |  eth3  |
          +--------+

    hits 1 of 2 problems depending on the order of enslavement. The base
    set of commands for both cases:

        ip link add bond1 type bond
        ip link set bond1 up
        ip link set eth3 down
        ip link set eth3 master bond1
        ip link set eth3 up

        ip link add bridge type bridge
        ip link set bridge up
        ip link add macvlan link bridge type macvlan
        ip link set macvlan up

        ip link add myvrf type vrf table 1234
        ip link set myvrf up

        ip link set bridge master myvrf

    Case 1: enslave macvlan to the vrf before enslaving the bond to the
    bridge:

        ip link set macvlan master myvrf
        ip link set bond1 master bridge

    Attempts to delete the VRF:

        ip link delete myvrf

    trigger the BUG in __netdev_adjacent_dev_remove:

    [ 587.405260] tried to remove device eth3 from myvrf
    [ 587.407269] ------------[ cut here ]------------
    [ 587.408918] kernel BUG at /home/dsa/kernel.git/net/core/dev.c:5661!
    [ 587.411113] invalid opcode: 0000 [#1] SMP
    [ 587.412454] Modules linked in: macvlan bridge stp llc bonding vrf
    [ 587.414765] CPU: 0 PID: 726 Comm: ip Not tainted 4.8.0+ #109
    [ 587.416766] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
    [ 587.420241] task: ffff88013ab6eec0 task.stack: ffffc90000628000
    [ 587.422163] RIP: 0010:[<ffffffff813cef03>] [<ffffffff813cef03>] __netdev_adjacent_dev_remove+0x40/0x12c
    ...
    [ 587.446053] Call Trace:
    [ 587.446424] [<ffffffff813d1542>] __netdev_adjacent_dev_unlink+0x20/0x3c
    [ 587.447390] [<ffffffff813d16a3>] netdev_upper_dev_unlink+0xfa/0x15e
    [ 587.448297] [<ffffffffa00003a3>] vrf_del_slave+0x13/0x2a [vrf]
    [ 587.449153] [<ffffffffa00004a4>] vrf_dev_uninit+0xea/0x114 [vrf]
    [ 587.450036] [<ffffffff813d19b0>] rollback_registered_many+0x22b/0x2da
    [ 587.450974] [<ffffffff813d1aac>] unregister_netdevice_many+0x17/0x48
    [ 587.451903] [<ffffffff813de444>] rtnl_delete_link+0x3c/0x43
    [ 587.452719] [<ffffffff813dedcd>] rtnl_dellink+0x180/0x194

    When the BUG is converted to a WARN_ON it shows 4 missing adjacencies:
      eth3 - myvrf, myvrf - eth3, bond1 - myvrf and myvrf - bond1

    All of those are because the __netdev_upper_dev_link function does not
    properly link macvlan lower devices to myvrf when it is enslaved.

    The second case just flips the ordering of the enslavements:

        ip link set bond1 master bridge
        ip link set macvlan master myvrf

    Then run:

        ip link delete bond1
        ip link delete myvrf

    The vrf delete command hangs because myvrf has a reference that has not
    been released. In this case the removal code does not account for 2
    paths between eth3 and myvrf: one from bridge to vrf and the other
    through the macvlan.

    Rather than try to maintain a linked list of all upper and lower
    devices per netdevice, only track the direct neighbors. The remaining
    stack can be determined by recursively walking the neighbors.

    The existing netdev_for_each_all_upper_dev_rcu,
    netdev_for_each_all_lower_dev and netdev_for_each_all_lower_dev_rcu
    macros are replaced with APIs that walk the upper and lower device
    lists. The new APIs take a callback function and a data arg that is
    passed to the callback for each device in the list. Drivers using the
    old macros are converted in separate patches to make it easier on
    reviewers. It is an API conversion only; no functional change is
    intended.

    v3
    - address Stephen's comment to simplify logic and remove typecasts

    v2
    - fixed bond0 references in cover-letter
    - fixed definition of netdev_next_lower_dev_rcu to mirror the upper_dev
      version.
    ====================

    Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-18  net: dev: Improve debug statements for adjacency tracking  David Ahern
    Adjacency code only has debugs for the insert case. Add debugs for the
    remove path and make both consistently worded to make it easier to
    follow the insert and removal with reference counts.

    In addition, change the BUG to a WARN_ON. A missing adjacency at
    removal time is not cause for a panic.

    Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-18  net: Add warning if any lower device is still in adjacency list  David Ahern
    Lower list should be empty just like upper.

    Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-18  net: Remove all_adj_list and its references  David Ahern
    Only direct adjacencies are maintained. All upper or lower devices can
    be learned via the new walk API which recursively walks the adj_list
    for upper devices or lower devices.

    Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-18  rocker: Flip to the new dev walk API  David Ahern
    Convert rocker to the new dev walk API. This is just a code conversion;
    no functional change is intended.

    v2
    - removed typecast of data

    Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-18  mlxsw: Flip to the new dev walk API  David Ahern
    Convert mlxsw users to new dev walk API. This is just a code
    conversion; no functional change is intended.

    Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-18  ixgbe: Flip to the new dev walk API  David Ahern
    Convert ixgbe users to new dev walk API. This is just a code
    conversion; no functional change is intended.

    Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-18  IB/ipoib: Flip to new dev walk API  David Ahern
    Convert ipoib_get_net_dev_match_addr to the new upper device walk API.
    This is just a code conversion; no functional change is intended.

    v2
    - removed typecast of data

    Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-18  IB/core: Flip to the new dev walk API  David Ahern
    Convert rdma_is_upper_dev_rcu, handle_netdev_upper and
    ipoib_get_net_dev_match_addr to the new upper device walk API. This is
    just a code conversion; no functional change is intended.

    v2
    - removed typecast of data

    Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-18  net: bonding: Flip to the new dev walk API  David Ahern
    Convert alb_send_learning_packets and bond_has_this_ip to use the new
    netdev_walk_all_upper_dev_rcu API. In both cases this is just a code
    conversion; no functional change is intended.

    v2
    - removed typecast of data and simplified bond_upper_dev_walk

    Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-18  net: Introduce new api for walking upper and lower devices  David Ahern
    This patch introduces netdev_walk_all_upper_dev_rcu,
    netdev_walk_all_lower_dev and netdev_walk_all_lower_dev_rcu. These
    functions recursively walk the adj_list of devices to determine all
    upper and lower devices.

    The functions take a callback function that is invoked for each device
    in the list. If the callback returns non-0, the walk is terminated and
    the functions return that code back to callers.

    v3
    - simplified netdev_has_upper_dev_all_rcu and __netdev_has_upper_dev
      and removed typecast as suggested by Stephen

    v2
    - fixed definition of netdev_next_lower_dev_rcu to mirror the upper_dev
      version.

    Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
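The walk described in the adjacency series (store only direct neighbors, recurse to reach the rest, stop as soon as the callback returns non-zero) can be sketched in userspace C. This is a model under assumptions: `struct dev_sketch`, `walk_all_upper`, `match_dev` and `count_dev` are hypothetical names, the fixed-size neighbor array replaces the kernel's lists, and no RCU or locking is modelled.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_UPPER 4

/* Each device stores only its direct upper neighbors (unused slots are NULL). */
struct dev_sketch {
	const char *name;
	struct dev_sketch *upper[MAX_UPPER];
};

typedef int (*walk_fn)(struct dev_sketch *dev, void *data);

/*
 * Recursively visit every upper device reachable from dev. A non-zero
 * return from the callback stops the walk and is passed back to the caller.
 */
int walk_all_upper(struct dev_sketch *dev, walk_fn fn, void *data)
{
	size_t i;

	for (i = 0; i < MAX_UPPER && dev->upper[i]; i++) {
		struct dev_sketch *up = dev->upper[i];
		int ret = fn(up, data);

		if (ret)
			return ret;
		ret = walk_all_upper(up, fn, data);
		if (ret)
			return ret;
	}
	return 0;
}

/* Callback: stop the walk when the target device is found. */
int match_dev(struct dev_sketch *dev, void *data)
{
	return dev == data;
}

/* Callback: count visits and never stop. */
int count_dev(struct dev_sketch *dev, void *data)
{
	(void)dev;
	++*(int *)data;
	return 0;
}

/* Usage example on the topology from the cover letter. */
void walk_example(void)
{
	struct dev_sketch myvrf   = { .name = "myvrf" };
	struct dev_sketch macvlan = { .name = "macvlan", .upper = { &myvrf } };
	struct dev_sketch bridge  = { .name = "bridge", .upper = { &myvrf, &macvlan } };
	struct dev_sketch bond1   = { .name = "bond1", .upper = { &bridge } };
	struct dev_sketch eth3    = { .name = "eth3", .upper = { &bond1 } };
	int visits = 0;

	assert(walk_all_upper(&eth3, match_dev, &myvrf) == 1);	/* myvrf is above eth3 */
	assert(walk_all_upper(&myvrf, match_dev, &eth3) == 0);	/* nothing above myvrf */

	/* bond1, bridge, myvrf, macvlan, then myvrf again via the macvlan path. */
	assert(walk_all_upper(&eth3, count_dev, &visits) == 0);
	assert(visits == 5);
}
```

In this model a device reachable along two paths is visited once per path, which is the situation the cover letter describes between eth3 and myvrf.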
2016-10-18  net: Remove refnr arg when inserting link adjacencies  David Ahern
    Commit 93409033ae65 ("net: Add netdev all_adj_list refcnt propagation
    to fix panic") propagated the refnr to insert and remove functions
    tracking the netdev adjacency graph. However, for the insert path the
    refnr can only be 1. Accordingly, remove the refnr argument to make
    that clear. I.e., the refnr arg in 93409033ae65 was only needed for the
    remove path.

    Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-18  Merge branch 'bpf-selftests'  David S. Miller
    Daniel Borkmann says:

    ====================
    Move to BPF selftests

    This set improves the test_verifier and test_maps suite and moves it
    over to a new BPF selftest directory, so we can keep improving it under
    kernel selftest umbrella. This also integrates a test script for
    checking test_bpf.ko under various JIT options.
    ====================

    Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-18  bpf: add initial suite for selftests  Daniel Borkmann
    Add a start of a test suite for kernel selftests. This moves
    test_verifier and test_maps over to tools/testing/selftests/bpf/ along
    with various code improvements and also adds a script for invoking the
    test_bpf module. The test suite can simply be run via the selftest
    framework, e.g.:

      # cd tools/testing/selftests/bpf/
      # make
      # make run_tests

    Both test_verifier and test_maps were kind of misplaced in the
    samples/bpf/ directory and we were looking into adding them to
    selftests for a while now, so they can be picked up by the kbuild bot
    et al and hopefully also get more exposure and thus new test case
    additions.

    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Acked-by: Alexei Starovoitov <ast@kernel.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>