Age | Commit message | Author
2019-02-25 | net: sched: pie: change default value of pie_params->target | Mohit P. Tahiliani
RFC 8033 suggests a default value of 15 milliseconds for the target queue delay. Signed-off-by: Mohit P. Tahiliani <tahiliani@nitk.edu.in> Signed-off-by: Dhaval Khandla <dhavaljkhandla26@gmail.com> Signed-off-by: Hrishikesh Hiraskar <hrishihiraskar@gmail.com> Signed-off-by: Manish Kumar B <bmanish15597@gmail.com> Signed-off-by: Sachin D. Patil <sdp.sachin@gmail.com> Signed-off-by: Leslie Monis <lesliemonis@gmail.com> Acked-by: Dave Taht <dave.taht@gmail.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-25 | net: sched: pie: change value of QUEUE_THRESHOLD | Mohit P. Tahiliani
RFC 8033 recommends a value of 16384 bytes for the queue threshold. Signed-off-by: Mohit P. Tahiliani <tahiliani@nitk.edu.in> Signed-off-by: Dhaval Khandla <dhavaljkhandla26@gmail.com> Signed-off-by: Hrishikesh Hiraskar <hrishihiraskar@gmail.com> Signed-off-by: Manish Kumar B <bmanish15597@gmail.com> Signed-off-by: Sachin D. Patil <sdp.sachin@gmail.com> Signed-off-by: Leslie Monis <lesliemonis@gmail.com> Acked-by: Dave Taht <dave.taht@gmail.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-25 | mlxsw: spectrum: acl: Use struct_size() in kzalloc() | Gustavo A. R. Silva
One of the more common cases of allocation size calculations is finding the size of a structure that has a zero-sized array at the end, along with memory for some number of elements for that array. For example:

    struct foo {
        int stuff;
        struct boo entry[];
    };

    size = sizeof(struct foo) + count * sizeof(struct boo);
    instance = kzalloc(size, GFP_KERNEL)

Instead of leaving these open-coded and prone to type mistakes, we can now use the new struct_size() helper:

    instance = kzalloc(struct_size(instance, entry, count), GFP_KERNEL)

Notice that, in this case, variable alloc_size is not necessary, hence it is removed. This code was detected with the help of Coccinelle.

Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
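The size calculation described above can be sketched in userspace C. This is only an illustration: `flex_size()` and `foo_alloc()` are hypothetical names, and `flex_size()` merely imitates the overflow-saturating behaviour of the kernel's struct_size() helper rather than reproducing it.

```c
#include <stdint.h>
#include <stdlib.h>

struct boo { int value; };

struct foo {
	int stuff;
	struct boo entry[];	/* flexible array member */
};

/*
 * Hypothetical userspace stand-in for struct_size(): returns
 * base + count * elem, or SIZE_MAX if that would overflow.
 */
static size_t flex_size(size_t base, size_t elem, size_t count)
{
	if (elem != 0 && count > (SIZE_MAX - base) / elem)
		return SIZE_MAX;
	return base + count * elem;
}

/* Allocate a zeroed struct foo with room for count entries. */
static struct foo *foo_alloc(size_t count)
{
	size_t size = flex_size(sizeof(struct foo), sizeof(struct boo), count);

	if (size == SIZE_MAX)
		return NULL;	/* refuse impossible sizes instead of wrapping */
	return calloc(1, size);
}
```

Keeping the arithmetic in one checked helper means a huge `count` yields a failed allocation instead of a silently undersized buffer.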
2019-02-25 | Merge branch 'aquantia-hwmon' | David S. Miller
Heiner Kallweit says:
====================
net: phy: aquantia: add hwmon support

This series adds HWMON support for the temperature sensor and the related alarms on the 107/108/109 chips.

v2:
- remove struct aqr_priv
- rename header file to aquantia.h
v3:
- add conditional compiling of aquantia_hwmon.c
- improve converting sensor register values to/from long
- add helper aqr_hwmon_test_bit
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-25 | net: phy: aquantia: add hwmon support | Heiner Kallweit
This adds HWMON support for the temperature sensor and the related alarms on the 107/108/109 chips. This patch is based on work from Nikita and Andrew. I added:
- support for changing alarm thresholds via sysfs
- move HWMON code to a separate source file to improve maintainability
- smaller changes like using IS_REACHABLE instead of ifdef (avoids problems if PHY driver is built in and HWMON is a module)

v2:
- remove struct aqr_priv
- rename header file to aquantia.h
v3:
- add conditional compiling of aquantia_hwmon.c
- improve converting sensor register values to/from long
- add helper aqr_hwmon_test_bit

Signed-off-by: Nikita Yushchenko <nikita.yoush@cogentembedded.com> Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-25 | net: phy: aquantia: rename aquantia.c to aquantia_main.c | Heiner Kallweit
Rename aquantia.c to aquantia_main.c to be prepared for adding new functionality to separate source code files. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-25 | Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue | David S. Miller
Jeff Kirsher says:
====================
100GbE Intel Wired LAN Driver Updates 2019-02-22

This series contains updates to the ice driver only.

Bruce adds the __always_unused attribute to a parameter to avoid compiler warnings when using -Wunused-parameter. Fixed unnecessary type-casting and the use of sizeof(). Fixed structs that have become memory hogs by allocating them on the heap, and fixed all the associated references. Fixed the "possible" numeric overflow issues that were caught with static analysis.

Maciej fixes the maximum MTU calculation by taking into account double VLAN tagging and ensures that the operations are done in the correct order.

Victor fixes the supported node calculation, where we were not taking into account that if there is space to add the new VSI or intermediate node above that layer, then it is not required to continue the calculation. Added a check for a leaf node presence for a given VSI, which is needed before removing a VSI.

Jake fixes an issue where the VSI list is shared, so simply removing a VSI from the list will cause issues for the other users who reference the list. Since we also free the memory, this could lead to segmentation faults.

Brett fixes an issue where driver unload could cause a system reboot when the intel_iommu=on parameter is set. The issue is that we are not clearing the CAUSE_ENA bit for the appropriate control queues register when freeing the miscellaneous interrupt vector.

Mitch is so kind, he prevented spamming the VF with link messages when the link status really has not changed. Updates the driver to use the absolute vector ID and not the per-PF vector ID for the VF MSIx vector allocation.

Lukasz fixes the ethtool pause parameter for the ice driver, which was originally based off the link status but is now based off the PHY configuration. This is to resolve an issue where pause parameters could be set while link was down.

Jesse updates the string that reports statistics so the string does not get modified at runtime and cause reports of string truncation.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-25 | net: sched: don't release block->lock when dumping chains | Vlad Buslov
Function tc_dump_chain() obtains and releases block->lock on each iteration of its inner loop that dumps all chains on the block. Outputting chain template info is a fast operation, so locking and unlocking the mutex multiple times is unnecessary overhead when the lock is highly contended. Modify tc_dump_chain() to obtain block->lock only once and dump all chains without releasing it. Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Suggested-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
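The difference between the two locking patterns can be sketched as follows. This is a minimal illustration, not the tc_dump_chain() code: `struct counting_lock` is a hypothetical stand-in for a mutex that simply counts acquisitions, and incrementing `dumped` stands in for dumping one chain.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a mutex that counts how often it is taken. */
struct counting_lock {
	int held;
	unsigned int acquisitions;
};

static void cl_lock(struct counting_lock *l)
{
	assert(!l->held);
	l->held = 1;
	l->acquisitions++;
}

static void cl_unlock(struct counting_lock *l)
{
	assert(l->held);
	l->held = 0;
}

/* Old pattern: take and drop the lock for every chain. */
static size_t dump_locking_per_chain(struct counting_lock *l, size_t nchains)
{
	size_t dumped = 0;

	for (size_t i = 0; i < nchains; i++) {
		cl_lock(l);
		dumped++;	/* stands in for dumping one chain */
		cl_unlock(l);
	}
	return dumped;
}

/* New pattern: hold the lock across the whole walk. */
static size_t dump_locking_once(struct counting_lock *l, size_t nchains)
{
	size_t dumped = 0;

	cl_lock(l);
	for (size_t i = 0; i < nchains; i++)
		dumped++;	/* stands in for dumping one chain */
	cl_unlock(l);
	return dumped;
}
```

With five chains the first variant takes the lock five times and the second only once; the trade-off is a longer hold time, which is acceptable when each step is cheap.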
2019-02-25 | net: sched: set dedicated tcf_walker flag when tp is empty | Vlad Buslov
Using tcf_walker->stop flag to determine when tcf_walker->fn() was called at least once is unreliable. Some classifiers set 'stop' flag on error before calling walker callback, other classifiers used to call it with NULL filter pointer when empty. In order to prevent further regressions, extend tcf_walker structure with dedicated 'nonempty' flag. Set this flag in tcf_walker->fn() implementation that is used to check if classifier has filters configured. Fixes: 8b64678e0af8 ("net: sched: refactor tp insert/delete for concurrent execution") Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Suggested-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
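Why a dedicated flag is more reliable than reusing 'stop' can be sketched like this. Everything here is hypothetical and heavily simplified: `struct walker` only borrows the two flag names from the message, and `walk()` imitates a classifier that may set 'stop' on an error before ever calling the callback.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified, hypothetical walker; not the kernel's struct tcf_walker. */
struct walker {
	bool stop;	/* ends the walk early, also set on errors */
	bool nonempty;	/* set only by the callback: a filter was seen */
	int (*fn)(struct walker *w, const int *filter);
};

/* Callback used purely to detect whether any filter exists. */
static int mark_nonempty(struct walker *w, const int *filter)
{
	(void)filter;
	w->nonempty = true;
	return -1;	/* negative return asks the walk to stop */
}

/* Hypothetical classifier walk over an array of filters. */
static void walk(struct walker *w, const int *filters, size_t n, bool fail_early)
{
	if (fail_early) {	/* error path: the callback never runs */
		w->stop = true;
		return;
	}
	for (size_t i = 0; i < n && !w->stop; i++) {
		if (w->fn(w, &filters[i]) < 0)
			w->stop = true;
	}
}

static bool classifier_is_empty(const int *filters, size_t n, bool fail_early)
{
	struct walker w = { .fn = mark_nonempty };

	walk(&w, filters, n, fail_early);
	return !w.nonempty;	/* 'stop' alone cannot tell these cases apart */
}
```

In the error case 'stop' is set although no filter was visited, so only 'nonempty' gives the right answer.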
2019-02-25 | net: dsa: mv88e6xxx: Fix phylink_validate for Topaz family | Marek Behún
The Topaz family should have a different phylink_validate method from the Peridot, since on Topaz the port supporting 2500BaseX mode is port 5, not 9 and 10. Signed-off-by: Marek Behún <marek.behun@nic.cz> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-25 | net: dsa: mv88e6xxx: Default CMODE to 1000BaseX only on 6390X | Marek Behún
Commit 787799a9d555 sets the SERDES interfaces of 6390 and 6390X to 1000BaseX, but this is only needed on 6390X, since there are SERDES interfaces which can be used on lower ports on 6390. This commit fixes this by returning to previous behaviour on 6390. (Previous behaviour means that CMODE is not set at all if requested mode is NA). This is needed on Turris MOX, where the 88e6190 is connected to CPU in 2500BaseX mode. Fixes: 787799a9d555 ("net: dsa: mv88e6xxx: Default ports 9/10 6390X CMODE to 1000BaseX") Signed-off-by: Marek Behún <marek.behun@nic.cz> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-25 | tcp: clean up SOCK_DEBUG() | Yafang Shao
Per discussion with Daniel[1] and Eric[2], these SOCK_DEBUG() calls in TCP are no longer needed, so remove them. [1] https://patchwork.ozlabs.org/patch/1035573/ [2] https://patchwork.ozlabs.org/patch/1040533/ Signed-off-by: Yafang Shao <laoar.shao@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-25 | tcp: remove unused parameter of tcp_sacktag_bsearch() | Taehee Yoo
The parameter 'state' of tcp_sacktag_bsearch() is not used, so it can be removed. Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-25 | ice: fix overlong string, update stats output | Jesse Brandeburg
A test started warning on a string truncation. This led to an unfortunate realization that we are likely not accounting for the stats length correctly before this patch, so fix the issue by putting "port." in front of all the PF stats, instead of magically prepending it at runtime. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-02-25 | ice: Fix for FC get rx/tx pause params | Lukasz Czapnik
Ethtool reported pause params based on the currently negotiated link settings instead of current PHY config. User was not able to turn off pause params because ethtool was incorrectly reporting parameters as off when link was down even though PHY was configured to support pause frames. Now pause params are taken from PHY config instead of link status. Signed-off-by: Lukasz Czapnik <lukasz.czapnik@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-02-25 | ice: use absolute vector ID for VFs | Mitch Williams
When the PF driver sets up the VF MSI-X vector allocation, it needs to use the hardware absolute vector ID, not the per-PF vector ID. Without this change we see (apparent) TX hangs when using VFs on multiple PFs. Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-02-25 | ice: check for a leaf node presence | Victor Raj
Check for a leaf node presence for a given VSI. This check is required before removing a VSI since VSIs can't be removed with enabled queues (with leaf nodes) from the FW scheduler tree unless it's a reset. Signed-off-by: Victor Raj <victor.raj@intel.com> Reviewed-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-02-25 | ice: flush Tx pipe on disable queue timeout | Victor Raj
Set the flush Tx pipe flag instead of getting an EAGAIN error when FW times out in processing the disable Tx queue command. Signed-off-by: Victor Raj <victor.raj@intel.com> Reviewed-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-02-25 | ice: clear VF ARQLEN register on reset | Mitch Williams
On older devices like X710 and X722, the VF's ARQLEN register is cleared on reset, so the VF driver uses that register to detect an unannounced reset. Unfortunately, on devices controlled by ice, this register is NOT cleared on reset. This causes the VF to miss resets, and even on properly-announced resets, the VF driver complains that it didn't see the reset. To fix this, we'll do it in software. When we handle a VF reset (whether triggered by software or VFLR), clear this register after the HW reset is complete. Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-02-25 | ice: don't spam VFs with link messages | Mitch Williams
Don't send a link message to the VFs unless link actually changes state. This avoids a small timing hole in some VF drivers that can cause an apparent TX hang if they receive a link status message at the wrong time. Although we have fixed the timing hole in the current VF driver, there are still lots of drivers in the field that have this timing hole. Let's not fall into it if we can avoid it. Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
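The idea of notifying only on a state change can be sketched as below. This is an assumption-laden illustration: `struct vf_state` and `vf_notify_link()` are hypothetical, and incrementing a counter stands in for actually sending the message to the VF.

```c
#include <stdbool.h>

/* Hypothetical per-VF state; the real driver tracks much more. */
struct vf_state {
	bool link_up;		/* last link state reported to this VF */
	unsigned int msgs_sent;	/* number of link messages sent so far */
};

/* Notify the VF only when the link state differs from the last report. */
static bool vf_notify_link(struct vf_state *vf, bool link_up)
{
	if (vf->link_up == link_up)
		return false;	/* nothing changed, stay quiet */

	vf->link_up = link_up;
	vf->msgs_sent++;	/* stands in for sending the message */
	return true;
}
```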
2019-02-25 | ice: only use the VF for ICE_VSI_VF in ice_vsi_release | Brett Creeley
In ice_vsi_release we are always assigning a value to the local VF variable. Change this to only be assigned if the VSI is a VF VSI. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-02-25 | ice: fix numeric overflow warning | Bruce Allan
When compiling and analyzing the driver on newer kernels, a static analyzer warns about the following "numeric overflow" issues: "The result of expression: 'budget-1' generates 4-byte type while casting to a bigger size of 8-byte". "The result of expression: '*words-words_read' generates 4-byte type while casting to a bigger size of 8-byte". Fix them both. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
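The class of issue behind such warnings can be sketched in a few lines. The function names are hypothetical and this is not the driver's actual fix; it only shows how a 4-byte result that is widened afterwards differs from widening an operand first.

```c
#include <stdint.h>

/* Subtraction happens in 32 bits and is widened afterwards. */
static uint64_t widen_after(uint32_t a, uint32_t b)
{
	return a - b;
}

/* Widen one operand first so the subtraction happens in 64 bits. */
static uint64_t widen_before(uint32_t a, uint32_t b)
{
	return (uint64_t)a - b;
}
```

Both variants agree while `a >= b`; they differ once the 32-bit subtraction wraps, which is why analyzers flag the first form when the result is stored in a wider type.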
2019-02-25 | ice: fix issue where host reboots on unload when iommu=on | Brett Creeley
Currently if the kernel has the intel_iommu=on parameter set, on some platforms removing the driver causes a system reboot. In initialization we associate the control queue interrupts with the pf->hw_oicr_idx and enable the interrupts by setting the CAUSE_ENA bit. The problem comes on teardown because we are not clearing the CAUSE_ENA bit for the control queues, but the vector at pf->hw_oicr_idx (miscellaneous interrupt vector) gets disabled. Fix this by clearing the CAUSE_ENA bit in the appropriate control queue registers when freeing the miscellaneous interrupt vector. Also, move the call to ice_free_irq_msix_misc() to after ice_deinit_sw() in ice_remove() because ice_deinit_sw() makes an AQ call, but ice_free_irq_msix_misc() disables the miscellaneous vector and its associated interrupts. Also, create two small helper functions to enable and disable the control queue interrupts respectively. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-02-25 | ice: fix ice_remove_rule_internal vsi_list handling | Jacob Keller
When adding multiple VLANs to the same VSI, the ice_add_vlan code will share the VSI list, so as not to create multiple unnecessary VSI lists. Consider the following flow:

    ice_add_vlan(hw, <VSI 0 VID 7, VSI 0 VID 8, VSI 0 VID 9>)

Here we add three VLAN filters for VIDs 7, 8, and 9, all for VSI 0. The ice_add_vlan will create a single vsi_list and share it among all the filters. Later, if we try to remove a VLAN:

    ice_remove_vlan(hw, <VSI 0 VID 7>)

Then the removal code will update the vsi_list and remove VSI 0 from it. But, since the vsi_list is shared, this breaks the list for the other users who reference it. We actually even free the VSI list memory, which may result in segmentation faults.

This is due to the way that VLAN rules share VSI lists with reference counts, and is caused because we call ice_rem_update_vsi_list even when the ref_cnt is greater than one. To fix this, handle the case where ref_cnt is greater than one separately. In this case, we need to remove the associated rule without modifying the vsi_list, since it is currently being referenced by another rule. Instead, we just need to decrement the VSI list ref_cnt.

The case for handling sharing of VSI lists with multiple VSIs is not currently supported by this code. No such rules will be created today, and this code will require changes if/when such code is added.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
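The reference-counting rule described above can be sketched as follows. The structures and the `rule_attach()`/`rule_detach()` helpers are hypothetical simplifications, not the driver's code; only the `ref_cnt` name is taken from the message.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified shared list; not the driver's real structures. */
struct vsi_list {
	unsigned int ref_cnt;	/* number of rules pointing at this list */
};

struct rule {
	struct vsi_list *list;
};

/* Attach a rule to a list, creating the list on first use. */
static struct vsi_list *rule_attach(struct rule *r, struct vsi_list *shared)
{
	if (!shared) {
		shared = calloc(1, sizeof(*shared));
		if (!shared)
			return NULL;
	}
	shared->ref_cnt++;
	r->list = shared;
	return shared;
}

/* Detach a rule. Returns 1 if the list was freed, 0 if it is still in use. */
static int rule_detach(struct rule *r)
{
	struct vsi_list *list = r->list;

	assert(list && list->ref_cnt > 0);
	r->list = NULL;
	if (list->ref_cnt > 1) {
		list->ref_cnt--;	/* shared: leave the contents untouched */
		return 0;
	}
	free(list);			/* last user: safe to tear down */
	return 1;
}
```

Removing the rules for VIDs 7 and 8 only drops the count; the list is freed when the rule for VID 9, its last user, goes away.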
2019-02-25 | ice: fix stack hogs from struct ice_vsi_ctx structures | Bruce Allan
struct ice_vsi_ctx has gotten large enough that function local declarations of it on the stack are causing stack hogs. Fix that by allocating the structs on the heap. Clean up some formatting issues in the code around these changes and fix incorrect data type uses of returned functions in a couple of places. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-02-25 | ice: sizeof(<type>) should be avoided | Bruce Allan
With sizeof(), it is preferable to use the variable of type <type> instead of sizeof(<type>). There are multiple places where a temporary variable is used to hold a 'size' value which is then used for a subsequent alloc/memset. Get rid of the temporary variable by calculating size as part of the alloc/memset statement. Also remove unnecessary type-cast. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
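The preference stated above can be sketched in userspace C. `struct ctx` and `ctx_new()` are hypothetical; the point is only that the size is derived from the variable inside the allocation and memset statements, with no temporary size variable and no cast.

```c
#include <stdlib.h>
#include <string.h>

struct ctx { int id; char name[16]; };	/* hypothetical type */

/* The size follows the variable, so changing c's type cannot desynchronise it. */
static struct ctx *ctx_new(void)
{
	struct ctx *c = malloc(sizeof(*c));

	if (c)
		memset(c, 0, sizeof(*c));
	return c;
}
```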
2019-02-25 | ice: Fix added in VSI supported nodes calc | Victor Raj
VSI supported nodes are calculated in order to add the VSI parent or intermediate nodes to the scheduler tree. If one of the nodes in the layers below the VSI layer has space to add the new VSI or intermediate node above that layer, then the calculation does not need to continue for the layers below it. Signed-off-by: Victor Raj <victor.raj@intel.com> Reviewed-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-02-25 | ice: Fix the calculation of ICE_MAX_MTU | Maciej Fijalkowski
Currently ICE_MAX_MTU subtracts only ETH_HLEN from max frame size and adds ETH_FCS_LEN and VLAN_HLEN, which is not what was intended. The ETH_HLEN + ETH_FCS_LEN + VLAN_HLEN expression should be surrounded with parentheses. Wrap mentioned expression and take into account VLAN double tagging. Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
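The precedence problem can be reproduced with plain macros. This is a sketch under assumptions: `MAX_FRAME_SIZE` is an illustrative value rather than the driver's constant, and the macro names are hypothetical. The header lengths are the standard Ethernet values.

```c
/* Standard Ethernet sizes in bytes. */
#define ETH_HLEN	14
#define ETH_FCS_LEN	4
#define VLAN_HLEN	4

/* Assumed maximum frame size, chosen for illustration only. */
#define MAX_FRAME_SIZE	9728

/* Broken: only ETH_HLEN is subtracted, the other two terms are added. */
#define MAX_MTU_BROKEN	(MAX_FRAME_SIZE - ETH_HLEN + ETH_FCS_LEN + VLAN_HLEN)

/* Fixed: subtract the whole overhead, counting two VLAN tags. */
#define MAX_MTU_FIXED \
	(MAX_FRAME_SIZE - (ETH_HLEN + ETH_FCS_LEN + 2 * VLAN_HLEN))
```

With these numbers the broken form evaluates to 9722 and the fixed form to 9702, because subtraction and addition associate left to right when the parentheses are missing.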
2019-02-25 | ice: Mark extack argument as __always_unused | Bruce Allan
Commit 87b0984ebfab ("net: Add extack argument to ndo_fdb_add()") in net-next added an extended parameter to the .ndo_fdb_add op and changed ice_fdb_add() accordingly. Update the function header and add the __always_unused attribute to the new parameter to avoid -Wunused-parameter warnings. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
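Outside the kernel the same effect is available through the GCC/Clang `unused` attribute. In this sketch `ALWAYS_UNUSED` and `fdb_add()` are hypothetical names; the callback merely imitates a function whose signature is dictated by an ops table, so the extra parameter cannot simply be dropped.

```c
/* Userspace equivalent of the kernel macro, for GCC and Clang. */
#define ALWAYS_UNUSED __attribute__((__unused__))

/* Hypothetical callback whose signature is fixed by an ops table. */
static int fdb_add(int vid, void *extack ALWAYS_UNUSED)
{
	return vid >= 0 ? 0 : -1;
}
```

Compiling with -Wunused-parameter stays quiet for `extack` because the attribute documents that ignoring it is intentional.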
2019-02-24 | switchdev: Complete removal of switchdev_port_attr_get() | Florian Fainelli
We have no more in tree users of switchdev_port_attr_get() after d0e698d57a94 ("Merge branch 'net-Get-rid-of-switchdev_port_attr_get'") so completely remove the function signature and body. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-24 | dsa: Remove phydev parameter from disable_port call | Andrew Lunn
No current DSA driver makes use of the phydev parameter passed to the disable_port call. Remove it. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-24 | net: phy: fix reading fixed phy status | Heiner Kallweit
With the switch to phy_resolve_aneg_linkmode() we no longer read from the chip what is advertised but use phydev->advertising directly. For a fixed phy, however, this bitmap is empty so far, which results in no common mode being found. This breaks DSA. Fix this by advertising everything that is supported. For a normal phy this is done by phy_probe(). Fixes: 5502b218e001 ("net: phy: use phy_resolve_aneg_linkmode in genphy_read_status") Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Tested-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
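Why an empty advertising bitmap yields no common mode can be sketched with plain bit masks. This is a hypothetical model: the kernel uses larger link-mode bitmaps and its own helpers, and `struct phy`, the `MODE_*` bits and both functions below are illustrative names.

```c
#include <stdint.h>

/* Hypothetical link-mode bits; the kernel uses a larger bitmap. */
#define MODE_100_FULL	(UINT32_C(1) << 0)
#define MODE_1000_FULL	(UINT32_C(1) << 1)

struct phy {
	uint32_t supported;
	uint32_t advertising;
	uint32_t lp_advertising;	/* what the link partner advertises */
};

/* Modes both sides advertise; 0 means no common mode was found. */
static uint32_t common_modes(const struct phy *p)
{
	return p->advertising & p->lp_advertising;
}

/* The fix described above: advertise everything that is supported. */
static void advertise_supported(struct phy *p)
{
	p->advertising = p->supported;
}
```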
2019-02-24 | net: phy: improve auto-neg emulation in swphy | Heiner Kallweit
Auto-neg emulation currently doesn't set bit BMCR_ANENABLE in BMCR, add this. Users will ignore speed and duplex settings in BMCR because we're emulating auto-neg, therefore we can remove related code. See also following discussion [0]. [0] https://marc.info/?t=155041784900002&r=1&w=2 Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Tested-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-24 | Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next | David S. Miller
Johan Hedberg says:
====================
Here's the main bluetooth-next pull request for the 5.1 kernel.

- Fixes & improvements to mediatek, hci_qca, btrtl, and btmrvl HCI drivers
- Fixes to parsing invalid L2CAP config option sizes
- Locking fix to bt_accept_enqueue()
- Add support for new Marvell sd8977 chipset
- Various other smaller fixes & cleanups
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-24 | net: fix double-free in bpf_lwt_xmit_reroute | Peter Oskolkov
dst_output() frees skb when it fails (see, for example, ip_finish_output2), so it must not be freed in this case. Fixes: 3bd0b15281af ("bpf: add handling of BPF_LWT_REROUTE to lwt_bpf.c") Signed-off-by: Peter Oskolkov <posk@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
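The ownership rule can be sketched with a buffer that counts its own frees. All names here are hypothetical: `output()` only imitates a function that consumes its argument in every case, and `reroute()` shows a caller that therefore must not free the buffer again on error.

```c
#include <stdlib.h>

/* Hypothetical buffer with a counter so frees can be observed. */
struct buf {
	int *free_count;
};

static void buf_free(struct buf *b)
{
	(*b->free_count)++;
	free(b);
}

/* Consumes b in every case; on failure it frees b itself. */
static int output(struct buf *b, int fail)
{
	if (fail) {
		buf_free(b);
		return -1;
	}
	buf_free(b);	/* success: ownership passed on, modelled as a free */
	return 0;
}

/* Correct caller: never touches b after handing it to output(). */
static int reroute(struct buf *b, int fail)
{
	return output(b, fail);	/* no buf_free(b) on the error path */
}
```

Adding a `buf_free(b)` to the error path of `reroute()` would free the same buffer twice, which is the bug class the commit removes.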
2019-02-24 | ip_tunnel: Add ip tunnel tun_info type dst_cache in ip_tunnel_xmit | wenxu
    ip l add dev tun type gretap key 1000

A non-tunnel-dst ip tunnel device can send packets through lwtunnel. This patch provides tun_info dst cache support for this mode. Signed-off-by: wenxu <wenxu@ucloud.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-24 | Merge branch 'dsa-mv88e6xxx-lockdep' | David S. Miller
Andrew Lunn says: ==================== mv88e6xxx: Avoid false positive Lockdep splats When acquiring the GPIO interrupt line for the switch, it is possible to trigger lockdep splats. These are false positives: the mutex is in a different IRQ descriptor. But fix it anyway, since it could mask real locking issues. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-24 | net: dsa: mv88e6xxx: Release lock while requesting IRQ | Andrew Lunn
There is no need to hold the register lock while requesting the GPIO interrupt. By not holding it we can also avoid a false positive lockdep splat. Reported-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-24 | net: dsa: mv88e6xxx: Add lockdep classes to fix false positive splat | Andrew Lunn
The following false positive lockdep splat has been observed.

======================================================
WARNING: possible circular locking dependency detected
4.20.0+ #302 Not tainted
------------------------------------------------------
systemd-udevd/160 is trying to acquire lock:
edea6080 (&chip->reg_lock){+.+.}, at: __setup_irq+0x640/0x704

but task is already holding lock:
edff0340 (&desc->request_mutex){+.+.}, at: __setup_irq+0xa0/0x704

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #1 (&desc->request_mutex){+.+.}:
       mutex_lock_nested+0x1c/0x24
       __setup_irq+0xa0/0x704
       request_threaded_irq+0xd0/0x150
       mv88e6xxx_probe+0x41c/0x694 [mv88e6xxx]
       mdio_probe+0x2c/0x54
       really_probe+0x200/0x2c4
       driver_probe_device+0x5c/0x174
       __driver_attach+0xd8/0xdc
       bus_for_each_dev+0x58/0x7c
       bus_add_driver+0xe4/0x1f0
       driver_register+0x7c/0x110
       mdio_driver_register+0x24/0x58
       do_one_initcall+0x74/0x2e8
       do_init_module+0x60/0x1d0
       load_module+0x1968/0x1ff4
       sys_finit_module+0x8c/0x98
       ret_fast_syscall+0x0/0x28
       0xbedf2ae8

-> #0 (&chip->reg_lock){+.+.}:
       __mutex_lock+0x50/0x8b8
       mutex_lock_nested+0x1c/0x24
       __setup_irq+0x640/0x704
       request_threaded_irq+0xd0/0x150
       mv88e6xxx_g2_irq_setup+0xcc/0x1b4 [mv88e6xxx]
       mv88e6xxx_probe+0x44c/0x694 [mv88e6xxx]
       mdio_probe+0x2c/0x54
       really_probe+0x200/0x2c4
       driver_probe_device+0x5c/0x174
       __driver_attach+0xd8/0xdc
       bus_for_each_dev+0x58/0x7c
       bus_add_driver+0xe4/0x1f0
       driver_register+0x7c/0x110
       mdio_driver_register+0x24/0x58
       do_one_initcall+0x74/0x2e8
       do_init_module+0x60/0x1d0
       load_module+0x1968/0x1ff4
       sys_finit_module+0x8c/0x98
       ret_fast_syscall+0x0/0x28
       0xbedf2ae8

other info that might help us debug this:

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&desc->request_mutex);
                               lock(&chip->reg_lock);
                               lock(&desc->request_mutex);
  lock(&chip->reg_lock);

&desc->request_mutex refers to two different mutexes. #1 is the GPIO for the chip interrupt. #2 is the chained interrupt between global 1 and global 2.

Add lockdep classes to the GPIO interrupt to avoid this.

Reported-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-24 | ip_tunnel: Add dst_cache support in lwtunnel_state of ip tunnel | wenxu
The dst_cache in the lwtunnel_state is not initialised, so ip_md_tunnel_xmit can't use it and has to look up the route table for every packet. Signed-off-by: wenxu <wenxu@ucloud.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-24 | tls: Return type of non-data records retrieved using MSG_PEEK in recvmsg | Vakul Garg
The patch enables returning 'type' in msghdr for records that are retrieved with MSG_PEEK in recvmsg. Further it prevents records peeked from socket from getting clubbed with any other record of different type when records are subsequently dequeued from strparser. For each record, we now retain its type in sk_buff's control buffer cb[]. Inside control buffer, record's full length and offset are already stored by strparser in 'struct strp_msg'. We store record type after 'struct strp_msg' inside 'struct tls_msg'. For tls1.2, the type is stored just after record dequeue. For tls1.3, the type is stored after record has been decrypted. Inside process_rx_list(), before processing a non-data record, we check that we must be able to return back the record type to the user application. If not, the decrypted records in tls context's rx_list is left there without consuming any data. Fixes: 692d7b5d1f912 ("tls: Fix recvmsg() to be able to peek across multiple records") Signed-off-by: Vakul Garg <vakul.garg@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
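The layout described above, with the record type stored right after the strparser metadata inside the control buffer, can be sketched as follows. This is a simplified mirror under assumptions: `struct pkt`, `CB_SIZE` and the two accessor functions are hypothetical, and only the struct names and the idea of a per-record type byte come from the message.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define CB_SIZE 48	/* assumed size of the control buffer cb[] */

/* Simplified mirror of the strparser metadata. */
struct strp_msg {
	int full_len;
	int offset;
};

/* Record type stored right after the strparser metadata. */
struct tls_msg {
	struct strp_msg rxm;
	uint8_t control;
};

/* Hypothetical packet holding only the control buffer. */
struct pkt {
	unsigned char cb[CB_SIZE];
};

static_assert(sizeof(struct tls_msg) <= CB_SIZE, "tls_msg must fit in cb[]");

/* Remember the record type without disturbing full_len and offset. */
static void pkt_set_record_type(struct pkt *p, uint8_t type)
{
	struct tls_msg m;

	memcpy(&m, p->cb, sizeof(m));
	m.control = type;
	memcpy(p->cb, &m, sizeof(m));
}

/* Read the record type back, e.g. when the record is peeked or dequeued. */
static uint8_t pkt_record_type(const struct pkt *p)
{
	struct tls_msg m;

	memcpy(&m, p->cb, sizeof(m));
	return m.control;
}
```

The sketch copies through memcpy() to stay within strict ISO C aliasing rules; kernel code typically casts the cb[] pointer directly.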
2019-02-24 | Merge branch 'ipv4-v6-icmp-small-cleanup-and-update' | David S. Miller
Kefeng Wang says:
====================
ipv4/v6: icmp: small cleanup and update

v2:
- Add cover letter and use proper patch subject-prefix, as suggested by Eric Dumazet

This patch series contains some small cleanups and updates:
1) use icmp/v6_sk_exit when icmp_sk_init fails instead of open-coding it
2) use the new percpu allocation interface for the ipv6.icmp_sk
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-24 | ipv6: icmp: use percpu allocation | Kefeng Wang
Use percpu allocation for the ipv6.icmp_sk. Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-24 | ipv6: icmp: use icmpv6_sk_exit() | Kefeng Wang
Simply use icmpv6_sk_exit() when inet_ctl_sock_create() fails in icmpv6_sk_init(). Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-24 | ipv4: icmp: use icmp_sk_exit() | Kefeng Wang
Simply use icmp_sk_exit() when inet_ctl_sock_create() fails in icmp_sk_init(). Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-24 | ila: Fix uninitialised return value in ila_xlat_nl_cmd_flush | Herbert Xu
This patch fixes an uninitialised return value error in ila_xlat_nl_cmd_flush. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Fixes: 6c4128f65857 ("rhashtable: Remove obsolete...") Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
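A generic instance of this bug class is a function whose return variable is assigned only inside a loop. The sketch below is hypothetical and not the ila code: `flush_all()` and its `results` array are invented for illustration, and the empty-input case is an assumption about how such a value can be left unset.

```c
#include <stddef.h>

/* Hypothetical flush loop; returns 0 on success or the first error seen. */
static int flush_all(const int *results, size_t n)
{
	int ret = 0;	/* initialised so an empty input returns success */

	for (size_t i = 0; i < n; i++) {
		ret = results[i];
		if (ret)
			break;
	}
	return ret;
}
```

Without the initialiser, calling the function with `n == 0` would return an indeterminate value.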
2019-02-24 | net/sched: act_tunnel_key: Add dst_cache support | wenxu
The dst_cache in the metadata_dst is not initialised, so ip_md_tunnel_xmit can't use it and has to look up the route table for every packet. Signed-off-by: wenxu <wenxu@ucloud.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-24 | Merge branch 'code-optimizations-and-bugfixes-for-HNS3-driver' | David S. Miller
Huazhong Tan says: ==================== code optimizations & bugfixes for HNS3 driver This patchset includes bugfixes and code optimizations for the HNS3 ethernet controller driver. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-24 | net: hns3: fix improper error handling for hns3_client_start | Huazhong Tan
If hns3_client_start() fails in hns3_client_init(), register_dev() should be undone in its error handling. Fixes: a6d818e31d08 ("net: hns3: Add vport alive state checking support") Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-24 | net: hns3: fix setting of the hns reset_type for rdma hw errors | Shiju Jose
Presently the hns reset_type for the roce errors is set in the hclge_log_and_clear_rocee_ras_error function. This function is also called to detect and clear roce errors while enabling the rdma error interrupts. However, no hns reset is requested in that case. Because the reset_type set there is never cleared, a subsequent hns reset can use the wrong reset_type. This patch moves setting of the hns reset_type for the roce errors from the hclge_log_and_clear_rocee_ras_error function to hclge_handle_rocee_ras_error. Fixes: 630ba007f475 ("net: hns3: add handling of RDMA RAS errors") Reported-by: Huazhong Tan <tanhuazhong@huawei.com> Reported-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: Shiju Jose <shiju.jose@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>