summaryrefslogtreecommitdiff
path: root/drivers/net/ethernet
AgeCommit message (Collapse)Author
2021-02-01igc: check return value of ret_val in igc_config_fc_after_link_upKevin Lo
Check return value from ret_val to make error check actually work. Fixes: 4eb8080143a9 ("igc: Add setup link functionality") Signed-off-by: Kevin Lo <kevlo@kevlo.org> Acked-by: Sasha Neftin <sasha.neftin@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-02-01igc: set the default return value to -IGC_ERR_NVM in igc_write_nvm_srwrKevin Lo
This patch sets the default return value to -IGC_ERR_NVM in igc_write_nvm_srwr. Without this change it wouldn't lead to a shadow RAM write EEWR timeout. Fixes: ab4056126813 ("igc: Add NVM support") Signed-off-by: Kevin Lo <kevlo@kevlo.org> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-02-01igc: Report speed and duplex as unknown when device is runtime suspendedKai-Heng Feng
Similar to commit 165ae7a8feb5 ("igb: Report speed and duplex as unknown when device is runtime suspended"), if we try to read speed and duplex sysfs while the device is runtime suspended, igc will complain and stops working: [ 123.449883] igc 0000:03:00.0 enp3s0: PCIe link lost, device now detached [ 123.450052] BUG: kernel NULL pointer dereference, address: 0000000000000008 [ 123.450056] #PF: supervisor read access in kernel mode [ 123.450058] #PF: error_code(0x0000) - not-present page [ 123.450059] PGD 0 P4D 0 [ 123.450064] Oops: 0000 [#1] SMP NOPTI [ 123.450068] CPU: 0 PID: 2525 Comm: udevadm Tainted: G U W OE 5.10.0-1002-oem #2+rkl2-Ubuntu [ 123.450078] RIP: 0010:igc_rd32+0x1c/0x90 [igc] [ 123.450080] Code: c0 5d c3 b8 fd ff ff ff c3 0f 1f 44 00 00 0f 1f 44 00 00 55 89 f0 48 89 e5 41 56 41 55 41 54 49 89 c4 53 48 8b 57 08 48 01 d0 <44> 8b 28 41 83 fd ff 74 0c 5b 44 89 e8 41 5c 41 5d 4 [ 123.450083] RSP: 0018:ffffb0d100d6fcc0 EFLAGS: 00010202 [ 123.450085] RAX: 0000000000000008 RBX: ffffb0d100d6fd30 RCX: 0000000000000000 [ 123.450087] RDX: 0000000000000000 RSI: 0000000000000008 RDI: ffff945a12716c10 [ 123.450089] RBP: ffffb0d100d6fce0 R08: ffff945a12716550 R09: ffff945a09874000 [ 123.450090] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000008 [ 123.450092] R13: ffff945a12716000 R14: ffff945a037da280 R15: ffff945a037da290 [ 123.450094] FS: 00007f3b34c868c0(0000) GS:ffff945b89200000(0000) knlGS:0000000000000000 [ 123.450096] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 123.450098] CR2: 0000000000000008 CR3: 00000001144de006 CR4: 0000000000770ef0 [ 123.450100] PKRU: 55555554 [ 123.450101] Call Trace: [ 123.450111] igc_ethtool_get_link_ksettings+0xd6/0x1b0 [igc] [ 123.450118] __ethtool_get_link_ksettings+0x71/0xb0 [ 123.450123] duplex_show+0x74/0xc0 [ 123.450129] dev_attr_show+0x1d/0x40 [ 123.450134] sysfs_kf_seq_show+0xa1/0x100 [ 123.450137] kernfs_seq_show+0x27/0x30 [ 123.450142] seq_read+0xb7/0x400 [ 123.450148] ? common_file_perm+0x72/0x170 [ 123.450151] kernfs_fop_read+0x35/0x1b0 [ 123.450155] vfs_read+0xb5/0x1b0 [ 123.450157] ksys_read+0x67/0xe0 [ 123.450160] __x64_sys_read+0x1a/0x20 [ 123.450164] do_syscall_64+0x38/0x90 [ 123.450168] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 123.450170] RIP: 0033:0x7f3b351fe142 [ 123.450173] Code: c0 e9 c2 fe ff ff 50 48 8d 3d 3a ca 0a 00 e8 f5 19 02 00 0f 1f 44 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 0f 05 <48> 3d 00 f0 ff ff 77 56 c3 0f 1f 44 00 00 48 83 ec 28 48 89 54 24 [ 123.450174] RSP: 002b:00007fffef2ec138 EFLAGS: 00000246 ORIG_RAX: 0000000000000000 [ 123.450177] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f3b351fe142 [ 123.450179] RDX: 0000000000001001 RSI: 00005644c047f070 RDI: 0000000000000003 [ 123.450180] RBP: 00007fffef2ec340 R08: 00005644c047f070 R09: 00007f3b352d9320 [ 123.450182] R10: 00005644c047c010 R11: 0000000000000246 R12: 00005644c047cbf0 [ 123.450184] R13: 00005644c047e6d0 R14: 0000000000000003 R15: 00007fffef2ec140 [ 123.450189] Modules linked in: rfcomm ccm cmac algif_hash algif_skcipher af_alg bnep toshiba_acpi industrialio toshiba_haps hp_accel lis3lv02d btusb btrtl btbcm btintel bluetooth ecdh_generic ecc joydev input_leds nls_iso8859_1 snd_sof_pci snd_sof_intel_byt snd_sof_intel_ipc snd_sof_intel_hda_common snd_soc_hdac_hda snd_hda_codec_hdmi snd_sof_xtensa_dsp snd_sof_intel_hda snd_sof snd_hda_ext_core snd_soc_acpi_intel_match snd_soc_acpi snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio snd_hda_intel snd_intel_dspcfg soundwire_intel soundwire_generic_allocation soundwire_cadence snd_hda_codec snd_hda_core ath10k_pci snd_hwdep intel_rapl_msr intel_rapl_common ath10k_core soundwire_bus snd_soc_core x86_pkg_temp_thermal ath intel_powerclamp snd_compress ac97_bus snd_pcm_dmaengine mac80211 snd_pcm coretemp snd_seq_midi snd_seq_midi_event snd_rawmidi kvm_intel cfg80211 snd_seq snd_seq_device snd_timer mei_hdcp kvm libarc4 snd crct10dif_pclmul ghash_clmulni_intel aesni_intel mei_me dell_wmi [ 123.450266] dell_smbios soundcore sparse_keymap dcdbas crypto_simd cryptd mei dell_uart_backlight glue_helper ee1004 wmi_bmof intel_wmi_thunderbolt dell_wmi_descriptor mac_hid efi_pstore acpi_pad acpi_tad intel_cstate sch_fq_codel parport_pc ppdev lp parport ip_tables x_tables autofs4 btrfs blake2b_generic raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear dm_mirror dm_region_hash dm_log hid_generic usbhid hid i915 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops cec crc32_pclmul rc_core drm intel_lpss_pci i2c_i801 ahci igc intel_lpss i2c_smbus idma64 xhci_pci libahci virt_dma xhci_pci_renesas wmi video pinctrl_tigerlake [ 123.450335] CR2: 0000000000000008 [ 123.450338] ---[ end trace 9f731e38b53c35cc ]--- The more generic approach will be wrap get_link_ksettings() with begin() and complete() callbacks, and calls runtime resume and runtime suspend routine respectively. However, igc is like igb, runtime resume routine uses rtnl_lock() which upper ethtool layer also uses. So to prevent a deadlock on rtnl, take a different approach, use pm_runtime_suspended() to avoid reading register while device is runtime suspended. Fixes: 8c5ad0dae93c ("igc: Add ethtool support") Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Acked-by: Sasha Neftin <sasha.neftin@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-01-29r8169: work around RTL8125 UDP hw bugHeiner Kallweit
It was reported that on RTL8125 network breaks under heavy UDP load, e.g. torrent traffic ([0], from comment 27). Realtek confirmed a hw bug and provided me with a test version of the r8125 driver including a workaround. Tests confirmed that the workaround fixes the issue. I modified the original version of the workaround to meet mainline code style. [0] https://bugzilla.kernel.org/show_bug.cgi?id=209839 v2: - rebased to net v3: - make rtl_skb_is_udp() more robust and use skb_header_pointer() to access the ip(v6) header v4: - remove dependency on ptp_classify.h - replace magic number with offsetof(struct udphdr, len) Fixes: f1bce4ad2f1c ("r8169: add support for RTL8125") Tested-by: xplo <xplo.bn@gmail.com> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Acked-by: Willem de Bruijn <willemb@google.com> Link: https://lore.kernel.org/r/6e453d49-1801-e6de-d5f7-d7e6c7526c8f@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-28Merge tag 'net-5.11-rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Networking fixes including fixes from can, xfrm, wireless, wireless-drivers and netfilter trees. Nothing scary, Intel WiFi-related fixes seemed most notable to the users. Current release - regressions: - dsa: microchip: ksz8795: fix KSZ8794 port map again to program the CPU port correctly Current release - new code bugs: - iwlwifi: pcie: reschedule in long-running memory reads Previous releases - regressions: - iwlwifi: dbg: don't try to overwrite read-only FW data - iwlwifi: provide gso_type to GSO packets - octeontx2: make sure the buffer is 128 byte aligned - tcp: make TCP_USER_TIMEOUT accurate for zero window probes - xfrm: fix wraparound in xfrm_policy_addr_delta() - xfrm: fix oops in xfrm_replay_advance_bmp due to a race between CPUs in presence of packet reorder - tcp: fix TLP timer not set when CA_STATE changes from DISORDER to OPEN - wext: fix NULL-ptr-dereference with cfg80211's lack of commit() Previous releases - always broken: - igc: fix link speed advertising - stmmac: configure EHL PSE0 GbE and PSE1 GbE to 32 bits DMA addressing - team: protect features update by RCU to avoid deadlock - xfrm: fix disable_xfrm sysctl when used on xfrm interfaces themselves - fec: fix temporary RMII clock reset on link up - can: dev: prevent potential information leak in can_fill_info() Misc: - mrp: fix bad packing of MRP test packet structures - uapi: fix big endian definition of ipv6_rpl_sr_hdr - add David Ahern to IPv4/IPv6 maintainers" * tag 'net-5.11-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (86 commits) rxrpc: Fix memory leak in rxrpc_lookup_local mlxsw: spectrum_span: Do not overwrite policer configuration selftests: forwarding: Specify interface when invoking mausezahn stmmac: intel: Configure EHL PSE0 GbE and PSE1 GbE to 32 bits DMA addressing net: usb: cdc_ether: added support for Thales Cinterion PLSx3 modem family. ibmvnic: Ensure that CRQ entry read are correctly ordered MAINTAINERS: add missing header for bonding net: decnet: fix netdev refcount leaking on error path net: switchdev: don't set port_obj_info->handled true when -EOPNOTSUPP can: dev: prevent potential information leak in can_fill_info() net: fec: Fix temporary RMII clock reset on link up net: lapb: Add locking to the lapb module team: protect features update by RCU to avoid deadlock MAINTAINERS: add David Ahern to IPv4/IPv6 maintainers net/mlx5: CT: Fix incorrect removal of tuple_nat_node from nat rhashtable net/mlx5e: Revert parameters on errors when changing MTU and LRO state without reset net/mlx5e: Revert parameters on errors when changing trust state without reset net/mlx5e: Correctly handle changing the number of queues when the interface is down net/mlx5e: Fix CT rule + encap slow path offload and deletion net/mlx5e: Disable hw-tc-offload when MLX5_CLS_ACT config is disabled ...
2021-01-28mlxsw: spectrum_span: Do not overwrite policer configurationIdo Schimmel
The purpose of the delayed work in the SPAN module is to potentially update the destination port and various encapsulation parameters of SPAN agents that point to a VLAN device or a GRE tap. The destination port can change following the insertion of a new route, for example. SPAN agents that point to a physical port or the CPU port are static and never change throughout the lifetime of the SPAN agent. Therefore, skip over them in the delayed work. This fixes an issue where the delayed work overwrites the policer that was set on a SPAN agent pointing to the CPU. Modifying the delayed work to inherit the original policer configuration is error-prone, as the same will be needed for any new parameter. Fixes: 4039504e6a0c ("mlxsw: spectrum_span: Allow setting policer on a SPAN agent") Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-28stmmac: intel: Configure EHL PSE0 GbE and PSE1 GbE to 32 bits DMA addressingVoon Weifeng
Fix an issue where dump stack is printed and Reset Adapter occurs when PSE0 GbE or/and PSE1 GbE is/are enabled. EHL PSE0 GbE and PSE1 GbE use 32 bits DMA addressing whereas EHL PCH GbE uses 64 bits DMA addressing. [ 25.535095] ------------[ cut here ]------------ [ 25.540276] NETDEV WATCHDOG: enp0s29f2 (intel-eth-pci): transmit queue 2 timed out [ 25.548749] WARNING: CPU: 2 PID: 0 at net/sched/sch_generic.c:443 dev_watchdog+0x259/0x260 [ 25.558004] Modules linked in: 8021q bnep bluetooth ecryptfs snd_hda_codec_hdmi intel_gpy marvell intel_ishtp_loader intel_ishtp_hid iTCO_wdt mei_hdcp iTCO_vendor_support x86_pkg_temp_thermal kvm_intel dwmac_intel stmmac kvm igb pcs_xpcs irqbypass phylink snd_hda_intel intel_rapl_msr pcspkr dca snd_hda_codec i915 i2c_i801 i2c_smbus libphy intel_ish_ipc snd_hda_core mei_me intel_ishtp mei spi_dw_pci 8250_lpss spi_dw thermal dw_dmac_core parport_pc tpm_crb tpm_tis parport tpm_tis_core tpm intel_pmc_core sch_fq_codel uhid fuse configfs snd_sof_pci snd_sof_intel_byt snd_sof_intel_ipc snd_sof_intel_hda_common snd_sof_xtensa_dsp snd_sof snd_soc_acpi_intel_match snd_soc_acpi snd_intel_dspcfg ledtrig_audio snd_soc_core snd_compress ac97_bus snd_pcm snd_timer snd soundcore [ 25.633795] CPU: 2 PID: 0 Comm: swapper/2 Tainted: G U 5.11.0-rc4-intel-lts-MISMAIL5+ #5 [ 25.644306] Hardware name: Intel Corporation Elkhart Lake Embedded Platform/ElkhartLake LPDDR4x T4 RVP1, BIOS EHLSFWI1.R00.2434.A00.2010231402 10/23/2020 [ 25.659674] RIP: 0010:dev_watchdog+0x259/0x260 [ 25.664650] Code: e8 3b 6b 60 ff eb 98 4c 89 ef c6 05 ec e7 bf 00 01 e8 fb e5 fa ff 89 d9 4c 89 ee 48 c7 c7 78 31 d2 9e 48 89 c2 e8 79 1b 18 00 <0f> 0b e9 77 ff ff ff 0f 1f 44 00 00 48 c7 47 08 00 00 00 00 48 c7 [ 25.685647] RSP: 0018:ffffb7ca80160eb8 EFLAGS: 00010286 [ 25.691498] RAX: 0000000000000000 RBX: 0000000000000002 RCX: 0000000000000103 [ 25.699483] RDX: 0000000080000103 RSI: 00000000000000f6 RDI: 00000000ffffffff [ 25.707465] RBP: ffff985709ce0440 R08: 0000000000000000 R09: c0000000ffffefff [ 25.715455] R10: ffffb7ca80160cf0 R11: ffffb7ca80160ce8 R12: ffff985709ce039c [ 25.723438] R13: ffff985709ce0000 R14: 0000000000000008 R15: ffff9857068af940 [ 25.731425] FS: 0000000000000000(0000) GS:ffff985864300000(0000) knlGS:0000000000000000 [ 25.740481] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 25.746913] CR2: 00005567f8bb76b8 CR3: 00000001f8e0a000 CR4: 0000000000350ee0 [ 25.754900] Call Trace: [ 25.757631] <IRQ> [ 25.759891] ? qdisc_put_unlocked+0x30/0x30 [ 25.764565] ? qdisc_put_unlocked+0x30/0x30 [ 25.769245] call_timer_fn+0x2e/0x140 [ 25.773346] run_timer_softirq+0x1f3/0x430 [ 25.777932] ? __hrtimer_run_queues+0x12c/0x2c0 [ 25.783005] ? ktime_get+0x3e/0xa0 [ 25.786812] __do_softirq+0xa6/0x2ef [ 25.790816] asm_call_irq_on_stack+0xf/0x20 [ 25.795501] </IRQ> [ 25.797852] do_softirq_own_stack+0x5d/0x80 [ 25.802538] irq_exit_rcu+0x94/0xb0 [ 25.806475] sysvec_apic_timer_interrupt+0x42/0xc0 [ 25.811836] asm_sysvec_apic_timer_interrupt+0x12/0x20 [ 25.817586] RIP: 0010:cpuidle_enter_state+0xd9/0x370 [ 25.823142] Code: 85 c0 0f 8f 0a 02 00 00 31 ff e8 22 d5 7e ff 45 84 ff 74 12 9c 58 f6 c4 02 0f 85 47 02 00 00 31 ff e8 7b a0 84 ff fb 45 85 f6 <0f> 88 ab 00 00 00 49 63 ce 48 2b 2c 24 48 89 c8 48 6b d1 68 48 c1 [ 25.844140] RSP: 0018:ffffb7ca800f7e80 EFLAGS: 00000206 [ 25.849996] RAX: ffff985864300000 RBX: 0000000000000003 RCX: 000000000000001f [ 25.857975] RDX: 00000005f2028ea8 RSI: ffffffff9ec5907f RDI: ffffffff9ec62a5d [ 25.865961] RBP: 00000005f2028ea8 R08: 0000000000000000 R09: 0000000000029d00 [ 25.873947] R10: 000000137b0e0508 R11: ffff9858643294e4 R12: ffff9858643336d0 [ 25.881935] R13: ffffffff9ef74b00 R14: 0000000000000003 R15: 0000000000000000 [ 25.889918] cpuidle_enter+0x29/0x40 [ 25.893922] do_idle+0x24a/0x290 [ 25.897536] cpu_startup_entry+0x19/0x20 [ 25.901930] start_secondary+0x128/0x160 [ 25.906326] secondary_startup_64_no_verify+0xb0/0xbb [ 25.911983] ---[ end trace b4c0c8195d0ba61f ]--- [ 25.917193] intel-eth-pci 0000:00:1d.2 enp0s29f2: Reset adapter. Fixes: 67c08ac4140a ("net: stmmac: add EHL PSE0 & PSE1 1Gbps PCI info and PCI ID") Signed-off-by: Voon Weifeng <weifeng.voon@intel.com> Co-developed-by: Mohammad Athari Bin Ismail <mohammad.athari.ismail@intel.com> Signed-off-by: Mohammad Athari Bin Ismail <mohammad.athari.ismail@intel.com> Link: https://lore.kernel.org/r/20210126100844.30326-1-mohammad.athari.ismail@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-28Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdmaLinus Torvalds
Pull rdma fixes from Jason Gunthorpe: "Several recent regressions and some bug fixes: - Typo corrupting the max_recv_sge for cxgb4 - Regression from re-using kernel enums as a HW AbI in vmw_pvrdma - Sleeping inside a spinlock in hns - Revert the attempt to fix devlink deadlocks as the fix is more buggy - Typo in sysfs_emit_at conversions - Revert the removal of VLAN support in rxe" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: Revert "RDMA/rxe: Remove VLAN code leftovers from RXE" RDMA/usnic: Fix misuse of sysfs_emit_at Revert "RDMA/mlx5: Fix devlink deadlock on net namespace deletion" RDMA/hns: Use mutex instead of spinlock for ida allocation RDMA/vmw_pvrdma: Fix network_hdr_type reported in WC RDMA/cxgb4: Fix the reported max_recv_sge value
2021-01-27Merge tag 'mlx5-fixes-2021-01-26' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5 fixes 2021-01-26 This series introduces some fixes to mlx5 driver. * tag 'mlx5-fixes-2021-01-26' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux: net/mlx5: CT: Fix incorrect removal of tuple_nat_node from nat rhashtable net/mlx5e: Revert parameters on errors when changing MTU and LRO state without reset net/mlx5e: Revert parameters on errors when changing trust state without reset net/mlx5e: Correctly handle changing the number of queues when the interface is down net/mlx5e: Fix CT rule + encap slow path offload and deletion net/mlx5e: Disable hw-tc-offload when MLX5_CLS_ACT config is disabled net/mlx5: Maintain separate page trees for ECPF and PF functions net/mlx5e: Fix IPSEC stats net/mlx5e: Reduce tc unsupported key print level net/mlx5e: free page before return net/mlx5e: E-switch, Fix rate calculation for overflow net/mlx5: Fix memory leak on flow table creation error flow ==================== Link: https://lore.kernel.org/r/20210126234345.202096-1-saeedm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-27ibmvnic: Ensure that CRQ entry read are correctly orderedLijun Pan
Ensure that received Command-Response Queue (CRQ) entries are properly read in order by the driver. dma_rmb barrier has been added before accessing the CRQ descriptor to ensure the entire descriptor is read before processing. Fixes: 032c5e82847a ("Driver for IBM System i/p VNIC protocol") Signed-off-by: Lijun Pan <ljp@linux.ibm.com> Link: https://lore.kernel.org/r/20210128013442.88319-1-ljp@linux.ibm.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-27Merge branch '100GbE' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Anthony Nguyen says: ==================== Intel Wired LAN Driver Updates 2021-01-26 This series contains updates to the ice, i40e, and igc driver. Henry corrects setting an unspecified protocol to IPPROTO_NONE instead of 0 for IPv6 flexbytes filters for ice. Nick fixes the IPv6 extension header being processed incorrectly and updates the netdev->dev_addr if it exists in hardware as it may have been modified outside the ice driver. Brett ensures a user cannot request more channels than available LAN MSI-X and fixes the minimum allocation logic as it was incorrectly trying to use more MSI-X than allocated for ice. Stefan Assmann minimizes the delay between getting and using the VSI pointer to prevent a possible crash for i40e. Corinna Vinschen fixes link speed advertising for igc. * '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue: igc: fix link speed advertising i40e: acquire VSI pointer only after VF is initialized ice: Fix MSI-X vector fallback logic ice: Don't allow more channels than LAN MSI-X available ice: update dev_addr in ice_set_mac_address even if HW filter exists ice: Implement flow for IPv6 next header (extension header) ice: fix FDir IPv6 flexbyte ==================== Link: https://lore.kernel.org/r/20210126221035.658124-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-26net: fec: Fix temporary RMII clock reset on link upLaurent Badel
fec_restart() does a hard reset of the MAC module when the link status changes to up. This temporarily resets the R_CNTRL register which controls the MII mode of the ENET_OUT clock. In the case of RMII, the clock frequency momentarily drops from 50MHz to 25MHz until the register is reconfigured. Some link partners do not tolerate this glitch and invalidate the link causing failure to establish a stable link when using PHY polling mode. Since as per IEEE802.3 the criteria for link validity are PHY-specific, what the partner should tolerate cannot be assumed, so avoid resetting the MII clock by using software reset instead of hardware reset when the link is up. This is generally relevant only if the SoC provides the clock to an external PHY and the PHY is configured for RMII. Signed-off-by: Laurent Badel <laurentbadel@eaton.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-26net/mlx5: CT: Fix incorrect removal of tuple_nat_node from nat rhashtablePaul Blakey
If a non nat tuple entry is inserted just to the regular tuples rhashtable (ct_tuples_ht) and not to natted tuples rhashtable (ct_nat_tuples_ht). Commit bc562be9674b ("net/mlx5e: CT: Save ct entries tuples in hashtables") mixed up the return labels and names sot that on cleanup or failure we still try to remove for the natted tuples rhashtable. Fix that by correctly checking if a natted tuples insertion before removing it. While here make it more readable. Fixes: bc562be9674b ("net/mlx5e: CT: Save ct entries tuples in hashtables") Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Paul Blakey <paulb@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-01-26net/mlx5e: Revert parameters on errors when changing MTU and LRO state ↵Maxim Mikityanskiy
without reset Sometimes, channel params are changed without recreating the channels. It happens in two basic cases: when the channels are closed, and when the parameter being changed doesn't affect how channels are configured. Such changes invoke a hardware command that might fail. The whole operation should be reverted in such cases, but the code that restores the parameters' values in the driver was missing. This commit adds this handling. Fixes: 2e20a151205b ("net/mlx5e: Fail safe mtu and lro setting") Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-01-26net/mlx5e: Revert parameters on errors when changing trust state without resetMaxim Mikityanskiy
Trust state may be changed without recreating the channels. It happens when the channels are closed, and when channel parameters (min inline mode) stay the same after changing the trust state. Changing the trust state is a hardware command that may fail. The current code didn't restore the channel parameters to their old values if an error happened and the channels were closed. This commit adds handling for this case. Fixes: 6e0504c69811 ("net/mlx5e: Change inline mode correctly when changing trust state") Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-01-26net/mlx5e: Correctly handle changing the number of queues when the interface ↵Maxim Mikityanskiy
is down This commit addresses two issues related to changing the number of queues when the channels are closed: 1. Missing call to mlx5e_num_channels_changed to update real_num_tx_queues when the number of TCs is changed. 2. When mlx5e_num_channels_changed returns an error, the channel parameters must be reverted. Two Fixes: tags correspond to the first commits where these two issues were introduced. Fixes: 3909a12e7913 ("net/mlx5e: Fix configuration of XPS cpumasks and netdev queues in corner cases") Fixes: fa3748775b92 ("net/mlx5e: Handle errors from netif_set_real_num_{tx,rx}_queues") Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-01-26net/mlx5e: Fix CT rule + encap slow path offload and deletionPaul Blakey
Currently, if a neighbour isn't valid when offloading tunnel encap rules, we offload the original match and replace the original action with "goto slow path" action. For this we use a temporary flow attribute based on the original flow attribute and then change the action. Flow flags, which among those is the CT flag, are still shared for the slow path rule offload, so we end up parsing this flow as a CT + goto slow path rule. Besides being unnecessary, CT action offload saves extra information in the passed flow attribute, such as created ct_flow and mod_hdr, which is lost onces the temporary flow attribute is freed. When a neigh is updated and is valid, we offload the original CT rule with original CT action, which again creates a ct_flow and mod_hdr and saves it in the flow's original attribute. Then we delete the slow path rule with a temporary flow attribute based on original updated flow attribute, and we free the relevant ct_flow and mod_hdr. Then when tc deletes this flow, we try to free the ct_flow and mod_hdr on the flow's attribute again. To fix the issue, skip all furture proccesing (CT/Sample/Split rules) in offload/unoffload of slow path rules. Call trace: [ 758.850525] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000218 [ 758.952987] Internal error: Oops: 96000005 [#1] PREEMPT SMP [ 758.964170] Modules linked in: act_csum(E) act_pedit(E) act_tunnel_key(E) act_ct(E) nf_flow_table(E) xt_nat(E) ip6table_filter(E) ip6table_nat(E) xt_comment(E) ip6_tables(E) xt_conntrack(E) xt_MASQUERADE(E) nf_conntrack_netlink(E) xt_addrtype(E) iptable_filter(E) iptable_nat(E) bpfilter(E) br_netfilter(E) bridge(E) stp(E) llc(E) xfrm_user(E) overlay(E) act_mirred(E) act_skbedit(E) rdma_ucm(OE) rdma_cm(OE) iw_cm(OE) ib_ipoib(OE) ib_cm(OE) ib_umad(OE) esp6_offload(E) esp6(E) esp4_offload(E) esp4(E) xfrm_algo(E) mlx5_ib(OE) ib_uverbs(OE) geneve(E) ip6_udp_tunnel(E) udp_tunnel(E) nfnetlink_cttimeout(E) nfnetlink(E) mlx5_core(OE) act_gact(E) cls_flower(E) sch_ingress(E) openvswitch(E) nsh(E) nf_conncount(E) nf_nat(E) mlxfw(OE) psample(E) nf_conntrack(E) nf_defrag_ipv4(E) vfio_mdev(E) mdev(E) ib_core(OE) mlx_compat(OE) crct10dif_ce(E) uio_pdrv_genirq(E) uio(E) i2c_mlx(E) mlxbf_pmc(E) sbsa_gwdt(E) mlxbf_gige(E) gpio_mlxbf2(E) mlxbf_pka(E) mlx_trio(E) mlx_bootctl(E) bluefield_edac(E) knem(O) [ 758.964225] ip_tables(E) mlxbf_tmfifo(E) ipv6(E) crc_ccitt(E) nf_defrag_ipv6(E) [ 759.154186] CPU: 5 PID: 122 Comm: kworker/u16:1 Tainted: G OE 5.4.60-mlnx.52.gde81e85 #1 [ 759.172870] Hardware name: https://www.mellanox.com BlueField SoC/BlueField SoC, BIOS BlueField:3.5.0-2-gc1b5d64 Jan 4 2021 [ 759.195466] Workqueue: mlx5e mlx5e_rep_neigh_update [mlx5_core] [ 759.207344] pstate: a0000005 (NzCv daif -PAN -UAO) [ 759.217003] pc : mlx5_del_flow_rules+0x5c/0x160 [mlx5_core] [ 759.228229] lr : mlx5_del_flow_rules+0x34/0x160 [mlx5_core] [ 759.405858] Call trace: [ 759.410804] mlx5_del_flow_rules+0x5c/0x160 [mlx5_core] [ 759.421337] __mlx5_eswitch_del_rule.isra.43+0x5c/0x1c8 [mlx5_core] [ 759.433963] mlx5_eswitch_del_offloaded_rule_ct+0x34/0x40 [mlx5_core] [ 759.446942] mlx5_tc_rule_delete_ct+0x68/0x74 [mlx5_core] [ 759.457821] mlx5_tc_ct_delete_flow+0x160/0x21c [mlx5_core] [ 759.469051] mlx5e_tc_unoffload_fdb_rules+0x158/0x168 [mlx5_core] [ 759.481325] mlx5e_tc_encap_flows_del+0x140/0x26c [mlx5_core] [ 759.492901] mlx5e_rep_update_flows+0x11c/0x1ec [mlx5_core] [ 759.504127] mlx5e_rep_neigh_update+0x160/0x200 [mlx5_core] [ 759.515314] process_one_work+0x178/0x400 [ 759.523350] worker_thread+0x58/0x3e8 [ 759.530685] kthread+0x100/0x12c [ 759.537152] ret_from_fork+0x10/0x18 [ 759.544320] Code: 97ffef55 51000673 3100067f 54ffff41 (b9421ab3) [ 759.556548] ---[ end trace fab818bb1085832d ]--- Fixes: 4c3844d9e97e ("net/mlx5e: CT: Introduce connection tracking") Signed-off-by: Paul Blakey <paulb@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-01-26net/mlx5e: Disable hw-tc-offload when MLX5_CLS_ACT config is disabledMaor Dickman
The cited commit introduce new CONFIG_MLX5_CLS_ACT kconfig variable to control compilation of TC hardware offloads implementation. When this configuration is disabled the driver is still wrongly reports in ethtool that hw-tc-offload is supported. Fixed by reporting hw-tc-offload is supported only when CONFIG_MLX5_CLS_ACT is enabled. Fixes: d956873f908c ("net/mlx5e: Introduce kconfig var for TC support") Signed-off-by: Maor Dickman <maord@nvidia.com> Reviewed-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-01-26net/mlx5: Maintain separate page trees for ECPF and PF functionsDaniel Jurgens
Pages for the host PF and ECPF were stored in the same tree, so the ECPF pages were being freed along with the host PF's when the host driver unloaded. Combine the function ID and ECPF flag to use as an index into the x-array containing the trees to get a different tree for the host PF and ECPF. Fixes: c6168161f693 ("net/mlx5: Add support for release all pages event") Signed-off-by: Daniel Jurgens <danielj@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-01-26net/mlx5e: Fix IPSEC statsMaxim Mikityanskiy
When IPSEC offload isn't active, the number of stats is not zero, but the strings are not filled, leading to exposing stats with empty names. Fix this by using the same condition for NUM_STATS and FILL_STRS. Fixes: 0aab3e1b04ae ("net/mlx5e: IPSec, Expose IPsec HW stat only for supporting HW") Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com> Reviewed-by: Raed Salem <raeds@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-01-26net/mlx5e: Reduce tc unsupported key print levelMaor Dickman
"Unsupported key used:" appears in kernel log when flows with unsupported key are used, arp fields for example. OpenVSwitch was changed to match on arp fields by default that caused this warning to appear in kernel log for every arp rule, which can be a lot. Fix by lowering print level from warning to debug. Fixes: e3a2b7ed018e ("net/mlx5e: Support offload cls_flower with drop action") Signed-off-by: Maor Dickman <maord@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-01-26net/mlx5e: free page before returnPan Bian
Instead of directly return, goto the error handling label to free allocated page. Fixes: 5f29458b77d5 ("net/mlx5e: Support dump callback in TX reporter") Signed-off-by: Pan Bian <bianpan2016@163.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-01-26net/mlx5e: E-switch, Fix rate calculation for overflowParav Pandit
rate_bytes_ps is a 64-bit field. It passed as 32-bit field to apply_police_params(). Due to this when police rate is higher than 4Gbps, 32-bit calculation ignores the carry. This results in incorrect rate configurationn the device. Fix it by performing 64-bit calculation. Fixes: fcb64c0f5640 ("net/mlx5: E-Switch, add ingress rate support") Signed-off-by: Parav Pandit <parav@nvidia.com> Reviewed-by: Eli Cohen <elic@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-01-26net/mlx5: Fix memory leak on flow table creation error flowRoi Dayan
When we create the ft object we also init rhltable in ft->fgs_hash. So in error flow before kfree of ft we need to destroy that rhltable. Fixes: 693c6883bbc4 ("net/mlx5: Add hash table for flow groups in flow table") Signed-off-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-01-26igc: fix link speed advertisingCorinna Vinschen
Link speed advertising in igc has two problems: - When setting the advertisement via ethtool, the link speed is converted to the legacy 32 bit representation for the intel PHY code. This inadvertently drops ETHTOOL_LINK_MODE_2500baseT_Full_BIT (being beyond bit 31). As a result, any call to `ethtool -s ...' drops the 2500Mbit/s link speed from the PHY settings. Only reloading the driver alleviates that problem. Fix this by converting the ETHTOOL_LINK_MODE_2500baseT_Full_BIT to the Intel PHY ADVERTISE_2500_FULL bit explicitly. - Rather than checking the actual PHY setting, the .get_link_ksettings function always fills link_modes.advertising with all link speeds the device is capable of. Fix this by checking the PHY autoneg_advertised settings and report only the actually advertised speeds up to ethtool. Fixes: 8c5ad0dae93c ("igc: Add ethtool support") Signed-off-by: Corinna Vinschen <vinschen@redhat.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-01-26i40e: acquire VSI pointer only after VF is initializedStefan Assmann
This change simplifies the VF initialization check and also minimizes the delay between acquiring the VSI pointer and using it. As known by the commit being fixed, there is a risk of the VSI pointer getting changed. Therefore minimize the delay between getting and using the pointer. Fixes: 9889707b06ac ("i40e: Fix crash caused by stress setting of VF MAC addresses") Signed-off-by: Stefan Assmann <sassmann@kpanic.de> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-01-26ice: Fix MSI-X vector fallback logicBrett Creeley
The current MSI-X enablement logic tries to enable best-case MSI-X vectors and if that fails we only support a bare-minimum set. This includes a single MSI-X for 1 Tx and 1 Rx queue and a single MSI-X for the OICR interrupt. Unfortunately, the driver fails to load when we don't get as many MSI-X as requested for a couple reasons. First, the code to allocate MSI-X in the driver tries to allocate num_online_cpus() MSI-X for LAN traffic without caring about the number of MSI-X actually enabled/requested from the kernel for LAN traffic. So, when calling ice_get_res() for the PF VSI, it returns failure because the number of available vectors is less than requested. Fix this by not allowing the PF VSI to allocation more than pf->num_lan_msix MSI-X vectors and pf->num_lan_msix Rx/Tx queues. Limiting the number of queues is done because we don't want more than 1 Tx/Rx queue per interrupt due to performance conerns. Second, the driver assigns pf->num_lan_msix = 2, to account for LAN traffic and the OICR. However, pf->num_lan_msix is only meant for LAN MSI-X. This is causing a failure when the PF VSI tries to allocate/reserve the minimum pf->num_lan_msix because the OICR MSI-X has already been reserved, so there may not be enough MSI-X vectors left. Fix this by setting pf->num_lan_msix = 1 for the failure case. Then the ICE_MIN_MSIX accounts for the LAN MSI-X and the OICR MSI-X needed for the failure case. Update the related defines used in ice_ena_msix_range() to align with the above behavior and remove the unused RDMA defines because RDMA is currently not supported. Also, remove the now incorrect comment. Fixes: 152b978a1f90 ("ice: Rework ice_ena_msix_range") Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-01-26ice: Don't allow more channels than LAN MSI-X availableBrett Creeley
Currently users could create more channels than LAN MSI-X available. This is happening because there is no check against pf->num_lan_msix when checking the max allowed channels and will cause performance issues if multiple Tx and Rx queues are tied to a single MSI-X. Fix this by not allowing more channels than LAN MSI-X available in pf->num_lan_msix. Fixes: 87324e747fde ("ice: Implement ethtool ops for channels") Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-01-26ice: update dev_addr in ice_set_mac_address even if HW filter existsNick Nunley
Fix the driver to copy the MAC address configured in ndo_set_mac_address into dev_addr, even if the MAC filter already exists in HW. In some situations (e.g. bonding) the netdev's dev_addr could have been modified outside of the driver, with no change to the HW filter, so the driver cannot assume that they match. Fixes: 757976ab16be ("ice: Fix check for removing/adding mac filters") Signed-off-by: Nick Nunley <nicholas.d.nunley@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-01-26ice: Implement flow for IPv6 next header (extension header)Nick Nunley
This patch is based on a similar change to i40e by Slawomir Laba: "i40e: Implement flow for IPv6 next header (extension header)". When a packet contains an IPv6 header with next header which is an extension header and not a protocol one, the kernel function skb_transport_header called with such sk_buff will return a pointer to the extension header and not to the TCP one. The above explained call caused a problem with packet processing for skb with encapsulation for tunnel with ICE_TX_CTX_EIPT_IPV6. The extension header was not skipped at all. The ipv6_skip_exthdr function does check if next header of the IPV6 header is an extension header and doesn't modify the l4_proto pointer if it points to a protocol header value so its safe to omit the comparison of exthdr and l4.hdr pointers. The ipv6_skip_exthdr can return value -1. This means that the skipping process failed and there is something wrong with the packet so it will be dropped. Fixes: a4e82a81f573 ("ice: Add support for tunnel offloads") Signed-off-by: Nick Nunley <nicholas.d.nunley@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-01-26ice: fix FDir IPv6 flexbyteHenry Tieman
The packet classifier would occasionally misrecognize an IPv6 training packet when the next protocol field was 0. The correct value for unspecified protocol is IPPROTO_NONE. Fixes: 165d80d6adab ("ice: Support IPv6 Flow Director filters") Signed-off-by: Henry Tieman <henry.w.tieman@intel.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-01-23chtls: Fix potential resource leakPan Bian
The dst entry should be released if no neighbour is found. Goto label free_dst to fix the issue. Besides, the check of ndev against NULL is redundant. Signed-off-by: Pan Bian <bianpan2016@163.com> Link: https://lore.kernel.org/r/20210121145738.51091-1-bianpan2016@163.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-22net: fec: put child node on error pathPan Bian
Also decrement the reference count of child device on error path. Fixes: 3e782985cb3c ("net: ethernet: fec: Allow configuration of MDIO bus speed") Signed-off-by: Pan Bian <bianpan2016@163.com> Link: https://lore.kernel.org/r/20210120122037.83897-1-bianpan2016@163.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-22net: stmmac: dwmac-intel-plat: remove config data on errorPan Bian
Remove the config data when rate setting fails. Fixes: 9efc9b2b04c7 ("net: stmmac: Add dwmac-intel-plat for GBE driver") Signed-off-by: Pan Bian <bianpan2016@163.com> Link: https://lore.kernel.org/r/20210120110745.36412-1-bianpan2016@163.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-22net: octeontx2: Make sure the buffer is 128 byte alignedKevin Hao
The octeontx2 hardware needs the buffer to be 128 byte aligned. But in the current implementation of napi_alloc_frag(), it can't guarantee the return address is 128 byte aligned even the request size is a multiple of 128 bytes, so we have to request an extra 128 bytes and use the PTR_ALIGN() to make sure that the buffer is aligned correctly. Fixes: 7a36e4918e30 ("octeontx2-pf: Use the napi_alloc_frag() to alloc the pool buffers") Reported-by: Subbaraya Sundeep <sbhatta@marvell.com> Signed-off-by: Kevin Hao <haokexin@gmail.com> Tested-by: Subbaraya Sundeep <sbhatta@marvell.com> Link: https://lore.kernel.org/r/20210121070906.25380-1-haokexin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-20net: systemport: free dev before on error pathPan Bian
On the error path, it should goto the error handling label to free allocated memory rather than directly return. Fixes: 31bc72d97656 ("net: systemport: fetch and use clock resources") Signed-off-by: Pan Bian <bianpan2016@163.com> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Link: https://lore.kernel.org/r/20210120044423.1704-1-bianpan2016@163.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-20net: mscc: ocelot: Fix multicast to the CPU portAlban Bedel
Multicast entries in the MAC table use the high bits of the MAC address to encode the ports that should get the packets. But this port mask does not work for the CPU port, to receive these packets on the CPU port the MAC_CPU_COPY flag must be set. Because of this IPv6 was effectively not working because neighbor solicitations were never received. This was not apparent before commit 9403c158 (net: mscc: ocelot: support IPv4, IPv6 and plain Ethernet mdb entries) as the IPv6 entries were broken so all incoming IPv6 multicast was then treated as unknown and flooded on all ports. To fix this problem rework the ocelot_mact_learn() to set the MAC_CPU_COPY flag when a multicast entry that target the CPU port is added. For this we have to read back the ports endcoded in the pseudo MAC address by the caller. It is not a very nice design but that avoid changing the callers and should make backporting easier. Signed-off-by: Alban Bedel <alban.bedel@aerq.com> Fixes: 9403c158b872 ("net: mscc: ocelot: support IPv4, IPv6 and plain Ethernet mdb entries") Link: https://lore.kernel.org/r/20210119140638.203374-1-alban.bedel@aerq.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-19sh_eth: Fix power down vs. is_opened flag orderingGeert Uytterhoeven
sh_eth_close() does a synchronous power down of the device before marking it closed. Revert the order, to make sure the device is never marked opened while suspended. While at it, use pm_runtime_put() instead of pm_runtime_put_sync(), as there is no reason to do a synchronous power down. Fixes: 7fa2955ff70ce453 ("sh_eth: Fix sleeping function called from invalid context") Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Sergei Shtylyov <sergei.shtylyov@gmail.com> Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se> Link: https://lore.kernel.org/r/20210118150812.796791-1-geert+renesas@glider.be Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-19Revert "RDMA/mlx5: Fix devlink deadlock on net namespace deletion"Parav Pandit
This reverts commit fbdd0049d98d44914fc57d4b91f867f4996c787b. Due to commit in fixes tag, netdevice events were received only in one net namespace of mlx5_core_dev. Due to this when netdevice events arrive in net namespace other than net namespace of mlx5_core_dev, they are missed. This results in empty GID table due to RDMA device being detached from its net device. Hence, revert back to receive netdevice events in all net namespaces to restore back RDMA functionality in non init_net net namespace. The deadlock will have to be addressed in another patch. Fixes: fbdd0049d98d ("RDMA/mlx5: Fix devlink deadlock on net namespace deletion") Link: https://lore.kernel.org/r/20210117092633.10690-1-leon@kernel.org Signed-off-by: Parav Pandit <parav@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-01-19sh_eth: Make PHY access aware of Runtime PM to fix reboot crashGeert Uytterhoeven
Wolfram reports that his R-Car H2-based Lager board can no longer be rebooted in v5.11-rc1, as it crashes with an imprecise external abort. The issue can be reproduced on other boards (e.g. Koelsch with R-Car M2-W) too, if CONFIG_IP_PNP is disabled, and the Ethernet interface is down at reboot time: Unhandled fault: imprecise external abort (0x1406) at 0x00000000 pgd = (ptrval) [00000000] *pgd=422b6835, *pte=00000000, *ppte=00000000 Internal error: : 1406 [#1] ARM Modules linked in: CPU: 0 PID: 1105 Comm: init Tainted: G W 5.10.0-rc1-00402-ge2f016cf7751 #1048 Hardware name: Generic R-Car Gen2 (Flattened Device Tree) PC is at sh_mdio_ctrl+0x44/0x60 LR is at sh_mmd_ctrl+0x20/0x24 ... Backtrace: [<c0451f30>] (sh_mdio_ctrl) from [<c0451fd4>] (sh_mmd_ctrl+0x20/0x24) r7:0000001f r6:00000020 r5:00000002 r4:c22a1dc4 [<c0451fb4>] (sh_mmd_ctrl) from [<c044fc18>] (mdiobb_cmd+0x38/0xa8) [<c044fbe0>] (mdiobb_cmd) from [<c044feb8>] (mdiobb_read+0x58/0xdc) r9:c229f844 r8:c0c329dc r7:c221e000 r6:00000001 r5:c22a1dc4 r4:00000001 [<c044fe60>] (mdiobb_read) from [<c044c854>] (__mdiobus_read+0x74/0xe0) r7:0000001f r6:00000001 r5:c221e000 r4:c221e000 [<c044c7e0>] (__mdiobus_read) from [<c044c9d8>] (mdiobus_read+0x40/0x54) r7:0000001f r6:00000001 r5:c221e000 r4:c221e458 [<c044c998>] (mdiobus_read) from [<c044d678>] (phy_read+0x1c/0x20) r7:ffffe000 r6:c221e470 r5:00000200 r4:c229f800 [<c044d65c>] (phy_read) from [<c044d94c>] (kszphy_config_intr+0x44/0x80) [<c044d908>] (kszphy_config_intr) from [<c044694c>] (phy_disable_interrupts+0x44/0x50) r5:c229f800 r4:c229f800 [<c0446908>] (phy_disable_interrupts) from [<c0449370>] (phy_shutdown+0x18/0x1c) r5:c229f800 r4:c229f804 [<c0449358>] (phy_shutdown) from [<c040066c>] (device_shutdown+0x168/0x1f8) [<c0400504>] (device_shutdown) from [<c013de44>] (kernel_restart_prepare+0x3c/0x48) r9:c22d2000 r8:c0100264 r7:c0b0d034 r6:00000000 r5:4321fedc r4:00000000 [<c013de08>] (kernel_restart_prepare) from [<c013dee0>] (kernel_restart+0x1c/0x60) [<c013dec4>] (kernel_restart) from [<c013e1d8>] (__do_sys_reboot+0x168/0x208) r5:4321fedc r4:01234567 [<c013e070>] (__do_sys_reboot) from [<c013e2e8>] (sys_reboot+0x18/0x1c) r7:00000058 r6:00000000 r5:00000000 r4:00000000 [<c013e2d0>] (sys_reboot) from [<c0100060>] (ret_fast_syscall+0x0/0x54) As of commit e2f016cf775129c0 ("net: phy: add a shutdown procedure"), system reboot calls phy_disable_interrupts() during shutdown. As this happens unconditionally, the PHY registers may be accessed while the device is suspended, causing undefined behavior, which may crash the system. Fix this by wrapping the PHY bitbang accessors in the sh_eth driver by wrappers that take care of Runtime PM, to resume the device when needed. Reported-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Suggested-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Tested-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-18net: mscc: ocelot: allow offloading of bridge on top of LAGVladimir Oltean
The blamed commit was too aggressive, and it made ocelot_netdevice_event react only to network interface events emitted for the ocelot switch ports. In fact, only the PRECHANGEUPPER should have had that check. When we ignore all events that are not for us, we miss the fact that the upper of the LAG changes, and the bonding interface gets enslaved to a bridge. This is an operation we could offload under certain conditions. Fixes: 7afb3e575e5a ("net: mscc: ocelot: don't handle netdev events for other netdevs") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Link: https://lore.kernel.org/r/20210118135210.2666246-1-olteanv@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-15octeontx2-af: Fix missing check bugs in rvu_cgx.cYingjie Wang
In rvu_mbox_handler_cgx_mac_addr_get() and rvu_mbox_handler_cgx_mac_addr_set(), the msg is expected only from PFs that are mapped to CGX LMACs. It should be checked before mapping, so we add the is_cgx_config_permitted() in the functions. Fixes: 96be2e0da85e ("octeontx2-af: Support for MAC address filters in CGX") Signed-off-by: Yingjie Wang <wangyingjie55@126.com> Reviewed-by: Geetha sowjanya<gakula@marvell.com> Link: https://lore.kernel.org/r/1610719804-35230-1-git-send-email-wangyingjie55@126.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-14net: stmmac: fix taprio configuration when base_time is in the pastYannick Vignon
The Synopsys TSN MAC supports Qbv base times in the past, but only up to a certain limit. As a result, a taprio qdisc configuration with a small base time (for example when treating the base time as a simple phase offset) is not applied by the hardware and silently ignored. This was observed on an NXP i.MX8MPlus device, but likely affects all TSN-variants of the MAC. Fix the issue by making sure the base time is in the future, pushing it by an integer amount of cycle times if needed. (a similar check is already done in several other taprio implementations, see for example drivers/net/ethernet/intel/igc/igc_tsn.c#L116 or drivers/net/dsa/sja1105/sja1105_ptp.h#L39). Fixes: b60189e0392f ("net: stmmac: Integrate EST with TAPRIO scheduler API") Signed-off-by: Yannick Vignon <yannick.vignon@nxp.com> Link: https://lore.kernel.org/r/20210113131557.24651-2-yannick.vignon@oss.nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-14net: stmmac: fix taprio schedule configurationYannick Vignon
When configuring a 802.1Qbv schedule through the tc taprio qdisc on an NXP i.MX8MPlus device, the effective cycle time differed from the requested one by N*96ns, with N number of entries in the Qbv Gate Control List. This is because the driver was adding a 96ns margin to each interval of the GCL, apparently to account for the IPG. The problem was observed on NXP i.MX8MPlus devices but likely affected all devices relying on the same configuration callback (dwmac 4.00, 4.10, 5.10 variants). Fix the issue by removing the margins, and simply setup the MAC with the provided cycle time value. This is the behavior expected by the user-space API, as altering the Qbv schedule timings would break standards conformance. This is also the behavior of several other Ethernet MAC implementations supporting taprio, including the dwxgmac variant of stmmac. Fixes: 504723af0d85 ("net: stmmac: Add basic EST support for GMAC5+") Signed-off-by: Yannick Vignon <yannick.vignon@nxp.com> Link: https://lore.kernel.org/r/20210113131557.24651-1-yannick.vignon@oss.nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-13net: stmmac: Fixed mtu channged by cache alignedDavid Wu
Since the original mtu is not used when the mtu is updated, the mtu is aligned with cache, this will get an incorrect. For example, if you want to configure the mtu to be 1500, but mtu 1536 is configured in fact. Fixed: eaf4fac478077 ("net: stmmac: Do not accept invalid MTU values") Signed-off-by: David Wu <david.wu@rock-chips.com> Link: https://lore.kernel.org/r/20210113034109.27865-1-david.wu@rock-chips.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-13cxgb4/chtls: Fix tid stuck due to wrong update of qidAyush Sawal
TID stuck is seen when there is a race in CPL_PASS_ACCEPT_RPL/CPL_ABORT_REQ and abort is arriving before the accept reply, which sets the queue number. In this case HW ends up sending CPL_ABORT_RPL_RSS to an incorrect ingress queue. V1->V2: - Removed the unused variable len in chtls_set_quiesce_ctrl(). V2->V3: - As kfree_skb() has a check for null skb, so removed this check before calling kfree_skb() in func chtls_send_reset(). Fixes: cc35c88ae4db ("crypto : chtls - CPL handler definition") Signed-off-by: Rohit Maheshwari <rohitm@chelsio.com> Signed-off-by: Ayush Sawal <ayush.sawal@chelsio.com> Link: https://lore.kernel.org/r/20210112053600.24590-1-ayush.sawal@chelsio.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-13i40e: fix potential NULL pointer dereferencingCristian Dumitrescu
Currently, the function i40e_construct_skb_zc only frees the input xdp buffer when the output skb is successfully built. On error, the function i40e_clean_rx_irq_zc does not commit anything for the current packet descriptor and simply exits the packet descriptor processing loop, with the plan to restart the processing of this descriptor on the next invocation. Therefore, on error the ring next-to-clean pointer should not advance, the xdp i.e. *bi buffer should not be freed and the current buffer info should not be invalidated by setting *bi to NULL. Therefore, the *bi should only be set to NULL when the function i40e_construct_skb_zc is successful, otherwise a NULL *bi will be dereferenced when the work for the current descriptor is eventually restarted. Fixes: 3b4f0b66c2b3 ("i40e, xsk: Migrate to new MEM_TYPE_XSK_BUFF_POOL") Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com> Acked-by: Björn Töpel <bjorn.topel@intel.com> Link: https://lore.kernel.org/r/20210111181138.49757-1-cristian.dumitrescu@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-13net: stmmac: use __napi_schedule() for PREEMPT_RTSeb Laveze
Use of __napi_schedule_irqoff() is not safe with PREEMPT_RT in which hard interrupts are not disabled while running the threaded interrupt. Using __napi_schedule() works for both PREEMPT_RT and mainline Linux, just at the cost of an additional check if interrupts are disabled for mainline (since they are already disabled). Similar to the fix done for enetc commit 215602a8d212 ("enetc: use napi_schedule to be compatible with PREEMPT_RT") Signed-off-by: Seb Laveze <sebastien.laveze@nxp.com> Link: https://lore.kernel.org/r/20210112140121.1487619-1-sebastien.laveze@oss.nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-12bnxt_en: Clear DEFRAG flag in firmware message when retry flashing.Pavan Chebbi
When the FW tells the driver to retry the INSTALL_UPDATE command after it has cleared the NVM area, the driver is not clearing the previously used ALLOWED_TO_DEFRAG flag. As a result the FW tries to defrag the NVM area a second time in a loop and can fail the request. Fixes: 1432c3f6a6ca ("bnxt_en: Retry installing FW package under NO_SPACE error condition.") Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-12bnxt_en: Improve stats context resource accounting with RDMA driver loaded.Michael Chan
The function bnxt_get_ulp_stat_ctxs() does not count the stats contexts used by the RDMA driver correctly when the RDMA driver is freeing the MSIX vectors. It assumes that if the RDMA driver is registered, the additional stats contexts will be needed. This is not true when the RDMA driver is about to unregister and frees the MSIX vectors. This slight error leads to over accouting of the stats contexts needed after the RDMA driver has unloaded. This will cause some firmware warning and error messages in dmesg during subsequent config. changes or ifdown/ifup. Fix it by properly accouting for extra stats contexts only if the RDMA driver is registered and MSIX vectors have been successfully requested. Fixes: c027c6b4e91f ("bnxt_en: get rid of num_stat_ctxs variable") Reviewed-by: Yongping Zhang <yongping.zhang@broadcom.com> Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>