Age | Commit message (Collapse) | Author |
|
The minimum and maximum limits for resources assigned to a given
resource group are programmed in pairs, with the limits for two
groups set in a single register.
If the number of supported resource groups is odd, only half of the
register that defines these limits is valid for the last group; that
group has no second group in the pair.
Currently we ignore this constraint, and it turns out to be harmless,
but it is not guaranteed to be. This patch addresses that, and adds
support for programming the 5th resource group's limits.
Rework how the resource group limit registers are programmed by
having a single function program all group pairs rather than having
one function program each pair. Add the programming of the 4-5
resource group pair limits to this function. If a resource group is
not supported, pass a null pointer to ipa_resource_config_common()
for that group and have that function write zeroes in that case.
Tested-by: Sujit Kautkar <sujitka@chromium.org>
Signed-off-by: Alex Elder <elder@linaro.org>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The number of resource groups supported by the hardware can be
different for source and destination resources. Determine the
number supported for each using separate functions. Make the
functions inline end move their definitions into "ipa_reg.h",
because they determine whether certain register definitions are
valid. Pass just the IPA hardware version as argument.
IPA_RESOURCE_GROUP_COUNT represents the maximum number of resource
groups the driver supports for any hardware version. Change that
symbol to be two separate constants, one for source and the other
for destination resource groups. Rename them to end with "_MAX"
rather than "_COUNT", to reflect their true purpose.
Tested-by: Sujit Kautkar <sujitka@chromium.org>
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The IPA hardware manages various resources (e.g. descriptors)
internally to perform its functions. The resources are grouped,
allowing different endpoints to use separate resource pools. This
way one group of endpoints can be configured to operate unaffected
by the resource use of endpoints in a different group.
Endpoints should be assigned to a resource group, but we currently
don't do that.
Define a new resource_group field in the endpoint configuration
data, and use it to assign the proper resource group to use for
each AP endpoint.
Tested-by: Sujit Kautkar <sujitka@chromium.org>
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The mask for the RSRC_GRP field in the INIT_RSRC_GRP endpoint
initialization register is incorrectly defined for IPA v4.2 (where
it is only one bit wide). So we need to fix this.
The fix is not straightforward, however. Field masks are passed to
functions like u32_encode_bits(), and for that they must be constant.
To address this, we define a new inline function that returns the
*encoded* value to use for a given RSRC_GRP field, which depends on
the IPA version. The caller can then use something like this, to
assign a given endpoint resource id 1:
u32 offset = IPA_REG_ENDP_INIT_RSRC_GRP_N_OFFSET(endpoint_id);
u32 val = rsrc_grp_encoded(ipa->version, 1);
iowrite32(val, ipa->reg_virt + offset);
The next patch requires this fix.
Tested-by: Sujit Kautkar <sujitka@chromium.org>
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
At the end of ipa_mem_setup() we write the local packet processing
context base register to tell it where the processing context memory
is. But we are writing the wrong value.
The value written turns out to be the offset of the modem header
memory region (assigned earlier in the function). Fix this bug.
Tested-by: Sujit Kautkar <sujitka@chromium.org>
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The driver does not implement a shutdown handler which leads to issues
when using kexec in certain scenarios. The NIC keeps on fetching
descriptors which gets flagged by the IOMMU with errors like this:
DMAR: DMAR:[DMA read] Request device [5e:00.0]fault addr fffff000
DMAR: DMAR:[DMA read] Request device [5e:00.0]fault addr fffff000
DMAR: DMAR:[DMA read] Request device [5e:00.0]fault addr fffff000
DMAR: DMAR:[DMA read] Request device [5e:00.0]fault addr fffff000
DMAR: DMAR:[DMA read] Request device [5e:00.0]fault addr fffff000
Signed-off-by: Moritz Fischer <mdf@kernel.org>
Link: https://lore.kernel.org/r/20201028172125.496942-1-mdf@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The Finisar FCLF8520P2BTL 1000BaseT SFP module uses a Marvel 88E1111 PHY
with a modified PHY ID. Add support for this ID using the 88E1111
methods.
By default these modules do not have 1000BaseX auto-negotiation enabled,
which is not generally desirable with Linux networking drivers. Add
handling to enable 1000BaseX auto-negotiation when these modules are
used in 1000BaseX mode. Also, some special handling is required to ensure
that 1000BaseT auto-negotiation is enabled properly when desired.
Based on existing handling in the AMD xgbe driver and the information in
the Finisar FAQ:
https://www.finisar.com/sites/default/files/resources/an-2036_1000base-t_sfp_faqreve1.pdf
Signed-off-by: Robert Hancock <robert.hancock@calian.com>
Reviewed-by: Russell King <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/20201028171540.1700032-1-robert.hancock@calian.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
DSA assumes that a bridge which has vlan filtering disabled is not
vlan aware, and ignores all vlan configuration. However, the kernel
software bridge code allows configuration in this state.
This causes the kernel's idea of the bridge vlan state and the
hardware state to disagree, so "bridge vlan show" indicates a correct
configuration but the hardware lacks all configuration. Even worse,
enabling vlan filtering on a DSA bridge immediately blocks all traffic
which, given the output of "bridge vlan show", is very confusing.
Allow the VLAN configuration to be updated on Marvell DSA bridges,
otherwise we end up cutting all traffic when enabling vlan filtering.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Tested-by: Vladimir Oltean <olteanv@gmail.com>
Link: https://lore.kernel.org/r/E1kYAU3-00071C-1G@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Fixed spelling in comment like below:
s/defalut/default/p
This is in linux-next.
Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20201029095525.20200-1-unixbhaskar@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Refactor phy_led_trigger_register() and deduplicate its functionality
when registering LED trigger for link.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20201027182146.21355-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This patch enables the HW LPI Timer which controls the automatic entry
and exit of the LPI state.
The EEE LPI timer value is configured through ethtool. The driver will
auto select the LPI HW timer if the value in the HW timer supported range.
Else, the driver will fallback to SW timer.
Signed-off-by: Vineetha G. Jaya Kumaran <vineetha.g.jaya.kumaran@intel.com>
Signed-off-by: Voon Weifeng <weifeng.voon@intel.com>
Link: https://lore.kernel.org/r/20201027160051.22898-1-weifeng.voon@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org:/pub/scm/linux/kernel/git/arnd/playground
Arnd Bergmann says:
====================
wimax: move to staging
After I sent a fix for what appeared to be a harmless warning in
the wimax user interface code, the conclusion was that the whole
thing has most likely not been used in a very long time, and the
user interface possibly been broken since b61a5eea5904 ("wimax: use
genl_register_family_with_ops()").
Using a shared branch between net-next and staging should help
coordinate patches getting submitted against it.
====================
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The 'break' is unnecessary because of previous 'return',
and we could discard it.
Signed-off-by: Zhang Qilong <zhangqilong3@huawei.com>
Link: https://lore.kernel.org/r/20201027135159.71444-1-zhangqilong3@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Unify the set of information returned by mii_ethtool_get_link_ksettings(),
mii_ethtool_gset() and phy_ethtool_ksettings_get(). Make the mii_*()
functions report advertised settings when autonegotiation if disabled.
Suggested-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Łukasz Stelmach <l.stelmach@samsung.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20201027114317.8259-1-l.stelmach@samsung.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Current release regressions:
- r8169: fix forced threading conflicting with other shared
interrupts; we tried to fix the use of raise_softirq_irqoff from an
IRQ handler on RT by forcing hard irqs, but this driver shares
legacy PCI IRQs so drop the _irqoff() instead
- tipc: fix memory leak caused by a recent syzbot report fix to
tipc_buf_append()
Current release - bugs in new features:
- devlink: Unlock on error in dumpit() and fix some error codes
- net/smc: fix null pointer dereference in smc_listen_decline()
Previous release - regressions:
- tcp: Prevent low rmem stalls with SO_RCVLOWAT.
- net: protect tcf_block_unbind with block lock
- ibmveth: Fix use of ibmveth in a bridge; the self-imposed filtering
to only send legal frames to the hypervisor was too strict
- net: hns3: Clear the CMDQ registers before unmapping BAR region;
incorrect cleanup order was leading to a crash
- bnxt_en - handful of fixes to fixes:
- Send HWRM_FUNC_RESET fw command unconditionally, even if there
are PCIe errors being reported
- Check abort error state in bnxt_open_nic().
- Invoke cancel_delayed_work_sync() for PFs also.
- Fix regression in workqueue cleanup logic in bnxt_remove_one().
- mlxsw: Only advertise link modes supported by both driver and
device, after removal of 56G support from the driver 56G was not
cleared from advertised modes
- net/smc: fix suppressed return code
Previous release - always broken:
- netem: fix zero division in tabledist, caused by integer overflow
- bnxt_en: Re-write PCI BARs after PCI fatal error.
- cxgb4: set up filter action after rewrites
- net: ipa: command payloads already mapped
Misc:
- s390/ism: fix incorrect system EID, it's okay to change since it
was added in current release
- vsock: use ns_capable_noaudit() on socket create to suppress false
positive audit messages"
* tag 'net-5.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (36 commits)
r8169: fix issue with forced threading in combination with shared interrupts
netem: fix zero division in tabledist
ibmvnic: fix ibmvnic_set_mac
mptcp: add missing memory scheduling in the rx path
tipc: fix memory leak caused by tipc_buf_append()
gtp: fix an use-before-init in gtp_newlink()
net: protect tcf_block_unbind with block lock
ibmveth: Fix use of ibmveth in a bridge.
net/sched: act_mpls: Add softdep on mpls_gso.ko
ravb: Fix bit fields checking in ravb_hwtstamp_get()
devlink: Unlock on error in dumpit()
devlink: Fix some error codes
chelsio/chtls: fix memory leaks in CPL handlers
chelsio/chtls: fix deadlock issue
net: hns3: Clear the CMDQ registers before unmapping BAR region
bnxt_en: Send HWRM_FUNC_RESET fw command unconditionally.
bnxt_en: Check abort error state in bnxt_open_nic().
bnxt_en: Re-write PCI BARs after PCI fatal error.
bnxt_en: Invoke cancel_delayed_work_sync() for PFs also.
bnxt_en: Fix regression in workqueue cleanup logic in bnxt_remove_one().
...
|
|
Pull rdma fixes from Jason Gunthorpe:
"The good news is people are testing rc1 in the RDMA world - the bad
news is testing of the for-next area is not as good as I had hoped, as
we really should have caught at least the rdma_connect_locked() issue
before now.
Notable merge window regressions that didn't get caught/fixed in time
for rc1:
- Fix in kernel users of rxe, they were broken by the rapid fix to
undo the uABI breakage in rxe from another patch
- EFA userspace needs to read the GID table but was broken with the
new GID table logic
- Fix user triggerable deadlock in mlx5 using devlink reload
- Fix deadlock in several ULPs using rdma_connect from the CM handler
callbacks
- Memory leak in qedr"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
RDMA/qedr: Fix memory leak in iWARP CM
RDMA: Add rdma_connect_locked()
RDMA/uverbs: Fix false error in query gid IOCTL
RDMA/mlx5: Fix devlink deadlock on net namespace deletion
RDMA/rxe: Fix small problem in network_type patch
|
|
As reported by Serge flag IRQF_NO_THREAD causes an error if the
interrupt is actually shared and the other driver(s) don't have this
flag set. This situation can occur if a PCI(e) legacy interrupt is
used in combination with forced threading.
There's no good way to deal with this properly, therefore we have to
remove flag IRQF_NO_THREAD. For fixing the original forced threading
issue switch to napi_schedule().
Fixes: 424a646e072a ("r8169: fix operation under forced interrupt threading")
Link: https://www.spinics.net/lists/netdev/msg694960.html
Reported-by: Serge Belyshev <belyshev@depni.sinp.msu.ru>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Tested-by: Serge Belyshev <belyshev@depni.sinp.msu.ru>
Link: https://lore.kernel.org/r/b5b53bfe-35ac-3768-85bf-74d1290cf394@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Jakub Kicinski brought up a concern in ibmvnic_set_mac().
ibmvnic_set_mac() does this:
ether_addr_copy(adapter->mac_addr, addr->sa_data);
if (adapter->state != VNIC_PROBED)
rc = __ibmvnic_set_mac(netdev, addr->sa_data);
So if state == VNIC_PROBED, the user can assign an invalid address to
adapter->mac_addr, and ibmvnic_set_mac() will still return 0.
The fix is to validate ethernet address at the beginning of
ibmvnic_set_mac(), and move the ether_addr_copy to
the case of "adapter->state != VNIC_PROBED".
Fixes: c26eba03e407 ("ibmvnic: Update reset infrastructure to support tunable parameters")
Signed-off-by: Lijun Pan <ljp@linux.ibm.com>
Link: https://lore.kernel.org/r/20201027220456.71450-1-ljp@linux.ibm.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
There are no known users of this driver as of October 2020, and it will
be removed unless someone turns out to still need it in future releases.
According to https://en.wikipedia.org/wiki/List_of_WiMAX_networks, there
have been many public wimax networks, but it appears that many of these
have migrated to LTE or discontinued their service altogether.
As most PCs and phones lack WiMAX hardware support, the remaining
networks tend to use standalone routers. These almost certainly
run Linux, but not a modern kernel or the mainline wimax driver stack.
NetworkManager appears to have dropped userspace support in 2015
https://bugzilla.gnome.org/show_bug.cgi?id=747846, the
www.linuxwimax.org
site had already shut down earlier.
WiMax is apparently still being deployed on airport campus networks
("AeroMACS"), but in a frequency band that was not supported by the old
Intel 2400m (used in Sandy Bridge laptops and earlier), which is the
only driver using the kernel's wimax stack.
Move all files into drivers/staging/wimax, including the uapi header
files and documentation, to make it easier to remove it when it gets
to that. Only minimal changes are made to the source files, in order
to make it possible to port patches across the move.
Also remove the MAINTAINERS entry that refers to a broken mailing
list and website.
Acked-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-By: Inaky Perez-Gonzalez <inaky.perez-gonzalez@intel.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Suggested-by: Inaky Perez-Gonzalez <inaky.perez-gonzalez@intel.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
*_pdp_find() from gtp_encap_recv() would trigger a crash when a peer
sends GTP packets while creating new GTP device.
RIP: 0010:gtp1_pdp_find.isra.0+0x68/0x90 [gtp]
<SNIP>
Call Trace:
<IRQ>
gtp_encap_recv+0xc2/0x2e0 [gtp]
? gtp1_pdp_find.isra.0+0x90/0x90 [gtp]
udp_queue_rcv_one_skb+0x1fe/0x530
udp_queue_rcv_skb+0x40/0x1b0
udp_unicast_rcv_skb.isra.0+0x78/0x90
__udp4_lib_rcv+0x5af/0xc70
udp_rcv+0x1a/0x20
ip_protocol_deliver_rcu+0xc5/0x1b0
ip_local_deliver_finish+0x48/0x50
ip_local_deliver+0xe5/0xf0
? ip_protocol_deliver_rcu+0x1b0/0x1b0
gtp_encap_enable() should be called after gtp_hastable_new() otherwise
*_pdp_find() will access the uninitialized hash table.
Fixes: 1e3a3abd8b28 ("gtp: make GTP sockets in gtp_newlink optional")
Signed-off-by: Masahiro Fujiwara <fujiwara.masahiro@gmail.com>
Link: https://lore.kernel.org/r/20201027114846.3924-1-fujiwara.masahiro@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Fixes memory leak in iWARP CM
Fixes: e411e0587e0d ("RDMA/qedr: Add iWARP connection management functions")
Link: https://lore.kernel.org/r/20201021115008.28138-1-palok@marvell.com
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Alok Prasad <palok@marvell.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
There are two flows for handling RDMA_CM_EVENT_ROUTE_RESOLVED, either the
handler triggers a completion and another thread does rdma_connect() or
the handler directly calls rdma_connect().
In all cases rdma_connect() needs to hold the handler_mutex, but when
handler's are invoked this is already held by the core code. This causes
ULPs using the 2nd method to deadlock.
Provide a rdma_connect_locked() and have all ULPs call it from their
handlers.
Link: https://lore.kernel.org/r/0-v2-53c22d5c1405+33-rdma_connect_locking_jgg@nvidia.com
Reported-and-tested-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
Fixes: 2a7cec538169 ("RDMA/cma: Fix locking for the RDMA_CM_CONNECT state")
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Acked-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
The Xilinx PCS/PMA PHY requires that BMCR_ISOLATE be disabled for proper
operation in 1000BaseX mode. It should be safe to ensure this bit is
disabled in phylink_mii_c22_pcs_config in all cases.
Signed-off-by: Robert Hancock <robert.hancock@calian.com>
Reviewed-by: Russell King <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/20201026175802.1332477-1-robert.hancock@calian.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The check for src mac address in ibmveth_is_packet_unsupported is wrong.
Commit 6f2275433a2f wanted to shut down messages for loopback packets,
but now suppresses bridged frames, which are accepted by the hypervisor
otherwise bridging won't work at all.
Fixes: 6f2275433a2f ("ibmveth: Detect unsupported packets before sending to the hypervisor")
Signed-off-by: Michal Suchanek <msuchanek@suse.de>
Signed-off-by: Thomas Bogendoerfer <tbogendoerfer@suse.de>
Link: https://lore.kernel.org/r/20201026104221.26570-1-msuchanek@suse.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
In the function ravb_hwtstamp_get() in ravb_main.c with the existing
values for RAVB_RXTSTAMP_TYPE_V2_L2_EVENT (0x2) and RAVB_RXTSTAMP_TYPE_ALL
(0x6)
if (priv->tstamp_rx_ctrl & RAVB_RXTSTAMP_TYPE_V2_L2_EVENT)
config.rx_filter = HWTSTAMP_FILTER_PTP_V2_L2_EVENT;
else if (priv->tstamp_rx_ctrl & RAVB_RXTSTAMP_TYPE_ALL)
config.rx_filter = HWTSTAMP_FILTER_ALL;
if the test on RAVB_RXTSTAMP_TYPE_ALL should be true,
it will never be reached.
This issue can be verified with 'hwtstamp_config' testing program
(tools/testing/selftests/net/hwtstamp_config.c). Setting filter type
to ALL and subsequent retrieving it gives incorrect value:
$ hwtstamp_config eth0 OFF ALL
flags = 0
tx_type = OFF
rx_filter = ALL
$ hwtstamp_config eth0
flags = 0
tx_type = OFF
rx_filter = PTP_V2_L2_EVENT
Correct this by converting if-else's to switch.
Fixes: c156633f1353 ("Renesas Ethernet AVB driver proper")
Reported-by: Julia Lawall <julia.lawall@inria.fr>
Signed-off-by: Andrew Gabbasov <andrew_gabbasov@mentor.com>
Reviewed-by: Sergei Shtylyov <sergei.shtylyov@gmail.com>
Link: https://lore.kernel.org/r/20201026102130.29368-1-andrew_gabbasov@mentor.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
CPL handler functions chtls_pass_open_rpl() and
chtls_close_listsrv_rpl() should return CPL_RET_BUF_DONE
so that caller function will do skb free to avoid leak.
Fixes: cc35c88ae4db ("crypto : chtls - CPL handler definition")
Signed-off-by: Vinay Kumar Yadav <vinay.yadav@chelsio.com>
Link: https://lore.kernel.org/r/20201025194228.31271-1-vinay.yadav@chelsio.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
In chtls_pass_establish() we hold child socket lock using bh_lock_sock
and we are again trying bh_lock_sock in add_to_reap_list, causing deadlock.
Remove bh_lock_sock in add_to_reap_list() as lock is already held.
Fixes: cc35c88ae4db ("crypto : chtls - CPL handler definition")
Signed-off-by: Vinay Kumar Yadav <vinay.yadav@chelsio.com>
Link: https://lore.kernel.org/r/20201025193538.31112-1-vinay.yadav@chelsio.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Remove unneeded variable ret used to store return value.
Signed-off-by: Xu Wang <vulab@iscas.ac.cn>
Link: https://lore.kernel.org/r/20201023092107.28065-1-vulab@iscas.ac.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Remove unnecessary cast in the argument to kfree.
Signed-off-by: Xu Wang <vulab@iscas.ac.cn>
Link: https://lore.kernel.org/r/20201023085533.4792-1-vulab@iscas.ac.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Thomas Gleixner:
"A couple of x86 fixes which missed rc1 due to my stupidity:
- Drop lazy TLB mode before switching to the temporary address space
for text patching.
text_poke() switches to the temporary mm which clears the lazy mode
and restores the original mm afterwards. Due to clearing lazy mode
this might restore a already dead mm if exit_mmap() runs in
parallel on another CPU.
- Document the x32 syscall design fail vs. syscall numbers 512-547
properly.
- Fix the ORC unwinder to handle the inactive task frame correctly.
This was unearthed due to the slightly different code generation of
gcc-10.
- Use an up to date screen_info for the boot params of kexec instead
of the possibly stale and invalid version which happened to be
valid when the kexec kernel was loaded"
* tag 'x86-urgent-2020-10-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/alternative: Don't call text_poke() in lazy TLB mode
x86/syscalls: Document the fact that syscalls 512-547 are a legacy mistake
x86/unwind/orc: Fix inactive tasks with stack pointer in %sp on GCC 10 compiled kernels
hyperv_fb: Update screen_info after removing old framebuffer
x86/kexec: Use up-to-dated screen_info copy to fill boot params
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux
Pull devicetree fixes from Rob Herring:
- More binding additionalProperties/unevaluatedProperties additions
- More yamllint fixes on additions in the merge window
- CrOS embedded controller schema updates to fix warnings
- LEDs schema update adding ID_RGB
- A reserved-memory fix for regions starting at address 0x0
* tag 'devicetree-fixes-for-5.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
dt-bindings: Another round of adding missing 'additionalProperties/unevalutatedProperties'
dt-bindings: Explicitly allow additional properties in board/SoC schemas
dt-bindings: More whitespace clean-ups in schema files
mfd: google,cros-ec: add missing properties
dt-bindings: input: convert cros-ec-keyb to json-schema
dt-bindings: i2c: convert i2c-cros-ec-tunnel to json-schema
of: Fix reserved-memory overlap detection
dt-bindings: mailbox: mtk-gce: fix incorrect mbox-cells value
dt-bindings: leds: Update devicetree documents for ID_RGB
|
|
When unbinding the hns3 driver with the HNS3 VF, I got the following
kernel panic:
[ 265.709989] Unable to handle kernel paging request at virtual address ffff800054627000
[ 265.717928] Mem abort info:
[ 265.720740] ESR = 0x96000047
[ 265.723810] EC = 0x25: DABT (current EL), IL = 32 bits
[ 265.729126] SET = 0, FnV = 0
[ 265.732195] EA = 0, S1PTW = 0
[ 265.735351] Data abort info:
[ 265.738227] ISV = 0, ISS = 0x00000047
[ 265.742071] CM = 0, WnR = 1
[ 265.745055] swapper pgtable: 4k pages, 48-bit VAs, pgdp=0000000009b54000
[ 265.751753] [ffff800054627000] pgd=0000202ffffff003, p4d=0000202ffffff003, pud=00002020020eb003, pmd=00000020a0dfc003, pte=0000000000000000
[ 265.764314] Internal error: Oops: 96000047 [#1] SMP
[ 265.830357] CPU: 61 PID: 20319 Comm: bash Not tainted 5.9.0+ #206
[ 265.836423] Hardware name: Huawei TaiShan 2280 V2/BC82AMDDA, BIOS 1.05 09/18/2019
[ 265.843873] pstate: 80400009 (Nzcv daif +PAN -UAO -TCO BTYPE=--)
[ 265.843890] pc : hclgevf_cmd_uninit+0xbc/0x300
[ 265.861988] lr : hclgevf_cmd_uninit+0xb0/0x300
[ 265.861992] sp : ffff80004c983b50
[ 265.881411] pmr_save: 000000e0
[ 265.884453] x29: ffff80004c983b50 x28: ffff20280bbce500
[ 265.889744] x27: 0000000000000000 x26: 0000000000000000
[ 265.895034] x25: ffff800011a1f000 x24: ffff800011a1fe90
[ 265.900325] x23: ffff0020ce9b00d8 x22: ffff0020ce9b0150
[ 265.905616] x21: ffff800010d70e90 x20: ffff800010d70e90
[ 265.910906] x19: ffff0020ce9b0080 x18: 0000000000000004
[ 265.916198] x17: 0000000000000000 x16: ffff800011ae32e8
[ 265.916201] x15: 0000000000000028 x14: 0000000000000002
[ 265.916204] x13: ffff800011ae32e8 x12: 0000000000012ad8
[ 265.946619] x11: ffff80004c983b50 x10: 0000000000000000
[ 265.951911] x9 : ffff8000115d0888 x8 : 0000000000000000
[ 265.951914] x7 : ffff800011890b20 x6 : c0000000ffff7fff
[ 265.951917] x5 : ffff80004c983930 x4 : 0000000000000001
[ 265.951919] x3 : ffffa027eec1b000 x2 : 2b78ccbbff369100
[ 265.964487] x1 : 0000000000000000 x0 : ffff800054627000
[ 265.964491] Call trace:
[ 265.964494] hclgevf_cmd_uninit+0xbc/0x300
[ 265.964496] hclgevf_uninit_ae_dev+0x9c/0xe8
[ 265.964501] hnae3_unregister_ae_dev+0xb0/0x130
[ 265.964516] hns3_remove+0x34/0x88 [hns3]
[ 266.009683] pci_device_remove+0x48/0xf0
[ 266.009692] device_release_driver_internal+0x114/0x1e8
[ 266.030058] device_driver_detach+0x28/0x38
[ 266.034224] unbind_store+0xd4/0x108
[ 266.037784] drv_attr_store+0x40/0x58
[ 266.041435] sysfs_kf_write+0x54/0x80
[ 266.045081] kernfs_fop_write+0x12c/0x250
[ 266.049076] vfs_write+0xc4/0x248
[ 266.052378] ksys_write+0x74/0xf8
[ 266.055677] __arm64_sys_write+0x24/0x30
[ 266.059584] el0_svc_common.constprop.3+0x84/0x270
[ 266.064354] do_el0_svc+0x34/0xa0
[ 266.067658] el0_svc+0x38/0x40
[ 266.070700] el0_sync_handler+0x8c/0xb0
[ 266.074519] el0_sync+0x140/0x180
It looks like the BAR memory region had already been unmapped before we
start clearing CMDQ registers in it, which is pretty bad and the kernel
happily kills itself because of a Current EL Data Abort (on arm64).
Moving the CMDQ uninitialization a bit early fixes the issue for me.
Fixes: 862d969a3a4d ("net: hns3: do VF's pci re-initialization while PF doing FLR")
Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Link: https://lore.kernel.org/r/20201023051550.793-1-yuzenghui@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
In the AER or firmware reset flow, if we are in fatal error state or
if pci_channel_offline() is true, we don't send any commands to the
firmware because the commands will likely not reach the firmware and
most commands don't matter much because the firmware is likely to be
reset imminently.
However, the HWRM_FUNC_RESET command is different and we should always
attempt to send it. In the AER flow for example, the .slot_reset()
call will trigger this fw command and we need to try to send it to
effect the proper reset.
Fixes: b340dc680ed4 ("bnxt_en: Avoid sending firmware messages when AER error is detected.")
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
bnxt_open_nic() is called during configuration changes that require
the NIC to be closed and then opened. This call is protected by
rtnl_lock. Firmware reset can be happening at the same time. Only
critical portions of the entire firmware reset sequence are protected
by the rtnl_lock. It is possible that bnxt_open_nic() can be called
when the firmware reset sequence is aborting. In that case,
bnxt_open_nic() needs to check if the ABORT_ERR flag is set and
abort if it is. The configuration change that resulted in the
bnxt_open_nic() call will fail but the NIC will be brought to a
consistent IF_DOWN state.
Without this patch, if bnxt_open_nic() were to continue in this error
state, it may crash like this:
[ 1648.659736] BUG: unable to handle kernel NULL pointer dereference at (null)
[ 1648.659768] IP: [<ffffffffc01e9b3a>] bnxt_alloc_mem+0x50a/0x1140 [bnxt_en]
[ 1648.659796] PGD 101e1b3067 PUD 101e1b2067 PMD 0
[ 1648.659813] Oops: 0000 [#1] SMP
[ 1648.659825] Modules linked in: xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 tun bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter sunrpc dell_smbios dell_wmi_descriptor dcdbas amd64_edac_mod edac_mce_amd kvm_amd kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper vfat cryptd fat pcspkr ipmi_ssif sg k10temp i2c_piix4 wmi ipmi_si ipmi_devintf ipmi_msghandler tpm_crb acpi_power_meter sch_fq_codel ip_tables xfs libcrc32c sd_mod crc_t10dif crct10dif_generic mgag200 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm ahci drm libahci megaraid_sas crct10dif_pclmul crct10dif_common
[ 1648.660063] tg3 libata crc32c_intel bnxt_en(OE) drm_panel_orientation_quirks devlink ptp pps_core dm_mirror dm_region_hash dm_log dm_mod fuse
[ 1648.660105] CPU: 13 PID: 3867 Comm: ethtool Kdump: loaded Tainted: G OE ------------ 3.10.0-1152.el7.x86_64 #1
[ 1648.660911] Hardware name: Dell Inc. PowerEdge R7515/0R4CNN, BIOS 1.2.14 01/28/2020
[ 1648.661662] task: ffff94e64cbc9080 ti: ffff94f55df1c000 task.ti: ffff94f55df1c000
[ 1648.662409] RIP: 0010:[<ffffffffc01e9b3a>] [<ffffffffc01e9b3a>] bnxt_alloc_mem+0x50a/0x1140 [bnxt_en]
[ 1648.663171] RSP: 0018:ffff94f55df1fba8 EFLAGS: 00010202
[ 1648.663927] RAX: 0000000000000000 RBX: ffff94e6827e0000 RCX: 0000000000000000
[ 1648.664684] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff94e6827e08c0
[ 1648.665433] RBP: ffff94f55df1fc20 R08: 00000000000001ff R09: 0000000000000008
[ 1648.666184] R10: 0000000000000d53 R11: ffff94f55df1f7ce R12: ffff94e6827e08c0
[ 1648.666940] R13: ffff94e6827e08c0 R14: ffff94e6827e08c0 R15: ffffffffb9115e40
[ 1648.667695] FS: 00007f8aadba5740(0000) GS:ffff94f57eb40000(0000) knlGS:0000000000000000
[ 1648.668447] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1648.669202] CR2: 0000000000000000 CR3: 0000001022772000 CR4: 0000000000340fe0
[ 1648.669966] Call Trace:
[ 1648.670730] [<ffffffffc01f1d5d>] ? bnxt_need_reserve_rings+0x9d/0x170 [bnxt_en]
[ 1648.671496] [<ffffffffc01fa7ea>] __bnxt_open_nic+0x8a/0x9a0 [bnxt_en]
[ 1648.672263] [<ffffffffc01f7479>] ? bnxt_close_nic+0x59/0x1b0 [bnxt_en]
[ 1648.673031] [<ffffffffc01fb11b>] bnxt_open_nic+0x1b/0x50 [bnxt_en]
[ 1648.673793] [<ffffffffc020037c>] bnxt_set_ringparam+0x6c/0xa0 [bnxt_en]
[ 1648.674550] [<ffffffffb8a5f564>] dev_ethtool+0x1334/0x21a0
[ 1648.675306] [<ffffffffb8a719ff>] dev_ioctl+0x1ef/0x5f0
[ 1648.676061] [<ffffffffb8a324bd>] sock_do_ioctl+0x4d/0x60
[ 1648.676810] [<ffffffffb8a326bb>] sock_ioctl+0x1eb/0x2d0
[ 1648.677548] [<ffffffffb8663230>] do_vfs_ioctl+0x3a0/0x5b0
[ 1648.678282] [<ffffffffb8b8e678>] ? __do_page_fault+0x238/0x500
[ 1648.679016] [<ffffffffb86634e1>] SyS_ioctl+0xa1/0xc0
[ 1648.679745] [<ffffffffb8b93f92>] system_call_fastpath+0x25/0x2a
[ 1648.680461] Code: 9e 60 01 00 00 0f 1f 40 00 45 8b 8e 48 01 00 00 31 c9 45 85 c9 0f 8e 73 01 00 00 66 0f 1f 44 00 00 49 8b 86 a8 00 00 00 48 63 d1 <48> 8b 14 d0 48 85 d2 0f 84 46 01 00 00 41 8b 86 44 01 00 00 c7
[ 1648.681986] RIP [<ffffffffc01e9b3a>] bnxt_alloc_mem+0x50a/0x1140 [bnxt_en]
[ 1648.682724] RSP <ffff94f55df1fba8>
[ 1648.683451] CR2: 0000000000000000
Fixes: ec5d31e3c15d ("bnxt_en: Handle firmware reset status during IF_UP.")
Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When a PCIe fatal error occurs, the internal latched BAR addresses
in the chip get reset even though the BAR register values in config
space are retained.
pci_restore_state() will not rewrite the BAR addresses if the
BAR address values are valid, causing the chip's internal BAR addresses
to stay invalid. So we need to zero the BAR registers during PCIe fatal
error to force pci_restore_state() to restore the BAR addresses. These
write cycles to the BAR registers will cause the proper BAR addresses to
latch internally.
Fixes: 6316ea6db93d ("bnxt_en: Enable AER support.")
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
As part of the commit b148bb238c02
("bnxt_en: Fix possible crash in bnxt_fw_reset_task()."),
cancel_delayed_work_sync() is called only for VFs to fix a possible
crash by cancelling any pending delayed work items. It was assumed
by mistake that the flush_workqueue() call on the PF would flush
delayed work items as well.
As flush_workqueue() does not cancel the delayed workqueue, extend
the fix for PFs. This fix will avoid the system crash, if there are
any pending delayed work items in fw_reset_task() during driver's
.remove() call.
Unify the workqueue cleanup logic for both PF and VF by calling
cancel_work_sync() and cancel_delayed_work_sync() directly in
bnxt_remove_one().
Fixes: b148bb238c02 ("bnxt_en: Fix possible crash in bnxt_fw_reset_task().")
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Reviewed-by: Andy Gospodarek <gospo@broadcom.com>
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
A recent patch has moved the workqueue cleanup logic before
calling unregister_netdev() in bnxt_remove_one(). This caused a
regression because the workqueue can be restarted if the device is
still open. Workqueue cleanup must be done after unregister_netdev().
The workqueue will not restart itself after the device is closed.
Call bnxt_cancel_sp_work() after unregister_netdev() and
call bnxt_dl_fw_reporters_destroy() after that. This fixes the
regession and the original NULL ptr dereference issue.
Fixes: b16939b59cc0 ("bnxt_en: Fix NULL ptr dereference crash in bnxt_fw_reset_task()")
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Each EMAD transaction stores the skb used to issue the EMAD request
('trans->tx_skb') so that the request could be retried in case of a
timeout. The skb can be freed when a corresponding response is received
or as part of the retry logic (e.g., failed retransmit, exceeded maximum
number of retries).
The two tasks (i.e., response processing and retransmits) are
synchronized by the atomic 'trans->active' field which ensures that
responses to inactive transactions are ignored.
In case of a failed retransmit the transaction is finished and all of
its resources are freed. However, the current code does not mark it as
inactive. Syzkaller was able to hit a race condition in which a
concurrent response is processed while the transaction's resources are
being freed, resulting in a use-after-free [1].
Fix the issue by making sure to mark the transaction as inactive after a
failed retransmit and free its resources only if a concurrent task did
not already do that.
[1]
BUG: KASAN: use-after-free in consume_skb+0x30/0x370
net/core/skbuff.c:833
Read of size 4 at addr ffff88804f570494 by task syz-executor.0/1004
CPU: 0 PID: 1004 Comm: syz-executor.0 Not tainted 5.8.0-rc7+ #68
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0xf6/0x16e lib/dump_stack.c:118
print_address_description.constprop.0+0x1c/0x250
mm/kasan/report.c:383
__kasan_report mm/kasan/report.c:513 [inline]
kasan_report.cold+0x1f/0x37 mm/kasan/report.c:530
check_memory_region_inline mm/kasan/generic.c:186 [inline]
check_memory_region+0x14e/0x1b0 mm/kasan/generic.c:192
instrument_atomic_read include/linux/instrumented.h:56 [inline]
atomic_read include/asm-generic/atomic-instrumented.h:27 [inline]
refcount_read include/linux/refcount.h:147 [inline]
skb_unref include/linux/skbuff.h:1044 [inline]
consume_skb+0x30/0x370 net/core/skbuff.c:833
mlxsw_emad_trans_finish+0x64/0x1c0 drivers/net/ethernet/mellanox/mlxsw/core.c:592
mlxsw_emad_process_response drivers/net/ethernet/mellanox/mlxsw/core.c:651 [inline]
mlxsw_emad_rx_listener_func+0x5c9/0xac0 drivers/net/ethernet/mellanox/mlxsw/core.c:672
mlxsw_core_skb_receive+0x4df/0x770 drivers/net/ethernet/mellanox/mlxsw/core.c:2063
mlxsw_pci_cqe_rdq_handle drivers/net/ethernet/mellanox/mlxsw/pci.c:595 [inline]
mlxsw_pci_cq_tasklet+0x12a6/0x2520 drivers/net/ethernet/mellanox/mlxsw/pci.c:651
tasklet_action_common.isra.0+0x13f/0x3e0 kernel/softirq.c:550
__do_softirq+0x223/0x964 kernel/softirq.c:292
asm_call_on_stack+0x12/0x20 arch/x86/entry/entry_64.S:711
Allocated by task 1006:
save_stack+0x1b/0x40 mm/kasan/common.c:48
set_track mm/kasan/common.c:56 [inline]
__kasan_kmalloc mm/kasan/common.c:494 [inline]
__kasan_kmalloc.constprop.0+0xc2/0xd0 mm/kasan/common.c:467
slab_post_alloc_hook mm/slab.h:586 [inline]
slab_alloc_node mm/slub.c:2824 [inline]
slab_alloc mm/slub.c:2832 [inline]
kmem_cache_alloc+0xcd/0x2e0 mm/slub.c:2837
__build_skb+0x21/0x60 net/core/skbuff.c:311
__netdev_alloc_skb+0x1e2/0x360 net/core/skbuff.c:464
netdev_alloc_skb include/linux/skbuff.h:2810 [inline]
mlxsw_emad_alloc drivers/net/ethernet/mellanox/mlxsw/core.c:756 [inline]
mlxsw_emad_reg_access drivers/net/ethernet/mellanox/mlxsw/core.c:787 [inline]
mlxsw_core_reg_access_emad+0x1ab/0x1420 drivers/net/ethernet/mellanox/mlxsw/core.c:1817
mlxsw_reg_trans_query+0x39/0x50 drivers/net/ethernet/mellanox/mlxsw/core.c:1831
mlxsw_sp_sb_pm_occ_clear drivers/net/ethernet/mellanox/mlxsw/spectrum_buffers.c:260 [inline]
mlxsw_sp_sb_occ_max_clear+0xbff/0x10a0 drivers/net/ethernet/mellanox/mlxsw/spectrum_buffers.c:1365
mlxsw_devlink_sb_occ_max_clear+0x76/0xb0 drivers/net/ethernet/mellanox/mlxsw/core.c:1037
devlink_nl_cmd_sb_occ_max_clear_doit+0x1ec/0x280 net/core/devlink.c:1765
genl_family_rcv_msg_doit net/netlink/genetlink.c:669 [inline]
genl_family_rcv_msg net/netlink/genetlink.c:714 [inline]
genl_rcv_msg+0x617/0x980 net/netlink/genetlink.c:731
netlink_rcv_skb+0x152/0x440 net/netlink/af_netlink.c:2470
genl_rcv+0x24/0x40 net/netlink/genetlink.c:742
netlink_unicast_kernel net/netlink/af_netlink.c:1304 [inline]
netlink_unicast+0x53a/0x750 net/netlink/af_netlink.c:1330
netlink_sendmsg+0x850/0xd90 net/netlink/af_netlink.c:1919
sock_sendmsg_nosec net/socket.c:651 [inline]
sock_sendmsg+0x150/0x190 net/socket.c:671
____sys_sendmsg+0x6d8/0x840 net/socket.c:2359
___sys_sendmsg+0xff/0x170 net/socket.c:2413
__sys_sendmsg+0xe5/0x1b0 net/socket.c:2446
do_syscall_64+0x56/0xa0 arch/x86/entry/common.c:384
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Freed by task 73:
save_stack+0x1b/0x40 mm/kasan/common.c:48
set_track mm/kasan/common.c:56 [inline]
kasan_set_free_info mm/kasan/common.c:316 [inline]
__kasan_slab_free+0x12c/0x170 mm/kasan/common.c:455
slab_free_hook mm/slub.c:1474 [inline]
slab_free_freelist_hook mm/slub.c:1507 [inline]
slab_free mm/slub.c:3072 [inline]
kmem_cache_free+0xbe/0x380 mm/slub.c:3088
kfree_skbmem net/core/skbuff.c:622 [inline]
kfree_skbmem+0xef/0x1b0 net/core/skbuff.c:616
__kfree_skb net/core/skbuff.c:679 [inline]
consume_skb net/core/skbuff.c:837 [inline]
consume_skb+0xe1/0x370 net/core/skbuff.c:831
mlxsw_emad_trans_finish+0x64/0x1c0 drivers/net/ethernet/mellanox/mlxsw/core.c:592
mlxsw_emad_transmit_retry.isra.0+0x9d/0xc0 drivers/net/ethernet/mellanox/mlxsw/core.c:613
mlxsw_emad_trans_timeout_work+0x43/0x50 drivers/net/ethernet/mellanox/mlxsw/core.c:625
process_one_work+0xa3e/0x17a0 kernel/workqueue.c:2269
worker_thread+0x9e/0x1050 kernel/workqueue.c:2415
kthread+0x355/0x470 kernel/kthread.c:291
ret_from_fork+0x22/0x30 arch/x86/entry/entry_64.S:293
The buggy address belongs to the object at ffff88804f5703c0
which belongs to the cache skbuff_head_cache of size 224
The buggy address is located 212 bytes inside of
224-byte region [ffff88804f5703c0, ffff88804f5704a0)
The buggy address belongs to the page:
page:ffffea00013d5c00 refcount:1 mapcount:0 mapping:0000000000000000
index:0x0
flags: 0x100000000000200(slab)
raw: 0100000000000200 dead000000000100 dead000000000122 ffff88806c625400
raw: 0000000000000000 00000000000c000c 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected
Memory state around the buggy address:
ffff88804f570380: fc fc fc fc fc fc fc fc fb fb fb fb fb fb fb fb
ffff88804f570400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>ffff88804f570480: fb fb fb fb fc fc fc fc fc fc fc fc fc fc fc fc
^
ffff88804f570500: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
ffff88804f570580: 00 00 00 00 00 00 00 00 00 00 00 00 fc fc fc fc
Fixes: caf7297e7ab5f ("mlxsw: core: Introduce support for asynchronous EMAD register access")
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Free the devlink instance during the teardown sequence in the non-reload
case to avoid the following memory leak.
unreferenced object 0xffff888232895000 (size 2048):
comm "modprobe", pid 1073, jiffies 4295568857 (age 164.871s)
hex dump (first 32 bytes):
00 01 00 00 00 00 ad de 22 01 00 00 00 00 ad de ........".......
10 50 89 32 82 88 ff ff 10 50 89 32 82 88 ff ff .P.2.....P.2....
backtrace:
[<00000000c704e9a6>] __kmalloc+0x13a/0x2a0
[<00000000ee30129d>] devlink_alloc+0xff/0x760
[<0000000092ab3e5d>] 0xffffffffa042e5b0
[<000000004f3f8a31>] 0xffffffffa042f6ad
[<0000000092800b4b>] 0xffffffffa0491df3
[<00000000c4843903>] local_pci_probe+0xcb/0x170
[<000000006993ded7>] pci_device_probe+0x2c2/0x4e0
[<00000000a8e0de75>] really_probe+0x2c5/0xf90
[<00000000d42ba75d>] driver_probe_device+0x1eb/0x340
[<00000000bcc95e05>] device_driver_attach+0x294/0x300
[<000000000e2bc177>] __driver_attach+0x167/0x2f0
[<000000007d44cd6e>] bus_for_each_dev+0x148/0x1f0
[<000000003cd5a91e>] driver_attach+0x45/0x60
[<000000000041ce51>] bus_add_driver+0x3b8/0x720
[<00000000f5215476>] driver_register+0x230/0x4e0
[<00000000d79356f5>] __pci_register_driver+0x190/0x200
Fixes: a22712a96291 ("mlxsw: core: Fix devlink unregister flow")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reported-by: Vadim Pasternak <vadimp@nvidia.com>
Tested-by: Oleksandr Shamray <oleksandrs@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
During port creation the driver instructs the device to advertise all
the supported link modes queried from the device.
Since cited commit not all the link modes supported by the device are
supported by the driver. This can result in the device negotiating a
link mode that is not recognized by the driver causing ethtool to show
an unsupported speed:
$ ethtool swp1
...
Speed: Unknown!
This is especially problematic when the netdev is enslaved to a bond, as
the bond driver uses unknown speed as an indication that the link is
down:
[13048.900895] net_ratelimit: 86 callbacks suppressed
[13048.900902] t_bond0: (slave swp52): failed to get link speed/duplex
[13048.912160] t_bond0: (slave swp49): failed to get link speed/duplex
Fix this by making sure that only link modes that are supported by both
the device and the driver are advertised.
Fixes: b97cd891268d ("mlxsw: Remove 56G speed support")
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The system EID that is defined by the ISM driver is not correct. Using
an incorrect system EID allows to communicate with remote Linux systems
that use the same incorrect system EID, but when it comes to
interoperability with other operating systems then the system EIDs do
never match which prevents SMC-Dv2 communication.
Using the correct system EID fixes this problem.
Fixes: 201091ebb2a1 ("net/smc: introduce System Enterprise ID (SEID)")
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The current code sets up the filter action field before
rewrites are set up. When the action 'switch' is used
with rewrites, this may result in initial few packets
that get switched out don't have rewrites applied
on them.
So, make sure filter action is set up along with rewrites
or only after everything else is set up for rewrites.
Fixes: 12b276fbf6e0 ("cxgb4: add support to create hash filters")
Signed-off-by: Raju Rangoju <rajur@chelsio.com>
Link: https://lore.kernel.org/r/20201023115852.18262-1-rajur@chelsio.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Smatch complains that "ret" might be uninitialized if we don't enter
the loop. We do always enter the loop so it's a false positive, but
it's cleaner to just return a literal zero and that silences the
warning as well.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Link: https://lore.kernel.org/r/20201023112212.GA282278@mwanda
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The code to try to shut up sparse warnings about questionable locking
didn't shut up sparse: it made the result not parse as valid C at all,
since the end result now has a label with no statement.
The proper fix is to just always lock the hardware, the same way Bart
did in commit 8ae178760b23 ("scsi: qla2xxx: Simplify the functions for
dumping firmware"). That avoids the whole problem with having locking
that is not statically obvious.
But in the meantime, just remove the incorrect attempt at trying to
avoid a sparse warning that just made things worse.
This was exposed by commit 3e6efab865ac ("scsi: qla2xxx: Fix reset of
MPI firmware"), very similarly to how commit cbb01c2f2f63 ("scsi:
qla2xxx: Fix MPI failure AEN (8200) handling") exposed the same problem
in another place, and caused that commit 8ae178760b23.
Please don't add code to just shut up sparse without actually fixing
what sparse complains about.
Reported-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Cc: Bart Van Assche <bvanassche@acm.org>
Cc: Arun Easi <aeasi@marvell.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Some drivers (such as EFA) have a GID table, but aren't IB/RoCE devices.
Remove the unnecessary rdma_ib_or_roce() check.
This fixes rdma-core failures for EFA when it uses the new ioctl interface
for querying the GID table.
Fixes: 9f85cbe50aa0 ("RDMA/uverbs: Expose the new GID query API to user space")
Link: https://lore.kernel.org/r/20201026082621.32463-1-galpress@amazon.com
Signed-off-by: Gal Pressman <galpress@amazon.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
When a mlx5 core devlink instance is reloaded in different net namespace,
its associated IB device is deleted and recreated.
Example sequence is:
$ ip netns add foo
$ devlink dev reload pci/0000:00:08.0 netns foo
$ ip netns del foo
mlx5 IB device needs to attach and detach the netdevice to it through the
netdev notifier chain during load and unload sequence. A below call graph
of the unload flow.
cleanup_net()
down_read(&pernet_ops_rwsem); <- first sem acquired
ops_pre_exit_list()
pre_exit()
devlink_pernet_pre_exit()
devlink_reload()
mlx5_devlink_reload_down()
mlx5_unload_one()
[...]
mlx5_ib_remove()
mlx5_ib_unbind_slave_port()
mlx5_remove_netdev_notifier()
unregister_netdevice_notifier()
down_write(&pernet_ops_rwsem);<- recurrsive lock
Hence, when net namespace is deleted, mlx5 reload results in deadlock.
When deadlock occurs, devlink mutex is also held. This not only deadlocks
the mlx5 device under reload, but all the processes which attempt to
access unrelated devlink devices are deadlocked.
Hence, fix this by mlx5 ib driver to register for per net netdev notifier
instead of global one, which operats on the net namespace without holding
the pernet_ops_rwsem.
Fixes: 4383cfcc65e7 ("net/mlx5: Add devlink reload")
Link: https://lore.kernel.org/r/20201026134359.23150-1-parav@nvidia.com
Signed-off-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
The patch referenced below has a typo that results in using the wrong L2
header size for outbound traffic. (V4 <-> V6).
It also breaks kernel-side RC traffic because they use AVs that use
RDMA_NETWORK_XXX enums instead of RXE_NETWORK_TYPE_XXX enums. Fix this by
transcoding between these enum types.
Fixes: e0d696d201dd ("RDMA/rxe: Move the definitions for rxe_av.network_type to uAPI")
Link: https://lore.kernel.org/r/20201016211343.22906-1-rpearson@hpe.com
Signed-off-by: Bob Pearson <rpearson@hpe.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
The reserved-memory overlap detection code fails to detect overlaps if
either of the regions starts at address 0x0. The code explicitly checks
for and ignores such regions, apparently in order to ignore dynamically
allocated regions which have an address of 0x0 at this point. These
dynamically allocated regions also have a size of 0x0 at this point, so
fix this by removing the check and sorting the dynamically allocated
regions ahead of any static regions at address 0x0.
For example, there are two overlaps in this case but they are not
currently reported:
foo@0 {
reg = <0x0 0x2000>;
};
bar@0 {
reg = <0x0 0x1000>;
};
baz@1000 {
reg = <0x1000 0x1000>;
};
quux {
size = <0x1000>;
};
but they are after this patch:
OF: reserved mem: OVERLAP DETECTED!
bar@0 (0x00000000--0x00001000) overlaps with foo@0 (0x00000000--0x00002000)
OF: reserved mem: OVERLAP DETECTED!
foo@0 (0x00000000--0x00002000) overlaps with baz@1000 (0x00001000--0x00002000)
Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com>
Link: https://lore.kernel.org/r/ded6fd6b47b58741aabdcc6967f73eca6a3f311e.1603273666.git-series.vincent.whitchurch@axis.com
Signed-off-by: Rob Herring <robh@kernel.org>
|
|
Use a more generic form for __section that requires quotes to avoid
complications with clang and gcc differences.
Remove the quote operator # from compiler_attributes.h __section macro.
Convert all unquoted __section(foo) uses to quoted __section("foo").
Also convert __attribute__((section("foo"))) uses to __section("foo")
even if the __attribute__ has multiple list entry forms.
Conversion done using the script at:
https://lore.kernel.org/lkml/75393e5ddc272dc7403de74d645e6c6e0f4e70eb.camel@perches.com/2-convert_section.pl
Signed-off-by: Joe Perches <joe@perches.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@gooogle.com>
Reviewed-by: Miguel Ojeda <ojeda@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|