linux.git

Age	Commit message	Author
2021-04-16	Merge tag 'mlx5-updates-2021-04-16' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux	David S. Miller
	Saeed Mahameed says: ==================== mlx5-updates-2021-04-16 This patchset introduces updates to the mlx5e netdev driver. 1) Tariq refactors TLS offloads and adds resiliency against RX resync failures 2) Maxim reduces code duplication by unifying the channels reset flow regardless of whether channels are closed or open 3) Aya enhances TX/RX health reporters diagnostics to expose the internal clock time-stamping format 4) Moshe adds support for ethtool extended link state, to show the reason for link down ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-16	gianfar: Drop GFAR_MQ_POLLING support	Claudiu Manoil
	Gianfar used to enable all 8 Rx queues (DMA rings) per ethernet device, even though the controller can only support 2 interrupt lines at most. This meant that multiple Rx queues would have to be grouped per NAPI poll routine, and the CPU would have to split the budget and service them in a round robin manner. The overhead of this scheme proved to outweigh the potential benefits. The alternative was to introduce the "Single Queue" polling mode, supporting one Rx queue per NAPI, which became the default packet processing option and helped improve the performance of the driver. MQ_POLLING also relies on undocumented device tree properties to specify how to map the 8 Rx and Tx queues to a given interrupt line (aka "interrupt group"). Using module parameters to enable this mode wasn't an option either. Long story short, MQ_POLLING became obsolete, now it is just dead code, and no one asked for it so far. For the Tx queues, multi-queue support (more than 1 Tx queue per CPU) could be revisited by adding tc MQPRIO support, but again, one has to consider that there are only 2 interrupt lines. So the NAPI poll routine would have to service multiple Tx rings. Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-16	veth: check for NAPI instead of xdp_prog before xmit of XDP frame	Toke Høiland-Jørgensen
	The recent patch that tied enabling of veth NAPI to the GRO flag also has the nice side effect that a veth device can be the target of an XDP_REDIRECT without an XDP program needing to be loaded on the peer device. However, the patch adding this extra NAPI mode didn't actually change the check in veth_xdp_xmit() to also look at the new NAPI pointer, so let's fix that. Fixes: 6788fa154546 ("veth: allow enabling NAPI even without XDP") Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-16	net: ipa: optionally define firmware name via DT	Alex Elder
	IPA initialization includes loading some firmware. This step is done either by the modem or by the AP under Trust Zone. If the AP loads firmware, the name of the firmware file is currently hard-coded ("ipa_fws.mdt"). Add the ability to specify the relative path of the firmware file to use in a property in the Device Tree IPA node. If the property is not found (or if any other error occurs attempting to get it), fall back to using a default relative path. Use the "old" fixed name as the default. Rename the symbol that represents this default to emphasize its purpose. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
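The fallback described in this commit can be sketched as follows. This is a small Python model, not the driver code: the property name `firmware-name`, the dictionary standing in for the Device Tree node, and the constant name `IPA_FW_PATH_DEFAULT` are assumptions made for illustration. Only the default value "ipa_fws.mdt" comes from the commit message.

```python
# Default relative path; the commit message gives "ipa_fws.mdt" as the old fixed name.
# The constant name is hypothetical.
IPA_FW_PATH_DEFAULT = "ipa_fws.mdt"


def ipa_firmware_path(dt_node):
    """Return the firmware path from a DT-like mapping, or the default."""
    # "firmware-name" is an assumed property name, used only for this sketch.
    value = dt_node.get("firmware-name")
    # A missing property or any other lookup problem falls back to the default.
    if not isinstance(value, str) or not value:
        return IPA_FW_PATH_DEFAULT
    return value
```

Treating every lookup failure the same way keeps the caller simple: it always gets a usable path and never has to distinguish "not specified" from "could not be read".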
2021-04-16	virtio-net: page_to_skb() use build_skb when there's sufficient tailroom	Xuan Zhuo
	In page_to_skb(), if there is enough tailroom to hold skb_shared_info, we can use build_skb to create the skb directly, with no need to allocate additional space. It also saves a 'frags slot', which is very friendly to GRO. If the payload of the received packet is too small (less than GOOD_COPY_LEN), we still copy it directly into the space obtained from napi_alloc_skb, so these pages can be reused. Testing Machine: The four queues of the network card are bound to cpu1. Test command: for ((i=0;i<5;++i)); do sockperf tp --ip 192.168.122.64 -m 1000 -t 150& done The size of the UDP packet is 1000, so with this patch there is always enough tailroom to use build_skb. The sent UDP packets are discarded because there is no port to receive them. The irqsoftd of the machine is 100%; we observe the received quantity displayed by sar -n DEV 1: no build_skb: 956864.00 rxpck/s build_skb: 1158465.00 rxpck/s Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Suggested-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
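The receive-path choice described above can be modeled as a three-way decision. The sketch below is illustrative Python, not the virtio-net code: the numeric values of `GOOD_COPY_LEN` and `SHARED_INFO_SIZE`, the function name, and the returned labels are assumptions, and the exact boundary handling in the driver may differ.

```python
GOOD_COPY_LEN = 128      # assumed threshold, for illustration only
SHARED_INFO_SIZE = 320   # assumed size of skb_shared_info in bytes


def choose_rx_path(payload_len, tailroom):
    """Pick how to turn a received page into an skb (illustrative labels)."""
    if payload_len < GOOD_COPY_LEN:
        # Small payload: copy into a freshly allocated skb so the page can be reused.
        return "copy"
    if tailroom >= SHARED_INFO_SIZE:
        # Enough room after the data for skb_shared_info: build the skb in place.
        return "build_skb"
    # Otherwise allocate an skb and attach the page as a fragment.
    return "alloc_and_frag"
```

With the 1000-byte UDP payload used in the test above, the model takes the `build_skb` branch whenever the tailroom is large enough, which matches the commit's claim for that workload.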
2021-04-16	net: Add Qcom WWAN control driver	Loic Poulain
	The MHI WWAN control driver allows MHI QCOM-based modems to expose different modem control protocols/ports via the WWAN framework, so that userspace modem tools or daemons (e.g. ModemManager) can control WWAN config and state (APN config, SMS, provider selection...). A QCOM-based modem can expose one or several of the following protocols: - AT: Well known AT commands interactive protocol (microcom, minicom...) - MBIM: Mobile Broadband Interface Model (libmbim, mbimcli) - QMI: QCOM MSM/Modem Interface (libqmi, qmicli) - QCDM: QCOM Modem diagnostic interface (libqcdm) - FIREHOSE: XML-based protocol for Modem firmware management (qmi-firmware-update) Note that this patch is mostly a rework of the earlier MHI UCI tentative that was a generic interface for accessing MHI bus from userspace. As suggested, this new version is WWAN specific and is dedicated to only expose channels used for controlling a modem, and for which related open source userspace support exists. Signed-off-by: Loic Poulain <loic.poulain@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-16	net: Add a WWAN subsystem	Loic Poulain
	This change introduces initial support for a WWAN framework. Given the complexity and heterogeneity of existing WWAN hardware and interfaces, there is no strict definition of what a WWAN device is and how it should be represented. It's often a collection of multiple devices that perform the global WWAN feature (netdev, tty, chardev, etc). One usual way to expose modem controls and configuration is via high level protocols such as the well known AT command protocol, MBIM or QMI. The USB modems started to expose them as character devices, and user daemons such as ModemManager learnt to use them. This initial version adds the concept of WWAN port, which is a logical pipe to a modem control protocol. The protocols are exposed raw to the user via a character device, allowing straightforward support in existing tools (ModemManager, ofono...). The WWAN core takes care of the generic part, including character device management, and relies on port driver operations to receive/submit protocol data. Since the different devices exposing protocols for the same WWAN hardware do not necessarily know about each other (e.g. two different USB interfaces, PCI/MHI channel devices...) and can be created/removed in different orders, the WWAN core ensures that all WWAN ports contributing to the 'whole' WWAN feature are grouped under the same virtual WWAN device, relying on the provided parent device (e.g. mhi controller, USB device). It's a 'trick' I copied from Johannes's earlier WWAN subsystem proposal. This initial version is purposely minimalist, it's essentially moving the generic part of the previously proposed mhi_wwan_ctrl driver inside a common WWAN framework, but the implementation is open and flexible enough to allow extension for further drivers. Signed-off-by: Loic Poulain <loic.poulain@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-16	net: mvpp2: Add parsing support for different IPv4 IHL values	Stefan Chulski
	Add parser entries for different IPv4 IHL values. Each entry sets the L4 header offset according to the IPv4 IHL field. The L3 header offset is set during the parsing of the IPv4 protocol. Because parser support for IP header length > 20 was missing, RX IPv4 checksum HW offload fails and skb->ip_summed is set to CHECKSUM_NONE (checksum done by the network stack). This patch adds RX IPv4 checksum HW offload capability for frames with IP header length > 20. v1 --> v2 - Improve commit message. Suggested-by: Dana Vardi <danat@marvell.com> Signed-off-by: Stefan Chulski <stefanc@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
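The relationship between the IHL field and the L4 header offset can be shown with a short worked example. This is a software sketch in Python of what the hardware parser entries encode, not the mvpp2 code. The function name and the `l3_offset` parameter are hypothetical. IHL counts 32-bit words, so a header with options (IHL > 5) is longer than 20 bytes.

```python
def ipv4_l4_offset(packet, l3_offset=0):
    """Return the byte offset of the L4 header for an IPv4 packet."""
    first = packet[l3_offset]
    version = first >> 4      # high nibble: IP version
    ihl = first & 0x0F        # low nibble: header length in 32-bit words
    if version != 4 or ihl < 5:
        raise ValueError("not a valid IPv4 header")
    # The L4 header starts right after the (possibly option-carrying) IP header.
    return l3_offset + ihl * 4
```

For example, a first byte of 0x45 gives a 20-byte header, while 0x46 (one option word) moves the L4 header to offset 24, which is the case the parser previously did not handle.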
2021-04-16	r8152: search the configuration of vendor mode	Hayes Wang
	The vendor mode is not always at config #1, so it is necessary to set the correct configuration number. Signed-off-by: Hayes Wang <hayeswang@realtek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-16	r8152: support PHY firmware for RTL8156 series	Hayes Wang
	Support new firmware type and method for RTL8156 series. Signed-off-by: Hayes Wang <hayeswang@realtek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-16	r8152: support new chips	Hayes Wang
	Support RTL8153C, RTL8153D, RTL8156A, and RTL8156B. The RTL8156A and RTL8156B are the 2.5G ethernet. Signed-off-by: Hayes Wang <hayeswang@realtek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-16	r8152: add help function to change mtu	Hayes Wang
	Different chips may have different requirements when changing the mtu. Therefore, add a new helper function to rtl_ops to change the mtu. Besides, reset the tx/rx after changing the mtu. Additionally, add mtu_to_size() and size_to_mtu() macros to simplify the code. Signed-off-by: Hayes Wang <hayeswang@realtek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
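A pair of conversions like mtu_to_size() and size_to_mtu() might look like the sketch below. The commit message does not state which overhead the macros add, so the choice of a VLAN-tagged Ethernet header plus the frame check sequence is an assumption here, and the Python functions only model C macros.

```python
VLAN_ETH_HLEN = 18  # Ethernet header (14 bytes) plus one 802.1Q tag (4 bytes)
ETH_FCS_LEN = 4     # frame check sequence


def mtu_to_size(mtu):
    """On-wire frame size for a given MTU (assumed overhead terms)."""
    return mtu + VLAN_ETH_HLEN + ETH_FCS_LEN


def size_to_mtu(size):
    """Inverse of mtu_to_size()."""
    return size - VLAN_ETH_HLEN - ETH_FCS_LEN
```

Keeping both directions as named helpers avoids repeating the header arithmetic at every call site, which is the simplification the commit refers to.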
2021-04-16	r8152: adjust rtl8152_check_firmware function	Hayes Wang
	Use bits operations to record and check the firmware. Signed-off-by: Hayes Wang <hayeswang@realtek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-16	r8152: set inter fram gap time depending on speed	Hayes Wang
	Set the maximum inter frame gap time (144ns) for speed 10M/half and 100M/half. This improves performance at those speeds and has no effect at other speeds. For 10M/half and 100M/half, a short inter frame gap time prevents the device from using aggregation effectively, because each transfer completes too quickly. Therefore, use the maximum value to improve the effect of the aggregation. However, the improvement may not be noticeable on fast CPUs, because they compensate for the effect of the aggregation. Signed-off-by: Hayes Wang <hayeswang@realtek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-16	net: ethernet: mediatek: ppe: fix busy wait loop	Ilya Lipnitskiy
	The intention is for the loop to timeout if the body does not succeed. The current logic calls time_is_before_jiffies(timeout) which is false until after the timeout, so the loop body never executes. Fix by using readl_poll_timeout as a more standard and less error-prone solution. Fixes: ba37b7caf1ed ("net: ethernet: mtk_eth_soc: add support for initializing the PPE") Signed-off-by: Ilya Lipnitskiy <ilya.lipnitskiy@gmail.com> Cc: Felix Fietkau <nbd@nbd.name> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
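The inverted condition is easy to reproduce in a small model. The Python sketch below uses a fake clock instead of jiffies. `buggy_wait` mirrors the described mistake (looping only while the deadline is already in the past), and `fixed_wait` follows the usual poll-until-done-or-timeout shape. All names are hypothetical, the return value of the buggy variant is an assumption, and neither function is the kernel's readl_poll_timeout.

```python
ETIMEDOUT = 110  # Linux errno value, hard-coded for the sketch


class FakeClock:
    """Deterministic clock so the example needs no real sleeping."""

    def __init__(self):
        self.t = 0

    def now(self):
        return self.t

    def sleep(self, us):
        self.t += us


def buggy_wait(read_busy, clock, timeout_us):
    deadline = clock.now() + timeout_us
    # Bug: this is only true AFTER the deadline, so the body never runs.
    while clock.now() > deadline:
        if not read_busy():
            return 0
        clock.sleep(10)
    return -ETIMEDOUT


def fixed_wait(read_busy, clock, timeout_us, sleep_us=10):
    deadline = clock.now() + timeout_us
    while True:
        if not read_busy():
            return 0              # condition met
        if clock.now() > deadline:
            return -ETIMEDOUT     # gave up after the timeout
        clock.sleep(sleep_us)
```

In this model `buggy_wait` reports a timeout even when the device is idle on the first read, because the register is never polled.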
2021-04-16	atl1c: move tx cleanup processing out of interrupt	Gatis Peisenieks
	Tx queue cleanup happens in the interrupt handler on the same core as rx queue processing. Both can take a considerable amount of processing in high packet-per-second scenarios. Sending large amounts of packets can stall the rx processing, which is unfair and can also lead to an out-of-memory condition, since __dev_kfree_skb_irq queues the skbs for later kfree in softirq, which is not allowed to happen under heavy load in the interrupt handler. This puts tx cleanup in its own napi and enables threaded napi to allow the rx/tx queue processing to happen on different cores. The ability to sustain equal amounts of tx/rx traffic increased: from 280Kpps to 1130Kpps on Threadripper 3960X with upcoming Mikrotik 10/25G NIC, from 520Kpps to 850Kpps on Intel i3-3320 with Mikrotik RB44Ge adapter. Signed-off-by: Gatis Peisenieks <gatis@mikrotik.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-16	net: bridge: switchdev: include local flag in FDB notifications	Vladimir Oltean
	As explained in bugfix commit 6ab4c3117aec ("net: bridge: don't notify switchdev for local FDB addresses") as well as in this discussion: https://lore.kernel.org/netdev/20210117193009.io3nungdwuzmo5f7@skbuf/ the switchdev notifiers for FDB entries managed to have a zero-day bug, which was that drivers would not know what to do with local FDB entries, because they were not told that they are local. The bug fix was to simply not notify them of those addresses. Let us now add the 'is_local' bit to bridge FDB entries, and make all drivers ignore these entries by their own choice. Co-developed-by: Tobias Waldekranz <tobias@waldekranz.com> Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-16	net/mlx5: Enhance diagnostics info for TX/RX reporters	Aya Levin
	Add ts_format to 'Common Config' section of the TX/RX devlink reporters diagnostics info. Possible values for ts_format: 'RT' or 'FRC' which stands for: Real Time and Free Running Counters correspondingly. Signed-off-by: Aya Levin <ayal@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-04-16	net/mlx5: Add helper to initialize 1PPS	Aya Levin
	Wrap 1PPS initialization in a helper for a cleaner init flow. Signed-off-by: Aya Levin <ayal@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-04-16	net/mlx5e: Add ethtool extended link state	Moshe Tal
	In case the interface was set up but cannot establish the link, ethtool will print more information to help the user troubleshoot the state. For example, no link due to missing cable:
	$ ethtool eth1
	...
	Link detected: no (No cable)
	Besides the general extended state, drivers can pass additional information about the link state using the sub-state field. For example:
	$ ethtool eth1
	...
	Link detected: no (Autoneg, No partner detected)
	The extended state is available only for specific cases; in other cases ethtool will print only "Link detected: no" as before. Signed-off-by: Moshe Tal <moshet@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-04-16	net/mlx5: Allocate FC bulk structs with kvzalloc() instead of kzalloc()	Maor Dickman
	The bulk size is larger than 16K so use kvzalloc(). The bulk bitmask upper size limit is 16K so use kvcalloc(). Signed-off-by: Maor Dickman <maord@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-04-16	net/mlx5e: Cleanup safe switch channels API by passing params	Maxim Mikityanskiy
	mlx5e_safe_switch_channels accepts new_chs as a parameter and opens new channels in place, then copying them to priv->channels. It requires all the callers to allocate space for this temporary storage of the new channels. This commit cleans up the API by replacing new_chs with new_params, a meaningful subset of new_chs to be filled by the caller. The temporary space for the new channels is allocated inside mlx5e_safe_switch_params (a new name for mlx5e_safe_switch_channels). An extra copy of params is made, but since it's control flow, it's not critical. Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-04-16	net/mlx5e: Refactor on-the-fly configuration changes	Maxim Mikityanskiy
	This commit extends mlx5e_safe_switch_channels() to support on-the-fly configuration changes, when the channels are open, but don't need to be recreated. Such flows exist when a parameter being changed doesn't affect how the queues are created, or when the queues can be modified while remaining active. Before this commit, such flows were handled as special cases on the caller site. This commit adds this functionality to mlx5e_safe_switch_channels(), allowing the caller to pass a boolean indicating whether it's required to recreate the channels or it's allowed to skip it. The logic of switching channel parameters is now completely encapsulated into mlx5e_safe_switch_channels(). Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-04-16	net/mlx5e: Use mlx5e_safe_switch_channels when channels are closed	Maxim Mikityanskiy
	This commit uses new functionality of mlx5e_safe_switch_channels introduced by the previous commit to reduce the amount of repeating similar code all over the driver. It's very common in mlx5e to call mlx5e_safe_switch_channels when the channels are open, but assign parameters and run hardware commands manually when the channels are closed. After the previous commit it's no longer needed to do such manual things every time, so this commit removes unneeded code and relies on the new functionality of mlx5e_safe_switch_channels. Some of the places are refactored and simplified, where more complex flows are used to change configuration on the fly, without recreating the channels (the logic is rewritten in a more robust way, with a reset required by default and a list of exceptions). Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-04-16	net/mlx5e: Allow mlx5e_safe_switch_channels to work with channels closed	Maxim Mikityanskiy
	mlx5e_safe_switch_channels is used to modify channel parameters and/or hardware configuration in a safe way, so that if anything goes wrong, everything reverts to the old configuration and remains in a consistent state. However, this function only works when the channels are open. When the caller needs to modify some parameters, first it has to check that the channels are open, otherwise it has to assign parameters directly, and such boilerplate repeats in many different places. This commit prepares for the refactoring of such places by allowing mlx5e_safe_switch_channels to work when the channels are closed. In this case it will assign the new parameters and run the preactivate hook. Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
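The "apply, and revert if anything fails" behaviour described here, extended to closed channels, can be sketched as below. This is a Python model of the idea only: the `Channels` class, the `safe_switch` function and its `open_fn` and `preactivate` callbacks are hypothetical stand-ins, and the real driver also manages hardware resources that this sketch ignores.

```python
class Channels:
    """Minimal stand-in for a set of channels and their parameters."""

    def __init__(self, params, is_open=False):
        self.params = dict(params)
        self.is_open = is_open


def safe_switch(chs, new_params, open_fn=None, preactivate=None):
    """Switch to new_params; on any failure keep the old configuration."""
    old_params = dict(chs.params)
    try:
        if chs.is_open and open_fn is not None:
            # Open channels: try to bring up channels with the new parameters.
            open_fn(new_params)
        # Closed channels skip straight to assigning the new parameters.
        chs.params = dict(new_params)
        if preactivate is not None:
            preactivate(chs)  # stands in for the preactivate hook
    except Exception:
        chs.params = old_params  # revert to a consistent state
        return False
    return True
```

Callers no longer need a separate "channels are closed, assign directly" branch: the same call handles both cases and either fully applies the change or leaves the old parameters in place.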
2021-04-16	net/mlx5e: kTLS, Add resiliency to RX resync failures	Tariq Toukan
	When the TLS logic finds a tcp seq match for a kTLS RX resync request, it calls the driver callback function mlx5e_ktls_resync() to handle it and communicate it to the device. Errors might occur during mlx5e_ktls_resync(); however, they are not reported to the stack. Moreover, there is no error handling in the stack for these errors. In this patch, the driver takes responsibility for error handling, adding queue and retry mechanisms to these resyncs. We maintain a linked list of resync matches, and try posting them to the async ICOSQ in the NAPI context. The only possible failure that demands driver handling is the ICOSQ being full. By relying on the NAPI mechanism, we make sure that the entries in the list will be handled when ICOSQ completions arrive and make some room available. Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
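The queue-and-retry scheme can be illustrated with a small model. In the Python sketch below, a deque stands in for the linked list of resync matches and a simple counter stands in for ICOSQ occupancy. The class and method names are hypothetical and the real driver works with WQEs and completions, not integers.

```python
from collections import deque


class ResyncQueue:
    """Pending resync requests, posted only while the queue has room."""

    def __init__(self, icosq_capacity):
        self.pending = deque()       # resync matches waiting to be posted
        self.capacity = icosq_capacity
        self.in_flight = 0           # entries currently occupying the ICOSQ

    def request(self, seq):
        # Called when the stack reports a resync match: just queue it.
        self.pending.append(seq)

    def napi_poll(self):
        # Post as many pending entries as fit; the rest stay queued.
        posted = []
        while self.pending and self.in_flight < self.capacity:
            posted.append(self.pending.popleft())
            self.in_flight += 1
        return posted

    def complete(self, n=1):
        # Completions free room, so the next poll can post more entries.
        self.in_flight = max(0, self.in_flight - n)
```

A full queue is therefore not an error in this model: the entry simply waits until a completion frees a slot and the next poll picks it up.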
2021-04-16	net/mlx5e: TX, Inline function mlx5e_tls_handle_tx_wqe()	Tariq Toukan
	When TLS is supported, WQE ctrl segment of every transmitted packet is updated with the (possibly empty, for non-TLS packets) TISN field. Take this one-liner function into the header file and inline it, to save the overhead of a function call per packet. While here, remove unused function parameter. Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-04-16	net/mlx5e: TX, Inline TLS skb check	Tariq Toukan
	When TLS is supported and enabled, every transmitted packet is tested to identify if TLS offload is required. Take the early-return condition into an inline function, to save the overhead of a function call for non-TLS packets. Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-04-16	net/mlx5e: Cleanup unused function parameter	Tariq Toukan
	Socket parameter is not used in accel_rule_init(), remove it. Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-04-16	net/mlx5e: Remove non-essential TLS SQ state bit	Tariq Toukan
	Maintaining an SQ state bit to indicate TLS support has no real need, a simple and fast test [1] for the SKB is almost equally good. [1] !skb->sk || !tls_is_sk_tx_device_offloaded(skb->sk) Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-04-15	mlx5: implement ethtool::get_fec_stats	Jakub Kicinski
	Report corrected bits. v2: catch reg access errors (Saeed) Signed-off-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-15	sfc: ef10: implement ethtool::get_fec_stats	Jakub Kicinski
	Report what appears to be the standard block counts: - 30.5.1.1.17 aFECCorrectedBlocks - 30.5.1.1.18 aFECUncorrectableBlocks Don't report the per-lane symbol counts, if those really count symbols they are not what the standard calls for (even if symbols seem like the most useful thing to count.) Fingers crossed that fec_corrected_errors is not in symbols. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-15	bnxt: implement ethtool::get_fec_stats	Jakub Kicinski
	Report corrected bits. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-15	enetc: convert to schedule_work()	Yangbo Lu
	Convert system_wq queue_work() to schedule_work() which is a wrapper around it, since the former is a rare construct. Fixes: 7294380c5211 ("enetc: support PTP Sync packet one-step timestamping") Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-15	net: hns3: VF not request link status when PF support push link status feature	Guangbin Huang
	To reduce the processing of unnecessary mailbox commands when the PF supports actively pushing its link status to VFs, VFs stop sending the request link status command in the periodic service task in this case. Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-15	net: hns3: PF add support for pushing link status to VFs	Guangbin Huang
	Previously, a VF updated its link status every second by sending a query command to the PF in the periodic service task. If the link status of the PF changes, the VF may need up to one second to update its link status. To reduce the link status delay between the PF and VFs, the PF actively pushes its link status to VFs when its link status is updated. To let the VF know that the PF supports this new feature, the link status changed mailbox command adds one bit to indicate it. Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-15	net: phy: at803x: select correct page on config init	David Bauer
	The Atheros AR8031 and AR8033 expose different registers for SGMII/Fiber as well as the copper side of the PHY depending on the BT_BX_REG_SEL bit in the chip configure register. The driver assumes the copper side is selected on probe, but this might not be the case depending on which page was last selected by the bootloader. Notably, Ubiquiti UniFi bootloaders show this behavior. Select the copper page when probing to circumvent this. Signed-off-by: David Bauer <mail@david-bauer.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-14	ice: reduce scope of variable	Paul M Stillwell Jr
	The scope of this variable can be reduced so do that. Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-04-14	ice: remove return variable	Paul M Stillwell Jr
	We were saving the return value from ice_vsi_manage_rss_lut(), but the errors from that function are not critical so change it to return void and remove the code that saved the value. Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-04-14	ice: suppress false cppcheck issues	Bruce Allan
	Silence false errors, warnings and style issues reported by cppcheck. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-04-14	ice: Set vsi->vf_id as ICE_INVAL_VFID for non VF VSI types	Brett Creeley
	Currently the vsi->vf_id is set only for ICE_VSI_VF and it's left as 0 for all other VSI types. This is confusing and could be problematic since 0 is a valid vf_id. Fix this by always setting non VF VSI types to ICE_INVAL_VFID. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-04-14	ice: remove unused struct member	Jesse Brandeburg
	The only time you can ever have a rq_last_status is if a firmware event was somehow reporting a status on the receive queue, which are generally firmware initiated events or mailbox messages from a VF. Mostly this struct member was unused. Fix this problem by still printing the value of the field in a debug print, but don't store the value forever in a struct, potentially creating opportunities for callers to use the wrong struct member. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-04-14	ice: use local for consistency	Jesse Brandeburg
	Do a minor refactor on ice_vsi_rebuild to use a local variable to store vsi->type. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-04-14	ice: print name in /proc/iomem	Jesse Brandeburg
	The driver previously printed its PCI address in the name field for the pci resource, which when displayed via /proc/iomem, would print the same thing twice. It's more useful for debugging to see the driver name, as most other modules do. Here's a diff of before and after this change:
	 99100000-991fffff : 0000:3b:00.1
	 9a000000-a04fffff : PCI Bus 0000:3b
	 9a000000-9bffffff : 0000:3b:00.1
	- 9a000000-9bffffff : 0000:3b:00.1
	+ 9a000000-9bffffff : ice
	 9c000000-9dffffff : 0000:3b:00.0
	- 9c000000-9dffffff : 0000:3b:00.0
	+ 9c000000-9dffffff : ice
	 9e000000-9effffff : 0000:3b:00.1
	 9f000000-9fffffff : 0000:3b:00.0
	 a0000000-a000ffff : 0000:3b:00.1
	Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-04-14	ice: Reimplement module reads used by ethtool	Scott W Taylor
	There was an excessive increment of the QSFP page, which is now fixed. Additionally, this new update now reads 8 bytes at a time and will retry each request if the module/bus is busy. Also, prevent reading from upper pages if module does not support those pages. Signed-off-by: Scott W Taylor <scott.w.taylor@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-04-14	ice: refactor ITR data structures	Jesse Brandeburg
	Use a dedicated bitfield in order to both increase the amount of checking around the length of ITR writes as well as simplify the checks of dynamic mode. Basically unpack the "high bit means dynamic" logic into bitfields. Also, remove some unused ITR defines. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-04-14	ice: manage interrupts during poll exit	Jesse Brandeburg
	The driver would occasionally miss that there were outstanding descriptors to clean when exiting busy/napi poll. This issue has been in the code since the introduction of the ice driver. Attempt to "catch" any remaining work by triggering a software interrupt when exiting napi poll or busy-poll. This will not cause extra interrupts in the case of normal execution. This issue was found when running sfnt-pingpong, with busy poll enabled, and typically with larger I/O sizes like > 8192, the program would occasionally report > 1 second maximums to complete a ping pong. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-04-14	ice: replace custom AIM algorithm with kernel's DIM library	Jacob Keller
	The ice driver has support for adaptive interrupt moderation, an algorithm for tuning the interrupt rate dynamically. This algorithm is based on various assumptions about ring size, socket buffer size, link speed, SKB overhead, ethernet frame overhead and more. The Linux kernel has support for a dynamic interrupt moderation algorithm known as "dimlib". Replace the custom driver-specific implementation of dynamic interrupt moderation with the kernel's algorithm. The Intel hardware has a different hardware implementation than the originators of the dimlib code had to work with, which requires the driver to use a slightly different set of inputs for the actual moderation values, while getting all the advice from dimlib of better/worse, shift left or right. The change made for this implementation is to use a pair of values for each of the 5 "slots" that the dimlib moderation expects, and the driver will program those pairs when dimlib recommends a slot to use. The current implementation uses two tables, one for receive and one for transmit, and the pairs of values in each slot set the maximum delay of an interrupt and a maximum number of interrupts per second (both expressed in microseconds). There are two separate kinds of bugs fixed by using DIMLIB: one is that UDP single stream send was too slow, and the other is that 8K ping-pong was going to the most aggressive moderation and had much too high latency. The overall result of using DIMLIB is that we meet or exceed our performance expectations set based on the old algorithm. Co-developed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-04-14	ice: refactor interrupt moderation writes	Jesse Brandeburg
	Introduce several new helpers for writing ITR and GLINT_RATE registers, and refactor the code calling them. This resulted in removal of several duplicate functions and rolled a bunch of simple code back into the calling routines. In particular this removes some code that was doing both a store and a set in a helper function, which seems better done as separate tasks in the caller (and generally takes less lines of code even with a tiny bit of repetition). Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-04-14	ice: Add new VSI states to track netdev alloc/registration	Anirudh Venkataramanan
	Add two new VSI states, one to track if a netdev for the VSI has been allocated and the other to track if the netdev has been registered. Call unregister_netdev/free_netdev only when the corresponding state bits are set. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>