summaryrefslogtreecommitdiff
path: root/drivers/net/hyperv
AgeCommit message (Collapse)Author
2018-01-22hv_netvsc: Use the num_online_cpus() for channel limitHaiyang Zhang
Since we no longer localize channel/CPU affiliation within one NUMA node, num_online_cpus() is used as the number of channel cap, instead of the number of processors in a NUMA node. This patch allows a bigger range for tuning the number of channels. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13hv_netvsc: empty current transmit aggregation if flow blockedStephen Hemminger
If the transmit queue is known full, then don't keep aggregating data. And the cp_partial flag which indicates that the current aggregation buffer is full can be folded in to avoid more conditionals. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13hv_netvsc: remove open_cnt reference countStephen Hemminger
There is only ever a single instance of network device object referencing the internal rndis object. Therefore the open_cnt atomic is not necessary. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13hv_netvsc: pass netvsc_device to receive callbackStephen Hemminger
The netvsc_receive_callback function was using RCU to find the appropriate underlying netvsc_device. Since calling function already had that pointer, this was unnecessary. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13hv_netvsc: simplify function args in receive status pathStephen Hemminger
The caller (netvsc_receive) already has the net device pointer, and should just pass that to functions rather than the hyperv device. This eliminates several impossible error paths in the process. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13hv_netvsc: track memory allocation failures in ethtool statsStephen Hemminger
When skb can not be allocated, update ethtool statisitics rather than rx_dropped which is intended for netif_receive. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13hv_netvsc: copy_to_send buf can be voidStephen Hemminger
Since only caller does not care about return value. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13hv_netvsc: Fix the TX/RX buffer default sizesHaiyang Zhang
The values were not computed correctly. There are no significant visible impact, though. The intended size of RX buffer is 16 MB, and the default slot size is 1728. So, NETVSC_DEFAULT_RX should be 16*1024*1024 / 1728 = 9709. The intended size of TX buffer is 1 MB, and the slot size is 6144. So, NETVSC_DEFAULT_TX should be 1024*1024 / 6144 = 170. The patch puts the formula directly into the macro, and moves them to hyperv_net.h, together with related macros. Fixes: 5023a6db73196 ("netvsc: increase default receive buffer size") Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13hv_netvsc: Fix the receive buffer size limitHaiyang Zhang
The max should be 31 MB on host with NVSP version > 2. On legacy hosts (NVSP version <=2) only 15 MB receive buffer is allowed, otherwise the buffer request will be rejected by the host, resulting vNIC not coming up. The NVSP version is only available after negotiation. So, we add the limit checking for legacy hosts in netvsc_init_buf(). Fixes: 5023a6db73196 ("netvsc: increase default receive buffer size") Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-03hv_netvsc: optimize initialization of RNDIS headerStephen Hemminger
The memset of the whole maximum possible RNDIS header is unnecessary. For the main part of the header use a structure assignment. No need to memset the whole per packet info. Instead rely on caller to set what it wants. Also get rid of cast to void and signed/unsigned conversion. Now return pointer to per packet data (rather than the header) which simplifies use by code setting up the packet data. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-03hv_netvsc: use reciprocal divide to speed up percent calculationStephen Hemminger
Every packet sent checks the available ring space. The calculation can be sped up by using reciprocal divide which is multiplication. Since ring_size can only be configured by module parameter, so it doesn't have to be passed around everywhere. Also it should be unsigned since it is number of pages. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-03hv_netvsc: replace divide with mask when computing paddingStephen Hemminger
Packet alignment is always a power of 2 therefore modulus can be replaced with a faster and operation Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-03hv_netvsc: don't need local xmit_moreStephen Hemminger
Since skb is always non-NULL in the copy portion of netvsc_send do not need local variable. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-03hv_netvsc: drop unused macrosStephen Hemminger
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-16hv_netvsc: preserve hw_features on mtu/channels/ringparam changesVitaly Kuznetsov
rndis_filter_device_add() is called both from netvsc_probe() when we initially create the device and from set channels/mtu/ringparam routines where we basically remove the device and add it back. hw_features is reset in rndis_filter_device_add() and filled with host data. However, we lose all additional flags which are set outside of the driver, e.g. register_netdevice() adds NETIF_F_SOFT_FEATURES and many others. Unfortunately, calls to rndis_{query_hwcaps(), _set_offload_params()} calls cannot be avoided on every RNDIS reset: host expects us to set required features explicitly. Moreover, in theory hardware capabilities can change and we need to reflect the change in hw_features. Reset net->hw_features bits according to host data in rndis_netdev_set_hwcaps(), clear corresponding feature bits from net->features in case some features went missing (will never happen in real life I guess but let's be consistent). Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-08hv_netvsc: hide warnings about uninitialized/missing rndis deviceVitaly Kuznetsov
Hyper-V hosts are known to send RNDIS messages even after we halt the device in rndis_filter_halt_device(). Remove user visible messages as they are not really useful. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-08hv_netvsc: netvsc_teardown_gpadl() splitVitaly Kuznetsov
It was found that in some cases host refuses to teardown GPADL for send/ receive buffers (probably when some work with these buffere is scheduled or ongoing). Change the teardown logic to be: 1) Send NVSP_MSG1_TYPE_REVOKE_* messages 2) Close the channel 3) Teardown GPADLs. This seems to work reliably. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-29hv_netvsc: Set tx_table to equal weight after subchannels openHaiyang Zhang
In some cases, like internal vSwitch, the host doesn't provide send indirection table updates. This patch sets the table to be equal weight after subchannels are all open. Otherwise, all workload will be on one TX channel. As tested, this patch has largely increased the throughput over internal vSwitch. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-14hv_netvsc: Add initialization of tx_table in netvsc_device_add()Haiyang Zhang
tx_table is part of the private data of kernel net_device. It is only zero-ed out when allocating net_device. We may recreate netvsc_device w/o recreating net_device, so the private netdev data, including tx_table, are not zeroed. It may contain channel numbers for the older netvsc_device. This patch adds initialization of tx_table each time we recreate netvsc_device. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-14hv_netvsc: Rename tx_send_table to tx_tableHaiyang Zhang
Simplify the variable name: tx_send_table Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-14hv_netvsc: Rename ind_table to rx_tableHaiyang Zhang
Rename this variable because it is the Receive indirection table. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-08hv_netvsc: Add ethtool handler to set and get TCP hash levelsHaiyang Zhang
The patch supports the options to switch TCP hash level between L3 and L4 by ethtool command. TCP over IPv4 and v6 can be set differently. The default hash level is L4. We currently only allow switching TX hash level from within the guests. For example, for TCP over IPv4 on eth0: To include TCP port numbers in hashing: ethtool -N eth0 rx-flow-hash tcp4 sdfn To exclude TCP port numbers in hashing: ethtool -N eth0 rx-flow-hash tcp4 sd To show TCP hash level: ethtool -n eth0 rx-flow-hash tcp4 Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-08hv_netvsc: Change the hash level variable to bit flagsHaiyang Zhang
This simplifies the logic and make it easier to add more options. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-04net: Add extack to upper device linkingDavid Ahern
Add extack arg to netdev_upper_dev_link and netdev_master_upper_dev_link Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-01hv_netvsc: report stop_queue and wake_queueSimon Xiao
Report the numbers of events for stop_queue and wake_queue in ethtool stats. Example: ethtool -S eth0 NIC statistics: ... stop_queue: 7 wake_queue: 7 ... Signed-off-by: Simon Xiao <sixiao@microsoft.com> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-25hv_netvsc: Fix the real number of queues of non-vRSS casesHaiyang Zhang
For older hosts without multi-channel (vRSS) support, and some error cases, we still need to set the real number of queues to one. This patch adds this missing setting. Fixes: 8195b1396ec8 ("hv_netvsc: fix deadlock on hotplug") Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Reviewed-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-25hv_netvsc: make const array ver_list static, reduces object code sizeColin Ian King
Don't populate const array ver_list on the stack, instead make it static. Makes the object code smaller by over 400 bytes: Before: text data bss dec hex filename 18444 3168 320 21932 55ac drivers/net/hyperv/netvsc.o After: text data bss dec hex filename 17950 3224 320 21494 53f6 drivers/net/hyperv/netvsc.o (gcc 6.3.0, x86-64) Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-21hv_netvsc: fix send buffer failure on MTU changeAlex Ng
If MTU is changed the host would reject the send buffer change. This problem is result of recent change to allow changing send buffer size. Every time we change the MTU, we store the previous net_device section count before destroying the buffer, but we don’t store the previous section size. When we reinitialize the buffer, its size is calculated by multiplying the previous count and previous size. Since we continuously increase the MTU, the host returns us a decreasing count value while the section size is reinitialized to 1728 bytes every time. This eventually leads to a condition where the calculated buf_size is so small that the host rejects it. Fixes: 8b5327975ae1 ("netvsc: allow controlling send/recv buffer size") Signed-off-by: Alex Ng <alexng@microsoft.com> Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-15netvsc: increase default receive buffer sizeStephen Hemminger
The default receive buffer size was reduced by recent change to a value which was appropriate for 10G and Windows Server 2016. But the value is too small for full performance with 40G on Azure. Increase the default back to maximum supported by host. Fixes: 8b5327975ae1 ("netvsc: allow controlling send/recv buffer size") Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-11hv_netvsc: avoid unnecessary wakeups on subchannel creationStephen Hemminger
Only need to wakeup the initiator after all sub-channels are opened. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-11hv_netvsc: fix deadlock on hotplugStephen Hemminger
When a virtual device is added dynamically (via host console), then the vmbus sends an offer message for the primary channel. The processing of this message for networking causes the network device to then initialize the sub channels. The problem is that setting up the sub channels needs to wait until the subsequent subchannel offers have been processed. These offers come in on the same ring buffer and work queue as where the primary offer is being processed; leading to a deadlock. This did not happen in older kernels, because the sub channel waiting logic was broken (it wasn't really waiting). The solution is to do the sub channel setup in its own work queue context that is scheduled by the primary channel setup; and then happens later. Fixes: 732e49850c5e ("netvsc: fix race on sub channel creation") Reported-by: Dexuan Cui <decui@microsoft.com> Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-01hv_netvsc: Fix the channel limit in netvsc_set_rxfh()Haiyang Zhang
The limit of setting receive indirection table value should be the current number of channels, not the VRSS_CHANNEL_MAX. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-01hv_netvsc: Simplify the limit check in netvsc_set_channels()Haiyang Zhang
Because of the following code, net->num_tx_queues equals to VRSS_CHANNEL_MAX, and max_chn is less than or equals to VRSS_CHANNEL_MAX. netvsc_drv.c: alloc_etherdev_mq(sizeof(struct net_device_context), VRSS_CHANNEL_MAX); rndis_filter.c: net_device->max_chn = min_t(u32, VRSS_CHANNEL_MAX, num_possible_rss_qs); So this patch removes the unnecessary limit check before comparing with "max_chn". Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-01hv_netvsc: Simplify num_chn checking in rndis_filter_device_add()Haiyang Zhang
The minus one and assignment to a local variable is not necessary. This patch simplifies it. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-01hv_netvsc: Clean up an unused parameter in rndis_filter_set_rss_param()Haiyang Zhang
This patch removes the parameter, num_queue in rndis_filter_set_rss_param(), which is no longer in use. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-01netvsc: allow driver to be removed even if VF is presentStephen Hemminger
If VF is attached then can still allow netvsc driver module to be removed. Just have to make sure and do the cleanup. Also, avoid extra rtnl round trip when calling unregister. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-01netvsc: cleanup datapath switchStephen Hemminger
Use one routine for datapath up/down. Don't need to reopen the rndis layer. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-01Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Three cases of simple overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-24netvsc: fix deadlock betwen link status and removalstephen hemminger
There is a deadlock possible when canceling the link status delayed work queue. The removal process is run with RTNL held, and the link status callback is acquring RTNL. Resolve the issue by using trylock and rescheduling. If cancel is in process, that block it from happening. Fixes: 122a5f6410f4 ("staging: hv: use delayed_work for netvsc_send_garp()") Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-24hv_netvsc: Fix rndis_filter_close error during netvsc_removeHaiyang Zhang
We now remove rndis filter before unregister_netdev(), which calls device close. It involves closing rndis filter already removed. This patch fixes this error. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-22hv_netvsc: Add ethtool handler to set and get UDP hash levelsHaiyang Zhang
The patch add the functions to switch UDP hash level between L3 and L4 by ethtool command. UDP over IPv4 and v6 can be set differently. The default hash level is L4. We currently only allow switching TX hash level from within the guests. On Azure, fragmented UDP packets have high loss rate with L4 hashing. Using L3 hashing is recommended in this case. For example, for UDP over IPv4 on eth0: To include UDP port numbers in hasing: ethtool -N eth0 rx-flow-hash udp4 sdfn To exclude UDP port numbers in hasing: ethtool -N eth0 rx-flow-hash udp4 sd To show UDP hash level: ethtool -n eth0 rx-flow-hash udp4 Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-22hv_netvsc: Clean up unused parameter from netvsc_get_rss_hash_opts()Haiyang Zhang
The parameter "nvdev" is not in use. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-22hv_netvsc: Clean up unused parameter from netvsc_get_hash()Haiyang Zhang
The parameter "sk" is not in use. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-16vmbus: remove unused vmbus_sendpacket_ctlstephen hemminger
The only usage of vmbus_sendpacket_ctl was by vmbus_sendpacket. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-16vmbus: remove unused vmubs_sendpacket_pagebuffer_ctlstephen hemminger
The function vmbus_sendpacket_pagebuffer_ctl was never used directly. Just have vmbus_send_pagebuffer Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-11netvsc: keep track of some non-fatal overload conditionsstephen hemminger
Add ethtool statistics for case where send chimmeny buffer is exhausted and driver has to fall back to doing scatter/gather send. Also, add statistic for case where ring buffer is full and receive completions are delayed. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-11netvsc: allow controlling send/recv buffer sizestephen hemminger
Control the size of the buffer areas via ethtool ring settings. They aren't really traditional hardware rings, but host API breaks receive and send buffer into chunks. The final size of the chunks are controlled by the host. The default value of send and receive buffer area for host DMA is much larger than it needs to be. Experimentation shows that 4M receive and 1M send is sufficient. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-11netvsc: remove unnecessary check for NULL hdrstephen hemminger
The function init_page_array is always called with a valid pointer to RNDIS header. No check for NULL is needed. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-11netvsc: remove unnecessary cast of void pointerstephen hemminger
Assignment to a typed pointer is sufficient in C. No cast is needed. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-11netvsc: whitespace cleanupstephen hemminger
Fix some minor indentation issues. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>