linux.git - Unnamed repository; edit this file 'description' to name the repository.

Age	Commit message (Collapse)	Author
2018-01-24	kcm: Check if sk_user_data already set in kcm_attach	Tom Herbert
	This is needed to prevent sk_user_data being overwritten. The check is done under the callback lock. This should prevent a socket from being attached twice to a KCM mux. It also prevents a socket from being attached for other use cases of sk_user_data as long as the other cases set sk_user_data under the lock. Followup work is needed to unify all the use cases of sk_user_data to use the same locking. Reported-by: syzbot+114b15f2be420a8886c3@syzkaller.appspotmail.com Fixes: ab7ac4eb9832 ("kcm: Kernel Connection Multiplexor module") Signed-off-by: Tom Herbert <tom@quantonium.net> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-24	kcm: Only allow TCP sockets to be attached to a KCM mux	Tom Herbert
	TCP sockets for IPv4 and IPv6 that are not listeners or in closed stated are allowed to be attached to a KCM mux. Fixes: ab7ac4eb9832 ("kcm: Kernel Connection Multiplexor module") Reported-by: syzbot+8865eaff7f9acd593945@syzkaller.appspotmail.com Signed-off-by: Tom Herbert <tom@quantonium.net> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-25	MAINTAINERS: update email address for James Morris	James Morris
	Update my email address. Signed-off-by: James Morris <jmorris@namei.org>
2018-01-24	igb: Clear TXSTMP when ptp_tx_work() is timeout	Daniel Hua
	Problem description: After ethernet cable connect and disconnect for several iterations on a device with i210, tx timestamp will stop being put into the socket. Steps to reproduce: 1. Setup a device with i210 and wire it to a 802.1AS capable switch ( Extreme Networks Summit x440 is used in our case) 2. Have the gptp daemon running on the device and make sure it is synced with the switch 3. Have the switch disable and enable the port, wait for the device gets resynced with the switch 4. Iterates step 3 until the device is not albe to get resynced 5. Review the log in dmesg and you will see warning message "igb : clearing Tx timestamp hang" Root cause: If ptp_tx_work() gets scheduled just before the port gets disabled, a LINK DOWN event will be processed before ptp_tx_work(), which may cause timeout in ptp_tx_work(). In the timeout logic, the TSYNCTXCTL's TXTT bit (Transmit timestamp valid bit) is not cleared, causing no new timestamp loaded to TXSTMP register. Consequently therefore, no new interrupt is triggerred by TSICR.TXTS bit and no more Tx timestamp send to the socket. Signed-off-by: Daniel Hua <daniel.hua@ni.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-01-24	igb: Delete an error message for a failed memory allocation in ↵	Markus Elfring
	igb_enable_sriov() Omit an extra message for a memory allocation failure in this function. This issue was detected by using the Coccinelle software. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-01-24	igb: Free IRQs when device is hotplugged	Lyude Paul
	Recently I got a Caldigit TS3 Thunderbolt 3 dock, and noticed that upon hotplugging my kernel would immediately crash due to igb: [ 680.825801] kernel BUG at drivers/pci/msi.c:352! [ 680.828388] invalid opcode: 0000 [#1] SMP [ 680.829194] Modules linked in: igb(O) thunderbolt i2c_algo_bit joydev vfat fat btusb btrtl btbcm btintel bluetooth ecdh_generic hp_wmi sparse_keymap rfkill wmi_bmof iTCO_wdt intel_rapl x86_pkg_temp_thermal coretemp crc32_pclmul snd_pcm rtsx_pci_ms mei_me snd_timer memstick snd pcspkr mei soundcore i2c_i801 tpm_tis psmouse shpchp wmi tpm_tis_core tpm video hp_wireless acpi_pad rtsx_pci_sdmmc mmc_core crc32c_intel serio_raw rtsx_pci mfd_core xhci_pci xhci_hcd i2c_hid i2c_core [last unloaded: igb] [ 680.831085] CPU: 1 PID: 78 Comm: kworker/u16:1 Tainted: G O 4.15.0-rc3Lyude-Test+ #6 [ 680.831596] Hardware name: HP HP ZBook Studio G4/826B, BIOS P71 Ver. 01.03 06/09/2017 [ 680.832168] Workqueue: kacpi_hotplug acpi_hotplug_work_fn [ 680.832687] RIP: 0010:free_msi_irqs+0x180/0x1b0 [ 680.833271] RSP: 0018:ffffc9000030fbf0 EFLAGS: 00010286 [ 680.833761] RAX: ffff8803405f9c00 RBX: ffff88033e3d2e40 RCX: 000000000000002c [ 680.834278] RDX: 0000000000000000 RSI: 00000000000000ac RDI: ffff880340be2178 [ 680.834832] RBP: 0000000000000000 R08: ffff880340be1ff0 R09: ffff8803405f9c00 [ 680.835342] R10: 0000000000000000 R11: 0000000000000040 R12: ffff88033d63a298 [ 680.835822] R13: ffff88033d63a000 R14: 0000000000000060 R15: ffff880341959000 [ 680.836332] FS: 0000000000000000(0000) GS:ffff88034f440000(0000) knlGS:0000000000000000 [ 680.836817] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 680.837360] CR2: 000055e64044afdf CR3: 0000000001c09002 CR4: 00000000003606e0 [ 680.837954] Call Trace: [ 680.838853] pci_disable_msix+0xce/0xf0 [ 680.839616] igb_reset_interrupt_capability+0x5d/0x60 [igb] [ 680.840278] igb_remove+0x9d/0x110 [igb] [ 680.840764] pci_device_remove+0x36/0xb0 [ 680.841279] device_release_driver_internal+0x157/0x220 [ 680.841739] pci_stop_bus_device+0x7d/0xa0 [ 680.842255] pci_stop_bus_device+0x2b/0xa0 [ 680.842722] pci_stop_bus_device+0x3d/0xa0 [ 680.843189] pci_stop_and_remove_bus_device+0xe/0x20 [ 680.843627] trim_stale_devices+0xf3/0x140 [ 680.844086] trim_stale_devices+0x94/0x140 [ 680.844532] trim_stale_devices+0xa6/0x140 [ 680.845031] ? get_slot_status+0x90/0xc0 [ 680.845536] acpiphp_check_bridge.part.5+0xfe/0x140 [ 680.846021] acpiphp_hotplug_notify+0x175/0x200 [ 680.846581] ? free_bridge+0x100/0x100 [ 680.847113] acpi_device_hotplug+0x8a/0x490 [ 680.847535] acpi_hotplug_work_fn+0x1a/0x30 [ 680.848076] process_one_work+0x182/0x3a0 [ 680.848543] worker_thread+0x2e/0x380 [ 680.848963] ? process_one_work+0x3a0/0x3a0 [ 680.849373] kthread+0x111/0x130 [ 680.849776] ? kthread_create_worker_on_cpu+0x50/0x50 [ 680.850188] ret_from_fork+0x1f/0x30 [ 680.850601] Code: 43 14 85 c0 0f 84 d5 fe ff ff 31 ed eb 0f 83 c5 01 39 6b 14 0f 86 c5 fe ff ff 8b 7b 10 01 ef e8 b7 e4 d2 ff 48 83 78 70 00 74 e3 <0f> 0b 49 8d b5 a0 00 00 00 e8 62 6f d3 ff e9 c7 fe ff ff 48 8b [ 680.851497] RIP: free_msi_irqs+0x180/0x1b0 RSP: ffffc9000030fbf0 As it turns out, normally the freeing of IRQs that would fix this is called inside of the scope of __igb_close(). However, since the device is already gone by the point we try to unregister the netdevice from the driver due to a hotplug we end up seeing that the netif isn't present and thus, forget to free any of the device IRQs. So: make sure that if we're in the process of dismantling the netdev, we always allow __igb_close() to be called so that IRQs may be freed normally. Additionally, only allow igb_close() to be called from __igb_close() if it hasn't already been called for the given adapter. Signed-off-by: Lyude Paul <lyude@redhat.com> Fixes: 9474933caf21 ("igb: close/suspend race in netif_device_detach") Cc: Todd Fujinaka <todd.fujinaka@intel.com> Cc: Stephen Hemminger <stephen@networkplumber.org> Cc: stable@vger.kernel.org Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-01-24	e1000e: Alert the user that C-states will be disabled by enabling jumbo frames	Matt Turner
	I personally spent a long time trying to decypher why my CPU would not reach deeper C-states. Let's just tell the next user what's going on. Signed-off-by: Matt Turner <matt.turner@intel.com> Acked-by: Shannon Nelson <shannon.nelson@oracle.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-01-24	igb: Clarify idleslope config constraints	Jesus Sanchez-Palencia
	By design, the idleslope increments are restricted to 16.384kbps steps. Add a comment to igb_main.c making that explicit and add one example that illustrates the impact of that. Signed-off-by: Jesus Sanchez-Palencia <jesus.sanchez-palencia@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-01-24	e1000e: Set HTHRESH when PTHRESH is used	Matt Turner
	According to section 12.0.3.4.13 "Receive Descriptor Control - RXDCTL" of the Intel® 82579 Gigabit Ethernet PHY Datasheet v2.1: "HTHRESH should be given a non zero value when ever PTHRESH is used." In RXDCTL(0), PTHRESH lives at bits 5:0, and HTHREST lives at bits 13:8. Set only bit 8 of HTHREST as is done in e1000_flush_rx_ring(). Found by inspection. Signed-off-by: Matt Turner <matt.turner@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-01-24	igb: add function to get maximum RSS queues	Zhang Shengju
	This patch adds a new function igb_get_max_rss_queues() to get maximum RSS queues, this will reduce duplicate code and facilitate future maintenance. Signed-off-by: Zhang Shengju <zhangshengju@cmss.chinamobile.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-01-24	igb: Allow to remove administratively set MAC on VFs	Corinna Vinschen
	Before libvirt modifies the MAC address and vlan tag for an SRIOV VF for use by a virtual machine (either using vfio device assignment or macvtap passthru mode), it saves the current MAC address and vlan tag so that it can reset them to their original value when the guest is done. Libvirt can't leave the VF MAC set to the value used by the now-defunct guest since it may be started again later using a different VF, but it certainly shouldn't just pick any random value, either. So it saves the state of everything prior to using the VF, and resets it to that. The igb driver initializes the MAC addresses of all VFs to 00:00:00:00:00:00, and reports that when asked (via an RTM_GETLINK netlink message, also visible in the list of VFs in the output of "ip link show"). But when libvirt attempts to restore the MAC address back to 00:00:00:00:00:00 (using an RTM_SETLINK netlink message) the kernel responds with "Invalid argument". Forbidding a reset back to the original value leaves the VF MAC at the value set for the now-defunct virtual machine. Especially on a system with NetworkManager enabled, this has very bad consequences, since NetworkManager forces all interfacess to be IFF_UP all the time - if the same virtual machine is restarted using a different VF (or even on a different host), there will be multiple interfaces watching for traffic with the same MAC address. To allow libvirt to revert to the original state, we need a way to remove the administrative set MAC on a VF, to allow normal host operation again, and to reset/overwrite the VF MAC via VF netdev. This patch implements the outlined scenario by allowing to set the VF MAC to 00:00:00:00:00:00 via RTM_SETLINK on the PF. igb_ndo_set_vf_mac resets the IGB_VF_FLAG_PF_SET_MAC flag to 0, so it's possible to reset the VF MAC back to the original value via the VF netdev. Note: Recent patches to libvirt allow for a workaround if the NIC isn't capable of resetting the administrative MAC back to all 0, but in theory the NIC should allow resetting the MAC in the first place. Signed-off-by: Corinna Vinschen <vinschen@redhat.com> Tested-by: Aaron Brown <arron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-01-24	Merge branch 'pktgen-Behavior-flags-fixes'	David S. Miller
	Dmitry Safonov says: ==================== pktgen: Behavior flags fixes v2: o fixed a nitpick from David Miller There are a bunch of fixes/cleanups/Documentations. Diffstat says for itself, regardless added docs and missed flag parameters. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-24	pktgen: Clean read user supplied flag mess	Dmitry Safonov
	Don't use error-prone-brute-force way. Signed-off-by: Dmitry Safonov <dima@arista.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-24	pktgen: Remove brute-force printing of flags	Dmitry Safonov
	Add macro generated pkt_flag_names array, with a little help of which the flags can be printed by using an index. Signed-off-by: Dmitry Safonov <dima@arista.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-24	pktgen: Add behaviour flags macro to generate flags/names	Dmitry Safonov
	PKT_FALGS macro will be used to add package behavior names definitions to simplify the code that prints/reads pkg flags. Sorted the array in order of printing the flags in pktgen_if_show() Note: Renamed IPSEC_ON => IPSEC for simplicity. No visible behavior change expected. Signed-off-by: Dmitry Safonov <dima@arista.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-24	pktgen: Add missing !flag parameters	Dmitry Safonov
	o FLOW_SEQ now can be disabled with pgset "flag !FLOW_SEQ" o FLOW_SEQ and FLOW_RND are antonyms, as it's shown by pktgen_if_show() o IPSEC now may be disabled Note, that IPV6 is enabled with dst6/src6 parameters, not with a flag parameter. Signed-off-by: Dmitry Safonov <dima@arista.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-24	Documentation/pktgen: Clearify how-to use pktgen samples	Dmitry Safonov
	o Change process name in ps output: looks like, these days the process is named kpktgend_<cpu>, rather than pktgen/<cpu>. o Use pg_ctrl for start/stop as it can work well with pgset without changes to $(PGDEV) variable. o Clarify a bit needed $(PGDEV) definition for sample scripts and that one needs to `source functions.sh`. o Document how-to unset a behaviour flag, note about history expansion. o Fix pgset spi parameter value. Cc: Jonathan Corbet <corbet@lwn.net> Cc: linux-doc@vger.kernel.org Signed-off-by: Dmitry Safonov <dima@arista.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-24	net: sched: fix TCF_LAYER_LINK case in tcf_get_base_ptr	Wolfgang Bumiller
	TCF_LAYER_LINK and TCF_LAYER_NETWORK returned the same pointer as skb->data points to the network header. Use skb_mac_header instead. Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-24	net: sched: em_nbyte: don't add the data offset twice	Wolfgang Bumiller
	'ptr' is shifted by the offset and then validated, the memcmp should not add it a second time. Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-24	Bluetooth: btintel: Create common Intel Read Boot Params function	Tedd Ho-Jeong An
	The Intel_Read_Boot_Params command is used to read boot parameters from the bootloader and this is Intel generic command used in USB and UART drivers. Signed-off-by: Tedd Ho-Jeong An <tedd.an@linux.intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2018-01-24	Bluetooth: btintel: Use boot parameter from firmware file	Tedd Ho-Jeong An
	Each RAM SKU has a different boot parameter which is used in HCI_Intel_Reset command after downloading the firmware. The boot parameter is embedded in the firmware data and to support multiple SKUs, driver reads the boot parameter while downloading the firmware instead of using static values per SKU. Signed-off-by: Tedd Ho-Jeong An <tedd.an@linux.intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2018-01-24	Bluetooth: btintel: Create common function for Intel Reset	Tedd Ho-Jeong An
	The Intel_Reset command is used to reset the device after downloading the firmware and this is Intel generic command used in both USB and UART. Signed-off-by: Tedd Ho-Jeong An <tedd.an@linux.intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2018-01-24	Bluetooth: hci_intel: Update firmware filename for Intel 9x60 and later	Tedd Ho-Jeong An
	The format of Intel Bluetooth firmware for bootloader product is ibt-<hw_variant>-<device_revision_id>.sfi and .ddc. But for the 9x60 SKU, there are three variants of FW, which cannot be differenticate just with hw_variant and device_revision_id. So, to pick the appropriate FW file for 9x60 SKU, three fields, hw_variant, hw_revision, and fw_revision, needs to be used rather than hw_variant and device_revision_id. Format will be like this: ibt-<hw_variant>-<hw_revision>-<fw_revision>.sfi and .ddc Signed-off-by: Tedd Ho-Jeong An <tedd.an@linux.intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2018-01-24	Merge tag 'trace-v4.15-rc9' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing fixes from Steven Rostedt: "With the new ORC unwinder, ftrace stack tracing became disfunctional. One was that ORC didn't know how to handle the ftrace callbacks in general (which Josh fixed). The other was that ORC would just bail if it hit a dynamically allocated trampoline. Which means all ftrace stack tracing that happens from the function tracer would produce no results (that includes killing the max stack size tracer). I added a check to the ORC unwinder to see if the trampoline belonged to ftrace, and if it did, use the orc entry of the static trampoline that was used to create the dynamic one (it would be identical). Finally, I noticed that the skip values of the stack tracing were out of whack. I went through and fixed them up" * tag 'trace-v4.15-rc9' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracing: Update stack trace skipping for ORC unwinder ftrace, orc, x86: Handle ftrace dynamically allocated trampolines x86/ftrace: Fix ORC unwinding from ftrace handlers
2018-01-24	MAINTAINERS: clarify that only verified bugs should be submitted to security@	Willy Tarreau
	We're seeing a raise of automated reports from testing tools and reports about address leaks that are not really exploitable as-is, many of which do not represent an immediate risk justifying to work in closed places. Signed-off-by: Willy Tarreau <w@1wt.eu> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-01-24	Revert "module: Add retpoline tag to VERMAGIC"	Greg Kroah-Hartman
	This reverts commit 6cfb521ac0d5b97470883ff9b7facae264b7ab12. Turns out distros do not want to make retpoline as part of their "ABI", so this patch should not have been merged. Sorry Andi, this was my fault, I suggested it when your original patch was the "correct" way of doing this instead. Reported-by: Jiri Kosina <jikos@kernel.org> Fixes: 6cfb521ac0d5 ("module: Add retpoline tag to VERMAGIC") Acked-by: Andi Kleen <ak@linux.intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: David Woodhouse <dwmw@amazon.co.uk> Cc: rusty@rustcorp.com.au Cc: arjan.van.de.ven@intel.com Cc: jeyu@kernel.org Cc: stable <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-01-24	mlxsw: spectrum_router: Don't log an error on missing neighbor	Yuval Mintz
	Driver periodically samples all neighbors configured in device in order to update the kernel regarding their state. When finding an entry configured in HW that doesn't show in neigh_lookup() driver logs an error message. This introduces a race when removing multiple neighbors - it's possible that a given entry would still be configured in HW as its removal is still being processed but is already removed from the kernel's neighbor tables. Simply remove the error message and gracefully accept such events. Fixes: c723c735fa6b ("mlxsw: spectrum_router: Periodically update the kernel's neigh table") Fixes: 60f040ca11b9 ("mlxsw: spectrum_router: Periodically dump active IPv6 neighbours") Signed-off-by: Yuval Mintz <yuvalm@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-24	Merge branch 'cxgb4-fix-build-error'	David S. Miller
	Rahul Lakkireddy says: ==================== cxgb4: fix build error Patch 1 fixes build error with compiling cudbg_zlib.c when CONFIG_ZLIB_DEFLATE macro is not defined. Patch 2 fixes following sparse warning: "Using plain integer as NULL pointer" ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-24	cxgb4: properly initialize variables	Rahul Lakkireddy
	memset variables to 0 to fix sparse warnings: drivers/net/ethernet/chelsio/cxgb4/cxgb4_cudbg.c:409:42: sparse: Using plain integer as NULL pointer drivers/net/ethernet/chelsio/cxgb4/cudbg_zlib.c:43:47: sparse: Using plain integer as NULL pointer Fixes: ad75b7d32f25 ("cxgb4: implement ethtool dump data operations") Fixes: 91c1953de387 ("cxgb4: use zlib deflate to compress firmware dump") Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-24	cxgb4: enable ZLIB_DEFLATE when building cxgb4	Rahul Lakkireddy
	Fixes: drivers/net/ethernet/chelsio/cxgb4/cudbg_zlib.c:39:5: error: redefinition of 'cudbg_compress_buff' int cudbg_compress_buff(struct cudbg_init pdbg_init, ^~~~~~~~~~~~~~~~~~~ In file included from drivers/net/ethernet/chelsio/cxgb4/cudbg_zlib.c:23:0: drivers/net/ethernet/chelsio/cxgb4/cudbg_zlib.h:45:19: note: previous definition of 'cudbg_compress_buff' was here static inline int cudbg_compress_buff(struct cudbg_init pdbg_init, ^~~~~~~~~~~~~~~~~~~ Fixes: 91c1953de387 ("cxgb4: use zlib deflate to compress firmware dump") Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-24	Merge branch 'net-smc-socket-closing-improvements'	David S. Miller
	Ursula Braun says: ==================== net/smc: socket closing improvements while the first 2 patches are just small cleanups, the remaing patches affect socket closing. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-24	net/smc: continue waiting if peer signals write_shutdown	Ursula Braun
	If the peer sends a shutdown WRITE, this should not affect sending in general, and waiting for send buffer space in particular. Stop waiting of the local socket for send buffer space only, if peer signals closing, but not if peer signals just shutdown WRITE. Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-24	net/smc: improve state change handling after close wait	Ursula Braun
	When a socket is closed or shutdown, smc waits for data being transmitted in certain states. If the state changes during this wait, the close switch depending on state should be reentered. In addition, state change is avoided if sending of close or shutdown fails. Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-24	net/smc: make wait for work request uninterruptible	Ursula Braun
	Work requests are needed for every ib_post_send(), among them the ib_post_send() to signal closing. If an smc socket program is cancelled, the smc connections should be cleaned up, and require sending of closing signals to the peer. This may fail, if a wait for a free work request is needed, but is cancelled immediately due to the cancel interrupt. To guarantee notification of the peer, the wait for a work request is changed to uninterruptible. And the area to receive work request completion info with ib_poll_cq() is cleared first. And _tx_ variable names are used in the _tx_routines for the demultiplexing common type in the header. Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-24	net/smc: get rid of tx_pend waits in socket closing	Ursula Braun
	There is no need to wait for confirmation of pending tx requests for a closing connection, since pending tx slots are dismissed when finishing a connection. Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-24	net/smc: simplify function smc_clcsock_accept()	Ursula Braun
	Cleanup to avoid duplicate code in smc_clcsock_accept(). No functional change. Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-24	net/smc: use local struct sock variables consistently	Ursula Braun
	Cleanup to consistently exploit the local struct sock definitions. No functional change. Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-24	cxgb4/cxgb4vf: add support for ndo_set_vf_vlan	Ganesh Goudar
	implement ndo_set_vf_vlan for mgmt netdevice to configure the PCIe VF. Original work by: Casey Leedom <leedom@chelsio.com> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-24	Merge branch 'master' of ↵	David S. Miller
	git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec Steffen Klassert says: ==================== pull request (net): ipsec 2018-01-24 1) Only offloads SAs after they are fully initialized. Otherwise a NIC may receive packets on a SA we can not yet handle in the stack. From Yossi Kuperman. 2) Fix negative refcount in case of a failing offload. From Aviad Yehezkel. 3) Fix inner IP ptoro version when decapsulating from interaddress family tunnels. From Yossi Kuperman. 4) Use true or false for boolean variables instead of an integer value in xfrm_get_type_offload. From Gustavo A. R. Silva. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-23	Merge branch 'bpf-and-netdevsim-test-updates'	David S. Miller
	Jakub Kicinski says: ==================== bpf and netdevsim test updates A number of test improvements (delayed by merges). Quentin provides patches for checking printing to the verifier log from the drivers and validating extack messages are propagated. There is also a test for replacing TC filters to avoid adding back the bug Daniel recently fixed in net and stable. ==================== Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-23	selftests/bpf: validate replace of TC filters is working	Jakub Kicinski
	Daniel discovered recently I broke TC filter replace (and fixed it in commit ad9294dbc227 ("bpf: fix cls_bpf on filter replace")). Add a test to make sure it never happens again. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-23	selftests/bpf: check bpf verifier log buffer usage works for HW offload	Quentin Monnet
	Make netdevsim print a message to the BPF verifier log buffer when a program is offloaded. Then use this message in hardware offload selftests to make sure that using this buffer actually prints the message to the console for eBPF hardware offload. The message is appended after the last instruction is processed with the verifying function from netdevsim. Output looks like the following: $ tc filter add dev foo ingress bpf obj sample_ret0.o \ sec .text verbose skip_sw Prog section '.text' loaded (5)! - Type: 3 - Instructions: 2 (0 over limit) - License: Verifier analysis: 0: (b7) r0 = 0 1: (95) exit [netdevsim] Hello from netdevsim! processed 2 insns, stack depth 0 "verbose" flag is required to see it in the console since netdevsim does not throw an error after printing the message. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-23	netdevsim: don't compile BPF code if syscall not enabled	Jakub Kicinski
	We should not compile netdevsim/bpf.c if BPF syscall is not enabled. Otherwise bpf core would have to provide wrappers for all functions offload drivers may call, even though system will never see a BPF object. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-23	selftests/bpf: add checks on extack messages for eBPF hw offload tests	Quentin Monnet
	Add checks to test that netlink extack messages are correctly displayed in some expected error cases for eBPF offload to netdevsim with TC and XDP. iproute2 may be built without libmnl support, in which case the extack messages will not be reported. Try to detect this condition, and when enountered print a mild warning to the user and skip the extack validation. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-23	netdevsim: add extack support for TC eBPF offload	Quentin Monnet
	Use the recently added extack support for TC eBPF filters in netdevsim. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-23	Merge branch '40GbE' of ↵	David S. Miller
	git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue Jeff Kirsher says: ==================== 40GbE Intel Wired LAN Driver Updates 2018-01-23 This series contains updates to i40e and i40evf only. Pawel enables FlatNVM support on x722 devices by allowing nvmupdate tool to configure the preservation flags in the AdminQ command. Mitch fixes a potential divide by zero error when DCB is enabled and the firmware fails to configure the VSI, so check for this state. Fixed a bug where the driver could fail to adhere to ETS bandwidth allocations if 8 traffic classes were configured on the switch. Sudheer fixes a potential deadlock by avoiding to call flush_schedule_work() in i40evf_remove(), since cancel_work_sync() and cancel_delayed_work_sync() already cleans up necessary work items. Fixed an issue with the problematic detection and recovery from hung queues in the PF which was causing lost interrupts. This is done by triggering a software interrupt so that interrupts are forced on and if we are already in napi_poll and an interrupt fires, napi_poll will not be rescheduled and the interrupt is lost. Avinash fixes an issue in the VF where is was possible to issue a reset_task while the device is currently being removed. Michal fixes an issue occurring while calling i40e_led_set() with the blink parameter set to true, which was causing the activity LED instead of the link LED to blink for port identification. Shiraz changes the client interface to not call client close/open on netdev down/up events, since this causes a lot of thrash that is not needed. Instead, disable the PE TCP-ENA flag during a netdev down event and re-enable on a netdev up event, since this blocks all TCP traffic to the RDMA protocol engine. Alan fixes an issue which was causing a potential transmit hang by ignoring the PF link up message if the VF state is not yet in the RUNNING state. Amritha fixes the channel VSI recreation during the reset flow to reconfigure the transmit rings and the queue context associated with the channel VSI. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-23	vmxnet3: repair memory leak	Neil Horman
	with the introduction of commit b0eb57cb97e7837ebb746404c2c58c6f536f23fa, it appears that rq->buf_info is improperly handled. While it is heap allocated when an rx queue is setup, and freed when torn down, an old line of code in vmxnet3_rq_destroy was not properly removed, leading to rq->buf_info[0] being set to NULL prior to its being freed, causing a memory leak, which eventually exhausts the system on repeated create/destroy operations (for example, when the mtu of a vmxnet3 interface is changed frequently. Fix is pretty straight forward, just move the NULL set to after the free. Tested by myself with successful results Applies to net, and should likely be queued for stable, please Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Reported-By: boyang@redhat.com CC: boyang@redhat.com CC: Shrikrishna Khare <skhare@vmware.com> CC: "VMware, Inc." <pv-drivers@vmware.com> CC: David S. Miller <davem@davemloft.net> Acked-by: Shrikrishna Khare <skhare@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-23	ipv6: Fix getsockopt() for sockets with default IPV6_AUTOFLOWLABEL	Ben Hutchings
	Commit 513674b5a2c9 ("net: reevalulate autoflowlabel setting after sysctl setting") removed the initialisation of ipv6_pinfo::autoflowlabel and added a second flag to indicate whether this field or the net namespace default should be used. The getsockopt() handling for this case was not updated, so it currently returns 0 for all sockets for which IPV6_AUTOFLOWLABEL is not explicitly enabled. Fix it to return the effective value, whether that has been set at the socket or net namespace level. Fixes: 513674b5a2c9 ("net: reevalulate autoflowlabel setting after sysctl ...") Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-23	Merge branch 'act_csum-spinlock-remove'	David S. Miller
	Davide Caratti says: ==================== net/sched: remove spinlock from 'csum' action Similarly to what has been done earlier with other actions [1][2], this series tries to improve the performance of 'csum' tc action, removing a spinlock in the data path. Patch 1 lets act_csum use per-CPU counters; patch 2 removes spin_{,un}lock_bh() calls from the act() method. test procedure (using pktgen from https://github.com/netoptimizer): # ip link add name eth1 type dummy # ip link set dev eth1 up # tc qdisc add dev eth1 root handle 1: prio # for a in pass drop; do > tc filter del dev eth1 parent 1: pref 10 matchall action csum udp > tc filter add dev eth1 parent 1: pref 10 matchall action csum udp $a > for n in 2 4; do > ./pktgen_bench_xmit_mode_queue_xmit.sh -v -s 64 -t $n -n 1000000 -i eth1 > done > done test results: \| \| before patch \| after patch $a \| $n \| avg. pps/thread \| avg. pps/thread -----+----+-----------------+---------------- pass \| 2 \| 1671463 ± 4% \| 1920789 ± 3% pass \| 4 \| 648797 ± 1% \| 738190 ± 1% drop \| 2 \| 3212692 ± 2% \| 3719811 ± 2% drop \| 4 \| 1078824 ± 1% \| 1328099 ± 1% references: [1] https://www.spinics.net/lists/netdev/msg334760.html [2] https://www.spinics.net/lists/netdev/msg465862.html v3 changes: - use rtnl_dereference() in place of rcu_dereference() in tcf_csum_dump() v2 changes: - add 'drop' test, it produces more contentions - use RCU-protected struct to store 'action' and 'update_flags', to avoid reading the values from subsequent configurations ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-23	net/sched: act_csum: don't use spinlock in the fast path	Davide Caratti
	use RCU instead of spin_{,unlock}_bh() to protect concurrent read/write on act_csum configuration, to reduce the effects of contention in the data path when multiple readers are present. Signed-off-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>