Age | Commit message | Author
2016-12-25 | Merge branch 'turbostat' of ↵ | Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux Pull turbostat updates from Len Brown. * 'turbostat' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux: tools/power turbostat: remove obsolete -M, -m, -C, -c options tools/power turbostat: Make extensible via the --add parameter tools/power turbostat: Denverton uses a 25 MHz crystal, not 19.2 MHz tools/power turbostat: line up headers when -M is used tools/power turbostat: fix SKX PKG_CSTATE_LIMIT decoding tools/power turbostat: Support Knights Mill (KNM) tools/power turbostat: Display HWP OOB status tools/power turbostat: fix Denverton BCLK tools/power turbostat: use intel-family.h model strings tools/power/turbostat: Add Denverton RAPL support tools/power/turbostat: Add Denverton support tools/power/turbostat: split core MSR support into status + limit tools/power turbostat: fix error case overflow read of slm_freq_table[] tools/power turbostat: Allocate correct amount of fd and irq entries tools/power turbostat: switch to tab delimited output tools/power turbostat: Gracefully handle ACPI S3 tools/power turbostat: tidy up output on Joule counter overflow
2016-12-25 | mm: add PageWaiters indicating tasks are waiting for a page bit | Nicholas Piggin
Add a new page flag, PageWaiters, to indicate the page waitqueue has tasks waiting. This can be tested rather than testing waitqueue_active which requires another cacheline load.

This bit is always set when the page has tasks on page_waitqueue(page), and is set and cleared under the waitqueue lock. It may be set when there are no tasks on the waitqueue, which will cause a harmless extra wakeup check that will clear the bit.

The generic bit-waitqueue infrastructure is no longer used for pages. Instead, waitqueues are used directly with a custom key type. The generic code was not flexible enough to have PageWaiters manipulation under the waitqueue lock (which simplifies concurrency).

This improves the performance of page lock intensive microbenchmarks by 2-3%.

Putting two bits in the same word opens the opportunity to remove the memory barrier between clearing the lock bit and testing the waiters bit, after some work on the arch primitives (e.g., ensuring memory operand widths match and cover both bits).

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Bob Peterson <rpeterso@redhat.com>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Andrew Lutomirski <luto@kernel.org>
Cc: Andreas Gruenbacher <agruenba@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
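The fast path this commit describes can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the kernel implementation: the `Page` class, the bit positions, and the `queue_checks` counter are hypothetical names invented for illustration, and no real locking or concurrency is modelled.

```python
from collections import deque

# Hypothetical bit positions, chosen for illustration only.
PG_LOCKED = 1 << 0
PG_WAITERS = 1 << 1


class Page:
    """Toy model of a page: one flags word plus a wait queue."""

    def __init__(self):
        self.flags = 0
        self.waitqueue = deque()
        self.queue_checks = 0  # how often unlock had to look at the queue

    def trylock(self):
        if self.flags & PG_LOCKED:
            return False
        self.flags |= PG_LOCKED
        return True

    def add_waiter(self, task):
        # The commit message says the bit is set under the waitqueue lock;
        # this single-threaded model has no lock to take.
        self.waitqueue.append(task)
        self.flags |= PG_WAITERS

    def unlock(self):
        self.flags &= ~PG_LOCKED
        # Fast path: the waiters bit sits in the same word as the lock bit,
        # so the wait queue is not touched when nobody is waiting.
        if not (self.flags & PG_WAITERS):
            return None
        return self._wake_one()

    def _wake_one(self):
        self.queue_checks += 1
        task = self.waitqueue.popleft() if self.waitqueue else None
        if not self.waitqueue:
            # Queue drained, or the bit was stale: clear it.
            self.flags &= ~PG_WAITERS
        return task
```

An uncontended lock/unlock pair never inspects the queue in this model, while a stale waiters bit costs one extra check that clears it, which mirrors the "harmless extra wakeup check" wording above.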
2016-12-25 | mm: Use owner_priv bit for PageSwapCache, valid when PageSwapBacked | Nicholas Piggin
A page is not added to the swap cache without being swap backed, so PageSwapBacked mappings can use PG_owner_priv_1 for PageSwapCache. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Acked-by: Hugh Dickins <hughd@google.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Bob Peterson <rpeterso@redhat.com> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Andrew Lutomirski <luto@kernel.org> Cc: Andreas Gruenbacher <agruenba@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Mel Gorman <mgorman@techsingularity.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
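One way to read "valid when PageSwapBacked" is that the shared bit only counts as the swap-cache flag when the swap-backed flag is also set. The sketch below illustrates that reading; the bit positions and the `page_swap_cache` helper are assumptions made for this example and are not taken from the kernel headers.

```python
# Hypothetical bit positions, for illustration only.
PG_OWNER_PRIV_1 = 1 << 0
PG_SWAPBACKED = 1 << 1

# The swap-cache flag reuses the owner-private bit.
PG_SWAPCACHE = PG_OWNER_PRIV_1


def page_swap_cache(flags):
    """Treat the shared bit as 'in swap cache' only for swap-backed pages."""
    return bool(flags & PG_SWAPBACKED) and bool(flags & PG_SWAPCACHE)
```

In this model a page that is not swap backed can keep using the owner-private bit for its own purposes without being mistaken for a swap-cache page.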
2016-12-24 | tools/power turbostat: remove obsolete -M, -m, -C, -c options | Len Brown
The new --add option has replaced the -M, -m, -C, -c options. E.g.:

  -M 0x10 is now --add msr0x10,raw
  -m 0x10 is now --add msr0x10,raw,u32
  -C 0x10 is now --add msr0x10,delta
  -c 0x10 is now --add msr0x10,delta,u32

The --add option can be repeated to add any number of counters, while the previous options were limited to adding one of each type.

In addition, the --add option can accept a column label, and can also display a counter as a percentage of elapsed cycles. E.g.:

  --add msr0x3fe,core,percent,MY_CC3

Signed-off-by: Len Brown <len.brown@intel.com>
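For scripts that still pass the removed options, the four mappings listed in this commit message can be applied mechanically. The helper below is a hypothetical convenience written for illustration (`legacy_to_add` and `LEGACY_FORMATS` are not part of turbostat); it encodes only the four translations stated above.

```python
# Old option -> format fields, exactly as listed in the commit message.
LEGACY_FORMATS = {
    "-M": ("raw",),
    "-m": ("raw", "u32"),
    "-C": ("delta",),
    "-c": ("delta", "u32"),
}


def legacy_to_add(flag, msr):
    """Translate e.g. ('-m', '0x10') into ['--add', 'msr0x10,raw,u32']."""
    fields = ("msr" + msr,) + LEGACY_FORMATS[flag]
    return ["--add", ",".join(fields)]
```

Because --add can be repeated, the translated arguments for several old options can simply be concatenated on one command line.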
2016-12-24 | tools/power turbostat: Make extensible via the --add parameter | Len Brown
Create the "--add" parameter. This can be used to teach an existing turbostat binary about any number of any type of counter. turbostat(8) details the syntax for --add. Signed-off-by: Len Brown <len.brown@intel.com>
2016-12-24 | Replace <asm/uaccess.h> with <linux/uaccess.h> globally | Linus Torvalds
This was entirely automated, using the script by Al:

  PATT='^[[:blank:]]*#[[:blank:]]*include[[:blank:]]*<asm/uaccess.h>'
  sed -i -e "s!$PATT!#include <linux/uaccess.h>!" \
        $(git grep -l "$PATT"|grep -v ^include/linux/uaccess.h)

to do the replacement at the end of the merge window.

Requested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
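To see what the pattern in that script matches, here is a rough Python equivalent operating on a string instead of on files. It is a sketch only: `rewrite_includes` is a hypothetical name, `[[:blank:]]` is rendered as `[ \t]`, and the dot in the header name is escaped here, which is slightly stricter than the original pattern.

```python
import re

# Leading blanks, '#', optional blanks, 'include', optional blanks,
# then the old header name. MULTILINE makes '^' match at every line start.
PATT = re.compile(r"^[ \t]*#[ \t]*include[ \t]*<asm/uaccess\.h>", re.MULTILINE)


def rewrite_includes(source):
    """Replace each matching include directive with the <linux/...> form."""
    return PATT.sub("#include <linux/uaccess.h>", source)
```

As with the sed expression, whatever follows the closing `>` on the line (a trailing comment, for instance) is left in place, and any leading whitespace or spacing inside the directive is normalised away.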
2016-12-24 | Merge branch 'for-next' of git://git.samba.org/sfrench/cifs-2.6 | Linus Torvalds
Pull cifs fixes from Steve French:
 "This includes various cifs/smb3 bug fixes, mostly for stable as well.

  In the next week I expect that Germano will have some reconnection fixes, and also I expect to have the remaining pieces of the snapshot enablement and SMB3 ACLs, but wanted to get this set of bug fixes in"

* 'for-next' of git://git.samba.org/sfrench/cifs-2.6:
  cifs_get_root shouldn't use path with tree name
  Fix default behaviour for empty domains and add domainauto option
  cifs: use %16phN for formatting md5 sum
  cifs: Fix smbencrypt() to stop pointing a scatterlist at the stack
  CIFS: Fix a possible double locking of mutex during reconnect
  CIFS: Fix a possible memory corruption during reconnect
  CIFS: Fix a possible memory corruption in push locks
  CIFS: Fix missing nls unload in smb2_reconnect()
  CIFS: Decrease verbosity of ioctl call
  SMB3: parsing for new snapshot timestamp mount parm
2016-12-24 | Merge tag 'watchdog-for-linus-v4.10' of ↵ | Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging

Pull watchdog updates from Wim Van Sebroeck and Guenter Roeck:

 - new driver for Loongson1 SoC
 - minor cleanup and fixes in various drivers

* tag 'watchdog-for-linus-v4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
  watchdog: it87_wdt: add IT8620E ID
  watchdog: mpc8xxx: Remove unneeded linux/miscdevice.h include
  watchdog: octeon: Remove unneeded linux/miscdevice.h include
  watchdog: bcm2835_wdt: set WDOG_HW_RUNNING bit when appropriate
  watchdog: loongson1: Add Loongson1 SoC watchdog driver
  watchdog: cpwd: remove memory allocate failure message
  watchdog: da9062/61: watchdog driver
  intel-mid_wdt: Error code is just an integer
  intel-mid_wdt: make sure watchdog is not running at startup
  watchdog: mei_wdt: request stop on reboot to prevent false positive event
  watchdog: hpwdt: changed maintainer information
  watchdog: jz4740: Fix modular build
  watchdog: qcom: fix kernel panic due to external abort on non-linefetch
  watchdog: davinci: add support for deferred probing
  watchdog: meson: Remove unneeded platform MODULE_ALIAS
  watchdog: Standardize leading tabs and spaces in Kconfig file
  watchdog: max77620_wdt: fix module autoload
  watchdog: bcm7038_wdt: fix module autoload
2016-12-24 | Merge tag 'ntb-4.10' of git://github.com/jonmason/ntb | Linus Torvalds
Pull NTB update from Jon Mason: - NTB bug fixes for removing an unnecessary call to ntb_peer_spad_read, and correcting a free_irq inconsistency - add Intel SKX support - change the AMD NTB maintainer, and fix some bugs present there * tag 'ntb-4.10' of git://github.com/jonmason/ntb: ntb_transport: Remove unnecessary call to ntb_peer_spad_read NTB: Fix 'request_irq()' and 'free_irq()' inconsistancy ntb: fix SKX NTB config space size register offsets NTB: correct ntb_peer_spad_read for case when callback is not supplied. MAINTAINERS: Change in maintainer for AMD NTB ntb_transport: Limit memory windows based on available, scratchpads NTB: Register and offset values fix for memory window NTB: add support for hotplug feature ntb: Adding Skylake Xeon NTB support
2016-12-23 | Merge branch 'x86-urgent-for-linus' of ↵ | Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: "There's a number of fixes: - a round of fixes for CPUID-less legacy CPUs - a number of microcode loader fixes - i8042 detection robustization fixes - stack dump/unwinder fixes - x86 SoC platform driver fixes - a GCC 7 warning fix - virtualization related fixes" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits) Revert "x86/unwind: Detect bad stack return address" x86/paravirt: Mark unused patch_default label x86/microcode/AMD: Reload proper initrd start address x86/platform/intel/quark: Add printf attribute to imr_self_test_result() x86/platform/intel-mid: Switch MPU3050 driver to IIO x86/alternatives: Do not use sync_core() to serialize I$ x86/topology: Document cpu_llc_id x86/hyperv: Handle unknown NMIs on one CPU when unknown_nmi_panic x86/asm: Rewrite sync_core() to use IRET-to-self x86/microcode/intel: Replace sync_core() with native_cpuid() Revert "x86/boot: Fail the boot if !M486 and CPUID is missing" x86/asm/32: Make sync_core() handle missing CPUID on all 32-bit kernels x86/cpu: Probe CPUID leaf 6 even when cpuid_level == 6 x86/tools: Fix gcc-7 warning in relocs.c x86/unwind: Dump stack data on warnings x86/unwind: Adjust last frame check for aligned function stacks x86/init: Fix a couple of comment typos x86/init: Remove i8042_detect() from platform ops Input: i8042 - Trust firmware a bit more when probing on X86 x86/init: Add i8042 state to the platform data ...
2016-12-23 | Merge branch 'timers-urgent-for-linus' of ↵ | Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fix from Ingo Molnar: "ARM/MOXA SoC clocksource driver fixes" * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: clocksource/drivers/moxart: Plug memory and mapping leaks
2016-12-23 | Merge branch 'perf-urgent-for-linus' of ↵ | Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "On the kernel side there's two x86 PMU driver fixes and a uprobes fix, plus on the tooling side there's a number of fixes and some late updates" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (36 commits) perf sched timehist: Fix invalid period calculation perf sched timehist: Remove hardcoded 'comm_width' check at print_summary perf sched timehist: Enlarge default 'comm_width' perf sched timehist: Honour 'comm_width' when aligning the headers perf/x86: Fix overlap counter scheduling bug perf/x86/pebs: Fix handling of PEBS buffer overflows samples/bpf: Move open_raw_sock to separate header samples/bpf: Remove perf_event_open() declaration samples/bpf: Be consistent with bpf_load_program bpf_insn parameter tools lib bpf: Add bpf_prog_{attach,detach} samples/bpf: Switch over to libbpf perf diff: Do not overwrite valid build id perf annotate: Don't throw error for zero length symbols perf bench futex: Fix lock-pi help string perf trace: Check if MAP_32BIT is defined (again) samples/bpf: Make perf_event_read() static uprobes: Fix uprobes on MIPS, allow for a cache flush after ixol breakpoint creation samples/bpf: Make samples more libbpf-centric tools lib bpf: Add flags to bpf_create_map() tools lib bpf: use __u32 from linux/types.h ...
2016-12-23 | Merge branch 'irq-urgent-for-linus' of ↵ | Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq fix from Ingo Molnar: "A build warning fix with certain .config's" * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: irqchip/st: Mark st_irq_syscfg_resume() __maybe_unused
2016-12-23 | ntb_transport: Remove unnecessary call to ntb_peer_spad_read | Steve Wahl
The results were previously ignored, anyway. Signed-off-by: Steve Wahl <Steve.Wahl@dell.com> Fixes: e26a5843f7f5014ae4460030ca4de029a3ac35d3 Acked-by: Allen Hubbe <Allen.Hubbe@dell.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>
2016-12-23 | NTB: Fix 'request_irq()' and 'free_irq()' inconsistancy | Christophe JAILLET
'request_irq()' and 'free_irq()' should have the same 'dev_id'. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Acked-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>
2016-12-23 | ntb: fix SKX NTB config space size register offsets | Dave Jiang
The offsets for the SZ registers are wrong. Updated. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Reported-by: Sandeep Mann <sandeep@purestorage.com> Tested-by: Zachary Ross <zacharyx.ross@intel.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>
2016-12-23 | NTB: correct ntb_peer_spad_read for case when callback is not supplied. | Steven Wahl
Correct ntb_peer_spad_read for case when callback is not supplied Signed-off-by: Steve Wahl <Steve.Wahl@dell.com> Acked-by: Allen Hubbe <Allen.Hubbe@dell.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>
2016-12-23 | MAINTAINERS: Change in maintainer for AMD NTB | Shyam Sundar S K
I would like to take maintainership for AMD NTB Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com> Signed-off-by: Xiangliang Yu <Xiangliang.Yu@amd.com> Acked-by: Xiangliang Yu <Xiangliang.Yu@amd.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>
2016-12-23 | ntb_transport: Limit memory windows based on available, scratchpads | Shyam Sundar S K
When the underlying NTB H/W driver advertises more memory windows than the number of scratchpads available to set up MWs, it is likely that we may end up filling the remaining memory windows with garbage. To avoid that, let's limit the memory windows that the transport driver can set up based on the available scratchpads. Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com> Acked-by: Allen Hubbe <Allen.Hubbe@dell.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>
2016-12-23 | NTB: Register and offset values fix for memory window | Shyam Sundar S K
Due to incorrect limit and translation register values, NTB link was going down when the memory window was setup. Made appropriate changes as per spec. Fix limit register values for BAR1, which was overlapping with the BAR23 address. Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com> Acked-by: Allen Hubbe <Allen.Hubbe@dell.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>
2016-12-23 | NTB: add support for hotplug feature | Xiangliang Yu
AMD NTB supports hotplug under B2B mode. NTB will trigger a link up/down interrupt event when a plug is added or removed; this patch implements the two interrupt events to support the B2B hotplug function. Signed-off-by: Xiangliang Yu <Xiangliang.Yu@amd.com> Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com> Acked-by: Allen Hubbe <Allen.Hubbe@dell.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>
2016-12-23 | ntb: Adding Skylake Xeon NTB support | Dave Jiang
The Skylake Xeon NTB hardware has made some changes to the register name, offset, and the way doorbells work. Adding driver support for the new hardware. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Acked-by: Allen Hubbe <Allen.Hubbe@dell.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>
2016-12-23 | Revert "x86/unwind: Detect bad stack return address" | Josh Poimboeuf
Revert the following commit:

  b6959a362177 ("x86/unwind: Detect bad stack return address")

... because Andrey Konovalov reported an unwinder warning:

  WARNING: unrecognized kernel stack return address ffffffffa0000001 at ffff88006377fa18 in a.out:4467

The unwind was initiated from an interrupt which occurred while running in the generated code for a kprobe. The unwinder printed the warning because it expected regs->ip to point to a valid text address, but instead it pointed to the generated code.

Eventually we may want to come up with a way to identify generated kprobe code so the unwinder can know that it's a valid return address. Until then, just remove the warning.

Reported-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/02f296848fbf49fb72dfeea706413ecbd9d4caf6.1482418739.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-12-23 | Merge tag 'perf-urgent-for-mingo-20161222' of ↵ | Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent Pull perf/urgent fixes from Arnaldo Carvalho de Melo: Fixes for 'perf sched timehist': (Namhyung Kim) - Define a larger initial alignment value for the COMM column and make it be more consistently honoured, for instance in the header. - Fix invalid period calculation when using the --time option to select a time slice, when events outside that slice were being considered for the per cpu idle stats summary. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-12-23 | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net | Linus Torvalds
Pull networking fixes from David Miller:

 1) We have to be careful to not try and place a checksum after the end of a rawv6 packet, fix from Dave Jones with help from Hannes Frederic Sowa.

 2) Missing memory barriers in tcp_tasklet_func() lead to crashes, from Eric Dumazet.

 3) Several bug fixes for the new XDP support in virtio_net, from Jason Wang.

 4) Increase headroom in RX skbs in be2net driver to accommodate encapsulations such as geneve. From Kalesh A P.

 5) Fix SKB frag unmapping on TX in mvpp2, from Thomas Petazzoni.

 6) Pre-pulling UDP headers created a regression in RECVORIGDSTADDR socket option support, from Willem de Bruijn.

 7) UID based routing added a potential OOPS in ip_do_redirect() when we see an SKB without a socket attached. We just need it for the network namespace which we can get from skb->dev instead. Fix from Lorenzo Colitti.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (30 commits)
  sctp: fix recovering from 0 win with small data chunks
  sctp: do not loose window information if in rwnd_over
  virtio-net: XDP support for small buffers
  virtio-net: remove big packet XDP codes
  virtio-net: forbid XDP when VIRTIO_NET_F_GUEST_UFO is support
  virtio-net: make rx buf size estimation works for XDP
  virtio-net: unbreak csumed packets for XDP_PASS
  virtio-net: correctly handle XDP_PASS for linearized packets
  virtio-net: fix page miscount during XDP linearizing
  virtio-net: correctly xmit linearized page on XDP_TX
  virtio-net: remove the warning before XDP linearizing
  mlxsw: spectrum_router: Correctly remove nexthop groups
  mlxsw: spectrum_router: Don't reflect dead neighs
  neigh: Send netevent after marking neigh as dead
  ipv6: handle -EFAULT from skb_copy_bits
  inet: fix IP(V6)_RECVORIGDSTADDR for udp sockets
  net/sched: cls_flower: Mandate mask when matching on flags
  net/sched: act_tunnel_key: Fix setting UDP dst port in metadata under IPv6
  stmmac: CSR clock configuration fix
  net: ipv4: Don't crash if passing a null sk to ip_do_redirect.
  ...
2016-12-23 | sctp: fix recovering from 0 win with small data chunks | Marcelo Ricardo Leitner
Currently if SCTP closes the receive window with window pressure, mostly caused by excessive skb overhead on payload/overheads ratio, SCTP will close the window abruptly while saving the delta on rwnd_press. It will start recovering rwnd as the chunks are consumed by the application and the rwnd_press will be only recovered after rwnd reach the same value as of rwnd_press, mostly to prevent silly window syndrome. Thing is, this is very inefficient with small data chunks, as with those it will never reach back that value, and thus it will never recover from such pressure. This means that we will not issue window updates when recovering from 0 window and will rely on a sender retransmit to notice it. The fix here is to remove such threshold, as no value is good enough: it depends on the (avg) chunk sizes being used. Test with netperf -t SCTP_STREAM -- -m 1, and trigger 0 window by sending SIGSTOP to netserver, sleep 1.2, and SIGCONT. Rate limited to 845kbps, for visibility. Capture done at netserver side. 
Previously: 01.500751 IP B.48277 > A.36925: sctp (1) [SACK] [cum ack 632372996] [a_rwnd 99153] [ 01.500752 IP A.36925 > B.48277: sctp (1) [DATA] (B)(E) [TSN: 632372997] [SID: 0] [SS 01.517471 IP A.36925 > B.48277: sctp (1) [DATA] (B)(E) [TSN: 632373010] [SID: 0] [SS 01.517483 IP B.48277 > A.36925: sctp (1) [SACK] [cum ack 632373009] [a_rwnd 0] [#gap 01.517485 IP A.36925 > B.48277: sctp (1) [DATA] (B)(E) [TSN: 632373083] [SID: 0] [SS 01.517488 IP B.48277 > A.36925: sctp (1) [SACK] [cum ack 632373009] [a_rwnd 0] [#gap 01.534168 IP A.36925 > B.48277: sctp (1) [DATA] (B)(E) [TSN: 632373096] [SID: 0] [SS 01.534180 IP B.48277 > A.36925: sctp (1) [SACK] [cum ack 632373009] [a_rwnd 0] [#gap 01.534181 IP A.36925 > B.48277: sctp (1) [DATA] (B)(E) [TSN: 632373169] [SID: 0] [SS 01.534185 IP B.48277 > A.36925: sctp (1) [SACK] [cum ack 632373009] [a_rwnd 0] [#gap 02.525978 IP A.36925 > B.48277: sctp (1) [DATA] (B)(E) [TSN: 632373010] [SID: 0] [SS 02.526021 IP B.48277 > A.36925: sctp (1) [SACK] [cum ack 632373009] [a_rwnd 0] [#gap (window update missed) 04.573807 IP A.36925 > B.48277: sctp (1) [DATA] (B)(E) [TSN: 632373010] [SID: 0] [SS 04.779370 IP B.48277 > A.36925: sctp (1) [SACK] [cum ack 632373082] [a_rwnd 859] [#g 04.789162 IP A.36925 > B.48277: sctp (1) [DATA] (B)(E) [TSN: 632373083] [SID: 0] [SS 04.789323 IP A.36925 > B.48277: sctp (1) [DATA] (B)(E) [TSN: 632373156] [SID: 0] [SS 04.789372 IP B.48277 > A.36925: sctp (1) [SACK] [cum ack 632373228] [a_rwnd 786] [#g After: 02.568957 IP B.50536 > A.55173: sctp (1) [SACK] [cum ack 2490098728] [a_rwnd 99153] 02.568961 IP A.55173 > B.50536: sctp (1) [DATA] (B)(E) [TSN: 2490098729] [SID: 0] [S 02.585631 IP A.55173 > B.50536: sctp (1) [DATA] (B)(E) [TSN: 2490098742] [SID: 0] [S 02.585666 IP B.50536 > A.55173: sctp (1) [SACK] [cum ack 2490098741] [a_rwnd 0] [#ga 02.585671 IP A.55173 > B.50536: sctp (1) [DATA] (B)(E) [TSN: 2490098815] [SID: 0] [S 02.585683 IP B.50536 > A.55173: sctp (1) [SACK] [cum ack 2490098741] [a_rwnd 0] [#ga 
02.602330 IP A.55173 > B.50536: sctp (1) [DATA] (B)(E) [TSN: 2490098828] [SID: 0] [S 02.602359 IP B.50536 > A.55173: sctp (1) [SACK] [cum ack 2490098741] [a_rwnd 0] [#ga 02.602363 IP A.55173 > B.50536: sctp (1) [DATA] (B)(E) [TSN: 2490098901] [SID: 0] [S 02.602372 IP B.50536 > A.55173: sctp (1) [SACK] [cum ack 2490098741] [a_rwnd 0] [#ga 03.600788 IP A.55173 > B.50536: sctp (1) [DATA] (B)(E) [TSN: 2490098742] [SID: 0] [S 03.600830 IP B.50536 > A.55173: sctp (1) [SACK] [cum ack 2490098741] [a_rwnd 0] [#ga 03.619455 IP B.50536 > A.55173: sctp (1) [SACK] [cum ack 2490098741] [a_rwnd 13508] 03.619479 IP B.50536 > A.55173: sctp (1) [SACK] [cum ack 2490098741] [a_rwnd 27017] 03.619497 IP B.50536 > A.55173: sctp (1) [SACK] [cum ack 2490098741] [a_rwnd 40526] 03.619516 IP B.50536 > A.55173: sctp (1) [SACK] [cum ack 2490098741] [a_rwnd 54035] 03.619533 IP B.50536 > A.55173: sctp (1) [SACK] [cum ack 2490098741] [a_rwnd 67544] 03.619552 IP B.50536 > A.55173: sctp (1) [SACK] [cum ack 2490098741] [a_rwnd 81053] 03.619570 IP B.50536 > A.55173: sctp (1) [SACK] [cum ack 2490098741] [a_rwnd 94562] (following data transmission triggered by window updates above) 03.633504 IP A.55173 > B.50536: sctp (1) [DATA] (B)(E) [TSN: 2490098742] [SID: 0] [S 03.836445 IP B.50536 > A.55173: sctp (1) [SACK] [cum ack 2490098814] [a_rwnd 100000] 03.843125 IP A.55173 > B.50536: sctp (1) [DATA] (B)(E) [TSN: 2490098815] [SID: 0] [S 03.843285 IP A.55173 > B.50536: sctp (1) [DATA] (B)(E) [TSN: 2490098888] [SID: 0] [S 03.843345 IP B.50536 > A.55173: sctp (1) [SACK] [cum ack 2490098960] [a_rwnd 99894] 03.856546 IP A.55173 > B.50536: sctp (1) [DATA] (B)(E) [TSN: 2490098961] [SID: 0] [S 03.866450 IP A.55173 > B.50536: sctp (1) [DATA] (B)(E) [TSN: 2490099011] [SID: 0] [S Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-23 | sctp: do not loose window information if in rwnd_over | Marcelo Ricardo Leitner
It's possible that we receive a packet that is larger than current window. If it's the first packet in this way, it will cause it to increase rwnd_over. Then, if we receive another data chunk (specially as SCTP allows you to have one data chunk in flight even during 0 window), rwnd_over will be overwritten instead of added to. In the long run, this could cause the window to grow bigger than its initial size, as rwnd_over would be charged only for the last received data chunk while the code will try open the window for all packets that were received and had its value in rwnd_over overwritten. This, then, can lead to the worsening of payload/buffer ratio and cause rwnd_press to kick in more often. The fix is to sum it too, same as is done for rwnd_press, so that if we receive 3 chunks after closing the window, we still have to release that same amount before re-opening it. Log snippet from sctp_test exhibiting the issue: [ 146.209232] sctp: sctp_assoc_rwnd_decrease: asoc:ffff88013928e000 rwnd decreased by 1 to (0, 1, 114221) [ 146.209232] sctp: sctp_assoc_rwnd_decrease: association:ffff88013928e000 has asoc->rwnd:0, asoc->rwnd_over:1! [ 146.209232] sctp: sctp_assoc_rwnd_decrease: asoc:ffff88013928e000 rwnd decreased by 1 to (0, 1, 114221) [ 146.209232] sctp: sctp_assoc_rwnd_decrease: association:ffff88013928e000 has asoc->rwnd:0, asoc->rwnd_over:1! [ 146.209232] sctp: sctp_assoc_rwnd_decrease: asoc:ffff88013928e000 rwnd decreased by 1 to (0, 1, 114221) [ 146.209232] sctp: sctp_assoc_rwnd_decrease: association:ffff88013928e000 has asoc->rwnd:0, asoc->rwnd_over:1! [ 146.209232] sctp: sctp_assoc_rwnd_decrease: asoc:ffff88013928e000 rwnd decreased by 1 to (0, 1, 114221) Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
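The difference between overwriting and accumulating rwnd_over can be shown with a simplified model of the accounting this commit message describes. This is a sketch under assumptions, not the kernel function: `rwnd_decrease`, `receive`, and the `accumulate` switch are hypothetical names, and rwnd_press is left out entirely.

```python
def rwnd_decrease(rwnd, rwnd_over, length, accumulate):
    """Return (rwnd, rwnd_over) after receiving `length` bytes.

    accumulate=False models the old behaviour (rwnd_over overwritten),
    accumulate=True models the fix (overshoot added to rwnd_over).
    """
    if rwnd >= length:
        return rwnd - length, rwnd_over
    overshoot = length - rwnd
    if accumulate:
        rwnd_over += overshoot
    else:
        rwnd_over = overshoot
    return 0, rwnd_over


def receive(chunks, accumulate, rwnd=0):
    """Feed a sequence of chunk sizes through the model."""
    rwnd_over = 0
    for length in chunks:
        rwnd, rwnd_over = rwnd_decrease(rwnd, rwnd_over, length, accumulate)
    return rwnd, rwnd_over
```

With three 1-byte chunks arriving while the window is closed, the overwrite variant ends with rwnd_over at 1 (matching the repeated `rwnd_over:1` in the log snippet), whereas the accumulating variant ends at 3, so the same amount must be released before the window reopens.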
2016-12-23 | Merge branch 'for-linus' of ↵ | Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull final vfs updates from Al Viro: "Assorted cleanups and fixes all over the place" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: sg_write()/bsg_write() is not fit to be called under KERNEL_DS ufs: fix function declaration for ufs_truncate_blocks fs: exec: apply CLOEXEC before changing dumpable task flags seq_file: reset iterator to first record for zero offset vfs: fix isize/pos/len checks for reflink & dedupe [iov_iter] fix iterate_all_kinds() on empty iterators move aio compat to fs/aio.c reorganize do_make_slave() clone_private_mount() doesn't need to touch namespace_sem remove a bogus claim about namespace_sem being held by callers of mnt_alloc_id()
2016-12-23 | Merge branch 'virtio-net-xdp-fixes' | David S. Miller
Jason Wang says:

====================
several fixups for virtio-net XDP

Merry Xmas and a Happy New year to all:

This series tries to fix several issues for virtio-net XDP which could be categorized into several parts:

- fix several issues during XDP linearizing
- allow csumed packet to work for XDP_PASS
- make EWMA rxbuf size estimation work for XDP
- forbid XDP when GUEST_UFO is supported
- remove big packet XDP support
- add XDP support for small buffers

Please see individual patches for details.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-23 | virtio-net: XDP support for small buffers | Jason Wang
Commit f600b6905015 ("virtio_net: Add XDP support") leaves the case of small receive buffer untouched. This will confuse the user who want to set XDP but use small buffers. Other than forbid XDP in small buffer mode, let's make it work. XDP then can only work at skb->data since virtio-net create skbs during refill, this is sub optimal which could be optimized in the future. Cc: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-23 | virtio-net: remove big packet XDP codes | Jason Wang
Now we in fact don't allow XDP for big packets, remove its codes. Cc: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-23 | virtio-net: forbid XDP when VIRTIO_NET_F_GUEST_UFO is support | Jason Wang
When VIRTIO_NET_F_GUEST_UFO is negotiated, host could still send UFO packet that exceeds a single page which could not be handled correctly by XDP. So this patch forbids setting XDP when GUEST_UFO is supported. While at it, forbid XDP for ECN (which comes only from GRO) too to prevent user from misconfiguration. Cc: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-23 | virtio-net: make rx buf size estimation works for XDP | Jason Wang
We don't update ewma rx buf size in the case of XDP. This will lead underestimation of rx buf size which causes host to produce more than one buffers. This will greatly increase the possibility of XDP page linearization. Cc: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-23 | virtio-net: unbreak csumed packets for XDP_PASS | Jason Wang
We drop csumed packet when do XDP for packets. This breaks XDP_PASS when GUEST_CSUM is supported. Fix this by allowing csum flag to be set. With this patch, simple TCP works for XDP_PASS. Cc: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-23 | virtio-net: correctly handle XDP_PASS for linearized packets | Jason Wang
When XDP_PASS were determined for linearized packets, we try to get new buffers in the virtqueue and build skbs from them. This is wrong, we should create skbs based on existed buffers instead. Fixing them by creating skb based on xdp_page. With this patch "ping 192.168.100.4 -s 3900 -M do" works for XDP_PASS. Cc: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-23 | virtio-net: fix page miscount during XDP linearizing | Jason Wang
We don't put the page during linearizing; this would cause leaking when xmit through XDP_TX or when the packet exceeds PAGE_SIZE. Fix this by putting the page accordingly. Also decrease the number of buffers during linearizing to make sure the caller can free buffers correctly when the packet exceeds PAGE_SIZE. With this patch, we won't get OOM after linearizing a huge number of packets. Cc: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-23 | virtio-net: correctly xmit linearized page on XDP_TX | Jason Wang
After we linearize page, we should xmit this page instead of the page of first buffer which may lead unexpected result. With this patch, we can see correct packet during XDP_TX. Cc: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-23 | virtio-net: remove the warning before XDP linearizing | Jason Wang
Since we use EWMA to estimate the size of rx buffer. When rx buffer size is underestimated, it's usual to have a packet with more than one buffers. Consider this is not a bug, remove the warning and correct the comment before XDP linearizing. Cc: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-23Merge tag 'befs-v4.10-rc1' of git://github.com/luisbg/linux-befsLinus Torvalds
Pull befs updates from Luis de Bethencourt: "A series of small fixes and adding NFS export support" * tag 'befs-v4.10-rc1' of git://github.com/luisbg/linux-befs: befs: add NFS export support befs: remove trailing whitespaces befs: remove signatures from comments befs: fix style issues in header files befs: fix style issues in linuxvfs.c befs: fix typos in linuxvfs.c befs: fix style issues in io.c befs: fix style issues in inode.c befs: fix style issues in debug.c
2016-12-23Merge tag 'drm-fixes-for-4.10-rc1' of ↵Linus Torvalds
git://people.freedesktop.org/~airlied/linux Pull drm fixes from Dave Airlie: "Some fixes came in while I was out, mostly intel and amdgpu ones, with one ast fix" Daniel Vetter says: "This should also shut up the WARN_ON(!intel_dp->lane_count) noise" * tag 'drm-fixes-for-4.10-rc1' of git://people.freedesktop.org/~airlied/linux: (35 commits) drm/amdgpu: update tile table for oland/hainan drm/amdgpu: update tile table for verde drm/amdgpu: update rev id for verde drm/amdgpu: update golden setting for verde drm/amdgpu: update rev id for oland drm/amdgpu: update golden setting for oland drm/amdgpu: update rev id for hainan drm/amdgpu: update golden setting for hainan drm/amdgpu: update rev id for pitcairn drm/amdgpu: update golden setting for pitcairn drm/amdgpu: update golden setting/tiling table of tahiti drm/i915: skip the first 4k of stolen memory on everything >= gen8 drm/i915: Fallback to single PAGE_SIZE segments for DMA remapping drm/i915: Fix use after free in logical_render_ring_init drm/i915: disable PSR by default on HSW/BDW drm/i915: Fix setting of boost freq tunable drm/i915: tune down the fast link training vs boot fail drm/i915: Reorder phys backing storage release drm/i915/gen9: Fix PCODE polling during SAGV disabling drm/i915/gen9: Fix PCODE polling during CDCLK change notification ...
2016-12-23Merge tag 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma Pull rdma fixes from Doug Ledford: "First round of -rc fixes for 4.10 kernel: - a series of qedr fixes - a series of rxe fixes - one i40iw fix - one cma fix - one cxgb4 fix" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: IB/rxe: Don't check for null ptr in send() IB/rxe: Drop future atomic/read packets rather than retrying IB/rxe: Use BTH_PSN_MASK when ACKing duplicate sends qedr: Always notify the verb consumer of flushed CQEs qedr: clear the vendor error field in the work completion qedr: post_send/recv according to QP state qedr: ignore inline flag in read verbs qedr: modify QP state to error when destroying it qedr: return correct value on modify qp qedr: return error if destroy CQ failed qedr: configure the number of CQEs on CQ creation i40iw: Set 128B as the only supported RQ WQE size IB/cma: Fix a race condition in iboe_addr_get_sgid() IB/rxe: Fix a memory leak in rxe_qp_cleanup() iw_cxgb4: set correct FetchBurstMax for QPs
2016-12-23Merge tag 'scsi-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull late SCSI updates from James Bottomley: "This is mostly stuff which missed the initial pull. There's a new driver: qedi, and some ufs, ibmvscsis and ncr5380 updates plus some assorted driver fixes and also a fix for the bug where if a device goes into a blocked state between configuration and sysfs device add (which can be a long time under async probing) it would become permanently blocked" * tag 'scsi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (30 commits) scsi: avoid a permanent stop of the scsi device's request queue scsi: mpt3sas: Recognize and act on iopriority info scsi: qla2xxx: Fix Target mode handling with Multiqueue changes. scsi: qla2xxx: Add Block Multi Queue functionality. scsi: qla2xxx: Add multiple queue pair functionality. scsi: qla2xxx: Utilize pci_alloc_irq_vectors/pci_free_irq_vectors calls. scsi: qla2xxx: Only allow operational MBX to proceed during RESET. scsi: hpsa: remove memory allocate failure message scsi: Update 3ware driver email addresses scsi: zfcp: fix rport unblock race with LUN recovery scsi: zfcp: do not trace pure benign residual HBA responses at default level scsi: zfcp: fix use-after-"free" in FC ingress path after TMF scsi: libcxgbi: return error if interface is not up scsi: cxgb4i: libcxgbi: add missing module_put() scsi: cxgb4i: libcxgbi: cxgb4: add T6 iSCSI completion feature scsi: cxgb4i: libcxgbi: add active open cmd for T6 adapters scsi: cxgb4i: use cxgb4_tp_smt_idx() to get smt_idx scsi: qedi: Add QLogic FastLinQ offload iSCSI driver framework. scsi: aacraid: remove wildcard for series 9 controllers scsi: ibmvscsi: add write memory barrier to CRQ processing ...
2016-12-23Merge tag 'arc-4.10-rc1-part2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc Pull more ARC updates from Vineet Gupta: - Fix for aliasing VIPT dcache in old ARC700 cores - micro-optimization in ARC700 ProtV handler - Enable SG_CHAIN [Vladimir] - ARC HS38 core intc default to prio 1 * tag 'arc-4.10-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc: ARC: mm: arc700: Don't assume 2 colours for aliasing VIPT dcache ARC: mm: No need to save cache version in @cpuinfo ARC: enable SG chaining ARCv2: intc: default all interrupts to priority 1 ARCv2: entry: document intr disable in hard isr ARC: ARCompact entry: elide re-reading ECR in ProtV handler
2016-12-23Merge branch 'mlxsw-router-fixes'David S. Miller
Jiri Pirko says: ==================== mlxsw: Router fixes Ido says: First two patches ensure we remove from the device's table neighbours that are considered to be dead by the neighbour core. The last patch removes nexthop groups from the device when they are no longer valid. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-23mlxsw: spectrum_router: Correctly remove nexthop groupsIdo Schimmel
At the end of the nexthop initialization process we determine whether the nexthop should be offloaded or not based on the NUD state of the neighbour representing it. After all the nexthops have been initialized we refresh the nexthop group and potentially offload it to the device, in case some of the nexthops were resolved. Make the destruction of a nexthop group symmetric with its creation by marking all nexthops as invalid and then refreshing the nexthop group to make sure it is removed from the device's tables. Fixes: b2157149b0b0 ("mlxsw: spectrum_router: Add the nexthop neigh activity update") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-23mlxsw: spectrum_router: Don't reflect dead neighsIdo Schimmel
When a neighbour is considered to be dead, we should remove it from the device's table regardless of its NUD state. Without this patch, after setting a port to be administratively down we get the following errors when we periodically try to update the kernel about neighbours activity: [ 461.947268] mlxsw_spectrum 0000:03:00.0 sw1p3: Failed to find matching neighbour for IP=192.168.100.2 Fixes: a6bf9e933daf ("mlxsw: spectrum_router: Offload neighbours based on NUD state change") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-23neigh: Send netevent after marking neigh as deadIdo Schimmel
neigh_cleanup_and_release() is always called after marking a neighbour as dead, but it only notifies user space and not in-kernel listeners of the netevent notification chain. This can cause multiple problems. In my specific use case, it causes the listener (a switch driver capable of L3 offloads) to believe a neighbour entry is still valid, and is thus erroneously kept in the device's table. Fix that by sending a netevent after marking the neighbour as dead. Fixes: a6bf9e933daf ("mlxsw: spectrum_router: Offload neighbours based on NUD state change") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-23ipv6: handle -EFAULT from skb_copy_bitsDave Jones
By setting certain socket options on ipv6 raw sockets, we can confuse the length calculation in rawv6_push_pending_frames triggering a BUG_ON.

    RIP: 0010:[<ffffffff817c6390>] [<ffffffff817c6390>] rawv6_sendmsg+0xc30/0xc40
    RSP: 0018:ffff881f6c4a7c18 EFLAGS: 00010282
    RAX: 00000000fffffff2 RBX: ffff881f6c681680 RCX: 0000000000000002
    RDX: ffff881f6c4a7cf8 RSI: 0000000000000030 RDI: ffff881fed0f6a00
    RBP: ffff881f6c4a7da8 R08: 0000000000000000 R09: 0000000000000009
    R10: ffff881fed0f6a00 R11: 0000000000000009 R12: 0000000000000030
    R13: ffff881fed0f6a00 R14: ffff881fee39ba00 R15: ffff881fefa93a80
    Call Trace:
     [<ffffffff8118ba23>] ? unmap_page_range+0x693/0x830
     [<ffffffff81772697>] inet_sendmsg+0x67/0xa0
     [<ffffffff816d93f8>] sock_sendmsg+0x38/0x50
     [<ffffffff816d982f>] SYSC_sendto+0xef/0x170
     [<ffffffff816da27e>] SyS_sendto+0xe/0x10
     [<ffffffff81002910>] do_syscall_64+0x50/0xa0
     [<ffffffff817f7cbc>] entry_SYSCALL64_slow_path+0x25/0x25

Handle by jumping to the failure path if skb_copy_bits gets an EFAULT.

Reproducer:

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>
    #include <sys/types.h>
    #include <sys/socket.h>
    #include <netinet/in.h>

    #define LEN 504

    int main(int argc, char* argv[])
    {
            int fd;
            int zero = 0;
            char buf[LEN];

            memset(buf, 0, LEN);

            fd = socket(AF_INET6, SOCK_RAW, 7);

            setsockopt(fd, SOL_IPV6, IPV6_CHECKSUM, &zero, 4);
            setsockopt(fd, SOL_IPV6, IPV6_DSTOPTS, &buf, LEN);

            sendto(fd, buf, 1, 0, (struct sockaddr *) buf, 110);
    }

Signed-off-by: Dave Jones <davej@codemonkey.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-23inet: fix IP(V6)_RECVORIGDSTADDR for udp socketsWillem de Bruijn
Socket cmsg IP(V6)_RECVORIGDSTADDR checks that port range lies within the packet. For sockets that have transport headers pulled, transport offset can be negative. Use signed comparison to avoid overflow. Fixes: e6afc8ace6dd ("udp: remove headers from UDP packets before queueing") Reported-by: Nisar Jagabar <njagabar@cloudmark.com> Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
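The effect of the signed comparison can be seen in a small standalone sketch. The function names and the 4-byte port span are assumptions for illustration; this is not the kernel's code, only the same arithmetic pattern with a possibly negative transport offset checked against an unsigned length.

```c
#include <stdbool.h>

/*
 * Hypothetical check: do the 4 bytes of source and destination port,
 * starting at transport_off, extend past the end of the packet?
 *
 * Comparing an int with an unsigned int converts the int to unsigned.
 * The cast below spells out that implicit conversion: a negative
 * offset becomes a huge positive value and the check wrongly fires.
 */
static bool ports_out_of_range_unsigned(int transport_off, unsigned int len)
{
	return (unsigned int)(transport_off + 4) > len;
}

/* Same check with the length treated as signed: negative offsets pass. */
static bool ports_out_of_range_signed(int transport_off, unsigned int len)
{
	return transport_off + 4 > (int)len;
}
```

With an offset of -8 and a length of 100, the unsigned form reports the ports as out of range while the signed form does not. Both forms agree when the offset is non-negative.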
2016-12-23Merge branch 'cls_flower-act_tunnel_key-fixes'David S. Miller
Or Gerlitz says: ==================== net/sched fixes for cls_flower and act_tunnel_key This small series contains a fix to the matching flags support in flower and to the tunnel key action MD prep for IPv6. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>