summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)Author
2014-04-24Merge branch 'for-3.15-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata Pull libata fixes from Tejun Heo: "Dan updated tag allocation to accomodate devices which choke when tags jump back and forth. Quite a few ahci MSI related fixes. A couple config dependency fixes and other misc fixes" * 'for-3.15-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata: libata/ahci: accommodate tag ordered controllers ahci: Do not receive interrupts sent by dummy ports ahci: Use pci_enable_msi_exact() instead of pci_enable_msi_range() ahci: Ensure "MSI Revert to Single Message" mode is not enforced ahci: do not request irq for dummy port pata_samsung_cf: fix ata_host_activate() failure handling pata_arasan_cf: fix ata_host_activate() failure handling ata: fix i.MX AHCI driver dependencies pata_at91: fix ata_host_activate() failure handling libata: Update queued trim blacklist for M5x0 drives libata: make AHCI_XGENE depend on PHY_XGENE
2014-04-24dt: tegra: remove non-existent clock IDsStephen Warren
The Tegra124 clock DT binding currently provides 3 clocks that don't actually exist; 2 for NAND and one for UART5/UARTE. Delete these. While this is technically an incompatible DT ABI change, nothing could have used these clock IDs for anything practical, since the HW doesn't exist. Cc: <stable@vger.kernel.org> Signed-off-by: Stephen Warren <swarren@nvidia.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2014-04-23locks: rename FL_FILE_PVT and IS_FILE_PVT to use "*_OFDLCK" insteadJeff Layton
File-private locks have been re-christened as "open file description" locks. Finish the symbol name cleanup in the internal implementation. Signed-off-by: Jeff Layton <jlayton@redhat.com>
2014-04-23Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input Pull input updates from Dmitry Torokhov: "The main change is that we now publish "firmware ID" for the serio devices to help userspace figure out the kind of touchpads it is dealing with: i8042 will export PS/2 port's PNP IDs as firmware IDs. You will also get more quirks for Synaptics touchpads in various Lenovo laptops, a change to elantech driver to recognize even more models, and fixups to wacom and couple other drivers" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Input: elantech - add support for newer elantech touchpads Input: soc_button_array - fix a crash during rmmod Input: synaptics - add min/max quirk for ThinkPad T431s, L440, L540, S1 Yoga and X1 Input: synaptics - report INPUT_PROP_TOPBUTTONPAD property Input: Add INPUT_PROP_TOPBUTTONPAD device property Input: i8042 - add firmware_id support Input: serio - add firmware_id sysfs attribute Input: wacom - handle 1024 pressure levels in wacom_tpc_pen Input: wacom - references to 'wacom->data' should use 'unsigned char*' Input: wacom - override 'pressure_max' with value from HID_USAGE_PRESSURE Input: wacom - use full 32-bit HID Usage value in switch statement Input: wacom - missed the last bit of expresskey for DTU-1031 Input: ads7846 - fix device usage within attribute show Input: da9055_onkey - remove use of regmap_irq_get_virq()
2014-04-23Merge remote-tracking branch 'regulator/fix/core' into regulator-linusMark Brown
2014-04-23Fix: tracing: use 'E' instead of 'X' for unsigned module tain flagMathieu Desnoyers
In the following commit: commit 57673c2b0baa900dddae3b9eb3d7748ebf550eb3 Author: Rusty Russell <rusty@rustcorp.com.au> Date: Mon Mar 31 14:39:57 2014 +1030 Use 'E' instead of 'X' for unsigned module taint flag. One site has been forgotten in trace events module.h. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> CC: Steven Rostedt <rostedt@goodmis.org> CC: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2014-04-22net: Fix ns_capable check in sock_diag_put_filterinfoAndrew Lutomirski
The caller needs capabilities on the namespace being queried, not on their own namespace. This is a security bug, although it likely has only a minor impact. Cc: stable@vger.kernel.org Signed-off-by: Andy Lutomirski <luto@amacapital.net> Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-22locks: rename file-private locks to "open file description locks"Jeff Layton
File-private locks have been merged into Linux for v3.15, and *now* people are commenting that the name and macro definitions for the new file-private locks suck. ...and I can't even disagree. The names and command macros do suck. We're going to have to live with these for a long time, so it's important that we be happy with the names before we're stuck with them. The consensus on the lists so far is that they should be rechristened as "open file description locks". The name isn't a big deal for the kernel, but the command macros are not visually distinct enough from the traditional POSIX lock macros. The glibc and documentation folks are recommending that we change them to look like F_OFD_{GETLK|SETLK|SETLKW}. That lessens the chance that a programmer will typo one of the commands wrong, and also makes it easier to spot this difference when reading code. This patch makes the following changes that I think are necessary before v3.15 ships: 1) rename the command macros to their new names. These end up in the uapi headers and so are part of the external-facing API. It turns out that glibc doesn't actually use the fcntl.h uapi header, but it's hard to be sure that something else won't. Changing it now is safest. 2) make the the /proc/locks output display these as type "OFDLCK" Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Carlos O'Donell <carlos@redhat.com> Cc: Stefan Metzmacher <metze@samba.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Frank Filz <ffilzlnx@mindspring.com> Cc: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Jeff Layton <jlayton@redhat.com>
2014-04-20Merge tag 'ext4_for_linus_stable' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 fixes from Ted Ts'o: "These are regression and bug fixes for ext4. We had a number of new features in ext4 during this merge window (ZERO_RANGE and COLLAPSE_RANGE fallocate modes, renameat, etc.) so there were many more regression and bug fixes this time around. It didn't help that xfstests hadn't been fully updated to fully stress test COLLAPSE_RANGE until after -rc1" * tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (31 commits) ext4: disable COLLAPSE_RANGE for bigalloc ext4: fix COLLAPSE_RANGE failure with 1KB block size ext4: use EINVAL if not a regular file in ext4_collapse_range() ext4: enforce we are operating on a regular file in ext4_zero_range() ext4: fix extent merging in ext4_ext_shift_path_extents() ext4: discard preallocations after removing space ext4: no need to truncate pagecache twice in collapse range ext4: fix removing status extents in ext4_collapse_range() ext4: use filemap_write_and_wait_range() correctly in collapse range ext4: use truncate_pagecache() in collapse range ext4: remove temporary shim used to merge COLLAPSE_RANGE and ZERO_RANGE ext4: fix ext4_count_free_clusters() with EXT4FS_DEBUG and bigalloc enabled ext4: always check ext4_ext_find_extent result ext4: fix error handling in ext4_ext_shift_extents ext4: silence sparse check warning for function ext4_trim_extent ext4: COLLAPSE_RANGE only works on extent-based files ext4: fix byte order problems introduced by the COLLAPSE_RANGE patches ext4: use i_size_read in ext4_unaligned_aio() fs: disallow all fallocate operation on active swapfile fs: move falloc collapse range check into the filesystem methods ...
2014-04-19Input: Add INPUT_PROP_TOPBUTTONPAD device propertyHans de Goede
On some newer laptops with a trackpoint the physical buttons for the trackpoint have been removed to allow for a larger touchpad. On these laptops the buttonpad has clearly marked areas on the top which are to be used as trackpad buttons. Users of the event device-node need to know about this, so that they can properly interpret BTN_LEFT events as being a left / right / middle click depending on where on the button pad the clicking finger is. This commits adds a INPUT_PROP_TOPBUTTONPAD device property which drivers for such buttonpads will use to signal to the user that this buttonpad not only has the normal bottom button area, but also a top button area. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2014-04-19Input: serio - add firmware_id sysfs attributeHans de Goede
serio devices exposed via platform firmware interfaces such as ACPI may provide additional identifying information of use to userspace. We don't associate the serio devices with the firmware device (we don't set it as parent), so there's no way for userspace to make use of this information. We cannot change the parent for serio devices instantiated though a firmware interface as that would break suspend / resume ordering. Therefore this patch adds a new firmware_id sysfs attribute so that userspace can get a string from there with any additional identifying information the firmware interface may provide. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Acked-by: Peter Hutterer <peter.hutterer@who-t.net> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2014-04-19Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linuxLinus Torvalds
Pull drm fixes from Dave Airlie: "Unfortunately this contains no easter eggs, its a bit larger than I'd like, but I included a patch that just moves code from one file to another and I'd like to avoid merge conflicts with that later, so it makes it seem worse than it is, Otherwise: - radeon: fixes to use new microcode to stabilise some cards, use some common displayport code, some runtime pm fixes, pll regression fixes - i915: fix for some context oopses, a warn in a used path, backlight fixes - nouveau: regression fix - omap: a bunch of fixes" * 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: (51 commits) drm: bochs: drop unused struct fields drm: bochs: add power management support drm: cirrus: add power management support drm: Split out drm_probe_helper.c from drm_crtc_helper.c drm/plane-helper: Don't fake-implement primary plane disabling drm/ast: fix value check in cbr_scan2 drm/nouveau/bios: fix a bit shift error introduced by 457e77b drm/radeon/ci: make sure mc ucode is loaded before checking the size drm/radeon/si: make sure mc ucode is loaded before checking the size drm/radeon: improve PLL params if we don't match exactly v2 drm/radeon: memory leak on bo reservation failure. v2 drm/radeon: fix VCE fence command drm/radeon: re-enable mclk dpm on R7 260X asics drm/radeon: add support for newer mc ucode on CI (v2) drm/radeon: add support for newer mc ucode on SI (v2) drm/radeon: apply more strict limits for PLL params v2 drm/radeon: update CI DPM powertune settings drm/radeon: fix runpm handling on APUs (v4) drm/radeon: disable mclk dpm on R7 260X drm/tegra: Remove gratuitous pad field ...
2014-04-19Merge branch 'drm-next-3.15-wip' of ↵Dave Airlie
git://people.freedesktop.org/~deathsimple/linux into drm-next Some i2c fixes over DisplayPort. * 'drm-next-3.15-wip' of git://people.freedesktop.org/~deathsimple/linux: drm/radeon: Improve vramlimit module param documentation drm/radeon: fix audio pin counts for DCE6+ (v2) drm/radeon/dp: switch to the common i2c over aux code drm/dp/i2c: Update comments about common i2c over dp assumptions (v3) drm/dp/i2c: send bare addresses to properly reset i2c connections (v4) drm/radeon/dp: handle zero sized i2c over aux transactions (v2) drm/i915: support address only i2c-over-aux transactions drm/tegra: dp: Support address-only I2C-over-AUX transactions
2014-04-18Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull more networking fixes from David Miller: 1) Fix mlx4_en_netpoll implementation, it needs to schedule a NAPI context, not synchronize it. From Chris Mason. 2) Ipv4 flow input interface should never be zero, it should be LOOPBACK_IFINDEX instead. From Cong Wang and Julian Anastasov. 3) Properly configure MAC to PHY connection in mvneta devices, from Thomas Petazzoni. 4) sys_recv should use SYSCALL_DEFINE. From Jan Glauber. 5) Tunnel driver ioctls do not use the correct namespace, fix from Nicolas Dichtel. 6) Fix memory leak on seccomp filter attach, from Kees Cook. 7) Fix lockdep warning for nested vlans, from Ding Tianhong. 8) Crashes can happen in SCTP due to how the auth_enable value is managed, fix from Vlad Yasevich. 9) Wireless fixes from John W Linville and co. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (45 commits) net: sctp: cache auth_enable per endpoint tg3: update rx_jumbo_pending ring param only when jumbo frames are enabled vlan: Fix lockdep warning when vlan dev handle notification seccomp: fix memory leak on filter attach isdn: icn: buffer overflow in icn_command() ip6_tunnel: use the right netns in ioctl handler sit: use the right netns in ioctl handler ip_tunnel: use the right netns in ioctl handler net: use SYSCALL_DEFINEx for sys_recv net: mdio-gpio: Add support for separate MDI and MDO gpio pins net: mdio-gpio: Add support for active low gpio pins net: mdio-gpio: Use devm_ functions where possible ipv4, route: pass 0 instead of LOOPBACK_IFINDEX to fib_validate_source() ipv4, fib: pass LOOPBACK_IFINDEX instead of 0 to flowi4_iif mlx4_en: don't use napi_synchronize inside mlx4_en_netpoll net: mvneta: properly configure the MAC <-> PHY connection in all situations net: phy: add minimal support for QSGMII PHY sfc:On MCDI timeout, issue an FLR (and mark MCDI to fail-fast) mwifiex: fix hung task on command timeout mwifiex: process event before command response ...
2014-04-18Merge tag 'char-misc-3.15-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char/misc driver fixes from Greg KH: "Here are a few driver fixes for char/misc drivers that resolve reported issues. All have been in linux-next successfully for a few days" * tag 'char-misc-3.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: Drivers: hv: vmbus: Negotiate version 3.0 when running on ws2012r2 hosts Tools: hv: Handle the case when the target file exists correctly vme_tsi148: Utilize to_pci_dev() macro vme_tsi148: Fix PCI address mapping assumption vme_tsi148: Fix typo in tsi148_slave_get() w1: avoid recursive device_add w1: fix netlink refcnt leak on error path misc: Grammar s/addition/additional/ drivers: mcb: fix memory leak in chameleon_parse_cells() error path mei: ignore client writing state during cb completion mei: me: do not load the driver if the FW doesn't support MEI interface GenWQE: Increase driver version number GenWQE: Fix multithreading problems GenWQE: Ensure rc is not returning an uninitialized value GenWQE: Add wmb before DDCB is started GenWQE: Enable access to VPD flash area
2014-04-18Merge tag 'driver-core-3.15-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core fixes from Greg KH: "Here are some driver core fixes for 3.15-rc2. Also in here are some documentation updates, as well as an API removal that had to wait for after -rc1 due to the cleanups coming into you from multiple developer trees (this one and the PPC tree.) All have been in linux next successfully" * tag 'driver-core-3.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: drivers/base/dd.c incorrect pr_debug() parameters Documentation: Update stable address in Chinese and Japanese translations topology: Fix compilation warning when not in SMP Chinese: add translation of io_ordering.txt stable_kernel_rules: spelling/word usage sysfs, driver-core: remove unused {sysfs|device}_schedule_callback_owner() kernfs: protect lazy kernfs_iattrs allocation with mutex fs: Don't return 0 from get_anon_bdev
2014-04-18Merge branch 'akpm' (incoming from Andrew)Linus Torvalds
Merge misc fixes from Andrew Morton: "13 fixes" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: thp: close race between split and zap huge pages mm: fix new kernel-doc warning in filemap.c mm: fix CONFIG_DEBUG_VM_RB description mm: use paravirt friendly ops for NUMA hinting ptes mips: export flush_icache_range mm/hugetlb.c: add cond_resched_lock() in return_unused_surplus_pages() wait: explain the shadowing and type inconsistencies Shiraz has moved Documentation/vm/numa_memory_policy.txt: fix wrong document in numa_memory_policy.txt powerpc/mm: fix ".__node_distance" undefined kernel/watchdog.c:touch_softlockup_watchdog(): use raw_cpu_write() init/Kconfig: move the trusted keyring config option to general setup vmscan: reclaim_clean_pages_from_list() must use mod_zone_page_state()
2014-04-18mm: use paravirt friendly ops for NUMA hinting ptesMel Gorman
David Vrabel identified a regression when using automatic NUMA balancing under Xen whereby page table entries were getting corrupted due to the use of native PTE operations. Quoting him Xen PV guest page tables require that their entries use machine addresses if the preset bit (_PAGE_PRESENT) is set, and (for successful migration) non-present PTEs must use pseudo-physical addresses. This is because on migration MFNs in present PTEs are translated to PFNs (canonicalised) so they may be translated back to the new MFN in the destination domain (uncanonicalised). pte_mknonnuma(), pmd_mknonnuma(), pte_mknuma() and pmd_mknuma() set and clear the _PAGE_PRESENT bit using pte_set_flags(), pte_clear_flags(), etc. In a Xen PV guest, these functions must translate MFNs to PFNs when clearing _PAGE_PRESENT and translate PFNs to MFNs when setting _PAGE_PRESENT. His suggested fix converted p[te|md]_[set|clear]_flags to using paravirt-friendly ops but this is overkill. He suggested an alternative of using p[te|md]_modify in the NUMA page table operations but this is does more work than necessary and would require looking up a VMA for protections. This patch modifies the NUMA page table operations to use paravirt friendly operations to set/clear the flags of interest. Unfortunately this will take a performance hit when updating the PTEs on CONFIG_PARAVIRT but I do not see a way around it that does not break Xen. Signed-off-by: Mel Gorman <mgorman@suse.de> Acked-by: David Vrabel <david.vrabel@citrix.com> Tested-by: David Vrabel <david.vrabel@citrix.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Peter Anvin <hpa@zytor.com> Cc: Fengguang Wu <fengguang.wu@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Steven Noonan <steven@uplinklabs.net> Cc: Rik van Riel <riel@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Cyrill Gorcunov <gorcunov@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-18wait: explain the shadowing and type inconsistenciesPeter Zijlstra
Stick in a comment before someone else tries to fix the sparse warning this generates. Suggested-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/n/tip-o2ro6f3vkxklni0bc8f7m68s@git.kernel.org Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-18Shiraz has movedViresh Kumar
shiraz.hashim@st.com email-id doesn't exist anymore as he has left the company. Replace ST's id with shiraz.linux.kernel@gmail.com. It also updates .mailmap file to fix address for 'git shortlog'. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Cc: Shiraz Hashim <shiraz.linux.kernel@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-18net: sctp: cache auth_enable per endpointVlad Yasevich
Currently, it is possible to create an SCTP socket, then switch auth_enable via sysctl setting to 1 and crash the system on connect: Oops[#1]: CPU: 0 PID: 0 Comm: swapper Not tainted 3.14.1-mipsgit-20140415 #1 task: ffffffff8056ce80 ti: ffffffff8055c000 task.ti: ffffffff8055c000 [...] Call Trace: [<ffffffff8043c4e8>] sctp_auth_asoc_set_default_hmac+0x68/0x80 [<ffffffff8042b300>] sctp_process_init+0x5e0/0x8a4 [<ffffffff8042188c>] sctp_sf_do_5_1B_init+0x234/0x34c [<ffffffff804228c8>] sctp_do_sm+0xb4/0x1e8 [<ffffffff80425a08>] sctp_endpoint_bh_rcv+0x1c4/0x214 [<ffffffff8043af68>] sctp_rcv+0x588/0x630 [<ffffffff8043e8e8>] sctp6_rcv+0x10/0x24 [<ffffffff803acb50>] ip6_input+0x2c0/0x440 [<ffffffff8030fc00>] __netif_receive_skb_core+0x4a8/0x564 [<ffffffff80310650>] process_backlog+0xb4/0x18c [<ffffffff80313cbc>] net_rx_action+0x12c/0x210 [<ffffffff80034254>] __do_softirq+0x17c/0x2ac [<ffffffff800345e0>] irq_exit+0x54/0xb0 [<ffffffff800075a4>] ret_from_irq+0x0/0x4 [<ffffffff800090ec>] rm7k_wait_irqoff+0x24/0x48 [<ffffffff8005e388>] cpu_startup_entry+0xc0/0x148 [<ffffffff805a88b0>] start_kernel+0x37c/0x398 Code: dd0900b8 000330f8 0126302d <dcc60000> 50c0fff1 0047182a a48306a0 03e00008 00000000 ---[ end trace b530b0551467f2fd ]--- Kernel panic - not syncing: Fatal exception in interrupt What happens while auth_enable=0 in that case is, that ep->auth_hmacs is initialized to NULL in sctp_auth_init_hmacs() when endpoint is being created. After that point, if an admin switches over to auth_enable=1, the machine can crash due to NULL pointer dereference during reception of an INIT chunk. When we enter sctp_process_init() via sctp_sf_do_5_1B_init() in order to respond to an INIT chunk, the INIT verification succeeds and while we walk and process all INIT params via sctp_process_param() we find that net->sctp.auth_enable is set, therefore do not fall through, but invoke sctp_auth_asoc_set_default_hmac() instead, and thus, dereference what we have set to NULL during endpoint initialization phase. The fix is to make auth_enable immutable by caching its value during endpoint initialization, so that its original value is being carried along until destruction. The bug seems to originate from the very first days. Fix in joint work with Daniel Borkmann. Reported-by: Joshua Kinard <kumba@gentoo.org> Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Tested-by: Joshua Kinard <kumba@gentoo.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-18Merge tag 'rdma-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband Pull infiniband/rdma updates from Roland Dreier: - mostly cxgb4 fixes unblocked by the merge of some prerequisites via the net tree - drop deprecated MSI-X API use. - a couple other miscellaneous things. * tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: RDMA/cxgb4: Fix over-dereference when terminating RDMA/cxgb4: Use uninitialized_var() RDMA/cxgb4: Add missing debug stats RDMA/cxgb4: Initialize reserved fields in a FW work request RDMA/cxgb4: Use pr_warn_ratelimited RDMA/cxgb4: Max fastreg depth depends on DSGL support RDMA/cxgb4: SQ flush fix RDMA/cxgb4: rmb() after reading valid gen bit RDMA/cxgb4: Endpoint timeout fixes RDMA/cxgb4: Use the BAR2/WC path for kernel QPs and T5 devices IB/mlx5: Add block multicast loopback support IB/mthca: Use pci_enable_msix_exact() instead of pci_enable_msix() IB/qib: Use pci_enable_msix_range() instead of pci_enable_msix()
2014-04-18libata/ahci: accommodate tag ordered controllersDan Williams
The AHCI spec allows implementations to issue commands in tag order rather than FIFO order: 5.3.2.12 P:SelectCmd HBA sets pSlotLoc = (pSlotLoc + 1) mod (CAP.NCS + 1) or HBA selects the command to issue that has had the PxCI bit set to '1' longer than any other command pending to be issued. The result is that commands posted sequentially (time-wise) may play out of sequence when issued by hardware. This behavior has likely been hidden by drives that arrange for commands to complete in issue order. However, it appears recent drives (two from different vendors that we have found so far) inflict out-of-order completions as a matter of course. So, we need to take care to maintain ordered submission, otherwise we risk triggering a drive to fall out of sequential-io automation and back to random-io processing, which incurs large latency and degrades throughput. This issue was found in simple benchmarks where QD=2 seq-write performance was 30-50% *greater* than QD=32 seq-write performance. Tagging for -stable and making the change globally since it has a low risk-to-reward ratio. Also, word is that recent versions of an unnamed OS also does it this way now. So, drives in the field are already experienced with this tag ordering scheme. Cc: <stable@vger.kernel.org> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Ed Ciechanowski <ed.ciechanowski@intel.com> Reviewed-by: Matthew Wilcox <matthew.r.wilcox@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2014-04-18Merge tag 'dt-fixes-for-3.15' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux Pull devicetree fixes from Rob Herring: - fix error handling in of_update_property - fix section mismatch warnings in __reserved_mem_check_root - add empty of_find_node_by_path for !OF builds - add various missing binding documentation * tag 'dt-fixes-for-3.15' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: of: add empty of_find_node_by_path() for !OF of: Clean up of_update_property DT: add vendor prefix for EBV Elektronik of: Fix the section mismatch warnings. of: Add vendor prefix for Digi International Inc. DT: I2C: Add trivial bindings used by kirkwood boards DT: Vendor: Add prefixes used by Kirkwood devices DT: bindings: add missing Marvell Kirkwood SoC documentation dt-bindings: add vendor-prefix for Newhaven Display of: add vendor prefix for I2SE GmbH of: add vendor prefix for ISEE 2007 S.L.
2014-04-18regulator: core: Return error in get optional stubTim Kryger
Drivers that call regulator_get_optional are tolerant to the absence of that regulator. By modifying the value returned from the stub function to match that seen when a regulator isn't present, callers can wrap the regulator logic with an IS_ERR based conditional even if they happen to call regulator_is_supported_voltage. This improves efficiency as well as eliminates the possibility for a very subtle bug. Signed-off-by: Tim Kryger <tim.kryger@linaro.org> Reviewed-by: Alex Elder <elder@linaro.org> Signed-off-by: Mark Brown <broonie@linaro.org>
2014-04-18of: add empty of_find_node_by_path() for !OFAlexander Shiyan
Add an empty version of of_find_node_by_path(). This fixes following build error for asoc tree: sound/soc/fsl/fsl_ssi.c: In function 'fsl_ssi_probe': sound/soc/fsl/fsl_ssi.c:1471:2: error: implicit declaration of function 'of_find_node_by_path' [-Werror=implicit-function-declaration] sprop = of_get_property(of_find_node_by_path("/"), "compatible", NULL); Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Alexander Shiyan <shc_work@mail.ru> Signed-off-by: Rob Herring <robh@kernel.org>
2014-04-18drm: Split out drm_probe_helper.c from drm_crtc_helper.cDaniel Vetter
This is leftover stuff from my previous doc round which I kinda wanted to do but didn't yet due to rebase hell. The modeset helpers and the probing helpers a independent and e.g. i915 uses the probing stuff but has its own modeset infrastructure. It hence makes to split this up. While at it add a DOC: comment for the probing libraray. It would be rather neat to pull some of the DocBook documenting these two helpers into in-line DOC: comments. But unfortunately kerneldoc doesn't support markdown or something similar to make nice-looking documentation, so the current state is better. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-04-17genirq: Allow forcing cpu affinity of interruptsThomas Gleixner
The current implementation of irq_set_affinity() refuses rightfully to route an interrupt to an offline cpu. But there is a special case, where this is actually desired. Some of the ARM SoCs have per cpu timers which require setting the affinity during cpu startup where the cpu is not yet in the online mask. If we can't do that, then the local timer interrupt for the about to become online cpu is routed to some random online cpu. The developers of the affected machines tried to work around that issue, but that results in a massive mess in that timer code. We have a yet unused argument in the set_affinity callbacks of the irq chips, which I added back then for a similar reason. It was never required so it got not used. But I'm happy that I never removed it. That allows us to implement a sane handling of the above scenario. So the affected SoC drivers can add the required force handling to their interrupt chip, switch the timer code to irq_force_affinity() and things just work. This does not affect any existing user of irq_set_affinity(). Tagged for stable to allow a simple fix of the affected SoC clock event drivers. Reported-and-tested-by: Krzysztof Kozlowski <k.kozlowski@samsung.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Kyungmin Park <kyungmin.park@samsung.com> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Cc: Tomasz Figa <t.figa@samsung.com>, Cc: Daniel Lezcano <daniel.lezcano@linaro.org>, Cc: Kukjin Kim <kgene.kim@samsung.com> Cc: linux-arm-kernel@lists.infradead.org, Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/20140416143315.717251504@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-04-17Merge branch 'ipmi' (emailed ipmi fixes)Linus Torvalds
Merge ipmi fixes from Corey Minyard: "Things collected since last kernel release. Some of these are pretty important. The first three are bug fixes. The next two are to hopefully make everyone happy about allowing ACPI to be on all the time and not have IPMI have an effect on the system when not in use. The last is a little cleanup" * emailed patches from Corey Minyard <cminyard@mvista.com>: ipmi: boolify some things ipmi: Turn off all activity on an idle ipmi interface ipmi: Turn off default probing of interfaces ipmi: Reset the KCS timeout when starting error recovery ipmi: Fix a race restarting the timer Char: ipmi_bt_sm, fix infinite loop
2014-04-17ipmi: boolify some thingsCorey Minyard
Convert some ints to bools. Signed-off-by: Corey Minyard <cminyard@mvista.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-17ipmi: Turn off all activity on an idle ipmi interfaceCorey Minyard
The IPMI driver would wake up periodically looking for events and watchdog pretimeouts. If there is nothing waiting for these events, it's really kind of pointless to be checking for them. So modify the driver so the message handler can pass down if it needs the lower layer to be waiting for these. Modify the system interface lower layer to turn off all timer and thread activity if the upper layer doesn't need anything and it is not currently handling messages. And modify the message handler to not restart the timer if its timer is not needed. The timers and kthread will still be enabled if: - the SI interface is handling a message. - a user has enabled watching for events. - the IPMI watchdog timer is in use (since it uses pretimeouts). - the message handler is waiting on a remote response. - a user has registered to receive commands. This mostly affects interfaces without interrupts. Interfaces with interrupts already don't use CPU in the system interface when the interface is idle. Signed-off-by: Corey Minyard <cminyard@mvista.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-16Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: "Various fixes: - reboot regression fix - build message spam fix - GPU quirk fix - 'make kvmconfig' fix plus the wire-up of the renameat2() system call on i386" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86: Remove the PCI reboot method from the default chain x86/build: Supress "Nothing to be done for ..." messages x86/gpu: Fix sign extension issue in Intel graphics stolen memory quirks x86/platform: Fix "make O=dir kvmconfig" i386: Wire up the renameat2() syscall
2014-04-16Drivers: hv: vmbus: Negotiate version 3.0 when running on ws2012r2 hostsK. Y. Srinivasan
Only ws2012r2 hosts support the ability to reconnect to the host on VMBUS. This functionality is needed by kexec in Linux. To use this functionality we need to negotiate version 3.0 of the VMBUS protocol. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Cc: <stable@vger.kernel.org> [3.9+] Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-04-16Tools: hv: Handle the case when the target file exists correctlyK. Y. Srinivasan
Return the appropriate error code and handle the case when the target file exists correctly. This fixes a bug. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Cc: <stable@vger.kernel.org> [3.14] Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-04-16net: mdio-gpio: Add support for separate MDI and MDO gpio pinsGuenter Roeck
This is for a system with fixed assignments of input and output pins (various variants of Kontron COMe). Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-16net: mdio-gpio: Add support for active low gpio pinsGuenter Roeck
Some systems using mdio-gpio may use active-low gpio pins (eg with inverters or FETs connected to all or some of the gpio pins). Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-16ipv4, fib: pass LOOPBACK_IFINDEX instead of 0 to flowi4_iifCong Wang
As suggested by Julian: Simply, flowi4_iif must not contain 0, it does not look logical to ignore all ip rules with specified iif. because in fib_rule_match() we do: if (rule->iifindex && (rule->iifindex != fl->flowi_iif)) goto out; flowi4_iif should be LOOPBACK_IFINDEX by default. We need to move LOOPBACK_IFINDEX to include/net/flow.h: 1) It is mostly used by flowi_iif 2) Fix the following compile error if we use it in flow.h by the patches latter: In file included from include/linux/netfilter.h:277:0, from include/net/netns/netfilter.h:5, from include/net/net_namespace.h:21, from include/linux/netdevice.h:43, from include/linux/icmpv6.h:12, from include/linux/ipv6.h:61, from include/net/ipv6.h:16, from include/linux/sunrpc/clnt.h:27, from include/linux/nfs_fs.h:30, from init/do_mounts.c:32: include/net/flow.h: In function ‘flowi4_init_output’: include/net/flow.h:84:32: error: ‘LOOPBACK_IFINDEX’ undeclared (first use in this function) Cc: Eric Biederman <ebiederm@xmission.com> Cc: Julian Anastasov <ja@ssi.bg> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Cong Wang <cwang@twopensource.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-16sysfs, driver-core: remove unused {sysfs|device}_schedule_callback_owner()Tejun Heo
All device_schedule_callback_owner() users are converted to use device_remove_file_self(). Remove now unused {sysfs|device}_schedule_callback_owner(). Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-04-16net: phy: add minimal support for QSGMII PHYThomas Petazzoni
This commit adds the necessary definitions for the PHY layer to recognize "qsgmii" as a valid PHY interface. A QSMII interface, as defined at http://en.wikipedia.org/wiki/Media_Independent_Interface#Quad_Serial_Gigabit_Media_Independent_Interface, is "is a method of combining four SGMII lines into a 5Gbit/s interface. QSGMII, like SGMII, uses LVDS signalling for the TX and RX data and a single LVDS clock signal. QSGMII uses significantly fewer signal lines than four SGMII busses." This type of MAC <-> PHY connection might require special handling on the MAC driver side, so it should be possible to express this type of MAC <-> PHY connection, for example in the Device Tree. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Cc: devicetree@vger.kernel.org Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-16drm/tegra: Remove gratuitous pad fieldThierry Reding
The version of the drm_tegra_submit structure that was merged all the way back in 3.10 contains a pad field that was originally intended to properly pad the following __u64 field. Unfortunately it seems like a different field was dropped during review that caused this padding to become unnecessary, but the pad field wasn't removed at that time. One possible side-effect of this is that since the __u64 following the pad is now no longer properly aligned, the compiler may (or may not) introduce padding itself, which results in no predictable ABI. Rectify this by removing the pad field so that all fields are again naturally aligned. Technically this is breaking existing userspace ABI, but given that there aren't any (released) userspace drivers that make use of this yet, the fallout should be minimal. Fixes: d43f81cbaf43 ("drm/tegra: Add gr2d device") Cc: <stable@vger.kernel.org> # 3.10 Signed-off-by: Thierry Reding <treding@nvidia.com>
2014-04-16x86: Remove the PCI reboot method from the default chainIngo Molnar
Steve reported a reboot hang and bisected it back to this commit: a4f1987e4c54 x86, reboot: Add EFI and CF9 reboot methods into the default list He heroically tested all reboot methods and found the following: reboot=t # triple fault ok reboot=k # keyboard ctrl FAIL reboot=b # BIOS ok reboot=a # ACPI FAIL reboot=e # EFI FAIL [system has no EFI] reboot=p # PCI 0xcf9 FAIL And I think it's pretty obvious that we should only try PCI 0xcf9 as a last resort - if at all. The other observation is that (on this box) we should never try the PCI reboot method, but close with either the 'triple fault' or the 'BIOS' (terminal!) reboot methods. Thirdly, CF9_COND is a total misnomer - it should be something like CF9_SAFE or CF9_CAREFUL, and 'CF9' should be 'CF9_FORCE' ... So this patch fixes the worst problems: - it orders the actual reboot logic to follow the reboot ordering pattern - it was in a pretty random order before for no good reason. - it fixes the CF9 misnomers and uses BOOT_CF9_FORCE and BOOT_CF9_SAFE flags to make the code more obvious. - it tries the BIOS reboot method before the PCI reboot method. (Since 'BIOS' is a terminal reboot method resulting in a hang if it does not work, this is essentially equivalent to removing the PCI reboot method from the default reboot chain.) - just for the miraculous possibility of terminal (resulting in hang) reboot methods of triple fault or BIOS returning without having done their job, there's an ordering between them as well. Reported-and-bisected-and-tested-by: Steven Rostedt <rostedt@goodmis.org> Cc: Li Aubrey <aubrey.li@linux.intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matthew Garrett <mjg59@srcf.ucam.org> Link: http://lkml.kernel.org/r/20140404064120.GB11877@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-04-15Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: 1) Fix BPF filter validation of netlink attribute accesses, from Mathias Kruase. 2) Netfilter conntrack generation seqcount not initialized properly, from Andrey Vagin. 3) Fix comparison mask computation on big-endian in nft_cmp_fast(), from Patrick McHardy. 4) Properly limit MTU over ipv6, from Eric Dumazet. 5) Fix seccomp system call argument population on 32-bit, from Daniel Borkmann. 6) skb_network_protocol() should not use hard-coded ETH_HLEN, instead skb->mac_len needs to be used. From Vlad Yasevich. 7) We have several cases of using socket based communications to implement a tunnel. For example, some tunnels are encapsulations over UDP so we use an internal kernel UDP socket to do the transmits. These tunnels should behave just like other software devices and pass the packets on down to the next layer. Most importantly we want the top-level socket (eg TCP) that created the traffic to be charged for the SKB memory. However, once you get into the IP output path, we have code that assumed that whatever was attached to skb->sk is an IP socket. To keep the top-level socket being charged for the SKB memory, whilst satisfying the needs of the IP output path, we now pass in an explicit 'sk' argument. From Eric Dumazet. 8) ping_init_sock() leaks group info, from Xiaoming Wang. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (33 commits) cxgb4: use the correct max size for firmware flash qlcnic: Fix MSI-X initialization code ip6_gre: don't allow to remove the fb_tunnel_dev ipv4: add a sock pointer to dst->output() path. ipv4: add a sock pointer to ip_queue_xmit() driver/net: cosa driver uses udelay incorrectly at86rf230: fix __at86rf230_read_subreg function at86rf230: remove check if AVDD settled net: cadence: Add architecture dependencies net: Start with correct mac_len in skb_network_protocol Revert "net: sctp: Fix a_rwnd/rwnd management to reflect real state of the receiver's buffer" cxgb4: Save the correct mac addr for hw-loopback connections in the L2T net: filter: seccomp: fix wrong decoding of BPF_S_ANC_SECCOMP_LD_W seccomp: fix populating a0-a5 syscall args in 32-bit x86 BPF qlcnic: Do not disable SR-IOV when VFs are assigned to VMs qlcnic: Fix QLogic application/driver interface for virtual NIC configuration qlcnic: Fix PVID configuration on eSwitch port. qlcnic: Fix max ring count calculation qlcnic: Fix to send INIT_NIC_FUNC as first mailbox. qlcnic: Fix panic due to uninitialzed delayed_work struct in use. ...
2014-04-15ipv4: add a sock pointer to dst->output() path.Eric Dumazet
In the dst->output() path for ipv4, the code assumes the skb it has to transmit is attached to an inet socket, specifically via ip_mc_output() : The sk_mc_loop() test triggers a WARN_ON() when the provider of the packet is an AF_PACKET socket. The dst->output() method gets an additional 'struct sock *sk' parameter. This needs a cascade of changes so that this parameter can be propagated from vxlan to final consumer. Fixes: 8f646c922d55 ("vxlan: keep original skb ownership") Reported-by: lucien xin <lucien.xin@gmail.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-15ipv4: add a sock pointer to ip_queue_xmit()Eric Dumazet
ip_queue_xmit() assumes the skb it has to transmit is attached to an inet socket. Commit 31c70d5956fc ("l2tp: keep original skb ownership") changed l2tp to not change skb ownership and thus broke this assumption. One fix is to add a new 'struct sock *sk' parameter to ip_queue_xmit(), so that we do not assume skb->sk points to the socket used by l2tp tunnel. Fixes: 31c70d5956fc ("l2tp: keep original skb ownership") Reported-by: Zhan Jianyu <nasa4836@gmail.com> Tested-by: Zhan Jianyu <nasa4836@gmail.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-14ext4: remove temporary shim used to merge COLLAPSE_RANGE and ZERO_RANGETheodore Ts'o
In retrospect, this was a bad way to handle things, since it limited testing of these patches. We should just get the VFS level changes merged in first. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2014-04-14Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nfDavid S. Miller
Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following patchset contains three Netfilter fixes for your net tree, they are: * Fix missing generation sequence initialization which results in a splat if lockdep is enabled, it was introduced in the recent works to improve nf_conntrack scalability, from Andrey Vagin. * Don't flush the GRE keymap list in nf_conntrack when the pptp helper is disabled otherwise this crashes due to a double release, from Andrey Vagin. * Fix nf_tables cmp fast in big endian, from Patrick McHardy. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-14Revert "net: sctp: Fix a_rwnd/rwnd management to reflect real state of the ↵Daniel Borkmann
receiver's buffer" This reverts commit ef2820a735f7 ("net: sctp: Fix a_rwnd/rwnd management to reflect real state of the receiver's buffer") as it introduced a serious performance regression on SCTP over IPv4 and IPv6, though a not as dramatic on the latter. Measurements are on 10Gbit/s with ixgbe NICs. Current state: [root@Lab200slot2 ~]# iperf3 --sctp -4 -c 192.168.241.3 -V -l 1452 -t 60 iperf version 3.0.1 (10 January 2014) Linux Lab200slot2 3.14.0 #1 SMP Thu Apr 3 23:18:29 EDT 2014 x86_64 Time: Fri, 11 Apr 2014 17:56:21 GMT Connecting to host 192.168.241.3, port 5201 Cookie: Lab200slot2.1397238981.812898.548918 [ 4] local 192.168.241.2 port 38616 connected to 192.168.241.3 port 5201 Starting Test: protocol: SCTP, 1 streams, 1452 byte blocks, omitting 0 seconds, 60 second test [ ID] Interval Transfer Bandwidth [ 4] 0.00-1.09 sec 20.8 MBytes 161 Mbits/sec [ 4] 1.09-2.13 sec 10.8 MBytes 86.8 Mbits/sec [ 4] 2.13-3.15 sec 3.57 MBytes 29.5 Mbits/sec [ 4] 3.15-4.16 sec 4.33 MBytes 35.7 Mbits/sec [ 4] 4.16-6.21 sec 10.4 MBytes 42.7 Mbits/sec [ 4] 6.21-6.21 sec 0.00 Bytes 0.00 bits/sec [ 4] 6.21-7.35 sec 34.6 MBytes 253 Mbits/sec [ 4] 7.35-11.45 sec 22.0 MBytes 45.0 Mbits/sec [ 4] 11.45-11.45 sec 0.00 Bytes 0.00 bits/sec [ 4] 11.45-11.45 sec 0.00 Bytes 0.00 bits/sec [ 4] 11.45-11.45 sec 0.00 Bytes 0.00 bits/sec [ 4] 11.45-12.51 sec 16.0 MBytes 126 Mbits/sec [ 4] 12.51-13.59 sec 20.3 MBytes 158 Mbits/sec [ 4] 13.59-14.65 sec 13.4 MBytes 107 Mbits/sec [ 4] 14.65-16.79 sec 33.3 MBytes 130 Mbits/sec [ 4] 16.79-16.79 sec 0.00 Bytes 0.00 bits/sec [ 4] 16.79-17.82 sec 5.94 MBytes 48.7 Mbits/sec (etc) [root@Lab200slot2 ~]# iperf3 --sctp -6 -c 2001:db8:0:f101::1 -V -l 1400 -t 60 iperf version 3.0.1 (10 January 2014) Linux Lab200slot2 3.14.0 #1 SMP Thu Apr 3 23:18:29 EDT 2014 x86_64 Time: Fri, 11 Apr 2014 19:08:41 GMT Connecting to host 2001:db8:0:f101::1, port 5201 Cookie: Lab200slot2.1397243321.714295.2b3f7c [ 4] local 2001:db8:0:f101::2 port 55804 connected to 2001:db8:0:f101::1 port 5201 Starting Test: protocol: SCTP, 1 streams, 1400 byte blocks, omitting 0 seconds, 60 second test [ ID] Interval Transfer Bandwidth [ 4] 0.00-1.00 sec 169 MBytes 1.42 Gbits/sec [ 4] 1.00-2.00 sec 201 MBytes 1.69 Gbits/sec [ 4] 2.00-3.00 sec 188 MBytes 1.58 Gbits/sec [ 4] 3.00-4.00 sec 174 MBytes 1.46 Gbits/sec [ 4] 4.00-5.00 sec 165 MBytes 1.39 Gbits/sec [ 4] 5.00-6.00 sec 199 MBytes 1.67 Gbits/sec [ 4] 6.00-7.00 sec 163 MBytes 1.36 Gbits/sec [ 4] 7.00-8.00 sec 174 MBytes 1.46 Gbits/sec [ 4] 8.00-9.00 sec 193 MBytes 1.62 Gbits/sec [ 4] 9.00-10.00 sec 196 MBytes 1.65 Gbits/sec [ 4] 10.00-11.00 sec 157 MBytes 1.31 Gbits/sec [ 4] 11.00-12.00 sec 175 MBytes 1.47 Gbits/sec [ 4] 12.00-13.00 sec 192 MBytes 1.61 Gbits/sec [ 4] 13.00-14.00 sec 199 MBytes 1.67 Gbits/sec (etc) After patch: [root@Lab200slot2 ~]# iperf3 --sctp -4 -c 192.168.240.3 -V -l 1452 -t 60 iperf version 3.0.1 (10 January 2014) Linux Lab200slot2 3.14.0+ #1 SMP Mon Apr 14 12:06:40 EDT 2014 x86_64 Time: Mon, 14 Apr 2014 16:40:48 GMT Connecting to host 192.168.240.3, port 5201 Cookie: Lab200slot2.1397493648.413274.65e131 [ 4] local 192.168.240.2 port 50548 connected to 192.168.240.3 port 5201 Starting Test: protocol: SCTP, 1 streams, 1452 byte blocks, omitting 0 seconds, 60 second test [ ID] Interval Transfer Bandwidth [ 4] 0.00-1.00 sec 240 MBytes 2.02 Gbits/sec [ 4] 1.00-2.00 sec 239 MBytes 2.01 Gbits/sec [ 4] 2.00-3.00 sec 240 MBytes 2.01 Gbits/sec [ 4] 3.00-4.00 sec 239 MBytes 2.00 Gbits/sec [ 4] 4.00-5.00 sec 245 MBytes 2.05 Gbits/sec [ 4] 5.00-6.00 sec 240 MBytes 2.01 Gbits/sec [ 4] 6.00-7.00 sec 240 MBytes 2.02 Gbits/sec [ 4] 7.00-8.00 sec 239 MBytes 2.01 Gbits/sec With the reverted patch applied, the SCTP/IPv4 performance is back to normal on latest upstream for IPv4 and IPv6 and has same throughput as 3.4.2 test kernel, steady and interval reports are smooth again. Fixes: ef2820a735f7 ("net: sctp: Fix a_rwnd/rwnd management to reflect real state of the receiver's buffer") Reported-by: Peter Butler <pbutler@sonusnet.com> Reported-by: Dongsheng Song <dongsheng.song@gmail.com> Reported-by: Fengguang Wu <fengguang.wu@intel.com> Tested-by: Peter Butler <pbutler@sonusnet.com> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Cc: Matija Glavinic Pecotic <matija.glavinic-pecotic.ext@nsn.com> Cc: Alexander Sverdlin <alexander.sverdlin@nsn.com> Cc: Vlad Yasevich <vyasevich@gmail.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-14net: filter: seccomp: fix wrong decoding of BPF_S_ANC_SECCOMP_LD_WDaniel Borkmann
While reviewing seccomp code, we found that BPF_S_ANC_SECCOMP_LD_W has been wrongly decoded by commit a8fc927780 ("sk-filter: Add ability to get socket filter program (v2)") into the opcode BPF_LD|BPF_B|BPF_ABS although it should have been decoded as BPF_LD|BPF_W|BPF_ABS. In practice, this should not have much side-effect though, as such conversion is/was being done through prctl(2) PR_SET_SECCOMP. Reverse operation PR_GET_SECCOMP will only return the current seccomp mode, but not the filter itself. Since the transition to the new BPF infrastructure, it's also not used anymore, so we can simply remove this as it's unreachable. Fixes: a8fc927780 ("sk-filter: Add ability to get socket filter program (v2)") Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Cc: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-14ipv6: Limit mtu to 65575 bytesEric Dumazet
Francois reported that setting big mtu on loopback device could prevent tcp sessions making progress. We do not support (yet ?) IPv6 Jumbograms and cook corrupted packets. We must limit the IPv6 MTU to (65535 + 40) bytes in theory. Tested: ifconfig lo mtu 70000 netperf -H ::1 Before patch : Throughput : 0.05 Mbits After patch : Throughput : 35484 Mbits Reported-by: Francois WELLENREITER <f.wellenreiter@gmail.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-14netfilter: nf_tables: fix nft_cmp_fast failure on big endian for size < 4Patrick McHardy
nft_cmp_fast is used for equality comparisions of size <= 4. For comparisions of size < 4 byte a mask is calculated that is applied to both the data from userspace (during initialization) and the register value (during runtime). Both values are stored using (in effect) memcpy to a memory area that is then interpreted as u32 by nft_cmp_fast. This works fine on little endian since smaller types have the same base address, however on big endian this is not true and the smaller types are interpreted as a big number with trailing zero bytes. The mask therefore must not include the lower bytes, but the higher bytes on big endian. Add a helper function that does a cpu_to_le32 to switch the bytes on big endian. Since we're dealing with a mask of just consequitive bits, this works out fine. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>