summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2014-02-06arch/x86/mm/numa.c: initialize numa_kernel_nodes in ↵Tang Chen
numa_clear_kernel_node_hotplug() On-stack variable numa_kernel_nodes in numa_clear_kernel_node_hotplug() was not initialized. So we need to initialize it. [akpm@linux-foundation.org: use NODE_MASK_NONE, per David] Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com> Tested-by: Gu Zheng <guz.fnst@cn.fujitsu.com> Reported-by: Dave Jones <davej@redhat.com> Reported-by: David Rientjes <rientjes@google.com> Tested-by: Dave Jones <davej@redhat.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-02-06mm: __set_page_dirty_nobuffers() uses spin_lock_irqsave() instead of ↵KOSAKI Motohiro
spin_lock_irq() During aio stress test, we observed the following lockdep warning. This mean AIO+numa_balancing is currently deadlockable. The problem is, aio_migratepage disable interrupt, but __set_page_dirty_nobuffers unintentionally enable it again. Generally, all helper function should use spin_lock_irqsave() instead of spin_lock_irq() because they don't know caller at all. other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(&(&ctx->completion_lock)->rlock); <Interrupt> lock(&(&ctx->completion_lock)->rlock); *** DEADLOCK *** dump_stack+0x19/0x1b print_usage_bug+0x1f7/0x208 mark_lock+0x21d/0x2a0 mark_held_locks+0xb9/0x140 trace_hardirqs_on_caller+0x105/0x1d0 trace_hardirqs_on+0xd/0x10 _raw_spin_unlock_irq+0x2c/0x50 __set_page_dirty_nobuffers+0x8c/0xf0 migrate_page_copy+0x434/0x540 aio_migratepage+0xb1/0x140 move_to_new_page+0x7d/0x230 migrate_pages+0x5e5/0x700 migrate_misplaced_page+0xbc/0xf0 do_numa_page+0x102/0x190 handle_pte_fault+0x241/0x970 handle_mm_fault+0x265/0x370 __do_page_fault+0x172/0x5a0 do_page_fault+0x1a/0x70 page_fault+0x28/0x30 Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Larry Woodman <lwoodman@redhat.com> Cc: Rik van Riel <riel@redhat.com> Cc: Johannes Weiner <jweiner@redhat.com> Acked-by: David Rientjes <rientjes@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-02-06mm/swap: fix race on swap_info reuse between swapoff and swaponWeijie Yang
swapoff clear swap_info's SWP_USED flag prematurely and free its resources after that. A concurrent swapon will reuse this swap_info while its previous resources are not cleared completely. These late freed resources are: - p->percpu_cluster - swap_cgroup_ctrl[type] - block_device setting - inode->i_flags &= ~S_SWAPFILE This patch clears the SWP_USED flag after all its resources are freed, so that swapon can reuse this swap_info by alloc_swap_info() safely. [akpm@linux-foundation.org: tidy up code comment] Signed-off-by: Weijie Yang <weijie.yang@samsung.com> Acked-by: Hugh Dickins <hughd@google.com> Cc: Krzysztof Kozlowski <k.kozlowski@samsung.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-02-06swap: add a simple detector for inappropriate swapin readaheadShaohua Li
This is a patch to improve swap readahead algorithm. It's from Hugh and I slightly changed it. Hugh's original changelog: swapin readahead does a blind readahead, whether or not the swapin is sequential. This may be ok on harddisk, because large reads have relatively small costs, and if the readahead pages are unneeded they can be reclaimed easily - though, what if their allocation forced reclaim of useful pages? But on SSD devices large reads are more expensive than small ones: if the readahead pages are unneeded, reading them in caused significant overhead. This patch adds very simplistic random read detection. Stealing the PageReadahead technique from Konstantin Khlebnikov's patch, avoiding the vma/anon_vma sophistications of Shaohua Li's patch, swapin_nr_pages() simply looks at readahead's current success rate, and narrows or widens its readahead window accordingly. There is little science to its heuristic: it's about as stupid as can be whilst remaining effective. The table below shows elapsed times (in centiseconds) when running a single repetitive swapping load across a 1000MB mapping in 900MB ram with 1GB swap (the harddisk tests had taken painfully too long when I used mem=500M, but SSD shows similar results for that). Vanilla is the 3.6-rc7 kernel on which I started; Shaohua denotes his Sep 3 patch in mmotm and linux-next; HughOld denotes my Oct 1 patch which Shaohua showed to be defective; HughNew this Nov 14 patch, with page_cluster as usual at default of 3 (8-page reads); HughPC4 this same patch with page_cluster 4 (16-page reads); HughPC0 with page_cluster 0 (1-page reads: no readahead). HDD for swapping to harddisk, SSD for swapping to VertexII SSD. Seq for sequential access to the mapping, cycling five times around; Rand for the same number of random touches. Anon for a MAP_PRIVATE anon mapping; Shmem for a MAP_SHARED anon mapping, equivalent to tmpfs. One weakness of Shaohua's vma/anon_vma approach was that it did not optimize Shmem: seen below. Konstantin's approach was perhaps mistuned, 50% slower on Seq: did not compete and is not shown below. HDD Vanilla Shaohua HughOld HughNew HughPC4 HughPC0 Seq Anon 73921 76210 75611 76904 78191 121542 Seq Shmem 73601 73176 73855 72947 74543 118322 Rand Anon 895392 831243 871569 845197 846496 841680 Rand Shmem 1058375 1053486 827935 764955 764376 756489 SSD Vanilla Shaohua HughOld HughNew HughPC4 HughPC0 Seq Anon 24634 24198 24673 25107 21614 70018 Seq Shmem 24959 24932 25052 25703 22030 69678 Rand Anon 43014 26146 28075 25989 26935 25901 Rand Shmem 45349 45215 28249 24268 24138 24332 These tests are, of course, two extremes of a very simple case: under heavier mixed loads I've not yet observed any consistent improvement or degradation, and wider testing would be welcome. Shaohua Li: Test shows Vanilla is slightly better in sequential workload than Hugh's patch. I observed with Hugh's patch sometimes the readahead size is shrinked too fast (from 8 to 1 immediately) in sequential workload if there is no hit. And in such case, continuing doing readahead is good actually. I don't prepare a sophisticated algorithm for the sequential workload because so far we can't guarantee sequential accessed pages are swap out sequentially. So I slightly change Hugh's heuristic - don't shrink readahead size too fast. Here is my test result (unit second, 3 runs average): Vanilla Hugh New Seq 356 370 360 Random 4525 2447 2444 Attached graph is the swapin/swapout throughput I collected with 'vmstat 2'. The first part is running a random workload (till around 1200 of the x-axis) and the second part is running a sequential workload. swapin and swapout throughput are almost identical in steady state in both workloads. These are expected behavior. while in Vanilla, swapin is much bigger than swapout especially in random workload (because wrong readahead). Original patches by: Shaohua Li and Konstantin Khlebnikov. [fengguang.wu@intel.com: swapin_nr_pages() can be static] Signed-off-by: Hugh Dickins <hughd@google.com> Signed-off-by: Shaohua Li <shli@fusionio.com> Signed-off-by: Fengguang Wu <fengguang.wu@intel.com> Cc: Rik van Riel <riel@redhat.com> Cc: Wu Fengguang <fengguang.wu@intel.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Konstantin Khlebnikov <khlebnikov@openvz.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-02-06ocfs2: free allocated clusters if error occurs after ocfs2_claim_clustersZongxun Wang
Even if using the same jbd2 handle, we cannot rollback a transaction. So once some error occurs after successfully allocating clusters, the allocated clusters will never be used and it means they are lost. For example, call ocfs2_claim_clusters successfully when expanding a file, but failed in ocfs2_insert_extent. So we need free the allocated clusters if they are not used indeed. Signed-off-by: Zongxun Wang <wangzongxun@huawei.com> Signed-off-by: Joseph Qi <joseph.qi@huawei.com> Acked-by: Joel Becker <jlbec@evilplan.org> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Li Zefan <lizefan@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-02-06Documentation/kernel-parameters.txt: fix memmap= languageRandy Dunlap
Clean up descriptions of memmap= boot options. Add periods (full stops), drop commas, change "used" to "reserved" or "marked". Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Andiry Xu <andiry.xu@gmail.com> Acked-by: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-02-06Merge tag 'sound-3.14-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "A few HD-audio fixes and one USB-audio kconfig dependency fix. All small and device-specific changes marked with Cc to stable" * tag 'sound-3.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ALSA: hda - Improve loopback path lookups for AD1983 ALSA: hda - Fix missing VREF setup for Mac Pro 1,1 ALSA: hda - Add missing mixer widget for AD1983 ALSA: hda/realtek - Avoid invalid COEFs for ALC271X ALSA: hda - Fix silent output on Toshiba Satellite L40 ALSA: usb-audio: Add missing kconfig dependecy
2014-02-06Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linuxLinus Torvalds
Pull drm fixes from Dave Airlie: "A few regression fixes already, one for my own stupidity, and mgag200 typo fix, vmwgfx fixes and ttm regression fixes, and a radeon register checker update for older cards to handle geom shaders" * 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: drm/radeon: allow geom rings to be setup on r600/r700 (v2) drm/mgag200,ast,cirrus: fix regression with drm_can_sleep conversion drm/ttm: Don't clear page metadata of imported sg pages drm/ttm: Fix TTM object open regression vmwgfx: Fix unitialized stack read in vmw_setup_otable_base drm/vmwgfx: Reemit context bindings when necessary v2 drm/vmwgfx: Detect old user-space drivers and set up legacy emulation v2 drm/vmwgfx: Emulate legacy shaders on guest-backed devices v2 drm/vmwgfx: Fix legacy surface reference size copyback drm/vmwgfx: Fix SET_SHADER_CONST emulation on guest-backed devices drm/vmwgfx: Fix regression caused by "drm/ttm: make ttm reservation calls behave like reservation calls" drm/vmwgfx: Don't commit staged bindings if execbuf fails drm/mgag200: fix typo causing bw limits to be ignored on some chips
2014-02-06x86, microcode, AMD: Unify valid container checksBorislav Petkov
For additional coverage, BorisO and friends unknowlingly did swap AMD microcode with Intel microcode blobs in order to see what happens. What did happen on 32-bit was [ 5.722656] BUG: unable to handle kernel paging request at be3a6008 [ 5.722693] IP: [<c106d6b4>] load_microcode_amd+0x24/0x3f0 [ 5.722716] *pdpt = 0000000000000000 *pde = 0000000000000000 because there was a valid initrd there but without valid microcode in it and the container check happened *after* the relocated ramdisk handling on 32-bit, which was clearly wrong. While at it, take care of the ramdisk relocation on both 32- and 64-bit as it is done on both. Also, comment what we're doing because this code is a bit tricky. Reported-and-tested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: http://lkml.kernel.org/r/1391460104-7261-1-git-send-email-bp@alien8.de Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2014-02-06x86, hweight: Fix BUG when booting with CONFIG_GCOV_PROFILE_ALL=yPeter Oberparleiter
Commit d61931d89b, "x86: Add optimized popcnt variants" introduced compile flag -fcall-saved-rdi for lib/hweight.c. When combined with options -fprofile-arcs and -O2, this flag causes gcc to generate broken constructor code. As a result, a 64 bit x86 kernel compiled with CONFIG_GCOV_PROFILE_ALL=y prints message "gcov: could not create file" and runs into sproadic BUGs during boot. The gcc people indicate that these kinds of problems are endemic when using ad hoc calling conventions. It is therefore best to treat any file compiled with ad hoc calling conventions as an isolated environment and avoid things like profiling or coverage analysis, since those subsystems assume a "normal" calling conventions. This patch avoids the bug by excluding lib/hweight.o from coverage profiling. Reported-by: Meelis Roos <mroos@linux.ee> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/52F3A30C.7050205@linux.vnet.ibm.com Signed-off-by: H. Peter Anvin <hpa@zytor.com> Cc: <stable@vger.kernel.org>
2014-02-06pinctrl: tegra: return correct error typeLaxman Dewangan
When memory allocation failed, drive should return error as ENOMEM. Signed-off-by: Laxman Dewangan <ldewangan@nvidia.com> Acked-by: Stephen Warren <swarren@nvidia.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2014-02-06pinctrl: do not init debugfs entries for unimplemented functionalitiesFlorian Vaussard
Commit c420619 "pinctrl: pinconf: remove checks on ops->pin_config_get" removed the check on (ops != NULL) when performing pinconf_pins_show() or pinconf_groups_show(). As these entries are always enabled, even if pinconf is not supported, reading will result in an oops due to NULL ops. Instead of checking for ops, remove the corresponding debugfs entries if pinconf and/or pinmux are not implemented. Tested on OMAP3 (pinctrl-single). Cc: stable@vger.kernel.org Signed-off-by: Florian Vaussard <florian.vaussard@epfl.ch> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2014-02-06MIPS: fpu.h: Fix build when CONFIG_BUG is not setAaro Koskinen
__enable_fpu produces a build failure when CONFIG_BUG is not set: In file included from arch/mips/kernel/cpu-probe.c:24:0: arch/mips/include/asm/fpu.h: In function '__enable_fpu': arch/mips/include/asm/fpu.h:77:1: error: control reaches end of non-void function [-Werror=return-type] This is regression introduced in 3.14-rc1. Fix that. Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi> Acked-by: Paul Burton <paul.burton@imgtec.com> Cc: John Crispin <blogic@openwrt.org> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/6504/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2014-02-06arm64: barriers: allow dsb macro to take option parameterWill Deacon
The dsb instruction takes an option specifying both the target access types and shareability domain. This patch allows such an option to be passed to the dsb macro, resulting in potentially more efficient code. Currently the option is ignored until all callers are updated (unlike ARM, the option is mandated by the assembler). Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2014-02-06drm/radeon: allow geom rings to be setup on r600/r700 (v2)Dave Airlie
the evergreen CS parser has allowed this for a while, just port the code to the r600 one. This is required before geom shaders can be made work. v2: agd5f: minor cleanup and add additional 7xx reg. Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-02-06Merge tag 'vmwgfx-fixes-3.14-2014-02-05' of ↵Dave Airlie
git://people.freedesktop.org/~thomash/linux into drm-next A couple of vmwgfx fixes together with missing bits of legacy device emulation to facilitate old user-space drivers on new devices. The shader emulation bits are a bit large, but since they mostly touch the new device code, regressions are unlikely. I figure the gain of having this from the start clearly outweighs the risc of adding these bits at this point. Pull request of 2014-02-05 * tag 'vmwgfx-fixes-3.14-2014-02-05' of git://people.freedesktop.org/~thomash/linux: vmwgfx: Fix unitialized stack read in vmw_setup_otable_base drm/vmwgfx: Reemit context bindings when necessary v2 drm/vmwgfx: Detect old user-space drivers and set up legacy emulation v2 drm/vmwgfx: Emulate legacy shaders on guest-backed devices v2 drm/vmwgfx: Fix legacy surface reference size copyback drm/vmwgfx: Fix SET_SHADER_CONST emulation on guest-backed devices drm/vmwgfx: Fix regression caused by "drm/ttm: make ttm reservation calls behave like reservation calls" drm/vmwgfx: Don't commit staged bindings if execbuf fails
2014-02-06Merge tag 'ttm-fixes-3.14-2014-02-05' of ↵Dave Airlie
git://people.freedesktop.org/~thomash/linux into drm-next Two ttm regression fixes. Pull request of 2014-02-05 * tag 'ttm-fixes-3.14-2014-02-05' of git://people.freedesktop.org/~thomash/linux: drm/ttm: Don't clear page metadata of imported sg pages drm/ttm: Fix TTM object open regression
2014-02-06drm/mgag200,ast,cirrus: fix regression with drm_can_sleep conversionDave Airlie
I totally sign inverted my way out of this one. Cc: stable@vger.kernel.org Reported-by: "Sabrina Dubroca" <sd@queasysnail.net> Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-02-05Merge branch 'irq-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq updates from Thomas Gleixner: "This lot provides: * Bugfixes for armada irq controller * Updates to renesas irq chip * Support for the TI-NSPIRE irq controller Not strictly a bug fix only pull request, but important updates for some of the arm Socs which I completely forgot to send last week. Seems like my obliviousness is getting worse, I just can't remember when it started" * 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: irqchip: Add support for TI-NSPIRE irqchip irqchip: renesas-irqc: Enable mask on suspend irqchip: renesas-irqc: Use lazy disable irqchip: armada-370-xp: fix MSI race condition irqchip: armada-370-xp: fix IPI race condition
2014-02-05Merge tag 'stable/for-linus-3.14-rc1-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull Xen fixes from Konrad Rzeszutek Wilk: "Bug-fixes: - Revert "xen/grant-table: Avoid m2p_override during mapping" as it broke Xen ARM build. - Fix CR4 not being set on AP processors in Xen PVH mode" * tag 'stable/for-linus-3.14-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen/pvh: set CR4 flags for APs Revert "xen/grant-table: Avoid m2p_override during mapping"
2014-02-05Merge tag 'please-pull-ia64-syscalls' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux Pull ia64 update from Tony Luck: "Wire up new sched_setattr and sched_getattr syscalls" * tag 'please-pull-ia64-syscalls' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux: [IA64] Wire up new sched_setattr and sched_getattr syscalls
2014-02-05Merge git://git.infradead.org/users/willy/linux-nvmeLinus Torvalds
Pull NVMe driver update from Matthew Wilcox: "Looks like I missed the merge window ... but these are almost all bugfixes anyway (the ones that aren't have been baking for months)" * git://git.infradead.org/users/willy/linux-nvme: NVMe: Namespace use after free on surprise removal NVMe: Correct uses of INIT_WORK NVMe: Include device and queue numbers in interrupt name NVMe: Add a pci_driver shutdown method NVMe: Disable admin queue on init failure NVMe: Dynamically allocate partition numbers NVMe: Async IO queue deletion NVMe: Surprise removal handling NVMe: Abort timed out commands NVMe: Schedule reset for failed controllers NVMe: Device resume error handling NVMe: Cache dev->pci_dev in a local pointer NVMe: Fix lockdep warnings NVMe: compat SG_IO ioctl NVMe: remove deprecated IRQF_DISABLED NVMe: Avoid shift operation when writing cq head doorbell
2014-02-05Merge tag 'regulator-v3.14-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator Pull regulator fixes from Mark Brown: "A couple of driver fixes here but the main thing is a fix to the checks for deferred probe non-DT systems with fully specified regulators which had been broken by a device tree fix which meant that we wouldn't insert optional regulators. This had slipped through the cracks since very few systems do that in the first place and those that do it in mainline don't need optional regulators anyway" * tag 'regulator-v3.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator: regulator: s2mps11: Fix NULL pointer of_node value when using platform data regulator: core: Correct default return value for full constraints regulator: ab3100: cast fix
2014-02-05Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6Linus Torvalds
Pull crypto fixes from Herbert Xu: "This fixes a number of concurrency issues on s390 where multiple users of the same crypto transform may clobber each other's results" * git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: s390 - fix des and des3_ede ctr concurrency issue crypto: s390 - fix des and des3_ede cbc concurrency issue crypto: s390 - fix concurrency issue in aes-ctr mode
2014-02-05x86/efi: Allow mapping BGRT on x86-32Matt Fleming
CONFIG_X86_32 doesn't map the boot services regions into the EFI memory map (see commit 700870119f49 ("x86, efi: Don't map Boot Services on i386")), and so efi_lookup_mapped_addr() will fail to return a valid address. Executing the ioremap() path in efi_bgrt_init() causes the following warning on x86-32 because we're trying to ioremap() RAM, WARNING: CPU: 0 PID: 0 at arch/x86/mm/ioremap.c:102 __ioremap_caller+0x2ad/0x2c0() Modules linked in: CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.13.0-0.rc5.git0.1.2.fc21.i686 #1 Hardware name: DellInc. Venue 8 Pro 5830/09RP78, BIOS A02 10/17/2013 00000000 00000000 c0c0df08 c09a5196 00000000 c0c0df38 c0448c1e c0b41310 00000000 00000000 c0b37bc1 00000066 c043bbfd c043bbfd 00e7dfe0 00073eff 00073eff c0c0df48 c0448ce2 00000009 00000000 c0c0df9c c043bbfd 00078d88 Call Trace: [<c09a5196>] dump_stack+0x41/0x52 [<c0448c1e>] warn_slowpath_common+0x7e/0xa0 [<c043bbfd>] ? __ioremap_caller+0x2ad/0x2c0 [<c043bbfd>] ? __ioremap_caller+0x2ad/0x2c0 [<c0448ce2>] warn_slowpath_null+0x22/0x30 [<c043bbfd>] __ioremap_caller+0x2ad/0x2c0 [<c0718f92>] ? acpi_tb_verify_table+0x1c/0x43 [<c0719c78>] ? acpi_get_table_with_size+0x63/0xb5 [<c087cd5e>] ? efi_lookup_mapped_addr+0xe/0xf0 [<c043bc2b>] ioremap_nocache+0x1b/0x20 [<c0cb01c8>] ? efi_bgrt_init+0x83/0x10c [<c0cb01c8>] efi_bgrt_init+0x83/0x10c [<c0cafd82>] efi_late_init+0x8/0xa [<c0c9bab2>] start_kernel+0x3ae/0x3c3 [<c0c9b53b>] ? repair_env_string+0x51/0x51 [<c0c9b378>] i386_start_kernel+0x12e/0x131 Switch to using early_memremap(), which won't trigger this warning, and has the added benefit of more accurately conveying what we're trying to do - map a chunk of memory. This patch addresses the following bug report, https://bugzilla.kernel.org/show_bug.cgi?id=67911 Reported-by: Adam Williamson <awilliam@redhat.com> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Matthew Garrett <mjg59@srcf.ucam.org> Cc: Rafael J. Wysocki <rjw@rjwysocki.net> Signed-off-by: Matt Fleming <matt.fleming@intel.com>
2014-02-05x86: Disable CONFIG_X86_DECODER_SELFTEST in allmod/allyesconfigsIngo Molnar
It can take some time to validate the image, make sure {allyes|allmod}config doesn't enable it. I'd say randconfig will cover it often enough, and the failure is also borderline build coverage related: you cannot really make the decoder test fail via source level changes, only with changes in the build environment, so I agree with Andi that we can disable this one too. Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Paul Gortmaker paul.gortmaker@windriver.com> Suggested-and-acked-by: Andi Kleen andi@firstfloor.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-02-05execve: use 'struct filename *' for executable name passingLinus Torvalds
This changes 'do_execve()' to get the executable name as a 'struct filename', and to free it when it is done. This is what the normal users want, and it simplifies and streamlines their error handling. The controlled lifetime of the executable name also fixes a use-after-free problem with the trace_sched_process_exec tracepoint: the lifetime of the passed-in string for kernel users was not at all obvious, and the user-mode helper code used UMH_WAIT_EXEC to serialize the pathname allocation lifetime with the execve() having finished, which in turn meant that the trace point that happened after mm_release() of the old process VM ended up using already free'd memory. To solve the kernel string lifetime issue, this simply introduces "getname_kernel()" that works like the normal user-space getname() function, except with the source coming from kernel memory. As Oleg points out, this also means that we could drop the tcomm[] array from 'struct linux_binprm', since the pathname lifetime now covers setup_new_exec(). That would be a separate cleanup. Reported-by: Igor Zhbanov <i.zhbanov@samsung.com> Tested-by: Steven Rostedt <rostedt@goodmis.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-02-05kernfs: make kernfs_deactivate() honor KERNFS_LOCKDEP flagTejun Heo
kernfs_deactivate() forgot to check whether KERNFS_LOCKDEP is set before performing lockdep annotations and ends up feeding uninitialized lockdep_map to lockdep triggering warning like the following on USB stick hotunplug. usb 1-2: USB disconnect, device number 2 INFO: trying to register non-static key. the code is fine but needs lockdep annotation. turning off the locking correctness validator. CPU: 1 PID: 62 Comm: khubd Not tainted 3.13.0-work+ #82 Hardware name: empty empty/S3992, BIOS 080011 10/26/2007 ffff880065ca7f60 ffff88013a4ffa08 ffffffff81cfb6bd 0000000000000002 ffff88013a4ffac8 ffffffff810f8530 ffff88013a4fc710 0000000000000002 ffff880100000000 ffffffff82a3db50 0000000000000001 ffff88013a4fc710 Call Trace: [<ffffffff81cfb6bd>] dump_stack+0x4e/0x7a [<ffffffff810f8530>] __lock_acquire+0x1910/0x1e70 [<ffffffff810f931a>] lock_acquire+0x9a/0x1d0 [<ffffffff8127c75e>] kernfs_deactivate+0xee/0x130 [<ffffffff8127d4c8>] kernfs_addrm_finish+0x38/0x60 [<ffffffff8127d701>] kernfs_remove_by_name_ns+0x51/0xa0 [<ffffffff8127b4f1>] remove_files.isra.1+0x41/0x80 [<ffffffff8127b7e7>] sysfs_remove_group+0x47/0xa0 [<ffffffff8127b873>] sysfs_remove_groups+0x33/0x50 [<ffffffff8177d66d>] device_remove_attrs+0x4d/0x80 [<ffffffff8177e25e>] device_del+0x12e/0x1d0 [<ffffffff819722c2>] usb_disconnect+0x122/0x1a0 [<ffffffff819749b5>] hub_thread+0x3c5/0x1290 [<ffffffff810c6a6d>] kthread+0xed/0x110 [<ffffffff81d0a56c>] ret_from_fork+0x7c/0xb0 Fix it by making kernfs_deactivate() perform lockdep annotations only if KERNFS_LOCKDEP is set. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Fabio Estevam <festevam@gmail.com> Reported-by: Alan Stern <stern@rowland.harvard.edu> Reported-by: Jiri Kosina <jkosina@suse.cz> Reported-by: Dave Jones <davej@redhat.com> Tested-by: Fabio Estevam <fabio.estevam@freescale.com> Tested-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-02-05SELinux: Fix kernel BUG on empty security contexts.Stephen Smalley
Setting an empty security context (length=0) on a file will lead to incorrectly dereferencing the type and other fields of the security context structure, yielding a kernel BUG. As a zero-length security context is never valid, just reject all such security contexts whether coming from userspace via setxattr or coming from the filesystem upon a getxattr request by SELinux. Setting a security context value (empty or otherwise) unknown to SELinux in the first place is only possible for a root process (CAP_MAC_ADMIN), and, if running SELinux in enforcing mode, only if the corresponding SELinux mac_admin permission is also granted to the domain by policy. In Fedora policies, this is only allowed for specific domains such as livecd for setting down security contexts that are not defined in the build host policy. Reproducer: su setenforce 0 touch foo setfattr -n security.selinux foo Caveat: Relabeling or removing foo after doing the above may not be possible without booting with SELinux disabled. Any subsequent access to foo after doing the above will also trigger the BUG. BUG output from Matthew Thode: [ 473.893141] ------------[ cut here ]------------ [ 473.962110] kernel BUG at security/selinux/ss/services.c:654! [ 473.995314] invalid opcode: 0000 [#6] SMP [ 474.027196] Modules linked in: [ 474.058118] CPU: 0 PID: 8138 Comm: ls Tainted: G D I 3.13.0-grsec #1 [ 474.116637] Hardware name: Supermicro X8ST3/X8ST3, BIOS 2.0 07/29/10 [ 474.149768] task: ffff8805f50cd010 ti: ffff8805f50cd488 task.ti: ffff8805f50cd488 [ 474.183707] RIP: 0010:[<ffffffff814681c7>] [<ffffffff814681c7>] context_struct_compute_av+0xce/0x308 [ 474.219954] RSP: 0018:ffff8805c0ac3c38 EFLAGS: 00010246 [ 474.252253] RAX: 0000000000000000 RBX: ffff8805c0ac3d94 RCX: 0000000000000100 [ 474.287018] RDX: ffff8805e8aac000 RSI: 00000000ffffffff RDI: ffff8805e8aaa000 [ 474.321199] RBP: ffff8805c0ac3cb8 R08: 0000000000000010 R09: 0000000000000006 [ 474.357446] R10: 0000000000000000 R11: ffff8805c567a000 R12: 0000000000000006 [ 474.419191] R13: ffff8805c2b74e88 R14: 00000000000001da R15: 0000000000000000 [ 474.453816] FS: 00007f2e75220800(0000) GS:ffff88061fc00000(0000) knlGS:0000000000000000 [ 474.489254] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 474.522215] CR2: 00007f2e74716090 CR3: 00000005c085e000 CR4: 00000000000207f0 [ 474.556058] Stack: [ 474.584325] ffff8805c0ac3c98 ffffffff811b549b ffff8805c0ac3c98 ffff8805f1190a40 [ 474.618913] ffff8805a6202f08 ffff8805c2b74e88 00068800d0464990 ffff8805e8aac860 [ 474.653955] ffff8805c0ac3cb8 000700068113833a ffff880606c75060 ffff8805c0ac3d94 [ 474.690461] Call Trace: [ 474.723779] [<ffffffff811b549b>] ? lookup_fast+0x1cd/0x22a [ 474.778049] [<ffffffff81468824>] security_compute_av+0xf4/0x20b [ 474.811398] [<ffffffff8196f419>] avc_compute_av+0x2a/0x179 [ 474.843813] [<ffffffff8145727b>] avc_has_perm+0x45/0xf4 [ 474.875694] [<ffffffff81457d0e>] inode_has_perm+0x2a/0x31 [ 474.907370] [<ffffffff81457e76>] selinux_inode_getattr+0x3c/0x3e [ 474.938726] [<ffffffff81455cf6>] security_inode_getattr+0x1b/0x22 [ 474.970036] [<ffffffff811b057d>] vfs_getattr+0x19/0x2d [ 475.000618] [<ffffffff811b05e5>] vfs_fstatat+0x54/0x91 [ 475.030402] [<ffffffff811b063b>] vfs_lstat+0x19/0x1b [ 475.061097] [<ffffffff811b077e>] SyS_newlstat+0x15/0x30 [ 475.094595] [<ffffffff8113c5c1>] ? __audit_syscall_entry+0xa1/0xc3 [ 475.148405] [<ffffffff8197791e>] system_call_fastpath+0x16/0x1b [ 475.179201] Code: 00 48 85 c0 48 89 45 b8 75 02 0f 0b 48 8b 45 a0 48 8b 3d 45 d0 b6 00 8b 40 08 89 c6 ff ce e8 d1 b0 06 00 48 85 c0 49 89 c7 75 02 <0f> 0b 48 8b 45 b8 4c 8b 28 eb 1e 49 8d 7d 08 be 80 01 00 00 e8 [ 475.255884] RIP [<ffffffff814681c7>] context_struct_compute_av+0xce/0x308 [ 475.296120] RSP <ffff8805c0ac3c38> [ 475.328734] ---[ end trace f076482e9d754adc ]--- Reported-by: Matthew Thode <mthode@mthode.org> Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> Cc: stable@vger.kernel.org Signed-off-by: Paul Moore <pmoore@redhat.com>
2014-02-05selinux: add SOCK_DIAG_BY_FAMILY to the list of netlink message typesPaul Moore
The SELinux AF_NETLINK/NETLINK_SOCK_DIAG socket class was missing the SOCK_DIAG_BY_FAMILY definition which caused SELINUX_ERR messages when the ss tool was run. # ss Netid State Recv-Q Send-Q Local Address:Port Peer Address:Port u_str ESTAB 0 0 * 14189 * 14190 u_str ESTAB 0 0 * 14145 * 14144 u_str ESTAB 0 0 * 14151 * 14150 {...} # ausearch -m SELINUX_ERR ---- time->Thu Jan 23 11:11:16 2014 type=SYSCALL msg=audit(1390493476.445:374): arch=c000003e syscall=44 success=yes exit=40 a0=3 a1=7fff03aa11f0 a2=28 a3=0 items=0 ppid=1852 pid=1895 auid=0 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=pts0 ses=1 comm="ss" exe="/usr/sbin/ss" subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 key=(null) type=SELINUX_ERR msg=audit(1390493476.445:374): SELinux: unrecognized netlink message type=20 for sclass=32 Signed-off-by: Paul Moore <pmoore@redhat.com>
2014-02-05Merge tag 'v3.13' into stable-3.14Paul Moore
Linux 3.13 Conflicts: security/selinux/hooks.c Trivial merge issue in selinux_inet_conn_request() likely due to me including patches that I sent to the stable folks in my next tree resulting in the patch hitting twice (I think). Thankfully it was an easy fix this time, but regardless, lesson learned, I will not do that again.
2014-02-05drm/ttm: Don't clear page metadata of imported sg pagesThomas Hellstrom
These page pointers shouldn't be visible to TTM in the first place, but until we fix that up, don't clear the page metadata because that will upset the exporter. Reported-and-tested-by: Cristoph Haag <haagch.christoph@googleemail.com> Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>
2014-02-05security: select correct default LSM_MMAP_MIN_ADDR on arm on arm64Colin Cross
Binaries compiled for arm may run on arm64 if CONFIG_COMPAT is selected. Set LSM_MMAP_MIN_ADDR to 32768 if ARM64 && COMPAT to prevent selinux failures launching 32-bit static executables that are mapped at 0x8000. Signed-off-by: Colin Cross <ccross@android.com> Acked-by: Will Deacon <will.deacon@arm.com> Acked-by: Eric Paris <eparis@redhat.com> Acked-by: James Morris <james.l.morris@oracle.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2014-02-05arm64: compat: Wire up new AArch32 syscallsCatalin Marinas
This patch enables sys_compat, sys_finit_module, sys_sched_setattr and sys_sched_getattr for compat (AArch32) applications. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2014-02-05arm64: vdso: update wtm fields for CLOCK_MONOTONIC_COARSENathan Lynch
Update wall-to-monotonic fields in the VDSO data page unconditionally. These are used to service CLOCK_MONOTONIC_COARSE, which is not guarded by use_syscall. Signed-off-by: Nathan Lynch <nathan_lynch@mentor.com> Acked-by: Will Deacon <will.deacon@arm.com> Cc: <stable@vger.kernel.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2014-02-05arm64: vdso: fix coarse clock handlingNathan Lynch
When __kernel_clock_gettime is called with a CLOCK_MONOTONIC_COARSE or CLOCK_REALTIME_COARSE clock id, it returns incorrectly to whatever the caller has placed in x2 ("ret x2" to return from the fast path). Fix this by saving x30/LR to x2 only in code that will call __do_get_tspec, restoring x30 afterward, and using a plain "ret" to return from the routine. Also: while the resulting tv_nsec value for CLOCK_REALTIME and CLOCK_MONOTONIC must be computed using intermediate values that are left-shifted by cs_shift (x12, set by __do_get_tspec), the results for coarse clocks should be calculated using unshifted values (xtime_coarse_nsec is in units of actual nanoseconds). The current code shifts intermediate values by x12 unconditionally, but x12 is uninitialized when servicing a coarse clock. Fix this by setting x12 to 0 once we know we are dealing with a coarse clock id. Signed-off-by: Nathan Lynch <nathan_lynch@mentor.com> Acked-by: Will Deacon <will.deacon@arm.com> Cc: <stable@vger.kernel.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2014-02-05ACPI / hotplug: Fix panic on eject to ejected deviceToshi Kani
When an eject request is sent to an ejected ACPI device, the following panic occurs: ACPI: \_SB_.SCK3.CPU3: ACPI_NOTIFY_EJECT_REQUEST event BUG: unable to handle kernel NULL pointer dereference at 0000000000000070 IP: [<ffffffff813a7cfe>] acpi_device_hotplug+0x10b/0x33b : Call Trace: [<ffffffff813a24da>] acpi_hotplug_work_fn+0x1c/0x27 [<ffffffff8109cbe5>] process_one_work+0x175/0x430 [<ffffffff8109d7db>] worker_thread+0x11b/0x3a0 This is becase device->handler is NULL in acpi_device_hotplug(). This case was used to fail in acpi_hotplug_notify_cb() as the target had no acpi_deivce. However, acpi_device now exists after ejection. Added a check to verify if acpi_device->handler is valid for an eject request in acpi_hotplug_notify_cb(). Note that handler passed from an argument is still valid while acpi_device->handler is NULL. Fixes: 202317a573b2 (ACPI / scan: Add acpi_device objects for all device nodes in the namespace) Signed-off-by: Toshi Kani <toshi.kani@hp.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-02-05arm64: simplify pgd_allocMark Rutland
Currently pgd_alloc has a redundant NULL check in its return path that can be removed with no ill effects. With that removed it's also possible to return early and eliminate the new_pgd temporary variable. This patch applies said modifications, making the logic of pgd_alloc correspond 1-1 with that of pgd_free. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2014-02-05arm64: fix typo: s/SERRROR/SERROR/Mark Rutland
Somehow SERROR has acquired an additional 'R' in a couple of headers. This patch removes them before they spread further. As neither instance is in use yet, no other sites need to be fixed up. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2014-02-05arm64: Invalidate the TLB when replacing pmd entries during bootCatalin Marinas
With the 64K page size configuration, __create_page_tables in head.S maps enough memory to get started but using 64K pages rather than 512M sections with a single pgd/pud/pmd entry pointing to a pte table. create_mapping() may override the pgd/pud/pmd table entry with a block (section) one if the RAM size is more than 512MB and aligned correctly. For the end of this block to be accessible, the old TLB entry must be invalidated. Cc: <stable@vger.kernel.org> Reported-by: Mark Salter <msalter@redhat.com> Tested-by: Mark Salter <msalter@redhat.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2014-02-05arm64: Align CMA sizes to PAGE_SIZELaura Abbott
dma_alloc_from_contiguous takes number of pages for a size. Align up the dma size passed in to page size to avoid truncation and allocation failures on sizes less than PAGE_SIZE. Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Laura Abbott <lauraa@codeaurora.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2014-02-05arm64: add DSB after icache flush in __flush_icache_all()Vinayak Kale
Add DSB after icache flush to complete the cache maintenance operation. The function __flush_icache_all() is used only for user space mappings and an ISB is not required because of an exception return before executing user instructions. An exception return would behave like an ISB. Signed-off-by: Vinayak Kale <vkale@apm.com> Acked-by: Will Deacon <will.deacon@arm.com> Cc: <stable@vger.kernel.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2014-02-05genirq: Generic irq chip requires IRQ_DOMAINNitin A Kamble
The generic_chip.c uses interfaces from irq_domain.c which is controlled by the IRQ_DOMAIN config option, but there is no Kconfig dependency so the build can fail: linux/kernel/irq/generic-chip.c:400:11: error: 'irq_domain_xlate_onetwocell' undeclared here (not in a function) Select IRQ_DOMAIN when GENERIC_IRQ_CHIP is selected. Signed-off-by: Nitin A Kamble <nitin.a.kamble@intel.com> Link: http://lkml.kernel.org/r/1391129410-54548-2-git-send-email-nitin.a.kamble@intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org # 3.11+
2014-02-05drm/ttm: Fix TTM object open regressionThomas Hellstrom
Commit drm/ttm: ttm object security fixes for render nodes introduced a regression where, if a TTM object was opened multiple times from the same open file, the caller would spin uninterruptibly in the kernel. Fix this. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>
2014-02-05vmwgfx: Fix unitialized stack read in vmw_setup_otable_baseDave Jones
One of the error paths in vmw_setup_otable_base causes us to return with 'ret' having never been set to anything causing us to return whatever was on the stack. Found with Coverity Signed-off-by: Dave Jones <davej@fedoraproject.org> Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
2014-02-05ALSA: hda - Improve loopback path lookups for AD1983Takashi Iwai
AD1983 has flexible loopback routes and the generic parser would take wrong path confusingly instead of taking individual paths via NID 0x0c and 0x0d. For avoiding it, limit the connections at these widgets so that the parser can think more straightforwardly. This fixes the regression of the missing line-in loopback on Dell machine. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=70011 Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2014-02-05drm/vmwgfx: Reemit context bindings when necessary v2Thomas Hellstrom
When a context is first referenced in the command stream, make sure that all scrubbed (as a result of eviction) bindings are re-emitted. Also make sure that all bound resources are put on the resource validate list. This is needed for legacy emulation, since legacy user-space drivers will typically not re-emit shader bindings. It also removes the requirement for user-space drivers to re-emit render-target- and texture bindings. Makes suspend and hibernate now also work with legacy user-space drivers on guest-backed devices. v2: Don't rebind on legacy devices. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>
2014-02-05drm/vmwgfx: Detect old user-space drivers and set up legacy emulation v2Thomas Hellstrom
GB aware mesa userspace drivers are detected by the fact that they are calling the vmw getparam ioctl querying DRM_VMW_PARAM_HW_CAPS to detect whether the device is Guest-backed object capable. For other drivers, lie about hardware version and send the 3D capabilities in a format they expect. v2: Use DRM_VMW_PARAM_MAX_MOB_MEMORY to detect gb awareness, Make sure we don't ovwerwrite bounce buffer or write past user-space buffer indicated size. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>
2014-02-05drm/vmwgfx: Emulate legacy shaders on guest-backed devices v2Thomas Hellstrom
Command stream legacy shader creation and destruction is replaced by NOPs in the command stream, and instead guest-backed shaders are created and destroyed as part of the command validation process. v2: Removed some stray debug messages. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>
2014-02-05drm/vmwgfx: Fix legacy surface reference size copybackThomas Hellstrom
Surfaces created using the guest-backed surface interface only keeps the base mip size, so only copy that if the legacy surface reference ioctl requests the size information. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>