summaryrefslogtreecommitdiff
path: root/arch
AgeCommit message (Collapse)Author
2015-04-21KVM: PPC: Book3S HV: Don't wake thread with no vcpu on guest IPIPaul Mackerras
When running a multi-threaded guest and vcpu 0 in a virtual core is not running in the guest (i.e. it is busy elsewhere in the host), thread 0 of the physical core will switch the MMU to the guest and then go to nap mode in the code at kvm_do_nap. If the guest sends an IPI to thread 0 using the msgsndp instruction, that will wake up thread 0 and cause all the threads in the guest to exit to the host unnecessarily. To avoid the unnecessary exit, this arranges for the PECEDP bit to be cleared in this situation. When napping due to a H_CEDE from the guest, we still set PECEDP so that the thread will wake up on an IPI sent using msgsndp. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
2015-04-21KVM: PPC: Book3S HV: Get rid of vcore nap_count and n_wokenPaul Mackerras
We can tell when a secondary thread has finished running a guest by the fact that it clears its kvm_hstate.kvm_vcpu pointer, so there is no real need for the nap_count field in the kvmppc_vcore struct. This changes kvmppc_wait_for_nap to poll the kvm_hstate.kvm_vcpu pointers of the secondary threads rather than polling vc->nap_count. Besides reducing the size of the kvmppc_vcore struct by 8 bytes, this also means that we can tell which secondary threads have got stuck and thus print a more informative error message. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
2015-04-21KVM: PPC: Book3S HV: Move vcore preemption point up into kvmppc_run_vcpuPaul Mackerras
Rather than calling cond_resched() in kvmppc_run_core() before doing the post-processing for the vcpus that we have just run (that is, calling kvmppc_handle_exit_hv(), kvmppc_set_timer(), etc.), we now do that post-processing before calling cond_resched(), and that post- processing is moved out into its own function, post_guest_process(). The reschedule point is now in kvmppc_run_vcpu() and we define a new vcore state, VCORE_PREEMPT, to indicate that that the vcore's runner task is runnable but not running. (Doing the reschedule with the vcore in VCORE_INACTIVE state would be bad because there are potentially other vcpus waiting for the runner in kvmppc_wait_for_exec() which then wouldn't get woken up.) Also, we make use of the handy cond_resched_lock() function, which unlocks and relocks vc->lock for us around the reschedule. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
2015-04-21KVM: PPC: Book3S HV: Minor cleanupsPaul Mackerras
* Remove unused kvmppc_vcore::n_busy field. * Remove setting of RMOR, since it was only used on PPC970 and the PPC970 KVM support has been removed. * Don't use r1 or r2 in setting the runlatch since they are conventionally reserved for other things; use r0 instead. * Streamline the code a little and remove the ext_interrupt_to_host label. * Add some comments about register usage. * hcall_try_real_mode doesn't need to be global, and can't be called from C code anyway. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
2015-04-21KVM: PPC: Book3S HV: Simplify handling of VCPUs that need a VPA updatePaul Mackerras
Previously, if kvmppc_run_core() was running a VCPU that needed a VPA update (i.e. one of its 3 virtual processor areas needed to be pinned in memory so the host real mode code can update it on guest entry and exit), we would drop the vcore lock and do the update there and then. Future changes will make it inconvenient to drop the lock, so instead we now remove it from the list of runnable VCPUs and wake up its VCPU task. This will have the effect that the VCPU task will exit kvmppc_run_vcpu(), go around the do loop in kvmppc_vcpu_run_hv(), and re-enter kvmppc_run_vcpu(), whereupon it will do the necessary call to kvmppc_update_vpas() and then rejoin the vcore. The one complication is that the runner VCPU (whose VCPU task is the current task) might be one of the ones that gets removed from the runnable list. In that case we just return from kvmppc_run_core() and let the code in kvmppc_run_vcpu() wake up another VCPU task to be the runner if necessary. This all means that the VCORE_STARTING state is no longer used, so we remove it. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
2015-04-21KVM: PPC: Book3S HV: Accumulate timing information for real-mode codePaul Mackerras
This reads the timebase at various points in the real-mode guest entry/exit code and uses that to accumulate total, minimum and maximum time spent in those parts of the code. Currently these times are accumulated per vcpu in 5 parts of the code: * rm_entry - time taken from the start of kvmppc_hv_entry() until just before entering the guest. * rm_intr - time from when we take a hypervisor interrupt in the guest until we either re-enter the guest or decide to exit to the host. This includes time spent handling hcalls in real mode. * rm_exit - time from when we decide to exit the guest until the return from kvmppc_hv_entry(). * guest - time spend in the guest * cede - time spent napping in real mode due to an H_CEDE hcall while other threads in the same vcore are active. These times are exposed in debugfs in a directory per vcpu that contains a file called "timings". This file contains one line for each of the 5 timings above, with the name followed by a colon and 4 numbers, which are the count (number of times the code has been executed), the total time, the minimum time, and the maximum time, all in nanoseconds. The overhead of the extra code amounts to about 30ns for an hcall that is handled in real mode (e.g. H_SET_DABR), which is about 25%. Since production environments may not wish to incur this overhead, the new code is conditional on a new config symbol, CONFIG_KVM_BOOK3S_HV_EXIT_TIMING. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
2015-04-21KVM: PPC: Book3S HV: Create debugfs file for each guest's HPTPaul Mackerras
This creates a debugfs directory for each HV guest (assuming debugfs is enabled in the kernel config), and within that directory, a file by which the contents of the guest's HPT (hashed page table) can be read. The directory is named vmnnnn, where nnnn is the PID of the process that created the guest. The file is named "htab". This is intended to help in debugging problems in the host's management of guest memory. The contents of the file consist of a series of lines like this: 3f48 4000d032bf003505 0000000bd7ff1196 00000003b5c71196 The first field is the index of the entry in the HPT, the second and third are the HPT entry, so the third entry contains the real page number that is mapped by the entry if the entry's valid bit is set. The fourth field is the guest's view of the second doubleword of the entry, so it contains the guest physical address. (The format of the second through fourth fields are described in the Power ISA and also in arch/powerpc/include/asm/mmu-hash64.h.) Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
2015-04-21KVM: PPC: Book3S HV: Add ICP real mode countersSuresh Warrier
Add two counters to count how often we generate real-mode ICS resend and reject events. The counters provide some performance statistics that could be used in the future to consider if the real mode functions need further optimizing. The counters are displayed as part of IPC and ICP state provided by /sys/debug/kernel/powerpc/kvm* for each VM. Also added two counters that count (approximately) how many times we don't find an ICP or ICS we're looking for. These are not currently exposed through sysfs, but can be useful when debugging crashes. Signed-off-by: Suresh Warrier <warrier@linux.vnet.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
2015-04-21KVM: PPC: Book3S HV: Move virtual mode ICP functions to real-modeSuresh Warrier
Interrupt-based hypercalls return H_TOO_HARD to inform KVM that it needs to switch to the host to complete the rest of hypercall function in virtual mode. This patch ports the virtual mode ICS/ICP reject and resend functions to be runnable in hypervisor real mode, thus avoiding the need to switch to the host to execute these functions in virtual mode. However, the hypercalls continue to return H_TOO_HARD for vcpu_wakeup and notify events - these events cannot be done in real mode and they will still need a switch to host virtual mode. There are sufficient differences between the real mode code and the virtual mode code for the ICS/ICP resend and reject functions that for now the code has been duplicated instead of sharing common code. In the future, we can look at creating common functions. Signed-off-by: Suresh Warrier <warrier@linux.vnet.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
2015-04-21KVM: PPC: Book3S HV: Convert ICS mutex lock to spin lockSuresh Warrier
Replaces the ICS mutex lock with a spin lock since we will be porting these routines to real mode. Note that we need to disable interrupts before we take the lock in anticipation of the fact that on the guest side, we are running in the context of a hard irq and interrupts are disabled (EE bit off) when the lock is acquired. Again, because we will be acquiring the lock in hypervisor real mode, we need to use an arch_spinlock_t instead of a normal spinlock here as we want to avoid running any lockdep code (which may not be safe to execute in real mode). Signed-off-by: Suresh Warrier <warrier@linux.vnet.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
2015-04-21KVM: PPC: Book3S HV: Add guest->host real mode completion countersSuresh E. Warrier
Add counters to track number of times we switch from guest real mode to host virtual mode during an interrupt-related hyper call because the hypercall requires actions that cannot be completed in real mode. This will help when making optimizations that reduce guest-host transitions. It is safe to use an ordinary increment rather than an atomic operation because there is one ICP per virtual CPU and kvmppc_xics_rm_complete() only works on the ICP for the current VCPU. The counters are displayed as part of IPC and ICP state provided by /sys/debug/kernel/powerpc/kvm* for each VM. Signed-off-by: Suresh Warrier <warrier@linux.vnet.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
2015-04-21KVM: PPC: Book3S HV: Add helpers for lock/unlock hpteAneesh Kumar K.V
This adds helper routines for locking and unlocking HPTEs, and uses them in the rest of the code. We don't change any locking rules in this patch. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
2015-04-21KVM: PPC: Book3S HV: Remove RMA-related variables from codeAneesh Kumar K.V
We don't support real-mode areas now that 970 support is removed. Remove the remaining details of rma from the code. Also rename rma_setup_done to hpte_setup_done to better reflect the changes. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
2015-04-21KVM: PPC: Book3S HV: Add fast real-mode H_RANDOM implementation.Michael Ellerman
Some PowerNV systems include a hardware random-number generator. This HWRNG is present on POWER7+ and POWER8 chips and is capable of generating one 64-bit random number every microsecond. The random numbers are produced by sampling a set of 64 unstable high-frequency oscillators and are almost completely entropic. PAPR defines an H_RANDOM hypercall which guests can use to obtain one 64-bit random sample from the HWRNG. This adds a real-mode implementation of the H_RANDOM hypercall. This hypercall was implemented in real mode because the latency of reading the HWRNG is generally small compared to the latency of a guest exit and entry for all the threads in the same virtual core. Userspace can detect the presence of the HWRNG and the H_RANDOM implementation by querying the KVM_CAP_PPC_HWRNG capability. The H_RANDOM hypercall implementation will only be invoked when the guest does an H_RANDOM hypercall if userspace first enables the in-kernel H_RANDOM implementation using the KVM_CAP_PPC_ENABLE_HCALL capability. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
2015-04-21kvmppc: Implement H_LOGICAL_CI_{LOAD,STORE} in KVMDavid Gibson
On POWER, storage caching is usually configured via the MMU - attributes such as cache-inhibited are stored in the TLB and the hashed page table. This makes correctly performing cache inhibited IO accesses awkward when the MMU is turned off (real mode). Some CPU models provide special registers to control the cache attributes of real mode load and stores but this is not at all consistent. This is a problem in particular for SLOF, the firmware used on KVM guests, which runs entirely in real mode, but which needs to do IO to load the kernel. To simplify this qemu implements two special hypercalls, H_LOGICAL_CI_LOAD and H_LOGICAL_CI_STORE which simulate a cache-inhibited load or store to a logical address (aka guest physical address). SLOF uses these for IO. However, because these are implemented within qemu, not the host kernel, these bypass any IO devices emulated within KVM itself. The simplest way to see this problem is to attempt to boot a KVM guest from a virtio-blk device with iothread / dataplane enabled. The iothread code relies on an in kernel implementation of the virtio queue notification, which is not triggered by the IO hcalls, and so the guest will stall in SLOF unable to load the guest OS. This patch addresses this by providing in-kernel implementations of the 2 hypercalls, which correctly scan the KVM IO bus. Any access to an address not handled by the KVM IO bus will cause a VM exit, hitting the qemu implementation as before. Note that a userspace change is also required, in order to enable these new hcall implementations with KVM_CAP_PPC_ENABLE_HCALL. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> [agraf: fix compilation] Signed-off-by: Alexander Graf <agraf@suse.de>
2015-04-21powerpc: Export __spin_yieldSuresh E. Warrier
Export __spin_yield so that the arch_spin_unlock() function can be invoked from a module. This will be required for modules where we want to take a lock that is also is acquired in hypervisor real mode. Because we want to avoid running any lockdep code (which may not be safe in real mode), this lock needs to be an arch_spinlock_t instead of a normal spinlock. Signed-off-by: Suresh Warrier <warrier@linux.vnet.ibm.com> Acked-by: Paul Mackerras <paulus@samba.org> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Alexander Graf <agraf@suse.de>
2015-04-21Merge branch 'drm-next-merged' of ↵Mauro Carvalho Chehab
git://people.freedesktop.org/~airlied/linux into v4l_for_linus * 'drm-next-merged' of git://people.freedesktop.org/~airlied/linux: (9717 commits) media-bus: Fixup RGB444_1X12, RGB565_1X16, and YUV8_1X24 media bus format hexdump: avoid warning in test function fs: take i_mutex during prepare_binprm for set[ug]id executables smp: Fix error case handling in smp_call_function_*() iommu-common: Fix PARISC compile-time warnings sparc: Make LDC use common iommu poll management functions sparc: Make sparc64 use scalable lib/iommu-common.c functions Break up monolithic iommu table/lock into finer graularity pools and lock sparc: Revert generic IOMMU allocator. tools/power turbostat: correct dumped pkg-cstate-limit value tools/power turbostat: calculate TSC frequency from CPUID(0x15) on SKL tools/power turbostat: correct DRAM RAPL units on recent Xeon processors tools/power turbostat: Initial Skylake support tools/power turbostat: Use $(CURDIR) instead of $(PWD) and add support for O= option in Makefile tools/power turbostat: modprobe msr, if needed tools/power turbostat: dump MSR_TURBO_RATIO_LIMIT2 tools/power turbostat: use new MSR_TURBO_RATIO_LIMIT names Bluetooth: hidp: Fix regression with older userspace and flags validation config: Enable NEED_DMA_MAP_STATE by default when SWIOTLB is selected perf/x86/intel/pt: Fix and clean up error handling in pt_event_add() ... That solves several merge conflicts: Documentation/DocBook/media/v4l/subdev-formats.xml Documentation/devicetree/bindings/vendor-prefixes.txt drivers/staging/media/mn88473/mn88473.c include/linux/kconfig.h include/uapi/linux/media-bus-format.h The ones at subdev-formats.xml and media-bus-format.h are not trivial. That's why we opted to merge from DRM.
2015-04-21Merge branch 'topic/sh' into for-linusVinod Koul
2015-04-20ARM: OMAP2+: Fix booting with configs that don't have MFD_SYSCONTony Lindgren
With the recent changes omaps have developed a dependency to MFD_SYSCON. This is used for system control module generic register area and some clocks. We do have it selected in omap2plus_defconfig, but targeted config files may not have it selected. Let's make sure it's selected like few other ARM platforms are already doing. Reported-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Tony Lindgren <tony@atomide.com>
2015-04-20Merge tag 'cpumask-next-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux Pull final removal of deprecated cpus_* cpumask functions from Rusty Russell: "This is the final removal (after several years!) of the obsolete cpus_* functions, prompted by their mis-use in staging. With these function removed, all cpu functions should only iterate to nr_cpu_ids, so we finally only allocate that many bits when cpumasks are allocated offstack" * tag 'cpumask-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: (25 commits) cpumask: remove __first_cpu / __next_cpu cpumask: resurrect CPU_MASK_CPU0 linux/cpumask.h: add typechecking to cpumask_test_cpu cpumask: only allocate nr_cpumask_bits. Fix weird uses of num_online_cpus(). cpumask: remove deprecated functions. mips: fix obsolete cpumask_of_cpu usage. x86: fix more deprecated cpu function usage. ia64: remove deprecated cpus_ usage. powerpc: fix deprecated CPU_MASK_CPU0 usage. CPU_MASK_ALL/CPU_MASK_NONE: remove from deprecated region. staging/lustre/o2iblnd: Don't use cpus_weight staging/lustre/libcfs: replace deprecated cpus_ calls with cpumask_ staging/lustre/ptlrpc: Do not use deprecated cpus_* functions blackfin: fix up obsolete cpu function usage. parisc: fix up obsolete cpu function usage. tile: fix up obsolete cpu function usage. arm64: fix up obsolete cpu function usage. mips: fix up obsolete cpu function usage. x86: fix up obsolete cpu function usage. ...
2015-04-20Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull more s390 updates from Martin Schwidefsky: "The big thing in this second merge for s390 is the new eBPF JIT from Michael which replaces the old 32-bit backend. The remaining commits are bug fixes" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/pci: add locking for fmb access s390/pci: extract software counters from fmb s390/dasd: Fix unresumed device after suspend/resume having no paths s390/dasd: fix unresumed device after suspend/resume s390/dasd: fix inability to set a DASD device offline s390/mm: Fix memory hotplug for unaligned standby memory s390/bpf: Add s390x eBPF JIT compiler backend s390: Use bool function return values of true/false not 1/0
2015-04-20Merge branch 'for-next' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu Pull m68k fixes from Greg Ungerer: "Nothing big, spelling fixes and fix/cleanup for ColdFire eth device setup" * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu: m68knommu: fix fec setup warning for ColdFire 5271 builds m68knommu: ColdFire 5271 only has a single FEC controller m68k: Fix trivial typos in comments
2015-04-20Merge branch 'fixes' into next/fixes-non-criticalOlof Johansson
Merge a set of fixes that we missed sending in before v4.0 release. These will also be sent to -stable. * fixes: (659 commits) ARM: at91/dt: sama5d3 xplained: add phy address for macb1 kbuild: Create directory for target DTB ARM: mvebu: Disable CPU Idle on Armada 38x arm64: juno: Fix misleading name of UART reference clock ARM: dts: sunxi: Remove overclocked/overvoltaged OPP ARM: dts: sun4i: a10-lime: Override and remove 1008MHz OPP setting ARM: socfpga: dts: fix spi1 interrupt ARM: dts: Fix gpio interrupts for dm816x ARM: dts: dra7: remove ti,hwmod property from pcie phy ARM: EXYNOS: Fix build breakage cpuidle on !SMP ARM: OMAP: dmtimer: disable pm runtime on remove ARM: OMAP: dmtimer: check for pm_runtime_get_sync() failure ARM: dts: fix lid and power pin-functions for exynos5250-spring ARM: dts: fix mmc node updates for exynos5250-spring ARM: OMAP2+: Fix socbus family info for AM33xx devices ARM: dts: omap3: Add missing dmas for crypto + Linux 4.0-rc4 Signed-off-by: Olof Johansson <olof@lixom.net>
2015-04-20ARC: perf: don't add code for impossible caseVineet Gupta
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-04-20ARC: perf: Rename DT binding to not confuse with power mgmtVineet Gupta
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-04-20ARC: perf: add user space attribution in callchainsVineet Gupta
The actual user space unwinding is more involved, so simply capture the user space PC Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-04-20ARC: perf: Add kernel callchain supportVineet Gupta
Signed-off-by: Mischa Jonker <mjonker@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-04-20ARC: perf: support cache hit/miss ratioVineet Gupta
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-04-20ARC: perf: Add some comments/debug stuffVineet Gupta
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-04-20ARC: perf: make @arc_pmu static globalVineet Gupta
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-04-20nios2: rework trap handlerLey Foon Tan
Redefine trap handler as below: 0 N/A reserved for system calls 1 SIGUSR1 user-defined signal 1 2 SIGUSR2 user-defined signal 2 3 SIGILL illegal instruction 4..29 reserved (but implemented to raise SIGILL instead of being undefined) 30 SIGTRAP KGDB 31 SIGTRAP trace/breakpoint trap Signed-off-by: Ley Foon Tan <lftan@altera.com>
2015-04-20nios2: remove end address checking for initdaLey Foon Tan
Remove the end address checking for initda function. We need to invalidate each address line for initda instruction, from start to end address. Signed-off-by: Ley Foon Tan <lftan@altera.com>
2015-04-19Merge branch 'turbostat' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux Pull turbostat update from Len Brown: "Updates to the turbostat utility. Just one kernel dependency in this batch -- added a #define to msr-index.h" * 'turbostat' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux: tools/power turbostat: correct dumped pkg-cstate-limit value tools/power turbostat: calculate TSC frequency from CPUID(0x15) on SKL tools/power turbostat: correct DRAM RAPL units on recent Xeon processors tools/power turbostat: Initial Skylake support tools/power turbostat: Use $(CURDIR) instead of $(PWD) and add support for O= option in Makefile tools/power turbostat: modprobe msr, if needed tools/power turbostat: dump MSR_TURBO_RATIO_LIMIT2 tools/power turbostat: use new MSR_TURBO_RATIO_LIMIT names x86 msr-index: define MSR_TURBO_RATIO_LIMIT,1,2 tools/power turbostat: label base frequency tools/power turbostat: update PERF_LIMIT_REASONS decoding tools/power turbostat: simplify default output
2015-04-19cpumask: remove __first_cpu / __next_cpuRusty Russell
They were for use by the deprecated first_cpu() and next_cpu() wrappers, but sparc used them directly. They're now replaced by cpumask_first / cpumask_next. And __next_cpu_nr is completely obsolete. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Acked-by: David S. Miller <davem@davemloft.net>
2015-04-18Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparcLinus Torvalds
Pull sparc fixes from David Miller "Unfortunately, I brown paper bagged the generic iommu pool allocator by applying the wrong revision of the patch series. This reverts the bad one, and puts the right one in" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc: iommu-common: Fix PARISC compile-time warnings sparc: Make LDC use common iommu poll management functions sparc: Make sparc64 use scalable lib/iommu-common.c functions Break up monolithic iommu table/lock into finer graularity pools and lock sparc: Revert generic IOMMU allocator.
2015-04-18sparc: Make LDC use common iommu poll management functionsSowmini Varadhan
Note that this conversion is only being done to consolidate the code and ensure that the common code provides the sufficient abstraction. It is not expected to result in any noticeable performance improvement, as there is typically one ldc_iommu per vnet_port, and each one has 8k entries, with a typical request for 1-4 pages. Thus LDC uses npools == 1. Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-18sparc: Make sparc64 use scalable lib/iommu-common.c functionsSowmini Varadhan
In iperf experiments running linux as the Tx side (TCP client) with 10 threads results in a severe performance drop when TSO is disabled, indicating a weakness in the software that can be avoided by using the scalable IOMMU arena DMA allocation. Baseline numbers before this patch: with default settings (TSO enabled) : 9-9.5 Gbps Disable TSO using ethtool- drops badly: 2-3 Gbps. After this patch, iperf client with 10 threads, can give a throughput of at least 8.5 Gbps, even when TSO is disabled. Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-18sparc: Revert generic IOMMU allocator.David S. Miller
I applied the wrong version of this patch series, V4 instead of V10, due to a patchwork bundling snafu. Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-18tools/power turbostat: Initial Skylake supportLen Brown
Skylake adds some additional residency counters. Skylake supports a different mix of RAPL registers from any previous product. In most other ways, Skylake is like Broadwell. Signed-off-by: Len Brown <len.brown@intel.com>
2015-04-18Merge branch 'x86-pmem-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull PMEM driver from Ingo Molnar: "This is the initial support for the pmem block device driver: persistent non-volatile memory space mapped into the system's physical memory space as large physical memory regions. The driver is based on Intel code, written by Ross Zwisler, with fixes by Boaz Harrosh, integrated with x86 e820 memory resource management and tidied up by Christoph Hellwig. Note that there were two other separate pmem driver submissions to lkml: but apparently all parties (Ross Zwisler, Boaz Harrosh) are reasonably happy with this initial version. This version enables minimal support that enables persistent memory devices out in the wild to work as block devices, identified through a magic (non-standard) e820 flag and auto-discovered if CONFIG_X86_PMEM_LEGACY=y, or added explicitly through manipulating the memory maps via the "memmap=..." boot option with the new, special '!' modifier character. Limitations: this is a regular block device, and since the pmem areas are not struct page backed, they are invisible to the rest of the system (other than the block IO device), so direct IO to/from pmem areas, direct mmap() or XIP is not possible yet. The page cache will also shadow and double buffer pmem contents, etc. Initial support is for x86" * 'x86-pmem-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: drivers/block/pmem: Fix 32-bit build warning in pmem_alloc() drivers/block/pmem: Add a driver for persistent memory x86/mm: Add support for the non-standard protected e820 type
2015-04-18Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: "This tree includes: - an FPU related crash fix - a ptrace fix (with matching testcase in tools/testing/selftests/) - an x86 Kconfig DMA-config defaults tweak to better avoid non-working drivers" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: config: Enable NEED_DMA_MAP_STATE by default when SWIOTLB is selected x86/fpu: Load xsave pointer *after* initialization x86/ptrace: Fix the TIF_FORCED_TF logic in handle_signal() x86, selftests: Add single_step_syscall test
2015-04-18Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf updates from Ingo Molnar: "This update has mostly fixes, but also other bits: - perf tooling fixes - PMU driver fixes - Intel Broadwell PMU driver HW-enablement for LBR callstacks - a late coming 'perf kmem' tool update that enables it to also analyze page allocation data. Note, this comes with MM tracepoint changes that we believe to not break anything: because it changes the formerly opaque 'struct page *' field that uniquely identifies pages to 'pfn' which identifies pages uniquely too, but isn't as opaque and can be used for other purposes as well" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86/intel/pt: Fix and clean up error handling in pt_event_add() perf/x86/intel: Add Broadwell support for the LBR callstack perf/x86/intel/rapl: Fix energy counter measurements but supporing per domain energy units perf/x86/intel: Fix Core2,Atom,NHM,WSM cycles:pp events perf/x86: Fix hw_perf_event::flags collision perf probe: Fix segfault when probe with lazy_line to file perf probe: Find compilation directory path for lazy matching perf probe: Set retprobe flag when probe in address-based alternative mode perf kmem: Analyze page allocator events also tracing, mm: Record pfn instead of pointer to struct page
2015-04-18config: Enable NEED_DMA_MAP_STATE by default when SWIOTLB is selectedKonrad Rzeszutek Wilk
A huge amount of NIC drivers use the DMA API, however if compiled under 32-bit an very important part of the DMA API can be ommitted leading to the drivers not working at all (especially if used with 'swiotlb=force iommu=soft'). As Prashant Sreedharan explains it: "the driver [tg3] uses DEFINE_DMA_UNMAP_ADDR(), dma_unmap_addr_set() to keep a copy of the dma "mapping" and dma_unmap_addr() to get the "mapping" value. On most of the platforms this is a no-op, but ... with "iommu=soft and swiotlb=force" this house keeping is required, ... otherwise we pass 0 while calling pci_unmap_/pci_dma_sync_ instead of the DMA address." As such enable this even when using 32-bit kernels. Reported-by: Ian Jackson <Ian.Jackson@eu.citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Prashant Sreedharan <prashant@broadcom.com> Cc: Borislav Petkov <bp@alien8.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michael Chan <mchan@broadcom.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: boris.ostrovsky@oracle.com Cc: cascardo@linux.vnet.ibm.com Cc: david.vrabel@citrix.com Cc: sanjeevb@broadcom.com Cc: siva.kallam@broadcom.com Cc: vyasevich@gmail.com Cc: xen-devel@lists.xensource.com Link: http://lkml.kernel.org/r/20150417190448.GA9462@l.oracle.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-18Merge tag 'gpio-v4.1-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio Pull GPIO updates from Linus Walleij: "This is the bulk of GPIO changes for the v4.1 development cycle: - A new GPIO hogging mechanism has been added. This can be used on boards that want to drive some GPIO line high, low, or set it as input on boot and then never touch it again. For some embedded systems this is bliss and simplifies things to a great extent. - Some API cleanup and closure: gpiod_get_array() and gpiod_put_array() has been added to get and put GPIOs in bulk as was possible with the non-descriptor API. - Encapsulate cross-calls to the pin control subsystem in <linux/gpio/driver.h>. Now this should be the only header any GPIO driver needs to include or something is wrong. Cleanups restricting drivers to this include are welcomed if tested. - Sort the GPIO Kconfig and split it into submenus, as it was becoming and unstructured, illogical and unnavigatable mess. I hope this is easier to follow. Menus that require a certain subsystem like I2C can now be hidden nicely for example, still working on others. - New drivers: - New driver for the Altera Soft GPIO. - The F7188x driver now handles the F71869 and F71869A variants. - The MIPS Loongson driver has been moved to drivers/gpio for consolidation and cleanup. - Cleanups: - The MAX732x is converted to use the GPIOLIB_IRQCHIP infrastructure. - The PCF857x is converted to use the GPIOLIB_IRQCHIP infrastructure. - Radical cleanup of the OMAP driver. - Misc: - Enable the DWAPB GPIO for all architectures. This is a "hard IP" block from Synopsys which has started to turn up in so diverse architectures as X86 Quark, ARC and a slew of ARM systems. So even though it's not an expander, it's generic enough to be available for all. - We add a mock GPIO on Crystalcove PMIC after a long discussion with Daniel Vetter et al, tracing back to the shootout at the kernel summit where DRM drivers and sub-componentization was discussed. In this case a mock GPIO is assumed to be the best compromise gaining some reuse of infrastructure without making DRM drivers overly complex at the same time. Let's see" * tag 'gpio-v4.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio: (62 commits) Revert "gpio: sch: use uapi/linux/pci_ids.h directly" gpio: dwapb: remove dependencies gpio: dwapb: enable for ARC gpio: removing kfree remove functionality gpio: mvebu: Fix mask/unmask managment per irq chip type gpio: split GPIO drivers in submenus gpio: move MFD GPIO drivers under their own comment gpio: move BCM Kona Kconfig option gpio: arrange SPI Kconfig symbols alphabetically gpio: arrange PCI GPIO controllers alphabetically gpio: arrange I2C Kconfig symbols alphabetically gpio: arrange Kconfig symbols alphabetically gpio: ich: Implement get_direction function gpio: use (!foo) instead of (foo == NULL) gpio: arizona: drop owner assignment from platform_drivers gpio: max7300: remove 'ret' variable gpio: use devm_kzalloc gpio: sch: use uapi/linux/pci_ids.h directly gpio: x-gene: fix devm_ioremap_resource() check gpio: loongson: Add Loongson-3A/3B GPIO driver support ...
2015-04-18perf/x86/intel/pt: Fix and clean up error handling in pt_event_add()Ingo Molnar
Dan Carpenter reported that pt_event_add() has buggy error handling logic: it returns 0 instead of -EBUSY when it fails to start a newly added event. Furthermore, the control flow in this function is messy, with cleanup labels mixed with direct returns. Fix the bug and clean up the code by converting it to a straight fast path for the regular non-failing case, plus a clear sequence of cascading goto labels to do all cleanup. NOTE: I materially changed the existing clean up logic in the pt_event_start() failure case to use the direct perf_aux_output_end() path, not pt_event_del(), because perf_aux_output_end() is enough here. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Julia Lawall <julia.lawall@lip6.fr> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20150416103830.GB7847@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-18ARM: 8342/1: VDSO: depend on CPU_V7Nathan Lynch
When targeting ARMv3 (e.g. rpc) and enabling CONFIG_VDSO we get: arch/arm/vdso/datapage.S:13: Error: selected processor does not support ARM mode `bx lr' One fix considered was to use 'ldr pc,lr' for such configurations, but since the VDSO is unlikely to be useful for pre-v7 hardware, just make it depend on CONFIG_CPU_V7. Reported-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Nathan Lynch <nathan_lynch@mentor.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2015-04-17Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparcLinus Torvalds
Pull sparc updates from David Miller: "The PowerPC folks have a really nice scalable IOMMU pool allocator that we wanted to make use of for sparc. So here we have a series that abstracts out their code into a common layer that anyone can make use of. Sparc is converted, and the PowerPC folks have reviewed and ACK'd this series and plan to convert PowerPC over as well" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc: iommu-common: Fix PARISC compile-time warnings sparc: Make LDC use common iommu poll management functions sparc: Make sparc64 use scalable lib/iommu-common.c functions sparc: Break up monolithic iommu table/lock into finer graularity pools and lock
2015-04-17Merge git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tileLinus Torvalds
Pull arch/tile updates from Chris Metcalf: "These are mostly nohz_full changes, plus a smattering of minor fixes (notably a couple for ftrace)" * git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile: tile: nohz: warn if nohz_full uses hypervisor shared cores tile: ftrace: fix function_graph tracer issues tile: map data region shadow of kernel as R/W tile: support CONTEXT_TRACKING and thus NOHZ_FULL tile: support arch_irq_work_raise arch: tile: fix null pointer dereference on pt_regs pointer tile/elf: reorganize notify_exec() tile: use si_int instead of si_ptr for compat_siginfo
2015-04-17Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linusLinus Torvalds
Pull MIPS updates from Ralf Baechle: "This is the main pull request for MIPS for Linux 4.1. Most noteworthy: - Add more Octeon-optimized crypto functions - Octeon crypto preemption and locking fixes - Little endian support for Octeon - Use correct CSR to soft reset Octeons - Support LEDs on the Octeon-based DSR-1000N - Fix PCI interrupt mapping for the Octeon-based DSR-1000N - Mark prom_free_prom_memory() as __init for a number of systems - Support for Imagination's Pistachio SOC. This includes arch and CLK bits. I'd like to merge pinctrl bits later - Improve parallelism of csum_partial for certain pipelines - Organize DTB files in subdirs like other architectures - Implement read_sched_clock for all MIPS platforms other than Octeon - Massive series of 38 fixes and cleanups for the FPU emulator / kernel - Further FPU remulator work to support new features. This sits on a separate branch which also has been pulled into the 4.1 KVM branch - Clean up and fixes for the SEAD3 eval board; remove unused file - Various updates for Netlogic platforms - A number of small updates for Loongson 3 platforms - Increase the memory limit for ATH79 platforms to 256MB - A fair number of fixes and updates for BCM47xx platforms - Finish the implementation of XPA support - MIPS FDC support. No, not floppy controller but Fast Debug Channel :) - Detect the R16000 used in SGI legacy platforms - Fix Kconfig dependencies for the SSB bus support" * 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus: (265 commits) MIPS: Makefile: Fix MIPS ASE detection code MIPS: asm: elf: Set O32 default FPU flags MIPS: BCM47XX: Fix detecting Microsoft MN-700 & Asus WL500G MIPS: Kconfig: Disable SMP/CPS for 64-bit MIPS: Hibernate: flush TLB entries earlier MIPS: smp-cps: cpu_set FPU mask if FPU present MIPS: lose_fpu(): Disable FPU when MSA enabled MIPS: ralink: add missing symbol for RALINK_ILL_ACC MIPS: ralink: Fix bad config symbol in PCI makefile. SSB: fix Kconfig dependencies MIPS: Malta: Detect and fix bad memsize values Revert "MIPS: Avoid pipeline stalls on some MIPS32R2 cores." MIPS: Octeon: Delete override of cpu_has_mips_r2_exec_hazard. MIPS: Fix cpu_has_mips_r2_exec_hazard. MIPS: kernel: entry.S: Set correct ISA level for mips_ihb MIPS: asm: spinlock: Fix addiu instruction for R10000_LLSC_WAR case MIPS: r4kcache: Use correct base register for MIPS R6 cache flushes MIPS: Kconfig: Fix typo for the r2-to-r6 emulator kernel parameter MIPS: unaligned: Fix regular load/store instruction emulation for EVA MIPS: unaligned: Surround load/store macros in do {} while statements ...
2015-04-17Merge tag 'xtensa-20150416' of git://github.com/czankel/xtensa-linuxLinus Torvalds
Pull Xtensa updates from Chris Zankel: - fix linker script transformation for .text / .text.fixup - wire bpf and execveat syscalls - provide __NR_sync_file_range2 instead of __NR_sync_file_range, as that's what xtensa uses. - make xtfpgs LCD driver functional and configurable. This fixes hardware lockup on KC705/ML605 boot - add audio subsystem bits to xtfpga DTS and provide sample KC705 config with audio features enabled - add CY7C67300 USB controller support to XTFPGA - fix locking issues in ISS network driver - document PIC and MX interrupt distributor device tree bindings * tag 'xtensa-20150416' of git://github.com/czankel/xtensa-linux: xtensa: xtfpga: add CY7C67300 USB controller support irqchip: xtensa-pic: xtensa-mx: document DT bindings xtensa: ISS: fix locking in TAP network adapter xtensa: Fix fix linker script transformation for .text / .text.fixup xtensa: provide __NR_sync_file_range2 instead of __NR_sync_file_range xtensa: wire bpf and execveat syscalls xtensa: xtfpga: fix hardware lockup caused by LCD driver xtensa: xtfpga: provide defconfig with audio subsystem xtensa: xtfpga: add audio card to xtfpga DTS