Age | Commit message (Collapse) | Author |
|
When running a multi-threaded guest and vcpu 0 in a virtual core
is not running in the guest (i.e. it is busy elsewhere in the host),
thread 0 of the physical core will switch the MMU to the guest and
then go to nap mode in the code at kvm_do_nap. If the guest sends
an IPI to thread 0 using the msgsndp instruction, that will wake
up thread 0 and cause all the threads in the guest to exit to the
host unnecessarily. To avoid the unnecessary exit, this arranges
for the PECEDP bit to be cleared in this situation. When napping
due to a H_CEDE from the guest, we still set PECEDP so that the
thread will wake up on an IPI sent using msgsndp.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
|
|
We can tell when a secondary thread has finished running a guest by
the fact that it clears its kvm_hstate.kvm_vcpu pointer, so there
is no real need for the nap_count field in the kvmppc_vcore struct.
This changes kvmppc_wait_for_nap to poll the kvm_hstate.kvm_vcpu
pointers of the secondary threads rather than polling vc->nap_count.
Besides reducing the size of the kvmppc_vcore struct by 8 bytes,
this also means that we can tell which secondary threads have got
stuck and thus print a more informative error message.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
|
|
Rather than calling cond_resched() in kvmppc_run_core() before doing
the post-processing for the vcpus that we have just run (that is,
calling kvmppc_handle_exit_hv(), kvmppc_set_timer(), etc.), we now do
that post-processing before calling cond_resched(), and that post-
processing is moved out into its own function, post_guest_process().
The reschedule point is now in kvmppc_run_vcpu() and we define a new
vcore state, VCORE_PREEMPT, to indicate that that the vcore's runner
task is runnable but not running. (Doing the reschedule with the
vcore in VCORE_INACTIVE state would be bad because there are potentially
other vcpus waiting for the runner in kvmppc_wait_for_exec() which
then wouldn't get woken up.)
Also, we make use of the handy cond_resched_lock() function, which
unlocks and relocks vc->lock for us around the reschedule.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
|
|
* Remove unused kvmppc_vcore::n_busy field.
* Remove setting of RMOR, since it was only used on PPC970 and the
PPC970 KVM support has been removed.
* Don't use r1 or r2 in setting the runlatch since they are
conventionally reserved for other things; use r0 instead.
* Streamline the code a little and remove the ext_interrupt_to_host
label.
* Add some comments about register usage.
* hcall_try_real_mode doesn't need to be global, and can't be
called from C code anyway.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
|
|
Previously, if kvmppc_run_core() was running a VCPU that needed a VPA
update (i.e. one of its 3 virtual processor areas needed to be pinned
in memory so the host real mode code can update it on guest entry and
exit), we would drop the vcore lock and do the update there and then.
Future changes will make it inconvenient to drop the lock, so instead
we now remove it from the list of runnable VCPUs and wake up its
VCPU task. This will have the effect that the VCPU task will exit
kvmppc_run_vcpu(), go around the do loop in kvmppc_vcpu_run_hv(), and
re-enter kvmppc_run_vcpu(), whereupon it will do the necessary call
to kvmppc_update_vpas() and then rejoin the vcore.
The one complication is that the runner VCPU (whose VCPU task is the
current task) might be one of the ones that gets removed from the
runnable list. In that case we just return from kvmppc_run_core()
and let the code in kvmppc_run_vcpu() wake up another VCPU task to be
the runner if necessary.
This all means that the VCORE_STARTING state is no longer used, so we
remove it.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
|
|
This reads the timebase at various points in the real-mode guest
entry/exit code and uses that to accumulate total, minimum and
maximum time spent in those parts of the code. Currently these
times are accumulated per vcpu in 5 parts of the code:
* rm_entry - time taken from the start of kvmppc_hv_entry() until
just before entering the guest.
* rm_intr - time from when we take a hypervisor interrupt in the
guest until we either re-enter the guest or decide to exit to the
host. This includes time spent handling hcalls in real mode.
* rm_exit - time from when we decide to exit the guest until the
return from kvmppc_hv_entry().
* guest - time spend in the guest
* cede - time spent napping in real mode due to an H_CEDE hcall
while other threads in the same vcore are active.
These times are exposed in debugfs in a directory per vcpu that
contains a file called "timings". This file contains one line for
each of the 5 timings above, with the name followed by a colon and
4 numbers, which are the count (number of times the code has been
executed), the total time, the minimum time, and the maximum time,
all in nanoseconds.
The overhead of the extra code amounts to about 30ns for an hcall that
is handled in real mode (e.g. H_SET_DABR), which is about 25%. Since
production environments may not wish to incur this overhead, the new
code is conditional on a new config symbol,
CONFIG_KVM_BOOK3S_HV_EXIT_TIMING.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
|
|
This creates a debugfs directory for each HV guest (assuming debugfs
is enabled in the kernel config), and within that directory, a file
by which the contents of the guest's HPT (hashed page table) can be
read. The directory is named vmnnnn, where nnnn is the PID of the
process that created the guest. The file is named "htab". This is
intended to help in debugging problems in the host's management
of guest memory.
The contents of the file consist of a series of lines like this:
3f48 4000d032bf003505 0000000bd7ff1196 00000003b5c71196
The first field is the index of the entry in the HPT, the second and
third are the HPT entry, so the third entry contains the real page
number that is mapped by the entry if the entry's valid bit is set.
The fourth field is the guest's view of the second doubleword of the
entry, so it contains the guest physical address. (The format of the
second through fourth fields are described in the Power ISA and also
in arch/powerpc/include/asm/mmu-hash64.h.)
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
|
|
Add two counters to count how often we generate real-mode ICS resend
and reject events. The counters provide some performance statistics
that could be used in the future to consider if the real mode functions
need further optimizing. The counters are displayed as part of IPC and
ICP state provided by /sys/debug/kernel/powerpc/kvm* for each VM.
Also added two counters that count (approximately) how many times we
don't find an ICP or ICS we're looking for. These are not currently
exposed through sysfs, but can be useful when debugging crashes.
Signed-off-by: Suresh Warrier <warrier@linux.vnet.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
|
|
Interrupt-based hypercalls return H_TOO_HARD to inform KVM that it needs
to switch to the host to complete the rest of hypercall function in
virtual mode. This patch ports the virtual mode ICS/ICP reject and resend
functions to be runnable in hypervisor real mode, thus avoiding the need
to switch to the host to execute these functions in virtual mode. However,
the hypercalls continue to return H_TOO_HARD for vcpu_wakeup and notify
events - these events cannot be done in real mode and they will still need
a switch to host virtual mode.
There are sufficient differences between the real mode code and the
virtual mode code for the ICS/ICP resend and reject functions that
for now the code has been duplicated instead of sharing common code.
In the future, we can look at creating common functions.
Signed-off-by: Suresh Warrier <warrier@linux.vnet.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
|
|
Replaces the ICS mutex lock with a spin lock since we will be porting
these routines to real mode. Note that we need to disable interrupts
before we take the lock in anticipation of the fact that on the guest
side, we are running in the context of a hard irq and interrupts are
disabled (EE bit off) when the lock is acquired. Again, because we
will be acquiring the lock in hypervisor real mode, we need to use
an arch_spinlock_t instead of a normal spinlock here as we want to
avoid running any lockdep code (which may not be safe to execute in
real mode).
Signed-off-by: Suresh Warrier <warrier@linux.vnet.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
|
|
Add counters to track number of times we switch from guest real mode
to host virtual mode during an interrupt-related hyper call because the
hypercall requires actions that cannot be completed in real mode. This
will help when making optimizations that reduce guest-host transitions.
It is safe to use an ordinary increment rather than an atomic operation
because there is one ICP per virtual CPU and kvmppc_xics_rm_complete()
only works on the ICP for the current VCPU.
The counters are displayed as part of IPC and ICP state provided by
/sys/debug/kernel/powerpc/kvm* for each VM.
Signed-off-by: Suresh Warrier <warrier@linux.vnet.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
|
|
This adds helper routines for locking and unlocking HPTEs, and uses
them in the rest of the code. We don't change any locking rules in
this patch.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
|
|
We don't support real-mode areas now that 970 support is removed.
Remove the remaining details of rma from the code. Also rename
rma_setup_done to hpte_setup_done to better reflect the changes.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
|
|
Some PowerNV systems include a hardware random-number generator.
This HWRNG is present on POWER7+ and POWER8 chips and is capable of
generating one 64-bit random number every microsecond. The random
numbers are produced by sampling a set of 64 unstable high-frequency
oscillators and are almost completely entropic.
PAPR defines an H_RANDOM hypercall which guests can use to obtain one
64-bit random sample from the HWRNG. This adds a real-mode
implementation of the H_RANDOM hypercall. This hypercall was
implemented in real mode because the latency of reading the HWRNG is
generally small compared to the latency of a guest exit and entry for
all the threads in the same virtual core.
Userspace can detect the presence of the HWRNG and the H_RANDOM
implementation by querying the KVM_CAP_PPC_HWRNG capability. The
H_RANDOM hypercall implementation will only be invoked when the guest
does an H_RANDOM hypercall if userspace first enables the in-kernel
H_RANDOM implementation using the KVM_CAP_PPC_ENABLE_HCALL capability.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
|
|
On POWER, storage caching is usually configured via the MMU - attributes
such as cache-inhibited are stored in the TLB and the hashed page table.
This makes correctly performing cache inhibited IO accesses awkward when
the MMU is turned off (real mode). Some CPU models provide special
registers to control the cache attributes of real mode load and stores but
this is not at all consistent. This is a problem in particular for SLOF,
the firmware used on KVM guests, which runs entirely in real mode, but
which needs to do IO to load the kernel.
To simplify this qemu implements two special hypercalls, H_LOGICAL_CI_LOAD
and H_LOGICAL_CI_STORE which simulate a cache-inhibited load or store to
a logical address (aka guest physical address). SLOF uses these for IO.
However, because these are implemented within qemu, not the host kernel,
these bypass any IO devices emulated within KVM itself. The simplest way
to see this problem is to attempt to boot a KVM guest from a virtio-blk
device with iothread / dataplane enabled. The iothread code relies on an
in kernel implementation of the virtio queue notification, which is not
triggered by the IO hcalls, and so the guest will stall in SLOF unable to
load the guest OS.
This patch addresses this by providing in-kernel implementations of the
2 hypercalls, which correctly scan the KVM IO bus. Any access to an
address not handled by the KVM IO bus will cause a VM exit, hitting the
qemu implementation as before.
Note that a userspace change is also required, in order to enable these
new hcall implementations with KVM_CAP_PPC_ENABLE_HCALL.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
[agraf: fix compilation]
Signed-off-by: Alexander Graf <agraf@suse.de>
|
|
Export __spin_yield so that the arch_spin_unlock() function can
be invoked from a module. This will be required for modules where
we want to take a lock that is also is acquired in hypervisor
real mode. Because we want to avoid running any lockdep code
(which may not be safe in real mode), this lock needs to be
an arch_spinlock_t instead of a normal spinlock.
Signed-off-by: Suresh Warrier <warrier@linux.vnet.ibm.com>
Acked-by: Paul Mackerras <paulus@samba.org>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
|
|
git://people.freedesktop.org/~airlied/linux into v4l_for_linus
* 'drm-next-merged' of git://people.freedesktop.org/~airlied/linux: (9717 commits)
media-bus: Fixup RGB444_1X12, RGB565_1X16, and YUV8_1X24 media bus format
hexdump: avoid warning in test function
fs: take i_mutex during prepare_binprm for set[ug]id executables
smp: Fix error case handling in smp_call_function_*()
iommu-common: Fix PARISC compile-time warnings
sparc: Make LDC use common iommu poll management functions
sparc: Make sparc64 use scalable lib/iommu-common.c functions
Break up monolithic iommu table/lock into finer graularity pools and lock
sparc: Revert generic IOMMU allocator.
tools/power turbostat: correct dumped pkg-cstate-limit value
tools/power turbostat: calculate TSC frequency from CPUID(0x15) on SKL
tools/power turbostat: correct DRAM RAPL units on recent Xeon processors
tools/power turbostat: Initial Skylake support
tools/power turbostat: Use $(CURDIR) instead of $(PWD) and add support for O= option in Makefile
tools/power turbostat: modprobe msr, if needed
tools/power turbostat: dump MSR_TURBO_RATIO_LIMIT2
tools/power turbostat: use new MSR_TURBO_RATIO_LIMIT names
Bluetooth: hidp: Fix regression with older userspace and flags validation
config: Enable NEED_DMA_MAP_STATE by default when SWIOTLB is selected
perf/x86/intel/pt: Fix and clean up error handling in pt_event_add()
...
That solves several merge conflicts:
Documentation/DocBook/media/v4l/subdev-formats.xml
Documentation/devicetree/bindings/vendor-prefixes.txt
drivers/staging/media/mn88473/mn88473.c
include/linux/kconfig.h
include/uapi/linux/media-bus-format.h
The ones at subdev-formats.xml and media-bus-format.h are not trivial.
That's why we opted to merge from DRM.
|
|
|
|
With the recent changes omaps have developed a dependency to MFD_SYSCON.
This is used for system control module generic register area and some
clocks.
We do have it selected in omap2plus_defconfig, but targeted config
files may not have it selected. Let's make sure it's selected like
few other ARM platforms are already doing.
Reported-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Tony Lindgren <tony@atomide.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux
Pull final removal of deprecated cpus_* cpumask functions from Rusty Russell:
"This is the final removal (after several years!) of the obsolete
cpus_* functions, prompted by their mis-use in staging.
With these function removed, all cpu functions should only iterate to
nr_cpu_ids, so we finally only allocate that many bits when cpumasks
are allocated offstack"
* tag 'cpumask-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: (25 commits)
cpumask: remove __first_cpu / __next_cpu
cpumask: resurrect CPU_MASK_CPU0
linux/cpumask.h: add typechecking to cpumask_test_cpu
cpumask: only allocate nr_cpumask_bits.
Fix weird uses of num_online_cpus().
cpumask: remove deprecated functions.
mips: fix obsolete cpumask_of_cpu usage.
x86: fix more deprecated cpu function usage.
ia64: remove deprecated cpus_ usage.
powerpc: fix deprecated CPU_MASK_CPU0 usage.
CPU_MASK_ALL/CPU_MASK_NONE: remove from deprecated region.
staging/lustre/o2iblnd: Don't use cpus_weight
staging/lustre/libcfs: replace deprecated cpus_ calls with cpumask_
staging/lustre/ptlrpc: Do not use deprecated cpus_* functions
blackfin: fix up obsolete cpu function usage.
parisc: fix up obsolete cpu function usage.
tile: fix up obsolete cpu function usage.
arm64: fix up obsolete cpu function usage.
mips: fix up obsolete cpu function usage.
x86: fix up obsolete cpu function usage.
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull more s390 updates from Martin Schwidefsky:
"The big thing in this second merge for s390 is the new eBPF JIT from
Michael which replaces the old 32-bit backend.
The remaining commits are bug fixes"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/pci: add locking for fmb access
s390/pci: extract software counters from fmb
s390/dasd: Fix unresumed device after suspend/resume having no paths
s390/dasd: fix unresumed device after suspend/resume
s390/dasd: fix inability to set a DASD device offline
s390/mm: Fix memory hotplug for unaligned standby memory
s390/bpf: Add s390x eBPF JIT compiler backend
s390: Use bool function return values of true/false not 1/0
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu
Pull m68k fixes from Greg Ungerer:
"Nothing big, spelling fixes and fix/cleanup for ColdFire eth device setup"
* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu:
m68knommu: fix fec setup warning for ColdFire 5271 builds
m68knommu: ColdFire 5271 only has a single FEC controller
m68k: Fix trivial typos in comments
|
|
Merge a set of fixes that we missed sending in before v4.0 release. These
will also be sent to -stable.
* fixes: (659 commits)
ARM: at91/dt: sama5d3 xplained: add phy address for macb1
kbuild: Create directory for target DTB
ARM: mvebu: Disable CPU Idle on Armada 38x
arm64: juno: Fix misleading name of UART reference clock
ARM: dts: sunxi: Remove overclocked/overvoltaged OPP
ARM: dts: sun4i: a10-lime: Override and remove 1008MHz OPP setting
ARM: socfpga: dts: fix spi1 interrupt
ARM: dts: Fix gpio interrupts for dm816x
ARM: dts: dra7: remove ti,hwmod property from pcie phy
ARM: EXYNOS: Fix build breakage cpuidle on !SMP
ARM: OMAP: dmtimer: disable pm runtime on remove
ARM: OMAP: dmtimer: check for pm_runtime_get_sync() failure
ARM: dts: fix lid and power pin-functions for exynos5250-spring
ARM: dts: fix mmc node updates for exynos5250-spring
ARM: OMAP2+: Fix socbus family info for AM33xx devices
ARM: dts: omap3: Add missing dmas for crypto
+ Linux 4.0-rc4
Signed-off-by: Olof Johansson <olof@lixom.net>
|
|
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
|
|
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
|
|
The actual user space unwinding is more involved, so simply capture the
user space PC
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
|
|
Signed-off-by: Mischa Jonker <mjonker@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
|
|
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
|
|
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
|
|
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
|
|
Redefine trap handler as below:
0 N/A reserved for system calls
1 SIGUSR1 user-defined signal 1
2 SIGUSR2 user-defined signal 2
3 SIGILL illegal instruction
4..29 reserved (but implemented to raise SIGILL instead of being undefined)
30 SIGTRAP KGDB
31 SIGTRAP trace/breakpoint trap
Signed-off-by: Ley Foon Tan <lftan@altera.com>
|
|
Remove the end address checking for initda function. We need to invalidate
each address line for initda instruction, from start to end address.
Signed-off-by: Ley Foon Tan <lftan@altera.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux
Pull turbostat update from Len Brown:
"Updates to the turbostat utility.
Just one kernel dependency in this batch -- added a #define to
msr-index.h"
* 'turbostat' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux:
tools/power turbostat: correct dumped pkg-cstate-limit value
tools/power turbostat: calculate TSC frequency from CPUID(0x15) on SKL
tools/power turbostat: correct DRAM RAPL units on recent Xeon processors
tools/power turbostat: Initial Skylake support
tools/power turbostat: Use $(CURDIR) instead of $(PWD) and add support for O= option in Makefile
tools/power turbostat: modprobe msr, if needed
tools/power turbostat: dump MSR_TURBO_RATIO_LIMIT2
tools/power turbostat: use new MSR_TURBO_RATIO_LIMIT names
x86 msr-index: define MSR_TURBO_RATIO_LIMIT,1,2
tools/power turbostat: label base frequency
tools/power turbostat: update PERF_LIMIT_REASONS decoding
tools/power turbostat: simplify default output
|
|
They were for use by the deprecated first_cpu() and next_cpu() wrappers,
but sparc used them directly.
They're now replaced by cpumask_first / cpumask_next. And __next_cpu_nr
is completely obsolete.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: David S. Miller <davem@davemloft.net>
|
|
Pull sparc fixes from David Miller
"Unfortunately, I brown paper bagged the generic iommu pool allocator
by applying the wrong revision of the patch series.
This reverts the bad one, and puts the right one in"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
iommu-common: Fix PARISC compile-time warnings
sparc: Make LDC use common iommu poll management functions
sparc: Make sparc64 use scalable lib/iommu-common.c functions
Break up monolithic iommu table/lock into finer graularity pools and lock
sparc: Revert generic IOMMU allocator.
|
|
Note that this conversion is only being done to consolidate the
code and ensure that the common code provides the sufficient
abstraction. It is not expected to result in any noticeable
performance improvement, as there is typically one ldc_iommu
per vnet_port, and each one has 8k entries, with a typical
request for 1-4 pages. Thus LDC uses npools == 1.
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In iperf experiments running linux as the Tx side (TCP client) with
10 threads results in a severe performance drop when TSO is disabled,
indicating a weakness in the software that can be avoided by using
the scalable IOMMU arena DMA allocation.
Baseline numbers before this patch:
with default settings (TSO enabled) : 9-9.5 Gbps
Disable TSO using ethtool- drops badly: 2-3 Gbps.
After this patch, iperf client with 10 threads, can give a
throughput of at least 8.5 Gbps, even when TSO is disabled.
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
I applied the wrong version of this patch series, V4 instead
of V10, due to a patchwork bundling snafu.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Skylake adds some additional residency counters.
Skylake supports a different mix of RAPL registers
from any previous product.
In most other ways, Skylake is like Broadwell.
Signed-off-by: Len Brown <len.brown@intel.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull PMEM driver from Ingo Molnar:
"This is the initial support for the pmem block device driver:
persistent non-volatile memory space mapped into the system's physical
memory space as large physical memory regions.
The driver is based on Intel code, written by Ross Zwisler, with fixes
by Boaz Harrosh, integrated with x86 e820 memory resource management
and tidied up by Christoph Hellwig.
Note that there were two other separate pmem driver submissions to
lkml: but apparently all parties (Ross Zwisler, Boaz Harrosh) are
reasonably happy with this initial version.
This version enables minimal support that enables persistent memory
devices out in the wild to work as block devices, identified through a
magic (non-standard) e820 flag and auto-discovered if
CONFIG_X86_PMEM_LEGACY=y, or added explicitly through manipulating the
memory maps via the "memmap=..." boot option with the new, special '!'
modifier character.
Limitations: this is a regular block device, and since the pmem areas
are not struct page backed, they are invisible to the rest of the
system (other than the block IO device), so direct IO to/from pmem
areas, direct mmap() or XIP is not possible yet. The page cache will
also shadow and double buffer pmem contents, etc.
Initial support is for x86"
* 'x86-pmem-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
drivers/block/pmem: Fix 32-bit build warning in pmem_alloc()
drivers/block/pmem: Add a driver for persistent memory
x86/mm: Add support for the non-standard protected e820 type
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
"This tree includes:
- an FPU related crash fix
- a ptrace fix (with matching testcase in tools/testing/selftests/)
- an x86 Kconfig DMA-config defaults tweak to better avoid
non-working drivers"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
config: Enable NEED_DMA_MAP_STATE by default when SWIOTLB is selected
x86/fpu: Load xsave pointer *after* initialization
x86/ptrace: Fix the TIF_FORCED_TF logic in handle_signal()
x86, selftests: Add single_step_syscall test
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf updates from Ingo Molnar:
"This update has mostly fixes, but also other bits:
- perf tooling fixes
- PMU driver fixes
- Intel Broadwell PMU driver HW-enablement for LBR callstacks
- a late coming 'perf kmem' tool update that enables it to also
analyze page allocation data. Note, this comes with MM tracepoint
changes that we believe to not break anything: because it changes
the formerly opaque 'struct page *' field that uniquely identifies
pages to 'pfn' which identifies pages uniquely too, but isn't as
opaque and can be used for other purposes as well"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86/intel/pt: Fix and clean up error handling in pt_event_add()
perf/x86/intel: Add Broadwell support for the LBR callstack
perf/x86/intel/rapl: Fix energy counter measurements but supporing per domain energy units
perf/x86/intel: Fix Core2,Atom,NHM,WSM cycles:pp events
perf/x86: Fix hw_perf_event::flags collision
perf probe: Fix segfault when probe with lazy_line to file
perf probe: Find compilation directory path for lazy matching
perf probe: Set retprobe flag when probe in address-based alternative mode
perf kmem: Analyze page allocator events also
tracing, mm: Record pfn instead of pointer to struct page
|
|
A huge amount of NIC drivers use the DMA API, however if
compiled under 32-bit an very important part of the DMA API can
be ommitted leading to the drivers not working at all
(especially if used with 'swiotlb=force iommu=soft').
As Prashant Sreedharan explains it: "the driver [tg3] uses
DEFINE_DMA_UNMAP_ADDR(), dma_unmap_addr_set() to keep a copy of
the dma "mapping" and dma_unmap_addr() to get the "mapping"
value. On most of the platforms this is a no-op, but ... with
"iommu=soft and swiotlb=force" this house keeping is required,
... otherwise we pass 0 while calling pci_unmap_/pci_dma_sync_
instead of the DMA address."
As such enable this even when using 32-bit kernels.
Reported-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Prashant Sreedharan <prashant@broadcom.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michael Chan <mchan@broadcom.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: boris.ostrovsky@oracle.com
Cc: cascardo@linux.vnet.ibm.com
Cc: david.vrabel@citrix.com
Cc: sanjeevb@broadcom.com
Cc: siva.kallam@broadcom.com
Cc: vyasevich@gmail.com
Cc: xen-devel@lists.xensource.com
Link: http://lkml.kernel.org/r/20150417190448.GA9462@l.oracle.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio
Pull GPIO updates from Linus Walleij:
"This is the bulk of GPIO changes for the v4.1 development cycle:
- A new GPIO hogging mechanism has been added. This can be used on
boards that want to drive some GPIO line high, low, or set it as
input on boot and then never touch it again. For some embedded
systems this is bliss and simplifies things to a great extent.
- Some API cleanup and closure: gpiod_get_array() and
gpiod_put_array() has been added to get and put GPIOs in bulk as
was possible with the non-descriptor API.
- Encapsulate cross-calls to the pin control subsystem in
<linux/gpio/driver.h>. Now this should be the only header any GPIO
driver needs to include or something is wrong. Cleanups
restricting drivers to this include are welcomed if tested.
- Sort the GPIO Kconfig and split it into submenus, as it was
becoming and unstructured, illogical and unnavigatable mess. I
hope this is easier to follow. Menus that require a certain
subsystem like I2C can now be hidden nicely for example, still
working on others.
- New drivers:
- New driver for the Altera Soft GPIO.
- The F7188x driver now handles the F71869 and F71869A variants.
- The MIPS Loongson driver has been moved to drivers/gpio for
consolidation and cleanup.
- Cleanups:
- The MAX732x is converted to use the GPIOLIB_IRQCHIP
infrastructure.
- The PCF857x is converted to use the GPIOLIB_IRQCHIP
infrastructure.
- Radical cleanup of the OMAP driver.
- Misc:
- Enable the DWAPB GPIO for all architectures. This is a "hard
IP" block from Synopsys which has started to turn up in so
diverse architectures as X86 Quark, ARC and a slew of ARM
systems. So even though it's not an expander, it's generic
enough to be available for all.
- We add a mock GPIO on Crystalcove PMIC after a long discussion
with Daniel Vetter et al, tracing back to the shootout at the
kernel summit where DRM drivers and sub-componentization was
discussed. In this case a mock GPIO is assumed to be the best
compromise gaining some reuse of infrastructure without making
DRM drivers overly complex at the same time. Let's see"
* tag 'gpio-v4.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio: (62 commits)
Revert "gpio: sch: use uapi/linux/pci_ids.h directly"
gpio: dwapb: remove dependencies
gpio: dwapb: enable for ARC
gpio: removing kfree remove functionality
gpio: mvebu: Fix mask/unmask managment per irq chip type
gpio: split GPIO drivers in submenus
gpio: move MFD GPIO drivers under their own comment
gpio: move BCM Kona Kconfig option
gpio: arrange SPI Kconfig symbols alphabetically
gpio: arrange PCI GPIO controllers alphabetically
gpio: arrange I2C Kconfig symbols alphabetically
gpio: arrange Kconfig symbols alphabetically
gpio: ich: Implement get_direction function
gpio: use (!foo) instead of (foo == NULL)
gpio: arizona: drop owner assignment from platform_drivers
gpio: max7300: remove 'ret' variable
gpio: use devm_kzalloc
gpio: sch: use uapi/linux/pci_ids.h directly
gpio: x-gene: fix devm_ioremap_resource() check
gpio: loongson: Add Loongson-3A/3B GPIO driver support
...
|
|
Dan Carpenter reported that pt_event_add() has buggy
error handling logic: it returns 0 instead of -EBUSY when
it fails to start a newly added event.
Furthermore, the control flow in this function is messy,
with cleanup labels mixed with direct returns.
Fix the bug and clean up the code by converting it to
a straight fast path for the regular non-failing case,
plus a clear sequence of cascading goto labels to do
all cleanup.
NOTE: I materially changed the existing clean up logic in the
pt_event_start() failure case to use the direct
perf_aux_output_end() path, not pt_event_del(), because
perf_aux_output_end() is enough here.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Julia Lawall <julia.lawall@lip6.fr>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20150416103830.GB7847@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
When targeting ARMv3 (e.g. rpc) and enabling CONFIG_VDSO we get:
arch/arm/vdso/datapage.S:13: Error: selected processor does not
support ARM mode `bx lr'
One fix considered was to use 'ldr pc,lr' for such configurations, but
since the VDSO is unlikely to be useful for pre-v7 hardware, just make
it depend on CONFIG_CPU_V7.
Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Nathan Lynch <nathan_lynch@mentor.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
|
Pull sparc updates from David Miller:
"The PowerPC folks have a really nice scalable IOMMU pool allocator
that we wanted to make use of for sparc. So here we have a series
that abstracts out their code into a common layer that anyone can make
use of.
Sparc is converted, and the PowerPC folks have reviewed and ACK'd this
series and plan to convert PowerPC over as well"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
iommu-common: Fix PARISC compile-time warnings
sparc: Make LDC use common iommu poll management functions
sparc: Make sparc64 use scalable lib/iommu-common.c functions
sparc: Break up monolithic iommu table/lock into finer graularity pools and lock
|
|
Pull arch/tile updates from Chris Metcalf:
"These are mostly nohz_full changes, plus a smattering of minor fixes
(notably a couple for ftrace)"
* git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile:
tile: nohz: warn if nohz_full uses hypervisor shared cores
tile: ftrace: fix function_graph tracer issues
tile: map data region shadow of kernel as R/W
tile: support CONTEXT_TRACKING and thus NOHZ_FULL
tile: support arch_irq_work_raise
arch: tile: fix null pointer dereference on pt_regs pointer
tile/elf: reorganize notify_exec()
tile: use si_int instead of si_ptr for compat_siginfo
|
|
Pull MIPS updates from Ralf Baechle:
"This is the main pull request for MIPS for Linux 4.1. Most
noteworthy:
- Add more Octeon-optimized crypto functions
- Octeon crypto preemption and locking fixes
- Little endian support for Octeon
- Use correct CSR to soft reset Octeons
- Support LEDs on the Octeon-based DSR-1000N
- Fix PCI interrupt mapping for the Octeon-based DSR-1000N
- Mark prom_free_prom_memory() as __init for a number of systems
- Support for Imagination's Pistachio SOC. This includes arch and
CLK bits. I'd like to merge pinctrl bits later
- Improve parallelism of csum_partial for certain pipelines
- Organize DTB files in subdirs like other architectures
- Implement read_sched_clock for all MIPS platforms other than
Octeon
- Massive series of 38 fixes and cleanups for the FPU emulator /
kernel
- Further FPU remulator work to support new features. This sits on a
separate branch which also has been pulled into the 4.1 KVM branch
- Clean up and fixes for the SEAD3 eval board; remove unused file
- Various updates for Netlogic platforms
- A number of small updates for Loongson 3 platforms
- Increase the memory limit for ATH79 platforms to 256MB
- A fair number of fixes and updates for BCM47xx platforms
- Finish the implementation of XPA support
- MIPS FDC support. No, not floppy controller but Fast Debug Channel :)
- Detect the R16000 used in SGI legacy platforms
- Fix Kconfig dependencies for the SSB bus support"
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus: (265 commits)
MIPS: Makefile: Fix MIPS ASE detection code
MIPS: asm: elf: Set O32 default FPU flags
MIPS: BCM47XX: Fix detecting Microsoft MN-700 & Asus WL500G
MIPS: Kconfig: Disable SMP/CPS for 64-bit
MIPS: Hibernate: flush TLB entries earlier
MIPS: smp-cps: cpu_set FPU mask if FPU present
MIPS: lose_fpu(): Disable FPU when MSA enabled
MIPS: ralink: add missing symbol for RALINK_ILL_ACC
MIPS: ralink: Fix bad config symbol in PCI makefile.
SSB: fix Kconfig dependencies
MIPS: Malta: Detect and fix bad memsize values
Revert "MIPS: Avoid pipeline stalls on some MIPS32R2 cores."
MIPS: Octeon: Delete override of cpu_has_mips_r2_exec_hazard.
MIPS: Fix cpu_has_mips_r2_exec_hazard.
MIPS: kernel: entry.S: Set correct ISA level for mips_ihb
MIPS: asm: spinlock: Fix addiu instruction for R10000_LLSC_WAR case
MIPS: r4kcache: Use correct base register for MIPS R6 cache flushes
MIPS: Kconfig: Fix typo for the r2-to-r6 emulator kernel parameter
MIPS: unaligned: Fix regular load/store instruction emulation for EVA
MIPS: unaligned: Surround load/store macros in do {} while statements
...
|
|
Pull Xtensa updates from Chris Zankel:
- fix linker script transformation for .text / .text.fixup
- wire bpf and execveat syscalls
- provide __NR_sync_file_range2 instead of __NR_sync_file_range, as
that's what xtensa uses.
- make xtfpgs LCD driver functional and configurable. This fixes
hardware lockup on KC705/ML605 boot
- add audio subsystem bits to xtfpga DTS and provide sample KC705
config with audio features enabled
- add CY7C67300 USB controller support to XTFPGA
- fix locking issues in ISS network driver
- document PIC and MX interrupt distributor device tree bindings
* tag 'xtensa-20150416' of git://github.com/czankel/xtensa-linux:
xtensa: xtfpga: add CY7C67300 USB controller support
irqchip: xtensa-pic: xtensa-mx: document DT bindings
xtensa: ISS: fix locking in TAP network adapter
xtensa: Fix fix linker script transformation for .text / .text.fixup
xtensa: provide __NR_sync_file_range2 instead of __NR_sync_file_range
xtensa: wire bpf and execveat syscalls
xtensa: xtfpga: fix hardware lockup caused by LCD driver
xtensa: xtfpga: provide defconfig with audio subsystem
xtensa: xtfpga: add audio card to xtfpga DTS
|