Age | Commit message (Collapse) | Author |
|
When a vcpu is loaded/unloaded to a physical core, we need to update
host physical APIC ID information in the Physical APIC-ID table
accordingly.
Also, when vCPU is blocking/un-blocking (due to halt instruction),
we need to make sure that the is-running bit in set accordingly in the
physical APIC-ID table.
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
[Return void from new functions, add WARN_ON when they returned negative
errno; split load and put into separate function as they have almost
nothing in common. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
When enable AVIC:
* Do not intercept CR8 since this should be handled by AVIC HW.
* Also, we don't need to sync cr8/V_TPR and APIC backing page.
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
[Rename svm_in_nested_interrupt_shadow to svm_nested_virtualize_tpr. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Since AVIC only virtualizes xAPIC hardware for the guest, this patch
disable x2APIC support in guest CPUID.
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Adding kvm_x86_ops hooks to allow APICv to do post state restore.
This is required to support VM save and restore feature.
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
This patch introduces VMEXIT handlers, avic_incomplete_ipi_interception()
and avic_unaccelerated_access_interception() along with two trace points
(trace_kvm_avic_incomplete_ipi and trace_kvm_avic_unaccelerated_access).
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
This patch introduces a new mechanism to inject interrupt using AVIC.
Since VINTR is not supported when enable AVIC, we need to inject
interrupt via APIC backing page instead.
This patch also adds support for AVIC doorbell, which is used by
KVM to signal a running vcpu to check IRR for injected interrupts.
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
This patch introduces AVIC-related data structure, and AVIC
initialization code.
There are three main data structures for AVIC:
* Virtual APIC (vAPIC) backing page (per-VCPU)
* Physical APIC ID table (per-VM)
* Logical APIC ID table (per-VM)
Currently, AVIC is disabled by default. Users can manually
enable AVIC via kernel boot option kvm-amd.avic=1 or during
kvm-amd module loading with parameter avic=1.
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
[Avoid extra indentation (Boris). - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Introduce new AVIC VMCB registers.
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Adding new function pointer in struct kvm_x86_ops, and calling them
from the kvm_arch_vcpu[blocking/unblocking].
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Adding function pointers in struct kvm_x86_ops for processor-specific
layer to provide hooks for when KVM initialize and destroy VM.
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Rename kvm_apic_get_reg to kvm_lapic_get_reg to be consistent with
the existing kvm_lapic_set_reg counterpart.
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Exporting LAPIC utility functions and macros for re-use in SVM code.
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
When booting with nr_cpus=1, uncore_pci_probe tries to init the PCI/uncore
also for the other packages and fails with warning when they are not found.
The warning is bogus because it's correct to fail here for packages which are
not initialized. Remove it and return silently.
Fixes: cf6d445f6897 "perf/x86/uncore: Track packages, not per CPU data"
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: stable@vger.kernel.org
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching
Pull livepatching updates from Jiri Kosina:
- remove of our own implementation of architecture-specific relocation
code and leveraging existing code in the module loader to perform
arch-dependent work, from Jessica Yu.
The relevant patches have been acked by Rusty (for module.c) and
Heiko (for s390).
- live patching support for ppc64le, which is a joint work of Michael
Ellerman and Torsten Duwe. This is coming from topic branch that is
share between livepatching.git and ppc tree.
- addition of livepatching documentation from Petr Mladek
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching:
livepatch: make object/func-walking helpers more robust
livepatch: Add some basic livepatch documentation
powerpc/livepatch: Add live patching support on ppc64le
powerpc/livepatch: Add livepatch stack to struct thread_info
powerpc/livepatch: Add livepatch header
livepatch: Allow architectures to specify an alternate ftrace location
ftrace: Make ftrace_location_range() global
livepatch: robustify klp_register_patch() API error checking
Documentation: livepatch: outline Elf format and requirements for patch modules
livepatch: reuse module loader code to write relocations
module: s390: keep mod_arch_specific for livepatch modules
module: preserve Elf information for livepatch modules
Elf: add livepatch-specific Elf constants
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
Pull trivial tree updates from Jiri Kosina.
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (21 commits)
gitignore: fix wording
mfd: ab8500-debugfs: fix "between" in printk
memstick: trivial fix of spelling mistake on management
cpupowerutils: bench: fix "average"
treewide: Fix typos in printk
IB/mlx4: printk fix
pinctrl: sirf/atlas7: fix printk spelling
serial: mctrl_gpio: Grammar s/lines GPIOs/line GPIOs/, /sets/set/
w1: comment spelling s/minmum/minimum/
Blackfin: comment spelling s/divsor/divisor/
metag: Fix misspellings in comments.
ia64: Fix misspellings in comments.
hexagon: Fix misspellings in comments.
tools/perf: Fix misspellings in comments.
cris: Fix misspellings in comments.
c6x: Fix misspellings in comments.
blackfin: Fix misspelling of 'register' in comment.
avr32: Fix misspelling of 'definitions' in comment.
treewide: Fix typos in printk
Doc: treewide : Fix typos in DocBook/filesystem.xml
...
|
|
Pull networking updates from David Miller:
"Highlights:
1) Support SPI based w5100 devices, from Akinobu Mita.
2) Partial Segmentation Offload, from Alexander Duyck.
3) Add GMAC4 support to stmmac driver, from Alexandre TORGUE.
4) Allow cls_flower stats offload, from Amir Vadai.
5) Implement bpf blinding, from Daniel Borkmann.
6) Optimize _ASYNC_ bit twiddling on sockets, unless the socket is
actually using FASYNC these atomics are superfluous. From Eric
Dumazet.
7) Run TCP more preemptibly, also from Eric Dumazet.
8) Support LED blinking, EEPROM dumps, and rxvlan offloading in mlx5e
driver, from Gal Pressman.
9) Allow creating ppp devices via rtnetlink, from Guillaume Nault.
10) Improve BPF usage documentation, from Jesper Dangaard Brouer.
11) Support tunneling offloads in qed, from Manish Chopra.
12) aRFS offloading in mlx5e, from Maor Gottlieb.
13) Add RFS and RPS support to SCTP protocol, from Marcelo Ricardo
Leitner.
14) Add MSG_EOR support to TCP, this allows controlling packet
coalescing on application record boundaries for more accurate
socket timestamp sampling. From Martin KaFai Lau.
15) Fix alignment of 64-bit netlink attributes across the board, from
Nicolas Dichtel.
16) Per-vlan stats in bridging, from Nikolay Aleksandrov.
17) Several conversions of drivers to ethtool ksettings, from Philippe
Reynes.
18) Checksum neutral ILA in ipv6, from Tom Herbert.
19) Factorize all of the various marvell dsa drivers into one, from
Vivien Didelot
20) Add VF support to qed driver, from Yuval Mintz"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1649 commits)
Revert "phy dp83867: Fix compilation with CONFIG_OF_MDIO=m"
Revert "phy dp83867: Make rgmii parameters optional"
r8169: default to 64-bit DMA on recent PCIe chips
phy dp83867: Make rgmii parameters optional
phy dp83867: Fix compilation with CONFIG_OF_MDIO=m
bpf: arm64: remove callee-save registers use for tmp registers
asix: Fix offset calculation in asix_rx_fixup() causing slow transmissions
switchdev: pass pointer to fib_info instead of copy
net_sched: close another race condition in tcf_mirred_release()
tipc: fix nametable publication field in nl compat
drivers: net: Don't print unpopulated net_device name
qed: add support for dcbx.
ravb: Add missing free_irq() calls to ravb_close()
qed: Remove a stray tab
net: ethernet: fec-mpc52xx: use phy_ethtool_{get|set}_link_ksettings
net: ethernet: fec-mpc52xx: use phydev from struct net_device
bpf, doc: fix typo on bpf_asm descriptions
stmmac: hardware TX COE doesn't work when force_thresh_dma_mode is set
net: ethernet: fs-enet: use phy_ethtool_{get|set}_link_ksettings
net: ethernet: fs-enet: use phydev from struct net_device
...
|
|
* pci/hotplug:
PCI: Use cached copy of PCI_EXP_SLTCAP_HPC bit
* pci/resource:
PCI: Disable all BAR sizing for devices with non-compliant BARs
x86/PCI: Mark Broadwell-EP Home Agent 1 as having non-compliant BARs
PCI: Identify Enhanced Allocation (EA) BAR Equivalent resources in sysfs
|
|
Megha Dey reported a kernel panic in crypto code. The problem is that
sha1_x8_avx2() clobbers registers r12-r15 without saving and restoring
them.
Before commit aec4d0e301f1 ("x86/asm/crypto: Simplify stack usage in
sha-mb functions"), those registers were saved and restored by the
callers of the function. I removed them with that commit because I
didn't realize sha1_x8_avx2() clobbered them.
Fix the potential undefined behavior associated with clobbering the
registers and make the behavior less surprising by changing the
registers to be callee saved/restored to conform with the C function
call ABI.
Also, rdx (aka RSP_SAVE) doesn't need to be saved: I verified that none
of the callers rely on it being saved, and it's not a callee-saved
register in the C ABI.
Fixes: aec4d0e301f1 ("x86/asm/crypto: Simplify stack usage in sha-mb functions")
Cc: stable@vger.kernel.org # 4.6
Reported-by: Megha Dey <megha.dey@linux.intel.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Alex Thorlton reported that the SGI/UV code crashes in the efi_call()
code when invoked with 7 parameters, due to:
mov (%rsp), %rax
mov 8(%rax), %rax
...
mov %rax, 40(%rsp)
Offset 8 is only true if CONFIG_FRAME_POINTERS is disabled,
with frame pointers enabled it should be 16.
Furthermore, the SAVE_XMM code saves the old stack pointer, but
that's just crazy. It saves the stack pointer *AFTER* we've done
the:
FRAME_BEGIN
... which will have *changed* the stack pointer, depending on whether
stack frames are enabled or not.
So when the code then does:
mov (%rsp), %rax
... we now move that old stack pointer into %rax, but the offset off that
stack pointer will depend on whether that FRAME_BEGIN saved off %rbp
or not.
So that whole 8-vs-16 offset confusion depends on the frame pointer!
If frame pointers were enabled, it will be 16. If they weren't, it
will be 8.
The right fix is to just get rid of that silly conditional frame
pointer thing, and always use frame pointers in this stub function.
And then we don't need that (odd) load to get the old stack
pointer into %rax - we can just use the frame pointer.
Reported-by: Alex Thorlton <athorlton@sgi.com>
Tested-by: Alex Thorlton <athorlton@sgi.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/CA%2B55aFzBS2v%3DWnEH83cUDg7XkOremFqJ30BJwF40dCYjReBkUQ@mail.gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI updates from Rafael Wysocki:
"The new features here are ACPI 6.1 support (and some previously
missing bits of ACPI 6.0 support) in ACPICA and two new drivers, a
driver for the ACPI Generic Event Device (GED) feature introduced by
ACPI 6.1 and the INT3406 thermal driver for display thermal
management. Also the value returned by the _HRV (hardware revision)
ACPI object will be exported to user space via sysfs now.
In addition to that, ACPI on ARM64 will not depend on EXPERT any more.
The rest is mostly fixes and cleanups and some code reorganization.
Specifics:
- In-kernel ACPICA code update to the upstream release 20160422
adding support for ACPI 6.1 along with some previously missing bits
of ACPI 6.0 support, making a fair amount of fixes and cleanups and
reducing divergences between the upstream ACPICA and the in-kernel
code (Bob Moore, Lv Zheng, Al Stone, Aleksey Makarov, Will Miles)
- ACPI Generic Event Device (GED) support and a fix for it (Sinan
Kaya, Paul Gortmaker)
- INT3406 thermal driver for display thermal management and ACPI
backlight support code reorganization related to it (Aaron Lu, Arnd
Bergmann)
- Support for exporting the value returned by the _HRV (hardware
revision) ACPI object via sysfs (Betty Dall)
- Removal of the EXPERT dependency for ACPI on ARM64 (Mark Brown)
- Rework of the handling of ACPI _OSI mechanism allowing the
_OSI("Darwin") support to be overridden from the kernel command
line among other things (Lv Zheng, Chen Yu)
- Rework of the ACPI tables override mechanism to prepare it for the
introduction of overlays support going forward (Lv Zheng, Rafael
Wysocki)
- Fixes related to the ECDT support and module-level execution of AML
(Lv Zheng)
- ACPI PCI interrupts management update to make it work better on
ARM64 mostly (Sinan Kaya)
- ACPI SRAT handling update to make the code process all entires in
the table order regardless of the entry type (Lukasz Anaczkowski)
- EFI power off support for full-hardware ACPI platforms that don't
support ACPI S5 (Chen Yu)
- Fixes and cleanups related to the ACPI core's sysfs interface (Dan
Carpenter, Betty Dall)
- acpi_dev_present() API rework to reduce possible confusion related
to it (Lukas Wunner)
- Removal of CLK_IS_ROOT from two ACPI drivers (Stephen Boyd)"
* tag 'acpi-4.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (82 commits)
ACPI / video: mark acpi_video_get_levels() inline
Thermal / ACPI / video: add INT3406 thermal driver
ACPI / GED: make evged.c explicitly non-modular
ACPI / tables: Fix DSDT override mechanism
ACPI / sysfs: fix error code in get_status()
ACPICA: Update version to 20160422
ACPICA: Move all ASCII utilities to a common file
ACPICA: ACPI 2.0, Hardware: Add access_width/bit_offset support for acpi_hw_write()
ACPICA: ACPI 2.0, Hardware: Add access_width/bit_offset support in acpi_hw_read()
ACPICA: Executer: Introduce a set of macros to handle bit width mask generation
ACPICA: Hardware: Add optimized access bit width support
ACPICA: Utilities: Add ACPI_IS_ALIGNED() macro
ACPICA: Renamed some #defined flag constants for clarity
ACPICA: ACPI 6.0, tools/iasl: Add support for new resource descriptors
ACPICA: ACPI 6.0: Update _BIX support for new package element
ACPICA: ACPI 6.1: Support for new PCCT subtable
ACPICA: Refactor evaluate_object to reduce nesting
ACPICA: Divergence: remove unwanted spaces for typedef
ACPI,PCI,IRQ: remove SCI penalize function
ACPI,PCI,IRQ: remove redundant code in acpi_irq_penalty_init()
..
|
|
We will use it to count how many addresses are in the entry->ip[] array,
excluding PERF_CONTEXT_{KERNEL,USER,etc} entries, so that we can really
return the number of entries specified by the user via the relevant
sysctl, kernel.perf_event_max_contexts, or via the per event
perf_event_attr.sample_max_stack knob.
This way we keep the perf_sample->ip_callchain->nr meaning, that is the
number of entries, be it real addresses or PERF_CONTEXT_ entries, while
honouring the max_stack knobs, i.e. the end result will be max_stack
entries if we have at least that many entries in a given stack trace.
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/n/tip-s8teto51tdqvlfhefndtat9r@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
This makes perf_callchain_{user,kernel}() receive the max stack
as context for the perf_callchain_entry, instead of accessing
the global sysctl_perf_event_max_stack.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Zefan Li <lizefan@huawei.com>
Link: http://lkml.kernel.org/n/tip-kolmn1yo40p7jhswxwrc7rrd@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 platform updates from Ingo Molnar:
"The main change is the addition of SGI/UV4 support"
* 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (22 commits)
x86/platform/UV: Fix incorrect nodes and pnodes for cpuless and memoryless nodes
x86/platform/UV: Remove Obsolete GRU MMR address translation
x86/platform/UV: Update physical address conversions for UV4
x86/platform/UV: Build GAM reference tables
x86/platform/UV: Support UV4 socket address changes
x86/platform/UV: Add obtaining GAM Range Table from UV BIOS
x86/platform/UV: Add UV4 addressing discovery function
x86/platform/UV: Fold blade info into per node hub info structs
x86/platform/UV: Allocate common per node hub info structs on local node
x86/platform/UV: Move blade local processor ID to the per cpu info struct
x86/platform/UV: Move scir info to the per cpu info struct
x86/platform/UV: Create per cpu info structs to replace per hub info structs
x86/platform/UV: Update MMIOH setup function to work for both UV3 and UV4
x86/platform/UV: Clean up redunduncies after merge of UV4 MMR definitions
x86/platform/UV: Add UV4 Specific MMR definitions
x86/platform/UV: Prep for UV4 MMR updates
x86/platform/UV: Add UV MMR Illegal Access Function
x86/platform/UV: Add UV4 Specific Defines
x86/platform/UV: Add UV Architecture Defines
x86/platform/UV: Add Initial UV4 definitions
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 debug cleanup from Ingo Molnar:
"A printk() output simplification"
* 'x86-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/dumpstack: Combine some printk()s
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 cleanup from Ingo Molnar:
"Inline optimizations"
* 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86: Fix non-static inlines
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86-64 defconfig update from Ingo Molnar:
"Small defconfig addition"
[ I'm not actually convinced our defconfig is sensible, but whatever ]
* 'x86-build-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/build/defconfig/64: Enable CONFIG_E1000E=y
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 boot updates from Ingo Molnar:
"The biggest changes in this cycle were:
- prepare for more KASLR related changes, by restructuring, cleaning
up and fixing the existing boot code. (Kees Cook, Baoquan He,
Yinghai Lu)
- simplifly/concentrate subarch handling code, eliminate
paravirt_enabled() usage. (Luis R Rodriguez)"
* 'x86-boot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (50 commits)
x86/KASLR: Clarify purpose of each get_random_long()
x86/KASLR: Add virtual address choosing function
x86/KASLR: Return earliest overlap when avoiding regions
x86/KASLR: Add 'struct slot_area' to manage random_addr slots
x86/boot: Add missing file header comments
x86/KASLR: Initialize mapping_info every time
x86/boot: Comment what finalize_identity_maps() does
x86/KASLR: Build identity mappings on demand
x86/boot: Split out kernel_ident_mapping_init()
x86/boot: Clean up indenting for asm/boot.h
x86/KASLR: Improve comments around the mem_avoid[] logic
x86/boot: Simplify pointer casting in choose_random_location()
x86/KASLR: Consolidate mem_avoid[] entries
x86/boot: Clean up pointer casting
x86/boot: Warn on future overlapping memcpy() use
x86/boot: Extract error reporting functions
x86/boot: Correctly bounds-check relocations
x86/KASLR: Clean up unused code from old 'run_size' and rename it to 'kernel_total_size'
x86/boot: Fix "run_size" calculation
x86/boot: Calculate decompression size during boot not build
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 asm updates from Ingo Molnar:
"The main changes in this cycle were:
- MSR access API fixes and enhancements (Andy Lutomirski)
- early exception handling improvements (Andy Lutomirski)
- user-space FS/GS prctl usage fixes and improvements (Andy
Lutomirski)
- Remove the cpu_has_*() APIs and replace them with equivalents
(Borislav Petkov)
- task switch micro-optimization (Brian Gerst)
- 32-bit entry code simplification (Denys Vlasenko)
- enhance PAT handling in enumated CPUs (Toshi Kani)
... and lots of other cleanups/fixlets"
* 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (70 commits)
x86/arch_prctl/64: Restore accidentally removed put_cpu() in ARCH_SET_GS
x86/entry/32: Remove asmlinkage_protect()
x86/entry/32: Remove GET_THREAD_INFO() from entry code
x86/entry, sched/x86: Don't save/restore EFLAGS on task switch
x86/asm/entry/32: Simplify pushes of zeroed pt_regs->REGs
selftests/x86/ldt_gdt: Test set_thread_area() deletion of an active segment
x86/tls: Synchronize segment registers in set_thread_area()
x86/asm/64: Rename thread_struct's fs and gs to fsbase and gsbase
x86/arch_prctl/64: Remove FSBASE/GSBASE < 4G optimization
x86/segments/64: When load_gs_index fails, clear the base
x86/segments/64: When loadsegment(fs, ...) fails, clear the base
x86/asm: Make asm/alternative.h safe from assembly
x86/asm: Stop depending on ptrace.h in alternative.h
x86/entry: Rename is_{ia32,x32}_task() to in_{ia32,x32}_syscall()
x86/asm: Make sure verify_cpu() has a good stack
x86/extable: Add a comment about early exception handlers
x86/msr: Set the return value to zero when native_rdmsr_safe() fails
x86/paravirt: Make "unsafe" MSR accesses unsafe even if PARAVIRT=y
x86/paravirt: Add paravirt_{read,write}_msr()
x86/msr: Carry on after a non-"safe" MSR access fails
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler updates from Ingo Molnar:
- massive CPU hotplug rework (Thomas Gleixner)
- improve migration fairness (Peter Zijlstra)
- CPU load calculation updates/cleanups (Yuyang Du)
- cpufreq updates (Steve Muckle)
- nohz optimizations (Frederic Weisbecker)
- switch_mm() micro-optimization on x86 (Andy Lutomirski)
- ... lots of other enhancements, fixes and cleanups.
* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (66 commits)
ARM: Hide finish_arch_post_lock_switch() from modules
sched/core: Provide a tsk_nr_cpus_allowed() helper
sched/core: Use tsk_cpus_allowed() instead of accessing ->cpus_allowed
sched/loadavg: Fix loadavg artifacts on fully idle and on fully loaded systems
sched/fair: Correct unit of load_above_capacity
sched/fair: Clean up scale confusion
sched/nohz: Fix affine unpinned timers mess
sched/fair: Fix fairness issue on migration
sched/core: Kill sched_class::task_waking to clean up the migration logic
sched/fair: Prepare to fix fairness problems on migration
sched/fair: Move record_wakee()
sched/core: Fix comment typo in wake_q_add()
sched/core: Remove unused variable
sched: Make hrtick_notifier an explicit call
sched/fair: Make ilb_notifier an explicit call
sched/hotplug: Make activate() the last hotplug step
sched/hotplug: Move migration CPU_DYING to sched_cpu_dying()
sched/migration: Move CPU_ONLINE into scheduler state
sched/migration: Move calc_load_migrate() into CPU_DYING
sched/migration: Move prepare transition to SCHED_STARTING state
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull RAS updates from Ingo Molnar:
"Main changes in this cycle were:
- AMD MCE/RAS handling updates (Yazen Ghannam, Aravind
Gopalakrishnan)
- Cleanups (Borislav Petkov)
- logging fix (Tony Luck)"
* 'ras-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/RAS: Add SMCA support to AMD Error Injector
EDAC, mce_amd: Detect SMCA using X86_FEATURE_SMCA
x86/mce: Update AMD mcheck init to use cpu_has() facilities
x86/cpu: Add detection of AMD RAS Capabilities
x86/mce/AMD: Save an indentation level in prepare_threshold_block()
x86/mce/AMD: Disable LogDeferredInMcaStat for SMCA systems
x86/mce/AMD: Log Deferred Errors using SMCA MCA_DE{STAT,ADDR} registers
x86/mce: Detect local MCEs properly
x86/mce: Look in genpool instead of mcelog for pending error records
x86/mce: Detect and use SMCA-specific msr_ops
x86/mce: Define vendor-specific MSR accessors
x86/mce: Carve out writes to MCx_STATUS and MCx_CTL
x86/mce: Grade uncorrected errors for SMCA-enabled systems
x86/mce: Log MCEs after a warm rest on AMD, Fam17h and later
x86/mce: Remove explicit smp_rmb() when starting CPUs sync
x86/RAS: Rename AMD MCE injector config item
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf updates from Ingo Molnar:
"Bigger kernel side changes:
- Add backwards writing capability to the perf ring-buffer code,
which is preparation for future advanced features like robust
'overwrite support' and snapshot mode. (Wang Nan)
- Add pause and resume ioctls for the perf ringbuffer (Wang Nan)
- x86 Intel cstate code cleanups and reorgnization (Thomas Gleixner)
- x86 Intel uncore and CPU PMU driver updates (Kan Liang, Peter
Zijlstra)
- x86 AUX (Intel PT) related enhancements and updates (Alexander
Shishkin)
- x86 MSR PMU driver enhancements and updates (Huang Rui)
- ... and lots of other changes spread out over 40+ commits.
Biggest tooling side changes:
- 'perf trace' features and enhancements. (Arnaldo Carvalho de Melo)
- BPF tooling updates (Wang Nan)
- 'perf sched' updates (Jiri Olsa)
- 'perf probe' updates (Masami Hiramatsu)
- ... plus 200+ other enhancements, fixes and cleanups to tools/
The merge commits, the shortlog and the changelogs contain a lot more
details"
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (249 commits)
perf/core: Disable the event on a truncated AUX record
perf/x86/intel/pt: Generate PMI in the STOP region as well
perf buildid-cache: Use lsdir() for looking up buildid caches
perf symbols: Use lsdir() for the search in kcore cache directory
perf tools: Use SBUILD_ID_SIZE where applicable
perf tools: Fix lsdir to set errno correctly
perf trace: Move seccomp args beautifiers to tools/perf/trace/beauty/
perf trace: Move flock op beautifier to tools/perf/trace/beauty/
perf build: Add build-test for debug-frame on arm/arm64
perf build: Add build-test for libunwind cross-platforms support
perf script: Fix export of callchains with recursion in db-export
perf script: Fix callchain addresses in db-export
perf script: Fix symbol insertion behavior in db-export
perf symbols: Add dso__insert_symbol function
perf scripting python: Use Py_FatalError instead of die()
perf tools: Remove xrealloc and ALLOC_GROW
perf help: Do not use ALLOC_GROW in add_cmd_list
perf pmu: Make pmu_formats_string to check return value of strbuf
perf header: Make topology checkers to check return value of strbuf
perf tools: Make alias handler to check return value of strbuf
...
|
|
Commit b894157145e4 ("x86/PCI: Mark Broadwell-EP Home Agent & PCU as having
non-compliant BARs") marked Home Agent 0 & PCU has having non-compliant
BARs. Home Agent 1 also has non-compliant BARs.
Mark Home Agent 1 as having non-compliant BARs so the PCI core doesn't
touch them.
The problem with these devices is documented in the Xeon v4 specification
update:
BDF2 PCI BARs in the Home Agent Will Return Non-Zero Values
During Enumeration
Problem: During system initialization the Operating System may access
the standard PCI BARs (Base Address Registers). Due to
this erratum, accesses to the Home Agent BAR registers (Bus
1; Device 18; Function 0,4; Offsets (0x14-0x24) will return
non-zero values.
Implication: The operating system may issue a warning. Intel has not
observed any functional failures due to this erratum.
Link: http://www.intel.com/content/www/us/en/processors/xeon/xeon-e5-v4-spec-update.html
Fixes: b894157145e4 ("x86/PCI: Mark Broadwell-EP Home Agent & PCU as having non-compliant BARs")
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Ingo Molnar <mingo@redhat.com>
CC: "H. Peter Anvin" <hpa@zytor.com>
CC: Andi Kleen <ak@linux.intel.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull support for killable rwsems from Ingo Molnar:
"This, by Michal Hocko, implements down_write_killable().
The main usecase will be to update mm_sem usage sites to use this new
API, to allow the mm-reaper introduced in commit aac453635549 ("mm,
oom: introduce oom reaper") to tear down oom victim address spaces
asynchronously with minimum latencies and without deadlock worries"
[ The vfs will want it too as the inode lock is changed from a mutex to
a rwsem due to the parallel lookup and readdir updates ]
* 'locking-rwsem-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
locking/rwsem: Fix comment on register clobbering
locking/rwsem: Fix down_write_killable()
locking/rwsem, x86: Add frame annotation for call_rwsem_down_write_failed_killable()
locking/rwsem: Provide down_write_killable()
locking/rwsem, x86: Provide __down_write_killable()
locking/rwsem, s390: Provide __down_write_killable()
locking/rwsem, ia64: Provide __down_write_killable()
locking/rwsem, alpha: Provide __down_write_killable()
locking/rwsem: Introduce basis for down_write_killable()
locking/rwsem, sparc: Drop superfluous arch specific implementation
locking/rwsem, sh: Drop superfluous arch specific implementation
locking/rwsem, xtensa: Drop superfluous arch specific implementation
locking/rwsem: Drop explicit memory barriers
locking/rwsem: Get rid of __down_write_nested()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull EFI updates from Ingo Molnar:
"The main changes in this cycle were:
- Drop the unused EFI_SYSTEM_TABLES efi.flags bit and ensure the
ARM/arm64 EFI System Table mapping is read-only (Ard Biesheuvel)
- Add a comment to explain that one of the code paths in the x86/pat
code is only executed for EFI boot (Matt Fleming)
- Improve Secure Boot status checks on arm64 and handle unexpected
errors (Linn Crosetto)
- Remove the global EFI memory map variable 'memmap' as the same
information is already available in efi::memmap (Matt Fleming)
- Add EFI Memory Attribute table support for ARM/arm64 (Ard
Biesheuvel)
- Add EFI GOP framebuffer support for ARM/arm64 (Ard Biesheuvel)
- Add EFI Bootloader Control driver for storing reboot(2) data in EFI
variables for consumption by bootloaders (Jeremy Compostella)
- Add Core EFI capsule support (Matt Fleming)
- Add EFI capsule char driver (Kweh, Hock Leong)
- Unify EFI memory map code for ARM and arm64 (Ard Biesheuvel)
- Add generic EFI support for detecting when firmware corrupts CPU
status register bits (like IRQ flags) when performing EFI runtime
service calls (Mark Rutland)
... and other misc cleanups"
* 'efi-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (46 commits)
efivarfs: Make efivarfs_file_ioctl() static
efi: Merge boolean flag arguments
efi/capsule: Move 'capsule' to the stack in efi_capsule_supported()
efibc: Fix excessive stack footprint warning
efi/capsule: Make efi_capsule_pending() lockless
efi: Remove unnecessary (and buggy) .memmap initialization from the Xen EFI driver
efi/runtime-wrappers: Remove ARCH_EFI_IRQ_FLAGS_MASK #ifdef
x86/efi: Enable runtime call flag checking
arm/efi: Enable runtime call flag checking
arm64/efi: Enable runtime call flag checking
efi/runtime-wrappers: Detect firmware IRQ flag corruption
efi/runtime-wrappers: Remove redundant #ifdefs
x86/efi: Move to generic {__,}efi_call_virt()
arm/efi: Move to generic {__,}efi_call_virt()
arm64/efi: Move to generic {__,}efi_call_virt()
efi/runtime-wrappers: Add {__,}efi_call_virt() templates
efi/arm-init: Reserve rather than unmap the memory map for ARM as well
efi: Add misc char driver interface to update EFI firmware
x86/efi: Force EFI reboot to process pending capsules
efi: Add 'capsule' update support
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull core signal updates from Ingo Molnar:
"These updates from Stas Sergeev and Andy Lutomirski, improve the
sigaltstack interface by extending its ABI with the SS_AUTODISARM
feature, which makes it possible to use swapcontext() in a sighandler
that works on sigaltstack. Without this flag, the subsequent signal
will corrupt the state of the switched-away sighandler.
The inspiration is more robust dosemu signal handling"
* 'core-signals-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
signals/sigaltstack: Change SS_AUTODISARM to (1U << 31)
signals/sigaltstack: Report current flag bits in sigaltstack()
selftests/sigaltstack: Fix the sigaltstack test on old kernels
signals/sigaltstack: If SS_AUTODISARM, bypass on_sig_stack()
selftests/sigaltstack: Add new testcase for sigaltstack(SS_ONSTACK|SS_AUTODISARM)
signals/sigaltstack: Implement SS_AUTODISARM flag
signals/sigaltstack: Prepare to add new SS_xxx flags
signals/sigaltstack, x86/signals: Unify the x86 sigaltstack check with other architectures
|
|
This patch adds recently added constant blinding helpers into the
x86 eBPF JIT. In the bpf_int_jit_compile() path, requirements are
to utilize bpf_jit_blind_constants()/bpf_jit_prog_release_other()
pair for rewriting the program into a blinded one, and to map the
BPF_REG_AX register to a CPU register. The mapping of BPF_REG_AX
is at non-callee saved register r10, and thus shared with cached
skb->data used for ld_abs/ind and not in every program type needed.
When blinding is not used, there's zero additional overhead in the
generated image.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Since the blinding is strictly only called from inside eBPF JITs,
we need to change signatures for bpf_int_jit_compile() and
bpf_prog_select_runtime() first in order to prepare that the
eBPF program we're dealing with can change underneath. Hence,
for call sites, we need to return the latest prog. No functional
change in this patch.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
There is never such a situation, where bpf_int_jit_compile() is
called with either prog as NULL or len as 0, so the tests are
unnecessary and confusing as people would just copy them. s390
doesn't have them, so no change is needed there.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Split the HAVE_BPF_JIT into two for distinguishing cBPF and eBPF JITs.
Current cBPF ones:
# git grep -n HAVE_CBPF_JIT arch/
arch/arm/Kconfig:44: select HAVE_CBPF_JIT
arch/mips/Kconfig:18: select HAVE_CBPF_JIT if !CPU_MICROMIPS
arch/powerpc/Kconfig:129: select HAVE_CBPF_JIT
arch/sparc/Kconfig:35: select HAVE_CBPF_JIT
Current eBPF ones:
# git grep -n HAVE_EBPF_JIT arch/
arch/arm64/Kconfig:61: select HAVE_EBPF_JIT
arch/s390/Kconfig:126: select HAVE_EBPF_JIT if PACK_STACK && HAVE_MARCH_Z196_FEATURES
arch/x86/Kconfig:94: select HAVE_EBPF_JIT if X86_64
Later code also needs this facility to check for eBPF JITs.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
* acpi-pci:
ACPI,PCI,IRQ: remove SCI penalize function
ACPI,PCI,IRQ: remove redundant code in acpi_irq_penalty_init()
ACPI,PCI,IRQ: reduce static IRQ array size to 16
ACPI,PCI,IRQ: reduce resource requirements
* acpi-misc:
ACPI / sysfs: fix error code in get_status()
ACPI / device_sysfs: Clean up checkpatch errors
ACPI / device_sysfs: Change _SUN and _STA show functions error return to EIO
ACPI / device_sysfs: Add sysfs support for _HRV hardware revision
arm64: defconfig: Enable ACPI
ACPI / ARM64: Remove EXPERT dependency for ACPI on ARM64
ACPI / ARM64: Don't enable ACPI by default on ARM64
acer-wmi: Use acpi_dev_found()
eeepc-wmi: Use acpi_dev_found()
ACPI / utils: Rename acpi_dev_present()
* acpi-tools:
tools/power/acpi: close file only if it is open
|
|
* acpi-numa:
ACPI / SRAT: fix SRAT parsing order with both LAPIC and X2APIC present
* acpi-tables:
ACPI / tables: Fix DSDT override mechanism
ACPI / tables: Convert initrd table override to table upgrade mechanism
ACPI / x86: Cleanup initrd related code
ACPI / tables: Move table override mechanisms to tables.c
* acpi-osi:
ACPI / osi: Collect _OSI handling into one single file
ACPI / osi: Cleanup coding style issues before creating a separate OSI source file
ACPI / osi: Cleanup OSI handling code to use bool
ACPI / osi: Fix default _OSI(Darwin) support
ACPI / osi: Add acpi_osi=!! to allow reverting acpi_osi=!
ACPI / osi: Cleanup _OSI("Linux") related code before introducing new support
ACPI / osi: Fix an issue that acpi_osi=!* cannot disable ACPICA internal strings
Conflicts:
drivers/acpi/internal.h
|
|
* acpi-drivers:
ACPI / GED: make evged.c explicitly non-modular
ACPI / amba: Remove CLK_IS_ROOT
ACPI / APD: Remove CLK_IS_ROOT
ACPI: implement Generic Event Device
* acpi-pm:
ACPI / PM: Introduce efi poweroff for HW-full platforms without _S5
* acpi-ec:
ACPI 2.0 / AML: Improve module level execution by moving the If/Else/While execution to per-table basis
ACPI 2.0 / ECDT: Enable correct ECDT initialization order
ACPI 2.0 / ECDT: Remove early namespace reference from EC
ACPI 2.0 / ECDT: Split EC_FLAGS_HANDLERS_INSTALLED
* acpi-video:
ACPI / video: mark acpi_video_get_levels() inline
Thermal / ACPI / video: add INT3406 thermal driver
ACPI/video: export acpi_video_get_levels
video / backlight: remove the backlight_device_registered API
video / backlight: add two APIs for drivers to use
|
|
When I added support for the Memory Protection Keys processor
feature, I had to reindent the REQUIRED/DISABLED_MASK macros, and
also consult the later cpufeature words.
I'm not quite sure how I bungled it, but I consulted the wrong
word at the end. This only affected required or disabled cpu
features in cpufeature words 14, 15 and 16. So, only Protection
Keys itself was screwed over here.
The result was that if you disabled pkeys in your .config, you
might still see some code show up that should have been compiled
out. There should be no functional problems, though.
In verifying this patch I also realized that the DISABLE_PKU/OSPKE
macros were defined backwards and that the cpu_has() check in
setup_pku() was not doing the compile-time disabled checks.
So also fix the macro for DISABLE_PKU/OSPKE and add a compile-time
check for pkeys being enabled in setup_pku().
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: <stable@vger.kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Dave Hansen <dave@sr71.net>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Fixes: dfb4a70f20c5 ("x86/cpufeature, x86/mm/pkeys: Add protection keys related CPUID definitions")
Link: http://lkml.kernel.org/r/20160513221328.C200930B@viggo.jf.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Document explicitly that %edx can get clobbered on the slow path, on
32-bit kernels. Something I learned the hard way. :-\
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: linux-next@vger.kernel.org
Link: http://lkml.kernel.org/r/20160516093428.GA26108@pd.tnic
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fix from Thomas Gleixner:
"Just the missing compat entry for the new pread/writev2"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86: Use compat version for preadv2 and pwritev2
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
"An uncharacteristically large number of bugs popped up in the last
week:
- various tooling fixes, two crashes and build problems
- two Intel PT fixes
- an KNL uncore driver fix
- an Intel PMU driver fix"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf stat: Fallback to user only counters when perf_event_paranoid > 1
perf evsel: Handle EACCESS + perf_event_paranoid=2 in fallback()
perf evsel: Improve EPERM error handling in open_strerror()
tools lib traceevent: Do not reassign parg after collapse_tree()
perf probe: Check if dwarf_getlocations() is available
perf dwarf: Guard !x86_64 definitions under #ifdef else clause
perf tools: Use readdir() instead of deprecated readdir_r()
perf thread_map: Use readdir() instead of deprecated readdir_r()
perf script: Use readdir() instead of deprecated readdir_r()
perf tools: Use readdir() instead of deprecated readdir_r()
perf/core: Disable the event on a truncated AUX record
perf/x86/intel/pt: Generate PMI in the STOP region as well
perf/x86: Fix undefined shift on 32-bit kernels
perf/x86/msr: Fix SMI overflow
perf/x86/intel/uncore: Fix CHA registers configuration procedure for Knights Landing platform
perf diff: Fix duplicated output column
|
|
Some wakeups should not be considered a sucessful poll. For example on
s390 I/O interrupts are usually floating, which means that _ALL_ CPUs
would be considered runnable - letting all vCPUs poll all the time for
transactional like workload, even if one vCPU would be enough.
This can result in huge CPU usage for large guests.
This patch lets architectures provide a way to qualify wakeups if they
should be considered a good/bad wakeups in regard to polls.
For s390 the implementation will fence of halt polling for anything but
known good, single vCPU events. The s390 implementation for floating
interrupts does a wakeup for one vCPU, but the interrupt will be delivered
by whatever CPU checks first for a pending interrupt. We prefer the
woken up CPU by marking the poll of this CPU as "good" poll.
This code will also mark several other wakeup reasons like IPI or
expired timers as "good". This will of course also mark some events as
not sucessful. As KVM on z runs always as a 2nd level hypervisor,
we prefer to not poll, unless we are really sure, though.
This patch successfully limits the CPU usage for cases like uperf 1byte
transactional ping pong workload or wakeup heavy workload like OLTP
while still providing a proper speedup.
This also introduced a new vcpu stat "halt_poll_no_tuning" that marks
wakeups that are considered not good for polling.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Radim Krčmář <rkrcmar@redhat.com> (for an earlier version)
Cc: David Matlack <dmatlack@google.com>
Cc: Wanpeng Li <kernellwp@gmail.com>
[Rename config symbol. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
This fixes an oversight in:
731e33e39a5b95 ("Remove FSBASE/GSBASE < 4G optimization")
Signed-off-by: Mateusz Guzik <mguzik@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/1462913803-29634-1-git-send-email-mguzik@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Currently, the PT driver always sets the PMI bit one region (page) before
the STOP region so that we can wake up the consumer before we run out of
room in the buffer and have to disable the event. However, we also need
an interrupt in the last output region, so that we actually get to disable
the event (if no more room from new data is available at that point),
otherwise hardware just quietly refuses to start, but the event is
scheduled in and we end up losing trace data till the event gets removed.
For a cpu-wide event it is even worse since there may not be any
re-scheduling at all and no chance for the ring buffer code to notice
that its buffer is filled up and the event needs to be disabled (so that
the consumer can re-enable it when it finishes reading the data out). In
other words, all the trace data will be lost after the buffer gets filled
up.
This patch makes PT also generate a PMI when the last output region is
full.
Reported-by: Markus Metzger <markus.t.metzger@intel.com>
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: <stable@vger.kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: vince@deater.net
Link: http://lkml.kernel.org/r/1462886313-13660-2-git-send-email-alexander.shishkin@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Intel Cherrytrail is based on Airmont core so MSR_FSB_FREQ[2:0] = 4
means that the CPU reference clock runs at 80MHz. Add this missing
frequency to the table.
Signed-off-by: Jeremy Compostella <jeremy.compostella@intel.com>
Link: http://lkml.kernel.org/r/87y47gty89.fsf@intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|