Age | Commit message (Collapse) | Author |
|
Daniel Borkmann says:
====================
pull-request: bpf-next 2020-03-13
The following pull-request contains BPF updates for your *net-next* tree.
We've added 86 non-merge commits during the last 12 day(s) which contain
a total of 107 files changed, 5771 insertions(+), 1700 deletions(-).
The main changes are:
1) Add modify_return attach type which allows to attach to a function via
BPF trampoline and is run after the fentry and before the fexit programs
and can pass a return code to the original caller, from KP Singh.
2) Generalize BPF's kallsyms handling and add BPF trampoline and dispatcher
objects to be visible in /proc/kallsyms so they can be annotated in
stack traces, from Jiri Olsa.
3) Extend BPF sockmap to allow for UDP next to existing TCP support in order
in order to enable this for BPF based socket dispatch, from Lorenz Bauer.
4) Introduce a new bpftool 'prog profile' command which attaches to existing
BPF programs via fentry and fexit hooks and reads out hardware counters
during that period, from Song Liu. Example usage:
bpftool prog profile id 337 duration 3 cycles instructions llc_misses
4228 run_cnt
3403698 cycles (84.08%)
3525294 instructions # 1.04 insn per cycle (84.05%)
13 llc_misses # 3.69 LLC misses per million isns (83.50%)
5) Batch of improvements to libbpf, bpftool and BPF selftests. Also addition
of a new bpf_link abstraction to keep in particular BPF tracing programs
attached even when the applicaion owning them exits, from Andrii Nakryiko.
6) New bpf_get_current_pid_tgid() helper for tracing to perform PID filtering
and which returns the PID as seen by the init namespace, from Carlos Neira.
7) Refactor of RISC-V JIT code to move out common pieces and addition of a
new RV32G BPF JIT compiler, from Luke Nelson.
8) Add gso_size context member to __sk_buff in order to be able to know whether
a given skb is GSO or not, from Willem de Bruijn.
9) Add a new bpf_xdp_output() helper which reuses XDP's existing perf RB output
implementation but can be called from tracepoint programs, from Eelco Chaudron.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The kbuild test robot reported compile issue on x86 in one of
the following patches that adds <linux/kallsyms.h> include into
<linux/bpf.h>, which is picked up by init_32.c object.
The problem is that <linux/kallsyms.h> defines global function
is_kernel_text which colides with the static function of the
same name defined in init_32.c:
$ make ARCH=i386
...
>> arch/x86/mm/init_32.c:241:19: error: redefinition of 'is_kernel_text'
static inline int is_kernel_text(unsigned long addr)
^~~~~~~~~~~~~~
In file included from include/linux/bpf.h:21:0,
from include/linux/bpf-cgroup.h:5,
from include/linux/cgroup-defs.h:22,
from include/linux/cgroup.h:28,
from include/linux/hugetlb.h:9,
from arch/x86/mm/init_32.c:18:
include/linux/kallsyms.h:31:19: note: previous definition of 'is_kernel_text' was here
static inline int is_kernel_text(unsigned long addr)
Renaming the init_32.c is_kernel_text function to __is_kernel_text.
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20200312195610.346362-2-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Minor overlapping changes, nothing serious.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Pull networking fixes from David Miller:
"It looks like a decent sized set of fixes, but a lot of these are one
liner off-by-one and similar type changes:
1) Fix netlink header pointer to calcular bad attribute offset
reported to user. From Pablo Neira Ayuso.
2) Don't double clear PHY interrupts when ->did_interrupt is set,
from Heiner Kallweit.
3) Add missing validation of various (devlink, nl802154, fib, etc.)
attributes, from Jakub Kicinski.
4) Missing *pos increments in various netfilter seq_next ops, from
Vasily Averin.
5) Missing break in of_mdiobus_register() loop, from Dajun Jin.
6) Don't double bump tx_dropped in veth driver, from Jiang Lidong.
7) Work around FMAN erratum A050385, from Madalin Bucur.
8) Make sure ARP header is pulled early enough in bonding driver,
from Eric Dumazet.
9) Do a cond_resched() during multicast processing of ipvlan and
macvlan, from Mahesh Bandewar.
10) Don't attach cgroups to unrelated sockets when in interrupt
context, from Shakeel Butt.
11) Fix tpacket ring state management when encountering unknown GSO
types. From Willem de Bruijn.
12) Fix MDIO bus PHY resume by checking mdio_bus_phy_may_suspend()
only in the suspend context. From Heiner Kallweit"
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (112 commits)
net: systemport: fix index check to avoid an array out of bounds access
tc-testing: add ETS scheduler to tdc build configuration
net: phy: fix MDIO bus PM PHY resuming
net: hns3: clear port base VLAN when unload PF
net: hns3: fix RMW issue for VLAN filter switch
net: hns3: fix VF VLAN table entries inconsistent issue
net: hns3: fix "tc qdisc del" failed issue
taprio: Fix sending packets without dequeueing them
net: mvmdio: avoid error message for optional IRQ
net: dsa: mv88e6xxx: Add missing mask of ATU occupancy register
net: memcg: fix lockdep splat in inet_csk_accept()
s390/qeth: implement smarter resizing of the RX buffer pool
s390/qeth: refactor buffer pool code
s390/qeth: use page pointers to manage RX buffer pool
seg6: fix SRv6 L2 tunnels to use IANA-assigned protocol number
net: dsa: Don't instantiate phylink for CPU/DSA ports unless needed
net/packet: tpacket_rcv: do not increment ring index on drop
sxgbe: Fix off by one in samsung driver strncpy size arg
net: caif: Add lockdep expression to RCU traversal primitive
MAINTAINERS: remove Sathya Perla as Emulex NIC maintainer
...
|
|
This reverts commit 9cc5ae125f0eaee471bc87fb5cbf29385fd9272a.
This commit:
b303f9f0050b arm64: dts: sdm845: Redefine interconnect provider DT nodes
found in the Qualcomm for-next tree removes/redefines the interconnect
provider node(s) used for IPA. I'm not sure whether it technically
conflicts with the IPA change to "sdm845.dtsi" in for-next, but it renders
it broken.
Revert this commit in the for-next tree, with the plan to incorporate
it into the Qualcomm tree instead.
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto fix from Herbert Xu:
"Fix a build problem with x86/curve25519"
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: x86/curve25519 - support assemblers with no adx support
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux
Pull MIPS fixes from Thomas Bogendoerfer:
"A few MIPS fixes:
- DT fixes for CI20
- Fix command line handling
- Correct patchwork URL"
* tag 'mips_fixes_5.6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
MAINTAINERS: Correct MIPS patchwork URL
MIPS: DTS: CI20: fix interrupt for pcf8563 RTC
MIPS: DTS: CI20: fix PMU definitions for ACT8600
MIPS: Fix CONFIG_MIPS_CMDLINE_DTB_EXTEND handling
|
|
fmod_ret progs are emitted as:
start = __bpf_prog_enter();
call fmod_ret
*(u64 *)(rbp - 8) = rax
__bpf_prog_exit(, start);
test eax, eax
jne do_fexit
That 'test eax, eax' is working by accident. The compiler is free to use rax
inside __bpf_prog_exit() or inside functions that __bpf_prog_exit() is calling.
Which caused "test_progs -t modify_return" to sporadically fail depending on
compiler version and kconfig. Fix it by using 'cmp [rbp - 8], 0' instead of
'test eax, eax'.
Fixes: ae24082331d9 ("bpf: Introduce BPF_MODIFY_RETURN")
Reported-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: KP Singh <kpsingh@google.com>
Link: https://lore.kernel.org/bpf/20200311003906.3643037-1-ast@kernel.org
|
|
Add IPA-related nodes and definitions to "sdm845.dtsi".
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull ARM SoC fixes from Olof Johansson:
"We've been accruing these for a couple of weeks, so the batch is a bit
bigger than usual.
Largest delta is due to a led-bl driver that is added -- there was a
miscommunication before the merge window and the driver didn't make it
in. Due to this, the platforms needing it regressed. At this point, it
seemed easier to add the new driver than unwind the changes.
Besides that, there are a handful of various fixes:
- AMD tee memory leak fix
- A handful of fixlets for i.MX SCU communication
- A few maintainers woke up and realized DEBUG_FS had been missing
for a while, so a few updates of that.
... and the usual collection of smaller fixes to various platforms"
* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (37 commits)
ARM: socfpga_defconfig: Add back DEBUG_FS
arm64: dts: socfpga: agilex: Fix gmac compatible
ARM: bcm2835_defconfig: Explicitly restore CONFIG_DEBUG_FS
arm64: dts: meson: fix gxm-khadas-vim2 wifi
arm64: dts: meson-sm1-sei610: add missing interrupt-names
ARM: meson: Drop unneeded select of COMMON_CLK
ARM: dts: bcm2711: Add pcie0 alias
ARM: dts: bcm283x: Add missing properties to the PWR LED
tee: amdtee: fix memory leak in amdtee_open_session()
ARM: OMAP2+: Fix compile if CONFIG_HAVE_ARM_SMCCC is not set
arm: dts: dra76x: Fix mmc3 max-frequency
ARM: dts: dra7: Add "dma-ranges" property to PCIe RC DT nodes
bus: ti-sysc: Fix 1-wire reset quirk
ARM: dts: r8a7779: Remove deprecated "renesas, rcar-sata" compatible value
soc: imx-scu: Align imx sc msg structs to 4
firmware: imx: Align imx_sc_msg_req_cpu_start to 4
firmware: imx: scu-pd: Align imx sc msg structs to 4
firmware: imx: misc: Align imx sc msg structs to 4
firmware: imx: scu: Ensure sequential TX
ARM: dts: imx7-colibri: Fix frequency for sd/mmc
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 fixes from Vasily Gorbik:
- Fix panic in gup_fast on large pud by providing an implementation of
pud_write. This has been overlooked during migration to common gup
code.
- Fix unexpected write combining on PCI stores.
* tag 's390-5.6-5' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/pci: Fix unexpected write combine on resource
s390/mm: fix panic in gup_fast on large pud
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
"Some more powerpc fixes for 5.6:
- One fix for a recent regression to our breakpoint/watchpoint code.
- Another fix for our KUAP support, this time a missing annotation in
a rarely used path in signal handling.
- A fix for our handling of a CPU feature that effects the PMU, when
booting guests in some configurations.
- A minor fix to our linker script to explicitly include the .BTF
section.
Thanks to: Christophe Leroy, Desnes A. Nunes do Rosario, Leonardo
Bras, Naveen N. Rao, Ravi Bangoria, Stefan Berger"
* tag 'powerpc-5.6-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/mm: Fix missing KUAP disable in flush_coherent_icache()
powerpc: fix hardware PMU exception bug on PowerVM compatibility mode systems
powerpc: Include .BTF section
powerpc/watchpoint: Don't call dar_within_range() for Book3S
|
|
Interrupts should not be specified by interrupt line but by
gpio parent and reference.
Fixes: 73f2b940474d ("MIPS: CI20: DTS: Add I2C nodes")
Cc: stable@vger.kernel.org
Signed-off-by: H. Nikolaus Schaller <hns@goldelico.com>
Reviewed-by: Paul Cercueil <paul@crapouillou.net>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
|
|
There is a ACT8600 on the CI20 board and the bindings of the
ACT8865 driver have changed without updating the CI20 device
tree. Therefore the PMU can not be probed successfully and
is running in power-on reset state.
Fix DT to match the latest act8865-regulator bindings.
Fixes: 73f2b940474d ("MIPS: CI20: DTS: Add I2C nodes")
Cc: stable@vger.kernel.org
Signed-off-by: H. Nikolaus Schaller <hns@goldelico.com>
Reviewed-by: Paul Cercueil <paul@crapouillou.net>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
|
|
Set ethtool_ops->supported_coalesce_params to let
the core reject unsupported coalescing parameters.
This driver did not previously reject unsupported parameters.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The LS1043A SoC is affected by the A050385 erratum stating that
FMAN DMA read or writes under heavy traffic load may cause FMAN
internal resource leak thus stopping further packet processing.
Signed-off-by: Madalin Bucur <madalin.bucur@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V fixes from Palmer Dabbelt:
"This contains a handful of fixes that I would like to target for 5.6:
- A pair of fixes to module loading, which we hope solve the last of
the issues with module text being loaded too sparsely for our call
relocations.
- A Kconfig fix that disallows selecting memory models not supported
by NOMMU.
- A series of Kconfig updates to ease selecting the drivers necessary
to run on QEMU's virt platform.
- DTS updates for SiFive's HiFive Unleashed.
- A fix to our seccomp support that avoids mangling restartable
syscalls"
* tag 'riscv-for-linus-5.6-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
riscv: fix seccomp reject syscall code path
riscv: dts: Add GPIO reboot method to HiFive Unleashed DTS file
RISC-V: Select Goldfish RTC driver for QEMU virt machine
RISC-V: Select SYSCON Reboot and Poweroff for QEMU virt machine
RISC-V: Enable QEMU virt machine support in defconfigs
RISC-V: Add kconfig option for QEMU virt machine
riscv: Fix range looking for kernel image memblock
riscv: Force flat memory model with no-mmu
riscv: Change code model of module to medany to improve data accessing
riscv: avoid the PIC offset of static percpu data in module beyond 2G limits
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Will Deacon:
"Here are another three arm64 fixes for 5.6, all pretty minor. Main
thing is fixing a silly bug in the fsl_imx8_ddr PMU driver where we
would zero the counters when disabling them.
- Fix misreporting of ASID limit when KPTI is enabled
- Fix busted NULL pointer checks for GICC structure in ACPI PMU code
- Avoid nobbling the "fsl_imx8_ddr" PMU counters when disabling them"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: context: Fix ASID limit in boot messages
drivers/perf: arm_pmu_acpi: Fix incorrect checking of gicc pointer
drivers/perf: fsl_imx8_ddr: Correct the CLEAR bit definition
|
|
save_stack_trace_tsk_reliable() is not the only function providing the
reliable stack traces anymore. Architecture might define ARCH_STACKWALK
which provides a newer stack walking interface and has
arch_stack_walk_reliable() function. Update the description accordingly.
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Miroslav Benes <mbenes@suse.cz>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: http://lkml.kernel.org/r/20200120154042.9934-1-mbenes@suse.cz
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
If secure_computing() rejected a system call, we were previously setting
the system call number to -1, to indicate to later code that the syscall
failed. However, if something (e.g. a user notification) was sleeping, and
received a signal, we may set a0 to -ERESTARTSYS and re-try the system call
again.
In this case, seccomp "denies" the syscall (because of the signal), and we
would set a7 to -1, thus losing the value of the system call we want to
restart.
Instead, let's return -1 from do_syscall_trace_enter() to indicate that the
syscall was rejected, so we don't clobber the value in case of -ERESTARTSYS
or whatever.
This commit fixes the user_notification_signal seccomp selftest on riscv to
no longer hang. That test expects the system call to be re-issued after the
signal, and it wasn't due to the above bug. Now that it is, everything
works normally.
Note that in the ptrace (tracer) case, the tracer can set the register
values to whatever they want, so we still need to keep the code that
handles out-of-bounds syscalls. However, we can drop the comment.
We can also drop syscall_set_nr(), since it is no longer used anywhere, and
the code that re-loads the value in a7 because of it.
Reported in: https://lore.kernel.org/bpf/CAEn-LTp=ss0Dfv6J00=rCAy+N78U2AmhqJNjfqjr2FDpPYjxEQ@mail.gmail.com/
Reported-by: David Abdurachmanov <david.abdurachmanov@gmail.com>
Signed-off-by: Tycho Andersen <tycho@tycho.ws>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
|
|
Add the ability to reboot the HiFive Unleashed board via GPIO.
Signed-off-by: Yash Shah <yash.shah@sifive.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
|
|
We select Goldfish RTC driver using QEMU virt machine kconfig option
to access RTC device on QEMU virt machine.
Signed-off-by: Anup Patel <anup.patel@wdc.com>
Reviewed-by: Atish Patra <atish.patra@wdc.com>
Reviewed-by: Palmer Dabbelt <palmerdabbelt@google.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
|
|
The SYSCON Reboot and Poweroff drivers can be used on QEMU virt machine
to reboot or poweroff the system hence we select these drivers using
QEMU virt machine kconfig option.
Signed-off-by: Anup Patel <anup.patel@wdc.com>
Reviewed-by: Palmer Dabbelt <palmerdabbelt@google.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
|
|
We have kconfig option for QEMU virt machine so let's enable it
in RV32 and RV64 defconfigs. Also, we remove various VIRTIO configs
from RV32 and RV64 defconfigs because these are now selected by
QEMU virt machine kconfig option.
Signed-off-by: Anup Patel <anup.patel@wdc.com>
Reviewed-by: Atish Patra <atish.patra@wdc.com>
Reviewed-by: Palmer Dabbelt <palmerdabbelt@google.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
|
|
We add kconfig option for QEMU virt machine and select all
required VIRTIO drivers using this kconfig option.
Signed-off-by: Anup Patel <anup.patel@wdc.com>
Reviewed-by: Atish Patra <atish.patra@wdc.com>
Reviewed-by: Palmer Dabbelt <palmerdabbelt@google.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
|
|
This is an eBPF JIT for RV32G, adapted from the JIT for RV64G and
the 32-bit ARM JIT.
There are two main changes required for this to work compared to
the RV64 JIT.
First, eBPF registers are 64-bit, while RV32G registers are 32-bit.
BPF registers either map directly to 2 RISC-V registers, or reside
in stack scratch space and are saved and restored when used.
Second, many 64-bit ALU operations do not trivially map to 32-bit
operations. Operations that move bits between high and low words,
such as ADD, LSH, MUL, and others must emulate the 64-bit behavior
in terms of 32-bit instructions.
This patch also makes related changes to bpf_jit.h, such
as adding RISC-V instructions required by the RV32 JIT.
Supported features:
The RV32 JIT supports the same features and instructions as the
RV64 JIT, with the following exceptions:
- ALU64 DIV/MOD: Requires loops to implement on 32-bit hardware.
- BPF_XADD | BPF_DW: There's no 8-byte atomic instruction in RV32.
These features are also unsupported on other BPF JITs for 32-bit
architectures.
Testing:
- lib/test_bpf.c
test_bpf: Summary: 378 PASSED, 0 FAILED, [349/366 JIT'ed]
test_bpf: test_skb_segment: Summary: 2 PASSED, 0 FAILED
The tests that are not JITed are all due to use of 64-bit div/mod
or 64-bit xadd.
- tools/testing/selftests/bpf/test_verifier.c
Summary: 1415 PASSED, 122 SKIPPED, 43 FAILED
Tested both with and without BPF JIT hardening.
This is the same set of tests that pass using the BPF interpreter
with the JIT disabled.
Verification and synthesis:
We developed the RV32 JIT using our automated verification tool,
Serval. We have used Serval in the past to verify patches to the
RV64 JIT. We also used Serval to superoptimize the resulting code
through program synthesis.
You can find the tool and a guide to the approach and results here:
https://github.com/uw-unsat/serval-bpf/tree/rv32-jit-v5
Co-developed-by: Xi Wang <xi.wang@gmail.com>
Signed-off-by: Xi Wang <xi.wang@gmail.com>
Signed-off-by: Luke Nelson <luke.r.nels@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Björn Töpel <bjorn.topel@gmail.com>
Acked-by: Björn Töpel <bjorn.topel@gmail.com>
Link: https://lore.kernel.org/bpf/20200305050207.4159-3-luke.r.nels@gmail.com
|
|
This patch factors out code that can be used by both the RV64 and RV32
BPF JITs to a common bpf_jit.h and bpf_jit_core.c.
Move struct definitions and macro-like functions to header. Rename
rv_sb_insn/rv_uj_insn to rv_b_insn/rv_j_insn to match the RISC-V
specification.
Move reusable functions emit_body() and bpf_int_jit_compile() to
bpf_jit_core.c with minor simplifications. Rename emit_insn() and
build_{prologue,epilogue}() to be prefixed with "bpf_jit_" as they are
no longer static.
Rename bpf_jit_comp.c to bpf_jit_comp64.c to be more explicit.
Co-developed-by: Xi Wang <xi.wang@gmail.com>
Signed-off-by: Xi Wang <xi.wang@gmail.com>
Signed-off-by: Luke Nelson <luke.r.nels@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Björn Töpel <bjorn.topel@gmail.com>
Acked-by: Björn Töpel <bjorn.topel@gmail.com>
Link: https://lore.kernel.org/bpf/20200305050207.4159-2-luke.r.nels@gmail.com
|
|
Some older version of GAS do not support the ADX instructions, similarly
to how they also don't support AVX and such. This commit adds the same
build-time detection mechanisms we use for AVX and others for ADX, and
then makes sure that the curve25519 library dispatcher calls the right
functions.
Reported-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Stefan reported a strange kernel fault which turned out to be due to a
missing KUAP disable in flush_coherent_icache() called from
flush_icache_range().
The fault looks like:
Kernel attempted to access user page (7fffc30d9c00) - exploit attempt? (uid: 1009)
BUG: Unable to handle kernel data access on read at 0x7fffc30d9c00
Faulting instruction address: 0xc00000000007232c
Oops: Kernel access of bad area, sig: 11 [#1]
LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA PowerNV
CPU: 35 PID: 5886 Comm: sigtramp Not tainted 5.6.0-rc2-gcc-8.2.0-00003-gfc37a1632d40 #79
NIP: c00000000007232c LR: c00000000003b7fc CTR: 0000000000000000
REGS: c000001e11093940 TRAP: 0300 Not tainted (5.6.0-rc2-gcc-8.2.0-00003-gfc37a1632d40)
MSR: 900000000280b033 <SF,HV,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 28000884 XER: 00000000
CFAR: c0000000000722fc DAR: 00007fffc30d9c00 DSISR: 08000000 IRQMASK: 0
GPR00: c00000000003b7fc c000001e11093bd0 c0000000023ac200 00007fffc30d9c00
GPR04: 00007fffc30d9c18 0000000000000000 c000001e11093bd4 0000000000000000
GPR08: 0000000000000000 0000000000000001 0000000000000000 c000001e1104ed80
GPR12: 0000000000000000 c000001fff6ab380 c0000000016be2d0 4000000000000000
GPR16: c000000000000000 bfffffffffffffff 0000000000000000 0000000000000000
GPR20: 00007fffc30d9c00 00007fffc30d8f58 00007fffc30d9c18 00007fffc30d9c20
GPR24: 00007fffc30d9c18 0000000000000000 c000001e11093d90 c000001e1104ed80
GPR28: c000001e11093e90 0000000000000000 c0000000023d9d18 00007fffc30d9c00
NIP flush_icache_range+0x5c/0x80
LR handle_rt_signal64+0x95c/0xc2c
Call Trace:
0xc000001e11093d90 (unreliable)
handle_rt_signal64+0x93c/0xc2c
do_notify_resume+0x310/0x430
ret_from_except_lite+0x70/0x74
Instruction dump:
409e002c 7c0802a6 3c62ff31 3863f6a0 f8010080 48195fed 60000000 48fe4c8d
60000000 e8010080 7c0803a6 7c0004ac <7c00ffac> 7c0004ac 4c00012c 38210070
This path through handle_rt_signal64() to setup_trampoline() and
flush_icache_range() is only triggered by 64-bit processes that have
unmapped their VDSO, which is rare.
flush_icache_range() takes a range of addresses to flush. In
flush_coherent_icache() we implement an optimisation for CPUs where we
know we don't actually have to flush the whole range, we just need to
do a single icbi.
However we still execute the icbi on the user address of the start of
the range we're flushing. On CPUs that also implement KUAP (Power9)
that leads to the spurious fault above.
We should be able to pass any address, including a kernel address, to
the icbi on these CPUs, which would avoid any interaction with KUAP.
But I don't want to make that change in a bug fix, just in case it
surfaces some strange behaviour on some CPU.
So for now just disable KUAP around the icbi. Note the icbi is treated
as a load, so we allow read access, not write as you'd expect.
Fixes: 890274c2dc4c ("powerpc/64s: Implement KUAP for Radix MMU")
Cc: stable@vger.kernel.org # v5.2+
Reported-by: Stefan Berger <stefanb@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200303235708.26004-1-mpe@ellerman.id.au
|
|
When looking for the memblock where the kernel lives, we should check
that the memory range associated to the memblock entirely comprises the
kernel image and not only intersects with it.
Signed-off-by: Alexandre Ghiti <alex@ghiti.fr>
Reviewed-by: Anup Patel <anup@brainfault.org>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
|
|
When multiple programs are attached, each program receives the return
value from the previous program on the stack and the last program
provides the return value to the attached function.
The fmod_ret bpf programs are run after the fentry programs and before
the fexit programs. The original function is only called if all the
fmod_ret programs return 0 to avoid any unintended side-effects. The
success value, i.e. 0 is not currently configurable but can be made so
where user-space can specify it at load time.
For example:
int func_to_be_attached(int a, int b)
{ <--- do_fentry
do_fmod_ret:
<update ret by calling fmod_ret>
if (ret != 0)
goto do_fexit;
original_function:
<side_effects_happen_here>
} <--- do_fexit
The fmod_ret program attached to this function can be defined as:
SEC("fmod_ret/func_to_be_attached")
int BPF_PROG(func_name, int a, int b, int ret)
{
// This will skip the original function logic.
return 1;
}
The first fmod_ret program is passed 0 in its return argument.
Signed-off-by: KP Singh <kpsingh@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200304191853.1529-4-kpsingh@chromium.org
|
|
* Split the invoke_bpf program to prepare for special handling of
fmod_ret programs introduced in a subsequent patch.
* Move the definition of emit_cond_near_jump and emit_nops as they are
needed for fmod_ret.
* Refactor branch target alignment into its own generic helper function
i.e. emit_align.
Signed-off-by: KP Singh <kpsingh@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200304191853.1529-3-kpsingh@chromium.org
|
|
As we need to introduce a third type of attachment for trampolines, the
flattened signature of arch_prepare_bpf_trampoline gets even more
complicated.
Refactor the prog and count argument to arch_prepare_bpf_trampoline to
use bpf_tramp_progs to simplify the addition and accounting for new
attachment types.
Signed-off-by: KP Singh <kpsingh@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200304191853.1529-2-kpsingh@chromium.org
|
|
Compilation errors trigger if ARCH_SPARSEMEM_ENABLE is enabled for
a nommu kernel. Since the sparsemem model does not make sense anyway
for the nommu case, do not allow selecting this option to always use
the flatmem model.
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Reviewed-by: Palmer Dabbelt <palmerdabbelt@google.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/dinguyen/linux into arm/fixes
ARM: socfpga_defconfig: add back DEBUGFS
- Add back DEBUG_FS for socfpga_defconfig
* tag 'socfpga_defconfig_fix_for_v5.6' of git://git.kernel.org/pub/scm/linux/kernel/git/dinguyen/linux:
ARM: socfpga_defconfig: Add back DEBUG_FS
Link: https://lore.kernel.org/r/20200304101917.1243-1-dinguyen@kernel.org
Signed-off-by: Olof Johansson <olof@lixom.net>
|
|
In the initial MIO support introduced in
commit 71ba41c9b1d9 ("s390/pci: provide support for MIO instructions")
zpci_map_resource() and zpci_setup_resources() default to using the
mio_wb address as the resource's start address. This means users of the
mapping, which includes most drivers, will get write combining on PCI
Stores. This may lead to problems when drivers expect write through
behavior when not using an explicit ioremap_wc().
Cc: stable@vger.kernel.org
Fixes: 71ba41c9b1d9 ("s390/pci: provide support for MIO instructions")
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
|
|
On s390 there currently is no implementation of pud_write(). That was ok
as long as we had our own implementation of get_user_pages_fast() which
checked for pud protection by testing the bit directly w/o using
pud_write(). The other callers of pud_write() are not reachable on s390.
After commit 1a42010cdc26 ("s390/mm: convert to the generic
get_user_pages_fast code") we use the generic get_user_pages_fast(), which
does call pud_write() in pud_access_permitted() for FOLL_WRITE access on
a large pud. Without an s390 specific pud_write(), the generic version is
called, which contains a BUG() statement to remind us that we don't have a
proper implementation. This results in a kernel panic.
Fix this by providing an implementation of pud_write().
Cc: <stable@vger.kernel.org> # 5.2+
Fixes: 1a42010cdc26 ("s390/mm: convert to the generic get_user_pages_fast code")
Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
|
|
Commit 0e4a459f56c3 ("tracing: Remove unnecessary DEBUG_FS dependency")
removed select for DEBUG_FS but we still need it for development purposes.
Fixes: 0e4a459f56c3 ("tracing: Remove unnecessary DEBUG_FS dependency")
Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/dinguyen/linux into arm/fixes
arm64: dts: agilex: fix gmac compatible
- The compatible for Agilex GMAC should be "altr,socfpga-stmmac-a10-s10"
* tag 'socfpga_dts_fix_for_v5.6_v2' of git://git.kernel.org/pub/scm/linux/kernel/git/dinguyen/linux: (578 commits)
arm64: dts: socfpga: agilex: Fix gmac compatible
Linux 5.6-rc4
KVM: VMX: check descriptor table exits on instruction emulation
ext4: potential crash on allocation error in ext4_alloc_flex_bg_array()
macintosh: therm_windtunnel: fix regression when instantiating devices
jbd2: fix data races at struct journal_head
kvm: x86: Limit the number of "kvm: disabled by bios" messages
KVM: x86: avoid useless copy of cpufreq policy
KVM: allow disabling -Werror
KVM: x86: allow compiling as non-module with W=1
KVM: Pre-allocate 1 cpumask variable per cpu for both pv tlb and pv ipis
KVM: Introduce pv check helpers
KVM: let declaration of kvm_get_running_vcpus match implementation
KVM: SVM: allocate AVIC data structures based on kvm_amd module parameter
MAINTAINERS: Correct Cadence PCI driver path
io_uring: fix 32-bit compatability with sendmsg/recvmsg
net: dsa: mv88e6xxx: Fix masking of egress port
mlxsw: pci: Wait longer before accessing the device after reset
sfc: fix timestamp reconstruction at 16-bit rollover points
vsock: fix potential deadlock in transport->release()
...
Link: https://lore.kernel.org/r/20200303153509.28248-1-dinguyen@kernel.org
Signed-off-by: Olof Johansson <olof@lixom.net>
|
|
All the loaded module locates in the region [&_end-2G,VMALLOC_END] at
runtime, so the distance from the module start to the end of the kernel
image does not exceed 2GB. Hence, the code model of the kernel module can
be changed to medany to improve the performance data access.
Signed-off-by: Vincent Chen <vincent.chen@sifive.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
|
|
The compiler uses the PIC-relative method to access static variables
instead of GOT when the code model is PIC. Therefore, the limitation of
the access range from the instruction to the symbol address is +-2GB.
Under this circumstance, the kernel cannot load a kernel module if this
module has static per-CPU symbols declared by DEFINE_PER_CPU(). The reason
is that kernel relocates the .data..percpu section of the kernel module to
the end of kernel's .data..percpu. Hence, the distance between the per-CPU
symbols and the instruction will exceed the 2GB limits. To solve this
problem, the kernel should place the loaded module in the memory area
[&_end-2G, VMALLOC_END].
Signed-off-by: Vincent Chen <vincent.chen@sifive.com>
Suggested-by: Alexandre Ghiti <alex@ghiti.fr>
Suggested-by: Anup Patel <anup@brainfault.org>
Tested-by: Alexandre Ghiti <alex@ghiti.fr>
Tested-by: Carlos de Paula <me@carlosedp.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
|
|
Fix gmac compatible string to "altr,socfpga-stmmac-a10-s10". Gmac for
Agilex should use same compatible as Stratix 10.
Fixes: 4b36daf9ada3 ("arm64: dts: agilex: Add initial support for Intel's Agilex SoCFPGA")
Cc: stable@vger.kernel.org
Signed-off-by: Ley Foon Tan <ley.foon.tan@intel.com>
Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>
|
|
https://github.com/Broadcom/stblinux into arm/fixes
This pull request contains Broadcom ARM-based SoCs defconfig file(s)
fixes for v5.6, please pull the following:
- Stefan restores CONFIG_DEBUG_FS from the bcm2835_defconfig which was
accidentally removed
* tag 'arm-soc/for-5.6/defconfig-fixes' of https://github.com/Broadcom/stblinux:
ARM: bcm2835_defconfig: Explicitly restore CONFIG_DEBUG_FS
Link: https://lore.kernel.org/r/20200302195043.14513-1-f.fainelli@gmail.com
Signed-off-by: Olof Johansson <olof@lixom.net>
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/khilman/linux-amlogic into arm/fixes
Amlogic fixes for v5.6-rc
* tag 'amlogic-fixes' of https://git.kernel.org/pub/scm/linux/kernel/git/khilman/linux-amlogic:
arm64: dts: meson: fix gxm-khadas-vim2 wifi
arm64: dts: meson-sm1-sei610: add missing interrupt-names
ARM: meson: Drop unneeded select of COMMON_CLK
Link: https://lore.kernel.org/r/7hr1yc9cc1.fsf@baylibre.com
Signed-off-by: Olof Johansson <olof@lixom.net>
|
|
The commit 0e4a459f56c3 ("tracing: Remove unnecessary DEBUG_FS dependency")
accidentally dropped the DEBUG FS support in bcm2835_defconfig. So
restore the config as before the commit.
Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
Fixes: 0e4a459f56c3 ("tracing: Remove unnecessary DEBUG_FS dependency")
Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
"Misc fixes: a pkeys fix for a bug that triggers with weird BIOS
settings, and two Xen PV fixes: a paravirt interface fix, and
pagetable dumping fix"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mm: Fix dump_pagetables with Xen PV
x86/ioperm: Add new paravirt function update_io_bitmap()
x86/pkeys: Manually set X86_FEATURE_OSPKE to preserve existing changes
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull EFI fixes from Ingo Molnar:
"Three fixes to EFI mixed boot mode, mostly related to x86-64 vmap
stacks activated years ago, bug-fixed recently for EFI, which had
knock-on effects of various 1:1 mapping assumptions in mixed mode.
There's also a READ_ONCE() fix for reading an mmap-ed EFI firmware
data field only once, out of caution"
* 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
efi: READ_ONCE rng seed size before munmap
efi/x86: Handle by-ref arguments covering multiple pages in mixed mode
efi/x86: Remove support for EFI time and counter services in mixed mode
efi/x86: Align GUIDs to their size in the mixed mode runtime wrapper
|
|
Since commit f88f42f853a8 ("arm64: context: Free up kernel ASIDs if KPTI
is not in use"), the NUM_USER_ASIDS macro doesn't correspond to the
effective number of ASIDs when KPTI is enabled. Get an accurate number
of available ASIDs in an arch_initcall, once we've discovered all CPUs'
capabilities and know if we still need to halve the ASID space for KPTI.
Fixes: f88f42f853a8 ("arm64: context: Free up kernel ASIDs if KPTI is not in use")
Reviewed-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Pull KVM fixes from Paolo Bonzini:
"More bugfixes, including a few remaining "make W=1" issues such as too
large frame sizes on some configurations.
On the ARM side, the compiler was messing up shadow stacks between EL1
and EL2 code, which is easily fixed with __always_inline"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: VMX: check descriptor table exits on instruction emulation
kvm: x86: Limit the number of "kvm: disabled by bios" messages
KVM: x86: avoid useless copy of cpufreq policy
KVM: allow disabling -Werror
KVM: x86: allow compiling as non-module with W=1
KVM: Pre-allocate 1 cpumask variable per cpu for both pv tlb and pv ipis
KVM: Introduce pv check helpers
KVM: let declaration of kvm_get_running_vcpus match implementation
KVM: SVM: allocate AVIC data structures based on kvm_amd module parameter
arm64: Ask the compiler to __always_inline functions used by KVM at HYP
KVM: arm64: Define our own swab32() to avoid a uapi static inline
KVM: arm64: Ask the compiler to __always_inline functions used at HYP
kvm: arm/arm64: Fold VHE entry/exit work into kvm_vcpu_run_vhe()
KVM: arm/arm64: Fix up includes for trace.h
|
|
KVM emulates UMIP on hardware that doesn't support it by setting the
'descriptor table exiting' VM-execution control and performing
instruction emulation. When running nested, this emulation is broken as
KVM refuses to emulate L2 instructions by default.
Correct this regression by allowing the emulation of descriptor table
instructions if L1 hasn't requested 'descriptor table exiting'.
Fixes: 07721feee46b ("KVM: nVMX: Don't emulate instructions in guest mode")
Reported-by: Jan Kiszka <jan.kiszka@web.de>
Cc: stable@vger.kernel.org
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Jim Mattson <jmattson@google.com>
Signed-off-by: Oliver Upton <oupton@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|