Age | Commit message (Collapse) | Author |
|
The bpf syscall and selftests conflicts were trivial
overlapping changes.
The r8169 change involved moving the added mdelay from 'net' into a
different function.
A TLS close bug fix overlapped with the splitting of the TLS state
into separate TX and RX parts. I just expanded the tests in the bug
fix from "ctx->conf == X" into "ctx->tx_conf == X && ctx->rx_conf
== X".
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Pull networking fixes from David Miller:
1) Verify lengths of keys provided by the user is AF_KEY, from Kevin
Easton.
2) Add device ID for BCM89610 PHY. Thanks to Bhadram Varka.
3) Add Spectre guards to some ATM code, courtesy of Gustavo A. R.
Silva.
4) Fix infinite loop in NSH protocol code. To Eric Dumazet we are most
grateful for this fix.
5) Line up /proc/net/netlink headers properly. This fix from YU Bo, we
do appreciate.
6) Use after free in TLS code. Once again we are blessed by the
honorable Eric Dumazet with this fix.
7) Fix regression in TLS code causing stalls on partial TLS records.
This fix is bestowed upon us by Andrew Tomt.
8) Deal with too small MTUs properly in LLC code, another great gift
from Eric Dumazet.
9) Handle cached route flushing properly wrt. MTU locking in ipv4, to
Hangbin Liu we give thanks for this.
10) Fix regression in SO_BINDTODEVIC handling wrt. UDP socket demux.
Paolo Abeni, he gave us this.
11) Range check coalescing parameters in mlx4 driver, thank you Moshe
Shemesh.
12) Some ipv6 ICMP error handling fixes in rxrpc, from our good brother
David Howells.
13) Fix kexec on mlx5 by freeing IRQs in shutdown path. Daniel Juergens,
you're the best!
14) Don't send bonding RLB updates to invalid MAC addresses. Debabrata
Benerjee saved us!
15) Uh oh, we were leaking in udp_sendmsg and ping_v4_sendmsg. The ship
is now water tight, thanks to Andrey Ignatov.
16) IPSEC memory leak in ixgbe from Colin Ian King, man we've got holes
everywhere!
17) Fix error path in tcf_proto_create, Jiri Pirko what would we do
without you!
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (92 commits)
net sched actions: fix refcnt leak in skbmod
net: sched: fix error path in tcf_proto_create() when modules are not configured
net sched actions: fix invalid pointer dereferencing if skbedit flags missing
ixgbe: fix memory leak on ipsec allocation
ixgbevf: fix ixgbevf_xmit_frame()'s return type
ixgbe: return error on unsupported SFP module when resetting
ice: Set rq_last_status when cleaning rq
ipv4: fix memory leaks in udp_sendmsg, ping_v4_sendmsg
mlxsw: core: Fix an error handling path in 'mlxsw_core_bus_device_register()'
bonding: send learning packets for vlans on slave
bonding: do not allow rlb updates to invalid mac
net/mlx5e: Err if asked to offload TC match on frag being first
net/mlx5: E-Switch, Include VF RDMA stats in vport statistics
net/mlx5: Free IRQs in shutdown path
rxrpc: Trace UDP transmission failure
rxrpc: Add a tracepoint to log ICMP/ICMP6 and error messages
rxrpc: Fix the min security level for kernel calls
rxrpc: Fix error reception on AF_INET6 sockets
rxrpc: Fix missing start of call timeout
qed: fix spelling mistake: "taskelt" -> "tasklet"
...
|
|
Pull arch/sh fixes from Rich Felker:
"Fixes for critical regressions and a build failure.
The regressions were introduced in 4.15 and 4.17-rc1 and prevented
booting on affected systems"
* tag 'sh-for-4.17-fixes' of git://git.libc.org/linux-sh:
sh: switch to NO_BOOTMEM
sh: mm: Fix unprotected access to struct device
sh: fix build failure for J2 cpu with SMP disabled
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Will Deacon:
"There's a small memblock accounting problem when freeing the initrd
and a Spectre-v2 mitigation for NVIDIA Denver CPUs which just requires
a match on the CPU ID register.
Summary:
- Mitigate Spectre-v2 for NVIDIA Denver CPUs
- Free memblocks corresponding to freed initrd area"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: capabilities: Add NVIDIA Denver CPU to bp_harden list
arm64: Add MIDR encoding for NVIDIA CPUs
arm64: To remove initrd reserved area entry from memblock
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
"One fix for an actual regression, the change to the SYSCALL_DEFINE
wrapper broke FTRACE_SYSCALLS for us due to a name mismatch. There's
also another commit to the same code to make sure we match all our
syscalls with various prefixes.
And then just one minor build fix, and the removal of an unused
variable that was removed and then snuck back in due to some rebasing.
Thanks to: Naveen N. Rao"
* tag 'powerpc-4.17-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/pseries: Fix CONFIG_NUMA=n build
powerpc/trace/syscalls: Update syscall name matching logic to account for ppc_ prefix
powerpc/trace/syscalls: Update syscall name matching logic
powerpc/64: Remove unused paca->soft_enabled
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen fix from Juergen Gross:
"One fix for the kernel running as a fully virtualized guest using PV
drivers on old Xen hypervisor versions"
* tag 'for-linus-4.17-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
x86/xen: Reset VCPU0 info pointer after shared_info remap
|
|
Commit 0fa1c579349f ("of/fdt: use memblock_virt_alloc for early alloc")
inadvertently switched the DT unflattening allocations from memblock to
bootmem which doesn't work because the unflattening happens before
bootmem is initialized. Swapping the order of bootmem init and
unflattening could also fix this, but removing bootmem is desired. So
enable NO_BOOTMEM on SH like other architectures have done.
Fixes: 0fa1c579349f ("of/fdt: use memblock_virt_alloc for early alloc")
Reported-by: Rich Felker <dalias@libc.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Rich Felker <dalias@libc.org>
|
|
The NVIDIA Denver CPU also needs a PSCI call to harden the branch
predictor.
Signed-off-by: David Gilhooley <dgilhooley@nvidia.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
|
|
This patch adds the MIDR encodings for NVIDIA as well as
the Denver and Carmel CPUs used in Tegra SoCs.
Signed-off-by: David Gilhooley <dgilhooley@nvidia.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
|
|
Commit d50f4630c2e1 ("arm: dts: Remove p1010-flexcan compatible from imx
series dts") removed the fallback compatible "fsl,p1010-flexcan" from
the imx device trees. As the flexcan cores on i.MX25, i.MX35 and i.MX53
are identical, introduce the first as fallback for the two latter ones.
Fixes: d50f4630c2e1 ("arm: dts: Remove p1010-flexcan compatible from imx series dts")
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Cc: linux-stable <stable@vger.kernel.org> # >= v4.16
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
|
|
The build is failing with CONFIG_NUMA=n and some compiler versions:
arch/powerpc/platforms/pseries/hotplug-cpu.o: In function `dlpar_online_cpu':
hotplug-cpu.c:(.text+0x12c): undefined reference to `timed_topology_update'
arch/powerpc/platforms/pseries/hotplug-cpu.o: In function `dlpar_cpu_remove':
hotplug-cpu.c:(.text+0x400): undefined reference to `timed_topology_update'
Fix it by moving the empty version of timed_topology_update() into the
existing #ifdef block, which has the right guard of SPLPAR && NUMA.
Fixes: cee5405da402 ("powerpc/hotplug: Improve responsiveness of hotplug change")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
Minor conflict, a CHECK was placed into an if() statement
in net-next, whilst a newline was added to that CHECK
call in 'net'. Thanks to Daniel for the merge resolution.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch fixes crashes during boot for HVM guests on older (pre HVM
vector callback) Xen versions. Without this, current kernels will always
fail to boot on those Xen versions.
Sample stack trace:
BUG: unable to handle kernel paging request at ffffffffff200000
IP: __xen_evtchn_do_upcall+0x1e/0x80
PGD 1e0e067 P4D 1e0e067 PUD 1e10067 PMD 235c067 PTE 0
Oops: 0002 [#1] SMP PTI
Modules linked in:
CPU: 0 PID: 512 Comm: kworker/u2:0 Not tainted 4.14.33-52.13.amzn1.x86_64 #1
Hardware name: Xen HVM domU, BIOS 3.4.3.amazon 11/11/2016
task: ffff88002531d700 task.stack: ffffc90000480000
RIP: 0010:__xen_evtchn_do_upcall+0x1e/0x80
RSP: 0000:ffff880025403ef0 EFLAGS: 00010046
RAX: ffffffff813cc760 RBX: ffffffffff200000 RCX: ffffc90000483ef0
RDX: ffff880020540a00 RSI: ffff880023c78000 RDI: 000000000000001c
RBP: 0000000000000001 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: ffff880025403f5c R14: 0000000000000000 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffff880025400000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffff200000 CR3: 0000000001e0a000 CR4: 00000000000006f0
Call Trace:
<IRQ>
do_hvm_evtchn_intr+0xa/0x10
__handle_irq_event_percpu+0x43/0x1a0
handle_irq_event_percpu+0x20/0x50
handle_irq_event+0x39/0x60
handle_fasteoi_irq+0x80/0x140
handle_irq+0xaf/0x120
do_IRQ+0x41/0xd0
common_interrupt+0x7d/0x7d
</IRQ>
During boot, the HYPERVISOR_shared_info page gets remapped to make it work
with KASLR. This means that any pointer derived from it needs to be
adjusted.
The only value that this applies to is the vcpu_info pointer for VCPU 0.
For PV and HVM with the callback vector feature, this gets done via the
smp_ops prepare_boot_cpu callback. Older Xen versions do not support the
HVM callback vector, so there is no Xen-specific smp_ops set up in that
scenario. So, the vcpu_info pointer for VCPU 0 never gets set to the proper
value, and the first reference of it will be bad. Fix this by resetting it
immediately after the remap.
Signed-off-by: Frank van der Linden <fllinden@amazon.com>
Reviewed-by: Eduardo Valentin <eduval@amazon.com>
Reviewed-by: Alakesh Haloi <alakeshh@amazon.com>
Reviewed-by: Vallish Vaidyeshwara <vallish@amazon.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: xen-devel@lists.xenproject.org
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
|
|
ppc_ prefix
Some syscall entry functions on powerpc are prefixed with
ppc_/ppc32_/ppc64_ rather than the usual sys_/__se_sys prefix. fork(),
clone(), swapcontext() are some examples of syscalls with such entry
points. We need to match against these names when initializing ftrace
syscall tracing.
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
On powerpc64 ABIv1, we are enabling syscall tracing for only ~20
syscalls. This is due to commit e145242ea0df6 ("syscalls/core,
syscalls/x86: Clean up syscall stub naming convention") which has
changed the syscall entry wrapper prefix from "SyS" to "__se_sys".
Update the logic for ABIv1 to not just skip the initial dot, but also
the "__se_sys" prefix.
Fixes: commit e145242ea0df6 ("syscalls/core, syscalls/x86: Clean up syscall stub naming convention")
Reported-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
In commit 4e26bc4a4ed6 ("powerpc/64: Rename soft_enabled to
irq_soft_mask") we renamed paca->soft_enabled. But then in commit
8e0b634b1327 ("powerpc/64s: Do not allocate lppaca if we are not
virtualized") we added it back. Oops. This happened because the two
patches were in flight at the same time and rebased vs each other
multiple times, and we missed it in review.
Fixes: 8e0b634b1327 ("powerpc/64s: Do not allocate lppaca if we are not virtualized")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
Pll KVM fixes from Radim Krčmář:
"ARM:
- Fix proxying of GICv2 CPU interface accesses
- Fix crash when switching to BE
- Track source vcpu git GICv2 SGIs
- Fix an outdated bit of documentation
x86:
- Speed up injection of expired timers (for stable)"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: x86: remove APIC Timer periodic/oneshot spikes
arm64: vgic-v2: Fix proxying of cpuif access
KVM: arm/arm64: vgic_init: Cleanup reference to process_maintenance
KVM: arm64: Fix order of vcpu_write_sys_reg() arguments
KVM: arm/arm64: vgic: Fix source vcpu issues for GICv2 SGI
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fix from Thomas Gleixner:
"Unbreak the CPUID CPUID_8000_0008_EBX reload which got dropped when
the evaluation of physical and virtual bits which uses the same CPUID
leaf was moved out of get_cpu_cap()"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/cpu: Restore CPUID_8000_0008_EBX reload
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull clocksource fixes from Thomas Gleixner:
"The recent addition of the early TSC clocksource breaks on machines
which have an unstable TSC because in case that TSC is disabled, then
the clocksource selection logic falls back to the early TSC which is
obviously bogus.
That also unearthed a few robustness issues in the clocksource
derating code which are addressed as well"
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
clocksource: Rework stale comment
clocksource: Consistent de-rate when marking unstable
x86/tsc: Fix mark_tsc_unstable()
clocksource: Initialize cs->wd_list
clocksource: Allow clocksource_mark_unstable() on unregistered clocksources
x86/tsc: Always unregister clocksource_tsc_early
|
|
Since the commit "8003c9ae204e: add APIC Timer periodic/oneshot mode VMX
preemption timer support", a Windows 10 guest has some erratic timer
spikes.
Here the results on a 150000 times 1ms timer without any load:
Before 8003c9ae204e | After 8003c9ae204e
Max 1834us | 86000us
Mean 1100us | 1021us
Deviation 59us | 149us
Here the results on a 150000 times 1ms timer with a cpu-z stress test:
Before 8003c9ae204e | After 8003c9ae204e
Max 32000us | 140000us
Mean 1006us | 1997us
Deviation 140us | 11095us
The root cause of the problem is starting hrtimer with an expiry time
already in the past can take more than 20 milliseconds to trigger the
timer function. It can be solved by forward such past timers
immediately, rather than submitting them to hrtimer_start().
In case the timer is periodic, update the target expiration and call
hrtimer_start with it.
v2: Check if the tsc deadline is already expired. Thank you Mika.
v3: Execute the past timers immediately rather than submitting them to
hrtimer_start().
v4: Rearm the periodic timer with advance_periodic_target_expiration() a
simpler version of set_target_expiration(). Thank you Paolo.
Cc: Mika Penttilä <mika.penttila@nextfour.com>
Cc: Wanpeng Li <kernellwp@gmail.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Anthoine Bourgeois <anthoine.bourgeois@blade-group.com>
8003c9ae204e ("KVM: LAPIC: add APIC Timer periodic/oneshot mode VMX preemption timer support")
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm
KVM/arm fixes for 4.17, take #2
- Fix proxying of GICv2 CPU interface accesses
- Fix crash when switching to BE
- Track source vcpu git GICv2 SGIs
- Fix an outdated bit of documentation
|
|
With commit ce88313069c36eef80f21fd7 ("arch/sh: make the DMA mapping
operations observe dev->dma_pfn_offset") the generic DMA allocation
function on which the SH 'dma_alloc_coherent()' function relies on,
accesses the 'dma_pfn_offset' field of struct device.
Unfortunately the 'dma_generic_alloc_coherent()' function is called from
several places with a NULL struct device argument, halting the CPU
during the boot process.
This patch fixes the issue by protecting access to dev->dma_pfn_offset,
with a trivial check for validity. It also passes a valid 'struct device'
in the 'platform_resource_setup_memory()' function which is the main user
of 'dma_alloc_coherent()', and inserts a WARN_ON() check to remind to future
(and existing) bogus users of this function to provide a valid 'struct device'
whenever possible.
Fixes: ce88313069c36eef80f21fd7 ("arch/sh: make the DMA mapping operations observe dev->dma_pfn_offset")
Signed-off-by: Jacopo Mondi <jacopo+renesas@jmondi.org>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
Signed-off-by: Rich Felker <dalias@libc.org>
|
|
The sh asm/smp.h defines a fallback hard_smp_processor_id macro for
the !SMP case, but linux/smp.h never includes asm/smp.h in the !SMP
case.
Signed-off-by: Rich Felker <dalias@libc.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen cleanup from Juergen Gross:
"One cleanup to remove VLAs from the kernel"
* tag 'for-linus-4.17-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
x86/xen: Remove use of VLAs
|
|
Proxying the cpuif accesses at EL2 makes use of vcpu_data_guest_to_host
and co, which check the endianness, which call into vcpu_read_sys_reg...
which isn't mapped at EL2 (it was inlined before, and got moved OoL
with the VHE optimizations).
The result is of course a nice panic. Let's add some specialized
cruft to keep the broken platforms that require this hack alive.
But, this code used vcpu_data_guest_to_host(), which expected us to
write the value to host memory, instead we have trapped the guest's
read or write to an mmio-device, and are about to replay it using the
host's readl()/writel() which also perform swabbing based on the host
endianness. This goes wrong when both host and guest are big-endian,
as readl()/writel() will undo the guest's swabbing, causing the
big-endian value to be written to device-memory.
What needs doing?
A big-endian guest will have pre-swabbed data before storing, undo this.
If its necessary for the host, writel() will re-swab it.
For a read a big-endian guest expects to swab the data after the load.
The hosts's readl() will correct for host endianness, giving us the
device-memory's value in the register. For a big-endian guest, swab it
as if we'd only done the load.
For a little-endian guest, nothing needs doing as readl()/writel() leave
the correct device-memory value in registers.
Tested on Juno with that rarest of things: a big-endian 64K host.
Based on a patch from Marc Zyngier.
Reported-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Fixes: bf8feb39642b ("arm64: KVM: vgic-v2: Add GICV access from HYP")
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
|
|
A typo in kvm_vcpu_set_be()'s call:
| vcpu_write_sys_reg(vcpu, SCTLR_EL1, sctlr)
causes us to use the 32bit register value as an index into the sys_reg[]
array, and sail off the end of the linear map when we try to bring up
big-endian secondaries.
| Unable to handle kernel paging request at virtual address ffff80098b982c00
| Mem abort info:
| ESR = 0x96000045
| Exception class = DABT (current EL), IL = 32 bits
| SET = 0, FnV = 0
| EA = 0, S1PTW = 0
| Data abort info:
| ISV = 0, ISS = 0x00000045
| CM = 0, WnR = 1
| swapper pgtable: 4k pages, 48-bit VAs, pgdp = 000000002ea0571a
| [ffff80098b982c00] pgd=00000009ffff8803, pud=0000000000000000
| Internal error: Oops: 96000045 [#1] PREEMPT SMP
| Modules linked in:
| CPU: 2 PID: 1561 Comm: kvm-vcpu-0 Not tainted 4.17.0-rc3-00001-ga912e2261ca6-dirty #1323
| Hardware name: ARM Juno development board (r1) (DT)
| pstate: 60000005 (nZCv daif -PAN -UAO)
| pc : vcpu_write_sys_reg+0x50/0x134
| lr : vcpu_write_sys_reg+0x50/0x134
| Process kvm-vcpu-0 (pid: 1561, stack limit = 0x000000006df4728b)
| Call trace:
| vcpu_write_sys_reg+0x50/0x134
| kvm_psci_vcpu_on+0x14c/0x150
| kvm_psci_0_2_call+0x244/0x2a4
| kvm_hvc_call_handler+0x1cc/0x258
| handle_hvc+0x20/0x3c
| handle_exit+0x130/0x1ec
| kvm_arch_vcpu_ioctl_run+0x340/0x614
| kvm_vcpu_ioctl+0x4d0/0x840
| do_vfs_ioctl+0xc8/0x8d0
| ksys_ioctl+0x78/0xa8
| sys_ioctl+0xc/0x18
| el0_svc_naked+0x30/0x34
| Code: 73620291 604d00b0 00201891 1ab10194 (957a33f8)
|---[ end trace 4b4a4f9628596602 ]---
Fix the order of the arguments.
Fixes: 8d404c4c24613 ("KVM: arm64: Rewrite system register accessors to read/write functions")
CC: Christoffer Dall <cdall@cs.columbia.edu>
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
|
|
Pull networking fixes from David Miller:
1) Various sockmap fixes from John Fastabend (pinned map handling,
blocking in recvmsg, double page put, error handling during redirect
failures, etc.)
2) Fix dead code handling in x86-64 JIT, from Gianluca Borello.
3) Missing device put in RDS IB code, from Dag Moxnes.
4) Don't process fast open during repair mode in TCP< from Yuchung
Cheng.
5) Move address/port comparison fixes in SCTP, from Xin Long.
6) Handle add a bond slave's master into a bridge properly, from
Hangbin Liu.
7) IPv6 multipath code can operate on unitialized memory due to an
assumption that the icmp header is in the linear SKB area. Fix from
Eric Dumazet.
8) Don't invoke do_tcp_sendpages() recursively via TLS, from Dave
Watson.
9) Fix memory leaks in x86-64 JIT, from Daniel Borkmann.
10) RDS leaks kernel memory to userspace, from Eric Dumazet.
11) DCCP can invoke a tasklet on a freed socket, take a refcount. Also
from Eric Dumazet.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (78 commits)
dccp: fix tasklet usage
smc: fix sendpage() call
net/smc: handle unregistered buffers
net/smc: call consolidation
qed: fix spelling mistake: "offloded" -> "offloaded"
net/mlx5e: fix spelling mistake: "loobpack" -> "loopback"
tcp: restore autocorking
rds: do not leak kernel memory to user land
qmi_wwan: do not steal interfaces from class drivers
ipv4: fix fnhe usage by non-cached routes
bpf: sockmap, fix error handling in redirect failures
bpf: sockmap, zero sg_size on error when buffer is released
bpf: sockmap, fix scatterlist update on error path in send with apply
net_sched: fq: take care of throttled flows before reuse
ipv6: Revert "ipv6: Allow non-gateway ECMP for IPv6"
bpf, x64: fix memleak when not converging on calls
bpf, x64: fix memleak when not converging after image
net/smc: restrict non-blocking connect finish
8139too: Use disable_irq_nosync() in rtl8139_poll_controller()
sctp: fix the issue that the cookie-ack with auth can't get processed
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
Pull parisc fixes from Helge Deller:
"Fix two section mismatches, convert to read_persistent_clock64(), add
further documentation regarding the HPMC crash handler and make
bzImage the default build target"
* 'parisc-4.17-4' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: Fix section mismatches
parisc: drivers.c: Fix section mismatches
parisc: time: Convert read_persistent_clock() to read_persistent_clock64()
parisc: Document rules regarding checksum of HPMC handler
parisc: Make bzImage default build target
|
|
Since LD_ABS/LD_IND instructions are now removed from the core and
reimplemented through a combination of inlined BPF instructions and
a slow-path helper, we can get rid of the complexity from x32 JIT.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Since LD_ABS/LD_IND instructions are now removed from the core and
reimplemented through a combination of inlined BPF instructions and
a slow-path helper, we can get rid of the complexity from s390x JIT.
Tested on s390x instance on LinuxONE.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Since LD_ABS/LD_IND instructions are now removed from the core and
reimplemented through a combination of inlined BPF instructions and
a slow-path helper, we can get rid of the complexity from ppc64 JIT.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Tested-by: Sandipan Das <sandipan@linux.vnet.ibm.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Since LD_ABS/LD_IND instructions are now removed from the core and
reimplemented through a combination of inlined BPF instructions and
a slow-path helper, we can get rid of the complexity from mips64 JIT.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Since LD_ABS/LD_IND instructions are now removed from the core and
reimplemented through a combination of inlined BPF instructions and
a slow-path helper, we can get rid of the complexity from arm32 JIT.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Since LD_ABS/LD_IND instructions are now removed from the core and
reimplemented through a combination of inlined BPF instructions and
a slow-path helper, we can get rid of the complexity from sparc64 JIT.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Since LD_ABS/LD_IND instructions are now removed from the core and
reimplemented through a combination of inlined BPF instructions and
a slow-path helper, we can get rid of the complexity from arm64 JIT.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Since LD_ABS/LD_IND instructions are now removed from the core and
reimplemented through a combination of inlined BPF instructions and
a slow-path helper, we can get rid of the complexity from x64 JIT.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
The JIT compiler emits ia32 bit instructions. Currently, It supports eBPF
only. Classic BPF is supported because of the conversion by BPF core.
Almost all instructions from eBPF ISA supported except the following:
BPF_ALU64 | BPF_DIV | BPF_K
BPF_ALU64 | BPF_DIV | BPF_X
BPF_ALU64 | BPF_MOD | BPF_K
BPF_ALU64 | BPF_MOD | BPF_X
BPF_STX | BPF_XADD | BPF_W
BPF_STX | BPF_XADD | BPF_DW
It doesn't support BPF_JMP|BPF_CALL with BPF_PSEUDO_CALL at the moment.
IA32 has few general purpose registers, EAX|EDX|ECX|EBX|ESI|EDI. I use
EAX|EDX|ECX|EBX as temporary registers to simulate instructions in eBPF
ISA, and allocate ESI|EDI to BPF_REG_AX for constant blinding, all others
eBPF registers, R0-R10, are simulated through scratch space on stack.
The reasons behind the hardware registers allocation policy are:
1:MUL need EAX:EDX, shift operation need ECX, so they aren't fit
for general eBPF 64bit register simulation.
2:We need at least 4 registers to simulate most eBPF ISA operations
on registers operands instead of on register&memory operands.
3:We need to put BPF_REG_AX on hardware registers, or constant blinding
will degrade jit performance heavily.
Tested on PC (Intel(R) Core(TM) i5-5200U CPU).
Testing results on i5-5200U:
1) test_bpf: Summary: 349 PASSED, 0 FAILED, [319/341 JIT'ed]
2) test_progs: Summary: 83 PASSED, 0 FAILED.
3) test_lpm: OK
4) test_lru_map: OK
5) test_verifier: Summary: 828 PASSED, 0 FAILED.
Above tests are all done in following two conditions separately:
1:bpf_jit_enable=1 and bpf_jit_harden=0
2:bpf_jit_enable=1 and bpf_jit_harden=2
Below are some numbers for this jit implementation:
Note:
I run test_progs in kselftest 100 times continuously for every condition,
the numbers are in format: total/times=avg.
The numbers that test_bpf reports show almost the same relation.
a:jit_enable=0 and jit_harden=0 b:jit_enable=1 and jit_harden=0
test_pkt_access:PASS:ipv4:15622/100=156 test_pkt_access:PASS:ipv4:10674/100=106
test_pkt_access:PASS:ipv6:9130/100=91 test_pkt_access:PASS:ipv6:4855/100=48
test_xdp:PASS:ipv4:240198/100=2401 test_xdp:PASS:ipv4:138912/100=1389
test_xdp:PASS:ipv6:137326/100=1373 test_xdp:PASS:ipv6:68542/100=685
test_l4lb:PASS:ipv4:61100/100=611 test_l4lb:PASS:ipv4:37302/100=373
test_l4lb:PASS:ipv6:101000/100=1010 test_l4lb:PASS:ipv6:55030/100=550
c:jit_enable=1 and jit_harden=2
test_pkt_access:PASS:ipv4:10558/100=105
test_pkt_access:PASS:ipv6:5092/100=50
test_xdp:PASS:ipv4:131902/100=1319
test_xdp:PASS:ipv6:77932/100=779
test_l4lb:PASS:ipv4:38924/100=389
test_l4lb:PASS:ipv6:57520/100=575
The numbers show we get 30%~50% improvement.
See Documentation/networking/filter.txt for more information.
Changelog:
Changes v5-v6:
1:Add do {} while (0) to RETPOLINE_RAX_BPF_JIT for
consistence reason.
2:Clean up non-standard comments, reported by Daniel Borkmann.
3:Fix a memory leak issue, repoted by Daniel Borkmann.
Changes v4-v5:
1:Delete is_on_stack, BPF_REG_AX is the only one
on real hardware registers, so just check with
it.
2:Apply commit 1612a981b766 ("bpf, x64: fix JIT emission
for dead code"), suggested by Daniel Borkmann.
Changes v3-v4:
1:Fix changelog in commit.
I install llvm-6.0, then test_progs willn't report errors.
I submit another patch:
"bpf: fix misaligned access for BPF_PROG_TYPE_PERF_EVENT program type on x86_32 platform"
to fix another problem, after that patch, test_verifier willn't report errors too.
2:Fix clear r0[1] twice unnecessarily in *BPF_IND|BPF_ABS* simulation.
Changes v2-v3:
1:Move BPF_REG_AX to real hardware registers for performance reason.
3:Using bpf_load_pointer instead of bpf_jit32.S, suggested by Daniel Borkmann.
4:Delete partial codes in 1c2a088a6626, suggested by Daniel Borkmann.
5:Some bug fixes and comments improvement.
Changes v1-v2:
1:Fix bug in emit_ia32_neg64.
2:Fix bug in emit_ia32_arsh_r64.
3:Delete filename in top level comment, suggested by Thomas Gleixner.
4:Delete unnecessary boiler plate text, suggested by Thomas Gleixner.
5:Rewrite some words in changelog.
6:CodingSytle improvement and a little more comments.
Signed-off-by: Wang YanQing <udknight@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
Fix three section mismatches:
1) Section mismatch in reference from the function ioread8() to the
function .init.text:pcibios_init_bridge()
2) Section mismatch in reference from the function free_initmem() to the
function .init.text:map_pages()
3) Section mismatch in reference from the function ccio_ioc_init() to
the function .init.text:count_parisc_driver()
Signed-off-by: Helge Deller <deller@gmx.de>
|
|
Fix two section mismatches in drivers.c:
1) Section mismatch in reference from the function alloc_tree_node() to
the function .init.text:create_tree_node().
2) Section mismatch in reference from the function walk_native_bus() to
the function .init.text:alloc_pa_dev().
Signed-off-by: Helge Deller <deller@gmx.de>
|
|
The JIT logic in jit_subprogs() is as follows: for all subprogs we
allocate a bpf_prog_alloc(), populate it (prog->is_func = 1 here),
and pass it to bpf_int_jit_compile(). If a failure occurred during
JIT and prog->jited is not set, then we bail out from attempting to
JIT the whole program, and punt to the interpreter instead. In case
JITing went successful, we fixup BPF call offsets and do another
pass to bpf_int_jit_compile() (extra_pass is true at that point) to
complete JITing calls. Given that requires to pass JIT context around
addrs and jit_data from x86 JIT are freed in the extra_pass in
bpf_int_jit_compile() when calls are involved (if not, they can
be freed immediately). However, if in the original pass, the JIT
image didn't converge then we leak addrs and jit_data since image
itself is NULL, the prog->is_func is set and extra_pass is false
in that case, meaning both will become unreachable and are never
cleaned up, therefore we need to free as well on !image. Only x64
JIT is affected.
Fixes: 1c2a088a6626 ("bpf: x64: add JIT support for multi-function programs")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
While reviewing x64 JIT code, I noticed that we leak the prior allocated
JIT image in the case where proglen != oldproglen during the JIT passes.
Prior to the commit e0ee9c12157d ("x86: bpf_jit: fix two bugs in eBPF JIT
compiler") we would just break out of the loop, and using the image as the
JITed prog since it could only shrink in size anyway. After e0ee9c12157d,
we would bail out to out_addrs label where we free addrs and jit_data but
not the image coming from bpf_jit_binary_alloc().
Fixes: e0ee9c12157d ("x86: bpf_jit: fix two bugs in eBPF JIT compiler")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
The recent commt which addresses the x86_phys_bits corruption with
encrypted memory on CPUID reload after a microcode update lost the reload
of CPUID_8000_0008_EBX as well.
As a consequence IBRS and IBRS_FW are not longer detected
Restore the behaviour by bringing the reload of CPUID_8000_0008_EBX
back. This restore has a twist due to the convoluted way the cpuid analysis
works:
CPUID_8000_0008_EBX is used by AMD to enumerate IBRB, IBRS, STIBP. On Intel
EBX is not used. But the speculation control code sets the AMD bits when
running on Intel depending on the Intel specific speculation control
bits. This was done to use the same bits for alternatives.
The change which moved the 8000_0008 evaluation out of get_cpu_cap() broke
this nasty scheme due to ordering. So that on Intel the store to
CPUID_8000_0008_EBX clears the IBRB, IBRS, STIBP bits which had been set
before by software.
So the actual CPUID_8000_0008_EBX needs to go back to the place where it
was and the phys/virt address space calculation cannot touch it.
In hindsight this should have used completely synthetic bits for IBRB,
IBRS, STIBP instead of reusing the AMD bits, but that's for 4.18.
/me needs to find time to cleanup that steaming pile of ...
Fixes: d94a155c59c9 ("x86/cpu: Prevent cpuinfo_x86::x86_phys_bits adjustment corruption")
Reported-by: Jörg Otte <jrg.otte@gmail.com>
Reported-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Jörg Otte <jrg.otte@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: kirill.shutemov@linux.intel.com
Cc: Borislav Petkov <bp@alien8.de
Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1805021043510.1668@nanos.tec.linutronix.de
|
|
So by chance I looked into x86 assembly in arch/x86/net/bpf_jit_comp.c and
noticed the weird and inconsistent comment style it mistakenly learned from
the networking code:
/* Multi-line comment ...
* ... looks like this.
*/
Fix this to use the standard comment style specified in Documentation/CodingStyle
and used in arch/x86/ as well:
/*
* Multi-line comment ...
* ... looks like this.
*/
Also, to quote Linus's ... more explicit views about this:
http://article.gmane.org/gmane.linux.kernel.cryptoapi/21066
> But no, the networking code picked *none* of the above sane formats.
> Instead, it picked these two models that are just half-arsed
> shit-for-brains:
>
> (no)
> /* This is disgusting drug-induced
> * crap, and should die
> */
>
> (no-no-no)
> /* This is also very nasty
> * and visually unbalanced */
>
> Please. The networking code actually has the *worst* possible comment
> style. You can literally find that (no-no-no) style, which is just
> really horribly disgusting and worse than the otherwise fairly similar
> (d) in pretty much every way.
Also improve the comments and some other details while at it:
- Don't mix same-line and previous-line comment style on otherwise
identical code patterns within the same function,
- capitalize 'BPF' and x86 register names consistently,
- capitalize sentences consistently,
- instead of 'x64' use 'x86-64': x64 is a Microsoft specific term,
- use more consistent punctuation,
- use standard coding style in macros as well,
- fix typos and a few other minor details.
Consistent coding style is not optional, at least in arch/x86/.
No change in functionality.
( In case this commit causes conflicts with pending development code
I'll be glad to help resolve any conflicts! )
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
mark_tsc_unstable() also needs to affect tsc_early, Now that
clocksource_mark_unstable() can be used on a clocksource irrespective of
its registration state, use it on both tsc_early and tsc.
This does however require cs->list to be initialized empty, otherwise it
cannot tell the registation state before registation.
Fixes: aa83c45762a2 ("x86/tsc: Introduce early tsc clocksource")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Diego Viola <diego.viola@gmail.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: len.brown@intel.com
Cc: rjw@rjwysocki.net
Cc: rui.zhang@intel.com
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20180430100344.533326547@infradead.org
|
|
Don't leave the tsc-early clocksource registered if it errors out
early.
This was reported by Diego, who on his Core2 era machine got TSC
invalidated while it was running with tsc-early (due to C-states).
This results in keeping tsc-early with very bad effects.
Reported-and-Tested-by: Diego Viola <diego.viola@gmail.com>
Fixes: aa83c45762a2 ("x86/tsc: Introduce early tsc clocksource")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: len.brown@intel.com
Cc: rjw@rjwysocki.net
Cc: diego.viola@gmail.com
Cc: rui.zhang@intel.com
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20180430100344.350507853@infradead.org
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rkuo/linux-hexagon-kernel
Pull hexagon fixes from Richard Kuo:
"Some small fixes for module compilation"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rkuo/linux-hexagon-kernel:
hexagon: export csum_partial_copy_nocheck
hexagon: add memset_io() helper
|
|
This is needed to link ipv6 as a loadable module, which in turn happens
in allmodconfig.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Richard Kuo <rkuo@codeaurora.org>
|
|
We already have memcpy_toio(), but not memset_io(), so let's
add the obvious version to allow building an allmodconfig kernel
without errors like
drivers/gpu/drm/ttm/ttm_bo_util.c: In function 'ttm_bo_move_memcpy':
drivers/gpu/drm/ttm/ttm_bo_util.c:390:3: error: implicit declaration of function 'memset_io' [-Werror=implicit-function-declaration]
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Richard Kuo <rkuo@codeaurora.org>
|
|
INITRD reserved area entry is not removed from memblock
even though initrd reserved area is freed. After freeing
the memory it is released from memblock. The same can be
checked from /sys/kernel/debug/memblock/reserved.
The patch makes sure that the initrd entry is removed from
memblock when keepinitrd is not enabled.
The patch only affects accounting and debugging. This does not
fix any memory leak.
Acked-by: Laura Abbott <labbott@redhat.com>
Signed-off-by: CHANDAN VN <chandan.vn@samsung.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
|
|
- Fixup license text for oradax driver, from Rob Gardner.
- Release device object with put_device() instead of straight kfree(),
from Arvind Yadav.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
sparc: vio: use put_device() instead of kfree()
sparc64: Fix mistake in oradax license text
|