Age | Commit message (Collapse) | Author |
|
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
mlx5-updates-2019-09-01 (Software steering support)
Abstract:
--------
Mellanox ConnetX devices supports packet matching, packet modification and
redirection. These functionalities are also referred to as flow-steering.
To configure a steering rule, the rule is written to the device owned
memory, this memory is accessed and cached by the device when processing
a packet.
Steering rules are constructed from multiple steering entries (STE).
Rules are configured using the Firmware command interface. The Firmware
processes the given driver command and translates them to STEs, then
writes them to the device memory in the current steering tables.
This process is slow due to the architecture of the command interface and
the processing complexity of each rule.
The highlight of this patchset is to cut the middle man (The firmware) and
do steering rules programming into device directly from the driver, with
no firmware intervention whatsoever.
Motivation:
-----------
Software (driver managed) steering allows for high rule insertion rates
compared to the FW steering described above, this is achieved by using
internal RDMA writes to the device owned memory instead of the slow
command interface to program steering rules.
Software (driver managed) steering, doesn't depend on new FW
for new steering functionality, new implementations can be done in the
driver skipping the FW layer.
Performance:
------------
The insertion rate on a single core using the new approach allows
programming ~300K rules per sec. (Done via direct raw test to the new mlx5
sw steering layer, without any kernel layer involved).
Test: TC L2 rules
33K/s with Software steering (this patchset).
5K/s with FW and current driver.
This will improve OVS based solution performance.
Architecture and implementation details:
----------------------------------------
Software steering will be dynamically selected via devlink device
parameter. Example:
$ devlink dev param show pci/0000:06:00.0 name flow_steering_mode
pci/0000:06:00.0:
name flow_steering_mode type driver-specific
values:
cmode runtime value smfs
mlx5 software steering module a.k.a (DR - Direct Rule) is implemented
and contained in mlx5/core/steering directory and controlled by
MLX5_SW_STEERING kconfig flag.
mlx5 core steering layer (fs_core) already provides a shim layer for
implementing different steering mechanisms, software steering will
leverage that as seen at the end of this series.
When Software Steering for a specific steering domain
(NIC/RDMA/Vport/ESwitch, etc ..) is supported, it will cause rules
targeting this domain to be created using SW steering instead of FW.
The implementation includes:
Domain - The steering domain is the object that all other object resides
in. It holds the memory allocator, send engine, locks and other shared
data needed by lower objects such as table, matcher, rule, action.
Each domain can contain multiple tables. Domain is equivalent to
namespaces e.g (NIC/RDMA/Vport/ESwitch, etc ..) as implemented
currently in mlx5_core fs_core (flow steering core).
Table - Table objects are used for holding multiple matchers, each table
has a level used to prevent processing loops. Packets are being
directed to this table once it is set as the root table, this is done
by fs_core using a FW command. A packet is being processed inside the
table matcher by matcher until a successful hit, otherwise the packet
will perform the default action.
Matcher - Matchers objects are used to specify the fields mask for
matching when processing a packet. A matcher belongs to a table, each
matcher can hold multiple rules, each rule with different matching
values corresponding to the matcher mask. Each matcher has a priority
used for rule processing order inside the table.
Action - Action objects are created to specify different steering actions
such as count, reformat (encapsulate, decapsulate, ...), modify
header, forward to table and many other actions. When creating a rule
a sequence of actions can be provided to be executed on a successful
match.
Rule - Rule objects are used to specify a specific match on packets as
well as the actions that should be executed. A rule belongs to a
matcher.
STE - This layer is used to hold the specific STE format for the device
and to convert the requested rule to STEs. Each rule is constructed of
an STE chain, Multiple rules construct a steering graph. Each node in
the graph is a hash table containing multiple STEs. The index of each
STE in the hash table is being calculated using a CRC32 hash function.
Memory pool - Used for managing and caching device owned memory for rule
insertion. The memory is being allocated using DM (device memory) API.
Communication with device - layer for standard RDMA operation using RC QP
to configure the device steering.
Command utility - This module holds all of the FW commands that are
required for SW steering to function.
Patch planning and files:
-------------------------
1) First patch, adds the support to Add flow steering actions to fs_cmd
shim layer.
2) Next 12 patch will add a file per each Software steering
functionality/module as described above. (See patches with title: DR, *)
3) Add CONFIG_MLX5_SW_STEERING for software steering support and enable
build with the new files
4) Next two patches will add the support for software steering in mlx5
steering shim layer
net/mlx5: Add API to set the namespace steering mode
net/mlx5: Add direct rule fs_cmd implementation
5) Last two patches will add the new devlink parameter to select mlx5
steering mode, will be valid only for switchdev mode for now.
Two modes are supported:
1. DMFS - Device managed flow steering
2. SMFS - Software/Driver managed flow steering.
In the DMFS mode, the HW steering entities are created through the
FW. In the SMFS mode this entities are created though the driver
directly.
The driver will use the devlink steering mode only if the steering
domain supports it, for now SMFS will manages only the switchdev
eswitch steering domain.
User command examples:
- Set SMFS flow steering mode::
$ devlink dev param set pci/0000:06:00.0 name flow_steering_mode value "smfs" cmode runtime
- Read device flow steering mode::
$ devlink dev param show pci/0000:06:00.0 name flow_steering_mode
pci/0000:06:00.0:
name flow_steering_mode type driver-specific
values:
cmode runtime value smfs
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add new parameter (flow_steering_mode) to control the flow steering
mode of the driver.
Two modes are supported:
1. DMFS - Device managed flow steering
2. SMFS - Software/Driver managed flow steering.
In the DMFS mode, the HW steering entities are created through the
FW. In the SMFS mode this entities are created though the driver
directly.
The driver will use the devlink steering mode only if the steering
domain supports it, for now SMFS will manages only the switchdev eswitch
steering domain.
User command examples:
- Set SMFS flow steering mode::
$ devlink dev param set pci/0000:06:00.0 name flow_steering_mode value "smfs" cmode runtime
- Read device flow steering mode::
$ devlink dev param show pci/0000:06:00.0 name flow_steering_mode
pci/0000:06:00.0:
name flow_steering_mode type driver-specific
values:
cmode runtime value smfs
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
r8152 conflicts are the NAPI fixes in 'net' overlapping with
some tasklet stuff in net-next
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc driver fixes from Greg KH:
"Here are some small char and misc driver fixes for reported issues for
5.3-rc7
Also included in here is the documentation for how we are handling
hardware issues under embargo that everyone has finally agreed on, as
well as a MAINTAINERS update for the suckers who agreed to handle the
LICENSES/ files.
All of these have been in linux-next last week with no reported
issues"
* tag 'char-misc-5.3-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
fsi: scom: Don't abort operations for minor errors
vmw_balloon: Fix offline page marking with compaction
VMCI: Release resource if the work is already queued
Documentation/process: Embargoed hardware security issues
lkdtm/bugs: fix build error in lkdtm_EXHAUST_STACK
mei: me: add Tiger Lake point LP device ID
intel_th: pci: Add Tiger Lake support
intel_th: pci: Add support for another Lewisburg PCH
stm class: Fix a double free of stm_source_device
MAINTAINERS: add entry for LICENSES and SPDX stuff
fpga: altera-ps-spi: Fix getting of optional confd gpio
|
|
It is a 3-Port 10/100 Ethernet Switch with 1588v2 PTP.
Signed-off-by: Razvan Stefanescu <razvan.stefanescu@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
As per the discussion with Nicolas Ferre[0], rename the compatible property
to a more appropriate and specific string.
[0] https://lore.kernel.org/netdev/CAJ2_jOFEVZQat0Yprg4hem4jRrqkB72FKSeQj4p8P5KA-+rgww@mail.gmail.com/
Signed-off-by: Yash Shah <yash.shah@sifive.com>
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Reviewed-by: Paul Walmsley <paul.walmsley@sifive.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
To address the requirements of embargoed hardware issues, like Meltdown,
Spectre, L1TF etc. it is necessary to define and document a process for
handling embargoed hardware security issues.
Following the discussion at the maintainer summit 2018 in Edinburgh
(https://lwn.net/Articles/769417/) the volunteered people have worked
out a process and a Memorandum of Understanding. The latter addresses
the fact that the Linux kernel community cannot sign NDAs for various
reasons.
The initial contact point for hardware security issues is different from
the regular kernel security contact to provide a known and neutral
interface for hardware vendors and researchers. The initial primary
contact team is proposed to be staffed by Linux Foundation Fellows, who
are not associated to a vendor or a distribution and are well connected
in the industry as a whole.
The process is designed with the experience of the past incidents in
mind and tries to address the remaining gaps, so future (hopefully rare)
incidents can be handled more efficiently. It won't remove the fact,
that most of this has to be done behind closed doors, but it is set up
to avoid big bureaucratic hurdles for individual developers.
The process is solely for handling hardware security issues and cannot
be used for regular kernel (software only) security bugs.
This memo can help with hardware companies who, and I quote, "[my
manager] doesn't want to bet his job on the list keeping things secret."
This despite numerous leaks directly from that company over the years,
and none ever so far from the kernel security team. Cognitive
dissidence seems to be a requirement to be a good manager.
To accelerate the adoption of this process, we introduce the concept of
ambassadors in participating companies. The ambassadors are there to
guide people to comply with the process, but are not automatically
involved in the disclosure of a particular incident.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Laura Abbott <labbott@redhat.com>
Acked-by: Ben Hutchings <ben@decadent.org.uk>
Reviewed-by: Tyler Hicks <tyhicks@canonical.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Jiri Kosina <jkosina@suse.cz>
Link: https://lore.kernel.org/r/20190815212505.GC12041@kroah.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
phylink API
This patch the removes the recently added mediatek,physpeed property.
Use the fixed-link property speed = <2500> to set the phy in 2.5Gbit.
See mt7622-bananapi-bpi-r64.dts for a working example.
Signed-off-by: René van Dorst <opensource@vdorst.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Minor conflict in r8169, bug fix had two versions in net
and net-next, take the net-next hunks.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc
Pull ARC updates from Vineet Gupta:
- support for Edge Triggered IRQs in ARC IDU intc
- other fixes here and there
* tag 'arc-5.3-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
arc: prefer __section from compiler_attributes.h
dt-bindings: IDU-intc: Add support for edge-triggered interrupts
dt-bindings: IDU-intc: Clean up documentation
ARCv2: IDU-intc: Add support for edge-triggered interrupts
ARC: unwind: Mark expected switch fall-throughs
ARC: [plat-hsdk]: allow to switch between AXI DMAC port configurations
ARC: fix typo in setup_dma_ops log message
ARCv2: entry: early return from exception need not clear U & DE bits
|
|
This updates the documentation for supporting an optional extra interrupt
cell to specify edge vs level triggered.
Signed-off-by: Mischa Jonker <mischa.jonker@synopsys.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
|
|
* Some lines exceeded 80 characters.
* Clarified statement about AUX register interface
Signed-off-by: Mischa Jonker <mischa.jonker@synopsys.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Thomas Gleixner:
"A few fixes for x86:
- Fix a boot regression caused by the recent bootparam sanitizing
change, which escaped the attention of all people who reviewed that
code.
- Address a boot problem on machines with broken E820 tables caused
by an underflow which ended up placing the trampoline start at
physical address 0.
- Handle machines which do not advertise a legacy timer of any form,
but need calibration of the local APIC timer gracefully by making
the calibration routine independent from the tick interrupt. Marked
for stable as well as there seems to be quite some new laptops
rolled out which expose this.
- Clear the RDRAND CPUID bit on AMD family 15h and 16h CPUs which are
affected by broken firmware which does not initialize RDRAND
correctly after resume. Add a command line parameter to override
this for machine which either do not use suspend/resume or have a
fixed BIOS. Unfortunately there is no way to detect this on boot,
so the only safe decision is to turn it off by default.
- Prevent RFLAGS from being clobbers in CALL_NOSPEC on 32bit which
caused fast KVM instruction emulation to break.
- Explain the Intel CPU model naming convention so that the repeating
discussions come to an end"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/retpoline: Don't clobber RFLAGS during CALL_NOSPEC on i386
x86/boot: Fix boot regression caused by bootparam sanitizing
x86/CPU/AMD: Clear RDRAND CPUID bit on AMD family 15h/16h
x86/boot/compressed/64: Fix boot on machines with broken E820 table
x86/apic: Handle missing global clockevent gracefully
x86/cpu: Explain Intel model naming convention
|
|
Now that we have the DT validation in place, let's convert the device tree
bindings for the Synopsys DWMAC Glue for Amlogic SoCs over to a YAML schemas.
Reviewed-by: Rob Herring <robh@kernel.org>
Reviewed-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The Amlogic Meson DWMAC glue bindings needs a second reg cells for the
glue registers, thus update the reg minItems/maxItems to allow more
than a single reg cell.
Also update the allwinner,sun7i-a20-gmac.yaml derivative schema to specify
maxItems to 1.
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Acked-by: Rob Herring <robh@kernel.org>
Acked-by: Maxime Ripard <maxime.ripard@bootlin.com>
Reviewed-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI fixes from Bjorn Helgaas:
- Reset both NVIDIA GPU and HDA in ThinkPad P50 quirk, which was broken
by another quirk that enabled the HDA device (Lyude Paul)
- Fix pciebus-howto.rst documentation filename typo (Bjorn Helgaas)
* tag 'pci-v5.3-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
Documentation PCI: Fix pciebus-howto.rst filename typo
PCI: Reset both NVIDIA GPU and HDA in ThinkPad P50 workaround
|
|
Allow tracing neigh state during neigh update task that is executed on
workqueue and is scheduled by neigh state change event.
Usage example:
># cd /sys/kernel/debug/tracing
># echo mlx5:mlx5e_rep_neigh_update >> set_event
># cat trace
...
kworker/u48:7-2221 [009] ...1 1475.387435: mlx5e_rep_neigh_update:
netdev: ens1f0 MAC: 24:8a:07:9a:17:9a IPv4: 1.1.1.10 IPv6: ::ffff:1.1.1.10 neigh_connected=1
Added corresponding documentation in
Documentation/networking/device-driver/mellanox/mlx5.rst
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Reviewed-by: Dmytro Linkin <dmitrolin@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Allow tracing result of neigh used value update task that is executed
periodically on workqueue.
Usage example:
># cd /sys/kernel/debug/tracing
># echo mlx5:mlx5e_tc_update_neigh_used_value >> set_event
># cat trace
...
kworker/u48:4-8806 [009] ...1 55117.882428: mlx5e_tc_update_neigh_used_value:
netdev: ens1f0 IPv4: 1.1.1.10 IPv6: ::ffff:1.1.1.10 neigh_used=1
Added corresponding documentation in
Documentation/networking/device-driver/mellanox/mlx5.rst
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Reviewed-by: Dmytro Linkin <dmitrolin@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Implemented following tracepoints:
1. Configure flower (mlx5e_configure_flower)
2. Delete flower (mlx5e_delete_flower)
3. Stats flower (mlx5e_stats_flower)
Usage example:
># cd /sys/kernel/debug/tracing
># echo mlx5:mlx5e_configure_flower >> set_event
># cat trace
...
tc-6535 [019] ...1 2672.404466: mlx5e_configure_flower: cookie=0000000067874a55 actions= REDIRECT
Added corresponding documentation in
Documentation/networking/device-driver/mellanox/mlx5.rst
Signed-off-by: Dmytro Linkin <dmitrolin@mellanox.com>
Reviewed-by: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Add documentation for devlink health rx reporter supported by mlx5.
Update tx reporter documentation.
Signed-off-by: Aya Levin <ayal@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Merge conflict of mlx5 resolved using instructions in merge
commit 9566e650bf7fdf58384bb06df634f7531ca3a97e.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
There have been reports of RDRAND issues after resuming from suspend on
some AMD family 15h and family 16h systems. This issue stems from a BIOS
not performing the proper steps during resume to ensure RDRAND continues
to function properly.
RDRAND support is indicated by CPUID Fn00000001_ECX[30]. This bit can be
reset by clearing MSR C001_1004[62]. Any software that checks for RDRAND
support using CPUID, including the kernel, will believe that RDRAND is
not supported.
Update the CPU initialization to clear the RDRAND CPUID bit for any family
15h and 16h processor that supports RDRAND. If it is known that the family
15h or family 16h system does not have an RDRAND resume issue or that the
system will not be placed in suspend, the "rdrand=force" kernel parameter
can be used to stop the clearing of the RDRAND CPUID bit.
Additionally, update the suspend and resume path to save and restore the
MSR C001_1004 value to ensure that the RDRAND CPUID setting remains in
place after resuming from suspend.
Note, that clearing the RDRAND CPUID bit does not prevent a processor
that normally supports the RDRAND instruction from executing it. So any
code that determined the support based on family and model won't #UD.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Chen Yu <yu.c.chen@intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: "linux-doc@vger.kernel.org" <linux-doc@vger.kernel.org>
Cc: "linux-pm@vger.kernel.org" <linux-pm@vger.kernel.org>
Cc: Nathan Chancellor <natechancellor@gmail.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: <stable@vger.kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "x86@kernel.org" <x86@kernel.org>
Link: https://lkml.kernel.org/r/7543af91666f491547bd86cebb1e17c66824ab9f.1566229943.git.thomas.lendacky@amd.com
|
|
Pull networking fixes from David Miller:
1) Fix jmp to 1st instruction in x64 JIT, from Alexei Starovoitov.
2) Severl kTLS fixes in mlx5 driver, from Tariq Toukan.
3) Fix severe performance regression due to lack of SKB coalescing of
fragments during local delivery, from Guillaume Nault.
4) Error path memory leak in sch_taprio, from Ivan Khoronzhuk.
5) Fix batched events in skbedit packet action, from Roman Mashak.
6) Propagate VLAN TX offload to hw_enc_features in bond and team
drivers, from Yue Haibing.
7) RXRPC local endpoint refcounting fix and read after free in
rxrpc_queue_local(), from David Howells.
8) Fix endian bug in ibmveth multicast list handling, from Thomas
Falcon.
9) Oops, make nlmsg_parse() wrap around the correct function,
__nlmsg_parse not __nla_parse(). Fix from David Ahern.
10) Memleak in sctp_scend_reset_streams(), fro Zheng Bin.
11) Fix memory leak in cxgb4, from Wenwen Wang.
12) Yet another race in AF_PACKET, from Eric Dumazet.
13) Fix false detection of retransmit failures in tipc, from Tuong
Lien.
14) Use after free in ravb_tstamp_skb, from Tho Vu.
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (101 commits)
ravb: Fix use-after-free ravb_tstamp_skb
netfilter: nf_tables: map basechain priority to hardware priority
net: sched: use major priority number as hardware priority
wimax/i2400m: fix a memory leak bug
net: cavium: fix driver name
ibmvnic: Unmap DMA address of TX descriptor buffers after use
bnxt_en: Fix to include flow direction in L2 key
bnxt_en: Use correct src_fid to determine direction of the flow
bnxt_en: Suppress HWRM errors for HWRM_NVM_GET_VARIABLE command
bnxt_en: Fix handling FRAG_ERR when NVM_INSTALL_UPDATE cmd fails
bnxt_en: Improve RX doorbell sequence.
bnxt_en: Fix VNIC clearing logic for 57500 chips.
net: kalmia: fix memory leaks
cx82310_eth: fix a memory leak bug
bnx2x: Fix VF's VLAN reconfiguration in reload.
Bluetooth: Add debug setting for changing minimum encryption key size
tipc: fix false detection of retransmit failures
lan78xx: Fix memory leaks
MAINTAINERS: r8169: Update path to the driver
MAINTAINERS: PHY LIBRARY: Update files in the record
...
|
|
Add compatible for the ethernet IP core on MT7628/88 SoCs. Its
compatible with the older Ralink Rt5350F SoC. And OpenWrt already
uses this compatible string for the MT76x8.
Signed-off-by: Stefan Roese <sr@denx.de>
Cc: René van Dorst <opensource@vdorst.com>
Cc: Daniel Golle <daniel@makrotopia.org>
Cc: Sean Wang <sean.wang@mediatek.com>
Cc: John Crispin <john@phrozen.org>
Cc: devicetree@vger.kernel.org
Cc: Rob Herring <robh@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add initial documentation of the devlink-trap mechanism, explaining the
background, motivation and the semantics of the interface.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This change adds bindings for the Analog Devices ADIN PHY driver, detailing
all the properties implemented by the driver.
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Alexandru Ardelean <alexandru.ardelean@analog.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
One additional interrupt needs to be described within the Ocelot device
tree node: the PTP ready one. This patch documents the binding needed to
do so.
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
One additional register range needs to be described within the Ocelot
device tree node: the PTP. This patch documents the binding needed to do
so.
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
2e6422444894 ("Documentation: PCI: convert PCIEBUS-HOWTO.txt to reST")
incorrectly renamed PCIEBUS-HOWTO.txt to picebus-howto.rst.
Rename it to pciebus-howto.rst.
Fixes: 2e6422444894 ("Documentation: PCI: convert PCIEBUS-HOWTO.txt to reST")
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux
Pull devicetree fixes from Rob Herring:
- Fix building DT binding examples for in tree builds
- Correct some refcounting in adjust_local_phandle_references()
- Update FSL FEC binding with deprecated properties
- Schema fix in stm32 pinctrl
- Fix typo in of_irq_parse_one docbook comment
* tag 'devicetree-fixes-for-5.3-3' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
of: irq: fix a trivial typo in a doc comment
dt-bindings: pinctrl: stm32: Fix 'st,syscfg' schema
dt-bindings: fec: explicitly mark deprecated properties
of: resolver: Add of_node_put() before return and break
dt-bindings: Fix generated example files getting added to schemas
|
|
The proper way to add additional contraints to an existing json-schema
is using 'allOf' to reference the base schema. Using just '$ref' doesn't
work. Fix this for the 'st,syscfg' property.
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Alexandre Torgue <alexandre.torgue@st.com>
Cc: linux-gpio@vger.kernel.org
Cc: linux-stm32@st-md-mailman.stormreply.com
Cc: linux-arm-kernel@lists.infradead.org
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Rob Herring <robh@kernel.org>
|
|
Daniel Borkmann says:
====================
The following pull-request contains BPF updates for your *net-next* tree.
There is a small merge conflict in libbpf (Cc Andrii so he's in the loop
as well):
for (i = 1; i <= btf__get_nr_types(btf); i++) {
t = (struct btf_type *)btf__type_by_id(btf, i);
if (!has_datasec && btf_is_var(t)) {
/* replace VAR with INT */
t->info = BTF_INFO_ENC(BTF_KIND_INT, 0, 0);
<<<<<<< HEAD
/*
* using size = 1 is the safest choice, 4 will be too
* big and cause kernel BTF validation failure if
* original variable took less than 4 bytes
*/
t->size = 1;
*(int *)(t+1) = BTF_INT_ENC(0, 0, 8);
} else if (!has_datasec && kind == BTF_KIND_DATASEC) {
=======
t->size = sizeof(int);
*(int *)(t + 1) = BTF_INT_ENC(0, 0, 32);
} else if (!has_datasec && btf_is_datasec(t)) {
>>>>>>> 72ef80b5ee131e96172f19e74b4f98fa3404efe8
/* replace DATASEC with STRUCT */
Conflict is between the two commits 1d4126c4e119 ("libbpf: sanitize VAR to
conservative 1-byte INT") and b03bc6853c0e ("libbpf: convert libbpf code to
use new btf helpers"), so we need to pick the sanitation fixup as well as
use the new btf_is_datasec() helper and the whitespace cleanup. Looks like
the following:
[...]
if (!has_datasec && btf_is_var(t)) {
/* replace VAR with INT */
t->info = BTF_INFO_ENC(BTF_KIND_INT, 0, 0);
/*
* using size = 1 is the safest choice, 4 will be too
* big and cause kernel BTF validation failure if
* original variable took less than 4 bytes
*/
t->size = 1;
*(int *)(t + 1) = BTF_INT_ENC(0, 0, 8);
} else if (!has_datasec && btf_is_datasec(t)) {
/* replace DATASEC with STRUCT */
[...]
The main changes are:
1) Addition of core parts of compile once - run everywhere (co-re) effort,
that is, relocation of fields offsets in libbpf as well as exposure of
kernel's own BTF via sysfs and loading through libbpf, from Andrii.
More info on co-re: http://vger.kernel.org/bpfconf2019.html#session-2
and http://vger.kernel.org/lpc-bpf2018.html#session-2
2) Enable passing input flags to the BPF flow dissector to customize parsing
and allowing it to stop early similar to the C based one, from Stanislav.
3) Add a BPF helper function that allows generating SYN cookies from XDP and
tc BPF, from Petar.
4) Add devmap hash-based map type for more flexibility in device lookup for
redirects, from Toke.
5) Improvements to XDP forwarding sample code now utilizing recently enabled
devmap lookups, from Jesper.
6) Add support for reporting the effective cgroup progs in bpftool, from Jakub
and Takshak.
7) Fix reading kernel config from bpftool via /proc/config.gz, from Peter.
8) Fix AF_XDP umem pages mapping for 32 bit architectures, from Ivan.
9) Follow-up to add two more BPF loop tests for the selftest suite, from Alexei.
10) Add perf event output helper also for other skb-based program types, from Allan.
11) Fix a co-re related compilation error in selftests, from Yonghong.
====================
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
|
|
Expose kernel's BTF under the name vmlinux to be more uniform with using
kernel module names as file names in the future.
Fixes: 341dfcf8d78e ("btf: expose BTF info through sysfs")
Suggested-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
Make .BTF section allocated and expose its contents through sysfs.
/sys/kernel/btf directory is created to contain all the BTFs present
inside kernel. Currently there is only kernel's main BTF, represented as
/sys/kernel/btf/kernel file. Once kernel modules' BTFs are supported,
each module will expose its BTF as /sys/kernel/btf/<module-name> file.
Current approach relies on a few pieces coming together:
1. pahole is used to take almost final vmlinux image (modulo .BTF and
kallsyms) and generate .BTF section by converting DWARF info into
BTF. This section is not allocated and not mapped to any segment,
though, so is not yet accessible from inside kernel at runtime.
2. objcopy dumps .BTF contents into binary file and subsequently
convert binary file into linkable object file with automatically
generated symbols _binary__btf_kernel_bin_start and
_binary__btf_kernel_bin_end, pointing to start and end, respectively,
of BTF raw data.
3. final vmlinux image is generated by linking this object file (and
kallsyms, if necessary). sysfs_btf.c then creates
/sys/kernel/btf/kernel file and exposes embedded BTF contents through
it. This allows, e.g., libbpf and bpftool access BTF info at
well-known location, without resorting to searching for vmlinux image
on disk (location of which is not standardized and vmlinux image
might not be even available in some scenarios, e.g., inside qemu
during testing).
Alternative approach using .incbin assembler directive to embed BTF
contents directly was attempted but didn't work, because sysfs_proc.o is
not re-compiled during link-vmlinux.sh stage. This is required, though,
to update embedded BTF data (initially empty data is embedded, then
pahole generates BTF info and we need to regenerate sysfs_btf.o with
updated contents, but it's too late at that point).
If BTF couldn't be generated due to missing or too old pahole,
sysfs_btf.c handles that gracefully by detecting that
_binary__btf_kernel_bin_start (weak symbol) is 0 and not creating
/sys/kernel/btf at all.
v2->v3:
- added Documentation/ABI/testing/sysfs-kernel-btf (Greg K-H);
- created proper kobject (btf_kobj) for btf directory (Greg K-H);
- undo v2 change of reusing vmlinux, as it causes extra kallsyms pass
due to initially missing __binary__btf_kernel_bin_{start/end} symbols;
v1->v2:
- allow kallsyms stage to re-use vmlinux generated by gen_btf();
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
fec's gpio phy reset properties have been deprecated.
Update the dt-bindings documentation to explicitly mark
them as such, and provide a short description of the
recommended alternative.
Signed-off-by: Sven Van Asbroeck <TheSven73@gmail.com>
Signed-off-by: Rob Herring <robh@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V updates from Paul Walmsley:
"A few minor RISC-V updates for v5.3-rc4:
- Remove __udivdi3() from the 32-bit Linux port, converting the only
upstream user to use do_div(), per Linux policy
- Convert the RISC-V standard clocksource away from per-cpu data
structures, since only one is used by Linux, even on a multi-CPU
system
- A set of DT binding updates that remove an obsolete text binding in
favor of a YAML binding, fix a bogus compatible string in the
schema (thus fixing a "make dtbs_check" warning), and clarifies the
future values expected in one of the RISC-V CPU properties"
* tag 'riscv/for-v5.3-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
dt-bindings: riscv: fix the schema compatible string for the HiFive Unleashed board
dt-bindings: riscv: remove obsolete cpus.txt
RISC-V: Remove udivdi3
riscv: delay: use do_div() instead of __udivdi3()
dt-bindings: Update the riscv,isa string description
RISC-V: Remove per cpu clocksource
|
|
The current implementation of TCP MTU probing can considerably
underestimate the MTU on lossy connections allowing the MSS to get down to
48. We have found that in almost all of these cases on our networks these
paths can handle much larger MTUs meaning the connections are being
artificially limited. Even though TCP MTU probing can raise the MSS back up
we have seen this not to be the case causing connections to be "stuck" with
an MSS of 48 when heavy loss is present.
Prior to pushing out this change we could not keep TCP MTU probing enabled
b/c of the above reasons. Now with a reasonble floor set we've had it
enabled for the past 6 months.
The new sysctl will still default to TCP_MIN_SND_MSS (48), but gives
administrators the ability to control the floor of MSS probing.
Signed-off-by: Josh Hunt <johunt@akamai.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
sk_validate_xmit_skb() and drivers depend on the sk member of
struct sk_buff to identify segments requiring encryption.
Any operation which removes or does not preserve the original TLS
socket such as skb_orphan() or skb_clone() will cause clear text
leaks.
Make the TCP socket underlying an offloaded TLS connection
mark all skbs as decrypted, if TLS TX is in offload mode.
Then in sk_validate_xmit_skb() catch skbs which have no socket
(or a socket with no validation) and decrypted flag set.
Note that CONFIG_SOCK_VALIDATE_XMIT, CONFIG_TLS_DEVICE and
sk->sk_validate_xmit_skb are slightly interchangeable right now,
they all imply TLS offload. The new checks are guarded by
CONFIG_TLS_DEVICE because that's the option guarding the
sk_buff->decrypted member.
Second, smaller issue with orphaning is that it breaks
the guarantee that packets will be delivered to device
queues in-order. All TLS offload drivers depend on that
scheduling property. This means skb_orphan_partial()'s
trick of preserving partial socket references will cause
issues in the drivers. We need a full orphan, and as a
result netem delay/throttling will cause all TLS offload
skbs to be dropped.
Reusing the sk_buff->decrypted flag also protects from
leaking clear text when incoming, decrypted skb is redirected
(e.g. by TC).
See commit 0608c69c9a80 ("bpf: sk_msg, sock{map|hash} redirect
through ULP") for justification why the internal flag is safe.
The only location which could leak the flag in is tcp_bpf_sendmsg(),
which is taken care of by clearing the previously unused bit.
v2:
- remove superfluous decrypted mark copy (Willem);
- remove the stale doc entry (Boris);
- rely entirely on EOR marking to prevent coalescing (Boris);
- use an internal sendpages flag instead of marking the socket
(Boris).
v3 (Willem):
- reorganize the can_skb_orphan_partial() condition;
- fix the flag leak-in through tcp_bpf_sendmsg.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
IPX is no longer supported, but the example in the documentation
might useful. Replace it with IPv6.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Both IPX and TR have not been supported for a while now.
Remove them from the /proc/sys/net documentation.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Unleashed board
The YAML binding document for SiFive boards has an incorrect
compatible string for the HiFive Unleashed board. Change it to match
the name of the board on the SiFive web site:
https://www.sifive.com/boards/hifive-unleashed
which also matches the contents of the board DT data file:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/riscv/boot/dts/sifive/hifive-unleashed-a00.dts#n13
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
Acked-by: Rob Herring <robh@kernel.org>
|
|
Remove the now-obsolete riscv/cpus.txt DT binding document, since we
are using YAML binding documentation instead.
While doing so, transfer the explanatory text about 'harts' (with some
edits) into the YAML file, at Rob's request.
Link: https://lore.kernel.org/linux-riscv/CAL_JsqJs6MtvmuyAknsUxQymbmoV=G+=JfS1PQj9kNHV7fjC9g@mail.gmail.com/
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Rob Herring <robh@kernel.org>
Reviewed-by: Rob Herring <robh@kernel.org>
|
|
Since the RISC-V specification states that ISA description strings are
case-insensitive, there's no functional difference between mixed-case,
upper-case, and lower-case ISA strings. Thus, to simplify parsing,
specify that the letters present in "riscv,isa" must be all lowercase.
Suggested-by: Paul Walmsley <paul.walmsley@sifive.com>
Signed-off-by: Atish Patra <atish.patra@wdc.com>
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
|
|
Pull cifs fixes from Steve French:
"Six small SMB3 fixes, two for stable"
* tag '5.3-rc3-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6:
SMB3: Kernel oops mounting a encryptData share with CONFIG_DEBUG_VIRTUAL
smb3: update TODO list of missing features
smb3: send CAP_DFS capability during session setup
SMB3: Fix potential memory leak when processing compound chain
SMB3: Fix deadlock in validate negotiate hits reconnect
cifs: fix rmmod regression in cifs.ko caused by force_sig changes
|
|
Just minor overlapping changes in the conflicts here.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Pull networking fixes from David Miller:
"Yeah I should have sent a pull request last week, so there is a lot
more here than usual:
1) Fix memory leak in ebtables compat code, from Wenwen Wang.
2) Several kTLS bug fixes from Jakub Kicinski (circular close on
disconnect etc.)
3) Force slave speed check on link state recovery in bonding 802.3ad
mode, from Thomas Falcon.
4) Clear RX descriptor bits before assigning buffers to them in
stmmac, from Jose Abreu.
5) Several missing of_node_put() calls, mostly wrt. for_each_*() OF
loops, from Nishka Dasgupta.
6) Double kfree_skb() in peak_usb can driver, from Stephane Grosjean.
7) Need to hold sock across skb->destructor invocation, from Cong
Wang.
8) IP header length needs to be validated in ipip tunnel xmit, from
Haishuang Yan.
9) Use after free in ip6 tunnel driver, also from Haishuang Yan.
10) Do not use MSI interrupts on r8169 chips before RTL8168d, from
Heiner Kallweit.
11) Upon bridge device init failure, we need to delete the local fdb.
From Nikolay Aleksandrov.
12) Handle erros from of_get_mac_address() properly in stmmac, from
Martin Blumenstingl.
13) Handle concurrent rename vs. dump in netfilter ipset, from Jozsef
Kadlecsik.
14) Setting NETIF_F_LLTX on mac80211 causes complete breakage with
some devices, so revert. From Johannes Berg.
15) Fix deadlock in rxrpc, from David Howells.
16) Fix Kconfig deps of enetc driver, we must have PHYLIB. From Yue
Haibing.
17) Fix mvpp2 crash on module removal, from Matteo Croce.
18) Fix race in genphy_update_link, from Heiner Kallweit.
19) bpf_xdp_adjust_head() stopped working with generic XDP when we
fixes generic XDP to support stacked devices properly, fix from
Jesper Dangaard Brouer.
20) Unbalanced RCU locking in rt6_update_exception_stamp_rt(), from
David Ahern.
21) Several memory leaks in new sja1105 driver, from Vladimir Oltean"
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (214 commits)
net: dsa: sja1105: Fix memory leak on meta state machine error path
net: dsa: sja1105: Fix memory leak on meta state machine normal path
net: dsa: sja1105: Really fix panic on unregistering PTP clock
net: dsa: sja1105: Use the LOCKEDS bit for SJA1105 E/T as well
net: dsa: sja1105: Fix broken learning with vlan_filtering disabled
net: dsa: qca8k: Add of_node_put() in qca8k_setup_mdio_bus()
net: sched: sample: allow accessing psample_group with rtnl
net: sched: police: allow accessing police->params with rtnl
net: hisilicon: Fix dma_map_single failed on arm64
net: hisilicon: fix hip04-xmit never return TX_BUSY
net: hisilicon: make hip04_tx_reclaim non-reentrant
tc-testing: updated vlan action tests with batch create/delete
net sched: update vlan action for batched events operations
net: stmmac: tc: Do not return a fragment entry
net: stmmac: Fix issues when number of Queues >= 4
net: stmmac: xgmac: Fix XGMAC selftests
be2net: disable bh with spin_lock in be_process_mcc
net: cxgb3_main: Fix a resource leak in a error path in 'init_one()'
net: ethernet: sun4i-emac: Support phy-handle property for finding PHYs
net: bridge: move default pvid init/deinit to NETDEV_REGISTER/UNREGISTER
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull pti updates from Thomas Gleixner:
"The performance deterioration departement is not proud at all to
present yet another set of speculation fences to mitigate the next
chapter in the 'what could possibly go wrong' story.
The new vulnerability belongs to the Spectre class and affects GS
based data accesses and has therefore been dubbed 'Grand Schemozzle'
for secret communication purposes. It's officially listed as
CVE-2019-1125.
Conditional branches in the entry paths which contain a SWAPGS
instruction (interrupts and exceptions) can be mis-speculated which
results in speculative accesses with a wrong GS base.
This can happen on entry from user mode through a mis-speculated
branch which takes the entry from kernel mode path and therefore does
not execute the SWAPGS instruction. The following speculative accesses
are done with user GS base.
On entry from kernel mode the mis-speculated branch executes the
SWAPGS instruction in the entry from user mode path which has the same
effect that the following GS based accesses are done with user GS
base.
If there is a disclosure gadget available in these code paths the
mis-speculated data access can be leaked through the usual side
channels.
The entry from user mode issue affects all CPUs which have speculative
execution. The entry from kernel mode issue affects only Intel CPUs
which can speculate through SWAPGS. On CPUs from other vendors SWAPGS
has semantics which prevent that.
SMAP migitates both problems but only when the CPU is not affected by
the Meltdown vulnerability.
The mitigation is to issue LFENCE instructions in the entry from
kernel mode path for all affected CPUs and on the affected Intel CPUs
also in the entry from user mode path unless PTI is enabled because
the CR3 write is serializing.
The fences are as usual enabled conditionally and can be completely
disabled on the kernel command line. The Spectre V1 documentation is
updated accordingly.
A big "Thank You!" goes to Josh for doing the heavy lifting for this
round of hardware misfeature 'repair'. Of course also "Thank You!" to
everybody else who contributed in one way or the other"
* 'x86/grand-schemozzle' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
Documentation: Add swapgs description to the Spectre v1 documentation
x86/speculation/swapgs: Exclude ATOMs from speculation through SWAPGS
x86/entry/64: Use JMP instead of JMPQ
x86/speculation: Enable Spectre v1 swapgs mitigations
x86/speculation: Prepare entry code for Spectre v1 swapgs mitigations
|
|
minor cleanup of documentation, updating to more current status.
Signed-off-by: Steve French <stfrench@microsoft.com>
|