diff options
Diffstat (limited to 'Documentation')
21 files changed, 692 insertions, 87 deletions
diff --git a/Documentation/ABI/testing/sysfs-kernel-btf b/Documentation/ABI/testing/sysfs-kernel-btf new file mode 100644 index 000000000000..2c9744b2cd59 --- /dev/null +++ b/Documentation/ABI/testing/sysfs-kernel-btf @@ -0,0 +1,17 @@ +What: /sys/kernel/btf +Date: Aug 2019 +KernelVersion: 5.5 +Contact: bpf@vger.kernel.org +Description: + Contains BTF type information and related data for kernel and + kernel modules. + +What: /sys/kernel/btf/vmlinux +Date: Aug 2019 +KernelVersion: 5.5 +Contact: bpf@vger.kernel.org +Description: + Read-only binary attribute exposing kernel's own BTF type + information with description of all internal kernel types. See + Documentation/bpf/btf.rst for detailed description of format + itself. diff --git a/Documentation/PCI/pci-error-recovery.rst b/Documentation/PCI/pci-error-recovery.rst index e5d450df06b4..13beee23cb04 100644 --- a/Documentation/PCI/pci-error-recovery.rst +++ b/Documentation/PCI/pci-error-recovery.rst @@ -421,7 +421,6 @@ That is, the recovery API only requires that: - drivers/net/ixgbe - drivers/net/cxgb3 - drivers/net/s2io.c - - drivers/net/qlge The End ------- diff --git a/Documentation/bpf/prog_flow_dissector.rst b/Documentation/bpf/prog_flow_dissector.rst index ed343abe541e..a78bf036cadd 100644 --- a/Documentation/bpf/prog_flow_dissector.rst +++ b/Documentation/bpf/prog_flow_dissector.rst @@ -26,6 +26,7 @@ The inputs are: * ``nhoff`` - initial offset of the networking header * ``thoff`` - initial offset of the transport header, initialized to nhoff * ``n_proto`` - L3 protocol type, parsed out of L2 header + * ``flags`` - optional flags Flow dissector BPF program should fill out the rest of the ``struct bpf_flow_keys`` fields. Input arguments ``nhoff/thoff/n_proto`` should be @@ -101,6 +102,23 @@ can be called for both cases and would have to be written carefully to handle both cases. +Flags +===== + +``flow_keys->flags`` might contain optional input flags that work as follows: + +* ``BPF_FLOW_DISSECTOR_F_PARSE_1ST_FRAG`` - tells BPF flow dissector to + continue parsing first fragment; the default expected behavior is that + flow dissector returns as soon as it finds out that the packet is fragmented; + used by ``eth_get_headlen`` to estimate length of all headers for GRO. +* ``BPF_FLOW_DISSECTOR_F_STOP_AT_FLOW_LABEL`` - tells BPF flow dissector to + stop parsing as soon as it reaches IPv6 flow label; used by + ``___skb_get_hash`` and ``__skb_get_hash_symmetric`` to get flow hash. +* ``BPF_FLOW_DISSECTOR_F_STOP_AT_ENCAP`` - tells BPF flow dissector to stop + parsing as soon as it reaches encapsulated headers; used by routing + infrastructure. + + Reference Implementation ======================== diff --git a/Documentation/devicetree/bindings/net/adi,adin.yaml b/Documentation/devicetree/bindings/net/adi,adin.yaml new file mode 100644 index 000000000000..69375cb28e92 --- /dev/null +++ b/Documentation/devicetree/bindings/net/adi,adin.yaml @@ -0,0 +1,73 @@ +# SPDX-License-Identifier: GPL-2.0+ +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/net/adi,adin.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: Analog Devices ADIN1200/ADIN1300 PHY + +maintainers: + - Alexandru Ardelean <alexandru.ardelean@analog.com> + +description: | + Bindings for Analog Devices Industrial Ethernet PHYs + +allOf: + - $ref: ethernet-phy.yaml# + +properties: + adi,rx-internal-delay-ps: + description: | + RGMII RX Clock Delay used only when PHY operates in RGMII mode with + internal delay (phy-mode is 'rgmii-id' or 'rgmii-rxid') in pico-seconds. + enum: [ 1600, 1800, 2000, 2200, 2400 ] + default: 2000 + + adi,tx-internal-delay-ps: + description: | + RGMII TX Clock Delay used only when PHY operates in RGMII mode with + internal delay (phy-mode is 'rgmii-id' or 'rgmii-txid') in pico-seconds. + enum: [ 1600, 1800, 2000, 2200, 2400 ] + default: 2000 + + adi,fifo-depth-bits: + description: | + When operating in RMII mode, this option configures the FIFO depth. + enum: [ 4, 8, 12, 16, 20, 24 ] + default: 8 + + adi,disable-energy-detect: + description: | + Disables Energy Detect Powerdown Mode (default disabled, i.e energy detect + is enabled if this property is unspecified) + type: boolean + +examples: + - | + ethernet { + #address-cells = <1>; + #size-cells = <0>; + + phy-mode = "rgmii-id"; + + ethernet-phy@0 { + reg = <0>; + + adi,rx-internal-delay-ps = <1800>; + adi,tx-internal-delay-ps = <2200>; + }; + }; + - | + ethernet { + #address-cells = <1>; + #size-cells = <0>; + + phy-mode = "rmii"; + + ethernet-phy@1 { + reg = <1>; + + adi,fifo-depth-bits = <16>; + adi,disable-energy-detect; + }; + }; diff --git a/Documentation/devicetree/bindings/net/allwinner,sun7i-a20-gmac.yaml b/Documentation/devicetree/bindings/net/allwinner,sun7i-a20-gmac.yaml index 06b1cc8bea14..ef446ae166f3 100644 --- a/Documentation/devicetree/bindings/net/allwinner,sun7i-a20-gmac.yaml +++ b/Documentation/devicetree/bindings/net/allwinner,sun7i-a20-gmac.yaml @@ -17,6 +17,9 @@ properties: compatible: const: allwinner,sun7i-a20-gmac + reg: + maxItems: 1 + interrupts: maxItems: 1 diff --git a/Documentation/devicetree/bindings/net/amlogic,meson-dwmac.yaml b/Documentation/devicetree/bindings/net/amlogic,meson-dwmac.yaml new file mode 100644 index 000000000000..ae91aa9d8616 --- /dev/null +++ b/Documentation/devicetree/bindings/net/amlogic,meson-dwmac.yaml @@ -0,0 +1,113 @@ +# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause) +# Copyright 2019 BayLibre, SAS +%YAML 1.2 +--- +$id: "http://devicetree.org/schemas/net/amlogic,meson-dwmac.yaml#" +$schema: "http://devicetree.org/meta-schemas/core.yaml#" + +title: Amlogic Meson DWMAC Ethernet controller + +maintainers: + - Neil Armstrong <narmstrong@baylibre.com> + - Martin Blumenstingl <martin.blumenstingl@googlemail.com> + +# We need a select here so we don't match all nodes with 'snps,dwmac' +select: + properties: + compatible: + contains: + enum: + - amlogic,meson6-dwmac + - amlogic,meson8b-dwmac + - amlogic,meson8m2-dwmac + - amlogic,meson-gxbb-dwmac + - amlogic,meson-axg-dwmac + required: + - compatible + +allOf: + - $ref: "snps,dwmac.yaml#" + - if: + properties: + compatible: + contains: + enum: + - amlogic,meson8b-dwmac + - amlogic,meson8m2-dwmac + - amlogic,meson-gxbb-dwmac + - amlogic,meson-axg-dwmac + + then: + properties: + clocks: + items: + - description: GMAC main clock + - description: First parent clock of the internal mux + - description: Second parent clock of the internal mux + + clock-names: + minItems: 3 + maxItems: 3 + items: + - const: stmmaceth + - const: clkin0 + - const: clkin1 + + amlogic,tx-delay-ns: + $ref: /schemas/types.yaml#definitions/uint32 + description: + The internal RGMII TX clock delay (provided by this driver) in + nanoseconds. Allowed values are 0ns, 2ns, 4ns, 6ns. + When phy-mode is set to "rgmii" then the TX delay should be + explicitly configured. When not configured a fallback of 2ns is + used. When the phy-mode is set to either "rgmii-id" or "rgmii-txid" + the TX clock delay is already provided by the PHY. In that case + this property should be set to 0ns (which disables the TX clock + delay in the MAC to prevent the clock from going off because both + PHY and MAC are adding a delay). + Any configuration is ignored when the phy-mode is set to "rmii". + +properties: + compatible: + additionalItems: true + maxItems: 3 + items: + - enum: + - amlogic,meson6-dwmac + - amlogic,meson8b-dwmac + - amlogic,meson8m2-dwmac + - amlogic,meson-gxbb-dwmac + - amlogic,meson-axg-dwmac + contains: + enum: + - snps,dwmac-3.70a + - snps,dwmac + + reg: + items: + - description: + The first register range should be the one of the DWMAC controller + - description: + The second range is is for the Amlogic specific configuration + (for example the PRG_ETHERNET register range on Meson8b and newer) + +required: + - compatible + - reg + - interrupts + - interrupt-names + - clocks + - clock-names + - phy-mode + +examples: + - | + ethmac: ethernet@c9410000 { + compatible = "amlogic,meson-gxbb-dwmac", "snps,dwmac"; + reg = <0xc9410000 0x10000>, <0xc8834540 0x8>; + interrupts = <8>; + interrupt-names = "macirq"; + clocks = <&clk_eth>, <&clkc_fclk_div2>, <&clk_mpll2>; + clock-names = "stmmaceth", "clkin0", "clkin1"; + phy-mode = "rgmii"; + }; diff --git a/Documentation/devicetree/bindings/net/aspeed,ast2600-mdio.yaml b/Documentation/devicetree/bindings/net/aspeed,ast2600-mdio.yaml new file mode 100644 index 000000000000..71808e78a495 --- /dev/null +++ b/Documentation/devicetree/bindings/net/aspeed,ast2600-mdio.yaml @@ -0,0 +1,45 @@ +# SPDX-License-Identifier: GPL-2.0-or-later +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/net/aspeed,ast2600-mdio.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: ASPEED AST2600 MDIO Controller + +maintainers: + - Andrew Jeffery <andrew@aj.id.au> + +description: |+ + The ASPEED AST2600 MDIO controller is the third iteration of ASPEED's MDIO + bus register interface, this time also separating out the controller from the + MAC. + +allOf: + - $ref: "mdio.yaml#" + +properties: + compatible: + const: aspeed,ast2600-mdio + reg: + maxItems: 1 + description: The register range of the MDIO controller instance + +required: + - compatible + - reg + - "#address-cells" + - "#size-cells" + +examples: + - | + mdio0: mdio@1e650000 { + compatible = "aspeed,ast2600-mdio"; + reg = <0x1e650000 0x8>; + #address-cells = <1>; + #size-cells = <0>; + + ethphy0: ethernet-phy@0 { + compatible = "ethernet-phy-ieee802.3-c22"; + reg = <0>; + }; + }; diff --git a/Documentation/devicetree/bindings/net/can/fsl-flexcan.txt b/Documentation/devicetree/bindings/net/can/fsl-flexcan.txt index bc77477c6878..94c0f8bf4deb 100644 --- a/Documentation/devicetree/bindings/net/can/fsl-flexcan.txt +++ b/Documentation/devicetree/bindings/net/can/fsl-flexcan.txt @@ -32,6 +32,15 @@ Optional properties: ack_gpr is the gpr register offset of CAN stop acknowledge. ack_bit is the bit offset of CAN stop acknowledge. +- fsl,clk-source: Select the clock source to the CAN Protocol Engine (PE). + It's SoC Implementation dependent. Refer to RM for detailed + definition. If this property is not set in device tree node + then driver selects clock source 1 by default. + 0: clock source 0 (oscillator clock) + 1: clock source 1 (peripheral clock) + +- wakeup-source: enable CAN remote wakeup + Example: can@1c000 { @@ -40,4 +49,5 @@ Example: interrupts = <48 0x2>; interrupt-parent = <&mpic>; clock-frequency = <200000000>; // filled in by bootloader + fsl,clk-source = <0>; // select clock source 0 for PE }; diff --git a/Documentation/devicetree/bindings/net/can/tcan4x5x.txt b/Documentation/devicetree/bindings/net/can/tcan4x5x.txt new file mode 100644 index 000000000000..c388f7d9feb1 --- /dev/null +++ b/Documentation/devicetree/bindings/net/can/tcan4x5x.txt @@ -0,0 +1,37 @@ +Texas Instruments TCAN4x5x CAN Controller +================================================ + +This file provides device node information for the TCAN4x5x interface contains. + +Required properties: + - compatible: "ti,tcan4x5x" + - reg: 0 + - #address-cells: 1 + - #size-cells: 0 + - spi-max-frequency: Maximum frequency of the SPI bus the chip can + operate at should be less than or equal to 18 MHz. + - data-ready-gpios: Interrupt GPIO for data and error reporting. + - device-wake-gpios: Wake up GPIO to wake up the TCAN device. + +See Documentation/devicetree/bindings/net/can/m_can.txt for additional +required property details. + +Optional properties: + - reset-gpios: Hardwired output GPIO. If not defined then software + reset. + - device-state-gpios: Input GPIO that indicates if the device is in + a sleep state or if the device is active. + +Example: +tcan4x5x: tcan4x5x@0 { + compatible = "ti,tcan4x5x"; + reg = <0>; + #address-cells = <1>; + #size-cells = <1>; + spi-max-frequency = <10000000>; + bosch,mram-cfg = <0x0 0 0 32 0 0 1 1>; + data-ready-gpios = <&gpio1 14 GPIO_ACTIVE_LOW>; + device-state-gpios = <&gpio3 21 GPIO_ACTIVE_HIGH>; + device-wake-gpios = <&gpio1 15 GPIO_ACTIVE_HIGH>; + reset-gpios = <&gpio1 27 GPIO_ACTIVE_LOW>; +}; diff --git a/Documentation/devicetree/bindings/net/dsa/ksz.txt b/Documentation/devicetree/bindings/net/dsa/ksz.txt index 4ac21cef370e..5e8429b6f9ca 100644 --- a/Documentation/devicetree/bindings/net/dsa/ksz.txt +++ b/Documentation/devicetree/bindings/net/dsa/ksz.txt @@ -5,6 +5,9 @@ Required properties: - compatible: For external switch chips, compatible string must be exactly one of the following: + - "microchip,ksz8765" + - "microchip,ksz8794" + - "microchip,ksz8795" - "microchip,ksz9477" - "microchip,ksz9897" - "microchip,ksz9896" diff --git a/Documentation/devicetree/bindings/net/dsa/marvell.txt b/Documentation/devicetree/bindings/net/dsa/marvell.txt index 6f9538974bb9..30c11fea491b 100644 --- a/Documentation/devicetree/bindings/net/dsa/marvell.txt +++ b/Documentation/devicetree/bindings/net/dsa/marvell.txt @@ -22,7 +22,7 @@ which is at a different MDIO base address in different switch families. - "marvell,mv88e6190" : Switch has base address 0x00. Use with models: 6190, 6190X, 6191, 6290, 6390, 6390X - "marvell,mv88e6250" : Switch has base address 0x08 or 0x18. Use with model: - 6250 + 6220, 6250 Required properties: - compatible : Should be one of "marvell,mv88e6085", diff --git a/Documentation/devicetree/bindings/net/fsl-enetc.txt b/Documentation/devicetree/bindings/net/fsl-enetc.txt index 25fc687419db..b7034ccbc1bd 100644 --- a/Documentation/devicetree/bindings/net/fsl-enetc.txt +++ b/Documentation/devicetree/bindings/net/fsl-enetc.txt @@ -11,7 +11,9 @@ Required properties: to parent node bindings. - compatible : Should be "fsl,enetc". -1) The ENETC external port is connected to a MDIO configurable phy: +1. The ENETC external port is connected to a MDIO configurable phy + +1.1. Using the local ENETC Port MDIO interface In this case, the ENETC node should include a "mdio" sub-node that in turn should contain the "ethernet-phy" node describing the @@ -47,8 +49,42 @@ Example: }; }; -2) The ENETC port is an internal port or has a fixed-link external -connection: +1.2. Using the central MDIO PCIe endpoint device + +In this case, the mdio node should be defined as another PCIe +endpoint node, at the same level with the ENETC port nodes. + +Required properties: + +- reg : Specifies PCIe Device Number and Function + Number of the ENETC endpoint device, according + to parent node bindings. +- compatible : Should be "fsl,enetc-mdio". + +The remaining required mdio bus properties are standard, their bindings +already defined in Documentation/devicetree/bindings/net/mdio.txt. + +Example: + + ethernet@0,0 { + compatible = "fsl,enetc"; + reg = <0x000000 0 0 0 0>; + phy-handle = <&sgmii_phy0>; + phy-connection-type = "sgmii"; + }; + + mdio@0,3 { + compatible = "fsl,enetc-mdio"; + reg = <0x000300 0 0 0 0>; + #address-cells = <1>; + #size-cells = <0>; + sgmii_phy0: ethernet-phy@2 { + reg = <0x2>; + }; + }; + +2. The ENETC port is an internal port or has a fixed-link external +connection In this case, the ENETC port node defines a fixed link connection, as specified by Documentation/devicetree/bindings/net/fixed-link.txt. diff --git a/Documentation/devicetree/bindings/net/mediatek-net.txt b/Documentation/devicetree/bindings/net/mediatek-net.txt index 770ff98d4524..72d03e07cf7c 100644 --- a/Documentation/devicetree/bindings/net/mediatek-net.txt +++ b/Documentation/devicetree/bindings/net/mediatek-net.txt @@ -12,6 +12,7 @@ Required properties: "mediatek,mt7623-eth", "mediatek,mt2701-eth": for MT7623 SoC "mediatek,mt7622-eth": for MT7622 SoC "mediatek,mt7629-eth": for MT7629 SoC + "ralink,rt5350-eth": for Ralink Rt5350F and MT7628/88 SoC - reg: Address and length of the register set for the device - interrupts: Should contain the three frame engines interrupts in numeric order. These are fe_int0, fe_int1 and fe_int2. diff --git a/Documentation/devicetree/bindings/net/meson-dwmac.txt b/Documentation/devicetree/bindings/net/meson-dwmac.txt deleted file mode 100644 index 1321bb194ed9..000000000000 --- a/Documentation/devicetree/bindings/net/meson-dwmac.txt +++ /dev/null @@ -1,71 +0,0 @@ -* Amlogic Meson DWMAC Ethernet controller - -The device inherits all the properties of the dwmac/stmmac devices -described in the file stmmac.txt in the current directory with the -following changes. - -Required properties on all platforms: - -- compatible: Depending on the platform this should be one of: - - "amlogic,meson6-dwmac" - - "amlogic,meson8b-dwmac" - - "amlogic,meson8m2-dwmac" - - "amlogic,meson-gxbb-dwmac" - - "amlogic,meson-axg-dwmac" - Additionally "snps,dwmac" and any applicable more - detailed version number described in net/stmmac.txt - should be used. - -- reg: The first register range should be the one of the DWMAC - controller. The second range is is for the Amlogic specific - configuration (for example the PRG_ETHERNET register range - on Meson8b and newer) - -Required properties on Meson8b, Meson8m2, GXBB and newer: -- clock-names: Should contain the following: - - "stmmaceth" - see stmmac.txt - - "clkin0" - first parent clock of the internal mux - - "clkin1" - second parent clock of the internal mux - -Optional properties on Meson8b, Meson8m2, GXBB and newer: -- amlogic,tx-delay-ns: The internal RGMII TX clock delay (provided - by this driver) in nanoseconds. Allowed values - are: 0ns, 2ns, 4ns, 6ns. - When phy-mode is set to "rgmii" then the TX - delay should be explicitly configured. When - not configured a fallback of 2ns is used. - When the phy-mode is set to either "rgmii-id" - or "rgmii-txid" the TX clock delay is already - provided by the PHY. In that case this - property should be set to 0ns (which disables - the TX clock delay in the MAC to prevent the - clock from going off because both PHY and MAC - are adding a delay). - Any configuration is ignored when the phy-mode - is set to "rmii". - -Example for Meson6: - - ethmac: ethernet@c9410000 { - compatible = "amlogic,meson6-dwmac", "snps,dwmac"; - reg = <0xc9410000 0x10000 - 0xc1108108 0x4>; - interrupts = <0 8 1>; - interrupt-names = "macirq"; - clocks = <&clk81>; - clock-names = "stmmaceth"; - } - -Example for GXBB: - ethmac: ethernet@c9410000 { - compatible = "amlogic,meson-gxbb-dwmac", "snps,dwmac"; - reg = <0x0 0xc9410000 0x0 0x10000>, - <0x0 0xc8834540 0x0 0x8>; - interrupts = <0 8 1>; - interrupt-names = "macirq"; - clocks = <&clkc CLKID_ETH>, - <&clkc CLKID_FCLK_DIV2>, - <&clkc CLKID_MPLL2>; - clock-names = "stmmaceth", "clkin0", "clkin1"; - phy-mode = "rgmii"; - }; diff --git a/Documentation/devicetree/bindings/net/mscc-ocelot.txt b/Documentation/devicetree/bindings/net/mscc-ocelot.txt index 9e5c17d426ce..3b6290b45ce5 100644 --- a/Documentation/devicetree/bindings/net/mscc-ocelot.txt +++ b/Documentation/devicetree/bindings/net/mscc-ocelot.txt @@ -12,13 +12,15 @@ Required properties: - "sys" - "rew" - "qs" + - "ptp" (optional due to backward compatibility) - "qsys" - "ana" - "portX" with X from 0 to the number of last port index available on that switch -- interrupts: Should contain the switch interrupts for frame extraction and - frame injection -- interrupt-names: should contain the interrupt names: "xtr", "inj" +- interrupts: Should contain the switch interrupts for frame extraction, + frame injection and PTP ready. +- interrupt-names: should contain the interrupt names: "xtr", "inj". Can contain + "ptp_rdy" which is optional due to backward compatibility. - ethernet-ports: A container for child nodes representing switch ports. The ethernet-ports container has the following properties @@ -44,6 +46,7 @@ Example: reg = <0x1010000 0x10000>, <0x1030000 0x10000>, <0x1080000 0x100>, + <0x10e0000 0x10000>, <0x11e0000 0x100>, <0x11f0000 0x100>, <0x1200000 0x100>, @@ -57,11 +60,12 @@ Example: <0x1280000 0x100>, <0x1800000 0x80000>, <0x1880000 0x10000>; - reg-names = "sys", "rew", "qs", "port0", "port1", "port2", - "port3", "port4", "port5", "port6", "port7", - "port8", "port9", "port10", "qsys", "ana"; - interrupts = <21 22>; - interrupt-names = "xtr", "inj"; + reg-names = "sys", "rew", "qs", "ptp", "port0", "port1", + "port2", "port3", "port4", "port5", "port6", + "port7", "port8", "port9", "port10", "qsys", + "ana"; + interrupts = <18 21 22>; + interrupt-names = "ptp_rdy", "xtr", "inj"; ethernet-ports { #address-cells = <1>; diff --git a/Documentation/devicetree/bindings/net/snps,dwmac.yaml b/Documentation/devicetree/bindings/net/snps,dwmac.yaml index 76fea2be66ac..c78be15704b9 100644 --- a/Documentation/devicetree/bindings/net/snps,dwmac.yaml +++ b/Documentation/devicetree/bindings/net/snps,dwmac.yaml @@ -50,6 +50,11 @@ properties: - allwinner,sun8i-r40-emac - allwinner,sun8i-v3s-emac - allwinner,sun50i-a64-emac + - amlogic,meson6-dwmac + - amlogic,meson8b-dwmac + - amlogic,meson8m2-dwmac + - amlogic,meson-gxbb-dwmac + - amlogic,meson-axg-dwmac - snps,dwmac - snps,dwmac-3.50a - snps,dwmac-3.610 @@ -61,7 +66,8 @@ properties: - snps,dwxgmac-2.10 reg: - maxItems: 1 + minItems: 1 + maxItems: 2 interrupts: minItems: 1 diff --git a/Documentation/networking/device_drivers/mellanox/mlx5.rst b/Documentation/networking/device_drivers/mellanox/mlx5.rst index 214325897732..b30a63dbf4b7 100644 --- a/Documentation/networking/device_drivers/mellanox/mlx5.rst +++ b/Documentation/networking/device_drivers/mellanox/mlx5.rst @@ -12,6 +12,7 @@ Contents - `Enabling the driver and kconfig options`_ - `Devlink info`_ - `Devlink health reporters`_ +- `mlx5 tracepoints`_ Enabling the driver and kconfig options ================================================ @@ -126,7 +127,7 @@ Devlink health reporters tx reporter ----------- -The tx reporter is responsible of two error scenarios: +The tx reporter is responsible for reporting and recovering of the following two error scenarios: - TX timeout Report on kernel tx timeout detection. @@ -135,7 +136,7 @@ The tx reporter is responsible of two error scenarios: Report on error tx completion. Recover by flushing the TX queue and reset it. -TX reporter also support Diagnose callback, on which it provides +TX reporter also support on demand diagnose callback, on which it provides real time information of its send queues status. User commands examples: @@ -144,11 +145,40 @@ User commands examples: $ devlink health diagnose pci/0000:82:00.0 reporter tx +NOTE: This command has valid output only when interface is up, otherwise the command has empty output. + - Show number of tx errors indicated, number of recover flows ended successfully, is autorecover enabled and graceful period from last recover:: $ devlink health show pci/0000:82:00.0 reporter tx +rx reporter +----------- +The rx reporter is responsible for reporting and recovering of the following two error scenarios: + +- RX queues initialization (population) timeout + RX queues descriptors population on ring initialization is done in + napi context via triggering an irq, in case of a failure to get + the minimum amount of descriptors, a timeout would occur and it + could be recoverable by polling the EQ (Event Queue). +- RX completions with errors (reported by HW on interrupt context) + Report on rx completion error. + Recover (if needed) by flushing the related queue and reset it. + +RX reporter also supports on demand diagnose callback, on which it +provides real time information of its receive queues status. + +- Diagnose rx queues status, and corresponding completion queue:: + + $ devlink health diagnose pci/0000:82:00.0 reporter rx + +NOTE: This command has valid output only when interface is up, otherwise the command has empty output. + +- Show number of rx errors indicated, number of recover flows ended successfully, + is autorecover enabled and graceful period from last recover:: + + $ devlink health show pci/0000:82:00.0 reporter rx + fw reporter ----------- The fw reporter implements diagnose and dump callbacks. @@ -190,3 +220,48 @@ User commands examples: $ devlink health dump show pci/0000:82:00.1 reporter fw_fatal NOTE: This command can run only on PF. + +mlx5 tracepoints +================ + +mlx5 driver provides internal trace points for tracking and debugging using +kernel tracepoints interfaces (refer to Documentation/trace/ftrase.rst). + +For the list of support mlx5 events check /sys/kernel/debug/tracing/events/mlx5/ + +tc and eswitch offloads tracepoints: + +- mlx5e_configure_flower: trace flower filter actions and cookies offloaded to mlx5:: + + $ echo mlx5:mlx5e_configure_flower >> /sys/kernel/debug/tracing/set_event + $ cat /sys/kernel/debug/tracing/trace + ... + tc-6535 [019] ...1 2672.404466: mlx5e_configure_flower: cookie=0000000067874a55 actions= REDIRECT + +- mlx5e_delete_flower: trace flower filter actions and cookies deleted from mlx5:: + + $ echo mlx5:mlx5e_delete_flower >> /sys/kernel/debug/tracing/set_event + $ cat /sys/kernel/debug/tracing/trace + ... + tc-6569 [010] .N.1 2686.379075: mlx5e_delete_flower: cookie=0000000067874a55 actions= NULL + +- mlx5e_stats_flower: trace flower stats request:: + + $ echo mlx5:mlx5e_stats_flower >> /sys/kernel/debug/tracing/set_event + $ cat /sys/kernel/debug/tracing/trace + ... + tc-6546 [010] ...1 2679.704889: mlx5e_stats_flower: cookie=0000000060eb3d6a bytes=0 packets=0 lastused=4295560217 + +- mlx5e_tc_update_neigh_used_value: trace tunnel rule neigh update value offloaded to mlx5:: + + $ echo mlx5:mlx5e_tc_update_neigh_used_value >> /sys/kernel/debug/tracing/set_event + $ cat /sys/kernel/debug/tracing/trace + ... + kworker/u48:4-8806 [009] ...1 55117.882428: mlx5e_tc_update_neigh_used_value: netdev: ens1f0 IPv4: 1.1.1.10 IPv6: ::ffff:1.1.1.10 neigh_used=1 + +- mlx5e_rep_neigh_update: trace neigh update tasks scheduled due to neigh state change events:: + + $ echo mlx5:mlx5e_rep_neigh_update >> /sys/kernel/debug/tracing/set_event + $ cat /sys/kernel/debug/tracing/trace + ... + kworker/u48:7-2221 [009] ...1 1475.387435: mlx5e_rep_neigh_update: netdev: ens1f0 MAC: 24:8a:07:9a:17:9a IPv4: 1.1.1.10 IPv6: ::ffff:1.1.1.10 neigh_connected=1 diff --git a/Documentation/networking/devlink-trap-netdevsim.rst b/Documentation/networking/devlink-trap-netdevsim.rst new file mode 100644 index 000000000000..b721c9415473 --- /dev/null +++ b/Documentation/networking/devlink-trap-netdevsim.rst @@ -0,0 +1,20 @@ +.. SPDX-License-Identifier: GPL-2.0 + +====================== +Devlink Trap netdevsim +====================== + +Driver-specific Traps +===================== + +.. list-table:: List of Driver-specific Traps Registered by ``netdevsim`` + :widths: 5 5 90 + + * - Name + - Type + - Description + * - ``fid_miss`` + - ``exception`` + - When a packet enters the device it is classified to a filtering + indentifier (FID) based on the ingress port and VLAN. This trap is used + to trap packets for which a FID could not be found diff --git a/Documentation/networking/devlink-trap.rst b/Documentation/networking/devlink-trap.rst new file mode 100644 index 000000000000..c20c7c483664 --- /dev/null +++ b/Documentation/networking/devlink-trap.rst @@ -0,0 +1,208 @@ +.. SPDX-License-Identifier: GPL-2.0 + +============ +Devlink Trap +============ + +Background +========== + +Devices capable of offloading the kernel's datapath and perform functions such +as bridging and routing must also be able to send specific packets to the +kernel (i.e., the CPU) for processing. + +For example, a device acting as a multicast-aware bridge must be able to send +IGMP membership reports to the kernel for processing by the bridge module. +Without processing such packets, the bridge module could never populate its +MDB. + +As another example, consider a device acting as router which has received an IP +packet with a TTL of 1. Upon routing the packet the device must send it to the +kernel so that it will route it as well and generate an ICMP Time Exceeded +error datagram. Without letting the kernel route such packets itself, utilities +such as ``traceroute`` could never work. + +The fundamental ability of sending certain packets to the kernel for processing +is called "packet trapping". + +Overview +======== + +The ``devlink-trap`` mechanism allows capable device drivers to register their +supported packet traps with ``devlink`` and report trapped packets to +``devlink`` for further analysis. + +Upon receiving trapped packets, ``devlink`` will perform a per-trap packets and +bytes accounting and potentially report the packet to user space via a netlink +event along with all the provided metadata (e.g., trap reason, timestamp, input +port). This is especially useful for drop traps (see :ref:`Trap-Types`) +as it allows users to obtain further visibility into packet drops that would +otherwise be invisible. + +The following diagram provides a general overview of ``devlink-trap``:: + + Netlink event: Packet w/ metadata + Or a summary of recent drops + ^ + | + Userspace | + +---------------------------------------------------+ + Kernel | + | + +-------+--------+ + | | + | drop_monitor | + | | + +-------^--------+ + | + | + | + +----+----+ + | | Kernel's Rx path + | devlink | (non-drop traps) + | | + +----^----+ ^ + | | + +-----------+ + | + +-------+-------+ + | | + | Device driver | + | | + +-------^-------+ + Kernel | + +---------------------------------------------------+ + Hardware | + | Trapped packet + | + +--+---+ + | | + | ASIC | + | | + +------+ + +.. _Trap-Types: + +Trap Types +========== + +The ``devlink-trap`` mechanism supports the following packet trap types: + + * ``drop``: Trapped packets were dropped by the underlying device. Packets + are only processed by ``devlink`` and not injected to the kernel's Rx path. + The trap action (see :ref:`Trap-Actions`) can be changed. + * ``exception``: Trapped packets were not forwarded as intended by the + underlying device due to an exception (e.g., TTL error, missing neighbour + entry) and trapped to the control plane for resolution. Packets are + processed by ``devlink`` and injected to the kernel's Rx path. Changing the + action of such traps is not allowed, as it can easily break the control + plane. + +.. _Trap-Actions: + +Trap Actions +============ + +The ``devlink-trap`` mechanism supports the following packet trap actions: + + * ``trap``: The sole copy of the packet is sent to the CPU. + * ``drop``: The packet is dropped by the underlying device and a copy is not + sent to the CPU. + +Generic Packet Traps +==================== + +Generic packet traps are used to describe traps that trap well-defined packets +or packets that are trapped due to well-defined conditions (e.g., TTL error). +Such traps can be shared by multiple device drivers and their description must +be added to the following table: + +.. list-table:: List of Generic Packet Traps + :widths: 5 5 90 + + * - Name + - Type + - Description + * - ``source_mac_is_multicast`` + - ``drop`` + - Traps incoming packets that the device decided to drop because of a + multicast source MAC + * - ``vlan_tag_mismatch`` + - ``drop`` + - Traps incoming packets that the device decided to drop in case of VLAN + tag mismatch: The ingress bridge port is not configured with a PVID and + the packet is untagged or prio-tagged + * - ``ingress_vlan_filter`` + - ``drop`` + - Traps incoming packets that the device decided to drop in case they are + tagged with a VLAN that is not configured on the ingress bridge port + * - ``ingress_spanning_tree_filter`` + - ``drop`` + - Traps incoming packets that the device decided to drop in case the STP + state of the ingress bridge port is not "forwarding" + * - ``port_list_is_empty`` + - ``drop`` + - Traps packets that the device decided to drop in case they need to be + flooded and the flood list is empty + * - ``port_loopback_filter`` + - ``drop`` + - Traps packets that the device decided to drop in case after layer 2 + forwarding the only port from which they should be transmitted through + is the port from which they were received + * - ``blackhole_route`` + - ``drop`` + - Traps packets that the device decided to drop in case they hit a + blackhole route + * - ``ttl_value_is_too_small`` + - ``exception`` + - Traps unicast packets that should be forwarded by the device whose TTL + was decremented to 0 or less + * - ``tail_drop`` + - ``drop`` + - Traps packets that the device decided to drop because they could not be + enqueued to a transmission queue which is full + +Driver-specific Packet Traps +============================ + +Device drivers can register driver-specific packet traps, but these must be +clearly documented. Such traps can correspond to device-specific exceptions and +help debug packet drops caused by these exceptions. The following list includes +links to the description of driver-specific traps registered by various device +drivers: + + * :doc:`/devlink-trap-netdevsim` + +Generic Packet Trap Groups +========================== + +Generic packet trap groups are used to aggregate logically related packet +traps. These groups allow the user to batch operations such as setting the trap +action of all member traps. In addition, ``devlink-trap`` can report aggregated +per-group packets and bytes statistics, in case per-trap statistics are too +narrow. The description of these groups must be added to the following table: + +.. list-table:: List of Generic Packet Trap Groups + :widths: 10 90 + + * - Name + - Description + * - ``l2_drops`` + - Contains packet traps for packets that were dropped by the device during + layer 2 forwarding (i.e., bridge) + * - ``l3_drops`` + - Contains packet traps for packets that were dropped by the device or hit + an exception (e.g., TTL error) during layer 3 forwarding + * - ``buffer_drops`` + - Contains packet traps for packets that were dropped by the device due to + an enqueue decision + +Testing +======= + +See ``tools/testing/selftests/drivers/net/netdevsim/devlink_trap.sh`` for a +test covering the core infrastructure. Test cases should be added for any new +functionality. + +Device drivers should focus their tests on device-specific functionality, such +as the triggering of supported packet traps. diff --git a/Documentation/networking/index.rst b/Documentation/networking/index.rst index a46fca264bee..37eabc17894c 100644 --- a/Documentation/networking/index.rst +++ b/Documentation/networking/index.rst @@ -14,6 +14,8 @@ Contents: device_drivers/index dsa/index devlink-info-versions + devlink-trap + devlink-trap-netdevsim ieee802154 kapi z8530book diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt index df33674799b5..49e95f438ed7 100644 --- a/Documentation/networking/ip-sysctl.txt +++ b/Documentation/networking/ip-sysctl.txt @@ -256,6 +256,12 @@ tcp_base_mss - INTEGER Path MTU discovery (MTU probing). If MTU probing is enabled, this is the initial MSS used by the connection. +tcp_mtu_probe_floor - INTEGER + If MTU probing is enabled this caps the minimum MSS used for search_low + for the connection. + + Default : 48 + tcp_min_snd_mss - INTEGER TCP SYN and SYNACK messages usually advertise an ADVMSS option, as described in RFC 1122 and RFC 6691. |