author | Linus Torvalds <torvalds@linux-foundation.org> | 2021-04-29 11:57:23 -0700 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2021-04-29 11:57:23 -0700 |
commit | 9d31d2338950293ec19d9b095fbaa9030899dcb4 (patch) | |
tree | e688040d0557c24a2eeb9f6c9c223d949f6f7ef9 /arch | |
parent | 635de956a7f5a6ffcb04f29d70630c64c717b56b (diff) | |
parent | 4a52dd8fefb45626dace70a63c0738dbd83b7edb (diff) |
Merge tag 'net-next-5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Jakub Kicinski:
"Core:
- bpf:
- allow bpf programs calling kernel functions (initially to
reuse TCP congestion control implementations)
- enable task local storage for tracing programs - remove the
need to store per-task state in hash maps, and allow tracing
programs access to task local storage previously added for
BPF_LSM
- add bpf_for_each_map_elem() helper, allowing programs to walk
all map elements in a more robust and easier to verify fashion
- sockmap: support UDP and cross-protocol BPF_SK_SKB_VERDICT
redirection
- lpm: add support for batched ops in LPM trie
- add BTF_KIND_FLOAT support - mostly to allow use of BTF on
s390 which has floats in its header files
- improve BPF syscall documentation and extend the use of kdoc
parsing scripts we already employ for bpf-helpers
- libbpf, bpftool: support static linking of BPF ELF files
- improve support for encapsulation of L2 packets
- xdp: restructure redirect actions to avoid a runtime lookup,
improving performance by 4-8% in microbenchmarks
- xsk: build skb by page (aka generic zerocopy xmit) - improve
performance of software AF_XDP path by 33% for devices which don't
need headers in the linear skb part (e.g. virtio)
- nexthop: resilient next-hop groups - improve path stability on
next-hop group changes (incl. offload for mlxsw)
- ipv6: segment routing: add support for IPv4 decapsulation
- icmp: add support for RFC 8335 extended PROBE messages
- inet: use bigger hash table for IP ID generation
- tcp: deal better with delayed TX completions - make sure we don't
give up on fast TCP retransmissions only because driver is slow in
reporting that it completed transmitting the original
- tcp: reorder tcp_congestion_ops for better cache locality
- mptcp:
- add sockopt support for common TCP options
- add support for common TCP msg flags
- include multiple address ids in RM_ADDR
- add reset option support for resetting one subflow
- udp: GRO L4 improvements - improve 'forward' / 'frag_list'
co-existence with UDP tunnel GRO, allowing the first to take place
correctly even for encapsulated UDP traffic
- micro-optimize dev_gro_receive() and flow dissection, avoid
retpoline overhead on VLAN and TEB GRO
- use less memory for sysctls, add a new sysctl type, to allow using
u8 instead of "int" and "long" and shrink networking sysctls
- veth: allow GRO without XDP - this allows aggregating UDP packets
before handing them off to routing, bridge, OvS, etc.
- allow specifying ifindex when device is moved to another namespace
- netfilter:
- nft_socket: add support for cgroupsv2
- nftables: add catch-all set element - special element used to
define a default action in case normal lookup missed
- use net_generic infra in many modules to avoid allocating
per-ns memory unnecessarily
- xps: improve the xps handling to avoid potential out-of-bound
accesses and use-after-free when XPS changes race with other
re-configuration under traffic
- add a config knob to turn off per-cpu netdev refcnt to catch
underflows in testing
Device APIs:
- add WWAN subsystem to organize the WWAN interfaces better and
hopefully start driving towards more unified and vendor-
independent APIs
- ethtool:
- add interface for reading IEEE MIB stats (incl. mlx5 and bnxt
support)
- allow network drivers to dump arbitrary SFP EEPROM data,
current offset+length API was a poor fit for modern SFP which
define EEPROM in terms of pages (incl. mlx5 support)
- act_police, flow_offload: add support for packet-per-second
policing (incl. offload for nfp)
- psample: add additional metadata attributes like transit delay for
packets sampled from switch HW (and corresponding egress and
policy-based sampling in the mlxsw driver)
- dsa: improve support for sandwiched LAGs with bridge and DSA
- netfilter:
- flowtable: use direct xmit in topologies with IP forwarding,
bridging, vlans etc.
- nftables: counter hardware offload support
- Bluetooth:
- improvements for firmware download w/ Intel devices
- add support for reading AOSP vendor capabilities
- add support for virtio transport driver
- mac80211:
- allow concurrent monitor iface and ethernet rx decap
- set priority and queue mapping for injected frames
- phy: add support for Clause-45 PHY Loopback
- pci/iov: add sysfs MSI-X vector assignment interface to distribute
MSI-X resources to VFs (incl. mlx5 support)
New hardware/drivers:
- dsa: mv88e6xxx: add support for Marvell mv88e6393x - 11-port
Ethernet switch with 8x 1-Gigabit Ethernet and 3x 10-Gigabit
interfaces.
- dsa: support for legacy Broadcom tags used on BCM5325, BCM5365 and
BCM63xx switches
- Microchip KSZ8863 and KSZ8873; 3x 10/100Mbps Ethernet switches
- ath11k: support for QCN9074, an 802.11ax device
- Bluetooth: Broadcom BCM4330 and BCM4334
- phy: Marvell 88X2222 transceiver support
- mdio: add BCM6368 MDIO mux bus controller
- r8152: support RTL8153 and RTL8156 (USB Ethernet) chips
- mana: driver for Microsoft Azure Network Adapter (MANA)
- Actions Semi Owl Ethernet MAC
- can: driver for ETAS ES58X CAN/USB interfaces
Pure driver changes:
- add XDP support to: enetc, igc, stmmac
- add AF_XDP support to: stmmac
- virtio:
- page_to_skb() use build_skb when there's sufficient tailroom
(21% improvement for 1000B UDP frames)
- support XDP even without dedicated Tx queues - share the Tx
queues with the stack when necessary
- mlx5:
- flow rules: add support for mirroring with conntrack, matching
on ICMP, GTP, flex filters and more
- support packet sampling with flow offloads
- persist uplink representor netdev across eswitch mode changes
- allow coexistence of CQE compression and HW time-stamping
- add ethtool extended link error state reporting
- ice, iavf: support flow filters, UDP Segmentation Offload
- dpaa2-switch:
- move the driver out of staging
- add spanning tree (STP) support
- add rx copybreak support
- add tc flower hardware offload on ingress traffic
- ionic:
- implement Rx page reuse
- support HW PTP time-stamping
- octeon: support TC hardware offloads - flower matching on ingress
and egress ratelimiting.
- stmmac:
- add RX frame steering based on VLAN priority in tc flower
- support frame preemption (FPE)
- intel: add cross time-stamping freq difference adjustment
- ocelot:
- support forwarding of MRP frames in HW
- support multiple bridges
- support PTP Sync one-step timestamping
- dsa: mv88e6xxx, dpaa2-switch: offload bridge port flags like
learning, flooding etc.
- ipa: add IPA v4.5, v4.9 and v4.11 support (Qualcomm SDX55, SM8350,
SC7280 SoCs)
- mt7601u: enable TDLS support
- mt76:
- add support for 802.3 rx frames (mt7915/mt7615)
- mt7915 flash pre-calibration support
- mt7921/mt7663 runtime power management fixes"
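The sysctl item in the message above (storing knobs as u8 instead of "int" or "long") can be pictured with a small layout comparison. This is only a sketch: the struct and field names below are hypothetical and are not the kernel's actual per-namespace sysctl structures.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical on/off style knobs, each stored as a full int. */
struct knobs_int {
	int fwmark_reflect;
	int early_demux;
	int nonlocal_bind;
	int dynaddr;
};

/* The same hypothetical knobs, each stored in a single byte. */
struct knobs_u8 {
	uint8_t fwmark_reflect;
	uint8_t early_demux;
	uint8_t nonlocal_bind;
	uint8_t dynaddr;
};

/* Bytes saved per instance by switching the four knobs to u8. */
static size_t knobs_saved_bytes(void)
{
	return sizeof(struct knobs_int) - sizeof(struct knobs_u8);
}
```

The saving applies per instance of the structure, so it grows with the number of network namespaces that carry their own copy of such values.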
* tag 'net-next-5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2451 commits)
net: selftest: fix build issue if INET is disabled
net: netrom: nr_in: Remove redundant assignment to ns
net: tun: Remove redundant assignment to ret
net: phy: marvell: add downshift support for M88E1240
net: dsa: ksz: Make reg_mib_cnt a u8 as it never exceeds 255
net/sched: act_ct: Remove redundant ct get and check
icmp: standardize naming of RFC 8335 PROBE constants
bpf, selftests: Update array map tests for per-cpu batched ops
bpf: Add batched ops support for percpu array
bpf: Implement formatted output helpers with bstr_printf
seq_file: Add a seq_bprintf function
sfc: adjust efx->xdp_tx_queue_count with the real number of initialized queues
net:nfc:digital: Fix a double free in digital_tg_recv_dep_req
net: fix a concurrency bug in l2tp_tunnel_register()
net/smc: Remove redundant assignment to rc
mpls: Remove redundant assignment to err
llc2: Remove redundant assignment to rc
net/tls: Remove redundant initialization of record
rds: Remove redundant assignment to nr_sig
dt-bindings: net: mdio-gpio: add compatible for microchip,mdio-smi0
...
Diffstat (limited to 'arch')
-rw-r--r-- | arch/arm/boot/dts/uniphier-pxs2.dtsi | 2 | ||||
-rw-r--r-- | arch/arm/mach-mvebu/kirkwood.c | 3 | ||||
-rw-r--r-- | arch/arm64/boot/dts/freescale/fsl-ls1028a.dtsi | 6 | ||||
-rw-r--r-- | arch/arm64/boot/dts/rockchip/rk3328.dtsi | 4 | ||||
-rw-r--r-- | arch/arm64/boot/dts/socionext/uniphier-ld20.dtsi | 2 | ||||
-rw-r--r-- | arch/arm64/boot/dts/socionext/uniphier-pxs3.dtsi | 4 | ||||
-rw-r--r-- | arch/mips/rb532/devices.c | 25 | ||||
-rw-r--r-- | arch/powerpc/boot/dts/fsl/bsc9131si-post.dtsi | 4 | ||||
-rw-r--r-- | arch/powerpc/boot/dts/fsl/bsc9132si-post.dtsi | 4 | ||||
-rw-r--r-- | arch/powerpc/boot/dts/fsl/c293si-post.dtsi | 4 | ||||
-rw-r--r-- | arch/powerpc/boot/dts/fsl/p1010si-post.dtsi | 21 | ||||
-rw-r--r-- | arch/powerpc/sysdev/tsi108_dev.c | 5 | ||||
-rw-r--r-- | arch/s390/net/bpf_jit_comp.c | 64 | ||||
-rw-r--r-- | arch/x86/net/bpf_jit_comp.c | 5 | ||||
-rw-r--r-- | arch/x86/net/bpf_jit_comp32.c | 198 |
15 files changed, 281 insertions, 70 deletions
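Of the files listed above, arch/s390/net/bpf_jit_comp.c gains support for further BPF atomic operations. Its BPF_XCHG case is emitted as a load followed by a compare-and-swap that branches back until the swap succeeds. A rough userspace analogue using C11 atomics is sketched below; xchg_via_cas() is a hypothetical name, and this illustrates the retry pattern only, not the JIT code itself.

```c
#include <stdatomic.h>
#include <stdint.h>

/*
 * Exchange *addr with new_val using only load + compare-and-swap.
 * Returns the value that was stored at *addr before the exchange.
 */
static uint64_t xchg_via_cas(_Atomic uint64_t *addr, uint64_t new_val)
{
	/* Initial load, comparable to the {ly|lg} step. */
	uint64_t old = atomic_load(addr);

	/*
	 * Retry loop, comparable to {csy|csg} followed by "brc 4,0b".
	 * On failure the call refreshes 'old' with the current contents.
	 */
	while (!atomic_compare_exchange_weak(addr, &old, new_val))
		;

	return old;
}
```

The value returned is the old memory contents, which matches the JIT copying %w0 back into the source register after the loop.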
diff --git a/arch/arm/boot/dts/uniphier-pxs2.dtsi b/arch/arm/boot/dts/uniphier-pxs2.dtsi
index b0b15c97306b..e81e5937a60a 100644
--- a/arch/arm/boot/dts/uniphier-pxs2.dtsi
+++ b/arch/arm/boot/dts/uniphier-pxs2.dtsi
@@ -583,7 +583,7 @@
 			clocks = <&sys_clk 6>;
 			reset-names = "ether";
 			resets = <&sys_rst 6>;
-			phy-mode = "rgmii";
+			phy-mode = "rgmii-id";
 			local-mac-address = [00 00 00 00 00 00];
 			socionext,syscon-phy-mode = <&soc_glue 0>;
 
diff --git a/arch/arm/mach-mvebu/kirkwood.c b/arch/arm/mach-mvebu/kirkwood.c
index ceaad6d5927e..06b1706595f4 100644
--- a/arch/arm/mach-mvebu/kirkwood.c
+++ b/arch/arm/mach-mvebu/kirkwood.c
@@ -84,6 +84,7 @@ static void __init kirkwood_dt_eth_fixup(void)
 		struct device_node *pnp = of_get_parent(np);
 		struct clk *clk;
 		struct property *pmac;
+		u8 tmpmac[ETH_ALEN];
 		void __iomem *io;
 		u8 *macaddr;
 		u32 reg;
@@ -93,7 +94,7 @@ static void __init kirkwood_dt_eth_fixup(void)
 
 		/* skip disabled nodes or nodes with valid MAC address*/
 		if (!of_device_is_available(pnp) ||
-		    !IS_ERR(of_get_mac_address(np)))
+		    !of_get_mac_address(np, tmpmac))
 			goto eth_fixup_skip;
 
 		clk = of_clk_get(pnp, 0);
diff --git a/arch/arm64/boot/dts/freescale/fsl-ls1028a.dtsi b/arch/arm64/boot/dts/freescale/fsl-ls1028a.dtsi
index 9506f0669ead..eca06a0c3cf8 100644
--- a/arch/arm64/boot/dts/freescale/fsl-ls1028a.dtsi
+++ b/arch/arm64/boot/dts/freescale/fsl-ls1028a.dtsi
@@ -1118,6 +1118,12 @@
 			};
 		};
 
+		/* Integrated Endpoint Register Block */
+		ierb@1f0800000 {
+			compatible = "fsl,ls1028a-enetc-ierb";
+			reg = <0x01 0xf0800000 0x0 0x10000>;
+		};
+
 		rcpm: power-controller@1e34040 {
 			compatible = "fsl,ls1028a-rcpm", "fsl,qoriq-rcpm-2.1+";
 			reg = <0x0 0x1e34040 0x0 0x1c>;
diff --git a/arch/arm64/boot/dts/rockchip/rk3328.dtsi b/arch/arm64/boot/dts/rockchip/rk3328.dtsi
index 5bab61784735..3ed69ecbcf3c 100644
--- a/arch/arm64/boot/dts/rockchip/rk3328.dtsi
+++ b/arch/arm64/boot/dts/rockchip/rk3328.dtsi
@@ -916,8 +916,8 @@
 				      "mac_clk_tx", "clk_mac_ref",
 				      "aclk_mac", "pclk_mac",
 				      "clk_macphy";
-			resets = <&cru SRST_GMAC2PHY_A>, <&cru SRST_MACPHY>;
-			reset-names = "stmmaceth", "mac-phy";
+			resets = <&cru SRST_GMAC2PHY_A>;
+			reset-names = "stmmaceth";
 			phy-mode = "rmii";
 			phy-handle = <&phy>;
 			snps,txpbl = <0x4>;
diff --git a/arch/arm64/boot/dts/socionext/uniphier-ld20.dtsi b/arch/arm64/boot/dts/socionext/uniphier-ld20.dtsi
index a87b8a678719..8f2c1c1e2c64 100644
--- a/arch/arm64/boot/dts/socionext/uniphier-ld20.dtsi
+++ b/arch/arm64/boot/dts/socionext/uniphier-ld20.dtsi
@@ -734,7 +734,7 @@
 			clocks = <&sys_clk 6>;
 			reset-names = "ether";
 			resets = <&sys_rst 6>;
-			phy-mode = "rgmii";
+			phy-mode = "rgmii-id";
 			local-mac-address = [00 00 00 00 00 00];
 			socionext,syscon-phy-mode = <&soc_glue 0>;
 
diff --git a/arch/arm64/boot/dts/socionext/uniphier-pxs3.dtsi b/arch/arm64/boot/dts/socionext/uniphier-pxs3.dtsi
index 0e52dadf54b3..be97da132258 100644
--- a/arch/arm64/boot/dts/socionext/uniphier-pxs3.dtsi
+++ b/arch/arm64/boot/dts/socionext/uniphier-pxs3.dtsi
@@ -564,7 +564,7 @@
 			clocks = <&sys_clk 6>;
 			reset-names = "ether";
 			resets = <&sys_rst 6>;
-			phy-mode = "rgmii";
+			phy-mode = "rgmii-id";
 			local-mac-address = [00 00 00 00 00 00];
 			socionext,syscon-phy-mode = <&soc_glue 0>;
 
@@ -585,7 +585,7 @@
 			clocks = <&sys_clk 7>;
 			reset-names = "ether";
 			resets = <&sys_rst 7>;
-			phy-mode = "rgmii";
+			phy-mode = "rgmii-id";
 			local-mac-address = [00 00 00 00 00 00];
 			socionext,syscon-phy-mode = <&soc_glue 1>;
 
diff --git a/arch/mips/rb532/devices.c b/arch/mips/rb532/devices.c
index dd34f1b32b79..04684990e28e 100644
--- a/arch/mips/rb532/devices.c
+++ b/arch/mips/rb532/devices.c
@@ -58,37 +58,27 @@ EXPORT_SYMBOL(get_latch_u5);
 
 static struct resource korina_dev0_res[] = {
 	{
-		.name = "korina_regs",
+		.name = "emac",
 		.start = ETH0_BASE_ADDR,
 		.end = ETH0_BASE_ADDR + sizeof(struct eth_regs),
 		.flags = IORESOURCE_MEM,
 	}, {
-		.name = "korina_rx",
+		.name = "rx",
 		.start = ETH0_DMA_RX_IRQ,
 		.end = ETH0_DMA_RX_IRQ,
 		.flags = IORESOURCE_IRQ
 	}, {
-		.name = "korina_tx",
+		.name = "tx",
 		.start = ETH0_DMA_TX_IRQ,
 		.end = ETH0_DMA_TX_IRQ,
 		.flags = IORESOURCE_IRQ
 	}, {
-		.name = "korina_ovr",
-		.start = ETH0_RX_OVR_IRQ,
-		.end = ETH0_RX_OVR_IRQ,
-		.flags = IORESOURCE_IRQ
-	}, {
-		.name = "korina_und",
-		.start = ETH0_TX_UND_IRQ,
-		.end = ETH0_TX_UND_IRQ,
-		.flags = IORESOURCE_IRQ
-	}, {
-		.name = "korina_dma_rx",
+		.name = "dma_rx",
 		.start = ETH0_RX_DMA_ADDR,
 		.end = ETH0_RX_DMA_ADDR + DMA_CHAN_OFFSET - 1,
 		.flags = IORESOURCE_MEM,
 	}, {
-		.name = "korina_dma_tx",
+		.name = "dma_tx",
 		.start = ETH0_TX_DMA_ADDR,
 		.end = ETH0_TX_DMA_ADDR + DMA_CHAN_OFFSET - 1,
 		.flags = IORESOURCE_MEM,
@@ -105,6 +95,9 @@ static struct platform_device korina_dev0 = {
 	.name = "korina",
 	.resource = korina_dev0_res,
 	.num_resources = ARRAY_SIZE(korina_dev0_res),
+	.dev = {
+		.platform_data = &korina_dev0_data.mac,
+	}
 };
 
 static struct resource cf_slot0_res[] = {
@@ -299,8 +292,6 @@ static int __init plat_setup_devices(void)
 	/* set the uart clock to the current cpu frequency */
 	rb532_uart_res[0].uartclk = idt_cpu_freq;
 
-	dev_set_drvdata(&korina_dev0.dev, &korina_dev0_data);
-
 	gpiod_add_lookup_table(&cf_slot0_gpio_table);
 	return platform_add_devices(rb532_devs, ARRAY_SIZE(rb532_devs));
 }
diff --git a/arch/powerpc/boot/dts/fsl/bsc9131si-post.dtsi b/arch/powerpc/boot/dts/fsl/bsc9131si-post.dtsi
index 0c0efa94cfb4..2a677fd323eb 100644
--- a/arch/powerpc/boot/dts/fsl/bsc9131si-post.dtsi
+++ b/arch/powerpc/boot/dts/fsl/bsc9131si-post.dtsi
@@ -170,8 +170,6 @@ timer@41100 {
 /include/ "pq3-etsec2-0.dtsi"
 	enet0: ethernet@b0000 {
 		queue-group@b0000 {
-			fsl,rx-bit-map = <0xff>;
-			fsl,tx-bit-map = <0xff>;
 			interrupts = <26 2 0 0 27 2 0 0 28 2 0 0>;
 		};
 	};
@@ -179,8 +177,6 @@ enet0: ethernet@b0000 {
 /include/ "pq3-etsec2-1.dtsi"
 	enet1: ethernet@b1000 {
 		queue-group@b1000 {
-			fsl,rx-bit-map = <0xff>;
-			fsl,tx-bit-map = <0xff>;
 			interrupts = <33 2 0 0 34 2 0 0 35 2 0 0>;
 		};
 	};
diff --git a/arch/powerpc/boot/dts/fsl/bsc9132si-post.dtsi b/arch/powerpc/boot/dts/fsl/bsc9132si-post.dtsi
index b5f071574e83..b8e0edd1ac69 100644
--- a/arch/powerpc/boot/dts/fsl/bsc9132si-post.dtsi
+++ b/arch/powerpc/boot/dts/fsl/bsc9132si-post.dtsi
@@ -190,8 +190,6 @@ crypto@30000 {
 /include/ "pq3-etsec2-0.dtsi"
 	enet0: ethernet@b0000 {
 		queue-group@b0000 {
-			fsl,rx-bit-map = <0xff>;
-			fsl,tx-bit-map = <0xff>;
 			interrupts = <26 2 0 0 27 2 0 0 28 2 0 0>;
 		};
 	};
@@ -199,8 +197,6 @@ enet0: ethernet@b0000 {
 /include/ "pq3-etsec2-1.dtsi"
 	enet1: ethernet@b1000 {
 		queue-group@b1000 {
-			fsl,rx-bit-map = <0xff>;
-			fsl,tx-bit-map = <0xff>;
 			interrupts = <33 2 0 0 34 2 0 0 35 2 0 0>;
 		};
 	};
diff --git a/arch/powerpc/boot/dts/fsl/c293si-post.dtsi b/arch/powerpc/boot/dts/fsl/c293si-post.dtsi
index bd208320bff5..bec0fc36849d 100644
--- a/arch/powerpc/boot/dts/fsl/c293si-post.dtsi
+++ b/arch/powerpc/boot/dts/fsl/c293si-post.dtsi
@@ -171,8 +171,6 @@
 	enet0: ethernet@b0000 {
 		queue-group@b0000 {
 			reg = <0x10000 0x1000>;
-			fsl,rx-bit-map = <0xff>;
-			fsl,tx-bit-map = <0xff>;
 		};
 	};
 
@@ -180,8 +178,6 @@
 	enet1: ethernet@b1000 {
 		queue-group@b1000 {
 			reg = <0x11000 0x1000>;
-			fsl,rx-bit-map = <0xff>;
-			fsl,tx-bit-map = <0xff>;
 		};
 	};
 
diff --git a/arch/powerpc/boot/dts/fsl/p1010si-post.dtsi b/arch/powerpc/boot/dts/fsl/p1010si-post.dtsi
index 1b4aafc1f6a2..c2717f31925a 100644
--- a/arch/powerpc/boot/dts/fsl/p1010si-post.dtsi
+++ b/arch/powerpc/boot/dts/fsl/p1010si-post.dtsi
@@ -172,29 +172,8 @@
 /include/ "pq3-mpic-timer-B.dtsi"
 
 /include/ "pq3-etsec2-0.dtsi"
-	enet0: ethernet@b0000 {
-		queue-group@b0000 {
-			fsl,rx-bit-map = <0xff>;
-			fsl,tx-bit-map = <0xff>;
-		};
-	};
-
 /include/ "pq3-etsec2-1.dtsi"
-	enet1: ethernet@b1000 {
-		queue-group@b1000 {
-			fsl,rx-bit-map = <0xff>;
-			fsl,tx-bit-map = <0xff>;
-		};
-	};
-
 /include/ "pq3-etsec2-2.dtsi"
-	enet2: ethernet@b2000 {
-		queue-group@b2000 {
-			fsl,rx-bit-map = <0xff>;
-			fsl,tx-bit-map = <0xff>;
-		};
-
-	};
 
 	global-utilities@e0000 {
 		compatible = "fsl,p1010-guts";
diff --git a/arch/powerpc/sysdev/tsi108_dev.c b/arch/powerpc/sysdev/tsi108_dev.c
index 0baec82510b9..4c4a6efd5e5f 100644
--- a/arch/powerpc/sysdev/tsi108_dev.c
+++ b/arch/powerpc/sysdev/tsi108_dev.c
@@ -73,7 +73,6 @@ static int __init tsi108_eth_of_init(void)
 		struct device_node *phy, *mdio;
 		hw_info tsi_eth_data;
 		const unsigned int *phy_id;
-		const void *mac_addr;
 		const phandle *ph;
 
 		memset(r, 0, sizeof(r));
@@ -101,9 +100,7 @@ static int __init tsi108_eth_of_init(void)
 			goto err;
 		}
 
-		mac_addr = of_get_mac_address(np);
-		if (!IS_ERR(mac_addr))
-			ether_addr_copy(tsi_eth_data.mac_addr, mac_addr);
+		of_get_mac_address(np, tsi_eth_data.mac_addr);
 
 		ph = of_get_property(np, "mdio-handle", NULL);
 		mdio = of_find_node_by_phandle(*ph);
diff --git a/arch/s390/net/bpf_jit_comp.c b/arch/s390/net/bpf_jit_comp.c
index f973e2ead197..63cae0476bb4 100644
--- a/arch/s390/net/bpf_jit_comp.c
+++ b/arch/s390/net/bpf_jit_comp.c
@@ -1209,21 +1209,67 @@ static noinline int bpf_jit_insn(struct bpf_jit *jit, struct bpf_prog *fp,
 	 */
 	case BPF_STX | BPF_ATOMIC | BPF_DW:
 	case BPF_STX | BPF_ATOMIC | BPF_W:
-		if (insn->imm != BPF_ADD) {
+	{
+		bool is32 = BPF_SIZE(insn->code) == BPF_W;
+
+		switch (insn->imm) {
+/* {op32|op64} {%w0|%src},%src,off(%dst) */
+#define EMIT_ATOMIC(op32, op64) do {					\
+	EMIT6_DISP_LH(0xeb000000, is32 ? (op32) : (op64),		\
+		      (insn->imm & BPF_FETCH) ? src_reg : REG_W0,	\
+		      src_reg, dst_reg, off);				\
+	if (is32 && (insn->imm & BPF_FETCH))				\
+		EMIT_ZERO(src_reg);					\
+} while (0)
+		case BPF_ADD:
+		case BPF_ADD | BPF_FETCH:
+			/* {laal|laalg} */
+			EMIT_ATOMIC(0x00fa, 0x00ea);
+			break;
+		case BPF_AND:
+		case BPF_AND | BPF_FETCH:
+			/* {lan|lang} */
+			EMIT_ATOMIC(0x00f4, 0x00e4);
+			break;
+		case BPF_OR:
+		case BPF_OR | BPF_FETCH:
+			/* {lao|laog} */
+			EMIT_ATOMIC(0x00f6, 0x00e6);
+			break;
+		case BPF_XOR:
+		case BPF_XOR | BPF_FETCH:
+			/* {lax|laxg} */
+			EMIT_ATOMIC(0x00f7, 0x00e7);
+			break;
+#undef EMIT_ATOMIC
+		case BPF_XCHG:
+			/* {ly|lg} %w0,off(%dst) */
+			EMIT6_DISP_LH(0xe3000000,
+				      is32 ? 0x0058 : 0x0004, REG_W0, REG_0,
+				      dst_reg, off);
+			/* 0: {csy|csg} %w0,%src,off(%dst) */
+			EMIT6_DISP_LH(0xeb000000, is32 ? 0x0014 : 0x0030,
+				      REG_W0, src_reg, dst_reg, off);
+			/* brc 4,0b */
+			EMIT4_PCREL_RIC(0xa7040000, 4, jit->prg - 6);
+			/* {llgfr|lgr} %src,%w0 */
+			EMIT4(is32 ? 0xb9160000 : 0xb9040000, src_reg, REG_W0);
+			if (is32 && insn_is_zext(&insn[1]))
+				insn_count = 2;
+			break;
+		case BPF_CMPXCHG:
+			/* 0: {csy|csg} %b0,%src,off(%dst) */
+			EMIT6_DISP_LH(0xeb000000, is32 ? 0x0014 : 0x0030,
+				      BPF_REG_0, src_reg, dst_reg, off);
+			break;
+		default:
 			pr_err("Unknown atomic operation %02x\n", insn->imm);
 			return -1;
 		}
 
-		/* *(u32/u64 *)(dst + off) += src
-		 *
-		 * BFW_W: laal %w0,%src,off(%dst)
-		 * BPF_DW: laalg %w0,%src,off(%dst)
-		 */
-		EMIT6_DISP_LH(0xeb000000,
-			      BPF_SIZE(insn->code) == BPF_W ? 0x00fa : 0x00ea,
-			      REG_W0, src_reg, dst_reg, off);
 		jit->seen |= SEEN_MEM;
 		break;
+	}
 	/*
 	 * BPF_LDX
 	 */
diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index 7b9e3ff27c1a..2a2e290fa5d8 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -2355,3 +2355,8 @@ out:
 					   tmp : orig_prog);
 	return prog;
 }
+
+bool bpf_jit_supports_kfunc_call(void)
+{
+	return true;
+}
diff --git a/arch/x86/net/bpf_jit_comp32.c b/arch/x86/net/bpf_jit_comp32.c
index 6a99def7d315..3da88ded6ee3 100644
--- a/arch/x86/net/bpf_jit_comp32.c
+++ b/arch/x86/net/bpf_jit_comp32.c
@@ -1390,6 +1390,19 @@ static inline void emit_push_r64(const u8 src[], u8 **pprog)
 	*pprog = prog;
 }
 
+static void emit_push_r32(const u8 src[], u8 **pprog)
+{
+	u8 *prog = *pprog;
+	int cnt = 0;
+
+	/* mov ecx,dword ptr [ebp+off] */
+	EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_ECX), STACK_VAR(src_lo));
+	/* push ecx */
+	EMIT1(0x51);
+
+	*pprog = prog;
+}
+
 static u8 get_cond_jmp_opcode(const u8 op, bool is_cmp_lo)
 {
 	u8 jmp_cond;
@@ -1459,6 +1472,174 @@ static u8 get_cond_jmp_opcode(const u8 op, bool is_cmp_lo)
 	return jmp_cond;
 }
 
+/* i386 kernel compiles with "-mregparm=3".  From gcc document:
+ *
+ * ==== snippet ====
+ * regparm (number)
+ *	On x86-32 targets, the regparm attribute causes the compiler
+ *	to pass arguments number one to (number) if they are of integral
+ *	type in registers EAX, EDX, and ECX instead of on the stack.
+ *	Functions that take a variable number of arguments continue
+ *	to be passed all of their arguments on the stack.
+ * ==== snippet ====
+ *
+ * The first three args of a function will be considered for
+ * putting into the 32bit register EAX, EDX, and ECX.
+ *
+ * Two 32bit registers are used to pass a 64bit arg.
+ *
+ * For example,
+ * void foo(u32 a, u32 b, u32 c, u32 d):
+ *	u32 a: EAX
+ *	u32 b: EDX
+ *	u32 c: ECX
+ *	u32 d: stack
+ *
+ * void foo(u64 a, u32 b, u32 c):
+ *	u64 a: EAX (lo32) EDX (hi32)
+ *	u32 b: ECX
+ *	u32 c: stack
+ *
+ * void foo(u32 a, u64 b, u32 c):
+ *	u32 a: EAX
+ *	u64 b: EDX (lo32) ECX (hi32)
+ *	u32 c: stack
+ *
+ * void foo(u32 a, u32 b, u64 c):
+ *	u32 a: EAX
+ *	u32 b: EDX
+ *	u64 c: stack
+ *
+ * The return value will be stored in the EAX (and EDX for 64bit value).
+ *
+ * For example,
+ * u32 foo(u32 a, u32 b, u32 c):
+ *	return value: EAX
+ *
+ * u64 foo(u32 a, u32 b, u32 c):
+ *	return value: EAX (lo32) EDX (hi32)
+ *
+ * Notes:
+ *	The verifier only accepts function having integer and pointers
+ *	as its args and return value, so it does not have
+ *	struct-by-value.
+ *
+ * emit_kfunc_call() finds out the btf_func_model by calling
+ * bpf_jit_find_kfunc_model().  A btf_func_model
+ * has the details about the number of args, size of each arg,
+ * and the size of the return value.
+ *
+ * It first decides how many args can be passed by EAX, EDX, and ECX.
+ * That will decide what args should be pushed to the stack:
+ * [first_stack_regno, last_stack_regno] are the bpf regnos
+ * that should be pushed to the stack.
+ *
+ * It will first push all args to the stack because the push
+ * will need to use ECX.  Then, it moves
+ * [BPF_REG_1, first_stack_regno) to EAX, EDX, and ECX.
+ *
+ * When emitting a call (0xE8), it needs to figure out
+ * the jmp_offset relative to the jit-insn address immediately
+ * following the call (0xE8) instruction.  At this point, it knows
+ * the end of the jit-insn address after completely translated the
+ * current (BPF_JMP | BPF_CALL) bpf-insn.  It is passed as "end_addr"
+ * to the emit_kfunc_call().  Thus, it can learn the "immediate-follow-call"
+ * address by figuring out how many jit-insn is generated between
+ * the call (0xE8) and the end_addr:
+ *	- 0-1 jit-insn (3 bytes each) to restore the esp pointer if there
+ *	  is arg pushed to the stack.
+ *	- 0-2 jit-insns (3 bytes each) to handle the return value.
+ */
+static int emit_kfunc_call(const struct bpf_prog *bpf_prog, u8 *end_addr,
+			   const struct bpf_insn *insn, u8 **pprog)
+{
+	const u8 arg_regs[] = { IA32_EAX, IA32_EDX, IA32_ECX };
+	int i, cnt = 0, first_stack_regno, last_stack_regno;
+	int free_arg_regs = ARRAY_SIZE(arg_regs);
+	const struct btf_func_model *fm;
+	int bytes_in_stack = 0;
+	const u8 *cur_arg_reg;
+	u8 *prog = *pprog;
+	s64 jmp_offset;
+
+	fm = bpf_jit_find_kfunc_model(bpf_prog, insn);
+	if (!fm)
+		return -EINVAL;
+
+	first_stack_regno = BPF_REG_1;
+	for (i = 0; i < fm->nr_args; i++) {
+		int regs_needed = fm->arg_size[i] > sizeof(u32) ? 2 : 1;
+
+		if (regs_needed > free_arg_regs)
+			break;
+
+		free_arg_regs -= regs_needed;
+		first_stack_regno++;
+	}
+
+	/* Push the args to the stack */
+	last_stack_regno = BPF_REG_0 + fm->nr_args;
+	for (i = last_stack_regno; i >= first_stack_regno; i--) {
+		if (fm->arg_size[i - 1] > sizeof(u32)) {
+			emit_push_r64(bpf2ia32[i], &prog);
+			bytes_in_stack += 8;
+		} else {
+			emit_push_r32(bpf2ia32[i], &prog);
+			bytes_in_stack += 4;
+		}
+	}
+
+	cur_arg_reg = &arg_regs[0];
+	for (i = BPF_REG_1; i < first_stack_regno; i++) {
+		/* mov e[adc]x,dword ptr [ebp+off] */
+		EMIT3(0x8B, add_2reg(0x40, IA32_EBP, *cur_arg_reg++),
+		      STACK_VAR(bpf2ia32[i][0]));
+		if (fm->arg_size[i - 1] > sizeof(u32))
+			/* mov e[adc]x,dword ptr [ebp+off] */
+			EMIT3(0x8B, add_2reg(0x40, IA32_EBP, *cur_arg_reg++),
+			      STACK_VAR(bpf2ia32[i][1]));
+	}
+
+	if (bytes_in_stack)
+		/* add esp,"bytes_in_stack" */
+		end_addr -= 3;
+
+	/* mov dword ptr [ebp+off],edx */
+	if (fm->ret_size > sizeof(u32))
+		end_addr -= 3;
+
+	/* mov dword ptr [ebp+off],eax */
+	if (fm->ret_size)
+		end_addr -= 3;
+
+	jmp_offset = (u8 *)__bpf_call_base + insn->imm - end_addr;
+	if (!is_simm32(jmp_offset)) {
+		pr_err("unsupported BPF kernel function jmp_offset:%lld\n",
+		       jmp_offset);
+		return -EINVAL;
+	}
+
+	EMIT1_off32(0xE8, jmp_offset);
+
+	if (fm->ret_size)
+		/* mov dword ptr [ebp+off],eax */
+		EMIT3(0x89, add_2reg(0x40, IA32_EBP, IA32_EAX),
+		      STACK_VAR(bpf2ia32[BPF_REG_0][0]));
+
+	if (fm->ret_size > sizeof(u32))
+		/* mov dword ptr [ebp+off],edx */
+		EMIT3(0x89, add_2reg(0x40, IA32_EBP, IA32_EDX),
+		      STACK_VAR(bpf2ia32[BPF_REG_0][1]));
+
+	if (bytes_in_stack)
+		/* add esp,"bytes_in_stack" */
+		EMIT3(0x83, add_1reg(0xC0, IA32_ESP), bytes_in_stack);
+
+	*pprog = prog;
+
+	return 0;
+}
+
 static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image,
 		  int oldproglen, struct jit_context *ctx)
 {
@@ -1888,6 +2069,18 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image,
 			if (insn->src_reg == BPF_PSEUDO_CALL)
 				goto notyet;
 
+			if (insn->src_reg == BPF_PSEUDO_KFUNC_CALL) {
+				int err;
+
+				err = emit_kfunc_call(bpf_prog,
+						      image + addrs[i],
+						      insn, &prog);
+
+				if (err)
+					return err;
+				break;
+			}
+
 			func = (u8 *) __bpf_call_base + imm32;
 			jmp_offset = func - (image + addrs[i]);
 
@@ -2402,3 +2595,8 @@ out:
 					   tmp : orig_prog);
 	return prog;
 }
+
+bool bpf_jit_supports_kfunc_call(void)
+{
+	return true;
+}
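The register-assignment rule described in the emit_kfunc_call() comment can be modelled in plain userspace C. The following is a minimal sketch of the first loop of that function only, assuming argument sizes are given in bytes; args_in_regs() and stack_bytes() are hypothetical helper names and are not part of the kernel.

```c
#include <stddef.h>

/*
 * Count how many leading arguments fit into EAX, EDX and ECX under
 * -mregparm=3. An argument wider than 4 bytes needs two registers.
 * Once one argument does not fit, it and all later ones use the stack.
 */
static int args_in_regs(const unsigned char *arg_size, int nr_args)
{
	int free_regs = 3;
	int i;

	for (i = 0; i < nr_args; i++) {
		int regs_needed = arg_size[i] > 4 ? 2 : 1;

		if (regs_needed > free_regs)
			break;
		free_regs -= regs_needed;
	}
	return i;
}

/* Bytes pushed on the stack for the arguments that did not fit. */
static int stack_bytes(const unsigned char *arg_size, int nr_args)
{
	int bytes = 0;
	int i;

	for (i = args_in_regs(arg_size, nr_args); i < nr_args; i++)
		bytes += arg_size[i] > 4 ? 8 : 4;
	return bytes;
}
```

Feeding it the four foo() signatures from the comment reproduces the placements listed there; for example, sizes {4, 4, 8} put two arguments in registers and push 8 bytes, because the u64 no longer fits in the single remaining register.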