Skip to content

Commit

Permalink
Merge tag 'net-next-6.13' of git://git.kernel.org/pub/scm/linux/kerne…
Browse files Browse the repository at this point in the history
…l/git/netdev/net-next

Pull networking updates from Paolo Abeni:
 "The most significant set of changes is the per netns RTNL. The new
  behavior is disabled by default, regression risk should be contained.

  Notably the new config knob PTP_1588_CLOCK_VMCLOCK will inherit its
  default value from PTP_1588_CLOCK_KVM, as the first is intended to be
  a more reliable replacement for the latter.

  Core:

   - Started a very large, in-progress, effort to make the RTNL lock
     scope per network-namespace, thus reducing the lock contention
     significantly in the containerized use-case, comprising:
       - RCU-ified some relevant slices of the FIB control path
       - introduce basic per netns locking helpers
       - namespacified the IPv4 address hash table
       - remove rtnl_register{,_module}() in favour of
         rtnl_register_many()
       - refactor rtnl_{new,del,set}link() moving as much validation as
         possible out of RTNL lock
       - convert all phonet doit() and dumpit() handlers to RCU
       - convert IPv4 addresses manipulation to per-netns RTNL
       - convert virtual interface creation to per-netns RTNL
     the per-netns lock infrastructure is guarded by the
     CONFIG_DEBUG_NET_SMALL_RTNL knob, disabled by default ad interim.

   - Introduce NAPI suspension, to efficiently switching between busy
     polling (NAPI processing suspended) and normal processing.

   - Migrate the IPv4 routing input, output and control path from direct
     ToS usage to DSCP macros. This is a work in progress to make ECN
     handling consistent and reliable.

   - Add drop reasons support to the IPv4 rotue input path, allowing
     better introspection in case of packets drop.

   - Make FIB seqnum lockless, dropping RTNL protection for read access.

   - Make inet{,v6} addresses hashing less predicable.

   - Allow providing timestamp OPT_ID via cmsg, to correlate TX packets
     and timestamps

  Things we sprinkled into general kernel code:

   - Add small file operations for debugfs, to reduce the struct ops
     size.

   - Refactoring and optimization for the implementation of page_frag
     API, This is a preparatory work to consolidate the page_frag
     implementation.

  Netfilter:

   - Optimize set element transactions to reduce memory consumption

   - Extended netlink error reporting for attribute parser failure.

   - Make legacy xtables configs user selectable, giving users the
     option to configure iptables without enabling any other config.

   - Address a lot of false-positive RCU issues, pointed by recent CI
     improvements.

  BPF:

   - Put xsk sockets on a struct diet and add various cleanups. Overall,
     this helps to bump performance by 12% for some workloads.

   - Extend BPF selftests to increase coverage of XDP features in
     combination with BPF cpumap.

   - Optimize and homogenize bpf_csum_diff helper for all archs and also
     add a batch of new BPF selftests for it.

   - Extend netkit with an option to delegate skb->{mark,priority}
     scrubbing to its BPF program.

   - Make the bpf_get_netns_cookie() helper available also to tc(x) BPF
     programs.

  Protocols:

   - Introduces 4-tuple hash for connected udp sockets, speeding-up
     significantly connected sockets lookup.

   - Add a fastpath for some TCP timers that usually expires after
     close, the socket lock contention.

   - Add inbound and outbound xfrm state caches to speed up state
     lookups.

   - Avoid sending MPTCP advertisements on stale subflows, reducing
     risks on loosing them.

   - Make neighbours table flushing more scalable, maintaining per
     device neigh lists.

  Driver API:

   - Introduce a unified interface to configure transmission H/W
     shaping, and expose it to user-space via generic-netlink.

   - Add support for per-NAPI config via netlink. This makes napi
     configuration persistent across queues removal and re-creation.
     Requires driver updates, currently supported drivers are:
     nVidia/Mellanox mlx4 and mlx5, Broadcom brcm and Intel ice.

   - Add ethtool support for writing SFP / PHY firmware blocks.

   - Track RSS context allocation from ethtool core.

   - Implement support for mirroring to DSA CPU port, via TC mirror
     offload.

   - Consolidate FDB updates notification, to avoid duplicates on
     device-specific entries.

   - Expose DPLL clock quality level to the user-space.

   - Support master-slave PHY config via device tree.

  Tests and tooling:

   - forwarding: introduce deferred commands, to simplify the cleanup
     phase

  Drivers:

   - Updated several drivers - Amazon vNic, Google vNic, Microsoft vNic,
     Intel e1000e and Broadcom Tigon3 - to use netdev-genl to link the
     IRQs and queues to NAPI IDs, allowing busy polling and better
     introspection.

   - Ethernet high-speed NICs:
      - nVidia/Mellanox:
         - mlx5:
           - a large refactor to implement support for cross E-Switch
             scheduling
           - refactor H/W conter management to let it scale better
           - H/W GRO cleanups
      - Intel (100G, ice)::
         - add support for ethtool reset
         - implement support for per TX queue H/W shaping
      - AMD/Solarflare:
         - implement per device queue stats support
      - Broadcom (bnxt):
         - improve wildcard l4proto on IPv4/IPv6 ntuple rules
      - Marvell Octeon:
         - Add representor support for each Resource Virtualization Unit
           (RVU) device.
      - Hisilicon:
         - add support for the BMC Gigabit Ethernet
      - IBM (EMAC):
         - driver cleanup and modernization
      - Cisco (VIC):
         - raise the queues number limit to 256

   - Ethernet virtual:
      - Google vNIC:
         - implement page pool support
      - macsec:
         - inherit lower device's features and TSO limits when
           offloading
      - virtio_net:
         - enable premapped mode by default
         - support for XDP socket(AF_XDP) zerocopy TX
      - wireguard:
         - set the TSO max size to be GSO_MAX_SIZE, to aggregate larger
           packets.

   - Ethernet NICs embedded and virtual:
      - Broadcom ASP:
         - enable software timestamping
      - Freescale:
         - add enetc4 PF driver
      - MediaTek: Airoha SoC:
         - implement BQL support
      - RealTek r8169:
         - enable TSO by default on r8168/r8125
         - implement extended ethtool stats
      - Renesas AVB:
         - enable TX checksum offload
      - Synopsys (stmmac):
         - support header splitting for vlan tagged packets
         - move common code for DWMAC4 and DWXGMAC into a separate FPE
           module.
         - add dwmac driver support for T-HEAD TH1520 SoC
      - Synopsys (xpcs):
         - driver refactor and cleanup
      - TI:
         - icssg_prueth: add VLAN offload support
      - Xilinx emaclite:
         - add clock support

   - Ethernet switches:
      - Microchip:
         - implement support for the lan969x Ethernet switch family
         - add LAN9646 switch support to KSZ DSA driver

   - Ethernet PHYs:
      - Marvel: 88q2x: enable auto negotiation
      - Microchip: add support for LAN865X Rev B1 and LAN867X Rev C1/C2

   - PTP:
      - Add support for the Amazon virtual clock device
      - Add PtP driver for s390 clocks

   - WiFi:
      - mac80211
         - EHT 1024 aggregation size for transmissions
         - new operation to indicate that a new interface is to be added
         - support radio separation of multi-band devices
         - move wireless extension spy implementation to libiw
      - Broadcom:
         - brcmfmac: optional LPO clock support
      - Microchip:
         - add support for Atmel WILC3000
      - Qualcomm (ath12k):
         - firmware coredump collection support
         - add debugfs support for a multitude of statistics
      - Qualcomm (ath5k):
         -  Arcadyan ARV45XX AR2417 & Gigaset SX76[23] AR241[34]A support
      - Realtek:
         - rtw88: 8821au and 8812au USB adapters support
         - rtw89: add thermal protection
         - rtw89: fine tune BT-coexsitence to improve user experience
         - rtw89: firmware secure boot for WiFi 6 chip

   - Bluetooth
      - add Qualcomm WCN785x support for ids Foxconn 0xe0fc/0xe0f3 and
        0x13d3:0x3623
      - add Realtek RTL8852BE support for id Foxconn 0xe123
      - add MediaTek MT7920 support for wireless module ids
      - btintel_pcie: add handshake between driver and firmware
      - btintel_pcie: add recovery mechanism
      - btnxpuart: add GPIO support to power save feature"

* tag 'net-next-6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1475 commits)
  mm: page_frag: fix a compile error when kernel is not compiled
  Documentation: tipc: fix formatting issue in tipc.rst
  selftests: nic_performance: Add selftest for performance of NIC driver
  selftests: nic_link_layer: Add selftest case for speed and duplex states
  selftests: nic_link_layer: Add link layer selftest for NIC driver
  bnxt_en: Add FW trace coredump segments to the coredump
  bnxt_en: Add a new ethtool -W dump flag
  bnxt_en: Add 2 parameters to bnxt_fill_coredump_seg_hdr()
  bnxt_en: Add functions to copy host context memory
  bnxt_en: Do not free FW log context memory
  bnxt_en: Manage the FW trace context memory
  bnxt_en: Allocate backing store memory for FW trace logs
  bnxt_en: Add a 'force' parameter to bnxt_free_ctx_mem()
  bnxt_en: Refactor bnxt_free_ctx_mem()
  bnxt_en: Add mem_valid bit to struct bnxt_ctx_mem_type
  bnxt_en: Update firmware interface spec to 1.10.3.85
  selftests/bpf: Add some tests with sockmap SK_PASS
  bpf: fix recursive lock when verdict program return SK_PASS
  wireguard: device: support big tcp GSO
  wireguard: selftests: load nf_conntrack if not present
  ...
  • Loading branch information
torvalds committed Nov 21, 2024
2 parents 6e95ef0 + dd72078 commit fcc79e1
Show file tree
Hide file tree
Showing 1,754 changed files with 77,307 additions and 52,092 deletions.
1 change: 1 addition & 0 deletions Documentation/admin-guide/kernel-parameters.txt
Original file line number Diff line number Diff line change
Expand Up @@ -1553,6 +1553,7 @@
failslab=
fail_usercopy=
fail_page_alloc=
fail_skb_realloc=
fail_make_request=[KNL]
General fault injection mechanism.
Format: <interval>,<probability>,<space>,<times>
Expand Down
71 changes: 71 additions & 0 deletions Documentation/core-api/packing.rst
Original file line number Diff line number Diff line change
Expand Up @@ -151,6 +151,77 @@ the more significant 4-byte word.
We always think of our offsets as if there were no quirk, and we translate
them afterwards, before accessing the memory region.

Note on buffer lengths not multiple of 4
----------------------------------------

To deal with memory layout quirks where groups of 4 bytes are laid out "little
endian" relative to each other, but "big endian" within the group itself, the
concept of groups of 4 bytes is intrinsic to the packing API (not to be
confused with the memory access, which is performed byte by byte, though).

With buffer lengths not multiple of 4, this means one group will be incomplete.
Depending on the quirks, this may lead to discontinuities in the bit fields
accessible through the buffer. The packing API assumes discontinuities were not
the intention of the memory layout, so it avoids them by effectively logically
shortening the most significant group of 4 octets to the number of octets
actually available.

Example with a 31 byte sized buffer given below. Physical buffer offsets are
implicit, and increase from left to right within a group, and from top to
bottom within a column.

No quirks:

::

31 29 28 | Group 7 (most significant)
27 26 25 24 | Group 6
23 22 21 20 | Group 5
19 18 17 16 | Group 4
15 14 13 12 | Group 3
11 10 9 8 | Group 2
7 6 5 4 | Group 1
3 2 1 0 | Group 0 (least significant)

QUIRK_LSW32_IS_FIRST:

::

3 2 1 0 | Group 0 (least significant)
7 6 5 4 | Group 1
11 10 9 8 | Group 2
15 14 13 12 | Group 3
19 18 17 16 | Group 4
23 22 21 20 | Group 5
27 26 25 24 | Group 6
30 29 28 | Group 7 (most significant)

QUIRK_LITTLE_ENDIAN:

::

30 28 29 | Group 7 (most significant)
24 25 26 27 | Group 6
20 21 22 23 | Group 5
16 17 18 19 | Group 4
12 13 14 15 | Group 3
8 9 10 11 | Group 2
4 5 6 7 | Group 1
0 1 2 3 | Group 0 (least significant)

QUIRK_LITTLE_ENDIAN | QUIRK_LSW32_IS_FIRST:

::

0 1 2 3 | Group 0 (least significant)
4 5 6 7 | Group 1
8 9 10 11 | Group 2
12 13 14 15 | Group 3
16 17 18 19 | Group 4
20 21 22 23 | Group 5
24 25 26 27 | Group 6
28 29 30 | Group 7 (most significant)

Intended use
------------

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -34,17 +34,25 @@ properties:
firmware-name:
maxItems: 1

device-wakeup-gpios:
maxItems: 1
description:
Host-To-Chip power save mechanism is driven by this GPIO
connected to BT_WAKE_IN pin of the NXP chipset.

required:
- compatible

additionalProperties: false

examples:
- |
#include <dt-bindings/gpio/gpio.h>
serial {
bluetooth {
compatible = "nxp,88w8987-bt";
fw-init-baudrate = <3000000>;
firmware-name = "uartuart8987_bt_v0.bin";
device-wakeup-gpios = <&gpio 11 GPIO_ACTIVE_HIGH>;
};
};
22 changes: 21 additions & 1 deletion Documentation/devicetree/bindings/net/dsa/microchip,ksz.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -34,6 +34,7 @@ properties:
- microchip,ksz9563
- microchip,ksz8563
- microchip,ksz8567
- microchip,lan9646

reset-gpios:
description:
Expand Down Expand Up @@ -81,6 +82,26 @@ properties:
interrupts:
maxItems: 1

mdio:
$ref: /schemas/net/mdio.yaml#
unevaluatedProperties: false
properties:
mdio-parent-bus:
$ref: /schemas/types.yaml#/definitions/phandle
description:
Phandle pointing to the MDIO bus controller connected to the
secondary MDIO interface. This property should be used when
the internal MDIO bus is accessed via a secondary MDIO
interface rather than the primary management interface.

patternProperties:
"^ethernet-phy@[0-9a-f]$":
type: object
$ref: /schemas/net/ethernet-phy.yaml#
unevaluatedProperties: false
description:
Integrated PHY node

required:
- compatible
- reg
Expand Down Expand Up @@ -138,7 +159,6 @@ examples:
pinctrl-0 = <&pinctrl_spi_ksz>;
cs-gpios = <&pioC 25 0>;
id = <1>;
ksz9477: switch@0 {
compatible = "microchip,ksz9477";
Expand Down
46 changes: 23 additions & 23 deletions Documentation/devicetree/bindings/net/dsa/realtek.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -147,7 +147,7 @@ examples:
#include <dt-bindings/interrupt-controller/irq.h>
platform {
switch {
ethernet-switch {
compatible = "realtek,rtl8366rb";
/* 22 = MDIO (has input reads), 21 = MDC (clock, output only) */
mdc-gpios = <&gpio0 21 GPIO_ACTIVE_HIGH>;
Expand All @@ -163,35 +163,35 @@ examples:
#interrupt-cells = <1>;
};
ports {
ethernet-ports {
#address-cells = <1>;
#size-cells = <0>;
port@0 {
ethernet-port@0 {
reg = <0>;
label = "lan0";
phy-handle = <&phy0>;
};
port@1 {
ethernet-port@1 {
reg = <1>;
label = "lan1";
phy-handle = <&phy1>;
};
port@2 {
ethernet-port@2 {
reg = <2>;
label = "lan2";
phy-handle = <&phy2>;
};
port@3 {
ethernet-port@3 {
reg = <3>;
label = "lan3";
phy-handle = <&phy3>;
};
port@4 {
ethernet-port@4 {
reg = <4>;
label = "wan";
phy-handle = <&phy4>;
};
port@5 {
ethernet-port@5 {
reg = <5>;
ethernet = <&gmac0>;
phy-mode = "rgmii";
Expand Down Expand Up @@ -241,7 +241,7 @@ examples:
#include <dt-bindings/interrupt-controller/irq.h>
platform {
switch {
ethernet-switch {
compatible = "realtek,rtl8365mb";
mdc-gpios = <&gpio1 16 GPIO_ACTIVE_HIGH>;
mdio-gpios = <&gpio1 17 GPIO_ACTIVE_HIGH>;
Expand All @@ -255,30 +255,30 @@ examples:
#interrupt-cells = <1>;
};
ports {
ethernet-ports {
#address-cells = <1>;
#size-cells = <0>;
port@0 {
ethernet-port@0 {
reg = <0>;
label = "swp0";
phy-handle = <&ethphy0>;
};
port@1 {
ethernet-port@1 {
reg = <1>;
label = "swp1";
phy-handle = <&ethphy1>;
};
port@2 {
ethernet-port@2 {
reg = <2>;
label = "swp2";
phy-handle = <&ethphy2>;
};
port@3 {
ethernet-port@3 {
reg = <3>;
label = "swp3";
phy-handle = <&ethphy3>;
};
port@6 {
ethernet-port@6 {
reg = <6>;
ethernet = <&fec1>;
phy-mode = "rgmii";
Expand Down Expand Up @@ -330,7 +330,7 @@ examples:
#address-cells = <1>;
#size-cells = <0>;
switch@29 {
ethernet-switch@29 {
compatible = "realtek,rtl8365mb";
reg = <29>;
Expand All @@ -344,36 +344,36 @@ examples:
#interrupt-cells = <1>;
};
ports {
ethernet-ports {
#address-cells = <1>;
#size-cells = <0>;
port@0 {
ethernet-port@0 {
reg = <0>;
label = "lan4";
};
port@1 {
ethernet-port@1 {
reg = <1>;
label = "lan3";
};
port@2 {
ethernet-port@2 {
reg = <2>;
label = "lan2";
};
port@3 {
ethernet-port@3 {
reg = <3>;
label = "lan1";
};
port@4 {
ethernet-port@4 {
reg = <4>;
label = "wan";
};
port@7 {
ethernet-port@7 {
reg = <7>;
ethernet = <&ethernet>;
phy-mode = "rgmii";
Expand Down
21 changes: 21 additions & 0 deletions Documentation/devicetree/bindings/net/ethernet-phy.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -158,6 +158,27 @@ properties:
Mark the corresponding energy efficient ethernet mode as
broken and request the ethernet to stop advertising it.

timing-role:
$ref: /schemas/types.yaml#/definitions/string
enum:
- forced-master
- forced-slave
- preferred-master
- preferred-slave
description: |
Specifies the timing role of the PHY in the network link. This property is
required for setups where the role must be explicitly assigned via the
device tree due to limitations in hardware strapping or incorrect strap
configurations.
It is applicable to Single Pair Ethernet (1000/100/10Base-T1) and other
PHY types, including 1000Base-T, where it controls whether the PHY should
be a master (clock source) or a slave (clock receiver).
- 'forced-master': The PHY is forced to operate as a master.
- 'forced-slave': The PHY is forced to operate as a slave.
- 'preferred-master': Prefer the PHY to be master but allow negotiation.
- 'preferred-slave': Prefer the PHY to be slave but allow negotiation.
pses:
$ref: /schemas/types.yaml#/definitions/phandle-array
maxItems: 1
Expand Down
11 changes: 7 additions & 4 deletions Documentation/devicetree/bindings/net/fsl,enetc-mdio.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -20,10 +20,13 @@ maintainers:

properties:
compatible:
items:
- enum:
- pci1957,ee01
- const: fsl,enetc-mdio
oneOf:
- items:
- enum:
- pci1957,ee01
- const: fsl,enetc-mdio
- items:
- const: pci1131,ee00

reg:
maxItems: 1
Expand Down
28 changes: 25 additions & 3 deletions Documentation/devicetree/bindings/net/fsl,enetc.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -20,14 +20,25 @@ maintainers:

properties:
compatible:
items:
oneOf:
- items:
- enum:
- pci1957,e100
- const: fsl,enetc
- enum:
- pci1957,e100
- const: fsl,enetc
- pci1131,e101

reg:
maxItems: 1

clocks:
items:
- description: MAC transmit/receive reference clock

clock-names:
items:
- const: ref

mdio:
$ref: mdio.yaml
unevaluatedProperties: false
Expand All @@ -40,6 +51,17 @@ required:
allOf:
- $ref: /schemas/pci/pci-device.yaml
- $ref: ethernet-controller.yaml
- if:
not:
properties:
compatible:
contains:
enum:
- pci1131,e101
then:
properties:
clocks: false
clock-names: false

unevaluatedProperties: false

Expand Down
Loading

0 comments on commit fcc79e1

Please sign in to comment.