Age | Commit message | Author
2016-11-04 | ixgbe: Add support to retrieve and store LED link active | Don Skidmore
This patch adds support to get the LED link active via the LEDCTL register. If the LEDCTL register does not have LED link active (LED mode field = 0x0100) set, then the default LED link active is returned. Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-11-04 | ixgbe: Add X552 iXFI configuration helper function | Don Skidmore
X553 doesn't need all the initialization that X552 did for iXFI. This patch will allow native SPI SFP+ to work with X553 devices. Future patches will add additional configuration as needed. Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-11-04 | Merge branch 'nfp-ring-reconfig-and-xdp-support' | David S. Miller
Jakub Kicinski says: ==================== ring reconfiguration and XDP support This set adds support for ethtool channel API and XDP. I kick off with ethtool get_channels() implementation. set_channels() needs some preparations to get right. I follow the prepare/commit paradigm and allocate all resources before stopping the device. It has already been done for ndo_change_mtu and ethtool set_ringparam(), it makes sense now to consolidate all the required logic in one place. XDP support requires splitting TX rings into two classes - for the stack and for XDP. The ring structures are identical. The differences are in how they are connected to IRQ vector structs and how the completion/cleanup works. When XDP is enabled I switch from the frag allocator to page-per-packet and map buffers BIDIRECTIONALly. Last but not least XDP offload is added (the patch just takes care of the small formal differences between cls_bpf and XDP). There is a tiny & trivial DebugFS patch in the mix, I hope it can be taken via net-next provided we have the right Acks. Resending with improved commit message and CCing more people on patch 10. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-04 | nfp: add support for offload of XDP programs | Jakub Kicinski
Most infrastructure can be reused, provide separate handling of context offsets and exit codes. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-04 | nfp: remove unnecessary parameters from nfp_net_bpf_offload() | Jakub Kicinski
nfp_net_bpf_offload() takes all .setup_tc() parameters but it doesn't use them at the moment. Remove unnecessary ones to make it possible for XDP to reuse this function. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-04 | nfp: add XDP support in the driver | Jakub Kicinski
Add XDP support. Separate stack's and XDP's TX rings logically. Add functions for handling XDP_TX and cleanup of XDP's TX rings. For XDP allocate all RX buffers as separate pages and map them with DMA_BIDIRECTIONAL. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-04 | debugfs: constify argument to debugfs_real_fops() | Jakub Kicinski
seq_file users can only access const version of file pointer, because the ->file member of struct seq_operations is marked as such. Make parameter to debugfs_real_fops() const. CC: Greg Kroah-Hartman <gregkh@linuxfoundation.org> CC: Nicolai Stange <nicstange@gmail.com> CC: Christian Lamparter <chunkeey@gmail.com> CC: LKML <linux-kernel@vger.kernel.org> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-04 | nfp: reorganize nfp_net_rx() to get packet offsets early | Jakub Kicinski
Calculate packet offsets early in nfp_net_rx() so that we will be able to use them in upcoming XDP handler. While at it move relevant variables into the loop scope. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-04 | nfp: add support for ethtool .set_channels | Jakub Kicinski
Allow changing the number of rings via ethtool .set_channels API. Runtime reconfig needs to be extended to handle number of rings. We need to be able to activate interrupt vectors before rings are assigned to them. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-04 | nfp: move RSS indirection table init into a separate function | Jakub Kicinski
We will need to rerun the initialization of the RSS indirection table after the number of rings is changed. Move the code to a separate function. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
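The effect of re-running the initialization can be sketched as follows. This is a minimal illustration, not the nfp driver's code: it assumes the table is filled round-robin over the active RX rings (the default that the kernel helper ethtool_rxfh_indir_default() computes as index % n_rx_rings). The function name and the table size of 128 are assumptions made for the example.

```python
def init_rss_indirection_table(n_rx_rings, table_size=128):
    """Spread table entries round-robin over the active RX rings.

    table_size=128 is an assumed value for illustration only.
    """
    if n_rx_rings <= 0:
        raise ValueError("need at least one RX ring")
    return [i % n_rx_rings for i in range(table_size)]


# Re-running the init after a ring-count change keeps every entry in range.
table = init_rss_indirection_table(4)
table = init_rss_indirection_table(2)  # ring count reduced from 4 to 2
```

Keeping this in its own function means the same code path serves both the first open and any later change to the ring count.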
2016-11-04 | nfp: add helper to reassign rings to IRQ vectors | Jakub Kicinski
Instead of fixing ring -> vector relations up in ring swap functions put the reassignment into a helper function which will reinit all links. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-04 | nfp: loosen relation between rings and IRQs vectors | Jakub Kicinski
Upcoming XDP support will break the assumption that one can iterate over IRQ vectors to get to all the rings easily. Use nn->.x_ring arrays directly. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-04 | nfp: reuse ring helpers on .ndo_open() path | Jakub Kicinski
Ring allocation helpers encapsulate all ring allocation and initialization steps nicely. Reuse them on .ndo_open() path. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-04 | nfp: rename ring allocation helpers | Jakub Kicinski
"Shadow" in ring helpers used to mean that the helper will allocate rings without touching existing configuration, this was used for reconfiguration while the device was running. We will soon use the same helpers for .ndo_open() path, so replace "shadow" with "ring_set". No functional changes. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-04 | nfp: centralize runtime reconfiguration logic | Jakub Kicinski
All functions which need to reallocate ring resources at runtime look very similar. Centralize that logic into a separate function. Encapsulate configuration parameters in a structure. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
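The shape of such a centralized helper might look like the sketch below. It follows the prepare/commit idea described in the series cover letter (allocate everything before stopping the device), but the class, the function name, and the `alloc` parameter are hypothetical and stand in for the driver's real structures.

```python
class Device:
    """Toy device holding a set of rings; stands in for real NIC state."""

    def __init__(self, n_rings, ring_size):
        self.running = True
        self.rings = [bytearray(ring_size) for _ in range(n_rings)]


def ring_reconfig(dev, n_rings, ring_size, alloc=bytearray):
    """Swap in a new ring set, allocating everything before the stop."""
    # Prepare: allocate the full new ring set while the device still runs.
    # If this raises, the old configuration is left untouched.
    new_rings = [alloc(ring_size) for _ in range(n_rings)]

    # Commit: stop, swap, restart. No allocation happens in this window.
    dev.running = False
    old_rings, dev.rings = dev.rings, new_rings
    dev.running = True
    return old_rings  # the caller releases the old set
```

Because every failure point sits in the prepare step, a failed reconfiguration leaves the device running with its previous rings.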
2016-11-04 | nfp: add support for ethtool .get_channels | Jakub Kicinski
Report number of rings via ethtool .get_channels API. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-04 | Merge branch 'amd-xgbe-updates' | David S. Miller
Tom Lendacky says:

====================
amd-xgbe: AMD XGBE driver updates 2016-11-03

This patch series is targeted at preparing the driver for a new PCI version of the hardware. After this series is applied, a follow-on series will introduce the support for the PCI version of the hardware.

The following updates and fixes are included in this driver update series:

- Fix formatting of PCS debug register dump
- Prepare for priority-based FIFO allocation
- Implement priority-based FIFO allocation
- Prepare for working with more than one type of PCS/PHY
- Prepare for the introduction of clause 37 auto-negotiation
- Add support for clause 37 auto-negotiation
- Prepare for supporting a new PCS register access method
- Add support for 64-bit management counter registers
- Update DMA channel status determination
- Prepare for supporting PCI devices in addition to platform devices

This patch series is based on net-next.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-04 | amd-xgbe: Prepare for supporting PCI devices | Lendacky, Thomas
Update the driver framework to separate out platform/ACPI specific code from general code during device initialization. This will allow for the introduction of PCI device support. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-04 | amd-xgbe: Update how to determine DMA channel status | Lendacky, Thomas
Tx and Rx DMA channel status determination is different depending on the version of the hardware. Update the channel status processing code to account for the change. Also, reduce the timeout value used when stopping the channels. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-04 | amd-xgbe: Support for 64-bit management counter registers | Lendacky, Thomas
Add support for reading all management counter registers as 64-bit values. Whether to read the high 32 bits to form a 64-bit value is indicated in the version data. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>
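A rough sketch of the read path, under stated assumptions: `read_reg` is a hypothetical callable that returns the 32-bit value at a register offset, the high word is assumed to sit 4 bytes after the low word, and `is_64bit` stands in for the flag carried in the version data.

```python
def read_counter(read_reg, reg_lo, is_64bit):
    """Read a management counter as a 32-bit or 64-bit value.

    read_reg -- hypothetical accessor returning the 32-bit register value
    reg_lo   -- offset of the low word; high word assumed at reg_lo + 4
    is_64bit -- stands in for the per-version "read the high word" flag
    """
    value = read_reg(reg_lo) & 0xFFFFFFFF
    if is_64bit:
        value |= (read_reg(reg_lo + 4) & 0xFFFFFFFF) << 32
    return value
```

For example, with a low word of 0xDEADBEEF and a high word of 0x1, the 64-bit read yields 0x1DEADBEEF while the 32-bit read yields 0xDEADBEEF.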
2016-11-04 | amd-xgbe: Prepare for a new PCS register access method | Lendacky, Thomas
Prepare the code to be able to support accessing of the PCS registers in a new way, while maintaining the current access method. Provide a version specific field that indicates the method to use. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-04 | amd-xgbe: Add support for clause 37 auto-negotiation | Lendacky, Thomas
Add support to be able to use clause 37 auto-negotiation. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-04 | amd-xgbe: Prepare for introduction of clause 37 autoneg | Lendacky, Thomas
Prepare for the future introduction of clause 37 auto-negotiation by updating the current auto-negotiation related functions to identify them as clause 73 functions. Move interrupt enablement to the enable/disable auto-negotiation functions. Update what will be common routines to check for the current type of AN and process accordingly. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-04 | amd-xgbe: Prepare for working with more than one type of phy | Lendacky, Thomas
Prepare the code to be able to work with more than one type of phy by adding additional callable functions into the phy interface and removing phy specific settings/functions from non-phy related files. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-04 | amd-xgbe: Perform priority-based hardware FIFO allocation | Lendacky, Thomas
Allocate the FIFO across the hardware Rx queues based on the priority of the queues, giving more FIFO resources to queues with a higher priority. If PFC is active but not enabled for a queue, then fewer resources can be allocated to the queue. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-04 | amd-xgbe: Prepare for priority-based FIFO allocation | Lendacky, Thomas
Currently, the Rx and Tx fifos are evenly allocated between the hardware queues of the device. As more queues are instantiated, the fifo memory needs to be able to be allocated based on queue priority. This allows for higher priority queues to have more fifo memory than lower priority queues. Prepare for this by modifying the current fifo calculation to assign the fifo queue allocation in an array that is then used to program the hardware. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>
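The preparatory step can be pictured as computing a per-queue array first and programming the hardware from it afterwards. The sketch below shows only the even split the message describes as the current behaviour; the function name is hypothetical, and leaving any remainder unallocated is an assumption of this example, not a statement about the driver.

```python
def even_fifo_allocation(fifo_size, queue_count):
    """Return a per-queue list of FIFO sizes, split evenly.

    Any remainder is left unallocated in this sketch.
    """
    if queue_count <= 0:
        raise ValueError("need at least one queue")
    per_queue = fifo_size // queue_count
    return [per_queue] * queue_count
```

Once the sizes live in an array, a priority-based policy only needs to fill that array differently; the code that programs the hardware from it does not change.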
2016-11-04 | amd-xgbe: Fix formatting of PCS register dump | Lendacky, Thomas
Fix the length value used for the PCS register dump so that the full value can be displayed. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-04 | Merge branch 'uid-routing' | David S. Miller
Lorenzo Colitti says:

====================
net: inet: Support UID-based routing

This patchset adds support for per-UID routing. It allows the administrator to configure rules such as:

    ip rule add uidrange 100-200 lookup 123

This functionality has been in use by all Android devices since 5.0. It is primarily used to impose per-app routing policies (on Android, every app has its own UID) without having to resort to rerouting packets in iptables, which breaks getsockname() and MTU/MSS calculation, and generally disrupts end-to-end connectivity.

This patch series is similar to the code currently used on Android, but has better correctness and performance because it stores the UID in the socket instead of calling sock_i_uid. This avoids contention on sk->sk_callback_lock, and makes it possible to correctly route a socket on which userspace has called close(), for which sock_i_uid will return 0.

Changes from v1:

- Don't set the UID in sk_clone_lock, it's already set by sock_copy.
- For packets originated by kernel sockets, don't use the socket UID. This is the UID that created the namespace, but it might not be mapped in the namespace at all. Instead, use UID 0 in the namespace, which is less surprising and consistent with what happens in the root namespace.
- Fix UID routing of IPv4 and IPv6 SYN_RECV sockets.
- Fix UID routing of received IPv6 redirects.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-04 | net: inet: Support UID-based routing in IP protocols. | Lorenzo Colitti
- Use the UID in routing lookups made by protocol connect() and sendmsg() functions.
- Make sure that routing lookups triggered by incoming packets (e.g., Path MTU discovery) take the UID of the socket into account.
- For packets not associated with a userspace socket (e.g., ping replies), use UID 0 inside the user namespace corresponding to the network namespace the socket belongs to. This allows all namespaces to apply routing and iptables rules to kernel-originated traffic in that namespace by matching UID 0. This is better than using the UID of the kernel socket that is sending the traffic, because the UID of kernel sockets created at namespace creation time (e.g., the per-processor ICMP and TCP sockets) is the UID of the user that created the socket, which might not be mapped in the namespace.

Tested: compiles allnoconfig, allyesconfig, allmodconfig
Tested: https://android-review.googlesource.com/253302

Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-04 | net: core: add UID to flows, rules, and routes | Lorenzo Colitti
- Define a new FIB rule attribute, FRA_UID_RANGE, to describe a range of UIDs.
- Define a RTA_UID attribute for per-UID route lookups and dumps.
- Support passing these attributes to and from userspace via rtnetlink. The value INVALID_UID indicates no UID was specified.
- Add a UID field to the flow structures.

Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
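How a UID range on a rule interacts with the UID carried in a flow can be sketched as below. This is a simplified model, not the kernel's data structures: the class and function names are hypothetical, both range bounds are treated as inclusive, and the fallback of table 254 is an assumption chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class UidRangeRule:
    start: int  # first UID covered, inclusive
    end: int    # last UID covered, inclusive
    table: int  # routing table to consult on a match


def lookup_table(rules, uid, default_table=254):
    """Return the table of the first rule whose range contains uid.

    default_table=254 is an assumed fallback for illustration.
    """
    for rule in rules:
        if rule.start <= uid <= rule.end:
            return rule.table
    return default_table
```

With a single rule covering UIDs 100 to 200 and pointing at table 123, a flow with UID 150 resolves to table 123 and a flow with UID 201 falls through to the default.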
2016-11-04 | net: core: Add a UID field to struct sock. | Lorenzo Colitti
Protocol sockets (struct sock) don't have UIDs, but most of the time, they map 1:1 to userspace sockets (struct socket) which do. Various operations such as the iptables xt_owner match need access to the "UID of a socket", and do so by following the backpointer to the struct socket. This involves taking sk_callback_lock and doesn't work when there is no socket because userspace has already called close().

Simplify this by adding a sk_uid field to struct sock whose value matches the UID of the corresponding struct socket. The semantics are as follows:

1. Whenever sk_socket is non-null: sk_uid is the same as the UID in sk_socket, i.e., matches the return value of sock_i_uid. Specifically, the UID is set when userspace calls socket(), fchown(), or accept().
2. When sk_socket is NULL, sk_uid is defined as follows:
   - For a socket that no longer has a sk_socket because userspace has called close(): the previous UID.
   - For a cloned socket (e.g., an incoming connection that is established but on which userspace has not yet called accept): the UID of the socket it was cloned from.
   - For a socket that has never had an sk_socket: UID 0 inside the user namespace corresponding to the network namespace the socket belongs to.

Kernel sockets created by sock_create_kern are a special case of #1 and sk_uid is the user that created them. For kernel sockets created at network namespace creation time, such as the per-processor ICMP and TCP sockets, this is the user that created the network namespace.

Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
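The cases listed in this message can be read as a small decision function. The sketch below is only a model of those semantics: in the kernel the value is stored in the socket when the events happen rather than computed on demand, and the function and parameter names here are hypothetical.

```python
def sk_uid(socket_uid=None, previous_uid=None, parent_uid=None):
    """Pick the UID of a protocol socket following the listed cases.

    socket_uid   -- UID of the attached userspace socket, if any (case 1)
    previous_uid -- UID held before userspace called close()
    parent_uid   -- UID of the socket this one was cloned from
    Returns 0, standing in for UID 0 of the owning user namespace,
    when the socket never had a userspace socket.
    """
    if socket_uid is not None:
        return socket_uid
    if previous_uid is not None:
        return previous_uid
    if parent_uid is not None:
        return parent_uid
    return 0
```

The point of storing the result in the socket is visible in the second and third branches: both remain answerable after the userspace socket is gone or before it exists.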
2016-11-04 | Merge branch 'dsa-mv88e6xxx-port-operation-refine' | David S. Miller
Vivien Didelot says:

====================
net: dsa: mv88e6xxx: refine port operations

The Marvell chips have one internal SMI device per port, containing a set of registers used to configure a port's link, STP state, default VLAN or addresses database, etc.

This patchset creates port files to implement the port operations as described in datasheets, and extends the chip ops structure with them.

Patches 1 to 6 implement accessors for port's STP state, port based VLAN map, default FID, default VID, and 802.1Q mode.

Patches 7 to 11 implement the port's MAC setup of link state, duplex mode, RGMII delay and speed, all accessed through port's register 0x01.

The new port's MAC setup code is used to re-implement the adjust_link code and correctly force the link down before changing any of the MAC settings, as requested by the datasheets.

The port's MAC accessors use values compatible with struct phy_device (e.g. DUPLEX_FULL) and extend them when needed (e.g. SPEED_MAX).

Changes in v2:

- Strictly use new _UNFORCED values instead of re-using _UNKNOWN ones.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-04 | net: dsa: mv88e6xxx: setup port's MAC | Vivien Didelot
Now that we have setters to configure the port's MAC, use them to refactor the port setup and adjust_link code. Note that port's MAC speed, duplex or RGMII delay must not be changed unless the port's link is forced down. So wrap all that in a mv88e6xxx_port_setup_mac function. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
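The ordering constraint can be sketched as follows: force the link down, change the MAC settings, then apply the requested link state. The `Port` class, the string labels, and the function signature are hypothetical stand-ins; only the ordering is taken from the message, and the RGMII delay step is omitted for brevity.

```python
LINK_FORCED_DOWN = "forced-down"  # hypothetical label for a forced-down link
LINK_UNFORCED = "unforced"        # hypothetical label for normal detection


class Port:
    """Toy port that records the order of its MAC register writes."""

    def __init__(self):
        self.link = LINK_UNFORCED
        self.speed = None
        self.duplex = None
        self.writes = []

    def set(self, field, value):
        setattr(self, field, value)
        self.writes.append((field, value))


def port_setup_mac(port, link, speed, duplex):
    """Change speed and duplex only while the link is forced down."""
    port.set("link", LINK_FORCED_DOWN)  # must come first
    port.set("speed", speed)
    port.set("duplex", duplex)
    port.set("link", link)              # apply the requested link state last
```

Wrapping the sequence in one function means no caller can change speed or duplex without the link being forced down first.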
2016-11-04 | net: dsa: mv88e6xxx: add port's MAC speed setter | Vivien Didelot
While the two bits for link, duplex or RGMII delays are used the same way on chips supporting the said feature, the two bits for speed have different meaning for most of the chips out there.

Speed value is stored in bits 1:0, 0x3 means unforce (normal detection). Some chips reuse values for alternative speeds when bit 12 is set. Newer chips with speed > 1Gbps reuse value 0x3 thus need a new bit 13.

Here are the values to write in register 0x1 to (un)force speed:

| Speed   | 88E6065 | 88E6185 | 88E6352 | 88E6390 | 88E6390X |
| ------- | ------- | ------- | ------- | ------- | -------- |
| 10      | 0x0000  | 0x0000  | 0x0000  | 0x2000  | 0x2000   |
| 100     | 0x0001  | 0x0001  | 0x0001  | 0x2001  | 0x2001   |
| 200     | 0x0002  | NA      | 0x1001  | 0x3001  | 0x3001   |
| 1000    | NA      | 0x0002  | 0x0002  | 0x2002  | 0x2002   |
| 2500    | NA      | NA      | NA      | 0x3003  | 0x3003   |
| 10000   | NA      | NA      | NA      | NA      | 0x2003   |
| unforce | 0x0003  | 0x0003  | 0x0003  | 0x0000  | 0x0000   |

This patch implements a generic mv88e6xxx_port_set_speed() function used by chip-specific wrappers to filter supported ports and speeds.

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
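The table in this commit message can be expressed as a lookup, which also makes the per-chip filtering visible. This is a sketch only: the values are copied from the message, while the dictionary layout and the function name are assumptions and do not reflect how the driver encodes them.

```python
# Register 0x1 values per chip, copied from the commit message table.
# Speeds marked "NA" there are simply absent here.
COMMON_6390 = {10: 0x2000, 100: 0x2001, 200: 0x3001, 1000: 0x2002,
               2500: 0x3003, "unforce": 0x0000}

SPEED_REG = {
    "88E6065": {10: 0x0000, 100: 0x0001, 200: 0x0002, "unforce": 0x0003},
    "88E6185": {10: 0x0000, 100: 0x0001, 1000: 0x0002, "unforce": 0x0003},
    "88E6352": {10: 0x0000, 100: 0x0001, 200: 0x1001, 1000: 0x0002,
                "unforce": 0x0003},
    "88E6390": dict(COMMON_6390),
    "88E6390X": {**COMMON_6390, 10000: 0x2003},
}


def speed_reg_value(chip, speed):
    """Return the value to write for a speed, or raise if unsupported."""
    try:
        return SPEED_REG[chip][speed]
    except KeyError:
        raise ValueError(f"{chip} does not support speed {speed!r}")
```

Note how 200 Mbps on the 88E6352 is the 100 Mbps code with bit 12 set (0x1001), matching the message's remark about alternative speeds.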
2016-11-04 | net: dsa: mv88e6xxx: add port's RGMII delay setter | Vivien Didelot
Some chips such as 88E6352 and 88E6390 can be programmed to add delays to RXCLK for IND inputs or to GTXCLK for OUTD outputs when port is in RGMII mode. Add a port function to program such delays according to the provided PHY interface mode. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-04 | net: dsa: mv88e6xxx: add port duplex setter | Vivien Didelot
Similarly to the port's link, add a setter to force the port's half duplex or full duplex, or to let normal duplex detection occur. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-04 | net: dsa: mv88e6xxx: add port link setter | Vivien Didelot
Most of the chips have port register control bits to force the port's link up or down, or to let normal link detection occur. Implement such an operation to use it later when setting duplex, etc. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-04 | net: dsa: mv88e6xxx: add port 802.1Q mode setter | Vivien Didelot
Add port functions to set the port 802.1Q mode. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-04 | net: dsa: mv88e6xxx: add port PVID accessors | Vivien Didelot
Add port functions to access the port's default VID. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-04 | net: dsa: mv88e6xxx: add port FID accessors | Vivien Didelot
Add functions to port files to access the port's default FID. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-04 | net: dsa: mv88e6xxx: add port vlan map setter | Vivien Didelot
Add a port function to access the Port Based VLAN Map register. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-04 | net: dsa: mv88e6xxx: add port state setter | Vivien Didelot
Add the port STP state setter to the port files. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-04 | net: dsa: mv88e6xxx: add port files | Vivien Didelot
The Marvell switches contain one internal SMI device per port, called "Port Registers". Depending on the model, the addresses of these devices start from 0x0, 0x8 or 0x10. Start moving Port Registers specific code to their own files. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
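The addressing scheme can be sketched as a base plus the port index. This assumes the per-port devices occupy consecutive SMI addresses starting at the model's base, which the message implies but does not state outright; the function name is hypothetical.

```python
def port_register_device(port_base_addr, port):
    """SMI device address of a port's "Port Registers".

    Assumes consecutive addresses starting at the model's base.
    """
    if port_base_addr not in (0x0, 0x8, 0x10):
        raise ValueError("unexpected Port Registers base address")
    if port < 0:
        raise ValueError("port index must be non-negative")
    return port_base_addr + port
```

For a model whose Port Registers start at 0x10, port 3 would be reached at SMI device address 0x13.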
2016-11-03 | net/sched: cls_flower: Support matching on SCTP ports | Simon Horman
Support matching on SCTP ports in the same way that matching on TCP and UDP ports is already supported.

Example usage:

    tc qdisc add dev eth0 ingress

    tc filter add dev eth0 protocol ip parent ffff: \
        flower indev eth0 ip_proto sctp dst_port 80 \
        action drop

Signed-off-by: Simon Horman <simon.horman@netronome.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-03 | mlxsw: pci: Fix the FW ready mask length | Elad Raz
The system-status register is actually 16 bits wide, not 8 bits wide. Fixes: 233fa44bd67ae ("mlxsw: pci: Implement reset done check") Signed-off-by: Elad Raz <eladr@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
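Why the mask width matters can be shown with a small sketch. The "ready" value 0x5E and all names below are assumptions chosen for illustration; the message itself only states the field width.

```python
FW_READY_MAGIC = 0x5E  # assumed "ready" value, for illustration only
MASK_8BIT = 0xFF       # too narrow for a 16-bit field
MASK_16BIT = 0xFFFF    # covers the whole 16-bit field


def fw_ready(status, mask):
    """Report ready when the masked status equals the magic value."""
    return (status & mask) == FW_READY_MAGIC


# A status whose upper byte is non-zero is not the magic value,
# yet the 8-bit mask hides that byte and reports "ready".
status = 0x015E
```

With this status, the 8-bit mask reports ready and the 16-bit mask does not, which is the kind of false positive a too-narrow mask allows.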
2016-11-03 | Merge branch 'ip-recvfragsize-cmsg' | David S. Miller
Willem de Bruijn says: ==================== ip: add RECVFRAGSIZE cmsg On IP datagrams and raw sockets, when packets arrive fragmented, expose the largest received fragment size through a new cmsg. Protocols implemented on top of these sockets may use this, for instance, to inform peers to lower MSS on platforms that silently allow send calls to exceed PMTU and cause fragmentation. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-03 | ipv6: on reassembly, record frag_max_size | Willem de Bruijn
IP6CB and IPCB have a frag_max_size field. In IPv6 this field is filled in when packets are reassembled by the connection tracking code. Also fill it in when reassembling in the input path, to expose it through cmsg IPV6_RECVFRAGSIZE in all cases. Signed-off-by: Willem de Bruijn <willemb@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-03 | ipv6: add IPV6_RECVFRAGSIZE cmsg | Willem de Bruijn
When reading a datagram or raw packet that arrived fragmented, expose the maximum fragment size if recorded to allow applications to estimate receive path MTU.

At this point, the field is only recorded when ipv6 connection tracking is enabled. A follow-up patch will record this field also in the ipv6 input path.

Tested using the test for IP_RECVFRAGSIZE plus

    ip netns exec to ip addr add dev veth1 fc07::1/64
    ip netns exec from ip addr add dev veth0 fc07::2/64

    ip netns exec to ./recv_cmsg_recvfragsize -6 -u -p 6000 &
    ip netns exec from nc -q 1 -u fc07::1 6000 < payload

Both with and without enabling connection tracking

    ip6tables -A INPUT -m state --state NEW -p udp -j LOG

Signed-off-by: Willem de Bruijn <willemb@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-03 | ipv4: add IP_RECVFRAGSIZE cmsg | Willem de Bruijn
The IP stack records the largest fragment of a reassembled packet in IPCB(skb)->frag_max_size. When reading a datagram or raw packet that arrived fragmented, expose the value to allow applications to estimate receive path MTU.

Tested:
Sent data over a veth pair of which the source has a small mtu. Sent data using netcat, received using a dedicated process.

Verified that the cmsg IP_RECVFRAGSIZE is returned only when data arrives fragmented, and in that case matches the veth mtu.

    ip link add veth0 type veth peer name veth1

    ip netns add from
    ip netns add to

    ip link set dev veth1 netns to
    ip netns exec to ip addr add dev veth1 192.168.10.1/24
    ip netns exec to ip link set dev veth1 up

    ip link set dev veth0 netns from
    ip netns exec from ip addr add dev veth0 192.168.10.2/24
    ip netns exec from ip link set dev veth0 up
    ip netns exec from ip link set dev veth0 mtu 1300
    ip netns exec from ethtool -K veth0 ufo off

    dd if=/dev/zero bs=1 count=1400 2>/dev/null > payload

    ip netns exec to ./recv_cmsg_recvfragsize -4 -u -p 6000 &
    ip netns exec from nc -q 1 -u 192.168.10.1 6000 < payload

using github.com/wdebruij/kerneltools/blob/master/tests/recvfragsize.c

Signed-off-by: Willem de Bruijn <willemb@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
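On the receiving side, an application would pick the value out of the ancillary data returned by recvmsg(). The sketch below is not the test program referenced in the message. It assumes the option number 25 (taken from linux/in.h, since Python's socket module does not export this constant) and that the payload is a native int; the function name is hypothetical.

```python
import socket
import struct

# Assumed option number from linux/in.h; not exported by the socket module.
IP_RECVFRAGSIZE = 25


def largest_fragment_size(ancdata):
    """Return the IP_RECVFRAGSIZE value from ancillary data, or None.

    ancdata is the list of (level, type, data) tuples that
    socket.recvmsg() returns; the payload is assumed to be a native int.
    """
    for level, ctype, data in ancdata:
        if level == socket.IPPROTO_IP and ctype == IP_RECVFRAGSIZE:
            return struct.unpack("i", data[:4])[0]
    return None
```

A receiver would first enable the option with `sock.setsockopt(socket.IPPROTO_IP, IP_RECVFRAGSIZE, 1)` and pass a non-zero ancillary buffer size to `sock.recvmsg()`. A `None` result then corresponds to a datagram that did not arrive fragmented.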
2016-11-03 | Merge branch 'stmmac-OXNAS' | David S. Miller
Neil Armstrong says:

====================
net: stmmac: Add OXNAS DWMAC Glue

This patchset adds support for the Synopsys DWMAC Gigabit Ethernet controller Glue layer of the Oxford Semiconductor OX820 SoC.

Changes since v2 at http://lkml.kernel.org/r/20161031105345.16711-1-narmstrong@baylibre.com :

- Disable/Unprepare clock if regmap read fails in oxnas_dwmac_init

Changes since v1 at https://patchwork.kernel.org/patch/9388231/ :

- Split dt-bindings in a separate patch
- Add IP version in the dt-bindings compatible
- Check return of clk_prepare_enable()
- use get_stmmac_bsp_priv() helper
- hardwire setup values in oxnas_dwmac_init()

Changes since RFC at https://patchwork.kernel.org/patch/9387257 :

- Drop init/exit callbacks
- Implement proper remove and PM callback
- Call init from probe
- Disable/Unprepare clock if stmmac probe fails
====================

Signed-off-by: David S. Miller <davem@davemloft.net>