linux.git

Age	Commit message	Author
2017-02-06	can: rx-offload: Add support for HW fifo based irq offloading	David Jander
	Some CAN controllers have a usable FIFO already but can still benefit from off-loading the CAN controller FIFO. The CAN frames of the FIFO are read and put into a skb queue during interrupt and then transmitted in a NAPI context. Signed-off-by: David Jander <david@protonic.nl> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2017-02-05	Merge branch 'remove-__napi_complete_done'	David S. Miller
	Eric Dumazet says: ==================== net: get rid of __napi_complete() This patch series removes __napi_complete() calls, in an effort to make NAPI API simpler and generalize GRO and napi_complete_done() ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-05	net: remove __napi_complete()	Eric Dumazet
	All __napi_complete() callers have been converted to use the more standard napi_complete_done(), we can now remove this NAPI method for good. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-05	aeroflex/greth: use napi_complete_done()	Eric Dumazet
	We plan to remove __napi_complete() soon, and this driver is the last user. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-05	ibm/emac: use napi_complete_done()	Eric Dumazet
	Use napi_complete_done() instead of __napi_complete(). We plan to remove __napi_complete() to reduce NAPI complexity. Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-05	qla3xxx: add GRO support	Eric Dumazet
	Use napi_complete_done() instead of __napi_complete() to : 1) Get support of gro_flush_timeout if opt-in 2) Not rearm interrupts for busy-polling users. 3) use standard NAPI API. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-05	ks8695net: add GRO support	Eric Dumazet
	Use napi_complete_done() instead of __napi_complete() to : 1) Get support of gro_flush_timeout if opt-in 2) Not rearm interrupts for busy-polling users. 3) use standard NAPI API. Note that rx_lock seems to be useless, NAPI logic should not need this extra care. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-05	skge: use napi_complete_done()	Eric Dumazet
	Use napi_complete_done() instead of __napi_complete() to : 1) Get support of gro_flush_timeout if opt-in 2) Not rearm interrupts for busy-polling users. 3) use standard NAPI API and get rid of napi_gro_flush() Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-05	ep93xx_eth: add GRO support	Eric Dumazet
	Use napi_complete_done() instead of __napi_complete() to : 1) Get support of gro_flush_timeout if opt-in 2) Not rearm interrupts for busy-polling users. 3) use standard NAPI API. 4) get rid of baroque code and ease maintenance. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-05	pcnet32: use napi_complete_done()	Eric Dumazet
	Use napi_complete_done() instead of __napi_complete() to : 1) Get support of gro_flush_timeout if opt-in 2) Not rearm interrupts for busy-polling users. 3) use standard NAPI API. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-05	amd8111e: add GRO support	Eric Dumazet
	Use napi_complete_done() instead of __napi_complete() to : 1) Get support of gro_flush_timeout if opt-in 2) Not rearm interrupts for busy-polling users. 3) use standard NAPI API. 4) get rid of baroque code and ease maintenance. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-05	epic100: use napi_complete_done()	Eric Dumazet
	Use napi_complete_done() instead of __napi_complete() to : 1) Get support of gro_flush_timeout if opt-in 2) Not rearm interrupts for busy-polling users. 3) use standard NAPI API. 4) get rid of baroque code and ease maintenance. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-05	8139cp: use napi_complete_done()	Eric Dumazet
	Use napi_complete_done() instead of __napi_complete() to : 1) Get support of gro_flush_timeout if opt-in 2) Not rearm interrupts for busy-polling users. 3) use standard NAPI API. 4) Eventually get rid of napi_gro_flush() in the future. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-05	8139too: use napi_complete_done()	Eric Dumazet
	Use napi_complete_done() instead of __napi_complete() to : 1) Get support of gro_flush_timeout if opt-in 2) Not rearm interrupts for busy-polling users. 3) use standard NAPI API. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-04	Merge branch 'ipv6-Improve-user-experience-with-multipath-routes'	David S. Miller
	David Ahern says: ==================== net: ipv6: Improve user experience with multipath routes This series closes a couple of gaps between IPv4 and IPv6 with respect to multipath routes: 1. IPv4 allows all nexthops of multipath routes to be deleted using just the prefix and length; IPv6 only deletes the first nexthop for the route if only the prefix and length are given. 2. IPv4 returns multipath routes encoded in the RTA_MULTIPATH attribute. IPv6 returns a series of routes with the same prefix and length - one for each nexthop. This happens for both dumps and notifications. IPv6 does accept RTA_MULTIPATH encoded routes, but installs them as a series of routes. Patch 1 addresses the first item by allowing IPv6 multipath routes to be deleted using just the prefix and length. Patch 2 addresses the second by allowing IPv6 multipath routes to be returned encoded in the RTA_MULTIPATH attribute. Patches 3 and 4 update the RTM_{NEW,DEL}ROUTE notifications to generate 1 notification with RTA_MULTIPATH where applicable. Patch 5 prints IPv6 addresses in compressed format when showing route replace errors. This was noticed while testing REPLACE failures. The end result for multipath routes: 1. Dump - RTA_MULTIPATH used for multipath routes $ ip -6 ro ls vrf red 2001:db8:1::/120 dev eth1 proto kernel metric 256 pref medium 2001:db8:2::/120 dev eth2 proto kernel metric 256 pref medium 2001:db8:200::/120 metric 1024 nexthop via 2001:db8:1::2 dev eth1 weight 1 nexthop via 2001:db8:2::2 dev eth2 weight 1 ... 2. Route Add - one notification with RTA_MULTIPATH attribute $ ip -6 ro add vrf red 2001:db8:200::/120 nexthop via 2001:db8:1::2 nexthop via 2001:db8:2::2 $ ip mon route 2001:db8:200::/120 table red metric 1024 nexthop via 2001:db8:1::2 dev eth1 weight 1 nexthop via 2001:db8:2::2 dev eth2 weight 1 
3. Route Replace - one notification with RTA_MULTIPATH attribute $ ip -6 ro replace vrf red 2001:db8:200::/120 nexthop via 2001:db8:1::16 nexthop via 2001:db8:2::16 $ ip mon route Replaced 2001:db8:200::/120 table red metric 1024 nexthop via 2001:db8:1::16 dev eth1 weight 1 nexthop via 2001:db8:2::16 dev eth2 weight 1 - on a failure after the insertion of the first nexthop (which means the original route has been replaced in the FIB), a notification is sent with the successful nexthops and then the nexthops are deleted with one notification per hop. This is consistent with how it works today except the successful additions are coalesced into 1 notification. 4. Route Delete - delete of entire multipath route using prefix/length; only 1 notification is generated: $ ip -6 ro del vrf red 2001:db8:200::/120 $ ip mon route Deleted 2001:db8:200::/120 table red metric 1024 nexthop via 2001:db8:1::16 dev eth1 weight 1 nexthop via 2001:db8:2::16 dev eth2 weight 1 - if a delete request contains nexthops, one notification is generated per nexthop deleted. This is unavoidable since IPv6 allows a single nexthop to be deleted within a multipath route. 5. Route Appends - IPv6 allows nexthops to be appended to an existing route. In this case one notification is sent for the new route with the append flag set. $ ip -6 ro append vrf red 2001:db8:200::/120 nexthop via 2001:db8:2::20 nexthop via 2001:db8:1::20 $ ip mon route Append 2001:db8:200::/120 table red metric 1024 nexthop via 2001:db8:1::2 dev eth1 weight 1 nexthop via 2001:db8:2::2 dev eth2 weight 1 nexthop via 2001:db8:2::20 dev eth2 weight 1 nexthop via 2001:db8:1::20 dev eth1 weight 1 - on failure of an append, a notification is sent with the route containing all of the nexthops successfully added, and it is followed by delete notifications as the hops are removed, returning the route to its prior state. This is consistent with how it works today except the successful additions are coalesced into 1 notification. 
Addresses some of the inconsistencies also noted by Roopa at netdev0.1: https://www.netdev01.org/docs/prabhu-linux_ipv4_ipv6_inconsistencies_talk_slides.pdf v4 - changed series to do encoding in 1 patch and updating notifications in separate patches to make it easier to review and understand - 1 notification for delete when using prefix/length; 1 notification for append - handle delete of a single nexthop without RTA_MULTIPATH in delete request - updated commit messages and cover letter v3 - removed the need for a user API to opt-in to change. Requiring an API just shifts the difference from same API with different behavior to different API to achieve equivalent behavior - route notifications changed to use RTA_MULTIPATH for add and replace - updated commit messages and cover letter v2 - fixed locking in patch 1 as noted by DaveM - changed user API for patch 2 to require an rtmsg with RTM_F_ALL_NEXTHOPS set in rtm_flags - revamped explanation of patch 2 and cover letter ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-04	net: ipv6: Use compressed IPv6 addresses showing route replace error	David Ahern
	ip6_print_replace_route_err logs an error, with IPv6 addresses in the full format, if a route replace fails. E.g.: IPv6: IPV6: multipath route replace failed (check consistency of installed routes): 2001:0db8:0200:0000:0000:0000:0000:0000 nexthop 2001:0db8:0001:0000:0000:0000:0000:0016 ifi 0 Change the message to dump the addresses in the compressed format. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
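The difference between the full and compressed forms can be reproduced in userspace. The following is a minimal sketch, not the kernel change itself: it uses the standard `inet_pton()`/`inet_ntop()` pair, and `compress_ipv6()` is a hypothetical helper name introduced only for illustration.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not a kernel function): rewrite a textual IPv6
 * address in compressed form. Returns 0 on success, or -1 if the input
 * does not parse or dst is too small to hold the result.
 */
static int compress_ipv6(const char *full, char *dst, size_t dstlen)
{
	struct in6_addr addr;

	assert(full != NULL && dst != NULL);

	/* Parse the text into the 128-bit binary form. */
	if (inet_pton(AF_INET6, full, &addr) != 1)
		return -1;

	/* inet_ntop() emits the shortened "::" notation for IPv6. */
	if (inet_ntop(AF_INET6, &addr, dst, (socklen_t)dstlen) == NULL)
		return -1;

	return 0;
}
```

Passing the two addresses from the error message above with a buffer of `INET6_ADDRSTRLEN` bytes yields `2001:db8:200::` and `2001:db8:1::16`, matching the notation `ip -6 route` uses.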
2017-02-04	net: ipv6: Change notifications for multipath delete to RTA_MULTIPATH	David Ahern
	If an entire multipath route is deleted using prefix and len (without any nexthops), send a single RTM_DELROUTE notification with the full route using RTA_MULTIPATH. This is done by generating the skb before the route delete when all of the sibling routes are still present but sending it after the route has been removed from the FIB. The skip_notify flag is used to tell the lower fib code not to send notifications for the individual nexthop routes. If a route is deleted using RTA_MULTIPATH for any nexthops or a single nexthop entry is deleted, then the nexthops are deleted one at a time with notifications sent as each hop is deleted. This is necessary given that IPv6 allows individual hops within a route to be deleted. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-04	net: ipv6: Change notifications for multipath add to RTA_MULTIPATH	David Ahern
	Change ip6_route_multipath_add to send one notification with the full route encoded with RTA_MULTIPATH instead of a series of individual routes. This is done by adding a skip_notify flag to the nl_info struct. The flag is used to skip sending of the notification in the fib code that actually inserts the route. Once the full route has been added, a notification is generated with all nexthops. ip6_route_multipath_add handles 3 use cases: new routes, route replace, and route append. The multipath notification generated needs to be consistent with the order of the nexthops and it should be consistent with the order in a FIB dump which means the route with the first nexthop needs to be used as the route reference. For the first 2 cases (new and replace), a reference to the route used to send the notification is obtained by saving the first route added. For the append case, the last route added is used to loop back to its first sibling route which is the first nexthop in the multipath route. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-04	net: ipv6: Add support to dump multipath routes via RTA_MULTIPATH attribute	David Ahern
	IPv6 returns multipath routes as a series of individual routes making their display and handling by userspace different and more complicated than IPv4, putting the burden on the user to see that a route is part of a multipath route and internally creating a multipath route if desired (e.g., libnl does this as of commit 29b71371e764). This patch addresses this difference, allowing multipath routes to be returned using the RTA_MULTIPATH attribute. The end result is that IPv6 multipath routes can be treated and displayed in a format similar to IPv4: $ ip -6 ro ls vrf red 2001:db8:1::/120 dev eth1 proto kernel metric 256 pref medium 2001:db8:2::/120 dev eth2 proto kernel metric 256 pref medium 2001:db8:200::/120 metric 1024 nexthop via 2001:db8:1::2 dev eth1 weight 1 nexthop via 2001:db8:2::2 dev eth2 weight 1 Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-04	net: ipv6: Allow shorthand delete of all nexthops in multipath route	David Ahern
	IPv4 allows multipath routes to be deleted using just the prefix and length. For example: $ ip ro ls vrf red unreachable default metric 8192 1.1.1.0/24 nexthop via 10.100.1.254 dev eth1 weight 1 nexthop via 10.11.200.2 dev eth11.200 weight 1 10.11.200.0/24 dev eth11.200 proto kernel scope link src 10.11.200.3 10.100.1.0/24 dev eth1 proto kernel scope link src 10.100.1.3 $ ip ro del 1.1.1.0/24 vrf red $ ip ro ls vrf red unreachable default metric 8192 10.11.200.0/24 dev eth11.200 proto kernel scope link src 10.11.200.3 10.100.1.0/24 dev eth1 proto kernel scope link src 10.100.1.3 The same notation does not work with IPv6 because of how multipath routes are implemented for IPv6. For IPv6 only the first nexthop of a multipath route is deleted if the request contains only a prefix and length. This leads to unnecessary complexity in userspace dealing with IPv6 multipath routes. This patch allows all nexthops to be deleted without specifying each one in the delete request. Internally, this is done by walking the sibling list of the route matching the specifications given (prefix, length, metric, protocol, etc). $ ip -6 ro ls vrf red 2001:db8:1::/120 dev eth1 proto kernel metric 256 pref medium 2001:db8:2::/120 dev eth2 proto kernel metric 256 pref medium 2001:db8:200::/120 via 2001:db8:1::2 dev eth1 metric 1024 pref medium 2001:db8:200::/120 via 2001:db8:2::2 dev eth2 metric 1024 pref medium ... $ ip -6 ro del vrf red 2001:db8:200::/120 $ ip -6 ro ls vrf red 2001:db8:1::/120 dev eth1 proto kernel metric 256 pref medium 2001:db8:2::/120 dev eth2 proto kernel metric 256 pref medium ... Because IPv6 allows individual nexthops to be deleted without deleting the entire route, the ip6_route_multipath_del and non-multipath code path (ip6_route_del) have to be discriminated so that all nexthops are only deleted for the latter case. This is done by making the existing fc_type in fib6_config a u16 and then adding a new u16 field with fc_delete_all_nh as the first bit. 
Suggested-by: Dinesh Dutt <ddutt@cumulusnetworks.com> Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-04	virtio_net: exploit napi_complete_done() return value	Eric Dumazet
	Since commit 364b6055738b ("net: busy-poll: return busypolling status to drivers"), napi_complete_done() returns a boolean that can be used by drivers to conditionally rearm interrupts. This patch changes virtio_net to use this boolean to avoid a bit of overhead for busy-poll users. Jason reports about 1.1% improvement for 1 byte TCP_RR (burst 100). Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Jason Wang <jasowang@redhat.com> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-04	Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue	David S. Miller
	Jeff Kirsher says: ==================== 40GbE Intel Wired LAN Driver Updates 2017-02-03 This series contains updates to i40e/i40evf only. Jake fixes up the driver to not call i40e_vsi_kill_vlan() or i40e_vsi_add_vlan() when the PVID is set or when the VID is less than 1. Cleaned up a check which really is not needed since there is no real reason why we cannot just call i40e_del_mac_all_vlan() directly. Renamed functions to better reflect their actual purpose and how they function in a more clear manner. Bimmy cleans up unused/deprecated macros. Mitch cleans up unused device ids which were intended for use when running Linux VF drivers under Hyper-V, but found to be not needed. Then cleaned up a function that is no longer needed since the client open and close functions were refactored. Adds a sleep without timeout until the reply from the PF driver has been received since the iWARP client cannot continue until the operation has been completed. Tushar Dave fixes an issue seen on SPARC where the use of the 'packed' directive was causing kernel unaligned errors. Alex does a refactor to pull some data off of the stack and store it in the transmit buffer info section of the transmit ring. Alan fixes a bug which was caused by passing a bad register value to the firmware, by refactoring the macro INTRL_USEC_TO_REG into a static inline function. Also added feedback to the user as to the actual interrupt rate limit being used when it differs from the requested limit. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-03	net: skb_needs_check() accepts CHECKSUM_NONE for tx	Eric Dumazet
	My recent change missed the fact that UFO would perform a complete UDP checksum before segmenting in frags. In this case skb->ip_summed is set to CHECKSUM_NONE. We need to add this valid case to skb_needs_check(). Fixes: b2504a5dbef3 ("net: reduce skb_warn_bad_offload() noise") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-03	net: remove support for per driver ndo_busy_poll()	Eric Dumazet
	We added generic support for busy polling in the NAPI layer in linux-4.5. No network driver uses ndo_busy_poll() anymore, so we can get rid of the pointer in struct net_device_ops and its use in sk_busy_loop(). This saves the NETIF_F_BUSY_POLL feature bit. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-03	enic: Remove local ndo_busy_poll() implementation.	David S. Miller
	We do polling generically these days. Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-03	ixgbevf: get rid of custom busy polling code	Eric Dumazet
	In linux-4.5, busy polling was implemented in the core NAPI stack, meaning that all custom implementations can be removed from drivers. Not only do we remove lots of code, we also remove one lock operation in the fast path, and allow GRO to do its job. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Acked-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-03	ixgbe: get rid of custom busy polling code	Eric Dumazet
	In linux-4.5, busy polling was implemented in the core NAPI stack, meaning that all custom implementations can be removed from drivers. Not only do we remove lots of code, we also remove one lock operation in the fast path, and allow GRO to do its job. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Acked-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-03	Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next	David S. Miller
	Pablo Neira Ayuso says: ==================== Netfilter updates for net-next The following patchset contains Netfilter updates for your net-next tree, they are: 1) Stash ctinfo 3-bit field into pointer to nf_conntrack object from sk_buff so we only access one single cacheline in the conntrack hotpath. Patchset from Florian Westphal. 2) Don't leak pointer to internal structures when exporting x_tables ruleset back to userspace, from Willem DeBruijn. This includes new helper functions to copy data to userspace such as xt_data_to_user() as well as conversions of our ip_tables, ip6_tables and arp_tables clients to use it. Not surprisingly, ebtables requires an ad-hoc update. There is also a new field in x_tables extensions to indicate the number of bytes that we copy to userspace. 3) Add nf_log_all_netns sysctl: This new knob allows you to enable logging via nf_log infrastructure for all existing netnamespaces. Given the effort to provide pernet syslog has been discontinued, let's provide a way to restore logging using netfilter kernel logging facilities in trusted environments. Patch from Michal Kubecek. 4) Validate SCTP checksum from conntrack helper, from Davide Caratti. 5) Merge UDPlite conntrack and NAT helpers into UDP, this was mostly a copy&paste from the original helper, from Florian Westphal. 6) Reset netfilter state when duplicating packets, also from Florian. 7) Remove unnecessary check for broadcast in IPv6 in pkttype match and nft_meta, from Liping Zhang. 8) Add missing code to deal with loopback packets from nft_meta when used by the netdev family, also from Liping. 9) Several cleanups on nf_tables, one to remove unnecessary check from the netlink control plane path to add table, set and stateful objects and code consolidation when unregistering chain hooks, from Gao Feng. 10) Fix harmless reference counter underflow in IPVS that, however, results in problems with the introduction of the new refcount_t type, from David Windsor. 
11) Enable LIBCRC32C from nf_ct_sctp instead of nf_nat_sctp, from Davide Caratti. 12) Missing documentation on nf_tables uapi header, from Liping Zhang. 13) Use rb_entry() helper in xt_connlimit, from Geliang Tang. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
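Item 1 above relies on pointer tagging: an object aligned to 8 bytes has three zero low bits, so a 3-bit value can share the same word as the pointer. The following is a minimal userspace sketch of that general technique. `struct conn`, `pack_conn()`, `unpack_conn()` and `unpack_info()` are hypothetical names, not the netfilter implementation.

```c
#include <assert.h>
#include <stdint.h>

#define INFO_MASK ((uintptr_t)7) /* low three bits carry the info value */
#define PTR_MASK  (~INFO_MASK)   /* remaining bits carry the pointer */

/* Hypothetical tracked object; forced to 8-byte alignment (C11). */
struct conn {
	_Alignas(8) int id;
};

/* Combine an 8-byte-aligned pointer and a 3-bit value into one word. */
static uintptr_t pack_conn(const struct conn *c, unsigned int info)
{
	uintptr_t p = (uintptr_t)c;

	assert((p & INFO_MASK) == 0); /* alignment leaves the low bits free */
	assert(info <= INFO_MASK);    /* value must fit in three bits */
	return p | info;
}

/* Recover the pointer by masking off the low bits. */
static struct conn *unpack_conn(uintptr_t packed)
{
	return (struct conn *)(packed & PTR_MASK);
}

/* Recover the 3-bit value from the low bits. */
static unsigned int unpack_info(uintptr_t packed)
{
	return (unsigned int)(packed & INFO_MASK);
}
```

Reading one word then yields both the object pointer and the small state value, which is the property the cover letter credits for touching a single cacheline.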
2017-02-03	Merge branch 'mlxsw-Introduce-TC-Flower-offload-using-TCAM'	David S. Miller
	Jiri Pirko says: ==================== mlxsw: Introduce TC Flower offload using TCAM This patchset introduces support for offloading TC cls_flower and actions to the Spectrum TCAM-based policy engine. The patchset contains patches to allow work with flexible keys and actions which are used in Spectrum TCAM. It also contains in-driver infrastructure for offloading TC rules to TCAM HW. The TCAM management code is simple and limited for now. It is going to be extended as a follow-up work. The last patch uses the previously introduced infra to implement cls_flower offloading. Initially, only a limited set of match-keys and only drop and forward actions are supported. As a dependency, this patchset introduces parman - priority array area manager - as a library. v1->v2: - patch11: - use __set_bit and __test_and_clear_bit as suggested by DaveM - patch16: - Added documentation to the API functions as suggested by Tom Herbert - patch17: - use __set_bit and __clear_bit as suggested by DaveM ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-03	mlxsw: spectrum: Implement TC flower offload	Jiri Pirko
	Extend the existing setup_tc ndo call to allow offloading of cls_flower rules. Only a limited set of dissector keys and actions is supported for now. Use the previously introduced ACL infrastructure to offload cls_flower rules to be processed in the HW. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-03	sched: cls_flower: expose priority to offloading netdevice	Jiri Pirko
	The driver that offloads flower rules needs to know the priority with which the user inserted the rules. So add this information to the offload struct. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-03	mlxsw: spectrum: Introduce ACL core with simple TCAM implementation	Jiri Pirko
	Add ACL core infrastructure for Spectrum ASIC. This infra provides an abstraction layer over specific HW implementations. There are two basic objects used. One is "rule" and the second is "ruleset" which serves as a container of multiple rules. In general, within one ruleset the rules are allowed to have multiple priorities and masks. Each ruleset is bound to either ingress or egress of a port netdevice. The initial TCAM implementation is very simple and limited. It utilizes parman lsort manager to take care of TCAM region layout. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-03	lib: Introduce priority array area manager	Jiri Pirko
	This introduces an infrastructure for management of linear priority areas. Priority order in an array matters, however order of items inside a priority group does not matter. As an initial implementation, L-sort algorithm is used. It is quite trivial. More advanced algorithm called P-sort will be introduced as a follow-up. The infrastructure is prepared for other algos. Alongside this, a testing module is introduced as well. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
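Because order inside a priority group is irrelevant, a free slot can be opened for a new item by relocating one item per later group instead of shifting every element. The following sketch shows that idea in userspace. It is an assumption about how a linear scheme of this kind can work, not a transcription of the library, and `lsort_insert()` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical sketch: prio[] holds one priority value per occupied
 * slot, grouped and sorted in ascending order. Insert new_prio while
 * keeping the groups contiguous, and return how many existing items
 * had to be moved.
 */
static size_t lsort_insert(int *prio, size_t *count, size_t cap, int new_prio)
{
	size_t hole;
	size_t moves = 0;

	assert(prio != NULL && count != NULL && *count < cap);
	hole = *count; /* free slot just past the last item */

	/* Walk the groups that must stay behind the new item. */
	while (hole > 0 && prio[hole - 1] > new_prio) {
		size_t first = hole - 1;

		/* Find the first item of the group ending at hole - 1. */
		while (first > 0 && prio[first - 1] == prio[hole - 1])
			first--;

		/* Its first item jumps to the group's end; hole moves up. */
		prio[hole] = prio[first];
		hole = first;
		moves++;
	}

	prio[hole] = new_prio;
	(*count)++;
	return moves;
}
```

Inserting priority 2 into `{1, 1, 3, 3, 3, 5}` moves two items (one from the 5 group, one from the 3 group) and leaves `{1, 1, 2, 3, 3, 3, 5}`, so the cost grows with the number of later groups rather than the number of items.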
2017-02-03	list: introduce list_for_each_entry_from_reverse helper	Jiri Pirko
	Similar to list_for_each_entry_continue and its reverse variant list_for_each_entry_continue_reverse, introduce reverse helper for list_for_each_entry_from. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
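A reverse "from" iterator starts at the current entry (inclusive) and walks backwards until it reaches the list head. The following is a userspace sketch of how such a helper can be built. The minimal `list_head` plumbing mimics the kernel's style, the macros rely on the GCC/Clang `__typeof__` extension, and `struct item` and `sum_from_reverse()` are hypothetical demo names.

```c
#include <assert.h>
#include <stddef.h>

struct list_head {
	struct list_head *next, *prev;
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

#define list_prev_entry(pos, member) \
	container_of((pos)->member.prev, __typeof__(*(pos)), member)

/* Iterate backwards, starting at pos itself, until the head is reached. */
#define list_for_each_entry_from_reverse(pos, head, member) \
	for (; &(pos)->member != (head); pos = list_prev_entry(pos, member))

static void list_init(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

static void list_add_tail(struct list_head *n, struct list_head *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

struct item {
	int val;
	struct list_head node;
};

/* Hypothetical demo: sum the values from start back to the list head. */
static int sum_from_reverse(struct item *start, struct list_head *head)
{
	struct item *pos = start;
	int sum = 0;

	assert(start != NULL && head != NULL);
	list_for_each_entry_from_reverse(pos, head, node)
		sum += pos->val;
	return sum;
}
```

With items 1, 2, 3, 4 appended in that order, starting at the third item visits 3, 2, 1 and sums to 6. The difference from a "continue" variant is that the starting entry itself is visited.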
2017-02-03	mlxsw: resources: Add ACL related resources	Jiri Pirko
	Add a couple of resource limits related to ACL. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-03	mlxsw: spectrum: Introduce basic set of flexible key blocks	Jiri Pirko
	Introduce basic set of Spectrum flexible key blocks. It contains blocks needed to carry all elements defined so far. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-03	mlxsw: core: Introduce flexible actions support	Jiri Pirko
	Each entry which is matched during ACL lookup points to an action set. This action set contains up to three separate actions. If more actions need to be chained, an extended set is created to hold them in the KVD linear area. This patch implements handling of sets and encoding of actions. Currently, only two actions are supported: drop and forward. The forward action uses a PBS pointer to the KVD linear area, so the action code needs to take care of this as well. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-03	mlxsw: core: Introduce flexible keys support	Jiri Pirko
	Hardware supports matching on so-called "flexible keys". The idea is to assemble an optimal key to use for matching according to the fields in the packet (elements) requested by the user. Certain sets of elements are combined into pre-defined blocks. There is a picker to find the needed blocks. Keys consist of 1..n blocks. Alongside that, an initial portion of elements is introduced in order to be able to offload basic cls_flower rules. Picked keys are cached so multiple rules can share them. There is an encode function provided that takes care of encoding key and mask values according to the given key. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-03	mlxsw: reg: Add Policy-Engine Extended Flexible Action Register	Jiri Pirko
	PEFA register is used for accessing an extended flexible action entry in the central KVD Linear Database. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-03	mlxsw: reg: Add Policy-Engine Policy Based Switching Register	Jiri Pirko
	The PPBS register retrieves and sets Policy Based Switching Table entries. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-03	mlxsw: reg: Add Policy-Engine Rules Copy Register	Jiri Pirko
	The PRCR register is used for accessing rules within a TCAM region. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-03	mlxsw: reg: Add Policy-Engine Port Binding Table	Jiri Pirko
	The PPBT is used for configuration of the Port Binding Table. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-03	mlxsw: reg: Add Policy-Engine TCAM Entry Register Version 2	Jiri Pirko
	The PTCE-V2 register is used for accessing rules within a TCAM region. It is a new version of PTCE in order to support wider key, mask and action within a TCAM region. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-03	mlxsw: reg: Add Policy-Engine TCAM Allocation Register	Jiri Pirko
	The PTAR register is used for allocation of regions in the TCAM. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-03	mlxsw: reg: Add Policy-Engine ACL Group Table register	Jiri Pirko
	The PAGT register is used for configuration of the ACL Group Table. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-03	mlxsw: reg: Add Policy-Engine ACL Register	Jiri Pirko
	The PACL register is used for configuration of the ACL. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-03	mlxsw: item: Add helpers for getting pointer into payload for char buffer item	Jiri Pirko
	Sometimes it is handy to get a pointer to a char buffer item and use it directly to write/read data. So add these helpers. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-03	mlxsw: item: Add 8bit item helpers	Jiri Pirko
	Item helpers for 8bit values are needed, let's add them. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-03	bonding: Remove unnecessary returned value check	Zhu Yanjun
	The function bond_info_query always returns 0. As such, in the function bond_do_ioctl, it is not necessary to check the returned value. So the return type of the function bond_info_query is changed to void, and the redundant check is removed. Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-03	tcp: clear pfmemalloc on outgoing skb	Eric Dumazet
	Josef Bacik diagnosed the following problem: I was seeing random disconnects while testing NBD over loopback. This turned out to be because NBD sets pfmemalloc on its socket, however the receiving side is a user space application so does not have pfmemalloc set on its socket. This means that sk_filter_trim_cap will simply drop this packet, under the assumption that the other side will simply retransmit. Well we do retransmit, and then the packet is just dropped again for the same reason. It seems the better way to address this problem is to clear pfmemalloc in the TCP transmit path. pfmemalloc strict control really makes sense on the receive path. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Josef Bacik <jbacik@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>