path: root/net
Age | Commit message | Author
2017-04-03 | batman-adv: Group ethtool related code together | Sven Eckelmann
The ethtool code was spread across soft-interface.c, which made reading and working on the code unnecessarily complicated. Having everything in a common place, next to the other code that references it, makes this slightly easier. Signed-off-by: Sven Eckelmann <sven@narfation.org> Acked-by: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2017-04-03 | batman-adv: Remove ethtool .get_settings stub | Sven Eckelmann
The .get_settings function pointer and the related API were deprecated. Fortunately, batman-adv is a virtual interface and never provided any useful information via .get_settings. The stub can therefore be removed. This also prevents ethtool from showing incorrect information about the batadv interface. Signed-off-by: Sven Eckelmann <sven@narfation.org> Acked-by: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2017-04-03 | batman-adv: Remove ethtool msglevel functions | Sven Eckelmann
batadv devices don't support msglevel, so the ethtool stubs reported that it isn't supported. Instead, the functions can be dropped completely so that ethtool doesn't show bogus values. Signed-off-by: Sven Eckelmann <sven@narfation.org> Acked-by: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2017-04-03 | batman-adv: Use ethtool helper to get link status | Sven Eckelmann
The ethtool_ops of batman-adv never contained more than a stub for the get_link function pointer. It always reported that a link exists, even when the device was not yet up and therefore nothing resembling a link could have been available. Instead, use the ethtool helper which returns the current carrier state. Signed-off-by: Sven Eckelmann <sven@narfation.org> Acked-by: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2017-04-02 | rds: tcp: canonical connection order for all paths with index > 0 | Sowmini Varadhan
rds_connect_worker() has a bug in the check that enforces the canonical connection order described in the comments of rds_tcp_state_change(). The intention is to make sure that all the multipath connections are always initiated by the smaller IP address via rds_start_mprds. To achieve this, rds_connect_worker() should check that cp_index > 0. Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-02 | rds: tcp: allow progress of rds_conn_shutdown if the rds_connection is marked ERROR by an intervening FIN | Sowmini Varadhan
rds_conn_shutdown() runs in workq context, and marks the rds_connection as DISCONNECTING before quiescing Tx/Rx paths. However, after all I/O has quiesced, we may still find the rds_connection state to be RDS_CONN_ERROR if an intervening FIN was processed in softirq context. This is not a fatal error: rds_conn_shutdown() should continue the shutdown, and there is no need to log noisy messages about this event. Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-02 | make skb_copy_datagram_msg() et.al. preserve ->msg_iter on error | Al Viro
Fixes the mess observed in e.g. rsync over a noisy link we'd been seeing since last Summer. What happens is that we copy part of a datagram before noticing a checksum mismatch. Datagram will be resent, all right, but we want the next try to go into the same place, not after it... All this family of primitives (copy/checksum and copy a datagram into destination) is "all or nothing" sort of interface - either we get 0 (meaning that copy had been successful) or we get an error (and no way to tell how much had been copied before we ran into whatever error it had been). Make all of them leave iterator unadvanced in case of errors - all callers must be able to cope with that (an error might've been caught before the iterator had been advanced), it costs very little to arrange, it's safer for callers and actually fixes at least one bug in said callers. Cc: stable@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
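The "leave the iterator unadvanced on error" rule can be illustrated with a small userspace sketch. Everything here is hypothetical: `struct toy_iter` is only a stand-in for a destination iterator, and `toy_copy_and_csum()` is not a kernel function. It shows the snapshot-and-rewind idea, not how the kernel primitives are actually implemented.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a destination iterator:
 * a buffer plus a cursor that advances as data is copied in. */
struct toy_iter {
    char   *buf;
    size_t  len;   /* total capacity of buf */
    size_t  pos;   /* bytes already filled  */
};

/* Copy n bytes into the iterator while summing them, then compare the
 * sum with the expected checksum. On any error the cursor is rewound to
 * its position on entry, so a retry writes to the same place. */
int toy_copy_and_csum(struct toy_iter *it, const unsigned char *src,
                      size_t n, unsigned int expected_sum)
{
    size_t saved_pos = it->pos;   /* snapshot taken before touching it */
    unsigned int sum = 0;
    size_t i;

    if (n > it->len - it->pos)
        return -EFAULT;           /* nothing copied, nothing to rewind */

    for (i = 0; i < n; i++) {
        it->buf[it->pos++] = (char)src[i];
        sum += src[i];
    }

    if (sum != expected_sum) {
        it->pos = saved_pos;      /* all or nothing: undo the advance */
        return -EINVAL;
    }
    return 0;
}
```

A caller that retries after `-EINVAL` therefore overwrites the partially copied bytes instead of appending after them.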
2017-04-01 | net: mpls: Increase max number of labels for lwt encap | David Ahern
Allow users to push down more labels per MPLS encap. Similar to the LSR case, move the label array to the end of mpls_iptunnel_encap and allocate based on the number of labels for the route. For consistency with the LSR case, re-use the same maximum number of labels. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-01 | net: mpls: bump maximum number of labels | David Ahern
Allow users to push down more labels per MPLS route. With the previous patches, no memory allocations are based on MAX_NEW_LABELS; the limit is only used to keep userspace in check. At this point MAX_NEW_LABELS is only used for mpls_route_config (copying route data from userspace) and processing nexthops looking for the max number of labels across the route spec. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-01 | net: mpls: Limit memory allocation for mpls_route | David Ahern
Limit memory allocation size for mpls_route to 4096. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-01 | net: mpls: change mpls_route layout | David Ahern
Move labels to the end of mpls_nh as a 0-sized array and within mpls_route move the via for a nexthop after the mpls_nh. The new layout becomes:

    +----------------------+
    | mpls_route           |
    +----------------------+
    | mpls_nh 0            |
    +----------------------+
    | alignment padding    |   4 bytes for odd number of labels; 0 for even
    +----------------------+
    | via[rt_max_alen] 0   |
    +----------------------+
    | alignment padding    |   via's aligned on sizeof(unsigned long)
    +----------------------+
    | ...                  |
    +----------------------+
    | mpls_nh n-1          |
    +----------------------+
    | via[rt_max_alen] n-1 |
    +----------------------+

Memory allocated for nexthop + via is constant across all nexthops and their via. It is based on the maximum number of labels across all nexthops and the maximum via length. The size is saved in the mpls_route as rt_nh_size. Accessing a nexthop becomes rt->rt_nh + index * rt->rt_nh_size. The offset of the via address from a nexthop is saved as rt_via_offset so that given an mpls_nh pointer the via for that hop is simply nh + rt->rt_via_offset.

With prior code, memory allocated per mpls_route with 1 nexthop:

    via is an ethernet address - 64 bytes
    via is an ipv4 address     - 64
    via is an ipv6 address     - 72

With this patch set, memory allocated per mpls_route with 1 nexthop and 1 or 2 labels:

    via is an ethernet address - 56 bytes
    via is an ipv4 address     - 56
    via is an ipv6 address     - 64

The 8-byte reduction is due to the previous patch; the change introduced by this patch has no impact on the size of allocations for 1 or 2 labels.

Performance impact of this change was examined using network namespaces with veth pairs connecting namespaces. ns0 inserts the packet to the label-switched path using an lwt route with encap mpls. ns1 adds 1 or 2 labels depending on test, ns2 (and ns3 for 2-label test) pops the label and forwards. ns3 (or ns4) for a 2-label is the destination. Similar series of namespaces used for 2-nexthop test.

Intent is to measure changes to latency (overhead in manipulating the packet) in the forwarding path. Tests used netperf with UDP_RR.

    IPv4:                   current   patches
    1 label, 1 nexthop        29908     30115
    2 label, 1 nexthop        29071     29612
    1 label, 2 nexthop        29582     29776
    2 label, 2 nexthop        29086     29149

    IPv6:                   current   patches
    1 label, 1 nexthop        24502     24960
    2 label, 1 nexthop        24041     24407
    1 label, 2 nexthop        23795     23899
    2 label, 2 nexthop        23074     22959

In short, the change has no effect to a modest increase in performance. This is expected since this patch does not really have an impact on routes with 1 or 2 labels (the current limit) and 1 or 2 nexthops.

Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
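The address arithmetic described in this commit (fixed-size nexthop slots, with the via stored at a fixed offset inside each slot) can be sketched in userspace C. The `toy_*` structures and helpers below are hypothetical simplifications, not the kernel's `mpls_route` and `mpls_nh`; only the field names `rt_nh_size` and `rt_via_offset` and the two access formulas come from the commit message.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Round x up to a multiple of a (assumption: a > 0). */
#define TOY_ALIGN(x, a) ((((x) + (a) - 1) / (a)) * (a))

/* Hypothetical nexthop: labels live at the end as a flexible array. */
struct toy_nh {
    uint8_t  nh_labels;
    uint8_t  nh_via_alen;
    uint32_t nh_label[];
};

/* Hypothetical route: header followed by rt_nhn equally sized slots. */
struct toy_route {
    uint8_t  rt_nhn;
    uint8_t  rt_max_alen;
    uint8_t  rt_nh_size;     /* bytes per nexthop slot (nh + via)    */
    uint8_t  rt_via_offset;  /* offset of the via within a slot      */
    unsigned long rt_nh[];   /* slot storage, unsigned long aligned  */
};

struct toy_route *toy_route_alloc(uint8_t num_nh, uint8_t max_labels,
                                  uint8_t max_alen)
{
    /* The via starts after the labels, aligned on sizeof(unsigned long). */
    size_t via_offset = TOY_ALIGN(sizeof(struct toy_nh) +
                                  max_labels * sizeof(uint32_t),
                                  sizeof(unsigned long));
    /* Each slot is padded so that the next nexthop is aligned as well. */
    size_t nh_size = TOY_ALIGN(via_offset + max_alen, sizeof(unsigned long));
    struct toy_route *rt;

    if (nh_size > UINT8_MAX)
        return NULL;
    rt = calloc(1, sizeof(*rt) + (size_t)num_nh * nh_size);
    if (!rt)
        return NULL;
    rt->rt_nhn = num_nh;
    rt->rt_max_alen = max_alen;
    rt->rt_nh_size = (uint8_t)nh_size;
    rt->rt_via_offset = (uint8_t)via_offset;
    return rt;
}

/* rt->rt_nh + index * rt->rt_nh_size, in byte units. */
struct toy_nh *toy_get_nh(struct toy_route *rt, unsigned int index)
{
    return (struct toy_nh *)((unsigned char *)rt->rt_nh +
                             (size_t)index * rt->rt_nh_size);
}

/* nh + rt->rt_via_offset, in byte units. */
uint8_t *toy_nh_via(struct toy_route *rt, struct toy_nh *nh)
{
    return (uint8_t *)nh + rt->rt_via_offset;
}
```

Because every slot has the same size, looking up nexthop `i` is a single multiply and add, independent of how many labels each nexthop actually carries.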
2017-04-01 | net: mpls: Convert number of nexthops to u8 | David Ahern
Number of nexthops and number of alive nexthops are tracked using an unsigned int. A route should never have more than 255 nexthops so convert both to u8. Update all references and intermediate variables to consistently use u8 as well. Shrinks the size of mpls_route from 32 bytes to 24 bytes with a 2-byte hole before the nexthops. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-01 | net: mpls: rt_nhn_alive and nh_flags should be accessed using READ_ONCE | David Ahern
The number of alive nexthops for a route (rt->rt_nhn_alive) and the flags for a next hop (nh->nh_flags) are modified by netdev event handlers. The event handlers run with rtnl_lock held so updates are always done with the lock held. The packet path accesses the fields under the rcu lock. Since those fields can change at any moment in the packet path, both fields should be accessed using READ_ONCE. Updates to both fields should use WRITE_ONCE. Update mpls_select_multipath (packet path) and mpls_ifdown and mpls_ifup (event handlers) accordingly. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
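Why a single READ_ONCE-style load matters in the packet path can be shown with a userspace approximation. The macros below are simplified assumptions built on a volatile cast and the GCC/Clang `__typeof__` extension; they are not the kernel's definitions. `struct toy_mp_route` and both functions are hypothetical.

```c
/* Simplified userspace approximations (assumption: GCC or Clang). */
#define TOY_READ_ONCE(x)     (*(const volatile __typeof__(x) *)&(x))
#define TOY_WRITE_ONCE(x, v) (*(volatile __typeof__(x) *)&(x) = (v))

/* Hypothetical route with a count of usable nexthops. */
struct toy_mp_route {
    unsigned char rt_nhn;        /* total nexthops           */
    unsigned char rt_nhn_alive;  /* nexthops currently alive */
};

/* Reader side (packet path): load the counter exactly once into a local,
 * so the zero check and the modulo operate on the same value even if a
 * writer changes the field in between. */
int toy_select_nexthop(struct toy_mp_route *rt, unsigned int hash)
{
    unsigned char alive = TOY_READ_ONCE(rt->rt_nhn_alive);

    if (alive == 0)
        return -1;               /* no usable nexthop */
    return (int)(hash % alive);
}

/* Writer side (event handler, serialized by a lock in real code):
 * publish the new count with a single store. */
void toy_ifdown(struct toy_mp_route *rt)
{
    unsigned char alive = rt->rt_nhn_alive;

    if (alive > 0)
        TOY_WRITE_ONCE(rt->rt_nhn_alive, (unsigned char)(alive - 1));
}
```

Without the local copy, a compiler would be free to reload `rt_nhn_alive` after the zero check, and a concurrent decrement to zero could then turn the modulo into a division by zero.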
2017-04-01 | l2tp: take a reference on sessions used in genetlink handlers | Guillaume Nault
Callers of l2tp_nl_session_find() need to hold a reference on the returned session since there's no guarantee that it isn't going to disappear from under them. Relying on the fact that no l2tp netlink message may be processed concurrently isn't enough: sessions can be deleted by other means (e.g. by closing the PPPOL2TP socket of a ppp pseudowire). l2tp_nl_cmd_session_delete() is a bit special: it runs a callback function that may require a previous call to session->ref(). In particular, for ppp pseudowires, the callback is l2tp_session_delete(), which then calls pppol2tp_session_close() and dereferences the PPPOL2TP socket. The socket might already be gone at the moment l2tp_session_delete() calls session->ref(), so we need to take a reference during the session lookup. So we need to pass the do_ref variable down to l2tp_session_get() and l2tp_session_get_by_ifname(). Since all callers have to be updated, l2tp_session_find_by_ifname() and l2tp_nl_session_find() are renamed to reflect their new behaviour. Fixes: 309795f4bec2 ("l2tp: Add netlink control API for L2TP") Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-01 | l2tp: hold session while sending creation notifications | Guillaume Nault
l2tp_session_find() doesn't take any reference on the returned session. Therefore, the session may disappear while sending the notification. Use l2tp_session_get() instead and decrement session's refcount once the notification is sent. Fixes: 33f72e6f0c67 ("l2tp : multicast notification to the registered listeners") Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-01 | l2tp: fix duplicate session creation | Guillaume Nault
l2tp_session_create() relies on its caller for checking for duplicate sessions. This is racy since a session can be concurrently inserted after the caller's verification. Fix this by letting l2tp_session_create() verify session uniqueness upon insertion. Callers need to be adapted to check l2tp_session_create()'s return code instead of calling l2tp_session_find(). pppol2tp_connect() is a bit special because it has to work on existing sessions (if they're not connected) or create a new session if none is found. When acting on a preexisting session, a reference must be held or it could go away on us. So we have to use l2tp_session_get() instead of l2tp_session_find() and drop the reference before exiting. Fixes: d9e31d17ceba ("l2tp: Add L2TP ethernet pseudowire support") Fixes: fd558d186df2 ("l2tp: Split pppol2tp patch into separate l2tp and ppp parts") Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-01 | l2tp: ensure session can't get removed during pppol2tp_session_ioctl() | Guillaume Nault
Holding a reference on session is required before calling pppol2tp_session_ioctl(). The session could get freed while processing the ioctl otherwise. Since pppol2tp_session_ioctl() uses the session's socket, we also need to take a reference on it in l2tp_session_get(). Fixes: fd558d186df2 ("l2tp: Split pppol2tp patch into separate l2tp and ppp parts") Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-01 | l2tp: fix race in l2tp_recv_common() | Guillaume Nault
Taking a reference on sessions in l2tp_recv_common() is racy; this has to be done by the callers. To this end, a new function is required (l2tp_session_get()) to atomically lookup a session and take a reference on it. Callers then have to manually drop this reference. Fixes: fd558d186df2 ("l2tp: Split pppol2tp patch into separate l2tp and ppp parts") Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
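The pattern these l2tp fixes converge on (look up and take a reference in one step, reject duplicates at insertion time, drop the reference when done) can be sketched with a toy session table. All names below are hypothetical and the code is single-threaded; in concurrent code the lookup and the refcount increment would have to sit inside the same lock or RCU section, which is only marked by comments here.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_MAX_SESSIONS 8

/* Hypothetical, simplified session object; not the kernel's l2tp_session. */
struct toy_session {
    unsigned int id;
    int refcnt;
};

static struct toy_session *toy_table[TOY_MAX_SESSIONS];

/* Drop one reference; free the session when the last one goes away. */
void toy_session_put(struct toy_session *s)
{
    if (--s->refcnt == 0)
        free(s);
}

/* Look up a session and take a reference before returning it. */
struct toy_session *toy_session_get(unsigned int id)
{
    size_t i;

    /* (lock would be taken here) */
    for (i = 0; i < TOY_MAX_SESSIONS; i++) {
        if (toy_table[i] && toy_table[i]->id == id) {
            toy_table[i]->refcnt++;
            return toy_table[i];
        }
    }
    /* (lock would be released here) */
    return NULL;
}

/* Insert a session, checking for duplicates as part of the insertion
 * itself. The table holds the initial reference. */
int toy_session_create(unsigned int id)
{
    size_t i, free_slot = TOY_MAX_SESSIONS;
    struct toy_session *s;

    for (i = 0; i < TOY_MAX_SESSIONS; i++) {
        if (!toy_table[i]) {
            if (free_slot == TOY_MAX_SESSIONS)
                free_slot = i;
        } else if (toy_table[i]->id == id) {
            return -EEXIST;
        }
    }
    if (free_slot == TOY_MAX_SESSIONS)
        return -ENOSPC;
    s = malloc(sizeof(*s));
    if (!s)
        return -ENOMEM;
    s->id = id;
    s->refcnt = 1;
    toy_table[free_slot] = s;
    return 0;
}

/* Unlink the session and drop the table's reference. Holders of other
 * references keep a valid object until they call toy_session_put(). */
void toy_session_delete(unsigned int id)
{
    size_t i;

    for (i = 0; i < TOY_MAX_SESSIONS; i++) {
        if (toy_table[i] && toy_table[i]->id == id) {
            struct toy_session *s = toy_table[i];

            toy_table[i] = NULL;
            toy_session_put(s);
            return;
        }
    }
}
```

A caller that obtained the session through `toy_session_get()` can keep using it after a concurrent delete, because the object is only freed by the final `toy_session_put()`.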
2017-04-01 | sctp: use right in and out stream cnt | Xin Long
Since stream reconf was added to sctp, the real in/out stream counts are no longer c.sinit_max_instreams and c.sinit_num_ostreams. This patch replaces them with stream->in/outcnt. Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-01 | net: dsa: fix build error with devlink build as module | Tobias Regnery
After commit 96567d5dacf4 ("net: dsa: dsa2: Add basic support of devlink") I see the following link error with CONFIG_NET_DSA=y and CONFIG_NET_DEVLINK=m:

    net/built-in.o: In function 'dsa_register_switch':
    (.text+0xe226b): undefined reference to `devlink_alloc'
    net/built-in.o: In function 'dsa_register_switch':
    (.text+0xe2284): undefined reference to `devlink_register'
    net/built-in.o: In function 'dsa_register_switch':
    (.text+0xe243e): undefined reference to `devlink_port_register'
    net/built-in.o: In function 'dsa_register_switch':
    (.text+0xe24e1): undefined reference to `devlink_port_register'
    net/built-in.o: In function 'dsa_register_switch':
    (.text+0xe24fa): undefined reference to `devlink_port_type_eth_set'
    net/built-in.o: In function 'dsa_dst_unapply.part.8':
    dsa2.c:(.text.unlikely+0x345): undefined reference to 'devlink_port_unregister'
    dsa2.c:(.text.unlikely+0x36c): undefined reference to 'devlink_port_unregister'
    dsa2.c:(.text.unlikely+0x38e): undefined reference to 'devlink_port_unregister'
    dsa2.c:(.text.unlikely+0x3f2): undefined reference to 'devlink_unregister'
    dsa2.c:(.text.unlikely+0x3fb): undefined reference to 'devlink_free'

Fix this by adding a dependency on MAY_USE_DEVLINK so that CONFIG_NET_DSA gets switched to being built as a module when CONFIG_NET_DEVLINK=m.

Fixes: 96567d5dacf4 ("net: dsa: dsa2: Add basic support of devlink")
Signed-off-by: Tobias Regnery <tobias.regnery@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-01 | Merge tag 'mac80211-for-davem-2017-03-31' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211 | David S. Miller
Johannes Berg says:
====================
Two fixes:
 * don't block netdev queues (indefinitely!) if mac80211 manages traffic queueing itself
 * check wiphy registration before checking for ops on resume, to avoid crash
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-01 | bpf: introduce BPF_PROG_TEST_RUN command | Alexei Starovoitov
Development and testing of networking bpf programs is quite cumbersome. Despite availability of user space bpf interpreters the kernel is the ultimate authority and execution environment. Current test frameworks for TC include creation of netns, veth, qdiscs and use of various packet generators just to test functionality of a bpf program. XDP testing is even more complicated, since qemu needs to be started with gro/gso disabled and precise queue configuration, transferring of xdp program from host into guest, attaching to virtio/eth0 and generating traffic from the host while capturing the results from the guest. Moreover, analyzing performance bottlenecks in an XDP program is impossible in a virtio environment, since the cost of running the program is tiny compared to the overhead of virtio packet processing, so performance testing can only be done on a physical nic with another server generating traffic. Furthermore, ongoing changes to the user space control plane of production applications cannot be run on the test servers, leaving bpf programs stubbed out for testing. Last but not least, the upstream llvm changes are validated by the bpf backend testsuite which has no ability to test the code generated. To improve this situation introduce BPF_PROG_TEST_RUN command to test and performance benchmark bpf programs. Joint work with Daniel Borkmann. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-01 | net: dsa: add cross-chip bridging operations | Vivien Didelot
Introduce crosschip_bridge_{join,leave} operations in the dsa_switch_ops structure, which can be used by switches supporting interconnection. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-01 | openvswitch: Fix ovs_flow_key_update() | Yi-Hung Wei
ovs_flow_key_update() is called when the flow key is invalid, and it is used to update and revalidate the flow key. Commit 329f45bc4f19 ("openvswitch: add mac_proto field to the flow key") introduced the mac_proto field to the flow key and uses it to determine whether the flow key is valid. However, the commit did not update the code path in ovs_flow_key_update() to revalidate the flow key, which may cause a BUG_ON() in execute_recirc(). This patch addresses the aforementioned issue. Fixes: 329f45bc4f19 ("openvswitch: add mac_proto field to the flow key") Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com> Acked-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-01 | Merge tag 'nfsd-4.11-1' of git://linux-nfs.org/~bfields/linux | Linus Torvalds
Pull nfsd fixes from Bruce Fields: "The restriction of NFSv4 to TCP went overboard and also broke the backchannel; fix. Also some minor refinements to the nfsd version-setting interface that we'd like to get fixed before release"

* tag 'nfsd-4.11-1' of git://linux-nfs.org/~bfields/linux:
  svcrdma: set XPT_CONG_CTRL flag for bc xprt
  NFSD: fix nfsd_reset_versions for NFSv4.
  NFSD: fix nfsd_minorversion(.., NFSD_AVAIL)
  NFSD: further refinement of content of /proc/fs/nfsd/versions
  nfsd: map the ENOKEY to nfserr_perm for avoiding warning
  SUNRPC/backchanel: set XPT_CONG_CTRL flag for bc xprt
2017-03-30 | sock: avoid dirtying sk_stamp, if possible | Paolo Abeni
sock_recv_ts_and_drops() unconditionally sets sk->sk_stamp for every packet, even if the SOCK_TIMESTAMP flag is not set in the related socket. If selinux is enabled, this causes a cache miss for every packet since sk->sk_stamp and sk->sk_security share the same cacheline. With this change sk_stamp is set only if the SOCK_TIMESTAMP flag is set, and is cleared for the first packet, so that the user perceived behavior is unchanged. This gives up to 5% speed-up under udp-flood with small packets. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-30 | net: tcp: Refine the __tcp_select_window | Gao Feng
1. Move the "window = tp->rcv_wnd;" assignment into the condition block for the case without tp->rx_opt.rcv_wscale, because it is unnecessary when wscale is enabled.
2. Use the ALIGN macro instead of two statements. The two statements are used to align the window to 1<<wscale; using ALIGN is clearer.
3. Use rounddown to make the code clearer.
Signed-off-by: Gao Feng <fgao@ikuai8.com> Signed-off-by: David S. Miller <davem@davemloft.net>
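What the two helpers named in this commit compute can be shown with userspace equivalents. The macros below follow the usual power-of-two round-up and modulo round-down formulas; they are written here as assumptions for illustration and are not copied from the kernel headers. The two wrapper functions and their parameter names are hypothetical.

```c
/* Round x up to a multiple of a; a must be a power of two. */
#define TOY_ALIGN(x, a)      (((x) + ((a) - 1)) & ~((a) - 1))

/* Round x down to a multiple of y; y may be any non-zero value. */
#define TOY_ROUNDDOWN(x, y)  ((x) - ((x) % (y)))

/* Align a window to 1 << wscale, so that shifting it right by wscale
 * and back left again loses nothing. */
unsigned int toy_align_window(unsigned int window, unsigned int wscale)
{
    return TOY_ALIGN(window, 1u << wscale);
}

/* Trim the available space down to a whole number of segments. */
unsigned int toy_window_in_mss(unsigned int free_space, unsigned int mss)
{
    return TOY_ROUNDDOWN(free_space, mss);
}
```

For example, with a window scale of 3 a window of 1001 is rounded up to 1008, and 4000 bytes of free space with an MSS of 1460 is trimmed to 2920.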
2017-03-30 | sctp: alloc stream info when initializing asoc | Xin Long
When sending a msg without asoc established, sctp will send INIT packet first and then enqueue chunks. Before receiving INIT_ACK, stream info is not yet alloced. But enqueuing chunks needs to access stream info, like out stream state and out stream cnt. This patch is to fix it by allocing out stream info when initializing an asoc, allocing in stream and re-allocing out stream when processing init. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-30 | VSOCK: remove unnecessary ternary operator on return value | Colin Ian King
Rather than assigning the positive errno value to ret, then checking whether it is positive and flipping the sign, just return the errno value. Detected by CoverityScan, CID#986649 ("Logically Dead Code") Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Jorgen Hansen <jhansen@vmware.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-30 | drivers: add explicit interrupt.h includes | Florian Westphal
These files all use functions declared in interrupt.h, but currently rely on implicit inclusion of this file (via netns/xfrm.h). That won't work anymore when the flow cache is removed so include that header where needed. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-30 | net/packet: fix overflow in check for tp_reserve | Andrey Konovalov
When calculating po->tp_hdrlen + po->tp_reserve the result can overflow. Fix by checking that tp_reserve <= INT_MAX on assign. Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
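The idea of rejecting an oversized value at assignment time, so that a later unsigned sum cannot wrap, can be sketched as follows. The function names are hypothetical and the code only mirrors the check described in the commit message (tp_reserve <= INT_MAX); it is not the kernel's setsockopt handler.

```c
#include <errno.h>
#include <limits.h>

/* Returns 1 if hdrlen + reserve wraps around as unsigned int. */
int toy_sum_wraps(unsigned int hdrlen, unsigned int reserve)
{
    return hdrlen + reserve < hdrlen;
}

/* Validate the value when it is assigned: with reserve capped at
 * INT_MAX and a small header length, the later sum fits in an
 * unsigned int. */
int toy_set_reserve(unsigned int *tp_reserve, unsigned int val)
{
    if (val > INT_MAX)
        return -EINVAL;
    *tp_reserve = val;
    return 0;
}
```

Checking once at the point of assignment keeps every later use of the stored value simple, instead of guarding each arithmetic expression separately.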
2017-03-30 | net/packet: fix overflow in check for tp_frame_nr | Andrey Konovalov
When calculating rb->frames_per_block * req->tp_block_nr the result can overflow. Add a check that tp_block_size * tp_block_nr <= UINT_MAX. Since frames_per_block <= tp_block_size, the expression would never overflow. Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
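A product bound such as tp_block_size * tp_block_nr <= UINT_MAX is usually checked with a division, so that the check itself cannot overflow. The sketch below uses a hypothetical function name and fixed-width types for determinism; it illustrates the technique rather than reproducing the kernel code.

```c
#include <errno.h>
#include <stdint.h>

/* Accept the ring geometry only if block_size * block_nr fits in 32 bits.
 * The comparison is done with a division, which cannot overflow. */
int toy_check_blocks(uint32_t block_size, uint32_t block_nr)
{
    if (block_size == 0)
        return -EINVAL;
    if (block_nr > UINT32_MAX / block_size)
        return -EINVAL;
    return 0;
}
```

Once the product is known to fit, any quantity that is at most block_size (such as the number of frames per block) multiplied by block_nr fits as well, which is the argument the commit message makes for frames_per_block.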
2017-03-30 | net/packet: fix overflow in check for priv area size | Andrey Konovalov
Subtracting tp_sizeof_priv from tp_block_size and casting to int to check whether one is less than the other doesn't always work (both of them are unsigned ints). Compare them as is instead. Also cast tp_sizeof_priv to u64 before using BLK_PLUS_PRIV, as it can overflow inside BLK_PLUS_PRIV otherwise. Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
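The pitfall of "subtract, cast to int, compare with zero" on unsigned operands can be reproduced with two small predicates. Both functions are hypothetical simplifications (they ignore the BLK_PLUS_PRIV header overhead mentioned in the commit) and return 1 when the request would be rejected.

```c
#include <stdint.h>

/* Old-style check: subtract two unsigned values and look at the sign of
 * the result after a cast. A very large sizeof_priv makes the unsigned
 * difference wrap to a small positive number, so the request passes. */
int toy_priv_rejected_old(uint32_t block_size, uint32_t sizeof_priv)
{
    return (int32_t)(block_size - sizeof_priv) <= 0;
}

/* Direct comparison of the unsigned values: no wraparound involved. */
int toy_priv_rejected_new(uint32_t block_size, uint32_t sizeof_priv)
{
    return block_size <= sizeof_priv;
}
```

With block_size = 4096 and sizeof_priv = 0xFFFFFFF0, the unsigned difference is 4112, so the old check accepts a private area far larger than the block, while the direct comparison rejects it.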
2017-03-29 | Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf | David S. Miller
Pablo Neira Ayuso says:
====================
Netfilter fixes for net

The following patchset contains a rather large update with Netfilter fixes, specifically targeted to incorrect RCU usage in several spots and the userspace conntrack helper infrastructure (nfnetlink_cthelper), more specifically they are:

1) expect_class_max is incorrectly set via cthelper, as in-kernel semantics mandate that this represents the size of the array of expectation classes minus 1. Patch from Liping Zhang.
2) Expectation policy updates via cthelper are currently broken for several reasons: this code allows illegal changes in the policy such as changing the number of expectation classes, it is leaking the updated policy, and such an update occurs with no RCU protection at all. Fix this by adding a new nfnl_cthelper_update_policy() that describes what is really legal on the update path.
3) Fix several memory leaks in cthelper, from Jeffy Chen.
4) synchronize_rcu() is missing in the removal path of several modules; this may lead to races since a CPU may still be running code that has just gone away. Also from Liping Zhang.
5) Don't use the helper hashtable from cthelper, it is not safe to walk over those bits without the helper mutex. Fix this by introducing a new independent list for userspace helpers. From Liping Zhang.
6) nf_ct_extend_unregister() needs synchronize_rcu() to make sure no packets are walking on any conntrack extension that is gone after module removal, again from Liping.
7) nf_nat_snmp may crash if we fail to unregister the helper due to accidental leftover code, from Gao Feng.
8) Fix leak in nfnetlink_queue with secctx support, from Liping Zhang.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-29 | tipc: allow rdm/dgram socketpairs | Erik Hugne
For socketpairs using connectionless transport, we cache the respective node-local TIPC portid in the socket's private data, for use in subsequent calls to send(). Signed-off-by: Erik Hugne <erik.hugne@gmail.com> Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-29 | tipc: add support for stream/seqpacket socketpairs | Erik Hugne
Sockets A and B are connected back-to-back, similar to what AF_UNIX does. Signed-off-by: Erik Hugne <erik.hugne@gmail.com> Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-29 | Merge branch 'apw' (xfrm_user fixes) | Linus Torvalds
Merge xfrm_user validation fixes from Andy Whitcroft: "Two patches we are applying to Ubuntu for XFRM_MSG_NEWAE validation issue reported by ZDI. The first of these is the primary fix, and the second is for a more theoretical issue that Kees pointed out when reviewing the first"

* emailed patches from Andy Whitcroft <apw@canonical.com>:
  xfrm_user: validate XFRM_MSG_NEWAE incoming ESN size harder
  xfrm_user: validate XFRM_MSG_NEWAE XFRMA_REPLAY_ESN_VAL replay_window
2017-03-29 | net: mpls: Update lfib_nlmsg_size to skip deleted nexthops | David Ahern
A recent commit skips nexthops in a route if the device has been deleted. Update lfib_nlmsg_size accordingly. Reported-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com> Acked-by: Robert Shearman <rshearma@brocade.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-29 | l2tp: purge socket queues in the .destruct() callback | Guillaume Nault
The Rx path may grab the socket right before pppol2tp_release(), but nothing guarantees that it will enqueue packets before skb_queue_purge(). Therefore, the socket can be destroyed without its queues fully purged. Fix this by purging queues in pppol2tp_session_destruct() where we're guaranteed nothing is still referencing the socket. Fixes: 9e9cb6221aa7 ("l2tp: fix userspace reception on plain L2TP sockets") Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-29 | l2tp: hold tunnel socket when handling control frames in l2tp_ip and l2tp_ip6 | Guillaume Nault
The code following l2tp_tunnel_find() expects that a new reference is held on sk. Either sk_receive_skb() or the discard_put error path will drop a reference from the tunnel's socket. This issue exists in both l2tp_ip and l2tp_ip6. Fixes: a3c18422a4b4 ("l2tp: hold socket before dropping lock in l2tp_ip{, 6}_recv()") Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-29 | xfrm_user: validate XFRM_MSG_NEWAE incoming ESN size harder | Andy Whitcroft
Kees Cook has pointed out that xfrm_replay_state_esn_len() is subject to wrapping issues. To ensure that the two ESN structures are the same size, check that both the overall size as reported by xfrm_replay_state_esn_len() and the internal length are the same. CVE-2017-7184 Signed-off-by: Andy Whitcroft <apw@canonical.com> Acked-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-03-29 | xfrm_user: validate XFRM_MSG_NEWAE XFRMA_REPLAY_ESN_VAL replay_window | Andy Whitcroft
When a new xfrm state is created during an XFRM_MSG_NEWSA call we validate the user supplied replay_esn to ensure that the size is valid and to ensure that the replay_window size is within the allocated buffer. However, it is later possible to update this replay_esn via an XFRM_MSG_NEWAE call. There we again validate that the size of the supplied buffer matches the existing state and if so inject the contents. We do not at this point check that the replay_window is within the allocated memory. This leads to out-of-bounds reads and writes triggered by netlink packets, and in turn to memory corruption and the potential for privilege escalation. We already attempt to validate the incoming replay information in xfrm_new_ae() via xfrm_replay_verify_len(). This confirms that the user is not trying to change the size of the replay state buffer which includes the replay_esn. It does not, however, check that the replay_window remains within that buffer. Add validation of the contained replay_window. CVE-2017-7184 Signed-off-by: Andy Whitcroft <apw@canonical.com> Acked-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
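The two-part validation described here (the update must not change the buffer size, and the window must fit inside the bitmap) can be sketched on a toy structure. `struct toy_replay_esn` and `toy_verify_replay()` are hypothetical; the assumption that the bitmap is an array of 32-bit words, so that it holds `bmp_len * 32` bits, is made for this illustration only.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical, simplified replay state: a bitmap of bmp_len 32-bit
 * words and a window size expressed in bits. */
struct toy_replay_esn {
    uint32_t bmp_len;
    uint32_t replay_window;
    uint32_t bmp[];
};

/* Validate an update 'up' against the current state 'cur'. */
int toy_verify_replay(const struct toy_replay_esn *cur,
                      const struct toy_replay_esn *up)
{
    /* The update must not change the size of the allocated buffer. */
    if (up->bmp_len != cur->bmp_len)
        return -EINVAL;

    /* The window must fit inside the bitmap. The product is computed
     * in 64 bits so that it cannot wrap. */
    if (up->replay_window > (uint64_t)up->bmp_len * 32u)
        return -EINVAL;

    return 0;
}
```

The first check alone corresponds to the size validation that already existed; the second is the kind of bound on the contained replay_window that the commit adds.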
2017-03-29 | mac80211: unconditionally start new netdev queues with iTXQ support | Johannes Berg
When internal mac80211 TXQs are supported, netdev queues must always start out started, even when driver queues are stopped while the interface is added. This is necessary because with the internal TXQ support netdev queues are never stopped and packet scheduling/dropping is done in mac80211. Cc: stable@vger.kernel.org # 4.9+ Fixes: 80a83cfc434b1 ("mac80211: skip netdev queue control with software queuing") Reported-and-tested-by: Sven Eckelmann <sven.eckelmann@openmesh.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2017-03-29 | netfilter: nfnetlink_queue: fix secctx memory leak | Liping Zhang
We must call security_release_secctx to free the memory returned by security_secid_to_secctx, otherwise memory may be leaked forever. Fixes: ef493bd930ae ("netfilter: nfnetlink_queue: add security context information") Signed-off-by: Liping Zhang <zlpnobody@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-03-29cfg80211: check rdev resume callback only for registered wiphyArend Van Spriel
We got the following use-after-free KASAN report:

BUG: KASAN: use-after-free in wiphy_resume+0x591/0x5a0 [cfg80211] at addr ffff8803fc244090
Read of size 8 by task kworker/u16:24/2587
CPU: 6 PID: 2587 Comm: kworker/u16:24 Tainted: G B 4.9.13-debug+
Hardware name: Dell Inc. XPS 15 9550/0N7TVV, BIOS 1.2.19 12/22/2016
Workqueue: events_unbound async_run_entry_fn
 ffff880425d4f9d8 ffffffffaeedb541 ffff88042b80ef00 ffff8803fc244088
 ffff880425d4fa00 ffffffffae84d7a1 ffff880425d4fa98 ffff8803fc244080
 ffff88042b80ef00 ffff880425d4fa88 ffffffffae84da3a ffffffffc141f7d9
Call Trace:
 [<ffffffffaeedb541>] dump_stack+0x85/0xc4
 [<ffffffffae84d7a1>] kasan_object_err+0x21/0x70
 [<ffffffffae84da3a>] kasan_report_error+0x1fa/0x500
 [<ffffffffc141f7d9>] ? cfg80211_bss_age+0x39/0xc0 [cfg80211]
 [<ffffffffc141f83a>] ? cfg80211_bss_age+0x9a/0xc0 [cfg80211]
 [<ffffffffae48d46d>] ? trace_hardirqs_on+0xd/0x10
 [<ffffffffc13fb1c0>] ? wiphy_suspend+0xc70/0xc70 [cfg80211]
 [<ffffffffae84def1>] __asan_report_load8_noabort+0x61/0x70
 [<ffffffffc13fb100>] ? wiphy_suspend+0xbb0/0xc70 [cfg80211]
 [<ffffffffc13fb751>] ? wiphy_resume+0x591/0x5a0 [cfg80211]
 [<ffffffffc13fb751>] wiphy_resume+0x591/0x5a0 [cfg80211]
 [<ffffffffc13fb1c0>] ? wiphy_suspend+0xc70/0xc70 [cfg80211]
 [<ffffffffaf3b206e>] dpm_run_callback+0x6e/0x4f0
 [<ffffffffaf3b31b2>] device_resume+0x1c2/0x670
 [<ffffffffaf3b367d>] async_resume+0x1d/0x50
 [<ffffffffae3ee84e>] async_run_entry_fn+0xfe/0x610
 [<ffffffffae3d0666>] process_one_work+0x716/0x1a50
 [<ffffffffae3d05c9>] ? process_one_work+0x679/0x1a50
 [<ffffffffafdd7b6d>] ? _raw_spin_unlock_irq+0x3d/0x60
 [<ffffffffae3cff50>] ? pwq_dec_nr_in_flight+0x2b0/0x2b0
 [<ffffffffae3d1a80>] worker_thread+0xe0/0x1460
 [<ffffffffae3d19a0>] ? process_one_work+0x1a50/0x1a50
 [<ffffffffae3e54c2>] kthread+0x222/0x2e0
 [<ffffffffae3e52a0>] ? kthread_park+0x80/0x80
 [<ffffffffae3e52a0>] ? kthread_park+0x80/0x80
 [<ffffffffae3e52a0>] ? kthread_park+0x80/0x80
 [<ffffffffafdd86aa>] ret_from_fork+0x2a/0x40
Object at ffff8803fc244088, in cache kmalloc-1024 size: 1024
Allocated:
PID = 71
 save_stack_trace+0x1b/0x20
 save_stack+0x46/0xd0
 kasan_kmalloc+0xad/0xe0
 kasan_slab_alloc+0x12/0x20
 __kmalloc_track_caller+0x134/0x360
 kmemdup+0x20/0x50
 brcmf_cfg80211_attach+0x10b/0x3a90 [brcmfmac]
 brcmf_bus_start+0x19a/0x9a0 [brcmfmac]
 brcmf_pcie_setup+0x1f1a/0x3680 [brcmfmac]
 brcmf_fw_request_nvram_done+0x44c/0x11b0 [brcmfmac]
 request_firmware_work_func+0x135/0x280
 process_one_work+0x716/0x1a50
 worker_thread+0xe0/0x1460
 kthread+0x222/0x2e0
 ret_from_fork+0x2a/0x40
Freed:
PID = 2568
 save_stack_trace+0x1b/0x20
 save_stack+0x46/0xd0
 kasan_slab_free+0x71/0xb0
 kfree+0xe8/0x2e0
 brcmf_cfg80211_detach+0x62/0xf0 [brcmfmac]
 brcmf_detach+0x14a/0x2b0 [brcmfmac]
 brcmf_pcie_remove+0x140/0x5d0 [brcmfmac]
 brcmf_pcie_pm_leave_D3+0x198/0x2e0 [brcmfmac]
 pci_pm_resume+0x186/0x220
 dpm_run_callback+0x6e/0x4f0
 device_resume+0x1c2/0x670
 async_resume+0x1d/0x50
 async_run_entry_fn+0xfe/0x610
 process_one_work+0x716/0x1a50
 worker_thread+0xe0/0x1460
 kthread+0x222/0x2e0
 ret_from_fork+0x2a/0x40
Memory state around the buggy address:
 ffff8803fc243f80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 ffff8803fc244000: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>ffff8803fc244080: fc fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                         ^
 ffff8803fc244100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff8803fc244180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb

What is happening is that brcmf_pcie_resume() detects a device that is no longer responsive and it decides to unbind resulting in a wiphy_unregister() and wiphy_free() call. Now the wiphy instance remains allocated, because PM needs to call wiphy_resume() for it. However, brcmfmac already does a kfree() for the struct cfg80211_registered_device::ops field.
Change the checks in wiphy_resume() to only access the struct cfg80211_registered_device::ops if the wiphy instance is still registered at this time. Cc: stable@vger.kernel.org # 4.10.x, 4.9.x Reported-by: Daniel J Blueman <daniel@quora.org> Reviewed-by: Hante Meuleman <hante.meuleman@broadcom.com> Reviewed-by: Pieter-Paul Giesberts <pieter-paul.giesberts@broadcom.com> Reviewed-by: Franky Lin <franky.lin@broadcom.com> Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
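A minimal sketch of the guard described above, assuming simplified types: `struct mock_rdev`, `struct mock_ops`, `driver_resume()` and `mock_wiphy_resume()` are hypothetical and only mirror the shape of the check, not the actual cfg80211 code. The `registered` flag is tested first, so the `ops` pointer is never dereferenced once the device has been unregistered.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the cfg80211 structures. */
struct mock_ops {
	int (*resume)(void);
};

struct mock_rdev {
	bool registered;            /* cleared when the wiphy is unregistered */
	const struct mock_ops *ops; /* may be stale after unregistration */
};

static int resume_calls; /* counts how often the driver callback ran */

static int driver_resume(void)
{
	resume_calls++;
	return 0;
}

/* Only touch ops while the device is still registered. The && operator
 * short-circuits, so ops is not read when registered is false. */
static int mock_wiphy_resume(struct mock_rdev *rdev)
{
	int ret = 0;

	if (rdev->registered && rdev->ops->resume)
		ret = rdev->ops->resume();

	return ret;
}
```

In this sketch an unregistered device carries a NULL `ops` pointer to stand in for the freed memory; the call still returns 0 without invoking the callback.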
2017-03-28net: dsa: dsa2: Add basic support of devlinkAndrew Lunn
Register the switch and its ports with devlink. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Tested-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-28net: break include loop netdevice.h, dsa.h, devlink.hAndrew Lunn
There is an include loop between netdevice.h, dsa.h and devlink.h because of NETDEV_ALIGN, making it impossible to use devlink structures in dsa.h. Break this loop by taking dsa.h out of netdevice.h and adding a forward declaration of dsa_switch_tree and of the netdev_set_default_ethtool_ops() function, which is what netdevice.h requires. With dsa.h no longer included from netdevice.h, the headers that dsa.h includes are no longer pulled in indirectly. This breaks a few other files which depend on these includes. Add them directly in the affected files. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
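The forward-declaration technique used to break the loop can be illustrated in a single file. The names `mock_switch_tree`, `mock_net_device` and `tree_id_of()` are hypothetical; the sketch only shows that a structure holding a pointer needs the pointed-to type declared, not defined, so the header defining that type does not have to be included.

```c
#include <assert.h>
#include <stddef.h>

/* "netdevice.h" side (hypothetical): only a pointer is stored, so a
 * forward declaration is enough and no other header is needed. */
struct mock_switch_tree;

struct mock_net_device {
	const char *name;
	struct mock_switch_tree *dsa_ptr; /* pointer to an incomplete type */
};

/* "dsa.h" side (hypothetical): the full definition lives here and is
 * free to refer back to the device structure. */
struct mock_switch_tree {
	int tree_id;
	struct mock_net_device *master;
};

/* Code that dereferences the pointer must see the full definition,
 * which is why files relying on the old indirect include need their
 * own include after such a change. */
static int tree_id_of(const struct mock_net_device *dev)
{
	return dev->dsa_ptr ? dev->dsa_ptr->tree_id : -1;
}
```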
2017-03-28net: mpls: Send netconf messages on device register and unregisterDavid Ahern
Send netconf notifications for MPLS when the device registers and unregisters. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-28net:mpls: Refactor mpls_netconf_notify_devconf to take eventDavid Ahern
Refactor mpls_netconf_notify_devconf to take the event as an input arg. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-28net: ipv6: Add support for RTM_DELNETCONFDavid Ahern
Send RTM_DELNETCONF notifications when a device is deleted. The message only needs the device index, so modify inet6_netconf_fill_devconf to skip devconf references if it is NULL. Allows a userspace cache to remove entries as devices are deleted. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
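The idea of a fill routine that takes the event as an argument and tolerates a NULL devconf can be sketched as follows. All names here (`mock_event`, `mock_devconf`, `mock_msg`, `mock_fill_devconf()`) are hypothetical and the message layout is invented for illustration; this is not the netlink encoding used by inet6_netconf_fill_devconf.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical event types standing in for the NEW/DEL netconf messages. */
enum mock_event { MOCK_NEWNETCONF, MOCK_DELNETCONF };

struct mock_devconf {
	int forwarding;
};

/* Invented flat message: the device index is always present, the
 * per-device attribute only when a devconf was supplied. */
struct mock_msg {
	enum mock_event event;
	int ifindex;
	bool has_forwarding;
	int forwarding;
};

static void mock_fill_devconf(struct mock_msg *msg, int ifindex,
			      const struct mock_devconf *devconf,
			      enum mock_event event)
{
	msg->event = event;
	msg->ifindex = ifindex;
	msg->has_forwarding = false;
	msg->forwarding = 0;

	/* A delete notification only needs the index, so skip every
	 * devconf reference when none is passed. */
	if (!devconf)
		return;

	msg->has_forwarding = true;
	msg->forwarding = devconf->forwarding;
}
```

A receiver keeping a per-device cache could drop the entry for `ifindex` when it sees the delete event, without needing any attribute beyond the index.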