linux.git - Unnamed repository; edit this file 'description' to name the repository.

Age	Commit message (Collapse)	Author
2021-01-08	ppp: fix refcount underflow on channel unbridge	Tom Parkin
	When setting up a channel bridge, ppp_bridge_channels sets the pch->bridge field before taking the associated reference on the bridge file instance. This opens up a refcount underflow bug if ppp_bridge_channels called via. iotcl runs concurrently with ppp_unbridge_channels executing via. file release. The bug is triggered by ppp_bridge_channels taking the error path through the 'err_unset' label. In this scenario, pch->bridge is set, but the reference on the bridged channel will not be taken because the function errors out. If ppp_unbridge_channels observes pch->bridge before it is unset by the error path, it will erroneously drop the reference on the bridged channel and cause a refcount underflow. To avoid this, ensure that ppp_bridge_channels holds a reference on each channel in advance of setting the bridge pointers. Signed-off-by: Tom Parkin <tparkin@katalix.com> Fixes: 4cf476ced45d ("ppp: add PPPIOCBRIDGECHAN and PPPIOCUNBRIDGECHAN ioctls") Acked-by: Guillaume Nault <gnault@redhat.com> Link: https://lore.kernel.org/r/20210107181315.3128-1-tparkin@katalix.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-10	ppp: add PPPIOCBRIDGECHAN and PPPIOCUNBRIDGECHAN ioctls	Tom Parkin
	This new ioctl pair allows two ppp channels to be bridged together: frames arriving in one channel are transmitted in the other channel and vice versa. The practical use for this is primarily to support the L2TP Access Concentrator use-case. The end-user session is presented as a ppp channel (typically PPPoE, although it could be e.g. PPPoA, or even PPP over a serial link) and is switched into a PPPoL2TP session for transmission to the LNS. At the LNS the PPP session is terminated in the ISP's network. When a PPP channel is bridged to another it takes a reference on the other's struct ppp_file. This reference is dropped when the channels are unbridged, which can occur either explicitly on userspace calling the PPPIOCUNBRIDGECHAN ioctl, or implicitly when either channel in the bridge is unregistered. In order to implement the channel bridge, struct channel is extended with a new field, 'bridge', which points to the other struct channel making up the bridge. This pointer is RCU protected to avoid adding another lock to the data path. To guard against concurrent writes to the pointer, the existing struct channel lock 'upl' coverage is extended rather than adding a new lock. The 'upl' lock is used to protect the existing unit pointer. Since the bridge effectively replaces the unit (they're mutually exclusive for a channel) it makes coding easier to use the same lock to cover them both. Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-04	net: partially revert dynamic lockdep key changes	Cong Wang
	This patch reverts the folowing commits: commit 064ff66e2bef84f1153087612032b5b9eab005bd "bonding: add missing netdev_update_lockdep_key()" commit 53d374979ef147ab51f5d632dfe20b14aebeccd0 "net: avoid updating qdisc_xmit_lock_key in netdev_update_lockdep_key()" commit 1f26c0d3d24125992ab0026b0dab16c08df947c7 "net: fix kernel-doc warning in <linux/netdevice.h>" commit ab92d68fc22f9afab480153bd82a20f6e2533769 "net: core: add generic lockdep keys" but keeps the addr_list_lock_key because we still lock addr_list_lock nestedly on stack devices, unlikely xmit_lock this is safe because we don't take addr_list_lock on any fast path. Reported-and-tested-by: syzbot+aaa6fa4949cc5d9b7b25@syzkaller.appspotmail.com Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Taehee Yoo <ap420073@gmail.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-27	ppp: Remove redundant BUG_ON() check in ppp_pernet	Xu Wang
	Passing NULL to ppp_pernet causes a crash via BUG_ON. Dereferencing net in net_generic() also has the same effect. This patch removes the redundant BUG_ON check on the same parameter. Signed-off-by: Xu Wang <vulab@iscas.ac.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-05	ppp: fix out-of-bounds access in bpf_prog_create()	Eric Biggers
	sock_fprog_kern::len is in units of struct sock_filter, not bytes. Fixes: 3e859adf3643 ("compat_ioctl: unify copy-in of ppp filters") Reported-by: syzbot+eb853b51b10f1befa0b7@syzkaller.appspotmail.com Signed-off-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-01	Merge tag 'compat-ioctl-5.5' of ↵	Linus Torvalds
	git://git.kernel.org:/pub/scm/linux/kernel/git/arnd/playground Pull removal of most of fs/compat_ioctl.c from Arnd Bergmann: "As part of the cleanup of some remaining y2038 issues, I came to fs/compat_ioctl.c, which still has a couple of commands that need support for time64_t. In completely unrelated work, I spent time on cleaning up parts of this file in the past, moving things out into drivers instead. After Al Viro reviewed an earlier version of this series and did a lot more of that cleanup, I decided to try to completely eliminate the rest of it and move it all into drivers. This series incorporates some of Al's work and many patches of my own, but in the end stops short of actually removing the last part, which is the scsi ioctl handlers. I have patches for those as well, but they need more testing or possibly a rewrite" * tag 'compat-ioctl-5.5' of git://git.kernel.org:/pub/scm/linux/kernel/git/arnd/playground: (42 commits) scsi: sd: enable compat ioctls for sed-opal pktcdvd: add compat_ioctl handler compat_ioctl: move SG_GET_REQUEST_TABLE handling compat_ioctl: ppp: move simple commands into ppp_generic.c compat_ioctl: handle PPPIOCGIDLE for 64-bit time_t compat_ioctl: move PPPIOCSCOMPRESS to ppp_generic compat_ioctl: unify copy-in of ppp filters tty: handle compat PPP ioctls compat_ioctl: move SIOCOUTQ out of compat_ioctl.c compat_ioctl: handle SIOCOUTQNSD af_unix: add compat_ioctl support compat_ioctl: reimplement SG_IO handling compat_ioctl: move WDIOC handling into wdt drivers fs: compat_ioctl: move FITRIM emulation into file systems gfs2: add compat_ioctl support compat_ioctl: remove unused convert_in_user macro compat_ioctl: remove last RAID handling code compat_ioctl: remove /dev/raw ioctl translation compat_ioctl: remove PCI ioctl translation compat_ioctl: remove joystick ioctl translation ...
2019-10-24	net: core: add generic lockdep keys	Taehee Yoo
	Some interface types could be nested. (VLAN, BONDING, TEAM, MACSEC, MACVLAN, IPVLAN, VIRT_WIFI, VXLAN, etc..) These interface types should set lockdep class because, without lockdep class key, lockdep always warn about unexisting circular locking. In the current code, these interfaces have their own lockdep class keys and these manage itself. So that there are so many duplicate code around the /driver/net and /net/. This patch adds new generic lockdep keys and some helper functions for it. This patch does below changes. a) Add lockdep class keys in struct net_device - qdisc_running, xmit, addr_list, qdisc_busylock - these keys are used as dynamic lockdep key. b) When net_device is being allocated, lockdep keys are registered. - alloc_netdev_mqs() c) When net_device is being free'd llockdep keys are unregistered. - free_netdev() d) Add generic lockdep key helper function - netdev_register_lockdep_key() - netdev_unregister_lockdep_key() - netdev_update_lockdep_key() e) Remove unnecessary generic lockdep macro and functions f) Remove unnecessary lockdep code of each interfaces. After this patch, each interface modules don't need to maintain their lockdep keys. Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-23	compat_ioctl: ppp: move simple commands into ppp_generic.c	Arnd Bergmann
	All ppp commands that are not already handled in ppp_compat_ioctl() are compatible, so they can now handled by calling the native ppp_ioctl() directly. Without CONFIG_BLOCK, the generic compat_ioctl table is now empty, so add a check to avoid a build failure in the looking function for that configuration. Cc: netdev@vger.kernel.org Cc: linux-ppp@vger.kernel.org Cc: Paul Mackerras <paulus@samba.org> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2019-10-23	compat_ioctl: handle PPPIOCGIDLE for 64-bit time_t	Arnd Bergmann
	The ppp_idle structure is defined in terms of __kernel_time_t, which is defined as 'long' on all architectures, and this usage is not affected by the y2038 problem since it transports a time interval rather than an absolute time. However, the ppp user space defines the same structure as time_t, which may be 64-bit wide on new libc versions even on 32-bit architectures. It's easy enough to just handle both possible structure layouts on all architectures, to deal with the possibility that a user space ppp implementation comes with its own ppp_idle structure definition, as well as to document the fact that the driver is y2038-safe. Doing this also avoids the need for a special compat mode translation, since 32-bit and 64-bit kernels now support the same interfaces. The old 32-bit structure is also available on native 64-bit architectures now, but this is harmless. Cc: netdev@vger.kernel.org Cc: linux-ppp@vger.kernel.org Cc: Paul Mackerras <paulus@samba.org> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2019-10-23	compat_ioctl: move PPPIOCSCOMPRESS to ppp_generic	Al Viro
	Rather than using a compat_alloc_user_space() buffer, moving this next to the native handler allows sharing most of the code, leaving only the user copy portion distinct. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Cc: netdev@vger.kernel.org Cc: linux-ppp@vger.kernel.org Cc: Paul Mackerras <paulus@samba.org> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2019-10-23	compat_ioctl: unify copy-in of ppp filters	Al Viro
	Now that isdn4linux is gone, the is only one implementation of PPPIOCSPASS and PPPIOCSACTIVE in ppp_generic.c, so this is where the compat_ioctl support should be implemented. The two commands are implemented in very similar ways, so introduce new helpers to allow sharing between the two and between native and compat mode. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> [arnd: rebased, and added changelog text] Cc: netdev@vger.kernel.org Cc: linux-ppp@vger.kernel.org Cc: Paul Mackerras <paulus@samba.org> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2019-09-25	ppp: Fix memory leak in ppp_write	Takeshi Misawa
	When ppp is closing, __ppp_xmit_process() failed to enqueue skb and skb allocated in ppp_write() is leaked. syzbot reported : BUG: memory leak unreferenced object 0xffff88812a17bc00 (size 224): comm "syz-executor673", pid 6952, jiffies 4294942888 (age 13.040s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<00000000d110fff9>] kmemleak_alloc_recursive include/linux/kmemleak.h:43 [inline] [<00000000d110fff9>] slab_post_alloc_hook mm/slab.h:522 [inline] [<00000000d110fff9>] slab_alloc_node mm/slab.c:3262 [inline] [<00000000d110fff9>] kmem_cache_alloc_node+0x163/0x2f0 mm/slab.c:3574 [<000000002d616113>] __alloc_skb+0x6e/0x210 net/core/skbuff.c:197 [<000000000167fc45>] alloc_skb include/linux/skbuff.h:1055 [inline] [<000000000167fc45>] ppp_write+0x48/0x120 drivers/net/ppp/ppp_generic.c:502 [<000000009ab42c0b>] __vfs_write+0x43/0xa0 fs/read_write.c:494 [<00000000086b2e22>] vfs_write fs/read_write.c:558 [inline] [<00000000086b2e22>] vfs_write+0xee/0x210 fs/read_write.c:542 [<00000000a2b70ef9>] ksys_write+0x7c/0x130 fs/read_write.c:611 [<00000000ce5e0fdd>] __do_sys_write fs/read_write.c:623 [inline] [<00000000ce5e0fdd>] __se_sys_write fs/read_write.c:620 [inline] [<00000000ce5e0fdd>] __x64_sys_write+0x1e/0x30 fs/read_write.c:620 [<00000000d9d7b370>] do_syscall_64+0x76/0x1a0 arch/x86/entry/common.c:296 [<0000000006e6d506>] entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fix this by freeing skb, if ppp is closing. Fixes: 6d066734e9f0 ("ppp: avoid loop in xmit recursion detection code") Reported-and-tested-by: syzbot+d9c8bf24e56416d7ce2c@syzkaller.appspotmail.com Signed-off-by: Takeshi Misawa <jeliantsurux@gmail.com> Reviewed-by: Guillaume Nault <gnault@redhat.com> Tested-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-30	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152	Thomas Gleixner
	Based on 1 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license as published by the free software foundation either version 2 of the license or at your option any later version extracted by the scancode license scanner the SPDX license identifier GPL-2.0-or-later has been chosen to replace the boilerplate/reference in 3029 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Allison Randal <allison@lohutok.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190527070032.746973796@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-12-20	ppp: Move PFC decompression to PPP generic layer	Sam Protsenko
	Extract "Protocol" field decompression code from transport protocols to PPP generic layer, where it actually belongs. As a consequence, this patch fixes incorrect place of PFC decompression in L2TP driver (when it's not PPPOX_BOUND) and also enables this decompression for other protocols, like PPPoE. Protocol field decompression also happens in PPP Multilink Protocol code and in PPP compression protocols implementations (bsd, deflate, mppe). It looks like there is no easy way to get rid of that, so it was decided to leave it as is, but provide those cases with appropriate comments instead. Changes in v2: - Fix the order of checking skb data room and proto decompression - Remove "inline" keyword from ppp_decompress_proto() - Don't split line before function name - Prefix ppp_decompress_proto() function with "__" - Add ppp_decompress_proto() function with skb data room checks - Add description for introduced functions - Fix comments (as per review on mailing list) Signed-off-by: Sam Protsenko <semen.protsenko@linaro.org> Reviewed-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-10	ppp: Remove direct skb_queue_head list pointer access.	David S. Miller
	Add a helper, __skb_peek(), and use it in ppp_mp_reconstruct(). Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-24	ppp: remove the PPPIOCDETACH ioctl	Eric Biggers
	The PPPIOCDETACH ioctl effectively tries to "close" the given ppp file before f_count has reached 0, which is fundamentally a bad idea. It does check 'f_count < 2', which excludes concurrent operations on the file since they would only be possible with a shared fd table, in which case each fdget() would take a file reference. However, it fails to account for the fact that even with 'f_count == 1' the file can still be linked into epoll instances. As reported by syzbot, this can trivially be used to cause a use-after-free. Yet, the only known user of PPPIOCDETACH is pppd versions older than ppp-2.4.2, which was released almost 15 years ago (November 2003). Also, PPPIOCDETACH apparently stopped working reliably at around the same time, when the f_count check was added to the kernel, e.g. see https://lkml.org/lkml/2002/12/31/83. Also, the current 'f_count < 2' check makes PPPIOCDETACH only work in single-threaded applications; it always fails if called from a multithreaded application. All pppd versions released in the last 15 years just close() the file descriptor instead. Therefore, instead of hacking around this bug by exporting epoll internals to modules, and probably missing other related bugs, just remove the PPPIOCDETACH ioctl and see if anyone actually notices. Leave a stub in place that prints a one-time warning and returns EINVAL. Reported-by: syzbot+16363c99d4134717c05b@syzkaller.appspotmail.com Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Eric Biggers <ebiggers@google.com> Acked-by: Paul Mackerras <paulus@ozlabs.org> Reviewed-by: Guillaume Nault <g.nault@alphalink.fr> Tested-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-27	net: Drop pernet_operations::async	Kirill Tkhai
	Synchronous pernet_operations are not allowed anymore. All are asynchronous. So, drop the structure member. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-26	drivers/net: Use octal not symbolic permissions	Joe Perches
	Prefer the direct use of octal for permissions. Done with checkpatch -f --types=SYMBOLIC_PERMS --fix-inplace and some typing. Miscellanea: o Whitespace neatening around these conversions. Signed-off-by: Joe Perches <joe@perches.com> Reviewed-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-23	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	David S. Miller
	Fun set of conflict resolutions here... For the mac80211 stuff, these were fortunately just parallel adds. Trivially resolved. In drivers/net/phy/phy.c we had a bug fix in 'net' that moved the function phy_disable_interrupts() earlier in the file, whilst in 'net-next' the phy_error() call from this function was removed. In net/ipv4/xfrm4_policy.c, David Ahern's changes to remove the 'rt_table_id' member of rtable collided with a bug fix in 'net' that added a new struct member "rt_mtu_locked" which needs to be copied over here. The mlxsw driver conflict consisted of net-next separating the span code and definitions into separate files, whilst a 'net' bug fix made some changes to that moved code. The mlx5 infiniband conflict resolution was quite non-trivial, the RDMA tree's merge commit was used as a guide here, and here are their notes: ==================== Due to bug fixes found by the syzkaller bot and taken into the for-rc branch after development for the 4.17 merge window had already started being taken into the for-next branch, there were fairly non-trivial merge issues that would need to be resolved between the for-rc branch and the for-next branch. This merge resolves those conflicts and provides a unified base upon which ongoing development for 4.17 can be based. Conflicts: drivers/infiniband/hw/mlx5/main.c - Commit 42cea83f9524 (IB/mlx5: Fix cleanup order on unload) added to for-rc and commit b5ca15ad7e61 (IB/mlx5: Add proper representors support) add as part of the devel cycle both needed to modify the init/de-init functions used by mlx5. To support the new representors, the new functions added by the cleanup patch needed to be made non-static, and the init/de-init list added by the representors patch needed to be modified to match the init/de-init list changes made by the cleanup patch. Updates: drivers/infiniband/hw/mlx5/mlx5_ib.h - Update function prototypes added by representors patch to reflect new function names as changed by cleanup patch drivers/infiniband/hw/mlx5/ib_rep.c - Update init/de-init stage list to match new order from cleanup patch ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-22	ppp: avoid loop in xmit recursion detection code	Guillaume Nault
	We already detect situations where a PPP channel sends packets back to its upper PPP device. While this is enough to avoid deadlocking on xmit locks, this doesn't prevent packets from looping between the channel and the unit. The problem is that ppp_start_xmit() enqueues packets in ppp->file.xq before checking for xmit recursion. Therefore, __ppp_xmit_process() might dequeue a packet from ppp->file.xq and send it on the channel which, in turn, loops it back on the unit. Then ppp_start_xmit() queues the packet back to ppp->file.xq and __ppp_xmit_process() picks it up and sends it again through the channel. Therefore, the packet will loop between __ppp_xmit_process() and ppp_start_xmit() until some other part of the xmit path drops it. For L2TP, we rapidly fill the skb's headroom and pppol2tp_xmit() drops the packet after a few iterations. But PPTP reallocates the headroom if necessary, letting the loop run and exhaust the machine resources (as reported in https://bugzilla.kernel.org/show_bug.cgi?id=199109). Fix this by letting __ppp_xmit_process() enqueue the skb to ppp->file.xq, so that we can check for recursion before adding it to the queue. Now ppp_xmit_process() can drop the packet when recursion is detected. __ppp_channel_push() is a bit special. It calls __ppp_xmit_process() without having any actual packet to send. This is used by ppp_output_wakeup() to re-enable transmission on the parent unit (for implementations like ppp_async.c, where the .start_xmit() function might not consume the skb, leaving it in ppp->xmit_pending and disabling transmission). Therefore, __ppp_xmit_process() needs to handle the case where skb is NULL, dequeuing as many packets as possible from ppp->file.xq. Reported-by: xu heng <xuheng333@zoho.com> Fixes: 55454a565836 ("ppp: avoid dealock on recursive xmit") Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-06	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	David S. Miller
	All of the conflicts were cases of overlapping changes. In net/core/devlink.c, we have to make care that the resouce size_params have become a struct member rather than a pointer to such an object. Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-04	ppp: prevent unregistered channels from connecting to PPP units	Guillaume Nault
	PPP units don't hold any reference on the channels connected to it. It is the channel's responsibility to ensure that it disconnects from its unit before being destroyed. In practice, this is ensured by ppp_unregister_channel() disconnecting the channel from the unit before dropping a reference on the channel. However, it is possible for an unregistered channel to connect to a PPP unit: register a channel with ppp_register_net_channel(), attach a /dev/ppp file to it with ioctl(PPPIOCATTCHAN), unregister the channel with ppp_unregister_channel() and finally connect the /dev/ppp file to a PPP unit with ioctl(PPPIOCCONNECT). Once in this situation, the channel is only held by the /dev/ppp file, which can be released at anytime and free the channel without letting the parent PPP unit know. Then the ppp structure ends up with dangling pointers in its ->channels list. Prevent this scenario by forbidding unregistered channels from connecting to PPP units. This maintains the code logic by keeping ppp_unregister_channel() responsible from disconnecting the channel if necessary and avoids modification on the reference counting mechanism. This issue seems to predate git history (successfully reproduced on Linux 2.6.26 and earlier PPP commits are unrelated). Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-27	net: Convert ppp_net_ops	Kirill Tkhai
	These pernet_operations are similar to bond_net_ops. Exit method unregisters all net ppp devices, and it looks like another pernet_operations are not interested in foreign net ppp list. So, it's possible to mark them async. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-11	vfs: do bulk POLL* -> EPOLL* replacement	Linus Torvalds
	This is the mindless scripted replacement of kernel use of POLL* variables as described by Al, done by this script: for V in IN OUT PRI ERR RDNORM RDBAND WRNORM WRBAND HUP RDHUP NVAL MSG; do L=`git grep -l -w POLL$V \| grep -v '^t' \| grep -v /um/ \| grep -v '^sa' \| grep -v '/poll.h$'\|grep -v '^D'` for f in $L; do sed -i "-es/^$[^\"]$$\<POLL$V\>$/\\1E\\2/" $f; done done with de-mangling cleanups yet to come. NOTE! On almost all architectures, the EPOLL constants have the same values as the POLL* constants do. But they keyword here is "almost". For various bad reasons they aren't the same, and epoll() doesn't actually work quite correctly in some cases due to this on Sparc et al. The next patch from Al will sort out the final differences, and we should be all done. Scripted-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-01-30	Merge branch 'misc.poll' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull poll annotations from Al Viro: "This introduces a __bitwise type for POLL### bitmap, and propagates the annotations through the tree. Most of that stuff is as simple as 'make ->poll() instances return __poll_t and do the same to local variables used to hold the future return value'. Some of the obvious brainos found in process are fixed (e.g. POLLIN misspelled as POLL_IN). At that point the amount of sparse warnings is low and most of them are for genuine bugs - e.g. ->poll() instance deciding to return -EINVAL instead of a bitmap. I hadn't touched those in this series - it's large enough as it is. Another problem it has caught was eventpoll() ABI mess; select.c and eventpoll.c assumed that corresponding POLL### and EPOLL### were equal. That's true for some, but not all of them - EPOLL### are arch-independent, but POLL### are not. The last commit in this series separates userland POLL### values from the (now arch-independent) kernel-side ones, converting between them in the few places where they are copied to/from userland. AFAICS, this is the least disruptive fix preserving poll(2) ABI and making epoll() work on all architectures. As it is, it's simply broken on sparc - try to give it EPOLLWRNORM and it will trigger only on what would've triggered EPOLLWRBAND on other architectures. EPOLLWRBAND and EPOLLRDHUP, OTOH, are never triggered at all on sparc. With this patch they should work consistently on all architectures" * 'misc.poll' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (37 commits) make kernel-side POLL... arch-independent eventpoll: no need to mask the result of epi_item_poll() again eventpoll: constify struct epoll_event pointers debugging printk in sg_poll() uses %x to print POLL... bitmap annotate poll(2) guts 9p: untangle ->poll() mess ->si_band gets POLL... bitmap stored into a user-visible long field ring_buffer_poll_wait() return value used as return value of ->poll() the rest of drivers/*: annotate ->poll() instances media: annotate ->poll() instances fs: annotate ->poll() instances ipc, kernel, mm: annotate ->poll() instances net: annotate ->poll() instances apparmor: annotate ->poll() instances tomoyo: annotate ->poll() instances sound: annotate ->poll() instances acpi: annotate ->poll() instances crypto: annotate ->poll() instances block: annotate ->poll() instances x86: annotate ->poll() instances ...
2018-01-15	ppp: unlock all_ppp_mutex before registering device	Guillaume Nault
	ppp_dev_uninit(), which is the .ndo_uninit() handler of PPP devices, needs to lock pn->all_ppp_mutex. Therefore we mustn't call register_netdevice() with pn->all_ppp_mutex already locked, or we'd deadlock in case register_netdevice() fails and calls .ndo_uninit(). Fortunately, we can unlock pn->all_ppp_mutex before calling register_netdevice(). This lock protects pn->units_idr, which isn't used in the device registration process. However, keeping pn->all_ppp_mutex locked during device registration did ensure that no device in transient state would be published in pn->units_idr. In practice, unlocking it before calling register_netdevice() doesn't change this property: ppp_unit_register() is called with 'ppp_mutex' locked and all searches done in pn->units_idr hold this lock too. Fixes: 8cb775bc0a34 ("ppp: fix device unregistration upon netns deletion") Reported-and-tested-by: syzbot+367889b9c9e279219175@syzkaller.appspotmail.com Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-28	the rest of drivers/*: annotate ->poll() instances	Al Viro
	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-11-14	ppp: exit_net cleanup checks added	Vasily Averin
	Be sure that lists initialized in net_init hook were return to initial state. Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-01	ppp: Destroy the mutex when cleanup	Gao Feng
	The mutex_destroy only makes sense when enable DEBUG_MUTEX. For the good readbility, it's better to invoke it in exit func when the init func invokes mutex_init. Signed-off-by: Gao Feng <gfree.wind@vip.163.com> Acked-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-29	ppp: allow usage in namespaces	Matteo Croce
	Check for CAP_NET_ADMIN with ns_capable() instead of capable() to allow usage of ppp in user namespace other than the init one. Signed-off-by: Matteo Croce <mcroce@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-22	drivers, net, ppp: convert ppp_file.refcnt from atomic_t to refcount_t	Elena Reshetova
	atomic_t variables are currently used to implement reference counters with the following properties: - counter is initialized to 1 using atomic_set() - a resource is freed upon counter reaching zero - once counter reaches zero, its further increments aren't allowed - counter schema uses basic atomic operations (set, inc, inc_not_zero, dec_and_test, etc.) Such atomic variables should be converted to a newly provided refcount_t type and API that prevents accidental counter overflows and underflows. This is important since overflows and underflows can lead to use-after-free situation and be exploitable. The variable ppp_file.refcnt is used as pure reference counter. Convert it to refcount_t and fix up the operations. Suggested-by: Kees Cook <keescook@chromium.org> Reviewed-by: David Windsor <dwindsor@gmail.com> Reviewed-by: Hans Liljestrand <ishkamiel@gmail.com> Signed-off-by: Elena Reshetova <elena.reshetova@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-06	ppp: fix race in ppp device destruction	Guillaume Nault
	ppp_release() tries to ensure that netdevices are unregistered before decrementing the unit refcount and running ppp_destroy_interface(). This is all fine as long as the the device is unregistered by ppp_release(): the unregister_netdevice() call, followed by rtnl_unlock(), guarantee that the unregistration process completes before rtnl_unlock() returns. However, the device may be unregistered by other means (like ppp_nl_dellink()). If this happens right before ppp_release() calling rtnl_lock(), then ppp_release() has to wait for the concurrent unregistration code to release the lock. But rtnl_unlock() releases the lock before completing the device unregistration process. This allows ppp_release() to proceed and eventually call ppp_destroy_interface() before the unregistration process completes. Calling free_netdev() on this partially unregistered device will BUG(): ------------[ cut here ]------------ kernel BUG at net/core/dev.c:8141! invalid opcode: 0000 [#1] SMP CPU: 1 PID: 1557 Comm: pppd Not tainted 4.14.0-rc2+ #4 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1.fc26 04/01/2014 Call Trace: ppp_destroy_interface+0xd8/0xe0 [ppp_generic] ppp_disconnect_channel+0xda/0x110 [ppp_generic] ppp_unregister_channel+0x5e/0x110 [ppp_generic] pppox_unbind_sock+0x23/0x30 [pppox] pppoe_connect+0x130/0x440 [pppoe] SYSC_connect+0x98/0x110 ? do_fcntl+0x2c0/0x5d0 SyS_connect+0xe/0x10 entry_SYSCALL_64_fastpath+0x1a/0xa5 RIP: free_netdev+0x107/0x110 RSP: ffffc28a40573d88 ---[ end trace ed294ff0cc40eeff ]--- We could set the ->needs_free_netdev flag on PPP devices and move the ppp_destroy_interface() logic in the ->priv_destructor() callback. But that'd be quite intrusive as we'd first need to unlink from the other channels and units that depend on the device (the ones that used the PPPIOCCONNECT and PPPIOCATTACH ioctls). Instead, we can just let the netdevice hold a reference on its ppp_file. This reference is dropped in ->priv_destructor(), at the very end of the unregistration process, so that neither ppp_release() nor ppp_disconnect_channel() can call ppp_destroy_interface() in the interim. Reported-by: Beniamino Galvani <bgalvani@redhat.com> Fixes: 8cb775bc0a34 ("ppp: fix device unregistration upon netns deletion") Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-01	ppp: fix __percpu annotation	Guillaume Nault
	Move sparse annotation right after pointer type. Fixes sparse warning: drivers/net/ppp/ppp_generic.c:1422:13: warning: incorrect type in initializer (different address spaces) drivers/net/ppp/ppp_generic.c:1422:13: expected void const [noderef] <asn:3>__vpp_verify drivers/net/ppp/ppp_generic.c:1422:13: got int <noident> ... Fixes: e5dadc65f9e0 ("ppp: Fix false xmit recursion detect with two ppp devices") Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-08	ppp: fix xmit recursion detection on ppp channels	Guillaume Nault
	Commit e5dadc65f9e0 ("ppp: Fix false xmit recursion detect with two ppp devices") dropped the xmit_recursion counter incrementation in ppp_channel_push() and relied on ppp_xmit_process() for this task. But __ppp_channel_push() can also send packets directly (using the .start_xmit() channel callback), in which case the xmit_recursion counter isn't incremented anymore. If such packets get routed back to the parent ppp unit, ppp_xmit_process() won't notice the recursion and will call ppp_channel_push() on the same channel, effectively creating the deadlock situation that the xmit_recursion mechanism was supposed to prevent. This patch re-introduces the xmit_recursion counter incrementation in ppp_channel_push(). Since the xmit_recursion variable is now part of the parent ppp unit, incrementation is skipped if the channel doesn't have any. This is fine because only packets routed through the parent unit may enter the channel recursively. Finally, we have to ensure that pch->ppp is not going to be modified while executing ppp_channel_push(). Instead of taking this lock only while calling ppp_xmit_process(), we now have to hold it for the full ppp_channel_push() execution. This respects the ppp locks ordering which requires locking ->upl before ->downl. Fixes: e5dadc65f9e0 ("ppp: Fix false xmit recursion detect with two ppp devices") Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-18	ppp: Fix false xmit recursion detect with two ppp devices	Gao Feng
	The global percpu variable ppp_xmit_recursion is used to detect the ppp xmit recursion to avoid the deadlock, which is caused by one CPU tries to lock the xmit lock twice. But it would report false recursion when one CPU wants to send the skb from two different PPP devices, like one L2TP on the PPPoE. It is a normal case actually. Now use one percpu member of struct ppp instead of the gloable variable to detect the xmit recursion of one ppp device. Fixes: 55454a565836 ("ppp: avoid dealock on recursive xmit") Signed-off-by: Gao Feng <gfree.wind@vip.163.com> Signed-off-by: Liu Jianying <jianying.liu@ikuai8.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-26	net: add netlink_ext_ack argument to rtnl_link_ops.validate	Matthias Schiffer
	Add support for extended error reporting. Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net> Acked-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-26	net: add netlink_ext_ack argument to rtnl_link_ops.newlink	Matthias Schiffer
	Add support for extended error reporting. Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net> Acked-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-16	networking: make skb_push & __skb_push return void pointers	Johannes Berg
	It seems like a historic accident that these return unsigned char , and in many places that means casts are required, more often than not. Make these functions return void and remove all the casts across the tree, adding a (u8 ) cast only where the unsigned char pointer was used directly, all done with the following spatch: @@ expression SKB, LEN; typedef u8; identifier fn = { skb_push, __skb_push, skb_push_rcsum }; @@ - (fn(SKB, LEN)) + (u8 )fn(SKB, LEN) @@ expression E, SKB, LEN; identifier fn = { skb_push, __skb_push, skb_push_rcsum }; type T; @@ - E = ((T )(fn(SKB, LEN))) + E = fn(SKB, LEN) @@ expression SKB, LEN; identifier fn = { skb_push, __skb_push, skb_push_rcsum }; @@ - fn(SKB, LEN)[0] + (u8 )fn(SKB, LEN) Note that the last part there converts from push(...)[0] to the more idiomatic (u8 *)push(...). Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-01	ppp: remove unnecessary bh disable in xmit path	Gao Feng
	Since the commit 55454a565836 ("ppp: avoid dealock on recursive xmit"), the PPP xmit path is protected by wrapper functions which disable the bh already. So it is unnecessary to disable the bh again in the real xmit path. Signed-off-by: Gao Feng <gfree.wind@vip.163.com> Acked-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-02	sched/headers: Prepare to move signal wakeup & sigpending methods from ↵	Ingo Molnar
	<linux/sched.h> into <linux/sched/signal.h> Fix up affected files that include this signal functionality via sched.h. Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-08	net: make ndo_get_stats64 a void function	stephen hemminger
	The network device operation for reading statistics is only called in one place, and it ignores the return value. Having a structure return value is potentially confusing because some future driver could incorrectly assume that the return value was used. Fix all drivers with ndo_get_stats64 to have a void function. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-18	netns: make struct pernet_operations::id unsigned int	Alexey Dobriyan
	Make struct pernet_operations::id unsigned. There are 2 reasons to do so: 1) This field is really an index into an zero based array and thus is unsigned entity. Using negative value is out-of-bound access by definition. 2) On x86_64 unsigned 32-bit data which are mixed with pointers via array indexing or offsets added or subtracted to pointers are preffered to signed 32-bit data. "int" being used as an array index needs to be sign-extended to 64-bit before being used. void f(long p, int i) { g(p[i]); } roughly translates to movsx rsi, esi mov rdi, [rsi+...] call g MOVSX is 3 byte instruction which isn't necessary if the variable is unsigned because x86_64 is zero extending by default. Now, there is net_generic() function which, you guessed it right, uses "int" as an array index: static inline void net_generic(const struct net *net, int id) { ... ptr = ng->ptr[id - 1]; ... } And this function is used a lot, so those sign extensions add up. Patch snipes ~1730 bytes on allyesconfig kernel (without all junk messing with code generation): add/remove: 0/0 grow/shrink: 70/598 up/down: 396/-2126 (-1730) Unfortunately some functions actually grow bigger. This is a semmingly random artefact of code generation with register allocator being used differently. gcc decides that some variable needs to live in new r8+ registers and every access now requires REX prefix. Or it is shifted into r12, so [r12+0] addressing mode has to be used which is longer than [r8] However, overall balance is in negative direction: add/remove: 0/0 grow/shrink: 70/598 up/down: 396/-2126 (-1730) function old new delta nfsd4_lock 3886 3959 +73 tipc_link_build_proto_msg 1096 1140 +44 mac80211_hwsim_new_radio 2776 2808 +32 tipc_mon_rcv 1032 1058 +26 svcauth_gss_legacy_init 1413 1429 +16 tipc_bcbase_select_primary 379 392 +13 nfsd4_exchange_id 1247 1260 +13 nfsd4_setclientid_confirm 782 793 +11 ... put_client_renew_locked 494 480 -14 ip_set_sockfn_get 730 716 -14 geneve_sock_add 829 813 -16 nfsd4_sequence_done 721 703 -18 nlmclnt_lookup_host 708 686 -22 nfsd4_lockt 1085 1063 -22 nfs_get_client 1077 1050 -27 tcf_bpf_init 1106 1076 -30 nfsd4_encode_fattr 5997 5930 -67 Total: Before=154856051, After=154854321, chg -0.00% Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-31	ppp: declare PPP devices as LLTX	Guillaume Nault
	ppp_xmit_process() already locks the xmit path. If HARD_TX_LOCK() tries to hold the _xmit_lock we can get lock inversion. [ 973.726130] ====================================================== [ 973.727311] [ INFO: possible circular locking dependency detected ] [ 973.728546] 4.8.0-rc2 #1 Tainted: G O [ 973.728986] ------------------------------------------------------- [ 973.728986] accel-pppd/1806 is trying to acquire lock: [ 973.728986] (&qdisc_xmit_lock_key){+.-...}, at: [<ffffffff8146f6fe>] sch_direct_xmit+0x8d/0x221 [ 973.728986] [ 973.728986] but task is already holding lock: [ 973.728986] (l2tp_sock){+.-...}, at: [<ffffffffa0202c4a>] l2tp_xmit_skb+0x1e8/0x5d7 [l2tp_core] [ 973.728986] [ 973.728986] which lock already depends on the new lock. [ 973.728986] [ 973.728986] [ 973.728986] the existing dependency chain (in reverse order) is: [ 973.728986] -> #3 (l2tp_sock){+.-...}: [ 973.728986] [<ffffffff810b3130>] lock_acquire+0x150/0x217 [ 973.728986] [<ffffffff815752f4>] _raw_spin_lock+0x2d/0x3c [ 973.728986] [<ffffffffa0202c4a>] l2tp_xmit_skb+0x1e8/0x5d7 [l2tp_core] [ 973.728986] [<ffffffffa01b2466>] pppol2tp_xmit+0x1f2/0x25e [l2tp_ppp] [ 973.728986] [<ffffffffa0184f59>] ppp_channel_push+0xb5/0x14a [ppp_generic] [ 973.728986] [<ffffffffa01853ed>] ppp_write+0x104/0x11c [ppp_generic] [ 973.728986] [<ffffffff811b2ec6>] __vfs_write+0x56/0x120 [ 973.728986] [<ffffffff811b3f4c>] vfs_write+0xbd/0x11b [ 973.728986] [<ffffffff811b4cb2>] SyS_write+0x5e/0x96 [ 973.728986] [<ffffffff81575ba5>] entry_SYSCALL_64_fastpath+0x18/0xa8 [ 973.728986] -> #2 (&(&pch->downl)->rlock){+.-...}: [ 973.728986] [<ffffffff810b3130>] lock_acquire+0x150/0x217 [ 973.728986] [<ffffffff81575334>] _raw_spin_lock_bh+0x31/0x40 [ 973.728986] [<ffffffffa01808e2>] ppp_push+0xa7/0x82d [ppp_generic] [ 973.728986] [<ffffffffa0184675>] __ppp_xmit_process+0x48/0x877 [ppp_generic] [ 973.728986] [<ffffffffa018505b>] ppp_xmit_process+0x4b/0xaf [ppp_generic] [ 973.728986] [<ffffffffa01853f7>] ppp_write+0x10e/0x11c [ppp_generic] [ 973.728986] [<ffffffff811b2ec6>] __vfs_write+0x56/0x120 [ 973.728986] [<ffffffff811b3f4c>] vfs_write+0xbd/0x11b [ 973.728986] [<ffffffff811b4cb2>] SyS_write+0x5e/0x96 [ 973.728986] [<ffffffff81575ba5>] entry_SYSCALL_64_fastpath+0x18/0xa8 [ 973.728986] -> #1 (&(&ppp->wlock)->rlock){+.-...}: [ 973.728986] [<ffffffff810b3130>] lock_acquire+0x150/0x217 [ 973.728986] [<ffffffff81575334>] _raw_spin_lock_bh+0x31/0x40 [ 973.728986] [<ffffffffa0184654>] __ppp_xmit_process+0x27/0x877 [ppp_generic] [ 973.728986] [<ffffffffa018505b>] ppp_xmit_process+0x4b/0xaf [ppp_generic] [ 973.728986] [<ffffffffa01852da>] ppp_start_xmit+0x21b/0x22a [ppp_generic] [ 973.728986] [<ffffffff8143f767>] dev_hard_start_xmit+0x1a9/0x43d [ 973.728986] [<ffffffff8146f747>] sch_direct_xmit+0xd6/0x221 [ 973.728986] [<ffffffff814401e4>] __dev_queue_xmit+0x62a/0x912 [ 973.728986] [<ffffffff814404d7>] dev_queue_xmit+0xb/0xd [ 973.728986] [<ffffffff81449978>] neigh_direct_output+0xc/0xe [ 973.728986] [<ffffffff8150e62b>] ip6_finish_output2+0x5a9/0x623 [ 973.728986] [<ffffffff81512128>] ip6_output+0x15e/0x16a [ 973.728986] [<ffffffff8153ef86>] dst_output+0x76/0x7f [ 973.728986] [<ffffffff8153f737>] mld_sendpack+0x335/0x404 [ 973.728986] [<ffffffff81541c61>] mld_send_initial_cr.part.21+0x99/0xa2 [ 973.728986] [<ffffffff8154441d>] ipv6_mc_dad_complete+0x42/0x71 [ 973.728986] [<ffffffff8151c4bd>] addrconf_dad_completed+0x1cf/0x2ea [ 973.728986] [<ffffffff8151e4fa>] addrconf_dad_work+0x453/0x520 [ 973.728986] [<ffffffff8107a393>] process_one_work+0x365/0x6f0 [ 973.728986] [<ffffffff8107aecd>] worker_thread+0x2de/0x421 [ 973.728986] [<ffffffff810816fb>] kthread+0x121/0x130 [ 973.728986] [<ffffffff81575dbf>] ret_from_fork+0x1f/0x40 [ 973.728986] -> #0 (&qdisc_xmit_lock_key){+.-...}: [ 973.728986] [<ffffffff810b28d6>] __lock_acquire+0x1118/0x1483 [ 973.728986] [<ffffffff810b3130>] lock_acquire+0x150/0x217 [ 973.728986] [<ffffffff815752f4>] _raw_spin_lock+0x2d/0x3c [ 973.728986] [<ffffffff8146f6fe>] sch_direct_xmit+0x8d/0x221 [ 973.728986] [<ffffffff814401e4>] __dev_queue_xmit+0x62a/0x912 [ 973.728986] [<ffffffff814404d7>] dev_queue_xmit+0xb/0xd [ 973.728986] [<ffffffff81449978>] neigh_direct_output+0xc/0xe [ 973.728986] [<ffffffff81487811>] ip_finish_output2+0x5db/0x609 [ 973.728986] [<ffffffff81489590>] ip_finish_output+0x152/0x15e [ 973.728986] [<ffffffff8148a0d4>] ip_output+0x8c/0x96 [ 973.728986] [<ffffffff81489652>] ip_local_out+0x41/0x4a [ 973.728986] [<ffffffff81489e7d>] ip_queue_xmit+0x5a5/0x609 [ 973.728986] [<ffffffffa0202fe4>] l2tp_xmit_skb+0x582/0x5d7 [l2tp_core] [ 973.728986] [<ffffffffa01b2466>] pppol2tp_xmit+0x1f2/0x25e [l2tp_ppp] [ 973.728986] [<ffffffffa0184f59>] ppp_channel_push+0xb5/0x14a [ppp_generic] [ 973.728986] [<ffffffffa01853ed>] ppp_write+0x104/0x11c [ppp_generic] [ 973.728986] [<ffffffff811b2ec6>] __vfs_write+0x56/0x120 [ 973.728986] [<ffffffff811b3f4c>] vfs_write+0xbd/0x11b [ 973.728986] [<ffffffff811b4cb2>] SyS_write+0x5e/0x96 [ 973.728986] [<ffffffff81575ba5>] entry_SYSCALL_64_fastpath+0x18/0xa8 [ 973.728986] [ 973.728986] other info that might help us debug this: [ 973.728986] [ 973.728986] Chain exists of: &qdisc_xmit_lock_key --> &(&pch->downl)->rlock --> l2tp_sock [ 973.728986] Possible unsafe locking scenario: [ 973.728986] [ 973.728986] CPU0 CPU1 [ 973.728986] ---- ---- [ 973.728986] lock(l2tp_sock); [ 973.728986] lock(&(&pch->downl)->rlock); [ 973.728986] lock(l2tp_sock); [ 973.728986] lock(&qdisc_xmit_lock_key); [ 973.728986] [ 973.728986] * DEADLOCK * [ 973.728986] [ 973.728986] 6 locks held by accel-pppd/1806: [ 973.728986] #0: (&(&pch->downl)->rlock){+.-...}, at: [<ffffffffa0184efa>] ppp_channel_push+0x56/0x14a [ppp_generic] [ 973.728986] #1: (l2tp_sock){+.-...}, at: [<ffffffffa0202c4a>] l2tp_xmit_skb+0x1e8/0x5d7 [l2tp_core] [ 973.728986] #2: (rcu_read_lock){......}, at: [<ffffffff81486981>] rcu_lock_acquire+0x0/0x20 [ 973.728986] #3: (rcu_read_lock_bh){......}, at: [<ffffffff81486981>] rcu_lock_acquire+0x0/0x20 [ 973.728986] #4: (rcu_read_lock_bh){......}, at: [<ffffffff814340e3>] rcu_lock_acquire+0x0/0x20 [ 973.728986] #5: (dev->qdisc_running_key ?: &qdisc_running_key#2){+.....}, at: [<ffffffff8144011e>] __dev_queue_xmit+0x564/0x912 [ 973.728986] [ 973.728986] stack backtrace: [ 973.728986] CPU: 2 PID: 1806 Comm: accel-pppd Tainted: G O 4.8.0-rc2 #1 [ 973.728986] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Debian-1.8.2-1 04/01/2014 [ 973.728986] ffff7fffffffffff ffff88003436f850 ffffffff812a20f4 ffffffff82156e30 [ 973.728986] ffffffff82156920 ffff88003436f890 ffffffff8115c759 ffff88003344ae00 [ 973.728986] ffff88003344b5c0 0000000000000002 0000000000000006 ffff88003344b5e8 [ 973.728986] Call Trace: [ 973.728986] [<ffffffff812a20f4>] dump_stack+0x67/0x90 [ 973.728986] [<ffffffff8115c759>] print_circular_bug+0x22e/0x23c [ 973.728986] [<ffffffff810b28d6>] __lock_acquire+0x1118/0x1483 [ 973.728986] [<ffffffff810b3130>] lock_acquire+0x150/0x217 [ 973.728986] [<ffffffff810b3130>] ? lock_acquire+0x150/0x217 [ 973.728986] [<ffffffff8146f6fe>] ? sch_direct_xmit+0x8d/0x221 [ 973.728986] [<ffffffff815752f4>] _raw_spin_lock+0x2d/0x3c [ 973.728986] [<ffffffff8146f6fe>] ? sch_direct_xmit+0x8d/0x221 [ 973.728986] [<ffffffff8146f6fe>] sch_direct_xmit+0x8d/0x221 [ 973.728986] [<ffffffff814401e4>] __dev_queue_xmit+0x62a/0x912 [ 973.728986] [<ffffffff814404d7>] dev_queue_xmit+0xb/0xd [ 973.728986] [<ffffffff81449978>] neigh_direct_output+0xc/0xe [ 973.728986] [<ffffffff81487811>] ip_finish_output2+0x5db/0x609 [ 973.728986] [<ffffffff81486853>] ? dst_mtu+0x29/0x2e [ 973.728986] [<ffffffff81489590>] ip_finish_output+0x152/0x15e [ 973.728986] [<ffffffff8148a0bc>] ? ip_output+0x74/0x96 [ 973.728986] [<ffffffff8148a0d4>] ip_output+0x8c/0x96 [ 973.728986] [<ffffffff81489652>] ip_local_out+0x41/0x4a [ 973.728986] [<ffffffff81489e7d>] ip_queue_xmit+0x5a5/0x609 [ 973.728986] [<ffffffff814c559e>] ? udp_set_csum+0x207/0x21e [ 973.728986] [<ffffffffa0202fe4>] l2tp_xmit_skb+0x582/0x5d7 [l2tp_core] [ 973.728986] [<ffffffffa01b2466>] pppol2tp_xmit+0x1f2/0x25e [l2tp_ppp] [ 973.728986] [<ffffffffa0184f59>] ppp_channel_push+0xb5/0x14a [ppp_generic] [ 973.728986] [<ffffffffa01853ed>] ppp_write+0x104/0x11c [ppp_generic] [ 973.728986] [<ffffffff811b2ec6>] __vfs_write+0x56/0x120 [ 973.728986] [<ffffffff8124c11d>] ? fsnotify_perm+0x27/0x95 [ 973.728986] [<ffffffff8124d41d>] ? security_file_permission+0x4d/0x54 [ 973.728986] [<ffffffff811b3f4c>] vfs_write+0xbd/0x11b [ 973.728986] [<ffffffff811b4cb2>] SyS_write+0x5e/0x96 [ 973.728986] [<ffffffff81575ba5>] entry_SYSCALL_64_fastpath+0x18/0xa8 [ 973.728986] [<ffffffff810ae0fa>] ? trace_hardirqs_off_caller+0x121/0x12f Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-31	ppp: avoid dealock on recursive xmit	Guillaume Nault
	In case of misconfiguration, a virtual PPP channel might send packets back to their parent PPP interface. This typically happens in misconfigured L2TP setups, where PPP's peer IP address is set with the IP of the L2TP peer. When that happens the system hangs due to PPP trying to recursively lock its xmit path. [ 243.332155] BUG: spinlock recursion on CPU#1, accel-pppd/926 [ 243.333272] lock: 0xffff880033d90f18, .magic: dead4ead, .owner: accel-pppd/926, .owner_cpu: 1 [ 243.334859] CPU: 1 PID: 926 Comm: accel-pppd Not tainted 4.8.0-rc2 #1 [ 243.336010] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Debian-1.8.2-1 04/01/2014 [ 243.336018] ffff7fffffffffff ffff8800319a77a0 ffffffff8128de85 ffff880033d90f18 [ 243.336018] ffff880033ad8000 ffff8800319a77d8 ffffffff810ad7c0 ffffffff0000039e [ 243.336018] ffff880033d90f18 ffff880033d90f60 ffff880033d90f18 ffff880033d90f28 [ 243.336018] Call Trace: [ 243.336018] [<ffffffff8128de85>] dump_stack+0x4f/0x65 [ 243.336018] [<ffffffff810ad7c0>] spin_dump+0xe1/0xeb [ 243.336018] [<ffffffff810ad7f0>] spin_bug+0x26/0x28 [ 243.336018] [<ffffffff810ad8b9>] do_raw_spin_lock+0x5c/0x160 [ 243.336018] [<ffffffff815522aa>] _raw_spin_lock_bh+0x35/0x3c [ 243.336018] [<ffffffffa01a88e2>] ? ppp_push+0xa7/0x82d [ppp_generic] [ 243.336018] [<ffffffffa01a88e2>] ppp_push+0xa7/0x82d [ppp_generic] [ 243.336018] [<ffffffff810adada>] ? do_raw_spin_unlock+0xc2/0xcc [ 243.336018] [<ffffffff81084962>] ? preempt_count_sub+0x13/0xc7 [ 243.336018] [<ffffffff81552438>] ? _raw_spin_unlock_irqrestore+0x34/0x49 [ 243.336018] [<ffffffffa01ac657>] ppp_xmit_process+0x48/0x877 [ppp_generic] [ 243.336018] [<ffffffff81084962>] ? preempt_count_sub+0x13/0xc7 [ 243.336018] [<ffffffff81408cd3>] ? skb_queue_tail+0x71/0x7c [ 243.336018] [<ffffffffa01ad1c5>] ppp_start_xmit+0x21b/0x22a [ppp_generic] [ 243.336018] [<ffffffff81426af1>] dev_hard_start_xmit+0x15e/0x32c [ 243.336018] [<ffffffff81454ed7>] sch_direct_xmit+0xd6/0x221 [ 243.336018] [<ffffffff814273a8>] __dev_queue_xmit+0x52a/0x820 [ 243.336018] [<ffffffff814276a9>] dev_queue_xmit+0xb/0xd [ 243.336018] [<ffffffff81430a3c>] neigh_direct_output+0xc/0xe [ 243.336018] [<ffffffff8146b5d7>] ip_finish_output2+0x4d2/0x548 [ 243.336018] [<ffffffff8146a8e6>] ? dst_mtu+0x29/0x2e [ 243.336018] [<ffffffff8146d49c>] ip_finish_output+0x152/0x15e [ 243.336018] [<ffffffff8146df84>] ? ip_output+0x74/0x96 [ 243.336018] [<ffffffff8146df9c>] ip_output+0x8c/0x96 [ 243.336018] [<ffffffff8146d55e>] ip_local_out+0x41/0x4a [ 243.336018] [<ffffffff8146dd15>] ip_queue_xmit+0x531/0x5c5 [ 243.336018] [<ffffffff814a82cd>] ? udp_set_csum+0x207/0x21e [ 243.336018] [<ffffffffa01f2f04>] l2tp_xmit_skb+0x582/0x5d7 [l2tp_core] [ 243.336018] [<ffffffffa01ea458>] pppol2tp_xmit+0x1eb/0x257 [l2tp_ppp] [ 243.336018] [<ffffffffa01acf17>] ppp_channel_push+0x91/0x102 [ppp_generic] [ 243.336018] [<ffffffffa01ad2d8>] ppp_write+0x104/0x11c [ppp_generic] [ 243.336018] [<ffffffff811a3c1e>] __vfs_write+0x56/0x120 [ 243.336018] [<ffffffff81239801>] ? fsnotify_perm+0x27/0x95 [ 243.336018] [<ffffffff8123ab01>] ? security_file_permission+0x4d/0x54 [ 243.336018] [<ffffffff811a4ca4>] vfs_write+0xbd/0x11b [ 243.336018] [<ffffffff811a5a0a>] SyS_write+0x5e/0x96 [ 243.336018] [<ffffffff81552a1b>] entry_SYSCALL_64_fastpath+0x13/0x94 The main entry points for sending packets over a PPP unit are the .write() and .ndo_start_xmit() callbacks (simplified view): .write(unit fd) or .ndo_start_xmit() \ CALL ppp_xmit_process() \ LOCK unit's xmit path (ppp->wlock) \| CALL ppp_push() \ LOCK channel's xmit path (chan->downl) \| CALL lower layer's .start_xmit() callback \ ... might recursively call .ndo_start_xmit() ... / RETURN from .start_xmit() \| UNLOCK channel's xmit path / RETURN from ppp_push() \| UNLOCK unit's xmit path / RETURN from ppp_xmit_process() Packets can also be directly sent on channels (e.g. LCP packets): .write(channel fd) or ppp_output_wakeup() \ CALL ppp_channel_push() \ LOCK channel's xmit path (chan->downl) \| CALL lower layer's .start_xmit() callback \ ... might call .ndo_start_xmit() ... / RETURN from .start_xmit() \| UNLOCK channel's xmit path / RETURN from ppp_channel_push() Key points about the lower layer's .start_xmit() callback: * It can be called directly by a channel fd .write() or by ppp_output_wakeup() or indirectly by a unit fd .write() or by .ndo_start_xmit(). * In any case, it's always called with chan->downl held. * It might route the packet back to its parent unit using .ndo_start_xmit() as entry point. This patch detects and breaks recursion in ppp_xmit_process(). This function is a good candidate for the task because it's called early enough after .ndo_start_xmit(), it's always part of the recursion loop and it's on the path of whatever entry point is used to send a packet on a PPP unit. Recursion detection is done using the per-cpu ppp_xmit_recursion variable. Since ppp_channel_push() too locks the channel's xmit path and calls the lower layer's .start_xmit() callback, we need to also increment ppp_xmit_recursion there. However there's no need to check for recursion, as it's out of the recursion loop. Reported-by: Feng Gao <gfree.wind@gmail.com> Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-09	ppp: build ifname using unit identifier for rtnl based devices	Guillaume Nault
	Userspace programs generally need to know the name of the ppp devices they create. Both ioctl and rtnl interfaces use the ppp<suffix> sheme to name them. But although the suffix used by the ioctl interface can be known by userspace (it's the PPP unit identifier returned by the PPPIOCGUNIT ioctl), the one used by the rtnl is only known by the kernel. This patch brings more consistency between ioctl and rtnl based ppp devices by generating device names using the PPP unit identifer as suffix in both cases. This way, userspace can always infer the name of the devices they create. Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-24	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	David S. Miller
	Just several instances of overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-08	ppp: defer netns reference release for ppp channel	WANG Cong
	Matt reported that we have a NULL pointer dereference in ppp_pernet() from ppp_connect_channel(), i.e. pch->chan_net is NULL. This is due to that a parallel ppp_unregister_channel() could happen while we are in ppp_connect_channel(), during which pch->chan_net set to NULL. Since we need a reference to net per channel, it makes sense to sync the refcnt with the life time of the channel, therefore we should release this reference when we destroy it. Fixes: 1f461dcdd296 ("ppp: take reference on channels netns") Reported-by: Matt Bennett <Matt.Bennett@alliedtelesis.co.nz> Cc: Paul Mackerras <paulus@samba.org> Cc: linux-ppp@vger.kernel.org Cc: Guillaume Nault <g.nault@alphalink.fr> Cc: Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Reviewed-by: Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-09	net: add netdev_lockdep_set_classes() helper	Eric Dumazet
	It is time to add netdev_lockdep_set_classes() helper so that lockdep annotations per device type are easier to manage. This removes a lot of copies and missing annotations. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-07	net_sched: transform qdisc running bit into a seqcount	Eric Dumazet
	Instead of using a single bit (__QDISC___STATE_RUNNING) in sch->__state, use a seqcount. This adds lockdep support, but more importantly it will allow us to sample qdisc/class statistics without having to grab qdisc root lock. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-29	ppp: add rtnetlink device creation support	Guillaume Nault
	Define PPP device handler for use with rtnetlink. The only PPP specific attribute is IFLA_PPP_DEV_FD. It is mandatory and contains the file descriptor of the associated /dev/ppp instance (the file descriptor which would have been used for ioctl(PPPIOCNEWUNIT) in the ioctl-based API). The PPP device is removed when this file descriptor is released (same behaviour as with ioctl based PPP devices). PPP devices created with the rtnetlink API behave like the ones created with ioctl(PPPIOCNEWUNIT). In particular existing ioctls work the same way, no matter how the PPP device was created. The rtnl callbacks are also assigned to ioctl based PPP devices. This way, rtnl messages have the same effect on any PPP devices. The immediate effect is that all PPP devices, even ioctl-based ones, can now be removed with "ip link del". A minor difference still exists between ioctl and rtnl based PPP interfaces: in the device name, the number following the "ppp" prefix corresponds to the PPP unit number for ioctl based devices, while it is just an unrelated incrementing index for rtnl ones. Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>