summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2019-02-12net/mlx4_en: Force CHECKSUM_NONE for short ethernet framesSaeed Mahameed
When an ethernet frame is padded to meet the minimum ethernet frame size, the padding octets are not covered by the hardware checksum. Fortunately the padding octets are usually zero's, which don't affect checksum. However, it is not guaranteed. For example, switches might choose to make other use of these octets. This repeatedly causes kernel hardware checksum fault. Prior to the cited commit below, skb checksum was forced to be CHECKSUM_NONE when padding is detected. After it, we need to keep skb->csum updated. However, fixing up CHECKSUM_COMPLETE requires to verify and parse IP headers, it does not worth the effort as the packets are so small that CHECKSUM_COMPLETE has no significant advantage. Future work: when reporting checksum complete is not an option for IP non-TCP/UDP packets, we can actually fallback to report checksum unnecessary, by looking at cqe IPOK bit. Fixes: 88078d98d1bb ("net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends") Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-12net: phylink: avoid resolving link state too earlyRussell King
During testing on Armada 388 platforms, it was found with a certain module configuration that it was possible to trigger a kernel oops during the module load process, caused by the phylink resolver being triggered for a currently disabled interface. This problem was introduced by changing the way the SFP registration works, which now can result in the sfp link down notification being called during phylink_create(). Fixes: b5bfc21af5cb ("net: sfp: do not probe SFP module before we're attached") Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-12geneve: change NET_UDP_TUNNEL dependency to selectMatteo Croce
Due to the depends on NET_UDP_TUNNEL, at the moment it is impossible to compile GENEVE if no other protocol depending on NET_UDP_TUNNEL is selected. Fix this changing the depends to a select, and drop NET_IP_TUNNEL from the select list, as it already depends on NET_UDP_TUNNEL. Signed-off-by: Matteo Croce <mcroce@redhat.com> Reviewed-and-tested-by: Andrea Claudi <aclaudi@redhat.com> Tested-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-12sfc: initialise found bitmap in efx_ef10_mtd_probeBert Kenward
The bitmap of found partitions in efx_ef10_mtd_probe was not initialised, causing partitions to be suppressed based off whatever value was in the bitmap at the start. Fixes: 3366463513f5 ("sfc: suppress duplicate nvmem partition types in efx_ef10_mtd_probe") Signed-off-by: Bert Kenward <bkenward@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-12Merge tag 'mac80211-for-davem-2019-02-12' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211 Johannes Berg says: ==================== Just a few fixes: * aggregation session teardown with internal TXQs was continuing to send some frames marked as aggregation, fix from Ilan * IBSS join was missed during firmware restart, should such a thing happen * speculative execution based on the return value of cfg80211_classify8021d() - which is controlled by the sender of the packet - could be problematic in some code using it, prevent it * a few peer measurement fixes ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-12floppy: check_events callback should not return a negative numberYufen Yu
floppy_check_events() is supposed to return bit flags to say which events occured. We should return zero to say that no event flags are set. Only BIT(0) and BIT(1) are used in the caller. And .check_events interface also expect to return an unsigned int value. However, after commit a0c80efe5956, it may return -EINTR (-4u). Here, both BIT(0) and BIT(1) are cleared. So this patch shouldn't affect runtime, but it obviously is still worth fixing. Reviewed-by: Dan Carpenter <dan.carpenter@oracle.com> Fixes: a0c80efe5956 ("floppy: fix lock_fdc() signal handling") Signed-off-by: Yufen Yu <yuyufen@huawei.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-02-12xsk: do not remove umem from netdevice on fall-back to copy-modeBjörn Töpel
Commit c9b47cc1fabc ("xsk: fix bug when trying to use both copy and zero-copy on one queue id") stores the umem into the netdev._rx struct. However, the patch incorrectly removed the umem from the netdev._rx struct when user-space passed "best-effort" mode (i.e. select the fastest possible option available), and zero-copy mode was not available. This commit fixes that. Fixes: c9b47cc1fabc ("xsk: fix bug when trying to use both copy and zero-copy on one queue id") Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-02-12drm/i915/opregion: rvda is relative from opregion base in opregion 2.1+Jani Nikula
Starting from opregion version 2.1 (roughly corresponding to ICL+) the RVDA field is relative from the beginning of opregion, not absolute address. Fix the error path while at it. v2: Make relative vs. absolute conditional on the opregion version, bumped for the purpose. Turned out there are machines relying on absolute RVDA in the wild. v3: Fix the version checks Fixes: 04ebaadb9f2d ("drm/i915/opregion: handle VBT sizes bigger than 6 KB") Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Imre Deak <imre.deak@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190208184254.24123-2-jani.nikula@intel.com (cherry picked from commit a0f52c3d357af218a9c1f7cd906ab70426176a1a) Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2019-02-12drm/i915/opregion: fix version checkJani Nikula
The u32 version field encodes major, minor, revision and reserved. We've basically been checking for any non-zero version. Add opregion version logging while at it. v2: Fix the fix of the version check Fixes: 04ebaadb9f2d ("drm/i915/opregion: handle VBT sizes bigger than 6 KB") Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Imre Deak <imre.deak@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190208184254.24123-1-jani.nikula@intel.com (cherry picked from commit 98fdaaca9537b997062f1abc0aa87c61b50ce40a) Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2019-02-12drm/i915: Prevent a race during I915_GEM_MMAP ioctl with WC setJoonas Lahtinen
Make sure the underlying VMA in the process address space is the same as it was during vm_mmap to avoid applying WC to wrong VMA. A more long-term solution would be to have vm_mmap_locked variant in linux/mmap.h for when caller wants to hold mmap_sem for an extended duration. v2: - Refactor the compare function Fixes: 1816f9236303 ("drm/i915: Support creation of unbound wc user mappings for objects") Reported-by: Adam Zabrocki <adamza@microsoft.com> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: <stable@vger.kernel.org> # v4.0+ Cc: Akash Goel <akash.goel@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Adam Zabrocki <adamza@microsoft.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> #v1 Link: https://patchwork.freedesktop.org/patch/msgid/20190207085454.10598-1-joonas.lahtinen@linux.intel.com (cherry picked from commit 5c4604e757ba9b193b09768d75a7d2105a5b883f) Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2019-02-12drm/i915: Block fbdev HPD processing during suspendLyude Paul
When resuming, we check whether or not any previously connected MST topologies are still present and if so, attempt to resume them. If this fails, we disable said MST topologies and fire off a hotplug event so that userspace knows to reprobe. However, sending a hotplug event involves calling drm_fb_helper_hotplug_event(), which in turn results in fbcon doing a connector reprobe in the caller's thread - something we can't do at the point in which i915 calls drm_dp_mst_topology_mgr_resume() since hotplugging hasn't been fully initialized yet. This currently causes some rather subtle but fatal issues. For example, on my T480s the laptop dock connected to it usually disappears during a suspend cycle, and comes back up a short while after the system has been resumed. This guarantees pretty much every suspend and resume cycle, drm_dp_mst_topology_mgr_set_mst(mgr, false); will be caused and in turn, a connector hotplug will occur. Now it's Rute Goldberg time: when the connector hotplug occurs, i915 reprobes /all/ of the connectors, including eDP. However, eDP probing requires that we power on the panel VDD which in turn, grabs a wakeref to the appropriate power domain on the GPU (on my T480s, this is the PORT_DDI_A_IO domain). This is where things start breaking, since this all happens before intel_power_domains_enable() is called we end up leaking the wakeref that was acquired and never releasing it later. Come next suspend/resume cycle, this causes us to fail to shut down the GPU properly, which causes it not to resume properly and die a horrible complicated death. (as a note: this only happens when there's both an eDP panel and MST topology connected which is removed mid-suspend. One or the other seems to always be OK). We could try to fix the VDD wakeref leak, but this doesn't seem like it's worth it at all since we aren't able to handle hotplug detection while resuming anyway. So, let's go with a more robust solution inspired by nouveau: block fbdev from handling hotplug events until we resume fbdev. This allows us to still send sysfs hotplug events to be handled later by user space while we're resuming, while also preventing us from actually processing any hotplug events we receive until it's safe. This fixes the wakeref leak observed on the T480s and as such, also fixes suspend/resume with MST topologies connected on this machine. Changes since v2: * Don't call drm_fb_helper_hotplug_event() under lock, do it after lock (Chris Wilson) * Don't call drm_fb_helper_hotplug_event() in intel_fbdev_output_poll_changed() under lock (Chris Wilson) * Always set ifbdev->hpd_waiting (Chris Wilson) Signed-off-by: Lyude Paul <lyude@redhat.com> Fixes: 0e32b39ceed6 ("drm/i915: add DP 1.2 MST support (v0.7)") Cc: Todd Previte <tprevite@gmail.com> Cc: Dave Airlie <airlied@redhat.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Imre Deak <imre.deak@intel.com> Cc: intel-gfx@lists.freedesktop.org Cc: <stable@vger.kernel.org> # v3.17+ Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190129191001.442-2-lyude@redhat.com (cherry picked from commit fe5ec65668cdaa4348631d8ce1766eed43b33c10) Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2019-02-12drm/i915/pmu: Fix enable count array size and bounds checkingTvrtko Ursulin
Enable count array is supposed to have one counter for each possible engine sampler. As such, array sizing and bounds checking is not correct and would blow up the asserts if more samplers were added. No ill-effect in the current code base but lets fix it for correctness. At the same time tidy the assert for readability and robustness. v2: * One check per assert. (Chris Wilson) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Fixes: b46a33e271ed ("drm/i915/pmu: Expose a PMU interface for perf queries") Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190205130353.21105-1-tvrtko.ursulin@linux.intel.com (cherry picked from commit 26a11deea685b41a43edb513194718aa1f461c9a) Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2019-02-12ipvs: fix dependency on nf_defrag_ipv6Andrea Claudi
ipvs relies on nf_defrag_ipv6 module to manage IPv6 fragmentation, but lacks proper Kconfig dependencies and does not explicitly request defrag features. As a result, if netfilter hooks are not loaded, when IPv6 fragmented packet are handled by ipvs only the first fragment makes through. Fix it properly declaring the dependency on Kconfig and registering netfilter hooks on ip_vs_add_service() and ip_vs_new_dest(). Reported-by: Li Shuang <shuali@redhat.com> Signed-off-by: Andrea Claudi <aclaudi@redhat.com> Acked-by: Julian Anastasov <ja@ssi.bg> Acked-by: Simon Horman <horms@verge.net.au> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-02-12nvme-pci: add missing unlock for reset errorKeith Busch
The reset work holds a mutex to prevent races with removal modifying the same resources, but was unlocking only on success. Unlock on failure too. Fixes: 5c959d73dba64 ("nvme-pci: fix rapid add remove sequence") Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2019-02-11tipc: fix link session and re-establish issuesTuong Lien
When a link endpoint is re-created (e.g. after a node reboot or interface reset), the link session number is varied by random, the peer endpoint will be synced with this new session number before the link is re-established. However, there is a shortcoming in this mechanism that can lead to the link never re-established or faced with a failure then. It happens when the peer endpoint is ready in ESTABLISHING state, the 'peer_session' as well as the 'in_session' flag have been set, but suddenly this link endpoint leaves. When it comes back with a random session number, there are two situations possible: 1/ If the random session number is larger than (or equal to) the previous one, the peer endpoint will be updated with this new session upon receipt of a RESET_MSG from this endpoint, and the link can be re- established as normal. Otherwise, all the RESET_MSGs from this endpoint will be rejected by the peer. In turn, when this link endpoint receives one ACTIVATE_MSG from the peer, it will move to ESTABLISHED and start to send STATE_MSGs, but again these messages will be dropped by the peer due to wrong session. The peer link endpoint can still become ESTABLISHED after receiving a traffic message from this endpoint (e.g. a BCAST_PROTOCOL or NAME_DISTRIBUTOR), but since all the STATE_MSGs are invalid, the link will be forced down sooner or later! Even in case the random session number is larger than the previous one, it can be that the ACTIVATE_MSG from the peer arrives first, and this link endpoint moves quickly to ESTABLISHED without sending out any RESET_MSG yet. Consequently, the peer link will not be updated with the new session number, and the same link failure scenario as above will happen. 2/ Another situation can be that, the peer link endpoint was reset due to any reasons in the meantime, its link state was set to RESET from ESTABLISHING but still in session, i.e. the 'in_session' flag is not reset... Now, if the random session number from this endpoint is less than the previous one, all the RESET_MSGs from this endpoint will be rejected by the peer. In the other direction, when this link endpoint receives a RESET_MSG from the peer, it moves to ESTABLISHING and starts to send ACTIVATE_MSGs, but all these messages will be rejected by the peer too. As a result, the link cannot be re-established but gets stuck with this link endpoint in state ESTABLISHING and the peer in RESET! Solution: =========== This link endpoint should not go directly to ESTABLISHED when getting ACTIVATE_MSG from the peer which may belong to the old session if the link was re-created. To ensure the session to be correct before the link is re-established, the peer endpoint in ESTABLISHING state will send back the last session number in ACTIVATE_MSG for a verification at this endpoint. Then, if needed, a new and more appropriate session number will be regenerated to force a re-synch first. In addition, when a link in ESTABLISHING state is reset, its state will move to RESET according to the link FSM, along with resetting the 'in_session' flag (and the other data) as a normal link reset, it will also be deleted if requested. The solution is backward compatible. Acked-by: Jon Maloy <jon.maloy@ericsson.com> Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Tuong Lien <tuong.t.lien@dektech.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-11net: fix IPv6 prefix route residueZhiqiang Liu
Follow those steps: # ip addr add 2001:123::1/32 dev eth0 # ip addr add 2001:123:456::2/64 dev eth0 # ip addr del 2001:123::1/32 dev eth0 # ip addr del 2001:123:456::2/64 dev eth0 and then prefix route of 2001:123::1/32 will still exist. This is because ipv6_prefix_equal in check_cleanup_prefix_route func does not check whether two IPv6 addresses have the same prefix length. If the prefix of one address starts with another shorter address prefix, even though their prefix lengths are different, the return value of ipv6_prefix_equal is true. Here I add a check of whether two addresses have the same prefix to decide whether their prefixes are equal. Fixes: 5b84efecb7d9 ("ipv6 addrconf: don't cleanup prefix route for IFA_F_NOPREFIXROUTE") Signed-off-by: Zhiqiang Liu <liuzhiqiang26@huawei.com> Reported-by: Wenhao Zhang <zhangwenhao8@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-11blk-mq: insert rq with DONTPREP to hctx dispatch list when requeueJianchao Wang
When requeue, if RQF_DONTPREP, rq has contained some driver specific data, so insert it to hctx dispatch list to avoid any merge. Take scsi as example, here is the trace event log (no io scheduler, because RQF_STARTED would prevent merging), kworker/0:1H-339 [000] ...1 2037.209289: block_rq_insert: 8,0 R 4096 () 32768 + 8 [kworker/0:1H] scsi_inert_test-1987 [000] .... 2037.220465: block_bio_queue: 8,0 R 32776 + 8 [scsi_inert_test] scsi_inert_test-1987 [000] ...2 2037.220466: block_bio_backmerge: 8,0 R 32776 + 8 [scsi_inert_test] kworker/0:1H-339 [000] .... 2047.220913: block_rq_issue: 8,0 R 8192 () 32768 + 16 [kworker/0:1H] scsi_inert_test-1996 [000] ..s1 2047.221007: block_rq_complete: 8,0 R () 32768 + 8 [0] scsi_inert_test-1996 [000] .Ns1 2047.221045: block_rq_requeue: 8,0 R () 32776 + 8 [0] kworker/0:1H-339 [000] ...1 2047.221054: block_rq_insert: 8,0 R 4096 () 32776 + 8 [kworker/0:1H] kworker/0:1H-339 [000] ...1 2047.221056: block_rq_issue: 8,0 R 4096 () 32776 + 8 [kworker/0:1H] scsi_inert_test-1986 [000] ..s1 2047.221119: block_rq_complete: 8,0 R () 32776 + 8 [0] (32768 + 8) was requeued by scsi_queue_insert and had RQF_DONTPREP. Then it was merged with (32776 + 8) and issued. Due to RQF_DONTPREP, the sdb only contained the part of (32768 + 8), then only that part was completed. The lucky thing was that scsi_io_completion detected it and requeued the remaining part. So we didn't get corrupted data. However, the requeue of (32776 + 8) is not expected. Suggested-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Jianchao Wang <jianchao.w.wang@oracle.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-02-11tipc: fix skb may be leaky in tipc_link_inputHoang Le
When we free skb at tipc_data_input, we return a 'false' boolean. Then, skb passed to subcalling tipc_link_input in tipc_link_rcv, <snip> 1303 int tipc_link_rcv: ... 1354 if (!tipc_data_input(l, skb, l->inputq)) 1355 rc |= tipc_link_input(l, skb, l->inputq); </snip> Fix it by simple changing to a 'true' boolean when skb is being free-ed. Then, tipc_link_rcv will bypassed to subcalling tipc_link_input as above condition. Acked-by: Ying Xue <ying.xue@windriver.com> Acked-by: Jon Maloy <maloy@donjonn.com> Signed-off-by: Hoang Le <hoang.h.le@dektech.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-12netfilter: compat: initialize all fields in xt_initFrancesco Ruggeri
If a non zero value happens to be in xt[NFPROTO_BRIDGE].cur at init time, the following panic can be caused by running % ebtables -t broute -F BROUTING from a 32-bit user level on a 64-bit kernel. This patch replaces kmalloc_array with kcalloc when allocating xt. [ 474.680846] BUG: unable to handle kernel paging request at 0000000009600920 [ 474.687869] PGD 2037006067 P4D 2037006067 PUD 2038938067 PMD 0 [ 474.693838] Oops: 0000 [#1] SMP [ 474.697055] CPU: 9 PID: 4662 Comm: ebtables Kdump: loaded Not tainted 4.19.17-11302235.AroraKernelnext.fc18.x86_64 #1 [ 474.707721] Hardware name: Supermicro X9DRT/X9DRT, BIOS 3.0 06/28/2013 [ 474.714313] RIP: 0010:xt_compat_calc_jump+0x2f/0x63 [x_tables] [ 474.720201] Code: 40 0f b6 ff 55 31 c0 48 6b ff 70 48 03 3d dc 45 00 00 48 89 e5 8b 4f 6c 4c 8b 47 60 ff c9 39 c8 7f 2f 8d 14 08 d1 fa 48 63 fa <41> 39 34 f8 4c 8d 0c fd 00 00 00 00 73 05 8d 42 01 eb e1 76 05 8d [ 474.739023] RSP: 0018:ffffc9000943fc58 EFLAGS: 00010207 [ 474.744296] RAX: 0000000000000000 RBX: ffffc90006465000 RCX: 0000000002580249 [ 474.751485] RDX: 00000000012c0124 RSI: fffffffff7be17e9 RDI: 00000000012c0124 [ 474.758670] RBP: ffffc9000943fc58 R08: 0000000000000000 R09: ffffffff8117cf8f [ 474.765855] R10: ffffc90006477000 R11: 0000000000000000 R12: 0000000000000001 [ 474.773048] R13: 0000000000000000 R14: ffffc9000943fcb8 R15: ffffc9000943fcb8 [ 474.780234] FS: 0000000000000000(0000) GS:ffff88a03f840000(0063) knlGS:00000000f7ac7700 [ 474.788612] CS: 0010 DS: 002b ES: 002b CR0: 0000000080050033 [ 474.794632] CR2: 0000000009600920 CR3: 0000002037422006 CR4: 00000000000606e0 [ 474.802052] Call Trace: [ 474.804789] compat_do_replace+0x1fb/0x2a3 [ebtables] [ 474.810105] compat_do_ebt_set_ctl+0x69/0xe6 [ebtables] [ 474.815605] ? try_module_get+0x37/0x42 [ 474.819716] compat_nf_setsockopt+0x4f/0x6d [ 474.824172] compat_ip_setsockopt+0x7e/0x8c [ 474.828641] compat_raw_setsockopt+0x16/0x3a [ 474.833220] compat_sock_common_setsockopt+0x1d/0x24 [ 474.838458] __compat_sys_setsockopt+0x17e/0x1b1 [ 474.843343] ? __check_object_size+0x76/0x19a [ 474.847960] __ia32_compat_sys_socketcall+0x1cb/0x25b [ 474.853276] do_fast_syscall_32+0xaf/0xf6 [ 474.857548] entry_SYSENTER_compat+0x6b/0x7a Signed-off-by: Francesco Ruggeri <fruggeri@arista.com> Acked-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-02-11Revert "RISC-V: Make BSS section as the last section in vmlinux.lds.S"Palmer Dabbelt
At least BBL relies on the flat binaries containing all the bytes in the actual image to exist in the file. Before this revert the flat images dropped the trailing zeros, which caused BBL to put its copy of the device tree where Linux thought the BSS was, which wreaks all sorts of havoc. Manifesting the bug is a bit subtle because BBL aligns everything to 2MiB page boundaries, but with large enough kernels you're almost certain to get bitten by the bug. While moving the sections around isn't a great long-term fix, it will at least avoid producing broken images. This reverts commit 22e6a2e14cb8ebcae059488cf24e778e4058c2bf. Signed-off-by: Palmer Dabbelt <palmer@sifive.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2019-02-11riscv: Add pte bit to distinguish swap from invalidStefan O'Rear
Previously, invalid PTEs and swap PTEs had the same binary representation, causing errors when attempting to unmap PROT_NONE mappings, including implicit unmap on exit. Typical error: swap_info_get: Bad swap file entry 40000000007a9879 BUG: Bad page map in process a.out pte:3d4c3cc0 pmd:3e521401 Cc: stable@vger.kernel.org Signed-off-by: Stefan O'Rear <sorear2@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2019-02-11Merge branch 'fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermal Pull thermal SoC management fixes from Eduardo Valentin: "Minor fixes on of-thermal and cpu cooling" * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermal: thermal: cpu_cooling: Clarify error message thermal: of-thermal: Print name of device node with error
2019-02-11net/x25: do not hold the cpu too long in x25_new_lci()Eric Dumazet
Due to quadratic behavior of x25_new_lci(), syzbot was able to trigger an rcu stall. Fix this by not blocking BH for the whole duration of the function, and inserting a reschedule point when possible. If we care enough, using a bitmap could get rid of the quadratic behavior. syzbot report : rcu: INFO: rcu_preempt self-detected stall on CPU rcu: 0-...!: (10500 ticks this GP) idle=4fa/1/0x4000000000000002 softirq=283376/283376 fqs=0 rcu: (t=10501 jiffies g=383105 q=136) rcu: rcu_preempt kthread starved for 10502 jiffies! g383105 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=0 rcu: RCU grace-period kthread stack dump: rcu_preempt I28928 10 2 0x80000000 Call Trace: context_switch kernel/sched/core.c:2844 [inline] __schedule+0x817/0x1cc0 kernel/sched/core.c:3485 schedule+0x92/0x180 kernel/sched/core.c:3529 schedule_timeout+0x4db/0xfd0 kernel/time/timer.c:1803 rcu_gp_fqs_loop kernel/rcu/tree.c:1948 [inline] rcu_gp_kthread+0x956/0x17a0 kernel/rcu/tree.c:2105 kthread+0x357/0x430 kernel/kthread.c:246 ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:352 NMI backtrace for cpu 0 CPU: 0 PID: 8759 Comm: syz-executor2 Not tainted 5.0.0-rc4+ #51 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: <IRQ> __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x172/0x1f0 lib/dump_stack.c:113 nmi_cpu_backtrace.cold+0x63/0xa4 lib/nmi_backtrace.c:101 nmi_trigger_cpumask_backtrace+0x1be/0x236 lib/nmi_backtrace.c:62 arch_trigger_cpumask_backtrace+0x14/0x20 arch/x86/kernel/apic/hw_nmi.c:38 trigger_single_cpu_backtrace include/linux/nmi.h:164 [inline] rcu_dump_cpu_stacks+0x183/0x1cf kernel/rcu/tree.c:1211 print_cpu_stall kernel/rcu/tree.c:1348 [inline] check_cpu_stall kernel/rcu/tree.c:1422 [inline] rcu_pending kernel/rcu/tree.c:3018 [inline] rcu_check_callbacks.cold+0x500/0xa4a kernel/rcu/tree.c:2521 update_process_times+0x32/0x80 kernel/time/timer.c:1635 tick_sched_handle+0xa2/0x190 kernel/time/tick-sched.c:161 tick_sched_timer+0x47/0x130 kernel/time/tick-sched.c:1271 __run_hrtimer kernel/time/hrtimer.c:1389 [inline] __hrtimer_run_queues+0x33e/0xde0 kernel/time/hrtimer.c:1451 hrtimer_interrupt+0x314/0x770 kernel/time/hrtimer.c:1509 local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1035 [inline] smp_apic_timer_interrupt+0x120/0x570 arch/x86/kernel/apic/apic.c:1060 apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:807 </IRQ> RIP: 0010:__read_once_size include/linux/compiler.h:193 [inline] RIP: 0010:queued_write_lock_slowpath+0x13e/0x290 kernel/locking/qrwlock.c:86 Code: 00 00 fc ff df 4c 8d 2c 01 41 83 c7 03 41 0f b6 45 00 41 38 c7 7c 08 84 c0 0f 85 0c 01 00 00 8b 03 3d 00 01 00 00 74 1a f3 90 <41> 0f b6 55 00 41 38 d7 7c eb 84 d2 74 e7 48 89 df e8 6c 0f 4f 00 RSP: 0018:ffff88805f117bd8 EFLAGS: 00000206 ORIG_RAX: ffffffffffffff13 RAX: 0000000000000300 RBX: ffffffff89413ba0 RCX: 1ffffffff1282774 RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffffffff89413ba0 RBP: ffff88805f117c70 R08: 1ffffffff1282774 R09: fffffbfff1282775 R10: fffffbfff1282774 R11: ffffffff89413ba3 R12: 00000000000000ff R13: fffffbfff1282774 R14: 1ffff1100be22f7d R15: 0000000000000003 queued_write_lock include/asm-generic/qrwlock.h:104 [inline] do_raw_write_lock+0x1d6/0x290 kernel/locking/spinlock_debug.c:203 __raw_write_lock_bh include/linux/rwlock_api_smp.h:204 [inline] _raw_write_lock_bh+0x3b/0x50 kernel/locking/spinlock.c:312 x25_insert_socket+0x21/0xe0 net/x25/af_x25.c:267 x25_bind+0x273/0x340 net/x25/af_x25.c:705 __sys_bind+0x23f/0x290 net/socket.c:1505 __do_sys_bind net/socket.c:1516 [inline] __se_sys_bind net/socket.c:1514 [inline] __x64_sys_bind+0x73/0xb0 net/socket.c:1514 do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x457e39 Code: ad b8 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 7b b8 fb ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007fafccd0dc78 EFLAGS: 00000246 ORIG_RAX: 0000000000000031 RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000457e39 RDX: 0000000000000012 RSI: 0000000020000240 RDI: 0000000000000004 RBP: 000000000073bf00 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 00007fafccd0e6d4 R13: 00000000004bdf8b R14: 00000000004ce4b8 R15: 00000000ffffffff Sending NMI from CPU 0 to CPUs 1: NMI backtrace for cpu 1 CPU: 1 PID: 8752 Comm: syz-executor4 Not tainted 5.0.0-rc4+ #51 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:__x25_find_socket+0x78/0x120 net/x25/af_x25.c:328 Code: 89 f8 48 c1 e8 03 80 3c 18 00 0f 85 a6 00 00 00 4d 8b 64 24 68 4d 85 e4 74 7f e8 03 97 3d fb 49 83 ec 68 74 74 e8 f8 96 3d fb <49> 8d bc 24 88 04 00 00 48 89 f8 48 c1 e8 03 0f b6 04 18 84 c0 74 RSP: 0018:ffff8880639efc58 EFLAGS: 00000246 RAX: 0000000000040000 RBX: dffffc0000000000 RCX: ffffc9000e677000 RDX: 0000000000040000 RSI: ffffffff863244b8 RDI: ffff88806a764628 RBP: ffff8880639efc80 R08: ffff8880a80d05c0 R09: fffffbfff1282775 R10: fffffbfff1282774 R11: ffffffff89413ba3 R12: ffff88806a7645c0 R13: 0000000000000001 R14: ffff88809f29ac00 R15: 0000000000000000 FS: 00007fe8d0c58700(0000) GS:ffff8880ae900000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000001b32823000 CR3: 00000000672eb000 CR4: 00000000001406e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: x25_new_lci net/x25/af_x25.c:357 [inline] x25_connect+0x374/0xdf0 net/x25/af_x25.c:786 __sys_connect+0x266/0x330 net/socket.c:1686 __do_sys_connect net/socket.c:1697 [inline] __se_sys_connect net/socket.c:1694 [inline] __x64_sys_connect+0x73/0xb0 net/socket.c:1694 do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x457e39 Code: ad b8 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 7b b8 fb ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007fe8d0c57c78 EFLAGS: 00000246 ORIG_RAX: 000000000000002a RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000457e39 RDX: 0000000000000012 RSI: 0000000020000200 RDI: 0000000000000004 RBP: 000000000073bf00 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 00007fe8d0c586d4 R13: 00000000004be378 R14: 00000000004ceb00 R15: 00000000ffffffff Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Cc: Andrew Hendry <andrew.hendry@gmail.com> Cc: linux-x25@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-11tracing: probeevent: Correctly update remaining space in dynamic areaAndreas Ziegler
Commit 9178412ddf5a ("tracing: probeevent: Return consumed bytes of dynamic area") improved the string fetching mechanism by returning the number of required bytes after copying the argument to the dynamic area. However, this return value is now only used to increment the pointer inside the dynamic area but misses updating the 'maxlen' variable which indicates the remaining space in the dynamic area. This means that fetch_store_string() always reads the *total* size of the dynamic area from the data_loc pointer instead of the *remaining* size (and passes it along to strncpy_from_{user,unsafe}) even if we're already about to copy data into the middle of the dynamic area. Link: http://lkml.kernel.org/r/20190206190013.16405-1-andreas.ziegler@fau.de Cc: Ingo Molnar <mingo@redhat.com> Cc: stable@vger.kernel.org Fixes: 9178412ddf5a ("tracing: probeevent: Return consumed bytes of dynamic area") Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Andreas Ziegler <andreas.ziegler@fau.de> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-02-11vxlan: test dev->flags & IFF_UP before calling netif_rx()Eric Dumazet
netif_rx() must be called under a strict contract. At device dismantle phase, core networking clears IFF_UP and flush_all_backlogs() is called after rcu grace period to make sure no incoming packet might be in a cpu backlog and still referencing the device. Most drivers call netif_rx() from their interrupt handler, and since the interrupts are disabled at device dismantle, netif_rx() does not have to check dev->flags & IFF_UP Virtual drivers do not have this guarantee, and must therefore make the check themselves. Otherwise we risk use-after-free and/or crashes. Note this patch also fixes a small issue that came with commit ce6502a8f957 ("vxlan: fix a use after free in vxlan_encap_bypass"), since the dev->stats.rx_dropped change was done on the wrong device. Fixes: d342894c5d2f ("vxlan: virtual extensible lan") Fixes: ce6502a8f957 ("vxlan: fix a use after free in vxlan_encap_bypass") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Petr Machata <petrm@mellanox.com> Cc: Ido Schimmel <idosch@mellanox.com> Cc: Roopa Prabhu <roopa@cumulusnetworks.com> Cc: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-11Documentation: bring operstate documentation up-to-dateJouke Witteveen
Netlink has moved from bitmasks to group numbers long ago. Signed-off-by: Jouke Witteveen <j.witteveen@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-11xsk: share the mmap_sem for page pinningDavidlohr Bueso
Holding mmap_sem exclusively for a gup() is an overkill. Lets share the lock and replace the gup call for gup_longterm(), as it is better suited for the lifetime of the pinning. Fixes: c0c77d8fb787 ("xsk: add user memory registration support sockopt") Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Cc: David S. Miller <davem@davemloft.net> Cc: Bjorn Topel <bjorn.topel@intel.com> Cc: Magnus Karlsson <magnus.karlsson@intel.com> CC: netdev@vger.kernel.org Acked-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-02-11Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nfDavid S. Miller
Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following patchset contains Netfilter fixes for net: 1) Out-of-bound access to packet data from the snmp nat helper, from Jann Horn. 2) ICMP(v6) error packets are set as related traffic by conntrack, update protocol number before calling nf_nat_ipv4_manip_pkt() to use ICMP(v6) rather than the original protocol number, from Florian Westphal. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-11Merge tag 's390-5.0-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 bug fixes from Martin Schwidefsky: - Fix specification exception on z196 during ap probe - A fix for suspend-to-disk, the VMAP stack patch broke the swsusp_arch_suspend function - The EMC CKD ioctl of the dasd driver needs an additional size check for user space data - Revert an incorrect patch for the PCI base code that removed a bit lock that turned out to be required after all * tag 's390-5.0-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: Revert "s390/pci: remove bit_lock usage in interrupt handler" s390/zcrypt: fix specification exception on z196 during ap probe s390/dasd: fix using offset into zero size array error s390/suspend: fix stack setup in swsusp_arch_suspend
2019-02-11Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mattst88/alpha Pull alpha fixes from Matt Turner: "A few changes for alpha, including a build fix, a fix for the Eiger platform, and a fix for a tricky bug uncovered by the strace test suite that has existed since at least 1997 (v2.1.32)!" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mattst88/alpha: alpha: fix page fault handling for r16-r18 targets alpha: Fix Eiger NR_IRQS to 128 tools uapi: fix Alpha support
2019-02-11Documentation: Fix grammatical error in sysctl/fs.txt & clarify negative dentryWaiman Long
Fix a grammatical error in the dentry-state text and clarify the usage of negative dentries. Fixes: af0c9af1b3f66 ("fs/dcache: Track & report number of negative dentries") Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-02-11dm crypt: don't overallocate the integrity tag spaceMikulas Patocka
bio_sectors() returns the value in the units of 512-byte sectors (no matter what the real sector size of the device). dm-crypt multiplies bio_sectors() by on_disk_tag_size to calculate the space allocated for integrity tags. If dm-crypt is running with sector size larger than 512b, it allocates more data than is needed. Device Mapper trims the extra space when passing the bio to dm-integrity, so this bug didn't result in any visible misbehavior. But it must be fixed to avoid wasteful memory allocation for the block integrity payload. Fixes: ef43aa38063a6 ("dm crypt: add cryptographic data integrity protection (authenticated encryption)") Cc: stable@vger.kernel.org # 4.12+ Reported-by: Milan Broz <mbroz@redhat.com> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-02-11netfilter: nat: fix spurious connection timeoutsFlorian Westphal
Sander Eikelenboom bisected a NAT related regression down to the l4proto->manip_pkt indirection removal. I forgot that ICMP(v6) errors (e.g. PKTTOOBIG) can be set as related to the existing conntrack entry. Therefore, when passing the skb to nf_nat_ipv4/6_manip_pkt(), that ended up calling the wrong l4 manip function, as tuple->dst.protonum is the original flows l4 protocol (TCP, UDP, etc). Set the dst protocol field to ICMP(v6), we already have a private copy of the tuple due to the inversion of src/dst. Reported-by: Sander Eikelenboom <linux@eikelenboom.it> Tested-by: Sander Eikelenboom <linux@eikelenboom.it> Fixes: faec18dbb0405 ("netfilter: nat: remove l4proto->manip_pkt") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-02-11netfilter: nf_nat_snmp_basic: add missing length checks in ASN.1 cbsJann Horn
The generic ASN.1 decoder infrastructure doesn't guarantee that callbacks will get as much data as they expect; callbacks have to check the `datalen` parameter before looking at `data`. Make sure that snmp_version() and snmp_helper() don't read/write beyond the end of the packet data. (Also move the assignment to `pdata` down below the check to make it clear that it isn't necessarily a pointer we can use before the `datalen` check.) Fixes: cc2d58634e0f ("netfilter: nf_nat_snmp_basic: use asn1 decoder library") Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-02-11drm/i915/cnl: Fix CNL macros for Voltage Swing programmingAditya Swarup
CNL macros for register groups CNL_PORT_TX_DW2_* / CNL_PORT_TX_DW5_* are configured incorrectly wrt definition of _CNL_PORT_TX_DW_GRP. v2: Jani suggested to keep the macros organized semantically i.e., by function, secondarily by port/pipe/transcoder.->(dw, port) Fixes: 4e53840fdfdd ("drm/i915/icl: Introduce new macros to get combophy registers") Cc: Clint Taylor <clinton.a.taylor@intel.com> Cc: Imre Deak <imre.deak@intel.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Signed-off-by: Aditya Swarup <aditya.swarup@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190110230844.9213-1-aditya.swarup@intel.com (cherry picked from commit b14c06ec024947eaa35212f2380e90233d5092e0) Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2019-02-11drm/i915/icl: combo port vswing programming changes per BSPECClint Taylor
In August 2018 the BSPEC changed the ICL port programming sequence to closely resemble earlier gen programming sequence. Restrict combo phy to HBR max rate unless eDP panel is connected to port. v2: remove debug code that Imre found v3: simplify translation table if-else v4: edp translation table now based on link rate and low_swing v5: Misc review comments + r-b BSpec: 21257 Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Imre Deak <imre.deak@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Clint Taylor <clinton.a.taylor@intel.com> Signed-off-by: Imre Deak <imre.deak@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/1545084827-5776-1-git-send-email-clinton.a.taylor@intel.com (cherry picked from commit b265a2a6255f581258ccfdccbd2efca51a142fe2) Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2019-02-11bpf: fix lockdep false positive in stackmapAlexei Starovoitov
Lockdep warns about false positive: [ 11.211460] ------------[ cut here ]------------ [ 11.211936] DEBUG_LOCKS_WARN_ON(depth <= 0) [ 11.211985] WARNING: CPU: 0 PID: 141 at ../kernel/locking/lockdep.c:3592 lock_release+0x1ad/0x280 [ 11.213134] Modules linked in: [ 11.214954] RIP: 0010:lock_release+0x1ad/0x280 [ 11.223508] Call Trace: [ 11.223705] <IRQ> [ 11.223874] ? __local_bh_enable+0x7a/0x80 [ 11.224199] up_read+0x1c/0xa0 [ 11.224446] do_up_read+0x12/0x20 [ 11.224713] irq_work_run_list+0x43/0x70 [ 11.225030] irq_work_run+0x26/0x50 [ 11.225310] smp_irq_work_interrupt+0x57/0x1f0 [ 11.225662] irq_work_interrupt+0xf/0x20 since rw_semaphore is released in a different task vs task that locked the sema. It is expected behavior. Fix the warning with up_read_non_owner() and rwsem_release() annotation. Fixes: bae77c5eb5b2 ("bpf: enable stackmap with build_id in nmi context") Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-02-11mac80211: Fix Tx aggregation session tear down with ITXQsIlan Peer
When mac80211 requests the low level driver to stop an ongoing Tx aggregation, the low level driver is expected to call ieee80211_stop_tx_ba_cb_irqsafe() to indicate that it is ready to stop the session. The callback in turn schedules a worker to complete the session tear down, which in turn also handles the relevant state for the intermediate Tx queue. However, as this flow in asynchronous, the intermediate queue should be stopped and not continue servicing frames, as in such a case frames that are dequeued would be marked as part of an aggregation, although the aggregation is already been stopped. Fix this by stopping the intermediate Tx queue, before calling the low level driver to stop the Tx aggregation. Signed-off-by: Ilan Peer <ilan.peer@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-02-11cfg80211: prevent speculation on cfg80211_classify8021d() returnJohannes Berg
It's possible that the caller of cfg80211_classify8021d() uses the value to index an array, like mac80211 in ieee80211_downgrade_queue(). Prevent speculation on the return value. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-02-11cfg80211: pmsr: record netlink port IDJohannes Berg
Without recording the netlink port ID, we cannot return the results or complete messages to userspace, nor will we be able to abort if the socket is closed, so clearly we need to fill the value. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-02-11nl80211: Fix FTM per burst maximum valueAviya Erenfeld
Fix FTM per burst maximum value from 15 to 31 (The maximal bits that represents that number in the frame is 5 hence a maximal value of 31) Signed-off-by: Aviya Erenfeld <aviya.erenfeld@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-02-11mac80211: call drv_ibss_join() on restartJohannes Berg
If a driver does any significant activity in its ibss_join method, then it will very well expect that to be called during restart, before any stations are added. Do that. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-02-10alpha: fix page fault handling for r16-r18 targetsSergei Trofimovich
Fix page fault handling code to fixup r16-r18 registers. Before the patch code had off-by-two registers bug. This bug caused overwriting of ps,pc,gp registers instead of fixing intended r16,r17,r18 (see `struct pt_regs`). More details: Initially Dmitry noticed a kernel bug as a failure on strace test suite. Test passes unmapped userspace pointer to io_submit: ```c #include <err.h> #include <unistd.h> #include <sys/mman.h> #include <asm/unistd.h> int main(void) { unsigned long ctx = 0; if (syscall(__NR_io_setup, 1, &ctx)) err(1, "io_setup"); const size_t page_size = sysconf(_SC_PAGESIZE); const size_t size = page_size * 2; void *ptr = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0); if (MAP_FAILED == ptr) err(1, "mmap(%zu)", size); if (munmap(ptr, size)) err(1, "munmap"); syscall(__NR_io_submit, ctx, 1, ptr + page_size); syscall(__NR_io_destroy, ctx); return 0; } ``` Running this test causes kernel to crash when handling page fault: ``` Unable to handle kernel paging request at virtual address ffffffffffff9468 CPU 3 aio(26027): Oops 0 pc = [<fffffc00004eddf8>] ra = [<fffffc00004edd5c>] ps = 0000 Not tainted pc is at sys_io_submit+0x108/0x200 ra is at sys_io_submit+0x6c/0x200 v0 = fffffc00c58e6300 t0 = fffffffffffffff2 t1 = 000002000025e000 t2 = fffffc01f159fef8 t3 = fffffc0001009640 t4 = fffffc0000e0f6e0 t5 = 0000020001002e9e t6 = 4c41564e49452031 t7 = fffffc01f159c000 s0 = 0000000000000002 s1 = 000002000025e000 s2 = 0000000000000000 s3 = 0000000000000000 s4 = 0000000000000000 s5 = fffffffffffffff2 s6 = fffffc00c58e6300 a0 = fffffc00c58e6300 a1 = 0000000000000000 a2 = 000002000025e000 a3 = 00000200001ac260 a4 = 00000200001ac1e8 a5 = 0000000000000001 t8 = 0000000000000008 t9 = 000000011f8bce30 t10= 00000200001ac440 t11= 0000000000000000 pv = fffffc00006fd320 at = 0000000000000000 gp = 0000000000000000 sp = 00000000265fd174 Disabling lock debugging due to kernel taint Trace: [<fffffc0000311404>] entSys+0xa4/0xc0 ``` Here `gp` has invalid value. `gp is s overwritten by a fixup for the following page fault handler in `io_submit` syscall handler: ``` __se_sys_io_submit ... ldq a1,0(t1) bne t0,4280 <__se_sys_io_submit+0x180> ``` After a page fault `t0` should contain -EFALUT and `a1` is 0. Instead `gp` was overwritten in place of `a1`. This happens due to a off-by-two bug in `dpf_reg()` for `r16-r18` (aka `a0-a2`). I think the bug went unnoticed for a long time as `gp` is one of scratch registers. Any kernel function call would re-calculate `gp`. Dmitry tracked down the bug origin back to 2.1.32 kernel version where trap_a{0,1,2} fields were inserted into struct pt_regs. And even before that `dpf_reg()` contained off-by-one error. Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: linux-alpha@vger.kernel.org Cc: linux-kernel@vger.kernel.org Reported-and-reviewed-by: "Dmitry V. Levin" <ldv@altlinux.org> Cc: stable@vger.kernel.org # v2.1.32+ Bug: https://bugs.gentoo.org/672040 Signed-off-by: Sergei Trofimovich <slyfox@gentoo.org> Signed-off-by: Matt Turner <mattst88@gmail.com>
2019-02-10alpha: Fix Eiger NR_IRQS to 128Meelis Roos
Eiger machine vector definition has nr_irqs 128, and working 2.6.26 boot shows SCSI getting IRQ-s 64 and 65. Current kernel boot fails because Symbios SCSI fails to request IRQ-s and does not find the disks. It has been broken at least since 3.18 - the earliest I could test with my gcc-5. The headers have moved around and possibly another order of defines has worked in the past - but since 128 seems to be correct and used, fix arch/alpha/include/asm/irq.h to have NR_IRQS=128 for Eiger. This fixes 4.19-rc7 boot on my Force Flexor A264 (Eiger subarch). Cc: stable@vger.kernel.org # v3.18+ Signed-off-by: Meelis Roos <mroos@linux.ee> Signed-off-by: Matt Turner <mattst88@gmail.com>
2019-02-10xsk: add missing smp_rmb() in xsk_mmapMagnus Karlsson
All the setup code in AF_XDP is protected by a mutex with the exception of the mmap code that cannot use it. To make sure that a process banging on the mmap call at the same time as another process is setting up the socket, smp_wmb() calls were added in the umem registration code and the queue creation code, so that the published structures that xsk_mmap needs would be consistent. However, the corresponding smp_rmb() calls were not added to the xsk_mmap code. This patch adds these calls. Fixes: 37b076933a8e3 ("xsk: add missing write- and data-dependency barrier") Fixes: c0c77d8fb787c ("xsk: add user memory registration support sockopt") Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-02-10bpf: only adjust gso_size on bytestream protocolsWillem de Bruijn
bpf_skb_change_proto and bpf_skb_adjust_room change skb header length. For GSO packets they adjust gso_size to maintain the same MTU. The gso size can only be safely adjusted on bytestream protocols. Commit d02f51cbcf12 ("bpf: fix bpf_skb_adjust_net/bpf_skb_proto_xlat to deal with gso sctp skbs") excluded SKB_GSO_SCTP. Since then type SKB_GSO_UDP_L4 has been added, whose contents are one gso_size unit per datagram. Also exclude these. Move from a blacklist to a whitelist check to future proof against additional such new GSO types, e.g., for fraglist based GRO. Fixes: bec1f6f69736 ("udp: generate gso with UDP_SEGMENT") Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-02-10tools uapi: fix Alpha supportBob Tracy
Cc: stable@vger.kernel.org # v4.18+ Signed-off-by: Bob Tracy <rct@frus.com> Signed-off-by: Matt Turner <mattst88@gmail.com>
2019-02-10Linux 5.0-rc6Linus Torvalds
2019-02-10Merge branch 'r8169-revert-two-commits-due-to-a-regression'David S. Miller
Heiner Kallweit says: ==================== r8169: revert two commits due to a regression Sander reported a regression (kernel panic, see[1]), therefore let's revert these commits. Removal of the barriers doesn't seem to contribute to the issue, the patch just overlaps with the problematic one and only reverting both patches was tested. [1] https://marc.info/?t=154965066400001&r=1&w=2 v2: - improve commit message ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-10Revert "r8169: make use of xmit_more and __netdev_sent_queue"Heiner Kallweit
This reverts commit 2e6eedb4813e34d8d84ac0eb3afb668966f3f356. Sander reported a regression causing a kernel panic[1], therefore let's revert this commit. [1] https://marc.info/?t=154965066400001&r=1&w=2 Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>