summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2020-12-10Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdmaLinus Torvalds
Pull rdma fixes from Jason Gunthorpe: "Two user triggerable crashers and a some EFA related regressions: - Syzkaller found a bug in CM - Restore access to the GID table and fix modify_qp for EFA - Crasher in qedr" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: RDMA/cm: Fix an attempt to use non-valid pointer when cleaning timewait RDMA/core: Fix empty gid table for non IB/RoCE devices RDMA/efa: Use the correct current and new states in modify QP RDMA/qedr: iWARP invalid(zero) doorbell address fix
2020-12-10Merge tag 'media/v5.10-4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media Pull media fixes from Mauro Carvalho Chehab: "A couple of fixes: - videobuf2: fix a DMABUF bug, preventing it to properly handle cache sync/flush - vidtv: an usage after free and a few sparse/smatch warning fixes - pulse8-cec: a duplicate free and a bug related to new firmware usage - mtk-cir: fix a regression on a clock setting" * tag 'media/v5.10-4' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: media: vidtv: fix some warnings media: vidtv: fix kernel-doc markups media: [next] media: vidtv: fix a read from an object after it has been freed media: vb2: set cache sync hints when init buffers media: pulse8-cec: add support for FW v10 and up media: pulse8-cec: fix duplicate free at disconnect or probe error media: mtk-cir: fix calculation of chk period
2020-12-10x86/resctrl: Fix incorrect local bandwidth when mba_sc is enabledXiaochen Shen
The MBA software controller (mba_sc) is a feedback loop which periodically reads MBM counters and tries to restrict the bandwidth below a user-specified value. It tags along the MBM counter overflow handler to do the updates with 1s interval in mbm_update() and update_mba_bw(). The purpose of mbm_update() is to periodically read the MBM counters to make sure that the hardware counter doesn't wrap around more than once between user samplings. mbm_update() calls __mon_event_count() for local bandwidth updating when mba_sc is not enabled, but calls mbm_bw_count() instead when mba_sc is enabled. __mon_event_count() will not be called for local bandwidth updating in MBM counter overflow handler, but it is still called when reading MBM local bandwidth counter file 'mbm_local_bytes', the call path is as below: rdtgroup_mondata_show() mon_event_read() mon_event_count() __mon_event_count() In __mon_event_count(), m->chunks is updated by delta chunks which is calculated from previous MSR value (m->prev_msr) and current MSR value. When mba_sc is enabled, m->chunks is also updated in mbm_update() by mistake by the delta chunks which is calculated from m->prev_bw_msr instead of m->prev_msr. But m->chunks is not used in update_mba_bw() in the mba_sc feedback loop. When reading MBM local bandwidth counter file, m->chunks was changed unexpectedly by mbm_bw_count(). As a result, the incorrect local bandwidth counter which calculated from incorrect m->chunks is shown to the user. Fix this by removing incorrect m->chunks updating in mbm_bw_count() in MBM counter overflow handler, and always calling __mon_event_count() in mbm_update() to make sure that the hardware local bandwidth counter doesn't wrap around. Test steps: # Run workload with aggressive memory bandwidth (e.g., 10 GB/s) git clone https://github.com/intel/intel-cmt-cat && cd intel-cmt-cat && make ./tools/membw/membw -c 0 -b 10000 --read # Enable MBA software controller mount -t resctrl resctrl -o mba_MBps /sys/fs/resctrl # Create control group c1 mkdir /sys/fs/resctrl/c1 # Set MB throttle to 6 GB/s echo "MB:0=6000;1=6000" > /sys/fs/resctrl/c1/schemata # Write PID of the workload to tasks file echo `pidof membw` > /sys/fs/resctrl/c1/tasks # Read local bytes counters twice with 1s interval, the calculated # local bandwidth is not as expected (approaching to 6 GB/s): local_1=`cat /sys/fs/resctrl/c1/mon_data/mon_L3_00/mbm_local_bytes` sleep 1 local_2=`cat /sys/fs/resctrl/c1/mon_data/mon_L3_00/mbm_local_bytes` echo "local b/w (bytes/s):" `expr $local_2 - $local_1` Before fix: local b/w (bytes/s): 11076796416 After fix: local b/w (bytes/s): 5465014272 Fixes: ba0f26d8529c (x86/intel_rdt/mba_sc: Prepare for feedback loop) Signed-off-by: Xiaochen Shen <xiaochen.shen@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Tony Luck <tony.luck@intel.com> Cc: <stable@vger.kernel.org> Link: https://lkml.kernel.org/r/1607063279-19437-1-git-send-email-xiaochen.shen@intel.com
2020-12-10Merge tag 'kvmarm-fixes-5.10-5' of ↵Paolo Bonzini
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD kvm/arm64 fixes for 5.10, take #5 - Don't leak page tables on PTE update - Correctly invalidate TLBs on table to block transition - Only update permissions if the fault level matches the expected mapping size
2020-12-10Merge branch 'md-fixes' of ↵Jens Axboe
https://git.kernel.org/pub/scm/linux/kernel/git/song/md into block-5.10 Pull MD fixes from Song: "This is to fix raid10 data corruption [1] in 5.10-rc7." * 'md-fixes' of https://git.kernel.org/pub/scm/linux/kernel/git/song/md: Revert "md: add md_submit_discard_bio() for submitting discard bio" Revert "md/raid10: extend r10bio devs to raid disks" Revert "md/raid10: pull codes that wait for blocked dev into one function" Revert "md/raid10: improve raid10 discard request" Revert "md/raid10: improve discard request for far layout" Revert "dm raid: remove unnecessary discard limits for raid10"
2020-12-10x86/mm/mem_encrypt: Fix definition of PMD_FLAGS_DEC_WPArvind Sankar
The PAT bit is in different locations for 4k and 2M/1G page table entries. Add a definition for _PAGE_LARGE_CACHE_MASK to represent the three caching bits (PWT, PCD, PAT), similar to _PAGE_CACHE_MASK for 4k pages, and use it in the definition of PMD_FLAGS_DEC_WP to get the correct PAT index for write-protected pages. Fixes: 6ebcb060713f ("x86/mm: Add support to encrypt the kernel in-place") Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu> Signed-off-by: Borislav Petkov <bp@suse.de> Tested-by: Tom Lendacky <thomas.lendacky@amd.com> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20201111160946.147341-1-nivedita@alum.mit.edu
2020-12-10zonefs: fix page reference and BIO leakDamien Le Moal
In zonefs_file_dio_append(), the pages obtained using bio_iov_iter_get_pages() are not released on completion of the REQ_OP_APPEND BIO, nor when bio_iov_iter_get_pages() fails. Furthermore, a call to bio_put() is missing when bio_iov_iter_get_pages() fails. Fix these resource leaks by adding BIO resource release code (bio_put()i and bio_release_pages()) at the end of the function after the BIO execution and add a jump to this resource cleanup code in case of bio_iov_iter_get_pages() failure. While at it, also fix the call to task_io_account_write() to be passed the correct BIO size instead of bio_iov_iter_get_pages() return value. Reported-by: Christoph Hellwig <hch@lst.de> Fixes: 02ef12a663c7 ("zonefs: use REQ_OP_ZONE_APPEND for sync DIO") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2020-12-09Input: i8042 - add Acer laptops to the i8042 reset listChris Chiu
The touchpad operates in Basic Mode by default in the Acer BIOS setup, but some Aspire/TravelMate models require the i8042 to be reset in order to be correctly detected. Signed-off-by: Chris Chiu <chiu@endlessos.org> Link: https://lore.kernel.org/r/20201207071250.15021-1-chiu@endlessos.org Cc: stable@vger.kernel.org Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2020-12-09Revert "md: add md_submit_discard_bio() for submitting discard bio"Song Liu
This reverts commit 2628089b74d5a64bd0bcb5d247a18f78d7b6f4d0. Matthew Ruffell reported data corruption in raid10 due to the changes in discard handling [1]. Revert these changes before we find a proper fix. [1] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1907262/ Cc: Matthew Ruffell <matthew.ruffell@canonical.com> Cc: Xiao Ni <xni@redhat.com> Signed-off-by: Song Liu <songliubraving@fb.com>
2020-12-09Revert "md/raid10: extend r10bio devs to raid disks"Song Liu
This reverts commit 8650a889017cb1f6ea6813ccf83a2e9f6fa49dd3. Matthew Ruffell reported data corruption in raid10 due to the changes in discard handling [1]. Revert these changes before we find a proper fix. [1] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1907262/ Cc: Matthew Ruffell <matthew.ruffell@canonical.com> Cc: Xiao Ni <xni@redhat.com> Signed-off-by: Song Liu <songliubraving@fb.com>
2020-12-09Revert "md/raid10: pull codes that wait for blocked dev into one function"Song Liu
This reverts commit f046f5d0d79cdb968f219ce249e497fd1accf484. Matthew Ruffell reported data corruption in raid10 due to the changes in discard handling [1]. Revert these changes before we find a proper fix. [1] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1907262/ Cc: Matthew Ruffell <matthew.ruffell@canonical.com> Cc: Xiao Ni <xni@redhat.com> Signed-off-by: Song Liu <songliubraving@fb.com>
2020-12-09Revert "md/raid10: improve raid10 discard request"Song Liu
This reverts commit bcc90d280465ebd51ab8688be86e1f00c62dccf9. Matthew Ruffell reported data corruption in raid10 due to the changes in discard handling [1]. Revert these changes before we find a proper fix. [1] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1907262/ Cc: Matthew Ruffell <matthew.ruffell@canonical.com> Cc: Xiao Ni <xni@redhat.com> Signed-off-by: Song Liu <songliubraving@fb.com>
2020-12-09Revert "md/raid10: improve discard request for far layout"Song Liu
This reverts commit d3ee2d8415a6256c1c41e1be36e80e640c3e6359. Matthew Ruffell reported data corruption in raid10 due to the changes in discard handling [1]. Revert these changes before we find a proper fix. [1] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1907262/ Cc: Matthew Ruffell <matthew.ruffell@canonical.com> Cc: Xiao Ni <xni@redhat.com> Signed-off-by: Song Liu <songliubraving@fb.com>
2020-12-09Revert "dm raid: remove unnecessary discard limits for raid10"Song Liu
This reverts commit f0e90b6c663a7e3b4736cb318c6c7c589f152c28. Matthew Ruffell reported data corruption in raid10 due to the changes in discard handling [1]. Revert these changes before we find a proper fix. [1] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1907262/ Cc: Matthew Ruffell <matthew.ruffell@canonical.com> Cc: Xiao Ni <xni@redhat.com> Cc: Mike Snitzer <snitzer@redhat.com> Acked-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Song Liu <songliubraving@fb.com>
2020-12-09MAINTAINERS: Add entry for Marvell Prestera Ethernet Switch driverMickey Rachamim
Add maintainers info for new Marvell Prestera Ethernet switch driver. Signed-off-by: Mickey Rachamim <mickeyr@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-09net: sched: Fix dump of MPLS_OPT_LSE_LABEL attribute in cls_flowerGuillaume Nault
TCA_FLOWER_KEY_MPLS_OPT_LSE_LABEL is a u32 attribute (MPLS label is 20 bits long). Fixes the following bug: $ tc filter add dev ethX ingress protocol mpls_uc \ flower mpls lse depth 2 label 256 \ action drop $ tc filter show dev ethX ingress filter protocol mpls_uc pref 49152 flower chain 0 filter protocol mpls_uc pref 49152 flower chain 0 handle 0x1 eth_type 8847 mpls lse depth 2 label 0 <-- invalid label 0, should be 256 ... Fixes: 61aec25a6db5 ("cls_flower: Support filtering on multiple MPLS Label Stack Entries") Signed-off-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-09drm/amd/pm: typo fix (CUSTOM -> COMPUTE)Evan Quan
The "COMPUTE" was wrongly spelled as "CUSTOM". Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 5.9.x
2020-12-09Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nfDavid S. Miller
Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following patchset contains Netfilter fixes for net: 1) Switch to RCU in x_tables to fix possible NULL pointer dereference, from Subash Abhinov Kasiviswanathan. 2) Fix netlink dump of dynset timeouts later than 23 days. 3) Add comment for the indirect serialization of the nft commit mutex with rtnl_mutex. 4) Remove bogus check for confirmed conntrack when matching on the conntrack ID, from Brett Mastbergen. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-09Merge branch '1GbE' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2020-12-09 This series contains updates to igb, ixgbe, i40e, and ice drivers. Sven Auhagen fixes issues with igb XDP: return correct error value in XDP xmit back, increase header padding to include space for double VLAN, add an extack error when Rx buffer is too small for frame size, set metasize if it is set in xdp, change xdp_do_flush_map to xdp_do_flush, and update trans_start to avoid possible Tx timeout. Björn fixes an issue where an Rx buffer can be reused prematurely with XDP redirect for ixgbe, i40e, and ice drivers. The following are changes since commit 323a391a220c4a234cb1e678689d7f4c3b73f863: can: isotp: isotp_setsockopt(): block setsockopt on bound sockets and are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue 1GbE ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-09Input: cros_ec_keyb - send 'scancodes' in addition to key eventsDmitry Torokhov
To let userspace know what 'scancodes' should be used in EVIOCGKEYCODE and EVIOCSKEYCODE ioctls, we should send EV_MSC/MSC_SCAN events in addition to EV_KEY/KEY_* events. The driver already declared MSC_SCAN capability, so it is only matter of actually sending the events. Link: https://lore.kernel.org/r/X87aOaSptPTvZ3nZ@google.com Acked-by: Rajat Jain <rajatja@google.com> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2020-12-09Merge branch 'mlx4_en-fixes'David S. Miller
Tariq Toukan says: ==================== mlx4_en fixes This patchset by Moshe contains fixes to the mlx4 Eth driver, addressing issues in restart flow. Patch 1 protects the restart task from being rescheduled while active. Please queue for -stable >= v2.6. Patch 2 reconstructs SQs stuck in error state, and adds prints for improved debuggability. Please queue for -stable >= v3.12. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-09net/mlx4_en: Handle TX error CQEMoshe Shemesh
In case error CQE was found while polling TX CQ, the QP is in error state and all posted WQEs will generate error CQEs without any data transmitted. Fix it by reopening the channels, via same method used for TX timeout handling. In addition add some more info on error CQE and WQE for debug. Fixes: bd2f631d7c60 ("net/mlx4_en: Notify user when TX ring in error state") Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-09net/mlx4_en: Avoid scheduling restart task if it is already runningMoshe Shemesh
Add restarting state flag to avoid scheduling another restart task while such task is already running. Change task name from watchdog_task to restart_task to better fit the task role. Fixes: 1e338db56e5a ("mlx4_en: Fix a race at restart task") Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-09tcp: fix cwnd-limited bug for TSO deferral where we send nothingNeal Cardwell
When cwnd is not a multiple of the TSO skb size of N*MSS, we can get into persistent scenarios where we have the following sequence: (1) ACK for full-sized skb of N*MSS arrives -> tcp_write_xmit() transmit full-sized skb with N*MSS -> move pacing release time forward -> exit tcp_write_xmit() because pacing time is in the future (2) TSQ callback or TCP internal pacing timer fires -> try to transmit next skb, but TSO deferral finds remainder of available cwnd is not big enough to trigger an immediate send now, so we defer sending until the next ACK. (3) repeat... So we can get into a case where we never mark ourselves as cwnd-limited for many seconds at a time, even with bulk/infinite-backlog senders, because: o In case (1) above, every time in tcp_write_xmit() we have enough cwnd to send a full-sized skb, we are not fully using the cwnd (because cwnd is not a multiple of the TSO skb size). So every time we send data, we are not cwnd limited, and so in the cwnd-limited tracking code in tcp_cwnd_validate() we mark ourselves as not cwnd-limited. o In case (2) above, every time in tcp_write_xmit() that we try to transmit the "remainder" of the cwnd but defer, we set the local variable is_cwnd_limited to true, but we do not send any packets, so sent_pkts is zero, so we don't call the cwnd-limited logic to update tp->is_cwnd_limited. Fixes: ca8a22634381 ("tcp: make cwnd-limited checks measurement-based, and gentler") Reported-by: Ingemar Johansson <ingemar.s.johansson@ericsson.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20201209035759.1225145-1-ncardwell.kernel@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-09net: flow_offload: Fix memory leak for indirect flow blockChris Mi
The offending commit introduces a cleanup callback that is invoked when the driver module is removed to clean up the tunnel device flow block. But it returns on the first iteration of the for loop. The remaining indirect flow blocks will never be freed. Fixes: 1fac52da5942 ("net: flow_offload: consolidate indirect flow_block infrastructure") CC: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Chris Mi <cmi@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com>
2020-12-09tcp: Retain ECT bits for tos reflectionWei Wang
For DCTCP, we have to retain the ECT bits set by the congestion control algorithm on the socket when reflecting syn TOS in syn-ack, in order to make ECN work properly. Fixes: ac8f1710c12b ("tcp: reflect tos value received in SYN to the socket") Reported-by: Alexander Duyck <alexanderduyck@fb.com> Signed-off-by: Wei Wang <weiwan@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-09ethtool: fix stack overflow in ethnl_parse_bitset()Michal Kubecek
Syzbot reported a stack overflow in bitmap_from_arr32() called from ethnl_parse_bitset() when bitset from netlink message is longer than target bitmap length. While ethnl_compact_sanity_checks() makes sure that trailing part is all zeros (i.e. the request does not try to touch bits kernel does not recognize), we also need to cap change_bits to nbits so that we don't try to write past the prepared bitmaps. Fixes: 88db6d1e4f62 ("ethtool: add ethnl_parse_bitset() helper") Reported-by: syzbot+9d39fa49d4df294aab93@syzkaller.appspotmail.com Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Link: https://lore.kernel.org/r/3487ee3a98e14cd526f55b6caaa959d2dcbcad9f.1607465316.git.mkubecek@suse.cz Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-09e1000e: fix S0ix flow to allow S0i3.2 subset entryVitaly Lifshits
Changed a configuration in the flows to align with architecture requirements to achieve S0i3.2 substate. This helps both i219V and i219LM configurations. Also fixed a typo in the previous commit 632fbd5eb5b0 ("e1000e: fix S0ix flows for cable connected case"). Fixes: 632fbd5eb5b0 ("e1000e: fix S0ix flows for cable connected case"). Signed-off-by: Vitaly Lifshits <vitaly.lifshits@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Reviewed-by: Alexander Duyck <alexanderduyck@fb.com> Signed-off-by: Mario Limonciello <mario.limonciello@dell.com> Link: https://lore.kernel.org/r/20201208185632.151052-1-mario.limonciello@dell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-09ice: avoid premature Rx buffer reuseBjörn Töpel
The page recycle code, incorrectly, relied on that a page fragment could not be freed inside xdp_do_redirect(). This assumption leads to that page fragments that are used by the stack/XDP redirect can be reused and overwritten. To avoid this, store the page count prior invoking xdp_do_redirect(). Fixes: efc2214b6047 ("ice: Add support for XDP") Reported-and-analyzed-by: Li RongQing <lirongqing@baidu.com> Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Tested-by: George Kuruvinakunnel <george.kuruvinakunnel@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-12-09ixgbe: avoid premature Rx buffer reuseBjörn Töpel
The page recycle code, incorrectly, relied on that a page fragment could not be freed inside xdp_do_redirect(). This assumption leads to that page fragments that are used by the stack/XDP redirect can be reused and overwritten. To avoid this, store the page count prior invoking xdp_do_redirect(). Fixes: 6453073987ba ("ixgbe: add initial support for xdp redirect") Reported-and-analyzed-by: Li RongQing <lirongqing@baidu.com> Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-12-09i40e: avoid premature Rx buffer reuseBjörn Töpel
The page recycle code, incorrectly, relied on that a page fragment could not be freed inside xdp_do_redirect(). This assumption leads to that page fragments that are used by the stack/XDP redirect can be reused and overwritten. To avoid this, store the page count prior invoking xdp_do_redirect(). Longer explanation: Intel NICs have a recycle mechanism. The main idea is that a page is split into two parts. One part is owned by the driver, one part might be owned by someone else, such as the stack. t0: Page is allocated, and put on the Rx ring +--------------- used by NIC ->| upper buffer (rx_buffer) +--------------- | lower buffer +--------------- page count == USHRT_MAX rx_buffer->pagecnt_bias == USHRT_MAX t1: Buffer is received, and passed to the stack (e.g.) +--------------- | upper buff (skb) +--------------- used by NIC ->| lower buffer (rx_buffer) +--------------- page count == USHRT_MAX rx_buffer->pagecnt_bias == USHRT_MAX - 1 t2: Buffer is received, and redirected +--------------- | upper buff (skb) +--------------- used by NIC ->| lower buffer (rx_buffer) +--------------- Now, prior calling xdp_do_redirect(): page count == USHRT_MAX rx_buffer->pagecnt_bias == USHRT_MAX - 2 This means that buffer *cannot* be flipped/reused, because the skb is still using it. The problem arises when xdp_do_redirect() actually frees the segment. Then we get: page count == USHRT_MAX - 1 rx_buffer->pagecnt_bias == USHRT_MAX - 2 From a recycle perspective, the buffer can be flipped and reused, which means that the skb data area is passed to the Rx HW ring! To work around this, the page count is stored prior calling xdp_do_redirect(). Note that this is not optimal, since the NIC could actually reuse the "lower buffer" again. However, then we need to track whether XDP_REDIRECT consumed the buffer or not. Fixes: d9314c474d4f ("i40e: add support for XDP_REDIRECT") Reported-and-analyzed-by: Li RongQing <lirongqing@baidu.com> Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Tested-by: George Kuruvinakunnel <george.kuruvinakunnel@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-12-09igb: avoid transmit queue timeout in xdp pathSven Auhagen
Since we share the transmit queue with the network stack, it is possible that we run into a transmit queue timeout. This will reset the queue. This happens under high load when XDP is using the transmit queue pretty much exclusively. netdev_start_xmit() sets the trans_start variable of the transmit queue to jiffies which is later utilized by dev_watchdog(), so to avoid timeout, let stack know that XDP xmit happened by bumping the trans_start within XDP Tx routines to jiffies. Fixes: 9cbc948b5a20 ("igb: add XDP support") Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Sven Auhagen <sven.auhagen@voleatech.de> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-12-09igb: use xdp_do_flushSven Auhagen
Since it is a new XDP implementation change xdp_do_flush_map to xdp_do_flush. Fixes: 9cbc948b5a20 ("igb: add XDP support") Suggested-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Sven Auhagen <sven.auhagen@voleatech.de> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-12-09igb: skb add metasize for xdpSven Auhagen
add metasize if it is set in xdp Fixes: 9cbc948b5a20 ("igb: add XDP support") Suggested-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Sven Auhagen <sven.auhagen@voleatech.de> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-12-09igb: XDP extack message on errorSven Auhagen
Add an extack error message when the RX buffer size is too small for the frame size. Fixes: 9cbc948b5a20 ("igb: add XDP support") Suggested-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Sven Auhagen <sven.auhagen@voleatech.de> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-12-09igb: take VLAN double header into accountSven Auhagen
Increase the packet header padding to include double VLAN tagging. This patch uses a macro for this. Fixes: 9cbc948b5a20 ("igb: add XDP support") Suggested-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Sven Auhagen <sven.auhagen@voleatech.de> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-12-09igb: XDP xmit back fix error codeSven Auhagen
The igb XDP xmit back function should only return defined error codes. Fixes: 9cbc948b5a20 ("igb: add XDP support") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Sven Auhagen <sven.auhagen@voleatech.de> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-12-09Merge tag 'arm-soc-fixes-v5.10-4b' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull ARM SoC fixes from Arnd Bergmann: "There are a few more PHY mode changes for allwinner SoC based boards with a Realtek PHY after the driver changed its behavior, I assume there will be more of these in the future. Also on for Allwinner, the Banana Pi M2 board had a regression that led to some devices not working because of a slightly incorrect voltage being applied. By popular demand, I picked up a change from Krzysztof Kozlowski to actually list the SoC tree in the MAINTAINERS file. We don't want to get Cc'd on normal patches that are picked up by platform maintainers, but the lack of an entry has led to confusion in the past. All the other changes are fairly benign, fixing boot-time or compile-time warning messages in various places: - A dtc warning on the OLPC XO-1.75 - A boot-time warning on i.MX6 wandboard - A harmless compile-time warning - A regression causing one of the i.MX6 SoCs to be identified as another - Missing SoC identification of Allwinner V3 and S3" * tag 'arm-soc-fixes-v5.10-4b' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: firmware: xilinx: Mark pm_api_features_map with static keyword ARM: dts: mmp2-olpc-xo-1-75: clear the warnings when make dtbs MAINTAINERS: add a limited ARM and ARM64 SoC entry MAINTAINERS: correct SoC Git address (formerly: arm-soc) ARM: keystone: remove SECTION_SIZE_BITS/MAX_PHYSMEM_BITS arm64: dts: allwinner: H5: NanoPi Neo Plus2: phy-mode rgmii-id arm64: dts: allwinner: A64 Sopine: phy-mode rgmii-id ARM: dts: imx6qdl-kontron-samx6i: fix I2C_PM scl pin ARM: dts: imx6qdl-wandboard-revd1: Remove PAD_GPIO_6 from enetgrp ARM: imx: Use correct SRC base address ARM: dts: sun7i: pcduino3-nano: enable RGMII RX/TX delay on PHY ARM: dts: sun8i: v3s: fix GIC node memory range ARM: dts: sun8i: v40: bananapi-m2-berry: Fix ethernet node ARM: dts: sun8i: r40: bananapi-m2-berry: Fix dcdc1 regulator ARM: dts: sun7i: bananapi: Enable RGMII RX/TX delay on Ethernet PHY ARM: dts: s3: pinecube: align compatible property to other S3 boards ARM: sunxi: Add machine match for the Allwinner V3 SoC arm64: dts: allwinner: h6: orangepi-one-plus: Fix ethernet
2020-12-09Revert "geneve: pull IP header before ECN decapsulation"Jakub Kicinski
This reverts commit 4179b00c04d1 ("geneve: pull IP header before ECN decapsulation"). Eric says: "network header should have been pulled already before hitting geneve_rx()". Let's revert the syzbot fix since it's causing more harm than good, and revisit. Suggested-by: Eric Dumazet <edumazet@google.com> Reported-by: Jianlin Shi <jishi@redhat.com> Fixes: 4179b00c04d1 ("geneve: pull IP header before ECN decapsulation") Link: https://bugzilla.kernel.org/show_bug.cgi?id=210569 Link: https://lore.kernel.org/netdev/CANn89iJVWfb=2i7oU1=D55rOyQnBbbikf+Mc6XHMkY7YX-yGEw@mail.gmail.com/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-09firmware: xilinx: Mark pm_api_features_map with static keywordZou Wei
Fix the following sparse warning: drivers/firmware/xilinx/zynqmp.c:32:1: warning: symbol 'pm_api_features_map' was not declared. Should it be static? Signed-off-by: Zou Wei <zou_wei@huawei.com> Link: https://lore.kernel.org/r/1606823513-121578-1-git-send-email-zou_wei@huawei.com Signed-off-by: Michal Simek <michal.simek@xilinx.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2020-12-09ARM: dts: mmp2-olpc-xo-1-75: clear the warnings when make dtbsZhen Lei
The check_spi_bus_bridge() in scripts/dtc/checks.c requires that the node have "spi-slave" property must with "#address-cells = <0>" and "#size-cells = <0>". But currently both "#address-cells" and "#size-cells" properties are deleted, the corresponding default values are 2 and 1. As a result, the check fails and below warnings is displayed. arch/arm/boot/dts/mmp2.dtsi:472.23-480.6: Warning (spi_bus_bridge): \ /soc/apb@d4000000/spi@d4037000: incorrect #address-cells for SPI bus also defined at arch/arm/boot/dts/mmp2-olpc-xo-1-75.dts:225.7-237.3 arch/arm/boot/dts/mmp2.dtsi:472.23-480.6: Warning (spi_bus_bridge): \ /soc/apb@d4000000/spi@d4037000: incorrect #size-cells for SPI bus also defined at arch/arm/boot/dts/mmp2-olpc-xo-1-75.dts:225.7-237.3 arch/arm/boot/dts/mmp2-olpc-xo-1-75.dtb: Warning (spi_bus_reg): \ Failed prerequisite 'spi_bus_bridge' Because the value of "#size-cells" is already defined as zero in the node "ssp3: spi@d4037000" in arch/arm/boot/dts/mmp2.dtsi. So we only need to explicitly add "#address-cells = <0>" and keep "#size-cells" no change. Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> Link: https://lore.kernel.org/r/20201207084752.1665-2-thunder.leizhen@huawei.com' Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2020-12-09RDMA/cm: Fix an attempt to use non-valid pointer when cleaning timewaitLeon Romanovsky
If cm_create_timewait_info() fails, the timewait_info pointer will contain an error value and will be used in cm_remove_remote() later. general protection fault, probably for non-canonical address 0xdffffc0000000024: 0000 [#1] SMP KASAN PTI KASAN: null-ptr-deref in range [0×0000000000000120-0×0000000000000127] CPU: 2 PID: 12446 Comm: syz-executor.3 Not tainted 5.10.0-rc5-5d4c0742a60e #27 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 RIP: 0010:cm_remove_remote.isra.0+0x24/0×170 drivers/infiniband/core/cm.c:978 Code: 84 00 00 00 00 00 41 54 55 53 48 89 fb 48 8d ab 2d 01 00 00 e8 7d bf 4b fe 48 89 ea 48 b8 00 00 00 00 00 fc ff df 48 c1 ea 03 <0f> b6 04 02 48 89 ea 83 e2 07 38 d0 7f 08 84 c0 0f 85 fc 00 00 00 RSP: 0018:ffff888013127918 EFLAGS: 00010006 RAX: dffffc0000000000 RBX: fffffffffffffff4 RCX: ffffc9000a18b000 RDX: 0000000000000024 RSI: ffffffff82edc573 RDI: fffffffffffffff4 RBP: 0000000000000121 R08: 0000000000000001 R09: ffffed1002624f1d R10: 0000000000000003 R11: ffffed1002624f1c R12: ffff888107760c70 R13: ffff888107760c40 R14: fffffffffffffff4 R15: ffff888107760c9c FS: 00007fe1ffcc1700(0000) GS:ffff88811a600000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000001b2ff21000 CR3: 000000010f504001 CR4: 0000000000370ee0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: cm_destroy_id+0x189/0×15b0 drivers/infiniband/core/cm.c:1155 cma_connect_ib drivers/infiniband/core/cma.c:4029 [inline] rdma_connect_locked+0x1100/0×17c0 drivers/infiniband/core/cma.c:4107 rdma_connect+0x2a/0×40 drivers/infiniband/core/cma.c:4140 ucma_connect+0x277/0×340 drivers/infiniband/core/ucma.c:1069 ucma_write+0x236/0×2f0 drivers/infiniband/core/ucma.c:1724 vfs_write+0x220/0×830 fs/read_write.c:603 ksys_write+0x1df/0×240 fs/read_write.c:658 do_syscall_64+0x33/0×40 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fixes: a977049dacde ("[PATCH] IB: Add the kernel CM implementation") Link: https://lore.kernel.org/r/20201204064205.145795-1-leon@kernel.org Reviewed-by: Maor Gottlieb <maorg@nvidia.com> Reported-by: Amit Matityahu <mitm@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-12-09Merge tag 'iommu-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull iommu fix from Will Deacon: "Fix interrupt table length definition for AMD IOMMU. It's actually a fix for a fix, where the size of the interrupt remapping table was increased but a related constant for the size of the interrupt table was forgotten" * tag 'iommu-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: iommu/amd: Set DTE[IntTabLen] to represent 512 IRTEs
2020-12-09can: isotp: isotp_setsockopt(): block setsockopt on bound socketsOliver Hartkopp
The isotp socket can be widely configured in its behaviour regarding addressing types, fill-ups, receive pattern tests and link layer length. Usually all these settings need to be fixed before bind() and can not be changed afterwards. This patch adds a check to enforce the common usage pattern. Fixes: e057dd3fc20f ("can: add ISO 15765-2:2016 transport protocol") Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Tested-by: Thomas Wagner <thwa1@web.de> Link: https://lore.kernel.org/r/20201203140604.25488-2-socketcan@hartkopp.net Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Link: https://lore.kernel.org/r/20201204133508.742120-3-mkl@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-09Merge branch 'bpf-xdp-offload-fixes'Daniel Borkmann
Toke Høiland-Jørgensen says: ==================== This series restores the test_offload.py selftest to working order. It seems a number of subtle behavioural changes have crept into various subsystems which broke test_offload.py in a number of ways. Most of these are fairly benign changes where small adjustments to the test script seems to be the best fix, but one is an actual kernel bug that I've observed in the wild caused by a bad interaction between xdp_attachment_flags_ok() and the rework of XDP program handling in the core netdev code. Patch 1 fixes the bug by removing xdp_attachment_flags_ok(), and the reminder of the patches are adjustments to test_offload.py, including a new feature for netdevsim to force a BPF verification fail. Please see the individual patches for details. Changelog: v4: - Accidentally truncated the Fixes: hashes in patches 3/4 to 11 chars v3: - Add Fixes: tags v2: - Replace xdp_attachment_flags_ok() with a check in dev_xdp_attach() - Better packing of struct nsim_dev ==================== Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2020-12-09selftests/bpf/test_offload.py: Filter bpftool internal map when counting mapsToke Høiland-Jørgensen
A few of the tests in test_offload.py expects to see a certain number of maps created, and checks this by counting the number of maps returned by bpftool. There is already a filter that will remove any maps already there at the beginning of the test, but bpftool now creates a map for the PID iterator rodata on each invocation, which makes the map count wrong. Fix this by also filtering the pid_iter.rodata map by name when counting. Fixes: d53dee3fe013 ("tools/bpftool: Show info for processes holding BPF map/prog/link/btf FDs") Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Jakub Kicinski <kuba@kernel.org> Link: https://lore.kernel.org/bpf/160752226387.110217.9887866138149423444.stgit@toke.dk
2020-12-09selftests/bpf/test_offload.py: Reset ethtool features after failed settingToke Høiland-Jørgensen
When setting the ethtool feature flag fails (as expected for the test), the kernel now tracks that the feature was requested to be 'off' and refuses to subsequently disable it again. So reset it back to 'on' so a subsequent disable (that's not supposed to fail) can succeed. Fixes: 417ec26477a5 ("selftests/bpf: add offload test based on netdevsim") Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Jakub Kicinski <kuba@kernel.org> Link: https://lore.kernel.org/bpf/160752226280.110217.10696241563705667871.stgit@toke.dk
2020-12-09selftests/bpf/test_offload.py: Fix expected case of extack messagesToke Høiland-Jørgensen
Commit 7f0a838254bd ("bpf, xdp: Maintain info on attached XDP BPF programs in net_device") changed the case of some of the extack messages being returned when attaching of XDP programs failed. This broke test_offload.py, so let's fix the test to reflect this. Fixes: 7f0a838254bd ("bpf, xdp: Maintain info on attached XDP BPF programs in net_device") Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Jakub Kicinski <kuba@kernel.org> Link: https://lore.kernel.org/bpf/160752226175.110217.11214100824416344952.stgit@toke.dk
2020-12-09selftests/bpf/test_offload.py: Only check verifier log on verification failsToke Høiland-Jørgensen
Since commit 6f8a57ccf851 ("bpf: Make verifier log more relevant by default"), the verifier discards log messages for successfully-verified programs. This broke test_offload.py which is looking for a verification message from the driver callback. Change test_offload.py to use the toggle in netdevsim to make the verification fail before looking for the verification message. Fixes: 6f8a57ccf851 ("bpf: Make verifier log more relevant by default") Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Jakub Kicinski <kuba@kernel.org> Link: https://lore.kernel.org/bpf/160752226069.110217.12370824996153348073.stgit@toke.dk
2020-12-09netdevsim: Add debugfs toggle to reject BPF programs in verifierToke Høiland-Jørgensen
This adds a new debugfs toggle ('bpf_bind_verifier_accept') that can be used to make netdevsim reject BPF programs from being accepted by the verifier. If this toggle (which defaults to true) is set to false, nsim_bpf_verify_insn() will return EOPNOTSUPP on the last instruction (after outputting the 'Hello from netdevsim' verifier message). This makes it possible to check the verification callback in the driver from test_offload.py in selftests, since the verifier now clears the verifier log on a successful load, hiding the message from the driver. Fixes: 6f8a57ccf851 ("bpf: Make verifier log more relevant by default") Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Jakub Kicinski <kuba@kernel.org> Link: https://lore.kernel.org/bpf/160752225964.110217.12584017165318065332.stgit@toke.dk