Age | Commit message (Collapse) | Author |
|
Pull drm fixes from Dave Airlie:
"Another pretty normal week. I didn't get any i915 fixes yet, so next
week I'd expect double the usual i915, but otherwise a bunch of amdgpu
and some scattered other fixes.
hdcp:
- fix HDCP regression
amdgpu:
- Runtime PM fixes
- DC fix for PPC
- Misc DC fixes
virtio:
- fix context ordering issue
sun4i:
- old gcc warning fix
ingenic-drm:
- missing module support"
* tag 'drm-fixes-2020-05-08' of git://anongit.freedesktop.org/drm/drm:
drm/amd/display: Prevent dpcd reads with passive dongles
drm/amd/display: fix counter in wait_for_no_pipes_pending
drm/amd/display: Update DCN2.1 DV Code Revision
drm: Fix HDCP failures when SRM fw is missing
sun6i: dsi: fix gcc-4.8
drm: ingenic-drm: add MODULE_DEVICE_TABLE
drm/virtio: create context before RESOURCE_CREATE_2D in 3D mode
drm/amd/display: work around fp code being emitted outside of DC_FP_START/END
drm/amdgpu/dc: Use WARN_ON_ONCE for ASSERT
drm/amdgpu: drop redundant cg/pg ungate on runpm enter
drm/amdgpu: move kfd suspend after ip_suspend_phase1
|
|
Merge misc fixes from Andrew Morton:
"14 fixes and one selftest to verify the ipc fixes herein"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
mm: limit boost_watermark on small zones
ubsan: disable UBSAN_ALIGNMENT under COMPILE_TEST
mm/vmscan: remove unnecessary argument description of isolate_lru_pages()
epoll: atomically remove wait entry on wake up
kselftests: introduce new epoll60 testcase for catching lost wakeups
percpu: make pcpu_alloc() aware of current gfp context
mm/slub: fix incorrect interpretation of s->offset
scripts/gdb: repair rb_first() and rb_last()
eventpoll: fix missing wakeup for ovflist in ep_poll_callback
arch/x86/kvm/svm/sev.c: change flag passed to GUP fast in sev_pin_memory()
scripts/decodecode: fix trapping instruction formatting
kernel/kcov.c: fix typos in kcov_remote_start documentation
mm/page_alloc: fix watchdog soft lockups during set_zone_contiguous()
mm, memcg: fix error return value of mem_cgroup_css_alloc()
ipc/mqueue.c: change __do_notify() to bypass check_kill_permission()
|
|
Elsewhere in the file, there is a list_for_each_entry with
&vdev->resv_regions as the second argument, suggesting that
&vdev->resv_regions is the list head. So exchange the
arguments on the list_add call to put the list head in the
second argument.
Fixes: 2a5a31487445 ("iommu/virtio: Add probe request")
Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Link: https://lore.kernel.org/r/1588704467-13431-1-git-send-email-Julia.Lawall@inria.fr
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
It turns out that when extending an existing bio, gfs2_find_jhead fails to
check if the block number is consecutive, which leads to incorrect reads for
fragmented journals.
In addition, limit the maximum bio size to an arbitrary value of 2 megabytes:
since commit 07173c3ec276 ("block: enable multipage bvecs"), if we just keep
adding pages until bio_add_page fails, bios will grow much larger than useful,
which pins more memory than necessary with barely any additional performance
gains.
Fixes: f4686c26ecc3 ("gfs2: read journal in large chunks")
Cc: stable@vger.kernel.org # v5.2+
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
|
|
Make sure we don't walk past the end of the metadata in gfs2_walk_metadata: the
inode holds fewer pointers than indirect blocks.
Slightly clean up gfs2_iomap_get.
Fixes: a27a0c9b6a20 ("gfs2: gfs2_walk_metadata fix")
Cc: stable@vger.kernel.org # v5.3+
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
|
|
When the gfs2_logd daemon withdrew, the withdraw sequence called
into make_fs_ro() to make the file system read-only. That caused the
journal descriptors to be freed. However, those journal descriptors
were used by gfs2_logd's call to gfs2_ail_flush_reqd(). This caused
a use-after free and NULL pointer dereference.
This patch changes function gfs2_logd() so that it stops all logd
work until the thread is told to stop. Once a withdraw is done,
it only does an interruptible sleep.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
|
Before this patch, when the logd daemon was forced to withdraw, it
would try to request its journal be recovered by another cluster node.
However, in single-user cases with lock_nolock, there are no other
nodes to recover the journal. Function signal_our_withdraw() was
recognizing the lock_nolock situation, but not until after it had
evicted its journal inode. Since the journal descriptor that points
to the inode was never removed from the master list, when the unmount
occurred, it did another iput on the evicted inode, which resulted in
a BUG_ON(inode->i_state & I_CLEAR).
This patch moves the check for this situation earlier in function
signal_our_withdraw(), which avoids the extra iput, so the unmount
may happen normally.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
|
Before this patch, if an error was detected from glock function go_sync
by function do_xmote, it would return. But the function had temporarily
unlocked the gl_lockref spin_lock, and it never re-locked it. When the
caller of do_xmote tried to unlock it again, it was already unlocked,
which resulted in a corrupted spin_lock value.
This patch makes sure the gl_lockref spin_lock is re-locked after it is
unlocked.
Thanks to Wu Bo <wubo40@huawei.com> for reporting this problem.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
|
git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
A few minor fixes for an ordering issue in virtio, an (old) gcc warning
in sun4i, a probe issue in ingenic-drm and a regression in the HDCP
support.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20200507160130.id64niqgf5wsha4u@gilmour.lan
|
|
git://people.freedesktop.org/~agd5f/linux into drm-fixes
amd-drm-fixes-5.7-2020-05-06:
amdgpu:
- Runtime PM fixes
- DC fix for PPC
- Misc DC fixes
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200506212257.3893-1-alexander.deucher@amd.com
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull security subsystem fix from James Morris:
"Fix the default value of fs_context_parse_param hook"
* 'for-v5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
security: Fix the default value of fs_context_parse_param hook
|
|
Commit 1c30844d2dfe ("mm: reclaim small amounts of memory when an
external fragmentation event occurs") adds a boost_watermark() function
which increases the min watermark in a zone by at least
pageblock_nr_pages or the number of pages in a page block.
On Arm64, with 64K pages and 512M huge pages, this is 8192 pages or
512M. It does this regardless of the number of managed pages managed in
the zone or the likelihood of success.
This can put the zone immediately under water in terms of allocating
pages from the zone, and can cause a small machine to fail immediately
due to OoM. Unlike set_recommended_min_free_kbytes(), which
substantially increases min_free_kbytes and is tied to THP,
boost_watermark() can be called even if THP is not active.
The problem is most likely to appear on architectures such as Arm64
where pageblock_nr_pages is very large.
It is desirable to run the kdump capture kernel in as small a space as
possible to avoid wasting memory. In some architectures, such as Arm64,
there are restrictions on where the capture kernel can run, and
therefore, the space available. A capture kernel running in 768M can
fail due to OoM immediately after boost_watermark() sets the min in zone
DMA32, where most of the memory is, to 512M. It fails even though there
is over 500M of free memory. With boost_watermark() suppressed, the
capture kernel can run successfully in 448M.
This patch limits boost_watermark() to boosting a zone's min watermark
only when there are enough pages that the boost will produce positive
results. In this case that is estimated to be four times as many pages
as pageblock_nr_pages.
Mel said:
: There is no harm in marking it stable. Clearly it does not happen very
: often but it's not impossible. 32-bit x86 is a lot less common now
: which would previously have been vulnerable to triggering this easily.
: ppc64 has a larger base page size but typically only has one zone.
: arm64 is likely the most vulnerable, particularly when CMA is
: configured with a small movable zone.
Fixes: 1c30844d2dfe ("mm: reclaim small amounts of memory when an external fragmentation event occurs")
Signed-off-by: Henry Willard <henry.willard@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/1588294148-6586-1-git-send-email-henry.willard@oracle.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The documentation for UBSAN_ALIGNMENT already mentions that it should
not be used on all*config builds (and for efficient-unaligned-access
architectures), so just refactor the Kconfig to correctly implement this
so randconfigs will stop creating insane images that freak out objtool
under CONFIG_UBSAN_TRAP (due to the false positives producing functions
that never return, etc).
Link: http://lkml.kernel.org/r/202005011433.C42EA3E2D@keescook
Fixes: 0887a7ebc977 ("ubsan: add trap instrumentation option")
Signed-off-by: Kees Cook <keescook@chromium.org>
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Link: https://lore.kernel.org/linux-next/202004231224.D6B3B650@keescook/
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Since commit a9e7c39fa9fd9 ("mm/vmscan.c: remove 7th argument of
isolate_lru_pages()"), the explanation of 'mode' argument has been
unnecessary. Let's remove it.
Signed-off-by: Qiwu Chen <chenqiwu@xiaomi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/20200501090346.2894-1-chenqiwu@xiaomi.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This patch does two things:
- fixes a lost wakeup introduced by commit 339ddb53d373 ("fs/epoll:
remove unnecessary wakeups of nested epoll")
- improves performance for events delivery.
The description of the problem is the following: if N (>1) threads are
waiting on ep->wq for new events and M (>1) events come, it is quite
likely that >1 wakeups hit the same wait queue entry, because there is
quite a big window between __add_wait_queue_exclusive() and the
following __remove_wait_queue() calls in ep_poll() function.
This can lead to lost wakeups, because thread, which was woken up, can
handle not all the events in ->rdllist. (in better words the problem is
described here: https://lkml.org/lkml/2019/10/7/905)
The idea of the current patch is to use init_wait() instead of
init_waitqueue_entry().
Internally init_wait() sets autoremove_wake_function as a callback,
which removes the wait entry atomically (under the wq locks) from the
list, thus the next coming wakeup hits the next wait entry in the wait
queue, thus preventing lost wakeups.
Problem is very well reproduced by the epoll60 test case [1].
Wait entry removal on wakeup has also performance benefits, because
there is no need to take a ep->lock and remove wait entry from the queue
after the successful wakeup. Here is the timing output of the epoll60
test case:
With explicit wakeup from ep_scan_ready_list() (the state of the
code prior 339ddb53d373):
real 0m6.970s
user 0m49.786s
sys 0m0.113s
After this patch:
real 0m5.220s
user 0m36.879s
sys 0m0.019s
The other testcase is the stress-epoll [2], where one thread consumes
all the events and other threads produce many events:
With explicit wakeup from ep_scan_ready_list() (the state of the
code prior 339ddb53d373):
threads events/ms run-time ms
8 5427 1474
16 6163 2596
32 6824 4689
64 7060 9064
128 6991 18309
After this patch:
threads events/ms run-time ms
8 5598 1429
16 7073 2262
32 7502 4265
64 7640 8376
128 7634 16767
(number of "events/ms" represents event bandwidth, thus higher is
better; number of "run-time ms" represents overall time spent
doing the benchmark, thus lower is better)
[1] tools/testing/selftests/filesystems/epoll/epoll_wakeup_test.c
[2] https://github.com/rouming/test-tools/blob/master/stress-epoll.c
Signed-off-by: Roman Penyaev <rpenyaev@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Jason Baron <jbaron@akamai.com>
Cc: Khazhismel Kumykov <khazhy@google.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Heiher <r@hev.cc>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/20200430130326.1368509-2-rpenyaev@suse.de
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This test case catches lost wake up introduced by commit 339ddb53d373
("fs/epoll: remove unnecessary wakeups of nested epoll")
The test is simple: we have 10 threads and 10 event fds. Each thread
can harvest only 1 event. 1 producer fires all 10 events at once and
waits that all 10 events will be observed by 10 threads.
In case of lost wakeup epoll_wait() will timeout and 0 will be returned.
Test case catches two sort of problems: forgotten wakeup on event, which
hits the ->ovflist list, this problem was fixed by:
5a2513239750 ("eventpoll: fix missing wakeup for ovflist in ep_poll_callback")
the other problem is when several sequential events hit the same waiting
thread, thus other waiters get no wakeups. Problem is fixed in the
following patch.
Signed-off-by: Roman Penyaev <rpenyaev@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Khazhismel Kumykov <khazhy@google.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Heiher <r@hev.cc>
Cc: Jason Baron <jbaron@akamai.com>
Link: http://lkml.kernel.org/r/20200430130326.1368509-1-rpenyaev@suse.de
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Since 5.7-rc1, on btrfs we have a percpu counter initialization for
which we always pass a GFP_KERNEL gfp_t argument (this happens since
commit 2992df73268f78 ("btrfs: Implement DREW lock")).
That is safe in some contextes but not on others where allowing fs
reclaim could lead to a deadlock because we are either holding some
btrfs lock needed for a transaction commit or holding a btrfs
transaction handle open. Because of that we surround the call to the
function that initializes the percpu counter with a NOFS context using
memalloc_nofs_save() (this is done at btrfs_init_fs_root()).
However it turns out that this is not enough to prevent a possible
deadlock because percpu_alloc() determines if it is in an atomic context
by looking exclusively at the gfp flags passed to it (GFP_KERNEL in this
case) and it is not aware that a NOFS context is set.
Because percpu_alloc() thinks it is in a non atomic context it locks the
pcpu_alloc_mutex. This can result in a btrfs deadlock when
pcpu_balance_workfn() is running, has acquired that mutex and is waiting
for reclaim, while the btrfs task that called percpu_counter_init() (and
therefore percpu_alloc()) is holding either the btrfs commit_root
semaphore or a transaction handle (done fs/btrfs/backref.c:
iterate_extent_inodes()), which prevents reclaim from finishing as an
attempt to commit the current btrfs transaction will deadlock.
Lockdep reports this issue with the following trace:
======================================================
WARNING: possible circular locking dependency detected
5.6.0-rc7-btrfs-next-77 #1 Not tainted
------------------------------------------------------
kswapd0/91 is trying to acquire lock:
ffff8938a3b3fdc8 (&delayed_node->mutex){+.+.}, at: __btrfs_release_delayed_node.part.0+0x3f/0x320 [btrfs]
but task is already holding lock:
ffffffffb4f0dbc0 (fs_reclaim){+.+.}, at: __fs_reclaim_acquire+0x5/0x30
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #4 (fs_reclaim){+.+.}:
fs_reclaim_acquire.part.0+0x25/0x30
__kmalloc+0x5f/0x3a0
pcpu_create_chunk+0x19/0x230
pcpu_balance_workfn+0x56a/0x680
process_one_work+0x235/0x5f0
worker_thread+0x50/0x3b0
kthread+0x120/0x140
ret_from_fork+0x3a/0x50
-> #3 (pcpu_alloc_mutex){+.+.}:
__mutex_lock+0xa9/0xaf0
pcpu_alloc+0x480/0x7c0
__percpu_counter_init+0x50/0xd0
btrfs_drew_lock_init+0x22/0x70 [btrfs]
btrfs_get_fs_root+0x29c/0x5c0 [btrfs]
resolve_indirect_refs+0x120/0xa30 [btrfs]
find_parent_nodes+0x50b/0xf30 [btrfs]
btrfs_find_all_leafs+0x60/0xb0 [btrfs]
iterate_extent_inodes+0x139/0x2f0 [btrfs]
iterate_inodes_from_logical+0xa1/0xe0 [btrfs]
btrfs_ioctl_logical_to_ino+0xb4/0x190 [btrfs]
btrfs_ioctl+0x165a/0x3130 [btrfs]
ksys_ioctl+0x87/0xc0
__x64_sys_ioctl+0x16/0x20
do_syscall_64+0x5c/0x260
entry_SYSCALL_64_after_hwframe+0x49/0xbe
-> #2 (&fs_info->commit_root_sem){++++}:
down_write+0x38/0x70
btrfs_cache_block_group+0x2ec/0x500 [btrfs]
find_free_extent+0xc6a/0x1600 [btrfs]
btrfs_reserve_extent+0x9b/0x180 [btrfs]
btrfs_alloc_tree_block+0xc1/0x350 [btrfs]
alloc_tree_block_no_bg_flush+0x4a/0x60 [btrfs]
__btrfs_cow_block+0x122/0x5a0 [btrfs]
btrfs_cow_block+0x106/0x240 [btrfs]
commit_cowonly_roots+0x55/0x310 [btrfs]
btrfs_commit_transaction+0x509/0xb20 [btrfs]
sync_filesystem+0x74/0x90
generic_shutdown_super+0x22/0x100
kill_anon_super+0x14/0x30
btrfs_kill_super+0x12/0x20 [btrfs]
deactivate_locked_super+0x31/0x70
cleanup_mnt+0x100/0x160
task_work_run+0x93/0xc0
exit_to_usermode_loop+0xf9/0x100
do_syscall_64+0x20d/0x260
entry_SYSCALL_64_after_hwframe+0x49/0xbe
-> #1 (&space_info->groups_sem){++++}:
down_read+0x3c/0x140
find_free_extent+0xef6/0x1600 [btrfs]
btrfs_reserve_extent+0x9b/0x180 [btrfs]
btrfs_alloc_tree_block+0xc1/0x350 [btrfs]
alloc_tree_block_no_bg_flush+0x4a/0x60 [btrfs]
__btrfs_cow_block+0x122/0x5a0 [btrfs]
btrfs_cow_block+0x106/0x240 [btrfs]
btrfs_search_slot+0x50c/0xd60 [btrfs]
btrfs_lookup_inode+0x3a/0xc0 [btrfs]
__btrfs_update_delayed_inode+0x90/0x280 [btrfs]
__btrfs_commit_inode_delayed_items+0x81f/0x870 [btrfs]
__btrfs_run_delayed_items+0x8e/0x180 [btrfs]
btrfs_commit_transaction+0x31b/0xb20 [btrfs]
iterate_supers+0x87/0xf0
ksys_sync+0x60/0xb0
__ia32_sys_sync+0xa/0x10
do_syscall_64+0x5c/0x260
entry_SYSCALL_64_after_hwframe+0x49/0xbe
-> #0 (&delayed_node->mutex){+.+.}:
__lock_acquire+0xef0/0x1c80
lock_acquire+0xa2/0x1d0
__mutex_lock+0xa9/0xaf0
__btrfs_release_delayed_node.part.0+0x3f/0x320 [btrfs]
btrfs_evict_inode+0x40d/0x560 [btrfs]
evict+0xd9/0x1c0
dispose_list+0x48/0x70
prune_icache_sb+0x54/0x80
super_cache_scan+0x124/0x1a0
do_shrink_slab+0x176/0x440
shrink_slab+0x23a/0x2c0
shrink_node+0x188/0x6e0
balance_pgdat+0x31d/0x7f0
kswapd+0x238/0x550
kthread+0x120/0x140
ret_from_fork+0x3a/0x50
other info that might help us debug this:
Chain exists of:
&delayed_node->mutex --> pcpu_alloc_mutex --> fs_reclaim
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(fs_reclaim);
lock(pcpu_alloc_mutex);
lock(fs_reclaim);
lock(&delayed_node->mutex);
*** DEADLOCK ***
3 locks held by kswapd0/91:
#0: (fs_reclaim){+.+.}, at: __fs_reclaim_acquire+0x5/0x30
#1: (shrinker_rwsem){++++}, at: shrink_slab+0x12f/0x2c0
#2: (&type->s_umount_key#43){++++}, at: trylock_super+0x16/0x50
stack backtrace:
CPU: 1 PID: 91 Comm: kswapd0 Not tainted 5.6.0-rc7-btrfs-next-77 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-0-ga698c8995f-prebuilt.qemu.org 04/01/2014
Call Trace:
dump_stack+0x8f/0xd0
check_noncircular+0x170/0x190
__lock_acquire+0xef0/0x1c80
lock_acquire+0xa2/0x1d0
__mutex_lock+0xa9/0xaf0
__btrfs_release_delayed_node.part.0+0x3f/0x320 [btrfs]
btrfs_evict_inode+0x40d/0x560 [btrfs]
evict+0xd9/0x1c0
dispose_list+0x48/0x70
prune_icache_sb+0x54/0x80
super_cache_scan+0x124/0x1a0
do_shrink_slab+0x176/0x440
shrink_slab+0x23a/0x2c0
shrink_node+0x188/0x6e0
balance_pgdat+0x31d/0x7f0
kswapd+0x238/0x550
kthread+0x120/0x140
ret_from_fork+0x3a/0x50
This could be fixed by making btrfs pass GFP_NOFS instead of GFP_KERNEL
to percpu_counter_init() in contextes where it is not reclaim safe,
however that type of approach is discouraged since
memalloc_[nofs|noio]_save() were introduced. Therefore this change
makes pcpu_alloc() look up into an existing nofs/noio context before
deciding whether it is in an atomic context or not.
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Tejun Heo <tj@kernel.org>
Acked-by: Dennis Zhou <dennis@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Christoph Lameter <cl@linux.com>
Link: http://lkml.kernel.org/r/20200430164356.15543-1-fdmanana@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
In a couple of places in the slub memory allocator, the code uses
"s->offset" as a check to see if the free pointer is put right after the
object. That check is no longer true with commit 3202fa62fb43 ("slub:
relocate freelist pointer to middle of object").
As a result, echoing "1" into the validate sysfs file, e.g. of dentry,
may cause a bunch of "Freepointer corrupt" error reports like the
following to appear with the system in panic afterwards.
=============================================================================
BUG dentry(666:pmcd.service) (Tainted: G B): Freepointer corrupt
-----------------------------------------------------------------------------
To fix it, use the check "s->offset == s->inuse" in the new helper
function freeptr_outside_object() instead. Also add another helper
function get_info_end() to return the end of info block (inuse + free
pointer if not overlapping with object).
Fixes: 3202fa62fb43 ("slub: relocate freelist pointer to middle of object")
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Acked-by: Rafael Aquini <aquini@redhat.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Vitaly Nikolenko <vnik@duasynt.com>
Cc: Silvio Cesare <silvio.cesare@gmail.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Markus Elfring <Markus.Elfring@web.de>
Cc: Changbin Du <changbin.du@gmail.com>
Link: http://lkml.kernel.org/r/20200429135328.26976-1-longman@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The current implementations of the rb_first() and rb_last() gdb
functions have a variable that references itself in its instanciation,
which causes the function to throw an error if a specific condition on
the argument is met. The original author rather intended to reference
the argument and made a typo. Referring the argument instead makes the
function work as intended.
Signed-off-by: Aymeric Agon-Rambosson <aymeric.agon@yandex.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Stephen Boyd <swboyd@chromium.org>
Cc: Jan Kiszka <jan.kiszka@siemens.com>
Cc: Kieran Bingham <kbingham@kernel.org>
Cc: Douglas Anderson <dianders@chromium.org>
Cc: Nikolay Borisov <n.borisov.lkml@gmail.com>
Cc: Jackie Liu <liuyun01@kylinos.cn>
Cc: Jason Wessel <jason.wessel@windriver.com>
Link: http://lkml.kernel.org/r/20200427051029.354840-1-aymeric.agon@yandex.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
In the event that we add to ovflist, before commit 339ddb53d373
("fs/epoll: remove unnecessary wakeups of nested epoll") we would be
woken up by ep_scan_ready_list, and did no wakeup in ep_poll_callback.
With that wakeup removed, if we add to ovflist here, we may never wake
up. Rather than adding back the ep_scan_ready_list wakeup - which was
resulting in unnecessary wakeups, trigger a wake-up in ep_poll_callback.
We noticed that one of our workloads was missing wakeups starting with
339ddb53d373 and upon manual inspection, this wakeup seemed missing to me.
With this patch added, we no longer see missing wakeups. I haven't yet
tried to make a small reproducer, but the existing kselftests in
filesystem/epoll passed for me with this patch.
[khazhy@google.com: use if/elif instead of goto + cleanup suggested by Roman]
Link: http://lkml.kernel.org/r/20200424190039.192373-1-khazhy@google.com
Fixes: 339ddb53d373 ("fs/epoll: remove unnecessary wakeups of nested epoll")
Signed-off-by: Khazhismel Kumykov <khazhy@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Roman Penyaev <rpenyaev@suse.de>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Roman Penyaev <rpenyaev@suse.de>
Cc: Heiher <r@hev.cc>
Cc: Jason Baron <jbaron@akamai.com>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/20200424025057.118641-1-khazhy@google.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
When trying to lock read-only pages, sev_pin_memory() fails because
FOLL_WRITE is used as the flag for get_user_pages_fast().
Commit 73b0140bf0fe ("mm/gup: change GUP fast to use flags rather than a
write 'bool'") updated the get_user_pages_fast() call sites to use
flags, but incorrectly updated the call in sev_pin_memory(). As the
original coding of this call was correct, revert the change made by that
commit.
Fixes: 73b0140bf0fe ("mm/gup: change GUP fast to use flags rather than a write 'bool'")
Signed-off-by: Janakarajan Natarajan <Janakarajan.Natarajan@amd.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Sean Christopherson <sean.j.christopherson@intel.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Wanpeng Li <wanpengli@tencent.com>
Cc: Jim Mattson <jmattson@google.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H . Peter Anvin" <hpa@zytor.com>
Cc: Mike Marshall <hubcap@omnibond.com>
Cc: Brijesh Singh <brijesh.singh@amd.com>
Link: http://lkml.kernel.org/r/20200423152419.87202-1-Janakarajan.Natarajan@amd.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
If the trapping instruction contains a ':', for a memory access through
segment registers for example, the sed substitution will insert the '*'
marker in the middle of the instruction instead of the line address:
2b: 65 48 0f c7 0f cmpxchg16b %gs:*(%rdi) <-- trapping instruction
I started to think I had forgotten some quirk of the assembly syntax
before noticing that it was actually coming from the script. Fix it to
add the address marker at the right place for these instructions:
28: 49 8b 06 mov (%r14),%rax
2b:* 65 48 0f c7 0f cmpxchg16b %gs:(%rdi) <-- trapping instruction
30: 0f 94 c0 sete %al
Fixes: 18ff44b189e2 ("scripts/decodecode: make faulting insn ptr more robust")
Signed-off-by: Ivan Delalande <colona@arista.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/20200419223653.GA31248@visor
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Signed-off-by: Maciej Grochowski <maciej.grochowski@pm.me>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Andrey Konovalov <andreyknvl@google.com>
Link: http://lkml.kernel.org/r/20200420030259.31674-1-maciek.grochowski@gmail.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Without CONFIG_PREEMPT, it can happen that we get soft lockups detected,
e.g., while booting up.
watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [swapper/0:1]
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.6.0-next-20200331+ #4
Hardware name: Red Hat KVM, BIOS 1.11.1-4.module+el8.1.0+4066+0f1aadab 04/01/2014
RIP: __pageblock_pfn_to_page+0x134/0x1c0
Call Trace:
set_zone_contiguous+0x56/0x70
page_alloc_init_late+0x166/0x176
kernel_init_freeable+0xfa/0x255
kernel_init+0xa/0x106
ret_from_fork+0x35/0x40
The issue becomes visible when having a lot of memory (e.g., 4TB)
assigned to a single NUMA node - a system that can easily be created
using QEMU. Inside VMs on a hypervisor with quite some memory
overcommit, this is fairly easy to trigger.
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Pavel Tatashin <pasha.tatashin@soleen.com>
Reviewed-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Reviewed-by: Baoquan He <bhe@redhat.com>
Reviewed-by: Shile Zhang <shile.zhang@linux.alibaba.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Kirill Tkhai <ktkhai@virtuozzo.com>
Cc: Shile Zhang <shile.zhang@linux.alibaba.com>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Alexander Duyck <alexander.duyck@gmail.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/20200416073417.5003-1-david@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
When I run my memcg testcase which creates lots of memcgs, I found
there're unexpected out of memory logs while there're still enough
available free memory. The error log is
mkdir: cannot create directory 'foo.65533': Cannot allocate memory
The reason is when we try to create more than MEM_CGROUP_ID_MAX memcgs,
an -ENOMEM errno will be set by mem_cgroup_css_alloc(), but the right
errno should be -ENOSPC "No space left on device", which is an
appropriate errno for userspace's failed mkdir.
As the errno really misled me, we should make it right. After this
patch, the error log will be
mkdir: cannot create directory 'foo.65533': No space left on device
[akpm@linux-foundation.org: s/EBUSY/ENOSPC/, per Michal]
[akpm@linux-foundation.org: s/EBUSY/ENOSPC/, per Michal]
Fixes: 73f576c04b94 ("mm: memcontrol: fix cgroup creation failure after many small jobs")
Suggested-by: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Michal Hocko <mhocko@kernel.org>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Link: http://lkml.kernel.org/r/20200407063621.GA18914@dhcp22.suse.cz
Link: http://lkml.kernel.org/r/1586192163-20099-1-git-send-email-laoar.shao@gmail.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Commit cc731525f26a ("signal: Remove kernel interal si_code magic")
changed the value of SI_FROMUSER(SI_MESGQ), this means that mq_notify() no
longer works if the sender doesn't have rights to send a signal.
Change __do_notify() to use do_send_sig_info() instead of kill_pid_info()
to avoid check_kill_permission().
This needs the additional notify.sigev_signo != 0 check, shouldn't we
change do_mq_notify() to deny sigev_signo == 0 ?
Test-case:
#include <signal.h>
#include <mqueue.h>
#include <unistd.h>
#include <sys/wait.h>
#include <assert.h>
static int notified;
static void sigh(int sig)
{
notified = 1;
}
int main(void)
{
signal(SIGIO, sigh);
int fd = mq_open("/mq", O_RDWR|O_CREAT, 0666, NULL);
assert(fd >= 0);
struct sigevent se = {
.sigev_notify = SIGEV_SIGNAL,
.sigev_signo = SIGIO,
};
assert(mq_notify(fd, &se) == 0);
if (!fork()) {
assert(setuid(1) == 0);
mq_send(fd, "",1,0);
return 0;
}
wait(NULL);
mq_unlink("/mq");
assert(notified);
return 0;
}
[manfred@colorfullife.com: 1) Add self_exec_id evaluation so that the implementation matches do_notify_parent 2) use PIDTYPE_TGID everywhere]
Fixes: cc731525f26a ("signal: Remove kernel interal si_code magic")
Reported-by: Yoji <yoji.fujihar.min@gmail.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Markus Elfring <elfring@users.sourceforge.net>
Cc: <1vier1@web.de>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/e2a782e4-eab9-4f5c-c749-c07a8f7a4e66@colorfullife.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing fixes from Steven Rostedt:
- Fix bootconfig causing kernels to fail with CONFIG_BLK_DEV_RAM
enabled
- Fix allocation leaks in bootconfig tool
- Fix a double initialization of a variable
- Fix API bootconfig usage from kprobe boot time events
- Reject NULL location for kprobes
- Fix crash caused by preempt delay module not cleaning up kthread
correctly
- Add vmalloc_sync_mappings() to prevent x86_64 page faults from
recursively faulting from tracing page faults
- Fix comment in gpu/trace kerneldoc header
- Fix documentation of how to create a trace event class
- Make the local tracing_snapshot_instance_cond() function static
* tag 'trace-v5.7-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tools/bootconfig: Fix resource leak in apply_xbc()
tracing: Make tracing_snapshot_instance_cond() static
tracing: Fix doc mistakes in trace sample
gpu/trace: Minor comment updates for gpu_mem_total tracepoint
tracing: Add a vmalloc_sync_mappings() for safe measure
tracing: Wait for preempt irq delay thread to finish
tracing/kprobes: Reject new event if loc is NULL
tracing/boottime: Fix kprobe event API usage
tracing/kprobes: Fix a double initialization typo
bootconfig: Fix to remove bootconfig data from initrd while boot
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull kselftest fixes from Shuah Khan:
"ftrace test fixes and a fix to kvm Makefile for relocatable
native/cross builds and installs"
* tag 'linux-kselftest-5.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
selftests: fix kvm relocatable native/cross builds and installs
selftests/ftrace: Make XFAIL green color
ftrace/selftest: make unresolved cases cause failure if --fail-unresolved set
ftrace/selftests: workaround cgroup RT scheduling issues
|
|
We currently make some guesses as when to open this fd, but in reality
we have no business (or need) to do so at all. In fact, it makes certain
things fail, like O_PATH.
Remove the fd lookup from these opcodes, we're just passing the 'fd' to
generic helpers anyway. With that, we can also remove the special casing
of fd values in io_req_needs_file(), and the 'fd_non_neg' check that
we have. And we can ensure that we only read sqe->fd once.
This fixes O_PATH usage with openat/openat2, and ditto statx path side
oddities.
Cc: stable@vger.kernel.org: # v5.6
Reported-by: Max Kellermann <mk@cm4all.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Fix the @data and @fd allocations that are leaked in the error path of
apply_xbc().
Link: http://lkml.kernel.org/r/583a49c9-c27a-931d-e6c2-6f63a4b18bea@huawei.com
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Yunfeng Ye <yeyunfeng@huawei.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
Fix the following sparse warning:
kernel/trace/trace.c:950:6: warning: symbol 'tracing_snapshot_instance_cond'
was not declared. Should it be static?
Link: http://lkml.kernel.org/r/1587614905-48692-1-git-send-email-zou_wei@huawei.com
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zou Wei <zou_wei@huawei.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
As the example below shows, DECLARE_EVENT_CLASS() is used instead of
DEFINE_EVENT_CLASS().
Link: http://lkml.kernel.org/r/20200428214959.11259-1-richard.weiyang@gmail.com
Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
This change updates the improper comment for the 'size' attribute in the
tracepoint definition. Most gfx drivers pre-fault in physical pages
instead of making virtual allocations. So we drop the 'Virtual' keyword
here and leave this to the implementations.
Link: http://lkml.kernel.org/r/20200428220825.169606-1-zzyiwei@google.com
Signed-off-by: Yiwei Zhang <zzyiwei@google.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
x86_64 lazily maps in the vmalloc pages, and the way this works with per_cpu
areas can be complex, to say the least. Mappings may happen at boot up, and
if nothing synchronizes the page tables, those page mappings may not be
synced till they are used. This causes issues for anything that might touch
one of those mappings in the path of the page fault handler. When one of
those unmapped mappings is touched in the page fault handler, it will cause
another page fault, which in turn will cause a page fault, and leave us in
a loop of page faults.
Commit 763802b53a42 ("x86/mm: split vmalloc_sync_all()") split
vmalloc_sync_all() into vmalloc_sync_unmappings() and
vmalloc_sync_mappings(), as on system exit, it did not need to do a full
sync on x86_64 (although it still needed to be done on x86_32). By chance,
the vmalloc_sync_all() would synchronize the page mappings done at boot up
and prevent the per cpu area from being a problem for tracing in the page
fault handler. But when that synchronization in the exit of a task became a
nop, it caused the problem to appear.
Link: https://lore.kernel.org/r/20200429054857.66e8e333@oasis.local.home
Cc: stable@vger.kernel.org
Fixes: 737223fbca3b1 ("tracing: Consolidate buffer allocation code")
Reported-by: "Tzvetomir Stoyanov (VMware)" <tz.stoyanov@gmail.com>
Suggested-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
Running on a slower machine, it is possible that the preempt delay kernel
thread may still be executing if the module was immediately removed after
added, and this can cause the kernel to crash as the kernel thread might be
executing after its code has been removed.
There's no reason that the caller of the code shouldn't just wait for the
delay thread to finish, as the thread can also be created by a trigger in
the sysfs code, which also has the same issues.
Link: http://lore.kernel.org/r/5EA2B0C8.2080706@cn.fujitsu.com
Cc: stable@vger.kernel.org
Fixes: 793937236d1ee ("lib: Add module for testing preemptoff/irqsoff latency tracers")
Reported-by: Xiao Yang <yangx.jy@cn.fujitsu.com>
Reviewed-by: Xiao Yang <yangx.jy@cn.fujitsu.com>
Reviewed-by: Joel Fernandes <joel@joelfernandes.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fix from Catalin Marinas:
"Avoid potential NULL dereference in huge_pte_alloc() on pmd_alloc()
failure"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: hugetlb: avoid potential NULL dereference
|
|
Pull kvm fixes from Paolo Bonzini:
"Bugfixes, mostly for ARM and AMD, and more documentation.
Slightly bigger than usual because I couldn't send out what was
pending for rc4, but there is nothing worrisome going on. I have more
fixes pending for guest debugging support (gdbstub) but I will send
them next week"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (22 commits)
KVM: X86: Declare KVM_CAP_SET_GUEST_DEBUG properly
KVM: selftests: Fix build for evmcs.h
kvm: x86: Use KVM CPU capabilities to determine CR4 reserved bits
KVM: VMX: Explicitly clear RFLAGS.CF and RFLAGS.ZF in VM-Exit RSB path
docs/virt/kvm: Document configuring and running nested guests
KVM: s390: Remove false WARN_ON_ONCE for the PQAP instruction
kvm: ioapic: Restrict lazy EOI update to edge-triggered interrupts
KVM: x86: Fixes posted interrupt check for IRQs delivery modes
KVM: SVM: fill in kvm_run->debug.arch.dr[67]
KVM: nVMX: Replace a BUG_ON(1) with BUG() to squash clang warning
KVM: arm64: Fix 32bit PC wrap-around
KVM: arm64: vgic-v4: Initialize GICv4.1 even in the absence of a virtual ITS
KVM: arm64: Save/restore sp_el0 as part of __guest_enter
KVM: arm64: Delete duplicated label in invalid_vector
KVM: arm64: vgic-its: Fix memory leak on the error path of vgic_add_lpi()
KVM: arm64: vgic-v3: Retire all pending LPIs on vcpu destroy
KVM: arm: vgic-v2: Only use the virtual state when userspace accesses pending bits
KVM: arm: vgic: Only use the virtual state when userspace accesses enable bits
KVM: arm: vgic: Synchronize the whole guest on GIC{D,R}_I{S,C}ACTIVER read
KVM: arm64: PSCI: Forbid 64bit functions for 32bit guests
...
|
|
Pull configfs fix from Christoph Hellwig:
"Fix a refcount leak in configfs_rmdir (Xiyu Yang)"
* tag 'configfs-for-5.7' of git://git.infradead.org/users/hch/configfs:
configfs: fix config_item refcnt leak in configfs_rmdir()
|
|
do_splice() is used by io_uring, as will be do_tee(). Move f_mode
checks from sys_{splice,tee}() to do_{splice,tee}(), so they're
enforced for io_uring as well.
Fixes: 7d67af2c0134 ("io_uring: add splice(2) support")
Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Kristen found a hang in objtool when building with -ffunction-sections.
It was caused by evergreen_pcie_gen2_enable.cold() being laid out
immediately before evergreen_pcie_gen2_enable(). Since their "pfunc" is
always the same, find_jump_table() got into an infinite loop because it
didn't recognize the boundary between the two functions.
Fix that with a new prev_insn_same_sym() helper, which doesn't cross
subfunction boundaries.
Reported-by: Kristen Carlson Accardi <kristen@linux.intel.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/378b51c9d9c894dc3294bc460b4b0869e950b7c5.1588110291.git.jpoimboe@redhat.com
|
|
bdi_dev_name is not a fast path function, move it out of line. This
prepares for using it from modular callers without having to export
an implementation detail like bdi_unknown_name.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Simplify the bdi name to mirror what we are doing elsewhere, and
drop them name in favor of just using a number. This avoids a
potentially very long bdi name.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
The static analyzer in GCC 10 spotted that in huge_pte_alloc() we may
pass a NULL pmdp into pte_alloc_map() when pmd_alloc() returns NULL:
| CC arch/arm64/mm/pageattr.o
| CC arch/arm64/mm/hugetlbpage.o
| from arch/arm64/mm/hugetlbpage.c:10:
| arch/arm64/mm/hugetlbpage.c: In function ‘huge_pte_alloc’:
| ./arch/arm64/include/asm/pgtable-types.h:28:24: warning: dereference of NULL ‘pmdp’ [CWE-690] [-Wanalyzer-null-dereference]
| ./arch/arm64/include/asm/pgtable.h:436:26: note: in expansion of macro ‘pmd_val’
| arch/arm64/mm/hugetlbpage.c:242:10: note: in expansion of macro ‘pte_alloc_map’
| |arch/arm64/mm/hugetlbpage.c:232:10:
| |./arch/arm64/include/asm/pgtable-types.h:28:24:
| ./arch/arm64/include/asm/pgtable.h:436:26: note: in expansion of macro ‘pmd_val’
| arch/arm64/mm/hugetlbpage.c:242:10: note: in expansion of macro ‘pte_alloc_map’
This can only occur when the kernel cannot allocate a page, and so is
unlikely to happen in practice before other systems start failing.
We can avoid this by bailing out if pmd_alloc() fails, as we do earlier
in the function if pud_alloc() fails.
Fixes: 66b3923a1a0f ("arm64: hugetlb: add support for PTE contiguous bit")
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reported-by: Kyrill Tkachov <kyrylo.tkachov@arm.com>
Cc: <stable@vger.kernel.org> # 4.5.x-
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
Currently we check to make sure there is no error state on the extcon
handle for VBUS when writing to the HS_PHY_GENCONFIG_2 register. When using
the USB role-switch API we still need to write to this register absent an
extcon handle.
This patch makes the appropriate update to ensure the write happens if
role-switching is true.
Fixes: 05559f10ed79 ("usb: chipidea: add role switch class support")
Cc: stable <stable@vger.kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Philipp Zabel <p.zabel@pengutronix.de>
Cc: linux-usb@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: Stephen Boyd <swboyd@chromium.org>
Signed-off-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
Signed-off-by: Peter Chen <peter.chen@nxp.com>
Link: https://lore.kernel.org/r/20200507004918.25975-2-peter.chen@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Pull networking fixes from David Miller:
1) Fix reference count leaks in various parts of batman-adv, from Xiyu
Yang.
2) Update NAT checksum even when it is zero, from Guillaume Nault.
3) sk_psock reference count leak in tls code, also from Xiyu Yang.
4) Sanity check TCA_FQ_CODEL_DROP_BATCH_SIZE netlink attribute in
fq_codel, from Eric Dumazet.
5) Fix panic in choke_reset(), also from Eric Dumazet.
6) Fix VLAN accel handling in bnxt_fix_features(), from Michael Chan.
7) Disallow out of range quantum values in sch_sfq, from Eric Dumazet.
8) Fix crash in x25_disconnect(), from Yue Haibing.
9) Don't pass pointer to local variable back to the caller in
nf_osf_hdr_ctx_init(), from Arnd Bergmann.
10) Wireguard should use the ECN decap helper functions, from Toke
Høiland-Jørgensen.
11) Fix command entry leak in mlx5 driver, from Moshe Shemesh.
12) Fix uninitialized variable access in mptcp's
subflow_syn_recv_sock(), from Paolo Abeni.
13) Fix unnecessary out-of-order ingress frame ordering in macsec, from
Scott Dial.
14) IPv6 needs to use a global serial number for dst validation just
like ipv4, from David Ahern.
15) Fix up PTP_1588_CLOCK deps, from Clay McClure.
16) Missing NLM_F_MULTI flag in gtp driver netlink messages, from
Yoshiyuki Kurauchi.
17) Fix a regression in that dsa user port errors should not be fatal,
from Florian Fainelli.
18) Fix iomap leak in enetc driver, from Dejin Zheng.
19) Fix use after free in lec_arp_clear_vccs(), from Cong Wang.
20) Initialize protocol value earlier in neigh code paths when
generating events, from Roman Mashak.
21) netdev_update_features() must be called with RTNL mutex in macsec
driver, from Antoine Tenart.
22) Validate untrusted GSO packets even more strictly, from Willem de
Bruijn.
23) Wireguard decrypt worker needs a cond_resched(), from Jason
Donenfeld.
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (111 commits)
net: flow_offload: skip hw stats check for FLOW_ACTION_HW_STATS_DONT_CARE
MAINTAINERS: put DYNAMIC INTERRUPT MODERATION in proper order
wireguard: send/receive: use explicit unlikely branch instead of implicit coalescing
wireguard: selftests: initalize ipv6 members to NULL to squelch clang warning
wireguard: send/receive: cond_resched() when processing worker ringbuffers
wireguard: socket: remove errant restriction on looping to self
wireguard: selftests: use normal kernel stack size on ppc64
net: ethernet: ti: am65-cpsw-nuss: fix irqs type
ionic: Use debugfs_create_bool() to export bool
net: dsa: Do not leave DSA master with NULL netdev_ops
net: dsa: remove duplicate assignment in dsa_slave_add_cls_matchall_mirred
net: stricter validation of untrusted gso packets
seg6: fix SRH processing to comply with RFC8754
net: mscc: ocelot: ANA_AUTOAGE_AGE_PERIOD holds a value in seconds, not ms
net: dsa: ocelot: the MAC table on Felix is twice as large
net: dsa: sja1105: the PTP_CLK extts input reacts on both edges
selftests: net: tcp_mmap: fix SO_RCVLOWAT setting
net: hsr: fix incorrect type usage for protocol variable
net: macsec: fix rtnl locking issue
net: mvpp2: cls: Prevent buffer overflow in mvpp2_ethtool_cls_rule_del()
...
|
|
This patch adds FLOW_ACTION_HW_STATS_DONT_CARE which tells the driver
that the frontend does not need counters, this hw stats type request
never fails. The FLOW_ACTION_HW_STATS_DISABLED type explicitly requests
the driver to disable the stats, however, if the driver cannot disable
counters, it bails out.
TCA_ACT_HW_STATS_* maintains the 1:1 mapping with FLOW_ACTION_HW_STATS_*
except by disabled which is mapped to FLOW_ACTION_HW_STATS_DISABLED
(this is 0 in tc). Add tc_act_hw_stats() to perform the mapping between
TCA_ACT_HW_STATS_* and FLOW_ACTION_HW_STATS_*.
Fixes: 319a1d19471e ("flow_offload: check for basic action hw stats type")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Commit 9b038086f06b ("docs: networking: convert DIM to RST") added a new
file entry to DYNAMIC INTERRUPT MODERATION to the end, and not following
alphabetical order.
So, ./scripts/checkpatch.pl -f MAINTAINERS complains:
WARNING: Misordered MAINTAINERS entry - list file patterns in alphabetic
order
#5966: FILE: MAINTAINERS:5966:
+F: lib/dim/
+F: Documentation/networking/net_dim.rst
Reorder the file entries to keep MAINTAINERS nicely ordered.
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Jason A. Donenfeld says:
====================
wireguard fixes for 5.7-rc5
With Ubuntu and Debian having backported this into their kernels, we're
finally seeing testing from places we hadn't seen prior, which is nice.
With that comes more fixes:
1) The CI for PPC64 was running with extremely small stacks for 64-bit,
causing spurious crashes in surprising places.
2) There's was an old leftover routing loop restriction, which no longer
makes sense given the queueing architecture, and was causing problems
for people who really did want nested routing.
3) Not yielding our kthread on CONFIG_PREEMPT_VOLUNTARY systems caused
RCU stalls and other issues, reported by Wang Jian, with the fix
suggested by Sultan Alsawaf.
4) Clang spewed warnings in a selftest for CONFIG_IPV6=n, reported by
Arnd Bergmann.
5) A complicated if statement was simplified to an assignment while also
making the likely/unlikely hinting more correct and simple, and
increasing readability, suggested by Sultan.
Patches (2) and (3) have Fixes: lines and are probably good candidates
for stable.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
coalescing
It's very unlikely that send will become true. It's nearly always false
between 0 and 120 seconds of a session, and in most cases becomes true
only between 120 and 121 seconds before becoming false again. So,
unlikely(send) is clearly the right option here.
What happened before was that we had this complex boolean expression
with multiple likely and unlikely clauses nested. Since this is
evaluated left-to-right anyway, the whole thing got converted to
unlikely. So, we can clean this up to better represent what's going on.
The generated code is the same.
Suggested-by: Sultan Alsawaf <sultan@kerneltoast.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Without setting these to NULL, clang complains in certain
configurations that have CONFIG_IPV6=n:
In file included from drivers/net/wireguard/ratelimiter.c:223:
drivers/net/wireguard/selftest/ratelimiter.c:173:34: error: variable 'skb6' is uninitialized when used here [-Werror,-Wuninitialized]
ret = timings_test(skb4, hdr4, skb6, hdr6, &test_count);
^~~~
drivers/net/wireguard/selftest/ratelimiter.c:123:29: note: initialize the variable 'skb6' to silence this warning
struct sk_buff *skb4, *skb6;
^
= NULL
drivers/net/wireguard/selftest/ratelimiter.c:173:40: error: variable 'hdr6' is uninitialized when used here [-Werror,-Wuninitialized]
ret = timings_test(skb4, hdr4, skb6, hdr6, &test_count);
^~~~
drivers/net/wireguard/selftest/ratelimiter.c:125:22: note: initialize the variable 'hdr6' to silence this warning
struct ipv6hdr *hdr6;
^
We silence this warning by setting the variables to NULL as the warning
suggests.
Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|