summaryrefslogtreecommitdiff
path: root/arch
AgeCommit message (Collapse)Author
2021-06-29mm,hwpoison: send SIGBUS with error virutal addressNaoya Horiguchi
Now an action required MCE in already hwpoisoned address surely sends a SIGBUS to current process, but the SIGBUS doesn't convey error virtual address. That's not optimal for hwpoison-aware applications. To fix the issue, make memory_failure() call kill_accessing_process(), that does pagetable walk to find the error virtual address. It could find multiple virtual addresses for the same error page, and it seems hard to tell which virtual address is correct one. But that's rare and sending incorrect virtual address could be better than no address. So let's report the first found virtual address for now. [naoya.horiguchi@nec.com: fix walk_page_range() return] Link: https://lkml.kernel.org/r/20210603051055.GA244241@hori.linux.bs1.fc.nec.co.jp Link: https://lkml.kernel.org/r/20210521030156.2612074-4-nao.horiguchi@gmail.com Signed-off-by: Naoya Horiguchi <naoya.horiguchi@nec.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Aili Yao <yaoaili@kingsoft.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: David Hildenbrand <david@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Jue Wang <juew@google.com> Cc: Borislav Petkov <bp@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-29mm: replace CONFIG_NEED_MULTIPLE_NODES with CONFIG_NUMAMike Rapoport
After removal of DISCINTIGMEM the NEED_MULTIPLE_NODES and NUMA configuration options are equivalent. Drop CONFIG_NEED_MULTIPLE_NODES and use CONFIG_NUMA instead. Done with $ sed -i 's/CONFIG_NEED_MULTIPLE_NODES/CONFIG_NUMA/' \ $(git grep -wl CONFIG_NEED_MULTIPLE_NODES) $ sed -i 's/NEED_MULTIPLE_NODES/NUMA/' \ $(git grep -wl NEED_MULTIPLE_NODES) with manual tweaks afterwards. [rppt@linux.ibm.com: fix arm boot crash] Link: https://lkml.kernel.org/r/YMj9vHhHOiCVN4BF@linux.ibm.com Link: https://lkml.kernel.org/r/20210608091316.3622-9-rppt@kernel.org Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: David Hildenbrand <david@redhat.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Matt Turner <mattst88@gmail.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Vineet Gupta <vgupta@synopsys.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-29arch, mm: remove stale mentions of DISCONIGMEMMike Rapoport
There are several places that mention DISCONIGMEM in comments or have stale code guarded by CONFIG_DISCONTIGMEM. Remove the dead code and update the comments. Link: https://lkml.kernel.org/r/20210608091316.3622-7-rppt@kernel.org Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: David Hildenbrand <david@redhat.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Matt Turner <mattst88@gmail.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Vineet Gupta <vgupta@synopsys.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-29m68k: remove support for DISCONTIGMEMMike Rapoport
DISCONTIGMEM was replaced by FLATMEM with freeing of the unused memory map in v5.11. Remove the support for DISCONTIGMEM entirely. Link: https://lkml.kernel.org/r/20210608091316.3622-5-rppt@kernel.org Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Arnd Bergmann <arnd@arndb.de> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Matt Turner <mattst88@gmail.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Vineet Gupta <vgupta@synopsys.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-29arc: remove support for DISCONTIGMEMMike Rapoport
DISCONTIGMEM was replaced by FLATMEM with freeing of the unused memory map in v5.11. Remove the support for DISCONTIGMEM entirely. Link: https://lkml.kernel.org/r/20210608091316.3622-4-rppt@kernel.org Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Acked-by: Vineet Gupta <vgupta@synopsys.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: David Hildenbrand <david@redhat.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Matt Turner <mattst88@gmail.com> Cc: Richard Henderson <rth@twiddle.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-29arc: update comment about HIGHMEM implementationMike Rapoport
Arc does not use DISCONTIGMEM to implement high memory, update the comment describing how high memory works to reflect this. Link: https://lkml.kernel.org/r/20210608091316.3622-3-rppt@kernel.org Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Acked-by: Vineet Gupta <vgupta@synopsys.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: David Hildenbrand <david@redhat.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Matt Turner <mattst88@gmail.com> Cc: Richard Henderson <rth@twiddle.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-29alpha: remove DISCONTIGMEM and NUMAMike Rapoport
Patch series "Remove DISCONTIGMEM memory model", v3. SPARSEMEM memory model was supposed to entirely replace DISCONTIGMEM a (long) while ago. The last architectures that used DISCONTIGMEM were updated to use other memory models in v5.11 and it is about the time to entirely remove DISCONTIGMEM from the kernel. This set removes DISCONTIGMEM from alpha, arc and m68k, simplifies memory model selection in mm/Kconfig and replaces usage of redundant CONFIG_NEED_MULTIPLE_NODES and CONFIG_FLAT_NODE_MEM_MAP with CONFIG_NUMA and CONFIG_FLATMEM respectively. I've also removed NUMA support on alpha that was BROKEN for more than 15 years. There were also minor updates all over arch/ to remove mentions of DISCONTIGMEM in comments and #ifdefs. This patch (of 9): NUMA is marked broken on alpha for more than 15 years and DISCONTIGMEM was replaced with SPARSEMEM in v5.11. Remove both NUMA and DISCONTIGMEM support from alpha. Link: https://lkml.kernel.org/r/20210608091316.3622-1-rppt@kernel.org Link: https://lkml.kernel.org/r/20210608091316.3622-2-rppt@kernel.org Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: David Hildenbrand <david@redhat.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Matt Turner <mattst88@gmail.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Vineet Gupta <vgupta@synopsys.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-29mm: define default MAX_PTRS_PER_* in include/pgtable.hDaniel Axtens
Commit c65e774fb3f6 ("x86/mm: Make PGDIR_SHIFT and PTRS_PER_P4D variable") made PTRS_PER_P4D variable on x86 and introduced MAX_PTRS_PER_P4D as a constant for cases which need a compile-time constant (e.g. fixed-size arrays). powerpc likewise has boot-time selectable MMU features which can cause other mm "constants" to vary. For KASAN, we have some static PTE/PMD/PUD/P4D arrays so we need compile-time maximums for all these constants. Extend the MAX_PTRS_PER_ idiom, and place default definitions in include/pgtable.h. These define MAX_PTRS_PER_x to be PTRS_PER_x unless an architecture has defined MAX_PTRS_PER_x in its arch headers. Clean up pgtable-nop4d.h and s390's MAX_PTRS_PER_P4D definitions while we're at it: both can just pick up the default now. Link: https://lkml.kernel.org/r/20210624034050.511391-4-dja@axtens.net Signed-off-by: Daniel Axtens <dja@axtens.net> Acked-by: Andrey Konovalov <andreyknvl@gmail.com> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Reviewed-by: Marco Elver <elver@google.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Cc: Balbir Singh <bsingharora@gmail.com> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com> Cc: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-29h8300: remove unused variableSouptick Joarder
Kernel test robot throws below warning -> >> arch/h8300/kernel/setup.c:72:26: warning: Unused variable: region [unusedVariable] struct memblock_region *region; Fixed it by removing unused variable. Link: https://lkml.kernel.org/r/20210602185431.11416-1-jrdr.linux@gmail.com Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com> Reported-by: kernel test robot <lkp@intel.com> Acked-by: Mike Rapoport <rppt@linux.ibm.com> Acked-by: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-29mm: update legacy flush_tlb_* to use vmaChen Li
1. These tlb flush functions have been using vma instead mm long time ago, but there is still some comments use mm as parameter. 2. the actual struct we use is vm_area_struct instead of vma_struct. 3. remove unused flush_kern_tlb_page. Link: https://lkml.kernel.org/r/87k0oaq311.wl-chenli@uniontech.com Signed-off-by: Chen Li <chenli@uniontech.com> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Jonas Bonn <jonas@southpole.se> Cc: Chris Zankel <chris@zankel.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-29x86/sgx: use vma_lookup() in sgx_encl_find()Liam Howlett
Use vma_lookup() to find the VMA at a specific address. As vma_lookup() will return NULL if the address is not within any VMA, the start address no longer needs to be validated. Link: https://lkml.kernel.org/r/20210521174745.2219620-10-Liam.Howlett@Oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com> Reviewed-by: Laurent Dufour <ldufour@linux.ibm.com> Acked-by: David Hildenbrand <david@redhat.com> Acked-by: Davidlohr Bueso <dbueso@suse.de> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-29arch/m68k/kernel/sys_m68k: use vma_lookup() in sys_cacheflush()Liam Howlett
Use vma_lookup() to find the VMA at a specific address. As vma_lookup() will return NULL if the address is not within any VMA, the start address no longer needs to be validated. Link: https://lkml.kernel.org/r/20210521174745.2219620-9-Liam.Howlett@Oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com> Reviewed-by: Laurent Dufour <ldufour@linux.ibm.com> Acked-by: David Hildenbrand <david@redhat.com> Acked-by: Davidlohr Bueso <dbueso@suse.de> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-29arch/mips/kernel/traps: use vma_lookup() instead of find_vma()Liam Howlett
Use vma_lookup() to find the VMA at a specific address. As vma_lookup() will return NULL if the address is not within any VMA, the start address no longer needs to be validated. Link: https://lkml.kernel.org/r/20210521174745.2219620-8-Liam.Howlett@Oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com> Reviewed-by: Laurent Dufour <ldufour@linux.ibm.com> Acked-by: David Hildenbrand <david@redhat.com> Acked-by: Davidlohr Bueso <dbueso@suse.de> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-29arch/powerpc/kvm/book3s: use vma_lookup() in kvmppc_hv_setup_htab_rma()Liam Howlett
Using vma_lookup() removes the requirement to check if the address is within the returned vma. The code is easier to understand and more compact. Link: https://lkml.kernel.org/r/20210521174745.2219620-7-Liam.Howlett@Oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com> Reviewed-by: Laurent Dufour <ldufour@linux.ibm.com> Acked-by: David Hildenbrand <david@redhat.com> Acked-by: Davidlohr Bueso <dbueso@suse.de> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-29arch/powerpc/kvm/book3s_hv_uvmem: use vma_lookup() instead of ↵Liam Howlett
find_vma_intersection() vma_lookup() finds the vma of a specific address with a cleaner interface and is more readable. Link: https://lkml.kernel.org/r/20210521174745.2219620-6-Liam.Howlett@Oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com> Reviewed-by: Laurent Dufour <ldufour@linux.ibm.com> Acked-by: David Hildenbrand <david@redhat.com> Acked-by: Davidlohr Bueso <dbueso@suse.de> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-29arch/arm64/kvm: use vma_lookup() instead of find_vma_intersection()Liam Howlett
vma_lookup() finds the vma of a specific address with a cleaner interface and is more readable. Link: https://lkml.kernel.org/r/20210521174745.2219620-5-Liam.Howlett@Oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com> Reviewed-by: Laurent Dufour <ldufour@linux.ibm.com> Acked-by: David Hildenbrand <david@redhat.com> Acked-by: Davidlohr Bueso <dbueso@suse.de> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-29arch/arc/kernel/troubleshoot: use vma_lookup() instead of find_vma()Liam Howlett
Use vma_lookup() to find the VMA at a specific address. As vma_lookup() will return NULL if the address is not within any VMA, the start address no longer needs to be validated. Link: https://lkml.kernel.org/r/20210521174745.2219620-4-Liam.Howlett@Oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com> Reviewed-by: Laurent Dufour <ldufour@linux.ibm.com> Acked-by: David Hildenbrand <david@redhat.com> Acked-by: Davidlohr Bueso <dbueso@suse.de> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-29binfmt: remove in-tree usage of MAP_EXECUTABLEDavid Hildenbrand
Ever since commit e9714acf8c43 ("mm: kill vma flag VM_EXECUTABLE and mm->num_exe_file_vmas"), VM_EXECUTABLE is gone and MAP_EXECUTABLE is essentially completely ignored. Let's remove all usage of MAP_EXECUTABLE. [akpm@linux-foundation.org: fix blooper in fs/binfmt_aout.c. per David] Link: https://lkml.kernel.org/r/20210421093453.6904-3-david@redhat.com Signed-off-by: David Hildenbrand <david@redhat.com> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Reviewed-by: Kees Cook <keescook@chromium.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Don Zickus <dzickus@redhat.com> Cc: Feng Tang <feng.tang@intel.com> Cc: Greg Ungerer <gerg@linux-m68k.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kevin Brodsky <Kevin.Brodsky@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-29ia64: mca_drv: fix incorrect array size calculationArnd Bergmann
gcc points out a mistake in the mca driver that goes back to before the git history: arch/ia64/kernel/mca_drv.c: In function 'init_record_index_pools': arch/ia64/kernel/mca_drv.c:346:54: error: expression does not compute the number of elements in this array; element typ e is 'int', not 'size_t' {aka 'long unsigned int'} [-Werror=sizeof-array-div] 346 | for (i = 1; i < sizeof sal_log_sect_min_sizes/sizeof(size_t); i++) | ^ This is the same as sizeof(size_t), which is two shorter than the actual array. Use the ARRAY_SIZE() macro to get the correct calculation instead. Link: https://lkml.kernel.org/r/20210514214123.875971-1-arnd@kernel.org Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Masahiro Yamada <masahiroy@kernel.org> Cc: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-29ia64: headers: drop duplicated wordsRandy Dunlap
Delete the repeated words "to" and "the". Link: https://lkml.kernel.org/r/20210507184837.10754-1-rdunlap@infradead.org Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-26Merge tag 's390-5.13-5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 fixes from Vasily Gorbik: - Fix a couple of late pt_regs flags handling findings of conversion to generic entry. - Fix potential register clobbering in stack switch helper. - Fix thread/group masks for offline cpus. - Fix cleanup of mdev resources when remove callback is invoked in vfio-ap code. * tag 's390-5.13-5' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/stack: fix possible register corruption with stack switch helper s390/topology: clear thread/group maps for offline cpus s390/vfio-ap: clean up mdev resources when remove callback invoked s390: clear pt_regs::flags on irq entry s390: fix system call restart with multiple signals
2021-06-25Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge misc fixes from Andrew Morton: "24 patches, based on 4a09d388f2ab382f217a764e6a152b3f614246f6. Subsystems affected by this patch series: mm (thp, vmalloc, hugetlb, memory-failure, and pagealloc), nilfs2, kthread, MAINTAINERS, and mailmap" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (24 commits) mailmap: add Marek's other e-mail address and identity without diacritics MAINTAINERS: fix Marek's identity again mm/page_alloc: do bulk array bounds check after checking populated elements mm/page_alloc: __alloc_pages_bulk(): do bounds check before accessing array mm/hwpoison: do not lock page again when me_huge_page() successfully recovers mm,hwpoison: return -EHWPOISON to denote that the page has already been poisoned mm/memory-failure: use a mutex to avoid memory_failure() races mm, futex: fix shared futex pgoff on shmem huge page kthread: prevent deadlock when kthread_mod_delayed_work() races with kthread_cancel_delayed_work_sync() kthread_worker: split code for canceling the delayed work timer mm/vmalloc: unbreak kasan vmalloc support KVM: s390: prepare for hugepage vmalloc mm/vmalloc: add vmalloc_no_huge nilfs2: fix memory leak in nilfs_sysfs_delete_device_group mm/thp: another PVMW_SYNC fix in page_vma_mapped_walk() mm/thp: fix page_vma_mapped_walk() if THP mapped by ptes mm: page_vma_mapped_walk(): get vma_address_end() earlier mm: page_vma_mapped_walk(): use goto instead of while (1) mm: page_vma_mapped_walk(): add a level of indentation mm: page_vma_mapped_walk(): crossing page table boundary ...
2021-06-25Merge tag 'x86_urgent_for_v5.13' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Borislav Petkov: "Two more urgent FPU fixes: - prevent unprivileged userspace from reinitializing supervisor states - prepare init_fpstate, which is the buffer used when initializing FPU state, properly in case the skip-writing-state-components XSAVE* variants are used" * tag 'x86_urgent_for_v5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/fpu: Make init_fpstate correct with optimized XSAVE x86/fpu: Preserve supervisor states in sanitize_restored_user_xstate()
2021-06-24KVM: s390: prepare for hugepage vmallocClaudio Imbrenda
The Create Secure Configuration Ultravisor Call does not support using large pages for the virtual memory area. This is a hardware limitation. This patch replaces the vzalloc call with an almost equivalent call to the newly introduced vmalloc_no_huge function, which guarantees that only small pages will be used for the backing. The new call will not clear the allocated memory, but that has never been an actual requirement. Link: https://lkml.kernel.org/r/20210614132357.10202-3-imbrenda@linux.ibm.com Fixes: 121e6f3258fe3 ("mm/vmalloc: hugepage vmalloc mappings") Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Reviewed-by: Janosch Frank <frankja@linux.ibm.com> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Acked-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: David Hildenbrand <david@redhat.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Uladzislau Rezki (Sony) <urezki@gmail.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: David Rientjes <rientjes@google.com> Cc: Christoph Hellwig <hch@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-24Merge tag 'perf-urgent-2021-06-24' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 perf fix from Ingo Molnar: "An LBR buffer fix for code that probably only worked accidentally" * tag 'perf-urgent-2021-06-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86/intel/lbr: Zero the xstate buffer on allocation
2021-06-24Merge tag 'objtool-urgent-2021-06-24' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull objtool fixes from Ingo Molnar: "Address a number of objtool warnings that got reported. No change in behavior intended, but code generation might be impacted by commit 1f008d46f124 ("x86: Always inline task_size_max()")" * tag 'objtool-urgent-2021-06-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: locking/lockdep: Improve noinstr vs errors x86: Always inline task_size_max() x86/xen: Fix noinstr fail in exc_xen_unknown_trap() x86/xen: Fix noinstr fail in xen_pv_evtchn_do_upcall() x86/entry: Fix noinstr fail in __do_fast_syscall_32() objtool/x86: Ignore __x86_indirect_alt_* symbols
2021-06-24perf/x86/intel/lbr: Zero the xstate buffer on allocationThomas Gleixner
XRSTORS requires a valid xstate buffer to work correctly. XSAVES does not guarantee to write a fully valid buffer according to the SDM: "XSAVES does not write to any parts of the XSAVE header other than the XSTATE_BV and XCOMP_BV fields." XRSTORS triggers a #GP: "If bytes 63:16 of the XSAVE header are not all zero." It's dubious at best how this can work at all when the buffer is not zeroed before use. Allocate the buffers with __GFP_ZERO to prevent XRSTORS failure. Fixes: ce711ea3cab9 ("perf/x86/intel/lbr: Support XSAVES/XRSTORS for LBR context switch") Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/87wnr0wo2z.ffs@nanos.tec.linutronix.de
2021-06-22x86: Always inline task_size_max()Peter Zijlstra
Fix: vmlinux.o: warning: objtool: handle_bug()+0x10: call to task_size_max() leaves .noinstr.text section When #UD isn't a BUG, we shouldn't violate noinstr (we'll still probably die, but that's another story). Fixes: 025768a966a3 ("x86/cpu: Use alternative to generate the TASK_SIZE_MAX constant") Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20210621120120.682468274@infradead.org
2021-06-22x86/xen: Fix noinstr fail in exc_xen_unknown_trap()Peter Zijlstra
Fix: vmlinux.o: warning: objtool: exc_xen_unknown_trap()+0x7: call to printk() leaves .noinstr.text section Fixes: 2e92493637a0 ("x86/xen: avoid warning in Xen pv guest with CONFIG_AMD_MEM_ENCRYPT enabled") Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20210621120120.606560778@infradead.org
2021-06-22x86/xen: Fix noinstr fail in xen_pv_evtchn_do_upcall()Peter Zijlstra
Fix: vmlinux.o: warning: objtool: xen_pv_evtchn_do_upcall()+0x23: call to irq_enter_rcu() leaves .noinstr.text section Fixes: 359f01d1816f ("x86/entry: Use run_sysvec_on_irqstack_cond() for XEN upcall") Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20210621120120.532960208@infradead.org
2021-06-22x86/entry: Fix noinstr fail in __do_fast_syscall_32()Peter Zijlstra
Fix: vmlinux.o: warning: objtool: __do_fast_syscall_32()+0xf5: call to trace_hardirqs_off() leaves .noinstr.text section Fixes: 5d5675df792f ("x86/entry: Fix entry/exit mismatch on failed fast 32-bit syscalls") Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20210621120120.467898710@infradead.org
2021-06-22x86/fpu: Make init_fpstate correct with optimized XSAVEThomas Gleixner
The XSAVE init code initializes all enabled and supported components with XRSTOR(S) to init state. Then it XSAVEs the state of the components back into init_fpstate which is used in several places to fill in the init state of components. This works correctly with XSAVE, but not with XSAVEOPT and XSAVES because those use the init optimization and skip writing state of components which are in init state. So init_fpstate.xsave still contains all zeroes after this operation. There are two ways to solve that: 1) Use XSAVE unconditionally, but that requires to reshuffle the buffer when XSAVES is enabled because XSAVES uses compacted format. 2) Save the components which are known to have a non-zero init state by other means. Looking deeper, #2 is the right thing to do because all components the kernel supports have all-zeroes init state except the legacy features (FP, SSE). Those cannot be hard coded because the states are not identical on all CPUs, but they can be saved with FXSAVE which avoids all conditionals. Use FXSAVE to save the legacy FP/SSE components in init_fpstate along with a BUILD_BUG_ON() which reminds developers to validate that a newly added component has all zeroes init state. As a bonus remove the now unused copy_xregs_to_kernel_booting() crutch. The XSAVE and reshuffle method can still be implemented in the unlikely case that components are added which have a non-zero init state and no other means to save them. For now, FXSAVE is just simple and good enough. [ bp: Fix a typo or two in the text. ] Fixes: 6bad06b76892 ("x86, xsave: Use xsaveopt in context-switch path when supported") Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Borislav Petkov <bp@suse.de> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20210618143444.587311343@linutronix.de
2021-06-22x86/fpu: Preserve supervisor states in sanitize_restored_user_xstate()Thomas Gleixner
sanitize_restored_user_xstate() preserves the supervisor states only when the fx_only argument is zero, which allows unprivileged user space to put supervisor states back into init state. Preserve them unconditionally. [ bp: Fix a typo or two in the text. ] Fixes: 5d6b6a6f9b5c ("x86/fpu/xstate: Update sanitize_restored_xstate() for supervisor xstates") Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20210618143444.438635017@linutronix.de
2021-06-21Merge tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-armLinus Torvalds
Pull ARM fix from Russell King: - fix gcc 10 compiler regression with cpu_init() * tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm: ARM: 9081/1: fix gcc-10 thumb2-kernel regression
2021-06-21objtool/x86: Ignore __x86_indirect_alt_* symbolsPeter Zijlstra
Because the __x86_indirect_alt* symbols are just that, objtool will try and validate them as regular symbols, instead of the alternative replacements that they are. This goes sideways for FRAME_POINTER=y builds; which generate a fair amount of warnings. Fixes: 9bc0bb50727c ("objtool/x86: Rewrite retpoline thunk calls") Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/YNCgxwLBiK9wclYJ@hirez.programming.kicks-ass.net
2021-06-21s390/stack: fix possible register corruption with stack switch helperHeiko Carstens
The CALL_ON_STACK macro is used to call a C function from inline assembly, and therefore must consider the C ABI, which says that only registers 6-13, and 15 are non-volatile (restored by the called function). The inline assembly incorrectly marks all registers used to pass parameters to the called function as read-only input operands, instead of operands that are read and written to. This might result in register corruption depending on usage, compiler, and compile options. Fix this by marking all operands used to pass parameters as read/write operands. To keep the code simple even register 6, if used, is marked as read-write operand. Fixes: ff340d2472ec ("s390: add stack switch helper") Cc: <stable@kernel.org> # 4.20 Reviewed-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2021-06-21s390/topology: clear thread/group maps for offline cpusSven Schnelle
The current code doesn't clear the thread/group maps for offline CPUs. This may cause kernel crashes like the one bewlow in common code that assumes if a CPU has sibblings it is online. Unable to handle kernel pointer dereference in virtual kernel address space Call Trace: [<000000013a4b8c3c>] blk_mq_map_swqueue+0x10c/0x388 ([<000000013a4b8bcc>] blk_mq_map_swqueue+0x9c/0x388) [<000000013a4b9300>] blk_mq_init_allocated_queue+0x448/0x478 [<000000013a4b9416>] blk_mq_init_queue+0x4e/0x90 [<000003ff8019d3e6>] loop_add+0x106/0x278 [loop] [<000003ff801b8148>] loop_init+0x148/0x1000 [loop] [<0000000139de4924>] do_one_initcall+0x3c/0x1e0 [<0000000139ef449a>] do_init_module+0x6a/0x2a0 [<0000000139ef61bc>] __do_sys_finit_module+0xa4/0xc0 [<0000000139de9e6e>] do_syscall+0x7e/0xd0 [<000000013a8e0aec>] __do_syscall+0xbc/0x110 [<000000013a8ee2e8>] system_call+0x78/0xa0 Fixes: 52aeda7accb6 ("s390/topology: remove offline CPUs from CPU topology masks") Cc: <stable@kernel.org> # 5.7+ Reported-by: Marius Hillenbrand <mhillen@linux.ibm.com> Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2021-06-21s390: clear pt_regs::flags on irq entrySven Schnelle
The current irq entry code doesn't initialize pt_regs::flags. On exit to user mode arch_do_signal_or_restart() tests whether PIF_SYSCALL is set, which might yield wrong results. Fix this by clearing pt_regs::flags in the entry.S irq handler code. Reported-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Fixes: 56e62a737028 ("s390: convert to generic entry") Cc: <stable@vger.kernel.org> # 5.12 Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2021-06-21s390: fix system call restart with multiple signalsSven Schnelle
glibc complained with "The futex facility returned an unexpected error code.". It turned out that the futex syscall returned -ERESTARTSYS because a signal is pending. arch_do_signal_or_restart() restored the syscall parameters (nameley regs->gprs[2]) and set PIF_SYSCALL_RESTART. When another signal is made pending later in the exit loop arch_do_signal_or_restart() is called again. This function clears PIF_SYSCALL_RESTART and checks the return code which is set in regs->gprs[2]. However, regs->gprs[2] was restored in the previous run and no longer contains -ERESTARTSYS, so PIF_SYSCALL_RESTART isn't set again and the syscall is skipped. Fix this by not clearing PIF_SYSCALL_RESTART - it is already cleared in __do_syscall() when the syscall is restarted. Reported-by: Bjoern Walk <bwalk@linux.ibm.com> Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Fixes: 56e62a737028 ("s390: convert to generic entry") Cc: <stable@vger.kernel.org> # 5.12 Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2021-06-20Merge tag 'x86_urgent_for_v5.13_rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Borislav Petkov: "A first set of urgent fixes to the FPU/XSTATE handling mess^W code. (There's a lot more in the pipe): - Prevent corruption of the XSTATE buffer in signal handling by validating what is being copied from userspace first. - Invalidate other task's preserved FPU registers on XRSTOR failure (#PF) because latter can still modify some of them. - Restore the proper PKRU value in case userspace modified it - Reset FPU state when signal restoring fails Other: - Map EFI boot services data memory as encrypted in a SEV guest so that the guest can access it and actually boot properly - Two SGX correctness fixes: proper resources freeing and a NUMA fix" * tag 'x86_urgent_for_v5.13_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mm: Avoid truncating memblocks for SGX memory x86/sgx: Add missing xa_destroy() when virtual EPC is destroyed x86/fpu: Reset state for all signal restore failures x86/pkru: Write hardware init value to PKRU when xstate is init x86/process: Check PF_KTHREAD and not current->mm for kernel threads x86/fpu: Invalidate FPU state after a failed XRSTOR from a user buffer x86/fpu: Prevent state corruption in __fpu__restore_sig() x86/ioremap: Map EFI-reserved memory as encrypted for SEV
2021-06-19Merge tag 'powerpc-5.13-6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: "Fix initrd corruption caused by our recent change to use relative jump labels. Fix a crash using perf record on systems without a hardware PMU backend. Rework our 64-bit signal handling slighty to make it more closely match the old behaviour, after the recent change to use unsafe user accessors. Thanks to Anastasia Kovaleva, Athira Rajeev, Christophe Leroy, Daniel Axtens, Greg Kurz, and Roman Bolshakov" * tag 'powerpc-5.13-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/perf: Fix crash in perf_instruction_pointer() when ppmu is not set powerpc: Fix initrd corruption with relative jump labels powerpc/signal64: Copy siginfo before changing regs->nip powerpc/mem: Add back missing header to fix 'no previous prototype' error
2021-06-19Merge tag 'riscv-for-linus-5.13-rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V fixes from Palmer Dabbelt: - A build fix to always build modules with the 'medany' code model, as the module loader doesn't support 'medlow'. - A Kconfig warning fix for the SiFive errata. - A pair of fixes that for regressions to the recent memory layout changes. - A fix for the FU740 device tree. * tag 'riscv-for-linus-5.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: riscv: dts: fu740: fix cache-controller interrupts riscv: Ensure BPF_JIT_REGION_START aligned with PMD size riscv: kasan: Fix MODULES_VADDR evaluation due to local variables' name riscv: sifive: fix Kconfig errata warning riscv32: Use medany C model for modules
2021-06-19Merge tag 's390-5.13-4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 fixes from Vasily Gorbik: - Fix zcrypt ioctl hang due to AP queue msg counter dropping below 0 when pending requests are purged. - Two fixes for the machine check handler in the entry code. * tag 's390-5.13-4' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/ap: Fix hanging ioctl caused by wrong msg counter s390/mcck: fix invalid KVM guest condition check s390/mcck: fix calculation of SIE critical section size
2021-06-19riscv: dts: fu740: fix cache-controller interruptsDavid Abdurachmanov
The order of interrupt numbers is incorrect. The order for FU740 is: DirError, DataError, DataFail, DirFail From SiFive FU740-C000 Manual: 19 - L2 Cache DirError 20 - L2 Cache DirFail 21 - L2 Cache DataError 22 - L2 Cache DataFail Signed-off-by: David Abdurachmanov <david.abdurachmanov@sifive.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-06-18riscv: Ensure BPF_JIT_REGION_START aligned with PMD sizeJisheng Zhang
Andreas reported commit fc8504765ec5 ("riscv: bpf: Avoid breaking W^X") breaks booting with one kind of defconfig, I reproduced a kernel panic with the defconfig: [ 0.138553] Unable to handle kernel paging request at virtual address ffffffff81201220 [ 0.139159] Oops [#1] [ 0.139303] Modules linked in: [ 0.139601] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.13.0-rc5-default+ #1 [ 0.139934] Hardware name: riscv-virtio,qemu (DT) [ 0.140193] epc : __memset+0xc4/0xfc [ 0.140416] ra : skb_flow_dissector_init+0x1e/0x82 [ 0.140609] epc : ffffffff8029806c ra : ffffffff8033be78 sp : ffffffe001647da0 [ 0.140878] gp : ffffffff81134b08 tp : ffffffe001654380 t0 : ffffffff81201158 [ 0.141156] t1 : 0000000000000002 t2 : 0000000000000154 s0 : ffffffe001647dd0 [ 0.141424] s1 : ffffffff80a43250 a0 : ffffffff81201220 a1 : 0000000000000000 [ 0.141654] a2 : 000000000000003c a3 : ffffffff81201258 a4 : 0000000000000064 [ 0.141893] a5 : ffffffff8029806c a6 : 0000000000000040 a7 : ffffffffffffffff [ 0.142126] s2 : ffffffff81201220 s3 : 0000000000000009 s4 : ffffffff81135088 [ 0.142353] s5 : ffffffff81135038 s6 : ffffffff8080ce80 s7 : ffffffff80800438 [ 0.142584] s8 : ffffffff80bc6578 s9 : 0000000000000008 s10: ffffffff806000ac [ 0.142810] s11: 0000000000000000 t3 : fffffffffffffffc t4 : 0000000000000000 [ 0.143042] t5 : 0000000000000155 t6 : 00000000000003ff [ 0.143220] status: 0000000000000120 badaddr: ffffffff81201220 cause: 000000000000000f [ 0.143560] [<ffffffff8029806c>] __memset+0xc4/0xfc [ 0.143859] [<ffffffff8061e984>] init_default_flow_dissectors+0x22/0x60 [ 0.144092] [<ffffffff800010fc>] do_one_initcall+0x3e/0x168 [ 0.144278] [<ffffffff80600df0>] kernel_init_freeable+0x1c8/0x224 [ 0.144479] [<ffffffff804868a8>] kernel_init+0x12/0x110 [ 0.144658] [<ffffffff800022de>] ret_from_exception+0x0/0xc [ 0.145124] ---[ end trace f1e9643daa46d591 ]--- After some investigation, I think I found the root cause: commit 2bfc6cd81bd ("move kernel mapping outside of linear mapping") moves BPF JIT region after the kernel: | #define BPF_JIT_REGION_START PFN_ALIGN((unsigned long)&_end) The &_end is unlikely aligned with PMD size, so the front bpf jit region sits with part of kernel .data section in one PMD size mapping. But kernel is mapped in PMD SIZE, when bpf_jit_binary_lock_ro() is called to make the first bpf jit prog ROX, we will make part of kernel .data section RO too, so when we write to, for example memset the .data section, MMU will trigger a store page fault. To fix the issue, we need to ensure the BPF JIT region is PMD size aligned. This patch acchieve this goal by restoring the BPF JIT region to original position, I.E the 128MB before kernel .text section. The modification to kasan_init.c is inspired by Alexandre. Fixes: fc8504765ec5 ("riscv: bpf: Avoid breaking W^X") Reported-by: Andreas Schwab <schwab@linux-m68k.org> Signed-off-by: Jisheng Zhang <jszhang@kernel.org> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-06-18riscv: kasan: Fix MODULES_VADDR evaluation due to local variables' nameJisheng Zhang
commit 2bfc6cd81bd1 ("riscv: Move kernel mapping outside of linear mapping") makes use of MODULES_VADDR to populate kernel, BPF, modules mapping. Currently, MODULES_VADDR is defined as below for RV64: | #define MODULES_VADDR (PFN_ALIGN((unsigned long)&_end) - SZ_2G) But kasan_init() has two local variables which are also named as _start, _end, so MODULES_VADDR is evaluated with the local variable _end rather than the global "_end" as we expected. Fix this issue by renaming the two local variables. Fixes: 2bfc6cd81bd1 ("riscv: Move kernel mapping outside of linear mapping") Signed-off-by: Jisheng Zhang <jszhang@kernel.org> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-06-18Merge tag 'pci-v5.13-fixes-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI fixes from Bjorn Helgaas: - Clear 64-bit flag for host bridge windows below 4GB to fix a resource allocation regression added in -rc1 (Punit Agrawal) - Fix tegra194 MCFG quirk build regressions added in -rc1 (Jon Hunter) - Avoid secondary bus resets on TI KeyStone C667X devices (Antti Järvinen) - Avoid secondary bus resets on some NVIDIA GPUs (Shanker Donthineni) - Work around FLR erratum on Huawei Intelligent NIC VF (Chiqijun) - Avoid broken ATS on AMD Navi14 GPU (Evan Quan) - Trust Broadcom BCM57414 NIC to isolate functions even though it doesn't advertise ACS support (Sriharsha Basavapatna) - Work around AMD RS690 BIOSes that don't configure DMA above 4GB (Mikel Rychliski) - Fix panic during PIO transfer on Aardvark controller (Pali Rohár) * tag 'pci-v5.13-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: PCI: aardvark: Fix kernel panic during PIO transfer PCI: Add AMD RS690 quirk to enable 64-bit DMA PCI: Add ACS quirk for Broadcom BCM57414 NIC PCI: Mark AMD Navi14 GPU ATS as broken PCI: Work around Huawei Intelligent NIC VF FLR erratum PCI: Mark some NVIDIA GPUs to avoid bus reset PCI: Mark TI C667X to avoid bus reset PCI: tegra194: Fix MCFG quirk build regressions PCI: of: Clear 64-bit flag for non-prefetchable memory below 4GB
2021-06-18Merge tag 'arc-5.13-rc7-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc Pull ARC fixes from Vineet Gupta: - ARCv2 userspace ABI not populating a few registers - Unbork CONFIG_HARDENED_USERCOPY for ARC * tag 'arc-5.13-rc7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc: ARC: fix CONFIG_HARDENED_USERCOPY ARCv2: save ABI registers across signal handling
2021-06-18x86/mm: Avoid truncating memblocks for SGX memoryFan Du
tl;dr: Several SGX users reported seeing the following message on NUMA systems: sgx: [Firmware Bug]: Unable to map EPC section to online node. Fallback to the NUMA node 0. This turned out to be the memblock code mistakenly throwing away SGX memory. === Full Changelog === The 'max_pfn' variable represents the highest known RAM address. It can be used, for instance, to quickly determine for which physical addresses there is mem_map[] space allocated. The numa_meminfo code makes an effort to throw out ("trim") all memory blocks which are above 'max_pfn'. SGX memory is not considered RAM (it is marked as "Reserved" in the e820) and is not taken into account by max_pfn. Despite this, SGX memory areas have NUMA affinity and are enumerated in the ACPI SRAT table. The existing SGX code uses the numa_meminfo mechanism to look up the NUMA affinity for its memory areas. In cases where SGX memory was above max_pfn (usually just the one EPC section in the last highest NUMA node), the numa_memblock is truncated at 'max_pfn', which is below the SGX memory. When the SGX code tries to look up the affinity of this memory, it fails and produces an error message: sgx: [Firmware Bug]: Unable to map EPC section to online node. Fallback to the NUMA node 0. and assigns the memory to NUMA node 0. Instead of silently truncating the memory block at 'max_pfn' and dropping the SGX memory, add the truncated portion to 'numa_reserved_meminfo'. This allows the SGX code to later determine the NUMA affinity of its 'Reserved' area. Before, numa_meminfo looked like this (from 'crash'): blk = { start = 0x0, end = 0x2080000000, nid = 0x0 } { start = 0x2080000000, end = 0x4000000000, nid = 0x1 } numa_reserved_meminfo is empty. With this, numa_meminfo looks like this: blk = { start = 0x0, end = 0x2080000000, nid = 0x0 } { start = 0x2080000000, end = 0x4000000000, nid = 0x1 } and numa_reserved_meminfo has an entry for node 1's SGX memory: blk = { start = 0x4000000000, end = 0x4080000000, nid = 0x1 } [ daveh: completely rewrote/reworked changelog ] Fixes: 5d30f92e7631 ("x86/NUMA: Provide a range-to-target_node lookup facility") Reported-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Fan Du <fan.du@intel.com> Signed-off-by: Dave Hansen <dave.hansen@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Dave Hansen <dave.hansen@intel.com> Cc: <stable@vger.kernel.org> Link: https://lkml.kernel.org/r/20210617194657.0A99CB22@viggo.jf.intel.com
2021-06-18PCI: Add AMD RS690 quirk to enable 64-bit DMAMikel Rychliski
Although the AMD RS690 chipset has 64-bit DMA support, BIOS implementations sometimes fail to configure the memory limit registers correctly. The Acer F690GVM mainboard uses this chipset and a Marvell 88E8056 NIC. The sky2 driver programs the NIC to use 64-bit DMA, which will not work: sky2 0000:02:00.0: error interrupt status=0x8 sky2 0000:02:00.0 eth0: tx timeout sky2 0000:02:00.0 eth0: transmit ring 0 .. 22 report=0 done=0 Other drivers required by this mainboard either don't support 64-bit DMA, or have it disabled using driver specific quirks. For example, the ahci driver has quirks to enable or disable 64-bit DMA depending on the BIOS version (see ahci_sb600_enable_64bit() in ahci.c). This ahci quirk matches against the SB600 SATA controller, but the real issue is almost certainly with the RS690 PCI host that it was commonly attached to. To avoid this issue in all drivers with 64-bit DMA support, fix the configuration of the PCI host. If the kernel is aware of physical memory above 4GB, but the BIOS never configured the PCI host with this information, update the registers with our values. [bhelgaas: drop PCI_DEVICE_ID_ATI_RS690 definition] Link: https://lore.kernel.org/r/20210611214823.4898-1-mikel@mikelr.com Signed-off-by: Mikel Rychliski <mikel@mikelr.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>