summaryrefslogtreecommitdiff
path: root/arch/sparc
AgeCommit message (Collapse)Author
2021-05-02sparc: syscalls: switch to generic syscalltbl.shMasahiro Yamada
Many architectures duplicate similar shell scripts. This commit converts sparc to use scripts/syscalltbl.sh. This also unifies syscall_table_64.h and syscall_table_c32.h. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2021-04-30mm: move mem_init_print_info() into mm_init()Kefeng Wang
mem_init_print_info() is called in mem_init() on each architecture, and pass NULL argument, so using void argument and move it into mm_init(). Link: https://lkml.kernel.org/r/20210317015210.33641-1-wangkefeng.wang@huawei.com Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> [x86] Reviewed-by: Christophe Leroy <christophe.leroy@c-s.fr> [powerpc] Acked-by: David Hildenbrand <david@redhat.com> Tested-by: Anatoly Pugachev <matorola@gmail.com> [sparc64] Acked-by: Russell King <rmk+kernel@armlinux.org.uk> [arm] Acked-by: Mike Rapoport <rppt@linux.ibm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Guo Ren <guoren@kernel.org> Cc: Yoshinori Sato <ysato@users.osdn.me> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Jonas Bonn <jonas@southpole.se> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: "Peter Zijlstra" <peterz@infradead.org> Cc: Ingo Molnar <mingo@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-04-30mm/vmalloc: fix HUGE_VMAP regression by enabling huge pages in vmalloc_to_pageNicholas Piggin
vmalloc_to_page returns NULL for addresses mapped by larger pages[*]. Whether or not a vmap is huge depends on the architecture details, alignments, boot options, etc., which the caller can not be expected to know. Therefore HUGE_VMAP is a regression for vmalloc_to_page. This change teaches vmalloc_to_page about larger pages, and returns the struct page that corresponds to the offset within the large page. This makes the API agnostic to mapping implementation details. [*] As explained by commit 029c54b095995 ("mm/vmalloc.c: huge-vmap: fail gracefully on unexpected huge vmap mappings") [npiggin@gmail.com: sparc32: add stub pud_page define for walking huge vmalloc page tables] Link: https://lkml.kernel.org/r/20210324232825.1157363-1-npiggin@gmail.com Link: https://lkml.kernel.org/r/20210317062402.533919-3-npiggin@gmail.com Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Miaohe Lin <linmiaohe@huawei.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Ding Tianhong <dingtianhong@huawei.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Russell King <linux@armlinux.org.uk> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Uladzislau Rezki (Sony) <urezki@gmail.com> Cc: Will Deacon <will@kernel.org> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-04-30mm: move page_mapping_file to pagemap.hMatthew Wilcox (Oracle)
page_mapping_file() is only used by some architectures, and then it is usually only used in one place. Make it a static inline function so other architectures don't have to carry this dead code. Link: https://lkml.kernel.org/r/20210317123011.350118-1-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: David Hildenbrand <david@redhat.com> Acked-by: Mike Rapoport <rppt@linux.ibm.com> Cc: Huang Ying <ying.huang@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-04-29Merge tag 'for_v5.13-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull quota, ext2, reiserfs updates from Jan Kara: - support for path (instead of device) based quotactl syscall (quotactl_path(2)) - ext2 conversion to kmap_local() - other minor cleanups & fixes * tag 'for_v5.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: fs/reiserfs/journal.c: delete useless variables fs/ext2: Replace kmap() with kmap_local_page() ext2: Match up ext2_put_page() with ext2_dotdot() and ext2_find_entry() fs/ext2/: fix misspellings using codespell tool quota: report warning limits for realtime space quotas quota: wire up quotactl_path quota: Add mountpath based quota support
2021-04-22arch: Wire up Landlock syscallsMickaël Salaün
Wire up the following system calls for all architectures: * landlock_create_ruleset(2) * landlock_add_rule(2) * landlock_restrict_self(2) Cc: Arnd Bergmann <arnd@arndb.de> Cc: James Morris <jmorris@namei.org> Cc: Jann Horn <jannh@google.com> Cc: Kees Cook <keescook@chromium.org> Cc: Serge E. Hallyn <serge@hallyn.com> Signed-off-by: Mickaël Salaün <mic@linux.microsoft.com> Link: https://lore.kernel.org/r/20210422154123.13086-10-mic@digikod.net Signed-off-by: James Morris <jamorris@linux.microsoft.com>
2021-04-08asm-generic/io.h: Add a non-posted variant of ioremap()Hector Martin
ARM64 currently defaults to posted MMIO (nGnRE), but some devices require the use of non-posted MMIO (nGnRnE). Introduce a new ioremap() variant to handle this case. ioremap_np() returns NULL on arches that do not implement this variant. sparc64 is the only architecture that needs to be touched directly, because it includes neither of the generic io.h or iomap.h headers. This adds the IORESOURCE_MEM_NONPOSTED flag, which maps to this variant and marks a given resource as requiring non-posted mappings. This is implemented in the resource system because it is a SoC-level requirement, so existing drivers do not need special-case code to pick this ioremap variant. Then this is implemented in devres by introducing devm_ioremap_np(), and making devm_ioremap_resource() automatically select this variant when the resource has the IORESOURCE_MEM_NONPOSTED flag set. Acked-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Hector Martin <marcan@marcan.st>
2021-03-23tracing: Fix various typos in commentsIngo Molnar
Fix ~59 single-word typos in the tracing code comments, and fix the grammar in a handful of places. Link: https://lore.kernel.org/r/20210322224546.GA1981273@gmail.com Link: https://lkml.kernel.org/r/20210323174935.GA4176821@gmail.com Reviewed-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-03-17quota: wire up quotactl_pathSascha Hauer
Wire up the quotactl_path syscall added in the previous patch. Link: https://lore.kernel.org/r/20210304123541.30749-3-s.hauer@pengutronix.de Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jan Kara <jack@suse.cz>
2021-03-09Merge git://git.kernel.org:/pub/scm/linux/kernel/git/davem/sparcLinus Torvalds
Pull sparc fixes from David Miller: "Fix opcode filtering for exceptions, and clean up defconfig" * git://git.kernel.org:/pub/scm/linux/kernel/git/davem/sparc: sparc: sparc64_defconfig: remove duplicate CONFIGs sparc64: Fix opcode filtering in handling of no fault loads
2021-03-09sparc: sparc64_defconfig: remove duplicate CONFIGsCorentin Labbe
After my patch there is CONFIG_ATA defined twice. Remove the duplicate one. Same problem for CONFIG_HAPPYMEAL, except I added as builtin for boot test with NFS. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Fixes: a57cdeb369ef ("sparc: sparc64_defconfig: add necessary configs for qemu") Signed-off-by: Corentin Labbe <clabbe@baylibre.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-09sparc64: Fix opcode filtering in handling of no fault loadsRob Gardner
is_no_fault_exception() has two bugs which were discovered via random opcode testing with stress-ng. Both are caused by improper filtering of opcodes. The first bug can be triggered by a floating point store with a no-fault ASI, for instance "sta %f0, [%g0] #ASI_PNF", opcode C1A01040. The code first tests op3[5] (0x1000000), which denotes a floating point instruction, and then tests op3[2] (0x200000), which denotes a store instruction. But these bits are not mutually exclusive, and the above mentioned opcode has both bits set. The intent is to filter out stores, so the test for stores must be done first in order to have any effect. The second bug can be triggered by a floating point load with one of the invalid ASI values 0x8e or 0x8f, which pass this check in is_no_fault_exception(): if ((asi & 0xf2) == ASI_PNF) An example instruction is "ldqa [%l7 + %o7] #ASI 0x8f, %f38", opcode CF95D1EF. Asi values greater than 0x8b (ASI_SNFL) are fatal in handle_ldf_stq(), and is_no_fault_exception() must not allow these invalid asi values to make it that far. In both of these cases, handle_ldf_stq() reacts by calling sun4v_data_access_exception() or spitfire_data_access_exception(), which call is_no_fault_exception() and results in an infinite recursion. Signed-off-by: Rob Gardner <rob.gardner@oracle.com> Tested-by: Anatoly Pugachev <matorola@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-08Merge git://git.kernel.org:/pub/scm/linux/kernel/git/davem/sparcLinus Torvalds
Pull sparc updates from David Miller: "Just some more random bits from Al, including a conversion over to generic extables" * git://git.kernel.org:/pub/scm/linux/kernel/git/davem/sparc: sparc32: take ->thread.flags out sparc32: get rid of fake_swapper_regs sparc64: get rid of fake_swapper_regs sparc32: switch to generic extables sparc32: switch copy_user.S away from range exception table entries sparc32: get rid of range exception table entries in checksum_32.S sparc32: switch __bzero() away from range exception table entries sparc32: kill lookup_fault() sparc32: don't bother with lookup_fault() in __bzero()
2021-02-27Merge tag 'io_uring-worker.v3-2021-02-25' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull io_uring thread rewrite from Jens Axboe: "This converts the io-wq workers to be forked off the tasks in question instead of being kernel threads that assume various bits of the original task identity. This kills > 400 lines of code from io_uring/io-wq, and it's the worst part of the code. We've had several bugs in this area, and the worry is always that we could be missing some pieces for file types doing unusual things (recent /dev/tty example comes to mind, userfaultfd reads installing file descriptors is another fun one... - both of which need special handling, and I bet it's not the last weird oddity we'll find). With these identical workers, we can have full confidence that we're never missing anything. That, in itself, is a huge win. Outside of that, it's also more efficient since we're not wasting space and code on tracking state, or switching between different states. I'm sure we're going to find little things to patch up after this series, but testing has been pretty thorough, from the usual regression suite to production. Any issue that may crop up should be manageable. There's also a nice series of further reductions we can do on top of this, but I wanted to get the meat of it out sooner rather than later. The general worry here isn't that it's fundamentally broken. Most of the little issues we've found over the last week have been related to just changes in how thread startup/exit is done, since that's the main difference between using kthreads and these kinds of threads. In fact, if all goes according to plan, I want to get this into the 5.10 and 5.11 stable branches as well. That said, the changes outside of io_uring/io-wq are: - arch setup, simple one-liner to each arch copy_thread() implementation. - Removal of net and proc restrictions for io_uring, they are no longer needed or useful" * tag 'io_uring-worker.v3-2021-02-25' of git://git.kernel.dk/linux-block: (30 commits) io-wq: remove now unused IO_WQ_BIT_ERROR io_uring: fix SQPOLL thread handling over exec io-wq: improve manager/worker handling over exec io_uring: ensure SQPOLL startup is triggered before error shutdown io-wq: make buffered file write hashed work map per-ctx io-wq: fix race around io_worker grabbing io-wq: fix races around manager/worker creation and task exit io_uring: ensure io-wq context is always destroyed for tasks arch: ensure parisc/powerpc handle PF_IO_WORKER in copy_thread() io_uring: cleanup ->user usage io-wq: remove nr_process accounting io_uring: flag new native workers with IORING_FEAT_NATIVE_WORKERS net: remove cmsg restriction from io_uring based send/recvmsg calls Revert "proc: don't allow async path resolution of /proc/self components" Revert "proc: don't allow async path resolution of /proc/thread-self components" io_uring: move SQPOLL thread io-wq forked worker io-wq: make io_wq_fork_thread() available to other users io-wq: only remove worker from free_list, if it was there io_uring: remove io_identity io_uring: remove any grabbing of context ...
2021-02-26Merge branch 'work.sparc32' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
2021-02-26Merge branch 'work.sparc' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
2021-02-25Merge tag 'kbuild-v5.12' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild updates from Masahiro Yamada: - Fix false-positive build warnings for ARCH=ia64 builds - Optimize dictionary size for module compression with xz - Check the compiler and linker versions in Kconfig - Fix misuse of extra-y - Support DWARF v5 debug info - Clamp SUBLEVEL to 255 because stable releases 4.4.x and 4.9.x exceeded the limit - Add generic syscall{tbl,hdr}.sh for cleanups across arches - Minor cleanups of genksyms - Minor cleanups of Kconfig * tag 'kbuild-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (38 commits) initramfs: Remove redundant dependency of RD_ZSTD on BLK_DEV_INITRD kbuild: remove deprecated 'always' and 'hostprogs-y/m' kbuild: parse C= and M= before changing the working directory kbuild: reuse this-makefile to define abs_srctree kconfig: unify rule of config, menuconfig, nconfig, gconfig, xconfig kconfig: omit --oldaskconfig option for 'make config' kconfig: fix 'invalid option' for help option kconfig: remove dead code in conf_askvalue() kconfig: clean up nested if-conditionals in check_conf() kconfig: Remove duplicate call to sym_get_string_value() Makefile: Remove # characters from compiler string Makefile: reuse CC_VERSION_TEXT kbuild: check the minimum linker version in Kconfig kbuild: remove ld-version macro scripts: add generic syscallhdr.sh scripts: add generic syscalltbl.sh arch: syscalls: remove $(srctree)/ prefix from syscall tables arch: syscalls: add missing FORCE and fix 'targets' to make if_changed work gen_compile_commands: prune some directories kbuild: simplify access to the kernel's version ...
2021-02-24Merge tag 'x86-entry-2021-02-24' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 irq entry updates from Thomas Gleixner: "The irq stack switching was moved out of the ASM entry code in course of the entry code consolidation. It ended up being suboptimal in various ways. This reworks the X86 irq stack handling: - Make the stack switching inline so the stackpointer manipulation is not longer at an easy to find place. - Get rid of the unnecessary indirect call. - Avoid the double stack switching in interrupt return and reuse the interrupt stack for softirq handling. - A objtool fix for CONFIG_FRAME_POINTER=y builds where it got confused about the stack pointer manipulation" * tag 'x86-entry-2021-02-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: objtool: Fix stack-swizzle for FRAME_POINTER=y um: Enforce the usage of asm-generic/softirq_stack.h x86/softirq/64: Inline do_softirq_own_stack() softirq: Move do_softirq_own_stack() to generic asm header softirq: Move __ARCH_HAS_DO_SOFTIRQ to Kconfig x86: Select CONFIG_HAVE_IRQ_EXIT_ON_IRQ_STACK x86/softirq: Remove indirection in do_softirq_own_stack() x86/entry: Use run_sysvec_on_irqstack_cond() for XEN upcall x86/entry: Convert device interrupts to inline stack switching x86/entry: Convert system vectors to irq stack macro x86/irq: Provide macro for inlining irq stack switching x86/apic: Split out spurious handling code x86/irq/64: Adjust the per CPU irq stack pointer by 8 x86/irq: Sanitize irq stack tracking x86/entry: Fix instrumentation annotation
2021-02-23Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparcLinus Torvalds
Pull sparc updates from David Miller: "A host of mall cleanups and adjustments that have accumulated while I was away, nothing major" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc: (26 commits) sparc: make xchg() into a statement expression sparc64: Use arch_validate_flags() to validate ADI flag sparc32: Fix comparing pointer to 0 coccicheck warning sparc: fix led.c driver when PROC_FS is not enabled sparc: Fix handling of page table constructor failure sparc64: only select COMPAT_BINFMT_ELF if BINFMT_ELF is set tty: hvcs: Drop unnecessary if block tty: vcc: Drop unnecessary if block tty: vcc: Drop impossible to hit WARN_ON sparc: sparc64_defconfig: add necessary configs for qemu sparc64: switch defconfig from the legacy ide driver to libata sparc32: Preserve clone syscall flags argument for restarts due to signals sparc32: Limit memblock allocation to low memory sparc: Replace test_ti_thread_flag() with test_tsk_thread_flag() sbus: char: Remove meaningless jump label out_free sparc32: signal: Fix stack trampoline for RT signals sparc: remove SA_STATIC_ALLOC macro definition sparc: use for_each_child_of_node() macro sparc: Use fallthrough pseudo-keyword sparc32: srmmu: improve type safety of __nocache_fix() ...
2021-02-23Merge tag 'idmapped-mounts-v5.12' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux Pull idmapped mounts from Christian Brauner: "This introduces idmapped mounts which has been in the making for some time. Simply put, different mounts can expose the same file or directory with different ownership. This initial implementation comes with ports for fat, ext4 and with Christoph's port for xfs with more filesystems being actively worked on by independent people and maintainers. Idmapping mounts handle a wide range of long standing use-cases. Here are just a few: - Idmapped mounts make it possible to easily share files between multiple users or multiple machines especially in complex scenarios. For example, idmapped mounts will be used in the implementation of portable home directories in systemd-homed.service(8) where they allow users to move their home directory to an external storage device and use it on multiple computers where they are assigned different uids and gids. This effectively makes it possible to assign random uids and gids at login time. - It is possible to share files from the host with unprivileged containers without having to change ownership permanently through chown(2). - It is possible to idmap a container's rootfs and without having to mangle every file. For example, Chromebooks use it to share the user's Download folder with their unprivileged containers in their Linux subsystem. - It is possible to share files between containers with non-overlapping idmappings. - Filesystem that lack a proper concept of ownership such as fat can use idmapped mounts to implement discretionary access (DAC) permission checking. - They allow users to efficiently changing ownership on a per-mount basis without having to (recursively) chown(2) all files. In contrast to chown (2) changing ownership of large sets of files is instantenous with idmapped mounts. This is especially useful when ownership of a whole root filesystem of a virtual machine or container is changed. With idmapped mounts a single syscall mount_setattr syscall will be sufficient to change the ownership of all files. - Idmapped mounts always take the current ownership into account as idmappings specify what a given uid or gid is supposed to be mapped to. This contrasts with the chown(2) syscall which cannot by itself take the current ownership of the files it changes into account. It simply changes the ownership to the specified uid and gid. This is especially problematic when recursively chown(2)ing a large set of files which is commong with the aforementioned portable home directory and container and vm scenario. - Idmapped mounts allow to change ownership locally, restricting it to specific mounts, and temporarily as the ownership changes only apply as long as the mount exists. Several userspace projects have either already put up patches and pull-requests for this feature or will do so should you decide to pull this: - systemd: In a wide variety of scenarios but especially right away in their implementation of portable home directories. https://systemd.io/HOME_DIRECTORY/ - container runtimes: containerd, runC, LXD:To share data between host and unprivileged containers, unprivileged and privileged containers, etc. The pull request for idmapped mounts support in containerd, the default Kubernetes runtime is already up for quite a while now: https://github.com/containerd/containerd/pull/4734 - The virtio-fs developers and several users have expressed interest in using this feature with virtual machines once virtio-fs is ported. - ChromeOS: Sharing host-directories with unprivileged containers. I've tightly synced with all those projects and all of those listed here have also expressed their need/desire for this feature on the mailing list. For more info on how people use this there's a bunch of talks about this too. Here's just two recent ones: https://www.cncf.io/wp-content/uploads/2020/12/Rootless-Containers-in-Gitpod.pdf https://fosdem.org/2021/schedule/event/containers_idmap/ This comes with an extensive xfstests suite covering both ext4 and xfs: https://git.kernel.org/brauner/xfstests-dev/h/idmapped_mounts It covers truncation, creation, opening, xattrs, vfscaps, setid execution, setgid inheritance and more both with idmapped and non-idmapped mounts. It already helped to discover an unrelated xfs setgid inheritance bug which has since been fixed in mainline. It will be sent for inclusion with the xfstests project should you decide to merge this. In order to support per-mount idmappings vfsmounts are marked with user namespaces. The idmapping of the user namespace will be used to map the ids of vfs objects when they are accessed through that mount. By default all vfsmounts are marked with the initial user namespace. The initial user namespace is used to indicate that a mount is not idmapped. All operations behave as before and this is verified in the testsuite. Based on prior discussions we want to attach the whole user namespace and not just a dedicated idmapping struct. This allows us to reuse all the helpers that already exist for dealing with idmappings instead of introducing a whole new range of helpers. In addition, if we decide in the future that we are confident enough to enable unprivileged users to setup idmapped mounts the permission checking can take into account whether the caller is privileged in the user namespace the mount is currently marked with. The user namespace the mount will be marked with can be specified by passing a file descriptor refering to the user namespace as an argument to the new mount_setattr() syscall together with the new MOUNT_ATTR_IDMAP flag. The system call follows the openat2() pattern of extensibility. The following conditions must be met in order to create an idmapped mount: - The caller must currently have the CAP_SYS_ADMIN capability in the user namespace the underlying filesystem has been mounted in. - The underlying filesystem must support idmapped mounts. - The mount must not already be idmapped. This also implies that the idmapping of a mount cannot be altered once it has been idmapped. - The mount must be a detached/anonymous mount, i.e. it must have been created by calling open_tree() with the OPEN_TREE_CLONE flag and it must not already have been visible in the filesystem. The last two points guarantee easier semantics for userspace and the kernel and make the implementation significantly simpler. By default vfsmounts are marked with the initial user namespace and no behavioral or performance changes are observed. The manpage with a detailed description can be found here: https://git.kernel.org/brauner/man-pages/c/1d7b902e2875a1ff342e036a9f866a995640aea8 In order to support idmapped mounts, filesystems need to be changed and mark themselves with the FS_ALLOW_IDMAP flag in fs_flags. The patches to convert individual filesystem are not very large or complicated overall as can be seen from the included fat, ext4, and xfs ports. Patches for other filesystems are actively worked on and will be sent out separately. The xfstestsuite can be used to verify that port has been done correctly. The mount_setattr() syscall is motivated independent of the idmapped mounts patches and it's been around since July 2019. One of the most valuable features of the new mount api is the ability to perform mounts based on file descriptors only. Together with the lookup restrictions available in the openat2() RESOLVE_* flag namespace which we added in v5.6 this is the first time we are close to hardened and race-free (e.g. symlinks) mounting and path resolution. While userspace has started porting to the new mount api to mount proper filesystems and create new bind-mounts it is currently not possible to change mount options of an already existing bind mount in the new mount api since the mount_setattr() syscall is missing. With the addition of the mount_setattr() syscall we remove this last restriction and userspace can now fully port to the new mount api, covering every use-case the old mount api could. We also add the crucial ability to recursively change mount options for a whole mount tree, both removing and adding mount options at the same time. This syscall has been requested multiple times by various people and projects. There is a simple tool available at https://github.com/brauner/mount-idmapped that allows to create idmapped mounts so people can play with this patch series. I'll add support for the regular mount binary should you decide to pull this in the following weeks: Here's an example to a simple idmapped mount of another user's home directory: u1001@f2-vm:/$ sudo ./mount --idmap both:1000:1001:1 /home/ubuntu/ /mnt u1001@f2-vm:/$ ls -al /home/ubuntu/ total 28 drwxr-xr-x 2 ubuntu ubuntu 4096 Oct 28 22:07 . drwxr-xr-x 4 root root 4096 Oct 28 04:00 .. -rw------- 1 ubuntu ubuntu 3154 Oct 28 22:12 .bash_history -rw-r--r-- 1 ubuntu ubuntu 220 Feb 25 2020 .bash_logout -rw-r--r-- 1 ubuntu ubuntu 3771 Feb 25 2020 .bashrc -rw-r--r-- 1 ubuntu ubuntu 807 Feb 25 2020 .profile -rw-r--r-- 1 ubuntu ubuntu 0 Oct 16 16:11 .sudo_as_admin_successful -rw------- 1 ubuntu ubuntu 1144 Oct 28 00:43 .viminfo u1001@f2-vm:/$ ls -al /mnt/ total 28 drwxr-xr-x 2 u1001 u1001 4096 Oct 28 22:07 . drwxr-xr-x 29 root root 4096 Oct 28 22:01 .. -rw------- 1 u1001 u1001 3154 Oct 28 22:12 .bash_history -rw-r--r-- 1 u1001 u1001 220 Feb 25 2020 .bash_logout -rw-r--r-- 1 u1001 u1001 3771 Feb 25 2020 .bashrc -rw-r--r-- 1 u1001 u1001 807 Feb 25 2020 .profile -rw-r--r-- 1 u1001 u1001 0 Oct 16 16:11 .sudo_as_admin_successful -rw------- 1 u1001 u1001 1144 Oct 28 00:43 .viminfo u1001@f2-vm:/$ touch /mnt/my-file u1001@f2-vm:/$ setfacl -m u:1001:rwx /mnt/my-file u1001@f2-vm:/$ sudo setcap -n 1001 cap_net_raw+ep /mnt/my-file u1001@f2-vm:/$ ls -al /mnt/my-file -rw-rwxr--+ 1 u1001 u1001 0 Oct 28 22:14 /mnt/my-file u1001@f2-vm:/$ ls -al /home/ubuntu/my-file -rw-rwxr--+ 1 ubuntu ubuntu 0 Oct 28 22:14 /home/ubuntu/my-file u1001@f2-vm:/$ getfacl /mnt/my-file getfacl: Removing leading '/' from absolute path names # file: mnt/my-file # owner: u1001 # group: u1001 user::rw- user:u1001:rwx group::rw- mask::rwx other::r-- u1001@f2-vm:/$ getfacl /home/ubuntu/my-file getfacl: Removing leading '/' from absolute path names # file: home/ubuntu/my-file # owner: ubuntu # group: ubuntu user::rw- user:ubuntu:rwx group::rw- mask::rwx other::r--" * tag 'idmapped-mounts-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux: (41 commits) xfs: remove the possibly unused mp variable in xfs_file_compat_ioctl xfs: support idmapped mounts ext4: support idmapped mounts fat: handle idmapped mounts tests: add mount_setattr() selftests fs: introduce MOUNT_ATTR_IDMAP fs: add mount_setattr() fs: add attr_flags_to_mnt_flags helper fs: split out functions to hold writers namespace: only take read lock in do_reconfigure_mnt() mount: make {lock,unlock}_mount_hash() static namespace: take lock_mount_hash() directly when changing flags nfs: do not export idmapped mounts overlayfs: do not mount on top of idmapped mounts ecryptfs: do not mount on top of idmapped mounts ima: handle idmapped mounts apparmor: handle idmapped mounts fs: make helpers idmap mount aware exec: handle idmapped mounts would_dump: handle idmapped mounts ...
2021-02-21arch: setup PF_IO_WORKER threads like PF_KTHREADJens Axboe
PF_IO_WORKER are kernel threads too, but they aren't PF_KTHREAD in the sense that we don't assign ->set_child_tid with our own structure. Just ensure that every arch sets up the PF_IO_WORKER threads like kthreads in the arch implementation of copy_thread(). Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-02-22arch: syscalls: remove $(srctree)/ prefix from syscall tablesMasahiro Yamada
The 'syscall' variables are not directly used in the commands. Remove the $(srctree)/ prefix because we can rely on VPATH. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2021-02-22arch: syscalls: add missing FORCE and fix 'targets' to make if_changed workMasahiro Yamada
The rules in these Makefiles cannot detect the command line change because the prerequisite 'FORCE' is missing. Adding 'FORCE' will result in the headers being rebuilt every time because the 'targets' additions are also wrong; the file paths in 'targets' must be relative to the current Makefile. Fix all of them so the if_changed rules work correctly. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
2021-02-21Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM updates from Paolo Bonzini: "x86: - Support for userspace to emulate Xen hypercalls - Raise the maximum number of user memslots - Scalability improvements for the new MMU. Instead of the complex "fast page fault" logic that is used in mmu.c, tdp_mmu.c uses an rwlock so that page faults are concurrent, but the code that can run against page faults is limited. Right now only page faults take the lock for reading; in the future this will be extended to some cases of page table destruction. I hope to switch the default MMU around 5.12-rc3 (some testing was delayed due to Chinese New Year). - Cleanups for MAXPHYADDR checks - Use static calls for vendor-specific callbacks - On AMD, use VMLOAD/VMSAVE to save and restore host state - Stop using deprecated jump label APIs - Workaround for AMD erratum that made nested virtualization unreliable - Support for LBR emulation in the guest - Support for communicating bus lock vmexits to userspace - Add support for SEV attestation command - Miscellaneous cleanups PPC: - Support for second data watchpoint on POWER10 - Remove some complex workarounds for buggy early versions of POWER9 - Guest entry/exit fixes ARM64: - Make the nVHE EL2 object relocatable - Cleanups for concurrent translation faults hitting the same page - Support for the standard TRNG hypervisor call - A bunch of small PMU/Debug fixes - Simplification of the early init hypercall handling Non-KVM changes (with acks): - Detection of contended rwlocks (implemented only for qrwlocks, because KVM only needs it for x86) - Allow __DISABLE_EXPORTS from assembly code - Provide a saner follow_pfn replacements for modules" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (192 commits) KVM: x86/xen: Explicitly pad struct compat_vcpu_info to 64 bytes KVM: selftests: Don't bother mapping GVA for Xen shinfo test KVM: selftests: Fix hex vs. decimal snafu in Xen test KVM: selftests: Fix size of memslots created by Xen tests KVM: selftests: Ignore recently added Xen tests' build output KVM: selftests: Add missing header file needed by xAPIC IPI tests KVM: selftests: Add operand to vmsave/vmload/vmrun in svm.c KVM: SVM: Make symbol 'svm_gp_erratum_intercept' static locking/arch: Move qrwlock.h include after qspinlock.h KVM: PPC: Book3S HV: Fix host radix SLB optimisation with hash guests KVM: PPC: Book3S HV: Ensure radix guest has no SLB entries KVM: PPC: Don't always report hash MMU capability for P9 < DD2.2 KVM: PPC: Book3S HV: Save and restore FSCR in the P9 path KVM: PPC: remove unneeded semicolon KVM: PPC: Book3S HV: Use POWER9 SLBIA IH=6 variant to clear SLB KVM: PPC: Book3S HV: No need to clear radix host SLB before loading HPT guest KVM: PPC: Book3S HV: Fix radix guest SLB side channel KVM: PPC: Book3S HV: Remove support for running HPT guest on RPT host without mixed mode support KVM: PPC: Book3S HV: Introduce new capability for 2nd DAWR KVM: PPC: Book3S HV: Add infrastructure to support 2nd DAWR ...
2021-02-21Merge tag 'core-mm-2021-02-17' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull tlb gather updates from Ingo Molnar: "Theses fix MM (soft-)dirty bit management in the procfs code & clean up the TLB gather API" * tag 'core-mm-2021-02-17' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/ldt: Use tlb_gather_mmu_fullmm() when freeing LDT page-tables tlb: arch: Remove empty __tlb_remove_tlb_entry() stubs tlb: mmu_gather: Remove start/end arguments from tlb_gather_mmu() tlb: mmu_gather: Introduce tlb_gather_mmu_fullmm() tlb: mmu_gather: Remove unused start/end arguments from tlb_finish_mmu() mm: proc: Invalidate TLB after clearing soft-dirty page state
2021-02-21Merge tag 'oprofile-removal-5.12' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/linux Pull oprofile and dcookies removal from Viresh Kumar: "Remove oprofile and dcookies support The 'oprofile' user-space tools don't use the kernel OPROFILE support any more, and haven't in a long time. User-space has been converted to the perf interfaces. The dcookies stuff is only used by the oprofile code. Now that oprofile's support is getting removed from the kernel, there is no need for dcookies as well. Remove kernel's old oprofile and dcookies support" * tag 'oprofile-removal-5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/linux: fs: Remove dcookies support drivers: Remove CONFIG_OPROFILE support arch: xtensa: Remove CONFIG_OPROFILE support arch: x86: Remove CONFIG_OPROFILE support arch: sparc: Remove CONFIG_OPROFILE support arch: sh: Remove CONFIG_OPROFILE support arch: s390: Remove CONFIG_OPROFILE support arch: powerpc: Remove oprofile arch: powerpc: Stop building and using oprofile arch: parisc: Remove CONFIG_OPROFILE support arch: mips: Remove CONFIG_OPROFILE support arch: microblaze: Remove CONFIG_OPROFILE support arch: ia64: Remove rest of perfmon support arch: ia64: Remove CONFIG_OPROFILE support arch: hexagon: Don't select HAVE_OPROFILE arch: arc: Remove CONFIG_OPROFILE support arch: arm: Remove CONFIG_OPROFILE support arch: alpha: Remove CONFIG_OPROFILE support
2021-02-21Merge branch 'work.elf-compat' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull ELF compat updates from Al Viro: "Sanitizing ELF compat support, especially for triarch architectures: - X32 handling cleaned up - MIPS64 uses compat_binfmt_elf.c both for O32 and N32 now - Kconfig side of things regularized Eventually I hope to have compat_binfmt_elf.c killed, with both native and compat built from fs/binfmt_elf.c, with -DELF_BITS={64,32} passed by kbuild, but that's a separate story - not included here" * 'work.elf-compat' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: get rid of COMPAT_ELF_EXEC_PAGESIZE compat_binfmt_elf: don't bother with undef of ELF_ARCH Kconfig: regularize selection of CONFIG_BINFMT_ELF mips compat: switch to compat_binfmt_elf.c mips: don't bother with ELF_CORE_EFLAGS mips compat: don't bother with ELF_ET_DYN_BASE mips: KVM_GUEST makes no sense for 64bit builds... mips: kill unused definitions in binfmt_elf[on]32.c mips binfmt_elf*32.c: use elfcore-compat.h x32: make X32, !IA32_EMULATION setups able to execute x32 binaries [amd64] clean PRSTATUS_SIZE/SET_PR_FPVALID up properly elf_prstatus: collect the common part (everything before pr_reg) into a struct binfmt_elf: partially sanitize PRSTATUS_SIZE and SET_PR_FPVALID
2021-02-20Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-nextLinus Torvalds
Pull networking updates from David Miller: "Here is what we have this merge window: 1) Support SW steering for mlx5 Connect-X6Dx, from Yevgeny Kliteynik. 2) Add RSS multi group support to octeontx2-pf driver, from Geetha Sowjanya. 3) Add support for KS8851 PHY. From Marek Vasut. 4) Add support for GarfieldPeak bluetooth controller from Kiran K. 5) Add support for half-duplex tcan4x5x can controllers. 6) Add batch skb rx processing to bcrm63xx_enet, from Sieng Piaw Liew. 7) Rework RX port offload infrastructure, particularly wrt, UDP tunneling, from Jakub Kicinski. 8) Add BCM72116 PHY support, from Florian Fainelli. 9) Remove Dsa specific notifiers, they are unnecessary. From Vladimir Oltean. 10) Add support for picosecond rx delay in dwmac-meson8b chips. From Martin Blumenstingl. 11) Support TSO on xfrm interfaces from Eyal Birger. 12) Add support for MP_PRIO to mptcp stack, from Geliang Tang. 13) Support BCM4908 integrated switch, from Rafał Miłecki. 14) Support for directly accessing kernel module variables via module BTF info, from Andrii Naryiko. 15) Add DASH (esktop and mobile Architecture for System Hardware) support to r8169 driver, from Heiner Kallweit. 16) Add rx vlan filtering to dpaa2-eth, from Ionut-robert Aron. 17) Add support for 100 base0x SFP devices, from Bjarni Jonasson. 18) Support link aggregation in DSA, from Tobias Waldekranz. 19) Support for bitwidse atomics in bpf, from Brendan Jackman. 20) SmartEEE support in at803x driver, from Russell King. 21) Add support for flow based tunneling to GTP, from Pravin B Shelar. 22) Allow arbitrary number of interconnrcts in ipa, from Alex Elder. 23) TLS RX offload for bonding, from Tariq Toukan. 24) RX decap offklload support in mac80211, from Felix Fietkou. 25) devlink health saupport in octeontx2-af, from George Cherian. 26) Add TTL attr to SCM_TIMESTAMP_OPT_STATS, from Yousuk Seung 27) Delegated actionss support in mptcp, from Paolo Abeni. 28) Support receive timestamping when doin zerocopy tcp receive. From Arjun Ray. 29) HTB offload support for mlx5, from Maxim Mikityanskiy. 30) UDP GRO forwarding, from Maxim Mikityanskiy. 31) TAPRIO offloading in dsa hellcreek driver, from Kurt Kanzenbach. 32) Weighted random twos choice algorithm for ipvs, from Darby Payne. 33) Fix netdev registration deadlock, from Johannes Berg. 34) Various conversions to new tasklet api, from EmilRenner Berthing. 35) Bulk skb allocations in veth, from Lorenzo Bianconi. 36) New ethtool interface for lane setting, from Danielle Ratson. 37) Offload failiure notifications for routes, from Amit Cohen. 38) BCM4908 support, from Rafał Miłecki. 39) Support several new iwlwifi chips, from Ihab Zhaika. 40) Flow drector support for ipv6 in i40e, from Przemyslaw Patynowski. 41) Support for mhi prrotocols, from Loic Poulain. 42) Optimize bpf program stats. 43) Implement RFC6056, for better port randomization, from Eric Dumazet. 44) hsr tag offloading support from George McCollister. 45) Netpoll support in qede, from Bhaskar Upadhaya. 46) 2005/400g speed support in bonding 3ad mode, from Nikolay Aleksandrov. 47) Netlink event support in mptcp, from Florian Westphal. 48) Better skbuff caching, from Alexander Lobakin. 49) MRP (Media Redundancy Protocol) offloading in DSA and a few drivers, from Horatiu Vultur. 50) mqprio saupport in mvneta, from Maxime Chevallier. 51) Remove of_phy_attach, no longer needed, from Florian Fainelli" * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1766 commits) octeontx2-pf: Fix otx2_get_fecparam() cteontx2-pf: cn10k: Prevent harmless double shift bugs net: stmmac: Add PCI bus info to ethtool driver query output ptp: ptp_clockmatrix: clean-up - parenthesis around a == b are unnecessary ptp: ptp_clockmatrix: Simplify code - remove unnecessary `err` variable. ptp: ptp_clockmatrix: Coding style - tighten vertical spacing. ptp: ptp_clockmatrix: Clean-up dev_*() messages. ptp: ptp_clockmatrix: Remove unused header declarations. ptp: ptp_clockmatrix: Add alignment of 1 PPS to idtcm_perout_enable. ptp: ptp_clockmatrix: Add wait_for_sys_apll_dpll_lock. net: stmmac: dwmac-sun8i: Add a shutdown callback net: stmmac: dwmac-sun8i: Minor probe function cleanup net: stmmac: dwmac-sun8i: Use reset_control_reset net: stmmac: dwmac-sun8i: Remove unnecessary PHY power check net: stmmac: dwmac-sun8i: Return void from PHY unpower r8169: use macro pm_ptr net: mdio: Remove of_phy_attach() net: mscc: ocelot: select PACKING in the Kconfig net: re-solve some conflicts after net -> net-next merge net: dsa: tag_rtl4_a: Support also egress tags ...
2021-02-18sparc: make xchg() into a statement expressionRandy Dunlap
Fix several of the same warning by using a GCC statement expression (extension): ../arch/sparc/include/asm/cmpxchg_32.h:28:22: warning: value computed is not used [-Wunused-value] 28 | #define xchg(ptr,x) ((__typeof__(*(ptr)))__xchg((unsigned long)(x),(ptr),sizeof(*(ptr)))) Passes sparc32 allmodconfig with no xchg warnings. Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: "David S. Miller" <davem@davemloft.net> Cc: sparclinux@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-18sparc64: Use arch_validate_flags() to validate ADI flagKhalid Aziz
When userspace calls mprotect() to enable ADI on an address range, do_mprotect_pkey() calls arch_validate_prot() to validate new protection flags. arch_validate_prot() for sparc looks at the first VMA associated with address range to verify if ADI can indeed be enabled on this address range. This has two issues - (1) Address range might cover multiple VMAs while arch_validate_prot() looks at only the first VMA, (2) arch_validate_prot() peeks at VMA without holding mmap lock which can result in race condition. arch_validate_flags() from commit c462ac288f2c ("mm: Introduce arch_validate_flags()") allows for VMA flags to be validated for all VMAs that cover the address range given by user while holding mmap lock. This patch updates sparc code to move the VMA check from arch_validate_prot() to arch_validate_flags() to fix above two issues. Suggested-by: Jann Horn <jannh@google.com> Suggested-by: Christoph Hellwig <hch@infradead.org> Suggested-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Khalid Aziz <khalid.aziz@oracle.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-18sparc32: Fix comparing pointer to 0 coccicheck warningKaixu Xia
Fixes coccicheck warning: /arch/sparc/mm/srmmu.c:354:42-43: WARNING comparing pointer to 0 Avoid pointer type value compared to 0. Reported-by: Tosk Robot <tencent_os_robot@tencent.com> Signed-off-by: Kaixu Xia <kaixuxia@tencent.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-18sparc: fix led.c driver when PROC_FS is not enabledRandy Dunlap
Fix Sparc build when CONFIG_PROC_FS is not enabled. Fixes this build error: arch/sparc/kernel/led.c:107:30: error: 'led_proc_ops' defined but not used [-Werror=unused-const-variable=] 107 | static const struct proc_ops led_proc_ops = { | ^~~~~~~~~~~~ cc1: all warnings being treated as errors Fixes: 97a32539b956 ("proc: convert everything to "struct proc_ops"") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Lars Kotthoff <metalhead@metalhead.ws> Cc: "David S. Miller" <davem@davemloft.net> Cc: sparclinux@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-18sparc: Fix handling of page table constructor failureMatthew Wilcox (Oracle)
The page has just been allocated, so its refcount is 1. free_unref_page() is for use on pages which have a zero refcount. Use __free_page() like the other implementations of pte_alloc_one(). Fixes: 1ae9ae5f7df7 ("sparc: handle pgtable_page_ctor() fail") Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Mike Rapoport <rppt@linux.ibm.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-18sparc64: only select COMPAT_BINFMT_ELF if BINFMT_ELF is setRandy Dunlap
Currently COMPAT on SPARC64 selects COMPAT_BINFMT_ELF unconditionally, even when BINFMT_ELF is not enabled. This causes a kconfig warning. Instead, just select COMPAT_BINFMT_ELF if BINFMT_ELF is enabled. This builds cleanly with no kconfig warnings. WARNING: unmet direct dependencies detected for COMPAT_BINFMT_ELF Depends on [n]: COMPAT [=y] && BINFMT_ELF [=n] Selected by [y]: - COMPAT [=y] && SPARC64 [=y] Fixes: 26b4c912185a ("sparc,sparc64: unify Kconfig files") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: sparclinux@vger.kernel.org Cc: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-18sparc: sparc64_defconfig: add necessary configs for qemuCorentin Labbe
The sparc64 qemu machines uses sunhme network hardware by default, so for simple NFS boot testing using qemu, having CONFIG_HAPPYMEAL is useful. And so we need also IP_PNP_DHCP for NFS boot. For the same reason we need to enable its storage which is a PATA_CMD64. And finally, we need CONFIG_DEVTMPFS for handling recent udev/systemd. All those options will permit to enable boot testing in both kernelCI and gentoo's kernelCI. Signed-off-by: Corentin Labbe <clabbe@baylibre.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-18sparc64: switch defconfig from the legacy ide driver to libataChristoph Hellwig
Replace the ide options with the equivalent libata options. This has been carried by various downstreams like the linux-build-test repo for years already. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-18sparc32: Preserve clone syscall flags argument for restarts due to signalsAndreas Larsson
This fixes a bug where a clone syscall that is restarted due to a pending signal is restarted with garbage in the register %o0 that holds the clone flags. This keep the original %i0 of a syscall (as seen from the trap handler) in %l6 rather than %l5. This is done because for clone (and also qfork) %l5 is used as a temporary variable in the same register window. Before this, that temporary value would be the value that was then incorrectly used as the orig_i0 argument to do_notify_resume. In order to preserve %l6, the temporary usage of %l6 in ret_sys_call is changed to use %l5 instead and the setting %l6 to 0 or 1 was removed. The use of that 0 or 1 value in %l6 was removed in commit 28e6103665301ce60634e8a77f0b657c6cc099de. Signed-off-by: Andreas Larsson <andreas@gaisler.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-18sparc32: Limit memblock allocation to low memoryAndreas Larsson
Commit cca079ef8ac29a7c02192d2bad2ffe4c0c5ffdd0 changed sparc32 to use memblocks instead of bootmem, but also made high memory available via memblock allocation which does not work together with e.g. phys_to_virt and can lead to kernel panic. This changes back to only low memory being allocatable in the early stages, now using memblock allocation. Signed-off-by: Andreas Larsson <andreas@gaisler.com> Acked-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-18sparc: Replace test_ti_thread_flag() with test_tsk_thread_flag()Tiezhu Yang
Use test_tsk_thread_flag() directly instead of test_ti_thread_flag() to improve readability when the argument type is struct task_struct, it is similar with commit 5afc78551bf5 ("arm64: Use test_tsk_thread_flag() for checking TIF_SINGLESTEP"). Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-15sparc: remove wrong comment from arch/sparc/include/asm/KbuildMasahiro Yamada
These are NOT exported to userspace. The headers listed in arch/sparc/include/uapi/asm/Kbuild are exported. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2021-02-12Merge branch 'x86/paravirt' into x86/entryIngo Molnar
Merge in the recent paravirt changes to resolve conflicts caused by objtool annotations. Conflicts: arch/x86/xen/xen-asm.S Signed-off-by: Ingo Molnar <mingo@kernel.org>
2021-02-11locking/arch: Move qrwlock.h include after qspinlock.hWaiman Long
include/asm-generic/qrwlock.h was trying to get arch_spin_is_locked via asm-generic/qspinlock.h. However, this does not work because architectures might be using queued rwlocks but not queued spinlocks (csky), or because they might be defining their own queued_* macros before including asm/qspinlock.h. To fix this, ensure that asm/spinlock.h always includes qrwlock.h after defining arch_spin_is_locked (either directly for csky, or via asm/qspinlock.h for other architectures). The only inclusion elsewhere is in kernel/locking/qrwlock.c. That one is really unnecessary because the file is only compiled in SMP configurations (config QUEUED_RWLOCKS depends on SMP) and in that case linux/spinlock.h already includes asm/qrwlock.h if needed, via asm/spinlock.h. Reported-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Waiman Long <longman@redhat.com> Fixes: 26128cb6c7e6 ("locking/rwlocks: Add contention detection for rwlocks") Tested-by: Guenter Roeck <linux@roeck-us.net> Reviewed-by: Ben Gardon <bgardon@google.com> [Add arch/sparc and kernel/locking parts per discussion with Waiman. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-02-10softirq: Move do_softirq_own_stack() to generic asm headerThomas Gleixner
To avoid include recursion hell move the do_softirq_own_stack() related content into a generic asm header and include it from all places in arch/ which need the prototype. This allows architectures to provide an inline implementation of do_softirq_own_stack() without introducing a lot of #ifdeffery all over the place. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20210210002513.289960691@linutronix.de
2021-02-10softirq: Move __ARCH_HAS_DO_SOFTIRQ to KconfigThomas Gleixner
To prepare for inlining do_softirq_own_stack() replace __ARCH_HAS_DO_SOFTIRQ with a Kconfig switch and select it in the affected architectures. This allows in the next step to move the function prototype and the inline stub into a seperate asm-generic header file which is required to avoid include recursion. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20210210002513.181713427@linutronix.de
2021-01-29tlb: arch: Remove empty __tlb_remove_tlb_entry() stubsWill Deacon
If __tlb_remove_tlb_entry() is not defined by the architecture then we provide an empty definition in asm-generic/tlb.h. Remove the redundant empty definitions for sparc64 and x86. Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Yu Zhao <yuzhao@google.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lkml.kernel.org/r/20210127235347.1402-6-will@kernel.org
2021-01-29arch: sparc: Remove CONFIG_OPROFILE supportViresh Kumar
The "oprofile" user-space tools don't use the kernel OPROFILE support any more, and haven't in a long time. User-space has been converted to the perf interfaces. Remove the old oprofile's architecture specific support. Suggested-by: Christoph Hellwig <hch@infradead.org> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Robert Richter <rric@kernel.org> Acked-by: William Cohen <wcohen@redhat.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Thomas Gleixner <tglx@linutronix.de>
2021-01-28Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
drivers/net/can/dev.c b552766c872f ("can: dev: prevent potential information leak in can_fill_info()") 3e77f70e7345 ("can: dev: move driver related infrastructure into separate subdir") 0a042c6ec991 ("can: dev: move netlink related code into seperate file") Code move. drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c 57ac4a31c483 ("net/mlx5e: Correctly handle changing the number of queues when the interface is down") 214baf22870c ("net/mlx5e: Support HTB offload") Adjacent code changes net/switchdev/switchdev.c 20776b465c0c ("net: switchdev: don't set port_obj_info->handled true when -EOPNOTSUPP") ffb68fc58e96 ("net: switchdev: remove the transaction structure from port object notifiers") bae33f2b5afe ("net: switchdev: remove the transaction structure from port attributes") Transaction parameter gets dropped otherwise keep the fix. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-24sparc/mm/highmem: flush cache and TLBThomas Gleixner
Patch series "mm/highmem: Fix fallout from generic kmap_local conversions". The kmap_local conversion wreckaged sparc, mips and powerpc as it missed some of the details in the original implementation. This patch (of 4): The recent conversion to the generic kmap_local infrastructure failed to assign the proper pre/post map/unmap flush operations for sparc. Sparc requires cache flush before map/unmap and tlb flush afterwards. Link: https://lkml.kernel.org/r/20210112170136.078559026@linutronix.de Link: https://lkml.kernel.org/r/20210112170410.905976187@linutronix.de Fixes: 3293efa97807 ("sparc/mm/highmem: Switch to generic kmap atomic") Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reported-by: Andreas Larsson <andreas@gaisler.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Paul Cercueil <paul@crapouillou.net> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-01-24fs: add mount_setattr()Christian Brauner
This implements the missing mount_setattr() syscall. While the new mount api allows to change the properties of a superblock there is currently no way to change the properties of a mount or a mount tree using file descriptors which the new mount api is based on. In addition the old mount api has the restriction that mount options cannot be applied recursively. This hasn't changed since changing mount options on a per-mount basis was implemented in [1] and has been a frequent request not just for convenience but also for security reasons. The legacy mount syscall is unable to accommodate this behavior without introducing a whole new set of flags because MS_REC | MS_REMOUNT | MS_BIND | MS_RDONLY | MS_NOEXEC | [...] only apply the mount option to the topmost mount. Changing MS_REC to apply to the whole mount tree would mean introducing a significant uapi change and would likely cause significant regressions. The new mount_setattr() syscall allows to recursively clear and set mount options in one shot. Multiple calls to change mount options requesting the same changes are idempotent: int mount_setattr(int dfd, const char *path, unsigned flags, struct mount_attr *uattr, size_t usize); Flags to modify path resolution behavior are specified in the @flags argument. Currently, AT_EMPTY_PATH, AT_RECURSIVE, AT_SYMLINK_NOFOLLOW, and AT_NO_AUTOMOUNT are supported. If useful, additional lookup flags to restrict path resolution as introduced with openat2() might be supported in the future. The mount_setattr() syscall can be expected to grow over time and is designed with extensibility in mind. It follows the extensible syscall pattern we have used with other syscalls such as openat2(), clone3(), sched_{set,get}attr(), and others. The set of mount options is passed in the uapi struct mount_attr which currently has the following layout: struct mount_attr { __u64 attr_set; __u64 attr_clr; __u64 propagation; __u64 userns_fd; }; The @attr_set and @attr_clr members are used to clear and set mount options. This way a user can e.g. request that a set of flags is to be raised such as turning mounts readonly by raising MOUNT_ATTR_RDONLY in @attr_set while at the same time requesting that another set of flags is to be lowered such as removing noexec from a mount tree by specifying MOUNT_ATTR_NOEXEC in @attr_clr. Note, since the MOUNT_ATTR_<atime> values are an enum starting from 0, not a bitmap, users wanting to transition to a different atime setting cannot simply specify the atime setting in @attr_set, but must also specify MOUNT_ATTR__ATIME in the @attr_clr field. So we ensure that MOUNT_ATTR__ATIME can't be partially set in @attr_clr and that @attr_set can't have any atime bits set if MOUNT_ATTR__ATIME isn't set in @attr_clr. The @propagation field lets callers specify the propagation type of a mount tree. Propagation is a single property that has four different settings and as such is not really a flag argument but an enum. Specifically, it would be unclear what setting and clearing propagation settings in combination would amount to. The legacy mount() syscall thus forbids the combination of multiple propagation settings too. The goal is to keep the semantics of mount propagation somewhat simple as they are overly complex as it is. The @userns_fd field lets user specify a user namespace whose idmapping becomes the idmapping of the mount. This is implemented and explained in detail in the next patch. [1]: commit 2e4b7fcd9260 ("[PATCH] r/o bind mounts: honor mount writer counts at remount") Link: https://lore.kernel.org/r/20210121131959.646623-35-christian.brauner@ubuntu.com Cc: David Howells <dhowells@redhat.com> Cc: Aleksa Sarai <cyphar@cyphar.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Cc: linux-api@vger.kernel.org Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2021-01-14bpf: Rename BPF_XADD and prepare to encode other atomics in .immBrendan Jackman
A subsequent patch will add additional atomic operations. These new operations will use the same opcode field as the existing XADD, with the immediate discriminating different operations. In preparation, rename the instruction mode BPF_ATOMIC and start calling the zero immediate BPF_ADD. This is possible (doesn't break existing valid BPF progs) because the immediate field is currently reserved MBZ and BPF_ADD is zero. All uses are removed from the tree but the BPF_XADD definition is kept around to avoid breaking builds for people including kernel headers. Signed-off-by: Brendan Jackman <jackmanb@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Björn Töpel <bjorn.topel@gmail.com> Link: https://lore.kernel.org/bpf/20210114181751.768687-5-jackmanb@google.com