summaryrefslogtreecommitdiff
path: root/arch/um/kernel
AgeCommit message (Collapse)Author
2016-10-07nmi_backtrace: generate one-line reports for idle cpusChris Metcalf
When doing an nmi backtrace of many cores, most of which are idle, the output is a little overwhelming and very uninformative. Suppress messages for cpus that are idling when they are interrupted and just emit one line, "NMI backtrace for N skipped: idling at pc 0xNNN". We do this by grouping all the cpuidle code together into a new .cpuidle.text section, and then checking the address of the interrupted PC to see if it lies within that section. This commit suitably tags x86 and tile idle routines, and only adds in the minimal framework for other architectures. Link: http://lkml.kernel.org/r/1472487169-14923-5-git-send-email-cmetcalf@mellanox.com Signed-off-by: Chris Metcalf <cmetcalf@mellanox.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Daniel Thompson <daniel.thompson@linaro.org> [arm] Tested-by: Petr Mladek <pmladek@suse.com> Cc: Aaron Tomlin <atomlin@redhat.com> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Russell King <linux@arm.linux.org.uk> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-09-07um/ptrace: Fix the syscall number update after a ptraceMickaël Salaün
Update the syscall number after each PTRACE_SETREGS on ORIG_*AX. This is needed to get the potentially altered syscall number in the seccomp filters after RET_TRACE. This fix four seccomp_bpf tests: > [ RUN ] TRACE_syscall.skip_after_RET_TRACE > seccomp_bpf.c:1560:TRACE_syscall.skip_after_RET_TRACE:Expected -1 (18446744073709551615) == syscall(39) (26) > seccomp_bpf.c:1561:TRACE_syscall.skip_after_RET_TRACE:Expected 1 (1) == (*__errno_location ()) (22) > [ FAIL ] TRACE_syscall.skip_after_RET_TRACE > [ RUN ] TRACE_syscall.kill_after_RET_TRACE > TRACE_syscall.kill_after_RET_TRACE: Test exited normally instead of by signal (code: 1) > [ FAIL ] TRACE_syscall.kill_after_RET_TRACE > [ RUN ] TRACE_syscall.skip_after_ptrace > seccomp_bpf.c:1622:TRACE_syscall.skip_after_ptrace:Expected -1 (18446744073709551615) == syscall(39) (26) > seccomp_bpf.c:1623:TRACE_syscall.skip_after_ptrace:Expected 1 (1) == (*__errno_location ()) (22) > [ FAIL ] TRACE_syscall.skip_after_ptrace > [ RUN ] TRACE_syscall.kill_after_ptrace > TRACE_syscall.kill_after_ptrace: Test exited normally instead of by signal (code: 1) > [ FAIL ] TRACE_syscall.kill_after_ptrace Fixes: 26703c636c1f ("um/ptrace: run seccomp after ptrace") Signed-off-by: Mickaël Salaün <mic@digikod.net> Acked-by: Kees Cook <keescook@chromium.org> Cc: Jeff Dike <jdike@addtoit.com> Cc: Richard Weinberger <richard@nod.at> Cc: James Morris <jmorris@namei.org> Cc: user-mode-linux-devel@lists.sourceforge.net Signed-off-by: James Morris <james.l.morris@oracle.com> Signed-off-by: Kees Cook <keescook@chromium.org>
2016-09-07um/ptrace: Fix the syscall_trace_leave callMickaël Salaün
Keep the same semantic as before the commit 26703c636c1f: deallocate audit context and fake a proper syscall exit. This fix a kernel panic triggered by the seccomp_bpf test: > [ RUN ] global.ERRNO_valid > BUG: failure at kernel/auditsc.c:1504/__audit_syscall_entry()! > Kernel panic - not syncing: BUG! Fixes: 26703c636c1f ("um/ptrace: run seccomp after ptrace") Signed-off-by: Mickaël Salaün <mic@digikod.net> Acked-by: Kees Cook <keescook@chromium.org> Cc: Jeff Dike <jdike@addtoit.com> Cc: Richard Weinberger <richard@nod.at> Cc: James Morris <jmorris@namei.org> Cc: user-mode-linux-devel@lists.sourceforge.net Signed-off-by: James Morris <james.l.morris@oracle.com> Signed-off-by: Kees Cook <keescook@chromium.org>
2016-08-04Merge branch 'for-linus-4.8-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml Pull UML updates from Richard Weinberger: "Beside of various fixes this also contains patches to enable features such was Kcov, kmemleak and TRACE_IRQFLAGS_SUPPORT on UML" * 'for-linus-4.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml: hostfs: Freeing an ERR_PTR in hostfs_fill_sb_common() um: Support kcov um: Enable TRACE_IRQFLAGS_SUPPORT um: Use asm-generic/irqflags.h um: Fix possible deadlock in sig_handler_common() um: Select HAVE_DEBUG_KMEMLEAK um: Setup physical memory in setup_arch() um: Eliminate null test after alloc_bootmem
2016-08-04um: Support kcovVegard Nossum
This adds support for kcov to UML. There is a small problem where UML will randomly segfault during boot; this is because current_thread_info() occasionally returns an invalid (non-NULL) pointer and we try to dereference it in __sanitizer_cov_trace_pc(). I consider this a bug in UML itself and this patch merely exposes it. [v2: disable instrumentation in UML-specific code] Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com> Cc: Richard Weinberger <richard@nod.at> Cc: Thomas Meyer <thomas@m3y3r.de> Cc: user-mode-linux-devel <user-mode-linux-devel@lists.sourceforge.net> Cc: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2016-08-04um: Setup physical memory in setup_arch()Richard Weinberger
Currently UML sets up physical memory very early, long before setup_arch() was called by the kernel main function. This can cause problems when code paths in UML's memory setup code assume that the kernel is already running. i.e. when kmemleak is enabled it will evaluate current() in free_bootmem(). That early current() is undefined and UML explodes. Solve the problem by setting up physical memory in setup_arch(), at this stage the kernel has materialized and basic infrastructure such as current() works. Signed-off-by: Richard Weinberger <richard@nod.at>
2016-08-04um: Eliminate null test after alloc_bootmemAmitoj Kaur Chawla
alloc_bootmem function never returns NULL. Thus a NULL test after a call to this function is unnecessary. The Coccinelle semantic patch used to make this change is follows: @@ expression E; statement S; @@ E = alloc_bootmem(...) ... when != E - if (E == NULL) S Signed-off-by: Amitoj Kaur Chawla <amitoj1606@gmail.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2016-07-29Merge branch 'next' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Pull security subsystem updates from James Morris: "Highlights: - TPM core and driver updates/fixes - IPv6 security labeling (CALIPSO) - Lots of Apparmor fixes - Seccomp: remove 2-phase API, close hole where ptrace can change syscall #" * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (156 commits) apparmor: fix SECURITY_APPARMOR_HASH_DEFAULT parameter handling tpm: Add TPM 2.0 support to the Nuvoton i2c driver (NPCT6xx family) tpm: Factor out common startup code tpm: use devm_add_action_or_reset tpm2_i2c_nuvoton: add irq validity check tpm: read burstcount from TPM_STS in one 32-bit transaction tpm: fix byte-order for the value read by tpm2_get_tpm_pt tpm_tis_core: convert max timeouts from msec to jiffies apparmor: fix arg_size computation for when setprocattr is null terminated apparmor: fix oops, validate buffer size in apparmor_setprocattr() apparmor: do not expose kernel stack apparmor: fix module parameters can be changed after policy is locked apparmor: fix oops in profile_unpack() when policy_db is not present apparmor: don't check for vmalloc_addr if kvzalloc() failed apparmor: add missing id bounds check on dfa verification apparmor: allow SYS_CAP_RESOURCE to be sufficient to prlimit another task apparmor: use list_next_entry instead of list_entry_next apparmor: fix refcount race when finding a child profile apparmor: fix ref count leak when profile sha1 hash is read apparmor: check that xindex is in trans_table bounds ...
2016-07-26mm: do not pass mm_struct into handle_mm_faultKirill A. Shutemov
We always have vma->vm_mm around. Link: http://lkml.kernel.org/r/1466021202-61880-8-git-send-email-kirill.shutemov@linux.intel.com Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-06-24tree wide: get rid of __GFP_REPEAT for order-0 allocations part IMichal Hocko
This is the third version of the patchset previously sent [1]. I have basically only rebased it on top of 4.7-rc1 tree and dropped "dm: get rid of superfluous gfp flags" which went through dm tree. I am sending it now because it is tree wide and chances for conflicts are reduced considerably when we want to target rc2. I plan to send the next step and rename the flag and move to a better semantic later during this release cycle so we will have a new semantic ready for 4.8 merge window hopefully. Motivation: While working on something unrelated I've checked the current usage of __GFP_REPEAT in the tree. It seems that a majority of the usage is and always has been bogus because __GFP_REPEAT has always been about costly high order allocations while we are using it for order-0 or very small orders very often. It seems that a big pile of them is just a copy&paste when a code has been adopted from one arch to another. I think it makes some sense to get rid of them because they are just making the semantic more unclear. Please note that GFP_REPEAT is documented as * __GFP_REPEAT: Try hard to allocate the memory, but the allocation attempt * _might_ fail. This depends upon the particular VM implementation. while !costly requests have basically nofail semantic. So one could reasonably expect that order-0 request with __GFP_REPEAT will not loop for ever. This is not implemented right now though. I would like to move on with __GFP_REPEAT and define a better semantic for it. $ git grep __GFP_REPEAT origin/master | wc -l 111 $ git grep __GFP_REPEAT | wc -l 36 So we are down to the third after this patch series. The remaining places really seem to be relying on __GFP_REPEAT due to large allocation requests. This still needs some double checking which I will do later after all the simple ones are sorted out. I am touching a lot of arch specific code here and I hope I got it right but as a matter of fact I even didn't compile test for some archs as I do not have cross compiler for them. Patches should be quite trivial to review for stupid compile mistakes though. The tricky parts are usually hidden by macro definitions and thats where I would appreciate help from arch maintainers. [1] http://lkml.kernel.org/r/1461849846-27209-1-git-send-email-mhocko@kernel.org This patch (of 19): __GFP_REPEAT has a rather weak semantic but since it has been introduced around 2.6.12 it has been ignored for low order allocations. Yet we have the full kernel tree with its usage for apparently order-0 allocations. This is really confusing because __GFP_REPEAT is explicitly documented to allow allocation failures which is a weaker semantic than the current order-0 has (basically nofail). Let's simply drop __GFP_REPEAT from those places. This would allow to identify place which really need allocator to retry harder and formulate a more specific semantic for what the flag is supposed to do actually. Link: http://lkml.kernel.org/r/1464599699-30131-2-git-send-email-mhocko@kernel.org Signed-off-by: Michal Hocko <mhocko@suse.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: "James E.J. Bottomley" <jejb@parisc-linux.org> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: Andy Lutomirski <luto@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chen Liqin <liqin.linux@gmail.com> Cc: Chris Metcalf <cmetcalf@mellanox.com> [for tile] Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jan Kara <jack@suse.cz> Cc: John Crispin <blogic@openwrt.org> Cc: Lennox Wu <lennox.wu@gmail.com> Cc: Ley Foon Tan <lftan@altera.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Matt Fleming <matt@codeblueprint.co.uk> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Rich Felker <dalias@libc.org> Cc: Russell King <linux@arm.linux.org.uk> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-06-14um/ptrace: run seccomp after ptraceKees Cook
Close the hole where ptrace can change a syscall out from under seccomp. Signed-off-by: Kees Cook <keescook@chromium.org> Cc: Jeff Dike <jdike@addtoit.com> Cc: Richard Weinberger <richard@nod.at> Cc: user-mode-linux-devel@lists.sourceforge.net
2016-06-14seccomp: Add a seccomp_data parameter secure_computing()Andy Lutomirski
Currently, if arch code wants to supply seccomp_data directly to seccomp (which is generally much faster than having seccomp do it using the syscall_get_xyz() API), it has to use the two-phase seccomp hooks. Add it to the easy hooks, too. Cc: linux-arch@vger.kernel.org Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Kees Cook <keescook@chromium.org>
2016-05-27Merge branch 'for-linus-4.7-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml Pull UML updates from Richard Weinberger: "This contains a nice FPU fixup from Eli Cooper for UML" * 'for-linus-4.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml: um: add extended processor state save/restore support um: extend fpstate to _xstate to support YMM registers um: fix FPU state preservation around signal handlers
2016-05-21um: add extended processor state save/restore supportEli Cooper
This patch extends save_fp_registers() and restore_fp_registers() to use PTRACE_GETREGSET and PTRACE_SETREGSET with the XSTATE note type, adding support for new processor state extensions between context switches. When the new ptrace requests are unavailable, it falls back to the old PTRACE_GETFPREGS and PTRACE_SETFPREGS methods, which have been renamed to save_i387_registers() and restore_i387_registers(). Now these functions expect *fp_regs to have the space of an _xstate struct. Thus, this also makes ptrace in UML responde to PTRACE_GETFPREGS/_SETFPREG requests with a user_i387_struct (thus independent from HOST_FP_SIZE), and by calling save_i387_registers() and restore_i387_registers() instead of the extended save_fp_registers() and restore_fp_registers() functions. Signed-off-by: Eli Cooper <elicooper@gmx.com>
2016-05-20exit_thread: remove empty bodiesJiri Slaby
Define HAVE_EXIT_THREAD for archs which want to do something in exit_thread. For others, let's define exit_thread as an empty inline. This is a cleanup before we change the prototype of exit_thread to accept a task parameter. [akpm@linux-foundation.org: fix mips] Signed-off-by: Jiri Slaby <jslaby@suse.cz> Cc: "David S. Miller" <davem@davemloft.net> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: "James E.J. Bottomley" <jejb@parisc-linux.org> Cc: Aurelien Jacquiot <a-jacquiot@ti.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chen Liqin <liqin.linux@gmail.com> Cc: Chris Metcalf <cmetcalf@mellanox.com> Cc: Chris Zankel <chris@zankel.net> Cc: David Howells <dhowells@redhat.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Cc: Haavard Skinnemoen <hskinnemoen@gmail.com> Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: James Hogan <james.hogan@imgtec.com> Cc: Jeff Dike <jdike@addtoit.com> Cc: Jesper Nilsson <jesper.nilsson@axis.com> Cc: Jiri Slaby <jslaby@suse.cz> Cc: Jonas Bonn <jonas@southpole.se> Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com> Cc: Lennox Wu <lennox.wu@gmail.com> Cc: Ley Foon Tan <lftan@altera.com> Cc: Mark Salter <msalter@redhat.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Mikael Starvik <starvik@axis.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Rich Felker <dalias@libc.org> Cc: Richard Henderson <rth@twiddle.net> Cc: Richard Kuo <rkuo@codeaurora.org> Cc: Richard Weinberger <richard@nod.at> Cc: Russell King <linux@arm.linux.org.uk> Cc: Steven Miao <realmz6@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-17mm: cleanup *pte_alloc* interfacesKirill A. Shutemov
There are few things about *pte_alloc*() helpers worth cleaning up: - 'vma' argument is unused, let's drop it; - most __pte_alloc() callers do speculative check for pmd_none(), before taking ptl: let's introduce pte_alloc() macro which does the check. The only direct user of __pte_alloc left is userfaultfd, which has different expectation about atomicity wrt pmd. - pte_alloc_map() and pte_alloc_map_lock() are redefined using pte_alloc(). [sudeep.holla@arm.com: fix build for arm64 hugetlbpage] [sfr@canb.auug.org.au: fix arch/arm/mm/mmu.c some more] Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Dave Hansen <dave.hansen@intel.com> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-05um: Export pm_power_offRichard Weinberger
...modules are using this symbol. Export it like all other archs to. Signed-off-by: Richard Weinberger <richard@nod.at>
2016-03-05Revert "um: Fix get_signal() usage"Richard Weinberger
Commit db2f24dc240856fb1d78005307f1523b7b3c121b was plain wrong. I did not realize the we are allowed to loop here. In fact we have to loop and must not return to userspace before all SIGSEGVs have been delivered. Other archs do this directly in their entry code, UML does it here. Reported-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Richard Weinberger <richard@nod.at>
2016-01-10um: Add seccomp supportMickaël Salaün
This brings SECCOMP_MODE_STRICT and SECCOMP_MODE_FILTER support through prctl(2) and seccomp(2) to User-mode Linux for i386 and x86_64 subarchitectures. secure_computing() is called first in handle_syscall() so that the syscall emulation will be aborted quickly if matching a seccomp rule. This is inspired from Meredydd Luff's patch (https://gerrit.chromium.org/gerrit/21425). Signed-off-by: Mickaël Salaün <mic@digikod.net> Cc: Jeff Dike <jdike@addtoit.com> Cc: Richard Weinberger <richard@nod.at> Cc: Ingo Molnar <mingo@redhat.com> Cc: Kees Cook <keescook@chromium.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Will Drewry <wad@chromium.org> Cc: Chris Metcalf <cmetcalf@ezchip.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: James Hogan <james.hogan@imgtec.com> Cc: Meredydd Luff <meredydd@senatehouse.org> Cc: David Drysdale <drysdale@google.com> Signed-off-by: Richard Weinberger <richard@nod.at> Acked-by: Kees Cook <keescook@chromium.org>
2016-01-10um: Fix ptrace GETREGS/SETREGS bugsMickaël Salaün
This fix two related bugs: * PTRACE_GETREGS doesn't get the right orig_ax (syscall) value * PTRACE_SETREGS can't set the orig_ax value (erased by initial value) Get rid of the now useless and error-prone get_syscall(). Fix inconsistent behavior in the ptrace implementation for i386 when updating orig_eax automatically update the syscall number as well. This is now updated in handle_syscall(). Signed-off-by: Mickaël Salaün <mic@digikod.net> Cc: Jeff Dike <jdike@addtoit.com> Cc: Richard Weinberger <richard@nod.at> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Kees Cook <keescook@chromium.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Will Drewry <wad@chromium.org> Cc: Thomas Meyer <thomas@m3y3r.de> Cc: Nicolas Iooss <nicolas.iooss_linux@m4x.org> Cc: Anton Ivanov <aivanov@brocade.com> Cc: Meredydd Luff <meredydd@senatehouse.org> Cc: David Drysdale <drysdale@google.com> Signed-off-by: Richard Weinberger <richard@nod.at> Acked-by: Kees Cook <keescook@chromium.org>
2015-12-08um: Fix get_signal() usageRichard Weinberger
If get_signal() returns us a signal to post we must not call it again, otherwise the already posted signal will be overridden. Before commit a610d6e672d this was the case as we stopped the while after a successful handle_signal(). Cc: <stable@vger.kernel.org> # 3.10- Fixes: a610d6e672d ("pull clearing RESTORE_SIGMASK into block_sigmask()") Signed-off-by: Richard Weinberger <richard@nod.at>
2015-11-06um: Switch clocksource to hrtimersAnton Ivanov
UML is using an obsolete itimer call for all timers and "polls" for kernel space timer firing in its userspace portion resulting in a long list of bugs and incorrect behaviour(s). It also uses ITIMER_VIRTUAL for its timer which results in the timer being dependent on it running and the cpu load. This patch fixes this by moving to posix high resolution timers firing off CLOCK_MONOTONIC and relaying the timer correctly to the UML userspace. Fixes: - crashes when hosts suspends/resumes - broken userspace timers - effecive ~40Hz instead of what they should be. Note - this modifies skas behavior by no longer setting an itimer per clone(). Timer events are relayed instead. - kernel network packet scheduling disciplines - tcp behaviour especially under load - various timer related corner cases Finally, overall responsiveness of userspace is better. Signed-off-by: Thomas Meyer <thomas@m3y3r.de> Signed-off-by: Anton Ivanov <aivanov@brocade.com> [rw: massaged commit message] Signed-off-by: Richard Weinberger <richard@nod.at>
2015-11-06um: Report host OOM more nicelyRichard Weinberger
If UML runs on the host side out of memory, report this condition more nicely. Signed-off-by: Richard Weinberger <richard@nod.at>
2015-11-06um: Get rid of open coded NR_SYSCALLSRichard Weinberger
We can use __NR_syscall_max. Signed-off-by: Richard Weinberger <richard@nod.at>
2015-11-06um: Store syscall number after syscall_trace_enter()Richard Weinberger
To support changing syscall numbers we have to store it after syscall_trace_enter(). Signed-off-by: Richard Weinberger <richard@nod.at>
2015-10-19um: Fix kernel mode fault conditionRichard Weinberger
We have to exclude memory locations <= PAGE_SIZE from the condition and let the kernel mode fault path catch it. Otherwise a kernel NULL pointer exception will be reported as a kernel user space access. Fixes: d2313084e2c (um: Catch unprotected user memory access) Signed-off-by: Richard Weinberger <richard@nod.at>
2015-09-01Merge branch 'timers-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer updates from Thomas Gleixner: "Rather large, but nothing exiting: - new range check for settimeofday() to prevent that boot time becomes negative. - fix for file time rounding - a few simplifications of the hrtimer code - fix for the proc/timerlist code so the output of clock realtime timers is accurate - more y2038 work - tree wide conversion of clockevent drivers to the new callbacks" * 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (88 commits) hrtimer: Handle failure of tick_init_highres() gracefully hrtimer: Unconfuse switch_hrtimer_base() a bit hrtimer: Simplify get_target_base() by returning current base hrtimer: Drop return code of hrtimer_switch_to_hres() time: Introduce timespec64_to_jiffies()/jiffies_to_timespec64() time: Introduce current_kernel_time64() time: Introduce struct itimerspec64 time: Add the common weak version of update_persistent_clock() time: Always make sure wall_to_monotonic isn't positive time: Fix nanosecond file time rounding in timespec_trunc() timer_list: Add the base offset so remaining nsecs are accurate for non monotonic timers cris/time: Migrate to new 'set-state' interface kernel: broadcast-hrtimer: Migrate to new 'set-state' interface xtensa/time: Migrate to new 'set-state' interface unicore/time: Migrate to new 'set-state' interface um/time: Migrate to new 'set-state' interface sparc/time: Migrate to new 'set-state' interface sh/localtimer: Migrate to new 'set-state' interface score/time: Migrate to new 'set-state' interface s390/time: Migrate to new 'set-state' interface ...
2015-08-10um/time: Migrate to new 'set-state' interfaceViresh Kumar
Migrate um driver to the new 'set-state' interface provided by clockevents core, the earlier 'set-mode' interface is marked obsolete now. This also enables us to implement callbacks for new states of clockevent devices, for example: ONESHOT_STOPPED. Cc: Jeff Dike <jdike@addtoit.com> Cc: Richard Weinberger <richard@nod.at> Cc: user-mode-linux-devel@lists.sourceforge.net Cc: user-mode-linux-user@lists.sourceforge.net Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2015-07-07um: Fix do_signal() prototypeIngo Molnar
Once x86 exports its do_signal(), the prototypes will clash. Fix the clash and also improve the code a bit: remove the unnecessary kern_do_signal() indirection. This allows interrupt_end() to share the 'regs' parameter calculation. Also remove the unused return code to match x86. Minimally build and boot tested. Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Denys Vlasenko <vda.linux@googlemail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Richard Weinberger <richard.weinberger@gmail.com> Cc: Rik van Riel <riel@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: paulmck@linux.vnet.ibm.com Link: http://lkml.kernel.org/r/67c57eac09a589bac3c6c5ff22f9623ec55a184a.1435952415.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-06-28Merge branch 'for-linus-4.2-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml Pull UML updates from Richard Weinberger: - remove hppfs ("HonePot ProcFS") - initial support for musl libc - uaccess cleanup - random cleanups and bug fixes all over the place * 'for-linus-4.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml: (21 commits) um: Don't pollute kernel namespace with uapi um: Include sys/types.h for makedev(), major(), minor() um: Do not use stdin and stdout identifiers for struct members um: Do not use __ptr_t type for stack_t's .ss pointer um: Fix mconsole dependency um: Handle tracehook_report_syscall_entry() result um: Remove copy&paste code from init.h um: Stop abusing __KERNEL__ um: Catch unprotected user memory access um: Fix warning in setup_signal_stack_si() um: Rework uaccess code um: Add uaccess.h to ldt.c um: Add uaccess.h to syscalls_64.c um: Add asm/elf.h to vma.c um: Cleanup mem_32/64.c headers um: Remove hppfs um: Move syscall() declaration into os.h um: kernel: ksyms: Export symbol syscall() for fixing modpost issue um/os-Linux: Use char[] for syscall_stub declarations um: Use char[] for linker script address declarations ...
2015-06-25um: Don't pollute kernel namespace with uapiRichard Weinberger
Don't include ptrace uapi stuff in arch headers, it will pollute the kernel namespace and conflict with existing stuff. In this case it fixes clashes with common names like R8. Signed-off-by: Richard Weinberger <richard@nod.at>
2015-05-31um: Handle tracehook_report_syscall_entry() resultRichard Weinberger
tracehook_report_syscall_entry() is allowed to fail, in case of failure we have to abort the current syscall. Signed-off-by: Richard Weinberger <richard@nod.at>
2015-05-31um: Catch unprotected user memory accessRichard Weinberger
If the kernel tries to access user memory without copy_from_user() a trap will happen as kernel and userspace run in different processes on the host side. Currently this special page fault cannot be resolved and will happen over and over again. As result UML will lockup. This patch allows the page fault code to detect that situation and causes a panic() such that the root cause of the unprotected memory access can be found and fixed. Signed-off-by: Richard Weinberger <richard@nod.at>
2015-05-31um: Rework uaccess codeRichard Weinberger
Rework UML's uaccess code to reuse as much as possible from asm-generic/uaccess.c. Signed-off-by: Richard Weinberger <richard@nod.at>
2015-05-31um: Move syscall() declaration into os.hRichard Weinberger
Signed-off-by: Richard Weinberger <richard@nod.at>
2015-05-31um: kernel: ksyms: Export symbol syscall() for fixing modpost issueChen Gang
syscall() is implemented in libc.so/a (e.g. for glibc, in "syscall.o"), so for normal ".o" files, it is undefined, neither can be found within kernel wide, so will break modpost. Since ".o" files is OK, can simply export 'syscall' symbol, let modpost know about that, then can fix this issue. The related error (with allmodconfig under um): MODPOST 1205 modules ERROR: "syscall" [fs/hostfs/hostfs.ko] undefined! Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2015-05-31um: Use char[] for linker script address declarationsNicolas Iooss
The linker script defines some variables which are declared either with type char[] in include/asm-generic/sections.h or with a meaningless integer type in arch/um/include/asm/sections.h. Fix this inconsistency by declaring every variable char[]. Signed-off-by: Nicolas Iooss <nicolas.iooss_linux@m4x.org> Signed-off-by: Richard Weinberger <richard@nod.at>
2015-05-31um: Create asm/sections.hNicolas Iooss
arch/um/kernel/dyn.lds.S and arch/um/kernel/uml.lds.S define some UML-specific symbols. These symbols are used in the kernel part of UML with extern declarations. Move these declarations to a new header, asm/sections.h, like other architectures do. Signed-off-by: Nicolas Iooss <nicolas.iooss_linux@m4x.org> Signed-off-by: Richard Weinberger <richard@nod.at>
2015-05-19mm/fault, um: Fix compile errorPeter Zijlstra
A missing include file caused build fail. Reported-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: David Hildenbrand <dahi@linux.vnet.ibm.com> Cc: David.Laight@ACULAB.COM Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Zijlstra (Intel) <peterz@infradead.org> Cc: airlied@linux.ie Cc: akpm@linux-foundation.org Cc: benh@kernel.crashing.org Cc: bigeasy@linutronix.de Cc: borntraeger@de.ibm.com Cc: daniel.vetter@intel.com Cc: heiko.carstens@de.ibm.com Cc: herbert@gondor.apana.org.au Cc: hocko@suse.cz Cc: hughd@google.com Cc: mst@redhat.com Cc: paulus@samba.org Cc: ralf@linux-mips.org Cc: schwidefsky@de.ibm.com Cc: yang.shi@windriver.com Fixes: 70ffdb9393a7 ("mm/fault, arch: Use pagefault_disable() to check for disabled pagefaults in the handler") Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-05-19mm/fault, arch: Use pagefault_disable() to check for disabled pagefaults in ↵David Hildenbrand
the handler Introduce faulthandler_disabled() and use it to check for irq context and disabled pagefaults (via pagefault_disable()) in the pagefault handlers. Please note that we keep the in_atomic() checks in place - to detect whether in irq context (in which case preemption is always properly disabled). In contrast, preempt_disable() should never be used to disable pagefaults. With !CONFIG_PREEMPT_COUNT, preempt_disable() doesn't modify the preempt counter, and therefore the result of in_atomic() differs. We validate that condition by using might_fault() checks when calling might_sleep(). Therefore, add a comment to faulthandler_disabled(), describing why this is needed. faulthandler_disabled() and pagefault_disable() are defined in linux/uaccess.h, so let's properly add that include to all relevant files. This patch is based on a patch from Thomas Gleixner. Reviewed-and-tested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: David.Laight@ACULAB.COM Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: airlied@linux.ie Cc: akpm@linux-foundation.org Cc: benh@kernel.crashing.org Cc: bigeasy@linutronix.de Cc: borntraeger@de.ibm.com Cc: daniel.vetter@intel.com Cc: heiko.carstens@de.ibm.com Cc: herbert@gondor.apana.org.au Cc: hocko@suse.cz Cc: hughd@google.com Cc: mst@redhat.com Cc: paulus@samba.org Cc: ralf@linux-mips.org Cc: schwidefsky@de.ibm.com Cc: yang.shi@windriver.com Link: http://lkml.kernel.org/r/1431359540-32227-7-git-send-email-dahi@linux.vnet.ibm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-13um: Print minimum physical memory requirementThomas Meyer
Print a more sensible message about the minimum physical memory requirement. Signed-off-by: Thomas Meyer <thomas@m3y3r.de> Signed-off-by: Richard Weinberger <richard@nod.at>
2015-04-13um: Move uml_postsetup in the init_thread stackThomas Meyer
atomic_notifier_chain_register() and uml_postsetup() do call kernel code that rely on the "current" kernel macro and a valid task_struct resp. thread_info struct. Give those functions a valid stack by moving uml_postsetup() in the init_thread stack. This moves enables a panic() call in this early code to generate a valid stacktrace, instead of crashing. E.g. when an UML kernel is started with an initrd but too few physical memory the panic() call get's actually processed. Signed-off-by: Thomas Meyer <thomas@m3y3r.de> Signed-off-by: Richard Weinberger <richard@nod.at>
2015-04-13um: add a kmsg_dumperThomas Meyer
Add a kmsg_dumper, that dumps the kmsg buffer to stdout, when no console is available. This an enables the printing of early panic() calls triggered in uml_postsetup(). When a panic() call happens so early in the UML kernel no earlyprintk/console is available yet, but with a kmsg_dumper in place the kernel message buffer will be outputted to the user, to give a better hint, of what the failure was. Signed-off-by: Thomas Meyer <thomas@m3y3r.de> Signed-off-by: Richard Weinberger <richard@nod.at>
2015-04-13um: Remove broken highmem supportRichard Weinberger
Highmem was always buggy and experimental on UML(i386). In times where 64 bit computers are default we can remove that experimental code. Signed-off-by: Richard Weinberger <richard@nod.at>
2015-04-13um: Remove broken SMP supportRichard Weinberger
At times where UML used the TT mode to operate it had kind of SMP support. It never got finished nor was stable. Let's rip out that cruft and stop confusing developers which do tree-wide SMP cleanups. If someone wants SMP support UML it has do be done from scratch. Signed-off-by: Richard Weinberger <richard@nod.at>
2015-04-13um: Remove SKAS3/4 supportRichard Weinberger
Before we had SKAS0 UML had two modes of operation TT (tracing thread) and SKAS3/4 (separated kernel address space). TT was known to be insecure and got removed a long time ago. SKAS3/4 required a few (3 or 4) patches on the host side which never went mainline. The last host patch is 10 years old. With SKAS0 mode (separated kernel address space using 0 host patches), default since 2005, SKAS3/4 is obsolete and can be removed. Signed-off-by: Richard Weinberger <richard@nod.at>
2015-04-13um: Remove dead code from stacktraceRichard Weinberger
Remove left over code from commit 970e51feaddb (um: Add support for CONFIG_STACKTRACE) Signed-off-by: Richard Weinberger <richard@nod.at>
2015-01-29vm: add VM_FAULT_SIGSEGV handling supportLinus Torvalds
The core VM already knows about VM_FAULT_SIGBUS, but cannot return a "you should SIGSEGV" error, because the SIGSEGV case was generally handled by the caller - usually the architecture fault handler. That results in lots of duplication - all the architecture fault handlers end up doing very similar "look up vma, check permissions, do retries etc" - but it generally works. However, there are cases where the VM actually wants to SIGSEGV, and applications _expect_ SIGSEGV. In particular, when accessing the stack guard page, libsigsegv expects a SIGSEGV. And it usually got one, because the stack growth is handled by that duplicated architecture fault handler. However, when the generic VM layer started propagating the error return from the stack expansion in commit fee7e49d4514 ("mm: propagate error from stack expansion even for guard page"), that now exposed the existing VM_FAULT_SIGBUS result to user space. And user space really expected SIGSEGV, not SIGBUS. To fix that case, we need to add a VM_FAULT_SIGSEGV, and teach all those duplicate architecture fault handlers about it. They all already have the code to handle SIGSEGV, so it's about just tying that new return value to the existing code, but it's all a bit annoying. This is the mindless minimal patch to do this. A more extensive patch would be to try to gather up the mostly shared fault handling logic into one generic helper routine, and long-term we really should do that cleanup. Just from this patch, you can generally see that most architectures just copied (directly or indirectly) the old x86 way of doing things, but in the meantime that original x86 model has been improved to hold the VM semaphore for shorter times etc and to handle VM_FAULT_RETRY and other "newer" things, so it would be a good idea to bring all those improvements to the generic case and teach other architectures about them too. Reported-and-tested-by: Takashi Iwai <tiwai@suse.de> Tested-by: Jan Engelhardt <jengelh@inai.de> Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> # "s390 still compiles and boots" Cc: linux-arch@vger.kernel.org Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-19Merge git://git.infradead.org/users/eparis/auditLinus Torvalds
Pull audit updates from Eric Paris: "So this change across a whole bunch of arches really solves one basic problem. We want to audit when seccomp is killing a process. seccomp hooks in before the audit syscall entry code. audit_syscall_entry took as an argument the arch of the given syscall. Since the arch is part of what makes a syscall number meaningful it's an important part of the record, but it isn't available when seccomp shoots the syscall... For most arch's we have a better way to get the arch (syscall_get_arch) So the solution was two fold: Implement syscall_get_arch() everywhere there is audit which didn't have it. Use syscall_get_arch() in the seccomp audit code. Having syscall_get_arch() everywhere meant it was a useless flag on the stack and we could get rid of it for the typical syscall entry. The other changes inside the audit system aren't grand, fixed some records that had invalid spaces. Better locking around the task comm field. Removing some dead functions and structs. Make some things static. Really minor stuff" * git://git.infradead.org/users/eparis/audit: (31 commits) audit: rename audit_log_remove_rule to disambiguate for trees audit: cull redundancy in audit_rule_change audit: WARN if audit_rule_change called illegally audit: put rule existence check in canonical order next: openrisc: Fix build audit: get comm using lock to avoid race in string printing audit: remove open_arg() function that is never used audit: correct AUDIT_GET_FEATURE return message type audit: set nlmsg_len for multicast messages. audit: use union for audit_field values since they are mutually exclusive audit: invalid op= values for rules audit: use atomic_t to simplify audit_serial() kernel/audit.c: use ARRAY_SIZE instead of sizeof/sizeof[0] audit: reduce scope of audit_log_fcaps audit: reduce scope of audit_net_id audit: arm64: Remove the audit arch argument to audit_syscall_entry arm64: audit: Add audit hook in syscall_trace_enter/exit() audit: x86: drop arch from __audit_syscall_entry() interface sparc: implement is_32bit_task sparc: properly conditionalize use of TIF_32BIT ...
2014-10-13um: Add support for CONFIG_STACKTRACEDaniel Walter
Add stacktrace support for User Mode Linux Signed-off-by: Daniel Walter <dwalter@google.com> Signed-off-by: Richard Weinberger <richard@nod.at>