Age | Commit message (Collapse) | Author |
|
The RTC core is always calling rtc_valid_tm after the read_time callback.
It is not necessary to call it just before returning from the callback.
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
Add a couple of trace points in the VAS driver
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
[mpe: Add SPDX tag to new header]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
When VAS is not configured, unregister the platform driver. Also simplify
cleanup by delaying vas debugfs init until we know VAS is configured.
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
pnv_npu2_init_context wasn't checking the return code from
__mmu_notifier_register. If __mmu_notifier_register failed, the
npu_context was still assigned to the mm and the caller wasn't given any
indication that things went wrong. Later on pnv_npu2_destroy_context would
be called, which in turn called mmu_notifier_unregister and dropped
mm->mm_count without having incremented it in the first place. This led to
various forms of corruption like mm use-after-free and mm double-free.
__mmu_notifier_register can fail with EINTR if a signal is pending, so
this case can be frequent.
This patch calls opal_npu_destroy_context on the failure paths, and makes
sure not to assign mm->context.npu_context until past the failure points.
Signed-off-by: Mark Hairgrove <mhairgrove@nvidia.com>
Acked-By: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
This is a tidy up which removes radix MMU calls into the slice
code.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
The slice_mask cache was a basic conversion which copied the slice
mask into caller's structures, because that's how the original code
worked. In most cases the pointer can be used directly instead, saving
a copy and an on-stack structure.
On POWER8, this increases vfork+exec+exit performance by 0.3%
and reduces time to mmap+munmap a 64kB page by 2%.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
This code is never compiled in, and it gets broken by the next
patch, so remove it.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
This converts the slice_mask bit operation helpers to be the usual
3-operand kind, which allows 2 inputs to set a different output
without an extra copy, which is used in the next patch.
Adds slice_copy_mask, which will be used in the next patch.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
Rather than build slice masks from a range then use that to check for
fit in a candidate mask, implement slice_check_range_fits that checks
if a range fits in a mask directly.
This allows several structures to be removed from stacks, and also we
don't expect a huge range in a lot of these cases, so building and
comparing a full mask is going to be more expensive than testing just
one or two bits of the range.
On POWER8, this increases vfork+exec+exit performance by 0.3%
and reduces time to mmap+munmap a 64kB page by 5%.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
Calculating the slice mask can become a signifcant overhead for
get_unmapped_area. This patch adds a struct slice_mask for
each page size in the mm_context, and keeps these in synch with
the slices psize arrays and slb_addr_limit.
On Book3S/64 this adds 288 bytes to the mm_context_t for the
slice mask caches.
On POWER8, this increases vfork+exec+exit performance by 9.9%
and reduces time to mmap+munmap a 64kB page by 28%.
Reduces time to mmap+munmap by about 10% on 8xx.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
Pass around const pointers to struct slice_mask where possible, rather
than copies of slice_mask, to reduce stack and call overhead.
checkstack.pl gives, before:
0x00000d1c slice_get_unmapped_area [slice.o]: 592
0x00001864 is_hugepage_only_range [slice.o]: 448
0x00000754 slice_find_area_topdown [slice.o]: 400
0x00000484 slice_find_area_bottomup.isra.1 [slice.o]: 272
0x000017b4 slice_set_range_psize [slice.o]: 224
0x00000a4c slice_find_area [slice.o]: 128
0x00000160 slice_check_fit [slice.o]: 112
after:
0x00000ad0 slice_get_unmapped_area [slice.o]: 448
0x00001464 is_hugepage_only_range [slice.o]: 288
0x000006c0 slice_find_area [slice.o]: 144
0x0000016c slice_check_fit [slice.o]: 128
0x00000528 slice_find_area_bottomup.isra.2 [slice.o]: 128
0x000013e4 slice_set_range_psize [slice.o]: 128
This increases vfork+exec+exit performance by 1.5%.
Reduces time to mmap+munmap a 64kB page by 17%.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
Make these loops look the same, and change their form so the
important part is not wrapped over so many lines.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
The slice state of an mm gets zeroed then initialised upon exec.
This is the only caller of slice_set_user_psize now, so that can be
removed and instead implement a faster and simplified approach that
requires no locking or checking existing state.
This speeds up vfork+exec+exit performance on POWER8 by 3%.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
Now that plpar_wrappers.h has an #ifdef PSERIES we can move the empty
version of plpar_set_ciabr() which xmon wants into there.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
Back in 2013 we added some hypercall wrappers which misspelled
"plpar" (P-series Logical PARtition) as "plapr".
Visually they're hard to distinguish and it almost doesn't matter, but
it is confusing when grepping to miss some calls because of the typo.
They've also started spreading, so before they take over let's fix
them all to be "plpar".
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
Currently plpar_wrappers.h is not safe to include when
CONFIG_PPC_PSERIES=n, or at least it can be depending on other config
options and so on.
Fix that by wrapping the entire content in an ifdef.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
smp_query_cpu_stopped() and related #defines are currently in
plpar_wrappers.h. The function actually does an RTAS call, not an
hcall, and basically has nothing to do with plpar_wrappers.h
Move it into pseries.h, where it can easily be used by the only two
callers in pseries/smp.c and pseries/hotplug-cpu.c.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
early_init() and machine_init() have no prototype, add one in
asm-prototypes.h.
Fixes the following warnings (treated as error in W=1):
arch/powerpc/kernel/setup_32.c:68:30: error: no previous prototype for ‘early_init’
arch/powerpc/kernel/setup_32.c:99:21: error: no previous prototype for ‘machine_init’
Signed-off-by: Mathieu Malaterre <malat@debian.org>
[mpe: Move them to asm-prototypes.h, drop other functions]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
These functions can all be static, make it so.
Signed-off-by: Mathieu Malaterre <malat@debian.org>
[mpe: Combine a patch of Mathieu's with some other static conversions]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
Rewrite function-like macro into regular static inline function to
avoid a warning during macro expansion.
Fix warning (treated as error in W=1):
./arch/powerpc/include/asm/uaccess.h:52:35: error: comparison of unsigned expression >= 0 is always true
(((size) == 0) || (((size) - 1) <= ((segment).seg - (addr)))))
^
Suggested-by: Segher Boessenkool <segher@kernel.crashing.org>
Signed-off-by: Mathieu Malaterre <malat@debian.org>
Reviewed-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
Rewrite comparison since all values compared are of type `unsigned long`.
Instead of using unsigned properties and rewriting the original code as:
(originally suggested by Segher Boessenkool <segher@kernel.crashing.org>)
#define pfn_valid(pfn) \
(((pfn) - ARCH_PFN_OFFSET) < (max_mapnr - ARCH_PFN_OFFSET))
Prefer a static inline function to make code as readable as possible.
Fix a warning (treated as error in W=1):
arch/powerpc/include/asm/page.h:129:32: error: comparison of unsigned expression >= 0 is always true [-Werror=type-limits]
#define pfn_valid(pfn) ((pfn) >= ARCH_PFN_OFFSET && (pfn) < max_mapnr)
^
Suggested-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Mathieu Malaterre <malat@debian.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
When neither CONFIG_ALTIVEC, nor CONFIG_VSX or CONFIG_PPC64 is
defined, the array feature_properties is defined as an empty array,
which in turn triggers the following warning (treated as error on
W=1):
arch/powerpc/kernel/prom.c: In function ‘check_cpu_feature_properties’:
arch/powerpc/kernel/prom.c:298:16: error: comparison of unsigned expression < 0 is always false
for (i = 0; i < ARRAY_SIZE(feature_properties); ++i, ++fp) {
^
Suggested-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Mathieu Malaterre <malat@debian.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
Add missing prototypes for ppc_select() & ppc_fadvise64_64() to header
asm-prototypes.h. Fix the following warnings (treated as errors in W=1)
arch/powerpc/kernel/syscalls.c:87:1: error: no previous prototype for ‘ppc_select’
arch/powerpc/kernel/syscalls.c:119:6: error: no previous prototype for ‘ppc_fadvise64_64’
Signed-off-by: Mathieu Malaterre <malat@debian.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
arch_unregister_hw_breakpoint()
In commit 5aae8a537080 ("powerpc, hw_breakpoints: Implement
hw_breakpoints for 64-bit server processors") function
hw_breakpoint_handler() and arch_unregister_hw_breakpoint() were added
without function prototypes in hw_breakpoint.h header.
Fix the following warning(s) (treated as error in W=1):
arch/powerpc/kernel/hw_breakpoint.c:106:6: error: no previous prototype for ‘arch_unregister_hw_breakpoint’
arch/powerpc/kernel/hw_breakpoint.c:209:5: error: no previous prototype for ‘hw_breakpoint_handler’
Signed-off-by: Mathieu Malaterre <malat@debian.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
Two functions did not have a prototype defined in signal.h header. Fix
the following two warnings (treated as errors in W=1):
arch/powerpc/kernel/signal_32.c:1135:6: error: no previous prototype for ‘sys_rt_sigreturn’
arch/powerpc/kernel/signal_32.c:1422:6: error: no previous prototype for ‘sys_sigreturn’
Signed-off-by: Mathieu Malaterre <malat@debian.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
In commit 81e7009ea46c ("powerpc: merge ppc signal.c and ppc64
signal32.c") the function sys_debug_setcontext was added without a
prototype.
Fix compilation warning (treated as error in W=1):
arch/powerpc/kernel/signal_32.c:1227:5: error: no previous prototype for ‘sys_debug_setcontext’
Signed-off-by: Mathieu Malaterre <malat@debian.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
A function init_IRQ() was added without a prototype declared in header
irq.h. Fix the following warning (treated as error in W=1):
arch/powerpc/kernel/irq.c:662:13: error: no previous prototype for ‘init_IRQ’
Signed-off-by: Mathieu Malaterre <malat@debian.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
In commit 4f8b50bbbe63 ("irq_work, ppc: Fix up arch hooks") a new
function arch_irq_work_raise() was added without a prototype in header
irq_work.h.
Fix the following warning (treated as error in W=1):
arch/powerpc/kernel/time.c:523:6: error: no previous prototype for ‘arch_irq_work_raise’
Signed-off-by: Mathieu Malaterre <malat@debian.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
In commit 55ccf3fe3f9a ("fork: move the real prepare_to_copy() users to
arch_dup_task_struct()") a new arch_dup_task_struct() was added
without a prototype declared in thread_info.h header. Fix the
following warning (treated as error in W=1):
arch/powerpc/kernel/process.c:1609:5: error: no previous prototype for ‘arch_dup_task_struct’
Signed-off-by: Mathieu Malaterre <malat@debian.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
The function time_init did not have a prototype defined in the time.h
header. Fix the following warning (treated as error in W=1):
arch/powerpc/kernel/time.c:1068:13: error: no previous prototype for ‘time_init’
Signed-off-by: Mathieu Malaterre <malat@debian.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
In commit dabe859ec636 ("powerpc: Give hypervisor decrementer interrupts
their own handler") an empty body function was added, but no prototype
was declared. Fix warning (treated as error in W=1):
arch/powerpc/kernel/time.c:629:6: error: no previous prototype for ‘hdec_interrupt’
Signed-off-by: Mathieu Malaterre <malat@debian.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
In commit f0f558b131db ("powerpc/mm: Preserve CFAR value on SLB miss caused
by access to bogus address"), the function slb_miss_bad_addr() was added
without a prototype. This commit adds it.
Fix a warning (treated as error in W=1):
arch/powerpc/kernel/traps.c:1498:6: error: no previous prototype for ‘slb_miss_bad_addr’
Signed-off-by: Mathieu Malaterre <malat@debian.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
__giveup_fpu() is never called outside process.c, so it can be static.
That also means we don't need an empty definition in switch_to.h
Signed-off-by: Mathieu Malaterre <malat@debian.org>
[mpe: Also drop the empty version, rewrite change log]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
Change signature of two functions, adding static keyword to prevent the
following two warnings (treated as errors on W=1):
arch/powerpc/platforms/embedded6xx/flipper-pic.c:135:28: error: no previous prototype for ‘flipper_pic_init’
arch/powerpc/platforms/embedded6xx/usbgecko_udbg.c:172:6: error: no previous prototype for ‘ug_udbg_putc’
Signed-off-by: Mathieu Malaterre <malat@debian.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
Since the value of `tmp` is never intended to be read, declare both `tmp`
variables as unused. Fix warning (treated as error in W=1):
arch/powerpc/kernel/signal_32.c: In function ‘sys_swapcontext’:
arch/powerpc/kernel/signal_32.c:1048:16: error: variable ‘tmp’ set but not used
arch/powerpc/kernel/signal_32.c: In function ‘sys_debug_setcontext’:
arch/powerpc/kernel/signal_32.c:1234:16: error: variable ‘tmp’ set but not used
Signed-off-by: Mathieu Malaterre <malat@debian.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
The inline keyword was not at the beginning of the function declaration.
Fix the following warning (treated as error in W=1):
arch/powerpc/lib/sstep.c:283:1: error: ‘inline’ is not at beginning of declaration
static int nokprobe_inline copy_mem_in(u8 *dest, unsigned long ea, int nb,
arch/powerpc/lib/sstep.c:388:1: error: ‘inline’ is not at beginning of declaration
static int nokprobe_inline copy_mem_out(u8 *dest, unsigned long ea, int nb,
Signed-off-by: Mathieu Malaterre <malat@debian.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
Fix warning for all register unsigned long (0,3-12) that appear during W=1
compilation:
./arch/powerpc/include/asm/epapr_hcalls.h:479:2: warning: ‘register’ is not at beginning of declaration [-Wold-style-declaration]
unsigned long register r[\d] asm("r[\d]");
Signed-off-by: Mathieu Malaterre <malat@debian.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
On MCE the current code will restart the machine with
ppc_md.restart(). This case was extremely unlikely since
prior to that a skiboot call is made and that resulted in
a checkstop for analysis.
With newer skiboots, on P9 we don't checkstop the box by
default, instead we return back to the kernel to extract
useful information at the time of the MCE. While we still
get this information, this patch converts the restart to
a panic(), so that if configured a dump can be taken and
we can track and probably debug the potential issue causing
the MCE.
Signed-off-by: Balbir Singh <bsingharora@gmail.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Stewart Smith <stewart@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
P9 supports PCI tunneled operations (atomics and as_notify). This
patch adds support for tunneled operations on powernv, with a new
API, to be called by device drivers:
pnv_pci_enable_tunnel()
Enable tunnel operations, tell driver the 16-bit ASN indication
used by kernel.
pnv_pci_disable_tunnel()
Disable tunnel operations.
pnv_pci_set_tunnel_bar()
Tell kernel the Tunnel BAR Response address used by driver.
This function uses two new OPAL calls, as the PBCQ Tunnel BAR
register is configured by skiboot.
pnv_pci_get_as_notify_info()
Return the ASN info of the thread to be woken up.
Signed-off-by: Philippe Bergheaud <felix@linux.vnet.ibm.com>
Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Reviewed-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
When sending TLB invalidates to the NPU we need to send extra flushes due
to a hardware issue. The original implementation would lock the all the
ATSD MMIO registers sequentially before unlocking and relocking each of
them sequentially to do the extra flush.
This introduced a deadlock as it is possible for one thread to hold one
ATSD register whilst waiting for another register to be freed while the
other thread is holding that register waiting for the one in the first
thread to be freed.
For example if there are two threads and two ATSD registers:
Thread A Thread B
----------------------
Acquire 1
Acquire 2
Release 1 Acquire 1
Wait 1 Wait 2
Both threads will be stuck waiting to acquire a register resulting in an
RCU stall warning or soft lockup.
This patch solves the deadlock by refactoring the code to ensure registers
are not released between flushes and to ensure all registers are either
acquired or released together and in order.
Fixes: bbd5ff50afff ("powerpc/powernv/npu-dma: Add explicit flush when sending an ATSD")
Signed-off-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
cpm_cascade() doesn't have to call eoi() as it is already called
by handle_fasteoi_irq()
And cpm_get_irq() will always return an unsigned int so the test
is useless
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
This is left over from the segment table implementation and not getting
called from any where now. Hence just drop it.
Suggested-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
Presently when xmon is disabled by debugfs any existing
instruction/data-access breakpoints set are not disabled. This may
lead to kernel oops when those breakpoints are hit as the necessary
debugger hooks aren't installed.
Hence this patch introduces a new function named clear_all_bpt() which
is called when xmon is disabled via debugfs. The function will
unpatch/clear all the trap and ciabr/dab based breakpoints.
Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
Reviewed-by: Balbir Singh <bsingharora@gmail.com>
[mpe: Fix build break when CONFIG_DEBUG_FS=n]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
Presently sysrq key for xmon('x') is registered during kernel init
irrespective of the value of kernel param 'xmon'. Thus xmon is enabled
even if 'xmon=off' is passed on the kernel command line. However this
doesn't enable the kernel debugger hooks needed for instruction or
data breakpoints. Thus when a break-point is hit with xmon=off a
kernel oops of the form below is reported:
Oops: Exception in kernel mode, sig: 5 [#1]
< snip >
Trace/breakpoint trap
To fix this the patch checks and enables debugger hooks when an
instruction or data break-point is set via xmon console.
Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
Reviewed-by: Balbir Singh <bsingharora@gmail.com>
[mpe: Just printf directly, no need for static const char[]]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
Fix the order of cleanup to ensure we free the name buffer in case
of an error creating 'hvwc' or 'info' files.
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
Remove a bogus line from arch/powerpc/platforms/powernv/Makefile that
was added by commit ece4e51 ("powerpc/vas: Export HVWC to debugfs").
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
This merges two commits from the `kvm-ppc-fixes` branch into next, as
they fix build breaks we are seeing while testing next.
|
|
On the 8xx, the minimum slice size is the size of the area
covered by a single PMD entry, ie 4M in 4K pages mode and 64M in
16K pages mode.
This patch increases the number of slices from 16 to 64 on the 8xx.
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
While the implementation of the "slices" address space allows
a significant amount of high slices, it limits the number of
low slices to 16 due to the use of a single u64 low_slices_psize
element in struct mm_context_t
On the 8xx, the minimum slice size is the size of the area
covered by a single PMD entry, ie 4M in 4K pages mode and 64M in
16K pages mode. This means we could have at least 64 slices.
In order to override this limitation, this patch switches the
handling of low_slices_psize to char array as done already for
high_slices_psize.
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
On the 8xx, the page size is set in the PMD entry and applies to
all pages of the page table pointed by the said PMD entry.
When an app has some regular pages allocated (e.g. see below) and tries
to mmap() a huge page at a hint address covered by the same PMD entry,
the kernel accepts the hint allthough the 8xx cannot handle different
page sizes in the same PMD entry.
10000000-10001000 r-xp 00000000 00:0f 2597 /root/malloc
10010000-10011000 rwxp 00000000 00:0f 2597 /root/malloc
mmap(0x10080000, 524288, PROT_READ|PROT_WRITE,
MAP_PRIVATE|MAP_ANONYMOUS|0x40000, -1, 0) = 0x10080000
This results the app remaining forever in do_page_fault()/hugetlb_fault()
and when interrupting that app, we get the following warning:
[162980.035629] WARNING: CPU: 0 PID: 2777 at arch/powerpc/mm/hugetlbpage.c:354 hugetlb_free_pgd_range+0xc8/0x1e4
[162980.035699] CPU: 0 PID: 2777 Comm: malloc Tainted: G W 4.14.6 #85
[162980.035744] task: c67e2c00 task.stack: c668e000
[162980.035783] NIP: c000fe18 LR: c00e1eec CTR: c00f90c0
[162980.035830] REGS: c668fc20 TRAP: 0700 Tainted: G W (4.14.6)
[162980.035854] MSR: 00029032 <EE,ME,IR,DR,RI> CR: 24044224 XER: 20000000
[162980.036003]
[162980.036003] GPR00: c00e1eec c668fcd0 c67e2c00 00000010 c6869410 10080000 00000000 77fb4000
[162980.036003] GPR08: ffff0001 0683c001 00000000 ffffff80 44028228 10018a34 00004008 418004fc
[162980.036003] GPR16: c668e000 00040100 c668e000 c06c0000 c668fe78 c668e000 c6835ba0 c668fd48
[162980.036003] GPR24: 00000000 73ffffff 74000000 00000001 77fb4000 100fffff 10100000 10100000
[162980.036743] NIP [c000fe18] hugetlb_free_pgd_range+0xc8/0x1e4
[162980.036839] LR [c00e1eec] free_pgtables+0x12c/0x150
[162980.036861] Call Trace:
[162980.036939] [c668fcd0] [c00f0774] unlink_anon_vmas+0x1c4/0x214 (unreliable)
[162980.037040] [c668fd10] [c00e1eec] free_pgtables+0x12c/0x150
[162980.037118] [c668fd40] [c00eabac] exit_mmap+0xe8/0x1b4
[162980.037210] [c668fda0] [c0019710] mmput.part.9+0x20/0xd8
[162980.037301] [c668fdb0] [c001ecb0] do_exit+0x1f0/0x93c
[162980.037386] [c668fe00] [c001f478] do_group_exit+0x40/0xcc
[162980.037479] [c668fe10] [c002a76c] get_signal+0x47c/0x614
[162980.037570] [c668fe70] [c0007840] do_signal+0x54/0x244
[162980.037654] [c668ff30] [c0007ae8] do_notify_resume+0x34/0x88
[162980.037744] [c668ff40] [c000dae8] do_user_signal+0x74/0xc4
[162980.037781] Instruction dump:
[162980.037821] 7fdff378 81370000 54a3463a 80890020 7d24182e 7c841a14 712a0004 4082ff94
[162980.038014] 2f890000 419e0010 712a0ff0 408200e0 <0fe00000> 54a9000a 7f984840 419d0094
[162980.038216] ---[ end trace c0ceeca8e7a5800a ]---
[162980.038754] BUG: non-zero nr_ptes on freeing mm: 1
[162985.363322] BUG: non-zero nr_ptes on freeing mm: -1
In order to fix this, this patch uses the address space "slices"
implemented for BOOK3S/64 and enhanced to support PPC32 by the
preceding patch.
This patch modifies the context.id on the 8xx to be in the range
[1:16] instead of [0:15] in order to identify context.id == 0 as
not initialised contexts as done on BOOK3S
This patch activates CONFIG_PPC_MM_SLICES when CONFIG_HUGETLB_PAGE is
selected for the 8xx
Alltough we could in theory have as many slices as PMD entries, the
current slices implementation limits the number of low slices to 16.
This limitation is not preventing us to fix the initial issue allthough
it is suboptimal. It will be cured in a subsequent patch.
Fixes: 4b91428699477 ("powerpc/8xx: Implement support of hugepages")
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|