summaryrefslogtreecommitdiff
path: root/arch/x86/platform/uv
AgeCommit message (Collapse)Author
2014-12-10Merge branches 'x86-platform-for-linus' and 'x86-uv-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 platform changes from Ingo Molnar: "A handful of numachip APIC driver updates/fixes, and two small SGI/UV fixes" * 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86: numachip: APIC driver cleanups x86: numachip: Elide self-IPI ICR polling x86: numachip: Fix 16-bit APIC ID truncation * 'x86-uv-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86: UV BAU: Increase maximum CPUs per socket/hub x86: UV BAU: Avoid NULL pointer reference in ptc_seq_show
2014-12-08x86: Replace seq_printf() with seq_puts()Rasmus Villemoes
seq_puts is a lot cheaper than seq_printf, so use that to print literal strings. Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Link: http://lkml.kernel.org/r/1417208622-12264-1-git-send-email-linux@rasmusvillemoes.dk Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-11-03x86: UV BAU: Avoid NULL pointer reference in ptc_seq_showJames Custer
In init_per_cpu(), when get_cpu_topology() fails, init_per_cpu_tunables() is not called afterwards. This means that bau_control->statp is NULL. If a user then reads /proc/sgi_uv/ptc_statistics ptc_seq_show() references a NULL pointer. Therefore, since uv_bau_init calls set_bau_off when init_per_cpu() fails, we add code that detects when the bau is off in ptc_seq_show() to avoid referencing a NULL pointer. Signed-off-by: James Custer <jcuster@sgi.com> Cc: Russ Anderson <rja@sgi.com> Link: http://lkml.kernel.org/r/1414952199-185319-2-git-send-email-jcuster@sgi.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-08-26uv: Replace __get_cpu_varChristoph Lameter
Use __this_cpu_read instead. Cc: Hedi Berriche <hedi@sgi.com> Cc: Mike Travis <travis@sgi.com> Cc: Dimitri Sivanich <sivanich@sgi.com> Signed-off-by: Christoph Lameter <cl@linux.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2014-08-26x86: Replace __get_cpu_var usesChristoph Lameter
__get_cpu_var() is used for multiple purposes in the kernel source. One of them is address calculation via the form &__get_cpu_var(x). This calculates the address for the instance of the percpu variable of the current processor based on an offset. Other use cases are for storing and retrieving data from the current processors percpu area. __get_cpu_var() can be used as an lvalue when writing data or on the right side of an assignment. __get_cpu_var() is defined as : #define __get_cpu_var(var) (*this_cpu_ptr(&(var))) __get_cpu_var() always only does an address determination. However, store and retrieve operations could use a segment prefix (or global register on other platforms) to avoid the address calculation. this_cpu_write() and this_cpu_read() can directly take an offset into a percpu area and use optimized assembly code to read and write per cpu variables. This patch converts __get_cpu_var into either an explicit address calculation using this_cpu_ptr() or into a use of this_cpu operations that use the offset. Thereby address calculations are avoided and less registers are used when code is generated. Transformations done to __get_cpu_var() 1. Determine the address of the percpu instance of the current processor. DEFINE_PER_CPU(int, y); int *x = &__get_cpu_var(y); Converts to int *x = this_cpu_ptr(&y); 2. Same as #1 but this time an array structure is involved. DEFINE_PER_CPU(int, y[20]); int *x = __get_cpu_var(y); Converts to int *x = this_cpu_ptr(y); 3. Retrieve the content of the current processors instance of a per cpu variable. DEFINE_PER_CPU(int, y); int x = __get_cpu_var(y) Converts to int x = __this_cpu_read(y); 4. Retrieve the content of a percpu struct DEFINE_PER_CPU(struct mystruct, y); struct mystruct x = __get_cpu_var(y); Converts to memcpy(&x, this_cpu_ptr(&y), sizeof(x)); 5. Assignment to a per cpu variable DEFINE_PER_CPU(int, y) __get_cpu_var(y) = x; Converts to __this_cpu_write(y, x); 6. Increment/Decrement etc of a per cpu variable DEFINE_PER_CPU(int, y); __get_cpu_var(y)++ Converts to __this_cpu_inc(y) Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86@kernel.org Acked-by: H. Peter Anvin <hpa@linux.intel.com> Acked-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Christoph Lameter <cl@linux.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2014-08-08arch/x86: replace strict_strto callsDaniel Walter
Replace obsolete strict_strto calls with appropriate kstrto calls Signed-off-by: Daniel Walter <dwalter@google.com> Acked-by: Borislav Petkov <bp@suse.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-08-04Merge branch 'x86-uv-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 UV TLB update from Ingo Molnar: "UV TLB shootdown logic updates for version of the UV architecture" * 'x86-uv-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/uv: Update the UV3 TLB shootdown logic
2014-06-05Merge branch 'x86-efi-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into next Pull x86 EFI updates from Peter Anvin: "A collection of EFI changes. The perhaps most important one is to fully save and restore the FPU state around each invocation of EFI runtime, and to not choke on non-ASCII characters in the boot stub" * 'x86-efi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: efivars: Add compatibility code for compat tasks efivars: Refactor sanity checking code into separate function efivars: Stop passing a struct argument to efivar_validate() efivars: Check size of user object efivars: Use local variables instead of a pointer dereference x86/efi: Save and restore FPU context around efi_calls (i386) x86/efi: Save and restore FPU context around efi_calls (x86_64) x86/efi: Implement a __efi_call_virt macro x86, fpu: Extend the use of static_cpu_has_safe x86/efi: Delete most of the efi_call* macros efi: x86: Handle arbitrary Unicode characters efi: Add get_dram_base() helper function efi: Add shared printk wrapper for consistent prefixing efi: create memory map iteration helper efi: efi-stub-helper cleanup
2014-06-05x86/uv: Update the UV3 TLB shootdown logicCliff Wickman
Update of TLB shootdown code for UV3. Kernel function native_flush_tlb_others() calls uv_flush_tlb_others() on UV to invalidate tlb page definitions on remote cpus. The UV systems have a hardware 'broadcast assist unit' which can be used to broadcast shootdown messages to all cpu's of selected nodes. The behavior of the BAU has changed only slightly with UV3: - UV3 is recognized with is_uv3_hub(). - UV2 functions and structures (uv2_xxx) are in most cases simply renamed to uv2_3_xxx. - Some UV2 error workarounds are not needed for UV3. (see uv_bau_message_interrupt and enable_timeouts) Signed-off-by: Cliff Wickman <cpw@sgi.com> Link: http://lkml.kernel.org/r/E1WkgWh-0001yJ-3K@eag09.americas.sgi.com [ Removed a few linebreak uglies. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-06-04Merge branch 'akpm' (patchbomb from Andrew) into nextLinus Torvalds
Merge misc updates from Andrew Morton: - a few fixes for 3.16. Cc'ed to stable so they'll get there somehow. - various misc fixes and cleanups - most of the ocfs2 queue. Review is slow... - most of MM. The MM queue is pretty huge this time, but not much in the way of feature work. - some tweaks under kernel/ - printk maintenance work - updates to lib/ - checkpatch updates - tweaks to init/ * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (276 commits) fs/autofs4/dev-ioctl.c: add __init to autofs_dev_ioctl_init fs/ncpfs/getopt.c: replace simple_strtoul by kstrtoul init/main.c: remove an ifdef kthreads: kill CLONE_KERNEL, change kernel_thread(kernel_init) to avoid CLONE_SIGHAND init/main.c: add initcall_blacklist kernel parameter init/main.c: don't use pr_debug() fs/binfmt_flat.c: make old_reloc() static fs/binfmt_elf.c: fix bool assignements fs/efs: convert printk(KERN_DEBUG to pr_debug fs/efs: add pr_fmt / use __func__ fs/efs: convert printk to pr_foo() scripts/checkpatch.pl: device_initcall is not the only __initcall substitute checkpatch: check stable email address checkpatch: warn on unnecessary void function return statements checkpatch: prefer kstrto<foo> to sscanf(buf, "%<lhuidx>", &bar); checkpatch: add warning for kmalloc/kzalloc with multiply checkpatch: warn on #defines ending in semicolon checkpatch: make --strict a default for files in drivers/net and net/ checkpatch: always warn on missing blank line after variable declaration block checkpatch: fix wildcard DT compatible string checking ...
2014-06-04kernel/printk: use symbolic defines for console loglevelsBorislav Petkov
... instead of naked numbers. Stuff in sysrq.c used to set it to 8 which is supposed to mean above default level so set it to DEBUG instead as we're terminating/killing all tasks and we want to be verbose there. Also, correct the check in x86_64_start_kernel which should be >= as we're clearly issuing the string there for all debug levels, not only the magical 10. Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Kees Cook <keescook@chromium.org> Acked-by: Randy Dunlap <rdunlap@infradead.org> Cc: Joe Perches <joe@perches.com> Cc: Valdis Kletnieks <Valdis.Kletnieks@vt.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-05-16x86: uv: Use irq_alloc/free_hwirq()Thomas Gleixner
No functional change. The request to allocate the irq above NR_IRQS_LEGACY is completely pointless as the implementation enforces that the dynamic allocations are above the GSI interrupts, which includes the legacy PIT irqs. This does not replace the requirement to move x86 to irq domains, but it limits the mess to some degree. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Grant Likely <grant.likely@linaro.org> Cc: Tony Luck <tony.luck@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: x86@kernel.org Link: http://lkml.kernel.org/r/20140507154335.252789823@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-04-17x86/efi: Delete most of the efi_call* macrosMatt Fleming
We really only need one phys and one virt function call, and then only one assembly function to make firmware calls. Since we are not using the C type system anyway, we're not really losing much by deleting the macros apart from no longer having a check that we are passing the correct number of parameters. The lack of duplicated code seems like a worthwhile trade-off. Cc: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> Cc: Borislav Petkov <bp@suse.de> Signed-off-by: Matt Fleming <matt.fleming@intel.com>
2014-01-25x86/uv/nmi, kgdb/kdb: Fix UV NMI handler when KDB not configuredMike Travis
Fix UV call into kgdb to depend only on whether KGDB is defined and not both KGDB and KDB. This allows the power nmi command to use the gdb remote connection if enabled. Note new action of 'kgdb' needs to be set as well to indicate user wants to wait for gdb to be connected. If it's set to 'kdb' then an error message is displayed if KDB is not configured. Also note that if both KGDB and KDB are enabled, then the action of 'kgdb' or 'kdb' has no affect on which is used. See the KGDB documentation for further information. Signed-off-by: Mike Travis <travis@sgi.com> Reviewed-by: Hedi Berriche <hedi@sgi.com> Cc: Russ Anderson <rja@sgi.com> Cc: Jason Wessel <jason.wessel@windriver.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Link: http://lkml.kernel.org/r/20140114162551.635540667@asylum.americas.sgi.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-01-25x86/uv/nmi: Fix Sparse warningsMike Travis
Make uv_register_nmi_notifier() and uv_handle_nmi_ping() static to address sparse warnings. Fix problem where uv_nmi_kexec_failed is unused when CONFIG_KEXEC is not defined. Signed-off-by: Mike Travis <travis@sgi.com> Reviewed-by: Hedi Berriche <hedi@sgi.com> Cc: Russ Anderson <rja@sgi.com> Cc: Jason Wessel <jason.wessel@windriver.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Link: http://lkml.kernel.org/r/20140114162551.480872353@asylum.americas.sgi.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-01-25kgdb/kdb: Fix no KDB config problemMike Travis
Some code added to the debug_core module had KDB dependencies that it shouldn't have. Move the KDB dependent REASON back to the caller to remove the dependency in the debug core code. Update the call from the UV NMI handler to conform to the new interface. Signed-off-by: Mike Travis <travis@sgi.com> Reviewed-by: Hedi Berriche <hedi@sgi.com> Cc: Russ Anderson <rja@sgi.com> Cc: Jason Wessel <jason.wessel@windriver.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Link: http://lkml.kernel.org/r/20140114162551.318251993@asylum.americas.sgi.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-01-13sched/clock, x86: Rewrite cyc2ns() to avoid the need to disable IRQsPeter Zijlstra
Use a ring-buffer like multi-version object structure which allows always having a coherent object; we use this to avoid having to disable IRQs while reading sched_clock() and avoids a problem when getting an NMI while changing the cyc2ns data. MAINLINE PRE POST sched_clock_stable: 1 1 1 (cold) sched_clock: 329841 331312 257223 (cold) local_clock: 301773 310296 309889 (warm) sched_clock: 38375 38247 25280 (warm) local_clock: 100371 102713 85268 (warm) rdtsc: 27340 27289 24247 sched_clock_stable: 0 0 0 (cold) sched_clock: 382634 372706 301224 (cold) local_clock: 396890 399275 399870 (warm) sched_clock: 38194 38124 25630 (warm) local_clock: 143452 148698 129629 (warm) rdtsc: 27345 27365 24307 Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/n/tip-s567in1e5ekq2nlyhn8f987r@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-12-10x86/UV: Fix NULL pointer dereference in uv_flush_tlb_others() if the 'nobau' ↵cpw
boot option is used The SGI UV tlb shootdown code panics the system with a NULL pointer deference if 'nobau' is specified on the boot commandline. uv_flush_tlb_other() gets called for every flush, whether the BAU is disabled or not. It should not be keeping the s_enters statistic while the BAU is disabled. The panic occurs because during initialization init_per_cpu_tunables() does not set the bcp->statp pointer if 'nobau' was specified. Signed-off-by: Cliff Wickman <cpw@sgi.com> Cc: <stable@vger.kernel.org> # 3.12.x Link: http://lkml.kernel.org/r/E1VnzBi-0005yF-MU@eag09.americas.sgi.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-12x86/dumpstack: Fix printk_address for direct addressesJiri Slaby
Consider a kernel crash in a module, simulated the following way: static int my_init(void) { char *map = (void *)0x5; *map = 3; return 0; } module_init(my_init); When we turn off FRAME_POINTERs, the very first instruction in that function causes a BUG. The problem is that we print IP in the BUG report using %pB (from printk_address). And %pB decrements the pointer by one to fix printing addresses of functions with tail calls. This was added in commit 71f9e59800e5ad4 ("x86, dumpstack: Use %pB format specifier for stack trace") to fix the call stack printouts. So instead of correct output: BUG: unable to handle kernel NULL pointer dereference at 0000000000000005 IP: [<ffffffffa01ac000>] my_init+0x0/0x10 [pb173] We get: BUG: unable to handle kernel NULL pointer dereference at 0000000000000005 IP: [<ffffffffa0152000>] 0xffffffffa0151fff To fix that, we use %pS only for stack addresses printouts (via newly added printk_stack_address) and %pB for regs->ip (via printk_address). I.e. we revert to the old behaviour for all except call stacks. And since from all those reliable is 1, we remove that parameter from printk_address. Signed-off-by: Jiri Slaby <jslaby@suse.cz> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: joe@perches.com Cc: jirislaby@gmail.com Link: http://lkml.kernel.org/r/1382706418-8435-1-git-send-email-jslaby@suse.cz Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-11Revert "x86/UV: Add uvtrace support"Ingo Molnar
This reverts commit 8eba18428ac926f436064ac281e76d36d51bd631. uv_trace() is not used by anything, nor is uv_trace_nmi_func, nor uv_trace_func. That's not how we do instrumentation code in the kernel: we add tracepoints, printk()s, etc. so that everyone not just those with magic kernel modules can debug a system. So remove this unused (and misguied) piece of code. Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Mike Travis <travis@sgi.com> Cc: Dimitri Sivanich <sivanich@sgi.com> Cc: Hedi Berriche <hedi@sgi.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Cc: Jason Wessel <jason.wessel@windriver.com> Link: http://lkml.kernel.org/n/tip-tumfBffmr4jmnt8Gyxanoblg@git.kernel.org
2013-10-03x86/UV: Add call to KGDB/KDB from NMI handlerMike Travis
This patch restores the capability to enter KDB (and KGDB) from the UV NMI handler. This is needed because the UV system console is not capable of sending the 'break' signal to the serial console port. It is also useful when the kernel is hung in such a way that it isn't responding to normal external I/O, so sending 'g' to sysreq-trigger does not work either. Another benefit of the external NMI command is that all the cpus receive the NMI signal at roughly the same time so they are more closely aligned timewise. It utilizes the newly added kgdb_nmicallin function to gain entry to KGDB/KDB by the master. The slaves still enter via the standard kgdb_nmicallback function. It also uses the new 'send_ready' pointer to tell KGDB/KDB to signal the slaves when to proceed into the KGDB slave loop. It is enabled when the nmi action is set to "kdb" and the kernel is built with CONFIG_KDB enabled. Note that if kgdb is connected that interface will be used instead. Signed-off-by: Mike Travis <travis@sgi.com> Reviewed-by: Dimitri Sivanich <sivanich@sgi.com> Reviewed-by: Hedi Berriche <hedi@sgi.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Cc: Jason Wessel <jason.wessel@windriver.com> Link: http://lkml.kernel.org/r/20131002151418.089692683@asylum.americas.sgi.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-09-24x86/UV: Check for alloc_cpumask_var() failures properly in uv_nmi_setup()Ingo Molnar
GCC warned about: arch/x86/platform/uv/uv_nmi.c: In function ‘uv_nmi_setup’: arch/x86/platform/uv/uv_nmi.c:664:2: warning: the address of ‘uv_nmi_cpu_mask’ will always evaluate as ‘true’ The reason is this code: alloc_cpumask_var(&uv_nmi_cpu_mask, GFP_KERNEL); BUG_ON(!uv_nmi_cpu_mask); which is not the way to check for alloc_cpumask_var() failures - its return code should be checked instead. Cc: Mike Travis <travis@sgi.com> Link: http://lkml.kernel.org/n/tip-2pXRemsjupmvonbpmmnzleo1@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-09-24x86/UV: Add uvtrace supportMike Travis
This patch adds support for the uvtrace module by providing a skeleton call to the registered trace function. It also provides another separate 'NMI' tracer that is triggered by the system wide 'power nmi' command. Signed-off-by: Mike Travis <travis@sgi.com> Reviewed-by: Dimitri Sivanich <sivanich@sgi.com> Reviewed-by: Hedi Berriche <hedi@sgi.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Cc: Jason Wessel <jason.wessel@windriver.com> Link: http://lkml.kernel.org/r/20130923212501.185052551@asylum.americas.sgi.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-09-24x86/UV: Add kdump to UV NMI handlerMike Travis
If a system has hung and it no longer responds to external events, this patch adds the capability of doing a standard kdump and system reboot then triggered by the system NMI command. It is enabled when the nmi action is changed to "kdump" and the kernel is built with CONFIG_KEXEC enabled. Signed-off-by: Mike Travis <travis@sgi.com> Reviewed-by: Dimitri Sivanich <sivanich@sgi.com> Reviewed-by: Hedi Berriche <hedi@sgi.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Cc: Jason Wessel <jason.wessel@windriver.com> Link: http://lkml.kernel.org/r/20130923212500.660567460@asylum.americas.sgi.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-09-24x86/UV: Add summary of cpu activity to UV NMI handlerMike Travis
The standard NMI handler dumps the states of all the cpus. This includes a full register dump and stack trace. This can be way more information than what is needed. This patch adds a "summary" dump that is basically a form of the "ps" command. It includes the symbolic IP address as well as the command field and basic process information. It is enabled when the nmi action is changed to "ips". Signed-off-by: Mike Travis <travis@sgi.com> Reviewed-by: Dimitri Sivanich <sivanich@sgi.com> Reviewed-by: Hedi Berriche <hedi@sgi.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Cc: Jason Wessel <jason.wessel@windriver.com> Link: http://lkml.kernel.org/r/20130923212500.507922930@asylum.americas.sgi.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-09-24x86/UV: Update UV support for external NMI signalsMike Travis
The current UV NMI handler has not been updated for the changes in the system NMI handler and the perf operations. The UV NMI handler reads an MMR in the UV Hub to check to see if the NMI event was caused by the external 'system NMI' that the operator can initiate on the System Mgmt Controller. The problem arises when the perf tools are running, causing millions of perf events per second on very large CPU count systems. Previously this was okay because the perf NMI handler ran at a higher priority on the NMI call chain and if the NMI was a perf event, it would stop calling other NMI handlers remaining on the NMI call chain. Now the system NMI handler calls all the handlers on the NMI call chain including the UV NMI handler. This causes the UV NMI handler to read the MMRs at the same millions per second rate. This can lead to significant performance loss and possible system failures. It also can cause thousands of 'Dazed and Confused' messages being sent to the system console. This effectively makes perf tools unusable on UV systems. To avoid this excessive overhead when perf tools are running, this code has been optimized to minimize reading of the MMRs as much as possible, by moving to the NMI_UNKNOWN notifier chain. This chain is called only when all the users on the standard NMI_LOCAL call chain have been called and none of them have claimed this NMI. There is an exception where the NMI_LOCAL notifier chain is used. When the perf tools are in use, it's possible that the UV NMI was captured by some other NMI handler and then either ignored or mistakenly processed as a perf event. We set a per_cpu ('ping') flag for those CPUs that ignored the initial NMI, and then send them an IPI NMI signal. The NMI_LOCAL handler on each cpu does not need to read the MMR, but instead checks the in memory flag indicating it was pinged. There are two module variables, 'ping_count' indicating how many requested NMI events occurred, and 'ping_misses' indicating how many stray NMI events. These most likely are perf events so it shows the overhead of the perf NMI interrupts and how many MMR reads were avoided. This patch also minimizes the reads of the MMRs by having the first cpu entering the NMI handler on each node set a per HUB in-memory atomic value. (Having a per HUB value avoids sending lock traffic over NumaLink.) Both types of UV NMIs from the SMI layer are supported. Signed-off-by: Mike Travis <travis@sgi.com> Reviewed-by: Dimitri Sivanich <sivanich@sgi.com> Reviewed-by: Hedi Berriche <hedi@sgi.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Cc: Jason Wessel <jason.wessel@windriver.com> Link: http://lkml.kernel.org/r/20130923212500.353547733@asylum.americas.sgi.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-09-24x86/UV: Move NMI supportMike Travis
This patch moves the UV NMI support from the x2apic file to a new separate uv_nmi.c file in preparation for the next sequence of patches. It prevents upcoming bloat of the x2apic file, and has the added benefit of putting the upcoming /sys/module parameters under the name 'uv_nmi' instead of 'x2apic_uv_x', which was obscure. Signed-off-by: Mike Travis <travis@sgi.com> Reviewed-by: Dimitri Sivanich <sivanich@sgi.com> Reviewed-by: Hedi Berriche <hedi@sgi.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Cc: Jason Wessel <jason.wessel@windriver.com> Link: http://lkml.kernel.org/r/20130923212500.183295611@asylum.americas.sgi.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-03-11x86/platform/uv: Replace kmalloc() & memset with kzalloc()Alexandru Gheorghiu
This was found using coccicheck. Signed-off-by: Alexandru Gheorghiu <gheorghiuandru@gmail.com> Link: http://lkml.kernel.org/r/1362822043-15559-1-git-send-email-gheorghiuandru@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-02-19Merge branch 'x86-uv-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 UV3 support update from Ingo Molnar: "Support for the SGI Ultraviolet System 3 (UV3) platform - the upcoming third major iteration and upscaling of the SGI UV supercomputing platform." * 'x86-uv-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, uv, uv3: Trim MMR register definitions after code changes for SGI UV3 x86, uv, uv3: Check current gru hub support for SGI UV3 x86, uv, uv3: Update Time Support for SGI UV3 x86, uv, uv3: Update x2apic Support for SGI UV3 x86, uv, uv3: Update Hub Info for SGI UV3 x86, uv, uv3: Update ACPI Check to include SGI UV3 x86, uv, uv3: Update MMR register definitions for SGI Ultraviolet System 3 (UV3)
2013-02-19Merge branch 'x86-cleanups-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 cleanup patches from Ingo Molnar: "Misc smaller cleanups" * 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86: ptrace.c only needs export.h and not the full module.h x86, apb_timer: remove unused variable percpu_timer um: don't compare a pointer to 0 arch/x86/platform/uv: use ARRAY_SIZE where possible
2013-02-11x86, uv, uv3: Update Time Support for SGI UV3Mike Travis
This patch updates time support for the SGI UV3 hub. Since the UV2 and UV3 time support is identical, "is_uvx_hub" is used instead of having both "is_uv2_hub" and "is_uv3_hub". Signed-off-by: Mike Travis <travis@sgi.com> Link: http://lkml.kernel.org/r/20130211194508.893907185@gulag1.americas.sgi.com Acked-by: Russ Anderson <rja@sgi.com> Reviewed-by: Dimitri Sivanich <sivanich@sgi.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-01-24arch/x86/platform/uv: Fix incorrect tlb flush all issueAlex Shi
The flush tlb optimization code has logical issue on UV platform. It doesn't flush the full range at all, since it simply ignores its 'end' parameter (and hence also the "all" indicator) in uv_flush_tlb_others() function. Cliff's notes: | I tested the patch on a UV. It has the effect of either | clearing 1 or all TLBs in a cpu. I added some debugging to | test for the cases when clearing all TLBs is overkill, and in | practice it happens very seldom. Reported-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Alex Shi <alex.shi@intel.com> Signed-off-by: Cliff Wickman <cpw@sgi.com> Tested-by: Cliff Wickman <cpw@sgi.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-12-20arch/x86/platform/uv: use ARRAY_SIZE where possibleSasha Levin
Signed-off-by: Sasha Levin <sasha.levin@oracle.com> Link: http://lkml.kernel.org/r/1356030701-16284-26-git-send-email-sasha.levin@oracle.com Cc: Cliff Wickman <cpw@sgi.com> Cc: Alex Shi <alex.shi@intel.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2012-07-26Merge branch 'x86-mm-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86/mm changes from Peter Anvin: "The big change here is the patchset by Alex Shi to use INVLPG to flush only the affected pages when we only need to flush a small page range. It also removes the special INVALIDATE_TLB_VECTOR interrupts (32 vectors!) and replace it with an ordinary IPI function call." Fix up trivial conflicts in arch/x86/include/asm/apic.h (added code next to changed line) * 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/tlb: Fix build warning and crash when building for !SMP x86/tlb: do flush_tlb_kernel_range by 'invlpg' x86/tlb: replace INVALIDATE_TLB_VECTOR by CALL_FUNCTION_VECTOR x86/tlb: enable tlb flush range support for x86 mm/mmu_gather: enable tlb flush range in generic mmu_gather x86/tlb: add tlb_flushall_shift knob into debugfs x86/tlb: add tlb_flushall_shift for specific CPU x86/tlb: fall back to flush all when meet a THP large page x86/flush_tlb: try flush_tlb_single one by one in flush_tlb_range x86/tlb_info: get last level TLB entry number of CPU x86: Add read_mostly declaration/definition to variables from smp.h x86: Define early read-mostly per-cpu macros
2012-07-22Merge branch 'x86-uv-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86/uv changes from Ingo Molnar: "UV2 BAU productization fixes. The BAU (Broadcast Assist Unit) is SGI's fancy out of line way on UV hardware to do TLB flushes, instead of the normal APIC IPI methods. The commits here fix / work around hangs in their latest hardware iteration (UV2). My understanding is that the main purpose of the out of line signalling channel is to improve scalability: the UV APIC hardware glue does not handle broadcasting to many CPUs very well, and this matters most for TLB shootdowns. [ I don't agree with all aspects of the current approach: in hindsight it would have been better to link the BAU at the IPI/APIC driver level instead of the TLB shootdown level, where TLB flushes are really just one of the uses of broadcast SMP messages. Doing that would improve scalability in some other ways and it would also remove a few uglies from the TLB path. It would also be nice to push more is_uv_system() tests into proper x86_init or x86_platform callbacks. Cliff? ]" * 'x86-uv-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/uv: Work around UV2 BAU hangs x86/uv: Implement UV BAU runtime enable and disable control via /proc/sgi_uv/ x86/uv: Fix the UV BAU destination timeout period
2012-07-22Merge branch 'x86-platform-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 platform changes from Ingo Molnar: "This tree mostly involves various APIC driver cleanups/robustization, and vSMP motivated platform callback improvements/cleanups" Fix up trivial conflict due to printk cleanup right next to return value change. * 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (29 commits) Revert "x86/early_printk: Replace obsolete simple_strtoul() usage with kstrtoint()" x86/apic/x2apic: Use multiple cluster members for the irq destination only with the explicit affinity x86/apic/x2apic: Limit the vector reservation to the user specified mask x86/apic: Optimize cpu traversal in __assign_irq_vector() using domain membership x86/vsmp: Fix vector_allocation_domain's return value irq/apic: Use config_enabled(CONFIG_SMP) checks to clean up irq_set_affinity() for UP x86/vsmp: Fix linker error when CONFIG_PROC_FS is not set x86/apic/es7000: Make apicid of a cluster (not CPU) from a cpumask x86/apic/es7000+summit: Always make valid apicid from a cpumask x86/apic/es7000+summit: Fix compile warning in cpu_mask_to_apicid() x86/apic: Fix ugly casting and branching in cpu_mask_to_apicid_and() x86/apic: Eliminate cpu_mask_to_apicid() operation x86/x2apic/cluster: Vector_allocation_domain() should return a value x86/apic/irq_remap: Silence a bogus pr_err() x86/vsmp: Ignore IOAPIC IRQ affinity if possible x86/apic: Make cpu_mask_to_apicid() operations check cpu_online_mask x86/apic: Make cpu_mask_to_apicid() operations return error code x86/apic: Avoid useless scanning thru a cpumask in assign_irq_vector() x86/apic: Try to spread IRQ vectors to different priority levels x86/apic: Factor out default vector_allocation_domain() operation ...
2012-06-27x86/flush_tlb: try flush_tlb_single one by one in flush_tlb_rangeAlex Shi
x86 has no flush_tlb_range support in instruction level. Currently the flush_tlb_range just implemented by flushing all page table. That is not the best solution for all scenarios. In fact, if we just use 'invlpg' to flush few lines from TLB, we can get the performance gain from later remain TLB lines accessing. But the 'invlpg' instruction costs much of time. Its execution time can compete with cr3 rewriting, and even a bit more on SNB CPU. So, on a 512 4KB TLB entries CPU, the balance points is at: (512 - X) * 100ns(assumed TLB refill cost) = X(TLB flush entries) * 100ns(assumed invlpg cost) Here, X is 256, that is 1/2 of 512 entries. But with the mysterious CPU pre-fetcher and page miss handler Unit, the assumed TLB refill cost is far lower then 100ns in sequential access. And 2 HT siblings in one core makes the memory access more faster if they are accessing the same memory. So, in the patch, I just do the change when the target entries is less than 1/16 of whole active tlb entries. Actually, I have no data support for the percentage '1/16', so any suggestions are welcomed. As to hugetlb, guess due to smaller page table, and smaller active TLB entries, I didn't see benefit via my benchmark, so no optimizing now. My micro benchmark show in ideal scenarios, the performance improves 70 percent in reading. And in worst scenario, the reading/writing performance is similar with unpatched 3.4-rc4 kernel. Here is the reading data on my 2P * 4cores *HT NHM EP machine, with THP 'always': multi thread testing, '-t' paramter is thread number: with patch unpatched 3.4-rc4 ./mprotect -t 1 14ns 24ns ./mprotect -t 2 13ns 22ns ./mprotect -t 4 12ns 19ns ./mprotect -t 8 14ns 16ns ./mprotect -t 16 28ns 26ns ./mprotect -t 32 54ns 51ns ./mprotect -t 128 200ns 199ns Single process with sequencial flushing and memory accessing: with patch unpatched 3.4-rc4 ./mprotect 7ns 11ns ./mprotect -p 4096 -l 8 -n 10240 21ns 21ns [ hpa: http://lkml.kernel.org/r/1B4B44D9196EFF41AE41FDA404FC0A100BFF94@SHSMSX101.ccr.corp.intel.com has additional performance numbers. ] Signed-off-by: Alex Shi <alex.shi@intel.com> Link: http://lkml.kernel.org/r/1340845344-27557-3-git-send-email-alex.shi@intel.com Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2012-06-25x86/uv: Work around UV2 BAU hangsCliff Wickman
On SGI's UV2 the BAU (Broadcast Assist Unit) driver can hang under a heavy load. To cure this: - Disable the UV2 extended status mode (see UV2_EXT_SHFT), as this mode changes BAU behavior in more ways then just delivering an extra bit of status. Revert status to just two meaningful bits, like UV1. - Use no IPI-style resets on UV2. Just give up the request for whatever the reason it failed and let it be accomplished with the legacy IPI method. - Use no alternate sending descriptor (the former UV2 workaround bcp->using_desc and handle_uv2_busy() stuff). Just disable the use of the BAU for a period of time in favor of the legacy IPI method when the h/w bug leaves a descriptor busy. -- new tunable: giveup_limit determines the threshold at which a hub is so plugged that it should do all requests with the legacy IPI method for a period of time -- generalize disable_for_congestion() (renamed disable_for_period()) for use whenever a hub should avoid using the BAU for a period of time Also: - Fix find_another_by_swack(), which is part of the UV2 bug workaround - Correct and clarify the statistics (new stats s_overipilimit, s_giveuplimit, s_enters, s_ipifordisabled, s_plugged, s_congested) Signed-off-by: Cliff Wickman <cpw@sgi.com> Link: http://lkml.kernel.org/r/20120622131459.GC31884@sgi.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-06-25x86/uv: Implement UV BAU runtime enable and disable control via /proc/sgi_uv/Cliff Wickman
This patch enables the BAU to be turned on or off dynamically. echo "on" > /proc/sgi_uv/ptc_statistics echo "off" > /proc/sgi_uv/ptc_statistics The system may be booted with or without the nobau option. Whether the system currently has the BAU off can be seen in the /proc file -- normally with the baustats script. Each cpu will have a 1 in the bauoff field if the BAU was turned off, so baustats will give a count of cpus that have it off. Signed-off-by: Cliff Wickman <cpw@sgi.com> Link: http://lkml.kernel.org/r/20120622131330.GB31884@sgi.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-06-25x86/uv: Fix the UV BAU destination timeout periodCliff Wickman
Correct the calculation of a destination timeout period, which is used to distinguish between a destination timeout and the situation where all the target software ack resources are full and a request is returned immediately. The problem is that integer arithmetic was overflowing, yielding a very large result. Without this fix destination timeouts are identified as resource 'plugged' events and an ipi method of resource releasing is unnecessarily employed. Signed-off-by: Cliff Wickman <cpw@sgi.com> Link: http://lkml.kernel.org/r/20120622131212.GA31884@sgi.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-06-15Merge branch 'x86/cleanups' into x86/apicIngo Molnar
Merge in the cleanups because a followup x86/apic change relies on them. Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-06-14x86/apic: Eliminate cpu_mask_to_apicid() operationAlexander Gordeev
Since there are only two locations where cpu_mask_to_apicid() is called from, remove the operation and use only cpu_mask_to_apicid_and() instead. Signed-off-by: Alexander Gordeev <agordeev@redhat.com> Suggested-and-acked-by: Suresh Siddha <suresh.b.siddha@intel.com> Acked-by: Yinghai Lu <yinghai@kernel.org> Link: http://lkml.kernel.org/r/20120614074935.GE3383@dhcp-26-207.brq.redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-06-08x86/uv: Fix UV2 BAU legacy modeCliff Wickman
The SGI Altix UV2 BAU (Broadcast Assist Unit) as used for tlb-shootdown (selective broadcast mode) always uses UV2 broadcast descriptor format. There is no need to clear the 'legacy' (UV1) mode, because the hardware always uses UV2 mode for selective broadcast. But the BIOS uses general broadcast and legacy mode, and the hardware pays attention to the legacy mode bit for general broadcast. So the kernel must not clear that mode bit. Signed-off-by: Cliff Wickman <cpw@sgi.com> Cc: <stable@kernel.org> Link: http://lkml.kernel.org/r/E1SccoO-0002Lh-Cb@eag09.americas.sgi.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-06-08x86/apic: Make cpu_mask_to_apicid() operations return error codeAlexander Gordeev
Current cpu_mask_to_apicid() and cpu_mask_to_apicid_and() implementations have few shortcomings: 1. A value returned by cpu_mask_to_apicid() is written to hardware registers unconditionally. Should BAD_APICID get ever returned it will be written to a hardware too. But the value of BAD_APICID is not universal across all hardware in all modes and might cause unexpected results, i.e. interrupts might get routed to CPUs that are not configured to receive it. 2. Because the value of BAD_APICID is not universal it is counter- intuitive to return it for a hardware where it does not make sense (i.e. x2apic). 3. cpu_mask_to_apicid_and() operation is thought as an complement to cpu_mask_to_apicid() that only applies a AND mask on top of a cpumask being passed. Yet, as consequence of 18374d8 commit the two operations are inconsistent in that of: cpu_mask_to_apicid() should not get a offline CPU with the cpumask cpu_mask_to_apicid_and() should not fail and return BAD_APICID These limitations are impossible to realize just from looking at the operations prototypes. Most of these shortcomings are resolved by returning a error code instead of BAD_APICID. As the result, faults are reported back early rather than possibilities to cause a unexpected behaviour exist (in case of [1]). The only exception is setup_timer_IRQ0_pin() routine. Although obviously controversial to this fix, its existing behaviour is preserved to not break the fragile check_timer() and would better addressed in a separate fix. Signed-off-by: Alexander Gordeev <agordeev@redhat.com> Acked-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Yinghai Lu <yinghai@kernel.org> Link: http://lkml.kernel.org/r/20120607131559.GF4759@dhcp-26-207.brq.redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-05-24x86: Return IRQ_SET_MASK_OK_NOCOPY from irq affinity functionsJiang Liu
The interrupt chip irq_set_affinity() functions copy the affinity mask to irq_data->affinity but return 0, i.e. IRQ_SET_MASK_OK. IRQ_SET_MASK_OK causes the core code to do another redundant copy. Return IRQ_SET_MASK_OK_NOCOPY to avoid this. Signed-off-by: Jiang Liu <jiang.liu@huawei.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Naga Chumbalkar <nagananda.chumbalkar@hp.com> Cc: Jacob Pan <jacob.jun.pan@linux.intel.com> Cc: Cliff Wickman <cpw@sgi.com> Cc: Jiang Liu <liuj97@gmail.com> Cc: Keping Chen <chenkeping@huawei.com> Link: http://lkml.kernel.org/r/1333120296-13563-4-git-send-email-jiang.liu@huawei.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-02-20x86/UV: Lower UV rtc clocksource ratingDimitri Sivanich
Lower the rating of the UV rtc clocksource to just below that of the tsc, to improve performance. Reading the tsc clocksource has lower latency than reading the rtc, so favor it in situations where it is synchronized and stable. When the tsc is unsynchronized, the rtc needs to be the chosen clocksource. Signed-off-by: Dimitri Sivanich <sivanich@sgi.com> Cc: John Stultz <johnstul@us.ibm.com> Cc: Jack Steiner <steiner@sgi.com> Link: http://lkml.kernel.org/r/20120217141641.GA28063@sgi.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
2012-01-26x86/uv: Fix uninitialized spinlocksCliff Wickman
Initialize two spinlocks in tlb_uv.c and also properly define/initialize the uv_irq_lock. The lack of explicit initialization seems to be functionally harmless, but it is diagnosed when these are turned on: CONFIG_DEBUG_SPINLOCK=y CONFIG_DEBUG_MUTEXES=y CONFIG_DEBUG_LOCK_ALLOC=y CONFIG_LOCKDEP=y Signed-off-by: Cliff Wickman <cpw@sgi.com> Cc: <stable@kernel.org> Cc: Dimitri Sivanich <sivanich@sgi.com> Link: http://lkml.kernel.org/r/E1RnXd1-0003wU-PM@eag09.americas.sgi.com [ Added the uv_irq_lock initialization fix by Dimitri Sivanich ] Signed-off-by: Ingo Molnar <mingo@elte.hu>
2012-01-19Merge remote-tracking branch 'linus/master' into x86/urgentH. Peter Anvin
2012-01-17x86/UV2: Add accounting for BAU strong nacksCliff Wickman
This patch adds separate accounting of UV2 message "strong nack's" in the BAU statistics. Signed-off-by: Cliff Wickman <cpw@sgi.com> Link: http://lkml.kernel.org/r/20120116212238.GF5767@sgi.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
2012-01-17x86/UV2: Ack BAU interrupt earlierCliff Wickman
This patch moves the ack of the BAU interrupt to the beginning of the interrupt handler so that there is less possibility of a lost interrupt and slower response to a shootdown message. Signed-off-by: Cliff Wickman <cpw@sgi.com> Link: http://lkml.kernel.org/r/20120116212146.GE5767@sgi.com Signed-off-by: Ingo Molnar <mingo@elte.hu>