summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2011-01-15Merge branch 'tip/perf/urgent' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into perf/urgent
2011-01-14tracing: Remove syscall_exit_fieldsLai Jiangshan
There is no need for syscall_exit_fields as the syscall exit event class can already host the fields in its structure, like most other trace events do by default. Use that default behavior instead. Following this scheme, we don't need anymore to override the get_fields() callback of the syscall exit event class either. Hence both syscall_exit_fields and syscall_get_exit_fields() can be removed. Also changed some indentation to keep the following under 80 characters: ".fields = LIST_HEAD_INIT(event_class_syscall_exit.fields)," Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> LKML-Reference: <4D301C0E.8090408@cn.fujitsu.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-01-14tracing: Only process module tracepoints onceSteven Rostedt
The commit: 9f987b3141f086de27832514aad9f50a53f754 tracing: Include module.h in define_trace.h only solved half the problem. If the trace/events/module.h header is included at the time of define_trace.h (or in ftrace.h within it), the module.h TRACE_SYSTEM will override the current TRACE_SYSTEM macro. Since define_trace.h is included when CREATE_TRACE_POINTS is set, and the first thing it does is to #undef CREATE_TRACE_POINTS, by placing the module.h TRACE_SYSTEM inside a #ifdef CREATE_TRACE_POINTS we can prevent it from overriding the TRACE_SYSTEM that is being processed, and still process the module.h tracepoints when the module code defines CREATE_TRACE_POINTS and includes the trace/events/module.h header. As with commit 9f987b3141, this is only an issue if module.h is not included before the trace/events/<event>.h file is included, which (luckily) has not happened yet. Reported-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-01-13perf record: Add "nodelay" mode, disabled by defaultKirill Smelkov
Sometimes there is a need to use perf in "live-log" mode. The problem is, for seldom events, actual info output is largely delayed because perf-record reads sample data in whole pages. So for such scenarious, add flag for perf-record to go in "nodelay" mode. To track e.g. what's going on in icmp_rcv while ping is running Use it with something like this: (1) $ perf probe -L icmp_rcv | grep -U8 '^ *43\>' goto error; } 38 if (!pskb_pull(skb, sizeof(*icmph))) goto error; icmph = icmp_hdr(skb); 43 ICMPMSGIN_INC_STATS_BH(net, icmph->type); /* * 18 is the highest 'known' ICMP type. Anything else is a mystery * * RFC 1122: 3.2.2 Unknown ICMP messages types MUST be silently * discarded. */ 50 if (icmph->type > NR_ICMP_TYPES) goto error; $ perf probe icmp_rcv:43 'type=icmph->type' (2) $ cat trace-icmp.py [...] def trace_begin(): print "in trace_begin" def trace_end(): print "in trace_end" def probe__icmp_rcv(event_name, context, common_cpu, common_secs, common_nsecs, common_pid, common_comm, __probe_ip, type): print_header(event_name, common_cpu, common_secs, common_nsecs, common_pid, common_comm) print "__probe_ip=%u, type=%u\n" % \ (__probe_ip, type), [...] (3) $ perf record -a -D -e probe:icmp_rcv -o - | \ perf script -i - -s trace-icmp.py Thanks to Peter Zijlstra for pointing how to do it. Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu>, Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <20110112140613.GA11698@tugrik.mns.mnsspb.ru> Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-01-13perf sched: Fix list of events, dropping unsupported ':r' modifierStephane Eranian
Looks to me like the :r modifier is not supported anymore, so remove it from the list of events. Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Robert Richter <robert.richter@amd.com> LKML-Reference: <AANLkTim=jawJyBj0iFd0r4-LCKzvjFW+NddzJMD5GUB9@mail.gmail.com> Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-01-11Revert "perf tools: Emit clearer message for sys_perf_event_open ENOENT return"Arnaldo Carvalho de Melo
This reverts commit aa7bc7ef73efc46d7c3a0e185eefaf85744aec98. It removed the fallback from hardware profiling to software profiling. .e.g., in a VM with no PMU. Reported-by: David Ahern <daahern@cisco.com> Cc: David Ahern <daahern@cisco.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-01-11perf top: Fix annotate segvArnaldo Carvalho de Melo
Before we had sym_counter, it was initialized to zero and we used that as an index in the global attrs variable, now we have a list of evsel entries, and sym_counter became sym_evsel, that remained initialized to zero (NULL): b00m. Fix it by initializing it to the first entry in the evsel list. Bug-introduced: 69aad6f Reported-by: Kirill Smelkov <kirr@mns.spb.ru> Tested-by: Kirill Smelkov <kirr@mns.spb.ru> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Kirill Smelkov <kirr@mns.spb.ru> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-01-11perf evsel: Fix order of event list deletionArnaldo Carvalho de Melo
We need to defer calling perf_evsel_list__delete() till after atexit registered routines, because we need to traverse the events being recorded at that time at least on 'perf record'. This fixes the problem reported by Thomas Renninger where cmd_record called by cmd_timechart would not write the tracing data to the perf.data file header because the evsel_list at atexit (control+C on 'perf timechart record') time would be empty, being already deleted by run_builtin(), and thus 'perf timechart' when trying to process such perf.data file would die with: "no trace data in the file" Problem introduced in 70d544d. Reported-by: Thomas Renninger <trenn@suse.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Renninger <trenn@suse.de> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-01-10perf session: Fix infinite loop in __perf_session__process_eventsArnaldo Carvalho de Melo
In this if statement: if (head + event->header.size >= mmap_size) { if (mmaps[map_idx]) { munmap(mmaps[map_idx], mmap_size); mmaps[map_idx] = NULL; } page_offset = page_size * (head / page_size); file_offset += page_offset; head -= page_offset; goto remap; } With, for instance, these values: head=2992 event->header.size=48 mmap_size=3040 We end up endlessly looping back to remap. Off by one. Problem introduced in 55b4462. Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Reported-by: Ingo Molnar <mingo@elte.hu> Reported-by: David Ahern <daahern@cisco.com> Bisected-by: David Ahern <daahern@cisco.com> Tested-by: David Ahern <daahern@cisco.com> Cc: David Ahern <daahern@cisco.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-01-10perf evsel: Support perf_evsel__open(cpus > 1 && threads > 1)Arnaldo Carvalho de Melo
And a test for it: [acme@felicio linux]$ perf test 1: vmlinux symtab matches kallsyms: Ok 2: detect open syscall event: Ok 3: detect open syscall event on all cpus: Ok [acme@felicio linux]$ Translating C the test does: 1. generates different number of open syscalls on each CPU by using sched_setaffinity 2. Verifies that the expected number of events is generated on each CPU It works as expected. LKML-Reference: <new-submission> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-01-10perf sched: Use PTHREAD_STACK_MIN to avoid pthread_attr_setstacksize() failJiri Pirko
on ppc64: /usr/include/bits/local_lim.h:#define PTHREAD_STACK_MIN 131072 therefore following set of commands: gives: perf.2.6.37test: builtin-sched.c:493: create_tasks: Assertion `!(err)' failed. So make sure we do not set stack size lower than PTHREAD_STACK_MIN. Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <20110110160417.GB2685@psychotron.brq.redhat.com> Signed-off-by: Jiri Pirko <jpirko@redhat.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-01-10perf tools: Emit clearer message for sys_perf_event_open ENOENT returnArnaldo Carvalho de Melo
Improve sys_perf_event_open ENOENT return handling in top and record, just like 5a3446b does for stat. Cc: David Ahern <daahern@cisco.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-01-10perf stat: better error message for unsupported eventsDavid Ahern
For unsupported events (e.g., H/W events when running in a VM) perf stat currently fails with the error message: Error: open_counter returned with 2 (No such file or directory). /bin/dmesg may provide additional information. Fatal: Not all events could be opened. dmesg is of no help and it is not clear as to why it fails to open the counter. This patch changes the error message to Error: cache-misses event is not supported. Fatal: Not all events could be opened. Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: a.p.zijlstra@chello.nl LPU-Reference: <1294597272-17335-1-git-send-email-daahern@cisco.com> Signed-off-by: David Ahern <daahern@cisco.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-01-10perf sched: Fix allocation result checkArnaldo Carvalho de Melo
Bug introduced in ce47dc56. Reported-by: Mike Galbraith <efault@gmx.de> Cc: Chris Samuel <chris@csamuel.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-01-09Merge branch 'tip/perf/core' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into perf/urgent
2011-01-09perf, x86: P4 PMU - Fix unflagged overflows handlingCyrill Gorcunov
Don found that P4 PMU reads CCCR register instead of counter itself (in attempt to catch unflagged event) this makes P4 NMI handler to consume all NMIs it observes. So the other NMI users such as kgdb simply have no chance to get NMI on their hands. Side note: at moment there is no way to run nmi-watchdog together with perf tool. This is because both 'perf top' and nmi-watchdog use same event. So while nmi-watchdog reserves one event/counter for own needs there is no room for perf tool left (there is a way to disable nmi-watchdog on boot of course). Ming has tested this patch with the following results | 1. watchdog disabled | | kgdb tests on boot OK | perf works OK | | 2. watchdog enabled, without patch perf-x86-p4-nmi-4 | | kgdb tests on boot hang | | 3. watchdog enabled, without patch perf-x86-p4-nmi-4 and do not run kgdb | tests on boot | | "perf top" partialy works | cpu-cycles no | instructions yes | cache-references no | cache-misses no | branch-instructions no | branch-misses yes | bus-cycles no | | 4. watchdog enabled, with patch perf-x86-p4-nmi-4 applied | | kgdb tests on boot OK | perf does not work, NMI "Dazed and confused" messages show up | Which means we still have problems with p4 box due to 'unknown' nmi happens but at least it should fix kgdb test cases. Reported-by: Jason Wessel <jason.wessel@windriver.com> Reported-by: Don Zickus <dzickus@redhat.com> Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Acked-by: Don Zickus <dzickus@redhat.com> Acked-by: Lin Ming <ming.m.lin@intel.com> Cc: Stephane Eranian <eranian@google.com> Cc: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <4D275E7E.3040903@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-01-07dynamic debug: Fix build issue with older gccJason Baron
On older gcc (3.3) dynamic debug fails to compile: include/net/inet_connection_sock.h: In function `inet_csk_reset_xmit_timer': include/net/inet_connection_sock.h:236: error: duplicate label declaration `do_printk' include/net/inet_connection_sock.h:219: error: this is a previous declaration include/net/inet_connection_sock.h:236: error: duplicate label declaration `out' include/net/inet_connection_sock.h:219: error: this is a previous declaration include/net/inet_connection_sock.h:236: error: duplicate label `do_printk' include/net/inet_connection_sock.h:236: error: duplicate label `out' Fix, by reverting the usage of JUMP_LABEL() in dynamic debug for now. Cc: <stable@kernel.org> Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Tested-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Jason Baron <jbaron@redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-01-07tracing: Fix TRACE_EVENT power tracepoint creationMathieu Desnoyers
DEFINE_TRACE should also exist when CONFIG_EVENT_TRACING=n. Otherwise, setting only TRACEPOINTS=y is broken. Acked-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> LKML-Reference: <20101028153117.GA4051@Krystal> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-01-07tracing: Fix preempt count leakLi Zefan
While running my ftrace stress test, this showed up: BUG: sleeping function called from invalid context at mm/mmap.c:233 ... note: cat[3293] exited with preempt_count 1 The bug was introduced by commit 91e86e560d0b3ce4c5fc64fd2bbb99f856a30a4e ("tracing: Fix recursive user stack trace") Cc: <stable@kernel.org> Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> LKML-Reference: <4D0089AC.1020802@cn.fujitsu.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-01-07tracepoint: Add __rcu annotationLai Jiangshan
Add __rcu annotation to : (struct tracepoint)->funcs Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> LKML-Reference: <4D22D4F1.50505@cn.fujitsu.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-01-07tracing: remove duplicate null-pointer check in skb tracepointMathieu Desnoyers
The check for NULL skb in the kfree_skb trace event is a duplicate from the check already done in its only caller, kfree_skb(). Remove this duplicate check. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> LKML-Reference: <20110106175319.GA30610@Krystal> Acked-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: David S. Miller <davem@davemloft.net> CC: Steven Rostedt <rostedt@goodmis.org> CC: Frederic Weisbecker <fweisbec@gmail.com> CC: Ingo Molnar <mingo@elte.hu> CC: Thomas Gleixner <tglx@linutronix.de> CC: Zhaolei <zhaolei@cn.fujitsu.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-01-07tracing/trivial: Add missing comma in TRACE_EVENT commentMathieu Desnoyers
Add missing comma within the TRACE_EVENT() example in tracepoint.h. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> LKML-Reference: <20110106184532.GA2526@Krystal> CC: Frederic Weisbecker <fweisbec@gmail.com> CC: Ingo Molnar <mingo@elte.hu> CC: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-01-07tracing: Include module.h in define_trace.hSteven Rostedt
While doing some developing, Peter Zijlstra and I have found that if a CREATE_TRACE_POINTS include is done before module.h is included, it can break the build. We have been lucky so far that this has not broke the build since module.h is included in almost everything. Reported-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-01-07x86: Save rbp in pt_regs on irq entryFrederic Weisbecker
From the x86_64 low level interrupt handlers, the frame pointer is saved right after the partial pt_regs frame. rbp is not supposed to be part of the irq partial saved registers, but it only requires to extend the pt_regs frame by 8 bytes to do so, plus a tiny stack offset fixup on irq exit. This changes a bit the semantics or get_irq_entry() that is supposed to provide only the value of caller saved registers and the cpu saved frame. However it's a win for unwinders that can walk through stack frames on top of get_irq_regs() snapshots. A noticeable impact is that it makes perf events cpu-clock and task-clock events based callchains working on x86_64. Let's then save rbp into the irq pt_regs. As a result with: perf record -e cpu-clock perf bench sched messaging perf report --stdio Before: 20.94% perf [kernel.kallsyms] [k] lock_acquire | --- lock_acquire | |--44.01%-- __write_nocancel | |--43.18%-- __read | |--6.08%-- fork | create_worker | |--0.88%-- _dl_fixup | |--0.65%-- do_lookup_x | |--0.53%-- __GI___libc_read --4.67%-- [...] After: 19.23% perf [kernel.kallsyms] [k] __lock_acquire | --- __lock_acquire | |--97.74%-- lock_acquire | | | |--21.82%-- _raw_spin_lock | | | | | |--37.26%-- unix_stream_recvmsg | | | sock_aio_read | | | do_sync_read | | | vfs_read | | | sys_read | | | system_call | | | __read | | | | | |--24.09%-- unix_stream_sendmsg | | | sock_aio_write | | | do_sync_write | | | vfs_write | | | sys_write | | | system_call | | | __write_nocancel v2: Fix cfi annotations. Reported-by: Soeren Sandmann Pedersen <sandmann@redhat.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: H. Peter Anvin <hpa@zytor.com Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Stephane Eranian <eranian@google.com> Cc: Jan Beulich <JBeulich@novell.com>
2011-01-07x86, dumpstack: Fix unused variable warningRakib Mullick
In dump_stack function, bp isn't used anymore, which is introduced by commit 9c0729dc8062bed96189bd14ac6d4920f3958743. This patch removes bp completely. Signed-off-by: Rakib Mullick <rakib.mullick@gmail.com> Cc: Soeren Sandmann <sandmann@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: H. Peter Anvin <hpa@zytor.com> LKML-Reference: <AANLkTik9U_Z0WSZ7YjrykER_pBUfPDdgUUmtYx=R74nL@mail.gmail.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2011-01-07x86, NMI: Clean-up default_do_nmi()Don Zickus
Just re-arrange the code a bit to make it easier to follow what is going on. Basically un-negating the if-statement and swapping the code inside the if-statement with code outside. No functional changes. Originally-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Don Zickus <dzickus@redhat.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1294348732-15030-7-git-send-email-dzickus@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-01-07x86, NMI: Allow NMI reason io port (0x61) to be processed on any CPUDon Zickus
In original NMI handler, NMI reason io port (0x61) is only processed on BSP. This makes it impossible to hot-remove BSP. To solve the issue, a raw spinlock is used to allow the port to be processed on any CPU. Originally-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Don Zickus <dzickus@redhat.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1294348732-15030-6-git-send-email-dzickus@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-01-07x86, NMI: Remove DIE_NMI_IPIDon Zickus
With priorities in place and no one really understanding the difference between DIE_NMI and DIE_NMI_IPI, just remove DIE_NMI_IPI and convert everyone to DIE_NMI. This also simplifies default_do_nmi() a little bit. Instead of calling the die_notifier in both the if and else part, just pull it out and call it before the if-statement. This has the side benefit of avoiding a call to the ioport to see if there is an external NMI sitting around until after the (more frequent) internal NMIs are dealt with. Patch-Inspired-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Don Zickus <dzickus@redhat.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1294348732-15030-5-git-send-email-dzickus@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-01-07x86, NMI: Add priorities to handlersDon Zickus
In order to consolidate the NMI die_chain events, we need to setup the priorities for the die notifiers. I started by defining a bunch of common priorities that can be used by the notifier blocks. Then I modified the notifier blocks to use the newly created priorities. Now that the priorities are straightened out, it should be easier to remove the event DIE_NMI_IPI. Signed-off-by: Don Zickus <dzickus@redhat.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1294348732-15030-4-git-send-email-dzickus@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-01-07x86: Convert some devices to use DIE_NMIUNKNOWNDon Zickus
They are a handful of places in the code that register a die_notifier as a catch all in case no claims the NMI. Unfortunately, they trigger on events like DIE_NMI and DIE_NMI_IPI, which depending on when they registered may collide with other handlers that have the ability to determine if the NMI is theirs or not. The function unknown_nmi_error() makes one last effort to walk the die_chain when no one else has claimed the NMI before spitting out messages that the NMI is unknown. This is a better spot for these devices to execute any code without colliding with the other handlers. The two drivers modified are only compiled on x86 arches I believe, so they shouldn't be affected by other arches that may not have DIE_NMIUNKNOWN defined. Signed-off-by: Don Zickus <dzickus@redhat.com> Cc: Russ Anderson <rja@sgi.com> Cc: Corey Minyard <minyard@acm.org> Cc: openipmi-developer@lists.sourceforge.net Cc: dann frazier <dannf@hp.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1294348732-15030-3-git-send-email-dzickus@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-01-07x86, NMI: Add NMI symbol constants and rename memory parity to PCI SERRHuang Ying
Replace the NMI related magic numbers with symbol constants. Memory parity error is only valid for IBM PC-AT, newer machine use bit 7 (0x80) of 0x61 port for PCI SERR. While memory error is usually reported via MCE. So corresponding function name and kernel log string is changed. But on some machines, PCI SERR line is still used to report memory errors. This is used by EDAC, so corresponding EDAC call is reserved. Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Don Zickus <dzickus@redhat.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1294348732-15030-2-git-send-email-dzickus@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-01-07perf_events: Add perf_event_time()Stephane Eranian
Adds perf_event_time() to try and centralize access to event timing and in particular ctx->time. Prepares for cgroup support. Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <4d22059c.122ae30a.5e0e.ffff8b8b@mx.google.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-01-07perf_events: Generalize use of event_filter_match()Stephane Eranian
Replace all occurrences of: event->cpu != -1 && event->cpu == smp_processor_id() by a call to: event_filter_match(event) This makes the code more consistent and will make the cgroup patch smaller. Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <4d220593.2308e30a.48c5.ffff8ae9@mx.google.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-01-07perf_events: Move code around to prepare for cgroupStephane Eranian
In particular this patch move perf_event_exit_task() before cgroup_exit() to allow for cgroup support. The cgroup_exit() function detaches the cgroups attached to a task. Other movements include hoisting some definitions and inlines at the top of perf_event.c Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <4d22058b.cdace30a.4657.ffff95b1@mx.google.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-01-07perf tools: Pass whole attr to event selectorsLin Ming
Since commit 69aad6f1(perf tools: Introduce event selectors), only perf_event_attr::type and ::config are passed to event selector, which makes perf tool not work correctly. For example, PEBS does not work because perf_event_attr::precise_ip is not passed to the syscall. Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <1294369869.20563.19.camel@minggr.sh.intel.com> Signed-off-by: Lin Ming <ming.m.lin@intel.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-01-06perf tools: Build with frame pointerFrederic Weisbecker
It seems that some gcc versions build by default with frame pointers and some others omit them. Just build the tools with frame pointers as the callchains can be an important part of the perf workflow. Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> LKML-Reference: <1294325513-14276-3-git-send-email-fweisbec@gmail.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-01-06perf tools: Fix buffer overflow error when specifying all tracepointsHan Pingtian
I found when specifying all tracepoints with -e to one of subcommand, such as 'stat', the program will trigger a buffer overflow error, like this: *** buffer overflow detected ***: ./perf terminated ======= Backtrace: ========= /lib64/libc.so.6(__fortify_fail+0x37)[0x382cefb2c7] .... The tracepoints are separated by comma, something like this: $ perf stat -a -e `perf list |grep Tracepoint|awk -F'[' '{gsub(/[[:space:]]+/,"",$1);array[FNR]=$1}END{outputs=array[1];for (i=2;i<=FNR;i++){ outputs=outputs "," array[i];};print outputs}'` The root reason of this problem is that store_event_type() is called for all events, and will overflow the 'filename' at: strncat(filename, orgname, strlen(orgname)); This patch fixes it by calling store_event_type() only when the event name has been found. LKML-Reference: <20110106093922.GB6713@hpt.nay.redhat.com> Signed-off-by: Han Pingtian <phan@redhat.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-01-06Merge branch 'x86-mm-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, mm: Initialize initial_page_table before paravirt jumps
2011-01-06Merge branches 'x86-alternatives-for-linus', 'x86-fpu-for-linus', ↵Linus Torvalds
'x86-hwmon-for-linus', 'x86-paravirt-for-linus', 'core-locking-for-linus' and 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-alternatives-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, suspend: Avoid unnecessary smp alternatives switch during suspend/resume * 'x86-fpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86-64, asm: Use fxsaveq/fxrestorq in more places * 'x86-hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, hwmon: Add core threshold notification to therm_throt.c * 'x86-paravirt-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, paravirt: Use native_halt on a halt, not native_safe_halt * 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: locking, lockdep: Convert sprintf_symbol to %pS * 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: irq: Better struct irqaction layout
2011-01-06Merge branch 'x86-uv-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-uv-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, UV, BAU: Extend for more than 16 cpus per socket x86, UV: Fix the effect of extra bits in the hub nodeid register x86, UV: Add common uv_early_read_mmr() function for reading MMRs
2011-01-06Merge branch 'x86-tsc-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-tsc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: Check tsc available/disabled in the delayed init function x86: Improve TSC calibration using a delayed workqueue x86: Make tsc=reliable override boot time stability checks
2011-01-06Merge branch 'x86-security-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-security-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: module: Move RO/NX module protection to after ftrace module update x86: Resume trampoline must be executable x86: Add RO/NX protection for loadable kernel modules x86: Add NX protection for kernel data x86: Fix improper large page preservation
2011-01-06Merge branch 'x86-platform-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, earlyprintk: Move mrst early console to platform/ and fix a typo x86, apbt: Setup affinity for apb timers acting as per-cpu timer ce4100: Add errata fixes for UART on CE4100 x86: platform: Move iris to x86/platform where it belongs x86, mrst: Check platform_device_register() return code x86/platform: Add Eurobraille/Iris power off support x86, mrst: Add explanation for using 1960 as the year offset for vrtc x86, mrst: Fix dependencies of "select INTEL_SCU_IPC" x86, mrst: The shutdown for MRST requires the SCU IPC mechanism x86: Ce4100: Add reboot_fixup() for CE4100 ce4100: Add PCI register emulation for CE4100 x86: Add CE4100 platform support x86: mrst: Set vRTC's IRQ to level trigger type x86: mrst: Add audio driver bindings rtc: Add drivers/rtc/rtc-mrst.c x86: mrst: Add vrtc driver which serves as a wall clock device x86: mrst: Add Moorestown specific reboot/shutdown support x86: mrst: Parse SFI timer table for all timer configs x86/mrst: Add SFI platform device parsing code
2011-01-06Merge branch 'x86-microcode-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-microcode-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, microcode, AMD: Cleanup code a bit x86, microcode, AMD: Replace vmalloc+memset with vzalloc
2011-01-06Merge branch 'x86-mce-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-mce-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: apic, amd: Make firmware bug messages more meaningful mce, amd: Remove goto in threshold_create_device() mce, amd: Add helper functions to setup APIC mce, amd: Shorten local variables mci_misc_{hi,lo} mce, amd: Implement mce_threshold_block_init() helper function
2011-01-06Merge branch 'x86-cpu-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: Fix included-by file reference comments x86, cpu: Only CPU features determine NX capabilities x86, cpu: Call verify_cpu during 32bit CPU startup x86, cpu: Clear XD_DISABLED flag on Intel to regain NX x86, cpu: Rename verify_cpu_64.S to verify_cpu.S
2011-01-06Merge branch 'x86-apic-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: Fix APIC ID sizing bug on larger systems, clean up MAX_APICS confusion x86, acpi: Parse all SRAT cpu entries even above the cpu number limitation x86, acpi: Add MAX_LOCAL_APIC for 32bit x86: io_apic: Split setup_ioapic_ids_from_mpc() x86: io_apic: Fix CONFIG_X86_IO_APIC=n breakage x86: apic: Move probe_nr_irqs_gsi() into ioapic_init_mappings() x86: Allow platforms to force enable apic
2011-01-06Merge branch 'x86-amd-nb-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-amd-nb-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, cacheinfo: Cleanup L3 cache index disable support x86, amd-nb: Cleanup AMD northbridge caching code x86, amd-nb: Complete the rename of AMD NB and related code
2011-01-06Merge branch 'timers-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'timers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: MAINTAINERS: Update timer related entries timers: Use this_cpu_read timerqueue: Make timerqueue_getnext() static inline hrtimer: fix timerqueue conversion flub hrtimers: Convert hrtimers to use timerlist infrastructure timers: Fixup allmodconfig build issue timers: Rename timerlist infrastructure to timerqueue timers: Introduce timerlist infrastructure. hrtimer: Remove stale comment on curr_timer timer: Warn when del_timer_sync() is called in hardirq context timer: Del_timer_sync() can be used in softirq context timer: Make try_to_del_timer_sync() the same on SMP and UP posix-timers: Annotate lock_timer() timer: Permit statically-declared work with deferrable timers time: Use ARRAY_SIZE macro in timecompare.c timer: Initialize the field slack of timer_list timer_list: Remove alignment padding on 64 bit when CONFIG_TIMER_STATS time: Compensate for rounding on odd-frequency clocksources Fix up trivial conflict in MAINTAINERS
2011-01-06Merge branch 'sched-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (30 commits) sched: Change wait_for_completion_*_timeout() to return a signed long sched, autogroup: Fix reference leak sched, autogroup: Fix potential access to freed memory sched: Remove redundant CONFIG_CGROUP_SCHED ifdef sched: Fix interactivity bug by charging unaccounted run-time on entity re-weight sched: Move periodic share updates to entity_tick() printk: Use this_cpu_{read|write} api on printk_pending sched: Make pushable_tasks CONFIG_SMP dependant sched: Add 'autogroup' scheduling feature: automated per session task groups sched: Fix unregister_fair_sched_group() sched: Remove unused argument dest_cpu to migrate_task() mutexes, sched: Introduce arch_mutex_cpu_relax() sched: Add some clock info to sched_debug cpu: Remove incorrect BUG_ON cpu: Remove unused variable sched: Fix UP build breakage sched: Make task dump print all 15 chars of proc comm sched: Update tg->shares after cpu.shares write sched: Allow update_cfs_load() to update global load sched: Implement demand based update_cfs_load() ...