summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2008-12-13cpumask: convert struct clock_event_device to cpumask pointers.Rusty Russell
Impact: change calling convention of existing clock_event APIs struct clock_event_timer's cpumask field gets changed to take pointer, as does the ->broadcast function. Another single-patch change. For safety, we BUG_ON() in clockevents_register_device() if it's not set. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Cc: Ingo Molnar <mingo@elte.hu>
2008-12-13cpumask: make irq_set_affinity() take a const struct cpumaskRusty Russell
Impact: change existing irq_chip API Not much point with gentle transition here: the struct irq_chip's setaffinity method signature needs to change. Fortunately, not widely used code, but hits a few architectures. Note: In irq_select_affinity() I save a temporary in by mangling irq_desc[irq].affinity directly. Ingo, does this break anything? (Folded in fix from KOSAKI Motohiro) Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Mike Travis <travis@sgi.com> Reviewed-by: Grant Grundler <grundler@parisc-linux.org> Acked-by: Ingo Molnar <mingo@redhat.com> Cc: ralf@linux-mips.org Cc: grundler@parisc-linux.org Cc: jeremy@xensource.com Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
2008-12-13cpumask: change cpumask_scnprintf, cpumask_parse_user, cpulist_parse, and ↵Rusty Russell
cpulist_scnprintf to take pointers. Impact: change calling convention of existing cpumask APIs Most cpumask functions started with cpus_: these have been replaced by cpumask_ ones which take struct cpumask pointers as expected. These four functions don't have good replacement names; fortunately they're rarely used, so we just change them over. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Mike Travis <travis@sgi.com> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: paulus@samba.org Cc: mingo@redhat.com Cc: tony.luck@intel.com Cc: ralf@linux-mips.org Cc: Greg Kroah-Hartman <gregkh@suse.de> Cc: cl@linux-foundation.org Cc: srostedt@redhat.com
2008-12-13cpumask: centralize cpu_online_map and cpu_possible_mapRusty Russell
Impact: cleanup Each SMP arch defines these themselves. Move them to a central location. Twists: 1) Some archs (m32, parisc, s390) set possible_map to all 1, so we add a CONFIG_INIT_ALL_POSSIBLE for this rather than break them. 2) mips and sparc32 '#define cpu_possible_map phys_cpu_present_map'. Those archs simply have phys_cpu_present_map replaced everywhere. 3) Alpha defined cpu_possible_map to cpu_present_map; this is tricky so I just manipulate them both in sync. 4) IA64, cris and m32r have gratuitous 'extern cpumask_t cpu_possible_map' declarations. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Reviewed-by: Grant Grundler <grundler@parisc-linux.org> Tested-by: Tony Luck <tony.luck@intel.com> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: Mike Travis <travis@sgi.com> Cc: ink@jurassic.park.msu.ru Cc: rmk@arm.linux.org.uk Cc: starvik@axis.com Cc: tony.luck@intel.com Cc: takata@linux-m32r.org Cc: ralf@linux-mips.org Cc: grundler@parisc-linux.org Cc: paulus@samba.org Cc: schwidefsky@de.ibm.com Cc: lethal@linux-sh.org Cc: wli@holomorphy.com Cc: davem@davemloft.net Cc: jdike@addtoit.com Cc: mingo@redhat.com
2008-12-10Revert "radeonfb: accelerate imageblit and other improvements"Linus Torvalds
This reverts commit b1ee26bab14886350ba12a5c10cbc0696ac679bf, along with the "fixes" for it that all just caused problems: - c4c6fa9891f3d1bcaae4f39fb751d5302965b566 "radeonfb: fix problem with color expansion & alignment" - f3179748a157c21d44d929fd3779421ebfbeaa93 "radeonfb: Disable new color expand acceleration unless explicitely enabled" because even when disabled, it breaks for people. See http://bugzilla.kernel.org/show_bug.cgi?id=12191 for the latest example. Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: David S. Miller <davem@davemloft.net> Cc: Krzysztof Halasa <khc@pm.waw.pl> Cc: James Cloos <cloos@jhcloos.com> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: Krzysztof Helt <krzysztof.h1@poczta.fm> Cc: Jean-Luc Coulon <jean.luc.coulon@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-12-10Linux 2.6.28-rc8Linus Torvalds
2008-12-10Merge branch 'sched-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: sched: CPU remove deadlock fix
2008-12-10fix mapping_writably_mapped()Hugh Dickins
Lee Schermerhorn noticed yesterday that I broke the mapping_writably_mapped test in 2.6.7! Bad bad bug, good good find. The i_mmap_writable count must be incremented for VM_SHARED (just as i_writecount is for VM_DENYWRITE, but while holding the i_mmap_lock) when dup_mmap() copies the vma for fork: it has its own more optimal version of __vma_link_file(), and I missed this out. So the count was later going down to 0 (dangerous) when one end unmapped, then wrapping negative (inefficient) when the other end unmapped. The only impact on x86 would have been that setting a mandatory lock on a file which has at some time been opened O_RDWR and mapped MAP_SHARED (but not necessarily PROT_WRITE) across a fork, might fail with -EAGAIN when it should succeed, or succeed when it should fail. But those architectures which rely on flush_dcache_page() to flush userspace modifications back into the page before the kernel reads it, may in some cases have skipped the flush after such a fork - though any repetitive test will soon wrap the count negative, in which case it will flush_dcache_page() unnecessarily. Fix would be a two-liner, but mapping variable added, and comment moved. Reported-by: Lee Schermerhorn <Lee.Schermerhorn@hp.com> Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-12-10Merge branch 'to-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/frob/linux-2.6-roland * 'to-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/frob/linux-2.6-roland: tracehook: exec double-reporting fix
2008-12-10lib/idr.c: Fix bug introduced by RCU fixManfred Spraul
The last patch to lib/idr.c caused a bug if idr_get_new_above() was called on an empty idr. Usually, nodes stay on the same layer. New layers are added to the top of the tree. The exception is idr_get_new_above() on an empty tree: In this case, the new root node is first added on layer 0, then moved upwards. p->layer was not updated. As usual: You shall never rely on the source code comments, they will only mislead you. Signed-off-by: Manfred Spraul <manfred@colorfullife.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-12-10MN10300: Give correct size when reserving interrupt vector tableAkira Takeuchi
Give the correct size when reserving the interrupt vector table. It should be a page not a single byte. Signed-off-by: Akira Takeuchi <takeuchi.akr@jp.panasonic.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-12-10MN10300: Fix __put_user_asm8()Akira Takeuchi
Fix __put_user_asm8() by jumping to the end label (3:) from the exception handler, rather than jumping back to retry the second store instruction (label 2:). Signed-off-by: Akira Takeuchi <takeuchi.akr@jp.panasonic.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-12-10MN10300: Fix the preemption resume_kernel() routineAkira Takeuchi
Fix the preemption resume_kernel() routine by inverting the test to see whether interrupts are off (IM7 is all enabled, not all disabled). Furthermore, interrupts should be disabled on entry to resume_kernel() so that they're correctly set for jumping to restore_all() and doing the need reschedule test. Signed-off-by: Akira Takeuchi <takeuchi.akr@jp.panasonic.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-12-10MN10300: Discard low-priority Tx interrupts when closing an on-chip serial portAkira Takeuchi
Discard low-prioriy Tx interrupts when closing an MN10300 on-chip serial port. The MN10300 on-chip serial port uses three interrupts to manage its serial ports: (1) A very high priority interrupt that drives virtual DMA for Rx. (2) A very high priority interrupt that drives virtual DMA for Tx. (3) A normal priority virtual interrupt that does the normal UART interrupt stuff and is shared between Rx and Tx. mn10300_serial_stop_tx() only disables the high priority Tx interrupt. It doesn't also disable the normal priority one because it is shared with Rx. However, the high priority interrupt may interrupt local_irq_disabled() sections, and so may have queued up a low priority virtual interrupt whilst the UART driver is asking for the Tx interrupt to be disabled. The result of this can be an oops when we try to process the interrupt in mn10300_serial_transmit_interrupt() as port->uart.info and port->uart.info->tty may have gone away. To deal with this, if either of those pointers is NULL, we make sure the high-priority Tx interrupt is disabled and discard the interrupt. The low priority interrupt is disabled by the mn10300_serial_pic irq_chip table. Signed-off-by: Akira Takeuchi <takeuchi.akr@jp.panasonic.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-12-10MN10300: vmlinux.lds.S cleanup - use PAGE_SIZE, PERCPU macrosCyrill Gorcunov
Include the linux/page.h header into the MN10300 kernel linker script thus allowing us to use PAGE_SIZE macro instead of a numeric constant. Also use the PERCPU macro instead of an explicit section definition. Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-12-10Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: api - Disallow cryptomgr as a module if algorithms are built-in
2008-12-10Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: PCIe: ASPM: Break out of endless loop waiting for PCI config bits to switch PCI: stop leaking 'slot_name' in pci_create_slot
2008-12-10Merge branch 'release' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6 * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6: [IA64] SN: prevent IRQ retargetting in request_irq() [IA64] Fix section mismatch ioc3uart_init()/ioc3uart_submodule [IA64] Clear up section mismatch for ioc4_ide_attach_one. [IA64] Clear up section mismatch with arch_unregister_cpu() [IA64] Clear up section mismatch for sn_check_wars. [IA64] Updated the generic_defconfig to work with the 2.6.28-rc7 kernel. [IA64] Fix GRU compile error w/o CONFIG_HUGETLB_PAGE [IA64] eliminate NULL test and memset after alloc_bootmem [IA64] remove BUILD_BUG_ON from paravirt_getreg()
2008-12-10Merge branch 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linusLinus Torvalds
* 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus: MIPS: Better than nothing implementation of PCI mmap to fix X.
2008-12-10pktcdvd: remove broken dev_t export of class devicesKay Sievers
The pktcdvd created class devices only export some sysfs files, but have no char dev_t registered in the driver. At class device creation time they copy the dev_t value of the block device to the char device, wich will register a new char device in the driver core and userspace, with a conflicting dev_t value. In many cases the class devices dev_t just points to a random USB device. This fixes the sysfs "duplicate entry" errors. Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Acked-by: Peter Osterlund <petero2@telia.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-12-10Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6: firewire: fw-ohci: fix IOMMU resource exhaustion ieee1394: node manager causes up to ~3.25s delay in freezing tasks
2008-12-10drivers/video/mb862xx/mb862xxfb.c: fix printkAndrew Morton
sparc64: drivers/video/mb862xx/mb862xxfb.c:929: warning: long long unsigned int format, resource_size_t arg (arg 4) drivers/video/mb862xx/mb862xxfb.c:931: warning: long long unsigned int format, resource_size_t arg (arg 4) We don't know what type the architecture uses to implement u64, hence they cannot be printed. Cc: Anatolij Gustschin <agust@denx.de> Cc: Dmitry Baryshkov <dbaryshkov@gmail.com> Cc: Anton Vorontsov <avorontsov@ru.mvista.com> Cc: Matteo Fortini <m.fortini@selcomgroup.com> Cc: Krzysztof Helt <krzysztof.h1@poczta.fm> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-12-10KSYM_SYMBOL_LEN fixesHugh Dickins
Miles Lane tailing /sys files hit a BUG which Pekka Enberg has tracked to my 966c8c12dc9e77f931e2281ba25d2f0244b06949 sprint_symbol(): use less stack exposing a bug in slub's list_locations() - kallsyms_lookup() writes a 0 to namebuf[KSYM_NAME_LEN-1], but that was beyond the end of page provided. The 100 slop which list_locations() allows at end of page looks roughly enough for all the other stuff it might print after the symbol before it checks again: break out KSYM_SYMBOL_LEN earlier than before. Latencytop and ftrace and are using KSYM_NAME_LEN buffers where they need KSYM_SYMBOL_LEN buffers, and vmallocinfo a 2*KSYM_NAME_LEN buffer where it wants a KSYM_SYMBOL_LEN buffer: fix those before anyone copies them. [akpm@linux-foundation.org: ftrace.h needs module.h] Signed-off-by: Hugh Dickins <hugh@veritas.com> Cc: Christoph Lameter <cl@linux-foundation.org> Cc Miles Lane <miles.lane@gmail.com> Acked-by: Pekka Enberg <penberg@cs.helsinki.fi> Acked-by: Steven Rostedt <srostedt@redhat.com> Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-12-10inotify: fix IN_ONESHOT unmount event watcherDmitri Monakhov
On umount two event will be dispatched to watcher: 1: inotify_dev_queue_event(.., IN_UNMOUNT,..) 2: remove_watch(watch, dev) ->inotify_dev_queue_event(.., IN_IGNORED, ..) But if watcher has IN_ONESHOT bit set then the watcher will be released inside first event. Which result in accessing invalid object later. IMHO it is not pure regression. This bug wasn't triggered while initial inotify interface testing phase because of another bug in IN_ONESHOT handling logic :) commit ac74c00e499ed276a965e5b5600667d5dc04a84a Author: Ulisses Furquim <ulissesf@gmail.com> Date: Fri Feb 8 04:18:16 2008 -0800 inotify: fix check for one-shot watches before destroying them As the IN_ONESHOT bit is never set when an event is sent we must check it in the watch's mask and not in the event's mask. TESTCASE: mkdir mnt mount -ttmpfs none mnt mkdir mnt/d ./inotify mnt/d& umount mnt ## << lockup or crash here TESTSOURCE: /* gcc -oinotify inotify.c */ #include <stdio.h> #include <stdlib.h> #include <sys/inotify.h> int main(int argc, char **argv) { char buf[1024]; struct inotify_event *ie; char *p; int i; ssize_t l; p = argv[1]; i = inotify_init(); inotify_add_watch(i, p, ~0); l = read(i, buf, sizeof(buf)); printf("read %d bytes\n", l); ie = (struct inotify_event *) buf; printf("event mask: %d\n", ie->mask); return 0; } Signed-off-by: Dmitri Monakhov <dmonakhov@openvz.org> Cc: John McCutchan <ttb@tentacle.dhs.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Robert Love <rlove@google.com> Cc: Ulisses Furquim <ulissesf@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-12-10atomic: fix a typo in atomic_long_xchg()Eric Dumazet
atomic_long_xchg() is not correctly defined for 32bit arches. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-12-10mm: no get_user/put_user while holding mmap_sem in do_pages_stat?Brice Goglin
Since commit 2f007e74bb85b9fc4eab28524052161703300f1a, do_pages_stat() gets the page address from user-space and puts the corresponding status back while holding the mmap_sem for read. There is no need to hold mmap_sem there while some page-faults may occur. This patch adds a temporary address and status buffer so as to only hold mmap_sem while working on these kernel buffers. This is implemented by extracting do_pages_stat_array() out of do_pages_stat(). Signed-off-by: Brice Goglin <Brice.Goglin@inria.fr> Cc: Christoph Lameter <clameter@sgi.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Nick Piggin <npiggin@suse.de> Cc: Hugh Dickins <hugh@veritas.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-12-10drivers/serial/s3c2440.c: fix typo in MODULE_LICENSEBalaji Rao
Signed-off-by: Balaji Rao <balajirrao@gmail.com> Acked-by: Ben Dooks <ben-linux@fluff.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-12-10pagemap: fix 32-bit pagemap regressionMatt Mackall
The large pages fix from bcf8039ed45 broke 32-bit pagemap by pulling the pagemap entry code out into a function with the wrong return type. Pagemap entries are 64 bits on all systems and unsigned long is only 32 bits on 32-bit systems. Signed-off-by: Matt Mackall <mpm@selenic.com> Reported-by: Doug Graham <dgraham@nortel.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Dave Hansen <dave@linux.vnet.ibm.com> Cc: <stable@kernel.org> [2.6.26.x, 2.6.27.x] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-12-10page_cgroup should ignore empty nodesKAMEZAWA Hiroyuki
Fix a total bootup freeze on ia64. Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Tested-by: Li Zefan <lizf@cn.fujitsu.com> Reported-by: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-12-10rtc twl4030: rename ioctl function when RTC_INTF_DEV=nRandy Dunlap
Fix build error when RTC_INTF_DEV=n: drivers/rtc/rtc-twl4030.c:402: error: 'twl4030_rtc_ioctl' undeclared here (not in a function) make[3]: *** [drivers/rtc/rtc-twl4030.o] Error 1 Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Cc: David Brownell <dbrownell@users.sourceforge.net> Cc: Tony Lindgren <tony@atomide.com> Cc: Samuel Ortiz <sameo@openedhand.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-12-10fbcon: fix workqueue shutdownGeoff Levand
Add a call to cancel_work_sync() in fbcon_exit() to cancel any pending work in the fbcon workqueue. The current implementation of fbcon_exit() sets the fbcon workqueue function info->queue.func to NULL, but does not assure that there is no work pending when it does so. On occasion, depending on system timing, there will still be pending work in the queue when fbcon_exit() is called. This results in a null pointer deference when run_workqueue() tries to call the queue's work function. Fixes errors on shutdown similar to these: Console: switching to colour dummy device 80x25 Unable to handle kernel paging request for data at address 0x00000000 Signed-off-by: Geoff Levand <geoffrey.levand@am.sony.com> Cc: Krzysztof Helt <krzysztof.h1@poczta.fm> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-12-10mm: remove UP version of lru_add_drain_all()KOSAKI Motohiro
Currently, lru_add_drain_all() has two version. (1) use schedule_on_each_cpu() (2) don't use schedule_on_each_cpu() Gerald Schaefer reported it doesn't work well on SMP (not NUMA) S390 machine. offline_pages() calls lru_add_drain_all() followed by drain_all_pages(). While drain_all_pages() works on each cpu, lru_add_drain_all() only runs on the current cpu for architectures w/o CONFIG_NUMA. This let us run into the BUG_ON(!PageBuddy(page)) in __offline_isolated_pages() during memory hotplug stress test on s390. The page in question was still on the pcp list, because of a race with lru_add_drain_all() and drain_all_pages() on different cpus. Actually, Almost machine has CONFIG_UNEVICTABLE_LRU=y. Then almost machine use (1) version lru_add_drain_all although the machine is UP. Then this ifdef is not valueable. simple removing is better. Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com> Acked-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Cc: Dave Hansen <dave@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-12-10revert "percpu_counter: new function percpu_counter_sum_and_set"Andrew Morton
Revert commit e8ced39d5e8911c662d4d69a342b9d053eaaac4e Author: Mingming Cao <cmm@us.ibm.com> Date: Fri Jul 11 19:27:31 2008 -0400 percpu_counter: new function percpu_counter_sum_and_set As described in revert "percpu counter: clean up percpu_counter_sum_and_set()" the new percpu_counter_sum_and_set() is racy against updates to the cpu-local accumulators on other CPUs. Revert that change. This means that ext4 will be slow again. But correct. Reported-by: Eric Dumazet <dada1@cosmosbay.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mingming Cao <cmm@us.ibm.com> Cc: <linux-ext4@vger.kernel.org> Cc: <stable@kernel.org> [2.6.27.x] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-12-10revert "percpu counter: clean up percpu_counter_sum_and_set()"Andrew Morton
Revert commit 1f7c14c62ce63805f9574664a6c6de3633d4a354 Author: Mingming Cao <cmm@us.ibm.com> Date: Thu Oct 9 12:50:59 2008 -0400 percpu counter: clean up percpu_counter_sum_and_set() Before this patch we had the following: percpu_counter_sum(): return the percpu_counter's value percpu_counter_sum_and_set(): return the percpu_counter's value, copying that value into the central value and zeroing the per-cpu counters before returning. After this patch, percpu_counter_sum_and_set() has gone, and percpu_counter_sum() gets the old percpu_counter_sum_and_set() functionality. Problem is, as Eric points out, the old percpu_counter_sum_and_set() functionality was racy and wrong. It zeroes out counters on "other" cpus, without holding any locks which will prevent races agaist updates from those other CPUS. This patch reverts 1f7c14c62ce63805f9574664a6c6de3633d4a354. This means that percpu_counter_sum_and_set() still has the race, but percpu_counter_sum() does not. Note that this is not a simple revert - ext4 has since started using percpu_counter_sum() for its dirty_blocks counter as well. Note that this revert patch changes percpu_counter_sum() semantics. Before the patch, a call to percpu_counter_sum() will bring the counter's central counter mostly up-to-date, so a following percpu_counter_read() will return a close value. After this patch, a call to percpu_counter_sum() will leave the counter's central accumulator unaltered, so a subsequent call to percpu_counter_read() can now return a significantly inaccurate result. If there is any code in the tree which was introduced after e8ced39d5e8911c662d4d69a342b9d053eaaac4e was merged, and which depends upon the new percpu_counter_sum() semantics, that code will break. Reported-by: Eric Dumazet <dada1@cosmosbay.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mingming Cao <cmm@us.ibm.com> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-12-10percpu_counter: fix CPU unplug race in percpu_counter_destroy()Eric Dumazet
We should first delete the counter from percpu_counters list before freeing memory, or a percpu_counter_hotcpu_callback() could dereference a NULL pointer. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Acked-by: David S. Miller <davem@davemloft.net> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mingming Cao <cmm@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-12-10rtc: fix missing id_table in rtc-ds1672 and rtc-max6900 driversAlessandro Zummo
Add missing id_table to the drivers in subject. Patch is against the latest git. It should go in with 2.6.28 if possible, the drivers won't work without the id_table bits. Signed-off-by: Alessandro Zummo <a.zummo@towertech.it> Reported-by: Imre Kaloz <kaloz@openwrt.org> Tested-by: Imre Kaloz <kaloz@openwrt.org> Cc: David Brownell <david-b@pacbell.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-12-10relayfs: fix infinite loop with splice()Tom Zanussi
Running kmemtraced, which uses splice() on relayfs, causes a hard lock on x86-64 SMP. As described by Tom Zanussi: It looks like you hit the same problem as described here: commit 8191ecd1d14c6914c660dfa007154860a7908857 splice: fix infinite loop in generic_file_splice_read() relay uses the same loop but it never got noticed or fixed. Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Tested-by: Pekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: Tom Zanussi <tzanussi@gmail.com> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-12-10uml: boot broken due to buffer overrunBalbir Singh
mconsole_init() passed 256 bytes as length in os_create_unix_socket, while the sizeof UNIX_PATH_MAX is 108. This patch fixes that problem and avoids a big overrun bug reported on UML bootup. sockaddr_un.sun_path is UNIX_PATH_MAX long which causes the problem. Reported-by: Vikas K Managutte <vikki.km@gmail.com> Reported-by: Sarvesh Kumar Lal Das <skldas@gmail.com> Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com> Reviewed-by: Pekka Enberg <penberg@cs.helsinki.fi> Reviewed-by: WANG Cong <wangcong@zeuux.org> Cc: Jeff Dike <jdike@addtoit.com> Cc: <stable@kernel.org> [please check with Jeff] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-12-10mm/backing-dev.c: remove recently-added WARN_ON()Andrew Morton
On second thoughts, this is just going to disturb people while telling us things which we already knew. Cc: Peter Korsgaard <jacmet@sunsite.dk> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Kay Sievers <kay.sievers@vrfy.org> Cc: David Woodhouse <dwmw2@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-12-10crypto: api - Disallow cryptomgr as a module if algorithms are built-inHerbert Xu
If we have at least one algorithm built-in then it no longer makes sense to have the testing framework, and hence cryptomgr to be a module. It should be either on or off, i.e., built-in or disabled. This just happens to stop a potential runaway modprobe loop that seems to trigger on at least one distro. With fixes from Evgeniy Polyakov. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2008-12-10firewire: fw-ohci: fix IOMMU resource exhaustionStefan Richter
There is a DMA map/ unmap imbalance whenever a block write request packet is sent and then dequeued with ohci_cancel_packet. The latter may happen frequently if the AR resp tasklet is executed before the AT req tasklet for the same transaction. Add the missing dma_unmap_single. This fixes https://bugzilla.redhat.com/show_bug.cgi?id=475156 Reported-by: Emmanuel Kowalski Tested-by: Emmanuel Kowalski Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
2008-12-09tracehook: exec double-reporting fixRoland McGrath
The patch 6341c39 "tracehook: exec" introduced a small regression in 2.6.27 regarding binfmt_misc exec event reporting. Since the reporting is now done in the common search_binary_handler() function, an exec of a misc binary will result in two (or possibly multiple) exec events being reported, instead of just a single one, because the misc handler contains a recursive call to search_binary_handler. To add to the confusion, if PTRACE_O_TRACEEXEC is not active, the multiple SIGTRAP signals will in fact cause only a single ptrace intercept, as the signals are not queued. However, if PTRACE_O_TRACEEXEC is on, the debugger will actually see multiple ptrace intercepts (PTRACE_EVENT_EXEC). The test program included below demonstrates the problem. This change fixes the bug by calling tracehook_report_exec() only in the outermost search_binary_handler() call (bprm->recursion_depth == 0). The additional change to restore bprm->recursion_depth after each binfmt load_binary call is actually superfluous for this bug, since we test the value saved on entry to search_binary_handler(). But it keeps the use of of the depth count to its most obvious expected meaning. Depending on what binfmt handlers do in certain cases, there could have been false-positive tests for recursion limits before this change. /* Test program using PTRACE_O_TRACEEXEC. This forks and exec's the first argument with the rest of the arguments, while ptrace'ing. It expects to see one PTRACE_EVENT_EXEC stop and then a successful exit, with no other signals or events in between. Test for kernel doing two PTRACE_EVENT_EXEC stops for a binfmt_misc exec: $ gcc -g traceexec.c -o traceexec $ sudo sh -c 'echo :test:M::foobar::/bin/cat: > /proc/sys/fs/binfmt_misc/register' $ echo 'foobar test' > ./foobar $ chmod +x ./foobar $ ./traceexec ./foobar; echo $? ==> good <== foobar test 0 $ ==> bad <== foobar test unexpected status 0x4057f != 0 3 $ */ #include <stdio.h> #include <sys/types.h> #include <sys/wait.h> #include <sys/ptrace.h> #include <unistd.h> #include <signal.h> #include <stdlib.h> static void wait_for (pid_t child, int expect) { int status; pid_t p = wait (&status); if (p != child) { perror ("wait"); exit (2); } if (status != expect) { fprintf (stderr, "unexpected status %#x != %#x\n", status, expect); exit (3); } } int main (int argc, char **argv) { pid_t child = fork (); if (child < 0) { perror ("fork"); return 127; } else if (child == 0) { ptrace (PTRACE_TRACEME); raise (SIGUSR1); execv (argv[1], &argv[1]); perror ("execve"); _exit (127); } wait_for (child, W_STOPCODE (SIGUSR1)); if (ptrace (PTRACE_SETOPTIONS, child, 0L, (void *) (long) PTRACE_O_TRACEEXEC) != 0) { perror ("PTRACE_SETOPTIONS"); return 4; } if (ptrace (PTRACE_CONT, child, 0L, 0L) != 0) { perror ("PTRACE_CONT"); return 5; } wait_for (child, W_STOPCODE (SIGTRAP | (PTRACE_EVENT_EXEC << 8))); if (ptrace (PTRACE_CONT, child, 0L, 0L) != 0) { perror ("PTRACE_CONT"); return 6; } wait_for (child, W_EXITCODE (0, 0)); return 0; } Reported-by: Arnd Bergmann <arnd@arndb.de> CC: Ulrich Weigand <ulrich.weigand@de.ibm.com> Signed-off-by: Roland McGrath <roland@redhat.com>
2008-12-09PCIe: ASPM: Break out of endless loop waiting for PCI config bits to switchThomas Renninger
Makes a Compaq 6735s boot reliably again. It used to hang in the loop on some boots. Give the link one second to train, otherwise break out of the loop and reset the previously set clock bits. Cc: stable@vger.kernel.org Signed-off-by: Thomas Renninger <trenn@suse.de> Signed-off-by: Shaohua Li <shaohua.li@intel.com> Signed-off-by: Matthew Garrett <mjg59@srcf.ucam.org> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2008-12-09PCI: stop leaking 'slot_name' in pci_create_slotAlex Chiang
In pci_create_slot(), the local variable 'slot_name' is allocated by make_slot_name(), but never freed. We never use it after passing it to the kobject core, so we should free it upon function exit. Cc: stable@kernel.org Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2008-12-09MIPS: Better than nothing implementation of PCI mmap to fix X.Ralf Baechle
Certain X11 servers such as the SIS server will only work if PCI mmap is implemented. This patch implements PCI mmap but to be on the same side so close to a release it only supports uncached mappings so performance will not be optimal for some uses such as framebuffers. Thanks to Zhang Le <r0bertz@gentoo.org> for the original report and testing. Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2008-12-09[IA64] SN: prevent IRQ retargetting in request_irq()John Keller
With the introduction of the generic affinity autoselector, irq_select_affinity(), IRQs are now being retargetted, using a default mask, via the request_irq() path. This results in all IRQs targetted at CPU 0. SN Altix assigns affinity in the SN PROM, and does not expect that to be changed as part of request_irq(). Set the IRQ_AFFINITY_SET flag to prevent request_irq() from resetting affinity. Signed-off-by: John Keller <jpk@sgi.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
2008-12-09ieee1394: node manager causes up to ~3.25s delay in freezing tasksNigel Cunningham
The firewire nodemanager function "nodemgr_host_thread" contains a loop that calls try_to_freeze near the top of the loop, but then delays for up to 3.25 seconds (plus time to do work) before getting back to the top of the loop. When starting a cycle post-boot, this doesn't seem to bite, but it is causing a noticeable delay at boot time, when freezing processes prior to starting to read the image. The following patch adds invocation of try_to_freeze to the subloops that are used in the body of this function. With these additions, the time to freeze when starting to resume at boot time is virtually zero. I'm no expert on firewire, and so don't know that we shouldn't check the return value and jump back to the top of the loop or such like after being frozen, but I submit it for your consideration. Signed-off-by: Nigel Cunningham <nigel@tuxonice.net> The delay until nodemgr freezes was up to 0.25s (plus time for node probes) in Linux 2.6.27 and older and up to 3.25s (plus ~) since Linux 2.6.28-rc1, hence much more noticeable. try_to_freeze() without any jump is correct. The surrounding code in the respective loops will catch whether another bus reset happens during the freeze and handle it. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
2008-12-09sched: CPU remove deadlock fixBrian King
Impact: fix possible deadlock in CPU hot-remove path This patch fixes a possible deadlock scenario in the CPU remove path. migration_call grabs rq->lock, then wakes up everything on rq->migration_queue with the lock held. Then one of the tasks on the migration queue ends up calling tg_shares_up which then also tries to acquire the same rq->lock. [c000000058eab2e0] c000000000502078 ._spin_lock_irqsave+0x98/0xf0 [c000000058eab370] c00000000008011c .tg_shares_up+0x10c/0x20c [c000000058eab430] c00000000007867c .walk_tg_tree+0xc4/0xfc [c000000058eab4d0] c0000000000840c8 .try_to_wake_up+0xb0/0x3c4 [c000000058eab590] c0000000000799a0 .__wake_up_common+0x6c/0xe0 [c000000058eab640] c00000000007ada4 .complete+0x54/0x80 [c000000058eab6e0] c000000000509fa8 .migration_call+0x5fc/0x6f8 [c000000058eab7c0] c000000000504074 .notifier_call_chain+0x68/0xe0 [c000000058eab860] c000000000506568 ._cpu_down+0x2b0/0x3f4 [c000000058eaba60] c000000000506750 .cpu_down+0xa4/0x108 [c000000058eabb10] c000000000507e54 .store_online+0x44/0xa8 [c000000058eabba0] c000000000396260 .sysdev_store+0x3c/0x50 [c000000058eabc10] c0000000001a39b8 .sysfs_write_file+0x124/0x18c [c000000058eabcd0] c00000000013061c .vfs_write+0xd0/0x1bc [c000000058eabd70] c0000000001308a4 .sys_write+0x68/0x114 [c000000058eabe30] c0000000000086b4 syscall_exit+0x0/0x40 Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-09[IA64] Fix section mismatch ioc3uart_init()/ioc3uart_submoduleTony Luck
s/ioc3uart_submodule/ioc3uart_ops/ makes the section mismatch check happy. Signed-off-by: Tony Luck <tony.luck@intel.com>
2008-12-09[IA64] Clear up section mismatch for ioc4_ide_attach_one.Robin Holt
The generic_defconfig has three section mismatches. This clears up ioc4_ide_attach_one(). Signed-off-by: Robin Holt <holt@sgi.com> Signed-off-by: Mike Reid <mdr@sgi.com> Signed-off-by: Tony Luck <tony.luck@intel.com>