summaryrefslogtreecommitdiff
path: root/arch/x86
AgeCommit message (Collapse)Author
2009-02-17Merge branch 'tracing-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: doc: mmiotrace.txt, buffer size control change trace: mmiotrace to the tracer menu in Kconfig mmiotrace: count events lost due to not recording
2009-02-17Merge branch 'x86-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, vm86: fix preemption bug x86, olpc: fix model detection without OFW x86, hpet: fix for LS21 + HPET = boot hang x86: CPA avoid repeated lazy mmu flush x86: warn if arch_flush_lazy_mmu_cpu is called in preemptible context x86/paravirt: make arch_flush_lazy_mmu/cpu disable preemption x86, pat: fix warn_on_once() while mapping 0-1MB range with /dev/mem x86/cpa: make sure cpa is safe to call in lazy mmu mode x86, ptrace, mm: fix double-free on race
2009-02-17Merge branch 'kvm-updates/2.6.29' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
* 'kvm-updates/2.6.29' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: VMX: Flush volatile msrs before emulating rdmsr KVM: Fix assigned devices circular locking dependency KVM: x86: fix LAPIC pending count calculation KVM: Fix INTx for device assignment KVM: MMU: Map device MMIO as UC in EPT KVM: x86: disable kvmclock on non constant TSC hosts KVM: PIT: fix i8254 pending count read KVM: Fix racy in kvm_free_assigned_irq KVM: Add kvm_arch_sync_events to sync with asynchronize events KVM: mmu_notifiers release method KVM: Avoid using CONFIG_ in userspace visible headers KVM: ia64: fix fp fault/trap handler
2009-02-17x86, rcu: fix strange load average and ksoftirqd behaviorPaul E. McKenney
Damien Wyart reported high ksoftirqd CPU usage (20%) on an otherwise idle system. The function-graph trace Damien provided: > 799.521187 | 1) <idle>-0 | | rcu_check_callbacks() { > 799.521371 | 1) <idle>-0 | | rcu_check_callbacks() { > 799.521555 | 1) <idle>-0 | | rcu_check_callbacks() { > 799.521738 | 1) <idle>-0 | | rcu_check_callbacks() { > 799.521934 | 1) <idle>-0 | | rcu_check_callbacks() { > 799.522068 | 1) ksoftir-2324 | | rcu_check_callbacks() { > 799.522208 | 1) <idle>-0 | | rcu_check_callbacks() { > 799.522392 | 1) <idle>-0 | | rcu_check_callbacks() { > 799.522575 | 1) <idle>-0 | | rcu_check_callbacks() { > 799.522759 | 1) <idle>-0 | | rcu_check_callbacks() { > 799.522956 | 1) <idle>-0 | | rcu_check_callbacks() { > 799.523074 | 1) ksoftir-2324 | | rcu_check_callbacks() { > 799.523214 | 1) <idle>-0 | | rcu_check_callbacks() { > 799.523397 | 1) <idle>-0 | | rcu_check_callbacks() { > 799.523579 | 1) <idle>-0 | | rcu_check_callbacks() { > 799.523762 | 1) <idle>-0 | | rcu_check_callbacks() { > 799.523960 | 1) <idle>-0 | | rcu_check_callbacks() { > 799.524079 | 1) ksoftir-2324 | | rcu_check_callbacks() { > 799.524220 | 1) <idle>-0 | | rcu_check_callbacks() { > 799.524403 | 1) <idle>-0 | | rcu_check_callbacks() { > 799.524587 | 1) <idle>-0 | | rcu_check_callbacks() { > 799.524770 | 1) <idle>-0 | | rcu_check_callbacks() { > [ . . . ] Shows rcu_check_callbacks() being invoked way too often. It should be called once per jiffy, and here it is called no less than 22 times in about 3.5 milliseconds, meaning one call every 160 microseconds or so. Why do we need to call rcu_pending() and rcu_check_callbacks() from the idle loop of 32-bit x86, especially given that no other architecture does this? The following patch removes the call to rcu_pending() and rcu_check_callbacks() from the x86 32-bit idle loop in order to reduce the softirq load on idle systems. Reported-by: Damien Wyart <damien.wyart@free.fr> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-16cpumask: fix powernow-k8: partial revert of ↵Rusty Russell
2fdf66b491ac706657946442789ec644cc317e1a Impact: fix powernow-k8 when acpi=off (or other error). There was a spurious change introduced into powernow-k8 in this patch: so that we try to "restore" the cpus_allowed we never saved. We revert that file. See lkml "[PATCH] x86/powernow: fix cpus_allowed brokage when acpi=off" from Yinghai for the bug report. Cc: Mike Travis <travis@sgi.com> Cc: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Acked-by: Ingo Molnar <mingo@elte.hu>
2009-02-15trace: mmiotrace to the tracer menu in KconfigPekka Paalanen
Impact: cosmetic change in Kconfig menu layout This patch was originally suggested by Peter Zijlstra, but seems it was forgotten. CONFIG_MMIOTRACE and CONFIG_MMIOTRACE_TEST were selectable directly under the Kernel hacking / debugging menu in the kernel configuration system. They were present only for x86 and x86_64. Other tracers that use the ftrace tracing framework are in their own sub-menu. This patch moves the mmiotrace configuration options there. Since the Kconfig file, where the tracer menu is, is not architecture specific, HAVE_MMIOTRACE_SUPPORT is introduced and provided only by x86/x86_64. CONFIG_MMIOTRACE now depends on it. Signed-off-by: Pekka Paalanen <pq@iki.fi> Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-15x86, vm86: fix preemption bugThomas Gleixner
Commit 3d2a71a596bd9c761c8487a2178e95f8a61da083 ("x86, traps: converge do_debug handlers") changed the preemption disable logic of do_debug() so vm86_handle_trap() is called with preemption disabled resulting in: BUG: sleeping function called from invalid context at include/linux/kernel.h:155 in_atomic(): 1, irqs_disabled(): 0, pid: 3005, name: dosemu.bin Pid: 3005, comm: dosemu.bin Tainted: G W 2.6.29-rc1 #51 Call Trace: [<c050d669>] copy_to_user+0x33/0x108 [<c04181f4>] save_v86_state+0x65/0x149 [<c0418531>] handle_vm86_trap+0x20/0x8f [<c064e345>] do_debug+0x15b/0x1a4 [<c064df1f>] debug_stack_correct+0x27/0x2c [<c040365b>] sysenter_do_call+0x12/0x2f BUG: scheduling while atomic: dosemu.bin/3005/0x10000001 Restore the original calling convention and reenable preemption before calling handle_vm86_trap(). Reported-by: Michal Suchanek <hramrach@centrum.cz> Cc: stable@kernel.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-15KVM: VMX: Flush volatile msrs before emulating rdmsrAvi Kivity
Some msrs (notable MSR_KERNEL_GS_BASE) are held in the processor registers and need to be flushed to the vcpu struture before they can be read. This fixes cygwin longjmp() failure on Windows x64. Signed-off-by: Avi Kivity <avi@redhat.com>
2009-02-15KVM: x86: fix LAPIC pending count calculationMarcelo Tosatti
Simplify LAPIC TMCCT calculation by using hrtimer provided function to query remaining time until expiration. Fixes host hang with nested ESX. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-02-15KVM: MMU: Map device MMIO as UC in EPTSheng Yang
Software are not allow to access device MMIO using cacheable memory type, the patch limit MMIO region with UC and WC(guest can select WC using PAT and PCD/PWT). Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-02-15KVM: x86: disable kvmclock on non constant TSC hostsMarcelo Tosatti
This is better. Currently, this code path is posing us big troubles, and we won't have a decent patch in time. So, temporarily disable it. Signed-off-by: Glauber Costa <glommer@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-02-15KVM: PIT: fix i8254 pending count readMarcelo Tosatti
count_load_time assignment is bogus: its supposed to contain what it means, not the expiration time. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-02-15KVM: Fix racy in kvm_free_assigned_irqSheng Yang
In the past, kvm_get_kvm() and kvm_put_kvm() was called in assigned device irq handler and interrupt_work, in order to prevent cancel_work_sync() in kvm_free_assigned_irq got a illegal state when waiting for interrupt_work done. But it's tricky and still got two problems: 1. A bug ignored two conditions that cancel_work_sync() would return true result in a additional kvm_put_kvm(). 2. If interrupt type is MSI, we would got a window between cancel_work_sync() and free_irq(), which interrupt would be injected again... This patch discard the reference count used for irq handler and interrupt_work, and ensure the legal state by moving the free function at the very beginning of kvm_destroy_vm(). And the patch fix the second bug by disable irq before cancel_work_sync(), which may result in nested disable of irq but OK for we are going to free it. Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-02-15KVM: Add kvm_arch_sync_events to sync with asynchronize eventsSheng Yang
kvm_arch_sync_events is introduced to quiet down all other events may happen contemporary with VM destroy process, like IRQ handler and work struct for assigned device. For kvm_arch_sync_events is called at the very beginning of kvm_destroy_vm(), so the state of KVM here is legal and can provide a environment to quiet down other events. Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-02-15KVM: Avoid using CONFIG_ in userspace visible headersAvi Kivity
Kconfig symbols are not available in userspace, and are not stripped by headers-install. Avoid their use by adding #defines in <asm/kvm.h> to suit each architecture. Signed-off-by: Avi Kivity <avi@redhat.com>
2009-02-14x86, olpc: fix model detection without OFWChris Ball
Impact: fix "garbled display, laptop is unusable" bug Commit e51a1ac2dfca9ad869471e88f828281db7e810c0 ("x86, olpc: fix endian bug in openfirmware workaround") breaks model comparison on OLPC; the value 0xc2 needs to be scaled up by olpc_board(). The pre-patch version was wrong, but accidentally worked anyway (big-endian 0xc2 is big enough to satisfy all other board revisions, but little endian 0xc2 is not). Signed-off-by: Chris Ball <cjb@laptop.org> Cc: Andrew Morton <akpm@linux-foundation.org> Acked-by: Andres Salomon <dilinger@queued.net> Cc: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-13x86, hpet: fix for LS21 + HPET = boot hangjohn stultz
Between 2.6.23 and 2.6.24-rc1 a change was made that broke IBM LS21 systems that had the HPET enabled in the BIOS, resulting in boot hangs for x86_64. Specifically commit b8ce33590687888ebb900d09557b8807c4539022, which merges the i386 and x86_64 HPET code. Prior to this commit, when we setup the HPET timers in x86_64, we did the following: hpet_writel(HPET_TN_ENABLE | HPET_TN_PERIODIC | HPET_TN_SETVAL | HPET_TN_32BIT, HPET_T0_CFG); However after the i386/x86_64 HPET merge, we do the following: cfg = hpet_readl(HPET_Tn_CFG(timer)); cfg |= HPET_TN_ENABLE | HPET_TN_PERIODIC | HPET_TN_SETVAL | HPET_TN_32BIT; hpet_writel(cfg, HPET_Tn_CFG(timer)); However on LS21s with HPET enabled in the BIOS, the HPET_T0_CFG register boots with Level triggered interrupts (HPET_TN_LEVEL) enabled. This causes the periodic interrupt to be not so periodic, and that results in the boot time hang I reported earlier in the delay calibration. My fix: Always disable HPET_TN_LEVEL when setting up periodic mode. Signed-off-by: John Stultz <johnstul@us.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-12x86: CPA avoid repeated lazy mmu flushThomas Gleixner
Impact: Flush the lazy MMU only once Pending mmu updates only need to be flushed once to bring the in-memory pagetable state up to date. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-02-12x86: warn if arch_flush_lazy_mmu_cpu is called in preemptible contextThomas Gleixner
Impact: Catch cases where lazy MMU state is active in a preemtible context arch_flush_lazy_mmu_cpu() has been changed to disable preemption so the checks in enter/leave will never trigger. Put the preemtible() check into arch_flush_lazy_mmu_cpu() to catch such cases. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-02-12x86/paravirt: make arch_flush_lazy_mmu/cpu disable preemptionJeremy Fitzhardinge
Impact: avoid access to percpu vars in preempible context They are intended to be used whenever there's the possibility that there's some stale state which is going to be overwritten with a queued update, or to force a state change when we may be in lazy mode. Either way, we could end up calling it with preemption enabled, so wrap the functions in their own little preempt-disable section so they can be safely called in any context (though preemption should never be enabled if we're actually in a lazy state). (Move out of line to avoid #include dependencies.) Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-02-12x86, pat: fix warn_on_once() while mapping 0-1MB range with /dev/memSuresh Siddha
Jeff Mahoney reported: > With Suse's hwinfo tool, on -tip: > WARNING: at arch/x86/mm/pat.c:637 reserve_pfn_range+0x5b/0x26d() reserve_pfn_range() is not tracking the memory range below 1MB as non-RAM and as such is inconsistent with similar checks in reserve_memtype() and free_memtype() Rename the pagerange_is_ram() to pat_pagerange_is_ram() and add the "track legacy 1MB region as non RAM" condition. And also, fix reserve_pfn_range() to return -EINVAL, when the pfn range is RAM. This is to be consistent with this API design. Reported-and-tested-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-12x86/cpa: make sure cpa is safe to call in lazy mmu modeJeremy Fitzhardinge
Impact: fix race leading to crash under KVM and Xen The CPA code may be called while we're in lazy mmu update mode - for example, when using DEBUG_PAGE_ALLOC and doing a slab allocation in an interrupt handler which interrupted a lazy mmu update. In this case, the in-memory pagetable state may be out of date due to pending queued updates. We need to flush any pending updates before inspecting the page table. Similarly, we must explicitly flush any modifications CPA may have made (which comes down to flushing queued operations when flushing the TLB). Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Acked-by: Marcelo Tosatti <mtosatti@redhat.com> Cc: Stable Kernel <stable@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-11Merge branch 'timers-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: timers: fix TIMER_ABSTIME for process wide cpu timers timers: split process wide cpu clocks/timers, fix x86: clean up hpet timer reinit timers: split process wide cpu clocks/timers, remove spurious warning timers: split process wide cpu clocks/timers signal: re-add dead task accumulation stats. x86: fix hpet timer reinit for x86_64 sched: fix nohz load balancer on cpu offline
2009-02-11Merge branch 'x86-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: ptrace, x86: fix the usage of ptrace_fork() i8327: fix outb() parameter order x86: fix math_emu register frame access x86: math_emu info cleanup x86: include correct %gs in a.out core dump x86, vmi: put a missing paravirt_release_pmd in pgd_dtor x86: find nr_irqs_gsi with mp_ioapic_routing x86: add clflush before monitor for Intel 7400 series x86: disable intel_iommu support by default x86: don't apply __supported_pte_mask to non-present ptes x86: fix grammar in user-visible BIOS warning x86/Kconfig.cpu: make Kconfig help readable in the console x86, 64-bit: print DMI info in the oops trace
2009-02-11x86, ptrace, mm: fix double-free on raceMarkus Metzger
Ptrace_detach() races with __ptrace_unlink() if the traced task is reaped while detaching. This might cause a double-free of the BTS buffer. Change the ptrace_detach() path to only do the memory accounting in ptrace_bts_detach() and leave the buffer free to ptrace_bts_untrace() which will be called from __ptrace_unlink(). The fix follows a proposal from Oleg Nesterov. Reported-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Markus Metzger <markus.t.metzger@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-11tracing, x86: fix constraint for parent variableSteven Rostedt
The constraint used for retrieving and restoring the parent function pointer is incorrect. The parent variable is a pointer, and the address of the pointer is modified by the asm statement and not the pointer itself. It is incorrect to pass it in as an output constraint since the asm will never update the pointer. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-10tracing, x86: fix fixup section to return to original codeSteven Rostedt
Impact: fix to prevent a kernel crash on fault If for some reason the pointer to the parent function on the stack takes a fault, the fix up code will not return back to the original faulting code. This can lead to unpredictable results and perhaps even a kernel panic. A fault should not happen, but if it does, we should simply disable the tracer, warn, and continue running the kernel. It should not lead to a kernel crash. Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-02-10i8327: fix outb() parameter orderClemens Ladisch
In i8237A_resume(), when resetting the DMA controller, the parameters to dma_outb() were mixed up. Signed-off-by: Clemens Ladisch <clemens@ladisch.de> [ cleaned up the file a tiny bit. ] Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-10x86: fix math_emu register frame accessTejun Heo
do_device_not_available() is the handler for #NM and it declares that it takes a unsigned long and calls math_emu(), which takes a long argument and surprisingly expects the stack frame starting at the zero argument would match struct math_emu_info, which isn't true regardless of configuration in the current code. This patch makes do_device_not_available() take struct pt_regs like other exception handlers and initialize struct math_emu_info with pointer to it and pass pointer to the math_emu_info to math_emulate() like normal C functions do. This way, unless gcc makes a copy of struct pt_regs in do_device_not_available(), the register frame is correctly accessed regardless of kernel configuration or compiler used. This doesn't fix all math_emu problems but it at least gets it somewhat working. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-09Merge branch 'fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq: [CPUFREQ] powernow-k8: Get transition latency from ACPI _PSS table [CPUFREQ] Make ignore_nice_load setting of ondemand work as expected.
2009-02-09x86: spinlocks: define dummy __raw_spin_is_contendedKyle McMartin
Architectures other than mips and x86 are not using ticket spinlocks. Therefore, the contention on the lock is meaningless, since there is nobody known to be waiting on it (arguably /fairly/ unfair locks). Dummy it out to return 0 on other architectures. Signed-off-by: Kyle McMartin <kyle@redhat.com> Acked-by: Ralf Baechle <ralf@linux-mips.org> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-09x86: math_emu info cleanupTejun Heo
Impact: cleanup * Come on, struct info? s/struct info/struct math_emu_info/ * Use struct pt_regs and kernel_vm86_regs instead of defining its own register frame structure. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-09x86: include correct %gs in a.out core dumpTejun Heo
Impact: dump the correct %gs into a.out core dump aout_dump_thread() read %gs but didn't include it in core dump. Fix it. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-09x86, vmi: put a missing paravirt_release_pmd in pgd_dtorAlok Kataria
Commit 6194ba6ff6ccf8d5c54c857600843c67aa82c407 ("x86: don't special-case pmd allocations as much") made changes to the way we handle pmd allocations, and while doing that it dropped a call to paravirt_release_pd on the pgd page from the pgd_dtor code path. As a result of this missing release, the hypervisor is now unaware of the pgd page being freed, and as a result it ends up tracking this page as a page table page. After this the guest may start using the same page for other purposes, and depending on what use the page is put to, it may result in various performance and/or functional issues ( hangs, reboots). Since this release is only required for VMI, I now release the pgd page from the (vmi)_pgd_free hook. Signed-off-by: Alok N Kataria <akataria@vmware.com> Acked-by: Jeremy Fitzhardinge <jeremy@goop.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: <stable@kernel.org>
2009-02-09x86: find nr_irqs_gsi with mp_ioapic_routingYinghai Lu
Impact: find right nr_irqs_gsi on some systems. One test-system has gap between gsi's: [ 0.000000] ACPI: IOAPIC (id[0x04] address[0xfec00000] gsi_base[0]) [ 0.000000] IOAPIC[0]: apic_id 4, version 0, address 0xfec00000, GSI 0-23 [ 0.000000] ACPI: IOAPIC (id[0x05] address[0xfeafd000] gsi_base[48]) [ 0.000000] IOAPIC[1]: apic_id 5, version 0, address 0xfeafd000, GSI 48-54 [ 0.000000] ACPI: IOAPIC (id[0x06] address[0xfeafc000] gsi_base[56]) [ 0.000000] IOAPIC[2]: apic_id 6, version 0, address 0xfeafc000, GSI 56-62 ... [ 0.000000] nr_irqs_gsi: 38 So nr_irqs_gsi is not right. some irq for MSI will overwrite with io_apic. need to get that with acpi_probe_gsi when acpi io_apic is used Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-09x86: add clflush before monitor for Intel 7400 seriesPallipadi, Venkatesh
For Intel 7400 series CPUs, the recommendation is to use a clflush on the monitored address just before monitor and mwait pair [1]. This clflush makes sure that there are no false wakeups from mwait when the monitored address was recently written to. [1] "MONITOR/MWAIT Recommendations for Intel Xeon Processor 7400 series" section in specification update document of 7400 series http://download.intel.com/design/xeon/specupdt/32033601.pdf Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-07Merge branches 'release', 'asus', 'bugzilla-12450', 'cpuidle', 'debug', ↵Len Brown
'ec', 'misc', 'printk' and 'processor' into release
2009-02-06x86-64: fix int $0x80 -ENOSYS returnRoland McGrath
One of my past fixes to this code introduced a different new bug. When using 32-bit "int $0x80" entry for a bogus syscall number, the return value is not correctly set to -ENOSYS. This only happens when neither syscall-audit nor syscall tracing is enabled (i.e., never seen if auditd ever started). Test program: /* gcc -o int80-badsys -m32 -g int80-badsys.c Run on x86-64 kernel. Note to reproduce the bug you need auditd never to have started. */ #include <errno.h> #include <stdio.h> int main (void) { long res; asm ("int $0x80" : "=a" (res) : "0" (99999)); printf ("bad syscall returns %ld\n", res); return res != -ENOSYS; } The fix makes the int $0x80 path match the sysenter and syscall paths. Reported-by: Dmitry V. Levin <ldv@altlinux.org> Signed-off-by: Roland McGrath <roland@redhat.com>
2009-02-06x86: clean up hpet timer reinitPavel Emelyanov
Implement Linus's suggestion: introduce the hpet_cnt_ahead() helper function to compare hpet time values - like other wrapping counter comparisons are abstracted away elsewhere. (jiffies, ktime_t, etc.) Reported-by: Kirill Korotaev <dev@openvz.org> Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-05prevent kprobes from catching spurious page faultsMasami Hiramatsu
Prevent kprobes from catching spurious faults which will cause infinite recursive page-fault and memory corruption by stack overflow. Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Cc: <stable@kernel.org> [2.6.28.x] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-05[CPUFREQ] powernow-k8: Get transition latency from ACPI _PSS tableMark Langsdorf
At this time, the PowerNow! driver for K8 uses an experimentally derived formula to calculate transition latency. The value it provides is orders of magnitude too large on modern systems. This patch replaces the formula with ACPI _PSS latency values for more accuracy and better performance. I've tested it on two 2nd generation Opteron systems, a 3rd generation Operton system, and a Turion X2 without seeing any stability problems. Signed-off-by: Mark Langsdorf <mark.langsdorf@amd.com> Signed-off-by: Thomas Renninger <trenn@suse.de> Signed-off-by: Dave Jones <davej@redhat.com>
2009-02-05x86: disable intel_iommu support by defaultKyle McMartin
Due to recurring issues with DMAR support on certain platforms. There's a number of filesystem corruption incidents reported: https://bugzilla.redhat.com/show_bug.cgi?id=479996 http://bugzilla.kernel.org/show_bug.cgi?id=12578 Provide a Kconfig option to change whether it is enabled by default. If disabled, it can still be reenabled by passing intel_iommu=on to the kernel. Keep the .config option off by default. Signed-off-by: Kyle McMartin <kyle@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-By: David Woodhouse <David.Woodhouse@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-04x86: don't apply __supported_pte_mask to non-present ptesJeremy Fitzhardinge
On an x86 system which doesn't support global mappings, __supported_pte_mask has _PAGE_GLOBAL clear, to make sure it never appears in the PTE. pfn_pte() and so on will enforce it with: static inline pte_t pfn_pte(unsigned long page_nr, pgprot_t pgprot) { return __pte((((phys_addr_t)page_nr << PAGE_SHIFT) | pgprot_val(pgprot)) & __supported_pte_mask); } However, we overload _PAGE_GLOBAL with _PAGE_PROTNONE on non-present ptes to distinguish them from swap entries. However, applying __supported_pte_mask indiscriminately will clear the bit and corrupt the pte. I guess the best fix is to only apply __supported_pte_mask to present ptes. This seems like the right solution to me, as it means we can completely ignore the issue of overlaps between the present pte bits and the non-present pte-as-swap entry use of the bits. __supported_pte_mask contains the set of flags we support on the current hardware. We also use bits in the pte for things like logically present ptes with no permissions, and swap entries for swapped out pages. We should only apply __supported_pte_mask to present ptes, because otherwise we may destroy other information being stored in the ptes. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-02-05x86: fix grammar in user-visible BIOS warningAlex Chiang
Fix user-visible grammo. Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-05x86: fix hpet timer reinit for x86_64Pavel Emelyanov
There's a small problem with hpet_rtc_reinit function - it checks for the: hpet_readl(HPET_COUNTER) - hpet_t1_cmp > 0 to continue increasing both the HPET_T1_CMP (register) and the hpet_t1_cmp (variable). But since the HPET_COUNTER is always 32-bit, if the hpet_t1_cmp is 64-bit this condition will always be FALSE once the latter hits the 32-bit boundary, and we can have a situation, when we don't increase the HPET_T1_CMP register high enough. The result - timer stops ticking, since HPET_T1_CMP becomes less, than the COUNTER and never increased again. The solution is (based on Linus's suggestion) to not compare 64-bits (on 64-bit x86), but to do the comparison on 32-bit signed integers. Reported-by: Kirill Korotaev <dev@openvz.org> Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-04x86/Kconfig.cpu: make Kconfig help readable in the consoleBorislav Petkov
Impact: cleanup Some lines exceed the 80 char width making them unreadable. Signed-off-by: Borislav Petkov <petkovbb@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-04x86, 64-bit: print DMI info in the oops traceKyle McMartin
This patch echoes what we already do on 32-bit since 90f7d25c6b672137344f447a30a9159945ffea72, and prints the DMI product name in show_regs, so that system specific problems can be easily identified. Signed-off-by: Kyle McMartin <kyle@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-04Merge branch 'core/xen' into x86/urgentIngo Molnar
2009-02-04ACPI: cpufreq: Remove deprecated /proc/acpi/processor/../performance proc ↵Thomas Renninger
entries They were long enough set deprecated... Update Documentation/cpu-freq/users-guide.txt: The deprecated files listed there seen not to exist for some time anymore already. Signed-off-by: Thomas Renninger <trenn@suse.de> Signed-off-by: Len Brown <len.brown@intel.com>
2009-02-03x86: APIC: enable workaround on AMD Fam10h CPUsBorislav Petkov
Impact: fix to enable APIC for AMD Fam10h on chipsets with a missing/b0rked ACPI MP table (MADT) Booting a 32bit kernel on an AMD Fam10h CPU running on chipsets with missing/b0rked MP table leads to a hang pretty early in the boot process due to the APIC not being initialized. Fix that by falling back to the default APIC base address in 32bit code, as it is done in the 64bit codepath. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>