summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2013-08-13Merge tag 'msi-3.12-2' into for-3.12/socStephen Warren
pci msi changes for v3.12 (round 2) - fix build breakage for s390 allyesconfig due to !HAVE_GENERIC_HARDIRQS
2013-08-13PCI: msi: add default MSI operations for !HAVE_GENERIC_HARDIRQS platformsThomas Petazzoni
Some platforms (e.g S390) don't use the generic hardirqs code and therefore do not defined HAVE_GENERIC_HARDIRQS. This prevents using the irq_set_chip_data() and irq_get_chip_data() functions that are used for the default implementations of the MSI operations. So, when CONFIG_GENERIC_HARDIRQS is not enabled, provide another default implementation of the MSI operations, that simply errors out. The architecture is responsible for implementing those operations (which is the case on S390), and cannot use the msi_chip infrastructure. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: Jason Cooper <jason@lakedaemon.net>
2013-08-12ARM: tegra: add LP1 suspend support for Tegra114Joseph Lo
The LP1 suspend mode will power off the CPU, clock gated the PLLs and put SDRAM to self-refresh mode. Any interrupt can wake up device from LP1. The sequence when LP1 suspending: * tunning off L1 data cache and the MMU * storing some EMC registers, DPD (deep power down) status, clk source of mselect and SCLK burst policy * putting SDRAM into self-refresh * switching CPU to CLK_M (12MHz OSC) * tunning off PLLM, PLLP, PLLA, PLLC and PLLX * switching SCLK to CLK_S (32KHz OSC) * shutting off the CPU rail The sequence of LP1 resuming: * re-enabling PLLM, PLLP, PLLA, PLLC and PLLX * restoring the clk source of mselect and SCLK burst policy * setting up CCLK burst policy to PLLX * restoring DPD status and some EMC registers * resuming SDRAM to normal mode * jumping to the "tegra_resume" from PMC_SCRATCH41 Due to the SDRAM will be put into self-refresh mode, the low level procedures of LP1 suspending and resuming should be copied to TEGRA_IRAM_CODE_AREA (TEGRA_IRAM_BASE + SZ_4K) when suspending. Before restoring the CPU context when resuming, the SDRAM needs to be switched back to normal mode. And the PLLs need to be re-enabled, SCLK burst policy be restored. Then jumping to "tegra_resume" that was expected to be stored in PMC_SCRATCH41 to restore CPU context and back to kernel. Based on the work by: Bo Yan <byan@nvidia.com> Signed-off-by: Joseph Lo <josephl@nvidia.com> Signed-off-by: Stephen Warren <swarren@nvidia.com>
2013-08-12ARM: tegra: add LP1 suspend support for Tegra20Joseph Lo
The LP1 suspend mode will power off the CPU, clock gated the PLLs and put SDRAM to self-refresh mode. Any interrupt can wake up device from LP1. The sequence when LP1 suspending: * tunning off L1 data cache and the MMU * putting SDRAM into self-refresh * storing some EMC registers and SCLK burst policy * switching CPU to CLK_M (12MHz OSC) * switching SCLK to CLK_S (32KHz OSC) * tunning off PLLM, PLLP and PLLC * shutting off the CPU rail The sequence of LP1 resuming: * re-enabling PLLM, PLLP, and PLLC * restoring some EMC registers and SCLK burst policy * setting up CCLK burst policy to PLLP * resuming SDRAM to normal mode * jumping to the "tegra_resume" from PMC_SCRATCH41 Due to the SDRAM will be put into self-refresh mode, the low level procedures of LP1 suspending and resuming should be copied to TEGRA_IRAM_CODE_AREA (TEGRA_IRAM_BASE + SZ_4K) when suspending. Before restoring the CPU context when resuming, the SDRAM needs to be switched back to normal mode. And the PLLs need to be re-enabled, SCLK burst policy be restored, CCLK burst policy be set in PLLP. Then jumping to "tegra_resume" that was expected to be stored in PMC_SCRATCH41 to restore CPU context and back to kernel. Based on the work by: Colin Cross <ccross@android.com> Gary King <gking@nvidia.com> Signed-off-by: Joseph Lo <josephl@nvidia.com> Signed-off-by: Stephen Warren <swarren@nvidia.com>
2013-08-12ARM: tegra: add LP1 suspend support for Tegra30Joseph Lo
The LP1 suspend mode will power off the CPU, clock gated the PLLs and put SDRAM to self-refresh mode. Any interrupt can wake up device from LP1. The sequence when LP1 suspending: * tunning off L1 data cache and the MMU * storing some EMC registers, DPD (deep power down) status, clk source of mselect and SCLK burst policy * putting SDRAM into self-refresh * switching CPU to CLK_M (12MHz OSC) * tunning off PLLM, PLLP, PLLA, PLLC and PLLX * switching SCLK to CLK_S (32KHz OSC) * shutting off the CPU rail The sequence of LP1 resuming: * re-enabling PLLM, PLLP, PLLA, PLLC and PLLX * restoring the clk source of mselect and SCLK burst policy * setting up CCLK burst policy to PLLX * restoring DPD status and some EMC registers * resuming SDRAM to normal mode * jumping to the "tegra_resume" from PMC_SCRATCH41 Due to the SDRAM will be put into self-refresh mode, the low level procedures of LP1 suspending and resuming should be copied to TEGRA_IRAM_CODE_AREA (TEGRA_IRAM_BASE + SZ_4K) when suspending. Before restoring the CPU context when resuming, the SDRAM needs to be switched back to normal mode. And the PLLs need to be re-enabled, SCLK burst policy be restored, CCLK burst policy be set in PLLX. Then jumping to "tegra_resume" that was expected to be stored in PMC_SCRATCH41 to restore CPU context and back to kernel. Based on the work by: Scott Williams <scwilliams@nvidia.com> Signed-off-by: Joseph Lo <josephl@nvidia.com> Signed-off-by: Stephen Warren <swarren@nvidia.com>
2013-08-12ARM: tegra: add common LP1 suspend supportJoseph Lo
The LP1 suspending mode on Tegra means CPU rail off, devices and PLLs are clock gated and SDRAM in self-refresh mode. That means the low level LP1 suspending and resuming code couldn't be run on DRAM and the CPU must switch to the always on clock domain (a.k.a. CLK_M 12MHz oscillator). And the system clock (SCLK) would be switched to CLK_S, a 32KHz oscillator. The LP1 low level handling code need to be moved to IRAM area first. And marking the LP1 mask for indicating the Tegra device is in LP1. The CPU power timer needs to be re-calculated based on 32KHz that was originally based on PCLK. When resuming from LP1, the LP1 reset handler will resume PLLs and then put DRAM to normal mode. Then jumping to the "tegra_resume" that will restore full context before back to kernel. The "tegra_resume" handler was expected to be found in PMC_SCRATCH41 register. This is common LP1 procedures for Tegra, so we do these jobs mainly in this patch: * moving LP1 low level handling code to IRAM * marking LP1 mask * copying the physical address of "tegra_resume" to PMC_SCRATCH41 * re-calculate the CPU power timer based on 32KHz Signed-off-by: Joseph Lo <josephl@nvidia.com> [swarren, replaced IRAM_CODE macro with IO_ADDRESS(TEGRA_IRAM_CODE_AREA)] Signed-off-by: Stephen Warren <swarren@nvidia.com>
2013-08-12clk: tegra114: add LP1 suspend/resume supportJoseph Lo
When the system suspends to LP1, the CPU clock source is switched to CLK_M (12MHz Oscillator) during suspend/resume flow. The CPU clock source is controlled by the CCLKG_BURST_POLICY register, and hence this register must be restored during LP1 resume. Cc: Mike Turquette <mturquette@linaro.org> Signed-off-by: Joseph Lo <josephl@nvidia.com> Signed-off-by: Stephen Warren <swarren@nvidia.com>
2013-08-12ARM: tegra: config the polarity of the request of sys clockJoseph Lo
When suspending to LP1 mode, the SYSCLK will be clock gated. And different board may have different polarity of the request of SYSCLK, this patch configure the polarity from the DT for the board. Signed-off-by: Joseph Lo <josephl@nvidia.com> Signed-off-by: Stephen Warren <swarren@nvidia.com>
2013-08-12ARM: tegra: add common resume handling code for LP1 resumingJoseph Lo
Add support to the Tegra CPU reset vector to detect whether the CPU is resuming from LP1 suspend state. If it is, branch to the LP1-specific resume code. When Tegra enters the LP1 suspend state, the SDRAM controller is placed into a self-refresh state. For this reason, we must place the LP1 resume code into IRAM, so that it is accessible before SDRAM access has been re-enabled. Signed-off-by: Joseph Lo <josephl@nvidia.com> Signed-off-by: Stephen Warren <swarren@nvidia.com>
2013-08-12ARM: pci: add ->add_bus() and ->remove_bus() hooks to hw_pciThomas Petazzoni
Some PCI drivers may need to adjust the pci_bus structure after it has been allocated by the Linux PCI core. The PCI core allows architectures to implement the pcibios_add_bus() and pcibios_remove_bus() for this purpose. This commit therefore extends the hw_pci and pci_sys_data structures of the ARM PCI core to allow PCI drivers to register ->add_bus() and ->remove_bus() in hw_pci, which will get called when a bus is added or removed from the system. This will be used for example by the Marvell PCIe driver to connect a particular PCI bus with its corresponding MSI chip to handle Message Signaled Interrupts. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Reviewed-by: Thierry Reding <thierry.reding@gmail.com> Acked-by: Russell King <rmk+kernel@arm.linux.org.uk> Tested-by: Daniel Price <daniel.price@gmail.com> Tested-by: Thierry Reding <thierry.reding@gmail.com> Signed-off-by: Jason Cooper <jason@lakedaemon.net>
2013-08-12of: pci: add registry of MSI chipsThomas Petazzoni
This commit adds a very basic registry of msi_chip structures, so that an IRQ controller driver can register an msi_chip, and a PCIe host controller can find it, based on a 'struct device_node'. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: Rob Herring <rob.herring@calxeda.com> Signed-off-by: Jason Cooper <jason@lakedaemon.net>
2013-08-12PCI: Introduce new MSI chip infrastructureThierry Reding
The new struct msi_chip is used to associated an MSI controller with a PCI bus. It is automatically handed down from the root to its children during bus enumeration. This patch provides default (weak) implementations for the architecture- specific MSI functions (arch_setup_msi_irq(), arch_teardown_msi_irq() and arch_msi_check_device()) which check if a PCI device's bus has an attached MSI chip and forward the call appropriately. Signed-off-by: Thierry Reding <thierry.reding@avionic-design.de> Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Daniel Price <daniel.price@gmail.com> Tested-by: Thierry Reding <thierry.reding@gmail.com> Signed-off-by: Jason Cooper <jason@lakedaemon.net>
2013-08-12PCI: remove ARCH_SUPPORTS_MSI kconfig optionThomas Petazzoni
Now that we have weak versions for each of the PCI MSI architecture functions, we can actually build the MSI support for all platforms, regardless of whether they provide or not architecture-specific versions of those functions. For this reason, the ARCH_SUPPORTS_MSI hidden kconfig boolean becomes useless, and this patch gets rid of it. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Tested-by: Daniel Price <daniel.price@gmail.com> Tested-by: Thierry Reding <thierry.reding@gmail.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: linuxppc-dev@lists.ozlabs.org Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: linux390@de.ibm.com Cc: linux-s390@vger.kernel.org Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: x86@kernel.org Cc: Russell King <linux@arm.linux.org.uk> Cc: Tony Luck <tony.luck@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: linux-ia64@vger.kernel.org Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: David S. Miller <davem@davemloft.net> Cc: sparclinux@vger.kernel.org Cc: Chris Metcalf <cmetcalf@tilera.com> Signed-off-by: Jason Cooper <jason@lakedaemon.net>
2013-08-12PCI: use weak functions for MSI arch-specific functionsThomas Petazzoni
Until now, the MSI architecture-specific functions could be overloaded using a fairly complex set of #define and compile-time conditionals. In order to prepare for the introduction of the msi_chip infrastructure, it is desirable to switch all those functions to use the 'weak' mechanism. This commit converts all the architectures that were overidding those MSI functions to use the new strategy. Note that we keep two separate, non-weak, functions default_teardown_msi_irqs() and default_restore_msi_irqs() for the default behavior of the arch_teardown_msi_irqs() and arch_restore_msi_irqs(), as the default behavior is needed by x86 PCI code. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Tested-by: Daniel Price <daniel.price@gmail.com> Tested-by: Thierry Reding <thierry.reding@gmail.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: linuxppc-dev@lists.ozlabs.org Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: linux390@de.ibm.com Cc: linux-s390@vger.kernel.org Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: x86@kernel.org Cc: Russell King <linux@arm.linux.org.uk> Cc: Tony Luck <tony.luck@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: linux-ia64@vger.kernel.org Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: David S. Miller <davem@davemloft.net> Cc: sparclinux@vger.kernel.org Cc: Chris Metcalf <cmetcalf@tilera.com> Signed-off-by: Jason Cooper <jason@lakedaemon.net>
2013-08-08ARM: tegra: unify Tegra's Kconfig a bit moreStephen Warren
Move all common select clauses from ARCH_TEGRA_*_SOC to ARCH_TEGRA to eliminate duplication. The USB-related selects all should have been common too, but were missing from Tegra114 previously. Move these to ARCH_TEGRA too. The latter fixes a build break when only Tegra114 support was enabled, but not Tegra20 or Tegra30 support. Signed-off-by: Stephen Warren <swarren@nvidia.com>
2013-07-19ARM: tegra: remove the limitation that Tegra114 can't support suspendJoseph Lo
The Tegra114 can support suspend function now, removing the limitation. Signed-off-by: Joseph Lo <josephl@nvidia.com> Signed-off-by: Stephen Warren <swarren@nvidia.com>
2013-07-19clk: tegra: add suspend/resume function for tegra_cpu_car_opsJoseph Lo
Adding suspend/resume function for tegra_cpu_car_ops. We only save and restore the setting of the clock of CoreSight. Other clocks still need to be taken care by clock driver. Cc: Mike Turquette <mturquette@linaro.org> Signed-off-by: Joseph Lo <josephl@nvidia.com> Signed-off-by: Stephen Warren <swarren@nvidia.com>
2013-07-19ARM: tegra: flowctrl: add support for cpu_suspend_enter/exitJoseph Lo
The flow controller can help CPU to go into suspend mode (powered-down state). When CPU goes into powered-down state, it needs some careful settings before getting into and after leaving. The enter and exit functions do that by configuring appropriate mode for flow controller. For Tegra114, the setting is compatible with Tegra30. Signed-off-by: Joseph Lo <josephl@nvidia.com> Signed-off-by: Stephen Warren <swarren@nvidia.com>
2013-07-19ARM: tegra: hook tegra_tear_down_cpu functionJoseph Lo
Hooking tegra_tear_down_cpu for Tegra114 for supporting cluster power down when CPU cluster suspneded in LP2. Signed-off-by: Joseph Lo <josephl@nvidia.com> Signed-off-by: Stephen Warren <swarren@nvidia.com>
2013-07-19ARM: tegra: shut off the CPU rail when the last CPU in suspendJoseph Lo
When the last CPU core in suspend, the CPU power rail can be turned off by setting flags to flow controller. Then the flow controller will inform PMC to turn off the CPU rail when the last CPU goes into suspend. Signed-off-by: Joseph Lo <josephl@nvidia.com> Signed-off-by: Stephen Warren <swarren@nvidia.com>
2013-07-19ARM: tegra: add low level code for Tegra114 cluster power downJoseph Lo
When the CPU cluster power down, the vGIC is powered down too. The flow controller needs to monitor the legacy interrupt controller to wake up CPU. So setting up the appropriate wake up event in flow controller. Signed-off-by: Joseph Lo <josephl@nvidia.com> Signed-off-by: Stephen Warren <swarren@nvidia.com>
2013-07-19ARM: tegra: set up the correct L2 data RAM latency for Cortex-A15Joseph Lo
When there is a cluster power down cycle in suspend, we need to set up the correct L2 RAM data RAM latency to make L2 cache work correctly. This is only needed for cluster 0 and needs to be done in tegra_resume before the cache is enabled. Signed-off-by: Joseph Lo <josephl@nvidia.com> Signed-off-by: Stephen Warren <swarren@nvidia.com>
2013-07-19ARM: tegra: add a flag for tegra_disable_clean_inv_dcache to do LoUIS or ALLJoseph Lo
Adding a flag for tegra_disable_clean_inv_dcache to flush cache as LoUIS or ALL. After this patch, the v7_flush_dcache_louis is used for CPU hotplug and CPU suspend in CPU power down (e.g. CPU idle power-down mode) case. And the v7_flush_dcache_all is used for CPU cluster power down (e.g. suspend to LP2 mode). Signed-off-by: Joseph Lo <josephl@nvidia.com> Signed-off-by: Stephen Warren <swarren@nvidia.com>
2013-07-19ARM: tegra: do v7_invalidate_l1 only when CPU is Cortex-A9Joseph Lo
The v7_invalidate_l1 was used for the L1 cache that come out from reset in a undefined state. This is no need for Cortex-A15. We do it for A9 only. Signed-off-by: Joseph Lo <josephl@nvidia.com> Signed-off-by: Stephen Warren <swarren@nvidia.com>
2013-07-19ARM: tegra114: cpuidle: add powered-down stateJoseph Lo
This supports CPU core power down on each CPU when CPU idle. When CPU go into this state, it saves it's context and needs a proper configuration in flow controller to power gate the CPU when CPU runs into WFI instruction. And the CPU also needs to set the IRQ as CPU power down idle wake up event in flow controller. Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Joseph Lo <josephl@nvidia.com> Signed-off-by: Stephen Warren <swarren@nvidia.com>
2013-07-19ARM: tegra114: add low level support for CPU idle powered-down modeJoseph Lo
The flow controller would take care the power sequence when CPU idle in powered-down mode. It powered gate the CPU when CPU runs into WFI instruction. And wake up the CPU when event be triggered. The sequence is below. * setting wfi bitmap for the CPU as the halt event in the FLOW_CTRL_CPU_HALT_REG to monitor the CPU running into WFI,then power gate it * setting IRQ and FIQ as wake up event to wake up CPU when event triggered Signed-off-by: Joseph Lo <josephl@nvidia.com> Signed-off-by: Stephen Warren <swarren@nvidia.com>
2013-07-19ARM: tegra114: Reprogram GIC CPU interface to bypass IRQ on CPU PM entryJoseph Lo
There is a difference between GICv1 and v2 when CPU in power management mode (aka CPU power down on Tegra). For GICv1, IRQ/FIQ interrupt lines going to CPU are same lines which are also used for wake-interrupt. Therefore, we cannot disable the GIC CPU interface if we need to use same interrupts for CPU wake purpose. This creates a race condition for CPU power off entry. Also, in GICv1, disabling GICv1 CPU interface puts GICv1 into bypass mode such that incoming legacy IRQ/FIQ are sent to CPU, which means disabling GIC CPU interface doesn't really disable IRQ/FIQ to CPU. GICv2 provides a wake IRQ/FIQ (for wake-event purpose), which are not disabled by GIC CPU interface. This is done by adding a bypass override capability when the interrupts are disabled at the CPU interface. To support this, there are four bits about IRQ/FIQ BypassDisable in CPU interface Control Register. When the IRQ/FIQ not being driver by the CPU interface, each interrupt output signal can be deasserted rather than being driven by the legacy interrupt input. So the wake-event can be used as wakeup signals to SoC (system power controller). To prevent race conditions and ensure proper interrupt routing on Cortex-A15 CPUs when they are power-gated, add a CPU PM notifier call-back to reprogram the GIC CPU interface on PM entry. The GIC CPU interface will be reset back to its normal state by the common GIC CPU PM exit callback when the CPU wakes up. Based on the work by: Scott Williams <scwilliams@nvidia.com> Signed-off-by: Joseph Lo <josephl@nvidia.com> Signed-off-by: Stephen Warren <swarren@nvidia.com>
2013-07-19Revert "ARM: tegra: add cpu_disable for hotplug"Joseph Lo
This reverts commit 510bb59 "ARM: tegra: add cpu_disable for hotplug". The Tegra114 support CPU0 hotplug function in HW physically, but it needs other software to make it work normally after we add CPU idle power down mode support. So remove them for now. Signed-off-by: Joseph Lo <josephl@nvidia.com> Signed-off-by: Stephen Warren <swarren@nvidia.com>
2013-07-15ARM: tegra: enable Cortex-A15 erratum 798181Joseph Lo
The commit 93dc688 (ARM: 7684/1: errata: Workaround for Cortex-A15 erratum 798181 (TLBI/DSB operations)) introduced a workaround for Cortex-A15 erratum 798181. Enable it for Tegra114. Signed-off-by: Joseph Lo <josephl@nvidia.com> Signed-off-by: Stephen Warren <swarren@nvidia.com>
2013-07-14Linux 3.11-rc1Linus Torvalds
2013-07-14Merge branch 'slab/for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux Pull slab update from Pekka Enberg: "Highlights: - Fix for boot-time problems on some architectures due to init_lock_keys() not respecting kmalloc_caches boundaries (Christoph Lameter) - CONFIG_SLUB_CPU_PARTIAL requested by RT folks (Joonsoo Kim) - Fix for excessive slab freelist draining (Wanpeng Li) - SLUB and SLOB cleanups and fixes (various people)" I ended up editing the branch, and this avoids two commits at the end that were immediately reverted, and I instead just applied the oneliner fix in between myself. * 'slab/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux slub: Check for page NULL before doing the node_match check mm/slab: Give s_next and s_stop slab-specific names slob: Check for NULL pointer before calling ctor() slub: Make cpu partial slab support configurable slab: add kmalloc() to kernel API documentation slab: fix init_lock_keys slob: use DIV_ROUND_UP where possible slub: do not put a slab to cpu partial list when cpu_partial is 0 mm/slub: Use node_nr_slabs and node_nr_objs in get_slabinfo mm/slub: Drop unnecessary nr_partials mm/slab: Fix /proc/slabinfo unwriteable for slab mm/slab: Sharing s_next and s_stop between slab and slub mm/slab: Fix drain freelist excessively slob: Rework #ifdeffery in slab.h mm, slab: moved kmem_cache_alloc_node comment to correct place
2013-07-14slub: Check for page NULL before doing the node_match checkSteven Rostedt
In the -rt kernel (mrg), we hit the following dump: BUG: unable to handle kernel NULL pointer dereference at (null) IP: [<ffffffff811573f1>] kmem_cache_alloc_node+0x51/0x180 PGD a2d39067 PUD b1641067 PMD 0 Oops: 0000 [#1] PREEMPT SMP Modules linked in: sunrpc cpufreq_ondemand ipv6 tg3 joydev sg serio_raw pcspkr k8temp amd64_edac_mod edac_core i2c_piix4 e100 mii shpchp ext4 mbcache jbd2 sd_mod crc_t10dif sr_mod cdrom sata_svw ata_generic pata_acpi pata_serverworks radeon ttm drm_kms_helper drm hwmon i2c_algo_bit i2c_core dm_mirror dm_region_hash dm_log dm_mod CPU 3 Pid: 20878, comm: hackbench Not tainted 3.6.11-rt25.14.el6rt.x86_64 #1 empty empty/Tyan Transport GT24-B3992 RIP: 0010:[<ffffffff811573f1>] [<ffffffff811573f1>] kmem_cache_alloc_node+0x51/0x180 RSP: 0018:ffff8800a9b17d70 EFLAGS: 00010213 RAX: 0000000000000000 RBX: 0000000001200011 RCX: ffff8800a06d8000 RDX: 0000000004d92a03 RSI: 00000000000000d0 RDI: ffff88013b805500 RBP: ffff8800a9b17dc0 R08: ffff88023fd14d10 R09: ffffffff81041cbd R10: 00007f4e3f06e9d0 R11: 0000000000000246 R12: ffff88013b805500 R13: ffff8801ff46af40 R14: 0000000000000001 R15: 0000000000000000 FS: 00007f4e3f06e700(0000) GS:ffff88023fd00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 0000000000000000 CR3: 00000000a2d3a000 CR4: 00000000000007e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process hackbench (pid: 20878, threadinfo ffff8800a9b16000, task ffff8800a06d8000) Stack: ffff8800a9b17da0 ffffffff81202e08 ffff8800a9b17de0 000000d001200011 0000000001200011 0000000001200011 0000000000000000 0000000000000000 00007f4e3f06e9d0 0000000000000000 ffff8800a9b17e60 ffffffff81041cbd Call Trace: [<ffffffff81202e08>] ? current_has_perm+0x68/0x80 [<ffffffff81041cbd>] copy_process+0xdd/0x15b0 [<ffffffff810a2125>] ? rt_up_read+0x25/0x30 [<ffffffff8104369a>] do_fork+0x5a/0x360 [<ffffffff8107c66b>] ? migrate_enable+0xeb/0x220 [<ffffffff8100b068>] sys_clone+0x28/0x30 [<ffffffff81527423>] stub_clone+0x13/0x20 [<ffffffff81527152>] ? system_call_fastpath+0x16/0x1b Code: 89 fc 89 75 cc 41 89 d6 4d 8b 04 24 65 4c 03 04 25 48 ae 00 00 49 8b 50 08 4d 8b 28 49 8b 40 10 4d 85 ed 74 12 41 83 fe ff 74 27 <48> 8b 00 48 c1 e8 3a 41 39 c6 74 1b 8b 75 cc 4c 89 c9 44 89 f2 RIP [<ffffffff811573f1>] kmem_cache_alloc_node+0x51/0x180 RSP <ffff8800a9b17d70> CR2: 0000000000000000 ---[ end trace 0000000000000002 ]--- Now, this uses SLUB pretty much unmodified, but as it is the -rt kernel with CONFIG_PREEMPT_RT set, spinlocks are mutexes, although they do disable migration. But the SLUB code is relatively lockless, and the spin_locks there are raw_spin_locks (not converted to mutexes), thus I believe this bug can happen in mainline without -rt features. The -rt patch is just good at triggering mainline bugs ;-) Anyway, looking at where this crashed, it seems that the page variable can be NULL when passed to the node_match() function (which does not check if it is NULL). When this happens we get the above panic. As page is only used in slab_alloc() to check if the node matches, if it's NULL I'm assuming that we can say it doesn't and call the __slab_alloc() code. Is this a correct assumption? Acked-by: Christoph Lameter <cl@linux.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Pekka Enberg <penberg@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-07-14Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull more vfs stuff from Al Viro: "O_TMPFILE ABI changes, Oleg's fput() series, misc cleanups, including making simple_lookup() usable for filesystems with non-NULL s_d_op, which allows us to get rid of quite a bit of ugliness" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: sunrpc: now we can just set ->s_d_op cgroup: we can use simple_lookup() now efivarfs: we can use simple_lookup() now make simple_lookup() usable for filesystems that set ->s_d_op configfs: don't open-code d_alloc_name() __rpc_lookup_create_exclusive: pass string instead of qstr rpc_create_*_dir: don't bother with qstr llist: llist_add() can use llist_add_batch() llist: fix/simplify llist_add() and llist_add_batch() fput: turn "list_head delayed_fput_list" into llist_head fs/file_table.c:fput(): add comment Safer ABI for O_TMPFILE
2013-07-14sunrpc: now we can just set ->s_d_opAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-07-14cgroup: we can use simple_lookup() nowAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-07-14efivarfs: we can use simple_lookup() nowAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-07-14make simple_lookup() usable for filesystems that set ->s_d_opAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-07-14configfs: don't open-code d_alloc_name()Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-07-14__rpc_lookup_create_exclusive: pass string instead of qstrAl Viro
... and use d_hash_and_lookup() instead of open-coding it, for fsck sake... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-07-14rpc_create_*_dir: don't bother with qstrAl Viro
just pass the name Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-07-13Merge branch 'for_linus' of git://cavan.codon.org.uk/platform-drivers-x86Linus Torvalds
Pull x86 platform driver updates from Matthew Garrett: "Nothing overly exciting here - a couple of new drivers that don't do a great deal, along with some miscellaneous fixes and a couple of small feature enablement patches" * 'for_linus' of git://cavan.codon.org.uk/platform-drivers-x86: x86 platform drivers: fix gpio leak toshiba_acpi: Add dependency on SERIO_I8042 asus-nb-wmi: set wapf=4 for ASUSTeK COMPUTER INC. 1015E/U Add trivial driver to disable Intel Smart Connect Add support driver for Intel Rapid Start Technology hp-wmi: add supports for POST code error asus-wmi: control wlan-led only if wapf == 4 drivers/platform/x86/intel_ips: Convert to module_pci_driver asus-nb-wmi: ignore ALS notification key code asus-wmi: append newline to messages x86: asus-laptop: fix invalid point access x86: msi-laptop: fix memleak amilo-rfkill: Add dependency on SERIO_I8042 dell-laptop: fix error return code in dell_init() hp-wmi: Enable hotkeys on some systems
2013-07-13Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input Pull second round of input updates from Dmitry Torokhov: "An update to Elantech driver to support hardware v7, fix to the new cyttsp4 driver to use proper addressing, ads7846 device tree support and nspire-keypad got a small cleanup." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Input: nspire-keypad - replace magic offset with define Input: elantech - fix for newer hardware versions (v7) Input: cyttsp4 - use 16bit address for I2C/SPI communication Input: ads7846 - add device tree bindings Input: ads7846 - make sure we do not change platform data
2013-07-13Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: "Just a bunch of small fixes and tidy ups: 1) Finish the "busy_poll" renames, from Eliezer Tamir. 2) Fix RCU stalls in IFB driver, from Ding Tianhong. 3) Linearize buffers properly in tun/macvtap zerocopy code. 4) Don't crash on rmmod in vxlan, from Pravin B Shelar. 5) Spinlock used before init in alx driver, from Maarten Lankhorst. 6) A sparse warning fix in bnx2x broke TSO checksums, fix from Dmitry Kravkov. 7) Dummy and ifb driver load failure paths can oops, fixes from Tan Xiaojun and Ding Tianhong. 8) Correct MTU calculations in IP tunnels, from Alexander Duyck. 9) Account all TCP retransmits in SNMP stats properly, from Yuchung Cheng. 10) atl1e and via-rhine do not handle DMA mapping failures properly, from Neil Horman. 11) Various equal-cost multipath route fixes in ipv6 from Hannes Frederic Sowa" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (36 commits) ipv6: only static routes qualify for equal cost multipathing via-rhine: fix dma mapping errors atl1e: fix dma mapping warnings tcp: account all retransmit failures usb/net/r815x: fix cast to restricted __le32 usb/net/r8152: fix integer overflow in expression net: access page->private by using page_private net: strict_strtoul is obsolete, use kstrtoul instead drivers/net/ieee802154: don't use devm_pinctrl_get_select_default() in probe drivers/net/ethernet/cadence: don't use devm_pinctrl_get_select_default() in probe drivers/net/can/c_can: don't use devm_pinctrl_get_select_default() in probe net/usb: add relative mii functions for r815x net/tipc: use %*phC to dump small buffers in hex form qlcnic: Adding Maintainers. gre: Fix MTU sizing check for gretap tunnels pkt_sched: sch_qfq: remove forward declaration of qfq_update_agg_ts pkt_sched: sch_qfq: improve efficiency of make_eligible gso: Update tunnel segmentation to support Tx checksum offload inet: fix spacing in assignment ifb: fix oops when loading the ifb failed ...
2013-07-13Merge tag 'scsi-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull final round of SCSI updates from James Bottomley: "This is the remaining set of SCSI patches for the merge window. It's mostly driver updates (scsi_debug, qla2xxx, storvsc, mp3sas). There are also several bug fixes in fcoe, libfc, and megaraid_sas. We also have a couple of core changes to try to make device destruction more deterministic" * tag 'scsi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (46 commits) [SCSI] scsi constants: command, sense key + additional sense strings fcoe: Reduce number of sparse warnings fcoe: Stop fc_rport_priv structure leak libfcoe: Fix meaningless log statement libfc: Differentiate echange timer cancellation debug statements libfc: Remove extra space in fc_exch_timer_cancel definition fcoe: fix the link error status block sparse warnings fcoe: Fix smatch warning in fcoe_fdmi_info function libfc: Reject PLOGI from nodes with incompatible role [SCSI] enable destruction of blocked devices which fail LUN scanning [SCSI] Fix race between starved list and device removal [SCSI] megaraid_sas: fix a bug for 64 bit arches [SCSI] scsi_debug: reduce duplication between prot_verify_read and prot_verify_write [SCSI] scsi_debug: simplify offset calculation for dif_storep [SCSI] scsi_debug: invalidate protection info for unmapped region [SCSI] scsi_debug: fix NULL pointer dereference with parameters dif=0 dix=1 [SCSI] scsi_debug: fix incorrectly nested kmap_atomic() [SCSI] scsi_debug: fix invalid address passed to kunmap_atomic() [SCSI] mpt3sas: Bump driver version to v02.100.00.00 [SCSI] mpt3sas: when async scanning is enabled then while scanning, devices are removed but their transport layer entries are not removed ...
2013-07-13Merge branch 'sched-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fix from Thomas Gleixner: "Fix a potential deadlock versus hrtimers" * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched: Fix HRTICK
2013-07-13Merge branch 'irq-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq updates from Thomas Gleixner: - core fix for missing round up in the generic irq chip implementation - new irq chip for MOXA SoCs - a few fixes and cleanups in the irqchip drivers * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: irqchip: Add support for MOXA ART SoCs genirq: generic chip: Use DIV_ROUND_UP to calculate numchips irqchip: nvic: Fix wrong num_ct argument for irq_alloc_domain_generic_chips() irqchip: sun4i: Staticize sun4i_irq_ack() irqchip: vt8500: Staticize local symbols
2013-07-13Merge branch 'timers-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer updates from Thomas Gleixner: - watchdog fixes for full dynticks - improved debug output for full dynticks - remove an obsolete full dynticks check - two ARM SoC clocksource drivers for sharing across SoCs - tick broadcast fix for CPU hotplug * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: tick: broadcast: Check broadcast mode on CPU hotplug clocksource: arm_global_timer: Add ARM global timer support clocksource: Add Marvell Orion SoC timer nohz: Remove obsolete check for full dynticks CPUs to be RCU nocbs watchdog: Boot-disable by default on full dynticks watchdog: Rename confusing state variable watchdog: Register / unregister watchdog kthreads on sysctl control nohz: Warn if the machine can not perform nohz_full
2013-07-13Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Thomas Gleixner: - fix for do_div() abuse on x86 - locking fix in perf core - a pile of (build) fixes and cleanups in perf tools * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (27 commits) perf/x86: Fix incorrect use of do_div() in NMI warning perf: Fix perf_lock_task_context() vs RCU perf: Remove WARN_ON_ONCE() check in __perf_event_enable() for valid scenario perf: Clone child context from parent context pmu perf script: Fix broken include in Context.xs perf tools: Fix -ldw/-lelf link test when static linking perf tools: Revert regression in configuration of Python support perf tools: Fix perf version generation perf stat: Fix per-socket output bug for uncore events perf symbols: Fix vdso list searching perf evsel: Fix missing increment in sample parsing perf tools: Update symbol_conf.nr_events when processing attribute events perf tools: Fix new_term() missing free on error path perf tools: Fix parse_events_terms() segfault on error path perf evsel: Fix count parameter to read call in event_format__new perf tools: fix a typo of a Power7 event name perf tools: Fix -x/--exclude-other option for report command perf evlist: Enhance perf_evlist__start_workload() perf record: Remove -f/--force option perf record: Remove -A/--append option ...
2013-07-13Merge branch 'core-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull core locking updates from Thomas Gleixner: "Header cleanup as requested by Linus" (This is the "don't include support for ww_mutex in a header file that everybody wants, when almost nobody wants the ww part" change) * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: mutex: Move ww_mutex definitions to ww_mutex.h
2013-07-13Merge tag 'fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull ARM SoC fixes from Olof Johansson: "This is our first set of fixes from arm-soc for 3.11. - A handful of build and warning fixes from Arnd - A collection of OMAP fixes - defconfig updates to make the default configs more useful for real use (and testing) out of the box on hardware And a couple of other small fixes. Some of these have been recently applied but it's normally how we deal with fixes, with less bake time in -next needed" * tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (32 commits) arm: multi_v7_defconfig: Tweaks for omap and sunxi arm: multi_v7_defconfig: add i.MX options and NFS root ARM: omap2: add select of TI_PRIV_EDMA ARM: exynos: select PM_GENERIC_DOMAINS only when used ARM: ixp4xx: avoid circular header dependency ARM: OMAP: omap_common_late_init may be unused ARM: sti: move DEBUG_STI_UART into alphabetical order ARM: OMAP: build mach-omap code only if needed ARM: zynq: use DT_MACHINE_START ARM: omap5: omap5 has SCU and TWD ARM: OMAP2+: omap2plus_defconfig: Enable appended DTB support ARM: OMAP2+: Enable TI_EDMA in omap2plus_defconfig ARM: OMAP2+: omap2plus_defconfig: enable DRA752 thermal support by default ARM: OMAP2+: omap2plus_defconfig: enable TI bandgap driver ARM: OMAP2+: devices: remove duplicated include from devices.c ARM: OMAP3: igep0020: Set DSS pins in correct mux mode. ARM: OMAP2+: N900: enable N900-specific drivers even if device tree is enabled ARM: OMAP2+: Cocci spatch "ptr_ret.spatch" ARM: OMAP2+: Remove obsolete Makefile line ARM: OMAP5: Enable Cortex A15 errata 798181 ...