summaryrefslogtreecommitdiff
path: root/drivers/pci
AgeCommit message (Collapse)Author
2019-07-08PCI: mobiveil: Unify register accessorsHou Zhiqiang
It is confusing to have two sets of functions to read/write registers, some with csr_readl()/csr_writel(), while others with read_paged_register()/write_paged_register(). In the register space the lower 3KB of 4KB PCIe configure space can be accessed directly and higher 1KB through a simple paging mechanism. Unify the register accessors in csr_readl() and csr_writel() by comparing the register offset with page access boundary 3KB in the accessor internal so that the paging mechanism is hidden behind the csr_read()/write() common function calls. Signed-off-by: Hou Zhiqiang <Zhiqiang.Hou@nxp.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Minghuan Lian <Minghuan.Lian@nxp.com> Reviewed-by: Subrahmanya Lingappa <l.subrahmanya@mobiveil.co.in>
2019-07-08Merge branch 'pm-sleep'Rafael J. Wysocki
* pm-sleep: PM: sleep: Drop dev_pm_skip_next_resume_phases() ACPI: PM: Drop unused function and function header ACPI: PM: Introduce "poweroff" callbacks for ACPI PM domain and LPSS ACPI: PM: Simplify and fix PM domain hibernation callbacks PCI: PM: Simplify bus-level hibernation callbacks PM: ACPI/PCI: Resume all devices during hibernation kernel: power: swap: use kzalloc() instead of kmalloc() followed by memset() PM: sleep: Update struct wakeup_source documentation drivers: base: power: remove wakeup_sources_stats_dentry variable PM: suspend: Rename pm_suspend_via_s2idle() PM: sleep: Show how long dpm_suspend_start() and dpm_suspend_end() take PM: hibernate: powerpc: Expose pfn_is_nosave() prototype
2019-07-05PCI: imx6: Simplify Kconfig depends onLeonard Crestez
The imx6 driver can be used on imx6sx without enabling support for imx6q or imx7d but the "depends on" condition doesn't allow that. Instead of making the condition even longer just make it depend on "ARCH_MXC || COMPILE_TEST" instead. Signed-off-by: Leonard Crestez <leonard.crestez@nxp.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Lucas Stach <l.stach@pengutronix.de>
2019-07-05PCI: hv: Fix a use-after-free bug in hv_eject_device_work()Dexuan Cui
Fix a use-after-free in hv_eject_device_work(). Fixes: 05f151a73ec2 ("PCI: hv: Fix a memory leak in hv_eject_device_work()") Signed-off-by: Dexuan Cui <decui@microsoft.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Cc: stable@vger.kernel.org
2019-07-05PCI: tegra: Enable Relaxed Ordering only for Tegra20 & Tegra30Vidya Sagar
The PCI Tegra controller conversion to a device tree configurable driver in commit d1523b52bff3 ("PCI: tegra: Move PCIe driver to drivers/pci/host") implied that code for the driver can be compiled in for a kernel supporting multiple platforms. Unfortunately, a blind move of the code did not check that some of the quirks that were applied in arch/arm (eg enabling Relaxed Ordering on all PCI devices - since the quirk hook erroneously matches PCI_ANY_ID for both Vendor-ID and Device-ID) are now applied in all kernels that compile the PCI Tegra controlled driver, DT and ACPI alike. This is completely wrong, in that enablement of Relaxed Ordering is only required by default in Tegra20 platforms as described in the Tegra20 Technical Reference Manual (available at https://developer.nvidia.com/embedded/downloads#?search=tegra%202 in Section 34.1, where it is mentioned that Relaxed Ordering bit needs to be enabled in its root ports to avoid deadlock in hardware) and in the Tegra30 platforms for the same reasons (unfortunately not documented in the TRM). There is no other strict requirement on PCI devices Relaxed Ordering enablement on any other Tegra platforms or PCI host bridge driver. Fix this quite upsetting situation by limiting the vendor and device IDs to which the Relaxed Ordering quirk applies to the root ports in question, reported above. Signed-off-by: Vidya Sagar <vidyas@nvidia.com> [lorenzo.pieralisi@arm.com: completely rewrote the commit log/fixes tag] Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Thierry Reding <treding@nvidia.com>
2019-07-05PCI: tegra: Change link retry log level to debugManikanta Maddireddy
Driver checks for link up three times before giving up, each retry attempt is printed as an error. Letting users know that PCIe link is down and in the process of being brought up again is for debug, not an error condition. Signed-off-by: Manikanta Maddireddy <mmaddireddy@nvidia.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Thierry Reding <treding@nvidia.com>
2019-07-05PCI: tegra: Add support for GPIO based PERST#Manikanta Maddireddy
Tegra PCIe has fixed per port SFIO line to signal PERST#, which can be controlled by AFI port register. However, if a platform routes a different GPIO to the PCIe slot, then port register cannot control it. Add support for GPIO based PERST# signal for such platforms. GPIO number comes from per port PCIe device tree node. PCIe driver probe doesn't fail if per port "reset-gpios" property is not populated, so platforms that require this workaround must make sure that the DT property is not missed in the corresponding device tree. Link: https://lore.kernel.org/linux-pci/20190705084850.30777-1-jonathanh@nvidia.com/ Signed-off-by: Manikanta Maddireddy <mmaddireddy@nvidia.com> [lorenzo.pieralisi@arm.com: squashed in fix in Link] Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Thierry Reding <treding@nvidia.com>
2019-07-03PCI/IOV: Assume SR-IOV VFs support extended config space.Alex Williamson
The SR-IOV specification requires both PFs and VFs to implement a PCIe capability. Generally this is sufficient to assume extended config space is present, but we generally also perform additional tests to make sure the extended config space is reachable and not simply an alias of standard config space. For a VF to exist extended config space must be accessible on the PF, therefore we can also assume it to be accessible on the VF. This enables a micro performance optimization previously implemented in commit 975bb8b4dc93 ("PCI/IOV: Use VF0 cached config space size for other VFs") to speed up probing of VFs. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Cc: KarimAllah Ahmed <karahmed@amazon.de> Cc: Hao Zheng <yinhe@linux.alibaba.com>
2019-07-03Revert "PCI/IOV: Use VF0 cached config space size for other VFs"Alex Williamson
Revert 975bb8b4dc93 ("PCI/IOV: Use VF0 cached config space size for other VFs"), which attempted to cache the config space size from the first VF to re-use for subsequent VFs. The cached value was determined prior to discovering the PCIe capability on the VF, which resulted in the first VF reporting the correct config space size (4K), as it has a special case through pci_cfg_space_size(), while all the other VFs only reported 256 bytes. As this was only a performance optimization, we're better off without it. Fixes: 975bb8b4dc93 ("PCI/IOV: Use VF0 cached config space size for other VFs") Link: https://lore.kernel.org/r/156046663197.29869.3633634445109057665.stgit@gimli.home Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: KarimAllah Ahmed <karahmed@amazon.de> Cc: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Cc: Hao Zheng <yinhe@linux.alibaba.com>
2019-07-02PCI: Use seq_puts() instead of seq_printf() in show_device()Markus Elfring
The driver name in /proc/bus/pci/devices can be printed without a printf format specification, so use seq_puts() instead of seq_printf(). This issue was detected by using the Coccinelle software. Link: https://lore.kernel.org/r/a6b110cb-0d0e-5dc3-9ca1-9041609cf74c@web.de Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> [bhelgaas: commit log] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2019-07-02PCI/P2PDMA: Fix missing check for dma_virt_opsLogan Gunthorpe
Drivers that use dma_virt_ops were meant to be rejected when testing compatibility for P2PDMA. This check got inadvertently dropped in one of the later versions of the original patchset, so add it back. Fixes: 52916982af48 ("PCI/P2PDMA: Support peer-to-peer memory") Link: https://lore.kernel.org/r/20190702173544.21950-1-logang@deltatee.com Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2019-07-03PCI: PM: Simplify bus-level hibernation callbacksRafael J. Wysocki
After a previous change causing all runtime-suspended PCI devices to be resumed before creating a snapshot image of memory during hibernation, it is not necessary to worry about the case in which them might be left in runtime-suspend any more, so get rid of the code related to that from bus-level PCI hibernation callbacks. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com> Reviewed-by: Hans de Goede <hdegoede@redhat.com>
2019-07-03PM: ACPI/PCI: Resume all devices during hibernationRafael J. Wysocki
Both the PCI bus type and the ACPI PM domain avoid resuming runtime-suspended devices with DPM_FLAG_SMART_SUSPEND set during hibernation (before creating the snapshot image of system memory), but that turns out to be a mistake. It leads to functional issues and adds complexity that's hard to justify. For this reason, resume all runtime-suspended PCI devices and all devices in the ACPI PM domains before creating a snapshot image of system memory during hibernation. Fixes: 05087360fd7a (ACPI / PM: Take SMART_SUSPEND driver flag into account) Fixes: c4b65157aeef (PCI / PM: Take SMART_SUSPEND driver flag into account) Link: https://lore.kernel.org/linux-acpi/917d4399-2e22-67b1-9d54-808561f9083f@uwyo.edu/T/#maf065fe6e4974f2a9d79f332ab99dfaba635f64c Reported-by: Robert R. Howell <RHowell@uwyo.edu> Tested-by: Robert R. Howell <RHowell@uwyo.edu> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com> Reviewed-by: Hans de Goede <hdegoede@redhat.com>
2019-07-02PCI: Skip resource distribution when no hotplug bridgesNicholas Johnson
If "hotplug_bridges == 0", "!dev->is_hotplug_bridge" is always true, so the loop that divides the remaining resources among hotplug-capable bridges does nothing. Check for "hotplug_bridges == 0" earlier, so we don't even have to compute the amount of remaining resources. No functional change intended. Link: https://lore.kernel.org/r/PS2P216MB0642C7A485649D2D787A1C6F80000@PS2P216MB0642.KORP216.PROD.OUTLOOK.COM Link: https://lore.kernel.org/r/20190622210310.180905-3-helgaas@kernel.org Signed-off-by: Nicholas Johnson <nicholas.johnson-opensource@outlook.com.au> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Logan Gunthorpe <logang@deltatee.com> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
2019-07-02PCI: Simplify pci_bus_distribute_available_resources()Nicholas Johnson
Reorder pci_bus_distribute_available_resources() to group related code together. No functional change intended. Link: https://lore.kernel.org/r/PS2P216MB0642C7A485649D2D787A1C6F80000@PS2P216MB0642.KORP216.PROD.OUTLOOK.COM Link: https://lore.kernel.org/r/20190622210310.180905-2-helgaas@kernel.org Signed-off-by: Nicholas Johnson <nicholas.johnson-opensource@outlook.com.au> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com> Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
2019-07-02PCI/P2PDMA: use the dev_pagemap internal refcountChristoph Hellwig
The functionality is identical to the one currently open coded in p2pdma.c. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Logan Gunthorpe <logang@deltatee.com> Tested-by: Logan Gunthorpe <logang@deltatee.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-07-02memremap: pass a struct dev_pagemap to ->kill and ->cleanupChristoph Hellwig
Passing the actual typed structure leads to more understandable code vs just passing the ref member. Reported-by: Logan Gunthorpe <logang@deltatee.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Logan Gunthorpe <logang@deltatee.com> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Tested-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-07-02memremap: move dev_pagemap callbacks into a separate structureChristoph Hellwig
The dev_pagemap is a growing too many callbacks. Move them into a separate ops structure so that they are not duplicated for multiple instances, and an attacker can't easily overwrite them. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Logan Gunthorpe <logang@deltatee.com> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Tested-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-06-30Merge back PCI power management material for v5.3.Rafael J. Wysocki
2019-06-27Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
The new route handling in ip_mc_finish_output() from 'net' overlapped with the new support for returning congestion notifications from BPF programs. In order to handle this I had to take the dev_loopback_xmit() calls out of the switch statement. The aquantia driver conflicts were simple overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-27PCI: dwc: Export APIs to support .remove() implementationVidya Sagar
Export all configuration space access APIs and also other APIs to support host controller drivers of dwc core based implementations while adding support for .remove() hook to build their respective drivers as modules. Signed-off-by: Vidya Sagar <vidyas@nvidia.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
2019-06-27PCI: dwc: Cleanup DBI,ATU read and write APIsVidya Sagar
Cleanup DBI read and write APIs by removing leading "__" (underscore) from their names as there is no reason to have leading underscores in the first place in the function definition. Remove dbi/dbi2 base address parameters as the same behaviour can be obtained through read and write APIs. Since dw_pcie_{readl/writel}_dbi() APIs can't be used for ATU read/write as ATU base address could be different from DBI base address, implement ATU read/write APIs using ATU base address without using dw_pcie_{readl/writel}_dbi() APIs. Signed-off-by: Vidya Sagar <vidyas@nvidia.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Jingoo Han <jingoohan1@gmail.com>
2019-06-27PCI: dwc: Add API support to de-initialize hostVidya Sagar
Add an API to group all the tasks to be done to de-initialize host which can then be called by any dwc core based driver implementations while adding .remove() support in their respective drivers. Signed-off-by: Vidya Sagar <vidyas@nvidia.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
2019-06-27PCI: PM/ACPI: Refresh all stale power state data in pci_pm_complete()Rafael J. Wysocki
In pci_pm_complete() there are checks to decide whether or not to resume devices that were left in runtime-suspend during the preceding system-wide transition into a sleep state. They involve checking the current power state of the device and comparing it with the power state of it set before the preceding system-wide transition, but the platform component of the device's power state is not handled correctly in there. Namely, on platforms with ACPI, the device power state information needs to be updated with care, so that the reference counters of power resources used by the device (if any) are set to ensure that the refreshed power state of it will be maintained going forward. To that end, introduce a new ->refresh_state() platform PM callback for PCI devices, for asking the platform to refresh the device power state data and ensure that the corresponding power state will be maintained going forward, make it invoke acpi_device_update_power() (for devices with ACPI PM) on platforms with ACPI and make pci_pm_complete() use it, through a new pci_refresh_power_state() wrapper function. Fixes: a0d2a959d3da (PCI: Avoid unnecessary resume after direct-complete) Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
2019-06-27PCI / ACPI: Add _PR0 dependent devicesMika Westerberg
If otherwise unrelated PCI devices share ACPI power resources turning them on causes the devices to enter D0uninitialized power state which may cause problems. For example in Intel Ice Lake two root ports (RP0 and RP1), Thunderbolt controller (NHI) and xHCI controller all share power resources as can be ween in the topology below where power resources are marked with []: Host bridge | +- RP0 ---\ +- RP1 ---|--+--> [TBT] +- NHI --/ | | | | v +- xHCI --> [D3C] In a situation where all devices sharing the power resources are in D3cold (the power resources are turned off) and for example the Thunderbolt controller is runtime resumed resulting that the power resources are turned on. This means that the other devices sharing them (RP0, RP1 and xHCI) are transitioned into D0uninitialized state. If they were configured to trigger wake (PME) on a certain event that configuration gets lost after reset so we would need to re-initialize them to get the wakeup working as expected again. To do so we would need to runtime resume all of them to make sure their registers get restored properly before we can runtime suspend them again. Since we just added concept of "_PR0 dependent device" we can solve this by calling the relevant add/remove functions when the PCI device is bind to its ACPI representation. If it has power resources the PCI device will be added as dependent device to them and runtime resumed whenever they are physically turned on. This should make sure PCI core can reconfigure wakes after the device is transitioned into D0uninitialized. Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2019-06-27PCI / ACPI: Use cached ACPI device state to get PCI device power stateMika Westerberg
The ACPI power state returned by acpi_device_get_power() may depend on the configuration of ACPI power resources in the system which may change any time after acpi_device_get_power() has returned, unless the reference counters of the ACPI power resources in question are set to prevent that from happening. Thus it is invalid to use acpi_device_get_power() in acpi_pci_get_power_state() the way it is done now and the value of the ->power.state field in the corresponding struct acpi_device objects (which reflects the ACPI power resources reference counting, among other things) should be used instead. As an example where this becomes an issue is Intel Ice Lake where the Thunderbolt controller (NHI), two PCIe root ports (RP0 and RP1) and xHCI all share the same power resources. The following picture with power resources marked with [] shows the topology: Host bridge | +- RP0 ---\ +- RP1 ---|--+--> [TBT] +- NHI --/ | | | | v +- xHCI --> [D3C] Here TBT and D3C are the shared ACPI power resources. ACPI _PR3() method of the devices in question returns either TBT or D3C or both. Say we runtime suspend first the root ports RP0 and RP1, then NHI. Now since the TBT power resource is still on when the root ports are runtime suspended their dev->current_state is set to D3hot. When NHI is runtime suspended TBT is finally turned off but state of the root ports remain to be D3hot. Now when the xHCI is runtime suspended D3C gets also turned off. PCI core thus has power states of these devices cached in their dev->current_state as follows: RP0 -> D3hot RP1 -> D3hot NHI -> D3cold xHCI -> D3cold If the user now runs lspci for instance, the result is all 1's like in the below output (00:07.0 is the first root port, RP0): 00:07.0 PCI bridge: Intel Corporation Device 8a1d (rev ff) (prog-if ff) !!! Unknown header type 7f Kernel driver in use: pcieport In short the hardware state is not in sync with the software state anymore. The exact same thing happens with the PME polling thread which ends up bringing the root ports back into D0 after they are runtime suspended. For this reason, modify acpi_pci_get_power_state() so that it uses the ACPI device power state that was cached by the ACPI core. This makes the PCI device power state match the ACPI device power state regardless of state of the shared power resources which may still be on at this point. Link: https://lore.kernel.org/r/20190618161858.77834-2-mika.westerberg@linux.intel.com Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2019-06-26PCI: PM: Avoid skipping bus-level PM on platforms without ACPIRafael J. Wysocki
There are platforms that do not call pm_set_suspend_via_firmware(), so pm_suspend_via_firmware() returns 'false' on them, but the power states of PCI devices (PCIe ports in particular) are changed as a result of powering down core platform components during system-wide suspend. Thus the pm_suspend_via_firmware() checks in pci_pm_suspend_noirq() and pci_pm_resume_noirq() introduced by commit 3e26c5feed2a ("PCI: PM: Skip devices in D0 for suspend-to- idle") are not sufficient to determine that devices left in D0 during suspend will remain in D0 during resume and so the bus-level power management can be skipped for them. For this reason, introduce a new global suspend flag, PM_SUSPEND_FLAG_NO_PLATFORM, set it for suspend-to-idle only and replace the pm_suspend_via_firmware() checks mentioned above with checks against this flag. Fixes: 3e26c5feed2a ("PCI: PM: Skip devices in D0 for suspend-to-idle") Reported-by: Jon Hunter <jonathanh@nvidia.com> Tested-by: Jon Hunter <jonathanh@nvidia.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Tested-by: Mika Westerberg <mika.westerberg@linux.intel.com> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
2019-06-26PCI: xilinx-nwl: Fix Multi MSI data programmingBharat Kumar Gogada
According to the PCI Local Bus specification Revision 3.0, section 6.8.1.3 (Message Control for MSI), endpoints that are Multiple Message Capable as defined by bits [3:1] in the Message Control for MSI can request a number of vectors that is power of two aligned. As specified in section 6.8.1.6 "Message data for MSI", the Multiple Message Enable field (bits [6:4] of the Message Control register) defines the number of low order message data bits the function is permitted to modify to generate its system software allocated vectors. The MSI controller in the Xilinx NWL PCIe controller supports a number of MSI vectors specified through a bitmap and the hwirq number for an MSI, that is the value written in the MSI data TLP is determined by the bitmap allocation. For instance, in a situation where two endpoints sitting on the PCI bus request the following MSI configuration, with the current PCI Xilinx bitmap allocation code (that does not align MSI vector allocation on a power of two boundary): Endpoint #1: Requesting 1 MSI vector - allocated bitmap bits 0 Endpoint #2: Requesting 2 MSI vectors - allocated bitmap bits [1,2] The bitmap value(s) corresponds to the hwirq number that is programmed into the Message Data for MSI field in the endpoint MSI capability and is detected by the root complex to fire the corresponding MSI irqs. The value written in Message Data for MSI field corresponds to the first bit allocated in the bitmap for Multi MSI vectors. The current Xilinx NWL MSI allocation code allows a bitmap allocation that is not a power of two boundaries, so endpoint #2, is allowed to toggle Message Data bit[0] to differentiate between its two vectors (meaning that the MSI data will be respectively 0x0 and 0x1 for the two vectors allocated to endpoint #2). This clearly aliases with the Endpoint #1 vector allocation, resulting in a broken Multi MSI implementation. Update the code to allocate MSI bitmap ranges with a power of two alignment, fixing the bug. Fixes: ab597d35ef11 ("PCI: xilinx-nwl: Add support for Xilinx NWL PCIe Host Controller") Suggested-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Bharat Kumar Gogada <bharat.kumar.gogada@xilinx.com> [lorenzo.pieralisi@arm.com: updated commit log] Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com>
2019-06-24Merge back PCI power management material for v5.3.Rafael J. Wysocki
2019-06-24bus_find_device: Unify the match callback with class_find_deviceSuzuki K Poulose
There is an arbitrary difference between the prototypes of bus_find_device() and class_find_device() preventing their callers from passing the same pair of data and match() arguments to both of them, which is the const qualifier used in the prototype of class_find_device(). If that qualifier is also used in the bus_find_device() prototype, it will be possible to pass the same match() callback function to both bus_find_device() and class_find_device(), which will allow some optimizations to be made in order to avoid code duplication going forward. Also with that, constify the "data" parameter as it is passed as a const to the match function. For this reason, change the prototype of bus_find_device() to match the prototype of class_find_device() and adjust its callers to use the const qualifier in accordance with the new prototype of it. Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andrew Lunn <andrew@lunn.ch> Cc: Andreas Noever <andreas.noever@gmail.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Corey Minyard <minyard@acm.org> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: David Kershner <david.kershner@unisys.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: David Airlie <airlied@linux.ie> Cc: Felipe Balbi <balbi@kernel.org> Cc: Frank Rowand <frowand.list@gmail.com> Cc: Grygorii Strashko <grygorii.strashko@ti.com> Cc: Harald Freudenberger <freude@linux.ibm.com> Cc: Hartmut Knaack <knaack.h@gmx.de> Cc: Heiko Stuebner <heiko@sntech.de> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Jonathan Cameron <jic23@kernel.org> Cc: "James E.J. Bottomley" <jejb@linux.ibm.com> Cc: Len Brown <lenb@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michael Jamet <michael.jamet@intel.com> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Cc: Peter Oberparleiter <oberpar@linux.ibm.com> Cc: Sebastian Ott <sebott@linux.ibm.com> Cc: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Cc: Yehezkel Bernat <YehezkelShB@gmail.com> Cc: rafael@kernel.org Acked-by: Corey Minyard <minyard@acm.org> Acked-by: David Kershner <david.kershner@unisys.com> Acked-by: Mark Brown <broonie@kernel.org> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Acked-by: Wolfram Sang <wsa@the-dreams.de> # for the I2C parts Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-22Merge tag 'pci-v5.2-fixes-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI fix from Bjorn Helgaas: "If an IOMMU is present, ignore the P2PDMA whitelist we added for v5.2 because we don't yet know how to support P2PDMA in that case (Logan Gunthorpe)" * tag 'pci-v5.2-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: PCI/P2PDMA: Ignore root complex whitelist when an IOMMU is present
2019-06-22Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Minor SPDX change conflict. Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-21PCI: let pci_disable_link_state propagate errorsHeiner Kallweit
Drivers may rely on pci_disable_link_state() having disabled certain ASPM link states. If OS can't control ASPM then pci_disable_link_state() turns into a no-op w/o informing the caller. The driver therefore may falsely assume the respective ASPM link states are disabled. Let pci_disable_link_state() propagate errors to the caller, enabling the caller to react accordingly. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-21PCI: Don't auto-realloc if we're preserving firmware configBenjamin Herrenschmidt
Prevent auto-enabling of bridges reallocation when the FW tells us that the initial configuration must be preserved for a given host bridge. Link: https://lore.kernel.org/r/20190615002359.29577-3-benh@kernel.crashing.org Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2019-06-21PCI: sysfs: Ignore lockdep for remove attributeMarek Vasut
With CONFIG_PROVE_LOCKING=y, using sysfs to remove a bridge with a device below it causes a lockdep warning, e.g., # echo 1 > /sys/class/pci_bus/0000:00/device/0000:00:00.0/remove ============================================ WARNING: possible recursive locking detected ... pci_bus 0000:01: busn_res: [bus 01] is released The remove recursively removes the subtree below the bridge. Each call uses a different lock so there's no deadlock, but the locks were all created with the same lockdep key so the lockdep checker can't tell them apart. Mark the "remove" sysfs attribute with __ATTR_IGNORE_LOCKDEP() as it is safe to ignore the lockdep check between different "remove" kernfs instances. There's discussion about a similar issue in USB at [1], which resulted in 356c05d58af0 ("sysfs: get rid of some lockdep false positives") and e9b526fe7048 ("i2c: suppress lockdep warning on delete_device"), which do basically the same thing for USB "remove" and i2c "delete_device" files. [1] https://lore.kernel.org/r/Pine.LNX.4.44L0.1204251436140.1206-100000@iolanthe.rowland.org Link: https://lore.kernel.org/r/20190526225151.3865-1-marek.vasut@gmail.com Signed-off-by: Marek Vasut <marek.vasut+renesas@gmail.com> [bhelgaas: trim commit log, details at above links] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: Geert Uytterhoeven <geert+renesas@glider.be> Cc: Phil Edworthy <phil.edworthy@renesas.com> Cc: Simon Horman <horms+renesas@verge.net.au> Cc: Tejun Heo <tj@kernel.org> Cc: Wolfram Sang <wsa@the-dreams.de>
2019-06-20PCI: tegra: Put PEX CLK & BIAS pads in DPD modeManikanta Maddireddy
In Tegra210 AFI design has clamp value for the BIAS pad as 0, which keeps the bias pad in non power down mode. This is leading to power consumption of 2 mW in BIAS pad, even if the PCIe partition is powergated. To avoid unnecessary power consumption, put PEX CLK & BIAS pads in deep power down mode when PCIe partition is power gated. Signed-off-by: Manikanta Maddireddy <mmaddireddy@nvidia.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Thierry Reding <treding@nvidia.com>
2019-06-20PCI: tegra: Add AFI_PEX2_CTRL reg offset as part of SoC structManikanta Maddireddy
Tegra186 and Tegra30 have three PCIe root ports. AFI_PEX2_CTRL register is defined for third root port. Offset of this register in Tegra186 is different from Tegra30, so add the offset as part of SoC data structure. Signed-off-by: Manikanta Maddireddy <mmaddireddy@nvidia.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Thierry Reding <treding@nvidia.com>
2019-06-20PCI: tegra: Change PRSNT_SENSE IRQ log to debugManikanta Maddireddy
PRSNT_MAP bit field is programmed to update the slot present status. PRSNT_SENSE IRQ is triggered when this bit field is programmed, which is not an error. Add a new if condition to trap PRSNT_SENSE code and print it with debug log level. Signed-off-by: Manikanta Maddireddy <mmaddireddy@nvidia.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Thierry Reding <treding@nvidia.com>
2019-06-20PCI: tegra: Program AFI_CACHE_BAR_{0,1}_{ST,SZ} registers only for Tegra20Manikanta Maddireddy
Cacheable upstream transactions are supported in Tegra20 and Tegra186 only. AFI_CACHE_BAR_{0,1}_{ST,SZ} registers are available in Tegra20 to support cacheable upstream transactions. In Tegra186, AFI_AXCACHE register is defined instead of AFI_CACHE_BAR_{0,1}_{ST,SZ} to be in line with its memory subsystem design. Therefore, program AFI_CACHE_BAR_{0,1}_{ST,SZ} registers only for Tegra20. Signed-off-by: Manikanta Maddireddy <mmaddireddy@nvidia.com> [lorenzo.pieralisi@arm.com: updated commit log] Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Thierry Reding <treding@nvidia.com>
2019-06-20PCI: tegra: Fix PLLE power down issue due to CLKREQ# signalManikanta Maddireddy
Disable controllers which failed to bring the link up and configure CLKREQ# signals of these controllers as GPIO. This is required to avoid CLKREQ# signal of inactive controllers interfering with PLLE power down sequence. PCIE_CLKREQ_GPIO bits are defined only in Tegra186, however programming these bits in other SoCs doesn't cause any side effects. Program these bits for all Tegra SoCs to avoid a conditional check. Signed-off-by: Manikanta Maddireddy <mmaddireddy@nvidia.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Thierry Reding <treding@nvidia.com>
2019-06-20PCI: tegra: Set target speed as Gen1 before starting LTSSMManikanta Maddireddy
PCIe link up fails with few legacy endpoints if root port advertises both Gen-1 and Gen-2 speeds in Tegra. This is because link number negotiation fails if both Gen1 & Gen2 are advertised. Tegra doesn't retry link up by advertising only Gen1. Hence, the strategy followed here is to initially advertise only Gen-1 and after link is up, retrain link to Gen-2 speed. Tegra doesn't support HW autonomous speed change. Link comes up in Gen1 even if Gen2 is advertised, so there is no downside of this change. This behavior is observed with following two PCIe devices on Tegra: - Fusion HDTV 5 Express card - IOGear SIL - PCIE - SATA card Signed-off-by: Manikanta Maddireddy <mmaddireddy@nvidia.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Thierry Reding <treding@nvidia.com>
2019-06-20PCI: tegra: Update flow control timer frequency in Tegra210Manikanta Maddireddy
Recommended UpdateFC threshold in Tegra210 is 0x60 for best performance of x1 link. Setting this to 0x60 provides the best balance between number of UpdateFC packets and read data sent over the link. UpdateFC timer frequency is equal to twice the value of register content in nsec, i.e (2 * 0x60) = 192 nsec. Signed-off-by: Manikanta Maddireddy <mmaddireddy@nvidia.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Thierry Reding <treding@nvidia.com>
2019-06-20PCI: tegra: Add SW fixup for RAW violationsManikanta Maddireddy
The logic which blocks read requests till AFI gets ACK for all outstanding writes from memory controller does not behave correctly when number of outstanding writes become more than 32 in Tegra124 and Tegra132. SW fixup is to prevent writes from accumulating more than 32 by: - limiting outstanding posted writes to 14 - modifying Gen1 and Gen2 UpdateFC timer frequency UpdateFC timer frequency is equal to twice the value of register content in nsec. These settings are recommended after stress testing with different values. Signed-off-by: Manikanta Maddireddy <mmaddireddy@nvidia.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Thierry Reding <treding@nvidia.com>
2019-06-20PCI: tegra: Increase the deskew retry timeManikanta Maddireddy
Sometimes link speed change from Gen2 to Gen1 fails due to instability in deskew logic on lane-0 in Tegra210. Increase the deskew retry time to resolve this issue. Signed-off-by: Manikanta Maddireddy <mmaddireddy@nvidia.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Thierry Reding <treding@nvidia.com>
2019-06-20PCI: tegra: Enable PCIe xclk clock clampingManikanta Maddireddy
Enable xclk clock clamping when entering L1. Clamp threshold will determine the time spent waiting for clock module to turn on xclk after signaling it. Default threshold value in Tegra124 and Tegra210 is not enough to turn on xclk clock. Increase the clamp threshold to meet the clock module timing in Tegra124 and Tegra210. Default threshold value is enough in Tegra20, Tegra30 and Tegra186. Signed-off-by: Manikanta Maddireddy <mmaddireddy@nvidia.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Thierry Reding <treding@nvidia.com>
2019-06-20PCI: tegra: Process pending DLL transactions before entering L1 or L2Manikanta Maddireddy
PM message are truncated while entering L1 or L2, which is resulting in receiver errors. Set the required bit to finish processing DLLP before link enter L1 or L2. Signed-off-by: Manikanta Maddireddy <mmaddireddy@nvidia.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Thierry Reding <treding@nvidia.com>
2019-06-20PCI: tegra: Disable AFI dynamic clock gatingManikanta Maddireddy
Outstanding write counter in AFI is used to generate idle signal to dynamically gate the AFI clock. When there are 32 outstanding writes from AFI to memory, the outstanding write counter overflows and indicates that there are "0" outstanding write transactions. When memory controller is under heavy load, write completions to AFI gets delayed and AFI write counter overflows. This causes AFI clock gating even when there are outstanding transactions towards memory controller resulting in a system hang. Disable dynamic clock gating of AFI clock to avoid system hang. CLKEN_OVERRIDE bit is not defined in Tegra20 and Tegra30, however programming this bit doesn't cause any side effects. Program this bit for all Tegra SoCs to avoid conditional check. Signed-off-by: Manikanta Maddireddy <mmaddireddy@nvidia.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Thierry Reding <treding@nvidia.com>
2019-06-20PCI: tegra: Enable opportunistic UpdateFC and ACKManikanta Maddireddy
Enable opportunistic UpdateFC and ACK to allow data link layer send pending ACKs and UpdateFC packets when link is idle instead of waiting for timers to expire. This improves the PCIe performance due to better utilization of PCIe bandwidth. Signed-off-by: Manikanta Maddireddy <mmaddireddy@nvidia.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Thierry Reding <treding@nvidia.com>
2019-06-20PCI: tegra: Program UPHY electrical settings for Tegra210Manikanta Maddireddy
UPHY electrical programming guidelines are documented in Tegra210 TRM. Program these electrical settings for proper eye diagram in Gen1 and Gen2 link speeds. Signed-off-by: Manikanta Maddireddy <mmaddireddy@nvidia.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Thierry Reding <treding@nvidia.com>
2019-06-20PCI: tegra: Advertise PCIe Advanced Error Reporting (AER) capabilityManikanta Maddireddy
Default root port setting hides AER capability. This patch enables the advertisement of AER capability by root port. Signed-off-by: Manikanta Maddireddy <mmaddireddy@nvidia.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Thierry Reding <treding@nvidia.com>