summaryrefslogtreecommitdiff
path: root/drivers/iommu
AgeCommit message (Collapse)Author
2016-06-21iommu/msm: Add support for generic master bindingsSricharan R
This adds the xlate callback which gets invoked during device registration from DT. The master devices gets added through this. Signed-off-by: Sricharan R <sricharan@codeaurora.org> Tested-by: Archit Taneja <architt@codeaurora.org> Tested-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2016-06-21iommu/msm: Move the contents from msm_iommu_dev.c to msm_iommu.cSricharan R
There are only two functions left in msm_iommu_dev.c. Move it to msm_iommu.c and delete the file. Signed-off-by: Sricharan R <sricharan@codeaurora.org> Tested-by: Archit Taneja <architt@codeaurora.org> Tested-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2016-06-21iommu/msm: Add DT adaptationSricharan R
The driver currently works based on platform data. Remove this and add support for DT. A single master can have multiple ports connected to more than one iommu. master | | | ------------------------ | | IOMMU0 IOMMU1 | | ctx0 ctx1 ctx0 ctx1 This association of master and iommus/contexts were previously represented by platform data parent/child device details. The client drivers were responsible for programming all of the iommus/contexts for the device. Now while adapting to generic DT bindings we maintain the list of iommus, contexts that each master domain is connected to and program all of them on attach/detach. Signed-off-by: Sricharan R <sricharan@codeaurora.org> Tested-by: Archit Taneja <architt@codeaurora.org> Tested-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2016-06-21iommu/exynos: update to use iommu big-endianBen Dooks
Add initial support for big endian by always writing the pte in le32. Note, revisit if hardware capable of doing big endian fetches. Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk> Acked-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2016-06-21iommu/mediatek: Make mtk_iommu_pm_ops staticJoerg Roedel
The symbol exists elsewhere already, so that is fails to link if the symbol is non-static. Signed-off-by: Joerg Roedel <jroedel@suse.de>
2016-06-21iommu/mediatek: add support for mtk iommu generation one HWHonghui Zhang
Mediatek SoC's M4U has two generations of HW architcture. Generation one uses flat, one layer pagetable, and was shipped with ARM architecture, it only supports 4K size page mapping. MT2701 SoC uses this generation one m4u HW. Generation two uses the ARM short-descriptor translation table format for address translation, and was shipped with ARM64 architecture, MT8173 uses this generation two m4u HW. All the two generation iommu HW only have one iommu domain, and all its iommu clients share the same iova address. These two generation m4u HW have slit different register groups and register offset, but most register names are the same. This patch add iommu support for mediatek SoC mt2701. Signed-off-by: Honghui Zhang <honghui.zhang@mediatek.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2016-06-21iommu/mediatek: move the common struct into header fileHonghui Zhang
Move the struct defines of mtk iommu into a new header files for common use. Signed-off-by: Honghui Zhang <honghui.zhang@mediatek.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2016-06-21iommu/mediatek: Do not call of_node_put in mtk_iommu_of_xlateHonghui Zhang
The device_node will be released in of_iommu_configure, it may be double released if call of_node_put in mtk_iommu_of_xlate. Signed-off-by: Honghui Zhang <honghui.zhang@mediatek.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2016-06-21iommu/amd: Remove create_workqueueBhaktipriya Shridhar
alloc_workqueue replaces deprecated create_workqueue(). A dedicated workqueue has been used since the workitem (viz &fault->work), is involved in IO page-fault handling. WQ_MEM_RECLAIM has been set to guarantee forward progress under memory pressure, which is a requirement here. Since there are only a fixed number of work items, explicit concurrency limit is unnecessary. Signed-off-by: Bhaktipriya Shridhar <bhaktipriya96@gmail.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2016-06-17iommu/vt-d: Enable QI on all IOMMUs before setting root entryJoerg Roedel
This seems to be required on some X58 chipsets on systems with more than one IOMMU. QI does not work until it is enabled on all IOMMUs in the system. Reported-by: Dheeraj CVR <cvr.dheeraj@gmail.com> Tested-by: Dheeraj CVR <cvr.dheeraj@gmail.com> Fixes: 5f0a7f7614a9 ('iommu/vt-d: Make root entry visible for hardware right after allocation') Cc: stable@vger.kernel.org Signed-off-by: Joerg Roedel <jroedel@suse.de>
2016-06-15iommu/vt-d: Don't reject NTB devices due to scope mismatchRoland Dreier
On a system with an Intel PCIe port configured as an NTB device, iommu initialization fails with DMAR: Device scope type does not match for 0000:80:03.0 This is because the DMAR table reports this device as having scope 2 (ACPI_DMAR_SCOPE_TYPE_BRIDGE): [0A0h 0160 1] Device Scope Entry Type : 02 [0A1h 0161 1] Entry Length : 08 [0A2h 0162 2] Reserved : 0000 [0A4h 0164 1] Enumeration ID : 00 [0A5h 0165 1] PCI Bus Number : 80 [0A6h 0166 2] PCI Path : 03,00 but the device has a type 0 PCI header: 80:03.0 Bridge [0680]: Intel Corporation Device [8086:2f0d] (rev 02) 00: 86 80 0d 2f 00 00 10 00 02 00 80 06 10 00 80 00 10: 0c 00 c0 00 c0 38 00 00 0c 00 00 00 80 38 00 00 20: 00 00 00 c8 00 00 10 c8 00 00 00 00 86 80 00 00 30: 00 00 00 00 60 00 00 00 00 00 00 00 ff 01 00 00 VT-d works perfectly on this system, so there's no reason to bail out on initialization due to this apparent scope mismatch. Use the class 0x0680 ("Other bridge device") as a heuristic for allowing DMAR initialization for non-bridge PCI devices listed with scope bridge. Signed-off-by: Roland Dreier <roland@purestorage.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2016-06-15iommu/exynos: Prepare for deferred probe supportMarek Szyprowski
Register iommu_ops at the end of successful probe instead of doing that unconditionally. This makes Exynos IOMMU driver ready for deferred probe caused by not-yet-available clocks. Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2016-06-15iommu/exynos: Prepare clocks when needed, not in driver probeMarek Szyprowski
Make clock preparation together with clk_enable(). This way inactive SYSMMU controllers will not keep clocks prepared all the time. This change allows more fine graded power management in the future. All the code assumes that clock management doesn't fail, so guard clock_prepare_enable() it with BUG_ON(). Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2016-06-15iommu/exynos: Fix master clock management for inactive SYSMMUMarek Szyprowski
If SYSMMU controller is not active, there is no point in enabling master's clock just for doing the the of internal state. This patch moves enabling that clock to the block which actually does the register access. Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2016-06-15iommu/exynos: Return proper errors from getting clocksMarek Szyprowski
This patch reworks driver probe code to propagate error codes from clk_get() operation. This will allow to properly handle deferred probe in the future. Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2016-06-15iommu/vt-d: Reduce extra first level entry in iommu->domainsWei Yang
In commit <8bf478163e69> ("iommu/vt-d: Split up iommu->domains array"), it it splits iommu->domains in two levels. Each first level contains 256 entries of second level. In case of the ndomains is exact a multiple of 256, it would have one more extra first level entry for current implementation. This patch refines this calculation to reduce the extra first level entry. Signed-off-by: Wei Yang <richard.weiyang@gmail.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2016-06-15iommu/exynos: Suppress unbinding to prevent system failureMarek Szyprowski
Removal of IOMMU driver cannot be done reliably, so Exynos IOMMU driver doesn't support this operation. It is essential for system operation, so it makes sense to prevent unbinding by disabling bind/unbind sysfs feature for SYSMMU controller driver to avoid kernel ops or trashing memory caused by such operation. Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> CC: stable@vger.kernel.org # v4.2+ Reviewed-by: Krzysztof Kozlowski <k.kozlowski@samsung.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2016-06-15iommu/rockchip: Fix zap cache during device attachJohn Keeping
rk_iommu_command() takes a struct rk_iommu and iterates over the slave MMUs, so this is doubly wrong in that we're passing in the wrong pointer and talking to MMUs that we shouldn't be. Fixes: cd6438c5f844 ("iommu/rockchip: Reconstruct to support multi slaves") Cc: stable@vger.kernel.org Signed-off-by: John Keeping <john@metanate.com> Tested-by: Heiko Stuebner <heiko@sntech.de> Reviewed-by: Heiko Stuebner <heiko@sntech.de> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2016-06-15iommu/amd: Set AMD iommu callbacks for platform bus driverWan Zongshun
AMD has more drivers will use ACPI to platform bus driver later, all those devices need iommu support, for example: eMMC driver. For latest AMD eMMC controller, it will utilize sdhci-acpi.c driver, which will rely on platform bus to match device and driver, where we will set 'dev' of struct platform_device as map_sg parameter passing to iommu driver for DMA request, so the iommu-ops are needed on the platform bus. Signed-off-by: Wan Zongshun <Vincent.Wan@amd.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2016-06-13iommu/arm-smmu: Wire up map_sg for arm-smmu-v3Jean-Philippe Brucker
The map_sg callback is missing from arm_smmu_ops, but is required by iommu.h. Similarly to most other IOMMU drivers, connect it to default_iommu_map_sg. Cc: <stable@vger.kernel.org> Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2016-05-27remove lots of IS_ERR_VALUE abusesArnd Bergmann
Most users of IS_ERR_VALUE() in the kernel are wrong, as they pass an 'int' into a function that takes an 'unsigned long' argument. This happens to work because the type is sign-extended on 64-bit architectures before it gets converted into an unsigned type. However, anything that passes an 'unsigned short' or 'unsigned int' argument into IS_ERR_VALUE() is guaranteed to be broken, as are 8-bit integers and types that are wider than 'unsigned long'. Andrzej Hajda has already fixed a lot of the worst abusers that were causing actual bugs, but it would be nice to prevent any users that are not passing 'unsigned long' arguments. This patch changes all users of IS_ERR_VALUE() that I could find on 32-bit ARM randconfig builds and x86 allmodconfig. For the moment, this doesn't change the definition of IS_ERR_VALUE() because there are probably still architecture specific users elsewhere. Almost all the warnings I got are for files that are better off using 'if (err)' or 'if (err < 0)'. The only legitimate user I could find that we get a warning for is the (32-bit only) freescale fman driver, so I did not remove the IS_ERR_VALUE() there but changed the type to 'unsigned long'. For 9pfs, I just worked around one user whose calling conventions are so obscure that I did not dare change the behavior. I was using this definition for testing: #define IS_ERR_VALUE(x) ((unsigned long*)NULL == (typeof (x)*)NULL && \ unlikely((unsigned long long)(x) >= (unsigned long long)(typeof(x))-MAX_ERRNO)) which ends up making all 16-bit or wider types work correctly with the most plausible interpretation of what IS_ERR_VALUE() was supposed to return according to its users, but also causes a compile-time warning for any users that do not pass an 'unsigned long' argument. I suggested this approach earlier this year, but back then we ended up deciding to just fix the users that are obviously broken. After the initial warning that caused me to get involved in the discussion (fs/gfs2/dir.c) showed up again in the mainline kernel, Linus asked me to send the whole thing again. [ Updated the 9p parts as per Al Viro - Linus ] Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Andrzej Hajda <a.hajda@samsung.com> Cc: Andrew Morton <akpm@linux-foundation.org> Link: https://lkml.org/lkml/2016/1/7/363 Link: https://lkml.org/lkml/2016/5/27/486 Acked-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> # For nvmem part Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-27Merge git://git.infradead.org/intel-iommuLinus Torvalds
Pull intel IOMMU updates from David Woodhouse: "This patchset improves the scalability of the Intel IOMMU code by resolving two spinlock bottlenecks and eliminating the linearity of the IOVA allocator, yielding up to ~5x performance improvement and approaching 'iommu=off' performance" * git://git.infradead.org/intel-iommu: iommu/vt-d: Use per-cpu IOVA caching iommu/iova: introduce per-cpu caching to iova allocation iommu/vt-d: change intel-iommu to use IOVA frame numbers iommu/vt-d: avoid dev iotlb logic for domains with no dev iotlbs iommu/vt-d: only unmap mapped entries iommu/vt-d: correct flush_unmaps pfn usage iommu/vt-d: per-cpu deferred invalidation queues iommu/vt-d: refactoring of deferred flush entries
2016-05-20Merge tag 'devicetree-for-4.7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux Pull devicetree updates from Rob Herring: - Rewrite of the unflattening code to avoid recursion and lessen the stack usage. - Rewrite of the phandle args parsing code to get rid of the fixed args size. This is needed for IOMMU code. - Sync to latest dtc which adds more dts style checking. These warnings are enabled with "W=1" compiles. - Tegra documentation updates related to the above warnings. - A bunch of spelling and other doc fixes. - Various vendor prefix additions. * tag 'devicetree-for-4.7' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: (52 commits) devicetree: Add Creative Technology vendor id gpio: dt-bindings: add ibm,ppc4xx-gpio binding of/unittest: Remove unnecessary module.h header inclusion drivers/of: Fix build warning in populate_node() drivers/of: Fix depth when unflattening devicetree of: dynamic: changeset prop-update revert fix drivers/of: Export of_detach_node() drivers/of: Return allocated memory from of_fdt_unflatten_tree() drivers/of: Specify parent node in of_fdt_unflatten_tree() drivers/of: Rename unflatten_dt_node() drivers/of: Avoid recursively calling unflatten_dt_node() drivers/of: Split unflatten_dt_node() of: include errno.h in of_graph.h of: document refcount incrementation of of_get_cpu_node() Documentation: dt: soc: fix spelling mistakes Documentation: dt: power: fix spelling mistake Documentation: dt: pinctrl: fix spelling mistake Documentation: dt: opp: fix spelling mistake Documentation: dt: net: fix spelling mistakes Documentation: dt: mtd: fix spelling mistake ...
2016-05-19Merge tag 'iommu-updates-v4.7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull IOMMU updates from Joerg Roedel: "The updates include: - rate limiting for the VT-d fault handler - remove statistics code from the AMD IOMMU driver. It is unused and should be replaced by something more generic if needed - per-domain pagesize-bitmaps in IOMMU core code to support systems with different types of IOMMUs - support for ACPI devices in the AMD IOMMU driver - 4GB mode support for Mediatek IOMMU driver - ARM-SMMU updates from Will Deacon: - support for 64k pages with SMMUv1 implementations (e.g MMU-401) - remove open-coded 64-bit MMIO accessors - initial support for 16-bit VMIDs, as supported by some ThunderX SMMU implementations - a couple of errata workarounds for silicon in the field - various fixes here and there" * tag 'iommu-updates-v4.7' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (44 commits) iommu/arm-smmu: Use per-domain page sizes. iommu/amd: Remove statistics code iommu/dma: Finish optimising higher-order allocations iommu: Allow selecting page sizes per domain iommu: of: enforce const-ness of struct iommu_ops iommu: remove unused priv field from struct iommu_ops iommu/dma: Implement scatterlist segment merging iommu/arm-smmu: Clear cache lock bit of ACR iommu/arm-smmu: Support SMMUv1 64KB supplement iommu/arm-smmu: Decouple context format from kernel config iommu/arm-smmu: Tidy up 64-bit/atomic I/O accesses io-64-nonatomic: Add relaxed accessor variants iommu/arm-smmu: Work around MMU-500 prefetch errata iommu/arm-smmu: Convert ThunderX workaround to new method iommu/arm-smmu: Differentiate specific implementations iommu/arm-smmu: Workaround for ThunderX erratum #27704 iommu/arm-smmu: Add support for 16 bit VMID iommu/amd: Move get_device_id() and friends to beginning of file iommu/amd: Don't use IS_ERR_VALUE to check integer values iommu/amd: Signedness bug in acpihid_device_group() ...
2016-05-19Merge tag 'pci-v4.7-changes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI updates from Bjorn Helgaas: "Enumeration: - Refine PCI support check in pcibios_init() (Adrian-Ken Rueegsegger) - Provide common functions for ECAM mapping (Jayachandran C) - Allow all PCIe services on non-ACPI host bridges (Jon Derrick) - Remove return values from pcie_port_platform_notify() and relatives (Jon Derrick) - Widen portdrv service type from 4 bits to 8 bits (Keith Busch) - Add Downstream Port Containment portdrv service type (Keith Busch) - Add Downstream Port Containment driver (Keith Busch) Resource management: - Identify Enhanced Allocation (EA) BAR Equivalent resources in sysfs (Alex Williamson) - Supply CPU physical address (not bus address) to iomem_is_exclusive() (Bjorn Helgaas) - alpha: Call iomem_is_exclusive() for IORESOURCE_MEM, but not IORESOURCE_IO (Bjorn Helgaas) - Mark Broadwell-EP Home Agent 1 as having non-compliant BARs (Prarit Bhargava) - Disable all BAR sizing for devices with non-compliant BARs (Prarit Bhargava) - Move PCI I/O space management from OF to PCI core code (Tomasz Nowicki) PCI device hotplug: - acpiphp_ibm: Avoid uninitialized variable reference (Dan Carpenter) - Use cached copy of PCI_EXP_SLTCAP_HPC bit (Lukas Wunner) Virtualization: - Mark Intel i40e NIC INTx masking as broken (Alex Williamson) - Reverse standard ACS vs device-specific ACS enabling (Alex Williamson) - Work around Intel Sunrise Point PCH incorrect ACS capability (Alex Williamson) IOMMU: - Add pci_add_dma_alias() to abstract implementation (Bjorn Helgaas) - Move informational printk to pci_add_dma_alias() (Bjorn Helgaas) - Add support for multiple DMA aliases (Jacek Lawrynowicz) - Add DMA alias quirk for mic_x200_dma (Jacek Lawrynowicz) Thunderbolt: - Fix double free of drom buffer (Andreas Noever) - Add Intel Thunderbolt device IDs (Lukas Wunner) - Fix typos and magic number (Lukas Wunner) - Support 1st gen Light Ridge controller (Lukas Wunner) Generic host bridge driver: - Use generic ECAM API (Jayachandran C) Cavium ThunderX host bridge driver: - Don't clobber read-only bits in bridge config registers (David Daney) - Use generic ECAM API (Jayachandran C) Freescale i.MX6 host bridge driver: - Use enum instead of bool for variant indicator (Andrey Smirnov) - Implement reset sequence for i.MX6+ (Andrey Smirnov) - Factor out ref clock enable (Bjorn Helgaas) - Add initial imx6sx support (Christoph Fritz) - Add reset-gpio-active-high boolean property to DT (Petr Štetiar) - Add DT property for link gen, default to Gen1 (Tim Harvey) - dts: Specify imx6qp version of PCIe core (Andrey Smirnov) - dts: Fix PCIe reset GPIO polarity on Toradex Apalis Ixora (Petr Štetiar) Marvell Armada host bridge driver: - add DT binding for Marvell Armada 7K/8K PCIe controller (Thomas Petazzoni) - Add driver for Marvell Armada 7K/8K PCIe controller (Thomas Petazzoni) Marvell MVEBU host bridge driver: - Constify mvebu_pcie_pm_ops structure (Jisheng Zhang) - Use SET_NOIRQ_SYSTEM_SLEEP_PM_OPS for mvebu_pcie_pm_ops (Jisheng Zhang) Microsoft Hyper-V host bridge driver: - Report resources release after stopping the bus (Vitaly Kuznetsov) - Add explicit barriers to config space access (Vitaly Kuznetsov) Renesas R-Car host bridge driver: - Select PCI_MSI_IRQ_DOMAIN (Arnd Bergmann) Synopsys DesignWare host bridge driver: - Remove incorrect RC memory base/limit configuration (Gabriele Paoloni) - Move Root Complex setup code to dw_pcie_setup_rc() (Jisheng Zhang) TI Keystone host bridge driver: - Add error IRQ handler (Murali Karicheri) - Remove unnecessary goto statement (Murali Karicheri) Miscellaneous: - Fix spelling errors (Colin Ian King)" * tag 'pci-v4.7-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (48 commits) PCI: Disable all BAR sizing for devices with non-compliant BARs x86/PCI: Mark Broadwell-EP Home Agent 1 as having non-compliant BARs PCI: Identify Enhanced Allocation (EA) BAR Equivalent resources in sysfs PCI, of: Move PCI I/O space management to PCI core code PCI: generic, thunder: Use generic ECAM API PCI: Provide common functions for ECAM mapping PCI: hv: Add explicit barriers to config space access PCI: Use cached copy of PCI_EXP_SLTCAP_HPC bit PCI: Add Downstream Port Containment driver PCI: Add Downstream Port Containment portdrv service type PCI: Widen portdrv service type from 4 bits to 8 bits PCI: designware: Remove incorrect RC memory base/limit configuration PCI: hv: Report resources release after stopping the bus ARM: dts: imx6qp: Specify imx6qp version of PCIe core PCI: imx6: Implement reset sequence for i.MX6+ PCI: imx6: Use enum instead of bool for variant indicator PCI: thunder: Don't clobber read-only bits in bridge config registers thunderbolt: Fix double free of drom buffer PCI: rcar: Select PCI_MSI_IRQ_DOMAIN PCI: armada: Add driver for Marvell Armada 7K/8K PCIe controller ...
2016-05-09Merge branches 'arm/io-pgtable', 'arm/rockchip', 'arm/omap', 'x86/vt-d', ↵Joerg Roedel
'ppc/pamu', 'core' and 'x86/amd' into next
2016-05-09iommu/arm-smmu: Use per-domain page sizes.Robin Murphy
Now that we can accurately reflect the context format we choose for each domain, do that instead of imposing the global lowest-common-denominator restriction and potentially ending up with nothing. We currently have a strict 1:1 correspondence between domains and context banks, so we don't need to entertain the possibility of multiple formats _within_ a domain. Signed-off-by: Will Deacon <will.deacon@arm.com> [rm: split from original patch, added SMMUv3] Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2016-05-09iommu/amd: Remove statistics codeJoerg Roedel
The statistics are not really used for anything and should be replaced by generic and per-device statistic counters. Remove the code for now. Signed-off-by: Joerg Roedel <jroedel@suse.de>
2016-05-09iommu/dma: Finish optimising higher-order allocationsRobin Murphy
Now that we know exactly which page sizes our caller wants to use in the given domain, we can restrict higher-order allocation attempts to just those sizes, if any, and avoid wasting any time or effort on other sizes which offer no benefit. In the same vein, this also lets us accommodate a minimum order greater than 0 for special cases. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Tested-by: Yong Wu <yong.wu@mediatek.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2016-05-09iommu: Allow selecting page sizes per domainRobin Murphy
Many IOMMUs support multiple page table formats, meaning that any given domain may only support a subset of the hardware page sizes presented in iommu_ops->pgsize_bitmap. There are also certain use-cases where the creator of a domain may want to control which page sizes are used, for example to force the use of hugepage mappings to reduce pagetable walk depth. To this end, add a per-domain pgsize_bitmap to represent the subset of page sizes actually in use, to make it possible for domains with different requirements to coexist. Signed-off-by: Will Deacon <will.deacon@arm.com> [rm: hijacked and rebased original patch with new commit message] Signed-off-by: Robin Murphy <robin.murphy@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2016-05-09iommu: of: enforce const-ness of struct iommu_opsRobin Murphy
As a set of driver-provided callbacks and static data, there is no compelling reason for struct iommu_ops to be mutable in core code, so enforce const-ness throughout. Acked-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2016-05-09Merge branch 'arm/smmu' into coreJoerg Roedel
2016-05-09iommu/dma: Implement scatterlist segment mergingRobin Murphy
Stop wasting IOVA space by over-aligning scatterlist segments for a theoretical worst-case segment boundary mask, and instead take the real limits into account to merge consecutive segments wherever appropriate, so our callers can benefit from getting back nicely simplified lists. This also represents the last piece of functionality wanted by users of the current arch/arm implementation, thus brings us a small step closer to converting that over to the common code. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2016-05-09Merge branch 'for-joerg/arm-smmu/updates' of ↵Joerg Roedel
git://git.kernel.org/pub/scm/linux/kernel/git/will/linux into arm/smmu
2016-05-05Merge tag 'v4.6-rc6' into x86/asm, to refresh the treeIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-05-03iommu/arm-smmu: Clear cache lock bit of ACRPeng Fan
According MMU-500r2 TRM, section 3.7.1 Auxiliary Control registers, You can modify ACTLR only when the ACR.CACHE_LOCK bit is 0. So before clearing ARM_MMU500_ACTLR_CPRE of each context bank, need clear CACHE_LOCK bit of ACR register first. Since CACHE_LOCK bit is only present in MMU-500r2 onwards, need to check the major number of IDR7. Reviewed-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Peng Fan <van.freenix@gmail.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-05-03iommu/arm-smmu: Support SMMUv1 64KB supplementRobin Murphy
The 64KB Translation Granule Supplement to the SMMUv1 architecture allows an SMMUv1 implementation to support 64KB pages for stage 2 translations, using a constrained VMSAv8 descriptor format limited to 40-bit addresses. Now that we can freely mix and match context formats, we can actually handle having 4KB pages via an AArch32 context but 64KB pages via an AArch64 context, so plumb it in. It is assumed that any implementations will have hardware capabilities matching the format constraints, thus obviating the need for excessive sanity-checking; this is the case for MMU-401, the only ARM Ltd. implementation. CC: Eric Auger <eric.auger@linaro.org> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-05-03iommu/arm-smmu: Decouple context format from kernel configRobin Murphy
The way the driver currently forces an AArch32 or AArch64 context format based on the kernel config and SMMU architecture version is suboptimal, in that it makes it very hard to support oddball mix-and-match cases like the SMMUv1 64KB supplement, or situations where the reduced table depth of an AArch32 short descriptor context may be desirable under an AArch64 kernel. It also only happens to work on current implementations which do support all the relevant formats. Introduce an explicit notion of context format, so we can manage that independently and get rid of the inflexible #ifdeffery. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-05-03iommu/arm-smmu: Tidy up 64-bit/atomic I/O accessesRobin Murphy
With {read,write}q_relaxed now able to fall back to the common nonatomic-hi-lo helper, make use of that so that we don't have to open-code our own. In the process, also convert the other remaining split accesses, and repurpose the custom accessor to smooth out the couple of troublesome instances where we really want to avoid nonatomic writes (and a 64-bit access is unnecessary in the 32-bit context formats we would use on a 32-bit CPU). This paves the way for getting rid of some of the assumptions currently baked into the driver which make it really awkward to use 32-bit context formats with SMMUv2 under a 64-bit kernel. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-05-03iommu/arm-smmu: Work around MMU-500 prefetch errataRobin Murphy
MMU-500 erratum #841119 is tickled by a particular set of circumstances interacting with the next-page prefetcher. Since said prefetcher is quite dumb and actually detrimental to performance in some cases (by causing unwanted TLB evictions for non-sequential access patterns), we lose very little by turning it off, and what we gain is a guarantee that the erratum is never hit. As a bonus, the same workaround will also prevent erratum #826419 once v7 short descriptor support is implemented. CC: Catalin Marinas <catalin.marinas@arm.com> CC: Will Deacon <will.deacon@arm.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-05-03iommu/arm-smmu: Convert ThunderX workaround to new methodRobin Murphy
With a framework for implementation-specific funtionality in place, the currently-FDT-dependent ThunderX workaround gets to be the first user. Acked-by: Tirumalesh Chalamarla <tchalamarla@caviumnetworks.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-05-03iommu/arm-smmu: Differentiate specific implementationsRobin Murphy
As the inevitable reality of implementation-specific errata workarounds begin to accrue alongside our integration quirk handling, it's about time the driver had a decent way of keeping track. Extend the per-SMMU data so we can identify specific implementations in an efficient and firmware-agnostic manner. Acked-by: Tirumalesh Chalamarla <tchalamarla@caviumnetworks.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-05-03iommu/arm-smmu: Workaround for ThunderX erratum #27704Tirumalesh Chalamarla
Due to erratum #27704, the CN88xx SMMUv2 implementation supports only shared ASID and VMID numberspaces. This patch ensures that ASID and VMIDs are unique across all SMMU instances on affected Cavium systems. Signed-off-by: Tirumalesh Chalamarla <tchalamarla@caviumnetworks.com> Signed-off-by: Akula Geethasowjanya <Geethasowjanya.Akula@caviumnetworks.com> [will: commit message, comments and formatting] Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-05-03iommu/arm-smmu: Add support for 16 bit VMIDTirumalesh Chalamarla
This patch adds support for 16-bit VMIDs on implementations of SMMUv2 that support it. Signed-off-by: Tirumalesh Chalamarla <tchalamarla@caviumnetworks.com> [will: commit messsage and comments] Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-04-21iommu/amd: Move get_device_id() and friends to beginning of fileJoerg Roedel
They will be needed there later. Signed-off-by: Joerg Roedel <jroedel@suse.de>
2016-04-21iommu/amd: Don't use IS_ERR_VALUE to check integer valuesJoerg Roedel
Use the better 'var < 0' check. Fixes: 7aba6cb9ee9d ('iommu/amd: Make call-sites of get_device_id aware of its return value') Signed-off-by: Joerg Roedel <jroedel@suse.de>
2016-04-21iommu/arm-smmu: Don't allocate resources for bypass domainsRobin Murphy
Until we get fully plumbed into of_iommu_configure, our default IOMMU_DOMAIN_DMA domains just bypass translation. Since we achieve that by leaving the stream table entries set to bypass instead of pointing at a translation context, the context bank we allocate for the domain is completely wasted. Context banks are typically a rather limited resource, so don't hog ones we don't need. Reported-by: Eric Auger <eric.auger@linaro.org> Tested-by: Eric Auger <eric.auger@linaro.org> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2016-04-21iommu/arm-smmu: Fix stream-match conflict with IOMMU_DOMAIN_DMAWill Deacon
Commit cbf8277ef456 ("iommu/arm-smmu: Treat IOMMU_DOMAIN_DMA as bypass for now") ignores requests to attach a device to the default domain since, without IOMMU-basked DMA ops available everywhere, the default domain will just lead to unexpected transaction faults being reported. Unfortunately, the way this was implemented on SMMUv2 causes a regression with VFIO PCI device passthrough under KVM on AMD Seattle. On this system, the host controller device is associated with both a pci_dev *and* a platform_device, and can therefore end up with duplicate SMR entries, resulting in a stream-match conflict at runtime. This patch amends the original fix so that attaching to IOMMU_DOMAIN_DMA is rejected even before configuring the SMRs. This restores the old behaviour for now, but we'll need to look at handing host controllers specially when we come to supporting the default domain fully. Reported-by: Eric Auger <eric.auger@linaro.org> Tested-by: Eric Auger <eric.auger@linaro.org> Tested-by: Yang Shi <yang.shi@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2016-04-20iommu/vt-d: Use per-cpu IOVA cachingOmer Peleg
Commit 9257b4a2 ('iommu/iova: introduce per-cpu caching to iova allocation') introduced per-CPU IOVA caches to massively improve scalability. Use them. Signed-off-by: Omer Peleg <omer@cs.technion.ac.il> [mad@cs.technion.ac.il: rebased, cleaned up and reworded the commit message] Signed-off-by: Adam Morrison <mad@cs.technion.ac.il> Reviewed-by: Shaohua Li <shli@fb.com> Reviewed-by: Ben Serebrin <serebrin@google.com> [dwmw2: split out VT-d part into a separate patch] Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2016-04-20iommu/iova: introduce per-cpu caching to iova allocationOmer Peleg
IOVA allocation has two problems that impede high-throughput I/O. First, it can do a linear search over the allocated IOVA ranges. Second, the rbtree spinlock that serializes IOVA allocations becomes contended. Address these problems by creating an API for caching allocated IOVA ranges, so that the IOVA allocator isn't accessed frequently. This patch adds a per-CPU cache, from which CPUs can alloc/free IOVAs without taking the rbtree spinlock. The per-CPU caches are backed by a global cache, to avoid invoking the (linear-time) IOVA allocator without needing to make the per-CPU cache size excessive. This design is based on magazines, as described in "Magazines and Vmem: Extending the Slab Allocator to Many CPUs and Arbitrary Resources" (currently available at https://www.usenix.org/legacy/event/usenix01/bonwick.html) Adding caching on top of the existing rbtree allocator maintains the property that IOVAs are densely packed in the IO virtual address space, which is important for keeping IOMMU page table usage low. To keep the cache size reasonable, we bound the IOVA space a CPU can cache by 32 MiB (we cache a bounded number of IOVA ranges, and only ranges of size <= 128 KiB). The shared global cache is bounded at 4 MiB of IOVA space. Signed-off-by: Omer Peleg <omer@cs.technion.ac.il> [mad@cs.technion.ac.il: rebased, cleaned up and reworded the commit message] Signed-off-by: Adam Morrison <mad@cs.technion.ac.il> Reviewed-by: Shaohua Li <shli@fb.com> Reviewed-by: Ben Serebrin <serebrin@google.com> [dwmw2: split out VT-d part into a separate patch] Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>