Age | Commit message (Collapse) | Author |
|
All iommu drivers use the default_iommu_map_sg implementation, and there
is no good reason to ever override it. Just expose it as iommu_map_sg
directly and remove the indirection, specially in our post-spectre world
where indirect calls are horribly expensive.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
In case of error, the function iommu_group_alloc() returns ERR_PTR() and
never returns NULL. The NULL test in the return value check should be
replaced with IS_ERR().
Fixes: 7f4c9176f760 ("iommu/tegra: Allow devices to be grouped")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
Implement the ->device_group() and ->of_xlate() callbacks which are used
in order to group devices. Each group can then share a single domain.
This is implemented primarily in order to achieve the same semantics on
Tegra210 and earlier as on Tegra186 where the Tegra SMMU was replaced by
an ARM SMMU. Users of the IOMMU API can now use the same code to share
domains between devices, whereas previously they used to attach each
device individually.
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
The bus_set_iommu() function will call the add_device()
call-back which needs the iommu to be registered.
Reported-by: Jon Hunter <jonathanh@nvidia.com>
Fixes: 0b480e447006 ('iommu/tegra: Add support for struct iommu_device')
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
Add a struct iommu_device to each tegra-smmu and register it
with the iommu-core. Also link devices added to the driver
to their respective hardware iommus.
Acked-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
As the last step to making groups mandatory, clean up the remaining
drivers by adding basic support. Whilst it may not perfectly reflect
the isolation capabilities of the hardware (tegra_smmu_swgroup sounds
suspiciously like something that might warrant representing at the
iommu_group level), using generic_device_group() should at least
maintain existing behaviour with respect to the API.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Tested-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
The include file does not need any PCI specifics, so remove
that include. Also fix the places that relied on it.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
The number of TLB lines was increased from 16 on Tegra30 to 32 on
Tegra114 and later. Parameterize the value so that the initial default
can be set accordingly.
On Tegra30, initializing the value to 32 would effectively disable the
TLB and hence cause massive latencies for memory accesses translated
through the SMMU. This is especially noticeable for isochronuous clients
such as display, whose FIFOs would continuously underrun.
Fixes: 891846516317 ("memory: Add NVIDIA Tegra memory controller support")
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
This code is used both when creating a new page directory entry and when
tearing it down, with only the PDE value changing between both cases.
Factor the code out so that it can be reused.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
[treding@nvidia.com: make commit message more accurate]
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
Extract the use count reference accounting into a separate function and
separate it from allocating the PTE.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
[treding@nvidia.com: extract and write commit message]
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
Rather than explicitly zeroing pages allocated via alloc_page(), add
__GFP_ZERO to the gfp mask to ask the allocator for zeroed pages.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
Remove the unnecessary manipulation of the PageReserved flags in the
Tegra SMMU driver. None of this is required as the page(s) remain
private to the SMMU driver.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
Use the DMA API instead of calling architecture internal functions in
the Tegra SMMU driver.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
Pass smmu_flush_ptc() the device address rather than struct page
pointer.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
smmu_flush_ptc() is used in two modes: one is to flush an individual
entry, the other is to flush all entries. We know at the call site
which we require. Split the function into smmu_flush_ptc_all() and
smmu_flush_ptc().
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
Drivers should not be using __cpuc_* functions nor outer_cache_flush()
directly. This change partly cleans up tegra-smmu.c.
The only difference between cache handling of the tegra variants is
Denver, which omits the call to outer_cache_flush(). This is due to
Denver being an ARM64 CPU, and the ARM64 architecture does not provide
this function. (This, in itself, is a good reason why these should not
be used.)
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
[treding@nvidia.com: fix build failure on 64-bit ARM]
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
Use kcalloc() to allocate the use-counter array for the page directory
entries/page tables. Using kcalloc() allows us to be provided with
zero-initialised memory from the allocators, rather than initialising
it ourselves.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
Store the struct page pointer for the second level page tables, rather
than working back from the page directory entry. This is necessary as
we want to eliminate the use of physical addresses used with
arch-private functions, switching instead to use the streaming DMA API.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
Fix the page table lookup in the unmap and iova_to_phys methods.
Neither of these methods should allocate a page table; a missing page
table should be treated the same as no mapping present.
More importantly, using as_get_pte() for an IOVA corresponding with a
non-present page table entry increments the use-count for the page
table, on the assumption that the caller of as_get_pte() is going to
setup a mapping. This is an incorrect assumption.
Fix both of these bugs by providing a separate helper which only looks
up the page table, but never allocates it. This is akin to pte_offset()
for CPU page tables.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
Add a pair of helpers to get the page directory and page table indexes.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
Factor out the common PTE setting code into a separate function.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
The Tegra SMMU unmap path has several problems:
1. as_pte_put() can perform a write-after-free
2. tegra_smmu_unmap() can perform cache maintanence on a page we have
just freed.
3. when a page table is unmapped, there is no CPU cache maintanence of
the write clearing the page directory entry, nor is there any
maintanence of the IOMMU to ensure that it sees the page table has
gone.
Fix this by getting rid of as_pte_put(), and instead coding the PTE
unmap separately from the PDE unmap, placing the PDE unmap after the
PTE unmap has been completed.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
iova_to_phys() has several problems:
(a) iova_to_phys() is supposed to return 0 if there is no entry present
for the iova.
(b) if as_get_pte() fails, we oops the kernel by dereferencing a NULL
pointer. Really, we should not even be trying to allocate a page
table at all, but should only be returning the presence of the 2nd
level page table. This will be fixed in a subsequent patch.
Treat both of these conditions as "no mapping" conditions.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
Provide clients and swgroups files in debugfs. These files show for
which clients IOMMU translation is enabled and which ASID is associated
with each SWGROUP.
Cc: Hiroshi Doyu <hdoyu@nvidia.com>
Acked-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
and 'core' into next
Conflicts:
drivers/iommu/amd_iommu.c
drivers/iommu/tegra-gart.c
drivers/iommu/tegra-smmu.c
|
|
The SMMU on Tegra30 and Tegra114 supports addressing up to 4 GiB of
physical memory. On Tegra124 the addressable physical memory was
extended to 16 GiB. The page frame number stored in PTEs therefore
requires 20 or 22 bits, depending on SoC generation.
In order to cope with this, compute the proper value at runtime.
Reported-by: Joseph Lo <josephl@nvidia.com>
Cc: Hiroshi Doyu <hdoyu@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
Each address space in the Tegra SMMU provides 4 GiB worth of addresses.
Cc: Hiroshi Doyu <hdoyu@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
Implement domain_alloc and domain_free iommu-ops as a
replacement for domain_init/domain_destroy.
Tested-by: Thierry Reding <treding@nvidia.com>
Acked-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
The memory controller on NVIDIA Tegra exposes various knobs that can be
used to tune the behaviour of the clients attached to it.
Currently this driver sets up the latency allowance registers to the HW
defaults. Eventually an API should be exported by this driver (via a
custom API or a generic subsystem) to allow clients to register latency
requirements.
This driver also registers an IOMMU (SMMU) that's implemented by the
memory controller. It is supported on Tegra30, Tegra114 and Tegra124
currently. Tegra20 has a GART instead.
The Tegra SMMU operates on memory clients and SWGROUPs. A memory client
is a unidirectional, special-purpose DMA master. A SWGROUP represents a
set of memory clients that form a logical functional unit corresponding
to a single device. Typically a device has two clients: one client for
read transactions and one client for write transactions, but there are
also devices that have only read clients, but many of them (such as the
display controllers).
Because there is no 1:1 relationship between memory clients and devices
the driver keeps a table of memory clients and the SWGROUPs that they
belong to per SoC. Note that this is an exception and due to the fact
that the SMMU is tightly integrated with the rest of the Tegra SoC. The
use of these tables is discouraged in drivers for generic IOMMU devices
such as the ARM SMMU because the same IOMMU could be used in any number
of SoCs and keeping such tables for each SoC would not scale.
Acked-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
Mapping and unmapping are more often than not in the critical path.
map_sg allows IOMMU driver implementations to optimize the process
of mapping buffers into the IOMMU page tables.
Instead of mapping a buffer one page at a time and requiring potentially
expensive TLB operations for each page, this function allows the driver
to map all pages in one go and defer TLB maintenance until after all
pages have been mapped.
Additionally, the mapping operation would be faster in general since
clients does not have to keep calling map API over and over again for
each physically contiguous chunk of memory that needs to be mapped to a
virtually contiguous region.
Signed-off-by: Olav Haugan <ohaugan@codeaurora.org>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
Make of_device_id array const, because all OF functions handle it as const.
Signed-off-by: Kiran Padwal <kiran.padwal@smartplayin.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
Cc: Hiroshi Doyu <hdoyu@nvidia.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC cleanups from Olof Johansson:
"This merge window brings a good size of cleanups on various platforms.
Among the bigger ones:
- Removal of Samsung s5pc100 and s5p64xx platforms. Both of these
have lacked active support for quite a while, and after asking
around nobody showed interest in keeping them around. If needed,
they could be resurrected in the future but it's more likely that
we would prefer reintroduction of them as DT and
multiplatform-enabled platforms instead.
- OMAP4 controller code register define diet. They defined a lot of
registers that were never actually used, etc.
- Move of some of the Tegra platform code (PMC, APBIO, fuse,
powergate) to drivers/soc so it can be shared with 64-bit code.
This also converts them over to traditional driver models where
possible.
- Removal of legacy gpio-samsung driver, since the last users have
been removed (moved to pinctrl)
Plus a bunch of smaller changes for various platforms that sort of
dissapear in the diffstat for the above. clps711x cleanups, shmobile
header file refactoring/moves for multiplatform friendliness, some
misc cleanups, etc"
* tag 'cleanup-for-3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (117 commits)
drivers: CCI: Correct use of ! and &
video: clcd-versatile: Depend on ARM
video: fix up versatile CLCD helper move
MAINTAINERS: Add sdhci-st file to ARCH/STI architecture
ARM: EXYNOS: Fix build breakge with PM_SLEEP=n
MAINTAINERS: Remove Kirkwood
ARM: tegra: Convert PMC to a driver
soc/tegra: fuse: Set up in early initcall
ARM: tegra: Always lock the CPU reset vector
ARM: tegra: Setup CPU hotplug in a pure initcall
soc/tegra: Implement runtime check for Tegra SoCs
soc/tegra: fuse: fix dummy functions
soc/tegra: fuse: move APB DMA into Tegra20 fuse driver
soc/tegra: Add efuse and apbmisc bindings
soc/tegra: Add efuse driver for Tegra
ARM: tegra: move fuse exports to soc/tegra/fuse.h
ARM: tegra: export apb dma readl/writel
ARM: tegra: Use a function to get the chip ID
ARM: tegra: Sort includes alphabetically
ARM: tegra: Move includes to include/soc/tegra
...
|
|
In order to not clutter the include/linux directory with SoC specific
headers, move the Tegra-specific headers out into a separate directory.
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
This structure is read-only data and should never be modified.
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
'tegra_smmu_pm_ops' is used only in this file. Make it static.
Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Acked-by: Hiroshi Doyu <hdoyu@nvidia.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
|
|
When enabling LPAE on ARM, phys_addr_t becomes 64 bits wide and printing
a variable of that type using a simple %x format specifier causes the
compiler to complain. Change the format specifier to %pa, which is used
specifically for variables of type phys_addr_t.
Signed-off-by: Thierry Reding <treding@nvidia.com>
Acked-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
|
|
Remove unneeded error handling on the result of a call to
platform_get_resource when the value is passed to devm_ioremap_resource.
A simplified version of the semantic patch that makes this change is as
follows: (http://coccinelle.lip6.fr/)
// <smpl>
@@
expression pdev,res,n,e,e1;
expression ret != 0;
identifier l;
@@
- res = platform_get_resource(pdev, IORESOURCE_MEM, n);
... when != res
- if (res == NULL) { ... \(goto l;\|return ret;\) }
... when != res
+ res = platform_get_resource(pdev, IORESOURCE_MEM, n);
e = devm_ioremap_resource(e1, res);
// </smpl>
Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: Stephen Warren <swarren@nvidia.com>
|
|
'arm/tegra' into next
|
|
Fix printk formats for dma_addr_t:
drivers/iommu/tegra-smmu.c: In function 'smmu_iommu_iova_to_phys':
>> drivers/iommu/tegra-smmu.c:774:2: warning: format '%lx' expects argument of type 'long unsigned int', but argument 4 has type 'dma_addr_t' [-Wformat]
--
drivers/iommu/tegra-gart.c: In function 'gart_iommu_iova_to_phys':
>> drivers/iommu/tegra-gart.c:298:3: warning: format '%lx' expects argument of type 'long unsigned int', but argument 3 has type 'dma_addr_t' [-Wformat]
Signed-off-by: Varun Sethi <Varun.Sethi@freescale.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
|
|
This is required in case of PAMU, as it can support a window size of up
to 64G (even on 32bit).
Signed-off-by: Varun Sethi <Varun.Sethi@freescale.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs pile (part one) from Al Viro:
"Assorted stuff - cleaning namei.c up a bit, fixing ->d_name/->d_parent
locking violations, etc.
The most visible changes here are death of FS_REVAL_DOT (replaced with
"has ->d_weak_revalidate()") and a new helper getting from struct file
to inode. Some bits of preparation to xattr method interface changes.
Misc patches by various people sent this cycle *and* ocfs2 fixes from
several cycles ago that should've been upstream right then.
PS: the next vfs pile will be xattr stuff."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (46 commits)
saner proc_get_inode() calling conventions
proc: avoid extra pde_put() in proc_fill_super()
fs: change return values from -EACCES to -EPERM
fs/exec.c: make bprm_mm_init() static
ocfs2/dlm: use GFP_ATOMIC inside a spin_lock
ocfs2: fix possible use-after-free with AIO
ocfs2: Fix oops in ocfs2_fast_symlink_readpage() code path
get_empty_filp()/alloc_file() leave both ->f_pos and ->f_version zero
target: writev() on single-element vector is pointless
export kernel_write(), convert open-coded instances
fs: encode_fh: return FILEID_INVALID if invalid fid_type
kill f_vfsmnt
vfs: kill FS_REVAL_DOT by adding a d_weak_revalidate dentry op
nfsd: handle vfs_getattr errors in acl protocol
switch vfs_getattr() to struct path
default SET_PERSONALITY() in linux/elf.h
ceph: prepopulate inodes only when request is aborted
d_hash_and_lookup(): export, switch open-coded instances
9p: switch v9fs_set_create_acl() to inode+fid, do it before d_instantiate()
9p: split dropping the acls from v9fs_set_create_acl()
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull IOMMU Updates from Joerg Roedel:
"Besides some fixes and cleanups in the code there are three more
important changes to point out this time:
* New IOMMU driver for the ARM SHMOBILE platform
* An IOMMU-API extension for non-paging IOMMUs (required for
upcoming PAMU driver)
* Rework of the way the Tegra IOMMU driver accesses its
registetrs - register windows are easier to extend now.
There are also a few changes to non-iommu code, but that is acked by
the respective maintainers."
* tag 'iommu-updates-v3.9' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (23 commits)
iommu/tegra: assume CONFIG_OF in SMMU driver
iommu/tegra: assume CONFIG_OF in gart driver
iommu/amd: Remove redundant NULL check before dma_ops_domain_free().
iommu/amd: Initialize device table after dma_ops
iommu/vt-d: Zero out allocated memory in dmar_enable_qi
iommu/tegra: smmu: Fix incorrect mask for regbase
iommu/exynos: Make exynos_sysmmu_disable static
ARM: mach-shmobile: r8a7740: Add IPMMU device
ARM: mach-shmobile: sh73a0: Add IPMMU device
ARM: mach-shmobile: sh7372: Add IPMMU device
iommu/shmobile: Add iommu driver for Renesas IPMMU modules
iommu: Add DOMAIN_ATTR_WINDOWS domain attribute
iommu: Add domain window handling functions
iommu: Implement DOMAIN_ATTR_PAGING attribute
iommu: Check for valid pgsize_bitmap in iommu_map/unmap
iommu: Make sure DOMAIN_ATTR_MAX is really the maximum
iommu/tegra: smmu: Change SMMU's dependency on ARCH_TEGRA
iommu/tegra: smmu: Use helper function to check for valid register offset
iommu/tegra: smmu: Support variable MMIO ranges/blocks
iommu/tegra: Add missing spinlock initialization
...
|
|
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Tegra only supports, and always enables, device tree. Remove all ifdefs
for DT support from the driver.
Signed-off-by: Stephen Warren <swarren@nvidia.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
|
|
This fixes kernel crash because of BUG() in register address
validation.
Signed-off-by: Hiroshi Doyu <hdoyu@nvidia.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
|
|
Do not repeat the checking loop in the read and write
functions. Use a single helper function for that check and
call it in both accessors.
Signed-off-by: Joerg Roedel <joro@8bytes.org>
|
|
Presently SMMU registers are located in discontiguous 3 blocks. They
are interleaved by MC registers. Ideally SMMU register blocks should
be in an independent one block, but it is too late to change this H/W
design. In the future Tegra chips over some generations, it is
expected that some of register block "size" can be extended towards
the end and also more new register blocks will be added at most a few
blocks. The starting address of each existing block won't change. This
patch allocates multiple number of register blocks dynamically based
on the info passed from DT. Those ranges are verified in the
accessors{read,write}. This may sacrifice some performance because a
new accessors prevents compiler optimization of a fixed size register
offset calculation. Since SMMU register accesses are not so frequent,
this would be acceptable. This patch is necessary to unify
"tegra-smmu.ko" over some Tegra SoC generations.
Signed-off-by: Hiroshi Doyu <hdoyu@nvidia.com>
Reviewed-by: Stephen Warren <swarren@nvidia.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
|
|
Fix tegra_smmu_probe() to initialize client_lock spinlocks in
per-address-space structures.
Signed-off-by: Sami Liedes <sliedes@nvidia.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
|