summaryrefslogtreecommitdiff
path: root/drivers/gpu/host1x
AgeCommit message (Collapse)Author
2020-02-07Merge tag 'drm/tegra/for-5.6-rc1-fixes' of ↵Dave Airlie
git://anongit.freedesktop.org/tegra/linux into drm-next drm/tegra: Fixes for v5.6-rc1 These are a couple of quick fixes for regressions that were found during the first two weeks of the merge window. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thierry Reding <thierry.reding@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200206172753.2185390-1-thierry.reding@gmail.com
2020-02-06gpu: host1x: Set DMA direction only for DMA-mapped buffer objectsThierry Reding
The DMA direction is only used by the DMA API, so there is no use in setting it when a buffer object isn't mapped with the DMA API. Signed-off-by: Thierry Reding <treding@nvidia.com> Tested-by: Dmitry Osipenko <digetx@gmail.com> Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
2020-02-06drm/tegra: Reuse IOVA mapping where possibleThierry Reding
This partially reverts the DMA API support that was recently merged because it was causing performance regressions on older Tegra devices. Unfortunately, the cache maintenance performed by dma_map_sg() and dma_unmap_sg() causes performance to drop by a factor of 10. The right solution for this would be to cache mappings for buffers per consumer device, but that's a bit involved. Instead, we simply revert to the old behaviour of sharing IOVA mappings when we know that devices can do so (i.e. they share the same IOMMU domain). Cc: <stable@vger.kernel.org> # v5.5 Reported-by: Dmitry Osipenko <digetx@gmail.com> Signed-off-by: Thierry Reding <treding@nvidia.com> Tested-by: Dmitry Osipenko <digetx@gmail.com> Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
2020-01-15Merge tag 'drm/tegra/for-5.6-rc1' of ↵Dave Airlie
git://anongit.freedesktop.org/tegra/linux into drm-next drm/tegra: Changes for v5.6-rc1 This contains a small set of mostly fixes and some minor improvements. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thierry Reding <thierry.reding@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200111004835.2412858-1-thierry.reding@gmail.com
2020-01-10gpu: host1x: Remove dev_err() on platform_get_irq() failureYueHaibing
platform_get_irq() will call dev_err() itself on failure, so there is no need for the driver to also do this. This is detected by coccinelle. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Thierry Reding <treding@nvidia.com>
2020-01-10drm/tegra: Do not implement runtime PMThierry Reding
The Tegra DRM driver heavily relies on the implementations for runtime suspend/resume to be called at specific times. Unfortunately, there are some cases where that doesn't work. One example is if the user disables runtime PM for a given subdevice. Another example is that the PM core acquires a reference to runtime PM during system sleep, effectively preventing devices from going into low power modes. This is intentional to avoid nasty race conditions, but it also causes system sleep to not function properly on all Tegra systems. Fix this by not implementing runtime PM at all. Instead, a minimal, reference-counted suspend/resume infrastructure is added to the host1x bus. This has the benefit that it can be used regardless of the system power state (or any transitions we might be in), or whether or not the user allows runtime PM. Atomic modesetting guarantees that these functions will end up being called at the right point in time, so the pitfalls for the more generic runtime PM do not apply here. Signed-off-by: Thierry Reding <treding@nvidia.com>
2020-01-10gpu: host1x: Rename "parent" to "host"Thierry Reding
Rename the host1x clients' parent to "host" because that more closely describes what it is. The parent can be confused with the parent device in terms of the device hierarchy. Subsequent patches will add a new member that refers to the parent in that hierarchy. Signed-off-by: Thierry Reding <treding@nvidia.com>
2019-12-17Merge tag 'drm-misc-next-2019-12-16' of ↵Daniel Vetter
git://anongit.freedesktop.org/drm/drm-misc into drm-next drm-misc-next for v5.6: UAPI Changes: - Add support for DMA-BUF HEAPS. Cross-subsystem Changes: - mipi dsi definition updates, pulled into drm-intel as well. - Add lockdep annotations for dma_resv vs mmap_sem and fs_reclaim. - Remove support for dma-buf kmap/kunmap. - Constify fb_ops in all fbdev drivers, including drm drivers and drm-core, and media as well. Core Changes: - Small cleanups to ttm. - Fix SCDC definition. - Assorted cleanups to core. - Add todo to remove load/unload hooks, and use generic fbdev emulation. - Assorted documentation updates. - Use blocking ww lock in ttm fault handler. - Remove drm_fb_helper_fbdev_setup/teardown. - Warning fixes with W=1 for atomic. - Use drm_debug_enabled() instead of drm_debug flag testing in various drivers. - Fallback to nontiled mode in fbdev emulation when not all tiles are present. (Later on reverted) - Various kconfig indentation fixes in core and drivers. - Fix freeing transactions in dp-mst correctly. - Sean Paul is steping down as core maintainer. :-( - Add lockdep annotations for atomic locks vs dma-resv. - Prevent use-after-free for a bad job in drm_scheduler. - Fill out all block sizes in the P01x and P210 definitions. - Avoid division by zero in drm/rect, and fix bounds. - Add drm/rect selftests. - Add aspect ratio and alternate clocks for HDMI 4k modes. - Add todo for drm_framebuffer_funcs and fb_create cleanup. - Drop DRM_AUTH for prime import/export ioctls. - Clear DP-MST payload id tables downstream when initializating. - Fix for DSC throughput definition. - Add extra FEC definitions. - Fix fake offset in drm_gem_object_funs.mmap. - Stop using encoder->bridge in core directly - Handle bridge chaining slightly better. - Add backlight support to drm/panel, and use it in many panel drivers. - Increase max number of y420 modes from 128 to 256, as preparation to add the new modes. Driver Changes: - Small fixes all over. - Fix documentation in vkms. - Fix mmap_sem vs dma_resv in nouveau. - Small cleanup in komeda. - Add page flip support in gma500 for psb/cdv. - Add ddc symlink in the connector sysfs directory for many drivers. - Add support for analogic an6345, and fix small bugs in it. - Add atomic modesetting support to ast. - Fix radeon fault handler VMA race. - Switch udl to use generic shmem helpers. - Unconditional vblank handling for mcde. - Miscellaneous fixes to mcde. - Tweak debug output from komeda using debugfs. - Add gamma and color transform support to komeda for DOU-IPS. - Add support for sony acx424AKP panel. - Various small cleanups to gma500. - Use generic fbdev emulation in udl, and replace udl_framebuffer with generic implementation. - Add support for Logic PD Type 28 panel. - Use drm_panel_* wrapper functions in exynos/tegra/msm. - Add devicetree bindings for generic DSI panels. - Don't include drm_pci.h directly in many drivers. - Add support for begin/end_cpu_access in udmabuf. - Stop using drm_get_pci_dev in gma500 and mga200. - Fixes to UDL damage handling, and use dma_buf_begin/end_cpu_access. - Add devfreq thermal support to panfrost. - Fix hotplug with daisy chained monitors by removing VCPI when disabling topology manager. - meson: Add support for OSD1 plane AFBC commit. - Stop displaying garbage when toggling ast primary plane on/off. - More cleanups and fixes to UDL. - Add D32 suport to komeda. - Remove globle copy of drm_dev in gma500. - Add support for Boe Himax8279d MIPI-DSI LCD panel. - Add support for ingenic JZ4770 panel. - Small null pointer deference fix in ingenic. - Remove support for the special tfp420 driver, as there is a generic way to do it. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/ba73535a-9334-5302-2e1f-5208bd7390bd@linux.intel.com
2019-11-25drm/tegra: Map cmdbuf once for reloc processingDaniel Vetter
A few reasons to drop kmap: - For native objects all we do is look at obj->vaddr anyway, so might as well not call functions for every page. - Reloc-processing on dma-buf is ... questionable. - Plus most dma-buf that bother kernel cpu mmaps give you at least vmap, much less kmaps. And all the ones relevant for arm-soc are again doing a obj->vaddr game anyway, there's no real kmap going on on arm it seems. Plus this seems to be the only real in-tree user of dma_buf_kmap, and I'd like to get rid of that. Reviewed-by: Thierry Reding <treding@nvidia.com> Tested-by: Thierry Reding <treding@nvidia.com> Acked-by: Sumit Semwal <sumit.semwal@linaro.org> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Cc: Thierry Reding <thierry.reding@gmail.com> Cc: linux-tegra@vger.kernel.org Link: https://patchwork.freedesktop.org/patch/msgid/20191118103536.17675-2-daniel.vetter@ffwll.ch
2019-11-01gpu: host1x: Unconditionally select IOMMU_IOVAThierry Reding
Currently configurations can be generated where IOMMU_SUPPORT is disabled but IOMMU_IOVA is built as a module and HOST1X as built-in. In such a case, the symbols guarded by IOMMU_IOVA will not be available when linking the host1x driver and cause a linking failure. Simplify this by unconditionally selecting IOMMU_IOVA, which makes sure that it will be forced to =y if HOST1X=y. Technically we can now get IOMMU_IOVA code built-in even if we don't use it (host1x only uses it when IOMMU_SUPPORT is also enabled), but such configuration are of a mostly academic nature. In all practical configurations we want IOMMU support anyway. Signed-off-by: Thierry Reding <treding@nvidia.com>
2019-10-29gpu: host1x: Set DMA mask based on IOMMU setupThierry Reding
If the Tegra DRM clients are backed by an IOMMU, push buffers are likely to be allocated beyond the 32-bit boundary if sufficient system memory is available. This is problematic on earlier generations of Tegra where host1x supports a maximum of 32 address bits for the GATHER opcode. More recent versions of Tegra (Tegra186 and later) have a wide variant of the GATHER opcode, which allows addressing up to 64 bits of memory. If host1x itself is behind an IOMMU as well this doesn't matter because the IOMMU's input address space is restricted to 32 bits on generations without support for wide GATHER opcodes. However, if host1x is not behind an IOMMU, it won't be able to process push buffers beyond the 32-bit boundary on Tegra generations that don't support wide GATHER opcodes. Restrict the DMA mask to 32 bits on these generations prevents buffers from being allocated from beyond the 32-bit boundary. Signed-off-by: Thierry Reding <treding@nvidia.com>
2019-10-29gpu: host1x: Support DMA mapping of buffersThierry Reding
If host1x_bo_pin() returns an SG table, create a DMA mapping for the buffer. For buffers that the host1x client has already mapped itself, host1x_bo_pin() returns NULL and the existing DMA address is used. Signed-off-by: Thierry Reding <treding@nvidia.com>
2019-10-29gpu: host1x: Allocate gather copy for host1xThierry Reding
Currently when the gather buffers are copied, they are copied to a buffer that is allocated for the host1x client that wants to execute the command streams in the buffers. However, the gather buffers will be read by the host1x device, which causes SMMU faults if the DMA API is backed by an IOMMU. Fix this by allocating the gather buffer copy for the host1x device, which makes sure that it will be mapped into the host1x's IOVA space if the DMA API is backed by an IOMMU. Signed-off-by: Thierry Reding <treding@nvidia.com>
2019-10-29gpu: host1x: Clean up debugfs on removalThierry Reding
The debugfs files created for host1x are never removed, causing these files to be left dangling in debugfs. This results in a crash when any of these files are accessed after the host1x driver has been removed, as well as a failure to create the debugfs entries when they are added again on driver probe. Signed-off-by: Thierry Reding <treding@nvidia.com>
2019-10-29gpu: host1x: Overhaul host1x_bo_{pin,unpin}() APIThierry Reding
The host1x_bo_pin() and host1x_bo_unpin() APIs are used to pin and unpin buffers during host1x job submission. Pinning currently returns the SG table and the DMA address (an IOVA if an IOMMU is used or a physical address if no IOMMU is used) of the buffer. The DMA address is only used for buffers that are relocated, whereas the host1x driver will map gather buffers into its own IOVA space so that they can be processed by the CDMA engine. This approach has a couple of issues. On one hand it's not very useful to return a DMA address for the buffer if host1x doesn't need it. On the other hand, returning the SG table of the buffer is suboptimal because a single SG table cannot be shared for multiple mappings, because the DMA address is stored within the SG table, and the DMA address may be different for different devices. Subsequent patches will move the host1x driver over to the DMA API which doesn't work with a single shared SG table. Fix this by returning a new SG table each time a buffer is pinned. This allows the buffer to be referenced by multiple jobs for different engines. Change the prototypes of host1x_bo_pin() and host1x_bo_unpin() to take a struct device *, specifying the device for which the buffer should be pinned. This is required in order to be able to properly construct the SG table. While at it, make host1x_bo_pin() return the SG table because that allows us to return an ERR_PTR()-encoded error code if we need to, or return NULL to signal that we don't need the SG table to be remapped and can simply use the DMA address as-is. At the same time, returning the DMA address is made optional because in the example of command buffers, host1x doesn't need to know the DMA address since it will have to create its own mapping anyway. Signed-off-by: Thierry Reding <treding@nvidia.com>
2019-10-28gpu: host1x: Make host1x_cdma_wait_pushbuffer_space() staticBen Dooks (Codethink)
The host1x_cdma_wait_pushbuffer_space() function is not declared or directly called from outside the file it is in, so make it static. Fixes the following sparse warning: drivers/gpu/host1x/cdma.c:235:5: warning: symbol 'host1x_cdma_wait_pushbuffer_space' was not declared. Should it be static? Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk> Signed-off-by: Thierry Reding <treding@nvidia.com>
2019-10-28gpu: host1x: Request channels for clients, not devicesThierry Reding
A struct device doesn't carry much information that a channel might be interested in, but the client very much does. Request channels for the clients rather than their parent devices and store a pointer to them in order to have that information available when needed. Signed-off-by: Thierry Reding <treding@nvidia.com>
2019-10-28gpu: host1x: Explicitly initialize host1x_info structuresThierry Reding
It's technically not required to explicitly initialize the fields that will be zero by default, but it's easier to read these structures if they are all initialized uniformly. Signed-off-by: Thierry Reding <treding@nvidia.com>
2019-10-28gpu: host1x: Remove gratuitous blank lineThierry Reding
Signed-off-by: Thierry Reding <treding@nvidia.com>
2019-10-28gpu: host1x: Do not limit DMA segment sizeThierry Reding
host1x nor any its clients have any limitations on the DMA segment size, so don't pretend that they do. Signed-off-by: Thierry Reding <treding@nvidia.com>
2019-06-25Merge tag 'drm/tegra/for-5.3-rc1' of ↵Dave Airlie
git://anongit.freedesktop.org/tegra/linux into drm-next drm/tegra: Changes for v5.3-rc1 This contains a couple of small improvements and cleanups for the Tegra DRM driver. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thierry Reding <thierry.reding@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190621150753.19550-1-thierry.reding@gmail.com
2019-06-13host1x: debugfs_create_dir() can never return NULLGreg Kroah-Hartman
So there is no need to check for a value that can never happen. No need to check the return value all anyway, as any debugfs call can take the result of this function as an option just fine. Cc: Thierry Reding <thierry.reding@gmail.com> Cc: dri-devel@lists.freedesktop.org Cc: linux-tegra@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Thierry Reding <treding@nvidia.com>
2019-06-05treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 282Thomas Gleixner
Based on 1 normalized pattern(s): this software is licensed under the terms of the gnu general public license version 2 as published by the free software foundation and may be copied distributed and modified under those terms this program is distributed in the hope that it will be useful but without any warranty without even the implied warranty of merchantability or fitness for a particular purpose see the gnu general public license for more details extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference in 285 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Alexios Zavras <alexios.zavras@intel.com> Reviewed-by: Allison Randal <allison@lohutok.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190529141900.642774971@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-05gpu: host1x: Do not link logical devices to DT nodesThierry Reding
Logical devices created by the host1x bus infrastructure don't need to be associated with a device tree node. Doing so will cause the driver core to attempt to hook up IOMMU operations and fail because it is not a real device. However, for backwards-compatibility, we need to provide various OF_* uevent variables that were previously provided by of_device_uevent() and which are parsed by libdrm in userspace when querying the available devices. Do this by implementing a uevent callback for the host1x bus. Signed-off-by: Thierry Reding <treding@nvidia.com>
2019-06-05gpu: host1x: Increase maximum DMA segment sizeThierry Reding
Recent versions of the DMA API debug code have started to warn about violations of the maximum DMA segment size. This is because the segment size defaults to 64 KiB, which can easily be exceeded in large buffer allocations such as used in DRM/KMS for framebuffers. Technically the Tegra SMMU and ARM SMMU don't have a maximum segment size (they map individual pages irrespective of whether they are contiguous or not), so the choice of 4 MiB is a bit arbitrary here. The maximum segment size is a 32-bit unsigned integer, though, so we can't set it to the correct maximum size, which would be the size of the aperture. Signed-off-by: Thierry Reding <treding@nvidia.com>
2019-06-05gpu: host1x: Do not output error message for deferred probeThierry Reding
When deferring probe, avoid logging a confusing error message. While at it, make the error message more informational. Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Thierry Reding <treding@nvidia.com>
2019-05-30treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 201Thomas Gleixner
Based on 1 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms and conditions of the gnu general public license version 2 as published by the free software foundation this program is distributed in the hope it will be useful but without any warranty without even the implied warranty of merchantability or fitness for a particular purpose see the gnu general public license for more details you should have received a copy of the gnu general public license along with this program if not see http www gnu org licenses extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference in 228 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Allison Randal <allison@lohutok.net> Reviewed-by: Steve Winslow <swinslow@gmail.com> Reviewed-by: Richard Fontana <rfontana@redhat.com> Reviewed-by: Alexios Zavras <alexios.zavras@intel.com> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190528171438.107155473@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-21treewide: Add SPDX license identifier - Makefile/KconfigThomas Gleixner
Add SPDX license identifiers to all Make/Kconfig files which: - Have no license information of any form These files fall under the project license, GPL v2 only. The resulting SPDX license identifier is: GPL-2.0-only Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-04-11gpu: host1x: Program stream ID to bypass without SMMUArnd Bergmann
If SMMU support is not available, fall back to programming the bypass stream ID (0x7f). Fixes: de5469c21ff9 ("gpu: host1x: Program the channel stream ID") Suggested-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com> [treding@nvidia.com: rebase this on top of a later build fix] Signed-off-by: Thierry Reding <treding@nvidia.com>
2019-04-11gpu: host1x: Fix compile error when IOMMU API is not availableStefan Agner
In case the IOMMU API is not available compiling host1x fails with the following error: In file included from drivers/gpu/host1x/hw/host1x06.c:27: drivers/gpu/host1x/hw/channel_hw.c: In function ‘host1x_channel_set_streamid’: drivers/gpu/host1x/hw/channel_hw.c:118:30: error: implicit declaration of function ‘dev_iommu_fwspec_get’; did you mean ‘iommu_fwspec_free’? [-Werror=implicit-function-declaration] struct iommu_fwspec *spec = dev_iommu_fwspec_get(channel->dev->parent); ^~~~~~~~~~~~~~~~~~~~ iommu_fwspec_free Fixes: de5469c21ff9 ("gpu: host1x: Program the channel stream ID") Signed-off-by: Stefan Agner <stefan@agner.ch> Signed-off-by: Thierry Reding <treding@nvidia.com>
2019-02-07gpu: host1x: Continue CDMA execution starting with a next jobDmitry Osipenko
Currently gathers of a hung job are getting NOP'ed and a restarted CDMA executes the NOP'ed gathers. There shouldn't be a reason to not restart CDMA execution starting with a next job, avoiding the unnecessary churning with gathers NOP'ing. Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com>
2019-02-07gpu: host1x: Don't complete a completed jobDmitry Osipenko
There is a chance that the last job has been completed at the time of CDMA timeout handler invocation. In this case there is no need to complete the completed job. Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com>
2019-02-07gpu: host1x: Cancel only job that actually got stuckDmitry Osipenko
Host1x doesn't have information about jobs inter-dependency, that is something that will become available once host1x will get a proper jobs scheduler implementation. Currently a hang job causes other unrelated jobs to be canceled, that is a relic from downstream driver which is irrelevant to upstream. Let's cancel only the hanging job and not to touch other jobs in queue. Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com>
2019-02-07gpu: host1x: Optimize CDMA push buffer memory usageThierry Reding
The host1x CDMA push buffer is terminated by a special opcode (RESTART) that tells the CDMA to wrap around to the beginning of the push buffer. To accomodate the RESTART opcode, an extra 4 bytes are allocated on top of the 512 * 8 = 4096 bytes needed for the 512 slots (1 slot = 2 words) that are used for other commands passed to CDMA. This requires that two memory pages are allocated, but most of the second page (4092 bytes) is never used. Decrease the number of slots to 511 so that the RESTART opcode fits within the page. Adjust the push buffer wraparound code to take into account push buffer sizes that are not a power of two. Signed-off-by: Thierry Reding <treding@nvidia.com>
2019-02-07gpu: host1x: Use correct semantics for HOST1X_CHANNEL_DMAENDThierry Reding
The HOST1X_CHANNEL_DMAEND is an offset relative to the value written to the HOST1X_CHANNEL_DMASTART register, but it is currently treated as an absolute address. This can cause SMMU faults if the CDMA fetches past a pushbuffer's IOMMU mapping. Properly setting the DMAEND prevents the CDMA from fetching beyond that address and avoid such issues. This is currently not observed because a whole (almost) page of essentially scratch space absorbs any excessive prefetching by CDMA. However, changing the number of slots in the push buffer can trigger these SMMU faults. Signed-off-by: Thierry Reding <treding@nvidia.com>
2019-02-07gpu: host1x: Support 40-bit addressing on Tegra186Thierry Reding
The host1x and clients instantiated on Tegra186 support addressing 40 bits of memory. Signed-off-by: Thierry Reding <treding@nvidia.com>
2019-02-07gpu: host1x: Restrict IOVA space to DMA maskThierry Reding
On Tegra186 and later, the ARM SMMU provides an input address space that is 48 bits wide. However, memory clients can only address up to 40 bits. If the geometry is used as-is, allocations of IOVA space can end up in a region that is not addressable by the memory clients. To fix this, restrict the IOVA space to the DMA mask of the host1x device. Signed-off-by: Thierry Reding <treding@nvidia.com>
2019-02-07gpu: host1x: Support 40-bit addressingThierry Reding
Tegra186 and later support 40 bits of address space. Additional registers need to be programmed to store the full 40 bits of push buffer addresses. Since command stream gathers can also reside in buffers in a 40-bit address space, a new variant of the GATHER opcode is also introduced. It takes two parameters: the first parameter contains the lower 32 bits of the address and the second parameter contains bits 32 to 39. Signed-off-by: Thierry Reding <treding@nvidia.com>
2019-02-07gpu: host1x: Introduce support for wide opcodesThierry Reding
The CDMA push buffer can currently only handle opcodes that take a single word parameter. However, the host1x implementation on Tegra186 and later supports opcodes that require multiple words as parameters. Unfortunately the way the push buffer is structured, these wide opcodes cannot simply be composed of two regular opcodes because that could result in the wide opcode being split across the end of the push buffer and the final RESTART opcode required to wrap the push buffer around would break the wide opcode. One way to fix this would be to remove the concept of slots to simplify push buffer operations. However, that's not entirely trivial and should be done in a separate patch. For now, simply use a different function to push four-word opcodes into the push buffer. Technically only three words are pushed, with the fourth word used as padding to preserve the 2-word alignment required by the slots abstraction. The fourth word is always a NOP opcode. Additional care must be taken when the end of the push buffer is reached. If a four-word opcode doesn't fit into the push buffer without being split by the boundary, NOP opcodes will be introduced and the new wide opcode placed at the beginning of the push buffer. Signed-off-by: Thierry Reding <treding@nvidia.com>
2019-02-07gpu: host1x: Program the channel stream IDThierry Reding
When processing command streams, make sure the host1x's stream ID is programmed for the channel so that addresses are properly translated through the SMMU. Signed-off-by: Thierry Reding <treding@nvidia.com>
2019-02-04gpu: host1x: Set up stream ID tableThierry Reding
In order to enable the MMIO path stream ID protection provided by the incarnation of host1x found in Tegra186 and later, the host1x must be provided with the list of stream ID register offsets for each of its clients. Some clients (such as VIC) have multiple stream ID registers that are assumed to be contiguous. The host1x is programmed with the base offset and a limit which provide the range of registers that the host1x needs to monitor for writes. Signed-off-by: Thierry Reding <treding@nvidia.com>
2019-02-04gpu: host1x: Represent host1x bus devices in debugfsThierry Reding
This new debugfs file represents the state of host1x bus devices, specifying the list of subdevices and marking which ones have successfully registered. Signed-off-by: Thierry Reding <treding@nvidia.com>
2019-02-04gpu: host1x: Use completion instead of semaphoreArnd Bergmann
In this usage, the two are completely equivalent, but the completion documents better what is going on, and we generally try to avoid semaphores these days. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Thierry Reding <treding@nvidia.com>
2018-11-29gpu: host1x: Add Tegra194 supportThierry Reding
The host1x hardware found on Tegra194 is mostly backwards compatible with the version found on Tegra186, with the notable exceptions of the increased number of syncpoints and mlocks. In addition, some rarely used features such as syncpoint wait bases were dropped and some registers had to move around to accomodate the increased number of syncpoints. Signed-off-by: Thierry Reding <treding@nvidia.com>
2018-11-27gpu: host1x: Fix syncpoint ID field size on Tegra186Thierry Reding
The number of syncpoints on Tegra186 is 576 and therefore no longer fits into 8 bits. Increase the size of the syncpoint ID field to 10 in order to accomodate all syncpoints. Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com>
2018-11-27gpu: host1x: Resize channel register region on Tegra186 and laterThierry Reding
The register region allocated per channel was decreased from 16384 bytes to 256 bytes on Tegra186 and later. Resize the region to make sure every channel (instead of only the first) is properly programmed. Suggested-by: Mikko Perttunen <mperttunen@nvidia.com> Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com>
2018-09-26gpu: host1x: Detach Host1x from IOMMU DMA domain on arm32Dmitry Osipenko
Host1x is getting attached to an implicit IOMMU DMA domain if CONFIG_ARM_DMA_USE_IOMMU=y. Since Host1x driver manages IOMMU by itself, Host1x device must be detached from the implicit domain using arch-specific IOMMU-API. Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Signed-off-by: Thierry Reding <treding@nvidia.com>
2018-09-26gpu: host1x: Remove spurious tabThierry Reding
All other assignments have a single space around the = sign, so remove the spurious tab for consistency. Signed-off-by: Thierry Reding <treding@nvidia.com>
2018-07-09gpu: host1x: Check whether size of unpin isn't 0Dmitry Osipenko
Only gather pins are mapped by the Host1x driver, regular BO relocations are not. Check whether size of unpin isn't 0, otherwise IOVA allocation at 0x0 could be erroneously released. Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Signed-off-by: Thierry Reding <treding@nvidia.com>
2018-07-09gpu: host1x: Skip IOMMU initialization if firewall is enabledDmitry Osipenko
Host1x's CDMA can't access the command buffers if IOMMU and Host1x firewall are enabled in the kernels config because firewall doesn't map the copied buffer into IOVA space. Fix this by skipping IOMMU initialization if firewall is enabled as firewall merges sparse cmdbufs into a single contiguous buffer and hence IOMMU isn't needed in this case. Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Signed-off-by: Thierry Reding <treding@nvidia.com>