summaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/i915/i915_vma.c
AgeCommit message (Collapse)Author
2017-07-09Merge tag 'drm-for-v4.13' of git://people.freedesktop.org/~airlied/linuxLinus Torvalds
Pull drm updates from Dave Airlie: "This is the main pull request for the drm, I think I've got one later driver pull for mediatek SoC driver, I'm undecided on if it needs to go to you yet. Otherwise summary below: Core drm: - Atomic add driver private objects - Deprecate preclose hook in modern drivers - MST bandwidth tracking - Use kvmalloc in more places - Add mode_valid hook for crtc/encoder/bridge - Reduce sync_file construction time - Documentation updates - New DRM synchronisation object support New drivers: - pl111 - pl111 CLCD display controller Panel: - Innolux P079ZCA panel driver - Add NL12880B20-05, NL192108AC18-02D, P320HVN03 panels - panel-samsung-s6e3ha2: Add s6e3hf2 panel support i915: - SKL+ watermark fixes - G4x/G33 reset improvements - DP AUX backlight improvements - Buffer based GuC/host communication - New getparam for (sub)slice infomation - Cannonlake and Coffeelake initial patches - Execbuf optimisations radeon/amdgpu: - Lots of Vega10 bug fixes - Preliminary raven support - KIQ support for compute rings - MEC queue management rework - DCE6 Audio support - SR-IOV improvements - Better radeon/amdgpu selection support nouveau: - HDMI stereoscopic support - Display code rework for >= GM20x GPUs msm: - GEM rework for fine-grained locking - Per-process pagetable work - HDMI fixes for Snapdragon 820. vc4: - Remove 256MB CMA limit from vc4 - Add out-fence support - Add support for cygnus - Get/set tiling ioctls support - Add T-format tiling support for scanout zte: - add VGA support. etnaviv: - Thermal throttle support for newer GPUs - Restore userspace buffer cache performance - dma-buf sync fix stm: - add stm32f429 display support exynos: - Rework vblank handling - Fixup sw-trigger code sun4i: - V3s display engine support - HDMI support for older SoCs - Preliminary work on dual-pipeline SoCs. rcar-du: - VSP work imx-drm: - Remove counter load enable from PRE - Double read/write reduction flag support tegra: - Documentation for the host1x and drm driver. - Lots of staging ioctl fixes due to grate project work. omapdrm: - dma-buf fence support - TILER rotation fixes" * tag 'drm-for-v4.13' of git://people.freedesktop.org/~airlied/linux: (1270 commits) drm: Remove unused drm_file parameter to drm_syncobj_replace_fence() drm/amd/powerplay: fix bug fail to remove sysfs when rmmod amdgpu. amdgpu: Set cik/si_support to 1 by default if radeon isn't built drm/amdgpu/gfx9: fix driver reload with KIQ drm/amdgpu/gfx8: fix driver reload with KIQ drm/amdgpu: Don't call amd_powerplay_destroy() if we don't have powerplay drm/ttm: Fix use-after-free in ttm_bo_clean_mm drm/amd/amdgpu: move get memory type function from early init to sw init drm/amdgpu/cgs: always set reference clock in mode_info drm/amdgpu: fix vblank_time when displays are off drm/amd/powerplay: power value format change for Vega10 drm/amdgpu/gfx9: support the amdgpu.disable_cu option drm/amd/powerplay: change PPSMC_MSG_GetCurrPkgPwr for Vega10 drm/amdgpu: Make amdgpu_cs_parser_init static (v2) drm/amdgpu/cs: fix a typo in a comment drm/amdgpu: Fix the exported always on CU bitmap drm/amdgpu/gfx9: gfx_v9_0_enable_gfx_static_mg_power_gating() can be static drm/amdgpu/psp: upper_32_bits/lower_32_bits for address setup drm/amd/powerplay/cz: print message if smc message fails drm/amdgpu: fix typo in amdgpu_debugfs_test_ib_init ...
2017-06-26drm/i915: Retire the VMA's fence tracker before unbindingChris Wilson
Since we may track unfenced access (GPU access to the vma that explicitly requires no fence), vma->last_fence may be set without any attached fence (vma->fence) and so will not be flushed when we call i915_vma_put_fence(). Since we stopped doing a full retire of the activity trackers for unbind, we need to explicitly retire each tracker. Fixes: b0decaf75bd9 ("drm/i915: Track active vma requests") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170620124321.1108-1-chris@chris-wilson.co.uk Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> (cherry picked from commit 760a898d8069111704e1bd43f00ebf369ae46e57) Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2017-06-16drm/i915: Stash a pointer to the obj's resv in the vmaChris Wilson
During execbuf, a mandatory step is that we add this request (this fence) to each object's reservation_object. Inside execbuf, we track the vma, and to add the fence to the reservation_object then means having to first chase the obj, incurring another cache miss. We can reduce the number of cache misses by stashing a pointer to the reservation_object in the vma itself. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170616140525.6394-1-chris@chris-wilson.co.uk
2017-06-16drm/i915: Store a persistent reference for an object in the execbuffer cacheChris Wilson
If we take a reference to the object/vma when it is first used in an execbuf, we can keep that reference until the object's file-local handle is closed. Thereby saving a frequent ref/unref pair. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2017-06-16drm/i915: Eliminate lots of iterations over the execobjects arrayChris Wilson
The major scaling bottleneck in execbuffer is the processing of the execobjects. Creating an auxiliary list is inefficient when compared to using the execobject array we already have allocated. Reservation is then split into phases. As we lookup up the VMA, we try and bind it back into active location. Only if that fails, do we add it to the unbound list for phase 2. In phase 2, we try and add all those objects that could not fit into their previous location, with fallback to retrying all objects and evicting the VM in case of severe fragmentation. (This is the same as before, except that phase 1 is now done inline with looking up the VMA to avoid an iteration over the execobject array. In the ideal case, we eliminate the separate reservation phase). During the reservation phase, we only evict from the VM between passes (rather than currently as we try to fit every new VMA). In testing with Unreal Engine's Atlantis demo which stresses the eviction logic on gen7 class hardware, this speed up the framerate by a factor of 2. The second loop amalgamation is between move_to_gpu and move_to_active. As we always submit the request, even if incomplete, we can use the current request to track active VMA as we perform the flushes and synchronisation required. The next big advancement is to avoid copying back to the user any execobjects and relocations that are not changed. v2: Add a Theory of Operation spiel. v3: Fall back to slow relocations in preparation for flushing userptrs. v4: Document struct members, factor out eb_validate_vma(), add a few more comments to explain some magic and hide other magic behind macros. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2017-06-16drm/i915: Store a direct lookup from object handle to vmaChris Wilson
The advent of full-ppgtt lead to an extra indirection between the object and its binding. That extra indirection has a noticeable impact on how fast we can convert from the user handles to our internal vma for execbuffer. In order to bypass the extra indirection, we use a resizable hashtable to jump from the object to the per-ctx vma. rhashtable was considered but we don't need the online resizing feature and the extra complexity proved to undermine its usefulness. Instead, we simply reallocate the hastable on demand in a background task and serialize it before iterating. In non-full-ppgtt modes, multiple files and multiple contexts can share the same vma. This leads to having multiple possible handle->vma links, so we only use the first to establish the fast path. The majority of buffers are not shared and so we should still be able to realise speedups with multiple clients. v2: Prettier names, more magic. v3: Many style tweaks, most notably hiding the misuse of execobj[].rsvd2 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2017-06-16drm/i915: Make i915_vma_destroy() staticChris Wilson
i915_vma_destroy() is now not used outside of i915_vma.c so we can remove the export and make the function static. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170616123508.12673-1-chris@chris-wilson.co.uk Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
2017-06-15drm/i915: Use vma->exec_entry as our double-entry placeholderChris Wilson
This has the benefit of not requiring us to manipulate the vma->exec_link list when tearing down the execbuffer, and is a marginally cheaper test to detect the user error. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170615081435.17699-2-chris@chris-wilson.co.uk
2017-02-27drm/i915: Remove the vma from the drm_mm if binding failsChris Wilson
As we track whether a vma has been inserted into the drm_mm using the vma->flags, if we fail to bind the vma into the GTT we do not update those bits and will attempt to reinsert the vma into the drm_mm on future passes. To prevent that, we want to unwind i915_vma_insert() if we fail in our attempt to bind. Fixes: 59bfa1248e22 ("drm/i915: Start passing around i915_vma from execbuffer") Testcase: igt/drv_selftest/live_gtt Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.william.auld@gmail.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: <stable@vger.kernel.org> # v4.9+ Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170227122654.27651-3-chris@chris-wilson.co.uk
2017-02-25drm/i915: Sanity check the vma->node prior to binding into the GTTChris Wilson
We rely on the VMA being allocated inside the drm_mm and for its allotted node being large enough to accommodate all the vma->pages. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.william.auld@gmail.com> Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170225181122.4788-3-chris@chris-wilson.co.uk
2017-02-15drm/i915: Move allocate_va_range to GTTChris Wilson
In the future, we need to call allocate_va_range on the aliasing-ppgtt which means moving the call down from the vma into the vm (which is more appropriate for calling the vm function). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170215084357.19977-8-chris@chris-wilson.co.uk
2017-02-13drm/i915: Exercise i915_vma_pin/i915_vma_insertChris Wilson
High-level testing of the struct drm_mm by verifying our handling of weird requests to i915_vma_pin. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170213171558.20942-35-chris@chris-wilson.co.uk
2017-02-13drm/i915: Test creation of VMAChris Wilson
Simple test to exercise creation and lookup of VMA within an object. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170213171558.20942-34-chris@chris-wilson.co.uk
2017-02-09drm/i915: Assert that we never create a vma for the aliasing_ppgttChris Wilson
The aliasing_ppgtt is just a container for the HW context that mirrors the global gtt. It should never be used directly, so assert if we make the mistake of trying to allocate a VMA for it. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170209111933.12420-1-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2017-01-23drm/i915: Pevent copying uninitialised garbage into vma->ggtt_viewChris Wilson
Since tweaking i915_vma_compare() we allowed constructors to skip clearing the ggtt_view believing that we didn't access the unused members. That, as it turns out, was not entirely true. In particular, i915_gem_fault() uses ret = remap_io_mapping(area, area->vm_start + (vma->ggtt_view.partial.offset << PAGE_SHIFT), (ggtt->mappable_base + vma->node.start) >> PAGE_SHIFT, min_t(u64, vma->size, area->vm_end - area->vm_start), &ggtt->mappable); i.e. the ggtt_view.partial for both normal and partial views. If we allowed garbage into the normal vma->ggtt_view and then try userspace tried to mmap it, we could explode in an unobvious fashion. Fixes: 7b92c047bae2 ("drm/i915: Eliminate superfluous i915_ggtt_view_rotated") Fixes: 3bf4d5751943 ("drm/i915: Stop clearing i915_ggtt_view") Reported-by: Matthew Auld <matthew.william.auld@gmail.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Matthew Auld <matthew.william.auld@gmail.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170123145245.3972-1-chris@chris-wilson.co.uk Tested-by: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com>
2017-01-21drm/i915: reinstate call to trace_i915_vma_bindDaniele Ceraolo Spurio
The call went away in: commit 3b16525cc4c1a43e9053cfdc414356eea24bdfad Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Thu Aug 4 16:32:25 2016 +0100 drm/i915: Split insertion/binding of an object into the VM It is useful to have this trace as it pairs nicely with the vma_unbind one to track vma activity. Added inside the i915_vma_bind function (was outside before) to keep a similar placement as trace_i915_vma_unbind. v2: print bind_flags instead of flags (Chris) Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1484949083-11430-1-git-send-email-daniele.ceraolospurio@intel.com Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-01-21drm/i915: Assert that created vma has a whole number of pagesChris Wilson
VMA (and their objects) are supposed to composed of whole pages. Add an assert to catch any invalid construct when we create the VMA. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170119192659.31789-6-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2017-01-21drm/i915: Assert the drm_mm_node is allocated when on the VM listsChris Wilson
Before moving the vma between the VM active/inactive lists, assert that the node is still allocated. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/20170119192659.31789-5-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2017-01-21drm/i915: Reject vma creation larger than address spaceChris Wilson
Disallow creation of a vma that is larger than the available address space, or triggers an overflow on fence expansion. Testcase: igt/gem_exec_reloc/gtt-32 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170119192659.31789-3-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2017-01-19drm/i915: Remove i915_gem_object_to_ggtt()Chris Wilson
With the last user of this convenience wrapper gone, we can kill the wrapper and in the process make the lookup function static. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/20170116152131.18089-5-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2017-01-19drm/i915: Remove i915_vma_create from VMA APIChris Wilson
With the introduce of i915_vma_instance() for obtaining the VMA singleton for a (obj, vm, view) tuple, we can remove the i915_vma_create() in favour of a single entry point. We do incur a lookup onto an empty tree, but the i915_vma_create() were being called infrequently and during initialisation, so the small overhead is negligible. v2: Drop the i915_ prefix from the now static vma_create() function Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170116152131.18089-4-chris@chris-wilson.co.uk
2017-01-19drm/i915: Add a check that the VMA instance we lookup matches the requestChris Wilson
Just as added paranoia against our future-selves add another check that the lookup/created VMA instance matches the request. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170116152131.18089-3-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2017-01-19drm/i915: Rename some warts in the VMA APIChris Wilson
Whilst writing testcases to exercise the VMA API, some oddities came to light, such as i915_gem_obj_lookup_or_create(). Joonas suggested i915_vma_instance() as a neat replacement, so rename them, move them to i915_vma.c and add some kerneldoc as a sugary bonus. s/i915_gem_obj_to_vma/i915_vma_lookup/ s/i915_gem_obj_lookup_or_create_vma/i915_vma_instance/ Suggested-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170116152131.18089-2-chris@chris-wilson.co.uk
2017-01-14drm/i915: Convert i915_ggtt_view to use an anonymous unionChris Wilson
Reading the ggtt_views is much more pleasant without the extra characters from specifying the union (i.e. ggtt_view.partial rather than ggtt_view.params.partial). To make this work inside i915_vma_compare() with only a single memcmp requires us to ensure that there are no uninitialised bytes within each branch of the union (we make sure the structs are packed) and we need to store the size of each branch. v4: Rewrite changelog and add comments explaining the assert. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/20170114002827.31315-5-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2017-01-13drm/i915: Assert that we have allocated the drm_mm_node upon pinningChris Wilson
We currently check after the slow path that the vma is bound correctly, but we don't currently check after the fast path. This is important in case we accidentally take the fast path and leave the vma misplaced. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/20170111210937.29252-27-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2017-01-11drm/i915: Extract reserving space in the GTT to a helperChris Wilson
Extract drm_mm_reserve_node + calling i915_gem_evict_for_node into its own routine so that it can be shared rather than duplicated. v2: Kerneldoc Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: igvt-g-dev@lists.01.org Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170111112312.31493-2-chris@chris-wilson.co.uk
2017-01-11drm/i915: Use the MRU stack search after evictingChris Wilson
When we evict from the GTT to make room for an object, the hole we create is put onto the MRU stack inside the drm_mm range manager. On the next search pass, we can speed up a PIN_HIGH allocation by referencing that stack for the new hole. v2: Pull together the 3 identical implements (ahem, a couple were outdated) into a common routine for allocating a node and evicting as necessary. v3: Detect invalid calls to i915_gem_gtt_insert() v4: kerneldoc Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170111112312.31493-1-chris@chris-wilson.co.uk
2017-01-10drm/i915: Replace 4096 with PAGE_SIZE or I915_GTT_PAGE_SIZEChris Wilson
Start converting over from the byte count to its semantic macro, either we want to allocate the size of a physical page in main memory or we want the size of a virtual page in the GTT. 4096 could mean either, but PAGE_SIZE and I915_GTT_PAGE_SIZE are explicit and should help improve code comprehension and future changes. In the future, we may want to use variable GTT page sizes and so have the challenge of knowing which hardcoded values were used to represent a physical page vs the virtual page. v2: Look for a few more 4096s to convert, discover IS_ALIGNED(). v3: 4096ul paranoia, make fence alignment a distinct value of 4096, keep bdw stolen w/a as 4096 until we know better. v4: Add asserts that i915_vma_insert() start/end are aligned to GTT page sizes. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/20170110144734.26052-1-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2017-01-10drm/i915: Move ggtt fence/alignment to i915_gem_tiling.cChris Wilson
Rename i915_gem_get_ggtt_size() and i915_gem_get_ggtt_alignment() to i915_gem_fence_size() and i915_gem_fence_alignment() respectively to better match usage. Similarly move the pair of functions into i915_gem_tiling.c next to the fence restrictions. Suggested-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170109161613.11881-6-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2017-01-10drm/i915: Store required fence size/alignment for GGTT vmaChris Wilson
The fence size/alignment is a combination of the vma size plus object tiling parameters. Those parameters are rarely changed, making the fence size/alignemnt roughly constant for the lifetime of the VMA. We can simplify subsequent calculations by precalculating the size/alignment required for GGTT vma taking fencing into account (with an update if we do change the tiling or stride). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170109161613.11881-4-chris@chris-wilson.co.uk
2017-01-10drm/i915: Align GGTT sizes to a fence tile rowChris Wilson
Ensure the view occupies the full tile row so that reads/writes into the VMA do not escape (via fenced detiling) into neighbouring objects - we will pad the object with scratch pages to satisfy the fence. This applies the lazy-tiling we employed on gen2/3 to gen4+. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170109161613.11881-2-chris@chris-wilson.co.uk
2017-01-06drm/i915: Use range_overflows()Chris Wilson
Replace a few more open-coded overflow checks with the macro. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170106152013.24684-4-chris@chris-wilson.co.uk
2017-01-04Merge tag 'drm-misc-next-2016-12-30' of ↵Daniel Vetter
git://anongit.freedesktop.org/git/drm-misc into drm-intel-next-queued Directly merge drm-misc into drm-intel since Dave is on vacation and we need the various drm-misc patches (fb format rework, drm mm fixes, selftest framework and others). Also pulled back -rc2 in first to resync with drm-intel-fixes and make sure I can reuse the exact rerere solutions from drm-tip for safety, and because I'm lazy. Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
2016-12-31drm/i915: Move assert of page pin vs bind count into i915_vma_unbindChris Wilson
The read of the page pin count and the bind count are unordered, presenting races in the assert and it firing off incorrectly. Prevent this by restricting the assert to the vma bind/unbind routines where we have local cpu ordering between the two. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/20161231112012.29263-1-chris@chris-wilson.co.uk
2016-12-28drm: Wrap drm_mm_node.hole_followsChris Wilson
Insulate users from changes to the internal hole tracking within struct drm_mm_node by using an accessor for hole_follows. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> [danvet: resolve conflicts in i915_vma.c] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2016-12-23drm/i915: Assert that the partial VMA fits within the objectChris Wilson
When creating a partial VMA assert that it first fits with the parent object, and that if it covers the whole of the parent a normal view was created instead. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161223145804.6605-5-chris@chris-wilson.co.uk
2016-12-16drm/i915: convert to using range_overflowsMatthew Auld
Convert some of the obvious hand-rolled ranged overflow sanity checks to our shiny new range_overflows macro. Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Suggested-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161213203222.32564-4-matthew.auld@intel.com Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-12-16drm/i915: move vma sanity checking into i915_vma_bindMatthew Auld
If we move the sanity checking from gen8_alloc_va_range_3lvl and gen6_alloc_va_range into i915_vma_bind, we will increase our coverage to now both callbacks. We also convert each WARN_ON over to a GEM_WARN_ON. Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Suggested-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161213203222.32564-2-matthew.auld@intel.com Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-12-05drm/i915: Tidy i915_gem_valid_gtt_space()Chris Wilson
We can replace a couple of tests with an assertion that the passed in node is already allocated (as matches the existing call convention) and by a small bit of refactoring we can bring the line lengths to under 80cols. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161205142941.21965-3-chris@chris-wilson.co.uk
2016-12-05drm/i915: Fix i915_gem_evict_for_vma (soft-pinning)Chris Wilson
Soft-pinning depends upon being able to check for availabilty of an interval and evict overlapping object from a drm_mm range manager very quickly. Currently it uses a linear list, and so performance is dire and not suitable as a general replacement. Worse, the current code will oops if it tries to evict an active buffer. It also helps if the routine reports the correct error codes as expected by its callers and emits a tracepoint upon use. For posterity since the wrong patch was pushed (i.e. that missed these key points and had known bugs), this is the changelog that should have been on commit 506a8e87d8d2 ("drm/i915: Add soft-pinning API for execbuffer"): Userspace can pass in an offset that it presumes the object is located at. The kernel will then do its utmost to fit the object into that location. The assumption is that userspace is handling its own object locations (for example along with full-ppgtt) and that the kernel will rarely have to make space for the user's requests. This extends the DRM_IOCTL_I915_GEM_EXECBUFFER2 to do the following: * if the user supplies a virtual address via the execobject->offset *and* sets the EXEC_OBJECT_PINNED flag in execobject->flags, then that object is placed at that offset in the address space selected by the context specifier in execbuffer. * the location must be aligned to the GTT page size, 4096 bytes * as the object is placed exactly as specified, it may be used by this execbuffer call without relocations pointing to it It may fail to do so if: * EINVAL is returned if the object does not have a 4096 byte aligned address * the object conflicts with another pinned object (either pinned by hardware in that address space, e.g. scanouts in the aliasing ppgtt) or within the same batch. EBUSY is returned if the location is pinned by hardware EINVAL is returned if the location is already in use by the batch * EINVAL is returned if the object conflicts with its own alignment (as meets the hardware requirements) or if the placement of the object does not fit within the address space All other execbuffer errors apply. Presence of this execbuf extension may be queried by passing I915_PARAM_HAS_EXEC_SOFTPIN to DRM_IOCTL_I915_GETPARAM and checking for a reported value of 1 (or greater). v2: Combine the hole/adjusted-hole ENOSPC checks v3: More color, more splitting, more blurb. Fixes: 506a8e87d8d2 ("drm/i915: Add soft-pinning API for execbuffer") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161205142941.21965-2-chris@chris-wilson.co.uk
2016-11-29drm/i915: Convert vm->dev backpointer to vm->i915Chris Wilson
99% of the time we access i915_address_space->dev we want the i915 device and not the drm device, so let's store the drm_i915_private backpointer instead. The only real complication here are the inlines in i915_vma.h where drm_i915_private is not yet defined and so we have to choose an alternate path for our asserts. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161129095008.32622-1-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2016-11-18drm/i915: Move frontbuffer CS write tracking from ggtt vma to objectChris Wilson
I tried to avoid having to track the write for every VMA by only tracking writes to the ggtt. However, for the purposes of frontbuffer tracking this is insufficient as we need to invalidate around writes not just to the the ggtt but all aliased ppgtt views of the framebuffer. By moving the critical section to the object and only doing so for framebuffer writes we can reduce the tracking even further by only watching framebuffers and not vma. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Paulo Zanoni <paulo.r.zanoni@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161116190704.5293-1-chris@chris-wilson.co.uk Tested-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2016-11-11drm/i915: Split out i915_vma.cJoonas Lahtinen
As a side product, had to split two other files; - i915_gem_fence_reg.h - i915_gem_object.h (only parts that needed immediate untanglement) I tried to move code in as big chunks as possible, to make review easier. i915_vma_compare was moved to a header temporarily. v2: - Use i915_gem_fence_reg.{c,h} v3: - Rebased v4: - Fix building when DEBUG_GEM is enabled by reordering a bit. Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1478861034-30643-1-git-send-email-joonas.lahtinen@linux.intel.com