Age | Commit message (Collapse) | Author |
|
After much hair pulling, resort to preallocating the ppGTT entries on
init to circumvent the apparent lack of PD invalidate following the
write to PP_DCLV upon switching mm between contexts (and here the same
context after binding new objects). However, the details of that PP_DCLV
invalidate are still unknown, and it appears we need to reload the mm
twice to cover over a timing issue. Worrying.
Fixes: 3dc007fe9b2b ("drm/i915/gtt: Downgrade gen7 (ivb, byt, hsw) back to aliasing-ppgtt")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191129201328.1398583-1-chris@chris-wilson.co.uk
|
|
The following patches in the series will use it to avoid certain
operations when the mappable aperture is not available in HW.
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191029095856.25431-1-matthew.auld@intel.com
|
|
The HW performs swizzling as part of its fence tiling inside the Global
GTT. We already do the probing of the HW settings from the GGTT setup,
complete the picture by storing the information as part of the GGTT. The
primary benefit is the consistency of our probe routines do not break
the i915_ggtt encapsulation.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191016143234.4075-2-chris@chris-wilson.co.uk
|
|
Replace the struct_mutex requirement for pinning the i915_vma with the
local vm->mutex instead. Note that the vm->mutex is tainted by the
shrinker (we require unbinding from inside fs-reclaim) and so we cannot
allocate while holding that mutex. Instead we have to preallocate
workers to do allocate and apply the PTE updates after we have we
reserved their slot in the drm_mm (using fences to order the PTE writes
with the GPU work and with later unbind).
In adding the asynchronous vma binding, one subtle requirement is to
avoid coupling the binding fence into the backing object->resv. That is
the asynchronous binding only applies to the vma timeline itself and not
to the pages as that is a more global timeline (the binding of one vma
does not need to be ordered with another vma, nor does the implicit GEM
fencing depend on a vma, only on writes to the backing store). Keeping
the vma binding distinct from the backing store timelines is verified by
a number of async gem_exec_fence and gem_exec_schedule tests. The way we
do this is quite simple, we keep the fence for the vma binding separate
and only wait on it as required, and never add it to the obj->resv
itself.
Another consequence in reducing the locking around the vma is the
destruction of the vma is no longer globally serialised by struct_mutex.
A natural solution would be to add a kref to i915_vma, but that requires
decoupling the reference cycles, possibly by introducing a new
i915_mm_pages object that is own by both obj->mm and vma->pages.
However, we have not taken that route due to the overshadowing lmem/ttm
discussions, and instead play a series of complicated games with
trylocks to (hopefully) ensure that only one destruction path is called!
v2: Add some commentary, and some helpers to reduce patch churn.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191004134015.13204-4-chris@chris-wilson.co.uk
|
|
Since we cannot allocate underneath the vm->mutex (it is used in the
direct-reclaim paths), we need to shift the allocations off into a
mutexless worker with fence recursion prevention. To know when we need
this protection, we mark up the address spaces that do allocate before
insertion. In the future, we may wish to extend the async bind scheme to
more than just allocations.
v2: s/vm->bind_alloc/vm->bind_async_flags/
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191004134015.13204-3-chris@chris-wilson.co.uk
|
|
The premise here is to simply avoiding having to acquire the vm->mutex
inside vma create/destroy to update the vm->unbound_lists, to avoid some
nasty lock recursions later.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191004134015.13204-2-chris@chris-wilson.co.uk
|
|
As we remove the struct_mutex protection from around the vma pinning,
counters need to be atomic and aware that there may be multiple threads
simultaneously active.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190913064200.24297-1-chris@chris-wilson.co.uk
|
|
Try to tidy up the cache-coloring such that we rid the code of any
mm.color_adjust assumptions, this should hopefully make it more obvious
in the code when we need to actually use the cache-level as the color,
and as a bonus should make adding a different color-scheme simpler.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190909124052.22900-3-matthew.auld@intel.com
|
|
This is no longer used anywhere and so can be removed. However, tracking
the dirty status on the ppgtt doesn't work very well if the ppgtt is
shared, so perhaps for the best that it is no longer required.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190830180000.24608-3-chris@chris-wilson.co.uk
|
|
The sg_table for our backing store might contain addresses from
stolen-memory or in the future local-memory, at which point this is no
longer a dma-iterator. As a consequence we should now break on NULL
iter.sgp, instead of dmap == 0 which is considered an invalid dma
address.
As a bonus, gcc much prefers this construct,
Function old new delta
gen8_ggtt_insert_entries 211 192 -19
gen6_ggtt_insert_entries 292 262 -30
i915_error_object_create 996 954 -42
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190829201919.21493-1-matthew.auld@intel.com
|
|
When under severe stress for GTT mappable space, the LRU eviction model
falls off a cliff. We spend all our time scanning the much larger
non-mappable area searching for something within the mappable zone we can
evict. Turn this on its head by only using the full vma for the object if
it is already pinned in the mappable zone or there is sufficient *free*
space to accommodate it (prioritizing speedy reuse). If there is not,
immediately fall back to using small chunks (tilerow for GTT mmap, single
pages for pwrite/relocation) and using random eviction before doing a full
search.
Testcase: igt/gem_concurrent_blt
References: https://bugs.freedesktop.org/show_bug.cgi?id=110848
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190821123234.19194-1-chris@chris-wilson.co.uk
|
|
The aliasing_ppgtt provides a PIN_USER alias for the global gtt, so move
it under the i915_ggtt to simplify later transformations to enable
intel_context.vm.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190730143209.4549-1-chris@chris-wilson.co.uk
|
|
Apply the new radix shift helpers to extract the multi-level indices
cleanly when inserting pte into the gtt tree.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190712112725.2892-5-chris@chris-wilson.co.uk
|
|
With our HW interface logic moving from i915 to gt and with GuC and HuC
being part of the gt HW, it makes sense to use the intel_gt structure
instead of i915 as our reference object in GuC/HuC paths.
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190713100016.8026-9-chris@chris-wilson.co.uk
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Replace mixed "_fini"/"_cleanup"/"_cleanup_hw" suffixes found in names
of functions called from i915_driver_release() with "_release" suffix
consistently. This provides better code readability, especially
helpful when trying to work out which phase the code is in.
Functions names starting with "i915_driver_", i.e., those defined in
drivers/gpu/dri/i915/i915_drv.c, just have their "cleanup" or "fini"
parts of their names replaced with the "_release" suffix, while names
of functions coming from other source files have been suffixed with
"_driver_release" to avoid ambiguity with other possible .release entry
points.
v2: early_probe pairs better with late_release (Chris)
v3: fix typo in commit message (Joonas)
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190712112429.740-5-janusz.krzysztofik@linux.intel.com
|
|
We can simplify our gtt walking code by comparing against NULL for
scratch entries as opposed to looking up the distinct per-level scratch
pointer.
The only caveat is to remember to protect external parties and map the
NULL to the scratch top pd.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190712094327.24437-6-chris@chris-wilson.co.uk
|
|
Each level has its own scratch. Make the levels more obvious by forgoing
the fancy similarly names and replace them with a number. 0 is the bottom
most level, the physical page used for actual data; 1+ are the page
directories.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190712094327.24437-5-chris@chris-wilson.co.uk
|
|
The radix levels of each page directory are easily determined so replace
the numerous hardcoded constants with precomputed derived constants.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190712094327.24437-4-chris@chris-wilson.co.uk
|
|
This will be useful to consolidate recursive code.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190712094327.24437-3-chris@chris-wilson.co.uk
|
|
The page directory extends the page table with the shadow entries. Make
the page directory struct embed the page table for easier code reuse.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190712094327.24437-1-chris@chris-wilson.co.uk
|
|
We only use the dma pages for scratch, and so do not need to allocate
the extra storage for the shadow page directory.
v2: Refrain from reintroducing I915_PDES
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190712075818.20616-1-chris@chris-wilson.co.uk
|
|
For all page directory entries, the pde encoding is
identical. Don't complicate call sites with different
versions of doing the same thing, so we always check the
existence of physical page before writing the entry into
it. This further generalizes the pd so that manipulation in
callsites will be identical, removing the need to handle
pdps differently for gen8.
v2: squash
v3: inc/dec with set/clear (Chris)
v4: inlines, warn, stray set_pd (Chris)
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190705215204.4559-1-chris@chris-wilson.co.uk
|
|
This reverts commit 4395890a48551982549d222d1923e2833dac47cf.
It's been over a year since this was merged, and the actual users of
intel_ppat_get / intel_ppat_put never materialized.
Time to remove it!
v2: Unbreak suspend (Chris)
v3: Rebase, drop fixes tag to avoid confusion
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Zhi Wang <zhi.a.wang@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190702113149.21200-1-michal.winiarski@intel.com
|
|
Move all timeline code under gt and rename to intel_gt prefix.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190621070811.7006-32-tvrtko.ursulin@linux.intel.com
|
|
Continuing on the theme of better logical organization of our code, make
the first step towards making the ggtt code better isolated from wider
struct drm_i915_private.
v2:
* Bring the ickle onion unwind back. (Chris)
* Rename to i915_init_ggtt. (Chris)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190621070811.7006-27-tvrtko.ursulin@linux.intel.com
|
|
This will come useful in the following patch.
v2:
* Handle mock ggtt.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190621070811.7006-21-tvrtko.ursulin@linux.intel.com
|
|
It is more logical for ggtt invalidation to take ggtt as input parameter.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190621070811.7006-20-tvrtko.ursulin@linux.intel.com
|
|
More removal of implicit dev_priv from using old mmio accessors.
v2:
* Rebase for uncore_to_i915 removal.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190621070811.7006-13-tvrtko.ursulin@linux.intel.com
|
|
Enable RCU protection of i915_address_space and its ppgtt superclasses,
and defer its cleanup into a worker executed after an RCU grace period.
In the future we will be able to use the RCU protection to reduce the
locking around VM lookups, but the immediate benefit is being able to
defer the release into a kworker (process context). This is required as
we may need to sleep to reap the WC pages stashed away inside the ppgtt.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110934
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190620183705.31006-1-chris@chris-wilson.co.uk
|
|
All page directories are identical in function, only the position in the
hierarchy differ. Use same base type for directory functionality.
v2: cleanup, size always 512, init to null
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Cc: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190614164350.30415-2-mika.kuoppala@linux.intel.com
|
|
As the fence registers only apply to regions inside the GGTT is makes
more sense that we track these as part of the i915_ggtt and not the
general mm. In the next patch, we will then pull the register locking
underneath the i915_ggtt.mutex.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190613073254.24048-1-chris@chris-wilson.co.uk
|
|
Keeping the _hw_ in there does not help to distinguish it from its
only brethren i915_ggtt, so drop it.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190611091238.15808-2-chris@chris-wilson.co.uk
|
|
Make the kref common to both derived structs (i915_ggtt and i915_ppgtt)
so that we can safely reference count an abstract ctx->vm address space.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190611091238.15808-1-chris@chris-wilson.co.uk
|
|
The code is logically about reset so it makes sense.
It also enables making i915_clear_error_registers static.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190607115932.20271-1-tvrtko.ursulin@linux.intel.com
|
|
These two are only used from within i915_gem_gtt.c and can trivially be
made static.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190607082557.31670-5-tvrtko.ursulin@linux.intel.com
|
|
Instead of relying on the caller holding struct_mutex across the
allocation, push the allocation under a tree of spinlocks stored inside
the page tables. Not only should this allow us to avoid struct_mutex
here, but it will allow multiple users to lock independent ranges for
concurrent allocations, and operate independently. This is vital for
pushing the GTT manipulation into a background thread where dependency
on struct_mutex is verboten, and for allowing other callers to avoid
struct_mutex altogether.
v2: Restore lost GEM_BUG_ON for removing too many PTE from
gen6_ppgtt_clear_range.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Acked-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190604153830.19096-1-chris@chris-wilson.co.uk
|
|
Out scatterlist utility routines can be pulled out of i915_gem.h for a
bit more decluttering.
v2: Push I915_GTT_PAGE_SIZE out of i915_scatterlist itself and into the
caller.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190528092956.14910-9-chris@chris-wilson.co.uk
|
|
For convenience in avoiding inline spaghetti, keep the type definition
as a separate header.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190528092956.14910-1-chris@chris-wilson.co.uk
|
|
We rearranged the vm_destroy_ioctl to avoid taking struct_mutex, little
realising that buried underneath the gen6 ppgtt release path was a
struct_mutex requirement (to remove its GGTT vma). Until that
struct_mutex is vanquished, take a detour in gen6_ppgtt_cleanup to do
the i915_vma_destroy from inside a worker under the struct_mutex.
<4> [257.740160] WARN_ON(debug_locks && !lock_is_held(&(&vma->vm->i915->drm.struct_mutex)->dep_map))
<4> [257.740213] WARNING: CPU: 3 PID: 1507 at drivers/gpu/drm/i915/i915_vma.c:841 i915_vma_destroy+0x1ae/0x3a0 [i915]
<4> [257.740214] Modules linked in: snd_hda_codec_hdmi i915 x86_pkg_temp_thermal mei_hdcp coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_hda_codec_realtek snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core r8169 realtek snd_pcm mei_me mei prime_numbers lpc_ich
<4> [257.740224] CPU: 3 PID: 1507 Comm: gem_vm_create Tainted: G U 5.2.0-rc1-CI-CI_DRM_6118+ #1
<4> [257.740225] Hardware name: MSI MS-7924/Z97M-G43(MS-7924), BIOS V1.12 02/15/2016
<4> [257.740249] RIP: 0010:i915_vma_destroy+0x1ae/0x3a0 [i915]
<4> [257.740250] Code: 00 00 00 48 81 c7 c8 00 00 00 e8 ed 08 f0 e0 85 c0 0f 85 78 fe ff ff 48 c7 c6 e8 ec 30 a0 48 c7 c7 da 55 33 a0 e8 42 8c e9 e0 <0f> 0b 8b 83 40 01 00 00 85 c0 0f 84 63 fe ff ff 48 c7 c1 c1 58 33
<4> [257.740251] RSP: 0018:ffffc90000aafc68 EFLAGS: 00010282
<4> [257.740252] RAX: 0000000000000000 RBX: ffff8883f7957840 RCX: 0000000000000003
<4> [257.740253] RDX: 0000000000000046 RSI: 0000000000000006 RDI: ffffffff8212d1b9
<4> [257.740254] RBP: ffffc90000aafcc8 R08: 0000000000000000 R09: 0000000000000000
<4> [257.740255] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8883f4d5c2a8
<4> [257.740256] R13: ffff8883f4d5d680 R14: ffff8883f4d5c668 R15: ffff8883f4d5c2f0
<4> [257.740257] FS: 00007f777fa8fe40(0000) GS:ffff88840f780000(0000) knlGS:0000000000000000
<4> [257.740258] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4> [257.740259] CR2: 00007f777f6522b0 CR3: 00000003c612a006 CR4: 00000000001606e0
<4> [257.740260] Call Trace:
<4> [257.740283] gen6_ppgtt_cleanup+0x25/0x60 [i915]
<4> [257.740306] i915_ppgtt_release+0x102/0x290 [i915]
<4> [257.740330] i915_gem_vm_destroy_ioctl+0x7c/0xa0 [i915]
<4> [257.740376] ? i915_gem_vm_create_ioctl+0x160/0x160 [i915]
<4> [257.740379] drm_ioctl_kernel+0x83/0xf0
<4> [257.740382] drm_ioctl+0x2f3/0x3b0
<4> [257.740422] ? i915_gem_vm_create_ioctl+0x160/0x160 [i915]
<4> [257.740426] ? _raw_spin_unlock_irqrestore+0x39/0x60
<4> [257.740430] do_vfs_ioctl+0xa0/0x6e0
<4> [257.740433] ? lock_acquire+0xa6/0x1c0
<4> [257.740436] ? __task_pid_nr_ns+0xb9/0x1f0
<4> [257.740439] ksys_ioctl+0x35/0x60
<4> [257.740441] __x64_sys_ioctl+0x11/0x20
<4> [257.740443] do_syscall_64+0x55/0x1c0
<4> [257.740445] entry_SYSCALL_64_after_hwframe+0x49/0xbe
References: e0695db7298e ("drm/i915: Create/destroy VM (ppGTT) for use with contexts")
Fixes: 7f3f317a66ca ("drm/i915: Restore control over ppgtt for context creation ABI")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190523064933.23604-1-chris@chris-wilson.co.uk
|
|
To overcome display engine stride limits we'll want to remap the
pages in the GTT. To that end we need a new gtt_view type which
is just like the "rotated" type except not rotated.
v2: Use intel_remapped_plane_info base type
s/unused/unused_mbz/ (Chris)
Separate BUILD_BUG_ON()s (Chris)
Use I915_GTT_PAGE_SIZE (Chris)
v3: Use i915_gem_object_get_dma_address() (Chris)
Trim the sg (Tvrtko)
v4: Actually trim this time. Limit the max length
to one row of pages to keep things simple
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190509122159.24376-2-ville.syrjala@linux.intel.com
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
|
|
It was noted that we made the same mistake for VM_ID as for object
handles, whereby we ensured that we only allocated a single handle for
one ppgtt. This has the unfortunate consequence for userspace that they
need to reference count the handles to avoid destroying an active ID. If
we allow multiple handles to the same ppgtt, userspace can freely
unreference any handle they own without fear of destroying the same
handle in use elsewhere.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190425054333.27299-1-chris@chris-wilson.co.uk
|
|
Start partitioning off the code that talks to the hardware (GT) from the
uapi layers and move the device facing code under gt/
One casualty is s/intel_ringbuffer.h/intel_engine.h/ with the plan to
subdivide that header and body further (and split out the submission
code from the ringbuffer and logical context handling). This patch aims
to be simple motion so git can fixup inflight patches with little mess.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190424174839.7141-1-chris@chris-wilson.co.uk
|
|
GuC and HuC depend on struct_mutex for device reinitialization. Moving
away from this dependency requires perma-pinning the firmware images in
GGTT. The upper portion of the GuC address space has a sizeable hole
(several MB) that is inaccessible by GuC. Reserve this range within GGTT
as it can comfortably hold GuC/HuC firmware images.
v2: Reserve node rather than insert (Chris)
Simpler determination of node start/size (Daniele)
Move reserve/release out to intel_guc.* files
v3: Reserve starting at GUC_GGTT_TOP only and bail if this
fails (Chris)
Signed-off-by: Fernando Pacheco <fernando.pacheco@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190419230015.18121-3-fernando.pacheco@intel.com
|
|
We want to use intel_engine_mask_t inside i915_request.h, which means
extracting it from the general header file mess and placing it inside a
types.h. A knock on effect is that the compiler wants to warn about
type-contraction of ALL_ENGINES into intel_engine_maskt_t, so prepare
for the worst.
v2: Use intel_engine_mask_t consistently
v3: Move I915_NUM_ENGINES to its natural home at the end of the enum
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190401162641.10963-1-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
|
|
In preparation to making the ppGTT binding for a context explicit (to
facilitate reusing the same ppGTT between different contexts), allow the
user to create and destroy named ppGTT.
v2: Replace global barrier for swapping over the ppgtt and tlbs with a
local context barrier (Tvrtko)
v3: serialise with struct_mutex; it's lazy but required dammit
v4: Rewrite igt_ctx_shared_exec to be more different (aimed to be more
similarly, turned out different!)
v5: Fix up test unwind for aliasing-ppgtt (snb)
v6: Tighten language for uapi struct drm_i915_gem_vm_control.
v7: Patch the context image for runtime ppgtt switching!
Testcase: igt/gem_vm_create
Testcase: igt/gem_ctx_param/vm
Testcase: igt/gem_ctx_clone/vm
Testcase: igt/gem_ctx_shared
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190322092325.5883-2-chris@chris-wilson.co.uk
|
|
In later patches, it became apparent that userspace can see a partially
constructed GEM context and begin using it before it was ready, to much
hilarity. Close this window of opportunity by lifting the registration of
the context with userspace (the insertion of the context into the filp's
idr) to the very end of the CONTEXT_CREATE ioctl.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190321140711.11190-1-chris@chris-wilson.co.uk
|
|
Large ppGTT are differentiated by the requirement to go to four levels
to address more than 32b. Given the introduction of more 4 level ppGTT
with different sizes of addressable bits, rename i915_vm_is_48b() to
better reflect the commonality of using 4 levels.
Based on a patch by Bob Paauwe.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Bob Paauwe <bob.j.paauwe@intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190314223839.28258-4-chris@chris-wilson.co.uk
|
|
In the next patch, we are introducing a broad virtual engine to encompass
multiple physical engines, losing the 1:1 nature of BIT(engine->id). To
reflect the broader set of engines implied by the virtual instance, lets
store the full bitmask.
v2: Use intel_engine_mask_t (s/ring_mask/engine_mask/)
v3: Tvrtko voted for moah churn so teach everyone to not mention ring
and use $class$instance throughout.
v4: Comment upon the disparity in bspec for using VCS1,VCS2 in gen8 and
VCS[0-4] in later gen. We opt to keep the code consistent and use
0-index naming throughout.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190305180332.30900-1-chris@chris-wilson.co.uk
|
|
As the scratch page is the only one to be allocated with variable size,
rather than keep an unused slot in all i915_page_table structs, store it
alongside the vm->scratch_page.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190305135430.4948-1-chris@chris-wilson.co.uk
|
|
Previously we only accommodated having a vma pinned by a small number of
users, with the maximum being pinned for use by the display engine. As
such, we used a small bitfield only large enough to allow the vma to
be pinned twice (for back/front buffers) in each scanout plane. Keeping
the maximum permissible pin_count small allows us to quickly catch a
potential leak. However, as we want to split a 4096B page into 64
different cachelines and pin each cacheline for use by a different
timeline, we will exceed the current maximum permissible vma->pin_count
and so time has come to enlarge it.
Whilst we are here, try to pull together the similar bits:
Address/layout specification:
- bias, mappable, zone_4g: address limit specifiers
- fixed: address override, limits still apply though
- high: not strictly an address limit, but an address direction to search
Search controls:
- nonblock, nonfault, noevict
v2: Rewrite the guideline comment on bit consumption.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: John Harrison <john.C.Harrison@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190128181812.22804-2-chris@chris-wilson.co.uk
|