diff options
author | Dave Airlie <airlied@redhat.com> | 2019-10-22 13:51:04 +1000 |
---|---|---|
committer | Dave Airlie <airlied@redhat.com> | 2019-10-22 13:51:05 +1000 |
commit | 89910e62009a9359e6302f7e036a0a4564869cca (patch) | |
tree | dfb6d5f1f6a116e06bada6b4f5db2629db9aa367 /drivers/gpu/drm/i915/gt | |
parent | 7ed093602e0e1b60a0fc074a9692687e7d2b723d (diff) | |
parent | ce53908bba6fa6e905d8fe81da4591d3e7a65878 (diff) |
Merge tag 'drm-intel-next-2019-10-21' of git://anongit.freedesktop.org/drm/drm-intel into drm-next
UAPI Changes:
- Introduce a versioning of the i915-perf uapi (Lionel)
- Add support for perf configuration queries (Lionel)
Allow listing perf configurations with IOCTL in addition
to sysfs. This is useful in container usecases.
- Allow dynamic reconfiguration of the OA stream (Chris)
Allows the OA stream to be reconfigured between
batch buffers, giving greater flexibility in sampling.
- Allow holding preemption on filtered perf ctx
Allow CAP_ADMIN to block pre-emption of a context
to query performance counters without disturbances.
Mesa changes: https://gitlab.freedesktop.org/mesa/mesa/merge_requests/932
Cross-subsystem Changes:
- drm-next backmerge for HDR DP changes
https://lists.freedesktop.org/archives/dri-devel/2019-September/236453.html
Driver Changes:
- Add DC3CO sleep state for Tigerlake (Anshuman)
- Tigerlake BCS engine support engine relative MMIO (Daniele)
- Simplify the Tigerlake LRC register list for !RCS (Daniele)
- Read SAGV block time from PCODE on Tigerlake (James)
- Add 12 missing Tigerlake workarounds (Mika)
- Enable DDI/Port G for Tigerlake (Khaled)
- Avoid hang in tsg,vfe units by keeping l3 clocks ICL+(Mika)
- Fix Bugzilla #111966: Favor last VBT child device (Ville)
- Fix blue/black screen on boot due to broken gamma (Swati)
- Add support of BT.2020 Colorimetry to DP MSA (Gwan-gyeong)
- Attach colorspace property to DP connector (Gwan-gyeong)
- Attach HDR metadata property to DP connector (Gwan-gyeong)
- Base intel_memory_region support prep for local memory (Matt A)
- Introduce Jasper Lake PCH (Matt R)
- Support multiple GPUs in PMU (Tvrtko)
- Fix MST oops due to MSA changes (Ville)
- Refuse modes with hdisplay==4096 on pre-HSW DP (Ville)
- Correct the PCH type in irq postinstall for JSP (Vivek)
- Save Master transcoder in slave's crtc_state for Transcoder Port Sync (Manasi)
- Enable TRANSCODER PORT SYNC for tiled displays across separate ports (Manasi)
- HW state readout for transcoder port sync config (Manasi)
- Enable master-slaves in trans port sync (Manasi)
- In port sync mode disable slaves first then master (Manasi)
- Fix port checks for MST support on gen >= 11 (Lucas)
- Flush submission tasklet before waiting/retiring (Chris)
- Flush tasklet submission before sleeping on i915_request_wait (Chris)
- Object pin reference counting fixes (Chris, Matt A)
- Clear semaphore immediately upon ELSP promotion (Chris)
- Child device size remains unchanged through VBT 229 (Matt R)
- Restore dropped 'interruptible' flag on retiring requests (Chris)
- Treat a busy timeline as 'active' while waiting (Chris)
- Clean up struct_mutex from perf (Chris)
- Update locking around execlists->active (Chris)
- Mark up expected execlist state during reset (Chris)
- Remove cursor use of properties for coordinates (Maarten)
- Only mark incomplete requests as -EIO on cancelling (Chris)
- Add an rcu_barrier option to i915_drop_caches (Chris)
- Replace perf global wakeref tracking with engine-pm (Chris)
- Prevent merging requests with conflicting flags (Chris)
- Allow for CS OA configs to be created lazily (Lionel)
- Implement active wait for noa configurations (Lionel)
- Execute OA configuration from command stream (Lionel)
- Prefer using the pinned_ctx for emitting delays on config (Chris)
- Port C's hotplug interrupt is associated with TC1 bits (Vivek, Matt R)
- Extend program of VSC Header and DB for Colorimetry Format (Gwan-gyeong)
- Fine-tune timeslicing of contexts (Chris)
- Do initial mocs configuration directly (Chris)
- Fix uninitialized variable on PMU error path (Tvrtko)
- Don't disable interrupts independently of the locking (Sebastian)
- Eliminate struct_mutext from GVT (Chris)
- Move perf types to their own header (Lionel)
- Drop list of perf streams (always size 1) (Lionel)
- Store the perf associated engine of a stream (Lionel)
- Make array hw_engine_mask static (Colin)
- Prefer shortest path to RPM/perf/GT instead of dev_priv (Chris, Tvrtko)
- Virtual request submission fixes (Chris)
- Selftest/CI improvements (Chris)
- Fix Kconfig indentation (Krzysztof)
- Give engine->kernel_context distinct timeline lock classes (Chris)
- Fix null pointer deref on selftest error path (Colin)
- Select DPLL's via mask (Matt R)
- Introduce and use intel_atomic_crtc_state_for_each_plane_state (Maarten)
- Use intel_plane_state in prepare and cleanup plane_fb (Maarten)
- Remove begin/finish_crtc_commit (Maarten)
- Move SAGV block time to dev_priv (James)
- Avoid polluting the i915_oa_config with error pointers (Chris)
- Squelch display kerneldoc warnings (Chris)
- Assert tasklet is locked for process_csb() (Chris)
- Switch to using DP_MSA_MISC_* defines (Ville)
- Stop using drm_atomic_helper_check_planes() (Ville)
- Make .modeset_calc_cdclk() mandatory (Ville)
- Use drm_rect_translate_to()/drm_rect_init() (Ville)
- Refactor timestamping constants update (Ville)
- Switch intel_legacy_cursor_update() to intel_ types (Ville)
- Prepare the connector/encoder mask readout for hw vs. uapi state split (Ville)
- Prepare the mode readout for hw vs. uapi state split (Ville)
- Move swizzle_bit under i915_ggtt (Chris)
- Improve microcontrollers documentation (Daniele)
- Move the cursor rotation handling into intel_cursor_check_surface() (Ville)
- Cleanups to pipe code (Ville)
- Shrink eDRAM ways/sets arrays for code size (Ville)
- Cleanups to HDCP2 timeout code (Ville)
- Restore full symmetry in i915_driver_modeset_probe/remove (Janusz)
- Simplify setting of ddi_io_power_domain (Lucas)
- Add pipe id/name to pipe mismatch logs (Lucas)
- Prettify MST debug message (Lucas)
- Extract GT ring management to separate files (Andi)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191021180337.GA24338@jlahtine-desk.ger.corp.intel.com
Diffstat (limited to 'drivers/gpu/drm/i915/gt')
41 files changed, 1705 insertions, 762 deletions
diff --git a/drivers/gpu/drm/i915/gt/intel_context.c b/drivers/gpu/drm/i915/gt/intel_context.c index be34d97ac18f..59c3083c1ec1 100644 --- a/drivers/gpu/drm/i915/gt/intel_context.c +++ b/drivers/gpu/drm/i915/gt/intel_context.c @@ -62,7 +62,7 @@ int __intel_context_do_pin(struct intel_context *ce) } err = 0; - with_intel_runtime_pm(&ce->engine->i915->runtime_pm, wakeref) + with_intel_runtime_pm(ce->engine->uncore->rpm, wakeref) err = ce->ops->pin(ce); if (err) goto err; diff --git a/drivers/gpu/drm/i915/gt/intel_engine.h b/drivers/gpu/drm/i915/gt/intel_engine.h index c9e8c8ccbd47..93ea367fe624 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine.h +++ b/drivers/gpu/drm/i915/gt/intel_engine.h @@ -136,6 +136,20 @@ execlists_active(const struct intel_engine_execlists *execlists) return READ_ONCE(*execlists->active); } +static inline void +execlists_active_lock_bh(struct intel_engine_execlists *execlists) +{ + local_bh_disable(); /* prevent local softirq and lock recursion */ + tasklet_lock(&execlists->tasklet); +} + +static inline void +execlists_active_unlock_bh(struct intel_engine_execlists *execlists) +{ + tasklet_unlock(&execlists->tasklet); + local_bh_enable(); /* restore softirq, and kick ksoftirqd! */ +} + struct i915_request * execlists_unwind_incomplete_requests(struct intel_engine_execlists *execlists); @@ -407,8 +421,9 @@ static inline void __intel_engine_reset(struct intel_engine_cs *engine, engine->serial++; /* contexts lost */ } -bool intel_engine_is_idle(struct intel_engine_cs *engine); bool intel_engines_are_idle(struct intel_gt *gt); +bool intel_engine_is_idle(struct intel_engine_cs *engine); +void intel_engine_flush_submission(struct intel_engine_cs *engine); void intel_engines_reset_default_submission(struct intel_gt *gt); diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c index 80fd072ac719..051734c9b733 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c @@ -277,6 +277,9 @@ static int intel_engine_setup(struct intel_gt *gt, enum intel_engine_id id) BUILD_BUG_ON(MAX_ENGINE_CLASS >= BIT(GEN11_ENGINE_CLASS_WIDTH)); BUILD_BUG_ON(MAX_ENGINE_INSTANCE >= BIT(GEN11_ENGINE_INSTANCE_WIDTH)); + if (GEM_DEBUG_WARN_ON(id >= ARRAY_SIZE(gt->engine))) + return -EINVAL; + if (GEM_DEBUG_WARN_ON(info->class > MAX_ENGINE_CLASS)) return -EINVAL; @@ -293,6 +296,7 @@ static int intel_engine_setup(struct intel_gt *gt, enum intel_engine_id id) BUILD_BUG_ON(BITS_PER_TYPE(engine->mask) < I915_NUM_ENGINES); engine->id = id; + engine->legacy_idx = INVALID_ENGINE; engine->mask = BIT(id); engine->i915 = gt->i915; engine->gt = gt; @@ -328,6 +332,7 @@ static int intel_engine_setup(struct intel_gt *gt, enum intel_engine_id id) intel_engine_sanitize_mmio(engine); gt->engine_class[info->class][info->instance] = engine; + gt->engine[id] = engine; intel_engine_add_user(engine); gt->i915->engine[id] = engine; @@ -736,6 +741,7 @@ intel_engine_init_active(struct intel_engine_cs *engine, unsigned int subclass) static struct intel_context * create_kernel_context(struct intel_engine_cs *engine) { + static struct lock_class_key kernel; struct intel_context *ce; int err; @@ -751,6 +757,14 @@ create_kernel_context(struct intel_engine_cs *engine) return ERR_PTR(err); } + /* + * Give our perma-pinned kernel timelines a separate lockdep class, + * so that we can use them from within the normal user timelines + * should we need to inject GPU operations during their request + * construction. + */ + lockdep_set_class(&ce->timeline->mutex, &kernel); + return ce; } @@ -1040,6 +1054,25 @@ static bool ring_is_idle(struct intel_engine_cs *engine) return idle; } +void intel_engine_flush_submission(struct intel_engine_cs *engine) +{ + struct tasklet_struct *t = &engine->execlists.tasklet; + + if (__tasklet_is_scheduled(t)) { + local_bh_disable(); + if (tasklet_trylock(t)) { + /* Must wait for any GPU reset in progress. */ + if (__tasklet_is_enabled(t)) + t->func(t->data); + tasklet_unlock(t); + } + local_bh_enable(); + } + + /* Otherwise flush the tasklet if it was running on another cpu */ + tasklet_unlock_wait(t); +} + /** * intel_engine_is_idle() - Report if the engine has finished process all work * @engine: the intel_engine_cs @@ -1058,21 +1091,9 @@ bool intel_engine_is_idle(struct intel_engine_cs *engine) /* Waiting to drain ELSP? */ if (execlists_active(&engine->execlists)) { - struct tasklet_struct *t = &engine->execlists.tasklet; - synchronize_hardirq(engine->i915->drm.pdev->irq); - local_bh_disable(); - if (tasklet_trylock(t)) { - /* Must wait for any GPU reset in progress. */ - if (__tasklet_is_enabled(t)) - t->func(t->data); - tasklet_unlock(t); - } - local_bh_enable(); - - /* Otherwise flush the tasklet if it was on another cpu */ - tasklet_unlock_wait(t); + intel_engine_flush_submission(engine); if (execlists_active(&engine->execlists)) return false; @@ -1102,7 +1123,7 @@ bool intel_engines_are_idle(struct intel_gt *gt) if (!READ_ONCE(gt->awake)) return true; - for_each_engine(engine, gt->i915, id) { + for_each_engine(engine, gt, id) { if (!intel_engine_is_idle(engine)) return false; } @@ -1115,7 +1136,7 @@ void intel_engines_reset_default_submission(struct intel_gt *gt) struct intel_engine_cs *engine; enum intel_engine_id id; - for_each_engine(engine, gt->i915, id) + for_each_engine(engine, gt, id) engine->set_default_submission(engine); } @@ -1225,13 +1246,22 @@ static struct intel_timeline *get_timeline(struct i915_request *rq) return tl; } +static const char *repr_timer(const struct timer_list *t) +{ + if (!READ_ONCE(t->expires)) + return "inactive"; + + if (timer_pending(t)) + return "active"; + + return "expired"; +} + static void intel_engine_print_registers(struct intel_engine_cs *engine, struct drm_printer *m) { struct drm_i915_private *dev_priv = engine->i915; - const struct intel_engine_execlists * const execlists = - &engine->execlists; - unsigned long flags; + struct intel_engine_execlists * const execlists = &engine->execlists; u64 addr; if (engine->id == RENDER_CLASS && IS_GEN_RANGE(dev_priv, 4, 7)) @@ -1288,19 +1318,20 @@ static void intel_engine_print_registers(struct intel_engine_cs *engine, unsigned int idx; u8 read, write; - drm_printf(m, "\tExeclist status: 0x%08x %08x, entries %u\n", - ENGINE_READ(engine, RING_EXECLIST_STATUS_LO), - ENGINE_READ(engine, RING_EXECLIST_STATUS_HI), - num_entries); + drm_printf(m, "\tExeclist tasklet queued? %s (%s), timeslice? %s\n", + yesno(test_bit(TASKLET_STATE_SCHED, + &engine->execlists.tasklet.state)), + enableddisabled(!atomic_read(&engine->execlists.tasklet.count)), + repr_timer(&engine->execlists.timer)); read = execlists->csb_head; write = READ_ONCE(*execlists->csb_write); - drm_printf(m, "\tExeclist CSB read %d, write %d, tasklet queued? %s (%s)\n", - read, write, - yesno(test_bit(TASKLET_STATE_SCHED, - &engine->execlists.tasklet.state)), - enableddisabled(!atomic_read(&engine->execlists.tasklet.count))); + drm_printf(m, "\tExeclist status: 0x%08x %08x; CSB read:%d, write:%d, entries:%d\n", + ENGINE_READ(engine, RING_EXECLIST_STATUS_LO), + ENGINE_READ(engine, RING_EXECLIST_STATUS_HI), + read, write, num_entries); + if (read >= num_entries) read = 0; if (write >= num_entries) @@ -1313,7 +1344,7 @@ static void intel_engine_print_registers(struct intel_engine_cs *engine, idx, hws[idx * 2], hws[idx * 2 + 1]); } - spin_lock_irqsave(&engine->active.lock, flags); + execlists_active_lock_bh(execlists); for (port = execlists->active; (rq = *port); port++) { char hdr[80]; int len; @@ -1351,7 +1382,7 @@ static void intel_engine_print_registers(struct intel_engine_cs *engine, if (tl) intel_timeline_put(tl); } - spin_unlock_irqrestore(&engine->active.lock, flags); + execlists_active_unlock_bh(execlists); } else if (INTEL_GEN(dev_priv) > 6) { drm_printf(m, "\tPP_DIR_BASE: 0x%08x\n", ENGINE_READ(engine, RING_PP_DIR_BASE)); @@ -1458,10 +1489,10 @@ void intel_engine_dump(struct intel_engine_cs *engine, spin_unlock_irqrestore(&engine->active.lock, flags); drm_printf(m, "\tMMIO base: 0x%08x\n", engine->mmio_base); - wakeref = intel_runtime_pm_get_if_in_use(&engine->i915->runtime_pm); + wakeref = intel_runtime_pm_get_if_in_use(engine->uncore->rpm); if (wakeref) { intel_engine_print_registers(engine, m); - intel_runtime_pm_put(&engine->i915->runtime_pm, wakeref); + intel_runtime_pm_put(engine->uncore->rpm, wakeref); } else { drm_printf(m, "\tDevice is asleep; skipping register dump\n"); } @@ -1493,8 +1524,8 @@ int intel_enable_engine_stats(struct intel_engine_cs *engine) if (!intel_engine_supports_stats(engine)) return -ENODEV; - spin_lock_irqsave(&engine->active.lock, flags); - write_seqlock(&engine->stats.lock); + execlists_active_lock_bh(execlists); + write_seqlock_irqsave(&engine->stats.lock, flags); if (unlikely(engine->stats.enabled == ~0)) { err = -EBUSY; @@ -1522,8 +1553,8 @@ int intel_enable_engine_stats(struct intel_engine_cs *engine) } unlock: - write_sequnlock(&engine->stats.lock); - spin_unlock_irqrestore(&engine->active.lock, flags); + write_sequnlock_irqrestore(&engine->stats.lock, flags); + execlists_active_unlock_bh(execlists); return err; } diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c b/drivers/gpu/drm/i915/gt/intel_engine_pm.c index 8e5e513eddc9..67eb6183648a 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c +++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c @@ -185,7 +185,7 @@ static const struct intel_wakeref_ops wf_ops = { void intel_engine_init__pm(struct intel_engine_cs *engine) { - struct intel_runtime_pm *rpm = &engine->i915->runtime_pm; + struct intel_runtime_pm *rpm = engine->uncore->rpm; intel_wakeref_init(&engine->wakeref, rpm, &wf_ops); } diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h index 6199064f332b..3451be034caf 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_types.h +++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h @@ -148,6 +148,7 @@ enum intel_engine_id { VECS1, #define _VECS(n) (VECS0 + (n)) I915_NUM_ENGINES +#define INVALID_ENGINE ((enum intel_engine_id)-1) }; struct st_preempt_hang { diff --git a/drivers/gpu/drm/i915/gt/intel_engine_user.c b/drivers/gpu/drm/i915/gt/intel_engine_user.c index 77cd5de83930..7f7150a733f4 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_user.c +++ b/drivers/gpu/drm/i915/gt/intel_engine_user.c @@ -160,10 +160,10 @@ static int legacy_ring_idx(const struct legacy_ring *ring) }; if (GEM_DEBUG_WARN_ON(ring->class >= ARRAY_SIZE(map))) - return -1; + return INVALID_ENGINE; if (GEM_DEBUG_WARN_ON(ring->instance >= map[ring->class].max)) - return -1; + return INVALID_ENGINE; return map[ring->class].base + ring->instance; } @@ -171,23 +171,15 @@ static int legacy_ring_idx(const struct legacy_ring *ring) static void add_legacy_ring(struct legacy_ring *ring, struct intel_engine_cs *engine) { - int idx; - if (engine->gt != ring->gt || engine->class != ring->class) { ring->gt = engine->gt; ring->class = engine->class; ring->instance = 0; } - idx = legacy_ring_idx(ring); - if (unlikely(idx == -1)) - return; - - GEM_BUG_ON(idx >= ARRAY_SIZE(ring->gt->engine)); - ring->gt->engine[idx] = engine; - ring->instance++; - - engine->legacy_idx = idx; + engine->legacy_idx = legacy_ring_idx(ring); + if (engine->legacy_idx != INVALID_ENGINE) + ring->instance++; } void intel_engines_driver_register(struct drm_i915_private *i915) diff --git a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h index b0227ab2fe1b..4294f146f13c 100644 --- a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h +++ b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h @@ -138,6 +138,7 @@ /* Gen11+. addr = base + (ctx_restore ? offset & GENMASK(12,2) : offset) */ #define MI_LRI_CS_MMIO (1<<19) #define MI_LRI_FORCE_POSTED (1<<12) +#define MI_LOAD_REGISTER_IMM_MAX_REGS (126) #define MI_STORE_REGISTER_MEM MI_INSTR(0x24, 1) #define MI_STORE_REGISTER_MEM_GEN8 MI_INSTR(0x24, 2) #define MI_SRM_LRM_GLOBAL_GTT (1<<22) @@ -162,7 +163,8 @@ #define MI_BATCH_BUFFER_START MI_INSTR(0x31, 0) #define MI_BATCH_GTT (2<<6) /* aliased with (1<<7) on gen4 */ #define MI_BATCH_BUFFER_START_GEN8 MI_INSTR(0x31, 1) -#define MI_BATCH_RESOURCE_STREAMER (1<<10) +#define MI_BATCH_RESOURCE_STREAMER REG_BIT(10) +#define MI_BATCH_PREDICATE REG_BIT(15) /* HSW+ on RCS only*/ /* * 3D instructions used by the kernel @@ -223,6 +225,7 @@ #define PIPE_CONTROL_CS_STALL (1<<20) #define PIPE_CONTROL_TLB_INVALIDATE (1<<18) #define PIPE_CONTROL_MEDIA_STATE_CLEAR (1<<16) +#define PIPE_CONTROL_WRITE_TIMESTAMP (3<<14) #define PIPE_CONTROL_QW_WRITE (1<<14) #define PIPE_CONTROL_POST_SYNC_OP_MASK (3<<14) #define PIPE_CONTROL_DEPTH_STALL (1<<13) @@ -230,7 +233,9 @@ #define PIPE_CONTROL_RENDER_TARGET_CACHE_FLUSH (1<<12) /* gen6+ */ #define PIPE_CONTROL_INSTRUCTION_CACHE_INVALIDATE (1<<11) /* MBZ on ILK */ #define PIPE_CONTROL_TEXTURE_CACHE_INVALIDATE (1<<10) /* GM45+ only */ +#define PIPE_CONTROL_L3_RO_CACHE_INVALIDATE REG_BIT(10) /* gen12 */ #define PIPE_CONTROL_INDIRECT_STATE_DISABLE (1<<9) +#define PIPE_CONTROL_HDC_PIPELINE_FLUSH REG_BIT(9) /* gen12 */ #define PIPE_CONTROL_NOTIFY (1<<8) #define PIPE_CONTROL_FLUSH_ENABLE (1<<7) /* gen7+ */ #define PIPE_CONTROL_DC_FLUSH_ENABLE (1<<5) diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c index 8f44cf8c79b2..1c4b6c9642ad 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt.c +++ b/drivers/gpu/drm/i915/gt/intel_gt.c @@ -186,7 +186,7 @@ intel_gt_clear_error_registers(struct intel_gt *gt, struct intel_engine_cs *engine; enum intel_engine_id id; - for_each_engine_masked(engine, i915, engine_mask, id) + for_each_engine_masked(engine, gt, engine_mask, id) gen8_clear_engine_error_register(engine); } } @@ -197,7 +197,7 @@ static void gen6_check_faults(struct intel_gt *gt) enum intel_engine_id id; u32 fault; - for_each_engine(engine, gt->i915, id) { + for_each_engine(engine, gt, id) { fault = GEN6_RING_FAULT_REG_READ(engine); if (fault & RING_FAULT_VALID) { DRM_DEBUG_DRIVER("Unexpected fault\n" @@ -273,7 +273,7 @@ void intel_gt_check_and_clear_faults(struct intel_gt *gt) void intel_gt_flush_ggtt_writes(struct intel_gt *gt) { - struct drm_i915_private *i915 = gt->i915; + struct intel_uncore *uncore = gt->uncore; intel_wakeref_t wakeref; /* @@ -297,13 +297,12 @@ void intel_gt_flush_ggtt_writes(struct intel_gt *gt) wmb(); - if (INTEL_INFO(i915)->has_coherent_ggtt) + if (INTEL_INFO(gt->i915)->has_coherent_ggtt) return; intel_gt_chipset_flush(gt); - with_intel_runtime_pm(&i915->runtime_pm, wakeref) { - struct intel_uncore *uncore = gt->uncore; + with_intel_runtime_pm(uncore->rpm, wakeref) { unsigned long flags; spin_lock_irqsave(&uncore->lock, flags); diff --git a/drivers/gpu/drm/i915/gt/intel_gt_pm.c b/drivers/gpu/drm/i915/gt/intel_gt_pm.c index b52e2ba3d092..b866d5b1eee0 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt_pm.c +++ b/drivers/gpu/drm/i915/gt/intel_gt_pm.c @@ -94,7 +94,7 @@ static const struct intel_wakeref_ops wf_ops = { void intel_gt_pm_init_early(struct intel_gt *gt) { - intel_wakeref_init(>->wakeref, >->i915->runtime_pm, &wf_ops); + intel_wakeref_init(>->wakeref, gt->uncore->rpm, &wf_ops); BLOCKING_INIT_NOTIFIER_HEAD(>->pm_notifications); } @@ -136,11 +136,18 @@ void intel_gt_sanitize(struct intel_gt *gt, bool force) intel_uc_sanitize(>->uc); - if (!reset_engines(gt) && !force) - return; + for_each_engine(engine, gt, id) + if (engine->reset.prepare) + engine->reset.prepare(engine); - for_each_engine(engine, gt->i915, id) - __intel_engine_reset(engine, false); + if (reset_engines(gt) || force) { + for_each_engine(engine, gt, id) + __intel_engine_reset(engine, false); + } + + for_each_engine(engine, gt, id) + if (engine->reset.finish) + engine->reset.finish(engine); } void intel_gt_pm_disable(struct intel_gt *gt) @@ -170,7 +177,7 @@ int intel_gt_resume(struct intel_gt *gt) intel_uncore_forcewake_get(gt->uncore, FORCEWAKE_ALL); intel_rc6_sanitize(>->rc6); - for_each_engine(engine, gt->i915, id) { + for_each_engine(engine, gt, id) { struct intel_context *ce; intel_engine_pm_get(engine); @@ -222,7 +229,7 @@ void intel_gt_suspend(struct intel_gt *gt) /* We expect to be idle already; but also want to be independent */ wait_for_idle(gt); - with_intel_runtime_pm(>->i915->runtime_pm, wakeref) + with_intel_runtime_pm(gt->uncore->rpm, wakeref) intel_rc6_disable(>->rc6); } diff --git a/drivers/gpu/drm/i915/gt/intel_gt_requests.c b/drivers/gpu/drm/i915/gt/intel_gt_requests.c index 8aed89fd2cdc..b73229a84d85 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt_requests.c +++ b/drivers/gpu/drm/i915/gt/intel_gt_requests.c @@ -4,6 +4,7 @@ * Copyright © 2019 Intel Corporation */ +#include "i915_drv.h" /* for_each_engine() */ #include "i915_request.h" #include "intel_gt.h" #include "intel_gt_pm.h" @@ -19,6 +20,15 @@ static void retire_requests(struct intel_timeline *tl) break; } +static void flush_submission(struct intel_gt *gt) +{ + struct intel_engine_cs *engine; + enum intel_engine_id id; + + for_each_engine(engine, gt, id) + intel_engine_flush_submission(engine); +} + long intel_gt_retire_requests_timeout(struct intel_gt *gt, long timeout) { struct intel_gt_timelines *timelines = >->timelines; @@ -32,10 +42,14 @@ long intel_gt_retire_requests_timeout(struct intel_gt *gt, long timeout) if (unlikely(timeout < 0)) timeout = -timeout, interruptible = false; + flush_submission(gt); /* kick the ksoftirqd tasklets */ + spin_lock_irqsave(&timelines->lock, flags); list_for_each_entry_safe(tl, tn, &timelines->active_list, link) { - if (!mutex_trylock(&tl->mutex)) + if (!mutex_trylock(&tl->mutex)) { + active_count++; /* report busy to caller, try again? */ continue; + } intel_timeline_get(tl); GEM_BUG_ON(!tl->active_count); @@ -48,7 +62,7 @@ long intel_gt_retire_requests_timeout(struct intel_gt *gt, long timeout) fence = i915_active_fence_get(&tl->last_request); if (fence) { timeout = dma_fence_wait_timeout(fence, - true, + interruptible, timeout); dma_fence_put(fence); } diff --git a/drivers/gpu/drm/i915/gt/intel_gt_types.h b/drivers/gpu/drm/i915/gt/intel_gt_types.h index 802f516a3430..ae4aaf75ac78 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt_types.h +++ b/drivers/gpu/drm/i915/gt/intel_gt_types.h @@ -17,6 +17,7 @@ #include "i915_vma.h" #include "intel_engine_types.h" +#include "intel_llc_types.h" #include "intel_reset_types.h" #include "intel_rc6_types.h" #include "intel_wakeref.h" @@ -79,6 +80,7 @@ struct intel_gt { */ intel_wakeref_t awake; + struct intel_llc llc; struct intel_rc6 rc6; struct blocking_notifier_head pm_notifications; @@ -109,6 +111,11 @@ enum intel_gt_scratch_field { /* 8 bytes */ INTEL_GT_SCRATCH_FIELD_COHERENTL3_WA = 256, + /* 6 * 8 bytes */ + INTEL_GT_SCRATCH_FIELD_PERF_CS_GPR = 2048, + + /* 4 bytes */ + INTEL_GT_SCRATCH_FIELD_PERF_PREDICATE_RESULT_1 = 2096, }; #endif /* __INTEL_GT_TYPES_H__ */ diff --git a/drivers/gpu/drm/i915/gt/intel_hangcheck.c b/drivers/gpu/drm/i915/gt/intel_hangcheck.c index 9814b18b32ad..0fdef00af9e4 100644 --- a/drivers/gpu/drm/i915/gt/intel_hangcheck.c +++ b/drivers/gpu/drm/i915/gt/intel_hangcheck.c @@ -237,7 +237,7 @@ static void hangcheck_declare_hang(struct intel_gt *gt, hung &= ~stuck; len = scnprintf(msg, sizeof(msg), "%s on ", stuck == hung ? "no progress" : "hang"); - for_each_engine_masked(engine, gt->i915, hung, tmp) + for_each_engine_masked(engine, gt, hung, tmp) len += scnprintf(msg + len, sizeof(msg) - len, "%s, ", engine->name); msg[len-2] = '\0'; @@ -271,7 +271,7 @@ static void hangcheck_elapsed(struct work_struct *work) if (intel_gt_is_wedged(gt)) return; - wakeref = intel_runtime_pm_get_if_in_use(>->i915->runtime_pm); + wakeref = intel_runtime_pm_get_if_in_use(gt->uncore->rpm); if (!wakeref) return; @@ -281,7 +281,7 @@ static void hangcheck_elapsed(struct work_struct *work) */ intel_uncore_arm_unclaimed_mmio_detection(gt->uncore); - for_each_engine(engine, gt->i915, id) { + for_each_engine(engine, gt, id) { struct hangcheck hc; intel_engine_breadcrumbs_irq(engine); @@ -303,7 +303,7 @@ static void hangcheck_elapsed(struct work_struct *work) if (GEM_SHOW_DEBUG() && (hung | stuck)) { struct drm_printer p = drm_debug_printer("hangcheck"); - for_each_engine(engine, gt->i915, id) { + for_each_engine(engine, gt, id) { if (intel_engine_is_idle(engine)) continue; @@ -322,7 +322,7 @@ static void hangcheck_elapsed(struct work_struct *work) if (hung) hangcheck_declare_hang(gt, hung, stuck); - intel_runtime_pm_put(>->i915->runtime_pm, wakeref); + intel_runtime_pm_put(gt->uncore->rpm, wakeref); /* Reset timer in case GPU hangs without another request being added */ intel_gt_queue_hangcheck(gt); diff --git a/drivers/gpu/drm/i915/gt/intel_llc.c b/drivers/gpu/drm/i915/gt/intel_llc.c new file mode 100644 index 000000000000..35093eb5f24e --- /dev/null +++ b/drivers/gpu/drm/i915/gt/intel_llc.c @@ -0,0 +1,161 @@ +/* + * SPDX-License-Identifier: MIT + * + * Copyright © 2019 Intel Corporation + */ + +#include <linux/cpufreq.h> + +#include "i915_drv.h" +#include "intel_gt.h" +#include "intel_llc.h" +#include "intel_sideband.h" + +struct ia_constants { + unsigned int min_gpu_freq; + unsigned int max_gpu_freq; + + unsigned int min_ring_freq; + unsigned int max_ia_freq; +}; + +static struct intel_gt *llc_to_gt(struct intel_llc *llc) +{ + return container_of(llc, struct intel_gt, llc); +} + +static unsigned int cpu_max_MHz(void) +{ + struct cpufreq_policy *policy; + unsigned int max_khz; + + policy = cpufreq_cpu_get(0); + if (policy) { + max_khz = policy->cpuinfo.max_freq; + cpufreq_cpu_put(policy); + } else { + /* + * Default to measured freq if none found, PCU will ensure we + * don't go over + */ + max_khz = tsc_khz; + } + + return max_khz / 1000; +} + +static bool get_ia_constants(struct intel_llc *llc, + struct ia_constants *consts) +{ + struct drm_i915_private *i915 = llc_to_gt(llc)->i915; + struct intel_rps *rps = &i915->gt_pm.rps; + + if (rps->max_freq <= rps->min_freq) + return false; + + consts->max_ia_freq = cpu_max_MHz(); + + consts->min_ring_freq = + intel_uncore_read(llc_to_gt(llc)->uncore, DCLK) & 0xf; + /* convert DDR frequency from units of 266.6MHz to bandwidth */ + consts->min_ring_freq = mult_frac(consts->min_ring_freq, 8, 3); + + consts->min_gpu_freq = rps->min_freq; + consts->max_gpu_freq = rps->max_freq; + if (INTEL_GEN(i915) >= 9) { + /* Convert GT frequency to 50 HZ units */ + consts->min_gpu_freq /= GEN9_FREQ_SCALER; + consts->max_gpu_freq /= GEN9_FREQ_SCALER; + } + + return true; +} + +static void calc_ia_freq(struct intel_llc *llc, + unsigned int gpu_freq, + const struct ia_constants *consts, + unsigned int *out_ia_freq, + unsigned int *out_ring_freq) +{ + struct drm_i915_private *i915 = llc_to_gt(llc)->i915; + const int diff = consts->max_gpu_freq - gpu_freq; + unsigned int ia_freq = 0, ring_freq = 0; + + if (INTEL_GEN(i915) >= 9) { + /* + * ring_freq = 2 * GT. ring_freq is in 100MHz units + * No floor required for ring frequency on SKL. + */ + ring_freq = gpu_freq; + } else if (INTEL_GEN(i915) >= 8) { + /* max(2 * GT, DDR). NB: GT is 50MHz units */ + ring_freq = max(consts->min_ring_freq, gpu_freq); + } else if (IS_HASWELL(i915)) { + ring_freq = mult_frac(gpu_freq, 5, 4); + ring_freq = max(consts->min_ring_freq, ring_freq); + /* leave ia_freq as the default, chosen by cpufreq */ + } else { + const int min_freq = 15; + const int scale = 180; + + /* + * On older processors, there is no separate ring + * clock domain, so in order to boost the bandwidth + * of the ring, we need to upclock the CPU (ia_freq). + * + * For GPU frequencies less than 750MHz, + * just use the lowest ring freq. + */ + if (gpu_freq < min_freq) + ia_freq = 800; + else + ia_freq = consts->max_ia_freq - diff * scale / 2; + ia_freq = DIV_ROUND_CLOSEST(ia_freq, 100); + } + + *out_ia_freq = ia_freq; + *out_ring_freq = ring_freq; +} + +static void gen6_update_ring_freq(struct intel_llc *llc) +{ + struct drm_i915_private *i915 = llc_to_gt(llc)->i915; + struct ia_constants consts; + unsigned int gpu_freq; + + if (!get_ia_constants(llc, &consts)) + return; + + /* + * For each potential GPU frequency, load a ring frequency we'd like + * to use for memory access. We do this by specifying the IA frequency + * the PCU should use as a reference to determine the ring frequency. + */ + for (gpu_freq = consts.max_gpu_freq; + gpu_freq >= consts.min_gpu_freq; + gpu_freq--) { + unsigned int ia_freq, ring_freq; + + calc_ia_freq(llc, gpu_freq, &consts, &ia_freq, &ring_freq); + sandybridge_pcode_write(i915, + GEN6_PCODE_WRITE_MIN_FREQ_TABLE, + ia_freq << GEN6_PCODE_FREQ_IA_RATIO_SHIFT | + ring_freq << GEN6_PCODE_FREQ_RING_RATIO_SHIFT | + gpu_freq); + } +} + +void intel_llc_enable(struct intel_llc *llc) +{ + if (HAS_LLC(llc_to_gt(llc)->i915)) + gen6_update_ring_freq(llc); +} + +void intel_llc_disable(struct intel_llc *llc) +{ + /* Currently there is no HW configuration to be done to disable. */ +} + +#if IS_ENABLED(CONFIG_DRM_I915_SELFTEST) +#include "selftest_llc.c" +#endif diff --git a/drivers/gpu/drm/i915/gt/intel_llc.h b/drivers/gpu/drm/i915/gt/intel_llc.h new file mode 100644 index 000000000000..ef09a890d2b7 --- /dev/null +++ b/drivers/gpu/drm/i915/gt/intel_llc.h @@ -0,0 +1,15 @@ +/* + * SPDX-License-Identifier: MIT + * + * Copyright © 2019 Intel Corporation + */ + +#ifndef INTEL_LLC_H +#define INTEL_LLC_H + +struct intel_llc; + +void intel_llc_enable(struct intel_llc *llc); +void intel_llc_disable(struct intel_llc *llc); + +#endif /* INTEL_LLC_H */ diff --git a/drivers/gpu/drm/i915/gt/intel_llc_types.h b/drivers/gpu/drm/i915/gt/intel_llc_types.h new file mode 100644 index 000000000000..ecad4687b930 --- /dev/null +++ b/drivers/gpu/drm/i915/gt/intel_llc_types.h @@ -0,0 +1,13 @@ +/* + * SPDX-License-Identifier: MIT + * + * Copyright © 2019 Intel Corporation + */ + +#ifndef INTEL_LLC_TYPES_H +#define INTEL_LLC_TYPES_H + +struct intel_llc { +}; + +#endif /* INTEL_LLC_TYPES_H */ diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c index 468438fb47af..d0088d020220 100644 --- a/drivers/gpu/drm/i915/gt/intel_lrc.c +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c @@ -247,8 +247,12 @@ static void __context_pin_release(struct intel_context *ce) static void mark_eio(struct i915_request *rq) { - if (!i915_request_signaled(rq)) - dma_fence_set_error(&rq->fence, -EIO); + if (i915_request_completed(rq)) + return; + + GEM_BUG_ON(i915_request_signaled(rq)); + + dma_fence_set_error(&rq->fence, -EIO); i915_request_mark_complete(rq); } @@ -348,10 +352,15 @@ static inline bool need_preempt(const struct intel_engine_cs *engine, * However, the priority hint is a mere hint that we may need to * preempt. If that hint is stale or we may be trying to preempt * ourselves, ignore the request. + * + * More naturally we would write + * prio >= max(0, last); + * except that we wish to prevent triggering preemption at the same + * priority level: the task that is running should remain running + * to preserve FIFO ordering of dependencies. */ - last_prio = effective_prio(rq); - if (!i915_scheduler_need_preempt(engine->execlists.queue_priority_hint, - last_prio)) + last_prio = max(effective_prio(rq), I915_PRIORITY_NORMAL - 1); + if (engine->execlists.queue_priority_hint <= last_prio) return false; /* @@ -669,64 +678,6 @@ static const u8 gen12_xcs_offsets[] = { REG16(0x274), REG16(0x270), - NOP(13), - LRI(2, POSTED), - REG16(0x200), - REG16(0x204), - - NOP(11), - LRI(50, POSTED), - REG16(0x588), - REG16(0x588), - REG16(0x588), - REG16(0x588), - REG16(0x588), - REG16(0x588), - REG(0x028), - REG(0x09c), - REG(0x0c0), - REG(0x178), - REG(0x17c), - REG16(0x358), - REG(0x170), - REG(0x150), - REG(0x154), - REG(0x158), - REG16(0x41c), - REG16(0x600), - REG16(0x604), - REG16(0x608), - REG16(0x60c), - REG16(0x610), - REG16(0x614), - REG16(0x618), - REG16(0x61c), - REG16(0x620), - REG16(0x624), - REG16(0x628), - REG16(0x62c), - REG16(0x630), - REG16(0x634), - REG16(0x638), - REG16(0x63c), - REG16(0x640), - REG16(0x644), - REG16(0x648), - REG16(0x64c), - REG16(0x650), - REG16(0x654), - REG16(0x658), - REG16(0x65c), - REG16(0x660), - REG16(0x664), - REG16(0x668), - REG16(0x66c), - REG16(0x670), - REG16(0x674), - REG16(0x678), - REG16(0x67c), - REG(0x068), - END(), }; @@ -857,6 +808,15 @@ static const u8 gen12_rcs_offsets[] = { static const u8 *reg_offsets(const struct intel_engine_cs *engine) { + /* + * The gen12+ lists only have the registers we program in the basic + * default state. We rely on the context image using relative + * addressing to automatic fixup the register state between the + * physical engines for virtual engine. + */ + GEM_BUG_ON(INTEL_GEN(engine->i915) >= 12 && + !intel_engine_has_relative_mmio(engine)); + if (engine->class == RENDER_CLASS) { if (INTEL_GEN(engine->i915) >= 12) return gen12_rcs_offsets; @@ -892,7 +852,6 @@ __unwind_incomplete_requests(struct intel_engine_cs *engine) list_for_each_entry_safe_reverse(rq, rn, &engine->active.requests, sched.link) { - struct intel_engine_cs *owner; if (i915_request_completed(rq)) continue; /* XXX */ @@ -907,8 +866,7 @@ __unwind_incomplete_requests(struct intel_engine_cs *engine) * engine so that it can be moved across onto another physical * engine as load dictates. */ - owner = rq->hw_context->engine; - if (likely(owner == engine)) { + if (likely(rq->execution_mask == engine->mask)) { GEM_BUG_ON(rq_prio(rq) == I915_PRIORITY_INVALID); if (rq_prio(rq) != prio) { prio = rq_prio(rq); @@ -919,6 +877,8 @@ __unwind_incomplete_requests(struct intel_engine_cs *engine) list_move(&rq->sched.link, pl); active = rq; } else { + struct intel_engine_cs *owner = rq->hw_context->engine; + /* * Decouple the virtual breadcrumb before moving it * back to the virtual engine -- we don't want the @@ -928,7 +888,8 @@ __unwind_incomplete_requests(struct intel_engine_cs *engine) */ if (test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT, &rq->fence.flags)) { - spin_lock(&rq->lock); + spin_lock_nested(&rq->lock, + SINGLE_DEPTH_NESTING); i915_request_cancel_breadcrumb(rq); spin_unlock(&rq->lock); } @@ -1092,6 +1053,10 @@ static u64 execlists_update_context(const struct i915_request *rq) desc = ce->lrc_desc; ce->lrc_desc &= ~CTX_DESC_FORCE_RESTORE; + /* Wa_1607138340:tgl */ + if (IS_TGL_REVID(rq->i915, TGL_REVID_A0, TGL_REVID_A0)) + desc |= CTX_DESC_FORCE_RESTORE; + return desc; } @@ -1137,25 +1102,45 @@ assert_pending_valid(const struct intel_engine_execlists *execlists, trace_ports(execlists, msg, execlists->pending); - if (!execlists->pending[0]) + if (!execlists->pending[0]) { + GEM_TRACE_ERR("Nothing pending for promotion!\n"); return false; + } - if (execlists->pending[execlists_num_ports(execlists)]) + if (execlists->pending[execlists_num_ports(execlists)]) { + GEM_TRACE_ERR("Excess pending[%d] for promotion!\n", + execlists_num_ports(execlists)); return false; + } for (port = execlists->pending; (rq = *port); port++) { - if (ce == rq->hw_context) + if (ce == rq->hw_context) { + GEM_TRACE_ERR("Duplicate context in pending[%zd]\n", + port - execlists->pending); return false; + } ce = rq->hw_context; if (i915_request_completed(rq)) continue; - if (i915_active_is_idle(&ce->active)) + if (i915_active_is_idle(&ce->active)) { + GEM_TRACE_ERR("Inactive context in pending[%zd]\n", + port - execlists->pending); return false; + } + + if (!i915_vma_is_pinned(ce->state)) { + GEM_TRACE_ERR("Unpinned context in pending[%zd]\n", + port - execlists->pending); + return false; + } - if (!i915_vma_is_pinned(ce->state)) + if (!i915_vma_is_pinned(ce->ring->vma)) { + GEM_TRACE_ERR("Unpinned ringbuffer in pending[%zd]\n", + port - execlists->pending); return false; + } } return ce; @@ -1232,6 +1217,10 @@ static bool can_merge_rq(const struct i915_request *prev, if (i915_request_completed(next)) return true; + if (unlikely((prev->flags ^ next->flags) & + (I915_REQUEST_NOPREEMPT | I915_REQUEST_SENTINEL))) + return false; + if (!can_merge_ctx(prev->hw_context, next->hw_context)) return false; @@ -1288,7 +1277,7 @@ static void virtual_xfer_breadcrumbs(struct virtual_engine *ve, static struct i915_request * last_active(const struct intel_engine_execlists *execlists) { - struct i915_request * const *last = execlists->active; + struct i915_request * const *last = READ_ONCE(execlists->active); while (*last && i915_request_completed(*last)) last++; @@ -1489,7 +1478,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine) last->hw_context->lrc_desc |= CTX_DESC_FORCE_RESTORE; last = NULL; } else if (need_timeslice(engine, last) && - !timer_pending(&engine->execlists.timer)) { + timer_expired(&engine->execlists.timer)) { GEM_TRACE("%s: expired last=%llx:%lld, prio=%d, hint=%d\n", engine->name, last->fence.context, @@ -1525,8 +1514,17 @@ static void execlists_dequeue(struct intel_engine_cs *engine) * submission. */ if (!list_is_last(&last->sched.link, - &engine->active.requests)) + &engine->active.requests)) { + /* + * Even if ELSP[1] is occupied and not worthy + * of timeslices, our queue might be. + */ + if (!execlists->timer.expires && + need_timeslice(engine, last)) + mod_timer(&execlists->timer, + jiffies + 1); return; + } /* * WaIdleLiteRestore:bdw,skl @@ -1680,6 +1678,9 @@ static void execlists_dequeue(struct intel_engine_cs *engine) if (last->hw_context == rq->hw_context) goto done; + if (i915_request_has_sentinel(last)) + goto done; + /* * If GVT overrides us we only ever submit * port[0], leaving port[1] empty. Note that we @@ -1859,6 +1860,13 @@ static void process_csb(struct intel_engine_cs *engine) const u8 num_entries = execlists->csb_size; u8 head, tail; + /* + * As we modify our execlists state tracking we require exclusive + * access. Either we are inside the tasklet, or the tasklet is disabled + * and we assume that is only inside the reset paths and so serialised. + */ + GEM_BUG_ON(!tasklet_is_locked(&execlists->tasklet) && + !reset_in_progress(execlists)); GEM_BUG_ON(USES_GUC_SUBMISSION(engine->i915)); /* @@ -1920,6 +1928,9 @@ static void process_csb(struct intel_engine_cs *engine) else promote = gen8_csb_parse(execlists, buf + 2 * head); if (promote) { + if (!inject_preempt_hang(execlists)) + ring_set_paused(engine, 0); + /* cancel old inflight, prepare for switch */ trace_ports(execlists, "preempted", execlists->active); while (*execlists->active) @@ -1935,9 +1946,8 @@ static void process_csb(struct intel_engine_cs *engine) if (enable_timeslice(execlists)) mod_timer(&execlists->timer, jiffies + 1); - - if (!inject_preempt_hang(execlists)) - ring_set_paused(engine, 0); + else + cancel_timer(&execlists->timer); WRITE_ONCE(execlists->pending[0], NULL); } else { @@ -1980,8 +1990,11 @@ static void process_csb(struct intel_engine_cs *engine) static void __execlists_submission_tasklet(struct intel_engine_cs *const engine) { lockdep_assert_held(&engine->active.lock); - if (!engine->execlists.pending[0]) + if (!engine->execlists.pending[0]) { + rcu_read_lock(); /* protect peeking at execlists->active */ execlists_dequeue(engine); + rcu_read_unlock(); + } } /* @@ -2773,8 +2786,10 @@ static void __execlists_reset(struct intel_engine_cs *engine, bool stalled) if (!rq) goto unwind; + /* We still have requests in-flight; the engine should be active */ + GEM_BUG_ON(!intel_engine_pm_is_awake(engine)); + ce = rq->hw_context; - GEM_BUG_ON(i915_active_is_idle(&ce->active)); GEM_BUG_ON(!i915_vma_is_pinned(ce->state)); /* Proclaim we have exclusive access to the context image! */ @@ -2782,10 +2797,13 @@ static void __execlists_reset(struct intel_engine_cs *engine, bool stalled) rq = active_request(rq); if (!rq) { + /* Idle context; tidy up the ring so we can restart afresh */ ce->ring->head = ce->ring->tail; goto out_replay; } + /* Context has requests still in-flight; it should not be idle! */ + GEM_BUG_ON(i915_active_is_idle(&ce->active)); ce->ring->head = intel_ring_wrap(ce->ring, rq->head); /* @@ -3205,8 +3223,11 @@ static int gen12_emit_flush_render(struct i915_request *request, flags |= PIPE_CONTROL_TILE_CACHE_FLUSH; flags |= PIPE_CONTROL_RENDER_TARGET_CACHE_FLUSH; flags |= PIPE_CONTROL_DEPTH_CACHE_FLUSH; + /* Wa_1409600907:tgl */ + flags |= PIPE_CONTROL_DEPTH_STALL; flags |= PIPE_CONTROL_DC_FLUSH_ENABLE; flags |= PIPE_CONTROL_FLUSH_ENABLE; + flags |= PIPE_CONTROL_HDC_PIPELINE_FLUSH; flags |= PIPE_CONTROL_STORE_DATA_INDEX; flags |= PIPE_CONTROL_QW_WRITE; @@ -3232,6 +3253,7 @@ static int gen12_emit_flush_render(struct i915_request *request, flags |= PIPE_CONTROL_VF_CACHE_INVALIDATE; flags |= PIPE_CONTROL_CONST_CACHE_INVALIDATE; flags |= PIPE_CONTROL_STATE_CACHE_INVALIDATE; + flags |= PIPE_CONTROL_L3_RO_CACHE_INVALIDATE; flags |= PIPE_CONTROL_STORE_DATA_INDEX; flags |= PIPE_CONTROL_QW_WRITE; @@ -3253,6 +3275,26 @@ static int gen12_emit_flush_render(struct i915_request *request, *cs++ = preparser_disable(false); intel_ring_advance(request, cs); + + /* + * Wa_1604544889:tgl + */ + if (IS_TGL_REVID(request->i915, TGL_REVID_A0, TGL_REVID_A0)) { + flags = 0; + flags |= PIPE_CONTROL_CS_STALL; + flags |= PIPE_CONTROL_HDC_PIPELINE_FLUSH; + + flags |= PIPE_CONTROL_STORE_DATA_INDEX; + flags |= PIPE_CONTROL_QW_WRITE; + + cs = intel_ring_begin(request, 6); + if (IS_ERR(cs)) + return PTR_ERR(cs); + + cs = gen8_emit_pipe_control(cs, flags, + LRC_PPHWSP_SCRATCH_ADDR); + intel_ring_advance(request, cs); + } } return 0; @@ -3415,15 +3457,18 @@ gen12_emit_fini_breadcrumb_rcs(struct i915_request *request, u32 *cs) PIPE_CONTROL_TILE_CACHE_FLUSH | PIPE_CONTROL_RENDER_TARGET_CACHE_FLUSH | PIPE_CONTROL_DEPTH_CACHE_FLUSH | + /* Wa_1409600907:tgl */ + PIPE_CONTROL_DEPTH_STALL | PIPE_CONTROL_DC_FLUSH_ENABLE | - PIPE_CONTROL_FLUSH_ENABLE); + PIPE_CONTROL_FLUSH_ENABLE | + PIPE_CONTROL_HDC_PIPELINE_FLUSH); return gen12_emit_fini_breadcrumb_footer(request, cs); } static void execlists_park(struct intel_engine_cs *engine) { - del_timer(&engine->execlists.timer); + cancel_timer(&engine->execlists.timer); } void intel_execlists_set_default_submission(struct intel_engine_cs *engine) @@ -3447,7 +3492,7 @@ void intel_execlists_set_default_submission(struct intel_engine_cs *engine) engine->flags |= I915_ENGINE_HAS_PREEMPTION; } - if (engine->class != COPY_ENGINE_CLASS && INTEL_GEN(engine->i915) >= 12) + if (INTEL_GEN(engine->i915) >= 12) engine->flags |= I915_ENGINE_HAS_RELATIVE_MMIO; } @@ -4169,6 +4214,7 @@ intel_execlists_create_virtual(struct i915_gem_context *ctx, ve->base.i915 = ctx->i915; ve->base.gt = siblings[0]->gt; + ve->base.uncore = siblings[0]->uncore; ve->base.id = -1; ve->base.class = OTHER_CLASS; ve->base.uabi_class = I915_ENGINE_CLASS_INVALID; diff --git a/drivers/gpu/drm/i915/gt/intel_mocs.c b/drivers/gpu/drm/i915/gt/intel_mocs.c index 728704bbbe18..5bac3966906b 100644 --- a/drivers/gpu/drm/i915/gt/intel_mocs.c +++ b/drivers/gpu/drm/i915/gt/intel_mocs.c @@ -287,10 +287,9 @@ static const struct drm_i915_mocs_entry icelake_mocs_table[] = { GEN11_MOCS_ENTRIES }; -static bool get_mocs_settings(struct intel_gt *gt, +static bool get_mocs_settings(const struct drm_i915_private *i915, struct drm_i915_mocs_table *table) { - struct drm_i915_private *i915 = gt->i915; bool result = false; if (INTEL_GEN(i915) >= 12) { @@ -331,9 +330,9 @@ static bool get_mocs_settings(struct intel_gt *gt, return result; } -static i915_reg_t mocs_register(enum intel_engine_id engine_id, int index) +static i915_reg_t mocs_register(const struct intel_engine_cs *engine, int index) { - switch (engine_id) { + switch (engine->id) { case RCS0: return GEN9_GFX_MOCS(index); case VCS0: @@ -347,7 +346,7 @@ static i915_reg_t mocs_register(enum intel_engine_id engine_id, int index) case VCS2: return GEN11_MFX2_MOCS(index); default: - MISSING_CASE(engine_id); + MISSING_CASE(engine->id); return INVALID_MMIO_REG; } } @@ -365,118 +364,25 @@ static u32 get_entry_control(const struct drm_i915_mocs_table *table, return table->table[I915_MOCS_PTE].control_value; } -/** - * intel_mocs_init_engine() - emit the mocs control table - * @engine: The engine for whom to emit the registers. - * - * This function simply emits a MI_LOAD_REGISTER_IMM command for the - * given table starting at the given address. - */ -void intel_mocs_init_engine(struct intel_engine_cs *engine) +static void init_mocs_table(struct intel_engine_cs *engine, + const struct drm_i915_mocs_table *table) { - struct intel_gt *gt = engine->gt; - struct intel_uncore *uncore = gt->uncore; - struct drm_i915_mocs_table table; - unsigned int index; - u32 unused_value; - - /* Platforms with global MOCS do not need per-engine initialization. */ - if (HAS_GLOBAL_MOCS_REGISTERS(gt->i915)) - return; - - /* Called under a blanket forcewake */ - assert_forcewakes_active(uncore, FORCEWAKE_ALL); - - if (!get_mocs_settings(gt, &table)) - return; - - /* Set unused values to PTE */ - unused_value = table.table[I915_MOCS_PTE].control_value; - - for (index = 0; index < table.size; index++) { - u32 value = get_entry_control(&table, index); + struct intel_uncore *uncore = engine->uncore; + u32 unused_value = table->table[I915_MOCS_PTE].control_value; + unsigned int i; + for (i = 0; i < table->size; i++) intel_uncore_write_fw(uncore, - mocs_register(engine->id, index), - value); - } + mocs_register(engine, i), + get_entry_control(table, i)); - /* All remaining entries are also unused */ - for (; index < table.n_entries; index++) + /* All remaining entries are unused */ + for (; i < table->n_entries; i++) intel_uncore_write_fw(uncore, - mocs_register(engine->id, index), + mocs_register(engine, i), unused_value); } -static void intel_mocs_init_global(struct intel_gt *gt) -{ - struct intel_uncore *uncore = gt->uncore; - struct drm_i915_mocs_table table; - unsigned int index; - - GEM_BUG_ON(!HAS_GLOBAL_MOCS_REGISTERS(gt->i915)); - - if (!get_mocs_settings(gt, &table)) - return; - - if (GEM_DEBUG_WARN_ON(table.size > table.n_entries)) - return; - - for (index = 0; index < table.size; index++) - intel_uncore_write(uncore, - GEN12_GLOBAL_MOCS(index), - table.table[index].control_value); - - /* - * Ok, now set the unused entries to the invalid entry (index 0). These - * entries are officially undefined and no contract for the contents and - * settings is given for these entries. - */ - for (; index < table.n_entries; index++) - intel_uncore_write(uncore, - GEN12_GLOBAL_MOCS(index), - table.table[0].control_value); -} - -static int emit_mocs_control_table(struct i915_request *rq, - const struct drm_i915_mocs_table *table) -{ - enum intel_engine_id engine = rq->engine->id; - unsigned int index; - u32 unused_value; - u32 *cs; - - if (GEM_WARN_ON(table->size > table->n_entries)) - return -ENODEV; - - /* Set unused values to PTE */ - unused_value = table->table[I915_MOCS_PTE].control_value; - - cs = intel_ring_begin(rq, 2 + 2 * table->n_entries); - if (IS_ERR(cs)) - return PTR_ERR(cs); - - *cs++ = MI_LOAD_REGISTER_IMM(table->n_entries); - - for (index = 0; index < table->size; index++) { - u32 value = get_entry_control(table, index); - - *cs++ = i915_mmio_reg_offset(mocs_register(engine, index)); - *cs++ = value; - } - - /* All remaining entries are also unused */ - for (; index < table->n_entries; index++) { - *cs++ = i915_mmio_reg_offset(mocs_register(engine, index)); - *cs++ = unused_value; - } - - *cs++ = MI_NOOP; - intel_ring_advance(rq, cs); - - return 0; -} - /* * Get l3cc_value from MOCS entry taking into account when it's not used: * I915_MOCS_PTE's value is returned in this case. @@ -494,141 +400,93 @@ static inline u32 l3cc_combine(const struct drm_i915_mocs_table *table, u16 low, u16 high) { - return low | high << 16; + return low | (u32)high << 16; } -static int emit_mocs_l3cc_table(struct i915_request *rq, - const struct drm_i915_mocs_table *table) +static void init_l3cc_table(struct intel_engine_cs *engine, + const struct drm_i915_mocs_table *table) { - u16 unused_value; + struct intel_uncore *uncore = engine->uncore; + u16 unused_value = table->table[I915_MOCS_PTE].l3cc_value; unsigned int i; - u32 *cs; - - if (GEM_WARN_ON(table->size > table->n_entries)) - return -ENODEV; - - /* Set unused values to PTE */ - unused_value = table->table[I915_MOCS_PTE].l3cc_value; - - cs = intel_ring_begin(rq, 2 + table->n_entries); - if (IS_ERR(cs)) - return PTR_ERR(cs); - - *cs++ = MI_LOAD_REGISTER_IMM(table->n_entries / 2); for (i = 0; i < table->size / 2; i++) { u16 low = get_entry_l3cc(table, 2 * i); u16 high = get_entry_l3cc(table, 2 * i + 1); - *cs++ = i915_mmio_reg_offset(GEN9_LNCFCMOCS(i)); - *cs++ = l3cc_combine(table, low, high); + intel_uncore_write(uncore, + GEN9_LNCFCMOCS(i), + l3cc_combine(table, low, high)); } /* Odd table size - 1 left over */ - if (table->size & 0x01) { + if (table->size & 1) { u16 low = get_entry_l3cc(table, 2 * i); - *cs++ = i915_mmio_reg_offset(GEN9_LNCFCMOCS(i)); - *cs++ = l3cc_combine(table, low, unused_value); + intel_uncore_write(uncore, + GEN9_LNCFCMOCS(i), + l3cc_combine(table, low, unused_value)); i++; } /* All remaining entries are also unused */ - for (; i < table->n_entries / 2; i++) { - *cs++ = i915_mmio_reg_offset(GEN9_LNCFCMOCS(i)); - *cs++ = l3cc_combine(table, unused_value, unused_value); - } - - *cs++ = MI_NOOP; - intel_ring_advance(rq, cs); - - return 0; + for (; i < table->n_entries / 2; i++) + intel_uncore_write(uncore, + GEN9_LNCFCMOCS(i), + l3cc_combine(table, unused_value, + unused_value)); } -static void intel_mocs_init_l3cc_table(struct intel_gt *gt) +void intel_mocs_init_engine(struct intel_engine_cs *engine) { - struct intel_uncore *uncore = gt->uncore; struct drm_i915_mocs_table table; - unsigned int i; - u16 unused_value; - if (!get_mocs_settings(gt, &table)) + /* Called under a blanket forcewake */ + assert_forcewakes_active(engine->uncore, FORCEWAKE_ALL); + + if (!get_mocs_settings(engine->i915, &table)) return; - /* Set unused values to PTE */ - unused_value = table.table[I915_MOCS_PTE].l3cc_value; + /* Platforms with global MOCS do not need per-engine initialization. */ + if (!HAS_GLOBAL_MOCS_REGISTERS(engine->i915)) + init_mocs_table(engine, &table); - for (i = 0; i < table.size / 2; i++) { - u16 low = get_entry_l3cc(&table, 2 * i); - u16 high = get_entry_l3cc(&table, 2 * i + 1); + if (engine->class == RENDER_CLASS) + init_l3cc_table(engine, &table); +} - intel_uncore_write(uncore, - GEN9_LNCFCMOCS(i), - l3cc_combine(&table, low, high)); - } +static void intel_mocs_init_global(struct intel_gt *gt) +{ + struct intel_uncore *uncore = gt->uncore; + struct drm_i915_mocs_table table; + unsigned int index; - /* Odd table size - 1 left over */ - if (table.size & 0x01) { - u16 low = get_entry_l3cc(&table, 2 * i); + GEM_BUG_ON(!HAS_GLOBAL_MOCS_REGISTERS(gt->i915)); - intel_uncore_write(uncore, - GEN9_LNCFCMOCS(i), - l3cc_combine(&table, low, unused_value)); - i++; - } + if (!get_mocs_settings(gt->i915, &table)) + return; - /* All remaining entries are also unused */ - for (; i < table.n_entries / 2; i++) - intel_uncore_write(uncore, - GEN9_LNCFCMOCS(i), - l3cc_combine(&table, unused_value, - unused_value)); -} + if (GEM_DEBUG_WARN_ON(table.size > table.n_entries)) + return; -/** - * intel_mocs_emit() - program the MOCS register. - * @rq: Request to use to set up the MOCS tables. - * - * This function will emit a batch buffer with the values required for - * programming the MOCS register values for all the currently supported - * rings. - * - * These registers are partially stored in the RCS context, so they are - * emitted at the same time so that when a context is created these registers - * are set up. These registers have to be emitted into the start of the - * context as setting the ELSP will re-init some of these registers back - * to the hw values. - * - * Return: 0 on success, otherwise the error status. - */ -int intel_mocs_emit(struct i915_request *rq) -{ - struct drm_i915_mocs_table t; - int ret; - - if (HAS_GLOBAL_MOCS_REGISTERS(rq->i915) || - rq->engine->class != RENDER_CLASS) - return 0; - - if (get_mocs_settings(rq->engine->gt, &t)) { - /* Program the RCS control registers */ - ret = emit_mocs_control_table(rq, &t); - if (ret) - return ret; - - /* Now program the l3cc registers */ - ret = emit_mocs_l3cc_table(rq, &t); - if (ret) - return ret; - } + for (index = 0; index < table.size; index++) + intel_uncore_write(uncore, + GEN12_GLOBAL_MOCS(index), + table.table[index].control_value); - return 0; + /* + * Ok, now set the unused entries to the invalid entry (index 0). These + * entries are officially undefined and no contract for the contents and + * settings is given for these entries. + */ + for (; index < table.n_entries; index++) + intel_uncore_write(uncore, + GEN12_GLOBAL_MOCS(index), + table.table[0].control_value); } void intel_mocs_init(struct intel_gt *gt) { - intel_mocs_init_l3cc_table(gt); - if (HAS_GLOBAL_MOCS_REGISTERS(gt->i915)) intel_mocs_init_global(gt); } diff --git a/drivers/gpu/drm/i915/gt/intel_mocs.h b/drivers/gpu/drm/i915/gt/intel_mocs.h index 2ae816b7ca19..83371f3e6ba1 100644 --- a/drivers/gpu/drm/i915/gt/intel_mocs.h +++ b/drivers/gpu/drm/i915/gt/intel_mocs.h @@ -49,13 +49,10 @@ * context handling keep the MOCS in step. */ -struct i915_request; struct intel_engine_cs; struct intel_gt; void intel_mocs_init(struct intel_gt *gt); void intel_mocs_init_engine(struct intel_engine_cs *engine); -int intel_mocs_emit(struct i915_request *rq); - #endif diff --git a/drivers/gpu/drm/i915/gt/intel_rc6.c b/drivers/gpu/drm/i915/gt/intel_rc6.c index 71184aa72896..70f0e01a38b9 100644 --- a/drivers/gpu/drm/i915/gt/intel_rc6.c +++ b/drivers/gpu/drm/i915/gt/intel_rc6.c @@ -65,7 +65,7 @@ static void gen11_rc6_enable(struct intel_rc6 *rc6) set(uncore, GEN6_RC_EVALUATION_INTERVAL, 125000); /* 12500 * 1280ns */ set(uncore, GEN6_RC_IDLE_HYSTERSIS, 25); /* 25 * 1280ns */ - for_each_engine(engine, rc6_to_gt(rc6)->i915, id) + for_each_engine(engine, rc6_to_gt(rc6), id) set(uncore, RING_MAX_IDLE(engine->mmio_base), 10); set(uncore, GUC_MAX_IDLE_COUNT, 0xA); @@ -133,7 +133,7 @@ static void gen9_rc6_enable(struct intel_rc6 *rc6) set(uncore, GEN6_RC_EVALUATION_INTERVAL, 125000); /* 12500 * 1280ns */ set(uncore, GEN6_RC_IDLE_HYSTERSIS, 25); /* 25 * 1280ns */ - for_each_engine(engine, rc6_to_gt(rc6)->i915, id) + for_each_engine(engine, rc6_to_gt(rc6), id) set(uncore, RING_MAX_IDLE(engine->mmio_base), 10); set(uncore, GUC_MAX_IDLE_COUNT, 0xA); @@ -192,7 +192,7 @@ static void gen8_rc6_enable(struct intel_rc6 *rc6) set(uncore, GEN6_RC6_WAKE_RATE_LIMIT, 40 << 16); set(uncore, GEN6_RC_EVALUATION_INTERVAL, 125000); /* 12500 * 1280ns */ set(uncore, GEN6_RC_IDLE_HYSTERSIS, 25); /* 25 * 1280ns */ - for_each_engine(engine, rc6_to_gt(rc6)->i915, id) + for_each_engine(engine, rc6_to_gt(rc6), id) set(uncore, RING_MAX_IDLE(engine->mmio_base), 10); set(uncore, GEN6_RC_SLEEP, 0); set(uncore, GEN6_RC6_THRESHOLD, 625); /* 800us/1.28 for TO */ @@ -219,7 +219,7 @@ static void gen6_rc6_enable(struct intel_rc6 *rc6) set(uncore, GEN6_RC_EVALUATION_INTERVAL, 125000); set(uncore, GEN6_RC_IDLE_HYSTERSIS, 25); - for_each_engine(engine, i915, id) + for_each_engine(engine, rc6_to_gt(rc6), id) set(uncore, RING_MAX_IDLE(engine->mmio_base), 10); set(uncore, GEN6_RC_SLEEP, 0); @@ -344,7 +344,7 @@ static void chv_rc6_enable(struct intel_rc6 *rc6) set(uncore, GEN6_RC_EVALUATION_INTERVAL, 125000); /* 12500 * 1280ns */ set(uncore, GEN6_RC_IDLE_HYSTERSIS, 25); /* 25 * 1280ns */ - for_each_engine(engine, rc6_to_gt(rc6)->i915, id) + for_each_engine(engine, rc6_to_gt(rc6), id) set(uncore, RING_MAX_IDLE(engine->mmio_base), 10); set(uncore, GEN6_RC_SLEEP, 0); @@ -371,7 +371,7 @@ static void vlv_rc6_enable(struct intel_rc6 *rc6) set(uncore, GEN6_RC_EVALUATION_INTERVAL, 125000); set(uncore, GEN6_RC_IDLE_HYSTERSIS, 25); - for_each_engine(engine, rc6_to_gt(rc6)->i915, id) + for_each_engine(engine, rc6_to_gt(rc6), id) set(uncore, RING_MAX_IDLE(engine->mmio_base), 10); set(uncore, GEN6_RC6_THRESHOLD, 0x557); diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c index 7b3d9d4517a0..bf8d1ed4b1d8 100644 --- a/drivers/gpu/drm/i915/gt/intel_reset.c +++ b/drivers/gpu/drm/i915/gt/intel_reset.c @@ -282,14 +282,14 @@ static int gen6_reset_engines(struct intel_gt *gt, intel_engine_mask_t engine_mask, unsigned int retry) { - struct intel_engine_cs *engine; - const u32 hw_engine_mask[] = { + static const u32 hw_engine_mask[] = { [RCS0] = GEN6_GRDOM_RENDER, [BCS0] = GEN6_GRDOM_BLT, [VCS0] = GEN6_GRDOM_MEDIA, [VCS1] = GEN8_GRDOM_MEDIA2, [VECS0] = GEN6_GRDOM_VECS, }; + struct intel_engine_cs *engine; u32 hw_mask; if (engine_mask == ALL_ENGINES) { @@ -298,7 +298,7 @@ static int gen6_reset_engines(struct intel_gt *gt, intel_engine_mask_t tmp; hw_mask = 0; - for_each_engine_masked(engine, gt->i915, engine_mask, tmp) { + for_each_engine_masked(engine, gt, engine_mask, tmp) { GEM_BUG_ON(engine->id >= ARRAY_SIZE(hw_engine_mask)); hw_mask |= hw_engine_mask[engine->id]; } @@ -413,7 +413,7 @@ static int gen11_reset_engines(struct intel_gt *gt, intel_engine_mask_t engine_mask, unsigned int retry) { - const u32 hw_engine_mask[] = { + static const u32 hw_engine_mask[] = { [RCS0] = GEN11_GRDOM_RENDER, [BCS0] = GEN11_GRDOM_BLT, [VCS0] = GEN11_GRDOM_MEDIA, @@ -432,7 +432,7 @@ static int gen11_reset_engines(struct intel_gt *gt, hw_mask = GEN11_GRDOM_FULL; } else { hw_mask = 0; - for_each_engine_masked(engine, gt->i915, engine_mask, tmp) { + for_each_engine_masked(engine, gt, engine_mask, tmp) { GEM_BUG_ON(engine->id >= ARRAY_SIZE(hw_engine_mask)); hw_mask |= hw_engine_mask[engine->id]; ret = gen11_lock_sfc(engine, &hw_mask); @@ -451,7 +451,7 @@ sfc_unlock: * expiration). */ if (engine_mask != ALL_ENGINES) - for_each_engine_masked(engine, gt->i915, engine_mask, tmp) + for_each_engine_masked(engine, gt, engine_mask, tmp) gen11_unlock_sfc(engine); return ret; @@ -510,7 +510,7 @@ static int gen8_reset_engines(struct intel_gt *gt, intel_engine_mask_t tmp; int ret; - for_each_engine_masked(engine, gt->i915, engine_mask, tmp) { + for_each_engine_masked(engine, gt, engine_mask, tmp) { ret = gen8_engine_reset_prepare(engine); if (ret && !reset_non_ready) goto skip_reset; @@ -536,7 +536,7 @@ static int gen8_reset_engines(struct intel_gt *gt, ret = gen6_reset_engines(gt, engine_mask, retry); skip_reset: - for_each_engine_masked(engine, gt->i915, engine_mask, tmp) + for_each_engine_masked(engine, gt, engine_mask, tmp) gen8_engine_reset_cancel(engine); return ret; @@ -682,7 +682,7 @@ static intel_engine_mask_t reset_prepare(struct intel_gt *gt) intel_engine_mask_t awake = 0; enum intel_engine_id id; - for_each_engine(engine, gt->i915, id) { + for_each_engine(engine, gt, id) { if (intel_engine_pm_get_if_awake(engine)) awake |= engine->mask; reset_prepare_engine(engine); @@ -712,10 +712,10 @@ static int gt_reset(struct intel_gt *gt, intel_engine_mask_t stalled_mask) if (err) return err; - for_each_engine(engine, gt->i915, id) + for_each_engine(engine, gt, id) __intel_engine_reset(engine, stalled_mask & engine->mask); - i915_gem_restore_fences(gt->i915); + i915_gem_restore_fences(gt->ggtt); return err; } @@ -733,7 +733,7 @@ static void reset_finish(struct intel_gt *gt, intel_engine_mask_t awake) struct intel_engine_cs *engine; enum intel_engine_id id; - for_each_engine(engine, gt->i915, id) { + for_each_engine(engine, gt, id) { reset_finish_engine(engine); if (awake & engine->mask) intel_engine_pm_put(engine); @@ -769,7 +769,7 @@ static void __intel_gt_set_wedged(struct intel_gt *gt) if (GEM_SHOW_DEBUG() && !intel_engines_are_idle(gt)) { struct drm_printer p = drm_debug_printer(__func__); - for_each_engine(engine, gt->i915, id) + for_each_engine(engine, gt, id) intel_engine_dump(engine, &p, "%s\n", engine->name); } @@ -786,7 +786,7 @@ static void __intel_gt_set_wedged(struct intel_gt *gt) if (!INTEL_INFO(gt->i915)->gpu_reset_clobbers_display) __intel_gt_reset(gt, ALL_ENGINES); - for_each_engine(engine, gt->i915, id) + for_each_engine(engine, gt, id) engine->submit_request = nop_submit_request; /* @@ -798,7 +798,7 @@ static void __intel_gt_set_wedged(struct intel_gt *gt) set_bit(I915_WEDGED, >->reset.flags); /* Mark all executing requests as skipped */ - for_each_engine(engine, gt->i915, id) + for_each_engine(engine, gt, id) engine->cancel_requests(engine); reset_finish(gt, awake); @@ -811,7 +811,7 @@ void intel_gt_set_wedged(struct intel_gt *gt) intel_wakeref_t wakeref; mutex_lock(>->reset.mutex); - with_intel_runtime_pm(>->i915->runtime_pm, wakeref) + with_intel_runtime_pm(gt->uncore->rpm, wakeref) __intel_gt_set_wedged(gt); mutex_unlock(>->reset.mutex); } @@ -872,8 +872,14 @@ static bool __intel_gt_unset_wedged(struct intel_gt *gt) ok = !HAS_EXECLISTS(gt->i915); /* XXX better agnosticism desired */ if (!INTEL_INFO(gt->i915)->gpu_reset_clobbers_display) ok = __intel_gt_reset(gt, ALL_ENGINES) == 0; - if (!ok) + if (!ok) { + /* + * Warn CI about the unrecoverable wedged condition. + * Time for a reboot. + */ + add_taint_for_CI(TAINT_WARN); return false; + } /* * Undo nop_submit_request. We prevent all new i915 requests from @@ -928,7 +934,7 @@ static int resume(struct intel_gt *gt) enum intel_engine_id id; int ret; - for_each_engine(engine, gt->i915, id) { + for_each_engine(engine, gt, id) { ret = engine->resume(engine); if (ret) return ret; @@ -1186,7 +1192,7 @@ void intel_gt_handle_error(struct intel_gt *gt, * isn't the case at least when we get here by doing a * simulated reset via debugfs, so get an RPM reference. */ - wakeref = intel_runtime_pm_get(>->i915->runtime_pm); + wakeref = intel_runtime_pm_get(gt->uncore->rpm); engine_mask &= INTEL_INFO(gt->i915)->engine_mask; @@ -1200,7 +1206,7 @@ void intel_gt_handle_error(struct intel_gt *gt, * single reset fails. */ if (intel_has_reset_engine(gt) && !intel_gt_is_wedged(gt)) { - for_each_engine_masked(engine, gt->i915, engine_mask, tmp) { + for_each_engine_masked(engine, gt, engine_mask, tmp) { BUILD_BUG_ON(I915_RESET_MODESET >= I915_RESET_ENGINE); if (test_and_set_bit(I915_RESET_ENGINE + engine->id, >->reset.flags)) @@ -1228,7 +1234,7 @@ void intel_gt_handle_error(struct intel_gt *gt, synchronize_rcu_expedited(); /* Prevent any other reset-engine attempt. */ - for_each_engine(engine, gt->i915, tmp) { + for_each_engine(engine, gt, tmp) { while (test_and_set_bit(I915_RESET_ENGINE + engine->id, >->reset.flags)) wait_on_bit(>->reset.flags, @@ -1238,7 +1244,7 @@ void intel_gt_handle_error(struct intel_gt *gt, intel_gt_reset_global(gt, engine_mask, msg); - for_each_engine(engine, gt->i915, tmp) + for_each_engine(engine, gt, tmp) clear_bit_unlock(I915_RESET_ENGINE + engine->id, >->reset.flags); clear_bit_unlock(I915_RESET_BACKOFF, >->reset.flags); @@ -1246,7 +1252,7 @@ void intel_gt_handle_error(struct intel_gt *gt, wake_up_all(>->reset.queue); out: - intel_runtime_pm_put(>->i915->runtime_pm, wakeref); + intel_runtime_pm_put(gt->uncore->rpm, wakeref); } int intel_gt_reset_trylock(struct intel_gt *gt, int *srcu) diff --git a/drivers/gpu/drm/i915/gt/intel_ringbuffer.c b/drivers/gpu/drm/i915/gt/intel_ringbuffer.c index 311fdc0a21bc..bf631f15aa78 100644 --- a/drivers/gpu/drm/i915/gt/intel_ringbuffer.c +++ b/drivers/gpu/drm/i915/gt/intel_ringbuffer.c @@ -1609,7 +1609,7 @@ static inline int mi_set_context(struct i915_request *rq, u32 flags) struct intel_engine_cs *signaller; *cs++ = MI_LOAD_REGISTER_IMM(num_engines); - for_each_engine(signaller, i915, id) { + for_each_engine(signaller, engine->gt, id) { if (signaller == engine) continue; @@ -1663,7 +1663,7 @@ static inline int mi_set_context(struct i915_request *rq, u32 flags) i915_reg_t last_reg = {}; /* keep gcc quiet */ *cs++ = MI_LOAD_REGISTER_IMM(num_engines); - for_each_engine(signaller, i915, id) { + for_each_engine(signaller, engine->gt, id) { if (signaller == engine) continue; @@ -1676,7 +1676,7 @@ static inline int mi_set_context(struct i915_request *rq, u32 flags) /* Insert a delay before the next switch! */ *cs++ = MI_STORE_REGISTER_MEM | MI_SRM_LRM_GLOBAL_GTT; *cs++ = i915_mmio_reg_offset(last_reg); - *cs++ = intel_gt_scratch_offset(rq->engine->gt, + *cs++ = intel_gt_scratch_offset(engine->gt, INTEL_GT_SCRATCH_FIELD_DEFAULT); *cs++ = MI_NOOP; } diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c index ba65e5018978..af8a8183154a 100644 --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c @@ -567,7 +567,7 @@ static void icl_ctx_workarounds_init(struct intel_engine_cs *engine, static void tgl_ctx_workarounds_init(struct intel_engine_cs *engine, struct i915_wa_list *wal) { - /* Wa_1409142259 */ + /* Wa_1409142259:tgl */ WA_SET_BIT_MASKED(GEN11_COMMON_SLICE_CHICKEN3, GEN12_DISABLE_CPS_AWARE_COLOR_PIPE); } @@ -892,11 +892,27 @@ icl_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal) wa_write_or(wal, GAMT_CHKN_BIT_REG, GAMT_CHKN_DISABLE_L3_COH_PIPE); + + /* Wa_1607087056:icl */ + wa_write_or(wal, + SLICE_UNIT_LEVEL_CLKGATE, + L3_CLKGATE_DIS | L3_CR2X_CLKGATE_DIS); } static void tgl_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal) { + /* Wa_1409420604:tgl */ + if (IS_TGL_REVID(i915, TGL_REVID_A0, TGL_REVID_A0)) + wa_write_or(wal, + SUBSLICE_UNIT_LEVEL_CLKGATE2, + CPSSUNIT_CLKGATE_DIS); + + /* Wa_1409180338:tgl */ + if (IS_TGL_REVID(i915, TGL_REVID_A0, TGL_REVID_A0)) + wa_write_or(wal, + SLICE_UNIT_LEVEL_CLKGATE, + L3_CLKGATE_DIS | L3_CR2X_CLKGATE_DIS); } static void @@ -1260,6 +1276,26 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal) { struct drm_i915_private *i915 = engine->i915; + if (IS_TGL_REVID(i915, TGL_REVID_A0, TGL_REVID_A0)) { + /* Wa_1606700617:tgl */ + wa_masked_en(wal, + GEN9_CS_DEBUG_MODE1, + FF_DOP_CLOCK_GATE_DISABLE); + + /* Wa_1607138336:tgl */ + wa_write_or(wal, + GEN9_CTX_PREEMPT_REG, + GEN12_DISABLE_POSH_BUSY_FF_DOP_CG); + + /* Wa_1607030317:tgl */ + /* Wa_1607186500:tgl */ + /* Wa_1607297627:tgl */ + wa_masked_en(wal, + GEN6_RC_SLEEP_PSMI_CONTROL, + GEN12_WAIT_FOR_EVENT_POWER_DOWN_DISABLE | + GEN8_RC_SEMA_IDLE_MSG_DISABLE); + } + if (IS_GEN(i915, 11)) { /* This is not an Wa. Enable for better image quality */ wa_masked_en(wal, diff --git a/drivers/gpu/drm/i915/gt/mock_engine.c b/drivers/gpu/drm/i915/gt/mock_engine.c index 5d43cbc3f345..123db2c3f956 100644 --- a/drivers/gpu/drm/i915/gt/mock_engine.c +++ b/drivers/gpu/drm/i915/gt/mock_engine.c @@ -240,6 +240,7 @@ struct intel_engine_cs *mock_engine(struct drm_i915_private *i915, struct mock_engine *engine; GEM_BUG_ON(id >= I915_NUM_ENGINES); + GEM_BUG_ON(!i915->gt.uncore); engine = kzalloc(sizeof(*engine) + PAGE_SIZE, GFP_KERNEL); if (!engine) @@ -248,9 +249,11 @@ struct intel_engine_cs *mock_engine(struct drm_i915_private *i915, /* minimal engine setup for requests */ engine->base.i915 = i915; engine->base.gt = &i915->gt; + engine->base.uncore = i915->gt.uncore; snprintf(engine->base.name, sizeof(engine->base.name), "%s", name); engine->base.id = id; engine->base.mask = BIT(id); + engine->base.legacy_idx = INVALID_ENGINE; engine->base.instance = id; engine->base.status_page.addr = (void *)(engine + 1); @@ -265,6 +268,9 @@ struct intel_engine_cs *mock_engine(struct drm_i915_private *i915, engine->base.reset.finish = mock_reset_finish; engine->base.cancel_requests = mock_cancel_requests; + i915->gt.engine[id] = &engine->base; + i915->gt.engine_class[0][id] = &engine->base; + /* fake hw queue */ spin_lock_init(&engine->hw_lock); timer_setup(&engine->hw_delay, hw_delay_complete, 0); diff --git a/drivers/gpu/drm/i915/gt/selftest_context.c b/drivers/gpu/drm/i915/gt/selftest_context.c index 7c838a57e174..f63a26a3e620 100644 --- a/drivers/gpu/drm/i915/gt/selftest_context.c +++ b/drivers/gpu/drm/i915/gt/selftest_context.c @@ -159,7 +159,7 @@ static int live_context_size(void *arg) if (IS_ERR(fixme)) return PTR_ERR(fixme); - for_each_engine(engine, gt->i915, id) { + for_each_engine(engine, gt, id) { struct { struct drm_i915_gem_object *state; void *pinned; @@ -305,7 +305,7 @@ static int live_active_context(void *arg) goto out_file; } - for_each_engine(engine, gt->i915, id) { + for_each_engine(engine, gt, id) { err = __live_active_context(engine, fixme); if (err) break; @@ -415,7 +415,7 @@ static int live_remote_context(void *arg) goto out_file; } - for_each_engine(engine, gt->i915, id) { + for_each_engine(engine, gt, id) { err = __live_remote_context(engine, fixme); if (err) break; diff --git a/drivers/gpu/drm/i915/gt/selftest_engine_pm.c b/drivers/gpu/drm/i915/gt/selftest_engine_pm.c index 3a1419376912..20b9c83f43ad 100644 --- a/drivers/gpu/drm/i915/gt/selftest_engine_pm.c +++ b/drivers/gpu/drm/i915/gt/selftest_engine_pm.c @@ -25,7 +25,7 @@ static int live_engine_pm(void *arg) } GEM_BUG_ON(intel_gt_pm_is_awake(gt)); - for_each_engine(engine, gt->i915, id) { + for_each_engine(engine, gt, id) { const typeof(*igt_atomic_phases) *p; for (p = igt_atomic_phases; p->name; p++) { diff --git a/drivers/gpu/drm/i915/gt/selftest_gt_pm.c b/drivers/gpu/drm/i915/gt/selftest_gt_pm.c index 87985bd46423..5d429037cdad 100644 --- a/drivers/gpu/drm/i915/gt/selftest_gt_pm.c +++ b/drivers/gpu/drm/i915/gt/selftest_gt_pm.c @@ -5,6 +5,8 @@ * Copyright © 2019 Intel Corporation */ +#include "selftest_llc.h" + static int live_gt_resume(void *arg) { struct intel_gt *gt = arg; @@ -32,6 +34,13 @@ static int live_gt_resume(void *arg) err = -EINVAL; break; } + + err = st_llc_verify(>->llc); + if (err) { + pr_err("llc state not restored upon resume!\n"); + intel_gt_set_wedged_on_init(gt); + break; + } } while (!__igt_timeout(end_time, NULL)); return err; diff --git a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c index e8a40df79bd0..8e0016464325 100644 --- a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c +++ b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c @@ -323,7 +323,7 @@ static int igt_hang_sanitycheck(void *arg) if (err) return err; - for_each_engine(engine, gt->i915, id) { + for_each_engine(engine, gt, id) { struct intel_wedge_me w; long timeout; @@ -400,7 +400,7 @@ static int igt_reset_nop(void *arg) reset_count = i915_reset_count(global); count = 0; do { - for_each_engine(engine, gt->i915, id) { + for_each_engine(engine, gt, id) { int i; for (i = 0; i < 16; i++) { @@ -471,7 +471,7 @@ static int igt_reset_nop_engine(void *arg) } i915_gem_context_clear_bannable(ctx); - for_each_engine(engine, gt->i915, id) { + for_each_engine(engine, gt, id) { unsigned int reset_count, reset_engine_count; unsigned int count; IGT_TIMEOUT(end_time); @@ -560,7 +560,7 @@ static int __igt_reset_engine(struct intel_gt *gt, bool active) return err; } - for_each_engine(engine, gt->i915, id) { + for_each_engine(engine, gt, id) { unsigned int reset_count, reset_engine_count; IGT_TIMEOUT(end_time); @@ -782,7 +782,7 @@ static int __igt_reset_engines(struct intel_gt *gt, h.ctx->sched.priority = 1024; } - for_each_engine(engine, gt->i915, id) { + for_each_engine(engine, gt, id) { struct active_engine threads[I915_NUM_ENGINES] = {}; unsigned long device = i915_reset_count(global); unsigned long count = 0, reported; @@ -800,7 +800,7 @@ static int __igt_reset_engines(struct intel_gt *gt, } memset(threads, 0, sizeof(threads)); - for_each_engine(other, gt->i915, tmp) { + for_each_engine(other, gt, tmp) { struct task_struct *tsk; threads[tmp].resets = @@ -914,7 +914,7 @@ static int __igt_reset_engines(struct intel_gt *gt, } unwind: - for_each_engine(other, gt->i915, tmp) { + for_each_engine(other, gt, tmp) { int ret; if (!threads[tmp].task) @@ -1335,7 +1335,7 @@ static int wait_for_others(struct intel_gt *gt, struct intel_engine_cs *engine; enum intel_engine_id id; - for_each_engine(engine, gt->i915, id) { + for_each_engine(engine, gt, id) { if (engine == exclude) continue; @@ -1363,7 +1363,7 @@ static int igt_reset_queue(void *arg) if (err) goto unlock; - for_each_engine(engine, gt->i915, id) { + for_each_engine(engine, gt, id) { struct i915_request *prev; IGT_TIMEOUT(end_time); unsigned int count; @@ -1651,7 +1651,7 @@ static int igt_reset_engines_atomic(void *arg) struct intel_engine_cs *engine; enum intel_engine_id id; - for_each_engine(engine, gt->i915, id) { + for_each_engine(engine, gt, id) { err = igt_atomic_reset_engine(engine, p); if (err) goto out; @@ -1695,14 +1695,14 @@ int intel_hangcheck_live_selftests(struct drm_i915_private *i915) if (intel_gt_is_wedged(gt)) return -EIO; /* we're long past hope of a successful reset */ - wakeref = intel_runtime_pm_get(>->i915->runtime_pm); + wakeref = intel_runtime_pm_get(gt->uncore->rpm); saved_hangcheck = fetch_and_zero(&i915_modparams.enable_hangcheck); drain_delayed_work(>->hangcheck.work); /* flush param */ err = intel_gt_live_subtests(tests, gt); i915_modparams.enable_hangcheck = saved_hangcheck; - intel_runtime_pm_put(>->i915->runtime_pm, wakeref); + intel_runtime_pm_put(gt->uncore->rpm, wakeref); return err; } diff --git a/drivers/gpu/drm/i915/gt/selftest_llc.c b/drivers/gpu/drm/i915/gt/selftest_llc.c new file mode 100644 index 000000000000..a7057785e420 --- /dev/null +++ b/drivers/gpu/drm/i915/gt/selftest_llc.c @@ -0,0 +1,77 @@ +/* + * SPDX-License-Identifier: MIT + * + * Copyright © 2019 Intel Corporation + */ + +#include "intel_pm.h" /* intel_gpu_freq() */ +#include "selftest_llc.h" + +static int gen6_verify_ring_freq(struct intel_llc *llc) +{ + struct drm_i915_private *i915 = llc_to_gt(llc)->i915; + struct ia_constants consts; + intel_wakeref_t wakeref; + unsigned int gpu_freq; + int err = 0; + + wakeref = intel_runtime_pm_get(llc_to_gt(llc)->uncore->rpm); + + if (!get_ia_constants(llc, &consts)) { + err = -ENODEV; + goto out_rpm; + } + + for (gpu_freq = consts.min_gpu_freq; + gpu_freq <= consts.max_gpu_freq; + gpu_freq++) { + unsigned int ia_freq, ring_freq, found; + u32 val; + + calc_ia_freq(llc, gpu_freq, &consts, &ia_freq, &ring_freq); + + val = gpu_freq; + if (sandybridge_pcode_read(i915, + GEN6_PCODE_READ_MIN_FREQ_TABLE, + &val, NULL)) { + pr_err("Failed to read freq table[%d], range [%d, %d]\n", + gpu_freq, consts.min_gpu_freq, consts.max_gpu_freq); + err = -ENXIO; + break; + } + + found = (val >> 0) & 0xff; + if (found != ia_freq) { + pr_err("Min freq table(%d/[%d, %d]):%dMHz did not match expected CPU freq, found %d, expected %d\n", + gpu_freq, consts.min_gpu_freq, consts.max_gpu_freq, + intel_gpu_freq(i915, gpu_freq * (INTEL_GEN(i915) >= 9 ? GEN9_FREQ_SCALER : 1)), + found, ia_freq); + err = -EINVAL; + break; + } + + found = (val >> 8) & 0xff; + if (found != ring_freq) { + pr_err("Min freq table(%d/[%d, %d]):%dMHz did not match expected ring freq, found %d, expected %d\n", + gpu_freq, consts.min_gpu_freq, consts.max_gpu_freq, + intel_gpu_freq(i915, gpu_freq * (INTEL_GEN(i915) >= 9 ? GEN9_FREQ_SCALER : 1)), + found, ring_freq); + err = -EINVAL; + break; + } + } + +out_rpm: + intel_runtime_pm_put(llc_to_gt(llc)->uncore->rpm, wakeref); + return err; +} + +int st_llc_verify(struct intel_llc *llc) +{ + int err = 0; + + if (HAS_LLC(llc_to_gt(llc)->i915)) + err = gen6_verify_ring_freq(llc); + + return err; +} diff --git a/drivers/gpu/drm/i915/gt/selftest_llc.h b/drivers/gpu/drm/i915/gt/selftest_llc.h new file mode 100644 index 000000000000..873f896e72f2 --- /dev/null +++ b/drivers/gpu/drm/i915/gt/selftest_llc.h @@ -0,0 +1,14 @@ +/* + * SPDX-License-Identifier: MIT + * + * Copyright © 2019 Intel Corporation + */ + +#ifndef SELFTEST_LLC_H +#define SELFTEST_LLC_H + +struct intel_llc; + +int st_llc_verify(struct intel_llc *llc); + +#endif /* SELFTEST_LLC_H */ diff --git a/drivers/gpu/drm/i915/gt/selftest_lrc.c b/drivers/gpu/drm/i915/gt/selftest_lrc.c index 393ae5321e1d..5dc679781a08 100644 --- a/drivers/gpu/drm/i915/gt/selftest_lrc.c +++ b/drivers/gpu/drm/i915/gt/selftest_lrc.c @@ -19,22 +19,52 @@ #include "gem/selftests/igt_gem_utils.h" #include "gem/selftests/mock_context.h" +#define CS_GPR(engine, n) ((engine)->mmio_base + 0x600 + (n) * 4) +#define NUM_GPR_DW (16 * 2) /* each GPR is 2 dwords */ + +static struct i915_vma *create_scratch(struct intel_gt *gt) +{ + struct drm_i915_gem_object *obj; + struct i915_vma *vma; + int err; + + obj = i915_gem_object_create_internal(gt->i915, PAGE_SIZE); + if (IS_ERR(obj)) + return ERR_CAST(obj); + + i915_gem_object_set_cache_coherency(obj, I915_CACHING_CACHED); + + vma = i915_vma_instance(obj, >->ggtt->vm, NULL); + if (IS_ERR(vma)) { + i915_gem_object_put(obj); + return vma; + } + + err = i915_vma_pin(vma, 0, 0, PIN_GLOBAL); + if (err) { + i915_gem_object_put(obj); + return ERR_PTR(err); + } + + return vma; +} + static int live_sanitycheck(void *arg) { - struct drm_i915_private *i915 = arg; + struct intel_gt *gt = arg; struct i915_gem_engines_iter it; struct i915_gem_context *ctx; struct intel_context *ce; struct igt_spinner spin; int err = -ENOMEM; - if (!HAS_LOGICAL_RING_CONTEXTS(i915)) + if (!HAS_LOGICAL_RING_CONTEXTS(gt->i915)) return 0; - if (igt_spinner_init(&spin, &i915->gt)) + if (igt_spinner_init(&spin, gt)) return -ENOMEM; - ctx = kernel_context(i915); + ctx = kernel_context(gt->i915); if (!ctx) goto err_spin; @@ -51,13 +81,13 @@ static int live_sanitycheck(void *arg) if (!igt_wait_for_spinner(&spin, rq)) { GEM_TRACE("spinner failed to start\n"); GEM_TRACE_DUMP(); - intel_gt_set_wedged(&i915->gt); + intel_gt_set_wedged(gt); err = -EIO; goto err_ctx; } igt_spinner_end(&spin); - if (igt_flush_test(i915)) { + if (igt_flush_test(gt->i915)) { err = -EIO; goto err_ctx; } @@ -72,12 +102,11 @@ err_spin: return err; } -static int live_unlite_restore(struct drm_i915_private *i915, int prio) +static int live_unlite_restore(struct intel_gt *gt, int prio) { struct intel_engine_cs *engine; struct i915_gem_context *ctx; enum intel_engine_id id; - intel_wakeref_t wakeref; struct igt_spinner spin; int err = -ENOMEM; @@ -86,18 +115,15 @@ static int live_unlite_restore(struct drm_i915_private *i915, int prio) * on the same engine from the same parent context. */ - mutex_lock(&i915->drm.struct_mutex); - wakeref = intel_runtime_pm_get(&i915->runtime_pm); - - if (igt_spinner_init(&spin, &i915->gt)) - goto err_unlock; + if (igt_spinner_init(&spin, gt)) + return err; - ctx = kernel_context(i915); + ctx = kernel_context(gt->i915); if (!ctx) goto err_spin; err = 0; - for_each_engine(engine, i915, id) { + for_each_engine(engine, gt, id) { struct intel_context *ce[2] = {}; struct i915_request *rq[2]; struct igt_live_test t; @@ -109,7 +135,7 @@ static int live_unlite_restore(struct drm_i915_private *i915, int prio) if (!intel_engine_can_store_dword(engine)) continue; - if (igt_live_test_begin(&t, i915, __func__, engine->name)) { + if (igt_live_test_begin(&t, gt->i915, __func__, engine->name)) { err = -EIO; break; } @@ -143,11 +169,11 @@ static int live_unlite_restore(struct drm_i915_private *i915, int prio) GEM_BUG_ON(!ce[1]->ring->size); intel_ring_reset(ce[1]->ring, ce[1]->ring->size / 2); - local_bh_disable(); /* appease lockdep */ + local_irq_disable(); /* appease lockdep */ __context_pin_acquire(ce[1]); __execlists_update_reg_state(ce[1], engine); __context_pin_release(ce[1]); - local_bh_enable(); + local_irq_enable(); rq[0] = igt_spinner_create_request(&spin, ce[0], MI_ARB_CHECK); if (IS_ERR(rq[0])) { @@ -234,9 +260,6 @@ err_ce: kernel_context_close(ctx); err_spin: igt_spinner_fini(&spin); -err_unlock: - intel_runtime_pm_put(&i915->runtime_pm, wakeref); - mutex_unlock(&i915->drm.struct_mutex); return err; } @@ -302,7 +325,13 @@ semaphore_queue(struct intel_engine_cs *engine, struct i915_vma *vma, int idx) if (IS_ERR(rq)) goto out_ctx; - err = emit_semaphore_chain(rq, vma, idx); + err = 0; + if (rq->engine->emit_init_breadcrumb) + err = rq->engine->emit_init_breadcrumb(rq); + if (err == 0) + err = emit_semaphore_chain(rq, vma, idx); + if (err == 0) + i915_request_get(rq); i915_request_add(rq); if (err) rq = ERR_PTR(err); @@ -315,10 +344,10 @@ out_ctx: static int release_queue(struct intel_engine_cs *engine, struct i915_vma *vma, - int idx) + int idx, int prio) { struct i915_sched_attr attr = { - .priority = I915_USER_PRIORITY(I915_PRIORITY_MAX), + .priority = prio, }; struct i915_request *rq; u32 *cs; @@ -339,9 +368,15 @@ release_queue(struct intel_engine_cs *engine, *cs++ = 1; intel_ring_advance(rq, cs); + + i915_request_get(rq); i915_request_add(rq); + local_bh_disable(); engine->schedule(rq, &attr); + local_bh_enable(); /* kick tasklet */ + + i915_request_put(rq); return 0; } @@ -360,8 +395,7 @@ slice_semaphore_queue(struct intel_engine_cs *outer, if (IS_ERR(head)) return PTR_ERR(head); - i915_request_get(head); - for_each_engine(engine, outer->i915, id) { + for_each_engine(engine, outer->gt, id) { for (i = 0; i < count; i++) { struct i915_request *rq; @@ -370,10 +404,12 @@ slice_semaphore_queue(struct intel_engine_cs *outer, err = PTR_ERR(rq); goto out; } + + i915_request_put(rq); } } - err = release_queue(outer, vma, n); + err = release_queue(outer, vma, n, INT_MAX); if (err) goto out; @@ -393,7 +429,7 @@ out: static int live_timeslice_preempt(void *arg) { - struct drm_i915_private *i915 = arg; + struct intel_gt *gt = arg; struct drm_i915_gem_object *obj; struct i915_vma *vma; void *vaddr; @@ -409,11 +445,11 @@ static int live_timeslice_preempt(void *arg) * ready task. */ - obj = i915_gem_object_create_internal(i915, PAGE_SIZE); + obj = i915_gem_object_create_internal(gt->i915, PAGE_SIZE); if (IS_ERR(obj)) return PTR_ERR(obj); - vma = i915_vma_instance(obj, &i915->ggtt.vm, NULL); + vma = i915_vma_instance(obj, >->ggtt->vm, NULL); if (IS_ERR(vma)) { err = PTR_ERR(vma); goto err_obj; @@ -433,7 +469,7 @@ static int live_timeslice_preempt(void *arg) struct intel_engine_cs *engine; enum intel_engine_id id; - for_each_engine(engine, i915, id) { + for_each_engine(engine, gt, id) { if (!intel_engine_has_preemption(engine)) continue; @@ -443,7 +479,7 @@ static int live_timeslice_preempt(void *arg) if (err) goto err_pin; - if (igt_flush_test(i915)) { + if (igt_flush_test(gt->i915)) { err = -EIO; goto err_pin; } @@ -459,9 +495,153 @@ err_obj: return err; } +static struct i915_request *nop_request(struct intel_engine_cs *engine) +{ + struct i915_request *rq; + + rq = i915_request_create(engine->kernel_context); + if (IS_ERR(rq)) + return rq; + + i915_request_get(rq); + i915_request_add(rq); + + return rq; +} + +static void wait_for_submit(struct intel_engine_cs *engine, + struct i915_request *rq) +{ + do { + cond_resched(); + intel_engine_flush_submission(engine); + } while (!i915_request_is_active(rq)); +} + +static int live_timeslice_queue(void *arg) +{ + struct intel_gt *gt = arg; + struct drm_i915_gem_object *obj; + struct intel_engine_cs *engine; + enum intel_engine_id id; + struct i915_vma *vma; + void *vaddr; + int err = 0; + + /* + * Make sure that even if ELSP[0] and ELSP[1] are filled with + * timeslicing between them disabled, we *do* enable timeslicing + * if the queue demands it. (Normally, we do not submit if + * ELSP[1] is already occupied, so must rely on timeslicing to + * eject ELSP[0] in favour of the queue.) + */ + + obj = i915_gem_object_create_internal(gt->i915, PAGE_SIZE); + if (IS_ERR(obj)) + return PTR_ERR(obj); + + vma = i915_vma_instance(obj, >->ggtt->vm, NULL); + if (IS_ERR(vma)) { + err = PTR_ERR(vma); + goto err_obj; + } + + vaddr = i915_gem_object_pin_map(obj, I915_MAP_WC); + if (IS_ERR(vaddr)) { + err = PTR_ERR(vaddr); + goto err_obj; + } + + err = i915_vma_pin(vma, 0, 0, PIN_GLOBAL); + if (err) + goto err_map; + + for_each_engine(engine, gt, id) { + struct i915_sched_attr attr = { + .priority = I915_USER_PRIORITY(I915_PRIORITY_MAX), + }; + struct i915_request *rq, *nop; + + if (!intel_engine_has_preemption(engine)) + continue; + + memset(vaddr, 0, PAGE_SIZE); + + /* ELSP[0]: semaphore wait */ + rq = semaphore_queue(engine, vma, 0); + if (IS_ERR(rq)) { + err = PTR_ERR(rq); + goto err_pin; + } + engine->schedule(rq, &attr); + wait_for_submit(engine, rq); + + /* ELSP[1]: nop request */ + nop = nop_request(engine); + if (IS_ERR(nop)) { + err = PTR_ERR(nop); + i915_request_put(rq); + goto err_pin; + } + wait_for_submit(engine, nop); + i915_request_put(nop); + + GEM_BUG_ON(i915_request_completed(rq)); + GEM_BUG_ON(execlists_active(&engine->execlists) != rq); + + /* Queue: semaphore signal, matching priority as semaphore */ + err = release_queue(engine, vma, 1, effective_prio(rq)); + if (err) { + i915_request_put(rq); + goto err_pin; + } + + intel_engine_flush_submission(engine); + if (!READ_ONCE(engine->execlists.timer.expires) && + !i915_request_completed(rq)) { + struct drm_printer p = + drm_info_printer(gt->i915->drm.dev); + + GEM_TRACE_ERR("%s: Failed to enable timeslicing!\n", + engine->name); + intel_engine_dump(engine, &p, + "%s\n", engine->name); + GEM_TRACE_DUMP(); + + memset(vaddr, 0xff, PAGE_SIZE); + err = -EINVAL; + } + + /* Timeslice every jiffie, so within 2 we should signal */ + if (i915_request_wait(rq, 0, 3) < 0) { + struct drm_printer p = + drm_info_printer(gt->i915->drm.dev); + + pr_err("%s: Failed to timeslice into queue\n", + engine->name); + intel_engine_dump(engine, &p, + "%s\n", engine->name); + + memset(vaddr, 0xff, PAGE_SIZE); + err = -EIO; + } + i915_request_put(rq); + if (err) + break; + } + +err_pin: + i915_vma_unpin(vma); +err_map: + i915_gem_object_unpin_map(obj); +err_obj: + i915_gem_object_put(obj); + return err; +} + static int live_busywait_preempt(void *arg) { - struct drm_i915_private *i915 = arg; + struct intel_gt *gt = arg; struct i915_gem_context *ctx_hi, *ctx_lo; struct intel_engine_cs *engine; struct drm_i915_gem_object *obj; @@ -475,19 +655,19 @@ static int live_busywait_preempt(void *arg) * preempt the busywaits used to synchronise between rings. */ - ctx_hi = kernel_context(i915); + ctx_hi = kernel_context(gt->i915); if (!ctx_hi) return -ENOMEM; ctx_hi->sched.priority = I915_USER_PRIORITY(I915_CONTEXT_MAX_USER_PRIORITY); - ctx_lo = kernel_context(i915); + ctx_lo = kernel_context(gt->i915); if (!ctx_lo) goto err_ctx_hi; ctx_lo->sched.priority = I915_USER_PRIORITY(I915_CONTEXT_MIN_USER_PRIORITY); - obj = i915_gem_object_create_internal(i915, PAGE_SIZE); + obj = i915_gem_object_create_internal(gt->i915, PAGE_SIZE); if (IS_ERR(obj)) { err = PTR_ERR(obj); goto err_ctx_lo; @@ -499,7 +679,7 @@ static int live_busywait_preempt(void *arg) goto err_obj; } - vma = i915_vma_instance(obj, &i915->ggtt.vm, NULL); + vma = i915_vma_instance(obj, >->ggtt->vm, NULL); if (IS_ERR(vma)) { err = PTR_ERR(vma); goto err_map; @@ -509,7 +689,7 @@ static int live_busywait_preempt(void *arg) if (err) goto err_map; - for_each_engine(engine, i915, id) { + for_each_engine(engine, gt, id) { struct i915_request *lo, *hi; struct igt_live_test t; u32 *cs; @@ -520,7 +700,7 @@ static int live_busywait_preempt(void *arg) if (!intel_engine_can_store_dword(engine)) continue; - if (igt_live_test_begin(&t, i915, __func__, engine->name)) { + if (igt_live_test_begin(&t, gt->i915, __func__, engine->name)) { err = -EIO; goto err_vma; } @@ -600,7 +780,7 @@ static int live_busywait_preempt(void *arg) i915_request_add(hi); if (i915_request_wait(lo, 0, HZ / 5) < 0) { - struct drm_printer p = drm_info_printer(i915->drm.dev); + struct drm_printer p = drm_info_printer(gt->i915->drm.dev); pr_err("%s: Failed to preempt semaphore busywait!\n", engine->name); @@ -608,7 +788,7 @@ static int live_busywait_preempt(void *arg) intel_engine_dump(engine, &p, "%s\n", engine->name); GEM_TRACE_DUMP(); - intel_gt_set_wedged(&i915->gt); + intel_gt_set_wedged(gt); err = -EIO; goto err_vma; } @@ -654,45 +834,45 @@ spinner_create_request(struct igt_spinner *spin, static int live_preempt(void *arg) { - struct drm_i915_private *i915 = arg; + struct intel_gt *gt = arg; struct i915_gem_context *ctx_hi, *ctx_lo; struct igt_spinner spin_hi, spin_lo; struct intel_engine_cs *engine; enum intel_engine_id id; int err = -ENOMEM; - if (!HAS_LOGICAL_RING_PREEMPTION(i915)) + if (!HAS_LOGICAL_RING_PREEMPTION(gt->i915)) return 0; - if (!(i915->caps.scheduler & I915_SCHEDULER_CAP_PREEMPTION)) + if (!(gt->i915->caps.scheduler & I915_SCHEDULER_CAP_PREEMPTION)) pr_err("Logical preemption supported, but not exposed\n"); - if (igt_spinner_init(&spin_hi, &i915->gt)) + if (igt_spinner_init(&spin_hi, gt)) return -ENOMEM; - if (igt_spinner_init(&spin_lo, &i915->gt)) + if (igt_spinner_init(&spin_lo, gt)) goto err_spin_hi; - ctx_hi = kernel_context(i915); + ctx_hi = kernel_context(gt->i915); if (!ctx_hi) goto err_spin_lo; ctx_hi->sched.priority = I915_USER_PRIORITY(I915_CONTEXT_MAX_USER_PRIORITY); - ctx_lo = kernel_context(i915); + ctx_lo = kernel_context(gt->i915); if (!ctx_lo) goto err_ctx_hi; ctx_lo->sched.priority = I915_USER_PRIORITY(I915_CONTEXT_MIN_USER_PRIORITY); - for_each_engine(engine, i915, id) { + for_each_engine(engine, gt, id) { struct igt_live_test t; struct i915_request *rq; if (!intel_engine_has_preemption(engine)) continue; - if (igt_live_test_begin(&t, i915, __func__, engine->name)) { + if (igt_live_test_begin(&t, gt->i915, __func__, engine->name)) { err = -EIO; goto err_ctx_lo; } @@ -708,7 +888,7 @@ static int live_preempt(void *arg) if (!igt_wait_for_spinner(&spin_lo, rq)) { GEM_TRACE("lo spinner failed to start\n"); GEM_TRACE_DUMP(); - intel_gt_set_wedged(&i915->gt); + intel_gt_set_wedged(gt); err = -EIO; goto err_ctx_lo; } @@ -725,7 +905,7 @@ static int live_preempt(void *arg) if (!igt_wait_for_spinner(&spin_hi, rq)) { GEM_TRACE("hi spinner failed to start\n"); GEM_TRACE_DUMP(); - intel_gt_set_wedged(&i915->gt); + intel_gt_set_wedged(gt); err = -EIO; goto err_ctx_lo; } @@ -753,7 +933,7 @@ err_spin_hi: static int live_late_preempt(void *arg) { - struct drm_i915_private *i915 = arg; + struct intel_gt *gt = arg; struct i915_gem_context *ctx_hi, *ctx_lo; struct igt_spinner spin_hi, spin_lo; struct intel_engine_cs *engine; @@ -761,34 +941,34 @@ static int live_late_preempt(void *arg) enum intel_engine_id id; int err = -ENOMEM; - if (!HAS_LOGICAL_RING_PREEMPTION(i915)) + if (!HAS_LOGICAL_RING_PREEMPTION(gt->i915)) return 0; - if (igt_spinner_init(&spin_hi, &i915->gt)) + if (igt_spinner_init(&spin_hi, gt)) return -ENOMEM; - if (igt_spinner_init(&spin_lo, &i915->gt)) + if (igt_spinner_init(&spin_lo, gt)) goto err_spin_hi; - ctx_hi = kernel_context(i915); + ctx_hi = kernel_context(gt->i915); if (!ctx_hi) goto err_spin_lo; - ctx_lo = kernel_context(i915); + ctx_lo = kernel_context(gt->i915); if (!ctx_lo) goto err_ctx_hi; /* Make sure ctx_lo stays before ctx_hi until we trigger preemption. */ ctx_lo->sched.priority = I915_USER_PRIORITY(1); - for_each_engine(engine, i915, id) { + for_each_engine(engine, gt, id) { struct igt_live_test t; struct i915_request *rq; if (!intel_engine_has_preemption(engine)) continue; - if (igt_live_test_begin(&t, i915, __func__, engine->name)) { + if (igt_live_test_begin(&t, gt->i915, __func__, engine->name)) { err = -EIO; goto err_ctx_lo; } @@ -852,7 +1032,7 @@ err_spin_hi: err_wedged: igt_spinner_end(&spin_hi); igt_spinner_end(&spin_lo); - intel_gt_set_wedged(&i915->gt); + intel_gt_set_wedged(gt); err = -EIO; goto err_ctx_lo; } @@ -862,14 +1042,13 @@ struct preempt_client { struct i915_gem_context *ctx; }; -static int preempt_client_init(struct drm_i915_private *i915, - struct preempt_client *c) +static int preempt_client_init(struct intel_gt *gt, struct preempt_client *c) { - c->ctx = kernel_context(i915); + c->ctx = kernel_context(gt->i915); if (!c->ctx) return -ENOMEM; - if (igt_spinner_init(&c->spin, &i915->gt)) + if (igt_spinner_init(&c->spin, gt)) goto err_ctx; return 0; @@ -887,7 +1066,7 @@ static void preempt_client_fini(struct preempt_client *c) static int live_nopreempt(void *arg) { - struct drm_i915_private *i915 = arg; + struct intel_gt *gt = arg; struct intel_engine_cs *engine; struct preempt_client a, b; enum intel_engine_id id; @@ -898,16 +1077,16 @@ static int live_nopreempt(void *arg) * that may be being observed and not want to be interrupted. */ - if (!HAS_LOGICAL_RING_PREEMPTION(i915)) + if (!HAS_LOGICAL_RING_PREEMPTION(gt->i915)) return 0; - if (preempt_client_init(i915, &a)) + if (preempt_client_init(gt, &a)) return -ENOMEM; - if (preempt_client_init(i915, &b)) + if (preempt_client_init(gt, &b)) goto err_client_a; b.ctx->sched.priority = I915_USER_PRIORITY(I915_PRIORITY_MAX); - for_each_engine(engine, i915, id) { + for_each_engine(engine, gt, id) { struct i915_request *rq_a, *rq_b; if (!intel_engine_has_preemption(engine)) @@ -967,7 +1146,7 @@ static int live_nopreempt(void *arg) goto err_wedged; } - if (igt_flush_test(i915)) + if (igt_flush_test(gt->i915)) goto err_wedged; } @@ -981,14 +1160,14 @@ err_client_a: err_wedged: igt_spinner_end(&b.spin); igt_spinner_end(&a.spin); - intel_gt_set_wedged(&i915->gt); + intel_gt_set_wedged(gt); err = -EIO; goto err_client_b; } static int live_suppress_self_preempt(void *arg) { - struct drm_i915_private *i915 = arg; + struct intel_gt *gt = arg; struct intel_engine_cs *engine; struct i915_sched_attr attr = { .priority = I915_USER_PRIORITY(I915_PRIORITY_MAX) @@ -1004,28 +1183,28 @@ static int live_suppress_self_preempt(void *arg) * completion event. */ - if (!HAS_LOGICAL_RING_PREEMPTION(i915)) + if (!HAS_LOGICAL_RING_PREEMPTION(gt->i915)) return 0; - if (USES_GUC_SUBMISSION(i915)) + if (USES_GUC_SUBMISSION(gt->i915)) return 0; /* presume black blox */ - if (intel_vgpu_active(i915)) + if (intel_vgpu_active(gt->i915)) return 0; /* GVT forces single port & request submission */ - if (preempt_client_init(i915, &a)) + if (preempt_client_init(gt, &a)) return -ENOMEM; - if (preempt_client_init(i915, &b)) + if (preempt_client_init(gt, &b)) goto err_client_a; - for_each_engine(engine, i915, id) { + for_each_engine(engine, gt, id) { struct i915_request *rq_a, *rq_b; int depth; if (!intel_engine_has_preemption(engine)) continue; - if (igt_flush_test(i915)) + if (igt_flush_test(gt->i915)) goto err_wedged; intel_engine_pm_get(engine); @@ -1086,7 +1265,7 @@ static int live_suppress_self_preempt(void *arg) } intel_engine_pm_put(engine); - if (igt_flush_test(i915)) + if (igt_flush_test(gt->i915)) goto err_wedged; } @@ -1100,7 +1279,7 @@ err_client_a: err_wedged: igt_spinner_end(&b.spin); igt_spinner_end(&a.spin); - intel_gt_set_wedged(&i915->gt); + intel_gt_set_wedged(gt); err = -EIO; goto err_client_b; } @@ -1160,7 +1339,7 @@ static void dummy_request_free(struct i915_request *dummy) static int live_suppress_wait_preempt(void *arg) { - struct drm_i915_private *i915 = arg; + struct intel_gt *gt = arg; struct preempt_client client[4]; struct intel_engine_cs *engine; enum intel_engine_id id; @@ -1173,19 +1352,19 @@ static int live_suppress_wait_preempt(void *arg) * not needlessly generate preempt-to-idle cycles. */ - if (!HAS_LOGICAL_RING_PREEMPTION(i915)) + if (!HAS_LOGICAL_RING_PREEMPTION(gt->i915)) return 0; - if (preempt_client_init(i915, &client[0])) /* ELSP[0] */ + if (preempt_client_init(gt, &client[0])) /* ELSP[0] */ return -ENOMEM; - if (preempt_client_init(i915, &client[1])) /* ELSP[1] */ + if (preempt_client_init(gt, &client[1])) /* ELSP[1] */ goto err_client_0; - if (preempt_client_init(i915, &client[2])) /* head of queue */ + if (preempt_client_init(gt, &client[2])) /* head of queue */ goto err_client_1; - if (preempt_client_init(i915, &client[3])) /* bystander */ + if (preempt_client_init(gt, &client[3])) /* bystander */ goto err_client_2; - for_each_engine(engine, i915, id) { + for_each_engine(engine, gt, id) { int depth; if (!intel_engine_has_preemption(engine)) @@ -1240,7 +1419,7 @@ static int live_suppress_wait_preempt(void *arg) for (i = 0; i < ARRAY_SIZE(client); i++) igt_spinner_end(&client[i].spin); - if (igt_flush_test(i915)) + if (igt_flush_test(gt->i915)) goto err_wedged; if (engine->execlists.preempt_hang.count) { @@ -1268,14 +1447,14 @@ err_client_0: err_wedged: for (i = 0; i < ARRAY_SIZE(client); i++) igt_spinner_end(&client[i].spin); - intel_gt_set_wedged(&i915->gt); + intel_gt_set_wedged(gt); err = -EIO; goto err_client_3; } static int live_chain_preempt(void *arg) { - struct drm_i915_private *i915 = arg; + struct intel_gt *gt = arg; struct intel_engine_cs *engine; struct preempt_client hi, lo; enum intel_engine_id id; @@ -1287,16 +1466,16 @@ static int live_chain_preempt(void *arg) * the previously submitted spinner in B. */ - if (!HAS_LOGICAL_RING_PREEMPTION(i915)) + if (!HAS_LOGICAL_RING_PREEMPTION(gt->i915)) return 0; - if (preempt_client_init(i915, &hi)) + if (preempt_client_init(gt, &hi)) return -ENOMEM; - if (preempt_client_init(i915, &lo)) + if (preempt_client_init(gt, &lo)) goto err_client_hi; - for_each_engine(engine, i915, id) { + for_each_engine(engine, gt, id) { struct i915_sched_attr attr = { .priority = I915_USER_PRIORITY(I915_PRIORITY_MAX), }; @@ -1327,7 +1506,7 @@ static int live_chain_preempt(void *arg) goto err_wedged; } - if (igt_live_test_begin(&t, i915, __func__, engine->name)) { + if (igt_live_test_begin(&t, gt->i915, __func__, engine->name)) { err = -EIO; goto err_wedged; } @@ -1365,7 +1544,7 @@ static int live_chain_preempt(void *arg) igt_spinner_end(&hi.spin); if (i915_request_wait(rq, 0, HZ / 5) < 0) { struct drm_printer p = - drm_info_printer(i915->drm.dev); + drm_info_printer(gt->i915->drm.dev); pr_err("Failed to preempt over chain of %d\n", count); @@ -1381,7 +1560,7 @@ static int live_chain_preempt(void *arg) i915_request_add(rq); if (i915_request_wait(rq, 0, HZ / 5) < 0) { struct drm_printer p = - drm_info_printer(i915->drm.dev); + drm_info_printer(gt->i915->drm.dev); pr_err("Failed to flush low priority chain of %d requests\n", count); @@ -1407,45 +1586,45 @@ err_client_hi: err_wedged: igt_spinner_end(&hi.spin); igt_spinner_end(&lo.spin); - intel_gt_set_wedged(&i915->gt); + intel_gt_set_wedged(gt); err = -EIO; goto err_client_lo; } static int live_preempt_hang(void *arg) { - struct drm_i915_private *i915 = arg; + struct intel_gt *gt = arg; struct i915_gem_context *ctx_hi, *ctx_lo; struct igt_spinner spin_hi, spin_lo; struct intel_engine_cs *engine; enum intel_engine_id id; int err = -ENOMEM; - if (!HAS_LOGICAL_RING_PREEMPTION(i915)) + if (!HAS_LOGICAL_RING_PREEMPTION(gt->i915)) return 0; - if (!intel_has_reset_engine(&i915->gt)) + if (!intel_has_reset_engine(gt)) return 0; - if (igt_spinner_init(&spin_hi, &i915->gt)) + if (igt_spinner_init(&spin_hi, gt)) return -ENOMEM; - if (igt_spinner_init(&spin_lo, &i915->gt)) + if (igt_spinner_init(&spin_lo, gt)) goto err_spin_hi; - ctx_hi = kernel_context(i915); + ctx_hi = kernel_context(gt->i915); if (!ctx_hi) goto err_spin_lo; ctx_hi->sched.priority = I915_USER_PRIORITY(I915_CONTEXT_MAX_USER_PRIORITY); - ctx_lo = kernel_context(i915); + ctx_lo = kernel_context(gt->i915); if (!ctx_lo) goto err_ctx_hi; ctx_lo->sched.priority = I915_USER_PRIORITY(I915_CONTEXT_MIN_USER_PRIORITY); - for_each_engine(engine, i915, id) { + for_each_engine(engine, gt, id) { struct i915_request *rq; if (!intel_engine_has_preemption(engine)) @@ -1462,7 +1641,7 @@ static int live_preempt_hang(void *arg) if (!igt_wait_for_spinner(&spin_lo, rq)) { GEM_TRACE("lo spinner failed to start\n"); GEM_TRACE_DUMP(); - intel_gt_set_wedged(&i915->gt); + intel_gt_set_wedged(gt); err = -EIO; goto err_ctx_lo; } @@ -1484,28 +1663,28 @@ static int live_preempt_hang(void *arg) HZ / 10)) { pr_err("Preemption did not occur within timeout!"); GEM_TRACE_DUMP(); - intel_gt_set_wedged(&i915->gt); + intel_gt_set_wedged(gt); err = -EIO; goto err_ctx_lo; } - set_bit(I915_RESET_ENGINE + id, &i915->gt.reset.flags); + set_bit(I915_RESET_ENGINE + id, >->reset.flags); intel_engine_reset(engine, NULL); - clear_bit(I915_RESET_ENGINE + id, &i915->gt.reset.flags); + clear_bit(I915_RESET_ENGINE + id, >->reset.flags); engine->execlists.preempt_hang.inject_hang = false; if (!igt_wait_for_spinner(&spin_hi, rq)) { GEM_TRACE("hi spinner failed to start\n"); GEM_TRACE_DUMP(); - intel_gt_set_wedged(&i915->gt); + intel_gt_set_wedged(gt); err = -EIO; goto err_ctx_lo; } igt_spinner_end(&spin_hi); igt_spinner_end(&spin_lo); - if (igt_flush_test(i915)) { + if (igt_flush_test(gt->i915)) { err = -EIO; goto err_ctx_lo; } @@ -1534,7 +1713,7 @@ static int random_priority(struct rnd_state *rnd) } struct preempt_smoke { - struct drm_i915_private *i915; + struct intel_gt *gt; struct i915_gem_context **contexts; struct intel_engine_cs *engine; struct drm_i915_gem_object *batch; @@ -1634,7 +1813,7 @@ static int smoke_crescendo(struct preempt_smoke *smoke, unsigned int flags) unsigned long count; int err = 0; - for_each_engine(engine, smoke->i915, id) { + for_each_engine(engine, smoke->gt, id) { arg[id] = *smoke; arg[id].engine = engine; if (!(flags & BATCH)) @@ -1651,7 +1830,7 @@ static int smoke_crescendo(struct preempt_smoke *smoke, unsigned int flags) } count = 0; - for_each_engine(engine, smoke->i915, id) { + for_each_engine(engine, smoke->gt, id) { int status; if (IS_ERR_OR_NULL(tsk[id])) @@ -1668,7 +1847,7 @@ static int smoke_crescendo(struct preempt_smoke *smoke, unsigned int flags) pr_info("Submitted %lu crescendo:%x requests across %d engines and %d contexts\n", count, flags, - RUNTIME_INFO(smoke->i915)->num_engines, smoke->ncontext); + RUNTIME_INFO(smoke->gt->i915)->num_engines, smoke->ncontext); return 0; } @@ -1680,7 +1859,7 @@ static int smoke_random(struct preempt_smoke *smoke, unsigned int flags) count = 0; do { - for_each_engine(smoke->engine, smoke->i915, id) { + for_each_engine(smoke->engine, smoke->gt, id) { struct i915_gem_context *ctx = smoke_context(smoke); int err; @@ -1696,14 +1875,14 @@ static int smoke_random(struct preempt_smoke *smoke, unsigned int flags) pr_info("Submitted %lu random:%x requests across %d engines and %d contexts\n", count, flags, - RUNTIME_INFO(smoke->i915)->num_engines, smoke->ncontext); + RUNTIME_INFO(smoke->gt->i915)->num_engines, smoke->ncontext); return 0; } static int live_preempt_smoke(void *arg) { struct preempt_smoke smoke = { - .i915 = arg, + .gt = arg, .prng = I915_RND_STATE_INITIALIZER(i915_selftest.random_seed), .ncontext = 1024, }; @@ -1713,7 +1892,7 @@ static int live_preempt_smoke(void *arg) u32 *cs; int n; - if (!HAS_LOGICAL_RING_PREEMPTION(smoke.i915)) + if (!HAS_LOGICAL_RING_PREEMPTION(smoke.gt->i915)) return 0; smoke.contexts = kmalloc_array(smoke.ncontext, @@ -1722,7 +1901,8 @@ static int live_preempt_smoke(void *arg) if (!smoke.contexts) return -ENOMEM; - smoke.batch = i915_gem_object_create_internal(smoke.i915, PAGE_SIZE); + smoke.batch = + i915_gem_object_create_internal(smoke.gt->i915, PAGE_SIZE); if (IS_ERR(smoke.batch)) { err = PTR_ERR(smoke.batch); goto err_free; @@ -1739,13 +1919,13 @@ static int live_preempt_smoke(void *arg) i915_gem_object_flush_map(smoke.batch); i915_gem_object_unpin_map(smoke.batch); - if (igt_live_test_begin(&t, smoke.i915, __func__, "all")) { + if (igt_live_test_begin(&t, smoke.gt->i915, __func__, "all")) { err = -EIO; goto err_batch; } for (n = 0; n < smoke.ncontext; n++) { - smoke.contexts[n] = kernel_context(smoke.i915); + smoke.contexts[n] = kernel_context(smoke.gt->i915); if (!smoke.contexts[n]) goto err_ctx; } @@ -1778,7 +1958,7 @@ err_free: return err; } -static int nop_virtual_engine(struct drm_i915_private *i915, +static int nop_virtual_engine(struct intel_gt *gt, struct intel_engine_cs **siblings, unsigned int nsibling, unsigned int nctx, @@ -1797,7 +1977,7 @@ static int nop_virtual_engine(struct drm_i915_private *i915, GEM_BUG_ON(!nctx || nctx > ARRAY_SIZE(ctx)); for (n = 0; n < nctx; n++) { - ctx[n] = kernel_context(i915); + ctx[n] = kernel_context(gt->i915); if (!ctx[n]) { err = -ENOMEM; nctx = n; @@ -1822,7 +2002,7 @@ static int nop_virtual_engine(struct drm_i915_private *i915, } } - err = igt_live_test_begin(&t, i915, __func__, ve[0]->engine->name); + err = igt_live_test_begin(&t, gt->i915, __func__, ve[0]->engine->name); if (err) goto out; @@ -1869,7 +2049,7 @@ static int nop_virtual_engine(struct drm_i915_private *i915, request[nc]->fence.context, request[nc]->fence.seqno); GEM_TRACE_DUMP(); - intel_gt_set_wedged(&i915->gt); + intel_gt_set_wedged(gt); break; } } @@ -1891,7 +2071,7 @@ static int nop_virtual_engine(struct drm_i915_private *i915, prime, div64_u64(ktime_to_ns(times[1]), prime)); out: - if (igt_flush_test(i915)) + if (igt_flush_test(gt->i915)) err = -EIO; for (nc = 0; nc < nctx; nc++) { @@ -1904,19 +2084,18 @@ out: static int live_virtual_engine(void *arg) { - struct drm_i915_private *i915 = arg; + struct intel_gt *gt = arg; struct intel_engine_cs *siblings[MAX_ENGINE_INSTANCE + 1]; struct intel_engine_cs *engine; - struct intel_gt *gt = &i915->gt; enum intel_engine_id id; unsigned int class, inst; int err; - if (USES_GUC_SUBMISSION(i915)) + if (USES_GUC_SUBMISSION(gt->i915)) return 0; - for_each_engine(engine, i915, id) { - err = nop_virtual_engine(i915, &engine, 1, 1, 0); + for_each_engine(engine, gt, id) { + err = nop_virtual_engine(gt, &engine, 1, 1, 0); if (err) { pr_err("Failed to wrap engine %s: err=%d\n", engine->name, err); @@ -1938,13 +2117,13 @@ static int live_virtual_engine(void *arg) continue; for (n = 1; n <= nsibling + 1; n++) { - err = nop_virtual_engine(i915, siblings, nsibling, + err = nop_virtual_engine(gt, siblings, nsibling, n, 0); if (err) return err; } - err = nop_virtual_engine(i915, siblings, nsibling, n, CHAIN); + err = nop_virtual_engine(gt, siblings, nsibling, n, CHAIN); if (err) return err; } @@ -1952,7 +2131,7 @@ static int live_virtual_engine(void *arg) return 0; } -static int mask_virtual_engine(struct drm_i915_private *i915, +static int mask_virtual_engine(struct intel_gt *gt, struct intel_engine_cs **siblings, unsigned int nsibling) { @@ -1968,7 +2147,7 @@ static int mask_virtual_engine(struct drm_i915_private *i915, * restrict it to our desired engine within the virtual engine. */ - ctx = kernel_context(i915); + ctx = kernel_context(gt->i915); if (!ctx) return -ENOMEM; @@ -1982,7 +2161,7 @@ static int mask_virtual_engine(struct drm_i915_private *i915, if (err) goto out_put; - err = igt_live_test_begin(&t, i915, __func__, ve->engine->name); + err = igt_live_test_begin(&t, gt->i915, __func__, ve->engine->name); if (err) goto out_unpin; @@ -2013,7 +2192,7 @@ static int mask_virtual_engine(struct drm_i915_private *i915, request[n]->fence.context, request[n]->fence.seqno); GEM_TRACE_DUMP(); - intel_gt_set_wedged(&i915->gt); + intel_gt_set_wedged(gt); err = -EIO; goto out; } @@ -2029,7 +2208,7 @@ static int mask_virtual_engine(struct drm_i915_private *i915, err = igt_live_test_end(&t); out: - if (igt_flush_test(i915)) + if (igt_flush_test(gt->i915)) err = -EIO; for (n = 0; n < nsibling; n++) @@ -2046,13 +2225,12 @@ out_close: static int live_virtual_mask(void *arg) { - struct drm_i915_private *i915 = arg; + struct intel_gt *gt = arg; struct intel_engine_cs *siblings[MAX_ENGINE_INSTANCE + 1]; - struct intel_gt *gt = &i915->gt; unsigned int class, inst; int err; - if (USES_GUC_SUBMISSION(i915)) + if (USES_GUC_SUBMISSION(gt->i915)) return 0; for (class = 0; class <= MAX_ENGINE_CLASS; class++) { @@ -2068,7 +2246,7 @@ static int live_virtual_mask(void *arg) if (nsibling < 2) continue; - err = mask_virtual_engine(i915, siblings, nsibling); + err = mask_virtual_engine(gt, siblings, nsibling); if (err) return err; } @@ -2076,7 +2254,158 @@ static int live_virtual_mask(void *arg) return 0; } -static int bond_virtual_engine(struct drm_i915_private *i915, +static int preserved_virtual_engine(struct intel_gt *gt, + struct intel_engine_cs **siblings, + unsigned int nsibling) +{ + struct i915_request *last = NULL; + struct i915_gem_context *ctx; + struct intel_context *ve; + struct i915_vma *scratch; + struct igt_live_test t; + unsigned int n; + int err = 0; + u32 *cs; + + ctx = kernel_context(gt->i915); + if (!ctx) + return -ENOMEM; + + scratch = create_scratch(siblings[0]->gt); + if (IS_ERR(scratch)) { + err = PTR_ERR(scratch); + goto out_close; + } + + ve = intel_execlists_create_virtual(ctx, siblings, nsibling); + if (IS_ERR(ve)) { + err = PTR_ERR(ve); + goto out_scratch; + } + + err = intel_context_pin(ve); + if (err) + goto out_put; + + err = igt_live_test_begin(&t, gt->i915, __func__, ve->engine->name); + if (err) + goto out_unpin; + + for (n = 0; n < NUM_GPR_DW; n++) { + struct intel_engine_cs *engine = siblings[n % nsibling]; + struct i915_request *rq; + + rq = i915_request_create(ve); + if (IS_ERR(rq)) { + err = PTR_ERR(rq); + goto out_end; + } + + i915_request_put(last); + last = i915_request_get(rq); + + cs = intel_ring_begin(rq, 8); + if (IS_ERR(cs)) { + i915_request_add(rq); + err = PTR_ERR(cs); + goto out_end; + } + + *cs++ = MI_STORE_REGISTER_MEM_GEN8 | MI_USE_GGTT; + *cs++ = CS_GPR(engine, n); + *cs++ = i915_ggtt_offset(scratch) + n * sizeof(u32); + *cs++ = 0; + + *cs++ = MI_LOAD_REGISTER_IMM(1); + *cs++ = CS_GPR(engine, (n + 1) % NUM_GPR_DW); + *cs++ = n + 1; + + *cs++ = MI_NOOP; + intel_ring_advance(rq, cs); + + /* Restrict this request to run on a particular engine */ + rq->execution_mask = engine->mask; + i915_request_add(rq); + } + + if (i915_request_wait(last, 0, HZ / 5) < 0) { + err = -ETIME; + goto out_end; + } + + cs = i915_gem_object_pin_map(scratch->obj, I915_MAP_WB); + if (IS_ERR(cs)) { + err = PTR_ERR(cs); + goto out_end; + } + + for (n = 0; n < NUM_GPR_DW; n++) { + if (cs[n] != n) { + pr_err("Incorrect value[%d] found for GPR[%d]\n", + cs[n], n); + err = -EINVAL; + break; + } + } + + i915_gem_object_unpin_map(scratch->obj); + +out_end: + if (igt_live_test_end(&t)) + err = -EIO; + i915_request_put(last); +out_unpin: + intel_context_unpin(ve); +out_put: + intel_context_put(ve); +out_scratch: + i915_vma_unpin_and_release(&scratch, 0); +out_close: + kernel_context_close(ctx); + return err; +} + +static int live_virtual_preserved(void *arg) +{ + struct intel_gt *gt = arg; + struct intel_engine_cs *siblings[MAX_ENGINE_INSTANCE + 1]; + unsigned int class, inst; + + /* + * Check that the context image retains non-privileged (user) registers + * from one engine to the next. For this we check that the CS_GPR + * are preserved. + */ + + if (USES_GUC_SUBMISSION(gt->i915)) + return 0; + + /* As we use CS_GPR we cannot run before they existed on all engines. */ + if (INTEL_GEN(gt->i915) < 9) + return 0; + + for (class = 0; class <= MAX_ENGINE_CLASS; class++) { + int nsibling, err; + + nsibling = 0; + for (inst = 0; inst <= MAX_ENGINE_INSTANCE; inst++) { + if (!gt->engine_class[class][inst]) + continue; + + siblings[nsibling++] = gt->engine_class[class][inst]; + } + if (nsibling < 2) + continue; + + err = preserved_virtual_engine(gt, siblings, nsibling); + if (err) + return err; + } + + return 0; +} + +static int bond_virtual_engine(struct intel_gt *gt, unsigned int class, struct intel_engine_cs **siblings, unsigned int nsibling, @@ -2092,13 +2421,13 @@ static int bond_virtual_engine(struct drm_i915_private *i915, GEM_BUG_ON(nsibling >= ARRAY_SIZE(rq) - 1); - ctx = kernel_context(i915); + ctx = kernel_context(gt->i915); if (!ctx) return -ENOMEM; err = 0; rq[0] = ERR_PTR(-ENOMEM); - for_each_engine(master, i915, id) { + for_each_engine(master, gt, id) { struct i915_sw_fence fence = {}; if (master->class == class) @@ -2203,7 +2532,7 @@ static int bond_virtual_engine(struct drm_i915_private *i915, out: for (n = 0; !IS_ERR(rq[n]); n++) i915_request_put(rq[n]); - if (igt_flush_test(i915)) + if (igt_flush_test(gt->i915)) err = -EIO; kernel_context_close(ctx); @@ -2220,13 +2549,12 @@ static int live_virtual_bond(void *arg) { "schedule", BOND_SCHEDULE }, { }, }; - struct drm_i915_private *i915 = arg; + struct intel_gt *gt = arg; struct intel_engine_cs *siblings[MAX_ENGINE_INSTANCE + 1]; - struct intel_gt *gt = &i915->gt; unsigned int class, inst; int err; - if (USES_GUC_SUBMISSION(i915)) + if (USES_GUC_SUBMISSION(gt->i915)) return 0; for (class = 0; class <= MAX_ENGINE_CLASS; class++) { @@ -2245,7 +2573,7 @@ static int live_virtual_bond(void *arg) continue; for (p = phases; p->name; p++) { - err = bond_virtual_engine(i915, + err = bond_virtual_engine(gt, class, siblings, nsibling, p->flags); if (err) { @@ -2266,6 +2594,7 @@ int intel_execlists_live_selftests(struct drm_i915_private *i915) SUBTEST(live_unlite_switch), SUBTEST(live_unlite_preempt), SUBTEST(live_timeslice_preempt), + SUBTEST(live_timeslice_queue), SUBTEST(live_busywait_preempt), SUBTEST(live_preempt), SUBTEST(live_late_preempt), @@ -2277,6 +2606,7 @@ int intel_execlists_live_selftests(struct drm_i915_private *i915) SUBTEST(live_preempt_smoke), SUBTEST(live_virtual_engine), SUBTEST(live_virtual_mask), + SUBTEST(live_virtual_preserved), SUBTEST(live_virtual_bond), }; @@ -2286,7 +2616,7 @@ int intel_execlists_live_selftests(struct drm_i915_private *i915) if (intel_gt_is_wedged(&i915->gt)) return 0; - return i915_live_subtests(tests, i915); + return intel_gt_live_subtests(tests, &i915->gt); } static void hexdump(const void *buf, size_t len) @@ -2336,7 +2666,7 @@ static int live_lrc_layout(void *arg) return -ENOMEM; err = 0; - for_each_engine(engine, gt->i915, id) { + for_each_engine(engine, gt, id) { u32 *hw, *lrc; int dw; @@ -2419,10 +2749,280 @@ static int live_lrc_layout(void *arg) return err; } +static int __live_lrc_state(struct i915_gem_context *fixme, + struct intel_engine_cs *engine, + struct i915_vma *scratch) +{ + struct intel_context *ce; + struct i915_request *rq; + enum { + RING_START_IDX = 0, + RING_TAIL_IDX, + MAX_IDX + }; + u32 expected[MAX_IDX]; + u32 *cs; + int err; + int n; + + ce = intel_context_create(fixme, engine); + if (IS_ERR(ce)) + return PTR_ERR(ce); + + err = intel_context_pin(ce); + if (err) + goto err_put; + + rq = i915_request_create(ce); + if (IS_ERR(rq)) { + err = PTR_ERR(rq); + goto err_unpin; + } + + cs = intel_ring_begin(rq, 4 * MAX_IDX); + if (IS_ERR(cs)) { + err = PTR_ERR(cs); + i915_request_add(rq); + goto err_unpin; + } + + *cs++ = MI_STORE_REGISTER_MEM_GEN8 | MI_USE_GGTT; + *cs++ = i915_mmio_reg_offset(RING_START(engine->mmio_base)); + *cs++ = i915_ggtt_offset(scratch) + RING_START_IDX * sizeof(u32); + *cs++ = 0; + + expected[RING_START_IDX] = i915_ggtt_offset(ce->ring->vma); + + *cs++ = MI_STORE_REGISTER_MEM_GEN8 | MI_USE_GGTT; + *cs++ = i915_mmio_reg_offset(RING_TAIL(engine->mmio_base)); + *cs++ = i915_ggtt_offset(scratch) + RING_TAIL_IDX * sizeof(u32); + *cs++ = 0; + + i915_request_get(rq); + i915_request_add(rq); + + intel_engine_flush_submission(engine); + expected[RING_TAIL_IDX] = ce->ring->tail; + + if (i915_request_wait(rq, 0, HZ / 5) < 0) { + err = -ETIME; + goto err_rq; + } + + cs = i915_gem_object_pin_map(scratch->obj, I915_MAP_WB); + if (IS_ERR(cs)) { + err = PTR_ERR(cs); + goto err_rq; + } + + for (n = 0; n < MAX_IDX; n++) { + if (cs[n] != expected[n]) { + pr_err("%s: Stored register[%d] value[0x%x] did not match expected[0x%x]\n", + engine->name, n, cs[n], expected[n]); + err = -EINVAL; + break; + } + } + + i915_gem_object_unpin_map(scratch->obj); + +err_rq: + i915_request_put(rq); +err_unpin: + intel_context_unpin(ce); +err_put: + intel_context_put(ce); + return err; +} + +static int live_lrc_state(void *arg) +{ + struct intel_gt *gt = arg; + struct intel_engine_cs *engine; + struct i915_gem_context *fixme; + struct i915_vma *scratch; + enum intel_engine_id id; + int err = 0; + + /* + * Check the live register state matches what we expect for this + * intel_context. + */ + + fixme = kernel_context(gt->i915); + if (!fixme) + return -ENOMEM; + + scratch = create_scratch(gt); + if (IS_ERR(scratch)) { + err = PTR_ERR(scratch); + goto out_close; + } + + for_each_engine(engine, gt, id) { + err = __live_lrc_state(fixme, engine, scratch); + if (err) + break; + } + + if (igt_flush_test(gt->i915)) + err = -EIO; + + i915_vma_unpin_and_release(&scratch, 0); +out_close: + kernel_context_close(fixme); + return err; +} + +static int gpr_make_dirty(struct intel_engine_cs *engine) +{ + struct i915_request *rq; + u32 *cs; + int n; + + rq = i915_request_create(engine->kernel_context); + if (IS_ERR(rq)) + return PTR_ERR(rq); + + cs = intel_ring_begin(rq, 2 * NUM_GPR_DW + 2); + if (IS_ERR(cs)) { + i915_request_add(rq); + return PTR_ERR(cs); + } + + *cs++ = MI_LOAD_REGISTER_IMM(NUM_GPR_DW); + for (n = 0; n < NUM_GPR_DW; n++) { + *cs++ = CS_GPR(engine, n); + *cs++ = STACK_MAGIC; + } + *cs++ = MI_NOOP; + + intel_ring_advance(rq, cs); + i915_request_add(rq); + + return 0; +} + +static int __live_gpr_clear(struct i915_gem_context *fixme, + struct intel_engine_cs *engine, + struct i915_vma *scratch) +{ + struct intel_context *ce; + struct i915_request *rq; + u32 *cs; + int err; + int n; + + if (INTEL_GEN(engine->i915) < 9 && engine->class != RENDER_CLASS) + return 0; /* GPR only on rcs0 for gen8 */ + + err = gpr_make_dirty(engine); + if (err) + return err; + + ce = intel_context_create(fixme, engine); + if (IS_ERR(ce)) + return PTR_ERR(ce); + + rq = intel_context_create_request(ce); + if (IS_ERR(rq)) { + err = PTR_ERR(rq); + goto err_put; + } + + cs = intel_ring_begin(rq, 4 * NUM_GPR_DW); + if (IS_ERR(cs)) { + err = PTR_ERR(cs); + i915_request_add(rq); + goto err_put; + } + + for (n = 0; n < NUM_GPR_DW; n++) { + *cs++ = MI_STORE_REGISTER_MEM_GEN8 | MI_USE_GGTT; + *cs++ = CS_GPR(engine, n); + *cs++ = i915_ggtt_offset(scratch) + n * sizeof(u32); + *cs++ = 0; + } + + i915_request_get(rq); + i915_request_add(rq); + + if (i915_request_wait(rq, 0, HZ / 5) < 0) { + err = -ETIME; + goto err_rq; + } + + cs = i915_gem_object_pin_map(scratch->obj, I915_MAP_WB); + if (IS_ERR(cs)) { + err = PTR_ERR(cs); + goto err_rq; + } + + for (n = 0; n < NUM_GPR_DW; n++) { + if (cs[n]) { + pr_err("%s: GPR[%d].%s was not zero, found 0x%08x!\n", + engine->name, + n / 2, n & 1 ? "udw" : "ldw", + cs[n]); + err = -EINVAL; + break; + } + } + + i915_gem_object_unpin_map(scratch->obj); + +err_rq: + i915_request_put(rq); +err_put: + intel_context_put(ce); + return err; +} + +static int live_gpr_clear(void *arg) +{ + struct intel_gt *gt = arg; + struct intel_engine_cs *engine; + struct i915_gem_context *fixme; + struct i915_vma *scratch; + enum intel_engine_id id; + int err = 0; + + /* + * Check that GPR registers are cleared in new contexts as we need + * to avoid leaking any information from previous contexts. + */ + + fixme = kernel_context(gt->i915); + if (!fixme) + return -ENOMEM; + + scratch = create_scratch(gt); + if (IS_ERR(scratch)) { + err = PTR_ERR(scratch); + goto out_close; + } + + for_each_engine(engine, gt, id) { + err = __live_gpr_clear(fixme, engine, scratch); + if (err) + break; + } + + if (igt_flush_test(gt->i915)) + err = -EIO; + + i915_vma_unpin_and_release(&scratch, 0); +out_close: + kernel_context_close(fixme); + return err; +} + int intel_lrc_live_selftests(struct drm_i915_private *i915) { static const struct i915_subtest tests[] = { SUBTEST(live_lrc_layout), + SUBTEST(live_lrc_state), + SUBTEST(live_gpr_clear), }; if (!HAS_LOGICAL_RING_CONTEXTS(i915)) diff --git a/drivers/gpu/drm/i915/gt/selftest_reset.c b/drivers/gpu/drm/i915/gt/selftest_reset.c index d79482db7fe8..6efb9221b7fa 100644 --- a/drivers/gpu/drm/i915/gt/selftest_reset.c +++ b/drivers/gpu/drm/i915/gt/selftest_reset.c @@ -17,7 +17,7 @@ static int igt_global_reset(void *arg) /* Check that we can issue a global GPU reset */ igt_global_reset_lock(gt); - wakeref = intel_runtime_pm_get(>->i915->runtime_pm); + wakeref = intel_runtime_pm_get(gt->uncore->rpm); reset_count = i915_reset_count(>->i915->gpu_error); @@ -28,7 +28,7 @@ static int igt_global_reset(void *arg) err = -EINVAL; } - intel_runtime_pm_put(>->i915->runtime_pm, wakeref); + intel_runtime_pm_put(gt->uncore->rpm, wakeref); igt_global_reset_unlock(gt); if (intel_gt_is_wedged(gt)) @@ -45,14 +45,14 @@ static int igt_wedged_reset(void *arg) /* Check that we can recover a wedged device with a GPU reset */ igt_global_reset_lock(gt); - wakeref = intel_runtime_pm_get(>->i915->runtime_pm); + wakeref = intel_runtime_pm_get(gt->uncore->rpm); intel_gt_set_wedged(gt); GEM_BUG_ON(!intel_gt_is_wedged(gt)); intel_gt_reset(gt, ALL_ENGINES, NULL); - intel_runtime_pm_put(>->i915->runtime_pm, wakeref); + intel_runtime_pm_put(gt->uncore->rpm, wakeref); igt_global_reset_unlock(gt); return intel_gt_is_wedged(gt) ? -EIO : 0; @@ -125,7 +125,7 @@ static int igt_atomic_engine_reset(void *arg) if (!igt_force_reset(gt)) goto out_unlock; - for_each_engine(engine, gt->i915, id) { + for_each_engine(engine, gt, id) { tasklet_disable_nosync(&engine->execlists.tasklet); intel_engine_pm_get(engine); diff --git a/drivers/gpu/drm/i915/gt/selftest_timeline.c b/drivers/gpu/drm/i915/gt/selftest_timeline.c index d6df40cdc8a6..dac86f699a4c 100644 --- a/drivers/gpu/drm/i915/gt/selftest_timeline.c +++ b/drivers/gpu/drm/i915/gt/selftest_timeline.c @@ -35,7 +35,7 @@ static unsigned long hwsp_cacheline(struct intel_timeline *tl) #define CACHELINES_PER_PAGE (PAGE_SIZE / CACHELINE_BYTES) struct mock_hwsp_freelist { - struct drm_i915_private *i915; + struct intel_gt *gt; struct radix_tree_root cachelines; struct intel_timeline **history; unsigned long count, max; @@ -68,7 +68,7 @@ static int __mock_hwsp_timeline(struct mock_hwsp_freelist *state, unsigned long cacheline; int err; - tl = intel_timeline_create(&state->i915->gt, NULL); + tl = intel_timeline_create(state->gt, NULL); if (IS_ERR(tl)) return PTR_ERR(tl); @@ -106,6 +106,7 @@ static int __mock_hwsp_timeline(struct mock_hwsp_freelist *state, static int mock_hwsp_freelist(void *arg) { struct mock_hwsp_freelist state; + struct drm_i915_private *i915; const struct { const char *name; unsigned int flags; @@ -117,12 +118,14 @@ static int mock_hwsp_freelist(void *arg) unsigned int na; int err = 0; + i915 = mock_gem_device(); + if (!i915) + return -ENOMEM; + INIT_RADIX_TREE(&state.cachelines, GFP_KERNEL); state.prng = I915_RND_STATE_INITIALIZER(i915_selftest.random_seed); - state.i915 = mock_gem_device(); - if (!state.i915) - return -ENOMEM; + state.gt = &i915->gt; /* * Create a bunch of timelines and check that their HWSP do not overlap. @@ -151,7 +154,7 @@ out: __mock_hwsp_record(&state, na, NULL); kfree(state.history); err_put: - drm_dev_put(&state.i915->drm); + drm_dev_put(&i915->drm); return err; } @@ -476,11 +479,11 @@ out: } static struct intel_timeline * -checked_intel_timeline_create(struct drm_i915_private *i915) +checked_intel_timeline_create(struct intel_gt *gt) { struct intel_timeline *tl; - tl = intel_timeline_create(&i915->gt, NULL); + tl = intel_timeline_create(gt, NULL); if (IS_ERR(tl)) return tl; @@ -497,7 +500,7 @@ checked_intel_timeline_create(struct drm_i915_private *i915) static int live_hwsp_engine(void *arg) { #define NUM_TIMELINES 4096 - struct drm_i915_private *i915 = arg; + struct intel_gt *gt = arg; struct intel_timeline **timelines; struct intel_engine_cs *engine; enum intel_engine_id id; @@ -516,7 +519,7 @@ static int live_hwsp_engine(void *arg) return -ENOMEM; count = 0; - for_each_engine(engine, i915, id) { + for_each_engine(engine, gt, id) { if (!intel_engine_can_store_dword(engine)) continue; @@ -526,7 +529,7 @@ static int live_hwsp_engine(void *arg) struct intel_timeline *tl; struct i915_request *rq; - tl = checked_intel_timeline_create(i915); + tl = checked_intel_timeline_create(gt); if (IS_ERR(tl)) { err = PTR_ERR(tl); break; @@ -548,7 +551,7 @@ static int live_hwsp_engine(void *arg) break; } - if (igt_flush_test(i915)) + if (igt_flush_test(gt->i915)) err = -EIO; for (n = 0; n < count; n++) { @@ -570,7 +573,7 @@ static int live_hwsp_engine(void *arg) static int live_hwsp_alternate(void *arg) { #define NUM_TIMELINES 4096 - struct drm_i915_private *i915 = arg; + struct intel_gt *gt = arg; struct intel_timeline **timelines; struct intel_engine_cs *engine; enum intel_engine_id id; @@ -591,14 +594,14 @@ static int live_hwsp_alternate(void *arg) count = 0; for (n = 0; n < NUM_TIMELINES; n++) { - for_each_engine(engine, i915, id) { + for_each_engine(engine, gt, id) { struct intel_timeline *tl; struct i915_request *rq; if (!intel_engine_can_store_dword(engine)) continue; - tl = checked_intel_timeline_create(i915); + tl = checked_intel_timeline_create(gt); if (IS_ERR(tl)) { intel_engine_pm_put(engine); err = PTR_ERR(tl); @@ -620,7 +623,7 @@ static int live_hwsp_alternate(void *arg) } out: - if (igt_flush_test(i915)) + if (igt_flush_test(gt->i915)) err = -EIO; for (n = 0; n < count; n++) { @@ -641,8 +644,7 @@ out: static int live_hwsp_wrap(void *arg) { - struct drm_i915_private *i915 = arg; - struct intel_gt *gt = &i915->gt; + struct intel_gt *gt = arg; struct intel_engine_cs *engine; struct intel_timeline *tl; enum intel_engine_id id; @@ -664,7 +666,7 @@ static int live_hwsp_wrap(void *arg) if (err) goto out_free; - for_each_engine(engine, gt->i915, id) { + for_each_engine(engine, gt, id) { const u32 *hwsp_seqno[2]; struct i915_request *rq; u32 seqno[2]; @@ -740,7 +742,7 @@ static int live_hwsp_wrap(void *arg) } out: - if (igt_flush_test(i915)) + if (igt_flush_test(gt->i915)) err = -EIO; intel_timeline_unpin(tl); @@ -751,7 +753,7 @@ out_free: static int live_hwsp_recycle(void *arg) { - struct drm_i915_private *i915 = arg; + struct intel_gt *gt = arg; struct intel_engine_cs *engine; enum intel_engine_id id; unsigned long count; @@ -764,7 +766,7 @@ static int live_hwsp_recycle(void *arg) */ count = 0; - for_each_engine(engine, i915, id) { + for_each_engine(engine, gt, id) { IGT_TIMEOUT(end_time); if (!intel_engine_can_store_dword(engine)) @@ -776,7 +778,7 @@ static int live_hwsp_recycle(void *arg) struct intel_timeline *tl; struct i915_request *rq; - tl = checked_intel_timeline_create(i915); + tl = checked_intel_timeline_create(gt); if (IS_ERR(tl)) { err = PTR_ERR(tl); break; @@ -831,5 +833,5 @@ int intel_timeline_live_selftests(struct drm_i915_private *i915) if (intel_gt_is_wedged(&i915->gt)) return 0; - return i915_live_subtests(tests, i915); + return intel_gt_live_subtests(tests, &i915->gt); } diff --git a/drivers/gpu/drm/i915/gt/selftest_workarounds.c b/drivers/gpu/drm/i915/gt/selftest_workarounds.c index 95627e80f246..ef02920cec29 100644 --- a/drivers/gpu/drm/i915/gt/selftest_workarounds.c +++ b/drivers/gpu/drm/i915/gt/selftest_workarounds.c @@ -33,8 +33,32 @@ struct wa_lists { } engine[I915_NUM_ENGINES]; }; +static int request_add_sync(struct i915_request *rq, int err) +{ + i915_request_get(rq); + i915_request_add(rq); + if (i915_request_wait(rq, 0, HZ / 5) < 0) + err = -EIO; + i915_request_put(rq); + + return err; +} + +static int request_add_spin(struct i915_request *rq, struct igt_spinner *spin) +{ + int err = 0; + + i915_request_get(rq); + i915_request_add(rq); + if (spin && !igt_wait_for_spinner(spin, rq)) + err = -ETIMEDOUT; + i915_request_put(rq); + + return err; +} + static void -reference_lists_init(struct drm_i915_private *i915, struct wa_lists *lists) +reference_lists_init(struct intel_gt *gt, struct wa_lists *lists) { struct intel_engine_cs *engine; enum intel_engine_id id; @@ -42,10 +66,10 @@ reference_lists_init(struct drm_i915_private *i915, struct wa_lists *lists) memset(lists, 0, sizeof(*lists)); wa_init_start(&lists->gt_wa_list, "GT_REF", "global"); - gt_init_workarounds(i915, &lists->gt_wa_list); + gt_init_workarounds(gt->i915, &lists->gt_wa_list); wa_init_finish(&lists->gt_wa_list); - for_each_engine(engine, i915, id) { + for_each_engine(engine, gt, id) { struct i915_wa_list *wal = &lists->engine[id].wa_list; wa_init_start(wal, "REF", engine->name); @@ -59,12 +83,12 @@ reference_lists_init(struct drm_i915_private *i915, struct wa_lists *lists) } static void -reference_lists_fini(struct drm_i915_private *i915, struct wa_lists *lists) +reference_lists_fini(struct intel_gt *gt, struct wa_lists *lists) { struct intel_engine_cs *engine; enum intel_engine_id id; - for_each_engine(engine, i915, id) + for_each_engine(engine, gt, id) intel_wa_list_free(&lists->engine[id].wa_list); intel_wa_list_free(&lists->gt_wa_list); @@ -191,10 +215,10 @@ static int check_whitelist(struct i915_gem_context *ctx, err = 0; i915_gem_object_lock(results); - intel_wedge_on_timeout(&wedge, &ctx->i915->gt, HZ / 5) /* safety net! */ + intel_wedge_on_timeout(&wedge, engine->gt, HZ / 5) /* safety net! */ err = i915_gem_object_set_to_cpu_domain(results, false); i915_gem_object_unlock(results); - if (intel_gt_is_wedged(&ctx->i915->gt)) + if (intel_gt_is_wedged(engine->gt)) err = -EIO; if (err) goto out_put; @@ -243,7 +267,6 @@ switch_to_scratch_context(struct intel_engine_cs *engine, struct i915_gem_context *ctx; struct intel_context *ce; struct i915_request *rq; - intel_wakeref_t wakeref; int err = 0; ctx = kernel_context(engine->i915); @@ -255,9 +278,7 @@ switch_to_scratch_context(struct intel_engine_cs *engine, ce = i915_gem_context_get_engine(ctx, engine->legacy_idx); GEM_BUG_ON(IS_ERR(ce)); - rq = ERR_PTR(-ENODEV); - with_intel_runtime_pm(&engine->i915->runtime_pm, wakeref) - rq = igt_spinner_create_request(spin, ce, MI_NOOP); + rq = igt_spinner_create_request(spin, ce, MI_NOOP); intel_context_put(ce); @@ -267,13 +288,7 @@ switch_to_scratch_context(struct intel_engine_cs *engine, goto err; } - i915_request_add(rq); - - if (spin && !igt_wait_for_spinner(spin, rq)) { - pr_err("Spinner failed to start\n"); - err = -ETIMEDOUT; - } - + err = request_add_spin(rq, spin); err: if (err && spin) igt_spinner_end(spin); @@ -313,7 +328,7 @@ static int check_whitelist_across_reset(struct intel_engine_cs *engine, if (err) goto out_spin; - with_intel_runtime_pm(&i915->runtime_pm, wakeref) + with_intel_runtime_pm(engine->uncore->rpm, wakeref) err = reset(engine); igt_spinner_end(&spin); @@ -586,15 +601,11 @@ static int check_dirty_whitelist(struct i915_gem_context *ctx, goto err_request; err_request: - i915_request_add(rq); - if (err) - goto out_batch; - - if (i915_request_wait(rq, 0, HZ / 5) < 0) { + err = request_add_sync(rq, err); + if (err) { pr_err("%s: Futzing %x timedout; cancelling test\n", engine->name, reg); - intel_gt_set_wedged(&ctx->i915->gt); - err = -EIO; + intel_gt_set_wedged(engine->gt); goto out_batch; } @@ -693,34 +704,29 @@ out_scratch: static int live_dirty_whitelist(void *arg) { - struct drm_i915_private *i915 = arg; + struct intel_gt *gt = arg; struct intel_engine_cs *engine; struct i915_gem_context *ctx; enum intel_engine_id id; - intel_wakeref_t wakeref; struct drm_file *file; int err = 0; /* Can the user write to the whitelisted registers? */ - if (INTEL_GEN(i915) < 7) /* minimum requirement for LRI, SRM, LRM */ + if (INTEL_GEN(gt->i915) < 7) /* minimum requirement for LRI, SRM, LRM */ return 0; - wakeref = intel_runtime_pm_get(&i915->runtime_pm); + file = mock_file(gt->i915); + if (IS_ERR(file)) + return PTR_ERR(file); - file = mock_file(i915); - if (IS_ERR(file)) { - err = PTR_ERR(file); - goto out_rpm; - } - - ctx = live_context(i915, file); + ctx = live_context(gt->i915, file); if (IS_ERR(ctx)) { err = PTR_ERR(ctx); goto out_file; } - for_each_engine(engine, i915, id) { + for_each_engine(engine, gt, id) { if (engine->whitelist.count == 0) continue; @@ -730,43 +736,43 @@ static int live_dirty_whitelist(void *arg) } out_file: - mock_file_free(i915, file); -out_rpm: - intel_runtime_pm_put(&i915->runtime_pm, wakeref); + mock_file_free(gt->i915, file); return err; } static int live_reset_whitelist(void *arg) { - struct drm_i915_private *i915 = arg; - struct intel_engine_cs *engine = i915->engine[RCS0]; + struct intel_gt *gt = arg; + struct intel_engine_cs *engine; + enum intel_engine_id id; int err = 0; /* If we reset the gpu, we should not lose the RING_NONPRIV */ + igt_global_reset_lock(gt); - if (!engine || engine->whitelist.count == 0) - return 0; - - igt_global_reset_lock(&i915->gt); + for_each_engine(engine, gt, id) { + if (engine->whitelist.count == 0) + continue; - if (intel_has_reset_engine(&i915->gt)) { - err = check_whitelist_across_reset(engine, - do_engine_reset, - "engine"); - if (err) - goto out; - } + if (intel_has_reset_engine(gt)) { + err = check_whitelist_across_reset(engine, + do_engine_reset, + "engine"); + if (err) + goto out; + } - if (intel_has_gpu_reset(&i915->gt)) { - err = check_whitelist_across_reset(engine, - do_device_reset, - "device"); - if (err) - goto out; + if (intel_has_gpu_reset(gt)) { + err = check_whitelist_across_reset(engine, + do_device_reset, + "device"); + if (err) + goto out; + } } out: - igt_global_reset_unlock(&i915->gt); + igt_global_reset_unlock(gt); return err; } @@ -782,6 +788,14 @@ static int read_whitelisted_registers(struct i915_gem_context *ctx, if (IS_ERR(rq)) return PTR_ERR(rq); + i915_vma_lock(results); + err = i915_request_await_object(rq, results->obj, true); + if (err == 0) + err = i915_vma_move_to_active(results, rq, EXEC_OBJECT_WRITE); + i915_vma_unlock(results); + if (err) + goto err_req; + srm = MI_STORE_REGISTER_MEM; if (INTEL_GEN(ctx->i915) >= 8) srm++; @@ -807,12 +821,7 @@ static int read_whitelisted_registers(struct i915_gem_context *ctx, intel_ring_advance(rq, cs); err_req: - i915_request_add(rq); - - if (i915_request_wait(rq, 0, HZ / 5) < 0) - err = -EIO; - - return err; + return request_add_sync(rq, err); } static int scrub_whitelisted_registers(struct i915_gem_context *ctx, @@ -872,9 +881,7 @@ static int scrub_whitelisted_registers(struct i915_gem_context *ctx, err = engine->emit_bb_start(rq, batch->node.start, 0, 0); err_request: - i915_request_add(rq); - if (i915_request_wait(rq, 0, HZ / 5) < 0) - err = -EIO; + err = request_add_sync(rq, err); err_unpin: i915_gem_object_unpin_map(batch->obj); @@ -991,7 +998,7 @@ err_a: static int live_isolated_whitelist(void *arg) { - struct drm_i915_private *i915 = arg; + struct intel_gt *gt = arg; struct { struct i915_gem_context *ctx; struct i915_vma *scratch[2]; @@ -1005,17 +1012,14 @@ static int live_isolated_whitelist(void *arg) * invisible to a second context. */ - if (!intel_engines_has_context_isolation(i915)) - return 0; - - if (!i915->kernel_context->vm) + if (!intel_engines_has_context_isolation(gt->i915)) return 0; for (i = 0; i < ARRAY_SIZE(client); i++) { struct i915_address_space *vm; struct i915_gem_context *c; - c = kernel_context(i915); + c = kernel_context(gt->i915); if (IS_ERR(c)) { err = PTR_ERR(c); goto err; @@ -1044,7 +1048,10 @@ static int live_isolated_whitelist(void *arg) i915_vm_put(vm); } - for_each_engine(engine, i915, id) { + for_each_engine(engine, gt, id) { + if (!engine->kernel_context->vm) + continue; + if (!whitelist_writable_count(engine)) continue; @@ -1098,7 +1105,7 @@ err: kernel_context_close(client[i].ctx); } - if (igt_flush_test(i915)) + if (igt_flush_test(gt->i915)) err = -EIO; return err; @@ -1133,16 +1140,16 @@ verify_wa_lists(struct i915_gem_context *ctx, struct wa_lists *lists, static int live_gpu_reset_workarounds(void *arg) { - struct drm_i915_private *i915 = arg; + struct intel_gt *gt = arg; struct i915_gem_context *ctx; intel_wakeref_t wakeref; struct wa_lists lists; bool ok; - if (!intel_has_gpu_reset(&i915->gt)) + if (!intel_has_gpu_reset(gt)) return 0; - ctx = kernel_context(i915); + ctx = kernel_context(gt->i915); if (IS_ERR(ctx)) return PTR_ERR(ctx); @@ -1150,25 +1157,25 @@ live_gpu_reset_workarounds(void *arg) pr_info("Verifying after GPU reset...\n"); - igt_global_reset_lock(&i915->gt); - wakeref = intel_runtime_pm_get(&i915->runtime_pm); + igt_global_reset_lock(gt); + wakeref = intel_runtime_pm_get(gt->uncore->rpm); - reference_lists_init(i915, &lists); + reference_lists_init(gt, &lists); ok = verify_wa_lists(ctx, &lists, "before reset"); if (!ok) goto out; - intel_gt_reset(&i915->gt, ALL_ENGINES, "live_workarounds"); + intel_gt_reset(gt, ALL_ENGINES, "live_workarounds"); ok = verify_wa_lists(ctx, &lists, "after reset"); out: i915_gem_context_unlock_engines(ctx); kernel_context_close(ctx); - reference_lists_fini(i915, &lists); - intel_runtime_pm_put(&i915->runtime_pm, wakeref); - igt_global_reset_unlock(&i915->gt); + reference_lists_fini(gt, &lists); + intel_runtime_pm_put(gt->uncore->rpm, wakeref); + igt_global_reset_unlock(gt); return ok ? 0 : -ESRCH; } @@ -1176,7 +1183,7 @@ out: static int live_engine_reset_workarounds(void *arg) { - struct drm_i915_private *i915 = arg; + struct intel_gt *gt = arg; struct i915_gem_engines_iter it; struct i915_gem_context *ctx; struct intel_context *ce; @@ -1186,17 +1193,17 @@ live_engine_reset_workarounds(void *arg) struct wa_lists lists; int ret = 0; - if (!intel_has_reset_engine(&i915->gt)) + if (!intel_has_reset_engine(gt)) return 0; - ctx = kernel_context(i915); + ctx = kernel_context(gt->i915); if (IS_ERR(ctx)) return PTR_ERR(ctx); - igt_global_reset_lock(&i915->gt); - wakeref = intel_runtime_pm_get(&i915->runtime_pm); + igt_global_reset_lock(gt); + wakeref = intel_runtime_pm_get(gt->uncore->rpm); - reference_lists_init(i915, &lists); + reference_lists_init(gt, &lists); for_each_gem_engine(ce, i915_gem_context_lock_engines(ctx), it) { struct intel_engine_cs *engine = ce->engine; @@ -1229,12 +1236,10 @@ live_engine_reset_workarounds(void *arg) goto err; } - i915_request_add(rq); - - if (!igt_wait_for_spinner(&spin, rq)) { + ret = request_add_spin(rq, &spin); + if (ret) { pr_err("Spinner failed to start\n"); igt_spinner_fini(&spin); - ret = -ETIMEDOUT; goto err; } @@ -1251,12 +1256,12 @@ live_engine_reset_workarounds(void *arg) } err: i915_gem_context_unlock_engines(ctx); - reference_lists_fini(i915, &lists); - intel_runtime_pm_put(&i915->runtime_pm, wakeref); - igt_global_reset_unlock(&i915->gt); + reference_lists_fini(gt, &lists); + intel_runtime_pm_put(gt->uncore->rpm, wakeref); + igt_global_reset_unlock(gt); kernel_context_close(ctx); - igt_flush_test(i915); + igt_flush_test(gt->i915); return ret; } @@ -1274,5 +1279,5 @@ int intel_workarounds_live_selftests(struct drm_i915_private *i915) if (intel_gt_is_wedged(&i915->gt)) return 0; - return i915_subtests(tests, i915); + return intel_gt_live_subtests(tests, &i915->gt); } diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.c b/drivers/gpu/drm/i915/gt/uc/intel_guc.c index 249c747e9756..37f7bcbf7dac 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.c @@ -9,6 +9,27 @@ #include "intel_guc_submission.h" #include "i915_drv.h" +/** + * DOC: GuC + * + * The GuC is a microcontroller inside the GT HW, introduced in gen9. The GuC is + * designed to offload some of the functionality usually performed by the host + * driver; currently the main operations it can take care of are: + * + * - Authentication of the HuC, which is required to fully enable HuC usage. + * - Low latency graphics context scheduling (a.k.a. GuC submission). + * - GT Power management. + * + * The enable_guc module parameter can be used to select which of those + * operations to enable within GuC. Note that not all the operations are + * supported on all gen9+ platforms. + * + * Enabling the GuC is not mandatory and therefore the firmware is only loaded + * if at least one of the operations is selected. However, not loading the GuC + * might result in the loss of some features that do require the GuC (currently + * just the HuC, but more are expected to land in the future). + */ + static void gen8_guc_raise_irq(struct intel_guc *guc) { struct intel_gt *gt = guc_to_gt(guc); @@ -548,9 +569,15 @@ int intel_guc_resume(struct intel_guc *guc) } /** - * DOC: GuC Address Space + * DOC: GuC Memory Management * - * The layout of GuC address space is shown below: + * GuC can't allocate any memory for its own usage, so all the allocations must + * be handled by the host driver. GuC accesses the memory via the GGTT, with the + * exception of the top and bottom parts of the 4GB address space, which are + * instead re-mapped by the GuC HW to memory location of the FW itself (WOPCM) + * or other parts of the HW. The driver must take care not to place objects that + * the GuC is going to access in these reserved ranges. The layout of the GuC + * address space is shown below: * * :: * diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c index 36332064de9c..2cf2d3314f62 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c @@ -607,7 +607,6 @@ out_unlock: void intel_guc_log_relay_flush(struct intel_guc_log *log) { struct intel_guc *guc = log_to_guc(log); - struct drm_i915_private *i915 = guc_to_gt(guc)->i915; intel_wakeref_t wakeref; /* @@ -616,7 +615,7 @@ void intel_guc_log_relay_flush(struct intel_guc_log *log) */ flush_work(&log->relay.flush_work); - with_intel_runtime_pm(&i915->runtime_pm, wakeref) + with_intel_runtime_pm(guc_to_gt(guc)->uncore->rpm, wakeref) guc_action_flush_log(guc); /* GuC would have updated log buffer by now, so capture it */ diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c index f325d3dd564f..009e54a3764f 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c @@ -29,6 +29,12 @@ enum { /** * DOC: GuC-based command submission * + * IMPORTANT NOTE: GuC submission is currently not supported in i915. The GuC + * firmware is moving to an updated submission interface and we plan to + * turn submission back on when that lands. The below documentation (and related + * code) matches the old submission model and will be updated as part of the + * upgrade to the new flow. + * * GuC client: * A intel_guc_client refers to a submission path through GuC. Currently, there * is only one client, which is charged with all submissions to the GuC. This @@ -1014,7 +1020,7 @@ static void guc_interrupts_capture(struct intel_gt *gt) * to GuC */ irqs = _MASKED_BIT_ENABLE(GFX_INTERRUPT_STEERING); - for_each_engine(engine, gt->i915, id) + for_each_engine(engine, gt, id) ENGINE_WRITE(engine, RING_MODE_GEN7, irqs); /* route USER_INTERRUPT to Host, all others are sent to GuC. */ @@ -1062,7 +1068,7 @@ static void guc_interrupts_release(struct intel_gt *gt) */ irqs = _MASKED_FIELD(GFX_FORWARD_VBLANK_MASK, GFX_FORWARD_VBLANK_NEVER); irqs |= _MASKED_BIT_DISABLE(GFX_INTERRUPT_STEERING); - for_each_engine(engine, gt->i915, id) + for_each_engine(engine, gt, id) ENGINE_WRITE(engine, RING_MODE_GEN7, irqs); /* route all GT interrupts to the host */ @@ -1145,7 +1151,7 @@ int intel_guc_submission_enable(struct intel_guc *guc) /* Take over from manual control of ELSP (execlists) */ guc_interrupts_capture(gt); - for_each_engine(engine, gt->i915, id) { + for_each_engine(engine, gt, id) { engine->set_default_submission = guc_set_default_submission; engine->set_default_submission(engine); } diff --git a/drivers/gpu/drm/i915/gt/uc/intel_huc.c b/drivers/gpu/drm/i915/gt/uc/intel_huc.c index d4625c97b4f9..8be515c8d0f0 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_huc.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_huc.c @@ -9,6 +9,34 @@ #include "intel_huc.h" #include "i915_drv.h" +/** + * DOC: HuC + * + * The HuC is a dedicated microcontroller for usage in media HEVC (High + * Efficiency Video Coding) operations. Userspace can directly use the firmware + * capabilities by adding HuC specific commands to batch buffers. + * + * The kernel driver is only responsible for loading the HuC firmware and + * triggering its security authentication, which is performed by the GuC. For + * The GuC to correctly perform the authentication, the HuC binary must be + * loaded before the GuC one. Loading the HuC is optional; however, not using + * the HuC might negatively impact power usage and/or performance of media + * workloads, depending on the use-cases. + * + * See https://github.com/intel/media-driver for the latest details on HuC + * functionality. + */ + +/** + * DOC: HuC Memory Management + * + * Similarly to the GuC, the HuC can't do any memory allocations on its own, + * with the difference being that the allocations for HuC usage are handled by + * the userspace driver instead of the kernel one. The HuC accesses the memory + * via the PPGTT belonging to the context loaded on the VCS executing the + * HuC-specific commands. + */ + void intel_huc_init_early(struct intel_huc *huc) { struct drm_i915_private *i915 = huc_to_gt(huc)->i915; @@ -118,10 +146,9 @@ void intel_huc_fini(struct intel_huc *huc) * * Called after HuC and GuC firmware loading during intel_uc_init_hw(). * - * This function pins HuC firmware image object into GGTT. - * Then it invokes GuC action to authenticate passing the offset to RSA - * signature through intel_guc_auth_huc(). It then waits for 50ms for - * firmware verification ACK and unpins the object. + * This function invokes the GuC action to authenticate the HuC firmware, + * passing the offset of the RSA signature to intel_guc_auth_huc(). It then + * waits for up to 50ms for firmware verification ACK. */ int intel_huc_auth(struct intel_huc *huc) { @@ -185,7 +212,7 @@ int intel_huc_check_status(struct intel_huc *huc) if (!intel_huc_is_supported(huc)) return -ENODEV; - with_intel_runtime_pm(>->i915->runtime_pm, wakeref) + with_intel_runtime_pm(gt->uncore->rpm, wakeref) status = intel_uncore_read(gt->uncore, huc->status.reg); return (status & huc->status.mask) == huc->status.value; diff --git a/drivers/gpu/drm/i915/gt/uc/intel_huc_fw.c b/drivers/gpu/drm/i915/gt/uc/intel_huc_fw.c index 74602487ed67..d654340d4d03 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_huc_fw.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_huc_fw.c @@ -8,21 +8,6 @@ #include "i915_drv.h" /** - * DOC: HuC Firmware - * - * Motivation: - * GEN9 introduces a new dedicated firmware for usage in media HEVC (High - * Efficiency Video Coding) operations. Userspace can use the firmware - * capabilities by adding HuC specific commands to batch buffers. - * - * Implementation: - * The same firmware loader is used as the GuC. However, the actual - * loading to HW is deferred until GEM initialization is done. - * - * Note that HuC firmware loading must be done before GuC loading. - */ - -/** * intel_huc_fw_init_early() - initializes HuC firmware struct * @huc: intel_huc struct * diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc.c b/drivers/gpu/drm/i915/gt/uc/intel_uc.c index 29a9eec60d2e..3fdbc935d155 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_uc.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_uc.c @@ -587,7 +587,7 @@ void intel_uc_suspend(struct intel_uc *uc) if (!intel_guc_is_running(guc)) return; - with_intel_runtime_pm(&uc_to_gt(uc)->i915->runtime_pm, wakeref) + with_intel_runtime_pm(uc_to_gt(uc)->uncore->rpm, wakeref) intel_uc_runtime_suspend(uc); } diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc_fw_abi.h b/drivers/gpu/drm/i915/gt/uc/intel_uc_fw_abi.h index f8f6c91a0df6..029214cdedd5 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_uc_fw_abi.h +++ b/drivers/gpu/drm/i915/gt/uc/intel_uc_fw_abi.h @@ -39,9 +39,6 @@ * 3. Length info of each component can be found in header, in dwords. * 4. Modulus and exponent key are not required by driver. They may not appear * in fw. So driver will load a truncated firmware in this case. - * - * The only difference between GuC and HuC firmwares is how the version - * information is saved. */ struct uc_css_header { diff --git a/drivers/gpu/drm/i915/gt/uc/selftest_guc.c b/drivers/gpu/drm/i915/gt/uc/selftest_guc.c index f927f851aadf..d8a80388bd31 100644 --- a/drivers/gpu/drm/i915/gt/uc/selftest_guc.c +++ b/drivers/gpu/drm/i915/gt/uc/selftest_guc.c @@ -108,22 +108,15 @@ static bool client_doorbell_in_sync(struct intel_guc_client *client) * validating that the doorbells status expected by the driver matches what the * GuC/HW have. */ -static int igt_guc_clients(void *args) +static int igt_guc_clients(void *arg) { - struct drm_i915_private *dev_priv = args; + struct intel_gt *gt = arg; + struct intel_guc *guc = >->uc.guc; intel_wakeref_t wakeref; - struct intel_guc *guc; int err = 0; - GEM_BUG_ON(!HAS_GT_UC(dev_priv)); - wakeref = intel_runtime_pm_get(&dev_priv->runtime_pm); - - guc = &dev_priv->gt.uc.guc; - if (!guc) { - pr_err("No guc object!\n"); - err = -EINVAL; - goto unlock; - } + GEM_BUG_ON(!HAS_GT_UC(gt->i915)); + wakeref = intel_runtime_pm_get(gt->uncore->rpm); err = check_all_doorbells(guc); if (err) @@ -188,7 +181,7 @@ out: guc_clients_create(guc); guc_clients_enable(guc); unlock: - intel_runtime_pm_put(&dev_priv->runtime_pm, wakeref); + intel_runtime_pm_put(gt->uncore->rpm, wakeref); return err; } @@ -199,21 +192,14 @@ unlock: */ static int igt_guc_doorbells(void *arg) { - struct drm_i915_private *dev_priv = arg; + struct intel_gt *gt = arg; + struct intel_guc *guc = >->uc.guc; intel_wakeref_t wakeref; - struct intel_guc *guc; int i, err = 0; u16 db_id; - GEM_BUG_ON(!HAS_GT_UC(dev_priv)); - wakeref = intel_runtime_pm_get(&dev_priv->runtime_pm); - - guc = &dev_priv->gt.uc.guc; - if (!guc) { - pr_err("No guc object!\n"); - err = -EINVAL; - goto unlock; - } + GEM_BUG_ON(!HAS_GT_UC(gt->i915)); + wakeref = intel_runtime_pm_get(gt->uncore->rpm); err = check_all_doorbells(guc); if (err) @@ -295,19 +281,19 @@ out: guc_client_free(clients[i]); } unlock: - intel_runtime_pm_put(&dev_priv->runtime_pm, wakeref); + intel_runtime_pm_put(gt->uncore->rpm, wakeref); return err; } -int intel_guc_live_selftest(struct drm_i915_private *dev_priv) +int intel_guc_live_selftest(struct drm_i915_private *i915) { static const struct i915_subtest tests[] = { SUBTEST(igt_guc_clients), SUBTEST(igt_guc_doorbells), }; - if (!USES_GUC_SUBMISSION(dev_priv)) + if (!USES_GUC_SUBMISSION(i915)) return 0; - return i915_subtests(tests, dev_priv); + return intel_gt_live_subtests(tests, &i915->gt); } |