diff options
author | Chris Wilson <chris@chris-wilson.co.uk> | 2019-02-26 09:49:19 +0000 |
---|---|---|
committer | Chris Wilson <chris@chris-wilson.co.uk> | 2019-02-26 09:55:31 +0000 |
commit | 89531e7d8ee8602b2723431a581250d5d0ec2913 (patch) | |
tree | fccf27612b7590e765eca61cdd8e13b38f57326d /drivers/gpu/drm/i915/intel_engine_cs.c | |
parent | 37fc7845df7b6ab9b2885b42e8de8ebcf33bee7a (diff) |
drm/i915: Replace global_seqno with a hangcheck heartbeat seqno
To determine whether an engine has 'stuck', we simply check whether or
not is still on the same seqno for several seconds. To keep this simple
mechanism intact over the loss of a global seqno, we can simply add a
new global heartbeat seqno instead. As we cannot know the sequence in
which requests will then be completed, we use a primitive random number
generator instead (with a cycle long enough to not matter over an
interval of a few thousand requests between hangcheck samples).
The alternative to using a dedicated seqno on every request is to issue
a heartbeat request and query its progress through the system. Sadly
this requires us to reduce struct_mutex so that we can issue requests
without requiring that bkl.
v2: And without the extra CS_STALL for the hangcheck seqno -- we don't
need strict serialisation with what comes later, we just need to be sure
we don't write the hangcheck seqno before our batch is flushed.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190226094922.31617-1-chris@chris-wilson.co.uk
Diffstat (limited to 'drivers/gpu/drm/i915/intel_engine_cs.c')
-rw-r--r-- | drivers/gpu/drm/i915/intel_engine_cs.c | 5 |
1 files changed, 3 insertions, 2 deletions
diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c index 81b80f8fd9ea..57bc5c4fb3ff 100644 --- a/drivers/gpu/drm/i915/intel_engine_cs.c +++ b/drivers/gpu/drm/i915/intel_engine_cs.c @@ -1497,10 +1497,11 @@ void intel_engine_dump(struct intel_engine_cs *engine, if (i915_reset_failed(engine->i915)) drm_printf(m, "*** WEDGED ***\n"); - drm_printf(m, "\tcurrent seqno %x, last %x, hangcheck %x [%d ms]\n", + drm_printf(m, "\tcurrent seqno %x, last %x, hangcheck %x/%x [%d ms]\n", intel_engine_get_seqno(engine), intel_engine_last_submit(engine), - engine->hangcheck.seqno, + engine->hangcheck.last_seqno, + engine->hangcheck.next_seqno, jiffies_to_msecs(jiffies - engine->hangcheck.action_timestamp)); drm_printf(m, "\tReset count: %d (global %d)\n", i915_reset_engine_count(error, engine), |