summaryrefslogtreecommitdiff
path: root/drivers/gpu
AgeCommit message (Collapse)Author
2016-07-06drm/i915: Group the irq breadcrumb variables into the same cachelineChris Wilson
As we inspect both the tasklet (to check for an active bottom-half) and set the irq-posted flag at the same time (both in the interrupt handler and then in the bottom-halt), group those two together into the same cacheline. (Not having total control over placement of the struct means we can't guarantee the cacheline boundary, we need to align the kmalloc and then each struct, but the grouping should help.) v2: Try a couple of different names for the state touched by the user interrupt handler. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1467805142-22219-3-git-send-email-chris@chris-wilson.co.uk
2016-07-06drm/i915: Wake up the bottom-half if we steal their interruptChris Wilson
Following on from the scenario Tvrtko envisioned to explain a hard-to-hit race with multiple first waiters, we could also then race in the __i915_request_irq_complete() and the bottom-half may miss the vital irq-seqno barrier and so go to sleep not noticing their seqno is complete. v2: unlock, not double lock the rcu_read_lock. Fixes: 3d5564e91025 ("drm/i915: Only apply one barrier after...") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1467805142-22219-2-git-send-email-chris@chris-wilson.co.uk
2016-07-06drm/i915: Always double check for a missed interrupt for new bottom halvesChris Wilson
After assigning ourselves as the new bottom-half, we must perform a cursory check to prevent a missed interrupt. Either we miss the interrupt whilst programming the hardware, or if there was a previous waiter (for a later seqno) they may be woken instead of us (due to the inherent race in the unlocked read of b->tasklet in the irq handler) and so we miss the wake up. Spotted-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96806 Fixes: 688e6c725816 ("drm/i915: Slaughter the thundering... herd") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1467805142-22219-1-git-send-email-chris@chris-wilson.co.uk
2016-07-05drm/i915: Convert dev_priv->dev backpointers to dev_priv->drmChris Wilson
Since drm_i915_private is now a subclass of drm_device we do not need to chase the drm_i915_private->dev backpointer and can instead simply access drm_i915_private->drm directly. text data bss dec hex filename 1068757 4565 416 1073738 10624a drivers/gpu/drm/i915/i915.ko 1066949 4565 416 1071930 105b3a drivers/gpu/drm/i915/i915.ko Created by the coccinelle script: @@ struct drm_i915_private *d; identifier i; @@ ( - d->dev->i + d->drm.i | - d->dev + &d->drm ) and for good measure the dev_priv->dev backpointer was removed entirely. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1467711623-2905-4-git-send-email-chris@chris-wilson.co.uk
2016-07-05drm/i915: Remove impossible tests for dev->dev_privateChris Wilson
If we have a drm_device, we have a drm_i915_private (since they are the same). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Imre Deak <imre.deak@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1467711623-2905-3-git-send-email-chris@chris-wilson.co.uk Reviewed-by: Matthew Auld <matthew.auld@intel.com>
2016-07-05drm/i915: Remove use of dev_priv->dev backpointer in __i915_printk()Chris Wilson
As we can just directly use drm_dev->drm.dev, we do not need the drm_dev->dev backpointer anymore and can also loose the warning about order of __i915_printk() and our initialisation (which is now always safe). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Imre Deak <imre.deak@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1467711623-2905-2-git-send-email-chris@chris-wilson.co.uk Reviewed-by: Matthew Auld <matthew.auld@intel.com>
2016-07-05drm/i915: Split out runtime configuration of device info to its own fileChris Wilson
Let's reclaim a few hundred lines from i915_drv.c by splitting out the runtime configuration of the "constant" dev_priv->info. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1467711623-2905-1-git-send-email-chris@chris-wilson.co.uk Reviewed-by: Matthew Auld <matthew.auld@intel.com>
2016-07-05drm/i915: Skip capturing an error state if we already have oneChris Wilson
As we only ever keep the first error state around, we can avoid some work that can be quite intrusive if we don't record the error the second time around. This does move the race whereby the user could discard one error state as the second is being captured, but that race exists in the current code and we hope that recapturing error state is only done for debugging. Note that as we discard the error state for simulated errors, igt that exercise error capture continue to function. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1467618513-4966-3-git-send-email-chris@chris-wilson.co.uk Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
2016-07-05drm/i915: Clean up GPU hang messageChris Wilson
Remove some redundant kernel messages as we deduce a hung GPU and capture the error state. v2: Fix "hang" vs "no progress" message whilst I was there v3: s/snprintf/scnprintf/ Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1467618513-4966-2-git-send-email-chris@chris-wilson.co.uk Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
2016-07-05drm/i915: Amalgamate gen6_mm_switch() and vgpu_mm_switch()Chris Wilson
These are identical, so let's just use the same vfunc. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1467618513-4966-1-git-send-email-chris@chris-wilson.co.uk Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
2016-07-05drm/i915: Replace lockless_dereference(bool) with READ_ONCE()Chris Wilson
After Joonas complained about using READ_ONCE() on the only use of the variable in the function, where the intent was to simply document that the read was intentionally racy and unlocked, I switched the READ_ONCE() over to lockless_dereference(). However, in linux-next that has a stronger type-check to only allow pointers and is no longer interchangeable with READ_ONCE(), see commit 331b6d8c7afc ("locking/barriers: Validate lockless_dereference() is used on a pointer type") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Fixes: 67d97da34917 ("drm/i915: Only start retire worker when idle") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1467705276-707-1-git-send-email-chris@chris-wilson.co.uk Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
2016-07-04drm/i915: Explicitly convert some macros to boolean valuesTvrtko Ursulin
Some IS_ and HAS_ macros can return any non-zero value for true. One potential problem with that is that someone could assign them to integers and be surprised with the result. Therefore it is probably safer to do the conversion to 0/1 in the macros themselves. Luckily this does not seem to have an effect on code size. Only one call site was getting bit by this and a patch for that has been sent as "drm/i915/guc: Protect against HAS_GUC_* returning true values other than one". v2: Added some extra braces as suggested by checkpatch. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1467643823-9798-1-git-send-email-tvrtko.ursulin@linux.intel.com
2016-07-04drm/i915: convert a few more E->dev_private to to_i915(E)Dave Gordon
Also remove some redundant dev and dev_priv locals Signed-off-by: Dave Gordon <david.s.gordon@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1467626365-29871-1-git-send-email-david.s.gordon@intel.com Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1467628477-25379-2-git-send-email-chris@chris-wilson.co.uk
2016-07-04drm/i915: Mass convert dev->dev_private to to_i915(dev)Chris Wilson
Since we now subclass struct drm_device, we can save pointer dances by noting the equivalence of struct drm_device and struct drm_i915_private, i.e. by using to_i915(). text data bss dec hex filename 1073824 4562 416 1078802 107612 drivers/gpu/drm/i915/i915.ko 1068976 4562 416 1073954 106322 drivers/gpu/drm/i915/i915.ko Created by the coccinelle script: @@ expression E; identifier p; @@ - struct drm_i915_private *p = E->dev_private; + struct drm_i915_private *p = to_i915(E); Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Dave Gordon <david.s.gordon@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1467628477-25379-1-git-send-email-chris@chris-wilson.co.uk
2016-07-04i915/guc: Add Kabylake GuC LoadingPeter Antoine
This patch added the loading of the GuC for Kabylake. It loads a 9.14 firmware. v2: Fix commit message v3: Fix major/minor var names to match -nightly. (Rodrigo) Cc: Christophe Prigent <christophe.prigent@intel.com> Signed-off-by: Peter Antoine <peter.antoine@intel.com> Signed-off-by: Michel Thierry <michel.thierry@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Michel Thierry <michel.thierry@intel.com> (v3) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1467304672-2106-1-git-send-email-rodrigo.vivi@intel.com
2016-07-04Revert "drm/i915/kbl: drm/i915: Avoid GuC loading for now on Kabylake."Peter Antoine
This reverts commit 2b81b84471b9 Cc: Christophe Prigent <christophe.prigent@intel.com> Signed-off-by: Peter Antoine <peter.antoine@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2016-07-04drm/i915: Limit i915_ring_test_irq debugfs to actual ringsChris Wilson
For simplicity in testing, only report known rings in the mask. This allows userspace to try and trigger a missed irq on every ring and do a comparison between i915_ring_test_irq and i915_ring_missed_irq to see if any rings failed. v2: Move the debug message to after the rings are selected (so that the message accurately reflects reality) Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1466170505-8048-1-git-send-email-chris@chris-wilson.co.uk Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
2016-07-04drm/i915: Hold irq uncore.lock when initialising fw_domainsChris Wilson
Acquiring the forcewake domain asserts that it is in an atomic section (as we always expect to be under the uncore.lock). This is true except for initialising the domains on Ivybridge, and so we generate a warning. Wrap the manual usage of fw_domains inside the spin_lock. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1467566973-13596-1-git-send-email-chris@chris-wilson.co.uk Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2016-07-04drm/i915: Allow userspace to request no-error-capture upon GPU hangsChris Wilson
igt likes to inject GPU hangs into its command streams. However, as we expect these hangs, we don't actually want them recorded in the dmesg output or stored in the i915_error_state (usually). To accommodate this allow userspace to set a flag on the context that any hang emanating from that context will not be recorded. We still do the error capture (otherwise how do we find the guilty context and know its intent?) as part of the reason for random GPU hang injection is to exercise the race conditions between the error capture and normal execution. v2: Split out the request->ringbuf error capture changes. v3: Move the flag defines next to the intel_context->flags definition Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Dave Gordon <david.s.gordon@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1467616119-4093-9-git-send-email-chris@chris-wilson.co.uk
2016-07-04drm/i915: Record the ringbuffer associated with the requestChris Wilson
The request tells us where to read the ringbuf from, so use that information to simplify the error capture. If no request was active at the time of the hang, the ring is idle and there is no information inside the ring pertaining to the hang. Note carefully that this will reduce the amount of information stored in the error state - any ring without an active request will not be recorded. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Dave Gordon <david.s.gordon@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1467616119-4093-8-git-send-email-chris@chris-wilson.co.uk
2016-07-04drm/i915: Remove stop-rings debugfs interfaceChris Wilson
Now that we have (near) universal GPU recovery code, we can inject a real hang from userspace and not need any fakery. Not only does this mean that the testing is far more realistic, but we can simplify the kernel in the process. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Arun Siluvery <arun.siluvery@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1467616119-4093-7-git-send-email-chris@chris-wilson.co.uk
2016-07-04drm/i915: Flush the RPS bottom-half when the GPU idlesChris Wilson
Make sure that the RPS bottom-half is flushed before we set the idle frequency when we decide the GPU is idle. This should prevent any races with the bottom-half and setting the idle frequency, and ensures that the bottom-half is bounded by the GPU's rpm reference taken for when it is active (i.e. between gen6_rps_busy() and gen6_rps_idle()). v2: Avoid recursively using the i915->wq - RPS does not touch the struct_mutex so has no place being on the ordered i915->wq. v3: Enable/disable interrupts for RPS busy/idle in order to prevent further HW access from RPS outside of the wakeref. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Imre Deak <imre.deak@intel.com> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> References: https://bugs.freedesktop.org/show_bug.cgi?id=89728 Reviewed-by: MichaƂ Winiarski <michal.winiarski@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1467616119-4093-6-git-send-email-chris@chris-wilson.co.uk
2016-07-04drm/i915: Add background commentary to "waitboosting"Chris Wilson
Describe the intent of boosting the GPU frequency to maximum before waiting on the GPU. RPS waitboosting was introduced with commit b29c19b64528 ("drm/i915: Boost RPS frequency for CPU stalls") but lacked a concise comment in the code to explain itself. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/1467616119-4093-5-git-send-email-chris@chris-wilson.co.uk
2016-07-04drm/i915: Restore waitboost credit to the synchronous waiterChris Wilson
Ideally, we want to automagically have the GPU respond to the instantaneous load by reclocking itself. However, reclocking occurs relatively slowly, and to the client waiting for a result from the GPU, too late. To compensate and reduce the client latency, we allow the first wait from a client to boost the GPU clocks to maximum. This overcomes the lag in autoreclocking, at the expense of forcing the GPU clocks too high. So to offset the excessive power usage, we currently allow a client to only boost the clocks once before we detect the GPU is idle again. This works reasonably for say the first frame in a benchmark, but for many more synchronous workloads (like OpenCL) we find the GPU clocks remain too low. By noting a wait which would idle the GPU (i.e. we just waited upon the last known request), we can give that client the idle boost credit (for their next wait) without the 100ms delay required for us to detect the GPU idle state. The intention is to boost clients that are stalling in the process of feeding the GPU more work (and who in doing so let the GPU idle), without granting boost credits to clients that are throttling themselves (such as compositors). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: "Zou, Nanhai" <nanhai.zou@intel.com> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/1467616119-4093-4-git-send-email-chris@chris-wilson.co.uk
2016-07-04drm/i915: Remove redundant queue_delayed_work() from throttle ioctlChris Wilson
We know, by design, that whilst the GPU is active (and thus we are throttling) the retire_worker is queued. Therefore attempting to requeue it with queue_delayed_work() is a no-op and we can safely remove it. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1467616119-4093-3-git-send-email-chris@chris-wilson.co.uk
2016-07-04drm/i915: Do not keep postponing the idle-workChris Wilson
Rather than persistently postponing the idle-work everytime somebody calls i915_gem_retire_requests() (potentially ensuring that we never reach the idle state), queue the work the first time we detect all requests are complete. Then if in 100ms, more requests have been queued, we will abort the idle-worker and wait again until all the new requests have been completed. Of course, this does depend upon the idle worker cancelling itself gracefully from the previous patch. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1467616119-4093-2-git-send-email-chris@chris-wilson.co.uk
2016-07-04drm/i915: Only start retire worker when idleChris Wilson
The retire worker is a low frequency task that makes sure we retire outstanding requests if userspace is being lax. We only need to start it once as it remains active until the GPU is idle, so do a cheap test before the more expensive queue_work(). A consequence of this is that we need correct locking in the worker to make the hot path of request submission cheap. To keep the symmetry and keep hangcheck strictly bound by the GPU's wakelock, we move the cancel_sync(hangcheck) to the idle worker before dropping the wakelock. v2: Guard against RCU fouling the breadcrumbs bottom-half whilst we kick the waiter. v3: Remove the wakeref assertion squelching (now we hold a wakeref for the hangcheck, any rpm error there is genuine). v4: To prevent excess work when retiring requests, we split the busy flag into two, a boolean to denote whether we hold the wakeref and a bitmask of active engines. v5: Reorder cancelling hangcheck upon idling to avoid a race where we might cancel a hangcheck after being preempted by a new task Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> References: https://bugs.freedesktop.org/show_bug.cgi?id=88437 Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1467616119-4093-1-git-send-email-chris@chris-wilson.co.uk
2016-07-02drm/i915: Remove check for !crtc_state in intel_plane_atomic_calc_changes()Chris Wilson
smatch spotted that: drivers/gpu/drm/i915/intel_display.c:11986 intel_plane_atomic_calc_changes() warn: variable dereferenced before check 'crtc_state' (see line 11972) Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1467470166-31717-9-git-send-email-chris@chris-wilson.co.uk Reviewed-by: Matthew Auld <matthew.auld@intel.com>
2016-07-02drm/i915: Fix inconsistent indentation in intel_pre_enable_lvds()Chris Wilson
smatch complains: drivers/gpu/drm/i915/intel_lvds.c:187 intel_pre_enable_lvds() warn: inconsistent indenting Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1467470166-31717-8-git-send-email-chris@chris-wilson.co.uk Reviewed-by: Matthew Auld <matthew.auld@intel.com>
2016-07-02drm/i915: Fix buffer overflow in dsi_calc_mnp()Chris Wilson
smatch complain: drivers/gpu/drm/i915/intel_dsi_pll.c:101 dsi_calc_mnp() error: buffer overflow 'lfsr_converts' 39 <= 4294967234 and looks justified in doing so. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1467470166-31717-7-git-send-email-chris@chris-wilson.co.uk Reviewed-by: Matthew Auld <matthew.auld@intel.com>
2016-07-02drm/i915: Fix inconsistent indenting in vbt_panel_init()Chris Wilson
smatch complains: drivers/gpu/drm/i915/intel_dsi_panel_vbt.c:657 vbt_panel_init() warn: inconsistent indenting Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1467470166-31717-6-git-send-email-chris@chris-wilson.co.uk Reviewed-by: Matthew Auld <matthew.auld@intel.com>
2016-07-02drm/i915: Match bitmask size to types in intel_fb_initial_config()Chris Wilson
smatch complains of: drivers/gpu/drm/i915/intel_fbdev.c:403 intel_fb_initial_config() warn: should '1 << i' be a 64 bit type? drivers/gpu/drm/i915/intel_fbdev.c:422 intel_fb_initial_config() warn: should '1 << i' be a 64 bit type? drivers/gpu/drm/i915/intel_fbdev.c:501 intel_fb_initial_config() warn: should '1 << i' be a 64 bit type? We are prepared to iterate over a u64 but don't limit the number of connectors we try to configure to a maximum of 64. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1467470166-31717-5-git-send-email-chris@chris-wilson.co.uk Reviewed-by: Matthew Auld <matthew.auld@intel.com>
2016-07-02drm/i915: Fix inconsistent indenting in i915_error_state_to_str()Chris Wilson
smatch complains: drivers/gpu/drm/i915/i915_gpu_error.c:503 i915_error_state_to_str() warn: inconsistent indenting Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1467470166-31717-4-git-send-email-chris@chris-wilson.co.uk Reviewed-by: Matthew Auld <matthew.auld@intel.com>
2016-07-02drm/i915: Fix indentation in i915_gem_framebuffer_info()Chris Wilson
smatch complains: drivers/gpu/drm/i915/i915_debugfs.c:1390 i915_frequency_info() Function too hairy. Giving up. drivers/gpu/drm/i915/i915_debugfs.c:1985 i915_gem_framebuffer_info() warn: inconsistent indenting Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1467470166-31717-3-git-send-email-chris@chris-wilson.co.uk Reviewed-by: Matthew Auld <matthew.auld@intel.com>
2016-07-02drm/915: Fix long lines and random indent in gen6_set_rps_thresholds()Chris Wilson
smatch complains: drivers/gpu/drm/i915/intel_pm.c:4745 gen6_set_rps_thresholds() warn: inconsistent indenting Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1467470166-31717-2-git-send-email-chris@chris-wilson.co.uk Reviewed-by: Matthew Auld <matthew.auld@intel.com>
2016-07-02drm/i915: Fix random indent in i915_drm_resume()Chris Wilson
smatch complains: drivers/gpu/drm/i915/i915_drv.c:1616 i915_drm_resume() warn: inconsistent indenting Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1467470166-31717-1-git-send-email-chris@chris-wilson.co.uk Reviewed-by: Matthew Auld <matthew.auld@intel.com>
2016-07-01drm/i915: Remove debug noise on detecting fault-injection of missed interruptsChris Wilson
Since the tests can and do explicitly check debugfs/i915_ring_missed_irqs for the handling of a "missed interrupt", adding it to the dmesg at INFO is just noise. When it happens for real, we still class it as an ERROR. Note that I have chose to remove it entirely because when we detect the "missed interrupt" is irrelevant and the message contains no more information than we glean from looking in debugfs. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1467390209-3576-20-git-send-email-chris@chris-wilson.co.uk
2016-07-01drm/i915: Simplify enabling user-interrupts with L3-remappingChris Wilson
Borrow the idea from intel_lrc.c to precompute the mask of interrupts we wish to always enable to avoid having lots of conditionals inside the interrupt enabling. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1467390209-3576-19-git-send-email-chris@chris-wilson.co.uk
2016-07-01drm/i915: Move the get/put irq locking into the callerChris Wilson
With only a single callsite for intel_engine_cs->irq_get and ->irq_put, we can reduce the code size by moving the common preamble into the caller, and we can also eliminate the reference counting. For completeness, as we are no longer doing reference counting on irq, rename the get/put vfunctions to enable/disable respectively and are able to review the use of posting reads. We only require the serialisation with hardware when enabling the interrupt (i.e. so we cannot miss an interrupt by going to sleep before the hardware truly enables it). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1467390209-3576-18-git-send-email-chris@chris-wilson.co.uk
2016-07-01drm/i915: Embed signaling node into the GEM requestChris Wilson
Under the assumption that enabling signaling will be a frequent operation, lets preallocate our attachments for signaling inside the (rather large) request struct (and so benefiting from the slab cache). v2: Convert from void * to more meaningful names and types. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1467390209-3576-17-git-send-email-chris@chris-wilson.co.uk
2016-07-01drm/i915: Convert trace-irq to the breadcrumb waiterChris Wilson
If we convert the tracing over from direct use of ring->irq_get() and over to the breadcrumb infrastructure, we only have a single user of the ring->irq_get and so we will be able to simplify the driver routines (eliminating the redundant validation and irq refcounting). Process context is preferred over softirq (or even hardirq) for a couple of reasons: - we already utilize process context to have fast wakeup of a single client (i.e. the client waiting for the GPU inspects the seqno for itself following an interrupt to avoid the overhead of a context switch before it returns to userspace) - engine->irq_seqno() is not suitable for use from an softirq/hardirq context as we may require long waits (100-250us) to ensure the seqno write is posted before we read it from the CPU A signaling framework is a requirement for enabling dma-fences. v2: Move to a signaling framework based upon the waiter. v3: Track the first-signal to avoid having to walk the rbtree everytime. v4: Mark the signaler thread as RT priority to reduce latency in the indirect wakeups. v5: Make failure to allocate the thread fatal. v6: Rename kthreads to i915/signal:%u Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1467390209-3576-16-git-send-email-chris@chris-wilson.co.uk
2016-07-01drm/i915: Stop setting wraparound seqno on initialisationChris Wilson
We have testcases to ensure that seqno wraparound works fine, so we can forgo forcing everyone to encounter seqno wraparound during early uptime. seqno wraparound incurs a full GPU stall so not forcing it will eliminate one jitter from the early system. Using the testcases, we have very deterministic testing which given how difficult it would be to debug an issue (GPU hang) stemming from a wraparound using pure postmortem analysis I see no value in forcing a wrap during boot. Advancing the global next_seqno after a GPU reset is equally pointless. References? https://bugs.freedesktop.org/show_bug.cgi?id=95023 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/1467390209-3576-15-git-send-email-chris@chris-wilson.co.uk
2016-07-01drm/i915: Only apply one barrier after a breadcrumb interrupt is postedChris Wilson
If we flag the seqno as potentially stale upon receiving an interrupt, we can use that information to reduce the frequency that we apply the heavyweight coherent seqno read (i.e. if we wake up a chain of waiters). v2: Use cmpxchg to replace READ_ONCE/WRITE_ONCE for more explicit control of the ordering wrt to interrupt generation and interrupt checking in the bottom-half. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1467390209-3576-14-git-send-email-chris@chris-wilson.co.uk
2016-07-01drm/i915: Check the CPU cached value in HWS of seqno after waking the waiterChris Wilson
If we have multiple waiters, we may find that many complete on the same wake up. If we first inspect the seqno from the CPU cache, we may reduce the number of heavyweight coherent seqno reads we require. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1467390209-3576-13-git-send-email-chris@chris-wilson.co.uk
2016-07-01drm/i915: Add a delay between interrupt and inspecting the final seqno (ilk)Chris Wilson
On Ironlake, there is no command nor register to ensure that the write from a MI_STORE command is completed (and coherent on the CPU) before the command parser continues. This means that the ordering between the seqno write and the subsequent user interrupt is undefined (like gen6+). So to ensure that the seqno write is completed after the final user interrupt we need to delay the read sufficiently to allow the write to complete. This delay is undefined by the bspec, and empirically requires 75us even though a register read combined with a clflush is less than 500ns. Hence, the delay is due to an on-chip buffer rather than the latency of the write to memory. Note that the render ring controls this by filling the PIPE_CONTROL fifo with stalling commands that force the earliest pipe-control with the seqno to be completed before the command parser continues. Given that we need a barrier operation for BSD, we may as well forgo the extra per-batch latency by using a common per-interrupt barrier. Studying the impact of adding the usleep shows that in both sequences of and individual synchronous no-op batches is negligible for the media engine (where the write now is unordered with the interrupt). Converting the render engine over from the current glutton of pie-controls over to the per-interrupt delays speeds up both the sequential and individual synchronous no-ops by 20% and 60%, respectively. This speed up holds even when looking at the throughput of small copies (4KiB->4MiB), both serial and synchronous, by about 20%. This is because despite adding a significant delay to the interrupt, in all likelihood we will see the seqno write without having to apply the barrier (only in the rare corner cases where the write is delayed on the last required is the delay necessary). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94307 Testcase: igt/gem_sync #ilk Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1467390209-3576-12-git-send-email-chris@chris-wilson.co.uk
2016-07-01drm/i915: Refactor scratch object allocation for gen2 w/a bufferChris Wilson
The gen2 w/a buffer is stuffed into the same slot as the gen5+ scratch buffer. If we pass in the size we want to allocate for the scratch buffer, both callers can use the same routine. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1467390209-3576-11-git-send-email-chris@chris-wilson.co.uk
2016-07-01drm/i915: Allocate scratch page from stolenChris Wilson
With the last direct CPU access to the scratch page removed, we can now allocate it from our small amount of reserved system pages (stolen memory). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1467390209-3576-10-git-send-email-chris@chris-wilson.co.uk
2016-07-01drm/i915: Stop mapping the scratch page into CPU spaceChris Wilson
After the elimination of using the scratch page for Ironlake's breadcrumb, we no longer need to kmap the object. We therefore can move it into the high unmappable space and do not need to force the object to be coherent (i.e. snooped on !llc platforms). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1467390209-3576-9-git-send-email-chris@chris-wilson.co.uk
2016-07-01drm/i915: Use HWS for seqno tracking everywhereChris Wilson
By using the same address for storing the HWS on every platform, we can remove the platform specific vfuncs and reduce the get-seqno routine to a single read of a cached memory location. v2: Fix semaphore_passed() to look at the signaling engine (not the waiter's) Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1467390209-3576-8-git-send-email-chris@chris-wilson.co.uk
2016-07-01drm/i915: Spin after waking up for an interruptChris Wilson
When waiting for an interrupt (waiting for the engine to complete some work), we know we are the only waiter to be woken on this engine. We also know when the GPU has nearly completed our request (or at least started processing it), so after being woken and we detect that the GPU is active and working on our request, allow us the bottom-half (the first waiter who wakes up to handle checking the seqno after the interrupt) to spin for a very short while to reduce client latencies. The impact is minimal, there was an improvement to the realtime-vs-many clients case, but exporting the function proves useful later. However, it is tempting to adjust irq_seqno_barrier to include the spin. The problem is first ensuring that the "start-of-request" seqno is coherent as we use that as our basis for judging when it is ok to spin. If we could, spinning there could dramatically shorten some sleeps, and allow us to make the barriers more conservative to handle missed seqno writes on more platforms (all gen7+ are known to have the occasional issue, at least). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1467390209-3576-7-git-send-email-chris@chris-wilson.co.uk