summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2017-11-22drm/i915/pmu: Suspend sampling when GPU is idleTvrtko Ursulin
If only a subset of events is enabled we can afford to suspend the sampling timer when the GPU is idle and so save some cycles and power. v2: Rebase and limit timer even more. v3: Rebase. v4: Rebase. v5: Skip action if perf PMU failed to register. v6: Checkpatch cleanup. v7: * Add a common helper to start the timer if needed. (Chris Wilson) * Add comment explaining bitwise logic in pmu_needs_timer. v8: Fix some comments styles. (Chris Wilson) v9: Rebase. v10: Move function declarations to i915_pmu.h. v11: Rename functions to i915_pmu_gt_(un)parked. (Chris Wilson) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171121181852.16128-3-tvrtko.ursulin@linux.intel.com
2017-11-22drm/i915/pmu: Expose a PMU interface for perf queriesTvrtko Ursulin
From: Chris Wilson <chris@chris-wilson.co.uk> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com> From: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com> The first goal is to be able to measure GPU (and invidual ring) busyness without having to poll registers from userspace. (Which not only incurs holding the forcewake lock indefinitely, perturbing the system, but also runs the risk of hanging the machine.) As an alternative we can use the perf event counter interface to sample the ring registers periodically and send those results to userspace. Functionality we are exporting to userspace is via the existing perf PMU API and can be exercised via the existing tools. For example: perf stat -a -e i915/rcs0-busy/ -I 1000 Will print the render engine busynnes once per second. All the performance counters can be enumerated (perf list) and have their unit of measure correctly reported in sysfs. v1-v2 (Chris Wilson): v2: Use a common timer for the ring sampling. v3: (Tvrtko Ursulin) * Decouple uAPI from i915 engine ids. * Complete uAPI defines. * Refactor some code to helpers for clarity. * Skip sampling disabled engines. * Expose counters in sysfs. * Pass in fake regs to avoid null ptr deref in perf core. * Convert to class/instance uAPI. * Use shared driver code for rc6 residency, power and frequency. v4: (Dmitry Rogozhkin) * Register PMU with .task_ctx_nr=perf_invalid_context * Expose cpumask for the PMU with the single CPU in the mask * Properly support pmu->stop(): it should call pmu->read() * Properly support pmu->del(): it should call stop(event, PERF_EF_UPDATE) * Introduce refcounting of event subscriptions. * Make pmu.busy_stats a refcounter to avoid busy stats going away with some deleted event. * Expose cpumask for i915 PMU to avoid multiple events creation of the same type followed by counter aggregation by perf-stat. * Track CPUs getting online/offline to migrate perf context. If (likely) cpumask will initially set CPU0, CONFIG_BOOTPARAM_HOTPLUG_CPU0 will be needed to see effect of CPU status tracking. * End result is that only global events are supported and perf stat works correctly. * Deny perf driver level sampling - it is prohibited for uncore PMU. v5: (Tvrtko Ursulin) * Don't hardcode number of engine samplers. * Rewrite event ref-counting for correctness and simplicity. * Store initial counter value when starting already enabled events to correctly report values to all listeners. * Fix RC6 residency readout. * Comments, GPL header. v6: * Add missing entry to v4 changelog. * Fix accounting in CPU hotplug case by copying the approach from arch/x86/events/intel/cstate.c. (Dmitry Rogozhkin) v7: * Log failure message only on failure. * Remove CPU hotplug notification state on unregister. v8: * Fix error unwind on failed registration. * Checkpatch cleanup. v9: * Drop the energy metric, it is available via intel_rapl_perf. (Ville Syrjälä) * Use HAS_RC6(p). (Chris Wilson) * Handle unsupported non-engine events. (Dmitry Rogozhkin) * Rebase for intel_rc6_residency_ns needing caller managed runtime pm. * Drop HAS_RC6 checks from the read callback since creating those events will be rejected at init time already. * Add counter units to sysfs so perf stat output is nicer. * Cleanup the attribute tables for brevity and readability. v10: * Fixed queued accounting. v11: * Move intel_engine_lookup_user to intel_engine_cs.c * Commit update. (Joonas Lahtinen) v12: * More accurate sampling. (Chris Wilson) * Store and report frequency in MHz for better usability from perf stat. * Removed metrics: queued, interrupts, rc6 counters. * Sample engine busyness based on seqno difference only for less MMIO (and forcewake) on all platforms. (Chris Wilson) v13: * Comment spelling, use mul_u32_u32 to work around potential GCC issue and somne code alignment changes. (Chris Wilson) v14: * Rebase. v15: * Rebase for RPS refactoring. v16: * Use the dynamic slot in the CPU hotplug state machine so that we are free to setup our state as multi-instance. Previously we were re-using the CPUHP_AP_PERF_X86_UNCORE_ONLINE slot which is neither used as multi-instance, nor owned by our driver to start with. * Register the CPU hotplug handlers after the PMU, otherwise the callback will get called before the PMU is initialized which can end up in perf_pmu_migrate_context with an un-initialized base. * Added workaround for a probable bug in cpuhp core. v17: * Remove workaround for the cpuhp bug. v18: * Rebase for drm_i915_gem_engine_class getting upstream before us. v19: * Rebase. (trivial) Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171121181852.16128-2-tvrtko.ursulin@linux.intel.com
2017-11-22drm/i915: Extract intel_get_cagfTvrtko Ursulin
Code to be shared between debugfs and the PMU implementation. v2: Checkpatch cleanup. v3: Also consolidate i915_sysfs.c/gt_act_freq_mhz_show. v4: Rebase. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171121181852.16128-1-tvrtko.ursulin@linux.intel.com
2017-11-21drm/i915/selftests: Avoid drm_gem_handle_create under struct_mutexChris Wilson
Despite us reloading the module around every selftest, the lockclasses persist and the chains used in selftesting may then dictate how we are allowed to nest locks during runtime testing. As such we have to be just as careful, and in particular it turns out we are not allowed to nest dev->object_name_lock (drm_gem_handle_create) inside dev->struct_mutex. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103830 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171121110652.1107-1-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2017-11-21drm/i915: Add rudimentary plane state verificationVille Syrjälä
Check that the planes are in the state we expect them to be. For now we can only check whether each plane is correctly enabled or disabled. In the future we may want to expand the plane state readout to support a more thorough verification. v2: Verify all planes part of the state as long as at least one crtc is doing a modeset (Daniel) v3: Fix typoes (James) Cc: James Ausmus <james.ausmus@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Suggested-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: James Ausmus <james.ausmus@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171117191917.11506-11-ville.syrjala@linux.intel.com Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
2017-11-21drm/i915: Use plane->get_hw_state() for initial plane fb readoutVille Syrjälä
Since we now have a ->get_hw_state() method for planes, let's use that during the initial plane fb readout. v2: s/plane/i9xx_plane/ etc. (James) Cc: James Ausmus <james.ausmus@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Suggested-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: James Ausmus <james.ausmus@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171117191917.11506-10-ville.syrjala@linux.intel.com Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
2017-11-21drm/i915: Nuke crtc->planeVille Syrjälä
Eliminate crtc->plane since it's pretty much a layering violation. We can always get the plane via crtc->primary if we actually need it. The only ugly thing left is plane_to_crtc_mapping[], but that's still needed by the pre-g4x watermark code. v2: Removed a misplaced comment change (Daniel) v3: Rebase due to fbc crtc->y usage removal v4: s/plane/i9xx_plane/ etc. (James) Cc: James Ausmus <james.ausmus@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20171117191917.11506-9-ville.syrjala@linux.intel.com Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
2017-11-21drm/i915: Switch fbc over to for_each_new_intel_plane_in_state()Ville Syrjälä
Stop using the old for_each_intel_plane_in_state() type iteration macro and replace it with for_each_new_intel_plane_in_state(). And similarly replace drm_atomic_get_existing_crtc_state() with intel_atomic_get_new_crtc_state(). Switch over to intel_ types as well to make the code less cluttered. v2: s/plane/i9xx_plane/ etc. (James) Cc: James Ausmus <james.ausmus@intel.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20171117191917.11506-8-ville.syrjala@linux.intel.com Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
2017-11-21drm/i915: Nuke ironlake_get_initial_plane_config()Ville Syrjälä
The only relevant difference between i9xx_get_initial_plane_config() and ironlake_get_initial_plane_config() is the HSW/BDW TILEOFF handling. Add that to i9xx_get_initial_plane_config() and nuke ironlake_get_initial_plane_config(). v2: s/plane/i9xx_plane/ etc. (James) Cc: James Ausmus <james.ausmus@intel.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20171117191917.11506-7-ville.syrjala@linux.intel.com Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
2017-11-21drm/i915: Cleanup enum pipe/enum plane_id/enum i9xx_plane_id in initial fb ↵Ville Syrjälä
readout Use enum pipe, enum plane_id, and enum i9xx_plane_id consistently in the initial framebuffe readout. v2: Use old_plane_id in the ilk code v3: s/old_plane_id/i9xx_plane_id/ (Daniel) v4: Rebase due to GLK/CNL PLANE_COLOR_CTL alpha stuff v5: s/plane/i9xx_plane/ etc. (James) Cc: James Ausmus <james.ausmus@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20171117191917.11506-6-ville.syrjala@linux.intel.com Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
2017-11-21drm/i915: Use enum i9xx_plane_id for the .get_fifo_size() hooksVille Syrjälä
Replace the 0 and 1 with PLANE_A and PLANE_B in the pre-g4x wm code. v2: s/old_plane_id/i9xx_plane_id/ (Daniel) v3: s/plane/i9xx_plane/ etc. (James) Cc: James Ausmus <james.ausmus@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20171117191917.11506-5-ville.syrjala@linux.intel.com Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
2017-11-21drm/i915: s/enum plane/enum i9xx_plane_id/Ville Syrjälä
Rename enum plane to enum i9xx_plane_id to make it clear that it only applies to pre-SKL platforms. enum i9xx_plane_id is a global identifier, whereas enum plane_id is per-pipe. We need the old global identifier to index the primary plane (and the pre-g4x sprite C if we ever expose it) registers on pre-SKL platforms. v2: Reorder patches v3: s/old_plane_id/i9xx_plane_id/ (Daniel) Pimp the commit message a bit Note that i9xx_plane_id doesn't apply to SKL+ v4: Rebase due to power domain handling in plane readout v5: Rebase due to crtc->dspaddr_offset removal v6: s/plane/i9xx_plane/ etc. (James) Cc: James Ausmus <james.ausmus@intel.com> Cc: Daniel Vetter <daniel@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20171117191917.11506-4-ville.syrjala@linux.intel.com Reviewed-by: James Ausmus <james.ausmus@intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
2017-11-21drm/i915: Redo plane sanitation during readoutVille Syrjälä
Unify the plane disabling during state readout by pulling the code into a new helper intel_plane_disable_noatomic(). We'll also read out the state of all planes, so that we know which planes really need to be diabled. Additonally we change the plane<->pipe mapping sanitation to work by simply disabling the offending planes instead of entire pipes. And we do it before we otherwise sanitize the crtcs, which means we don't have to worry about misassigned planes during crtc sanitation anymore. v2: Reoder patches to not depend on enum old_plane_id v3: s/for_each_pipe/for_each_intel_crtc/ Cc: Thierry Reding <thierry.reding@gmail.com> Cc: Alex Villacís Lasso <alexvillacislasso@hotmail.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103223 Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Tested-by: Thierry Reding <thierry.reding@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171117191917.11506-3-ville.syrjala@linux.intel.com Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
2017-11-21drm/i915: Add .get_hw_state() method for planesVille Syrjälä
Add a .get_hw_state() method for planes, returning true or false depending on whether the plane is enabled. Use it to rewrite the plane enabled/disabled asserts in platform agnostic fashion. We do lose the pre-gen4 plane<->pipe mapping checks, but since we're supposed sanitize that anyway it doesn't really matter. v2: Reoder patches to not depend on enum old_plane_id Just call assert_plane_disabled() from assert_planes_disabled() v3: Deal with disabled power wells in .get_hw_state() v4: Rebase due skl primary plane code removal Cc: Thierry Reding <thierry.reding@gmail.com> Cc: Alex Villacís Lasso <alexvillacislasso@hotmail.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> #v2 Tested-by: Thierry Reding <thierry.reding@gmail.com> #v2 Link: https://patchwork.freedesktop.org/patch/msgid/20171117191917.11506-2-ville.syrjala@linux.intel.com Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
2017-11-20drm/i915: Don't use GEN6_RC_VIDEO_FREQ on gen10+David Weinehall
GEN6_RC_VIDEO_FREQ is deprecated for >= gen10; don't try to program it. v2: Use IS_GEN9() instead of INTEL_GEN() and remove comment (Rodrigo) Signed-off-by: David Weinehall <david.weinehall@linux.intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171117080146.20150-1-david.weinehall@linux.intel.com
2017-11-20drm/i915/selftests: Declare we allocated the guc clientsChris Wilson
Silence smatch over drivers/gpu/drm/i915/selftests/intel_guc.c:135 igt_guc_init_doorbell_hw() error: we previously assumed 'guc->execbuf_client' could be null (see line 123) drivers/gpu/drm/i915/selftests/intel_guc.c:142 igt_guc_init_doorbell_hw() error: we previously assumed 'guc->preempt_client' could be null (see line 123) by asserting that we did succeed in creating the pair of clients for testing. References: 55bd6bd75717 ("drm/i915/selftests: Add a GuC doorbells selftest") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Michel Thierry <michel.thierry@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171120211907.1649-1-chris@chris-wilson.co.uk Reviewed-by: Michel Thierry <michel.thierry@intel.com>
2017-11-20drm/i915: Remove i915.semaphores modparamChris Wilson
Having disabled the broken semaphores on Sandybridge, there is no need for a modparam any more, so remove it in favour of a simple HAS_LEGACY_SEMAPHORES() guard. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171120205504.21892-5-chris@chris-wilson.co.uk
2017-11-20drm/i915: Move debugfs/i915_semaphore_status to i915_engine_infoChris Wilson
As the semaphores is just part of the engine, include it with the general pretty printer universally used for debugging. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171120205504.21892-4-chris@chris-wilson.co.uk
2017-11-20drm/i915: Disable semaphores on SandybridgeChris Wilson
I should have admitted defeat long ago as there has been a rare but persistent error on Sandybridge where semaphore signaling did not propagate to the waiter, leading to a GPU hang. With the work on fence signaling for v4.9, the impact of using CPU driven signaling was greatly reduced wrt to the latency of GPU semaphores, though without logical rings support, the benefit of reordering work to avoid bubbles is not realised (i.e. as it stands fence signaling is just a slower, more costly version of HW semaphores; but works more consistently). As a rough indicator of the difference, with semaphores: Sequential (3 engines, 1 processes): average 5.470us per cycle [expected 4.988us] w/o semaphores: Sequential (3 engines, 1 processes): average 15.771us per cycle [expected 4.923us] In comparison, v3.4: with semaphores: Sequential (3 engines, 1 processes): average 16.066us per cycle [expected 11.842us] w/o semaphores: Sequential (3 engines, 1 processes): average 23.460us per cycle [expected 11.839us] Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=54226 #and 100+ dupes Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171120205504.21892-3-chris@chris-wilson.co.uk
2017-11-20drm/i915: Remove obsolete ringbuffer emission for gen8+Chris Wilson
Since removing the module parameter to force selection of ringbuffer emission for gen8, the code is defunct. Remove it. To put the difference into perspective, a couple of microbenchmarks (bdw i7-5557u, 20170324): ring execlists exec continuous nops on all rings: 1.491us 2.223us exec sequential nops on each ring: 12.508us 53.682us single nop + sync: 9.272us 30.291us vblank_mode=0 glxgears: ~11000fps ~9000fps Since the earlier submission, gen8 ringbuffer submission has fallen further and further behind in features. So while ringbuffer may hold the throughput crown, in terms of interactive latency, execlists is much better. Alas, we have no convenient metrics for such, other than demonstrating things we can do with execlists but can not using legacy ringbuffer submission. We have made a few improvements to lowlevel execlists throughput, and ringbuffer currently panics on boot! (bdw i7-5557u, 20171026): ring execlists exec continuous nops on all rings: n/a 1.921us exec sequential nops on each ring: n/a 44.621us single nop + sync: n/a 21.953us vblank_mode=0 glxgears: n/a ~18500fps References: https://bugs.freedesktop.org/show_bug.cgi?id=87725 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Once-upon-a-time-Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171120205504.21892-2-chris@chris-wilson.co.uk
2017-11-20drm/i915: Remove i915.enable_execlists module parameterChris Wilson
Execlists and legacy ringbuffer submission are no longer feature comparable (execlists now offer greater functionality that should overcome their performance hit) and obsoletes the unsafe module parameter, i.e. comparing the two modes of execution is no longer useful, so remove the debug tool. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> #i915_perf.c Link: https://patchwork.freedesktop.org/patch/msgid/20171120205504.21892-1-chris@chris-wilson.co.uk
2017-11-20drm/i915/execlists: Delay writing to ELSP until HW has processed the ↵Michel Thierry
previous write The hardware needs some time to process the information received in the ExecList Submission Port, and expects us to not write anything more until it has 'acknowledged' this new submission by sending an IDLE_ACTIVE or PREEMPTED CSB event. If we do not follow this, the driver could write new data into the ELSP before HW had finishing fetching the previous one, putting us in 'undefined behaviour' space. This seems to be the problem causing the spurious PREEMPTED & COMPLETE events after a COMPLETE like the one below: [] vcs0: sw rd pointer = 2, hw wr pointer = 0, current 'head' = 3. [] vcs0: Execlist CSB[0]: 0x00000018 _ 0x00000007 [] vcs0: Execlist CSB[1]: 0x00000001 _ 0x00000000 [] vcs0: Execlist CSB[2]: 0x00000018 _ 0x00000007 <<< COMPLETE [] vcs0: Execlist CSB[3]: 0x00000012 _ 0x00000007 <<< PREEMPTED & COMPLETE [] vcs0: Execlist CSB[4]: 0x00008002 _ 0x00000006 [] vcs0: Execlist CSB[5]: 0x00000014 _ 0x00000006 The ELSP writes that lead to this CSB sequence show that the HW hadn't started executing the previous execlist (the one with only ctx 0x6) by the time the new one was submitted; this is a bit more clear in the data show in the EXECLIST_STATUS register at the time of the ELSP write. [] vcs0: ELSP[0] = 0x0_0 [execlist1] - status_reg = 0x0_302 [] vcs0: ELSP[1] = 0x6_fedb2119 [execlist0] - status_reg = 0x0_8302 [] vcs0: ELSP[2] = 0x7_fedaf119 [execlist1] - status_reg = 0x0_8308 [] vcs0: ELSP[3] = 0x6_fedb2119 [execlist0] - status_reg = 0x7_8308 Note that having to wait for this ack does not disable lite-restores, although it may reduce their numbers. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102035 Signed-off-by: Michel Thierry <michel.thierry@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/<20171118003038.7935-1-michel.thierry@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171120123458.23242-4-chris@chris-wilson.co.uk Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Tested-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-11-20drm/i915/selftest: Make guc clients staticChris Wilson
Make the private array used for stashing test clients static, to silence sparse. References: 55bd6bd75717 ("drm/i915/selftests: Add a GuC doorbells selftest") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Michel Thierry <michel.thierry@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171120132606.4254-1-chris@chris-wilson.co.uk Reviewed-by: Michel Thierry <michel.thierry@intel.com>
2017-11-20drm/i915/perf: reuse timestamp frequency from device infoLionel Landwerlin
Now that we have this stored in the device info, we can drop it from perf part of the driver. Note that this requires to init perf after we've computed the frequency, hence why we move i915_perf_init() from i915_driver_init_early() to after intel_device_info_runtime_init(). v2: Use div_u64 (Chris) v3: Drop u64 divs by switching to kHz (Chris/Ville) Move i915_perf_fini to i915_driver_cleanup_hw (Matthew) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171113181902.12411-2-lionel.g.landwerlin@intel.com
2017-11-20drm/i915: Automatic i915_switch_context for legacyChris Wilson
During request construction, after pinning the context we know whether or not we have to emit a context switch. So move this common operation from every caller into i915_gem_request_alloc() itself. v2: Always submit the request if we emitted some commands during request construction, as typically it also involves changes in global state. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171120102002.22254-2-chris@chris-wilson.co.uk
2017-11-20drm/i915: Pull the unconditional GPU cache invalidation into request ↵Chris Wilson
construction As the request will, in the following patch, implicitly invoke a context-switch on construction, we should precede that with a GPU TLB invalidation. Also, even before using GGTT, we always want to invalidate the TLBs for any updates (as well as the ppgtt invalidates that are unconditionally applied by execbuf). Since we almost always require the TLB invalidate, do it unconditionally on request allocation and so we can remove it from all other paths. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171120102002.22254-1-chris@chris-wilson.co.uk Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
2017-11-20drm/i915/perf: replace .reg accesses with i915_mmio_reg_offsetLionel Landwerlin
This replaces accesses to the reg field of the i915_reg_t structure with the i915_mmio_reg_offset() inline function. Suggested-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ewelina Musial <ewelina.musial@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171113233455.12085-2-lionel.g.landwerlin@intel.com
2017-11-20drm/i915/execlists: Assert that we don't get mixed IDLE_ACTIVE | COMPLETE eventsChris Wilson
If IDLE_ACTIVE is set, then all other bits are invalid. For us, we can assert that if we see a COMPLETE | PREEMPTED event, then it should be impossible for it to also contain an IDLE_ACTIVE flag. Suggested-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Michal Winiarski <michal.winiarski@intel.com> Cc: Michel Thierry <michel.thierry@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171120123458.23242-3-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2017-11-20drm/i915/execlists: Reduce completed event mask to COMPLETE | PREEMPTEDChris Wilson
Since we get a COMPLETE event when the context switch occurs on RING_HEAD == RING_TAIL and a PREEMPTED event when a switch occurs before that point, COMPLETE | PREEMPTED should cover all possible context switch completion events. We can move the ELEMENT_SWITCH info message from the COMPLETED_MASK into an assertion for when we are performing a switch to port[1]. Suggested-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Michal Winiarski <michal.winiarski@intel.com> Cc: Michel Thierry <michel.thierry@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171120123458.23242-2-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2017-11-20drm/i915/execlists: Listen to COMPLETE context event not ACTIVE_IDLEChris Wilson
Since commit e1fee72c2ea2e9c0c6e6743d32a6832f21337d6c Author: Oscar Mateo <oscar.mateo@intel.com> Date: Thu Jul 24 17:04:40 2014 +0100 drm/i915/bdw: Avoid non-lite-restore preemptions execlists has listened to (ACTIVE_IDLE | ELEMENT_SWITCH) for detecting when one context completed and it either continued onto the next (in port 1) or idled. We would always see COMPLETE | ACTIVE_IDLE on the final context-switch event, but on recent gen it appears that we now get separate ACTIVE_IDLE and COMPLETE events. In particular, the ACTIVE_IDLE events may not be coupled to a context (since it is a general state rather than a specific context completion event). v2: Update the history, execlists did originally start out by listening to the COMPLETE event not ACTIVE_IDLE. v3: Update preempt completion test to also use COMPLETE not ACTIVE_IDLE. References: bspec/12255 References: https://bugs.freedesktop.org/show_bug.cgi?id=103800 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Oscar Mateo <oscar.mateo@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Michal Winiarski <michal.winiarski@intel.com> Cc: Michel Thierry <michel.thierry@intel.com> Acked-by: Michel Thierry <michel.thierry@intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171120123458.23242-1-chris@chris-wilson.co.uk
2017-11-20drm/i915: Fix init_clock_gating for resumeVille Syrjälä
Moving the init_clock_gating() call from intel_modeset_init_hw() to intel_modeset_gem_init() had an unintended effect of not applying some workarounds on resume. This, for example, cause some kind of corruption to appear at the top of my IVB Thinkpad X1 Carbon LVDS screen after hibernation. Fix the problem by explicitly calling init_clock_gating() from the resume path. I really hope this doesn't break something else again. At least the problems reported at https://bugs.freedesktop.org/show_bug.cgi?id=103549 didn't make a comeback, even after a hibernate cycle. v2: Reorder the init_clock_gating vs. modeset_init_hw to match the display reset path (Rodrigo) Cc: stable@vger.kernel.org Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Fixes: 6ac43272768c ("drm/i915: Move init_clock_gating() back to where it was") Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171116160215.25715-1-ville.syrjala@linux.intel.com Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
2017-11-17drm/i915: Update DRIVER_DATE to 20171117Rodrigo Vivi
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2017-11-17drm/i915: Add a policy note for removing workaroundsChris Wilson
Rodrigo gave a persuasive argument for keeping workarounds: that they serve as a good guide for the bring up of the next generation. Not only do workarounds persist into the early revisions, they show where the workarounds were previously added to the code flow and sometimes the old workarounds have an explanation that give insight into their wider implications. Based on his suggestion, document the policy that we want to keep the workarounds from the current generation to guide the next. Older preproduction workarounds we still want to remove to keep the code clean. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Jani Nikula <jani.nikula@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20171117102635.8689-1-chris@chris-wilson.co.uk Acked-by: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2017-11-17drm/i915/selftests: Report ENOMEM clearly for an allocation failureChris Wilson
If we can not run the drunk_hole test because we couldn't allocate the memory for the permutation array (even after we tried trimming the size), report a clear ENOMEM. Similary, if we are asked to operate on a hole too small for ourselves, make it skip quietly. v2: Avoid malloc(0) since that returns ZERO_SIZE_PTR not NULL. v3: Fixup similar construction for lowlevel_hole v4: Use u64 >> 1 to avoid 64b div. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171117101732.4335-1-chris@chris-wilson.co.uk Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171117162945.16390-1-chris@chris-wilson.co.uk
2017-11-17Revert "drm/i915: Display WA #1133 WaFbcSkipSegments:cnl, glk"Radhakrishna Sripada
This reverts commit 8f067837c4b713ce2e69be95af7b2a5eb3bd7de8. HSD says "WA withdrawn. It was causing corruption with some images. WA is not strictly necessary since this bug just causes loss of FBC compression with some sizes and images, but doesn't break anything." Fixes: 8f067837c4b7 ("drm/i915: Display WA #1133 WaFbcSkipSegments:cnl, glk") Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Radhakrishna Sripada <radhakrishna.sripada@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171117010825.23118-1-radhakrishna.sripada@intel.com
2017-11-17drm/i915: Calculate g4x intermediate watermarks correctlyMaarten Lankhorst
The watermarks it should calculate against are the old optimal watermarks. The currently active crtc watermarks are pure fiction, and are invalid in case of a nonblocking modeset, page flip enabling/disabling planes or any other reason. When the crtc is disabled or during a modeset the intermediate watermarks don't need to be programmed separately, and could be directly assigned to the optimal watermarks. CXSR must always be disabled in the intermediate case for modesets, else we get a WARN for vblank wait timeout. Also rename crtc_state to new_crtc_state, to distinguish it from the old state. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171115163157.14372-2-maarten.lankhorst@linux.intel.com Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
2017-11-17drm/i915: Calculate vlv/chv intermediate watermarks correctly, v3.Maarten Lankhorst
The watermarks it should calculate against are the old optimal watermarks. The currently active crtc watermarks are pure fiction, and are invalid in case of a nonblocking modeset, page flip enabling/disabling planes or any other reason. When the crtc is disabled or during a modeset the intermediate watermarks don't need to be programmed separately, and could be directly assigned to the optimal watermarks. CXSR must always be disabled in the intermediate case for modesets, else we get a WARN for vblank wait timeout. Also rename crtc_state to new_crtc_state, to distinguish it from the old state. Changes since v1: - Use intel_atomic_get_old_crtc_state. (ville) Changes since v2: - Always unset cxsr during modeset. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171115163157.14372-1-maarten.lankhorst@linux.intel.com Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
2017-11-17drm/i915: Pass crtc_state to ips toggle functions, v2Maarten Lankhorst
Changes since v1: - Only pass crtc_state, not crtc. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171110113503.16253-8-maarten.lankhorst@linux.intel.com Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
2017-11-17drm/i915: Pass idle crtc_state to intel_dp_sink_crcMaarten Lankhorst
IPS can only be enabled if the primary plane is visible, so first make sure sw state matches hw state by waiting for hw_done. After this pass crtc_state to intel_dp_sink_crc() so that can be used, instead of using legacy pointers. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171110113503.16253-7-maarten.lankhorst@linux.intel.com Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
2017-11-17drm/i915: Enable FIFO underrun reporting after initial fastset, v4.Maarten Lankhorst
The firmware may have set up the pipe correctly, but the FIFO underrun and CRC interrupts are likely not enabled. This resulted in debugfs_test.read_all_entries failing on haswell, because of a timeout when reading the crc debugfs entry. Solve this by enabling FIFO underrun reporting after the initial fastset, which lets interrupts be generated as expected. Changes since v1: - Always enable CPU FIFO underrun reporting for >GEN2, and handle GEN2 correctly. Changes since v2: - Remove unneeded HAS_DDI, simplify GEN2 case. Changes since v3: - Use intel_crtc_pch_transcoder to determine pch transcoder for underruns. (Ville) - Remove crtc->config dereference in intel_crtc_pch_transcoder. (Ville) Testcase: debugfs_test.read_all_entries Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171113144043.58658-1-maarten.lankhorst@linux.intel.com Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
2017-11-17drm/i915: Mark the userptr invalidate workqueue as WQ_MEM_RECLAIMChris Wilson
Commit 21cc6431e0c2 ("drm/i915: Mark the userptr invalidate workqueue as WQ_MEM_RECLAIM") tried to fixup the check_flush_dependency warning for hitting i915_gem_userptr_mn_invalidate_range_start from within the shrinker, but I failed to notice userptr has 2 similarly named workqueues. I marked up i915-userptr-acquire as WQ_MEM_RECLAIM whereas we only wait upon i915-userptr-release from inside the reclaim paths. [62530.869510] workqueue: PF_MEMALLOC task 7983(gem_shrink) is flushing !WQ_MEM_RECLAIM i915-userptr-release: (null) [62530.869515] ------------[ cut here ]------------ [62530.869519] WARNING: CPU: 1 PID: 7983 at kernel/workqueue.c:2434 check_flush_dependency+0x7f/0x110 [62530.869519] Modules linked in: pegasus mii ip6table_filter ip6_tables bnep iptable_filter snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic binfmt_misc nls_iso8859_1 intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp snd_hda_intel snd_hda_codec kvm_intel snd_hda_core snd_hwdep kvm snd_pcm irqbypass snd_seq_midi snd_seq_midi_event snd_rawmidi crct10dif_pclmul crc32_pclmul 8250_dw ghash_clmulni_intel snd_seq pcbc snd_seq_device snd_timer btusb aesni_intel btrtl btbcm aes_x86_64 iwlwifi btintel crypto_simd glue_helper cryptd bluetooth snd intel_cstate input_leds idma64 intel_rapl_perf ecdh_generic serio_raw soundcore cfg80211 wmi_bmof virt_dma intel_lpss_pci intel_lpss acpi_als kfifo_buf industrialio winbond_cir soc_button_array rc_core spidev tpm_crb intel_hid acpi_pad mac_hid sparse_keymap [62530.869546] parport_pc ppdev lp parport ip_tables x_tables autofs4 hid_generic usbhid i915 i2c_algo_bit prime_numbers drm_kms_helper syscopyarea e1000e sysfillrect sysimgblt fb_sys_fops ahci ptp pps_core libahci drm wmi video i2c_hid hid [62530.869557] CPU: 1 PID: 7983 Comm: gem_shrink Tainted: G U W L 4.14.0-rc8-drm-tip-ww45-commit-1342299+ #1 [62530.869558] Hardware name: Intel Corporation CoffeeLake Client Platform/CoffeeLake H DDR4 RVP, BIOS CNLSFWR1.R00.X098.A00.1707301945 07/30/2017 [62530.869559] task: ffffa1049dbeec80 task.stack: ffffae7d05c44000 [62530.869560] RIP: 0010:check_flush_dependency+0x7f/0x110 [62530.869561] RSP: 0018:ffffae7d05c473a0 EFLAGS: 00010286 [62530.869562] RAX: 000000000000006e RBX: ffffa1049540f400 RCX: ffffffffa3e55788 [62530.869562] RDX: 0000000000000000 RSI: 0000000000000092 RDI: 0000000000000202 [62530.869563] RBP: ffffae7d05c473c0 R08: 000000000000006e R09: 000000000038bb0e [62530.869563] R10: 0000000000000000 R11: 000000000000006e R12: ffffa1049dbeec80 [62530.869564] R13: 0000000000000000 R14: 0000000000000000 R15: ffffae7d05c473e0 [62530.869565] FS: 00007f621b129880(0000) GS:ffffa1050b240000(0000) knlGS:0000000000000000 [62530.869566] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [62530.869566] CR2: 00007f6214400000 CR3: 0000000353a17003 CR4: 00000000003606e0 [62530.869567] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [62530.869567] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [62530.869568] Call Trace: [62530.869570] flush_workqueue+0x115/0x3d0 [62530.869573] ? wake_up_process+0x15/0x20 [62530.869596] i915_gem_userptr_mn_invalidate_range_start+0x12f/0x160 [i915] [62530.869614] ? i915_gem_userptr_mn_invalidate_range_start+0x12f/0x160 [i915] [62530.869616] __mmu_notifier_invalidate_range_start+0x55/0x80 [62530.869618] try_to_unmap_one+0x791/0x8b0 [62530.869620] ? call_rwsem_down_read_failed+0x18/0x30 [62530.869622] rmap_walk_anon+0x10b/0x260 [62530.869624] rmap_walk+0x48/0x60 [62530.869625] try_to_unmap+0x93/0xf0 [62530.869626] ? page_remove_rmap+0x2a0/0x2a0 [62530.869627] ? page_not_mapped+0x20/0x20 [62530.869629] ? page_get_anon_vma+0x90/0x90 [62530.869630] ? invalid_mkclean_vma+0x20/0x20 [62530.869631] migrate_pages+0x946/0xaa0 [62530.869633] ? __ClearPageMovable+0x10/0x10 [62530.869635] ? isolate_freepages_block+0x3c0/0x3c0 [62530.869636] compact_zone+0x22f/0x970 [62530.869638] compact_zone_order+0xa3/0xd0 [62530.869640] try_to_compact_pages+0x1a5/0x2a0 [62530.869641] ? try_to_compact_pages+0x1a5/0x2a0 [62530.869643] __alloc_pages_direct_compact+0x50/0x110 [62530.869644] __alloc_pages_slowpath+0x4da/0xf30 [62530.869646] __alloc_pages_nodemask+0x262/0x280 [62530.869648] alloc_pages_vma+0x165/0x1e0 [62530.869649] shmem_alloc_hugepage+0xd0/0x130 [62530.869651] ? __radix_tree_insert+0x45/0x230 [62530.869652] ? __vm_enough_memory+0x29/0x130 [62530.869654] shmem_alloc_and_acct_page+0x10d/0x1e0 [62530.869655] shmem_getpage_gfp+0x426/0xc00 [62530.869657] shmem_fault+0xa0/0x1e0 [62530.869659] ? file_update_time+0x60/0x110 [62530.869660] __do_fault+0x1e/0xc0 [62530.869661] __handle_mm_fault+0xa35/0x1170 [62530.869662] handle_mm_fault+0xcc/0x1c0 [62530.869664] __do_page_fault+0x262/0x4f0 [62530.869666] do_page_fault+0x2e/0xe0 [62530.869667] page_fault+0x22/0x30 [62530.869668] RIP: 0033:0x404335 [62530.869669] RSP: 002b:00007fff7829e420 EFLAGS: 00010216 [62530.869670] RAX: 00007f6210400000 RBX: 0000000000000004 RCX: 0000000000b80000 [62530.869670] RDX: 0000000000002e01 RSI: 0000000000008000 RDI: 0000000000000004 [62530.869671] RBP: 0000000000000019 R08: 0000000000000002 R09: 0000000000000000 [62530.869671] R10: 0000000000000559 R11: 0000000000000246 R12: 0000000008000000 [62530.869672] R13: 00000000004042f0 R14: 0000000000000004 R15: 000000000000007e [62530.869673] Code: 00 8b b0 18 05 00 00 48 8d 8b b0 00 00 00 48 8d 90 c0 06 00 00 4d 89 f0 48 c7 c7 40 c0 c8 a3 c6 05 68 c5 e8 00 01 e8 c2 68 04 00 <0f> ff 4d 85 ed 74 18 49 8b 45 20 48 8b 70 08 8b 86 00 01 00 00 [62530.869691] ---[ end trace 01e01ad0ff5781f8 ]--- Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103739 Fixes: 21cc6431e0c2 ("drm/i915: Mark the userptr invalidate workqueue as WQ_MEM_RECLAIM") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Michał Winiarski <michal.winiarski@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171114173520.8829-1-chris@chris-wilson.co.uk Reviewed-by: Matthew Auld <matthew.auld@intel.com>
2017-11-17drm/i915: Add might_sleep() check to wait_for()Chris Wilson
We should long past the time of trying to use wait_for() from inside atomic contexts, so add a might_sleep() check to prevent misuse. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171114215655.4849-1-chris@chris-wilson.co.uk Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
2017-11-17drm/i915/selftests: Add a GuC doorbells selftestMichel Thierry
The first test aims to check guc_init_doorbell_hw, changing the existing guc clients and doorbells state before calling it. The second test tries to create as many clients as it is currently possible (currently limited to max number of doorbells) and exercise the doorbell alloc/dealloc code. Since our usage mode require very few clients/doorbells, this code has been exercised very lightly and it's good to have a simple test for it. As reference, this test already helped identify the bug fixed by commit 7f1ea2ac3017 ("drm/i915/guc: Fix doorbell id selection"). v2: Extend number of clients; check for client allocation failure when number of doorbells is exceeded; validate client properties; reuse guc_init_doorbell_hw (Chris). v3: guc_init_doorbell_hw test added per Chris suggestion. v4: Try to explain why guc_init_doorbell_hw exist and comment some details in the subtest. v5: Remove redundant pr_info at the beginning of each subtest (Chris); rebase (s/i915_guc_client/intel_guc_client/). Signed-off-by: Michel Thierry <michel.thierry@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171116220632.1909-1-michel.thierry@intel.com Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-11-16Merge tag 'gvt-next-2017-11-16' of https://github.com/intel/gvt-linux into ↵Rodrigo Vivi
drm-intel-next-queued gvt-next-2017-11-16 - CSB HWSP update support (Weinan) - GVT debug helpers, dyndbg and debugfs (Chuanxiao, Shuo) - full virtualized opregion (Xiaolin) - VM health check for sane fallback (Fred) - workload submission code refactor for future enabling (Zhi) - Updated repo URL in MAINTAINERS (Zhenyu) - other many misc fixes Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171116092007.ww5bvfx7rf36bjmn@zhen-hp.sh.intel.com
2017-11-16drm/i915/cnl: Extend HDMI 2.0 support to CNL.Rodrigo Vivi
Starting on GLK we support HDMI 2.0. So this patch only extend the work Shashank has made to GLK to CNL. v2: The version that compiles :/ v3: Invert order to newer || older platforms check. (Ville). Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Paulo Zanoni <paulo.r.zanoni@intel.com> Cc: Shashank Sharma <shashank.sharma@intel.com> Cc: Manasi Navare <manasi.d.navare@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171115184205.8104-1-rodrigo.vivi@intel.com
2017-11-16drm/i915/cnl: Simplify dco_fraction calculation.Rodrigo Vivi
I confess I never fully understood that previous calculation, so this is not a "fix". But let's simplify this math so poor brains like mine can read and make some sense of it in the future. v2: Don't follow the spec since that gives invalid values and it is also confusing. This Ville's version is much simpler. v3: Use u64 cast instead of declaring a u64 dco. (Ville). Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Mika Kahola <mika.kahola@intel.com> Cc: Manasi Navare <manasi.d.navare@intel.com> Cc: James Ausmus <james.ausmus@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171115184257.8633-1-rodrigo.vivi@intel.com
2017-11-16drm/i915/cnl: Don't blindly replace qdiv.Rodrigo Vivi
Accordingly to spec "If Kdiv != 2, then Qdiv must be 1." but we already handle qdiv values properly and this case here should be spurious. But instead of blindly replacing let's warn loudly instead. Because it means something was really wrong on initial setup. Cc: Mika Kahola <mika.kahola@intel.com> Cc: Manasi Navare <manasi.d.navare@intel.com> Cc: James Ausmus <james.ausmus@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Manasi Navare <manasi.d.navare@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171114194759.24541-6-rodrigo.vivi@intel.com
2017-11-16drm/i915/cnl: Fix wrpll math for higher freqs.Rodrigo Vivi
Spec describe all values in MHz. We handle our clocks in KHz. This includes the best_dco_centrality that was forgot in the same unity as spec. Consequently we couldn't get a good divider for high frequenies. Hence HDMI 2.0 wasn't working. Spec tells 999999 for initial best_dco_centrality meaning the max value in MHz. Since we convert dco from MHz to KHz we also need to convert this initial best_doc_centrality to 999999000 or 999999999 or even better, to the max that its variable allow. This patch also replaces the use of "* KHz(1)" with the values directly on KHz to avoid future confusion. v2: Use U32_MAX instead of random 99999 as spec tells. (Ville). Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Shashank Sharma <shashank.sharma@intel.com> Cc: Mika Kahola <mika.kahola@intel.com> Cc: Manasi Navare <manasi.d.navare@intel.com> Cc: James Ausmus <james.ausmus@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171114234223.10600-1-rodrigo.vivi@intel.com
2017-11-16drm/i915/cnl: Fix, simplify and unify wrpll variable sizes.Rodrigo Vivi
- 64 bits is not needed for afe_clock now we don't convert that to Hz. - 16 bits is not enough for all dco stuff. - unsigned is not relevant/needed for all divisors values. Cc: Mika Kahola <mika.kahola@intel.com> Cc: Manasi Navare <manasi.d.navare@intel.com> Cc: James Ausmus <james.ausmus@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Manasi Navare <manasi.d.navare@intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20171114194759.24541-4-rodrigo.vivi@intel.com
2017-11-16drm/i915/cnl: Remove useless conversion.Rodrigo Vivi
No functional change. Just starting the wrpll fixes with a clean-up to make units a bit more clear. Cc: Mika Kahola <mika.kahola@intel.com> Cc: Manasi Navare <manasi.d.navare@intel.com> Cc: James Ausmus <james.ausmus@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Manasi Navare <manasi.d.navare@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171114194759.24541-3-rodrigo.vivi@intel.com