summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2016-08-05drm/i915: Remove forced stop ring on suspend/unloadChris Wilson
Before suspending (or unloading), we would first wait upon all rendering to be completed and then disable the rings. This later step is a remanent from DRI1 days when we did not use request tracking for all operations upon the ring. Now that we are sure we are waiting upon the very last operation by the engine, we can forgo clobbering the ring registers, though we do keep the assert that the engine is indeed idle before sleeping. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470388464-28458-5-git-send-email-chris@chris-wilson.co.uk
2016-08-05drm/i915/userptr: Remove superfluous interruptible=false on waitingChris Wilson
Inside the kthread context, we can't be interrupted by signals so touching the mm.interruptible flag is pointless and wait-request now consumes EIO itself. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470388464-28458-4-git-send-email-chris@chris-wilson.co.uk
2016-08-05drm/i915: Convert non-blocking userptr waits for requests over to using RCUChris Wilson
We can completely avoid taking the struct_mutex around the non-blocking waits by switching over to the RCU request management (trading the mutex for a RCU read lock and some complex atomic operations). The improvement is that we gain further contention reduction, and overall the code become simpler due to the reduced mutex dancing. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470388464-28458-3-git-send-email-chris@chris-wilson.co.uk
2016-08-05drm/i915: Convert non-blocking waits for requests over to using RCUChris Wilson
We can completely avoid taking the struct_mutex around the non-blocking waits by switching over to the RCU request management (trading the mutex for a RCU read lock and some complex atomic operations). The improvement is that we gain further contention reduction, and overall the code become simpler due to the reduced mutex dancing. v2: Move i915_gem_fault tracepoint back to the start of the function, before the unlocked wait. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470388464-28458-2-git-send-email-chris@chris-wilson.co.uk
2016-08-05drm/i915: Introduce i915_gem_active_wait_unlocked()Chris Wilson
It is useful to be able to wait on pending rendering without grabbing the struct_mutex. We can do this by using the i915_gem_active_get_rcu() primitive to acquire a reference to the pending request without requiring struct_mutex, just the RCU read lock, and then call i915_wait_request(). v2: Rebase onto new i915_gem_active_get_unlocked() semantics that take the RCU read lock on behalf of the caller. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470388464-28458-1-git-send-email-chris@chris-wilson.co.uk
2016-08-05Merge remote-tracking branch 'airlied/drm-next' into drm-intel-next-queuedDaniel Vetter
Backmerge the 4.8 pull request state from Dave - conflicts were getting out of hand, and Chris has some patches which outright don't apply without everything merged together again. Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
2016-08-05drm/i915: Fix iboost setting for SKL Y/U DP DDI buffer translation entry 2Ville Syrjälä
The spec was recently fixed to have the correct iboost setting for the SKL Y/U DP DDI buffer translation table entry 2. Update our tables to match. Cc: David Weinehall <david.weinehall@linux.intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470140517-13011-1-git-send-email-ville.syrjala@linux.intel.com Cc: stable@vger.kernel.org Reviewed-by: David Weinehall <david.weinehall@linux.intel.com>
2016-08-04drm/i915/gen9: Give one extra block per line for SKL plane WM calculationsMatt Roper
The bspec was updated a couple weeks ago to add an extra block per line to plane watermark calculations for linear pixel formats. Bspec update 115327 description: "Gen9+ - Updated the plane blocks per line calculation for linear cases. Adds +1 for all linear cases to handle the non-block aligned stride cases." Cc: Lyude <cpaul@redhat.com> Cc: drm-intel-fixes@lists.freedesktop.org Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470344880-27394-1-git-send-email-matthew.d.roper@intel.com Reviewed-by: Lyude <cpaul@redhat.com>
2016-08-04drm/i915: Export our request as a dma-buf fence on the reservation objectChris Wilson
If the GEM objects being rendered with in this request have been exported via dma-buf to a third party, hook ourselves into the dma-buf reservation object so that the third party can serialise with our rendering via the dma-buf fences. Testcase: igt/prime_busy Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/1470324762-2545-26-git-send-email-chris@chris-wilson.co.uk
2016-08-04drm/i915: Enable lockless lookup of request tracking via RCUChris Wilson
If we enable RCU for the requests (providing a grace period where we can inspect a "dead" request before it is freed), we can allow callers to carefully perform lockless lookup of an active request. However, by enabling deferred freeing of requests, we can potentially hog a lot of memory when dealing with tens of thousands of requests per second - with a quick insertion of a synchronize_rcu() inside our shrinker callback, that issue disappears. v2: Currently, it is our responsibility to handle reclaim i.e. to avoid hogging memory with the delayed slab frees. At the moment, we wait for a grace period in the shrinker, and block for all RCU callbacks on oom. Suggested alternatives focus on flushing our RCU callback when we have a certain number of outstanding request frees, and blocking on that flush after a second high watermark. (So rather than wait for the system to run out of memory, we stop issuing requests - both are nondeterministic.) Paul E. McKenney wrote: Another approach is synchronize_rcu() after some largish number of requests. The advantage of this approach is that it throttles the production of callbacks at the source. The corresponding disadvantage is that it slows things up. Another approach is to use call_rcu(), but if the previous call_rcu() is still in flight, block waiting for it. Yet another approach is the get_state_synchronize_rcu() / cond_synchronize_rcu() pair. The idea is to do something like this: cond_synchronize_rcu(cookie); cookie = get_state_synchronize_rcu(); You would of course do an initial get_state_synchronize_rcu() to get things going. This would not block unless there was less than one grace period's worth of time between invocations. But this assumes a busy system, where there is almost always a grace period in flight. But you can make that happen as follows: cond_synchronize_rcu(cookie); cookie = get_state_synchronize_rcu(); call_rcu(&my_rcu_head, noop_function); Note that you need additional code to make sure that the old callback has completed before doing a new one. Setting and clearing a flag with appropriate memory ordering control suffices (e.g,. smp_load_acquire() and smp_store_release()). v3: More comments on compiler and processor order of operations within the RCU lookup and discover we can use rcu_access_pointer() here instead. v4: Wrap i915_gem_active_get_rcu() to take the rcu_read_lock itself. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: "Goel, Akash" <akash.goel@intel.com> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/1470324762-2545-25-git-send-email-chris@chris-wilson.co.uk
2016-08-04drm/i915: Move i915_gem_object_wait_rendering()Chris Wilson
Just move it earlier so that we can use the companion nonblocking version in a couple of more callsites without having to add a forward declaration. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470324762-2545-24-git-send-email-chris@chris-wilson.co.uk
2016-08-04drm/i915: Move obj->active:5 to obj->flagsChris Wilson
We are motivated to avoid using a bitfield for obj->active for a couple of reasons. Firstly, we wish to document our lockless read of obj->active using READ_ONCE inside i915_gem_busy_ioctl() and that requires an integral type (i.e. not a bitfield). Secondly, gcc produces abysmal code when presented with a bitfield and that shows up high on the profiles of request tracking (mainly due to excess memory traffic as it converts the bitfield to a register and back and generates frequent AGI in the process). v2: BIT, break up a long line in compute the other engines, new paint for i915_gem_object_is_active (now i915_gem_object_get_active). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470324762-2545-23-git-send-email-chris@chris-wilson.co.uk
2016-08-04drm/i915: Use dev_priv consistently through the intel_frontbuffer interfaceChris Wilson
Rather than a mismash of struct drm_device *dev and struct drm_i915_private *dev_priv being used freely within a function, be consistent and only pass along dev_priv. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470324762-2545-22-git-send-email-chris@chris-wilson.co.uk
2016-08-04drm/i915: Use atomics to manipulate obj->frontbuffer_bitsChris Wilson
The individual bits inside obj->frontbuffer_bits are protected by each plane->mutex, but the whole bitfield may be accessed by multiple KMS operations simultaneously and so the RMW need to be under atomics. However, for updating the single field we do not need to mandate that it be under the struct_mutex, one more step towards its removal as the de facto BKL. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470324762-2545-21-git-send-email-chris@chris-wilson.co.uk
2016-08-04drm/i915: Make fb_tracking.lock a spinlockChris Wilson
We only need a very lightweight mechanism here as the locking is only used for co-ordinating a bitfield. v2: Move the cheap unlikely tests into the caller v3: Move the kerneldoc into the header (now separated out into intel_fronbuffer.h for better kerneldoc and readability) Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtien <joonas.lahtinen@linux.intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/1470324762-2545-20-git-send-email-chris@chris-wilson.co.uk
2016-08-04drm/i915: Separate intel_frontbuffer into its own headerChris Wilson
In view of adding inline functions into the intel_frontbuffer section, we first split the header into its own file so that we can integrate it more easily with kerneldoc. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/1470324762-2545-19-git-send-email-chris@chris-wilson.co.uk
2016-08-04drm/i915: Remove highly confusing i915_gem_obj_ggtt_pin()Chris Wilson
Since i915_gem_obj_ggtt_pin() is an idiom breaking curry function for i915_gem_object_ggtt_pin(), spare us the confusion and remove it. Removing it now simplifies later patches to change the i915_vma_pin() (and friends) interface. v2: Add a redundant GEM_BUG_ON(!view) to i915_gem_obj_lookup_or_create_ggtt_vma() Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470324762-2545-18-git-send-email-chris@chris-wilson.co.uk
2016-08-04drm/i915: Make i915_vma_pin() small and inlineChris Wilson
Not only is i915_vma_pin() called for every single object on every single execbuf, it is usually a simple increment as the VMA is already bound for execution by the GPU. Rearrange the tests for unbound and pin_count overflow so that we can do the increment and test very cheaply and compact enough to inline the operation into execbuf. The trick used is to note that we can check for an overflow bit (keeping space available for it inside the flags) at the same time as checking the binding bits. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470324762-2545-17-git-send-email-chris@chris-wilson.co.uk
2016-08-04drm/i915: Combine all i915_vma bitfields into a single set of flagsChris Wilson
In preparation to perform some magic to speed up i915_vma_pin(), which is among the hottest of hot paths in execbuf, refactor all the bitfields accessed by i915_vma_pin() into a single unified set of flags. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470324762-2545-16-git-send-email-chris@chris-wilson.co.uk
2016-08-04drm/i915: Start passing around i915_vma from execbufferChris Wilson
During execbuffer we look up the i915_vma in order to reserve them in the VM. However, we then do a double lookup of the vma in order to then pin them, all because we lack the necessary interfaces to operate on i915_vma - so introduce i915_vma_pin()! v2: Tidy parameter lists to remove one level of redirection in the hot path. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470324762-2545-15-git-send-email-chris@chris-wilson.co.uk
2016-08-04drm/i915: Wrap vma->pin_count accessors with small inline helpersChris Wilson
In the next few patches, the VMA pinning API is overhauled and to reduce the churn we pull out the update to the accessors into a prep patch. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470324762-2545-14-git-send-email-chris@chris-wilson.co.uk
2016-08-04drm/i915: Record allocated vma sizeChris Wilson
Tracking the size of the VMA as allocated allows us to dramatically reduce the complexity of later functions (like inserting the VMA in to the drm_mm range manager). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470324762-2545-13-git-send-email-chris@chris-wilson.co.uk
2016-08-04drm/i915: Update i915_gem_get_ggtt_size/_alignment to use drm_i915_privateChris Wilson
For consistency, internal functions should take drm_i915_private rather than drm_device. Now that we are subclassing drm_device, there are no more size wins, but being consistent is its own blessing. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470324762-2545-12-git-send-email-chris@chris-wilson.co.uk
2016-08-04drm/i915: Update the GGTT size/alignment query functionsChris Wilson
In order to be consistent with other address space functions, we want to pass around 64-bit sizes, even though all known global GTT are limited to 4GiB. Similarly, we are trying to be consistent in using the _ggtt_ nomenclature when referring to the special global GTT. v2: Update docs to consistently state "global GTT". Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470324762-2545-11-git-send-email-chris@chris-wilson.co.uk
2016-08-04drm/i915: Convert 4096 alignment request to 0 for drm_mm allocationsChris Wilson
As we always allocate in chunks of 4096 (that being both the PAGE_SIZE and our own GTT_PAGE_SIZE), we know that all results from the drm_mm are aligned to at least 4096. The drm_mm allocator itself is optimised for alignment == 0, and so by converting alignments of 4096 to 0 we can satisfy our own requirements and still hit the faster path. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470324762-2545-10-git-send-email-chris@chris-wilson.co.uk
2016-08-04drm/i915: Split insertion/binding of an object into the VMChris Wilson
Split the insertion into the address space's range manager and binding of that object into the GTT to simplify the code flow when pinning a VMA. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470324762-2545-9-git-send-email-chris@chris-wilson.co.uk
2016-08-04drm/i915: Reduce WARN(i915_gem_valid_gtt_space) to a debug-only checkChris Wilson
i915_gem_valid_gtt_space() is used after inserting the VMA to double check the list - the location should have been chosen to pass all the restrictions. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470324762-2545-8-git-send-email-chris@chris-wilson.co.uk
2016-08-04drm/i915: Pad GTT views of exec objects up to user specified sizeChris Wilson
Our GPUs impose certain requirements upon buffers that depend upon how exactly they are used. Typically this is expressed as that they require a larger surface than would be naively computed by pitch * height. Normally such requirements are hidden away in the userspace driver, but when we accept pointers from strangers and later impose extra conditions on them, the original client allocator has no idea about the monstrosities in the GPU and we require the userspace driver to inform the kernel how many padding pages are required beyond the client allocation. v2: Long time, no see v3: Try an anonymous union for uapi struct compatibility Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470324762-2545-7-git-send-email-chris@chris-wilson.co.uk
2016-08-04drm/i915: Fix up vma alignment to be u64Chris Wilson
This is not the full fix, as we are required to percolate the u64 nature down through the drm_mm stack, but this is required now to prevent explosions due to mismatch between execbuf (eb_vma_misplaced) and vma binding (i915_vma_misplaced) - and reduces the risk of spurious changes as we adjust the vma interface in the next patches. v2: long long casts not required for u64 printk (%llx) Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470324762-2545-6-git-send-email-chris@chris-wilson.co.uk
2016-08-04drm/i915: Remove i915_gem_execbuffer_retire_commands()Chris Wilson
Move the single line to the callsite as the name is now misleading, and the purpose is solely to add the request to the execution queue. Here, we can see that if we failed to dispatch the batch from the request, we can forgo flushing the GPU when closing the request. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470324762-2545-5-git-send-email-chris@chris-wilson.co.uk
2016-08-04drm/i915: Remove request retirement before each batchChris Wilson
This reimplements the denial-of-service protection against igt from commit 227f782e4667 ("drm/i915: Retire requests before creating a new one") and transfers the stall from before each batch into get_pages(). The issue is that the stall is increasing latency between batches which is detrimental in some cases (especially coupled with execlists) to keeping the GPU well fed. Also we have made the observation that retiring requests can of itself free objects (and requests) and therefore makes a good first step when shrinking. v2: Recycle objects prior to i915_gem_object_get_pages() v3: Remove the reference to the ring from i915_gem_requests_ring() as it operates on an intel_engine_cs. v4: Since commit 9b5f4e5ed6fd ("drm/i915: Retire oldest completed request before allocating next") we no longer need the safeguard to retire requests before get_pages(). We no longer see the huge latencies when hitting the shrinker between allocations. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470324762-2545-4-git-send-email-chris@chris-wilson.co.uk
2016-08-04drm/i915: Double check the active status on the batch poolChris Wilson
We should not rely on obj->active being uptodate unless we manually flush it. Instead, we can verify that the next available batch object is idle by looking at its last active request (and checking it for completion). v2: remove the struct drm_device forward declaration added in the process of removing its necessity Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470324762-2545-3-git-send-email-chris@chris-wilson.co.uk
2016-08-04drm/i915: Remove surplus drm_device parameter to i915_gem_evict_something()Chris Wilson
Eviction is VM local, so we can ignore the significance of the drm_device in the caller, and leave it to i915_gem_evict_something() to manage itself. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470324762-2545-2-git-send-email-chris@chris-wilson.co.uk
2016-08-04drm/i915: Combine loops within i915_gem_evict_somethingChris Wilson
Slight micro-optimise to produce combine loops so that gcc is able to optimise the inner-loops concisely. Since we are reviewing the loops, we can update the comments to describe the current state of affairs, in particular the distinction between evicting from the global GTT (which may contain untracked items and transient global pins) and the per-process GTT. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470324762-2545-1-git-send-email-chris@chris-wilson.co.uk
2016-08-04drm/i915: Acquire audio powerwell for HD-Audio registersChris Wilson
On Haswell/Broadwell, the HD-Audio block is inside the HDMI/display power well and so the sna-hda audio codec acquires the display power well while it is operational. However, Skylake separates the powerwells again, but yet we still need the audio powerwell to setup the registers. (But then the hardware uses those registers even while powered off???) Acquiring the powerwell around setting the chicken bits when setting up the audio channel does at least silence the WARNs from touching our registers whilst unpowered. We silence our own test cases, but maybe there is a latent bug in using the audio channel? v2: Grab both rpm wakelock and audio wakelock Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96214 Fixes: 03b135cebc47 "ALSA: hda - remove dependency on i915 power well for SKL") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Libin Yang <libin.yang@intel.com> Cc: Takashi Iwai <tiwai@suse.de> Cc: Marius Vlad <marius.c.vlad@intel.com> Tested-by: Hans de Goede <hdegoede@redhat.com> Cc: stable@vger.kernel.org Link: http://patchwork.freedesktop.org/patch/msgid/1470240540-29004-1-git-send-email-chris@chris-wilson.co.uk Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2016-08-04drm/i915: Don't try to ack sink irqs when there are noneVille Syrjälä
My ASUS PB278 at least doesn't seem to appreciate when you try to ack sink irqs when there are none. Results in this sort of dmesg spam [drm:drm_dp_dpcd_access] too many retries, giving up Let's skip the ack if there are no pending irqs. I have no clue why we do this in two places. One of them likely should just go away. Oh, and MST has its own sink irq handler too... Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1469717448-4297-12-git-send-email-ville.syrjala@linux.intel.com Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2016-08-04drm/i915: Remove useless rate_to_index() usageVille Syrjälä
No need to iterate the rates array in intel_dp_max_link_rate(). We know the max rate will be the last entry, and we already know the size. Cc: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com> Cc: Jim Bride <jim.bride@linux.intel.com> Cc: Manasi D Navare <manasi.d.navare@intel.com> Cc: Durgadoss R <durgadoss.r@intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1469717448-4297-10-git-send-email-ville.syrjala@linux.intel.com Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2016-08-04drm/i915: Allow MST sinks to work even if drm_probe_ddc() failsVille Syrjälä
With HSW + Dell UP2414Q (at least) drm_probe_ddc() occasionally fails, and then we'll assume that the entire display has been disconnected. We don't need the EDID from the main link, so we can simply check if the sink is MST capable, and if so treat is as connected. v2: Skip drm_probe_ddc() entirely for MST (Daniel) Cc: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com> Cc: Jim Bride <jim.bride@linux.intel.com> Cc: Manasi D Navare <manasi.d.navare@intel.com> Cc: Durgadoss R <durgadoss.r@intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/1469800276-6979-1-git-send-email-ville.syrjala@linux.intel.com
2016-08-04drm/i915: Track active streams also for DP SSTVille Syrjälä
s/active_mst_links/active_streams/ and use it also for SST. We can then use this information in the hpd handling to see if the link is active or not, and thus whether we may need to retrain. Cc: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com> Cc: Jim Bride <jim.bride@linux.intel.com> Cc: Manasi D Navare <manasi.d.navare@intel.com> Cc: Durgadoss R <durgadoss.r@intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1469717448-4297-6-git-send-email-ville.syrjala@linux.intel.com Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2016-08-04drm/i915: Reject mixing MST and SST/HDMI on the same digital portVille Syrjälä
We can't mix MST with SST/HDMI on the same physical port, so we'll need to reject such configurations in check_digital_port_conflicts(). Nothing else will prevent this as MST has its fake encoders and its own connectors so the cloning checks won't catch this. The same digital port can be used multiple times, but only if all the encoders involved are MST encoders, so we only want to check MST vs. SST/HDMI, not MST vs. MST. And SST/HDMI vs. SST/HDMI we already check. Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1469717448-4297-5-git-send-email-ville.syrjala@linux.intel.com Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2016-08-04drm/i915: Avoid mixing up SST and MST in DDI setupVille Syrjälä
The MST vs. SST selection should depend purely on the choice of the connector/encoder. So don't try to determine the correct DDI mode based on the intel_dp->is_mst, which simply tells us whether the sink is in MST mode or not. Instead derive the information from the encoder type. Since the link training code deals in non-fake encoders, we'll also need to keep a second copy of that information around, which we'll now designate as 'link_mst'. Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1469717448-4297-4-git-send-email-ville.syrjala@linux.intel.com Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2016-08-04drm/i915: Read PSR caps/intermediate freqs/etc. only once on eDPVille Syrjälä
Currently we re-read a bunch of static eDP panel caps from the DPCD over and over again. Let's do it only once to save some time and effort. v2: Make thing less confusing with intel_edp_init_dpcd() (Chris) Move no_aux_handshake setup in there as well v3: Move tps3/rate printout to intel_dp_long_pulse() so that we'll still get them on eDP as well Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v1) Link: http://patchwork.freedesktop.org/patch/msgid/1469800359-7087-1-git-send-email-ville.syrjala@linux.intel.com
2016-08-04drm/i915: Add missing rpm wakelock to GGTT preadChris Wilson
Joonas spotted a discrepancy between the pwrite and pread ioctls, in that pwrite takes the rpm wakelock around its GGTT access, The wakelock is required in order for the GTT to function. In disregard for the current convention, we take the rpm wakelock around the access itself rather than around the struct_mutex as the nesting is not strictly required and such ordering will one day be fixed by explicitly noting the barrier dependencies between the GGTT and rpm. Fixes: b50a53715f09 ("drm/i915: Support for pread/pwrite ...") Reported-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ankitprasad Sharma <ankitprasad.r.sharma@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: drm-intel-fixes@lists.freedesktop.org Link: http://patchwork.freedesktop.org/patch/msgid/1470298193-21765-1-git-send-email-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2016-08-04drm/i915/fbc: FBC causes display flicker when VT-d is enabled on SkylakeChris Wilson
Erratum SKL075: Display Flicker May Occur When Both VT-d And FBC Are Enabled "Display flickering may occur when both FBC (Frame Buffer Compression) and VT - d (Intel® Virtualization Technology for Directed I/O) are enabled and in use by the display controller." Ville found the w/a name in the database: WaFbcTurnOffFbcWhenHyperVisorIsUsed:skl,bxt and also dug out that it affects Broxton. v2: Log when the quirk is applied. v3: Ensure i915.enable_fbc is false when !HAS_FBC() v4: Fix function name after rebase v5: Add Broxton to the workaround Note for backporting to stable, we need to add #define mkwrite_device_info(ptr) \ ((struct intel_device_info *)INTEL_INFO(ptr)) Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Paulo Zanoni <paulo.r.zanoni@intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: stable@vger.kernel.org Link: http://patchwork.freedesktop.org/patch/msgid/1470296633-20388-1-git-send-email-chris@chris-wilson.co.uk
2016-08-04drm/i915: Fix use of engine->index for register offsetChris Wilson
Since commit de1add360522 ("drm/i915: Decouple execbuf uAPI from internal implementation") the index of the engine (its engine->id) in the internal list no longer matches the hardware id. However, in a couple of locations we missed fixing up the difference. In this case, RING_FAULT_REG() refers to engine->id which is now not what the register offset actually should be. Fortunately, in both case we should be more or less looping over 0..I915_NUM_ENGINES. Fixes: de1add360522 ("drm/i915: Decouple execbuf uAPI from internal...") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1469643077-2523-2-git-send-email-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: stable@vger.kernel.org
2016-08-04Revert "drm/i915: Clean up associated VMAs on context destruction"Chris Wilson
This reverts commit e9f24d5fb7cf3628b195b18ff3ac4e37937ceeae. The patch was only a stop-gap measure that fixed half the problem - the leak of the fbcon when restarting X. A complete solution required releasing the VMA when the object itself was closed rather than rely on file/process exit. The previous patches add the VMA tracking necessary to do close them along with the object, context or file, and so the time has come to remove the partial fix. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470293567-10811-28-git-send-email-chris@chris-wilson.co.uk
2016-08-04drm/i915: Mark the context and address space as closedChris Wilson
When the user closes the context mark it and the dependent address space as closed. As we use an asynchronous destruct method, this has two purposes. First it allows us to flag the closed context and detect internal errors if we to create any new objects for it (as it is removed from the user's namespace, these should be internal bugs only). And secondly, it allows us to immediately reap stale vma. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470293567-10811-27-git-send-email-chris@chris-wilson.co.uk
2016-08-04drm/i915: Release vma when the handle is closedChris Wilson
In order to prevent a leak of the vma on shared objects, we need to hook into the object_close callback to destroy the vma on the object for this file. However, if we destroyed that vma immediately we may cause unexpected application stalls as we try to unbind a busy vma - hence we defer the unbind to when we retire the vma. v2: Keep vma allocated until closed. This is useful for a later optimisation, but it is required now in order to handle potential recursion of i915_vma_unbind() by retiring itself. v3: Comments are important. Testcase: igt/gem_ppggtt/flink-and-close-vma-leak Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470293567-10811-26-git-send-email-chris@chris-wilson.co.uk
2016-08-04drm/i915: Track active vma requestsChris Wilson
Hook the vma itself into the i915_gem_request_retire() so that we can accurately track when a solitary vma is inactive (as opposed to having to wait for the entire object to be idle). This improves the interaction when using multiple contexts (with full-ppgtt) and eliminates some frequent list walking when retiring objects after a completed request. A side-effect is that we get an active vma reference for free. The consequence of this is shown in the next patch... v2: Update inline names to be consistent with i915_gem_object_get_active() Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470293567-10811-25-git-send-email-chris@chris-wilson.co.uk
2016-08-04drm/i915: i915_vma_move_to_active prep patchChris Wilson
This patch is broken out of the next just to remove the code motion from that patch and make it more readable. What we do here is move the i915_vma_move_to_active() to i915_gem_execbuffer.c and put the three stages (read, write, fenced) together so that future modifications to active handling are all located in the same spot. The importance of this is so that we can more simply control the order in which the requests are place in the retirement list (i.e. control the order at which we retire and so control the lifetimes to avoid having to hold onto references). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470293567-10811-24-git-send-email-chris@chris-wilson.co.uk