summaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/i915/intel_lrc.c
AgeCommit message (Collapse)Author
2015-07-15drm/i915/gen9: Add WaSetDisablePixMaskCammingAndRhwoInCommonSliceChickenArun Siluvery
In Indirect context w/a batch buffer, +WaSetDisablePixMaskCammingAndRhwoInCommonSliceChicken v2: SKL revision id was used for BXT, copy paste error found during internal review (Bob Beckett). v3: explain why part of the WA is in Per ctx batch (Mika) Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: Imre Deak <imre.deak@intel.com> Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-07-15drm/i915/gen9: Add WaFlushCoherentL3CacheLinesAtContextSwitch workaroundArun Siluvery
In Indirect context w/a batch buffer, +WaFlushCoherentL3CacheLinesAtContextSwitch:skl,bxt v2: address static checker warning where unsigned value was checked for less than zero which is never true (Dan Carpenter). v3: The WA uses default value of GEN8_L3SQCREG4 during flush but that disables some other WA; update default value to retain it and document dependency (Mika). Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: Imre Deak <imre.deak@intel.com> Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-07-15drm/i915/gen9: Add WaDisableCtxRestoreArbitration workaroundArun Siluvery
In Indirect and Per context w/a batch buffer, +WaDisableCtxRestoreArbitration v2: SKL revision id was used for BXT, copy paste error found during internal review (Bob Beckett). v3: use updated macro. Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Cc: Robert Beckett <robert.beckett@intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: Imre Deak <imre.deak@intel.com> Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-07-15drm/i915: Enable WA batch buffers for Gen9Arun Siluvery
This patch only enables support for Gen9, the actual WA will be initialized in subsequent patches. The WARN that we use to warn user if WA batch support is not available for a particular Gen is replaced with DRM_ERROR as warning here doesn't really add much value. v2: include all infrastructure bits in this patch so that subsequent changes only correspond the WA added (Chris) v3: use updated macro. Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Cc: Imre Deak <imre.deak@intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-07-14drm/i915: Added Programming of the MOCSPeter Antoine
This change adds the programming of the MOCS registers to the gen 9+ platforms. The set of MOCS configuration entries introduced by this patch is intended to be minimal but sufficient to cover the needs of current userspace - i.e. a good set of defaults. It is expected to be extended in the future to provide further default values or to allow userspace to redefine its private MOCS tables based on its demand for additional caching configurations. In this setup, userspace should only utilize the first N entries, higher entries are reserved for future use. It creates a fixed register set that is programmed across the different engines so that all engines have the same table. This is done as the main RCS context only holds the registers for itself and the shared L3 values. By trying to keep the registers consistent across the different engines it should make the programming for the registers consistent. v2: -'static const' for private data structures and style changes.(Matt Turner) v3: - Make the tables "slightly" more readable. (Damien Lespiau) - Updated tables fix performance regression. v4: - Code formatting. (Chris Wilson) - re-privatised mocs code. (Daniel Vetter) v5: - Changed the name of a function. (Chris Wilson) v6: - re-based - Added Mesa table entry (skylake & broxton) (Francisco Jerez) - Tidied up the readability defines (Francisco Jerez) - NUMBER of entries defines wrong. (Jim Bish) - Added comments to clear up the meaning of the tables (Jim Bish) Signed-off-by: Peter Antoine <peter.antoine@intel.com> v7 (Francisco Jerez): - Don't write L3-specific MOCS_ESC/SCC values into the e/LLC control tables. Prefix L3-specific defines consistently with L3_ and e/LLC-specific defines with LE_ to avoid this kind of confusion in the future. - Change L3CC WT define back to RESERVED (matches my hardware documentation and the original patch, probably a misunderstanding of my own previous comment). - Drop Android tables, define new minimal tables more suitable for the open source stack. - Add comment that the MOCS tables are part of the kernel ABI. - Move intel_logical_ring_begin() and _advance() calls one level down (Chris Wilson). - Minor formatting and style fixes. v8 (Francisco Jerez): - Add table size sanity check to emit_mocs_control/l3cc_table() (Chris Wilson). - Add comment about undefined entries being implicitly set to uncached for forwards compatibility. v9 (Francisco Jerez): - Minor style fixes. Signed-off-by: Francisco Jerez <currojerez@riseup.net> Acked-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-07-08drm/i915: Update wa_ctx_emit() macro as per kernel coding guidelinesArun Siluvery
wa_ctx_emit() depends on the name of a local variable; if the name of that variable is changed then we get compile errors. In this case it is unlikely to be changed as this macro is only used in this set of functions but Kernel coding guidelines doesn't recommend doing this. It was my mistake as I should have corrected it at the beginning but missed so correct this before there are more usages of this macro (Bob Beckett). https://www.kernel.org/doc/Documentation/CodingStyle, Chapter 12, "Things to avoid when using macros", point 2): " 2) macros that depend on having a local variable with a magic name: #define FOO(val) bar(index, val) might look like a good thing, but it's confusing as hell when one reads the code and it's prone to breakage from seemingly innocent changes. " v2: Optimization to avoid multiple evaluation of 'index' in the macro. Since we invoke it multiple times, compiler, if it can, should be able to coalesce them into a single condition and remove multiple WARN_ON checks (Chris). Suggested-by: Robert Beckett <robert.beckett@intel.com> Cc: Robert Beckett <robert.beckett@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Imre Deak <imre.deak@intel.com> Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-07-06drm/i915: Mark elsps submitted when they are pushed to hwMika Kuoppala
Now when we have requests this deep on call chain, we can mark the elsp being submitted when it actually is. Remove temp variable and readjust commenting to more closely fit to the code. v2: Avoid tmp variable and reduce number of writes (Chris) Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-07-06drm/i915: Convert execlists_ctx_descriptor() for requestsMika Kuoppala
Pass around requests to carry context deeper in callchain. Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-07-06drm/i915: Convert execlists_elsp_writ() for requestsMika Kuoppala
Pass around requests to carry context deeper in callchain. Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-07-06drm/i915: Convert intel_lr_context_pin() for requestsMika Kuoppala
Pass around requests to carry context deeper in callchain. Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-07-06drm/i915: Assign request ringbuf before pinMika Kuoppala
In preparation to make intel_lr_context_pin|unpin to accept requests, assign ringbuf into request before we call the pinning. v2: No need to unset ringbuf on error path (Chris) Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-07-06drm/i915: Convert execlists_update_context() for requestsMika Kuoppala
Pass around requests to carry context deeper in callchain. Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-07-06drm/i915: Convert execlist_submit_contexts() for requestsMika Kuoppala
Pass around requests to carry context deeper in callchain. Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-07-06drm/i915: Update WaFlushCoherentL3CacheLinesAtContextSwitchArun Siluvery
In this WA we need to set GEN8_L3SQCREG4[21:21] and reset it after PIPE_CONTROL instruction but there is a slight complication as this is applied in WA batch where the values are only initialized once. Dave identified an issue with the current implementation where the register value is read once at the beginning and it is reused; this patch corrects this by saving the register value to memory, update register with the bit of our interest and restore it back with original value. This implementation uses MI_LOAD_REGISTER_MEM which is currently only used by command parser and was using a default length of 0. This is now updated with correct length and moved to appropriate place. Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Dave Gordon <david.s.gordon@intel.com> Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-07-06drm/i915: Enable resource streamer on ExeclistsAbdiel Janulgue
GEN8 and above uses Execlists by default instead of the legacy ringbuffer for batch execution. This patch enables the resource streamer bits when required. Patch is based on the initial work by Minu Mathai <minu.mathai@intel.com> This version also adds the required bits to enable GEN8 Resource Streamer context save and restore for Execlists. Cc: ville.syrjala@linux.intel.com Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com> Reviewed-by: Arun Siluvery <arun.siluvery@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-07-03drm/i915: Reserve space improvementsJohn Harrison
An earlier patch was added to reserve space in the ring buffer for the commands issued during 'add_request()'. The initial version was pessimistic in the way it handled buffer wrapping and would cause premature wraps and thus waste ring space. This patch updates the code to better handle the wrap case. It no longer enforces that the space being asked for and the reserved space are a single contiguous block. Instead, it allows the reserve to be on the far end of a wrap operation. It still guarantees that the space is available so when the wrap occurs, no wait will happen. Thus the wrap cannot fail which is the whole point of the exercise. Also fixed a merge failure with some comments from the original patch. v2: Incorporated suggestion by David Gordon to move the wrap code inside the prepare function and thus allow a single combined wait_for_space() call rather than doing one before the wrap and another after. This also makes the prepare code much simpler and easier to follow. v3: Fix for 'effective_size' vs 'size' during ring buffer remainder calculations (spotted by Tomas Elf). For: VIZ-5115 CC: Daniel Vetter <daniel@ffwll.ch> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Tomas Elf <tomas.elf@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-06-26drm/i915/lrc: Update PDPx registers with lri commandsMichel Thierry
A safer way to update the PDPx registers is sending lri commands, added in the ring before the batchbuffer start. Otherwise, the ctx must be idle before trying to change anything (but the ring-tail) in the ctx image. An example where the ctx won't be idle is lite-restore. This patch depends on 5b7e4c9ce ("drm/i915/gtt: Mark TLBS dirty for gen8+"). v2: Combine lri writes (and save 8 commands). (Mika) v3: Rebase after ring/req changes, and removed references to deprecated patches. Cc: Dave Gordon <david.s.gordon@intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Michel Thierry <michel.thierry@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-06-26drm/i915/gtt: Introduce i915_page_dir_dma_addrMika Kuoppala
The legacy mode mm switch and the execlist context assignment needs dma address for the page directories. Introduce a function that encapsulates the scratch_pd dma fallback if no pd is found. v2: Rebase, s/ring/req Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Michel Thierry <michel.thierry@intel.com> (v1) Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-06-24drm/i915/gen8: Add WaClearSlmSpaceAtContextSwitch workaroundArun Siluvery
In Indirect context w/a batch buffer, WaClearSlmSpaceAtContextSwitch This WA performs writes to scratch page so it must be valid, this check is performed before initializing the batch with this WA. v2: s/PIPE_CONTROL_FLUSH_RO_CACHES/PIPE_CONTROL_FLUSH_L3 (Ville) v3: GTT bit in scratch address should be mbz (Chris) Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Dave Gordon <david.s.gordon@intel.com> Signed-off-by: Rafael Barbalho <rafael.barbalho@intel.com> Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com> Acked-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-06-23drm/i915: Bail out early if WA batch is not available for given GenArun Siluvery
To initialize WA batch, at the moment we first allocate batch and then check whether we have any WA to be initialized for the given Gen; if we don't have any WA then we WARN the user, destroy the batch and return but this is causing another WARN in cleanup code complaining about sleeping in atomic context. Till we understand this better and to keep things simpler, bail out early if we don't have WA. Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-06-23drm/i915: Fix warnings reported by 0-dayArun Siluvery
Kernel 0-day framework reported warnings with WA batch patches, this patch fixes those warnings and an additional warning reported in intel_lrc.c file. Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-06-23drm/i915: Update a bunch of LRC functions to take requestsJohn Harrison
A bunch of the low level LRC functions were passing around ringbuf and ctx pairs. In a few cases, they took the r/c pair and a request as well. This is all quite messy and unnecesary. The context_queue() call is especially bad since the fake request code got removed - it takes a request and three extra things that must be extracted from the request and then it checks them against what it finds in the request. Removing all the derivable data makes the code much simpler all round. This patch updates those functions to just take the request structure. Note that logical_ring_wait_for_space now takes a request structure but already had a local request pointer that it uses to scan for something to wait on. To avoid confusion the local variable has been renamed 'target' (it is searching for a target request to do something with) and the parameter has been called req (to guarantee anything accidentally missed gets a compiler error). v2: Updated commit message re wait_for_space (Tomas Elf review comment). For: VIZ-5115 Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Tomas Elf <tomas.elf@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-06-23drm/i915: Remove 'faked' request from LRC submissionJohn Harrison
The LRC submission code requires a request for tracking purposes. It does not actually require that request to 'complete' it simply uses it for keeping hold of reference counts on contexts and such like. Previously, the fall back path of polling for space in the ring would start by submitting any outstanding work that was sat in the buffer. This submission was not done as part of the request that that work was owned by because that would lead to complications with the request being submitted twice. Instead, a null request structure was passed in to the submit call and a fake one was created. That fall back path has long since been obsoleted and has now been removed. Thus there is never any need to fake up a request structure. This patch removes that code. A couple of sanity check warnings are added as well, just in case. For: VIZ-5115 Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Thomas Daniel <thomas.daniel@intel.com> Reviewed-by: Tomas Elf <tomas.elf@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-06-23drm/i915: Remove the now obsolete 'outstanding_lazy_request'John Harrison
The outstanding_lazy_request is no longer used anywhere in the driver. Everything that was looking at it now has a request explicitly passed in from on high. Everything that was relying upon it behind the scenes is now explicitly creating/passing/submitting its own private request. Thus the OLR can be removed. For: VIZ-5115 Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Tomas Elf <tomas.elf@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-06-23drm/i915: Add *_ring_begin() to request allocationJohn Harrison
Now that the *_ring_begin() functions no longer call the request allocation code, it is finally safe for the request allocation code to call *_ring_begin(). This is important to guarantee that the space reserved for the subsequent i915_add_request() call does actually get reserved. v2: Renamed functions according to review feedback (Tomas Elf). For: VIZ-5115 Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-06-23drm/i915: Update intel_logical_ring_begin() to take a request structureJohn Harrison
Now that everything above has been converted to use requests, intel_logical_ring_begin() can be updated to take a request instead of a ringbuf/context pair. This also means that it no longer needs to lazily allocate a request if no-one happens to have done it earlier. Note that this change makes the execlist signature the same as the legacy version. Thus the two functions could be merged into a ring->begin() wrapper if required. For: VIZ-5115 Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Tomas Elf <tomas.elf@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-06-23drm/i915: Update ring->emit_bb_start() to take a request structureJohn Harrison
Updated the ring->emit_bb_start() implementation to take a request instead of a ringbuf/context pair. For: VIZ-5115 Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Tomas Elf <tomas.elf@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-06-23drm/i915: Update ring->emit_request() to take a request structureJohn Harrison
Updated the ring->emit_request() implementation to take a request instead of a ringbuf/request pair. Also removed its use of the OLR for obtaining the request's seqno. For: VIZ-5115 Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Tomas Elf <tomas.elf@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-06-23drm/i915: Update ring->emit_flush() to take a request structureJohn Harrison
Updated the various ring->emit_flush() implementations to take a request instead of a ringbuf/context pair. For: VIZ-5115 Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Tomas Elf <tomas.elf@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-06-23drm/i915: Update flush_all_caches() to take request structuresJohn Harrison
Updated the *_ring_flush_all_caches() functions to take requests instead of rings or ringbuf/context pairs. For: VIZ-5115 Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Tomas Elf <tomas.elf@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-06-23drm/i915: Update workarounds_emit() to take request structuresJohn Harrison
Updated the *_ring_workarounds_emit() functions to take requests instead of ring/context pairs. For: VIZ-5115 Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Tomas Elf <tomas.elf@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-06-23drm/i915: Update a bunch of execbuffer helpers to take request structuresJohn Harrison
Updated *_ring_invalidate_all_caches(), i915_reset_gen7_sol_offsets() and i915_emit_box() to take request structures instead of ring or ringbuf/context pairs. For: VIZ-5115 Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Tomas Elf <tomas.elf@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-06-23drm/i915: Update [vma|object]_move_to_active() to take request structuresJohn Harrison
Now that everything above has been converted to use request structures, it is possible to update the lower level move_to_active() functions to be request based as well. For: VIZ-5115 Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Tomas Elf <tomas.elf@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-06-23drm/i915: Update add_request() to take a request structureJohn Harrison
Now that all callers of i915_add_request() have a request pointer to hand, it is possible to update the add request function to take a request pointer rather than pulling it out of the OLR. For: VIZ-5115 Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Tomas Elf <tomas.elf@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-06-23drm/i915: Update i915_gem_object_sync() to take a request structureJohn Harrison
The plan is to pass requests around as the basic submission tracking structure rather than rings and contexts. This patch updates the i915_gem_object_sync() code path. v2: Much more complex patch to share a single request between the sync and the page flip. The _sync() function now supports lazy allocation of the request structure. That is, if one is passed in then that will be used. If one is not, then a request will be allocated and passed back out. Note that the _sync() code does not necessarily require a request. Thus one will only be created until certain situations. The reason the lazy allocation must be done within the _sync() code itself is because the decision to need one or not is not really something that code above can second guess (except in the case where one is definitely not required because no ring is passed in). The call chains above _sync() now support passing a request through which most callers passing in NULL and assuming that no request will be required (because they also pass in NULL for the ring and therefore can't be generating any ring code). The exeception is intel_crtc_page_flip() which now supports having a request returned from _sync(). If one is, then that request is shared by the page flip (if the page flip is of a type to need a request). If _sync() does not generate a request but the page flip does need one, then the page flip path will create its own request. v3: Updated comment description to be clearer about 'to_req' parameter (Tomas Elf review request). Rebased onto newer tree that significantly changed the synchronisation code. v4: Updated comments from review feedback (Tomas Elf) For: VIZ-5115 Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Tomas Elf <tomas.elf@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-06-23drm/i915: Update render_state_init() to take a request structureJohn Harrison
Updated the two render_state_init() functions to take a request pointer instead of a ring. This removes their reliance on the OLR. v2: Rebased to newer tree. For: VIZ-5115 Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Tomas Elf <tomas.elf@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-06-23drm/i915: Update init_context() to take a request structureJohn Harrison
Now that everything above has been converted to use requests, it is possible to update init_context() to take a request pointer instead of a ring/context pair. For: VIZ-5115 Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Tomas Elf <tomas.elf@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-06-23drm/i915: Update deferred context creation to do explicit request managementJohn Harrison
In execlist mode, context initialisation is deferred until first use of the given context. This is because execlist mode has per ring context state and thus many more context storage objects than legacy mode and many are never actually used. Previously, the initialisation commands were written to the ring and tagged with some random request structure via the OLR. This seemed to be causing a null pointer deference bug under certain circumstances (BZ:88865). This patch adds explicit request creation and submission to the deferred initialisation code path. Thus removing any reliance on or randomness caused by the OLR. Note that it should be possible to move the deferred context creation until even later - when the context is actually switched to rather than when it is merely validated. This would allow the initialisation to be done within the request of the work that is wanting to use the context. Hence, the extra request that is created, used and retired just for the context init could be removed completely. However, this is left for a follow up patch. For: VIZ-5115 Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Tomas Elf <tomas.elf@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-06-23drm/i915: Add explicit request management to i915_gem_init_hw()John Harrison
Now that a single per ring loop is being done for all the different intialisation steps in i915_gem_init_hw(), it is possible to add proper request management as well. The last remaining issue is that the context enable call eventually ends up within *_render_state_init() and this does its own private _i915_add_request() call. This patch adds explicit request creation and submission to the top level loop and removes the add_request() from deep within the sub-functions. v2: Updated for removal of batch_obj from add_request call in previous patch. For: VIZ-5115 Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Tomas Elf <tomas.elf@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-06-23drm/i915: Don't tag kernel batches as user batchesJohn Harrison
The render state initialisation code does an explicit i915_add_request() call to commit the init commands. It was passing in the initialisation batch buffer to add_request() as the batch object parameter. However, the batch object entry in the request structure (which is all that parameter is used for) is meant for keeping track of user generated batch buffers for blame tagging during GPU hangs. This patch clears the batch object parameter so that kernel generated batch buffers are not tagged as being user generated. For: VIZ-5115 Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Tomas Elf <tomas.elf@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-06-23drm/i915: Add flag to i915_add_request() to skip the cache flushJohn Harrison
In order to explcitly track all GPU work (and completely remove the outstanding lazy request), it is necessary to add extra i915_add_request() calls to various places. Some of these do not need the implicit cache flush done as part of the standard batch buffer submission process. This patch adds a flag to _add_request() to specify whether the flush is required or not. For: VIZ-5115 Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Tomas Elf <tomas.elf@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-06-23drm/i915: Update execbuffer_move_to_active() to take a request structureJohn Harrison
The plan is to pass requests around as the basic submission tracking structure rather than rings and contexts. This patch updates the execbuffer_move_to_active() code path. For: VIZ-5115 Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Tomas Elf <tomas.elf@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-06-23drm/i915: Update move_to_gpu() to take a request structureJohn Harrison
The plan is to pass requests around as the basic submission tracking structure rather than rings and contexts. This patch updates the move_to_gpu() code paths. For: VIZ-5115 Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Tomas Elf <tomas.elf@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-06-23drm/i915: Update the dispatch tracepoint to use params->requestJohn Harrison
Updated a couple of trace points to use the now cached request pointer rather than extracting it from the ring. For: VIZ-5115 Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Tomas Elf <tomas.elf@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-06-23drm/i915: Update alloc_request to return the allocated requestJohn Harrison
The alloc_request() function does not actually return the newly allocated request. Instead, it must be pulled from ring->outstanding_lazy_request. This patch fixes this so that code can create a request and start using it knowing exactly which request it actually owns. v2: Updated for new i915_gem_request_alloc() scheme. For: VIZ-5115 Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Tomas Elf <tomas.elf@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-06-23drm/i915: Simplify i915_gem_execbuffer_retire_commands() parametersJohn Harrison
Shrunk the parameter list of i915_gem_execbuffer_retire_commands() to a single structure as everything it requires is available in the execbuff_params object. For: VIZ-5115 Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Tomas Elf <tomas.elf@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-06-23drm/i915: Merged the many do_execbuf() parameters into a structureJohn Harrison
The do_execbuf() function takes quite a few parameters. The actual set of parameters is going to change with the conversion to passing requests around. Further, it is due to grow massively with the arrival of the GPU scheduler. This patch simplifies the prototype by passing a parameter structure instead. Changing the parameter set in the future is then simply a matter of adding/removing items to the structure. Note that the structure does not contain absolutely everything that is passed in. This is because the intention is to use this structure more extensively later in this patch series and more especially in the GPU scheduler that is coming soon. The latter requires hanging on to the structure as the final hardware submission can be delayed until long after the execbuf IOCTL has returned to user land. Thus it is unsafe to put anything in the structure that is local to the IOCTL call itself - such as the 'args' parameter. All entries must be copies of data or pointers to structures that are reference counted in some way and guaranteed to exist for the duration of the batch buffer's life. v2: Rebased to newer tree and updated for changes to the command parser. Specifically, a code shuffle has required saving the batch start address in the params structure. For: VIZ-5115 Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Tomas Elf <tomas.elf@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-06-23drm/i915: Set context in request from creation even in legacy modeJohn Harrison
In execlist mode, the context object pointer is written in to the request structure (and reference counted) at the point of request creation. In legacy mode, this only happens inside i915_add_request(). This patch updates the legacy code path to match the execlist version. This allows all the intermediate code between request creation and request submission to get at the context object given only a request structure. Thus negating the need to pass context pointers here, there and everywhere. v2: Moved the context reference so it does not need to be undone if the get_seqno() fails. v3: Fixed execlist mode always hitting a warning about invalid last_contexts (which don't exist in execlist mode). v4: Updated for new i915_gem_request_alloc() scheme. For: VIZ-5115 Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Tomas Elf <tomas.elf@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-06-23drm/i915: i915_add_request must not failJohn Harrison
The i915_add_request() function is called to keep track of work that has been written to the ring buffer. It adds epilogue commands to track progress (seqno updates and such), moves the request structure onto the right list and other such house keeping tasks. However, the work itself has already been written to the ring and will get executed whether or not the add request call succeeds. So no matter what goes wrong, there isn't a whole lot of point in failing the call. At the moment, this is fine(ish). If the add request does bail early on and not do the housekeeping, the request will still float around in the ring->outstanding_lazy_request field and be picked up next time. It means multiple pieces of work will be tagged as the same request and driver can't actually wait for the first piece of work until something else has been submitted. But it all sort of hangs together. This patch series is all about removing the OLR and guaranteeing that each piece of work gets its own personal request. That means that there is no more 'hoovering up of forgotten requests'. If the request does not get tracked then it will be leaked. Thus the add request call _must_ not fail. The previous patch should have already ensured that it _will_ not fail by removing the potential for running out of ring space. This patch enforces the rule by actually removing the early exit paths and the return code. Note that if something does manage to fail and the epilogue commands don't get written to the ring, the driver will still hang together. The request will be added to the tracking lists. And as in the old case, any subsequent work will generate a new seqno which will suffice for marking the old one as complete. v2: Improved WARNings (Tomas Elf review request). For: VIZ-5115 Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Tomas Elf <tomas.elf@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-06-23drm/i915: Reserve ring buffer space for i915_add_request() commandsJohn Harrison
It is a bad idea for i915_add_request() to fail. The work will already have been send to the ring and will be processed, but there will not be any tracking or management of that work. The only way the add request call can fail is if it can't write its epilogue commands to the ring (cache flushing, seqno updates, interrupt signalling). The reasons for that are mostly down to running out of ring buffer space and the problems associated with trying to get some more. This patch prevents that situation from happening in the first place. When a request is created, it marks sufficient space as reserved for the epilogue commands. Thus guaranteeing that by the time the epilogue is written, there will be plenty of space for it. Note that a ring_begin() call is required to actually reserve the space (and do any potential waiting). However, that is not currently done at request creation time. This is because the ring_begin() code can allocate a request. Hence calling begin() from the request allocation code would lead to infinite recursion! Later patches in this series remove the need for begin() to do the allocate. At that point, it becomes safe for the allocate to call begin() and really reserve the space. Until then, there is a potential for insufficient space to be available at the point of calling i915_add_request(). However, that would only be in the case where the request was created and immediately submitted without ever calling ring_begin() and adding any work to that request. Which should never happen. And even if it does, and if that request happens to fall down the tiny window of opportunity for failing due to being out of ring space then does it really matter because the request wasn't doing anything in the first place? v2: Updated the 'reserved space too small' warning to include the offending sizes. Added a 'cancel' operation to clean up when a request is abandoned. Added re-initialisation of tracking state after a buffer wrap to keep the sanity checks accurate. v3: Incremented the reserved size to accommodate Ironlake (after finally managing to run on an ILK system). Also fixed missing wrap code in LRC mode. v4: Added extra comment and removed duplicate WARN (feedback from Tomas). For: VIZ-5115 CC: Tomas Elf <tomas.elf@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>