Age | Commit message (Collapse) | Author |
|
There is a really hairy resolution involving amdgpu fixes, that I'd rather confirm here.
Also some misc fixes are landed by me, but the pr has them as well.
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
In VRR mode, keep track of the vblank count of the last
completed pageflip in amdgpu_crtc->last_flip_vblank, as
recorded in the pageflip completion handler after each
completed flip.
Use that count to prevent mmio programming a new pageflip
within the same vblank in which the last pageflip completed,
iow. to throttle pageflips to at most one flip per video
frame, while at the same time allowing to request a flip
not only before start of vblank, but also anywhere within
vblank.
The old logic did the same, and made sense for regular fixed
refresh rate flipping, but in vrr mode it prevents requesting
a flip anywhere inside the possibly huge vblank, thereby
reducing framerate in vrr mode instead of improving it, by
delaying a slightly delayed flip requests up to a maximum
vblank duration + 1 scanout duration. This would limit VRR
usefulness to only help applications with a very high GPU
demand, which can submit the flip request before start of
vblank, but then have to wait long for fences to complete.
With this method a flip can be both requested and - after
fences have completed - executed, ie. it doesn't matter if
the request (amdgpu_dm_do_flip()) gets delayed until deep
into the extended vblank due to cpu execution delays. This
also allows clients which want to regulate framerate within
the vrr range a much more fine-grained control of flip timing,
a feature that might be useful for video playback, and is
very useful for neuroscience/vision research applications.
In regular non-VRR mode, retain the old flip submission
behavior. This to keep flip scheduling for fullscreen X11/GLX
OpenGL clients intact, if they use the GLX_OML_sync_control
extensions glXSwapBufferMscOML(, ..., target_msc,...) function
with a specific target_msc target vblank count.
glXSwapBuffersMscOML() or DRI3/Present PresentPixmap() will
not flip at the proper target_msc for a non-zero target_msc
if VRR mode is active with this patch. They'd often flip one
frame too early. However, this limitation should not matter
much in VRR mode, as scheduling based on vblank counts is
pretty futile/unusable under variable refresh duration
anyway, so no real extra harm is done.
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Cc: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Cc: Harry Wentland <harry.wentland@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Michel Dänzer <michel@daenzer.net>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
into drm-next
Fixes for 5.1:
amdgpu:
- Fix missing fw declaration after dropping old CI DPM code
- Fix debugfs access to registers beyond the MMIO bar size
- Fix context priority handling
- Add missing license on some new files
- Various cleanups and bug fixes
radeon:
- Fix missing break in CS parser for evergreen
- Various cleanups and bug fixes
sched:
- Fix entities with 0 run queues
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190221214134.3308-1-alexander.deucher@amd.com
|
|
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
The changes to fix those are two invasive for backporting.
Just disable the feature in 4.20 and 5.0.
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Cc: <stable@vger.kernel.org> [4.20+]
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
We still need to set bulk_movable to false when new BOs are added or removed.
v2: also set it to false on removal
Signed-off-by: Christian König <christian.koenig@amd.com>
Tested-by: StDenis, Tom <Tom.StDenis@amd.com>
Tested-by: Przemek Socha <soprwa@gmail.com>
Reviewed-by: Zhou, David(ChunMing) <David1.Zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
We only need to set this to false now when BOs are removed from the LRU.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
The original change caused a regression, so revert it until the new fix
is ready.
BUG: https://bugs.freedesktop.org/show_bug.cgi?id=109650
This reverts commit 764c85fef41722db0f21558c6c2fb38bee172d19.
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
This reverts commit 9006c6bd9059cb9807fa863bafc1d776222cb61b.
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
According to hardware engineer, WRITE_BURST_LENGTH [9:8] in register
SDMA0_CHICKEN_BITS need to change to 3 for better performance
Signed-off-by: shaoyunl <shaoyun.liu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Based on a similar patch from Rafael for radeon.
When using ATPX to control dGPU power, the state is not retained
across suspend and resume cycles by default. This can probably
be loosened for Hybrid Graphics (_PR3) laptops where I think the
state is properly retained.
Fixes: c62ec4610c40 ("PM / core: Fix direct_complete handling for devices with no callbacks")
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
|
|
In preparation to enabling -Wimplicit-fallthrough, mark switch
cases where we are expecting to fall through.
Warning level 3 was used: -Wimplicit-fallthrough=3
This patch is part of the ongoing efforts to enable
-Wimplicit-fallthrough.
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
We can directly calculate sdma doorbell indexes in the process doorbell
pages through the doorbell_index structure in amdgpu_device, so no need
to cache them in kgd2kfd_shared_resources any more. This alleviates the
adaptation needs when new SDMA configurations are introduced.
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Reserved doorbells for SDMA IH and VCN were not properly masked out
when allocating doorbells for CP user queues. This patch fixed that.
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
They will be used to inform KFD the doorbell range not usable for CP.
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Backmerging for nouveau and imx that needed some fixes for next pulls.
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
Carried over from radeon, but no longer used.
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Given a master fd we can then override the priority of the context
in another fd.
Using these overrides was recommended by Christian instead of trying
to submit from a master fd, and I am adding a way to override a
single context instead of the entire process so we can only upgrade
a single Vulkan queue and not effectively the entire process.
Reused the flags field as it was checked to be 0 anyways, so nothing
used it. This is source-incompatible (due to the name change), but
ABI compatible.
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Otherwise we interpret the file private data as drm & amdgpu data
while it might not be, possibly allowing one to get memory corruption.
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
I don't see another way to figure out if a ring is initialized if
the hardware block might not be initialized.
Entities have been fixed up to handle num_rqs = 0.
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
This prevents us from accessing extended registers in tools like
umr. The register access functions already check if the offset
is beyond the BAR size and use the indirect accessors with locking
so this is safe.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
There are several statements that are incorrectly indented. Fix these.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
They are no longer used, so delete them to avoid confusion.
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
CP_RB_DOORBELL_RANGE_LOWER/UPPER and CP_MEC_DOORBELL_RANGE_LOWER/UPPER
are used for waking up an idle scheduler and for power gating support.
Usually the first few doorbells in pci doorbell bar are used for RB
and all leftover for MEC. This patch fixes the incorrect settings.
Theoretically, gfx ring doorbells should come before all MEC doorbells
to be consistent with the design. However, since the doorbell
allocations are agreed by all and we are not free to change them, also
considering the kernel MEC ring doorbells which are before gfx ring
doorbells are not used often, we compromise by leaving the doorbell
allocations unchanged.
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Temporarily removing eviction fences to avoid triggering them by
accident is no longer necessary due to the fence_owner logic in
amdgpu_sync_resv.
As a result the ef_list usage of amdgpu_amdkfd_remove_eviction_fence
and amdgpu_amdkfd_add_eviction_fence are no longer needed.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Use FENCE_OWNER_KFD to synchronize PT/PD initialization and clearing
of page table entries. This avoids triggering KFD eviction fences on
the PD reservation objects of compute VMs.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
The fence_owner logic in amdgpu_sync_wait will allow waiting without
having to temporarily remove eviction fences.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Creates a temporary sync object to wait for the BO reservation. This
generalizes amdgpu_vm_wait_pd.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
sriov
sriov's gpu_recover inside xgpu_ai_mailbox_flr_work would cause duplicate recover in TDR.
TDR's gpu_recover would be triggered by amdgpu_job_timedout,
that could avoid vk-cts failure by unexpected recover.
Signed-off-by: Wentao Lou <Wentao.Lou@amd.com>
Acked-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Remove the callback and call the dispatcher directly.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Don't warn or fail if it's missing.
v2: handle xgmi case more gracefully.
v3: handle older kernels properly
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Tested-by: James Zhu <James.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
into drm-next
Updates for 5.1:
- GDS fixes
- Add AMDGPU_CHUNK_ID_SCHEDULED_DEPENDENCIES interface
- GPUVM fixes
- PCIE DPM switching fixes for vega20
- Vega10 uclk DPM regression fix
- DC Freesync fixes
- DC ABM fixes
- Various DC cleanups
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190208210214.27666-1-alexander.deucher@amd.com
|
|
The exclusive fence is of course perfectly optional here.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
The new Vega series GPU cards have in-built bridges. To get the pcie
speed and width supported by the platform walk the hierarchy and get the
slowest link.
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
No functional change.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Otherwise we open up the possibility to use uninitialized memory.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
I'm not increasing the DRM version because GDS isn't totally without bugs yet.
v2: update emit_ib_size
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
New chunk for dependency on start of job's execution instead on
the end. This is used for GPU deadlock prevention when
userspace uses mid-IB fences to wait for mid-IB work on other rings.
v2: Fix typo in AMDGPU_CHUNK_ID_SCHEDULED_DEPENDENCIES
v3: Bump KMS version
v4: put old fence AFTER acquiring the scheduled fence.
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Suggested-by: Christian Koenig <Christian.Koenig@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
There is a spelling mistake in a dev_err message. Fix it.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
amdgpu_vm_get_task_info is called from interrupt handler and sched timeout
workqueue, we should use irq version spin_lock to avoid deadlock.
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
into drm-next
ttm:
- Replace ref/unref naming with get/put
amdgpu:
- Revert DC clang fix, causes a segfault with some compiler versions
- SR-IOV fix
- PCIE fix for vega20
- Misc DC fixes
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190201062345.7304-1-alexander.deucher@amd.com
|
|
git://anongit.freedesktop.org/drm/drm-misc into drm-next
drm-misc-next for 5.1:
UAPI Changes:
Cross-subsystem Changes:
Core Changes:
- Split out some part of drm_crtc_helper.h into drm_probe_helper.h
- DRIVER_* flags improvements
- New tasks on the TODO-list
- Improvements to the documentation
Driver Changes:
- Continual of drmP.h removal in multiple drivers
- Removal of FBINFO_(FLAG_)DEFAULT in multiple drivers
- sun4i: Addition of the A23 support, multiple fixes for the tiled
formats
- atmel-hlcdc: Fix of clipping and rotation properties
- qxl: various BO-related improvements, prime and generic fbdev emulation
support
- dw-hdmi: Support for HDMI2.0 2160p modes and YUV420 output
- New Sitronix ST7701 panel driver
- New Kingdisplay KD097D04 panel driver
- New LeMaker BL035-RGB-002 panel driver
- New PDA 91-00156-A0 panel driver
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maxime Ripard <maxime.ripard@bootlin.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190201144749.t3abxvguhstu6bcl@flea
|
|
- move all adjustments into one place
- specify GDS/GWS/OA alignment in basic units of the heaps
- it looks like GDS alignment was 1 instead of 4
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
This patch fixes the incorrect external id that kernel reports to user mode
driver. Raven2's rev_id is starts from 0x8, so its external id (0x81) should
start from rev_id + 0x79 (0x81 - 0x8). And Raven's rev_id should be 0x21 while
rev_id == 1.
Reported-by: Crystal Jin <Crystal.Jin@amd.com>
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Fixes doorbell reflection on Vega20.
Change-Id: I0495139d160a9032dff5977289b1eec11c16f781
Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
into drm-next
New stuff for 5.1.
amdgpu:
- DC bandwidth formula updates
- Support for DCC on scanout surfaces
- Support for multiple IH rings on soc15 asics
- Fix xgmi locking
- Add sysfs interface to get pcie usage stats
- Simplify DC i2c/aux code
- Initial support for BACO on vega10/20
- New runtime SMU feature debug interface
- Expand existing sysfs power interfaces to new clock domains
- Handle kexec properly
- Simplify IH programming
- Rework doorbell handling across asics
- Drop old CI DPM implementation
- DC page flipping fixes
- Misc SR-IOV fixes
amdkfd:
- Simplify the interfaces between amdkfd and amdgpu
ttm:
- Add a callback to notify the driver when the lru changes
sched:
- Refactor mirror list handling
- Rework hw fence processing
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190125231517.26268-1-alexander.deucher@amd.com
|
|
amdgpu only uses shared-fences internally, but dmabuf importers rely on
implicit write hazard tracking via the reservation_object.fence_excl.
For example, the importer use the write hazard for timing a page flip to
only occur after the exporter has finished flushing its write into the
surface. As such, on exporting a dmabuf, we must either flush all
outstanding fences (for we do not know which are writes and should have
been exclusive) or alternatively create a new exclusive fence that is
the composite of all the existing shared fences, and so will only be
signaled when all earlier fences are signaled (ensuring that we can not
be signaled before the completion of any earlier write).
v2: reservation_object is already locked by amdgpu_bo_reserve()
v3: Replace looping with get_fences_rcu and special case the promotion
of a single shared fence directly to an exclusive fence, bypassing the
fence array.
v4: Drop the fence array ref after assigning to reservation_object
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107341
Testcase: igt/amd_prime/amd-to-i915
References: 8e94a46c1770 ("drm/amdgpu: Attach exclusive fence to prime exported bo's. (v5)")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Reviewed-by: "Christian König" <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Reduce the repeated node and hive information during XGMI initialization
Signed-off-by: shaoyunl <shaoyun.liu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
kptr is not used any more.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
sriov need to restrict max_pfn below AMDGPU_GMC_HOLE.
access the hole results in a range fault interrupt IIRC.
Signed-off-by: Wentao Lou <Wentao.Lou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|