summaryrefslogtreecommitdiff
path: root/drivers
AgeCommit message (Collapse)Author
2017-05-24drm/amdgpu: add RAVEN family id definitionChunming Zhou
RAVEN is a new APU. Signed-off-by: Chunming Zhou <David1.Zhou@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-05-24drm/amdgpu: add register headers for VCN 1.0Alex Deucher
Add registers for Video Controller Next 1.0 Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-05-24drm/amdgpu: add register headers for THM 10.0Alex Deucher
Add registers for THerMal control 10.0 Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-05-24drm/amdgpu: add register headers for SDMA 4.1Alex Deucher
Add registers for SDMA 4.1 Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-05-24drm/amdgpu: add register headers for NBIO 7.0Alex Deucher
Add registers for NBIO 7.0 Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-05-24drm/amdgpu: add register headers for MP 10.0Alex Deucher
Add registers for MP 10.0 Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-05-24drm/amdgpu: add register headers for MMHUB 9.1Alex Deucher
Add registers for the MultiMedia Hub 9.1 Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-05-24drm/amdgpu: add register headers for GC 9.1Alex Deucher
Registers for Graphics Controller 9.1 Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-05-24drm/amdgpu: add register headers for DCN 1.0Alex Deucher
Registers for Display Controller Next 1.0 Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-05-24drm/amdgpu:use job's list instead of check fenceMonk Liu
because if the fence is really signaled, it could already released so the fence pointer is a wild pointer, but if we use job->base.node we are safe because job will not be released untill amdgpu_job_timedout finished. Signed-off-by: Monk Liu <Monk.Liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-05-24drm/amdgpu/SRIOV:implement guilty job TDR for(V2)Monk Liu
1,TDR will kickout guilty job if it hang exceed the threshold of the given one from kernel paramter "job_hang_limit", that way a bad command stream will not infinitly cause GPU hang. by default this threshold is 1 so a job will be kicked out after it hang. 2,if a job timeout TDR routine will not reset all sched/ring, instead if will only reset on the givn one which is indicated by @job of amdgpu_sriov_gpu_reset, that way we don't need to reset and recover each sched/ring if we already know which job cause GPU hang. 3,unblock sriov_gpu_reset for AI family. V2: 1:put kickout guilty job after sched parked. 2:since parking scheduler prior to kickout already occupies a while, we can do last check on the in question job before doing hw_reset. TODO: 1:when a job is considered as guilty, we should mark some flag in its fence status flag, and let UMD side aware that this fence signaling is not due to job complete but job hang. 2:if gpu reset cause all video memory lost, we need introduce a new policy to implement TDR, like drop all jobs not yet signaled, and all IOCTL on this device will return ERROR DEVICE_LOST. this will be implemented later. Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-05-24drm/amdgpu:don't init entity for KIQMonk Liu
We don't need a scheduler for KIQ. Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-05-24drm/amdgpu:only call flr_work under infinite timeoutMonk Liu
Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-05-24drm/amdgpu:use job* to replace voluntaryMonk Liu
that way we can know which job cause hang and can do per sched reset/recovery instead of all sched. Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-05-24drm/amdgpu:don't invoke srio-gpu-reset in gpu-reset (v2)Monk Liu
because we don't want to do sriov-gpu-reset under certain cases, so just split those two funtion and don't invoke sr-iov one from bare-metal one. V2: remove debugfs_gpu_reset routine on SRIOV case. Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-05-24drm/amdgpu: id reset count only is updated when used end v2Chunming Zhou
before that, we have function to check if reset happens by using reset count. v2: always update reset count after vm flush Signed-off-by: Chunming Zhou <David1.Zhou@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-05-24drm/amdgpu: make pipeline sync be in same place v2Chunming Zhou
v2: directly return for 'if' case. Signed-off-by: Chunming Zhou <David1.Zhou@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-05-24drm/amdgpu: add sched sync for amdgpu job v2Chunming Zhou
this is an improvement for previous patch, the sched_sync is to store fence that could be skipped as scheduled, when job is executed, we didn't need pipeline_sync if all fences in sched_sync are signalled, otherwise insert pipeline_sync still. v2: handle error when adding fence to sync failed. Signed-off-by: Chunming Zhou <David1.Zhou@amd.com> Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com> (v1) Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-05-24drm/amdgpu: remove unsed amdgpu_gem_handle_lockup (v2)Christian König
This kind of reset handling was removed a long time ago. v2: fix warning (Alex) Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-05-24drm/amdgpu: print when gpu reset successedChunming Zhou
Signed-off-by: Chunming Zhou <David1.Zhou@amd.com> Reviewed-by: Roger.He <Hongbo.He@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-05-24drm/amdgpu: fix ring0 failed on pro cardChunming Zhou
the root cause is vram content is lost completely after pci reset. Signed-off-by: Chunming Zhou <David1.Zhou@amd.com> Reviewed-by: Roger.He <Hongbo.He@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-05-24drm/amdgpu: extend lock range for race condition when gpu resetRoger.He
to cover below case: 1. A task gart bind/unbind but not add to adev->gtt_list yet 2. at this time gpu reset, gtt only recover those gtt in adev->gtt_list Reviewed-by: Chunming Zhou <david1.zhou@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Roger.He <Hongbo.He@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-05-24drm/amdgpu: Fix comments in source codeAlex Xie
Signed-off-by: Alex Xie <AlexBin.Xie@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-05-24drm/amdgpu: fix errors in comments.Alex Xie
Signed-off-by: Alex Xie <AlexBin.Xie@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-05-24drm/amdgpu/gfx9: move define to header fileAlex Deucher
rather than defining it locally. Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-05-24drm/amd/amdgpu: get rid of else branchNikola Pajkovsky
else branch is pointless if it's right at the end of function and use unlikely() on err path. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Nikola Pajkovsky <npajkovsky@suse.cz> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-05-24drm/amdgpu:cleanup flag not usedMonk Liu
Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-05-24drm/amdgpu:use FRAME_CNTL for new GFX ucode (v2)Monk Liu
AI affected: CP/HW team requires KMD insert FRAME_CONTROL(end) after the last IB and before the fence of this DMAframe. this is to make sure the cache are flushed, and it's a must change no matter MCBP/SR-IOV or bare-metal case because new CP hw won't do the cache flush for each IB anymore, it just leaves it to KMD now. with this patch, certain MCBP hang issue when rendering vulkan/chained-ib are resolved. v2: drop gfx8 changes. gfx8 is not affected (Alex) Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-05-24drm/amdgpu:new PM4 entry for VI/AIMonk Liu
TMZ package will be used for VULKAN/CHAINED-IB MCBP Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-05-24drm/amdgpu:change SR-IOV DMAframe schemeMonk Liu
According to CP/hw team requirment, to support PAL/CHAINED-IB MCBP, kernel driver must guarantee DE_META must be inserted right prior to the work_load DE IB (with PREEMPT flag), there cannot be any non-work_load DE IB between-in DE_META and work_load DE IB. Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-05-24drm/amdgpu:unify gfx8/9 ce/de meta_dataMonk Liu
Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-05-24drm/amdgpu:cleanup indent/format for gfx_v9_0.cMonk Liu
Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-05-24drm/amdgpu: clean doorbell after sending init table to mmschFrank Min
According to HW design, need to clean doorbell after setup MMSCH table. Signed-off-by: Frank Min <Frank.Min@amd.com> Signed-off-by: Xiangliang Yu <Xiangliang.Yu@amd.com> Reviewed-by: Monk Liu <Monk.Liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-05-24drm/amdgpu/virt: change AI ack-irq message to debug levelXiangliang Yu
Change message to debug level as VI does. Signed-off-by: Xiangliang Yu <Xiangliang.Yu@amd.com> Reviewed-by: Monk Liu <Monk.Liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-05-24drm/amdgpu/psp: Do not load asd for SRIOVXiangliang Yu
If psp version doesn't match asd version, asd loading will be failed. Add workaround to bypass it for sriov. Signed-off-by: Daniel Wang <Daniel.Wang2@amd.com> Signed-off-by: Xiangliang Yu <Xiangliang.Yu@amd.com> Reviewed-by: Monk Liu <Monk.Liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-05-24drm/amdgpu: Bypass GMC/UVD/VCE hw_fini in SR-IOVTrigger Huang
On vega10, some hw finish operations should not be applied in SR-IOV case. This works as workaround to fix multi-VFs reboot/shutdown issues. Signed-off-by: Trigger Huang <trigger.huang@amd.com> Reviewed-by: Xiangliang Yu <Xiangliang.Yu@amd.com> Reviewed-by: Monk Liu <monk.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-05-24drm/amdgpu:re-write sriov_reinit_early/late (v2)Monk Liu
1,this way we make those routines compatible with the sequence requirment for both Tonga and Vega10 2,ignore PSP hw init when doing TDR, because for SR-IOV device the ucode won't get lost after VF FLR, so no need to invoke PSP doing the ucode reloading again. v2: squash in ARRAY_SIZE fix Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Xiangliang Yu <Xiangliang.Yu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-05-24drm/amdgpu:need som change on vega10 mailboxMonk Liu
if sriov gpu reset is invoked by job timeout, it is run in a global work-queue which is very slow and better not call msleep ortherwise it takes long time to get back CPU. so make below changes: 1: Change msleep 1 to mdelay 5 2: Ignore the ack fail from pf after time out, because VF FLR will clear ack, sometime VF FLR is done prior to the beginning of poll_ack so we can ignore this ack TODO: Put job_timedout (and the following gpu reset) in a driver thread, instead of the global work_struct. Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Xiangliang Yu <Xiangliang.Yu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-05-24drm/amdgpu:fix cannot receive rcv/ack irq bugMonk Liu
Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Xiangliang Yu <Xiangliang.Yu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-05-24drm/amdgpu:kiq reg access need timeout(v2)Monk Liu
this is to prevent fence forever waiting if FLR occured during register accessing. v2: use define instead of hardcode for the timeout msec Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-05-24drm/amdgpu/gfx9: wait for completion in KIQ initAlex Deucher
We need to make sure the various init sequences submitted to KIQ complete before testing the rings. Reviewed-by: monk liu <monk.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-05-24drm/amdgpu/gfx9: use new KIQ packet definesAlex Deucher
Rather than magic numbers. Reviewed-by: monk liu <monk.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-05-24drm/amdgpu: add KIQ packet defines to soc15d.hAlex Deucher
Will be used in subsequent commits rather rather than magic numbers. Reviewed-by: monk liu <monk.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-05-24drm/amdgpu/gfx9: clear the compute ring on resetAlex Deucher
To be consistent with gfx8. Reviewed-by: monk liu <monk.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-05-24drm/amdgpu/gfx9: create mqd backupsAlex Deucher
And properly synchronize them with the master during queue init. Reviewed-by: monk liu <monk.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-05-24drm/amdgpu: Move kiq ring lock out of virt structureShaoyun Liu
The usage of kiq should not depend on the virtualization. Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com> Reviewed-by:Andres Rodriquez <andresx7@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-05-24drm/amdgpu: bump module verion for reserved vmidChunming Zhou
Interface to reserve a vmid for a specific process to add in shader debugging that requries a fixed vmid. Signed-off-by: Chunming Zhou <David1.Zhou@amd.com> Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-05-24drm/amdgpu: implement grab reserved vmid V4Chunming Zhou
Implement the vmid reservation. v2: move sync waiting only when flush needs v3: fix racy v4: peek fence instead of get fence, and fix potential context starved. Signed-off-by: Chunming Zhou <David1.Zhou@amd.com> Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-05-24drm/amdgpu: add limitation for dedicated vm number v4Chunming Zhou
Limit reserved vmids to 1 to avoid taking too many out of commission and starving the system. v2: move #define to amdgpu_vm.h v3: move reserved vmid counter to id_manager, and increase counter before allocating vmid v4: rename to reserved_vmid_num Signed-off-by: Chunming Zhou <David1.Zhou@amd.com> Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-05-24drm/amdgpu: reserve/unreserve vmid by vm ioctl v4Chunming Zhou
add reserve/unreserve vmid funtions. Used to reserve vmids for certain shader debugging functionality that required a fixed vmid for the life of the debug. v3: only reserve vmid from gfxhub v4: fix racy condition Signed-off-by: Chunming Zhou <David1.Zhou@amd.com> Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>