summaryrefslogtreecommitdiff
path: root/drivers/misc/habanalabs
AgeCommit message (Collapse)Author
2020-03-24habanalabs: fix pm manual->auto in GOYAOded Gabbay
When moving from manual to automatic power management mode in GOYA, the driver didn't correctly place the device in LOW power mode. As a result, if an application was run immediately after the move, it would have run with low frequencies. Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-03-24habanalabs: show unsupported message for GAUDIOded Gabbay
If a GAUDI device is present in the system, display an error message that it is not supported by the current kernel. Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-03-24habanalabs: add print upon clock changeOmer Shpigelman
Add print upon clock slow down due to power consumption or overheating. In addition, add print when back to optimal clock. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-03-24habanalabs: update goya firmware register mapOded Gabbay
Use specific values in enum of register map to be able to deprecate old values. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-03-24habanalabs: Add missing annotation for goya_hw_queues_unlock()Jules Irenge
Sparse reports a warning at goya_hw_queues_unlock() warning: context imbalance in goya_hw_queues_unlock() - unexpected unlock The root cause is a missing annotation at goya_hw_queues_unlock() Add the missing __releases(&goya->hw_queues_lock) annotation Signed-off-by: Jules Irenge <jbi.octave@gmail.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-03-24habanalabs: Add missing annotation for goya_hw_queues_lock()Jules Irenge
Sparse reports a warning at goya_hw_queues_lock() warning: context imbalance in goya_hw_queues_lock() - wrong count at exit The root cause is a missing annotation at goya_hw_queues_lock() Add the missing __acquires(&goya->hw_queues_lock) annotation Signed-off-by: Jules Irenge <jbi.octave@gmail.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-03-24habanalabs: Remove unused parse_cnt variableTomer Tayar
The "parse_cnt" variable is incremented while validating the CS chunks, but it is actually not being used. Signed-off-by: Tomer Tayar <ttayar@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-03-24habanalabs: provide historical maximum of various sensorsChristine Gharzuzi
Add support for hwmon_in_highest, hwmon_temp_highest and hwmon_curr_highest attributes. These attributes retrieve the historical maximum voltage, temperature and current that were sampled, respectively. Signed-off-by: Christine Gharzuzi <cgharzuzi@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-03-24habanalabs: modify the return values of hl_read/write routinesMoti Haimovski
The hl read and write routines implement the hwmon_ops read and write interface routines respectively. These routines are expected to return a completion status when called, which was not the case until this commit. This commit modifies these routines to return 0 upon success and a negative error value upon failure. Signed-off-by: Moti Haimovski <mhaimovski@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-03-24habanalabs: support temperature offset via sysfsMoti Haimovski
This commit adds support for offsetting the temperatures reading by a specified value as defined in https://www.kernel.org/doc/Documentation/hwmon/sysfs-interface using the standard sysfs defined for hwmon. This is required by system administrators to inject errors to test their monitoring applications in data centers. Signed-off-by: Moti Haimovski <mhaimovski@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-03-24habanalabs: ratelimit error prints of IRQsOded Gabbay
The compute engines can perform millions of transactions per second. If there is a bug in the S/W stack, we could get a lot of interrupts and spam the kernel log. Therefore, ratelimit these prints Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-03-24habanalabs: add debugfs write64/read64Moti Haimovski
Allow debug user to write/read 64-bit data through debugfs. This will expedite the dump process of the (large) internal memories of the device done during debug. Signed-off-by: Moti Haimovski <mhaimovski@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-03-24habanalabs: fix DDR bar address settingOmer Shpigelman
DRAM_PHYS_BASE is already taken into account in MMU_PAGE_TABLES_ADDR. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-03-24habanalabs: removing extra ;Oded Gabbay
There is an extra ; after the end of a function, which needs to be removed Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Tomer Tayar <ttayar@habana.ai>
2020-03-24habanalabs: Avoid running restore chunks if no execute chunksTomer Tayar
CS with no chunks for execute phase is invalid, so its context_switch/restore phase should not be run. Hence, move the check of the execute chunks number to the beginning of hl_cs_ioctl(). Signed-off-by: Tomer Tayar <ttayar@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-03-24habanalabs: Modify CS jobs counter to u16Tomer Tayar
As HL_MAX_JOBS_PER_CS is 512, it is possible that more than 255 CS jobs will be submitted for a certain queue. Hence, modify the "jobs_in_queue_cnt" parameter of the "hl_cs" structure to be u16 instead of u8. Signed-off-by: Tomer Tayar <ttayar@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-03-24habanalabs: split the host MMU propertiesOmer Shpigelman
Host memory may be allocated with huge pages. A different virtual range may be used for mapping in this case. Add Huge PCI MMU (HPMMU) properties to support it. This patch is a prerequisite for future ASICs support and has no effect on Goya ASIC as currently a single virtual host range is used for all page sizes. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-03-24habanalabs: use the user CB size as a default job sizeOmer Shpigelman
When no patched command buffer (CB) is created, use the user CB size as the job size. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-03-24habanalabs: flush only at the end of the map/unmapPawel Piskorski
Optimize hl_mmu_map and hl_mmu_unmap by not calling flush(ctx) within per-page loop. Signed-off-by: Pawel Piskorski <ppiskorski@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-02-11habanalabs: patched cb equals user cb in device memsetOded Gabbay
During device memory memset, the driver allocates and use a CB (command buffer). To reuse existing code, it keeps a pointer to the CB in two variables, user_cb and patched_cb. Therefore, there is no need to "put" both the user_cb and patched_cb, as it will cause an underflow of the refcnt of the CB. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-02-11habanalabs: do not halt CoreSight during hard resetOmer Shpigelman
During hard reset we must not write to the device. Hence avoid halting CoreSight during user context close if it is done during hard reset. In addition, we must not re-enable clock gating afterwards as it was deliberately disabled in the beginning of the hard reset flow. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-02-11habanalabs: halt the engines before hard-resetOded Gabbay
The driver must halt the engines before doing hard-reset, otherwise the device can go into undefined state. There is a place where the driver didn't do that and this patch fixes it. Reviewed-by: Tomer Tayar <ttayar@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-12-14habanalabs: remove variable 'val' set but not usedChen Wandun
Fixes gcc '-Wunused-but-set-variable' warning: drivers/misc/habanalabs/goya/goya.c: In function goya_pldm_init_cpu: drivers/misc/habanalabs/goya/goya.c:2195:6: warning: variable val set but not used [-Wunused-but-set-variable] drivers/misc/habanalabs/goya/goya.c: In function goya_hw_init: drivers/misc/habanalabs/goya/goya.c:2505:6: warning: variable val set but not used [-Wunused-but-set-variable] Fixes: 9494a8dd8d22 ("habanalabs: add h/w queues module") Signed-off-by: Chen Wandun <chenwandun@huawei.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-12-14habanalabs: rate limit error msg on waiting for CSOded Gabbay
In case a user submits a CS, and the submission fails, and the user doesn't check the return value and instead use the error return value as a valid sequence number of a CS and ask to wait on it, the driver will print an error and return an error code for that wait. The real problem happens if now the user ignores the error of the wait, and try to wait again and again. This can lead to a flood of error messages from the driver and even soft lockup event. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Tomer Tayar <ttayar@habana.ai>
2019-11-21habanalabs: add more protection of device during resetOded Gabbay
Prevent accesses to the device (register read/write) from debugfs entries during reset as that can cause the device to get stuck. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Tomer Tayar <ttayar@habana.ai>
2019-11-21habanalabs: flush EQ workers in hard resetOded Gabbay
During hard-reset, there can be multiple events received from the H/W. For each event, the driver opens a worker thread to handle it. For some of the events, the driver will read/write registers in the code that handles the event. In case of hard-reset, we must prevent reads/writes to the registers during the reset operation because the device might get stuck if that happens. Therefore, flush the EQ workers before resetting the device (in hard-reset only). Additional events won't arrive as we synced and disabled the interrupts. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Tomer Tayar <ttayar@habana.ai>
2019-11-21habanalabs: make the reset code more consistentOded Gabbay
In the hl_device_reset we ask about the hard_reset argument when we want to differentiate between soft and hard reset, except for three places where we use "from_hard_reset_thread". Replace one of those locations with the hard_reset argument as it is guaranteed that if we reached to that line in the code during hard_reset, it is from a kernel thread. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Tomer Tayar <ttayar@habana.ai>
2019-11-21habanalabs: expose reset counters via existing INFO IOCTLMoti Haimovski
Expose both soft and hard reset counts via INFO IOCTL. This will allow system management applications to easily check if the device has undergone reset. Signed-off-by: Moti Haimovski <mhaimovski@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-11-21habanalabs: make code more conciseOded Gabbay
Instead of doing if inside if, just write them with && operator. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai>
2019-11-21habanalabs: use defines for F/W filesOded Gabbay
Make the code more concise and maintainable by using defines for the F/W files. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai>
2019-11-21habanalabs: remove prints on successful device initializationOded Gabbay
Successful device initialization is mentioned in kernel log with the message "Successfully added device to habanalabs driver". There is no point of spamming the log with additional messages about successful queue testing, which are implied by the above mentioned message. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai>
2019-11-21habanalabs: remove unnecessary checksOmer Shpigelman
Now that the VA block free list is not updated on context close in order to optimize this flow, no need in the sanity checks of the list contents as these will fail for sure. In addition, remove the "context closing with VA in use" print during hard reset as this situation is a side effect of the failure that caused the hard reset. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-11-21habanalabs: invalidate MMU cache only onceOmer Shpigelman
Reduce context close time by performing MMU cache invalidation once at the end of the unmap loop rather in each iteration, in order to avoid hard reset with open contexts. Reset with open contexts can potentially lead to a kernel crash as the generic pool of the MMU hops is destroyed while it is not empty because some unmap operations are not done. The commit affect mainly when running on simulator. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-11-21habanalabs: skip VA block list update in reset flowOmer Shpigelman
Reduce context close time by skipping the VA block free list update in order to avoid hard reset with open contexts. Reset with open contexts can potentially lead to a kernel crash as the generic pool of the MMU hops is destroyed while it is not empty because some unmap operations are not done. The commit affect mainly when running on simulator. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-11-21habanalabs: optimize MMU unmapOmer Shpigelman
Reduce context close time by skipping hash table lookup if possible in order to avoid hard reset with open contexts. Reset with open contexts can potentially lead to a kernel crash as the generic pool of the MMU hops is destroyed while it is not empty because some unmap operations are not done. This commit affect mainly when running on simulator. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-11-21habanalabs: prevent read/write from/to the device during hard resetOmer Shpigelman
During hard reset we should not access the device except of necessary reset operations because the device might be stuck or unresponsive. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-11-21habanalabs: split MMU properties to PCI/DRAMOmer Shpigelman
Split the properties used for MMU mappings to DRAM and PCI (host) types. This is a prerequisite for future ASICs support. Note that in Goya ASIC, the PMMU and DMMU are the same (except of page sizes) as only one MMU mechanism is used for both of the mapping types. Hence this patch should not have any effect on current behavior. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-11-21habanalabs: re-factor MMU masks and documentationOmer Shpigelman
Some cosmetics around the MMU code to make it more self-explanatory. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-11-21habanalabs: type specific MMU cache invalidationOmer Shpigelman
Add the ability to invalidate the necessary MMU cache only. This ability is a prerequisite for future ASICs support. Note that in Goya ASIC, a single cache is used for both host/DRAM mappings and hence this patch should not have any effect on current behavior. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-11-21habanalabs: re-factor memory module codeOmer Shpigelman
Some of the functions in the memory module code were too long and/or contained multiple operations that are not always done together. Re-factor the code by dividing those functions to smaller functions which are more readable and maintainable. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-11-21habanalabs: export uapi defines to user-spaceOded Gabbay
The two defines that control the maximum size of a command buffer and the maximum number of JOBS per CS need to be exported to the user as they are part of the API towards user-space. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai>
2019-11-21habanalabs: don't print error when queues are fullOded Gabbay
If the queues are full and we return -EAGAIN to the user, there is no need to print an error, as that case isn't an error and the user is expected to re-submit the work. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai>
2019-11-21habanalabs: increase max jobs number to 512Oded Gabbay
In training, there is a need for a large amount of patching to the recipe. This results in many command buffers contains a lot of DMA packets. The number of command buffers per CS is larger than the current maximum of 64, which is an arbitrary number that is enough for inference, but it has no real affect on the code and/or resources of the host machine. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai>
2019-11-21habanalabs: set ETR as non-securedOded Gabbay
ETR should always be non-secured as it is used by the users to record profiling/trace data. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai>
2019-11-21habanalabs: use registers name defines for ETR blockOded Gabbay
We have a single ETR block in the SOC, so use explicit register name defines for initializing this block. This makes it more readable and maintainable. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai>
2019-11-21habanalabs: read F/W versions before failureOded Gabbay
Move the read of the F/W boot versions before exiting on possible failures of the F/W boot. This will help debug boot failures as we will be able to know the F/W boot version. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai>
2019-11-21habanalabs: expose card name in INFO IOCTLOded Gabbay
To enable userspace processes, e.g. management utilities, to display the card name to the user, add the card name property to the HW_IP structure that is copied to the user in the INFO IOCTL. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-11-21habanalabs: remove set but not used variable 'qman_base_addr'YueHaibing
Fixes gcc '-Wunused-but-set-variable' warning: drivers/misc/habanalabs/goya/goya.c: In function 'goya_init_mme_cmdq': drivers/misc/habanalabs/goya/goya.c:1536:6: warning: variable 'qman_base_addr' set but not used [-Wunused-but-set-variable] It is never used, so can be removed. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: YueHaibing <yuehaibing@huawei.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-11-21habanalabs: add opcode to INFO IOCTL to return clock rateOded Gabbay
Add a new opcode to the INFO IOCTL to allow the user application to retrieve the ASIC's current and maximum clock rate. The rate is returned in MHz. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Tomer Tayar <ttayar@habana.ai>
2019-11-21habanalabs: set TPC Icache to 16 cache linesOded Gabbay
Reduce latency to memory during TPC kernel execution. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Tomer Tayar <ttayar@habana.ai>