summaryrefslogtreecommitdiff
path: root/drivers/misc/habanalabs
AgeCommit message (Collapse)Author
2021-03-10habanalabs: fix debugfs address translationfarah kassabri
when user uses virtual addresses to access dram through debugfs, driver translate this address to physical and use it for the access through the pcie bar. in case dram page size is different than the dmmu page size, we need to have special treatment for adding the page offset to the actual address, which is to use the dram page size mask to fetch the page offset from the virtual address, instead of the dmmu last hop shift. Signed-off-by: farah kassabri <fkassabri@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-03-10habanalabs: Disable file operations after device is removedTomer Tayar
A device can be removed from the PCI subsystem while a process holds the file descriptor opened. In such a case, the driver attempts to kill the process, but as it is still possible that the process will be alive after this step, the device removal will complete, and we will end up with a process object that points to a device object which was already released. To prevent the usage of this released device object, disable the following file operations for this process object, and avoid the cleanup steps when the file descriptor is eventually closed. The latter is just a best effort, as memory leak will occur. Signed-off-by: Tomer Tayar <ttayar@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-03-10habanalabs: Call put_pid() when releasing control deviceTomer Tayar
The refcount of the "hl_fpriv" structure is not used for the control device, and thus hl_hpriv_put() is not called when releasing this device. This results with no call to put_pid(), so add it explicitly in hl_device_release_ctrl(). Signed-off-by: Tomer Tayar <ttayar@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-03-10drivers: habanalabs: remove unused dentry pointer for debugfs filesGreg Kroah-Hartman
The dentry for the created debugfs file was being saved, but never used anywhere. As the pointer isn't needed for anything, and the debugfs files are being properly removed by removing the parent directory, remove the saved pointer as well, saving a tiny bit of memory and logic. Cc: Oded Gabbay <ogabbay@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Tomer Tayar <ttayar@habana.ai> Cc: Moti Haimovski <mhaimovski@habana.ai> Cc: Omer Shpigelman <oshpigelman@habana.ai> Cc: Ofir Bitton <obitton@habana.ai> Cc: linux-kernel@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-03-10habanalabs: mark hl_eq_inc_ptr() as staticOded Gabbay
hl_eq_inc_ptr() is not called from anywhere outside irq.c so mark it as static Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-02-24Merge tag 'char-misc-5.12-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char/misc driver updates from Greg KH: "Here is the large set of char/misc/whatever driver subsystem updates for 5.12-rc1. Over time it seems like this tree is collecting more and more tiny driver subsystems in one place, making it easier for those maintainers, which is why this is getting larger. Included in here are: - coresight driver updates - habannalabs driver updates - virtual acrn driver addition (proper acks from the x86 maintainers) - broadcom misc driver addition - speakup driver updates - soundwire driver updates - fpga driver updates - amba driver updates - mei driver updates - vfio driver updates - greybus driver updates - nvmeem driver updates - phy driver updates - mhi driver updates - interconnect driver udpates - fsl-mc bus driver updates - random driver fix - some small misc driver updates (rtsx, pvpanic, etc.) All of these have been in linux-next for a while, with the only reported issue being a merge conflict due to the dfl_device_id addition from the fpga subsystem in here" * tag 'char-misc-5.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (311 commits) spmi: spmi-pmic-arb: Fix hw_irq overflow Documentation: coresight: Add PID tracing description coresight: etm-perf: Support PID tracing for kernel at EL2 coresight: etm-perf: Clarify comment on perf options ACRN: update MAINTAINERS: mailing list is subscribers-only regmap: sdw-mbq: use MODULE_LICENSE("GPL") regmap: sdw: use no_pm routines for SoundWire 1.2 MBQ regmap: sdw: use _no_pm functions in regmap_read/write soundwire: intel: fix possible crash when no device is detected MAINTAINERS: replace my with email with replacements mhi: Fix double dma free uapi: map_to_7segment: Update example in documentation uio: uio_pci_generic: don't fail probe if pdev->irq equals to IRQ_NOTCONNECTED drivers/misc/vmw_vmci: restrict too big queue size in qp_host_alloc_queue firewire: replace tricky statement by two simple ones vme: make remove callback return void firmware: google: make coreboot driver's remove callback return void firmware: xilinx: Use explicit values for all enum values sample/acrn: Introduce a sample of HSM ioctl interface usage virt: acrn: Introduce an interface for Service VM to control vCPU ...
2021-02-22Merge tag 'topic/iomem-mmap-vs-gup-2021-02-22' of ↵Linus Torvalds
git://anongit.freedesktop.org/drm/drm Pull follow_pfn() updates from Daniel Vetter: "Fixes around VM_FPNMAP and follow_pfn: - replace mm/frame_vector.c by get_user_pages in misc/habana and drm/exynos drivers, then move that into media as it's sole user - close race in generic_access_phys - s390 pci ioctl fix of this series landed in 5.11 already - properly revoke iomem mappings (/dev/mem, pci files)" * tag 'topic/iomem-mmap-vs-gup-2021-02-22' of git://anongit.freedesktop.org/drm/drm: PCI: Revoke mappings like devmem PCI: Also set up legacy files only after sysfs init sysfs: Support zapping of binary attr mmaps resource: Move devmem revoke code to resource framework /dev/mem: Only set filp->f_mapping PCI: Obey iomem restrictions for procfs mmap mm: Close race in generic_access_phys media: videobuf2: Move frame_vector into media subsystem mm/frame-vector: Use FOLL_LONGTERM misc/habana: Use FOLL_LONGTERM for userptr misc/habana: Stop using frame_vector helpers drm/exynos: Use FOLL_LONGTERM for g2d cmdlists drm/exynos: Stop using frame_vector helpers
2021-02-08habanalabs/gaudi: don't enable clock gating on DMA5Oded Gabbay
Graph Compiler uses DMA5 in a non-standard way and it requires the driver to disable clock gating on that DMA. Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-02-08habanalabs: return block size + block IDOded Gabbay
When user gives us a block address to get its ID to mmap it, he also needs to get from us the block size to pass to the driver in the mmap function. Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-02-08habanalabs: update security map after init CPU QsOhad Sharabi
when reading CPU_BOOT_DEV_STS0 reg after FW reports SRAM AVAILABLE the value in the register might not yet be updated by FW. to overcome this issue another "up-to-date" read of this register is done at the end of CPU queues init. Signed-off-by: Ohad Sharabi <osharabi@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-02-08habanalabs: enable F/W events after init doneOded Gabbay
Only after the initialization of the device is done, the driver is ready to receive events from the F/W. The driver can't handle events before that because of races so it will ignore events. In case of a fatal event, the driver won't know about it and the device will be operational although it shouldn't be. Same logic should be applied after hard-reset. Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-02-08habanalabs/gaudi: use HBM_ECC_EN bit for ECC ERROhad Sharabi
driver should use ECC info from FW only if HBM ECC CAP is set. otherwise, try to fetch the data from MC regs only if security is disabled. Signed-off-by: Ohad Sharabi <osharabi@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-02-08habanalabs: support fetching first available user CQOfir Bitton
User must be aware of the available CQs when it needs to use them. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-02-08habanalabs: improve communication protocol with cpucpOfir Bitton
Current messaging communictaion protocol with cpucp can get out of sync due to coherency issues. In order to improve the protocol reliability, we modify the protocol to expect a different acknowledgment for every packet sent to cpucp. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-02-08habanalabs: fix integer handling issueOded Gabbay
Need to add ull suffix to constant when doing shift of constant into 64-bit variables Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-01-27habanalabs: update to latest hl_boot_if.h spec from F/WOded Gabbay
It adds the definition for indication that the F/W handles HBM ECC events. Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-01-27habanalabs/gaudi: unmask HBM interrupts after handlingOded Gabbay
As the driver does with all interrupts, we need to tell F/W to unmask the HBM interrupts after the driver handled them. Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-01-27habanalabs: update SyncManager interrupt handlingOded Gabbay
The firmware provides more information about SyncManager events. Adjust the code to the latest firmware interface file. Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-01-27habanalabs: fix ETR security issueOhad Sharabi
ETR should always be non-secured as it is used by the users to record profiling/trace data. This patch fixes the configuration to match those requirements. Signed-off-by: Ohad Sharabi <osharabi@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-01-27habanalabs: staged submission supportOfir Bitton
We introduce a new mechanism named Staged Submission. This mechanism allows the user to send a whole CS in pieces. Each CS will not require completion rather than the last CS. Timeout timer will be triggered upon reception of the first CS in group. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-01-27habanalabs: modify device_idle interfaceOhad Sharabi
Currently this API uses single 64 bits mask for engines idle indication. Recently, it was observed that more bits are needed for some ASICs. This patch modifies the use of the idle mask and the idle_extensions mask. Signed-off-by: Ohad Sharabi <osharabi@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-01-27habanalabs: add CS completion and timeout propertiesOfir Bitton
In order to support staged submission feature, we need to distinguish on which command submission we want to receive timeout and for which we want to receive completion. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-01-27habanalabs: add new mem ioctl op for mapping hw blocksOfir Bitton
For future ASIC support the driver allows user to map certain regions in the device's configuration space for direct access from userspace. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-01-27habanalabs: fix MMU debugfs related nodesfarah kassabri
In mmu debugfs node show un-scrambled physical addresses. before read/write through data nodes, need to unscramble the physical address before using it for pci transaction. Signed-off-by: farah kassabri <fkassabri@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-01-27habanalabs: add user available interrupt to hw_ipOfir Bitton
In order to support completions that arrive directly to the user, the driver needs to supply the user with the first available msix interrupt available. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-01-27habanalabs: always try to use the hint addressfarah kassabri
Currently hint address is ignored in case va block page size is not power of 2. We need to support th user hint address also in this case, but only if the hint address is aligned to page size. Signed-off-by: farah kassabri <fkassabri@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-01-27habanalabs: add security violations dump to debugfsOfir Bitton
In order to improve driver security debuggability, we add security violations dump to debugfs. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-01-27habanalabs: ignore F/W BMC errors in case no BMC presentOfir Bitton
In order to support operation mode in which BMC is not active, driver must not take BMC errors into consideration. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-01-27habanalabs/gaudi: print sync manager SEI interrupt infoOfir Bitton
Driver must print sync manager SEI information upon receiving interrupt from FW. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-01-27habanalabs: Use 'dma_set_mask_and_coherent()'Christophe JAILLET
Axe 'hl_pci_set_dma_mask()' and replace it with an equivalent 'dma_set_mask_and_coherent()' call. This makes the code a bit less verbose. It also removes an erroneous comment, because 'hl_pci_set_dma_mask()' does not try to use a fall-back value. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-01-27habanalabs/gaudi: remove PCI access to SM blockOfir Bitton
Due to HW limitation we must remove all direct access to SM registers, in order to do that we will access SM registers using the HW QMANS. When possible and no user context is present, we can directly access the HW QMANS. Whenever there is an active user, driver will prepare a pending command buffer list which will be sent upon user submissions. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-01-27habanalabs: add driver support for internal cb schedulingOfir Bitton
In order to support scnenarios in which driver needs access to HW components but it cannot access them directly, we add support for scheduling command buffers internally. These command buffers will be transmitted upon next user command submission context. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-01-27habanalabs: increment ctx ref from within a cs allocationOfir Bitton
A CS must increment the relevant context reference count. We want to increment the reference inside the CS allocation function as opposed for today where we increment it outside. This is logical since we want to avoid explicitly incrementing the context every time we call the CS allocate function. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-01-27habanalabs: separate common code to dedicated foldersOfir Bitton
We separate some of the common code source files to different folders for a better maintainability and testability. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-01-27habanalabs: read device boot errors after cpucp is upOfir Bitton
Boot cpu can report errors in various boot stages. Current implementaion does not take into consideration errors reported in late stages, hence we will check for errors at the most late stage when fetching cpucp information. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-01-27habanalabs: report correct dram size in info ioctlOfir Bitton
In case MMU is enabled, we must take MMU page size into consideration when reporting dram size to the user. This is because the MMU page size can be a value which is NOT a power-of-2 value. As a result, the total DRAM size (which is always a power-of-2 value) needed to be rounded-down. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-01-27habanalabs: support non power-of-2 DRAM phys page sizesMoti Haimovski
DRAM physical page sizes depend of the amount of HBMs available in the device. this number is device-dependent and may also be subject to binning when one or more of the DRAM controllers are found to to be faulty. Such a configuration may lead to partitioning the DRAM to non-power-of-2 pages. To support this feature we also need to add infrastructure of address scarmbling. Signed-off-by: Moti Haimovski <mhaimovski@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-01-27habanalabs: remove access to kernel memory using debugfsOfir Bitton
Accessing kernel allocated memory through debugfs should not be allowed as it introduces a security vulnerability. We remove the option to read/write kernel memory for all asics. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-01-27habanalabs/gaudi: set uninitialized symbolOfir Bitton
Initialize local variable that is returned by the function, in case it is never assigned. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-01-27habanalabs: return dram virtual address in info ioctlAlon Mizrahi
When working with DRAM MMU, we should supply the userspace with the virtual start address of the DRAM instead of the physical one. This is because the physical one has no meaning for the user as he only knows the virtual address range. Signed-off-by: Alon Mizrahi <amizrahi@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-01-27habanalabs: update to latest hl_boot_if.hOded Gabbay
Update the latest version of this file that the F/W exports Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-01-27habanalabs: add ASIC property of functional HBMsOded Gabbay
The number of functional HBMs in the same ASIC can be different due to malfunctioning HBM banks. Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-01-27habanalabs/gaudi: add debug prints for security statusOfir Bitton
In order to have more information while debugging boot issues, we should print the firmware security status at every boot stage. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-01-27habanalabs: modify memory functions signaturesOmer Shpigelman
For consistency, modify all memory ioctl functions to get the ioctl arguments structure rather than the arguments themselves. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-01-27habanalabs: kernel doc format in memory functionsOmer Shpigelman
Change all memory functions documentation according to kernel doc format. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-01-27habanalabs: replace WARN/WARN_ON with dev_crit in driverAlon Mizrahi
Often WARN is defined in data-centers as BUG and we would like to avoid hanging the entire server on some internal error of the driver (important as it might be). Therefore, use dev_crit instead. Signed-off-by: Alon Mizrahi <amizrahi@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-01-27habanalabs: report dram_page_size in hw_ip_info ioctlMoti Haimovski
Instead of having it hard-coded as a define, pass it to the user in runtime. Signed-off-by: Moti Haimovski <mhaimovski@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-01-27habanalabs/goya: move mmu_prepare to context initOhad Sharabi
Currently mmu_prepare is located at context switch. Since we support a single context, no reason to reconfigure the MMU registers every context switch. Signed-off-by: Ohad Sharabi <osharabi@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-01-27habanalabs/gaudi: remove duplicated gaudi packets masksOfir Bitton
As all packets use the same CTL register masks, we remove duplicated masks and use common masks instead. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-01-27habanalabs: allow user to pass a staged submission seqOfir Bitton
In order to support the staged submission feature, user must be allowed to use the same CS sequence for all submissions in the same staged submission. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>