Age | Commit message (Collapse) | Author |
|
Perform mass deletion of static inline functions that are not used.
Link: https://lore.kernel.org/r/20210314133908.291945-3-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
The word 'descriptor' is misspelled throughout the tree.
Fix it up accordingly:
decriptors -> descriptors
Link: https://lore.kernel.org/r/20200609124610.3445662-3-kieran.bingham+renesas@ideasonboard.com
Link: https://lore.kernel.org/r/20200609124610.3445662-12-kieran.bingham+renesas@ideasonboard.com
Signed-off-by: Kieran Bingham <kieran.bingham+renesas@ideasonboard.com>
Acked-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
The following warning can happen when a memory shortage
occurs during txreq allocation:
[10220.939246] SLUB: Unable to allocate memory on node -1, gfp=0xa20(GFP_ATOMIC)
[10220.939246] Hardware name: Intel Corporation S2600WT2R/S2600WT2R, BIOS SE5C610.86B.01.01.0018.C4.072020161249 07/20/2016
[10220.939247] cache: mnt_cache, object size: 384, buffer size: 384, default order: 2, min order: 0
[10220.939260] Workqueue: hfi0_0 _hfi1_do_send [hfi1]
[10220.939261] node 0: slabs: 1026568, objs: 43115856, free: 0
[10220.939262] Call Trace:
[10220.939262] node 1: slabs: 820872, objs: 34476624, free: 0
[10220.939263] dump_stack+0x5a/0x73
[10220.939265] warn_alloc+0x103/0x190
[10220.939267] ? wake_all_kswapds+0x54/0x8b
[10220.939268] __alloc_pages_slowpath+0x86c/0xa2e
[10220.939270] ? __alloc_pages_nodemask+0x2fe/0x320
[10220.939271] __alloc_pages_nodemask+0x2fe/0x320
[10220.939273] new_slab+0x475/0x550
[10220.939275] ___slab_alloc+0x36c/0x520
[10220.939287] ? hfi1_make_rc_req+0x90/0x18b0 [hfi1]
[10220.939299] ? __get_txreq+0x54/0x160 [hfi1]
[10220.939310] ? hfi1_make_rc_req+0x90/0x18b0 [hfi1]
[10220.939312] __slab_alloc+0x40/0x61
[10220.939323] ? hfi1_make_rc_req+0x90/0x18b0 [hfi1]
[10220.939325] kmem_cache_alloc+0x181/0x1b0
[10220.939336] hfi1_make_rc_req+0x90/0x18b0 [hfi1]
[10220.939348] ? hfi1_verbs_send_dma+0x386/0xa10 [hfi1]
[10220.939359] ? find_prev_entry+0xb0/0xb0 [hfi1]
[10220.939371] hfi1_do_send+0x1d9/0x3f0 [hfi1]
[10220.939372] process_one_work+0x171/0x380
[10220.939374] worker_thread+0x49/0x3f0
[10220.939375] kthread+0xf8/0x130
[10220.939377] ? max_active_store+0x80/0x80
[10220.939378] ? kthread_bind+0x10/0x10
[10220.939379] ret_from_fork+0x35/0x40
[10220.939381] SLUB: Unable to allocate memory on node -1, gfp=0xa20(GFP_ATOMIC)
The shortage is handled properly so the message isn't needed. Silence by
adding the no warn option to the slab allocation.
Fixes: 45842abbb292 ("staging/rdma/hfi1: move txreq header code")
Cc: <stable@vger.kernel.org>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
ACK packets are generally associated with request completion and resource
release and therefore should be sent first. This patch optimizes the
send engine by using the following policies:
(1) QPs with RVT_S_ACK_PENDING bit set in qp->s_flags or qpriv->s_flags
should have their priority incremented;
(2) QPs with ACK or TID-ACK packet queued should have their priority
incremented;
(3) When a QP is queued to the wait list due to resource constraints, it
will be queued to the head if it has ACK packet to send;
(4) When selecting qps to run from the wait list, the one with the highest
priority and starve_cnt will be selected; each priority will be equivalent
to a fixed number of starve_cnt (16).
Reviewed-by: Mitko Haralanov <mitko.haralanov@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
Current implementation allows each qp to have only one send engine. As
such, each qp has only one list to queue prebuilt packets when send engine
resources are not available. To improve performance, it is desired to
support multiple send engines for each qp.
This patch creates the framework to support two send engines
(two legs) for each qp for the TID RDMA protocol, which can be easily
extended to support more send engines. It achieves the goal by creating a
leg specific struct, iowait_work in the iowait struct, to hold the
work_struct and the tx_list as well as a pointer to the parent iowait
struct.
The hfi1_pkt_state now has an additional field to record the current legs
work structure and that is now passed to all egress waiters to determine
the leg that needs to wait via a new iowait helper. The APIs are adjusted
to use the new leg specific struct as required.
Many new and modified helpers are added to support this change.
Reviewed-by: Mitko Haralanov <mitko.haralanov@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
The __get_txreq() function can return a pointer, ERR_PTR(-EBUSY), or NULL.
All of the relevant call sites look for IS_ERR, so the NULL return would
lead to a NULL pointer exception.
Do not use the ERR_PTR mechanism for this function.
Update all call sites to handle the return value correctly.
Clean up error paths to reflect return value.
Fixes: 45842abbb292 ("staging/rdma/hfi1: move txreq header code")
Cc: <stable@vger.kernel.org> # 4.9.x+
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Kamenee Arumugam <kamenee.arumugam@intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
The s_hdrwords variable was used to indicate whether a
packet was already built on a previous iteration of the
send engine. This variable assumed the protection of the
QP's RVT_S_BUSY flag, which was required since the the
QP's s_lock was dropped just prior to the packet being
queued on the one of the egress mechanisms.
Support for multiple send engine instantiations require
that the field not be used due to concurency issues.
The ps.txreq signals the "already built" without the
potential concurency issues.
Fix by getting rid of all s_hdrword usage. A wrapper
is added to test for the already built case that used to
use s_hdrwords.
What used to be stored in s_hdrwords is now in the txreq.
The PBC is not counted, but is added in the pio/sdma code
paths prior to posting the packet.
Reviewed-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
Setting the protocol type should be part of initializing the tx request.
For UC and RC, the current protocol type is part of the qp priv structure.
For ud requests, it needs to be adjusted dynamically, based on the AV
posted with the WQE. This patch will simplify the initialization of the
tx request.
Fixes: 5b6cabb0db77 ("IB/hfi1: Add 16B RC/UC support")
Reviewed-by: Don Hiatt <don.hiatt@intel.com>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
The qp->s_cur_size field assumes that the S_BUSY bit protects
the field from modification after the slock is dropped. Scaling the
send engine to multiple cores would break that assumption.
Correct the issue by carrying the payload size in the txreq structure.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
hfi1_pio_header should really be called hfi1_sdma_header
as it is only used for sdma transmits.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Dean Luick <dean.luick@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
A failure in the get_txreq() inline will result in a
slow path retry using __get_txreq().
__get_txreq() attempts to procure the qp s_lock, which
is already held in all callers.
Fix by deleting the s_lock maintenance in __get_txreq()
and add sparse syntax hooks to future proof the code.
Cc: Stable <stable@vger.kernel.org> # 4.6+
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
The TODO list for the hfi1 driver was completed during 4.6. In addition
other objections raised (which are far beyond what was in the TODO list)
have been addressed as well. It is now time to remove the driver from
staging and into the drivers/infiniband sub-tree.
Reviewed-by: Jubin John <jubin.john@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|