Age | Commit message (Collapse) | Author |
|
Between acquiring the this_cpu_ptr() and using it, ideally we don't want
to be preempted and work on another CPU's private data. this_cpu_ptr()
checks whether or not preemption is disable, and get_cpu_ptr() provides
a convenient wrapper for operating on the cpu ptr inside a preemption
disabled critical section (which currently is provided by the
spinlock).
[ 167.997877] BUG: using smp_processor_id() in preemptible [00000000] code: usb-storage/216
[ 167.997940] caller is debug_smp_processor_id+0x17/0x20
[ 167.997945] CPU: 7 PID: 216 Comm: usb-storage Tainted: G U 4.7.0-rc1-gfxbench-RO_Patchwork_1057+ #1
[ 167.997948] Hardware name: Hewlett-Packard HP Pro 3500 Series/2ABF, BIOS 8.11 10/24/2012
[ 167.997951] 0000000000000000 ffff880118b7f9c8 ffffffff8140dca5 0000000000000007
[ 167.997958] ffffffff81a3a7e9 ffff880118b7f9f8 ffffffff8142a927 0000000000000000
[ 167.997965] ffff8800d499ed58 0000000000000001 00000000000fffff ffff880118b7fa08
[ 167.997971] Call Trace:
[ 167.997977] [<ffffffff8140dca5>] dump_stack+0x67/0x92
[ 167.997981] [<ffffffff8142a927>] check_preemption_disabled+0xd7/0xe0
[ 167.997985] [<ffffffff8142a947>] debug_smp_processor_id+0x17/0x20
[ 167.997990] [<ffffffff81507e17>] alloc_iova_fast+0xb7/0x210
[ 167.997994] [<ffffffff8150c55f>] intel_alloc_iova+0x7f/0xd0
[ 167.997998] [<ffffffff8151021d>] intel_map_sg+0xbd/0x240
[ 167.998002] [<ffffffff810e5efd>] ? debug_lockdep_rcu_enabled+0x1d/0x20
[ 167.998009] [<ffffffff81596059>] usb_hcd_map_urb_for_dma+0x4b9/0x5a0
[ 167.998013] [<ffffffff81596d19>] usb_hcd_submit_urb+0xe9/0xaa0
[ 167.998017] [<ffffffff810cff2f>] ? mark_held_locks+0x6f/0xa0
[ 167.998022] [<ffffffff810d525c>] ? __raw_spin_lock_init+0x1c/0x50
[ 167.998025] [<ffffffff810e5efd>] ? debug_lockdep_rcu_enabled+0x1d/0x20
[ 167.998028] [<ffffffff815988f3>] usb_submit_urb+0x3f3/0x5a0
[ 167.998032] [<ffffffff810d0082>] ? trace_hardirqs_on_caller+0x122/0x1b0
[ 167.998035] [<ffffffff81599ae7>] usb_sg_wait+0x67/0x150
[ 167.998039] [<ffffffff815dc202>] usb_stor_bulk_transfer_sglist.part.3+0x82/0xd0
[ 167.998042] [<ffffffff815dc29c>] usb_stor_bulk_srb+0x4c/0x60
[ 167.998045] [<ffffffff815dc42e>] usb_stor_Bulk_transport+0x17e/0x420
[ 167.998049] [<ffffffff815dcf32>] usb_stor_invoke_transport+0x242/0x540
[ 167.998052] [<ffffffff810e5efd>] ? debug_lockdep_rcu_enabled+0x1d/0x20
[ 167.998058] [<ffffffff815dba19>] usb_stor_transparent_scsi_command+0x9/0x10
[ 167.998061] [<ffffffff815de518>] usb_stor_control_thread+0x158/0x260
[ 167.998064] [<ffffffff815de3c0>] ? fill_inquiry_response+0x20/0x20
[ 167.998067] [<ffffffff815de3c0>] ? fill_inquiry_response+0x20/0x20
[ 167.998071] [<ffffffff8109ddfa>] kthread+0xea/0x100
[ 167.998078] [<ffffffff817ac6af>] ret_from_fork+0x1f/0x40
[ 167.998081] [<ffffffff8109dd10>] ? kthread_create_on_node+0x1f0/0x1f0
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96293
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: iommu@lists.linux-foundation.org
Cc: linux-kernel@vger.kernel.org
Fixes: 9257b4a206fc ('iommu/iova: introduce per-cpu caching to iova allocation')
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
IOVA allocation has two problems that impede high-throughput I/O.
First, it can do a linear search over the allocated IOVA ranges.
Second, the rbtree spinlock that serializes IOVA allocations becomes
contended.
Address these problems by creating an API for caching allocated IOVA
ranges, so that the IOVA allocator isn't accessed frequently. This
patch adds a per-CPU cache, from which CPUs can alloc/free IOVAs
without taking the rbtree spinlock. The per-CPU caches are backed by
a global cache, to avoid invoking the (linear-time) IOVA allocator
without needing to make the per-CPU cache size excessive. This design
is based on magazines, as described in "Magazines and Vmem: Extending
the Slab Allocator to Many CPUs and Arbitrary Resources" (currently
available at https://www.usenix.org/legacy/event/usenix01/bonwick.html)
Adding caching on top of the existing rbtree allocator maintains the
property that IOVAs are densely packed in the IO virtual address space,
which is important for keeping IOMMU page table usage low.
To keep the cache size reasonable, we bound the IOVA space a CPU can
cache by 32 MiB (we cache a bounded number of IOVA ranges, and only
ranges of size <= 128 KiB). The shared global cache is bounded at
4 MiB of IOVA space.
Signed-off-by: Omer Peleg <omer@cs.technion.ac.il>
[mad@cs.technion.ac.il: rebased, cleaned up and reworded the commit message]
Signed-off-by: Adam Morrison <mad@cs.technion.ac.il>
Reviewed-by: Shaohua Li <shli@fb.com>
Reviewed-by: Ben Serebrin <serebrin@google.com>
[dwmw2: split out VT-d part into a separate patch]
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
|
|
The iova library has use outside the intel-iommu driver, thus make it a
module.
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
|
|
Use EXPORT_SYMBOL_GPL() to export the iova library symbols. The symbols
include:
init_iova_domain();
iova_cache_get();
iova_cache_put();
iova_cache_init();
alloc_iova();
find_iova();
__free_iova();
free_iova();
put_iova_domain();
reserve_iova();
copy_reserved_iova();
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
|
|
This is necessary to separate intel-iommu from the iova library.
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
|
|
Currently, allocating a size-aligned IOVA region quietly adjusts the
actual allocation size in the process, returning a rounded-up
power-of-two-sized allocation. This results in mismatched behaviour in
the IOMMU driver if the original size was not a power of two, where the
original size is mapped, but the rounded-up IOVA size is unmapped.
Whilst some IOMMUs will happily unmap already-unmapped pages, others
consider this an error, so fix it by computing the necessary alignment
padding without altering the actual allocation size. Also clean up by
making pad_size unsigned, since its callers always pass unsigned values
and negative padding makes little sense here anyway.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
|
|
Fixed checkpatch warnings for missing blank line after
declaration of struct.
Signed-off-by: Robert Callicotte <rcallicotte@gmail.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
Systems may contain heterogeneous IOMMUs supporting differing minimum
page sizes, which may also not be common with the CPU page size.
Thus it is practical to have an explicit notion of IOVA granularity
to simplify handling of mapping and allocation constraints.
As an initial step, move the IOVA page granularity from an implicit
compile-time constant to a per-domain property so we can make use
of it in IOVA domain context at runtime. To keep the abstraction tidy,
extend the little API of inline iova_* helpers to parallel some of the
equivalent PAGE_* macros.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
To share the IOVA allocator with other architectures, it needs to
accommodate more general aperture restrictions; move the lower limit
from a compile-time constant to a runtime domain property to allow
IOVA domains with different requirements to co-exist.
Also reword the slightly unclear description of alloc_iova since we're
touching it anyway.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
In order to share the IOVA allocator with other architectures, break
the unnecssary dependency on the Intel IOMMU driver and move the
remaining IOVA internals to iova.c
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
If static identity domain is created, IOMMU driver needs to update
si_domain page table when memory hotplug event happens. Otherwise
PCI device DMA operations can't access the hot-added memory regions.
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
|
|
Correct spelling typo in debug messages and comments
in drivers/iommu.
Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
|
|
This should ease finding similarities with different platforms,
with the intention of solving problems once in a generic framework
which everyone can use.
Note: to move intel-iommu.c, the declaration of pci_find_upstream_pcie_bridge()
has to move from drivers/pci/pci.h to include/linux/pci.h. This is handled
in this patch, too.
As suggested, also drop DMAR's EXPERIMENTAL tag while we're at it.
Compile-tested on x86_64.
Signed-off-by: Ohad Ben-Cohen <ohad@wizery.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
|