Age | Commit message (Collapse) | Author |
|
Introduce a new top-level lock for the FB helper code. This will allow
better locking granularity and avoid the need to abuse modeset locking
for this purpose instead.
This patch just adds the new lock everywhere we currently grab
mode_config->mutex (explicitly, or through drm_modeset_lock_all).
Follow-up patches will push the kms locking down into only the places
that need it.
v2:
- use lockdep_assert_held
- use drm_fb_helper_for_each_connector where possible
- use the new top-level lock consistently, i.e. in all the places
we're currently acquiring mode_config.mutex.
- small polish to the kerneldoc
Tested-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Thierry Reding <treding@nvidia.com> (v1)
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/20170704151833.17304-4-daniel.vetter@ffwll.ch
|
|
The last user of these (i915.ko) no longer does. We can slim down the
core GEM object by removing the unused 8 bytes.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/20170705154900.28697-1-chris@chris-wilson.co.uk
|
|
In some cases, like cursor updates, it is interesting to update the
plane in an asynchronous fashion to avoid big delays. The current queued
update could be still waiting for a fence to signal and thus block any
subsequent update until its scan out. In cases like this if we update the
cursor synchronously through the atomic API it will cause significant
delays that would even be noticed by the final user.
This patch creates a fast path to jump ahead the current queued state and
do single planes updates without going through all atomic steps in
drm_atomic_helper_commit(). We take this path for legacy cursor updates.
For now only single plane updates are supported, but we plan to support
multiple planes updates and async PageFlips through this interface as well
in the near future.
v6: - move check code to drm_atomic_helper.c (Daniel Vetter)
v5:
- improve comments (Eric Anholt)
v4:
- fix state->crtc NULL check (Archit Taneja)
v3:
- fix iteration on the wrong crtc state
- put back code to forbid updates if there is a queued update for
the same plane (Ville Syrjälä)
- move size checks back to drivers (Ville Syrjälä)
- move ASYNC_UPDATE flag addition to its own patch (Ville Syrjälä)
v2:
- allow updates even if there is a queued update for the same
plane.
- fixes on the documentation (Emil Velikov)
- unconditionally call ->atomic_async_update (Emil Velikov)
- check for ->atomic_async_update earlier (Daniel Vetter)
- make ->atomic_async_check() the last step (Daniel Vetter)
- add ASYNC_UPDATE flag (Eric Anholt)
- update state in core after ->atomic_async_update (Eric Anholt)
- update docs (Eric Anholt)
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Rob Clark <robdclark@gmail.com>
Cc: Eric Anholt <eric@anholt.net>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.com>
Reviewed-by: Archit Taneja <architt@codeaurora.org> (v5)
Acked-by: Eric Anholt <eric@anholt.net> (v5)
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/20170630180322.29007-2-gustavo@padovan.org
|
|
The old state is useful for drivers that need to perform operations at
enable time that depend on the transition between the old and new
states.
While at it, rename the operation to .atomic_enable() to be consistent
with .atomic_disable(), as the .enable() operation is used by atomic
helpers only.
Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Acked-by: Maxime Ripard <maxime.ripard@free-electrons.com> # for sun4i
Acked-by: Philipp Zabel <p.zabel@pengutronix.de> # for imx-drm and mediatek
Acked-by: Alexey Brodkin <abrodkin@synopsys.com> # for arcpgu
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com> # for atmel-hlcdc
Acked-by: Liviu Dudau <Liviu.Dudau@arm.com> # for hdlcd and mali-dp
Acked-by: Stefan Agner <stefan@agner.ch> # for fsl-dcu
Tested-by: Philippe Cornu <philippe.cornu@st.com> # for stm
Acked-by: Philippe Cornu <philippe.cornu@st.com> # for stm
Acked-by: Vincent Abriou <vincent.abriou@st.com> # for sti
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com> # for vmwgfx
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/20170630093646.7928-2-laurent.pinchart+renesas@ideasonboard.com
|
|
Often we have the task of comparing two seqno known to be on the same
context, so provide a common __dma_fence_is_later().
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: Sean Paul <seanpaul@chromium.org>
Cc: Gustavo Padovan <gustavo@padovan.org>
Reviewed-by: Sean Paul <seanpaul@chromium.org>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170629125930.821-1-chris@chris-wilson.co.uk
|
|
There's no reason for drivers to call this, and all the ones I've
removed looked very fishy:
- Proper quiescenting of the vblank machinery should be done by
calling drm_crtc_vblank_off(), which is best done by shutting down
the entire display engine with drm_atomic_helper_shutdown.
- Releasing of allocated memory is done by the core already, it calls
drm_vblank_cleanup as a fallback.
- drm_vblank_cleanup also has checks for drivers which forget to clean
up vblank interrupts.
This essentially reverts
commit e77cef9c2d87db835ad9d70cde4a9b00b0ca2262
Author: Jerome Glisse <jglisse@redhat.com>
Date: Thu Jan 7 15:39:13 2010 +0100
drm: Avoid calling vblank function is vblank wasn't initialized
which was done to fix a bug in radeon code with msi interrupts:
commit 003e69f9862bcda89a75c27750efdbc17ac02945
Author: Jerome Glisse <jglisse@redhat.com>
Date: Thu Jan 7 15:39:14 2010 +0100
drm/radeon/kms: Don't try to enable IRQ if we have no handler installed
Afaict from digging around in old code, this was needed to avoid
blowing up in the ums fallback, and has stopped serving it's purpose
long ago - if irq init fails, the driver fails to load, and there's
really no way to blow up anymore.
Long story short, this was most likely a small ums compat/fallback
hack that became a thing of it's own and got cargo-cult duplicated all
over the drm codebase for essentially no gain at all.
v2: Mention that for drivers with a ->release callback cleanup is
handled by drm_dev_fini() (Thierry).
Cc: Thierry Reding <treding@nvidia.com>
Acked-by: Thierry Reding <treding@nvidia.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Reviewed-by: Sean Paul <seanpaul@chromium.org>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170626161949.25629-2-daniel.vetter@ffwll.ch
|
|
Required for Daniel's drm_vblank_cleanup cleanup
|
|
Linux 4.12-rc7
Needed at least rc6 for drm-misc-next-fixes, may as well go to rc7
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer fixes from Thomas Gleixner:
"A few fixes for timekeeping and timers:
- Plug a subtle race due to a missing READ_ONCE() in the timekeeping
code where reloading of a pointer results in an inconsistent
callback argument being supplied to the clocksource->read function.
- Correct the CLOCK_MONOTONIC_RAW sub-nanosecond accounting in the
time keeping core code, to prevent a possible discontuity.
- Apply a similar fix to the arm64 vdso clock_gettime()
implementation
- Add missing includes to clocksource drivers, which relied on
indirect includes which fails in certain configs.
- Use the proper iomem pointer for read/iounmap in a probe function"
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
arm64/vdso: Fix nsec handling for CLOCK_MONOTONIC_RAW
time: Fix CLOCK_MONOTONIC_RAW sub-nanosecond accounting
time: Fix clock->read(clock) race around clocksource changes
clocksource: Explicitly include linux/clocksource.h when needed
clocksource/drivers/arm_arch_timer: Fix read and iounmap of incorrect variable
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI fix from Rafael Wysocki:
"This fixes the ACPI-based enumeration of some I2C and SPI devices
broken in 4.11.
Specifics:
- I2C and SPI devices are expected to be enumerated by the I2C and
SPI subsystems, respectively, but due to a change made during the
4.11 cycle, in some cases the ACPI core marks them as already
enumerated which causes the I2C and SPI subsystems to overlook
them, so fix that (Jarkko Nikula)"
* tag 'acpi-4.12-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI / scan: Fix enumeration for special SPI and I2C devices
|
|
Commit bf5eb3de3847 ("slub: separate out sysfs_slab_release() from
sysfs_slab_remove()") made slub sysfs file removals synchronous to
kmem_cache shutdown.
Unfortunately, this created a possible ABBA deadlock between slab_mutex
and sysfs draining mechanism triggering the following lockdep warning.
======================================================
[ INFO: possible circular locking dependency detected ]
4.10.0-test+ #48 Not tainted
-------------------------------------------------------
rmmod/1211 is trying to acquire lock:
(s_active#120){++++.+}, at: [<ffffffff81308073>] kernfs_remove+0x23/0x40
but task is already holding lock:
(slab_mutex){+.+.+.}, at: [<ffffffff8120f691>] kmem_cache_destroy+0x41/0x2d0
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (slab_mutex){+.+.+.}:
lock_acquire+0xf6/0x1f0
__mutex_lock+0x75/0x950
mutex_lock_nested+0x1b/0x20
slab_attr_store+0x75/0xd0
sysfs_kf_write+0x45/0x60
kernfs_fop_write+0x13c/0x1c0
__vfs_write+0x28/0x120
vfs_write+0xc8/0x1e0
SyS_write+0x49/0xa0
entry_SYSCALL_64_fastpath+0x1f/0xc2
-> #0 (s_active#120){++++.+}:
__lock_acquire+0x10ed/0x1260
lock_acquire+0xf6/0x1f0
__kernfs_remove+0x254/0x320
kernfs_remove+0x23/0x40
sysfs_remove_dir+0x51/0x80
kobject_del+0x18/0x50
__kmem_cache_shutdown+0x3e6/0x460
kmem_cache_destroy+0x1fb/0x2d0
kvm_exit+0x2d/0x80 [kvm]
vmx_exit+0x19/0xa1b [kvm_intel]
SyS_delete_module+0x198/0x1f0
entry_SYSCALL_64_fastpath+0x1f/0xc2
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(slab_mutex);
lock(s_active#120);
lock(slab_mutex);
lock(s_active#120);
*** DEADLOCK ***
2 locks held by rmmod/1211:
#0: (cpu_hotplug.dep_map){++++++}, at: [<ffffffff810a7877>] get_online_cpus+0x37/0x80
#1: (slab_mutex){+.+.+.}, at: [<ffffffff8120f691>] kmem_cache_destroy+0x41/0x2d0
stack backtrace:
CPU: 3 PID: 1211 Comm: rmmod Not tainted 4.10.0-test+ #48
Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v02.05 05/07/2012
Call Trace:
print_circular_bug+0x1be/0x210
__lock_acquire+0x10ed/0x1260
lock_acquire+0xf6/0x1f0
__kernfs_remove+0x254/0x320
kernfs_remove+0x23/0x40
sysfs_remove_dir+0x51/0x80
kobject_del+0x18/0x50
__kmem_cache_shutdown+0x3e6/0x460
kmem_cache_destroy+0x1fb/0x2d0
kvm_exit+0x2d/0x80 [kvm]
vmx_exit+0x19/0xa1b [kvm_intel]
SyS_delete_module+0x198/0x1f0
? SyS_delete_module+0x5/0x1f0
entry_SYSCALL_64_fastpath+0x1f/0xc2
It'd be the cleanest to deal with the issue by removing sysfs files
without holding slab_mutex before the rest of shutdown; however, given
the current code structure, it is pretty difficult to do so.
This patch punts sysfs file removal to a work item. Before commit
bf5eb3de3847, the removal was punted to a RCU delayed work item which is
executed after release. Now, we're punting to a different work item on
shutdown which still maintains the goal removing the sysfs files earlier
when destroying kmem_caches.
Link: http://lkml.kernel.org/r/20170620204512.GI21326@htj.duckdns.org
Fixes: bf5eb3de3847 ("slub: separate out sysfs_slab_release() from sysfs_slab_remove()")
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Tested-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Drop them from u64 fields, tag local variables correctly instead.
While being at it switch the code to use u64_to_user_ptr().
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Acked-by: Daniel Vetter <daniel@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/20170620113916.6967-2-kraxel@redhat.com
|
|
Add an helper to wait for all page flips of an atomic state to be done.
v2:
- Pimp kerneldoc as discussed with Boris on irc
- Add missing doc for @dev.
- Use old_state for consitency with wait_for_vblanks
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com> (v1)
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1496392332-8722-2-git-send-email-boris.brezillon@free-electrons.com
|
|
Pull block fixes from Jens Axboe:
"This contains a set of fixes for xen-blkback by way of Konrad, and a
performance regression fix for blk-mq for shared tags.
The latter could account for as much as a 50x reduction in
performance, with the test case from the user with 500 name spaces. A
more realistic setup on my end with 32 drives showed a 3.5x drop. The
fix has been thoroughly tested before being committed"
* 'for-linus' of git://git.kernel.dk/linux-block:
blk-mq: fix performance regression with shared tags
xen-blkback: don't leak stack data via response ring
xen/blkback: don't use xen_blkif_get() in xen-blkback kthread
xen/blkback: don't free be structure too early
xen/blkback: fix disconnect while I/Os in flight
|
|
Commit f406270bf73d ("ACPI / scan: Set the visited flag for all
enumerated devices") caused that two group of special SPI or I2C
devices do not enumerate. SPI and I2C devices are expected to be
enumerated by the SPI and I2C subsystems but change caused that
acpi_bus_attach() marks those devices with acpi_device_set_enumerated().
First group of devices are matched using Device Tree compatible property
with special _HID "PRP0001". Those devices have matched scan handler,
acpi_scan_attach_handler() retuns 1 and acpi_bus_attach() marks them
with acpi_device_set_enumerated().
Second group of devices without valid _HID such as "LNXVIDEO" have
device->pnp.type.platform_id set to zero and change again marks them
with acpi_device_set_enumerated().
Fix this by flagging the SPI and I2C devices during struct acpi_device
object initialization time and let the code in acpi_bus_attach() to go
through the device_attach() and acpi_default_enumeration() path for all
SPI and I2C devices.
Fixes: f406270bf73d (ACPI / scan: Set the visited flag for all enumerated devices)
Signed-off-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: 4.11+ <stable@vger.kernel.org> # 4.11+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Pull networking fixes from David Miller:
1) Fix refcounting wrt timers which hold onto inet6 address objects,
from Xin Long.
2) Fix an ancient bug in wireless wext ioctls, from Johannes Berg.
3) Firmware handling fixes in brcm80211 driver, from Arend Van Spriel.
4) Several mlx5 driver fixes (firmware readiness, timestamp cap
reporting, devlink command validity checking, tc offloading, etc.)
From Eli Cohen, Maor Dickman, Chris Mi, and Or Gerlitz.
5) Fix dst leak in IP/IP6 tunnels, from Haishuang Yan.
6) Fix dst refcount bug in decnet, from Wei Wang.
7) Netdev can be double freed in register_vlan_device(). Fix from Gao
Feng.
8) Don't allow object to be destroyed while it is being dumped in SCTP,
from Xin Long.
9) Fix dpaa_eth build when modular, from Madalin Bucur.
10) Fix throw route leaks, from Serhey Popovych.
11) IFLA_GROUP missing from if_nlmsg_size() and ifla_policy[] table,
also from Serhey Popovych.
12) Fix premature TX SKB free in stmmac, from Niklas Cassel.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (36 commits)
igmp: add a missing spin_lock_init()
net: stmmac: free an skb first when there are no longer any descriptors using it
sfc: remove duplicate up_write on VF filter_sem
rtnetlink: add IFLA_GROUP to ifla_policy
ipv6: Do not leak throw route references
dt-bindings: net: sms911x: Add missing optional VDD regulators
dpaa_eth: reuse the dma_ops provided by the FMan MAC device
fsl/fman: propagate dma_ops
net/core: remove explicit do_softirq() from busy_poll_stop()
fib_rules: Resolve goto rules target on delete
sctp: ensure ep is not destroyed before doing the dump
net/hns:bugfix of ethtool -t phy self_test
net: 8021q: Fix one possible panic caused by BUG_ON in free_netdev
cxgb4: notify uP to route ctrlq compl to rdma rspq
ip6_tunnel: Correct tos value in collect_md mode
decnet: always not take dst->__refcnt when inserting dst into hash table
ip6_tunnel: fix potential issue in __ip6_tnl_rcv
ip_tunnel: fix potential issue in ip_tunnel_rcv
brcmfmac: fix uninitialized warning in brcmf_usb_probe_phase2()
net/mlx5e: Avoid doing a cleanup call if the profile doesn't have it
...
|
|
If we have shared tags enabled, then every IO completion will trigger
a full loop of every queue belonging to a tag set, and every hardware
queue for each of those queues, even if nothing needs to be done.
This causes a massive performance regression if you have a lot of
shared devices.
Instead of doing this huge full scan on every IO, add an atomic
counter to the main queue that tracks how many hardware queues have
been marked as needing a restart. With that, we can avoid looking for
restartable queues, if we don't have to.
Max reports that this restores performance. Before this patch, 4K
IOPS was limited to 22-23K IOPS. With the patch, we are running at
950-970K IOPS.
Fixes: 6d8c6c0f97ad ("blk-mq: Restart a single queue if tag sets are shared")
Reported-by: Max Gurtovoy <maxg@mellanox.com>
Tested-by: Max Gurtovoy <maxg@mellanox.com>
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Tested-by: Bart Van Assche <bart.vanassche@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
git://anongit.freedesktop.org/git/drm-misc into drm-next
UAPI Changes:
- vc4: Add get/set tiling format ioctls (Eric)
Driver Changes:
- vc4: Add tiling T-format support for scanout (Eric)
- vc4: Use atomic helpers in commit (Boris)
Cc: Boris Brezillon <boris.brezillon@free-electrons.com>
Cc: Eric Anholt <eric@anholt.net>
* tag 'drm-misc-next-2017-06-19_0' of git://anongit.freedesktop.org/git/drm-misc:
drm/vc4: Mimic drm_atomic_helper_commit() behavior
drm/vc4: Add get/set tiling ioctls.
drm/vc4: Add T-format scanout support.
|
|
git://anongit.freedesktop.org/git/drm-intel into drm-next
Final pile of features for 4.13
New uabi:
- batch bo in first slot, for faster execbuf assembly in userspace
(Chris Wilson)
- (sub)slice getparam, needed for mesa perf support (Robert Bragg)
First pile of patches for cnl/cfl support, maintained by Rodrigo but
with lots of contributions from others. Still incomplete since public
review still ongoing.
Features/refactoring:
- Make execbuf faster (Chris Wilson), a pile of series to make execbuf
buffer handling have fewer passes, use less list walking, postpone
more work to async workers and shuffle buffers less, all to make the
common case much faster (in some cases at least).
- cold boot support for glk dsi (Madhav Chauhan)
- Clean up pipe A quirk and related old platform hacks (Ville)
- perf sampling support for kbl/glk (Lionel)
- perf cleanups (Robert Bragg)
- wire atomic state to backlight code, to avoid pipe lookup hacks
(Maarten)
- reduce request waiting latency/overhead to remove the spinning and
associated cpu cycle wasting (Chris)
- fix 90/270 rotation wm computation (Ville)
- new ddb allocation algo for skl (Kumar Mahesh)
- fix regression due to system suspend optimiazatino (Imre)
- the usual pile of small cleanups and refactors all over
GVT updates contained in this tag:
- optimization for per-VM mmio save/restore (Changbin)
- optimization for mmio hash table (Changbin)
- scheduler optimization with event (Ping)
- vGPU reset refinement (Fred)
- other misc refactor and cleanups, etc.
* tag 'drm-intel-next-2017-06-19' of git://anongit.freedesktop.org/git/drm-intel: (170 commits)
drm/i915: Update DRIVER_DATE to 20170619
drm/i915/cfl: Introduce Coffee Lake workarounds.
drm/i915: Store 9 bits of PCI Device ID for platforms with a LP PCH
drm/i915: Stash a pointer to the obj's resv in the vma
drm/i915: Async GPU relocation processing
drm/i915: Allow execbuffer to use the first object as the batch
drm/i915: Wait upon userptr get-user-pages within execbuffer
drm/i915: First try the previous execbuffer location
drm/i915: Store a persistent reference for an object in the execbuffer cache
drm/i915: Eliminate lots of iterations over the execobjects array
drm/i915: Disable EXEC_OBJECT_ASYNC when doing relocations
drm/i915: Pass vma to relocate entry
drm/i915: Store a direct lookup from object handle to vma
drm/i915: Fix retrieval of hangcheck stats
drm/i915: Store i915_gem_object_is_coherent() as a bit next to cache-dirty
drm/i915: Mark CPU cache as dirty on every transition for CPU writes
drm/i915: Make i915_vma_destroy() static
drm/i915: Actually attach the tv_format property to the SDVO connector
Revert "drm/i915/skl: New ddb allocation algorithm"
drm/i915/glk: Add cold boot sequence for GLK DSI
...
|
|
git://people.freedesktop.org/~robclark/linux into drm-next
This time around, the biggest thing is a bunch of GEM rework for more
fine grained locking and prep work to handle multiple address spaces
(ie. per-process pagetables). Also some HDMI fixes for 8x96
(snapdragon 820).
One unrelated bus patch, for something that seems to get merged
through whatever random tree (and has all the right ack's).
* tag 'drm-msm-next-2017-06-20' of git://people.freedesktop.org/~robclark/linux:
drm/msm: Fix potential buffer overflow issue
bus: SIMPLE_PM_BUS does not depend on ARCH_RENESAS
drm/msm: Separate locking of buffer resources from struct_mutex
drm/msm/hdmi: Fix HDMI pink strip issue seen on 8x96
drm/msm/hdmi: 8996 PLL: Populate unprepare
drm/msm/hdmi: Use bitwise operators when building register values
drm/msm: update generated headers
drm/msm: remove address-space id
drm/msm: support for an arbitrary number of address spaces
drm/msm: refactor how we handle vram carveout buffers
drm/msm: pass address-space to _get_iova() and friends
drm/msm/mdp4+5: move aspace/id to base class
drm/msm/mdp5: kill pipe_lock
drm/msm: fix locking inconsistency for gpu->hw_init()
drm/msm: Remove memptrs->wptr
drm/msm: Add a struct to pass configuration to msm_gpu_init()
drm/msm: Add hint to DRM_IOCTL_MSM_GEM_INFO to return an object IOVA
drm/msm: Remove idle function hook
drm/msm: Remove DRM_MSM_NUM_IOCTLS
drm/msm: gpu: Enable zap shader for A5XX
|
|
drm_fbdev_cma_set_suspend{,_unlocked} use an integer parameter
to describe whether the intended state is a suspend or a resume.
It then passes the value to drm_fb_helper_set_suspend{,_unlocked}
which uses a boolean. Switch to using bool everywhere.
Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/20170620102320.8849-1-Liviu.Dudau@arm.com
|
|
Due to how the MONOTONIC_RAW accumulation logic was handled,
there is the potential for a 1ns discontinuity when we do
accumulations. This small discontinuity has for the most part
gone un-noticed, but since ARM64 enabled CLOCK_MONOTONIC_RAW
in their vDSO clock_gettime implementation, we've seen failures
with the inconsistency-check test in kselftest.
This patch addresses the issue by using the same sub-ns
accumulation handling that CLOCK_MONOTONIC uses, which avoids
the issue for in-kernel users.
Since the ARM64 vDSO implementation has its own clock_gettime
calculation logic, this patch reduces the frequency of errors,
but failures are still seen. The ARM64 vDSO will need to be
updated to include the sub-nanosecond xtime_nsec values in its
calculation for this issue to be completely fixed.
Signed-off-by: John Stultz <john.stultz@linaro.org>
Tested-by: Daniel Mentz <danielmentz@google.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Kevin Brodsky <kevin.brodsky@arm.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Stephen Boyd <stephen.boyd@linaro.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: "stable #4 . 8+" <stable@vger.kernel.org>
Cc: Miroslav Lichvar <mlichvar@redhat.com>
Link: http://lkml.kernel.org/r/1496965462-20003-3-git-send-email-john.stultz@linaro.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
In tests, which excercise switching of clocksources, a NULL
pointer dereference can be observed on AMR64 platforms in the
clocksource read() function:
u64 clocksource_mmio_readl_down(struct clocksource *c)
{
return ~(u64)readl_relaxed(to_mmio_clksrc(c)->reg) & c->mask;
}
This is called from the core timekeeping code via:
cycle_now = tkr->read(tkr->clock);
tkr->read is the cached tkr->clock->read() function pointer.
When the clocksource is changed then tkr->clock and tkr->read
are updated sequentially. The code above results in a sequential
load operation of tkr->read and tkr->clock as well.
If the store to tkr->clock hits between the loads of tkr->read
and tkr->clock, then the old read() function is called with the
new clock pointer. As a consequence the read() function
dereferences a different data structure and the resulting 'reg'
pointer can point anywhere including NULL.
This problem was introduced when the timekeeping code was
switched over to use struct tk_read_base. Before that, it was
theoretically possible as well when the compiler decided to
reload clock in the code sequence:
now = tk->clock->read(tk->clock);
Add a helper function which avoids the issue by reading
tk_read_base->clock once into a local variable clk and then issue
the read function via clk->read(clk). This guarantees that the
read() function always gets the proper clocksource pointer handed
in.
Since there is now no use for the tkr.read pointer, this patch
also removes it, and to address stopping the fast timekeeper
during suspend/resume, it introduces a dummy clocksource to use
rather then just a dummy read function.
Signed-off-by: John Stultz <john.stultz@linaro.org>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Stephen Boyd <stephen.boyd@linaro.org>
Cc: stable <stable@vger.kernel.org>
Cc: Miroslav Lichvar <mlichvar@redhat.com>
Cc: Daniel Mentz <danielmentz@google.com>
Link: http://lkml.kernel.org/r/1496965462-20003-2-git-send-email-john.stultz@linaro.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
I spotted a markup issue, plus adding the descriptions in drm_driver.
Plus a few more links while at it.
I'm still mildly unhappy with the split between fops and ioctls, but I
still think having the ioctls in the uapi chapter makes more sense. Oh
well ...
v2: Rebase.
v3: Move misplace hunk to the right patch.
Cc: Stefan Agner <stefan@agner.ch>
Acked-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170531092045.3950-1-daniel.vetter@ffwll.ch
|
|
The magic switching between proper pci driver and shadow-attach isn't
useful anymore since there's no ums+kms drivers left. Let's split this
up properly, calling pci_register_driver for kms drivers and renaming
the shadow-attach init to drm_legacy_pci_init/exit.
Acked-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170524145212.27837-6-daniel.vetter@ffwll.ch
|
|
The only special-case is pci devices, and we can easily handle this in
the core. Do so and drop a pile of boilerplate from drivers.
Acked-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170524145212.27837-5-daniel.vetter@ffwll.ch
|
|
We use drm_crtc_ for all the new-style vblank functions which directly
take a struct drm_crtc *. drm_accurate_vblank_count was the odd one
out, correct this to appease my OCD.
Acked-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170524145212.27837-13-daniel.vetter@ffwll.ch
|
|
Unify and review everything, plus make sure it's all correct markup.
Drop the kernel-doc for internal functions. Also rework the overview
section, it's become rather outdated.
Unfortuantely the kernel-doc in drm_driver isn't rendered yet, but
that will change as soon as drm_driver is kernel-docified properly.
Also document properly that drm_vblank_cleanup is optional, the core
calls this already.
v2: Make it clear that cleanup happens in drm_dev_fini for drivers
with their own ->release callback (Thierry).
Acked-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170524145212.27837-11-daniel.vetter@ffwll.ch
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux
Pull clk fixes from Stephen Boyd:
"One build fix for an Amlogic clk driver and a handful of Allwinner clk
driver fixes for some DT bindings and a randconfig build error that
all came in this merge window"
* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
clk: sunxi-ng: a64: Export PLL_PERIPH0 clock for the PRCM
clk: sunxi-ng: h3: Export PLL_PERIPH0 clock for the PRCM
dt-bindings: clock: sunxi-ccu: Add pll-periph to PRCM's needed clocks
clk: sunxi-ng: sun5i: Fix ahb_bist_clk definition
clk: sunxi-ng: enable SUNXI_CCU_MP for PRCM
clk: meson: gxbb: fix build error without RESET_CONTROLLER
clk: sunxi-ng: v3s: Fix usb otg device reset bit
clk: sunxi-ng: a31: Correct lcd1-ch1 clock register offset
|
|
into drm-next
A few more things for 4.13:
- Semaphore support using sync objects
- Drop fb location programming
- Optimize bo list ioctl
* 'drm-next-4.13' of git://people.freedesktop.org/~agd5f/linux:
drm/amdgpu: Optimize mutex usage (v4)
drm/amdgpu: Optimization of AMDGPU_BO_LIST_OP_CREATE (v2)
amdgpu: use drm sync objects for shared semaphores (v6)
amdgpu/cs: split out fence dependency checking (v2)
drm/amdgpu: don't check the default value for vm size
|
|
git://anongit.freedesktop.org/tegra/linux into drm-next
drm/tegra: Changes for v4.13-rc1
This starts off with the addition of more documentation for the host1x
and DRM drivers and finishes with a slew of fixes and enhancements for
the staging IOCTLs as a result of the awesome work done by Dmitry and
Erik on the grate reverse-engineering effort.
* tag 'drm/tegra/for-4.13-rc1' of git://anongit.freedesktop.org/tegra/linux:
gpu: host1x: At first try a non-blocking allocation for the gather copy
gpu: host1x: Refactor channel allocation code
gpu: host1x: Remove unused host1x_cdma_stop() definition
gpu: host1x: Remove unused 'struct host1x_cmdbuf'
gpu: host1x: Check waits in the firewall
gpu: host1x: Correct swapped arguments in the is_addr_reg() definition
gpu: host1x: Forbid unrelated SETCLASS opcode in the firewall
gpu: host1x: Forbid RESTART opcode in the firewall
gpu: host1x: Forbid relocation address shifting in the firewall
gpu: host1x: Do not leak BO's phys address to userspace
gpu: host1x: Correct host1x_job_pin() error handling
gpu: host1x: Initialize firewall class to the job's one
drm/tegra: dc: Disable plane if it is invisible
drm/tegra: dc: Apply clipping to the plane
drm/tegra: dc: Avoid reset asserts on Tegra20
drm/tegra: Check syncpoint ID in the 'submit' IOCTL
drm/tegra: Correct copying of waitchecks and disable them in the 'submit' IOCTL
drm/tegra: Check for malformed offsets and sizes in the 'submit' IOCTL
drm/tegra: Add driver documentation
gpu: host1x: Flesh out kerneldoc
|
|
Stack guard page is a useful feature to reduce a risk of stack smashing
into a different mapping. We have been using a single page gap which
is sufficient to prevent having stack adjacent to a different mapping.
But this seems to be insufficient in the light of the stack usage in
userspace. E.g. glibc uses as large as 64kB alloca() in many commonly
used functions. Others use constructs liks gid_t buffer[NGROUPS_MAX]
which is 256kB or stack strings with MAX_ARG_STRLEN.
This will become especially dangerous for suid binaries and the default
no limit for the stack size limit because those applications can be
tricked to consume a large portion of the stack and a single glibc call
could jump over the guard page. These attacks are not theoretical,
unfortunatelly.
Make those attacks less probable by increasing the stack guard gap
to 1MB (on systems with 4k pages; but make it depend on the page size
because systems with larger base pages might cap stack allocations in
the PAGE_SIZE units) which should cover larger alloca() and VLA stack
allocations. It is obviously not a full fix because the problem is
somehow inherent, but it should reduce attack space a lot.
One could argue that the gap size should be configurable from userspace,
but that can be done later when somebody finds that the new 1MB is wrong
for some special case applications. For now, add a kernel command line
option (stack_guard_gap) to specify the stack gap size (in page units).
Implementation wise, first delete all the old code for stack guard page:
because although we could get away with accounting one extra page in a
stack vma, accounting a larger gap can break userspace - case in point,
a program run with "ulimit -S -v 20000" failed when the 1MB gap was
counted for RLIMIT_AS; similar problems could come with RLIMIT_MLOCK
and strict non-overcommit mode.
Instead of keeping gap inside the stack vma, maintain the stack guard
gap as a gap between vmas: using vm_start_gap() in place of vm_start
(or vm_end_gap() in place of vm_end if VM_GROWSUP) in just those few
places which need to respect the gap - mainly arch_get_unmapped_area(),
and and the vma tree's subtree_gap support for that.
Original-patch-by: Oleg Nesterov <oleg@redhat.com>
Original-patch-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Tested-by: Helge Deller <deller@gmx.de> # parisc
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211
Johannes Berg says:
====================
Here's just the fix for that ancient bug:
* remove wext calling ndo_do_ioctl, since nobody needs
that now and it makes the type change easier
* use struct iwreq instead of struct ifreq almost everywhere
in wireless extensions code
* copy only struct iwreq from userspace in dev_ioctl for the
wireless extensions, since it's smaller than struct ifreq
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This creates a new command submission chunk for amdgpu
to add in and out sync objects around the submission.
Sync objects are managed via the drm syncobj ioctls.
The command submission interface is enhanced with two new
chunks, one for syncobj pre submission dependencies,
and one for post submission sync obj signalling,
and just takes a list of handles for each.
This is based on work originally done by David Zhou at AMD,
with input from Christian Konig on what things should look like.
In theory VkFences could be backed with sync objects and
just get passed into the cs as syncobj handles as well.
NOTE: this interface addition needs a version bump to expose
it to userspace.
TODO: update to dep_sync when rebasing onto amdgpu master.
(with this - r-b from Christian)
v1.1: keep file reference on import.
v2: move to using syncobjs
v2.1: change some APIs to just use p pointer.
v3: make more robust against CS failures, we now add the
wait sems but only remove them once the CS job has been
submitted.
v4: rewrite names of API and base on new syncobj code.
v5: move post deps earlier, rename some apis
v6: lookup post deps earlier, and just replace fences
in post deps stage (Christian)
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Currently, the last object in the execlist is the always the batch.
However, when building the batch buffer we often know the batch object
first and if we can use the first slot in the execlist we can emit
relocation instructions relative to it immediately and avoid a separate
pass to adjust the relocations to point to the last execlist slot.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
|
|
Modify the 'pad' member of struct drm_msm_gem_info to 'flags'. If the
user sets 'flags' to non-zero it means that they want a IOVA for the
GEM object instead of a mmap() offset. Return the iova in the 'offset'
member.
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
[robclark: s/hint/flags in commit msg]
Signed-off-by: Rob Clark <robdclark@gmail.com>
|
|
The ioctl array is sparsely populated but the compiler will make sure
that it is sufficiently sized for all the values that we have so we
can safely use ARRAY_SIZE() instead of having a constantly changing
#define in the uapi header.
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
|
|
Pull configfs updates from Christoph Hellwig:
"A fix from Nic for a race seen in production (including a stable tag).
And while I'm sending you this I'm also sneaking in a trivial new
helper from Bart so that we don't need inter-tree dependencies for the
next merge window"
* tag 'configfs-for-4.12' of git://git.infradead.org/users/hch/configfs:
configfs: Introduce config_item_get_unless_zero()
configfs: Fix race between create_link and configfs_rmdir
|
|
Pull block layer fix from Jens Axboe:
"Just a single fix this week, fixing a regression introduced in this
release.
When we put the final reference to the queue, we may need to block.
Ensure that we can safely do so. From Bart"
* 'for-linus' of git://git.kernel.dk/linux-block:
block: Fix a blk_exit_rl() regression
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging
Pull dmi fixes from Jean Delvare.
* 'dmi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging:
firmware: dmi_scan: Check DMI structure length
firmware: dmi: Fix permissions of product_family
firmware: dmi_scan: Make dmi_walk and dmi_walk_early return real error codes
firmware: dmi_scan: Look for SMBIOS 3 entry point first
|
|
Linux 4.12-rc5 for nouveau fixes
|
|
git://git.pengutronix.de/git/pza/linux into drm-next
imx-drm: cleanups and YUV 4:2:0 memory read/write reduction support
- Remove counter load enable form PRE, which has no effect.
- Add support for setting the double read/write reduction flag in channel
parameter memory. This can be used to save some memory bandwidth when
capturing in YUV 4:2:0 chroma subsampled formats.
- Allocate DMA channel structures as needed, most of the 64 channels are
unused or even reserved.
- Remove unused interrupt busy waiting routine.
- Set VDIC field order for both AUTO and MAN inputs simultaneously as
both can't be active at the same time.
* tag 'imx-drm-next-2017-06-08' of git://git.pengutronix.de/git/pza/linux:
gpu: ipu-v3: vdic: include AUTO field order bit in ipu_vdi_set_field_order
gpu: ipu-v3: remove interrupt busy waiting routine
gpu: ipu-v3: allocate ipuv3_channels as needed
gpu: ipu-v3: Add support for double read/write reduction
gpu: ipu-v3: prg: remove counter load enable
|
|
The series interleaves DRM and V4L2 patches due to dependencies between the R-
Car DU and VSP drivers. Mauro has acked all the V4L2 patches to go through
your tree, and they don't conflict with anything queued for v4.13 in his tree.
If I need to send any conflicting patches through Mauro's tree for v4.13, I'll
make sure to base them on this branch.
* 'drm/next/du' of git://linuxtv.org/pinchartl/media:
drm: rcar-du: Map memory through the VSP device
v4l: vsp1: Add API to map and unmap DRM buffers through the VSP
v4l: vsp1: Map the DL and video buffers through the proper bus master
v4l: rcar-fcp: Add an API to retrieve the FCP device
v4l: rcar-fcp: Don't get/put module reference
drm: rcar-du: Register a completion callback with VSP1
v4l: vsp1: Extend VSP1 module API to allow DRM callbacks
v4l: vsp1: Postpone frame end handling in event of display list race
drm: rcar-du: Arm the page flip event after queuing the page flip
|
|
into drm-next
New radeon and amdgpu features for 4.13:
- Lots of Vega10 bug fixes
- Preliminary Raven support
- KIQ support for compute rings
- MEC queue management rework from Andres
- Audio support for DCE6
- SR-IOV improvements
- Improved module parameters for controlling radeon vs amdgpu support
for SI and CIK
- Bug fixes
- General code cleanups
[airlied: dropped drmP.h header from one file was needed and build broke]
* 'drm-next-4.13' of git://people.freedesktop.org/~agd5f/linux: (362 commits)
drm/amdgpu: Fix compiler warnings
drm/amdgpu: vm_update_ptes remove code duplication
drm/amd/amdgpu: Port VCN over to new SOC15 macros
drm/amd/amdgpu: Port PSP v10.0 over to new SOC15 macros
drm/amd/amdgpu: Port PSP v3.1 over to new SOC15 macros
drm/amd/amdgpu: Port NBIO v7.0 driver over to new SOC15 macros
drm/amd/amdgpu: Port NBIO v6.1 driver over to new SOC15 macros
drm/amd/amdgpu: Port UVD 7.0 over to new SOC15 macros
drm/amd/amdgpu: Port MMHUB over to new SOC15 macros
drm/amd/amdgpu: Cleanup gfxhub read-modify-write patterns
drm/amd/amdgpu: Port GFXHUB over to new SOC15 macros
drm/amd/amdgpu: Add offset variant to SOC15 macros
drm/amd/powerplay: add avfs control for Vega10
drm/amdgpu: add virtual display support for raven
drm/amdgpu/gfx9: fix compute ring doorbell index
drm/amd/amdgpu: Rename KIQ ring to avoid spaces
drm/amd/amdgpu: gfx9 tidy ups (v2)
drm/amdgpu: add contiguous flag in ucode bo create
drm/amdgpu: fix missed gpu info firmware when cache firmware during S3
drm/amdgpu: export test ib debugfs interface
...
|
|
git://anongit.freedesktop.org/git/drm-misc into drm-next
Cross-subsystem Changes:
- dt-bindings: add vendor prefix for NLT Technologies, Ltd. (Lucas)
- dt-bindings: Add support for samsung s6e3hf2 panel (Hoegeun)
Core Changes:
- Add drm_panel_bridge to avoid connector boilerplate in drivers (Eric)
- Trival fixes for dupe forward decl and reduce scope of variable (Dawid)
Driver Changes:
- dw-hdmi: Use mode_valid hook on bridge instead of connector (Jose)
- vc4,atmel-hlcdc: Use drm_panel_bridge where appropriate (Eric)
- panel: Add Innolux P079ZCA panel driver (Chris)
- panel-simple: Add NL12880B20-05, NL192108AC18-02D, P320HVN03 panels (Lucas)
- panel-samsung-s6e3ha2: Add s6e3hf2 panel support (Hoegeun)
- zte,vc4,pl111,panel,mxsfb: Miscellaneous fixes
Cc: Jose Abreu <Jose.Abreu@synopsys.com>
Cc: Eric Anholt <eric@anholt.net>
Cc: Chris Zhong <zyw@rock-chips.com>
Cc: Lucas Stach <l.stach@pengutronix.de>
Cc: Hoegeun Kwon <hoegeun.kwon@samsung.com>
Cc: Dawid Kurek <dawikur@gmail.com>
* tag 'drm-misc-next-2017-06-15' of git://anongit.freedesktop.org/git/drm-misc: (26 commits)
drm: Reduce scope of 'state' variable
drm: mxsfb_crtc: Reset the eLCDIF controller
drm: Remove duplicate forward declaration
drm/panel: s6e3ha2: Add support for s6e3hf2 panel on TM2e board
dt-bindings: Add support for samsung s6e3hf2 panel
drm/panel: add backlight dependency for sitronix-st7789v
drm/panel: S6E3HA2 needs backlight code
drm/panel: simple: add support for AUO P320HVN03
drm/panel: simple: add support for NLT NL192108AC18-02D
dt-bindings: add vendor prefix for NLT Technologies, Ltd.
drm/panel: simple: add support for NEC NL12880B20-05
drm/panel: add Innolux P079ZCA panel driver
dt-bindings: Add INNOLUX P079ZCA panel bindings
drm/vc4: Fix resource leak in 'vc4_get_hang_state_ioctl()' in error handling path
drm/vc4/vc4_bo.c: always set bo->resv
drm: Add const to name field declaration in struct drm_prop_enum_list
drm/pl111: Fix offset calculation for the primary plane.
drm/atmel-hlcdc: Fix panel registration
drm/bridge: Build the panel wrapper in drm_kms_helper
drm/atmel-hlcdc: Replace the panel usage with drm_panel_bridge.
...
|
|
This allows mesa to set the tiling format for a BO and have that
tiling format be respected by mesa on the other side of an
import/export (and by vc4 scanout in the kernel), without defining a
protocol to pass the tiling through userspace.
Signed-off-by: Eric Anholt <eric@anholt.net>
Link: http://patchwork.freedesktop.org/patch/msgid/20170608001336.12842-2-eric@anholt.net
Acked-by: Dave Airlie <airlied@redhat.com>
|
|
The T tiling format is what V3D uses for textures, with no raster
support at all until later revisions of the hardware (and always at a
large 3D performance penalty). If we can't scan out V3D's format,
then we often need to do a relayout at some stage of the pipeline,
either right before texturing from the scanout buffer (common in X11
without a compositor) or between a tiled screen buffer right before
scanout (an option I've considered in trying to resolve this
inconsistency, but which means needing to use the dirty fb ioctl and
having some update policy).
T-format scanout lets us avoid either of those shadow copies, for a
massive, obvious performance improvement to X11 window dragging
without a compositor. Unfortunately, enabling a compositor to work
around the discrepancy has turned out to be too costly in memory
consumption for the Raspbian distribution.
Because the HVS operates a scanline at a time, compositing from T does
increase the memory bandwidth cost of scanout. On my 1920x1080@32bpp
display on a RPi3, we go from about 15% of system memory bandwidth
with linear to about 20% with tiled. However, for X11 this still ends
up being a huge performance win in active usage.
This patch doesn't yet handle src_x/src_y offsetting within the tiled
buffer. However, we fail to do so for untiled buffers already.
Signed-off-by: Eric Anholt <eric@anholt.net>
Link: http://patchwork.freedesktop.org/patch/msgid/20170608001336.12842-1-eric@anholt.net
Reviewed-by: Boris Brezillon <boris.brezillon@free-electrons.com>
|
|
This is largely a rewrite of the Host1x channel allocation code, bringing
several changes:
- The previous code could deadlock due to an interaction
between the 'reflock' mutex and CDMA timeout handling.
This gets rid of the mutex.
- Support for more than 32 channels, required for Tegra186
- General refactoring, including better encapsulation
of channel ownership handling into channel.c
Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
Tested-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
Arguments of the .is_addr_reg() are swapped in the definition of the
function, that is quite confusing.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Erik Faye-Lund <kusmabite@gmail.com>
Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
Several channels could be made to write the same unit concurrently via
the SETCLASS opcode, trusting userspace is a bad idea. It should be
possible to drop the per-client channel reservation and add a per-unit
locking by inserting MLOCK's to the command stream to re-allow the
SETCLASS opcode, but it will be much more work. Let's forbid the
unit-unrelated class changes for now.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Erik Faye-Lund <kusmabite@gmail.com>
Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
|