Age | Commit message (Collapse) | Author |
|
Fix up one typos @nev -> @nr_irqs.
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/1589770859-19340-1-git-send-email-zhangshaokun@hisilicon.com
|
|
This change adds accounting for the memory allocated for shadow stacks.
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
|
|
The amount of time spent parsing fwnodes of devices can become really
high if the devices are added in an non-ideal order. Worst case can be
O(N^2) when N devices are added. But this can be optimized to O(N) by
adding all the devices and then parsing all their fwnodes in one batch.
This commit adds fw_devlink_pause() and fw_devlink_resume() to allow
doing this.
Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20200515053500.215929-4-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Commit 4dbe191c046e ("driver core: Add device links from fwnode only for
the primary device") skipped linking a fwnode's secondary device to
the suppliers listed in its fwnode.
However, a fwnode's secondary device can't be found using
get_dev_from_fwnode(). So, there's no point in trying to see if devices
waiting for suppliers might want to link to a fwnode's secondary device.
This commit removes that unnecessary step for devices that aren't a
fwnode's primary device and also moves the code to a more appropriate
part of the file.
Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20200515053500.215929-3-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
This commit just moves around code to match the general organization of
the file.
Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20200515053500.215929-2-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
These interfaces return a negative error number or an IRQ:
platform_get_irq()
platform_get_irq_optional()
platform_get_irq_byname()
platform_get_irq_byname_optional()
The function comments suggest checking for error like this:
irq = platform_get_irq(...);
if (irq < 0)
return irq;
which is what most callers (~900 of 1400) do, so it's implicit that IRQ 0
is invalid. But some callers check for "irq <= 0", and it's not obvious
from the source that we never return an IRQ 0.
Make this more explicit by updating the comments to say that an IRQ number
is always non-zero and adding a WARN() if we ever do return zero. If we do
return IRQ 0, it likely indicates a bug in the arch-specific parts of
platform_get_irq().
Relevant prior discussion at [1, 2].
[1] https://lore.kernel.org/r/Pine.LNX.4.64.0701250940220.25027@woody.linux-foundation.org/
[2] https://lore.kernel.org/r/Pine.LNX.4.64.0701252029570.25027@woody.linux-foundation.org/
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
|
|
We want the driver core fixes in here and this resolves a merge issue
with drivers/base/dd.c
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
All external users of device_create_vargs are gone, so remove it and
open code it in the only caller.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
default""
This reverts commit 18555cb6db2373b9a5ec1f7572773fd58c77f9ba.
The reason[1] for the original revert has now been fixed by
commit 00b247557858 ("driver core: Fix handling of
fw_devlink=permissive"). So, this patch reverts the revert. Marek has
also tested this patch with the fix mentioned above and confirmed that
the issue has been fixed.
[1] - https://lore.kernel.org/lkml/CAGETcx8nbz-J1gLvoEKE_HgCcVGyV2o8rZeq_USFKM6=s7WmNg@mail.gmail.com/T/#m12dfb5dfd23805b84c49f4bb2238a8cce436c2f7
[2] - https://lore.kernel.org/lkml/CAGETcx8nbz-J1gLvoEKE_HgCcVGyV2o8rZeq_USFKM6=s7WmNg@mail.gmail.com/T/#m2408a6ce098b2ebf583ca8534329695923ae57fe
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20200428192006.109006-1-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
driver developer is foolish
If platform bus driver registration is failed then, accessing
platform bus spin lock (&drv->driver.bus->p->klist_drivers.k_lock)
in __platform_driver_probe() without verifying the return value
__platform_driver_register() can lead to NULL pointer exception.
So check the return value before attempting the spin lock.
One such example is below:
For a custom usecase, I have intentionally failed the platform bus
registration and I expected all the platform device/driver
registrations to fail gracefully. But I came across this panic
issue.
[ 1.331067] BUG: kernel NULL pointer dereference, address: 00000000000000c8
[ 1.331118] #PF: supervisor write access in kernel mode
[ 1.331163] #PF: error_code(0x0002) - not-present page
[ 1.331208] PGD 0 P4D 0
[ 1.331233] Oops: 0002 [#1] PREEMPT SMP
[ 1.331268] CPU: 3 PID: 1 Comm: swapper/0 Tainted: G W 5.6.0-00049-g670d35fb0144 #165
[ 1.331341] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
[ 1.331406] RIP: 0010:_raw_spin_lock+0x15/0x30
[ 1.331588] RSP: 0000:ffffc9000001be70 EFLAGS: 00010246
[ 1.331632] RAX: 0000000000000000 RBX: 00000000000000c8 RCX: 0000000000000001
[ 1.331696] RDX: 0000000000000001 RSI: 0000000000000092 RDI: 0000000000000000
[ 1.331754] RBP: 00000000ffffffed R08: 0000000000000501 R09: 0000000000000001
[ 1.331817] R10: ffff88817abcc520 R11: 0000000000000670 R12: 00000000ffffffed
[ 1.331881] R13: ffffffff82dbc268 R14: ffffffff832f070a R15: 0000000000000000
[ 1.331945] FS: 0000000000000000(0000) GS:ffff88817bd80000(0000) knlGS:0000000000000000
[ 1.332008] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1.332062] CR2: 00000000000000c8 CR3: 000000000681e001 CR4: 00000000003606e0
[ 1.332126] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1.332189] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 1.332252] Call Trace:
[ 1.332281] __platform_driver_probe+0x92/0xee
[ 1.332323] ? rtc_dev_init+0x2b/0x2b
[ 1.332358] cmos_init+0x37/0x67
[ 1.332396] do_one_initcall+0x7d/0x168
[ 1.332428] kernel_init_freeable+0x16c/0x1c9
[ 1.332473] ? rest_init+0xc0/0xc0
[ 1.332508] kernel_init+0x5/0x100
[ 1.332543] ret_from_fork+0x1f/0x30
[ 1.332579] CR2: 00000000000000c8
[ 1.332616] ---[ end trace 3bd87f12e9010b87 ]---
[ 1.333549] note: swapper/0[1] exited with preempt_count 1
[ 1.333592] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000009
[ 1.333736] Kernel Offset: disabled
Note, this can only be triggered if a driver errors out from this call,
which should never happen. If it does, the driver needs to be fixed.
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Link: https://lore.kernel.org/r/20200408214003.3356-1-sathyanarayanan.kuppuswamy@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
Take advantage of the new kernel symbol namespacing functionality, and
export the fw_fallback_config symbol only to a new private firmware loader
namespace. This would prevent misuses from other drivers and makes it clear
the goal is to keep this private to the firmware loader only.
It should also make it clearer for folks git grep'ing for users of
the symbol that this exported symbol is private, and prevent future
accidental removals of the exported symbol.
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
Link: https://lore.kernel.org/r/20200424184916.22843-2-mcgrof@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Message logged by 'dev_xxx()' or 'pr_xxx()' should end with a '\n'.
While at it, convert some "printk(KERN_" into equivalent but less verbose
(pr|dev)_xxx functions.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Link: https://lore.kernel.org/r/20200411133158.27390-1-christophe.jaillet@wanadoo.fr
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Use kobj_to_dev() API instead of container_of().
Signed-off-by: zhouchuangao <zhouchuangao@xiaomi.com>
Link: https://lore.kernel.org/r/1587878031-16591-1-git-send-email-zhouchuangao@xiaomi.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
deferred_probe_timeout fires
In commit c8c43cee29f6 ("driver core: Fix
driver_deferred_probe_check_state() logic"), we set the default
driver_deferred_probe_timeout value to 30 seconds to allow for
drivers that are missing dependencies to have some time so that
the dependency may be loaded from userland after initcalls_done
is set.
However, Yoshihiro Shimoda reported that on his device that
expects to have unmet dependencies (due to "optional links" in
its devicetree), was failing to mount the NFS root.
In digging further, it seemed the problem was that while the
device properly probes after waiting 30 seconds for any missing
modules to load, the ip_auto_config() had already failed,
resulting in NFS to fail. This was due to ip_auto_config()
calling wait_for_device_probe() which doesn't wait for the
driver_deferred_probe_timeout to fire.
This patch tries to fix the issue by creating a waitqueue
for the driver_deferred_probe_timeout, and calling wait_event()
to make sure driver_deferred_probe_timeout is zero in
wait_for_device_probe() to make sure all the probing is
finished.
The downside to this solution is that kernel functionality that
uses wait_for_device_probe(), will block until the
driver_deferred_probe_timeout fires, regardless of if there is
any missing dependencies.
However, the previous patch reverts the default timeout value to
zero, so this side-effect will only affect users who specify a
driver_deferred_probe_timeout= value as a boot argument, where
the additional delay would be beneficial to allow modules to
load later during boot.
Thanks to Geert for chasing down that ip_auto_config was why NFS
was failing in this case!
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Cc: Rob Herring <robh@kernel.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
Cc: Sudeep Holla <sudeep.holla@arm.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Naresh Kamboju <naresh.kamboju@linaro.org>
Cc: Basil Eljuse <Basil.Eljuse@arm.com>
Cc: Ferry Toth <fntoth@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Anders Roxell <anders.roxell@linaro.org>
Cc: linux-pm@vger.kernel.org
Reported-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Tested-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Fixes: c8c43cee29f6 ("driver core: Fix driver_deferred_probe_check_state() logic")
Signed-off-by: John Stultz <john.stultz@linaro.org>
Link: https://lore.kernel.org/r/20200422203245.83244-4-john.stultz@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
warnings
In commit c8c43cee29f6 ("driver core: Fix
driver_deferred_probe_check_state() logic") and following
changes the logic was changes slightly so that if there is no
driver to match whats found in the dtb, we wait the sepcified
seconds for modules to be loaded by userland, and then timeout,
where as previously we'd print "ignoring dependency for device,
assuming no driver" and immediately return -ENODEV after
initcall_done.
However, in the timeout case (which previously existed but was
practicaly un-used without a boot argument), the timeout message
uses dev_WARN(). This means folks are now seeing a big backtrace
in their boot logs if there a entry in their dts that doesn't
have a driver.
To fix this, lets use dev_warn(), instead of dev_WARN() to match
the previous error path.
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Cc: Rob Herring <robh@kernel.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
Cc: Sudeep Holla <sudeep.holla@arm.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Naresh Kamboju <naresh.kamboju@linaro.org>
Cc: Basil Eljuse <Basil.Eljuse@arm.com>
Cc: Ferry Toth <fntoth@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Anders Roxell <anders.roxell@linaro.org>
Cc: linux-pm@vger.kernel.org
Reviewed-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Fixes: c8c43cee29f6 ("driver core: Fix driver_deferred_probe_check_state() logic")
Signed-off-by: John Stultz <john.stultz@linaro.org>
Link: https://lore.kernel.org/r/20200422203245.83244-3-john.stultz@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
This patch addresses a regression in 5.7-rc1+
In commit c8c43cee29f6 ("driver core: Fix
driver_deferred_probe_check_state() logic"), we both cleaned up
the logic and also set the default driver_deferred_probe_timeout
value to 30 seconds to allow for drivers that are missing
dependencies to have some time so that the dependency may be
loaded from userland after initcalls_done is set.
However, Yoshihiro Shimoda reported that on his device that
expects to have unmet dependencies (due to "optional links" in
its devicetree), was failing to mount the NFS root.
In digging further, it seemed the problem was that while the
device properly probes after waiting 30 seconds for any missing
modules to load, the ip_auto_config() had already failed,
resulting in NFS to fail. This was due to ip_auto_config()
calling wait_for_device_probe() which doesn't wait for the
driver_deferred_probe_timeout to fire.
Fixing that issue is possible, but could also introduce 30
second delays in bootups for users who don't have any
missing dependencies, which is not ideal.
So I think the best solution to avoid any regressions is to
revert back to a default timeout value of zero, and allow
systems that need to utilize the timeout in order for userland
to load any modules that supply misisng dependencies in the dts
to specify the timeout length via the exiting documented boot
argument.
Thanks to Geert for chasing down that ip_auto_config was why NFS
was failing in this case!
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Cc: Rob Herring <robh@kernel.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
Cc: Sudeep Holla <sudeep.holla@arm.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Naresh Kamboju <naresh.kamboju@linaro.org>
Cc: Basil Eljuse <Basil.Eljuse@arm.com>
Cc: Ferry Toth <fntoth@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Anders Roxell <anders.roxell@linaro.org>
Reported-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Tested-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Fixes: c8c43cee29f6 ("driver core: Fix driver_deferred_probe_check_state() logic")
Signed-off-by: John Stultz <john.stultz@linaro.org>
Link: https://lore.kernel.org/r/20200422203245.83244-2-john.stultz@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
If a component fails to bind due to -EPROBE_DEFER we should not log an
error as this is not a real failure.
Fixes messages like:
vc4-drm soc:gpu: failed to bind 3f902000.hdmi (ops vc4_hdmi_ops): -517
vc4-drm soc:gpu: master bind failed: -517
Signed-off-by: James Hilliard <james.hilliard1@gmail.com>
Link: https://lore.kernel.org/r/20200411190241.89404-1-james.hilliard1@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
When commit 8375e74f2bca ("driver core: Add fw_devlink kernel
commandline option") added fw_devlink, it didn't implement "permissive"
mode correctly.
That commit got the device links flags correct to make sure unprobed
suppliers don't block the probing of a consumer. However, if a consumer
is waiting for mandatory suppliers to register, that could still block a
consumer from probing.
This commit fixes that by making sure in permissive mode, all suppliers
to a consumer are treated as a optional suppliers. So, even if a
consumer is waiting for suppliers to register and link itself (using the
DL_FLAG_SYNC_STATE_ONLY flag) to the supplier, the consumer is never
blocked from probing.
Fixes: 8375e74f2bca ("driver core: Add fw_devlink kernel commandline option")
Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Saravana Kannan <saravanak@google.com>
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Link: https://lore.kernel.org/r/20200331022832.209618-1-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
It's currently the platform driver's responsibility to initialize the
pointer, dma_parms, for its corresponding struct device. The benefit with
this approach allows us to avoid the initialization and to not waste memory
for the struct device_dma_parameters, as this can be decided on a case by
case basis.
However, it has turned out that this approach is not very practical. Not
only does it lead to open coding, but also to real errors. In principle
callers of dma_set_max_seg_size() doesn't check the error code, but just
assumes it succeeds.
For these reasons, let's do the initialization from the common platform bus
at the device registration point. This also follows the way the PCI devices
are being managed, see pci_device_add().
Suggested-by: Christoph Hellwig <hch@lst.de>
Cc: <stable@vger.kernel.org>
Tested-by: Haibo Chen <haibo.chen@nxp.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20200422100954.31211-1-ulf.hansson@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
We need the driver core fixes in here as well.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core fixes from Greg KH:
"Here are some small firmware/driver core/debugfs fixes for 5.7-rc3.
The debugfs change is now possible as now the last users of
debugfs_create_u32() have been fixed up in the different trees that
got merged into 5.7-rc1, and I don't want it creeping back in.
The firmware changes did cause a regression in linux-next, so the
final patch here reverts part of that, re-exporting the symbol to
resolve that issue. All of these patches, with the exception of the
final one, have been in linux-next with only that one reported issue"
* tag 'driver-core-5.7-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
firmware_loader: revert removal of the fw_fallback_config export
debugfs: remove return value of debugfs_create_u32()
firmware_loader: remove unused exports
firmware: imx: fix compile-testing
|
|
Christoph's patch removed two unsused exported symbols, however, one
symbol is used by the firmware_loader itself. If CONFIG_FW_LOADER=m so
the firmware_loader is modular but CONFIG_FW_LOADER_USER_HELPER=y we fail
the build at mostpost.
ERROR: modpost: "fw_fallback_config" [drivers/base/firmware_loader/firmware_class.ko] undefined!
This happens because the variable fw_fallback_config is built into the
kernel if CONFIG_FW_LOADER_USER_HELPER=y always, so we need to grant
access to the firmware loader module by exporting it.
Revert only one hunk from his patch.
Fixes: 739604734bd8 ("firmware_loader: remove unused exports")
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
Link: https://lore.kernel.org/r/20200424184916.22843-1-mcgrof@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
Rename DPM_FLAG_LEAVE_SUSPENDED to DPM_FLAG_MAY_SKIP_RESUME which
matches its purpose more closely.
No functional impact.
Suggested-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Wolfram Sang <wsa@the-dreams.de> # for I2C
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
Rename DPM_FLAG_NEVER_SKIP to DPM_FLAG_NO_DIRECT_COMPLETE which
matches its purpose more closely.
No functional impact.
Suggested-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com> # for PCI parts
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Because all callers of dev_pm_smart_suspend_and_suspended use it only
for checking whether or not to skip driver suspend callbacks for a
device, rename it to dev_pm_skip_suspend() in analogy with
dev_pm_skip_resume().
No functional impact.
Suggested-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
The name of dev_pm_may_skip_resume() may be easily confused with the
power.may_skip_resume flag which is not checked by that function, so
rename the former as dev_pm_skip_resume().
No functional impact.
Suggested-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
Because the power.may_skip_resume device status bit is taken
into account in combination with the DPM_FLAG_LEAVE_SUSPENDED
driver flag, it can be set to 'true' for all devices in the
"suspend" phase of a suspend-resume cycle, so do that.
Then, neither the PM core nor the middle-layer (sybsystem) code
handling it needs to set it to 'true' any more and it just has
to be cleared if there is a reason to avoid skipping the "noirq"
and "early" resume callbacks provided by the driver, so update
the code in question accordingly.
Suggested-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
The current code in device_resume_noirq() causes the entire early
resume and resume phases of device suspend to be skipped for
devices for which the noirq resume phase have been skipped (due
to the LEAVE_SUSPENDED flag being set) on the premise that those
devices should stay in runtime-suspend after system-wide resume.
However, that may not be correct in two situations. First, the
middle layer (subsystem) noirq resume callback may be missing for
a given device, but its early resume callback may be present and it
may need to do something even if it decides to skip the driver
callback. Second, if the device's wakeup settings were adjusted
in the suspend phase without resuming the device (that was in
runtime suspend at that time), they most likely need to be
adjusted again in the resume phase and so the driver callback
in that phase needs to be run.
For the above reason, modify the core to allow the middle layer
->resume_late callback to run even if its ->resume_noirq callback
is missing (and the core has skipped the driver-level callback
in that phase) and to allow all device callbacks to run in the
resume phase. Also make the core set the PM-runtime status of
devices with SMART_SUSPEND set whose resume callbacks are not
skipped to "active" in the "noirq" resume phase and update the
affected subsystems (PCI and ACPI) accordingly.
After this change, middle-layer (subsystem) callbacks will always
be invoked in all phases of system suspend and resume and driver
callbacks will always run in the prepare, suspend, resume, and
complete phases for all devices.
For devices with SMART_SUSPEND set, driver callbacks will be
skipped in the late and noirq phases of system suspend if those
devices remain in runtime suspend in __device_suspend_late().
Driver callbacks will also be skipped for them during the
noirq and early phases of the "thaw" transition related to
hibernation in that case.
Setting LEAVE_SUSPENDED means that the driver allows its callbacks
to be skipped in the noirq and early phases of system resume, but
some additional conditions need to be met for that to happen (among
other things, the power.may_skip_resume flag needs to be set for the
device during system suspend for the driver callbacks to be skipped
during the subsequent resume transition).
For all devices with SMART_SUSPEND set whose driver callbacks are
invoked during system resume, the PM-runtime status will be set to
"active" (by the core).
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
This allows to access data with 16-bit width of registers
via i2c SMBus block functions.
The multi-command sequence of the reading function is not safe
and may read the wrong data from other address if other commands
are sent in-between the SMBus commands in the read function.
Read performance:
32768 bytes (33 kB, 32 KiB) copied, 11.4869 s, 2.9 kB/s
Write performance(with 1-byte page):
32768 bytes (33 kB, 32 KiB) copied, 129.591 s, 0.3 kB/s
The implementation is inspired by below commit
https://patchwork.ozlabs.org/patch/545292/
v2: add more descriptions about the issue that maybe introduced
by this commit
Signed-off-by: AceLan Kao <acelan.kao@canonical.com>
Link: https://lore.kernel.org/r/20200424123358.144850-1-acelan.kao@canonical.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
The variable ret is being initialized with a value that is never read
and it is being updated later with a new value. The initialization is
redundant and can be removed.
Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Link: https://lore.kernel.org/r/20200402111341.511801-1-colin.king@canonical.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Commit 8b9ec6b73277 ("PM core: Use new async_schedule_dev command")
introduced a new function for better performance.
However commit f2a424f6c613 ("PM / core: Introduce dpm_async_fn()
helper") went back to the non-optimized version, async_schedule().
So switch back to the sync_schedule_dev() to improve performance
Fixes: f2a424f6c613 ("PM / core: Introduce dpm_async_fn() helper")
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Currrently, two warnings are generated when building docs:
./drivers/base/platform.c:136: WARNING: Unexpected indentation.
./drivers/base/platform.c:214: WARNING: Unexpected indentation.
As examples are code blocks, they should use "::" markup. However,
Example::
Is currently interpreted as a new section.
While we could fix kernel-doc to accept such new syntax, it is
easier to just replace it with:
For Example::
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Link: https://lore.kernel.org/r/564273815a76136fb5e453969b1012a786d99e28.1586881715.git.mchehab+huawei@kernel.org
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
|
|
Some filesystem references got broken by a previous patch
series I submitted. Address those.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Acked-by: David Sterba <dsterba@suse.com> # fs/affs/Kconfig
Link: https://lore.kernel.org/r/57318c53008dbda7f6f4a5a9e5787f4d37e8565a.1586881715.git.mchehab+huawei@kernel.org
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
|
|
Sometimes it's more convenient to register a set of individual software nodes
grouped together. Add couple of functions for that.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Rafael J. Wysocki <rjw@rjwysocki.net>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Tested-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
|
|
Some drivers when compiled as modules may need to set secondary firmware node.
Export set_secondary_fwnode() to make it possible without code duplication.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Rafael J. Wysocki <rjw@rjwysocki.net>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Tested-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
|
|
SRBDS is an MDS-like speculative side channel that can leak bits from the
random number generator (RNG) across cores and threads. New microcode
serializes the processor access during the execution of RDRAND and
RDSEED. This ensures that the shared buffer is overwritten before it is
released for reuse.
While it is present on all affected CPU models, the microcode mitigation
is not needed on models that enumerate ARCH_CAPABILITIES[MDS_NO] in the
cases where TSX is not supported or has been disabled with TSX_CTRL.
The mitigation is activated by default on affected processors and it
increases latency for RDRAND and RDSEED instructions. Among other
effects this will reduce throughput from /dev/urandom.
* Enable administrator to configure the mitigation off when desired using
either mitigations=off or srbds=off.
* Export vulnerability status via sysfs
* Rename file-scoped macros to apply for non-whitelist table initializations.
[ bp: Massage,
- s/VULNBL_INTEL_STEPPING/VULNBL_INTEL_STEPPINGS/g,
- do not read arch cap MSR a second time in tsx_fused_off() - just pass it in,
- flip check in cpu_set_bug_bits() to save an indentation level,
- reflow comments.
jpoimboe: s/Mitigated/Mitigation/ in user-visible strings
tglx: Dropped the fused off magic for now
]
Signed-off-by: Mark Gross <mgross@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Tested-by: Neelima Krishnan <neelima.krishnan@intel.com>
|
|
Fold four functions in the PM core that each have only one caller
now into their callers.
No intentional functional impact.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
|
|
The code to handle the SMART_SUSPEND driver PM flag is hard to follow
and somewhat inconsistent with respect to devices without middle-layer
(subsystem) callbacks.
Namely, for those devices the core takes the role of a middle layer
in providing the expected ordering of execution of callbacks (under
the assumption that the drivers setting SMART_SUSPEND can reuse their
PM-runtime callbacks directly for system-wide suspend). To that end,
it prevents driver ->suspend_late and ->suspend_noirq callbacks from
being executed for devices that are still runtime-suspended in
__device_suspend_late(), because running the same callback funtion
that was previously run by PM-runtime for them may be invalid.
However, it does that only for devices without any middle-layer
callbacks for the late/noirq/early suspend/resume phases even
though it would be simpler and more consistent to skip the
driver-lavel callbacks for all devices with SMART_SUSPEND set
that are runtime-suspended in __device_suspend_late().
Simplify the code in accordance with the above observation.
Suggested-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
|
|
The struct firmware contains a page table pointer that was used only
internally in the past. Since the actual page tables are referred
from struct fw_priv and should be never from struct firmware, we can
drop this unused field gracefully.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Acked-by: Luis Chamberlain <mcgrof@kernel.org>
Link: https://lore.kernel.org/r/20200415164500.28749-1-tiwai@suse.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Neither fw_fallback_config nor firmware_config_table are used by modules.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20200417064146.1086644-3-hch@lst.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
<michael@walle.cc>:
The Kontron sl28cpld is a board management chip providing gpio, pwm, fan
monitoring and an interrupt controller. For now this controller is used on
the Kontron SMARC-sAL28 board. But because of its flexible nature, it
might also be used on other boards in the future. The individual blocks
(like gpio, pwm, etc) are kept intentionally small. The MFD core driver
then instantiates different (or multiple of the same) blocks. It also
provides the register layout so it might be updated in the future without a
device tree change; and support other boards with a different layout or
functionalities.
See also [1] for more information.
This is my first take of a MFD driver. I don't know whether the subsystem
maintainers should only be CCed on the patches which affect the subsystem
or on all patches for this series. I've chosen the latter so you can get a
more complete picture.
[1] https://lore.kernel.org/linux-devicetree/0e3e8204ab992d75aa07fc36af7e4ab2@walle.cc/
Changes since v1:
- use of_match_table in all drivers, needed for automatic module loading,
when using OF_MFD_CELL()
- add new gpio-regmap.c which adds a generic regmap gpio_chip implemention
- new patch for reqmap_irq, so we can reuse its implementation
- remove almost any code from gpio-sl28cpld.c, instead use gpio-regmap and
regmap-irq
- change the handling of the mfd core vs device tree nodes; add a new
property "of_reg" to the mfd_cell struct which, when set, is matched to
the unit-address of the device tree nodes.
- fix sl28cpld watchdog when it is not initialized by the bootloader.
Explicitly set the operation mode.
- also add support for kontron,assert-wdt-timeout-pin in sl28cpld-wdt.
As suggested by Bartosz Golaszewski:
- define registers as hex
- make gpio enum uppercase
- move parent regmap check before memory allocation
- use device_property_read_bool() instead of the of_ version
- mention the gpio flavors in the bindings documentation
As suggested by Guenter Roeck:
- cleanup #includes and sort them
- use devm_watchdog_register_device()
- use watchdog_stop_on_reboot()
- provide a Documentation/hwmon/sl28cpld.rst
- cleaned up the weird tristate->bool and I2C=y issue. Instead mention
that the MFD driver is bool because of the following intc patch
- removed the SL28CPLD_IRQ typo
As suggested by Rob Herring:
- combine all dt bindings docs into one patch
- change the node name for all gpio flavors to "gpio"
- removed the interrupts-extended rule
- cleaned up the unit-address space, see above
Michael Walle (16):
include/linux/ioport.h: add helper to define REG resource constructs
mfd: mfd-core: Don't overwrite the dma_mask of the child device
mfd: mfd-core: match device tree node against reg property
regmap-irq: make it possible to add irq_chip do a specific device node
dt-bindings: mfd: Add bindings for sl28cpld
mfd: Add support for Kontron sl28cpld management controller
irqchip: add sl28cpld interrupt controller support
watchdog: add support for sl28cpld watchdog
pwm: add support for sl28cpld PWM controller
gpio: add a reusable generic gpio_chip using regmap
gpio: add support for the sl28cpld GPIO controller
hwmon: add support for the sl28cpld hardware monitoring controller
arm64: dts: freescale: sl28: enable sl28cpld
arm64: dts: freescale: sl28: map GPIOs to input events
arm64: dts: freescale: sl28: enable LED support
arm64: dts: freescale: sl28: enable fan support
.../bindings/gpio/kontron,sl28cpld-gpio.yaml | 51 +++
.../hwmon/kontron,sl28cpld-hwmon.yaml | 27 ++
.../bindings/mfd/kontron,sl28cpld.yaml | 162 +++++++++
.../bindings/pwm/kontron,sl28cpld-pwm.yaml | 35 ++
.../watchdog/kontron,sl28cpld-wdt.yaml | 35 ++
Documentation/hwmon/sl28cpld.rst | 36 ++
.../fsl-ls1028a-kontron-kbox-a-230-ls.dts | 14 +
.../fsl-ls1028a-kontron-sl28-var3-ads2.dts | 9 +
.../freescale/fsl-ls1028a-kontron-sl28.dts | 124 +++++++
drivers/base/regmap/regmap-irq.c | 84 ++++-
drivers/gpio/Kconfig | 15 +
drivers/gpio/Makefile | 2 +
drivers/gpio/gpio-regmap.c | 321 ++++++++++++++++++
drivers/gpio/gpio-sl28cpld.c | 187 ++++++++++
drivers/hwmon/Kconfig | 10 +
drivers/hwmon/Makefile | 1 +
drivers/hwmon/sl28cpld-hwmon.c | 152 +++++++++
drivers/irqchip/Kconfig | 3 +
drivers/irqchip/Makefile | 1 +
drivers/irqchip/irq-sl28cpld.c | 99 ++++++
drivers/mfd/Kconfig | 21 ++
drivers/mfd/Makefile | 2 +
drivers/mfd/mfd-core.c | 31 +-
drivers/mfd/sl28cpld.c | 154 +++++++++
drivers/pwm/Kconfig | 10 +
drivers/pwm/Makefile | 1 +
drivers/pwm/pwm-sl28cpld.c | 204 +++++++++++
drivers/watchdog/Kconfig | 11 +
drivers/watchdog/Makefile | 1 +
drivers/watchdog/sl28cpld_wdt.c | 242 +++++++++++++
include/linux/gpio-regmap.h | 88 +++++
include/linux/ioport.h | 5 +
include/linux/mfd/core.h | 26 +-
include/linux/regmap.h | 10 +
34 files changed, 2142 insertions(+), 32 deletions(-)
create mode 100644 Documentation/devicetree/bindings/gpio/kontron,sl28cpld-gpio.yaml
create mode 100644 Documentation/devicetree/bindings/hwmon/kontron,sl28cpld-hwmon.yaml
create mode 100644 Documentation/devicetree/bindings/mfd/kontron,sl28cpld.yaml
create mode 100644 Documentation/devicetree/bindings/pwm/kontron,sl28cpld-pwm.yaml
create mode 100644 Documentation/devicetree/bindings/watchdog/kontron,sl28cpld-wdt.yaml
create mode 100644 Documentation/hwmon/sl28cpld.rst
create mode 100644 drivers/gpio/gpio-regmap.c
create mode 100644 drivers/gpio/gpio-sl28cpld.c
create mode 100644 drivers/hwmon/sl28cpld-hwmon.c
create mode 100644 drivers/irqchip/irq-sl28cpld.c
create mode 100644 drivers/mfd/sl28cpld.c
create mode 100644 drivers/pwm/pwm-sl28cpld.c
create mode 100644 drivers/watchdog/sl28cpld_wdt.c
create mode 100644 include/linux/gpio-regmap.h
--
2.20.1
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
|
|
Add a new function regmap_add_irq_chip_np() with its corresponding
devm_regmap_add_irq_chip_np() variant. Sometimes one want to register
the IRQ domain on a different device node that the one of the regmap
node. For example when using a MFD where there are different interrupt
controllers and particularly for the generic regmap gpio_chip/irq_chip
driver. In this case it is not desireable to have the IRQ domain on
the parent node.
Signed-off-by: Michael Walle <michael@walle.cc>
Link: https://lore.kernel.org/r/20200402203656.27047-5-michael@walle.cc
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Add reg_update_bits() support in case some platforms use a special method
to update bits of registers.
Signed-off-by: Baolin Wang <baolin.wang7@gmail.com>
Link: https://lore.kernel.org/r/df32fd0529957d1e7e26ba1465723f16cfbe92c8.1586757922.git.baolin.wang7@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
For now, distributions implement advanced udev rules to essentially
- Don't online any hotplugged memory (s390x)
- Online all memory to ZONE_NORMAL (e.g., most virt environments like
hyperv)
- Online all memory to ZONE_MOVABLE in case the zone imbalance is taken
care of (e.g., bare metal, special virt environments)
In summary: All memory is usually onlined the same way, however, the
kernel always has to ask user space to come up with the same answer.
E.g., Hyper-V always waits for a memory block to get onlined before
continuing, otherwise it might end up adding memory faster than
onlining it, which can result in strange OOM situations. This waiting
slows down adding of a bigger amount of memory.
Let's allow to specify a default online_type, not just "online" and
"offline". This allows distributions to configure the default online_type
when booting up and be done with it.
We can now specify "offline", "online", "online_movable" and
"online_kernel" via
- "memhp_default_state=" on the kernel cmdline
- /sys/devices/system/memory/auto_online_blocks
just like we are able to specify for a single memory block via
/sys/devices/system/memory/memoryX/state
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Wei Yang <richard.weiyang@gmail.com>
Reviewed-by: Baoquan He <bhe@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Wei Yang <richard.weiyang@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Wei Liu <wei.liu@kernel.org>
Cc: Yumei Huang <yuhuang@redhat.com>
Link: http://lkml.kernel.org/r/20200317104942.11178-9-david@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
... and rename it to memhp_default_online_type. This is a preparation
for more detailed default online behavior.
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Wei Yang <richard.weiyang@gmail.com>
Reviewed-by: Baoquan He <bhe@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Wei Yang <richard.weiyang@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Wei Liu <wei.liu@kernel.org>
Cc: Yumei Huang <yuhuang@redhat.com>
Link: http://lkml.kernel.org/r/20200317104942.11178-8-david@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Let's use a simple array which we can reuse soon. While at it, move the
string->mmop conversion out of the device hotplug lock.
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Wei Yang <richard.weiyang@gmail.com>
Reviewed-by: Baoquan He <bhe@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Wei Yang <richard.weiyang@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Wei Liu <wei.liu@kernel.org>
Cc: Yumei Huang <yuhuang@redhat.com>
Link: http://lkml.kernel.org/r/20200317104942.11178-4-david@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Historically, we used the value -1. Just treat 0 as the special case now.
Clarify a comment (which was wrong, when we come via device_online() the
first time, the online_type would have been 0 / MEM_ONLINE). The default
is now always MMOP_OFFLINE. This removes the last user of the manual
"-1", which didn't use the enum value.
This is a preparation to use the online_type as an array index.
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Wei Yang <richard.weiyang@gmail.com>
Reviewed-by: Baoquan He <bhe@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Wei Yang <richard.weiyang@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Wei Liu <wei.liu@kernel.org>
Cc: Yumei Huang <yuhuang@redhat.com>
Link: http://lkml.kernel.org/r/20200317104942.11178-3-david@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Patch series "mm/memory_hotplug: allow to specify a default online_type", v3.
Distributions nowadays use udev rules ([1] [2]) to specify if and how to
online hotplugged memory. The rules seem to get more complex with many
special cases. Due to the various special cases,
CONFIG_MEMORY_HOTPLUG_DEFAULT_ONLINE cannot be used. All memory hotplug
is handled via udev rules.
Every time we hotplug memory, the udev rule will come to the same
conclusion. Especially Hyper-V (but also soon virtio-mem) add a lot of
memory in separate memory blocks and wait for memory to get onlined by
user space before continuing to add more memory blocks (to not add memory
faster than it is getting onlined). This of course slows down the whole
memory hotplug process.
To make the job of distributions easier and to avoid udev rules that get
more and more complicated, let's extend the mechanism provided by
- /sys/devices/system/memory/auto_online_blocks
- "memhp_default_state=" on the kernel cmdline
to be able to specify also "online_movable" as well as "online_kernel"
=== Example /usr/libexec/config-memhotplug ===
#!/bin/bash
VIRT=`systemd-detect-virt --vm`
ARCH=`uname -p`
sense_virtio_mem() {
if [ -d "/sys/bus/virtio/drivers/virtio_mem/" ]; then
DEVICES=`find /sys/bus/virtio/drivers/virtio_mem/ -maxdepth 1 -type l | wc -l`
if [ $DEVICES != "0" ]; then
return 0
fi
fi
return 1
}
if [ ! -e "/sys/devices/system/memory/auto_online_blocks" ]; then
echo "Memory hotplug configuration support missing in the kernel"
exit 1
fi
if grep "memhp_default_state=" /proc/cmdline > /dev/null; then
echo "Memory hotplug configuration overridden in kernel cmdline (memhp_default_state=)"
exit 1
fi
if [ $VIRT == "microsoft" ]; then
echo "Detected Hyper-V on $ARCH"
# Hyper-V wants all memory in ZONE_NORMAL
ONLINE_TYPE="online_kernel"
elif sense_virtio_mem; then
echo "Detected virtio-mem on $ARCH"
# virtio-mem wants all memory in ZONE_NORMAL
ONLINE_TYPE="online_kernel"
elif [ $ARCH == "s390x" ] || [ $ARCH == "s390" ]; then
echo "Detected $ARCH"
# standby memory should not be onlined automatically
ONLINE_TYPE="offline"
elif [ $ARCH == "ppc64" ] || [ $ARCH == "ppc64le" ]; then
echo "Detected" $ARCH
# PPC64 onlines all hotplugged memory right from the kernel
ONLINE_TYPE="offline"
elif [ $VIRT == "none" ]; then
echo "Detected bare-metal on $ARCH"
# Bare metal users expect hotplugged memory to be unpluggable. We assume
# that ZONE imbalances on such enterpise servers cannot happen and is
# properly documented
ONLINE_TYPE="online_movable"
else
# TODO: Hypervisors that want to unplug DIMMs and can guarantee that ZONE
# imbalances won't happen
echo "Detected $VIRT on $ARCH"
# Usually, ballooning is used in virtual environments, so memory should go to
# ZONE_NORMAL. However, sometimes "movable_node" is relevant.
ONLINE_TYPE="online"
fi
echo "Selected online_type:" $ONLINE_TYPE
# Configure what to do with memory that will be hotplugged in the future
echo $ONLINE_TYPE 2>/dev/null > /sys/devices/system/memory/auto_online_blocks
if [ $? != "0" ]; then
echo "Memory hotplug cannot be configured (e.g., old kernel or missing permissions)"
# A backup udev rule should handle old kernels if necessary
exit 1
fi
# Process all already pluggedd blocks (e.g., DIMMs, but also Hyper-V or virtio-mem)
if [ $ONLINE_TYPE != "offline" ]; then
for MEMORY in /sys/devices/system/memory/memory*; do
STATE=`cat $MEMORY/state`
if [ $STATE == "offline" ]; then
echo $ONLINE_TYPE > $MEMORY/state
fi
done
fi
=== Example /usr/lib/systemd/system/config-memhotplug.service ===
[Unit]
Description=Configure memory hotplug behavior
DefaultDependencies=no
Conflicts=shutdown.target
Before=sysinit.target shutdown.target
After=systemd-modules-load.service
ConditionPathExists=|/sys/devices/system/memory/auto_online_blocks
[Service]
ExecStart=/usr/libexec/config-memhotplug
Type=oneshot
TimeoutSec=0
RemainAfterExit=yes
[Install]
WantedBy=sysinit.target
=== Example modification to the 40-redhat.rules [2] ===
: diff --git a/40-redhat.rules b/40-redhat.rules-new
: index 2c690e5..168fd03 100644
: --- a/40-redhat.rules
: +++ b/40-redhat.rules-new
: @@ -6,6 +6,9 @@ SUBSYSTEM=="cpu", ACTION=="add", TEST=="online", ATTR{online}=="0", ATTR{online}
: # Memory hotadd request
: SUBSYSTEM!="memory", GOTO="memory_hotplug_end"
: ACTION!="add", GOTO="memory_hotplug_end"
: +# memory hotplug behavior configured
: +PROGRAM=="grep online /sys/devices/system/memory/auto_online_blocks", GOTO="memory_hotplug_end"
: +
: PROGRAM="/bin/uname -p", RESULT=="s390*", GOTO="memory_hotplug_end"
:
: ENV{.state}="online"
===
[1] https://github.com/lnykryn/systemd-rhel/pull/281
[2] https://github.com/lnykryn/systemd-rhel/blob/staging/rules/40-redhat.rules
This patch (of 8):
The name is misleading and it's not really clear what is "kept". Let's
just name it like the online_type name we expose to user space ("online").
Add some documentation to the types.
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Wei Yang <richard.weiyang@gmail.com>
Reviewed-by: Baoquan He <bhe@redhat.com>
Acked-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Wei Yang <richard.weiyang@gmail.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Yumei Huang <yuhuang@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: K. Y. Srinivasan <kys@microsoft.com>
Cc: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
Cc: Paul Mackerras <paulus@samba.org>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: Wei Liu <wei.liu@kernel.org>
Link: http://lkml.kernel.org/r/20200319131221.14044-1-david@redhat.com
Link: http://lkml.kernel.org/r/20200317104942.11178-2-david@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|