summaryrefslogtreecommitdiff
path: root/arch/arm64/kernel/topology.c
AgeCommit message (Collapse)Author
2020-12-18Merge tag 'arm64-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull more arm64 updates from Catalin Marinas: "These are some some trivial updates that mostly fix/clean-up code pushed during the merging window: - Work around broken GCC 4.9 handling of "S" asm constraint - Suppress W=1 missing prototype warnings - Warn the user when a small VA_BITS value cannot map the available memory - Drop the useless update to per-cpu cycles" * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: Work around broken GCC 4.9 handling of "S" constraint arm64: Warn the user when a small VA_BITS value wastes memory arm64: entry: suppress W=1 prototype warnings arm64: topology: Drop the useless update to per-cpu cycles
2020-12-15arm64: topology: Drop the useless update to per-cpu cyclesViresh Kumar
The previous call to update_freq_counters_refs() has already updated the per-cpu variables, don't overwrite them with the same value again. Fixes: 4b9cf23c179a ("arm64: wrap and generalise counter read functions") Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Ionela Voinescu <ionela.voinescu@arm.com> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Link: https://lore.kernel.org/r/7a171f710cdc0f808a2bfbd7db839c0d265527e7.1607579234.git.viresh.kumar@linaro.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-12-14Merge tag 'sched-core-2020-12-14' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler updates from Thomas Gleixner: - migrate_disable/enable() support which originates from the RT tree and is now a prerequisite for the new preemptible kmap_local() API which aims to replace kmap_atomic(). - A fair amount of topology and NUMA related improvements - Improvements for the frequency invariant calculations - Enhanced robustness for the global CPU priority tracking and decision making - The usual small fixes and enhancements all over the place * tag 'sched-core-2020-12-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (61 commits) sched/fair: Trivial correction of the newidle_balance() comment sched/fair: Clear SMT siblings after determining the core is not idle sched: Fix kernel-doc markup x86: Print ratio freq_max/freq_base used in frequency invariance calculations x86, sched: Use midpoint of max_boost and max_P for frequency invariance on AMD EPYC x86, sched: Calculate frequency invariance for AMD systems irq_work: Optimize irq_work_single() smp: Cleanup smp_call_function*() irq_work: Cleanup sched: Limit the amount of NUMA imbalance that can exist at fork time sched/numa: Allow a floating imbalance between NUMA nodes sched: Avoid unnecessary calculation of load imbalance at clone time sched/numa: Rename nr_running and break out the magic number sched: Make migrate_disable/enable() independent of RT sched/topology: Condition EAS enablement on FIE support arm64: Rebuild sched domains on invariance status changes sched/topology,schedutil: Wrap sched domains rebuild sched/uclamp: Allow to reset a task uclamp constraint value sched/core: Fix typos in comments Documentation: scheduler: fix information on arch SD flags, sched_domain and sched_debug ...
2020-11-19arm64: Rebuild sched domains on invariance status changesIonela Voinescu
Task scheduler behavior depends on frequency invariance (FI) support and the resulting invariant load tracking signals. For example, in order to make accurate predictions across CPUs for all performance states, Energy Aware Scheduling (EAS) needs frequency-invariant load tracking signals and therefore it has a direct dependency on FI. This dependency is known, but EAS enablement is not yet conditioned on the presence of FI during the built of the scheduling domain hierarchy. Before this is done, the following must be considered: while arch_scale_freq_invariant() will see changes in FI support and could be used to condition the use of EAS, it could return different values during system initialisation. For arm64, such a scenario will happen for a system that does not support cpufreq driven FI, but does support counter-driven FI. For such a system, arch_scale_freq_invariant() will return false if called before counter based FI initialisation, but change its status to true after it. If EAS becomes explicitly dependent on FI this would affect the task scheduler behavior which builds its scheduling domain hierarchy well before the late counter-based FI init. During that process, EAS would be disabled due to its dependency on FI. Two points of future early calls to arch_scale_freq_invariant() which would determine EAS enablement are: - (1) drivers/base/arch_topology.c:126 <<update_topology_flags_workfn>> rebuild_sched_domains(); This will happen after CPU capacity initialisation. - (2) kernel/sched/cpufreq_schedutil.c:917 <<rebuild_sd_workfn>> rebuild_sched_domains_energy(); -->rebuild_sched_domains(); This will happen during sched_cpufreq_governor_change() for the schedutil cpufreq governor. Therefore, before enforcing the presence of FI support for the use of EAS, ensure the following: if there is a change in FI support status after counter init, use the existing rebuild_sched_domains_energy() function to trigger a rebuild of the scheduling and performance domains that in turn will determine the enablement of EAS. Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lkml.kernel.org/r/20201027180713.7642-3-ionela.voinescu@arm.com
2020-11-13arm64: abort counter_read_on_cpu() when irqs_disabled()Ionela Voinescu
Given that smp_call_function_single() can deadlock when interrupts are disabled, abort the SMP call if irqs_disabled(). This scenario is currently not possible given the function's uses, but safeguard this for potential future uses. Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com> Cc: Will Deacon <will@kernel.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20201113155328.4194-1-ionela.voinescu@arm.com [catalin.marinas@arm.com: modified following Mark's comment] Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-11-13arm64: implement CPPC FFH support using AMUsIonela Voinescu
If Activity Monitors (AMUs) are present, two of the counters can be used to implement support for CPPC's (Collaborative Processor Performance Control) delivered and reference performance monitoring functionality using FFH (Functional Fixed Hardware). Given that counters for a certain CPU can only be read from that CPU, while FFH operations can be called from any CPU for any of the CPUs, use smp_call_function_single() to provide the requested values. Therefore, depending on the register addresses, the following values are returned: - 0x0 (DeliveredPerformanceCounterRegister): AMU core counter - 0x1 (ReferencePerformanceCounterRegister): AMU constant counter The use of Activity Monitors is hidden behind the generic cpu_read_{corecnt,constcnt}() functions. Read functionality for these two registers represents the only current FFH support for CPPC. Read operations for other register values or write operation for all registers are unsupported. Therefore, keep CPPC's FFH unsupported if no CPUs have valid AMU frequency counters. For this purpose, the get_cpu_with_amu_feat() is introduced. Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20201106125334.21570-4-ionela.voinescu@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-11-13arm64: split counter validation functionIonela Voinescu
In order for the counter validation function to be reused, split validate_cpu_freq_invariance_counters() into: - freq_counters_valid(cpu) - check cpu for valid cycle counters - freq_inv_set_max_ratio(int cpu, u64 max_rate, u64 ref_rate) - generic function that sets the normalization ratio used by topology_scale_freq_tick() Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20201106125334.21570-3-ionela.voinescu@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-11-13arm64: wrap and generalise counter read functionsIonela Voinescu
In preparation for other uses of Activity Monitors (AMU) cycle counters, place counter read functionality in generic functions that can reused: read_corecnt() and read_constcnt(). As a result, implement update_freq_counters_refs() to replace init_cpu_freq_invariance_counters() and both initialise and update the per-cpu reference variables. Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20201106125334.21570-2-ionela.voinescu@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-10-14Merge tag 'pm-5.10-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management updates from Rafael Wysocki: "These rework the collection of cpufreq statistics to allow it to take place if fast frequency switching is enabled in the governor, rework the frequency invariance handling in the cpufreq core and drivers, add new hardware support to a couple of cpufreq drivers, fix a number of assorted issues and clean up the code all over. Specifics: - Rework cpufreq statistics collection to allow it to take place when fast frequency switching is enabled in the governor (Viresh Kumar). - Make the cpufreq core set the frequency scale on behalf of the driver and update several cpufreq drivers accordingly (Ionela Voinescu, Valentin Schneider). - Add new hardware support to the STI and qcom cpufreq drivers and improve them (Alain Volmat, Manivannan Sadhasivam). - Fix multiple assorted issues in cpufreq drivers (Jon Hunter, Krzysztof Kozlowski, Matthias Kaehlcke, Pali Rohár, Stephan Gerhold, Viresh Kumar). - Fix several assorted issues in the operating performance points (OPP) framework (Stephan Gerhold, Viresh Kumar). - Allow devfreq drivers to fetch devfreq instances by DT enumeration instead of using explicit phandles and modify the devfreq core code to support driver-specific devfreq DT bindings (Leonard Crestez, Chanwoo Choi). - Improve initial hardware resetting in the tegra30 devfreq driver and clean up the tegra cpuidle driver (Dmitry Osipenko). - Update the cpuidle core to collect state entry rejection statistics and expose them via sysfs (Lina Iyer). - Improve the ACPI _CST code handling diagnostics (Chen Yu). - Update the PSCI cpuidle driver to allow the PM domain initialization to occur in the OSI mode as well as in the PC mode (Ulf Hansson). - Rework the generic power domains (genpd) core code to allow domain power off transition to be aborted in the absence of the "power off" domain callback (Ulf Hansson). - Fix two suspend-to-idle issues in the ACPI EC driver (Rafael Wysocki). - Fix the handling of timer_expires in the PM-runtime framework on 32-bit systems and the handling of device links in it (Grygorii Strashko, Xiang Chen). - Add IO requests batching support to the hibernate image saving and reading code and drop a bogus get_gendisk() from there (Xiaoyi Chen, Christoph Hellwig). - Allow PCIe ports to be put into the D3cold power state if they are power-manageable via ACPI (Lukas Wunner). - Add missing header file include to a power capping driver (Pujin Shi). - Clean up the qcom-cpr AVS driver a bit (Liu Shixin). - Kevin Hilman steps down as designated reviwer of adaptive voltage scaling (AVS) drivers (Kevin Hilman)" * tag 'pm-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (65 commits) cpufreq: stats: Fix string format specifier mismatch arm: disable frequency invariance for CONFIG_BL_SWITCHER cpufreq,arm,arm64: restructure definitions of arch_set_freq_scale() cpufreq: stats: Add memory barrier to store_reset() cpufreq: schedutil: Simplify sugov_fast_switch() ACPI: EC: PM: Drop ec_no_wakeup check from acpi_ec_dispatch_gpe() ACPI: EC: PM: Flush EC work unconditionally after wakeup PCI/ACPI: Whitelist hotplug ports for D3 if power managed by ACPI PM: hibernate: remove the bogus call to get_gendisk() in software_resume() cpufreq: Move traces and update to policy->cur to cpufreq core cpufreq: stats: Enable stats for fast-switch as well cpufreq: stats: Mark few conditionals with unlikely() cpufreq: stats: Remove locking cpufreq: stats: Defer stats update to cpufreq_stats_record_transition() PM: domains: Allow to abort power off when no ->power_off() callback PM: domains: Rename power state enums for genpd PM / devfreq: tegra30: Improve initial hardware resetting PM / devfreq: event: Change prototype of devfreq_event_get_edev_by_phandle function PM / devfreq: Change prototype of devfreq_get_devfreq_by_phandle function PM / devfreq: Add devfreq_get_devfreq_by_node function ...
2020-09-18arch_topology, arm, arm64: define arch_scale_freq_invariant()Valentin Schneider
arch_scale_freq_invariant() is used by schedutil to determine whether the scheduler's load-tracking signals are frequency invariant. Its definition is overridable, though by default it is hardcoded to 'true' if arch_scale_freq_capacity() is defined ('false' otherwise). This behaviour is not overridden on arm, arm64 and other users of the generic arch topology driver, which is somewhat precarious: arch_scale_freq_capacity() will always be defined, yet not all cpufreq drivers are guaranteed to drive the frequency invariance scale factor setting. In other words, the load-tracking signals may very well *not* be frequency invariant. Now that cpufreq can be queried on whether the current driver is driving the Frequency Invariance (FI) scale setting, the current situation can be improved. This combines the query of whether cpufreq supports the setting of the frequency scale factor, with whether all online CPUs are counter-based FI enabled. While cpufreq FI enablement applies at system level, for all CPUs, counter-based FI support could also be used for only a subset of CPUs to set the invariance scale factor. Therefore, if cpufreq-based FI support is present, we consider the system to be invariant. If missing, we require all online CPUs to be counter-based FI enabled in order for the full system to be considered invariant. If the system ends up not being invariant, a new condition is needed in the counter initialization code that disables all scale factor setting based on counters. Precedence of counters over cpufreq use is not important here. The invariant status is only given to the system if all CPUs have at least one method of setting the frequency scale factor. Signed-off-by: Valentin Schneider <valentin.schneider@arm.com> Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-09-18arch_topology, cpufreq: constify arch_* cpumasksValentin Schneider
The passed cpumask arguments to arch_set_freq_scale() and arch_freq_counters_available() are only iterated over, so reflect this in the prototype. This also allows to pass system cpumasks like cpu_online_mask without getting a warning. Signed-off-by: Valentin Schneider <valentin.schneider@arm.com> Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-09-07arm64: topology: Stop using MPIDR for topology informationValentin Schneider
In the absence of ACPI or DT topology data, we fallback to haphazardly decoding *something* out of MPIDR. Sadly, the contents of that register are mostly unusable due to the implementation leniancy and things like Aff0 having to be capped to 15 (despite being encoded on 8 bits). Consider a simple system with a single package of 32 cores, all under the same LLC. We ought to be shoving them in the same core_sibling mask, but MPIDR is going to look like: | CPU | 0 | ... | 15 | 16 | ... | 31 | |------+---+-----+----+----+-----+----+ | Aff0 | 0 | ... | 15 | 0 | ... | 15 | | Aff1 | 0 | ... | 0 | 1 | ... | 1 | | Aff2 | 0 | ... | 0 | 0 | ... | 0 | Which will eventually yield core_sibling(0-15) == 0-15 core_sibling(16-31) == 16-31 NUMA woes ========= If we try to play games with this and set up NUMA boundaries within those groups of 16 cores via e.g. QEMU: # Node0: 0-9; Node1: 10-19 $ qemu-system-aarch64 <blah> \ -smp 20 -numa node,cpus=0-9,nodeid=0 -numa node,cpus=10-19,nodeid=1 The scheduler's MC domain (all CPUs with same LLC) is going to be built via arch_topology.c::cpu_coregroup_mask() In there we try to figure out a sensible mask out of the topology information we have. In short, here we'll pick the smallest of NUMA or core sibling mask. node_mask(CPU9) == 0-9 core_sibling(CPU9) == 0-15 MC mask for CPU9 will thus be 0-9, not a problem. node_mask(CPU10) == 10-19 core_sibling(CPU10) == 0-15 MC mask for CPU10 will thus be 10-19, not a problem. node_mask(CPU16) == 10-19 core_sibling(CPU16) == 16-19 MC mask for CPU16 will thus be 16-19... Uh oh. CPUs 16-19 are in two different unique MC spans, and the scheduler has no idea what to make of that. That triggers the WARN_ON() added by commit ccf74128d66c ("sched/topology: Assert non-NUMA topology masks don't (partially) overlap") Fixing MPIDR-derived topology ============================= We could try to come up with some cleverer scheme to figure out which of the available masks to pick, but really if one of those masks resulted from MPIDR then it should be discarded because it's bound to be bogus. I was hoping to give MPIDR a chance for SMT, to figure out which threads are in the same core using Aff1-3 as core ID, but Sudeep and Robin pointed out to me that there are systems out there where *all* cores have non-zero values in their higher affinity fields (e.g. RK3288 has "5" in all of its cores' MPIDR.Aff1), which would expose a bogus core ID to userspace. Stop using MPIDR for topology information. When no other source of topology information is available, mark each CPU as its own core and its NUMA node as its LLC domain. Signed-off-by: Valentin Schneider <valentin.schneider@arm.com> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Link: https://lore.kernel.org/r/20200829130016.26106-1-valentin.schneider@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2020-03-06arm64: use activity monitors for frequency invarianceIonela Voinescu
The Frequency Invariance Engine (FIE) is providing a frequency scaling correction factor that helps achieve more accurate load-tracking. So far, for arm and arm64 platforms, this scale factor has been obtained based on the ratio between the current frequency and the maximum supported frequency recorded by the cpufreq policy. The setting of this scale factor is triggered from cpufreq drivers by calling arch_set_freq_scale. The current frequency used in computation is the frequency requested by a governor, but it may not be the frequency that was implemented by the platform. This correction factor can also be obtained using a core counter and a constant counter to get information on the performance (frequency based only) obtained in a period of time. This will more accurately reflect the actual current frequency of the CPU, compared with the alternative implementation that reflects the request of a performance level from the OS. Therefore, implement arch_scale_freq_tick to use activity monitors, if present, for the computation of the frequency scale factor. The use of AMU counters depends on: - CONFIG_ARM64_AMU_EXTN - depents on the AMU extension being present - CONFIG_CPU_FREQ - the current frequency obtained using counter information is divided by the maximum frequency obtained from the cpufreq policy. While it is possible to have a combination of CPUs in the system with and without support for activity monitors, the use of counters for frequency invariance is only enabled for a CPU if all related CPUs (CPUs in the same frequency domain) support and have enabled the core and constant activity monitor counters. In this way, there is a clear separation between the policies for which arch_set_freq_scale (cpufreq based FIE) is used, and the policies for which arch_scale_freq_tick (counter based FIE) is used to set the frequency scale factor. For this purpose, a late_initcall_sync is registered to trigger validation work for policies that will enable or disable the use of AMU counters for frequency invariance. If CONFIG_CPU_FREQ is not defined, the use of counters is enabled on all CPUs only if all possible CPUs correctly support the necessary counters. Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Acked-by: Sudeep Holla <sudeep.holla@arm.com> Cc: Sudeep Holla <sudeep.holla@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-08-14Merge tag 'common/for-v5.4-rc1/cpu-topology' of ↵Will Deacon
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux into for-next/cpu-topology Pull in generic CPU topology changes from Paul Walmsley (RISC-V). * tag 'common/for-v5.4-rc1/cpu-topology' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: MAINTAINERS: Add an entry for generic architecture topology base: arch_topology: update Kconfig help description RISC-V: Parse cpu topology during boot. arm: Use common cpu_topology structure and functions. cpu-topology: Move cpu topology code to common code. dt-binding: cpu-topology: Move cpu-map to a common binding. Documentation: DT: arm: add support for sockets defining package boundaries
2019-08-12arm64: topology: Use PPTT to determine if PE is a threadJeremy Linton
ACPI 6.3 adds a thread flag to represent if a CPU/PE is actually a thread. Given that the MPIDR_MT bit may not represent this information consistently on homogeneous machines we should prefer the PPTT flag if its available. Signed-off-by: Jeremy Linton <jeremy.linton@arm.com> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Reviewed-by: Robert Richter <rrichter@marvell.com> [will: made acpi_cpu_is_threaded() return 'bool'] Signed-off-by: Will Deacon <will@kernel.org>
2019-07-22cpu-topology: Move cpu topology code to common code.Atish Patra
Both RISC-V & ARM64 are using cpu-map device tree to describe their cpu topology. It's better to move the relevant code to a common place instead of duplicate code. To: Will Deacon <will.deacon@arm.com> To: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Atish Patra <atish.patra@wdc.com> [Tested on QDF2400] Tested-by: Jeffrey Hugo <jhugo@codeaurora.org> [Tested on Juno and other embedded platforms.] Tested-by: Sudeep Holla <sudeep.holla@arm.com> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
2018-07-06arm64: topology: re-introduce numa mask check for scheduler MC selectionSudeep Holla
Commit 37c3ec2d810f ("arm64: topology: divorce MC scheduling domain from core_siblings") selected the smallest of LLC, socket siblings, and NUMA node siblings to ensure that the sched domain we build for the MC layer isn't larger than the DIE above it or it's shrunk to the socket or NUMA node if LLC exist acrosis NUMA node/chiplets. Commit acd32e52e4e0 ("arm64: topology: Avoid checking numa mask for scheduler MC selection") reverted the NUMA siblings checks since the CPU topology masks weren't updated on hotplug at that time. This patch re-introduces numa mask check as the CPU and NUMA topology is now updated in hotplug paths. Effectively, this patch does the partial revert of commit acd32e52e4e0. Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Tested-by: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com> Tested-by: Hanjun Guo <hanjun.guo@linaro.org> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-07-06arm64: topology: rename llc_siblings to align with other struct membersSudeep Holla
Similar to core_sibling and thread_sibling, it's better to align and rename llc_siblings to llc_sibling. Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Tested-by: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com> Tested-by: Hanjun Guo <hanjun.guo@linaro.org> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-07-06arm64: topology: restrict updating siblings_masks to online cpus onlySudeep Holla
It's incorrect to iterate over all the possible CPUs to update the sibling masks when any CPU is hotplugged in. In case the topology siblings masks of the CPU is removed when is it hotplugged out, we end up updating those masks when one of it's sibling is powered up again. This will provide inconsistent view. Further, since the CPU calling update_sibling_masks is yet to be set online, there's no need to compare itself with each online CPU when updating the siblings masks. This patch restricts updation of sibling masks only for CPUs that are already online. It also the drops the unnecessary cpuid check. Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Tested-by: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com> Tested-by: Hanjun Guo <hanjun.guo@linaro.org> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-07-06arm64: topology: add support to remove cpu topology sibling masksSudeep Holla
This patch adds support to remove all the CPU topology information using clear_cpu_topology and also resetting the sibling information on other sibling CPUs. This will be used in cpu_disable so that all the topology sibling information is removed on CPU hotplug out. Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Tested-by: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com> Tested-by: Hanjun Guo <hanjun.guo@linaro.org> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-07-06arm64: topology: refactor reset_cpu_topology to add support for removing ↵Sudeep Holla
topology Currently reset_cpu_topology clears all the CPU topology information and resets to default values. However we may need to just clear the information when we hotplug out the CPU. In preparation to add the support the same, let's refactor reset_cpu_topology to just reset the information and move clearing out the topology information to clear_cpu_topology. Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Tested-by: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com> Tested-by: Hanjun Guo <hanjun.guo@linaro.org> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-06-07arm64: topology: Avoid checking numa mask for scheduler MC selectionJeremy Linton
The numa mask subset check can often lead to system hang or crash during CPU hotplug and system suspend operation if NUMA is disabled. This is mostly observed on HMP systems where the CPU compute capacities are different and ends up in different scheduler domains. Since cpumask_of_node is returned instead core_sibling, the scheduler is confused with incorrect cpumasks(e.g. one CPU in two different sched domains at the same time) on CPU hotplug. Lets disable the NUMA siblings checks for the time being, as NUMA in socket machines have LLC's that will assure that the scheduler topology isn't "borken". The NUMA check exists to assure that if a LLC within a socket crosses NUMA nodes/chiplets the scheduler domains remain consistent. This code will likely have to be re-enabled in the near future once the NUMA mask story is sorted. At the moment its not necessary because the NUMA in socket machines LLC's are contained within the NUMA domains. Further, as a defensive mechanism during hot-plug, lets assure that the LLC siblings are also masked. Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-05-17arm64: topology: divorce MC scheduling domain from core_siblingsJeremy Linton
Now that we have an accurate view of the physical topology we need to represent it correctly to the scheduler. Generally MC should equal the LLC in the system, but there are a number of special cases that need to be dealt with. In the case of NUMA in socket, we need to assure that the sched domain we build for the MC layer isn't larger than the DIE above it. Similarly for LLC's that might exist in cross socket interconnect or directory hardware we need to assure that MC is shrunk to the socket or NUMA node. This patch builds a sibling mask for the LLC, and then picks the smallest of LLC, socket siblings, or NUMA node siblings, which gives us the behavior described above. This is ever so slightly different than the similar alternative where we look for a cache layer less than or equal to the socket/NUMA siblings. The logic to pick the MC layer affects all arm64 machines, but only changes the behavior for DT/MPIDR systems if the NUMA domain is smaller than the core siblings (generally set to the cluster). Potentially this fixes a possible bug in DT systems, but really it only affects ACPI systems where the core siblings is correctly set to the socket siblings. Thus all currently available ACPI systems should have MC equal to LLC, including the NUMA in socket machines where the LLC is partitioned between the NUMA nodes. Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Tested-by: Vijaya Kumar K <vkilari@codeaurora.org> Tested-by: Xiongfeng Wang <wangxiongfeng2@huawei.com> Tested-by: Tomasz Nowicki <Tomasz.Nowicki@cavium.com> Acked-by: Sudeep Holla <sudeep.holla@arm.com> Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Acked-by: Morten Rasmussen <morten.rasmussen@arm.com> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-05-17arm64: topology: enable ACPI/PPTT based CPU topologyJeremy Linton
Propagate the topology information from the PPTT tree to the cpu_topology array. We can get the thread id and core_id by assuming certain levels of the PPTT tree correspond to those concepts. The package_id is flagged in the tree and can be found by calling find_acpi_cpu_topology_package() which terminates its search when it finds an ACPI node flagged as the physical package. If the tree doesn't contain enough levels to represent all of the requested levels then the root node will be returned for all subsequent levels. Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Tested-by: Vijaya Kumar K <vkilari@codeaurora.org> Tested-by: Xiongfeng Wang <wangxiongfeng2@huawei.com> Tested-by: Tomasz Nowicki <Tomasz.Nowicki@cavium.com> Acked-by: Sudeep Holla <sudeep.holla@arm.com> Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Acked-by: Morten Rasmussen <morten.rasmussen@arm.com> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-05-17arm64: topology: rename cluster_idJeremy Linton
The cluster concept isn't architecturally defined for arm64. Lets match the name of the arm64 topology field to the kernel macro that uses it. Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Tested-by: Vijaya Kumar K <vkilari@codeaurora.org> Tested-by: Xiongfeng Wang <wangxiongfeng2@huawei.com> Tested-by: Tomasz Nowicki <Tomasz.Nowicki@cavium.com> Acked-by: Sudeep Holla <sudeep.holla@arm.com> Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Acked-by: Morten Rasmussen <morten.rasmussen@arm.com> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-01-02arm64: Use of_cpu_node_to_id helper for CPU topology parsingSuzuki K Poulose
Make use of the new generic helper to convert an of_node of a CPU to the logical CPU id in parsing the topology. Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Will Deacon <will.deacon@arm.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-07-20arm64: Convert to using %pOF instead of full_nameRob Herring
Now that we have a custom printf format specifier, convert users of full_name to use %pOF instead. This is preparation to remove storing of the full path string for each node. Signed-off-by: Rob Herring <robh@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: linux-arm-kernel@lists.infradead.org Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-06-03arm,arm64,drivers: add a prefix to drivers arch_topology interfacesJuri Lelli
Now that some functions that deal with arch topology information live under drivers, there is a clash of naming that might create confusion. Tidy things up by creating a topology namespace for interfaces used by arch code; achieve this by prepending a 'topology_' prefix to driver interfaces. Signed-off-by: Juri Lelli <juri.lelli@arm.com> Acked-by: Russell King <rmk+kernel@armlinux.org.uk> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-06-03arm,arm64,drivers: move externs in a new header fileJuri Lelli
Create a new header file (include/linux/arch_topology.h) and put there declarations of interfaces used by arm, arm64 and drivers code. Signed-off-by: Juri Lelli <juri.lelli@arm.com> Acked-by: Russell King <rmk+kernel@armlinux.org.uk> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-06-03arm,arm64,drivers: reduce scope of cap_parsing_failedJuri Lelli
Reduce the scope of cap_parsing_failed (making it static in drivers/base/arch_topology.c) by slightly changing {arm,arm64} DT parsing code. For arm checking for !cap_parsing_failed before calling normalize_ cpu_capacity() is superfluous, as returning an error from parse_ cpu_capacity() (above) means cap_from _dt is set to false. For arm64 we can simply check if raw_capacity points to something, which is not if capacity parsing has failed. Suggested-by: Morten Rasmussen <morten.rasmussen@arm.com> Signed-off-by: Juri Lelli <juri.lelli@arm.com> Acked-by: Russell King <rmk+kernel@armlinux.org.uk> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-06-03arm, arm64: factorize common cpu capacity default codeJuri Lelli
arm and arm64 share lot of code relative to parsing CPU capacity information from DT, using that information for appropriate scaling and exposing a sysfs interface for chaging such values at runtime. Factorize such code in a common place (driver/base/arch_topology.c) in preparation for further additions. Suggested-by: Will Deacon <will.deacon@arm.com> Suggested-by: Mark Rutland <mark.rutland@arm.com> Suggested-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Russell King <linux@armlinux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Juri Lelli <juri.lelli@arm.com> Acked-by: Russell King <rmk+kernel@armlinux.org.uk> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-03-02sched/headers: Prepare for new header dependencies before moving code to ↵Ingo Molnar
<linux/sched/topology.h> We are going to split <linux/sched/topology.h> out of <linux/sched.h>, which will have to be picked up from other headers and a couple of .c files. Create a trivial placeholder <linux/sched/topology.h> file that just maps to <linux/sched.h> to make this patch obviously correct and bisectable. Include the new header in the files that are going to need it. Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-02-22Merge tag 'arm64-upstream' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 updates from Will Deacon: - Errata workarounds for Qualcomm's Falkor CPU - Qualcomm L2 Cache PMU driver - Qualcomm SMCCC firmware quirk - Support for DEBUG_VIRTUAL - CPU feature detection for userspace via MRS emulation - Preliminary work for the Statistical Profiling Extension - Misc cleanups and non-critical fixes * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (74 commits) arm64/kprobes: consistently handle MRS/MSR with XZR arm64: cpufeature: correctly handle MRS to XZR arm64: traps: correctly handle MRS/MSR with XZR arm64: ptrace: add XZR-safe regs accessors arm64: include asm/assembler.h in entry-ftrace.S arm64: fix warning about swapper_pg_dir overflow arm64: Work around Falkor erratum 1003 arm64: head.S: Enable EL1 (host) access to SPE when entered at EL2 arm64: arch_timer: document Hisilicon erratum 161010101 arm64: use is_vmalloc_addr arm64: use linux/sizes.h for constants arm64: uaccess: consistently check object sizes perf: add qcom l2 cache perf events driver arm64: remove wrong CONFIG_PROC_SYSCTL ifdef ARM: smccc: Update HVC comment to describe new quirk parameter arm64: do not trace atomic operations ACPI/IORT: Fix the error return code in iort_add_smmu_platform_device() ACPI/IORT: Fix iort_node_get_id() mapping entries indexing arm64: mm: enable CONFIG_HOLES_IN_ZONE for NUMA perf: xgene: Include module.h ...
2017-02-08arm64: remove wrong CONFIG_PROC_SYSCTL ifdefJuri Lelli
The sysfs cpu_capacity entry for each CPU has nothing to do with PROC_FS, nor it's in /proc/sys path. Remove such ifdef. Cc: Will Deacon <will.deacon@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Reported-and-suggested-by: Sudeep Holla <sudeep.holla@arm.com> Fixes: be8f185d8af4 ('arm64: add sysfs cpu_capacity attribute') Signed-off-by: Juri Lelli <juri.lelli@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-01-27arm64: skip register_cpufreq_notifier on ACPI-based systemsPrashanth Prakash
On ACPI based systems where the topology is setup using the API store_cpu_topology, at the moment we do not have necessary code to parse cpu capacity and handle cpufreq notifier, thus resulting in a kernel panic. Stack: init_cpu_capacity_callback+0xb4/0x1c8 notifier_call_chain+0x5c/0xa0 __blocking_notifier_call_chain+0x58/0xa0 blocking_notifier_call_chain+0x3c/0x50 cpufreq_set_policy+0xe4/0x328 cpufreq_init_policy+0x80/0x100 cpufreq_online+0x418/0x710 cpufreq_add_dev+0x118/0x180 subsys_interface_register+0xa4/0xf8 cpufreq_register_driver+0x1c0/0x298 cppc_cpufreq_init+0xdc/0x1000 [cppc_cpufreq] do_one_initcall+0x5c/0x168 do_init_module+0x64/0x1e4 load_module+0x130c/0x14d0 SyS_finit_module+0x108/0x120 el0_svc_naked+0x24/0x28 Fixes: 7202bde8b7ae ("arm64: parse cpu capacity-dmips-mhz from DT") Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Prashanth Prakash <pprakash@codeaurora.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-11-07arm64: add sysfs cpu_capacity attributeJuri Lelli
Add a sysfs cpu_capacity attribute with which it is possible to read and write (thus over-writing default values) CPUs capacity. This might be useful in situations where values needs changing after boot. The new attribute shows up as: /sys/devices/system/cpu/cpu*/cpu_capacity Cc: Will Deacon <will.deacon@arm.com> Cc: Mark Brown <broonie@kernel.org> Cc: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Juri Lelli <juri.lelli@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-11-07arm64: parse cpu capacity-dmips-mhz from DTJuri Lelli
With the introduction of cpu capacity-dmips-mhz bindings, CPU capacities can now be calculated from values extracted from DT and information coming from cpufreq. Add parsing of DT information at boot time, and complement it with cpufreq information. Also, store such information using per CPU variables, as we do for arm. Caveat: the information provided by this patch will start to be used in the future. We need to #define arch_scale_cpu_capacity to something provided in arch, so that scheduler's default implementation (which gets used if arch_scale_cpu_capacity is not defined) is overwritten. Cc: Will Deacon <will.deacon@arm.com> Cc: Mark Brown <broonie@kernel.org> Cc: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Juri Lelli <juri.lelli@arm.com> Acked-by: Vincent Guittot <vincent.guittot@linaro.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2015-07-27arm64: kernel: remove non-legit DT warnings when booting using ACPISudeep Holla
Since both CONFIG_ACPI and CONFIG_OF are enabled when booting using ACPI tables on ARM64 platforms, we get few device tree warnings which are not valid for ACPI boot. We can use of_have_populated_dt to check if the device tree is populated or not before throwing out those errors. This patch uses of_have_populated_dt to remove non legitimate device tree warning when booting using ACPI tables. Cc: Lorenzo Pieralisi <Lorenzo.Pieralisi@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2014-11-25arm64: topology: Fix handling of multi-level cluster MPIDR-based detectionMark Brown
The only requirement the scheduler has on cluster IDs is that they must be unique. When enumerating the topology based on MPIDR information the kernel currently generates cluster IDs by using the first level of affinity above the core ID (either level one or two depending on if the core has multiple threads) however the ARMv8 architecture allows for up to three levels of affinity. This means that an ARMv8 system may contain cores which have MPIDRs identical other than affinity level three which with current code will cause us to report multiple cores with the same identification to the scheduler in violation of its uniqueness requirement. Ensure that we do not violate the scheduler requirements on systems that uses all the affinity levels by incorporating both affinity levels two and three into the cluser ID when the cores are not threaded. While no currently known hardware uses multi-level clusters it is better to program defensively, this will help ease bringup of systems that have them and will ensure that things like distribution install media do not need to be respun to replace kernels in order to deploy such systems. In the worst case the system will work but perform suboptimally until a kernel modified to handle the new topology better is installed, in the best case this will be an adequate description of such topologies for the scheduler to perform well. Signed-off-by: Mark Brown <broonie@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2014-07-09arm64: topology: add MPIDR-based detectionZi Shen Lim
Create cpu topology based on MPIDR. When hardware sets MPIDR to sane values, this method will always work. Therefore it should also work well as the fallback method. [1] When we have multiple processing elements in the system, we create the cpu topology by mapping each affinity level (from lowest to highest) to threads (if they exist), cores, and clusters. [1] http://www.spinics.net/lists/arm-kernel/msg317445.html Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Zi Shen Lim <zlim@broadcom.com> Signed-off-by: Mark Brown <broonie@linaro.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2014-05-16arm64: topology: Add support for topology DT bindingsMark Brown
Add support for parsing the explicit topology bindings to discover the topology of the system. Since it is not currently clear how to map multi-level clusters for the scheduler all leaf clusters are presented to the scheduler at the same level. This should be enough to provide good support for current systems. Signed-off-by: Mark Brown <broonie@linaro.org> Reviewed-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2014-05-16arm64: topology: Initialise default topology state immediatelyMark Brown
As a legacy of the way 32 bit ARM did things the topology code uses a null topology map by default and then overwrites it by mapping cores with no information to a cluster by themselves later. In order to make it simpler to reset things as part of recovering from parse failures in firmware information directly set this configuration on init. A core will always be its own sibling so there should be no risk of confusion with firmware provided information. Signed-off-by: Mark Brown <broonie@linaro.org> Reviewed-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2014-03-04arm64: topology: Implement basic CPU topology supportMark Brown
Add basic CPU topology support to arm64, based on the existing pre-v8 code and some work done by Mark Hambleton. This patch does not implement any topology discovery support since that should be based on information from firmware, it merely implements the scaffolding for integration of topology support in the architecture. No locking of the topology data is done since it is only modified during CPU bringup with external serialisation from the SMP code. The goal is to separate the architecture hookup for providing topology information from the DT parsing in order to ease review and avoid blocking the architecture code (which will be built on by other work) with the DT code review by providing something simple and basic. Following patches will implement support for interpreting topology information from MPIDR and for parsing the DT topology bindings for ARM, similar patches will be needed for ACPI. Signed-off-by: Mark Brown <broonie@linaro.org> Acked-by: Mark Rutland <mark.rutland@arm.com> [catalin.marinas@arm.com: removed CONFIG_CPU_TOPOLOGY, always on if SMP] Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>