sched/fair: Improve spreading of utilization

During load balancing, a group with spare capacity will try to pull some
utilization from an overloaded group. In such a case, the load balance
looks for the runqueue with the highest utilization. Nevertheless, it
should also ensure that there are some pending tasks to pull; otherwise
the load balance will fail to pull a task and the spread of the load will
be delayed.

This situation is quite transient, but it's possible to highlight the
effect with a short run of the sysbench test, where the time to spread
tasks impacts the global result significantly.

Below are the average results for 15 iterations on an arm64 octo core:
sysbench --test=cpu --num-threads=8 --max-requests=1000 run

                           tip/sched/core  +patchset
total time:                172ms           158ms
per-request statistics:
         avg:                1.337ms         1.244ms
         max:               21.191ms        10.753ms

The average max doesn't fully reflect the wide spread of the values, which
range from 1.350ms to more than 41ms for tip/sched/core and from
1.350ms to 21ms with the patch.

Other factors, like waiting for an idle load balance or cache hotness,
can delay the spreading of the tasks, which explains why we can still
have up to 21ms with the patch.

Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200312165429.990-1-vincent.guittot@linaro.org
author: Vincent Guittot <vincent.guittot@linaro.org> 2020-03-12 17:54:29 +0100
committer: Peter Zijlstra <peterz@infradead.org> 2020-03-20 13:06:20 +0100
commit: c32b4308295aaaaedd5beae56cb42e205ae63e58 (patch)
tree: 87ed7a5e12345b7c5f13f24787aac55bf76e918b /kernel/sched
parent: 26cf52229efc87e2effa9d788f9b33c40fb3358a (diff)
1 files changed, 8 insertions, 0 deletions
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index c7aaae2b1030..783356f96b7b 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -9313,6 +9313,14 @@ static struct rq *find_busiest_queue(struct lb_env *env,
 		case migrate_util:
 			util = cpu_util(cpu_of(rq));
 
+			/*
+			 * Don't try to pull utilization from a CPU with one
+			 * running task. Whatever its utilization, we will fail
+			 * detach the task.
+			 */
+			if (nr_running <= 1)
+				continue;
+
 			if (busiest_util < util) {
 				busiest_util = util;
 				busiest = rq;
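
The effect of the added check can be illustrated outside the kernel with a minimal userspace sketch. Everything here is hypothetical and heavily simplified: `struct toy_rq` and `toy_find_busiest_by_util()` are stand-ins invented for illustration, not the kernel's `struct rq` or `find_busiest_queue()`, and the utilization values are arbitrary. The sketch only mirrors the selection rule from the hunk above: a runqueue with at most one running task is skipped, however high its utilization.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for a per-CPU runqueue. */
struct toy_rq {
	unsigned int nr_running;	/* number of runnable tasks on this CPU */
	unsigned long util;		/* utilization, arbitrary units */
};

/*
 * Return the runqueue with the highest utilization that still has
 * something to pull, or NULL if there is none. A runqueue with at most
 * one running task is skipped: its only task is the one currently
 * running, so detaching it would fail.
 */
static struct toy_rq *toy_find_busiest_by_util(struct toy_rq *rqs, size_t n)
{
	struct toy_rq *busiest = NULL;
	unsigned long busiest_util = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (rqs[i].nr_running <= 1)
			continue;

		if (busiest_util < rqs[i].util) {
			busiest_util = rqs[i].util;
			busiest = &rqs[i];
		}
	}

	return busiest;
}
```

With runqueues `{1 task, util 900}`, `{3 tasks, util 400}` and `{2 tasks, util 600}`, the sketch selects the third one. Without the `nr_running` check it would select the first, from which no task could be pulled.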