Scheduling for bigLITTLE core Architectures
cluster, that satisfies the resource requirement of each task and minimizes the energy consumption in each scheduling interval. Each scheduling interval has three phases: the TaskInfo, LittleCore, and BigCore phases. The flowchart of the proposed method is in the Appendix.
The proposed method can support all three big.LITTLE models by applying different candidate selection methods. For example, in the cluster migration model, all tasks are marked as candidates, since only one core cluster can be powered up at a time. In the heterogeneous multi-processing model, the task that consumes the largest amount of resources is selected and marked as the migration candidate. We choose the task with the largest resource consumption, rather than tasks with insufficient resources, because migrating a larger task to a big core releases more resources in the little core cluster, which creates more opportunity to scale down the frequency or power off cores.
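The two selection rules described above can be sketched as follows. This is only an illustration under stated assumptions: the names `Model`, `Task`, and `select_candidates` are hypothetical, the resource value is an abstract number, and the rule for the remaining model is left unimplemented because this section does not describe it.

```python
from dataclasses import dataclass
from enum import Enum


class Model(Enum):
    CLUSTER_MIGRATION = "cluster migration"
    HMP = "heterogeneous multi-processing"
    OTHER = "third model (selection rule not given in this section)"


@dataclass
class Task:
    name: str
    resource: float  # resource consumed by the task (abstract unit, assumed)


def select_candidates(tasks, model):
    """Return the tasks marked as migration candidates for the given model."""
    if not tasks:
        return []
    if model is Model.CLUSTER_MIGRATION:
        # Only one cluster is powered up at a time, so every task is a candidate.
        return list(tasks)
    if model is Model.HMP:
        # Mark the single task that consumes the largest amount of resources.
        return [max(tasks, key=lambda t: t.resource)]
    raise NotImplementedError("selection rule for this model is not described here")


tasks = [Task("a", 100.0), Task("b", 300.0), Task("c", 50.0)]
print([t.name for t in select_candidates(tasks, Model.HMP)])
```

With the hypothetical task set above, the heterogeneous multi-processing rule marks only task "b", which frees the most resources in the little core cluster.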
frequency is used, while the other threads have a load of less than 5%. Figure 21 shows a piece of the workload from Candy Crush. However, even with this high load, the frame rate of Candy Crush stays above 24 fps while running on a little core at the minimum frequency. Thus we set the minimum resource requirement of Candy Crush to the minimum frequency (250 kHz) multiplied by 100%.
Figure 21 A piece of workload from Candy Crush
The minimum resource requirement is determined slightly differently for Chrome. A web browser is a “best-effort” application, meaning that its performance improves as it gets more resources. The thread behavior of Chrome is also very different from that of TTpod and Candy Crush. Instead of a main thread with a constantly high load, the load varies a lot during execution. When the user clicks a link and enters a new web page, the load increases drastically. However, the high load lasts only for a short period and then decreases. Since each website has different contents, the time to fully load a page varies. Thus we consider only the time between the user's click and the browser entering the new page, rather than the time to fully load the entire page. The QoS requirement can be satisfied by setting the minimum resource requirement to the minimum frequency of a big core multiplied by the CPU load of Chrome at that core frequency. This means that when our scheduler encounters web events or tasks, it uses a big core at the lowest frequency to complete them.
Currently we compute the minimum resource required manually. In the future, we will design an online profiler which automatically computes the minimum resources required for each incoming new task.
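The manual computation described above, frequency multiplied by CPU load, can be written as a short sketch. The function name `min_resource` is hypothetical. The Candy Crush values come from the text, while the big-core minimum frequency and the Chrome load are placeholders chosen only for illustration, since this section does not state them.

```python
def min_resource(freq, load):
    """Minimum resource requirement: core frequency times CPU load (0.0 to 1.0)."""
    if not 0.0 <= load <= 1.0:
        raise ValueError("load must be a fraction between 0.0 and 1.0")
    return freq * load


# Candy Crush: minimum little-core frequency (250, in the unit quoted in the
# text) multiplied by a 100% load.
candy_crush_req = min_resource(250, 1.0)

# Chrome: minimum big-core frequency times Chrome's load at that frequency.
# Both numbers below are placeholders, not measurements from the text.
ASSUMED_BIG_MIN_FREQ = 800
ASSUMED_CHROME_LOAD = 0.6
chrome_req = min_resource(ASSUMED_BIG_MIN_FREQ, ASSUMED_CHROME_LOAD)

print(candy_crush_req, chrome_req)
```

An online profiler, as proposed for future work, would replace the hand-picked load values with measured ones.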
In the first experiment, we simulate the execution of the three applications separately and measure their average power consumption. We apply three different candidate selection methods, one for each model. Table 12 shows the results. The average estimated power consumption of TTpod is the same in every case, because TTpod requires only one LITTLE core at the lowest frequency in all three models. To further verify this result, we measure the actual power consumption of TTpod. The result is shown in Figure 22.
Figure 22 shows the load of TTpod (3318-ttpod), our estimated power (estimate_w), and the actual power on a LITTLE core (a7_w). As can be seen from Figure 22, the average estimated power (0.01904) is close to the actual average (0.01929). However, the estimated power fluctuates less than the actual power.
Figure 22 Estimated power and actual power of TTpod
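To make "close" concrete, the gap between the two averages reported for Figure 22 can be expressed as a relative error. The helper name `relative_error` is ours, and the sketch uses only the two averages quoted in the text.

```python
def relative_error(estimate, actual):
    """Absolute difference between estimate and actual, relative to actual."""
    return abs(estimate - actual) / actual


estimate_w_avg = 0.01904  # average estimated power of TTpod
a7_w_avg = 0.01929        # average actual power on a LITTLE core

print(f"{relative_error(estimate_w_avg, a7_w_avg):.2%}")  # prints 1.30%
```

The estimate is therefore within about 1.3% of the measured average, although this single number says nothing about the difference in fluctuation noted above.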
On the other hand, the average estimated power consumption of Candy Crush differs between our method and Linaro’s. As mentioned, the main thread of Candy Crush always produces a high load. Since Linaro’s scheduler considers only the load, it keeps scaling up the frequency and eventually uses the highest frequency of the big core. Our method keeps Candy Crush running on LITTLE cores and still satisfies the QoS requirement, which is at least 24 fps during gameplay.
As for Chrome, both our method and Linaro’s use big cores. Again, Linaro’s scheduler considers only the load and uses the highest frequency of the big cores, whereas our scheduler uses only the lowest frequency of the big cores to complete the tasks.
              Resource-Guided Scheduler    Linaro
Model         I       II      III          I       II      III
TTpod         0.019   0.019   0.019        0.019   0.019   0.019
Candy Crush   0.371   0.371   0.371        1.49    1.49    1.49
Chrome        0.916   0.916   0.916        1.88    1.73    1.73
Table 12 Comparison of average power consumptions using simulation
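The relative savings implied by Table 12 can be derived directly from its entries. The sketch below copies the table values. The dictionary layout and the name `reduction` are ours, and the percentages are simple arithmetic on the simulated averages, not additional measurements.

```python
# Table 12 values as (Model I, Model II, Model III) per scheduler.
TABLE_12 = {
    "TTpod":       {"resource_guided": (0.019, 0.019, 0.019), "linaro": (0.019, 0.019, 0.019)},
    "Candy Crush": {"resource_guided": (0.371, 0.371, 0.371), "linaro": (1.49, 1.49, 1.49)},
    "Chrome":      {"resource_guided": (0.916, 0.916, 0.916), "linaro": (1.88, 1.73, 1.73)},
}


def reduction(ours, baseline):
    """Fraction of the baseline power saved by our scheduler."""
    return 1.0 - ours / baseline


for app, row in TABLE_12.items():
    savings = [reduction(o, b) for o, b in zip(row["resource_guided"], row["linaro"])]
    print(app, " ".join(f"{s:.1%}" for s in savings))
```

For these values the reduction is 0.0% for TTpod, 75.1% for Candy Crush in all three models, and 51.3%, 47.1%, and 47.1% for Chrome in Models I, II, and III.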
4.8.2 Experiments on the ODROID-XU ARM platform.
In the second experiment, we execute the three applications together and measure the average power consumption during execution. The applications start at different times. The scenario is as follows: a user first starts TTpod at time 0 to play some music. A minute later, the user starts playing Candy Crush while the music keeps playing. After playing the game for three minutes, the user quits the game and opens Chrome to search for a way to beat a certain stage of Candy Crush.
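The launch timeline of this scenario can be written down as data, which makes it easy to see which applications overlap at a given time. This is a sketch only: the names `SCENARIO` and `active_apps` are hypothetical, and it assumes that TTpod keeps playing until the end of the experiment and that Chrome runs until the end, neither of which the text states explicitly.

```python
# (application, start time in seconds, end time in seconds or None if not stated)
SCENARIO = [
    ("TTpod", 0, None),             # music starts at time 0 (assumed to keep playing)
    ("Candy Crush", 60, 60 + 180),  # starts a minute later, played for three minutes
    ("Chrome", 60 + 180, None),     # opened when the game is finished
]


def active_apps(t):
    """Return the applications running at time t (in seconds)."""
    return [name for name, start, end in SCENARIO
            if start <= t and (end is None or t < end)]


for t in (30, 120, 300):
    print(t, active_apps(t))
```

Under these assumptions the scheduler sees one task set in the first minute, TTpod together with Candy Crush for the next three minutes, and TTpod together with Chrome afterwards.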
We generate the scheduling plan using our simulator. The hardware settings, i.e., the number of cores and the core frequency, change during execution according to the scheduling plan. The two executions, ours (resource-guided) and Linaro’s, use different scheduling plans since their strategies differ.
Figure 23 and Figure 24 show the results. Figure 23 shows the load and power consumption under Linaro’s strategy, while Figure 24 shows them under our resource-guided strategy. The average power consumptions are 0.071 mWatt and 0.0072 mWatt, respectively. This result shows that our resource-guided strategy is more power-efficient than Linaro’s.
Figure 23 Loading and Power consumption of Linaro
Figure 24 Loading and Power consumption of resource-guided