
Chapter 5 Experiment

5.1 Redundancy Analysis

From the time measurement results, we first observe many flash access operations and some poor configuration choices during the boot loader phase. Second, during the kernel phase, we can trace the execution of all kernel routines by printing useful information with printk. Finally, the choice of file system has a large effect on the boot time. Based on these observations, we identify the following redundant work and the corresponding fast boot methods.

• Boot loader phase

METHOD 01: Adjust clocking mode

[Original] By default, U-Boot uses fully synchronous mode as the clocking mode [18]. In fully synchronous mode, the MPU, DSP, and memory traffic controller (TC) domains all run at the same clock frequency, derived from DPLL1.

[Limitation] The frequencies of the MPU and DSP are limited by the upper bound of the TC [19], i.e. 96 MHz. However, the MPU and DSP themselves can run at up to 192 MHz. The performance of U-Boot is therefore limited, because every domain must run at the same frequency.

[Modification] We changed the clocking mode from fully synchronous mode to synchronous scalable mode by changing the value of ARM_SYSST (MPU System Status Register) from 0x0000 to 0x1000 [18] [19]. In synchronous scalable mode, the MPU, DSP, and TC domains remain synchronous but can run at different clock speeds.

[Improvement] We can ramp the DPLL1 clock up to 192 MHz and let the MPU run at 192 MHz by setting ARMDIV to 00, i.e. the ARM core frequency equals the DPLL1 frequency divided by 2^0. At the same time, the TC runs at 96 MHz by setting TCDIV to 01, i.e. the TC frequency equals the DPLL1 frequency divided by 2^1.
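The divider arithmetic in METHOD 01 can be sketched in C as follows. This is a minimal illustration, not U-Boot code: the function name domain_clock_hz is hypothetical, and the divider field is assumed to be two bits wide because the text writes its values as 00 and 01.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Frequency of a clock domain derived from DPLL1:
 *     f_domain = f_dpll1 / 2^div_field
 * div_field stands for the ARMDIV or TCDIV value from the text.
 * Assumption: the field is 2 bits wide, so valid values are 0..3.
 */
uint32_t domain_clock_hz(uint32_t dpll1_hz, unsigned div_field)
{
    assert(div_field <= 3u);
    return dpll1_hz >> div_field;   /* shifting right by n divides by 2^n */
}
```

With DPLL1 at 192 MHz, domain_clock_hz(192000000u, 0) gives the 192 MHz MPU clock and domain_clock_hz(192000000u, 1) gives the 96 MHz TC clock described above.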

METHOD 02: Reduce unused console functions

[Original] During U-Boot initialization, the initialization of the console device is split into two functions: console_init_f and console_init_r.

[Limitation] After the two functions have executed in sequence, the console device is initialized as a full console device. However, we do not need U-Boot to provide a full console device during boot, so the full initialization of the console device is redundant.

[Modification] After reading the U-Boot source code and experimenting repeatedly, we found that the function console_init_r is not needed during boot. Therefore, we skip the execution of console_init_r.

[Improvement] The execution time of console_init_r is saved. Although console_init_r is skipped, the boot still succeeds, and output messages can still be shown on the console once console_init_f has finished the first stage of console initialization.

METHOD 03: Improve abort boot function

[Original] The function abortboot blocks U-Boot while it waits and checks whether any key has been pressed. If a key has been pressed, abortboot aborts the boot process and returns to the U-Boot prompt. Otherwise, after a number of seconds, abortboot releases U-Boot and the boot process resumes.

[Limitation] The waiting time is bootdelay seconds, and the default value of bootdelay is 10. With U-Boot 1.1.3, this wastes 1.25 s of waiting. (The timer is not accurate; the correct waiting time should be 10 s. If the timer were accurate, the boot time would be 10 - 1.25 = 8.75 s longer.)

[Modification] We modified the code of the function abortboot to reduce the time U-Boot spends checking whether a key has been pressed. The original abortboot routine checks repeatedly for a number of seconds.

[Improvement] After the modification, the abortboot routine checks only once and does not wait, so the waiting time is saved.
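The difference between repeated checking and a single check can be sketched as below. This is a simplified model, not the actual U-Boot abortboot code: the callback type key_pending_fn, the stubs key_never and key_always, and the counter poll_count are all hypothetical names introduced for illustration, and the timer-based delay of the original routine is reduced to a fixed number of polls.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callback: returns true if a key press is pending on the console. */
typedef bool (*key_pending_fn)(void);

/* Counts console polls, for illustration only. */
unsigned poll_count = 0;

/* Illustrative stubs standing in for the real console check. */
bool key_never(void)  { poll_count++; return false; }
bool key_always(void) { poll_count++; return true; }

/* Simplified original behaviour: poll until a key arrives or max_polls checks are done. */
bool abortboot_polling(key_pending_fn key_pending, unsigned max_polls)
{
    assert(key_pending != NULL);
    for (unsigned i = 0; i < max_polls; i++) {
        if (key_pending())
            return true;    /* abort the boot and go to the prompt */
    }
    return false;           /* no key seen: continue booting */
}

/* Modified behaviour: check exactly once and never wait. */
bool abortboot_once(key_pending_fn key_pending)
{
    assert(key_pending != NULL);
    return key_pending();
}
```

A key that is already pressed when the check runs still aborts the boot; only the waiting loop is removed.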

METHOD 04: Improve image verification mechanism

[Original] U-Boot provides an image verification mechanism that verifies both the header checksum and the data checksum of the image on every boot.

[Limitation] In fact, after burning the image we only need to verify the image checksum once. If the image is correct, verifying it again on every boot is unnecessary.

[Modification] We added a new parameter called verify to the U-Boot environment parameters and use it as a switch for the image verification mechanism. When verify is y, U-Boot checks the header checksum and the data checksum as in the default behavior. When verify is n, U-Boot skips the verification.

[Improvement] In practice, we set verify to y after burning in order to verify the image checksum, and set verify to n once we are sure the image is correct. This saves the time of image verification on later boots.
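The verify switch can be modeled by a small helper that maps the parameter value to a decision. This is a sketch under stated assumptions: the function name image_verify_enabled is hypothetical, and treating an unset parameter as "verify" is an assumption chosen to match the default behavior described above.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * value: the string stored in the "verify" environment parameter,
 *        or NULL if the parameter is not set.
 * Returns true when the header and data checksums should be verified.
 * Assumption: anything other than a value starting with 'n' keeps
 * verification on, so a missing parameter behaves like the default.
 */
bool image_verify_enabled(const char *value)
{
    return !(value != NULL && value[0] == 'n');
}
```

Defaulting to verification keeps a freshly burned image protected until verify is explicitly set to n.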

METHOD 05: Use silent console in boot loader phase

[Original] U-Boot provides functions that print device information, which is useful during development and debugging. In U-Boot, most device initialization and device information output are handled by separate functions.

[Limitation] Executing the information functions and sending every console message over the serial port takes a considerable amount of time.

[Modification] We added a new parameter called quiet to the U-Boot environment parameters and modified the U-Boot source code to implement a quiet console. When quiet is n, U-Boot shows the full messages for the U-Boot banner, DRAM configuration, flash configuration, the function abortboot, image verification, and invoking the Linux kernel. When quiet is y, U-Boot shows no console messages.

[Improvement] With the quiet parameter, we can use a silent console to reduce the boot time.

• Kernel phase

METHOD 06: Use uncompressed kernel image

[Original] In the past, flash storage for embedded products was quite expensive, so a compressed kernel was used to reduce product cost. However, the compressed kernel of an optimized embedded Linux is generally smaller than 1 MB, and the uncompressed kernel is smaller than 2 MB. At present, the cost of 1 MB of flash storage is not high, so using an uncompressed kernel has become an acceptable choice.

[Modification] We changed the Makefile in linux/arch/arm/boot to build an uncompressed image for U-Boot.

[Improvement] The uncompressed kernel is about 2.1 times the size of the compressed kernel, so copying the uncompressed image from flash to RAM also takes about 2.1 times as long. However, when the total time for copying the image and decompressing the kernel is compared between the two cases, the uncompressed kernel saves a large proportion of the boot time. Although the uncompressed kernel is bigger, it is still within the default upper limit (2 MB).
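The trade-off in METHOD 06 can be written as a simple comparison: the compressed kernel costs its copy time plus the decompression time, while the uncompressed kernel costs only a longer copy. The sketch below assumes, as the text does, that copy time scales linearly with image size; the function name uncompressed_is_faster is hypothetical and any numbers passed to it are illustrative, not measurements from this experiment.

```c
#include <stdbool.h>

/*
 * All times are in milliseconds.
 * copy_compressed_ms: time to copy the compressed image from flash to RAM.
 * decompress_ms:      time to decompress the kernel after copying.
 * size_ratio:         uncompressed size / compressed size (about 2.1 in the text).
 * Assumption: copy time is proportional to image size.
 */
bool uncompressed_is_faster(double copy_compressed_ms,
                            double decompress_ms,
                            double size_ratio)
{
    double compressed_total   = copy_compressed_ms + decompress_ms;
    double uncompressed_total = copy_compressed_ms * size_ratio;
    return uncompressed_total < compressed_total;
}
```

Put differently, the uncompressed kernel wins whenever decompression takes longer than (size_ratio - 1) times the copy time of the compressed image.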

METHOD 07: Eliminate BogoMIPS calibration

[Original] The function calibrate_delay [10] [20] [21] computes an appropriate value for loops_per_jiffy and BogoMIPS at boot time. The value of loops_per_jiffy is used to execute busy-wait (non-yielding) delays inside the Linux kernel and depends primarily on processor speed. BogoMIPS is an unscientific measure of MPU and cache performance and is proportional to loops_per_jiffy. Its initial value at boot time is expected to be constant for each boot of Linux on the same hardware.

[Limitation] The value of loops_per_jiffy depends primarily on processor speed. Therefore, its initial value at boot time is expected to be constant for each boot of Linux on the same hardware, and we do not need to compute it on every system boot.

[Modification] Because the initial value at boot time is expected to be constant, we preset it in advance.

[Improvement] By presetting the initial value, we avoid the delay of having the kernel calculate it dynamically on every system boot.
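The relation between a preset loops_per_jiffy and the BogoMIPS figure the kernel reports can be sketched as follows. The arithmetic mirrors the way the kernel prints the value at boot; the function name bogomips_x100 and the sample numbers are illustrative assumptions, not values measured on the experiment platform. Kernels that support the lpj= command line parameter accept such a preset value directly, although the text does not say which mechanism was used here.

```c
#include <assert.h>

/*
 * Returns BogoMIPS multiplied by 100 for a given loops_per_jiffy (lpj)
 * and timer frequency hz (the kernel's HZ constant).
 * BogoMIPS = lpj * hz / 500000, so BogoMIPS * 100 = lpj / (5000 / hz).
 * Precondition: 0 < hz <= 5000, so the divisor is never zero.
 */
unsigned long bogomips_x100(unsigned long lpj, unsigned long hz)
{
    assert(hz > 0UL && hz <= 5000UL);
    return lpj / (5000UL / hz);
}
```

For example, a preset of 479232 loops per jiffy with HZ equal to 100 corresponds to 95.84 BogoMIPS, since bogomips_x100(479232UL, 100UL) evaluates to 9584.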

METHOD 08: Use device modularization

[Original] In its default configuration, the kernel initializes many devices during boot so that it can serve many different kinds of products.

[Limitation] Because we might not need all the devices in the default configuration, many of these settings are non-critical or useless. For example, if we do not need the pseudo terminal device (PTY), we should remove or modularize it.

[Modification] By reading the dmesg output, we can identify the useless, non-critical, or time-consuming devices. After understanding the function of those devices, we decide which of them are non-critical. On our experiment platform, we removed or modularized the following: shared memory file system, paging of anonymous memory (swap) support, resetting unused clocks, OMAP multiplexing support, PCMCIA/CardBus support, firewall support, loopback device support, initial RAM disk support, ATA/ATAPI/MFM/RLL support, PPP support, frame buffer device support, second extended file system support, kernel automounter support, NFS file system support, etc.

[Improvement] By removing or modularizing device drivers, we save the time spent initializing useless, non-critical, or time-consuming devices.

METHOD 09: Use silent console in kernel phase

[Original] During boot, the Linux kernel prints a large amount of debugging information. Because there are so many kernel printk messages, printing them over a serial port or VGA takes considerable time [10].

[Modification] We add the quiet parameter to the kernel command line to change the loglevel to 4, which suppresses the output of regular (non-emergency) printk messages. Even though the messages are not printed to the system console, they are still placed in the kernel printk buffer and can be retrieved after boot using the dmesg command.

[Improvement] We can disable the printk console output and still view the messages using dmesg. This saves some boot time.
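The effect of loglevel 4 can be illustrated with the kernel's filtering rule: a message reaches the console only if its level is numerically lower than the console loglevel. The sketch below is a model of that rule, not kernel code; the names reaches_console and QUIET_LOGLEVEL are hypothetical, and the level numbers follow the usual kernel convention (0 is the most urgent, 7 is debug).

```c
#include <assert.h>
#include <stdbool.h>

/* Console loglevel that the text gives for the quiet parameter. */
enum { QUIET_LOGLEVEL = 4 };

/*
 * msg_level:        severity of the printk message, 0 (emergency) to 7 (debug).
 * console_loglevel: current console loglevel.
 * Returns true if the message is printed to the console. Messages that are
 * filtered out here would still be stored in the printk buffer.
 */
bool reaches_console(int msg_level, int console_loglevel)
{
    assert(msg_level >= 0 && msg_level <= 7);
    return msg_level < console_loglevel;
}
```

With the quiet setting, an error message at level 3 is still printed, while warnings at level 4 and informational messages at level 6 are kept out of the console and remain available through dmesg.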

• User space phase

METHOD 10: Simplify user space utilities

[Original] BusyBox provides many useful utilities for user space. However, an embedded product can give up utilities that duplicate one another or that are not needed.

[Modification] Reducing the size of busybox reduces the size of the file system. Depending on the requirements of the product, we can give up utilities with similar functions and utilities that are not needed. Most archival utilities, editors, and console utilities can be given up in an embedded product.

[Improvement] A smaller busybox takes less time to execute.

METHOD 11: Accelerate shell prompt start

[Original] To save memory, BusyBox blocks and waits for the user to press the Enter key before it activates the shell prompt.

[Limitation] Generally, users want to use a product immediately and should not need to press an extra key. Moreover, the memory used by the shell is small compared with the total memory of the OMAP5912OSK.

[Modification] We skip the wait operation and present the shell prompt directly.

[Improvement] The time from the appearance of the “Please press Enter to activate this console” message to the user pressing Enter was measured as 600 ms on average. This delay is too long when the goal is a short boot time, and skipping the wait removes it.

METHOD 12: Use complex file system

[Original] From the time measurement results, we observe that the kernel function mount_root spends a large amount of time building the JFFS2 file system. If we switch to a file system with a short mount time, this time can be saved.

[Limitation] Embedded systems generally use the JFFS2 (The Journalling Flash File System, version 2) file system, which is log-structured and writable on flash storage devices. However, for a 32 MB NOR flash, the kernel always spends 2 to 3 seconds building the JFFS2 file system. This mount time is too long if the boot time is to be shortened.

[Modification] For flash storage devices, CramFS and SquashFS are highly compressed read-only file systems; SquashFS offers better runtime performance and compression than CramFS. For both CramFS and SquashFS, the mount time is quite short.

In view of the characteristics of writable and read-only file systems, we use both a writable and a read-only file system on a single flash storage device at the same time.

First, we allocate an appropriately sized partition for the read-only root file system, which contains init and most of the routines, and then use the remaining space as the writable file system. Finally, we let the function mount_root build only the root file system, and build the writable file system in the background after the shell prompt appears.

[Improvement] The boot time is reduced greatly, and we can still perform write operations on the flash storage.
