Chapter 5 Experiment
5.2 Comparison
In this section, we compare the time needed by each affected function block before and after the modification. In addition, we supply the patch file wherever we modified the source code.
• Boot loader phase
METHOD 01: Adjust clocking mode
We modified the file u-boot/board/omap5912/platform.S to change the clocking mode (in the latest version of U-Boot, the file to modify is u-boot/board/omap5912osk/lowlevel_init.S). After we change the clocking mode and ramp up the frequency of the ARM core to 192 MHz, the previously inaccurate timer becomes accurate, and so does the wait time of abortboot. Therefore, we omit the time measurement of the function block that includes the execution of abortboot.
Because U-Boot now runs at the upper frequency limit, operations that use the MPU to process data take less time to execute. This modification reduces the time needed from 4774.72 ms to 3811.24 ms, i.e. 963.48 ms has been eliminated.
The time comparison is shown in Table 5.1 and the patch is shown in Figure 5.1.
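The saving quoted for each method is simply the difference between the two Amount rows of its table. As a minimal sketch of that arithmetic, the calculation can be written as follows. The helper names saved_ms, saved_percent and report_saving are our own and are not part of U-Boot.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: time eliminated by a modification, in ms. */
double saved_ms(double before_ms, double after_ms)
{
    return before_ms - after_ms;
}

/* Hypothetical helper: the same saving as a percentage of the original time. */
double saved_percent(double before_ms, double after_ms)
{
    assert(before_ms > 0.0); /* guard against division by zero */
    return 100.0 * (before_ms - after_ms) / before_ms;
}

/* Print one summary line for a method, e.g. using the totals of Table 5.1. */
void report_saving(const char *method, double before_ms, double after_ms)
{
    printf("%s: %.2f ms -> %.2f ms, saved %.2f ms (%.1f%%)\n",
           method, before_ms, after_ms,
           saved_ms(before_ms, after_ms),
           saved_percent(before_ms, after_ms));
}
```

For METHOD 01, report_saving("METHOD 01", 4774.72, 3811.24) reports a saving of 963.48 ms, which is about 20.2% of the original time.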
Table 5.1 The Time Comparison of Changing Clocking Mode
Phase        Start Point                   End Point                     Before (ms)  After (ms)
Boot Loader  Device reset start            Device reset over                   31.38       31.38
Boot Loader  Device reset over             MPU read first instruction           0.74        0.82
Boot Loader  MPU read first instruction    env_relocate_spec start            122.26      100.42
Boot Loader  env_relocate_spec start       env_relocate_spec over              44.98       37.02
Boot Loader  env_relocate_spec over        image data checksum start               -           -
Boot Loader  image data checksum start     image data checksum over           487.92      423.68
Boot Loader  image data checksum over      copy image to ram start              0.44        0.50
Boot Loader  copy image to ram start       copy image to ram over             395.52      323.38
Boot Loader  copy image to ram over        transfer control to Linux            0.90        0.88
Kernel       transfer control to Linux     uncompress kernel start             13.48       13.32
Kernel       uncompress kernel start       uncompress kernel over            1836.62     1040.92
Kernel       uncompress kernel over        jffs2_build_filesystem start      1840.48     1838.92
Amount                                                                       4774.72     3811.24
--- board/omap5912osk/platform.S.old 2006-07-20 21:51:21.000000000 +0800
+++ board/omap5912osk/platform.S 2006-07-20 21:53:38.000000000 +0800
@@ -79,7 +79,7 @@
/* Set CLKM to Sync-Scalable */
/* I supposedly need to enable the dsp clock before switching */
- mov r1, #0x0000
Figure 5.1 The Patch of Changing Clocking Mode
METHOD 02: Reduce unused console functions
The full console device initialization is easy to skip: we delete the call to console_init_r in the file u-boot/lib_arm/board.c. We also delete the call to misc_init_r, because that function is only a temporary one. The time needed is reduced from 219.73 ms to 0 ms, as shown in Table 5.2.
Table 5.2 The Time Reduced by Skipping console_init_r
Function Before (ms) After (ms)
console_init_r 219.74 0.0
METHOD 03: Improve abort boot function
After we simplify the function abortboot in u-boot/common/main.c, the time needed is reduced from 1704.74 ms to 448.12 ms, i.e. 1256.62 ms has been eliminated. The result is shown in Table 5.3 and the patch is shown in Figure 5.2.
Table 5.3 The Time Reduced by Simplifying abortboot
Start Point                   End Point                     Before (ms)  After (ms)
Device reset start            Device reset over                   31.38       31.38
Device reset over             MPU read first instruction           0.74        0.74
MPU read first instruction    env_relocate_spec start            122.26      124.18
env_relocate_spec start       env_relocate_spec over              44.98       45.28
env_relocate_spec over        image data checksum start         1505.38      246.54
Amount                                                          1704.74      448.12
--- u-boot-1.1.3/common/main.c.old 2006-07-21 01:04:03.000000000 +0800
+++ u-boot-1.1.3/common/main.c 2006-07-21 01:07:06.000000000 +0800
@@ -238,7 +238,6 @@
printf("Hit any key to stop autoboot: %2d ", bootdelay);
#endif
-#if defined CONFIG_ZERO_BOOTDELAY_CHECK
/*
* Check if key already pressed
* Don't check if bootdelay < 0
- printf ("\b\b\b%2d ", bootdelay);
- }
putc ('\n');
Figure 5.2 The Patch of Simplifying abortboot
METHOD 04: Improve image verification mechanism
We added a new parameter called verify to the U-Boot environment parameters and modified the file u-boot/common/cmd_bootm.c. When verify is n, U-Boot skips the verification of the image header checksum and the image data checksum. The time needed is reduced from 2193.10 ms to 1702.94 ms, i.e. 490.16 ms has been eliminated. The result is shown in Table 5.4 and the patch is shown in Figure 5.3.
Table 5.4 The Time Reduced by Verification Switch
Phase        Start Point                   End Point                     Before (ms)  After (ms)
Boot Loader  Device reset start            Device reset over                   31.38       31.34
Boot Loader  Device reset over             MPU read first instruction           0.74        0.74
Boot Loader  MPU read first instruction    env_relocate_spec start            122.26      122.90
Boot Loader  env_relocate_spec start       env_relocate_spec over              44.98       45.10
Boot Loader  env_relocate_spec over        image data checksum start         1505.38     1502.86*
Boot Loader  image data checksum start     image data checksum over           487.92           *
Boot Loader  image data checksum over      copy image to ram start              0.44           *
Amount                                                                       2193.10     1702.94
* 1502.86 ms is a single measurement covering the three marked rows.
--- u-boot-1.1.3/common/cmd_bootm.c.old 2005-08-14 07:53:35.000000000 +0800
+++ u-boot-1.1.3/common/cmd_bootm.c 2006-07-21 01:45:33.000000000 +0800
@@ -191,16 +191,18 @@
}
SHOW_BOOT_PROGRESS (2);
- data = (ulong)&header;
- len = sizeof(image_header_t);
+ if (verify) {
+ data = (ulong)&header;
+ len = sizeof(image_header_t);
- checksum = ntohl(hdr->ih_hcrc);
- hdr->ih_hcrc = 0;
+ checksum = ntohl(hdr->ih_hcrc);
+ hdr->ih_hcrc = 0;
- if (crc32 (0, (char *)data, len) != checksum) {
- puts ("Bad Header Checksum\n");
- SHOW_BOOT_PROGRESS (-2);
- return 1;
SHOW_BOOT_PROGRESS (3);
Figure 5.3 The Patch of Verification Switch
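The switch hinges on reading one environment variable and turning it into a flag that guards the checksum tests. The following is a minimal sketch of that decision, not the code of Figure 5.3. The names should_verify, check_image and crc_always_fails are hypothetical, and we assume, as stated in the text, that a value beginning with n disables verification while an unset variable keeps it enabled.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not a U-Boot function): decide whether the image
 * checksums should be verified, given the value of the "verify"
 * environment variable (NULL when the variable is not set).
 * Assumption: only a value starting with 'n' disables verification.
 */
int should_verify(const char *value)
{
    if (value != NULL && value[0] == 'n')
        return 0;   /* verify=n: skip header and data checksums */
    return 1;       /* unset or any other value: keep verifying */
}

/*
 * Hypothetical wrapper: returns 0 when the image may be booted and 1 when
 * verification is enabled and the checksum test reports a mismatch.
 * crc_ok stands in for the CRC32 comparison and returns nonzero on a match.
 */
int check_image(const char *verify_env, int (*crc_ok)(void))
{
    assert(crc_ok != NULL);
    if (should_verify(verify_env) && !crc_ok())
        return 1;
    return 0;
}

/* Stand-in for a CRC32 comparison that always fails, used to show the skip. */
int crc_always_fails(void)
{
    return 0; /* 0 means "checksum does not match" */
}
```

With verify set to n, check_image never calls the checksum routine, which is where the time saving comes from. The cost is that a corrupted image is no longer detected before booting.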
METHOD 05: Use silent console in boot loader phase
Five files need to be modified. In the file u-boot/common/cmd_bootm.c, we modify the code of the U-Boot banner and print_image_hdr. In the file u-boot/common/main.c, we modify the code of the abortboot message. In the file u-boot/include/configs/omap5912osk.h, we set CFG_CONSOLE_INFO_QUIET=1. In the file u-boot/lib_arm/armlinux.c, we modify the code that transfers control to Linux. Finally, in the file u-boot/lib_arm/board.c, we modify the code of display_banner, dram_init, display_dram_config and display_flash_config.
The silent console reduces the time needed from 2603.00 ms to 2557.80 ms, i.e. 45.20 ms has been eliminated. The result is shown in Table 5.5 and a sample patch for u-boot/lib_arm/armlinux.c is shown in Figure 5.4.
Table 5.5 The Time Reduced by Silent Console in U-Boot
Phase        Start Point                   End Point                     Before (ms)  After (ms)
Boot Loader  Device reset start            Device reset over                   31.38       31.38
Boot Loader  Device reset over             MPU read first instruction           0.74        0.74
Boot Loader  MPU read first instruction    env_relocate_spec start            122.26      109.48
Boot Loader  env_relocate_spec start       env_relocate_spec over              44.98       45.24
Boot Loader  env_relocate_spec over        image data checksum start         1505.38     1474.04
Boot Loader  image data checksum start     image data checksum over           487.92      884.72*
Boot Loader  image data checksum over      copy image to ram start              0.44           *
Boot Loader  copy image to ram start       copy image to ram over             395.52           *
Boot Loader  copy image to ram over        transfer control to Linux            0.90        0.78
Kernel       transfer control to Linux     uncompress kernel start             13.48       11.42
Amount                                                                       2603.00     2557.80
* 884.72 ms is a single measurement covering the three marked rows.
diff -Nur u-boot-1.1.3/lib_arm/armlinux.c u-boot-1.1.3o/lib_arm/armlinux.c
--- u-boot-1.1.3/lib_arm/armlinux.c 2005-08-14 07:53:35.000000000 +0800
+++ u-boot-1.1.3o/lib_arm/armlinux.c 2006-07-21 03:43:12.000000000 +0800
@@ -85,11 +85,15 @@
void (*theKernel)(int zero, int arch, uint params);
image_header_t *hdr = &header;
bd_t *bd = gd->bd;
+ int quiet;
+ char *s;
#ifdef CONFIG_CMDLINE_TAG
char *commandline = getenv ("bootargs");
#endif
+ s = getenv ("quiet");
+ quiet = (s && (*s == 'y')) ? 0 : 1;
theKernel = (void (*)(int, int, uint))ntohl(hdr->ih_ep);
/*
@@ -256,7 +260,11 @@
#endif
/* we assume that the kernel is in place */
- printf ("\nStarting kernel ...\n\n");
+ if (quiet) {
Figure 5.4 The Sample Patch of Silent Console
• Kernel phase
METHOD 06: Use uncompressed kernel image
To use an uncompressed kernel, we need to increase the size of mtdblock2, which is assigned in linux/arch/arm/mach-omap1/board-osk.c and linux/include/asm-arm/sizes.h, so that it can hold the uncompressed image. If the kernel has been optimized, the original size of mtdblock2 is enough to hold the optimized uncompressed kernel. With the uncompressed kernel, the time needed is reduced from 6282.10 ms to 5369.46 ms, i.e. 912.64 ms has been eliminated. The result is shown in Table 5.6 and the patch is shown in Figure 5.5. Notably, if we use the uncompressed kernel without image verification, the image data checksum time of 1026.10 ms is saved as well, so the total reduction is 912.64 + 1026.10 = 1938.74 ms.
Table 5.6 The Time Reduced by Uncompressed Kernel
Phase        Start Point                   End Point                     Before (ms)  After (ms)
Boot Loader  Device reset start            Device reset over                   31.38       31.38
Boot Loader  Device reset over             MPU read first instruction           0.74        0.74
Boot Loader  MPU read first instruction    env_relocate_spec start            122.26      122.22
Boot Loader  env_relocate_spec start       env_relocate_spec over              44.98       45.18
Boot Loader  env_relocate_spec over        image data checksum start         1505.38     1505.20
Boot Loader  image data checksum start     image data checksum over           487.92     1026.10
Boot Loader  image data checksum over      copy image to ram start              0.44        0.64
Boot Loader  copy image to ram start       copy image to ram over             395.52      831.58
Boot Loader  copy image to ram over        transfer control to Linux            0.90        0.90
Kernel       transfer control to Linux     uncompress kernel start             13.48     1805.52*
Kernel       uncompress kernel start       uncompress kernel over            1838.62           *
Kernel       uncompress kernel over        jffs2_build_filesystem start      1840.48           *
Amount                                                                       6282.10     5369.46
* 1805.52 ms is a single measurement covering the three marked rows.
diff -ruN linux-2.6.14.orig/arch/arm/boot/Makefile linux-2.6.14/arch/arm/boot/Makefile
--- linux-2.6.14.orig/arch/arm/boot/Makefile 2006-06-15 21:57:19.000000000 +0800
+++ linux-2.6.14/arch/arm/boot/Makefile 2006-06-15 22:20:09.000000000 +0800
@@ -46,6 +46,10 @@
$(obj)/Image: vmlinux FORCE
$(call if_changed,objcopy)
@echo ' Kernel: $@ is ready'
+ $(CONFIG_SHELL) $(MKIMAGE) -A arm -O linux -T kernel \
+ -C none -a $(ZRELADDR) -e $(ZRELADDR) -n 'Linux-$(KERNELRELEASE)' \
+ -d arch/arm/boot/Image arch/arm/boot/uImage-uncompress
+ @echo ' Kernel: arch/arm/boot/uImage-uncompress is ready'
$(obj)/compressed/vmlinux: $(obj)/Image FORCE
$(Q)$(MAKE) $(build)=$(obj)/compressed $@
diff -ruN linux-2.6.14.orig/Makefile linux-2.6.14/Makefile
--- linux-2.6.14.orig/Makefile 2006-06-15 22:02:41.000000000 +0800
+++ linux-2.6.14/Makefile 2006-06-15 22:29:54.000000000 +0800
@@ -984,7 +984,8 @@
# Directories & files removed with 'make clean'
CLEAN_DIRS += $(MODVERDIR)
CLEAN_FILES += vmlinux System.map \
- .tmp_kallsyms* .tmp_version .tmp_vmlinux* .tmp_System.map
+ .tmp_kallsyms* .tmp_version .tmp_vmlinux* .tmp_System.map \
+ arch/arm/boot/uImage-uncompress
# Directories & files removed with 'make mrproper'
MRPROPER_DIRS += include/config include2
Figure 5.5 The Patch of Uncompressed Kernel
METHOD 07: Eliminate BogoMIPS calibration
From a previous boot, we can obtain the value of loops_per_jiffy with dmesg, which is 373760. By putting “lpj=373760” on the kernel command line (passing 373760 to the kernel as the value of loops_per_jiffy), the time of calibrate_delay is reduced from 154.785157 ms to 0.061036 ms, i.e. 154.724121 ms has been eliminated. The result is shown in Table 5.7 and the effect is shown in Figure 5.6.
Table 5.7 The Time of calibrate_delay
Normal boot      Preset loops_per_jiffy
154.785157 ms    0.061036 ms
Before presetting:
Calibrating delay loop... 74.75 BogoMIPS (lpj=373760)
After presetting:
Calibrating delay loop (skipped)... 74.75 BogoMIPS preset
Figure 5.6 The Effect of Preset LPJ
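The BogoMIPS figure in these messages is derived directly from loops_per_jiffy, which is why presetting lpj yields the same 74.75 value without running the calibration loop. The sketch below reproduces that conversion in user space. It assumes a tick rate of HZ = 100, which is our assumption (it is consistent with the numbers above but is not stated in the text), and the function names are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

#define HZ 100UL /* assumed kernel tick rate; not stated in the text */

/* Whole-number part of BogoMIPS for a given loops_per_jiffy. */
unsigned long bogomips_int(unsigned long lpj)
{
    return lpj / (500000UL / HZ);
}

/* Two-digit fractional part of BogoMIPS for a given loops_per_jiffy. */
unsigned long bogomips_frac(unsigned long lpj)
{
    return (lpj / (5000UL / HZ)) % 100UL;
}

/* Format "<int>.<frac> BogoMIPS (lpj=<lpj>)" into buf; returns snprintf's result. */
int format_bogomips(char *buf, size_t size, unsigned long lpj)
{
    assert(buf != NULL && size > 0);
    return snprintf(buf, size, "%lu.%02lu BogoMIPS (lpj=%lu)",
                    bogomips_int(lpj), bogomips_frac(lpj), lpj);
}
```

With lpj = 373760 and the assumed HZ, the sketch produces 74.75 BogoMIPS, matching both boot messages.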
METHOD 08: Use device modularization
After applying the unofficial patch for OMAP5912, we find that many kernel options can still be changed. We remove these options: Code maturity level options, Support for paging of anonymous memory (swap), Reset unused clocks during boot, OMAP multiplexing support, IP kernel level autoconfiguration, Initial RAM disk (initrd) support, ATA/ATAPI/MFM/RLL support, Mouse interface, Keyboards, Virtual terminal, Unix98 PTY support, Legacy (BSD) PTY support, Second extended fs support, Inotify file change notification support, Dnotify support, Kernel automounter support, MSDOS fs support and VFAT (Windows-95) fs support. We then modularize these options: PCCard (PCMCIA/CardBus) support, Unix domain sockets, INET (socket monitoring interface), Loopback device support, PPP (point-to-point protocol) support, Texas Instruments TLV320AIC23 Codec, Hardware Monitoring support, Support for frame buffer devices, Kernel automounter version 4 support and NFS file system support. Finally, we select the option Configure standard kernel features (for small systems) to finish this part of the work. With the initcall-times patch, we can then obtain the time of each initcall. The time measured by initcall-times is accurate; it matches the value measured by oscilloscope and logic analyzer.
The time spent in initcalls is reduced from 1574.645998 ms to 448.913571 ms, i.e. 1125.732427 ms has been eliminated, as shown in Table 5.8.
Table 5.8 The Time of Initcalls
Initcall name         Before (ms)  After (ms)
customize_machine        4.516601    4.364014
omap_init_devices        1.983643    1.983642
init_bio                 1.037598    0.946044
i2c_init                 1.220703    1.373291
omap_i2c_init_driver     2.380371    2.380371
tps_init                11.138916   11.016846
chr_dev_init             8.331299    8.392334
param_sysfs_init        16.387940    9.460449
init_jffs2_fs            1.312256    1.220703
omapfb_init             32.745361           -
tty_init                70.251465    1.922607
pty_init               626.342774           -
serial8250_init        245.086670  189.575195
METHOD 09: Use silent console in kernel phase
With the silent console, the time needed is reduced from 5882.60 ms to 5499.60 ms, i.e. 383.00 ms has been eliminated, as shown in Table 5.9.
Table 5.9 The Time Reduced by Silent Console in Linux kernel
Start Point                   End Point                     Before (ms)  After (ms)
transfer control to Linux     uncompress kernel start             13.48       13.38
uncompress kernel start       uncompress kernel over            1838.62     1838.66
uncompress kernel over        jffs2_build_filesystem start      1840.48     1463.88
jffs2_build_filesystem start  jffs2_build_filesystem over       2179.54     2183.68*
jffs2_build_filesystem over   invoke init                         10.48           *
Amount                                                          5882.60     5499.60
* 2183.68 ms is a single measurement covering the two marked rows.
• User space phase
METHOD 10: Simplify user space utilities
We drop most of the archival utilities, editors and console utilities, because they are rarely used. Because the system runs in single-user mode, we also remove the user management utilities, the login/password management utilities and the system logging utilities. Finally, we add the Linux module utilities, a web server, a telnet server and some daemons. The result is a capable file system whose size, without modules, is only 636.4 KB using JFFS2.
METHOD 11: Accelerate shell prompt start
After skipping the wait for the Enter key, the wait time, 600 ms on average, is eliminated. The patch is shown in Figure 5.7.
diff -Nur busybox-1.01/init/init.c busybox-1.01-phantom-v2/init/init.c
--- busybox-1.01/init/init.c 2005-08-17 09:29:16.000000000 +0800
+++ busybox-1.01-phantom-v2/init/init.c 2006-07-11 06:13:37.000000000 +0800
@@ -429,12 +429,14 @@
char *s, *tmpCmd, *cmd[INIT_BUFFS_SIZE], *cmdpath;
char buf[INIT_BUFFS_SIZE + 6]; /* INIT_BUFFS_SIZE+strlen("exec ")+1 */
sigset_t nmask, omask;
+/* skip press_enter */
+/*
static const char press_enter[] =
#ifdef CUSTOMIZED_BANNER
#include CUSTOMIZED_BANNER
#endif
"\nPlease press Enter to activate this console. ";
-
+*/
/* Block sigchild while forking. */
sigemptyset(&nmask);
+ * Save memory by not exec-ing anything large (like a shell)
+ * before the user wants it. This is critical if swap is not
+ * enabled and the system has low memory. Generally this will
+ * be run on the second virtual console, and the first will
+ * be allowed to start a shell or whatever an init script
+ * specifies.
+ */
+/*
#if !defined(__UCLIBC__) || defined(__ARCH_HAS_MMU__)
if (a->action & ASKFIRST) {
char c;
- /*
- * Save memory by not exec-ing anything large (like a shell)
- * before the user wants it. This is critical if swap is not
- * enabled and the system has low memory. Generally this will
- * be run on the second virtual console, and the first will
- * be allowed to start a shell or whatever an init script
- * specifies.
- */
- */
messageD(LOG, "Waiting for enter to start '%s'"
"(pid %d, terminal %s)\n", cmdpath, getpid(), a->terminal);
message(LOG, "Starting pid %d, console %s: '%s'", getpid(), a->terminal, cmdpath);
Figure 5.7 The Patch of Quick Shell Prompt
METHOD 12: Use complex file system
First, the comparison between JFFS2, CramFS and SquashFS is shown in Table 5.10.
Table 5.10 The Comparison between Different FS
                       Writable FS  Read-only FS
                       JFFS2        CramFS       SquashFS
Kernel size (bytes)    1721920      1627312      1660832
FS image size (bytes)  1162272      1007616      1085440
Mount time (ms)        2179.54      8.62         6.98
Figure 5.8 The NOR Flash Memory Map
Figure 5.9 The Mount Operation of JFFS2 Partition in Background
We then implement a complex file system that combines SquashFS and JFFS2, which greatly reduces the boot time while still allowing write operations on the flash storage.
The NOR flash memory map is shown in Figure 5.8, and the mount operation of the JFFS2 partition in the background is shown in Figure 5.9. The shell prompt is available to the user at 1489.59 ms, and the JFFS2 partition is mounted in the background at 3744.60 ms.
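The text does not say how the background mount is launched. One common way to keep a slow operation off the critical path is to run it in a child process while the parent continues toward the shell prompt. The sketch below illustrates that idea under our own assumptions: run_in_background and slow_mount_stub are hypothetical names, and the stub merely stands in for the real JFFS2 mount, which would be a mount(2) call on the target.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: run `task` in a child process so that the caller
 * can continue immediately. Returns the child's pid in the parent, or -1
 * if fork() failed. The child exits with status 0 when task() returns 0,
 * and with status 1 otherwise.
 */
pid_t run_in_background(int (*task)(void))
{
    pid_t pid;

    assert(task != NULL);
    pid = fork();
    if (pid == 0) {
        /* Child: do the slow work, then exit without returning to the caller. */
        _exit(task() == 0 ? 0 : 1);
    }
    return pid; /* Parent: child pid, or -1 on failure. */
}

/* Stand-in for the slow JFFS2 mount; a real system would call mount(2) here. */
int slow_mount_stub(void)
{
    return 0; /* 0 means the (simulated) mount succeeded */
}
```

The parent can later collect the child's exit status with waitpid to learn whether the mount succeeded. Until then, any path on the writable partition must be treated as unavailable.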