
This chapter proposes a two-phase algorithm and architecture that significantly reduce the computational load of motion estimation by removing unlikely motion vectors in the first phase. Simulations on video clips show that the quality degradation relative to FS is small, averaging only 0.435 per pixel in MAD. In addition, the algorithm adaptively chooses the scan direction, which yields a high degree of data reusability and a low memory requirement.

Table 3.I.: Quality degradation analysis for different video clips.

Clips              FS      LRQ     EFBLA   vs. FS  vs. LRQ
akiyo              0.605   0.645   0.652   0.047    0.007
children           2.572   2.882   2.930   0.358    0.048
coastguard         5.341   6.309   6.390   1.049    0.081
container          1.564   1.578   1.591   0.027    0.013
dancer             2.696   3.963   3.974   1.278    0.011
destruct           4.022   4.439   4.475   0.454    0.036
flower             6.000   6.367   6.491   0.491    0.124
foreman            2.838   3.614   3.684   0.846    0.070
hall monitor       2.543   2.678   2.686   0.143    0.008
mobile             8.837   9.053   9.445   0.608    0.392
mother daughter    1.496   1.646   1.645   0.148   -0.001
news               1.197   1.336   1.349   0.151    0.013
paris              2.500   2.732   2.782   0.282    0.049
sean               1.647   1.713   1.725   0.078    0.012
silent             1.723   1.923   1.930   0.207    0.007
singer             0.821   0.885   0.885   0.064   -0.000
stefan             6.615   7.429   7.715   1.099    0.286
table tennis       4.388   5.262   5.298   0.910    0.036
tempete            5.685   6.181   6.336   0.651    0.155
waterfall          2.948   3.152   3.150   0.202   -0.002
weather            0.797   0.830   0.847   0.050    0.017
Average            3.183   3.553   3.618   0.435    0.065

Table 3.II.: Computational load analysis for different video clips.

Clips              FS       LRQ     EFBLA   vs. FS    vs. LRQ
akiyo              711341   69105   57138   -91.97%   -17.32%
children2          711341   69105   55823   -92.15%   -19.22%
coastguard         711341   69105   58278   -91.81%   -15.67%
container          711341   69105   57416   -91.93%   -16.92%
dancer             711341   69105   58929   -91.72%   -14.72%
destruct           711341   69105   56073   -92.12%   -18.86%
flower             711341   69105   57918   -91.86%   -16.19%
foreman            711341   69105   56121   -92.11%   -18.79%
hall monitor       711341   69105   56671   -92.03%   -17.99%
mobile             711341   69105   56742   -92.02%   -17.89%
mother daughter    711341   69105   57223   -91.96%   -17.19%
news               711341   69105   56064   -92.12%   -18.87%
paris              711341   69105   56042   -92.12%   -18.90%
sean               711341   69105   56911   -92.00%   -17.65%
silent             711341   69105   56887   -92.00%   -17.68%
singer             711341   69105   56734   -92.02%   -17.90%
stefan             711341   69105   57402   -91.93%   -16.94%
table tennis       711341   69105   57469   -91.92%   -16.84%
tempete            711341   69105   56905   -92.00%   -17.65%
waterfall          711341   69105   58352   -91.80%   -15.56%
weather            711341   69105   56631   -92.04%   -18.05%
Average            711341   69105   57035   -91.98%   -17.47%

Unit: Equivalent Adder (εadder).
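As a quick check on how the two rightmost columns of Table 3.II can be read, the short sketch below recomputes the relative change of EFBLA with respect to FS and LRQ from the Average row. The function name `percent_change` is illustrative and is not taken from the thesis.

```python
def percent_change(new: float, baseline: float) -> float:
    """Relative change of `new` with respect to `baseline`, in percent."""
    return (new - baseline) / baseline * 100.0


# Average row of Table 3.II, in equivalent adders.
FS, LRQ, EFBLA = 711341, 69105, 57035

vs_fs = round(percent_change(EFBLA, FS), 2)    # EFBLA relative to FS
vs_lrq = round(percent_change(EFBLA, LRQ), 2)  # EFBLA relative to LRQ

print(f"vs. FS: {vs_fs:.2f}%, vs. LRQ: {vs_lrq:.2f}%")
```

Negative values denote a reduction in computational load relative to the baseline.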

Figure 3-12: MAD curves of FS, LRQ and EFBLA for four clips. (a) The Akiyo Clip. (b) The Children Clip. (c) The Stefan Clip. (d) The Weather Clip.

Chapter 4

Power-Aware Algorithm and Architecture

This chapter presents a power-aware architecture based on subsample algorithms that gracefully trades off power consumption against compression quality as the battery status changes [39–41]. As the available energy decreases, the algorithm raises the subsample rate to maximize battery lifetime. Experimental results show that the proposed algorithm and architecture can dynamically switch among power consumption modes, according to the remaining capacity of the battery pack, with little quality degradation.

This chapter is organized as follows. Sections 4.1 and 4.2 introduce the motivation and background of the power-aware paradigm. Sections 4.3 and 4.4 present the generic and content-based subsample algorithms in detail. Section 4.6 describes the proposed power-aware architecture, and Section 4.7 presents the performance analysis. Finally, Section 4.8 concludes this work.


4.1 Motivation

Motion estimation (ME) is widely recognized as the most critical part of many video compression applications, such as the MPEG standards and H.26x, because it tends to dominate the computational load and hence the power requirements. With the increasing demand for battery-powered multimedia devices, an ME architecture that is flexible in both power consumption and compression quality is highly desirable. This requirement is driven by a user-centric perspective [42]. Users of portable devices basically have two usage preferences. Sometimes they want extremely high video quality at the cost of reduced battery lifetime. At other times they accept lower but adequate quality in order to extend battery lifetime.

This chapter, therefore, presents a novel power-aware ME architecture using a content-based subsample algorithm, which can adaptively trade off power consumption against compression quality as the battery status changes. The content-based subsample algorithm allows the architecture to work at different power consumption modes with acceptable quality degradation. Since the control mechanism and data sequences are the same in every power consumption mode, the power-aware algorithm can switch modes smoothly on the fly. The block diagram in Fig. 4-1 illustrates a typical application of the proposed power-aware ME architecture. The host processor monitors the remaining capacity of the battery pack and switches the power consumption mode. According to the power mode, the power-aware architecture sets the subsample rate and calculates the motion vector (MV) for motion compensation. Note that most portable multimedia devices, in practice, already have a battery monitor unit and power management subroutines. Besides the power-aware motion estimation unit, all the units marked with a gray background can also be designed with power-aware capability to make the portable system friendlier to the battery. This chapter focuses on power-aware motion estimation based on content properties.
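The host-side mode switching described above can be sketched as a simple lookup from remaining battery capacity to subsample rate. The thresholds, the rates, and the names `POWER_MODES` and `select_subsample_rate` are hypothetical values chosen for illustration; the thesis does not specify them in this section.

```python
# Hypothetical mode table: (minimum remaining battery fraction, subsample rate).
# A rate of n is taken here to mean that roughly one pixel in n is matched.
POWER_MODES = [
    (0.75, 1),  # nearly full battery: no subsampling, best quality
    (0.50, 2),
    (0.25, 4),
    (0.00, 8),  # nearly empty battery: most aggressive subsampling
]


def select_subsample_rate(remaining: float) -> int:
    """Map the remaining battery fraction (0.0 to 1.0) to a subsample rate."""
    if not 0.0 <= remaining <= 1.0:
        raise ValueError("remaining must lie in [0.0, 1.0]")
    for min_level, rate in POWER_MODES:
        if remaining >= min_level:
            return rate
    return POWER_MODES[-1][1]  # unreachable with the table above; kept as a guard


for level in (1.0, 0.6, 0.3, 0.1):
    print(level, select_subsample_rate(level))
```

Because only the subsample rate changes between modes, a table of this kind is all the host processor would need to pass to the ME unit under these assumptions.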

Figure 4-1: The system block diagram of a portable, battery-powered multimedia device.

Many published papers have presented efficient algorithms for VLSI implementation of motion estimation, targeting either high-performance or low-power design. Yet most of them cannot dynamically adapt the compression quality to different power consumption modes. Among these algorithms, the Full-Search Block-Matching (FSBM) algorithm with the Sum of Absolute Differences (SAD) criterion is the most popular approach for motion estimation because of its considerably good quality. It is particularly attractive to applications that require extremely high quality. Many types of architectures have been proposed for implementing FSBM algorithms [8, 11, 12, 15]. However, they require a huge number of comparison/difference operations, which results in a high computational load and high power consumption. To reduce the computational complexity of FSBM, researchers have proposed various fast algorithms that either reduce the number of search steps [17–19, 21, 43, 44] or simplify the calculation of the error criterion [13, 29, 34, 45]. By combining step reduction and criterion simplification, some researchers have proposed two-phase algorithms that balance complexity and quality [31, 32, 46]. They first use FSBM with a simplified matching criterion to generate candidate vectors and then select the best motion vector from these candidates with the SAD criterion. These fast-search algorithms improve the block-matching speed with little quality degradation and thus lead to low-power implementations. However, a low-power implementation is not necessarily a power-aware system, because a power-aware system should adaptively modify its behavior as the power/energy status changes and balance quality against battery life [47]. To be suitable for power-aware design, an ME algorithm must offer a high degree of scalability in performance tradeoffs. Unfortunately, the fast algorithms mentioned above do not meet this requirement.
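To make the cost of FSBM concrete, the following is a minimal software sketch of full-search block matching with the SAD criterion. It assumes frames stored as lists of rows of integer luminance values, a block of size n at position (bx, by), and a search range of ±p pixels; the function names and the tie-breaking rule (first minimum in raster order) are illustrative choices, not the hardware data flow of the cited architectures.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the current block at (bx, by)
    and the reference block displaced by the candidate vector (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def full_search(cur, ref, bx, by, n, p):
    """Evaluate every candidate in [-p, p] x [-p, p]; return (best_mv, best_sad)."""
    height, width = len(ref), len(ref[0])
    best_mv, best_sad = (0, 0), None
    for dy in range(-p, p + 1):
        for dx in range(-p, p + 1):
            # Skip candidates whose block would fall outside the reference frame.
            inside = (0 <= by + dy and by + dy + n <= height and
                      0 <= bx + dx and bx + dx + n <= width)
            if not inside:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best_sad is None or cost < best_sad:
                best_mv, best_sad = (dx, dy), cost
    return best_mv, best_sad
```

Each block costs up to (2p + 1)² SAD evaluations of n² absolute differences each, which is the load that the fast and subsample algorithms try to cut.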

The articles in [24, 48] present subsample algorithms that significantly reduce the computation cost with low quality degradation. Reducing the computation cost implies saving power. Since the power consumption can be reduced simply by increasing the subsample rate, subsample algorithms have a high degree of scalability and are well suited to a power-aware ME architecture. However, subsample algorithms may suffer from aliasing in the high-frequency band, which degrades the compression quality rapidly as the subsample rate increases. To alleviate the problem, we extend traditional subsample algorithms to a content-based subsample algorithm (CSA). The algorithm first uses edge extraction techniques to separate the high-frequency band from a macroblock and then subsamples the low-frequency band only. Combining the edge pixels and the subsampled pixels, the algorithm generates a turn-on mask that the architecture uses to limit the switching activity of the processing elements (PEs) in a semi-systolic array. In this way, we save a significant amount of power while keeping the quality degradation small as the subsample rate increases. Because the number of high-frequency pixels varies among video clips, we use an adaptive control mechanism to set the threshold value for edge determination so that the number of masked pixels stays stationary for a given power mode.
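The mask generation described above can be sketched as follows. This is only an illustration under stated assumptions: the edge detector is a crude central-difference gradient, the low-frequency pixels are kept on a regular grid with spacing `step`, and the threshold controller moves by a fixed `delta`. None of these specifics, nor the function names, are taken from the thesis, whose actual edge extraction and control rule are described in later sections.

```python
def gradient_magnitude(block):
    """Crude edge strength per pixel: |horizontal diff| + |vertical diff|,
    with indices clamped at the block border (an assumed edge operator)."""
    n = len(block)
    grad = [[0] * n for _ in range(n)]
    for y in range(n):
        for x in range(n):
            gx = block[y][min(x + 1, n - 1)] - block[y][max(x - 1, 0)]
            gy = block[min(y + 1, n - 1)][x] - block[max(y - 1, 0)][x]
            grad[y][x] = abs(gx) + abs(gy)
    return grad


def turn_on_mask(block, threshold, step):
    """Return a mask where 1 means the pixel is used for matching.
    Edge pixels (gradient above threshold) are always kept; the remaining
    low-frequency pixels are kept only on a grid with spacing `step`."""
    n = len(block)
    grad = gradient_magnitude(block)
    return [[1 if grad[y][x] > threshold or (x % step == 0 and y % step == 0) else 0
             for x in range(n)]
            for y in range(n)]


def adapt_threshold(threshold, active, target, delta=1):
    """Nudge the edge threshold so the active-pixel count approaches `target`:
    too many active pixels raises it, too few lowers it (never below 0)."""
    if active > target:
        return threshold + delta
    if active < target:
        return max(0, threshold - delta)
    return threshold
```

On a flat block only the grid pixels stay on, while a block containing a sharp edge keeps the edge pixels as well, which is the property that counters aliasing at high subsample rates.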

The CSA can be used in most existing ME architectures by turning off PEs according to the subsample rate. In this chapter, we present a semi-systolic architecture with gated PEs. The proposed architecture shows that the CSA can dynamically alter the subsample rate as the power consumption mode changes.
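In software terms, gating a PE corresponds to skipping its absolute-difference operation. The sketch below is a behavioral model only, not the semi-systolic hardware: it computes a SAD restricted to the pixels whose mask bit is 1 and also reports how many operations were performed, as a rough proxy for switching activity. The function name `masked_sad` and the returned operation count are illustrative additions.

```python
def masked_sad(cur_block, ref_block, mask):
    """SAD over the pixels whose mask bit is 1.
    Returns (sad, active), where `active` counts the absolute-difference
    operations actually performed (gated pixels contribute nothing)."""
    total = 0
    active = 0
    for cur_row, ref_row, mask_row in zip(cur_block, ref_block, mask):
        for c, r, m in zip(cur_row, ref_row, mask_row):
            if m:
                total += abs(c - r)
                active += 1
    return total, active


cur_block = [[1, 2], [3, 4]]
ref_block = [[0, 0], [0, 0]]
print(masked_sad(cur_block, ref_block, [[1, 0], [0, 1]]))
print(masked_sad(cur_block, ref_block, [[1, 1], [1, 1]]))
```

Because the loop structure is identical for every mask, changing the power mode only changes which bits are set, which mirrors the claim that the control mechanism and data sequences stay the same across modes.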
