A Fast Design Method of Fixed-Point Coefficients for Constant Multiplications by Dynamic Programming


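National Science Council, Executive Yuan: Research Project Final Report

Project type: Individual project

Project number: NSC94-2215-E-011-007-

Project period: August 1, 2005 to July 31, 2006

Host institution: Department of Electrical Engineering, National Taiwan University of Science and Technology

Principal investigator: Chia-Yu Yao

Report type: Condensed report

Attachments: Report on attending an international conference, and the published paper

Disposition: This project involves patents or other intellectual property rights; the report may be publicly accessed after 2 years.

July 31, 2006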


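National Science Council, Executive Yuan: Final Report of a Funded Research Project

Project type: Individual project

Project number: NSC 94-2215-E-211-002 (before change); NSC 94-2215-E-011-007 (after change)

Project period: August 1, 2005 to July 31, 2006

Principal investigator: Chia-Yu Yao

Co-principal investigator: (none listed)

Project participants: 徐俊德, 葉錦智, 劉宏德, 張育偉, 周家豪, 蔡其智, 徐佩雯, 陳鈺媛, 王宏文

Host institution: Department of Electronic Engineering, Huafan University (before change); Department of Electrical Engineering, National Taiwan University of Science and Technology (after change)

July 31, 2006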


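A Fast Design Method of Fixed-Point Coefficients for Constant Multiplications by Dynamic Programming

Project number: NSC 94-2215-E-011-007

Project period: August 1, 2005 to July 31, 2006

Principal investigator: Chia-Yu Yao, Associate Professor, Department of Electrical Engineering, National Taiwan University of Science and Technology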

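I. Chinese Abstract

Constant-coefficient multiplications appear throughout digital signal processing: FIR filters, DFT computation, and DCT computation all use them heavily. How efficiently these multiplications are carried out therefore often determines the area and power consumption of the final hardware, and even the complexity of the system. Now that the system-on-chip (SOC) has become mainstream, research on silicon intellectual property (SIP) that uses constant-coefficient multiplications will become increasingly important.

To realize constant-coefficient multiplications in hardware, the corresponding fixed-point coefficients must first be designed under the system specifications. The most direct method is to round the floating-point coefficients at a chosen bit position. Unfortunately, meeting the system specifications this way usually requires a very long word length. Researchers have therefore studied other methods. Early work concentrated on the design of fixed-point coefficients for FIR filters and includes the well-known mixed-integer linear programming (MILP), local search, and trellis search methods. We have also worked on this problem and published a partial MILP design method in 2002.

All of the methods above have shortcomings, however. MILP is optimal, but its computational complexity is too high and its design time is long. Our partial MILP method eases this problem somewhat, but its design time is still long for large design problems. Local search and trellis search are both sub-optimal, and solutions better than theirs are easy to find. This project therefore develops a different design method: based on dynamic programming, it is faster than partial MILP and gives better design results than local search and trellis search.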

English Abstract

Many DSP applications employ multiple constant multiplications (MCM), for example, the realization of FIR filters and the computation of the DFT and DCT. Performing MCM efficiently can therefore reduce hardware complexity and power consumption. In the current SOC era, the study of silicon intellectual properties (SIP) that use MCM is becoming increasingly important.

To realize MCM hardware, we first need to determine the corresponding fixed-point coefficients. The direct method is to round off the floating-point coefficients. Unfortunately, the word length needed to avoid violating the system specifications is usually very long. Researchers have therefore proposed other methods, for example, the well-known mixed-integer linear programming (MILP) method, the local search method, and the trellis search algorithm. We have also worked on this subject and published a partial MILP method in 2002.

However, the methods mentioned above all have weaknesses. The MILP method is optimal, but its computational complexity is too high for a large design. Although our partial MILP method somewhat reduces the computational complexity of the MILP method, it is still a non-polynomial-time design method. Thus, for an extremely large design problem, the partial MILP method still needs a long time to finish. The local search and trellis search algorithms, on the other hand, are sub-optimal, and we can easily find better solutions than those they offer. Therefore, in this project, we develop another design method, based on the dynamic programming technique, that is faster than the partial MILP method. The design results of our new method are also better than those obtained by the local search and trellis search algorithms.


II. Background and Objectives

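Multiple constant multiplications (MCM) are very common in digital signal processing. FIR filters, the discrete Fourier transform (DFT), the discrete cosine transform (DCT), and general linear transformations all use them heavily. Constant multiplications can of course be carried out in software, but when speed matters they can be executed in hardware. How efficiently they are performed therefore often determines the final hardware area and power consumption, and even the complexity of the system. Now that the system-on-chip (SOC) has become mainstream, research on silicon intellectual property (SIP) that uses constant multiplications will become increasingly important.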

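To realize constant multiplications in hardware, the corresponding fixed-point coefficients must first be designed under the system specifications. The most direct method is to round the floating-point coefficients at a chosen bit position. Unfortunately, meeting the system specifications or the required accuracy this way often takes a large number of bits. Some researchers have therefore studied other ways of designing fixed-point coefficients. This work has concentrated on FIR filter design and includes mixed-integer linear programming (MILP) [1, 2], weighted LMS [3], local search [4, 5], and trellis search [6]. We have also worked on this problem and in 2002 published a partial MILP (PMILP) design method [7]. We reformulated the MILP method to minimize the number of signed-powers-of-two (SPT) terms in the coefficients and, to reduce the amount of computation, optimized only the last few digits of each coefficient, hence the name partial MILP. Compared with the other methods, ours yields the coefficients with the fewest SPT terms under the same normalized peak ripple magnitude (NPRM) specification.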
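As a minimal sketch of the direct rounding approach mentioned above, the following Python snippet rounds floating-point coefficients to a chosen number of fractional bits. The function name `round_to_bits` and the sample coefficients are hypothetical and are not taken from the report.

```python
import math


def round_to_bits(coeffs, frac_bits):
    """Round each coefficient to the nearest multiple of 2**-frac_bits.

    Returns the integer coefficients and their fixed-point values.
    Ties are rounded upward (floor(x + 0.5)).
    """
    scale = 1 << frac_bits
    ints = [math.floor(c * scale + 0.5) for c in coeffs]
    return ints, [k / scale for k in ints]


# Hypothetical floating-point coefficients, for illustration only.
h_float = [0.0123, -0.0871, 0.3012, 0.5794]
h_int, h_fixed = round_to_bits(h_float, 7)

# The rounding error per coefficient is at most half an LSB, 2**-8 here.
max_err = max(abs(a - b) for a, b in zip(h_float, h_fixed))
print(h_int, max_err)
```

Rounding bounds the error of each coefficient individually, but it takes no account of the filter specification as a whole, which is why a longer word length is often needed than with the search-based methods discussed here.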

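All of the methods above nevertheless fall short in some respect. MILP is optimal, but its computational complexity is too high and its design time is long. Our partial MILP method improves on this, but its design time still grows long for large design problems. Local search, trellis search, and similar methods run fast but are sub-optimal, and better solutions than theirs are easy to find. For these reasons we want a design method fast enough to cope with ever larger design problems. The aim of this project is therefore to develop a new fixed-point design method, different from those above and based on dynamic programming, whose results both meet the required specifications and lead to low complexity in the synthesized hardware.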

III. Research Method and Results

Taking an FIR filter as an example, its output is

$$y(n) = \sum_{k=0}^{N-1} h(k)\, x(n-k)$$

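where N is the filter length, x is the input, and h(k) are the fixed-point coefficients. To reduce hardware complexity, we represent the coefficients in canonic-signed-digit (CSD) code with a finite number of digits. If a throughput-rate requirement applies, the number of nonzero digits in the CSD code of each coefficient is limited; let this upper bound be Lmax, and let LT denote the total number of nonzero digits in the CSD codes of all coefficients. Hardware complexity is usually expressed as the number of adders. Although the adder count of the synthesized hardware has no strict relationship with LT [8], in general the larger LT is, the higher the hardware complexity. The design of CSD coefficients for FIR filters therefore favors as small an LT as possible.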
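To make the CSD quantities concrete, the sketch below converts an integer coefficient to its CSD digits and counts the nonzero ones, so that LT and the Lmax check can be computed for a set of integer coefficients. It assumes the coefficients have already been scaled to integers (for example by 2^7 for 7-digit codes); the function names and sample values are illustrative and do not come from the report.

```python
def to_csd(value):
    """CSD digits of an integer, least significant first, each in {-1, 0, 1}."""
    digits = []
    while value != 0:
        if value % 2 == 0:
            digit = 0
        else:
            # Choose +1 or -1 so that the remaining value is divisible by 4,
            # which forces the next digit to be 0 (no two adjacent nonzeros).
            digit = 2 - (value % 4)
        digits.append(digit)
        value = (value - digit) // 2
    return digits


def nonzero_count(value):
    """Number of nonzero CSD digits of one integer coefficient."""
    return sum(1 for d in to_csd(value) if d != 0)


def total_nonzero(coeffs):
    """LT: total number of nonzero CSD digits over all coefficients."""
    return sum(nonzero_count(c) for c in coeffs)


def within_lmax(coeffs, l_max):
    """True if no coefficient uses more than l_max nonzero CSD digits."""
    return all(nonzero_count(c) <= l_max for c in coeffs)


# Illustrative integer coefficients (values scaled by 2**7).
coeffs = [1, -4, 59, 74]
print(to_csd(59))              # [-1, 0, -1, 0, 0, 0, 1], i.e. 64 - 4 - 1
print(total_nonzero(coeffs))   # 1 + 1 + 3 + 3 = 8
print(within_lmax(coeffs, 3))  # True
```

Because the conversion is deterministic, any integer coefficient maps to exactly one digit pattern, so LT can be evaluated on demand rather than carried as a set of per-digit variables.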

The frequency response of the filter is

$$H(\omega) = \sum_{n=0}^{N-1} h(n)\, e^{-j\omega n}$$

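Conventional filter specifications require |H(ω)|/b, the normalized magnitude response, to satisfy a spectral mask, where b is the passband gain. Data transmission filters add a time-domain specification (the zero-ISI requirement) to the spectral one. The coefficient design problem can therefore be stated as follows: find a set of CSD coefficients h(n) with a sufficiently small LT such that both the frequency-domain and the time-domain specifications of the filter are met and the number of nonzero digits in the CSD code of every coefficient does not exceed Lmax. The solution process can be viewed as an integer-based constrained optimization. Notably, the frequency-domain specification can be written as a set of linear inequalities, whereas the time-domain specification may be nonlinear. If the passband gain is a constant and both specifications can be expressed as linear inequalities, the optimization problem is linear and can be solved by methods such as MILP. When the passband gain is a variable, however, MILP must be modified appropriately; our PMILP is one such modified method that handles a variable passband gain.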
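The frequency-domain part of such a specification can be checked numerically. The sketch below evaluates H(ω) from the coefficients and tests a stopband attenuation requirement on a frequency grid. It is only an illustration: the grid check approximates the continuous requirement, the time-domain (ISI) check is omitted because the report does not give its formula, and the function names and the 4-tap example filter are hypothetical.

```python
import cmath
import math


def freq_response(h, omega):
    """H(omega) = sum over n of h[n] * exp(-j * omega * n)."""
    return sum(c * cmath.exp(-1j * omega * n) for n, c in enumerate(h))


def normalized_magnitude_db(h, omega, b):
    """20*log10(|H(omega)| / b), where b is the passband gain."""
    mag = abs(freq_response(h, omega)) / b
    return 20.0 * math.log10(mag) if mag > 0.0 else float("-inf")


def stopband_ok(h, b, omega_s, atten_db, grid=256):
    """Check on a uniform grid over [omega_s, pi] that the normalized
    magnitude stays at or below -atten_db."""
    for i in range(grid + 1):
        omega = omega_s + (math.pi - omega_s) * i / grid
        if normalized_magnitude_db(h, omega, b) > -atten_db:
            return False
    return True


# Illustrative 4-tap averaging filter (not a filter from the report).
h = [0.25, 0.25, 0.25, 0.25]
b = abs(freq_response(h, 0.0))          # passband gain taken at omega = 0
print(stopband_ok(h, b, math.pi / 2, 10.0))
print(stopband_ok(h, b, math.pi / 2, 12.0))
```

A predicate of this kind is all a search-based design needs from the specification: it only has to answer whether a candidate set of coefficients passes.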


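As noted in the previous section, the design time of MILP or PMILP becomes very long when the design problem is large. One reason is that MILP-based algorithms run in non-polynomial time. Another is that, to keep LT small, the formulation optimizes every digit of the CSD code, which makes the original design problem much larger. For example, if N coefficients are to be designed and M CSD digits are to be optimized, the size of the design problem grows from N to 2MN [7, 8], which greatly increases the design time.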

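Re-examining the coefficient design problem, we find that it is unnecessary to optimize every digit of the CSD code. Because the CSD representation of any integer is unique, as long as we can track the CSD codes of the integer coefficients throughout the design process, we should be able to design directly in the integer domain without enlarging the design problem. In addition, because MILP-based algorithms run in non-polynomial time, we plan to design by search in order to speed up the process. Local search and trellis search, however, still search in the digit domain of the CSD code, which either enlarges the problem or leaves the search range too small. This project therefore searches in the integer domain in order to find better results.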

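Our search method takes a dynamic-programming form. Each set of integer coefficients (some meet the specification, some do not) is represented by a node, and each node carries a function f that expresses the complexity of realizing the integer coefficients of that node. Applying a certain adjustment to the integer coefficients of a node yields the integer coefficients of a neighboring node. Represented this way, the range to be searched forms a trellis structure. During the search we move among the nodes of the trellis. On reaching a node, we check whether its integer coefficients meet the specification; if they do, we compute the complexity function f; if not, we keep moving. Movement between nodes follows the rules below: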

1. Clear counter D and set T to infinity. Start from a node i.

2. Search all neighbors of node i. Check whether any of them satisfies the specification.

3. If some do, pick the one, say node j, with the lowest complexity f and compare it with T. If the new f is smaller, replace T with the new f and assign i = j. Go to Step 2.

4. Otherwise, if node i satisfies the specification, output node i as a solution and STOP.

5. If not, check whether D has reached a bound. If it has not, sort node i’s neighbors by their distance to the specification, assign D = D + 1, and push node i’s neighbors onto the stack LD. Otherwise, assign D = D – 1.

6. If LD is not exhausted, take the top element from LD and rename it i. Go to Step 2.

7. Otherwise, assign D = D – 1. Go to Step 6.
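The steps above can be sketched in Python as a depth-bounded search over integer coefficient vectors. This is a simplified reading, not the project's implementation, and it rests on several assumptions that the report does not state: a neighbor differs from a node by ±1 in one coefficient, the complexity f is the total number of nonzero CSD digits, the specification is supplied as a predicate `meets_spec` with a distance measure `violation`, and, once the depth bound is reached, the search keeps taking nodes from the current stack instead of decrementing D immediately. The toy specification at the end is hypothetical.

```python
def csd_weight(value):
    """Number of nonzero digits in the CSD code of an integer."""
    count = 0
    while value != 0:
        if value % 2:
            value -= 2 - (value % 4)  # remove a +1 or -1 digit
            count += 1
        value //= 2
    return count


def complexity(node):
    """Assumed complexity f: total nonzero CSD digits of the node."""
    return sum(csd_weight(c) for c in node)


def neighbors(node):
    """Assumed adjustment: change one coefficient by +1 or -1."""
    result = []
    for k in range(len(node)):
        for step in (-1, 1):
            result.append(node[:k] + (node[k] + step,) + node[k + 1:])
    return result


def search(start, meets_spec, violation, bound=3):
    best = float("inf")   # T in the steps above
    depth = 0             # counter D
    stacks = {}           # stacks[D] plays the role of stack LD
    node = start
    while True:
        nbrs = neighbors(node)                    # Step 2
        feasible = [n for n in nbrs if meets_spec(n)]
        if feasible:
            cand = min(feasible, key=complexity)
            if complexity(cand) < best:           # Step 3: move to a better node
                best, node = complexity(cand), cand
                continue
        if meets_spec(node):                      # Step 4: no better neighbor
            return node
        if depth < bound:                         # Step 5: go one level deeper
            depth += 1
            # Largest violation first, so pop() returns the closest neighbor.
            stacks[depth] = sorted(nbrs, key=violation, reverse=True)
        while depth > 0 and not stacks[depth]:    # Step 7: back up one level
            depth -= 1
        if depth == 0:
            return None                           # nothing left to try
        node = stacks[depth].pop()                # Step 6


# Toy specification (hypothetical): every integer coefficient must lie
# within 0.75 of a scaled target value.
target = (7.3, 15.2, 30.6)


def worst_error(node):
    return max(abs(c - t) for c, t in zip(node, target))


def meets_spec(node):
    return worst_error(node) <= 0.75


def violation(node):
    return max(0.0, worst_error(node) - 0.75)


result = search((5, 15, 31), meets_spec, violation)
print(result, complexity(result))   # (8, 15, 31) 5
```

In this toy run the start node violates the specification, so its neighbors are stacked by distance to the specification. The closest one leads to a feasible node, and the search then descends greedily in complexity until no feasible neighbor is cheaper.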

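We emphasize that our method differs from trellis search. Trellis search operates in the digit domain of the CSD code, whereas our method searches in the integer domain. Moreover, for the most significant digits (MSDs), trellis search considers only the 1 or −1 obtained by quantizing the floating-point coefficients; only the last two digits, the least significant digits (LSDs), are searched exhaustively (trellis search can thus be regarded as an improved local search). Its search range is therefore limited, whereas the range searched by our method is much larger, giving more chances of finding a better result.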

Design Example

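DOCSIS 2.0 [9] specifies that the upstream transmitter of a cable modem must approximate a square-root-raised-cosine (SRRC) filter with a bandwidth of 5/(8T) and a roll-off factor of 0.25 (1/T is the symbol rate), with a stopband attenuation of –30 dB. Reference [10] proposes a fixed-point FIR filter for an M-QAM modulator that meets the DOCSIS 2.0 requirement. That filter has a length of 40, and its coefficients are coded in 7-digit canonic-signed-digit (CSD) code; it provides a stopband attenuation of –30.4 dB and an intersymbol interference (ISI) of –41.1 dB. (Note: reference [10] finds the CSD coefficients of the filter by local search.)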

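We redesigned this FIR filter. We first obtained the floating-point coefficients of the filter with the method we proposed last year [11] and then used the method of this project to quantize them into 7-digit CSD code. Our filter needs a length of only 35, achieves a stopband attenuation of –31.46 dB and an ISI of –42.18 dB, and thus outperforms the filter of reference [10] at lower complexity. The coefficients of our FIR filter are listed below, and Figure 1 shows the corresponding magnitude response: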

h[0]=2⁻⁷              h[18]=2⁻¹−2⁻⁵−2⁻⁷
h[1]=2⁻⁵              h[19]=2⁻²−2⁻⁷
h[2]=2⁻⁷              h[20]=0
h[3]=0                h[21]=−2⁻²+2⁻⁴+2⁻⁶
h[4]=−2⁻⁵             h[22]=−2⁻²+2⁻⁶
h[5]=−2⁻⁵             h[23]=−2⁻²+2⁻⁴+2⁻⁷
h[6]=−2⁻⁷             h[24]=−2⁻⁴
h[7]=2⁻⁶              h[25]=2⁻⁴−2⁻⁶
h[8]=2⁻⁴−2⁻⁶          h[26]=2⁻³−2⁻⁵+2⁻⁷
h[9]=2⁻⁴−2⁻⁷          h[27]=2⁻³−2⁻⁵−2⁻⁷
h[10]=2⁻⁵             h[28]=2⁻⁵+2⁻⁷
h[11]=0               h[29]=−2⁻⁷
h[12]=2⁻⁵−2⁻⁷         h[30]=−2⁻⁴+2⁻⁶
h[13]=2⁻³−2⁻⁶         h[31]=−2⁻⁴+2⁻⁶
h[14]=2⁻²+2⁻⁷         h[32]=−2⁻⁶
h[15]=2⁻¹−2⁻⁴         h[33]=0
h[16]=2⁻¹+2⁻⁴         h[34]=2⁻⁵−2⁻⁷
h[17]=2⁻¹+2⁻⁴+2⁻⁶
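Coefficients written as sums of signed powers of two can be applied to a sample with shifts and adds instead of a multiplier. The sketch below illustrates this for h[18] = 2⁻¹−2⁻⁵−2⁻⁷ from the list. The (sign, shift) representation, the function name, and the input value are assumptions made for illustration; the result is kept as an integer scaled by 2^7 so that no fractional bits are lost.

```python
def csd_multiply(x, terms, frac_bits=7):
    """Return x * c * 2**frac_bits for c = sum(sign * 2**-shift),
    using only shifts, additions, and subtractions."""
    acc = 0
    for sign, shift in terms:
        shifted = x << (frac_bits - shift)   # x * 2**(frac_bits - shift)
        acc = acc + shifted if sign > 0 else acc - shifted
    return acc


# h[18] = 2^-1 - 2^-5 - 2^-7 from the coefficient list, as (sign, shift) pairs.
h18 = [(+1, 1), (-1, 5), (-1, 7)]
x = 100                                  # illustrative integer input sample
print(csd_multiply(x, h18))              # 100 * (64 - 4 - 1) = 5900
```

A coefficient with three nonzero digits costs two additions or subtractions when realized on its own, which is why the total number of nonzero digits serves as a rough proxy for adder count.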

Figure 1: Magnitude response of the design example.

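IV. Conclusions and Discussion

This project proposes a fast, dynamic-programming-based method for designing fixed-point multiplication coefficients. It designs the coefficients much faster than MILP or partial MILP. Although it is sub-optimal, like local search and trellis search, the coefficients it produces have lower complexity than those found by local search and trellis search.

The method proposed in this project can be combined with the methods of the principal investigator's earlier projects to form a complete design flow.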

V. References

1. Y. C. Lim and S. R. Parker, “FIR filter design over a discrete powers-of-two coefficient space,” IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-31, pp. 583-591, June 1983.

2. Y. C. Lim and S. R. Parker, “Design of discrete-coefficient-value linear phase FIR filters with optimum normalized peak ripple magnitude,” IEEE Trans. Circuits Syst., vol. 37, pp. 1480-1486, Dec. 1990.

3. Y. C. Lim and S. R. Parker, “Discrete coefficient FIR digital filter design based upon an LMS criteria,” IEEE Trans. Circuits Syst., vol. CAS-30, pp. 723-739, Oct. 1983.

4. H. Samueli, “An improved search algorithm for the design of multiplierless FIR filters with powers-of-two coefficients,” IEEE Trans. Circuits Syst., vol. 36, pp. 1044-1047, July 1989.

5. Q. Zhao and Y. Tadokoro, “A simple design of FIR filters with powers-of-two coefficients,” IEEE Trans. Circuits Syst., vol. 35, pp. 566-570, May 1988.

6. C.-L. Chen and A. N. Willson Jr., “A trellis search algorithm for the design of FIR filters with signed powers-of-two coefficients,” IEEE Trans. Circuits Syst. II, vol. 46, pp. 29-39, Jan. 1999.

7. Chia-Yu Yao and Chiang-Ju Chien, “A Partial MILP Algorithm for the Design of Linear Phase FIR Filters with SPT Coefficients,” IEICE Trans. Fundamentals of Electronics, Communications, and Computer Sciences, vol. E85-A, no. 10, pp. 2302-2310, Oct. 2002.

8. O. Gustafsson and L. Wanhammar, “Design of linear-phase FIR filters combining subexpression sharing with MILP,” Proc. IEEE Midwest Symp. Circuits Syst., Tulsa, OK, Aug. 4-7, 2002, vol. III, pp. 9-12.

9. Cable Television Laboratories, Inc., DOCSIS 2.0: Radio frequency interface specification. SP-RFIv2.0-I04-030730, July 2003.

10. D. M. Klymyshyn and D. T. Haluzan, “FPGA implementation of multiplierless M-QAM modulator,” Electron. Lett., vol. 38, pp. 461-462, May 9, 2002.

11. Chia-Yu Yao and Chiang-Ju Chien, “The design of a square-root-raised-cosine FIR filter by a recursive method,” in Proc. 2005 IEEE Int. Symp. Circuits and Systems, Kobe, Japan, May 2005, pp. 512-515.

[Figure 1 plot. Horizontal axis: normalized frequency (cycle/sample), 0 to 0.5. Vertical axis: normalized magnitude response (dB), −50 to 5. Inset: 0 to 0.15 on the horizontal axis, −4 to 1 dB on the vertical axis.]