Chapter 6 Downlink Baseband Receiver Implementation
6.3 Synchronizer
6.3.2 Carrier Frequency Offset Estimator
6.4.2.1 Merge Sorting Network with Programmable and Partial Sorting
First, the design of the proposed merge sorting network, MSNP2, is discussed. To avoid the high complexity of a fully parallel sorting network, a sorting network with a fixed I/O size and a set of memory modules are used to accommodate the number of sorting elements [74]-[76]. The architecture of the MSNP2, which consists of a memory bank, a sorting control unit, and an 8-item sorter, is shown in Fig. 6.12. The 8-item sorter is a Batcher’s sorting network with an I/O size of eight. Batcher’s sorting network is widely used because of its inherent parallelism and short latency [77]. Fig. 6.13 shows the 8-item sorter; its basic unit is a 2x2 comparator that compares two data items and exchanges them when necessary. The memory bank, which is primarily used to store the path power values, is organized into eight independent memory modules denoted MD1~MD8.
Fig. 6.12 Block diagram of the MSNP2.
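To make the role of the 2x2 comparator concrete, the following is a minimal software sketch of an 8-input Batcher network. The text does not state which Batcher variant is drawn in Fig. 6.13, so the odd-even merge sort form is assumed here; the names STAGES, compare_exchange and sort8 are illustrative only and do not come from the design.

```python
# Comparator stages of an 8-input Batcher odd-even merge sorting network.
# Assumption: the text does not say which Batcher variant Fig. 6.13 uses;
# the odd-even merge sort form is shown here (19 comparators, 6 stages).
STAGES = [
    [(0, 1), (2, 3), (4, 5), (6, 7)],
    [(0, 2), (1, 3), (4, 6), (5, 7)],
    [(1, 2), (5, 6)],
    [(0, 4), (1, 5), (2, 6), (3, 7)],
    [(2, 4), (3, 5)],
    [(1, 2), (3, 4), (5, 6)],
]


def compare_exchange(data, i, j):
    """2x2 comparator: leave the larger value at index i, the smaller at j."""
    if data[i] < data[j]:
        data[i], data[j] = data[j], data[i]


def sort8(values):
    """Sort eight values in descending order with the fixed network."""
    data = list(values)
    assert len(data) == 8
    for stage in STAGES:
        # Comparators within one stage touch disjoint wires, so in hardware
        # they can operate in parallel; here they simply run in a loop.
        for i, j in stage:
            compare_exchange(data, i, j)
    return data


print(sort8([3, 9, 1, 7, 7, 0, 5, 2]))
```

Because the comparator pattern is fixed and independent of the data, the latency of such a network is determined by the number of stages rather than by the input values.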
Since the maximum number of sorting items is 128, the sorting data (path power values) are arranged in 32 rows of four items each, and the row numbers are used to define the sorting sequence. The odd rows are loaded into MD1~MD4, and the even rows are loaded into MD5~MD8, as illustrated in Fig. 6.14. Following the sorting sequence, the sorting control unit sends two rows of data to the 8-item sorter in each cycle; the sorter outputs, which are divided into two clusters in descending order, are then written back to the memory bank and replace the original two rows.
Fig. 6.13 Batcher classic sorting network with I/O size of eight.
Fig. 6.14 Memory bank arrangement of merge sorting network.
The L-item merge-sorting sequence can be divided into three cycles: (1) the first local sorting cycle, (2) the cross sorting cycle, and (3) the second local sorting cycle. In the first local sorting cycle, the L items are divided into two L/2-item clusters, and each cluster is merge-sorted separately, so that the data are arranged as two L/2-item clusters, each in descending order. In the cross sorting cycle, the data in the up cluster are compared and exchanged with the data in the down cluster; afterwards, every item in the up cluster is at least as large as every item in the down cluster. In the second local sorting cycle, the two clusters are again sorted separately in descending order. Finally, the sorted results are saved in the memory bank in row order.
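The three-cycle decomposition can be sketched as a recursive schedule generator. This is only an illustration under stated assumptions: the function name merge_sort_schedule is hypothetical, rows are numbered from 1, the row count is assumed to be a power of two, and the cross sorting is assumed to pair each row of the up cluster with the mirrored row of the down cluster.

```python
def merge_sort_schedule(first_row, last_row):
    """Return a list of stages; each stage is a list of (x, y) row pairs
    that are sent to the 8-item sorter. Rows are numbered from 1.

    Assumption: the number of rows is a power of two and at least two.
    """
    rows = last_row - first_row + 1
    if rows == 2:
        # Two rows (eight items) fit the 8-item sorter directly.
        return [[(first_row, last_row)]]

    mid = first_row + rows // 2
    up = merge_sort_schedule(first_row, mid - 1)
    down = merge_sort_schedule(mid, last_row)
    # Local sorting: both clusters run the same stages side by side.
    local = [u + d for u, d in zip(up, down)]
    # Cross sorting: row i of the up cluster meets the mirrored row
    # of the down cluster.
    cross = [(first_row + i, last_row - i) for i in range(rows // 2)]
    # First local sorting, cross sorting, second local sorting.
    return local + [cross] + local


for stage in merge_sort_schedule(1, 8):
    print(stage)
```

Calling merge_sort_schedule(1, 8) covers a 32-item sort (eight rows of four items), and merge_sort_schedule(1, 32) covers the 128-item case under the same assumptions.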
Fig. 6.15 shows the 32-item merge sorting sequence in line representation, where each directed arrow represents one operation of the 8-item sorter. Let (x, y) denote that row x and row y are sorted together by the 8-item sorter. The first local sorting sequence of the 32-item sorting is then (1, 2), (3, 4), (5, 6), (7, 8), (1, 4), (5, 8), (2, 3), (6, 7), (1, 2), (3, 4), (5, 6), and (7, 8). The cross sorting sequence is (1, 8), (2, 7), (3, 6), and (4, 5). The second local sorting sequence is the same as the first local sorting sequence.
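The 32-item sequence listed above can be checked with a short behavioural model. This is a sketch rather than the hardware: Python's built-in sorted() stands in for the 8-item sorter, the memory bank is modelled as a plain list of rows, and the names sorter_op and sort_32_items are hypothetical.

```python
import random

ROW_SIZE = 4  # four sorting data per row, as in the text

# Row pairs taken from the 32-item merge sorting sequence in the text.
LOCAL = [(1, 2), (3, 4), (5, 6), (7, 8),
         (1, 4), (5, 8), (2, 3), (6, 7),
         (1, 2), (3, 4), (5, 6), (7, 8)]
CROSS = [(1, 8), (2, 7), (3, 6), (4, 5)]
SEQUENCE_32 = LOCAL + CROSS + LOCAL


def sorter_op(rows, x, y):
    """Model one pass of the 8-item sorter on rows x and y (1-based).

    The four largest values are written back to row x and the four
    smallest to row y. sorted() is a stand-in for the sorting network.
    """
    merged = sorted(rows[x - 1] + rows[y - 1], reverse=True)
    rows[x - 1] = merged[:ROW_SIZE]
    rows[y - 1] = merged[ROW_SIZE:]


def sort_32_items(values):
    """Sort 32 values in descending order using only row-pair passes."""
    assert len(values) == 32
    rows = [list(values[i:i + ROW_SIZE]) for i in range(0, 32, ROW_SIZE)]
    for x, y in SEQUENCE_32:
        sorter_op(rows, x, y)
    return [v for row in rows for v in row]


random.seed(0)
data = [random.randrange(1000) for _ in range(32)]
assert sort_32_items(data) == sorted(data, reverse=True)
```

The model uses 28 sorter passes in total: 12 for each local sorting cycle and 4 for the cross sorting cycle.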
Merge sorting is used twice in the SMPIC-based decorrelator. The first pass sorts the 128-item data to find the first 32 items and is denoted 128-32-item sorting. The second pass sorts the 32-item data to find the first 8 items and is denoted 32-8-item sorting. The 32-item sorting sequence serves as the basic control sequence, and the 128-item sorting sequence is built by extending the 32-item sorting sequence, as shown in the line representation of Fig. 6.16.
Fig. 6.15 32-item merge sorting sequence.
To save execution time and power, the 128-32-item sorting executes only the grey part of Fig. 6.16, and the 32-8-item sorting executes only the grey part of Fig. 6.15.
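The idea of partial sorting can be illustrated with a small sketch. The grey regions of Fig. 6.15 and Fig. 6.16 are not reproduced here; the pruning rule below is an assumption made for illustration, namely that after each cross sorting the second local sorting is run only on the rows that can still affect the requested top rows. The names partial_schedule and run are hypothetical, and sorted() again stands in for the 8-item sorter.

```python
import random

ROW_SIZE = 4  # four sorting data per row, as in the text


def partial_schedule(first_row, last_row, keep):
    """List the (x, y) sorter passes that leave the `keep` top rows of
    rows first_row..last_row holding the largest items in descending order.

    Hypothetical pruning rule (an assumption, not the grey regions of the
    figures): after the cross sorting, the second local sorting is only
    run where it can still affect the kept rows.
    """
    rows = last_row - first_row + 1
    if rows == 2:
        return [(first_row, last_row)]
    half = rows // 2
    mid = first_row + half
    # First local sorting: both clusters are fully sorted before crossing.
    ops = partial_schedule(first_row, mid - 1, half)
    ops += partial_schedule(mid, last_row, half)
    # Cross sorting: the up cluster receives the larger half of the data.
    ops += [(first_row + i, last_row - i) for i in range(half)]
    # Second local sorting, pruned to the rows that are still needed.
    ops += partial_schedule(first_row, mid - 1, min(keep, half))
    if keep > half:
        ops += partial_schedule(mid, last_row, keep - half)
    return ops


def run(values, ops):
    """Apply sorter passes to values arranged in rows of four."""
    rows = [list(values[i:i + ROW_SIZE])
            for i in range(0, len(values), ROW_SIZE)]
    for x, y in ops:
        merged = sorted(rows[x - 1] + rows[y - 1], reverse=True)
        rows[x - 1], rows[y - 1] = merged[:ROW_SIZE], merged[ROW_SIZE:]
    return [v for row in rows for v in row]


random.seed(1)
data = [random.randrange(10_000) for _ in range(128)]
top32 = run(data, partial_schedule(1, 32, 8))[:32]  # 128-32-item sorting
top8 = run(top32, partial_schedule(1, 8, 2))[:8]    # 32-8-item sorting
assert top8 == sorted(data, reverse=True)[:8]
```

Under this assumed rule the sketch needs 348 sorter passes instead of 496 for the 128-32 case and 21 instead of 28 for the 32-8 case. These counts describe the sketch only and are not figures for the MSNP2 hardware.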
The state diagram of the proposed MSNP2 is shown in Fig. 6.17. The MSNP2 has programmable and partial sorting capability, which allows it to execute both the 128-32-item sorting and the 32-8-item sorting. Five states are defined, as listed in Table 6.2. The basic 32-item sorting is composed of State_Set_0, State_Set_1, and State_Set_2. The 128-32-item sorting is performed by combining these three states with State_Cross_0 and State_Cross_1. Several control signals, described in Table 6.3, control the execution flow and select the execution state in different situations.
Fig. 6.16 128-item merge sorting sequence.
Fig. 6.17 State diagram of the proposed MSNP2.
TABLE 6.2
STATE DEFINITIONS OF THE MSNP2 STATE DIAGRAM
State Name State Behavior
State_Set_0 For a basic 32-item sorting, State_Set_0 is used to execute the sorting sequence {(1, 2), (3, 4), (5, 6), (7, 8)}
State_Set_1 For a basic 32-item sorting, State_Set_1 is used to execute the sorting sequence {(1, 4), (5, 8), (2, 3), (6, 7)}
State_Set_2 For a basic 32-item sorting, State_Set_2 is used to execute the sorting sequence {(1, 8), (2, 7), (3, 6), (4, 5)}
State_Cross_0 For extending to 128-item sorting, State_Cross_0 is used to execute the sorting sequence {(1, 16), (2, 15), (3, 14), (4, 13), (5, 12), (6, 11), (7, 10), (8, 9)}
State_Cross_1 For extending to 128-item sorting, State_Cross_1 is used to execute the sorting sequences {(1, 32), (2, 31), (3, 30), (4, 29), (5, 28), (6, 27), (7, 26), (8, 25)} and {(9, 24), (10, 23), (11, 22), (12, 21), (13, 20), (14, 19), (15, 18), (16, 17)}
TABLE 6.3
CONTROL SIGNALS USED IN THE MSNP2 STATE DIAGRAM
Control Signals Description
CRL [3:0] This 4-bit signal counts the number of times State_Set_0 has been executed.
CS_32_8 This one-bit signal selects whether the execution is the 128-32-item sorting (CS_32_8 = 1’b0) or the 32-8-item sorting (CS_32_8 = 1’b1).
CROSS_TYPE This one-bit signal is used to switch the cross sorting to State_Cross_0 (CROSS_TYPE = 1’b0) or State_Cross_1 (CROSS_TYPE = 1’b1).
STATE_CHANGE This one-bit signal is changed when the 16-row data are completed in the cross sorting (State_Cross_0 or State_Cross_1).
COUNT_END This one-bit signal indicates that the second local sorting of the 128-32-item sorting is executed only partially, in the up cluster.