
[Figure: architecture of the parallel hardware searcher — processor units PU1 to PU9 with DRAM, an interface slot, data and address buses, and 1-to-1 switches.]

1. The PC opens the connection with the searcher.

2. The PC sends the search command and three consecutive 64-bit words to the searcher: two 64-bit words carry the search criterion, and the third 64-bit word holds the starting address of the index file and the maximum loop number.

3. The PC closes the connection with the searcher, and the search loop starts.

4. The hardware of each PU performs the algorithm discussed in the next section. The search operation completes within the maximum loop number.

5. The searcher interrupts the PC after completing the search operation.

6. The PC sends the result command, and the PU that finds the required search value transfers the object's physical address, the object size and the number of qualified objects to the data bus. If no object is found, the PU outputs the address for insertion.

To implement insertion, the insertion value is searched for first. If the criterion is found, the physical address at which to insert is calculated and reported using the count value of the same function name. Otherwise, a PU reports the insertion address.
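The exact command encoding is hardware specific and is not given here; the following C sketch only illustrates the host-side word sequence of steps 2 and 6. The command codes and the searcher_write stub are hypothetical placeholders, while the three-word payload (two criterion words plus one word holding the starting address of the index file and the maximum loop number) follows the description above.

#include <stdint.h>
#include <stdio.h>

/* Hypothetical command codes; the real encoding depends on the searcher hardware. */
#define CMD_SEARCH 0x01u
#define CMD_RESULT 0x02u

/* Stub for the interface-slot write; a real driver would place the word on the searcher's data bus. */
static void searcher_write(uint64_t word)
{
    printf("write 0x%016llx\n", (unsigned long long)word);
}

/* Step 2: send the search command and three consecutive 64-bit words:
   two words for the search criterion, and one word packing the starting
   address of the index file with the maximum loop number. */
static void send_search(uint64_t crit_hi, uint64_t crit_lo,
                        uint32_t index_start, uint32_t max_loops)
{
    searcher_write(CMD_SEARCH);
    searcher_write(crit_hi);
    searcher_write(crit_lo);
    searcher_write(((uint64_t)index_start << 32) | max_loops);
}

int main(void)
{
    /* Steps 1-3: open, send, close; the search loop then starts. */
    send_search(0x0u, 0x1234u, 0x1000u, 3u);
    /* Steps 5-6: after the searcher's interrupt, send the result command and
       read back the object physical address, object size and object count. */
    searcher_write(CMD_RESULT);
    return 0;
}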

3.4 The advantages of the Prolog language

Several advantages of Prolog programs are:

1. Prolog has a strong capability to represent knowledge contained in natural language, so tasks such as document clustering can be performed well in Prolog.

2. Other languages need interface programs to access databases. In Prolog, databases are represented as facts, so the database is part of the program and data access and processing are much easier.

3. Instead of writing all the detailed program steps, only logic rules are required in Prolog. This reduces program development cost and makes complex programs easier to implement.

4. Program size is greatly reduced compared with programs written in other languages.

5. A Prolog program can be regarded as the knowledge base of an expert system, so a system written in Prolog exhibits better intelligence than one developed in other languages.

The major disadvantage of Prolog programs is that they require fast searching over huge knowledge bases. A parallel hardware searcher can be specially designed to overcome this disadvantage.

4. Design of a parallel hardware searcher

The search operation is illustrated with an example in Section 4.1, insertion and deletion are described in Section 4.2, and the search algorithm is given in Section 4.3.

4.1 Implementing the search operation

Let N represent the data size and m represent the total number of processors. The N data items are ordered according to their search key values and distributed to the memories of the m processors. In the example shown in Figure 4, N is 64 and m is 5. Each location k (where k is between 1 and 64) contains a search key value, and k is merely the sequence order number. Pi is the i-th processor, where i is between 1 and m. The relationship among k, m and i is k MOD m = i, with a remainder of 0 corresponding to i = m. In other words, the search key value stored at location k can be found in the memory of processor Pi.
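As a quick check of this mapping, a minimal C sketch is given below. The function name location_to_processor is a placeholder introduced here, and a remainder of 0 is taken to mean processor Pm, which matches the locations used in the example that follows.

#include <stdio.h>

/* Map a sequence-ordered location k (1..N) to the processor Pi (1..m)
   whose memory holds that search key: i = k MOD m, with a remainder
   of 0 taken as processor m. */
static int location_to_processor(int k, int m)
{
    int r = k % m;
    return (r == 0) ? m : r;
}

int main(void)
{
    int m = 5;
    int locations[] = { 16, 32, 48, 64, 36, 40, 38 };   /* locations from the example */
    for (int n = 0; n < 7; n++)
        printf("location %2d -> P%d\n",
               locations[n], location_to_processor(locations[n], m));
    return 0;
}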

The search key values can be sorted and distributed by a hardware sorter. Initially, the host sends BlockSize = 64, UpperBound = 64, m = 5, height = log_(m-1) N = 3 and path = 4 to all processors. This search example can be completed in 3 levels, with 3 comparisons in the worst case. If the search value is found earlier by a processor, that processor broadcasts a stop signal to all processors. Assume the search criterion is found at Location = 38. The proposed algorithm is described in detail below:

Figure 4. Distributing ordered data into multiple processor units for searching.

1. At level 1, all processors compute BlockSize = BlockSize / (m-1) = 16, UpperBound = 64 and UpperBound MOD m = 4. Therefore, all processors know that P4 represents path = 4 with location = 64. Since the level number is odd, processor numbers increase as the corresponding path numbers increase. Each processor can calculate the path it represents and the location at which to retrieve data. In Figure 5, P1~P4 represent paths 1~4, and P5 must take a rest (it represents path 0). P1~P4 retrieve data at locations 16, 32, 48 and 64. Finally, P3 finds that the search criterion is in its range and broadcasts path = 3 to all processors.

2. At level 2, BlockSize = 4, UpperBound = 48 and UpperBound MOD m = 3, so P3 represents path = 4 with location = 48. Since the level number is even, processor numbers decrease as the corresponding path numbers increase. Each processor can calculate the path it represents and the location at which to retrieve data. In Figure 5, P1, P5, P4 and P3 represent paths 1~4, and P2 must take a rest. P1, P5, P4 and P3 retrieve data at locations 36, 40, 44 and 48. P5 finds that the search criterion is in its range and broadcasts path = 2 to all processors.

3. At level 3, BlockSize = 1, UpperBound = 40 and UpperBound MOD m = 0, so P5 represents path = 4 with location = 40. The level number is odd again. In Figure 5, P2, P3, P4 and P5 represent paths 1~4, and P1 must take a rest. P2, P3, P4 and P5 retrieve data at locations 37, 38, 39 and 40. P3 finds the search criterion at location 38 and broadcasts a stop signal to all processors.

4.2 Implementing insertion and deletion

Assume communication links exist between every two adjacent processors, so data can rotate left or right among the m processors. To implement the INSERT operation, first search for the insertion point and find that it is Location = d. All processors rotate data right by one location from Location = N down to Location = d, and then the new data is inserted at Location = d. The detailed algorithm for every processor Pi (where 1 <= i <= m) is given in the following section. To implement the DELETE operation, search for the data to delete and find that it is at Location = d. All processors then rotate data left by one location from Location = d+1 to Location = N.
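The rotation itself can be pictured on a single array before it is mapped onto the m processor memories. The C sketch below is only an illustration of that idea, with hypothetical helper names insert_at and delete_at and the 1-based locations used in the text; the inter-processor links and per-processor memories are ignored.

#include <stdio.h>

#define N 8   /* small array standing in for the distributed data */

/* INSERT: rotate data right by one location from Location = N down to
   Location = d, then place the new key at Location = d. */
static void insert_at(int data[], int d, int key)
{
    for (int k = N - 1; k >= d; k--)   /* 1-based locations, 0-based indices */
        data[k] = data[k - 1];
    data[d - 1] = key;
}

/* DELETE: rotate data left by one location from Location = d+1 up to
   Location = N. */
static void delete_at(int data[], int d)
{
    for (int k = d; k < N; k++)
        data[k - 1] = data[k];
}

int main(void)
{
    int data[N] = { 10, 20, 30, 40, 50, 60, 70, 0 };   /* last slot free */
    insert_at(data, 3, 25);   /* search reported Location d = 3 */
    delete_at(data, 5);       /* delete the key at Location d = 5 */
    for (int k = 0; k < N; k++)
        printf("%d ", data[k]);
    printf("\n");
    return 0;
}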

Figure 5. The virtual tree of the parallel hardware searcher.

4.3 The virtual tree parallel search algorithm

/* At the beginning, the host sends N = 64, m = 5 and
   HEIGHT = log_(m-1) N = 3 to all processors. */

BlockSize = N;            /* Set up initial values. */
UpperBound = N;
DEC_m = m - 1;
PATH = DEC_m; LEVEL = 1;

/* Start to perform the operations in each level. */
while ( LEVEL <= HEIGHT ) {
    UpperBound = UpperBound - BlockSize * ( DEC_m - PATH );
    j = UpperBound MOD m;
    /* Processor Pj represents PATH = DEC_m (j = 0 means Pm). */
    BlockSize = BlockSize / DEC_m;
    /* New block size used in this level. */
    if ( LEVEL is odd ) {
        /* Processor numbers increase as path numbers increase. */
        if ( i <= j ) { PATH = i + DEC_m - j; }
        else          { PATH = i - j - 1; }
    }
    else {
        /* Processor numbers decrease as path numbers increase. */
        if ( i >= j ) { PATH = j + DEC_m - i; }
        else          { PATH = j - i - 1; }
    };
    /* Each Pi with PATH > 0 retrieves the key at
       Location = UpperBound - BlockSize * ( DEC_m - PATH )
       and compares it with the search value. Pi sends a STOP
       signal when the criterion is found. If Pi finds that the
       criterion is located in its range, then Pi broadcasts its
       PATH to all processors. */
    LEVEL = LEVEL + 1;
};
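To make the level arithmetic concrete, the following single-host C sketch simulates the algorithm for the example of Section 4.1 (N = 64, m = 5, criterion at Location 38). It is a simulation written for this description rather than the per-processor hardware code: the even-level path formula mirrors the odd-level one with the roles of i and j exchanged, which is consistent with the processor and path assignments listed for levels 1 to 3, and the broadcast of path is replaced by a direct check of which block contains the target.

#include <stdio.h>

int main(void)
{
    int N = 64, m = 5, target = 38;          /* example of Section 4.1 */
    int DEC_m = m - 1, HEIGHT = 3;           /* HEIGHT = log_(m-1) N   */
    int BlockSize = N, UpperBound = N, PATH = DEC_m;

    for (int LEVEL = 1; LEVEL <= HEIGHT; LEVEL++) {
        UpperBound -= BlockSize * (DEC_m - PATH);
        int j = UpperBound % m;              /* Pj represents PATH = DEC_m */
        BlockSize /= DEC_m;
        for (int i = 1; i <= m; i++) {
            int path;
            if (LEVEL % 2 == 1)              /* odd level: Pi increases with path */
                path = (i <= j) ? i + DEC_m - j : i - j - 1;
            else                             /* even level: Pi decreases with path */
                path = (i >= j) ? j + DEC_m - i : j - i - 1;
            if (path == 0)
                continue;                    /* this processor takes a rest */
            int loc = UpperBound - BlockSize * (DEC_m - path);
            printf("level %d: P%d path %d location %d\n", LEVEL, i, path, loc);
            /* The criterion lies in Pi's block when it falls within the
               BlockSize locations ending at loc; Pi then broadcasts path. */
            if (target > loc - BlockSize && target <= loc) {
                printf("  -> criterion is in P%d's range\n", i);
                PATH = path;
            }
        }
    }
    return 0;
}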

5. Conclusion

A gene network expert system has been designed using a Prolog machine incorporating a parallel hardware searcher. Gene network databases are represented as facts, and the programs are expressed as both rules and facts. All the facts and rules are treated as objects and stored in DRAM to speed up access. Java is used to integrate the whole system. Only a small amount of qualified data needs to be transferred to the PC for further processing, so the searcher readily solves the time-consuming search problem of a Prolog system.

This expert system itself can have enough intelligence to answer gene network questions.

