
In this article, we present a fast and scalable matching automaton (FSAM) with the novel prehash and root-index techniques. The prehash technique quickly verifies the text so that AC matching can be avoided. It differs from the previous BFSM in two ways. First, a Bloom filter builds one big vector from all patterns, whereas our approach builds the bit vector from partial patterns (the suffixes of the current state). Second, BFSM uses multiple hashing functions, whereas our approach uses only one. Our prehash therefore significantly reduces the hardware complexity and makes hashing more feasible for string matching. In addition to prehash, our root-index technique is a space-efficient technique for matching multiple bytes in a single matching step. Since the root state is visited frequently during string matching, accelerating it is an effective way to accelerate the whole automaton.
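The prehash idea can be illustrated in software with a minimal Python sketch. It is not the hardware design evaluated in this article: the hash function `prehash`, the two-byte `WINDOW`, and the naive comparison that stands in for AC matching are all assumptions made for illustration. Only the 32-bit vector width follows the 4-byte per-state bit vector mentioned in the text, and the sketch covers the root state only.

```python
VECTOR_BITS = 32  # one 4-byte bit vector, the per-state size quoted in this article
WINDOW = 2        # bytes hashed per lookup (assumption for this sketch)
ALL_ONES = (1 << VECTOR_BITS) - 1


def prehash(chunk: bytes) -> int:
    """Toy single hash (assumed): map a chunk of bytes to one bit position."""
    value = 0
    for byte in chunk:
        value = (value * 31 + byte) % VECTOR_BITS
    return value


def build_bit_vector(suffixes: list[bytes]) -> int:
    """Build the bit vector of one state from the pattern suffixes that can follow it."""
    vector = 0
    for suffix in suffixes:
        if len(suffix) < WINDOW:
            return ALL_ONES  # too short to hash safely, so never filter
        vector |= 1 << prehash(suffix[:WINDOW])
    return vector


def may_match(vector: int, text: bytes, pos: int) -> bool:
    """Return False only when no suffix can start at text[pos:]."""
    chunk = text[pos:pos + WINDOW]
    if len(chunk) < WINDOW:
        return True  # not enough text to hash; defer to full matching
    return bool((vector >> prehash(chunk)) & 1)


def scan(patterns: list[bytes], text: bytes):
    """Scan from the root state; naive comparison stands in for AC matching."""
    vector = build_bit_vector(patterns)  # at the root, the suffixes are whole patterns
    hits, skipped = [], 0
    for pos in range(len(text)):
        if not may_match(vector, text, pos):
            skipped += 1  # prehash miss: full matching is avoided at this position
            continue
        hits.extend((pos, p) for p in patterns if text.startswith(p, pos))
    return hits, skipped
```

Calling `scan([b"virus", b"worm"], b"a worm and a virus")` returns the matches together with the number of positions at which the single hash lookup let the scan skip full matching. A cleared bit can never hide a real match, so the filter may produce false positives but no false negatives.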

Table III. Comparisons of String-Matching Hardware

| Matching Type | Hardware | Device | Circuit Size (LE) | Pattern Size (Byte) | Throughput (Gbps)¹ | Pattern Placement |
|---|---|---|---|---|---|---|
| AC DFA | FSAM² | Virtex2P | 656 | 32,634 | 11.1 | Internal Memory |
| | | Xilinx FPGA | N/A⁴ | 2,048 | 10.0 | Internal Memory |
| RE DFA | DFA+counter [Lockwood et al. 2001] | VirtexE 1000 | 98 | 11 | 3.8 | Hardwired Circuit |
| | Parallel Regular DFA [Moscola et al. 2003] | VirtexE 2000E | 8,134 | 420 | 1.2 | Internal Memory |
| KMP DFA | KMP Comparators [Baker et al. 2004] | Xilinx Virtex2P | 130 | 32 | 2.4 | Internal Memory |
| Comparator NFA | Comparator NFA [Sidhu et al. 2001] | Xilinx Virtex 100 | 1,920 | 29 | 0.5 | Hardwired Circuit |
| | | Virtex 1000 | 19,660 | 17,550 | 0.8 | Hardwired Circuit |
| | Multi-character decoder NFA [Clark et al. 2003] | Xilinx Virtex2 | 29,281 | 17,537 | 7.3 | Hardwired Circuit |
| | Approximate Decoder NFA [Clark et al. 2004] | Virtex2 6000 | 6,478 | 17,537 | 2.0 | Hardwired Circuit |
| | | Spartan3 400 | 1,163 | 20,800 | 1.9 | Internal Memory |
| | Discrete Comparators [Sourdis et al. 2003] | Virtex2 6000 | 76,032 | 2,457 | 8.0 | Hardwired Circuit |
| | Pre-decoded CAM Comparators [Sourdis et al. 2004] | Virtex2 6000 | 64,268 | 18,032 | 9.7 | Hardwired Circuit |
| | CAM Comparators [Gokhale 2002] | VirtexE 1000 | 9,722 | 640 | 2.2 | CAM⁵ |

19:28 K. K. Tseng et al.

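Table III. (continued)

| Matching Type | Hardware | Device | Circuit Size (LE) | Pattern Size (Byte) | Throughput (Gbps)¹ | Pattern Placement |
|---|---|---|---|---|---|---|
| Hashing | Parallel Bloom Filter [Dharmapurikar et al. 2004] | VirtexE 2000 | 6,048 | 9,800 | 0.6 | Internal Memory |
| | PHmem [Sourdis et al. 2005] | Virtex2 1000 | 8,115 | 20,911 | 2.9 | Hardwired and Internal Memory |
| | | Virtex2 1000 | 2,570 | 18,636 | 2.0 | Hardwired and Internal Memory |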

1. Throughput is the average performance. For all hardware except FSAM and BFSM, the worst-case throughput equals the average-case throughput.

2. A single-engine FSAM requires 329, 159, 154, 1,426, and 2,430 LE and achieves 5.6, 3.2, 3.4, 1.0, and 0.8 Gbps on the Virtex2P, Virtex2 1000, Virtex4 xc4vlx80, VirtexE 2000, and Virtex 800 devices, respectively.

3. Since FSAM cannot fit into the Virtex 100, we used the Virtex 800 device instead. Since the Virtex 800 and VirtexE series do not support block RAM, the bitmap table is placed in external memories with a dedicated bus, which should be acceptable for the evaluation.

4. N/A indicates that the corresponding information is not available for that matching hardware.

5. CAM is content-addressable memory, which can match content against data in parallel.

Extensive evaluation showed that the proposed FSAM achieves speedup increases of 573% and 233% over bitmap AC for 21,302 URL patterns and 10,000 virus patterns, respectively. Moreover, FSAM has the same worst-case time as bitmap AC when the prehash, root-index, and bitmap AC matching are performed in parallel. In terms of space, FSAM adds only a 4-byte bit vector to each state and the root-index tables to the root state. The resulting extra space requirements of 10.73 MB and 5.22 MB for the 21,302 URL patterns and 10,000 virus patterns, respectively, are quite acceptable with currently available technologies.

In the implementation on a Xilinx Virtex2P device, the results demonstrate that FSAM surpasses all of the other hardware compared in Table III in terms of pattern size and throughput. FSAM supports the largest pattern size, 32,634 bytes, and runs at a high throughput of 11.1 Gbps. Furthermore, since our architecture works with both external and internal memories, and external ASIC memories often run at a much higher clock rate than FPGA memories, our architecture scales to a large number of patterns. If high-speed external memories are employed, FSAM can support up to 21,302 patterns while maintaining similarly high performance.
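As a quick cross-check of this comparison, the following Python sketch copies the pattern-size and throughput columns of Table III and looks up the row with the largest value in each. Rows whose hardware name is missing from the table are labeled by device only. The tuple layout and the helper name `best_by` are illustrative choices, not part of the original evaluation.

```python
# (hardware, pattern size in bytes, throughput in Gbps), as listed in Table III.
TABLE_III = [
    ("FSAM", 32634, 11.1),
    ("(unnamed, Xilinx FPGA)", 2048, 10.0),
    ("DFA+counter", 11, 3.8),
    ("Parallel Regular DFA", 420, 1.2),
    ("KMP Comparators", 32, 2.4),
    ("Comparator NFA", 29, 0.5),
    ("(unnamed, Virtex 1000)", 17550, 0.8),
    ("Multi-character decoder NFA", 17537, 7.3),
    ("Approximate Decoder NFA (Virtex2 6000)", 17537, 2.0),
    ("Approximate Decoder NFA (Spartan3 400)", 20800, 1.9),
    ("Discrete Comparators", 2457, 8.0),
    ("Pre-decoded CAM Comparators", 18032, 9.7),
    ("CAM Comparators", 640, 2.2),
    ("Parallel Bloom Filter", 9800, 0.6),
    ("PHmem (8,115 LE)", 20911, 2.9),
    ("PHmem (2,570 LE)", 18636, 2.0),
]


def best_by(rows, column):
    """Return the hardware name of the row with the largest value in `column`."""
    return max(rows, key=lambda row: row[column])[0]


largest_pattern_set = best_by(TABLE_III, 1)   # column 1: pattern size
highest_throughput = best_by(TABLE_III, 2)    # column 2: throughput
```

With the values as transcribed, both lookups select the FSAM row, which agrees with the claim in the text.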

There are two possible future directions for this work. First, to broaden the applications of FSAM, our prehash and root-index techniques can be applied to other automaton matching algorithms, such as the regular expression automaton and the suffix automaton. Second, our FSAM for the content-filtering service can be integrated into a network gateway for field-trial evaluation.

ACKNOWLEDGMENTS

Many thanks to the anonymous reviewers for their time and helpful advice.

REFERENCES

AHO, A. V. AND CORASICK, M. J. 1975. Efficient string matching: an aid to bibliographic search. Comm. ACM, 333–340.

ALDWAIRI, M., CONTE, T., AND FRANZON, P. 2005. Configurable string matching hardware for speeding up intrusion detection. ACM SIGARCH Comput. Archit. News.

ANTONATOS, S., POLYCHRONAKIS, M., AKRITIDIS, P., ANAGNOSTAKIS, K. D., AND MARKATOS, E. P. 2005. Piranha: fast and memory-efficient pattern matching for intrusion detection. In Proceedings of the 20th IFIP International Information Security Conference. Springer, Berlin, Germany.

ANTONATOS, S., ANAGNOSTAKIS, K., AND MARKATOS, E. 2004. Generating realistic workloads for network intrusion detection systems. In Proceedings of the ACM Workshop on Software and Performance. ACM, New York.

ATTIG, M., DHARMAPURIKAR, S., AND LOCKWOOD, J. 2004. Implementation results of bloom filters for string matching. In Proceedings of the 12th Annual IEEE Symposium on Field-Programmable Custom Computing Machines. IEEE, Los Alamitos, CA.

BAKER, Z. K. AND PRASANNA, V. K. 2004. Time and area efficient pattern matching on FPGAs. In Proceedings of the ACM/SIGDA 12th International Symposium on Field Programmable Gate Arrays. ACM, New York.

BLÜTHGEN, H. M., NOLL, T., AND AACHEN, R. 2000. A programmable processor for approximate string matching with high throughput rate. In Proceedings of the IEEE International Conference on Application-Specific Systems, Architectures, and Processors. IEEE, Los Alamitos, CA.

BOSE, P., GUO, H., KRANAKIS, E., MAHESHWARI, A., MORIN, P., MORRISON, J., SMID, M., AND TANG, Y. 2005. On the false-positive rate of bloom filters. http://cg.scs.carleton.ca/~morin/publications/ds/bloom-submitted.pdf.

BOYER, R. S. AND MOORE, J. S. 1977. A fast string searching algorithm. Comm. ACM 20, 10, 762–772.

BU, L. AND CHANDY, J. A. 2001. A keyword match processor architecture using content addressable memory. In Proceedings of the 14th ACM Great Lakes Symposium on VLSI. ACM, New York.

CHO, Y. H. AND MANGIONE-SMITH, W. H. 2005. A pattern matching coprocessor for network security. In Proceedings of the 42nd Annual Conference on Design Automation. ACM, New York.

CLAMANTIVIRUS. 2006. Clam Anti-virus. http://www.clamav.net/.

CLARK, C. R. AND SCHIMMEL, D. E. 2003. Efficient reconfigurable logic circuits for matching complex network intrusion detection patterns. Lecture Notes in Computer Science, vol. 2778.

CLARK, C. R. AND SCHIMMEL, D. E. 2004. A pattern-matching co-processor for network intrusion detection systems. In Proceedings of the IEEE International Conference on Field-Programmable Technology (FPT’03). IEEE, Los Alamitos, CA.

CLARK, C. R. AND SCHIMMEL, D. E. 2004. Scalable pattern matching for high speed networks. In Proceedings of the 12th Annual IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM’04). IEEE, Los Alamitos, CA.

COIT, C., STANIFORD, S., AND MCALERNEY, J. 2002. Towards faster string matching for intrusion detection. In Proceedings of the DARPA Information Survivability Conference and Exhibition. ACM, New York, 367–373.

DANSGUARDIAN. 2006. DansGuardian content filter. http://dansguardian.org.

DESAI, N. 2002. Increasing performance in high speed NIDS. http://www.snort.org/docs/Increasing Performance in High Speed NIDS.pdf.

DHARMAPURIKAR, S., KRISHNAMURTHY, P., SPROULL, T. S., AND LOCKWOOD, J. W. 2004. Deep packet inspection using parallel bloom filters. IEEE Micro 24, 1.

ERDOGAN, O. AND CAO, P. 2006. Hash-AV: fast virus signature scanning by cache-resident filters. http://crypto.stanford.edu/~cao/hash-av.html.

FRANKLIN, R., CARVER, D., AND HUTCHINGS, B. L. 2002. Assisting network intrusion detection with reconfigurable hardware. In Proceedings of the IEEE Symposium on Field-Programmable Custom Computing Machines. IEEE, Los Alamitos, CA.

GOKHALE, M., DUBOIS, D., DUBOIS, A., BOORMAN, M., POOLE, S., AND HOGSETT, V. 2002. Granidt: towards gigabit rate network intrusion detection technology. Lecture Notes in Computer Science, vol. 2438.

LOCKWOOD, J. 2001. An open platform for development of network processing modules in reconfigurable hardware. In Proceedings of the International Engineering Consortium Design Conference.

MIKE, F. AND GEORGE, V. 2001. Fast Content-Based Packet Handling for Intrusion Detection. Tech. rep. CS2001-0670, University of California, San Diego.

MITZENMACHER, M. 2005. Compressed bloom filters. IEEE/ACM Trans. Netw.

MOSCOLA, J., LOCKWOOD, J., LOUI, R. P., AND PACHOS, M. 2003. Implementation of a content-scanning module for an internet firewall. In Proceedings of the 11th Annual IEEE Symposium on Field-Programmable Custom Computing Machines. IEEE, Los Alamitos, CA.

NAVARRO, G. 2001. A guided tour to approximate string matching. ACM Comput. Surv. 33, 31–88.

NAVARRO, G. AND RAFFINOT, M. 2002. Flexible Pattern Matching in Strings. Cambridge University Press, Cambridge, MA.

PAPADOPOULOS, G. AND PNEVMATIKATOS, D. 2005. Hashing + memory = low cost, exact pattern matching. In Proceedings of the International Conference on Field Programmable Logic and Applications. Springer, Berlin, Germany.

PARK, J. H. AND GEORGE, K. M. Parallel string matching algorithms based on dataflow. In Proceedings of the 32nd Annual Hawaii International Conference on System Sciences. IEEE, Los Alamitos, CA.

RAFFINOT, M. 1997. On the multi backward dawg matching algorithm (MultiBDM). In Proceedings of the 4th South American Workshop on String Processing.

SASTRY, R., RANGANATHAN, N., AND REMEDIOS, K. 1995. CASM: a VLSI chip for approximate string matching. IEEE Trans. Pattern Anal. Mach. Intell. 17.

SIDHU, R. AND PRASANNA, V. 2001. Fast regular expression matching using FPGAs. In Proceedings of the IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM’01). IEEE, Los Alamitos, CA.

SNORT. 2006. Snort: The Open Source Network Intrusion Detection System. http://www.snort.org.

SOURDIS, I. AND PNEVMATIKATOS, D. 2003. Fast, large-scale string match for a 10Gbps FPGA-based network intrusion detection system. Lecture Notes in Computer Science, vol. 2778.

SOURDIS, I. AND PNEVMATIKATOS, D. 2004. Pre-decoded CAMs for efficient and high-speed NIDS pattern matching. In Proceedings of the 12th Annual IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM’04). IEEE, Los Alamitos, CA.

SOURDIS, I., PNEVMATIKATOS, D., WONG, S., AND VASSILIADIS, S. 2005. A reconfigurable perfect-hashing scheme for packet inspection. In Proceedings of the International Conference on Field Programmable Logic and Applications. Springer, Berlin, Germany.

SPAMASSASSIN. 2006. The Apache SpamAssassin Project. http://spamassassin.apache.org/

SQUIDGUARD. 2006. SquidGuard filter. http://www.squidguard.org/.

TAN, L. AND SHERWOOD, T. 2005. A high throughput string matching architecture for intrusion detection and prevention. In Proceedings of the 32nd Annual International Symposium on Computer Architecture (ISCA’05). ACM, New York.

TRIPP, G. 2005. A finite-state-machine based string matching system for intrusion detection on high-speed network. In Proceedings of the EICAR Conference. IEEE, Los Alamitos, CA, 26–40.

TUCK, N., SHERWOOD, T., CALDER, B., AND VARGHESE, G. 2004. Deterministic memory-efficient string matching algorithms for intrusion detection. In Proceedings of the IEEE INFOCOM Conference. IEEE, Los Alamitos, CA.

WU, S. AND MANBER, U. 1992. Fast text searching allowing errors. Comm. ACM 35, 83–91.

Received May 2006; revised March 2007, June 2007; accepted August 2007
