
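This thesis applies sequential pattern mining to the dynamic execution traces of OpenSSH, supplemented by how the code changed across the OpenSSH revision history, to describe the relationships among program elements. Support is used to select patterns that occur often enough, and Confidence then determines the quality of each pattern, which falls into one of three classes:

• Frequent: the pattern has no exceptions at all during execution and can serve as a usage recommendation.

• Potential Error: the pattern has a few exceptions during execution; these exceptions can be inspected for debugging.

• Unlikely: the pattern has many exceptions during execution and may not actually exist.

Finally, Variant is used to compare the importance of different patterns, as well as the importance of the individual units within a pattern.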
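The three quality classes above can be sketched in C as follows. This is a minimal illustration, not the thesis implementation: the definition of confidence as the fraction of occurrences that follow the pattern, the two threshold values, and the names `classify`, `MIN_SUPPORT`, and `MIN_CONFIDENCE` are all assumptions made for the example.

```c
#include <assert.h>

/* Quality classes named in the text, plus a value for patterns
 * that Support filters out. */
typedef enum { NOT_SELECTED, FREQUENT, POTENTIAL_ERROR, UNLIKELY } Quality;

/* Hypothetical thresholds; no concrete values are given in this chapter. */
#define MIN_SUPPORT 10      /* minimum occurrences for a pattern to be kept */
#define MIN_CONFIDENCE 0.9  /* below this, the exceptions count as "many" */

/* Assumed definition: fraction of occurrences that follow the pattern. */
double confidence(unsigned followed, unsigned exceptions)
{
    assert(followed + exceptions > 0);
    return (double)followed / (double)(followed + exceptions);
}

Quality classify(unsigned followed, unsigned exceptions)
{
    if (followed < MIN_SUPPORT)
        return NOT_SELECTED;      /* does not occur often enough */
    if (exceptions == 0)
        return FREQUENT;          /* no exceptions at all */
    if (confidence(followed, exceptions) >= MIN_CONFIDENCE)
        return POTENTIAL_ERROR;   /* a few exceptions, worth inspecting */
    return UNLIKELY;              /* many exceptions */
}
```

Under these assumed thresholds, a pattern followed 99 times and violated once would be reported as a Potential Error, while one followed 50 times and violated 50 times would be reported as Unlikely.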

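This study makes two main contributions. First, in contrast to the apriori-based methods used in other work, the sequential pattern approach determines the order in which the elements of a pattern occur. Second, it introduces the notion of change history, which allows patterns to be described more precisely and highlights error-prone locations.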
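The difference between an unordered (itemset-style) match and an ordered (sequential) match can be illustrated with a small C sketch. The function names `contains_unordered` and `contains_ordered`, and the representation of a trace as an array of call names, are assumptions made for this example and are not taken from the thesis.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Itemset view: true if every pattern element occurs somewhere in the
 * trace, regardless of order. */
bool contains_unordered(const char *const trace[], size_t n,
                        const char *const pattern[], size_t m)
{
    assert(trace != NULL && pattern != NULL);
    for (size_t j = 0; j < m; j++) {
        bool found = false;
        for (size_t i = 0; i < n && !found; i++)
            found = strcmp(trace[i], pattern[j]) == 0;
        if (!found)
            return false;
    }
    return true;
}

/* Sequential view: true if the pattern occurs in the trace as a
 * subsequence, i.e. in the same order, with gaps allowed. */
bool contains_ordered(const char *const trace[], size_t n,
                      const char *const pattern[], size_t m)
{
    assert(trace != NULL && pattern != NULL);
    size_t j = 0;
    for (size_t i = 0; i < n && j < m; i++)
        if (strcmp(trace[i], pattern[j]) == 0)
            j++;
    return j == m;
}
```

For a made-up trace `open, read, close`, the pattern `close, open` is accepted by the unordered check but rejected by the ordered one, which is the distinction the first contribution relies on.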

5.1. Comparison with Related Work

| Work | Analyzed entity | Analysis scope | Characteristics |
|---|---|---|---|
| Guide Software Changes (2004) | | Each revision diff | Less dedicated program analysis |
| System Specific Rule (2005) | Procedure call | 10 lines in a revision | Pairwise pattern; static analysis |
| Matching Method Calls (2005) | Procedure call | Each revision diff | Pairwise pattern; dynamic analysis |
| PR-Miner (2005) | Procedure call and variable | Intra-procedure | Static analysis |
| Our work | Procedure call and variable | Intra-procedure | Dynamic analysis; ordered pattern |

5.2. Future Improvements

5.2.1. Detecting Aliased Variables

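Pointers are convenient to use and make C a powerful language, but they also cause considerable difficulty for program analysis. Consider the following three lines of C: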

int value = 3;

int *flag1 = &value;

int *flag2 = flag1;

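After execution, one memory cell holds the value 3, and both flag1 and flag2 point to its address. Modifying the cell through flag1 also changes the value that flag2 points to, and vice versa, so flag1 and flag2 mean the same thing here. In this thesis, however, flag1 and flag2 are treated as different variables. For example,

foo(flag1);

and

foo(flag2);

are counted as different patterns. This can cause false negatives, that is, rules that exist but are overlooked and never reported. Because precise pointer alias analysis is intractable in general (the problem is NP-hard), adding an approximate pointer alias analysis algorithm should partially resolve this issue.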
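One possible approximation can be sketched in C. It is a flow-insensitive, unification-based scheme that merges two pointer variables into one alias class whenever one is copied to the other. It is not the method used in this thesis. The integer variable ids, `MAX_VARS`, and the functions `alias_init`, `alias_find`, and `alias_assign` are hypothetical names introduced for the example.

```c
#include <assert.h>

#define MAX_VARS 64                /* hypothetical upper bound on variables */

enum { FLAG1, FLAG2, OTHER };      /* hypothetical ids for three pointers */

static int parent[MAX_VARS];       /* union-find forest over variable ids */

/* Start with every variable in its own alias class. */
void alias_init(void)
{
    for (int i = 0; i < MAX_VARS; i++)
        parent[i] = i;
}

/* Return the representative of the alias class containing v. */
int alias_find(int v)
{
    assert(v >= 0 && v < MAX_VARS);
    while (parent[v] != v) {
        parent[v] = parent[parent[v]];   /* path halving */
        v = parent[v];
    }
    return v;
}

/* Record a pointer copy from src to dst by merging their classes. */
void alias_assign(int dst, int src)
{
    parent[alias_find(dst)] = alias_find(src);
}
```

If calls were keyed by `alias_find(arg)` rather than by the variable name, foo(flag1) and foo(flag2) would fall into the same pattern once the copy from flag1 to flag2 has been recorded. Because the scheme ignores control flow, it may also merge variables that never alias at a given program point, so it trades some precision for simplicity.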

5.2.2. Improving Change Attribution

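When processing revision data, this study uses only one-line fixes (one-line check-ins). If an important group of functions has never been used incorrectly, or if every fix to it happens to span two or more lines, our approach cannot reveal the importance of those functions. A finer-grained algorithm that characterizes the nature of each revision would allow the variability or importance of each pattern to be described more precisely.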
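As a rough illustration of the one-line restriction, the following C sketch decides whether a unified diff qualifies as a one-line check-in. The criterion used here (at most one added and one removed line, and at least one change), the assumption that the input is a unified diff of a single file, and the names `count_changes` and `is_one_line_checkin` are all assumptions made for the example. The thesis does not state its exact filtering rule in this chapter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Count added and removed lines in a single-file unified diff.
 * Lines before the first "@@" hunk header, such as the "---" and "+++"
 * file headers, are skipped. */
void count_changes(const char *diff, unsigned *added, unsigned *removed)
{
    assert(diff != NULL && added != NULL && removed != NULL);
    bool in_hunk = false;
    *added = 0;
    *removed = 0;
    for (const char *line = diff; *line != '\0'; ) {
        if (strncmp(line, "@@", 2) == 0)
            in_hunk = true;
        else if (in_hunk && line[0] == '+')
            (*added)++;
        else if (in_hunk && line[0] == '-')
            (*removed)++;
        const char *nl = strchr(line, '\n');
        if (nl == NULL)
            break;                 /* last line has no trailing newline */
        line = nl + 1;
    }
}

/* Assumed criterion for a one-line check-in. */
bool is_one_line_checkin(const char *diff)
{
    unsigned added, removed;
    count_changes(diff, &added, &removed);
    return added <= 1 && removed <= 1 && added + removed > 0;
}
```

A filter of this kind makes the limitation concrete: any fix that touches two or more lines is discarded, so the functions involved contribute nothing to the change statistics.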

5.2.3. Exploiting Data Dependency

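This study expresses data dependency only through variable renaming, and this simple approach falls short of our needs. If the relationships between functions could be exposed through the variables they share and combined with the sequential mining method described earlier, it would help to find patterns with low occurrence counts and reduce the chance of reporting false patterns.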

References

[1] Y. Shigio, "GNU GLOBAL source code tag system," http://www.gnu.org/software/global/

[2] Z. Li and Y. Zhou, "PR-miner: Automatically extracting implicit programming rules and detecting violations in large software code," in ESEC/FSE-13: Proceedings of the 10th European Software Engineering Conference Held Jointly with 13th ACM SIGSOFT International Symposium on Foundations of Software Engineering, 2005, pp. 306-315.

[3] B. Livshits and T. Zimmermann, "Locating matching method calls by mining revision history data," in Workshop on the Evaluation of Software Defect Detection Tools, 2005.

[4] I. Neamtiu, J. S. Foster and M. Hicks, "Understanding source code evolution using abstract syntax tree matching," in MSR '05: Proceedings of the 2005 International Workshop on Mining Software Repositories, 2005.

[5] C. C. Williams and J. K. Hollingsworth, "Recovering system specific rules from software repositories," in MSR '05: Proceedings of the 2005 International Workshop on Mining Software Repositories, 2005.

[6] T. Zimmermann, P. Weißgerber, S. Diehl and A. Zeller, "Mining version histories to guide software changes," in ICSE '04: Proceedings of the 26th International Conference on Software Engineering, 2004, pp. 563-572.

[7] R. Purushothaman and D. Perry, "Towards understanding the rhetoric of small changes," in 2004, pp. 90-94.

[8] "OpenSSH," http://www.openssh.com/, May 2006.

[9] D. Kramer, "API documentation from source code comments: A case study of javadoc," in SIGDOC '99: Proceedings of the 17th Annual International Conference on Computer Documentation, 1999, pp. 147-153.

[10] A. Zeller, "Configuration Management with Version Sets," Abteilung Softwaretechnologie, Technische Universität Braunschweig, Braunschweig, 1997.

[11] CollabNet, "subversion", http://subversion.tigris.org/

[12] S. Huang and K. Liu, "Mining version histories to verify the learning process of legitimate peripheral participants," in MSR '05: Proceedings of the 2005 International Workshop on Mining Software Repositories, 2005.

[13] D. Cubranic and G. C. Murphy, "Hipikat: Recommending pertinent software development artifacts," in ICSE '03: Proceedings of the 25th International Conference on Software Engineering, 2003, pp. 408-418.

[14] T. Zimmermann and P. Weißgerber, "Preprocessing CVS data for fine-grained analysis," in MSR 2004: International Workshop on Mining Software Repositories, 2004.

[15] I. Sommerville, Software Engineering, 7th ed. Boston, MA, USA: Addison-Wesley Longman Publishing Co., Inc., 2004.

[16] J. Whaley, M. C. Martin and M. S. Lam, "Automatic extraction of object-oriented component interfaces," in ISSTA '02: Proceedings of the 2002 ACM SIGSOFT International Symposium on Software Testing and Analysis, 2002, pp. 218-228.

[17] S. Konrad and B. H. C. Cheng, "Requirements patterns for embedded systems," in Proceedings of the IEEE Joint International Conference on Requirements Engineering, 2002, pp. 127-136.

[18] Z. Balanyi and R. Ferenc, "Mining design patterns from C++ source code," in Proceedings of the International Conference on Software Maintenance, 2003, pp. 305-314.

[19] J. W. Nimmer and M. D. Ernst, "Automatic generation of program specifications," ACM SIGSOFT Software Engineering Notes, vol. 27, pp. 229-239, 2002.

[20] R. Kollmann, P. Selonen, E. Stroulia, T. Systa and A. Zundorf, "A study on the current state of the art in tool-supported UML-based static reverse engineering," in Proceedings of the Ninth Working Conference on Reverse Engineering, 2002, pp. 22-32.

[21] L. C. Briand, Y. Labiche and Y. Miao, "Towards the reverse engineering of UML sequence diagrams," in WCRE '03: Proceedings of the 10th Working Conference on Reverse Engineering, 2003, pp. 57.

[22] A. Zeller and D. Lutkehaus, "DDD - a free graphical front-end for UNIX debuggers," ACM SIGPLAN Notices, vol. 31, pp. 22-27, 1996.

[23] B. Demsky and M. Rinard, "Data structure repair using goal-directed reasoning," in Proceedings of the 27th International Conference on Software Engineering, 2005, pp. 176-185.

[24] J. Cordy, "TXL - A Language for Programming Language Tools and Applications," in Proc. 4th Int. Workshop on Language Descriptions, Tools and Applications, Electronic Notes in Theoretical Computer Science, vol. 110, pp. 3-31, 2004.

[25] G. C. Necula, S. McPeak, S. P. Rahul and W. Weimer, "CIL: Intermediate language and tools for analysis and transformation of C programs," Conference on Compiler Construction, 2002.

[26] J. Han and M. Kamber, Data Mining: Concepts and Techniques. Morgan Kaufmann, 2000.

[27] A. Michail, "Data mining library reuse patterns using generalized association rules," in International Conference on Software Engineering, 2000, pp. 167-176.

[28] A. Michail, "Data mining library reuse patterns in user-selected applications," in 14th IEEE International Conference on Automated Software Engineering, 1999, pp. 24-33.

[29] H. Mannila, H. Toivonen and A. I. Verkamo, "Discovering frequent episodes in sequences," KDD, pp. 210-215, 1995.

[30] "CVS - open source version control," http://www.nongnu.org/cvs/

[31] John Polstra, "CVSup," http://www.cvsup.org/

[32] K. CL, "SVN-Mirror," http://search.cpan.org/~clkao/SVN-Mirror-0.68/

[33] P. Godefroid, N. Klarlund and K. Sen, "DART: Directed automated random testing," in PLDI '05: Proceedings of the 2005 ACM SIGPLAN Conference on Programming Language Design and Implementation, 2005, pp. 213-223.
