4.3 Regression Testing Framework
4.3.3 More Functionalities
As a regression testing framework, our platform aims to provide as many useful functionalities as possible:
• Showing usable test cases: We can list the test cases that have been built in our system (Code 2).
• Selecting specific test cases: When building a test case, we can tag it with attributes. Then we can select a group of test cases by these attributes and perform actions on them (Code 3).
• Showing tested results: We can review the results of any test case that has been run before (Code 4).
• Regression testing wrapper: When some part of the system is modified, we may want to run a test to see what is affected by the change. Thus we can use wrapper code to handle this situation (Code 5).
Code 1: A sample test case: test1.py
    # Import the required module
    import CRF

    # Create a new CRF test and give the attributes required for this test
    test1 = CRF.test.Test('test1',
        {
            'test_type': 'exploitgen',
            'image_id': 'Windows_XP_0',
            'crash_id': 'CVE-2010-3333_DOC_2011-01-06.doc',
            'target_id': 'Office_2003',
            'autoit_id': 'office_1',
            'channel_port': 10354,
            'symexec_offset': 35650,
            'shellcode_id': 7
        })
Code 2: Showing test cases example
    # List all test cases under a directory
    CRF.admin.listAllCase('/home/data/tests/windows')
We list the basic types in CRAXUnit in Table 4, and the APIs currently provided in CRAXUnit are listed in Table 5.
Table 4: Basic type in CRAXUnit
Type name  Description
Test       A runnable test case including required information
Set        A collection of Test, but not runnable
TestSet    A runnable test set composed of a Set including required information
Code 3: Selecting test cases example
    # Select every test case under a directory as a set
    set0 = CRF.test.selectAll('/home/data/tests/windows')
    # Select by tag using AND logic
    set1 = set0.selectByTagAnd('Windows', 'Office')
    # Select by tag using OR logic
    set2 = set1.selectByTagOr('Windows', 'Office')
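The difference between the AND and OR selections can be illustrated with a minimal sketch. The classes `Case` and `CaseSet` below are hypothetical stand-ins written for this illustration, not the CRF implementation; we only assume that each test case carries a set of tags and that a selection returns a new set.

```python
class Case:
    """Hypothetical stand-in for a test case: a name plus a set of tags."""
    def __init__(self, name, tags):
        self.name = name
        self.tags = set(tags)


class CaseSet:
    """Hypothetical stand-in for the Set type: a non-runnable collection."""
    def __init__(self, cases):
        self.cases = list(cases)

    def selectByTagAnd(self, *tags):
        # Keep only the cases that carry every requested tag.
        wanted = set(tags)
        return CaseSet(c for c in self.cases if wanted <= c.tags)

    def selectByTagOr(self, *tags):
        # Keep the cases that carry at least one requested tag.
        wanted = set(tags)
        return CaseSet(c for c in self.cases if wanted & c.tags)

    def names(self):
        return [c.name for c in self.cases]


# Assumed example data; the tags are illustrative only.
all_cases = CaseSet([
    Case('test1', ['Windows', 'Office']),
    Case('test2', ['Windows', 'Coolplayer']),
    Case('test3', ['Linux']),
])
both = all_cases.selectByTagAnd('Windows', 'Office')
either = all_cases.selectByTagOr('Windows', 'Office')
print(both.names())    # only test1 carries both tags
print(either.names())  # test1 and test2 carry at least one tag
```

Because each selection returns a new set, selections can be chained as in Code 3, where the OR selection is applied to the result of the AND selection.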
Code 4: Showing test results example
    import time

    # Build the two ends of the date range as timestamps
    date1 = time.mktime((2012, 12, 21, 0, 0, 0, 0, 0, 0))
    date2 = time.mktime((2013, 2, 1, 0, 0, 0, 0, 0, 0))

    # Show all results for a specific test
    CRF.admin.showTestResults('test1')
    # Restrict the results to a date range
    CRF.admin.showTestResults('test1', date1, date2)
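The date-range restriction in Code 4 can be sketched as a simple filter over stored results. The record layout, the `RESULTS` list, and the function `filter_results` are assumptions made for this illustration; how CRAXUnit actually stores its results is not described here.

```python
import time


def stamp(year, month, day):
    """Build a timestamp the same way Code 4 does."""
    return time.mktime((year, month, day, 0, 0, 0, 0, 0, 0))


# Hypothetical stored results: (test name, finish time, outcome).
RESULTS = [
    ('test1', stamp(2012, 12, 1), 'fail'),
    ('test1', stamp(2013, 1, 15), 'pass'),
    ('test2', stamp(2013, 1, 20), 'pass'),
]


def filter_results(records, name, start=None, end=None):
    """Return the records of one test, optionally limited to [start, end]."""
    selected = []
    for test_name, finished_at, outcome in records:
        if test_name != name:
            continue
        if start is not None and finished_at < start:
            continue
        if end is not None and finished_at > end:
            continue
        selected.append((test_name, finished_at, outcome))
    return selected


date1 = stamp(2012, 12, 21)
date2 = stamp(2013, 2, 1)
# Without a range, both test1 records are returned.
print(len(filter_results(RESULTS, 'test1')))
# With the range, only the January 2013 record remains.
print([record[2] for record in filter_results(RESULTS, 'test1', date1, date2)])
```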
Code 5: An example of testing wrapper
    # Select the test cases to be rerun
    set0 = CRF.test.selectAll('/home/data/tests/windows')
    set1 = set0.selectByTagAnd('Windows', 'Office')

    # Build a runnable test set with new parameter values and run it
    test2 = CRF.test.TestSet(set1,
        {
            'symexec_offset': 21480,
            'shellcode_id': 7
        }
    ).run()
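One plausible reading of Code 5 is that the dictionary passed to TestSet replaces the matching attributes of each selected test case for that run only. The sketch below illustrates this reading with plain dictionaries; the function `effective_attributes` is hypothetical and the merge rule is an assumption, not the documented behaviour of CRF.

```python
def effective_attributes(case_attributes, overrides):
    """Return a new dict in which override values replace the stored ones."""
    merged = dict(case_attributes)  # copy, so the stored test case is untouched
    merged.update(overrides)
    return merged


# Attributes of test1 as given in Code 1.
stored = {
    'test_type': 'exploitgen',
    'image_id': 'Windows_XP_0',
    'crash_id': 'CVE-2010-3333_DOC_2011-01-06.doc',
    'target_id': 'Office_2003',
    'autoit_id': 'office_1',
    'channel_port': 10354,
    'symexec_offset': 35650,
    'shellcode_id': 7,
}
# Parameters supplied by the wrapper in Code 5.
overrides = {'symexec_offset': 21480, 'shellcode_id': 7}

merged = effective_attributes(stored, overrides)
print(merged['symexec_offset'])  # value used for this run
print(stored['symexec_offset'])  # value kept in the test case
```

Copying before merging keeps the stored test case unchanged, so the same case can be rerun later with its original parameters.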
Table 5: APIs in CRAXUnit
Class          Method           Description
test           Test             Create a new test case
test           TestSet          Create a new test set with a set
test           selectAll        Select all test cases as a set
test.set       selectByTagAnd   Select a set by tag using AND logic
test.set       selectByTagOr    Select a set by tag using OR logic
test.testCase  run              Run a test case
test.testCase  setTag           Set tags on a test case
test.testSet   run              Run each test case in a test set
admin          listAllCase      List all test cases
admin          showTestResults  Show tested results
Chapter 5 Results
With the goal of building a reusable software testing method for QEMU-based systems, we implemented a regression testing framework that can easily create, manage, and run tests. We also created a database of exploitable test cases for the framework to use. Thus, we can test software and verify our modifications to the exploit generation method efficiently.
In this chapter, we show the experimental results of our implementation and compare it with similar works.
5.1 Regression Testing Framework for QEMU-based System
The regression testing framework provides the following functionalities:
• Test case creation, management, and running
We provide a series of APIs that let testers manipulate test cases. Through these APIs, one can easily create tests by providing the required entries, and later classify them with different tags according to one's purpose. Test case maintainers can list tests by their tags or other attributes, or view previous test results under given constraints. Finally, and most importantly for a regression testing framework, one can efficiently run a batch of test cases on demand, optionally providing different arguments to change the testing method.
• Automated testing procedure
The past method for testing on QEMU-based systems needs human intervention in the guest OS. We replace the manual GUI operations with programmable GUI automation scripts and a communication channel that sends control messages from the host to the guest. As a result, we can now run through the whole testing procedure without entering the guest OS manually. It is a more efficient and innovative way to do the testing.
• Regression testing ability: parameterizable
Our framework preserves the flexibility for testers to customize their tests. The testing target, testing environment, S2E version, shellcode, and symbolic offset are all parameterizable. This feature gives the framework the ability to do regression testing, both for verifying previous test cases and for testing future development.
• Flexible extensibility: extending connection channel commands
We use a connection channel with a self-defined communication protocol to handle the actions taken on the guest OS and the host OS, and we provide a series of commands to control some common actions. Developers can extend the protocol by simply adding their own commands and the corresponding actions on the guest OS; the framework will then be able to do more extensive testing.
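The idea of extending the channel by adding a command together with its action can be sketched as a small command registry. Everything in the sketch is hypothetical: the message format (`<command> <argument>`), the command names, and the `command` decorator are assumptions for illustration, since the actual protocol of the framework is not specified here.

```python
# Registry that maps a command name to the action taken on the guest side.
COMMANDS = {}


def command(name):
    """Decorator that registers a handler under a command name."""
    def register(handler):
        COMMANDS[name] = handler
        return handler
    return register


@command('ping')
def ping(argument):
    # Built-in example: confirm that the guest daemon is alive.
    return 'pong'


@command('echo')
def echo(argument):
    # Built-in example: send the argument back unchanged.
    return argument


def dispatch(message):
    """Parse '<command> <argument>' and call the matching handler."""
    name, _, argument = message.partition(' ')
    handler = COMMANDS.get(name)
    if handler is None:
        return 'error: unknown command ' + name
    return handler(argument)


# A developer extends the protocol by registering one more command.
@command('upper')
def upper(argument):
    return argument.upper()


print(dispatch('ping'))
print(dispatch('upper hello guest'))
print(dispatch('reboot'))
```

With this design, adding a command does not require changing the dispatcher, which matches the extension style described above.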
5.2 Testing Bran Database
In our framework, we have rebuilt a test case database from the existing exploitable tests. The clutter that we previously had to handle manually, such as image files, configurations, and crash inputs, is no longer needed. We can run a test case simply by using the database to select what needs to be tested.
The exploitable test cases built in our database are shown in Table 6.
5.3 Improvement
To evaluate our method, we compare our testing framework with other frameworks and with the past method to show what functionalities our platform provides.
Table 6: Exploitable Test Case in Database
#  OS          Software      Crash Input                       Image
1  Windows XP  Office 2003   CVE-2010-3333_DOC_2011-01-06.doc  Windows_XP_0
2  Windows XP  Coolplayer    CVE-2008-3408.mp3                 Windows_XP_1
3  Windows XP  Distiny       CVE-2009-3429.mp3                 Windows_XP_2
4  Windows XP  Dizzy         EDB-ID-15566.mp3                  Windows_XP_2
5  Windows XP  GAlan         OSVDB-ID-60897.galan              Windows_XP_2
6  Windows XP  GSPlayer      OSVDB-ID-69006.mp3                Windows_XP_1
7  Windows XP  MPlayer       EDB-ID-17013.mp3                  Windows_XP_1
8  Windows XP  Foxit Reader  OSVDB-ID-68648.pdf                Windows_XP_3
5.3.1 Time and Efficiency Comparison
The initial motivation of our work is to build a framework that provides automatic regression testing without human intervention. Table 7 and Table 8 compare testing with the past method and testing with our CRAXUnit framework.
Table 7: Comparison of running one test
Method           Past method                                Using CRAXUnit
Time consumed    10 minutes                                 8 minutes
Operation type   human monitoring / operation all the time  just one command
Tunable factors  manual                                     parameterizable input
Table 8: Comparison of running a testing benchmark
Method                   Past method                                Using CRAXUnit
Number of test cases     7                                          7
Time consumed            1 hour 23 minutes                          45 minutes
Operation type           human monitoring / operation all the time  just one command
Involved configurations  7                                          none (set in test cases)
Tunable factors          manual                                     parameterizable input
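The time figures in Table 7 and Table 8 can be turned into relative savings with a short calculation. The helper `time_reduction_percent` is introduced here only for illustration; the inputs are the minutes reported in the two tables.

```python
def time_reduction_percent(past_minutes, new_minutes):
    """Return the share of time saved relative to the past method, in percent."""
    return (past_minutes - new_minutes) / past_minutes * 100.0


# Table 7: one test, 10 minutes with the past method, 8 with CRAXUnit.
single_test = time_reduction_percent(10, 8)
# Table 8: seven test cases, 1 hour 23 minutes versus 45 minutes.
benchmark = time_reduction_percent(60 + 23, 45)

print(f"single test: {single_test:.1f}% less time")
print(f"benchmark:   {benchmark:.1f}% less time")
```

The saving is about 20% for a single test and about 46% for the seven-case benchmark, so the relative saving is larger when a batch of test cases is run.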
5.3.2 Feature Comparison
Table 9 compares the related implementations mentioned previously with our framework. Our platform, CRAXUnit, supports almost all QEMU-based system testing, and its functionalities can easily be extended by writing plug-ins in Python. Therefore, through its extensibility, CRAXUnit can be used not only for fault testing but also for security testing, benchmark testing, and so on. Moreover, our framework accepts parameterizable inputs, so users do not need to generate another test case when they only want to tune some factors.
Table 9: Comparison between Frameworks
Framework              D-Cloud        ETICS          CRAXUnit
Image Mgmt.            yes            -              yes
Test case Mgmt.        yes            yes            yes
Parameterizable input  -              -              yes
Target type            Windows/Linux  Windows/Linux  QEMU-based system
Guest operations       -              -              yes
Distributed testing    yes            yes            yes∗
Test application type  fault testing  fault testing  fault, security, benchmark testing
Extensibility          modify QEMU    -              Python plug-in
∗Note: S2E only supports one host testing currently.
Chapter 6
Conclusion and Further Work
In this final chapter, we summarize our work and list some further work to refine our method.
6.1 Conclusion
In this thesis, we implemented a regression testing framework for software testing on QEMU-based systems. It is applicable to testing on Windows, Unix-like systems, and Android, and to testing targets such as fuzzers, malware, and embedded systems.
Our framework provides an innovative method for software testing. With past methods, tests that involve GUI operations usually need human intervention, and testing based on QEMU also needs a person to enter the guest OS and operate within the guest environment. Our method automates the whole testing procedure, without any manual operations, by establishing a channel for control between the guest and host OS.
As a well-formed framework, it also provides APIs for creating, managing, and running test cases. Tests can easily be operated and repeated through these APIs, and test results are managed so that they can be selected and reviewed at any time. The framework architecture also allows users to do more types of testing by simply extending its functionalities if the built-in ones are not sufficient.
In order to run experiments on our framework, we also rebuilt an exploitable test case database, transformed from previous manual test cases. Some scenarios that have not yet been exploited successfully have also been added as test cases in our database, awaiting future improvements to the method.
6.2 Further Work
Here we propose some further work that could refine our framework:
• Environment setup and GUI automation method improvements
The initial disk image files used by our framework require users to customize the environments and install a daemon for the framework. Since these procedures take a long time in the test case preparation stage, it would be better to have an automatic initialization method.
In addition, the GUI automation scripts we use now are written case by case. To quickly build a large number of test cases, we can take one basic script and adapt it to each case, but this is still not efficient. In the long term, a more efficient method is needed to solve the GUI automation problem.
• Running queue management
Although we use cloud management techniques to handle the underlying hardware resources, the framework can still deploy only one testing instance at a time, because it does not provide parallel APIs. A test is deployed only after the previous one has terminated. Users who want to run several tests in parallel need to handle concurrency themselves, by using the fork system call or by issuing several commands in succession.
It would be a good enhancement for the framework to support running queue management. Users could add test tasks to a queue, and the tasks could be issued automatically depending on hardware resources or at the users' decision.
• Testing ability for distributed computing
The current architecture of S2E supports only single-machine software testing. Since we use S2E as our core element, the range of testable targets is restricted as well.
To support distributed targets, our framework should be able to boot several testing instances at once and to let co-working guests communicate. The information industry is trending toward distributed computing, and more and more software is starting to support distributed co-working. If our system can support real distributed system testing, its testing coverage will become more comprehensive.
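As an illustration of the running queue management proposed above, the following minimal sketch issues queued test tasks while limiting how many run at once. It uses Python's standard `concurrent.futures` thread pool; the function `run_test` is a placeholder for deploying a real testing instance, and the limit `max_instances` stands in for the available hardware resources. Both are assumptions and not part of the current framework.

```python
from concurrent.futures import ThreadPoolExecutor


def run_test(name):
    """Placeholder for deploying and running one testing instance."""
    return name + ': done'


def run_queue(test_names, max_instances):
    """Run the queued tests with at most max_instances running at once."""
    with ThreadPoolExecutor(max_workers=max_instances) as pool:
        # map() returns the results in the order the tasks were queued.
        return list(pool.map(run_test, test_names))


queued = ['test1', 'test2', 'test3']
finished = run_queue(queued, max_instances=2)
print(finished)
```

A real implementation would also need to track the state of each guest instance, which this sketch leaves out.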