In this chapter we first provide background knowledge of image comparison methods used in automated GUI testing. We then present a survey of related work on automated GUI testing techniques and Android-based automated GUI testing tools.
2.1 Image comparison techniques
Histogram comparison
A color histogram is a representation of the color distribution of an image (i.e., the number of pixels in the image that have each color). Color histogram comparison extracts and compares the color histograms of two images. If the histograms are similar, the images are considered similar. Although this is an efficient way to compare images, it is sensitive to changes in lighting conditions and is not aware of the contents of an image. In other words, two completely different images will be considered as having similar contents if they have similar histograms [11].
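The weakness noted above can be illustrated with a minimal sketch. It assumes images are given as flat lists of 8-bit RGB tuples and uses histogram intersection as the similarity metric; the function names, the bin count, and the choice of metric are illustrative assumptions, not the implementation used in this work.

```python
from collections import Counter

def color_histogram(pixels, bins_per_channel=4):
    """Return the fraction of pixels per quantized (r, g, b) bin.

    pixels is a flat list of 8-bit RGB tuples (an assumed input format).
    """
    step = 256 // bins_per_channel
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    total = len(pixels)
    return {bin_: count / total for bin_, count in counts.items()}

def histogram_intersection(h1, h2):
    """Similarity in [0, 1]; 1.0 means the normalized histograms are identical."""
    return sum(min(h1.get(b, 0.0), h2.get(b, 0.0)) for b in set(h1) | set(h2))

# Three tiny 2x2 "images" stored as flat pixel lists (made-up data).
red, blue = (255, 0, 0), (0, 0, 255)
image_a = [red, red, blue, blue]   # red on top, blue below
image_b = [blue, red, blue, red]   # same colors, different layout
image_c = [red, red, red, red]     # different color distribution

similarity_ab = histogram_intersection(color_histogram(image_a), color_histogram(image_b))
similarity_ac = histogram_intersection(color_histogram(image_a), color_histogram(image_c))
```

Here `image_a` and `image_b` show different layouts but receive the maximum similarity because their color distributions are identical, while `image_c` scores lower.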
SURF: Speeded Up Robust Features
SURF is a “scale- and rotation-invariant interest point detector and descriptor” [12]. In computer vision, an interest point detector is used to detect parts of an image that can be used to uniquely describe it. An interest point, also called a feature or key point, has many properties, but the most important one is its repeatability, which means that it must be reliably computed under different conditions (e.g., changes in size, rotation, etc.). After an interest point has been detected, an interest point descriptor characterizes it: the descriptor is a feature vector computed from the neighborhood of the interest point. Building on these properties, SURF-based image comparison first extracts the interest points of the two images being compared. It then matches the descriptors of both images. Finally, image similarity is measured according to the number of matches.
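The matching and counting steps can be sketched as follows. This is not SURF itself: detection and description are assumed to have been done already, and the descriptors are short made-up vectors. The nearest-neighbor ratio test and the `ratio` value are common choices taken here as assumptions, and all names are hypothetical.

```python
import math

def match_descriptors(desc_a, desc_b, ratio=0.75):
    """Match each descriptor in desc_a to its nearest neighbor in desc_b.

    A match is kept only if the nearest neighbor is clearly closer than the
    second nearest one (ratio test), which discards ambiguous matches.
    """
    matches = []
    for i, da in enumerate(desc_a):
        ranked = sorted((math.dist(da, db), j) for j, db in enumerate(desc_b))
        if len(ranked) >= 2 and ranked[0][0] < ratio * ranked[1][0]:
            matches.append((i, ranked[0][1]))
    return matches

def similarity(desc_a, desc_b):
    """Fraction of descriptors in desc_a that found an unambiguous match."""
    if not desc_a:
        return 0.0
    return len(match_descriptors(desc_a, desc_b)) / len(desc_a)

# Made-up 3-dimensional descriptors; real descriptors are much longer vectors.
expected = [(0.0, 0.0, 1.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
observed = [(0.9, 0.1, 0.0), (0.0, 0.1, 0.9), (1.0, 1.0, 1.0)]
matches = match_descriptors(expected, observed)
```

In this data the first two expected descriptors each have one clear counterpart, while the third is equally far from two candidates and is rejected by the ratio test.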
Template matching
Template matching is a method used to find a small image (template) in a larger image (source). This is done by sliding the template over the source image pixel by pixel; at every position a metric is calculated to determine how good the match is. After all metrics are calculated, the best match can be selected. Depending on the metric used, the best match may be the highest or the lowest calculated value [13].
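The sliding procedure can be sketched in a few lines, assuming grayscale images stored as lists of rows and the sum of squared differences (SSD) as the metric. With SSD the best match is the lowest value; a correlation-based metric would instead select the highest. The function name and the sample data are illustrative.

```python
def match_template(source, template):
    """Slide template over source and return (row, col, score) of the best match.

    The score is the sum of squared differences, so lower is better.
    """
    src_h, src_w = len(source), len(source[0])
    tpl_h, tpl_w = len(template), len(template[0])
    best = None
    for row in range(src_h - tpl_h + 1):
        for col in range(src_w - tpl_w + 1):
            score = sum(
                (source[row + dy][col + dx] - template[dy][dx]) ** 2
                for dy in range(tpl_h)
                for dx in range(tpl_w)
            )
            if best is None or score < best[2]:
                best = (row, col, score)
    return best

# Made-up 4x4 source image containing the 2x2 template at row 1, column 1.
source = [
    [0, 0, 0, 0],
    [0, 9, 8, 0],
    [0, 7, 9, 0],
    [0, 0, 0, 0],
]
template = [
    [9, 8],
    [7, 9],
]
best_match = match_template(source, template)
```

A tester would typically accept the match only if the best score is within a chosen threshold, since the procedure always returns some position even when the template is absent.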
The image comparison techniques described above have proved to be accurate and efficient but not perfect; each of them has some weaknesses. Table 1 compares these
while still being accurate. The reason we use all three techniques rather than just one is that, according to our experiments, they complement each other; that is, errors undetected by one technique are likely to be detected by either of the other two.
2.2 Automated GUI testing techniques
Considerable effort has been dedicated to the area of automated testing. In this section, we introduce several commonly used automated GUI testing techniques.
Model-based testing
In recent years an increasing amount of research effort has been dedicated to model-based testing. Model-based testing represents the system under test (SUT) as a model. A model is a detailed abstract description of how the SUT is supposed to work. The behavior of the SUT can be represented in many ways but it is usually represented as a state machine, and graph algorithms are used to automatically derive the test cases [14].
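As a rough illustration of deriving test cases from a state machine, the sketch below enumerates every event sequence up to a given length with a depth-first traversal. The model, its state and event names, and the traversal strategy are all hypothetical; actual model-based tools use more selective graph algorithms.

```python
def derive_test_cases(model, start, max_events):
    """Enumerate every event sequence of up to max_events steps from start.

    model maps each state to a dict of {event: next_state}.
    """
    test_cases = []

    def walk(state, events):
        if events:
            test_cases.append(list(events))
        if len(events) == max_events:
            return
        for event, next_state in sorted(model.get(state, {}).items()):
            events.append(event)
            walk(next_state, events)
            events.pop()

    walk(start, [])
    return test_cases

# Hypothetical model of a small application with three screens.
model = {
    "login": {"submit_valid": "home", "submit_invalid": "login"},
    "home": {"open_settings": "settings", "logout": "login"},
    "settings": {"back": "home"},
}
test_cases = derive_test_cases(model, "login", max_events=2)
longer_test_cases = derive_test_cases(model, "login", max_events=4)
```

Even for this tiny model the number of generated sequences grows quickly as `max_events` increases.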
The major problem of model-based testing is time consumption: a very large number of test cases can be generated automatically, but many of them are irrelevant. Furthermore, creating a model requires detailed documentation of the SUT, and both the documentation and the model must be kept up to date; otherwise, invalid tests will be generated. By using the record-replay technique, SPAG-C takes advantage of the knowledge testers have about how the application is supposed to work and the context in which it is used. Therefore, testers can always create relevant test cases and SPAG-C verifies only what is necessary.
Accessibility technologies
Although the real purpose of accessibility technologies is to facilitate the use of technology by people with disabilities, many researchers use them to access GUI information and thus verify the GUI state of an application. For example, Grechanik et al. [7] used Windows accessibility technologies to create hooks that listen to events generated by the GAP (GUI-based application). Whenever an event of interest is triggered, these hooks can react to it and perform operations such as gathering the properties of the elements currently being displayed and verifying their state. Although this is a valid approach for black box testing, it is limited not only by the system’s API but also by the way developers make use of it. Android automatically makes applications more accessible, but there are some steps that developers can take to provide extra information about the application. SPAG-C, on the other hand, does not depend on accessibility services to verify an application’s GUI.
Record-replay
Record-replay is the most popular approach for GUI test automation [20] because it allows testers to create test cases for an application without writing code. The testing process consists of two phases: the record phase and the replay phase. During the record phase, testers interact with the application under test (AUT) in exactly the same way end users would. While a tester is interacting with the AUT, a tool automatically records all input events and writes them into a test case. Later, testers can modify the recorded test cases if required and replay them at any time. To improve on the traditional record-replay approach, we propose a method that automatically captures the required images and adds the verification commands during the record process. This way, all testers have to do is record the test cases.
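The two phases can be sketched as follows. This is a generic illustration of the record-replay idea, not the format used by SPAG or SPAG-C: the `Recorder` class, the `on_event` callback, the JSON test-case format, and the event names are all assumptions made for the example.

```python
import json

class Recorder:
    """Collects input events during the record phase."""

    def __init__(self):
        self.events = []

    def on_event(self, kind, **params):
        # Called once per user input event (hypothetical callback name).
        self.events.append({"kind": kind, "params": params})

    def to_test_case(self):
        # Serialize so that testers can inspect or edit the test case later.
        return json.dumps(self.events, indent=2)


def replay(test_case, dispatch):
    """Replay phase: feed each recorded event back through dispatch."""
    for event in json.loads(test_case):
        dispatch(event["kind"], **event["params"])


# Record phase: the tester's interactions arrive as events.
recorder = Recorder()
recorder.on_event("tap", x=120, y=340)
recorder.on_event("type", text="hello")
test_case = recorder.to_test_case()

# Replay phase: a stand-in dispatcher that only logs what it would send.
replayed = []
replay(test_case, lambda kind, **params: replayed.append((kind, params)))
```

In a real tool the dispatcher would inject each event into the device, and verification commands would be interleaved with the recorded events.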
2.3 Android-based automated GUI testing tools
Monkeyrunner
Monkeyrunner [15] is a testing tool provided by Google. It provides an API that developers can use to control Android devices without the need of any source code. To use Monkeyrunner, developers write Python programs to simulate user interaction. If they want to verify the state of the GUI, they can also write commands to capture screenshots from within the devices using Android’s frame buffer, which is the part of video memory containing the current video frame. Apart from the fact that testers need programming skills to use it, there are three main issues with Monkeyrunner: first, the naïve way in which it simulates events on the AUT [6]; second, its verification approach; third, capturing screenshots from Android’s frame buffer is time-sensitive, which means that testers need to adequately synchronize the simulation of events with the time of the capture, or otherwise invalid images will be taken for verification. In contrast, SPAG-C takes advantage of the method used by SPAG to accurately simulate events on the device under test (DUT), and uses a non-intrusive method to capture images which is automatically synchronized with the simulated events at all times.
Robotium framework
Robotium [16] is a framework used to perform black box testing on Android devices. It uses Android Instrumentation [18] to interact with an application’s GUI and gather some information. In order to check the state of an application, screenshots can be taken or object identification can be performed using Robotium’s API and JUnit’s assertions. Robotium is widely used but, just like Monkeyrunner, it requires testers to manually program test cases.
SPAG-C automatically creates test cases by listening to user events and recording them in the test case, which reduces the test writing time considerably.
Testdroid
Testdroid [8] is an Android testing platform that uses the Robotium framework to define test cases. Testdroid records user interaction and automatically generates Java code with calls to the Robotium API. These test cases can later be replayed at any time in the same way that Robotium tests are executed. With Testdroid, testers can execute their tests either locally, on their own devices, or remotely, using Testdroid’s cloud services. Testdroid’s cloud services provide log files and statistics about test execution; additionally, they take screenshots during the testing process so developers can verify the GUI. Testdroid services, however, are quite expensive, and GUI verification has to be done manually by the testers since Testdroid does not perform any comparison against expected states. In contrast, SPAG-C completely automates the verification process so that all testers need to do is record the tests.
GUITAR
Android GUITAR (Graphical User Interface Testing frAmewoRk) [17] was an effort of Xie and Memon to migrate their previous work [4] on model-based testing to the Android platform. GUITAR consists of two modules: ripper and replayer. The ripper is in charge of automatically generating event-flow graphs for their later conversion into test cases. It does this by automatically interacting with an application and gathering all relevant information about its GUI. Since the GUI ripper cannot be guaranteed to have access to all different windows and widgets of an application, a capture/replay tool was created for testers to complement the ripper. The replayer is in charge of the execution of the generated test cases.
The main problem with GUITAR is that it only works on the Android emulator because it uses Hierarchy Viewer [27], a tool that only works on development builds, not on real devices. On the other hand, SPAG-C can be used on a great variety of real devices, and its test oracle can be reused by different testing tools.
2.4 SPAG
This work, SPAG-C, is an extension of a previous work called SPAG (Smart Phone Automated GUI testing tool) [6]. SPAG combines and extends two open source tools: Sikuli [9] and Android Screencast [10]. SPAG merges these two tools together to be able to use Sikuli’s API for testing Android devices. SPAG intercepts user interactions with Android Screencast, saves these interactions in a Sikuli test file and replays them later as required by the tester.
SPAG provides three contributions: Batch event, which is used to accurately reproduce the recorded event sequences; Smart wait, which is used to automatically establish a delay between events to ensure that the DUT has enough time to process previous events; and an automatic verification method, which makes use of Android accessibility services to record the transitions between activities after an event is executed.
Since SPAG is integrated with Sikuli, it can also take advantage of Sikuli’s API to perform image verification in a semi-automatic way, which means that the verification is done by Sikuli but the tester still needs to provide the images and write the commands into the test case. SPAG also provides an automatic verification that uses Android accessibility services to gather the names of the activities, and performs a string comparison to verify that the same activity transition that occurred after the input of a specific event during record also happens during replay. This, however, does not ensure that applications are being displayed as expected. SPAG-C also provides two verification approaches: semi-automatic and automatic. In both approaches SPAG-C performs image verification with images captured from a camera; the only difference is that the semi-automatic approach requires testers to capture the images, while the automatic approach does not.
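The string-based check on activity transitions described above for SPAG can be sketched as follows. The data layout (one activity name per input event), the function name, and the activity names are assumptions made for illustration; how the names are gathered from the accessibility services is outside the scope of the sketch.

```python
from itertools import zip_longest

def verify_transitions(recorded, replayed):
    """Compare activity names step by step and return the mismatches.

    Each argument is a list of activity names, one per input event, naming
    the activity in the foreground after that event (assumed layout).
    """
    mismatches = []
    for step, (expected, actual) in enumerate(zip_longest(recorded, replayed)):
        if expected != actual:  # plain string comparison
            mismatches.append((step, expected, actual))
    return mismatches

# Hypothetical activity names gathered during record and during two replays.
recorded = ["LoginActivity", "HomeActivity", "SettingsActivity"]
replayed_ok = ["LoginActivity", "HomeActivity", "SettingsActivity"]
replayed_bad = ["LoginActivity", "ErrorActivity", "SettingsActivity"]

ok_result = verify_transitions(recorded, replayed_ok)
bad_result = verify_transitions(recorded, replayed_bad)
```

A check of this kind passes whenever the activity names match, even if the screen itself is rendered incorrectly, which is why image verification is needed in addition.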
SPAG depends on Android Screencast to interact with the DUT; therefore, it inherits that tool’s limitations, such as limited device support, slow response time (which may affect the image verification process), and the inability to reproduce multi-touch events. Since SPAG-C is based on SPAG, it also inherits some of SPAG’s limitations, but we improve the verification process by making it more reusable, automated, better synchronized and platform-independent.
In Table 2 we compare the previously discussed tools by looking at: how often the GUI is verified during test execution (verification frequency), what technique is used to verify the GUI (verification approach), how screenshots are taken (capture screen method), and what testing technique is used.
Table 2: Common automation tools in Android

| Tool | Verification frequency | Verification approach | Capture screen method | Testing technique |
|---|---|---|---|---|
| Monkeyrunner [15] | up to tester | image comparison | frame buffer | test script |
| Robotium [16] | up to tester | code assertions | no need | test script |
| GUITAR [17] | -- | code assertions | no need | model-based |
| Testdroid [8] | -- | -- | frame buffer | record-replay |
| SPAG semi [6] | up to tester | image comparison | frame buffer | record-replay |
| SPAG auto | activity transition | string comparison | no need | record-replay |
| SPAG-C semi | up to tester | image comparison | external camera | record-replay |
| SPAG-C auto | dynamic | image comparison | external camera | record-replay |