CHAPTER 4 VISUALIZATION
National Chengchi University (國立政治大學)
4.2 Graphic Visualization
We visualize the analysis data to give the user a global view of the application. We collect the analysis report of each file in the application and render the data as a bee comb (honeycomb) graph. Every cell in the bee comb graph represents a single file, and its color depends on the vulnerabilities that the file has. A file with many vulnerabilities is more easily attacked, so the cell it maps to is colored red to indicate its dangerous status. A file that has few vulnerabilities but is connected to one or more very dangerous files is colored yellow, because it could execute malicious scripts from the dangerous files. Although the yellow files do not have obvious vulnerabilities, users still have to monitor them, and they are categorized as secondary dangerous files. The third type consists of files that have no vulnerabilities and no connections with vulnerable files. Their cells are colored green, which means they are safe. When users click a cell, the screen zooms in to the file and lists its name and status.
Figure 15. The graph represents the security status of the target application.
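The coloring rule described above can be sketched as follows. This is only an illustration under stated assumptions: the cutoff for "many vulnerabilities" (`DANGER_THRESHOLD`), the function name, and the input shapes are hypothetical, and the text does not say how a file with a few vulnerabilities and no dangerous neighbors is colored, so the sketch conservatively treats it as yellow.

```python
from typing import Dict, Set

# Hypothetical cutoff: the text only says "many vulnerabilities".
DANGER_THRESHOLD = 3


def classify_files(vuln_counts: Dict[str, int],
                   links: Dict[str, Set[str]],
                   threshold: int = DANGER_THRESHOLD) -> Dict[str, str]:
    """Map every file to "red", "yellow", or "green".

    vuln_counts: number of vulnerabilities reported for each file.
    links: for each file, the set of files it is connected to.
    """
    red = {name for name, count in vuln_counts.items() if count >= threshold}
    colours = {}
    for name, count in vuln_counts.items():
        if name in red:
            colours[name] = "red"
        elif links.get(name, set()) & red:
            # Connected to at least one dangerous file.
            colours[name] = "yellow"
        elif count > 0:
            # Assumption: not covered by the text, treated as yellow here.
            colours[name] = "yellow"
        else:
            colours[name] = "green"
    return colours
```

For example, a file with five vulnerabilities is classified as red, a clean file linked to it as yellow, and a clean, unlinked file as green.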
Another view we wanted to provide is the dependency graph. Stranger analyzes the source and builds a dependency graph of the program logic so that it can detect the vulnerabilities in the source. This graph is the core value of the information-security analysis, but users cannot see these data in the analysis report, so we provide views of this aspect. Users can check the relations between the variables and methods in the application, as well as the data flow and how malicious scripts affect the application.
Users can then choose one of the files in the bee comb graph and press the space key, which transforms the scene into a view of the dependency graph. The dark node represents the sink node, which may cause vulnerabilities, and the red node represents the input node, which receives input values. Users can check every node in the execution process of the target file by clicking it: the screen zooms in to the chosen node, and a text area appears in the upper-left corner of the screen. The text area contains the properties of the chosen node, such as the line position, the variable, and the method called. We also let users trace the path from the input nodes to the sink node by moving the camera backward and forward, as shown in Fig. 18. Users press the “A” key to move forward and the “S” key to move backward while examining the graph.
Figure 16. Users can explore the dependency graph by checking nodes with the information.
We integrated the website with the visualization tool so that users can operate both at the same time. Users launch the visualization tool while browsing the Patcher website, and the website synchronizes the users’ actions with the visualization tool. For instance, when users click one of the applications in the list, the website navigates to a page that lists the files in the selected application, and the visualization tool receives a command from the website with information about the target application. The visualization tool then transforms each file of the selected application into a cell and colors it based on the number of vulnerabilities, so the user can check the overall status of the application. If users drill down to a specific file in the application, the visualization tool synchronizes the action by zooming the camera in to the cell representing the selected file, with a label showing its file name and security level. On the single-file page, users can click the “Dependency Graph” button to view the dependency graph of vulnerabilities generated from the dot file in the analysis report. Users can also switch between different dependency graphs. We are working on reverse synchronization, in which the corresponding source code is highlighted when nodes of the dependency graph are chosen. This could improve users’ comprehension, because they would not have to check the information of a node in the visualization and then switch to the website to search for the corresponding source. Having to switch applications and windows too often could cause frustration and make users stop using the tools.
We use Unity as the development tool to generate the graph, and the material of the cell is a model exported from the Maya software. First, the user uploads an application to the back-end server to check for vulnerabilities. The visualization application then loads the output XML file provided by the Stranger back end and reads the detailed information of the analysis report, which can include the number of files, the vulnerabilities of each file, and the types of vulnerabilities. The application generates the bee comb dynamically, piece by piece, based on the information in the XML file, so users can follow the analysis process by watching the animation. In other words, the application transforms each node in the XML file into a cell of the bee comb, and all of the visualized objects are shown on a large screen by a projector.
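The piece-by-piece generation can be illustrated with a short sketch. It is written in Python rather than as a Unity script, and the report schema (`<report>` with `<file name=... vulnerabilities=...>` children), the function name, and the layout parameters are assumptions made for illustration, since the actual format of the Stranger XML output is not given here. Each XML node yields one cell position on an interlocking hexagonal grid.

```python
import math
import xml.etree.ElementTree as ET
from typing import Iterator, Tuple

# Hypothetical report: element and attribute names are assumptions.
SAMPLE_REPORT = """
<report>
  <file name="index" vulnerabilities="4"/>
  <file name="login" vulnerabilities="0"/>
  <file name="util" vulnerabilities="1"/>
</report>
"""


def honeycomb_cells(xml_text: str, columns: int = 2, size: float = 1.0
                    ) -> Iterator[Tuple[str, int, float, float]]:
    """Yield (file name, vulnerability count, x, y), one cell at a time."""
    root = ET.fromstring(xml_text.strip())
    width = math.sqrt(3) * size   # horizontal spacing of pointy-top hexagons
    height = 1.5 * size           # vertical spacing between rows
    for i, node in enumerate(root.iter("file")):
        row, col = divmod(i, columns)
        # Odd rows are shifted by half a cell so the hexagons interlock.
        x = (col + 0.5 * (row % 2)) * width
        y = row * height
        yield node.get("name"), int(node.get("vulnerabilities", "0")), x, y
```

Because the function is a generator, a caller can instantiate and animate one cell per step, which mirrors the dynamic construction described above.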
4.2.1 Processing Graphic Inputs
Stranger generates graphs in the dot format (Graphviz, 2013), which has been widely adopted to illustrate flow graphs. Because the dependency graphs have to be represented in Unity, we convert the dot files of the analysis results to XML. We developed a parser that reads the nodes and the relations between nodes in the dot file and writes the information in the XML format, so that the visualization tool can read it with an XML parser and instantiate visual objects. The dot file contains a large amount of information, so we have to filter and separate it. We collect the node name, the shape, and each line of the label. If the node represents an input, the system instantiates a node in the environment to represent the input node, which receives the user’s input value in the dependency graph. We use the labels Condition and OnCondition to determine the relations between nodes. If a node contains the label Condition:xxxx, the node is a method that can be called several times by other nodes.
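The node-collection step can be sketched as below. This is a simplified illustration, not the parser we built: it assumes one statement per line and a `name [attr=value, ...]` layout, the sample graph is invented, and the names `parse_nodes`, `NODE_RE`, and `ATTR_RE` are hypothetical. It extracts the node name, the shape, and each line of the label.

```python
import re
from typing import Dict, List

# Assumption: one dot statement per line, e.g.  n1 [shape=box, label="a\nb"];
NODE_RE = re.compile(r'^\s*(\w+)\s*\[(.*)\]\s*;?\s*$')
ATTR_RE = re.compile(r'(\w+)\s*=\s*(?:"((?:[^"\\]|\\.)*)"|(\w+))')

# Invented sample; the raw string keeps the dot line-break escape "\n" literal.
SAMPLE_DOT = r'''
digraph g {
  n1 [shape=doubleoctagon, label="Condition:c1\nline 10"];
  n2 [shape=box, label="Input\nline 3"];
  n2 -> n1 [label="arg"];
}
'''


def parse_nodes(dot_text: str) -> List[Dict[str, object]]:
    """Collect name, shape, and label lines of every node statement."""
    nodes = []
    for line in dot_text.splitlines():
        if "->" in line:            # edge statements are handled separately
            continue
        match = NODE_RE.match(line)
        if not match:
            continue
        name, attr_text = match.groups()
        if name in ("node", "edge", "graph"):   # default-attribute statements
            continue
        attrs = {}
        for attr in ATTR_RE.finditer(attr_text):
            quoted, bare = attr.group(2), attr.group(3)
            attrs[attr.group(1)] = quoted if quoted is not None else bare
        # dot labels separate lines with the escapes \n, \l, and \r.
        parts = re.split(r'\\[nlr]', attrs.get("label", ""))
        nodes.append({
            "name": name,
            "shape": attrs.get("shape", ""),
            "label_lines": [part for part in parts if part],
        })
    return nodes
```

Calling `parse_nodes(SAMPLE_DOT)` returns one dictionary per node. A label line such as `Condition:c1` is kept verbatim so that a later step can match it.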
The dot graph uses an arrow to indicate the direction of an edge between nodes. When an edge has a label, the label is placed between square brackets [ ]. We record the start point as From and the end point as To, and a special field called EdgeLabel records the label on the edge. We list this information in a table. In addition to Nodes and Edges, we make a Root list to tell Unity which node is the root. The root node represents the tainted sink, which is the end of the data flow and shows how and where the malicious behavior may happen. The system traces the vulnerable input of the source code from the root. The root node in Graphviz looks like the figure below.
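As an illustration of the edge records described in this paragraph, the following sketch reads `a -> b [label="..."]` statements and stores them with the fields From, To, and EdgeLabel. It assumes one edge statement per line, and the names `parse_edges`, `EDGE_RE`, and `LABEL_RE` are hypothetical.

```python
import re
from typing import Dict, List, Optional

# Assumption: one edge statement per line, e.g.  n2 -> n1 [label="arg"];
EDGE_RE = re.compile(r'^\s*(\w+)\s*->\s*(\w+)\s*(?:\[(.*)\])?\s*;?\s*$')
LABEL_RE = re.compile(r'label\s*=\s*"((?:[^"\\]|\\.)*)"')


def parse_edges(dot_text: str) -> List[Dict[str, Optional[str]]]:
    """Return one {"From", "To", "EdgeLabel"} record per edge statement."""
    edges = []
    for line in dot_text.splitlines():
        match = EDGE_RE.match(line)
        if not match:
            continue
        start, end, attr_text = match.groups()
        label = None
        if attr_text:
            label_match = LABEL_RE.search(attr_text)
            if label_match:
                label = label_match.group(1)
        edges.append({"From": start, "To": end, "EdgeLabel": label})
    return edges
```

An edge without a bracketed label is recorded with `EdgeLabel` set to `None`.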
We examine every node, collect the nodes that are shaped as double octagons and that are Condition nodes, and make them into a list. Because a dot file does not state the condition relationships, we developed an algorithm to build them by matching each Condition node with its OnCondition node. We already have the Condition node list from the step where we make the roots, so what remains is to find which OnCondition node matches the target Condition node. As in the Edge table, we use the attribute From to record the Condition node and the attribute To to record the OnCondition node, but we store the conditions in an additional table. We convert every dot file into an XML file with four child nodes: Nodes, Edges, Roots, and Conditions. We pack all of a node's information into a node element and append it under the Nodes element, and the roots are appended under Roots. If the file does not have a Condition, there will be a Non-condition attribute in the Conditionedges node.
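A minimal sketch of the resulting XML layout is given below. Several details are assumptions rather than the format we actually emit: the wrapper element `Graph`, the child element names `Node`, `Line`, `Edge`, `Root`, and `Condition`, the value of the Non-condition attribute, the use of the `doubleoctagon` shape alone to mark roots, and the rule that a Condition node and an OnCondition node match when the text after the colon is identical. The input dictionaries are likewise hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List


def build_xml(nodes: List[Dict], edges: List[Dict]) -> ET.Element:
    """Assemble Nodes, Edges, Roots, and Conditions under one element."""
    graph = ET.Element("Graph")               # hypothetical wrapper element
    nodes_el = ET.SubElement(graph, "Nodes")
    edges_el = ET.SubElement(graph, "Edges")
    roots_el = ET.SubElement(graph, "Roots")
    conds_el = ET.SubElement(graph, "Conditions")

    conditions = {}      # condition key -> name of the Condition node
    on_conditions = []   # (condition key, name of the OnCondition node)
    for node in nodes:
        node_el = ET.SubElement(nodes_el, "Node",
                                Name=node["name"], Shape=node["shape"])
        for text in node["label_lines"]:
            ET.SubElement(node_el, "Line").text = text
            if text.startswith("Condition:"):
                conditions[text.split(":", 1)[1]] = node["name"]
            elif text.startswith("OnCondition:"):
                on_conditions.append((text.split(":", 1)[1], node["name"]))
        if node["shape"] == "doubleoctagon":   # assumed marker of a root
            ET.SubElement(roots_el, "Root", Name=node["name"])

    for edge in edges:
        ET.SubElement(edges_el, "Edge", From=edge["From"], To=edge["To"],
                      EdgeLabel=edge["EdgeLabel"] or "")

    matched = False
    for key, name in on_conditions:
        if key in conditions:                  # assumed matching rule
            ET.SubElement(conds_el, "Condition",
                          From=conditions[key], To=name)
            matched = True
    if not matched:
        conds_el.set("Non-condition", "true")  # attribute value is assumed
    return graph
```

The returned element can be serialized with `ET.tostring(graph, encoding="unicode")` before it is handed to the visualization tool.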
4.2.2 Building Interactive Graphs
To build a function-oriented graph-exploration view, we used a tree data structure to reconstruct the dependency graph. In addition to instantiating the visual nodes and linking them to their parent nodes with visual lines, we also linked the nodes with pointers to form the tree.
After the doubly-linked tree is created, it is easy to access and modify a node and its information in both the visible and the invisible aspects, such as broadcasting the information of the root node to every child node and adjusting the distance between two visual nodes.
Figure 17. The algorithm of generating interactive dependency graphs.
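The doubly-linked tree can be sketched as follows. This is an illustration under assumptions, not the algorithm of Figure 17: the class and function names are hypothetical, the children of a node are taken to be the nodes with an edge pointing into it (so the tree grows from the sink toward the inputs), and a node already on the current path is skipped to avoid cycles.

```python
from typing import Dict, List, Optional, Tuple


class TreeNode:
    """A node that knows both its parent and its children."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.parent: Optional["TreeNode"] = None
        self.children: List["TreeNode"] = []
        self.info: Dict[str, str] = {}

    def add_child(self, child: "TreeNode") -> "TreeNode":
        child.parent = self
        self.children.append(child)
        return child


def build_tree(root_name: str, edges: List[Tuple[str, str]]) -> TreeNode:
    """Build the tree from (From, To) pairs, starting at the root (sink)."""
    incoming: Dict[str, List[str]] = {}
    for start, end in edges:
        incoming.setdefault(end, []).append(start)
    root = TreeNode(root_name)
    stack = [(root, {root_name})]
    while stack:
        node, on_path = stack.pop()
        for start in incoming.get(node.name, []):
            if start in on_path:      # assumption: skip cycles
                continue
            child = node.add_child(TreeNode(start))
            stack.append((child, on_path | {start}))
    return root


def broadcast(root: TreeNode, key: str, value: str) -> None:
    """Copy one piece of root information to every node in the tree."""
    stack = [root]
    while stack:
        node = stack.pop()
        node.info[key] = value
        stack.extend(node.children)


def path_to_root(node: Optional[TreeNode]) -> List[str]:
    """Follow the parent pointers from a node back to the root."""
    names = []
    while node is not None:
        names.append(node.name)
        node = node.parent
    return names
```

The child pointers support top-down operations such as `broadcast`, while the parent pointers make it cheap to walk from any input node back to the sink.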
Based on the mechanism above, we developed the capability to trace the function call path: users can move the focus forward and backward to check nodes in the dependency graph, as shown in Fig. 15. As users move the focus to explore the dependency graph, the visualization sends synchronization commands, and the corresponding code on the website is highlighted. With this capability, users can easily trace the source code without searching by line number.
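The focus movement and the synchronization it triggers can be sketched as below. The class name, the shape of the path entries, and the command format (`{"command": "highlight", "line": ...}`) are assumptions for illustration only. The actual messages exchanged between the visualization tool and the website are not specified here.

```python
from typing import Callable, Dict, List


class FocusCursor:
    """Moves a focus along a traced path and reports each move."""

    def __init__(self, path: List[Dict], send: Callable[[Dict], None]) -> None:
        if not path:
            raise ValueError("path must contain at least one node")
        self.path = path
        self.index = 0
        self.send = send   # hypothetical channel to the website

    def _move(self, step: int) -> Dict:
        # Clamp the focus so it never leaves the path.
        new_index = min(max(self.index + step, 0), len(self.path) - 1)
        if new_index != self.index:
            self.index = new_index
            # Assumed command format: ask the website to highlight a line.
            self.send({"command": "highlight",
                       "line": self.path[new_index]["line"]})
        return self.path[self.index]

    def forward(self) -> Dict:
        return self._move(1)

    def backward(self) -> Dict:
        return self._move(-1)
```

In this sketch a command is sent only when the focus actually changes, so pressing a key at either end of the path does not trigger a redundant highlight.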