Check Ad fraud - A Study of Static Ad Fraud Detection Techniques: The Case of iOS

Algorithm 2 Check Ad-related API with dynamic invocation algorithm

procedure checkIAD(CFG, ApiString)
    ▷ CFG is the control flow graph of an app
    for ∀node ∈ CFG_AD do
        if node ∈ NSClassFromString then
            DPG ← createDPG(node)
            SDPG ← BuildStringAutomataFromDPG(DPG)
            SAPI ← BuildStringAutomata(ApiString)

string. We use Algorithm 2 to check for the Ad-related API strings shown in Table 1.

Algorithm 2 works as follows. The first step is to build dependency graphs on the parameters of NSClassFromString functions. The second step is to build a string automaton for the constructor or method of each Ad-related API. We then check whether the string automaton of the dependency graph intersects with these Ad-related API string automata, which reveals Ad-related API invocations together with their argument values on NSClassFromString functions. The final step is to record the result, so that apps without Ad-related APIs can be skipped when we run the Ad fraud detection algorithms described next.
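The four steps above can be sketched in Python as follows. This is only an illustration under simplifying assumptions, not the implementation used in this work: the dependency graph is reduced to a tiny expression tree (`Lit`, `Concat`, and `Choice` are hypothetical names), a string automaton is approximated by the finite set of strings it accepts, and the entries of `AD_API_STRINGS` are example class names rather than the contents of Table 1.

```python
from dataclasses import dataclass
from itertools import product

# Hypothetical, simplified dependency-graph nodes for one string argument.
@dataclass(frozen=True)
class Lit:
    value: str            # a string constant found in the binary

@dataclass(frozen=True)
class Concat:
    left: object          # expression producing the prefix
    right: object         # expression producing the suffix

@dataclass(frozen=True)
class Choice:
    options: tuple        # alternative expressions reaching the same argument

def language(expr):
    """Return the finite set of strings an expression can evaluate to."""
    if isinstance(expr, Lit):
        return frozenset({expr.value})
    if isinstance(expr, Concat):
        pairs = product(language(expr.left), language(expr.right))
        return frozenset(a + b for a, b in pairs)
    if isinstance(expr, Choice):
        return frozenset().union(*(language(o) for o in expr.options))
    raise TypeError(f"unsupported node: {expr!r}")

# Example class names only (assumption); Table 1 holds the real list.
AD_API_STRINGS = frozenset({"ADBannerView", "GADBannerView"})

def check_iad(call_sites, api_strings=AD_API_STRINGS):
    """Map each NSClassFromString call site to the Ad API strings it can load."""
    results = {}
    for site, expr in call_sites.items():
        hits = language(expr) & api_strings   # steps 1-3: build both sides, intersect
        if hits:
            results[site] = hits              # step 4: record the result
    return results

sites = {
    "sub_1000": Concat(Lit("GAD"), Choice((Lit("BannerView"), Lit("Request")))),
    "sub_2000": Lit("NSString"),
}
print(check_iad(sites))
```

An app whose result is empty would be skipped by the later Ad fraud checks, which is the purpose of the recording step.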

4.4 Check Ad fraud

If an app includes an Ad-related API, we start Ad fraud detection as shown in Figure 9. We use a different algorithm for each type of Ad fraud; the detailed steps are given in the following subsections. Whenever an algorithm needs to resolve the possible string values of a parameter, we use the BuildStringAutomataFromDPG and BuildStringAutomata functions mentioned in Section 4.3. BuildStringAutomataFromDPG traverses the parameter of the input DPG to find string operations and string expressions, and then builds an automaton for them.

BuildStringAutomata builds the automaton for the string we are concerned with. After building these two string automata, we check their intersection to reveal potential API string invocations together with their argument values.
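The intersection check between two string automata can be illustrated with a standard product construction. The sketch below is an assumption-laden stand-in for BuildStringAutomataFromDPG and BuildStringAutomata: `DFA`, `literal_dfa`, `union_of_literals`, and `intersects` are hypothetical names, and the automata only cover literal strings, whereas a real dependency graph may also contain string operations.

```python
from collections import deque
from dataclasses import dataclass

@dataclass
class DFA:
    start: int
    accepting: set
    delta: dict  # (state, char) -> next state; a missing entry means reject

def literal_dfa(text):
    """Automaton accepting exactly one string (stand-in for the API string side)."""
    return DFA(0, {len(text)}, {(i, ch): i + 1 for i, ch in enumerate(text)})

def union_of_literals(texts):
    """Trie-shaped automaton accepting any of the given strings (stand-in for the DPG side)."""
    delta, accepting, next_state = {}, set(), 1
    for text in texts:
        state = 0
        for ch in text:
            if (state, ch) not in delta:
                delta[(state, ch)] = next_state
                next_state += 1
            state = delta[(state, ch)]
        accepting.add(state)
    return DFA(0, accepting, delta)

def _edges(dfa):
    """Group transitions by source state: state -> {char: target}."""
    out = {}
    for (state, ch), target in dfa.delta.items():
        out.setdefault(state, {})[ch] = target
    return out

def intersects(a, b):
    """Breadth-first search over the product automaton.

    Returns a shortest string accepted by both automata, or None if the
    intersection is empty.
    """
    edges_a, edges_b = _edges(a), _edges(b)
    queue = deque([(a.start, b.start, "")])
    seen = {(a.start, b.start)}
    while queue:
        sa, sb, word = queue.popleft()
        if sa in a.accepting and sb in b.accepting:
            return word
        for ch, ta in edges_a.get(sa, {}).items():
            tb = edges_b.get(sb, {}).get(ch)
            if tb is not None and (ta, tb) not in seen:
                seen.add((ta, tb))
                queue.append((ta, tb, word + ch))
    return None

dpg_side = union_of_literals(["ADInterstitialAd", "UIView"])
api_side = literal_dfa("ADInterstitialAd")
print(intersects(dpg_side, api_side))
```

Returning a witness string instead of a plain boolean keeps the argument value that triggered the match, which is what the text means by revealing invocations "with their argument values".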

4.4.1 Interstitial violation Ad fraud

Showing an interstitial Ad is harmful to the user experience. Interstitial violation Ad fraud means that developers call the ADInterstitial API to generate an interstitial Ad. Here, ADInterstitial API denotes the union of the interstitial APIs provided by the Ad networks in Table 1. We determine the flow to the parameters of the functions that load a class through dynamic invocation, and then conduct string analysis on the resulting dependency graphs to determine whether these parameters can take an ADInterstitial API string as input. In other words, developers may use reflection to call the ADInterstitial API and generate an interstitial Ad, so we run Algorithm 3 to check for calls to the ADInterstitial API through dynamic invocation. To use the loaded class, developers also need to call its alloc or init method from other locations.

When we run Algorithm 3, we first get all the nodes in the control flow graph of an app that includes an Ad-related API. We then check whether each node is a function node of NSClassFromString. If it is, we build the dependency graph on the parameter (R0) of the function using Algorithm 1. From the dependency graph, we build its string automaton with buildStringAutomata, which determines the possible input strings for the dependency graph. We also build an automaton for the ADInterstitial API strings. The API strings used to create an interstitial Ad vary across Ad network providers, so we collected them before running this detection. We then check the intersection of the two automata. A non-empty intersection means that the app has called the ADInterstitial API. Finally, we use the hasInitMethod function to check whether the node calls the alloc or init method at other locations in the control flow graph. If both conditions hold, we record the result as Interstitial violation Ad fraud.

Algorithm 3 Check the Interstitial violation Ad fraud

procedure checkIVAF(CFG_AD)
    ▷ CFG_AD is the control flow graph of an app that includes Ad-related API
    for ∀node ∈ CFG_AD do
        if node ∈ NSClassFromString then
            DPG ← createDPG(node, "R0")
            SDPG ← buildStringAutomata(DPG)
            SVS ← buildStringAutomata(adInterstitialAPIString)
            checkResult ← SDPG ∩ SVS
            if checkResult = true ∧ hasInitMethod(node) then
                Record(checkResult, SDPG, SVS)
            end if
        end if
    end for
end procedure
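As a rough illustration of Algorithm 3, the Python sketch below checks both conditions on a toy control flow graph. It rests on several assumptions that are not part of this work: the `Node` structure and its fields are hypothetical, the string automaton of R0 is approximated by a precomputed set of possible strings, and the names in `AD_INTERSTITIAL_API` are example class names rather than the collected list.

```python
from dataclasses import dataclass

@dataclass
class Node:
    node_id: str
    callee: str                          # function or selector invoked at this node
    r0_strings: frozenset = frozenset()  # possible values of the R0 argument (assumed resolved)
    receiver: str = ""                   # node_id of the object that receives the call

# Example interstitial class names only (assumption).
AD_INTERSTITIAL_API = frozenset({"ADInterstitialAd", "GADInterstitial"})

def has_init_method(cfg, node):
    """True if some other node calls alloc or init on the class loaded at `node`."""
    return any(
        other.receiver == node.node_id and other.callee in {"alloc", "init"}
        for other in cfg
    )

def check_ivaf(cfg):
    """Return (node_id, matched API strings) for each suspected violation."""
    findings = []
    for node in cfg:
        if node.callee != "NSClassFromString":
            continue
        matched = node.r0_strings & AD_INTERSTITIAL_API   # automata intersection
        if matched and has_init_method(cfg, node):        # both conditions must hold
            findings.append((node.node_id, sorted(matched)))
    return findings

cfg = [
    Node("n1", "NSClassFromString", frozenset({"ADInterstitialAd"})),
    Node("n2", "alloc", receiver="n1"),
    Node("n3", "NSClassFromString", frozenset({"GADInterstitial"})),  # never instantiated
    Node("n4", "NSClassFromString", frozenset({"UILabel"})),
    Node("n5", "init", receiver="n4"),                                # not an Ad class
]
print(check_ivaf(cfg))
```

Only `n1` satisfies both conditions: `n3` loads an interstitial class but is never instantiated, and `n4` is instantiated but is not an Ad class.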

4.4.2 Size violation Ad fraud

To check for Size violation Ad fraud, we need to determine the flow to the parameters of the Ad view functions that set the Ad size. For each such parameter, we construct a dependency graph. We conduct string analysis on these dependency graphs to reveal potential API invocations together with their argument values on Ad view functions. By intersecting these values with patterns that characterize a violating value for an Ad view, we can detect Size violation Ad fraud.

Algorithm 5 shows how we check for Size violation Ad fraud in apps that include Ad-related APIs. Some Ad networks provide APIs that let developers decide the size of the Ad view when displaying Ads in an application. The size of an Ad should be reasonable, so we try to resolve the size of each Ad view and check whether it constitutes Size violation Ad fraud. Some preparation is needed first: Algorithm 5 calls Algorithm 4 to get the Ad view node list from the control flow graph.

Because we record the callSbrt of each node when we construct the control flow graph, we can use the GetCallerSbrt function to find the callSbrtNode list of the input node. How we find the callSbrt of each node is described next.

We first scan every node of the control flow graph. When we encounter a node that denotes objc_msgSend, it tells us which object the message is sent to and which method is called. We therefore know which subroutines or methods the node has called, and we record them as the callSbrt of that node.
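The scan described above might look like the following Python sketch. The `CfgNode` fields and both function names are hypothetical, and the sketch assumes that the receiver and selector of each objc_msgSend call have already been resolved, which in practice requires the data flow analysis described earlier. It also keys the table by the enclosing subroutine, which is one possible reading of callSbrt.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class CfgNode:
    owner: str           # subroutine that contains this node
    callee: str          # called symbol, for example "objc_msgSend"
    receiver: str = ""   # resolved receiver class (assumed already recovered)
    selector: str = ""   # resolved selector name (assumed already recovered)

def record_call_sbrt(nodes):
    """Scan all nodes once and record (receiver, selector) pairs per subroutine."""
    call_sbrt = defaultdict(list)
    for node in nodes:
        if node.callee == "objc_msgSend":   # only message sends carry a selector
            call_sbrt[node.owner].append((node.receiver, node.selector))
    return dict(call_sbrt)

def get_caller_sbrt(call_sbrt, owner):
    """Look up the recorded calls of one subroutine (empty list if none)."""
    return call_sbrt.get(owner, [])

nodes = [
    CfgNode("-[MainVC viewDidLoad]", "objc_msgSend", "GADBannerView", "initWithFrame:"),
    CfgNode("-[MainVC viewDidLoad]", "_NSLog"),
    CfgNode("-[MainVC viewDidLoad]", "objc_msgSend", "GADBannerView", "setFrame:"),
]
table = record_call_sbrt(nodes)
print(get_caller_sbrt(table, "-[MainVC viewDidLoad]"))
```

Recording the table once during graph construction means later checks only perform lookups instead of rescanning the graph.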

Algorithm 4 Check Ad-view algorithm

procedure checkAdView(CFG)
    ▷ CFG is the control flow graph of an app
    ▷ AdViewClassString is the API of the Ad view provided by an Ad network
    AdViewList ← []
    for ∀node ∈ CFG do
        if node ∈ NSClassFromString then
            DPG ← createDPG(node, R0)
            SDPG ← buildStringAutomata(DPG)
            SAPI ← buildStringAutomata(AdViewClassString)
            checkResult ← SDPG ∩ SAPI
            for ∀callerNodes ∈ node.getCallerNodes do
                AdViewList.add(callerNodes)

We take the control flow graph of an app that includes Ad-related APIs and obtain the Ad view node list from it with Algorithm 4. We then get the list of subroutine nodes called by each Ad view node. If a subroutine node, such as a function or method, belongs to the Ad view API provided by any Ad network, we build the dependency graph on the parameters of this subroutine node. ViolateSize denotes a violating size string, such as CGRectZero. We build the string automaton for the violating size and check whether it intersects with the automaton of the dependency graph. If it does, we report Size violation Ad fraud. Finally, we record the result and count the number of apps that exhibit Size violation Ad fraud.


Algorithm 5 Check Size-violation Ad fraud algorithm

procedure checkSVAF(CFG_AD)
    ▷ CFG_AD is the control flow graph of an app that includes Ad-related API
    ADViewList ← checkADView(CFG_AD)
    for ∀adviewNode ∈ ADViewList do
        CallSbrtList ← GetCallerSbrt(adviewNode)
        for ∀sbrt ∈ CallSbrtList do
            if sbrt.name ∈ ADViewMethodList then
                DPG ← createDPG(sbrt, "R0")
                SDPG ← buildStringAutomata(DPG)
                SVS ← buildStringAutomata(ViolateSize)
                checkResult ← SDPG ∩ SVS
                Record(checkResult, SDPG, SVS)
            end if
        end for
    end for
end procedure
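A compact Python sketch of the loop in Algorithm 5 is given below. It is an illustration under assumptions: `Sbrt`, `AD_VIEW_METHODS`, and `count_violations` are hypothetical names, the method list contains example selectors only, and the string automaton of each size argument is approximated by a precomputed set of possible strings.

```python
from dataclasses import dataclass

@dataclass
class Sbrt:
    name: str               # subroutine or method called on the Ad view
    r0_strings: frozenset   # strings the size argument may resolve to (assumed resolved)

AD_VIEW_METHODS = {"initWithFrame:", "setFrame:"}   # example selectors (assumption)
VIOLATE_SIZE = frozenset({"CGRectZero"})            # violating size string from the text

def check_svaf(ad_view_calls):
    """ad_view_calls maps an Ad view node id to the subroutines it calls.

    Every checked subroutine is recorded, as in Algorithm 5, together with
    a flag saying whether its size argument can take a violating value.
    """
    records = []
    for ad_view, sbrts in ad_view_calls.items():
        for sbrt in sbrts:
            if sbrt.name not in AD_VIEW_METHODS:
                continue
            hit = sbrt.r0_strings & VIOLATE_SIZE      # automata intersection
            records.append((ad_view, sbrt.name, bool(hit)))
    return records

def count_violations(records):
    """Number of distinct Ad views with at least one violating size."""
    return len({ad_view for ad_view, _, violated in records if violated})

calls = {
    "banner1": [Sbrt("initWithFrame:", frozenset({"CGRectZero"}))],
    "banner2": [Sbrt("setFrame:", frozenset({"CGRectMake"})),
                Sbrt("loadRequest:", frozenset({"CGRectZero"}))],
}
records = check_svaf(calls)
print(records, count_violations(records))
```

The `loadRequest:` call is ignored even though its argument set contains CGRectZero, because it is not in the Ad view method list.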

4.4.3 Multi-view violation Ad fraud

Each Ad view is attached to a ViewController. The same ViewController should not have more than one Ad view attached. If it does, we treat it as Multi-view violation Ad fraud.

An instance of ViewController calls the addSubView method. We check whether the parameter of addSubView is an instance of any kind of Ad view, and count how many times an instance of ViewController calls addSubView with an Ad view instance as the parameter. When the count is greater than one, we call it Multi-view violation Ad fraud.

Because developers may import the Ad view through dynamic invocation, we first use Algorithm 4 to find the Ad view instances that are created dynamically and store them in AdViewList. We then find the addSubView method called by each ViewController of the Ad network, get its parameter, and check whether the parameter is an Ad view node in AdViewList. If so, we use the RecordAddSubAdView function to record that the app has called addSubView with an Ad view once. If the count is greater than one, we call it Multi-view violation Ad fraud. These steps are shown in Algorithm 6.
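The counting rule can be sketched in a few lines of Python. The sketch assumes that the addSubView calls have already been extracted as (ViewController, parameter) pairs and that the Ad view list is available as a set; the function names and the sample identifiers are hypothetical.

```python
from collections import Counter

def count_ad_subviews(add_subview_calls, ad_view_list):
    """Count, per ViewController, the addSubView calls whose parameter is an Ad view."""
    counts = Counter()
    for controller, param in add_subview_calls:
        if param in ad_view_list:
            counts[controller] += 1   # plays the role of RecordAddSubAdView
    return counts

def multi_view_violations(add_subview_calls, ad_view_list):
    """ViewControllers that add more than one Ad view."""
    counts = count_ad_subviews(add_subview_calls, ad_view_list)
    return sorted(controller for controller, n in counts.items() if n > 1)

ad_views = {"bannerTop", "bannerBottom"}
calls = [
    ("HomeVC", "bannerTop"),
    ("HomeVC", "titleLabel"),
    ("HomeVC", "bannerBottom"),
    ("SettingsVC", "bannerTop"),
]
print(multi_view_violations(calls, ad_views))
```

Here `HomeVC` adds two Ad views and is flagged, while `SettingsVC` adds only one and is not.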


Algorithm 6 Check Multi-view violation Ad algorithm

procedure checkFMAF(CFG_AD)
    ▷ CFG_AD is the control flow graph of an app that includes Ad-related API
    for ∀ViewNode ∈ CFG_AD do
        ADViewList ← checkADView(CFG_AD)
        for ∀adviewNode ∈ ADViewList do
            CallSbrtList ← GetCallerSbrt(ViewNode)
            for ∀sbrt ∈ CallSbrtList do
                if sbrt.name ∈ addSubView ∧ sbrt.param ∈ adviewNode then
                    RecordAddSubAdView(CFG_AD)
                end if

A ViewController calls the addSubView function to add many types of view. If one ViewController adds both an Ad view and a full-screen view, it is very likely to be Overlay-view violation Ad fraud. We check whether a parameter of addSubView of the ViewController is an instance of any kind of Ad view, and then check whether the same ViewController calls addSubView to add an instance of a full-screen view. If a ViewController exhibits both behaviors, we call it Overlay-view violation Ad fraud. The algorithm for checking Overlay-view violation Ad fraud is shown in Algorithm 7. The checkFullView function is almost the same as checkADView, except that the input string automaton is built from full-screen-view-related APIs.
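The two-condition test can be sketched in Python as a set intersection over ViewControllers. As before, this assumes the addSubView calls are available as (ViewController, parameter) pairs and that the Ad view and full-screen view lists have already been computed; all names and sample identifiers are hypothetical.

```python
def overlay_view_violations(add_subview_calls, ad_views, full_views):
    """ViewControllers that add both an Ad view and a full-screen view."""
    adds_ad, adds_full = set(), set()
    for controller, param in add_subview_calls:
        if param in ad_views:
            adds_ad.add(controller)      # first behavior: adds an Ad view
        if param in full_views:
            adds_full.add(controller)    # second behavior: adds a full-screen view
    return sorted(adds_ad & adds_full)   # both behaviors on the same controller

ad_views = {"banner"}
full_views = {"fullScreenCover"}
calls = [
    ("GameVC", "banner"),
    ("GameVC", "fullScreenCover"),
    ("MenuVC", "banner"),
    ("HelpVC", "fullScreenCover"),
]
print(overlay_view_violations(calls, ad_views, full_views))
```

Collecting both sets in one pass and intersecting them avoids returning early after the first match, so a controller is flagged only when both behaviors are observed.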

Algorithm 7 Check Overlay-view violation Ad fraud algorithm

procedure checkFOAF(CFG_AD)
    ▷ CFG_AD is the control flow graph of an app that includes Ad-related API
    for ∀ViewNode ∈ CFG_AD do
        ADViewList ← checkADView(CFG_AD)
        FullViewList ← checkFullView(CFG_AD)
        for ∀adViewNode ∈ ADViewList ∧ ∀fullViewNode ∈ FullViewList do
            CallSbrtList ← GetCallerSbrt(ViewNode)
            for ∀sbrt ∈ CallSbrtList do
                if sbrt.name ∈ addSubView ∧ sbrt.param ∈ adViewNode then
                    RecordAddSubAdView(CFG_AD)
                    return
                end if
            end for
            for ∀sbrt ∈ CallSbrtList do
                if sbrt.name ∈ addSubView ∧ sbrt.param ∈ fullViewNode then
                    RecordAddSubFullView(CFG_AD)
                    return

5 Evaluation

5.1 Environment

We perform the Ad fraud detection analysis in a real environment, following the flow shown in Figure 9. First, we download iOS apps from Apple's App Store on an iPhone 7 using a Sikuli-based approach [14]. We then install each app, fetch its binary, and decrypt the binary. Next, we construct its ARMv7 assembly with IDA Pro and generate its control flow graph with the Binflow script [58]. In the last step, we run the proposed Algorithms 2, 3, 4, 5, and 6 from Section 4 on the generated CFG to detect Ad-related APIs and Ad fraud.

We collected 30 thousand apps from the App Store, covering 33 genres. Their release dates range from 2008 to 2017, and all of them have been updated after 2016. A generated control flow graph can have more than 10 thousand nodes, which makes its construction a memory-intensive task. Hence, we use a high-end server with 24 GB of RAM to generate the control flow graph of each app.

Our analysis is based on binary code for iOS 9 on the 32-bit ARM architecture. To confirm whether the apps we analyzed commit Ad fraud violations, we downloaded the latest version of each app and observed its behavior. We also checked the revision history of the violating applications and found no description of any change to the way they present advertisements. Therefore, if we detect an Ad fraud violation in the binary code of an app, we believe the violation is also present in the code of its latest version.

We also provide a GitHub repository that presents the results of the Ad fraud detection analysis [87]. The results up to the control flow graph stage are reported in the Binflow GitHub repository [88].
