Experimental Results - Experimental Results and Evaluation - Targeted Advertising via Influence Maximization in Labeled Social Networks

Chapter 4 Experimental Results and Evaluation

4.2 Experimental Results

For the labeled influence maximization problem, we propose six methods: LabeledGreedy, LabeledNewGreedy, CELFLabeledGreedy, LabeledDegreeDiscount, MaximumCoverage, and ProximityDiscount. Because LabeledGreedy and CELFLabeledGreedy are computationally too expensive, and the experiments in [3] show that NewGreedy performs about as well as CELFGreedy, the following experiments compare only LabeledNewGreedy, LabeledDegreeDiscount, MaximumCoverage, and ProximityDiscount.

In the experiments, a method labeled MaximumCoverage_threshold_0.05 denotes MaximumCoverage with a proximity threshold of 0.05.

The experiments use the independent cascade model to simulate influence diffusion, with an influence probability of 0.05.
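As a rough illustration of this diffusion process, the following sketch simulates the independent cascade model with a uniform activation probability and estimates the expected spread by Monte Carlo sampling. The adjacency-list representation, the function names `simulate_ic` and `expected_spread`, and the toy graph are assumptions made for this example and are not taken from the experiments.

```python
import random
from collections import deque


def simulate_ic(graph, seeds, p=0.05, rng=None):
    """Run one independent cascade diffusion and return the activated nodes.

    graph: dict mapping each node to a list of its out-neighbours
           (hypothetical adjacency-list representation).
    seeds: iterable of initially active nodes.
    p:     uniform activation probability per edge (0.05 in the experiments).
    """
    if rng is None:
        rng = random.Random()
    active = set(seeds)
    frontier = deque(active)
    while frontier:
        u = frontier.popleft()
        for v in graph.get(u, ()):
            # A newly active node gets exactly one chance to activate
            # each neighbour that is still inactive.
            if v not in active and rng.random() < p:
                active.add(v)
                frontier.append(v)
    return active


def expected_spread(graph, seeds, p=0.05, runs=1000, seed=0):
    """Monte Carlo estimate of the expected number of activated nodes."""
    rng = random.Random(seed)
    total = sum(len(simulate_ic(graph, seeds, p, rng)) for _ in range(runs))
    return total / runs


# Toy graph for illustration only.
toy = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
print(expected_spread(toy, ["a"]))
```

With p = 0.05 most runs activate only the seed itself, which is why many repetitions are needed to get a stable estimate.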

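Effectiveness is compared in three ways: (1) a single target label; (2) multiple target labels, each with a profit of 1; and (3) multiple target labels whose profits are not all equal.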


Experiment 1: Comparison on a single label

Figure 4.1 compares the effectiveness of four methods: LabeledNewGreedy, ProximityDiscount, LabeledDegreeDiscount, and MaximumCoverage with proximity threshold 0.05 (MaximumCoverage for short). In Figure 4.1 the target label is Drama, and influencing one actor labeled Drama yields a profit of 1. Drama is the most common label in the dataset, with 3927 actors. As the figure shows, LabeledNewGreedy performs best of the four methods. With the first seed node, ProximityDiscount achieves the same profit as LabeledNewGreedy, about 100 more than LabeledDegreeDiscount and about 70 more than MaximumCoverage; this is where ProximityDiscount's lead over those two methods is largest. ProximityDiscount falls behind LabeledDegreeDiscount only when the number of seed nodes is 4, and outperforms it in every other case.


However, as the number of seed nodes grows, ProximityDiscount's lead over MaximumCoverage and LabeledDegreeDiscount shrinks. MaximumCoverage also generally outperforms LabeledDegreeDiscount, although by a smaller margin than ProximityDiscount does: it falls behind LabeledDegreeDiscount only when the number of seed nodes is 3.

Figure 4.1 Results of Experiment 1 (target label = {Drama}, profit 1).

Figure 4.2 compares the four methods when the target label is Comedy. ProximityDiscount beats LabeledDegreeDiscount, MaximumCoverage, and LabeledNewGreedy when the number of seed nodes is small, but its lead shrinks as the number of seed nodes grows, and it is eventually overtaken by all three. In this experiment, MaximumCoverage and LabeledDegreeDiscount outperform ProximityDiscount once the number of seed nodes exceeds 5. LabeledNewGreedy performs poorly with few seed nodes, trailing ProximityDiscount by about 25 profit with one seed node, but it improves as seed nodes are added and outperforms the other three methods once the number of seed nodes exceeds 15.


Figure 4.2 Results of Experiment 1 (target label = {Comedy}).

Figure 4.3 Results of Experiment 2 (target labels = {Comedy, Biography}, each with a profit of 1).

Experiment 2: Multiple target labels, each with a profit of 1

In Figure 4.3 the target labels are Comedy and Biography, each with a profit of 1. The experiment shows that LabeledNewGreedy performs best. Under this setting, ProximityDiscount outperforms MaximumCoverage and LabeledDegreeDiscount only when there are fewer than 5 seed nodes; with more seed nodes, both MaximumCoverage and LabeledDegreeDiscount outperform ProximityDiscount.

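Figure 4.4 shows the comparison when the target labels are Thriller and Comedy, each with a profit of 1. As in Figure 4.3, ProximityDiscount outperforms LabeledDegreeDiscount and MaximumCoverage when the number of seed nodes is small; when it is large, MaximumCoverage and LabeledDegreeDiscount outperform ProximityDiscount and perform about equally to each other. LabeledNewGreedy, by contrast, leads under these target labels and label weights only when the number of seed nodes is 1, by about 10 profit, and is worse than the other three methods in every other case. Its worst case occurs with 2 seed nodes, where its profit is about 38 lower than that of ProximityDiscount, but the gap narrows as the number of seed nodes grows.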

Figure 4.4 Results of Experiment 2 (target labels = {Comedy, Thriller}, each with a profit of 1).

In Figure 4.5 the target labels are all labels, each with a profit of 1; under this setting the problem reduces to ordinary influence maximization. The experiment shows that LabeledNewGreedy performs best of the four methods, followed by ProximityDiscount. With few seed nodes, ProximityDiscount clearly outperforms MaximumCoverage and LabeledDegreeDiscount and is on par with LabeledNewGreedy. With more seed nodes, ProximityDiscount is better than or on par with LabeledDegreeDiscount. Once the number of seed nodes exceeds 9, MaximumCoverage consistently outperforms LabeledDegreeDiscount.

Figure 4.5 Results of Experiment 2 (all labels are target labels, each with a profit of 1).

Experiment 3: Multiple target labels with unequal profits

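Figure 4.6 shows the results when the target labels are Comedy and Drama, with profits set so that influencing one actor labeled Comedy is worth as much as influencing three actors labeled Drama. As the figure shows, LabeledNewGreedy performs best of the four methods, followed by ProximityDiscount. With fewer than 4 seed nodes, ProximityDiscount is on par with LabeledNewGreedy. Compared with LabeledDegreeDiscount, ProximityDiscount leads by as much as 90 profit. In addition, MaximumCoverage outperforms LabeledDegreeDiscount in this experiment regardless of the number of seed nodes.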
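To make the effect of unequal label profits concrete, the sketch below computes the profit of a set of influenced nodes. It assumes one label per node and uses illustrative weights of 3 and 1 that merely respect the 3:1 ratio described above; the actual weight values, the node identifiers, and the function name `labeled_profit` are assumptions for this example.

```python
def labeled_profit(activated, labels, weights):
    """Sum the profit of activated nodes whose label is a target label.

    activated: set of influenced node ids.
    labels:    dict node -> label (one label per node; an assumption).
    weights:   dict target label -> profit; other labels earn nothing.
    """
    return sum(weights.get(labels.get(node), 0.0) for node in activated)


# Illustrative weights that keep the 3:1 ratio; not the experiment's values.
weights = {"Comedy": 3.0, "Drama": 1.0}
labels = {
    "n1": "Comedy",
    "n2": "Drama",
    "n3": "Drama",
    "n4": "Drama",
    "n5": "Thriller",  # not a target label, so it contributes no profit
}

one_comedy = labeled_profit({"n1"}, labels, weights)
three_dramas = labeled_profit({"n2", "n3", "n4"}, labels, weights)
print(one_comedy, three_dramas)
```

Under these weights one influenced Comedy actor and three influenced Drama actors yield the same profit, so a method that reaches fewer but higher-weight nodes can still come out ahead.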


Figure 4.6 Results of Experiment 3 (target labels = {Comedy, Drama}).

Figure 4.7 Results of Experiment 3 (target labels = {Comedy, Thriller, Drama}).

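In Figure 4.7 the target labels are Thriller, Comedy, and Drama, each with its own profit. Under this setting, LabeledNewGreedy is the most effective of the four methods. With a single seed node, ProximityDiscount earns about 340 more profit than LabeledDegreeDiscount and MaximumCoverage, and with fewer than 4 seed nodes its result matches that of LabeledNewGreedy.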

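The experiments above show that ProximityDiscount clearly outperforms LabeledDegreeDiscount and MaximumCoverage when the number of seed nodes is small, for example 1 or 2, and that its margin shrinks as the number of seed nodes grows. This suggests that ProximityDiscount is better than MaximumCoverage and LabeledDegreeDiscount at identifying the most influential nodes. MaximumCoverage, by contrast, performs worse than ProximityDiscount with few seed nodes, but its profit grows more noticeably than that of ProximityDiscount as seed nodes are added. Compared with LabeledDegreeDiscount, MaximumCoverage is either better or on par.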

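In addition, LabeledNewGreedy performs worse than the other three methods when the target labels include Comedy (Figures 4.2, 4.3, and 4.4). We conjecture that LabeledNewGreedy is less suitable when the target label is Comedy alone, or Comedy paired with a label that has fewer nodes, such as {Comedy, Biography} or {Comedy, Thriller}. A possible reason is that the nodes labeled Comedy have a special structure within the labeled social network.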

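In Figures 4.5, 4.6, and 4.7, however, where Comedy is paired with Drama, a label with more nodes, LabeledNewGreedy again outperforms the other methods. A possible reason is that Drama has 3011 more nodes than Comedy, so the profit lost to the special structure of the Comedy nodes is outweighed by the profit gained from Drama, and LabeledNewGreedy still performs better than the other three methods.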

Figure 4.8 shows the time each of the four methods needs when the target label is Comedy with a profit of 1: LabeledDegreeDiscount takes 0.1 seconds, ProximityDiscount 235 seconds, MaximumCoverage 20 seconds, and LabeledNewGreedy 30000 seconds.
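The size of these gaps is easier to see as ratios. The short sketch below divides each running time reported above by a chosen reference; the helper name `slowdown_vs` is hypothetical, and only the four timings come from the text.

```python
def slowdown_vs(runtimes, reference):
    """Return each method's running time as a multiple of the reference method's."""
    base = runtimes[reference]
    return {name: seconds / base for name, seconds in runtimes.items()}


# Running times in seconds, as reported for Figure 4.8.
runtimes_s = {
    "LabeledDegreeDiscount": 0.1,
    "MaximumCoverage": 20,
    "ProximityDiscount": 235,
    "LabeledNewGreedy": 30000,
}

for name, ratio in slowdown_vs(runtimes_s, "LabeledDegreeDiscount").items():
    print(f"{name}: about {ratio:.0f}x the time of LabeledDegreeDiscount")
```

Relative to LabeledDegreeDiscount, MaximumCoverage takes about 200 times as long, ProximityDiscount about 2350 times, and LabeledNewGreedy about 300000 times. LabeledNewGreedy in turn takes roughly 128 times as long as ProximityDiscount.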


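Although ProximityDiscount and MaximumCoverage take longer than LabeledDegreeDiscount, their running times remain acceptable. Both are less effective than LabeledNewGreedy but more effective than LabeledDegreeDiscount. Weighing effectiveness against efficiency, ProximityDiscount and MaximumCoverage are therefore the better choices.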

Figure 4.8 Running time of LabeledDegreeDiscount, ProximityDiscount, MaximumCoverage, and LabeledNewGreedy.


Chapter 5 Conclusions and Future Work

5.1 Conclusions

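The influence maximization problem seeks k influential people in a social network such that the largest number of people in the network are influenced. However, it does not account for the fact that different people differ in importance. We therefore propose the labeled influence maximization problem for labeled social networks.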

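In a labeled social network, every node carries labels, and every label has a weight that represents its importance. The labeled influence maximization problem asks how to find, in a labeled social network, the k people who influence the most people matching the target labels.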
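Stated a little more formally, the objective can be sketched as follows. The notation is an assumption introduced here for illustration and is not taken from the text: $G = (V, E)$ is the labeled social network, $A(S)$ is the random set of nodes activated by the seed set $S$ under the diffusion model, and $p_T(v) \ge 0$ is the profit of influencing node $v$, determined by the weights of the target labels $T$ that $v$ carries (and 0 if it carries none).

```latex
\[
  S^{*} \;=\; \operatorname*{arg\,max}_{S \subseteq V,\ |S| = k} \ \sigma_T(S),
  \qquad
  \sigma_T(S) \;=\; \mathbb{E}\!\left[\, \sum_{v \in A(S)} p_T(v) \right]
\]
```

When $p_T(v) = 1$ for every node, $\sigma_T(S)$ is the expected number of activated nodes, which is the objective of the ordinary influence maximization problem.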

We propose six new methods for the labeled influence maximization problem. LabeledGreedy, LabeledNewGreedy, LabeledCELFGreedy, and LabeledDegreeDiscount adapt existing methods for the influence maximization problem, while ProximityDiscount and MaximumCoverage are two new methods designed for the labeled influence maximization problem.
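For readers unfamiliar with the greedy family that several of these methods build on, the following is a minimal sketch of generic hill-climbing seed selection with a profit-weighted objective. It is not the LabeledGreedy implementation from this work: the per-node `profit` dictionary, the function names, and the Monte Carlo settings are all assumptions made for illustration, and every node is assumed to appear as a key of `graph`.

```python
import random


def simulate_profit(graph, seeds, profit, p, rng):
    """One independent cascade run; returns the total profit of activated nodes."""
    active = set(seeds)
    frontier = list(active)
    while frontier:
        u = frontier.pop()
        for v in graph.get(u, ()):
            # Each active node tries once to activate each inactive neighbour.
            if v not in active and rng.random() < p:
                active.add(v)
                frontier.append(v)
    return sum(profit.get(v, 0.0) for v in active)


def greedy_seeds(graph, profit, k, p=0.05, runs=200, seed=0):
    """Pick up to k seeds, each time adding the node with the best estimated profit."""
    rng = random.Random(seed)
    nodes = sorted(graph)  # fixed order keeps tie-breaking reproducible
    chosen = []
    for _ in range(k):
        best_node, best_value = None, float("-inf")
        for candidate in nodes:
            if candidate in chosen:
                continue
            trial = chosen + [candidate]
            # Monte Carlo estimate of the expected profit of the trial seed set.
            value = sum(
                simulate_profit(graph, trial, profit, p, rng) for _ in range(runs)
            ) / runs
            if value > best_value:
                best_node, best_value = candidate, value
        if best_node is None:
            break  # fewer than k nodes are available
        chosen.append(best_node)
    return chosen


# Toy example: only b, c and e carry a target label (profit 1 each).
toy = {"a": ["b", "c"], "b": [], "c": [], "d": ["e"], "e": []}
print(greedy_seeds(toy, {"b": 1.0, "c": 1.0, "e": 1.0}, k=2, p=1.0, runs=5))
```

Every candidate is re-evaluated by simulation in every round, which is the main reason plain greedy selection becomes expensive on large networks.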

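The experimental results show that ProximityDiscount is the best choice when balancing efficiency and effectiveness. With few seed nodes, ProximityDiscount clearly outperforms LabeledDegreeDiscount and MaximumCoverage; with many seed nodes, MaximumCoverage outperforms LabeledDegreeDiscount and ProximityDiscount. The method can therefore be chosen according to the marketer's needs: ProximityDiscount when the required k is small, and MaximumCoverage otherwise.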


5.2 Future Work

The proximity threshold affects both the effectiveness and the efficiency of MaximumCoverage and ProximityDiscount. The higher the threshold, the faster the two methods run, but the results do not necessarily improve. How to find a suitable proximity threshold is therefore worth studying.

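In addition, the methods currently proposed for labeled influence maximization, including LabeledNewGreedy, LabeledDegreeDiscount, MaximumCoverage, and ProximityDiscount, are derived mainly from the properties of the independent cascade model. Whether methods for the labeled influence maximization problem can be developed for the weighted cascade model remains an open question.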
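For context, the weighted cascade model as it is commonly defined in the influence maximization literature differs from the uniform-probability setting used in the experiments in how edge probabilities are assigned: the edge from u to v receives probability 1 / in-degree(v) rather than a fixed value such as 0.05. The sketch below computes these probabilities from an edge list; the function name and the toy edges are assumptions for illustration.

```python
from collections import Counter


def weighted_cascade_probabilities(edges):
    """Assign each directed edge (u, v) the probability 1 / in_degree(v).

    edges: iterable of distinct (u, v) pairs; duplicate pairs would inflate
           the in-degree count, so they are assumed to be absent.
    """
    edges = list(edges)
    in_degree = Counter(v for _, v in edges)
    return {(u, v): 1.0 / in_degree[v] for u, v in edges}


# Toy edge list: c has two incoming edges, b has one.
probs = weighted_cascade_probabilities([("a", "c"), ("b", "c"), ("a", "b")])
print(probs)
```

Because the probabilities now depend on the in-degree of the receiving node, heuristics tuned to a single uniform probability may need to be revisited under this model.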


References

[1] F. Bass, “A New Product Growth Model for Consumer Durables,” Management Science, Vol. 15, No. 5, 1969.

[2] J. Brown and P. Reingen, “Social Ties and Word-of-mouth Referral Behavior,” Journal of Consumer Research, Vol. 14, No. 3, 1987.

[3] W. Chen, Y. Wang, and S. Yang, “Efficient Influence Maximization in Social Networks,” Proc. of ACM International Conference on Knowledge Discovery and Data Mining SIGKDD, 2009.

[4] A. Chin and M. Chignell, “A Social Hypertext Model for Finding Community in Blogs,” Proc. of Conference on Hypertext and Hypermedia, 2006.

[5] G. Cornuejols, M. Fisher, and G. Nemhauser, “Location of Bank Accounts to Optimize Float,” Management Science, Vol. 23, 1977.

[6] P. Domingos and M. Richardson, “Mining the Network Value of Consumers,” Proc. of ACM International Conference on Knowledge Discovery and Data Mining SIGKDD, 2001.

[7] P. G. Doyle and J. L. Snell, “Random Walks and Electrical Networks,” The Mathematical Association of America, 1985.

[8] C. Faloutsos, K. S. McCurley, and A. Tomkins, “Fast Discovery of Connection Subgraphs,” Proc. of ACM International Conference on Knowledge Discovery and Data Mining SIGKDD, 2004.

[9] B. Gallagher, H. Tong, T. Eliassi-Rad, and C. Faloutsos, “Using Ghost Edges for Classification in Sparsely Labeled Networks,” Proc. of ACM International Conference on Knowledge Discovery and Data Mining SIGKDD, 2008.

[10] J. Goldenberg, B. Libai, and E. Muller, “Using Complex Systems Analysis to Advance Marketing Theory Development,” Academy of Marketing Science Review, Vol. 2001, No. 9, 2001.

[11] J. Goldenberg, B. Libai, and E. Muller, “Talk of the Network: A Complex Systems Look at the Underlying Process of Word-of-Mouth,” Marketing Letters, Vol. 12, No. 3, 2001.

[12] D. Kempe, J. Kleinberg, and E. Tardos, “Maximizing the Spread of Influence through a Social Network,” Proc. of ACM International Conference on Knowledge Discovery and Data Mining SIGKDD, 2003.

[13] N. Katoh, T. Ibaraki, and H. Mine, “An Efficient Algorithm for k Shortest Simple Paths,” Networks, Vol. 12, pp. 411-427, 1982.

[14] Y. Koren, S. C. North, and C. Volinsky, “Measuring and Extracting Proximity in Networks,” Proc. of ACM International Conference on Knowledge Discovery and Data Mining SIGKDD, 2006.

[15] J. Leskovec, L. A. Adamic, and B. A. Huberman, “The Dynamics of Viral Marketing,” ACM Transactions on the Web, Vol. 1, No. 1, 2007.

[16] J. Leskovec, A. Krause, C. Guestrin, C. Faloutsos, J. VanBriesen, and N. Glance, “Cost-effective Outbreak Detection in Networks,” Proc. of ACM International Conference on Knowledge Discovery and Data Mining SIGKDD, 2007.

[17] D. Liben-Nowell and J. Kleinberg, “The Link Prediction Problem for Social Networks,” Proc. of International Conference on Information and Knowledge Management CIKM, 2003.


[18] V. Mahajan, E. Muller, and F. Bass, “New Product Diffusion Models in Marketing: A Review and Directions for Research,” Journal of Marketing, Vol. 54, No. 1, pp. 1-26, 1990.

[19] E. Q. V. Martins and M. M. B. Pascoal, “A New Implementation of Yen’s Ranking Loopless Paths Algorithm,” Quarterly Journal of the Belgian, French and Italian Operations Research Societies, 2002.

[20] G. Nemhauser, L. Wolsey, and M. Fisher, “An Analysis of the Approximations for Maximizing Submodular Set Functions,” Mathematical Programming, Vol. 14, No. 1, 1978.

[21] M. Richardson and P. Domingos, “Mining Knowledge-Sharing Sites for Viral Marketing,” Proc. of International Conference on Knowledge Discovery and Data Mining, 2002.

[22] J. Scripps, P. N. Tan, and A. H. Esfahanian, “Exploring the Link Structure and Community-based Node Roles in Networked Data,” Proc. of IEEE International Conference on Data Mining ICDM, 2007.

[23] H. Tong, C. Faloutsos, and J. Y. Pan, “Fast Random Walk with Restart and Its Applications,” Proc. of IEEE International Conference on Data Mining ICDM, 2006.

[24] X. Yan and J. Han, “gSpan: Graph-Based Substructure Pattern Mining,” Proc. of the 2002 IEEE International Conference on Data Mining ICDM, 2002.
