HOW AND WHY (NOT) SHOULD WE EMBRACE BIG DATA? A REFLECTION FROM THE ASPECT OF EPISTEMOLOGY

Academic year: 2022

(1)

HOW AND WHY (NOT) SHOULD WE EMBRACE BIG DATA? A REFLECTION FROM THE ASPECT OF EPISTEMOLOGY

Prof. Chengshan (Frank) Liu

Institute of Political Science, NSYSU

2018.5.3 @Dept. of Political Science, NCKU

(2)

WHAT IS PHD FOR?

fact, truth, reality, knowledge, or… ?

(3)

WHAT IS “BIG DATA”?

5Vs: Big volume, velocity, variety, veracity, and value.

Honestly, this term has gone out of fashion.

(4)

WHAT DO SCHOLARS MEAN BY SAYING “BIG DATA”?

In our field, “data-driven” and “method-driven” research works are labelled as “big data” studies.

Methods that are associated with “big data”:

text mining,

data mining,

automatic content analysis,

computer-assisted text analysis,

automatic annotation,

sentiment analysis,

geographic information systems,

network analysis, and so on.

(5)

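Internationally, publications on applying big data to political science research are led by Gary King. Most of the other literature has been influenced and inspired, to a greater or lesser degree, by the research group he leads, to the point that it has effectively become a Gary King school. At Harvard University's Institute for Quantitative Social Science (IQSS), King studies how different research methods and quantitative tools can advance social science research.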

Check out his upcoming talks May 29-30 @NTU

Image source: http://ppt.cc/Aqutw

(6)

KING’S PURPOSES OF EMBRACING BIG DATA

Evaluate public policy,

understand what social media posts say,

estimate the causes of death,

ensure fair legislative redistricting,

reverse engineer the Chinese government’s censorship program,

forecast elections and international conflict.

(7)

Topic 1: Overview of applying information tools in the social sciences (political science)

• 2010. “A Method of Automated Nonparametric Content Analysis for Social Science.”

• 2012. “Social Science Research Methods in Internet Time.”

• 2014. “Restructuring the Social Sciences: Reflections from Harvard’s Institute for Quantitative Social Science.”

• 2015. “Computer-Assisted Text Analysis for Comparative Politics.”

• 2015. “No! Formal Theory, Causal Inference, and Big Data Are Not Contradictory Trends in Political Science.”

• 2015. “We Are All Social Scientists Now: How Big Data, Machine Learning, and Causal Inference Work Together.”

• 2015. “Is Bigger Always Better? Potential Biases of Big Data Derived from Social Network Sites.”

• 2016. “Machine Translation: Mining Text for Social Theory.”

(8)

Topic 2: Identifying or tracking trends in public discourse

• 2008. “Recognizing Citations in Public Comments.”

• 2008. “Parsing, Semantic Networks, and Political Authority: Using Syntactic Analysis to Extract Semantic Relations from Dutch Newspaper Articles.”

• 2008. “Good News or Bad News? Conducting Sentiment Analysis on Dutch Text to Distinguish Between Positive and Negative Relations.”

• 2008. “Media Monitoring by Means of Speech and Language Indexing for Political Analysis.”

• 2012. “Media Coverage in Times of Political Crisis: A Text Mining Approach.”

• 2013. “Text as Data: The Promise and Pitfalls of Automatic Content Analysis Methods for Political Texts.”

• 2014. “Echo Chamber or Public Sphere? Predicting Political Orientation and Measuring Political Homophily in Twitter Using Big Data.”

• 2017. “Critical News Reading with Twitter? Exploring Data-mining Practices and their Impact on Societal Discourse.”

(9)

Other topics (3 to 5)

Topic 3: Identifying and tracking political positions

2003. “Extracting Policy Positions from Political Texts Using Words as Data.”

2008. “A Scaling Model for Estimating Time-series Party Positions from Texts.”

2014. “Scaling Politically Meaningful Dimensions Using Texts and Votes.”

2015. “Quantifying Social Media’s Political Space: Estimating Ideology from Publicly Revealed Preferences on Facebook.”

Topic 4: Strategies for regulating political speech

2013. “How Censorship in China Allows Government Criticism but Silences Collective Expression.”

2013. Media Commercialization & Authoritarian Rule in China.

2017. “How the Chinese Government Fabricates Social Media Posts for Strategic Distraction, not Engaged Argument.”

Topic 5: Studies of public policy formation

2005. “Using Geographic Information Systems to Study Interstate Competition.”

2014. “‘Big Data’ in Research on Social Policy.”

2015. “Analyzing Big Data: Social Choice and Measurement.”

(10)

Other topics (6 to 8)

Topic 6: Semantic analysis of political speech

2008. “Automatic Annotation of Semantic Fields for Political Science Research.”

2015. “Uncovering Social Semantics from Textual Traces: A Theory Driven Approach and Evidence from Public Statements of US Members of Congress.”

Topic 7: Applications in political elections

2014. “Political Campaigns and Big Data.”

2017. “The Pulse of the People: Can internet data outdo costly and unreliable polls in predicting election outcomes?”

Topic 8: International relations research

2012. “Richardson in the Information Age: Geographic Information Systems and Spatial Data in International Studies.”

(11)

WHY (NOT) BIG DATA?

Your epistemological and methodological stances, and your attitudes toward methods, decide how you evaluate (if not disdain) “big data”.

(12)
(13)

FROM BIG DATA TO DATA SCIENCE

“Data science is an interdisciplinary field of scientific methods, processes, algorithms and systems to extract knowledge or insights from data in various forms, either structured or unstructured, similar to data mining.”

~ Wikipedia

(14)
(15)

HOW DO POSITIVISTS LOOK AT “BIG DATA”?

Evans & Aceves (2016) “Machine Translation: Mining Text for Social Theory.”

(16)

LET’S LOOK AT THE WHOLE THING FROM THE RIGHT ANGLE:

DATA-ASSISTED MEANING NETTING

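The practice of big data tells us that, since the purpose of knowledge is exploration, we should focus on discovery and not (necessarily) on verification. Data can be used to find associations, and even more to mine for meaning. You might first identify the concepts or dimensions you are interested in (which values, which behaviors, which attitudes?) and then explore them through data. On one side, identify the possible relationships among different values, attitudes, and behaviors; on the other, hold a dialogue between those findings and the relationships you expected. Only then move on to interpreting the meaning.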
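
The exploratory workflow on this slide (pick concepts of interest, explore associations in the data, compare them with what you expected, then interpret) could be sketched roughly as follows. This is only an illustration, not a method from the talk: the variable names (trust, turnout, cynicism), the toy responses, and the helper functions pearson and explore are all hypothetical, and Pearson correlation is just one convenient stand-in for "association".

```python
from itertools import combinations
from statistics import mean, pstdev


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    mx, my = mean(x), mean(y)
    sx, sy = pstdev(x), pstdev(y)
    if sx == 0 or sy == 0:
        return 0.0  # a constant variable shows no association
    cov = mean((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / (sx * sy)


def explore(data, expected):
    """Compare observed pairwise associations with the researcher's expectations.

    data:     dict mapping variable name -> list of numeric responses
    expected: dict mapping (var_a, var_b), names in alphabetical order,
              to +1 or -1 (the expected sign of the relationship)
    Returns rows (var_a, var_b, r, verdict), strongest association first.
    """
    rows = []
    for a, b in combinations(sorted(data), 2):
        r = pearson(data[a], data[b])
        sign = expected.get((a, b))
        if sign is None:
            verdict = "unexpected pair: interpret"
        elif r * sign > 0:
            verdict = "matches expectation"
        else:
            verdict = "contradicts expectation"
        rows.append((a, b, round(r, 2), verdict))
    return sorted(rows, key=lambda row: abs(row[2]), reverse=True)


# Made-up toy responses on 1-5 scales, for illustration only (not real data).
toy = {
    "trust": [1, 2, 3, 4, 5, 4],
    "turnout": [1, 2, 2, 4, 5, 5],
    "cynicism": [5, 4, 3, 2, 1, 2],
}
# Hypothetical prior expectations held by the researcher.
expectations = {("cynicism", "trust"): -1, ("trust", "turnout"): +1}

for row in explore(toy, expectations):
    print(row)
```

The pairs flagged as unexpected or contradictory are where the dialogue between data and expectation starts; the interpretation of meaning remains the researcher's job, not the script's.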

Let’s make our exploration DAMN right.

(17)

DATA SCIENCE FOR EXTRACTING FACTS AND DISCOVERING MEANING

fact vs. truth vs. reality vs. knowledge

(18)

March 2016: Google watched how people use a phone, in a van, for over an hour at a time. Goal: interview 500 people.

(19)

REFLECTIONS FROM THE HUMANITIES

Holmes, J. (2015). Nonsense: The Power of Not Knowing (First Edition). New York: Crown Publishers. (Chinese edition: 《無知的力量》)

Lindstrom, M. (2016). Small Data: The Tiny Clues That Uncover Huge Trends. New York City: St. Martin’s Press. (Chinese edition: 《小數據獵人》)

Madsbjerg, C. (2017). Sensemaking: The Power of the Humanities in the Age of the Algorithm. New York, NY: Hachette Books.

(20)

MEANING NETTING

Blackburn, S. (2012). What Do We Really Know? The Big Questions in Philosophy. London: Quercus.

Cohen, L. H. (2013). I Don’t Know: In Praise of Admitting Ignorance. New York: Riverhead Books.

Holmes, J. (2015). Nonsense: The Power of Not Knowing (First Edition). New York: Crown Publishers.

Madsbjerg, C. (2017). Sensemaking: The Power of the Humanities in the Age of the Algorithm. New York, NY: Hachette Books.

Sesno, F., & Blitzer, W. (2017). Ask More: The Power of Questions to Open Doors, Uncover Solutions, and Spark Change. New York: AMACOM.

Zarkadakis, G. (2016). In Our Own Image: Savior or Destroyer? The History and Future of Artificial Intelligence (1st edition). Pegasus Books.

(21)

DAMN METHODS

(22)
(23)
(24)

DATA

• Taiwan Election and Democracy Studies 2016

• Data Collection Period: 2017.1.17 ~ 4.28

• N=1,690

• Cost: > NTD 1,000,000

(25)
(26)

Profile of respondents with no party leaning

(27)

Profile of pan-Blue and pan-Green supporters

(28)

CONCLUSION

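It is not a matter of quantitative vs. qualitative means, nor of big data vs. thick data, but of the dialogue between data and meaning in the researcher's mind.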

(29)

HOW DO I RE-EVALUATE “SURVEY”?

(30)
(31)

Have you ever considered that people in Taiwan define “independence” in many different ways, and may well have no consensus on it?

(32)
(33)

SMILEPOLL.TW

A quali-quantitative platform for collecting preferences, patterns, and values, for netting data and meaning.

(34)

CONCLUSION: HOW AND WHY (NOT) SHOULD WE EMBRACE BIG DATA?

Exploring new patterns via big data is the spirit of data science. (So think again about what political science means.)

Different epistemological camps see different uses of big data. (Which side will you take?)

“Meaning mining with data” is the consequence of the above way of thinking.

Data size matters much less than the purposes of using data.

Learning new data analytical tools will help you get connected to the world of exploring patterns and facts via data.

But be fully aware that we should locate our purposes first.

(35)
(36)

REFERENCES
