Predicting the Veracity of Rumors in Social Networks: Computational Explorations

by Soroush Vosoughi

Ph.D. Thesis Proposal, Media Arts and Sciences

Massachusetts Institute of Technology

November, 2014


Prof. Deb Roy

Associate Professor of Media Arts and Sciences

Massachusetts Institute of Technology


Dr. Allen Gorin

Research Associate

JHU Center of Excellence for Human Language Technology

Visiting Scholar

MIT Laboratory for Social Machines


Prof. Aral Sinan

Associate Professor of Information Technology and Marketing

Massachusetts Institute of Technology


Contents

1 Introduction

2 Background

3 Thesis Summary and Methodology

3.1 Anatomy of an Assertion in Social Media

3.2 Quantifying and Operationalizing Assertions

3.2.1 The Form/Style

3.2.2 The Function/Content

3.2.3 The Agents/Users

3.2.4 The Propagation/Cascade Dynamics

3.3 A Computation Model of Rumors

3.3.1 What is a Rumor?

3.3.2 Predictive Model and Real Time Rumor Verification

4 Evaluation

5 Conclusion

6 Research Plan

6.1 Completed work

6.2 Timeline

6.3 Required resources


Abstract

The spread of malicious or accidental misinformation in social media, especially in time-sensitive situations such as real-world emergencies, can have harmful effects on individuals and society. Using computational methods, this thesis investigates the nature of rumors surrounding real-world events on Twitter and Reddit, using the April 2013 Boston Marathon bombings as a case study. With the perspective that in social media both the linguistic and the network dynamics of messages need to be taken into consideration, we propose a set of linguistic and graph-theoretic features that make up the anatomy of rumors. The key idea is that there are measurable differences in the makeup of false and true rumors. We extract these features using novel natural language processing and network analytic algorithms that we have developed. In this thesis, we propose a dynamic computational model of rumors composed of these features. The model will be evaluated on the rumors surrounding the August 2014 Ferguson unrest. Once fully evaluated, the model will be used to build a real-time rumor verification system for Twitter and Reddit that can be used during real-world emergencies. This system will have immediate real-world applications for consumers of news, journalists and emergency services and can help minimize and dampen the impact of misinformation.

1 Introduction

In the last decade the Internet has become a major source of news. In fact, a study by the Pew Research Center has identified the Internet as the most important news resource for people under the age of 30 in the US and the second most important overall after television [5]. More recently, the emergence and rise in popularity of social media and networking services such as Twitter, Facebook and Reddit have greatly affected the news reporting and journalism landscapes. While social media is mostly used for everyday chatter, it is also used to share news and other important information [11, 18]. Now more than ever people turn to social media as their source of news [15, 24, 14]. This is especially true for breaking news, where people crave rapid updates on developing events in real time. As Kwak et al. (2010) have shown, over 85% of all trending topics on Twitter (those topics being discussed more than others) are headline or persistent news [14]. Moreover, the ubiquity, accessibility, speed and ease of use of social media have made them invaluable sources of firsthand information. Twitter, for example, has proven to be very useful in emergency situations, particularly for response and recovery [26]. However, the same factors that make social media a great resource for the dissemination of breaking news, combined with the relative lack of oversight of such services, make social media fertile ground for the creation and spread of unsubstantiated and unverified information about events happening in the world.

This unprecedented shift from traditional news media, where there is a clear distinction between journalists and news consumers, to social media, where news is crowd-sourced and anyone can be a reporter, has presented many challenges for various sectors of society, such as journalists, emergency services and news consumers. Journalists now have to compete with millions of people online for breaking news. Often this leads journalists to fail to strike a balance between the need to be first and the need to be correct, resulting in an increasing number of traditional news sources reporting unsubstantiated information in the rush to be first [6, 7]. Emergency services have to deal with the consequences and the fallout of rumors and witch-hunts on social media. Finally, news consumers have the incredibly hard task of sifting through posts in order to separate substantiated and trustworthy posts from rumors and unjustified assumptions. A case in point is social media's response to the Boston Marathon bombings. As the events of the bombings unfolded, people turned to social media services like Twitter and Reddit to learn about the situation on the ground as it was happening. Many people tuned into police scanners and posted transcripts of police conversations on these sites. As much as this was a great resource for the people living in the greater Boston area, enabling them to stay up-to-date on the situation as it was unfolding, it led to several unfortunate instances of false rumors being spread and innocent people being implicated in witch-hunts [13, 16, 25]. Another example of such a phenomenon is the 2010 earthquake in Chile, where rumors propagated in social media created chaos and confusion amongst news consumers [17].

In this thesis, we plan to develop and combine a set of natural language processing and complex network analysis tools and algorithms that enable the study and analysis of the underlying processes that develop on social media in emergency situations. More generally, we are interested in using social media as an experimental ground for studying and quantifying the nature of communicative discourse in highly connected, complex and massive communication networks (such as social media), in order to better understand and model the dynamic processes that evolve on these networks and the underlying signals driving them. Through modeling these signals and processes we attempt to explain, predict and modify how these systems behave under different conditions. As mentioned, one such behavior that we are interested in modeling is how such systems behave during real-world emergencies (e.g., natural disasters, terrorist attacks, plane crashes, etc.). Specifically, we want to model the emergence, evolution, propagation and impact of unverified assertions (or rumors) on social media during emergency situations. We then plan to use these models to predict the veracity of assertions made about such events on social media, with the goal of creating a rumor verification tool for use in emergencies. Finally, we plan to study and experiment with possible approaches for intervening in these networks to minimize the impact and spread of false information.

2 Background

Although there has been extensive work on measuring and quantifying information credibility and modeling the spread of information in networks, most of it has approached the problem through either a text and language processing framework or a network science and complex systems analytics framework. The research done in the network science domain has mainly focused on modeling various diffusion and cascade structures [8, 10] and the spread of “epidemics” [20, 19, 9], knowledge [8], and information and propaganda [21]. Work has also been done on identifying influential players in spreading information through a network [28, 1] and identifying sources of information [22]. In a work more directly related to our research direction, Mendoza et al. have looked at the difference in propagation behavior of false rumors and true news on Twitter [17]. In all of these cases the properties of the actual entity being spread (be it a message, knowledge, or a virus) are never analyzed or taken into consideration in the models. In contrast, our work will look at the content of the messages being spread, in addition to the propagation behavior of these messages and any information that might be available about the agents involved in the propagation.

Relevant research done in the text and language processing domain primarily falls under either information retrieval and comparison or semantic and sentiment analysis. The former involves using various NLP techniques to retrieve relevant information from text (or speech) and then comparing that information against a database of known facts. The Washington Post’s TruthTeller, which attempts to fact-check political speech in real time, is a great example of such work. The latter line of research attempts to detect non-literal text (text that is not supposed to be taken at face value) such as sarcasm [12], satire [3] and hostility (flames) [23] through a combination of semantic and sentiment analytic techniques.

As far as we can tell, there have been very few studies that take all these factors into consideration. Most relevant is the work of Castillo et al. [4], where the authors have looked at a combination of linguistic and propagation factors that can be used to approximate users’ subjective perceptions of credibility on Twitter (i.e., whether users believe the tweets they are reading). They do not, however, focus on the objective credibility of messages.

3 Thesis Summary and Methodology

This work uses Twitter’s and Reddit’s response to the April 2013 Boston Marathon bombings as a case study to analyze and model the genesis, evolution and propagation of rumors. The work starts by annotating more than 20 rumors that spread about the events surrounding the bombings, followed by processing and parsing raw tweets and posts using various NLP and network analytic tools which we have developed. This leads to a computational analysis of rumors and to predictive models for estimating the veracity of assertions in these media. Finally, these models are evaluated on tweets and Reddit posts about other real-world events and emergencies, such as the August 2014 unrest in Ferguson.


This section will explain the following:

• The definition of rumors.

• The process through which messages on Twitter and Reddit are operationalized as a collection of computationally measurable and quantifiable features.

• The tools that have been built and need to be built to extract these features.

• The creation of a computational model of rumors using these features.

3.1 Anatomy of an Assertion in Social Media

At any given time an assertion on social media can potentially be broken down into the following components:


• The Form/Style: How is the message presented? Is it well polished? Grammatical? Does it use slang?

• The Function/Content: What is the message about? What is it intended to achieve?

• The Agents/Users: Who is presenting the message? Which platform is being used? Which social group does the author belong to? What is the history of the author?

• The Propagation/Cascade Dynamics: What was the speed at which the message spread? What did the propagation tree look like? How many “influential nodes” did it pass through? How fast did its spread decay?

By breaking down assertions along these dimensions over time, we can create a dynamic fingerprint for each assertion. We can then group false and true assertions together and look for common structural properties between assertions in each group. In addition, we can look for possible signals that can differentiate between the false and true assertions. Even though our work focuses on rumors, similar characterization techniques can be used to analyze different phenomena in social networks.
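As an illustration of how the propagation questions above might be turned into numbers, the following is a minimal sketch and not the tooling developed for this thesis. It assumes a cascade is available as a list of (parent, child) reshare pairs plus a posting time in seconds for each node; the function name `cascade_features`, the feature names, and the optional `influential` set are all hypothetical.

```python
from collections import defaultdict


def cascade_features(root, edges, timestamps, influential=frozenset()):
    """Summarize one propagation tree.

    edges: iterable of (parent, child) reshare pairs (assumed input format).
    timestamps: dict mapping node -> seconds since the original post.
    influential: optional set of nodes treated as "influential" (assumption).
    """
    children = defaultdict(list)
    for parent, child in edges:
        children[parent].append(child)

    # Breadth-first walk from the root to assign a depth to every reachable node.
    depth = {root: 0}
    frontier = [root]
    while frontier:
        next_frontier = []
        for node in frontier:
            for child in children[node]:
                if child not in depth:
                    depth[child] = depth[node] + 1
                    next_frontier.append(child)
        frontier = next_frontier

    # Count how many nodes sit at each depth to get the widest level.
    level_counts = defaultdict(int)
    for d in depth.values():
        level_counts[d] += 1

    size = len(depth)
    times = sorted(timestamps[node] for node in depth)
    duration = times[-1] - times[0]

    return {
        "size": size,
        "max_depth": max(depth.values()),
        "max_breadth": max(level_counts.values()),
        # Time by which half of the cascade's nodes had posted.
        "half_time_s": times[(size - 1) // 2] - times[0],
        "reshares_per_minute": (size - 1) / (duration / 60) if duration > 0 else 0.0,
        "influential_nodes": len(set(depth) & set(influential)),
    }
```

Recomputing such a dictionary at successive time steps would give one possible form of the "dynamic fingerprint" described above, restricted to the propagation dimension.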

3.2 Quantifying and Operationalizing Assertions

In the section above we briefly talked about how assertions in social media can be characterized by a combination of their form, function, agents and propagation dynamics. In this section we will explain in greater detail the nature of these four dimensions and describe how they are quantified.

3.2.1 The Form/Style

The form of a message captures how it is presented and is assumed to be independent of its information content. There are many ways to encode the form of a message; however, we have found two aspects of the form to carry the most information (and thus be the most predictive) about the nature of a signal in social media. These two aspects are the sophistication and the formality of a message.

The sophistication of a message is captured through the following features:

• Type/token ratio. (E.g., number of adjectives, etc.)

• Complexity of the sentences. (E.g., embedded clauses, etc.)

• Complexity of words. (E.g., number of syllables, rarity.)
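The sophistication features listed above can be approximated with very little machinery. The sketch below is illustrative only and makes several assumptions that are not part of the proposal: tokens are found with a simple regular expression, syllables are estimated by counting vowel groups, and mean sentence length stands in for sentence complexity, since detecting embedded clauses would require a syntactic parser. Word rarity is omitted because it needs an external frequency list. The function names are hypothetical.

```python
import re


def count_syllables(word):
    """Rough heuristic (assumption): one syllable per group of consecutive vowels."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def sophistication_features(text):
    """Return crude proxies for the sophistication of a message."""
    # Split on sentence-final punctuation and drop empty fragments.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Lowercased word tokens, allowing an internal apostrophe (e.g. "don't").
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    if not tokens:
        return {
            "type_token_ratio": 0.0,
            "mean_sentence_length": 0.0,
            "mean_syllables_per_word": 0.0,
        }
    return {
        # Distinct word types divided by total word tokens.
        "type_token_ratio": len(set(tokens)) / len(tokens),
        # Stand-in for sentence complexity: average tokens per sentence.
        "mean_sentence_length": len(tokens) / len(sentences),
        # Stand-in for word complexity: average estimated syllables per token.
        "mean_syllables_per_word": sum(count_syllables(t) for t in tokens) / len(tokens),
    }
```

On very short messages such as tweets, the type/token ratio is close to 1 almost regardless of vocabulary, so in practice it would likely need to be computed over a user's or a rumor's pooled messages.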

The formality of a message is captured through these features:

• Grammatical correctness of the message.

• Usage of emoticons.
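Of the formality features above, emoticon usage is the simplest to measure directly. The following sketch assumes Western-style emoticons such as `:)`, `;-)` and `=D`, matched with a hypothetical regular expression. It is not the detector used in this work, and grammatical correctness is left out because it would require an external grammar checker or parser.

```python
import re

# Hypothetical pattern: eyes, optional nose, mouth. The lookarounds keep it from
# firing inside words or URLs (e.g. the ":/" in "http://").
EMOTICON = re.compile(r"(?<!\w)[:;=]-?[)(DPp/\\|](?!\w)")


def emoticon_features(text):
    """Count emoticons and normalize by the number of whitespace-separated tokens."""
    tokens = text.split()
    count = len(EMOTICON.findall(text))
    return {
        "emoticon_count": count,
        "emoticons_per_token": count / len(tokens) if tokens else 0.0,
    }
```

Normalizing by token count keeps the feature comparable between a short tweet and a longer Reddit post.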
