Publication


Featured research published by Vlad Niculae.


international world wide web conferences | 2016

Winning Arguments: Interaction Dynamics and Persuasion Strategies in Good-faith Online Discussions

Chenhao Tan; Vlad Niculae; Cristian Danescu-Niculescu-Mizil; Lillian Lee

Changing someone's opinion is arguably one of the most important challenges of social interaction. The underlying process proves difficult to study: it is hard to know how someone's opinions are formed and whether and how someone's views shift. Fortunately, ChangeMyView, an active community on Reddit, provides a platform where users present their own opinions and reasoning, invite others to contest them, and acknowledge when the ensuing discussions change their original views. In this work, we study these interactions to understand the mechanisms behind persuasion. We find that persuasive arguments are characterized by interesting patterns of interaction dynamics, such as participant entry-order and degree of back-and-forth exchange. Furthermore, by comparing similar counterarguments to the same opinion, we show that language factors play an essential role. In particular, the interplay between the language of the opinion holder and that of the counterargument provides highly predictive cues of persuasiveness. Finally, since even in this favorable setting people may not be persuaded, we investigate the problem of determining whether someone's opinion is susceptible to being changed at all. For this more difficult task, we show that stylistic choices in how the opinion is expressed carry predictive power.


international world wide web conferences | 2015

QUOTUS: The Structure of Political Media Coverage as Revealed by Quoting Patterns

Vlad Niculae; Caroline Suen; Justine Zhang; Cristian Danescu-Niculescu-Mizil; Jure Leskovec

Given the extremely large pool of events and stories available, media outlets need to focus on a subset of issues and aspects to convey to their audience. Outlets are often accused of exhibiting a systematic bias in this selection process, with different outlets portraying different versions of reality. However, in the absence of objective measures and empirical evidence, the direction and extent of systematicity remains widely disputed. In this paper we propose a framework based on quoting patterns for quantifying and characterizing the degree to which media outlets exhibit systematic bias. We apply this framework to a massive dataset of news articles spanning the six years of Obama's presidency and all of his speeches, and reveal that a systematic pattern does indeed emerge from the outlets' quoting behavior. Moreover, we show that this pattern can be successfully exploited in an unsupervised prediction setting, to determine which new quotes an outlet will select to broadcast. By encoding bias patterns in a low-rank space we provide an analysis of the structure of political media coverage. This reveals a latent media bias space that aligns surprisingly well with political ideology and outlet type. A linguistic analysis exposes striking differences across these latent dimensions, showing how the different types of media outlets portray different realities even when reporting on the same events. For example, outlets mapped to the mainstream conservative side of the latent space focus on quotes that portray a presidential persona disproportionately characterized by negativity.


conference of the european chapter of the association for computational linguistics | 2014

Temporal Text Ranking and Automatic Dating of Texts

Vlad Niculae; Marcos Zampieri; Liviu P. Dinu; Alina Maria Ciobanu

This paper presents a novel approach to the task of temporal text classification combining text ranking and probability for the automatic dating of historical texts. The method was applied to three historical corpora: an English, a Portuguese and a Romanian corpus. It obtained performance ranging from 83% to 93% accuracy, using a fully automated approach with very basic features.


international joint conference on natural language processing | 2015

Linguistic Harbingers of Betrayal: A Case Study on an Online Strategy Game

Vlad Niculae; Srijan Kumar; Jordan L. Boyd-Graber; Cristian Danescu-Niculescu-Mizil

Interpersonal relations are fickle, with close friendships often dissolving into enmity. In this work, we explore linguistic cues that presage such transitions by studying dyadic interactions in an online strategy game where players form alliances and break those alliances through betrayal. We characterize friendships that are unlikely to last and examine temporal patterns that foretell betrayal. We reveal that subtle signs of imminent betrayal are encoded in the conversational patterns of the dyad, even if the victim is not aware of the relationship's fate. In particular, we find that lasting friendships exhibit a form of balance that manifests itself through language. In contrast, sudden changes in the balance of certain conversational attributes---such as positive sentiment, politeness, or focus on future planning---signal impending betrayal.


north american chapter of the association for computational linguistics | 2016

Conversational Markers of Constructive Discussions

Vlad Niculae; Cristian Danescu-Niculescu-Mizil

Group discussions are essential for organizing every aspect of modern life, from faculty meetings to senate debates, from grant review panels to papal conclaves. While costly in terms of time and organization effort, group discussions are commonly seen as a way of reaching better decisions compared to solutions that do not require coordination between the individuals (e.g. voting)---through discussion, the sum becomes greater than the parts. However, this assumption is not irrefutable: anecdotal evidence of wasteful discussions abounds, and in our own experiments we find that over 30% of discussions are unproductive. We propose a framework for analyzing conversational dynamics in order to determine whether a given task-oriented discussion is worth having or not. We exploit conversational patterns reflecting the flow of ideas and the balance between the participants, as well as their linguistic choices. We apply this framework to conversations naturally occurring in an online collaborative world exploration game developed and deployed to support this research. Using this setting, we show that linguistic cues and conversational patterns extracted from the first 20 seconds of a team discussion are predictive of whether it will be a wasteful or a productive one.


empirical methods in natural language processing | 2014

Brighter than Gold: Figurative Language in User Generated Comparisons

Vlad Niculae; Cristian Danescu-Niculescu-Mizil

Comparisons are common linguistic devices used to indicate the likeness of two things. Often, this likeness is not meant in the literal sense—for example, “I slept like a log” does not imply that logs actually sleep. In this paper we propose a computational study of figurative comparisons, or similes. Our starting point is a new large dataset of comparisons extracted from product reviews and annotated for figurativeness. We use this dataset to characterize figurative language in naturally occurring comparisons and reveal linguistic patterns indicative of this phenomenon. We operationalize these insights and apply them to a new task with high relevance to text understanding: distinguishing between figurative and literal comparisons. Finally, we apply this framework to explore the social context in which figurative language is produced, showing that similes are more likely to accompany opinions showing extreme sentiment, and that they are uncommon in reviews deemed helpful.


north american chapter of the association for computational linguistics | 2015

AMBRA: A Ranking Approach to Temporal Text Classification

Marcos Zampieri; Alina Maria Ciobanu; Vlad Niculae; Liviu P. Dinu

This paper describes the AMBRA system, entered in the SemEval-2015 Task 7: ‘Diachronic Text Evaluation’ subtasks one and two, which consist of predicting the date when a text was originally written. The task is valuable for applications in digital humanities, information systems, and historical linguistics. The novelty of this shared task consists of incorporating label uncertainty by assigning an interval within which the document was written, rather than assigning a clear time marker to each training document. To deal with non-linear effects and variable degrees of uncertainty, we reduce the problem to pairwise comparisons of the form ‘is Document A older than Document B?’, and propose a nonparametric way to transform the ordinal output into time intervals.
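The reduction to pairwise comparisons mentioned in this abstract can be illustrated with a minimal sketch. It is not the AMBRA implementation: the function name `pairwise_examples`, the `(doc_id, start_year, end_year)` layout, and the rule that a pair is used only when the two date intervals do not overlap are all assumptions made for illustration.

```python
from itertools import combinations


def pairwise_examples(docs):
    """Turn interval-labeled documents into ordered (older, newer) pairs.

    docs: list of (doc_id, start_year, end_year) tuples (hypothetical layout).
    Assumption: a pair is emitted only when one interval ends strictly
    before the other begins; overlapping intervals are skipped because
    their relative order is uncertain.
    """
    pairs = []
    for (a, a_lo, a_hi), (b, b_lo, b_hi) in combinations(docs, 2):
        if a_hi < b_lo:
            pairs.append((a, b))  # a is older than b
        elif b_hi < a_lo:
            pairs.append((b, a))  # b is older than a
    return pairs


# Toy data, invented for this example.
docs = [("d1", 1700, 1750), ("d2", 1760, 1800), ("d3", 1740, 1770)]
pairs = pairwise_examples(docs)
print(pairs)
```

A ranking model could then be trained on such pairs; how the ordinal output is mapped back to time intervals is described in the paper and is not reproduced here.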


text speech and dialogue | 2013

Romanian Syllabication Using Machine Learning

Liviu P. Dinu; Vlad Niculae; Octavia-Maria Sulea

The task of finding syllable boundaries can be straightforward or challenging, depending on the language. Text-to-speech applications have been shown to perform considerably better when syllabication, whether orthographic or phonetic, is employed as a means of breaking down the text into units below word level. Romanian syllabication is non-trivial mainly but not exclusively due to its hiatus-diphthong ambiguity. This phenomenon affects both phonetic and orthographic syllabication. In this paper, we focus on orthographic syllabication for Romanian and show that the task can be carried out with a high degree of accuracy by using sequence tagging. We compare this approach to support vector machines and rule-based methods. The features we used are simply character n-grams with end-of-word marking.
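As a rough illustration of character n-gram features with end-of-word marking, the sketch below frames syllabication as tagging each character with 1 if a syllable boundary follows it and 0 otherwise. The tagging scheme, the `$` end marker, the feature names, and both helper functions are assumptions made for this example and are not taken from the paper.

```python
def position_features(word, i, max_n=3, end_marker="$"):
    """Character n-grams to the left and right of the gap after word[i].

    The end marker is appended so that n-grams near the end of the word
    are distinguishable from word-internal ones.
    """
    marked = word + end_marker
    feats = {}
    for n in range(1, max_n + 1):
        feats[f"left_{n}"] = marked[max(0, i - n + 1): i + 1]
        feats[f"right_{n}"] = marked[i + 1: i + 1 + n]
    return feats


def labels_from_syllables(syllables):
    """One label per character: 1 if a syllable boundary follows it."""
    labels = []
    for syl in syllables[:-1]:
        labels.extend([0] * (len(syl) - 1) + [1])
    labels.extend([0] * len(syllables[-1]))
    return labels


# Toy example: a word split into two syllables, "ma" + "sa".
word, syllables = "masa", ["ma", "sa"]
X = [position_features(word, i) for i in range(len(word))]
y = labels_from_syllables(syllables)
```

The per-character feature dictionaries `X` and labels `y` are in the shape that typical sequence taggers or per-position classifiers accept.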


meeting of the association for computational linguistics | 2017

Argument Mining with Structured SVMs and RNNs

Vlad Niculae; Joonsuk Park; Claire Cardie

We propose a novel factor graph model for argument mining, designed for settings in which the argumentative relations in a document do not necessarily form a tree structure. (This is the case in over 20% of the web comments dataset we release.) Our model jointly learns elementary unit type classification and argumentative relation prediction. Moreover, our model supports SVM and RNN parametrizations, can enforce structure constraints (e.g., transitivity), and can express dependencies between adjacent relations and propositions. Our approaches outperform unstructured baselines in both web comments and argumentative essay datasets.


POLIBITS | 2012

String Distances for Near-duplicate Detection

Iulia Dănăilă; Liviu P. Dinu; Vlad Niculae; Octavia-Maria Șulea

Near-duplicate detection is important when dealing with large, noisy databases in data mining tasks. In this paper, we present the results of applying the Rank distance and the Smith-Waterman distance, along with more popular string similarity measures such as the Levenshtein distance, together with a disjoint set data structure, for the problem of near-duplicate detection. The concept of near-duplicates belongs to the larger class of problems known as knowledge discovery and data mining, that is, identifying consistent patterns in large-scale databases of any nature. Any two chunks of text that have possibly different syntactic structure, but identical or very similar semantics, are said to be near-duplicates. During the last decade, largely due to low-cost storage capacity, the volume of stored data increased at amazing rates; thus, the size of useful and available datasets for almost any task has become very large, prompting the need for scalable methods. Many datasets are noisy, in the very specific sense of having redundant data in the form of identical or nearly identical entries. In an interview for The Metropolitan Corporate Counsel (see http://www.metrocorpcounsel.com/articles/7757/near-duplicates-elephant-document-review-room), Warwick Sharp, vice-president of Equivio Ltd., a company offering information retrieval services to law firms with huge legal document databases, noted that 20 to 30 percent of the data they work with are actually near-duplicates, and this is after identical duplicate elimination. The most extreme case they handled was made up of 45% near-duplicates. Today it is estimated that around 7% of websites are approximately duplicates of one another, and their number is growing rapidly.
On the one hand, near-duplicates have the effect of artificially enlarging the dataset and therefore slowing down any processing; on the other hand, the small variation between them can contain additional information so that, by merging them, we obtain an entry with more information than any of the original near-duplicates on their own. Therefore, the key problems regarding near-duplicates are identification
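The combination of a string distance with a disjoint set data structure, as named in this abstract, can be sketched as follows. This is only an illustration under stated assumptions: it uses the Levenshtein distance alone (not the Rank or Smith-Waterman distances), the threshold `max_dist=2` is arbitrary, and the function and class names are hypothetical rather than taken from the paper.

```python
def levenshtein(a, b):
    """Edit distance between strings a and b (dynamic programming, two rows)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


class DisjointSet:
    """Union-find with path halving."""

    def __init__(self, n):
        self.parent = list(range(n))

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx != ry:
            self.parent[ry] = rx


def cluster_near_duplicates(texts, max_dist=2):
    """Group texts whose pairwise edit distance is at most max_dist."""
    ds = DisjointSet(len(texts))
    for i in range(len(texts)):
        for j in range(i + 1, len(texts)):
            if levenshtein(texts[i], texts[j]) <= max_dist:
                ds.union(i, j)
    groups = {}
    for i, text in enumerate(texts):
        groups.setdefault(ds.find(i), []).append(text)
    return list(groups.values())


# Toy data, invented for this example.
texts = ["near duplicate", "near duplicates", "near-duplicate", "unrelated text"]
groups = cluster_near_duplicates(texts)
print(groups)
```

The disjoint set makes the grouping transitive: two entries end up in the same cluster if they are linked through a chain of close pairs. The all-pairs loop is quadratic in the number of texts, so a scalable variant would need some form of candidate filtering.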
