David K. Elson
Columbia University
Publication
Featured research published by David K. Elson.
meeting of the association for computational linguistics | 2010
David K. Elson; Nicholas Dames; Kathleen R. McKeown
We present a method for extracting social networks from literature, namely, nineteenth-century British novels and serials. We derive the networks from dialogue interactions, and thus our method depends on the ability to determine when two characters are in conversation. Our approach involves character name chunking, quoted speech attribution and conversation detection given the set of quotes. We extract features from the social networks and examine their correlation with one another, as well as with metadata such as the novel's setting. Our results provide evidence that the majority of novels in this time period do not fit two characterizations provided by literary scholars. Instead, our results suggest an alternative explanation for differences in social networks.
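The last step described above, turning detected conversations into a network, can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: it assumes name chunking, quote attribution and conversation detection have already produced a list of speaker groups, and the function names `build_network` and `degree` and the character names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def build_network(conversations):
    """Count, for each pair of characters, how many conversations they share.

    `conversations` is an iterable of speaker-name collections, one per
    detected conversation (assumed to come from an upstream pipeline).
    """
    edges = Counter()
    for speakers in conversations:
        # Each unordered pair of distinct speakers gets one unit of weight.
        for a, b in combinations(sorted(set(speakers)), 2):
            edges[(a, b)] += 1
    return edges


def degree(edges):
    """Number of distinct conversational partners per character."""
    deg = Counter()
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return deg


# Hypothetical toy input: three detected conversations.
conversations = [
    ["Alice", "Bob"],
    ["Alice", "Carol"],
    ["Alice", "Bob", "Carol"],
]
network = build_network(conversations)
degrees = degree(network)
```

Features such as edge weights and node degree computed this way are the kind of network statistics that could then be correlated with metadata about each novel.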
Biochimica et Biophysica Acta | 1955
David K. Elson; Erwin Chargaff
1. The results of a survey of the nucleotide composition of pentose nucleic acids (PNA) have been presented. The material analyzed included whole tissues and various subcellular fractions derived from beef, rat and frog liver and kidney, from sea urchin eggs and embryos, and from several microorganisms. 2. It appears characteristic of all preparations studied that the bases with 6-amino groups (adenine, cytosine) and those with 6-keto groups (guanine, uracil) occur in approximately equal number. 3. This regularity was found only when the total PNA of the sample was analyzed. 4. Some of the structural implications of the findings are discussed.
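The regularity in point 2 amounts to the ratio (adenine + cytosine) / (guanine + uracil) being close to 1. A minimal sketch of that arithmetic follows; the function name `amino_keto_ratio` is hypothetical and the molar proportions are made-up illustrative values, not data from the paper.

```python
def amino_keto_ratio(adenine, guanine, cytosine, uracil):
    """Ratio of 6-amino bases (A + C) to 6-keto bases (G + U)."""
    return (adenine + cytosine) / (guanine + uracil)


# Hypothetical molar proportions (moles per 100 moles of nucleotide),
# chosen only to illustrate the calculation.
ratio = amino_keto_ratio(adenine=20, guanine=30, cytosine=30, uracil=20)
```

A ratio near 1 for a given preparation would be consistent with the regularity the abstract describes.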
national conference on artificial intelligence | 2010
David K. Elson; Kathleen R. McKeown
We describe a method for identifying the speakers of quoted speech in natural-language textual stories. We have assembled a corpus of more than 3,000 quotations, whose speakers (if any) are manually identified, from a collection of 19th and 20th century literature by six authors. Using rule-based and statistical learning, our method identifies candidate characters, determines their genders, and attributes each quote to the most likely speaker. We divide the quotes into syntactic classes in order to leverage common discourse patterns, which enable rapid attribution for many quotes. We apply learning algorithms to the remainder and achieve an overall accuracy of 83%.
Biochimica et Biophysica Acta | 1955
David K. Elson; Leah Wenig Trent; Erwin Chargaff
The nucleotide composition of pentose nucleic acids of nuclei from rat liver and ox liver and kidney as well as of cytoplasmic fractions from rat and frog liver and kidney has been studied. While no significant differences either between organs or, with one doubtful exception, between species were found, the hepatic nuclear and cytoplasmic pentose nucleic acids of the rat did differ in composition.
national conference on artificial intelligence | 2007
David K. Elson; Mark O. Riedl
Machinima is a low-cost alternative to full production filmmaking. However, creating quality cinematic visualizations with existing machinima techniques still requires a high degree of talent and effort. We introduce a lightweight artificial intelligence system, Cambot, that can be used to assist in machinima production. Cambot takes a script as input and produces a cinematic visualization. Unlike other virtual cinematography systems, Cambot favors an offline algorithm coupled with an extensible library of specific modular and reusable facets of cinematic knowledge. One of the advantages of this approach to virtual cinematography is a tight coordination between the positions and movements of the camera and the actors.
international acm sigir conference on research and development in information retrieval | 2005
Julia Hirschberg; Kathleen R. McKeown; Rebecca J. Passonneau; David K. Elson; Ani Nenkova
We describe a task-based evaluation to determine whether multi-document summaries measurably improve user performance when using online news browsing systems for directed research. We evaluated the multi-document summaries generated by Newsblaster, a robust news browsing system that clusters online news articles and summarizes multiple articles on each event. Four groups of subjects were asked to perform the same time-restricted fact-gathering tasks, reading news under different conditions: no summaries at all, single sentence summaries drawn from one of the articles, Newsblaster multi-document summaries, and human summaries. Our results show that, in comparison to source documents only, the quality of reports assembled using Newsblaster summaries was significantly better and user satisfaction was higher with both Newsblaster and human summaries.
international acm sigir conference on research and development in information retrieval | 2005
Kathleen R. McKeown; Rebecca J. Passonneau; David K. Elson; Ani Nenkova; Julia Hirschberg
We describe a task-based evaluation to determine whether multi-document summaries measurably improve user performance when using online news browsing systems for directed research. We evaluated the multi-document summaries generated by Newsblaster, a robust news browsing system that clusters online news articles and summarizes multiple articles on each event. Four groups of subjects were asked to perform the same time-restricted fact-gathering tasks, reading news under different conditions: no summaries at all, single sentence summaries drawn from one of the articles, Newsblaster multi-document summaries, and human summaries. Our results show that, in comparison to source documents only, the quality of reports assembled using Newsblaster summaries was significantly better and user satisfaction was higher with both Newsblaster and human summaries.
ProQuest LLC | 2012
Kathleen R. McKeown; David K. Elson
This thesis describes new approaches to the formal modeling of narrative discourse. Although narratives of all kinds are ubiquitous in daily life, contemporary text processing techniques typically do not leverage the aspects that separate narrative from expository discourse. We describe two approaches to the problem. The first approach considers the conversational networks to be found in literary fiction as a key aspect of discourse coherence; by isolating and analyzing these networks, we are able to comment on longstanding literary theories. The second approach proposes a new set of discourse relations that are specific to narrative. By focusing on certain key aspects, such as agentive characters, goals, plans, beliefs, and time, these relations represent a theory-of-mind interpretation of a text. We show that these discourse relations are expressive, formal, robust, and through the use of a software system, amenable to corpus collection projects through the use of trained annotators. We have procured and released a collection of over 100 encodings, covering a set of fables as well as longer texts including literary fiction and epic poetry. We are able to inferentially find similarities and analogies between encoded stories based on the proposed relations, and an evaluation of this technique shows that human raters prefer such a measure of similarity to a more traditional one based on the semantic distances between story propositions.
international conference on interactive digital storytelling | 2013
Elena Rishes; Stephanie M. Lukin; David K. Elson; Marilyn A. Walker
In order to tell stories in different voices for different audiences, interactive story systems require: (1) a semantic representation of story structure, and (2) the ability to automatically generate story and dialogue from this semantic representation using some form of Natural Language Generation (NLG). However, there has been limited research on methods for linking story structures to narrative descriptions of scenes and story events. In this paper we present an automatic method for converting from Scheherazade's story intention graph, a semantic representation, to the input required by the Personage NLG engine. Using 36 Aesop Fables distributed in DramaBank, a collection of story encodings, we train translation rules on one story and then test these rules by generating text for the remaining 35. The results are measured in terms of the string similarity metrics Levenshtein Distance and BLEU score. The results show that we can generate the 35 stories with correct content: the test set stories on average are close to the output of the Scheherazade realizer, which was customized to this semantic representation. We provide some examples of story variations generated by Personage. In future work, we will experiment with measuring the quality of the same stories generated in different voices, and with techniques for making storytelling interactive.
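For readers unfamiliar with the first of the two metrics, Levenshtein distance counts the minimum number of single-character insertions, deletions and substitutions needed to turn one string into another. The sketch below is the standard dynamic-programming formulation; it is not taken from the paper, and the paper may well have applied the metric at a different granularity (for example, over words rather than characters).

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character edits turning `a` into `b`."""
    # prev[j] holds the distance between the processed prefix of `a`
    # and the first j characters of `b`.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


distance = levenshtein("kitten", "sitting")  # → 3
```

A lower distance between a generated story and a reference realization indicates closer surface similarity.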
north american chapter of the association for computational linguistics | 2003
Kathleen R. McKeown; Regina Barzilay; John Chen; David K. Elson; David Evans; Judith L. Klavans; Ani Nenkova; Barry Schiffman; Sergey Sigelman
Columbia's Newsblaster tracking and summarization system is a robust system that clusters news into events, categorizes events into broad topics and summarizes multiple articles on each event. Here we outline our most current work on tracking events over days, producing summaries that update a user on new information about an event, outlining the perspectives of news coming from different countries and clustering and summarizing non-English sources.