Publication


Featured research published by Jena D. Hwang.


Journal of the American Medical Informatics Association | 2013

Towards comprehensive syntactic and semantic annotations of the clinical narrative

Daniel Albright; Arrick Lanfranchi; Anwen Fredriksen; Will Styler; Colin Warner; Jena D. Hwang; Jinho D. Choi; Dmitriy Dligach; Rodney D. Nielsen; James H. Martin; Wayne H. Ward; Martha Palmer; Guergana Savova

Objective: To create annotated clinical narratives with layers of syntactic and semantic labels to facilitate advances in clinical natural language processing (NLP). To develop NLP algorithms and open source components.
Methods: Manual annotation of a clinical narrative corpus of 127 606 tokens following the Treebank schema for syntactic information, PropBank schema for predicate-argument structures, and the Unified Medical Language System (UMLS) schema for semantic information. NLP components were developed.
Results: The final corpus consists of 13 091 sentences containing 1772 distinct predicate lemmas. Of the 766 newly created PropBank frames, 74 are verbs. There are 28 539 named entity (NE) annotations spread over 15 UMLS semantic groups, one UMLS semantic type, and the Person semantic category. The most frequent annotations belong to the UMLS semantic groups of Procedures (15.71%), Disorders (14.74%), Concepts and Ideas (15.10%), Anatomy (12.80%), Chemicals and Drugs (7.49%), and the UMLS semantic type of Sign or Symptom (12.46%). Inter-annotator agreement results: Treebank (0.926), PropBank (0.891–0.931), NE (0.697–0.750). The part-of-speech tagger, constituency parser, dependency parser, and semantic role labeler are built from the corpus and released open source. A significant limitation uncovered by this project is the need for the NLP community to develop a widely agreed-upon schema for the annotation of clinical concepts and their relations.
Conclusions: This project takes a foundational step towards bringing the field of clinical NLP up to par with NLP in the general domain. The corpus creation and NLP components provide a resource for research and application development that would have been previously impossible.


Linguistic Annotation Workshop | 2007

Criteria for the Manual Grouping of Verb Senses

Cecily Jill Duffield; Jena D. Hwang; Susan Windisch Brown; Dmitriy Dligach; Sarah Vieweg; Jenny Davis; Martha Palmer

In this paper, we argue that clustering WordNet senses into more coarse-grained groupings results in higher inter-annotator agreement and increased system performance. Clustering of verb senses involves examining syntactic and semantic features of verbs and arguments on a case-by-case basis rather than applying a strict methodology. Determining appropriate criteria for clustering is based primarily on the needs of annotators.


Linguistic Annotation Workshop | 2015

A Hierarchy with, of, and for Preposition Supersenses

Nathan Schneider; Vivek Srikumar; Jena D. Hwang; Martha Palmer

English prepositions are extremely frequent and extraordinarily polysemous. In some usages they contribute information about spatial, temporal, or causal roles/relations; in other cases they are institutionalized, somewhat arbitrarily, as case markers licensed by a particular governing verb, verb class, or syntactic construction. To facilitate automatic disambiguation, we propose a general-purpose, broad-coverage taxonomy of preposition functions that we call supersenses: these are coarse and unlexicalized so as to be tractable for efficient manual annotation, yet capture crucial semantic distinctions. Our resource, including extensive documentation of the supersenses, many example sentences, and mappings to other lexical resources, will be publicly released.
Prepositions are perhaps the most beguiling yet pervasive lexicosyntactic class in English. They are everywhere; their functional versatility is dizzying and largely idiosyncratic (1). They are nearly invisible, yet indispensable for situating the where, when, why, and how of events. In a way, prepositions are the bastard children of lexicon and grammar, rising to the occasion almost whenever a noun-noun or verb-noun relation is needed and neither subject nor object is appropriate. Consider the many uses of the word to, just a few of which are illustrated in (1):¹
(1) a. My cake is to die for.
    b. If you want I can treat you to some.
    c. How about this: you go to the store
    d. to buy ingredients.
    e. Then if you give the recipe to me
    f. I’m happy to make the batter
    g. and put it in the oven for 30 to 40 minutes
    h. so you’ll arrive to the sweet smell of chocolate.
    i. That sounds good to me.
    j. That’s all there is to it.
Sometimes a preposition specifies a relationship between two entities or quantities, as in (1g). In other scenarios it serves a case-marking sort of function, marking a complement or adjunct, principally to a verb (1b–1e, 1h, 1i), but also to an argument-taking noun or adjective (1f). Further, it is not always possible to separate the semantic contribution of the preposition from that of other words in the sentence. As amply demonstrated in the literature, prepositions play a key role in multiword expressions (Baldwin and Kim, 2010), as in (1a, 1b, 1j). An adequate descriptive annotation scheme for prepositions must deal with these messy facts.
Following a brief discussion of existing approaches to preposition semantics (§1), this paper offers a new approach to characterizing their functions at a coarse-grained level. Our scheme is intended to apply to almost all preposition tokens, though some are excluded on the grounds that they belong to a larger multiword expression or are purely syntactic (§2). The rest of the paper is devoted to our coarse semantic categories, supersenses (§3).² Many of these categories are based on previous proposals, primarily Srikumar and Roth (2013a) (so-called preposition relations) and VerbNet (thematic roles; Bonial et al., 2011; Hwang, 2014, appendix C), but we organize them into a hierarchy and motivate a number of new or altered categories that make the scheme more robust. Because prepositions are so frequent, so polysemous, and so crucial in establishing relations, we believe that a wide variety of NLP applications (including knowledge base construction, reasoning about events, summarization, paraphrasing, and translation) stand to benefit from automatic disambiguation of preposition supersenses.
A wiki documenting our scheme in detail can be accessed at http://tiny.cc/prepwiki. It maps fine-grained preposition senses to our supersenses, along with numerous examples. The wiki is conducive to browsing and to exporting the structure and examples for use elsewhere (e.g., in an annotation tool). From our experience with pilot annotations, we believe that the scheme is fairly stable and broadly applicable.
¹ Though infinitival to is traditionally not considered a preposition, we allow it to be labeled with a supersense if the infinitival clause serves as a PURPOSE (as in (1d)) or FUNCTION. See §2.
² Supersense inventories have also been described for nouns and verbs (Ciaramita and Altun, 2006; Schneider et al., 2012; Schneider and Smith, 2015) and adjectives (Tsvetkov et al., 2014). Other inventories characterize semantic functions expressed via morphosyntax: e.g., tense/aspect (Reichart and Rappoport, 2010), definiteness (Bhatia et al., 2014, also hierarchical).


Meeting of the Association for Computational Linguistics | 2016

A Corpus of Preposition Supersenses

Nathan Schneider; Jena D. Hwang; Vivek Srikumar; Meredith Green; Abhijit Suresh; Kathryn Conger; Tim O'Gorman; Martha Palmer

We present the first corpus annotated with preposition supersenses, unlexicalized categories for semantic functions that can be marked by English prepositions (Schneider et al., 2015). The preposition supersenses are organized hierarchically and designed to facilitate comprehensive manual annotation. Our dataset is publicly released on the web.


North American Chapter of the Association for Computational Linguistics | 2016

Crazy Mad Nutters: The Language of Mental Health.

Jena D. Hwang; Kristy Hollingshead

Many people with mental illnesses face challenges posed by stigma perpetuated by fear and misconception in society at large. This societal stigma against mental health conditions is present in everyday language. In this study we take a set of 14 words with the potential to stigmatize mental health and sample Twitter as an approximation of contemporary discourse. Annotation reveals that these words are used with different senses, from expressive to stigmatizing to clinical. We use these word-sense annotations to extract a set of mental health–aware Twitter users, and compare their language use to that of an age- and gender-matched comparison set of users, discovering a difference in frequency of stigmatizing senses as well as a change in the target of pejorative senses. Such analysis may provide a first step towards a tool with the potential to help everyday people to increase awareness of their own stigmatizing language, and to measure the effectiveness of anti-stigma campaigns to change our discourse.


Handbook of Linguistic Annotation | 2017

Current Directions in English and Arabic PropBank

Claire Bonial; Kathryn Conger; Jena D. Hwang; Aous Mansouri; Yahya Aseri; Julia Bonn; Timothy O’Gorman; Martha Palmer

This chapter gives an overview of the infrastructure, annotation practices, and current challenges of both the English and Arabic PropBank corpora. More details about the Hindi and Urdu PropBanks can be found in chapter “The Hindi/Urdu Treebank Project” (this volume). The focus of current efforts is on expanding the types of relations covered by PropBank. Previously, the annotation effort focused on event relations expressed solely by verbs. (A separate but related effort, NomBank, focused on nouns [26].) However, a complete representation of event relations within and across sentences requires expanding that focus to all syntactic realizations of event and state semantics, including expressions in the form of nouns, adjectives and multi-word expressions. This effort reflects a general desire to move to a deeper level of semantic understanding, abstracting away from language-particular syntactic facts. The chapter closes with a discussion of future directions for PropBank.


Joint Conference on Lexical and Computational Semantics | 2015

Identification of Caused Motion Construction

Jena D. Hwang; Martha Palmer

This research describes the development of a supervised classifier of English Caused Motion Constructions (CMCs) (e.g. The goalie kicked the ball into the field). Consistent identification of CMCs is a necessary step to a correct interpretation of semantics for sentences where the verb does not conform to the expected semantics of the verb (e.g. The crowd laughed the clown off the stage). We expand on a previous study on the classification of CMCs (Hwang et al., 2010) to show that CMCs can be successfully identified in the corpus data. In this paper, we present the classifier and the series of experiments carried out to improve its performance.


Linguistic Annotation Workshop | 2010

PropBank Annotation of Multilingual Light Verb Constructions

Jena D. Hwang; Archna Bhatia; Claire Bonial; Aous Mansouri; Ashwini Vaidya; Nianwen Xue; Martha Palmer


Language Resources and Evaluation | 2014

PropBank: Semantics of New Predicate Types

Claire Bonial; Julia Bonn; Kathryn Conger; Jena D. Hwang; Martha Palmer


North American Chapter of the Association for Computational Linguistics | 2010

Towards a Domain Independent Semantics: Enhancing Semantic Representation with Construction Grammar

Jena D. Hwang; Rodney D. Nielsen; Martha Palmer

Collaboration


Dive into Jena D. Hwang's collaborations.

Top Co-Authors

Martha Palmer

University of Colorado Boulder

Nathan Schneider

Carnegie Mellon University

Tim O'Gorman

University of Colorado Boulder

Na-Rae Han

University of Pennsylvania

Cecily Jill Duffield

University of Colorado Boulder

Claire Bonial

University of Colorado Boulder

Kathryn Conger

University of Colorado Boulder

Susan Windisch Brown

University of Colorado Boulder

Dmitriy Dligach

University of Colorado Boulder
