Yangfeng Ji
Georgia Institute of Technology
Publications
Featured research published by Yangfeng Ji.
Meeting of the Association for Computational Linguistics | 2014
Yangfeng Ji; Jacob Eisenstein
Text-level discourse parsing is notoriously difficult, as distinctions between discourse relations require subtle semantic judgments that are not easily captured using standard features. In this paper, we present a representation learning approach, in which we transform surface features into a latent space that facilitates RST discourse parsing. By combining the machinery of large-margin transition-based structured prediction with representation learning, our method jointly learns to parse discourse while at the same time learning a discourse-driven projection of surface features. The resulting shift-reduce discourse parser obtains substantial improvements over the previous state-of-the-art in predicting relations and nuclearity on the RST Treebank.
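As a rough illustration of the idea (not the authors' code), the sketch below scores shift-reduce parser actions on a learned projection of sparse surface features into a low-dimensional latent space; all dimensions and parameter values are made up for the example.

```python
import numpy as np

# Minimal sketch: score shift-reduce actions on a discourse-driven projection
# of surface features. In the paper the projection and the large-margin action
# scorer are trained jointly; here both are random placeholders.

rng = np.random.default_rng(0)
n_surface, n_latent, n_actions = 1000, 50, 3   # illustrative sizes

A = rng.normal(scale=0.01, size=(n_latent, n_surface))   # projection into the latent space
W = rng.normal(scale=0.01, size=(n_actions, n_latent))   # action scorer (e.g. shift / reduce variants)

def score_actions(surface_features: np.ndarray) -> np.ndarray:
    """Project sparse surface features, then score each candidate parser action."""
    latent = A @ surface_features
    return W @ latent

x = rng.random(n_surface)                      # stand-in features for the current parser state
best_action = int(np.argmax(score_actions(x)))
```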
Empirical Methods in Natural Language Processing | 2015
Parminder Bhatia; Yangfeng Ji; Jacob Eisenstein
Discourse structure is the hidden link between surface features and document-level properties, such as sentiment polarity. We show that the discourse analyses produced by Rhetorical Structure Theory (RST) parsers can improve document-level sentiment analysis, via composition of local information up the discourse tree. First, we show that reweighting discourse units according to their position in a dependency representation of the rhetorical structure can yield substantial improvements on lexicon-based sentiment analysis. Next, we present a recursive neural network over the RST structure, which offers significant improvements over classification-based methods.
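The depth-based reweighting can be pictured with a small sketch; the decay factor and unit scores below are illustrative assumptions, not values from the paper.

```python
# Minimal sketch: lexicon-based sentiment where each discourse unit's score is
# down-weighted by its depth in a dependency representation of the RST tree.

lambda_ = 0.5   # assumed decay per level of embedding

def document_sentiment(units):
    """units: list of (lexicon_score, depth) pairs, one per elementary discourse unit."""
    return sum(score * (lambda_ ** depth) for score, depth in units)

# The nucleus of the root relation (depth 0) counts fully; deeply embedded satellites count less.
print(document_sentiment([(+1.0, 0), (-0.8, 2), (+0.3, 3)]))
```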
North American Chapter of the Association for Computational Linguistics | 2016
Yangfeng Ji; Gholamreza Haffari; Jacob Eisenstein
This paper presents a novel latent variable recurrent neural network architecture for jointly modeling sequences of words and (possibly latent) discourse relations between adjacent sentences. A recurrent neural network generates individual words, thus reaping the benefits of discriminatively-trained vector representations. The discourse relations are represented with a latent variable, which can be predicted or marginalized, depending on the task. The resulting model can therefore employ a training objective that includes not only discourse relation classification, but also word prediction. As a result, it outperforms state-of-the-art alternatives for two tasks: implicit discourse relation classification in the Penn Discourse Treebank, and dialog act classification in the Switchboard corpus. Furthermore, by marginalizing over latent discourse relations at test time, we obtain a discourse-informed language model, which improves over a strong LSTM baseline.
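The marginalization over latent discourse relations used for language modeling can be sketched as a log-sum-exp over relation-specific sentence likelihoods; the distributions and values below are placeholders, not quantities from the paper.

```python
import numpy as np

# Minimal sketch of the marginalization step: the log-probability of the next
# sentence is a mixture over K latent discourse relations z, weighted by a
# prior p(z | context) predicted from the preceding sentence.

def sentence_log_prob(p_z_given_context, log_p_words_given_z):
    """p_z_given_context: (K,) prior over discourse relations.
    log_p_words_given_z: (K,) log-likelihood of the sentence under each relation-specific RNN."""
    log_joint = np.log(p_z_given_context) + log_p_words_given_z
    m = log_joint.max()
    return m + np.log(np.exp(log_joint - m).sum())   # stable log-sum-exp over z

print(sentence_log_prob(np.array([0.7, 0.2, 0.1]), np.array([-42.0, -40.5, -44.0])))
```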
Empirical Methods in Natural Language Processing | 2015
Yangfeng Ji; Gongbo Zhang; Jacob Eisenstein
Many discourse relations are explicitly marked with discourse connectives, and these examples could potentially serve as a plentiful source of training data for recognizing implicit discourse relations. However, there are important linguistic differences between explicit and implicit discourse relations, which limit the accuracy of such an approach. We account for these differences by applying techniques from domain adaptation, treating implicitly and explicitly-marked discourse relations as separate domains. The distribution of surface features varies across these two domains, so we apply a marginalized denoising autoencoder to induce a dense, domain-general representation. The label distribution is also domain-specific, so we apply a resampling technique that is similar to instance weighting. In combination with a set of automatically-labeled data, these improvements eliminate more than 80% of the transfer loss incurred by training an implicit discourse relation classifier on explicitly-marked discourse relations.
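A single-layer marginalized denoising autoencoder in the spirit of Chen et al. (2012) has a closed-form solution; the sketch below illustrates it under assumed dimensions and corruption rate, and is not the paper's implementation.

```python
import numpy as np

# Minimal sketch of a marginalized denoising autoencoder: features are
# notionally "corrupted" (dropped with probability p), and the reconstruction
# weights are computed in closed form by marginalizing over all corruptions.

def mda(X, p=0.5, eps=1e-5):
    """X: (d, n) feature matrix pooled over explicit and implicit examples."""
    d, _ = X.shape
    q = np.full(d, 1.0 - p)                       # survival probability of each feature
    S = X @ X.T                                   # scatter matrix
    Q = S * np.outer(q, q)                        # E[corrupted x corrupted^T], off-diagonal
    np.fill_diagonal(Q, q * np.diag(S))           # diagonal uses a single survival factor
    P = S * q[np.newaxis, :]                      # E[clean x corrupted^T]
    W = P @ np.linalg.inv(Q + eps * np.eye(d))    # closed-form reconstruction weights
    return np.tanh(W @ X)                         # dense, domain-general representation

X = np.random.default_rng(0).random((20, 100))    # toy feature matrix
H = mda(X)
```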
Proceedings of the Workshop on Computational Linguistics and Clinical Psychology: From Linguistic Signal to Clinical Reality | 2014
Yangfeng Ji; Hwajung Hong; Rosa I. Arriaga; Agata Rozga; Gregory D. Abowd; Jacob Eisenstein
Discussion forums offer a new source of insight into the experiences and challenges faced by individuals affected by mental disorders. Language technology can help domain experts gather insight from these forums, by aggregating themes and user behaviors across thousands of conversations. We present a novel model for web forums, which captures both thematic content and user-specific interests. Applying this model to the Aspies Central forum (which covers issues related to Asperger's syndrome and autism spectrum disorder), we identify several topics of concern to individuals who report being on the autism spectrum. We evaluate on data collected from the Aspies Central forum, including 1,939 threads, 29,947 posts, and 972 users. Quantitative evaluations demonstrate that the topics extracted by this model improve substantially over those obtained by Latent Dirichlet Allocation and the Author-Topic Model. Qualitative analysis by subject-matter experts suggests intriguing directions for future investigation.
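Purely as an illustration of how thread-level themes and user-specific interests might combine when generating a post's topics (the mixing scheme and all values below are assumptions, not the paper's model), consider:

```python
import numpy as np

# Illustrative sketch only: each post draws its topic distribution as a mixture
# of the thread's thematic content and the posting user's interests.

rng = np.random.default_rng(0)
n_topics = 10
thread_theme = rng.dirichlet(np.ones(n_topics))    # thematic content of the thread
user_interest = rng.dirichlet(np.ones(n_topics))   # the posting user's interests
mix = 0.6                                          # assumed weight on the thread theme

post_topic_dist = mix * thread_theme + (1 - mix) * user_interest
topic_for_word = rng.choice(n_topics, p=post_topic_dist)
```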
Empirical Methods in Natural Language Processing | 2013
Yangfeng Ji; Jacob Eisenstein
arXiv: Machine Learning | 2017
Graham Neubig; Chris Dyer; Yoav Goldberg; Austin Matthews; Waleed Ammar; Antonios Anastasopoulos; Miguel Ballesteros; David Chiang; Daniel Clothiaux; Trevor Cohn; Kevin Duh; Manaal Faruqui; Cynthia Gan; Dan Garrette; Yangfeng Ji; Lingpeng Kong; Adhiguna Kuncoro; Gaurav Kumar; Chaitanya Malaviya; Paul Michel; Yusuke Oda; Matthew Richardson; Naomi Saphra; Swabha Swayamdipta; Pengcheng Yin
Transactions of the Association for Computational Linguistics | 2015
Yangfeng Ji; Jacob Eisenstein
arXiv: Computation and Language | 2015
Yangfeng Ji; Trevor Cohn; Lingpeng Kong; Chris Dyer; Jacob Eisenstein
Archive | 2015
Yangfeng Ji; Gongbo Zhang; Jacob Eisenstein