Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Dani Yogatama is active.

Publication


Featured research published by Dani Yogatama.


International Joint Conference on Natural Language Processing | 2015

Sparse Overcomplete Word Vector Representations

Manaal Faruqui; Yulia Tsvetkov; Dani Yogatama; Chris Dyer; Noah A. Smith

Current distributed representations of words show little resemblance to theories of lexical semantics. The former are dense and uninterpretable, the latter largely based on familiar, discrete classes (e.g., supersenses) and relations (e.g., synonymy and hypernymy). We propose methods that transform word vectors into sparse (and optionally binary) vectors. The resulting representations are more similar to the interpretable features typically used in NLP, though they are discovered automatically from raw corpora. Because the vectors are highly sparse, they are computationally easy to work with. Most importantly, we find that they outperform the original vectors on benchmark tasks.
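
A short sketch of the core idea: re-encode dense word vectors as sparse codes over a learned, overcomplete dictionary. The snippet below uses scikit-learn's generic dictionary learning rather than the paper's own optimizer, and the embedding matrix, dictionary size, and sparsity penalty are placeholder assumptions.

```python
import numpy as np
from sklearn.decomposition import DictionaryLearning

# Placeholder dense embeddings: 300 words, 50 dimensions (real word vectors
# such as GloVe would be loaded here instead).
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 50))

# Learn an overcomplete dictionary (more atoms than input dimensions) and
# re-express each word as a sparse code over that dictionary.
coder = DictionaryLearning(
    n_components=200,              # overcomplete: 200 atoms for 50-d inputs
    alpha=1.0,                     # L1 penalty controlling sparsity
    transform_algorithm="lasso_lars",
    random_state=0,
)
A = coder.fit_transform(X)         # sparse codes, shape (300, 200)

# Optional binarization, analogous to the paper's binary variant.
A_binary = (A > 0).astype(np.int8)

print("nonzero fraction:", np.count_nonzero(A) / A.size)
```

Because most entries of `A` are exactly zero, downstream classifiers can treat the columns like interpretable, discrete features.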


Meeting of the Association for Computational Linguistics | 2014

Linguistic Structured Sparsity in Text Categorization

Dani Yogatama; Noah A. Smith

We introduce three linguistically motivated structured regularizers based on parse trees, topics, and hierarchical word clusters for text categorization. These regularizers impose linguistic bias in feature weights, enabling us to incorporate prior knowledge into conventional bag-of-words models. We show that our structured regularizers consistently improve classification accuracies compared to standard regularizers that penalize features in isolation (such as lasso, ridge, and elastic net regularizers) on a range of datasets for various text prediction problems: topic classification, sentiment analysis, and forecasting.
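
The structured regularizers here are group-lasso-style penalties over linguistically defined feature groups. The sketch below illustrates that idea with made-up groups and a block soft-thresholding step that assumes non-overlapping groups; the paper also handles overlapping structures such as parse trees.

```python
import numpy as np

def group_lasso_penalty(w, groups, lam=1.0):
    """Sum of L2 norms over feature groups.

    w      -- weight vector over bag-of-words features
    groups -- list of index arrays, one per linguistic group
              (e.g., features under the same parse-tree node or topic)
    """
    return lam * sum(np.sqrt(len(g)) * np.linalg.norm(w[g]) for g in groups)

def prox_group(w, groups, step=0.1, lam=1.0):
    """One block soft-thresholding pass, valid for non-overlapping groups."""
    w = w.copy()
    for g in groups:
        norm = np.linalg.norm(w[g])
        scale = max(0.0, 1.0 - step * lam * np.sqrt(len(g)) / (norm + 1e-12))
        w[g] *= scale          # zeroes out whole groups when their norm is small
    return w

# Toy example: 10 features split into three assumed hierarchical clusters.
w = np.random.default_rng(0).normal(size=10)
groups = [np.arange(0, 4), np.arange(4, 7), np.arange(7, 10)]
print(group_lasso_penalty(w, groups))
print(prox_group(w, groups))
```

Penalizing whole groups (instead of individual features, as lasso does) is what lets the prior linguistic structure shape which feature weights survive.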


International Joint Conference on Natural Language Processing | 2015

Embedding Methods for Fine Grained Entity Type Classification

Dani Yogatama; Daniel Gillick; Nevena Lazic

We propose a new approach to the task of fine-grained entity type classification based on label embeddings that allows for information sharing among related labels. Specifically, we learn an embedding for each label and each feature such that labels which frequently co-occur are close in the embedded space. We show that it outperforms state-of-the-art methods on two fine-grained entity classification benchmarks and that the model can exploit the finer-grained labels to improve classification of standard coarse types.
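
A rough sketch of the shared embedding space: each feature and each label gets a vector, a mention is scored against every label by a dot product, and a pairwise ranking update pulls gold labels above competing ones. The dimensions, learning rate, and WARP-style update below are illustrative assumptions, not the paper's exact training objective.

```python
import numpy as np

rng = np.random.default_rng(0)
n_features, n_labels, dim = 5000, 100, 50       # assumed sizes

A = 0.1 * rng.normal(size=(n_features, dim))    # feature embeddings
B = 0.1 * rng.normal(size=(n_labels, dim))      # label embeddings

def scores(feature_idx):
    """Score every label for a mention given its active feature indices."""
    x = A[feature_idx].mean(axis=0)             # mention representation
    return B @ x                                # one score per label

def rank_update(feature_idx, gold_labels, lr=0.01, margin=1.0):
    """One pairwise-ranking update: push each gold label above a violating
    negative label in the shared embedding space."""
    x = A[feature_idx].mean(axis=0)
    s = B @ x
    for y in gold_labels:
        negatives = [l for l in range(n_labels)
                     if l not in gold_labels and s[l] + margin > s[y]]
        if not negatives:
            continue
        y_neg = negatives[0]
        direction = B[y] - B[y_neg]             # gradient w.r.t. x
        B[y]     += lr * x
        B[y_neg] -= lr * x
        A[feature_idx] += lr * direction / len(feature_idx)

# Toy usage: a mention with three active features whose gold types are 3 and 7.
mention = np.array([10, 42, 999])
rank_update(mention, gold_labels={3, 7})
print(scores(mention)[[3, 7]])
```

Because related labels end up near each other in the embedding space, evidence for a fine-grained type also raises the score of its coarse parent.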


Empirical Methods in Natural Language Processing | 2015

Bayesian Optimization of Text Representations

Dani Yogatama; Lingpeng Kong; Noah A. Smith

When applying machine learning to problems in NLP, there are many choices to make about how to represent input texts. These choices can have a big effect on performance, but they are often uninteresting to researchers or practitioners who simply need a module that performs well. We propose an approach to optimizing over this space of choices, formulating the problem as global optimization. We apply a sequential model-based optimization technique and show that our method makes standard linear models competitive with more sophisticated, expensive state-of-the-art methods based on latent variable models or neural networks on various topic classification and sentiment analysis problems. Our approach is a first step towards black-box NLP systems that work with raw text and do not require manual tuning.
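
A minimal sketch of the sequential model-based optimization loop: encode each representation choice (n-gram order, binarization, frequency cutoff, ...) as a vector, fit a surrogate model to the accuracies observed so far, and evaluate next the candidate with the highest expected improvement. The Gaussian-process surrogate and the stand-in `evaluate` function below are assumptions for illustration; in practice `evaluate` would train and score a linear classifier.

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor

# Discrete space of representation choices, encoded as numeric vectors:
# [n-gram order (1-3), binary vs. count features (0/1), min doc frequency (1-5)].
candidates = np.array([[n, b, df] for n in (1, 2, 3)
                                   for b in (0, 1)
                                   for df in (1, 2, 3, 4, 5)], dtype=float)

def evaluate(x):
    """Stand-in for the expensive step: train a linear model with the
    representation encoded by x and return held-out accuracy."""
    rng = np.random.default_rng(int(x.sum() * 100))
    return 0.7 + 0.05 * np.sin(x).sum() + 0.01 * rng.normal()

rng = np.random.default_rng(0)
observed_x, observed_y = [], []

# A few random evaluations to seed the surrogate model.
for i in rng.choice(len(candidates), size=3, replace=False):
    observed_x.append(candidates[i]); observed_y.append(evaluate(candidates[i]))

for _ in range(10):                              # SMBO iterations
    gp = GaussianProcessRegressor(normalize_y=True).fit(observed_x, observed_y)
    mu, sigma = gp.predict(candidates, return_std=True)
    best = max(observed_y)
    z = (mu - best) / (sigma + 1e-9)
    ei = (mu - best) * norm.cdf(z) + sigma * norm.pdf(z)   # expected improvement
    x_next = candidates[np.argmax(ei)]
    observed_x.append(x_next); observed_y.append(evaluate(x_next))

print("best accuracy found:", max(observed_y))
```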


Meeting of the Association for Computational Linguistics | 2017

Program Induction by Rationale Generation: Learning to Solve and Explain Algebraic Word Problems

Wang Ling; Dani Yogatama; Chris Dyer; Phil Blunsom

Solving algebraic word problems requires executing a series of arithmetic operations---a program---to obtain a final answer. However, since programs can be arbitrarily complicated, inducing them directly from question-answer pairs is a formidable challenge. To make this task more feasible, we solve these problems by generating answer rationales, sequences of natural language and human-readable mathematical expressions that derive the final answer through a series of small steps. Although rationales do not explicitly specify programs, they provide a scaffolding for their structure via intermediate milestones. To evaluate our approach, we have created a new 100,000-sample dataset of questions, answers and rationales. Experimental results show that indirect supervision of program learning via answer rationales is a promising strategy for inducing arithmetic programs.
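
A toy illustration of the scaffolding idea, not the paper's model: the rationale interleaves human-readable steps with small executable operations, and the induced program is simply the sequence of those operations. The question, rationale, and operation set below are invented.

```python
# Each rationale step pairs a natural-language sentence with the arithmetic
# operation it describes; executing the operations in order yields the answer.
question = "A shirt costs $12 and a hat costs $8. How much do 2 shirts and 1 hat cost?"
rationale = [
    ("Two shirts cost 2 * 12 = 24.",      ("mul", 2, 12)),
    ("Adding the hat gives 24 + 8 = 32.", ("add", 24, 8)),
]

ops = {"add": lambda a, b: a + b, "mul": lambda a, b: a * b}

value = None
for sentence, (op, a, b) in rationale:
    value = ops[op](a, b)     # execute the program step named in the rationale
    print(sentence, "->", value)

print("final answer:", value)
```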


Empirical Methods in Natural Language Processing | 2015

Extractive Summarization by Maximizing Semantic Volume

Dani Yogatama; Fei Liu; Noah A. Smith

The most successful approaches to extractive text summarization seek to maximize bigram coverage subject to a budget constraint. In this work, we propose instead to maximize semantic volume. We embed each sentence in a semantic space and construct a summary by choosing a subset of sentences whose convex hull maximizes volume in that space. We provide a greedy algorithm based on the Gram-Schmidt process to efficiently perform volume maximization. Our method outperforms the state-of-the-art summarization approaches on benchmark datasets.
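
The greedy step is easy to sketch: keep an orthonormal basis of the chosen sentences and repeatedly add the sentence with the largest residual after projecting onto that basis, until the length budget is exhausted. This is a simplified sketch with random stand-in embeddings; the paper's initialization details (e.g., seeding relative to the centroid) are omitted.

```python
import numpy as np

def greedy_volume_summary(sentence_vecs, lengths, budget):
    """Greedy semantic-volume maximization via Gram-Schmidt.

    At each step, pick the sentence whose embedding has the largest residual
    norm after projecting out the span of already-selected sentences,
    subject to a total length budget.
    """
    selected, basis, used = [], [], 0
    remaining = set(range(len(sentence_vecs)))
    while remaining:
        best_i, best_res = None, 0.0
        for i in remaining:
            if used + lengths[i] > budget:
                continue
            r = sentence_vecs[i].astype(float).copy()
            for b in basis:                       # Gram-Schmidt projection
                r -= (r @ b) * b
            if np.linalg.norm(r) > best_res:
                best_i, best_res = i, np.linalg.norm(r)
        if best_i is None or best_res < 1e-8:
            break
        r = sentence_vecs[best_i].astype(float).copy()
        for b in basis:
            r -= (r @ b) * b
        basis.append(r / np.linalg.norm(r))       # extend the orthonormal basis
        selected.append(best_i)
        used += lengths[best_i]
        remaining.remove(best_i)
    return selected

# Toy usage with random "sentence embeddings" and word counts.
rng = np.random.default_rng(0)
vecs = rng.normal(size=(12, 20))
lens = rng.integers(5, 20, size=12)
print(greedy_volume_summary(vecs, lens, budget=50))
```

Maximizing the residual norm at each step greedily grows the volume of the selected sentences' span, which favors summaries that are both relevant and non-redundant.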


Empirical Methods in Natural Language Processing | 2009

Multilingual Spectral Clustering Using Document Similarity Propagation

Dani Yogatama; Kumiko Tanaka-Ishii

We present a novel approach for multilingual document clustering using only comparable corpora to achieve cross-lingual semantic interoperability. The method models document collections as a weighted graph, and supervisory information is given as sets of must-link constraints for documents in different languages. Recursive k-nearest-neighbor similarity propagation is used to exploit the prior knowledge and merge the two language spaces. A spectral method is applied to find the best cuts of the graph. Experimental results show that, using limited supervisory information, our method achieves promising clustering results. Furthermore, since the method does not need any language-dependent information in the process, our algorithm can be applied to languages with various writing systems.
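
A rough sketch under strong simplifications: build within-language similarity graphs, let cross-language affinity flow from the must-linked pairs through each language's k-nearest-neighbor graph (a single propagation pass here, where the paper applies it recursively), and spectrally cluster the fused graph. The document vectors, must-link pairs, and cluster count below are placeholder assumptions.

```python
import numpy as np
from sklearn.cluster import SpectralClustering
from sklearn.preprocessing import normalize

rng = np.random.default_rng(0)

# Toy stand-ins for per-language document vectors (e.g., tf-idf).
X_a = normalize(rng.random((30, 40)))         # 30 documents in language A
X_b = normalize(rng.random((30, 35)))         # 30 documents in language B
S_a, S_b = X_a @ X_a.T, X_b @ X_b.T           # within-language similarities

# Must-link constraints: assumed known comparable pairs (doc i in A <-> doc j in B).
M = np.zeros((30, 30))
for i, j in [(0, 0), (5, 5), (10, 10), (20, 20)]:
    M[i, j] = 1.0

def knn_graph(S, k=5):
    """Keep only each row's k strongest neighbors (a simple k-NN graph)."""
    G = np.zeros_like(S)
    for i, row in enumerate(S):
        nn = np.argsort(row)[-k:]
        G[i, nn] = row[nn]
    return G

# One round of similarity propagation: cross-language affinity spreads from the
# must-linked pairs through each language's k-NN neighborhoods.
C = knn_graph(S_a) @ M @ knn_graph(S_b).T

# Assemble the joint weighted graph and cluster its best cuts spectrally.
W = np.block([[S_a, C], [C.T, S_b]])
W = np.maximum(W, W.T)                        # ensure a symmetric affinity matrix
labels = SpectralClustering(n_clusters=4, affinity="precomputed",
                            random_state=0).fit_predict(W)
print(labels[:30])   # cluster ids for language-A documents
print(labels[30:])   # cluster ids for language-B documents
```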


Meeting of the Association for Computational Linguistics | 2011

Part-of-Speech Tagging for Twitter: Annotation, Features, and Experiments

Kevin Gimpel; Nathan Schneider; Brendan O'Connor; Dipanjan Das; Daniel Mills; Jacob Eisenstein; Michael Heilman; Dani Yogatama; Jeffrey Flanigan; Noah A. Smith


International Conference on Machine Learning | 2016

Deep Speech 2: End-to-End Speech Recognition in English and Mandarin

Dario Amodei; Sundaram Ananthanarayanan; Rishita Anubhai; Jingliang Bai; Eric Battenberg; Carl Case; Jared Casper; Bryan Catanzaro; Qiang Cheng; Guoliang Chen; Jie Chen; Jingdong Chen; Zhijie Chen; Mike Chrzanowski; Adam Coates; Greg Diamos; Ke Ding; Niandong Du; Erich Elsen; Jesse Engel; Weiwei Fang; Linxi Fan; Christopher Fougner; Liang Gao; Caixia Gong; Awni Y. Hannun; Tony Han; Lappi Vaino Johannes; Bing Jiang; Cai Ju


International Conference on Artificial Intelligence and Statistics | 2014

Efficient Transfer Learning Method for Automatic Hyperparameter Tuning

Dani Yogatama; Gideon Mann

Collaboration


Dive into Dani Yogatama's collaborations.

Top Co-Authors

Noah A. Smith (University of Washington)

Brendan O'Connor (Carnegie Mellon University)

Chong Wang (Carnegie Mellon University)