Publication


Featured research published by Tae Yano.


North American Chapter of the Association for Computational Linguistics | 2009

Predicting Response to Political Blog Posts with Topic Models

Tae Yano; William W. Cohen; Noah A. Smith

In this paper we model discussions in online political blogs. To do this, we extend Latent Dirichlet Allocation (Blei et al., 2003) in various ways to capture different characteristics of the data. Our models jointly describe the generation of the primary documents (posts) as well as the authorship and, optionally, the contents of the blog community's verbal reactions to each post (comments). We evaluate our model on a novel comment prediction task where the models are used to predict which blog users will leave comments on a given post. We also provide a qualitative discussion about what the models discover.


Conference on Information and Knowledge Management | 2013

Identifying salient entities in web pages

Michael Gamon; Tae Yano; Xinying Song; Johnson Apacible; Patrick Pantel

We propose a system that determines the salience of entities within web documents. Many recent advances in commercial search engines leverage the identification of entities in web pages. However, for many pages, only a small subset of entities is central to the document, which can lead to degraded relevance for entity-triggered experiences. We address this problem by devising a system that scores each entity on a web page according to its centrality to the page content. We propose salience classification functions that incorporate various cues from document content, web search logs, and a large web graph. To cost-effectively train the models, we introduce a soft labeling methodology that generates a set of annotations based on user behaviors observed in web search logs. We evaluate several variations of our model via a large-scale empirical study conducted over a test set, which we release publicly to the research community. We demonstrate that our methods significantly outperform competitive baselines and the previous state of the art, while keeping the human annotation cost to a minimum.


Knowledge Discovery and Data Mining | 2013

Exploring venue-based city-to-city similarity measures

Daniel Preoţiuc-Pietro; Justin Cranshaw; Tae Yano

In this work we explore the use of incidentally generated social network data for the folksonomic characterization of cities by the types of amenities located within them. Using data collected about venue categories in various cities, we examine the effect of different granularities of spatial aggregation and data normalization when representing a city as a collection of its venues. We introduce three vector-based representations of a city, where aggregations of the venue categories are done within a grid structure, within the city's municipal neighborhoods, and across the city as a whole. We apply our methods to a novel dataset consisting of Foursquare venue data from 17 cities across the United States, totaling over 1 million venues. Our preliminary investigation demonstrates that different assumptions in the urban perception could lead to qualitative, yet distinctive, variations in the induced city description and categorization.


Archive | 2007

Selecting and Categorizing Textual Descriptions of Images in the Context of an Image Indexer's Toolkit

Rebecca J. Passonneau; Tae Yano; Judith L. Klavans; Rachael Bradley; Carolyn Sheffield; Eileen G. Abels; Laura Jenemann

We describe a series of studies aimed at identifying specifications for a text extraction module of an image indexer’s toolkit. The materials used in the studies consist of images paired with paragraph sequences that describe the images. We administered a pilot survey to visual resource center professionals at three universities to determine what types of paragraphs would be preferred for metadata selection. Respondents generally showed a strong preference for one of two paragraphs they were presented with, indicating that not all paragraphs that describe images are seen as good sources of metadata. We developed a set of semantic category labels to assign to spans of text in order to distinguish between different types of information about the images, and thus to classify metadata contexts. Human agreement on metadata is notoriously variable. In order to maximize agreement, we conducted four human labeling experiments using the seven semantic category labels we developed. A subset of our labelers had much higher inter-annotator reliability, and the highest reliability occurs when labelers can pick two labels per text unit.


International Conference on Weblogs and Social Media | 2010

What’s Worthy of Comment? Content and Comment Volume in Political Blogs

Tae Yano; Noah A. Smith


North American Chapter of the Association for Computational Linguistics | 2010

Shedding (a Thousand Points of) Light on Biased Language

Tae Yano; Philip Resnik; Noah A. Smith


North American Chapter of the Association for Computational Linguistics | 2012

Textual Predictors of Bill Survival in Congressional Committees

Tae Yano; Noah A. Smith; John Wilkerson


International Conference on Weblogs and Social Media | 2013

A Penny for Your Tweets: Campaign Contributions and Capitol Hill Microblogs

Tae Yano; Dani Yogatama; Noah A. Smith


Empirical Methods in Natural Language Processing | 2011

Structured Databases of Named Entities from Bayesian Nonparametrics

Jacob Eisenstein; Tae Yano; William W. Cohen; Noah A. Smith; Eric P. Xing


Archive | 2013

Understanding Document Aboutness Step One: Identifying Salient Entities

Michael Gamon; Tae Yano; Xinying Song; Johnson Apacible; Patrick Pantel

Collaboration


Tae Yano's most frequent collaborators.

Top Co-Authors

Noah A. Smith

University of Washington


William W. Cohen

Carnegie Mellon University


Dani Yogatama

Carnegie Mellon University