Publication


Featured research published by Tongtao Zhang.


Empirical Methods in Natural Language Processing | 2015

Cross-document Event Coreference Resolution based on Cross-media Features

Tongtao Zhang; Hongzhi Li; Heng Ji; Shih-Fu Chang

In this paper we focus on a new problem of event coreference resolution across television news videos. Based on the observation that the contents from multiple data modalities are complementary, we develop a novel approach to jointly encode effective features from both closed captions and video key frames. Experimental results demonstrate that visual features provide a 7.2% absolute F-score gain over state-of-the-art text-based event extraction and coreference resolution.


North American Chapter of the Association for Computational Linguistics | 2016

Cross-media Event Extraction and Recommendation

Di Lu; Clare R. Voss; Fangbo Tao; Xiang Ren; Rachel Guan; Rostyslav Korolov; Tongtao Zhang; Dongang Wang; Hongzhi Li; Taylor Cassidy; Heng Ji; Shih-Fu Chang; Jiawei Han; William A. Wallace; James A. Hendler; Mei Si; Lance M. Kaplan

The sheer volume of unstructured multimedia data (e.g., texts, images, videos) posted on the Web during events of general interest is overwhelming and difficult to distill if seeking information relevant to a particular concern. We have developed a comprehensive system that searches, identifies, organizes and summarizes complex events from multiple data modalities. It also recommends events related to the user’s ongoing search based on previously selected attribute values and dimensions of events being viewed. In this paper we briefly present the algorithms of each component and demonstrate the system’s capabilities.


Linguistic Annotation Workshop | 2016

Building a Cross-document Event-Event Relation Corpus

Yu Hong; Tongtao Zhang; Tim O'Gorman; Sharone Horowit-Hendler; Heng Ji; Martha Palmer

We propose a new task of extracting event-event relations across documents. We present our efforts at designing an annotation schema and building a corpus for this task. Our schema includes five main types of relations: Inheritance, Expansion, Contingency, Comparison and Temporality, along with 21 subtypes. We also lay out the main challenges based on detailed inter-annotator disagreement and error analysis. We hope these resources can serve as a benchmark to encourage research on this new problem.


ACM Multimedia | 2012

Active query sensing: Suggesting the best query view for mobile visual search

Rongrong Ji; Felix X. Yu; Tongtao Zhang; Shih-Fu Chang

While much exciting progress is being made in mobile visual search, one important question has been left unexplored in all current systems. When searching for objects or scenes in the 3D world, which viewing angle is more likely to be successful? More particularly, if the first query fails to find the right target, how should the user control the mobile camera to form the second query? In this article, we propose a novel Active Query Sensing system for mobile location search, which actively suggests the best subsequent query view to recognize the physical location in the mobile environment. The proposed system includes two unique components: (1) an offline process for analyzing the saliencies of different views associated with each geographical location, which predicts the location search precisions of individual views by modeling their self-retrieval score distributions; and (2) an online process for estimating the view of an unseen query and suggesting the best subsequent view change. Specifically, the optimal viewing angle change for the next query can be formulated with an online information-theoretic approach. Using a scalable visual search system implemented over a NYC street view dataset (0.3 million images), we show a performance gain by reducing the failure rate of mobile location search to only 12% after the second query. We have also implemented an end-to-end functional system, including user interfaces on iPhones, client-server communication, and a remote search server. This work may open up an exciting new direction for developing interactive mobile media applications through the innovative exploitation of active sensing and query formulation.


ACM Multimedia | 2011

A mobile location search system with active query sensing

Felix X. Yu; Rongrong Ji; Tongtao Zhang; Shih-Fu Chang

How should the second query be taken once the first query fails in mobile location search based on visual recognition? In this demo, we describe a mobile search system with a unique Active Query Sensing (AQS) function to intelligently guide the mobile user to take a successful second query. This suggestion is built upon a scalable visual matching system covering over 0.3 million street view reference images in New York City, where each location is associated with multiple surrounding views and panorama. In online search, once the initial search fails, the system performs online analysis and suggests that the mobile user turn to the most discriminative viewing angle to take the second visual query, from which the search performance is expected to greatly improve. The AQS suggestion is based on offline salient view discovery, online viewing angle prediction, and an intelligent turning decision. Our experiments show that AQS can improve mobile location search with a performance gain as high as 100%, reducing the failure rate to only 12% after the second visual query.


ACM Multimedia | 2017

Improving Event Extraction via Multimodal Integration

Tongtao Zhang; Spencer Whitehead; Hanwang Zhang; Hongzhi Li; Joseph G. Ellis; Lifu Huang; Wei Liu; Heng Ji; Shih-Fu Chang

In this paper, we focus on improving Event Extraction (EE) by incorporating visual knowledge with words and phrases from text documents. We first discover visual patterns from large-scale text-image pairs in a weakly-supervised manner and then propose a multimodal event extraction algorithm where the event extractor is jointly trained with textual features and visual patterns. Extensive experimental results on benchmark data sets demonstrate that the proposed multimodal EE method can achieve significantly better performance on event extraction: absolute 7.1% F-score gain on event trigger labeling and 8.5% F-score gain on event argument labeling.


IEEE MultiMedia | 2016

Multimedia Hashing and Networking

Wei Liu; Tongtao Zhang

This department discusses multimedia hashing and networking. The authors summarize shallow-learning-based hashing and deep-learning-based hashing. By exploiting successful shallow-learning algorithms, state-of-the-art hashing techniques have been widely used in high-efficiency multimedia storage, indexing, and retrieval, especially in multimedia search applications on smartphone devices. The authors also introduce Multimedia Information Networks (MINets) and present one paradigm of leveraging MINets to incorporate both visual and textual information to reach a sensible event coreference resolution. The goal is to make deep learning practical in realistic multimedia applications.


Empirical Methods in Natural Language Processing | 2015

Biography-Dependent Collaborative Entity Archiving for Slot Filling

Yu Hong; Xiaobin Wang; Yadong Chen; Jian Wang; Tongtao Zhang; Heng Ji

Knowledge Base Population (KBP) tasks, such as slot filling, show the particular importance of entity-oriented automatic relevant document acquisition. Rich, diverse and reliable relevant documents satisfy the fundamental requirement that a KBP system explores the nature of an entity. To address the bottleneck between comprehensiveness and definiteness of acquisition, we propose a collaborative archiving method. In particular, we introduce topic modeling methodologies into entity biography profiling, so as to build a bridge between fuzzy and exact matching. On one side, we employ the topics in a small set of high-quality relevant documents (i.e., exact matching results) to summarize the life slices of a target entity (i.e., biography), and on the other side, we use the biography as reliable reference material to detect new truly relevant documents from large-scale, partially complete pseudo-feedback (i.e., fuzzy matching results). We leverage the archiving method to enhance slot filling systems. Experiments on the KBP corpus show significant improvement over the state of the art.


International Joint Conference on Artificial Intelligence | 2013

Semi-supervised learning with manifold fitted graphs

Tongtao Zhang; Rongrong Ji; Wei Liu; Dacheng Tao; Gang Hua


Archive | 2012

Systems and methods for automatically determining an improved view for a visual query in a mobile search

Shih-Fu Chang; Xinnan Yu; Rongrong Ji; Tongtao Zhang

Collaboration


Dive into Tongtao Zhang's collaboration.

Top Co-Authors

Heng Ji

Rensselaer Polytechnic Institute

Lifu Huang

Rensselaer Polytechnic Institute

Spencer Whitehead

Rensselaer Polytechnic Institute

Boliang Zhang

Rensselaer Polytechnic Institute

Di Lu

Rensselaer Polytechnic Institute
