Joseph G. Ellis
Columbia University
Publications
Featured research published by Joseph G. Ellis.
international conference on multimodal interfaces | 2014
Joseph G. Ellis; Brendan Jou; Shih-Fu Chang
We present a multimodal sentiment study performed on a novel collection of videos mined from broadcast and cable television news programs. To the best of our knowledge, this is the first dataset released for studying sentiment in the domain of broadcast video news. We describe our algorithm for the processing and creation of person-specific segments from news video, yielding 929 sentence-length videos, which are annotated via Amazon Mechanical Turk. The spoken transcript and the video content itself are each annotated for their expression of positive, negative, or neutral sentiment. Based on these gathered user annotations, we demonstrate the importance of taking multimodal information into account for sentiment prediction in news video, in particular challenging previous text-based approaches that rely solely on available transcripts. We show that as much as 21.54% of the sentiment annotations for transcripts differ from their respective sentiment annotations when the video clip itself is presented. We present audio and visual classification baselines over a three-way sentiment prediction of positive, negative, and neutral, as well as the influence of person-dependent versus person-independent classification on performance. Finally, we release the News Rover Sentiment dataset to the greater research community.
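The multimodal baselines described above can be approximated with a simple late-fusion setup. The sketch below is illustrative only: it assumes pre-extracted per-clip feature vectors for each modality (the dimensions, feature types, and classifiers are placeholder assumptions, not the paper's actual pipeline).

```python
# Minimal late-fusion sketch for three-way sentiment prediction.
# Feature dimensions and the classifier choice are illustrative assumptions.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
LABELS = ["negative", "neutral", "positive"]

# Stand-ins for transcript, audio, and visual features of 929 clips.
n_clips = 929
text_feats = rng.normal(size=(n_clips, 300))    # e.g. averaged word embeddings
audio_feats = rng.normal(size=(n_clips, 64))    # e.g. prosodic statistics
visual_feats = rng.normal(size=(n_clips, 128))  # e.g. facial-expression descriptors
y = rng.integers(0, 3, size=n_clips)            # stand-in sentiment labels

# Train one unimodal classifier per modality, then average their
# class probabilities (simple late fusion).
modalities = [text_feats, audio_feats, visual_feats]
models = [LogisticRegression(max_iter=1000).fit(X, y) for X in modalities]
fused = np.mean([m.predict_proba(X) for m, X in zip(models, modalities)], axis=0)
pred = fused.argmax(axis=1)
print("fused training accuracy:", (pred == y).mean())
```

With real features, the gap between the text-only classifier and the fused prediction is one way to quantify the disagreement between transcript-only and video-based sentiment noted in the abstract.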
international conference on multimedia retrieval | 2018
Hongzhi Li; Joseph G. Ellis; Lei Zhang; Shih-Fu Chang
Visual patterns represent the discernible regularity in the visual world. They capture the essential nature of visual objects or scenes. Understanding and modeling visual patterns is a fundamental problem in visual recognition with wide-ranging applications. In this paper, we study the problem of visual pattern mining and propose a novel deep neural network architecture called PatternNet for discovering patterns that are both discriminative and representative. The proposed PatternNet leverages the filters in the last convolutional layer of a convolutional neural network to find locally consistent visual patches, and by combining these filters we can effectively discover unique visual patterns. In addition, PatternNet can discover visual patterns efficiently without performing expensive image patch sampling, an advantage that provides an order-of-magnitude speedup compared to most other approaches. We evaluate the proposed PatternNet subjectively, by showing randomly selected visual patterns discovered by our method, and quantitatively, by performing image classification with the identified visual patterns and comparing our performance with the current state-of-the-art. We also directly evaluate the quality of the discovered visual patterns by leveraging the identified patterns as proposed objects in an image and comparing with other relevant methods. Our proposed network and procedure, PatternNet, outperforms competing methods on the tasks described.
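A rough sketch of the core idea, as described in the abstract, follows: treat spatial locations where a last-layer convolutional filter fires strongly and consistently across images as candidate visual patches. This is not the authors' implementation; the tiny backbone, batch, and thresholds are placeholder assumptions.

```python
# Sketch: locate filters in the last conv layer that respond strongly and
# consistently across a batch of images, as candidate "pattern" filters.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Placeholder backbone; in practice this would be a pretrained CNN.
backbone = nn.Sequential(
    nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),  # "last conv layer"
)

images = torch.rand(8, 3, 128, 128)          # stand-in image batch
with torch.no_grad():
    fmap = backbone(images)                  # (8, 64, 16, 16) feature maps

# For each filter, find its strongest spatial response in each image;
# consistently high responses suggest a recurring local pattern.
B, C, H, W = fmap.shape
peak_vals, _ = fmap.view(B, C, H * W).max(dim=2)   # (B, C) peak response per filter

# Keep filters that fire above a (here arbitrary) threshold in most images.
fires = (peak_vals > peak_vals.mean()).float().mean(dim=0)
candidate_filters = (fires > 0.75).nonzero(as_tuple=True)[0]
print("candidate pattern filters:", candidate_filters.tolist())
```

Because the candidate locations come directly from feature-map peaks, no explicit image-patch sampling is needed, which is the source of the speedup claimed above.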
acm multimedia | 2017
Tongtao Zhang; Spencer Whitehead; Hanwang Zhang; Hongzhi Li; Joseph G. Ellis; Lifu Huang; Wei Liu; Heng Ji; Shih-Fu Chang
In this paper, we focus on improving Event Extraction (EE) by incorporating visual knowledge with words and phrases from text documents. We first discover visual patterns from large-scale text-image pairs in a weakly-supervised manner and then propose a multimodal event extraction algorithm where the event extractor is jointly trained with textual features and visual patterns. Extensive experimental results on benchmark data sets demonstrate that the proposed multimodal EE method can achieve significantly better performance on event extraction: absolute 7.1% F-score gain on event trigger labeling and 8.5% F-score gain on event argument labeling.
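The fusion idea described above can be sketched as concatenating textual features with visual-pattern features and training a single classifier for event trigger labels. The dimensions, toy model, and label count below are illustrative assumptions, not the paper's architecture.

```python
# Sketch: jointly train an event-trigger classifier on concatenated
# textual and visual-pattern features. All sizes are placeholders.
import torch
import torch.nn as nn

torch.manual_seed(0)
n_tokens, text_dim, vis_dim, n_event_types = 256, 300, 128, 34

text_feats = torch.randn(n_tokens, text_dim)    # e.g. contextual word embeddings
vis_feats = torch.randn(n_tokens, vis_dim)      # visual-pattern features for the document
labels = torch.randint(0, n_event_types, (n_tokens,))

classifier = nn.Sequential(
    nn.Linear(text_dim + vis_dim, 256), nn.ReLU(),
    nn.Linear(256, n_event_types),
)
opt = torch.optim.Adam(classifier.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for step in range(100):                          # joint training on fused features
    logits = classifier(torch.cat([text_feats, vis_feats], dim=1))
    loss = loss_fn(logits, labels)
    opt.zero_grad()
    loss.backward()
    opt.step()
print("final training loss:", float(loss))
```

Comparing this fused classifier against a text-only variant (dropping `vis_feats`) mirrors the ablation implied by the reported F-score gains.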
international conference on multimedia retrieval | 2016
Emily Song; Joseph G. Ellis; Hongzhi Li; Shih-Fu Chang
Accurately gauging the political atmosphere is especially difficult in this day and age, as individuals have access to a constantly growing collection of written and audiovisual news sources. This is especially true with regard to the U.S. presidential election, as there are numerous candidates, countless stories, and opinion articles discussing the merits of each particular candidate. It is therefore challenging for people to make an accurate assessment of what each candidate represents and how they would act if elected into office. To address this problem, we present a large-scale dataset of videos of politicians speaking, organized by the topics they are speaking about, and a user interface for exploring this dataset. Our interface links people and events to relevant pieces of audiovisual media and presents the desired information in a meaningful and intuitive manner. Our approach is unique in linking directly to footage of politicians speaking about specific topics, rather than to textual quotes alone. We describe the larger underlying infrastructure, a novel automated system that crawls thousands of internet news sources and 100 television news channels daily, automatically discovers entities, and indexes the content into events and topics. We examine how our user interface provides helpful and unique insights to its users, and give an example of the type of large-scale trend analysis that can be performed with our system. Our online demo can be accessed at: http://www.ee.columbia.edu/dvmm/PoliticialSpeakerDemo
acm multimedia | 2016
Joseph G. Ellis; Svebor Karaman; Hongzhi Li; Hong Bin Shim; Shih-Fu Chang
With the growth of social media platforms in recent years, social media is now a major source of information and news for many people around the world. In particular, the rise of hashtags has helped build communities of discussion around particular news stories, topics, opinions, and ideologies. However, television news programs still provide value and are used by a vast majority of the population to obtain their news, yet these videos are not easily linked to the broader discussion on social media. We have built a novel pipeline that places television news in its relevant social media context by leveraging hashtags. In this paper, we present a method for automatically collecting television news and social media content (Twitter) and discovering the hashtags that are relevant for a TV news video. Our algorithms incorporate both the visual and text information within social media and television content, and we show that leveraging both modalities improves performance over single-modality approaches.
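One way to picture the ranking step described above is as a multimodal similarity score between a TV news segment and each candidate hashtag, combining a text score (transcript versus tweets using the hashtag) with a visual score (keyframes versus tweet images). The sketch below is a hedged illustration under those assumptions; the feature representations and fusion weight are not from the paper.

```python
# Sketch: rank candidate hashtags for a TV news segment by a weighted
# combination of text and visual cosine similarity. Features are stand-ins.
import numpy as np

rng = np.random.default_rng(0)

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8)

hashtags = ["#election2016", "#debate", "#weather"]
# Stand-ins for aggregated tweet features per hashtag.
tag_text = {h: rng.normal(size=300) for h in hashtags}
tag_visual = {h: rng.normal(size=512) for h in hashtags}

# Stand-ins for one TV news segment's transcript and keyframe features.
video_text = rng.normal(size=300)
video_visual = rng.normal(size=512)

alpha = 0.5  # modality weight; a tunable assumption, not a published value
scores = {
    h: alpha * cosine(video_text, tag_text[h])
       + (1 - alpha) * cosine(video_visual, tag_visual[h])
    for h in hashtags
}
print(sorted(scores.items(), key=lambda kv: kv[1], reverse=True))
```

Setting `alpha` to 0 or 1 recovers the single-modality baselines the abstract compares against.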
acm multimedia | 2013
Brendan Jou; Hongzhi Li; Joseph G. Ellis; Daniel Morozoff-Abegauz; Shih-Fu Chang
acm multimedia | 2016
Hongzhi Li; Joseph G. Ellis; Heng Ji; Shih-Fu Chang
international world wide web conferences | 2016
Maja R. Rudolph; Joseph G. Ellis; David M. Blei
Archive | 2016
Shih-Fu Chang; Brendan Jou; Hongzhi Li; Joseph G. Ellis; Daniel Morozoff-Abezgauz
international symposium on multimedia | 2014
Joseph G. Ellis; W. Sabrina Lin; Ching-Yung Lin; Shih-Fu Chang