Publication


Featured research published by Julia Bernd.


International Conference on Multimedia Retrieval | 2015

Audio-Based Multimedia Event Detection with DNNs and Sparse Sampling

Khalid Ashraf; Benjamin Elizalde; Forrest N. Iandola; Matthew W. Moskewicz; Julia Bernd; Gerald Friedland; Kurt Keutzer

This paper presents advances in analyzing audio content information to detect events in videos, such as a parade or a birthday party. We developed a set of tools for audio processing within the predominantly vision-focused deep neural network (DNN) framework Caffe. Using these tools, we show, for the first time, the potential of using only a DNN for audio-based multimedia event detection. Training DNNs for event detection using the entire audio track from each video causes a computational bottleneck. Here, we address this problem by developing a sparse audio frame-sampling method that improves event-detection speed and accuracy. We achieved a 10 percentage-point improvement in event classification accuracy, with a 200x reduction in the number of training input examples as compared to using the entire track. This reduction in input feature volume led to a 16x reduction in the size of the DNN architecture and a 300x reduction in training time. We applied our method using the recently released YLI-MED dataset and compared our results with a state-of-the-art system and with results reported in the literature for TRECVID MED. Our results show much higher MAP scores compared to a baseline i-vector system, at a significantly reduced computational cost. The speed improvement is relevant for processing videos on a large scale, and could enable more effective deployment in mobile systems.
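The sparse frame-sampling idea can be illustrated with a minimal sketch; the function name and the uniform-spacing strategy are illustrative assumptions, not the paper's exact procedure:

```python
import numpy as np

def sparse_sample_frames(features, num_samples):
    """Subsample a fixed number of frames from a track's feature
    matrix (frames x dims), reducing training input volume."""
    num_frames = features.shape[0]
    if num_frames <= num_samples:
        return features
    # Evenly spaced frame indices across the whole track
    idx = np.linspace(0, num_frames - 1, num_samples).astype(int)
    return features[idx]

# A 2000-frame track reduced to 10 training inputs (a 200x reduction)
track = np.random.rand(2000, 40)
sampled = sparse_sample_frames(track, 10)
print(sampled.shape)  # (10, 40)
```

Feeding a network a handful of frames per track instead of the full sequence is what drives the reported reductions in architecture size and training time.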


ACM Multimedia | 2015

Kickstarting the Commons: The YFCC100M and the YLI Corpora

Julia Bernd; Damian Borth; Carmen J. Carrano; Jaeyoung Choi; Benjamin Elizalde; Gerald Friedland; Luke R. Gottlieb; Karl Ni; Roger A. Pearce; Douglas N. Poland; Khalid Ashraf; David A. Shamma; Bart Thomee

The publication of the Yahoo Flickr Creative Commons 100 Million dataset (YFCC100M), to date the largest open-access collection of photos and videos, has provided a unique opportunity to stimulate new research in multimedia analysis and retrieval. To make the YFCC100M even more valuable, we have started working towards supplementing it with a comprehensive set of precomputed features and high-quality ground truth annotations. As part of our efforts, we are releasing the YLI feature corpus, as well as the YLI-GEO and YLI-MED annotation subsets. Under the Multimedia Commons Project (MMCP), we are currently laying the groundwork for a common platform and framework around the YFCC100M that (i) facilitates researchers in contributing additional features and annotations, (ii) supports experimentation on the dataset, and (iii) enables sharing of obtained results. This paper describes the YLI features and annotations released thus far, and sketches our vision for the MMCP.


ACM Multimedia | 2015

Insights into Audio-Based Multimedia Event Classification with Neural Networks

Mirco Ravanelli; Benjamin Elizalde; Julia Bernd; Gerald Friedland

Multimedia Event Detection (MED) aims to identify events (also called scenes) in videos, such as a flash mob or a wedding ceremony. Audio content information complements cues such as visual content and text. In this paper, we explore the optimization of neural networks (NNs) for audio-based multimedia event classification, and discuss some insights towards more effectively using this paradigm for MED. We explore different architectures, in terms of number of layers and number of neurons. We also assess the performance impact of pre-training with Restricted Boltzmann Machines (RBMs) in contrast with random initialization, and explore the effect of varying the context window for the input to the NNs. Lastly, we compare the performance of Hidden Markov Models (HMMs) with a discriminative classifier for the event classification. We used the publicly available event-annotated YLI-MED dataset. Our results showed a performance improvement of more than 6% absolute accuracy compared to the latest results reported in the literature. Interestingly, these results were obtained with a single-layer neural network with random initialization, suggesting that standard approaches with deep learning and RBM pre-training are not fully adequate to address the high-level video event-classification task.
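The context-window variation explored here can be sketched as stacking each frame with its neighbors before it is fed to the network; the helper name, edge-padding choice, and feature dimensions below are illustrative assumptions:

```python
import numpy as np

def stack_context(frames, context):
    """Concatenate each frame with +/- `context` neighboring frames,
    widening the NN's input window (edge frames are replicated)."""
    padded = np.pad(frames, ((context, context), (0, 0)), mode='edge')
    n = frames.shape[0]
    return np.stack([padded[i:i + 2 * context + 1].reshape(-1)
                     for i in range(n)])

frames = np.random.rand(100, 13)        # e.g. 100 frames of 13-dim features
x = stack_context(frames, context=5)    # 11-frame window per input
print(x.shape)  # (100, 143) -- 11 frames x 13 dims
```

Widening the window trades a larger input layer for more temporal context per classification decision, which is the knob the paper's experiments vary.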


ACM Multimedia | 2016

A Discriminative and Compact Audio Representation for Event Detection

Liping Jing; Bo Liu; Jaeyoung Choi; Adam Janin; Julia Bernd; Michael W. Mahoney; Gerald Friedland

This paper presents a novel two-phase method for audio representation: Discriminative and Compact Audio Representation (DCAR). In the first phase, each audio track is modeled using a Gaussian mixture model (GMM) that includes several components to capture the variability within that track. The second phase takes into account both global structure and local structure. In this phase, the components are rendered more discriminative and compact by formulating an optimization problem on Grassmannian manifolds, which we found represents the structure of audio effectively. Experimental results on the YLI-MED dataset show that the proposed DCAR representation consistently outperforms state-of-the-art audio representations: i-vector, mv-vector, and GMM.


ACM Multimedia | 2016

Multimedia Privacy

Gerald Friedland; Symeon Papadopoulos; Julia Bernd; Yiannis Kompatsiaris

This tutorial brings together a number of recent advances at the nexus of multimedia analysis, online privacy, and social media mining. Our goal is to offer a multidisciplinary view of the emerging field of Multimedia Privacy: the study of privacy issues arising in the context of multimedia sharing in online platforms, and the pursuit of new approaches to mitigating those issues within multimedia computer science.


IEEE MultiMedia | 2015

Teaching Privacy: Multimedia Making a Difference

Julia Bernd; Blanca Gordo; Jaeyoung Choi; Bryan Morgan; Nicholas Henderson; Serge Egelman; Daniel D. Garcia; Gerald Friedland

As part of the Teaching Privacy project, researchers at the International Computer Science Institute and the University of California, Berkeley, are developing learning tools to empower K-12 students and college undergraduates in making informed choices about privacy. Teaching Privacy in part grew out of empirical research on the privacy implications of multimedia technology; this research generated a great deal of interest from teachers, who often want to provide students with guidance on online privacy but feel they are not sufficiently versed in the technical details. These interactions inspired the project, which focuses on working with teachers through outreach, curriculum-building, and professional development. This article describes the project team's interdisciplinary approach to developing and disseminating engaging, interactive educational apps that demonstrate what happens to personal information on the Internet, with a particular focus on multimedia, and their approach to explaining the underlying social and technical principles in accessible terms.


IEEE Transactions on Multimedia | 2017

DCAR: A Discriminative and Compact Audio Representation for Audio Processing

Liping Jing; Bo Liu; Jaeyoung Choi; Adam Janin; Julia Bernd; Michael W. Mahoney; Gerald Friedland

This paper presents a novel two-phase method for audio representation, discriminative and compact audio representation (DCAR), and evaluates its performance at detecting events and scenes in consumer-produced videos. In the first phase of DCAR, each audio track is modeled using a Gaussian mixture model (GMM) that includes several components to capture the variability within that track. The second phase takes into account both global structure and local structure. In this phase, the components are rendered more discriminative and compact by formulating an optimization problem on a Grassmannian manifold. The learned components can effectively represent the structure of audio. Our experiments used the YLI-MED and DCASE Acoustic Scenes datasets. The results show that variants on the proposed DCAR representation consistently outperform four popular audio representations (mv-vector, i-vector, GMM, and HEM-GMM). The advantage is significant for both easier and harder discrimination tasks; we discuss how these performance differences across tasks follow from how each type of model leverages (or does not leverage) the intrinsic structure of the data.
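The first (per-track modeling) phase of DCAR can be sketched with scikit-learn's GaussianMixture; the component count and diagonal covariances are illustrative choices, and the second phase's discriminative refinement on the Grassmannian manifold is not shown:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def model_track(features, num_components=4):
    """Fit a per-track GMM whose components capture the variability
    within that audio track (DCAR's first phase, sketched)."""
    gmm = GaussianMixture(n_components=num_components,
                          covariance_type='diag', random_state=0)
    gmm.fit(features)
    # The component means and weights serve as the track's representation
    return gmm.means_, gmm.weights_

track = np.random.RandomState(0).rand(500, 20)  # 500 frames x 20 dims
means, weights = model_track(track)
print(means.shape, weights.shape)  # (4, 20) (4,)
```

Each track is thus summarized by a small set of components rather than its raw frames, which is what makes the representation compact before the discriminative phase operates on it.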


arXiv: Multimedia | 2015

The YLI-MED Corpus: Characteristics, Procedures, and Plans

Julia Bernd; Damian Borth; Benjamin Elizalde; Gerald Friedland; Heather Gallagher; Luke R. Gottlieb; Adam Janin; Sara Karabashlieva; Jocelyn Takahashi; Jennifer Won


Technical Symposium on Computer Science Education | 2016

The Teaching Privacy Curriculum

Serge Egelman; Julia Bernd; Gerald Friedland; Daniel D. Garcia


arXiv: Sound | 2016

DCAR: A Discriminative and Compact Audio Representation to Improve Event Detection

Liping Jing; Bo Liu; Jaeyoung Choi; Adam Janin; Julia Bernd; Michael W. Mahoney; Gerald Friedland

Collaboration


Dive into Julia Bernd's collaborations.

Top Co-Authors

Gerald Friedland (International Computer Science Institute)
Jaeyoung Choi (International Computer Science Institute)
Adam Janin (University of California)
Benjamin Elizalde (International Computer Science Institute)
Bo Liu (Agricultural University of Hebei)
Liping Jing (Beijing Jiaotong University)
Serge Egelman (International Computer Science Institute)