Karl Ni
Lawrence Livermore National Laboratory
Publications
Featured research published by Karl Ni.
Communications of the ACM | 2016
Bart Thomee; David A. Shamma; Gerald Friedland; Benjamin Elizalde; Karl Ni; Douglas N. Poland; Damian Borth; Li-Jia Li
This publicly available curated dataset of almost 100 million photos and videos is free and legal for all.
ACM Multimedia | 2014
Jaeyoung Choi; Bart Thomee; Gerald Friedland; Liangliang Cao; Karl Ni; Damian Borth; Benjamin Elizalde; Luke R. Gottlieb; Carmen J. Carrano; Roger A. Pearce; Douglas N. Poland
The Placing Task is a yearly challenge offered by the MediaEval Multimedia Benchmarking Initiative that requires participants to develop algorithms that automatically predict the geo-location of social media videos and images. We introduce a recent development of a new standardized web-scale geo-tagged dataset for Placing Task 2014, which contains 5.5 million photos and 35,000 videos. This standardized benchmark with a large persistent dataset allows the research community to easily evaluate new algorithms and to analyze their performance with respect to state-of-the-art approaches. We discuss the characteristics of this year's Placing Task along with a description of the new dataset components and how they were collected.
ACM Multimedia | 2015
Julia Bernd; Damian Borth; Carmen J. Carrano; Jaeyoung Choi; Benjamin Elizalde; Gerald Friedland; Luke R. Gottlieb; Karl Ni; Roger A. Pearce; Douglas N. Poland; Khalid Ashraf; David A. Shamma; Bart Thomee
The publication of the Yahoo Flickr Creative Commons 100 Million dataset (YFCC100M)--to date the largest open-access collection of photos and videos--has provided a unique opportunity to stimulate new research in multimedia analysis and retrieval. To make the YFCC100M even more valuable, we have started working towards supplementing it with a comprehensive set of precomputed features and high-quality ground truth annotations. As part of our efforts, we are releasing the YLI feature corpus, as well as the YLI-GEO and YLI-MED annotation subsets. Under the Multimedia Commons Project (MMCP), we are currently laying the groundwork for a common platform and framework around the YFCC100M that (i) facilitates researchers in contributing additional features and annotations, (ii) supports experimentation on the dataset, and (iii) enables sharing of obtained results. This paper describes the YLI features and annotations released thus far, and sketches our vision for the MMCP.
Journal of the Acoustical Society of America | 2018
Aaron Lawson; Karl Ni; Colleen Richey; Zeb Armstrong; Martin Graciarena; Todd Stavish; Cory Stephenson; Jeff Hetherly; Paul Gamble; María Auxiliadora Barrios
The speakers in the room (SITR) corpus is a collaboration between Lab41 and SRI International, designed to be a freely available data set for speech and acoustics research in noisy room conditions. The main focus of the corpus is on distant microphone collection in a series of four rooms of different sizes and configurations. There are both foreground speech and background adversarial sounds, played through high-quality speakers in each room to create multiple, realistic acoustic environments. The foreground speech is played from a randomly rotating speaker to emulate head motion. Foreground speech consists of files from LibriVox audio collections, and the background distractor sounds consist of babble, music, HVAC, TV/radio, dogs, vehicles, and weather sounds drawn from the MUSAN collection. Each room has multiple sessions to exhaustively cover the background-foreground combinations, and the audio is collected with twelve different microphones (omnidirectional lavalier, studio cardioid, and piezoelectric) placed strategically around the room. The resulting data set was designed to enable acoustic research on event detection, background detection, source separation, speech enhancement, source distance, and sound localization, as well as speech research on speaker recognition, speech activity detection, speech recognition, and language recognition.
IEEE Global Conference on Signal and Information Processing | 2013
Karl Ni; Ryan Prenger
Deep learning technology and related algorithms have dramatically broken landmark records for a broad range of learning problems in vision, speech, audio, and text processing. Meanwhile, kernel methods have found commonplace usage due to their nonlinear expressive power and elegant optimization formulation. Based on recent progress in learning high-level, class-specific features from unlabeled data, we improve upon such results by combining nonlinear kernels with a multi-layer (deep) architecture, which we apply at scale. In particular, our experimentation is based on k-means with an RBF kernel, though the method extends straightforwardly to other unsupervised clustering techniques and other reproducing kernel Hilbert spaces. With the proposed method, we discover features distilled from unorganized images, and we augment high-level feature invariance with pooling techniques.
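The paper's full pipeline is not reproduced here, but its core building block, k-means performed in the reproducing kernel Hilbert space induced by an RBF kernel, can be sketched as below. This is a minimal illustrative implementation, not the authors' code: cluster assignments are updated by computing each point's squared distance to the implicit cluster centroid entirely through the kernel matrix, since the centroid itself is never formed explicitly. The function names and the simple deterministic initialization are assumptions for the sketch.

```python
import math

def rbf(x, y, gamma=0.5):
    """Gaussian (RBF) kernel between two feature vectors."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def kernel_kmeans(points, k, iters=20, gamma=0.5):
    """k-means in the feature space induced by the RBF kernel.

    Uses the kernel-trick expansion of the squared distance to a
    cluster centroid mu_c:
        ||phi(x) - mu_c||^2 = K(x,x)
                              - (2/|C|) * sum_{j in C} K(x, x_j)
                              + (1/|C|^2) * sum_{j,l in C} K(x_j, x_l)
    """
    n = len(points)
    K = [[rbf(points[i], points[j], gamma) for j in range(n)] for i in range(n)]
    assign = [i % k for i in range(n)]  # simple deterministic init (an assumption)
    for _ in range(iters):
        members = [[i for i in range(n) if assign[i] == c] for c in range(k)]
        # ||mu_c||^2 term, shared by every point for a given cluster
        intra = [sum(K[i][j] for i in m for j in m) / len(m) ** 2
                 if m else float("inf") for m in members]
        new = []
        for i in range(n):
            dists = [K[i][i] - 2 * sum(K[i][j] for j in m) / len(m) + intra[c]
                     if m else float("inf") for c, m in enumerate(members)]
            new.append(dists.index(min(dists)))
        if new == assign:  # converged
            break
        assign = new
    return assign
```

In the paper's setting this clustering step would be applied to image patches, with the learned cluster responses pooled and fed to the next layer; here it is shown only on toy 2-D data.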
arXiv: Multimedia | 2015
Bart Thomee; David A. Shamma; Gerald Friedland; Benjamin Elizalde; Karl Ni; Douglas N. Poland; Damian Borth; Li-Jia Li
Proceedings of the 22nd European Signal Processing Conference (EUSIPCO) | 2014
Mirco Ravanelli; Benjamin Elizalde; Karl Ni; Gerald Friedland
arXiv: Learning | 2015
Karl Ni; Roger A. Pearce; Kofi Boakye; Brian Van Essen; Damian Borth; Barry Chen; Eric X. Wang
arXiv: Computer Vision and Pattern Recognition | 2015
Takuya Narihira; Damian Borth; Stella X. Yu; Karl Ni; Trevor Darrell
Conference of the International Speech Communication Association (INTERSPEECH) | 2014
Benjamin Elizalde; Mirco Ravanelli; Karl Ni; Damian Borth; Gerald Friedland