Karl Ni
Lawrence Livermore National Laboratory
Publications
Featured research published by Karl Ni.
Communications of the ACM | 2016
Bart Thomee; David A. Shamma; Gerald Friedland; Benjamin Elizalde; Karl Ni; Douglas N. Poland; Damian Borth; Li-Jia Li
This publicly available curated dataset of almost 100 million photos and videos is free and legal for all.
ACM Multimedia | 2014
Jaeyoung Choi; Bart Thomee; Gerald Friedland; Liangliang Cao; Karl Ni; Damian Borth; Benjamin Elizalde; Luke R. Gottlieb; Carmen J. Carrano; Roger A. Pearce; Douglas N. Poland
The Placing Task is a yearly challenge offered by the MediaEval Multimedia Benchmarking Initiative that requires participants to develop algorithms that automatically predict the geo-location of social media videos and images. We introduce a recent development of a new standardized web-scale geo-tagged dataset for Placing Task 2014, which contains 5.5 million photos and 35,000 videos. This standardized benchmark with a large persistent dataset allows the research community to easily evaluate new algorithms and to analyze their performance with respect to state-of-the-art approaches. We discuss the characteristics of this year's Placing Task along with a description of the new dataset components and how they were collected.
ACM Multimedia | 2015
Julia Bernd; Damian Borth; Carmen J. Carrano; Jaeyoung Choi; Benjamin Elizalde; Gerald Friedland; Luke R. Gottlieb; Karl Ni; Roger A. Pearce; Douglas N. Poland; Khalid Ashraf; David A. Shamma; Bart Thomee
The publication of the Yahoo Flickr Creative Commons 100 Million dataset (YFCC100M)--to date the largest open-access collection of photos and videos--has provided a unique opportunity to stimulate new research in multimedia analysis and retrieval. To make the YFCC100M even more valuable, we have started working towards supplementing it with a comprehensive set of precomputed features and high-quality ground truth annotations. As part of our efforts, we are releasing the YLI feature corpus, as well as the YLI-GEO and YLI-MED annotation subsets. Under the Multimedia Commons Project (MMCP), we are currently laying the groundwork for a common platform and framework around the YFCC100M that (i) facilitates researchers in contributing additional features and annotations, (ii) supports experimentation on the dataset, and (iii) enables sharing of obtained results. This paper describes the YLI features and annotations released thus far, and sketches our vision for the MMCP.
Journal of the Acoustical Society of America | 2018
Aaron Lawson; Karl Ni; Colleen Richey; Zeb Armstrong; Martin Graciarena; Todd Stavish; Cory Stephenson; Jeff Hetherly; Paul Gamble; María Auxiliadora Barrios
The speakers in the room (SITR) corpus is a collaboration between Lab41 and SRI International, designed to be a freely available data set for speech and acoustics research in noisy room conditions. The main focus of the corpus is on distant microphone collection in a series of four rooms of different sizes and configurations. There are both foreground speech and background adversarial sounds, played through high-quality speakers in each room to create multiple, realistic acoustic environments. The foreground speech is played from a randomly rotating speaker to emulate head motion. Foreground speech consists of files from LibriVox audio collections, and the background distractor sounds consist of babble, music, HVAC, TV/radio, dogs, vehicles, and weather sounds drawn from the MUSAN collection. Each room has multiple sessions to exhaustively cover the background-foreground combinations, and the audio is collected with twelve different microphones (omnidirectional lavalier, studio cardioid, and piezoelectric) placed strategically around the room. The resulting data set was designed to enable acoustic research on event detection, background detection, source separation, speech enhancement, source distance, and sound localization, as well as speech research on speaker recognition, speech activity detection, speech recognition, and language recognition.
IEEE Global Conference on Signal and Information Processing | 2013
Karl Ni; Ryan Prenger
Deep learning technology and related algorithms have dramatically broken landmark records for a broad range of learning problems in vision, speech, audio, and text processing. Meanwhile, kernel methods have found commonplace usage due to their nonlinear expressive power and elegant optimization formulation. Based on recent progress in learning high-level, class-specific features from unlabeled data, we improve upon such results by combining nonlinear kernels with a multi-layer (deep) architecture, which we apply at scale. In particular, our experimentation is based on k-means with an RBF kernel, though the method extends straightforwardly to other unsupervised clustering techniques and other reproducing kernel Hilbert spaces. With the proposed method, we discover features distilled from unorganized images, and we augment high-level feature invariance with pooling techniques.
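The paper's full pipeline is not reproduced here, but its core building block, k-means performed in the reproducing kernel Hilbert space induced by an RBF kernel, can be sketched as below. This is a minimal illustrative implementation, not the authors' code: cluster assignments are updated by computing each point's squared distance to the implicit cluster centroid entirely through the kernel matrix, since the centroid itself is never formed explicitly. The function names and the simple deterministic initialization are assumptions for the sketch.

```python
import math

def rbf(x, y, gamma=0.5):
    """Gaussian (RBF) kernel between two feature vectors."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def kernel_kmeans(points, k, iters=20, gamma=0.5):
    """k-means in the feature space induced by the RBF kernel.

    Uses the kernel-trick expansion of the squared distance to a
    cluster centroid mu_c:
        ||phi(x) - mu_c||^2 = K(x,x)
                              - (2/|C|) * sum_{j in C} K(x, x_j)
                              + (1/|C|^2) * sum_{j,l in C} K(x_j, x_l)
    """
    n = len(points)
    K = [[rbf(points[i], points[j], gamma) for j in range(n)] for i in range(n)]
    assign = [i % k for i in range(n)]  # simple deterministic init (an assumption)
    for _ in range(iters):
        members = [[i for i in range(n) if assign[i] == c] for c in range(k)]
        # ||mu_c||^2 term, shared by every point for a given cluster
        intra = [sum(K[i][j] for i in m for j in m) / len(m) ** 2
                 if m else float("inf") for m in members]
        new = []
        for i in range(n):
            dists = [K[i][i] - 2 * sum(K[i][j] for j in m) / len(m) + intra[c]
                     if m else float("inf") for c, m in enumerate(members)]
            new.append(dists.index(min(dists)))
        if new == assign:  # converged
            break
        assign = new
    return assign
```

In the paper's setting this clustering step would be applied to image patches, with the learned cluster responses pooled and fed to the next layer; here it is shown only on toy 2-D data.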
arXiv: Multimedia | 2015
Bart Thomee; David A. Shamma; Gerald Friedland; Benjamin Elizalde; Karl Ni; Douglas N. Poland; Damian Borth; Li-Jia Li
Proceedings of the 22nd European Signal Processing Conference (EUSIPCO) | 2014
Mirco Ravanelli; Benjamin Elizalde; Karl Ni; Gerald Friedland
arXiv: Learning | 2015
Karl Ni; Roger A. Pearce; Kofi Boakye; Brian Van Essen; Damian Borth; Barry Chen; Eric X. Wang
arXiv: Computer Vision and Pattern Recognition | 2015
Takuya Narihira; Damian Borth; Stella X. Yu; Karl Ni; Trevor Darrell
Conference of the International Speech Communication Association (INTERSPEECH) | 2014
Benjamin Elizalde; Mirco Ravanelli; Karl Ni; Damian Borth; Gerald Friedland