Yi-Jie Lu
City University of Hong Kong
Publication
Featured research published by Yi-Jie Lu.
IEEE MultiMedia | 2011
Xiao Wu; Yi-Jie Lu; Qiang Peng; Chong-Wah Ngo
With the proliferation of social media, the volume of Web videos has grown exponentially. This article discusses mining event structures from Web video search results using text analysis, burst detection, and clustering.
international conference on multimedia retrieval | 2016
Yi-Jie Lu; Hao Zhang; Maaike de Boer; Chong-Wah Ngo
Complex video event detection without visual examples is a challenging problem in multimedia retrieval. We present a state-of-the-art framework for event search that needs neither exemplar videos nor textual metadata in the search corpus. To perform event search given only query words, the core of our framework is a large, pre-built bank of concept detectors that describe the content of a video in terms of object, scene, action, and activity concepts. Leveraging such knowledge can effectively narrow the semantic gap between a textual query and the visual content of videos. Beyond the large concept bank, this paper focuses on two challenges that strongly affect retrieval performance as the size of the concept bank grows: (1) how to choose the right concepts in the concept bank to accurately represent the query; (2) how to minimize the influence of noisy concepts when they are inevitably chosen. We share novel insights on these problems, paving the way for a practical system that achieved the best performance in NIST TRECVID 2015.
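To make the concept-bank idea concrete, here is a minimal sketch of zero-example event search. Everything below is a toy assumption: the concept names, embeddings, and detector responses are hypothetical stand-ins for the thousands of pre-trained detectors and learned word embeddings the paper describes.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical concept bank: name -> embedding vector (a stand-in for
# word2vec embeddings of concept names).
concept_names = ["dog", "parade", "car_racing", "kitchen", "crowd"]
concept_emb = {c: rng.normal(size=50) for c in concept_names}

# Hypothetical detector responses: one score per (video, concept), as if
# produced by running pre-trained concept detectors over each video.
responses = rng.random((1000, len(concept_names)))

def select_concepts(query_emb, top_k=2):
    """Pick the concepts whose embeddings are most similar to the query."""
    sims = []
    for i, name in enumerate(concept_names):
        v = concept_emb[name]
        cos = np.dot(query_emb, v) / (np.linalg.norm(query_emb) * np.linalg.norm(v))
        sims.append((cos, i))
    sims.sort(reverse=True)
    return sims[:top_k]

def search(query_emb, top_k=2, n_results=5):
    """Rank videos by a similarity-weighted sum of the selected detectors."""
    picked = select_concepts(query_emb, top_k)
    scores = sum(weight * responses[:, i] for weight, i in picked)
    return np.argsort(scores)[::-1][:n_results]

print(search(rng.normal(size=50)))
```

Keeping only the top-scoring concepts is the simplest way to hold noisy detectors out of the final ranking; the paper's actual selection and noise-suppression strategies are more involved.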
conference on multimedia modeling | 2017
Yi-Jie Lu; Phuong Anh Nguyen; Hao Zhang; Chong-Wah Ngo
Our successful multimedia event detection system at TRECVID 2015 showed its strength in handling complex concepts in a query. The system was based on a large number of pre-trained concept detectors bridging the textual-to-visual relation. In this paper, we enhance the system by enabling a human in the loop. To help a user quickly satisfy an information need, we incorporate concept screening, video reranking by highlighted concepts, relevance feedback, and color sketch to refine a coarse retrieval result. The aim is to eventually arrive at a system suitable for both Ad-hoc Video Search and Known-Item Search. In addition, recognizing the difficulty of distinguishing shots of very similar scenes, we also explore automatic story annotation along the timeline of a video, so that a user can quickly grasp the story surrounding a target shot and reject shots with the wrong context. With story annotation, a user can also refine the search result by simply adding a few keywords to a special “context field” of the query.
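As an illustration of reranking with a human in the loop, the following sketch applies a Rocchio-style relevance feedback update to concept weights. The update rule, data shapes, and numbers are assumptions for illustration, not the system's actual interface logic.

```python
import numpy as np

rng = np.random.default_rng(1)
responses = rng.random((500, 10))   # hypothetical detector scores per shot
weights = np.zeros(10)
weights[[2, 5]] = 1.0               # concepts "highlighted" by the user

def rerank(weights, relevant_ids=(), irrelevant_ids=(), alpha=0.5):
    """Rocchio-style update: pull the concept weights toward shots the user
    marked relevant, away from shots marked irrelevant, then rescore."""
    w = weights.copy()
    if relevant_ids:
        w += alpha * responses[list(relevant_ids)].mean(axis=0)
    if irrelevant_ids:
        w -= alpha * responses[list(irrelevant_ids)].mean(axis=0)
    return np.argsort(responses @ w)[::-1]

print(rerank(weights, relevant_ids=[3, 42], irrelevant_ids=[7])[:5])
```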
International Journal of Multimedia Information Retrieval | 2016
Maaike de Boer; Klamer Schutte; Hao Zhang; Yi-Jie Lu; Chong-Wah Ngo; Wessel Kraaij
One of the challenges in multimedia event retrieval is the integration of data from multiple modalities. A modality is defined as a single channel of sensory input, such as visual or audio; we also refer to this as a data source. Previous research has shown that integrating different data sources can improve performance over using a single source, but clear insight into the success factors of alternative fusion methods is still lacking. We introduce several new blind late fusion methods based on inversions and ratios of state-of-the-art blind fusion methods, and compare performance in both simulations and the international multimedia event retrieval benchmark TRECVID MED. The results show that five of the proposed methods outperform the state-of-the-art methods when sufficient training examples are available (100 examples). The novel fusion method named JRER is not only the best method with dependent data sources but is also robust in all simulations with sufficient training examples.
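The sketch below illustrates the general idea of blind late fusion: combining normalized per-modality scores via products and ratios. The exact inversions and ratio-based formulas in the paper, including JRER, are not reproduced here, so this particular combination rule is only an assumed example.

```python
import numpy as np

def minmax(s):
    """Normalize scores to [0, 1] so that modalities are comparable."""
    return (s - s.min()) / (s.max() - s.min() + 1e-9)

def fuse(a, b):
    """Combine two modalities: a joint (product) score damped by an
    agreement ratio, so shots the sources disagree on are pushed down."""
    a, b = minmax(a), minmax(b)
    joint = a * b
    agreement = np.minimum(a, b) / (np.maximum(a, b) + 1e-9)
    return joint * agreement

visual = np.random.default_rng(2).random(8)   # scores from a visual source
audio = np.random.default_rng(3).random(8)    # scores from an audio source
print(fuse(visual, audio))
```

Such fusion is "blind" in the sense that it uses no training data to weight the sources; only the score distributions themselves decide the combination.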
ACM Transactions on Multimedia Computing, Communications, and Applications | 2017
Maaike de Boer; Yi-Jie Lu; Hao Zhang; Klamer Schutte; Chong-Wah Ngo; Wessel Kraaij
Searching digital video data for high-level events, such as a parade or a car accident, is challenging when the query is textual and lacks visual example images or videos. Current research in deep neural networks is highly beneficial for the retrieval of high-level events using visual examples, but without examples it is still hard to (1) determine which concepts are useful to pre-train (the Vocabulary challenge) and (2) select which pre-trained concept detectors are relevant for a certain unseen high-level event (the Concept Selection challenge). In this article, we present our Semantic Event Retrieval System, which (1) shows the importance of high-level concepts in a vocabulary for the retrieval of complex and generic high-level events and (2) uses a novel concept selection method (i-w2v) based on semantic embeddings. Our experiments on the international TRECVID Multimedia Event Detection benchmark show that a diverse vocabulary including high-level concepts improves performance on the retrieval of high-level events in videos, and that our novel method outperforms a knowledge-based concept selection method.
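In the spirit of i-w2v, the following sketch greedily adds concepts as long as the summed concept vector keeps moving closer to the query embedding. The paper's exact procedure and its trained word2vec model are not reproduced here; the concept names and all vectors are toy data.

```python
import numpy as np

rng = np.random.default_rng(4)
# Toy embeddings standing in for word2vec vectors of concept names.
concepts = {f"concept_{i}": rng.normal(size=50) for i in range(100)}

def cosine(u, v):
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

def incremental_select(query_vec):
    """Greedily add the concept that most improves the cosine similarity
    between the summed concept vector and the query; stop as soon as no
    remaining concept improves it."""
    chosen, acc, best = [], np.zeros(50), -1.0
    remaining = dict(concepts)
    while remaining:
        name, vec = max(remaining.items(),
                        key=lambda kv: cosine(acc + kv[1], query_vec))
        if cosine(acc + vec, query_vec) <= best:
            break
        best, acc = cosine(acc + vec, query_vec), acc + vec
        chosen.append(name)
        del remaining[name]
    return chosen

print(incremental_select(rng.normal(size=50)))
```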
conference on multimedia modeling | 2018
Phuong Anh Nguyen; Yi-Jie Lu; Hao Zhang; Chong-Wah Ngo
The VIREO Known-Item Search (KIS) system joined the Video Browser Showdown (VBS) [1] evaluation benchmark for the first time in 2017. Building on the experience gained, this paper presents the second version of VIREO KIS. For color-sketch-based retrieval, we propose a simple grid-based approach to color queries. This method aggregates the color distributions of video frames into a shot representation and generates pre-computed rank lists for all available queries, which reduces computational cost and supports a recommendation module. For concept-based retrieval, VIREO KIS 2017 adapts our multimedia event detection system from TRECVID 2015. This year, the concept bank of VIREO KIS has been upgraded to 14K concepts, and an adaptive concept selection, combination, and expansion mechanism has been developed to assist the user in picking the right concepts and logically combining them into more expressive queries. In addition, metadata is included for textual queries, and parts of the interface have been redesigned to give the user a more flexible view of the results.
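A minimal sketch of a grid-based color representation follows, under assumed parameters (a 4x4 spatial grid with 8 hue bins); the actual system additionally precomputes rank lists for all possible color-cell queries rather than scoring at query time.

```python
import numpy as np

GRID, BINS = 4, 8   # assumed: 4x4 spatial grid, 8 hue bins

def frame_grid_hist(frame_hue):
    """frame_hue: (H, W) hue channel in [0, 1). Returns a (GRID, GRID, BINS)
    array of per-cell hue histograms."""
    H, W = frame_hue.shape
    hist = np.zeros((GRID, GRID, BINS))
    for gy in range(GRID):
        for gx in range(GRID):
            cell = frame_hue[gy * H // GRID:(gy + 1) * H // GRID,
                             gx * W // GRID:(gx + 1) * W // GRID]
            hist[gy, gx] = np.histogram(cell, bins=BINS, range=(0, 1),
                                        density=True)[0]
    return hist

def shot_representation(frames):
    """Aggregate per-frame grids into one shot descriptor by averaging."""
    return np.mean([frame_grid_hist(f) for f in frames], axis=0)

def score(shot_repr, query):
    """query: (row, col, hue_bin) triples the user painted; higher is better."""
    return sum(shot_repr[r, c, b] for r, c, b in query)

rng = np.random.default_rng(5)
shots = [shot_representation(rng.random((3, 64, 64))) for _ in range(10)]
query = [(0, 0, 2), (3, 3, 5)]   # two painted cells with chosen hue bins
print(sorted(range(10), key=lambda i: -score(shots[i], query)))
```

Averaging per-frame grids is the simplest way to turn frame-level color into a shot-level descriptor; because the query vocabulary (cell, hue-bin pairs) is finite, rank lists for every query can be computed offline, as the abstract notes.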
VIREO-TNO @ TRECVID 2015, 1-12 | 2015
Hong-Jiang Zhang; Yi-Jie Lu; M.H.T. de Boer; F.B. ter Haar; Klamer Schutte; Wessel Kraaij; Chong-Wah Ngo
VIREO-TNO @ TRECVID 2014, 1-12 | 2014
Chong-Wah Ngo; Yi-Jie Lu; Hong-Jiang Zhang; Chun-Chet Tan; Lei Pang; M.H.T. de Boer; John G. M. Schavemaker; Klamer Schutte; Wessel Kraaij
Archive | 2014
Chong-Wah Ngo; Yi-Jie Lu; Hao Zhang; Chun-Chet Tan; Lei Pang; Maaike de Boer; John G. M. Schavemaker; Klamer Schutte; Wessel Kraaij
Archive | 2014
Wei Zhang; Hao Zhang; Yi-Jie Lu; Jingjing Chen; Chong-Wah Ngo