Publication


Featured research published by Jinlin Guo.


Content-Based Multimedia Indexing (CBMI) | 2012

Detecting complex events in user-generated video using concept classifiers

Jinlin Guo; David Scott; Frank Hopfgartner; Cathal Gurrin

Automatic detection of complex events in user-generated video (UGV) is a challenging task because UGV differs in character from broadcast video. In this work, we first summarize the distinguishing characteristics of UGV and then explore how concept classifiers can be used to recognize complex events in UGV content. The method starts by manually selecting a variety of relevant concepts and constructing classifiers for them. Complex event detectors are then learned using the concatenated probabilistic scores of these concept classifiers as features. We also compare three fusion operations over the probabilistic scores: Maximum, Average, and Minimum fusion. Experimental results suggest that our method provides promising results, and that Maximum fusion tends to give the best performance for most complex events.
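As a rough illustration of the score-fusion step, the sketch below (not the authors' code; the array shapes and function API are assumptions) fuses per-segment concept probabilities into one event-level feature using the three operations the abstract compares.

```python
# Minimal sketch of Maximum/Average/Minimum fusion of concept-classifier
# scores. Shapes and the downstream use of the fused vector are assumed.
import numpy as np

def fuse_concept_scores(scores, mode="max"):
    """Fuse probabilistic concept scores across video segments.

    scores: (n_segments, n_concepts) array of per-segment concept
            probabilities for one video.
    Returns a single (n_concepts,) feature vector for the video.
    """
    if mode == "max":   # keep the strongest evidence per concept
        return scores.max(axis=0)
    if mode == "avg":   # average evidence over the whole clip
        return scores.mean(axis=0)
    if mode == "min":   # most conservative evidence per concept
        return scores.min(axis=0)
    raise ValueError(f"unknown fusion mode: {mode}")

# Hypothetical example: 10 segments scored against 50 concepts.
rng = np.random.default_rng(0)
segment_scores = rng.random((10, 50))
event_feature = fuse_concept_scores(segment_scores, mode="max")
print(event_feature.shape)  # (50,) feature for the event detector
```

The fused vector would then serve as input to a per-event classifier, matching the paper's finding that Maximum fusion tends to work best.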


Conference on Multimedia Modeling (MMM) | 2013

Evaluating novice and expert users on handheld video retrieval systems

David Scott; Frank Hopfgartner; Jinlin Guo; Cathal Gurrin

Content-based video retrieval systems have been widely associated with desktop environments; they are largely complex in nature, target expert users, and often require complex queries. Due to this complexity, interaction with these systems can be a challenge for regular "novice" users. In recent years, a shift can be observed from the traditional desktop environment to handheld devices, which require a different approach to user interaction. In this paper, we evaluate the performance of a handheld content-based video retrieval system with both expert and novice users. We show that on this type of device, a simple and intuitive interface that incorporates the principles of content-based systems, though hidden from the user, attains the same accuracy for both novice and expert users when faced with complex information retrieval tasks. We describe an experiment, using the Apple iPad as the handheld medium, in which groups of expert and novice users run the interactive experiments from the 2010 TRECVid Known-Item Search task. The results indicate that a carefully designed interface can equalise the performance of novice and expert users.


Proceedings of the 2012 ACM International Workshop on Audio and Multimedia Methods for Large-Scale Video Analysis | 2012

Short user-generated videos classification using accompanied audio categories

Jinlin Guo; Cathal Gurrin

This paper investigates the classification of short user-generated videos (UGVs) using their accompanying audio data, since short UGVs account for a great proportion of Internet UGVs and many are accompanied by single-category soundtracks. We define seven types of UGV corresponding to seven audio categories. We also investigate three modeling approaches for audio feature representation: single Gaussian (1G), Gaussian mixture model (GMM), and Bag-of-Audio-Words (BoAW). Using a Support Vector Machine (SVM) with a distance measurement matched to each of the three feature representations, classifiers are trained to categorize the UGVs. Evaluation results show that these approaches are effective for categorizing short UGVs based on their audio track. A GMM representation with the approximated Bhattacharyya distance (ABD) produces the best performance, and the BoAW representation with a chi-square kernel reports comparable results.
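To give a feel for the Gaussian-modeling idea, here is a minimal sketch of the simpler 1G variant: each clip's audio frames (e.g. MFCCs) are summarized by a single diagonal Gaussian and two clips are compared with the closed-form Bhattacharyya distance. This is an assumption-laden stand-in; the paper's best variant uses full GMMs with an approximated Bhattacharyya distance, which has no closed form.

```python
# Sketch of the "1G" audio representation with Bhattacharyya distance.
# Feature dimensions and frame counts below are hypothetical.
import numpy as np

def fit_diag_gaussian(frames):
    """frames: (n_frames, dim) audio features -> (mean, variance)."""
    return frames.mean(axis=0), frames.var(axis=0) + 1e-8

def bhattacharyya_diag(m1, v1, m2, v2):
    """Closed-form Bhattacharyya distance between diagonal Gaussians."""
    v = 0.5 * (v1 + v2)  # averaged covariance
    term_mean = 0.125 * np.sum((m1 - m2) ** 2 / v)
    term_cov = 0.5 * np.sum(np.log(v / np.sqrt(v1 * v2)))
    return term_mean + term_cov

# Hypothetical clips: 500 and 450 frames of 13-dim MFCC features.
rng = np.random.default_rng(1)
clip_a = rng.normal(0.0, 1.0, (500, 13))
clip_b = rng.normal(0.5, 1.2, (450, 13))
d = bhattacharyya_diag(*fit_diag_gaussian(clip_a), *fit_diag_gaussian(clip_b))
print(f"Bhattacharyya distance: {d:.3f}")
```

Such a distance could feed a distance-based SVM kernel, in the spirit of the ABD measurement the paper pairs with its GMM representation.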


Conference on Multimedia Modeling (MMM) | 2011

Localization and recognition of the scoreboard in sports video based on SIFT point matching

Jinlin Guo; Cathal Gurrin; Songyang Lao; Colum Foley; Alan F. Smeaton

In broadcast sports video, the scoreboard is overlaid at a fixed location and generally appears in all frames to help viewers follow the match's progression. Based on these observations, we present a new localization and recognition method for scoreboard text in sports video. The method first matches Scale Invariant Feature Transform (SIFT) points between two frames extracted from a video clip using a modified matching technique, then localizes the scoreboard by computing a robust estimate of the matched point cloud in a two-stage non-scoreboard filtering process based on domain rules. Enhancement operations are then performed on the localized scoreboard and a multi-frame voting decision is applied, both aimed at increasing the OCR rate. Experimental results demonstrate the effectiveness and efficiency of the proposed method.
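A minimal sketch of the SIFT-matching stage only, assuming OpenCV 4.4+ (where SIFT is in the main package); the paper's modified matching, two-stage domain-rule filtering, multi-frame voting, and OCR steps are omitted, and the pixel-displacement threshold below is a guess, with the median of the matched point cloud standing in for their robust location estimate.

```python
# Match SIFT keypoints between two frames and estimate the location of
# content that stayed put (e.g. a scoreboard overlay).
import cv2
import numpy as np

def locate_static_region(frame1, frame2, ratio=0.75, max_shift=2.0):
    sift = cv2.SIFT_create()
    kp1, des1 = sift.detectAndCompute(frame1, None)
    kp2, des2 = sift.detectAndCompute(frame2, None)

    matcher = cv2.BFMatcher(cv2.NORM_L2)
    matches = matcher.knnMatch(des1, des2, k=2)

    # Lowe's ratio test, then keep matches whose keypoints barely moved:
    # the scoreboard is fixed while the play behind it changes.
    static_pts = []
    for pair in matches:
        if len(pair) < 2:
            continue
        m, n = pair
        if m.distance < ratio * n.distance:
            p1 = np.array(kp1[m.queryIdx].pt)
            p2 = np.array(kp2[m.trainIdx].pt)
            if np.linalg.norm(p1 - p2) < max_shift:  # pixels
                static_pts.append(p1)

    # Median is a robust estimate of the static point cloud's center.
    return np.median(np.array(static_pts), axis=0) if static_pts else None
```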


Content-Based Multimedia Indexing (CBMI) | 2013

Exploring the optimal visual vocabulary sizes for semantic concept detection

Jinlin Guo; Zhengwei Qiu; Cathal Gurrin

The framework based on Bag-of-Visual-Words (BoVW) feature representation and SVM classification is widely used for generic content-based concept detection and visual categorization. However, the visual vocabulary (VV) size, an important factor in this framework, has been chosen differently and often arbitrarily in previous work. In this paper, we investigate the optimal VV sizes in relation to the other components of the framework that also govern performance; a sensible default VV size is useful for reducing computation cost. Using unsupervised clustering, a series of VVs covering a wide range of sizes are evaluated with two popular local features, three assignment modes, and four kernels on two benchmarking datasets of different scales; these factors are evaluated as well. Experimental results show that the best VV size varies as these factors change. However, concept detection performance usually improves as the VV size increases initially, then gains less, or even deteriorates with larger VVs as overfitting occurs. Overall, VVs with sizes from 1024 to 4096 achieve the best performance with higher probability than other sizes. With regard to the other factors, the OpponentSIFT descriptor outperforms the SURF feature, and soft assignment yields better performance than binary and hard assignment. In addition, generalized RBF kernels such as the chi-square and Laplace RBF kernels are more appropriate for semantic concept detection with SVM classification.
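The following sketch shows the basic BoVW pipeline the paper varies: cluster local descriptors into a visual vocabulary with k-means, hard-assign each descriptor to its nearest word, and train an SVM with a chi-square kernel. It is illustrative rather than the paper's pipeline (the data, vocabulary size, and hard assignment are placeholder choices).

```python
# BoVW pipeline sketch: k-means vocabulary -> word histograms -> chi2 SVM.
import numpy as np
from sklearn.cluster import MiniBatchKMeans
from sklearn.metrics.pairwise import chi2_kernel
from sklearn.svm import SVC

def bovw_histogram(descriptors, vocab):
    """Hard-assign descriptors to visual words; L1-normalized histogram."""
    words = vocab.predict(descriptors)
    hist = np.bincount(words, minlength=vocab.n_clusters).astype(float)
    return hist / max(hist.sum(), 1.0)

# Hypothetical data: 100 images, each with 200 local descriptors of dim 128.
rng = np.random.default_rng(2)
all_desc = [rng.random((200, 128)) for _ in range(100)]
labels = rng.integers(0, 2, 100)

k = 1024  # vocabulary size, the factor under study (1024-4096 worked best)
vocab = MiniBatchKMeans(n_clusters=k).fit(np.vstack(all_desc))
X = np.array([bovw_histogram(d, vocab) for d in all_desc])

# Precomputed chi-square kernel, one of the better-performing kernels found.
clf = SVC(kernel="precomputed").fit(chi2_kernel(X, X), labels)
```

Soft assignment, which the paper found superior to hard assignment, would replace the `bincount` step with weighted votes over several nearby words.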


Conference on Multimedia Modeling (MMM) | 2013

DCU at MMM 2013 Video Browser Showdown

David Scott; Jinlin Guo; Cathal Gurrin; Frank Hopfgartner; Kevin McGuinness; Noel E. O'Connor; Alan F. Smeaton; Yang Yang; Zhenxing Zhang

This paper describes a handheld video browser that incorporates shot boundary detection, key frame extraction, semantic content analysis, key frame browsing, and similarity search.


Multimedia Tools and Applications | 2018

Irrelevance reduction with locality-sensitive hash learning for efficient cross-media retrieval

Yuhua Jia; Liang Bai; Peng Wang; Jinlin Guo; Yuxiang Xie; Tianyuan Yu

Cross-media retrieval is an important approach to handling the explosive growth of multimodal data on the web. However, existing approaches to cross-media retrieval are computationally expensive due to high dimensionality. To retrieve efficiently from multimodal data, it is essential to reduce the proportion of irrelevant documents considered. In this paper, we propose a fast cross-media retrieval approach (FCMR) based on locality-sensitive hashing (LSH) and neural networks. One modality of the multimodal information is projected by the LSH algorithm so that similar objects fall into the same hash bucket and dissimilar objects into different ones; the other modality is then mapped into these hash buckets using hash functions learned by neural networks. Given a textual or visual query, it can be efficiently mapped to a hash bucket whose stored objects are likely near neighbors of the query. Experimental results show that the proportion of relevant documents among the queries' near neighbors obtained by the proposed method is much improved, indicating that retrieval based on near neighbors can be conducted effectively. Further evaluations on two public datasets demonstrate the efficacy of the proposed method compared to the baselines.
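A minimal sketch of the LSH bucketing idea alone (the learned neural-network hash functions for the second modality are omitted, and random-hyperplane hashing is an assumed concrete choice): similar vectors land in the same bucket, so a query is compared only against its bucket rather than the whole collection.

```python
# Random-hyperplane LSH: one sign bit per hyperplane forms the bucket key.
import numpy as np
from collections import defaultdict

class HyperplaneLSH:
    def __init__(self, dim, n_bits=16, seed=0):
        rng = np.random.default_rng(seed)
        self.planes = rng.normal(size=(n_bits, dim))  # random hyperplanes
        self.buckets = defaultdict(list)

    def _key(self, v):
        # Which side of each hyperplane the vector falls on.
        return ((self.planes @ v) > 0).tobytes()

    def index(self, doc_id, v):
        self.buckets[self._key(v)].append(doc_id)

    def query(self, v):
        # Candidate near neighbors share the query's bucket.
        return self.buckets.get(self._key(v), [])

# Hypothetical usage: index 1000 image features, probe with a new query.
rng = np.random.default_rng(3)
lsh = HyperplaneLSH(dim=64)
for i in range(1000):
    lsh.index(i, rng.normal(size=64))
print(lsh.query(rng.normal(size=64)))  # ids of bucket-mates, if any
```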


Conference on Multimedia Modeling (MMM) | 2013

Quality Assessment of User-Generated Video Using Camera Motion

Jinlin Guo; Cathal Gurrin; Frank Hopfgartner; Zhenxing Zhang; Songyang Lao

With user-generated video (UGV) becoming so popular on the Web, a reliable quality assessment (QA) measure for UGV is necessary for improving users' quality of experience in video-based applications. In this paper, we explore QA of UGV based on how much irregular camera motion it contains, in a low-cost manner. A block-match-based optical flow approach is employed to extract camera motion features in UGV, from which irregular camera motion is calculated and automatic QA scores are derived. Using a set of UGV clips from benchmarking datasets as a showcase, we observe that QA scores from the proposed automatic method fit well with those from a subjective method, and that the automatic method performs much better than a random run. This confirms that the automatic QA scores satisfactorily indicate UGV quality when only camera motion is considered, and shows that UGV quality can be assessed automatically to improve end users' quality of experience in video-based applications.
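As a rough sketch of motion-based QA (not the paper's method): the paper uses block-match optical flow, but here OpenCV's Farneback dense flow stands in, and the irregularity score, the variability of successive global-motion estimates, is an assumed proxy for jerky camera work rather than the paper's measure.

```python
# Estimate global camera motion per frame pair, then score how erratically
# it changes over time. High scores suggest shaky, low-quality footage.
import cv2
import numpy as np

def camera_motion_irregularity(video_path):
    cap = cv2.VideoCapture(video_path)
    ok, prev = cap.read()
    if not ok:
        raise ValueError(f"cannot read {video_path}")
    prev = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)
    global_motion = []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        flow = cv2.calcOpticalFlowFarneback(prev, gray, None,
                                            0.5, 3, 15, 3, 5, 1.2, 0)
        # Mean flow over all pixels ~ global camera pan/tilt for this pair.
        global_motion.append(flow.reshape(-1, 2).mean(axis=0))
        prev = gray
    cap.release()
    motion = np.array(global_motion)
    # Spread of frame-to-frame motion changes: large = jerky, irregular.
    return np.linalg.norm(np.diff(motion, axis=0), axis=1).std()
```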


Conference on Multimedia Modeling (MMM) | 2012

Clipboard: a visual search and browsing engine for tablet and PC

David Scott; Jinlin Guo; Hongyi Wang; Yang Yang; Frank Hopfgartner; Cathal Gurrin

In this work, we present a handheld video browser that utilizes two methods of search: Concept Search and Keyframe Similarity. Concept Search allows a user to define a query using selected visual concepts and presents the user with clusters of video segments based on image features extracted with OpponentSIFT. Keyframe Similarity depends on the previous search for its input criteria, allowing a user to select a keyframe for similarity search and returning three types of results: local keyframes from the current scene, globally similar shots based on visual features, and textually similar shots based on frequently occurring words in ASR transcripts.


International Conference on Multimedia Retrieval (ICMR) | 2013

Who produced this video, amateur or professional?

Jinlin Guo; Cathal Gurrin; Songyang Lao

With the increasing affordability of capturing and storing video and the proliferation of Web 2.0 applications, video content is no longer necessarily created and supplied by a limited number of professional producers; any amateur can produce and publish a video quickly. As a result, the amount of both professionally produced and amateur-produced video on the web is ever increasing. In this work, we pose a question: can an Internet video clip be automatically classified as either professionally produced or amateur-produced? We investigate features and classification methods to answer it. Based on the differences in the production processes of the two video categories, four features (camera motion, structure, audio, and a combined feature) are adopted and studied along with four popular classifiers: KNN, SVM, GMM, and C4.5. Extensive experiments over representative datasets evaluate these features and classifiers under different settings and compare them to existing techniques. Experimental results demonstrate that SVMs with multimodal features from multiple sources are most effective at classifying video type. Finally, the results show that automatically classifying a clip as professionally produced or amateur-produced can be achieved with good accuracy.
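A minimal sketch of the combined-feature setup the abstract found most effective, under assumed feature shapes (all dimensions and data below are placeholders, not the paper's): per-clip camera-motion, structure, and audio features are concatenated into one multimodal vector and fed to an SVM.

```python
# Concatenate multimodal per-clip features and train an SVM classifier
# to separate professional from amateur video.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Hypothetical per-clip features; the dimensions are placeholders.
rng = np.random.default_rng(4)
n_clips = 200
camera_motion = rng.random((n_clips, 16))  # e.g. pan/tilt/zoom statistics
structure = rng.random((n_clips, 8))       # e.g. shot-length statistics
audio = rng.random((n_clips, 13))          # e.g. mean MFCCs

X = np.hstack([camera_motion, structure, audio])  # combined feature
y = rng.integers(0, 2, n_clips)                   # 0 = amateur, 1 = professional

# Scaling matters when concatenating features with different ranges.
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf")).fit(X, y)
print(clf.predict(X[:5]))
```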

Collaboration


Dive into Jinlin Guo's collaborations.

Top Co-Authors

David Scott (Dublin City University)
Colum Foley (Dublin City University)
Songyang Lao (National University of Defense Technology)
Yang Yang (Dublin City University)
Liang Bai (National University of Defense Technology)
Yuxiang Xie (National University of Defense Technology)