Heng Tao Shen

University of Electronic Science and Technology of China

Publication

Featured research published by Heng Tao Shen.

computer vision and pattern recognition | 2015

Supervised Discrete Hashing

Fumin Shen; Chunhua Shen; Wei Liu; Heng Tao Shen

Recently, learning-based hashing techniques have attracted broad research interest because they can support efficient storage and retrieval for high-dimensional data such as images, videos, documents, etc. However, a major difficulty of learning to hash lies in handling the discrete constraints imposed on the pursued hash codes, which typically make hash optimizations very challenging (NP-hard in general). In this work, we propose a new supervised hashing framework, where the learning objective is to generate the optimal binary hash codes for linear classification. By introducing an auxiliary variable, we reformulate the objective such that it can be solved much more efficiently by employing a regularization algorithm. One of the key steps in this algorithm is to solve a regularization sub-problem associated with the NP-hard binary optimization. We show that the sub-problem admits an analytical solution via cyclic coordinate descent. As such, a high-quality discrete solution can eventually be obtained efficiently, which makes it possible to tackle massive datasets. We evaluate the proposed approach, dubbed Supervised Discrete Hashing (SDH), on four large image datasets and demonstrate its superiority to the state-of-the-art hashing methods in large-scale image retrieval.

very large data bases | 2008

Discovery of convoys in trajectory databases

Hoyoung Jeung; Man Lung Yiu; Xiaofang Zhou; Christian S. Jensen; Heng Tao Shen

As mobile devices with positioning capabilities continue to proliferate, data management for so-called trajectory databases that capture the historical movements of populations of moving objects becomes important. This paper considers the querying of such databases for convoys, a convoy being a group of objects that have traveled together for some time. More specifically, this paper formalizes the concept of a convoy query using density-based notions, in order to capture groups of arbitrary extents and shapes. Convoy discovery is relevant for real-life applications in throughput planning of trucks and carpooling of vehicles. Although there has been extensive research on trajectories in the literature, none of it can be applied to correctly retrieve exact convoy result sets. Motivated by this, we develop three efficient algorithms for convoy discovery that adopt the well-known filter-refinement framework. In the filter step, we apply line-simplification techniques on the trajectories and establish distance bounds between the simplified trajectories. This permits efficient convoy discovery over the simplified trajectories without missing any actual convoys. In the refinement step, the candidate convoys are further processed to obtain the actual convoys. Our comprehensive empirical study offers insight into the properties of the paper's proposals and demonstrates that the proposals are effective and efficient on real-world trajectory data.
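The filter step above relies on line simplification. As a rough illustration only, the sketch below implements Douglas-Peucker, one well-known line-simplification technique; the abstract does not say which technique the paper uses, and the function names, the `tolerance` parameter, and the sample points are hypothetical. The sketch also ignores the time dimension of a trajectory and does not compute the distance bounds the paper establishes.

```python
import math


def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the segment a-b (all 2D tuples)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        # Degenerate segment: a and b coincide.
        return math.hypot(px - ax, py - ay)
    # Projection parameter of p onto the line, clamped to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def simplify(points, tolerance):
    """Douglas-Peucker: drop points closer than `tolerance` to the simplified line."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    # Find the interior point farthest from the segment first-last.
    idx, dmax = 0, 0.0
    for i in range(1, len(points) - 1):
        d = point_segment_distance(points[i], first, last)
        if d > dmax:
            idx, dmax = i, d
    if dmax <= tolerance:
        return [first, last]
    # Keep the farthest point and recurse on both halves.
    left = simplify(points[: idx + 1], tolerance)
    right = simplify(points[idx:], tolerance)
    return left[:-1] + right


# Hypothetical trajectory: the small wiggle at (1, 0.05) is removed.
trajectory = [(0, 0), (1, 0.05), (2, 0), (3, 2), (4, 0)]
simplified = simplify(trajectory, tolerance=0.1)
```

A simplified trajectory has fewer points, so a candidate search over it is cheaper; the refinement step described in the abstract would then check candidates against the original trajectories.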

international joint conference on artificial intelligence | 2011

l2,1-norm regularized discriminative feature selection for unsupervised learning

Yi Yang; Heng Tao Shen; Zhigang Ma; Zi Huang; Xiaofang Zhou

Compared with supervised learning for feature selection, it is much more difficult to select the discriminative features in unsupervised learning due to the lack of label information. Traditional unsupervised feature selection algorithms usually select the features which best preserve the data distribution, e.g., manifold structure, of the whole feature set. Under the assumption that the class label of input data can be predicted by a linear classifier, we incorporate discriminative analysis and l2,1-norm minimization into a joint framework for unsupervised feature selection. Different from existing unsupervised feature selection algorithms, our algorithm selects the most discriminative feature subset from the whole feature set in batch mode. Extensive experiments on different data types demonstrate the effectiveness of our algorithm.
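For readers unfamiliar with the l2,1-norm mentioned above, its standard definition is given below. The symbols are assumptions made for illustration and do not come from the abstract: W is a d-by-c weight matrix with one row per feature and one column per class.

```latex
\|W\|_{2,1} \;=\; \sum_{i=1}^{d} \sqrt{\sum_{j=1}^{c} W_{ij}^{2}}
\;=\; \sum_{i=1}^{d} \|w^{i}\|_{2}
```

Here w^i denotes the i-th row of W. Because the norm sums the l2-norms of the rows, minimizing it tends to drive entire rows toward zero, so the features whose rows keep a large norm can be read as the selected ones.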

international conference on data engineering | 2008

A Hybrid Prediction Model for Moving Objects

Hoyoung Jeung; Qing Liu; Heng Tao Shen; Xiaofang Zhou

Existing prediction methods in moving objects databases cannot forecast locations accurately if the query time is far away from the current time. Even for near future prediction, most techniques assume that an object's movements can be represented by mathematical motion functions based on its recent movements. However, an object's movements are more complicated than what the mathematical formulas can represent. Prediction based on an object's trajectory patterns is a powerful approach and has been investigated in several works. However, their main interest is how to discover the patterns. In this paper, we present a novel prediction approach, namely The Hybrid Prediction Model, which estimates an object's future locations based on its pattern information as well as existing motion functions using the object's recent movements. Specifically, an object's trajectory patterns which have ad-hoc forms for prediction are discovered and then indexed by a novel access method for efficient query processing. In addition, two query processing techniques that can provide accurate results for both near and distant time predictive queries are presented. Our extensive experiments demonstrate that the proposed techniques are more accurate and efficient than existing forecasting schemes.

acm multimedia | 2011

Multiple feature hashing for real-time large scale near-duplicate video retrieval

Jingkuan Song; Yi Yang; Zi Huang; Heng Tao Shen

Near-duplicate video retrieval (NDVR) has recently attracted much research attention due to the exponential growth of online videos. It helps in many areas, such as copyright protection, video tagging, online video usage monitoring, etc. Most existing approaches use only a single feature to represent a video for NDVR. However, a single feature is often insufficient to characterize the video content. Besides, while accuracy is the main concern in the previous literature, the scalability of NDVR algorithms for large scale video datasets has rarely been addressed. In this paper, we present a novel approach, Multiple Feature Hashing (MFH), to tackle both the accuracy and the scalability issues of NDVR. MFH preserves the local structure information of each individual feature and also globally considers the local structures for all the features to learn a group of hash functions which map the video keyframes into the Hamming space and generate a series of binary codes to represent the video dataset. We evaluate our approach on a public video dataset and a large scale video dataset consisting of 132,647 videos, which we collected from YouTube. The experimental results show that the proposed method outperforms the state-of-the-art techniques in both accuracy and efficiency.
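Once keyframes are mapped to binary codes in the Hamming space, retrieval reduces to comparing bit strings. The sketch below illustrates only that retrieval stage, not how MFH learns its hash functions; the 8-bit codes, the function names, and the parameter `k` are hypothetical.

```python
def hamming(a: int, b: int) -> int:
    """Number of differing bits between two binary codes stored as ints."""
    return bin(a ^ b).count("1")


def rank_by_hamming(query: int, database: list, k: int) -> list:
    """Indices of the k database codes closest to the query (ties by index)."""
    order = sorted(range(len(database)),
                   key=lambda i: (hamming(query, database[i]), i))
    return order[:k]


# Hypothetical 8-bit codes standing in for hashed keyframes.
database = [0b10110010, 0b10110011, 0b01001101, 0b10100010]
query = 0b10110000
top3 = rank_by_hamming(query, database, k=3)
```

XOR followed by a bit count is cheap, which is one reason binary codes suit large-scale retrieval; a linear scan is shown here for clarity, and a production system would likely use an index instead.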

international conference on data engineering | 2007

Multi-source Skyline Query Processing in Road Networks

Ke Deng; Xiaofang Zhou; Heng Tao Shen

Skyline query processing has been investigated extensively in recent years, mostly for only one query reference point. An example of a single-source skyline query is to find hotels which are cheap and close to the beach (an absolute query), or close to a user-given location (a relative query). A multi-source skyline query considers several query points at the same time (e.g., to find hotels which are cheap and close to the University, the Botanic Garden and the China Town). In this paper, we consider the problem of efficient multi-source skyline query processing in road networks. It is not only the first effort to consider multi-source skyline queries in road networks but also the first effort to process relative skyline queries where the network distance between two locations needs to be computed on-the-fly. Three different query processing algorithms are proposed and evaluated in this paper. The Lower Bound Constraint algorithm (LBC) is proven to be an instance optimal algorithm. Extensive experiments using large real road network datasets demonstrate that LBC is four times more efficient than a straightforward algorithm.
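To make the notion of a skyline concrete, here is a minimal sketch of a naive skyline computation over hotels described by a price and a distance to each of two query points. It is not the LBC algorithm or any of the paper's three algorithms: the distances are hypothetical precomputed numbers rather than network distances computed on-the-fly, and the function names are illustrative.

```python
def dominates(a, b):
    """True if a is no worse than b in every dimension and strictly better
    in at least one (smaller values are better)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def skyline(points):
    """Naive quadratic skyline: keep the points no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical hotels: (price, distance to university, distance to garden).
hotels = [
    (50, 2.0, 3.0),
    (80, 0.5, 1.0),
    (60, 2.5, 3.5),   # dominated by (50, 2.0, 3.0)
    (120, 0.4, 2.0),
    (90, 1.0, 1.5),   # dominated by (80, 0.5, 1.0)
]
result = skyline(hotels)
```

Each extra query point adds one distance dimension to the dominance test, which is why the cost of computing network distances matters so much in the multi-source setting.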

international conference on management of data | 2010

Searching trajectories by locations: an efficiency study

Zaiben Chen; Heng Tao Shen; Xiaofang Zhou; Yu Zheng; Xing Xie

Trajectory search has long been an attractive and challenging topic that has given rise to various interesting applications in spatial-temporal databases. In this work, we study a new problem of searching trajectories by locations, in which context the query is only a small set of locations with or without an order specified, while the target is to find the k Best-Connected Trajectories (k-BCT) from a database such that the k-BCT best connect the designated locations geographically. Different from the conventional trajectory search that looks for similar trajectories w.r.t. shape or other criteria by using a sample query trajectory, we focus on the goodness of connection provided by a trajectory to the specified query locations. This new query can benefit users in many novel applications such as trip planning. In our work, we first define a new similarity function for measuring how well a trajectory connects the query locations, with both spatial distance and order constraint being considered. Upon the observation that the number of query locations is normally small (e.g. 10 or fewer) since it is impractical for a user to input too many locations, we analyze the feasibility of using a general-purpose spatial index to achieve efficient k-BCT search, based on a simple Incremental k-NN based Algorithm (IKNN). The IKNN effectively prunes and refines trajectories by using the devised lower bound and upper bound of similarity. Our contributions mainly lie in adapting the best-first and depth-first k-NN algorithms to the basic IKNN properly, and more importantly ensuring the efficiency in both search effort and memory usage. An in-depth study on the adaptation and its efficiency is provided. Further optimization is also presented to accelerate the IKNN algorithm. Finally, we verify the efficiency of the algorithm by extensive experiments.

Archive | 2006

Frontiers of WWW Research and Development - APWeb 2006

Xiaofang Zhou; Jianzhong Li; Heng Tao Shen; Masaru Kitsuregawa; Yanchun Zhang

Keynote Papers.- Applications Development for the Computational Grid.- Strongly Connected Dominating Sets in Wireless Sensor Networks with Unidirectional Links.- Mobile Web and Location-Based Services.- The Case of the Duplicate Documents Measurement, Search, and Science.- Regular Papers.- An Effective System for Mining Web Log.- Adapting K-Means Algorithm for Discovering Clusters in Subspaces.- Sample Sizes for Query Probing in Uncooperative Distributed Information Retrieval.- The Probability of Success of Mobile Agents When Routing in Faulty Networks.- Clustering Web Documents Based on Knowledge Granularity.- XFlat: Query Friendly Encrypted XML View Publishing.- Distributed Energy Efficient Data Gathering with Intra-cluster Coverage in Wireless Sensor Networks.- QoS-Driven Web Service Composition with Inter Service Conflicts.- An Agent-Based Approach for Cooperative Data Management.- Transforming Heterogeneous Messages Automatically in Web Service Composition.- User-Perceived Web QoS Measurement and Evaluation System.- An RDF Storage and Query Framework with Flexible Inference Strategy.- An Aspect-Oriented Approach to Declarative Access Control for Web Applications.- A Statistical Study of Todays Gnutella.- Automatically Constructing Descriptive Site Maps.- TWStream: Finding Correlated Data Streams Under Time Warping.- Supplier Categorization with K-Means Type Subspace Clustering.- Classifying Web Data in Directory Structures.- Semantic Similarity Based Ontology Cache.- In-Network Join Processing for Sensor Networks.- Transform BPEL Workflow into Hierarchical CP-Nets to Make Tool Support for Verification.- Identifying Agitators as Important Blogger Based on Analyzing Blog Threads.- Detecting Collusion Attacks in Security Protocols.- Role-Based Delegation with Negative Authorization.- Approximate Top-k Structural Similarity Search over XML Documents.- Towards Enhancing Trust on Chinese E-Commerce.- Flexible Deployment Models for Location-Aware Key Management in 
Wireless Sensor Networks.- A Diachronic Analysis of Gender-Related Web Communities Using a HITS-Based Mining Tool.- W3 Trust-Profiling Framework (W3TF) to Assess Trust and Transitivity of Trust of Web-Based Services in a Heterogeneous Web Environment.- Image Description Mining and Hierarchical Clustering on Data Records Using HR-Tree.- Personalized News Categorization Through Scalable Text Classification.- The Adaptability of English Based Web Search Algorithms to Chinese Search Engines.- A Feedback Based Framework for Semi-automic Composition of Web Services.- Fast Approximate Matching Between XML Documents and Schemata.- Mining Query Log to Assist Ontology Learning from Relational Database.- An Area-Based Collaborative Sleeping Protocol for Wireless Sensor Networks.- F@: A Framework of Group Awareness in Synchronous Distributed Groupware.- Adaptive User Profile Model and Collaborative Filtering for Personalized News.- Context Matcher: Improved Web Search Using Query Term Context in Source Document and in Search Results.- Weighted Ontology-Based Search Exploiting Semantic Similarity.- Determinants of Groupware Usability for Community Care Collaboration.- Automated Discovering of What is Hindering the Learning Performance of a Student.- Sharing Protected Web Resources Using Distributed Role-Based Modeling.- Concept Map Model for Web Ontology Exploration.- A Resource-Adaptive Transcoding Proxy Caching Strategy.- Optimizing Collaborative Filtering by Interpolating the Individual and Group Behaviors.- Extracting Semantic Relationships Between Terms from PC Documents and Its Applications to Web Search Personalization.- Detecting Implicit Dependencies Between Tasks from Event Logs.- Implementing Privacy Negotiations in E-Commerce.- A Community-Based, Agent-Driven, P2P Overlay Architecture for Personalized Web.- Providing an Uncertainty Reasoning Service for Semantic Web Application.- Indexing XML Documents Using Self Adaptive Genetic Algorithms for Better Retreival.- 
GCC: A Knowledge Management Environment for Research Centers and Universities.- Towards More Personalized Web: Extraction and Integration of Dynamic Content from the Web.- Supporting Relative Workflows with Web Services.- Text Based Knowledge Discovery with Information Flow Analysis.- Short Papers.- Study on QoS Driven Web Services Composition.- Optimizing the Data Intensive Mediator-Based Web Services Composition.- Role of Triple Space Computing in Semantic Web Services.- Modified ID-Based Threshold Decryption and Its Application to Mediated ID-Based Encryption.- Materialized View Maintenance in Peer Data Management Systems.- Cubic Analysis of Social Bookmarking for Personalized Recommendation.- MAGMS: Mobile Agent-Based Grid Monitoring System.- A Computational Trust Model for Semantic Web Based on Bayesian Decision Theory.- Efficient Dynamic Traffic Navigation with Hierarchical Aggregation Tree.- A Color Bar Based Affective Annotation Method for Media Player.- Robin: Extracting Visual and Textual Features from Web Pages.- Generalized Projected Clustering in High-Dimensional Data Streams.- An Effective Web Page Layout Adaptation for Various Resolutions.- XMine: A Methodology for Mining XML Structure.- Multiple Join Processing in Data Grid.- A Novel Architecture for Realizing Grid Workflow Using Pi-Calculus Technology.- A Chord-Based Novel Mobile Peer-to-Peer File Sharing Protocol.- Web-Based Genomic Information Integration with Gene Ontology.- Table Detection from Plain Text Using Machine Learning and Document Structure.- Efficient Mining Strategy for Frequent Serial Episodes in Temporal Database.- Efficient and Provably Secure Client-to-Client Password-Based Key Exchange Protocol.- Effective Criteria for Web Page Changes.- WordRank-Based Lexical Signatures for Finding Lost or Related Web Pages.- A Scalable Update Management Mechanism for Query Result Caching Systems at Database-Driven Web Sites.- Building Content Clusters Based on Modelling Page Pairs.- IRFCF: 
Iterative Rating Filling Collaborative Filtering Algorithm.- A Method to Select the Optimum Web Services.- A New Methodology for Information Presentations on the Web.- Integration of Single Sign-On and Role-Based Access Control Profiles for Grid Computing.- An Effective Service Discovery Model for Highly Reliable Web Services Composition in a Specific Domain.- Using Web Archive for Improving Search Engine Results.- Closed Queueing Network Model for Multi-tier Data Stream Processing Center.- Optimal Task Scheduling Algorithm for Non-preemptive Processing System.- A Multi-agent Based Grid Service Discovery Framework Using Fuzzy Petri Net and Ontology.- Modeling Identity Management Architecture Within a Social Setting.- Ontological Engineering in Data Warehousing.- Mapping Ontology Relations: An Approach Based on Best Approximations.- Building a Semantic P2P Scientific References Sharing System with JXTA.- Named Graphs as a Mechanism for Reasoning About Provenance.- Discovery of Spatiotemporal Patterns in Mobile Environment.- Visual Description Conversion for Enhancing Search Engines and Navigational Systems.- Reusing Experiences for an Effective Learning in a Web-Based Context.- Special Sessions on e-Water.- Collaboration Between China and Australia: An e-Water Workshop Report.- On Sensor Network Segmentation for Urban Water Distribution Monitoring.- Using the Shuffled Complex Evolution Global Optimization Method to Solve Groundwater Management Models.- Integrating Hydrological Data of Yellow River for Efficient Information Services.- Application and Integration of Information Technology in Water Resources Informatization.- An Empirical Study on Groupware Support for Water Resources Ontology Integration.- Ontology Mapping Approach Based on OCL.- Object Storage System for Mass Geographic Information.- The Service-Oriented Data Integration Platform for Water Resources Management.- Construction of Yellow River Digital Project Management System.- Study on the 
Construction and Application of 3D Visualization Platform for the Yellow River Basin.- Industry Papers.- A Light-Weighted Approach to Workflow View Implementation.- RSS Feed Generation from Legacy HTML Pages.- Ontology Driven Securities Data Management and Analysis.- Context Gallery: A Service-Oriented Framework to Facilitate Context Information Sharing.- A Service-Oriented Architecture Based Macroeconomic Analysis & Forecasting System.- A Web-Based Method for Building Company Name Knowledge Base.- Demo Sessions.- Healthy Waterways: Healthy Catchments - An Integrated Research/Management Program to Understand and Reduce Impacts of Sediments and Nutrients on Waterways in Queensland, Australia.- Groundwater Monitoring in China.- The Digital Yellow River Programme.- Web Services Based State of the Environment Reporting.- COEDIG: Collaborative Editor in Grid Computing.- HVEM Grid: Experiences in Constructing an Electron Microscopy Grid.- WISE: A Prototype for Ontology Driven Development of Web Information Systems.- DSEC: A Data Stream Engine Based Clinical Information System.- SESQ: A Novel System for Building Domain Specific Web Search Engines.- Digital Map: Animated Mode.- Dynamic Voice User Interface Using VoiceXML and Active Server Pages.- WebVine Suite: A Web Services Based BPMS.- Adaptive Mobile Cooperation Model Based on Context Awareness.- An Integrated Network Management System.- Ichigen-San: An Ontology-Based Information Retrieval System.- A Database Monitoring and Disaster Recovery System.- IPVita: An Intelligent Platform of Virtual Travel Agency.- LocalRank: A Prototype for Ranking Web Pages with Database Considering Geographical Locality.- Automated Content Transformation with Adjustment for Visual Presentation Related to Terminal Types.

acm multimedia | 2000

Giving meanings to WWW images

Heng Tao Shen; Beng Chin Ooi; Kian-Lee Tan

Images are increasingly being embedded in HTML documents on the WWW. Such documents essentially provide a rich image collection that users can query. Interestingly, the semantics of these images are typically described by their surrounding text. Unfortunately, most WWW image search engines fail to exploit these image semantics, which gives rise to poor recall and precision performance. In this paper, we propose a novel image representation model called Weight ChainNet. Weight ChainNet is based on lexical chains that represent the semantics of an image from its nearby text. A new formula, called list space model, for computing semantic similarities is also introduced. To further improve the retrieval effectiveness, we also propose two relevance feedback mechanisms. We conducted an extensive performance study on a collection of 5000 images obtained from documents identified by more than 2000 URLs. Our results show that our models and methods outperform existing techniques. Moreover, the relevance feedback mechanisms can lead to significantly better retrieval effectiveness.

computer vision and pattern recognition | 2011

Tag localization with spatial correlations and joint group sparsity

Yang Yang; Yi Yang; Zi Huang; Heng Tao Shen; Feiping Nie

Nowadays numerous social images are emerging on the Web. How to precisely label these images is critical to image retrieval. However, traditional image-level tagging methods may become less effective because global image matching approaches can hardly cope with the diversity and arbitrariness of Web image content. This raises an urgent need for fine-grained tagging schemes. In this work, we study how to establish a mapping between tags and image regions, i.e. localize tags to image regions, so as to better depict and index the content of images. We propose the spatial group sparse coding (SGSC) by extending the robust encoding ability of group sparse coding with spatial correlations among training regions. We present spatial correlations in a two-dimensional image space and design group-specific spatial kernels to produce a more interpretable regularizer. Further, we propose a joint version of the SGSC model which is able to simultaneously encode a group of intrinsically related regions within a test image. An effective algorithm is developed to optimize the objective function of the Joint SGSC. The tag localization task is conducted by propagating tags from sparsely selected groups of regions to the target regions according to the reconstruction coefficients. Extensive experiments on three public image datasets illustrate that our proposed models achieve great performance improvements over the state-of-the-art method in the tag localization task.
