Is this you? Create Your Porfile

Jie Lu

Carnegie Mellon University

Archive Network Publication Hotspot Collaboration

Network

Latest external collaboration on country level. Dive into details by clicking on the dots.

Explore More

Hotspot

Dive into the research topics where Jie Lu is active.

Explore More

Publication

Featured researches published by Jie Lu.

conference on information and knowledge management | 2003

Content-based retrieval in hybrid peer-to-peer networks

Jie Lu; James P. Callan

Hybrid peer-to-peer architectures use special nodes to provide directory services for regions of the network (regional directory services). Hybrid peer-to-peer architectures are a potentially powerful model for developing large-scale networks of complex digital libraries, but peer-to-peer networks have so far tended to use very simple methods of resource selection and document retrieval. In this paper, we study the application of content-based resource selection and document retrieval to hybrid peer-to-peer networks. The directory nodes that provide regional directory services construct and use the content models of neighboring nodes to determine how to route query messages through the network. The leaf nodes that provide information use content-based retrieval to decide which documents to retrieve for queries. The experimental results demonstrate that using content-based retrieval in hybrid peer-to-peer networks is both more accurate and more efficient for some digital library environments than more common alternatives such as Gnutella 0.6.

european conference on information retrieval | 2005

Federated search of text-based digital libraries in hierarchical peer-to-peer networks

Jie Lu; Jamie Callan

Peer-to-peer architectures are a potentially powerful model for developing large-scale networks of text-based digital libraries, but peer-to-peer networks have so far provided very limited support for text-based federated search of digital libraries using relevance-based ranking. This paper addresses the problems of resource representation, resource ranking and selection, and result merging for federated search of text-based digital libraries in hierarchical peer-to-peer networks. Existing approaches to text-based federated search are adapted and new methods are developed for resource representation and resource selection according to the unique characteristics of hierarchical peer-to-peer networks. Experimental results demonstrate that the proposed approaches offer a better combination of accuracy and efficiency than more common alternatives for federated search in peer-to-peer networks.

international acm sigir conference on research and development in information retrieval | 2006

User modeling for full-text federated search in peer-to-peer networks

Jie Lu; James P. Callan

User modeling for information retrieval has mostly been studied to improve the effectiveness of information access in centralized repositories. In this paper we explore user modeling in the context of full-text federated search in peer-to-peer networks. Our approach models a users persistent, long-term interests based on past queries, and uses the model to improve search efficiency for future queries that represent interests similar to past queries. Our approach also enables queries representing a users transient, ad-hoc interests to be automatically recognized so that search for these queries can rely on a relatively large search radius to avoid sacrificing effectiveness for efficiency. Experimental results demonstrate that our approach can significantly improve the efficiency of full-text federated search without degrading its accuracy. Furthermore, the proposed approach does not require a large amount of training data, and is robust to a range of parameter values.

conference on information and knowledge management | 2002

Pruning long documents for distributed information retrieval

Jie Lu; James P. Callan

Query-based sampling is a method of discovering the contents of a text database by submitting queries to a search engine and observing the documents returned. In prior research sampled documents were used to build resource descriptions for automatic database selection, and to build a centralized sample database for query expansion and result merging. An unstated assumption was that the associated storage costs were acceptable.When sampled documents are long, storage costs can be large. This paper investigates methods of pruning long documents to reduce storage costs. The experimental results demonstrate that building resource descriptions and centralized sample databases from the pruned contents of sampled documents can reduce storage costs by 54-93% while causing only minor losses in the accuracy of distributed information retrieval.

international acm sigir conference on research and development in information retrieval | 2007

Full-text federated search in peer-to-peer networks

Jie Lu

european conference on information retrieval | 2006

Full-text federated search of text-based digital libraries in peer-to-peer networks

Jie Lu; Jamie Callan

Peer-to-peer (P2P) networks integrate autonomous computing resources without requiring a central coordinating authority, which makes them a potentially robust and scalable model for providing federated search capability to large-scale networks of text-based digital libraries. However, peer-to-peer networks have so far provided very limited support for full-text federated search with relevance-based document ranking. This paper provides solutions to full-text federated search of text-based digital libraries in hierarchical peer-to-peer networks. Existing approaches to full-text search are adapted and new methods are developed for the problems of resource representation, resource selection, and result merging according to the unique characteristics of hierarchical peer-to-peer networks. Experimental results demonstrate that the proposed approaches offer a better combination of accuracy and efficiency than more common alternatives for federated search of text-based digital libraries in peer-to-peer networks.

international acm sigir conference on research and development in information retrieval | 2004