George A. Mihaila | Researchain

Publication

Featured research published by George A. Mihaila.

international conference on parallel and distributed information systems | 1996

Querying the World Wide Web

Alberto O. Mendelzon; George A. Mihaila; Tova Milo

The World Wide Web is a large, heterogeneous, distributed collection of documents connected by hypertext links. The most common technology currently used for searching the Web depends on sending information retrieval requests to "index servers" that index as many documents as they can find by navigating the network. One problem with this is that users must be aware of the various index servers (over a dozen of them are currently deployed on the Web), of their strengths and weaknesses, and of the peculiarities of their query interfaces. A more serious problem is that these queries cannot exploit the structure and topology of the document network. In this paper we propose a query language, WebSQL, that takes advantage of multiple index servers without requiring users to know about them, and that integrates textual retrieval with structure and topology-based queries. We give a formal semantics for WebSQL using a calculus based on a novel "virtual graph" model of a document network. We propose a new theory of query cost based on the idea of "query locality", that is, how much of the network must be visited to answer a particular query. We give an algorithm for characterizing WebSQL queries with respect to query locality. Finally, we describe a prototype implementation of WebSQL written in Java.
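The idea of query locality (how much of the network must be visited to answer a query) can be illustrated with a minimal sketch. It is not the algorithm from the paper: the toy link graph, the `example` hosts, the rule that a link is "local" when source and target share a host, and the function names are all assumptions made for illustration. The sketch only counts how many documents a bounded traversal touches with and without links to other hosts.

```python
from collections import deque
from urllib.parse import urlparse

# Hypothetical toy link graph: URL -> URLs it links to (not real data).
LINKS = {
    "http://a.example/index.html": ["http://a.example/p1.html",
                                    "http://b.example/home.html"],
    "http://a.example/p1.html": ["http://a.example/p2.html"],
    "http://a.example/p2.html": [],
    "http://b.example/home.html": ["http://b.example/x.html"],
    "http://b.example/x.html": [],
}


def is_local(src, dst):
    """Assumption of this sketch: a link is local if both ends share a host."""
    return urlparse(src).netloc == urlparse(dst).netloc


def visited_documents(start, max_depth, allow_global):
    """Breadth-first traversal; returns the set of documents that are visited."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        url, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand links beyond the depth bound
        for nxt in LINKS.get(url, []):
            if nxt in seen:
                continue
            if not allow_global and not is_local(url, nxt):
                continue  # a local-only query never leaves the starting host
            seen.add(nxt)
            queue.append((nxt, depth + 1))
    return seen


start = "http://a.example/index.html"
local_cost = len(visited_documents(start, 2, allow_global=False))
global_cost = len(visited_documents(start, 2, allow_global=True))
print(local_cost, global_cost)
```

In this toy graph the local-only traversal touches fewer documents than the one that also follows links to other hosts, which is the kind of difference a locality-based cost measure is meant to capture.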

international world wide web conferences | 1997

Applications of a Web query language

Gustavo O. Arocena; Alberto O. Mendelzon; George A. Mihaila

In this paper we report on our experience using WebSQL, a high level declarative query language for extracting information from the Web. WebSQL takes advantage of multiple index servers without requiring users to know about them, and integrates full-text with topology-based queries. The WebSQL query engine is a library of Java classes, and WebSQL queries can be embedded into Java programs much in the same way as SQL queries are embedded in C programs. This allows us to access the Web from Java at a much higher level of abstraction than bare HTTP requests. We illustrate the use of WebSQL for application development by describing two applications we are experimenting with: Web site maintenance and specialized index construction. We also sketch several other possible applications. Using the library, we have also implemented a client-server architecture that allows us to perform interactive intelligent searches on the Web from an applet running on a browser.

international world wide web conferences | 2001

When experts agree: using non-affiliated experts to rank popular topics

Krishna Bharat; George A. Mihaila

In response to a query, a search engine returns a ranked list of documents. If the query is about a popular topic (i.e., it matches many documents), then the returned list is usually too long to view fully. Studies show that users usually look at only the top 10 to 20 results. However, we can exploit the fact that the best targets for popular topics are usually linked to by enthusiasts in the same domain. In this paper, we propose a novel ranking scheme for popular topics that places the most authoritative pages on the query topic at the top of the ranking. Our algorithm operates on a special index of expert documents. These are a subset of the pages on the WWW identified as directories of links to non-affiliated sources on specific topics. Results are ranked based on the match between the query and relevant descriptive text for hyperlinks on expert pages pointing to a given result page. We present a prototype search engine that implements our ranking scheme and discuss its performance. With a relatively small (2.5 million page) expert index, our algorithm was able to perform comparably on popular queries with the best of the mainstream search engines.
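The ranking idea described above, scoring a result page by how well the query matches the descriptive text of links that point to it from expert pages, can be sketched as follows. This is a simplified illustration and not the authors' scoring function: the expert pages, link texts, URLs, and the plain term-overlap score are hypothetical, and the sketch does not model how expert pages are selected or how affiliation between sources is detected.

```python
from collections import defaultdict

# Hypothetical expert index: expert page -> (descriptive link text, target URL).
EXPERT_PAGES = {
    "expert1": [("best hiking trails guide", "http://trails.example/"),
                ("camping gear reviews", "http://gear.example/")],
    "expert2": [("hiking trails by region", "http://trails.example/"),
                ("hiking boots shop", "http://boots.example/")],
}


def rank(query):
    """Score each target by query-term overlap with link text on expert pages."""
    terms = set(query.lower().split())
    scores = defaultdict(int)
    for links in EXPERT_PAGES.values():
        for text, target in links:
            overlap = len(terms & set(text.lower().split()))
            if overlap:
                scores[target] += overlap  # each matching expert link adds to the score
    # Highest score first; ties broken by URL so the order is deterministic.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


results = rank("hiking trails")
print(results)
```

Targets that several expert pages describe in terms matching the query rise to the top, while targets whose link text shares no term with the query are left out.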

very large data bases | 2002

Locating and accessing data repositories with WebSemantics

George A. Mihaila; Louiqa Raschid; Anthony Tomasic

Many collections of scientific data in particular disciplines are available today on the World Wide Web. Most of these data sources are compliant with some standard for interoperable access. In addition, sources may support a common semantics, i.e., a shared meaning for the data types and their domains. However, sharing data among a global community of users is still difficult for the following reasons: (i) data providers need a mechanism for describing and publishing available sources of data; (ii) data administrators need a mechanism for discovering the location of published sources and obtaining metadata from these sources; and (iii) users need a mechanism for browsing and selecting sources. This paper describes a system, WebSemantics, that accomplishes the above tasks. We describe an architecture for the publication and discovery of scientific data sources, which is an extension of the World Wide Web architecture and protocols. We support catalogs containing metadata about data sources for some application domain. We define a language for discovering sources and querying their metadata. We then describe the WebSemantics prototype.
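The publish, discover, and select steps listed above can be pictured with a small in-memory catalog. This is only a sketch under stated assumptions: `SourceRecord`, `Catalog`, their fields, and the sample entries are hypothetical names invented here, and they do not reproduce the WebSemantics architecture, protocols, or query language.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceRecord:
    """Hypothetical metadata record describing one published data source."""
    url: str
    domain: str            # application domain, e.g. "hydrology"
    data_types: frozenset  # data types the source offers


class Catalog:
    """Hypothetical catalog holding metadata about data sources."""

    def __init__(self):
        self._records = []

    def publish(self, record):
        # A data provider registers a description of its source.
        self._records.append(record)

    def discover(self, domain, data_type):
        # Find sources in a domain that offer the requested data type.
        return [r.url for r in self._records
                if r.domain == domain and data_type in r.data_types]


catalog = Catalog()
catalog.publish(SourceRecord("http://data1.example/", "hydrology",
                             frozenset({"streamflow", "rainfall"})))
catalog.publish(SourceRecord("http://data2.example/", "hydrology",
                             frozenset({"streamflow"})))
catalog.publish(SourceRecord("http://data3.example/", "seismology",
                             frozenset({"rainfall"})))

found = catalog.discover("hydrology", "rainfall")
print(found)
```

A user would then browse the returned URLs and select among them; here only the first sample source matches both the domain and the data type.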

extending database technology | 1998

Equal Time for Data on the Internet with WebSemantics

George A. Mihaila; Louiqa Raschid; Anthony Tomasic

Many collections of scientific data in particular disciplines are available today around the world. Much of this data conforms to some agreed-upon standard for data exchange, i.e., a standard schema and its semantics. However, sharing this data among a global community of users is still difficult because of a lack of standards for the following necessary functions: (i) data providers need a standard for describing or publishing available sources of data; (ii) data administrators need a standard for discovering the published data; and (iii) users need a standard for accessing this discovered data. This paper describes a prototype implementation of a system, WebSemantics, that accomplishes the above tasks. We describe an architecture and protocols for the publication and discovery of, and access to, scientific data. We define a language for discovering sources and querying the data in these sources, and we provide a formal semantics for this language.

International Journal on Digital Libraries | 1997