Yigal Arens

University of Southern California

Publication

Featured research published by Yigal Arens.

International Journal of Cooperative Information Systems | 1993

Retrieving and Integrating Data from Multiple Information Sources

Yigal Arens; Chin Y. Chee; Chun-Nan Hsu; Craig A. Knoblock

With the current explosion of data, retrieving and integrating information from various sources is a critical problem. Work in multidatabase systems has begun to address this problem, but it has primarily focused on methods for communicating between databases and requires significant effort for each new database added to the system. This paper describes a more general approach that exploits a semantic model of a problem domain to integrate the information from various information sources. The information sources handled include both databases and knowledge bases, and other information sources (e.g. programs) could potentially be incorporated into the system. This paper describes how both the domain and the information sources are modeled, shows how a query at the domain level is mapped into a set of queries to individual information sources, and presents algorithms for automatically improving the efficiency of queries using knowledge about both the domain and the information sources. This work is implemented in a system called SIMS and has been tested in a transportation planning domain using nine Oracle databases and a Loom knowledge base.

Intelligent Information Systems | 1996

Query reformulation for dynamic information integration

Yigal Arens; Craig A. Knoblock; Wei-Min Shen

The standard approach to integrating heterogeneous information sources is to build a global schema that relates all of the information in the different sources, and to pose queries directly against it. The problem is that schema integration is usually difficult, and as soon as any of the information sources change or a new source is added, the process may have to be repeated. The SIMS system uses an alternative approach. A domain model of the application domain is created, establishing a fixed vocabulary for describing data sets in the domain. Using this language, each available information source is described. Queries to SIMS against the collection of available information sources are posed using terms from the domain model, and reformulation operators are employed to dynamically select an appropriate set of information sources and to determine how to integrate the available information to satisfy a query. This approach results in a system that is more flexible than existing ones, more easily scalable, and able to respond dynamically to newly available or unexpectedly missing information sources. This paper describes the query reformulation process in SIMS and the operators used in it. We provide precise definitions of the reformulation operators and explain the rationale behind choosing the specific ones SIMS uses. We have demonstrated the feasibility and effectiveness of this approach by applying SIMS in the domains of transportation planning and medical trauma care.
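The idea of posing a query in domain-model terms and selecting sources dynamically can be illustrated with a minimal sketch. This is not the SIMS implementation or its reformulation operators; the `Source` class, the `select_sources` function, and the example source names are all hypothetical, and the greedy coverage strategy is an assumption made only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    """Hypothetical description of an information source in domain-model terms."""
    name: str
    concept: str            # domain concept the source holds data about
    attributes: frozenset   # domain-level attributes the source can supply
    available: bool = True


def select_sources(concept, wanted, sources):
    """Greedily pick available sources that together supply the wanted attributes."""
    remaining = set(wanted)
    chosen = []
    for src in sources:
        if not remaining:
            break
        if not src.available or src.concept != concept:
            continue
        covered = remaining & src.attributes
        if covered:
            chosen.append(src.name)
            remaining -= covered
    if remaining:
        raise LookupError(f"no available source supplies: {sorted(remaining)}")
    return chosen


# Illustrative (made-up) sources described with the same domain vocabulary.
sources = [
    Source("ports_db", "Seaport", frozenset({"name", "depth"})),
    Source("geo_kb", "Seaport", frozenset({"name", "country"})),
]
plan = select_sources("Seaport", {"name", "country"}, sources)

# If a source goes missing, the same domain-level query is re-planned.
degraded = [Source("ports_db", "Seaport", frozenset({"name", "depth"}), available=False),
            sources[1]]
fallback_plan = select_sources("Seaport", {"name", "country"}, degraded)
```

Because the query names only domain concepts and attributes, the caller does not change when a source is added or becomes unavailable; only the selected plan changes.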

Communications of The ACM | 1984

Talking to UNIX in English: an overview of UC

Robert Wilensky; Yigal Arens; David N. Chin

UC is a natural language help facility which advises users in using the UNIX operating system. Users can query UC about how to do things, command names and formats, online definitions of UNIX or general operating systems terminology, and debugging problems in using commands. UC is comprised of the following components: a language analyzer and generator, a context and memory model, an experimental common-sense planner, highly extensible knowledge bases on both the UNIX domain and the English language, a goal analysis component, and a system for acquisition of new knowledge through instruction in English. The language interface of UC is based on a “phrasal analysis” approach which integrates semantic, grammatical and other types of information. In addition, it includes capabilities for ellipsis resolution and reference disambiguation.

IEEE Computer | 2001

Simplifying data access: the Energy Data Collection project

José Luis Ambite; Yigal Arens; Eduard H. Hovy; Andrew Philpot; Luis Gravano; Vasileios Hatzivassiloglou; Judith L. Klavans

Using technology developed at the Digital Government Research Center, a team of researchers is seeking to make government statistical data more accessible through the Internet. In collaboration with government experts, they are conducting research into advanced information systems, developing standards, interfaces and a shared infrastructure, and building and managing pilot systems.

Conference on Information and Knowledge Management | 1994

Intelligent caching: selecting, representing, and reusing data in an information server

Yigal Arens; Craig A. Knoblock

Accessing information sources to retrieve data requested by a user can be expensive, especially when dealing with distributed information sources. One way to reduce this cost is to cache the results of queries, or related classes of data. This paper presents an approach to caching and addresses the issues of which information to cache, how to describe what has been cached, and how to use the cached information to answer future queries. We consider these issues in the context of the SIMS information server, which is a system for retrieving information from multiple heterogeneous and distributed information sources. The design of this information server is ideal for representing and reusing cached information since each class of cached information is simply viewed as another information source that is available for answering future queries.
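The notion that a cached class of data can be treated as one more information source can be sketched as follows. This is an illustrative simplification, not the representation used in SIMS: the `answer` function, the dictionary-based cache, the equality-only constraints, and the stand-in `fetch_remote` function with its sample rows are all assumptions.

```python
def answer(concept, constraints, cache, fetch_remote):
    """Answer a query from a cached class when one covers it, else go remote.

    A cached class covers the query when it is about the same concept and its
    constraints are a subset of the query's (i.e. it is at least as general).
    """
    wanted = frozenset(constraints.items())
    for (cached_concept, cached_constraints), rows in cache.items():
        if cached_concept == concept and cached_constraints <= wanted:
            # Filter the more general cached class down to the query.
            return [r for r in rows
                    if all(r.get(k) == v for k, v in constraints.items())]
    rows = fetch_remote(concept, constraints)
    cache[(concept, wanted)] = rows  # describe what was cached by its query
    return rows


# Hypothetical remote source that records how often it is contacted.
remote_calls = []

def fetch_remote(concept, constraints):
    remote_calls.append((concept, dict(constraints)))
    data = [{"name": "Long Beach", "country": "US"},
            {"name": "Rotterdam", "country": "NL"}]
    return [r for r in data
            if all(r.get(k) == v for k, v in constraints.items())]


cache = {}
all_ports = answer("Seaport", {}, cache, fetch_remote)                 # remote
us_ports = answer("Seaport", {"country": "US"}, cache, fetch_remote)   # cache
```

In this sketch the second query never reaches the remote source, because the cached class of all seaports is more general than the query and is simply filtered.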

International Conference on Management of Data | 1993

SIMS: Retrieving and integrating information from multiple sources

Yigal Arens; Craig A. Knoblock

Most tasks performed by users of complex information systems involve interaction with multiple knowledge bases and databases. Examples can be found in the areas of analysis, resource planning and briefing applications. Retrieval of desired information dispersed in multiple sources requires general familiarity with their contents and structure, with their query languages, with their location on existing networks, and more. The user must break down a given retrieval task into a sequence of actual queries to databases and/or knowledge bases, and must handle the temporary storing and possible transformation of intermediate results, all this while satisfying constraints on reliability of the results and the cost of the retrieval process. With a large number of information sources, it is difficult to find individuals who possess the required knowledge, and automation becomes a necessity. There is an elegant solution to the problem described above: the creation of a knowledge server that will form the interface between information sources and applications in need of that information. A user, or an application, will query the knowledge server in a manner that is independent of the distribution of information over various sources, independent of the various query

Archive | 2002

Data Integration and Access

José Luis Ambite; Yigal Arens; Walter Bourne; Steve Feiner; Luis Gravano; Vasileios Hatzivassiloglou; Eduard H. Hovy; Judith L. Klavans; Andrew Philpot; Kenneth A. Ross; Jay Sandhaus; Deniz Sariöz; Rolfe R. Schmidt; Cyrus Shahabi; Anurag Singla; Surabhan Temiyabutr; Brian Whitman; Kazi A. Zaman

This chapter describes the progress of the Digital Government Research Center in tackling the challenges of integrating and accessing the massive amount of statistical and text data available from government agencies. In particular, we address the issues of database heterogeneity, size, distribution, and control of terminology. In this chapter we provide an overview of our results in addressing problems such as (1) ontological mappings for terminology standardization, (2) data integration across data bases with high speed query processing, and (3) interfaces for query input and presentation of results. The DGRC is a collaboration between researchers from Columbia University and the Information Sciences Institute of the University of Southern California employing technology developed at both locations, in particular the SENSUS ontology, the SIMS multi-database access planner, the LEXING automated dictionary and terminology analysis system, the main-memory query processing component and others. The pilot application targets gasoline data from the Bureau of Labor Statistics, the Energy Information Administration of the Department of Energy, the Census Bureau, and other government agencies.

Artificial Intelligence Review | 1995

The design of a model-based multimedia interaction manager

Yigal Arens; Eduard H. Hovy

We describe here the conceptual design of Cicero, an application-independent human-computer interaction manager that performs run-time media coordination and allocation, so as to adapt dynamically to the presentation context; knows what it is presenting, so as to maintain coherent extended human-machine dialogues; and is plug-in compatible with host information resources such as “briefing associate” workstations, expert systems, databases, etc., as well as with multiple media such as natural language, graphics, etc. The system design calls for two linked reactive planners that coordinate the actions of the system's media and information sources. To enable presentational flexibility, the capabilities of each medium and the nature of the contents of each information source are semantically modeled as Virtual Devices (abstract descriptions of device I/O capabilities) and abstract information types, respectively, in a single uniform knowledge representation framework. These models facilitate extensibility by supporting the specification of new interaction behaviors and the inclusion of new media and information sources.

Communications of The ACM | 2003

Responding to the unexpected

Yigal Arens; Paul S. Rosenbloom

How IT can help prepare for future attacks and disasters.

Meeting of the Association for Computational Linguistics | 1987

Phrasal Analysis of Long Noun Sequences

Yigal Arens; John J. Granacki; Alice C. Parker

Noun phrases consisting of a sequence of nouns (sometimes referred to as nominal compounds) pose considerable difficulty for language analyzers but are common in many technical domains. The problems are compounded when some of the nouns in the sequence are ambiguously also verbs. The phrasal approach to language analysis, as implemented in PHRAN (PHRasal ANalyzer), has been extended to handle the recognition and partial analysis of such constructions. The phrasal analysis of a noun sequence is performed to an extent sufficient for continued analysis of the sentence in which it appears. PHRAN is currently being used as part of the SPAN (SPecification ANalysis) natural language interface to the USC Advanced Design AutoMation system (ADAM) (Granacki et al., 1985). PHRAN-SPAN is an interface for entering and interpreting digital system specifications, in which long noun sequences occur often. The extensions to PHRAN's knowledge base to recognize these constructs are described, along with the algorithm used to detect and resolve ambiguities which arise in the noun sequences.
