Eliyahu Safra
Technion – Israel Institute of Technology
Publications
Featured research published by Eliyahu Safra.
very large data bases | 2004
Catriel Beeri; Yaron Kanza; Eliyahu Safra; Yehoshua Sagiv
Given two geographic databases, a fusion algorithm should produce all pairs of corresponding objects (i.e., objects that represent the same real-world entity). Four fusion algorithms, which only use locations of objects, are described and their performance is measured in terms of recall and precision. These algorithms are designed to work even when locations are imprecise and each database represents only some of the real-world entities. Results of extensive experimentation are presented and discussed. The tests show that the performance depends on the density of the data sources and the degree of overlap among them. All four algorithms are much better than the current state of the art (i.e., the one-sided nearest-neighbor join). One of these four algorithms is best in all cases, at a cost of a small increase in the running time compared to the other algorithms.
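The baseline named in this abstract, the one-sided nearest-neighbor join, and the two quality measures can be illustrated with a minimal sketch. The data, identifiers, and function names below are hypothetical and are not taken from the paper; the four fusion algorithms themselves are not reproduced here.

```python
import math

def one_sided_nn_join(a, b):
    """Pair every object of `a` with its nearest object in `b` (baseline join)."""
    pairs = set()
    for id_a, loc_a in a.items():
        id_b = min(b, key=lambda k: math.dist(loc_a, b[k]))
        pairs.add((id_a, id_b))
    return pairs

def recall_precision(found, truth):
    """Recall = correct / all true pairs; precision = correct / all returned pairs."""
    correct = len(found & truth)
    recall = correct / len(truth) if truth else 1.0
    precision = correct / len(found) if found else 1.0
    return recall, precision

# Hypothetical toy data: a3 has no counterpart in B.
A = {"a1": (0.0, 0.0), "a2": (10.0, 0.0), "a3": (20.0, 0.0)}
B = {"b1": (0.5, 0.0), "b2": (10.5, 0.0)}
TRUTH = {("a1", "b1"), ("a2", "b2")}

found = one_sided_nn_join(A, B)
recall, precision = recall_precision(found, TRUTH)
```

In this toy case the join still pairs a3 with b2, so recall is 1.0 while precision drops to 2/3. This suggests why a join of this kind can struggle when each database represents only some of the real-world entities.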
International Journal of Geographical Information Science | 2005
Catriel Beeri; Yerach Doytsher; Yaron Kanza; Eliyahu Safra; Yehoshua Sagiv
When integrating geo-spatial datasets, a join algorithm is used for finding sets of corresponding objects (i.e., objects that represent the same real-world entity). Algorithms for joining two datasets were studied in the past. This paper investigates integration of three datasets and proposes methods that can be easily generalized to any number of datasets. Two approaches that use only locations of objects are presented and compared. In one approach, a join algorithm for two datasets is applied sequentially. In the second approach, all the integrated datasets are processed simultaneously. For the two approaches, join algorithms are given and their performance, in terms of recall and precision, is compared. The algorithms are designed to perform well even when locations are imprecise and each dataset represents only some of the real-world entities. Results of extensive experiments show that one of the algorithms has the best (or close to the best) performance under all circumstances. This algorithm performs much better than sequentially applying the one-sided nearest-neighbor join.
International Journal of Geographical Information Science | 2010
Eliyahu Safra; Yaron Kanza; Yehoshua Sagiv; Catriel Beeri; Yerach Doytsher
When integrating geo‐spatial data sets, a join algorithm is used for finding sets of corresponding objects (i.e., objects that represent the same real‐world entity). This article investigates location‐based join algorithms for integration of several data sets. First, algorithms for integration of two data sets are presented and their performance, in terms of recall and precision, is compared. Then, two approaches for integration of more than two data sets are described. In one approach, all the integrated data sets are processed simultaneously. In the second approach, a join algorithm for two data sets is applied sequentially, either in a serial manner, where in each join at least one of the joined data sets is a single source, or in a hierarchical manner, where two join results can be joined. For the two approaches, join algorithms are given. The algorithms are designed to perform well even when locations of objects are imprecise and each data set represents only some of the real‐world entities. Results of extensive experiments with the different approaches are provided and analyzed. The experiments show the differences, in accuracy and efficiency, between the approaches, under different circumstances. The results also show that all our algorithms have much better accuracy than applying the commonly used one‐sided nearest‐neighbor join.
very large data bases | 2010
Yaron Kanza; Roy Levin; Eliyahu Safra; Yehoshua Sagiv
A route search is an enhancement of an ordinary geographic search. Instead of merely returning a set of entities, the result is a route that goes via entities that are relevant to the search. The input to the problem consists of several search queries, and each query defines a type of geographical entities. When visited, some of the entities succeed in satisfying the user while others fail to do so; however, only the probability of success is known prior to arrival. The main task is to find a route that visits at least one satisfying entity of each type. In an interactive search, the route is computed in steps. In each step, only the next entity of the route is given to the user, and after visiting that entity, the user provides a feedback specifying whether the entity satisfies her. This paper investigates interactive route search in the presence of order constraints that specify that some types of entities should be visited before others. We present heuristic algorithms for interactive route search for two cases, depending on whether the constraints define a complete order or a partial one. The main challenge is to utilize the feedback in order to compute a route that is shorter and has a higher degree of success, compared to routes that are computed non-interactively. We also discuss how to compare the results of the algorithms and introduce suitable measures for doing so. Experiments on real-world data illustrate the efficiency and effectiveness of our algorithms.
advances in geographic information systems | 2006
Eliyahu Safra; Yaron Kanza; Yehoshua Sagiv; Yerach Doytsher
Integration of two road maps is finding a matching between pairs of objects that represent, in the maps, the same real-world road. Several algorithms were proposed in the past for road-map integration; however, these algorithms are not efficient and some of them even require human feedback. Thus, they are not suitable for many important applications (e.g., Web services) where efficiency, in terms of both time and space, is crucial. This paper presents two efficient algorithms for integrating maps in which roads are represented as polylines. The main novelty of these algorithms is in using only the locations of the endpoints of the polylines rather than trying to match whole lines. Experiments on real-world data are given, showing that our approach of integration based on matching merely endpoints is efficient and accurate (that is, it provides high recall and precision).
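The idea of comparing only polyline endpoints, rather than whole lines, can be sketched as follows. This is a naive quadratic illustration under assumed names (`endpoints_match`, `match_roads`, the tolerance `tol`) and made-up coordinates; it is not either of the paper's two algorithms and makes no claim to their efficiency.

```python
import math

def endpoints_match(p, q, tol):
    """True if the two endpoints of polyline p lie within `tol` of those of q,
    in either direction (roads may be digitized in opposite order)."""
    p0, p1 = p[0], p[-1]
    q0, q1 = q[0], q[-1]
    same = math.dist(p0, q0) <= tol and math.dist(p1, q1) <= tol
    flipped = math.dist(p0, q1) <= tol and math.dist(p1, q0) <= tol
    return same or flipped

def match_roads(map_a, map_b, tol):
    """Compare every pair of polylines using endpoints only; interior vertices are ignored."""
    return {
        (id_a, id_b)
        for id_a, line_a in map_a.items()
        for id_b, line_b in map_b.items()
        if endpoints_match(line_a, line_b, tol)
    }

# Hypothetical maps: s1 is r1 digitized in the opposite direction with small offsets.
MAP_A = {"r1": [(0, 0), (5, 1), (10, 0)], "r2": [(10, 0), (10, 10)]}
MAP_B = {"s1": [(10.2, 0.1), (4, 0.5), (0.1, -0.1)], "s2": [(10.1, 0.0), (10.0, 20.0)]}

matches = match_roads(MAP_A, MAP_B, tol=0.5)
```

Here only r1 and s1 are matched: r2 and s2 share one endpoint, but their other endpoints are far apart.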
International Journal of Geographical Information Science | 2013
Eliyahu Safra; Yaron Kanza; Yehoshua Sagiv; Yerach Doytsher
In integration of road maps modeled as road vector data, the main task is matching pairs of objects that represent, in different maps, the same segment of a real-world road. In an ad hoc integration, the matching is done for a specific need and, thus, is performed in real time, where only limited preprocessing is possible. Usually, ad hoc integration is performed as part of some interaction with a user and, hence, the matching algorithm is required to complete its task in time that is short enough for human users to provide feedback to the application, that is, in no more than a few seconds. Such interaction is typical of services on the World Wide Web and of applications in car-navigation systems or in handheld devices. Several algorithms were proposed in the past for matching road vector data; however, these algorithms are not efficient enough for ad hoc integration. This article presents algorithms for ad hoc integration of maps in which roads are represented as polylines. The main novelty of these algorithms is in using only the locations of the endpoints of the polylines rather than trying to match whole lines. The efficiency of the algorithms is shown both analytically and experimentally. In particular, these algorithms do not require the existence of a spatial index, and they are more efficient than an alternative approach based on using a grid index. Extensive experiments using various maps of three different cities show that our approach to matching road networks is efficient and accurate (i.e., it provides high recall and precision). General Terms: Algorithms, Experimentation
advances in geographic information systems | 2008
Yaron Kanza; Eliyahu Safra; Yehoshua Sagiv; Yerach Doytsher
In a geographical route search, given search terms, the goal is to find an effective route that (1) starts at a given location, (2) ends at a given location, and (3) travels via geographical entities that are relevant to the given terms. A route is effective if it does not exceed a given distance limit whereas the ranking scores of the visited entities, with respect to the search terms, are maximal. This paper introduces route-search queries, suggests three semantics for such queries and deals with the problem of efficiently answering queries under the different semantics. Since the problem of answering route-search queries is a generalization of the traveling salesman problem, it is unlikely to have an efficient solution, i.e., there is no polynomial-time algorithm that solves the problem (unless P=NP). Hence, in this work we consider heuristics for the problem. Methods for effectively computing routes are presented. The methods are compared analytically and experimentally. For these methods, experiments on both synthetic and real-world data illustrate their efficiency and their effectiveness in computing a route that satisfies the constraints of a route-search query.
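To make the shape of a route-search query concrete, here is a generic greedy sketch: from the current position it repeatedly moves to the closest entity of any search term not yet covered, then checks the distance limit. It assumes one entity per term and ignores ranking scores, and it is not one of the paper's methods; all names and coordinates are hypothetical.

```python
import math

def route_length(points):
    """Total Euclidean length of a route given as a list of points."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))

def greedy_route(start, end, candidates):
    """candidates maps each search term to the locations of its relevant entities.
    Greedily visits the nearest entity of any term that is still uncovered."""
    route = [start]
    remaining = dict(candidates)
    while remaining:
        here = route[-1]
        term, loc = min(
            ((t, l) for t, locs in remaining.items() for l in locs),
            key=lambda pair: math.dist(here, pair[1]),
        )
        route.append(loc)
        del remaining[term]  # this term is now covered
    route.append(end)
    return route

# Hypothetical query: one cafe and one ATM on the way from (0, 0) to (10, 0).
CANDIDATES = {"cafe": [(2, 0), (8, 5)], "atm": [(3, 1), (9, 0)]}
route = greedy_route((0, 0), (10, 0), CANDIDATES)
within_limit = route_length(route) <= 12  # assumed distance limit of 12
```

A greedy choice like this is fast but can miss shorter or higher-ranked routes, which is consistent with the abstract's point that only heuristics are practical for this problem.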
symposium on large spatial databases | 2009
Yaron Kanza; Eliyahu Safra; Yehoshua Sagiv
In a route search over geospatial data, a user provides terms for specifying types of geographical entities that she wishes to visit. The goal is to find a route that (1) starts at a given location, (2) ends at a given location, and (3) travels via geospatial entities that are relevant to the provided search terms. Earlier work studied the problem of finding a route that is effective in the sense that its length does not exceed a given limit, the relevancy of the objects is as high as possible, and the route visits a single object from each specified type. This paper investigates route search over probabilistic geospatial data. It is shown that the notion of an effective route requires a new definition and, specifically, two alternative semantics are proposed. Computing an effective route is more complicated, compared to the non-probabilistic case, and hence necessitates new algorithms. Heuristic methods for computing an effective route, under either one of the two semantics, are developed. (Note that the problem is NP-hard.) These methods are compared analytically and experimentally. In particular, experiments on both synthetic and real-world data illustrate the efficiency and effectiveness of these methods in computing a route under the two semantics.
web and wireless geographical information systems | 2006
Eliyahu Safra; Yaron Kanza; Yehoshua Sagiv; Yerach Doytsher
A substantial amount of data about geographical entities is available on the World-Wide Web, in the form of digital maps. This paper investigates the integration of such data. A three-step integration process is presented. First, geographical objects are retrieved from maps on the Web. Second, pairs of objects that represent the same real-world entity, in different maps, are discovered and the information about them is combined. Finally, selected objects are presented to the user. The proposed process is efficient, accurate (i.e., the discovery of corresponding objects has high recall and precision) and it can be applied to any pair of digital maps, without requiring the existence of specific attributes. For the step of discovering corresponding objects, three new algorithms are presented. These algorithms modify existing methods that use only the locations of geographical objects, so that information beyond locations is also utilized in the process. The three algorithms are compared using experiments on datasets with varying levels of completeness and accuracy. It is shown that when used correctly, additional information can improve the accuracy of location-based methods even when the data is not complete or not entirely accurate.
symposium on large spatial databases | 2007
Eliyahu Safra; Yaron Kanza; Nir Dolev; Yehoshua Sagiv; Yerach Doytsher
An uncertain geo-spatial dataset is a collection of geo-spatial objects that do not accurately represent real-world entities. Each object has a confidence value indicating how likely it is for the object to be correct. Uncertain data can be the result of operations such as imprecise integration, incorrect update or inexact querying. A k-route, over an uncertain geo-spatial dataset, is a path that travels through the geo-spatial objects, starting at a given location and stopping after visiting k correct objects. A k-route is considered shortest if the expected length of the route is less than or equal to the expected length of any other k-route that starts at the given location. This paper introduces the problem of finding a shortest k-route over an uncertain dataset. Since the problem is a generalization of the traveling salesman problem, it is unlikely to have an efficient solution, i.e., there is no polynomial-time algorithm that solves the problem (unless P=NP). Hence, in this work we consider heuristics for the problem. Three methods for computing a short k-route are presented. The three methods are compared analytically and experimentally. For these three methods, experiments on both synthetic and real-world data show the tradeoff between the quality of the result (i.e., the expected length of the returned route) and the efficiency of the computation.
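The expected length of a given k-route can be computed leg by leg: each leg contributes its distance times the probability that fewer than k correct objects have been visited before it. The sketch below assumes that objects are correct independently of one another, with probability equal to their confidence, and that the traveler walks the whole sequence if k correct objects are never found. Both are assumptions made here for illustration, and the function name and data are hypothetical.

```python
import math

def expected_length(start, route, k):
    """route: list of (location, confidence) in visiting order; requires k >= 1.
    Returns the expected distance traveled when stopping after k correct objects."""
    # probs[j] = probability that exactly j objects seen so far are correct, for j < k.
    # Probability mass that reaches k is dropped, because the traveler has stopped.
    probs = [1.0] + [0.0] * (k - 1)
    total = 0.0
    here = start
    for loc, conf in route:
        still_travelling = sum(probs)  # P(fewer than k correct so far)
        total += still_travelling * math.dist(here, loc)
        updated = [0.0] * k
        for j, p in enumerate(probs):
            updated[j] += p * (1 - conf)      # this object is incorrect
            if j + 1 < k:
                updated[j + 1] += p * conf    # this object is correct, keep going
        probs = updated
        here = loc
    return total

# Hypothetical route along a line, one unit between consecutive objects.
ROUTE = [((1, 0), 0.5), ((2, 0), 0.5), ((3, 0), 1.0)]
length_k1 = expected_length((0, 0), ROUTE, k=1)
length_k2 = expected_length((0, 0), ROUTE, k=2)
```

For k = 1 the legs are traveled with probabilities 1, 0.5, and 0.25, giving an expected length of 1.75; for k = 2 the probabilities are 1, 1, and 0.75, giving 2.75. Evaluating one route is cheap; the hard part the abstract refers to is choosing the visiting order.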