Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Alejandro Salinger is active.

Publication


Featured research published by Alejandro Salinger.


ACM Journal of Experimental Algorithmics | 2009

An experimental investigation of set intersection algorithms for text searching

Jérémy Barbay; Alejandro López-Ortiz; Tyler Lu; Alejandro Salinger

The intersection of large ordered sets is a common problem in the context of the evaluation of boolean queries to a search engine. In this article, we propose several improved algorithms for computing the intersection of sorted arrays, and in particular for searching sorted arrays in the intersection context. We perform an experimental comparison with the algorithms from the previous studies by Demaine, López-Ortiz, and Munro [ALENEX 2001] and by Baeza-Yates and Salinger [SPIRE 2005]; in addition, we implement and test the intersection algorithm from Barbay and Kenyon [SODA 2002] and its randomized variant [SAGA 2003]. We consider the random data set from Baeza-Yates and Salinger, the Google queries used by Demaine et al., a corpus provided by Google, and a larger corpus from the TREC Terabyte 2006 efficiency query stream, along with its own query log. We measure the performance both in terms of the number of comparisons and searches performed, and in terms of the CPU time on two different architectures. Our results confirm or improve the results of both previous studies in their respective contexts (the comparison model on real data, and CPU measures on random data) and extend them to new contexts. In particular, we show that value-based search algorithms perform well on posting lists in terms of the number of comparisons performed.
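
As an illustration of the value-based searches discussed above, here is a minimal Python sketch that intersects two sorted, duplicate-free arrays by galloping (exponential search followed by binary search) through the larger one. It is a generic sketch of the technique, not any of the exact variants benchmarked in the paper.

    from bisect import bisect_left

    def gallop(arr, target, lo):
        # Exponential search: double the probe distance until we pass
        # `target`, then binary-search the bracketed range. Returns the
        # leftmost index i >= lo with arr[i] >= target.
        bound = 1
        while lo + bound < len(arr) and arr[lo + bound] < target:
            bound *= 2
        return bisect_left(arr, target, lo + bound // 2,
                           min(lo + bound + 1, len(arr)))

    def intersect(small, large):
        # Walk the smaller array, galloping forward through the larger
        # one; each search resumes from the previous match position.
        out, pos = [], 0
        for x in small:
            pos = gallop(large, x, pos)
            if pos == len(large):
                break
            if large[pos] == x:
                out.append(x)
                pos += 1
        return out

    print(intersect([3, 9, 27], list(range(0, 100, 3))))  # [3, 9, 27]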


String Processing and Information Retrieval | 2005

Experimental analysis of a fast intersection algorithm for sorted sequences

Ricardo A. Baeza-Yates; Alejandro Salinger

This work presents an experimental comparison of intersection algorithms for sorted sequences, including the recent algorithm of Baeza-Yates. On average, this algorithm performs fewer comparisons than the total number of elements of both inputs (n and m, respectively) when n = αm (α > 1). This algorithm has applications to query processing in Web search engines, where large intersections, or differences, must be performed fast. In this work we concentrate on studying the behavior of the algorithm in practice, using test data for the experiments that is close to the actual conditions of its applications. We compare the efficiency of the algorithm with other intersection algorithms and study different optimizations, showing that the algorithm is more efficient than the alternatives in most cases, especially when one of the sequences is much larger than the other.
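
The Baeza-Yates algorithm referenced here is, at its core, a divide-and-conquer: binary-search the median of the smaller array in the larger one, then recurse on the two halves. A minimal sketch reconstructed from that description (the index bookkeeping is our own, not the paper's):

    from bisect import bisect_left

    def by_intersect(A, B, alo=0, ahi=None, blo=0, bhi=None, out=None):
        # Probe the median of the smaller (sub)range in the larger one,
        # then recurse on the left and right halves of both ranges.
        if ahi is None:
            ahi, bhi, out = len(A), len(B), []
        if alo < ahi and blo < bhi:
            if ahi - alo > bhi - blo:            # keep A the smaller range
                A, B, alo, ahi, blo, bhi = B, A, blo, bhi, alo, ahi
            mid = (alo + ahi) // 2
            x = A[mid]
            pos = bisect_left(B, x, blo, bhi)
            by_intersect(A, B, alo, mid, blo, pos, out)      # left halves
            if pos < bhi and B[pos] == x:
                out.append(x)                    # median found in B
                pos += 1
            by_intersect(A, B, mid + 1, ahi, pos, bhi, out)  # right halves
        return out

    print(by_intersect([2, 5, 8], [1, 2, 3, 5, 7, 9]))  # [2, 5]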


International Journal of Foundations of Computer Science | 2006

A Simple Alphabet-Independent FM-Index

Szymon Grabowski; Gonzalo Navarro; Rafał Przywarski; Alejandro Salinger; Veli Mäkinen

We design a succinct full-text index based on the idea of Huffman-compressing the text and then applying the Burrows-Wheeler transform over it. The resulting structure can be searched as an FM-index, with the benefit of removing the sharp dependence on the alphabet size, σ, present in that structure. On a text of length n with zero-order entropy H0, our index needs O(n(H0 + 1)) bits of space, without any significant dependence on σ. The average search time for a pattern of length m is O(m(H0 + 1)), under reasonable assumptions. Each position of a text occurrence can be located in worst case time O((H0 + 1) log n), while any text substring of length L can be retrieved in O((H0 + 1)L) average time in addition to the previous worst case time. Our index provides a relevant space/time tradeoff between existing succinct data structures, with the additional interest of being easy to implement. We also explore other coding variants alternative to Huffman and exploit their synchronization properties. Our experimental results on various types of texts show that our indexes are highly competitive in the space/time tradeoff map.
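
For orientation, the sketch below shows the plain FM-index mechanics the paper builds on: a Burrows-Wheeler transform plus backward search to count pattern occurrences. It uses naive, uncompressed C/Occ tables; the paper's actual contribution, Huffman-compressing the text first to remove the dependence on σ, is not reproduced here.

    def bwt(text):
        # Burrows-Wheeler transform via a naive suffix sort (demo only).
        text += "\0"                              # unique smallest sentinel
        sa = sorted(range(len(text)), key=lambda i: text[i:])
        return "".join(text[i - 1] for i in sa)

    def fm_count(bwt_str, pattern):
        # C[c] = number of symbols in the text strictly smaller than c.
        symbols = sorted(set(bwt_str))
        C, total = {}, 0
        for c in symbols:
            C[c] = total
            total += bwt_str.count(c)
        # occ[c][i] = occurrences of c in bwt_str[:i] (uncompressed rank).
        occ = {c: [0] for c in symbols}
        for ch in bwt_str:
            for c in symbols:
                occ[c].append(occ[c][-1] + (1 if ch == c else 0))
        # Backward search: shrink the suffix-array interval per symbol.
        lo, hi = 0, len(bwt_str)
        for c in reversed(pattern):
            if c not in C:
                return 0
            lo = C[c] + occ[c][lo]
            hi = C[c] + occ[c][hi]
            if lo >= hi:
                return 0
        return hi - lo

    print(fm_count(bwt("abracadabra"), "abra"))  # 2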


ACM Symposium on Parallel Algorithms and Architectures | 2008

Optimal speedup on a low-degree multi-core parallel architecture (LoPRAM)

Reza Dorrigiv; Alejandro López-Ortiz; Alejandro Salinger

Over the last five years, major microprocessor manufacturers have released plans for a rapidly increasing number of cores per microprocessor, with upwards of 64 cores by 2015. In this setting, a sequential RAM computer will no longer accurately reflect the architecture on which algorithms are being executed. In this paper we propose a model of low-degree parallelism (LoPRAM) which builds upon the RAM and PRAM models yet better reflects recent advances in parallel (multi-core) architectures. This model supports a high level of abstraction that simplifies the design and analysis of parallel programs. More importantly, we show that in many instances it naturally leads to work-optimal parallel algorithms via simple modifications to sequential algorithms.
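
In the spirit of the LoPRAM thesis that a sequential algorithm often becomes work-optimal on a low, fixed number of cores with only a simple modification, here is a hedged Python sketch: mergesort where chunks are sorted on p workers and merged sequentially, keeping total work at O(n log n). The chunking scheme is our illustrative assumption, not the paper's construction.

    import heapq
    import random
    from concurrent.futures import ProcessPoolExecutor

    def parallel_sort(data, workers=4):
        # Split into `workers` chunks, sort them in parallel, then merge
        # the sorted runs sequentially. For constant p this does the same
        # O(n log n) total work as sequential mergesort: work-optimal.
        chunk = (len(data) + workers - 1) // workers
        parts = [data[i:i + chunk] for i in range(0, len(data), chunk)]
        with ProcessPoolExecutor(max_workers=workers) as pool:
            runs = list(pool.map(sorted, parts))
        return list(heapq.merge(*runs))

    if __name__ == "__main__":
        xs = [random.randrange(10**6) for _ in range(100_000)]
        assert parallel_sort(xs) == sorted(xs)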


International Symposium on Algorithms and Computation | 2009

Practical Discrete Unit Disk Cover Using an Exact Line-Separable Algorithm

Francisco Claude; Reza Dorrigiv; Stephane Durocher; Robert Fraser; Alejandro López-Ortiz; Alejandro Salinger

Given m unit disks and n points in the plane, the discrete unit disk cover problem is to select a minimum subset of the disks to cover the points. This problem is NP-hard [11] and the best previous practical solution is a 38-approximation algorithm by Carmi et al. [4]. We first consider the line-separable discrete unit disk cover problem (the set of disk centres can be separated from the set of points by a line), for which we present an O(m^2 n)-time algorithm that finds an exact solution. Combining our line-separable algorithm with techniques from the algorithm of Carmi et al. [4] results in an O(m^2 n^4)-time 22-approximate solution to the discrete unit disk cover problem.
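
For contrast with the exact line-separable algorithm and the 22-approximation, the following sketch is just the classical greedy set-cover baseline applied to discrete unit disk cover (an O(log n)-approximation); it is not the algorithm from the paper.

    def greedy_disk_cover(disks, points, r=1.0):
        # Repeatedly pick the disk covering the most uncovered points.
        def covers(c, p):
            return (c[0] - p[0]) ** 2 + (c[1] - p[1]) ** 2 <= r * r
        uncovered = set(range(len(points)))
        chosen = []
        while uncovered:
            best = max(range(len(disks)),
                       key=lambda d: sum(covers(disks[d], points[i])
                                         for i in uncovered))
            gained = {i for i in uncovered if covers(disks[best], points[i])}
            if not gained:
                break                    # remaining points are uncoverable
            chosen.append(best)
            uncovered -= gained
        return chosen

    # Three disks, each covering the single point below its centre.
    print(greedy_disk_cover([(0, 1), (2, 1), (4, 1)],
                            [(0, 0.5), (2, 0.5), (4, 0.5)]))  # [0, 1, 2]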


Journal of Discrete Algorithms | 2005

Bit-parallel (δ, γ)-matching and suffix automata

Maxime Crochemore; Costas S. Iliopoulos; Gonzalo Navarro; Yoan J. Pinzón; Alejandro Salinger

(δ,γ)-matching is a string matching problem with applications to music retrieval. The goal is, given a pattern P_1…m and a text T_1…n on an alphabet of integers, to find the occurrences P′ of the pattern in the text such that (i) for all 1 ≤ i ≤ m, |P_i − P′_i| ≤ δ, and (ii) Σ_{1 ≤ i ≤ m} |P_i − P′_i| ≤ γ. The problem makes sense for δ ≤ γ ≤ δm. Several techniques for (δ,γ)-matching have been proposed, based on bit-parallelism or on skipping characters. We first present an O(mn log(γ)/w) worst-case time and O(n) average-case time bit-parallel algorithm (where w is the number of bits in the computer word). It improves the previous O(mn log(δm)/w) worst-case time algorithm of the same type. Second, we combine our bit-parallel algorithm with suffix automata to obtain the first algorithm that skips characters using both δ and γ. This algorithm examines fewer characters than any previous approach, as the others perform just δ-matching and check the γ-condition on the candidates. We implemented our algorithms and obtained experimental results on real music, showing that our algorithms are superior to the current alternatives for high values of δ.
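
The matching condition itself is easy to state in code. Below is a naive O(mn) reference scan over integer strings that checks both the per-symbol δ bound and the aggregate γ bound; the paper's bit-parallel and suffix-automaton algorithms are what make this fast, and are not reproduced here.

    def dg_match(pattern, text, delta, gamma):
        # Report every position j where pattern (δ,γ)-matches text[j:j+m]:
        # each symbol differs by at most delta, and the differences sum
        # to at most gamma.
        m, n = len(pattern), len(text)
        hits = []
        for j in range(n - m + 1):
            diffs = [abs(pattern[i] - text[j + i]) for i in range(m)]
            if max(diffs) <= delta and sum(diffs) <= gamma:
                hits.append(j)
        return hits

    # e.g. melodies match within small per-note and total pitch error
    print(dg_match([60, 62, 64], [59, 62, 65, 60, 62, 64],
                   delta=1, gamma=2))  # [0, 3]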


International Symposium on Algorithms and Computation | 2009

Untangled Monotonic Chains and Adaptive Range Search

Diego Arroyuelo; Francisco Claude; Reza Dorrigiv; Stephane Durocher; Meng He; Alejandro López-Ortiz; J. Ian Munro; Patrick K. Nicholson; Alejandro Salinger; Matthew Skala

We present the first adaptive data structure for two-dimensional orthogonal range search. Our data structure is adaptive in the sense that it gives improved search performance for data that is better than the worst case (Demaine et al., 2000) [8]; in this case, data with more inherent sortedness. Given n points on the plane, the linear-space data structure can answer range queries in O(log n + k + m) time, where m is the number of points in the output and k is the minimum number of monotonic chains into which the point set can be decomposed, which is O(n) in the worst case. Our result matches the worst-case performance of other optimal-time linear-space data structures, or surpasses them when k = o(n). Our data structure can be made implicit, requiring no extra space beyond that of the data points themselves (Munro and Suwanda, 1980) [16], in which case the query time becomes O(k log n + m). We also present a novel algorithm of independent interest to decompose a point set into a minimum number of untangled, similarly directed monotonic chains in O(k^2 n + n log n) time.
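
A simplified version of the chain decomposition is sketched below: a patience-sorting-style greedy that splits points, swept by x, into a minimum number of chains that are non-decreasing in y. The "untangled, similarly directed" refinement from the paper is not modeled; this only illustrates the parameter k.

    from bisect import bisect_right

    def monotonic_chains(points):
        # tails[i] is the last y-value of chains[i]; tails stays sorted,
        # so the rightmost chain whose tail is <= y is found by bisection.
        points = sorted(points)              # sweep left to right by x
        tails, chains = [], []
        for x, y in points:
            i = bisect_right(tails, y) - 1
            if i < 0:
                tails.insert(0, y)           # y below every tail: new chain
                chains.insert(0, [(x, y)])
            else:
                tails[i] = y                 # extend; sorted order preserved
                chains[i].append((x, y))
        return chains

    print(monotonic_chains([(1, 3), (2, 1), (3, 4), (4, 2), (5, 5)]))
    # k = 2 chains: [[(2, 1), (4, 2)], [(1, 3), (3, 4), (5, 5)]]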


Conference on Innovations in Theoretical Computer Science | 2012

Paging for multi-core shared caches

Alejandro López-Ortiz; Alejandro Salinger

Paging for multi-core processors extends the classical paging problem to a setting in which several processes simultaneously share the cache. Recently, Hassidim proposed a model for multi-core paging [25], studying cache eviction policies for multi-cores under the traditional competitive analysis metric and showing that LRU is not competitive against an offline policy that has the power to arbitrarily delay request sequences to its advantage. While Hassidim brought attention to this problem, an effective and realistic model with accompanying competitive caching algorithms remains to be introduced. In this paper we propose a more conventional model in which requests must be served as they arrive. We study the problem of minimizing the number of faults, deriving bounds on the competitive ratios of natural strategies to manage the cache. We show that traditional online paging algorithms are not competitive in our model. We then study the offline paging problem and show that deciding whether a set of request sequences can be served such that at any given time each sequence has faulted at most a given number of times is NP-complete, and that its optimization version is APX-hard (for an unbounded number of sequences). We show as well that although offline algorithms can benefit from properly aligning future requests by means of faults, an algorithm that does so by forcing faults on pages that it has in its cache has no advantage over an honest algorithm that evicts pages only when faults occur. Lastly, we describe offline algorithms for the decision problem and for minimizing the total number of faults that run in polynomial time in the length of the sequences.
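
To make the model concrete, here is a toy simulator in which one shared LRU cache serves several request sequences, with requests served as they arrive, counting faults per sequence. The round-robin interleaving and the disjoint per-process address spaces are assumptions of this demo, not part of the paper's model.

    from collections import OrderedDict

    def shared_lru_faults(sequences, cache_size):
        # One shared LRU cache; pages are namespaced per process so the
        # sequences never share pages.
        cache = OrderedDict()                  # (proc, page) in LRU order
        faults = [0] * len(sequences)
        cursors = [0] * len(sequences)
        pending = set(range(len(sequences)))
        while pending:
            for i in sorted(pending):          # round-robin over sequences
                if cursors[i] == len(sequences[i]):
                    pending.discard(i)
                    continue
                key = (i, sequences[i][cursors[i]])
                cursors[i] += 1
                if key in cache:
                    cache.move_to_end(key)     # hit: refresh recency
                else:
                    faults[i] += 1             # fault: load the page
                    if len(cache) >= cache_size:
                        cache.popitem(last=False)  # evict least recent
                    cache[key] = None
        return faults

    print(shared_lru_faults(["abcabc", "xyxy"], cache_size=3))  # [6, 4]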


Algorithms and Applications | 2010

Fast intersection algorithms for sorted sequences

Ricardo A. Baeza-Yates; Alejandro Salinger

This paper presents and analyzes a simple intersection algorithm for sorted sequences that is fast on average. It is related to the multiple searching problem and to merging. We present the worst- and average-case analysis, showing that in the former the complexity nicely adapts to the smallest list size, while in the latter the algorithm performs fewer comparisons than the total number of elements of both inputs, n and m, when n = αm (α > 1), achieving O(m log(n/m)) complexity. The algorithm is motivated by its application to fast query processing in Web search engines, where large intersections, or differences, must be performed fast. In this case we experimentally show that the algorithm is faster than previous solutions.
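
As a worked note (ours, not the paper's), a standard information-theoretic argument shows why O(m log(n/m)) is essentially the best possible for comparison-based intersection when n ≫ m: the algorithm must distinguish all interleavings of the two sorted inputs, so

    \log_2 \binom{n+m}{m} \;=\; \Theta\!\left( m \log \frac{n}{m} \right) \qquad (n \gg m)

comparisons are required in the worst case, which the average-case O(m log(n/m)) bound matches up to constants.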


Workshop on Approximation and Online Algorithms | 2012

Minimizing Cache Usage in Paging

Alejandro López-Ortiz; Alejandro Salinger

Traditional paging models seek algorithms that maximize their performance while using the maximum amount of cache resources available. However, in many applications this resource is shared or its usage involves a cost. In this work we introduce the Minimum Cache Usage problem, which is an extension to the classic paging problem that accounts for the efficient use of cache resources by paging algorithms. In this problem, the cost of a paging algorithm is a combination of both its number of faults and the amount of cache it uses, where the relative cost of faults and cache usage can vary with the application. We present a simple family of online paging algorithms that adapt to the ratio α between cache and fault costs, achieving competitive ratios that vary with α, and that are between 2 and the cache size k. Furthermore, for sequences with high locality of reference, we show that the competitive ratio is at most 2, and provide evidence of the competitiveness of our algorithms on real world traces. Finally, we show that the offline problem admits a polynomial time algorithm. In doing so, we define a reduction of paging with cache usage to weighted interval scheduling on identical machines.
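
The cost model is easy to prototype. The sketch below charges cost = faults + α · (page-steps of cache held) and uses a simple "evict after an idle window" policy; both the charging units and the window policy are our assumptions for illustration, not the algorithms from the paper.

    def cache_usage_cost(requests, window, alpha):
        # Keep a page cached only for `window` steps after its last use;
        # cost combines faults with accumulated cache occupancy.
        last_used = {}            # page -> last time requested
        in_cache = set()
        faults = 0
        usage = 0                 # page-steps of cache occupied
        for t, page in enumerate(requests):
            # Drop pages idle for more than `window` steps.
            in_cache = {p for p in in_cache if t - last_used[p] <= window}
            if page not in in_cache:
                faults += 1
                in_cache.add(page)
            last_used[page] = t
            usage += len(in_cache)
        return faults + alpha * usage

    print(cache_usage_cost("abcabc", window=2, alpha=0.1))
    # faults=6, usage=15 -> 7.5 with these toy numbers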

Collaboration


Dive into Alejandro Salinger's collaboration.

Top Co-Authors


Szymon Grabowski

Lodz University of Technology
