Yair Toaff
IBM
Publications
Featured research published by Yair Toaff.
Discrete Applied Mathematics | 2016
Lior Aronovich; Ron Asher; Danny Harnik; Michael Hirsch; Shmuel T. Klein; Yair Toaff
Large backup and restore systems may have a petabyte or more of data in their repository. Such systems are often compressed by means of deduplication techniques that partition the input text into chunks and store recurring chunks only once. One approach is to use hashing methods to store a fingerprint for each data chunk, detecting identical chunks with a very low probability of collisions. As an alternative, it has been suggested to use similarity-based instead of identity-based searches, which allows the definition of much larger chunks. This implies that the data structure needed to store the fingerprints is much smaller, so that such a system may be more scalable than systems built on the first approach. This paper deals with an extension of the second approach to systems in which it is still preferred to use small chunks. We describe the design choices made during the development of what we call an approximate hash function, which serves as the basic tool of the newly suggested deduplication system, and report on extensive tests performed on a variety of large input files.
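As a point of reference, the first (identity-based) approach mentioned in the abstract can be sketched as follows. This is only a minimal illustration and not the approximate hash function developed in the paper: the fixed chunk size, the choice of SHA-256 as fingerprint, and the names `deduplicate` and `restore` are all assumptions made for the example.

```python
import hashlib


def deduplicate(data: bytes, chunk_size: int = 8):
    """Split data into fixed-size chunks and store each distinct chunk once.

    Returns (store, recipe): store maps a fingerprint to its chunk, and
    recipe is the ordered list of fingerprints needed to rebuild the data.
    The fixed chunk size and SHA-256 are illustrative assumptions.
    """
    store = {}
    recipe = []
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        fingerprint = hashlib.sha256(chunk).digest()
        # A recurring chunk has the same fingerprint and is not stored again.
        store.setdefault(fingerprint, chunk)
        recipe.append(fingerprint)
    return store, recipe


def restore(store, recipe) -> bytes:
    """Rebuild the original data from the chunk store and the recipe."""
    return b"".join(store[fingerprint] for fingerprint in recipe)


if __name__ == "__main__":
    data = b"abcdefgh" * 3 + b"12345678"
    store, recipe = deduplicate(data)
    print(len(recipe), "chunks referenced,", len(store), "chunks stored")
    assert restore(store, recipe) == data
```

The fingerprint table grows with the number of distinct chunks, which is why small chunks make this table large and why the similarity-based alternative with larger chunks can scale better.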
Discrete Applied Mathematics | 2014
Michael Hirsch; Shmuel T. Klein; Yair Toaff
The time efficiency of many storage systems relies critically on the ability to perform a large number of evaluations of certain hashing functions fast enough. The remainder function B mod P, generally applied with a large prime number P, is often used as a building block of such hashing functions, which leads to the need to accelerate remainder evaluations, possibly using parallel processors. We suggest several improvements exploiting the mathematical properties of the remainder function, leading to iterative or hierarchical evaluations. Experimental results show a 2 to 5-fold increase in the processing speed.
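The property that makes iterative and hierarchical evaluation possible is that the remainder distributes over sums and products: if B is the concatenation of a high part H and a low part L of k bits, then B mod P = ((H mod P) · (2^k mod P) + (L mod P)) mod P. The sketch below illustrates this with one iterative (Horner-style) and one pairwise hierarchical evaluation. It is an assumption-laden illustration, not the algorithms of the paper: the function names, the big-endian block layout, and the sequential loop standing in for parallel processors are all choices made for the example.

```python
def mod_iterative(blocks, block_bits, p):
    """Horner-style evaluation of B mod p, one block at a time.

    blocks holds the big-endian blocks of B, each block_bits wide.
    """
    base = pow(2, block_bits, p)
    remainder = 0
    for block in blocks:
        remainder = (remainder * base + block) % p
    return remainder


def mod_hierarchical(blocks, block_bits, p):
    """Tree-style evaluation of B mod p by combining adjacent partial results.

    Each entry is (remainder, length in blocks). The pairs within one level
    are independent, so they could be handled by separate processors; here
    they are processed sequentially for simplicity.
    """
    if not blocks:
        return 0
    level = [(block % p, 1) for block in blocks]
    while len(level) > 1:
        next_level = []
        for i in range(0, len(level) - 1, 2):
            (high, high_len), (low, low_len) = level[i], level[i + 1]
            # Shift the high part left by the bit length of the low part.
            shift = pow(2, block_bits * low_len, p)
            next_level.append(((high * shift + low) % p, high_len + low_len))
        if len(level) % 2:
            next_level.append(level[-1])  # odd entry carries over unchanged
        level = next_level
    return level[0][0]


if __name__ == "__main__":
    payload = b"a chunk of backup data to be hashed"
    b_value = int.from_bytes(payload, "big")
    p = 1_000_000_007
    blocks = list(payload)  # 8-bit big-endian blocks
    assert mod_iterative(blocks, 8, p) == b_value % p
    assert mod_hierarchical(blocks, 8, p) == b_value % p
```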
Journal of Discrete Algorithms | 2014
Michael Hirsch; Shmuel T. Klein; Yair Toaff
New layouts are suggested for the assignment of a set of n parallel processors to perform certain tasks in several hierarchically connected layers, leading, after some initialization phase, to the full exploitation of all of the processing power all of the time. This framework is useful for a variety of string-theoretic problems, ranging from modular arithmetic, used among others in Karp-Rabin type rolling hashes and in cryptographic applications, to data compression and error-correcting codes.
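For readers unfamiliar with the Karp-Rabin type rolling hashes named above, the following is a minimal sequential sketch of the standard technique. It does not reproduce the processor layouts of the paper; the function name `rolling_hashes`, the base of 256, and the prime 1,000,000,007 are assumptions chosen for illustration.

```python
def rolling_hashes(data: bytes, window: int,
                   base: int = 256, p: int = 1_000_000_007):
    """Return the Karp-Rabin hash of every window of the given length.

    Each hash equals the window, read as a base-`base` number, modulo p.
    After the first window, every hash is derived from the previous one in
    constant time by removing the outgoing byte and adding the incoming one.
    """
    if window <= 0 or len(data) < window:
        return []
    top = pow(base, window - 1, p)  # weight of the outgoing (leading) byte
    h = 0
    for byte in data[:window]:
        h = (h * base + byte) % p
    hashes = [h]
    for i in range(window, len(data)):
        outgoing = data[i - window]
        h = ((h - outgoing * top) * base + data[i]) % p
        hashes.append(h)
    return hashes


if __name__ == "__main__":
    text = b"abracadabra"
    hashes = rolling_hashes(text, 4)
    # The windows "abra" at offsets 0 and 7 are identical, so their hashes match.
    assert hashes[0] == hashes[7]
```

Each update is a small modular computation of the kind discussed above, which is what makes remainder evaluation a natural target for parallel acceleration.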
Archive | 2010
Lior Aronovich; Michael Hirsch; Yair Toaff
Archive | 2012
Lior Aronovich; Yair Toaff; Gil Paz; Ron Asher
Archive | 2013
Lior Aronovich; Ron Asher; Michael Hirsch; Shmuel T. Klein; Ehud Meiri; Yair Toaff
Archive | 2015
Michael Hirsch; Shmuel T. Klein; Yair Toaff
Archive | 2013
Lior Aronovich; Yair Toaff; Gil Paz; Ron Asher
Archive | 2012
Lior Aronovich; Yair Toaff; Gil Paz; Ron Asher
Archive | 2012
Lior Aronovich; Yair Toaff; Gil Paz; Ron Asher