Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Tamar Avraham is active.

Publication


Featured research published by Tamar Avraham.


IEEE Transactions on Pattern Analysis and Machine Intelligence | 2010

Esaliency (Extended Saliency): Meaningful Attention Using Stochastic Image Modeling

Tamar Avraham; Michael Lindenbaum

Computer vision attention processes assign variable hypothesized importance to different parts of the visual input and direct the allocation of computational resources. This nonuniform allocation might help accelerate the image analysis process. This paper proposes a new bottom-up attention mechanism. Rather than taking the traditional approach, which tries to model human attention, we propose a validated stochastic model to estimate the probability that an image part is of interest. We refer to this probability as saliency and thus specify saliency in a mathematically well-defined sense. The model quantifies several intuitive observations, such as the greater likelihood of correspondence between visually similar image regions and the likelihood that only a few interesting objects will be present in the scene. The latter observation, which implies that such objects are (relaxed) global exceptions, replaces the traditional preference for local contrast. The algorithm starts with a rough preattentive segmentation and then uses a graphical model approximation to efficiently reveal which segments are more likely to be of interest. Experiments on natural scenes containing a variety of objects demonstrate the proposed method and show its advantages over previous approaches.
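To make the "global exception" intuition concrete, here is a minimal Python sketch. It is an illustration only, not the paper's graphical-model inference; the Gaussian similarity and all names are assumptions.

import numpy as np

def esaliency_sketch(features, sigma=1.0):
    """Toy illustration of the 'global exception' idea.

    features: (n_segments, d) descriptors from a rough pre-attentive
    segmentation. A segment that resembles many other segments is
    unlikely to be a rare object of interest, so its saliency is low;
    a (relaxed) global exception scores high.
    """
    # Pairwise squared distances between segment descriptors.
    d2 = ((features[:, None, :] - features[None, :, :]) ** 2).sum(-1)
    sim = np.exp(-d2 / (2 * sigma ** 2))
    np.fill_diagonal(sim, 0.0)      # ignore self-similarity
    support = sim.sum(axis=1)       # how "common" each segment looks
    return 1.0 / (1.0 + support)    # rare segments -> high saliency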


International Conference on Computer Vision | 2012

Learning implicit transfer for person re-identification

Tamar Avraham; Ilya Gurvich; Michael Lindenbaum; Shaul Markovitch

This paper proposes a novel approach for pedestrian re-identification. Previous re-identification methods follow one of three approaches: searching for invariant features; designing metrics that aim to bring instances of shared identities close to one another and instances of different identities far from one another; or learning a transformation from the appearance in one domain to the other. Our implicit approach models camera transfer by a binary relation R = {(x,y) | x and y describe the same person seen from cameras A and B, respectively}. This formulation implies that the camera transfer function is a multi-valued mapping rather than a single-valued transformation, and it does not assume the existence of a metric with desirable properties. We present an algorithm that follows this approach and achieves new state-of-the-art performance.
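A minimal sketch of this pair-classification view, assuming appearance feature vectors have already been extracted; scikit-learn's SVC stands in for the paper's actual classifier, and all names here are hypothetical.

import numpy as np
from sklearn.svm import SVC

def train_camera_transfer(feats_A, feats_B, same_pairs, diff_pairs):
    """Model R = {(x, y) : x (camera A) and y (camera B) show the same
    person} as a binary classifier over concatenated appearance pairs.
    Sketch only; the paper's features and classifier differ."""
    X = np.array([np.concatenate([feats_A[i], feats_B[j]])
                  for i, j in same_pairs + diff_pairs])
    y = np.array([1] * len(same_pairs) + [0] * len(diff_pairs))
    return SVC(kernel="rbf", probability=True).fit(X, y)

def rank_gallery(clf, query_A, gallery_B):
    """Score every camera-B gallery appearance against one camera-A
    query; a higher probability means more likely the same person."""
    pairs = np.array([np.concatenate([query_A, g]) for g in gallery_B])
    return clf.predict_proba(pairs)[:, 1]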


IEEE Journal of Selected Topics in Signal Processing | 2011

Ultrawide Foveated Video Extrapolation

Tamar Avraham; Yoav Y. Schechner

Consider the task of creating a very wide visual extrapolation, i.e., a synthetic continuation of the field of view much beyond the acquired data. Existing related methods deal mainly with filling in holes in images and video. These methods are very time consuming and often prone to noticeable artifacts. The probability of artifacts grows as the synthesized regions become more distant from the domain of the raw video. Therefore, such methods do not lend themselves easily to very large extrapolations. We suggest an approach to enable this task. First, an improved completion algorithm that rejects peripheral distractions significantly reduces attention-drawing artifacts. Second, a foveated video extrapolation approach exploits weaknesses of the human visual system in order to enable efficient extrapolation of video, while further reducing attention-drawing artifacts. Consider a screen showing the raw video. Let the region beyond the raw video domain reside outside the field corresponding to the viewer's fovea. Then, the farther the extrapolated synthetic region is from the raw field of view, the more the spatial resolution can be reduced. This enables image synthesis using spatial blocks that become gradually coarser and significantly fewer (per unit area) as the extrapolated region expands. The substantial reduction in the number of synthesized blocks notably speeds the process and increases the probability of success without distracting artifacts. Furthermore, results supporting the foveated approach are obtained by a user study.
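The resolution falloff can be illustrated with a toy rule mapping a block's eccentricity to its synthesis block size; the constants and the square-root falloff below are assumptions for illustration, not the paper's calibrated profile.

import numpy as np

def block_size(ecc, base=8, gain=0.5, max_size=64):
    """Coarser synthesis blocks farther from the raw field of view.

    ecc: distance (pixels) of a synthesized block from the boundary of
    the original frame. The fovea watches the raw video, so peripheral
    regions tolerate lower resolution; block size grows (and the number
    of blocks per unit area shrinks) with eccentricity."""
    return int(min(max_size, base * (1.0 + gain * np.sqrt(ecc))))

Since per-block synthesis cost is roughly constant, fewer and coarser blocks at the periphery directly translate into the reported speedup.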


IEEE Transactions on Pattern Analysis and Machine Intelligence | 2006

Attention-based dynamic visual search using inner-scene similarity: algorithms and bounds

Tamar Avraham; Michael Lindenbaum

A visual search is required when applying a recognition process on a scene containing multiple objects. In such cases, we would like to avoid an exhaustive sequential search. This work proposes a dynamic visual search framework based mainly on inner-scene similarity. Given a number of candidates (e.g., subimages), our hypothesis is that more visually similar candidates are more likely to have the same identity. We use this assumption for determining the order of attention. Both deterministic and stochastic approaches, relying on this hypothesis, are considered. Under the deterministic approach, we suggest a measure similar to Kolmogorov's ε-covering that quantifies the difficulty of a search task. We show that this measure bounds the performance of all search algorithms and suggest a simple algorithm that meets this bound. Under the stochastic approach, we model the identity of the candidates as a set of correlated random variables and derive a search procedure based on linear estimation. Several experiments are presented in which the statistical characteristics, search algorithm, and bound are evaluated and verified.
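The following Python sketch conveys the flavor of the stochastic approach, with a crude similarity-weighted update standing in for the paper's linear estimator; the function names and the update rule are assumptions.

import numpy as np

def dynamic_search(similarity, recognize):
    """Attend to candidates in an order driven by inner-scene similarity.

    similarity: (n, n) matrix; similarity[i, j] is high when candidates
    i and j look alike (and so, by hypothesis, likely share identity).
    recognize: expensive oracle; recognize(i) -> True iff candidate i
    is a target."""
    n = similarity.shape[0]
    score = np.full(n, 0.5)              # prior belief per candidate
    visited = np.zeros(n, dtype=bool)
    order = []
    for _ in range(n):
        i = int(np.argmax(np.where(visited, -np.inf, score)))
        order.append(i)
        visited[i] = True
        label = 1.0 if recognize(i) else 0.0
        # Pull unvisited candidates toward the observed label in
        # proportion to their similarity to candidate i.
        w = similarity[i] * ~visited
        score = score + w * (label - score)
    return order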


Computer Vision and Pattern Recognition | 2017

3D Point Cloud Registration for Localization Using a Deep Neural Network Auto-Encoder

Gil Elbaz; Tamar Avraham; Anath Fischer

We present an algorithm for registration between a large-scale point cloud and a close-proximity scanned point cloud, providing a localization solution that is fully independent of prior information about the initial positions of the two point cloud coordinate systems. The algorithm, denoted LORAX, selects super-points (local subsets of points) and describes the geometric structure of each with a low-dimensional descriptor. These descriptors are then used to infer potential matching regions for an efficient coarse registration process, followed by a fine-tuning stage. The set of super-points is selected by covering the point clouds with overlapping spheres and then filtering out those covering low-quality or nonsalient regions. The descriptors are computed using state-of-the-art unsupervised machine learning, utilizing the technology of deep neural network based auto-encoders. This novel framework provides a strong alternative to the common practice of using manually designed key-point descriptors for coarse point cloud registration. Utilizing super-points instead of key-points allows the available geometrical data to be better exploited to find the correct transformation. Encoding local 3D geometric structures using a deep neural network auto-encoder instead of traditional descriptors continues the trend seen in other computer vision applications and indeed leads to superior results. The algorithm is tested on challenging point cloud registration datasets, and its advantages over previous approaches as well as its robustness to density changes, noise, and missing data are shown.
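For context, the coarse alignment step in such pipelines typically reduces to fitting a rigid transform to matched super-point centroids. Below is a standard least-squares (Kabsch) sketch; the descriptor matching itself is assumed done, and this is not LORAX's full pipeline.

import numpy as np

def rigid_fit(src, dst):
    """Least-squares rigid transform mapping src -> dst (Kabsch).

    src, dst: (k, 3) matched super-point centroids, e.g. paired by
    nearest-neighbor matching of auto-encoder descriptors."""
    cs, cd = src.mean(0), dst.mean(0)
    H = (src - cs).T @ (dst - cd)        # cross-covariance
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:             # avoid reflections
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = cd - R @ cs
    return R, t                          # aligns via x' = R @ x + t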


International Conference on Computational Photography | 2011

Multiscale ultrawide foveated video extrapolation

Amit Aides; Tamar Avraham; Yoav Y. Schechner

Video extrapolation is the task of extending a video beyond its original field of view. Extrapolating video in a manner that is consistent with the original video and visually pleasing is difficult. In this work we aim at very wide video extrapolation, which increases the complexity of the task. Some video extrapolation methods simplify the task by using a rough color extrapolation. A recent approach focuses on artifact avoidance and run-time reduction using foveated video extrapolation, but fails to preserve the structure of the scene. This paper introduces a multi-scale method which combines a coarse-to-fine approach with foveated video extrapolation. Foveated video extrapolation reduces the effective number of pixels that need to be extrapolated, making the extrapolation less time consuming and less prone to artifacts. The coarse-to-fine approach better preserves the structure of the scene while preserving finer details near the domain of the input video. The combined method yields improvements in both visual quality and processing time.
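A skeleton of the coarse-to-fine control flow might look as follows, with simple border replication standing in for the paper's patch-based synthesis; the synthesis stand-in and all constants are assumptions.

import cv2

def multiscale_extrapolate(frame, pad, levels=3):
    """Coarse-to-fine extrapolation skeleton (sketch only).

    The coarsest level is extrapolated cheaply (here: border
    replication as a stand-in), then each finer level upsamples the
    coarse result, so global scene structure is fixed early and finer
    detail is added near the original field of view."""
    pyramid = [frame]
    for _ in range(levels - 1):
        pyramid.append(cv2.pyrDown(pyramid[-1]))
    p = pad // (2 ** (levels - 1))
    out = cv2.copyMakeBorder(pyramid[-1], p, p, p, p,
                             cv2.BORDER_REPLICATE)
    for lvl in range(levels - 2, -1, -1):
        p = pad // (2 ** lvl)
        h, w = pyramid[lvl].shape[:2]
        out = cv2.resize(out, (w + 2 * p, h + 2 * p))
        out[p:p + h, p:p + w] = pyramid[lvl]  # keep the raw video exact
    return out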


British Machine Vision Conference | 2013

Transitive Re-Identification

Yulia Brand; Tamar Avraham; Michael Lindenbaum

Person re-identification accuracy can be significantly improved given a training set that demonstrates changes in appearances associated with the two non-overlapping cameras involved. Here we test whether this advantage can be maintained when directly annotated training sets are not available for all camera pairs at the site. Given a training set capturing correspondences between cameras A and B and a different training set capturing correspondences between cameras B and C, the Transitive Re-IDentification algorithm (TRID) suggested here provides a classifier for (A,C) appearance pairs. The proposed method is based on statistical modeling and uses a marginalization process for the inference. This approach significantly reduces the annotation effort inherent in a learning system, which goes down from O(N²) to O(N) for a site containing N cameras. Moreover, when adding camera N+1, only one inter-camera training set is required for establishing all correspondences. In our experiments we found that the method is effective and more accurate than the competing camera-invariant approach.
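The marginalization idea can be sketched as a Monte-Carlo sum over camera-B appearances; p_ab and p_bc below are hypothetical pairwise likelihood models learned from the (A,B) and (B,C) training sets, and the paper's actual statistical model is more involved.

import numpy as np

def trid_score(x_a, x_c, b_samples, p_ab, p_bc):
    """Transitive (A,C) matching score by marginalizing over camera B.

    b_samples: candidate camera-B appearances to marginalize over.
    p_ab(x_a, x_b), p_bc(x_b, x_c): same-person likelihoods for the
    (A,B) and (B,C) camera pairs. Approximates
        score(x_a, x_c) ~ E_b[ p_ab(x_a, b) * p_bc(b, x_c) ]."""
    return np.mean([p_ab(x_a, b) * p_bc(b, x_c) for b in b_samples])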


European Conference on Computer Vision | 2004

Dynamic Visual Search Using Inner-Scene Similarity: Algorithms and Inherent Limitations

Tamar Avraham; Michael Lindenbaum

A dynamic visual search framework based mainly on inner-scene similarity is proposed. Algorithms as well as measures quantifying the difficulty of search tasks are suggested. Given a number of candidates (e.g., sub-images), our basic hypothesis is that more visually similar candidates are more likely to have the same identity. Both deterministic and stochastic approaches, relying on this hypothesis, are used to quantify this intuition. Under the deterministic approach, we suggest a measure similar to Kolmogorov's ε-covering that quantifies the difficulty of a search task and bounds the performance of all search algorithms. We also suggest a simple algorithm that meets this bound. Under the stochastic approach, we model the identities of the candidates as correlated random variables and characterize the task using its second-order statistics. We derive a search procedure based on minimum-MSE linear estimation. Simple extensions enable the algorithm to use top-down and/or bottom-up information, when available.
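The covering-style difficulty measure can be approximated greedily. This sketch counts how many ε-balls in appearance space are needed to cover all candidates; it is a simple upper bound for illustration, not the paper's exact measure.

import numpy as np

def covering_number(features, eps):
    """Greedy estimate of the ε-covering number of a candidate set:
    the number of balls of radius eps (in appearance space) needed to
    cover all candidates. Intuitively, a scene whose candidates cluster
    into few balls is easier to search, since one recognition result
    informs the whole cluster."""
    remaining = list(range(len(features)))
    centers = []
    while remaining:
        c = remaining[0]
        centers.append(c)
        remaining = [i for i in remaining
                     if np.linalg.norm(features[i] - features[c]) > eps]
    return len(centers)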


Person Re-Identification | 2014

Learning Appearance Transfer for Person Re-identification

Tamar Avraham; Michael Lindenbaum

In this chapter we review methods that model the transfer a person’s appearance undergoes when passing between two cameras with non-overlapping fields of view. While many recent studies deal with re-identifying a person at any new location and search for universal signatures and metrics, here we focus on solutions for the natural setup of surveillance systems in which the cameras are specific and stationary, solutions which exploit the limited transfer domain associated with a specific camera pair. We compare the performance of explicit transfer modeling, implicit transfer modeling, and camera-invariant methods. Although explicit transfer modeling is advantageous over implicit transfer modeling when the inter-camera training data are poor, implicit camera transfer, which can model multi-valued mappings and better utilize negative training data, is advantageous when a larger training set is available. While camera-invariant methods have the advantage of not relying on specific inter-camera training data, they are outperformed by both camera-transfer approaches when sufficient training data are available. We therefore conclude that camera-specific information is very informative for improving re-identification in sites with static non-overlapping cameras and that it should still be considered even with the improvement of camera-invariant methods.


European Conference on Computer Vision | 2016

Interpreting the Ratio Criterion for Matching SIFT Descriptors

Avi Kaplan; Tamar Avraham; Michael Lindenbaum

Matching keypoints by minimizing the Euclidean distance between their SIFT descriptors is an effective and extremely popular technique. Using the ratio between distances, as suggested by Lowe, is even more effective and leads to excellent matching accuracy. Probabilistic approaches that model the distribution of the distances were found effective as well. This work focuses, for the first time, on analyzing Lowe’s ratio criterion using a probabilistic approach. We provide two alternative interpretations of this criterion, which show that it is not only an effective heuristic but can also be formally justified. The first interpretation shows that Lowe’s ratio corresponds to a conditional probability that the match is incorrect. The second shows that the ratio corresponds to the Markov bound on this probability. The interpretations make it possible to slightly increase the effectiveness of the ratio criterion, and to obtain matching performance that exceeds all previous (non-learning based) results.
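The criterion itself is easy to state in code. Here is a minimal brute-force sketch of the ratio test; the 0.8 threshold is a common choice (suggested by Lowe), not necessarily the paper's operating point.

import numpy as np

def ratio_match(desc1, desc2, ratio=0.8):
    """Lowe's ratio criterion for matching SIFT descriptors.

    For each descriptor in desc1, find its nearest and second-nearest
    neighbors in desc2 and accept the match only when
    d_nearest / d_second < ratio. Per the paper's analysis, this ratio
    can be read as (a bound on) the probability that the match is
    incorrect, which is why thresholding it works so well."""
    matches = []
    for i, d in enumerate(desc1):
        dist = np.linalg.norm(desc2 - d, axis=1)
        j1, j2 = np.argsort(dist)[:2]
        if dist[j1] < ratio * dist[j2]:
            matches.append((i, int(j1)))
    return matches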

Collaboration


Dive into Tamar Avraham's collaborations.

Top Co-Authors

Michael Lindenbaum
Technion – Israel Institute of Technology

Anath Fischer
Technion – Israel Institute of Technology

Ilya Gurvich
Technion – Israel Institute of Technology

Yoav Y. Schechner
Technion – Israel Institute of Technology

Alfred M. Bruckstein
Technion – Israel Institute of Technology

Amit Aides
Technion – Israel Institute of Technology

Avi Kaplan
Technion – Israel Institute of Technology

Gil Elbaz
Technion – Israel Institute of Technology

Ishai Shvartz
Technion – Israel Institute of Technology