
Hadas Benisty

Technion – Israel Institute of Technology


Publication

Featured research published by Hadas Benisty.

international conference on acoustics, speech, and signal processing | 2014

Non-parallel voice conversion using joint optimization of alignment by temporal context and spectral distortion

Hadas Benisty; David Malah; Koby Crammer

Many voice conversion systems require parallel training sets of the source and target speakers. Non-parallel training is more complicated, as it involves evaluating the source-target correspondence along with the conversion function itself. INCA is a recently proposed method for non-parallel training, based on iterative estimation of the alignment and the conversion function. The alignment is evaluated using a simple nearest-neighbor search, which often leads to phonetically mismatched source-target pairs. We propose here a generalized approach, denoted Temporal-Context INCA (TC-INCA), based on matching temporal context vectors. We formulate the training stage as the minimization of a joint cost that considers both the context-based alignment and the conversion function. We show that TC-INCA reduces the joint cost and prove its convergence. Experimental results indicate that TC-INCA significantly improves the alignment accuracy compared to INCA. Moreover, subjective evaluations show that TC-INCA leads to improved quality of the synthesized output signals when small training sets are used.
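To make the idea of matching temporal context vectors concrete, here is a minimal sketch of a nearest-neighbor alignment in which each frame is first stacked with its neighbors. It is not the authors' implementation: the function names, the edge clamping, and the squared Euclidean distance are assumptions made for illustration, and the joint optimization with the conversion function is not shown.

```python
from typing import List, Sequence

Frame = Sequence[float]


def context_vectors(frames: List[Frame], k: int) -> List[List[float]]:
    """Stack each frame with its k neighbours on both sides (edges clamped)."""
    n = len(frames)
    stacked = []
    for t in range(n):
        vec: List[float] = []
        for dt in range(-k, k + 1):
            idx = min(max(t + dt, 0), n - 1)  # repeat first/last frame at the edges
            vec.extend(frames[idx])
        stacked.append(vec)
    return stacked


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def align(source: List[Frame], target: List[Frame], k: int = 0) -> List[int]:
    """For every source frame, return the index of the nearest target frame.

    k = 0 is a plain frame-by-frame nearest-neighbour search;
    k > 0 compares context vectors of 2k + 1 stacked frames instead.
    """
    src = context_vectors(source, k)
    tgt = context_vectors(target, k)
    return [
        min(range(len(tgt)), key=lambda j: squared_distance(s, tgt[j]))
        for s in src
    ]
```

With the toy one-dimensional target `[[0.0], [1.0], [2.0], [1.0], [0.0]]` and the falling source `[[2.0], [1.0], [0.0]]`, `align(source, target, k=0)` returns `[2, 1, 0]`, pairing the falling source with the rising half of the target, whereas `align(source, target, k=1)` returns `[2, 3, 4]`.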

ieee convention of electrical and electronics engineers in israel | 2014

Metric learning using labeled and unlabeled data for semi-supervised/domain adaptation classification

Hadas Benisty; Koby Crammer

Metric learning includes a wide range of algorithms aiming to improve classification accuracy by capturing the spatial structure of the training set. The performance of those (supervised) methods greatly depends on the amount of labeled data available for training. In practice, however, it is usually not easy to obtain a large-scale labeled set, as opposed to an unlabeled one. In this paper we propose a new method for metric learning using a small-scale labeled set and a large-scale unlabeled set. This method can be applied in two setups: a Semi-Supervised (SS) classification setup and a Domain Adaptation (DA) setup. We used two sources of handwritten digit images to demonstrate the performance of our proposed method. We show that in both SS and DA setups, the proposed method leads to fewer classification errors compared to Euclidean distance and to Large Margin Nearest Neighbor (LMNN).

ieee convention of electrical and electronics engineers in israel | 2014

Sequential voice conversion using grid-based approximation

Hadas Benisty; David Malah; Koby Crammer

Common voice conversion methods are based on Gaussian Mixture Modeling (GMM), which requires exhaustive training (typically lasting hours), often leading to ill-conditioning if the dataset used is too small. We propose a new conversion method that is trained in seconds, using either small or large scale datasets. The proposed Grid-Based (GB) method is based on sequential Bayesian tracking, by which the conversion process is expressed as a sequential estimation problem of tracking the target spectrum based on the observed source spectrum. The converted MFCC vectors are sequentially evaluated using a weighted sum of the target training set used as grid-points. To improve the perceived quality of the synthesized signals, we use a post-processing block for enhancing the global variance. Objective and subjective evaluations show that the enhanced-GB method is comparable to classic GMM-based methods, in terms of quality, and comparable to their enhanced versions, in terms of individuality.
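The sequential estimate described above, a weighted sum over grid points, can be illustrated with one step of a generic grid-based Bayesian filter. This is a sketch under stated assumptions, not the paper's method: the Gaussian transition and observation kernels, the parameters `sigma_obs` and `sigma_trans`, and the pairing of every target grid point with a source vector (`grid_src`) are all hypothetical choices made so the example is self-contained.

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def _sqdist(a: Vector, b: Vector) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def _kernel(d2: float, sigma: float) -> float:
    """Unnormalised Gaussian kernel on a squared distance."""
    return math.exp(-d2 / (2.0 * sigma ** 2))


def grid_filter_step(
    prior: List[float],
    grid_src: List[Vector],   # assumed: source vector paired with each grid point
    grid_tgt: List[Vector],   # target training vectors used as grid points
    observation: Vector,      # current source frame
    sigma_obs: float,
    sigma_trans: float,
) -> Tuple[List[float], List[float]]:
    n = len(grid_tgt)

    # Prediction: spread the previous weights with a transition kernel
    # that is normalised over destinations for every origin j.
    predicted = [0.0] * n
    for j in range(n):
        row = [_kernel(_sqdist(grid_tgt[i], grid_tgt[j]), sigma_trans) for i in range(n)]
        norm = sum(row)  # at least 1.0, because the i == j term equals 1
        for i in range(n):
            predicted[i] += prior[j] * row[i] / norm

    # Update: reweight by how well each grid point explains the observation.
    posterior = [
        predicted[i] * _kernel(_sqdist(observation, grid_src[i]), sigma_obs)
        for i in range(n)
    ]
    total = sum(posterior)
    if total == 0.0:
        posterior = [1.0 / n] * n  # numerical underflow: fall back to uniform
    else:
        posterior = [p / total for p in posterior]

    # Converted frame: posterior-weighted sum of the target grid points.
    dim = len(grid_tgt[0])
    estimate = [sum(posterior[i] * grid_tgt[i][d] for i in range(n)) for d in range(dim)]
    return posterior, estimate
```

Calling the function once per source frame, feeding each returned `posterior` back in as the next `prior`, gives a sequential estimate of the converted frames.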

Speech Communication | 2018

Discriminative Keyword Spotting for limited-data applications

Hadas Benisty; Itamar Katz; Koby Crammer; David Malah

Mobile devices are widely used around the world, frequently by people speaking local languages or dialects that are not well documented. For these languages, it might not be beneficial for commercial companies to develop Automatic Speech Recognition (ASR) systems, so users of these languages cannot utilize voice activation features (often using Keyword Spotting, KWS) of their devices. Standard KWS methods aim to statistically model the generation process of the speech signal, requiring hours of recorded and transcribed speech for training, and therefore are not adequate for limited-data scenarios. In this paper we propose a new KWS method, suitable for limited-data scenarios, which can be easily applied by developers. The proposed method uses a new histogram representation for words, obtained with respect to a pre-trained Gaussian Mixture Model (GMM). Sentences are represented by fixed-length global feature vectors, extracted from the response curves obtained by a word classifier. Word and sentence classifiers are trained using a discriminative approach, which is typically robust to training-set size. The dataset for training the GMM is easy to obtain, since no annotation is required. We compared the proposed system to a Hidden Markov Model (HMM) based system, trained under the same low-resource conditions as ours, and to a state-of-the-art ASR system, trained either in the limited-data scenario or using many hours of recorded speech. In the limited-data situation, our system performs better than both benchmarks in all experiments except for clean speech of children (CSLU dataset), where it performs as well as the HMM. Since the ASR benchmark performs poorly without enough training data, we also trained it without limiting the available data. In this case the ASR benchmark performs better when tested on speech of adults (TED-LIUM dataset of TED lectures) for all noise conditions, and our system performs better when tested on speech of children with low to moderate SNR values. The results demonstrate the advantages of the proposed system, and the conditions under which it performs better.

Archive | 2018

Research data supporting [Mapping piezoelectric response in nanomaterials using a dedicated non destructive scanning probe technique]

Yonatan Calahorra; Michael Smith; Anuja Datta; Hadas Benisty; Sohini Kar-Narayan

These files contain the raw data used to extract the ND-PFM signals in Figures 2, 3, and 4 of the manuscript. The MATLAB code used to perform the virtual lock-in operation is included.
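For readers unfamiliar with the term, a virtual (software) lock-in recovers the amplitude and phase of a signal component at a known reference frequency by multiplying the recorded signal with cosine and sine references and averaging. The sketch below is a generic textbook version written in Python for illustration; it is not the MATLAB code shipped with the dataset, and the function name and parameters are assumptions.

```python
import math
from typing import Sequence, Tuple


def lock_in(signal: Sequence[float], sample_rate: float, ref_freq: float) -> Tuple[float, float]:
    """Estimate amplitude and phase of the ref_freq component of `signal`.

    Models the component as A * cos(2*pi*ref_freq*t + phi). The estimate is
    cleanest when the record spans a whole number of reference periods.
    """
    n = len(signal)
    # In-phase (x) and quadrature (y) components: mix with the references, then average.
    x = sum(s * math.cos(2.0 * math.pi * ref_freq * i / sample_rate)
            for i, s in enumerate(signal)) / n
    y = sum(s * math.sin(2.0 * math.pi * ref_freq * i / sample_rate)
            for i, s in enumerate(signal)) / n
    amplitude = 2.0 * math.hypot(x, y)  # averaging halves the amplitude
    phase = math.atan2(-y, x)           # sign follows the cos(wt + phi) convention
    return amplitude, phase
```

The plain average acts as the low-pass filter here; a practical implementation would typically use a windowed or sliding filter to track how the amplitude changes over time.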

ieee international conference on science of electrical engineering | 2016

Enhancement of BCI classifiers through domain adaptation

Hadas Benisty; Daniel Furman; Talor Abramovich; Amir Ivry; Hillel Pratt

Clinical Brain-Computer Interface (BCI) systems seek to enable paralyzed individuals to operate devices with their brain activity. Non-invasive systems based on electroencephalographic (EEG) signals are popular since they avoid risks associated with invasive procedures, but unfortunately EEG signals are inherently noisy, making effective classifiers challenging to develop. Commonly, new classifiers are benchmarked on signals from healthy subjects executing physical movements, under the assumption that the performance will transfer to clinical cases where only imagined movements are possible. Here, we show in contrast that classifiers trained on signals associated with actual movements perform erratically when applied to signals associated with imagined movements. We suggest that this is because the signals lie in different domains. Then, to exploit the different statistical distributions, we apply a domain adaptation technique, Frustratingly Easy Domain Adaptation (FEDA), improving classifier accuracy by a third on a discrimination task that simulates the clinical condition.
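FEDA works by feature augmentation: each feature vector is copied into a shared block and a domain-specific block, so that a standard classifier can learn shared and per-domain weights. The sketch below shows only that mapping. Treating actual-movement signals as the "source" domain and imagined-movement signals as the "target" domain is an assumption made for illustration, as is the function name.

```python
from typing import List, Sequence


def feda_augment(x: Sequence[float], domain: str) -> List[float]:
    """Map x to [shared, source-only, target-only] blocks.

    source: [x, x, 0]    target: [x, 0, x]
    """
    features = list(x)
    zeros = [0.0] * len(features)
    if domain == "source":
        return features + features + zeros
    if domain == "target":
        return features + zeros + features
    raise ValueError("domain must be 'source' or 'target'")
```

For example, `feda_augment([1.0, 2.0], "source")` returns `[1.0, 2.0, 1.0, 2.0, 0.0, 0.0]` and `feda_augment([1.0, 2.0], "target")` returns `[1.0, 2.0, 0.0, 0.0, 1.0, 2.0]`. The augmented vectors from both domains are then pooled and passed to any ordinary classifier.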

international conference on applications of digital information and web technologies | 2009

Adaptive system identification using time-varying Fourier transform

Hadas Benisty; Yekutiel Avargel; Israel Cohen

In this paper, we introduce a time-varying short-time Fourier transform (TV-STFT) for representing discrete signals. We derive an explicit condition for perfect reconstruction using time-varying analysis and synthesis windows. Based on the derived representation, we propose an adaptive algorithm that controls the length of the analysis window to achieve a lower mean-square error (MSE) at each iteration. When compared to the conventional multiplicative transfer function approach with a fixed-length analysis window, the resulting algorithm achieves faster convergence without incurring a higher steady-state MSE. Experimental results demonstrate the effectiveness of the proposed approach.

conference of the international speech communication association | 2011