Vicent Alabau
Polytechnic University of Valencia
Publication
Featured research published by Vicent Alabau.
The Prague Bulletin of Mathematical Linguistics | 2013
Vicent Alabau; Ragnar Bonk; Christian Buck; Michael Carl; Francisco Casacuberta; Mercedes García-Martínez; Jesús González; Philipp Koehn; Luis A. Leiva; Bartolomé Mesa-Lao; Daniel Ortiz; Herve Saint-Amand; Germán Sanchis; Chara Tsoukala
We describe an open-source workbench that offers advanced computer-aided translation (CAT) functionality: post-editing of machine translation (MT), interactive translation prediction (ITP), visualization of word alignment, extensive logging with replay mode, and integration with eye trackers and e-pen.
Pattern Recognition | 2014
Vicent Alabau; Alberto Sanchis; Francisco Casacuberta
On-line handwriting text recognition (HTR) could be used as a more natural way of interaction in many interactive applications. However, current HTR technology is far from delivering error-free systems and, consequently, its use in many applications is limited. Despite this, there are many scenarios, such as using HTR in a post-editing step to correct the errors of fully automatic systems, in which information from the specific task makes it possible to constrain the search and therefore to improve HTR accuracy. For example, in machine translation (MT), the on-line HTR system can also be used to correct translation errors. The HTR system can take advantage of information from the translation problem, such as the source sentence being translated, the portion of the translated sentence that has been supervised by the human, or the translation error to be amended. Empirical experimentation suggests that this information is valuable for improving the robustness of the on-line HTR system, with remarkable results.
Graphical abstract: This work presents an e-pen enabled system where handwriting is used to amend the errors of a machine translation system. Handwriting recognition is performed in such a way that the contextual information (source, prefix, translation, and error) is integrated to improve the final recognition accuracy.
Highlights: We present a specific on-line HTR system for editing machine translation (MT) output. We leverage information from different sources in MT to constrain the HTR search. All the proposed systems outperform the baseline. The use of information from the translation models achieves remarkable results. Finally, we propose a system to amend HTR errors with a 75% typing effort reduction.
international conference on multimodal interfaces | 2011
Vicent Alabau; Luis Rodríguez-Ruiz; Alberto Sanchis; Pascual Martínez-Gómez; Francisco Casacuberta
Interactive machine translation (IMT) is an increasingly popular paradigm for semi-automated machine translation, where a human expert is integrated into the core of an automatic machine translation system. The human expert interacts with the IMT system by partially correcting the errors in the system's output. Then, the system proposes a new solution. This process is repeated until the output meets the desired quality. In this scenario, the interaction is typically performed using the keyboard and the mouse. However, speech is also a very interesting input modality, since the user does not need to abandon the keyboard to interact with it. In this work, we present a new approach to speech interaction in which translation and speech inputs are tightly fused. This integration is performed early in the speech recognition step. Thus, the information from the translation models allows the speech recognition system to recover from errors that would otherwise be impossible to amend. In addition, this technique allows the use of currently available speech recognition technology. The proposed system achieves an important boost in performance with respect to previous approaches.
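The correct-and-repropose loop described in this abstract can be illustrated with a toy simulation. This is only a minimal sketch under strong assumptions, not the system from the paper: the "MT system" is a hypothetical fixed list of candidate translations (`nbest`), the user is simulated by comparing against a known target sentence, and the names `propose` and `simulate_imt` are made up for illustration.

```python
from typing import List, Optional


def propose(hypotheses: List[List[str]], prefix: List[str]) -> Optional[List[str]]:
    """Return the first candidate that starts with the validated prefix, if any."""
    for hyp in hypotheses:
        if hyp[:len(prefix)] == prefix:
            return hyp
    return None


def simulate_imt(hypotheses: List[List[str]], reference: List[str]) -> int:
    """Simulate a user who corrects the first wrong word until the output is right.

    Returns the number of corrective actions the simulated user performed.
    """
    prefix: List[str] = []
    corrections = 0
    while True:
        hyp = propose(hypotheses, prefix)
        if hyp is None:
            # Toy fallback (assumption): with no compatible candidate,
            # the system simply echoes the validated prefix.
            hyp = list(prefix)
        if hyp == reference:
            return corrections
        # Length of the longest common prefix of hypothesis and reference.
        k = 0
        while k < len(hyp) and k < len(reference) and hyp[k] == reference[k]:
            k += 1
        if k == len(reference):
            # The hypothesis only has extra trailing words: one action removes them.
            return corrections + 1
        # The user validates the correct part and types the next correct word.
        prefix = reference[:k + 1]
        corrections += 1


# Hypothetical candidate list and target sentence, for illustration only.
nbest = [
    "the house is red".split(),
    "the home is green".split(),
    "the home was green".split(),
]
target = "the home was green".split()
print(simulate_imt(nbest, target))  # 2
```

Here the simulated user corrects "house" to "home" and then "is" to "was"; after each correction the toy system proposes a new completion compatible with the validated prefix.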
Interacting with Computers | 2015
Luis A. Leiva; Vicent Alabau; Verónica Romero; Alejandro Héctor Toselli; Enrique Vidal
This is a pre-copyedited, author-produced PDF of an article accepted for publication in Interacting with Computers following peer review. The version of record is available online at: http://dx.doi.org/10.1093/iwc/iwu019
Computer Speech & Language | 2015
Antonio L. Lagarda; Daniel Ortiz-Martínez; Vicent Alabau; Francisco Casacuberta
Highlights: We present a method to customize machine translation systems when in-domain data is not available. To that end, we apply automatic post-editing with online learning on top of ready-to-use generic machine translation systems. The results show that the method is very effective on rule-based machine translation systems. On statistical machine translation systems, the method performs well if no in-domain data was used in training. Finally, if there is not enough repetition, our method has limited use.
Globalization has dramatically increased the need to translate information from one language to another. Frequently, such translation needs must be met under very tight time constraints. Machine translation (MT) techniques can constitute a solution to this overly complex problem. However, the documents to be translated in real scenarios are often limited to a specific domain, such as a particular type of medical or legal text. This situation seriously hinders the applicability of MT, since it is usually expensive to build a reliable translation system, no matter what technology is used, due to the linguistic resources that are required to build it, such as dictionaries, translation memories or parallel texts. In order to solve this problem, we propose the application of automatic post-editing in an online learning framework. Our proposed technique allows the human expert to translate in a specific domain by using a base translation system designed to work in a general domain, whose output is corrected (or adapted to the specific domain) by means of an automatic post-editing module. This automatic post-editing module learns to make its corrections from user feedback in real time by means of online learning techniques. We have validated our system using different translation technologies to implement the base translation system, as well as several texts involving different domains and languages. In most cases, our results show significant improvements in terms of BLEU (up to 16 points) with respect to the baseline systems. The proposed technique works effectively when the n-grams of the document to be translated present a certain rate of repetition, a situation that is common according to the document-internal repetition property.
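To make the notion of n-gram repetition within a document concrete, the following sketch computes one simple repetition measure. The definition used here (the fraction of n-gram occurrences that repeat an n-gram already seen in the same text) is an assumption chosen for illustration; the abstract does not state how the paper measures repetition, and the function name `ngram_repetition_rate` is hypothetical.

```python
from collections import Counter
from typing import List


def ngram_repetition_rate(tokens: List[str], n: int = 2) -> float:
    """Fraction of n-gram occurrences that repeat an n-gram seen earlier in the text."""
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    if not ngrams:
        # Text shorter than n tokens: no n-grams, so no repetition.
        return 0.0
    counts = Counter(ngrams)
    # Every occurrence beyond the first of each distinct n-gram is a repeat.
    repeated = sum(c - 1 for c in counts.values())
    return repeated / len(ngrams)


# Made-up example sentence, not taken from any corpus.
doc = "press the red button then press the green button".split()
print(ngram_repetition_rate(doc, n=2))  # 0.125
```

In the example, only the bigram "press the" occurs twice among eight bigram occurrences, so the rate is 1/8. A document with a higher rate gives an online post-editing module more chances to reuse what it learned from earlier corrections.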
international conference on multimodal interfaces | 2010
Vicent Alabau; Daniel Ortiz-Martínez; Alberto Sanchis; Francisco Casacuberta
Interactive machine translation (IMT) [1] is an alternative approach to machine translation, integrating human expertise into the automatic translation process. In this framework, a human iteratively interacts with a system until the output desired by the human is completely generated. Traditionally, interaction has been performed using a keyboard and a mouse. However, the use of touchscreens has been popularised recently. Many touchscreen devices are already on the market, namely mobile phones, laptops and tablet computers like the iPad. In this work, we propose a new interaction modality to take advantage of such devices, for which online handwritten text seems a very natural way of input. Multimodality is formulated as an extension to the traditional IMT protocol where the user can amend errors by writing text with an electronic pen or a stylus on a touchscreen. Different approaches to modality fusion have been studied. In addition, these approaches have been assessed on the Xerox task. Finally, a thorough study of the errors committed by the online handwriting recognition system points to directions for future work.
conference of the european chapter of the association for computational linguistics | 2014
Vicent Alabau; Christian Buck; Michael Carl; Francisco Casacuberta; Mercedes García-Martínez; Ulrich Germann; Jesús González-Rubio; Robin L. Hill; Philipp Koehn; Luis A. Leiva; Bartolomé Mesa-Lao; Daniel Ortiz-Martínez; Herve Saint-Amand; Germán Sanchis Trilles; Chara Tsoukala
CASMACAT is a modular, web-based translation workbench that offers advanced functionalities for computer-aided translation and the scientific study of human translation: automatic interaction with machine translation (MT) engines and translation memories (TM) to obtain raw translations or close TM matches for conventional post-editing; interactive translation prediction based on an MT engine’s search graph; detailed recording and replay of edit actions and translator’s gaze (the latter via eye-tracking); and the support of e-pen as an alternative input device. The system is open source software and interfaces with multiple MT systems.
Pattern Recognition Letters | 2014
Vicent Alabau; Carlos D. Martínez-Hinarejos; Verónica Romero; Antonio L. Lagarda
The transcription of historical documents is one of the most interesting tasks to which Handwritten Text Recognition can be applied, due to its interest for humanities research. One alternative for transcribing ancient manuscripts is speech dictation using Automatic Speech Recognition techniques. Both alternatives employ similar models (Hidden Markov Models and n-grams) and decoding processes (Viterbi decoding), which allows the two modalities to be combined with little difficulty. In this work, we explore the possibility of using the recognition results of one modality to restrict the decoding process of the other modality, and apply this process iteratively. Results of these multimodal iterative alternatives are significantly better than the baseline uni-modal systems and better than the non-iterative alternatives.
conference of the association for machine translation in the americas | 2016
Daniel Ortiz-Martínez; Jesús González-Rubio; Vicent Alabau; Germán Sanchis-Trilles; Francisco Casacuberta
This chapter describes a pilot study aiming at testing the integration of online and active learning features into the computer-assisted translation workbench developed within the CASMACAT project. These features can be used to take advantage of the new knowledge implicitly provided by human experts when they generate new translations. Online learning (OL) allows the system to learn from user feedback in real time by incrementally adapting the parameters of the statistical models involved in the translation process. Active learning (AL), on the other hand, determines which sentences need to be supervised by the user so as to maximize the final translation quality while minimizing user effort and, at the same time, improving the statistical model parameters. We investigate the effect of these features on translation productivity, using interactive translation prediction (ITP) as a baseline. ITP is a computer-assisted translation approach where the user interactively collaborates with a statistical machine translation system to generate high-quality translations. User activity data was collected from ten translators using key-logging and eye-tracking. We found that ITP with OL performs better than standard ITP, especially in terms of the typing effort required from the user to generate correct translations. Additionally, ITP with AL provides better translation quality than standard ITP for the same levels of user effort.
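The active learning idea above, choosing which sentences the user should supervise under limited effort, can be sketched with a simple selection rule. This is an illustrative assumption only: the abstract does not say which selection criterion the study used. The sketch assumes each machine-translated sentence comes with a hypothetical confidence score and picks the least confident ones up to a fixed budget; `select_for_supervision` is a made-up name.

```python
from typing import List, Sequence


def select_for_supervision(confidences: Sequence[float], budget: int) -> List[int]:
    """Return the indices of the `budget` least-confident sentences, in document order."""
    # Rank sentence indices from least to most confident (ties keep document order).
    ranked = sorted(range(len(confidences)), key=lambda i: confidences[i])
    # Keep at most `budget` of them and restore document order for presentation.
    return sorted(ranked[:max(budget, 0)])


# Hypothetical per-sentence confidence scores for a four-sentence document.
scores = [0.9, 0.2, 0.7, 0.4]
print(select_for_supervision(scores, budget=2))  # [1, 3]
```

With a budget of two, the user would be asked to supervise the second and fourth sentences, while the remaining machine translations are left as they are.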
human factors in computing systems | 2013
Luis A. Leiva; Vicent Alabau; Enrique Vidal
We present a straightforward solution to incorporate text-editing gestures into mixed-initiative user interfaces (MIUIs). Our approach provides (1) disambiguation from handwritten text, (2) editing context, (3) virtually perfect accuracy, and (4) a trivial implementation. An evaluation study with 32 e-pen users showed that our approach is suitable for production-ready environments. In addition, performance tests on a desktop PC and on a mobile device revealed that gestures are very fast to recognize (0.1 ms on average). Taken together, these results suggest that our approach can help developers deploy simple but effective, high-performance text-editing gestures.