João Graça | Researchain

Publication

Featured research published by João Graça.

Computational Linguistics | 2010

Learning tractable word alignment models with complex constraints

João Graça; Kuzman Ganchev; Ben Taskar

Word-level alignment of bilingual text is a critical resource for a growing variety of tasks. Probabilistic models for word alignment present a fundamental trade-off between richness of captured constraints and correlations versus efficiency and tractability of inference. In this article, we use the Posterior Regularization framework (Graça, Ganchev, and Taskar 2007) to incorporate complex constraints into probabilistic models during learning without changing the efficiency of the underlying model. We focus on the simple and tractable hidden Markov model, and present an efficient learning algorithm for incorporating approximate bijectivity and symmetry constraints. Models estimated with these constraints produce a significant boost in performance as measured by both precision and recall of manually annotated alignments for six language pairs. We also report experiments on two different tasks where word alignments are required: phrase-based machine translation and syntax transfer, and show promising improvements over standard methods.

Journal of Artificial Intelligence Research | 2011

Controlling complexity in part-of-speech induction

João Graça; Kuzman Ganchev; Luísa Coheur; Fernando Pereira; Benjamin Taskar

We consider the problem of fully unsupervised learning of grammatical (part-of-speech) categories from unlabeled text. The standard maximum-likelihood hidden Markov model for this task performs poorly, because of its weak inductive bias and large model capacity. We address this problem by refining the model and modifying the learning objective to control its capacity via parametric and non-parametric constraints. Our approach enforces word-category association sparsity, adds morphological and orthographic features, and eliminates hard-to-estimate parameters for rare words. We develop an efficient learning algorithm that is not much more computationally intensive than standard training. We also provide an open-source implementation of the algorithm. Our experiments on five diverse languages (Bulgarian, Danish, English, Portuguese, Spanish) achieve significant improvements compared with previous methods for the same task.

The Prague Bulletin of Mathematical Linguistics | 2009

PostCAT - Posterior Constrained Alignment Toolkit

João Graça; Kuzman Ganchev; Ben Taskar

In this paper we present a new open-source toolkit for statistical word alignment, the Posterior Constrained Alignment Toolkit (PostCAT). The toolkit implements three well-known word alignment algorithms (IBM M1, IBM M2, HMM) as well as six new models. In addition to the usual Viterbi decoding scheme, the toolkit provides posterior decoding with several options for tuning the threshold. The toolkit also provides an implementation of alignment symmetrization heuristics and a set of utilities for analyzing and pretty-printing alignments. The new models have already been shown to improve intrinsic alignment metrics and also to lead to better translations when integrated into a state-of-the-art machine translation system. The toolkit is developed in Java and available in source form at its website. We encourage other researchers to build on our work by modifying the toolkit and using it for their research.
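The posterior decoding and symmetrization steps mentioned in this abstract can be illustrated with a minimal sketch. This is not PostCAT's code or API (the toolkit is written in Java). The function names `posterior_decode` and `symmetrize`, the toy posterior matrices, and the threshold values are all hypothetical, and only the two simplest symmetrization heuristics (intersection and union) are shown.

```python
def posterior_decode(posteriors, threshold):
    """Keep every (source, target) link whose posterior is at least `threshold`.

    `posteriors[i][j]` is assumed to hold the model's posterior probability
    that source word i is aligned to target word j.
    """
    return {
        (i, j)
        for i, row in enumerate(posteriors)
        for j, p in enumerate(row)
        if p >= threshold
    }


def symmetrize(forward_links, backward_links, mode="intersection"):
    """Combine link sets from two directional models (both indexed source, target)."""
    if mode == "intersection":
        return forward_links & backward_links  # high precision
    if mode == "union":
        return forward_links | backward_links  # high recall
    raise ValueError(f"unknown symmetrization mode: {mode}")


# Toy posteriors for a 2-word source and 2-word target sentence (made-up numbers).
forward = [[0.90, 0.05], [0.20, 0.70]]
backward = [[0.80, 0.30], [0.10, 0.60]]

fwd_links = posterior_decode(forward, threshold=0.5)
bwd_links = posterior_decode(backward, threshold=0.25)

print(sorted(symmetrize(fwd_links, bwd_links, "intersection")))
print(sorted(symmetrize(fwd_links, bwd_links, "union")))
```

Raising the threshold trades recall for precision. Unlike Viterbi decoding, which commits to a single best alignment, thresholding lets a word keep several links or none.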

International Symposium on Circuits and Systems | 2015

Live demonstration: A CMOS ASIC for precise reading of a Magnetoresistive sensor array for NDT

Diogo M. Caetano; Moisés Piedade; João Graça; Jorge R. Fernandes; Luis S. Rosado; Tiago Costa

Non-destructive testing (NDT) based on eddy currents (EC) is commonly used to detect defects in conductive materials. Usually, the system includes an emitter coil and either one receiver coil or one magnetoresistive (MR) sensor. In this work we added an interface ASIC that pre-amplifies and filters the signal from an array of MR sensors. This demo presents a new version of the work presented at the ECNDT 2014 conference in a paper entitled “A CMOS ASIC for Precise Reading of a Magnetoresistive Sensor Array for NDT”. Since this is ongoing work, improvements have been made, namely the reduction of the system thermal noise to 30 nV/√Hz, the development of a multigain amplifier, and the application of the same concept and circuit to a multichannel parallel signal acquisition system. Detection of surface and buried defects will be demonstrated in different material mock-ups.
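To put the 30 nV/√Hz figure in perspective, a noise density can be converted to an RMS voltage once a measurement bandwidth is fixed. The sketch below assumes the noise is white (flat) across the band, and the 10 kHz bandwidth is a purely illustrative value that does not come from the abstract. The helper name `rms_noise` is likewise hypothetical.

```python
import math


def rms_noise(density_v_per_rt_hz, bandwidth_hz):
    """RMS noise voltage of white noise with the given density over a bandwidth.

    For a flat spectrum, v_rms = e_n * sqrt(B).
    """
    return density_v_per_rt_hz * math.sqrt(bandwidth_hz)


DENSITY = 30e-9                  # 30 nV/sqrt(Hz), the figure quoted in the abstract
ASSUMED_BANDWIDTH_HZ = 10_000.0  # assumption for illustration only

noise_v = rms_noise(DENSITY, ASSUMED_BANDWIDTH_HZ)
print(f"{noise_v * 1e6:.2f} uV RMS over {ASSUMED_BANDWIDTH_HZ:.0f} Hz")
```

Under these assumptions the result is about 3 µV RMS. Because the RMS value scales with the square root of the bandwidth, narrowing the band by a factor of four halves the noise.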

Journal of Machine Learning Research | 2010