Jorge Castro

Polytechnic University of Catalonia


Publication

Featured research published by Jorge Castro.

international colloquium on grammatical inference | 2008

Towards Feasible PAC-Learning of Probabilistic Deterministic Finite Automata

Jorge Castro; Ricard Gavaldà

We present an improvement of an algorithm due to Clark and Thollard (Journal of Machine Learning Research, 2004) for PAC-learning distributions generated by Probabilistic Deterministic Finite Automata (PDFA). Our algorithm is an attempt to keep the rigorous guarantees of the original one but use sample sizes that are not as astronomical as predicted by the theory. We prove that our algorithm indeed PAC-learns in a stronger sense than the Clark-Thollard algorithm. We also perform very preliminary experiments: we show that on a few small targets (8-10 states) it requires only hundreds of examples to identify the target. We also test the algorithm on a web logfile recording about a hundred thousand sessions from an ecommerce site, from which it is able to extract some nontrivial structure in the form of a PDFA with 30-50 states. An additional feature, which in fact partly explains the reduction in sample size, is that our algorithm does not need as input any information about the distinguishability of the target.
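For readers unfamiliar with the model, the following is a minimal sketch of what a PDFA is, not the learning algorithm from the paper. The two-state automaton, its probabilities, and the function names `sample_string` and `string_probability` are hypothetical and chosen only for illustration. Each state has a stopping probability and, for each symbol, an emission probability with a single deterministic successor state.

```python
import random

# Hypothetical two-state PDFA over the alphabet {"a", "b"}.
# state -> (stop_probability, {symbol: (emit_probability, next_state)})
# In each state, the stop probability and the emit probabilities sum to 1.
PDFA = {
    0: (0.2, {"a": (0.5, 1), "b": (0.3, 0)}),
    1: (0.4, {"a": (0.6, 0)}),
}

def sample_string(pdfa, start=0, rng=random):
    """Draw one string from the distribution defined by the PDFA."""
    state, out = start, []
    while True:
        stop_p, transitions = pdfa[state]
        r = rng.random()
        if r < stop_p:
            return "".join(out)
        acc = stop_p
        for symbol, (p, nxt) in transitions.items():
            acc += p
            if r < acc:
                out.append(symbol)
                state = nxt
                break
        else:
            # Floating-point slack: probabilities summed to slightly under 1.
            return "".join(out)

def string_probability(pdfa, s, start=0):
    """Probability that the PDFA generates exactly the string s."""
    state, prob = start, 1.0
    for ch in s:
        transitions = pdfa[state][1]
        if ch not in transitions:
            return 0.0
        p, state = transitions[ch]
        prob *= p
    # Multiply by the probability of stopping in the final state.
    return prob * pdfa[state][0]
```

Because the successor state is determined by the current state and the symbol, the probability of a string is a single product along one path, which is what makes this class easier to handle than general probabilistic automata.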

algorithmic learning theory | 2002

The consistency dimension and distribution-dependent learning from queries

José L. Balcázar; Jorge Castro; David Guijarro; Hans Ulrich Simon

We prove a new combinatorial characterization of polynomial learnability from equivalence queries, and state some of its consequences relating the learnability of a class with the learnability via equivalence and membership queries of its subclasses obtained by restricting the instance space. Then we propose and study two models of query learning in which there is a probability distribution on the instance space, both as an application of the tools developed from the combinatorial characterization and as models of independent interest.

Journal of Computer and System Sciences | 2002

A New Abstract Combinatorial Dimension for Exact Learning via Queries

José L. Balcázar; Jorge Castro; David Guijarro

We introduce an abstract model of exact learning via queries that can be instantiated to all the query learning models currently in use, while being closer to them than previous unifying attempts. We present a characterization of those Boolean function classes learnable in this abstract model, in terms of a new combinatorial notion that we introduce, the abstract identification dimension. Then we prove that the particularization of our notion to specific known protocols such as equivalence, membership, and membership and equivalence queries results in exactly the same combinatorial notions currently known to characterize learning in these models, such as strong consistency dimension, extended teaching dimension, and certificate size. Our theory thus fully unifies all these characterizations. For models enjoying a specific property that we identify, the notion can be simplified while keeping the same characterizations. From our results we can derive combinatorial characterizations of all those other models for query learning proposed in the literature. We can also obtain the first polynomial-query learning algorithms for specific interesting problems such as learning DNF with proper subset and superset queries.

electronic government | 2002

Supporting Efficient Multinational Disaster Response through a Web-Based System

Ignacio Aedo; Paloma Díaz; Camino Fernández; Jorge Castro

The current process for dealing with disaster mitigation has a number of drawbacks that can be solved using web technology. The basic problem is that there is a unidirectional and asynchronous flow of information among the different agents involved in a disaster mitigation procedure. This situation often results in a lack of coordination in the provision of resources and in assistance that turns out to be useless. In this paper we introduce ARCE, a web-based system designed to cope with the lack of synchronism among assistance requests and responses in a multinational environment such as the Latin-American Association of Governmental Organisms of Civil Defence and Protection. ARCE makes use of role-based access control (RBAC) policies and information flow mechanisms to offer an efficient and reliable communication channel.

Journal of Computer and System Sciences | 2007

A general dimension for query learning

José L. Balcázar; Jorge Castro; David Guijarro; Johannes Köbler; Wolfgang Lindner

We introduce a combinatorial dimension that characterizes the number of queries needed to exactly (or approximately) learn concept classes in various models. Our general dimension provides tight upper and lower bounds on the query complexity for all sorts of queries, not only for example-based queries as in previous works. As an application we show that for learning DNF formulas, unspecified attribute value membership and equivalence queries are not more powerful than standard membership and equivalence queries. Further, in the approximate learning setting, we use the general dimension to characterize the query complexity in the statistical query as well as the learning by distances model. Moreover, we derive close bounds on the number of statistical queries needed to approximately learn DNF formulas.

european conference on computational learning theory | 2001

A General Dimension for Exact Learning

José L. Balcázar; Jorge Castro; David Guijarro

We introduce a new combinatorial dimension that gives a good approximation of the number of queries needed to learn in the exact learning model, no matter what set of queries is used. This new dimension generalizes previous dimensions, providing upper and lower bounds for all sorts of queries, and not just for example-based queries as in previous works. Our new approach also gives simpler proofs for previous results. We present specific applications of our general dimension for the case of unspecified attribute value queries, and show that unspecified attribute value membership and equivalence queries are not more powerful than standard membership and equivalence queries for the problem of learning DNF formulas.

algorithmic learning theory | 1995

Simple PAC Learning of Simple Decision Lists

Jorge Castro; José L. Balcázar

We prove that log n-decision lists (the class of decision lists such that all their terms have low Kolmogorov complexity) are learnable in the simple PAC learning model. The proof is based on a transformation from an algorithm based on equivalence queries (found independently by Simon). Then we introduce the class of simple decision lists, and extend our algorithm to show that simple decision lists are simple-PAC learnable as well. This last result is relevant in that it is, to our knowledge, the first learning algorithm for decision lists in which an exponentially wide set of functions may be used for the terms.

algorithmic learning theory | 1999

The Consistency Dimension and Distribution-Dependent Learning from Queries (Extended Abstract)

José L. Balcázar; Jorge Castro; David Guijarro; Hans Ulrich Simon

Machine Learning | 2014

Adaptively learning probabilistic deterministic automata from data streams

Borja Balle; Jorge Castro; Ricard Gavaldà

Markovian models with hidden state are widely used formalisms for modeling sequential phenomena. Learnability of these models has been well studied when the sample is given in batch mode, and algorithms with PAC-like learning guarantees exist for specific classes of models such as Probabilistic Deterministic Finite Automata (PDFA). Here we focus on PDFA and give an algorithm for inferring models in this class in the restrictive data stream scenario: unlike existing methods, our algorithm works incrementally and in one pass, uses memory sublinear in the stream length, and processes input items in amortized constant time. We also present extensions of the algorithm that (1) reduce to a minimum the need for guessing parameters of the target distribution and (2) are able to adapt to changes in the input distribution, relearning new models when needed. We provide rigorous PAC-like bounds for all of the above. Our algorithm makes key use of stream sketching techniques for reducing memory and processing time, and is modular in that it can use different tests for state equivalence and for change detection in the stream.
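To give a flavour of what "stream sketching" means, here is a minimal sketch of a Count-Min sketch, a generic technique for approximate frequency counting in sublinear memory. The abstract does not say which sketch the authors use, so this is an assumed stand-in rather than their method; the class name, the default `width` and `depth`, and the use of salted BLAKE2b hashing are all illustrative choices.

```python
import hashlib

class CountMinSketch:
    """Approximate counts of stream items using depth * width counters."""

    def __init__(self, width=256, depth=4):
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, row, item):
        # One hash function per row, obtained by salting BLAKE2b with the row id.
        digest = hashlib.blake2b(
            item.encode("utf-8"),
            digest_size=8,
            salt=row.to_bytes(8, "little"),
        ).digest()
        return int.from_bytes(digest, "little") % self.width

    def add(self, item, count=1):
        """Process one stream item in O(depth) time."""
        for row in range(self.depth):
            self.table[row][self._index(row, item)] += count

    def estimate(self, item):
        """Return an estimate that never undercounts the true frequency."""
        return min(
            self.table[row][self._index(row, item)]
            for row in range(self.depth)
        )
```

Memory use depends only on `width` and `depth`, not on the length of the stream, and collisions can only inflate an estimate, which is why the minimum over rows is taken.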

algorithmic learning theory | 2010

A lower bound for learning distributions generated by probabilistic automata

Borja Balle; Jorge Castro; Ricard Gavaldà

Known algorithms for learning PDFA can only be shown to run in time polynomial in the so-called distinguishability µ of the target machine, besides the number of states and the usual accuracy and confidence parameters. We show that the dependence on µ is necessary for every algorithm whose structure resembles existing ones. As a technical tool, a new variant of Statistical Queries termed L∞-queries is defined. We show how these queries can be simulated from samples and observe that known PAC algorithms for learning PDFA can be rewritten to access their target using L∞-queries and standard Statistical Queries. Finally, we show a lower bound: every algorithm to learn PDFA using queries with a reasonable tolerance needs a number of queries larger than (1/µ)^c for every c < 1.
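As a rough illustration of the quantity involved, the sketch below estimates the L∞ distance between two string distributions from samples, that is, the largest difference in probability assigned to any single string. It assumes the common reading of distinguishability as the smallest such distance between the distributions generated from distinct states; the abstract does not spell out the definition, and the function name `empirical_linf` is hypothetical.

```python
from collections import Counter

def empirical_linf(sample_a, sample_b):
    """Empirical L-infinity distance between two samples of strings."""
    if not sample_a or not sample_b:
        raise ValueError("both samples must be non-empty")
    freq_a, freq_b = Counter(sample_a), Counter(sample_b)
    n_a, n_b = len(sample_a), len(sample_b)
    # Largest gap between relative frequencies over all observed strings.
    return max(
        abs(freq_a[x] / n_a - freq_b[x] / n_b)
        for x in set(freq_a) | set(freq_b)
    )
```

For example, `empirical_linf(["a", "a", "b", ""], ["a", "b", "b", "b"])` evaluates to 0.5, driven by the string "b" (relative frequency 0.25 versus 0.75). When two states are only µ apart in this sense, telling them apart from samples requires a number of examples that grows as µ shrinks, which is the dependence the lower bound shows to be unavoidable for the class of algorithms considered.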
