Jesús E. García | Researchain

Publication

Featured research published by Jesús E. García.

The Annals of Applied Statistics | 2012

Context tree selection and linguistic rhythm retrieval from written texts

Antonio Galves; Charlotte Galves; Jesús E. García; Nancy L. Garcia; Florencia Leonardi

We introduce a new criterion to select in a consistent way the probabilistic context tree generating a sample. The basic idea is to construct a totally ordered set of candidate trees. This set is composed of the “champion trees”, the ones that maximize the likelihood of the sample for each number of degrees of freedom. The smallest maximizer criterion selects the infimum of the subset of champion trees whose gain in likelihood is negligible. This study was motivated by the linguistic challenge of retrieving rhythmic patterns from written texts. Applied to a data set consisting of texts extracted from daily newspapers, our algorithm identifies different context trees for European Portuguese and Brazilian Portuguese. This is compatible with the long-standing conjecture that European Portuguese and Brazilian Portuguese belong to different rhythmic classes. Moreover, these context trees have several interesting properties which are linguistically meaningful.

Journal of Multivariate Analysis | 2013

A new index to measure positive dependence in trivariate distributions

Jesús E. García; V. A. González-López; Roger B. Nelsen

We introduce a new index to detect dependence in trivariate distributions. The index is based on the maximization of the coefficients of directional dependence over the set of directions. We show how to calculate the index using the three pairwise Spearman's rho coefficients and the three common 3-dimensional versions of Spearman's rho. We obtain the asymptotic distributions of the empirical processes related to the estimators of the coefficients of directional dependence, and we also derive the asymptotic distribution of our index. We display examples where the index identifies dependence undetected by the aforementioned 3-dimensional versions of Spearman's rho. The value of the new index and the direction in which the maximal dependence occurs are easily computed, and we illustrate them with a simulation study and a real data set.
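As a rough illustration of the inputs named in this abstract, the sketch below computes the three pairwise Spearman's rho coefficients for a trivariate sample using only the standard library. It does not implement the paper's index or the 3-dimensional versions of Spearman's rho, whose formulas are not given here; the function names `ranks`, `spearman_rho` and `pairwise_rhos` are hypothetical.

```python
def ranks(values):
    """Return 1-based ranks, assigning the average rank to tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the rank vectors."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def pairwise_rhos(x, y, z):
    """The three pairwise coefficients of a trivariate sample (hypothetical helper)."""
    return {
        "xy": spearman_rho(x, y),
        "xz": spearman_rho(x, z),
        "yz": spearman_rho(y, z),
    }
```

For example, `pairwise_rhos([1, 2, 3, 4, 5], [2, 4, 6, 8, 10], [5, 4, 3, 2, 1])` pairs one increasing relationship with two decreasing ones.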

Communications in Statistics - Theory and Methods | 2014

Modeling of Acoustic Signal Energies with a Generalized Frank Copula. A Linguistic Conjecture is Reviewed

Jesús E. García; V. A. González-López

In this article, a generalized Frank copula was selected to model the dependence between the energy on two frequency bands of the speech signal, coming from eight languages. An algorithm was developed that uses maximum likelihood to choose the best-fitting copula parameters. Through bootstrap, the algorithm estimates the variability of the parameters for each language and also computes confidence regions by means of Voronoi tessellations. A linguistic conjecture which claims that the languages are organized in three rhythmic classes was confirmed by the Voronoi regions. Modeling with a uniparametric Frank copula, the different degrees of dependence between the energies were quantified.
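For orientation, here is a minimal sketch of the standard one-parameter Frank copula, C(u, v) = -(1/θ) ln(1 + (e^(-θu) - 1)(e^(-θv) - 1) / (e^(-θ) - 1)). This is the textbook uniparametric form, not the generalized copula or the fitting algorithm of the article, and the function name `frank_copula_cdf` is an assumption made for illustration.

```python
import math


def frank_copula_cdf(u, v, theta):
    """Standard one-parameter Frank copula CDF for u, v in [0, 1], theta != 0.

    Positive theta gives positive dependence, negative theta gives negative
    dependence; theta -> 0 approaches independence (not handled here).
    """
    if theta == 0:
        raise ValueError("theta must be non-zero; the limit is the independence copula")
    # expm1 and log1p keep the computation accurate when theta is small.
    numerator = math.expm1(-theta * u) * math.expm1(-theta * v)
    return -math.log1p(numerator / math.expm1(-theta)) / theta
```

A quick sanity check is that the margins are uniform: `frank_copula_cdf(u, 1.0, theta)` should return `u` up to rounding for any admissible `theta`.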

Journal of Applied Statistics | 2007

Classifying speech sonority functional data using a projected Kolmogorov-Smirnov approach

Juan A. Cuesta-Albertos; Ricardo Fraiman; Antonio Galves; Jesús E. García; Marcela Svarc

This paper addresses a linguistically motivated question of classification of functional data, namely the statistical classification of languages according to their rhythmic features. This is an important open problem in phonology. The analysis is based on the information provided by the sonority, which is an index of local regularity of the speech signal. Our main tool is the projected Kolmogorov–Smirnov test. This is a new goodness of fit test for functional data. The result obtained supports the linguistic conjecture of the existence of three rhythmic classes.

XI BRAZILIAN MEETING ON BAYESIAN STATISTICS: EBEB 2012 | 2012

Robust model selection and the statistical classification of languages

Jesús E. García; V. A. González-López; M. L. L. Viola

In this paper we address the problem of model selection for the set of finite memory stochastic processes with finite alphabet, when the data is contaminated. We consider m independent samples, with more than half of them being realizations of the same stochastic process with law Q, which is the one we want to retrieve. We devise a model selection procedure such that for a sample size large enough, the selected process is the one with law Q. Our model selection strategy is based on estimating relative entropies to select a subset of samples that are realizations of the same law. Although the procedure is valid for any family of finite order Markov models, we will focus on the family of variable length Markov chain models, which includes the fixed order Markov chain model family. We define the asymptotic breakdown point (ABDP) for a model selection procedure, and we show the ABDP for our procedure. This means that if the proportion of contaminated samples is smaller than the ABDP, then, as the sample size g...

Entropy | 2017

Consistent Estimation of Partition Markov Models

Jesús E. García; V. A. González-López

The Partition Markov Model characterizes the process by a partition L of the state space, where the elements in each part of L share the same transition probability to an arbitrary element in the alphabet. This model aims to answer the following questions: what is the minimal number of parameters needed to specify a Markov chain, and how can these parameters be estimated? In order to answer these questions, we build a consistent strategy for model selection, which consists of the following: given a realization of the process of size n, find a model within the Partition Markov class with a minimal number of parts that represents the process law. From the strategy, we derive a measure that establishes a metric in the state space. In addition, we show that if the law of the process is Markovian, then, eventually, when n goes to infinity, L will be retrieved. We show an application to model internet navigation patterns.
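The idea of grouping states that share the same transition probabilities can be sketched as follows. This is a naive illustration only: it estimates empirical transition probabilities and merges states whose estimates differ by at most a tolerance, which is an assumption made here and not the consistent selection strategy or the metric derived in the paper. The names `transition_probs`, `group_states` and `tol` are hypothetical.

```python
from collections import Counter, defaultdict


def transition_probs(seq, order=1):
    """Empirical transition probabilities P(next symbol | previous `order` symbols)."""
    counts = defaultdict(Counter)
    for i in range(order, len(seq)):
        state = tuple(seq[i - order:i])
        counts[state][seq[i]] += 1
    return {
        state: {symbol: c / sum(counter.values()) for symbol, c in counter.items()}
        for state, counter in counts.items()
    }


def group_states(probs, alphabet, tol=0.05):
    """Greedily group states whose estimated transition distributions agree within tol.

    Naive stand-in for a partition of the state space; not the paper's procedure.
    """
    parts = []  # each entry: (representative distribution, list of member states)
    for state in sorted(probs):
        dist = [probs[state].get(symbol, 0.0) for symbol in alphabet]
        for representative, members in parts:
            if max(abs(p - q) for p, q in zip(dist, representative)) <= tol:
                members.append(state)
                break
        else:
            parts.append((dist, [state]))
    return [members for _, members in parts]
```

With `seq = "acbcacbcacbc"`, the states `a` and `b` are always followed by `c`, so `group_states(transition_probs(seq), "abc")` places them in the same part, while `c` forms a part of its own.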

Communications in Statistics - Theory and Methods | 2018

A copula-based partition Markov procedure

M. Fernández; Jesús E. García; V. A. González-López

The number of parameters needed to specify a discrete multivariate Markov chain grows exponentially with the order and dimension of the chain, and when the size of the database is not large enough, consistent estimation is not possible. In this paper, we introduce a strategy to estimate a multivariate process with an order greater than the order achieved using standard procedures. The new strategy consists in obtaining a partition of the state space which is constructed from a combination of the partitions corresponding to the marginal processes and the partitions corresponding to the multivariate Markov chain.

Communications in Statistics - Theory and Methods | 2014

Robust Model Selection for Stochastic Processes

Jesús E. García; V. A. González-López; M. L. L. Viola

We address the problem of robust model selection for finite memory stochastic processes. Consider m independent samples, with most of them being realizations of the same stochastic process with law Q, which is the one we want to retrieve. We define the asymptotic breakdown point γ for a model selection procedure, and we also devise a model selection procedure. We compute the value of γ, which is 0.5 when all the processes are Markovian. This result is valid for any family of finite order Markov models, but for simplicity we will focus on the family of variable length Markov chains.

Archive | 2018

Bayesian sensitivity analysis for asymmetric copulas with cubic sections

M. Fernández; Jesús E. García; V. A. González-López; N. Romano

The use of copulas to model the dependence between indicators leads us to examine the different methods of estimation and their applicability under practical circumstances, such as having small databases. For this reason, Bayesian methods are very useful in the copula setting. In this paper, we investigate how the responses of the Asymmetric Cubic Sections copula model are affected when we vary the prior distributions placed over the model parameters. We use a non-informative prior distribution as the reference setting, and we observe the effect of the other prior distributions in relation to it. We used this diversity of scenarios to map the possible degrees of dependence between two educational scores obtained by students of the undergraduate course of statistics at the University of Campinas in 2014.

Archive | 2018

Stochastic Distance Between Burkitt Lymphoma/Leukemia Strains

Jesús E. García; Ramin Gholizadeh; V. A. González-López

Quantifying the proximity between N-grams makes it possible to establish criteria for comparing them. Recently, a consistent distance d to achieve this end was proposed, see Garcia JE, Gonzalez-Lopez VA. Detecting regime changes in Markov models. In New trends in stochastic modeling and data analysis (chapter 2, page 103), 2015. This distance takes advantage of a model structure on Markovian processes in finite alphabets and with finite memories, called Partition Markov Models, see Garcia JE, Gonzalez-Lopez VA. Entropy 19:160, 2017. In this work we explore the performance of d in a real problem, using d to establish a notion of natural proximity between DNA sequences from patients with an identical diagnosis, namely Burkitt lymphoma/leukemia. We also present a robust estimation strategy to identify the stochastic law that governs most of the sequences considered, thus mapping out a profile common to all these patients via their DNA sequences.
