Julio J. Valdés
National Research Council
Publication
Featured research published by Julio J. Valdés.
Artificial Intelligence in Medicine | 2004
P. Roy Walker; Brandon Smith; Qing Yan Liu; A. Fazel Famili; Julio J. Valdés; Z. Liu; Boleslaw Lach
Genome-wide transcription profiling is a powerful technique for studying the enormous complexity of cellular states. Moreover, when applied to disease tissue it may reveal quantitative and qualitative alterations in gene expression that give information on the context or underlying basis for the disease and may provide a new diagnostic approach. However, the data obtained from high-density microarrays is highly complex and poses considerable challenges in data mining. The data requires care in both pre-processing and the application of data mining techniques. This paper addresses the problem of dealing with microarray data that come from two known classes (Alzheimer and normal). We have applied three separate techniques to discover genes associated with Alzheimer's disease (AD). The 67 genes identified in this study included a total of 17 genes that are already known to be associated with Alzheimer's or other neurological diseases. This is higher than in any of the previously published Alzheimer's studies. Twenty known genes, not previously associated with the disease, have been identified, as well as 30 uncharacterized expressed sequence tags (ESTs). Given the success in identifying genes already associated with AD, we can have some confidence in the involvement of the latter genes and ESTs. From these studies we can attempt to define therapeutic strategies that would prevent the loss of specific components of neuronal function in susceptible patients or be in a position to stimulate the replacement of lost cellular function in damaged neurons. Although our study is based on a relatively small number of patients (four AD and five normal), we think our approach sets the stage for a major step in using gene expression data for disease modeling (i.e. classification and diagnosis). It can also contribute to the future of gene function identification, pathology, toxicogenomics, and pharmacogenomics.
Neural Networks | 2006
Vladimir Cherkassky; Vladimir M. Krasnopolsky; Dimitri P. Solomatine; Julio J. Valdés
This paper introduces a generic theoretical framework for predictive learning, and relates it to data-driven and learning applications in earth and environmental sciences. The issues of data quality, selection of the error function, incorporation of the predictive learning methods into the existing modeling frameworks, expert knowledge, model uncertainty, and other application-domain specific problems are discussed. A brief overview of the papers in the Special Issue is provided, followed by discussion of open issues and directions for future research.
granular computing | 2003
Julio J. Valdés
This paper introduces a virtual reality technique for visual data mining on heterogeneous information systems. The method is based on parametrized mappings between heterogeneous spaces with extended information systems and a virtual reality space. These mappings can also be constructed for unions of heterogeneous and incomplete data sets together with knowledge bases composed of decision rules. This approach has been applied successfully to a wide variety of real-world domains, and examples are presented from genomic research and geology.
congress on evolutionary computation | 2007
Julio J. Valdés; Alan J. Barton
This paper presents an approach for constructing visual representations of high dimensional objective spaces using virtual reality. These spaces arise from the solution of multi-objective optimization problems with more than three objective functions, which lead to high dimensional Pareto fronts that are difficult to use. This approach is preliminarily investigated using both theoretically derived high dimensional Pareto fronts for a test problem (DTLZ2) and practically obtained objective spaces for the 4 dimensional knapsack problem via multi-objective evolutionary algorithms like HLGA, NSGA, and VEGA. The expected characteristics of the high dimensional fronts in terms of relative sizes, sequencing, embedding and asymmetry were systematically observed in the constructed virtual reality spaces.
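As a small illustration of the kind of object being visualized, the sketch below extracts the non-dominated (Pareto) subset from a list of objective vectors with four objectives. It assumes every objective is to be minimized, which is a convention chosen here and not stated in the abstract; the function names `dominates` and `nondominated` are hypothetical and do not come from the paper.

```python
def dominates(a, b):
    """True if vector a Pareto-dominates vector b (all objectives minimized)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def nondominated(points):
    """Return the points that no other point dominates (quadratic-time scan)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


if __name__ == "__main__":
    # Four objective values per candidate solution (illustrative data only).
    candidates = [(1, 4, 2, 3), (2, 3, 1, 4), (2, 5, 3, 4)]
    print(nondominated(candidates))
```

With more than three objectives the resulting set cannot be plotted directly, which is the situation the abstract addresses by mapping the front into a virtual reality space.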
Neural Networks | 2006
Julio J. Valdés; Graeme F. Bonham-Carter
A computational intelligence approach is used to explore the problem of detecting internal state changes in time dependent processes described by heterogeneous, multivariate time series with imprecise data and missing values. Such processes are approximated by collections of time dependent non-linear autoregressive models represented by a special kind of neuro-fuzzy neural network. Model mining procedures based on neuro-fuzzy networks and genetic algorithms, run with grid and high throughput computing, generate: (i) collections of models composed of sets of time lag terms from the time series, and (ii) prediction functions represented by neuro-fuzzy networks. The composition of the models and their prediction capabilities allow the identification of changes in the internal structure of the process. These changes are associated with the alternation of steady and transient states, zones with abnormal behavior, instability, and other situations. This approach is general, and its sensitivity for detecting subtle changes of state is revealed by simulation experiments. Its potential in the study of complex processes in earth sciences and astrophysics is illustrated with applications using paleoclimate and solar data.
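To make the idea of "sets of time lag terms" concrete, the sketch below builds training samples for an autoregressive model from a chosen set of lags. It is a simplification under stated assumptions: missing values are encoded as `None` and any sample touching one is skipped, whereas the abstract describes models that work with missing values directly. The name `lagged_samples` is hypothetical.

```python
def lagged_samples(series, lags):
    """Build (inputs, target) pairs where inputs are series[t - k] for k in lags.

    Samples containing a missing value (None) are skipped; this is an
    assumption of the sketch, not the treatment used in the paper.
    """
    max_lag = max(lags)
    samples = []
    for t in range(max_lag, len(series)):
        inputs = [series[t - k] for k in lags]
        target = series[t]
        if target is None or any(v is None for v in inputs):
            continue
        samples.append((inputs, target))
    return samples


if __name__ == "__main__":
    series = [1, 2, None, 4, 5, 6, 7]
    for inputs, target in lagged_samples(series, lags=[1, 3]):
        print(inputs, "->", target)
```

A search procedure such as a genetic algorithm could then vary the `lags` argument and keep the lag sets whose fitted predictors perform best.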
international work conference on artificial and natural neural networks | 1997
Julio J. Valdés; Ricardo García
In the classical neuron model, inputs are continuous real-valued quantities. However, in many important real-world domains, objects are described by a mixture of continuous and discrete variables and usually contain missing information. A general class of neuron models accepting heterogeneous inputs in the form of mixtures of continuous and discrete quantities, admitting missing data, is presented. From these, several particular models can be derived as instances, and different neural architectures can be constructed with them. In particular, hybrid feedforward neural networks composed of layers of heterogeneous and classical neurons are studied here, and a training procedure for them is constructed using genetic algorithms. Their possibilities in solving classification and diagnostic problems are illustrated by experiments with data sets from known repositories. The experiments show that these networks are robust and that they can both learn and classify complex data very effectively without preprocessing or variable transformations, even in the presence of missing information.
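One plausible way to aggregate mixed inputs with gaps is a Gower-style similarity between the input vector and a weight vector, squashed by a logistic function. The sketch below illustrates that idea only; the abstract does not specify the aggregation or activation, so the similarity choice, the logistic parameters, and the names `gower_similarity` and `heterogeneous_neuron` are all assumptions.

```python
import math


def gower_similarity(x, w, kinds, ranges):
    """Average per-variable similarity over variables present in both vectors.

    kinds[i] is "num" or "cat"; ranges[i] is the value range of a numeric
    variable (ignored for categorical ones). Missing values are None.
    """
    score, used = 0.0, 0
    for xi, wi, kind, rng in zip(x, w, kinds, ranges):
        if xi is None or wi is None:
            continue  # missing values simply do not contribute
        if kind == "num":
            score += 1.0 - abs(xi - wi) / rng
        else:
            score += 1.0 if xi == wi else 0.0
        used += 1
    return score / used if used else 0.0


def heterogeneous_neuron(x, w, kinds, ranges, gain=10.0, center=0.5):
    """Logistic squashing of the similarity (gain and center are assumed)."""
    s = gower_similarity(x, w, kinds, ranges)
    return 1.0 / (1.0 + math.exp(-gain * (s - center)))


if __name__ == "__main__":
    x = [0.5, "red", None]   # one continuous, one categorical, one missing
    w = [1.0, "red", 3.0]
    print(heterogeneous_neuron(x, w, ["num", "cat", "num"], [2.0, None, 10.0]))
```

Because such a neuron is not differentiable with respect to categorical weights, a gradient-free trainer such as a genetic algorithm is a natural fit, consistent with the training procedure the abstract mentions.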
genetic and evolutionary computation conference | 2007
Julio J. Valdés; Robert Orchard; Alan J. Barton
Two medical data sets (Breast cancer and Colon cancer) are investigated within a visual data mining paradigm through the unsupervised construction of virtual reality spaces using genetic programming and classical optimization (for comparison purposes). The desired visual spaces are such that a modified genetic programming approach was proposed in order to generate programs representing vector functions. The extension leads to populations that are composed of forests, instead of single expression trees. No particular kind of genetic programming algorithm is required due to the generic nature of the approach taken in the paper. The results (visual spaces) show that the relationships between the data objects and their classes can be appreciated in all of the obtained spaces regardless of the mapping error. In addition, the spaces obtained with genetic programming resulted in lower mapping errors than a classical optimizer and produced relatively simple equations. Further, the set of obtained equations can be statistically analyzed in terms of the original attributes in order to further the understanding of the derivation of the new nonlinear features that are constructed. Thus, explicit mappings provided by genetic programming can be used for feature selection and generation in data mining where scalar and/or vector functions are involved.
ieee international conference on evolutionary computation | 2006
Julio J. Valdés; Alan J. Barton
Multi-objective optimization is used for the computation of virtual reality spaces for visual data mining and knowledge discovery. Two methods for computing new spaces are discussed: implicit and explicit function representations. In the first, the images of the objects are computed directly, and in the second, universal function approximators (neural networks) are obtained. The pros and cons of each approach are discussed, as well as their complementary character. The NSGA-II algorithm is used for computing spaces required to minimize two objectives: a similarity structure loss measure (Sammon's error) and classification error (mean cross-validation error on a k-nn classifier). Two examples using solutions along approximations to the Pareto front are presented: Alzheimer's disease gene expressions and geophysical fields for prospecting underground caves. This approach is a general non-linear feature generation method and can be used in problems not necessarily oriented to the construction of visual data representations.
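For reference, Sammon's error compares pairwise distances in the original space, $d^{*}_{ij}$, with those in the low-dimensional space, $d_{ij}$: $E = \frac{1}{\sum_{i<j} d^{*}_{ij}} \sum_{i<j} \frac{(d^{*}_{ij} - d_{ij})^{2}}{d^{*}_{ij}}$. The sketch below computes it with Euclidean distances in both spaces, which is an assumption here; the abstract does not state which dissimilarity is used. The function names are hypothetical.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def sammon_error(original, projected):
    """Sammon's stress between original points and their low-dimensional images."""
    total = 0.0   # sum of original pairwise distances
    stress = 0.0  # sum of squared distance errors, each scaled by 1 / d_orig
    for i, j in combinations(range(len(original)), 2):
        d_orig = euclidean(original[i], original[j])
        if d_orig == 0.0:
            continue  # coincident originals are skipped to avoid division by zero
        d_proj = euclidean(projected[i], projected[j])
        total += d_orig
        stress += (d_orig - d_proj) ** 2 / d_orig
    if total == 0.0:
        raise ValueError("all original points coincide")
    return stress / total


if __name__ == "__main__":
    high = [(0, 0, 0, 1), (1, 0, 0, 1), (0, 1, 0, 1), (0, 0, 2, 1)]
    low = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 2)]
    print(sammon_error(high, low))
```

In a two-objective setup like the one described, a value such as this would be one coordinate of each candidate's objective vector, with the classification error as the other.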
rough sets and knowledge technology | 2006
Julio J. Valdés; Alan J. Barton
In many domains, the data objects are described in terms of a large number of features. The pipelined data mining approach introduced in [1], using two clustering algorithms in combination with rough sets and extended with genetic programming, is investigated with the purpose of discovering important subsets of attributes in high dimensional data. Their classification ability is described in terms of both collections of rules and analytic functions obtained by genetic programming (gene expression programming). The Leader and several k-means algorithms are used as procedures for attribute set simplification of the information systems later presented to rough sets algorithms. Visual data mining techniques including virtual reality were used for inspecting results. The data mining process is set up using high throughput distributed computing techniques. This approach was applied to Breast Cancer microarray data and it led to subsets of genes with high discrimination power with respect to the decision classes.
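The Leader algorithm mentioned above is a single-pass clustering procedure: each object joins the first existing leader within a distance threshold, or becomes a new leader otherwise. The sketch below is a minimal version assuming Euclidean distance and a user-chosen threshold; the paper may use a different dissimilarity, and the function name is hypothetical.

```python
import math


def leader_clustering(points, threshold):
    """Single-pass Leader clustering.

    Returns (leaders, labels), where labels[i] is the index of the leader
    that points[i] was assigned to. The result depends on input order.
    """
    leaders, labels = [], []
    for p in points:
        for idx, leader in enumerate(leaders):
            if math.dist(p, leader) <= threshold:
                labels.append(idx)
                break
        else:
            # No leader is close enough, so this point starts a new cluster.
            leaders.append(p)
            labels.append(len(leaders) - 1)
    return leaders, labels


if __name__ == "__main__":
    data = [(0, 0), (0.5, 0), (5, 5), (5.2, 5.1), (0.1, 0.2)]
    print(leader_clustering(data, threshold=1.0))
```

For attribute set simplification, the `points` would be attribute (gene) profiles across samples, and the leaders would serve as representatives of groups of similar attributes.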
industrial and engineering applications of artificial intelligence and expert systems | 2004
Julio J. Valdés; Alan J. Barton
One of the difficulties of using Artificial Neural Networks (ANNs) to estimate atmospheric temperature is the large number of potential input variables available. In this study, four different feature extraction methods were used to reduce the input vector to train four networks to estimate temperature at different atmospheric levels. The four techniques used were: genetic algorithms (GA), coefficient of determination (CoD), mutual information (MI) and simple neural analysis (SNA). The results demonstrate that of the four methods used for this data set, mutual information and simple neural analysis can generate networks that have a smaller input parameter set, while still maintaining a high degree of accuracy.
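As an illustration of mutual information as an input-selection criterion, the sketch below estimates MI (in bits) between each candidate input and the target from empirical frequencies and ranks the inputs. It assumes the variables have already been discretized into bins, which is a simplification not described in the abstract; the names `mutual_information` and `rank_inputs` are hypothetical.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Empirical mutual information, in bits, between two discrete sequences."""
    n = len(xs)
    px, py = Counter(xs), Counter(ys)
    pxy = Counter(zip(xs, ys))
    return sum(
        (c / n) * math.log2(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def rank_inputs(features, target):
    """Sort feature names by decreasing mutual information with the target."""
    scores = {name: mutual_information(col, target) for name, col in features.items()}
    return sorted(scores, key=scores.get, reverse=True)


if __name__ == "__main__":
    target = [0, 0, 1, 1]
    features = {
        "informative": [0, 0, 1, 1],    # identical to the target
        "uninformative": [0, 1, 0, 1],  # independent of the target
    }
    print(rank_inputs(features, target))
```

Keeping only the top-ranked inputs gives a smaller input vector for the network, which is the trade-off between input size and accuracy that the abstract reports on.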