María Pérez-Ortiz
Loyola University Chicago
Publication
Featured research published by María Pérez-Ortiz.
IEEE Transactions on Knowledge and Data Engineering | 2016
Pedro Antonio Gutiérrez; María Pérez-Ortiz; Javier Sánchez-Monedero; Francisco Fernández-Navarro; César Hervás-Martínez
Ordinal regression problems are those machine learning problems where the objective is to classify patterns using a categorical scale which shows a natural order between the labels. Many real-world applications present this labelling structure, which has increased the number of methods and algorithms developed in this field in recent years. Although ordinal regression can be tackled with standard nominal classification techniques, there are several algorithms which can specifically benefit from the ordering information. Therefore, this paper reviews the state of the art on these techniques and proposes a taxonomy based on how the models are constructed to take the order into account. Furthermore, a thorough experimental study is proposed to check whether the use of the order information improves the performance of the models obtained, considering some of the approaches within the taxonomy. The results confirm that ordering information benefits ordinal models, improving their accuracy and the closeness of the predictions to the actual targets on the ordinal scale.
Expert Systems With Applications | 2016
María Pérez-Ortiz; José M. Peña; Pedro Antonio Gutiérrez; Jorge Torres-Sánchez; César Hervás-Martínez; Francisca López-Granados
Highlights: The problem of remote weed mapping via machine learning is considered. Unmanned aerial vehicles are used to capture maize and sunflower field images. The proposed method considers pattern and feature selection techniques. The final model requires little user information to generalise to new areas. There are features of great influence for the classification of both crops.
This paper approaches the problem of weed mapping for precision agriculture, using imagery provided by Unmanned Aerial Vehicles (UAVs) from sunflower and maize crops. In precision agriculture, weed control is mainly based on the design of early post-emergence site-specific control treatments according to weed coverage, where one of the most important challenges is the spectral similarity of crop and weed pixels in early growth stages. Our work tackles this problem in the context of object-based image analysis (OBIA) by means of supervised machine learning methods combined with pattern and feature selection techniques, devising a strategy for reducing user intervention in the system without compromising accuracy. This work first proposes a method for choosing a set of training patterns via clustering techniques, so that the classification method is given a representative set of the whole field data spectrum. Furthermore, a feature selection method is used to obtain the best discriminating features from a set of several statistics and measures of different nature. Results from this research show that the proposed method for pattern selection is suitable and leads to the construction of robust sets of data. The exploitation of different statistical, spatial and texture metrics represents a new avenue with huge potential for between and within crop-row weed mapping via UAV imagery and shows good synergy when complemented with OBIA. Finally, there are some measures (especially those linked to vegetation indexes) that are of great influence for weed mapping in both sunflower and maize crops.
Applied Soft Computing | 2015
María Pérez-Ortiz; J.M. Peña; Pedro Antonio Gutiérrez; Jorge Torres-Sánchez; César Hervás-Martínez; Francisca López-Granados
Highlights: The problem of constructing a weed mapping model via machine learning techniques is assessed. The combination of spectral properties with vegetation indexes and crop rows helps the prediction. A semi-supervised classifier has been shown to perform well for the classification problem assessed with very little information provided by the user. An extended experimental design for weed mapping could be performed considering other crops.
This paper presents a system for weed mapping, using imagery provided by unmanned aerial vehicles (UAVs). Weed control in precision agriculture is based on the design of site-specific control treatments according to weed coverage. A key component is precise and timely weed maps, and one of the crucial steps is weed monitoring, by ground sampling or remote detection. Traditional remote platforms, such as piloted planes and satellites, are not suitable for early weed mapping, given their low spatial and temporal resolutions. Nonetheless, the ultra-high spatial resolution provided by UAVs can be an efficient alternative. The proposed method for weed mapping partitions the image and complements the spectral information with other sources of information. Apart from the well-known vegetation indexes, which are commonly used in precision agriculture, a method for crop row detection is proposed. Given that crops are always organised in rows, this kind of information simplifies the separation between weeds and crops. Finally, the system incorporates classification techniques for the characterisation of pixels as crop, soil and weed. Different machine learning paradigms are compared to identify the best performing strategies, including unsupervised, semi-supervised and supervised techniques. The experiments study the effect of the flight altitude and the sensor used. Our results show that an excellent performance is obtained using very few labelled data complemented with unlabelled data (semi-supervised approach), which motivates the use of weed maps to design site-specific weed control strategies just when farmers implement the early post-emergence weed control.
hybrid artificial intelligence systems | 2012
Pedro Antonio Gutiérrez; María Pérez-Ortiz; Francisco Fernández-Navarro; Javier Sánchez-Monedero; César Hervás-Martínez
In this paper, an experimental study of different ordinal regression methods and measures is presented. The first objective is to gather the results of a considerably large number of methods, datasets and measures, since there are not many previous comparative studies of this kind in the literature. The second objective is to detect the redundancy between the evaluation measures used for ordinal regression. The results obtained present the maximum MAE (the mean absolute error between the true and the predicted ranks, computed for the worst classified class) as a very interesting alternative for ordinal regression, being the least correlated with the rest of the measures. Additionally, SVOREX and SVORIM are found to yield very good performance when the objective is to minimize this maximum MAE.
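As an illustration of the maximum MAE described in this abstract, the following is a minimal sketch, assuming classes are encoded as integer ranks. The function name `maximum_mae` and the toy labels are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def maximum_mae(y_true, y_pred):
    """Return the largest per-class mean absolute error between ranks.

    Assumes y_true and y_pred are equal-length sequences of integer ranks.
    """
    abs_errors = defaultdict(list)
    for true_rank, pred_rank in zip(y_true, y_pred):
        # Group absolute rank differences by the true class of each pattern.
        abs_errors[true_rank].append(abs(true_rank - pred_rank))
    # MAE of each class, then the MAE of the worst classified class.
    per_class_mae = {k: sum(v) / len(v) for k, v in abs_errors.items()}
    return max(per_class_mae.values())


# Toy data (illustrative only): class 3 is the worst classified class.
y_true = [1, 1, 2, 2, 3, 3]
y_pred = [1, 1, 2, 3, 1, 2]
print(maximum_mae(y_true, y_pred))  # 1.5
```

On this toy data the overall MAE is about 0.67, while the maximum MAE is 1.5, which shows how the measure exposes a poorly classified class that an average over all patterns would hide.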
IEEE Transactions on Systems, Man, and Cybernetics | 2014
María Pérez-Ortiz; Pedro Antonio Gutiérrez; César Hervás-Martínez
The classification of patterns into naturally ordered labels is referred to as ordinal regression. This paper proposes an ensemble methodology specifically adapted to this type of problem, which is based on computing different classification tasks through the formulation of different order hypotheses. Every single model is trained in order to distinguish between one given class (k) and all the remaining ones, while grouping them in those classes with a rank lower than k, and those with a rank higher than k. Therefore, it can be considered as a reformulation of the well-known one-versus-all scheme. The base algorithm for the ensemble could be any threshold (or even probabilistic) method, such as the ones selected in this paper: kernel discriminant analysis, support vector machines and logistic regression (LR) (all reformulated to deal with ordinal regression problems). The method is seen to be competitive when compared with other state-of-the-art methodologies (both ordinal and nominal), by using six measures and a total of 15 ordinal datasets. Furthermore, an additional set of experiments is used to study the potential scalability and interpretability of the proposed method when using LR as base methodology for the ensemble.
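The relabelling behind each order hypothesis can be sketched as follows. This is only a rough illustration under stated assumptions: ranks are integers from 1 to K, and the names `order_hypothesis_targets` and `decompose` are hypothetical. It shows how targets for each sub-problem might be built, not the paper's base learners or the way their outputs are combined.

```python
def order_hypothesis_targets(ranks, k):
    """Relabel ordinal ranks for the sub-problem centred on class k.

    Patterns of class k become "target"; the remaining patterns are grouped
    into classes ranked lower than k and classes ranked higher than k.
    """
    targets = []
    for rank in ranks:
        if rank < k:
            targets.append("lower")
        elif rank > k:
            targets.append("higher")
        else:
            targets.append("target")
    return targets


def decompose(ranks, n_classes):
    """Build one relabelled target vector per class (one per order hypothesis)."""
    return {k: order_hypothesis_targets(ranks, k) for k in range(1, n_classes + 1)}


# Illustrative ranks for a four-class ordinal problem.
ranks = [1, 2, 3, 4, 2]
subproblems = decompose(ranks, n_classes=4)
print(subproblems[2])  # ['lower', 'target', 'higher', 'higher', 'target']
```

For the first and last classes one of the two groups is empty, so those sub-problems reduce to binary ones.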
IEEE Transactions on Knowledge and Data Engineering | 2015
María Pérez-Ortiz; Pedro Antonio Gutiérrez; César Hervás-Martínez; Xin Yao
The classification of patterns into naturally ordered labels is referred to as ordinal regression or ordinal classification. Usually, this classification setting is by nature highly imbalanced, because there are classes in the problem that are a priori more probable than others. Although standard over-sampling methods can improve the classification of minority classes in ordinal classification, they tend to introduce severe errors in terms of the ordinal label scale, given that they do not take the ordering into account. A specific ordinal over-sampling method is developed in this paper for the first time in order to improve the performance of machine learning classifiers. The method proposed includes ordinal information by approaching over-sampling from a graph-based perspective. The results presented in this paper show the good synergy of a popular ordinal regression method (a reformulation of support vector machines) with the graph-based proposed algorithms, and the possibility of improving both the classification and the ordering of minority classes. A cost-sensitive version of the ordinal regression method is also introduced and compared with the over-sampling proposals, showing in general lower performance for minority classes.
Applied Soft Computing | 2014
María Pérez-Ortiz; Manuel Cruz-Ramírez; María Dolores Ayllón-Terán; Nigel Heaton; Rubén Ciria; César Hervás-Martínez
Liver transplantation is nowadays a widely accepted treatment for patients who present a terminal liver disease. Nevertheless, transplantation is greatly hampered by the unavailability of suitable liver donors; several methods have been developed and applied to find a better system to prioritize recipients on the waiting list, although most of them only consider donor or recipient characteristics (but not both). This paper proposes a novel donor-recipient liver allocation system constructed to predict graft survival after transplantation by means of a dataset comprising donor-recipient pairs from different centres (seven Spanish hospitals and one UK hospital). The best model obtained is used in conjunction with the Model for End-stage Liver Disease (MELD) score, one of the most widely used allocation methodologies at present. This problem is assessed using the ordinal regression learning paradigm due to the natural ordering in the classes of the problem, via a cascade binary decomposition methodology and the Support Vector Machine methodology. The methodology proposed has shown competitiveness in all the metrics selected when compared to other machine learning techniques, and efficiently complements the MELD score based on the principles of efficiency and equity. Finally, a simulation of the proposal is included, in order to visualize its performance in realistic situations. This simulation has shown that there are some determining factors in the characterization of the survival time after transplantation (concerning both donors and recipients) and that the joint use of these sets of information could be, in fact, more useful and beneficial for the survival principle. Nonetheless, the results obtained indicate the true complexity of the problem dealt with in this study and the fact that other characteristics that have not been included in the dataset may be of importance for the characterization of the dependent variable (survival time after transplantation), thus starting a promising line of future work.
IEEE Transactions on Neural Networks | 2016
María Pérez-Ortiz; Pedro Antonio Gutiérrez; Peter Tino; César Hervás-Martínez
The imbalanced nature of some real-world data is one of the current challenges for machine learning researchers. One common approach oversamples the minority class through convex combination of its patterns. We explore the general idea of synthetic oversampling in the feature space induced by a kernel function (as opposed to input space). If the kernel function matches the underlying problem, the classes will be linearly separable and synthetically generated patterns will lie on the minority class region. Since the feature space is not directly accessible, we use the empirical feature space (EFS) (a Euclidean space isomorphic to the feature space) for oversampling purposes. The proposed method is framed in the context of support vector machines, where the imbalanced data sets can pose a serious hindrance. The idea is investigated in three scenarios: 1) oversampling in the full and reduced-rank EFSs; 2) a kernel learning technique maximizing the data class separation to study the influence of the feature space structure (implicitly defined by the kernel function); and 3) a unified framework for preferential oversampling that spans some of the previous approaches in the literature. We support our investigation with extensive experiments over 50 imbalanced data sets.
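The common approach mentioned in this abstract, oversampling the minority class through convex combinations of its patterns, can be sketched in plain input space as below. This is a simplified illustration and not the paper's method, which operates in the empirical feature space. The function name `oversample_convex`, the random pairing of patterns, and the toy data are all assumptions.

```python
import random


def oversample_convex(minority, n_new, seed=0):
    """Generate n_new synthetic patterns as convex combinations of pairs.

    Assumes `minority` is a list of at least two equal-length feature lists.
    Pairs are drawn at random here; neighbour-based pairing is a common
    alternative that this sketch does not implement.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        a, b = rng.sample(minority, 2)
        lam = rng.random()  # mixing weight in [0, 1)
        # lam * a + (1 - lam) * b lies on the segment joining a and b.
        synthetic.append([lam * ai + (1 - lam) * bi for ai, bi in zip(a, b)])
    return synthetic


# Toy minority class with three two-dimensional patterns (illustrative only).
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
new_patterns = oversample_convex(minority, n_new=5)
print(len(new_patterns))  # 5
```

Each synthetic pattern lies on a segment between two existing minority patterns, so it stays inside their convex hull. Whether that hull lies inside the minority class region depends on the space in which the combination is taken, which is the motivation the abstract gives for working in a kernel-induced feature space.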
Knowledge Based Systems | 2014
María Pérez-Ortiz; M. de la Paz-Marín; Pedro Antonio Gutiérrez; César Hervás-Martínez
Sustainable development (SD) is a major challenge for nations, even more so in the current economic crisis and uncertain environment. Although different indicators, composite indices and rankings to measure and monitor SD advances at the macro level exist, the benefits for stakeholders and policy makers are still limited because of the absence of predictive models (in the sense of models able to classify countries according to their SD advances). To cope with this need, this paper presents a first approximation via machine learning techniques. First, we study the SD stage of the 27 European Union Member States using information from the years 2005–2010 and different major indicators that have been related to SD. A hierarchical clustering analysis is conducted, and the patterns are categorised as advanced, followers, moderate and initiated, according to their progress towards SD. The classification problem is addressed from an ordinal regression point of view because of the inherent order among the categories. To do so, a reformulation of the one-versus-all scheme for ordinal regression problems is used, making use of threshold models (Logistic Regression (LR) and Support Vector Machines in this case) and a new trainable decision rule for probability estimation fusion. The empirical results indicate that the constructed model is able to achieve very promising and competitive performance. Thus, it could be used for monitoring the progress towards SD of the different EU countries, in a manner similar to that used for rankings. Finally, the decomposition method based on LR is used for model interpretation purposes, providing valuable information about the most relevant indicators for ranking the end-point variable.
intelligent systems design and applications | 2011
María Pérez-Ortiz; Pedro Antonio Gutiérrez; Carlos R. García-Alonso; Luis Salvador-Carulla; José A. Salinas-Pérez; César Hervás-Martínez
In this paper we apply and test a recent ordinal algorithm for classification (Kernel Discriminant Learning Ordinal Regression, KDLOR) in order to recognize groups of geographically close spatial units with a similar, significantly high (or low) prevalence pattern, which are called hot-spots (or cold-spots). Different spatial analysis techniques have been used for studying the geographical distribution of a specific illness in mental health care, because this could be useful for organizing the spatial distribution of health-care services. Ordinal classification is used in this problem because the classes are naturally ordered: spatial units with depression, spatial units which could present depression and spatial units without depression. It is shown that the proposed method is capable of preserving the rank of data classes in a projected data space for this database. In comparison to other standard methods like C4.5, SVMRank, Adaboost, and MLP nominal classifiers, the proposed KDLOR algorithm is shown to be competitive.