Irene Díaz | Researchain

Publication

Featured research published by Irene Díaz.

IEEE Transactions on Knowledge and Data Engineering | 2005

Introducing a family of linear measures for feature selection in text categorization

Elías F. Combarro; Elena Montañés; Irene Díaz; José Ranilla; Ricardo Mones

Text categorization, which consists of automatically assigning documents to a set of categories, usually involves the management of a huge number of features. Most of them are irrelevant and others introduce noise which could mislead the classifiers. Thus, feature reduction is often performed in order to increase the efficiency and effectiveness of the classification. In this paper, we propose to select relevant features by means of a family of linear filtering measures which are simpler than the usual measures applied for this purpose. We carry out experiments over two different corpora and find that the proposed measures perform better than the existing ones.

Journal of the Association for Information Science and Technology | 2004

Improving performance of text categorization by combining filtering and support vector machines

Irene Díaz; José Ranilla; Elena Montañés; Javier Fernández; Elías F. Combarro

Text categorization is the process of assigning documents to a set of previously fixed categories. Much research aims to automate this time-consuming task. Several different algorithms have been applied, and Support Vector Machines (SVM) have shown very good results. In this report, we aim to show that filtering the words used by the SVM before classification can improve overall performance. This hypothesis is systematically tested with three different measures of word relevance, on two different corpora (one of them considered in three different splits), and with both local and global vocabularies. The results show that filtering significantly improves the recall of the method and also significantly improves overall performance.

Knowledge Based Systems | 2015

An entropy measure definition for finite interval-valued hesitant fuzzy sets

Pelayo Quirós; Pedro Alonso; Humberto Bustince; Irene Díaz; Susana Montes

In this work, a definition of entropy is studied in an interval-valued hesitant fuzzy environment, rather than in classical fuzzy logic or the interval-valued setting. Because the properties of such sets are more complex, the entropy is built from three different functions, each representing a different measure: fuzziness, lack of knowledge, and hesitance. Combining all three yields an entropy measure for interval-valued hesitant fuzzy sets that quantifies various types of uncertainty. From this definition, several results are developed for each mapping that shapes the entropy measure, making these functions easier to construct and, as a consequence, the new entropy simpler to obtain.

International Symposium on Neural Networks | 2003

A wrapper approach with support vector machines for text categorization

Elena Montañés; José Ramón Quevedo; Irene Díaz

The Scanning N-Tuple classifier (SNT) was introduced by Lucas and Amiri [1, 2] as an efficient and accurate classifier for chain-coded hand-written digits. The SNT operates at speeds of tens of thousands of sequences per second, during both the training and the recognition phases. The main contribution of this paper is to present a new discriminative training rule for the SNT. Two versions of the rule are provided, based on minimizing the mean-squared error and the cross-entropy, respectively. The discriminative training rule offers improved accuracy at the cost of slower training time, since the training is now iterative instead of single pass. The cross-entropy trained SNT offers the best results, with an error rate of 2.5% on sequences derived from the MNIST test set.

International Journal of Computer Mathematics | 2012

Interpretability of fuzzy association rules as means of discovering threats to privacy

Luigi Troiano; Luis J. Rodríguez-Muñiz; José Ranilla; Irene Díaz

This paper studies how data privacy can be preserved with fuzzy rule bases that are as interpretable as possible. These fuzzy rule bases are obtained from a data mining strategy based on building a decision tree. The antecedents of each rule produced by these systems contain information about the released variables (quasi-identifiers), whereas the consequent contains information only about the protected variable. Experimental results show that fuzzy rules are generally simpler and easier to interpret than other approaches, while the risk of disclosure does not increase.

Intelligent Data Analysis | 2003

Measures of Rule Quality for Feature Selection in Text Categorization

Elena Montañés; Javier Fernández; Irene Díaz; Elías F. Combarro; José Ranilla

Text categorization is the process of assigning documents to a set of previously fixed categories. Much research aims to automate this time-consuming task. Several different algorithms have been applied, and Support Vector Machines have shown very good results. In this paper we propose a new family of measures, taken from the Machine Learning environment, and apply them to the feature reduction task. The experiments are performed on two different corpora (Reuters and Ohsumed). The results show that the new family of measures performs better than the traditional Information Theory measures.

Computer-Aided Engineering | 2014

On the use of fuzzy partitions to protect data

Pelayo Quirós; Pedro Alonso; Irene Díaz; Susana Montes

Data protection is one of the most challenging tasks nowadays due to the huge amount of information that can be shared and cross-referenced from different sources. Releasing microdata is a way to protect data, mainly in the economic and medical fields. However, this kind of data can be subject to privacy attacks. This paper proposes the use of fuzzy sets as a way to improve the protection of privacy in microdata. The traditional definitions of k-anonymity, l-diversity and t-closeness are then extended. The performance of these new approaches is checked in terms of the risk index.
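
As a point of reference for the definitions being extended, classical (crisp) k-anonymity requires every combination of quasi-identifier values to occur in at least k records. A minimal Python sketch of that check follows. The record fields and the `is_k_anonymous` helper are hypothetical, and the fuzzy extension proposed in the paper is not reproduced here.

```python
from collections import Counter


def is_k_anonymous(records, quasi_identifiers, k):
    """Return True if every quasi-identifier combination occurs at least k times.

    This is the classical crisp definition, not the paper's fuzzy extension.
    """
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return all(count >= k for count in groups.values())


# Hypothetical toy microdata: "age" and "zip" are quasi-identifiers,
# "disease" is the protected variable.
records = [
    {"age": "30-39", "zip": "330**", "disease": "flu"},
    {"age": "30-39", "zip": "330**", "disease": "asthma"},
    {"age": "40-49", "zip": "331**", "disease": "flu"},
    {"age": "40-49", "zip": "331**", "disease": "diabetes"},
]

print(is_k_anonymous(records, ["age", "zip"], 2))  # True: each group has 2 records
print(is_k_anonymous(records, ["age", "zip"], 3))  # False: no group has 3 records
```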

International Journal of Approximate Reasoning | 2016

Representations of votes facilitating monotonicity-based ranking rules

Raúl Pérez-Fernández; Michael Rademaker; Pedro Alonso; Irene Díaz; Susana Montes; Bernard De Baets

We propose a new point of view of the long-standing problem where several voters have expressed a (strict) linear order (or ranking) over a set of candidates. For a ranking a ≻ b ≻ c to represent a group's opinion, it would be natural that the strength with which a ≻ c is supported should not be less than both the strength with which a ≻ b and the strength with which b ≻ c are supported. This intuitive property is called monotonicity and it has been recently addressed for the first time in the context of social choice. In this paper, two different representations of votes (the votrix and the votex) are considered. The former one is a formalization of the well-known reciprocal matrix of pairwise comparisons between candidates already introduced by Condorcet. The latter one is an extension of this reciprocal matrix considering hitherto unexploited information. These two representations lead to two monotonicity-based ranking rules.
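
The monotonicity property stated in the abstract can be illustrated with a small Python sketch. It assumes that "strength of support" is simply the number of voters ranking one candidate above another, and it applies the three-candidate condition to every ordered triple of a ranking. The function names are hypothetical, and this is one reading of the abstract, not the ranking rules developed in the paper.

```python
from itertools import combinations


def pairwise_support(votes, candidates):
    """Count, for each ordered pair (a, b), how many voters rank a above b."""
    support = {(a, b): 0 for a in candidates for b in candidates if a != b}
    for vote in votes:
        position = {c: i for i, c in enumerate(vote)}
        for a, b in support:
            if position[a] < position[b]:
                support[(a, b)] += 1
    return support


def is_monotone(ranking, support):
    """Check that for every a, b, c in ranking order, support for a over c
    is not less than the support for a over b or for b over c."""
    for a, b, c in combinations(ranking, 3):
        if support[(a, c)] < max(support[(a, b)], support[(b, c)]):
            return False
    return True


# Hypothetical profile of five voters over three candidates.
votes = [("a", "b", "c")] * 3 + [("b", "a", "c"), ("a", "c", "b")]
support = pairwise_support(votes, ["a", "b", "c"])

print(is_monotone(("a", "b", "c"), support))  # True
print(is_monotone(("c", "b", "a"), support))  # False
```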

Fuzzy Sets and Systems | 2015

Multi-factorial risk assessment

Raúl Pérez-Fernández; Pedro Alonso; Irene Díaz; Susana Montes

The main purpose of this paper is to develop a new method to aggregate the information given by several experts or criteria about different alternatives in order to obtain the preferred alternative or alternatives. This method has to take into account the interaction of the different alternatives, and a parameter modelling the flexibility of the method has to be introduced. More precisely, this method uses fuzzy preference relations, aggregated by means of weighted ordered weighted averaging aggregation operators (WOWA). For the exploitation phase, the extended weighted voting algorithm is introduced and studied in detail. Finally, the goodness of this approach is analyzed by using it to combine different points of view (people, environment, assets and reputation impact for the company) in the assessment of risk associated with human reliability.

Information Sciences | 2014

Statistical analysis of parametric t-norms

Luigi Troiano; Luis J. Rodríguez-Muñiz; Pasquale Marinaro; Irene Díaz

Theoretical aspects of aggregation functions, particularly t-norms, have been deeply investigated. A further step in the study of these functions is to consider the statistical behavior of their outputs. This research approach was initiated by the authors in a recent paper, mainly focused on compensatory functions (such as averages). Here, parametric t-norms are studied in order to provide simple, easy-to-use rules for practitioners choosing a parameter value. A pairwise parametric comparison of several t-norm families is provided by means of charts.
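
To make the notion of a parametric t-norm concrete, the Python sketch below implements two standard families, Hamacher and Yager, and averages their outputs over a uniform grid on the unit square. The abstract does not say which families the paper compares or how the statistics are computed, so both the choice of families and the `mean_output` helper are assumptions made purely for illustration.

```python
def hamacher(x, y, lam):
    """Hamacher t-norm with parameter lam >= 0 (lam = 1 gives the product)."""
    if lam == 0 and x == 0 and y == 0:
        return 0.0  # conventional value where the formula is 0/0
    return x * y / (lam + (1 - lam) * (x + y - x * y))


def yager(x, y, p):
    """Yager t-norm with parameter p > 0 (p = 1 gives the Lukasiewicz t-norm)."""
    return max(0.0, 1 - ((1 - x) ** p + (1 - y) ** p) ** (1 / p))


def mean_output(tnorm, param, steps=21):
    """Average t-norm output over a uniform steps-by-steps grid on [0, 1]^2.

    Hypothetical helper: a crude summary statistic, not the paper's analysis.
    """
    grid = [i / (steps - 1) for i in range(steps)]
    total = sum(tnorm(x, y, param) for x in grid for y in grid)
    return total / steps ** 2


# Compare how the average output moves as each parameter changes.
for lam in (0.0, 1.0, 2.0):
    print("Hamacher", lam, mean_output(hamacher, lam))
for p in (1.0, 2.0, 5.0):
    print("Yager", p, mean_output(yager, p))
```

A summary like this gives a practitioner a rough sense of how strongly a parameter value pulls the aggregated result toward 0.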
