Publication


Featured research published by Andrés Soto.


Language Resources and Evaluation | 2013

Classifying unlabeled short texts using a fuzzy declarative approach

Francisco P. Romero; Pascual Julián-Iranzo; Andrés Soto; Mateus Ferreira-Satler; Juan Gallardo-Casero

Web 2.0 provides user-friendly tools that allow people to create and publish content online. User-generated content often takes the form of short texts (e.g., blog posts, news feeds, snippets, etc.). This has motivated an increasing interest in the analysis of short texts and, specifically, in their categorisation. Text categorisation is the task of classifying documents into a certain number of predefined categories. Traditional text classification techniques are mainly based on statistical analysis of word frequency and have proved inadequate for the classification of short texts, where word occurrences are too few. Moreover, the classic approach to text categorisation is based on a learning process that requires a large number of labeled training texts to achieve accurate performance. However, labeled documents might not be available, whereas unlabeled documents can be easily collected. This paper presents an approach to text categorisation which does not need a pre-classified set of training documents. The proposed method requires only the category names as user input. Each of these categories is defined by means of an ontology of terms modelled by a set of what we call proximity equations. Hence, our method is not based on the occurrence frequency of a category, but depends heavily on the definition of that category and on how well the text fits that definition. Therefore, the proposed approach is an appropriate method for short text classification, where the frequency of occurrence of a category is very small or even zero. Another feature of our method is that the classification process is based on the ability of an extension of the standard Prolog language, named Bousi~Prolog, for flexible matching and knowledge representation. This declarative approach provides a text classifier which is quick and easy to build, and a classification process which is easy for the user to understand. The results of experiments showed that the proposed method achieved a reasonably useful performance.
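
As a rough illustration of the idea in this abstract (categories defined by proximity equations rather than by labeled training data), the following Python sketch scores a short text against each category name. It is not the authors' Bousi~Prolog implementation: the `PROXIMITY` table, its degrees, the max-based scoring and the `threshold` parameter are all assumptions made for this example.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical proximity equations: (category name, term) -> degree in [0, 1].
# The values are illustrative only and are not taken from the paper.
PROXIMITY: Dict[Tuple[str, str], float] = {
    ("sports", "football"): 0.9,
    ("sports", "match"): 0.7,
    ("economy", "bank"): 0.9,
    ("economy", "market"): 0.8,
}


def category_score(category: str, words: List[str]) -> float:
    """Return the highest proximity between the category name and any word."""
    return max(
        (1.0 if w == category else PROXIMITY.get((category, w), 0.0) for w in words),
        default=0.0,
    )


def classify(text: str, categories: List[str], threshold: float = 0.5) -> Optional[str]:
    """Pick the best-matching category, or None if no score reaches the threshold.

    Assumes `categories` is non-empty.
    """
    words = text.lower().split()
    scores = {c: category_score(c, words) for c in categories}
    best = max(scores, key=lambda c: scores[c])
    return best if scores[best] >= threshold else None


print(classify("Late goal decides the football match", ["sports", "economy"]))
```

Note that no word-frequency statistics are involved: a single word that is close to a category name is enough to assign the text to that category, which is why this style of matching can suit very short texts.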


Archive | 2008

Fuzzy Approach of Synonymy and Polysemy for Information Retrieval

Andrés Soto; José A. Olivas; Manuel E. Prieto

The development of Information Retrieval methods based on conceptual aspects is vital to reduce the number of irrelevant documents retrieved by search engines. In this chapter, a method for expanding user queries is presented, such that for each term in the original query, all of its synonyms for a given meaning with maximum concept frequency are introduced. To measure the degree of concept presence in a document (or even in a document collection), a concept frequency formula is introduced. New fuzzy formulas are also introduced to calculate the degree of synonymy between terms in order to deal with concepts (meanings). With them, even though a certain term does not appear in a document, some degree of its presence can be estimated based on its degree of synonymy with terms that do appear in the document. A polysemy index is also introduced in order to simplify the treatment of weak and strong words.
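
The abstract does not give the chapter's formulas, so the following Python sketch only illustrates the general idea of estimating the presence of an absent term from its synonyms. The `SYNONYMY` table, its degrees, and the choice of taking the maximum over the document's terms are assumptions made for this example, not the authors' concept frequency formula.

```python
from typing import Dict, FrozenSet, Iterable

# Hypothetical symmetric synonymy degrees in [0, 1]; illustrative values only.
SYNONYMY: Dict[FrozenSet[str], float] = {
    frozenset({"car", "automobile"}): 0.9,
    frozenset({"car", "vehicle"}): 0.6,
}


def synonymy_degree(a: str, b: str) -> float:
    """Degree of synonymy between two terms; identical terms have degree 1."""
    if a == b:
        return 1.0
    return SYNONYMY.get(frozenset({a, b}), 0.0)


def presence_degree(term: str, document_terms: Iterable[str]) -> float:
    """Estimate how present `term` is in a document, even if it never occurs.

    Assumption: presence is the strongest synonymy link to any document term.
    """
    return max((synonymy_degree(term, t) for t in document_terms), default=0.0)


doc = ["the", "automobile", "was", "red"]
print(presence_degree("car", doc))
```

Here "car" never occurs in `doc`, yet it receives a non-zero presence degree through its synonym "automobile".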


Soft Computing | 2010

Fuzzy optimized self-organizing maps and their application to document clustering

Francisco P. Romero; Arturo Peralta; Andrés Soto; José A. Olivas; Jesús Serrano-Guerrero

In this paper, an approach using fuzzy logic techniques and self-organizing maps (SOM) is presented in order to manage conceptual aspects in document clusters and to reduce the training time. In order to measure the degree of presence of a concept in a document, a concept frequency formula is introduced. This formula is based on new fuzzy formulas to calculate the polysemy degree of terms and the synonymy degree between terms. In this approach, new fuzzy improvements such as automatic choice of the topology, heuristic map initialization, a fuzzy similarity measure and a keyword extraction process are used. Experiments have been carried out to compare the proposed system with classic SOM approaches using the Reuters collection. The system performance has been measured in terms of F-measure and training time. The experimental results show that the proposed approach generates good results with less training time than classic SOM techniques.


IEEE International Conference on Fuzzy Systems | 2007

Using Generalized Constraints and Protoforms to Deal with Adjectives

Andrés Soto; José A. Olivas; Manuel E. Prieto

Natural languages (NLs) are basically systems for describing perceptions, which are intrinsically imprecise. The soft computing approach to NL-Computation considers that new tools such as generalized constraints and prototypical forms (protoforms) should be employed to deal with such semantic imprecision. If an NL proposition could be expressed in terms of a generalized constraint, it could be considered precise, at least to some degree. In this paper, NL adjectives are considered as constraints. Therefore, generalized constraints are proposed to specify the attribute values associated with nouns by means of adjectives. Then, after a noun phrase is recognized in a sentence, it could be transformed into a generalized constraint. In this way, a lot of data could be retrieved from documents in NL and used to deduce new information. Protoforms allow the deep semantic structure of propositions to be described. They could be used to represent noun phrases, involving adjectives and nouns, as symbolic expressions.


International Workshop on Fuzzy Logic and Applications | 2011

A fuzzy declarative approach for classifying unlabeled short texts using thesauri

Francisco P. Romero; Pascual Julián-Iranzo; Andrés Soto; Mateus Ferreira-Satler; Juan Gallardo-Casero

The classic approach to text categorisation is based on a learning process that requires a large number of labelled training texts to achieve accurate performance. The most notable problem is that labelled texts are difficult to generate, because categorising short texts such as snippets or messages must be done by human developers, although unlabelled short texts can be easily collected. In this paper, we present an approach to categorising unlabelled short texts which requires, as user input, only the category names, defined by means of an ontology of terms modelled by a set of proximity equations. The proposed classification process is based on the ability of a fuzzy extension of the standard Prolog language, named Bousi~Prolog, for flexible matching and knowledge representation. This declarative approach provides a text classifier which is fast and easy to build, as well as a classification process that is easy for the user to understand. The results of the experiment showed that the proposed method achieved a reasonably good performance.


IEEE International Conference on Fuzzy Systems | 2010

A category-based information filtering approach based on interval type 2 fuzzy sets

Francisco P. Romero; Jesús Serrano-Guerrero; José A. Olivas; Andrés Soto

Category-based information filtering is grounded in the representation of user preferences according to a set of categories of similar items. The use of type-1 fuzzy sets provides a good method to represent categories when only one static interpretation of them is considered. This representation is not enough when documents do not have the same meaning for two different users, because some degree of subjectivity is involved. On the other hand, type-2 fuzzy sets have been successfully applied to manage uncertainty more effectively than type-1 fuzzy sets in several environments. This paper presents a method to efficiently manage uncertainties in the filtering process in environments where there is a constant flow of new information (news, e-mail, etc.) and multiple users are involved. The proposed solution is based on an extension of the category-based filtering method using interval type-2 fuzzy sets to represent each category and the user preferences. Experimental results that illustrate the feasibility of this approach are provided.
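
To make the notion of an interval type-2 fuzzy set more concrete, here is a minimal Python sketch in which membership is an interval bounded by a lower and an upper membership function instead of a single number. The triangular shapes, the 0 to 10 relevance scale, and the class and parameter names are assumptions chosen for this example; the paper's actual category representation is not described in the abstract.

```python
from dataclasses import dataclass
from typing import Tuple


def triangular(x: float, a: float, b: float, c: float) -> float:
    """Triangular membership function with feet at a and c and peak at b."""
    if x <= a or x >= c:
        return 0.0
    if x == b:
        return 1.0
    if x < b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


@dataclass(frozen=True)
class IntervalType2Set:
    """Interval type-2 fuzzy set described by two triangular functions.

    Assumption: `lower` is nested inside `upper` and shares its peak, so the
    lower membership never exceeds the upper one.
    """
    lower: Tuple[float, float, float]
    upper: Tuple[float, float, float]

    def membership(self, x: float) -> Tuple[float, float]:
        """Return the (lower, upper) membership interval of x."""
        return (triangular(x, *self.lower), triangular(x, *self.upper))


# Hypothetical category on a 0-10 relevance scale.
category = IntervalType2Set(lower=(3.0, 5.0, 7.0), upper=(2.0, 5.0, 8.0))
print(category.membership(4.0))
```

The width of the returned interval reflects the uncertainty about how strongly an item belongs to the category, which is the property that a type-1 set, with its single membership value, cannot express.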


Intelligent Systems Design and Applications | 2009

CASTALIA: Architecture of a Fuzzy Metasearch Engine for Question Answering Systems

Jesús Serrano-Guerrero; José A. Olivas; Jesus A. Gallego; Francisco P. Romero; Andrés Soto

The goal of this paper is to present the architecture of a metasearch engine called Castalia, still under development, which includes several underlying Q&A systems. Metasearch engines usually manage typical search engines like Google or Yahoo, but in this case the encapsulation of Q&A systems poses new challenges that can be modeled with fuzzy logic, in addition to existing challenges such as the fuzzy modeling of temporal or causal questions.


Intelligent Systems Design and Applications | 2009

An Experiment About Using Copulative and Comparative Sentences as Constraining Relations

Andrés Soto; José A. Olivas; Francisco P. Romero; Jesús Serrano-Guerrero

Existing search engines and question-answering (QA) systems have made it possible to process large volumes of textual information. Current work on QA has mainly focused on answering two basic types of questions: factoid and definition questions. However, the capability to synthesize an answer to a query by drawing on bodies of information which reside in various parts of the knowledge base is not among the capabilities of those systems. In this paper, a system designed to infer query answers from a collection of propositions expressed in natural language is introduced. By means of a specific example, the paper outlines how the system handles such situations. This approach is based on the use of formal constraining relations that model copulative and comparative sentences. By combining those propositions with others contained in different knowledge bases and applying deduction rules, the desired answer could be obtained.


IFSA (2) | 2007

A Hybrid Model for Document Clustering Based on a Fuzzy Approach of Synonymy and Polysemy

Francisco P. Romero; Andrés Soto; José A. Olivas

A new model for document clustering is proposed in order to deal with conceptual aspects. To measure the degree of presence of a concept in a document (or even in a document collection), a concept frequency formula is introduced. This formula is based on new fuzzy formulas to calculate the synonymy and polysemy degrees between terms. To address several shortcomings of classical clustering algorithms, a hybrid model based on a soft approach is proposed. The clustering procedure is implemented by two connected and tailored algorithms with the aim of building a fuzzy-hierarchical structure. A fuzzy hierarchical clustering algorithm is used to determine an initial clustering, and the process is completed using an improved soft clustering algorithm. Experiments show that, using this model, clustering tends to perform better than the classical approach.


European Society for Fuzzy Logic and Technology Conference | 2007

Using Generalized Constraints and Protoforms to Deal with Adverbs

Andrés Soto; José A. Olivas; Manuel E. Prieto
