Ivan Lopez-Arevalo
CINVESTAV
Publication
Featured research published by Ivan Lopez-Arevalo.
Expert Systems With Applications | 2013
Ana B. Rios-Alvarado; Ivan Lopez-Arevalo; Victor Sosa-Sosa
Ontologies play a very important role in knowledge management and the Semantic Web, and their use has been exploited in many current applications. Ontologies are especially useful because they support the exchange and sharing of information. Ontology learning from text is the process of deriving high-level concepts and their relations. An important task in ontology learning from text is to obtain a set of representative concepts to model a domain and to organize them into a hierarchical structure (taxonomy) from unstructured information. In the process of building a taxonomy, the identification of hypernym/hyponym relations between terms is essential. How to automatically build an appropriate structure to represent the information contained in unstructured texts is a challenging task. This paper presents a novel method to obtain, from unstructured texts, representative concepts and their taxonomic relationships in a specific knowledge domain. The approach builds a concept hierarchy from a domain-specific corpus by using a clustering algorithm, a set of linguistic patterns, and additional contextual information extracted from the Web that improves the discovery of the most representative hypernym/hyponym relationships. A set of experiments was carried out using four different corpora. We evaluated the quality of the constructed taxonomies against gold-standard ontologies; the experiments show promising results.
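The pattern-based step can be illustrated with a minimal sketch: Hearst-style lexico-syntactic patterns matched against raw sentences yield candidate hypernym/hyponym pairs. The two patterns below are standard textbook examples, not the paper's actual pattern set, and the clustering and Web-context stages are omitted.

```python
import re

# Two classic Hearst-style patterns (illustrative; the paper's own set is larger).
PATTERN_SUCH_AS = re.compile(r"(\w+) such as (\w+)")      # "X such as Y"
PATTERN_AND_OTHER = re.compile(r"(\w+) and other (\w+)")  # "Y and other X"

def extract_hypernyms(sentence):
    """Return (hypernym, hyponym) pairs matched by the patterns."""
    pairs = []
    m = PATTERN_SUCH_AS.search(sentence)
    if m:
        pairs.append((m.group(1), m.group(2)))
    m = PATTERN_AND_OTHER.search(sentence)
    if m:
        pairs.append((m.group(2), m.group(1)))  # order flips for this pattern
    return pairs

print(extract_hypernyms("animals such as dogs"))    # [('animals', 'dogs')]
print(extract_hypernyms("dogs and other animals"))  # [('animals', 'dogs')]
```

In the full method, pairs like these are filtered and combined with clustering output before being placed into the taxonomy.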
Knowledge and Information Systems | 2007
Ivan Lopez-Arevalo; René Bañares-Alcántara; Arantza Aldea; A. Rodríguez-Martínez
An approach to improve the management of complexity during the redesign of technical processes is proposed. The approach consists of two abstract steps. In the first step, model-based reasoning is used to automatically generate alternative representations of an existing process at several levels of abstraction. In the second step, process alternatives are generated through the application of case-based reasoning. The key point of our framework is the modeling approach, which is an extension of the Multimodeling and Multilevel Flow Modeling methodologies. These, together with a systematic design methodology, are used to represent a process hierarchically, thus improving the identification of analogous equipment/sections from different processes. The hierarchical representation results in sets of equipment/sections organized according to their functions and intentions. A case-based reasoning system then retrieves, from a library of cases, equipment/sections similar to the one selected by the user. The final output is a set of equipment/sections ordered according to their similarity. Human intervention is necessary to adapt the most promising case within the original process.
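The retrieval step of the case-based reasoning stage can be sketched as a similarity-ranked lookup over a case library. The functional attributes and the Jaccard measure below are illustrative assumptions, not the paper's actual case representation.

```python
# Toy case-based retrieval: each case describes a unit of equipment by a set of
# functional attributes; similarity is Jaccard overlap with the query's functions.
def jaccard(a, b):
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)

def retrieve(query_functions, library):
    """Return cases ordered by decreasing similarity to the query."""
    return sorted(library,
                  key=lambda c: jaccard(query_functions, c["functions"]),
                  reverse=True)

library = [
    {"name": "reactor-A",   "functions": ["convert", "heat", "mix"]},
    {"name": "exchanger-B", "functions": ["heat", "cool"]},
    {"name": "column-C",    "functions": ["separate"]},
]
ranked = retrieve(["heat", "mix"], library)
print([c["name"] for c in ranked])  # ['reactor-A', 'exchanger-B', 'column-C']
```

The most promising retrieved case is then adapted by the user, as the paper notes.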
Intelligent Information Systems | 2013
Heidy M. Marin-Castro; Victor Sosa-Sosa; Jose F. Martinez-Trinidad; Ivan Lopez-Arevalo
The amount of information contained in databases available on the Web has grown explosively in recent years. This information, known as the Deep Web, is heterogeneous and dynamically generated by querying back-end (relational) databases through Web Query Interfaces (WQIs), a special type of HTML form. Accessing the information of the Deep Web is a great challenge because this information is usually not indexed by general-purpose search engines. Therefore, it is necessary to create efficient mechanisms to access, extract, and integrate the information contained in the Deep Web. Since WQIs are the only means of access to the Deep Web, their automatic identification plays an important role: it allows traditional search engines to increase their coverage and to reach interesting information not available on the indexable Web. The accurate identification of Deep Web data sources is a key issue in the information retrieval process. In this paper we propose a new strategy for the automatic discovery of WQIs. The proposal makes an adequate selection of HTML elements extracted from HTML forms, which are used in a set of heuristic rules that help to identify WQIs. The strategy uses machine learning algorithms to classify searchable (WQI) and non-searchable (non-WQI) HTML forms, applying a prototype selection algorithm that removes irrelevant or redundant data from the training set. The internal content of Web Query Interfaces was analyzed with the objective of identifying only those HTML elements that appear frequently and provide relevant information for WQI identification. For testing, we used three groups of datasets: two available at the UIUC repository and a new dataset, including advanced and simple query interfaces, that we created using a generic crawler supported by human experts. The experimental results show that the proposed strategy outperforms previously reported work.
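A minimal sketch of the heuristic-rule idea: parse the HTML elements of a form and apply simple rules over the element counts to decide whether the form looks like a query interface. The rules below are invented for illustration; the paper's actual rules and the machine-learning classifier built on top of them are more elaborate.

```python
from html.parser import HTMLParser

class FormFeatures(HTMLParser):
    """Count the HTML elements inside a form that the heuristic rules inspect."""
    def __init__(self):
        super().__init__()
        self.counts = {"text": 0, "select": 0, "password": 0, "file": 0}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "input":
            t = attrs.get("type", "text")  # HTML default input type is "text"
            if t in self.counts:
                self.counts[t] += 1
        elif tag == "select":
            self.counts["select"] += 1

def is_searchable(form_html):
    """Illustrative rule: query interfaces offer text/select fields and
    contain no password or file-upload fields (e.g. not a login form)."""
    p = FormFeatures()
    p.feed(form_html)
    c = p.counts
    return (c["text"] + c["select"] > 0) and c["password"] == 0 and c["file"] == 0

print(is_searchable('<form><input type="text"><select></select></form>'))  # True
print(is_searchable('<form><input type="password"></form>'))               # False
```

In the full strategy, counts like these become features for the classifier rather than a hard-coded decision.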
Computers & Chemical Engineering | 2004
A. Rodríguez-Martínez; Ivan Lopez-Arevalo; René Bañares-Alcántara; Arantza Aldea
The paper presents a proposal of a multi-model knowledge representation to be used within a retrofit methodology for chemical processes. The retrofit of an existing process is a complex and lengthy task. Therefore, a tool to support the steps of retrofit by reasoning about the existing process and the potential areas of improvement could be of great help. The use of structural, behavioural, functional and teleological models of units of equipment/devices of the process allows the designer to work with a combination of detailed and abstract information depending on the retrofit step. Our retrofit methodology consists of four steps: data extraction, analysis, modification and evaluation. The HYSYS ExtrAction Data (HEAD) and automatic hierarchical abstraction (AHA!) prototype systems have been implemented for the two initial steps. These systems have been applied to three case studies: the ammonia, acrylic acid and acetone processes.
Proceedings of the Third International Workshop on Keyword Search on Structured Data | 2012
Jaime I. Lopez-Veyna; Victor Sosa-Sosa; Ivan Lopez-Arevalo
Most of the information on the Web can currently be classified according to its structure into three forms: unstructured (plain text), semi-structured (XML files) and structured (tables in a relational database). Web search is currently the primary way to access massive amounts of information. Keyword search has also become an alternative for querying relational databases and XML documents, one that is simple for people familiar with Web search engines. There are several approaches to keyword search over relational databases, such as Steiner Trees, Candidate Networks and Tuple Units. However, these methods have some constraints. Computing Steiner Trees is an NP-hard problem; moreover, a real database can produce a large number of Steiner Trees, which are difficult to identify and index. The Candidate Network approach first needs to generate the candidate networks and then evaluate them to find the best answer. The problem is that for a keyword query the number of Candidate Networks can be very large, and finding a common join expression to evaluate all of them can require a large computational effort. Finally, the use of Tuple Units in their general conception produces very large structures that often store redundant information. To address these problems we propose a novel approach for keyword search over structured data (KESOSD). KESOSD models the structured information as graphs and proposes the use of a keyword-structure-aware index, called KSAI, that captures the implicit structural relationships of the information, producing fast and accurate search responses. We have conducted experiments, and the results show that KESOSD achieves high search efficiency and high accuracy for keyword search over structured data.
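The core indexing idea can be sketched as an inverted index over graph nodes: each node (e.g. a tuple) carries text, the index maps a keyword to the set of node ids containing it, and a query returns the nodes matching all keywords. KSAI additionally encodes the structural relationships between nodes, which this sketch omits.

```python
from collections import defaultdict

def build_index(nodes):
    """Map each keyword to the set of node ids whose text contains it."""
    index = defaultdict(set)
    for nid, text in nodes.items():
        for word in text.lower().split():
            index[word].add(nid)
    return index

def search(index, query):
    """Return node ids matching ALL query keywords (conjunctive search)."""
    postings = [index.get(w.lower(), set()) for w in query.split()]
    return set.intersection(*postings) if postings else set()

nodes = {1: "keyword search databases", 2: "XML documents", 3: "search XML"}
idx = build_index(nodes)
print(search(idx, "search XML"))  # {3}
```

Ranking the matched nodes (by structural proximity, for instance) is where a structure-aware index departs from this plain inverted index.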
International Conference on Electrical Engineering, Computing Science and Automatic Control | 2008
Dulce Aguilar-Lopez; Ivan Lopez-Arevalo; Victor Sosa-Sosa
This work describes a Web search approach that takes into account the semantic content of Web pages. By eliminating irrelevant Web pages, it reduces the time-consuming task of reviewing the results returned by current search engines. The proposed approach focuses on Web pages that are not defined with a Semantic Web structure (most current Web pages are in this format). The challenge is to extract the semantic content from heterogeneous, human-oriented Web pages. The approach integrates ontology structures, WordNet, and a hierarchical similarity measure to determine the relevance of a Web page.
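A hierarchical similarity measure of the kind the approach relies on can be sketched over a toy taxonomy: similarity decreases with the path length between two concepts through their lowest common ancestor. The taxonomy and the exact formula below are illustrative assumptions, not the paper's measure.

```python
# Hand-built toy taxonomy: child -> parent (WordNet plays this role in the paper).
PARENT = {
    "dog": "mammal", "cat": "mammal",
    "mammal": "animal", "bird": "animal",
    "animal": "entity",
}

def ancestors(term):
    """Path from a term up to the taxonomy root, inclusive."""
    path = [term]
    while term in PARENT:
        term = PARENT[term]
        path.append(term)
    return path

def similarity(a, b):
    """1 / (1 + edges between a and b via their lowest common ancestor)."""
    pa, pb = ancestors(a), ancestors(b)
    common = next(x for x in pa if x in pb)  # lowest common ancestor
    dist = pa.index(common) + pb.index(common)
    return 1 / (1 + dist)

print(similarity("dog", "cat"))   # 0.333... (2 edges via "mammal")
print(similarity("dog", "bird"))  # 0.25    (3 edges via "animal")
```

Closely related concepts (siblings like "dog" and "cat") score higher than distant ones, which is the property the relevance ranking exploits.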
Distributed Computing and Artificial Intelligence | 2009
Dulce Aguilar-Lopez; Ivan Lopez-Arevalo; Victor J. Sosa
The Web is a vast repository of available information, but its heterogeneity, size, and human-oriented semantics pose an obstacle to finding the desired information. Web search engines are a great help for accessing Web resources; nevertheless, their classification algorithms are still limited, since they only check for the presence of specific keywords or links and do not analyse the semantic content of the resources. In recent years, several works have aimed to turn the Web from an information space into a knowledge space by using shared conceptualizations. One way to achieve this is the use of ontologies. This paper proposes a methodology in which previously defined domain ontologies and the WordNet thesaurus are used to perform semantic searches, obtaining suitable Web page results that really belong to the expected domain.
Journal of Medical Systems | 2015
Ana B. Rios-Alvarado; Ivan Lopez-Arevalo; Edgar Tello-Leal; Victor Sosa-Sosa
Access to medical information (journals, blogs, web pages, dictionaries, and texts) has increased due to the availability of many digital media. In particular, finding an appropriate structure to represent the information contained in texts is not a trivial task. One of the structures for modeling knowledge is the ontology. An ontology is a conceptualization of a specific domain of knowledge. Ontologies are especially useful because they support the exchange and sharing of information as well as reasoning tasks. The use of ontologies in medicine has mainly focused on the representation and organization of medical terminologies. Ontology learning has emerged as a set of techniques for obtaining ontologies from unstructured information. This paper describes a new ontology learning approach that consists of a method for the acquisition of concepts and their corresponding taxonomic relations, in which the axioms disjointWith and equivalentClass are also learned from text without human intervention. The source of knowledge consists of files from the medical domain. Our approach is divided into two stages: the first discovers hierarchical relations, and the second extracts the axioms. Our automatic ontology learning approach shows better results compared with previous work, giving rise to more expressive ontologies.
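The axiom-extraction stage can be illustrated with a toy criterion over class instance sets: classes with identical instances are proposed as equivalentClass, and classes with no shared instances as disjointWith. The criterion and the medical class data below are illustrative, not the paper's actual method.

```python
def propose_axiom(instances_a, instances_b):
    """Propose an OWL-style axiom for two classes from their instance sets.
    Returns 'equivalentClass', 'disjointWith', or None (illustrative rule)."""
    a, b = set(instances_a), set(instances_b)
    if a == b:
        return "equivalentClass"
    if not (a & b):
        return "disjointWith"
    return None  # partial overlap: no axiom proposed

# Hypothetical medical classes for illustration only.
print(propose_axiom(["aspirin", "ibuprofen"], ["ibuprofen", "aspirin"]))  # equivalentClass
print(propose_axiom(["aspirin", "ibuprofen"], ["femur", "tibia"]))        # disjointWith
```

In practice such evidence would be gathered from text (e.g. co-occurrence and pattern statistics) rather than from explicit instance lists.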
Information Sciences | 2014
Jaime I. Lopez-Veyna; Victor Sosa-Sosa; Ivan Lopez-Arevalo
A new keyword-search technique is described that solves the problem of duplicate data in a Virtual Document approach. A complete keyword-based search engine architecture is presented, along with a reduction in indexing time and index size when applying the Virtual Document approach to large datasets. Keyword search has been recognised as a viable alternative for information search in semi-structured and structured data sources. Current state-of-the-art keyword-search techniques over relational databases do not take advantage of the correlative meta-information included in structured and semi-structured data sources, leaving relevant answers out. These techniques are also limited by scalability, performance, and precision issues that become evident when they are implemented on large datasets. Based on an in-depth analysis of the issues related to indexing and ranking semi-structured and structured information, we propose a new keyword-search algorithm that takes into account the semantic information extracted from the schemas of structured and semi-structured data sources and combines it with the textual relevance obtained by a common text retrieval approach. The algorithm is implemented in a keyword-based search engine called KESOSASD (Keyword Search Over Semi-structured and Structured Data), improving its precision and response time. Our approach models the semi-structured and structured information as graphs and makes use of a Virtual Document Structure-Aware Inverted Index (VDSAII). This index is created from a set of logical structures called Virtual Documents, which capture and exploit the implicit structural relationships (semantics) depicted in the schemas of the structured and semi-structured data sources. Extensive experiments were conducted to demonstrate that KESOSASD outperforms existing approaches in terms of search efficiency and accuracy. Moreover, KESOSASD is prepared to scale out and manage large databases without degrading its effectiveness.
International Conference on Electrical Engineering, Computing Science and Automatic Control | 2009
Isidra Ocampo-Guzman; Ivan Lopez-Arevalo; Victor Sosa-Sosa
This paper describes an approach to constructing ontologies from a text corpus. Using Latent Dirichlet Allocation, the topics that describe the documents contained in the corpus are identified. Each topic is formed by a set of terms whose semantic relatedness is determined by applying the distributional hypothesis, which considers as similar those terms that share a similar linguistic context. This context is described by the verbs the terms share. The concept described by each topic's terms is modeled through a taxonomy that describes the relations between them.
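The distributional step can be sketched by representing each term through the set of verbs it co-occurs with and comparing these verb contexts; the terms and contexts below are invented for illustration, and the LDA topic-extraction stage is not reproduced here.

```python
# Each term is represented by the verbs it co-occurs with in the corpus;
# two terms are similar when their verb contexts overlap (Jaccard measure).
def verb_overlap(ctx_a, ctx_b):
    a, b = set(ctx_a), set(ctx_b)
    return len(a & b) / len(a | b) if a | b else 0.0

# Hypothetical verb contexts extracted from a corpus.
contexts = {
    "doctor":  ["treats", "diagnoses", "prescribes"],
    "nurse":   ["treats", "assists"],
    "invoice": ["bills", "records"],
}
print(verb_overlap(contexts["doctor"], contexts["nurse"]))    # 0.25
print(verb_overlap(contexts["doctor"], contexts["invoice"]))  # 0.0
```

Terms with overlapping verb contexts ("doctor", "nurse") end up grouped under a common concept in the resulting taxonomy, while unrelated terms do not.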