Publication


Featured research published by Manuel Álvarez.


International Journal of Cancer | 2006

Cervical carcinoma and reproductive factors: Collaborative reanalysis of individual data on 16,563 women with cervical carcinoma and 33,542 women without cervical carcinoma from 25 epidemiological studies.

Thangarajan Rajkumar; Jack Cuzick; P. Appleby; R. Barnabas; Valerie Beral; A Berrington de González; D. Bull; K. Canfell; B. Crossley; J. Green; G. Reeves; S. Sweetland; Susanne K. Kjaer; R. Painter; Martin Vessey; Janet R. Daling; Margaret M. Madeleine; Roberta M. Ray; David B. Thomas; Rolando Herrero; Nathalie Ylitalo; F. X. Bosch; S de Sanjosé; Xavier Castellsagué; V. Moreno; D. Hammouda; E. Negri; G. Randi; Manuel Álvarez; O. Galdos

The International Collaboration of Epidemiological Studies of Cervical Cancer has combined individual data on 11,161 women with invasive carcinoma, 5,402 women with cervical intraepithelial neoplasia (CIN)3/carcinoma in situ and 33,542 women without cervical carcinoma from 25 epidemiological studies. Relative risks (RRs) and 95% confidence intervals (CIs) of cervical carcinoma in relation to number of full‐term pregnancies, and age at first full‐term pregnancy, were calculated conditioning by study, age, lifetime number of sexual partners and age at first sexual intercourse. Number of full‐term pregnancies was associated with a risk of invasive cervical carcinoma. After controlling for age at first full‐term pregnancy, the RR for invasive cervical carcinoma among parous women was 1.76 (95% CI: 1.53–2.02) for ≥7 full‐term pregnancies compared with 1–2. For CIN3/carcinoma in situ, no significant trend was found with increasing number of births after controlling for age at first full‐term pregnancy among parous women. Early age at first full‐term pregnancy was also associated with risk of both invasive cervical carcinoma and CIN3/carcinoma in situ. After controlling for number of full‐term pregnancies, the RR for first full‐term pregnancy at age <17 years compared with ≥25 years was 1.77 (95% CI: 1.42–2.23) for invasive cervical carcinoma, and 1.78 (95% CI: 1.26–2.51) for CIN3/carcinoma in situ. Results were similar in analyses restricted to high‐risk human papilloma virus (HPV)‐positive cases and controls. No relationship was found between cervical HPV positivity and number of full‐term pregnancies, or age at first full‐term pregnancy among controls. Differences in reproductive habits may have contributed to differences in cervical cancer incidence between developed and developing countries.


Data and Knowledge Engineering | 2008

Extracting lists of data records from semi-structured web pages

Manuel Álvarez; Alberto Pan; Juan Raposo; Fernando Bellas; Fidel Cacheda

Many web sources provide access to an underlying database containing structured data. These data can usually be accessed only in HTML form, which makes it difficult for software programs to obtain them in structured form. Nevertheless, web sources usually encode data records using a consistent template or layout, and the implicit regularities in the template can be used to automatically infer the structure and extract the data. In this paper, we propose a set of novel techniques to address this problem. While several previous works have addressed the same problem, most of them require multiple input pages, whereas our method requires only one. In addition, previous methods make some assumptions about how data records are encoded into web pages, which do not always hold in real websites. Finally, we have tested our techniques with a large number of real web sources and have found them to be very effective.


Proceedings of the IFIP TC8 / WG8.1 Working Conference on Engineering Information Systems in the Internet Context | 2002

Semi-Automatic Wrapper Generation for Commercial Web Sources

Alberto Pan; Juan Raposo; Manuel Álvarez; Justo Hidalgo; Ángel Viña

Semi-automatic wrapper generation tools aim to ease the task of building structured views over semi-structured web sources. However, the wrapper generation techniques presented to date cannot properly deal with sources that require complex navigational sequences to access data. In this paper, we present WARGO, a semi-automatic wrapper generation tool, which has been used by non-programmer staff to successfully wrap more than 700 commercial web sources in several industrial applications. We describe our approach to wrapper generation and show the difficulties other systems have in wrapping this kind of source.


PLOS ONE | 2014

Twitter: A Good Place to Detect Health Conditions

Víctor M. Prieto; Sérgio Matos; Manuel Álvarez; Fidel Cacheda; José Luís Oliveira

With the proliferation of social networks and blogs, the Internet is increasingly being used to disseminate personal health information rather than just as a source of information. In this paper we exploit the wealth of user-generated data, available through the micro-blogging service Twitter, to estimate and track the incidence of health conditions in society. The method is based on two stages: we start by extracting possibly relevant tweets using a set of specially crafted regular expressions, and then classify these initial messages using machine learning methods. Furthermore, we selected relevant features to improve the results and the execution times. To test the method, we considered four health states or conditions, namely flu, depression, pregnancy and eating disorders, and two locations, Portugal and Spain. We present the results obtained and demonstrate that the detection results and the performance of the method are improved after feature selection. The results are promising, with areas under the receiver operating characteristic curve between 0.7 and 0.9, and F-measure values around 0.8 and 0.9. This indicates that such an approach provides a feasible solution for measuring and tracking the evolution of health states within society.
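The two-stage method described in this abstract (regular-expression filtering followed by classification) could be sketched roughly as below. This is an illustration under stated assumptions, not the authors' implementation: the pattern `FLU_PATTERN`, the toy training set `TRAIN`, the labels, and the choice of a naive Bayes classifier are all hypothetical, and the feature-selection step mentioned in the abstract is omitted.

```python
import math
import re
from collections import Counter, defaultdict

# Stage 1 filter: a hypothetical keyword pattern, not the expressions used in the paper.
FLU_PATTERN = re.compile(r"\b(flu|influenza|fever)\b", re.IGNORECASE)

# Hypothetical labelled messages, for illustration only.
TRAIN = [
    ("i have the flu and feel awful", "personal"),
    ("stuck in bed with a fever today", "personal"),
    ("flu vaccine campaign starts next week", "news"),
    ("health ministry reports influenza statistics", "news"),
]


def tokenize(text):
    """Lowercase a message and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def filter_candidates(messages, pattern=FLU_PATTERN):
    """Stage 1: keep only messages that match the regular expression."""
    return [m for m in messages if pattern.search(m)]


class NaiveBayes:
    """Multinomial naive Bayes with add-one smoothing (an assumed classifier choice)."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            score = math.log(count / total)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for word in tokenize(text):
                if word in self.vocab:  # words never seen in training are ignored
                    score += math.log((self.word_counts[label][word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def detect(messages, model):
    """Stage 2: classify only the messages that survive the stage 1 filter."""
    return [(m, model.predict(m)) for m in filter_candidates(messages)]


model = NaiveBayes().fit([t for t, _ in TRAIN], [label for _, label in TRAIN])
results = detect(
    ["I feel awful, fever all day", "Nice weather today", "Ministry reports flu statistics"],
    model,
)
```

Filtering first keeps the classifier's workload small, which fits the abstract's concern with execution time. A realistic system would need far more training data and a proper evaluation, such as the ROC and F-measure figures the abstract reports.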


Database and Expert Systems Applications | 2002

The Wargo system: semi-automatic wrapper generation in presence of complex data access modes

Juan Raposo; Alberto Pan; Manuel Álvarez; Justo Hidalgo; Ángel Viña

Semi-automatic wrapper generation tools aim to ease the task of building structured views over Web sources. But the wrapper generation techniques presented to date show several weaknesses when dealing with the complex commercial Web sources of today, especially when constructing advanced navigational sequences for accessing data. We present Wargo, a semi-automatic wrapper generation tool, which has been used by non-programmer staff to successfully wrap more than 700 commercial Web sources in several industrial applications.


Urology | 2009

Safety of Active Surveillance Program for Recurrent Nonmuscle-invasive Bladder Carcinoma

V. Hernández; Manuel Álvarez; E. de la Peña; N. Amaruch; M.D. Martín; J.M. de la Morena; V. Gómez; C. Llorente

OBJECTIVES: To report our experience with a select group of patients with low-risk tumors included in an observation and monitoring program after the diagnosis of recurrence.
METHODS: We performed a prospective cohort study in patients diagnosed with recurrent, nonmuscle-invasive bladder cancer maintained under an active surveillance protocol. The inclusion criteria were papillary tumors with negative cytology findings, previous nonmuscle-invasive tumor (Stage pTa, pT1a), grade 1-2, size <1 cm, and number of tumors <5. No symptomatic patients or those with carcinoma in situ or grade 3 tumors were included. A retrospective analysis of a control group of patients with clinical characteristics similar to those of the patients on active surveillance, but who underwent transurethral resection immediately after the recurrence was diagnosed, was also performed.
RESULTS: The data from 64 patients (70 observation events) were analyzed. The mean patient age was 66.7 years. The median follow-up was 38.6 months. The median time patients remained in observation was 10.3 months. The tumor histologic features before observation were Stage pTa in 77.1%, Stage pT1a in 22.9%, grade 1 in 67.1%, and grade 2 in 23%. After 10.3 months, 93.5% of the patients had not progressed in stage and 83.8% had not progressed in grade. None of the patients experienced progression to muscle-invasive disease. A comparison between the rates of progression in the study and control groups showed no statistically significant difference.
CONCLUSIONS: Patients with recurrent, small (<1 cm), nonmuscle-invasive bladder tumors can be safely offered monitoring under an active surveillance protocol, with a minimal risk of progression in either grade or stage, thus reducing the amount of surgical intervention they might undergo throughout their life.


Very Large Data Bases | 2002

The Denodo data integration platform

Alberto Pan; Juan Raposo; Manuel Álvarez; Paula Montoto; Vicente Orjales; Justo Hidalgo; Lucía Ardao; Anastasio Molano; Ángel Viña

The world today is characterised by the proliferation of information sources available through media such as the WWW, databases, semi-structured files (e.g. XML documents), etc. Nevertheless, this information is usually scattered, heterogeneous and weakly structured, so it is difficult to process it automatically. DENODO Corporation has developed a mediator system for the construction of semi-structured and structured data integration applications. This system has already been used in the construction of several applications on the Internet and in corporate environments, which are currently deployed at several important Internet audience sites and large business corporations. In this extended abstract, we present an overview of the system and put forward some conclusions arising from our experience in building real-world data integration applications, focusing on some challenges we believe require more attention from the research community.


Web Age Information Management | 2006

Crawling web pages with support for client-side dynamism

Manuel Álvarez; Alberto Pan; Juan Raposo; Justo Hidalgo

There is a large amount of information on the web that cannot be accessed by conventional crawler engines. This portion of the web is usually known as the Hidden Web. To deal with this problem, it is necessary to solve two tasks: crawling the client-side and crawling the server-side hidden web. In this paper we present an architecture and a set of related techniques for accessing the information placed in web pages with support for client-side dynamism, dealing with aspects such as JavaScript technology, non-standard session maintenance mechanisms, client redirections, pop-up menus, etc. Our approach leverages current browser APIs and implements novel crawling models and algorithms.


Signal Processing Systems | 2010

Finding and Extracting Data Records from Web Pages

Manuel Álvarez; Alberto Pan; Juan Raposo; Fernando Bellas; Fidel Cacheda

Many HTML pages are generated by software programs that query an underlying database and then fill in a template with the data. In these situations the metainformation about the data structure is lost, so automated software programs cannot process these data as powerfully as they can process information from databases. We propose a set of novel techniques for detecting structured records in a web page and extracting the data values that constitute them. Our method needs only one input page. It starts by identifying the data region of interest in the page. This region is then partitioned into records by using a clustering method that groups similar subtrees in the DOM tree of the page. Finally, the attributes of the data records are extracted by using a method based on multiple string alignment. We have tested our techniques with a large number of real web sources, obtaining high precision and recall values.
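The record-partitioning step described in this abstract (grouping similar subtrees of the DOM tree) could be illustrated with the sketch below. It is not the paper's algorithm: the similarity measure (a `difflib` ratio over pre-order tag sequences), the greedy clustering, the 0.8 threshold, and the sample `PAGE` are all assumptions made for illustration. The data region is chosen by hand here, and the attribute-extraction step based on multiple string alignment is not shown.

```python
from difflib import SequenceMatcher
from html.parser import HTMLParser

VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}

# Hypothetical sample page: three records plus one navigation item.
PAGE = """
<ul>
  <li><b>Book A</b><span>10</span></li>
  <li><b>Book B</b><span>12</span><i>sale</i></li>
  <li><b>Book C</b><span>9</span></li>
  <li><a>Next page</a></li>
</ul>
"""


class Node:
    """Minimal DOM node that records only the tag structure."""

    def __init__(self, tag, parent=None):
        self.tag, self.parent, self.children = tag, parent, []

    def signature(self):
        """Pre-order list of the tag names in this subtree."""
        tags = [self.tag]
        for child in self.children:
            tags.extend(child.signature())
        return tags


class TreeBuilder(HTMLParser):
    """Builds a Node tree from HTML, ignoring text and attributes."""

    def __init__(self):
        super().__init__()
        self.root = Node("#root")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, self.current)
        self.current.children.append(node)
        if tag not in VOID_TAGS:  # void elements never contain children
            self.current = node

    def handle_endtag(self, tag):
        # Walk up to the matching open element; stray end tags are ignored.
        node = self.current
        while node is not self.root and node.tag != tag:
            node = node.parent
        if node is not self.root:
            self.current = node.parent


def parse(html):
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    return builder.root


def find(node, tag):
    """Depth-first search for the first descendant with the given tag."""
    for child in node.children:
        if child.tag == tag:
            return child
        found = find(child, tag)
        if found is not None:
            return found
    return None


def similarity(a, b):
    """Similarity of two subtrees, measured on their tag sequences."""
    return SequenceMatcher(None, a.signature(), b.signature()).ratio()


def cluster_children(region, threshold=0.8):
    """Greedy clustering: a subtree joins the first cluster whose first member is similar enough."""
    clusters = []
    for child in region.children:
        for cluster in clusters:
            if similarity(cluster[0], child) >= threshold:
                cluster.append(child)
                break
        else:
            clusters.append([child])
    return clusters


region = find(parse(PAGE), "ul")  # data region picked by hand in this sketch
clusters = cluster_children(region)
```

In this toy page the three book items land in one cluster despite the optional `<i>` element, while the navigation item forms its own cluster. The largest cluster would then be the candidate list of records.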


IEEE International Conference on E-Commerce Technology for Dynamic E-Business | 2004

Client-side deep Web data extraction

Manuel Álvarez; Alberto Pan; Juan Raposo; Ángel Viña

The problem of data extraction from the deep Web can be divided into two tasks: crawling the client-side and the server-side deep Web. The objective is to define an architecture and a set of related techniques to access the information placed in the client-side deep Web. This involves dealing with aspects such as JavaScript technology, nonstandard session maintenance mechanisms, client redirections, pop-up menus, etc. We use current browser APIs as building blocks and leverage them to implement novel crawling models and algorithms.

Collaboration


Top Co-Authors

Alberto Pan

University of A Coruña

Juan Raposo

University of A Coruña

Ángel Viña

University of A Coruña

José Losada

University of A Coruña
