Ana Freire
University of A Coruña
Publication
Featured research published by Ana Freire.
Current Proteomics | 2009
José M. Vázquez; Vanessa Aguiar; Jose A. Seoane; Ana Freire; Jose A. Serantes; Julian Dorado; Alejandro Pazos; Cristian R. Munteanu
The impact of cancer on society has created the need for new and faster theoretical models that may allow earlier cancer detection. The present review covers the prediction of cancer using the star graphs of protein sequences and proteome mass spectra to build Quantitative Protein-Disease Relationships (QPDRs), similar to Quantitative Structure-Activity Relationship (QSAR) models. The nodes of these star graphs are represented by the amino acids of each protein or by the amplitudes of the mass spectra signals, and the edges are the geometric and/or functional relationships between the nodes. The star graphs can be numerically described by invariant values named topological indices (TIs). The transformation of the star graphs (graphical representation) of proteins into TIs (numbers) facilitates the manipulation of protein information and the search for structure-function relationships in Proteomics. The advantages of this method include simplicity, fast calculations and free resources such as the S2SNet and MARCH-INSIDE tools. Thus, this ideal theoretical scheme can be easily extended to other types of diseases or even other fields, such as Genomics or Systems Biology.
web search and data mining | 2014
Ana Freire; Craig Macdonald; Nicola Tonellotto; Iadh Ounis; Fidel Cacheda
For many search settings, distributed/replicated search engines deploy a large number of machines to ensure efficient retrieval. This paper investigates how the power consumption of a replicated search engine can be automatically reduced when the system has low contention, without compromising its efficiency. We propose a novel self-adapting model to analyse the trade-off between latency and power consumption for distributed search engines. When query volumes are high and there is contention for the resources, the model automatically increases the number of active machines in the system to maintain acceptable query response times. On the other hand, when the load of the system is low and the queries can be served easily, the model is able to reduce the number of active machines, leading to power savings. The model bases its decisions on examining the current and historical query loads of the search engine. Our proposal is formulated as a general dynamic decision problem, which can be quickly solved by dynamic programming in response to changing query loads. Thorough experiments are conducted to validate the usefulness of the proposed adaptive model using historical Web search traffic submitted to a commercial search engine. Our results show that our proposed self-adapting model can achieve an energy saving of 33% while only degrading mean query completion time by 10 ms compared to a baseline that provisions replicas based on the previous day's traffic.
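The dynamic-programming formulation mentioned in this abstract can be illustrated with a toy sketch. It is not the paper's model: the cost function (a power cost per active replica, a penalty for load exceeding capacity, and a cost for switching replicas on or off) and all parameter names and values are assumptions made for illustration only.

```python
from typing import Dict, List, Tuple

def plan_replicas(
    loads: List[float],
    max_replicas: int,
    capacity: float,
    power_cost: float = 1.0,     # assumed cost of keeping one replica active for a slot
    overload_cost: float = 10.0, # assumed penalty per unit of load above capacity
    switch_cost: float = 0.5,    # assumed cost of turning one replica on or off
) -> Tuple[float, List[int]]:
    """Return (total_cost, replicas_per_slot) minimising the toy cost above.

    `loads` must be non-empty; each entry is the query load of one time slot.
    """
    def slot_cost(load: float, replicas: int) -> float:
        overload = max(0.0, load - replicas * capacity)
        return power_cost * replicas + overload_cost * overload

    options = range(1, max_replicas + 1)
    # best[r] = cheapest (cost, plan) that ends the current slot with r active replicas
    best: Dict[int, Tuple[float, List[int]]] = {
        r: (slot_cost(loads[0], r), [r]) for r in options
    }
    for load in loads[1:]:
        new_best: Dict[int, Tuple[float, List[int]]] = {}
        for r in options:
            # Cheapest way to arrive at r replicas from any previous replica count.
            prev_cost, prev_plan = min(
                ((c + switch_cost * abs(r - p), plan) for p, (c, plan) in best.items()),
                key=lambda item: item[0],
            )
            new_best[r] = (prev_cost + slot_cost(load, r), prev_plan + [r])
        best = new_best
    return min(best.values(), key=lambda item: item[0])

# Hypothetical load profile: quiet, quiet, busy, busy, quiet.
cost, plan = plan_replicas([10, 10, 40, 40, 10], max_replicas=4, capacity=10)
print(plan, cost)
```

With these made-up numbers the plan scales up for the two busy slots and back down afterwards, which is the behaviour the abstract describes.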
european conference on information retrieval | 2013
Ana Freire; Craig Macdonald; Nicola Tonellotto; Iadh Ounis; Fidel Cacheda
Search engines use replication and distribution of large indices across many query servers to achieve efficient retrieval. Under high query load, queries can be scheduled to replicas that are expected to be idle soonest, facilitated by the use of predicted query response times. However, the overhead of making response time predictions can hinder the usefulness of query scheduling under low query load. In this paper, we propose a hybrid scheduling approach that combines the scheduling methods appropriate for both low and high load conditions, and can adapt in response to changing conditions. We deploy a simulation framework, which is prepared with actual and predicted response times for real Web search queries for one full day. Our experiments using different numbers of shards and replicas of the 50 million document ClueWeb09 corpus show that hybrid scheduling can reduce the average waiting times of one day of queries by 68% under high load conditions and by 7% under low load conditions w.r.t. traditional scheduling methods.
european conference on information retrieval | 2009
Javier Parapar; Ana Freire; Álvaro Barreiro
Traditional retrieval models based on term matching are not effective in collections of degraded documents (the output of OCR or ASR systems, for instance). This paper presents an n-gram based distributed model for retrieval on large collections of degraded text. Evaluation was carried out with both the TREC Confusion Track and Legal Track collections, showing that the presented approach outperforms, in terms of effectiveness, the classical term-centred approach and most of the participating systems in the TREC Confusion Track.
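A small sketch can show why n-grams help with degraded text. It only illustrates character n-gram overlap and is not the paper's distributed model; the function names, the choice of trigrams, and the sample strings are assumptions.

```python
from typing import Set

def char_ngrams(text: str, n: int = 3) -> Set[str]:
    """Return the set of character n-grams of a lower-cased string."""
    text = text.lower()
    return {text[i:i + n] for i in range(len(text) - n + 1)}

def ngram_overlap(query: str, document: str, n: int = 3) -> float:
    """Fraction of the query's n-grams that also occur in the document."""
    query_grams = char_ngrams(query, n)
    if not query_grams:
        return 0.0
    return len(query_grams & char_ngrams(document, n)) / len(query_grams)

# Hypothetical OCR output in which "retrieval" was misrecognised as "retrievel".
document = "information retrievel systems"
query = "retrieval"

exact_term_match = query in document.split()  # whole-term matching misses the word
score = ngram_overlap(query, document)        # most trigrams still match
print(exact_term_match, round(score, 2))
```

Exact term matching finds nothing here, while the trigram overlap stays high because a single recognition error only damages the few n-grams that span it.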
international acm sigir conference on research and development in information retrieval | 2012
Ana Freire; Craig Macdonald; Nicola Tonellotto; Iadh Ounis; Fidel Cacheda
For increased efficiency, an information retrieval system can split its index into multiple shards, and then replicate these shards across many query servers. For each new query, an appropriate replica for each shard must be selected, such that the query is answered as quickly as possible. Typically, the replica with the lowest number of queued queries is selected. However, not every query takes the same time to execute, particularly if a dynamic pruning strategy is applied by each query server. Hence, the replica's queue length is an inaccurate indicator of the workload of a replica, and can result in inefficient usage of the replicas. In this work, we propose that improved replica selection can be obtained by using query efficiency prediction to measure the expected workload of a replica. Experiments are conducted using 2.2k queries, over various numbers of shards and replicas for the large GOV2 collection. Our results show that query waiting and completion times can be markedly reduced, showing that accurate response time predictions can improve scheduling accuracy and attesting the benefit of the proposed scheduling algorithm.
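The contrast between the two selection rules in this abstract can be sketched as follows. This is a minimal illustration, assuming each replica's queue is available as a list of predicted execution times; the function names and the timings are hypothetical and the code is not the authors' scheduler.

```python
from typing import List

def pick_by_queue_length(queues: List[List[float]]) -> int:
    """Baseline rule: choose the replica with the fewest queued queries."""
    return min(range(len(queues)), key=lambda i: len(queues[i]))

def pick_by_predicted_work(queues: List[List[float]]) -> int:
    """Prediction-based rule: choose the replica whose queued queries
    have the smallest total predicted execution time."""
    return min(range(len(queues)), key=lambda i: sum(queues[i]))

# Each inner list holds made-up predicted execution times (ms) of queued queries.
queues = [
    [900.0],             # replica 0: one slow query
    [50.0, 40.0, 30.0],  # replica 1: three fast queries
]

shortest = pick_by_queue_length(queues)      # replica 0 (1 query vs 3)
least_work = pick_by_predicted_work(queues)  # replica 1 (120 ms vs 900 ms)
print(shortest, least_work)
```

The two rules disagree here: the shorter queue hides a slow query, so a new query would wait longer on replica 0 than on replica 1.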
ambient intelligence | 2009
Vanessa Aguiar; Jose A. Seoane; Ana Freire; Cristian R. Munteanu
A new algorithm is presented for finding genotype-phenotype association rules from data related to complex diseases. The algorithm is based on Genetic Algorithms, a technique of Evolutionary Computation. It was compared to several traditional data mining techniques and was shown to obtain similar classification scores while finding more rules in artificially generated data. In this paper it is assumed that several groups of SNPs have an impact on the predisposition to develop a complex disease like schizophrenia. This is expected to be validated on real data in the near future.
International Journal of Electronic Healthcare | 2010
Alba Cabarcos; Tamara Sanchez; Jose A. Seoane; Vanessa Aguiar-Pulido; Ana Freire; Julian Dorado; Alejandro Pazos
Nowadays, medical practice needs, at the patient Point-of-Care (POC), personalised knowledge that can be adjusted at each moment to the clinical needs of each patient, in order to support decision-making processes while taking personalised information into account. To achieve this, the hospital information systems must be adapted. Thus, there is a need for computational developments capable of retrieving and integrating the large amount of biomedical information available today, managing the complexity and diversity of these systems. Hence, this paper describes a prototype which retrieves biomedical information from different sources, manages it to improve the results obtained and to reduce response time and, finally, integrates it so that it is useful for the clinician, providing all the information available about the patient at the POC. Moreover, it also uses tools which allow medical staff to communicate and share knowledge.
international acm sigir conference on research and development in information retrieval | 2015
Ana Freire
Web search engines have to deal with a huge increase of information, demanded by high incoming query traffic. This situation has driven companies to build large, geographically distributed data centres housing thousands of servers and consuming enormous amounts of electricity. At this scale, even minor efficiency improvements may result in large financial and power savings. This thesis represents a novel contribution to the state-of-the-art of Query Scheduling and Green Information Retrieval (Green IR), by assisting large-scale data centres to build more efficient and environmentally-friendly search engines. The main contributions of this work are the following. Query Scheduling: we introduce query efficiency predictors as suitable estimators to improve Query Scheduling. We estimate the processing time of the queries waiting in each query server and we calculate an approximate time that a new query must spend in each queue. Based on this estimation, the fastest query server is selected. Green IR: once we have developed new methods to improve the average response time of a search engine, we focus on reducing the power consumption of the whole system. This thesis proposes a mathematical model that establishes a trade-off between latency and power consumption. This model attempts to automatically adapt the number of active servers in the system based on the fluctuations of a daily query traffic flow. Queueing Theory: we prove the limitation of Queueing Theory models for estimating the latency in search engines. As a consequence, we develop our trade-off model by predicting the latency using historical data. Results show the good performance of this approach. IR evaluation: we attest that simulation platforms are suitable for IR experimentation. We support this conclusion by establishing an exhaustive analysis of the current IR evaluation platforms.
Archive | 2010
Vanessa Aguiar; Jose A. Seoane; Ana Freire; Ling Guo
Current Bioinformatics | 2015
Cristian R. Munteanu; Vanessa Aguiar-Pulido; Ana Freire; Marcos Martínez-Romero; Ana B. Porto-Pazos; Javier Pereira; Julian Dorado