Nicola Tonellotto

Istituto di Scienza e Tecnologie dell'Informazione

Publication

Featured research published by Nicola Tonellotto.

European Conference on Parallel Processing | 2005

A grid information service based on peer-to-peer

Diego Puppin; Stefano Moncelli; Ranieri Baraglia; Nicola Tonellotto; Fabrizio Silvestri

Information Services are fundamental building blocks of the Grid infrastructure. They are responsible for collecting and distributing information about resource availability and status to users: the quality of these data may have a strong impact on scheduling algorithms and overall performance. Many popular information services have a centralized structure. This clearly introduces problems related to information updating and fault tolerance. Also, in very large configurations, scalability may be an issue. In this work, we present a Grid Information Service based on peer-to-peer technology. Our system offers fast propagation of information and has high scalability and reliability. We implemented our system in compliance with the OGSA standard using the Globus Toolkit 3. Our system can run on Linux and Windows systems, with different network configurations, so as to trade off redundancy (reliability) against cost.

International ACM SIGIR Conference on Research and Development in Information Retrieval | 2012

Learning to predict response times for online query scheduling

Craig Macdonald; Nicola Tonellotto; Iadh Ounis

Dynamic pruning strategies permit efficient retrieval by not fully scoring all postings of the documents matching a query -- without degrading the retrieval effectiveness of the top-ranked results. However, the amount of pruning achievable for a query can vary, resulting in queries taking different amounts of time to execute. Knowing in advance the execution time of queries would permit the exploitation of online algorithms to schedule queries across replicated servers in order to minimise the average query waiting and completion times. In this work, we investigate the impact of dynamic pruning strategies on query response times, and propose a framework for predicting the efficiency of a query. Within this framework, we analyse the accuracy of several query efficiency predictors across 10,000 queries submitted to in-memory inverted indices of a 50-million-document Web crawl. Our results show that combining multiple efficiency predictors with regression can accurately predict the response time of a query before it is executed. Moreover, using the efficiency predictors to facilitate online scheduling algorithms can result in a 22% reduction in the mean waiting time experienced by queries before execution, and a 7% reduction in the mean completion time experienced by users.
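
The scheduling idea in this abstract can be illustrated with a minimal sketch. It is not the paper's scheduler: it assumes that a predicted response time is already available for each query, that all queries arrive at the same instant, and that each query is sent to the replica with the smallest predicted backlog. The names `dispatch` and `round_robin` are hypothetical.

```python
import heapq

def dispatch(predicted_times, n_servers):
    """Send each query, in arrival order, to the replica with the least predicted backlog.

    Returns (assignments, waits); waits[i] is the predicted queueing delay of query i.
    """
    heap = [(0.0, s) for s in range(n_servers)]  # (predicted backlog, server id)
    heapq.heapify(heap)
    assignments, waits = [], []
    for t in predicted_times:
        backlog, server = heapq.heappop(heap)
        assignments.append(server)
        waits.append(backlog)
        heapq.heappush(heap, (backlog + t, server))
    return assignments, waits

def round_robin(predicted_times, n_servers):
    """Baseline that ignores predictions and cycles through the replicas."""
    backlog = [0.0] * n_servers
    waits = []
    for i, t in enumerate(predicted_times):
        s = i % n_servers
        waits.append(backlog[s])
        backlog[s] += t
    return waits

times = [4.0, 1.0, 1.0, 1.0]          # one slow query followed by three fast ones
_, informed = dispatch(times, 2)
baseline = round_robin(times, 2)
print(sum(informed) / len(times), sum(baseline) / len(times))
```

In this toy workload the prediction-aware dispatcher keeps the fast queries off the replica that is busy with the slow one, which lowers the mean predicted waiting time relative to round-robin.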

ACM Transactions on Information Systems | 2011

Upper-bound approximations for dynamic pruning

Craig Macdonald; Iadh Ounis; Nicola Tonellotto

Dynamic pruning strategies for information retrieval systems can increase querying efficiency without decreasing effectiveness by using upper bounds to safely omit scoring documents that are unlikely to make the final retrieved set. Often, such upper bounds are pre-calculated at indexing time for a given weighting model. However, this precludes changing, adapting or training the weighting model without recalculating the upper bounds. Instead, upper bounds should be approximated at querying time from various statistics of each term to allow on-the-fly adaptation of the applied retrieval strategy. This article, by using uniform notation, formulates the problem of determining a term upper-bound given a weighting model and discusses the limitations of existing approximations. Moreover, we propose an upper-bound approximation using a constrained nonlinear maximization problem. We prove that our proposed upper-bound approximation does not impact the retrieval effectiveness of several modern weighting models from various different families. We also show the applicability of the approximation for the Markov Random Field proximity model. Finally, we empirically examine how the accuracy of the upper-bound approximation impacts the number of postings scored and the resulting efficiency in the context of several large Web test collections.

International ACM SIGIR Conference on Research and Development in Information Retrieval | 2015

QuickScorer: A Fast Algorithm to Rank Documents with Additive Ensembles of Regression Trees

Claudio Lucchese; Franco Maria Nardini; Salvatore Orlando; Raffaele Perego; Nicola Tonellotto; Rossano Venturini

Learning-to-Rank models based on additive ensembles of regression trees have proven to be very effective for ranking query results returned by Web search engines, a scenario where quality and efficiency requirements are very demanding. Unfortunately, the computational cost of these ranking models is high. Thus, several works have already proposed solutions aimed at improving the efficiency of the scoring process by exploiting features and peculiarities of modern CPUs and memory hierarchies. In this paper, we present QuickScorer, a new algorithm that adopts a novel bitvector representation of the tree-based ranking model, and performs an interleaved traversal of the ensemble by means of simple logical bitwise operations. The performance of the proposed algorithm is unprecedented, due to its cache-aware approach, both in terms of data layout and access patterns, and to a control flow that entails very low branch misprediction rates. The experiments on real Learning-to-Rank datasets show that QuickScorer is able to achieve speedups over the best state-of-the-art baseline ranging from 2x to 6.5x.
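
The bitvector idea can be sketched in a few lines of Python. This is a simplified illustration, not the authors' implementation: it assumes every internal node tests `x[f] <= threshold`, that leaves are numbered left to right starting at 0, and that each node carries a precomputed mask clearing the leaves of its left subtree. The layout of `features`, `n_leaves`, and `leaf_values` is hypothetical, and none of the cache-aware data layout described in the abstract is modelled.

```python
def quickscorer_score(x, features, n_leaves, leaf_values):
    """Score document x with a tree ensemble using per-tree leaf bitvectors.

    features[f] lists (threshold, tree_id, mask) for every node testing feature f,
    sorted by ascending threshold. Bit i of a bitvector stands for leaf i.
    """
    # Initially every leaf of every tree is a candidate exit leaf.
    v = [(1 << n) - 1 for n in n_leaves]
    for f, nodes in enumerate(features):
        for threshold, tree, mask in nodes:
            if x[f] <= threshold:
                break  # this test and all later ones on feature f are true
            v[tree] &= mask  # false node: its left subtree cannot hold the exit leaf
    score = 0.0
    for tree, bits in enumerate(v):
        exit_leaf = (bits & -bits).bit_length() - 1  # index of the leftmost surviving leaf
        score += leaf_values[tree][exit_leaf]
    return score

# Tree 0: root tests x[0] <= 0.5 (left: leaf 0), right child tests x[1] <= 2.0 (leaves 1, 2).
# Tree 1: root tests x[1] <= 1.0 (leaves 0, 1).
features = [
    [(0.5, 0, 0b110)],
    [(1.0, 1, 0b10), (2.0, 0, 0b101)],
]
n_leaves = [3, 2]
leaf_values = [[1.0, 2.0, 3.0], [0.1, 0.2]]
print(quickscorer_score([0.7, 1.5], features, n_leaves, leaf_values))
```

Note that nodes are visited feature by feature across all trees rather than tree by tree, which is the interleaved traversal the abstract refers to. A false node that lies off the true root-to-leaf path may still clear bits, but the leftmost surviving bit still identifies the correct exit leaf.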

Joint Workshop on Making Grids Works | 2008

Behavioural skeletons for component autonomic management on grids

Marco Aldinucci; Sonia Campa; Marco Danelutto; Patrizio Dazzi; Domenico Laforenza; Nicola Tonellotto; Peter Kilpatrick

We present behavioural skeletons for the CoreGRID Component Model, which are an abstraction aimed at simplifying the development of GCM-based self-management applications. Behavioural skeletons abstract component self-management in component-based design as design patterns abstract class design in classic OO development. Since we only wish to introduce the behavioural skeleton framework here, emphasis is placed on general skeleton structure rather than on their autonomic management policies.

Archive | 2007

A Proposal for a Generic Grid Scheduling Architecture

Nicola Tonellotto; Ramin Yahyapour; Philipp Wieder

In the past years, many Grids have been deployed and have become commodity systems in production environments. While several Grid scheduling systems have already been implemented, they still provide only “ad hoc” and domain-specific solutions to the problem of scheduling resources in a Grid, and no common and generic Grid scheduling system has emerged yet. In this work we identify generic features of three common Grid scheduling scenarios, and we introduce a single entity called scheduling instance that can be used as a building block for the scheduling solutions presented. We identify the behaviour that a scheduling instance must exhibit in order to be composed with other instances, and we describe its interactions with other Grid services. This work can be used as a foundation for designing common Grid scheduling infrastructures.

Personal Satellite Services | 2013

Performance Evaluation of SPDY over High Latency Satellite Channels

Andrea Cardaci; Luca Caviglione; Alberto Gotta; Nicola Tonellotto

Originally developed by Google, SPDY is an open protocol for reducing download times of content-rich pages, as well as for managing channels characterized by large Round Trip Times (RTTs) and high packet losses. With such features, it could be an efficient solution to cope with performance degradations of Web 2.0 services used over satellite networks. In this perspective, this paper evaluates the SPDY protocol over a wireless access that also exploits a satellite link. To this aim, we implemented an experimental set-up composed of an SPDY proxy, a wireless link emulator, and an instrumented Web browser. Results confirm that SPDY can enhance performance in terms of throughput, and reduce traffic fragmentation. Moreover, owing to its connection multiplexing architecture, it can also mitigate the transport layer complexity, which is critical in the presence of middleboxes deployed to isolate satellite trunks.

Future Generation Computer Systems | 2006

HPC Application Execution on Grids

Marco Danelutto; Marco Vanneschi; Corrado Zoccolo; Nicola Tonellotto; Salvatore Orlando; Ranieri Baraglia; Tiziano Fagni; Domenico Laforenza; Alessandro Paccosi

Research has demonstrated that many applications can benefit from the Grid infrastructure. This benefit is somewhat weakened by the fact that writing Grid applications, as well as porting existing ones to the Grid, is a difficult and often tedious and error-prone task. Our approach intends to automate the common tasks needed to start Grid applications, in order to allow the largest possible user community to gain the full benefits of the Grid. This approach, combined with the adoption of high-level programming tools, can greatly simplify the task of writing and deploying Grid applications.

Web Search and Data Mining | 2015

Optimal Space-time Tradeoffs for Inverted Indexes

Giuseppe Ottaviano; Nicola Tonellotto; Rossano Venturini

Inverted indexes are usually represented by dividing posting lists into constant-sized blocks and representing them with an encoder for sequences of integers. Different encoders yield different points on the space-time trade-off curve, with the fastest being several times larger than the most space-efficient. An important design decision for an index is thus the choice of the fastest encoding method such that the index fits in the available memory. However, a better usage of the space budget could be obtained by using faster encoders for frequently accessed blocks, and more space-efficient ones for those that are rarely accessed. To perform this choice optimally, we introduce a linear time algorithm that, given a query distribution and a set of encoders, selects the best encoder for each index block to obtain the lowest expected query processing time respecting a given space constraint. To demonstrate the effectiveness of this approach we perform an extensive experimental analysis, which shows that our algorithm produces indexes which are significantly faster than single-encoder indexes under several query processing strategies, while respecting the same space constraints.
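
The optimisation problem described here can be made concrete with a small sketch. It is deliberately not the linear time algorithm of the paper: it enumerates every assignment of encoders to blocks, which is exponential and only workable for toy inputs. It also assumes, for simplicity, that each encoder has one fixed size and one per-access decoding cost for every block; the function name and the encoder figures are hypothetical.

```python
from itertools import product

def choose_encoders(block_freqs, encoders, space_budget):
    """Exhaustively pick one encoder per block to minimise expected decoding time.

    block_freqs[i] is how often block i is accessed under the query distribution;
    encoders maps a name to (size_per_block, time_per_access).
    """
    best_choice, best_time = None, float("inf")
    for choice in product(encoders, repeat=len(block_freqs)):
        space = sum(encoders[name][0] for name in choice)
        if space > space_budget:
            continue  # violates the space constraint
        time = sum(freq * encoders[name][1] for freq, name in zip(block_freqs, choice))
        if time < best_time:
            best_choice, best_time = choice, time
    return best_choice, best_time

encoders = {"fast": (8, 1), "compact": (4, 3)}   # illustrative (size, time) pairs
print(choose_encoders([10, 1, 1], encoders, space_budget=16))
```

With a budget of 16, an all-`fast` index (size 24) does not fit, so a single-encoder index would have to use `compact` everywhere at an expected cost of 36. Spending the spare space on the one hot block reduces the expected cost to 16 within the same budget.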

Conference on Information and Knowledge Management | 2013

Load-sensitive selective pruning for distributed search

Daniele Broccolo; Craig Macdonald; Salvatore Orlando; Iadh Ounis; Raffaele Perego; Fabrizio Silvestri; Nicola Tonellotto

A search engine infrastructure must be able to provide the same quality of service to all queries received during a day. During normal operating conditions, the demand for resources is considerably lower than under peak conditions, yet an oversized infrastructure would result in an unnecessary waste of computing power. A possible solution adopted in this situation might consist of defining a maximum threshold processing time for each query, and dropping queries for which this threshold elapses, leading to disappointed users. In this paper, we propose and evaluate a different approach, where, given a set of different query processing strategies with differing efficiency, each query is considered by a framework that sets a maximum query processing time and selects which processing strategy is the best for that query, such that the processing time for all queries is kept below the threshold. The processing time estimates used by the scheduler are learned from past queries. We experimentally validate our approach on 10,000 queries from a standard TREC dataset with over 50 million documents, and we compare it with several baselines. These experiments encompass testing the system under different query loads and different maximum tolerated query response times. Our results show that, at the cost of a marginal loss in terms of response quality, our search system is able to answer 90% of queries within half a second during times of high query volume.
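
A minimal sketch of the per-query selection step follows. It is an assumption-laden simplification rather than the framework evaluated in the paper: predicted processing times are taken as given, the strategies are assumed to be listed from most to least effective, and the strategy names are hypothetical.

```python
def select_strategy(predicted_times, time_budget):
    """Pick the most effective strategy whose predicted time fits the budget.

    predicted_times is a list of (strategy_name, predicted_seconds),
    ordered from most to least effective.
    """
    for name, seconds in predicted_times:
        if seconds <= time_budget:
            return name
    # Nothing fits: fall back to the fastest strategy rather than dropping the query.
    return min(predicted_times, key=lambda p: p[1])[0]

strategies = [
    ("exhaustive", 0.9),
    ("safe_pruning", 0.4),
    ("aggressive_pruning", 0.1),
]
print(select_strategy(strategies, time_budget=0.5))
```

Under light load the remaining budget is large and the most effective strategy is chosen; as queueing delay eats into the budget, the selector degrades to cheaper strategies instead of timing out.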