Elio Masciari
Indian Council of Agricultural Research
Publication
Featured research published by Elio Masciari.
IEEE Transactions on Knowledge and Data Engineering | 2005
Sergio Flesca; Giuseppe Manco; Elio Masciari; Luigi Pontieri; Andrea Pugliese
Because of the widespread diffusion of semistructured data in XML format, much research effort is currently devoted to supporting the storage and retrieval of large collections of such documents. XML documents can be compared as to their structural similarity, in order to group them into clusters so that different storage, retrieval, and processing techniques can be effectively exploited. In this scenario, an efficient and effective similarity function is the key to a successful data management process. We present an approach for detecting structural similarity between XML documents which significantly differs from standard methods based on graph-matching algorithms, and allows a significant reduction of the required computation costs. Our proposal roughly consists of linearizing the structure of each XML document by representing it as a numerical sequence and then comparing such sequences through the analysis of their frequencies. First, some basic strategies for encoding a document are proposed, which can focus on diverse structural facets. Moreover, the theory of the discrete Fourier transform is exploited to effectively and efficiently compare the encoded documents (i.e., signals) in the frequency domain. Experimental results reveal the effectiveness of the approach, also in comparison with standard methods.
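The idea of encoding a document as a numerical sequence and comparing sequences in the frequency domain can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and not the encoding or distance from the paper: the function names (encode_tags, dft_magnitudes, structural_distance), the integer code given to each tag, the zero-padding used to align signals of different lengths, and the Euclidean distance between magnitude spectra.

```python
import cmath
import math
import re


def encode_tags(xml_text, codes):
    """Turn the tag structure into a signal (hypothetical encoding).

    Each tag name gets a positive integer from the shared `codes` dict;
    opening tags emit +code, closing tags emit -code. Text content is ignored.
    """
    signal = []
    for closing, name in re.findall(r"<(/?)([A-Za-z_][\w.-]*)[^>]*>", xml_text):
        code = codes.setdefault(name, len(codes) + 1)
        signal.append(-code if closing else code)
    return signal


def dft_magnitudes(signal, n):
    """Magnitudes of the n-point DFT of `signal`, zero-padded to length n."""
    padded = list(signal) + [0] * (n - len(signal))
    return [
        abs(sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, x in enumerate(padded)))
        for k in range(n)
    ]


def structural_distance(xml_a, xml_b):
    """Euclidean distance between the magnitude spectra of two documents."""
    codes = {}  # shared so the same tag gets the same code in both documents
    a = encode_tags(xml_a, codes)
    b = encode_tags(xml_b, codes)
    n = max(len(a), len(b), 1)
    spec_a = dft_magnitudes(a, n)
    spec_b = dft_magnitudes(b, n)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(spec_a, spec_b)))


doc1 = "<a><b>x</b><b>y</b></a>"
doc2 = "<a><b>other</b><b>text</b></a>"   # same structure, different content
doc3 = "<a><c/><c/><c/></a>"              # different structure
print(structural_distance(doc1, doc2))
print(structural_distance(doc1, doc3))
```

Documents with the same tag structure produce identical signals and therefore a distance of zero, regardless of their text content. The naive DFT above is quadratic in the signal length; a real implementation would presumably use a fast Fourier transform.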
very large data bases | 2003
Sergio Flesca; Filippo Furfaro; Elio Masciari
XML queries are usually expressed by means of XPath expressions identifying portions of the selected documents. An XPath expression defines a way of navigating an XML tree and returns the set of nodes which are reachable from one or more starting nodes through the paths specified by the expression. The problem of efficiently answering XPath queries has recently received increasing attention from the research community. In particular, an increasing effort has been devoted to defining effective optimization techniques for XPath queries. One of the main issues related to the optimization of XPath queries is their minimization. The minimization of XPath queries has been studied for limited fragments of XPath, containing only the descendant, the child and the branch operators. In this work, we address the problem of minimizing XPath queries for a more general fragment, containing also the wildcard operator. We characterize the complexity of the minimization of XPath queries, stating that it is NP-hard, and propose an algorithm for computing minimum XPath queries. Moreover, we identify an interesting tractable case and propose an ad hoc algorithm handling the minimization of this kind of query in polynomial time.
pervasive computing and communications | 2005
Alfredo Cuzzocrea; Filippo Furfaro; Sergio Greco; Elio Masciari; Giuseppe M. Mazzeo; Domenico Saccà
A distributed system for approximate query answering on sensor network data is proposed, where a suitable compression technique is exploited to represent data and support query answering. Each node of the system stores either detailed or summarized sensor readings. Query answers are computed by identifying the set of nodes that contain data (compressed or not) involved in the query, and possibly partitioning the query into a set of sub-queries to be evaluated at different nodes. Queries are partitioned according to a cost model that aims to make the evaluation efficient while guaranteeing the desired degree of accuracy of query answers.
international conference on tools with artificial intelligence | 2002
Giuseppe Manco; Elio Masciari; Andrea Tagarelli
We introduce a technique based on data mining algorithms for classifying incoming messages, as a basis for an overall architecture for maintenance and management of e-mail messages. We exploit clustering techniques for grouping structured and unstructured information extracted from e-mail messages in an unsupervised way, and exploit the resulting algorithm in the process of folder creation (and maintenance) and e-mail redirection. Some initial experimental results show the effectiveness of the technique, both from an efficiency and a quality-of-results viewpoint.
data and knowledge engineering | 2003
Sergio Flesca; Elio Masciari
In this paper we present a new technique for detecting changes in Web documents. The technique is based on a new method to measure the similarity of two documents, which represent the current and the previous versions of the monitored page. The technique has been effectively used to discover changes in selected portions of the original document. The proposed technique has been implemented in the CMW system, which provides a change monitoring service on the Web. The main features of CMW are the detection of changes on selected portions of Web documents and the possibility to express complex queries on the changed information. For instance, a query can check whether the value of a given stock has increased by more than 10%. Several tests on stock exchange and auction Web pages proved the effectiveness of the proposed approach.
acm symposium on applied computing | 2007
Elio Masciari
Radio Frequency Identification (RFID) applications are emerging as key components in object tracking and supply chain management systems. In the near future, almost every major retailer will use RFID systems to track the shipment of products from suppliers to warehouses. Given the nature of RFID readings, such systems will generate a huge amount of information once costs fall to a level at which each individual item can be tagged, leaving a trail of data as it moves through different locations. We define a technique for efficiently detecting anomalous data in order to prevent problems related to inefficient shipment or fraudulent actions. Since items usually move together in large groups through distribution centers, and only in stores do they move in smaller groups, we exploit this feature in the design of our technique. Preliminary experiments show the effectiveness of our approach.
international database engineering and applications symposium | 2007
Elio Masciari
Radio frequency identification (RFID) applications are emerging as key components in object tracking and supply chain management systems. In the near future, almost every major retailer will use RFID systems to track the shipment of products from suppliers to warehouses. Given the nature of RFID readings, such systems will generate a huge amount of information once costs fall to a level at which each individual item can be tagged, leaving a trail of data as it moves through different locations. We define a technique for efficiently detecting anomalous data in order to prevent problems related to inefficient shipment or fraudulent actions. Since items usually move together in large groups through distribution centers, and only in stores do they move in smaller groups, we exploit this feature in the design of our technique. Preliminary experiments show the effectiveness of our approach.
data and knowledge engineering | 2007
Sergio Flesca; Giuseppe Manco; Elio Masciari; Luigi Pontieri; Andrea Pugliese
In this paper, we propose a classification technique for Web pages, based on the detection of structural similarities among semistructured documents, and devise an architecture exploiting this technique for the purpose of information extraction. The proposal significantly differs from standard methods based on graph-matching algorithms, and is based on the idea of representing the structure of a document as a time series in which each occurrence of a tag corresponds to an impulse. The degree of similarity between documents is then determined by analyzing the frequencies of the corresponding Fourier transform. Experiments on real data show the effectiveness of the proposed technique.
Information Sciences | 2012
Elio Masciari
Data streams are potentially infinite data sources that flow continuously while monitoring a physical phenomenon, such as temperature levels, or human activities, such as clickstreams, telephone call records, and so on. In recent years, RFID technology has led to the generation of huge streams of data. Moreover, RFID-based systems allow the effective management of items carrying RFID tags, especially for supply chain management or object tracking. In this paper we introduce SMART (Stream Monitoring enterprise Activities by RFID Tags), a system based on an outlier template definition for detecting anomalies in RFID streams. We describe the features of SMART and its application to a real-life scenario that shows the effectiveness of the proposed method for enterprise management. Moreover, we describe an outlier detection approach we defined and effectively exploited in SMART.
flexible query answering systems | 2009
Elio Masciari
The increasing availability of huge amounts of data pertaining to time and positions, generated by different sources using a wide variety of technologies (e.g., RFID tags, GPS, GSM networks), leads to large spatial data collections. Mining such amounts of data is challenging, since the ability to extract useful information from this peculiar kind of data is crucial in many application scenarios such as vehicle traffic management, hand-off in cellular networks, and supply chain management. In this paper, we address the problem of clustering spatial trajectories. In the context of trajectory data, clustering is particularly challenging because we deal with data (trajectories) for which the order of elements is relevant. We propose a novel approach based on a suitable regioning strategy and an efficient and effective clustering technique based on a proper metric. Finally, we performed several tests on real-world datasets that confirmed the efficiency and effectiveness of the proposed techniques.
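Why element order matters when comparing trajectories can be seen in a small sketch. The abstract does not specify the regioning strategy or the metric, so both choices below are assumptions made purely for illustration: a uniform grid stands in for the regioning step, and the Levenshtein edit distance over region sequences stands in for the metric. The names to_regions and edit_distance are hypothetical.

```python
def to_regions(trajectory, cell_size):
    """Map (x, y) points to grid cells, collapsing consecutive repeats.

    A uniform grid is an assumed stand-in for the regioning strategy.
    """
    regions = []
    for x, y in trajectory:
        cell = (int(x // cell_size), int(y // cell_size))
        if not regions or regions[-1] != cell:
            regions.append(cell)
    return regions


def edit_distance(a, b):
    """Levenshtein distance between two region sequences (assumed metric)."""
    prev = list(range(len(b) + 1))
    for i, ra in enumerate(a, 1):
        cur = [i]
        for j, rb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete from a
                           cur[j - 1] + 1,             # insert into a
                           prev[j - 1] + (ra != rb)))  # substitute
        prev = cur
    return prev[-1]


t1 = [(0.1, 0.2), (0.9, 0.4), (1.5, 0.5), (2.5, 0.5)]
t2 = [(0.5, 0.5), (1.2, 0.1), (2.9, 0.9)]   # different points, same cells
t3 = list(reversed(t1))                      # same cells, opposite direction

r1, r2, r3 = (to_regions(t, 1.0) for t in (t1, t2, t3))
print(edit_distance(r1, r2))
print(edit_distance(r1, r3))
```

Two trajectories that cross the same cells in the same order are at distance zero, while the reversed trajectory visits the same cells but is at a nonzero distance, which an order-insensitive measure would miss. A pairwise distance of this kind could then be passed to any distance-based clustering algorithm.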