Nikos Ntarmos
University of Patras
Publications
Featured research published by Nikos Ntarmos.
Extending Database Technology | 2006
Theoni Pitoura; Nikos Ntarmos; Peter Triantafillou
We consider the conflicting problems of ensuring data-access load balancing and efficiently processing range queries on peer-to-peer data networks maintained over Distributed Hash Tables (DHTs). Placing consecutive data values in neighboring peers is frequently used in DHTs since it accelerates range query processing. However, such a placement is highly susceptible to load imbalances, which are preferably handled by replicating data (since replication also introduces fault tolerance benefits). In this paper, we present HotRoD, a DHT-based architecture that deals effectively with this combined problem through the use of a novel locality-preserving hash function, and a tunable data replication mechanism which allows trading off replication costs for fair load distribution. Our detailed experimentation study shows strong gains in both range query processing efficiency and data-access load balancing, with low replication overhead. To our knowledge, this is the first work that concurrently addresses the two conflicting problems using data replication.
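The core idea can be illustrated with a small sketch (assuming a simple numeric value domain and ring size; the hash function and replication scheme below are illustrative, not HotRoD's actual design): order-preserving placement keeps a range query on a contiguous arc of the ring, while rotated replicas spread the load of hot ranges across otherwise unrelated peers.

```python
# A minimal, illustrative sketch (not HotRoD's actual hash function): a
# locality-preserving mapping of an ordered value domain onto a DHT ring,
# plus naive "rotate-by-offset" replication of hot ranges. All names and
# parameters here are hypothetical.

RING_BITS = 32
RING_SIZE = 1 << RING_BITS

def locality_preserving_hash(value, lo, hi):
    """Map a value in [lo, hi] to a ring position, preserving order:
    consecutive values map to nearby positions (hence nearby peers)."""
    frac = (value - lo) / (hi - lo)
    return int(frac * (RING_SIZE - 1))

def replica_positions(pos, num_replicas):
    """Place replicas at evenly rotated ring offsets, spreading the load
    of a hot value range over distant parts of the ring."""
    step = RING_SIZE // (num_replicas + 1)
    return [(pos + i * step) % RING_SIZE for i in range(num_replicas + 1)]

# Example: a range query [40, 60] over the domain [0, 100] touches a
# contiguous arc of the ring, so only the peers owning that arc are contacted.
lo_pos = locality_preserving_hash(40, 0, 100)
hi_pos = locality_preserving_hash(60, 0, 100)
print(lo_pos, hi_pos, replica_positions(lo_pos, num_replicas=2))
```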
Conference on Information and Knowledge Management | 2006
Sebastian Michel; Matthias Bender; Nikos Ntarmos; Peter Triantafillou; Gerhard Weikum; Christian Zimmer
Peer-to-Peer (P2P) search requires intelligent decisions for query routing: selecting the best peers to which a given query, initiated at some peer, should be forwarded for retrieving additional search results. These decisions are based on statistical summaries for each peer, which are usually organized on a per-keyword basis and managed in a distributed directory of routing indices. Such architectures disregard the possible correlations among keywords. Together with the coarse granularity of per-peer summaries, which is mandated for scalability, this limitation may lead to poor search result quality. This paper develops and evaluates two solutions to this problem: sk-STAT, based on single-key statistics only, and mk-STAT, based on additional multi-key statistics. For both cases, hash sketch synopses are used to compactly represent a peer's data items and are efficiently disseminated in the P2P network to form a decentralized directory. Experimental studies with Gnutella and Web data demonstrate the viability and the trade-offs of the approaches.
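A toy example of the kind of hash sketch synopsis mentioned above (a single Flajolet-Martin-style bitmap, under simplifying assumptions; the paper's actual synopses and directory protocol are not reproduced here): per-keyword sketches from different peers can be OR-ed to estimate how many distinct items match a keyword network-wide.

```python
# Illustrative Flajolet-Martin-style hash sketch for summarizing a peer's
# data items under a keyword. A single-bitmap toy version without stochastic
# averaging, so the estimate is rough; names are hypothetical.

import hashlib

def rho(x):
    """Position of the least-significant set bit of x (0-based)."""
    return (x & -x).bit_length() - 1

def hash_sketch(items, bits=32):
    """Build one FM bitmap over a collection of item identifiers."""
    bitmap = 0
    for it in items:
        h = int(hashlib.sha1(it.encode()).hexdigest(), 16) & ((1 << bits) - 1)
        if h:
            bitmap |= 1 << rho(h)
    return bitmap

def estimate_cardinality(bitmap):
    """Estimate distinct count from the first unset bit position R: ~2^R / 0.77351."""
    r = 0
    while bitmap & (1 << r):
        r += 1
    return (1 << r) / 0.77351

# Sketches of two peers for the same keyword are OR-ed to estimate the number
# of distinct matching items across both peers (overlap is handled by the OR).
s1 = hash_sketch(f"doc{i}" for i in range(500))
s2 = hash_sketch(f"doc{i}" for i in range(300, 800))
print(estimate_cardinality(s1 | s2))
```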
International Conference on Peer-to-Peer Computing | 2004
Nikos Ntarmos; Peter Triantafillou
We present SeAl, a novel data/resource and data-access management infrastructure designed to address a key problem in P2P data sharing networks, namely the problem of wide-scale selfish peer behavior. Selfish behavior has been manifested and well documented, and it is widely accepted that unless it is dealt with, the scalability, efficiency, and usefulness of P2P sharing networks will be diminished. SeAl essentially consists of a monitoring/accounting subsystem, an auditing/verification subsystem, and incentive mechanisms. The monitoring subsystem facilitates the classification of peers into selfish/altruistic. The auditing/verification layer provides a shield against perjuring/slandering and colluding peers that may try to cheat the monitoring subsystem. The incentive mechanisms effectively utilize these layers so as to increase the computational/networking and data resources that are available to the community. Our extensive performance results show that SeAl performs its tasks swiftly, while the overheads introduced by our accounting and auditing mechanisms in terms of response time, network traffic, and storage are very small.
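As a rough illustration of the monitoring/accounting layer's role (the threshold, field names, and classification rule below are assumptions for the sketch, not SeAl's actual policy), a peer can be classified from its contributed versus consumed resources:

```python
# Toy accounting sketch in the spirit of a monitoring/accounting layer (the
# real system also audits these figures against cheating): classify peers as
# altruistic or selfish from contributed vs. consumed resources.

from dataclasses import dataclass

@dataclass
class PeerAccount:
    peer_id: str
    bytes_served: int = 0      # resources this peer provided to others
    bytes_downloaded: int = 0  # resources this peer consumed

    def classify(self, altruism_threshold=1.0):
        ratio = self.bytes_served / max(self.bytes_downloaded, 1)
        return "altruistic" if ratio >= altruism_threshold else "selfish"

acct = PeerAccount("peer-42", bytes_served=5_000_000, bytes_downloaded=20_000_000)
print(acct.peer_id, acct.classify())  # -> peer-42 selfish
```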
ACM Transactions on Computer Systems | 2009
Nikos Ntarmos; Peter Triantafillou; Gerhard Weikum
Counting items in a distributed system, and estimating the cardinality of multisets in particular, is important for a large variety of applications and a fundamental building block for emerging Internet-scale information systems. Examples of such applications range from optimizing query access plans in peer-to-peer data sharing, to computing the significance (rank/score) of data items in distributed information retrieval. The general formal problem addressed in this article is computing the network-wide distinct number of items with some property (e.g., distinct files with file name containing “spiderman”) where each node in the network holds an arbitrary subset, possibly overlapping the subsets of other nodes. The key requirements that a viable approach must satisfy are: (1) scalability towards very large network size, (2) efficiency regarding messaging overhead, (3) load balance of storage and access, (4) accuracy of the cardinality estimation, and (5) simplicity and easy integration in applications. This article contributes the DHS (Distributed Hash Sketches) method for this problem setting: a distributed, scalable, efficient, and accurate multiset cardinality estimator. DHS is based on hash sketches for probabilistic counting, but distributes the bits of each counter across network nodes in a judicious manner based on principles of Distributed Hash Tables, paying careful attention to fast access and aggregation as well as update costs. The article discusses various design choices, exhibiting tunable trade-offs between estimation accuracy, hop-count efficiency, and load distribution fairness. We further contribute a full-fledged, publicly available, open-source implementation of all our methods, and a comprehensive experimental evaluation for various settings.
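The following minimal sketch conveys the central DHS idea under simplifying assumptions (an in-memory dictionary stands in for the DHT, and the naming and probing order are illustrative, not the article's exact protocol): each bit of a probabilistic counter is mapped to its own DHT key, so no single node stores or serves the entire counter.

```python
# Minimal sketch of the DHS idea: spread the bits of a hash-sketch counter
# over DHT nodes by hashing (counter_name, bit_index) to a DHT key. The
# in-memory dict stands in for a real DHT.

import hashlib

dht = {}  # DHT stand-in: key -> set-bit marker

def dht_key(counter, bit):
    return hashlib.sha1(f"{counter}:{bit}".encode()).hexdigest()

def rho(x):
    """Position of the least-significant set bit of x (0-based)."""
    return (x & -x).bit_length() - 1

def add_item(counter, item):
    """Hash the item and set one bit of the distributed counter."""
    h = int(hashlib.sha1(item.encode()).hexdigest(), 16) & 0xFFFFFFFF
    if h:
        dht[dht_key(counter, rho(h))] = True

def estimate(counter):
    """Probe bit positions until the first unset one; estimate ~ 2^R / 0.77351."""
    r = 0
    while dht.get(dht_key(counter, r)):
        r += 1
    return (1 << r) / 0.77351

for i in range(1000):
    add_item("files:spiderman", f"node{i % 7}/file{i}")
print(estimate("files:spiderman"))
```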
Databases, Information Systems, and Peer-to-Peer Computing | 2004
Nikos Ntarmos; Peter Triantafillou
We argue the case for a new paradigm for architecting structured P2P overlay networks, coined AESOP. AESOP consists of three layers: (i) an architecture, PLANES, that ensures significant performance speedups, assuming knowledge of altruistic peers; (ii) an accounting/auditing layer, AltSeAl, that identifies and validates altruistic peers; and (iii) SeAledPLANES, a layer that facilitates the coordination/collaboration of the previous two components. We briefly present these components along with experimental and analytical data on the promised significant performance gains and the related overhead. In light of these very encouraging results, we put this three-layer architecture paradigm forth as the way to structure the P2P overlay networks of the future.
International Conference on Data Engineering | 2013
George Sfakianakis; I. Patlakas; Nikos Ntarmos; Peter Triantafillou
Cloud key-value stores are becoming increasingly more important. Challenging applications, requiring efficient and scalable access to massive data, arise every day. We focus on supporting interval queries (which are prevalent in several data intensive applications, such as temporal querying for temporal analytics), an efficient solution for which is lacking. We contribute a compound interval index structure, comprised of two tiers: (i) the MRSegmentTree (MRST), a key-value representation of the Segment Tree, and (ii) the Endpoints Index (EPI), a column family index that stores information for interval endpoints. In addition to the above, our contributions include: (i) algorithms for efficiently constructing and populating our indices using MapReduce jobs, (ii) techniques for efficient and scalable index maintenance, and (iii) algorithms for processing interval queries. We have implemented all algorithms using HBase and Hadoop, and conducted a detailed performance evaluation. We quantify the costs associated with the construction of the indices, and evaluate our query processing algorithms using queries on real data sets. We compare the performance of our approach to two alternatives: the native support for interval queries provided in HBase, and the execution of such queries using the Hive query execution tool. Our results show a significant speedup, far outperforming the state of the art.
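To make the two-tier idea concrete, here is a small, self-contained sketch of a segment tree materialized as key-value rows and queried with a root-to-leaf key walk; the actual MRST/EPI schema, HBase layout, and MapReduce construction differ, and all names below are illustrative.

```python
# Illustrative sketch of a segment tree represented as key-value rows: each
# node becomes a row keyed by its range, valued with the intervals assigned
# to it. A stabbing query for point q walks root-to-leaf, collecting intervals.

def build_rows(lo, hi, intervals, rows):
    """Recursively materialize segment-tree nodes as (key, interval-list) rows."""
    rows[f"{lo}-{hi}"] = [iv for iv in intervals if iv[0] <= lo and hi <= iv[1]]
    if lo < hi:
        mid = (lo + hi) // 2
        remaining = [iv for iv in intervals if not (iv[0] <= lo and hi <= iv[1])]
        build_rows(lo, mid, remaining, rows)
        build_rows(mid + 1, hi, remaining, rows)
    return rows

def stab(rows, lo, hi, q):
    """Return all intervals containing point q via one root-to-leaf key walk."""
    out = list(rows[f"{lo}-{hi}"])
    while lo < hi:
        mid = (lo + hi) // 2
        lo, hi = (lo, mid) if q <= mid else (mid + 1, hi)
        out += rows[f"{lo}-{hi}"]
    return out

rows = build_rows(0, 15, [(2, 9), (5, 12), (11, 14)], {})
print(stab(rows, 0, 15, 8))  # intervals covering point 8: (2, 9) and (5, 12)
```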
International Conference on Peer-to-Peer Computing | 2003
Peter Triantafillou; Nikos Ntarmos; Sotiris E. Nikoletseas; Paul G. Spirakis
We present the NanoPeers architecture paradigm, a peer-to-peer network of lightweight devices lacking all or most of the capabilities of their computer-world counterparts. We identify the problems arising when we apply current routing and searching methods to this nanoworld, and present some initial solutions using a case study of a sensor network instance: Smart Dust. Furthermore, we propose the P2P Worlds framework as a hybrid P2P architecture paradigm, consisting of cooperating layers of P2P networks populated by computing entities with escalating capabilities. Our position is that (i) experience gained through research and experimentation in the field of P2P computing can be indispensable when moving down the ladder of computing capabilities, and that (ii) the proposed framework can be the basis of numerous real-world applications, opening up several challenging research problems.
IEEE Transactions on Knowledge and Data Engineering | 2012
Theoni Pitoura; Nikos Ntarmos; Peter Triantafillou
In this paper, we present Saturn, an overlay architecture for large-scale data networks maintained over Distributed Hash Tables (DHTs) that efficiently processes range queries and ensures access load balancing and fault-tolerance. Placing consecutive data values in neighboring peers is desirable in DHTs since it accelerates range query processing; however, such a placement is highly susceptible to load imbalances. At the same time, DHTs may be susceptible to node departures/failures, so high data availability and fault tolerance are significant issues. Saturn deals effectively with these problems through the introduction of a novel multiple ring, order-preserving architecture. The use of a novel order-preserving hash function ensures fast range query processing. Replication across and within data rings (termed vertical and horizontal replication) forms the foundation over which our mechanisms are developed, ensuring query load balancing and fault tolerance, respectively. Our detailed experimentation study shows strong gains in range query processing efficiency, access load balancing, and fault tolerance, with low replication overheads. The significance of Saturn is not only that it effectively tackles all three issues together (supporting range queries, ensuring load balancing, and providing fault tolerance over DHTs), but also that it can be applied on top of any order-preserving DHT, enabling it to dynamically handle replication and, thus, to trade off replication costs for fair load distribution and fault tolerance.
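A toy, runnable sketch of the two replication dimensions described above (several order-preserving rings as vertical replicas, plus successor copies within each ring as horizontal replicas); the peer counts, ring layout, and method names are assumptions for illustration, not Saturn's actual design.

```python
# Illustrative multi-ring replication: vertical replicas (whole rings) spread
# query load, horizontal replicas (successor copies within a ring) provide
# fault tolerance. Order preservation keeps a range on a contiguous arc.

import random

class Ring:
    def __init__(self, num_peers):
        # peer i owns positions [i * width, (i + 1) * width) of a 2**16 ring
        self.num_peers = num_peers
        self.width = (1 << 16) // num_peers
        self.stores = [set() for _ in range(num_peers)]

    def owner_of(self, pos):
        return min(pos // self.width, self.num_peers - 1)

    def put(self, pos, horizontal_replicas=2):
        owner = self.owner_of(pos)
        for i in range(horizontal_replicas + 1):   # owner + successor copies
            self.stores[(owner + i) % self.num_peers].add(pos)

    def scan(self, lo, hi):
        peers = range(self.owner_of(lo), self.owner_of(hi) + 1)
        return sorted({p for peer in peers for p in self.stores[peer] if lo <= p <= hi})

rings = [Ring(num_peers=8) for _ in range(3)]       # 3 vertical replicas
for pos in range(0, 1 << 16, 997):
    for ring in rings:
        ring.put(pos)
print(random.choice(rings).scan(10_000, 20_000))    # query one ring at random
```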
International Journal of Digital Crime and Forensics | 2013
Sotiris Karavarsamis; Nikos Ntarmos; Konstantinos Blekas; Ioannis Pitas
In this study, a novel algorithm for recognizing pornographic images based on the analysis of skin color regions is presented. The skin color information essentially provides Regions of Interest (ROIs). It is demonstrated that the convex hull of these ROIs provides semantically useful information for pornographic image detection. Based on these convex hulls, the authors extract a small set of low-level visual features that are empirically proven to possess discriminative power for pornographic image classification. In this study, we consider multi-class pornographic image classification, where the "nude" and "benign" image classes are further split into two specialized sub-classes, namely "bikini"/"porn" and "skin"/"non-skin", respectively. The extracted feature vectors are fed to an ensemble of random forest classifiers for image classification. Each classifier is trained on a partition of the training set and solves a binary classification problem. In this sense, the model allows for seamless coarse-to-fine-grained classification by means of a tree-structured topology of a small number of intervening binary classifiers. The overall technique is evaluated on the AIIA-PID challenge of 9,000 samples of pornographic and benign images collected from the Web. The technique is shown to exhibit state-of-the-art performance against publicly available integrated pornographic image classifiers.
Index Terms: convex hull calculation, multi-class classification, porn detection, random forests, skin ROI localization
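A brief illustration of the overall pipeline (convex hull of skin pixels, a few shape features, a random forest); the paper's exact low-level features and its coarse-to-fine tree of binary classifiers are not reproduced, and all names and features below are hypothetical.

```python
# Illustrative feature extraction + classification: take the convex hull of
# detected skin-pixel coordinates and derive simple shape features for a
# random forest. Features here are stand-ins, not the paper's feature set.

import numpy as np
from scipy.spatial import ConvexHull
from sklearn.ensemble import RandomForestClassifier

def hull_features(skin_coords, image_shape):
    """skin_coords: (N, 2) array of (row, col) skin-pixel coordinates."""
    hull = ConvexHull(skin_coords)
    img_area = image_shape[0] * image_shape[1]
    hull_area = hull.volume              # in 2D, .volume is the enclosed area
    return [
        hull_area / img_area,            # fraction of the frame the hull covers
        len(skin_coords) / max(hull_area, 1),   # "solidity": skin density inside hull
        hull.area / max(np.sqrt(img_area), 1),  # normalized hull perimeter
    ]

def train(samples):
    """samples: list of (skin_coords, image_shape, label) tuples."""
    X = [hull_features(c, s) for c, s, _ in samples]
    y = [label for _, _, label in samples]
    return RandomForestClassifier(n_estimators=100).fit(X, y)

# Tiny synthetic demo, purely to exercise the code.
rng = np.random.default_rng(0)
samples = [(rng.integers(0, 200, size=(60, 2)), (200, 200), label)
           for label in ("benign", "nude") for _ in range(2)]
clf = train(samples)
print(clf.predict([hull_features(samples[0][0], (200, 200))]))
```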
Extending Database Technology | 2016
Jing Wang; Nikos Ntarmos; Peter Triantafillou
Subgraph/supergraph queries, although central to graph analytics, are costly as they entail the NP-complete problem of subgraph isomorphism. We present a fresh solution, the novel principle of which is to acquire and utilize knowledge from the results of previously executed queries. Our approach, iGQ, encompasses two component subindexes to identify whether a new query is a subgraph/supergraph of previously executed queries and stores related key information. iGQ comes with novel query processing and index space management algorithms, including graph replacement policies. The end result is a system that leads to significant reduction in the number of required subgraph isomorphism tests and speedups in query processing time. iGQ can be incorporated into any sub/supergraph query processing method and help improve performance. In fact, it is the only contribution that can significantly speed up both subgraph and supergraph query processing. We establish the principles of iGQ and formally prove its correctness. We have implemented iGQ and have incorporated it within three popular recent state-of-the-art index-based graph query processing solutions. We evaluated its performance using real-world and synthetic graph datasets with different characteristics, and a number of query workloads, showcasing its benefits.
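The central caching idea can be sketched as follows (a conceptual illustration using networkx, with node-induced subgraph isomorphism for simplicity; iGQ's actual subindexes and replacement policies are not shown): answers of previously executed queries both prune the candidate set and contribute definite answers, reducing the number of isomorphism tests.

```python
# Conceptual sketch of reusing cached query answers for subgraph queries.
# If a cached query g is a subgraph of the new query q, then answers(q) is a
# subset of answers(g); if g is a supergraph of q, answers(g) are definite
# answers for q. Uses networkx purely for illustration.

import networkx as nx
from networkx.algorithms.isomorphism import GraphMatcher

def contains(big, small):
    """True if `small` is (induced) subgraph-isomorphic to `big`."""
    return GraphMatcher(big, small).subgraph_is_isomorphic()

def answer_subgraph_query(query, dataset, cache):
    """dataset: {graph_id: nx.Graph}; cache: {cached_query: set(answer ids)}."""
    candidates = set(dataset)   # graphs still needing an isomorphism test
    definite = set()            # graphs known to contain `query` without a test
    for cached_q, cached_ans in cache.items():
        if contains(query, cached_q):      # cached query is inside the new query
            candidates &= cached_ans
        elif contains(cached_q, query):    # cached query contains the new query
            definite |= cached_ans
    candidates -= definite
    result = definite | {g for g in candidates if contains(dataset[g], query)}
    cache[query] = result
    return result

dataset = {1: nx.cycle_graph(4), 2: nx.complete_graph(4), 3: nx.path_graph(3)}
cache = {}
q = nx.path_graph(2)   # a single edge
print(answer_subgraph_query(q, dataset, cache))  # all three graphs contain an edge
```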