
Publication


Featured research published by Unai Arronategui.


Future Generation Computer Systems | 2013

A task routing approach to large-scale scheduling

Javier Celaya; Unai Arronategui

Scheduling many tasks in environments of millions of unreliable nodes is a challenging problem. To our knowledge, no work in the literature has proposed a solution that also supports many policies with very different objectives. In this paper, we present a decentralized scheduling model that overcomes these problems. A hierarchical network overlay supports a scalable resource discovery and allocation scheme. It uses aggregated information to route tasks to the most suitable execution nodes, and is easily extensible to provide very different scheduling policies. For this paper, we implemented a policy that simply allocates tasks to idle nodes, a policy that minimizes the global makespan, and a policy that fulfills deadline requirements. Through thorough simulation tests, we conclude that our model allocates any number of tasks to several million nodes in just a few seconds, with very low overhead and high resilience. Meanwhile, policies with different objectives implemented on our model perform almost as well as their centralized counterparts.
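The routing idea described in this abstract, aggregated information guiding each task down a hierarchical overlay, can be sketched roughly as follows. The tree layout, field names and idle counts are hypothetical, and the actual model supports pluggable policies rather than this single greedy rule:

```python
def route_task(node, tree):
    """Descend the overlay from `node`, at each level picking the child
    branch that advertises the most idle execution nodes."""
    while tree[node]["children"]:
        node = max(tree[node]["children"], key=lambda c: tree[c]["idle"])
    return node

# Hypothetical overlay: root "r" aggregates the idle counts of its branches.
tree = {
    "r": {"children": ["a", "b"], "idle": 5},
    "a": {"children": [], "idle": 1},
    "b": {"children": [], "idle": 4},
}
```

For example, `route_task("r", tree)` descends into branch `"b"`, which advertises more idle nodes than `"a"`.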


Future Generation Computer Systems | 2015

Fair scheduling of bag-of-tasks applications on large-scale platforms

Javier Celaya; Unai Arronategui

Users of distributed computing platforms want to obtain a fair share of the resources they use. With respect to the amount of computation, the most suitable measure of fairness is the stretch. It describes the slowdown that applications suffer for being executed on a shared platform, in contrast to being executed alone. In this paper, we present a decentralized scheduling policy that minimizes the maximum stretch among user-submitted applications. Under two reasonable assumptions, which can be deduced from existing system traces, we are able to minimize the stretch using only local information. In this way, we avoid a centralized design and provide scalability and fault tolerance. As a result, our policy performs just 11% worse than a centralized implementation, and largely outperforms other common policies. Additionally, it easily scales to hundreds of thousands of nodes. We presume that it can scale to millions with minimal overhead. Finally, we also show that preemption is crucial to provide fairness in any case.

Highlights: We present a scheduling model for fair resource sharing on large-scale platforms. It effectively aggregates information about application stretch. Task allocation is performed so that the maximum stretch is minimized. Our model is able to perform similarly to a centralized implementation. The management overhead is bounded.
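As a hypothetical illustration of the stretch measure this abstract relies on (not code from the paper), the per-application slowdown and the max-stretch objective can be computed as:

```python
def stretch(time_shared, time_alone):
    """Stretch: the slowdown an application suffers on a shared platform,
    relative to running alone on the same resources."""
    return time_shared / time_alone

# Hypothetical applications: (name, completion time alone, time on shared platform)
apps = [("A", 10.0, 25.0), ("B", 4.0, 6.0), ("C", 50.0, 60.0)]
stretches = {name: stretch(shared, alone) for name, alone, shared in apps}
max_stretch = max(stretches.values())  # the quantity the policy minimizes
```

Here application A is the bottleneck with a stretch of 2.5, so a max-stretch policy would prioritize it.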


Grid Computing | 2011

A Highly Scalable Decentralized Scheduler of Tasks with Deadlines

Javier Celaya; Unai Arronategui

Scheduling of tasks in distributed environments, like cloud and grid computing platforms, using deadlines to provide quality of service is a challenging problem. The few existing proposals suffer from scalability limitations, because they try to maintain full knowledge of the system state. To our knowledge, no implementation yet reaches scales of a hundred thousand nodes. In this paper, we present a fully decentralized scheduler that aggregates information about the availability of the execution nodes throughout the network and uses it to allocate tasks to those nodes that are able to finish them in time. Through simulation, we show that our scheduler is able to operate in different scenarios, from many-task applications in cloud computing sites to volunteer computing projects. Simulations on networks of up to a hundred thousand nodes show very competitive performance, reaching allocation times of under a second and very low overhead in low-latency gigabit networks.


Parallel, Distributed and Network-Based Processing | 2010

Distributed Scheduler of Workflows with Deadlines in a P2P Desktop Grid

Javier Celaya; Unai Arronategui

Scheduling large amounts of tasks in distributed computing platforms composed of millions of nodes is a challenging goal, even more so in a fully decentralized way and with low overhead. Thus, we propose a new scalable scheduler for task workflows with deadlines, following a completely decentralized architecture. It is built upon a tree-based P2P overlay that supports efficient and fast aggregation of resource availability information. Constraints for deadlines and the correct timing of tasks in workflows are guaranteed with a suitable distributed management of the availability time intervals of resources. A local scheduler in each node provides its available time intervals to the distributed global scheduler, which summarizes them in the aggregation process. A two-phase reservation protocol looks for suitable resources that comply with the workflow structure and deadline. Experimental results, from simulations of a system composed of one million nodes, show scalable, fast scheduling with low overhead that enables highly dynamic usage of computational resources.
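A minimal sketch of the availability-interval idea from this abstract, under assumed names (the paper's scheduler aggregates such intervals up the tree and runs a two-phase reservation on top of them):

```python
def earliest_fit(intervals, duration, deadline):
    """Return the earliest start time at which a task of `duration` fits
    entirely inside one free (start, end) interval and still finishes
    before `deadline`, or None if no interval qualifies."""
    for start, end in sorted(intervals):
        finish = start + duration
        if finish <= end and finish <= deadline:
            return start
    return None
```

For instance, `earliest_fit([(0, 2), (5, 10)], 3, 9)` returns 5: the first free interval is too short for the task, the second fits it before the deadline.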


High Performance Computing and Communications | 2006

Scalable architecture for allocation of idle CPUs in a P2P network

Javier Celaya; Unai Arronategui

In this paper we present a scalable, distributed architecture that allocates idle CPUs for task execution, where any node may request the execution of a group of tasks by other nodes. A fast, scalable discovery protocol is an essential component. In addition, up-to-date information about free nodes is efficiently managed in each node by an availability protocol. Both protocols exploit a tree-based peer-to-peer network that adds fault-tolerance capabilities. Results from experiments and simulation tests, using a simple allocation method, show discovery and allocation costs scaling logarithmically with the number of nodes, with low communication overhead and small, bounded state in each node.


Grid Computing | 2006

YA: Fast and Scalable Discovery of Idle CPUs in a P2P Network

Javier Celaya; Unai Arronategui

Discovery of large amounts of idle CPUs in fully distributed and shared grid systems is needed in relevant applications and is still a challenging problem. In this paper we present a fast, scalable and efficient discovery protocol founded on a tree-based peer-to-peer (P2P) network with fault-tolerance capabilities and locality features. Each system node stores a good estimation of the number of CPUs that are available in its branch. Each node notifies its parent about changes in this value only when the change is significant enough. This allows low overhead and stable behaviour under concurrent and dynamic allocation of CPUs. This basic mechanism allows any node to launch a discovery process that only needs to follow the information about free CPUs in each branch. Results from experiments and simulation tests, using a simple allocation method, show discovery time scaling logarithmically with the number of nodes.
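The threshold-based upward propagation described here can be sketched as below. The class and attribute names are invented for illustration, and the 20% relative threshold is an arbitrary choice, not a value from the paper:

```python
class BranchNode:
    """Tree node that tracks an estimate of the free CPUs in its branch
    and reports to its parent only when the estimate changes significantly."""

    def __init__(self, parent=None, threshold=0.2):
        self.parent = parent
        self.threshold = threshold      # relative change needed to notify
        self.local_free = 0             # free CPUs on this node itself
        self.children_free = {}         # last value reported by each child
        self.last_reported = 0          # last value sent up to the parent

    def branch_free(self):
        """Current estimate of free CPUs in this whole branch."""
        return self.local_free + sum(self.children_free.values())

    def update_local(self, free):
        self.local_free = free
        self._maybe_notify()

    def child_report(self, child, free):
        self.children_free[child] = free
        self._maybe_notify()

    def _maybe_notify(self):
        total = self.branch_free()
        base = max(self.last_reported, 1)
        # Propagate upward only when the change is significant enough.
        if self.parent and abs(total - self.last_reported) / base >= self.threshold:
            self.last_reported = total
            self.parent.child_report(self, total)
```

A large jump (0 to 10 free CPUs) reaches the parent, while a subsequent 10% fluctuation is suppressed, which is what keeps the overhead low under churn.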


IEEE/ACM International Conference on Utility and Cloud Computing | 2016

Modelling performance & resource management in Kubernetes

Víctor Medel; Omer Farooq Rana; José Ángel Bañares; Unai Arronategui

Containers are rapidly replacing Virtual Machines (VMs) as the compute instance of choice in cloud-based deployments. The significantly lower overhead of deploying containers (compared to VMs) has often been cited as one reason for this. We analyse performance of the Kubernetes system and develop a Reference net-based model of resource management within this system. Our model is characterised using real data from a Kubernetes deployment, and can be used as a basis to design scalable applications that make use of Kubernetes.


Transactions on Computational Science | 2009

Behavioural Characterization for Network Anomaly Detection

Victor P. Roche; Unai Arronategui

In this paper we propose a methodology for detecting abnormal traffic on the network, such as worm attacks, based on observing the behaviour of different elements at the network edges. To achieve this, we suggest a set of critical features and judge normal site status against these standards. For our goal, this characterization must be free of virus traffic. Once it has been established, we are able to detect abnormal situations when the observed behaviour, measured against the same features, differs significantly from the previous model. We have based our work on NetFlow information generated by the main routers of the University of Zaragoza network, with more than 12,000 hosts. The proposed model helps to characterize the whole corporate network, its sub-nets, and the individual hosts. In our experimental tests, this methodology proved effective against real infections caused by worms such as SpyBot and Agobot. The system allows the detection of new kinds of worms, independently of the vulnerabilities or propagation methods they use.
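The deviation-from-baseline idea can be sketched like this; the feature names and the threshold are invented for illustration and are not the paper's actual feature set:

```python
def anomaly_score(baseline, observed):
    """Per-feature relative deviation of observed traffic features from a
    clean baseline; large scores flag abnormal behaviour (e.g. a worm)."""
    return {f: abs(observed[f] - baseline[f]) / max(baseline[f], 1e-9)
            for f in baseline}

# Hypothetical per-host features derived from NetFlow records.
baseline = {"flows_per_min": 40.0, "distinct_dests": 15.0, "syn_ratio": 0.1}
observed = {"flows_per_min": 400.0, "distinct_dests": 300.0, "syn_ratio": 0.8}

scores = anomaly_score(baseline, observed)
suspicious = [f for f, s in scores.items() if s > 2.0]  # arbitrary cutoff
```

A scanning worm typically inflates flow rate, destination fan-out and SYN ratio all at once, so every feature scores far above the cutoff here.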


IEEE/ACM International Conference on Utility and Cloud Computing | 2016

Adaptive application scheduling under interference in Kubernetes

Víctor Medel; Omer Farooq Rana; José Ángel Bañares; Unai Arronategui

Containers are rapidly replacing Virtual Machines (VMs) as the compute instance in cloud-based deployments. The significantly lower overhead of deploying containers (compared to VMs) has often been cited as one reason for this. However, interference caused by the limited isolation of shared resources can impact the performance of hosted applications. We develop a Reference net-based model of resource management within Kubernetes, primarily to better characterise such performance issues. Our model makes use of data obtained from a Kubernetes deployment, and can be used as a basis to design scalable (and potentially interference-tolerant) applications that make use of Kubernetes.


Computers & Electrical Engineering | 2018

Characterising resource management performance in Kubernetes

Víctor Medel; Rafael Tolosana-Calasanz; José Ángel Bañares; Unai Arronategui; Omer Farooq Rana

A key challenge in supporting elastic behaviour in cloud systems is to achieve good performance in the automated (de-)provisioning and scheduling of computing resources. One significant aspect is the overhead associated with deploying, terminating and maintaining resources. Due to their lower start-up and termination overhead, containers are rapidly replacing Virtual Machines (VMs) in many cloud deployments as the computation instance of choice. In this paper, we analyse the performance of Kubernetes through a Petri net-based performance model. Kubernetes is a container management system for a distributed cluster environment. Our model can be characterised using data from a Kubernetes deployment, and can be exploited to support capacity planning and the design of Kubernetes-based elastic applications.
