Engin Arslan | Researchain

Publication

Featured research published by Engin Arslan.

IEEE International Conference on Cloud Computing Technology and Science | 2016

Application-Level Optimization of Big Data Transfers through Pipelining, Parallelism and Concurrency

Esma Yildirim; Engin Arslan; Jangyoung Kim; Tevfik Kosar

In end-to-end data transfers, several factors affect the data transfer throughput: the network characteristics (e.g., network bandwidth, round-trip time, background traffic); the end-system characteristics (e.g., NIC capacity, number of CPU cores and their clock rate, number of disk drives and their I/O rate); and the dataset characteristics (e.g., average file size, dataset size, file size distribution). Optimization of big data transfers over inter-cloud and intra-cloud networks is a challenging task that requires joint consideration of all of these parameters. This optimization task becomes even more challenging when transferring datasets composed of heterogeneous file sizes (i.e., large files and small files mixed). Previous work in this area focuses only on the end-system and network characteristics and does not provide models of the dataset characteristics. In this study, we analyze the effects of the three most important transfer parameters that are used to enhance data transfer throughput: pipelining, parallelism, and concurrency. We provide models and guidelines to set the best values for these parameters and present two different transfer optimization algorithms that use the models developed. The tests conducted over high-speed networking and cloud testbeds show that our algorithms outperform the most popular data transfer tools, such as Globus Online and UDT, in the majority of cases.

International Conference on Parallel Processing | 2013

Dynamic protocol tuning algorithms for high performance data transfers

Engin Arslan; Brandon Ross; Tevfik Kosar

Obtaining optimal data transfer performance is of utmost importance to today's data-intensive distributed applications and wide-area data replication services. Doing so necessitates effectively utilizing available network bandwidth and resources, yet in practice transfers seldom reach the levels of utilization they potentially could. Tuning protocol parameters such as pipelining, parallelism, and concurrency can significantly increase utilization and performance; however, determining the best settings for these parameters is a difficult problem, as network conditions can vary greatly between sites and over time. In this paper, we present four application-level algorithms for heuristically tuning protocol parameters for data transfers in wide-area networks. Our algorithms dynamically tune the number of parallel data streams per file, the level of control channel pipelining, and the number of concurrent file transfers to fill network pipes. The presented algorithms are implemented as a standalone service and are also used in interaction with external data scheduling tools such as Stork. The experimental results are very promising, and our algorithms outperform existing solutions in this area.

IEEE International Conference on High Performance Computing Data and Analytics | 2016

HARP: predictive transfer optimization based on historical analysis and real-time probing

Engin Arslan; Kemal Guner; Tevfik Kosar

Increasingly data-intensive scientific and commercial applications require frequent movement of large datasets from one site to another. Despite the growing capacity of networks, these data movements rarely achieve the promised data transfer rates of the underlying physical network because of poorly tuned data transfer protocols. Accurately and efficiently tuning the data transfer protocol parameters in a dynamically changing network environment is a big challenge and still an open research problem. In this paper, we present predictive end-to-end data transfer optimization algorithms based on historical data analysis and real-time background traffic probing, dubbed HARP. Most of the existing work in this area is based solely on real-time network probing, which either causes too much sampling overhead or fails to accurately predict the correct transfer parameters. Combining historical data analysis with real-time sampling enables our algorithms to tune the application-level data transfer parameters accurately and efficiently to achieve close-to-optimal end-to-end data transfer throughput with very low overhead. Our experimental analysis over a variety of network settings shows that HARP outperforms existing solutions by up to 50% in terms of the achieved throughput.

IEEE International Conference on High Performance Computing Data and Analytics | 2015

Energy-aware data transfer algorithms

Ismail Alan; Engin Arslan; Tevfik Kosar

The amount of data moved over the Internet per year has already exceeded the Exabyte scale and will soon hit the Zettabyte range. To support this massive amount of data movement across the globe, the networking infrastructure as well as the source and destination nodes consume an immense amount of electric power, with an estimated cost measured in billions of dollars. Although a considerable amount of research has been done on power management techniques for the networking infrastructure, there has not been much prior work focusing on energy-aware data transfer algorithms for minimizing the power consumed at the end-systems. We introduce novel data transfer algorithms that aim to achieve high data transfer throughput while keeping the energy consumption during the transfers at minimal levels. Our experimental results show that our energy-aware data transfer algorithms can achieve up to 50% energy savings with the same or higher level of data transfer throughput.

Proceedings of the 5th International Workshop on Data-Intensive Computing in the Clouds | 2014

Locality and network-aware reduce task scheduling for data-intensive applications

Engin Arslan; Mrigank Shekhar; Tevfik Kosar

MapReduce is one of the leading programming frameworks for implementing data-intensive applications by splitting the map and reduce tasks across distributed servers. Although there has been a substantial amount of work on map task scheduling and optimization in the literature, the work on reduce task scheduling is very limited. Effective scheduling of the reduce tasks to the resources becomes especially important for the performance of data-intensive applications where large amounts of data are moved between the map and reduce tasks. In this paper, we propose a new algorithm (LoNARS) for reduce task scheduling, which takes both data locality and network traffic into consideration. Data locality awareness aims to schedule the reduce tasks closer to the map tasks to decrease the delay in data access as well as the amount of traffic pushed to the network. Network traffic awareness aims to distribute the traffic over the whole network and minimize hotspots to reduce the effect of network congestion on data transfers. We have integrated LoNARS into Hadoop-1.2.1. Using our LoNARS algorithm, we achieved up to 15% gain in data shuffling time and up to 3-4% improvement in total job completion time compared to the other reduce task scheduling algorithms. Moreover, we reduced the amount of traffic on network switches by 15%, which helps to reduce energy consumption considerably.

Workshop on Local and Metropolitan Area Networks | 2011

Network management game

Engin Arslan; Murat Yuksel; Mehmet Hadi Gunes

Network management and automated configuration of large-scale networks is one of the crucial issues for Internet Service Providers (ISPs). Since wrong configurations might cause an enormous amount of customer traffic to be lost, highly experienced network administrators are typically the ones trusted with the management and configuration of a running ISP network. We frame the management and experimentation of a network as a “game” for training network administrators without having to risk the network operation. The interactive environment treats the trainee network administrators as players of a game and tests them with various network failures or dynamics. To prototype the concept of “network management as a game”, we modified NS-2 to establish an interactive simulation engine and connected the modified engine to a graphical user interface for traffic animation and interactivity with the player. We present initial results from our game applied to a small set of players.

Cluster Computing and the Grid | 2014

Energy-Aware Data Transfer Tuning

Ismail Alan; Engin Arslan; Tevfik Kosar

The annual electricity consumed by data transfers in the U.S. is estimated to be 20 Terawatt hours, which translates to around 4 billion U.S. Dollars per year. There has been a considerable amount of prior work looking at power management and energy efficiency in hardware and software systems, and more recently in power-aware networking. Despite the growing body of research in power management techniques for the networking infrastructure, there has been no prior work (to the best of our knowledge) focusing on saving energy at the end systems (sender and receiver nodes) during the data transfer. We argue that although network-only approaches are part of the solution, end-system power management is key to optimizing the energy efficiency of data transfers and has long been ignored. In this paper, we analyze various factors that affect the power consumption in end-to-end data transfers, such as the level of parallelism, concurrency, and pipelining. Our results show that significant energy savings can be achieved at the end-systems during data transfer with no or minimal performance penalty.

Global Communications Conference | 2011

Analysis of academic ties: A case study of Mathematics Genealogy

Engin Arslan; Mehmet Hadi Gunes; Murat Yuksel

Analyzing social networks of communities helps us obtain local and large-scale information about social relations. We can observe how structural as well as behavioral changes occur among the members of a social network over the years. In this paper, we analyze academic ties of mathematicians using the Mathematics Genealogy Project data. Additionally, using university and nation information of mathematicians, we examine which universities or nations are more correlated as well as how these correlations change over the years.

Journal of Parallel and Distributed Computing | 2018

Big data transfer optimization through adaptive parameter tuning

Engin Arslan; Bahadir A. Pehlivan; Tevfik Kosar

Obtaining optimal data transfer performance is of utmost importance to today’s data-intensive distributed applications and wide-area data replication services. Tuning application-layer protocol parameters such as pipelining, parallelism, and concurrency can significantly increase efficient utilization of the available network bandwidth as well as the end-to-end data transfer performance. However, determining the best settings for these parameters is a challenging problem, as network conditions can vary greatly between sites and over time. Poor protocol tuning can cause either under- or over-utilization of network resources and thus degrade transfer performance. In this paper, we present three novel algorithms for application-layer parameter tuning and transfer scheduling to maximize transfer throughput in wide-area networks. Our algorithms use heuristics to tune the level of control channel pipelining (for small file optimization), the number of parallel data streams per file (for large file optimization), and the number of concurrent file transfers to increase I/O throughput (for all types of files). The proposed algorithms improve the transfer throughput by up to 10x compared to the baseline and 7x compared to the state-of-the-art solutions. We also propose adaptive tuning to adjust the values of parameters based on real-time observations. The results show that adaptive tuning can further improve transfer throughput by up to 24% compared to the heuristic approach.
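
To make the roles of the three parameters concrete, the following is a minimal illustrative sketch and not the algorithms from the paper. It assumes a simple rule based on the bandwidth-delay product (BDP): small files get pipelining, large files get parallel streams, and concurrency is capped by a limit. The function name `suggest_params`, its arguments, and every formula in it are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class TransferParams:
    pipelining: int   # transfer commands queued on the control channel
    parallelism: int  # parallel data streams per file
    concurrency: int  # files transferred at the same time


def suggest_params(avg_file_size, num_files, bandwidth_bps, rtt_s,
                   buffer_size, max_concurrency=8):
    """Hypothetical heuristic; the thresholds and formulas are assumptions."""
    # Bandwidth-delay product in bytes: data in flight needed to fill the link.
    bdp = bandwidth_bps / 8 * rtt_s

    if avg_file_size < bdp:
        # Small files: queue enough requests to cover one BDP, one stream each.
        pipelining = max(1, math.ceil(bdp / avg_file_size) - 1)
        parallelism = 1
    else:
        # Large files: no pipelining, enough streams for buffers to cover the BDP.
        pipelining = 0
        parallelism = max(1, math.ceil(bdp / buffer_size))

    # Concurrency applies to all file sizes; bounded by a cap and the file count.
    concurrency = max(1, min(max_concurrency, num_files))
    return TransferParams(pipelining, parallelism, concurrency)


if __name__ == "__main__":
    # 1 Gbps link with 100 ms RTT, 4 MB buffers (example values only).
    print(suggest_params(1_000_000, 500, 1e9, 0.1, 4_000_000))
    print(suggest_params(1_000_000_000, 3, 1e9, 0.1, 4_000_000))
```

An adaptive variant in the spirit of the abstract would re-run such a rule, or adjust its output, as throughput measurements arrive during the transfer; that feedback loop is not shown here.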

Data-Intensive Computing in the Clouds (DataCloud), 2014 5th International Workshop on | 2015