Publication


Featured research published by Andres J. Gonzalez.


Pacific Rim International Symposium on Dependable Computing | 2010

Analysis of Dependencies between Failures in the UNINETT IP Backbone Network

Andres J. Gonzalez; Bjarne E. Helvik; Jon Kåre Hellan; Pirkko Kuusela

Dependencies between failures in operational networks may have a huge impact on their reliability and availability. In this paper we analyze failure logs to identify simultaneous and potentially correlated failures in routers and links of an IP backbone network. We show that the actual behavior of failure processes does not support the independence assumption commonly used in theoretical studies. Scatter plots are presented to visualize the failure processes, and it is seen that geographical adjacency has a pronounced effect. The existence of high correlation coefficients and high autocorrelation in some failure processes was observed. A formal analysis confirms this. The consequences of these dependencies on the provisioning of guaranteed availability are briefly discussed.
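The two dependence measures named in this abstract, the correlation coefficient between failure processes and the autocorrelation within one process, can be illustrated with a minimal sketch. It assumes failures have already been binned into per-interval counts; the series `link_a` and `link_b` are invented for illustration and are not UNINETT data, and the paper's own estimators may differ.

```python
from statistics import mean

def pearson(x, y):
    """Pearson correlation coefficient between two equally long series."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

def autocorrelation(x, lag=1):
    """Sample autocorrelation of one series at the given lag."""
    mx = mean(x)
    num = sum((x[i] - mx) * (x[i + lag] - mx) for i in range(len(x) - lag))
    den = sum((a - mx) ** 2 for a in x)
    return num / den

# Hypothetical failure counts per day for two adjacent links.
link_a = [0, 2, 1, 0, 0, 3, 2, 0, 0, 1]
link_b = [0, 1, 1, 0, 0, 2, 2, 0, 1, 1]

cross = pearson(link_a, link_b)          # dependence between the two links
serial = autocorrelation(link_a, lag=1)  # dependence within one link over time
```

Under the independence assumption both values would be close to zero for long series; values far from zero are the kind of evidence the abstract refers to.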


Advanced Information Networking and Applications | 2011

Analysis of Failures Characteristics in the UNINETT IP Backbone Network

Andres J. Gonzalez; Bjarne E. Helvik

Core backbone networks must be designed to guarantee high levels of availability. Any interruption in the services that they provide may have massive consequences. For this reason there is great interest in developing methods that keep network robustness at the desired level. The design of these methods relies on models that need input information such as the operational state of network components, which are stochastic variables. The aim of this paper is to provide insight into the behavior of core networks based on real operational data, in order to help future related work make more realistic assumptions. Based on failure logs provided by UNINETT, we analyze availability levels and failure intensities in routers and links. We show that links may be classified into three groups with different properties. Additionally, we observe that some links have dependability features similar to those of routers, which makes the perfect node assumption used in many related studies incorrect. Finally, parametrization techniques were used to fit the empirical processes with well-known distributions. We observe that the Weibull assumption traditionally used to model link failure processes properly fits the behavior of routers and short distance links, but for long distance fibers the gamma distribution seems to fit better.


International Journal of Space-Based and Situated Computing | 2012

Characterisation of router and link failure processes in UNINETT's IP backbone network

Andres J. Gonzalez; Bjarne E. Helvik

Backbone networks must be highly reliable. The offered availability can be predicted prior to operation if the stochastic behaviour of network components is known. The aim of this paper is to provide information about failure and repair processes in an operational network. Operational logs from UNINETT's core network were analysed to obtain distributions of the time between failures and downtimes of routers and links. The network components were classified according to their role in the network. The measured processes were fitted with well-known distributions. The inter-failure times of routers and short distance links may be characterised by a Weibull distribution, but for the long distance links the gamma distribution yielded a better characterisation. The difference is discussed based on the hazard function. The parameters of each network component are made available and provide a detailed insight that may be used for dependability predictions and research.
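As background for the hazard function mentioned in this abstract, the sketch below evaluates the textbook Weibull hazard rate h(t) = (k/λ)(t/λ)^(k-1). The function name and the parameter values are illustrative assumptions, not the parameters fitted in the paper.

```python
def weibull_hazard(t, shape, scale):
    """Textbook Weibull hazard rate h(t) = (k / lam) * (t / lam) ** (k - 1)."""
    return (shape / scale) * (t / scale) ** (shape - 1)

# Hypothetical parameters: a shape below one gives a hazard that falls with
# the time elapsed since the last failure; a shape of one gives the constant
# hazard of the exponential distribution.
early = weibull_hazard(1.0, shape=0.5, scale=1.0)
late = weibull_hazard(4.0, shape=0.5, scale=1.0)
constant = weibull_hazard(5.0, shape=1.0, scale=2.0)
```

A decreasing hazard means a new failure is most likely shortly after the previous one, so failures cluster in time instead of arriving at a constant rate.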


IEEE Latin-American Conference on Communications | 2009

Guaranteeing service availability in SLAs; a study of the risk associated with contract period and failure process

Andres J. Gonzalez; Bjarne E. Helvik

Service Level Agreements (SLAs) are a common means to define the obligations of network/service providers and users in business relationships. The terms that define the guaranteed availability for a given period are an important element of these contracts. Selecting appropriate values is difficult due to the large number of variables involved, the complexities of the network and service provision, and the computational challenge posed by the transient solution, as opposed to a steady state solution, that is needed. A common policy is to use the steady state availability as a reference. Nevertheless, this simplification may put contract fulfillment at risk, as the stochastic variation of the measured availability is significant over a typical contract period. This paper analyzes the relevance of interval availability analysis for SLAs, and provides suggestions to network providers on the selection of adequate availability guarantees. The interval availability of unprotected and shared protected connections is studied under exponential and Weibull failure and repair distributions. It is observed that for a single path scenario, a small reduction of the guaranteed availability below the steady state value considerably improves the probability of meeting the requirements. The same is the case for connections with shared backup protection. However, performing this analysis in the transient domain is quite demanding. Hence, to simplify it, it is proposed to obtain the steady state results and introduce a safeguard factor to ensure that the availability guarantee is met. For Weibull distributed times between failures where the shape factor is less than one (as observed in operational networks), the probability of meeting a guaranteed availability over a finite contract period decreases more radically than for the commonly assumed Poisson failure process. This increases the importance of making a transient analysis.
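The gap between steady state availability and the availability measured over a finite contract can be illustrated with a small Monte Carlo sketch. It assumes a single unprotected component with exponential up and down times; the MTBF, MTTR, contract length and the 0.005 margin are hypothetical values chosen for illustration, and the simulation stands in for the analysis in the paper, not for its actual method.

```python
import random

def interval_availability(mtbf, mttr, period, rng):
    """Fraction of one contract period spent in the up state."""
    t, up_time, up = 0.0, 0.0, True
    while t < period:
        # Draw the length of the current up or down phase.
        d = rng.expovariate(1.0 / (mtbf if up else mttr))
        end = min(t + d, period)  # truncate the phase at the contract end
        if up:
            up_time += end - t
        t = end
        up = not up
    return up_time / period

def success_probability(samples, guarantee):
    """Share of simulated contracts whose availability meets the guarantee."""
    return sum(a >= guarantee for a in samples) / len(samples)

rng = random.Random(1)                     # fixed seed for repeatability
mtbf, mttr, period = 1000.0, 10.0, 8760.0  # hypothetical values, in hours
steady = mtbf / (mtbf + mttr)              # steady state availability

samples = [interval_availability(mtbf, mttr, period, rng) for _ in range(2000)]
p_steady = success_probability(samples, steady)
p_relaxed = success_probability(samples, steady - 0.005)
```

Comparing `p_steady` with `p_relaxed` shows how much a guarantee placed slightly below the steady state value raises the chance of fulfilling the contract, which is the role the abstract gives to the safeguard factor.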


Computer Communications | 2013

SLA success probability assessment in networks with correlated failures

Andres J. Gonzalez; Bjarne E. Helvik

Service Level Agreements (SLAs) are used to define obligations between network/service providers and customers in business relationships. The terms that define the guaranteed availability for a given period are fundamental to these contracts. Selecting the availability to be promised is still an open challenge for network operators for three reasons: (i) SLAs are defined for finite periods, and hence the stochastic properties of the availability have to be considered; (ii) real operational networks do not have Markovian properties; (iii) the way that correlation affects the interval availability in operational networks is unknown. In this work, we show the impact of dependent failures on SLAs, based on operational failure data obtained from the UNINETT network. Using these data, we simulate the behavior of network connections that use shared backup protection. We evaluate the SLA success probability using two different methods. First, we apply trace driven simulation combined with random circular shifting. Second, we develop a model that uses Monte Carlo techniques. This approach includes the characterization of up and down times of each network component and the use of a model that generates correlated samples based on fitted marginal distributions. Finally, we analyze the probability density function of the interval availability for different observation periods under independent and correlated failures.


IEEE International Conference on Cloud Computing Technology and Science | 2012

System management to comply with SLA availability guarantees in cloud computing

Andres J. Gonzalez; Bjarne E. Helvik

SLAs are a common means to define specifications and requirements of cloud computing services in business relationships. The terms that define the guaranteed availability for a given period are fundamental to these contracts. In this context, a natural question for cloud providers is: how can the promised availability be guaranteed? This paper studies the level of availability offered to a virtual machine during an SLA period in clouds that differ in size, redundancy, and fault tolerance techniques. Finally, this paper proposes the use of the SLA budget for the implementation of smart policies in: (i) the assignment of spare servers when virtual machines are restored; (ii) the dynamic use of different fault tolerance licenses. Using such policies considerably reduces the probability of breaching the SLA guarantee by making efficient use of the available cloud resources. This paper is a first step in the design of SLA-aware cloud architectures.


IEEE Latin America Transactions | 2010

LatinCon08 - Guaranteeing Service Availability in SLAs; a Study of the Risk Associated with Contract Period and Failure Process

Andres J. Gonzalez; Bjarne E. Helvik

Service Level Agreements (SLAs) are a common means to define the obligations of network/service providers and users in business relationships. The terms that define the guaranteed availability for a given period are an important element of these contracts. Selecting appropriate values is difficult due to the large number of variables involved, the complexities of the network and service provision, and the computational challenge posed by the transient solution, as opposed to a steady state solution, that is needed. A common policy is to use the steady state availability as a reference. Nevertheless, this simplification may put contract fulfillment at risk, as the stochastic variation of the measured availability is significant over a typical contract period. This paper analyzes the relevance of interval availability analysis for SLAs, and provides suggestions to network providers on the selection of adequate availability guarantees. The interval availability of unprotected and shared protected connections is studied under exponential and Weibull failure and repair distributions. It is observed that for a single path scenario, a small reduction of the guaranteed availability below the steady state value considerably improves the probability of meeting the requirements. The same is the case for connections with shared backup protection. However, performing this analysis in the transient domain is quite demanding. Hence, to simplify it, it is proposed to obtain the steady state results and introduce a safeguard factor to ensure that the availability guarantee is met. For Weibull distributed times between failures where the shape factor is less than one, as observed in operational networks, the probability of meeting a guaranteed availability over a finite contract period decreases more radically than for the commonly assumed Poisson failure process. This increases the importance of making a transient analysis.


2010 IEEE International Workshop Technical Committee on Communications Quality and Reliability (CQR 2010) | 2010

Dynamic sharing mechanism for guaranteed availability in MPLS based networks

Andres J. Gonzalez; Bjarne E. Helvik

This paper proposes an algorithm for allocating connections with bandwidth and availability requirements in a telecommunication network where the connections may share resources in their backup paths. The allocation is dynamic in the sense that allocation requests have random arrival times, sources and destinations, and allocated connections have a random duration. The algorithm has been designed for core backbone networks with continuous bandwidth distribution, such as MPLS networks. Efficient bandwidth utilization was obtained by an intelligent sharing mechanism that takes into account the properties of networks with continuous bandwidth allocation. The problem may be formulated as an NIP (Nonlinear Integer Programming) problem. However, due to the well-known complexity and scalability limitations in solving this kind of problem, the solution is based on heuristic procedures. A performance comparison with previously published algorithms is carried out for some "reference networks", demonstrating substantially better resource usage.


International Telecommunications Network Strategy and Planning Symposium | 2012

Guaranteeing SLA availability in telecommunications networks

Andres J. Gonzalez; Bjarne E. Helvik

Network operators have to allocate connections that fulfill the availability requirements stipulated in SLAs for a finite interval. However, accurately modeling the transient solution of repairable systems is still an open challenge. We study the SLA penalty scheme and propose a model to allocate connections with SLA requirements, maximizing the operator profit through a two-stage stochastic program. Our model considers the stochastic behavior of network components, correlation between failure/repair processes, the finite duration of the SLA, and the flexibility to allocate or reject a connection based on its commercial convenience. The model is designed for three different protection schemes: unprotected, dedicated backup and shared backup.


Network Computing and Applications | 2013

Hybrid Cloud Management to Comply Efficiently with SLA Availability Guarantees

Andres J. Gonzalez; Bjarne E. Helvik

SLAs are a common means to define specifications and requirements of cloud computing services, where the guaranteed availability is one of the most important parameters. Fulfilling the stipulated availability may be expensive, due to the cost of failure recovery software and the amount of physical equipment needed to deploy the cloud services. Therefore, a relevant question for cloud providers is: how can the SLA availability be guaranteed in a cost-efficient way? This paper studies different fault tolerance techniques available in the market, and it proposes the use of hybrid management to have full control over the SLA risk, using only the necessary resources in order to keep operation cost-efficient. This paper shows how to model the probability distribution of the accumulated downtime, and how this can be used in the design of hybrid policies. Using specific case studies, this paper illustrates how to implement the proposed hybrid policies, and it shows the cost savings obtained by using them. This paper takes advantage of the flexibility of cloud computing, and it opens the door to the use of dynamic management policies to reach specific performance objectives in ICT systems.

Collaboration


Dive into Andres J. Gonzalez's collaborations.

Top Co-Authors

Bjarne E. Helvik
Norwegian University of Science and Technology

Andrzej Kamisinski
AGH University of Science and Technology

Poul E. Heegaard
Norwegian University of Science and Technology

Denis M. Becker
Norwegian University of Science and Technology

Otto J. Wittner
Norwegian University of Science and Technology

Prakriti Tiwari
Norwegian University of Science and Technology