Genady Grabarnik | Researchain

Publication

Featured research published by Genady Grabarnik.

network operations and management symposium | 2004

Real-time problem determination in distributed systems using active probing

Irina Rish; Mark Brodie; Natalia Odintsova; Sheng Ma; Genady Grabarnik

We describe algorithms and an architecture for a real-time problem determination system that uses online selection of the most informative measurements, an approach we call active probing. Probes are end-to-end test transactions which gather information about system components. Active probing allows probes to be selected and sent on demand, in response to one's belief about the state of the system. At each step the most informative next probe is computed and sent. As probe results are received, belief about the system state is updated using probabilistic inference. This process continues until the problem is diagnosed. We demonstrate through both analysis and simulation that the active probing scheme greatly reduces both the number of probes and the time needed for localizing the problem when compared with non-active probing schemes.
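The probe-selection loop described in this abstract can be illustrated with a minimal sketch. It is not the authors' system: it assumes exactly one faulty component, noise-free probes (a probe fails if and only if it passes through the faulty component), and a uniform prior, whereas the paper uses general probabilistic inference. The names `diagnose`, `expected_entropy`, and the probe table are hypothetical.

```python
import math

def normalize(weights):
    """Rescale non-negative weights so they sum to 1."""
    total = sum(weights.values())
    return {c: w / total for c, w in weights.items()}

def entropy(belief):
    """Shannon entropy (bits) of a belief over the faulty component."""
    return -sum(p * math.log2(p) for p in belief.values() if p > 0)

def split(belief, covered, failed):
    """Keep only the hypotheses consistent with a probe outcome."""
    return {c: p for c, p in belief.items() if (c in covered) == failed}

def expected_entropy(belief, covered):
    """Expected posterior entropy after sending a probe over `covered`."""
    total = 0.0
    for failed in (True, False):
        consistent = split(belief, covered, failed)
        p_outcome = sum(consistent.values())
        if p_outcome > 0:
            total += p_outcome * entropy(normalize(consistent))
    return total

def diagnose(probes, components, run_probe):
    """Greedily send the most informative probe until one hypothesis remains."""
    belief = normalize({c: 1.0 for c in components})  # assumed uniform prior
    sent = []
    while len(belief) > 1:
        remaining = [name for name in probes if name not in sent]
        if not remaining:
            break  # no probes left; the diagnosis stays ambiguous
        best = min(remaining, key=lambda name: expected_entropy(belief, probes[name]))
        failed = run_probe(probes[best])
        sent.append(best)
        belief = normalize(split(belief, probes[best], failed))  # Bayes update
    return belief, sent

# Hypothetical system: each probe lists the components it passes through.
probes = {"p1": {"A", "B"}, "p2": {"A", "C"}, "p3": {"A", "B", "C"}}
true_fault = "C"
belief, sent = diagnose(probes, ["A", "B", "C", "D"], lambda covered: true_fault in covered)
print(sent, belief)
```

In this toy setup two probes are enough to isolate the fault among four components, because each selected probe halves the set of consistent hypotheses.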

international conference on machine learning | 2008

Closed-form supervised dimensionality reduction with generalized linear models

Irina Rish; Genady Grabarnik; Guillermo A. Cecchi; Francisco Pereira; Geoffrey J. Gordon

We propose a family of supervised dimensionality reduction (SDR) algorithms that combine feature extraction (dimensionality reduction) with learning a predictive model in a unified optimization framework, using data- and class-appropriate generalized linear models (GLMs), and handling both classification and regression problems. Our approach uses simple closed-form update rules and is provably convergent. Promising empirical results are demonstrated on a variety of high-dimensional datasets.

congress on evolutionary computation | 2007

Management of Service Process QoS in a Service Provider - Service Supplier Environment

Genady Grabarnik; Heiko Ludwig; Larisa Shwartz

IT service providers typically must comply with service level agreements (SLAs) that are part of their usage contracts with customers. Not only is IT infrastructure subject to service level guarantees such as availability or response time, but so are service management processes as defined by the IT Infrastructure Library (ITIL), such as change and incident processes and the fulfillment of service requests. SLAs relating to service management processes typically address metrics such as initial response time and fulfillment time. Large service providers can choose which internal service delivery team or external service provider they assign to the atomic processes of a service process, each of which has different costs or prices associated with it for different turn-around times at different risk. This choice in QoS of different service providers can be used to manage the trade-off between penalty costs and fulfillment cost. This paper proposes a model as a basis for service provider choice at process request time. This model can be used to reduce total service costs of IT service providers using alternative delivery teams and external service providers.
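The trade-off between fulfillment cost and penalty cost can be made concrete with a small sketch. This is an illustrative assumption, not the model from the paper: each offer is reduced to a price and a probability of missing the SLA deadline, and the expected cost is taken as price plus that probability times the penalty. The `Offer` type and all numbers are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Offer:
    supplier: str   # delivery team or external provider (hypothetical names)
    price: float    # fulfillment cost charged for the atomic process
    p_late: float   # assumed probability of missing the SLA deadline

def expected_cost(offer, penalty):
    """Fulfillment cost plus expected SLA penalty for one offer."""
    return offer.price + offer.p_late * penalty

def choose(offers, penalty):
    """Pick the offer with the lowest expected total cost."""
    return min(offers, key=lambda o: expected_cost(o, penalty))

offers = [
    Offer("internal team", price=100.0, p_late=0.30),
    Offer("external provider", price=150.0, p_late=0.05),
]

# A small penalty favors the cheaper, riskier team; a large one reverses the choice.
print(choose(offers, penalty=100.0).supplier)
print(choose(offers, penalty=500.0).supplier)
```

The point of the sketch is only that the best assignment depends on the penalty attached to the request, which is why the choice is made at process request time.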

knowledge discovery and data mining | 2003

Data-driven validation, completion and construction of event relationship networks

Chang-shing Perng; David Thoenen; Genady Grabarnik; Sheng Ma; Joseph L. Hellerstein

Event management is a focal point in building and maintaining high quality information infrastructures. We have witnessed the shift of the paradigm of event management in practice from root cause analysis (RCA) to action-oriented analysis (AOA). IBM has developed a pioneering event management methodology (EMD) based on the AOA paradigm and applied it to more than two hundred production sites with success. Foreseeably, more and more event management professionals will apply AOA in different incarnations in building proactive management facilities. As a result, building correct and effective Event Relationship Networks (ERNs) becomes the dominant activity in the AOA service design process. Currently, the quality of ERNs and the cost of building them largely depend on the knowledge of domain experts. We believe that we can utilize historical event logs to shorten the ERN design process and improve the quality of ERNs. In this paper, we describe in detail how to apply this data-driven approach in ERN validation, completion and construction.

network operations and management symposium | 2012

Optimizing system monitoring configurations for non-actionable alerts

Liang Tang; Tao Li; Florian Pinel; Larisa Shwartz; Genady Grabarnik

Today's competitive business climate and the complexity of IT environments dictate efficient and cost-effective service delivery and support of IT services. This is largely achieved through automation of routine maintenance procedures, including problem detection, determination and resolution. System monitoring provides an effective and reliable means for problem detection. Coupled with automated ticket creation, it ensures that a degradation of the vital signs, defined by acceptable thresholds or monitoring conditions, is flagged as a problem candidate and sent to supporting personnel as an incident ticket. This paper describes a novel methodology and a system for minimizing non-actionable tickets while preserving all tickets which require corrective action. Our proposed method defines monitoring conditions and the optimal corresponding delay times based on an off-line analysis of historical alerts and the matching incident tickets. Potential monitoring conditions are built on a set of predictive rules which are automatically generated by a rule-based learning algorithm with coverage, confidence and rule complexity criteria. These conditions and delay times are propagated as configurations into run-time monitoring systems.

knowledge discovery and data mining | 2013

An integrated framework for optimizing automatic monitoring systems in large IT infrastructures

Liang Tang; Tao Li; Larisa Shwartz; Florian Pinel; Genady Grabarnik

The competitive business climate and the complexity of IT environments dictate efficient and cost-effective service delivery and support of IT services. These are largely achieved by automating routine maintenance procedures, including problem detection, determination and resolution. System monitoring provides an effective and reliable means for problem detection. Coupled with automated ticket creation, it ensures that a degradation of the vital signs, defined by acceptable thresholds or monitoring conditions, is flagged as a problem candidate and sent to supporting personnel as an incident ticket. This paper describes an integrated framework for minimizing false positive tickets and maximizing the monitoring coverage for system faults. In particular, the integrated framework defines monitoring conditions and the optimal corresponding delay times based on an off-line analysis of historical alerts and incident tickets. Potential monitoring conditions are built on a set of predictive rules which are automatically generated by a rule-based learning algorithm with coverage, confidence and rule complexity criteria. These conditions and delay times are propagated as configurations into run-time monitoring systems. Moreover, some misconfigured monitoring conditions can be corrected using false negative tickets that are discovered by a separate text classification algorithm in this framework. This paper also provides implementation details of a program product that uses this framework and shows some illustrative examples of successful results.

integrated network management | 2009

Towards an optimized model of incident ticket correlation

Patricia Marcu; Genady Grabarnik; Laura Z. Luan; Daniela Rosu; Larisa Shwartz; Christopher Ward

In recent years, IT Service Management (ITSM) has become one of the most researched areas of IT. Incident and Problem Management are two of the Service Operation processes in the IT Infrastructure Library (ITIL). These two processes aim to recognize, log, isolate and correct errors which occur in the environment and disrupt the delivery of services. Incident Management and Problem Management form the basis of the tooling provided by an Incident Ticket System (ITS).

network operations and management symposium | 2014

Hierarchical multi-label classification over ticket data using contextual loss

Chunqiu Zeng; Tao Li; Larisa Shwartz; Genady Grabarnik

Maximal automation of routine IT maintenance procedures is an ultimate goal of IT service management. System monitoring, an effective and reliable means for IT problem detection, generates monitoring tickets to be processed by system administrators. IT problems are naturally organized in a hierarchy by specialization. The problem hierarchy is used to help triage tickets to the processing team for problem resolution. In this paper, a hierarchical multi-label classification method is proposed to classify the monitoring tickets by utilizing the problem hierarchy. In order to find the most effective classification, a novel contextual hierarchy (CH) loss is introduced in accordance with the problem hierarchy. The resulting optimization problem is solved by a new greedy algorithm. An extensive empirical study over ticket data was conducted to validate the effectiveness and efficiency of our method.

integrated network management | 2015

Resolution recommendation for event tickets in service management

Wubai Zhou; Liang Tang; Tao Li; Larisa Shwartz; Genady Grabarnik

In recent years, IT Service Providers have been rapidly transforming to an automated service delivery model. This is due to advances in technology and driven by the unrelenting market pressure to reduce cost and maintain quality. Tremendous progress has been made to date towards attainment of truly automated service delivery; that is, the ability to deliver the same service automatically using the same process with the same quality. However, automating Incident and Problem Management continues to be a difficult problem, particularly due to the growing complexity of IT environments. Software monitoring systems are designed to actively collect and signal event occurrences and, when necessary, automatically generate incident tickets. Repeating events generate similar tickets, which in turn have a vast number of repeated problem resolutions likely to be found in earlier tickets. In this paper we find an appropriate resolution by making use of similarities between the events and previous resolutions of similar events. The traditional KNN (K Nearest Neighbor) algorithm has been used to recommend resolutions for incoming tickets. However, the effectiveness of recommendation heavily relies on the underlying similarity measure in KNN. In this paper, we significantly improve the similarity measure used in KNN by utilizing both the event and resolution information in historical tickets via a topic-level feature extraction using the LDA (Latent Dirichlet Allocation) model. In addition, when resolution categories are available, we propose to learn a more effective similarity measure using metric learning. Extensive empirical evaluations on three ticket data sets demonstrate the effectiveness and efficiency of our proposed methods.
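The baseline that this abstract improves on, KNN-based resolution recommendation, can be sketched as follows. This is only the traditional baseline under simplifying assumptions: similarity is plain Jaccard overlap of event tokens, standing in for the LDA topic features and learned metric proposed in the paper. The ticket texts, resolutions, and function names are hypothetical.

```python
from collections import defaultdict

def jaccard(a, b):
    """Token-set overlap between two event descriptions (stand-in similarity)."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    union = tokens_a | tokens_b
    return len(tokens_a & tokens_b) / len(union) if union else 0.0

def recommend(event, history, k=3):
    """Rank resolutions of the k most similar historical tickets.

    `history` is a list of (event_text, resolution) pairs; each neighbor
    votes for its resolution with a weight equal to its similarity.
    """
    neighbors = sorted(history, key=lambda t: jaccard(event, t[0]), reverse=True)[:k]
    scores = defaultdict(float)
    for text, resolution in neighbors:
        scores[resolution] += jaccard(event, text)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical historical tickets.
history = [
    ("disk usage above 90 percent on /var", "clean up log files"),
    ("disk usage above 95 percent on /tmp", "clean up log files"),
    ("cpu utilization high on db server", "restart runaway process"),
    ("service httpd not responding", "restart httpd"),
]

print(recommend("disk usage above 92 percent on /var", history))
```

Because the ranking depends entirely on `jaccard`, swapping in a better similarity function changes the recommendations without touching the voting logic, which is the part of the pipeline the paper targets.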

international conference on autonomic computing | 2004

Generic adapter logging toolkit

Genady Grabarnik; Abdi Salahshour; Balan Subramanian; Sheng Ma

The generic adapter for autonomic computing provides a framework for a unified approach to transform software messages and events into a standard situational event format in the autonomic computing architecture. The adapter as an application supports self-configuration and self-optimization.
