Shobana Balakrishnan
Microsoft
Publication
Featured research published by Shobana Balakrishnan.
knowledge discovery and data mining | 2014
Herodotos Herodotou; Bolin Ding; Shobana Balakrishnan; Geoff Outhred; Percy Fitter
Large-scale data center networks are complex, comprising several thousand network devices and several hundred thousand links, and form the critical infrastructure on which all higher-level services depend. Despite the built-in redundancy in data center networks, performance issues and device or link failures in the network can lead to user-perceived service interruptions. Therefore, determining and localizing user-impacting availability and performance issues in the network in near real time is crucial. Traditionally, both passive and active monitoring approaches have been used for failure localization. However, data from passive monitoring is often too noisy and does not effectively capture silent or gray failures, whereas active monitoring is effective at detecting faults but, depending on its scale and granularity, limited in its ability to isolate the exact fault location. Our key idea is to use statistical data mining techniques on large-scale active monitoring data to determine a ranked list of suspect causes, which we refine with passive monitoring signals. In particular, we compute a failure probability for devices and links in near real time using data from active monitoring, and look for statistically significant increases in the failure probability. We also correlate the probabilistic output with other failure signals from passive monitoring to increase the confidence of the probabilistic analysis. We have implemented our approach in the Windows Azure production environment and have validated its effectiveness in terms of localization accuracy, precision, and time to localization using known network incidents over the past three months. The correlated ranked list of devices and links is surfaced as a report that network operators use to investigate current issues and identify probable root causes.
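As a rough illustration of the idea in the abstract above (a per-component failure probability from active probes, checked for a statistically significant increase, then ranked), here is a minimal Python sketch. It is not the authors' method: the abstract does not name the statistical test, so a one-sided two-proportion z-test is assumed here as a stand-in, and the function names, the threshold of 3.0, and the component names are all hypothetical.

```python
import math

def failure_probability(failed, total):
    """Fraction of active probes through a component that failed."""
    return failed / total if total else 0.0

def increase_z_score(base_failed, base_total, cur_failed, cur_total):
    """Two-proportion z-score; positive when the current failure rate
    exceeds the baseline rate. (Assumed test, not taken from the paper.)"""
    if base_total == 0 or cur_total == 0:
        return 0.0
    p_base = base_failed / base_total
    p_cur = cur_failed / cur_total
    pooled = (base_failed + cur_failed) / (base_total + cur_total)
    variance = pooled * (1 - pooled) * (1 / base_total + 1 / cur_total)
    if variance == 0:
        return 0.0
    return (p_cur - p_base) / math.sqrt(variance)

def rank_suspects(baseline, current, z_threshold=3.0):
    """Return (name, failure probability, z) for components whose failure
    rate rose significantly, most suspicious first.

    baseline/current map a component name to (failed_probes, total_probes).
    """
    suspects = []
    for name, (cur_failed, cur_total) in current.items():
        base_failed, base_total = baseline.get(name, (0, 0))
        z = increase_z_score(base_failed, base_total, cur_failed, cur_total)
        if z >= z_threshold:
            suspects.append((name, failure_probability(cur_failed, cur_total), z))
    suspects.sort(key=lambda s: s[2], reverse=True)
    return suspects

# Hypothetical probe counts: (failed, total) per device or link.
baseline = {"link-a": (10, 10000), "link-b": (12, 10000), "tor-7": (8, 10000)}
current = {"link-a": (1, 1000), "link-b": (60, 1000), "tor-7": (20, 1000)}

for name, prob, z in rank_suspects(baseline, current):
    print(f"{name}: failure probability {prob:.3f}, z = {z:.1f}")
```

In this sketch, `link-a` keeps its baseline failure rate and is filtered out, while `link-b` and `tor-7` are reported in order of how strongly their failure rates rose. Correlating the ranked list with passive monitoring signals, as the abstract describes, is left out.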
Archive | 2011
Robert M. Fries; Galen C. Hunt; Shobana Balakrishnan
Archive | 2003
Craig Rowland; Adam Sandford; Shobana Balakrishnan; Mark Mccasey
Archive | 2008
Shobana Balakrishnan; Mudit Goel; Dinan Gunawardena; Dave Maltz; Michael D. Schroeder; Fan Yang
operating systems design and implementation | 2014
Shobana Balakrishnan; Richard Black; Austin Donnelly; Paul England; Adam B. Glass; David T. Harper; Sergey Legtchenko; Aaron W. Ogus; Eric C. Peterson; Antony I. T. Rowstron
Archive | 2013
Shobana Balakrishnan; Surajit Chaudhuri
Archive | 2012
Ashvinkumar J. Sanghvi; Shobana Balakrishnan; Vishwajith Kumbalimutt; Anders B. Vinberg; Srivatsan Parthasarathy; James P. Finnigan
Archive | 2006
Guhan Suriyanarayanan; Huisheng Liu; Shobana Balakrishnan; Nikolaj Bjørner
Archive | 2005
Guhan Suriyanarayanan; Nikolaj Bjørner; Rafik Robeal; Shi Cong; Joseph A. Porkka; Christophe Franck Robert; Dan Teodosiu; David P. Golds; Huisheng Liu; Shobana Balakrishnan
Archive | 2003
Sandford L. Spinrad; Steven Richard Hollasch; Shobana Balakrishnan