Haibo Mi

Publication

Featured research published by Haibo Mi.

IEEE International Conference on Services Computing | 2010

Haibo Mi; Huaimin Wang; Gang Yin; Yangfan Zhou; Dianxi Shi; Lin Yuan

In a typical large-scale data center, a set of applications is hosted on virtual machines (VMs) running on a large number of physical machines (PMs). Such virtualization can be used to reduce power consumption by minimizing the number of PMs that must be turned on to meet the applications' resource requirements. However, the resource demands of VMs are dynamic in nature, since the number of user requests the applications must handle changes rapidly in practice. It is a great challenge to reconfigure the VMs online (i.e., to optimize the number and the locations of the VMs) according to these dynamic resource demands. Especially for the emerging large-scale data centers of cloud computing systems, existing approaches either fail to find the best configuration of VMs or cannot produce a result in an acceptable time. In this paper, we propose an online self-reconfiguration approach for reallocating VMs in large-scale data centers. It first accurately predicts the future workloads of the applications with Brown’s quadratic exponential smoothing. Based on this prediction, it adopts a genetic algorithm to efficiently find the optimal reconfiguration policy. The resource utilization of large-scale cloud computing data centers can thus be improved and their energy consumption greatly reduced. We conduct extensive experiments, and the results verify that our approach can switch off more unnecessary running PMs than current approaches without degrading the performance of the whole system.
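
The workload prediction step named in the abstract can be illustrated with a minimal sketch of Brown's quadratic (triple) exponential smoothing in its textbook form. The function name, the default smoothing factor, and the choice to initialize all three smoothed values with the first observation are assumptions made for illustration; the abstract does not state how the paper configures the method.

```python
def brown_quadratic_forecast(series, alpha=0.5, horizon=1):
    """Forecast `horizon` steps ahead with Brown's quadratic exponential smoothing."""
    if not series:
        raise ValueError("series must not be empty")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")

    # Assumption: all three smoothed values start at the first observation.
    s1 = s2 = s3 = float(series[0])
    for x in series[1:]:
        s1 = alpha * x + (1 - alpha) * s1   # single smoothing
        s2 = alpha * s1 + (1 - alpha) * s2  # double smoothing
        s3 = alpha * s2 + (1 - alpha) * s3  # triple smoothing

    beta = 1 - alpha
    # Level, slope, and curvature estimates of the local quadratic trend.
    a = 3 * s1 - 3 * s2 + s3
    b = alpha / (2 * beta ** 2) * (
        (6 - 5 * alpha) * s1 - (10 - 8 * alpha) * s2 + (4 - 3 * alpha) * s3
    )
    c = (alpha / beta) ** 2 * (s1 - 2 * s2 + s3)
    return a + b * horizon + 0.5 * c * horizon ** 2
```

For example, `brown_quadratic_forecast(requests_per_minute, alpha=0.3, horizon=5)` would estimate the load five intervals ahead, where `requests_per_minute` is a hypothetical list of past observations.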

IEEE Transactions on Parallel and Distributed Systems | 2013

Haibo Mi; Huaimin Wang; Yangfan Zhou; Michael R. Lyu; Hua Cai

Performance diagnosis is labor intensive in production cloud computing systems. Such systems typically face many real-world challenges that existing diagnosis techniques for distributed systems cannot effectively solve. An efficient, unsupervised diagnosis tool for locating fine-grained performance anomalies is still lacking in production cloud computing systems. This paper proposes CloudDiag to bridge this gap. Combining a statistical technique and a fast matrix recovery algorithm, CloudDiag can efficiently pinpoint fine-grained causes of performance problems without requiring any domain-specific knowledge of the target system. CloudDiag has been applied in a production cloud computing system to diagnose performance problems. We demonstrate the effectiveness of CloudDiag in three real-world case studies.

IEEE International Conference on Services Computing | 2011

Haibo Mi; Huaimin Wang; Gang Yin; Hua Cai; Qi Zhou; Tingtao Sun; Yangfan Zhou

In large-scale cloud computing systems, even a simple user request may go through numerous services that are deployed on different physical machines. As a result, it is a great challenge to localize the prime causes of performance degradation in such systems online. Existing end-to-end request tracing approaches are not suitable for online anomaly detection because their time complexity is exponential in the size of the trace logs. In this paper, we propose an approach, namely Magnifier, to rapidly diagnose the source of performance degradation in large-scale non-stop cloud systems. In Magnifier, the execution path graph of a user request is modeled by a hierarchical structure consisting of a component layer, a module layer and a function layer, and anomalies are detected layer by layer, from the higher layers to the lower ones. In each layer, every node is assigned a newly created identifier in addition to the global identifier of the request, which significantly decreases the volume of trace logs to be parsed and accelerates the anomaly detection process. We conduct extensive experiments over a real-world enterprise system (the Alibaba cloud computing platform) providing services for the public. The results show that Magnifier can locate the prime causes of performance degradation more accurately and efficiently.

Network Operations and Management Symposium | 2012

Haibo Mi; Huaimin Wang; Gang Yin; Hua Cai; Qi Zhou; Tingtao Sun

In cloud computing systems, an end-to-end request tracing approach helps developers understand the runtime behavior of user requests. Based on trace logs, we propose an approach to localize the abnormal methods that are the primary causes of performance problems. Our approach involves three steps: (1) cluster the user requests into different categories according to request call sequences and select major categories; (2) extract the principal methods that might be the causes of performance degradation; (3) pick out abnormal methods from those principal methods in each major category. We study four cases of performance degradation to validate our approach over a real-world enterprise-class cloud computing platform. The experimental results show that our approach can locate the prime causes of performance problems with low false-positive and false-negative rates.
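
Step (1) of the approach above can be sketched as follows. This is only an illustration: it assumes that two requests belong to the same category when their call sequences match exactly, and that a category is "major" when it holds at least a given share of all requests. The function name, the input shape, and the `min_share` threshold are hypothetical; the abstract does not specify the clustering criterion.

```python
from collections import defaultdict


def major_categories(requests, min_share=0.1):
    """Group requests by call sequence and keep the categories that are large enough.

    `requests` is a list of (request_id, call_sequence) pairs, where
    call_sequence is a list of method names in invocation order.
    """
    groups = defaultdict(list)
    for request_id, call_sequence in requests:
        # Assumption: an identical call sequence defines a category.
        groups[tuple(call_sequence)].append(request_id)

    total = len(requests)
    return {
        sequence: ids
        for sequence, ids in groups.items()
        if len(ids) / total >= min_share
    }
```

Each returned key is one call sequence and each value lists the requests that followed it, so later steps can compare method latencies within a single category.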

Computer Software and Applications Conference | 2012

Haibo Mi; Huaimin Wang; Hua Cai; Yangfan Zhou; Michael R. Lyu; Zhenbang Chen

In large-scale cloud computing systems, the growing scale and complexity of component interactions pose great challenges for operators trying to understand the characteristics of system performance. Performance profiling has long been proven to be an effective approach to performance analysis; however, existing approaches do not consider two new requirements that emerge in cloud computing systems. First, the efficiency of profiling becomes a critical concern; second, visual analytics should be utilized to make profiling results more readable. To address these two issues, we present P-Tracer, an online performance profiling approach specifically tailored for large-scale cloud computing systems. P-Tracer constructs a dedicated search engine that proactively processes performance logs and generates indices for fast queries; furthermore, P-Tracer provides users with a suite of web-based interfaces to query statistical information about all kinds of services, which helps them quickly and intuitively understand system behavior. The approach has been successfully applied at Alibaba Cloud Computing Inc. to conduct online performance profiling in both production clusters and test clusters. Experience with one real-world case demonstrates that P-Tracer can effectively and efficiently help users conduct performance profiling and localize the primary causes of performance anomalies.

Frontiers of Computer Science in China | 2013

Haibo Mi; Huaimin Wang; Yangfan Zhou; Michael R. Lyu; Hua Cai; Gang Yin

The growing scale and complexity of component interactions in cloud computing systems pose great challenges for operators trying to understand the characteristics of system performance. Profiling has long been proven to be an effective approach to performance analysis; however, existing approaches confront new challenges that emerge in cloud computing systems. First, the efficiency of profiling becomes a critical concern; second, service-oriented profiling should be considered to support separation-of-concerns performance analysis. To address these issues, we present P-Tracer, an online performance profiling tool specifically tailored for cloud computing systems. First, P-Tracer constructs a dedicated search engine that proactively processes performance logs and generates an index for fast queries; second, for each service, P-Tracer derives statistical insight into performance characteristics along multiple dimensions and provides operators with a suite of web-based interfaces to query the critical information. We evaluate P-Tracer in terms of tracing overhead, data preprocessing scalability and querying efficiency. Three real-world case studies from the Alibaba cloud computing platform demonstrate that P-Tracer can help operators understand software behaviors and localize the primary causes of performance anomalies effectively and efficiently.

Science in China Series F: Information Sciences | 2012

Haibo Mi; Huaimin Wang; Yangfan Zhou; Michael R. Lyu; Hua Cai

It is hard to localize the primary cause of performance anomalies in cloud computing systems because of the complexity of interactions between components. The hidden connections among the huge number of request execution paths in such systems usually contain useful information for diagnosing performance anomalies. We propose an approach to localize anomalous invoked methods and their physical locations by leveraging request trace logs, which involves two steps: (1) cluster the requests according to their corresponding call sequences, identify anomalous requests with principal component analysis, and then pick out anomalous methods with the Mann-Whitney hypothesis test; (2) compare the behavior similarities of all replicated instances of the anomalous methods with Jensen-Shannon divergence, and select the ones whose behaviors differ from those of the others as the final culprits of performance anomalies. We conduct experiments with four real-world cases at Alibaba Cloud Computing Inc. to validate our approach. The results demonstrate that our approach can locate the prime causes of performance anomalies with low false-positive and false-negative rates.
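
The replica comparison in step (2) can be illustrated with a small sketch of Jensen-Shannon divergence. It assumes that each replicated instance is summarized by a latency histogram over the same buckets, and that the suspect instance is the one with the highest mean divergence from all others. Both the histogram representation and the selection rule are assumptions for illustration, not details taken from the paper.

```python
import math


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two discrete distributions."""
    def kl(a, b):
        # Terms with a zero probability in `a` contribute nothing.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    m = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def most_divergent_instance(histograms):
    """Return the instance whose distribution differs most, on average, from the rest.

    `histograms` maps an instance name to bucket counts; it must contain at
    least two instances, each with a positive total count.
    """
    dists = {}
    for name, counts in histograms.items():
        total = sum(counts)
        dists[name] = [c / total for c in counts]

    def mean_divergence(name):
        others = [d for other, d in dists.items() if other != name]
        return sum(js_divergence(dists[name], d) for d in others) / len(others)

    return max(dists, key=mean_divergence)
```

With base-2 logarithms the divergence lies between 0 (identical distributions) and 1 (disjoint support), which makes a fixed threshold easier to reason about than an unbounded measure such as Kullback-Leibler divergence.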

Dependable Systems and Networks | 2011

Haibo Mi; Huaimin Wang; Gang Yin; Hua Cai; Qi Zhou; Tingtao Sun

It is quite a headache for developers to detect performance problems online in large-scale cloud computing systems. The behavior of, and the hidden connections among, the huge number of runtime request execution paths in cloud computing systems usually contain useful information for performance problem detection. In this paper, we propose an approach to rapidly diagnose the source of performance degradation in large-scale non-stop cloud computing systems. The approach first groups the user requests into categories with a fast clustering algorithm, then applies principal component analysis to extract the primary methods, and finally compares the normal and abnormal behaviors of the primary methods to localize the main cause of performance problems. We conduct extensive experiments over a real-world enterprise system providing services for the public. The results show that our approach can locate the prime causes of performance problems accurately and efficiently.

IEEE International Conference on Smart City/SocialCom/SustainCom | 2015

Huining Yan; Huaimin Wang; Bo Ding; Haibo Mi; Dianxi Shi

Server consolidation is one of the critical techniques for energy efficiency in cloud data centers. As it is often assumed that cloud service instances (e.g., Amazon EC2 instances) utilize shared storage only, most existing work does not consider the problems introduced by utilizing local storage. In recent years, however, cloud service providers have been providing local storage for cloud users, e.g., Amazon EC2, Aliyun ECS and RDS, since local storage can offer a better performance with identified price. Thus, several problems might be incurred, e.g., migrating much more data and consuming much more migration time and network bandwidth. To address these problems, this paper proposes SaSercon, a storage-aware server consolidation approach that minimizes the total migrated data size (stored on local storage) by releasing the servers that hold less data. Evaluation results on production traces demonstrate that SaSercon significantly reduces the total migrated data size.

International Journal of Big Data Intelligence | 2017

Huining Yan; Yiming Zhang; Huaimin Wang; Bo Ding; Haibo Mi

Services redeployment is one of the critical techniques for energy efficiency in cloud data centres. In recent years, cloud providers have been providing local storage for cloud services, since it offers a better performance with identified price. Nevertheless, most existing work does not consider the problems introduced by utilising local storage, e.g., migrating much more data and therefore consuming much more migration time and network bandwidth. Meanwhile, because instance migration is a costly operation, the number of migrated instances must also be considered. However, the data size and the number of instances on a server often do not agree, and therefore a tradeoff should be made. To address this problem, this paper proposes S3R, a storage-sensitive services redeployment approach. S3R first builds a tradeoff model to estimate the release cost for each server, and then adopts an FFD-based heuristic algorithm to migrate/redeploy instances. Evaluation results on production traces demonstrate the effectiveness of S3R.
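
For readers unfamiliar with FFD, the sketch below shows the generic first-fit decreasing bin-packing heuristic on which such redeployment algorithms are commonly based. It is not S3R's algorithm: the single scalar capacity per server, the function name, and the input format are assumptions made for illustration, and the tradeoff model described in the abstract is not modeled here.

```python
def first_fit_decreasing(instance_sizes, server_capacity):
    """Pack instances onto identical servers with first-fit decreasing.

    `instance_sizes` maps an instance name to its resource demand.
    Returns one list of instance names per server that was opened.
    """
    servers = []  # each entry: [remaining_capacity, [instance names]]
    # Place the largest instances first; ties keep their input order.
    ordered = sorted(instance_sizes.items(), key=lambda kv: kv[1], reverse=True)
    for name, size in ordered:
        if size > server_capacity:
            raise ValueError(f"{name} does not fit on any server")
        for server in servers:
            if size <= server[0]:
                # First server with enough room wins.
                server[0] -= size
                server[1].append(name)
                break
        else:
            # No open server can host the instance, so open a new one.
            servers.append([server_capacity - size, [name]])
    return [names for _, names in servers]
```

Sorting by decreasing size before placing instances tends to leave fewer partially filled servers than placing them in arrival order, which is why the heuristic is a common starting point for consolidation problems.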
