Yexi Jiang

Florida International University

Publication

Featured research published by Yexi Jiang.

International Conference on Data Mining | 2011

ASAP: A Self-Adaptive Prediction System for Instant Cloud Resource Demand Provisioning

Yexi Jiang; Chang-shing Perng; Tao Li; Rong N. Chang

The promise of cloud computing is to provide computing resources instantly whenever they are needed. The state-of-the-art virtual machine (VM) provisioning technology can provision a VM in tens of minutes. This latency is unacceptable for jobs that need to scale out during computation. To truly enable on-the-fly scaling, new VMs need to be ready within seconds of a request. In this paper, we present an online temporal data mining system called ASAP to model and predict cloud VM demand. ASAP aims to extract high-level characteristics from the VM provisioning request stream and notify the provisioning system to prepare VMs in advance. To quantify prediction quality, we propose the Cloud Prediction Cost, which encodes the cost and constraints of the cloud and guides the training of prediction algorithms. Moreover, we utilize a two-level ensemble method to capture the characteristics of the highly transient demand time series. Experimental results using historical data from an IBM cloud in operation demonstrate that ASAP significantly improves the cloud service quality and makes on-the-fly provisioning possible.

IEEE Transactions on Network and Service Management | 2013

Cloud Analytics for Capacity Planning and Instant VM Provisioning

Yexi Jiang; Chang Shing Perng; Tao Li; Rong N. Chang

The popularity of cloud services spurs increasing demand for virtual resources from service vendors. Along with promising business opportunities, it also brings new technical challenges such as effective capacity planning and instant cloud resource provisioning. In this paper, we describe our research efforts on improving the service quality for the capacity planning and instant cloud resource provisioning problems. We first formulate both problems as a generic cost-sensitive prediction problem. Then, considering the highly dynamic environment of the cloud, we propose an asymmetric and heterogeneous measure to quantify the prediction error. Finally, we design an ensemble prediction mechanism by combining the prediction power of a set of prediction techniques based on the proposed measure. To evaluate the effectiveness of our proposed solution, we design and implement an integrated prototype system to help improve the service quality of the cloud. Our system considers many practical situations of the cloud system, and is able to dynamically adapt to the changing environment. A series of experiments on the IBM Smart Cloud Enterprise (SCE) trace data demonstrate that our method can significantly improve the service quality by reducing the resource provisioning time while maintaining a low cloud overhead.
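
The asymmetric error measure and cost-guided ensemble mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not give the actual measure, so the penalty values, the inverse-cost weighting, and all function names below are assumptions chosen for illustration only.

```python
def asymmetric_cost(predicted, actual, under_penalty=4.0, over_penalty=1.0):
    """Hypothetical cost: predicting too few VMs (customers wait) is
    penalised more heavily than predicting too many (idle resources)."""
    diff = predicted - actual
    if diff < 0:
        return under_penalty * (-diff)
    return over_penalty * diff


def ensemble_weights(accumulated_costs):
    """Assumed scheme: weight each base predictor inversely to its cost."""
    eps = 1e-9  # avoids division by zero for a predictor with zero cost
    inverse = [1.0 / (cost + eps) for cost in accumulated_costs]
    total = sum(inverse)
    return [value / total for value in inverse]


def ensemble_predict(predictions, weights):
    """Weighted combination of the base predictors' forecasts."""
    return sum(p * w for p, w in zip(predictions, weights))


# Two base predictors evaluated on one past time slot with actual demand 10.
costs = [asymmetric_cost(8, 10), asymmetric_cost(12, 10)]
weights = ensemble_weights(costs)
forecast = ensemble_predict([8, 12], weights)
```

With these assumed penalties, the predictor that over-provisioned by two VMs receives a larger weight than the one that under-provisioned by two, so the combined forecast leans toward the higher estimate.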

Conference on Recommender Systems | 2014

Ensemble contextual bandits for personalized recommendation

Liang Tang; Yexi Jiang; Lei Li; Tao Li

The cold-start problem has attracted extensive attention among various online services that provide personalized recommendation. Many online vendors employ contextual bandit strategies to tackle the so-called exploration/exploitation dilemma rooted in the cold-start problem. However, due to high-dimensional user/item features and the underlying characteristics of bandit policies, it is often difficult for service providers to obtain and deploy an appropriate algorithm to achieve acceptable and robust economic profit. In this paper, we explore ensemble strategies of contextual bandit algorithms to obtain robust predicted click-through rate (CTR) of web objects. The ensemble is acquired by aggregating different pulling policies of bandit algorithms, rather than forcing the agreement of prediction results or learning a unified predictive model. To this end, we employ a meta-bandit paradigm that places a hyper bandit over the base bandits, to explicitly explore/exploit the relative importance of base bandits based on user feedback. Extensive empirical experiments on two real-world data sets (news recommendation and online advertising) demonstrate the effectiveness of our proposed approach in terms of CTR.
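
The meta-bandit idea in this abstract, a hyper bandit choosing among base bandit policies, can be sketched as follows. The abstract does not say which strategy the hyper bandit uses; this sketch assumes Thompson sampling with Beta(1, 1) priors over click feedback, and the class and method names are hypothetical.

```python
import random


class HyperBandit:
    """Assumed hyper bandit: Thompson sampling over base policies."""

    def __init__(self, n_policies, seed=0):
        # One Beta(1, 1) prior per base policy, stored as pseudo-counts.
        self.successes = [1] * n_policies
        self.failures = [1] * n_policies
        self.rng = random.Random(seed)

    def select_policy(self):
        """Sample a plausible CTR for each base policy and pick the best."""
        samples = [
            self.rng.betavariate(a, b)
            for a, b in zip(self.successes, self.failures)
        ]
        return max(range(len(samples)), key=samples.__getitem__)

    def update(self, policy_index, clicked):
        """Credit the click (or its absence) to the policy that was used."""
        if clicked:
            self.successes[policy_index] += 1
        else:
            self.failures[policy_index] += 1


hyper = HyperBandit(n_policies=2)
for _ in range(200):
    hyper.update(0, clicked=False)  # base policy 0 never earns a click
    hyper.update(1, clicked=True)   # base policy 1 always earns a click
chosen = hyper.select_policy()
```

In a full system the selected base policy would then choose the item to recommend, and the observed click would be fed back both to that base policy and to the hyper bandit.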

ACM Computing Surveys | 2017

Data-Driven Techniques in Disaster Information Management

Tao Li; Ning Xie; Chunqiu Zeng; Wubai Zhou; Li Zheng; Yexi Jiang; Yimin Yang; Hsin-Yu Ha; Wei Xue; Yue Huang; Shu-Ching Chen; Jainendra K. Navlakha; S. Sitharama Iyengar

Improving disaster management and recovery techniques is one of the national priorities given the huge toll caused by man-made and natural calamities. Data-driven disaster management aims at applying advanced data collection and analysis technologies to achieve more effective and responsive disaster management, and has undergone considerable progress in the last decade. However, to the best of our knowledge, there is currently no work that both summarizes recent progress and suggests future directions for this emerging research area. To remedy this situation, we provide a systematic treatment of the recent developments in data-driven disaster management. Specifically, we first present a general overview of the requirements and system architectures of disaster management systems and then summarize state-of-the-art data-driven techniques that have been applied to improving situation awareness as well as to addressing users’ information needs in disaster management. We also discuss and categorize general data-mining and machine-learning techniques in disaster management. Finally, we recommend several research directions for further investigations.

Knowledge Discovery and Data Mining | 2013

FIU-Miner: a fast, integrated, and user-friendly system for data mining in distributed environment

Chunqiu Zeng; Yexi Jiang; Li Zheng; Jingxuan Li; Lei Li; Hongtai Li; Chao Shen; Wubai Zhou; Tao Li; Bing Duan; Ming Lei; Pengnian Wang

The advent of the Big Data era drives data analysts from different domains to use data mining techniques for data analysis. However, performing data analysis in a specific domain is not trivial; it often requires complex task configuration, onerous integration of algorithms, and efficient execution in distributed environments. Little effort has been devoted to developing effective tools that help data analysts conduct complex data analysis tasks. In this paper, we design and implement FIU-Miner, a Fast, Integrated, and User-friendly system to ease data analysis. FIU-Miner allows users to rapidly configure a complex data analysis task without writing a single line of code. It also helps users conveniently import and integrate different analysis programs. Further, it significantly balances resource utilization and task execution in heterogeneous environments. A case study of a real-world application demonstrates the efficacy and effectiveness of our proposed system.

IEEE Transactions on Knowledge and Data Engineering | 2014

Dynamic Query Forms for Database Queries

Liang Tang; Tao Li; Yexi Jiang; Zhiyuan Chen

Modern scientific databases and web databases maintain large and heterogeneous data. These real-world databases contain hundreds or even thousands of relations and attributes. Traditional predefined query forms are not able to satisfy various ad-hoc queries from users on those databases. This paper proposes DQF, a novel database query form interface, which is able to dynamically generate query forms. The essence of DQF is to capture a user's preference and rank query form components, assisting him/her in making decisions. The generation of a query form is an iterative process and is guided by the user. At each iteration, the system automatically generates ranking lists of form components and the user then adds the desired form components into the query form. The ranking of form components is based on the captured user preference. A user can also fill the query form and submit queries to view the query result at each iteration. In this way, a query form could be dynamically refined until the user is satisfied with the query results. We utilize the expected F-measure for measuring the goodness of a query form. A probabilistic model is developed for estimating the goodness of a query form in DQF. Our experimental evaluation and user study demonstrate the effectiveness and efficiency of the system.

IEEE International Conference on Services Computing | 2012

Self-Adaptive Cloud Capacity Planning

Yexi Jiang; Chang-shing Perng; Tao Li; Rong N. Chang

The popularity of cloud services spurs increasing demand for cloud resources from cloud service providers. Along with the new business opportunities, the pay-as-you-go model drastically changes the usage pattern and brings technology challenges to effective capacity planning. In this paper, we propose a new method for cloud capacity planning with the goal of fully utilizing the physical resources, as we believe this is one of the emerging problems for cloud providers. To solve this problem, we present an integrated system with intelligent cloud capacity prediction. Considering the unique characteristic of the cloud service that virtual machines are provisioned and de-provisioned frequently to meet business needs, we propose an asymmetric and heterogeneous measure for modeling the over-estimation and under-estimation of the capacity. To accurately forecast the capacity, we first divide the change of cloud capacity demand into provisioning and de-provisioning components, and then estimate each component separately. The future provisioning demand is predicted by an ensemble time-series prediction method, while the future de-provisioning is inferred based on the life span distribution and the number of active virtual machines. Our proposed solution is simple and computationally efficient, which makes it practical for development and deployment. Our solution also has the advantage of generating interpretable predictions. The experimental results on the IBM Smart Cloud Enterprise trace data demonstrate the effectiveness, accuracy and efficiency of our solution.
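
The split of capacity change into provisioning and de-provisioning components can be illustrated with a small sketch. The abstract does not specify how de-provisioning is inferred from the life span distribution, so the empirical conditional-probability estimate below, the discrete time units, and all names are assumptions.

```python
def termination_probability(lifespans, age):
    """Empirical chance that a VM of the given age ends in the next time
    unit, given that it has already survived `age` units."""
    survivors = [span for span in lifespans if span > age]
    if not survivors:
        # Assumption: a VM older than every observed life span is
        # treated as certain to end in the next unit.
        return 1.0
    ending = sum(1 for span in survivors if span <= age + 1)
    return ending / len(survivors)


def expected_deprovisions(lifespans, active_ages):
    """Expected number of active VMs released in the next time unit."""
    return sum(termination_probability(lifespans, age) for age in active_ages)


def net_demand_change(predicted_provisions, lifespans, active_ages):
    """Capacity change = predicted provisioning minus expected release."""
    return predicted_provisions - expected_deprovisions(lifespans, active_ages)


# Observed life spans of past VMs (in time units) and ages of active VMs.
history = [1, 2, 2, 3, 5]
active = [0, 1]
change = net_demand_change(3, history, active)
```

Here `predicted_provisions` stands in for the output of whatever time-series predictor forecasts new requests; the sketch only covers the de-provisioning side and the final subtraction.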

Conference on Information and Knowledge Management | 2011

Natural event summarization

Yexi Jiang; Chang-shing Perng; Tao Li

Event mining is a useful way to understand computer system behaviors. The focus of recent work on event mining has shifted from discovering frequent patterns to event summarization. Event summarization seeks to provide a comprehensible explanation of the event sequence on certain aspects. Previous methods have several limitations such as ignoring temporal information, generating the same set of boundaries for all event patterns, and providing a summary that is difficult for humans to understand. In this paper, we propose a novel framework called natural event summarization that summarizes an event sequence using inter-arrival histograms to capture the temporal relationship among events. Our framework uses the minimum description length principle to guide the process in order to balance between accuracy and brevity. Also, we use multi-resolution analysis for pruning the problem space. We demonstrate how the principles can be applied to generate summaries with periodic patterns and correlation patterns in the framework. Experimental results on synthetic and real data show that our method is capable of producing usable event summaries, is robust to noise, and is scalable.
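
An inter-arrival histogram, the basic representation this abstract refers to, can be sketched in a few lines. This covers only the histogram for a single event type; the minimum description length step and multi-resolution pruning are not shown. The `min_share` threshold and the function names are hypothetical.

```python
from collections import Counter


def inter_arrival_histogram(timestamps):
    """Count the gaps between consecutive occurrences of one event type."""
    ordered = sorted(timestamps)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    return Counter(gaps)


def dominant_period(histogram, min_share=0.6):
    """Return the most common gap if it accounts for at least `min_share`
    of all gaps (a hint of a periodic pattern), otherwise None."""
    total = sum(histogram.values())
    if total == 0:
        return None
    gap, count = histogram.most_common(1)[0]
    return gap if count / total >= min_share else None


# An event that fires roughly every 5 time units, with one late arrival.
histogram = inter_arrival_histogram([0, 5, 10, 15, 21])
period = dominant_period(histogram)
```

A histogram concentrated in one bin suggests a periodic pattern; the same construction applied to gaps between two different event types would hint at a correlation pattern.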

International ACM SIGIR Conference on Research and Development in Information Retrieval | 2015

Personalized Recommendation via Parameter-Free Contextual Bandits

Liang Tang; Yexi Jiang; Lei Li; Chunqiu Zeng; Tao Li

Personalized recommendation services have gained increasing popularity and attention in recent years as most useful information can be accessed online in real-time. Most online recommender systems try to address the information needs of users by virtue of both user and content information. Despite extensive recent advances, the problem of personalized recommendation remains challenging for at least two reasons. First, the user and item repositories undergo frequent changes, which makes traditional recommendation algorithms ineffective. Second, the so-called cold-start problem is difficult to address, as the information for learning a recommendation model is limited for new items or new users. Both challenges arise from the dilemma of exploration and exploitation. In this paper, we formulate personalized recommendation as a contextual bandit problem to solve the exploration/exploitation dilemma. Specifically, we propose a parameter-free bandit strategy, which employs a principled resampling approach called online bootstrap, to derive the distribution of estimated models in an online manner. Under the paradigm of probability matching, the proposed algorithm randomly samples a model from the derived distribution for every recommendation. Extensive empirical experiments on two real-world collections of web data (including online advertising and news recommendation) demonstrate the effectiveness of the proposed algorithm in terms of the click-through rate. The experimental results also show that the proposed algorithm is robust in the cold-start situation, in which there is insufficient data or knowledge to tune the hyper-parameters.
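
The combination of online bootstrap and probability matching described in this abstract can be illustrated with a simplified sketch. It drops the contextual features and keeps only a per-arm click-rate estimate, and it uses the common Poisson(1) reweighting form of online bootstrap; the replica count, the pseudo-observation prior, and all names are assumptions rather than the paper's algorithm.

```python
import math
import random


def poisson1(rng):
    """Draw from Poisson(1) using Knuth's multiplication method."""
    threshold = math.exp(-1.0)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


class BootstrapBandit:
    """Non-contextual simplification: each replica is one bootstrap
    estimate of every arm's click-through rate."""

    def __init__(self, n_arms, n_replicas=20, seed=0):
        self.rng = random.Random(seed)
        # [clicks, impressions] per arm; the starting pseudo-observation
        # (an assumption) keeps the ratio defined before any data arrives.
        self.stats = [
            [[1.0, 2.0] for _ in range(n_arms)] for _ in range(n_replicas)
        ]

    def select_arm(self):
        """Probability matching: sample one replica, act greedily on it."""
        replica = self.rng.choice(self.stats)
        rates = [clicks / shown for clicks, shown in replica]
        return max(range(len(rates)), key=rates.__getitem__)

    def update(self, arm, clicked):
        """Online bootstrap: each replica sees the observation a
        Poisson(1)-distributed number of times."""
        for replica in self.stats:
            weight = poisson1(self.rng)
            replica[arm][0] += weight * (1.0 if clicked else 0.0)
            replica[arm][1] += weight


bandit = BootstrapBandit(n_arms=2, seed=1)
for _ in range(200):
    bandit.update(0, clicked=False)
    bandit.update(1, clicked=True)
best = bandit.select_arm()
```

Early on the replicas disagree, so sampling one of them at random produces exploration; as data accumulates they converge and the choice becomes mostly exploitative, without a separately tuned exploration rate.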

Knowledge Discovery and Data Mining | 2014

Applying data mining techniques to address critical process optimization needs in advanced manufacturing

Li Zheng; Chunqiu Zeng; Lei Li; Yexi Jiang; Wei Xue; Jingxuan Li; Chao Shen; Wubai Zhou; Hongtai Li; Liang Tang; Tao Li; Bing Duan; Ming Lei; Pengnian Wang

Advanced manufacturing such as aerospace, semiconductor, and flat display device manufacturing often involves complex production processes and generates a large volume of production data. In general, the production data comes from products with different levels of quality, assembly lines with complex flows and equipment, and processing craft with massive controlling parameters. The scale and complexity of the data are beyond the analytic power of traditional IT infrastructures. To achieve better manufacturing performance, it is imperative to explore the underlying dependencies of the production data and exploit analytic insights to improve the production process. However, few research and industrial efforts have been reported on providing manufacturers with integrated data analytical solutions to reveal potentials and optimize the production process from data-driven perspectives. In this paper, we design, implement and deploy an integrated solution, named PDP-Miner, which is a data analytics platform customized for process optimization in Plasma Display Panel (PDP) manufacturing. The system utilizes the latest advances in data mining technologies and Big Data infrastructures to create a complete analytical solution. In addition, our proposed system supports automatic configuration and scheduling of analysis tasks and balances heterogeneous computing resources. The system and the analytic strategies can be applied to other advanced manufacturing fields to enable complex data analysis tasks. Since 2013, PDP-Miner has been deployed as the data analysis platform of ChangHong COC. By taking advantage of our system, the overall PDP yield rate has increased from 91% to 94%. The monthly production is boosted by 10,000 panels, which brings more than 117 million RMB of revenue improvement per year.
