Network


Latest external collaboration at the country level. Dive into details by clicking on the dots.

Hotspot


Dive into the research topics where Yihua Huang is active.

Publication


Featured research published by Yihua Huang.


Very Large Data Bases | 2001

An Internet-based negotiation server for e-commerce

Stanley Y. W. Su; C.-Y. F. Huang; Joachim Hammer; Yihua Huang; Haifei Li; Liu Wang; Youzhong Liu; Charnyote Pluempitiwiriyawej; Minsoo Lee; Herman Lam

This paper describes the design and implementation of a replicable, Internet-based negotiation server for conducting bargaining-type negotiations between enterprises involved in e-commerce and e-business. Enterprises can be buyers and sellers of products/services or participants in a complex supply chain engaged in purchasing, planning, and scheduling. Multiple copies of our server can be installed to complement the services of Web servers. Each enterprise can install or select a trusted negotiation server to represent its interests. Web-based GUI tools are used during the build-time registration process to specify the requirements, constraints, and rules that represent negotiation policies and strategies, preference scoring of different data conditions, and aggregation methods for deriving a global cost-benefit score for the item(s) under negotiation. The registration information is used by the negotiation servers to automatically conduct bargaining-type negotiations on behalf of their clients. In this paper, we present the architecture of our implementation as well as a framework for automated negotiations, and describe a number of communication primitives used in the underlying negotiation protocol. A constraint satisfaction processor (CSP) is used to evaluate a negotiation proposal or counterproposal against the registered requirements and constraints of a client company. In case of a constraint violation, an event is posted to trigger the execution of negotiation strategic rules, which automatically relax the violated constraint, ask for human intervention, invoke an application, or perform other remedial operations. An Event-Trigger-Rule (ETR) server is used to manage events, triggers, and rules. Negotiation strategic rules can be added or modified at run-time. A cost-benefit analysis component is used to perform quantitative analysis of alternatives. The use of negotiation servers to conduct automated negotiation has been demonstrated in the context of an integrated supply chain scenario.
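The CSP-plus-ETR interplay described in the abstract can be pictured with a small sketch. The Python fragment below is a minimal illustration, not the paper's implementation; all names (Constraint, evaluate_proposal, the price rule) are hypothetical. A proposal is checked against registered constraints, and each violation posts an event that fires the corresponding strategy rule.

```python
# Hypothetical sketch of the CSP + ETR pattern: constraint checks post
# events; registered strategy rules react (e.g., with a counter-offer).
from dataclasses import dataclass
from typing import Callable

@dataclass
class Constraint:
    attribute: str
    check: Callable[[float], bool]   # returns True when satisfied

def evaluate_proposal(proposal: dict, constraints: list,
                      rules: dict) -> bool:
    """Evaluate a proposal; on each violation, trigger the matching rule."""
    ok = True
    for c in constraints:
        if not c.check(proposal[c.attribute]):
            ok = False
            # post an event keyed by the violated attribute
            rules.get(c.attribute, lambda p: None)(proposal)
    return ok

# Example: a buyer's price ceiling with an automatic relaxation rule.
constraints = [Constraint("price", lambda v: v <= 100.0)]
rules = {"price": lambda p: print(f"counter-offer at {0.95 * p['price']:.2f}")}
evaluate_proposal({"price": 120.0}, constraints, rules)
```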


Journal of Autoimmune Diseases | 2005

Lack of correlation between the levels of soluble cytotoxic T-lymphocyte associated antigen-4 (CTLA-4) and the CT-60 genotypes.

Sharad Purohit; Robert H. Podolsky; Christin D. Collins; Weipeng Zheng; Desmond A. Schatz; Andrew Muir; Diane Hopkins; Yihua Huang; Jin Xiong She

Background: Cytotoxic T lymphocyte-associated antigen-4 (CTLA-4) plays a critical role in the downregulation of antigen-activated immune responses, and polymorphisms at the CTLA-4 gene have been shown to be associated with several autoimmune diseases including type-1 diabetes (T1D). The etiological mutation was mapped to the CT60-A/G single nucleotide polymorphism (SNP), which is believed to control the processing and production of soluble CTLA-4 (sCTLA-4). Methods: We therefore determined sCTLA-4 protein levels in the sera from 82 T1D patients, 19 autoantibody positive (AbP) subjects, and 117 autoantibody negative (AbN) controls using ELISA. The CT-60 SNP was genotyped for these samples by PCR and restriction enzyme digestion of a 268 bp DNA segment containing the SNP. Genotyping of the CT-60 SNP was confirmed by dye terminating sequencing reaction. Results: Higher levels of sCTLA-4 were observed in T1D (mean = 2.24 ng/ml) and AbP (mean = 2.17 ng/ml) subjects compared to AbN controls (mean = 1.69 ng/ml), with the differences between these subjects becoming significant with age (p = 0.02). However, we found no correlation between sCTLA-4 levels and the CTLA-4 CT-60 SNP genotypes. Conclusion: Consistent with the higher serum sCTLA-4 levels observed in other autoimmune diseases, our results suggest that sCTLA-4 may be a risk factor for T1D. However, our results do not support the conclusion that the CT-60 SNP controls the expression of sCTLA-4.


Journal of Parallel and Distributed Computing | 2014

SHadoop: Improving MapReduce performance by optimizing job execution mechanism in Hadoop clusters

Rong Gu; Xiaoliang Yang; Jinshuang Yan; Yuanhao Sun; Bing Wang; Chunfeng Yuan; Yihua Huang

As a widely-used parallel computing framework for big data processing today, the Hadoop MapReduce framework puts more emphasis on high throughput of data than on low latency of job execution. However, more and more big data applications developed with MapReduce require quick response times. As a result, improving the performance of MapReduce jobs, especially short jobs, is of great practical significance and has attracted increasing attention from both academia and industry. Many efforts have been made to improve the performance of Hadoop at the job scheduling or job parameter optimization level. In this paper, we explore an approach to improve the performance of the Hadoop MapReduce framework by optimizing the job and task execution mechanism. First, by analyzing the job and task execution mechanism in the MapReduce framework, we reveal two critical limitations on job execution performance. Then we propose two major optimizations to the MapReduce job and task execution mechanisms: first, we optimize the setup and cleanup tasks of a MapReduce job to reduce the time cost of the initialization and termination stages of the job; second, instead of adopting the loose heartbeat-based communication mechanism to transmit all messages between the JobTracker and TaskTrackers, we introduce an instant messaging communication mechanism to accelerate performance-sensitive task scheduling and execution. Finally, we implement SHadoop, an optimized and fully compatible version of Hadoop that aims at shortening the execution time of MapReduce jobs, especially short jobs. Experimental results show that, compared to the standard Hadoop, SHadoop achieves a stable performance improvement of around 25% on average on comprehensive benchmarks without losing scalability or speedup. Our optimization work has passed a production-level test at Intel and has been integrated into the Intel Distributed Hadoop (IDH). To the best of our knowledge, this work is the first effort to optimize the execution mechanism inside the map/reduce tasks of a job. The advantage is that it can complement job scheduling optimizations to further improve job execution performance.
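As a rough illustration of the communication change, the sketch below contrasts the two mechanisms. It is a toy Python simulation with hypothetical names, not Hadoop or SHadoop code; the heartbeat interval is only assumed to be on the order of seconds.

```python
# Heartbeat polling vs. push-style instant messaging (toy simulation).
import queue
import threading
import time

HEARTBEAT_INTERVAL = 3.0   # assumed; Hadoop's default is of this order

def polling_delay(finish_offset: float) -> float:
    """Extra wait before the next heartbeat reports a finished task."""
    return HEARTBEAT_INTERVAL - (finish_offset % HEARTBEAT_INTERVAL)

# Push model: the worker notifies the scheduler the instant the task ends.
events: "queue.Queue[str]" = queue.Queue()

def worker(task_id: str) -> None:
    time.sleep(0.1)                    # simulated task work
    events.put(f"{task_id} finished")  # instant notification, no polling

threading.Thread(target=worker, args=("task-1",)).start()
print(events.get())                                    # reacts immediately
print(f"heartbeat polling would add up to {polling_delay(0.1):.1f}s")
```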


International Parallel and Distributed Processing Symposium | 2014

YAFIM: A Parallel Frequent Itemset Mining Algorithm with Spark

Hongjian Qiu; Rong Gu; Chunfeng Yuan; Yihua Huang

Frequent itemset mining (FIM) is one of the most important techniques for extracting knowledge from data in many real-world applications. The Apriori algorithm is the most widely used algorithm for mining frequent itemsets from a transactional dataset. However, the FIM process is both data-intensive and computing-intensive: on one hand, large scale datasets are now common in data mining; on the other hand, to generate valid information, the algorithm must scan the dataset iteratively many times. These factors make FIM very time-consuming over big data. Parallel and distributed computing is an effective and widely used strategy for speeding up algorithms on large scale datasets. However, the existing parallel Apriori algorithms implemented with the MapReduce model are not efficient enough for iterative computation. In this paper, we propose YAFIM (Yet Another Frequent Itemset Mining), a parallel Apriori algorithm based on the Spark RDD framework, an in-memory parallel computing model specially designed to support iterative algorithms and interactive data mining. Experimental results show that, compared with algorithms implemented with MapReduce, YAFIM achieves an 18× speedup on average across various benchmarks. In particular, we applied YAFIM in a real-world medical application to explore relationships among medicines, where it outperforms the MapReduce method by around 25×.
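One Apriori pass over a cached RDD can be sketched in a few lines of PySpark. This is a minimal illustration of the pattern, not YAFIM's actual code; the tiny dataset and the direct pair enumeration are stand-ins (real Apriori generates candidate k-itemsets from the frequent (k-1)-itemsets).

```python
# One candidate-counting pass of Apriori on a cached transaction RDD.
from itertools import combinations
from pyspark import SparkContext

sc = SparkContext(appName="apriori-sketch")
transactions = sc.parallelize([
    {"a", "b", "c"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"},
]).cache()                                    # kept in memory across passes

min_support = 2
k = 2
# Candidates would normally come from the frequent (k-1)-itemsets;
# here we enumerate pairs directly for brevity.
candidates = sc.broadcast([frozenset(c) for c in combinations("abc", k)])

frequent_k = (transactions
              .flatMap(lambda t: [c for c in candidates.value if c <= t])
              .map(lambda c: (c, 1))
              .reduceByKey(lambda x, y: x + y)       # support counting
              .filter(lambda kv: kv[1] >= min_support))
print(frequent_k.collect())
sc.stop()
```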


PLOS ONE | 2010

Genetically Dependent ERBB3 Expression Modulates Antigen Presenting Cell Function and Type 1 Diabetes Risk

Hongjie Wang; Yulan Jin; M. V. Prasad Linga Reddy; Robert H. Podolsky; Siyang Liu; Ping Yang; Bruce W. Bode; John C. Reed; R. Dennis Steed; Stephen W. Anderson; Leigh Steed; Diane Hopkins; Yihua Huang; Jin Xiong She

Type 1 diabetes (T1D) is an autoimmune disease resulting from the complex interaction between multiple susceptibility genes, environmental factors and the immune system. Over 40 T1D susceptibility regions have been suggested by recent genome-wide association studies; however, the specific genes and their role in the disease remain elusive. The objective of this study is to identify the susceptibility gene(s) in the 12q13 region and investigate the functional link to the disease pathogenesis. A total of 19 SNPs in the 12q13 region were analyzed by the TaqMan assay for 1,434 T1D patients and 1,865 controls. Thirteen of the SNPs are associated with T1D (best p = 4×10⁻¹¹), thus providing confirmatory evidence for at least one susceptibility gene in this region. To identify candidate genes, expression of six genes in the region was analyzed by real-time RT-PCR for PBMCs from 192 T1D patients and 192 controls. SNP genotypes in the 12q13 region are the main factors that determine ERBB3 mRNA levels in PBMCs. The protective genotypes for T1D are associated with higher ERBB3 mRNA level (p < 10⁻¹⁰). Furthermore, ERBB3 protein is expressed on the surface of CD11c+ cells (dendritic cells and monocytes) in peripheral blood after stimulation with LPS, polyI:C or CpG. Subjects with protective genotypes have significantly higher percentages of ERBB3+ monocytes and dendritic cells (p = 1.1×10⁻⁹); and the percentages of ERBB3+ cells positively correlate with the ability of APC to stimulate T cell proliferation (R² = 0.90, p < 0.0001). Our results indicate that ERBB3 plays a critical role in determining APC function and potentially T1D pathogenesis.


International Conference on Cloud and Green Computing | 2012

Performance Optimization for Short MapReduce Job Execution in Hadoop

Jinshuang Yan; Xiaoliang Yang; Rong Gu; Chunfeng Yuan; Yihua Huang

Hadoop MapReduce is a widely used parallel computing framework for solving data-intensive problems. To process large-scale datasets, the fundamental design of standard Hadoop places more emphasis on high throughput of data than on job execution performance. This causes a performance limitation when Hadoop MapReduce is used to execute short jobs that require quick responses. To speed up the execution of short jobs, this paper proposes optimization methods that improve the execution performance of MapReduce jobs. We made three major optimizations: first, we reduce the time cost of the initialization and termination stages of a job by optimizing its setup and cleanup tasks; second, we replace the pull-model task assignment mechanism with a push model; third, we replace the heartbeat-based communication mechanism with an instant messaging communication mechanism for event notifications between the JobTracker and TaskTrackers. Experimental results show that the job execution performance of our improved version of Hadoop is about 23% faster on average than standard Hadoop for our test application.
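The difference between the pull and push task assignment models mentioned above can be sketched as follows. This is a toy illustration with hypothetical names, not Hadoop code: under pull, assignment happens only when a heartbeat arrives; under push, the scheduler dispatches as soon as it knows a slot is free.

```python
# Pull-model vs. push-model task assignment (toy illustration).
import queue
from typing import Optional

pending_tasks: "queue.Queue[str]" = queue.Queue()
for t in ["map-0", "map-1", "reduce-0"]:
    pending_tasks.put(t)

def pull_assign(has_free_slot: bool) -> Optional[str]:
    """Pull model: runs only when a worker's heartbeat arrives."""
    if has_free_slot and not pending_tasks.empty():
        return pending_tasks.get()
    return None

def push_assign(free_workers: list) -> list:
    """Push model: the scheduler dispatches the moment slots are known free."""
    assignments = []
    while free_workers and not pending_tasks.empty():
        assignments.append((free_workers.pop(), pending_tasks.get()))
    return assignments

print(push_assign(["tracker-1", "tracker-2"]))  # no heartbeat wait
```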


International Conference on Big Data | 2013

A parallel computing platform for training large scale neural networks

Rong Gu; Furao Shen; Yihua Huang

Artificial neural networks (ANNs) have been successfully applied in a variety of pattern recognition and data mining applications. However, training ANNs on large scale datasets is both data-intensive and computation-intensive, so large scale ANNs are often used with reservation because of the time-consuming training required to reach high precision. In this paper, we present cNeural, a customized parallel computing platform that accelerates training large scale neural networks with the backpropagation algorithm. Unlike many existing parallel neural network training systems that work on thousands of training samples, cNeural is designed for fast training on large scale datasets with millions of training samples. To achieve this goal, cNeural first adopts HBase for large scale training dataset storage and parallel loading. Second, it provides a parallel in-memory computing framework for fast iterative training. Third, we choose a compact, event-driven messaging communication model instead of a heartbeat polling model for instant message delivery. Experimental results show that the overhead of data loading and messaging communication in cNeural is very low and that cNeural is around 50 times faster than a solution based on Hadoop MapReduce. It also achieves nearly linear scalability and excellent load balancing.
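The iterative, partition-parallel training pattern that cNeural implements can be sketched with a simple stand-in model. The Python fragment below is only an illustration of the pattern, not cNeural's code: a least-squares model replaces the neural network, and worker shards are simulated with index splits; in the real system the shards live in memory on separate nodes and gradients travel over the messaging layer.

```python
# Data-parallel iterative training: per-shard gradients, aggregated update.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 10))                  # training inputs
true_w = rng.normal(size=10)
y = X @ true_w + 0.1 * rng.normal(size=1000)     # noisy targets
w = np.zeros(10)                                 # shared model weights

def shard_gradient(X_shard, y_shard, w):
    """Least-squares gradient computed on one worker's in-memory shard."""
    return 2.0 * X_shard.T @ (X_shard @ w - y_shard) / len(y_shard)

shards = np.array_split(np.arange(len(y)), 4)    # 4 simulated workers
for _ in range(200):                             # iterative training loop
    grads = [shard_gradient(X[idx], y[idx], w) for idx in shards]
    w -= 0.05 * np.mean(grads, axis=0)           # aggregate, then update
print("max weight error:", np.max(np.abs(w - true_w)))
```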


International Conference on Data Engineering | 2000

The IDEAL approach to Internet-based negotiation for e-business

Joachim Hammer; C.-Y. F. Huang; Yihua Huang; Charnyote Pluempitiwiriyawej; Minsoo Lee; Haifei Li; Liu Wang; Youzhong Liu; Stanley Y. W. Su

With the emergence of e-business as the next killer application for the Web, automating bargaining-type negotiations between clients (i.e., buyers and sellers) has become increasingly important. With IDEAL (Internet-based Dealmaker for e-business), we have developed an architecture and framework, including a negotiation protocol, for automated negotiations among multiple IDEAL servers. The main components of IDEAL are a constraint satisfaction processor (CSP) to evaluate a proposal, an Event-Trigger-Rule (ETR) server for managing and triggering the execution of the rules that make up the negotiation strategy (rules can be updated at run-time to deal with the dynamic nature of negotiations), and a cost-benefit analysis component to help select among alternative strategies. We have implemented a fully functional prototype system of IDEAL to demonstrate automated negotiations among buyers and suppliers participating in a supply chain.
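The run-time rule update that the ETR server enables can be illustrated with a small sketch. The names below are hypothetical, not IDEAL's API: because the strategy table is consulted on every event, swapping a rule between negotiation rounds changes behavior without restarting the server.

```python
# Run-time replacement of a negotiation strategy rule (hypothetical names).
strategy_rules = {
    "price_violation": lambda offer: {"price": offer["price"] * 0.97},
}

def on_event(event: str, offer: dict) -> dict:
    """Dispatch an event to whatever rule is currently registered."""
    return strategy_rules[event](offer)

print(on_event("price_violation", {"price": 100.0}))   # concede 3%
# Mid-negotiation, tighten the strategy: concede only 1% from now on.
strategy_rules["price_violation"] = lambda o: {"price": o["price"] * 0.99}
print(on_event("price_violation", {"price": 100.0}))
```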


International Parallel and Distributed Processing Symposium | 2014

Training Large Scale Deep Neural Networks on the Intel Xeon Phi Many-Core Coprocessor

Lei Jin; Zhaokang Wang; Rong Gu; Chunfeng Yuan; Yihua Huang

As a new area of machine learning research, deep learning has attracted a lot of attention from the research community, and it may bring our understanding of data to a higher cognitive level. Its unsupervised pre-training step allows us to find high-dimensional representations or abstract features that work much better than those produced by the principal component analysis (PCA) method. However, deep learning faces problems when applied to large scale data because of the intensive computation required by its many levels of training. Sequential deep learning algorithms usually cannot finish the computation in an acceptable time. In this paper, we propose a many-core algorithm, based on a parallel method and running on Intel Xeon Phi many-core systems, to speed up the unsupervised training of the Sparse Autoencoder and the Restricted Boltzmann Machine (RBM). Using the sequential training algorithm as a baseline, we adopted several optimization methods to parallelize the algorithm. The experimental results show that our fully optimized algorithm gains a more than 300-fold speedup for the parallelized Sparse Autoencoder on the Intel Xeon Phi coprocessor compared with the original sequential algorithm. We also ran the fully optimized code on both the Intel Xeon Phi coprocessor and an expensive Intel Xeon CPU; for this application, the Xeon Phi version is 7 to 10 times faster. In addition, we compared our fully optimized code on the Intel Xeon Phi with a Matlab implementation running on a single Intel Xeon CPU: the Xeon Phi version runs 16 times faster. These results suggest that, compared to a GPU, the Intel Xeon Phi offers an efficient yet more general-purpose way to parallelize deep learning algorithms, and that it achieves faster speed with better parallelism than the Intel Xeon CPU.
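The computational kernel being parallelized, one contrastive-divergence (CD-1) update of an RBM, can be written compactly in NumPy. The sketch below shows only the math in sequential form, with biases omitted for brevity; it is not the paper's many-core code, whose matrix products are what the Xeon Phi spreads across cores and vector units.

```python
# One CD-1 weight update for an RBM on a mini-batch (biases omitted).
import numpy as np

rng = np.random.default_rng(0)
n_visible, n_hidden, batch = 784, 256, 64
W = 0.01 * rng.normal(size=(n_visible, n_hidden))
v0 = (rng.random((batch, n_visible)) < 0.5).astype(float)  # mini-batch

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

h0 = sigmoid(v0 @ W)                          # positive phase
h_sample = (rng.random(h0.shape) < h0).astype(float)
v1 = sigmoid(h_sample @ W.T)                  # reconstruction
h1 = sigmoid(v1 @ W)                          # negative phase
W += 0.1 * (v0.T @ h0 - v1.T @ h1) / batch    # CD-1 gradient step
print(f"reconstruction error: {np.mean((v0 - v1) ** 2):.4f}")
```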


International Parallel and Distributed Processing Symposium | 2015

Cichlid: Efficient Large Scale RDFS/OWL Reasoning with Spark

Rong Gu; Shanyong Wang; Fangfang Wang; Chunfeng Yuan; Yihua Huang

In the era of big data, the volume of semantic data grows rapidly, and this large scale semantic data contains a lot of significant but often implicit information that must be derived by reasoning. Semantic data reasoning is a challenging process. On one hand, traditional single-node reasoning systems can hardly cope with such a large amount of data due to resource limitations. On the other hand, existing large scale reasoning systems are not very efficient or scalable due to the complexity of the reasoning process. In this paper, we propose Cichlid, an efficient distributed reasoning engine for the widely used RDFS and OWL Horst rule sets. Cichlid is built on top of Spark and implements parallel reasoning algorithms with the Spark RDD programming model. We optimized the parallel RDFS reasoning algorithm in three respects: the data partition model, the execution order of reasoning rules, and the removal of duplicate data. For the parallel OWL reasoning process, we optimized its most time-consuming parts, including large-scale data joins, transitive closure computation, and equivalence relation computation. In addition to these optimizations at the reasoning algorithm level, we also optimized the inner Spark execution mechanism by proposing an off-heap memory storage mechanism for RDDs. This system-level optimization patch has been accepted and integrated into Apache Spark 1.0. Experimental results show that Cichlid is around 10 times faster on average than state-of-the-art distributed reasoning systems on both large scale synthetic and real-world benchmarks. The proposed reasoning algorithms and engine also achieve excellent scalability and fault tolerance.
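A single RDFS rule makes the join-based reasoning pattern concrete. The PySpark sketch below is a minimal illustration of rule rdfs9, not Cichlid's code, and the toy triples are hypothetical: if (s rdf:type C) and (C rdfs:subClassOf D) hold, the rule derives (s rdf:type D). Joins of this kind, applied until no new triples appear, are the large-scale operations the paper optimizes.

```python
# RDFS rule rdfs9 as an RDD join: propagate types up the class hierarchy.
from pyspark import SparkContext

sc = SparkContext(appName="rdfs9-sketch")
types = sc.parallelize([("alice", "Student"), ("bob", "Professor")])
sub_class = sc.parallelize([("Student", "Person"), ("Professor", "Person")])

derived = (types.map(lambda st: (st[1], st[0]))       # key by class C
                .join(sub_class)                      # (C, (s, D))
                .map(lambda kv: (kv[1][0], kv[1][1])) # derive (s, D)
                .distinct())                          # drop duplicates
print(derived.collect())   # [('alice', 'Person'), ('bob', 'Person')]
sc.stop()
```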

Collaboration


Dive into Yihua Huang's collaboration.
