Chaochao Chen | Researchain

Publication

Featured research published by Chaochao Chen.

knowledge discovery and data mining | 2017

KunPeng: Parameter Server based Distributed Learning Systems and Its Applications in Alibaba and Ant Financial

Jun Zhou; Xiaolong Li; Peilin Zhao; Chaochao Chen; Longfei Li; Xinxing Yang; Qing Cui; Jin Yu; Xu Chen; Yi Ding; Yuan Alan Qi

In recent years, due to the emergence of Big Data (terabytes or petabytes) and Big Model (tens of billions of parameters), there has been an ever-increasing need for parallelizing machine learning (ML) algorithms in both academia and industry. Although there are some existing distributed computing systems, such as Hadoop and Spark, for parallelizing ML algorithms, they only provide synchronous and coarse-grained operators (e.g., Map, Reduce, and Join), which may hinder developers from implementing more efficient algorithms. This motivated us to design a universal distributed platform termed KunPeng, which combines both distributed systems and parallel optimization algorithms to deal with the complexities that arise from large-scale ML. Specifically, KunPeng not only encapsulates the characteristics of data/model parallelism, load balancing, model sync-up, sparse representation, industrial fault-tolerance, etc., but also provides an easy-to-use interface that lets users focus on the core ML logic. Empirical results on terabytes of real datasets with billions of samples and features demonstrate that such a design brings compelling performance improvements on ML programs ranging from the Follow-the-Regularized-Leader Proximal algorithm to Sparse Logistic Regression and Multiple Additive Regression Trees. Furthermore, KunPeng's encouraging performance is also shown for several real-world applications, including Alibaba's Double 11 Online Shopping Festival and Ant Financial's transaction risk estimation.
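The parameter server pattern named in the abstract above can be illustrated with a minimal single-process sketch. It is not KunPeng's interface: the `ParameterServer` class, its `pull`/`push` methods, and `worker_step` are hypothetical names, the update rule is plain SGD rather than the FTRL-Proximal algorithm the paper evaluates, and there is no real distribution or fault tolerance. It only shows a worker pulling the sparse weights it needs, computing a logistic regression gradient, and pushing it back.

```python
import math
from typing import Dict, Iterable

SparseVector = Dict[int, float]  # feature index -> value


class ParameterServer:
    """Hypothetical in-process stand-in for a server node holding sparse weights."""

    def __init__(self, learning_rate: float) -> None:
        self.weights: SparseVector = {}
        self.learning_rate = learning_rate

    def pull(self, keys: Iterable[int]) -> SparseVector:
        # Missing keys are treated as zero, so only touched features are stored.
        return {k: self.weights.get(k, 0.0) for k in keys}

    def push(self, grads: SparseVector) -> None:
        # Plain SGD step (an assumption); a real system could apply another rule.
        for k, g in grads.items():
            self.weights[k] = self.weights.get(k, 0.0) - self.learning_rate * g


def worker_step(server: ParameterServer, features: SparseVector, label: int) -> None:
    """One worker update for logistic regression on a single sparse sample."""
    weights = server.pull(features.keys())
    z = sum(weights[k] * v for k, v in features.items())
    p = 1.0 / (1.0 + math.exp(-z))  # predicted probability of label 1
    grads = {k: (p - label) * v for k, v in features.items()}
    server.push(grads)


server = ParameterServer(learning_rate=0.1)
worker_step(server, {3: 2.0}, label=1)
```

In an asynchronous deployment, many workers would call `worker_step` concurrently against sharded servers; the sketch keeps everything sequential so the data flow stays visible.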

computer and communications security | 2017

POSTER: Neural Network-based Graph Embedding for Malicious Accounts Detection

Ziqi Liu; Chaochao Chen; Jun Zhou; Xiaolong Li; Feng Xu; Tao Chen; Le Song

We present a neural network based graph embedding method for detecting malicious accounts at Alipay, one of the world's leading mobile payment platforms. Our method adaptively learns discriminative embeddings from an account-device graph based on two fundamental weaknesses of attackers, i.e., device aggregation and activity aggregation. Experiments show that our method achieves an outstanding precision-recall curve compared with existing methods.

pacific-asia conference on knowledge discovery and data mining | 2018

A Local Online Learning Approach for Non-linear Data

Xinxing Yang; Jun Zhou; Peilin Zhao; Cen Chen; Chaochao Chen; Xiaolong Li

The efficiency and scalability of online learning methods make them a popular choice for solving learning problems with big data and limited memory. Most of the existing online learning approaches are based on global models, which treat the incoming examples as linearly separable. However, this assumption is not always valid in practice. Therefore, the local online learning framework was proposed to solve non-linearly separable tasks without kernel modeling. Weights in the local online learning framework are based on first-order information, which significantly limits the performance of online learning. Intuitively, second-order online learning algorithms, e.g., Soft Confidence-Weighted (SCW), can significantly alleviate this issue. Inspired by the second-order algorithms and the local online learning framework, we propose a Soft Confidence-Weighted Local Online Learning (SCW-LOL) algorithm, which extends the single-hyperplane SCW to the case of multiple local hyperplanes. Those local hyperplanes are connected by a common component and are optimized simultaneously. We also examine the theoretical relationship between the single and multiple hyperplanes. Extensive experimental results show that the proposed SCW-LOL learns an online convergence boundary, overall achieving the best performance on almost all datasets, without any kernel modeling or parameter tuning.

knowledge discovery and data mining | 2018

Distributed Collaborative Hashing and Its Applications in Ant Financial

Chaochao Chen; Ziqi Liu; Peilin Zhao; Longfei Li; Jun Zhou; Xiaolong Li

Collaborative filtering, especially the latent factor model, has been widely used in personalized recommendation. The latent factor model aims to learn user and item latent factors from historical user-item behaviors. To apply it in real big data scenarios, efficiency becomes the first concern, including offline model training efficiency and online recommendation efficiency. In this paper, we propose a Distributed Collaborative Hashing (DCH) model which can significantly improve both efficiencies. Specifically, we first propose a distributed learning framework, following the state-of-the-art parameter server paradigm, to learn the offline collaborative model. Our model can be learnt efficiently by computing subgradients in minibatches on distributed workers and updating model parameters on servers asynchronously. We then adopt a hashing technique to speed up the online recommendation procedure. Recommendations can be made quickly by exploiting lookup hash tables. We conduct thorough experiments on two real large-scale datasets. The experimental results demonstrate that, compared with the classic and state-of-the-art (distributed) latent factor models, DCH has comparable performance in terms of recommendation accuracy but has both fast convergence speed in the offline model training procedure and real-time efficiency in the online recommendation procedure. Furthermore, the encouraging performance of DCH is also shown for several real-world applications in Ant Financial.
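To make the hashing idea concrete, here is a minimal sketch of recommendation over binary codes. It assumes users and items have already been mapped to short binary codes (stored as Python integers) and ranks items by Hamming distance to the user's code. The function names, the 8-bit codes, and the brute-force ranking are illustrative assumptions; the abstract does not specify DCH's code length or lookup structure.

```python
from typing import Dict, List


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two binary codes stored as ints."""
    return bin(a ^ b).count("1")


def recommend(user_code: int, item_codes: Dict[str, int], k: int) -> List[str]:
    """Return the k items whose codes are closest to the user's code.

    Ties are broken by item name so the result is deterministic.
    """
    ranked = sorted(
        item_codes,
        key=lambda item: (hamming(user_code, item_codes[item]), item),
    )
    return ranked[:k]


# Hypothetical 8-bit codes, for illustration only.
item_codes = {
    "item_a": 0b10110010,
    "item_b": 0b10110011,
    "item_c": 0b01001101,
}
user_code = 0b10110010
top = recommend(user_code, item_codes, k=2)
```

Comparing codes needs only an XOR and a bit count per item, which is why binary codes suit low-latency online serving. A hash-table variant would bucket items by code and probe nearby buckets instead of scanning every item.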

database systems for advanced applications | 2018

An Industrial-Scale System for Heterogeneous Information Card Ranking in Alipay

Zhiqiang Zhang; Chaochao Chen; Jun Zhou; Xiaolong Li

Alipay (https://global.alipay.com/), one of the world’s largest mobile and online payment platforms, provides not only payment services but also services covering many aspects of our daily lives (finance, insurance, credit, express, news, social contact, etc.). The homepage of the Alipay app (https://render.alipay.com/p/s/download) integrates massive heterogeneous information cards, which need to be ranked in an appropriate order for a better user experience. This paper demonstrates an industrial-scale system for heterogeneous information card ranking. We implement an ensemble ranking model, blending online and chunked-based learning algorithms which are developed on a parameter server mechanism and are able to handle industrial-scale data. Moreover, we propose efficient and effective factor embedding methods, which aim to reduce high-dimensional heterogeneous factor features to low-dimensional embedding vectors by subtly revealing feature interactions. Offline experimental results as well as online A/B testing results illustrate the efficiency and effectiveness of our proposals.

conference on information and knowledge management | 2018

Heterogeneous Graph Neural Networks for Malicious Account Detection

Ziqi Liu; Chaochao Chen; Xinxing Yang; Jun Zhou; Xiaolong Li; Le Song

We present GEM, the first heterogeneous graph neural network approach for detecting malicious accounts at Alipay, one of the world's leading mobile cashless payment platforms. Our approach, inspired by a connected subgraph approach, adaptively learns discriminative embeddings from heterogeneous account-device graphs based on two fundamental weaknesses of attackers, i.e., device aggregation and activity aggregation. Since the heterogeneous graph consists of various types of nodes, we propose an attention mechanism to learn the importance of different types of nodes, while using the sum operator to model the aggregation patterns of nodes in each type. Experiments show that our approach consistently achieves promising results compared with competitive methods over time.
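The combination of a per-type sum operator with attention over node types can be sketched as follows. This is an illustration under stated assumptions, not the GEM model: the softmax over one scalar score per node type, the fixed `type_scores` (learned in a real model), and the type names `phone` and `mac` are all hypothetical, and the paper's actual layers, nonlinearities, and training are omitted.

```python
import math
from typing import Dict, List

Vector = List[float]


def sum_aggregate(neighbors: List[Vector], dim: int) -> Vector:
    """Sum the embeddings of all neighbors of one node type."""
    out = [0.0] * dim
    for vec in neighbors:
        for i, x in enumerate(vec):
            out[i] += x
    return out


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def combine_types(
    neighbors_by_type: Dict[str, List[Vector]],
    type_scores: Dict[str, float],
    dim: int,
) -> Vector:
    """Weight each type's summed embedding by a softmax attention weight."""
    types = sorted(neighbors_by_type)
    weights = softmax([type_scores[t] for t in types])
    out = [0.0] * dim
    for w, t in zip(weights, types):
        agg = sum_aggregate(neighbors_by_type[t], dim)
        for i in range(dim):
            out[i] += w * agg[i]
    return out


# Hypothetical neighbor embeddings for one account, grouped by device type.
neighbors = {"phone": [[1.0, 0.0], [1.0, 0.0]], "mac": [[0.0, 4.0]]}
scores = {"phone": 0.0, "mac": 0.0}  # equal scores give equal attention
combined = combine_types(neighbors, scores, dim=2)
```

Summing (rather than averaging) within a type preserves how many neighbors an account has, which matters when the signal of interest is many accounts aggregating on one device.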

national conference on artificial intelligence | 2018