Zhiwu Xu
Shenzhen University
Publication
Featured research published by Zhiwu Xu.
symposium on principles of programming languages | 2014
Giuseppe Castagna; Kim Nguyen; Zhiwu Xu; Hyeonseung Im; Sergueï Lenglet; Luca Padovani
This article is the first part of a two-article series about a calculus with higher-order polymorphic functions, recursive types with arrow and product type constructors, and set-theoretic type connectives (union, intersection, and negation). In this first part we define and study the explicitly-typed version of the calculus, in which type instantiation is driven by explicit instantiation annotations. In particular, we define an explicitly-typed lambda-calculus with intersection types and an efficient evaluation model for it. In the second part, presented in a companion paper, we define a local type inference system that allows the programmer to omit explicit instantiation annotations, and a type reconstruction system that allows the programmer to omit explicit type annotations. The work presented in the two articles provides the theoretical foundations and technical machinery needed to design and implement higher-order polymorphic functional languages for semi-structured data.
software engineering and formal methods | 2010
Zhiwu Xu; Lixiao Zheng; Haiming Chen
Producing sentences from a grammar, according to various criteria, is required in many applications. It is also a basic building block for grammar engineering. This paper presents a toolkit for context-free grammars, which mainly consists of several algorithms for sentence generation or enumeration and for coverage analysis. The toolkit deals with general context-free grammars. Besides implementing these algorithms, the toolkit provides a simple graphical user interface through which the user can use it directly. The toolkit is implemented in Java and is available at http://lcs.ios.ac.cn/ hiwu/toolkit.php. The paper gives an overview of the toolkit and a description of the GUI, together with experimental results and preliminary applications of the toolkit.
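To make the idea of sentence enumeration concrete, here is a minimal Python sketch that lists every sentence of a context-free grammar derivable within a bounded derivation-tree height. It is not the toolkit's algorithm (the toolkit is written in Java and its algorithms are not described here); the grammar `GRAMMAR` and the function `enumerate_sentences` are hypothetical names chosen for illustration.

```python
from itertools import product

# Hypothetical toy grammar (an assumption, not taken from the paper):
# each nonterminal maps to a list of right-hand sides; any symbol that
# is not a key of the dictionary is treated as a terminal.
GRAMMAR = {
    "S": [["a", "S", "b"], ["c"]],
}


def enumerate_sentences(grammar, symbol, depth):
    """Return all sentences derivable from `symbol` using derivation
    trees with at most `depth` nested nonterminal expansions."""
    if symbol not in grammar:
        # A terminal derives exactly itself.
        return {(symbol,)}
    if depth == 0:
        # No expansions left for a nonterminal: nothing is derivable.
        return set()
    sentences = set()
    for rhs in grammar[symbol]:
        # Enumerate each right-hand-side symbol with one less level.
        parts = [enumerate_sentences(grammar, s, depth - 1) for s in rhs]
        # Concatenate one choice per symbol, in order.
        for combo in product(*parts):
            sentences.add(tuple(tok for part in combo for tok in part))
    return sentences


if __name__ == "__main__":
    for sentence in sorted(enumerate_sentences(GRAMMAR, "S", 3), key=len):
        print(" ".join(sentence))
```

The depth bound keeps the enumeration finite even for recursive grammars; criteria such as rule coverage would need additional bookkeeping that this sketch leaves out.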
symposium on principles of programming languages | 2015
Giuseppe Castagna; Kim Nguyen; Zhiwu Xu; Pietro Abate
This article is the second part of a two-article series about the definition of higher-order polymorphic functions in a type system with recursive types and set-theoretic type connectives (unions, intersections, and negations). In the first part, presented in a companion paper, we defined and studied the syntax, semantics, and evaluation of the explicitly-typed version of a calculus in which type instantiation is driven by explicit instantiation annotations. In this second part we present a local type inference system that allows the programmer to omit explicit instantiation annotations for function applications, and a type reconstruction system that allows the programmer to omit explicit type annotations for function definitions. The work presented in the two articles provides the theoretical foundations and technical machinery needed to design and implement higher-order polymorphic functional languages with union and intersection types and/or for semi-structured data processing.
web age information management | 2013
Liang Du; Zhiyong Shen; Jianying Wang; Zhiwu Xu
Clustering ensemble refers to combining a number of base clusterings for a particular data set into a consensus clustering solution. In this paper, we propose a novel self-supervised learning framework for clustering ensemble. Specifically, we treat the base clusterings as pseudo class labels and learn classifiers for each of them. By adding priors to the parameters of these classifiers, we capture the relationships between different base clusterings and at the same time obtain a single consolidated clustering result. The proposed framework can incorporate the original data features to improve the performance of clustering ensemble. Another advantage, which distinguishes the proposed framework from traditional clustering ensemble approaches, is its generalization capability, i.e., it can assign incoming data instances to the consensus clusters directly, based on the original data features. We conduct extensive experiments on multiple real-world data sets to show the effectiveness of our method.
international conference on formal engineering methods | 2017
Zhiwu Xu; Cheng Wen; Shengchao Qin
Type inference for binary code is a challenging problem, due partly to the fact that much type-related information is lost during compilation from high-level source code. Most existing research on binary code type inference tends to resort to program analysis techniques, which can be too conservative to infer types with high accuracy or too heavyweight to be viable in practice. In this paper, we propose a new approach to learning types for recovered variables from their related representative instructions. Our idea is motivated by “duck typing”, where the type of a variable is determined by its features and properties. Our approach first learns a classifier from existing binaries with debug information and then uses this classifier to predict types for new, unseen binaries. We have implemented our approach in a tool called BITY and used it to conduct experiments on the well-known benchmark coreutils (v8.4). The results show that our tool is more precise than the commercial tool Hex-Rays, both in terms of correct types and compatible types.
theoretical aspects of software engineering | 2016
Zhiwu Xu; Dongxiao Fan; Shengchao Qin
Verifying that a program uses resources in a valid manner is vital for program correctness. A number of solutions have been proposed to ensure such a property for resource usage, but most of them are too sophisticated to use for resource bug detection in practice and do not address the requirement that an opened resource should be used. This open-but-not-used problem can cause resource starvation in some cases as well. In particular, resources on smartphones are not only scarce but also energy-hungry. The misuse of resources can not only cause the system to run out of resources but also lead to a shorter battery life. This is the so-called energy leak problem. Aiming to provide a lightweight method and to detect as many resource bugs as possible, we propose a state-taint analysis in this paper. First, taking the open-but-not-used problem into account, we specify the appropriate usage of resources as resource protocols. Then we propose a taint-like analysis which takes resource protocols as a guide to detect resource bugs. As an application, we enrich the resource usage protocols by taking into account energy leaks and use the refined protocols to guide the analysis for energy leak detection. We implement the analysis as a prototype tool called statedroid. Using this tool, we conduct experiments on several real Android applications and find several energy leaks.
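As a rough illustration of what a resource protocol might look like, the Python sketch below encodes one as a small finite-state machine and checks a single event trace against it. This is an assumption made for illustration only: the paper describes a static, taint-like analysis, whereas this sketch checks one concrete trace, and the state names, event names, and the function `check_trace` are hypothetical.

```python
# Hypothetical protocol for one resource, as (state, event) -> next state.
# The marker state "unused-closed" flags the open-but-not-used case.
PROTOCOL = {
    ("closed", "open"): "opened",
    ("opened", "use"): "used",
    ("used", "use"): "used",
    ("used", "close"): "closed",
    ("opened", "close"): "unused-closed",
}


def check_trace(events):
    """Classify a sequence of events on one resource against PROTOCOL."""
    state = "closed"
    for event in events:
        nxt = PROTOCOL.get((state, event))
        if nxt is None:
            # The protocol has no transition for this event here.
            return f"invalid: '{event}' in state '{state}'"
        if nxt == "unused-closed":
            # Resource was opened and closed without ever being used.
            return "open-but-not-used"
        state = nxt
    if state == "opened":
        # Trace ended with the resource opened but never used.
        return "open-but-not-used"
    if state == "used":
        # Trace ended with the resource still held.
        return "leak: never closed"
    return "ok"


if __name__ == "__main__":
    print(check_trace(["open", "use", "close"]))
    print(check_trace(["open", "close"]))
    print(check_trace(["open", "use"]))
```

A static analysis would instead propagate such protocol states along all program paths; the table-driven encoding is only meant to show how "opened but never used" can be distinguished from an ordinary leak.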
formal methods | 2018
Jingyi Wang; Jun Sun; Yifan Jia; Shengchao Qin; Zhiwu Xu
Modeling and verifying real-world cyber-physical systems is challenging, especially for complex systems where manual modeling is infeasible. In this work, we report our experience in combining model learning and abstraction refinement to analyze a challenging system, namely a real-world Secure Water Treatment system (SWaT). Given a set of safety requirements, the objective is to determine whether the system is safe with high probability (so that a system shutdown is rarely triggered by a safety violation). As the system is too complicated to be modeled manually, we apply the latest automatic model learning techniques to construct a set of Markov chains through abstraction and refinement, based on two long system execution logs (one for training and the other for testing). For each probabilistic safety property, we either report that it does not hold with a certain level of probabilistic confidence, or report that it holds by showing the evidence in the form of an abstract Markov chain. The Markov chains can subsequently be implemented as runtime monitors in SWaT.
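The simplest ingredient of learning a Markov chain from an execution log is estimating transition probabilities by counting. The Python sketch below shows only that step, under the assumption that the log has already been abstracted into a sequence of discrete state labels; it does not reproduce the abstraction and refinement procedure of the paper, and the function name `estimate_markov_chain` and the labels in the example are hypothetical.

```python
from collections import Counter, defaultdict


def estimate_markov_chain(log):
    """Estimate transition probabilities from a sequence of abstract
    state labels using relative frequencies of observed transitions."""
    counts = defaultdict(Counter)
    # Count each observed transition between consecutive log entries.
    for current, following in zip(log, log[1:]):
        counts[current][following] += 1
    # Normalise the counts of each source state into probabilities.
    chain = {}
    for state, successors in counts.items():
        total = sum(successors.values())
        chain[state] = {t: n / total for t, n in successors.items()}
    return chain


if __name__ == "__main__":
    # Hypothetical abstracted log (an assumption, not SWaT data).
    log = ["safe", "safe", "unsafe", "safe"]
    print(estimate_markov_chain(log))
```

States that never appear as the source of a transition (for example, the last entry of the log if it occurs nowhere else) get no outgoing distribution in this sketch.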
Science of Computer Programming | 2017
Zhiwu Xu; Cheng Wen; Shengchao Qin
Ensuring that a program uses its resources in an appropriate manner is vital for program correctness. A number of solutions have been proposed to check that programs meet such a property on resource usage, but many of them are too sophisticated to use for resource bug detection in practice and do not take into account the expectation that a resource should be used once it is opened or required. This open-but-not-used problem can cause resource starvation in some cases, for example on smartphones or other mobile devices, where resources are not only scarce but also energy-hungry. There, inappropriate resource usage can not only cause the system to run out of resources but also lead to much shorter battery life between recharges. This is the so-called energy leak problem. In this paper, we propose a static analysis called state-taint analysis to detect resource bugs. Taking the open-but-not-used problem into account, we specify the appropriate usage of resources in terms of resource protocols. We then propose a taint-like analysis which employs resource protocols to guide resource bug detection. As an extension and an application, we enrich the protocols with the inappropriate behaviours that may cause energy leaks, and use the refined protocols to guide the analysis for energy leak detection. We implement the analysis as a prototype tool called statedroid. Using this tool, we conduct experiments on several real Android applications and test datasets from Relda and GreenDroid. The experimental results show that our tool is precise, helpful and suitable in practice, and can detect more energy leak patterns.
International Conference on Smart Computing and Communication | 2017
Zhiwu Xu; Cheng Wen; Shengchao Qin; Zhong Ming
Malware is one of the most serious security threats on the Internet today. Traditional detection methods become ineffective as malware continues to evolve. Recently, various machine learning approaches have been proposed for detecting malware. However, they either focus on behaviour information and leave data information out of consideration, or pay little attention to new malware with different behaviours and to new malware versions obtained by obfuscation techniques. In this paper, we propose an effective approach for malware detection using machine learning. Unlike most existing work, we take into account not only the behaviour information but also the data information, namely, the opcodes, data types and system libraries used in executables. We employ various machine learning methods in our implementation. Several experiments are conducted to evaluate our approach. The results show that (1) the classifier trained by Random Forest performs best, with an accuracy of 0.9788 and an AUC of 0.9959; (2) all the features (including data types) are effective for malware detection; (3) our classifier is capable of detecting some fresh malware; (4) our classifier is resistant to some obfuscation techniques.
international conference on functional programming | 2011
Giuseppe Castagna; Zhiwu Xu