
Dinghao Wu

Pennsylvania State University

Publication

Featured research published by Dinghao Wu.

programming language design and implementation | 2004

KISS: keep it simple and sequential

Shaz Qadeer; Dinghao Wu

The design of concurrent programs is error-prone due to the interaction between concurrently executing threads. Traditional automated techniques for finding errors in concurrent programs, such as model checking, explore all possible thread interleavings. Since the number of thread interleavings increases exponentially with the number of threads, such analyses have high computational complexity. In this paper, we present a novel analysis technique for concurrent programs that avoids this exponential complexity. Our analysis transforms a concurrent program into a sequential program that simulates the execution of a large subset of the behaviors of the concurrent program. The sequential program is then analyzed by a tool that only needs to understand the semantics of sequential execution. Our technique never reports false errors but may miss errors. We have implemented the technique in KISS, an automated checker for multithreaded C programs, and obtained promising initial results by using KISS to detect race conditions in Windows device drivers.

trust and trustworthy computing | 2013

A Framework for Evaluating Mobile App Repackaging Detection Algorithms

Heqing Huang; Sencun Zhu; Peng Liu; Dinghao Wu

Because it is not hard to reverse engineer the Dalvik bytecode used in the Dalvik virtual machine, Android application repackaging has become a serious problem. With repackaging, a plagiarist can simply steal others’ code, violating the intellectual property of the developers. More seriously, after repackaging, popular apps can become carriers that widely spread malware, adware, or spyware. To maintain a healthy app market, several detection algorithms have been proposed recently, which can efficiently catch some types of repackaged apps in various markets. However, they generally lack valid analysis of their effectiveness. After analyzing these approaches, we find that simple obfuscation techniques can potentially cause false negatives, because they change the main characteristics or features of the apps that are used for similarity detection. In practice, more sophisticated obfuscation techniques can be adopted (or have already been performed) in the context of mobile apps. We envision that this obfuscation-based repackaging will become a phenomenon due to the arms race between repackaging and its detection. To this end, we propose a framework to evaluate the obfuscation resilience of repackaging detection algorithms comprehensively. Our evaluation framework is able to perform a set of obfuscation algorithms in various forms on the Dalvik bytecode. Our results provide insights to help gauge both the breadth and depth of algorithms’ obfuscation resilience. We applied our framework to conduct a comprehensive case study on AndroGuard, an Android repackaging detector proposed at Black Hat 2011. Our experimental results have demonstrated the effectiveness and stability of our framework.

privacy security risk and trust | 2011

Get Online Support, Feel Better -- Sentiment Analysis and Dynamics in an Online Cancer Survivor Community

Baojun Qiu; Kang Zhao; Prasenjit Mitra; Dinghao Wu; Cornelia Caragea; John Yen; Greta E. Greer; Kenneth Portier

Many users join online health communities (OHC) to obtain information and seek social support. Understanding the emotional impacts of participation on patients and their informal caregivers is important for OHC managers. Ethnographical observations, interviews, and questionnaires have reported benefits from online health communities, but these approaches are too costly to adopt for large-scale analyses of emotional impacts. A computational approach using machine learning and text mining techniques is demonstrated using data from the American Cancer Society Cancer Survivors Network (CSN), an online forum of nearly a half million posts. This approach automatically estimates the sentiment of forum posts, discovers sentiment change patterns in CSN members, and allows investigation of factors that affect the sentiment change. This first study of sentiment benefits and dynamics in a large-scale health-related electronic community finds that an estimated 75% to 85% of CSN forum participants change their sentiment in a positive direction through online interactions with other community members. Two new features, Name and Slang, not previously used in sentiment analysis, facilitate identifying positive sentiment in posts. This work establishes foundational concepts for further studies of sentiment impact of OHC participation and provides insight useful for the design of new OHCs or enhancement of existing OHCs in providing better emotional support to their members.

principles and practice of declarative programming | 2003

Foundational proof checkers with small witnesses

Dinghao Wu; Andrew W. Appel; Aaron Stump

Proof checkers for proof-carrying code (and similar systems) can suffer from two problems: huge proof witnesses and untrustworthy proof rules. No previous design has addressed both of these problems simultaneously. We show the theory, design, and implementation of a proof-checker that permits small proof witnesses and machine-checkable proofs of the soundness of the system.

programming language design and implementation | 2003

A provably sound TAL for back-end optimization

Juan Chen; Dinghao Wu; Andrew W. Appel; Hai Fang

Typed assembly languages provide a way to generate machine-checkable safety proofs for machine-language programs. But the soundness proofs of most existing typed assembly languages are hand-written and cannot be machine-checked, which is worrisome for such large calculi. We have designed and implemented a low-level typed assembly language (LTAL) with a semantic model and established its soundness from the model. Compared to existing typed assembly languages, LTAL is more scalable and more secure; it has no macro instructions that hinder low-level optimizations such as instruction scheduling; its type constructors are expressive enough to capture dataflow information, support the compiler's choice of data representations, and permit typed position-independent code; and its type-checking algorithm is completely syntax-directed. We have built a prototype system, based on Standard ML of New Jersey, that compiles most of core ML to Sparc code. We explain how we were able to make the untyped back end in SML/NJ preserve types during instruction selection and register allocation, without restricting low-level optimizations and without knowledge of any type system pervading the instruction selector and register allocator.

foundations of software engineering | 2014

Semantics-based obfuscation-resilient binary code similarity comparison with applications to software plagiarism detection

Lannan Luo; Jiang Ming; Dinghao Wu; Peng Liu; Sencun Zhu

Existing code similarity comparison methods, whether source or binary code based, are mostly not resilient to obfuscations. In the case of software plagiarism, emerging obfuscation techniques have made automated detection increasingly difficult. In this paper, we propose a binary-oriented, obfuscation-resilient method based on a new concept, longest common subsequence of semantically equivalent basic blocks, which combines rigorous program semantics with longest common subsequence based fuzzy matching. We model the semantics of a basic block by a set of symbolic formulas representing the input-output relations of the block. This way, the semantics equivalence (and similarity) of two blocks can be checked by a theorem prover. We then model the semantics similarity of two paths using the longest common subsequence with basic blocks as elements. This novel combination has resulted in strong resiliency to code obfuscation. We have developed a prototype and our experimental results show that our method is effective and practical when applied to real-world software.

international conference on software engineering | 2011

Value-based program characterization and its application to software plagiarism detection

Yoon-Chan Jhi; Xinran Wang; Xiaoqi Jia; Sencun Zhu; Peng Liu; Dinghao Wu

Identifying similar or identical code fragments becomes much more challenging in code theft cases where plagiarizers can use various automated code transformation techniques to hide stolen code from detection. Previous works in this field are largely limited in that (1) most of them cannot handle advanced obfuscation techniques; (2) the methods based on source code analysis are less practical since the source code of suspicious programs is typically not available until strong evidence is collected; and (3) those depending on the features of specific operating systems or programming languages have limited applicability. Based on an observation that some critical runtime values are hard to replace or eliminate by semantics-preserving transformation techniques, we introduce a novel approach to dynamic characterization of executable programs. Leveraging such invariant values, our technique is resilient to various control and data obfuscation techniques. We show how the values can be extracted and refined to expose the critical values and how we can apply this runtime property to help solve problems in software plagiarism detection. We have implemented a prototype with a dynamic taint analyzer atop a generic processor emulator. Our experimental results show that the value-based method successfully discriminates 34 plagiarisms obfuscated by SandMark, plagiarisms heavily obfuscated by KlassMaster, programs obfuscated by Thicket, and executables obfuscated by Loco/Diablo.

verification model checking and abstract interpretation | 2004

Construction of a Semantic Model for a Typed Assembly Language

Gang Tan; Andrew W. Appel; Kedar N. Swadi; Dinghao Wu

Typed Assembly Languages (TALs) can be used to validate the safety of assembly-language programs. However, typing rules are usually trusted as axioms. In this paper, we show how to build semantic models for typing judgments in TALs based on an induction technique, so that both the type-safety theorem and the typing rules can be proved as lemmas in a simple logic. We demonstrate this technique by giving a complete model to a sample TAL. This model allows a typing derivation to be interpreted as a machine-checkable safety proof at the machine level.

2012 ASCE International Conference on Computing in Civil Engineering | 2012

BIM server requirements to support the energy efficient building lifecycle

Yufei Jiang; Jiang Ming; Dinghao Wu; John Yen; Prasenjit Mitra; John I. Messner; Robert M. Leicht

Energy efficient building design, construction, and operations require the development and sharing of building information among different individuals, organizations, and computer applications. Building Information Modeling (BIM) servers are tools used to enable an effective exchange of data. This paper describes an investigation into the core BIM server requirements needed to effectively support information sharing related to energy efficient retrofit projects. The requirements have been developed through an analysis of existing functional capabilities combined with a case study analysis. The set of requirements identified includes fine-grained queries such as selective model queries, information queries (e.g. weather, building description), and operational information queries (by building parts, proximity, and context). A set of RESTful programming interfaces for building tools to access and exchange data, including security and data privacy issues, is being explored to provide a server-centric building information model exchange and interoperability to facilitate energy efficient retrofit.

cluster computing and the grid | 2012

Towards Trusted Services: Result Verification Schemes for MapReduce

Chu Huang; Sencun Zhu; Dinghao Wu

Recent development in Internet-scale data applications and services, combined with the proliferation of cloud computing, has created a new computing model for data-intensive computing best characterized by the MapReduce paradigm. The MapReduce computing paradigm, pioneered by Google in its Internet search application, is an architectural and programming model for efficiently processing massive amounts of raw unstructured data. With the availability of the open source Hadoop tools, applications built on the MapReduce computing model are growing rapidly. In this work, we focus on a unique security concern in the MapReduce architecture. Given the potential security risks from lazy or malicious servers involved in a MapReduce task, we design efficient and innovative mechanisms for detecting cheating services under the MapReduce environment based on watermark injection and random sampling methods. The new detection schemes are expected to significantly reduce the cost of verification overhead. Finally, extensive analytical and experimental evaluation confirms the effectiveness of our schemes in MapReduce result verification.
