Zhichun Li | Researchain

Publication

Featured research published by Zhichun Li.

computer and communications security | 2012

CHEX: statically vetting Android apps for component hijacking vulnerabilities

Long Lu; Zhichun Li; Zhenyu Wu; Wenke Lee; Guofei Jiang

An enormous number of apps have been developed for Android in recent years, making it one of the most popular mobile operating systems. However, the quality of the booming apps can be a concern [4]. Poorly engineered apps may contain security vulnerabilities that can severely undermine users' security and privacy. In this paper, we study a general category of vulnerabilities found in Android apps, namely component hijacking vulnerabilities. Several types of previously reported app vulnerabilities, such as permission leakage, unauthorized data access, and intent spoofing, belong to this category. We propose CHEX, a static analysis method to automatically vet Android apps for component hijacking vulnerabilities. Modeling these vulnerabilities from a data-flow analysis perspective, CHEX analyzes Android apps and detects possible hijack-enabling flows by conducting low-overhead reachability tests on customized system dependence graphs. To tackle analysis challenges imposed by Android's special programming paradigm, we employ a novel technique to discover component entry points in their completeness and introduce app splitting to model the asynchronous executions of multiple entry points in an app. We prototyped CHEX based on Dalysis, a generic static analysis framework that we built to support many types of analysis on Android app bytecode. We evaluated CHEX with 5,486 real Android apps and found 254 potential component hijacking vulnerabilities. The median execution time of CHEX on an app is 37.02 seconds, which is fast enough to be used in very high volume app vetting and testing scenarios.
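The idea of a reachability test over a dependence graph can be illustrated with a minimal sketch. This is not CHEX or Dalysis code: the graph encoding, the function names, and the toy node labels are all hypothetical, and a plain breadth-first search stands in for whatever the real system does on its customized system dependence graphs.

```python
from collections import deque

def reachable(graph, start):
    """Return every node reachable from start (including start) via BFS."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

def hijack_enabling_flows(graph, sources, sinks):
    """List (source, sink) pairs where data can flow from a source to a sink."""
    flows = []
    for src in sorted(sources):
        hit = reachable(graph, src) & set(sinks)
        flows.extend((src, snk) for snk in sorted(hit))
    return flows

# Hypothetical toy graph: attacker-controlled Intent data reaches a network call,
# while a GPS read only reaches a local cache.
toy_graph = {
    "intent_extra": ["url_var"],
    "url_var": ["http_request"],
    "gps_read": ["local_cache"],
}
flows = hijack_enabling_flows(
    toy_graph,
    sources={"intent_extra", "gps_read"},
    sinks={"http_request", "sms_send"},
)
print(flows)
```

In this toy case only the flow from the Intent input to the network request is reported; the GPS read never reaches a sensitive sink.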

IEEE Transactions on Information Forensics and Security | 2011

Towards Situational Awareness of Large-Scale Botnet Probing Events

Zhichun Li; Anup Goyal; Yan Chen; Vern Paxson

Botnets dominate today's attack landscape. In this work, we investigate ways to analyze collections of malicious probing traffic in order to understand the significance of large-scale “botnet probes.” In such events, an entire collection of remote hosts together probes the address space monitored by a sensor in some sort of coordinated fashion. Our goal is to develop methodologies by which sites receiving such probes can infer, using purely local observation, information about the probing activity: What scanning strategies does the probing employ? Is this an attack that specifically targets the site, or is the site only incidentally probed as part of a larger, indiscriminate attack? Our analysis draws upon extensive honeynet data to explore the prevalence of different types of scanning, including properties such as trend, uniformity, coordination, and darknet avoidance. In addition, we design schemes to extrapolate the global properties of scanning events (e.g., total population and target scope) as inferred from the limited local view of a honeynet. Cross-validating with data from DShield shows that our inferences exhibit promising accuracy.

computer and communications security | 2012

Virtual browser: a virtualized browser to sandbox third-party JavaScripts with enhanced security

Yinzhi Cao; Zhichun Li; Vaibhav Rastogi; Yan Chen; Xitao Wen

Third-party JavaScripts not only offer much richer features to the web and its applications but also introduce new threats. These scripts cannot be completely trusted and executed with the privileges given to host web sites. Due to incomplete virtualization and lack of tracking all the data flows, all existing approaches without native sandbox support can secure only a subset of third-party JavaScripts, and they are vulnerable to attacks encoded in non-standard HTML/JavaScript (browser quirks), as these approaches parse third-party JavaScripts independently at the server side without considering client-side non-standard parsing quirks. At the same time, native sandboxes are vulnerable to attacks based on unknown native JavaScript engine bugs. In this paper, we propose Virtual Browser, a full browser-level virtualized environment within existing browsers for executing untrusted third-party code. Our approach supports more complete JavaScript language features, including hard-to-secure functions such as with and eval. Since Virtual Browser does not rely on native browser parsing behavior, there is no possibility of attacks being executed through browser quirks. Moreover, given that the third-party JavaScripts run in Virtual Browser instead of native browsers, it is harder for attackers to exploit unknown vulnerabilities in the native JavaScript engine. In our design, we first completely isolate Virtual Browser from the native browser components and then introduce communication by adding data flows carefully examined for security. The evaluation of the Virtual Browser prototype shows that our execution speed is the same as that of Microsoft Web Sandbox [5], a state-of-the-art runtime web-level sandbox. In addition, Virtual Browser is more secure and supports more complete JavaScript for third-party JavaScript development.

dependable systems and networks | 2013

Redefining web browser principals with a Configurable Origin Policy

Yinzhi Cao; Vaibhav Rastogi; Zhichun Li; Yan Chen; Alexander Moshchuk

With the advent of Web 2.0, web developers have designed multiple additions to break the same-origin policy (SOP) boundary, such as splitting and combining traditional web browser protection boundaries (security principals). However, these newly generated principals lack new labels to represent their security properties. To address the inconsistent label problem, this paper proposes a new way to define a security principal and its labels in the browser. In particular, we propose a Configurable Origin Policy (COP), in which a browser's security principal is defined by a configurable ID rather than a fixed triple <scheme, host, port>. The server-side and client-side code of a web application can create, join, and destroy its own principals. We perform a formal security analysis on COP to ensure session integrity. We then show that COP is compatible with legacy web sites, and that sites utilizing COP are also compatible with legacy browsers.
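The contrast between a fixed <scheme, host, port> triple and a configurable ID can be sketched as follows. This is an illustration only, not the COP design itself: the Resource class, the origin_id field, and the rule for resources that carry no ID are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Resource:
    scheme: str
    host: str
    port: int
    origin_id: Optional[str] = None  # hypothetical configurable principal ID

def same_principal_sop(a: Resource, b: Resource) -> bool:
    """Classic same-origin check: compare the fixed triple."""
    return (a.scheme, a.host, a.port) == (b.scheme, b.host, b.port)

def same_principal_cop(a: Resource, b: Resource) -> bool:
    """ID-based check. Assumption for this sketch: if neither resource has
    an ID, fall back to the triple; if only one has an ID, treat them as
    different principals."""
    if a.origin_id is None and b.origin_id is None:
        return same_principal_sop(a, b)
    if a.origin_id is None or b.origin_id is None:
        return False
    return a.origin_id == b.origin_id

# Two hosts of one hypothetical application that chose to share a principal.
mail = Resource("https", "mail.example.com", 443, origin_id="app-1")
docs = Resource("https", "docs.example.com", 443, origin_id="app-1")
print(same_principal_sop(mail, docs), same_principal_cop(mail, docs))
```

Under the triple-based check the two hosts are separate principals; under the ID-based check they are combined because the application assigned them the same ID.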

computer and communications security | 2016

High Fidelity Data Reduction for Big Data Security Dependency Analyses

Zhang Xu; Zhenyu Wu; Zhichun Li; Kangkook Jee; Junghwan Rhee; Xusheng Xiao; Fengyuan Xu; Haining Wang; Guofei Jiang

Intrusive multi-step attacks, such as Advanced Persistent Threat (APT) attacks, have plagued enterprises with significant financial losses and are the top reason for enterprises to increase their security budgets. Since these attacks are sophisticated and stealthy, they can remain undetected for years if individual steps are buried in background noise. Thus, enterprises are seeking solutions to connect the suspicious dots across multiple activities. This requires ubiquitous system auditing for long periods of time, which in turn produces an overwhelmingly large amount of system audit events. Given a limited system budget, efficiently handling ever-increasing system audit logs is a great challenge. This paper proposes a new approach that exploits the dependency among system events to reduce the number of log entries while still supporting high-quality forensic analysis. In particular, we first propose an aggregation algorithm that preserves the dependency of events during data reduction to ensure the high quality of forensic analysis. Then we propose an aggressive reduction algorithm and exploit domain knowledge for further data reduction. To validate the efficacy of our proposed approach, we conduct a comprehensive evaluation on real-world auditing systems using log traces of more than one month. Our evaluation results demonstrate that our approach can significantly reduce the size of system logs and improve the efficiency of forensic analysis without losing accuracy.
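One way to picture aggregation that keeps dependencies intact is the conservative rule sketched below: repeated events between the same source and destination are merged only when no other event touching either entity occurs in between. This rule is a simplification chosen for illustration and is not claimed to be the paper's algorithm; the event format and entity names are hypothetical.

```python
def aggregate(events):
    """Merge repeated (src -> dst) events that are not interleaved with any
    other event involving src or dst.

    events: list of (time, src, dst) tuples sorted by time.
    Returns a list of [start_time, end_time, src, dst] entries.
    """
    merged = []
    last_touch = {}  # entity -> index in merged of the latest event involving it
    for t, src, dst in events:
        i = last_touch.get(src)
        if i is not None and i == last_touch.get(dst) and merged[i][2:] == [src, dst]:
            merged[i][1] = t  # extend the time window of the existing entry
        else:
            merged.append([t, t, src, dst])
            i = len(merged) - 1
        last_touch[src] = last_touch[dst] = i
    return merged

# Hypothetical trace: proc_b's write to file_x splits proc_a's writes in two.
trace = [
    (1, "proc_a", "file_x"),
    (2, "proc_a", "file_x"),
    (3, "proc_b", "file_x"),
    (4, "proc_a", "file_x"),
    (5, "proc_a", "file_x"),
]
print(aggregate(trace))
```

Five raw events shrink to three entries, and the ordering of proc_a's and proc_b's interactions with file_x is still visible in the reduced log.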

very large data bases | 2015

Behavior query discovery in system-generated temporal graphs

Bo Zong; Xusheng Xiao; Zhichun Li; Zhenyu Wu; Zhiyun Qian; Xifeng Yan; Ambuj K. Singh; Guofei Jiang

Computer system monitoring generates huge amounts of logs that record the interaction of system entities. How to query such data to better understand system behaviors and identify potential system risks and malicious behaviors becomes a challenging task for system administrators due to the dynamics and heterogeneity of the data. System monitoring data are essentially heterogeneous temporal graphs with nodes being system entities and edges being their interactions over time. Given the complexity of such graphs, it becomes time-consuming for system administrators to manually formulate useful queries in order to examine abnormal activities, attacks, and vulnerabilities in computer systems. In this work, we investigate how to query temporal graphs and treat query formulation as a discriminative temporal graph pattern mining problem. We introduce TGMiner to mine discriminative patterns from system logs, and these patterns can be taken as templates for building more complex queries. TGMiner leverages temporal information in graphs to prune graph patterns that share similar growth trends without compromising pattern quality. Experimental results on real system data show that TGMiner is 6-32 times faster than baseline methods. The discovered patterns were verified by system experts; they achieved high precision (97%) and recall (91%).

recent advances in intrusion detection | 2016

Detecting Stack Layout Corruptions with Robust Stack Unwinding

Yangchun Fu; Junghwan Rhee; Zhiqiang Lin; Zhichun Li; Hui Zhang; Guofei Jiang

The stack is a critical memory structure for ensuring the correct execution of programs because control flow changes through the data stored in it, such as return addresses and function pointers. Thus the stack has been a popular target of many attacks and exploits, such as stack smashing attacks and return-oriented programming (ROP). We present a novel system to detect the corruption of the stack layout using a robust stack unwinding technique and detailed stack layouts extracted from the stack unwinding information for exception handling, which is widely available in off-the-shelf binaries. Our evaluation with real-world ROP exploits has demonstrated successful detection of them with a performance overhead of only 3.93% on average, transparently and without accessing any source code or debugging symbols of a protected binary.

computer and communications security | 2011

Poster: CUD: crowdsourcing for URL spam detection

Jun Hu; Hongyu Gao; Zhichun Li; Yan Chen

The prevalence of spam URLs in Internet services, such as email, social networks, blogs, and online forums, has become a serious problem. These spam URLs host spam advertisements, phishing attempts, and malware, which are harmful to normal users. Existing URL blacklist approaches offer limited protection. Although recent machine learning based URL classification approaches demonstrate good accuracy and reasonable throughput, they are based on observations from existing spam URLs and struggle to detect new spam URLs when attackers employ new strategies. In this paper, we present CUD (Crowdsourcing for URL spam detection) as a supplement to existing detection tools. CUD leverages human intelligence for URL classification through crowdsourcing. CUD crawls existing user comments about spam URLs already on the Internet, and employs sentiment analysis from natural language processing to analyze the user comments automatically for detecting spam URLs. Since CUD does not use features directly associated with the URLs and their landing pages, it is more robust when attackers change their strategies. Through evaluation, we find up to 70% of URLs have user comments online. CUD achieves an accuracy of 86.8% in terms of true positive rate with a false positive rate of 0.9%. Moreover, about 75% of spam URLs CUD detects are missed by other approaches. Therefore, CUD can be used as a good complement to other approaches.

computer and communications security | 2015

Discover and Tame Long-running Idling Processes in Enterprise Systems

Jun Wang; Zhiyun Qian; Zhichun Li; Zhenyu Wu; Junghwan Rhee; Xia Ning; Peng Liu; Guofei Jiang

Reducing attack surface is an effective preventive measure to strengthen security in large systems. However, it is challenging to apply this idea in an enterprise environment where systems are complex and evolving over time. In this paper, we empirically analyze and measure a real enterprise to identify unused services that expose attack surface. Interestingly, such unused services are known to exist and are summarized by security best practices, yet such solutions require significant manual effort. We propose an automated approach to accurately detect the idling (most likely unused) services that are in either blocked or bookkeeping states. The idea is to identify repeating events with perfect time alignment, which is an indication of idling. We implement this idea by developing a novel statistical algorithm based on autocorrelation with time information incorporated. From our measurement results, we find that 88.5% of the detected idling services can be constrained with a simple syscall-based policy, which confines the process behaviors within its bookkeeping states. In addition, working with two IT departments (one of which serves as cross-validation), we received positive feedback showing that about 30.6% of such services can be safely disabled or uninstalled directly. In the future, the IT department plans to incorporate the results to build a smaller OS installation image. Finally, we believe our measurement results raise awareness of the potential security risks of idling services.
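The intuition of spotting time-aligned repeating events can be illustrated with textbook autocorrelation over binned event counts. The sketch below is not the paper's algorithm, which incorporates time information in its own way; the function names, bin size, and synthetic timestamps are assumptions for the example.

```python
def autocorrelation(series, lag):
    """Normalized autocorrelation of a numeric series at a given lag."""
    n = len(series)
    mean = sum(series) / n
    var = sum((x - mean) ** 2 for x in series)
    if var == 0:
        return 0.0
    cov = sum((series[i] - mean) * (series[i + lag] - mean) for i in range(n - lag))
    return cov / var

def dominant_period(timestamps, bin_size, max_lag):
    """Bin event timestamps and return (period, score) for the best-scoring lag."""
    bins = [0] * (int(max(timestamps) // bin_size) + 1)
    for t in timestamps:
        bins[int(t // bin_size)] += 1
    scores = {lag: autocorrelation(bins, lag) for lag in range(1, max_lag + 1)}
    best = max(scores, key=scores.get)
    return best * bin_size, scores[best]

# Synthetic trace: a hypothetical service wakes up exactly every 10 seconds.
wakeups = list(range(0, 200, 10))
period, score = dominant_period(wakeups, bin_size=1, max_lag=15)
print(period, round(score, 3))
```

A score close to 1 at a single lag suggests strictly periodic activity, which is the kind of signal that a bookkeeping loop would produce; irregular, request-driven activity would yield low scores at every lag.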

computer and communications security | 2018

NodeMerge: Template Based Efficient Data Reduction For Big-Data Causality Analysis

Yutao Tang; Ding Li; Zhichun Li; Mu Zhang; Kangkook Jee; Xusheng Xiao; Zhenyu Wu; Junghwan Rhee; Fengyuan Xu; Qun Li

Today's enterprises are exposed to sophisticated attacks, such as Advanced Persistent Threat (APT) attacks, which usually consist of multiple stealthy steps. To counter these attacks, enterprises often rely on causality analysis of the system activity data collected from ubiquitous system monitoring to discover the initial penetration point, and from there identify previously unknown attack steps. However, one major challenge for causality analysis is that ubiquitous system monitoring generates a colossal amount of data, and hosting such a huge amount of data is prohibitively expensive. Thus, there is a strong demand for techniques that reduce the storage of data for causality analysis and yet preserve the quality of the causality analysis. To address this problem, in this paper, we propose NodeMerge, a template based data reduction system for online system event storage. Specifically, our approach can directly work on the stream of system dependency data and achieve data reduction on the read-only file events based on their access patterns. It can either reduce the storage cost or improve the performance of causality analysis under the same budget. With only a reasonable amount of resources for online data reduction, it nearly completely preserves the accuracy of causality analysis. The reduced form of data can be used directly with little overhead. To evaluate our approach, we conducted a set of comprehensive evaluations, which show that for different categories of workloads, our system can reduce the storage capacity of raw system dependency data by as much as 75.7 times, and the storage capacity of the state-of-the-art approach by as much as 32.6 times. Furthermore, the results also demonstrate that our approach keeps all the causality analysis information and has a reasonably small overhead in memory and hard disk.
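A rough picture of template-based reduction: if a process reads a known set of read-only files together, those reads could be stored as a single reference to a template. The sketch below assumes that a template is simply a set of file paths and that it was learned beforehand; the function name, template ID, and paths are hypothetical and do not reflect NodeMerge's actual data structures.

```python
def apply_template(read_files, template, template_id):
    """Replace the reads covered by a template with one template reference.

    read_files: iterable of file paths a process read (read-only accesses).
    template: frozenset of file paths that are commonly read together.
    Returns the reduced, sorted list of stored items.
    """
    files = set(read_files)
    if template <= files:  # every file in the template was read
        return sorted(files - template) + [template_id]
    return sorted(files)

# Hypothetical template: shared libraries loaded at process start.
libs = frozenset({"/lib/libc.so", "/lib/libm.so"})
stored = apply_template(
    ["/lib/libc.so", "/lib/libm.so", "/etc/app.conf"], libs, "T1"
)
print(stored)
```

Three read events are stored as two items, and expanding the template ID back into its file set recovers the original reads.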
