
Publication


Featured research published by Thu D. Nguyen.


High Performance Distributed Computing | 2003

PlanetP: using gossiping to build content addressable peer-to-peer information sharing communities

Francisco Matias Cuenca-Acuna; Christopher Peery; Richard P. Martin; Thu D. Nguyen

We introduce PlanetP, a content-addressable publish/subscribe service for unstructured peer-to-peer (P2P) communities. PlanetP supports content addressing by providing: (1) a gossiping layer used to globally replicate a membership directory and an extremely compact content index; and (2) a completely distributed content search and ranking algorithm that helps users find the most relevant information. PlanetP is a simple yet powerful system for sharing information. It is simple because each peer must only perform a periodic, randomized, point-to-point message exchange with other peers. It is powerful because it maintains a globally content-ranked view of the shared data. Using simulation and a prototype implementation, we show that PlanetP achieves ranking accuracy comparable to that of a centralized solution and scales easily to several thousand peers while remaining resilient to rapid membership changes.
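To make the gossiping layer concrete, here is a minimal push-pull sketch of how a membership directory and per-peer term index could replicate through periodic randomized exchanges. It is a toy standing in for PlanetP's actual protocol: the class and function names are invented, and a plain set stands in for the paper's compact (Bloom-filter-based) index.

```python
import random

class Peer:
    """Toy PlanetP-style peer: holds a directory mapping each known peer
    to its index terms (a plain set standing in for a compact Bloom filter)."""

    def __init__(self, name, terms):
        self.directory = {name: set(terms)}

    def gossip_with(self, other):
        # Push-pull anti-entropy: both sides end up with the union of
        # everything either one had heard about.
        merged = {}
        for d in (self.directory, other.directory):
            for peer, terms in d.items():
                merged.setdefault(peer, set()).update(terms)
        self.directory = {p: set(t) for p, t in merged.items()}
        other.directory = {p: set(t) for p, t in merged.items()}

def gossip_round(peers):
    # Each peer contacts one random other peer per round.
    for p in peers:
        p.gossip_with(random.choice([q for q in peers if q is not p]))

peers = [Peer(f"peer{i}", {f"topic{i}", "p2p"}) for i in range(8)]
for _ in range(5):  # a few randomized rounds almost surely replicate fully
    gossip_round(peers)
print(sorted(len(p.directory) for p in peers))  # very likely [8] * 8
```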


ACM Special Interest Group on Data Communication | 1993

Implementing network protocols at user level

Chandramohan A. Thekkath; Thu D. Nguyen; Evelyn Moy; Edward D. Lazowska

Traditionally, network software has been structured in a monolithic fashion, with all protocol stacks executing either within the kernel or in a single trusted user-level server. This organization is motivated by performance and security concerns. However, considerations of code maintenance, ease of debugging, customization, and the simultaneous existence of multiple protocols argue for separating the implementations into more manageable user-level libraries of protocols. This paper describes the design and implementation of transport protocols as user-level libraries. We begin by motivating the need for protocol implementations as user-level libraries and placing our approach in the context of previous work. We then describe our alternative to monolithic protocol organization, which has been implemented on Mach workstations connected not only to traditional Ethernet, but also to a more modern network, the DEC SRC AN1. Based on our experience, we discuss the implications for host-network interface design and for overall system structure to support efficient user-level implementations of network protocols.
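As a rough illustration of the structural argument, the sketch below separates a minimal in-kernel demultiplexing role from per-application protocol code living in an ordinary library. Everything here is invented for illustration; it mirrors the division of labor the paper argues for, not its Mach implementation.

```python
class KernelDemux:
    """Minimal in-kernel role: map a port to a user-level protocol library
    and hand packets up without doing any protocol processing itself."""

    def __init__(self):
        self.endpoints = {}

    def bind(self, port, library):
        self.endpoints[port] = library

    def deliver(self, port, payload):
        # No retransmission, ordering, or flow control here: all of that
        # policy lives in the application's own library.
        self.endpoints[port].input(payload)

class UserLevelTransport:
    """Per-application protocol code: debuggable, customizable, and
    replaceable like any other user-level library."""

    def __init__(self):
        self.received = []

    def input(self, payload):
        self.received.append(payload)

demux = KernelDemux()
lib = UserLevelTransport()
demux.bind(5000, lib)
demux.deliver(5000, b"hello")
print(lib.received)  # [b'hello']
```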


IEEE International Conference on High Performance Computing, Data and Analytics | 2011

Reducing electricity cost through virtual machine placement in high performance computing clouds

Kien Le; Ricardo Bianchini; Jingru Zhang; Yogesh Jaluria; Jiandong Meng; Thu D. Nguyen

In this paper, we first study the impact of load placement policies on cooling and maximum data center temperatures in cloud service providers that operate multiple geographically distributed data centers. Based on this study, we then propose dynamic load distribution policies that consider all electricity-related costs as well as transient cooling effects. Our evaluation studies the ability of different cooling strategies to handle load spikes, compares the behaviors of our dynamic cost-aware policies to cost-unaware and static policies, and explores the effects of many parameter settings. Among other interesting results, we demonstrate that (1) our policies can provide large cost savings, (2) load migration enables savings in many scenarios, and (3) all electricity-related costs must be considered together to achieve consistently high cost savings.
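A hedged sketch of what one cost-aware placement decision might look like: send each job to the data center with the lowest marginal electricity cost, where a toy cooling-overhead term grows with utilization. The cooling model, field names, and numbers are all invented; the paper's policies also model transient cooling effects, which this omits.

```python
# Illustrative only: greedy, cost-aware placement across data centers.

def marginal_cost(dc, extra_load):
    it_power = extra_load * dc["watts_per_unit"]
    # Toy cooling model: overhead factor rises linearly with utilization.
    util_after = (dc["load"] + extra_load) / dc["capacity"]
    cooling_factor = 1.0 + dc["cooling_coeff"] * util_after
    return it_power * cooling_factor * dc["price_per_watt"]

def place(datacenters, jobs):
    for job in jobs:
        feasible = [d for d in datacenters if d["load"] + job <= d["capacity"]]
        best = min(feasible, key=lambda d: marginal_cost(d, job))
        best["load"] += job
        yield best["name"]

dcs = [
    {"name": "east", "load": 0, "capacity": 100,
     "watts_per_unit": 200, "cooling_coeff": 0.8, "price_per_watt": 0.10},
    {"name": "west", "load": 0, "capacity": 100,
     "watts_per_unit": 200, "cooling_coeff": 0.4, "price_per_watt": 0.12},
]
print(list(place(dcs, [10, 10, 20])))
```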


International Conference on Green Computing | 2010

Capping the brown energy consumption of Internet services at low cost

Kien Le; Ricardo Bianchini; Thu D. Nguyen; Ozlem Bilgir; Margaret Martonosi

The large amount of energy consumed by Internet services represents significant and fast-growing financial and environmental costs. Increasingly, services are exploring dynamic methods to minimize energy costs while respecting their service-level agreements (SLAs). Furthermore, it will soon be important for these services to manage their usage of “brown energy” (produced via carbon-intensive means) relative to renewable or “green” energy. This paper introduces a general, optimization-based framework for enabling multi-data-center services to manage their brown energy consumption and leverage green energy, while respecting their SLAs and minimizing energy costs. Based on the framework, we propose a policy for request distribution across the data centers. Our policy can be used to abide by caps on brown energy consumption, such as those that might arise from Kyoto-style carbon limits, from corporate pledges on carbon-neutrality, or from limits imposed on services to encourage brown energy conservation. We evaluate our framework and policy extensively through simulations and real experiments. Our results show how our policy allows a service to trade off consumption and cost. For example, using our policy, the service can reduce brown energy consumption by 24% for only a 10% increase in cost, while still abiding by SLAs.
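To illustrate the flavor of such a policy (not the paper's actual optimizer), the sketch below distributes load across sites green-energy-first and lets brown energy absorb the remainder only while a global cap has headroom. The site parameters, units, and greedy strategy are assumptions for illustration.

```python
# Illustrative only: greedy request distribution under a brown-energy cap.

def distribute(load, sites, brown_cap):
    plan, brown_used = {}, 0.0
    # First serve as much as possible on green energy, cheapest sites first.
    for s in sorted(sites, key=lambda s: s["cost"]):
        take = min(load, s["capacity"], s["green"])
        plan[s["name"]] = take
        s["_spare"] = s["capacity"] - take
        load -= take
    # Spill the remainder onto brown energy while the cap allows.
    for s in sorted(sites, key=lambda s: s["cost"]):
        if load <= 0 or brown_used >= brown_cap:
            break
        take = min(load, s["_spare"], brown_cap - brown_used)
        plan[s["name"]] += take
        brown_used += take
        load -= take
    return plan, load  # any residual load would violate the cap

sites = [
    {"name": "A", "capacity": 60, "green": 40, "cost": 0.10},
    {"name": "B", "capacity": 60, "green": 10, "cost": 0.08},
]
print(distribute(100, sites, brown_cap=30))
```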


Measurement and Modeling of Computer Systems | 2002

Improving cluster availability using workstation validation

Taliver Heath; Richard P. Martin; Thu D. Nguyen

We demonstrate a framework for improving the availability of cluster-based Internet services. Our approach models Internet services as a collection of interconnected components, each possessing well-defined interfaces and failure semantics. Such a decomposition allows designers to engineer high availability based on an understanding of the interconnections and isolated fault behavior of each component, as opposed to ad hoc methods. In this work, we focus on using the entire commodity workstation as a component because it possesses natural, fault-isolated interfaces. We define a failure event as a reboot because a workstation is not only unavailable during a reboot, but reboots are also symptomatic of a larger class of failures, such as configuration and operator errors. Our observations of three distinct clusters show that the time between reboots is best modeled by a Weibull distribution with shape parameters of less than 1, implying that a workstation becomes more reliable the longer it has been operating. Leveraging this observed property, we design an allocation strategy that withholds recently rebooted workstations from active service, validating their stability before allowing them to return to service. We show via simulation that this policy leads to a 70-30 rule of thumb: for a constant utilization, approximately 70% of workstation failures can be masked from end clients with 30% extra capacity added to the cluster, provided reboots are not strongly correlated. We also find that our technique is most sensitive to the burstiness of reboots rather than to the absolute lengths of workstation uptimes.
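The link between a Weibull shape parameter below 1 and the validation policy is that the hazard rate (the instantaneous reboot rate) is then decreasing in uptime, so a machine that survives a probation period is statistically more stable. A small check of that claim, with invented parameters:

```python
# Weibull hazard rate: h(t) = (k / lam) * (t / lam) ** (k - 1).
# For shape k < 1 the exponent is negative, so h(t) falls as uptime grows,
# which is what makes "withhold recently rebooted machines" sensible.
# The parameter values below are invented for illustration.

def weibull_hazard(t, k, lam):
    return (k / lam) * (t / lam) ** (k - 1)

k, lam = 0.6, 100.0  # shape < 1, as the paper observes for time between reboots
for t in (1, 10, 100, 1000):
    print(f"uptime {t:>5}: hazard {weibull_hazard(t, k, lam):.5f}")
# Output decreases monotonically: a node that has survived its validation
# period is less likely to reboot in the next instant than a fresh one.
```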


International Conference on Distributed Computing Systems | 2012

DMap: A Shared Hosting Scheme for Dynamic Identifier to Locator Mappings in the Global Internet

Tam Vu; Akash Baid; Yanyong Zhang; Thu D. Nguyen; Junichiro Fukuyama; Richard P. Martin; Dipankar Raychaudhuri

This paper presents the design and evaluation of a novel distributed shared hosting approach, DMap, for managing dynamic identifier to locator mappings in the global Internet. DMap is the foundation for a fast global name resolution service necessary to enable emerging Internet services such as seamless mobility support, content delivery and cloud computing. Our approach distributes identifier to locator mappings among Autonomous Systems (ASs) by directly applying K > 1 consistent hash functions on the identifier to produce network addresses of the AS gateway routers at which the mapping will be stored. This direct mapping technique leverages the reachability information of the underlying routing mechanism that is already available at the network layer, and achieves low lookup latencies through a single overlay hop without additional maintenance overheads. The proposed DMap technique is described in detail and specific design problems such as address space fragmentation, reducing latency through replication, taking advantage of spatial locality, as well as coping with inconsistent entries are addressed. Evaluation results are presented from a large-scale discrete event simulation of the Internet with ~26,000 ASs using real-world traffic traces from the DIMES repository. The results show that the proposed method evenly balances storage load across the global network while achieving lookup latencies with a mean value of ~50 ms and 95th percentile value of ~100 ms, considered adequate for support of dynamic mobility across the global Internet.
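A minimal sketch of the direct-mapping idea: hash the identifier with K salts into the address space, and store the mapping at the AS announcing the next address at or above each hash (a consistent-hashing-style rule). The prefix list, K, and the successor rule are simplifications invented here; the paper additionally handles address-space fragmentation, replication for latency, and spatial locality.

```python
import bisect, hashlib

def hosting_ases(identifier, as_prefixes, K=3):
    """Return the K ASes (identified by the first address of an announced
    prefix) that should store this identifier -> locator mapping."""
    space = sorted(as_prefixes)
    hosts = []
    for i in range(K):
        h = hashlib.sha256(f"{identifier}:{i}".encode()).digest()
        addr = int.from_bytes(h[:4], "big")       # fold into IPv4 space
        j = bisect.bisect_left(space, addr) % len(space)  # wrap around
        hosts.append(space[j])
    # Note: different salts may land on the same AS in this toy version.
    return hosts

# Toy "announced prefixes" (first addresses), one per AS:
prefixes = [0x0A000000, 0x52000000, 0x8C000000, 0xC0A80000]
print([hex(a) for a in hosting_ases("GUID-42", prefixes)])
```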


Symposium on Reliable Distributed Systems | 2003

Autonomous replication for high availability in unstructured P2P systems

Francisco Matias Cuenca-Acuna; Richard P. Martin; Thu D. Nguyen

We consider the problem of increasing the availability of shared data in peer-to-peer systems. In particular, we conservatively estimate the amount of excess storage required to achieve a practical availability of 99.9% by studying a decentralized algorithm that depends only on a modest amount of loosely synchronized global state. Our algorithm uses randomized decisions extensively, together with a novel application of an erasure code, to tolerate autonomous peer actions as well as staleness in the loosely synchronized global state. We study the behavior of this algorithm in three distinct environments modeled on previously reported measurements. We show that while peers act autonomously, the community as a whole will reach a stable configuration. We also show that space is used fairly and efficiently, delivering three nines of availability at a cost of six times the storage footprint of the data collection when the average peer availability is only 24%.
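The availability arithmetic behind erasure-coded replication can be sketched in a few lines: with n fragments on distinct peers, any m of which reconstruct the object, and independent per-peer availability p, object availability is the binomial tail below. This idealized textbook model ignores the stale global state and autonomous peer behavior the paper's algorithm is designed around, and the parameter values are illustrative.

```python
from math import comb

def availability(n, m, p):
    """P(at least m of n fragments are online), peers independent with
    availability p: sum of the binomial upper tail from m to n."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(m, n + 1))

p, m, target = 0.24, 32, 0.999   # low peer availability, three-nines goal
n = m
while availability(n, m, p) < target:
    n += 1
print(f"n={n} fragments -> {n/m:.1f}x storage, "
      f"availability={availability(n, m, p):.4f}")
```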


Measurement and Modeling of Computer Systems | 2010

Managing the cost, energy consumption, and carbon footprint of internet services

Kien Le; Ozlem Bilgir; Ricardo Bianchini; Margaret Martonosi; Thu D. Nguyen

The large amount of energy consumed by Internet services represents significant and fast-growing financial and environmental costs. This paper introduces a general, optimization-based framework and several request distribution policies that enable multi-data-center services to manage their brown energy consumption and leverage green energy, while respecting their service-level agreements (SLAs) and minimizing energy cost. Our policies can be used to abide by caps on brown energy consumption that might arise from various scenarios such as government imposed Kyoto-style carbon limits. Extensive simulations and real experiments show that our policies allow a service to trade off consumption and cost. For example, using our policies, a service can reduce brown energy consumption by 24% for only a 10% increase in cost, while still abiding by SLAs.
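Since the framework is described here only in prose, the following is one plausible shape for the underlying optimization, written out as math. The symbols are invented to convey the structure, not taken from the paper: f_i is the fraction of requests routed to data center i, b_i and g_i the brown and green energy drawn there, B the brown-energy cap, and L^SLA the latency target.

```latex
\begin{align*}
\min_{f_1,\dots,f_n}\quad
  & \sum_{i} c_i^{\text{brown}}\, b_i(f_i) + c_i^{\text{green}}\, g_i(f_i) \\
\text{s.t.}\quad
  & \sum_i f_i = 1, \qquad 0 \le f_i \le 1, \\
  & \sum_i b_i(f_i) \le B
      \quad \text{(cap on brown energy consumption)}, \\
  & \text{latency}_i(f_i) \le L^{\text{SLA}} \quad \text{for all } i .
\end{align*}
```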


Architectural Support for Programming Languages and Operating Systems | 2015

ApproxHadoop: Bringing Approximations to MapReduce Frameworks

Íñigo Goiri; Ricardo Bianchini; Santosh Nagarakatte; Thu D. Nguyen

We propose and evaluate a framework for creating and running approximation-enabled MapReduce programs. Specifically, we propose approximation mechanisms that fit naturally into the MapReduce paradigm, including input data sampling, task dropping, and accepting and running a precise and a user-defined approximate version of the MapReduce code. We then show how to leverage statistical theories to compute error bounds for popular classes of MapReduce programs when approximating with input data sampling and/or task dropping. We implement the proposed mechanisms and error bound estimations in a prototype system called ApproxHadoop. Our evaluation uses MapReduce applications from different domains, including data analytics, scientific computing, video encoding, and machine learning. Our results show that ApproxHadoop can significantly reduce application execution time and/or energy consumption when the user is willing to tolerate small errors. For example, ApproxHadoop can reduce runtimes by up to 32x when the user can tolerate an error of 1% with 95% confidence. We conclude that our framework and system can make approximation easily accessible to many application domains using the MapReduce model.
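The statistical machinery for bounding error under input sampling can be seen in miniature below: estimating a total over N records from a uniform random sample of n, with a normal-theory 95% confidence interval and a finite-population correction. This is standard survey-sampling math offered as a self-contained sketch, not ApproxHadoop's exact estimator.

```python
import random, statistics

random.seed(1)
population = [random.uniform(0, 100) for _ in range(100_000)]
N = len(population)

n = 2_000
sample = random.sample(population, n)   # sampled without replacement
mean = statistics.fmean(sample)
sd = statistics.stdev(sample)

estimate = N * mean
# 95% CI on the total: z * N * (sd / sqrt(n)), shrunk by the
# finite-population correction sqrt((N - n) / N) since n is a
# without-replacement sample from a population of size N.
margin = 1.96 * N * (sd / n**0.5) * ((N - n) / N) ** 0.5

true_total = sum(population)
print(f"estimate {estimate:,.0f} +/- {margin:,.0f} (true {true_total:,.0f})")
```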


Policies for Distributed Systems and Networks | 2002

A hierarchical policy specification language, and enforcement mechanism, for governing digital enterprises

Xuhui Ao; Naftaly H. Minsky; Thu D. Nguyen

This paper is part of a research program based on the thesis that the only reliable way to ensure that a heterogeneous distributed community of software modules and people conforms to a given policy is for this policy to be enforced. We have devised a mechanism called law-governed interaction (LGI) for this purpose. LGI can be used to specify a wide range of policies to govern the interactions among the members of large and heterogeneous communities of agents dispersed throughout a distributed enterprise, and to enforce such policies in a decentralized and efficient manner. What concerns us in this paper is the fact that a typical enterprise is bound to be governed by a multitude of policies. Such policies are likely to be interrelated in complex ways, forming an ensemble of policies that is to govern the enterprise as a whole. As a step toward organizing such an ensemble of policies, we introduce a hierarchical inter-policy relation called a superior/subordinate relation. This relation is intended to serve two distinct but related purposes: first, it helps to organize and classify a set of enterprise policies; second, it helps regulate the long-term evolution of the various policies that govern an enterprise. For this purpose, each policy in the hierarchy should circumscribe the authority and the structure of the policies subordinate to it, in a manner analogous to the way a constitution in American jurisprudence constrains the laws subordinate to it. Broadly speaking, the hierarchical structure of the ensemble of policies that govern a given enterprise should reflect the hierarchical structure of the enterprise itself.
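A toy rendering of the superior/subordinate relation (not LGI's actual law language): a subordinate policy may only grant actions its superior already permits, and an action stands only if every policy up the chain allows it. All names and the set-based representation are invented for illustration.

```python
class Policy:
    """A policy grants a set of actions; a subordinate policy is
    circumscribed by its superior, constitution-style."""

    def __init__(self, name, grants, superior=None):
        self.name, self.superior = name, superior
        if superior is not None and not grants <= superior.grants:
            raise ValueError(
                f"{name} grants actions its superior {superior.name} "
                f"forbids: {grants - superior.grants}")
        self.grants = grants

    def permits(self, action):
        # An action stands only if every policy up the chain allows it.
        return action in self.grants and (
            self.superior is None or self.superior.permits(action))

enterprise = Policy("enterprise", {"read", "write", "audit"})
division = Policy("division", {"read", "write"}, superior=enterprise)
print(division.permits("write"))   # True
print(division.permits("audit"))   # False: not granted at division level
try:
    Policy("rogue", {"delete"}, superior=division)
except ValueError as e:
    print(e)   # rejected: exceeds the superior's authority
```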
