Marc Frîncu
University of Southern California
Publications
Featured research published by Marc Frîncu.
IEEE High Performance Extreme Computing Conference | 2014
Charith Wickramaarachchi; Marc Frîncu; Patrick Small; Viktor K. Prasanna
Detecting community structures in graphs is a well-studied problem in graph data analytics. Unprecedented growth in graph-structured data due to the development of the world wide web and social networks in the past decade emphasizes the need for fast graph data analytics techniques. In this paper we present a simple yet efficient approach to detect communities in large-scale graphs by modifying the sequential Louvain algorithm for community detection. The proposed distributed-memory parallel algorithm targets the costly first iteration of the initial method by parallelizing it. Experimental results on an MPI setup with 128 parallel processes show that up to ≈5× performance improvement is achieved as compared to the sequential version while not compromising the correctness of the final result.
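The costly phase the paper parallelizes is Louvain's local-move pass, in which each node's decision reads only its own neighborhood. A minimal sequential sketch of that pass, with a simplified modularity gain; the function and variable names are illustrative, not the authors' distributed MPI code:

```python
from collections import defaultdict

def louvain_local_pass(adj):
    """One-level Louvain: repeatedly move each node to the neighboring
    community with the largest (simplified) modularity gain.

    `adj` maps node -> {neighbor: edge_weight}. Because each node's move
    depends only on its local neighborhood, this pass is the natural
    target for data-parallel execution.
    """
    community = {v: v for v in adj}                 # every node starts alone
    degree = {v: sum(adj[v].values()) for v in adj}
    two_m = sum(degree.values())                    # 2 * total edge weight
    sigma_tot = dict(degree)                        # total degree per community
    moved = True
    while moved:
        moved = False
        for node in adj:
            cur = community[node]
            sigma_tot[cur] -= degree[node]          # take node out of its community
            links = defaultdict(float)              # weight from node into each community
            for nbr, w in adj[node].items():
                if nbr != node:
                    links[community[nbr]] += w
            best_c, best_gain = cur, 0.0
            for c, k_in in links.items():
                gain = k_in - sigma_tot[c] * degree[node] / two_m
                if gain > best_gain:
                    best_c, best_gain = c, gain
            sigma_tot[best_c] = sigma_tot.get(best_c, 0.0) + degree[node]
            if best_c != cur:
                community[node] = best_c
                moved = True
    return community
```

On a toy graph of two triangles joined by a bridge edge, the pass collapses each triangle into its own community.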
IEEE International Conference on Cloud Computing Technology and Science | 2015
Alok Gautam Kumbhare; Yogesh Simmhan; Marc Frîncu; Viktor K. Prasanna
The need for low-latency analysis over high-velocity data streams motivates the need for distributed continuous dataflow systems. Contemporary stream processing systems use simple techniques to scale on elastic cloud resources to handle variable data rates. However, application QoS is also impacted by variability in resource performance exhibited by clouds, and hence autonomic methods of provisioning elastic resources are needed to support such applications on cloud infrastructure. We develop the concept of “dynamic dataflows”, which utilize alternate tasks as additional control over the dataflow's cost and QoS. Further, we formalize an optimization problem to represent deployment and runtime resource provisioning that allows us to balance the application's QoS, value, and resource cost. We propose two greedy heuristics, centralized and sharded, based on the variable-sized bin packing algorithm, and compare them against a Genetic Algorithm (GA) based heuristic that gives a near-optimal solution. A large-scale simulation study, using the linear road benchmark and VM performance traces from the AWS public cloud, shows that while the GA-based heuristic provides a better-quality schedule, the greedy heuristics are more practical, and can intelligently utilize cloud elasticity to mitigate the effect of variability, both in input data rates and in cloud resource performance, to meet the QoS of fast data applications.
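The primitive underlying the paper's greedy heuristics is variable-sized bin packing. A toy first-fit-decreasing sketch of that primitive; the function, VM types, and cost model here are hypothetical, and the paper's actual heuristics additionally weigh QoS, value, and alternate tasks:

```python
def greedy_provision(loads, vm_types):
    """Pack dataflow task loads onto VMs of heterogeneous sizes.

    First-fit decreasing over variable-sized bins: place each load into an
    already-provisioned VM with spare capacity, else provision the cheapest
    VM type that can hold it. `vm_types` is a list of (capacity, cost) pairs.
    """
    vms, total_cost = [], 0
    for load in sorted(loads, reverse=True):
        for vm in vms:
            if vm["free"] >= load:
                vm["free"] -= load
                break
        else:
            cap, cost = min((t for t in vm_types if t[0] >= load),
                            key=lambda t: t[1])
            vms.append({"cap": cap, "free": cap - load, "cost": cost})
            total_cost += cost
    return vms, total_cost
```

For example, loads of 6, 5, 4, and 3 capacity units against VM types (10 units, cost 10) and (5 units, cost 6) provision three VMs at a total cost of 22.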
IEEE International Symposium on Parallel & Distributed Processing, Workshops and PhD Forum | 2013
Marc Frîncu; Stéphane Genaud; Julien Gossa
Cloud computing is emerging as a leading solution for deploying on-demand applications in both industry and the scientific community. An important problem which needs to be considered is that of scheduling tasks on existing resources. Since clouds are linked to grid systems, much of the work done on the latter can be ported with some modifications due to specific aspects that concern clouds, e.g., virtualization, scalability and on-demand provisioning. Two types of applications are usually considered for cloud migration: bag-of-tasks and workflows. This paper deals with the second case and investigates the impact virtual machine provisioning policies have on the scheduling strategy when various workflow types and execution times are used. Five provisioning methods are proposed and tested on well-known workflow scheduling algorithms such as CPA, Gain and HEFT. We show that some correlation between the application characteristics and the provisioning method exists. This result paves the way for adaptive scheduling, in which, based on the workflow properties, a specific provisioning method can be applied in order to optimize execution times or costs.
International Conference on Big Data | 2014
Marc Frîncu; Charalampos Chelmis; Muhammad Usman Noor; Viktor K. Prasanna
Smart grids are becoming popular with the advent of sophisticated smart meters. They allow utilities to optimize energy consumption during peak hours by applying various demand response techniques including voluntary curtailment, direct control and price incentives. To sustain the curtailment over long periods of time of up to several hours, utilities need to make fast and accurate consumption predictions for a large set of customers based on a continuous flow of real-time data and huge historical data sets. Given the numerous consumption patterns customers exhibit, different prediction methods need to be used to reduce the prediction error. The straightforward approach of testing each customer against every method is infeasible in this large-volume, high-velocity environment. To this aim, we propose a neural network based approach for automatically selecting the best prediction method per customer by relying only on a small subset of customers. We also introduce two historical averaging methods for consumption prediction that take advantage of the variability of the data and continuously update the results based on a sliding-window technique. We show that once trained, the proposed neural network does not require frequent retraining, ensuring its applicability in online scenarios such as sustainable demand response.
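The sliding-window idea behind the averaging methods can be sketched minimally as an incrementally updated mean over the most recent readings. The class and method names are mine, and the paper's actual methods additionally exploit the variability of the data:

```python
from collections import deque

class SlidingWindowPredictor:
    """Predict a customer's next reading as the mean of the last `window`
    readings, updated incrementally as meter data streams in."""

    def __init__(self, window):
        self.buf = deque(maxlen=window)
        self.total = 0.0

    def observe(self, kwh):
        if len(self.buf) == self.buf.maxlen:
            self.total -= self.buf[0]               # evict the oldest reading
        self.buf.append(kwh)
        self.total += kwh

    def predict(self):
        return self.total / len(self.buf)
```

Keeping a running total makes each update O(1), which matters when a prediction must be refreshed for every customer on every new reading.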
International Green Computing Conference | 2014
Vasileios Zois; Marc Frîncu; Charalampos Chelmis; Muhammad Rizwan Saeed; Viktor K. Prasanna
Regulating power consumption to avoid peaks in demand is a common practice. Demand Response (DR) is being used by utility providers to minimize costs or ensure system reliability. Although it has been used extensively, there is a shortage of solutions dealing with dynamic DR. Past attempts focus on minimizing the load demand without considering the sustainability of the reduced energy. In this paper an efficient algorithm is presented which solves the problem of dynamic DR scheduling. Data from the USC campus microgrid were used to evaluate the efficiency as well as the robustness of the proposed solution. The targeted energy reduction is achieved with a maximum average approximation error of ≈0.7%. Sustainability of the reduced energy is achieved with respect to the optimal available solution, with a maximum average error of less than 0.6%. It is also shown that a solution is provided at a low computational cost, fulfilling the requirements of dynamic DR.
IEEE PES Innovative Smart Grid Technologies Conference | 2015
Fabian Knirsch; Dominik Engel; Marc Frîncu; Viktor K. Prasanna
The smart grid changes the way energy is produced and distributed. In addition, both energy and information are exchanged bidirectionally among participating parties. Therefore, heterogeneous systems have to cooperate effectively in order to achieve a common high-level use case, such as smart metering for billing or demand response for load curtailment. Furthermore, a substantial amount of personal data is often needed for achieving that goal. Capturing and processing personal data in the smart grid increases customer concerns about privacy, and in addition, certain statutory and operational requirements regarding privacy-aware data processing and storage have to be met. An increase of privacy constraints, however, often limits the operational capabilities of the system. In this paper, we present an approach that automates the process of finding an optimal balance between privacy requirements and operational requirements in a smart grid use case and application scenario. This is achieved by formally describing use cases in an abstract model and by finding an algorithm that determines the optimum balance by forward-mapping privacy and operational impacts. For this optimal balancing algorithm, both a numeric approximation and, where feasible, an analytic assessment are presented and investigated. The system is evaluated by applying the tool to a real-world use case from the University of Southern California (USC) microgrid.
International Conference on Distributed Computing Systems | 2015
Alok Gautam Kumbhare; Marc Frîncu; Yogesh Simmhan; Viktor K. Prasanna
The MapReduce programming model, due to its simplicity and scalability, has become an essential tool for processing large data volumes in distributed environments. Recent Stream Processing Systems (SPS) extend this model to provide low-latency analysis of high-velocity continuous data streams. However, integrating MapReduce with streaming poses challenges: first, runtime variations in data characteristics such as data rates and key distribution cause resource overload, which in turn leads to fluctuations in the Quality of Service (QoS); and second, stateful reducers, whose state depends on the complete tuple history, necessitate efficient fault-recovery mechanisms to maintain the desired QoS in the presence of resource failures. We propose an integrated streaming MapReduce architecture leveraging the concept of consistent hashing to support runtime elasticity, along with locality-aware data and state replication to provide efficient load balancing with low-overhead fault tolerance and parallel fault recovery from multiple simultaneous failures. Our evaluation on a private cloud shows up to 2.8× improvement in peak throughput compared to the Apache Storm SPS, and a low recovery latency of 700-1500 ms from multiple failures.
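The elasticity property rests on a standard consistent-hashing construction: adding a worker reassigns only the keys that land on the new worker's ring points, so most reducer state stays put during scale-out. A generic sketch of that construction (class name, MD5 choice, and virtual-node count are mine, not the paper's implementation):

```python
import bisect
import hashlib

class ConsistentHashRing:
    """Map reducer keys to workers via consistent hashing with virtual nodes.

    Each worker owns `vnodes` points on a hash ring; a key is routed to the
    worker owning the first ring point at or after the key's hash.
    """

    def __init__(self, workers=(), vnodes=100):
        self.vnodes = vnodes
        self.ring = []                              # sorted (hash, worker) points
        for w in workers:
            self.add(w)

    def _hash(self, key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def add(self, worker):
        for i in range(self.vnodes):
            bisect.insort(self.ring, (self._hash(f"{worker}#{i}"), worker))

    def remove(self, worker):
        self.ring = [p for p in self.ring if p[1] != worker]

    def lookup(self, key):
        # First ring point clockwise from the key's hash, wrapping around.
        i = bisect.bisect(self.ring, (self._hash(key), "")) % len(self.ring)
        return self.ring[i][1]
```

After `add("w4")`, every key either keeps its previous owner or moves to `w4`; no key is shuffled between the existing workers, which is why reducer state migration stays small.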
International Parallel and Distributed Processing Symposium | 2015
Yogesh Simmhan; Neel Choudhury; Charith Wickramaarachchi; Alok Gautam Kumbhare; Marc Frîncu; Cauligi S. Raghavendra; Viktor K. Prasanna
Graphs are a key form of Big Data, and performing scalable analytics over them is invaluable to many domains. There is an emerging class of inter-connected data which accumulates or varies over time, and on which novel algorithms operating both over the network structure and across the time-variant attribute values are necessary. We formalize the notion of time-series graphs and propose a Temporally Iterative BSP programming abstraction to develop algorithms on such datasets using several design patterns. Our abstractions leverage a sub-graph centric programming model and extend it to the temporal dimension. We present three time-series graph algorithms based on these design patterns and abstractions, and analyze their performance using the GoFFish distributed platform on the Amazon AWS Cloud. Our results demonstrate the efficacy of the abstractions to develop practical time-series graph algorithms, and scale them on commodity hardware.
International Conference on Information Systems Security | 2015
Fabian Knirsch; Dominik Engel; Christian Neureiter; Marc Frîncu; Viktor K. Prasanna
In a smart grid, data and information are transported, transmitted, stored, and processed, with various stakeholders having to cooperate effectively. Furthermore, personal data is the key to many smart grid applications, and therefore privacy impacts have to be taken into account. For an effective smart grid, well-integrated solutions are crucial, and for achieving a high degree of customer acceptance, privacy should already be considered at design time of the system. To assist system engineers in the early design phase, frameworks for the automated privacy evaluation of use cases are important. For evaluation, use cases for services and software architectures need to be formally captured in a standardized and commonly understood manner. In order to ensure this common understanding for all kinds of stakeholders, reference models have recently been developed. In this paper we present a model-driven approach for the automated assessment of such services and software architectures in the smart grid that builds on the standardized reference models. The focus of the qualitative and quantitative evaluation is on privacy. For evaluation, the framework draws on use cases from the University of Southern California microgrid.
Journal of Intelligent Systems | 2017
Absalom E. Ezugwu; Nneoma A. Okoroafor; Seyed M. Buhari; Marc Frîncu; Sahalu B. Junaidu
The operational efficacy of a grid computing system depends mainly on the proper management of grid resources to carry out the various jobs that users send to the grid. The paper explores an alternative way of efficiently searching, matching, and allocating distributed grid resources to jobs in such a way that the resource demand of each grid user job is met. We propose a resource selection method based on a genetic algorithm (GA) that uses multiset-based populations. Furthermore, the paper presents a hybrid GA-based scheduling framework that efficiently searches for the best available resources for user jobs in a typical grid computing environment. For the proposed resource allocation method, additional mechanisms (multiset-based populations and adaptive matching) are introduced into the GA components to enhance their search capability in a large problem space. An empirical study is presented in order to demonstrate the importance of operator improvement over the traditional GA. The preliminary performance results show that the proposed additional operator fine-tuning is efficient in both speed and accuracy and can keep up with high job arrival rates.
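The GA-based scheduling idea can be sketched as a deliberately plain genetic algorithm over job-to-machine assignments. Everything here (function names, parameters, fitness) is illustrative; the paper's framework additionally uses multiset-based populations and adaptive matching, which are not reproduced:

```python
import random

def ga_schedule(job_lengths, n_machines, pop_size=30, generations=60, seed=1):
    """Evolve job-to-machine assignments that minimize makespan.

    Elitist selection, one-point crossover, and point mutation over
    chromosomes that map each job to a machine index.
    """
    rng = random.Random(seed)

    def makespan(assign):
        loads = [0.0] * n_machines
        for length, m in zip(job_lengths, assign):
            loads[m] += length
        return max(loads)

    # Random initial population of assignments.
    pop = [[rng.randrange(n_machines) for _ in job_lengths]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=makespan)
        elite = pop[: pop_size // 2]                # keep the better half
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, len(job_lengths))
            child = a[:cut] + b[cut:]               # one-point crossover
            if rng.random() < 0.2:                  # point mutation
                child[rng.randrange(len(child))] = rng.randrange(n_machines)
            children.append(child)
        pop = elite + children
    best = min(pop, key=makespan)
    return best, makespan(best)
```

For six jobs of lengths 4, 3, 3, 2, 2, 2 on two machines, the makespan can never drop below half the total load (8 units), which bounds any schedule the GA returns.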