Publication


Featured research published by Edmon Begoli.


Working IEEE/IFIP Conference on Software Architecture | 2012

Design Principles for Effective Knowledge Discovery from Big Data

Edmon Begoli; James Horey

The big data phenomenon refers to the practice of collecting and processing very large data sets, together with the associated systems and algorithms used to analyze these massive datasets. Architectures for big data usually range across multiple machines and clusters, and they commonly consist of multiple special-purpose sub-systems. Coupled with the knowledge discovery process, the big data movement offers many unique opportunities for organizations to benefit (with respect to new insights, business optimizations, etc.). However, due to the difficulty of analyzing such large datasets, big data presents unique systems engineering and architectural challenges. In this paper, we present three system design principles that can inform organizations on effective analytic and data collection processes, system organization, and data dissemination practices. The principles presented derive from our own research and development experiences with big data problems from various federal agencies, and we illustrate each principle with our own experiences and recommendations.


Proceedings of the WICSA/ECSA 2012 Companion Volume on | 2012

A short survey on the state of the art in architectures and platforms for large scale data analysis and knowledge discovery from data

Edmon Begoli

Intended as a survey for practicing architects and researchers seeking an overview of the state-of-the-art architectures for data analysis, this paper provides an overview of the emerging data management and analytic platforms, including parallel databases, Hadoop-based systems, High Performance Computing (HPC) platforms, and platforms popularly referred to as NoSQL platforms. Platforms are presented based on their relevance, the analysis they support, and the data organization model they support.


Proceedings of the 2015 XSEDE Conference on Scientific Advancements Enabled by Enhanced Cyberinfrastructure | 2015

Integrating Apache Spark into PBS-Based HPC Environments

Troy Baer; Paul Peltz; Junqi Yin; Edmon Begoli

This paper describes an effort at the University of Tennessee's National Institute for Computational Sciences (NICS) to integrate Apache Spark into the widely used TORQUE HPC batch environment. The similarities and differences between the execution of a Spark program and that of an MPI program on a cluster are used to motivate how to implement Spark/TORQUE integration. An implementation of this integration, pbs-spark-submit, is described, including demonstrations of functionality on two HPC clusters and a large shared-memory system.


International Conference on Big Data | 2016

Real-Time Discovery Services over Large, Heterogeneous and Complex Healthcare Datasets Using Schema-Less, Column-Oriented Methods.

Edmon Begoli; Ted Dunning; Charlie Frasure

We present a service platform for schema-less exploration of data and discovery of patient-related statistics from healthcare data sets. The architecture of this platform is motivated by the need for fast, schema-less, and flexible approaches to SQL-based exploration and discovery of information embedded in the common, heterogeneously structured healthcare data sets and supporting components (electronic health records, practice management systems, etc.). The motivating use cases described in the paper are clinical trial candidate discovery and treatment effectiveness analysis. Following the use cases, we discuss the key features and software architecture of the platform, the underlying core components (Apache Parquet, Drill, the web services server), and the runtime profiles and performance characteristics of the platform. We conclude by showing dramatic speedup with some approaches, and the performance tradeoffs and limitations of others.


International Conference on Big Data | 2016

Towards a heterogeneous, polystore-like data architecture for the US Department of Veteran Affairs (VA) enterprise analytics

Edmon Begoli; Derek Kistler; Jack Bates

The Polystore architecture revisits the federated approach to accessing and querying standalone, independent databases in a uniform and optimized fashion, but this time in the context of heterogeneous data and specialized analyses. In light of this architectural philosophy, and in light of the major data architecture development efforts at the US Department of Veterans Affairs (VA), we discuss the need for a heterogeneous data store consisting of a large relational data warehouse, an image and text datastore, and a peta-scale genomic repository. The VA's heterogeneous datastore would, to a larger or smaller degree, follow the architectural blueprint proposed by the polystore architecture. To this end, we discuss the current state of the data architecture at VA, architectural alternatives for development of the heterogeneous datastore, some relevant use cases, the anticipated challenges, and the drawbacks and benefits of adopting the polystore architecture.


Artificial Intelligence in Education | 2013

Towards an Integrative Computational Foundation for Applied Behavior Analysis in Early Autism Interventions

Edmon Begoli; Cristi Ogle; David F. Cihak; Bruce J. MacLennan

Applied Behavior Analysis-based early interventions are evidence-based, efficacious therapies for autism. They are, however, labor intensive and often inaccessible at the recommended levels. In this paper we present ongoing doctoral research aimed at development of the formal, computational representation for Applied Behavior Analysis (ABA) that could serve as a reasoning foundation for intelligent-agent mediated ABA therapies. Our approach is to formulate the representation of ABA dynamics and concepts as a process ontology expressed in a controlled natural language (CNL). As an ontology language, CNL is not only a machine-interpretable, logically sound reasoning foundation, but also understandable and editable by human users.


International Conference on Management of Data | 2018

Apache Calcite: A Foundational Framework for Optimized Query Processing Over Heterogeneous Data Sources

Edmon Begoli; Jesús Camacho-Rodríguez; Julian Hyde; Michael J. Mior; Daniel Lemire

Apache Calcite is a foundational software framework that provides query processing, optimization, and query language support to many popular open-source data processing systems such as Apache Hive, Apache Storm, Apache Flink, Druid, and MapD. The goal of this paper is to formally introduce Calcite to the broader research community, briefly present its history, and describe its architecture, features, functionality, and patterns for adoption. Calcite's architecture consists of a modular and extensible query optimizer with hundreds of built-in optimization rules, a query processor capable of processing a variety of query languages, an adapter architecture designed for extensibility, and support for heterogeneous data models and stores (relational, semi-structured, streaming, and geospatial). This flexible, embeddable, and extensible architecture is what makes Calcite an attractive choice for adoption in big-data frameworks. It is an active project that continues to introduce support for new types of data sources, query languages, and approaches to query processing and optimization.


Very Large Data Bases | 2017

An Emerging Role for Polystores in Precision Medicine

Edmon Begoli; J. Blair Christian; Vijay Gadepally; Stavros Papadopoulos

Medical data is organically heterogeneous, and it usually varies significantly in both size and composition. Yet, this data is also a key for the recent and promising field of precision medicine, which focuses on identifying and tailoring appropriate medical treatments for the needs of the individual patients, based on their specific conditions, their medical history, lifestyle, genetic, and other individual factors. As we, and a database community at large, recognize that a “one size does not fit all” solution is required to work with such data, we present our observations based on our experiences, and the applications in the field of precision medicine. We make the case for the use of polystore architecture; how it applies for precision medicine; we discuss the reference architecture; describe some of its critical components (array database); and discuss the specific types of analysis that directly benefit from this database architecture, and the ways it serves the data.


ACM Crossroads Student Magazine | 2017

The Heidelberg Laureate Forum on the moving frontier between mathematics and computer science

Edmon Begoli; Vincent Schlegel; Michael Francis Atiyah; Praise Adeyemo; Tim Baarslag

Young and early-career researchers at the 2016 Heidelberg Laureate Forum discuss how the frontier between mathematics and computer science is shifting, what the future promises, and the implications the frontier's shape and dynamics will have on both fields.


SoutheastCon | 2014

Procedural Reasoning System (PRS) architecture for agent-mediated behavioral interventions

Edmon Begoli

We present an architecture in support of intelligent agent-mediated behavioral interventions in special education programs for individuals with Autism Spectrum Disorder (ASD). The proposed architecture is a derivative of the Procedural Reasoning System (PRS) architecture with representative, interpretative, reasoning, knowledge-based, and procedural control components abstracted from the physical and locomotoric aspects of the agent's placement in the environment. The architecture is designed to serve as the unifying foundation for virtual, mixed reality and embodied implementations, so its behavior-oriented control, reasoning, knowledge base, and inference mechanisms are designed as abstractive.

Collaboration


Dive into Edmon Begoli's collaborations.

Top Co-Authors

James Horey
Oak Ridge National Laboratory

J. Blair Christian
Oak Ridge National Laboratory

Jack Bates
United States Department of Veterans Affairs

Jack C. Schryver
Oak Ridge National Laboratory

Seung-Hwan Lim
Oak Ridge National Laboratory

Ajith Jose
Missouri University of Science and Technology

Ashwin Kumar Vajantri
Oak Ridge National Laboratory

Brant Boehmann
Oak Ridge National Laboratory

Derek Kistler
Oak Ridge National Laboratory