Katarina Grolinger | Researchain

Publication

Featured research published by Katarina Grolinger.

ieee international conference on cloud computing technology and science | 2013

Data management in cloud environments: NoSQL and NewSQL data stores

Katarina Grolinger; Wilson A. Higashino; Abhinav Tiwari; Miriam A. M. Capretz

Advances in Web technology and the proliferation of mobile devices and sensors connected to the Internet have resulted in immense processing and storage requirements. Cloud computing has emerged as a paradigm that promises to meet these requirements. This work focuses on the storage aspect of cloud computing, specifically on data management in cloud environments. Traditional relational databases were designed in a different hardware and software era and are facing challenges in meeting the performance and scale requirements of Big Data. NoSQL and NewSQL data stores present themselves as alternatives that can handle huge volumes of data. Because of the large number and diversity of existing NoSQL and NewSQL solutions, it is difficult to comprehend the domain and even more challenging to choose an appropriate solution for a specific task. Therefore, this paper reviews NoSQL and NewSQL solutions with the objective of: (1) providing a perspective in the field, (2) providing guidance to practitioners and researchers to choose the appropriate data store, and (3) identifying challenges and opportunities in the field. Specifically, the most prominent solutions are compared focusing on data models, querying, scaling, and security-related capabilities. Features driving the ability to scale read requests and write requests, or scaling data storage are investigated, in particular partitioning, replication, consistency, and concurrency control. Furthermore, use cases and scenarios in which NoSQL and NewSQL data stores have been used are discussed and the suitability of various solutions for different sets of applications is examined. Consequently, this study has identified challenges in the field, including the immense diversity and inconsistency of terminologies, limited documentation, sparse comparison and benchmarking criteria, and nonexistence of standardized query languages.

world congress on services | 2014

Challenges for MapReduce in Big Data

Katarina Grolinger; Michael A. Hayes; Wilson A. Higashino; Alexandra L'Heureux; David S. Allison; Miriam A. M. Capretz

In the Big Data community, MapReduce has been seen as one of the key enabling approaches for meeting continuously increasing demands on computing resources imposed by massive data sets. The reason for this is the high scalability of the MapReduce paradigm which allows for massively parallel and distributed execution over a large number of computing nodes. This paper identifies MapReduce issues and challenges in handling Big Data with the objective of providing an overview of the field, facilitating better planning and management of Big Data projects, and identifying opportunities for future research in this field. The identified challenges are grouped into four main categories corresponding to Big Data task types: data storage (relational databases and NoSQL stores), Big Data analytics (machine learning and interactive analytics), online processing, and security and privacy. Moreover, current efforts aimed at improving and extending MapReduce to address identified challenges are presented. Consequently, by identifying issues and challenges MapReduce faces when handling Big Data, this study encourages future Big Data research.

IEEE Access | 2017

Machine Learning With Big Data: Challenges and Approaches

Alexandra L'Heureux; Katarina Grolinger; Hany F. ElYamany; Miriam A. M. Capretz

The Big Data revolution promises to transform how we live, work, and think by enabling process optimization, empowering insight discovery and improving decision making. The realization of this grand potential relies on the ability to extract value from such massive data through data analytics; machine learning is at its core because of its ability to learn from data and provide data driven insights, decisions, and predictions. However, traditional machine learning approaches were developed in a different era, and thus are based upon multiple assumptions, such as the data set fitting entirely into memory, which unfortunately no longer holds true in this new context. These broken assumptions, together with the Big Data characteristics, are creating obstacles for the traditional techniques. Consequently, this paper compiles, summarizes, and organizes machine learning challenges with Big Data. In contrast to other research that discusses challenges, this work highlights the cause–effect relationship by organizing challenges according to Big Data Vs or dimensions that instigated the issue: volume, velocity, variety, or veracity. Moreover, emerging machine learning approaches and techniques are discussed in terms of how they are capable of handling the various challenges with the ultimate objective of helping practitioners select appropriate solutions for their use cases. Finally, a matrix relating the challenges and approaches is presented. Through this process, this paper provides a perspective on the domain, identifies research gaps and opportunities, and provides a strong foundation and encouragement for further research in the field of machine learning with Big Data.

service oriented computing and applications | 2014

Integration of business process modeling and Web services: a survey

Katarina Grolinger; Miriam A. M. Capretz; Americo B. Cunha; Saïd Tazi

A significant challenge in business process automation involves bridging the gap between business process representations and Web service technologies that implement business activities. We are interested in business process representations such as Business Process Modeling Notation (BPMN) and Event-Driven Process Chains (EPCs). Web service technologies include protocols such as Simple Object Access Protocol (SOAP), architectures such as REpresentational State Transfer (RESTful), or semantic description languages and formalisms such as Web Ontology Language for Services (OWL-S) and Web Service Modeling Ontology (WSMO). This paper reviews previous work on the integration of business process representations and Web service technologies. It provides a perspective on the field by summarizing, organizing, and classifying the proposed approaches. Consequently, this study has identified opportunities for future research in the field, including the need for a generic transformation approach among arbitrary models, the need to represent mappings in a formalized way, and the necessity of a common execution framework.

workshops on enabling technologies: infrastructure for collaborative enterprises | 2013

Knowledge as a Service Framework for Disaster Data Management

Katarina Grolinger; Miriam A. M. Capretz; Emna Mezghani; Ernesto Exposito

Each year, a number of natural disasters strike across the globe, killing hundreds and causing billions of dollars in property and infrastructure damage. Minimizing the impact of disasters is imperative in today's society. As the capabilities of software and hardware evolve, so does the role of information and communication technology in disaster mitigation, preparation, response, and recovery. A large quantity of disaster-related data is available, including response plans, records of previous incidents, simulation data, social media data, and Web sites. However, current data management solutions offer few or no integration capabilities. Moreover, recent advances in cloud computing, big data, and NoSQL open the door for new solutions in disaster data management. In this paper, a Knowledge as a Service (KaaS) framework is proposed for disaster cloud data management (Disaster-CDM), with the objectives of 1) storing large amounts of disaster-related data from diverse sources, 2) facilitating search, and 3) supporting their interoperability and integration. Data are stored in a cloud environment using a combination of relational and NoSQL databases. The case study presented in this paper illustrates the use of Disaster-CDM on an example of simulation models.

Artificial Intelligence in Engineering | 1999

Autonomous agent based on reinforcement learning and adaptive shadowed network

Bojan Jerbić; Katarina Grolinger; Božo Vranješ

The planning of intelligent robot behavior plays an important role in the development of flexible automated systems. The robot’s intelligence comprises its capability to act in unpredictable and chaotic situations, which requires not just a change but the creation of the robot’s working knowledge. Planning of intelligent robot behavior addresses three main issues: finding task solutions in unknown situations, learning from experience and recognizing the similarity of problem paradigms. This article outlines a planning system which integrates the reinforcement learning method and a neural network approach with the aim to ensure autonomous robot behavior in unpredictable working conditions. The assumption is that the robot is a tabula rasa and has no knowledge of the work space structure. Initially, it has just basic strategic knowledge of searching for solutions, based on random attempts, and a built-in learning system. The reinforcement learning method is used here to evaluate robot behavior and to induce new, or improve the existing, knowledge. The acquired action (task) plan is stored as experience which can be used in solving similar future problems. To provide the recognition of problem similarities, the Adaptive Fuzzy Shadowed neural network is designed. This novel network concept with a fuzzy learning rule and shadowed hidden layer architecture enables the recognition of slightly translated or rotated patterns and does not forget already learned structures. The intelligent planning system is simulated using object-oriented techniques and verified on planned and random examples, proving the main advantages of the proposed approach: autonomous learning, which is invariant with regard to the order of training samples, and single iteration learning progress.

canadian conference on electrical and computer engineering | 2011

Federated critical infrastructure simulators: Towards ontologies for support of collaboration

Katarina Grolinger; Miriam A. M. Capretz; Adam Shypanski; Gagandeep S. Gill

Our society relies greatly on a variety of critical infrastructures (CI), such as power system networks, water distribution, oil and natural gas systems, telecommunication networks and others. Interdependency between those systems is high and may result in cascading failures spanning different infrastructures. Behavior of each CI can be observed and analyzed through the use of domain simulators, but this does not account for their interdependency. To explore CI interdependencies, domain simulators need to be integrated in a federation where they can collaborate. This paper explores three different simulators: the EPANET water distribution simulator, the PSCAD power system simulator and the I2Sim infrastructure interdependency simulator. Each simulator's modeling approach is explored, and the similarities and differences between the modeling approaches are determined. A core ontology for each simulation engine is created, as well as an initial mapping between them. Ontologies and their mapping will support collaboration of simulators by enabling exchange of information in a semantic manner.

international congress on big data | 2016

Energy Consumption Prediction with Big Data: Balancing Prediction Accuracy and Computational Resources

Katarina Grolinger; Miriam A. M. Capretz; Luke Seewald

In recent years, advances in sensor technologies and expansion of smart meters have resulted in massive growth of energy data sets. These Big Data have created new opportunities for energy prediction, but at the same time, they impose new challenges for traditional technologies. On the other hand, new approaches for handling and processing these Big Data have emerged, such as MapReduce, Spark, Storm, and Oxdata H2O. This paper explores how findings from machine learning with Big Data can benefit energy consumption prediction. An approach based on local learning with support vector regression (SVR) is presented. Although local learning itself is not a novel concept, it has great potential in the Big Data domain because it reduces computational complexity. The local SVR approach presented here is compared to traditional SVR and to deep neural networks with an H2O machine learning platform for Big Data. Local SVR outperformed both SVR and H2O deep learning in terms of prediction accuracy and computation time. Especially significant was the reduction in training time: local SVR training was an order of magnitude faster than SVR or H2O deep learning.

international conference on machine learning and applications | 2015

MLaaS: Machine Learning as a Service

Mauro Ribeiro; Katarina Grolinger; Miriam A. M. Capretz

The demand for knowledge extraction has been increasing. With the growing amount of data being generated by global data sources (e.g., social media and mobile apps) and the popularization of context-specific data (e.g., the Internet of Things), companies and researchers need to connect all these data and extract valuable information. Machine learning has been gaining much attention in data mining, leveraging the birth of new solutions. This paper proposes an architecture to create a flexible and scalable machine learning as a service. An open source solution was implemented and presented. As a case study, a forecast of electricity demand was generated using real-world sensor and weather data by running different algorithms at the same time.

international joint conference on neural network | 2016

Collective contextual anomaly detection framework for smart buildings

Daniel B. Araya; Katarina Grolinger; Hany F. ElYamany; Miriam A. M. Capretz; Girma Bitsuamlak

Buildings are responsible for a significant amount of total global energy consumption and as a result account for a substantial portion of overall carbon emissions. Moreover, buildings have a great potential for helping to meet energy efficiency targets. Hence, energy saving goals that target buildings can make a significant contribution to reducing environmental impact. Today's smart buildings achieve energy efficiency by monitoring energy usage with the aim of detecting and diagnosing abnormal energy consumption behaviour. This research proposes a generic collective contextual anomaly detection (CCAD) framework that uses a sliding window approach and integrates historic sensor data along with generated and contextual features to train an autoencoder to recognize normal consumption patterns. Subsequently, by determining a threshold that optimizes sensitivity and specificity, the framework identifies abnormal consumption behaviour. The research compares two models trained with different features using real-world data provided by Powersmiths, located in Brampton, Ontario, Canada.
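
The sliding-window and threshold steps described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `sliding_windows` and `best_threshold` are hypothetical, the per-window error scores are assumed to come from an already trained model such as an autoencoder (not shown here), and the selection criterion used, maximizing sensitivity + specificity - 1 (Youden's J), is only one plausible reading of "a threshold that optimizes sensitivity and specificity".

```python
from typing import List, Sequence, Tuple


def sliding_windows(values: Sequence[float], size: int, step: int = 1) -> List[List[float]]:
    """Split a series of readings into overlapping fixed-size windows."""
    if size <= 0 or step <= 0:
        raise ValueError("size and step must be positive")
    return [list(values[i:i + size]) for i in range(0, len(values) - size + 1, step)]


def best_threshold(errors: Sequence[float], is_anomaly: Sequence[bool]) -> Tuple[float, float, float]:
    """Pick the error threshold that maximizes sensitivity + specificity - 1.

    Assumption: a window is flagged as anomalous when its error is >= the
    threshold. Returns (threshold, sensitivity, specificity).
    """
    positives = sum(1 for a in is_anomaly if a)
    negatives = len(is_anomaly) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one normal and one anomalous example")
    best = None
    # Every distinct observed error value is a candidate threshold.
    for t in sorted(set(errors)):
        tp = sum(1 for e, a in zip(errors, is_anomaly) if a and e >= t)
        tn = sum(1 for e, a in zip(errors, is_anomaly) if not a and e < t)
        sens, spec = tp / positives, tn / negatives
        if best is None or sens + spec > best[1] + best[2]:
            best = (t, sens, spec)
    return best
```

In use, each window of readings would be scored by the trained model, and `best_threshold` would be run on labelled validation windows to fix the cut-off applied to new data.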