Miriam A. M. Capretz | Researchain

Publication

Featured research published by Miriam A. M. Capretz.

ieee international conference on cloud computing technology and science | 2013

Data management in cloud environments: NoSQL and NewSQL data stores

Katarina Grolinger; Wilson A. Higashino; Abhinav Tiwari; Miriam A. M. Capretz

Advances in Web technology and the proliferation of mobile devices and sensors connected to the Internet have resulted in immense processing and storage requirements. Cloud computing has emerged as a paradigm that promises to meet these requirements. This work focuses on the storage aspect of cloud computing, specifically on data management in cloud environments. Traditional relational databases were designed in a different hardware and software era and are facing challenges in meeting the performance and scale requirements of Big Data. NoSQL and NewSQL data stores present themselves as alternatives that can handle huge volumes of data. Because of the large number and diversity of existing NoSQL and NewSQL solutions, it is difficult to comprehend the domain and even more challenging to choose an appropriate solution for a specific task. Therefore, this paper reviews NoSQL and NewSQL solutions with the objective of: (1) providing a perspective on the field, (2) providing guidance to practitioners and researchers to choose the appropriate data store, and (3) identifying challenges and opportunities in the field. Specifically, the most prominent solutions are compared focusing on data models, querying, scaling, and security-related capabilities. Features driving the ability to scale read requests, write requests, or data storage are investigated, in particular partitioning, replication, consistency, and concurrency control. Furthermore, use cases and scenarios in which NoSQL and NewSQL data stores have been used are discussed and the suitability of various solutions for different sets of applications is examined. Consequently, this study has identified challenges in the field, including the immense diversity and inconsistency of terminologies, limited documentation, sparse comparison and benchmarking criteria, and nonexistence of standardized query languages.

world congress on services | 2014

Challenges for MapReduce in Big Data

Katarina Grolinger; Michael A. Hayes; Wilson A. Higashino; Alexandra L'Heureux; David S. Allison; Miriam A. M. Capretz

In the Big Data community, MapReduce has been seen as one of the key enabling approaches for meeting continuously increasing demands on computing resources imposed by massive data sets. The reason for this is the high scalability of the MapReduce paradigm, which allows for massively parallel and distributed execution over a large number of computing nodes. This paper identifies MapReduce issues and challenges in handling Big Data with the objective of providing an overview of the field, facilitating better planning and management of Big Data projects, and identifying opportunities for future research in this field. The identified challenges are grouped into four main categories corresponding to Big Data task types: data storage (relational databases and NoSQL stores), Big Data analytics (machine learning and interactive analytics), online processing, and security and privacy. Moreover, current efforts aimed at improving and extending MapReduce to address identified challenges are presented. Consequently, by identifying issues and challenges MapReduce faces when handling Big Data, this study encourages future Big Data research.
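
For readers unfamiliar with the paradigm the abstract refers to, the following is a minimal single-process sketch of the map, shuffle, and reduce steps using the classic word-count example. It is an illustration only, not code from the paper: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical, and a real framework would run the map and reduce calls in parallel across many nodes.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Emit one (word, 1) pair per word in a single input split."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group values by key, as a framework would between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values emitted for one key into a single count."""
    return key, sum(values)


def word_count(documents):
    # Each document stands in for an input split handled by one mapper.
    mapped = chain.from_iterable(map_phase(d) for d in documents)
    grouped = shuffle(mapped)
    # Each key stands in for the work handed to one reducer.
    return dict(reduce_phase(k, v) for k, v in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data", "Big sensor data"]))
```

Because each `map_phase` call touches only its own split and each `reduce_phase` call touches only its own key, both steps can be distributed independently, which is the source of the scalability the abstract mentions.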

international conference on web services | 2009

A Dependency Impact Analysis Model for Web Services Evolution

Shuying Wang; Miriam A. M. Capretz

As many software systems have been turned into Web services, the evolutionary changes of Web services are becoming an important issue. To understand how a change affects the services, we must ascertain the parts of the system that will be affected by the change and examine them for additional impacts. In this paper, we propose an impact analysis model based on service dependency. In particular, the service dependency graph model, service dependency and the relation matrix are examined. Based on the shift and calculation of the matrix, the dependency and impact of the service evolution can be analyzed and its quantity can be ascertained. Furthermore, we also present an approach for service change annotation and for the service evolution process. Overall, this work provides a foundation for the automatic management, control, and evaluation of service evolution.
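
The general idea of matrix-based impact analysis can be sketched as follows. This is not the paper's relation matrix calculation, which the abstract does not spell out; it is a generic illustration that assumes a 0/1 dependency matrix where `dep[i][j] == 1` means service `i` depends directly on service `j`, and uses Warshall's transitive closure to find every service reachable from a change. The service names and the helpers `transitive_closure` and `impacted_services` are hypothetical.

```python
def transitive_closure(dep):
    """Warshall's algorithm: reach[i][j] == 1 if i depends on j directly or indirectly."""
    n = len(dep)
    reach = [row[:] for row in dep]
    for k in range(n):
        for i in range(n):
            if reach[i][k]:
                for j in range(n):
                    if reach[k][j]:
                        reach[i][j] = 1
    return reach


def impacted_services(names, dep, changed):
    """Return the services that depend, directly or transitively, on `changed`."""
    reach = transitive_closure(dep)
    c = names.index(changed)
    return [names[i] for i in range(len(names)) if i != c and reach[i][c]]


if __name__ == "__main__":
    # Hypothetical services: Order uses Payment and Inventory; Shipping uses Order.
    names = ["Order", "Payment", "Inventory", "Shipping"]
    dep = [
        [0, 1, 1, 0],  # Order
        [0, 0, 0, 0],  # Payment
        [0, 0, 0, 0],  # Inventory
        [1, 0, 0, 0],  # Shipping
    ]
    affected = impacted_services(names, dep, "Payment")
    print(affected, len(affected))
```

The length of the returned list gives one simple way to quantify the impact of a change, in the spirit of the abstract's remark that the impact's quantity can be ascertained.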

conference of the industrial electronics society | 2001

Component-based software development

Luiz Fernando Capretz; Miriam A. M. Capretz; Dahai Li

Component-based software development (CBSD) strives to achieve a set of pre-built, standardized software components available to fit a specific architectural style for some application domain; the application is then assembled using these components. Component-based software reusability will be at the forefront of software development technology in the next few years. This paper describes a software life cycle that supports component-based development under an object-oriented framework. Development time versus software life cycle phases, which is an important assessment of the component-based development model put forward, is also mentioned.

Journal of Big Data | 2015

Contextual anomaly detection framework for big sensor data

Michael A. Hayes; Miriam A. M. Capretz

Detecting and processing anomalies in Big Data in real time is a difficult task. The volume and velocity of the data within many systems make it difficult for typical algorithms to scale and retain their real-time characteristics. The pervasiveness of data, combined with the problem that many existing algorithms only consider the content of the data source (e.g., a sensor reading itself) without concern for its context, leaves room for potential improvement. The proposed work defines a contextual anomaly detection framework. It is composed of two distinct steps: content detection and context detection. The content detector is used to determine anomalies in real time, while possibly, and likely, identifying false positives. The context detector is used to prune the output of the content detector, retaining only those anomalies that are both content and contextually anomalous. The context detector utilizes the concept of profiles, which are groups of similar data points generated by a multivariate clustering algorithm. The research has been evaluated against two real-world sensor datasets provided by a local company in Brampton, Canada. Additionally, the framework has been evaluated against the open-source Dodgers dataset, available at the UCI machine learning repository, and against the R statistical toolbox.
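
The two-step structure described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the content detector here is a simple z-score test against a sensor's own history, the profiles are supplied by hand rather than produced by a multivariate clustering algorithm, and the thresholds `z_max` and `tolerance` as well as all function and sensor names are hypothetical.

```python
from statistics import mean, pstdev


def content_anomaly(history, value, z_max=3.0):
    """Step 1 (assumed z-score test): flag a reading far from the sensor's own history."""
    mu = mean(history)
    sigma = pstdev(history)
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > z_max


def context_anomaly(peer_values, value, tolerance=5.0):
    """Step 2 (assumed rule): the reading also disagrees with its profile's peers."""
    return abs(value - mean(peer_values)) > tolerance


def detect(sensor, value, histories, profiles, latest):
    """Report an anomaly only if both the content and context checks agree."""
    if not content_anomaly(histories[sensor], value):
        return False
    # profiles maps a sensor to the members of its (pre-computed) profile.
    peer_values = [latest[s] for s in profiles[sensor] if s != sensor]
    if not peer_values:
        return True  # no context available, keep the content anomaly
    return context_anomaly(peer_values, value)


if __name__ == "__main__":
    histories = {"a": [20.0, 21.0, 20.5, 19.5, 20.0]}
    profiles = {"a": ["a", "b", "c"]}
    # Peers also read about 30, so the spike on "a" is pruned as a false positive.
    print(detect("a", 30.0, histories, profiles, {"b": 29.5, "c": 30.5}))
    # Peers stay near 20, so the spike on "a" is kept as a contextual anomaly.
    print(detect("a", 30.0, histories, profiles, {"b": 20.0, "c": 20.5}))
```

The cheap content check runs on every reading, while the context check runs only on the readings it flags, which mirrors the abstract's split between a real-time detector and a pruning step.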

IEEE Access | 2017

Machine Learning With Big Data: Challenges and Approaches

Alexandra L'Heureux; Katarina Grolinger; Hany F. ElYamany; Miriam A. M. Capretz

The Big Data revolution promises to transform how we live, work, and think by enabling process optimization, empowering insight discovery and improving decision making. The realization of this grand potential relies on the ability to extract value from such massive data through data analytics; machine learning is at its core because of its ability to learn from data and provide data-driven insights, decisions, and predictions. However, traditional machine learning approaches were developed in a different era, and thus are based upon multiple assumptions, such as the data set fitting entirely into memory, which unfortunately no longer hold true in this new context. These broken assumptions, together with the Big Data characteristics, are creating obstacles for the traditional techniques. Consequently, this paper compiles, summarizes, and organizes machine learning challenges with Big Data. In contrast to other research that discusses challenges, this work highlights the cause–effect relationship by organizing challenges according to Big Data Vs or dimensions that instigated the issue: volume, velocity, variety, or veracity. Moreover, emerging machine learning approaches and techniques are discussed in terms of how they are capable of handling the various challenges with the ultimate objective of helping practitioners select appropriate solutions for their use cases. Finally, a matrix relating the challenges and approaches is presented. Through this process, this paper provides a perspective on the domain, identifies research gaps and opportunities, and provides a strong foundation and encouragement for further research in the field of machine learning with Big Data.

service oriented computing and applications | 2014

Integration of business process modeling and Web services: a survey

Katarina Grolinger; Miriam A. M. Capretz; Americo B. Cunha; Saïd Tazi

A significant challenge in business process automation involves bridging the gap between business process representations and Web service technologies that implement business activities. We are interested in business process representations such as Business Process Modeling Notation (BPMN) and Event-Driven Process Chains (EPCs). Web service technologies include protocols such as Simple Object Access Protocol (SOAP), architectures such as REpresentational State Transfer (RESTful), or semantic description languages and formalisms such as Web Ontology Language for Services (OWL-S) and Web Service Modeling Ontology (WSMO). This paper reviews previous work on the integration of business process representations and Web service technologies. It provides a perspective on the field by summarizing, organizing, and classifying the proposed approaches. Consequently, this study has identified opportunities for future research in the field, including the need for a generic transformation approach among arbitrary models, the need to represent mappings in a formalized way, and the necessity of a common execution framework.

international congress on big data | 2014

Contextual Anomaly Detection in Big Sensor Data

Michael A. Hayes; Miriam A. M. Capretz

Performing predictive modelling, such as anomaly detection, in Big Data is a difficult task. This problem is compounded as more and more sources of Big Data are generated from environmental sensors, logging applications, and the Internet of Things. Further, most current techniques for anomaly detection only consider the content of the data source, i.e. the data itself, without concern for the context of the data. As data becomes more complex it is increasingly important to bias anomaly detection techniques for the context, whether it is spatial, temporal, or semantic. The work proposed in this paper outlines a contextual anomaly detection technique for use in streaming sensor networks. The technique uses a well-defined content anomaly detection algorithm for real-time point anomaly detection. Additionally, we present a post-processing context-aware anomaly detection algorithm based on sensor profiles, which are groups of contextually similar sensors generated by a multivariate clustering algorithm. Our proposed research has been implemented and evaluated with real-world data provided by Powersmiths, located in Brampton, Ontario, Canada.

workshops on enabling technologies: infrastructure for collaborative enterprises | 2013

Knowledge as a Service Framework for Disaster Data Management

Katarina Grolinger; Miriam A. M. Capretz; Emna Mezghani; Ernesto Exposito

Each year, a number of natural disasters strike across the globe, killing hundreds and causing billions of dollars in property and infrastructure damage. Minimizing the impact of disasters is imperative in today's society. As the capabilities of software and hardware evolve, so does the role of information and communication technology in disaster mitigation, preparation, response, and recovery. A large quantity of disaster-related data is available, including response plans, records of previous incidents, simulation data, social media data, and Web sites. However, current data management solutions offer few or no integration capabilities. Moreover, recent advances in cloud computing, big data, and NoSQL open the door for new solutions in disaster data management. In this paper, a Knowledge as a Service (KaaS) framework is proposed for disaster cloud data management (Disaster-CDM), with the objectives of 1) storing large amounts of disaster-related data from diverse sources, 2) facilitating search, and 3) supporting their interoperability and integration. Data are stored in a cloud environment using a combination of relational and NoSQL databases. The case study presented in this paper illustrates the use of Disaster-CDM on an example of simulation models.

international multi-conference on computing in global information technology | 2010

Online Trust: Definition and Principles

Zainab M. Aljazzaf; Mark Perry; Miriam A. M. Capretz

Trust is as significant a factor for successful online interactions as it is in offline communities. Trust is an important factor in predicting the behaviour of an entity and a criterion for entity selection. Most trust studies have focused on trust establishment without identifying and considering the main trust definition components and trust principles. This paper explores trust in the offline and the online world to extract important trust definition components and trust principles. The trust definition and principles are presented, which form a basis that should be followed to establish trust online.
