Manuel Martin Salvador

Archive Network Publication Hotspot Collaboration

Network

Latest external collaboration on country level. Dive into details by clicking on the dots.

Explore More

Hotspot

Dive into the research topics where Manuel Martin Salvador is active.

Explore More

Publication

Featured researches published by Manuel Martin Salvador.

hybrid artificial intelligence systems | 2016

Towards Automatic Composition of Multicomponent Predictive Systems

Manuel Martin Salvador; Marcin Budka; Bogdan Gabrys

Automatic composition and parametrisation of multicomponent predictive systems (MCPSs) consisting of chains of data transformation steps is a challenging task. In this paper we propose and describe an extension to the Auto-WEKA software which now allows to compose and optimise such flexible MCPSs by using a sequence of WEKA methods. In the experimental analysis we focus on examining the impact of significantly extending the search space by incorporating additional hyperparameters of the models, on the quality of the found solutions. In a range of extensive experiments three different optimisation strategies are used to automatically compose MCPSs on 21 publicly available datasets. A comparison with previous work indicates that extending the search space improves the classification accuracy in the majority of the cases. The diversity of the found MCPSs are also an indication that fully and automatically exploiting different combinations of data cleaning and preprocessing techniques is possible and highly beneficial for different predictive models. This can have a big impact on high quality predictive models development, maintenance and scalability aspects needed in modern application and deployment scenarios.

intelligent data analysis | 2014

From Sensor Readings to Predictions: On the Process of Developing Practical Soft Sensors

Marcin Budka; Mark Eastwood; Bogdan Gabrys; Petr Kadlec; Manuel Martin Salvador; Stephanie Schwan; Athanasios Tsakonas; Indrė Žliobaitė

Automatic data acquisition systems provide large amounts of streaming data generated by physical sensors. This data forms an input to computational models (soft sensors) routinely used for monitoring and control of industrial processes, traffic patterns, environment and natural hazards, and many more. The majority of these models assume that the data comes in a cleaned and pre-processed form, ready to be fed directly into a predictive model. In practice, to ensure appropriate data quality, most of the modelling efforts concentrate on preparing data from raw sensor readings to be used as model inputs. This study analyzes the process of data preparation for predictive models with streaming sensor data. We present the challenges of data preparation as a four-step process, identify the key challenges in each step, and provide recommendations for handling these issues. The discussion is focused on the approaches that are less commonly used, while, based on our experience, may contribute particularly well to solving practical soft sensor tasks. Our arguments are illustrated with a case study in the chemical production industry.

Procedia Computer Science | 2014

Online Detection of Shutdown Periods in Chemical Plants: A Case Study

Manuel Martin Salvador; Bogdan Gabrys; Indrė Žliobaitė

In process industry, chemical processes are controlled and monitored by using readings from multiple physical sensors across the plants. Such physical sensors are also supplemented by soft sensors, i.e. adaptive predictive models, which are often used for computing hard-to-measure variables of the process. For soft sensors to work well and adapt to changing operating conditions they need to be provided with relevant data. As production plants are regularly stopped, data instances generated during shutdown periods have to be identified to avoid updating these predictive models with wrong data. We present a case study concerned with a large chemical plant operation over a 2 years period. The task is to robustly and accurately identify the shutdown periods even in case of multiple sensor failures. State-of-the-art methods were evaluated using the first half of the dataset for calibration purposes and the other half for measuring the performance. Results show that shutdowns (i.e. sudden changes) can be quickly detected in any case but the detection delay of startups (i.e. gradual changes) is directly related with the choice of a window size.

Proceedings of the Thirty-second SGAI International Conference on Innovative Techniques and Applications of Artificial Intelligence | 2012

eRules: A Modular Adaptive Classification Rule Learning Algorithm for Data Streams

Frederic T. Stahl; Mohamed Medhat Gaber; Manuel Martin Salvador

Advances in hardware and software in the past decade allow to capture, record and process fast data streams at a large scale. The research area of data stream mining has emerged as a consequence from these advances in order to cope with the real time analysis of potentially large and changing data streams. Examples of data streams include Google searches, credit card transactions, telemetric data and data of continuous chemical production processes. In some cases the data can be processed in batches by traditional data mining approaches. However, in some applications it is required to analyse the data in real time as soon as it is being captured. Such cases are for example if the data stream is infinite, fast changing, or simply too large in size to be stored. One of the most important data mining techniques on data streams is classification. This involves training the classifier on the data stream in real time and adapting it to concept drifts. Most data stream classifiers are based on decision trees. However, it is well known in the data mining community that there is no single optimal algorithm. An algorithm may work well on one or several datasets but badly on others. This paper introduces eRules, a new rule based adaptive classifier for data streams, based on an evolving set of Rules. eRules induces a set of rules that is constantly evaluated and adapted to changes in the data stream by adding new and removing old rules. It is different from the more popular decision tree based classifiers as it tends to leave data instances rather unclassified than forcing a classification that could be wrong. The ongoing development of eRules aims to improve its accuracy further through dynamic parameter setting which will also address the problem of changing feature domain values

Procedia Computer Science | 2016

Effects of Change Propagation Resulting from Adaptive Preprocessing in Multicomponent Predictive Systems

Manuel Martin Salvador; Marcin Budka; Bogdan Gabrys

Predictive modelling is a complex process that requires a number of steps to transform raw data into predictions. Preprocessing of the input data is a key step in such process, and the selection of proper preprocessing methods is often a labour intensive task. Such methods are usually trained offline and their parameters remain fixed during the whole model deployment lifetime. However, preprocessing of non-stationary data streams is more challenging since the lack of adaptation of such preprocessing methods may degrade system performance. In addition, dependencies between different predictive system components make the adaptation process more challenging. In this paper we discuss the effects of change propagation resulting from using adaptive preprocessing in a Multicomponent Predictive System (MCPS). To highlight various issues we present four scenarios with different levels of adaptation. A number of experiments have been performed with a range of datasets to compare the prediction error in all four scenarios. Results show that well managed adaptation considerably improves the prediction performance. However, the model can become inconsistent if adaptation in one component is not correctly propagated throughout the rest of system components. Sometimes, such inconsistency may not cause an obvious deterioration in the system performance, therefore being difficult to detect. In some other cases it may even lead to a system failure as was observed in our experiments.

international conference on machine learning | 2016