Frank Rosenthal
Dresden University of Technology
Publications
Featured research published by Frank Rosenthal.
Datenbank-Spektrum | 2013
Ulrike Fischer; Lars Dannecker; Laurynas Siksnys; Frank Rosenthal; Matthias Boehm; Wolfgang Lehner
Integrating sophisticated statistical methods into database management systems is gaining increasing attention in research and industry as a way to cope with growing data volumes and increasingly complex analytical algorithms. One important statistical method is time series forecasting, which is crucial for decision-making processes in many domains. The deep integration of time series forecasting offers additional advanced functionality within a DBMS. More importantly, however, it allows for optimizations that improve the efficiency, consistency, and transparency of the overall forecasting process. To enable efficient integrated forecasting, we propose to enhance the traditional three-layer ANSI/SPARC architecture of a DBMS with forecasting functionality. This article gives a general overview of our proposed enhancements and shows how forecast queries can be processed, using an example from the energy data management domain. We conclude with open research topics and challenges that arise in this area.
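As a rough illustration of the contrast the article draws, the sketch below mocks the conventional workflow, exporting a series from the database and forecasting it externally, and notes what a declarative alternative could look like. The FORECAST syntax in the closing comment and the toy seasonal model are assumptions for illustration, not the article's actual proposal.

```python
# A minimal sketch of the export-then-forecast workflow that integrated
# forecasting avoids. The data, model, and FORECAST syntax are illustrative.
import sqlite3
import numpy as np

# Manual workflow step 1: export the series from the DBMS.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE energy_load (hour INTEGER, mw REAL)")
conn.executemany("INSERT INTO energy_load VALUES (?, ?)",
                 [(h, float(500 + 50 * np.sin(h / 24 * 2 * np.pi)))
                  for h in range(168)])
series = np.array([row[0] for row in
                   conn.execute("SELECT mw FROM energy_load ORDER BY hour")])

# Manual workflow step 2: fit and apply a model outside the database.
# A naive seasonal repeat stands in for a sophisticated statistical method.
horizon = 24
forecast = series[-horizon:]

# An integrated system could instead accept something like (hypothetical):
#   SELECT hour, mw FROM FORECAST(energy_load, horizon => 24)
# and handle model choice, fitting, and maintenance internally.
print(forecast[:4])
```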
International Conference on Data Engineering | 2009
Peter Benjamin Volk; Frank Rosenthal; Martin Hahmann; Dirk Habich; Wolfgang Lehner
The topic of managing uncertain data has been explored in many ways, and different methodologies for data storage and query processing have been proposed. As the availability of management systems grows, research on analytics over uncertain data is gaining importance. Similar to the challenges faced in data management, mining algorithms for uncertain data suffer a substantial performance penalty compared to their counterparts for certain data. To overcome this degradation, the MCDB approach was developed for uncertain data management based on the possible-worlds scenario. Since this methodology shows significant performance and scalability gains, we adopt it for mining uncertain data. In this paper, we introduce a clustering methodology for uncertain data and illustrate current issues with this approach within the field of clustering uncertain data.
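A minimal sketch of the possible-worlds idea applied to clustering, under assumed Gaussian attribute uncertainty: sample concrete worlds, cluster each with k-means, and aggregate pairwise co-assignments. The distributions and the aggregation rule are illustrative, not the paper's exact methodology.

```python
# Possible-world clustering sketch in the spirit of MCDB: every Monte Carlo
# sample yields one concrete "world", which is clustered independently.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
means = np.array([[0, 0], [5, 5], [0, 5], [5, 0]])  # uncertain tuples: mean positions
stddev = 0.8                                        # assumed Gaussian uncertainty
n_worlds, k = 20, 2

co = np.zeros((len(means), len(means)))
for _ in range(n_worlds):
    world = means + rng.normal(scale=stddev, size=means.shape)  # one possible world
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(world)
    co += (labels[:, None] == labels[None, :])      # co-assignment counts

co /= n_worlds  # fraction of worlds in which two tuples cluster together
print(np.round(co, 2))
```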
International Conference on Data Engineering | 2012
Ulrike Fischer; Frank Rosenthal; Wolfgang Lehner
Forecasts are important to decision-making and risk assessment in many domains. Since current database systems do not provide integrated support for forecasting, it is usually done outside the database system by specially trained experts using forecast models. However, integrating model-based forecasting as a first-class citizen inside a DBMS speeds up the forecasting process by avoiding data export and by applying database-related optimizations such as reusing created forecast models. In particular, it allows subsequent processing of forecast results inside the database. In this demo, we present our prototype F2DB, based on PostgreSQL, which allows for transparent processing of forecast queries. Our system automatically takes care of model maintenance when the underlying dataset changes. In addition, we offer optimizations that save maintenance costs and increase accuracy by using derivation schemes for multidimensional data. Our approach reduces the required expert knowledge by enabling arbitrary users to apply forecasting in a declarative way.
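The sketch below illustrates the benefit the demo targets: forecast results that can be processed further with plain SQL instead of leaving the database. The trivial moving-average model and the table names are stand-ins; F2DB's actual query syntax and models are not reproduced here.

```python
# Hedged sketch: keep forecast values inside the database so downstream
# analysis stays declarative. Schema and model are illustrative assumptions.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (month INTEGER, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [(m, 100.0 + 3 * m) for m in range(1, 25)])

history = [r[0] for r in conn.execute("SELECT amount FROM sales ORDER BY month")]
level = sum(history[-3:]) / 3  # toy model: 3-month moving average

conn.execute("CREATE TABLE sales_forecast (month INTEGER, amount REAL)")
conn.executemany("INSERT INTO sales_forecast VALUES (?, ?)",
                 [(24 + h, level) for h in range(1, 7)])

# Subsequent processing stays inside the database: e.g. flag forecast months
# whose predicted amount exceeds a budget threshold.
over = conn.execute("SELECT month FROM sales_forecast WHERE amount > 165").fetchall()
print(over)
```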
Computer Information Systems and Industrial Management Applications | 2007
Christin Groba; Sebastian Cech; Frank Rosenthal; Andreas Gossling
Predictive maintenance is a promising maintenance strategy. However, existing solutions are isolated from enterprise systems and limited to specific applications. A predictive maintenance framework that integrates the diversity of existing techniques for equipment failure prediction and that incorporates data from both the machine level and the upper enterprise level is still missing. We envision the development of a predictive maintenance framework that is characterized by a high degree of automation and the possibility to use state-of-the-art prediction methods. We attempt to create an open architecture that enables third-party suppliers to integrate their specialized prediction components into our framework. In this paper we analyze the requirements and introduce the initial architecture of such a predictive maintenance framework, which is being realized in a joint project with SAP Research.
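One way to picture the open architecture described above is a small plug-in contract that third-party prediction components implement and register against. The interface and class names below are assumptions for illustration, not the project's actual API.

```python
# Hypothetical plug-in interface sketch for third-party failure predictors.
from abc import ABC, abstractmethod
from typing import Sequence

class FailurePredictor(ABC):
    """Contract a third-party prediction component would implement."""

    @abstractmethod
    def train(self, sensor_history: Sequence[float]) -> None: ...

    @abstractmethod
    def failure_probability(self, recent: Sequence[float]) -> float: ...

class ThresholdPredictor(FailurePredictor):
    """Trivial built-in component: flags drift above the training mean."""

    def train(self, sensor_history):
        self.mean = sum(sensor_history) / len(sensor_history)

    def failure_probability(self, recent):
        excess = max(0.0, (sum(recent) / len(recent) - self.mean) / self.mean)
        return min(1.0, excess)  # crude probability proxy

# The framework would keep a registry of pluggable components.
registry: dict[str, FailurePredictor] = {"threshold": ThresholdPredictor()}
registry["threshold"].train([10.0, 11.0, 10.5, 9.5])
print(registry["threshold"].failure_probability([12.0, 12.5]))
```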
Intelligent Data Analysis | 2009
Martin Hahmann; Peter Benjamin Volk; Frank Rosenthal; Dirk Habich; Wolfgang Lehner
One of the most important and challenging questions in the area of clustering is how to choose the best-fitting algorithm and parameterization to obtain an optimal clustering for the considered data. The clustering aggregation concept tries to bypass this problem by generating a set of separate, heterogeneous partitionings of the same data set, from which an aggregate clustering is derived. As of now, almost every existing aggregation approach combines given crisp clusterings on the basis of pair-wise similarities. In this paper, we consider an input set of soft clusterings and show that it contains additional information that can be efficiently used for the aggregation. Our approach introduces an expansion of the mentioned pair-wise similarities, allowing control and adjustment of the aggregation process and its result. Our experiments show that our flexible approach offers adaptive results, improved identification of structures, and high usability.
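A minimal sketch of how soft memberships can enrich pairwise similarities: instead of crisp 0/1 co-assignments, the similarity of two objects is the dot product of their membership vectors, averaged over the input clusterings. This is one illustrative reading of expanded pairwise similarities, not the paper's exact scheme.

```python
# Soft co-association sketch: membership dot products replace crisp votes.
import numpy as np

# Two soft clusterings of 4 objects over 2 clusters (rows sum to 1).
P1 = np.array([[0.9, 0.1], [0.8, 0.2], [0.2, 0.8], [0.1, 0.9]])
P2 = np.array([[0.7, 0.3], [0.6, 0.4], [0.4, 0.6], [0.2, 0.8]])

similarity = (P1 @ P1.T + P2 @ P2.T) / 2  # soft co-association matrix

# Naive cut against object 0, standing in for a proper similarity-based
# clusterer applied to the aggregate matrix.
labels = (similarity[:, 0] > 0.5).astype(int)
print(np.round(similarity, 2), labels)
```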
International Database Engineering and Applications Symposium | 2010
Ulrike Fischer; Frank Rosenthal; Matthias Boehm; Wolfgang Lehner
Forecasts are important to decision-making and risk assessment in many domains. There has been recent interest in integrating forecast queries inside a DBMS. Answering a forecast query requires the creation of forecast models. Creating a forecast model is an expensive process and may require several scans over the base data as well as expensive operations to estimate model parameters. However, if forecast queries are issued repeatedly, answer times can be reduced significantly if forecast models are reused. Due to the possibly high number of forecast queries, existing models need to be found quickly. Therefore, we propose a model index that efficiently stores forecast models and allows for the efficient reuse of existing ones. Our experiments illustrate that the model index imposes negligible overhead on update transactions while yielding significant improvements during query execution.
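The sketch below captures the core idea with a plain dictionary: key models by the query properties that determine model identity, so a repeated forecast query hits an existing model instead of re-estimating one. The key structure and the stand-in linear-trend model are assumptions; the paper's index is a dedicated database structure, not a hash map.

```python
# Model-index sketch: cache fitted models keyed by query properties.
import numpy as np

model_index: dict[tuple, np.poly1d] = {}  # (table, column, granularity) -> model

def forecast(table, column, granularity, series, horizon):
    key = (table, column, granularity)
    model = model_index.get(key)
    if model is None:                                    # miss: expensive estimation
        t = np.arange(len(series))
        model = np.poly1d(np.polyfit(t, series, deg=1))  # stand-in linear trend
        model_index[key] = model
    t_future = np.arange(len(series), len(series) + horizon)
    return model(t_future)

series = np.array([10.0, 12.0, 14.5, 16.0, 18.2])
print(forecast("sales", "amount", "month", series, 3))  # builds the model
print(forecast("sales", "amount", "month", series, 3))  # reuses it
```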
International Database Engineering and Applications Symposium | 2012
Ulrike Fischer; Frank Rosenthal; Wolfgang Lehner
Time series forecasting is challenging as sophisticated forecast models are computationally expensive to build. Recent research has addressed the integration of forecasting inside a DBMS. One main benefit is that models can be created once and then repeatedly used to answer forecast queries. Often forecast queries are submitted on higher aggregation levels, e.g., forecasts of sales over all locations. To answer such a forecast query, we have two possibilities. First, we can aggregate all base time series (sales in Austria, sales in Belgium...) and create only one model for the aggregate time series. Second, we can create models for all base time series and aggregate the base forecast values. The second possibility might lead to higher accuracy, but it is usually too expensive due to the high number of base time series. However, we actually do not need all base models to achieve a high accuracy; a sample of base models is enough. With this approach, we still achieve better accuracy than an aggregate model, close to that of using all models, but fewer models need to be created and maintained in the database. We further improve this approach when new actual values of the base time series arrive at different points in time: with each new actual value we can refine the aggregate forecast and eventually converge towards the real actual value. Our experimental evaluation using several real-world data sets shows a high accuracy of our approaches and a fast convergence towards the optimal value with increasing sample size and increasing number of actual values, respectively.
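The two strategies from the abstract can be contrasted on synthetic data, as in the sketch below: one model on the aggregate series versus summing forecasts from a sample of base models, scaled up to compensate for the unsampled series. The linear stand-in model and the simple scaling rule are illustrative assumptions.

```python
# Aggregate-model vs. sampled-base-models sketch on synthetic series.
import numpy as np

rng = np.random.default_rng(1)
n_series, n_points = 50, 60
trends = rng.uniform(0.5, 2.0, n_series)
base = trends[:, None] * np.arange(n_points) + rng.normal(0, 3, (n_series, n_points))

def linear_forecast(y, horizon=1):
    t = np.arange(len(y))
    slope, intercept = np.polyfit(t, y, 1)
    return slope * (len(y) + horizon - 1) + intercept

train, actual = base[:, :-1], base[:, -1].sum()

# Strategy 1: a single model on the aggregate time series.
agg_fc = linear_forecast(train.sum(axis=0))

# Strategy 2: models on a sample of base series, scaled to the full population.
sample = rng.choice(n_series, size=10, replace=False)
sample_fc = sum(linear_forecast(train[i]) for i in sample) * n_series / len(sample)

print(f"actual={actual:.1f} aggregate={agg_fc:.1f} sampled={sample_fc:.1f}")
```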
Statistical and Scientific Database Management | 2011
Bernhard Jaecksch; Franz Faerber; Frank Rosenthal; Wolfgang Lehner
Domain-specific query languages (DSQL) let users express custom business logic. Relational databases provide only limited options for executing business logic: usually stored procedures or a series of queries with some glue code. Both methods have drawbacks, and business logic is often still executed on the application side, requiring expensive transfers of large amounts of data between application and database. We translate a DSQL into a hybrid data-flow execution plan containing relational operators mixed with procedural ones. A cost model is used to drive the translation towards an optimal mixture of relational and procedural plan operators.
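A toy sketch of the cost-based decision such a translator makes: for each operation, pick a relational (set-oriented) or procedural implementation by comparing estimated costs. The cost constants and operator names are made-up assumptions, not the paper's cost model.

```python
# Hybrid plan sketch: choose operator flavor per step via a simple cost model.
REL_COST_PER_ROW, PROC_SETUP, PROC_COST_PER_ROW = 1.0, 50.0, 0.4

def plan_operator(op_name, est_rows):
    relational = REL_COST_PER_ROW * est_rows
    procedural = PROC_SETUP + PROC_COST_PER_ROW * est_rows
    choice = "relational" if relational <= procedural else "procedural"
    return (op_name, choice, min(relational, procedural))

# A hybrid plan mixes both operator kinds depending on cardinality estimates.
plan = [plan_operator("filter", 30), plan_operator("custom_depreciation", 500)]
print(plan)  # small input -> relational; large loop-heavy step -> procedural
```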
International Workshop on Factory Communication Systems | 2010
Jakob Krause; Sebastian Cech; Frank Rosenthal; Andreas Gossling; Christin Groba; Volodymyr Vasyutynskyy
Predictive maintenance is an important strategy to increase overall productivity in a technology-pervaded domain like manufacturing. However, existing approaches cannot be applied in a straightforward manner because they insufficiently support the multitude of diverse machines on the shop floor and provide only limited automation for failure prediction. In this paper, we identify seamless integration, heterogeneity, and flexibility as the main challenges when applying predictive maintenance in a factory-wide setting. We propose an architecture for a predictive maintenance framework. We implemented the framework and built a prototype that estimates the probability of breakdowns based on power consumption.
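The final step, estimating a breakdown probability from power consumption, could look like the sketch below. The feature choice (mean draw and variance per window) and the logistic model are assumptions; the paper does not prescribe this method.

```python
# Illustrative breakdown-probability estimator from power-consumption features.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)
# Features per machine window: mean power draw (kW) and power variance.
healthy = np.column_stack([rng.normal(4.0, 0.3, 100), rng.normal(0.2, 0.05, 100)])
failing = np.column_stack([rng.normal(5.5, 0.5, 100), rng.normal(0.6, 0.1, 100)])
X = np.vstack([healthy, failing])
y = np.array([0] * 100 + [1] * 100)  # 1 = a breakdown followed this window

clf = LogisticRegression().fit(X, y)
new_window = np.array([[5.2, 0.5]])  # elevated draw and variance
print(clf.predict_proba(new_window)[0, 1])  # estimated breakdown probability
```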
Statistical and Scientific Database Management | 2011
Frank Rosenthal; Wolfgang Lehner
Forecasting is an important analysis task, and there is a need to integrate time series models and estimation methods into database systems. The main issue is the computationally expensive maintenance of model parameters when new data is inserted. In this paper, we examine how an important class of time series models, the AutoRegressive Integrated Moving Average (ARIMA) models, can be maintained with respect to inserts. To this end, we propose a novel approach, on-demand estimation, for the efficient maintenance of maximum likelihood estimates from numerically implemented estimators. We present an extensive experimental evaluation on both real and synthetic data, which shows that our approach yields a substantial speedup while sacrificing only a limited amount of predictive accuracy.
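A hedged sketch of the on-demand idea using statsmodels, which exposes exactly this cheap-append versus full-refit distinction: new observations are absorbed under the existing parameter estimates, and full maximum likelihood estimation runs only occasionally. The refit-every-25-inserts policy is an illustrative assumption, not the paper's criterion.

```python
# On-demand estimation sketch for ARIMA maintenance under inserts.
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(3)
y = np.cumsum(rng.normal(0.5, 1.0, 200))      # synthetic integrated series

res = ARIMA(y, order=(1, 1, 1)).fit()         # expensive: full ML estimation
inserted_since_fit = 0

for new_obs in np.cumsum(rng.normal(0.5, 1.0, 30)) + y[-1]:
    res = res.append([new_obs], refit=False)  # cheap: keep old parameters
    inserted_since_fit += 1
    if inserted_since_fit >= 25:              # on-demand: refit occasionally
        res = ARIMA(np.asarray(res.model.endog).ravel(), order=(1, 1, 1)).fit()
        inserted_since_fit = 0

print(res.forecast(5))
```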