Mikolaj Morzy

Poznań University of Technology

Publication

Featured research published by Mikolaj Morzy.

machine learning and data mining in pattern recognition | 2007

Mining Frequent Trajectories of Moving Objects for Location Prediction

Mikolaj Morzy

Advances in wireless and mobile technology flood us with amounts of moving object data that preclude all means of manual data processing. The volume of data gathered from position sensors of mobile phones, PDAs, or vehicles, defies human ability to analyze the stream of input data. On the other hand, vast amounts of gathered data hide interesting and valuable knowledge patterns describing the behavior of moving objects. Thus, new algorithms for mining moving object data are required to unearth this knowledge. An important function of the mobile objects management system is the prediction of the unknown location of an object. In this paper we introduce a data mining approach to the problem of predicting the location of a moving object. We mine the database of moving object locations to discover frequent trajectories and movement rules. Then, we match the trajectory of a moving object with the database of movement rules to build a probabilistic model of object location. Experimental evaluation of the proposal reveals prediction accuracy close to 80%. Our original contribution includes the elaboration on the location prediction model, the design of an efficient mining algorithm, introduction of movement rule matching strategies, and a thorough experimental evaluation of the proposed model.
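
The rule-matching idea in this abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes trajectories are already discretized into sequences of cell labels, and the names `mine_rules`, `predict`, `prefix_len`, and `min_support` are hypothetical. It counts how often a short prefix of cells is followed by each next cell, keeps the frequent cases as movement rules with a confidence value, and predicts the most confident successor.

```python
from collections import Counter

def mine_rules(trajectories, prefix_len=2, min_support=2):
    """Build movement rules of the form prefix -> next cell.

    A rule is kept only if it occurs at least min_support times;
    its confidence is count(prefix, next) / count(prefix).
    """
    prefix_counts = Counter()
    rule_counts = Counter()
    for traj in trajectories:
        for i in range(len(traj) - prefix_len):
            prefix = tuple(traj[i:i + prefix_len])
            nxt = traj[i + prefix_len]
            prefix_counts[prefix] += 1
            rule_counts[(prefix, nxt)] += 1
    rules = {}
    for (prefix, nxt), count in rule_counts.items():
        if count >= min_support:
            rules.setdefault(prefix, {})[nxt] = count / prefix_counts[prefix]
    return rules

def predict(rules, recent, prefix_len=2):
    """Return the most confident next cell for the recent cells, or None."""
    candidates = rules.get(tuple(recent[-prefix_len:]))
    if not candidates:
        return None
    return max(candidates, key=candidates.get)

# Toy data (made up for illustration): each trajectory is a list of cell labels.
trajectories = [
    ["A", "B", "C"],
    ["A", "B", "C"],
    ["A", "B", "D"],
    ["B", "C", "E"],
]
rules = mine_rules(trajectories)
prediction = predict(rules, ["X", "A", "B"])
```

On this toy data the only rule that survives the support threshold is (A, B) -> C with confidence 2/3, so `prediction` is `"C"`; an object whose recent cells match no rule gets `None`.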

workshop on internet and network economics | 2006

The sound of silence: mining implicit feedbacks to compute reputation

Mikolaj Morzy; Adam Wierzbicki

A reliable mechanism for scoring the reputation of sellers is crucial for the development of a successful environment for customer-to-customer e-commerce. Unfortunately, most C2C environments utilize simple feedback-based reputation systems that not only do not offer sufficient protection from fraud, but tend to overestimate the reputation of sellers by introducing a strong bias toward maximizing the volume of sales at the expense of the quality of service. In this paper we present a method that avoids the unfavorable phenomenon of overestimating the reputation of sellers by using implicit feedbacks. We introduce the notion of an implicit feedback and we propose two strategies for discovering implicit feedbacks. We perform a twofold evaluation of our proposal. To demonstrate the existence of the implicit feedback and to propose an advanced method of implicit feedback discovery we conduct experiments on a large volume of real-world data acquired from an online auction site. Next, a game-theoretic approach is presented that uses simulation to show that the use of the implicit feedback can improve a simple reputation system such as the one used by eBay. Both the results of the simulation and the results of experiments prove the validity and importance of using implicit feedbacks in reputation scoring.

international conference on conceptual modeling | 2012

OLAP-Like analysis of time point-based sequential data

Bartosz Bębel; Mikolaj Morzy; Tadeusz Morzy; Zbyszko Królikowski; Robert Wrembel

Nowadays business intelligence technologies mainly allow the analysis of set-oriented data, without considering order dependencies between data. Few approaches to analyzing data of sequential order have been proposed so far. Nonetheless, for storing and manipulating sequential data the approaches use either the relational data model or its extensions. We argue that in order to be able to fully support the analysis of sequential data, a dedicated new data model is needed. In this paper, we propose a formal model for time point-based sequential data with operations that allow constructing sequences of events, organizing them in an OLAP-like manner, and analyzing them. To the best of our knowledge, this is the first formal model and query language for this class of data.

advances in social networks analysis and mining | 2011

Opinion Mining and Social Networks: A Promising Match

Krzysztof Jędrzejewski; Mikolaj Morzy

In this paper we discuss the role and importance of social networks as preferred environments for opinion mining and, especially, sentiment analysis. We begin by briefly describing selected properties of social networks that are relevant with respect to opinion mining and we outline the general relationships between the two disciplines. We present the related work and provide basic definitions used in opinion mining. Then, we introduce our original method of opinion classification and we test the presented algorithm on real-world datasets acquired from popular Polish social networks, reporting on the results. The results are promising and soundly support the main thesis of the paper, namely, that social networks exhibit properties that make them very suitable for opinion mining activities.

database and expert systems applications | 2002

Materialized views in data mining

Bogdan D. Czejdo; Mikolaj Morzy; Marek Wojciechowski; Maciej Zakrzewicz

Data mining is an interactive and iterative process. It is highly probable that a user will issue a series of similar queries until he or she receives satisfying results. Currently available mining algorithms suffer from long processing times depending mainly on the size of the dataset. As the pattern discovery takes place mainly in the data warehouse environment, such long processing times are unacceptable from the point of view of interactive data mining. On the other hand, the results of consecutive data mining queries are usually very similar. This observation leads to the idea of reusing materialized results of previous data mining queries in order to improve performance of the system. In this paper we present the concept of materialized data mining views and we show how the results stored in these views can be used to accelerate processing of data mining queries. We demonstrate the use of materialized views in the domains of association rules discovery and sequential pattern search.
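
One simple case of reusing a materialized result can be sketched as follows. This is an illustration under stated assumptions, not the authors' system: it assumes the view stores every frequent itemset of the same dataset together with its support, and the function name `answer_from_view` and the toy data are hypothetical. If a new query asks for a minimum support at least as high as the one used to build the view, the answer is a subset of the stored result and can be obtained by filtering instead of re-mining.

```python
def answer_from_view(view, view_minsup, query_minsup):
    """Answer a frequent-itemset query from a materialized view when possible.

    view maps each frequent itemset (a frozenset) to its support, computed
    over the same dataset with threshold view_minsup. Returns None when the
    view cannot answer the query and full mining would be required.
    """
    if query_minsup < view_minsup:
        # Itemsets with support in [query_minsup, view_minsup) are missing
        # from the view, so filtering alone would give an incomplete answer.
        return None
    return {itemset: s for itemset, s in view.items() if s >= query_minsup}

# Hypothetical view built earlier with a minimum support of 0.3.
view = {
    frozenset({"a"}): 0.6,
    frozenset({"b"}): 0.4,
    frozenset({"a", "b"}): 0.3,
}
stricter = answer_from_view(view, view_minsup=0.3, query_minsup=0.4)
looser = answer_from_view(view, view_minsup=0.3, query_minsup=0.2)
```

Here `stricter` contains only the itemsets {a} and {b}, while `looser` is `None` because the view does not hold itemsets below its own threshold.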

international symposium on computer and information sciences | 2004

A Study on Answering a Data Mining Query Using a Materialized View

Maciej Zakrzewicz; Mikolaj Morzy; Marek Wojciechowski

One of the classic data mining problems is discovery of frequent itemsets. This problem particularly attracts the database community as it resembles traditional database querying. In this paper we consider a data mining system which supports storing of previous query results in the form of materialized data mining views. While numerous works have shown that reusing results of previous frequent itemset queries can significantly improve performance of data mining query processing, a thorough study of possible differences between the current query and a materialized view has not been presented yet. In this paper we classify possible differences into six classes, provide I/O cost analysis for all the classes, and experimentally evaluate the most promising one.

ACM Transactions on Internet Technology | 2017

Progressive Random Indexing: Dimensionality Reduction Preserving Local Network Dependencies

Michal Ciesielczyk; Andrzej Szwabe; Mikolaj Morzy; Pawel Misiorek

The vector space model is undoubtedly among the most popular data representation models used in the processing of large networks. Unfortunately, the vector space model suffers from the so-called curse of dimensionality, a phenomenon where data become extremely sparse due to an exponential growth of the data space volume caused by a large number of dimensions. Thus, dimensionality reduction techniques are necessary to make large networks represented in the vector space model available for analysis and processing. Most dimensionality reduction techniques tend to focus on principal components present in the data, effectively disregarding local relationships that may exist between objects. This behavior is a significant drawback of current dimensionality reduction techniques, because these local relationships are crucial for maintaining high accuracy in many network analysis tasks, such as link prediction or community detection. To rectify the aforementioned drawback, we propose Progressive Random Indexing, a new dimensionality reduction technique. Built upon Reflective Random Indexing, our method significantly reduces the dimensionality of the vector space model while retaining all important local relationships between objects. The key element of the Progressive Random Indexing technique is the use of the gain value at each reflection step, which determines how much information about local relationships should be included in the space of reduced dimensionality. Our experiments indicate that when applied to large real-world networks (Facebook social network, MovieLens movie recommendations), Progressive Random Indexing outperforms state-of-the-art methods in link prediction tasks.

data warehousing and knowledge discovery | 2005

Optimizing a sequence of frequent pattern queries

Mikolaj Morzy; Marek Wojciechowski; Maciej Zakrzewicz

Discovery of frequent patterns is a very important data mining problem with numerous applications. Frequent pattern mining is often regarded as advanced querying where a user specifies the source dataset and pattern constraints using a given constraint model. A significant amount of research on efficient processing of frequent pattern queries has been done in recent years, focusing mainly on constraint handling and reusing results of previous queries. In this paper we tackle the problem of optimizing a sequence of frequent pattern queries, submitted to the system as a batch. Our solutions are based on previously proposed techniques of reusing results of previous queries, and exploit the fact that knowing a sequence of queries a priori gives the system a chance to schedule and/or adjust the queries so that they can use results of queries executed earlier. We begin with simple query scheduling and then consider other transformations of the original batch of queries.

Entropy | 2016

Using Graph and Vertex Entropy to Compare Empirical Graphs with Theoretical Graph Models

Tomasz Kajdanowicz; Mikolaj Morzy

Over the years, several theoretical graph generation models have been proposed. Among the most prominent are: the Erdős–Rényi random graph model, Watts–Strogatz small world model, Albert–Barabási preferential attachment model, Price citation model, and many more. Often, researchers working with real-world data are interested in understanding the generative phenomena underlying their empirical graphs. They want to know which of the theoretical graph generation models would most probably generate a particular empirical graph. In other words, they expect some similarity assessment between the empirical graph and graphs artificially created from theoretical graph generation models. Usually, in order to assess the similarity of two graphs, centrality measure distributions are compared. For a theoretical graph model this means comparing the empirical graph to a single realization of a theoretical graph model, where the realization is generated from the given model using an arbitrary set of parameters. The similarity between centrality measure distributions can be measured using standard statistical tests, e.g., the Kolmogorov–Smirnov test of distances between cumulative distributions. However, this approach is error-prone and leads to incorrect conclusions, as we show in our experiments. Therefore, we propose a new method for graph comparison and type classification by comparing the entropies of centrality measure distributions (degree centrality, betweenness centrality, closeness centrality). We demonstrate that our approach can help assign the empirical graph to the most similar theoretical model using a simple unsupervised learning method.
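
As a rough illustration of the quantity being compared, the sketch below computes the Shannon entropy of a graph's degree distribution. It is an assumption-laden simplification: the paper's exact definitions of graph and vertex entropy are not given in the abstract and may differ, and the helper names `degree_sequence` and `shannon_entropy` are hypothetical. Only degree centrality is covered; betweenness and closeness would need a graph library.

```python
import math
from collections import Counter

def degree_sequence(edges):
    """Return the degree of every vertex in an undirected edge list."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return list(degree.values())

def shannon_entropy(values):
    """Shannon entropy, in bits, of the empirical distribution of values."""
    counts = Counter(values)
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# Two toy graphs on five vertices (made up for illustration).
star = [(0, i) for i in range(1, 5)]          # one hub, four leaves
ring = [(i, (i + 1) % 5) for i in range(5)]   # every vertex has degree 2

star_entropy = shannon_entropy(degree_sequence(star))
ring_entropy = shannon_entropy(degree_sequence(ring))
```

The ring has a single degree value, so its entropy is zero, while the star mixes two degree values and has a positive entropy of about 0.72 bits. Comparing such values across an empirical graph and realizations of theoretical models is the kind of assessment the abstract describes.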

World Wide Web | 2015

ICT Services for open and citizen science

Mikolaj Morzy

Ideas of open access, open data and open science are transforming the world of scientific inquiry as we speak. Every day thousands of ordinary citizens are engaging in data collection and data processing, giving rise to the new field of citizen science. Never before has technology enabled scientists to reach out to such vast numbers of collaborators and show their work to the public. From pattern recognition in Hubble space telescope images of distant galaxies to field observations of migration patterns of birds in the rural areas of the United States, the possibilities are countless. Certainly this new trend poses important problems and challenges, but it is also obvious that wide acceptance of citizen science can lead not only to great scientific results, but also to the popularization of the scientific method among the public. In the paper we examine the current state of citizen science, we outline some of the most interesting and difficult challenges in leading scientific projects on such a scale, and we present typologies of citizen science projects. We also provide a survey of ICT tools available for citizen science projects.
