Marcin Szymczak | Researchain

Publication

Featured research published by Marcin Szymczak.

Information Sciences | 2015

Coreference detection in an XML schema

Marcin Szymczak; Sławomir Zadrożny; Antoon Bronselaer; Guy De Tré

Preserving data quality is an important issue in data collection management. One of the crucial issues here is the detection of duplicate objects (called coreferent objects), which describe the same entity but in different ways. In this paper we present a method for detecting coreferent objects in metadata, in particular in XML schemas. Our approach consists of comparing the paths from a root element to a given element in the schema. Each path precisely defines the context and location of a specific element in the schema. Path matching is based on the comparison of the different steps of which paths are composed. The uncertainty about the matching of steps is expressed with possibilistic truth values and aggregated using the Sugeno integral. The discovered coreference of paths can help establish a mapping between two different XML schemas. In other words, a novel approach to the schema matching problem, based on path comparison only, is proposed.
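As a rough illustration of the aggregation step mentioned in the abstract, the sketch below computes a discrete Sugeno integral over per-step match scores. It is not the authors' method: the paper works with possibilistic truth values, whereas this sketch assumes plain scores in [0, 1], and the cardinality-based fuzzy measure and the function names `sugeno_integral` and `cardinality_measure` are hypothetical choices made only for the example.

```python
from typing import Callable, FrozenSet, Sequence


def sugeno_integral(
    scores: Sequence[float],
    measure: Callable[[FrozenSet[int]], float],
) -> float:
    """Discrete Sugeno integral of `scores` with respect to a fuzzy measure.

    `measure` maps a set of step indices to a weight in [0, 1]; it is
    assumed to be monotone, with measure(all indices) == 1.
    """
    # Sort step indices by ascending score.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    best = 0.0
    for k, i in enumerate(order):
        # Steps whose score is at least scores[i].
        coalition = frozenset(order[k:])
        best = max(best, min(scores[i], measure(coalition)))
    return best


def cardinality_measure(n: int) -> Callable[[FrozenSet[int]], float]:
    """Hypothetical fuzzy measure: the fraction of steps in the subset."""
    return lambda subset: len(subset) / n


# Assumed match scores for the three steps of two paths being compared.
step_scores = [0.9, 0.4, 1.0]
path_score = sugeno_integral(step_scores, cardinality_measure(len(step_scores)))
```

With this measure, one poorly matching step lowers the path score without dominating it, which a plain minimum would not allow.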

Joint IFSA World Congress and NAFIPS Annual Meeting | 2013

Coreference detection in XML metadata

Marcin Szymczak; Sławomir Zadrożny; Guy De Tré

Preserving data quality is an important issue in data collection management. One of the crucial issues here is the detection of duplicate objects (called coreferent objects), which describe the same entity but in different ways. In this paper we present a method for detecting coreferent objects in metadata, in particular in XML schemas. Our approach consists of comparing the paths from a root element to a given element in the schema. Each path precisely defines the context and location of a specific element in the schema. Path matching is based on the comparison of the different steps of which paths are composed. The uncertainty about the matching of steps is expressed with possibilistic truth values and aggregated using the Sugeno integral. The discovered coreference of paths can help determine the coreference of different XML schemas.

Information Fusion | 2016

Dynamical order construction in data fusion

Antoon Bronselaer; Marcin Szymczak; Sławomir Zadrożny; Guy De Tré

Fusion functions based on order relations are formalized. It is pointed out that an appropriate order relation is not always at hand. The DOC algorithm to construct an appropriate order relation dynamically is provided. Selection strategies are discussed. A thorough experimental evaluation shows the benefits of the proposed techniques.

A crucial operation in the maintenance of data quality in relational databases is to remove tuples that mutually describe the same entity (i.e., duplicate tuples) and to replace them with a tuple that minimizes information loss. A function that combines multiple tuples into one is called a fusion function. In this paper, we investigate fusion functions for attributes of which the values can be sorted by means of an order relation that reflects a notion of generality. It is shown that providing such an order relation a priori, let alone keeping it up-to-date, is a costly operation. Therefore, the Dynamical Order Construction (DOC) algorithm is proposed that constructs an order relation in an automated fashion upon inspecting the data that need to be fused. Such order relations can be immediately deployed in a framework of selectional fusion functions, which are fusion functions that adopt the sort-and-select principle. These fusion functions are investigated closely in terms of their selection strategies. An experimental evaluation of our method shows the influence of the parameters and the benefit with respect to using a fixed and predefined taxonomy.
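The sort-and-select principle described in the abstract can be illustrated with a small sketch. This is not the DOC algorithm and not the authors' implementation: the order relation is assumed to be supplied by hand as a set of (more general, more specific) pairs, and both the function name `fuse_most_specific` and the "most specific, then most frequent" selection strategy are hypothetical choices for the example.

```python
from collections import Counter
from typing import Hashable, Iterable, Set, Tuple


def fuse_most_specific(
    values: Iterable[Hashable],
    more_general: Set[Tuple[Hashable, Hashable]],
) -> Hashable:
    """Fuse duplicate attribute values into one by selecting a value.

    `more_general` holds pairs (g, s) meaning "g is more general than s".
    It is assumed to be transitively closed and free of cycles.
    """
    values = list(values)
    if not values:
        raise ValueError("nothing to fuse")
    distinct = set(values)
    # Keep the values that are not more general than any other observed value.
    maximal = [
        v for v in values
        if not any((v, w) in more_general for w in distinct)
    ]
    if not maximal:
        raise ValueError("order relation contains a cycle")
    # Tie-break among incomparable candidates: prefer the most frequent one.
    return Counter(maximal).most_common(1)[0][0]


# Assumed toy data: three duplicate tuples disagree on a 'category' attribute.
order = {("restaurant", "Italian restaurant")}
fused = fuse_most_specific(
    ["restaurant", "Italian restaurant", "restaurant"], order
)
```

Selecting the most specific value keeps the detail carried by "Italian restaurant", whereas a plain majority vote would discard it.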

North American Fuzzy Information Processing Society | 2012

Dynamical construction of binary relations in coreference detection

Marcin Szymczak; Antoon Bronselaer; Sławomir Zadrożny; Guy De Tré

Modern database systems allow information from the real world to be described in a well-structured manner. Unfortunately, many databases suffer from quality problems. One of these problems is the existence of coreferent data, which means that the same real-world entity is described multiple times within one database. Due to errors, inaccuracies and lack of standardization, coreferent data are not bound to be equal, which makes finding coreferent data a challenging topic. In this paper, we contribute to the field of coreference detection by proposing an automated and dynamical method for the construction of a binary relation R that models semantical knowledge between attribute values. The advantages of the proposed method are twofold: no effort must be put into the construction of knowledge bases, and mismatches between the database and the knowledge base are avoided.

Advances in intelligent systems and computing | 2014

Selection of semantical mapping of attribute values for data integration

Marcin Szymczak; Antoon Bronselaer; Sławomir Zadrożny; Guy De Tré

Useful information is often scattered over multiple sources. Therefore, automatic data integration that guarantees high data quality is extremely important. One of the crucial operations in data integration from different sources is the detection of different representations of the same piece of information (called coreferent data) and their translation to a common, unified representation. That translation is also known as value mapping. However, value mappings are often not explicit, i.e., a specific value may be mapped to more than one value. In this paper, we investigate an automatic selection method that reduces a set of one-to-many mappings to a set of one-to-one mappings for attributes whose domains are partially ordered and where the given order relation reflects a notion of generality.

Challenging Problems and Solutions in Intelligent Systems | 2016

Content Data Based Schema Matching

Marcin Szymczak; Antoon Bronselaer; Sławomir Zadrożny; Guy De Tré

A novel automatic method for detecting corresponding attributes in schemas based on content data is studied. More specifically, our proposed method for the detection of coreferent attributes in schemas is based on a statistical and lexical comparison of content data and on detected coreferent tuples across multiple datasets, which increases the likelihood of correct schema matching. We will show that knowledge of even a small number of coreferent tuples is sufficient to establish a correct matching between corresponding attributes of heterogeneous schemas. The behaviour of the novel schema matching technique has been evaluated on several real-life datasets, giving valuable insight into the influence of the different parameters of our approach on the results obtained.

IEEE International Conference on Intelligent Systems | 2015

Selection of Semantical Mapping of Attribute Values for Data Integration

Marcin Szymczak; Antoon Bronselaer; Sławomir Zadrożny; Guy De Tré

Nowadays the amount of data is increasing very fast. Moreover, useful information is scattered over multiple sources. Therefore, automatic data integration that guarantees high data quality is extremely important. One of the crucial operations in the integration of information from independent databases is the detection of different representations of the same piece of information (called coreferent data) and the translation of the representation of data from one source into the representation of the other source. That translation is also known as object mapping. In this paper, we investigate automatic mapping methods for attributes whose values may need semantical comparison and can be sorted by means of an order relation that reflects a notion of generality. These mapping methods are investigated closely in terms of their effectiveness. An experimental evaluation of our method shows that using different mapping methods can enlarge the set of true positive mappings.

Norbert Wiener in the 21st Century (21CW), 2014 IEEE Conference on | 2014

Semantical Mapping of Attribute Values for Data Integration

Marcin Szymczak; Antoon Bronselaer; Sławomir Zadrożny; Guy De Tré

Nowadays the amount of data is increasing very fast. Moreover, useful information is scattered over multiple sources. Therefore, automatic data integration that guarantees high data quality is extremely important. One of the crucial operations in the integration of information from independent databases is the detection of different representations of the same piece of information (called coreferent data) and the translation of the representation of data from one source into the representation of the other source. That translation is also known as object mapping. In this paper, we investigate automatic mapping methods for attributes whose values may need semantical comparison and can be sorted by means of an order relation that reflects a notion of generality. These mapping methods are investigated closely in terms of their effectiveness. An experimental evaluation of our method shows that using different mapping methods can enlarge the set of true positive mappings.

New developments in fuzzy sets, intuitionistic fuzzy sets, generalized nets and related topics : application | 2012

Matching methods for semantic annotation-based XML document transformations

Marcin Szymczak; Julius Koepke

Challenging problems and solutions in computational intelligence | 2016

Content based schema matching

Marcin Szymczak; Antoon Bronselaer; Sławomir Zadrożny; Guy De Tré

Collaboration

Dive into Marcin Szymczak's collaborations.

Top Co-Authors

Sławomir Zadrożny

Polish Academy of Sciences

Guy De Tré

Ghent University

Antoon Bronselaer

Ghent University
