Publication


Featured research published by Matthias Studer.


Sociological Methods & Research | 2011

Discrepancy Analysis of State Sequences

Matthias Studer; Gilbert Ritschard; Alexis Gabadinho; Nicolas S. Müller

In this article, the authors define a methodological framework for analyzing the relationship between state sequences and covariates. Inspired by the principles of analysis of variance, this approach looks at how the covariates explain the discrepancy of the sequences. The authors use the pairwise dissimilarities between sequences to determine the discrepancy, which makes it possible to develop a series of statistical significance–based analysis tools. They introduce generalized simple and multifactor discrepancy-based methods to test for differences between groups, a pseudo-R² for measuring the strength of sequence-covariate associations, a generalized Levene statistic for testing differences in the within-group discrepancies, as well as tools and plots for studying the evolution of the differences along the time frame and a regression tree method for discovering the most significant discriminant covariates and their interactions. In addition, the authors extend all methods to account for case weights. The scope of the proposed methodological framework is illustrated using a real-world sequence data set.
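
As a rough illustration of the idea, the sketch below computes a pseudo-R² directly from a pairwise dissimilarity matrix. It assumes the ANOVA-like convention in which the sum of squares of a group is the sum of its pairwise dissimilarities divided by the group size, ignores case weights, and uses hypothetical function names (`sum_of_squares`, `pseudo_r2`); it is not the authors' implementation.

```python
from itertools import combinations


def sum_of_squares(diss, members):
    """Sum of pairwise dissimilarities among `members`, divided by group size.

    Assumption: the dissimilarity plays the role of a squared distance,
    so this quantity is the analogue of an ANOVA sum of squares.
    """
    n = len(members)
    if n == 0:
        return 0.0
    return sum(diss[i][j] for i, j in combinations(members, 2)) / n


def pseudo_r2(diss, groups):
    """Share of the total discrepancy explained by a categorical covariate.

    `diss` is a symmetric matrix (list of lists) and `groups` gives one
    label per object, in the same order as the rows of `diss`.
    """
    n = len(groups)
    ss_total = sum_of_squares(diss, list(range(n)))
    by_group = {}
    for idx, label in enumerate(groups):
        by_group.setdefault(label, []).append(idx)
    ss_within = sum(sum_of_squares(diss, m) for m in by_group.values())
    return (ss_total - ss_within) / ss_total
```

A value close to 1 would indicate that the covariate accounts for most of the discrepancy between objects, and a value close to 0 that it accounts for almost none.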


International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management | 2009

Extracting and Rendering Representative Sequences

Alexis Gabadinho; Gilbert Ritschard; Matthias Studer; Nicolas S. Müller

This paper is concerned with the summarization of a set of categorical sequences. More specifically, the problem studied is the determination of the smallest possible number of representative sequences that ensure a given coverage of the whole set, i.e. that together have a given percentage of sequences in their neighborhood. The proposed heuristic for extracting the representative subset requires as main arguments a pairwise distance matrix, a representativeness criterion and a distance threshold under which two sequences are considered redundant or, equivalently, in the neighborhood of each other. It first builds a list of candidates using a representativeness score and then eliminates redundancy. We also propose a visualization tool for rendering the results and quality measures for evaluating them. The proposed tools have been implemented in our TraMineR R package for mining and visualizing sequence data and we demonstrate their efficiency on a real-world example from the social sciences. The methods are nonetheless by no means limited to social science data and should prove useful in many other domains.
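
The heuristic described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the TraMineR implementation: the representativeness score is taken to be neighborhood density (the number of sequences within the threshold), and the names `representative_set`, `radius` and `coverage` are hypothetical.

```python
def representative_set(dist, radius, coverage=0.25):
    """Pick non-redundant representatives until `coverage` is reached.

    dist     : symmetric pairwise distance matrix (list of lists)
    radius   : distance threshold defining a neighborhood (assumed name)
    coverage : target share of objects lying within `radius` of a representative
    """
    n = len(dist)
    # Assumed criterion: density = size of each object's neighborhood.
    density = [sum(1 for j in range(n) if dist[i][j] <= radius) for i in range(n)]
    # Candidate list sorted by decreasing score (ties keep index order).
    candidates = sorted(range(n), key=lambda i: -density[i])

    chosen, covered = [], set()
    for c in candidates:
        # Redundancy elimination: skip candidates close to a chosen representative.
        if any(dist[c][r] <= radius for r in chosen):
            continue
        chosen.append(c)
        covered.update(j for j in range(n) if dist[c][j] <= radius)
        if len(covered) / n >= coverage:
            break
    return chosen, len(covered) / n
```

The function returns the indices of the selected representatives together with the coverage actually achieved, which may fall short of the target if the candidate list is exhausted first.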


Data Warehousing and Knowledge Discovery | 2008

Extracting Knowledge from Life Courses: Clustering and Visualization

Nicolas S. Müller; Alexis Gabadinho; Gilbert Ritschard; Matthias Studer

This article presents some of the facilities offered by our TraMineR R package for clustering and visualizing sequence data. First, we discuss our implementation of the optimal matching algorithm for evaluating the distance between two sequences and its use for generating a distance matrix for the whole sequence data set. Once such a matrix is obtained, we may use it as input for a cluster analysis, which can be done straightforwardly with any method available in the R statistical environment. Then we present three kinds of plots for visualizing the characteristics of the obtained clusters: an aggregated plot depicting the average sequential behavior of cluster members; a sequence index plot that shows the diversity inside clusters; and an original frequency plot that highlights the frequencies of the n most frequent sequences. TraMineR was designed for analysing sequences representing life courses and our presentation is illustrated on such a real-world data set. The material presented should also be of interest for other kinds of sequential data such as DNA analysis or web logs.
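
For readers unfamiliar with optimal matching, the following sketch shows the standard dynamic-programming edit distance it relies on, followed by the construction of a full distance matrix. It assumes a constant substitution cost and a constant insertion/deletion cost; the default values and the function names are illustrative only and do not reproduce TraMineR's R interface.

```python
def om_distance(a, b, indel=1.0, sub=2.0):
    """Optimal matching (edit) distance between two state sequences.

    Assumptions: one constant insertion/deletion cost and one constant
    substitution cost for any pair of distinct states.
    """
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows):
        d[i][0] = i * indel          # delete the first i states of a
    for j in range(1, cols):
        d[0][j] = j * indel          # insert the first j states of b
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0.0 if a[i - 1] == b[j - 1] else sub
            d[i][j] = min(
                d[i - 1][j] + indel,     # deletion
                d[i][j - 1] + indel,     # insertion
                d[i - 1][j - 1] + cost,  # match or substitution
            )
    return d[-1][-1]


def distance_matrix(seqs, **costs):
    """Pairwise optimal matching distances for a list of sequences."""
    return [[om_distance(s, t, **costs) for t in seqs] for s in seqs]
```

The resulting matrix can then be passed to any clustering routine that accepts precomputed dissimilarities.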


EGC (best of volume) | 2010

Discrepancy Analysis of Complex Objects Using Dissimilarities

Matthias Studer; Gilbert Ritschard; Alexis Gabadinho; Nicolas S. Müller

In this article we consider objects for which we have a matrix of dissimilarities and we are interested in their links with covariates. We focus on state sequences for which pairwise dissimilarities are given, for instance, by edit distances. The methods discussed apply, however, to any kind of objects and measures of dissimilarity. We start with a generalization of the analysis of variance (ANOVA) to assess the link of complex objects (e.g. sequences) with a given categorical variable. The trick is to show that discrepancy among objects can be derived from the pairwise dissimilarities alone, which then makes it possible to identify the factors that most reduce this discrepancy. We present a general statistical test and introduce an original way of rendering the results for state sequences. We then generalize the method to the case with more than one factor and discuss its advantages and limitations, especially regarding interpretation. Finally, we introduce a new tree method for analyzing discrepancy of complex objects that exploits the former test as splitting criterion. We demonstrate the scope of the methods presented through a study of the factors that most discriminate Swiss occupational trajectories. All methods presented are freely accessible in our TraMineR package for the R statistical environment.
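
One common way to test such an association is a permutation test on a pseudo-F statistic computed from the dissimilarities alone. The sketch below assumes that approach; the abstract does not spell out the test, so the statistic, the function names and the default number of permutations are all assumptions made for illustration.

```python
import random
from itertools import combinations


def group_ss(diss, members):
    """Sum of pairwise dissimilarities within a group, divided by its size."""
    n = len(members)
    return sum(diss[i][j] for i, j in combinations(members, 2)) / n if n else 0.0


def pseudo_f(diss, labels):
    """ANOVA-like pseudo-F; assumes at least two groups and nonzero within-group SS."""
    n = len(labels)
    groups = {}
    for idx, lab in enumerate(labels):
        groups.setdefault(lab, []).append(idx)
    m = len(groups)
    ss_total = group_ss(diss, list(range(n)))
    ss_within = sum(group_ss(diss, g) for g in groups.values())
    ss_between = ss_total - ss_within
    return (ss_between / (m - 1)) / (ss_within / (n - m))


def permutation_p_value(diss, labels, n_perm=999, seed=0):
    """Share of label permutations whose pseudo-F reaches the observed one."""
    rng = random.Random(seed)
    f_obs = pseudo_f(diss, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any link between objects and labels
        if pseudo_f(diss, shuffled) >= f_obs:
            hits += 1
    # The observed labelling counts as one permutation, hence the +1 terms.
    return f_obs, (hits + 1) / (n_perm + 1)
```

Permuting the labels avoids any distributional assumption about the objects, which is useful when they are sequences rather than points in a Euclidean space.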


Sociological Methods & Research | 2015

Spell sequences, state proximities and distance metrics

Cees H. Elzinga; Matthias Studer

Because optimal matching (OM) distance is not very sensitive to differences in the order of states, we introduce a subsequence-based distance measure that can be adapted to subsequence length, to subsequence duration, and to soft-matching of states. Using a simulation technique developed by Studer, we investigate the sensitivity, relative to OM, of several variants of this metric to variations in order, timing, and duration of states. The results show that the behavior of the metric is as intended. Furthermore, we use family formation data from the Swiss Household Panel to compare a few variants of the new metric to OM. The new metrics have been implemented in the freely available TraMineR-package.


Archive | 2009

Converting between Various Sequence Representations

Gilbert Ritschard; Alexis Gabadinho; Matthias Studer; Nicolas S. Müller

This chapter is concerned with the organization of categorical sequence data. We first build a typology of sequences distinguishing, for example, between chronological sequences and sequences without time content. This makes it possible to identify the kind of information that the data organization should preserve. Focusing then mainly on chronological sequences, we discuss the advantages and limits of different ways of representing time-stamped event and state sequence data and present solutions for automatically converting between various formats, e.g., between horizontal and vertical presentations but also from state sequences into event sequences and vice versa. Special attention is also drawn to the handling of missing values in these conversion processes.
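
To make the conversions concrete, here is a small sketch that turns a state sequence (one state per time unit) into a spell representation and into a list of transition events, and back again. The tuple layouts and function names are assumptions chosen for illustration; they are not the formats or functions defined in the chapter, and missing values are not treated specially.

```python
from itertools import groupby


def states_to_spells(states):
    """State sequence -> list of (state, start, duration) spells."""
    spells, t = [], 0
    for state, run in groupby(states):
        length = len(list(run))
        spells.append((state, t, length))
        t += length
    return spells


def spells_to_states(spells):
    """Inverse conversion: expand each spell back into repeated states."""
    return [state for state, _start, length in spells for _ in range(length)]


def states_to_events(states):
    """State sequence -> time-stamped transition events such as (2, 'S>E')."""
    events = []
    for t in range(1, len(states)):
        if states[t] != states[t - 1]:
            events.append((t, f"{states[t - 1]}>{states[t]}"))
    return events
```

The spell form keeps the same information as the state form in fewer rows, whereas the event form keeps only the changes and needs the initial state to be stored separately if the full sequence must be rebuilt.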


Open Source Systems | 2007

Community Structure, Individual Participation and the Social Construction of Merit

Matthias Studer

FLOSS communities are often described as meritocracies. We consider merit as a social construction that structures the community as a whole by allocating prestige to its participants on the basis of what they do. It implies a hierarchy of the different activities (web maintenance, writing code, bug reporting...) within the project. We present a study based on the merging of two datasets. We analyze the archive of KDE mailing lists using social network analysis. We also use responses to a questionnaire of KDE participants. The results provide empirical evidence that this hierarchy structures the KDE community by allocating more central positions to participants with more prestigious activities. We also show that this hierarchy structures individuals' participation by giving greater “membership esteem” to members involved in more prestigious activities.


Sociological Methodology | 2018

Estimating the Relationship between Time-varying Covariates and Trajectories: The Sequence Analysis Multistate Model Procedure

Matthias Studer; Emanuela Struffolino; Anette Eva Fasang

The relationship between processes and time-varying covariates is of central theoretical interest in addressing many social science research questions. On the one hand, event history analysis (EHA) has been the chosen method to study these kinds of relationships when the outcomes can be meaningfully specified as simple instantaneous events or transitions. On the other hand, sequence analysis (SA) has made increasing inroads into the social sciences to analyze trajectories as holistic “process outcomes.” We propose an original combination of these two approaches called the sequence analysis multistate model (SAMM) procedure. The SAMM procedure allows the study of the relationship between time-varying covariates and trajectories of categorical states specified as process outcomes that unfold over time. The SAMM is a stepwise procedure: (1) SA-related methods are used to identify ideal-typical patterns of changes within trajectories obtained by considering the sequence of states over a predefined time span; (2) multistate event history models are estimated to study the probability of transitioning from a specific state to such ideal-typical patterns. The added value of the SAMM procedure is illustrated through an example from life-course sociology on (1) how time-varying family status is associated with women’s employment trajectories in East and West Germany and (2) how German reunification affected these trajectories in the two subsocieties.


Sociological Methodology | 2015

Comment: On the Use of Globally Interdependent Multiple Sequence Analysis

Matthias Studer

Raffaella Piccarreta is a research scientist at Bocconi University in Milan, Italy. She holds a PhD in statistics, and her research areas include multivariate data analysis, sequence analysis, and dissimilarity data. She has produced several articles on sequence analysis, focusing specifically on clustering algorithms, classification trees, and graphical tools. Her recent research has been focused on the study of the interplay of different domains, on the development of methods to explain and predict life-courses, and on multiway multidimensional scaling. Current research projects include an analysis of the early track of the work career in Italy and a study of the impact of family formation patterns on HIV infection in sub-Saharan countries.


Archive | 2008

Strategies in Identifying Issues Addressed in Legal Reports

Gilbert Ritschard; Matthias Studer; Vincent Pisetta

This paper deals with the automatic retrieval of issues reported in legal texts and presents an experience with experts' reports on the application of ILO Conventions. The aim is to provide the end user, i.e. the legal expert, with a set of rules that allows her/him to find, among a predefined list of issues, those addressed by any new text. Since the end user is not supposed to be able to pre-process the text, we need rules that can be directly applied to raw texts. We present the strategy followed for generating the rules in this ILO legal setting and single out a few possible refinements that should significantly improve the performance of the retrieval process. Our approach consists of first characterizing a list of descriptor concepts, which are then used to get a quantitative representation of the texts. In the learning phase, using a sample of texts labeled by legal experts with the issues they actually address, we build the rules by means of induced decision trees.
