Publication


Featured research published by Dmitriy Fradkin.


European Heart Journal | 2015

Atlas of the clinical genetics of human dilated cardiomyopathy

Jan Haas; Karen Frese; Barbara Peil; Wanda Kloos; Andreas Keller; Rouven Nietsch; Zhu Feng; Sabine Müller; Elham Kayvanpour; Britta Vogel; Farbod Sedaghat-Hamedani; Wei Keat Lim; Xiaohong Zhao; Dmitriy Fradkin; Doreen Köhler; Simon Fischer; Jennifer Franke; Sabine Marquart; Ioana Barb; Daniel Tian Li; Ali Amr; Philipp Ehlermann; Derliz Mereles; Tanja Weis; Sarah Hassel; Andreas Kremer; Vanessa King; Emil Wirsz; Richard Isnard; Michel Komajda

AIM: Numerous genes are known to cause dilated cardiomyopathy (DCM). However, until now technological limitations have hindered elucidation of the contribution of all clinically relevant disease genes to DCM phenotypes in larger cohorts. We used next-generation sequencing to overcome these limitations and screened all DCM disease genes in a large cohort. METHODS AND RESULTS: In this multi-centre, multi-national study, we enrolled 639 patients with sporadic or familial DCM. To all samples, we applied a standardized protocol for ultra-high coverage next-generation sequencing of 84 genes, leading to 99.1% coverage of the target region at a depth of at least 50-fold and a mean read depth of 2415. In this well-characterized cohort, we find the highest number of known cardiomyopathy mutations in plakophilin-2, myosin-binding protein C-3, and desmoplakin. When we include as yet unknown but predicted disease variants, we find titin, plakophilin-2, myosin-binding protein C-3, desmoplakin, ryanodine receptor 2, desmocollin-2, desmoglein-2, and SCN5A variants among the most commonly mutated genes. The overlap between DCM, hypertrophic cardiomyopathy (HCM), and channelopathy-causing mutations is considerable. Of note, we find that >38% of patients have compound or combined mutations and 12.8% have three or more mutations. When comparing patients recruited in the eight participating European countries, we find remarkably few differences in mutation frequencies and affected genes. CONCLUSION: This is, to our knowledge, the first study to comprehensively investigate the genetics of DCM in a large-scale cohort and across a broad panel of the known DCM genes. Our results underline the high analytical quality and feasibility of next-generation sequencing in clinical genetic diagnostics and provide a sound database of the genetic causes of DCM.


Knowledge Discovery and Data Mining | 2003

Experiments with random projections for machine learning

Dmitriy Fradkin; David Madigan

Dimensionality reduction via Random Projections has attracted considerable attention in recent years. The approach has interesting theoretical underpinnings and offers computational advantages. In this paper we report a number of experiments to evaluate Random Projections in the context of inductive supervised learning. In particular, we compare Random Projections and PCA on a number of different datasets and using different machine learning methods. While we find that the random projection approach predictively underperforms PCA, its computational advantages may make it attractive for certain applications.
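As a rough illustration of the technique named in this abstract, the following sketch projects data onto a lower-dimensional space with a Gaussian random matrix and checks how well one pairwise distance is preserved. The function names, the matrix scaling, and the toy data are assumptions made for this example; they are not taken from the paper, which compares Random Projections against PCA on real datasets.

```python
import math
import random

def random_projection_matrix(d, k, seed=0):
    """Return a d x k matrix with i.i.d. Gaussian entries of variance 1/k."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(k)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(k)] for _ in range(d)]

def project(X, R):
    """Multiply each row of X (n x d) by R (d x k), giving an n x k result."""
    k = len(R[0])
    return [[sum(x[i] * R[i][j] for i in range(len(x))) for j in range(k)]
            for x in X]

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))

# Hypothetical toy data: 10 points in 200 dimensions, reduced to 50.
rng = random.Random(1)
X = [[rng.gauss(0.0, 1.0) for _ in range(200)] for _ in range(10)]
R = random_projection_matrix(d=200, k=50)
Z = project(X, R)

# Ratio of projected to original distance; values near 1 mean little distortion.
ratio = euclidean(Z[0], Z[1]) / euclidean(X[0], X[1])
print(f"distance ratio after projection: {ratio:.3f}")
```

Unlike PCA, the projection matrix here is drawn without looking at the data, which is where the computational advantage mentioned in the abstract comes from.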


Knowledge Discovery and Data Mining | 2014

Log-based predictive maintenance

Ruben Sipos; Dmitriy Fradkin; Fabian Moerchen; Zhuang Wang

The success of manufacturing companies depends largely on the reliability of their products. Scheduled maintenance is widely used to ensure that equipment is operating correctly so as to avoid unexpected breakdowns. Such maintenance is often carried out separately for every component, based on its usage or simply on some fixed schedule. However, scheduled maintenance is labor-intensive and ineffective in identifying problems that develop between technicians' visits. Unforeseen failures still frequently occur. In contrast, predictive maintenance techniques help determine the condition of in-service equipment in order to predict when and what repairs should be performed. The main goal of predictive maintenance is to enable pro-active scheduling of corrective work, and thus prevent unexpected equipment failures.


Knowledge Discovery and Data Mining | 2008

Anticipating annotations and emerging trends in biomedical literature

Fabian Mörchen; Mathäus Dejori; Dmitriy Fradkin; Julien Etienne; Bernd Wachmann; Markus Bundschus

The BioJournalMonitor is a decision support system for the analysis of trends and topics in the biomedical literature. Its main goal is to identify potential diagnostic and therapeutic biomarkers for specific diseases. Several data sources are continuously integrated to provide the user with up-to-date information on current research in this field. State-of-the-art text mining technologies are deployed to provide added value on top of the original content, including named entity detection, relation extraction, classification, clustering, ranking, summarization, and visualization. We present two novel technologies that are related to the analysis of temporal dynamics of text archives and associated ontologies. Currently, the MeSH ontology is used to annotate the scientific articles entering the PubMed database with medical terms. Both the maintenance of the ontology and the annotation of new articles are performed largely manually. We describe how probabilistic topic models can be used to annotate recent articles with the most likely MeSH terms. This provides our users with a competitive advantage because, when searching for MeSH terms, articles are found long before they are manually annotated. We further present a study on how to predict the inclusion of new terms in the MeSH ontology. The results suggest that early prediction of emerging trends is possible. The trend ranking functions are deployed in our system to enable interactive searches for the hottest new trends relating to a disease.


Statistical Analysis and Data Mining | 2014

Mining Compressing Sequential Patterns

Hoang Thanh Lam; Fabian Mörchen; Dmitriy Fradkin; Toon Calders

Pattern mining based on data compression has been successfully applied in many data mining tasks. For itemset data, the Krimp algorithm based on the minimum description length (MDL) principle was shown to be very effective in solving the redundancy issue in descriptive pattern mining. However, for sequence data, the redundancy issue of the set of frequent sequential patterns is not fully addressed in the literature. In this article, we study MDL-based algorithms for mining non-redundant sets of sequential patterns from a sequence database. First, we propose an encoding scheme for compressing sequence data with sequential patterns. Second, we formulate the problem of mining the most compressing sequential patterns from a sequence database. We show that this problem is intractable and belongs to the class of inapproximable problems. Therefore, we propose two heuristic algorithms. The first of these uses a two-phase approach similar to Krimp for itemset data. To overcome performance issues in candidate generation, we also propose GoKrimp, an algorithm that directly mines compressing patterns by greedily extending a pattern until adding the extension to the dictionary yields no additional compression benefit. Since checks for the additional compression benefit of an extension are computationally expensive, we propose a dependency test which only chooses related events for extending a given pattern. This technique improves the efficiency of the GoKrimp algorithm significantly while still preserving the quality of the set of patterns. We conduct an empirical study on eight datasets to show the effectiveness of our approach in comparison to the state-of-the-art algorithms in terms of interpretability of the extracted patterns, run time, compression ratio, and classification accuracy using the discovered patterns as features for different classifiers.


BAYESIAN INFERENCE AND MAXIMUM ENTROPY METHODS IN SCIENCE AND ENGINEERING: 25th International Workshop on Bayesian Inference and Maximum Entropy Methods in Science and Engineering | 2005

Bayesian Multinomial Logistic Regression for Author Identification

David Madigan; Alexander Genkin; David Lewis; Dmitriy Fradkin

Motivated by high‐dimensional applications in authorship attribution, we describe a Bayesian multinomial logistic regression model together with an associated learning algorithm.
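The abstract does not describe the learning algorithm, so the following is only a minimal sketch of the general idea: a multinomial (softmax) logistic regression whose weights carry a zero-mean Gaussian prior, fitted by maximizing the log posterior with plain gradient ascent. The Gaussian prior, the optimizer, the function names, and the toy data are all assumptions made for illustration and should not be read as the authors' method.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of class scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def class_probs(W, x):
    """Class probabilities for feature vector x under weight matrix W."""
    return softmax([sum(w * xi for w, xi in zip(row, x)) for row in W])

def fit_map(X, y, n_classes, prior_var=1.0, lr=0.1, epochs=500):
    """MAP estimate under an assumed zero-mean Gaussian prior on each weight."""
    d = len(X[0])
    W = [[0.0] * d for _ in range(n_classes)]
    for _ in range(epochs):
        # Gradient of the log prior: -w / prior_var for every weight.
        grad = [[-w / prior_var for w in row] for row in W]
        # Gradient of the log likelihood: (indicator - probability) * feature.
        for x, label in zip(X, y):
            p = class_probs(W, x)
            for c in range(n_classes):
                err = (1.0 if c == label else 0.0) - p[c]
                for j in range(d):
                    grad[c][j] += err * x[j]
        # Ascent step, scaled by the number of training examples.
        for c in range(n_classes):
            for j in range(d):
                W[c][j] += lr * grad[c][j] / len(X)
    return W

def predict(W, x):
    """Return the index of the most probable class."""
    p = class_probs(W, x)
    return max(range(len(p)), key=p.__getitem__)

# Hypothetical toy data: three "authors", each favouring one feature.
X = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0],
     [0.0, 1.0, 0.0], [0.1, 0.9, 0.0],
     [0.0, 0.0, 1.0], [0.0, 0.1, 0.9]]
y = [0, 0, 1, 1, 2, 2]
W = fit_map(X, y, n_classes=3)
predictions = [predict(W, x) for x in X]
print(predictions)
```

A smaller `prior_var` shrinks the weights more strongly toward zero, which is the usual motivation for a prior in high-dimensional settings such as authorship attribution.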


Knowledge and Information Systems | 2015

Mining sequential patterns for classification

Dmitriy Fradkin; Fabian Mörchen

While a number of efficient sequential pattern mining algorithms were developed over the years, they can still take a long time and produce a huge number of patterns, many of which are redundant. These properties are especially frustrating when the goal of pattern mining is to find patterns for use as features in classification problems. In this paper, we describe BIDE-Discriminative, a modification of BIDE that uses class information for direct mining of predictive sequential patterns. We then perform an extensive evaluation on nine real-life datasets of the different ways in which the basic BIDE-Discriminative can be used in real multi-class classification problems, including 1-versus-rest and model-based search tree approaches. The results of our experiments show that 1-versus-rest provides an efficient solution with good classification performance.
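To make the idea of using sequential patterns as classification features concrete, here is a small sketch that turns a set of patterns into a binary feature matrix. It assumes the patterns have already been mined; it does not implement BIDE or BIDE-Discriminative, and the function names and toy sequences are hypothetical.

```python
def contains(sequence, pattern):
    """True if pattern occurs in sequence as a (possibly gapped) subsequence."""
    it = iter(sequence)
    # Each `item in it` consumes the iterator up to the first match,
    # so later pattern items must appear after earlier ones.
    return all(item in it for item in pattern)

def pattern_features(sequences, patterns):
    """One row per sequence, one 0/1 column per pattern."""
    return [[1 if contains(seq, pat) else 0 for pat in patterns]
            for seq in sequences]

# Hypothetical toy input: event sequences and already-mined patterns.
sequences = [["a", "b", "c"], ["a", "c", "b"], ["b", "a", "c"]]
patterns = [("a", "b"), ("a", "c"), ("b", "c")]
features = pattern_features(sequences, patterns)
print(features)  # [[1, 1, 1], [1, 1, 0], [0, 1, 1]]
```

The resulting rows can be passed to any standard classifier; in a 1-versus-rest setting, a separate pattern set would be mined for each class against all the others.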


Data Mining and Knowledge Discovery | 2010

Hierarchical document clustering using local patterns

Hassan H. Malik; John R. Kender; Dmitriy Fradkin; Fabian Moerchen

The global pattern mining step in existing pattern-based hierarchical clustering algorithms may result in an unpredictable number of patterns. In this paper, we propose IDHC, a pattern-based hierarchical clustering algorithm that builds a cluster hierarchy without mining for globally significant patterns. IDHC first discovers locally promising patterns by allowing each instance to “vote” for its representative size-2 patterns in a way that ensures an effective balance between local pattern frequency and pattern significance in the dataset. The cluster hierarchy (i.e., the global model) is then directly constructed using these locally promising patterns as features. Each pattern forms an initial (possibly overlapping) cluster, and the rest of the cluster hierarchy is obtained by following a unique iterative cluster refinement process. By effectively utilizing instance-to-cluster relationships, this process directly identifies clusters for each level in the hierarchy, and efficiently prunes duplicate clusters. Furthermore, IDHC produces cluster labels that are more descriptive (patterns are not artificially restricted), and adopts a soft clustering scheme that allows instances to exist in suitable nodes at various levels in the cluster hierarchy. We present results of experiments performed on 16 standard text datasets, and show that IDHC outperforms state-of-the-art hierarchical clustering algorithms in terms of average entropy and FScore measures.


Veterinary Record | 2006

Prevalence of wet litter and the associated risk factors in broiler flocks in the United Kingdom

Patrick Hermans; Dmitriy Fradkin; Ilya B. Muchnik; K. L. Morgan

A postal questionnaire was sent to the managers of 857 broiler farms in the UK to determine the prevalence and risk factors for wet litter. The response rate was 75 per cent. Wet litter was reported by 75 per cent (95 per cent confidence interval [CI] 71·3 to 78·3) of the respondents in at least one flock during the year 2001 and 56·1 per cent (95 per cent CI 52·0 to 60·0) of them reported that they had an outbreak of wet litter in their most recently reared flock. Wet litter occurred more often during the winter months and farms using side ventilation systems were at an increased risk (odds ratio 1·74; 95 per cent CI 1·09 to 2·76). A multivariable analysis was carried out using two different definitions of wet litter as outcome variables – all cases of wet litter, and cases of wet litter associated with disease. Consistent risk factors for both outcomes were coccidiosis, feed equipment failures and the availability of separate farm clothing for each house. Cases of wet litter associated with disease were reported by 33·7 per cent (95 per cent CI 28·8 to 39·1) of the managers in their last flock and were associated with the use of hand sanitisers and broiler houses with walls made of concrete.
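For readers unfamiliar with how an odds ratio and its 95 per cent confidence interval are obtained, the sketch below computes the unadjusted odds ratio from a 2x2 table with a Wald interval on the log scale. The counts are hypothetical and are not taken from the study, whose multivariable analysis would have produced adjusted estimates by a different route.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval from a 2x2 table.

    a: exposed with outcome      b: exposed without outcome
    c: unexposed with outcome    d: unexposed without outcome
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of the log odds ratio.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper

# Hypothetical counts, for illustration only.
or_value, ci_low, ci_high = odds_ratio_ci(a=20, b=10, c=10, d=20)
print(f"odds ratio {or_value:.2f}, 95% CI {ci_low:.2f} to {ci_high:.2f}")
```

An interval that excludes 1, as reported for side ventilation systems, indicates an association that is statistically significant at the 5 per cent level.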


Knowledge and Information Systems | 2016

An efficient pattern mining approach for event detection in multivariate temporal data

Iyad Batal; Gregory F. Cooper; Dmitriy Fradkin; James H. Harrison; Fabian Moerchen; Milos Hauskrecht

This work proposes a pattern mining approach to learn event detection models from complex multivariate temporal data, such as electronic health records. We present recent temporal pattern mining, a novel approach for efficiently finding predictive patterns for event detection problems. This approach first converts the time series data into time-interval sequences of temporal abstractions. It then constructs more complex time-interval patterns backward in time using temporal operators. We also present the minimal predictive recent temporal patterns framework for selecting a small set of predictive and non-spurious patterns. We apply our methods for predicting adverse medical events in real-world clinical data. The results demonstrate the benefits of our methods in learning accurate event detection models, which is a key step for developing intelligent patient monitoring and decision support systems.
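The first step described above, converting a time series into time-interval sequences of temporal abstractions, can be sketched as follows. The three-state value abstraction (low, normal, high), the thresholds, and the function names are assumptions made for this example; the abstract does not specify which abstractions the authors use.

```python
def abstract_value(value, low, high):
    """Map a numeric reading to a coarse state: 'L', 'N', or 'H'."""
    if value < low:
        return "L"
    if value > high:
        return "H"
    return "N"

def to_intervals(times, values, low, high):
    """Merge consecutive readings with the same state into (state, start, end)."""
    intervals = []
    for t, v in zip(times, values):
        state = abstract_value(v, low, high)
        if intervals and intervals[-1][0] == state:
            # Same state as the previous reading: extend the open interval.
            intervals[-1] = (state, intervals[-1][1], t)
        else:
            intervals.append((state, t, t))
    return intervals

# Hypothetical lab readings with assumed thresholds low=4 and high=10.
times = [0, 1, 2, 3, 4]
values = [5, 6, 12, 13, 7]
intervals = to_intervals(times, values, low=4, high=10)
print(intervals)  # [('N', 0, 1), ('H', 2, 3), ('N', 4, 4)]
```

Patterns are then built over such intervals, starting from the most recent one and extending backward in time.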
