Janez Demšar | Researchain

Publication

Featured research published by Janez Demšar.

European Conference on Principles of Data Mining and Knowledge Discovery | 2004

Orange: from experimental machine learning to interactive data mining

Janez Demšar; Blaž Zupan; Gregor Leban; Tomaz Curk

Orange (www.ailab.si/orange) is a suite for machine learning and data mining. For researchers in machine learning, Orange offers scripting to easily prototype new algorithms and experimental procedures. For explorative data analysis, it provides a visual programming framework with emphasis on interactions and creative combinations of visual components.

Bioinformatics | 2005

Microarray data mining with visual programming

Tomaz Curk; Janez Demšar; Qikai Xu; Gregor Leban; Uroš Petrovič; Ivan Bratko; Gad Shaulsky; Blaz Zupan

Visual programming offers an intuitive means of combining known analysis and visualization methods into powerful applications. The system presented here enables users who are not programmers to manage microarray and genomic data flow and to customize their analyses by combining common data analysis tools to fit their needs. Availability: http://www.ailab.si/supp/bi-visprog. Supplementary information: http://www.ailab.si/supp/bi-visprog.

Nature Genetics | 2005

Epistasis analysis with global transcriptional phenotypes

Nancy Van Driessche; Janez Demšar; Ezgi O. Booth; Paul Hill; Peter Juvan; Blaz Zupan; Adam Kuspa; Gad Shaulsky

Classical epistasis analysis can determine the order of function of genes in pathways using morphological, biochemical and other phenotypes. It requires knowledge of the pathway's phenotypic output and a variety of experimental expertise, and so is unsuitable for genome-scale analysis. Here we used microarray profiles of mutants as phenotypes for epistasis analysis. Considering genes that regulate activity of protein kinase A in Dictyostelium, we identified known and unknown epistatic relationships and reconstructed a genetic network with microarray phenotypes alone. This work shows that microarray data can provide a uniform, quantitative tool for large-scale genetic network analysis.

European Conference on Principles of Data Mining and Knowledge Discovery | 2004

Nomograms for visualization of naive Bayesian classifier

Martin Možina; Janez Demšar; Michael W. Kattan; Blaz Zupan

Besides good predictive performance, the naive Bayesian classifier can also offer valuable insight into the structure of the training data and the effects of the attributes on the class probabilities. This structure may be effectively revealed through visualization of the classifier. We propose a new way to visualize the naive Bayesian model in the form of a nomogram. The advantages of the proposed method are simplicity of presentation, clear display of the effects of individual attribute values, and visualization of confidence intervals. Nomograms are intuitive and, when used for decision support, can provide a visual explanation of predicted probabilities. Finally, with a nomogram, a naive Bayesian model can be printed out and used for probability prediction without the use of a computer or calculator.
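The idea of reading a naive Bayesian model off a chart can be illustrated with a minimal sketch. It assumes discrete attributes, a binary target and Laplace smoothing; the function names `nomogram_points` and `predict_probability` are hypothetical and are not taken from the paper or from Orange. Each attribute value gets a "point" score equal to its log odds ratio, and a prediction is the prior log odds plus the sum of the points, mapped back to a probability.

```python
import math
from collections import Counter


def nomogram_points(rows, labels, target):
    """Return (points, prior_log_odds) for a naive Bayes model.

    points maps (attribute_index, value) to the log odds ratio that the
    value contributes in favour of `target`. Laplace smoothing is an
    assumption of this sketch.
    """
    pos = [r for r, y in zip(rows, labels) if y == target]
    neg = [r for r, y in zip(rows, labels) if y != target]
    points = {}
    for i in range(len(rows[0])):
        values = sorted({r[i] for r in rows})
        pos_counts = Counter(r[i] for r in pos)
        neg_counts = Counter(r[i] for r in neg)
        for v in values:
            # Smoothed P(value | target) and P(value | not target).
            p_v_pos = (pos_counts[v] + 1) / (len(pos) + len(values))
            p_v_neg = (neg_counts[v] + 1) / (len(neg) + len(values))
            points[(i, v)] = math.log(p_v_pos / p_v_neg)
    # Prior log odds; both classes must be present in the data.
    prior_log_odds = math.log(len(pos) / len(neg))
    return points, prior_log_odds


def predict_probability(points, prior_log_odds, row):
    """Sum the points for one example and convert log odds to probability."""
    total = prior_log_odds + sum(points[(i, v)] for i, v in enumerate(row))
    return 1 / (1 + math.exp(-total))
```

Because the contributions are additive on the log odds scale, each attribute can be drawn as its own axis and the sum looked up on a final probability axis, which is what makes a printed chart usable by hand.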

Bioinformatics | 2003

GenePath: a System for Automated Construction of Genetic Networks from Mutant Data

Blaz Zupan; Janez Demšar; Ivan Bratko; Peter Juvan; John A. Halter; Adam Kuspa; Gad Shaulsky

Motivation: Genetic networks are often used in the analysis of biological phenomena. In classical genetics, they are constructed manually from experimental data on mutants. The field lacks formalism to guide such analysis, and accounting for all the data becomes complicated when large amounts of data are considered. Results: We have developed GenePath, an intelligent assistant that automates the analysis of genetic data. GenePath employs expert-defined patterns to uncover gene relations from the data, and uses these relations as constraints in the search for a plausible genetic network. GenePath formalizes genetic data analysis, facilitates the consideration of all the available data in a consistent manner, and the examination of the large number of possible consequences of planned experiments. It also provides an explanation mechanism that traces every finding to the pertinent data. Availability: GenePath can be accessed at http://genepath.org. Supplementary information: Supplementary material is available at http://genepath.org/bi-.supp.

European Conference on Artificial Intelligence | 1999

Machine Learning for Survival Analysis: A Case Study on Recurrence of Prostate Cancer

Blaz Zupan; Janez Demšar; Michael W. Kattan; J. Robert Beck; Ivan Bratko

Machine learning techniques have recently received considerable attention, especially when used for the construction of prediction models from data. Despite their potential advantages over standard statistical methods, such as their ability to model non-linear relationships and construct symbolic and interpretable models, their applications to survival analysis are at best rare, primarily because of the difficulty of appropriately handling censored data. In this paper we propose a schema that enables the use of classification methods, including machine learning classifiers, for survival analysis. To appropriately consider the follow-up time and censoring, we propose a technique that, for patients for whom the event did not occur and who have short follow-up times, estimates their probability of event and assigns them a distribution of outcome accordingly. Since most machine learning techniques do not deal with outcome distributions, the schema is implemented using weighted examples. To show the utility of the proposed technique, we investigate a particular problem of building prognostic models for prostate cancer recurrence, where only the probability of the event (and not its dependence on time) is of interest. A case study on preoperative and postoperative prostate cancer recurrence prediction shows that by incorporating this weighting technique the machine learning tools stand beside modern statistical methods and may, by inducing symbolic recurrence models, provide further insight into relationships within the modeled data.
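The weighting schema described in this abstract can be sketched roughly as follows. The abstract does not say how the event probability is estimated, so `event_probability` is a hypothetical callable supplied by the caller, and the record layout is likewise an assumption of this sketch. A censored patient is split into two weighted copies, one per outcome, so that any classifier accepting example weights can be trained on the result.

```python
def weight_examples(patients, event_probability):
    """Turn follow-up records into weighted classification examples.

    patients: list of (features, event_observed, followup_years).
    event_probability: hypothetical callable that returns, for a censored
        patient, the estimated probability that the event still occurs
        given the follow-up time so far (estimation method left open).
    Returns a list of (features, label, weight).
    """
    examples = []
    for features, event_observed, followup in patients:
        if event_observed:
            # Observed events are certain: one example with full weight.
            examples.append((features, 1, 1.0))
            continue
        p = event_probability(followup)
        if p > 0:
            # Censored: a fractional "event" copy ...
            examples.append((features, 1, p))
        # ... and a complementary "no event" copy.
        examples.append((features, 0, 1.0 - p))
    return examples
```

With a long enough follow-up the estimated probability drops to zero and the patient becomes an ordinary negative example, so only patients with short follow-up are actually split.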

Artificial Intelligence | 1999

Learning by discovering concept hierarchies

Blaž Zupan; Marko Bohanec; Ivan Bratko; Janez Demšar

We present a new machine learning method that, given a set of training examples, induces a definition of the target concept in terms of a hierarchy of intermediate concepts and their definitions. This effectively decomposes the problem into smaller, less complex problems. The method is inspired by the Boolean function decomposition approach to the design of switching circuits. To cope with the high time complexity of finding an optimal decomposition, we propose a suboptimal heuristic algorithm. The method, implemented in the program HINT (Hierarchy INduction Tool), is experimentally evaluated using a set of artificial and real-world learning problems. In particular, the evaluation addresses the generalization property of decomposition and its capability to discover meaningful hierarchies. The experiments show that HINT performs well in both respects.

Bioinformatics | 2014

A combinatorial approach to graphlet counting

Tomaž Hočevar; Janez Demšar

Motivation: Small induced subgraphs called graphlets are emerging as a possible tool for exploration of global and local structure of networks and for analysis of roles of individual nodes. One of the obstacles to their wider use is the computational complexity of algorithms for their discovery and counting. Results: We propose a new combinatorial method for counting graphlets and orbit signatures of network nodes. The algorithm builds a system of equations that connect counts of orbits from graphlets with up to five nodes, which allows all orbit counts to be computed by enumerating just a single one. This reduces its practical time complexity in sparse graphs by an order of magnitude compared with the existing pure enumeration-based algorithms. Availability and implementation: Source code is available freely at http://www.biolab.si/supp/orca/orca.html.
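The principle of enumerating one orbit and deriving the others from equations can be illustrated on three-node graphlets. This is only a small analogue chosen for the sketch, not the system of equations from the paper, and the function name `three_node_orbits` is hypothetical. Every pair of neighbours of a node forms either a triangle or an open path centred at that node, so once triangles are enumerated the path-centre count follows from the degree.

```python
from itertools import combinations


def three_node_orbits(adj):
    """Count, per node, triangles and open paths with the node at the centre.

    adj: dict mapping each node to the set of its neighbours
         (undirected graph, no self-loops).
    """
    counts = {}
    for v, nbrs in adj.items():
        # Enumerate a single orbit: triangles that contain v.
        triangles = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        # Derive the other orbit from an equation instead of enumerating:
        # C(deg, 2) = triangles + paths centred at v.
        deg = len(nbrs)
        path_centres = deg * (deg - 1) // 2 - triangles
        counts[v] = {"triangle": triangles, "path_centre": path_centres}
    return counts
```

For a triangle 0-1-2 with a pendant node 3 attached to node 0, node 0 has three neighbour pairs, one of which closes a triangle, leaving two open paths centred at it.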

Knowledge Discovery and Data Mining | 2005

Nomograms for visualizing support vector machines

Aleks Jakulin; Martin Možina; Janez Demšar; Ivan Bratko; Blaž Zupan

We propose a simple yet potentially very effective way of visualizing trained support vector machines. Nomograms are an established model visualization technique that can graphically encode the complete model on a single page. The dimensionality of the visualization does not depend on the number of attributes, but merely on the properties of the kernel. To represent the effect of each predictive feature on the log odds ratio scale as required for the nomograms, we employ logistic regression to convert the distance from the separating hyperplane into a probability. Case studies on selected data sets show that, for a technique thought to be a black box, nomograms can clearly expose its internal structure. An easy-to-interpret visualization lets analysts gain insight and study the effects of predictive factors.
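The calibration step mentioned here, a logistic regression on the distance from the hyperplane, might look like the sketch below. It fits a one-dimensional sigmoid by plain gradient descent; the function names, learning rate and step count are assumptions of this sketch, and the paper's actual fitting procedure is not described in the abstract.

```python
import math


def fit_sigmoid(distances, labels, lr=0.1, steps=5000):
    """Fit p = 1 / (1 + exp(-(a * d + b))) by gradient descent on log loss.

    distances: signed distances from the separating hyperplane.
    labels: 0/1 class labels for the same examples.
    """
    a, b = 0.0, 0.0
    n = len(distances)
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for d, y in zip(distances, labels):
            p = 1.0 / (1.0 + math.exp(-(a * d + b)))
            grad_a += (p - y) * d
            grad_b += p - y
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b


def to_probability(d, a, b):
    """Map a distance to a probability with the fitted sigmoid."""
    return 1.0 / (1.0 + math.exp(-(a * d + b)))
```

Since `a * d + b` is itself a log odds value, the per-feature terms that make up the distance can be placed on the same scale a nomogram uses.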

Artificial Intelligence in Medicine | 2000

Machine learning for survival analysis: a case study on recurrence of prostate cancer

Blaz Zupan; Janez Demšar; Michael W. Kattan; J. Robert Beck; Ivan Bratko

Machine learning techniques have recently received considerable attention, especially when used for the construction of prediction models from data. Despite their potential advantages over standard statistical methods, such as their ability to model non-linear relationships and construct symbolic and interpretable models, their applications to survival analysis are at best rare, primarily because of the difficulty of appropriately handling censored data. In this paper we propose a schema that enables the use of classification methods, including machine learning classifiers, for survival analysis. To appropriately consider the follow-up time and censoring, we propose a technique that, for patients for whom the event did not occur and who have short follow-up times, estimates their probability of event and assigns them a distribution of outcome accordingly. Since most machine learning techniques do not deal with outcome distributions, the schema is implemented using weighted examples. To show the utility of the proposed technique, we investigate a particular problem of building prognostic models for prostate cancer recurrence, where only the probability of the event (and not its dependence on time) is of interest. A case study on preoperative and postoperative prostate cancer recurrence prediction shows that by incorporating this weighting technique the machine learning tools stand beside modern statistical methods and may, by inducing symbolic recurrence models, provide further insight into relationships within the modeled data.
