Publication


Featured research published by Sorin Alexe.


Discrete Applied Mathematics | 2004

Consensus algorithms for the generation of all maximal bicliques

Gabriela Alexe; Sorin Alexe; Yves Crama; Stephan Foldes; Peter L. Hammer; Bruno Simeone

We describe a new algorithm for generating all maximal bicliques (i.e., complete bipartite, not necessarily induced, subgraphs) of a graph. The algorithm is inspired by, and is quite similar to, the consensus method used in propositional logic. We show that some variants of the algorithm are totally polynomial, and even incrementally polynomial. The total complexity of the most efficient variant presented here is polynomial in the input size and only linear in the output size. Computational experiments demonstrate its high efficiency on randomly generated graphs with up to 2,000 vertices and 20,000 edges.
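The consensus idea behind the algorithm can be illustrated on bipartite graphs: start from the "star" bicliques ({u}, N(u)), repeatedly form the consensus of two bicliques by intersecting their right-hand sides, and close every candidate to maximality. The sketch below is a naive fixpoint version of this idea, not the authors' optimized, totally polynomial variant; the function names are ours.

```python
from itertools import combinations

def maximal_bicliques(left, adj):
    """Enumerate all maximal bicliques (X, Y) of a bipartite graph.

    `left` is the set of left vertices and `adj[u]` the neighbor set of u.
    Consensus-style sketch: seed with the star bicliques ({u}, N(u)),
    repeatedly take the consensus (Y1 & Y2) of pairs, and close each
    candidate to maximality.  Brute force, for illustration only.
    """
    def close(Y):
        # Extend Y to the unique maximal biclique it generates.
        X = frozenset(u for u in left if Y <= adj[u])
        Y = frozenset.intersection(*(frozenset(adj[u]) for u in X))
        return (X, Y)

    found = {close(frozenset(adj[u])) for u in left if adj[u]}
    changed = True
    while changed:  # iterate the consensus step to a fixpoint
        changed = False
        for (_, Y1), (_, Y2) in list(combinations(found, 2)):
            Y = Y1 & Y2
            if Y:
                cand = close(Y)
                if cand not in found:
                    found.add(cand)
                    changed = True
    return found
```

On the path 1—a, 1—b, 2—b, 2—c this returns the three maximal bicliques ({1}, {a,b}), ({2}, {b,c}) and ({1,2}, {b}); the last one only appears through the consensus step.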


Annals of Operations Research | 2003

Coronary Risk Prediction by Logical Analysis of Data

Sorin Alexe; Eugene H. Blackstone; Peter L. Hammer; Hemant Ishwaran; Michael S. Lauer; Claire E Snader

The objective of this study was to distinguish, within a population of patients with known or suspected coronary artery disease, groups at high and at low mortality rates. The study was based on the Cleveland Clinic Foundation's dataset of 9454 patients, of whom 312 died during an observation period of 9 years. The Logical Analysis of Data method was adapted to handle the disproportionate sizes of the two groups of patients, and the inseparable character of this dataset, which is typical of many medical problems. As a result of the study, we identified a high-risk group of patients representing 1/5 of the population, with a mortality rate 4 times higher than the average, and including 3/4 of the patients who died. The low-risk group identified in the study, representing approximately 4/5 of the population, had a mortality rate 3 times lower than the average. A Prognostic Index derived from the LAD model is shown to have an 83.95% correlation with the mortality rate of patients. The classification given by the Prognostic Index was also shown to agree in 3 out of 4 cases with that of the Cox Score, widely used by cardiologists, and to outperform it slightly but consistently. An example of a highly reliable risk stratification system using both indicators is provided.


Breast Cancer Research | 2006

Breast cancer prognosis by combinatorial analysis of gene expression data

Gabriela Alexe; Sorin Alexe; David E. Axelrod; Tibérius O. Bonates; Irina Lozina; Michael Reiss; Peter L. Hammer

Introduction: The potential of applying data analysis tools to microarray data for diagnosis and prognosis is illustrated on the recent breast cancer dataset of van 't Veer and coworkers. We re-examine that dataset using the novel technique of logical analysis of data (LAD), with the double objective of discovering patterns characteristic for cases with good or poor outcome, using them for accurate and justifiable predictions; and deriving novel information about the role of genes, the existence of special classes of cases, and other factors.

Methods: Data were analyzed using the combinatorics and optimization-based method of LAD, recently shown to provide highly accurate diagnostic and prognostic systems in cardiology, cancer proteomics, hematology, pulmonology, and other disciplines.

Results: LAD identified a subset of 17 of the 25,000 genes, capable of fully distinguishing between patients with poor, respectively good, prognoses. An extensive list of patterns or combinatorial biomarkers (that is, combinations of genes and limitations on their expression levels) was generated, and 40 patterns were used to create a prognostic system, shown to have 100% and 92.9% weighted accuracy on the training and test sets, respectively. The prognostic system uses fewer genes than other methods, and has similar or better accuracy than those reported in other studies. Of the 17 genes identified by LAD, three (respectively, five) were shown to play a significant role in determining poor (respectively, good) prognosis. Two new classes of patients (described by similar sets of covering patterns, gene expression ranges, and clinical features) were discovered. As a by-product of the study, it is shown that the training and the test sets of van 't Veer have differing characteristics.

Conclusion: The study shows that LAD provides an accurate and fully explanatory prognostic system for breast cancer using genomic data (that is, a system that, in addition to predicting good or poor prognosis, provides an individualized explanation of the reasons for that prognosis for each patient). Moreover, the LAD model provides valuable insights into the roles of individual and combinatorial biomarkers, allows the discovery of new classes of patients, and generates a vast library of biomedical research hypotheses.


Circulation | 2002

Use of the logical analysis of data method for assessing long-term mortality risk after exercise electrocardiography

Michael S. Lauer; Sorin Alexe; Claire E Snader; Eugene H. Blackstone; Hemant Ishwaran; Peter L. Hammer

Background—Logical Analysis of Data is a methodology of mathematical optimization on the basis of the systematic identification of patterns or “syndromes.” In this study, we used Logical Analysis of Data for risk stratification and compared it to regression techniques. Methods and Results—Using a cohort of 9454 patients referred for exercise testing, Logical Analysis of Data was applied to identify syndromes based on 20 variables. High-risk syndromes were patterns of up to 3 findings associated with >5-fold increase in risk of death, whereas low-risk syndromes were associated with >5-fold decrease. Syndromes were derived on a randomly derived training set of 4722 patients and validated in 4732 others. There were 15 high-risk and 26 low-risk syndromes. A risk score was derived based on the proportion of possible high risk and low risk syndromes present. A value ≥0, meaning the same or a greater proportion of high-risk syndromes, was noted in 979 patients (21%) in the validation set and was predictive of 5-year death (11% versus 1%, hazard ratio 8.3, 95% CI 5.9 to 11.6, P<0.0001), accounting for 67% of events. Calibration of expected versus observed death rates based on Logical Analysis of Data and Cox regression showed that both methods performed very well. Conclusion—Using the Logical Analysis of Data method, we identified subsets of patients who had an increased risk and who also accounted for the majority of deaths. Future research is needed to determine how best to use this technique for risk stratification.
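The risk score described in the abstract can be sketched directly: a patient's score is the fraction of high-risk syndromes matched minus the fraction of low-risk syndromes matched, so a value ≥ 0 flags the high-risk group. In the sketch below a syndrome is encoded as inclusive bounds on up to 3 variables; the encoding and the sample syndromes are illustrative stand-ins, not the 15 high-risk and 26 low-risk syndromes from the study.

```python
def risk_score(patient, high_risk, low_risk):
    """Proportion of high-risk syndromes present minus the proportion of
    low-risk syndromes present; a score >= 0 flags the high-risk group.

    A syndrome is a dict mapping a variable name to an inclusive
    (low, high) interval -- an illustrative encoding, not the published
    syndromes."""
    def present(syndrome):
        return all(lo <= patient[var] <= hi
                   for var, (lo, hi) in syndrome.items())
    hi_frac = sum(map(present, high_risk)) / len(high_risk)
    lo_frac = sum(map(present, low_risk)) / len(low_risk)
    return hi_frac - lo_frac
```

A patient matching every high-risk syndrome and no low-risk one scores 1.0; the reverse case scores -1.0, with real patients falling in between.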


Discrete Applied Mathematics | 2006

Accelerated algorithm for pattern detection in logical analysis of data

Sorin Alexe; Peter L. Hammer

Sets of positive and negative points (observations) in n-dimensional discrete space given along with their non-negative integer multiplicities are analyzed from the perspective of the Logical Analysis of Data (LAD). A set of observations satisfying upper and/or lower bounds imposed on certain components is called a positive pattern if it contains some positive observations and no negative one. The number of variables on which such restrictions are imposed is called the degree of the pattern. A total polynomial algorithm is proposed for the enumeration of all patterns of limited degree, and special efficient variants of it for the enumeration of all patterns with certain sign and coverage requirements are presented and evaluated on a publicly available collection of benchmark datasets.
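For binary data the definitions above can be stated very compactly: a pattern of degree d fixes the values of d variables, and it is a positive pattern when it covers at least one positive observation and no negative one. The brute-force sketch below only illustrates these definitions; the paper's contribution is an accelerated, total polynomial enumeration, which this naive version is not.

```python
from itertools import combinations, product

def positive_patterns(pos, neg, max_degree):
    """All positive patterns of degree <= max_degree over 0/1 observations.

    A pattern is a dict {variable index: required value}; it is positive
    when it covers at least one positive observation and no negative one.
    Naive enumeration for illustration, not the accelerated algorithm."""
    n = len(pos[0])
    patterns = []
    for d in range(1, max_degree + 1):
        for variables in combinations(range(n), d):
            for values in product((0, 1), repeat=d):
                pat = dict(zip(variables, values))
                covers = lambda obs: all(obs[i] == v for i, v in pat.items())
                if any(map(covers, pos)) and not any(map(covers, neg)):
                    patterns.append(pat)
    return patterns
```

For example, with positives (1,1,0), (1,0,1) and negatives (0,1,1), (0,0,0), the only degree-1 positive pattern is {x0 = 1}: it covers both positives and neither negative.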


Discrete Applied Mathematics | 2008

Comprehensive vs. comprehensible classifiers in logical analysis of data

Gabriela Alexe; Sorin Alexe; Peter L. Hammer; Alexander Kogan

The main objective of this paper is to compare the classification accuracy provided by large, comprehensive collections of patterns (rules) derived from archives of past observations, with that provided by small, comprehensible collections of patterns. This comparison is carried out here on the basis of an empirical study, using several publicly available data sets. The results of this study show that the use of comprehensive collections allows a slight increase of classification accuracy, and that the cost of comprehensibility is small.


Artificial Intelligence in Medicine | 2005

Logical analysis of diffuse large B-cell lymphomas

Gabriela Alexe; Sorin Alexe; David E. Axelrod; Peter L. Hammer; D. Weissmann

OBJECTIVE: The goal of this study is to re-examine the oligonucleotide microarray dataset of Shipp et al., which contains the intensity levels of 6817 genes of 58 patients with diffuse large B-cell lymphoma (DLBCL) and 19 with follicular lymphoma (FL), by means of the combinatorics, optimisation, and logic-based methodology of logical analysis of data (LAD). The motivations for this new analysis included the previously demonstrated capabilities of LAD and its expected potential (1) to identify different informative genes than those discovered by conventional statistical methods, (2) to identify combinations of gene expression levels capable of characterizing different types of lymphoma, and (3) to assemble collections of such combinations that, if considered jointly, are capable of accurately distinguishing different types of lymphoma.

METHODS AND MATERIALS: The central concept of LAD is a pattern or combinatorial biomarker, a concept that resembles a rule as used in decision tree methods. LAD is able to exhaustively generate the collection of all those patterns which satisfy certain quality constraints, through a systematic combinatorial process guided by clear optimization criteria. Then, based on a set covering approach, LAD aggregates the collection of patterns into classification models. In addition, LAD is able to use the information provided by large collections of patterns in order to extract subsets of variables which collectively are able to distinguish between different types of disease.

RESULTS: For the differential diagnosis of DLBCL versus FL, a model based on eight significant genes is constructed and shown to have a sensitivity of 94.7% and a specificity of 100% on the test set. For the prognosis of good versus poor outcome among the DLBCL patients, a model is constructed on another set also consisting of eight significant genes, and shown to have a sensitivity of 87.5% and a specificity of 90% on the test set. The genes selected by LAD also work well as a basis for other kinds of statistical analysis, indicating their robustness.

CONCLUSION: These two models exhibit accuracies that compare favorably to those in the original study. In addition, the current study also provides a ranking by importance of the genes in the selected significant subsets, as well as a library of dozens of combinatorial biomarkers (i.e., pairs or triplets of genes) that can serve as a source of mathematically generated, statistically significant research hypotheses in need of biological explanation.


Annals of Operations Research | 2006

Pattern-based feature selection in genomics and proteomics

Gabriela Alexe; Sorin Alexe; Peter L. Hammer; Béla Vizvári

A major difficulty in bioinformatics is due to the size of the datasets, which contain frequently large numbers of variables. In this study, we present a two-step procedure for feature selection. In a first “filtering” stage, a relatively small subset of features is identified on the basis of several criteria. In the second stage, the importance of the selected variables is evaluated based on the frequency of their participation in relevant patterns and low impact variables are eliminated. This step is applied iteratively, until arriving to a Pareto-optimal “support set”, which balances the conflicting criteria of simplicity and accuracy.
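The second, iterative stage can be sketched as a frequency count: each variable is scored by how often it participates in the relevant patterns, and low-impact variables below a threshold are dropped before patterns are regenerated on the survivors. A minimal single-step version, where the pattern representation and the threshold are our assumptions:

```python
from collections import Counter

def prune_low_impact(patterns, threshold):
    """Score each variable by the fraction of patterns it participates in,
    and return (scores, surviving variables).

    One elimination step of the iterative second stage; each pattern is a
    dict mapping variable -> required value (an illustrative encoding)."""
    counts = Counter(v for pat in patterns for v in pat)
    scores = {v: counts[v] / len(patterns) for v in counts}
    keep = {v for v, s in scores.items() if s >= threshold}
    return scores, keep
```

Applied iteratively, with patterns regenerated on the surviving variables after each round, this kind of step drives the procedure toward the Pareto-optimal support set described above.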


Annals of Mathematics and Artificial Intelligence | 2007

Logical analysis of data – the vision of Peter L. Hammer

Gabriela Alexe; Sorin Alexe; Tibérius O. Bonates; Alexander Kogan

Logical analysis of data (LAD) is a special data analysis methodology which combines ideas and concepts from optimization, combinatorics, and Boolean functions. The central concept in LAD is that of patterns, or rules, which were found to play a critical role in classification, ranked regression, clustering, detection of subclasses, feature selection and other problems. The research area of LAD was defined and initiated by Peter L. Hammer, who was the catalyst of the LAD oriented research for decades, and whose consistent vision and efforts helped the methodology to move from theory to data analysis applications, to achieve maturity and to be successful in many medical, industrial and economics case studies. This overview presents some of the basic aspects of LAD, from the definition of the main concepts to the efficient algorithms for pattern generation, and from the complexity analysis of the difficult problems embedded in LAD to its biomedical applications. We focus in this paper only on some recent developments in LAD which were of particular interest to Peter L. Hammer, who played a key role in obtaining all the results described here. The presentation in this overview is based on the original publications of Peter L. Hammer and his co-authors. We dedicate this paper to the memory of Peter L. Hammer.


Soft Computing | 2006

Pattern-based clustering and attribute analysis

Gabriela Alexe; Sorin Alexe; Peter L. Hammer

The logical analysis of data (LAD) is a combinatorics, optimization and logic based methodology for the analysis of datasets with binary or numerical input variables, and binary outcomes. It has been established in previous studies that LAD provides a competitive classification tool comparable in efficiency with the top classification techniques available. The goal of this paper is to show that the methodology of LAD can be useful in the discovery of new classes of observations and in the analysis of attributes. After a brief description of the main concepts of LAD, two efficient combinatorial algorithms are described for the generation of all prime, respectively all spanned, patterns (rules) satisfying certain conditions. It is shown that the application of classic clustering techniques to the set of observations represented in prime pattern space leads to the identification of a subclass of, say positive, observations, which is accurately recognizable, and is sharply distinct from the observations in the opposite, say negative, class. It is also shown that the set of all spanned patterns allows the introduction of a measure of significance and of a concept of monotonicity in the set of attributes.
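The clustering step works on observations re-represented in "pattern space": each observation becomes a 0/1 vector recording which patterns cover it, and classic clustering techniques are then applied to these vectors. A small sketch of that re-representation, with the pattern encoding as our assumption:

```python
def to_pattern_space(observations, patterns):
    """Map each observation to its 0/1 incidence vector over a pattern set;
    entry j is 1 exactly when pattern j covers the observation.

    Standard clustering (k-means, hierarchical, ...) can then be run on
    these vectors instead of the raw attribute vectors."""
    def covers(pat, obs):
        # A pattern covers an observation when every fixed variable matches.
        return all(obs[i] == v for i, v in pat.items())
    return [[int(covers(p, obs)) for p in patterns] for obs in observations]
```

Observations covered by similar sets of patterns end up close together in this space, which is what makes sharply distinct subclasses visible to ordinary clustering.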

Collaboration


Dive into Sorin Alexe's collaborations.

Top Co-Authors

Michael Reiss

University of Medicine and Dentistry of New Jersey
