Zeeshan Syed
University of Michigan
Publications
Featured research published by Zeeshan Syed.
Physiological Measurement | 2016
Chengyu Liu; David Springer; Qiao Li; Benjamin Moody; Ricardo Abad Juan; Francisco J. Chorro; Francisco Castells; José Millet Roig; Ikaro Silva; Alistair E. W. Johnson; Zeeshan Syed; Samuel Schmidt; Chrysa D. Papadaniil; Hosein Naseri; Ali Moukadem; Alain Dieterlen; Christian Brandt; Hong Tang; Maryam Samieinasab; Mohammad Reza Samieinasab; Reza Sameni; Roger G. Mark; Gari D. Clifford
In the past few decades, analysis of heart sound signals (i.e. the phonocardiogram or PCG), especially for automated heart sound segmentation and classification, has been widely studied and has been reported to have potential value for detecting pathology accurately in clinical applications. However, comparative analyses of algorithms in the literature have been hindered by the lack of high-quality, rigorously validated, and standardized open databases of heart sound recordings. This paper describes a public heart sound database, assembled for an international competition, the PhysioNet/Computing in Cardiology (CinC) Challenge 2016. The archive comprises nine different heart sound databases sourced from multiple research groups around the world. It includes 2435 heart sound recordings in total, collected from 1297 healthy subjects and patients with a variety of conditions, including heart valve disease and coronary artery disease. The recordings were collected in a variety of clinical and nonclinical environments (such as in-home visits) using a range of equipment. Recording lengths varied from several seconds to several minutes. This article reports detailed information about the subjects/patients, including demographics (number, age, gender), recordings (number, location, state, and length), associated synchronously recorded signals, sampling frequency, and sensor type used. We also provide a brief summary of the commonly used heart sound segmentation and classification methods, including open source code provided concurrently for the Challenge. A description of the PhysioNet/CinC Challenge 2016, including the main aims, the training and test sets, the hand-corrected annotations for different heart sound states, the scoring mechanism, and the associated open source code, is provided. In addition, several potential benefits of the public heart sound database are discussed.
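As a flavor of the commonly used segmentation front-ends the paper summarizes, the sketch below computes a smoothed Shannon-energy envelope of a PCG recording, a typical first step before locating heart sound states. The file name, band edges, and frame length are illustrative assumptions and are not taken from the Challenge code.

    # Minimal sketch of a Shannon-energy envelope for a PCG recording.
    # The recording name, filter band, and frame length are assumptions.
    import numpy as np
    from scipy.io import wavfile
    from scipy.signal import butter, filtfilt

    def shannon_envelope(pcg, fs, frame_ms=20):
        # Band-pass 25-400 Hz, where most heart sound energy lies
        b, a = butter(4, [25 / (fs / 2), 400 / (fs / 2)], btype="band")
        x = filtfilt(b, a, pcg.astype(float))
        x = x / (np.max(np.abs(x)) + 1e-12)            # normalize amplitude
        se = -x**2 * np.log(x**2 + 1e-12)               # Shannon energy per sample
        win = int(fs * frame_ms / 1000)
        return np.convolve(se, np.ones(win) / win, mode="same")   # smoothed envelope

    if __name__ == "__main__":
        fs, pcg = wavfile.read("a0001.wav")             # hypothetical recording file
        env = shannon_envelope(pcg, fs)
        print(env.shape, env.max())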
IEEE Transactions on Biomedical Engineering | 2007
Zeeshan Syed; Daniel Leeds; Daniel Curtis; Francesca Nesta; Robert A. Levine; John V. Guttag
Skilled cardiologists perform cardiac auscultation, acquiring and interpreting heart sounds, by implicitly carrying out a sequence of steps. These include discarding clinically irrelevant beats, selectively tuning in to particular frequencies, and aggregating information across time to make a diagnosis. In this paper, we formalize a series of analytical stages for processing heart sounds, propose algorithms to enable computers to approximate these steps, and investigate the effectiveness of each step in extracting relevant information from actual patient data. Through such reasoning, we provide insight into the relative difficulty of the various tasks involved in the accurate interpretation of heart sounds. We also evaluate the contribution of each analytical stage in the overall assessment of patients. We expect our framework and associated software to be useful to educators wanting to teach cardiac auscultation, and to primary care physicians, who can benefit from presentation tools for computer-assisted diagnosis of cardiac disorders. Researchers may also employ the comprehensive processing provided by our framework to develop more powerful, fully automated auscultation applications.
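A minimal skeleton of the staged processing described above (screening out clinically irrelevant beats, then aggregating information across beats) might look like the following; the correlation threshold and the synthetic input are assumptions made purely for illustration, and the frequency-selection stage is omitted.

    # Sketch of two of the analytical stages: beat screening and aggregation.
    # Threshold and synthetic data are illustrative assumptions.
    import numpy as np

    def discard_irrelevant_beats(beats, min_corr=0.8):
        # Screen out beats that correlate poorly with the median beat (noise/artifact filter)
        template = np.median(beats, axis=0)
        corr = np.array([np.corrcoef(b, template)[0, 1] for b in beats])
        return beats[corr >= min_corr]

    def aggregate(beats):
        # Aggregate information across time by averaging the retained beats
        return beats.mean(axis=0)

    rng = np.random.default_rng(0)
    base = np.sin(np.linspace(0, 2 * np.pi, 800))          # synthetic cardiac cycle
    beats = base + 0.1 * rng.standard_normal((50, 800))    # 50 noisy segmented beats
    clean = discard_irrelevant_beats(beats)
    print(len(clean), aggregate(clean).shape)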
EURASIP Journal on Advances in Signal Processing | 2007
Zeeshan Syed; John V. Guttag; Collin M. Stultz
This paper describes novel, fully automated techniques for analyzing large amounts of cardiovascular data. In contrast to traditional medical expert systems, our techniques incorporate no a priori knowledge about disease states. This facilitates the discovery of unexpected events. We start by transforming continuous waveform signals into symbolic strings derived directly from the data. Morphological features are used to partition heart beats into clusters by maximizing the dynamic time-warped sequence-aligned separation of clusters. Each cluster is assigned a symbol, and the original signal is replaced by the corresponding sequence of symbols. The symbolization process allows us to shift from the analysis of raw signals to the analysis of sequences of symbols. This discrete representation reduces the amount of data by several orders of magnitude, making the search space for discovering interesting activity more manageable. We describe techniques that operate in this symbolic domain to discover rhythms, transient patterns, abnormal changes in entropy, and clinically significant relationships among multiple streams of physiological data. We tested our techniques on cardiologist-annotated ECG data from forty-eight patients. Our process for labeling heart beats produced results that were consistent with the cardiologist-supplied labels 98.6% of the time, and often provided relevant finer-grained distinctions. Our higher-level analysis techniques proved effective at identifying clinically relevant activity not only from symbolized ECG streams, but also from multimodal data obtained by symbolizing ECG and other physiological data streams. Using no prior knowledge, our analysis techniques uncovered examples of ventricular bigeminy and trigeminy, ectopic atrial rhythms with aberrant ventricular conduction, paroxysmal atrial tachyarrhythmias, atrial fibrillation, and pulsus paradoxus.
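The symbolization step can be illustrated with a toy greedy clustering over a dynamic time-warping distance, where each beat is mapped to the symbol of its nearest cluster. The distance threshold, cluster cap, and greedy assignment below are simplifications of the paper's method, which maximizes the dynamic time-warped, sequence-aligned separation of clusters.

    # Toy beat symbolization: DTW distance + greedy clustering.
    # Threshold, cluster budget, and synthetic beats are assumptions.
    import numpy as np

    def dtw(a, b):
        # Standard O(len(a) * len(b)) dynamic time warping distance
        n, m = len(a), len(b)
        D = np.full((n + 1, m + 1), np.inf)
        D[0, 0] = 0.0
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                cost = abs(a[i - 1] - b[j - 1])
                D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
        return D[n, m]

    def symbolize(beats, max_clusters=3, threshold=5.0):
        # Greedy assignment: join the nearest cluster if close enough
        # (or if the cluster budget is used up), otherwise start a new one.
        centroids, symbols = [], []
        for beat in beats:
            dists = [dtw(beat, c) for c in centroids]
            if centroids and (min(dists) < threshold or len(centroids) == max_clusters):
                symbols.append(int(np.argmin(dists)))
            else:
                centroids.append(beat)
                symbols.append(len(centroids) - 1)
        return symbols

    rng = np.random.default_rng(0)
    normal = np.sin(np.linspace(0, 2 * np.pi, 60))
    ectopic = np.sin(np.linspace(0, 4 * np.pi, 60))
    beats = [normal + 0.02 * rng.standard_normal(60) for _ in range(8)]
    beats += [ectopic + 0.02 * rng.standard_normal(60) for _ in range(2)]
    print(symbolize(beats))   # mostly one symbol, with a second symbol for the ectopic beats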
American Journal of Cardiology | 2009
Zeeshan Syed; Benjamin M. Scirica; Satishkumar Mohanavelu; Phil Sung; Eric L. Michelson; Christopher P. Cannon; Peter H. Stone; Collin M. Stultz; John V. Guttag
Electrocardiographic measures can facilitate the identification of patients at risk of death after acute coronary syndromes. This study evaluates a new risk metric, morphologic variability (MV), which measures beat-to-beat variability in the shape of the entire heart beat signal. This metric is analogous to heart rate variability (HRV) approaches, which focus on beat-to-beat changes in the heart rate. MV was calculated using a dynamic time-warping technique in 764 patients from the DISPERSE2 (TIMI 33) trial for whom 24-hour continuous electrocardiographic data were recorded within 48 hours of non-ST-elevation acute coronary syndrome. The patients were evaluated during a 90-day follow-up for the end point of death. Patients with high MV showed an increased risk of death during follow-up (hazard ratio 8.46; p < 0.001). The relationship between high MV and death could be observed even after adjusting for baseline clinical characteristics and HRV measures (adjusted hazard ratio 6.91; p = 0.001). Moreover, the correlation between MV and HRV was low (R ≤ 0.25). These findings were consistent among several subgroups, including patients under the age of 65 and those with no history of diabetes or hyperlipidemia. In conclusion, our results suggest that increased variation in the entire heart beat morphology is associated with a considerably elevated risk of death and may provide information complementary to the analysis of heart rate.
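At a high level, MV is derived from a beat-to-beat morphologic distance series. The toy sketch below uses a plain Euclidean distance between consecutive beats where the paper uses dynamic time warping, and a standard deviation where the paper uses a spectral energy measure, so it only conveys the shape of the computation.

    # Simplified morphologic-distance series between consecutive beats.
    # Euclidean distance and the std summary are stand-ins for the paper's
    # dynamic time warping and spectral MV measure; data are synthetic.
    import numpy as np

    def morphologic_distance_series(beats):
        # Distance between each pair of consecutive beats (beats: n_beats x n_samples)
        return np.linalg.norm(np.diff(beats, axis=0), axis=1)

    rng = np.random.default_rng(0)
    base = np.sin(np.linspace(0, 2 * np.pi, 250))
    beats = base + 0.05 * rng.standard_normal((500, 250))   # synthetic beat matrix
    md = morphologic_distance_series(beats)
    print("beat-to-beat morphologic variability (toy summary):", md.std())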
Science Translational Medicine | 2011
Zeeshan Syed; Collin M. Stultz; Benjamin M. Scirica; John V. Guttag
Computational biomarkers extracted by machine learning from electrocardiograms improve identification of high-risk patients after coronary events. Better Biomarkers for Predicting Coronary Artery Disease-Related Death The symptoms of a heart attack—chest pain, sweating, etc.—can actually signal several serious coronary events, collectively termed acute coronary syndrome. Some of these put patients at high risk for death and must be treated aggressively; others are of relatively lower risk. Our ability to correctly assign patients to one of these groups quickly and reliably is inadequate for optimal care. Syed and his colleagues have extracted three biomarkers, by computational methods, from the continuous EKG readings obtained from these patients in the hospital. When added to existing predictors, the three derived computational biomarkers improve the classification of acute coronary syndrome patients by 7 to 13%. The biomarkers derived from the EKG do not correspond readily to features easily recognizable by an observer in the tracings of the EKG. But they are clearly visible to a computer using time-series algorithms on those same tracings. The first, morphologic variability, quantifies energy differences between consecutive heart beats. The second, symbolic mismatch, quantifies the difference in the EKG of a particular patient from others with the same clinical course. The third, heart rate motifs, determines patterns of heart rate in the EKG that reveal autonomic functioning. The power of this approach is that these biomarkers can be extracted from clinical data that are already continuously collected from the patient. It can be done in line, in real time with no overt change in the patients’ clinical experience or the cost of care. This could lead to treatments that ultimately extend the lives of tens of thousands of patients. The existing tools for estimating the risk of death in patients after they experience acute coronary syndrome are commonly based on echocardiography and clinical risk scores (for example, the TIMI risk score). These identify a small group of high-risk patients who account for only a minority of the deaths that occur in patients after acute coronary syndrome. Here, we investigated the use of three computationally generated cardiac biomarkers for risk stratification in this population: morphologic variability (MV), symbolic mismatch (SM), and heart rate motifs (HRM). We derived these biomarkers from time-series analyses of continuous electrocardiographic data collected from patients in the TIMI-DISPERSE2 clinical trial through machine learning and data mining methods designed to extract information that is difficult to visualize directly in these data. We evaluated these biomarkers in a blinded, prespecified, and fully automated study on more than 4500 patients in the MERLIN-TIMI36 (Metabolic Efficiency with Ranolazine for Less Ischemia in Non–ST-Elevation Acute Coronary Syndrome–Thrombolysis in Myocardial Infarction 36) clinical trial. Our results showed a strong association between all three computationally generated cardiac biomarkers and cardiovascular death in the MERLIN-TIMI36 trial over a 2-year period after acute coronary syndrome. Moreover, the information in each of these biomarkers was independent of the information in the others and independent of the information provided by existing clinical risk scores, electrocardiographic metrics, and echocardiography. 
The addition of MV, SM, and HRM to existing metrics significantly improved model discrimination, as well as the precision and recall of prediction rules based on left ventricular ejection fraction. These biomarkers can be extracted from data that are routinely captured from patients with acute coronary syndrome and will allow for more accurate risk stratification and potentially for better patient treatment.
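Of the three biomarkers, symbolic mismatch is perhaps the easiest to caricature in a few lines: summarize each patient's symbolized ECG and score how far that patient sits from the rest of the cohort. The histogram summary and L1 comparison below are stand-ins for the paper's sequence-level mismatch computation, and all data are synthetic.

    # Toy symbolic-mismatch score: per-patient symbol histogram compared
    # against the cohort. Histogram/L1 choices and data are assumptions.
    import numpy as np

    def symbol_histogram(symbols, n_symbols):
        counts = np.bincount(symbols, minlength=n_symbols).astype(float)
        return counts / counts.sum()

    def symbolic_mismatch(patient_hist, cohort_hists):
        # Mean L1 distance from every other patient's histogram
        return float(np.mean([np.abs(patient_hist - h).sum() for h in cohort_hists]))

    rng = np.random.default_rng(1)
    cohort = [symbol_histogram(rng.integers(0, 5, 1000), 5) for _ in range(100)]
    outlier = symbol_histogram(rng.integers(3, 5, 1000), 5)     # skewed symbol usage
    print(symbolic_mismatch(cohort[0], cohort), symbolic_mismatch(outlier, cohort))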
ACM Transactions on Knowledge Discovery from Data | 2010
Zeeshan Syed; Collin M. Stultz; Manolis Kellis; Piotr Indyk; John V. Guttag
In this article, we propose a methodology for identifying predictive physiological patterns in the absence of prior knowledge. We use the principle of conservation to identify activity that consistently precedes an outcome in patients, and describe a two-stage process that allows us to efficiently search for such patterns in large datasets. This involves first transforming continuous physiological signals from patients into symbolic sequences, and then searching for patterns in these reduced representations that are strongly associated with an outcome. Our strategy of identifying conserved activity that is unlikely to have occurred purely by chance in symbolic data is analogous to the discovery of regulatory motifs in genomic datasets. We build upon existing work in this area, generalizing the notion of a regulatory motif and enhancing current techniques to operate robustly on non-genomic data. We also address two significant considerations associated with motif discovery in general: computational efficiency and robustness in the presence of degeneracy and noise. To deal with these issues, we introduce the concept of active regions and new subset-based techniques such as a two-layer Gibbs sampling algorithm. These extensions allow for a framework for information inference, where precursors are identified as approximately conserved activity of arbitrary complexity preceding multiple occurrences of an event. We evaluated our solution on a population of patients who experienced sudden cardiac death and attempted to discover electrocardiographic activity that may be associated with the endpoint of death. To assess the predictive patterns discovered, we compared likelihood scores for motifs in the sudden death population against control populations of normal individuals and those with non-fatal supraventricular arrhythmias. Our results suggest that predictive motif discovery may be able to identify clinically relevant information even in the absence of significant prior knowledge.
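The analogy to regulatory-motif discovery can be made concrete with a textbook Gibbs site sampler over symbolic sequences, shown below on toy data with a planted motif. This is only the basic sampler; the paper's two-layer Gibbs sampling, active regions, and degeneracy handling are not reproduced.

    # Basic Gibbs-sampling motif finder over symbolic sequences (toy data).
    # Motif width, alphabet, and the planted pattern are assumptions.
    import numpy as np

    def gibbs_motif(seqs, w, n_symbols, iters=200, seed=0):
        rng = np.random.default_rng(seed)
        pos = [rng.integers(0, len(s) - w + 1) for s in seqs]   # initial motif starts
        for _ in range(iters):
            for i in range(len(seqs)):
                # Build a smoothed position weight matrix from all sequences except i
                pwm = np.ones((w, n_symbols))
                for j, s in enumerate(seqs):
                    if j != i:
                        for k in range(w):
                            pwm[k, s[pos[j] + k]] += 1
                pwm /= pwm.sum(axis=1, keepdims=True)
                # Score every candidate start in sequence i and resample its position
                starts = len(seqs[i]) - w + 1
                scores = np.array([np.prod([pwm[k, seqs[i][p + k]] for k in range(w)])
                                   for p in range(starts)])
                pos[i] = rng.choice(starts, p=scores / scores.sum())
        return pos

    # Usage: plant the motif [1, 2, 3, 2] inside random symbol streams
    rng = np.random.default_rng(3)
    seqs = []
    for _ in range(8):
        s = list(rng.integers(0, 4, 30))
        start = rng.integers(0, 26)
        s[start:start + 4] = [1, 2, 3, 2]
        seqs.append(s)
    print(gibbs_motif(seqs, w=4, n_symbols=4))   # recovered motif start positions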
International Health Informatics Symposium | 2010
Shahzaib Hassan; Zeeshan Syed
Recommender systems are widely used to provide users with personalized suggestions for products or services. These systems typically rely on collaborative filtering (CF) to make automated predictions about the interests of a user, by collecting preference information from many users. CF techniques require no domain knowledge and can be used on very sparse datasets. Moreover, they rely directly on user behavior and can potentially discover complex and unexpected patterns that are difficult or impossible to profile using known data attributes. In this paper, we explore the use of a CF framework for clinical risk stratification. Our work assesses patient risk both by matching new cases to historical records, and by matching patient demographics to adverse outcomes. When evaluated on data from over 4,500 patients admitted with acute coronary syndrome, our CF-based approach achieved a higher predictive accuracy for both sudden cardiac death and recurrent myocardial infarction than popular classification approaches such as logistic regression and support vector machines.
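A bare-bones version of the collaborative-filtering intuition is sketched below: a new patient's risk is a similarity-weighted average of historical patients' outcomes. The cosine similarity, binary features, and synthetic data are illustrative assumptions rather than the paper's exact formulation (which also matches demographics to outcomes).

    # Toy CF-style risk score: similarity-weighted vote over historical outcomes.
    # Features, similarity measure, and data are assumptions.
    import numpy as np

    def cf_risk_score(new_patient, historical_features, historical_outcomes):
        # Cosine similarity between the new patient and each historical record
        num = historical_features @ new_patient
        den = (np.linalg.norm(historical_features, axis=1)
               * np.linalg.norm(new_patient) + 1e-12)
        sim = np.clip(num / den, 0, None)               # ignore negative similarity
        return float(sim @ historical_outcomes / (sim.sum() + 1e-12))

    rng = np.random.default_rng(7)
    X = rng.integers(0, 2, (4500, 20)).astype(float)    # binary patient attributes
    y = rng.integers(0, 2, 4500).astype(float)          # adverse outcome indicator
    print(cf_risk_score(X[0], X[1:], y[1:]))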
Science Translational Medicine | 2012
Chih Chun Chia; Ilan Rubinfeld; Benjamin M. Scirica; Sean McMillan; Hitinder S. Gurm; Zeeshan Syed
Clinical models can be improved by decreasing the importance assigned to fitting historical patient outcomes in often small and imperfectly characterized derivation cohorts. When Less Is More Clinical models play an important role in guiding patient care at the bedside, improving our understanding of diseases, and performing objective assessments of healthcare quality. The typical approach to developing these models places great importance on fitting historical patient outcomes in derivation data sets (such as those obtained from clinical studies or patient registries). However, for a fairly broad range of medical applications, these derivation data sets may be small after accounting for inclusionary and exclusionary criteria, and additionally may be imperfectly characterized due to noise and variations in the rates of patient outcomes. Collecting more data offers one approach to address this issue, but is challenging due to the costs and complexity of increasing the size of clinical cohorts. In the setting of small and imperfectly characterized data sets, approaches to developing clinical models that rely exclusively on fitting historical patient outcomes suffer from the implicit assumption that the derivation data sets are representative. Instead, as the new study by Chia et al. explores, the process of developing clinical models can be improved by decreasing the importance placed on fitting historical patient outcomes, and by supplementing these models with information about the extent to which patients differ from the statistical distribution of clinical characteristics within the derivation data set. When evaluated using data from three different clinical applications [patients with acute coronary syndrome enrolled in the DISPERSE2-TIMI33 and MERLIN-TIMI36 trials, patients undergoing inpatient surgery in the National Surgical Quality Improvement Program (NSQIP) registry, and patients undergoing percutaneous coronary intervention in the Blue Cross Blue Shield of Michigan Cardiovascular Consortium (BMC2) registry], this approach of treating derivation data for clinical models as simultaneously labeled and unlabeled consistently improved discrimination between high- and low-risk patients according to different statistical metrics. The idea of decreasing the importance assigned to fitting historical outcomes allows for better clinical models, and ultimately for improvements in the use of these models to study diseases, choose therapies, or evaluate healthcare providers. Conventional algorithms for modeling clinical events focus on characterizing the differences between patients with varying outcomes in historical data sets used for the model derivation. For many clinical conditions with low prevalence and where small data sets are available, this approach to developing models is challenging due to the limited number of positive (that is, event) examples available for model training. Here, we investigate how the approach of developing clinical models might be improved across three distinct patient populations (patients with acute coronary syndrome enrolled in the DISPERSE2-TIMI33 and MERLIN-TIMI36 trials, patients undergoing inpatient surgery in the National Surgical Quality Improvement Program registry, and patients undergoing percutaneous coronary intervention in the Blue Cross Blue Shield of Michigan Cardiovascular Consortium registry). 
For each of these cases, we supplement an incomplete characterization of patient outcomes in the derivation data set (uncensored view of the data) with an additional characterization of the extent to which patients differ from the statistical support of their clinical characteristics (censored view of the data). Our approach exploits the same training data within the derivation cohort in multiple ways to improve the accuracy of prediction. We position this approach within the context of traditional supervised (2-class) and unsupervised (1-class) learning methods and present a 1.5-class approach for clinical decision-making. We describe a 1.5-class support vector machine (SVM) classification algorithm that implements this approach, and report on its performance relative to logistic regression and 2-class SVM classification with cost-sensitive weighting and oversampling. The 1.5-class SVM algorithm improved prediction accuracy relative to other approaches and may have value in predicting clinical events both at the bedside and for risk-adjusted quality of care assessment.
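The 1.5-class idea can be gestured at by blending the outputs of a conventional 2-class SVM (fit to outcomes) with a 1-class SVM (fit to the support of the derivation data), as in the sketch below. The blending weight, kernels, and synthetic data are assumptions; this is not the paper's 1.5-class SVM, which integrates the two views within a single classification algorithm rather than as a post hoc blend.

    # Illustrative blend of a 2-class SVM (uncensored view) and a 1-class SVM
    # (censored view of the statistical support). Weighting is an assumption.
    import numpy as np
    from sklearn.svm import SVC, OneClassSVM
    from sklearn.preprocessing import StandardScaler

    rng = np.random.default_rng(0)
    X = rng.standard_normal((500, 10))
    y = (X[:, 0] + 0.5 * rng.standard_normal(500) > 1.0).astype(int)   # rare event

    scaler = StandardScaler().fit(X)
    Xs = scaler.transform(X)

    two_class = SVC(kernel="rbf", class_weight="balanced").fit(Xs, y)   # fit to outcomes
    one_class = OneClassSVM(nu=0.1, kernel="rbf").fit(Xs)               # fit to data support

    def risk_score(x, alpha=0.5):
        x = scaler.transform(x.reshape(1, -1))
        supervised = two_class.decision_function(x)[0]       # outcome-driven score
        novelty = -one_class.decision_function(x)[0]         # distance from support
        return alpha * supervised + (1 - alpha) * novelty

    print(risk_score(X[0]))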
Journal of Gastrointestinal Surgery | 2012
Rupen Shah; Vic Velanovich; Zeeshan Syed; Andrew Swartz; Ilan Rubinfeld
Background: Patient-associated co-morbidities are a potential cause of postoperative complications. The National Surgical Quality Improvement Program (NSQIP) collects data on patient outcomes to provide risk-adjusted outcomes data to participating hospitals. However, operations that may have a high rate of technically related complications, such as pancreatic operations, may not be adequately assessed using such predictive models. Methods: A combined data set of NSQIP Public Use Files (PUF) from 2005 to 2008 was created. Using this database, multiple logistic regression analyses were used to generate a predictive model of 30-day postoperative morbidity and mortality for pancreatic operations and all other operations recorded in NSQIP. Receiver-operating characteristic curves were generated, and the area under those curves (AUROC) was used as a c-statistic to assess the model’s discriminatory ability. Observed-to-expected (O/E) ratios for mortality and morbidity were generated using not only patient-associated co-morbidities but also operation-associated information, such as work relative-value units and Current Procedural Terminology codes. Data were analyzed in SPSS. Results: In the 4-year period analyzed, there were 7,097 complex pancreatic procedures, which were compared with 568,371 procedures that were not. For postoperative mortality, the AUROC was lower for pancreatic operations (0.741) compared with all other operations (0.947) and all other inpatient operations (0.927). Similarly, for postoperative morbidity, the AUROC was lower for pancreatic operations (0.598) compared with all other operations (0.764) and all other inpatient operations (0.817). However, the O/E ratios were similar in both groups for mortality (all other operations, 0.94 vs. pancreatic operations, 0.92) and morbidity (0.98 for both). Conclusions: These data imply that the factors used to assess postoperative mortality and morbidity may not completely explain postoperative outcomes in pancreatic operations. These procedures are technically demanding and can have morbidities not related to pre-existing co-morbid conditions; therefore, preoperative prediction based on pre-existing co-morbidities may have limitations in these types of operations.
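The two evaluation quantities used in the study, the c-statistic (AUROC) of a logistic regression model and the observed-to-expected (O/E) ratio, can be computed as in the sketch below; the model and data are synthetic stand-ins rather than NSQIP PUF variables.

    # Sketch of the AUROC (c-statistic) and O/E ratio for a logistic model.
    # Features and outcomes are synthetic assumptions.
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import roc_auc_score

    rng = np.random.default_rng(42)
    X = rng.standard_normal((5000, 8))                     # stand-in co-morbidity features
    y = (rng.random(5000) < 1 / (1 + np.exp(-(X[:, 0] - 2)))).astype(int)

    model = LogisticRegression(max_iter=1000).fit(X, y)
    p = model.predict_proba(X)[:, 1]

    c_statistic = roc_auc_score(y, p)                      # discrimination
    oe_ratio = y.sum() / p.sum()                           # observed vs expected events
    print(f"AUROC {c_statistic:.3f}, O/E {oe_ratio:.2f}")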
Journal of the American College of Surgeons | 2011
Zeeshan Syed; Ilan Rubinfeld; Joe H. Patton; Jennifer Ritz; Jack Jordan; Andrea Doud; Vic Velanovich
BACKGROUND The American College of Surgeons National Surgical Quality Improvement Program collects information related to procedures in the form of the work relative value unit (RVU) and current procedural terminology (CPT) code. We propose and evaluate a fully automated nonparametric learning approach that maps individual CPT codes to perioperative risk. STUDY DESIGN National Surgical Quality Improvement Program participant use file data for 2005-2006 were used to develop 2 separate support vector machines (SVMs) to learn the relationship between CPT codes and 30-day mortality or morbidity. SVM parameters were determined using cross-validation. SVMs were evaluated on participant use file data for 2007 and 2008. Areas under the receiver operating characteristic curve (AUROCs) were each compared with the respective AUROCs for work RVU and for standard CPT categories. We then compared the AUROCs for multivariable models, including preoperative variables, RVU, and CPT categories, with and without the SVM operation scores. RESULTS SVM operation scores had AUROCs between 0.798 and 0.822 for mortality and between 0.745 and 0.758 for morbidity on the participant use file used for both training (2005-2006) and testing (2007 and 2008). This was consistently higher than the AUROCs for both RVU and standard CPT categories (p < 0.001). AUROCs of multivariable models were higher for 30-day mortality and morbidity when SVM operation scores were included. This difference was not significant for mortality but statistically significant, although small, for morbidity. CONCLUSIONS Nonparametric methods from artificial intelligence can translate CPT codes to aid in the assessment of perioperative risk. This approach is fully automated and can complement the use of work RVU or traditional CPT categories in multivariable risk adjustment models like the National Surgical Quality Improvement Program.
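A toy rendition of the mapping from CPT codes to an SVM-derived operation score is given below: one-hot encode the codes, fit an SVM against the outcome, and use the decision value as the score that would then enter a multivariable model. The listed codes serve only as labels, the attached event rates are invented, and the linear SVM with default settings is not the cross-validated configuration reported in the paper.

    # Toy "operation score": one-hot encoded procedure codes -> SVM decision value.
    # Event rates per code and SVM settings are assumptions.
    import numpy as np
    from sklearn.preprocessing import OneHotEncoder
    from sklearn.svm import LinearSVC
    from sklearn.metrics import roc_auc_score

    rng = np.random.default_rng(5)
    cpt = rng.choice(["44140", "48150", "47562", "49505"], size=3000).reshape(-1, 1)
    event_rate = {"44140": 0.05, "48150": 0.20, "47562": 0.02, "49505": 0.01}  # invented
    y = np.array([rng.random() < event_rate[c[0]] for c in cpt]).astype(int)

    enc = OneHotEncoder(handle_unknown="ignore")
    X = enc.fit_transform(cpt)

    svm = LinearSVC(C=1.0, max_iter=5000).fit(X, y)
    operation_score = svm.decision_function(X)             # per-operation risk score
    print("AUROC of toy operation score:", round(roc_auc_score(y, operation_score), 3))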