Publication


Featured research published by Simon Kasif.


PLOS Genetics | 2008

Genomewide Analysis of PRC1 and PRC2 Occupancy Identifies Two Classes of Bivalent Domains

Manching Ku; Richard Koche; Esther Rheinbay; Eric M. Mendenhall; Mitsuhiro Endoh; Tarjei S. Mikkelsen; Aviva Presser; Chad Nusbaum; Xiaohui Xie; Andrew S. Chi; Mazhar Adli; Simon Kasif; Leon M. Ptaszek; Chad A. Cowan; Eric S. Lander; Haruhiko Koseki; Bradley E. Bernstein

In embryonic stem (ES) cells, bivalent chromatin domains with overlapping repressive (H3 lysine 27 tri-methylation) and activating (H3 lysine 4 tri-methylation) histone modifications mark the promoters of more than 2,000 genes. To gain insight into the structure and function of bivalent domains, we mapped key histone modifications and subunits of Polycomb-repressive complexes 1 and 2 (PRC1 and PRC2) genomewide in human and mouse ES cells by chromatin immunoprecipitation, followed by ultra high-throughput sequencing. We find that bivalent domains can be segregated into two classes—the first occupied by both PRC2 and PRC1 (PRC1-positive) and the second specifically bound by PRC2 (PRC2-only). PRC1-positive bivalent domains appear functionally distinct as they more efficiently retain lysine 27 tri-methylation upon differentiation, show stringent conservation of chromatin state, and associate with an overwhelming number of developmental regulator gene promoters. We also used computational genomics to search for sequence determinants of Polycomb binding. This analysis revealed that the genomewide locations of PRC2 and PRC1 can be largely predicted from the locations, sizes, and underlying motif contents of CpG islands. We propose that large CpG islands depleted of activating motifs confer epigenetic memory by recruiting the full repertoire of Polycomb complexes in pluripotent cells.


Journal of Artificial Intelligence Research | 1994

A system for induction of oblique decision trees

Sreerama K. Murthy; Simon Kasif

This article describes a new system for induction of oblique decision trees. This system, OC1, combines deterministic hill-climbing with two forms of randomization to find a good oblique split (in the form of a hyperplane) at each node of a decision tree. Oblique decision tree methods are tuned especially for domains in which the attributes are numeric, although they can be adapted to symbolic or mixed symbolic/numeric attributes. We present extensive empirical studies, using both real and artificial data, that analyze OC1's ability to construct oblique trees that are smaller and more accurate than their axis-parallel counterparts. We also examine the benefits of randomization for the construction of oblique decision trees.
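
The idea of searching for a hyperplane split with hill-climbing plus randomization can be illustrated with a small sketch. This is not the OC1 procedure itself: it is a simplified randomized hill-climber with random restarts, and the function names (`gini`, `split_impurity`, `find_oblique_split`), the Gini criterion, and all parameter defaults are assumptions chosen for illustration.

```python
import random


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for c in labels:
        counts[c] = counts.get(c, 0) + 1
    return 1.0 - sum((k / n) ** 2 for k in counts.values())


def split_impurity(X, y, w, b):
    """Weighted Gini impurity of the oblique split w.x + b > 0."""
    left, right = [], []
    for x, c in zip(X, y):
        side = sum(wi * xi for wi, xi in zip(w, x)) + b
        (right if side > 0 else left).append(c)
    n = len(y)
    return (len(left) * gini(left) + len(right) * gini(right)) / n


def find_oblique_split(X, y, restarts=20, steps=200, seed=0):
    """Hypothetical simplified search: perturb one coefficient at a time,
    keep the change if impurity does not get worse, and restart from
    several random hyperplanes to escape local minima."""
    rng = random.Random(seed)
    d = len(X[0])
    best = (float("inf"), None, None)
    for _ in range(restarts):
        w = [rng.uniform(-1, 1) for _ in range(d)]
        b = rng.uniform(-1, 1)
        score = split_impurity(X, y, w, b)
        for _ in range(steps):
            i = rng.randrange(d + 1)      # index d means "perturb the offset b"
            delta = rng.gauss(0, 0.5)
            w2, b2 = list(w), b
            if i < d:
                w2[i] += delta
            else:
                b2 += delta
            s2 = split_impurity(X, y, w2, b2)
            if s2 <= score:
                w, b, score = w2, b2, s2
        if score < best[0]:
            best = (score, w, b)
    return best
```

An axis-parallel tree would need several splits to separate classes divided by a diagonal boundary such as x0 + x1 > 1, whereas a single hyperplane of the form above can do it in one node.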


Pacific Symposium on Biocomputing | 2002

Extracting conserved gene expression motifs from gene expression data.

T. M. Murali; Simon Kasif

We propose a representation for gene expression data called conserved gene expression motifs or XMOTIFs. A gene's expression level is conserved across a set of samples if the gene is expressed with the same abundance in all the samples. A conserved gene expression motif is a subset of genes that is simultaneously conserved across a subset of samples. We present a computational technique to discover large conserved gene motifs that cover all the samples and classes in the data. When applied to published data sets representing different cancers or disease outcomes, our algorithm constructs XMOTIFs that distinguish between the various classes.


PLOS ONE | 2012

Deep sequencing of the oral microbiome reveals signatures of periodontal disease.

Bo Liu; Lina L. Faller; Niels Klitgord; Varun Mazumdar; Mohammad Ghodsi; Daniel D. Sommer; Theodore Gibbons; Todd J. Treangen; Yi-Chien Chang; Shan Li; O. Colin Stine; Hatice Hasturk; Simon Kasif; Daniel Segrè; Mihai Pop; Salomon Amar

The oral microbiome, the complex ecosystem of microbes inhabiting the human mouth, harbors several thousands of bacterial types. The proliferation of pathogenic bacteria within the mouth gives rise to periodontitis, an inflammatory disease known to also constitute a risk factor for cardiovascular disease. While much is known about individual species associated with pathogenesis, the system-level mechanisms underlying the transition from health to disease are still poorly understood. Through the sequencing of the 16S rRNA gene and of whole community DNA we provide a glimpse at the global genetic, metabolic, and ecological changes associated with periodontitis in 15 subgingival plaque samples, four from each of two periodontitis patients, and the remaining samples from three healthy individuals. We also demonstrate the power of whole-metagenome sequencing approaches in characterizing the genomes of key players in the oral microbiome, including an unculturable TM7 organism. We reveal the disease microbiome to be enriched in virulence factors, and adapted to a parasitic lifestyle that takes advantage of the disrupted host homeostasis. Furthermore, diseased samples share a common structure that was not found in completely healthy samples, suggesting that the disease state may occupy a narrow region within the space of possible configurations of the oral microbiome. Our pilot study demonstrates the power of high-throughput sequencing as a tool for understanding the role of the oral microbiome in periodontal disease. Despite a modest level of sequencing (∼2 lanes Illumina 76 bp PE) and high human DNA contamination (up to ∼90%) we were able to partially reconstruct several oral microbes and to preliminarily characterize some systems-level differences between the healthy and diseased oral microbiomes.


PLOS Genetics | 2007

Network-based analysis of affected biological processes in type 2 diabetes models.

Manway Liu; Arthur Liberzon; Sek Won Kong; Weil R. Lai; Peter J. Park; Isaac S. Kohane; Simon Kasif

Type 2 diabetes mellitus is a complex disorder associated with multiple genetic, epigenetic, developmental, and environmental factors. Animal models of type 2 diabetes differ based on diet, drug treatment, and gene knockouts, and yet all display the clinical hallmarks of hyperglycemia and insulin resistance in peripheral tissue. The recent advances in gene-expression microarray technologies present an unprecedented opportunity to study type 2 diabetes mellitus at a genome-wide scale and across different models. To date, a key challenge has been to identify the biological processes or signaling pathways that play significant roles in the disorder. Here, using a network-based analysis methodology, we identified two sets of genes, associated with insulin signaling and a network of nuclear receptors, which are recurrent in a statistically significant number of diabetes and insulin resistance models and transcriptionally altered across diverse tissue types. We additionally identified a network of protein–protein interactions between members from the two gene sets that may facilitate signaling between them. Taken together, the results illustrate the benefits of integrating high-throughput microarray studies, together with protein–protein interaction networks, in elucidating the underlying biological processes associated with a complex disorder.


Bioinformatics | 2003

RankGene: identification of diagnostic genes based on expression data

Yang Su; T. M. Murali; Vladimir Pavlovic; Michael E. Schaffer; Simon Kasif

RankGene is a program for analyzing gene expression data and computing diagnostic genes based on their predictive power in distinguishing between different types of samples. The program integrates into one system a variety of popular ranking criteria, ranging from the traditional t-statistic to one-dimensional support vector machines. This flexibility makes RankGene a useful tool in gene expression analysis and feature selection.
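
As an illustration of the simplest criterion mentioned above, the sketch below ranks genes by the absolute value of a two-sample t-statistic. It is not RankGene's code or interface: the function names, the choice of Welch's unequal-variance form, and the input layout (a dict of per-gene expression lists plus 0/1 sample labels) are assumptions.

```python
from math import sqrt
from statistics import mean, variance


def t_statistic(a, b):
    """Welch's t-statistic between two groups of expression values."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se if se > 0 else 0.0


def rank_genes(expr, labels):
    """Rank genes by |t| between the two sample classes.

    expr:   dict mapping gene name -> list of expression values, one per sample
    labels: list of 0/1 class labels, one per sample (hypothetical layout)
    """
    scores = {}
    for gene, values in expr.items():
        g0 = [v for v, c in zip(values, labels) if c == 0]
        g1 = [v for v, c in zip(values, labels) if c == 1]
        scores[gene] = abs(t_statistic(g0, g1))
    # Most discriminative genes first.
    return sorted(scores, key=scores.get, reverse=True)
```

Each class needs at least two samples, since the sample variance is undefined otherwise.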


Bioinformatics | 2003

Identification of functional links between genes using phylogenetic profiles

Jie Wu; Simon Kasif; Charles DeLisi

MOTIVATION: Genes with identical patterns of occurrence across the phyla tend to function together in the same protein complexes or participate in the same biochemical pathway. However, the requirement that the profiles be identical (i) severely restricts the number of functional links that can be established by such phylogenetic profiling; (ii) limits detection to very strong functional links, failing to capture relations between genes that are not in the same pathway, but nevertheless subserve a common function; and (iii) misses relations between analogous genes. Here we present and apply a method for relaxing the restriction, based on the probability that a given arbitrary degree of similarity between two profiles would occur by chance, with no biological pressure. Function is then inferred at any desired level of confidence.

RESULTS: We derive an expression for the probability distribution of a given number of chance co-occurrences of a pair of non-homologous orthologs across a set of genomes. The method is applied to 2905 clusters of orthologous genes (COGs) from 44 fully sequenced microbial genomes representing all three domains of life. Among the results are the following. (1) Of the 51 000 annotated intrapathway gene pairs, 8935 are linked at a level of significance of 0.01. This is over 30-fold greater than the 271 intrapathway pairs obtained at the same confidence level when identical profiles are used. (2) Of the 540 000 interpathway gene pairs, some 65 000 are linked at the 0.01 level of significance, some 12 standard deviations beyond the number expected by chance at this confidence level. We speculate that many of these links involve nearest-neighbor path, and discuss some examples. (3) The difference in the percentage of linked interpathway and intrapathway genes is highly significant, consistent with the intuitive expectation that genes in the same pathway are generally under greater selective pressure than those that are not. (4) The method appears to recover metabolic networks well. This is illustrated by the TCA cycle which is recovered as a highly connected, weighted edge network of 30 of its 31 COGs. (5) The fraction of pairs having a common pathway is a symmetric function of the Hamming distance between their profiles. This finding, that the functional correlation between profiles with near maximum Hamming distance is as large as between profiles with near zero Hamming distance, and as statistically significant, is plausibly explained if the former group represents analogous genes.
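
The abstract does not reproduce the derived expression. One natural null model for chance co-occurrence, assumed here purely for illustration, is the hypergeometric distribution: if one gene occurs in n of N genomes and another in m of them, and occurrences are independent, the number of shared genomes k follows that distribution. The function names below are hypothetical.

```python
from math import comb


def cooccurrence_pmf(N, n, m, k):
    """P(exactly k shared genomes) under an assumed hypergeometric null.

    N: total genomes; n, m: number of genomes containing each gene.
    """
    if k < max(0, n + m - N) or k > min(n, m):
        return 0.0
    return comb(n, k) * comb(N - n, m - k) / comb(N, m)


def cooccurrence_pvalue(N, n, m, k):
    """P(at least k shared genomes) by chance: the upper tail of the pmf."""
    return sum(cooccurrence_pmf(N, n, m, j) for j in range(k, min(n, m) + 1))
```

Under this assumption, a pair of profiles would be linked when `cooccurrence_pvalue` falls below the chosen significance level, such as the 0.01 threshold quoted above.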


Genomics | 1999

Optimized multiplex PCR: Efficiently closing a whole-genome shotgun sequencing project

Hervé Tettelin; Diana Radune; Simon Kasif; Hoda Khouri

A new method has been developed for rapidly closing a large number of gaps in a whole-genome shotgun sequencing project. The method employs multiplex PCR and a novel pooling strategy to minimize the number of laboratory procedures required to sequence the unknown DNA that falls in between contiguous sequences. Multiplex sequencing, a novel procedure in which multiple PCR primers are used in a single sequencing reaction, is used to interpret the multiplex PCR results. Two protocols are presented, one that minimizes pipetting and another that minimizes the number of reactions. The pipette-optimized multiplex PCR method has been employed in the final phases of closing the Streptococcus pneumoniae genome sequence, with excellent results.


Artificial Intelligence | 1990

On the parallel complexity of discrete relaxation in constraint satisfaction networks

Simon Kasif

Constraint satisfaction networks have been shown to be a very useful tool for knowledge representation in Artificial Intelligence applications. These networks often utilize local constraint propagation techniques to achieve local consistency (consistent labeling in vision). Such methods have been used extensively in the context of image understanding and interpretation, as well as planning, natural language analysis and truth maintenance systems. In this paper we study the parallel complexity of discrete relaxation, one of the most commonly used constraint propagation techniques. Since the constraint propagation procedures such as discrete relaxation appear to operate locally, it has been previously believed that the relaxation approach for achieving local consistency has a natural parallel solution. Our analysis suggests that a parallel solution is unlikely to improve the known sequential solutions by much. Specifically, we prove that the problem solved by discrete relaxation (arc consistency) is log-space complete for P (the class of polynomial-time deterministic sequential algorithms). Intuitively, this implies that discrete relaxation is inherently sequential and it is unlikely that we can solve the polynomial-time version of the consistent labeling problem in logarithmic time by using only a polynomial number of processors. Some practical implications of our result are discussed. We also provide a two-way transformation between AND/OR graphs, propositional Horn satisfiability and local consistency in constraint networks that allows us to develop optimal linear-time algorithms for local consistency in constraint networks.
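
For readers unfamiliar with the problem, the sketch below shows a sequential arc-consistency procedure in the style of the textbook AC-3 algorithm. It is not the paper's construction and is not one of the optimal algorithms the abstract refers to. The data layout (a dict of domains and a dict of directed binary constraints given as predicates) is an assumption made for the example.

```python
from collections import deque


def ac3(domains, constraints):
    """Enforce arc consistency in place, AC-3 style.

    domains:     dict mapping variable -> set of candidate values
    constraints: dict mapping a directed arc (x, y) -> predicate(vx, vy);
                 both directions of each constraint must be listed.
    Returns False if some domain becomes empty, True otherwise.
    """
    queue = deque(constraints)
    while queue:
        x, y = queue.popleft()
        allowed = constraints[(x, y)]
        # Values of x with no supporting value in y's domain.
        removed = {vx for vx in domains[x]
                   if not any(allowed(vx, vy) for vy in domains[y])}
        if removed:
            domains[x] -= removed
            if not domains[x]:
                return False
            # Shrinking x may invalidate support on arcs pointing at x.
            for (z, w) in constraints:
                if w == x and z != y:
                    queue.append((z, x))
    return True
```

Each removal can trigger re-examination of neighboring arcs, and that chain of dependent updates is the kind of propagation the abstract argues is unlikely to parallelize well.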


PLOS Biology | 2013

The COMBREX Project: Design, Methodology, and Initial Results

Brian P. Anton; Yi-Chien Chang; Peter Brown; Han-Pil Choi; Lina L. Faller; Jyotsna Guleria; Zhenjun Hu; Niels Klitgord; Ami Levy-Moonshine; Almaz Maksad; Varun Mazumdar; Mark McGettrick; Lais Osmani; Revonda Pokrzywa; John Rachlin; Rajeswari Swaminathan; Benjamin Allen; Genevieve Housman; Caitlin Monahan; Krista Rochussen; Kevin Tao; Ashok S. Bhagwat; Steven E. Brenner; Linda Columbus; Valérie de Crécy-Lagard; Donald J. Ferguson; Alexey Fomenkov; Giovanni Gadda; Richard D. Morgan; Andrei L. Osterman

Experimental data exists for only a vanishingly small fraction of sequenced microbial genes. This community page discusses the progress made by the COMBREX project to address this important issue using both computational and experimental resources.

Collaboration


Dive into Simon Kasif's collaborations.

Top Co-Authors

Arthur L. Delcher

Loyola University Maryland

David G. Heath

Johns Hopkins University
