Publication


Featured research published by Somali Chaterji.


Nucleic Acids Research | 2016

The MG-RAST metagenomics database and portal in 2015

Andreas Wilke; Jared Bischof; Wolfgang Gerlach; Elizabeth M. Glass; Travis Harrison; Kevin P. Keegan; Tobias Paczian; William L. Trimble; Saurabh Bagchi; Somali Chaterji; Folker Meyer

MG-RAST (http://metagenomics.anl.gov) is an open-submission data portal for processing, analyzing, sharing and disseminating metagenomic datasets. The system currently hosts over 200 000 datasets and is continuously updated. The volume of submissions has increased 4-fold over the past 24 months, now averaging 4 terabasepairs per month. In addition to several new features, we report changes to the analysis workflow and the technologies used to scale the pipeline up to the required throughput levels. To show possible uses for the data from MG-RAST, we present several examples integrating data and analyses from MG-RAST into popular third-party analysis tools or sequence alignment tools.


Tissue Engineering Part A | 2014

Synergistic Effects of Matrix Nanotopography and Stiffness on Vascular Smooth Muscle Cell Function

Somali Chaterji; Peter H. Kim; Seung H. Choe; Jonathan H. Tsui; Christoffer H. Lam; Derek Ho; Aaron B. Baker; Deok Ho Kim

Vascular smooth muscle cells (vSMCs) retain the ability to undergo modulation in their phenotypic continuum, ranging from a mature contractile state to a proliferative, secretory state. vSMC differentiation is modulated by a complex array of microenvironmental cues, which include the biochemical milieu of the cells and the architecture and stiffness of the extracellular matrix. In this study, we demonstrate that by using UV-assisted capillary force lithography (CFL) to engineer a polyurethane substratum of defined nanotopography and stiffness, we can facilitate the differentiation of cultured vSMCs, reduce their inflammatory signature, and potentially promote the optimal functioning of the vSMC contractile and cytoskeletal machinery. Specifically, we found that the combination of medial tissue-like stiffness (11 MPa) and anisotropic nanotopography (ridge width_groove width_ridge height of 800_800_600 nm) resulted in significant upregulation of calponin, desmin, and smoothelin, in addition to the downregulation of intercellular adhesion molecule-1, tissue factor, interleukin-6, and monocyte chemoattractant protein-1. Further, our results allude to the mechanistic role of the RhoA/ROCK pathway and caveolin-1 in altered cellular mechanotransduction pathways via differential matrix nanotopography and stiffness. Notably, the nanopatterning of the stiffer substrata (1.1 GPa) resulted in the significant upregulation of RhoA, ROCK1, and ROCK2. This indicates that nanopatterning an 800_800_600 nm pattern on a stiff substratum may trigger the mechanical plasticity of vSMCs resulting in a hypercontractile vSMC phenotype, as observed in diabetes or hypertension. Given that matrix stiffness is an independent risk factor for cardiovascular disease and that CFL can create different matrix nanotopographic patterns with high pattern fidelity, we are poised to create a combinatorial library of arterial test beds, whether they are healthy, diseased, injured, or aged. Such high-throughput testing environments will pave the way for the evolution of the next generation of vascular scaffolds that can effectively crosstalk with the scaffold microenvironment and result in improved clinical outcomes.


Scientific Reports | 2016

EP-DNN: A Deep Neural Network-Based Global Enhancer Prediction Algorithm.

Seong-Gon Kim; Mrudul Harwani; Somali Chaterji

We present EP-DNN, a protocol for predicting enhancers based on chromatin features, in different cell types. Specifically, we use a deep neural network (DNN)-based architecture to extract enhancer signatures in a representative human embryonic stem cell type (H1) and a differentiated lung cell type (IMR90). We train EP-DNN using p300 binding sites, as enhancers, and TSS and random non-DHS sites, as non-enhancers. We perform same-cell and cross-cell predictions to quantify the validation rate and compare against two state-of-the-art methods, DEEP-ENCODE and RFECS. We find that EP-DNN has superior accuracy with a validation rate of 91.6%, relative to 85.3% for DEEP-ENCODE and 85.5% for RFECS, for a given number of enhancer predictions and also scales better for a larger number of enhancer predictions. Moreover, our H1 → IMR90 predictions turn out to be more accurate than IMR90 → IMR90, potentially because H1 exhibits a richer signature set and our EP-DNN model is expressive enough to extract these subtleties. Our work shows how to leverage the full expressivity of deep learning models, using multiple hidden layers, while avoiding overfitting on the training data. We also lay the foundation for exploration of cross-cell enhancer predictions, potentially reducing the need for expensive experimentation.


BMC Genomics | 2015

MicroRNA target prediction using thermodynamic and sequence curves.

Asish Ghoshal; Raghavendran Shankar; Saurabh Bagchi; Somali Chaterji

MicroRNAs (miRNAs) are small regulatory RNAs that mediate RNA interference by binding to various mRNA target regions. There have been several computational methods for the identification of target mRNAs for miRNAs. However, these have considered all contributory features as scalar representations, primarily as thermodynamic or sequence-based features. Further, a majority of these methods solely target canonical sites, which are sites with “seed” complementarity. Here, we present a machine-learning classification scheme, titled Avishkar, which captures the spatial profile of miRNA-mRNA interactions via smooth B-spline curves, separately for various input features, such as thermodynamic and sequence features. Further, we use a principled approach to uniformly model canonical and non-canonical seed matches, using a novel seed enrichment metric. We demonstrate that a large number of seed-match patterns have high enrichment values, conserved across species, and that the majority of miRNA binding sites involve non-canonical matches, corroborating recent findings. Using spatial curves and popular categorical features, such as target site length and location, we train a linear SVM model, utilizing experimental CLIP-seq data. Our model significantly outperforms all established methods, for both canonical and non-canonical sites. We achieve this while using a much larger candidate miRNA-mRNA interaction set than prior work. We have developed an efficient SVM-based model for miRNA target prediction using recent CLIP-seq data, demonstrating superior performance, evaluated using ROC curves, specifically about 20% better than the state-of-the-art, for different species (human or mouse), or different target types (canonical or non-canonical). To the best of our knowledge, we provide the first distributed framework for microRNA target prediction based on Apache Hadoop and Spark. All source code and data are publicly available at https://bitbucket.org/cellsandmachines/avishkar.
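
To make the notion of a canonical site concrete, the following is a minimal sketch of locating perfect "seed" matches in an mRNA. It is not the Avishkar method: the abstract does not define the seed window, so the sketch assumes the common convention of miRNA nucleotides 2-8, and the function names and sequences are illustrative only.

```python
# Assumption: the "seed" is miRNA nucleotides 2-8 (a common convention,
# not stated in the abstract). All names here are hypothetical.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA string (A/C/G/U only)."""
    return rna.translate(COMPLEMENT)[::-1]


def canonical_seed_sites(mirna: str, mrna: str) -> list[int]:
    """Return 0-based mRNA positions with a perfect match to the seed.

    The miRNA is written 5'->3'; a canonical site is where the mRNA
    (also 5'->3') contains the reverse complement of the seed.
    """
    seed = mirna[1:8]                  # nucleotides 2-8, 0-based slice
    target = reverse_complement(seed)  # what the mRNA must contain
    sites = []
    pos = mrna.find(target)
    while pos != -1:
        sites.append(pos)
        pos = mrna.find(target, pos + 1)  # allow overlapping matches
    return sites


if __name__ == "__main__":
    mirna = "UGAGGUAGUAGGUUGUAUAGUU"   # illustrative miRNA sequence
    mrna = "AAACUACCUCAAA"             # toy target region
    print(canonical_seed_sites(mirna, mrna))
```

Non-canonical sites are, by definition, missed by a perfect-match scan like this one, which is the gap the abstract says the seed enrichment metric is meant to address.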


Journal of Biomedical Materials Research Part B | 2012

Development of a probucol-releasing antithrombogenic drug eluting stent

Kumar Vedantham; Somali Chaterji; Sungwon Kim; Kinam Park

The success of drug eluting stents (DESs) has been challenged by the manifestation of late stent thrombosis after DES implantation. The incomplete regeneration of the endothelial layer poststenting triggers adverse signaling processes precipitating in thrombosis. Various approaches have been attempted to prevent thrombosis, including the delivery of biological agents, such as estradiol, that promote endothelialization, and the use of natural polymers as coating materials. The underlying challenge has been the inability to release the biological agent in synchronization with the temporal sequence of vascular wound healing in vivo. The natural healing process of the endothelium after an injury starts after a week and may take up to a month in humans. This article presents a novel DES formulation using a hemocompatible polyurethane (PU) matrix to sustain the release of probucol (PB), an endothelial agonist, by exploiting the greater difference in the solubility parameters of PB and PU. This results in the formation of crystalline PB aggregates retarding drug release from PU. The physicochemical properties of PB in PU were confirmed using differential scanning calorimetry and X-ray diffraction. Drug-polymer compatibility was examined using infrared spectral analysis. Also, in vitro studies using primary human aortic endothelial cells resulted in the selection of 5% w/w PB as the optimal dose, to be further tested in vitro and in vivo. This work develops and tests a promising new DES formulation to enable faster endothelial cell proliferation poststenting, potentially minimizing the incidence and severity of thrombotic events after DES implantation.


IEEE International Conference on High Performance Computing, Data, and Analytics | 2014

Orion: scaling genomic sequence matching with fine-grained parallelization

Kanak Mahadik; Somali Chaterji; Bowen Zhou; Milind Kulkarni; Saurabh Bagchi

Gene sequencing instruments are producing huge volumes of data, straining the capabilities of current database searching algorithms and hindering efforts of researchers analyzing large collections of data to obtain greater insights. In the space of parallel genomic sequence search, most of the popular software packages, like mpiBLAST, use the database segmentation approach, wherein the entire database is sharded and searched on different nodes. However, this approach does not scale well with the increasing length of individual query sequences or with the rapid growth in size of sequence databases. In this paper, we propose a fine-grained parallelism technique, called Orion, that divides the input query into an adaptive number of fragments and shards the database. Our technique achieves higher parallelism (and hence speedup) and better load balancing than database sharding alone, while maintaining 100% accuracy. We show that it is 12.3X faster than mpiBLAST for solving a relevant comparative genomics problem.
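
The idea of combining query fragmentation with database sharding can be sketched roughly as follows. This is a toy illustration under stated assumptions, not Orion's implementation: exact word matching stands in for BLAST alignment, the fragment length is fixed rather than adaptive, and the use of overlapping fragments to avoid losing hits at fragment boundaries is an assumption the abstract does not spell out. All function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def fragment_query(query, fragment_len, overlap):
    """Split the query into overlapping (offset, fragment) pairs."""
    if not 0 <= overlap < fragment_len:
        raise ValueError("overlap must satisfy 0 <= overlap < fragment_len")
    step = fragment_len - overlap
    fragments, start = [], 0
    while True:
        fragments.append((start, query[start:start + fragment_len]))
        if start + fragment_len >= len(query):
            break
        start += step
    return fragments


def search_pair(task):
    """Search one query fragment against one database shard.

    Stand-in for an alignment tool: reports exact matches of every
    `word`-length substring as (sequence name, query pos, db pos).
    """
    (offset, fragment), shard, word = task
    hits = set()
    for i in range(len(fragment) - word + 1):
        w = fragment[i:i + word]
        for name, seq in shard.items():
            j = seq.find(w)
            while j != -1:
                hits.add((name, offset + i, j))
                j = seq.find(w, j + 1)
    return hits


def fine_grained_search(query, shards, fragment_len=8, overlap=3, word=4):
    """Run every (fragment, shard) pair as an independent task and merge."""
    if overlap < word - 1:
        # Otherwise a word straddling two fragments could be missed.
        raise ValueError("overlap must be at least word - 1")
    fragments = fragment_query(query, fragment_len, overlap)
    tasks = [(f, s, word) for f, s in product(fragments, shards)]
    with ThreadPoolExecutor() as pool:
        partial_results = pool.map(search_pair, tasks)
    merged = set()
    for hits in partial_results:
        merged |= hits  # the set removes duplicates from overlap regions
    return sorted(merged)


if __name__ == "__main__":
    shards = [{"s1": "TTACGTAA"}, {"s2": "GGTGCAACCC"}]
    print(fine_grained_search("ACGTACGTTGCAACGTAGCT", shards))
```

With an overlap of at least `word - 1`, the fragmented search returns the same hit set as searching the whole query, while the number of independent tasks grows from the number of shards to the number of fragments times the number of shards.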


PLOS ONE | 2014

Syndecan-1 Regulates Vascular Smooth Muscle Cell Phenotype

Somali Chaterji; Christoffer H. Lam; Derek Ho; Daniel C. Proske; Aaron B. Baker

Objective: We examined the role of syndecan-1 in modulating the phenotype of vascular smooth muscle cells in the context of endogenous inflammatory factors and altered microenvironments that occur in disease or injury-induced vascular remodeling. Methods and Results: Vascular smooth muscle cells (vSMCs) display a continuum of phenotypes that can be altered during vascular remodeling. While the syndecans have emerged as powerful and complex regulators of cell function, their role in controlling vSMC phenotype is unknown. Here, we isolated vSMCs from wild type (WT) and syndecan-1 knockout (S1KO) mice. Gene expression and western blotting studies indicated decreased levels of α-smooth muscle actin (α-SMA), calponin, and other vSMC-specific differentiation markers in S1KO relative to WT cells. The spread area of the S1KO cells was found to be greater than WT cells, with a corresponding increase in focal adhesion formation, Src phosphorylation, and alterations in actin cytoskeletal arrangement. In addition, S1KO led to increased S6RP phosphorylation and decreased AKT and PKC-α phosphorylation. To examine whether these changes were present in vivo, isolated aortae from aged WT and S1KO mice were stained for calponin. Consistent with our in-vitro findings, the WT mice aortae stained higher for calponin relative to S1KO. When exposed to the inflammatory cytokine TNF-α, WT vSMCs had an 80% reduction in syndecan-1 expression. Further, with TNF-α, S1KO vSMCs produced increased pro-inflammatory cytokines relative to WT. Finally, inhibition of interactions between syndecan-1 and integrins αvβ3 and αvβ5 using the inhibitory peptide synstatin appeared to have similar effects on vSMCs as knocking out syndecan-1, with decreased expression of vSMC differentiation markers and increased expression of inflammatory cytokines, receptors, and osteopontin. Conclusions: Taken together, our results support that syndecan-1 promotes vSMC differentiation and quiescence. Thus, the presence of syndecan-1 would have a protective effect against vSMC dedifferentiation and this activity is linked to interactions with integrins αvβ3 and αvβ5.


Frontiers in Education Conference | 2008

Effects of types of active learning activity on two junior-level computer engineering courses

Saurabh Bagchi; Mark C. Johnson; Somali Chaterji

In several computer engineering and computer science courses, it has been observed that active learning activities (ALAs) help students better understand the technical material. In this paper, we explore the influence of the type of the ALA and the academic quality of the student on the effectiveness of the technique. We perform the study in two junior-level courses: a course on discrete mathematics as applied to computer engineering topics and an ASIC (Application-Specific Integrated Circuit) design course. The first course has no laboratory component and teaches several abstract mathematical concepts. The latter course deals with the design of digital circuits using the VHDL hardware description language and has a laboratory component. We conduct ALAs of three kinds: solving problems in class with active participation of the students; homework problems that the students work on collaboratively, with solutions provided later; and practice examinations handed out before the actual examination, which the students are encouraged to solve in groups. The effect on the students is measured through examination questions. Looking at the aggregate class performance, the ALAs through in-class questions and homework do not appear to have a significant effect, while the practice examination questions do. However, on segmenting the data, we observe that the “A” students benefited from the in-class ALAs while both “A” and “B” students benefited from the practice examinations. The worst performing students did not benefit significantly from any of the ALAs. This study leads us to investigate further the possibility of tailoring the ALA to the different learning styles and academic calibers of the students.


Briefings in Bioinformatics | 2017

MG-RAST version 4—lessons learned from a decade of low-budget ultra-high-throughput metagenome analysis

Folker Meyer; Saurabh Bagchi; Somali Chaterji; Wolfgang Gerlach; Travis Harrison; Tobias Paczian; William L. Trimble; Andreas Wilke

As technologies change, MG-RAST is adapting. Newly available software is being included to improve accuracy and performance. As a computational service constantly running large volume scientific workflows, MG-RAST is the right location to perform benchmarking and implement algorithmic or platform improvements, in many cases involving trade-offs between specificity, sensitivity and run-time cost. The work in [Glass EM, Dribinsky Y, Yilmaz P, et al. ISME J 2014;8:1-3] is an example; we use existing well-studied data sets as gold standards representing different environments and different technologies to evaluate any changes to the pipeline. Currently, we use well-understood data sets in MG-RAST as a platform for benchmarking. The use of artificial data sets for pipeline performance optimization has not added value, as these data sets do not present the same challenges as real-world data sets. In addition, the MG-RAST team welcomes suggestions for improvements of the workflow. We are currently working on versions 4.02 and 4.1, both of which contain significant input from the community and our partners that will enable double barcoding, stronger inferences supported by longer-read technologies, and will increase throughput while maintaining sensitivity by using Diamond and SortMeRNA. On the technical platform side, the MG-RAST team intends to support the Common Workflow Language as a standard to specify bioinformatics workflows, both to facilitate development and efficient high-performance implementation of the community's data analysis tasks.


Communication Systems and Networks | 2016

Fast training on large genomics data using distributed Support Vector Machines

Nawanol Theera-Ampornpunt; Seong Gon Kim; Asish Ghoshal; Saurabh Bagchi; Somali Chaterji

The field of genomics has seen a glorious explosion of high-quality data, with tremendous strides having been made in genomic sequencing instruments and computational genomics applications meant to make sense of the data. A common use case for genomics data is to answer the question of whether a specific genetic signature is correlated with some disease manifestations. Support Vector Machine (SVM) is a widely used classifier in computational literature. Previous studies have shown success in using these SVMs for the above use case of genomics data. However, SVMs suffer from a widely recognized scalability problem in both memory use and computational time. It is as yet an unanswered question whether training such classifiers can scale to the massive sizes that characterize many of the genomics data sets. We answer that question here for a specific dataset, in order to decipher whether some regulatory module of a particular combinatorial epigenetic “pattern” will regulate the expression of a gene. However, the specifics of the dataset are likely of less relevance to the claims of our work. We take a proposed theoretical technique for efficient training of SVM, namely Cascade SVM, create our classifier called EP-SVM, and empirically evaluate how it scales to the large genomics dataset. We implement Cascade SVM on the Apache Spark platform and open source this implementation. Through our evaluation, we bring out the computational cost on each application process, the way of distributing the overall workload among multiple processes, which can potentially execute on different cores or different machines, and the cost of data transfer to different cores or different machines. We believe we are the first to shed light on the computational and network costs of training an SVM on a multi-dimensional genomics dataset. We also evaluate the accuracy of the classifier result as a function of the parameters of the SVM model.

Collaboration


Dive into Somali Chaterji's collaborations.

Top Co-Authors

Aaron B. Baker

University of Texas at Austin

Folker Meyer

Argonne National Laboratory

Christoffer H. Lam

University of Texas at Austin

Deok Ho Kim

University of Washington

Derek Ho

University of Texas at Austin
