Publication


Featured research published by Tian Mi.


Nucleic Acids Research | 2009

Minimotif miner 2nd release: a database and web system for motif search

Sanguthevar Rajasekaran; Sudha Balla; Patrick R. Gradie; Michael R. Gryk; Krishna Kadaveru; Vamsi Kundeti; Mark W. Maciejewski; Tian Mi; Nicholas Rubino; Jay Vyas; Martin R. Schiller

Minimotif Miner (MnM) consists of a minimotif database and a web-based application that enables prediction of motif-based functions in user-supplied protein queries. We have revised MnM by expanding the database more than 10-fold to approximately 5000 motifs and standardized the motif function definitions. The web-application user interface has been redeveloped with new features including improved navigation, screencast-driven help, support for alias names and expanded SNP analysis. A sample analysis of prion shows how MnM 2 can be used. Weblink: http://mnm.engr.uconn.edu, weblink for version 1 is http://sms.engr.uconn.edu.


Nucleic Acids Research | 2012

Minimotif Miner 3.0: database expansion and significantly improved reduction of false-positive predictions from consensus sequences.

Tian Mi; Jerlin Camilus Merlin; Sandeep Deverasetty; Michael R. Gryk; Travis J. Bill; Andy Brooks; Logan Y. Lee; Viraj Rathnayake; Christian A. Ross; David P. Sargeant; Christy L. Strong; Paula Watts; Sanguthevar Rajasekaran; Martin R. Schiller

Minimotif Miner (MnM available at http://minimotifminer.org or http://mnm.engr.uconn.edu) is an online database for identifying new minimotifs in protein queries. Minimotifs are short contiguous peptide sequences that have a known function in at least one protein. Here we report the third release of the MnM database which has now grown 60-fold to approximately 300 000 minimotifs. Since short minimotifs are by their nature not very complex we also summarize a new set of false-positive filters and linear regression scoring that vastly enhance minimotif prediction accuracy on a test data set. This online database can be used to predict new functions in proteins and causes of disease.


Nature Communications | 2016

Reprogramming metabolic pathways in vivo with CRISPR/Cas9 genome editing to treat hereditary tyrosinaemia

Francis P. Pankowicz; Mercedes Barzi; Xavier Legras; Leroy Hubert; Tian Mi; Julie A. Tomolonis; Milan Ravishankar; Qin Sun; Diane Yang; Malgorzata Borowiak; Pavel Sumazin; Sarah H. Elsea; Beatrice Bissig-Choisat; Karl-Dimiter Bissig

Many metabolic liver disorders are refractory to drug therapy and require orthotopic liver transplantation. Here we demonstrate a new strategy, which we call metabolic pathway reprogramming, to treat hereditary tyrosinaemia type I in mice; rather than edit the disease-causing gene, we delete a gene in a disease-associated pathway to render the phenotype benign. Using CRISPR/Cas9 in vivo, we convert hepatocytes from tyrosinaemia type I into the benign tyrosinaemia type III by deleting Hpd (hydroxyphenylpyruvate dioxygenase). Edited hepatocytes (Fah−/−/Hpd−/−) display a growth advantage over non-edited hepatocytes (Fah−/−/Hpd+/+) and, in some mice, almost completely replace them within 8 weeks. Hpd excision successfully reroutes tyrosine catabolism, leaving treated mice healthy and asymptomatic. Metabolic pathway reprogramming sidesteps potential difficulties associated with editing a critical disease-causing gene and can be explored as an option for treating other diseases.


Journal of Biological Chemistry | 2009

RecR-mediated Modulation of RecF Dimer Specificity for Single- and Double-stranded DNA

Nodar Makharashvili; Tian Mi; Olga Koroleva; Sergey Korolev

RecF pathway proteins play an important role in the restart of stalled replication and DNA repair in prokaryotes. Following DNA damage, RecF, RecR, and RecO initiate homologous recombination (HR) by loading of the RecA recombinase on single-stranded (ss) DNA, protected by ssDNA-binding protein. The specific role of RecF in this process is not well understood. Previous studies have proposed that RecF directs the RecOR complex to boundaries of damaged DNA regions by recognizing single-stranded/double-stranded (ss/ds) DNA junctions. RecF belongs to ABC-type ATPases, which function through an ATP-dependent dimerization. Here, we demonstrate that the RecF of Deinococcus radiodurans interacts with DNA as an ATP-dependent dimer, and that the DNA binding and ATPase activity of RecF depend on both the structure of DNA substrate, and the presence of RecR. We found that RecR interacts as a tetramer with the RecF dimer. RecR increases the RecF affinity to dsDNA without stimulating ATP hydrolysis but destabilizes RecF binding to ssDNA and dimerization, likely due to increasing the ATPase rate. The DNA-dependent binding of RecR to the RecF-DNA complex occurs through specific protein-protein interactions without significant contributions from RecR-DNA interactions. Finally, RecF neither alone nor in complex with RecR preferentially binds to the ss/dsDNA junction. Our data suggest that the specificity of the RecFOR complex toward the boundaries of DNA damaged regions may result from a network of protein-protein and DNA-protein interactions, rather than a simple recognition of the ss/dsDNA junction by RecF.


PLOS ONE | 2010

Partitioning of Minimotifs Based on Function with Improved Prediction Accuracy

Sanguthevar Rajasekaran; Tian Mi; Jerlin Camilus Merlin; Aaron Oommen; Patrick R. Gradie; Martin R. Schiller

Background Minimotifs are short contiguous peptide sequences in proteins that are known to have a function in at least one other protein. One of the principal limitations in minimotif prediction is that false positives limit the usefulness of this approach. As a step toward resolving this problem we have built, implemented, and tested a new data-driven algorithm that reduces false-positive predictions. Methodology/Principal Findings Certain domains and minimotifs are known to be strongly associated with a known cellular process or molecular function. Therefore, we hypothesized that by restricting minimotif predictions to those where the minimotif containing protein and target protein have a related cellular or molecular function, the prediction is more likely to be accurate. This filter was implemented in Minimotif Miner using function annotations from the Gene Ontology. We have also combined two filters that are based on entirely different principles and this combined filter has a better predictability than the individual components. Conclusions/Significance Testing these functional filters on known and random minimotifs has revealed that they are capable of separating true motifs from false positives. In particular, for the cellular function filter, the percentage of known minimotifs that are not removed by the filter is ∼4.6 times that of random minimotifs. For the molecular function filter this ratio is ∼2.9. These results, together with the comparison with the published frequency score filter, strongly suggest that the new filters differentiate true motifs from random background with good confidence. A combination of the function filters and the frequency score filter performs better than these two individual filters.


Proteins | 2011

A computational tool for identifying minimotifs in protein-protein interactions and improving the accuracy of minimotif predictions

Sanguthevar Rajasekaran; Jerlin Camilus Merlin; Vamsi Kundeti; Tian Mi; Aaron Oommen; Jay Vyas; Izua J. Alaniz; Keith Chung; Farah Chowdhury; Sandeep Deverasatty; Tenisha M. Irvey; David Lacambacal; Darlene Lara; Subhasree Panchangam; Viraj Rathnayake; Paula Watts; Martin R. Schiller

Protein–protein interactions are important to understanding cell functions; however, our theoretical understanding is limited. There is a general discontinuity between the well‐accepted physical and chemical forces that drive protein–protein interactions and the large collections of identified protein–protein interactions in various databases. Minimotifs are short functional peptide sequences that provide a basis to bridge this gap in knowledge. However, there is no systematic way to study minimotifs in the context of protein–protein interactions or vice versa. Here we have engineered a set of algorithms that can be used to identify minimotifs in known protein–protein interactions and implemented this for use by scientists in Minimotif Miner. By globally testing these algorithms on verified data and on 100 individual proteins as test cases, we demonstrate the utility of these new computation tools. This tool also can be used to reduce false‐positive predictions in the discovery of novel minimotifs. The statistical significance of these algorithms is demonstrated by an ROC analysis (P = 0.001). Proteins 2010.


BMC Medical Informatics and Decision Making | 2012

Efficient algorithms for fast integration on large data sets from multiple sources

Tian Mi; Sanguthevar Rajasekaran; Robert H. Aseltine

Background Recent large scale deployments of health information technology have created opportunities for the integration of patient medical records with disparate public health, human service, and educational databases to provide comprehensive information related to health and development. Data integration techniques, which identify records belonging to the same individual that reside in multiple data sets, are essential to these efforts. Several algorithms have been proposed in the literature that are adept at integrating records from two different datasets. Our algorithms are aimed at integrating multiple (in particular more than two) datasets efficiently. Methods Hierarchical clustering based solutions are used to integrate multiple (in particular more than two) datasets. Edit distance is used as the basic distance calculation, while distance calculation of common input errors is also studied. Several techniques have been applied to improve the algorithms in terms of both time and space: 1) Partial Construction of the Dendrogram (PCD) that ignores the level above the threshold; 2) Ignoring the Dendrogram Structure (IDS); 3) Faster Computation of the Edit Distance (FCED) that predicts the distance with the threshold by upper bounds on edit distance; and 4) A pre-processing blocking phase that limits dynamic computation within each block. Results We have experimentally validated our algorithms on large simulated as well as real data. Accuracy and completeness are defined stringently to show the performance of our algorithms. In addition, we employ a four-category analysis. Comparison with FEBRL shows the robustness of our approach. Conclusions In the experiments we conducted, the accuracy we observed exceeded 90% for the simulated data in most cases. Accuracies of 97.7% and 98.1% were achieved for the constant and proportional threshold, respectively, in a real dataset of 1,083,878 records.
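
The abstract above mentions comparing edit distance against a threshold. As a rough illustration only, the sketch below computes Levenshtein distance row by row and stops early once no cell in a row can stay within the threshold. This is a generic early-termination technique, not the published FCED method, and the function name `within_threshold` is hypothetical.

```python
def within_threshold(a: str, b: str, threshold: int) -> bool:
    """Return True if the Levenshtein distance between a and b is <= threshold."""
    # A length gap larger than the threshold already rules the pair out.
    if abs(len(a) - len(b)) > threshold:
        return False

    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        # Row minima never decrease, so the final distance is at least min(curr).
        if min(curr) > threshold:
            return False
        prev = curr
    return prev[-1] <= threshold
```

For example, `within_threshold("smith", "smyth", 1)` accepts the pair, while a pair whose lengths differ by more than the threshold is rejected without filling any table rows.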


Journal of the American Medical Informatics Association | 2014

Efficient sequential and parallel algorithms for record linkage

Abdullah Al Mamun; Tian Mi; Robert H. Aseltine; Sanguthevar Rajasekaran

Background and objective Integrating data from multiple sources is a crucial and challenging problem. Even though there exist numerous algorithms for record linkage or deduplication, they suffer from either large time needs or restrictions on the number of datasets that they can integrate. In this paper we report efficient sequential and parallel algorithms for record linkage which handle any number of datasets and outperform previous algorithms. Methods Our algorithms employ hierarchical clustering algorithms as the basis. A key idea that we use is radix sorting on certain attributes to eliminate identical records before any further processing. Another novel idea is to form a graph that links similar records and find the connected components. Results Our sequential and parallel algorithms have been tested on a real dataset of 1 083 878 records and synthetic datasets ranging in size from 50 000 to 9 000 000 records. Our sequential algorithm runs at least two times faster, for any dataset, than the previous best-known algorithm, the two-phase algorithm using faster computation of the edit distance (TPA (FCED)). The speedups obtained by our parallel algorithm are almost linear. For example, we get a speedup of 7.5 with 8 cores (residing in a single node), 14.1 with 16 cores (residing in two nodes), and 26.4 with 32 cores (residing in four nodes). Conclusions We have compared the performance of our sequential algorithm with TPA (FCED) and found that our algorithm outperforms the previous one. The accuracy is the same as that of this previous best-known algorithm.
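
The two ideas named in the abstract above, removing identical records first and then finding connected components in a graph of similar records, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: a dictionary stands in for the radix sort, every pair of distinct records is compared (quadratic, so suitable only for small inputs), and `link_records` and `is_similar` are hypothetical names.

```python
from itertools import combinations


def link_records(records, is_similar):
    """Group record indices into clusters of records judged to be the same entity.

    records: list of hashable records (e.g. strings or tuples).
    is_similar: caller-supplied function (rec_a, rec_b) -> bool.
    """
    # Step 1: collapse exact duplicates (assumption: dict grouping replaces radix sort).
    groups = {}
    for idx, rec in enumerate(records):
        groups.setdefault(rec, []).append(idx)
    unique = list(groups)

    # Step 2: union-find over the distinct records.
    parent = list(range(len(unique)))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # Each similar pair is an edge; merging endpoints yields connected components.
    for i, j in combinations(range(len(unique)), 2):
        if is_similar(unique[i], unique[j]):
            parent[find(i)] = find(j)

    # Step 3: map components back to the original record indices.
    clusters = {}
    for i, rec in enumerate(unique):
        clusters.setdefault(find(i), []).extend(groups[rec])
    return sorted(sorted(c) for c in clusters.values())
```

Calling `link_records` with a list of name strings and a similarity function returns a list of index clusters, one per presumed individual.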


PLOS ONE | 2012

Achieving High Accuracy Prediction of Minimotifs

Tian Mi; Sanguthevar Rajasekaran; Jerlin Camilus Merlin; Michael R. Gryk; Martin R. Schiller

The low complexity of minimotif patterns results in a high false-positive prediction rate, hampering protein function prediction. A multi-filter algorithm, trained and tested on a linear regression model, support vector machine model, and neural network model, using a large dataset of verified minimotifs, vastly improves minimotif prediction accuracy while generating few false positives. An optimal threshold for the best accuracy reaches an overall accuracy above 90%, while a stringent threshold for the best specificity generates less than 1% false positives or even no false positives and still produces more than 90% true positives for the linear regression and neural network models. The minimotif multi-filter with its excellent accuracy represents the state-of-the-art in minimotif prediction and is expected to be very useful to biologists investigating protein function and how missense mutations cause disease.


PLOS ONE | 2012

Reducing False-Positive Prediction of Minimotifs with a Genetic Interaction Filter

Jerlin Camilus Merlin; Sanguthevar Rajasekaran; Tian Mi; Martin R. Schiller

Background Minimotifs are short contiguous peptide sequences in proteins that have known functions. At its simplest level, the minimotif sequence is present in a source protein and has an activity relationship with a target, most of which are proteins. While many scientists routinely investigate new minimotif functions in proteins, the major web-based discovery tools have a high rate of false-positive prediction. Any new approach that reduces false-positives will be of great help to biologists. Methods and Findings We have built three filters that use genetic interactions to reduce false-positive minimotif predictions. The basic filter identifies those minimotifs where the source/target protein pairs have a known genetic interaction. The HomoloGene genetic interaction filter extends these predictions to predicted genetic interactions of orthologous proteins and the node-based filter identifies those minimotifs where proteins that have a genetic interaction with the source or target have a genetic interaction. Each filter was evaluated with a test data set containing thousands of true and false-positives. Based on sensitivity and selectivity performance metrics, the basic filter had the best discrimination for true positives, whereas the node-based filter had the highest sensitivity. We have implemented these genetic interaction filters on the Minimotif Miner 2.3 website. The genetic interaction filter is particularly useful for improving predictions of posttranslational modifications such as phosphorylation and proteolytic cleavage sites. Conclusions Genetic interaction data sets can be used to reduce false-positive minimotif predictions. Minimotif prediction in known genetic interactions can help to refine the mechanisms behind the functional connection between genes revealed by genetic experimentation and screens.

Collaboration


Dive into Tian Mi's collaborations.

Top Co-Authors

Michael R. Gryk

University of Connecticut Health Center

Robert H. Aseltine

University of Connecticut Health Center

Aaron Oommen

University of Connecticut

Jay Vyas

University of Connecticut Health Center

Paula Watts

University of Connecticut Health Center

Vamsi Kundeti

University of Connecticut Health Center
