Publication


Featured research published by Jimin Pei.


Cell | 2012

Cell-free Formation of RNA Granules: Low Complexity Sequence Domains Form Dynamic Fibers within Hydrogels

Masato Kato; Tina W. Han; Shanhai Xie; Kevin Y. Shi; Xinlin Du; Leeju C. Wu; Hamid Mirzaei; Elizabeth J. Goldsmith; Jamie Longgood; Jimin Pei; Nick V. Grishin; Douglas E. Frantz; Jay W. Schneider; She Chen; Lin Li; Michael R. Sawaya; David Eisenberg; Robert Tycko; Steven L. McKnight

Eukaryotic cells contain assemblies of RNAs and proteins termed RNA granules. Many proteins within these bodies contain KH or RRM RNA-binding domains as well as low complexity (LC) sequences of unknown function. We discovered that exposure of cell or tissue lysates to a biotinylated isoxazole (b-isox) chemical precipitated hundreds of RNA-binding proteins with significant overlap to the constituents of RNA granules. The LC sequences within these proteins are both necessary and sufficient for b-isox-mediated aggregation, and these domains can undergo a concentration-dependent phase transition to a hydrogel-like state in the absence of the chemical. X-ray diffraction and EM studies revealed the hydrogels to be composed of uniformly polymerized amyloid-like fibers. Unlike pathogenic fibers, the LC sequence-based polymers described here are dynamic and accommodate heterotypic polymerization. These observations offer a framework for understanding the function of LC sequences as well as an organizing principle for cellular structures that are not membrane bound.


Nucleic Acids Research | 2008

PROMALS3D: A tool for multiple protein sequence and structure alignments

Jimin Pei; Bong Hyun Kim; Nick V. Grishin

Although multiple sequence alignments (MSAs) are essential for a wide range of applications from structure modeling to prediction of functional sites, construction of accurate MSAs for distantly related proteins remains a largely unsolved problem. The rapidly increasing database of spatial structures is a valuable source to improve alignment quality. We explore the use of 3D structural information to guide sequence alignments constructed by our MSA program PROMALS. The resulting tool, PROMALS3D, automatically identifies homologs with known 3D structures for the input sequences, derives structural constraints through structure-based alignments and combines them with sequence constraints to construct consistency-based multiple sequence alignments. The output is a consensus alignment that brings together sequence and structural information about input proteins and their homologs. PROMALS3D can also align sequences of multiple input structures, with the output representing a multiple structure-based alignment refined in combination with sequence constraints. The advantage of PROMALS3D is that it gives researchers an easy way to produce high-quality alignments consistent with both sequences and structures of proteins. PROMALS3D outperforms a number of existing methods for constructing multiple sequence or structural alignments using both reference-dependent and reference-independent evaluation methods.


Molecular & Cellular Proteomics | 2009

Lysine Acetylation Is a Highly Abundant and Evolutionarily Conserved Modification in Escherichia coli

Junmei Zhang; Robert Sprung; Jimin Pei; Xiaohong Tan; Sungchan Kim; Heng Zhu; Chuan-Fa Liu; Nick V. Grishin; Yingming Zhao

Lysine acetylation and its regulatory enzymes are known to have pivotal roles in mammalian cellular physiology. However, the extent and function of this modification in prokaryotic cells remain largely unexplored, thereby presenting a hurdle to further functional study of this modification in prokaryotic systems. Here we report the first global screening of lysine acetylation, identifying 138 modification sites in 91 proteins from Escherichia coli. None of the proteins has been previously associated with this modification. Among the identified proteins are transcriptional regulators, as well as others with diverse functions. Interestingly, more than 70% of the acetylated proteins are metabolic enzymes and translation regulators, suggesting an intimate link of this modification to energy metabolism. The new dataset suggests that lysine acetylation could be abundant in prokaryotic cells. In addition, these results also imply that functions of lysine acetylation beyond regulation of gene expression are evolutionarily conserved from bacteria to mammals. Furthermore, we demonstrate that bacterial lysine acetylation is regulated in response to stress stimuli.


Bioinformatics | 2001

AL2CO: calculation of positional conservation in a protein sequence alignment

Jimin Pei; Nick V. Grishin

MOTIVATION: Amino acid sequence alignments are widely used in the analysis of protein structure, function and evolutionary relationships. Proteins within a superfamily usually share the same fold and possess related functions. These structural and functional constraints are reflected in the alignment conservation patterns. Positions of functional and/or structural importance tend to be more conserved. Conserved positions are usually clustered in distinct motifs surrounded by sequence segments of low conservation. Poorly conserved regions might also arise from the imperfections in multiple alignment algorithms and thus indicate possible alignment errors. Quantification of conservation by attributing a conservation index to each aligned position makes motif detection more convenient. Mapping these conservation indices onto a protein spatial structure helps to visualize spatial conservation features of the molecule and to predict functionally and/or structurally important sites. Analysis of conservation indices could be a useful tool in detection of potentially misaligned regions and will aid in improvement of multiple alignments. RESULTS: We developed a program to calculate a conservation index at each position in a multiple sequence alignment using several methods. Namely, amino acid frequencies at each position are estimated and the conservation index is calculated from these frequencies. We utilize both unweighted frequencies and frequencies weighted using two different strategies. Three conceptually different approaches (entropy-based, variance-based and matrix score-based) are implemented in the algorithm to define the conservation index. Calculating conservation indices for 35522 positions in 284 alignments from SMART database we demonstrate that different methods result in highly correlated (correlation coefficient more than 0.85) conservation indices. Conservation indices show statistically significant correlation between sequentially adjacent positions i and i + j, where j < 13, and averaging of the indices over the window of three positions is optimal for motif detection. Positions with gaps display substantially lower conservation properties. We compare conservation properties of the SMART alignments or FSSP structural alignments to those of the ClustalW alignments. The results suggest that conservation indices should be a valuable tool of alignment quality assessment and might be used as an objective function for refinement of multiple alignments. AVAILABILITY: The C code of the AL2CO program and its pre-compiled versions for several platforms as well as the details of the analysis are freely available at ftp://iole.swmed.edu/pub/al2co/.
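
To make the entropy-based approach more concrete, here is a minimal sketch. It is not the AL2CO code: it assumes unweighted frequencies, ignores gap characters, takes the conservation index to be the negative Shannon entropy of each column, and smooths with a three-position window. The function names and the toy alignment are hypothetical.

```python
import math
from collections import Counter


def entropy_conservation(alignment, gap="-"):
    """Return one conservation index per column (negative Shannon entropy).

    Assumption: unweighted amino acid frequencies, gaps ignored.
    Values closer to 0 mean a more conserved column.
    """
    n_cols = len(alignment[0])
    if any(len(seq) != n_cols for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    indices = []
    for col in range(n_cols):
        residues = [seq[col] for seq in alignment if seq[col] != gap]
        if not residues:
            indices.append(float("nan"))  # column made of gaps only
            continue
        counts = Counter(residues)
        total = len(residues)
        entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
        indices.append(-entropy)
    return indices


def window_average(values, window=3):
    """Average each value with its neighbours; the window shrinks at the ends."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - half): i + half + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


# Toy alignment: two invariant columns, one two-state column, one fully variable column.
toy_alignment = ["ACDA", "ACEG", "ACDH", "ACEK"]
raw = entropy_conservation(toy_alignment)
smooth = window_average(raw, window=3)
```

In this sketch the invariant columns score 0 and the fully variable column scores lowest; the weighted-frequency, variance-based and matrix score-based variants named in the abstract would replace the body of `entropy_conservation`.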


Bioinformatics | 2007

PROMALS: towards accurate multiple sequence alignments of distantly related proteins

Jimin Pei; Nick V. Grishin

MOTIVATION: Accurate multiple sequence alignments are essential in protein structure modeling, functional prediction and efficient planning of experiments. Although the alignment problem has attracted considerable attention, preparation of high-quality alignments for distantly related sequences remains a difficult task. RESULTS: We developed PROMALS, a multiple alignment method that shows promising results for protein homologs with sequence identity below 10%, aligning close to half of the amino acid residues correctly on average. This is about three times more accurate than traditional pairwise sequence alignment methods. PROMALS algorithm derives its strength from several sources: (i) sequence database searches to retrieve additional homologs; (ii) accurate secondary structure prediction; (iii) a hidden Markov model that uses a novel combined scoring of amino acids and secondary structures; (iv) probabilistic consistency-based scoring applied to progressive alignment of profiles. Compared to the best alignment methods that do not use secondary structure prediction and database searches (e.g. MUMMALS, ProbCons and MAFFT), PROMALS is up to 30% more accurate, with improvement being most prominent for highly divergent homologs. Compared to SPEM and HHalign, which also employ database searches and secondary structure prediction, PROMALS shows an accuracy improvement of several percent. AVAILABILITY: The PROMALS web server is available at: http://prodata.swmed.edu/promals/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
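
The idea behind probabilistic consistency-based scoring in item (iv) can be illustrated with a small sketch. This is the generic consistency transformation popularized by ProbCons, not PROMALS's own code: the match probability between residues of two sequences is re-estimated by averaging over paths through every sequence in the set. The input matrices, sequence names and function names are hypothetical; in a real aligner the pairwise probabilities would come from a pairwise alignment model such as an HMM.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def identity(n):
    """n x n identity matrix: a sequence matches itself position by position."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def transpose(m):
    return [list(row) for row in zip(*m)]


def consistency_transform(posteriors, lengths, x, y):
    """Re-estimate match probabilities for x and y via every sequence z.

    posteriors[(a, b)] is a lengths[a] x lengths[b] matrix of pairwise match
    probabilities, stored once per unordered pair (hypothetical toy input).
    """
    def get(a, b):
        if a == b:
            return identity(lengths[a])
        if (a, b) in posteriors:
            return posteriors[(a, b)]
        return transpose(posteriors[(b, a)])

    names = list(lengths)
    total = [[0.0] * lengths[y] for _ in range(lengths[x])]
    for z in names:
        through_z = matmul(get(x, z), get(z, y))  # paths x -> z -> y
        for i in range(lengths[x]):
            for j in range(lengths[y]):
                total[i][j] += through_z[i][j]
    return [[v / len(names) for v in row] for row in total]


# Toy example: x and y are only weakly matched directly, but both match z well.
lengths = {"x": 2, "y": 2, "z": 2}
posteriors = {
    ("x", "y"): [[0.5, 0.1], [0.1, 0.5]],
    ("x", "z"): [[0.9, 0.0], [0.0, 0.9]],
    ("y", "z"): [[0.9, 0.0], [0.0, 0.9]],
}
refined = consistency_transform(posteriors, lengths, "x", "y")
```

In the toy example the diagonal match probabilities rise and the off-diagonal ones fall, because the third sequence supports the diagonal pairing.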


Proteins | 2009

Structure prediction for CASP8 with all-atom refinement using Rosetta

Srivatsan Raman; Robert B. Vernon; James Thompson; Michael D. Tyka; Ruslan I. Sadreyev; Jimin Pei; David E. Kim; Elizabeth H. Kellogg; Frank DiMaio; Oliver F. Lange; Lisa N. Kinch; Will Sheffler; Bong Hyun Kim; Rhiju Das; Nick V. Grishin; David Baker

We describe predictions made using the Rosetta structure prediction methodology for the Eighth Critical Assessment of Techniques for Protein Structure Prediction. Aggressive sampling and all‐atom refinement were carried out for nearly all targets. A combination of alignment methodologies was used to generate starting models from a range of templates, and the models were then subjected to Rosetta all atom refinement. For the 64 domains with readily identified templates, the best submitted model was better than the best alignment to the best template in the Protein Data Bank for 24 cases, and improved over the best starting model for 43 cases. For 13 targets where only very distant sequence relationships to proteins of known structure were detected, models were generated using the Rosetta de novo structure prediction methodology followed by all‐atom refinement; in several cases the submitted models were better than those based on the available templates. Of the 12 refinement challenges, the best submitted model improved on the starting model in seven cases. These improvements over the starting template‐based models and refinement tests demonstrate the power of Rosetta structure refinement in improving model accuracy. Proteins 2009.


Bioinformatics | 2003

PCMA: fast and accurate multiple sequence alignment based on profile consistency

Jimin Pei; Ruslan I. Sadreyev; Nick V. Grishin

PCMA (profile consistency multiple sequence alignment) is a progressive multiple sequence alignment program that combines two different alignment strategies. Highly similar sequences are aligned in a fast way as in ClustalW, forming pre-aligned groups. The T-Coffee strategy is applied to align the relatively divergent groups based on profile-profile comparison and consistency. The scoring function for local alignments of pre-aligned groups is based on a novel profile-profile comparison method that is a generalization of the PSI-BLAST approach to profile-sequence comparison. PCMA balances speed and accuracy in a flexible way and is suitable for aligning large numbers of sequences. AVAILABILITY: PCMA is freely available for non-commercial use. Pre-compiled versions for several platforms can be downloaded from ftp://iole.swmed.edu/pub/PCMA/.
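
The first stage, separating highly similar sequences into groups before the more expensive consistency step, might look roughly like the sketch below. It is an illustration only, not PCMA's procedure: the identity measure is a stand-in that assumes equal-length strings, the single-linkage grouping rule and the 0.8 threshold are assumptions, and all names are hypothetical.

```python
def percent_identity(a, b):
    """Fraction of identical positions; stand-in that assumes equal-length strings."""
    if len(a) != len(b):
        raise ValueError("stand-in identity needs equal-length sequences")
    return sum(1 for p, q in zip(a, b) if p == q) / len(a)


def group_similar(seqs, threshold):
    """Single-linkage grouping: two sequences share a group if identity >= threshold."""
    parent = list(range(len(seqs)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving keeps the trees shallow
            i = parent[i]
        return i

    for i in range(len(seqs)):
        for j in range(i + 1, len(seqs)):
            if percent_identity(seqs[i], seqs[j]) >= threshold:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(len(seqs)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Two pairs of near-identical toy sequences that are unrelated to each other.
toy = ["ACDEFGHIKL", "ACDEFGHIKV", "MNPQRSTVWY", "MNPQRSTVWF"]
groups = group_similar(toy, threshold=0.8)
```

Each resulting group would be aligned quickly and treated as a profile, leaving only the divergent groups for profile-profile comparison.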


Genes & Development | 2008

The conserved plant sterility gene HAP2 functions after attachment of fusogenic membranes in Chlamydomonas and Plasmodium gametes

Yanjie Liu; Rita Tewari; Jue Ning; Andrew M. Blagborough; Sara Garbom; Jimin Pei; Nick V. Grishin; Robert E. Steele; Robert E. Sinden; William J. Snell; Oliver Billker

The cellular and molecular mechanisms that underlie species-specific membrane fusion between male and female gametes remain largely unknown. Here, by use of gene discovery methods in the green alga Chlamydomonas, gene disruption in the rodent malaria parasite Plasmodium berghei, and distinctive features of fertilization in both organisms, we report discovery of a mechanism that accounts for a conserved protein required for gamete fusion. A screen for fusion mutants in Chlamydomonas identified a homolog of HAP2, an Arabidopsis sterility gene. Moreover, HAP2 disruption in Plasmodium blocked fertilization and thereby mosquito transmission of malaria. HAP2 localizes at the fusion site of Chlamydomonas minus gametes, yet Chlamydomonas minus and Plasmodium hap2 male gametes retain the ability, using other, species-limited proteins, to form tight prefusion membrane attachments with their respective gamete partners. Membrane dye experiments show that HAP2 is essential for membrane merger. Thus, in two distantly related eukaryotes, species-limited proteins govern access to a conserved protein essential for membrane fusion.


Proteins | 2001

GGDEF domain is homologous to adenylyl cyclase.

Jimin Pei; Nick V. Grishin

The GGDEF domain is detected in many prokaryotic proteins, most of which are of unknown function. Several bacteria carry 12–22 different GGDEF homologues in their genomes. Conducting extensive profile‐based searches, we detect statistically supported sequence similarity between GGDEF domain and adenylyl cyclase catalytic domain. From this homology, we deduce that the prokaryotic GGDEF domain is a regulatory enzyme involved in nucleotide cyclization, with the fold similar to that of the eukaryotic cyclase catalytic domain. This prediction correlates with the functional information available on two GGDEF‐containing proteins, namely diguanylate cyclase and phosphodiesterase A of Acetobacter xylinum, both of which regulate the turnover of cyclic diguanosine monophosphate. Domain architecture analysis shows that GGDEF is typically present in multidomain proteins containing regulatory domains of signaling pathways or protein–protein interaction modules. Evolutionary tree analysis indicates that GGDEF/cyclase superfamily forms a large diversified cluster of orthologous proteins present in bacteria, archaea, and eukaryotes. Proteins 2001;42:210–216.


Nucleic Acids Research | 2006

MUMMALS: multiple sequence alignment improved by using hidden Markov models with local structural information

Jimin Pei; Nick V. Grishin

We have developed MUMMALS, a program to construct multiple protein sequence alignment using probabilistic consistency. MUMMALS improves alignment quality by using pairwise alignment hidden Markov models (HMMs) with multiple match states that describe local structural information without exploiting explicit structure predictions. Parameters for such models have been estimated from a large library of structure-based alignments. We show that (i) on remote homologs, MUMMALS achieves statistically best accuracy among several leading aligners, such as ProbCons, MAFFT and MUSCLE, albeit the average improvement is small, in the order of several percent; (ii) a large collection (>10 000) of automatically computed pairwise structure alignments of divergent protein domains is superior to smaller but carefully curated datasets for estimation of alignment parameters and performance tests; (iii) reference-independent evaluation of alignment quality using sequence alignment-dependent structure superpositions correlates well with reference-dependent evaluation that compares sequence-based alignments to structure-based reference alignments.
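
Reference-dependent evaluation, as mentioned in point (iii), is commonly expressed as the fraction of residue pairs aligned in a reference alignment that a test alignment reproduces. The sketch below shows that measure for a single pair of sequences; it is an assumed, generic formulation rather than the exact scoring used in the MUMMALS paper, and the function names and toy alignments are hypothetical.

```python
def aligned_pairs(row_a, row_b, gap="-"):
    """Set of (i, j) residue index pairs that share a column in a pairwise alignment."""
    pairs = set()
    i = j = 0
    for ca, cb in zip(row_a, row_b):
        if ca != gap and cb != gap:
            pairs.add((i, j))
        if ca != gap:
            i += 1
        if cb != gap:
            j += 1
    return pairs


def q_score(test, reference):
    """Fraction of reference residue pairs reproduced by the test alignment."""
    ref_pairs = aligned_pairs(*reference)
    if not ref_pairs:
        raise ValueError("reference alignment has no aligned residue pairs")
    test_pairs = aligned_pairs(*test)
    return len(ref_pairs & test_pairs) / len(ref_pairs)


# Both alignments contain the same two sequences, ACDEF and ACGEF.
reference = ("ACD-EF", "AC-GEF")  # stands in for a structure-based alignment
test = ("ACD-EF", "ACGEF-")       # stands in for a sequence-based alignment
score = q_score(test, reference)
```

Here the test alignment recovers the first two of the four reference pairs, so the score is 0.5.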

Collaboration


Dive into Jimin Pei's collaborations.

Top Co-Authors

Nick V. Grishin

University of Texas Southwestern Medical Center


Lisa N. Kinch

University of Texas Southwestern Medical Center


Hua Cheng

University of Texas Southwestern Medical Center


William J. Snell

University of Texas Southwestern Medical Center


Bong Hyun Kim

University of Texas Southwestern Medical Center


Yuxing Liao

University of Texas Southwestern Medical Center


David Baker

University of Washington


David E. Kim

University of Washington


Jing Tong

University of Texas Southwestern Medical Center