Publication


Featured research published by Sandro J. de Souza.


BMC Genomics | 2004

The use of Open Reading frame ESTs (ORESTES) for analysis of the honey bee transcriptome

Francis M. F. Nunes; Valeria Valente; Josane F. Sousa; Marco A.V. Cunha; Daniel G. Pinheiro; Rafaela M. Maia; Daniela D. Araujo; Maria Cristina R. Costa; Waleska K. Martins; Alex F. Carvalho; Nadia Monesi; Adriana Mendes do Nascimento; Pablo Marco Veras Peixoto; Maria de Fátima Rodrigues da Silva; Ricardo Guelerman Pinheiro Ramos; Luis F.L. Reis; Emmanuel Dias-Neto; Sandro J. de Souza; Andrew J.G. Simpson; Marco A. Zago; Ademilson Espencer Egea Soares; Márcia Maria Gentile Bitondi; Enilza M. Espreafico; Foued Salmen Espindola; Maria Luisa Paçó-Larson; Zilá Luz Paulino Simões; Klaus Hartfelder; Wilson A. Silva

Background: The ongoing efforts to sequence the honey bee genome require additional initiatives to define its transcriptome. Towards this end, we employed the Open Reading frame ESTs (ORESTES) strategy to generate profiles for the life cycle of Apis mellifera workers. Results: Of the 5,021 ORESTES, 35.2% matched with previously deposited Apis ESTs. The analysis of the remaining sequences defined a set of putative orthologs, the majority of which had their best-match hits with Anopheles and Drosophila genes. CAP3 assembly of the Apis ORESTES with the already existing 15,500 Apis ESTs generated 3,408 contigs. BLASTX comparison of these contigs with protein sets of organisms representing distinct phylogenetic clades revealed a total of 1,629 contigs that Apis mellifera shares with different taxa. Most (41%) represent genes that are common to all taxa, another 21% are shared between metazoans (Bilateria), and 16% are shared only within the Insecta clade. A set of 23 putative genes presented a best match with human genes, many of which encode factors related to cell signaling/signal transduction. 1,779 contigs (52%) did not match any known sequence. Applying a correction factor deduced from a parallel analysis performed with Drosophila melanogaster ORESTES, we estimate that approximately half of these no-match EST contigs (22%) should represent Apis-specific genes. Conclusions: The versatile and cost-efficient ORESTES approach produced minilibraries for honey bee life cycle stages. Such information on central gene regions contributes to genome annotation and also lends itself to cross-transcriptome comparisons to reveal evolutionary trends in insect genomes.


BMC Genomics | 2007

Identification of unannotated exons of low abundance transcripts in Drosophila melanogaster and cloning of a new serine protease gene upregulated upon injury

Rafaela M. Maia; Valeria Valente; Marco A.V. Cunha; Josane F. Sousa; Daniela D. Araujo; Wilson A. Silva; Marco A. Zago; Emmanuel Dias-Neto; Sandro J. de Souza; Andrew J.G. Simpson; Nadia Monesi; Ricardo Guelerman Pinheiro Ramos; Enilza M. Espreafico; Maria Luisa Paçó-Larson

Background: The sequencing of the D. melanogaster genome revealed an unexpectedly small number of genes (~14,000), indicating that mechanisms acting on the generation of transcript diversity must have played a major role in the evolution of complex metazoans. Among the most extensively used mechanisms that account for this diversity is alternative splicing. It is estimated that over 40% of Drosophila protein-coding genes contain one or more alternative exons. A recent transcription map of Drosophila embryogenesis indicates that 30% of the transcribed regions are unannotated, and that 1/3 of this is estimated as missed or alternative exons of previously characterized protein-coding genes. Therefore, the identification of the variety of expressed transcripts depends on experimental data for its final validation and is continuously being performed using different approaches. We applied the Open Reading Frame Expressed Sequence Tags (ORESTES) methodology, which is capable of generating cDNA data from the central portion of rare transcripts, in order to investigate the presence of hitherto unannotated regions of the Drosophila transcriptome. Results: Bioinformatic analysis of 1,303 Drosophila ORESTES clusters identified 68 sequences derived from unannotated regions in the current Drosophila genome version (4.3). Of these, a set of 38 was analysed by polyA+ northern blot hybridization, validating 17 (50%) new exons of low abundance transcripts. For one of these ESTs, we obtained the cDNA encompassing the complete coding sequence of a new serine protease, named SP212. The SP212 gene is part of a serine protease gene cluster located in the chromosome region 88A12-B1. This cluster includes the predicted genes CG9631, CG9649 and CG31326, which were previously identified as up-regulated after immune challenges in genomic-scale microarray analysis. In agreement with the proposal that this locus is co-regulated in response to microorganism infection, we show here that SP212 is also up-regulated upon injury. Conclusion: Using the ORESTES methodology we identified 17 novel exons from low abundance Drosophila transcripts, and through a PCR approach the complete CDS of one of these transcripts was defined. Our results show that computational identification and manual inspection are not sufficient to annotate a genome in the absence of experimentally derived data.


PLOS ONE | 2014

S-score: a scoring system for the identification and prioritization of predicted cancer genes.

Jorge Estefano Santana de Souza; André F. Fonseca; Renan Valieris; Dirce Maria Carraro; Jean Y. J. Wang; Richard D. Kolodner; Sandro J. de Souza

A new method, which allows for the identification and prioritization of predicted cancer genes for future analysis, is presented. This method generates a gene-specific score called the “S-Score” by incorporating data from different types of analysis including mutation screening, methylation status, copy-number variation and expression profiling. The method was applied to the data from The Cancer Genome Atlas and allowed the identification of known and potentially new oncogenes and tumor suppressors associated with different clinical features including shortest term of survival in ovarian cancer patients and hormonal subtypes in breast cancer patients. Furthermore, for the first time a genome-wide search for genes that behave as oncogenes and tumor suppressors in different tumor types was performed. We envisage that the S-score can be used as a standard method for the identification and prioritization of cancer genes for follow-up studies.


PLOS ONE | 2013

High-Throughput Sequencing of a South American Amerindian

Jorge Estefano Santana de Souza; Renan Almeida; Dayse O. Alencar; Maria Silvanira Barbosa; Leonor Gusmão; Wilson A. Silva; Sandro J. de Souza; Artur Silva; Ândrea Ribeiro-dos-Santos; Sylvain Darnet; Sidney Santos

The emergence of next-generation sequencing technologies allowed access to the vast amounts of information that are contained in the human genome. This information has contributed to the understanding of individual and population-based variability and improved the understanding of the evolutionary history of different human groups. However, the genome of a representative of the Amerindian populations had not been previously sequenced. Thus, the genome of an individual from a South American tribe was completely sequenced to further the understanding of the genetic variability of Amerindians. A total of 36.8 giga base pairs (Gbp) were sequenced and aligned with the human genome. These Gbp corresponded to 95.92% of the human genome with an estimated miscall rate of 0.0035 per sequenced bp. The data obtained from the alignment were used for SNP (single-nucleotide polymorphism) and INDEL (insertion-deletion) calling, which resulted in the identification of 502,017 polymorphisms, of which 32,275 were potentially new high-confidence SNPs and 33,795 new INDELs, specific to South Native American populations. The authenticity of the sample as a member of the South Native American populations was confirmed through the analysis of the uniparental (maternal and paternal) lineages. The autosomal comparison distinguished the investigated sample from other continental populations and revealed a close relation to the Eastern Asian and Aboriginal Australian populations. Although the findings did not rule out the classical model of the settlement of the Americas, they brought new insights to the understanding of human population history. The present study indicates a remarkable genetic variability in human populations that must still be identified and contributes to the understanding of the genetic variability of South Native American populations and of human population history.


PeerJ | 2014

Identification of rare alternative splicing events in MS/MS data reveals a significant fraction of alternative translation initiation sites

José E. Kroll; Sandro J. de Souza; Gustavo A. de Souza

Integration of transcriptome data is a crucial step for the identification of rare protein variants in mass-spectrometry (MS) data with important consequences for all branches of biotechnology research. Here, we used Splooce, a database of splicing variants recently developed by us, to search MS data derived from a variety of human tumor cell lines. More than 800 new protein variants were identified whose corresponding MS spectra were specific to protein entries from Splooce. Although the types of splicing variants (exon skipping, alternative splice sites and intron retention) were found at the same frequency as in the transcriptome, we observed a large variety of modifications at the protein level induced by alternative splicing events. Surprisingly, we found that 40% of all protein modifications induced by alternative splicing led to the use of alternative translation initiation sites. Other modifications include frameshifts in the open reading frame and inclusion or deletion of peptide sequences. To make the dataset generated here available to the community in a more effective form, the Splooce portal (http://www.bioinformatics-brazil.org/splooce) was modified to report the alternative splicing events supported by MS data.


PeerJ | 2015

Splicing Express: a software suite for alternative splicing analysis using next-generation sequencing data

José Eduardo Kroll; Jihoon Kim; Lucila Ohno-Machado; Sandro J. de Souza

Motivation. Alternative splicing events (ASEs) are prevalent in the transcriptome of eukaryotic species and are known to influence many biological phenomena. The identification and quantification of these events are crucial for a better understanding of biological processes. Next-generation DNA sequencing technologies have allowed deep characterization of transcriptomes and made it possible to address these issues. ASEs analysis, however, represents a challenging task especially when many different samples need to be compared. Some popular tools for the analysis of ASEs are known to report thousands of events without annotations and/or graphical representations. A new tool for the identification and visualization of ASEs is here described, which can be used by biologists without a solid bioinformatics background. Results. A software suite named Splicing Express was created to perform ASEs analysis from transcriptome sequencing data derived from next-generation DNA sequencing platforms. Its major goal is to serve the needs of biomedical researchers who do not have bioinformatics skills. Splicing Express performs automatic annotation of transcriptome data (GTF files) using gene coordinates available from the UCSC genome browser and allows the analysis of data from all available species. The identification of ASEs is done by a known algorithm previously implemented in another tool named Splooce. As a final result, Splicing Express creates a set of HTML files composed of graphics and tables designed to describe the expression profile of ASEs among all analyzed samples. By using RNA-Seq data from the Illumina Human Body Map and the Rat Body Map, we show that Splicing Express is able to perform all tasks in a straightforward way, identifying well-known specific events. Availability and Implementation. Splicing Express is written in Perl and is suitable to run only in UNIX-like systems. More details can be found at: http://www.bioinformatics-brazil.org/splicingexpress.


BMC Genomics | 2015

Populational landscape of INDELs affecting transcription factor-binding sites in humans

Vandeclécio L. da Silva; Jorge Estefano Santana de Souza; Sandro J. de Souza

Background: Differences in gene expression have a significant role in the diversity of phenotypes in humans. Here we integrated human public data from ENCODE, 1000 Genomes and Geuvadis to explore the populational landscape of INDELs affecting transcription factor-binding sites (TFBS). A significant fraction of TFBS close to the transcription start site of known genes is affected by INDELs, with a consequent effect on the expression of the associated gene. Results: Hundreds of TFBS-affecting INDELs (TFBS-ID) show a differential frequency between human populations, suggesting a role of natural selection in the spread of such variant INDELs. A comparison with a dataset of known human genomic regions under natural selection allowed us to identify several cases of TFBS-ID likely involved in populational adaptations. Ontology analyses on the differential TFBS-ID further indicated several biological processes under natural selection in different populations. Conclusion: Together, our results strongly suggest that INDELs have an important role in modulating gene expression patterns in humans. The dataset we make available, together with other data reporting variability at both regulatory and coding regions of genes, represents a powerful tool for studies aiming to better understand the evolution of gene regulatory networks in humans.


Journal of Molecular Evolution | 2013

Testing for Natural Selection in Human Exonic Splicing Regulators Associated with Evolutionary Rate Shifts

Rodrigo F. Ramalho; Sahar Gelfman; Jorge Estefano Santana de Souza; Gil Ast; Sandro J. de Souza; Diogo Meyer

Despite evidence that at the interspecific scale, exonic splicing silencers (ESSs) are under negative selection in constitutive exons, little is known about the effects of slightly deleterious polymorphisms on these splicing regulators. Through the application of a modified version of the McDonald–Kreitman test, we compared the normalized proportions of human polymorphisms and human/rhesus substitutions affecting exonic splicing regulators (ESRs) on sequences of constitutive and alternative exons. Our results show a depletion of substitutions and an enrichment of SNPs associated with ESS gain in constitutive exons. Moreover, we show that this evolutionary pattern is also present in a set of ESRs previously involved in the transition from constitutive to skipped exons in the mammalian lineage. The similarity between these two sets of ESRs suggests that the transition from constitutive to skipped exons in mammals is more frequently associated with the inhibition than with the promotion of splicing signals. This is in accordance with the hypothesis of a constitutive origin of exon skipping and corroborates previous findings about the antagonistic role of certain exonic splicing enhancers.


Cell Cycle | 2016

NFAT1 transcription factor regulates cell cycle progression and cyclin E expression in B lymphocytes

Leonardo K. Teixeira; Nina Carrossini; Cristiane Sécca; José Eduardo Kroll; Déborah C. DaCunha; Douglas V. Faget; Lilian D.S. Carvalho; Sandro J. de Souza; João P. B. Viola

The NFAT family of transcription factors has been primarily related to T cell development, activation, and differentiation. Further studies have shown that these ubiquitous proteins are observed in many cell types inside and outside the immune system, and are involved in several biological processes, including tumor growth, angiogenesis, and invasiveness. However, the specific role of the NFAT1 family member in naive B cell proliferation remains elusive. Here, we demonstrate that NFAT1 transcription factor controls Cyclin E expression, cell proliferation, and tumor growth in vivo. Specifically, we show that inducible expression of NFAT1 inhibits cell cycle progression, reduces colony formation, and controls tumor growth in nude mice. We also demonstrate that NFAT1-deficient naive B lymphocytes show a hyperproliferative phenotype and high levels of Cyclin E1 and E2 upon BCR stimulation when compared to wild-type B lymphocytes. NFAT1 transcription factor directly regulates Cyclin E expression in B cells, inhibiting the G1/S cell cycle phase transition. Bioinformatics analysis indicates that low levels of NFAT1 correlate with high expression of Cyclin E1 in different human cancers, including Diffuse Large B-cell Lymphomas (DLBCL). Together, our results demonstrate a repressor role for NFAT1 in cell cycle progression and Cyclin E expression in B lymphocytes, and suggest a potential function for NFAT1 protein in B cell malignancies.


BioEssays | 2017

A tool for integrating genetic and mass spectrometry-based peptide data: proteogenomics viewer: PV: a genome browser-like tool, which includes MS data visualization and peptide identification parameters

José Eduardo Kroll; Vandeclécio L. da Silva; Sandro J. de Souza; Gustavo A. de Souza

In this manuscript we describe Proteogenomics Viewer, a web‐based tool that collects MS peptide identification, indexes to genomic sequence and structure, assigns exon usage, reports the identified protein isoforms with genomic alignments and, most importantly, allows the inspection of MS2 information for proper peptide identification. It also provides all performed indexing to facilitate global analysis of the data. The relevance of such a tool is that there has been an increase in the number of proteogenomic efforts to improve the annotation of both genomics and proteomics data, culminating in the release of the two human proteome drafts. It is now clear that mass spectrometry‐based peptide identification of uncharacterized sequences, such as those resulting from unpredicted exon joints or non‐coding regions, is still prone to a higher than expected false discovery rate. Therefore, proper visualization of the raw data and the corresponding genome alignments is fundamental for further data validation and interpretation.

Collaboration

Top Co-Authors

Jorge Estefano Santana de Souza
Ludwig Institute for Cancer Research

Dirce Maria Carraro
National Institute of Standards and Technology

José Eduardo Kroll
Federal University of Rio Grande do Norte

Vandeclécio L. da Silva
Federal University of Rio Grande do Norte

Elisa Napolitano Ferreira
Ludwig Institute for Cancer Research