Computers in biology and medicine | 2021

TRP-BERT: Discrimination of transient receptor potential (TRP) channels using contextual representations from deep bidirectional transformer based on BERT

 
 

Abstract


Transient receptor potential (TRP) channels are non-selective cation channels found primarily on the plasma membrane of numerous animal cell types. These channels are involved in the physiology and pathophysiology of a wide variety of biological processes, including inhibition and progression of cancer, pain initiation, inflammation, regulation of pressure, thermoregulation, secretion of salivary fluid, and homeostasis of Ca²⁺ and Mg²⁺. Increasing evidence indicates that mutations in the genes encoding TRP channels play an essential role in a broad array of diseases, and these channels are therefore becoming popular as potential drug targets. Their diverse roles call for a prediction model that discriminates TRP channels from other channel proteins (non-TRP channels). We present an approach that represents protein sequences with contextualized word embeddings from Bidirectional Encoder Representations from Transformers (BERT) and classifies them with a Support Vector Machine (SVM). BERT is a deeply bidirectional neural language model that achieves outstanding performance on various Natural Language Processing (NLP) tasks. We apply BERT to generate a contextualized representation for every amino acid in a protein sequence. These representations are context-sensitive: the same amino acid receives a different representation depending on its position in the sequence. On an independent test set, the proposed method achieved 80.00% sensitivity, 96.03% specificity, 95.47% accuracy, and a Matthews correlation coefficient (MCC) of 0.56. We suggest that the proposed method can effectively discriminate TRP channels from non-TRP channels and assist biologists in identifying new potential TRP channels.
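The four evaluation measures reported above are all derived from the counts in a binary confusion matrix. The following is a minimal Python sketch of the standard definitions, not the authors' evaluation code. The function name `binary_metrics` and the example counts are hypothetical and chosen only for illustration; they are not the paper's test-set counts.

```python
import math


def binary_metrics(tp, fn, tn, fp):
    """Compute standard binary classification metrics from confusion-matrix counts.

    tp: positives (e.g. TRP channels) predicted as positive
    fn: positives predicted as negative
    tn: negatives (e.g. non-TRP channels) predicted as negative
    fp: negatives predicted as positive
    """
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)

    # Matthews correlation coefficient; defined here as 0.0 when the
    # denominator vanishes (a common convention, assumed for this sketch).
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# Hypothetical, illustrative counts only (NOT taken from the paper).
example = binary_metrics(tp=8, fn=2, tn=90, fp=10)
for name, value in example.items():
    print(f"{name}: {value:.4f}")
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, so when one class is much larger than the other it can remain moderate even when accuracy is high.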

Volume 137
Pages 104821
DOI 10.1016/j.compbiomed.2021.104821
Language English
Journal Computers in biology and medicine
