Mugizi Robert Rwebangira

Publication

Featured research published by Mugizi Robert Rwebangira.

International Conference on Machine Learning | 2004

Semi-supervised learning using randomized mincuts

Avrim Blum; John D. Lafferty; Mugizi Robert Rwebangira; Rajashekar Reddy

In many application domains there is a large amount of unlabeled data but only a very limited amount of labeled training data. One general approach that has been explored for utilizing this unlabeled data is to construct a graph on all the data points based on distance relationships among examples, and then to use the known labels to perform some type of graph partitioning. One natural partitioning to use is the minimum cut that agrees with the labeled data (Blum & Chawla, 2001), which can be thought of as giving the most probable label assignment if one views labels as generated according to a Markov Random Field on the graph. Zhu et al. (2003) propose a cut based on a relaxation of this field, and Joachims (2003) gives an algorithm based on finding an approximate min-ratio cut. In this paper, we extend the mincut approach by adding randomness to the graph structure. The resulting algorithm addresses several shortcomings of the basic mincut approach, and can be given theoretical justification from both a Markov random field perspective and from sample complexity considerations. In cases where the graph does not have small cuts for a given classification problem, randomization may not help. However, our experiments on several datasets show that when the structure of the graph supports small cuts, this can result in highly accurate classifiers with good accuracy/coverage tradeoffs. In addition, we are able to achieve good performance with a very simple graph-construction procedure.
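The idea of repeating a label-consistent minimum cut on randomly perturbed graphs can be sketched as follows. This is a minimal illustration and not the authors' implementation: the uniform noise model, the function names `min_cut_side` and `randomized_mincut`, and the use of a plain Edmonds-Karp max-flow routine are all assumptions made for the example, and the abstract does not say how the paper perturbs the graph or filters the resulting cuts.

```python
import random
from collections import deque


def min_cut_side(n, weights, positives, negatives):
    """Return the nodes on the positive side of a minimum cut.

    weights maps an undirected edge (u, v) to a non-negative weight.
    Positive examples are tied to a super-source and negative examples
    to a super-sink, so the cut must agree with the given labels.
    """
    s, t = n, n + 1
    cap = [dict() for _ in range(n + 2)]

    def add(u, v, c):
        # An undirected edge is modelled as capacity c in both directions.
        cap[u][v] = cap[u].get(v, 0.0) + c
        cap[v][u] = cap[v].get(u, 0.0) + c

    for (u, v), w in weights.items():
        add(u, v, w)
    big = sum(weights.values()) + 1.0  # larger than any possible cut
    for u in positives:
        add(s, u, big)
    for v in negatives:
        add(v, t, big)

    def bfs():
        # Breadth-first search over edges with remaining capacity.
        parent = {s: None}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 1e-12 and v not in parent:
                    parent[v] = u
                    if v == t:
                        return parent
                    queue.append(v)
        return parent

    while True:
        parent = bfs()
        if t not in parent:
            # No augmenting path is left: the nodes still reachable
            # from the source form one side of a minimum cut.
            return {u for u in parent if u < n}
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        flow = min(cap[a][b] for a, b in path)
        for a, b in path:
            cap[a][b] -= flow
            cap[b][a] += flow


def randomized_mincut(n, weights, positives, negatives,
                      rounds=50, noise=0.5, seed=0):
    """Fraction of noisy min cuts that place each node on the positive side."""
    rng = random.Random(seed)
    votes = [0] * n
    for _ in range(rounds):
        # Assumed noise model: independent uniform noise on every edge.
        noisy = {e: w + rng.uniform(0.0, noise) for e, w in weights.items()}
        for u in min_cut_side(n, noisy, positives, negatives):
            votes[u] += 1
    return [v / rounds for v in votes]
```

Each returned value lies between 0 and 1. In this sketch, a node could be labeled positive when its value exceeds 0.5, and values near 0.5 could be treated as low-confidence predictions, which is one way to trade coverage for accuracy.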

Earth Science Informatics | 2015

Exploring a graph theory based algorithm for automated identification and characterization of large mesoscale convective systems in satellite datasets

Kim Whitehall; Chris A. Mattmann; Gregory S. Jenkins; Mugizi Robert Rwebangira; Belay Demoz; Duane E. Waliser; Jinwon Kim; Cameron Goodale; Andrew F. Hart; Paul M. Ramirez; Michael J. Joyce; Maziyar Boustani; Paul Zimdars; Paul C. Loikith; Huikyo Lee

Mesoscale convective systems are high-impact, convectively driven weather systems that contribute substantially to daily and monthly precipitation totals at various locations globally. As such, an understanding of the lifecycle, characteristics, frequency and seasonality of these convective features is important for several sectors, including climate, agricultural and hydrological studies and disaster management. This study explores the applicability of graph theory to creating a fully automated algorithm for identifying mesoscale convective systems and determining their precipitation characteristics from satellite datasets. Our results show that applying graph theory to this problem allows for the identification of features from infrared satellite data and their seamless identification in a satellite-based precipitation-rate dataset, while innately handling the inherent complexity and non-linearity of mesoscale convective systems.

International Symposium on Bioinformatics Research and Applications | 2013

A Graph Approach to Bridge the Gaps in Volumetric Electron Cryo-microscopy Skeletons

Kamal Al Nasr; Chunmei Liu; Mugizi Robert Rwebangira; Legand Burge

Electron Cryo-microscopy is an advanced imaging technique that is able to produce volumetric images of proteins that are large or hard to crystallize. De novo modeling is a process that aims at deriving the structure of the protein using the images produced by Electron Cryo-microscopy. At medium resolutions (5 to 10 Å), the location and orientation of the secondary structure elements can be computationally identified on the images. However, there is no registration between the detected secondary structure elements and the protein sequence, and therefore it is challenging to derive the atomic structure from such volume data. The skeleton of the volume image is used to interpret the connections between the secondary structure elements in order to reduce the search space of the registration problem. Unfortunately, not all features of the image can be captured using a single segmentation. Moreover, the skeleton is sensitive to the threshold used, which leads to gaps in the skeleton. In this paper, we present a threshold-independent approach to overcome the problem of gaps in the skeletons. The approach uses a novel representation of the image where the image is modeled as a graph and a set of volume trees. A test containing thirteen synthesized images and two authentic images showed that our approach could improve the existing skeletons. The improvements achieved were 117% and 40% for Gorgon and MapEM, respectively.

IEEE Transactions on Geoscience and Remote Sensing | 2016

A New Methodology Based on Level Sets for Target Detection in Hyperspectral Images

Andres Alarcon-Ramirez; Mugizi Robert Rwebangira; Mohamed F. Chouikha; Vidya B. Manian

Target detection in hyperspectral images (HSIs) is an active area of research; it seeks to detect objects that are small in both number and size within a scene. The proposed work presents a new methodology for target detection in HSIs by combining kurtosis, level sets, and a size-based thresholding strategy. Kurtosis is used as a preprocessing step to initially enhance the targets in an image. Then, level sets identify and mark associations of pixels with similar spectral information as candidate targets. Finally, the size-based thresholding strategy detects true targets and discards false alarms that do not fit the target dimensions set as an input parameter. In addition, we propose a novel version of level sets, which is suitable for target detection tasks in HSIs. Results show that the proposed algorithm could successfully detect targets in HSIs, and it gave better performance in terms of the receiver operating characteristic curve than other techniques widely used in target detection such as orthogonal subspace projection, constrained signal detector, constrained energy minimization, adaptive cosine/coherent estimator algorithm, and generalized-likelihood ratio test.

Tsinghua Science & Technology | 2015

Accurate Identification of Mass Peaks for Tandem Mass Spectra Using MCMC Model

Hui Li; Chunmei Liu; Mugizi Robert Rwebangira; Legand Burge

In proteomics, many methods for the identification of proteins have been developed. However, because of limited known genome sequences, noisy data, incomplete ion sequences, and the accuracy of protein identification, it is challenging to identify peptides using tandem mass spectral data. Noise filtering and removal thus play a key role in accurate peptide identification from tandem mass spectra. In this paper, we employ a Bayesian model to identify proteins based on the prior information of bond cleavages. A Markov Chain Monte Carlo (MCMC) algorithm is used to simulate candidate peptides from the posterior distribution and to estimate the parameters for the Bayesian model. Our simulation and computational experimental results show that the model can identify peptides with higher accuracy.

Bioinformatics and Biomedicine | 2011

The development of a proteomic analyzing pipeline to identify proteins with multiple RRMs and predict their domain boundaries

Kyung Dae Ko; Chunmei Liu; Mugizi Robert Rwebangira; Legand Burge; William M. Southerland

The RNA-recognition motif (RRM) is the most abundant RNA-binding domain involved in many post-transcriptional processes. Since RRM-containing proteins have different functions with similar domain architecture, it is challenging to implement an automated annotation tool for these proteins in proteomic analysis. In this study, we implemented a proteomic analyzing pipeline to identify proteins with multiple RRMs and predict their domain boundaries using specific PSSMs, domain architectures, and proteins with the same entity name. After clustering sequences on the basis of their evolutionary distances, a reference group is selected by comparing domain architectures. Then, candidate proteins are collected in a proteome using specific PSSMs from seed alignments in PFAM. Finally, target proteins are identified using multiple alignments and phylogenetic trees between candidate and reference proteins. As a result, we identified 33 proteins close to 12 types of RRM-containing proteins and their domain boundaries among 508 candidates from 33610 sequences in a human proteome.

Tsinghua Science & Technology | 2014

Mono-isotope Prediction for Mass Spectra Using Bayes Network

Hui Li; Chunmei Liu; Mugizi Robert Rwebangira; Legand Burge

Mass spectrometry is one of the most widely used methods for studying protein functions and components. The challenge of mono-isotope pattern recognition from large-scale protein mass spectral data calls for computational algorithms and tools to speed up the analysis and improve the analytic results. We used a naïve Bayes network as the classifier, with the assumption that the selected features are independent, to predict mono-isotope patterns from mass spectrometry. Mono-isotopes detected from validated theoretical spectra were used as prior information in the Bayes method. Three main features extracted from the dataset were employed as independent variables in our model. The application of the proposed algorithm to the publicMo dataset demonstrates that our naïve Bayes classifier has advantages over existing methods in both accuracy and sensitivity.

arXiv: Machine Learning | 2012

On Ranking Senators by Their Votes

Mugizi Robert Rwebangira

The problem of ranking a set of objects given some measure of similarity is one of the most basic in machine learning. Recently Agarwal [1] proposed a method based on techniques in semi-supervised learning utilizing the graph Laplacian. In this work we consider a novel application of this technique to ranking binary choice data and apply it specifically to ranking US Senators by their ideology.

International Symposium on Bioinformatics Research and Applications | 2011

Rapid and accurate generation of peptide sequence tags with a graph search approach

Hui Li; Lauren Scott; Chunmei Liu; Mugizi Robert Rwebangira; Legand L. Burge; William M. Southerland

Protein peptide identification from a tandem mass spectrum (MS/MS) is a challenging task. Previous approaches for peptide identification with database search are time consuming due to the huge search space. De novo sequencing approaches, which derive a peptide sequence directly from an MS/MS spectrum, usually have high complexity, and their accuracy depends heavily on the quality of the spectra. In this paper, we developed an accurate and efficient algorithm for peptide identification. Our work consisted of the following steps. Firstly, we found a pair of complementary mass peaks that are b-ion and y-ion, respectively. We then used the two mass peaks as two tree nodes and extended the trees such that in the end the nodes of the trees are elements of a b-ion set and a y-ion set, respectively. Secondly, we applied breadth first search to the trees to generate peptide sequence tags. Finally, we designed a weight function to evaluate the reliabilities of the tags and rank the tags. Our experiment on 2620 experimental MS/MS spectra with one PTM showed that our algorithm achieved better accuracy than other approaches with higher efficiency.
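The first step, finding complementary b-ion and y-ion peaks, relies on a standard mass relation: for singly charged fragments, a b-ion and its complementary y-ion sum to the neutral peptide mass plus two proton masses. The sketch below illustrates only that step, under the assumption of singly charged peaks and a fixed mass tolerance. The function name `complementary_pairs` and the tolerance value are hypothetical and do not come from the paper, and the tree extension, breadth first search, and weight function are not shown.

```python
PROTON = 1.007276   # mass of a proton in Da
WATER = 18.010565   # monoisotopic mass of H2O in Da


def complementary_pairs(peaks, neutral_mass, tol=0.02):
    """Find peak pairs whose m/z values sum to neutral_mass + 2 * PROTON.

    peaks holds m/z values of singly charged fragment ions (an assumption
    of this sketch). Each peak is used in at most one pair. The sum alone
    does not tell which member of a pair is the b-ion and which is the
    y-ion.
    """
    target = neutral_mass + 2 * PROTON
    mz = sorted(peaks)
    pairs = []
    lo, hi = 0, len(mz) - 1
    # Two-pointer scan over the sorted peaks.
    while lo < hi:
        total = mz[lo] + mz[hi]
        if abs(total - target) <= tol:
            pairs.append((mz[lo], mz[hi]))
            lo += 1
            hi -= 1
        elif total < target:
            lo += 1
        else:
            hi -= 1
    return pairs


# Example: theoretical fragments of the tripeptide G-A-S plus one noise peak.
residues = [57.02146, 71.03711, 87.03203]            # G, A, S residue masses
neutral = sum(residues) + WATER                       # neutral peptide mass
b_ions = [sum(residues[:i]) + PROTON for i in (1, 2)]
y_ions = [sum(residues[i:]) + WATER + PROTON for i in (1, 2)]
pairs = complementary_pairs(b_ions + y_ions + [150.0], neutral)
```

In this example `pairs` holds the two complementary pairs (b1 with y2, and y1 with b2), and the noise peak at 150.0 is left unpaired.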

Bioinformatics and Biomedicine | 2011

Rapid generation of peptide sequence tags with a graph search algorithm

Hui Li; Chunmei Liu; Mugizi Robert Rwebangira; Legand Burge; William M. Southerland

Tandem mass spectrometry is a popular tool for the identification of peptide sequences. In this paper, we present a method for rapid generation of short peptide sequences via tandem mass spectrometry based on a graph search. The approach takes advantage of several pairs of peaks that have high intensities. We proposed a pair peak value set (PPS) and used the pair peak values of highest intensities as the root of a tree. The other nodes are viewed as the reference nodes to find the most promising path. We aimed to determine the peptide sequences for MS/MS spectra that have low signal-to-noise ratios. Our experiment on 2420 experimental MS/MS spectra with two PTMs shows that our algorithm achieves better accuracy than PepNovo approaches with higher efficiency.
