Nebojsa Jojic | Researchain

Publication

Featured research published by Nebojsa Jojic.

Science | 2015

The human splicing code reveals new insights into the genetic determinants of disease

Hui Y. Xiong; Babak Alipanahi; Leo J. Lee; Hannes Bretschneider; Daniele Merico; Ryan K. C. Yuen; Yimin Hua; Serge Gueroussov; Hamed Shateri Najafabadi; Timothy R. Hughes; Quaid Morris; Yoseph Barash; Adrian R. Krainer; Nebojsa Jojic; Stephen W. Scherer; Benjamin J. Blencowe; Brendan J. Frey

Predicting defects in RNA splicing

Most eukaryotic messenger RNAs (mRNAs) are spliced to remove introns. Splicing generates uninterrupted open reading frames that can be translated into proteins. Splicing is often highly regulated, generating alternative spliced forms that code for variant proteins in different tissues. RNA-binding proteins that bind specific sequences in the mRNA regulate splicing. Xiong et al. develop a computational model that predicts splicing regulation for any mRNA sequence (see the Perspective by Guigó and Valcárcel). They use this to analyze more than half a million mRNA splicing sequence variants in the human genome. They are able to identify thousands of known disease-causing mutations, as well as many new disease candidates, including 17 new autism-linked genes.

Science, this issue 10.1126/science.1254806; see also p. 124

A model predicts how thousands of disease-linked nucleotide variants affect messenger RNA splicing. [Also see Perspective by Guigó and Valcárcel]

INTRODUCTION Advancing whole-genome precision medicine requires understanding how gene expression is altered by genetic variants, especially those that are far outside of protein-coding regions. We developed a computational technique that scores how strongly genetic variants affect RNA splicing, a critical step in gene expression whose disruption contributes to many diseases, including cancers and neurological disorders. A genome-wide analysis reveals tens of thousands of variants that alter splicing and are enriched with a wide range of known diseases. Our results provide insight into the genetic basis of spinal muscular atrophy, hereditary nonpolyposis colorectal cancer, and autism spectrum disorder.

RATIONALE We used “deep learning” computer algorithms to derive a computational model that takes as input DNA sequences and applies general rules to predict splicing in human tissues. Given a test variant, which may be up to 300 nucleotides into an intron, our model can be used to compute a score for how much the variant alters splicing. The model is not biased by existing disease annotations or population data and was derived in such a way that it can be used to study diverse diseases and disorders and to determine the consequences of common, rare, and even spontaneous variants.

RESULTS Our technique is able to accurately classify disease-causing variants and provides insights into the role of aberrant splicing in disease. We scored more than 650,000 DNA variants and found that disease-causing variants have higher scores than common variants and even those associated with disease in genome-wide association studies (GWAS). Our model predicts substantial and unexpected aberrant splicing due to variants within introns and exons, including those far from the splice site. For example, among intronic variants that are more than 30 nucleotides away from any splice site, known disease variants alter splicing nine times as often as common variants; among missense exonic disease variants, those that least affect protein function are more than five times as likely as other variants to alter splicing. Autism has been associated with disrupted splicing in brain regions, so we used our method to score variants detected using whole-genome sequencing data from individuals with and without autism. Genes with high-scoring variants include many that have previously been linked with autism, as well as new genes with known neurodevelopmental phenotypes. Most of the high-scoring variants are intronic and cannot be detected by exome analysis techniques. When we scored clinical variants in spinal muscular atrophy and colorectal cancer genes, up to 94% of variants found to alter splicing using minigene reporters were correctly classified.

CONCLUSION In the context of precision medicine, causal support for variants independent of existing whole-genome variant studies is greatly needed. Our computational model was trained to predict splicing from DNA sequence alone, without using disease annotations or population data. Consequently, its predictions are independent of and complementary to population data, GWAS, expression-based quantitative trait loci (QTL), and functional annotations of the genome. As such, our technique greatly expands the opportunities for understanding the genetic determinants of disease.

“Deep learning” reveals the genetic origins of disease. A computational system mimics the biology of RNA splicing by correlating DNA elements with splicing levels in healthy human tissues. The system can scan DNA and identify damaging genetic variants, including those deep within introns. This procedure has led to insights into the genetics of autism, cancers, and spinal muscular atrophy.

To facilitate precision medicine and whole-genome annotation, we developed a machine-learning technique that scores how strongly genetic variants affect RNA splicing, whose alteration contributes to many diseases. Analysis of more than 650,000 intronic and exonic variants revealed widespread patterns of mutation-driven aberrant splicing. Intronic disease mutations that are more than 30 nucleotides from any splice site alter splicing nine times as often as common variants, and missense exonic disease mutations that have the least impact on protein function are five times as likely as others to alter splicing. We detected tens of thousands of disease-causing mutations, including those involved in cancers and spinal muscular atrophy. Examination of intronic and exonic variants found using whole-genome sequencing of individuals with autism revealed misspliced genes with neurodevelopmental phenotypes. Our approach provides evidence for causal variants and should enable new discoveries in precision medicine.

International Conference on Computer Vision | 2005

LOCUS: learning object classes with unsupervised segmentation

John Winn; Nebojsa Jojic

We address the problem of learning object class models and object segmentations from unannotated images. We introduce LOCUS (learning object classes with unsupervised segmentation), which uses a generative probabilistic model to combine bottom-up cues of color and edge with top-down cues of shape and pose. A key aspect of this model is that the object appearance is allowed to vary from image to image, allowing for significant within-class variation. By iteratively updating the belief in the object's position, size, segmentation and pose, LOCUS avoids making hard decisions about any of these quantities and so allows for each to be refined at any stage. We show that LOCUS successfully learns an object class model from unlabeled images, whilst also giving segmentation accuracies that rival existing supervised methods. Finally, we demonstrate simultaneous recognition and segmentation in novel images using the learned models for a number of object classes, as well as unsupervised object discovery and tracking in video.

IEEE Transactions on Pattern Analysis and Machine Intelligence | 2003

Transformation-invariant clustering using the EM algorithm

Brendan J. Frey; Nebojsa Jojic

Clustering is a simple, effective way to derive useful representations of data, such as images and videos. Clustering explains the input as one of several prototypes, plus noise. In situations where each input has been randomly transformed (e.g., by translation, rotation, and shearing in images and videos), clustering techniques tend to extract cluster centers that account for variations in the input due to transformations, instead of more interesting and potentially useful structure. For example, if images from a video sequence of a person walking across a cluttered background are clustered, it would be more useful for the different clusters to represent different poses and expressions, instead of different positions of the person and different configurations of the background clutter. We describe a way to add transformation invariance to mixture models, by approximating the nonlinear transformation manifold by a discrete set of points. We show how the expectation maximization algorithm can be used to jointly learn clusters, while at the same time inferring the transformation associated with each input. We compare this technique with other methods for filtering noisy images obtained from a scanning electron microscope, clustering images from videos of faces into different categories of identification and pose and removing foreground obstructions from video. We also demonstrate that the new technique is quite insensitive to initial conditions and works better than standard techniques, even when the standard techniques are provided with extra data.
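The idea in the abstract above, replacing the continuous transformation manifold with a discrete set of transformations and running EM jointly over clusters and transformations, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes 1-D signals, cyclic shifts as the only transformations, a fixed isotropic variance, and uniform priors; the names `shift`, `log_lik`, `tic_em`, and `assign` are hypothetical.

```python
import math

def shift(x, t):
    """Cyclically shift a 1-D signal x to the right by t positions."""
    n = len(x)
    return [x[(i - t) % n] for i in range(n)]

def log_lik(x, mean, t, var):
    """Unnormalized Gaussian log-likelihood of x given a mean shifted by t."""
    mu = shift(mean, t)
    return -0.5 * sum((a - b) ** 2 for a, b in zip(x, mu)) / var

def tic_em(data, init_means, n_iters=10, var=0.1):
    """EM over (cluster, shift) pairs with fixed variance and uniform priors.

    Both simplifications are assumptions made for this sketch.
    """
    n = len(data[0])
    means = [list(m) for m in init_means]
    for _ in range(n_iters):
        sums = [[0.0] * n for _ in means]
        weights = [0.0 for _ in means]
        for x in data:
            # E-step: joint responsibility of every (cluster, shift) pair.
            logp = {(c, t): log_lik(x, means[c], t, var)
                    for c in range(len(means)) for t in range(n)}
            top = max(logp.values())
            z = sum(math.exp(v - top) for v in logp.values())
            for (c, t), v in logp.items():
                r = math.exp(v - top) / z
                # Undo the shift so x is aligned with the cluster's frame.
                aligned = shift(x, -t)
                for i in range(n):
                    sums[c][i] += r * aligned[i]
                weights[c] += r
        # M-step: responsibility-weighted average of the aligned inputs.
        means = [[s / max(w, 1e-12) for s in sums[c]]
                 for c, w in enumerate(weights)]
    return means

def assign(x, means, var=0.1):
    """Return the most likely (cluster, shift) pair for input x."""
    n = len(x)
    pairs = [(c, t) for c in range(len(means)) for t in range(n)]
    return max(pairs, key=lambda ct: log_lik(x, means[ct[0]], ct[1], var))
```

Because the shift is inferred per input, the learned means describe shape rather than position: every shifted copy of the same prototype contributes to a single cluster mean.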

IEEE International Conference on Automatic Face and Gesture Recognition | 2000

Detection and estimation of pointing gestures in dense disparity maps

Nebojsa Jojic; Barry Brumitt; Brian Meyers; Steve Harris; Thomas S. Huang

We describe a real-time system for detecting pointing gestures and estimating the direction of pointing using stereo cameras. Previously, similar systems were implemented using color-based blob trackers, which relied on effective skin color detection; this approach is sensitive to lighting changes and the clothing worn by the user. In contrast, we used a stereo system that produces dense disparity maps in real-time. Disparity maps are considerably less sensitive to lighting changes. Our system subtracts the background, analyzes the foreground pixels to break the body into parts using a robust mixture model, and estimates the direction of pointing. We have tested the system on both coarse and fine pointing by selecting the targets in a room and controlling the cursor on a wall screen, respectively.

International Conference on Computer Vision | 2005

Consistent segmentation for optical flow estimation

Charles Lawrence Zitnick; Nebojsa Jojic; Sing Bing Kang

In this paper, we propose a method for jointly computing optical flow and segmenting video while accounting for mixed pixels (matting). Our method is based on statistical modeling of an image pair using constraints on appearance and motion. Segments are viewed as overlapping regions with fractional (alpha) contributions. Bidirectional motion is estimated based on spatial coherence and similarity of segment colors. Our model is extended to video by chaining the pairwise models to produce a joint probability distribution to be maximized. To make the problem more tractable, we factorize the posterior distribution and iteratively minimize its parts. We demonstrate our method on frame interpolation

European Journal of Immunology | 2007

Extensive HLA class I allele promiscuity among viral CTL epitopes

Nicole Frahm; Karina Yusim; Todd J. Suscovich; Sharon Adams; John Sidney; Peter Hraber; Hannah S. Hewitt; Caitlyn Linde; Daniel G. Kavanagh; Tonia Woodberry; Leah M. Henry; Kellie Faircloth; Jennifer Listgarten; Carl M. Kadie; Nebojsa Jojic; Kaori Sango; Nancy V. Brown; Eunice Pae; M. Tauheed Zaman; Florian Bihl; Ashok Khatri; M. John; S. Mallal; Francesco M. Marincola; Bruce D. Walker; Alessandro Sette; David Heckerman; Bette T. Korber; Christian Brander

Promiscuous binding of T helper epitopes to MHC class II molecules has been well established, but few examples of promiscuous class I‐restricted epitopes exist. To address the extent of promiscuity of HLA class I peptides, responses to 242 well‐defined viral epitopes were tested in 100 subjects regardless of the individuals’ HLA type. Surprisingly, half of all detected responses were seen in the absence of the originally reported restricting HLA class I allele, and only 3% of epitopes were recognized exclusively in the presence of their original allele. Functional assays confirmed the frequent recognition of HLA class I‐restricted T cell epitopes on several alternative alleles across HLA class I supertypes and encoded on different class I loci. These data have significant implications for the understanding of MHC class I‐restricted antigen presentation and vaccine development.

PLOS Computational Biology | 2007

Coping with Viral Diversity in HIV Vaccine Design

David C. Nickle; Morgane Rolland; Mark A. Jensen; Sergei L. Kosakovsky Pond; Wenjie Deng; Mark Seligman; David Heckerman; James I. Mullins; Nebojsa Jojic

The ability of human immunodeficiency virus type 1 (HIV-1) to develop high levels of genetic diversity, and thereby acquire mutations to escape immune pressures, contributes to the difficulties in producing a vaccine. Possibly no single HIV-1 sequence can induce sufficiently broad immunity to protect against a wide variety of infectious strains, or block mutational escape pathways available to the virus after infection. The authors describe the generation of HIV-1 immunogens that minimize the phylogenetic distance of viral strains throughout the known viral population (the center of tree [COT]) and then extend the COT immunogen by addition of a composite sequence that includes high-frequency variable sites preserved in their native contexts. The resulting COT+ antigens compress the variation found in many independent HIV-1 isolates into lengths suitable for vaccine immunogens. It is possible to capture 62% of the variation found in the Nef protein and 82% of the variation in the Gag protein into immunogens of three gene lengths. The authors put forward immunogen designs that maximize representation of the diverse antigenic features present in a spectrum of HIV-1 strains. These immunogens should elicit immune responses against high-frequency viral strains as well as against most mutant forms of the virus.

International Conference on Computer Vision | 1999

Tracking self-occluding articulated objects in dense disparity maps

Nebojsa Jojic; Matthew Turk; Thomas S. Huang

In this paper, we present an algorithm for real-time tracking of articulated structures in dense disparity maps derived from stereo image sequences. A statistical image formation model that accounts for occlusions plays the central role in our tracking approach. This graphical model (a Bayesian network) assumes that the range image of each part of the structure is formed by drawing the depth candidates from a 3-D Gaussian distribution. The advantage over the classical mixture of Gaussians is that our model takes into account occlusions by picking the minimum depth (which could be regarded as a probabilistic version of z-buffering). The model also enforces articulation constraints among the parts of the structure. The tracking problem is formulated as an inference problem in the image formation model. This model can be extended and used for other tasks in addition to the one described in the paper and can also be used for estimating probability distribution functions instead of the ML estimates of the tracked parameters. For the purposes of real-time tracking, we used certain approximations in the inference process, which resulted in a real-time two-stage inference algorithm. We were able to successfully track upper human body motion in real time and in the presence of self-occlusions.
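The occlusion rule described above, in which the observed depth at a pixel is the minimum over the depth candidates proposed by the parts, can be sketched as follows. This is a simplified illustration, not the paper's model: it takes already-drawn scalar depth candidates per pixel instead of sampling them from 3-D Gaussians, and the function name `compose_range_image` and its data layout are hypothetical.

```python
def compose_range_image(candidates, background=float("inf")):
    """Combine per-part depth candidates by keeping the nearest one.

    candidates[p][i] is the depth proposed by part p at pixel i, or None
    if part p does not cover that pixel (an assumed data layout).
    Returns (depths, owners); owners[i] is the index of the visible part,
    or None where only the background is seen.
    """
    n_pixels = len(candidates[0])
    depths = [background] * n_pixels
    owners = [None] * n_pixels
    for p, part in enumerate(candidates):
        for i, d in enumerate(part):
            # The nearest surface wins, as in z-buffering.
            if d is not None and d < depths[i]:
                depths[i] = d
                owners[i] = p
    return depths, owners
```

For example, with an arm part proposing depth 1.2 and a torso part proposing depth 2.0 at the same pixel, the arm is recorded as visible there, which is how the model accounts for self-occlusion.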

Intelligent Systems in Molecular Biology | 2006

Learning MHC I–peptide binding

Nebojsa Jojic; Manuel Reyes-Gomez; David Heckerman; Carl M. Kadie; Ora Schueler-Furman

MOTIVATION AND RESULTS Motivated by the ability of a simple threading approach to predict MHC I–peptide binding, we developed a new and improved structure-based model for which parameters can be estimated from additional sources of data about MHC-peptide binding. In addition to the known 3D structures of a small number of MHC-peptide complexes that were used in the original threading approach, we included three other sources of information on peptide-MHC binding: (1) MHC class I sequences; (2) known binding energies for a large number of MHC-peptide complexes; and (3) an even larger binary dataset that contains information about strong binders (epitopes) and non-binders (peptides that have a low affinity for a particular MHC molecule). Our model significantly outperforms the standard threading approach in binding energy prediction. In our approach, which we call adaptive double threading, the parameters of the threading model are learnable, and both MHC and peptide sequences can be threaded onto structures of other alleles. These two properties make our model appropriate for predicting binding for alleles for which very little data (if any) is available beyond just their sequence, including prediction for alleles for which 3D structures are not available. The ability of our model to generalize beyond the MHC types for which training data is available also separates our approach from epitope prediction methods which treat MHC alleles as symbolic types, rather than biological sequences. We used the trained binding energy predictor to study viral infections in 246 HIV patients from the West Australian cohort, and over 1000 sequences in HIV clade B from Los Alamos National Laboratory database, capturing the course of HIV evolution over the last 20 years. Finally, we illustrate short-, medium-, and long-term adaptation of HIV to the human immune system.

AVAILABILITY http://www.research.microsoft.com/~jojic/hlaBinding.html

Multimedia Tools and Applications | 2005

Adaptive Video Fast Forward

Nemanja Petrovic; Nebojsa Jojic; Thomas S. Huang

We derive a statistical graphical model of video scenes with multiple, possibly occluded objects that can be efficiently used for tasks related to video search, browsing and retrieval. The model is trained on a query (target) clip selected by the user. The shot retrieval process is based on the likelihood of a video frame under the generative model. Instead of using a combination of weighted Euclidean distances as a shot similarity measure, the likelihood model automatically separates and balances various causes of variability in video, including occlusion, appearance change and motion. Thus, we overcome the tedious and complex user interventions required in previous studies. We use the model in the adaptive video forward application that adapts video playback speed to the likelihood of the data. The similarity measure of each candidate clip to the target clip defines the playback speed. Given a query, the video is played at a higher speed as long as the video content has low likelihood, and when frames similar to the query clip start to come in, the video playback rate drops. A set of experiments on typical home videos demonstrates the performance, ease of use, and utility of our application.
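The relationship between likelihood and playback speed described above can be sketched as a simple mapping. The abstract does not specify the mapping, so the linear interpolation below, the speed limits, and the function name `playback_speeds` are all assumptions made for illustration; the per-frame log-likelihoods are taken as given rather than computed from a generative model.

```python
def playback_speeds(log_liks, min_speed=1.0, max_speed=8.0):
    """Map per-frame log-likelihoods under the query model to speeds.

    Frames with the lowest likelihood play at max_speed and frames with
    the highest likelihood play at min_speed. The linear mapping and the
    default speed limits are assumptions of this sketch.
    """
    lo, hi = min(log_liks), max(log_liks)
    if hi == lo:
        # No frame stands out, so play everything at normal speed.
        return [min_speed] * len(log_liks)
    span = max_speed - min_speed
    return [max_speed - span * (ll - lo) / (hi - lo) for ll in log_liks]
```

With these defaults, a frame that matches the query clip well plays at normal speed, while the least similar frames are skimmed at eight times normal speed.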
