Fereshteh Sadeghi
University of Washington
Publications
Featured research published by Fereshteh Sadeghi.
european conference on computer vision | 2012
Fereshteh Sadeghi; Marshall F. Tappen
In this paper we propose a simple but efficient image representation for solving the scene classification problem. Our new representation combines the benefits of spatial pyramid representation using nonlinear feature coding and latent Support Vector Machine (LSVM) training to learn a set of Latent Pyramidal Regions (LPR). Each of our LPRs captures a discriminative characteristic of the scenes and is trained by searching over all possible sub-windows of the images in a latent SVM training procedure. Each LPR is represented in a spatial pyramid and uses non-linear locality-constrained coding for learning both shape and texture patterns of the scene. The final responses of the LPRs form a single feature vector, which we call the LPR representation, that can be used for the classification task. We tested our proposed scene representation model on three datasets that contain a variety of scene categories (15-Scenes, UIUC-Sports and MIT-Indoor). Our LPR representation obtains state-of-the-art results on all these datasets, which shows that it can simultaneously model the global and local scene characteristics in a single framework and is general enough to be used for both indoor and outdoor scene classification.
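The core scoring step of the LPR representation described above can be sketched in a few lines: each learned region detector scores every sub-window of an image, the maximum response is kept (so the window location acts as a latent variable), and the per-detector maxima are concatenated into the final feature vector. This is a hypothetical illustration only; feature extraction is stubbed out with a flattened patch, whereas the real method uses spatial-pyramid features with locality-constrained coding, and the detector weights would come from latent SVM training.

```python
# Hedged sketch of LPR-style scoring: max detector response over all
# sub-windows, concatenated across detectors. Not the authors' code.
import numpy as np

def subwindow_features(image, size=8, stride=4):
    """Yield a flattened feature vector for every square sub-window."""
    h, w = image.shape[:2]
    for y in range(0, h - size + 1, stride):
        for x in range(0, w - size + 1, stride):
            yield image[y:y + size, x:x + size].reshape(-1)

def lpr_representation(image, detectors):
    """Max response of each linear detector over all sub-windows.

    detectors: (n_detectors, d) weight matrix, one row per trained LPR.
    The argmax window is the latent variable; only the max score is kept.
    """
    feats = np.stack(list(subwindow_features(image)))  # (n_windows, d)
    return (feats @ detectors.T).max(axis=0)           # (n_detectors,)

rng = np.random.default_rng(0)
image = rng.random((32, 32))          # stand-in for a real image
detectors = rng.random((5, 64))       # 5 hypothetical LPR detectors
rep = lpr_representation(image, detectors)
print(rep.shape)  # (5,)
```

The resulting vector would then be fed to a standard classifier, one dimension per learned region.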
robotics science and systems | 2017
Fereshteh Sadeghi; Sergey Levine
Deep reinforcement learning has emerged as a promising and powerful technique for automatically acquiring control policies that can process raw sensory inputs, such as images, and perform complex behaviors. However, extending deep RL to real-world robotic tasks has proven challenging, particularly in safety-critical domains such as autonomous flight, where a trial-and-error learning process is often impractical. In this paper, we explore the following question: can we train vision-based navigation policies entirely in simulation, and then transfer them into the real world to achieve real-world flight without a single real training image? We propose a learning method that we call CAD²RL, which can be used to perform collision-free indoor flight in the real world while being trained entirely on 3D CAD models. Our method uses single RGB images from a monocular camera, without needing to explicitly reconstruct the 3D geometry of the environment or perform explicit motion planning. Our learned collision avoidance policy is represented by a deep convolutional neural network that directly processes raw monocular images and outputs velocity commands. This policy is trained entirely on simulated images, with a Monte Carlo policy evaluation algorithm that directly optimizes the network's ability to produce collision-free flight. By highly randomizing the rendering settings for our simulated training set, we show that we can train a policy that generalizes to the real world, without requiring the simulator to be particularly realistic or high-fidelity. We evaluate our method by flying a real quadrotor through indoor environments, and further evaluate the design choices in our simulator through a series of ablation studies on depth prediction. For supplementary video see: this https URL
computer vision and pattern recognition | 2013
Baoyuan Liu; Fereshteh Sadeghi; Marshall F. Tappen; Ohad Shamir; Ce Liu
Large-scale recognition problems with thousands of classes pose a particular challenge because applying the classifier requires more computation as the number of classes grows. The label tree model integrates classification with the traversal of the tree so that complexity grows logarithmically. In this paper, we show how the parameters of the label tree can be found using maximum likelihood estimation. This new probabilistic learning technique produces a label tree with significantly improved recognition accuracy.
computer vision and pattern recognition | 2014
Hamid Izadinia; Fereshteh Sadeghi; Ali Farhadi
A scene category imposes tight distributions over the kind of objects that might appear in the scene, the appearance of those objects and their layout. In this paper, we propose a method to learn scene structures that can encode three main interlacing components of a scene: the scene category, the context-specific appearance of objects, and their layout. Our experimental evaluations show that our learned scene structures outperform the state-of-the-art Deformable Part Models method in detecting objects in a scene. Our scene structure provides a level of scene understanding that is amenable to deep visual inferences. The scene structures can also generate features that can later be used for scene categorization. Using these features, we also show promising results on scene categorization.
international conference on computer vision | 2015
Hamid Izadinia; Fereshteh Sadeghi; Santosh Kumar Divvala; Hannaneh Hajishirzi; Yejin Choi; Ali Farhadi
We introduce the Segment-Phrase Table (SPT), a large collection of bijective associations between textual phrases and their corresponding segmentations. Leveraging recent progress in object recognition and natural language semantics, we show how we can successfully build a high-quality segment-phrase table using minimal human supervision. More importantly, we demonstrate the unique value unleashed by this rich bimodal resource, for both vision and natural language understanding. First, we show that fine-grained textual labels facilitate contextual reasoning that helps in satisfying semantic constraints across image segments. This feature enables us to achieve state-of-the-art segmentation results on benchmark datasets. Next, we show that the association of high-quality segmentations to textual phrases aids in richer semantic understanding and reasoning of these textual phrases. Leveraging this feature, we motivate the problems of visual entailment and visual paraphrasing, and demonstrate their utility on a large dataset.
ieee international conference on fuzzy systems | 2009
Hamid Izadinia; Fereshteh Sadeghi; Mohammad Mehdi Ebadzadeh
The Generalized Hough Transform (GHT) is an efficient method for detecting curves by exploiting the duality between points on a curve and the parameters of that curve. However, GHT has practical limitations such as high computational cost and large memory requirements for detecting scaled and rotated objects. In this paper a new method, namely the Fuzzy Generalized Hough Transform (FGHT), is proposed that alleviates these deficiencies by utilizing the concept of a fuzzy inference system. In FGHT the R-table consists of a set of fuzzy rules which are fired by the gradient direction of edge pixels and vote for the possible location of the center. Moreover, the proposed method can identify the boundary of the rotated and scaled object via a new voting strategy. To evaluate the effectiveness of FGHT, several experiments with scaled, rotated, occluded and noisy images are conducted. The results are compared with two extensions of GHT and reveal that the proposed method can locate and detect the prototype object with the least error under various conditions.
workshop on applications of computer vision | 2015
Fereshteh Sadeghi; J. Rafael Tena; Ali Farhadi; Leonid Sigal
We propose the problem of automated photo album creation from an unordered image collection. The problem is difficult as it involves a number of complex perceptual tasks that facilitate selection and ordering of photos to create a compelling visual narrative. To help solve this problem, we collect (and will make available) a new benchmark dataset based on Flickr images, the Flickr Album Dataset, which provides a variety of annotations useful for the task, including manually created albums of various lengths. We analyze the problem and provide experimental evidence, through user studies, that both selection and ordering of photos within an album are important for human observers. To capture and learn rules of album composition, we propose a discriminative structured model capable of encoding simple preferences for contextual layout of the scene (e.g., spatial layout of faces, global scene context, and presence/absence of attributes) and ordering between photos (e.g., exclusion principles or correlations). The parameters of the model are learned using a structured SVM framework. Once learned, the model allows automatic composition of photo albums from unordered and untagged collections of images. We quantitatively evaluate the results obtained using our model against manually created albums and baselines on a dataset of 63 personal photo collections from 5 different topics.
international symposium on neural networks | 2009
Hamid Izadinia; Fereshteh Sadeghi; Mohammad Mehdi Ebadzadeh
The natural immune system is composed of cells and molecules with complex interactions. Jerne modeled the interactions among immune cells and molecules by introducing the immune network. The immune system provides an effective defense mechanism against foreign substances. This system, like the neural system, is able to learn from experience. In this paper, Jerne's immune network model is extended and a new classifier based on the new immune network model and Learning Vector Quantization (LVQ) is proposed. The new classification method is called Hybrid Fuzzy Neuro-Immune Network based on Multi-Epitope approach (HFNINME). The performance of the proposed method is evaluated via several benchmark classification problems and is compared with two other prominent immune-based classifiers. The experiments reveal that the proposed method yields a parsimonious classifier that can classify data more accurately and more efficiently.
computer vision and pattern recognition | 2015
Fereshteh Sadeghi; Santosh Kumar Divvala; Ali Farhadi
neural information processing systems | 2015
Fereshteh Sadeghi; C. Lawrence Zitnick; Ali Farhadi
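Learning Vector Quantization, the component that the ISNN 2009 abstract above hybridizes with an immune network model, admits a compact sketch. The following is a minimal LVQ1 illustration, not the HFNINME method itself: labeled prototype vectors are pulled toward training samples of the same class and pushed away from samples of a different class; classification assigns the label of the nearest prototype. The data, prototype initialization, and learning-rate schedule here are illustrative choices.

```python
# Minimal LVQ1 sketch: nearest prototype is attracted on a label match,
# repelled on a mismatch. Not the paper's hybrid immune-network classifier.
import numpy as np

def lvq1_fit(X, y, prototypes, proto_labels, lr=0.1, epochs=20):
    P = prototypes.copy()
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            j = np.argmin(((P - xi) ** 2).sum(axis=1))  # nearest prototype
            step = lr * (xi - P[j])
            P[j] += step if proto_labels[j] == yi else -step
        lr *= 0.95  # decay the learning rate each epoch
    return P

def lvq_predict(X, P, proto_labels):
    # Label of the nearest prototype for each sample.
    return proto_labels[np.argmin(((X[:, None, :] - P[None]) ** 2).sum(-1), axis=1)]

# Two well-separated 2-D classes, one prototype per class (toy data).
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.3, (50, 2)), rng.normal(3, 0.3, (50, 2))])
y = np.array([0] * 50 + [1] * 50)
proto_labels = np.array([0, 1])
P = lvq1_fit(X, y, np.array([[1.0, 1.0], [2.0, 2.0]]), proto_labels)
acc = (lvq_predict(X, P, proto_labels) == y).mean()
print(acc)
```

The paper's contribution lies in replacing the plain prototypes with a fuzzy immune-network model; this sketch only shows the LVQ update rule that the hybrid builds on.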