Network


Latest external collaborations at the country level. Dive into the details by clicking on the dots.

Hotspot


Dive into the research topics where Scott E. Reed is active.

Publication


Featured research published by Scott E. Reed.


computer vision and pattern recognition | 2015

Going deeper with convolutions

Christian Szegedy; Wei Liu; Yangqing Jia; Pierre Sermanet; Scott E. Reed; Dragomir Anguelov; Dumitru Erhan; Vincent Vanhoucke; Andrew Rabinovich

We propose a deep convolutional neural network architecture codenamed Inception that achieves the new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition Challenge 2014 (ILSVRC14). The main hallmark of this architecture is the improved utilization of the computing resources inside the network. By a carefully crafted design, we increased the depth and width of the network while keeping the computational budget constant. To optimize quality, the architectural decisions were based on the Hebbian principle and the intuition of multi-scale processing. One particular incarnation used in our submission for ILSVRC14 is called GoogLeNet, a 22 layers deep network, the quality of which is assessed in the context of classification and detection.
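The "carefully crafted design" mentioned above leans on 1×1 convolutions to shrink channel counts before the expensive larger convolutions, which is what lets depth and width grow at a roughly constant budget. A back-of-the-envelope multiply-add count makes the effect concrete; the feature-map size and channel counts below are hypothetical, chosen only for illustration, not GoogLeNet's actual layer sizes.

```python
# Rough multiply-add counts for one 5x5 convolution branch, with and
# without a 1x1 "bottleneck" reduction in the style of an Inception module.

def conv_madds(h, w, c_in, c_out, k):
    """Multiply-adds for a k x k convolution over an h x w map (same padding)."""
    return h * w * c_in * c_out * k * k

H = W = 28          # spatial size of the feature map (hypothetical)
C_IN = 192          # input channels (hypothetical)
C_RED = 16          # channels after the 1x1 reduction (hypothetical)
C_OUT = 32          # output channels of the 5x5 branch (hypothetical)

naive = conv_madds(H, W, C_IN, C_OUT, 5)
reduced = conv_madds(H, W, C_IN, C_RED, 1) + conv_madds(H, W, C_RED, C_OUT, 5)

print(naive, reduced, naive / reduced)  # the reduced branch is ~10x cheaper
```

With these toy numbers the bottlenecked branch costs roughly a tenth of the naive one, which is why the reduction layers pay for themselves.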


european conference on computer vision | 2016

SSD: Single Shot MultiBox Detector

Wei Liu; Dragomir Anguelov; Dumitru Erhan; Christian Szegedy; Scott E. Reed; Cheng-Yang Fu; Alexander C. Berg

We present a method for detecting objects in images using a single deep neural network. Our approach, named SSD, discretizes the output space of bounding boxes into a set of default boxes over different aspect ratios and scales per feature map location. At prediction time, the network generates scores for the presence of each object category in each default box and produces adjustments to the box to better match the object shape. Additionally, the network combines predictions from multiple feature maps with different resolutions to naturally handle objects of various sizes. Our SSD model is simple relative to methods that require object proposals because it completely eliminates proposal generation and the subsequent pixel or feature resampling stage, and encapsulates all computation in a single network. This makes SSD easy to train and straightforward to integrate into systems that require a detection component. Experimental results on the PASCAL VOC, MS COCO, and ILSVRC datasets confirm that SSD has comparable accuracy to methods that utilize an additional object proposal step and is much faster, while providing a unified framework for both training and inference. Compared to other single stage methods, SSD has much better accuracy, even with a smaller input image size. For 300×300 input, SSD achieves 72.1% mAP on VOC2007 test at 58 FPS on a Nvidia Titan X, and for 500×500 input, SSD achieves 75.1% mAP, outperforming a comparable state of the art Faster R-CNN model. Code is available at this https URL.
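The default-box scheme at the heart of SSD can be sketched in a few lines: a fixed grid of box centers, each carrying one box per aspect ratio at a given scale. The feature-map size, scale, and ratios below are illustrative assumptions, not the paper's per-layer settings.

```python
import math

# SSD-style default (anchor) box generation for one feature map:
# a fixed set of boxes per location, covering several aspect ratios.

def default_boxes(fmap_size, scale, aspect_ratios):
    """Return (cx, cy, w, h) boxes with coordinates normalized to [0, 1]."""
    boxes = []
    for i in range(fmap_size):          # rows
        for j in range(fmap_size):      # cols
            cx = (j + 0.5) / fmap_size  # box center sits mid-cell
            cy = (i + 0.5) / fmap_size
            for ar in aspect_ratios:
                w = scale * math.sqrt(ar)   # wider for ar > 1
                h = scale / math.sqrt(ar)   # taller for ar < 1
                boxes.append((cx, cy, w, h))
    return boxes

boxes = default_boxes(fmap_size=8, scale=0.2, aspect_ratios=[1.0, 2.0, 0.5])
print(len(boxes))  # 8 * 8 * 3 = 192 boxes
```

At prediction time the network would emit, per default box, one score per category plus four offsets adjusting (cx, cy, w, h) toward the object.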


computer vision and pattern recognition | 2015

Evaluation of output embeddings for fine-grained image classification

Zeynep Akata; Scott E. Reed; Daniel J. Walter; Honglak Lee; Bernt Schiele

Image classification has advanced significantly in recent years with the availability of large-scale image sets. However, fine-grained classification remains a major challenge due to the annotation cost of large numbers of fine-grained categories. This project shows that compelling classification performance can be achieved on such categories even without labeled training data. Given image and class embeddings, we learn a compatibility function such that matching embeddings are assigned a higher score than mismatching ones; zero-shot classification of an image proceeds by finding the label yielding the highest joint compatibility score. We use state-of-the-art image features and focus on different supervised attributes and unsupervised output embeddings either derived from hierarchies or learned from unlabeled text corpora. We establish a substantially improved state-of-the-art on the Animals with Attributes and Caltech-UCSD Birds datasets. Most encouragingly, we demonstrate that purely unsupervised output embeddings (learned from Wikipedia and improved with fine-grained text) achieve compelling results, even outperforming the previous supervised state-of-the-art. By combining different output embeddings, we further improve results.
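The compatibility-function view of zero-shot classification in this line of work can be pictured with a bilinear score between an image embedding and a class output embedding: F(x, y) = xᵀWy, with the predicted label being the class whose embedding scores highest. Everything below (W, the embeddings, the class names) is a toy stand-in, not learned values.

```python
# Bilinear compatibility scoring for zero-shot classification:
# F(x, y) = x^T W y, prediction = argmax over class embeddings y.

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def compatibility(x, W, y):
    """F(x, y) = x^T W y."""
    return dot(x, matvec(W, y))

W = [[1.0, 0.0],
     [0.0, 1.0],
     [0.5, 0.5]]                    # maps 2-dim class space into 3-dim image space

image = [0.9, 0.1, 0.2]             # toy image embedding
classes = {
    "zebra":   [1.0, 0.0],          # toy output (attribute-style) embeddings
    "dolphin": [0.0, 1.0],
}

scores = {c: compatibility(image, W, e) for c, e in classes.items()}
prediction = max(scores, key=scores.get)
print(scores, prediction)
```

Because the score factors through class embeddings rather than per-class weights, classes unseen during training can be scored as long as their output embedding (attributes, hierarchy, or word vectors) is available.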


computer vision and pattern recognition | 2016

Learning Deep Representations of Fine-Grained Visual Descriptions

Scott E. Reed; Zeynep Akata; Honglak Lee; Bernt Schiele

State-of-the-art methods for zero-shot visual recognition formulate learning as a joint embedding problem of images and side information. In these formulations the current best complement to visual features are attributes: manually-encoded vectors describing shared characteristics among categories. Despite good performance, attributes have limitations: (1) finer-grained recognition requires commensurately more attributes, and (2) attributes do not provide a natural language interface. We propose to overcome these limitations by training neural language models from scratch, i.e. without pre-training and only consuming words and characters. Our proposed models train end-to-end to align with the fine-grained and category-specific content of images. Natural language provides a flexible and compact way of encoding only the salient visual aspects for distinguishing categories. By training on raw text, our model can do inference on raw text as well, providing humans a familiar mode both for annotation and retrieval. Our model achieves strong performance on zero-shot text-based image retrieval and significantly outperforms the attribute-based state-of-the-art for zero-shot classification on the Caltech-UCSD Birds 200-2011 dataset.
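The fine-grained visual-description approach aligns text and images in a shared embedding space so that raw text can drive retrieval. A toy version of that retrieval interface is below; the vectors and file names are made up, standing in for the paper's learned neural text and image encoders.

```python
import math

# Text-based image retrieval by nearest neighbour in a joint embedding
# space. In practice both embeddings would come from trained encoders;
# here they are fixed toy vectors.

def cosine(a, b):
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return sum(x * y for x, y in zip(a, b)) / (na * nb)

image_db = {
    "bird_001.jpg": [0.9, 0.1, 0.0],   # hypothetical image embeddings
    "bird_002.jpg": [0.1, 0.9, 0.1],
}

# Stand-in for a learned text encoder applied to a query description,
# e.g. "a small bird with a red crown".
query_embedding = [0.8, 0.2, 0.0]

ranked = sorted(image_db,
                key=lambda k: cosine(query_embedding, image_db[k]),
                reverse=True)
print(ranked)  # images ordered by similarity to the text query
```

The same similarity also supports the reverse direction (annotating an image with its best-matching description), which is what makes raw text a two-way interface.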


PLOS ONE | 2012

Modeling adaptive regulatory T-cell dynamics during early HIV infection.

Michael Simonov; Renata A. Rawlings; Nick Comment; Scott E. Reed; Xiaoyu Shi; Patrick W. Nelson

Regulatory T-cells (Tregs) are a subset of CD4+ T-cells that have been found to suppress the immune response. During HIV viral infection, Treg activity has been observed to have both beneficial and deleterious effects on patient recovery; however, the extent to which this is regulated is poorly understood. We hypothesize that this dichotomy in behavior is attributed to Treg dynamics changing over the course of infection through the proliferation of an ‘adaptive’ Treg population which targets HIV-specific immune responses. To investigate the role Tregs play in HIV infection, a delay differential equation model was constructed to examine (1) the possible existence of two distinct Treg populations, normal (nTregs) and adaptive (aTregs), and (2) their respective effects in limiting viral load. Sensitivity analysis was performed to test parameter regimes that show the proportionality of viral load with adaptive regulatory populations and also gave insight into the importance of downregulation of CD4+ cells by normal Tregs on viral loads. Through the inclusion of Treg populations in the model, a diverse array of viral dynamics was found. Specifically, oscillatory and steady state behaviors were both witnessed and it was seen that the model provided a more accurate depiction of the effector cell population as compared with previous models. Through further studies of adaptive and normal Tregs, improved treatments for HIV can be constructed for patients and the viral mechanisms of infection can be further elucidated.
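This model is built on delay differential equations, where the derivative at time t depends on the state at t − τ. A generic way to integrate such a delayed term numerically is a fixed-step Euler scheme with a history buffer; the equation and parameters below are placeholders for illustration, not the paper's Treg model.

```python
# Fixed-step Euler integration of a scalar delay differential equation:
#   dx/dt = growth * x(t - tau) - x(t),  with x(t) = x0 for t <= 0.
# The history buffer supplies the delayed value x(t - tau) at each step.

def integrate_delayed_decay(x0, tau, growth, dt, steps):
    delay_steps = int(round(tau / dt))
    history = [x0] * (delay_steps + 1)   # x on the interval [-tau, 0]
    xs = [x0]
    for _ in range(steps):
        delayed = history[0]             # x(t - tau)
        x_next = xs[-1] + dt * (growth * delayed - xs[-1])
        history.pop(0)                   # slide the delay window forward
        history.append(x_next)
        xs.append(x_next)
    return xs

traj = integrate_delayed_decay(x0=1.0, tau=1.0, growth=0.5, dt=0.1, steps=100)
print(traj[-1])  # decays toward 0 since growth < 1
```

Depending on the delay and rate constants, this kind of system can settle to a steady state or oscillate, which matches the range of behaviors the abstract reports.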


international conference on machine learning | 2016

Generative adversarial text to image synthesis

Scott E. Reed; Zeynep Akata; Xinchen Yan; Lajanugen Logeswaran; Bernt Schiele; Honglak Lee



arXiv: Computer Vision and Pattern Recognition | 2015

Scalable, high-quality object detection

Christian Szegedy; Scott E. Reed; Dumitru Erhan; Dragomir Anguelov



neural information processing systems | 2015

Weakly-supervised disentangling with recurrent transformations for 3D view synthesis

Jimei Yang; Scott E. Reed; Ming-Hsuan Yang; Honglak Lee



neural information processing systems | 2016

Learning what and where to draw

Scott E. Reed; Zeynep Akata; Santosh Mohan; Samuel Tenka; Bernt Schiele; Honglak Lee



arXiv: Computer Vision and Pattern Recognition | 2015

Training deep neural networks on noisy labels with bootstrapping

Scott E. Reed; Honglak Lee; Dragomir Anguelov; Christian Szegedy; Dumitru Erhan; Andrew Rabinovich
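The bootstrapping in this paper's title refers to mixing the (possibly wrong) observed label with the model's own current prediction and training against that mixture, so a confident model can partially override a noisy annotation. The sketch below shows the commonly described "soft" form of that target; beta and the probabilities are illustrative numbers, not values from the paper.

```python
import math

# "Soft bootstrapping" target for learning with noisy labels:
# train against q = beta * label + (1 - beta) * prediction
# instead of the raw one-hot label.

def soft_bootstrap_loss(pred, one_hot, beta):
    """Cross-entropy of the prediction against the mixed target q."""
    q = [beta * t + (1.0 - beta) * p for t, p in zip(one_hot, pred)]
    return -sum(qi * math.log(pi) for qi, pi in zip(q, pred))

pred = [0.7, 0.2, 0.1]         # model is fairly confident in class 0
noisy_label = [0.0, 1.0, 0.0]  # annotation says class 1 (possibly wrong)

plain_ce = -math.log(pred[1])  # standard cross-entropy trusts the label fully
boot_ce = soft_bootstrap_loss(pred, noisy_label, beta=0.8)
print(plain_ce, boot_ce)       # bootstrapped loss penalizes the model less
```

When the prediction disagrees with the label, the bootstrapped loss is smaller than plain cross-entropy, which dampens the gradient signal from labels the model believes are wrong.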

Collaboration


Dive into Scott E. Reed's collaborations.

Top Co-Authors

Honglak Lee
University of Michigan

Ziyu Wang
University of British Columbia