Dumitru Erhan | Researchain

Archive Network Publication Hotspot Collaboration

Network

Latest external collaboration on country level. Dive into details by clicking on the dots.

Explore More

Hotspot

Dive into the research topics where Dumitru Erhan is active.

Explore More

Publication

Featured researches published by Dumitru Erhan.

computer vision and pattern recognition | 2015

Going deeper with convolutions

Christian Szegedy; Wei Liu; Yangqing Jia; Pierre Sermanet; Scott E. Reed; Dragomir Anguelov; Dumitru Erhan; Vincent Vanhoucke; Andrew Rabinovich

We propose a deep convolutional neural network architecture codenamed Inception that achieves the new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition Challenge 2014 (ILSVRC14). The main hallmark of this architecture is the improved utilization of the computing resources inside the network. By a carefully crafted design, we increased the depth and width of the network while keeping the computational budget constant. To optimize quality, the architectural decisions were based on the Hebbian principle and the intuition of multi-scale processing. One particular incarnation used in our submission for ILSVRC14 is called GoogLeNet, a 22 layers deep network, the quality of which is assessed in the context of classification and detection.

computer vision and pattern recognition | 2015

Show and tell: A neural image caption generator

Oriol Vinyals; Alexander Toshev; Samy Bengio; Dumitru Erhan

Automatically describing the content of an image is a fundamental problem in artificial intelligence that connects computer vision and natural language processing. In this paper, we present a generative model based on a deep recurrent architecture that combines recent advances in computer vision and machine translation and that can be used to generate natural sentences describing an image. The model is trained to maximize the likelihood of the target description sentence given the training image. Experiments on several datasets show the accuracy of the model and the fluency of the language it learns solely from image descriptions. Our model is often quite accurate, which we verify both qualitatively and quantitatively. For instance, while the current state-of-the-art BLEU-1 score (the higher the better) on the Pascal dataset is 25, our approach yields 59, to be compared to human performance around 69. We also show BLEU-1 score improvements on Flickr30k, from 56 to 66, and on SBU, from 19 to 28. Lastly, on the newly released COCO dataset, we achieve a BLEU-4 of 27.7, which is the current state-of-the-art.

european conference on computer vision | 2016

SSD: Single Shot MultiBox Detector

Wei Liu; Dragomir Anguelov; Dumitru Erhan; Christian Szegedy; Scott E. Reed; Cheng-Yang Fu; Alexander C. Berg

We present a method for detecting objects in images using a single deep neural network. Our approach, named SSD, discretizes the output space of bounding boxes into a set of default boxes over different aspect ratios and scales per feature map location. At prediction time, the network generates scores for the presence of each object category in each default box and produces adjustments to the box to better match the object shape. Additionally, the network combines predictions from multiple feature maps with different resolutions to naturally handle objects of various sizes. Our SSD model is simple relative to methods that require object proposals because it completely eliminates proposal generation and subsequent pixel or feature resampling stage and encapsulates all computation in a single network. This makes SSD easy to train and straightforward to integrate into systems that require a detection component. Experimental results on the PASCAL VOC, MS COCO, and ILSVRC datasets confirm that SSD has comparable accuracy to methods that utilize an additional object proposal step and is much faster, while providing a unified framework for both training and inference. Compared to other single stage methods, SSD has much better accuracy, even with a smaller input image size. For

international conference on machine learning | 2007

An empirical evaluation of deep architectures on problems with many factors of variation

Hugo Larochelle; Dumitru Erhan; Aaron C. Courville; James Bergstra; Yoshua Bengio

300\times 300

computer vision and pattern recognition | 2014

Scalable Object Detection Using Deep Neural Networks

Dumitru Erhan; Christian Szegedy; Alexander Toshev; Dragomir Anguelov

input, SSD achieves 72.1% mAP on VOC2007 test at 58 FPS on a Nvidia Titan X and for

Machine Learning | 2006

Aggregate features and ADABOOST for music classification

James Bergstra; Norman Casagrande; Dumitru Erhan; Douglas Eck; Balázs Kégl

500\times 500

international conference on neural information processing | 2013

Challenges in Representation Learning: A Report on Three Machine Learning Contests

Ian J. Goodfellow; Dumitru Erhan; Pierre Carrier; Aaron C. Courville; Mehdi Mirza; Ben Hamner; Will Cukierski; Yichuan Tang; David Thaler; Dong-Hyun Lee; Yingbo Zhou; Chetan Ramaiah; Fangxiang Feng; Ruifan Li; Xiaojie Wang; Dimitris Athanasakis; John Shawe-Taylor; Maxim Milakov; John Park; Radu Tudor Ionescu; Marius Popescu; Cristian Grozea; James Bergstra; Jingjing Xie; Lukasz Romaszko; Bing Xu; Zhang Chuang; Yoshua Bengio

input, SSD achieves 75.1% mAP, outperforming a comparable state of the art Faster R-CNN model. Code is available at this https URL .

computer vision and pattern recognition | 2017

Unsupervised Pixel-Level Domain Adaptation with Generative Adversarial Networks

Konstantinos Bousmalis; Nathan Silberman; David Dohan; Dumitru Erhan; Dilip Krishnan

Recently, several learning algorithms relying on models with deep architectures have been proposed. Though they have demonstrated impressive performance, to date, they have only been evaluated on relatively simple problems such as digit recognition in a controlled environment, for which many machine learning algorithms already report reasonable results. Here, we present a series of experiments which indicate that these models show promise in solving harder learning problems that exhibit many factors of variation. These models are compared with well-established algorithms such as Support Vector Machines and single hidden-layer feed-forward neural networks.

IEEE Transactions on Pattern Analysis and Machine Intelligence | 2017

Show and Tell: Lessons Learned from the 2015 MSCOCO Image Captioning Challenge

Oriol Vinyals; Alexander Toshev; Samy Bengio; Dumitru Erhan

Deep convolutional neural networks have recently achieved state-of-the-art performance on a number of image recognition benchmarks, including the ImageNet Large-Scale Visual Recognition Challenge (ILSVRC-2012). The winning model on the localization sub-task was a network that predicts a single bounding box and a confidence score for each object category in the image. Such a model captures the whole-image context around the objects but cannot handle multiple instances of the same object in the image without naively replicating the number of outputs for each instance. In this work, we propose a saliency-inspired neural network model for detection, which predicts a set of class-agnostic bounding boxes along with a single score for each box, corresponding to its likelihood of containing any object of interest. The model naturally handles a variable number of instances for each class and allows for cross-class generalization at the highest levels of the network. We are able to obtain competitive recognition performance on VOC2007 and ILSVRC2012, while using only the top few predicted locations in each image and a small number of neural network evaluations.

Journal of Chemical Information and Modeling | 2006

Collaborative filtering on a family of biological targets

Dumitru Erhan; Pierre-Jean L'Heureux; Shi Yi Yue; Yoshua Bengio

We present an algorithm that predicts musical genre and artist from an audio waveform. Our method uses the ensemble learner ADABOOST to select from a set of audio features that have been extracted from segmented audio and then aggregated. Our classifier proved to be the most effective method for genre classification at the recent MIREX 2005 international contests in music information extraction, and the second-best method for recognizing artists. This paper describes our method in detail, from feature extraction to song classification, and presents an evaluation of our method on three genre databases and two artist-recognition databases. Furthermore, we present evidence collected from a variety of popular features and classifiers that the technique of classifying features aggregated over segments of audio is better than classifying either entire songs or individual short-timescale features.

Explore More