Network


Latest external collaborations at the country level. Dive into the details by clicking on the dots.

Hotspot


Dive into the research topics where Phillip Isola is active.

Publication


Featured research published by Phillip Isola.


Computer Vision and Pattern Recognition | 2017

Image-to-Image Translation with Conditional Adversarial Networks

Phillip Isola; Jun-Yan Zhu; Tinghui Zhou; Alexei A. Efros

We investigate conditional adversarial networks as a general-purpose solution to image-to-image translation problems. These networks not only learn the mapping from input image to output image, but also learn a loss function to train this mapping. This makes it possible to apply the same generic approach to problems that traditionally would require very different loss formulations. We demonstrate that this approach is effective at synthesizing photos from label maps, reconstructing objects from edge maps, and colorizing images, among other tasks. Moreover, since the release of the pix2pix software associated with this paper, hundreds of Twitter users have posted their own artistic experiments using our system. As a community, we no longer hand-engineer our mapping functions, and this work suggests we can achieve reasonable results without hand-engineering our loss functions either.
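
As a sketch of the objective described above, assuming a PyTorch-style generator G and discriminator D (the names and the L1 weight lam are illustrative, not the released pix2pix code): the discriminator scores (input, output) pairs, while the generator is trained both to fool it and to stay close to the target.

```python
import torch
import torch.nn.functional as F

def pix2pix_losses(G, D, x, y, lam=100.0):
    """Conditional-GAN objective sketch: D sees (input, output) pairs;
    G is penalized adversarially plus an L1 reconstruction term."""
    fake = G(x)

    # Discriminator: real pairs -> 1, generated pairs -> 0.
    d_real = D(torch.cat([x, y], dim=1))
    d_fake = D(torch.cat([x, fake.detach()], dim=1))
    d_loss = (F.binary_cross_entropy_with_logits(d_real, torch.ones_like(d_real))
              + F.binary_cross_entropy_with_logits(d_fake, torch.zeros_like(d_fake)))

    # Generator: fool D on the same pairs, and match the target in L1.
    g_adv = F.binary_cross_entropy_with_logits(
        D(torch.cat([x, fake], dim=1)), torch.ones_like(d_fake))
    g_loss = g_adv + lam * F.l1_loss(fake, y)
    return d_loss, g_loss
```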


Computer Vision and Pattern Recognition | 2011

What makes an image memorable?

Phillip Isola; Jianxiong Xiao; Antonio Torralba; Aude Oliva

When glancing at a magazine, or browsing the Internet, we are continuously being exposed to photographs. Despite this overflow of visual information, humans are extremely good at remembering thousands of pictures along with some of their visual details. But not all images are equal in memory. Some stick in our minds, while others are forgotten. In this paper we focus on the problem of predicting how memorable an image will be. We show that memorability is a stable property of an image that is shared across different viewers. We introduce a database for which we have measured the probability that each picture will be remembered after a single view. We analyze image features and labels that contribute to making an image memorable, and we train a predictor based on global image descriptors. We find that predicting image memorability is a task that can be addressed with current computer vision techniques. Whereas making memorable images is a challenging task in visualization and photography, this work is a first attempt to quantify this useful quality of images.
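
In outline, the "predictor based on global image descriptors" amounts to regressing measured memorability scores from precomputed features; a minimal sketch with a support vector regressor (the .npy file names below are hypothetical placeholders):

```python
import numpy as np
from scipy.stats import spearmanr
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR

# Hypothetical inputs: one global descriptor per image (e.g. GIST-like)
# and the measured probability that each image is remembered.
features = np.load("image_descriptors.npy")    # (n_images, d)
scores = np.load("memorability_scores.npy")    # (n_images,)

X_tr, X_te, y_tr, y_te = train_test_split(features, scores, random_state=0)
model = SVR(kernel="rbf").fit(X_tr, y_tr)

# Rank correlation between predicted and measured memorability.
rho, _ = spearmanr(model.predict(X_te), y_te)
print(f"Spearman rho on held-out images: {rho:.2f}")
```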


European Conference on Computer Vision | 2016

Colorful Image Colorization

Richard Zhang; Phillip Isola; Alexei A. Efros

Given a grayscale photograph as input, this paper attacks the problem of hallucinating a plausible color version of the photograph. This problem is clearly underconstrained, so previous approaches have either relied on significant user interaction or resulted in desaturated colorizations. We propose a fully automatic approach that produces vibrant and realistic colorizations. We embrace the underlying uncertainty of the problem by posing it as a classification task and use class-rebalancing at training time to increase the diversity of colors in the result. The system is implemented as a feed-forward pass in a CNN at test time and is trained on over a million color images. We evaluate our algorithm using a “colorization Turing test,” asking human participants to choose between a generated and ground truth color image. Our method successfully fools humans on 32% of the trials, significantly higher than previous methods. Moreover, we show that colorization can be a powerful pretext task for self-supervised feature learning, acting as a cross-channel encoder. This approach results in state-of-the-art performance on several feature learning benchmarks.
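
A hedged sketch of the classification-with-rebalancing idea: quantize the ab color plane into bins, treat each pixel as a classification target, and upweight rare (saturated) bins. The paper soft-assigns each pixel to several nearby bins; hard assignment is used here for brevity.

```python
import torch
import torch.nn.functional as F

def rebalanced_colorization_loss(logits, ab_targets, bin_centers, class_weights):
    """logits:        (N, Q, H, W) scores over Q quantized ab bins
    ab_targets:    (N, 2, H, W) ground-truth ab values
    bin_centers:   (Q, 2) centers of the quantized ab bins
    class_weights: (Q,) rarity-based weights: the class-rebalancing
                   term that keeps rare, vivid colors in play."""
    n, _, h, w = ab_targets.shape

    # Hard-assign each pixel's ab value to its nearest color bin.
    flat = ab_targets.permute(0, 2, 3, 1).reshape(-1, 2)   # (N*H*W, 2)
    labels = torch.cdist(flat, bin_centers).argmin(dim=1).reshape(n, h, w)

    # Per-pixel cross-entropy, reweighted by target-color rarity.
    return F.cross_entropy(logits, labels, weight=class_weights)
```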


Journal of Experimental Psychology: Learning, Memory, and Cognition | 2008

Multidimensional Visual Statistical Learning

Nicholas B. Turk-Browne; Phillip Isola; Brian J. Scholl; Teresa A. Treat

Recent studies of visual statistical learning (VSL) have demonstrated that statistical regularities in sequences of visual stimuli can be automatically extracted, even without intent or awareness. Despite much work on this topic, however, several fundamental questions remain about the nature of VSL. In particular, previous experiments have not explored the underlying units over which VSL operates. In a sequence of colored shapes, for example, does VSL operate over each feature dimension independently, or over multidimensional objects in which color and shape are bound together? The studies reported here demonstrate that VSL can be both object-based and feature-based, in systematic ways based on how different feature dimensions covary. For example, when each shape covaried perfectly with a particular color, VSL was object-based: Observers expressed robust VSL for colored-shape sub-sequences at test but failed when the test items consisted of monochromatic shapes or color patches. When shape and color pairs were partially decoupled during learning, however, VSL operated over features: Observers expressed robust VSL when the feature dimensions were tested separately. These results suggest that VSL is object-based, but that sensitivity to feature correlations in multidimensional sequences (possibly another form of VSL) may in turn help define what counts as an object.


IEEE Transactions on Pattern Analysis and Machine Intelligence | 2014

What Makes a Photograph Memorable?

Phillip Isola; Jianxiong Xiao; Devi Parikh; Antonio Torralba; Aude Oliva

When glancing at a magazine, or browsing the Internet, we are continuously exposed to photographs. Despite this overflow of visual information, humans are extremely good at remembering thousands of pictures along with some of their visual details. But not all images are equal in memory. Some stick in our minds while others are quickly forgotten. In this paper, we focus on the problem of predicting how memorable an image will be. We show that memorability is an intrinsic and stable property of an image that is shared across different viewers, and remains stable across delays. We introduce a database for which we have measured the probability that each picture will be recognized after a single view. We analyze a collection of image features, labels, and attributes that contribute to making an image memorable, and we train a predictor based on global image descriptors. We find that predicting image memorability is a task that can be addressed with current computer vision techniques. While making memorable images is a challenging task in visualization, photography, and education, this work is a first attempt to quantify this useful property of images.
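
The consistency claim is commonly quantified by split-half analysis: randomly divide viewers into two halves, compute each image's recognition rate within each half, and rank-correlate the two. A minimal sketch, assuming a viewer-by-image matrix of recognition hits:

```python
import numpy as np
from scipy.stats import spearmanr

def split_half_consistency(hits, n_splits=25, seed=0):
    """hits: (n_viewers, n_images) binary matrix; 1 if the viewer
    recognized the image when it repeated. Returns the mean Spearman
    correlation between memorability scores from random viewer halves."""
    rng = np.random.default_rng(seed)
    n_viewers = hits.shape[0]
    rhos = []
    for _ in range(n_splits):
        perm = rng.permutation(n_viewers)
        half_a, half_b = perm[: n_viewers // 2], perm[n_viewers // 2:]
        rho, _ = spearmanr(hits[half_a].mean(axis=0),
                           hits[half_b].mean(axis=0))
        rhos.append(rho)
    return float(np.mean(rhos))
```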


European Conference on Computer Vision | 2014

Crisp Boundary Detection Using Pointwise Mutual Information

Phillip Isola; Daniel Zoran; Dilip Krishnan; Edward H. Adelson

Detecting boundaries between semantically meaningful objects in visual scenes is an important component of many vision algorithms. In this paper, we propose a novel method for detecting such boundaries based on a simple underlying principle: pixels belonging to the same object exhibit higher statistical dependencies than pixels belonging to different objects. We show how to derive an affinity measure based on this principle using pointwise mutual information, and we show that this measure is indeed a good predictor of whether or not two pixels reside on the same object. Using this affinity with spectral clustering, we can find object boundaries in the image – achieving state-of-the-art results on the BSDS500 dataset. Our method produces pixel-level accurate boundaries while requiring minimal feature engineering.
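
The affinity at the heart of the method is simple to state: pointwise mutual information compares how often two feature values co-occur at nearby pixels against what their marginals predict. A sketch over an empirical joint histogram (the exponent rho below is illustrative):

```python
import numpy as np

def pmi_affinity(joint, rho=1.25, eps=1e-12):
    """joint: (K, K) empirical histogram of co-occurring feature values
    (e.g. quantized luminance) at nearby pixel pairs. Returns a (K, K)
    matrix of PMI_rho(a, b) = log(P(a, b)**rho / (P(a) * P(b))): high
    where co-occurrence exceeds chance, i.e. likely the same object."""
    p_joint = joint / joint.sum()
    p_a = p_joint.sum(axis=1)
    p_b = p_joint.sum(axis=0)
    return np.log((p_joint ** rho + eps) / (np.outer(p_a, p_b) + eps))
```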


Journal of Experimental Psychology: General | 2013

The intrinsic memorability of face photographs.

Wilma A. Bainbridge; Phillip Isola; Aude Oliva

The faces we encounter throughout our lives make different impressions on us: Some are remembered at first glance, while others are forgotten. Previous work has found that the distinctiveness of a face influences its memorability, the degree to which face images are remembered or forgotten. Here, we generalize the concept of face memorability in a large-scale memory study. First, we find that memorability is an intrinsic feature of a face photograph: across observers, some faces are consistently more remembered or forgotten than others, indicating that memorability can be used for measuring, predicting, and manipulating subsequent memories. Second, we determine the role that 20 personality, social, and memory-related traits play in face memorability. Whereas we find that certain traits (such as kindness, atypicality, and trustworthiness) contribute to face memorability, they do not suffice to explain the variance in memorability scores, even when accounting for noise and differences in subjective experience. This suggests that memorability itself is a consistent, singular measure of a face that cannot be reduced to a simple combination of personality and social facial attributes. We outline modern neuroscience questions that can be explored through the lens of memorability.
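
One way to read the "do not suffice to explain the variance" finding is as a cross-validated R^2 of a trait-based regression against measured memorability; a minimal sketch with hypothetical arrays:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

# Hypothetical data: 20 averaged trait ratings per face photograph,
# and each photograph's measured memorability score.
traits = np.load("trait_ratings.npy")            # (n_faces, 20)
memorability = np.load("face_memorability.npy")  # (n_faces,)

# How much memorability variance do the traits alone explain?
r2 = cross_val_score(LinearRegression(), traits, memorability,
                     scoring="r2", cv=5).mean()
print(f"cross-validated R^2 from traits alone: {r2:.2f}")
```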


Vision Research | 2015

Intrinsic and extrinsic effects on image memorability

Zoya Bylinskii; Phillip Isola; Constance Bainbridge; Antonio Torralba; Aude Oliva

Previous studies have identified that images carry the attribute of memorability, a predictive value of whether a novel image will be later remembered or forgotten. Here we investigate the interplay between intrinsic and extrinsic factors that affect image memorability. First, we find that intrinsic differences in memorability exist at a finer-grained scale than previously documented. Second, we test two extrinsic factors: image context and observer behavior. Building on prior findings that images that are distinct with respect to their context are better remembered, we propose an information-theoretic model of image distinctiveness. Our model can automatically predict how changes in context change the memorability of natural images. In addition to context, we study a second extrinsic factor: where an observer looks while memorizing an image. It turns out that eye movements provide additional information that can predict whether or not an image will be remembered, on a trial-by-trial basis. Together, by considering both intrinsic and extrinsic effects on memorability, we arrive at a more complete and fine-grained model of image memorability than previously available.
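
One plausible rendering of the distinctiveness model: score each image by its self-information, -log p(x), under a density fit to the features of its context, so that images rare within their context score as distinct. A hedged sketch; the paper's exact model may differ.

```python
import numpy as np
from sklearn.neighbors import KernelDensity

def distinctiveness(image_feats, context_feats, bandwidth=1.0):
    """image_feats:   (n, d) features of the images to score
    context_feats: (m, d) features of the surrounding context set.
    Returns per-image self-information -log p(x) under a kernel
    density estimate of the context: higher = more distinct."""
    kde = KernelDensity(bandwidth=bandwidth).fit(context_feats)
    return -kde.score_samples(image_feats)
```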


Computer Vision and Pattern Recognition | 2017

Split-Brain Autoencoders: Unsupervised Learning by Cross-Channel Prediction

Richard Y. Zhang; Phillip Isola; Alexei A. Efros

We propose split-brain autoencoders, a straightforward modification of the traditional autoencoder architecture, for unsupervised representation learning. The method adds a split to the network, resulting in two disjoint sub-networks. Each sub-network is trained to perform a difficult task – predicting one subset of the data channels from another. Together, the sub-networks extract features from the entire input signal. By forcing the network to solve cross-channel prediction tasks, we induce a representation within the network which transfers well to other, unseen tasks. This method achieves state-of-the-art performance on several large-scale transfer learning benchmarks.
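
A toy rendering of the split, in which each disjoint sub-network predicts one subset of the channels from the complement (e.g. L to ab and ab to L in Lab space). The paper poses each prediction as classification and uses much larger networks; plain regression and placeholder layers are used here:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SplitBrainAutoencoder(nn.Module):
    """Two disjoint sub-networks, each solving a cross-channel
    prediction task; their features are concatenated downstream."""
    def __init__(self, make_subnet):
        super().__init__()
        self.l_to_ab = make_subnet(1, 2)   # predict color from luminance
        self.ab_to_l = make_subnet(2, 1)   # predict luminance from color

    def forward(self, lab):
        l, ab = lab[:, :1], lab[:, 1:]
        return F.l1_loss(self.l_to_ab(l), ab) + F.l1_loss(self.ab_to_l(ab), l)

def make_subnet(in_ch, out_ch):
    # Placeholder architecture, not the paper's network.
    return nn.Sequential(nn.Conv2d(in_ch, 16, 3, padding=1), nn.ReLU(),
                         nn.Conv2d(16, out_ch, 3, padding=1))

loss = SplitBrainAutoencoder(make_subnet)(torch.randn(4, 3, 32, 32))
```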


International Conference on Computer Vision | 2015

Learning Ordinal Relationships for Mid-Level Vision

Daniel Zoran; Phillip Isola; Dilip Krishnan; William T. Freeman

We propose a framework that infers mid-level visual properties of an image by learning about ordinal relationships. Instead of estimating metric quantities directly, the system proposes pairwise relationship estimates for points in the input image. These sparse probabilistic ordinal measurements are globalized to create a dense output map of continuous metric measurements. Estimating order relationships between pairs of points has several advantages over metric estimation: it is a simpler problem than metric regression; humans are better at relative judgments, so data collection is easier; and ordinal relationships are invariant to monotonic transformations of the data, which increases the robustness of the system and provides qualitatively different information. We demonstrate that this framework works well on two important mid-level vision tasks: intrinsic image decomposition and depth from an RGB image. We train two systems with the same architecture on data from these two modalities. We provide an analysis of the resulting models, showing that they learn a number of simple rules to make ordinal decisions. We apply our algorithm to depth estimation, with good results, and intrinsic image decomposition, with state-of-the-art results.
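
The globalization step can be approximated as sparse least squares over log-values: each ordinal judgment (i, j, d) becomes a soft constraint log x_i - log x_j ≈ d, plus smoothness terms between neighbors. A simplified sketch, not the paper's exact solver:

```python
import numpy as np
from scipy.sparse import lil_matrix
from scipy.sparse.linalg import lsqr

def globalize(n_points, ordinal, neighbors, w_ord=1.0, w_smooth=0.1):
    """ordinal:   list of (i, j, d) with log x_i - log x_j ~ d
                  (d > 0: point i larger; d = 0: roughly equal)
    neighbors: list of (i, j) pairs softly encouraged to agree.
    Returns dense metric estimates (up to a global scale, which the
    minimum-norm least-squares solution pins down implicitly)."""
    A = lil_matrix((len(ordinal) + len(neighbors), n_points))
    b = np.zeros(len(ordinal) + len(neighbors))
    for r, (i, j, d) in enumerate(ordinal):
        A[r, i], A[r, j], b[r] = w_ord, -w_ord, w_ord * d
    for r, (i, j) in enumerate(neighbors, start=len(ordinal)):
        A[r, i], A[r, j] = w_smooth, -w_smooth
    return np.exp(lsqr(A.tocsr(), b)[0])
```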

Collaboration


Dive into Phillip Isola's collaborations.

Top Co-Authors

Aude Oliva
Massachusetts Institute of Technology

Antonio Torralba
Massachusetts Institute of Technology

Edward H. Adelson
Massachusetts Institute of Technology

Jun-Yan Zhu
University of California

Devi Parikh
Georgia Institute of Technology