Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Tsung-Yi Lin is active.

Publication


Featured research published by Tsung-Yi Lin.


european conference on computer vision | 2014

Microsoft COCO: Common Objects in Context

Tsung-Yi Lin; Michael Maire; Serge J. Belongie; James Hays; Pietro Perona; Deva Ramanan; Piotr Dollár; C. Lawrence Zitnick

We present a new dataset with the goal of advancing the state-of-the-art in object recognition by placing the question of object recognition in the context of the broader question of scene understanding. This is achieved by gathering images of complex everyday scenes containing common objects in their natural context. Objects are labeled using per-instance segmentations to aid in precise object localization. Our dataset contains photos of 91 object types that would be easily recognizable by a 4-year-old. With a total of 2.5 million labeled instances in 328k images, the creation of our dataset drew upon extensive crowd worker involvement via novel user interfaces for category detection, instance spotting, and instance segmentation. We present a detailed statistical analysis of the dataset in comparison to PASCAL, ImageNet, and SUN. Finally, we provide baseline performance analysis for bounding box and segmentation detection results using a Deformable Parts Model.
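The per-instance segmentation annotations described above are typically accessed through the COCO API (pycocotools). The sketch below shows one way to load an image's instance masks; the annotation file path is a placeholder assumption, not something specified in the paper.

```python
# Minimal sketch of reading COCO per-instance segmentations with pycocotools.
# The annotation file path below is a hypothetical placeholder.
from pycocotools.coco import COCO

coco = COCO("annotations/instances_train.json")  # hypothetical path

# Pick one image and fetch all instance annotations for it.
img_id = coco.getImgIds()[0]
ann_ids = coco.getAnnIds(imgIds=img_id, iscrowd=False)
anns = coco.loadAnns(ann_ids)

for ann in anns:
    category = coco.loadCats(ann["category_id"])[0]["name"]
    mask = coco.annToMask(ann)  # binary per-instance mask, H x W
    print(category, ann["bbox"], int(mask.sum()))
```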


computer vision and pattern recognition | 2017

Feature Pyramid Networks for Object Detection

Tsung-Yi Lin; Piotr Dollár; Ross B. Girshick; Kaiming He; Bharath Hariharan; Serge J. Belongie

Feature pyramids are a basic component in recognition systems for detecting objects at different scales. But pyramid representations have been avoided in recent object detectors that are based on deep convolutional networks, partially because they are slow to compute and memory intensive. In this paper, we exploit the inherent multi-scale, pyramidal hierarchy of deep convolutional networks to construct feature pyramids with marginal extra cost. A top-down architecture with lateral connections is developed for building high-level semantic feature maps at all scales. This architecture, called a Feature Pyramid Network (FPN), shows significant improvement as a generic feature extractor in several applications. Using a basic Faster R-CNN system, our method achieves state-of-the-art single-model results on the COCO detection benchmark without bells and whistles, surpassing all existing single-model entries including those from the COCO 2016 challenge winners. In addition, our method can run at 5 FPS on a GPU and thus is a practical and accurate solution to multi-scale object detection. Code will be made publicly available.
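The core of the architecture is the top-down pathway with lateral connections that merges high-level semantics into every scale. The PyTorch-style sketch below is an illustrative reconstruction of that pathway, not the authors' released code; the channel counts, 1x1 lateral convolutions, and nearest-neighbour upsampling are assumptions chosen for clarity.

```python
# Illustrative sketch of an FPN-style top-down pathway with lateral connections.
# Channel widths and layer choices are assumptions, not the paper's exact code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class FPNTopDown(nn.Module):
    def __init__(self, in_channels=(256, 512, 1024, 2048), out_channels=256):
        super().__init__()
        # 1x1 lateral convs reduce each backbone stage (C2..C5) to a common width.
        self.lateral = nn.ModuleList(
            nn.Conv2d(c, out_channels, kernel_size=1) for c in in_channels
        )
        # 3x3 convs smooth the merged maps to produce the pyramid levels P2..P5.
        self.output = nn.ModuleList(
            nn.Conv2d(out_channels, out_channels, kernel_size=3, padding=1)
            for _ in in_channels
        )

    def forward(self, feats):  # feats = [C2, C3, C4, C5], fine to coarse
        laterals = [conv(f) for conv, f in zip(self.lateral, feats)]
        # Top-down: upsample the coarser map and add the lateral connection.
        for i in range(len(laterals) - 1, 0, -1):
            laterals[i - 1] = laterals[i - 1] + F.interpolate(
                laterals[i], scale_factor=2, mode="nearest"
            )
        return [conv(x) for conv, x in zip(self.output, laterals)]

# Dummy ResNet-style feature maps at strides 4, 8, 16, 32.
feats = [torch.randn(1, c, s, s) for c, s in zip((256, 512, 1024, 2048), (64, 32, 16, 8))]
pyramid = FPNTopDown()(feats)  # four maps, each with 256 channels
```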


european conference on computer vision | 2016

Learning to Refine Object Segments

Pedro H. O. Pinheiro; Tsung-Yi Lin; Ronan Collobert; Piotr Dollár

Object segmentation requires both object-level information and low-level pixel data. This presents a challenge for feedforward networks: lower layers in convolutional nets capture rich spatial information, while upper layers encode object-level knowledge but are invariant to factors such as pose and appearance. In this work we propose to augment feedforward nets for object segmentation with a novel top-down refinement approach. The resulting bottom-up/top-down architecture is capable of efficiently generating high-fidelity object masks. Similarly to skip connections, our approach leverages features at all layers of the net. Unlike skip connections, our approach does not attempt to output independent predictions at each layer. Instead, we first output a coarse ‘mask encoding’ in a feedforward pass, then refine this mask encoding in a top-down pass utilizing features at successively lower layers. The approach is simple, fast, and effective. Building on the recent DeepMask network for generating object proposals, we show accuracy improvements of 10–20% in average recall for various setups. Additionally, by optimizing the overall network architecture, our approach, which we call SharpMask, is 50% faster than the original DeepMask network (under 0.8 s per image).
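The refinement idea is that a coarse mask encoding produced in the feedforward pass is progressively fused with earlier, higher-resolution features on the way back down. The sketch below is an illustrative PyTorch reconstruction of one such refinement step under assumed channel widths and a simple concatenate-and-convolve fusion; it is not the authors' implementation.

```python
# Illustrative sketch of a SharpMask-style top-down refinement step:
# fuse the coarse mask encoding with a skip feature from a lower layer,
# then upsample. Channel widths and fusion details are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class RefinementModule(nn.Module):
    def __init__(self, skip_channels, mask_channels):
        super().__init__()
        self.skip_conv = nn.Conv2d(skip_channels, mask_channels, 3, padding=1)
        self.merge_conv = nn.Conv2d(2 * mask_channels, mask_channels, 3, padding=1)

    def forward(self, mask_encoding, skip_feature):
        skip = F.relu(self.skip_conv(skip_feature))
        merged = torch.cat([mask_encoding, skip], dim=1)
        refined = F.relu(self.merge_conv(merged))
        # Double the spatial resolution on the way back toward image size.
        return F.interpolate(refined, scale_factor=2, mode="bilinear", align_corners=False)

# One refinement step on dummy tensors at matching 28x28 resolution.
mask_enc = torch.randn(1, 32, 28, 28)
skip_feat = torch.randn(1, 256, 28, 28)
out = RefinementModule(256, 32)(mask_enc, skip_feat)
print(out.shape)  # torch.Size([1, 32, 56, 56])
```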


computer vision and pattern recognition | 2015

Learning deep representations for ground-to-aerial geolocalization

Tsung-Yi Lin; Yin Cui; Serge J. Belongie; James Hays

The recent availability of geo-tagged images and rich geospatial data has inspired a number of algorithms for image based geolocalization. Most approaches predict the location of a query image by matching to ground-level images with known locations (e.g., street-view data). However, most of the Earth does not have ground-level reference photos available. Fortunately, more complete coverage is provided by oblique aerial or “bird's eye” imagery. In this work, we localize a ground-level query image by matching it to a reference database of aerial imagery. We use publicly available data to build a dataset of 78K aligned cross-view image pairs. The primary challenge for this task is that traditional computer vision approaches cannot handle the wide baseline and appearance variation of these cross-view pairs. We use our dataset to learn a feature representation in which matching views are near one another and mismatched views are far apart. Our proposed approach, Where-CNN, is inspired by deep learning success in face verification and achieves significant improvements over traditional hand-crafted features and existing deep features learned from other large-scale databases. We show the effectiveness of Where-CNN in finding matches between street view and aerial view imagery and demonstrate the ability of our learned features to generalize to novel locations.
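The training objective described above is an embedding in which matching ground/aerial pairs sit close together and mismatched pairs sit far apart. The sketch below expresses that idea with a standard contrastive loss over paired embeddings; it is illustrative only, and the margin value and normalization are assumptions rather than the paper's exact setup.

```python
# Illustrative sketch of a pair-embedding objective: pull matching
# ground/aerial embeddings together, push mismatches apart by a margin.
# The margin and L2 normalization are assumptions, not the paper's setup.
import torch
import torch.nn.functional as F

def contrastive_loss(ground_emb, aerial_emb, is_match, margin=1.0):
    """ground_emb, aerial_emb: (N, D) embeddings; is_match: (N,) 1.0 for true pairs."""
    ground_emb = F.normalize(ground_emb, dim=1)
    aerial_emb = F.normalize(aerial_emb, dim=1)
    dist = (ground_emb - aerial_emb).pow(2).sum(dim=1).sqrt()
    pos = is_match * dist.pow(2)                          # matching views: small distance
    neg = (1 - is_match) * F.relu(margin - dist).pow(2)   # mismatches: pushed past the margin
    return 0.5 * (pos + neg).mean()

# Dummy batch: two matching pairs and two mismatched pairs.
g = torch.randn(4, 128)
a = torch.randn(4, 128)
labels = torch.tensor([1.0, 1.0, 0.0, 0.0])
print(contrastive_loss(g, a, labels))
```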


computer vision and pattern recognition | 2013

Cross-View Image Geolocalization

Tsung-Yi Lin; Serge J. Belongie; James Hays

The recent availability of large amounts of geotagged imagery has inspired a number of data-driven solutions to the image geolocalization problem. Existing approaches predict the location of a query image by matching it to a database of georeferenced photographs. While there are many geotagged images available on photo sharing and street view sites, most are clustered around landmarks and urban areas. The vast majority of the Earth's land area has no ground-level reference photos available, which limits the applicability of all existing image geolocalization methods. On the other hand, there is no shortage of visual and geographic data that densely covers the Earth - we examine overhead imagery and land cover survey data - but the relationship between this data and ground-level query photographs is complex. In this paper, we introduce a cross-view feature translation approach to greatly extend the reach of image geolocalization methods. We can often localize a query even if it has no corresponding ground-level images in the database. A key idea is to learn the relationship between ground-level appearance and overhead appearance and land cover attributes from sparsely available geotagged ground-level images. We perform experiments over a 1600 km² region containing a variety of scenes and land cover types. For each query, our algorithm produces a probability density over the region of interest.
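The final output is a probability density over the region of interest. One simple way to picture that step is to convert per-cell cross-view match scores into a normalized density over a gridded region, as sketched below; the grid size and the scoring itself are placeholders standing in for the learned cross-view feature translation, not the paper's method.

```python
# Illustrative sketch: turn per-cell match scores into a probability density
# over a gridded region of interest. The scores here are random placeholders
# standing in for the learned cross-view matching described in the abstract.
import torch
import torch.nn.functional as F

grid_h, grid_w = 40, 40                   # hypothetical discretization of the region
scores = torch.randn(grid_h, grid_w)      # placeholder match scores per grid cell

density = F.softmax(scores.flatten(), dim=0).reshape(grid_h, grid_w)
print(float(density.sum()))               # sums to 1 over the region
best = int(torch.argmax(density))
print(divmod(best, grid_w))               # most likely (row, col) cell for the query
```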


arXiv: Computer Vision and Pattern Recognition | 2015

Microsoft COCO Captions: Data Collection and Evaluation Server

Xinlei Chen; Hao Fang; Tsung-Yi Lin; Ramakrishna Vedantam; Saurabh Gupta; Piotr Dollár; C. Lawrence Zitnick


international conference on computer vision | 2017

Focal Loss for Dense Object Detection

Tsung-Yi Lin; Priya Goyal; Ross B. Girshick; Kaiming He; Piotr Dollár


british machine vision conference | 2016

A MultiPath Network for Object Detection

Sergey Zagoruyko; Adam Lerer; Tsung-Yi Lin; Pedro H. O. Pinheiro; Sam Gross; Soumith Chintala; Piotr Dollár


IEEE Transactions on Pattern Analysis and Machine Intelligence | 2018

Focal loss for dense object detection

Tsung-Yi Lin; Priya Goyal; Ross B. Girshick; Kaiming He; Piotr Dollár


Large-Scale Visual Geo-Localization | 2016

Cross-View Image Geo-localization

Tsung-Yi Lin; Serge J. Belongie; James Hays

Collaboration


Dive into Tsung-Yi Lin's collaborations.

Top Co-Authors

James Hays

Georgia Institute of Technology

Pedro H. O. Pinheiro

École Polytechnique Fédérale de Lausanne
