Jim Gemmell
Microsoft
Publications
Featured research published by Jim Gemmell.
ACM Multimedia | 2002
Jim Gemmell; Gordon Bell; Roger Lueder; Steven M. Drucker; Curtis G. Wong
MyLifeBits is a project to fulfill the Memex vision first posited by Vannevar Bush in 1945. It is a system for storing all of one's digital media, including documents, images, sounds, and videos. It is built on four principles: (1) collections and search must replace hierarchy for organization, (2) many visualizations should be supported, (3) annotations are critical to non-text media and must be made easy, and (4) authoring should be via transclusion.
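The four principles suggest a storage model in which items sit in a flat store and gain organization only through annotations, queries, and typed links. The sketch below is one minimal way to picture that model; the class and method names are illustrative and are not taken from the MyLifeBits implementation.

# Minimal sketch of a hierarchy-free media store in the spirit of the four
# principles above. All names here are illustrative, not from the paper.
from dataclasses import dataclass, field


@dataclass
class Item:
    item_id: str
    kind: str                      # "document", "image", "audio", "video"
    annotations: list = field(default_factory=list)   # free-text annotations
    links: list = field(default_factory=list)         # (link_type, target_id)


class Store:
    """Flat store: no folders; organization comes from search and links."""

    def __init__(self):
        self.items = {}

    def add(self, item):
        self.items[item.item_id] = item

    def annotate(self, item_id, text):
        # Principle 3: annotation must be easy, especially for non-text media.
        self.items[item_id].annotations.append(text)

    def link(self, source_id, link_type, target_id):
        # Principle 4: authoring via transclusion is modeled as a typed link
        # that includes the target rather than copying it.
        self.items[source_id].links.append((link_type, target_id))

    def search(self, keyword):
        # Principle 1: collections are the results of queries, not folders.
        return [i for i in self.items.values()
                if any(keyword in a for a in i.annotations)]


store = Store()
store.add(Item("photo42", "image"))
store.annotate("photo42", "hiking trip, Mount Rainier")
print([i.item_id for i in store.search("Rainier")])   # ['photo42']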
Communications of the ACM | 2006
Jim Gemmell; Gordon Bell; Roger Lueder
Developing a platform for recording, storing, and accessing a personal lifetime archive.
ACM Workshop on Continuous Archival and Retrieval of Personal Experiences | 2004
Jim Gemmell; Lyndsay Williams; Ken Wood; Roger Lueder; Gordon Bell
Passive capture lets people record their experiences without having to operate recording equipment, and without even having to give recording conscious thought. The advantages are increased capture and improved participation in the event itself. However, passive capture also presents many new challenges. One key challenge is how to deal with the increased volume of media for retrieval, browsing, and organizing. This paper describes the SenseCam device, which combines a camera with a number of sensors in a pendant worn around the neck. Data from SenseCam is uploaded into a MyLifeBits repository, where a number of features, especially correlation and relationships, are used to manage the data.
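One concrete reading of "correlation and relationships" is temporal: a passively captured photo can be related to the sensor readings recorded around the same moment, so later retrieval can filter on that context. The following sketch assumes simple timestamped records; the field names and the 30-second matching window are illustrative assumptions, not the actual SenseCam or MyLifeBits format.

# Sketch: attach nearby sensor readings to each passively captured photo by
# timestamp, so browsing can filter on context (e.g. light level, motion).
# Field names and thresholds are illustrative assumptions, not the real schema.
from bisect import bisect_left

photos = [  # (timestamp_seconds, photo_id)
    (1000, "img_001"), (1012, "img_002"), (1900, "img_003"),
]
sensor_log = sorted([  # (timestamp_seconds, reading)
    (995, {"light": 300, "moving": True}),
    (1010, {"light": 320, "moving": True}),
    (1895, {"light": 40, "moving": False}),
])


def nearest_reading(ts, log, max_gap=30):
    """Return the sensor reading closest to ts, or None if too far away."""
    times = [t for t, _ in log]
    i = bisect_left(times, ts)
    candidates = [j for j in (i - 1, i) if 0 <= j < len(log)]
    best = min(candidates, key=lambda j: abs(times[j] - ts))
    return log[best][1] if abs(times[best] - ts) <= max_gap else None


for ts, photo_id in photos:
    print(photo_id, nearest_reading(ts, sensor_log))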
Very Large Data Bases | 2012
Bo Zhao; Benjamin I. P. Rubinstein; Jim Gemmell; Jiawei Han
In practical data integration systems, it is common for the data sources being integrated to provide conflicting information about the same entity. Consequently, a major challenge for data integration is to derive the most complete and accurate integrated records from diverse and sometimes conflicting sources. We term this challenge the truth finding problem. We observe that some sources are generally more reliable than others, and therefore a good model of source quality is the key to solving the truth finding problem. In this work, we propose a probabilistic graphical model that can automatically infer true records and source quality without any supervision. In contrast to previous methods, our principled approach leverages a generative process of two types of errors (false positive and false negative) by modeling two different aspects of source quality. In so doing, ours is also the first approach designed to merge multi-valued attribute types. Our method is scalable, due to an efficient sampling-based inference algorithm that needs very few iterations in practice and enjoys linear time complexity, with an even faster incremental variant. Experiments on two real-world datasets show that our new method outperforms existing state-of-the-art approaches to the truth finding problem.
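The abstract does not spell out the model, but its core idea, giving every source separate false-positive and false-negative rates and inferring truth and source quality jointly, can be illustrated with a simple alternating scheme. This is a hedged EM-style stand-in for intuition only, not the paper's graphical model or its sampling algorithm.

# Sketch of two-sided truth finding: each source has a false-positive rate and
# a false-negative rate; we alternate between (a) inferring which claimed
# values are true and (b) re-estimating each source's two error rates.
# Simplified illustration only, not the paper's actual model or sampler.
import math

# claims[source] = values that source asserts for one multi-valued attribute
claims = {
    "src_a": {"alice", "bob"},
    "src_b": {"alice", "bob", "zoe"},      # extra (possibly false) value
    "src_c": {"alice"},                    # missing (possibly true) value
}
all_values = set().union(*claims.values())

fp = {s: 0.1 for s in claims}   # P(source asserts value | value is false)
fn = {s: 0.2 for s in claims}   # P(source omits value  | value is true)

truth = {}
for _ in range(20):
    # E-step: decide which values look true under the current error rates.
    for v in all_values:
        log_true = log_false = 0.0
        for s in claims:
            if v in claims[s]:
                log_true += math.log(1 - fn[s])
                log_false += math.log(fp[s])
            else:
                log_true += math.log(fn[s])
                log_false += math.log(1 - fp[s])
        truth[v] = log_true > log_false
    # M-step: re-estimate each source's two error rates (with smoothing).
    true_vals = {v for v in all_values if truth[v]}
    false_vals = all_values - true_vals
    for s in claims:
        fn[s] = (len(true_vals - claims[s]) + 1) / (len(true_vals) + 2)
        fp[s] = (len(claims[s] & false_vals) + 1) / (len(false_vals) + 2)

print(truth)   # e.g. {'alice': True, 'bob': True, 'zoe': False}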
IEEE International Conference on Automatic Face and Gesture Recognition | 2002
Rogério Schmidt Feris; Jim Gemmell; Kentaro Toyama; Volker Krüger
We present a technique for facial feature localization using a two-level hierarchical wavelet network. The first level wavelet network is used for face matching, and yields an affine transformation used for a rough approximation of feature locations. Second level wavelet networks for each feature are then used to fine-tune the feature locations. Construction of a training database containing hierarchical wavelet networks of many faces allows features to be detected in most faces. Experiments show that facial feature localization benefits significantly from the hierarchical approach. Results compare favorably with existing techniques for feature localization.
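The hierarchy can be illustrated without the wavelet machinery: the first level's affine transform maps canonical feature positions roughly into the image, and a small local search then fine-tunes each one. The sketch below substitutes plain SSD template matching for the second-level wavelet networks, so it shows only the coarse-to-fine structure, not the paper's actual representation.

# Coarse-to-fine feature localization sketch. A global affine transform (here
# assumed to come from some face-matching step) gives rough feature positions;
# a local search around each rough position then fine-tunes it.
import numpy as np

rng = np.random.default_rng(0)
image = rng.random((120, 120))                 # toy grayscale "face" image

# Mean feature locations in a canonical face frame (illustrative values).
mean_features = {"left_eye": (40, 45), "right_eye": (80, 45), "mouth": (60, 90)}

# Affine transform standing in for the output of the first-level face match.
A = np.array([[0.98, -0.05], [0.05, 0.98]])
t = np.array([3.0, -2.0])


def rough_position(xy):
    """Level 1: map a canonical feature location into the image."""
    return A @ np.asarray(xy, dtype=float) + t


def refine(img, center, template, radius=4):
    """Level 2: slide the feature template in a small window around `center`
    and return the best-matching (lowest SSD) position."""
    h, w = template.shape
    cx, cy = int(round(center[0])), int(round(center[1]))
    best, best_pos = np.inf, (cx, cy)
    for dx in range(-radius, radius + 1):
        for dy in range(-radius, radius + 1):
            x, y = cx + dx, cy + dy
            patch = img[y:y + h, x:x + w]
            if patch.shape != template.shape:
                continue
            ssd = float(((patch - template) ** 2).sum())
            if ssd < best:
                best, best_pos = ssd, (x, y)
    return best_pos


for name, canonical in mean_features.items():
    rough = rough_position(canonical)
    template = rng.random((9, 9))              # placeholder feature template
    print(name, "rough:", np.round(rough, 1), "refined:", refine(image, rough, template))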
Communications of the ACM | 2006
Mary Czerwinski; Douglas W. Gage; Jim Gemmell; Catherine C. Marshall; Manuel A. Pérez-Quiñones; Meredith M. Skeels; Tiziana Catarci
A lifetime of digital memories is possible but raises many social, as well as technological, questions.
ACM SIGMM Workshop on Experiential Telepresence | 2003
Jim Gemmell; Roger Lueder; Gordon Bell
Storage trends have brought us to the point where it is affordable to keep a complete digital record of one's life, and capture methods are multiplying. To experiment with a lifetime store, we are digitizing everything possible from Gordon Bell's life. The MyLifeBits system is designed to store and manage a lifetime's worth of data. MyLifeBits enables the capture of web pages, telephone, radio, and television. This demonstration highlights the application of typed links and database features to make a lifetime store something that is truly useful.
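Because the demonstration centers on typed links backed by a database, one way to picture the store is as two tables, one for captured items and one for typed links between them, queried with ordinary SQL. The table and column names below are illustrative assumptions, not the real MyLifeBits schema.

# Sketch: a lifetime store as two SQLite tables, with typed links connecting
# captured items (web pages, phone calls, radio/TV segments, photos, ...).
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE item (id INTEGER PRIMARY KEY, kind TEXT, title TEXT, captured_at TEXT);
    CREATE TABLE link (source INTEGER, target INTEGER, link_type TEXT);
""")
db.executemany("INSERT INTO item VALUES (?, ?, ?, ?)", [
    (1, "web_page", "Memex article", "2003-05-01T10:00"),
    (2, "phone_call", "Call with Gordon", "2003-05-01T10:30"),
    (3, "document", "Meeting notes", "2003-05-01T11:00"),
])
db.executemany("INSERT INTO link VALUES (?, ?, ?)", [
    (3, 1, "cites"),             # the notes cite the captured web page
    (3, 2, "occurred_during"),   # the notes were written during the call
])

# Database features do the organizing: "what does item 3 point to, and how?"
for row in db.execute("""
    SELECT link.link_type, item.kind, item.title
    FROM link JOIN item ON item.id = link.target
    WHERE link.source = 3
"""):
    print(row)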
International Conference on Multimedia and Expo | 2005
Jim Gemmell; Aleks Aris; Roger Lueder
User-authored stories will always be the best stories, and authoring tools will continue to be developed. However, a digital lifetime capture permits storytelling via a lightweight markup structure, combined with location, sensor, and usage data. In this paper, we describe support in the MyLifeBits system for such an approach, along with some simple authoring tools.
International Conference on Multimedia Computing and Systems | 1998
Jim Gemmell; Eve Schooler; Roger G. Kermode
We have developed a scalable reliable multicast architecture for delivering one-to-many telepresentations. In contrast to audio and video, which are often transmitted unreliably, other media, such as slides, images, and animations, require reliability. Our approach transmits the data in two layers. One layer is for session-persistent data, with reliability achieved by FEC alone, using the Fcast protocol. The other layer is for dynamic data, with reliability achieved using the ECSRM protocol, which combines FEC with NACK suppression. Our approach is scalable to large heterogeneous receiver sets, and supports late-joining receivers. We have implemented our approach in a multicast version of PowerPoint, a graphical slide presentation tool.
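The two-layer split rests on the basic erasure-coding property that redundant packets let receivers repair losses without per-receiver retransmission. The toy sketch below uses a single XOR parity packet to show that property; Fcast and ECSRM use stronger FEC codes and add the session and NACK-suppression machinery the abstract describes.

# Toy erasure-coding sketch: k data packets plus one XOR parity packet let a
# receiver recover any single lost packet without asking the sender to resend.
# Real Fcast/ECSRM use stronger FEC; this only shows the principle.

def xor_bytes(blocks):
    out = bytearray(len(blocks[0]))
    for b in blocks:
        for i, byte in enumerate(b):
            out[i] ^= byte
    return bytes(out)

# Sender: split a slide into fixed-size packets and append one parity packet.
packets = [b"SLIDE-P1", b"SLIDE-P2", b"SLIDE-P3"]
parity = xor_bytes(packets)
sent = packets + [parity]

# Receiver: suppose packet index 1 is lost in the multicast.
received = {0: sent[0], 2: sent[2], 3: sent[3]}
missing = [i for i in range(len(packets)) if i not in received]
if len(missing) == 1:
    # XOR of everything that did arrive reconstructs the missing data packet.
    recovered = xor_bytes(list(received.values()))
    print("recovered packet", missing[0], "=", recovered)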
Conference on Information and Knowledge Management | 2012
Sahand Negahban; Benjamin I. P. Rubinstein; Jim Gemmell
We consider a serious, previously-unexplored challenge facing almost all approaches to scaling up entity resolution (ER) to multiple data sources: the prohibitive cost of labeling training data for supervised learning of similarity scores for each pair of sources. While there exists a rich literature describing almost all aspects of pairwise ER, this new challenge is arising now due to the unprecedented ability to acquire and store data from online sources, interest in features driven by ER such as enriched search verticals, and the uniqueness of noisy and missing data characteristics for each source. We show on real-world and synthetic data that, for state-of-the-art techniques, the reality of heterogeneous sources means that the amount of labeled training data must scale quadratically in the number of sources just to maintain constant precision/recall. We address this challenge with a brand-new transfer learning algorithm which requires far less training data (or equivalently, achieves superior accuracy with the same data) and is trained using fast convex optimization. The intuition behind our approach is to adaptively share structure learned about one scoring problem with all other scoring problems sharing a data source in common. We demonstrate that our theoretically-motivated approach improves upon existing techniques for multi-source ER.
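One way to picture the sharing idea is to give every source its own weight vector and score a pair of sources with the sum of a shared component and the two per-source components, trained with a convex loss, so labels for one source pair also inform every other pair that shares a source. The sketch below is only an illustration of that structure, with made-up features, not the paper's algorithm.

# Sketch of parameter sharing across entity-resolution scoring tasks: a pair of
# records from sources (s, t) is scored with weights  w_shared + w_s + w_t.
# Convex (logistic) loss; illustrative only, not the paper's formulation.
import numpy as np

rng = np.random.default_rng(1)
sources, dim = ["A", "B", "C"], 4

# Training data: (source1, source2, feature_vector, is_same_entity)
def fake_pair(label):
    base = rng.normal(size=dim)
    return base + (1.5 if label else -1.5), label

train = [("A", "B", *fake_pair(l)) for l in rng.integers(0, 2, 200)] \
      + [("B", "C", *fake_pair(l)) for l in rng.integers(0, 2, 200)]

w = {name: np.zeros(dim) for name in ["shared"] + sources}

def score(s, t, x):
    return float((w["shared"] + w[s] + w[t]) @ x)

# Plain stochastic gradient descent on logistic loss with L2 regularization.
lr, reg = 0.05, 0.01
for _ in range(200):
    for s, t, x, y in train:
        p = 1.0 / (1.0 + np.exp(-score(s, t, x)))
        g = (p - y) * x
        for key in ("shared", s, t):
            w[key] -= lr * (g + reg * w[key])

# The A-C pair had no labeled data, but its score still benefits from the
# structure learned via the A-B and B-C labels.
x_match, _ = fake_pair(True)
print("A-C match score:", round(score("A", "C", x_match), 2))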