Publication


Featured research published by Jan Pieper.


international conference on management of data | 2010

Midas: integrating public financial data

Sreeram V. Balakrishnan; Vivian Chu; Mauricio A. Hernández; Howard Ho; Rajasekar Krishnamurthy; Shixia Liu; Jan Pieper; Jeffrey S. Pierce; Lucian Popa; Christine Robson; Lei Shi; Ioana Stanoi; Edison Lao Ting; Shivakumar Vaithyanathan; Huahai Yang

The primary goal of the Midas project is to build a system that enables easy and scalable integration of unstructured and semi-structured information present across multiple data sources. As a first step in this direction, we have built a system that extracts and integrates information from regulatory filings submitted to the U.S. Securities and Exchange Commission (SEC) and the Federal Deposit Insurance Corporation (FDIC). Midas creates a repository of entities, events, and relationships by extracting, conceptualizing, integrating, and aggregating data from unstructured and semi-structured documents. This repository enables applications to use the extracted and integrated data in a variety of ways including mashups with other public data and complex risk analysis.


conference on information and knowledge management | 2001

Towards speech as a knowledge resource

Eric W. Brown; Savitha Srinivasan; Anni Coden; Dulce B. Ponceleon; James W. Cooper; Arnon Amir; Jan Pieper

Speech is a tantalizing mode of human communication. On the one hand, humans understand speech with ease and use speech to express complex ideas, information, and knowledge. On the other hand, automatic speech recognition with computers is still very hard, and extracting knowledge from speech is even harder. In this paper we motivate the study of speech as a knowledge resource and briefly survey a family of related applications and systems being developed at IBM Research aimed towards the goal of exploiting speech as a knowledge resource.


very large data bases | 2010

Multimodal social intelligence in a real-time dashboard system

Daniel Gruhl; Meenakshi Nagarajan; Jan Pieper; Christine Robson; Amit P. Sheth

Social Networks provide one of the most rapidly evolving data sets in existence today. Traditional Business Intelligence applications struggle to take advantage of such data sets in a timely manner. The BBC SoundIndex, developed by the authors and others, enabled real-time analytics of music popularity using data from a variety of Social Networks. We present this system as a grounding example of how to overcome the challenges of working with this data from social networks. We discuss a variety of technologies to implement near real-time data analytics to transform Social Intelligence into Business Intelligence and evaluate their effectiveness in the music domain. The SoundIndex project helped to highlight a number of key research areas, including named entity recognition and sentiment analysis in Informal English. It also drew attention to the importance of metadata aggregation in multimodal environments. We explored challenges such as drawing data from a wide set of sources spanning a myriad of modalities, developing adjudication techniques to harmonize inputs, and performing deep analytics on extremely challenging Informal English snippets. Ultimately, we seek to provide guidance on developing applications in a variety of domains that allow an analyst to rapidly grasp the evolution in the social landscape, and show how to validate such a system for a real-world application.


IEEE Computer | 2001

Streaming-media knowledge discovery

Jan Pieper; Savitha Srinivasan; Byron Dom

As the amount of streaming audio and video available to World Wide Web users grows, tools for analyzing and indexing this content will become increasingly important. Frequently, knowledge management applications and information portals synthesize unstructured text information from the Web, intranets and partner sites. Given this context, we crawl a statistically significant number of Web pages, detect those that contain streaming media links, crawl the media links to extract associated meta-data, then use the crawl data to build a resource list for Web media. We have used these crawl-data findings to build a media indexing application that uses content-based indexing methods.


human factors in computing systems | 2009

Team analytics: understanding teams in the global workplace

Jan Pieper; Julia Grace; Stephen Dill

Many medium and large companies maintain internal employee directories. Unfortunately, most directories only allow the lookup of individual profiles, one profile at a time. Team Analytics is a novel application that integrates information from disparate enterprise tools for groups of people. Besides accelerating the lookup process, Team Analytics also displays information that is only available in the group context, such as an organizational chart and time zone awareness. We present the Team Analytics application, its integration with our corporate email client, and results from a user survey that evaluates various aspects of the application.


international conference on cloud computing | 2009

MONGOOSE: MONitoring Global Online Opinions via Semantic Extraction

Varun Bhagwan; Tyrone Grandison; Alfredo Alba; Daniel Gruhl; Jan Pieper

The ever-increasing amount of content on the Internet has fostered many efforts seeking to leverage this potentially yottascale information source. Service systems using advanced data and text analytics techniques have been developed to perform knowledge gathering and information discovery over Web data. Information gathered from free and public sources on the Web is frequently integrated with enterprise and proprietary data to create sophisticated service systems able to provide insight in an increasing number of business-critical areas. Unfortunately, for fixed and/or limited resource projects, consistent and reliable ingestion and integration of content often dominate the effort, reducing the time available for developing the core analytics and presentations that differentiate and define an information service. If this initial extraction, translation and loading of information (known as ETL in the database world) can be abstracted for these Web sources, it would provide an important core technology on which Web-based information services could be more rapidly and inexpensively developed and deployed. This paper presents such a system, MONGOOSE, an approach that seeks to reduce the time spent creating a reliable data ingest and integration system and thus reduce the time-to-impact of advanced analytics service solutions.


international conference on web services | 2009

Change Detection and Correction Facilitation for Web Applications and Services

Alfredo Alba; Varun Bhagwan; Tyrone Grandison; Daniel Gruhl; Jan Pieper

There are a large number of websites serving valuable content that can be used by higher-level applications, Web Services, Mashups, etc. Yet, for various reasons (lack of computing resources, financial constraints, etc.), they are unable to provide Web Service APIs to access their data. In their desire to incorporate the latest and greatest technologies, as well as to adopt layouts that users prefer, websites undergo changes over time. These changes can range from minor, e.g., function name changes, to major, e.g., shifting the web platform to AJAX technologies. This paper addresses the problem of detecting layout changes for websites which are unable to provide any Web Service to access their content, yet do not mind others harvesting said content.


international conference on service operations and logistics, and informatics | 2007

Information Enrichment Service Systems

Daniel Gruhl; Kevin Haas; Jan Pieper; Christine Robson; Tony Stuart

Information enrichment is generally considered a modern task of supplying metadata needed for a specific purpose to an existing body of information, with the intent to enable a particular task. In this paper we suggest that information enrichment service systems have existed for thousands of years, and that information has often been enriched in more generically useful ways than just for a single task. Viewing information enrichment as a service system provides a framework for thinking about how to better implement and leverage the wealth of knowledge in modern organizations.


Archive | 2008

Method and apparatus for block size optimization in de-duplication

Subashini Balachandran; Mihail C. Constantinescu; Jan Pieper


international semantic web conference | 2009

Context and Domain Knowledge Enhanced Entity Spotting in Informal Text

Daniel Gruhl; Meenakshi Nagarajan; Jan Pieper; Christine Robson; Amit P. Sheth
