Publication


Featured research published by Warren Shen.


international conference on data engineering | 2008

Matching Schemas in Online Communities: A Web 2.0 Approach

Robert Emmett McCann; Warren Shen; AnHai Doan

When integrating data from multiple sources, a key task that online communities often face is to match the schemas of the data sources. Today, such matching often incurs a huge workload that overwhelms the relatively small set of volunteer integrators. In such cases, community members may not even volunteer to be integrators, due to the high workload, and consequently no integration systems can be built. To address this problem, we propose to enlist the multitude of users in the community to help match the schemas, in a Web 2.0 fashion. We discuss the challenges of this approach and provide initial solutions. Finally, we describe an extensive set of experiments on both real-world and synthetic data that demonstrate the utility of the approach.


international conference on management of data | 2009

Information extraction challenges in managing unstructured data

AnHai Doan; Jeffrey F. Naughton; Raghu Ramakrishnan; Akanksha Baid; Xiaoyong Chai; Fei Chen; Ting Chen; Eric Chu; Pedro DeRose; Byron J. Gao; Chaitanya Gokhale; Jiansheng Huang; Warren Shen; Ba-Quy Vuong

Over the past few years, we have been trying to build an end-to-end system at Wisconsin to manage unstructured data, using extraction, integration, and user interaction. This paper describes the key information extraction (IE) challenges that we have run into, and sketches our solutions. We discuss in particular developing a declarative IE language, optimizing for this language, generating IE provenance, incorporating user feedback into the IE process, developing a novel wiki-based user interface for feedback, best-effort IE, pushing IE into RDBMSs, and more. Our work suggests that IE in managing unstructured data can open up many interesting research challenges, and that these challenges can greatly benefit from the wealth of work on managing structured data that has been carried out by the database community.


international conference on data engineering | 2005

Integrating data from disparate sources: a mass collaboration approach

Robert McCann; Alexander Kramnik; Warren Shen; Vanitha Varadarajan; Olu Sobulo; AnHai Doan

The rapid growth of distributed data at enterprises and on the WWW has fueled significant interest in building data integration systems. Such a system provides users with a uniform query interface (called a mediated schema) to a multitude of data sources, thus freeing them from manually querying each individual source. In the MOBS (Mass Collaboration to Build Systems) project at the University of Illinois, we develop solutions that learn from the multitude of users in the integration environment to improve the accuracy of integration tools. The improved accuracy can in turn significantly reduce the workload of the system builder. In developing MOBS we address the following key challenges: (i) obtaining user participation, (ii) learning from user participation, and (iii) combining user answers.


very large data bases | 2007

Declarative information extraction using datalog with embedded extraction predicates

Warren Shen; AnHai Doan; Jeffrey F. Naughton; Raghu Ramakrishnan


IEEE Data(base) Engineering Bulletin | 2006

Community Information Management.

AnHai Doan; Raghu Ramakrishnan; Fei Chen; Pedro DeRose; Yoonkyong Lee; Robert J. McCann; Mayssam Sayyadian; Warren Shen


conference on innovative data systems research | 2007

DBLife: A Community Information Management Platform for the Database Research Community (Demonstration)

Pedro DeRose; Warren Shen; Fei Chen; Yoonkyong Lee; Douglas Burdick; AnHai Doan; Raghu Ramakrishnan


very large data bases | 2007

Building structured web community portals: a top-down, compositional, and incremental approach

Pedro DeRose; Warren Shen; Fei Chen; AnHai Doan; Raghu Ramakrishnan


national conference on artificial intelligence | 2005

Constraint-based entity matching

Warren Shen; Xin Li; AnHai Doan


international conference on management of data | 2008

Toward best-effort information extraction

Warren Shen; Pedro DeRose; Robert Emmett McCann; AnHai Doan; Raghu Ramakrishnan


IEEE Data(base) Engineering Bulletin | 2010

Socialising Data with Google Fusion Tables.

Hector Gonzalez; Alon Y. Halevy; Anno Langen; Jayant Madhavan; Rod McChesney; Rebecca Shapley; Warren Shen; Jonathan Goldberg-Kidon

Collaboration


Dive into Warren Shen's collaborations.

Top Co-Authors

AnHai Doan

University of Wisconsin-Madison

Pedro DeRose

University of Wisconsin-Madison

Fei Chen

University of Wisconsin-Madison

Xiaoyong Chai

University of Wisconsin-Madison

Jeffrey F. Naughton

University of Wisconsin-Madison

Akanksha Baid

University of Wisconsin-Madison
