Warren Shen
University of Wisconsin-Madison
Publications
Featured research published by Warren Shen.
international conference on data engineering | 2008
Robert Emmett McCann; Warren Shen; AnHai Doan
When integrating data from multiple sources, a key task that online communities often face is to match the schemas of the data sources. Today, such matching often incurs a huge workload that overwhelms the relatively small set of volunteer integrators. In such cases, community members may not even volunteer to be integrators, due to the high workload, and consequently no integration systems can be built. To address this problem, we propose to enlist the multitude of users in the community to help match the schemas, in a Web 2.0 fashion. We discuss the challenges of this approach and provide initial solutions. Finally, we describe an extensive set of experiments on both real-world and synthetic data that demonstrate the utility of the approach.
international conference on management of data | 2009
AnHai Doan; Jeffrey F. Naughton; Raghu Ramakrishnan; Akanksha Baid; Xiaoyong Chai; Fei Chen; Ting Chen; Eric Chu; Pedro DeRose; Byron J. Gao; Chaitanya Gokhale; Jiansheng Huang; Warren Shen; Ba-Quy Vuong
Over the past few years, we have been trying to build an end-to-end system at Wisconsin to manage unstructured data, using extraction, integration, and user interaction. This paper describes the key information extraction (IE) challenges that we have run into, and sketches our solutions. We discuss in particular developing a declarative IE language, optimizing for this language, generating IE provenance, incorporating user feedback into the IE process, developing a novel wiki-based user interface for feedback, best-effort IE, pushing IE into RDBMSs, and more. Our work suggests that IE in managing unstructured data can open up many interesting research challenges, and that these challenges can greatly benefit from the wealth of work on managing structured data that has been carried out by the database community.
international conference on data engineering | 2005
Robert McCann; Alexander Kramnik; Warren Shen; Vanitha Varadarajan; Olu Sobulo; AnHai Doan
The rapid growth of distributed data at enterprises and on the WWW has fueled significant interest in building data integration systems. Such a system provides users with a uniform query interface (called a mediated schema) to a multitude of data sources, thus freeing them from manually querying each individual source. To address some problems in the MOBS (Mass Collaboration to Build Systems) project at the University of Illinois, we develop solutions that learn from the multitude of users in the integration environment to improve the accuracy of integration tools. The improved accuracy, in turn, can significantly reduce the workload of the system builder. In developing MOBS, we address the following key challenges: (i) obtaining user participation, (ii) learning from user participation, and (iii) combining user answers.
very large data bases | 2007
Warren Shen; AnHai Doan; Jeffrey F. Naughton; Raghu Ramakrishnan
IEEE Data(base) Engineering Bulletin | 2006
AnHai Doan; Raghu Ramakrishnan; Fei Chen; Pedro DeRose; Yoonkyong Lee; Robert J. McCann; Mayssam Sayyadian; Warren Shen
conference on innovative data systems research | 2007
Pedro DeRose; Warren Shen; Fei Chen; Yoonkyong Lee; Douglas Burdick; AnHai Doan; Raghu Ramakrishnan
very large data bases | 2007
Pedro DeRose; Warren Shen; Fei Chen; AnHai Doan; Raghu Ramakrishnan
national conference on artificial intelligence | 2005
Warren Shen; Xin Li; AnHai Doan
international conference on management of data | 2008
Warren Shen; Pedro DeRose; Robert Emmett McCann; AnHai Doan; Raghu Ramakrishnan
IEEE Data(base) Engineering Bulletin | 2010
Hector Gonzalez; Alon Y. Halevy; Anno Langen; Jayant Madhavan; Rod McChesney; Rebecca Shapley; Warren Shen; Jonathan Goldberg-Kidon