Zhiwu Xie
Virginia Tech
Publication
Featured research published by Zhiwu Xie.
acm/ieee joint conference on digital libraries | 2017
Edward A. Fox; Zhiwu Xie; Martin Klein
This workshop will explore integration of Web archiving and digital libraries, so that the complete life cycle involved is covered: creation/authoring, uploading/publishing in the Web (2.0), (focused) crawling, indexing, exploration (searching, browsing), ..., archiving (of events). It will include particular coverage of current topics of interest: big data, mobile web archiving, and systems (e.g., Memento, SiteStory, Uninterruptible Web Service).
acm/ieee joint conference on digital libraries | 2016
Zhiwu Xie; Yinlin Chen; Julie Speer; Tyler Walters
In this paper, we utilize a set of controlled experiments to benchmark the cost associated with the cloud execution of typical repository functions such as ingestion, fixity checking, and heavy data processing. We focus on the repository service pattern where content is explicitly stored away from where it is processed. We measured the processing speed and unit cost of each scenario using a large sensor dataset and Amazon Web Services (AWS). The initial results reveal three distinct cost patterns: 1) spend more to buy up to proportionally faster services; 2) more money does not necessarily buy better performance; and 3) spend less, but faster. Further investigations into these performance and cost patterns will help repositories to form a more effective operation strategy.
international conference on asian digital libraries | 2015
Zhiwu Xie; Yinlin Chen; Tingting Jiang; Julie Speer; Tyler Walters; Pablo A. Tarazaga; Mary Kasarda
We describe a use- and reuse-driven digital repository integrated with lightweight data analysis capabilities provided by the Docker framework. Using building sensor data collected from the Virginia Tech Goodwin Hall Living Laboratory, we perform evaluations using Amazon EC2 and Container Service with a Fedora 4 repository backed by storage in Amazon S3. The results confirm the viability and benefits of this approach.
acm/ieee joint conference on digital libraries | 2015
Zhiwu Xie; Prashant Chandrasekar; Edward A. Fox
We describe a web archiving application that handles server errors using the most recently archived representation of the requested web resource. The application is developed as an Apache module. It leverages the transactional web archiving tool SiteStory, which archives all previously accessed representations of web resources originating from a website. This application helps to improve the website's quality of service by temporarily masking server errors from the end user and gaining precious time for the system administrator to debug and recover from server failures. By providing pertinent support to website operations, we aim to reduce the resistance to transactional web archiving, which in turn may lead to better coverage of web history.
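The error-masking idea in this abstract can be illustrated with a minimal Python sketch. It is not the Apache module described in the paper and does not use SiteStory's interface: the `ArchiveFallback` class, the in-memory dictionary standing in for the archive, and the choice to return status 200 for a masked error are all assumptions made for illustration.

```python
from typing import Callable, Dict, Tuple

Response = Tuple[int, bytes]  # (HTTP status code, response body)


class ArchiveFallback:
    """Hypothetical wrapper: serve the last archived copy when the origin fails."""

    def __init__(self, origin: Callable[[str], Response]) -> None:
        self.origin = origin
        # Assumption: a dict stands in for a transactional web archive.
        self.archive: Dict[str, bytes] = {}

    def get(self, url: str) -> Response:
        try:
            status, body = self.origin(url)
        except Exception:
            # Treat an unreachable origin like a server error.
            status, body = 500, b""

        if status < 500:
            if status == 200:
                # Archive every successfully served representation.
                self.archive[url] = body
            return status, body

        archived = self.archive.get(url)
        if archived is not None:
            # Mask the server error with the most recent archived copy.
            return 200, archived
        # Nothing archived yet for this URL: pass the error through.
        return status, body
```

A resource that was served successfully at least once stays available during an outage, while a never-accessed resource still surfaces the original error.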
acm/ieee joint conference on digital libraries | 2018
Abhinav Kumar; Zhiwu Xie
Web content acquisition forms the foundation of value extraction from web data. The two main categories of acquisition methods are crawler-based methods and transactional web archiving, or server-side acquisition, methods. In this poster, we propose a new method to acquire web content from web caches. Our method offers a reduced penalty on HTTP transactions, the flexibility to accommodate peak web server loads, and minimal involvement of the system administrator in setting up the system.
acm/ieee joint conference on digital libraries | 2018
Xinyue Wang; Zhiwu Xie
Vibration data from buildings can reflect human activities such as movement. The lack of relevant labeled datasets has been a major challenge for conducting such analysis. We aim to explore the possibility of producing footstep metadata automatically through machine learning techniques. In this paper, we analyze the identification of human footsteps using a deep neural network as a classifier.
International Journal on Digital Libraries | 2018
Edward A. Fox; Martin Klein; Zhiwu Xie
Since 1997, numerous organizations around the world have archived much of the content on the World Wide Web. This movement has spread, with more and more groups participating, so now there is a substantial research, development, operation, and utilization infrastructure supporting the collection, storage, indexing, sharing, and accessing of a large portion of the history of webpages. This infrastructure is shared by individuals, groups, educational institutions, government agencies, corporations, and other entities. Standards have emerged, tools have been devised, and analysis methods have been applied. All of this work helps show how, in the digital world, libraries and archives can be well supported, separately and in combination, by digital library methods. This special issue includes six papers. The authors are from Centrum Wiskunde & Informatica, Amsterdam, Netherlands; George Washington University, Washington, D.C., USA; Leibniz University Hannover, Germany; Los Alamos National Laboratory, New Mexico, USA; The Open University of Israel, Raanana, Israel and University of Haifa, Haifa, Israel; and Virginia Tech, Virginia, USA. This international group includes researchers and developers and
Information services & use | 2017
Zhiwu Xie; Edward A. Fox
Data-intensive science presents new opportunities as well as challenges to research libraries. The cyberinfrastructural challenge, although chiefly technological, also involves socioeconomic and human factors, and therefore requires a deep understanding of what roles research libraries should play in the research lifecycle. This paper discusses the rationale and motivations behind a research project to investigate effective library big data cyberinfrastructure strategies.
international conference on asian digital libraries | 2016
Zhiwu Xie; Julie Speer; Yinlin Chen; Tingting Jiang; Collin Brittle; Paul Mather
We introduce VTechData, a Sufia/Fedora based institutional repository specifically implemented to meet the needs of research data management at Virginia Tech. Despite the rapid maturity of Hydra and Fedora code bases, the gaps between the released packages and a launched production-level service are still many and far from trivial. In this practitioner paper we describe the strategy and efforts through which these gaps were filled and lessons learned in the process of creating our first Hydra/Sufia-based repository.
acm/ieee joint conference on digital libraries | 2016
Edward A. Fox; Zhiwu Xie; Martin Klein
This workshop will explore integration of Web archiving and digital libraries, so that the complete life cycle involved is covered: creation/authoring, uploading/publishing in the Web (2.0), (focused) crawling, indexing, exploration (searching, browsing), archiving (of events), etc. It will include particular coverage of current topics of interest, such as big data, mobile web archiving, and systems (e.g., Memento, SiteStory, Hadoop processing).