Sam Chapman | Researchain

Publication

Featured research published by Sam Chapman.

Lecture Notes in Computer Science | 2004

Learning to Harvest Information for the Semantic Web

Fabio Ciravegna; Sam Chapman; Alexiei Dingli; Yorick Wilks

In this paper we describe a methodology for harvesting information from large distributed repositories (e.g. large Web sites) with minimum user intervention. The methodology is based on a combination of information extraction, information integration and machine learning techniques. Learning is seeded by extracting information from structured sources (e.g. databases and digital libraries) or a user-defined lexicon. Retrieved information is then used to partially annotate documents. Annotated documents are used to bootstrap learning for simple Information Extraction (IE) methodologies, which in turn will produce more annotation to annotate more documents that will be used to train more complex IE engines and so on. In this paper we describe the methodology and its implementation in the Armadillo system, compare it with the current state of the art, and describe the details of an implemented application. Finally we draw some conclusions and highlight some challenges and future work.
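The bootstrapping loop described in this abstract can be illustrated with a toy sketch. This is not the Armadillo implementation: the function name `bootstrap`, the one-word context window, the `min_contexts` threshold, and the sample sentences are all assumptions made for illustration. A seed lexicon annotates the documents, the surrounding word contexts act as simple extraction rules, and candidates found in enough distinct contexts are added to the lexicon for the next round.

```python
import re

def bootstrap(docs, seeds, rounds=2, min_contexts=2):
    """Toy loop: seed terms -> contexts -> new terms -> re-seed (hypothetical)."""
    lexicon = set(seeds)
    tokenized = [re.findall(r"\w+", d) for d in docs]
    for _ in range(rounds):
        # 1. Annotate: record the (left word, right word) context of each known term.
        contexts = set()
        for toks in tokenized:
            for i in range(1, len(toks) - 1):
                if toks[i] in lexicon:
                    contexts.add((toks[i - 1], toks[i + 1]))
        # 2. Extract: find unknown terms that occur in a learned context.
        support = {}
        for toks in tokenized:
            for i in range(1, len(toks) - 1):
                ctx = (toks[i - 1], toks[i + 1])
                if ctx in contexts and toks[i] not in lexicon:
                    support.setdefault(toks[i], set()).add(ctx)
        # 3. Confirm: accept a candidate only if seen in enough distinct contexts.
        accepted = {t for t, ctxs in support.items() if len(ctxs) >= min_contexts}
        if not accepted:
            break
        lexicon |= accepted  # re-seed the next round
    return lexicon

# Invented sample data, for illustration only.
DOCS = [
    "Professor Smith teaches databases",
    "Dr Smith published papers",
    "Professor Jones teaches logic",
    "Dr Jones published results",
    "Professor Brown teaches",
]

if __name__ == "__main__":
    print(sorted(bootstrap(DOCS, {"Smith"})))
```

With the sample data, "Jones" appears in two learned contexts and is accepted, while "Brown" appears in only one and is held back. Lowering `min_contexts` to 1 trades precision for recall.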

European Semantic Web Conference | 2008

Hybrid search: effectively combining keywords and semantic searches

Ravish Bhagdev; Sam Chapman; Fabio Ciravegna; Vitaveska Lanfranchi; Daniela Petrelli

This paper describes hybrid search, a search method supporting both document and knowledge retrieval via the flexible combination of ontology-based search and keyword-based matching. Hybrid search smoothly copes with lack of semantic coverage of document content, which is one of the main limitations of current semantic search methods. In this paper we define hybrid search formally, discuss its compatibility with the current semantic trends and present a reference implementation: K-Search. We then show how the method outperforms both keyword-based search and pure semantic search in terms of precision and recall in a set of experiments performed on a collection of about 18,000 technical documents. Experiments carried out with professional users show that users understand the paradigm and consider it very powerful and reliable. K-Search has been ported to two applications released at Rolls-Royce plc for searching technical documentation about jet engines.

Journal of Intelligent Manufacturing | 2009

Applying semantic web technologies to knowledge sharing in aerospace engineering

Aba-Sah Dadzie; Ravish Bhagdev; Ajay Chakravarthy; Sam Chapman; José Iria; Vitaveska Lanfranchi; João Magalhães; Daniela Petrelli; Fabio Ciravegna

This paper details an integrated methodology to optimise knowledge reuse and sharing, illustrated with a use case in the aeronautics domain. It uses ontologies as a central modelling strategy for the capture of knowledge from legacy documents via automated means, or directly in systems interfacing with knowledge workers, via user-defined, web-based forms. The domain ontologies used for knowledge capture also guide the retrieval of the knowledge extracted from the data using a semantic search system that provides support for multiple modalities during search. This approach has been applied and evaluated successfully within the aerospace domain, and is currently being extended for use in other domains on an increasingly large scale.

Philosophical Transactions of the Royal Society A | 2009

The Archaeotools project: faceted classification and natural language processing in an archaeological context

Stuart Jeffrey; Julian D. Richards; Fabio Ciravegna; S. Waller; Sam Chapman; Ziqi Zhang

This paper describes ‘Archaeotools’, a major e-Science project in archaeology. The aim of the project is to use faceted classification and natural language processing to create an advanced infrastructure for archaeological research. The project aims to integrate over 1×10^6 structured database records referring to archaeological sites and monuments in the UK, with information extracted from semi-structured grey literature reports, and unstructured antiquarian journal accounts, in a single faceted browser interface. The project has illuminated the variable level of vocabulary control and standardization that currently exists within national and local monument inventories. Nonetheless, it has demonstrated that the relatively well-defined ontologies and thesauri that exist in archaeology mean that a high level of success can be achieved using information extraction techniques. This has great potential for unlocking and making accessible the information held in grey literature and antiquarian accounts, and has lessons for allied disciplines.

International ACM SIGIR Conference on Research and Development in Information Retrieval | 2004

Armadillo: harvesting information for the semantic web

Sam Chapman; Alexiei Dingli; Fabio Ciravegna

Armadillo [1] is an automatic system for producing domain-specific Semantic Web oriented annotation on large repositories. Armadillo is adaptive: learning how to harvest information with minimal user intervention. The first step performed in the learning process is identifying seed terms for examples of information to be extracted. The seeds, provided by the user from existing data or via connection to a web service, are used to learn from different parts of the repository or in the external world (e.g. the Web). An agent spiders the available space and identifies places where such terms occur (documents, databases, etc.). Rules are induced to model the context in which these terms appear. These rules are then used to extract other examples not contained in the initial lexicon but that appear in similar contexts. All new terms must be confirmed before they are accepted and used to re-seed learning. Multiple strategies are used for confirmation, e.g. a new piece of information is accepted if found within different (linguistic or semantic) contexts. Finally all the extracted information is integrated with the existing knowledge base. The methodology relies on the inherent redundancy of large repositories, e.g. the Web or company-wide repositories. By inherent redundancy we mean the fact that information is frequently represented in different forms in distributed resources. Armadillo has similarities to SemTag [2] which automatically generates instance labels from documents on the web. SemTag’s extraction is shallow for the needs of the Semantic Web: e.g. there is no attempt to automatically discover relations among entities. SemTag can be seen as an extension of a search engine towards semantic indexing and retrieval. Armadillo improves on SemTag by integrating extracted data from numerous sources into a knowledge base (KB). Such a KB can then be used both to access information directly (e.g. via a semantic web agent) and to annotate the pages where information was identified. Armadillo’s architecture is based on Semantic Web Services, where each service is associated to parts of the ontology (e.g. a set of concepts and/or relations) and works in an independent way. Each service can use other services (in-

International Semantic Web Conference | 2008

Creating and Using Organisational Semantic Webs in Large Networked Organisations

Ravish Bhagdev; Ajay Chakravarthy; Sam Chapman; Fabio Ciravegna; Vitaveska Lanfranchi

Modern knowledge management is based on the orchestration of dynamic communities that acquire and share knowledge according to customized schemas. However, while independence of ontological views is favoured, these communities must also be able to share their knowledge with the rest of the organization. In this paper we introduce K-Forms and K-Search, a suite of Semantic Web tools for supporting distributed and networked knowledge acquisition, capturing, retrieval and sharing. They enable communities of users to define their own domain views in an intuitive way (automatically translated into formal ontologies) and capture and share knowledge according to them. The tools favour reuse of existing ontologies; reuse creates as side effect a network of (partially) interconnected ontologies that form the basis for knowledge exchange among communities. The suite is under release to support knowledge capture, retrieval and sharing in a large jet engine company.

Lecture Notes in Computer Science | 2004

CS AKTiveSpace: Building a Semantic Web Application

Hugh Glaser; Harith Alani; Les Carr; Sam Chapman; Fabio Ciravegna; Alexiei Dingli; Nicholas Gibbins; Stephen Harris; m.c. schraefel; Nigel Shadbolt

In this paper we reflect on the lessons learned from deploying the award-winning [1] Semantic Web application, CS AKTiveSpace. We look at issues in service orientation and modularisation, harvesting, and interaction design for supporting this 10-million-triple-based application. We consider next steps for the application, based on these lessons, and propose a strategy for expanding and improving the services afforded by the application.

GfKl | 2009

Collective Intelligence Generation from User Contributed Content

Vassilios Solachidis; Phivos Mylonas; Andreas Geyer-Schulz; Bettina Hoser; Sam Chapman; Fabio Ciravegna; Vita Lanfranchi; Ansgar Scherp; Steffen Staab; Costis Contopoulos; Ioanna Gkika; Byron Bakaimis; Pavel Smrz; Yiannis Kompatsiaris; Yannis S. Avrithis

In this paper we provide a foundation for a new generation of services and tools. We define new ways of capturing, sharing and reusing information and intelligence provided by single users and communities, as well as organizations by enabling the extraction, generation, interpretation and management of Collective Intelligence from user generated digital multimedia content. Different layers of intelligence are generated, which together constitute the notion of Collective Intelligence. The automatic generation of Collective Intelligence constitutes a departure from traditional methods for information sharing, since information from both the multimedia content and social aspects will be merged, while at the same time the social dynamics will be taken into account. In the context of this work, we present two case studies: an Emergency Response and a Consumers Social Group case study.

Multimedia Tools and Applications | 2009

Attributing semantics to personal photographs

Rodrigo Fontenele Carvalho; Sam Chapman; Fabio Ciravegna

A major bottleneck for the efficient management of personal photographic collections is the large gap between low-level image features and high-level semantic contents of images. This paper proposes and evaluates two methodologies for making appropriate (re)use of natural language photographic annotations for extracting references to people, location and objects and propagating any location references encountered to previously unannotated images. The evaluation identifies the strengths of each approach and shows extraction and propagation results with promising accuracy.

European Semantic Web Conference | 2009

K-Tools: Towards Semantic Knowledge Management

Sam Chapman; Vitaveska Lanfranchi; Ravish Bhagdev

This paper details the use of Semantic Web tools for supporting networked knowledge acquisition, search and sharing in large distributed organisations. The demonstration will showcase from a user perspective an application developed to aid knowledge management in large organisations, detailing the problems and technical solutions employed.
