James A. Thom | Researchain

Publication

Featured research published by James A. Thom.

Conference on Information and Knowledge Management | 2007

Ontology evaluation using wikipedia categories for browsing

Jonathan Yu; James A. Thom; Audrey M. Tam

Ontology evaluation is a maturing discipline with methodologies and measures being developed and proposed. However, evaluation methods that have been proposed have not been applied to specific examples. In this paper, we present the state-of-the-art in ontology evaluation - current methodologies, criteria and measures, analyse appropriate evaluations that are important to our application - browsing in Wikipedia, and apply these evaluations in the context of ontologies with varied properties. Specifically, we seek to evaluate ontologies based on categories found in Wikipedia.

IEEE Transactions on Knowledge and Data Engineering | 1995

Atlas: a nested relational database system for text applications

Ron Sacks-Davis; Alan J. Kent; Kotagiri Ramamohanarao; James A. Thom; Justin Zobel

Advanced database applications require facilities such as text indexing, image storage, and the ability to store data with a complex structure. However, these facilities are not usually included in traditional database systems. In this paper we describe Atlas, a nested relational database system that has been designed for text-based applications. The Atlas query language is TQL, an SQL-like query language with text operators. The query language is supported by signature file text indexing techniques, and by a parser that can be configured for different text formats and even some foreign languages. Atlas can also be used to store images and audio.

ACM Symposium on Applied Computing | 2008

Entity ranking in Wikipedia

Anne-Marie Vercoustre; James A. Thom; Jovan Pehcevski

The traditional entity extraction problem lies in the ability of extracting named entities from plain text using natural language processing techniques and intensive training from large document collections. Examples of named entities include organisations, people, locations, or dates. There are many research activities involving named entities; we are interested in entity ranking in the field of information retrieval. In this paper, we describe our approach to identifying and ranking entities from the INEX Wikipedia document collection. Wikipedia offers a number of interesting features for entity identification and ranking that we first introduce. We then describe the principles and the architecture of our entity ranking system, and introduce our methodology for evaluation. Our preliminary results show that the use of categories and the link structure of Wikipedia, together with entity examples, can significantly improve retrieval effectiveness.

Information Systems | 2009

Requirements-oriented methodology for evaluating ontologies

Jonathan Yu; James A. Thom; Audrey M. Tam

Many applications benefit from the use of a suitable ontology but it can be difficult to determine which ontology is best suited to a particular application. Although ontology evaluation techniques are improving as more measures and methodologies are proposed, the literature contains few specific examples of cohesive evaluation activity that links ontologies, applications and their requirements, and measures and methodologies. In this paper, we present ROMEO, a requirements-oriented methodology for evaluating ontologies, and apply it to the task of evaluating the suitability of some general ontologies (variants of sub-domains of the Wikipedia category structure) for supporting browsing in Wikipedia. The ROMEO methodology identifies requirements that an ontology must satisfy, and maps these requirements to evaluation measures. We validate part of this mapping with a task-based evaluation method involving users, and report on our findings from this user study.

Environment, Development and Sustainability | 2013

Reframing social sustainability reporting: towards an engaged approach

Liam Magee; Andy Scerri; Paul James; James A. Thom; Lin Padgham; Sarah L. Hickmott; Hepu Deng; Felicity Cahill

Existing approaches to sustainability assessment are typically characterized as being either “top–down” or “bottom–up.” While top–down approaches are commonly adopted by businesses, bottom–up approaches are more often adopted by civil society organizations and communities. Top–down approaches clearly favor standardization and commensurability between other sustainability assessment efforts, to the potential exclusion of issues that really matter on the ground. Conversely, bottom–up approaches enable sustainability initiatives to speak directly to the concerns and issues of communities, but lack a basis for comparability. While there are clearly contexts in which one approach can be favored over another, it is equally desirable to develop mechanisms that mediate between both. In this paper, we outline a methodology for framing sustainability assessment and developing indicator sets that aim to bridge these two approaches. The methodology incorporates common components of bottom–up assessment: constituency-based engagement processes and opportunity to identify critical issues and indicators. At the same time, it uses the idea of a “knowledge base,” to help with the selection of standardized, top–down indicators. We briefly describe two projects where the aspects of the methodology have been trialed with urban governments and communities, and then present the methodology in full, with an accompanying description of a supporting software system.

Information Processing and Management | 1996

Relevance judgments for assessing recall

Peter Wallis; James A. Thom

Recall and precision have become the principal measures of the effectiveness of information retrieval systems. Inherent in these measures of performance is the idea of a relevant document. Although recall and precision are easily and unambiguously defined, selecting the documents relevant to a query has long been recognized as problematic. To compare performance of different systems, standard collections of documents, queries, and relevance judgments have been used. Unfortunately the standard collections, such as SMART and TREC, have locked in a particular approach to relevance that is suitable for assessing precision but not recall. The problem is demonstrated by comparing two information retrieval methods over several queries, and showing how a new method of forming relevance judgments that is suitable for assessing recall gives different results. Recall is an interesting and practical issue, but current test procedures are inadequate for measuring it.
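The standard set-based definitions of precision and recall that the abstract refers to can be sketched as follows. This is only an illustrative sketch, not code from the paper: the function name `precision_recall`, the document identifiers, and both sets of relevance judgments are hypothetical. It shows why recall, unlike precision, depends on how complete the relevance judgments are.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a single query.

    precision = |retrieved AND relevant| / |retrieved|
    recall    = |retrieved AND relevant| / |relevant|
    """
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    hits = len(retrieved_set & relevant_set)
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall


# Hypothetical system output for one query.
retrieved = ["d1", "d2", "d3", "d4"]

# Hypothetical judgments: an incomplete set that only covers documents
# the system happened to retrieve, and a fuller set for the same query.
incomplete_judgments = ["d1", "d3"]
fuller_judgments = ["d1", "d3", "d7", "d8", "d9"]

print(precision_recall(retrieved, incomplete_judgments))
print(precision_recall(retrieved, fuller_judgments))
```

In this made-up example precision is the same under both sets of judgments, while recall drops once relevant documents that the system never retrieved are counted.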

Database Systems for Advanced Applications | 1999

A fuzzy object query language (FOQL) for image databases

Surya Nepal; M. V. Ramakrishna; James A. Thom

Content-based retrieval systems have been developed for querying image data in which the users can pose queries based on visual properties such as color and texture. These systems, which have advanced the state of the art in image database systems, remarkably lack formal query languages. The traditional query languages are unable to capture the inherent fuzzy nature of the image data and content-based querying. The fuzzy object query language (FOQL) presented in this paper addresses the need to support fuzzy values and fuzzy collections required for image databases. It can be used for defining schemas and high-level concepts, and for querying image databases. Unlike other query languages, it captures the inherent fuzzy nature of content-based retrieval by keeping the query results fuzzy. The users can interactively refine their queries and high-level concept definitions using recursive and named query definition constructs in FOQL. Being an extension of ODMG-OQL, FOQL can be easily mapped to ODMG-compliant visual query languages.

European Conference on Information Retrieval | 2008

Exploiting locality of Wikipedia links in entity ranking

Jovan Pehcevski; Anne-Marie Vercoustre; James A. Thom

Information retrieval from web and XML document collections is ever more focused on returning entities instead of web pages or XML elements. There are many research fields involving named entities; one such field is known as entity ranking, where one goal is to rank entities in response to a query supported with a short list of entity examples. In this paper, we describe our approach to ranking entities from the Wikipedia XML document collection. Our approach utilises the known categories and the link structure of Wikipedia, and more importantly, exploits link co-occurrences to improve the effectiveness of entity ranking. Using the broad context of a full Wikipedia page as a baseline, we evaluate two different algorithms for identifying narrow contexts around the entity examples: one that uses predefined types of elements such as paragraphs, lists and tables; and another that dynamically identifies the contexts by utilising the underlying XML document structure. Our experiments demonstrate that the locality of Wikipedia links can be exploited to significantly improve the effectiveness of entity ranking.

Lecture Notes in Computer Science | 2005

HiXEval: highlighting XML retrieval evaluation

Jovan Pehcevski; James A. Thom

This paper describes our proposal for an evaluation metric for XML retrieval that is solely based on the highlighted text. We support our decision of ignoring the exhaustivity dimension by undertaking a critical investigation of the two INEX 2005 relevance dimensions. We present a fine grained empirical analysis of the level of assessor agreement of the five topics double-judged at INEX 2005, and show that the agreement is higher for specificity than for exhaustivity. We use the proposed metric to evaluate the INEX 2005 runs for each retrieval strategy of the CO and CAS retrieval tasks. A correlation analysis of the rank orderings obtained by the new metric and two XCG metrics shows that the orderings are strongly correlated, which demonstrates the usefulness of the proposed metric for evaluation of XML retrieval performance.

INEX'09 Proceedings of the Focused retrieval and evaluation, and 8th international conference on Initiative for the evaluation of XML retrieval | 2009

Overview of the INEX 2009 ad hoc track

Shlomo Geva; Jaap Kamps; Miro Lehtonen; Ralf Schenkel; James A. Thom; Andrew Trotman

This paper gives an overview of the INEX 2009 Ad Hoc Track. The main goals of the Ad Hoc Track were three-fold. The first goal was to investigate the impact of the collection scale and markup, by using a new collection that is again based on the Wikipedia but is over 4 times larger, with longer articles and additional semantic annotations. For this reason the Ad Hoc Track tasks stayed unchanged, and the Thorough Task of INEX 2002-2006 returns. The second goal was to study the impact of more verbose queries on retrieval effectiveness, by using the available markup as structural constraints (now using both Wikipedia's layout-based markup and the enriched semantic markup) and by the use of phrases. The third goal was to compare different result granularities by allowing systems to retrieve XML elements, ranges of XML elements, or arbitrary passages of text. This investigates the value of the internal document structure (as provided by the XML markup) for retrieving relevant information. The INEX 2009 Ad Hoc Track featured four tasks. For the Thorough Task, a ranked list of results (elements or passages) by estimated relevance was needed. For the Focused Task, a ranked list of non-overlapping results (elements or passages) was needed. For the Relevant in Context Task, non-overlapping results (elements or passages) were returned grouped by the article from which they came. For the Best in Context Task, a single starting point (element start tag or passage start) for each article was needed. We discuss the setup of the track, and the results for the four tasks.
