Publication


Featured research published by Lee Gillam.


International Conference on Move to Meaningful Internet Systems | 2005

Automatic ontology extraction from unstructured texts

Khurshid Ahmad; Lee Gillam

Construction of the ontology of a specific domain currently relies on the intuition of a knowledge engineer, and the typical output is a thesaurus of terms, each of which is expected to denote a concept. Ontological ‘engineers’ tend to hand-craft these thesauri on an ad-hoc basis and on a relatively small scale. Workers in the specific domain create their own special language, and one device for this creation is the repetition of select keywords for consolidating or rejecting one or more concepts. A more scalable, systematic and automatic approach to ontology construction is possible through the automatic identification of these keywords. An approach for the study and extraction of keywords is outlined in which a corpus of randomly collected unstructured texts (i.e. containing no mark-up) in a specific domain is analysed with reference to the lexical preferences of the workers in the domain. An approximation of the role of frequently used single words within multiword expressions leads us to the creation of a semantic network. The network can be asserted into a terminology database or knowledge representation formalism, and the relationships between the nodes of the network help in the visualisation of, and automatic inference over, the frequently used words denoting important concepts in the domain. We illustrate our approach with a case study using corpora from three time periods on the emergence and consolidation of nuclear physics. The text-based approach appears to be less subjective and more suitable for introspection, and is perhaps useful in ontology evolution.
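The keyword identification described here can be pictured as a simple frequency contrast: words markedly more frequent in the domain corpus than in general language are taken as keywords, and co-occurring keywords are linked into a network. A minimal sketch in Python, assuming plain-text corpora; the threshold, window size and helper names are illustrative, not the authors' implementation:

from collections import Counter
import itertools
import re

def tokenise(text):
    # Lower-case word tokens; unstructured text needs no mark-up handling.
    return re.findall(r"[a-z]+", text.lower())

def keywords(domain_text, general_text, threshold=10.0):
    # A word is a keyword if its relative frequency in the domain corpus
    # is at least `threshold` times its rate in general language.
    dom = Counter(tokenise(domain_text))
    gen = Counter(tokenise(general_text))
    n_dom, n_gen = sum(dom.values()), sum(gen.values())
    result = set()
    for word, freq in dom.items():
        general_rate = (gen[word] + 1) / (n_gen + 1)  # smooth unseen words
        if (freq / n_dom) / general_rate >= threshold:
            result.add(word)
    return result

def semantic_network(domain_text, keys, window=3):
    # Link keywords that co-occur within a short window, approximating
    # the step from single words to a network of multiword expressions.
    tokens = tokenise(domain_text)
    edges = Counter()
    for i in range(len(tokens)):
        seen = sorted({t for t in tokens[i:i + window] if t in keys})
        edges.update(itertools.combinations(seen, 2))
    return edges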


Grid Economics and Business Models | 2014

Performance Evaluation for Cost-Efficient Public Infrastructure Cloud Use

John O’Loughlin; Lee Gillam

In this paper, we discuss the nature of variability in compute performance in Infrastructure Clouds and how this presents opportunities for Cloud Service Brokers (CSBs) in relation to pricing. Performance variation in virtual machines of the same type and price raises specific issues for end users: (i) the time taken to complete a task varies with performance, and therefore costs also vary; (ii) the number of instances required to meet a certain problem scale within a given time varies, so costs depend on the scale needed to meet the requirement; (iii) different computational requirements are better satisfied by different hardware, and understanding the relationship between instance types and available resources implies further costs. We demonstrate such variability problems empirically in a Public Infrastructure Cloud, and use the data gathered to discuss performance-price issues and how a CSB may re-price instances based on their performance.
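As an illustration of the re-pricing opportunity, a broker could scale a base price by each instance's measured benchmark score relative to the mean for its type. A minimal sketch, with invented benchmark numbers rather than data from the paper:

def reprice(base_price, scores):
    # Scale the base hourly price so faster-than-average instances cost
    # proportionally more and slower ones proportionally less.
    mean = sum(scores.values()) / len(scores)
    return {iid: round(base_price * s / mean, 4) for iid, s in scores.items()}

# Three same-type instances with different (invented) benchmark scores.
benchmarks = {"i-aaa": 118.0, "i-bbb": 97.0, "i-ccc": 85.0}
print(reprice(0.10, benchmarks))  # e.g. {'i-aaa': 0.118, ...}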


International Conference on e-Science | 2009

Towards job-specific service level agreements in the cloud

Bin Li; Lee Gillam

To attract more users to commercially available computing, services have to specify clearly the charges, duties, liabilities and penalties in Service Level Agreements (SLAs). In this paper, we build on our existing work in SLAs by making simple measurements of a specific application run within a commercial Cloud. An outcome of this work is that certain applications may run better in the Cloud than in a Grid or HPC environment, which backs a recent hypothesis [7].
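One way to picture such job-specific measurements feeding an SLA: quote a cost from observed runtimes plus a margin for variability. The runtimes, hourly rate and margin below are invented examples, not figures from the paper:

import statistics

def quote(runtimes_hours, hourly_rate, margin_sds=1.0):
    # Quote a cost from the mean observed runtime plus a margin (in
    # standard deviations) to absorb performance variability.
    mean = statistics.mean(runtimes_hours)
    spread = statistics.stdev(runtimes_hours) if len(runtimes_hours) > 1 else 0.0
    return (mean + margin_sds * spread) * hourly_rate

# Four (invented) timed runs of the application on Cloud instances.
print(quote([2.1, 2.4, 1.9, 2.6], hourly_rate=0.34))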


Cluster Computing and the Grid | 2009

Risk Informed Computer Economics

Bin Li; Lee Gillam

Grid computing continues to hold promise for the high availability of a wide range of computational systems and techniques. It is suggested that Grids will attain greater acceptance by a larger audience of commercial end-users if binding Service Level Agreements (SLAs) are provided. We discuss Grid commoditization, the use of Grid technologies for financial risk analysis, and the potential formulation of the Grid Economy. Our aim is to predict availability and capability for risk analysis in and of Grids. The considerations involved may be more widely applicable to the configuration and management of related architectures including those of P2P systems and Clouds. In this paper, we explore and evaluate some of the factors involved in the automatic construction of SLAs for the Grid Economy.
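A toy illustration of risk-informed pricing for an SLA: fold the expected penalty for missing the agreed service level into the price. All numbers are invented, and this is far simpler than the risk analysis the paper considers:

def sla_price(base_price, failure_prob, penalty):
    # Expected-penalty pricing: the premium covers, on average, the
    # penalty the provider pays when the service level is missed.
    return base_price + failure_prob * penalty

# Invented figures: 2% chance of violation against a 50-unit penalty clause.
print(sla_price(base_price=10.0, failure_prob=0.02, penalty=50.0))  # 11.0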


Information Retrieval Facility Conference | 2010

Rank by readability: document weighting for information retrieval

Neil Newbold; Harry McLaughlin; Lee Gillam

In this paper, we present a new approach to ranking that considers the reading ability (and motivation) of the user. Web pages are, increasingly, badly written, with unfamiliar words, poor use of syntax, ambiguous phrases and so on. Readability research suggests that experts and motivated readers may overcome confusingly written text, but nevertheless find it an irritation. We investigate using readability to re-rank web pages. We take an extended view of readability that assesses the reading level of retrieved web pages using techniques that account for both textual and cognitive factors. The readability of a selection of query results is examined, and a re-ranking by readability is compared to the original ranking. Results to date suggest that considering a view of readability for each reader may increase the probability of relevance to a particular user.
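As a sketch of readability-based re-ranking, documents can be scored with McLaughlin's SMOG grade and ordered by closeness to a reader's level. The syllable counter below is a crude vowel-group heuristic, and the whole sketch is illustrative rather than the system described in the paper:

import math
import re

def syllables(word):
    # Crude vowel-group heuristic; real syllabification is harder.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def smog(text):
    # SMOG grade = 1.043 * sqrt(polysyllables * 30 / sentences) + 3.1291
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    polysyllables = sum(1 for w in re.findall(r"[A-Za-z]+", text)
                        if syllables(w) >= 3)
    return 1.043 * math.sqrt(polysyllables * 30 / sentences) + 3.1291

def rerank(pages, target_grade):
    # Put pages whose reading level best matches the reader first.
    return sorted(pages, key=lambda text: abs(smog(text) - target_grade))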


Proceedings of the 1st International Workshop on Grid Technology for Financial Modeling and Simulation, PoS(GRID2006) | 2007

Financial data tombs and nurseries: A grid-based text and ontological analysis

Khurshid Ahmad; Lee Gillam

Continuous news streams provide valuable and time-critical information to a range of financial market stakeholders. The large volume of such news makes it important to extract information automatically from these data nurseries. In many cases it is essential to repeatedly process a large news archive as news arrives, reconciling it with what has happened in the past. An algorithm is presented that can analyse a large text collection and extract the terminology and ontology of the specialist domain of the texts. The state of flux of financial markets, and of the instruments traded therein, can be observed with a diachronic analysis. We report on a grid implementation of our algorithm and show a degree of diachronic change in the use of certain terms by comparing texts from one time period with texts from another.
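The diachronic comparison can be pictured as a ratio of a term's relative frequency in a later archive to that in an earlier one. A minimal serial sketch, assuming plain-text archives; the real system ran over a grid:

from collections import Counter
import re

def relative_freqs(text):
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(tokens)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

def diachronic_change(earlier_text, later_text, terms):
    # Ratio > 1 means a term is used relatively more in the later period;
    # the small constant avoids division by zero for absent terms.
    old, new = relative_freqs(earlier_text), relative_freqs(later_text)
    return {t: (new.get(t, 0.0) + 1e-9) / (old.get(t, 0.0) + 1e-9)
            for t in terms}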


Archive | 2006

The Mood of the (Financial) Markets: In a Corpus of Words and of Pictures

Khurshid Ahmad; David Cheng; Tugba Taskaya; Saif Ahmad; Lee Gillam; Pensiri Manomaisupat; Hayssam Traboulsi; Andrew Hippisley

Corpora of texts are typically used to study the structure and function of language. The distributions of the various linguistic units comprising the texts in a corpus are used to make and test hypotheses relevant to different linguistic levels of description. News reports and editorials have been used extensively to populate corpora for studying language, for making dictionaries and for writing grammar books. News reports of financial markets are generally accompanied by time-indexed series of values of shares, currencies and so on, reflecting the change in value over a period of time. A corpus linguistic method for extracting sentiment indicators, e.g. shares going up or a currency falling, is presented together with a technique for correlating the quantitative time series of values with a time series of sentiment indicators. The correlation may be used in the analysis of the movement of shares, currencies and other financial instruments.
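The correlation technique can be sketched as a plain Pearson correlation between a daily sentiment-indicator series and a price series, optionally lagged to test whether sentiment leads prices. The series below are invented examples:

import math

def pearson(xs, ys):
    # Plain Pearson correlation between two equal-length series.
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Invented daily sentiment scores (e.g. "rose" mentions minus "fell")
# and closing prices; lagging sentiment by a day tests lead-lag effects.
sentiment = [0.2, 0.5, -0.1, 0.4, 0.3]
prices = [101.0, 102.5, 101.8, 103.1, 103.4]
print(pearson(sentiment[:-1], prices[1:]))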


The Journal of Supercomputing | 2016

Sibling virtual machine co-location confirmation and avoidance tactics for Public Infrastructure Clouds

John O’Loughlin; Lee Gillam

Infrastructure Clouds offer large-scale resources for rent, which are typically shared with other users, unless you are willing to pay a premium for single tenancy (if available). There is no guarantee that your instances will run on separate hosts, and a range of issues can arise when your instances are co-located on the same host, including: mutual performance degradation, exposure to underlying host failures, and an increased threat surface for host compromise. Determining when your instances are co-located is therefore useful, as a user can then implement policies for host separation. Co-location methods to date have typically focused on identifying co-location with another user’s instance, as this is a prerequisite for targeted attacks on the Cloud. However, as providers update their environments, these methods either no longer work or have yet to be proven on the Public Cloud. Further, they are not suited to the task of simply and quickly detecting co-location amongst a large number of instances. We propose a method suitable for Xen-based Clouds which addresses this problem, and demonstrate it on EC2, the largest Public Cloud Infrastructure.
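The paper's Xen-specific detection method is not reproduced here; as a hedged sketch of the surrounding avoidance tactic only, suppose some host fingerprint is available per instance (hypothetical). Instances sharing a fingerprint are siblings, and all but one per host are relaunched in the hope of landing elsewhere:

from collections import defaultdict

def co_located_groups(fingerprints):
    # Map host fingerprint -> ids of instances sharing that host.
    groups = defaultdict(list)
    for instance_id, fp in fingerprints.items():
        groups[fp].append(instance_id)
    return {fp: ids for fp, ids in groups.items() if len(ids) > 1}

def relaunch_candidates(fingerprints):
    # Keep one sibling per host; relaunch the rest so they may land on
    # separate hosts (the separation policy, not the detection method).
    return [iid for ids in co_located_groups(fingerprints).values()
            for iid in ids[1:]]

# Made-up fingerprints: i-2 and i-3 turn out to share a host.
print(relaunch_candidates({"i-1": "hostA", "i-2": "hostB", "i-3": "hostB"}))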


International Conference on e-Science | 2009

Towards Executable Acceptable Use Policies (execAUPs) for email clouds

Lee Gillam; Neil Cooke; Jonathan Skinner

In this paper, we discuss the potential use of Cloud Computing for the hosting and analysis of email. In particular, we are working towards the development of Executable Acceptable Use Policies (execAUPs) that assist organizations in preventing certain kinds of detrimental employee activities. We consider requirements for execAUPs, and outline initial efforts in using Microsoft's Azure as an environment for providing hosted storage for such research.
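A toy illustration of what "executable" might mean for an acceptable use policy: the policy is code that can be run against each message. The rule, domain and message fields below are hypothetical, not drawn from the paper:

def violates_aup(message, blocked_domains=frozenset({"competitor.example"})):
    # Flag mail addressed to a blocked domain; a full execAUP would
    # encode the organisation's whole policy as such checkable rules.
    return any(addr.split("@")[-1] in blocked_domains
               for addr in message["to"])

print(violates_aup({"to": ["alice@competitor.example"], "body": "hi"}))  # True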


International Symposium on Neural Networks | 2007

Distributing SOM Ensemble Training using Grid Middleware

Bogdan Vrusias; Leonidas Vomvoridis; Lee Gillam

In this paper we explore the distribution of the training of self-organising maps (SOMs) over grid middleware. We propose a two-level architecture and discuss an experimental methodology comprising ensembles of SOMs distributed over a grid with periodic averaging of weights. The purpose of the experiments is to begin to systematically assess the potential for reducing the overall training time through a distributed training regime, against the impact on precision. Several issues are considered: (i) the optimum number of ensembles; (ii) the impact of different types of training data; and (iii) the appropriate period of averaging. The proposed architecture has been evaluated in a grid environment, with clock-time performance recorded.
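The two-level idea can be sketched serially: train several SOMs on disjoint data shards and periodically average their weight matrices. Map size, data and the number of averaging periods below are placeholders, and the grid middleware layer is omitted:

import numpy as np

def train_som(weights, data, lr=0.1, sigma=1.0):
    # One epoch of on-line training for a simplified 1-D SOM.
    idx = np.arange(weights.shape[0])
    for x in data:
        bmu = np.argmin(np.linalg.norm(weights - x, axis=1))  # best match
        influence = np.exp(-((idx - bmu) ** 2) / (2 * sigma ** 2))
        weights += lr * influence[:, None] * (x - weights)
    return weights

rng = np.random.default_rng(0)
shards = [rng.normal(size=(200, 3)) for _ in range(4)]  # one shard per worker
ensemble = [rng.normal(size=(10, 3)) for _ in shards]   # one SOM per worker

for period in range(5):  # periodic averaging of weights
    ensemble = [train_som(w, shard) for w, shard in zip(ensemble, shards)]
    averaged = np.mean(ensemble, axis=0)
    ensemble = [averaged.copy() for _ in ensemble]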

Collaboration

Top co-authors include Bin Li (University of Surrey).