Publication


Featured research published by Hany Azzam.


Very Large Data Bases | 2008

Modelling retrieval models in a probabilistic relational algebra with a new operator: the relational Bayes

Thomas Roelleke; Hengzhi Wu; Jun Wang; Hany Azzam

This paper presents a probabilistic relational modelling (implementation) of the major probabilistic retrieval models. Such a high-level implementation is useful since it supports the ranking of any object, it allows for the reasoning across structured and unstructured data, and it gives the software (knowledge) engineer control over ranking and thus supports customisation. The contributions of this paper include the specification of probabilistic SQL (PSQL) and probabilistic relational algebra (PRA), a new relational operator for probability estimation (the relational Bayes), the probabilistic relational modelling of retrieval models, a comparison of modelling retrieval with traditional SQL versus modelling retrieval with PSQL, and a comparison of the performance of probability estimation with traditional SQL versus PSQL. The main findings are that the PSQL/PRA paradigm allows for the description of advanced retrieval models, is suitable for solving large-scale retrieval tasks, and outperforms traditional SQL in terms of abstraction and performance regarding probability estimation.


Proceedings of the Third International Workshop on Keyword Search on Structured Data | 2012

A schema-driven approach for knowledge-oriented retrieval and query formulation

Hany Azzam; Sirvan Yahyaei; Marco Bonzanini; Thomas Roelleke

In order to search across factual knowledge and content explicated using different data formats, this paper leverages a generic data model (schema) that transforms keyword-based retrieval models and queries to knowledge-oriented models and semantically-expressive queries. As each of the transformed retrieval models capitalises on a specific evidence space (term, classification, relationship and attribute), we demonstrate two possible combinations of these spaces, namely macro-based or micro-based. For bare keyword-based queries, we demonstrate how the data model can be used to augment the queries with classifications, relationships, etc. that reflect the underlying constraints and objects found in the heterogeneous knowledge bases. Using the IMDb benchmark, the results demonstrate the feasibility and effectiveness of the instantiated retrieval models and the query reformulation process.


Exploiting Semantic Annotations in Information Retrieval | 2010

SQR: a semantic query rating scheme

Hany Azzam; Thomas Roelleke

We introduce a query rating scheme that identifies the possible interpretations which can be assigned to a semantic query. The interpretations range from the traditional bag-of-words interpretation to more context- and semantic-aware interpretations. The aims of this scheme are to communicate the extent of semantics that is being interpreted for a query and to assign suitable query processing methods for each level of interpretation accordingly.


Patent Information Retrieval | 2009

A case for probabilistic logic for scalable patent retrieval

Iraklis A. Klampanos; Hany Azzam; Thomas Roelleke

Patent retrieval has emerged as an important application of information retrieval. Inherent properties of patent searching, such as large corpora, document length and the use of terminology, have created the need for alternative approaches to searching. Logic-based information retrieval, as it is modelled by DB+IR systems, can accommodate these needs through its power of abstraction and the use of database-friendly query languages. However, there is a trade-off between expressiveness and efficiency. We propose to tackle such efficiency issues through distribution and parallelisation. In this paper we present our arguments in favour of a parallelised patent searching solution built on top of a probabilistic DB+IR system. Our contributions are both conceptual and technical. We demonstrate the flexibility of this approach by modelling two resource selection algorithms in probabilistic logic, expressed in probabilistic Datalog, a rule-based language designed for expressing database-related tasks. Then, we provide early experimental indications which support the feasibility and technical soundness of this approach.


International Conference on the Theory of Information Retrieval | 2011

A generic data model for schema-driven design in information retrieval applications

Hany Azzam; Thomas Roelleke

Database technology offers design methodologies to rapidly develop and deploy applications that are easy to understand, document and teach. It can be argued that information retrieval (IR) lacks equivalent methodologies. This poster discusses a generic data model, the Probabilistic Object-Oriented Content Model, that facilitates solving complex IR tasks. The model guides how data and queries are represented and how retrieval strategies are built and customised. Application/task-specific schemas can also be derived from the generic model. This eases the process of tailoring search to a specific task by offering a layered architecture and well-defined schema mappings. Different types of knowledge (facts and content) from varying data sources can also be consolidated into the proposed modelling framework. Ultimately, the data model paves the way for discussing IR-tailored design methodologies.


Information Retrieval Facility Conference | 2010

Logic-Based retrieval: technology for content-oriented and analytical querying of patent data

Iraklis A. Klampanos; Hengzhi Wu; Thomas Roelleke; Hany Azzam

Patent searching is a complex retrieval task. An initial document search is only the starting point of a chain of searches and decisions that need to be made by patent searchers. Keyword-based retrieval is adequate for document searching, but it is not suitable for modelling comprehensive retrieval strategies. DB-like and logical approaches are the state-of-the-art techniques to model strategies, reasoning and decision making. In this paper we present the application of logical retrieval to patent searching. The two grand challenges are expressiveness and scalability, where a high degree of expressiveness usually means a loss in scalability. We report how to maintain scalability while offering the expressiveness of logical retrieval required for solving patent search tasks. We present logical retrieval background, and how to model data-source selection and result fusion. Moreover, we demonstrate the modelling of a retrieval strategy, a technique by which patent professionals are able to express, store and exchange their strategies and rationales when searching patents or when making decisions. An overview of the architecture and technical details complements the paper, while the evaluation reports preliminary results on how query processing times can be guaranteed, and how quality is affected by trading off responsiveness.


Patent Information Retrieval | 2011

Large-Scale Logical Retrieval: Technology for Semantic Modelling of Patent Search

Hany Azzam; Iraklis A. Klampanos; Thomas Roelleke

Patent retrieval has emerged as an important application of information retrieval (IR). It is considered to be a complex search task because patent search requires an extended chain of reasoning beyond basic document retrieval. As logic-based IR is capable of modelling both document retrieval and decision-making, it can be seen as a suitable framework for modelling patent data and search strategies. In particular, we demonstrate logic-based modelling for semantic data in patent documents and retrieval strategies which are tailored to patent search and exploit more than just the text in the documents. Given the expressiveness of logic-based IR, however, there is an attendant compromise on issues of scalability and quality. To address these trade-offs we suggest how a parallelised architecture can ensure that logical IR scales in spite of its expressiveness.


Conference on Information and Knowledge Management | 2011

Ranking-based processing of SQL queries

Hany Azzam; Thomas Roelleke; Sirvan Yahyaei

A growing number of applications are built on top of search engines and issue complex structured queries. This paper contributes a customisable ranking-based processing of such queries, specifically SQL. Similar to how term-based statistics are exploited by term-based retrieval models, ranking-aware processing of SQL queries exploits tuple-based statistics that are derived from sources or, more precisely, derived from the relations specified in the SQL query. To implement this ranking-based processing, we leverage PSQL, a probabilistic variant of SQL, to facilitate probability estimation and the generalisation of document retrieval models to be used for tuple retrieval. The result is a general-purpose framework that can interpret any SQL query and then assign a probabilistic retrieval model to rank the results of that query. The evaluation on the IMDB and Monster benchmarks proves that the PSQL-based approach is applicable to (semi-)structured and unstructured data and structured queries.


LWA | 2013

The D2Q2 Framework: On the Relationship and Combination of Language Modelling and TF-IDF

Thomas Roelleke; Hany Azzam; Marco Bonzanini; Miguel Martinez-Alvarez; Mounia Lalmas


LWA | 2010

An Attribute-based Model for Semantic Retrieval

Hany Azzam; Thomas Roelleke

Collaboration


Dive into Hany Azzam's collaborations.

Top Co-Authors

Thomas Roelleke

Queen Mary University of London

Hengzhi Wu

Queen Mary University of London

Marco Bonzanini

Queen Mary University of London

Sirvan Yahyaei

Queen Mary University of London

Jun Wang

Queen Mary University of London

Miguel Martinez-Alvarez

Queen Mary University of London
