Olivier Y. de Vel

Defence Science and Technology Organisation

Publication

Featured research published by Olivier Y. de Vel.

ACM Symposium on Applied Computing | 2006

Automated recognition of event scenarios for digital forensics

Jonathon Abbott; Jim Bell; Andrew J. Clark; Olivier Y. de Vel; George M. Mohay

The authors have previously developed the ECF (Event Correlation for Forensics) framework for scenario matching in the forensic investigation of activity manifested in digital transactional logs. ECF incorporated a suite of log parsers to reduce event records from heterogeneous logs to a canonical form for lodging in an SQL database. This paper presents subsequent work, the Auto-ECF system, which represents a significant advance on ECF. The paper reports on the development and implementation of a new event abstraction and scenario specification methodology, and on the development of the Auto-ECF system, which builds on that methodology to achieve the automated recognition of event scenarios. The paper also reports on the evaluation of Auto-ECF using three scenarios, including one from the well-known DARPA test data.
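The two ideas in this abstract, reducing heterogeneous log records to a canonical form and then matching an ordered scenario against them, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the two log formats, the `Event` fields, and the greedy matching rule are hypothetical and are not taken from ECF or Auto-ECF, whose actual schema and scenario language are not described in the abstract.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Event:
    """Hypothetical canonical event record (not the ECF schema)."""
    time: int     # seconds since an arbitrary epoch
    source: str   # name of the log the record came from
    user: str
    action: str


def parse_auth_line(line: str) -> Event:
    # Assumed format: "<time> <user> <action>"
    t, user, action = line.split()
    return Event(int(t), "auth", user, action)


def parse_web_line(line: str) -> Event:
    # Assumed format: "<user>,<action>,<time>"
    user, action, t = line.split(",")
    return Event(int(t), "web", user, action)


def match_scenario(events: List[Event], pattern: List[str],
                   window: int) -> Optional[List[Event]]:
    """Return the first run of events, for a single user, whose actions
    follow `pattern` in order within `window` seconds of the first
    matched event, or None if no such run exists."""
    ordered = sorted(events, key=lambda e: e.time)
    for i, first in enumerate(ordered):
        if first.action != pattern[0]:
            continue
        matched = [first]
        for e in ordered[i + 1:]:
            if len(matched) == len(pattern):
                break
            if e.time - first.time > window:
                break
            if e.user == first.user and e.action == pattern[len(matched)]:
                matched.append(e)
        if len(matched) == len(pattern):
            return matched
    return None
```

Once every parser emits the same record type, the matcher no longer needs to know which log an event came from, which is the practical benefit of a canonical form.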

Computing and Combinatorics Conference | 2000

Similarity Enrichment in Image Compression through Weighted Finite Automata

Zhuhan Jiang; Bruce E. Litow; Olivier Y. de Vel

We propose and study in detail a similarity enrichment scheme for image compression based on an extension of weighted finite automata (WFA). We then develop a mechanism by which rich families of legitimate similarity images can be systematically created so as to reduce the overall WFA size, leading to better WFA-based compression performance. A number of desirable properties, including WFA with a minimum number of states, have been established for a class of packed WFA. Moreover, a codec based on a special extended WFA is implemented to demonstrate explicitly the performance gain due to extended WFA under otherwise identical conditions.

International Journal of Pattern Recognition and Artificial Intelligence | 1997

Learning to Recognize 3D Objects using Sparse Depth and Intensity Information

Brendan McCane; Terry Caelli; Olivier Y. de Vel

In this paper we further explore the use of machine learning (ML) for the recognition of 3D objects in isolation or embedded in scenes. Of particular interest is the use of a recent ML technique (specifically CRG, Conditional Rule Generation) which generates descriptions of objects in terms of object parts and part-relational attribute bounds. We show how this technique can be combined with intensity-based model and scene views to locate objects and their pose. The major contributions of this paper are: the extension of the CRG classifier to incorporate fuzzy decisions (FCRG), the application of the FCRG classifier to the problem of learning 3D objects from 2D intensity images, the study of the usefulness of sparse depth data with regard to recognition performance, and the implementation of a complete object recognition system that does not rely on perfect or synthetic data. We report a recognition rate of 80% for unseen single-object scenes in a database of 18 non-trivial objects.

Faculty of Science and Technology; Information Security Institute | 2002

E-Mail Authorship Attribution for Computer Forensics

Olivier Y. de Vel; Alison Anderson; Malcolm W. Corney; George M. Mohay

In this chapter, we briefly overview the relatively new discipline of computer forensics and describe an investigation of forensic authorship attribution or identification undertaken on a corpus of multi-author and multi-topic e-mail documents. We use an extended set of e-mail document features such as structural characteristics and linguistic patterns together with a Support Vector Machine as the learning algorithm. Experiments on a number of e-mail documents generated by different authors on a set of topics gave promising results for multi-topic and multi-author categorisation.
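To make "structural characteristics and linguistic patterns" concrete, the following minimal sketch turns an e-mail body into a small numeric feature dictionary of the kind that could be fed to a classifier. The specific features and the function-word list are illustrative assumptions; the chapter's actual feature set is not listed in the abstract, and the Support Vector Machine training step is omitted here.

```python
import re

# Illustrative subset of function words (an assumption, not the chapter's list).
FUNCTION_WORDS = ("the", "of", "and", "to", "in")


def email_features(body: str) -> dict:
    """Extract a few structural and linguistic features from an e-mail body."""
    lines = body.splitlines()
    words = re.findall(r"[A-Za-z']+", body.lower())
    n_words = max(len(words), 1)   # avoid division by zero on empty bodies
    n_lines = max(len(lines), 1)
    feats = {
        # Structural features: layout of the message.
        "n_lines": len(lines),
        "blank_line_ratio": sum(1 for l in lines if not l.strip()) / n_lines,
        "has_greeting": float(bool(re.match(r"\s*(hi|hello|dear)\b", body, re.I))),
        # Linguistic features: word-level statistics.
        "avg_word_len": sum(len(w) for w in words) / n_words,
    }
    # Relative frequency of each function word.
    for fw in FUNCTION_WORDS:
        feats["fw_" + fw] = words.count(fw) / n_words
    return feats
```

Features of this kind describe how a message is written rather than what it is about, which is why they are attractive when authors write on several topics.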

Digital Investigation | 2004

File classification using byte sub-stream kernels

Olivier Y. de Vel

The ability to automatically classify files based on their low-level, short-range structures is of particular importance in computer forensics. We report a study on the automatic learning of file classification using byte sub-stream kernels that capture these low-level structures. We automatically discover byte-level patterns in a file by extracting a byte sequence feature map and use a suffix trie data structure to efficiently store and manipulate the feature map. Using the feature map we compute the spectrum kernel and, together with a support vector machine classifier algorithm, we are able to efficiently categorize a variety of different system and application file types. Experiments have provided good file classification performance results.
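The spectrum kernel mentioned here can be sketched as follows: each file is mapped to the counts of its contiguous byte k-grams, and the kernel value is the inner product of two such count vectors. This is a minimal illustration under stated assumptions: it stores counts in a `collections.Counter` rather than the suffix trie the paper uses, and the function names and the cosine-style normalisation are choices made here, not details from the paper.

```python
import math
from collections import Counter


def spectrum(data: bytes, k: int) -> Counter:
    """Count every contiguous k-byte substring of `data`."""
    return Counter(data[i:i + k] for i in range(len(data) - k + 1))


def spectrum_kernel(a: bytes, b: bytes, k: int) -> int:
    """Inner product of the k-gram count vectors of `a` and `b`."""
    sa, sb = spectrum(a, k), spectrum(b, k)
    if len(sa) > len(sb):          # iterate over the smaller map
        sa, sb = sb, sa
    return sum(count * sb[gram] for gram, count in sa.items())


def normalized_kernel(a: bytes, b: bytes, k: int) -> float:
    """Kernel scaled to [0, 1] so that file length does not dominate."""
    denom = math.sqrt(spectrum_kernel(a, a, k) * spectrum_kernel(b, b, k))
    return spectrum_kernel(a, b, k) / denom if denom else 0.0
```

A kernel matrix built from `normalized_kernel` over a set of files could then be passed to any support vector machine implementation that accepts precomputed kernels.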

Australasian Joint Conference on Artificial Intelligence | 2008

Knowledge Discovery from Honeypot Data for Monitoring Malicious Attacks

Huidong Jin; Olivier Y. de Vel; Ke Zhang; Nianjun Liu

Owing to the spread of worms and botnets, cyber attacks have significantly increased in volume, coordination and sophistication. Cheap, rentable botnet services, for example, have made sophisticated botnets an effective and popular tool for committing online crime. Honeypots, as information system traps, monitor or deflect malicious attacks on the Internet. To understand the attack patterns generated by botnets through analysis of the data collected by honeypots, we propose an approach that integrates a clustering structure visualisation technique with outlier detection techniques. These techniques complement each other and provide end users with both a big-picture view and actionable knowledge of high-dimensional data. We introduce KNOF (K-nearest Neighbours Outlier Factor) as the outlier definition technique to reach a trade-off between global and local outlier definitions, i.e., Kth-Nearest Neighbour (KNN) and Local Outlier Factor (LOF) respectively. We propose an algorithm to discover the most significant KNOF outliers. We implement these techniques in our hpdAnalyzer tool. The tool is successfully used to comprehend honeypot data. A series of experiments shows that our proposed KNOF technique substantially outperforms LOF and, to a lesser degree, KNN for real-world honeypot data.
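As a point of reference for the "global" outlier definition named in the abstract, the sketch below scores each point by the distance to its k-th nearest neighbour and returns the highest-scoring points. It is a brute-force illustration only: it does not implement KNOF, whose definition is not given in the abstract, nor LOF, and the function names are chosen here for illustration.

```python
import math
from typing import List, Sequence


def kth_nn_distance(points: Sequence[Sequence[float]], k: int) -> List[float]:
    """Score each point by the Euclidean distance to its k-th nearest
    neighbour (brute force, O(n^2)); requires 1 <= k <= len(points) - 1."""
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(dists[k - 1])
    return scores


def top_outliers(points: Sequence[Sequence[float]], k: int, n: int) -> List[int]:
    """Indices of the n points with the largest k-th neighbour distance."""
    scores = kth_nn_distance(points, k)
    ranked = sorted(range(len(points)), key=lambda i: scores[i], reverse=True)
    return ranked[:n]
```

Because the score is a raw distance, a point in a sparse but otherwise normal region can outrank a point that is only locally unusual; that limitation is what density-relative measures such as LOF address.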

Intelligence and Security Informatics | 2008

A Simple WordNet-Ontology Based Email Retrieval System for Digital Forensics

Phan Thien Son; Lan Du; Huidong Jin; Olivier Y. de Vel; Nianjun Liu; Terry Caelli

Because of the high impact of high-tech digital crime upon our society, it is necessary to develop effective Information Retrieval (IR) tools to support digital forensic investigations. In this paper, we propose an IR system for digital forensics that targets emails. Our system incorporates WordNet (i.e. a domain independent ontology for the vocabulary) into an Extended Boolean Model (EBM) by applying query expansion techniques. Structured Boolean queries in Backus-Naur Form (BNF) are utilized to assist investigators in effectively expressing their information requirements. We compare the performance of our system on several email datasets with a traditional Boolean IR system built upon the Lucene keyword-only model. Experimental results show that our system yields a promising improvement in retrieval performance without the requirement of very accurate query keywords to retrieve the most relevant emails.
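The combination of query expansion with an Extended Boolean Model can be sketched using the standard p-norm scoring formulas, in which AND and OR are softened so that partial matches still receive a score. The sketch makes several assumptions: the `SYNONYMS` dictionary is a hypothetical stand-in for a WordNet lookup, term weights are taken to lie in [0, 1], and treating each expanded term as an OR over its synonyms is one plausible reading, not necessarily the paper's design.

```python
from typing import Dict, List

# Hypothetical stand-in for a WordNet synonym lookup.
SYNONYMS: Dict[str, List[str]] = {
    "car": ["automobile"],
    "money": ["cash", "funds"],
}


def or_score(weights: List[float], p: float = 2.0) -> float:
    """p-norm OR: high if any weight is high."""
    return (sum(w ** p for w in weights) / len(weights)) ** (1 / p)


def and_score(weights: List[float], p: float = 2.0) -> float:
    """p-norm AND: penalises every weight that falls short of 1."""
    return 1 - (sum((1 - w) ** p for w in weights) / len(weights)) ** (1 / p)


def expand(term: str) -> List[str]:
    """Return the term together with its (assumed) synonyms."""
    return [term] + SYNONYMS.get(term, [])


def score_and_query(terms: List[str], doc_weights: Dict[str, float],
                    p: float = 2.0) -> float:
    """Score a conjunctive query in which each term is first expanded
    into an OR over its synonyms."""
    per_term = [
        or_score([doc_weights.get(t, 0.0) for t in expand(term)], p)
        for term in terms
    ]
    return and_score(per_term, p)
```

With expansion, a document containing only "automobile" and "cash" still receives a non-zero score for the query "car AND money", which is the behaviour the abstract describes as not requiring very accurate query keywords.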

Data Mining and Knowledge Discovery | 2006

Learning Semi-Structured Document Categorization Using Bounded-Length Spectrum Sub-Sequence Kernels

Olivier Y. de Vel

In this paper we report an investigation into the learning of semi-structured document categorization. We automatically discover low-level, short-range byte data structure patterns from a document data stream by extracting all byte sub-sequences within a sliding window to form an augmented (or bounded-length) string spectrum feature map and using a modified suffix trie data structure (called the coloured generalized suffix tree or CGST) to efficiently store and manipulate the feature map. Using the CGST we are able to efficiently compute the stream's bounded-length sequence spectrum kernel. We compare the performance of two classifier algorithms to categorize the data streams, namely, the SVM and Naive Bayes (NB) classifiers. Experiments have provided good classification performance results on a variety of document byte streams, particularly when using the NB classifier under certain parameter settings. Results indicate that the bounded-length kernel is superior to the standard fixed-length kernel for semi-structured documents.
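One way to read the bounded-length feature map is as the union of fixed-length spectra for every length from 1 up to a bound. The sketch below follows that reading and should be treated as an assumption: it counts contiguous byte substrings only, uses a plain `Counter` instead of the CGST, and the function names are illustrative.

```python
from collections import Counter


def bounded_spectrum(data: bytes, max_len: int) -> Counter:
    """Count every contiguous substring of `data` whose length is
    between 1 and `max_len` bytes inclusive."""
    feats = Counter()
    for k in range(1, max_len + 1):
        for i in range(len(data) - k + 1):
            feats[data[i:i + k]] += 1
    return feats


def bounded_kernel(a: bytes, b: bytes, max_len: int) -> int:
    """Inner product of the two bounded-length count vectors."""
    fa, fb = bounded_spectrum(a, max_len), bounded_spectrum(b, max_len)
    return sum(count * fb[gram] for gram, count in fa.items())
```

Compared with a fixed-length kernel, two streams that share only short patterns still contribute to the kernel value, since the shorter lengths are counted alongside the longest one.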

Pacific Rim International Conference on Artificial Intelligence | 1996

Inducing complex spatial descriptions in two dimensional scenes

Brendan McCane; Terry Caelli; Olivier Y. de Vel

Very few object recognition systems attempt to learn the representations of the objects which they are to recognise. Some common techniques include evidence-based systems and neural network approaches. However, evidence-based systems decouple the unary and relational attributes and therefore expose themselves to the label compatibility problem (where two distinct objects have the same unary and relational relationships, but are structurally different). In this paper we describe an extension to Conditional Rule Generation (CRG), called FCRG (Fuzzy Conditional Rule Generation), how it learns complex spatial relationships, and the implications of the label compatibility problem with respect to 3D object recognition.

Archive | 2003