Lyndon White
University of Western Australia
Publication
Featured research published by Lyndon White.
conference on intelligent text processing and computational linguistics | 2016
Lyndon White; Roberto Togneri; Wei Liu; Mohammed Bennamoun
Many methods have been proposed to generate sentence vector representations, such as recursive neural networks, latent distributed memory models, and the simple sum of word embeddings (SOWE). However, very few methods demonstrate the ability to reverse the process and recover sentences from sentence embeddings. Amongst the many sentence embeddings, SOWE has been shown to maintain semantic meaning, so in this paper we introduce a method for moving from SOWE representations back to the bag of words (BOW) of the original sentences. Recovering the bag of words is a partway step towards recovering the whole sentence, and it has useful theoretical and practical applications of its own. The recovery is done using a greedy algorithm to convert the vector to a bag of words. To our knowledge this is the first such work. It demonstrates qualitatively the ability to recreate the words of sentences from a large corpus based on their sentence embeddings.
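The recovery step described here is, at its core, greedy residual fitting: repeatedly pick the word whose embedding best explains what remains of the sentence vector. A minimal sketch of that idea in Python follows; the function name, brute-force vocabulary scan, and stopping rule are illustrative assumptions, not the authors' actual implementation.

    import numpy as np

    def greedy_sowe_to_bow(target, vocab_embeddings, max_words=30):
        """Greedily select vocabulary words whose embeddings best
        account for a sum-of-word-embeddings (SOWE) sentence vector."""
        residual = target.copy()
        bag = []
        for _ in range(max_words):
            best_word = None
            best_norm = np.linalg.norm(residual)
            # pick the word that most reduces the residual norm
            for word, vec in vocab_embeddings.items():
                norm = np.linalg.norm(residual - vec)
                if norm < best_norm:
                    best_word, best_norm = word, norm
            if best_word is None:
                break  # no remaining word improves the fit; stop
            bag.append(best_word)  # words may repeat: it is a bag
            residual = residual - vocab_embeddings[best_word]
        return bag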
international conference on data mining | 2016
Lyndon White; Roberto Togneri; Wei Liu; Mohammed Bennamoun
Converting a sentence to a meaningful vector representation has uses in many NLP tasks; however, very few methods allow that representation to be restored to a human-readable sentence. Being able to generate sentences from the vector representations demonstrates the level of information maintained by the embedding representation, in this case a simple sum of word embeddings. We introduce such a method for moving from this vector representation back to the original sentences. This is done using a two-stage process: first, a greedy algorithm is utilised to convert the vector to a bag of words; second, a simple probabilistic language model is used to order the words to recover the sentence. To the best of our knowledge this is the first work to demonstrate quantitatively the ability to reproduce text from a large corpus based directly on its sentence embeddings.
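The second stage, ordering a recovered bag of words with a probabilistic language model, can be sketched with a bigram model that scores candidate orderings. The exhaustive permutation search below is purely for illustration (it is only feasible for short sentences); the paper's actual search procedure and language model will differ.

    import math
    from itertools import permutations

    def order_bag(bag, bigram_logprob, start="<s>"):
        """Order a recovered bag of words under a bigram language model
        by scoring every permutation (feasible for short sentences only)."""
        best_order, best_score = list(bag), -math.inf
        for perm in permutations(bag):
            score, prev = 0.0, start
            for word in perm:
                score += bigram_logprob(prev, word)  # log P(word | prev)
                prev = word
            if score > best_score:
                best_order, best_score = list(perm), score
        return best_order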
Archive | 2019
Lyndon White; Roberto Togneri; Wei Liu; Mohammed Bennamoun
This chapter discusses representations for larger structures in natural language. The primary focus is on the sentence level. However, many of the techniques also apply to sub-sentence structures (phrases), and super-sentence structures (documents). The three main types of representations discussed here are: unordered models, such as sum of word embeddings; sequential models, such as recurrent neural networks; and structured models, such as recursive autoencoders.
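The simplest of these, the unordered sum of word embeddings, fits in a few lines. A minimal sketch, assuming pre-tokenised input and an in-vocabulary embedding lookup table (both illustrative assumptions):

    import numpy as np

    def sowe(tokens, embeddings):
        """Unordered sentence representation: the sum of the embeddings
        of the sentence's words; all ordering information is discarded."""
        return np.sum([embeddings[w] for w in tokens], axis=0)

    # Being unordered, permutations map to the same vector, e.g.
    # sowe(["the", "cat", "sat"], emb) == sowe(["sat", "the", "cat"], emb)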
Archive | 2019
Lyndon White; Roberto Togneri; Wei Liu; Mohammed Bennamoun
This chapter covers the crucial machine learning techniques required to understand the remainder of the book: namely neural networks. Readers already familiar with neural networks can freely skip this chapter. Readers interested in a more comprehensive coverage of all aspects of machine learning are referred to the many textbooks on this subject matter.
Archive | 2019
Lyndon White; Roberto Togneri; Wei Liu; Mohammed Bennamoun
This chapter continues from the general introduction to neural networks with a focus on recurrent networks. The recurrent neural network is the most popular neural network approach for working with sequences of dynamic size. As with the prior chapter, readers familiar with RNNs can reasonably skip this. Note that this chapter does not pertain specifically to NLP; however, as NLP tasks are almost always sequential in nature, RNNs are fundamental to many NLP systems.
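What lets an RNN handle dynamic-size sequences is that the same weights are reused at every time step. A minimal sketch of an Elman-style recurrence (the weight names and update rule here are illustrative, not the chapter's notation):

    import numpy as np

    def rnn_encode(xs, W_xh, W_hh, b_h):
        """Minimal Elman-style RNN: identical weights are applied at
        every step, so inputs of any length can be processed."""
        h = np.zeros(W_hh.shape[0])
        for x in xs:  # xs: sequence of input vectors, any length
            h = np.tanh(W_xh @ x + W_hh @ h + b_h)
        return h  # the final hidden state summarises the sequence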
Archive | 2019
Lyndon White; Roberto Togneri; Wei Liu; Mohammed Bennamoun
In this chapter, techniques for representing the multiple meanings of a single word are discussed. This is a growing area, and it is particularly important in languages where polysemous and homonymous words are common. This includes English, but the phenomenon is even more prevalent in Mandarin, for example. The techniques discussed can broadly be classified as lexical word sense representation and as word sense induction. The inductive techniques can be sub-classified as clustering-based or as prediction-based.
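As a rough illustration of the clustering-based inductive approach: represent each occurrence of a target word by a vector of its context, then cluster the occurrences, treating each cluster as an induced sense. The sketch below uses k-means purely as an example; the clustering method, the context representation, and the fixed sense count are all illustrative assumptions.

    import numpy as np
    from sklearn.cluster import KMeans

    def induce_senses(occurrence_vectors, n_senses=3):
        """Clustering-based word sense induction: each occurrence of a
        target word is represented by a context vector (e.g. the mean
        embedding of its surrounding words); clustering the occurrences
        yields one induced sense per cluster."""
        km = KMeans(n_clusters=n_senses, n_init=10)
        labels = km.fit_predict(np.asarray(occurrence_vectors))
        return labels, km.cluster_centers_  # sense per occurrence, sense vectors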
european conference on computer vision | 2018
Naeha Sharif; Lyndon White; Mohammed Bennamoun; Syed Afaq Ali Shah
The automatic evaluation of image descriptions is an intricate task, and it is highly important in the development and fine-grained analysis of captioning systems. Existing metrics to automatically evaluate image captioning systems fail to achieve a satisfactory level of correlation with human judgements at the sentence level. Moreover, these metrics, unlike humans, tend to focus on specific aspects of quality, such as the n-gram overlap or the semantic meaning. In this paper, we present the first learning-based metric to evaluate image captions. Our proposed framework enables us to incorporate both lexical and semantic information into a single learned metric. This results in an evaluator that takes into account various linguistic features to assess the caption quality. The experiments we performed to assess the proposed metric show improvements upon the state of the art in terms of correlation with human judgements and demonstrate its superior robustness to distractions.
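The general shape of a learning-based metric is to extract several quality cues from a candidate caption and learn how to combine them from human judgements, rather than hand-weighting them. The toy sketch below uses two stand-in features and logistic regression; the feature set, learner, and training data here are illustrative placeholders, not the paper's actual model.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def lexical_overlap(candidate, reference):
        """Toy lexical cue: fraction of candidate words in the reference."""
        cand, ref = candidate.split(), set(reference.split())
        return sum(w in ref for w in cand) / max(len(cand), 1)

    def length_ratio(candidate, reference):
        """Toy second cue: candidate-to-reference length ratio."""
        return len(candidate.split()) / max(len(reference.split()), 1)

    def features(candidate, reference):
        return [lexical_overlap(candidate, reference),
                length_ratio(candidate, reference)]

    # Toy training pairs labelled with human judgements (1 = good caption)
    data = [("a dog runs on grass", "a dog running on the grass", 1),
            ("a man rides a horse", "a dog running on the grass", 0)]
    X = np.array([features(c, r) for c, r, _ in data])
    y = np.array([lbl for _, _, lbl in data])
    metric = LogisticRegression().fit(X, y)  # combination is learned, not hand-set

    score = metric.predict_proba(np.array([features(
        "a dog on the grass", "a dog running on the grass")]))[0, 1]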
australasian document computing symposium | 2015
Lyndon White; Roberto Togneri; Wei Liu; Mohammed Bennamoun
Archive | 2019
Lyndon White; Roberto Togneri; Wei Liu; Mohammed Bennamoun
meeting of the association for computational linguistics | 2018
Lyndon White; Roberto Togneri; Wei Liu; Mohammed Bennamoun