Mohammad Taher Pilehvar
University of Cambridge
Publication
Featured research published by Mohammad Taher Pilehvar.
meeting of the association for computational linguistics | 2017
Mohammad Taher Pilehvar; Jose Camacho-Collados; Roberto Navigli; Nigel Collier
Lexical ambiguity can impede NLP systems from accurate understanding of semantics. Despite its potential benefits, the integration of sense-level information into NLP systems has remained understudied. By incorporating a novel disambiguation algorithm into a state-of-the-art classification model, we create a pipeline to integrate sense-level information into downstream NLP applications. We show that a simple disambiguation of the input text can lead to consistent performance improvement on multiple topic categorization and polarity detection datasets, particularly when the fine granularity of the underlying sense inventory is reduced and the document is sufficiently large. Our results also point to the need for sense representation research to focus more on in vivo evaluations which target the performance in downstream NLP applications rather than artificial benchmarks.
language resources and evaluation | 2018
Milan Gritta; Mohammad Taher Pilehvar; Nut Limsopatham; Nigel Collier
Geographical data can be obtained by converting place names from free-format text into geographical coordinates. The ability to geo-locate events in textual reports represents a valuable source of information in many real-world applications such as emergency responses, real-time social media geographical event analysis, understanding location instructions in auto-response systems and more. However, geoparsing is still widely regarded as a challenge because of domain language diversity, place name ambiguity, metonymic language and limited leveraging of context as we show in our analysis. Results to date, whilst promising, are on laboratory data and unlike in wider NLP are often not cross-compared. In this study, we evaluate and analyse the performance of a number of leading geoparsers on a number of corpora and highlight the challenges in detail. We also publish an automatically geotagged Wikipedia corpus to alleviate the dearth of (open source) corpora in this domain.
empirical methods in natural language processing | 2016
Mohammad Taher Pilehvar; Nigel Collier
One major deficiency of most semantic representation techniques is that they usually model a word type as a single point in the semantic space, hence conflating all the meanings that the word can have. Addressing this issue by learning distinct representations for individual meanings of words has been the subject of several research studies in the past few years. However, the generated sense representations are either not linked to any sense inventory or are unreliable for infrequent word senses. We propose a technique that tackles these problems by de-conflating the representations of words based on the deep knowledge it derives from a semantic network. Our approach provides multiple advantages in comparison to the past work, including its high coverage and the ability to generate accurate representations even for infrequent word senses. We carry out evaluations on six datasets across two semantic similarity tasks and report state-of-the-art results on most of them.
conference of the european chapter of the association for computational linguistics | 2017
Mohammad Taher Pilehvar; Nigel Collier
The authors gratefully acknowledge the support of the MRC grant No. MR/M025160/1 for PheneBank.
meeting of the association for computational linguistics | 2016
Mohammad Taher Pilehvar; Nigel Collier
The authors gratefully acknowledge the support of the MRC grant No. MR/M025160/1 for PheneBank.
meeting of the association for computational linguistics | 2017
Milan Gritta; Mohammad Taher Pilehvar; Nut Limsopatham; Nigel Collier
Named entities are frequently used in a metonymic manner. They serve as references to related entities such as people and organisations. Accurate identification and interpretation of metonymy can be directly beneficial to various NLP applications, such as Named Entity Recognition and Geographical Parsing. Until now, metonymy resolution (MR) methods mainly relied on parsers, taggers, dictionaries, external word lists and other handcrafted lexical resources. We show how a minimalist neural approach combined with a novel predicate window method can achieve competitive results on the SemEval 2007 task on Metonymy Resolution. Additionally, we contribute a new Wikipedia-based MR dataset called ReLocaR, which is tailored towards locations and addresses previous deficiencies in annotation guidelines.
Archive | 2017
Milan Gritta; Nigel Collier; Nut Limsopatham; Mohammad Taher Pilehvar
Complete supporting/replication data and code for the ACL Publication. The paper was published in August 2017 at www.acl2017.org
Artificial Intelligence | 2016
Jose Camacho-Collados; Mohammad Taher Pilehvar; Roberto Navigli
meeting of the association for computational linguistics | 2017
Jose Camacho-Collados; Mohammad Taher Pilehvar; Nigel Collier; Roberto Navigli
arXiv: Computation and Language | 2018
Jose Camacho-Collados; Mohammad Taher Pilehvar