Mohammad Taher Pilehvar


Publication

Featured research published by Mohammad Taher Pilehvar.

Meeting of the Association for Computational Linguistics | 2017

Towards a Seamless Integration of Word Senses into Downstream NLP Applications

Mohammad Taher Pilehvar; Jose Camacho-Collados; Roberto Navigli; Nigel Collier

Lexical ambiguity can prevent NLP systems from accurately understanding semantics. Despite its potential benefits, the integration of sense-level information into NLP systems has remained understudied. By incorporating a novel disambiguation algorithm into a state-of-the-art classification model, we create a pipeline to integrate sense-level information into downstream NLP applications. We show that a simple disambiguation of the input text can lead to consistent performance improvement on multiple topic categorization and polarity detection datasets, particularly when the fine granularity of the underlying sense inventory is reduced and the document is sufficiently large. Our results also point to the need for sense representation research to focus more on in vivo evaluations, which target performance in downstream NLP applications rather than on artificial benchmarks.

Language Resources and Evaluation | 2018

What’s missing in geographical parsing?

Milan Gritta; Mohammad Taher Pilehvar; Nut Limsopatham; Nigel Collier

Geographical data can be obtained by converting place names from free-format text into geographical coordinates. The ability to geo-locate events in textual reports represents a valuable source of information in many real-world applications such as emergency responses, real-time social media geographical event analysis, understanding location instructions in auto-response systems and more. However, geoparsing is still widely regarded as a challenge because of domain language diversity, place name ambiguity, metonymic language and limited leveraging of context, as we show in our analysis. Results to date, whilst promising, are on laboratory data and, unlike in wider NLP, are often not cross-compared. In this study, we evaluate and analyse the performance of a number of leading geoparsers on a number of corpora and highlight the challenges in detail. We also publish an automatically geotagged Wikipedia corpus to alleviate the dearth of (open source) corpora in this domain.

Empirical Methods in Natural Language Processing | 2016

De-Conflated Semantic Representations

Mohammad Taher Pilehvar; Nigel Collier

One major deficiency of most semantic representation techniques is that they usually model a word type as a single point in the semantic space, hence conflating all the meanings that the word can have. Addressing this issue by learning distinct representations for individual meanings of words has been the subject of several research studies in the past few years. However, the generated sense representations are either not linked to any sense inventory or are unreliable for infrequent word senses. We propose a technique that tackles these problems by de-conflating the representations of words based on the deep knowledge it derives from a semantic network. Our approach provides multiple advantages in comparison to past work, including its high coverage and the ability to generate accurate representations even for infrequent word senses. We carry out evaluations on six datasets across two semantic similarity tasks and report state-of-the-art results on most of them.

Conference of the European Chapter of the Association for Computational Linguistics | 2017

Inducing Embeddings for Rare and Unseen Words by Leveraging Lexical Resources

Mohammad Taher Pilehvar; Nigel Collier

The authors gratefully acknowledge the support of the MRC grant No. MR/M025160/1 for PheneBank.

Meeting of the Association for Computational Linguistics | 2016

Improved Semantic Representation for Domain-Specific Entities

Mohammad Taher Pilehvar; Nigel Collier

The authors gratefully acknowledge the support of the MRC grant No. MR/M025160/1 for PheneBank.

Meeting of the Association for Computational Linguistics | 2017

Vancouver Welcomes You! Minimalist Location Metonymy Resolution

Milan Gritta; Mohammad Taher Pilehvar; Nut Limsopatham; Nigel Collier

Named entities are frequently used in a metonymic manner. They serve as references to related entities such as people and organisations. Accurate identification and interpretation of metonymy can be directly beneficial to various NLP applications, such as Named Entity Recognition and Geographical Parsing. Until now, metonymy resolution (MR) methods mainly relied on parsers, taggers, dictionaries, external word lists and other handcrafted lexical resources. We show how a minimalist neural approach combined with a novel predicate window method can achieve competitive results on the SemEval 2007 task on Metonymy Resolution. Additionally, we contribute a new Wikipedia-based MR dataset called ReLocaR, which is tailored towards locations and addresses previous deficiencies in annotation guidelines.

Archive | 2017