Yashar Mehdad | Researchain

Publication

Featured research published by Yashar Mehdad.

meeting of the association for computational linguistics | 2009

Automatic Cost Estimation for Tree Edit Distance Using Particle Swarm Optimization

Yashar Mehdad

Recently, there has been growing interest in working with tree-structured data in different applications and domains such as computational biology and natural language processing. Moreover, many applications in computational linguistics require the computation of similarities over pairs of syntactic or semantic trees. In this context, Tree Edit Distance (TED) has been widely used for many years. However, one of the main constraints of this method is the need to tune the cost of edit operations, which makes it difficult, and sometimes very challenging, to apply to complex problems. In this paper, we propose an original method to estimate and optimize the operation costs in TED, applying the Particle Swarm Optimization algorithm. Our experiments on Recognizing Textual Entailment show the success of this method in automatic estimation, rather than manual assignment, of edit costs.
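The idea of estimating edit costs with Particle Swarm Optimization can be illustrated with a minimal sketch. It is not the authors' implementation: a token-level weighted edit distance stands in for tree edit distance, the four toy pairs are invented for illustration, and the names `weighted_edit_distance`, `error_rate`, and `pso`, as well as the swarm parameters, are assumptions.

```python
import random

def weighted_edit_distance(a, b, c_ins, c_del, c_sub):
    """Token-level edit distance with tunable operation costs
    (a simple stand-in for tree edit distance)."""
    m, n = len(a), len(b)
    d = [[0.0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        d[i][0] = d[i - 1][0] + c_del
    for j in range(1, n + 1):
        d[0][j] = d[0][j - 1] + c_ins
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            sub = 0.0 if a[i - 1] == b[j - 1] else c_sub
            d[i][j] = min(d[i - 1][j] + c_del,      # delete a[i-1]
                          d[i][j - 1] + c_ins,      # insert b[j-1]
                          d[i - 1][j - 1] + sub)    # match or substitute
    return d[m][n]

def error_rate(params, pairs):
    """Fraction of pairs misclassified when 'entailment' is predicted
    whenever the distance is at or below the threshold."""
    c_ins, c_del, c_sub, threshold = params
    wrong = 0
    for text, hyp, label in pairs:
        dist = weighted_edit_distance(text, hyp, c_ins, c_del, c_sub)
        wrong += (dist <= threshold) != label
    return wrong / len(pairs)

def pso(objective, dim, bounds=(0.0, 5.0), n_particles=15, n_iters=40,
        inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Basic global-best PSO that minimizes `objective` inside box bounds."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    for _ in range(n_iters):
        for i in range(n_particles):
            for k in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][k] = (inertia * vel[i][k]
                             + c1 * r1 * (pbest[i][k] - pos[i][k])
                             + c2 * r2 * (gbest[k] - pos[i][k]))
                # Clamp the new position to the search box.
                pos[i][k] = min(hi, max(lo, pos[i][k] + vel[i][k]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val

# Toy (text, hypothesis, entails) pairs, invented for illustration only.
pairs = [
    ("the cat sat on the mat".split(), "the cat sat".split(), True),
    ("a man is playing a guitar".split(), "a man is playing".split(), True),
    ("the cat sat".split(), "the dog sat on the mat".split(), False),
    ("a man is playing a guitar".split(), "a woman is singing a song".split(), False),
]

best, best_err = pso(lambda p: error_rate(p, pairs), dim=4)
print("costs (ins, del, sub) and threshold:", best, "error:", best_err)
```

Each particle is a candidate vector of insertion, deletion, and substitution costs plus a decision threshold, and the fitness is the classification error on labeled pairs. In a real setting the objective would be computed with tree edit distance over a development set.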

web search and data mining | 2017

Lightweight Multilingual Entity Extraction and Linking

Aasish Pappu; Roi Blanco; Yashar Mehdad; Amanda Stent; Kapil Thadani

Text analytics systems often rely heavily on detecting and linking entity mentions in documents to knowledge bases for downstream applications such as sentiment analysis, question answering and recommender systems. A major challenge for this task is to be able to accurately detect entities in new languages with limited labeled resources. In this paper we present an accurate, lightweight, multilingual named entity recognition (NER) and linking (NEL) system. The contributions of this paper are three-fold: 1) lightweight named entity recognition with competitive accuracy; 2) candidate entity retrieval that uses search click-log data and entity embeddings to achieve high precision with a low memory footprint; and 3) efficient entity disambiguation. Our system achieves state-of-the-art performance on TAC KBP 2013 multilingual data and on English AIDA CoNLL data.

congress of the italian association for artificial intelligence | 2009

Towards Extensible Textual Entailment Engines: The EDITS Package

Matteo Negri; Milen Kouylekov; Bernardo Magnini; Yashar Mehdad; Elena Cabrio

This paper presents the first release of EDITS, an open-source software package for recognizing Textual Entailment developed by FBK-irst. The main contributions of EDITS are: i) providing a basic framework for a distance-based approach to the task, ii) providing a highly customizable environment to experiment with different algorithms, iii) allowing for easy extensions and integrations with new algorithms and resources. The system's main features are described, together with experiments over different datasets showing its potential in terms of tuning and adaptation capabilities.

meeting of the association for computational linguistics | 2009

Optimizing Textual Entailment Recognition Using Particle Swarm Optimization

Yashar Mehdad; Bernardo Magnini

This paper introduces a new method to improve the tree edit distance approach to textual entailment recognition, using particle swarm optimization. Currently, one of the main constraints of recognizing textual entailment using tree edit distance is the need to tune the cost of edit operations, which is a difficult and challenging task in dealing with the entailment problem and datasets. We estimate the cost of edit operations in the tree edit distance algorithm automatically, in order to improve the results for textual entailment. By automatically estimating the optimal values of the operation costs over all RTE development datasets, we demonstrate a significant improvement in accuracy on the test sets.

international symposium on wikis and open collaboration | 2012

CoSyne: synchronizing multilingual wiki content

Amit Bronner; Matteo Negri; Yashar Mehdad; Angela Fahrni; Christof Monz

CoSyne is a content synchronization system for assisting users and organizations involved in the maintenance of multilingual wikis. The system allows users to explore the diversity of multilingual content using a monolingual view. It provides suggestions for content modification based on additional or more specific information found in other language versions, and enables seamless integration of automatically translated sentences while giving users the flexibility to edit, correct and control eventual changes to the wiki page. To support these tasks, CoSyne employs state-of-the-art machine translation and natural language processing techniques.

empirical methods in natural language processing | 2011

Divide and Conquer: Crowdsourcing the Creation of Cross-Lingual Textual Entailment Corpora

Matteo Negri; Luisa Bentivogli; Yashar Mehdad; Danilo Giampiccolo; Alessandro Marchetti

north american chapter of the association for computational linguistics | 2010

Towards Cross-Lingual Textual Entailment

Yashar Mehdad; Matteo Negri; Marcello Federico

meeting of the association for computational linguistics | 2017

DocTag2Vec: An Embedding Based Multi-label Learning Approach for Document Tagging

Sheng Chen; Akshay Soni; Aasish Pappu; Yashar Mehdad

Tagging news articles or blog posts with relevant tags from a collection of predefined ones is referred to as document tagging in this work. Accurate tagging of articles can benefit several downstream applications such as recommendation and search. In this work, we propose a novel yet simple approach called DocTag2Vec to accomplish this task. We substantially extend Word2Vec and Doc2Vec, two popular models for learning distributed representation of words and documents. In DocTag2Vec, we simultaneously learn the representation of words, documents, and tags in a joint vector space during training, and employ the simple k-nearest neighbor search to predict tags for unseen documents. In contrast to previous multi-label learning methods, DocTag2Vec directly deals with raw text instead of a provided feature vector, and in addition, enjoys advantages like the learning of tag representation, and the ability to handle newly created tags. To demonstrate the effectiveness of our approach, we conduct experiments on several datasets and show promising results against state-of-the-art methods.
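The nearest-neighbor prediction step described for DocTag2Vec can be sketched roughly as follows. This is only an illustration under stated assumptions: the three-dimensional vectors are made up rather than learned, cosine similarity is an assumed choice of metric, and `cosine` and `predict_tags` are hypothetical helper names, not part of any released code.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0

def predict_tags(doc_vec, tag_vecs, k=2):
    """Return the k tags whose embeddings lie closest to the document
    embedding in the shared vector space."""
    ranked = sorted(tag_vecs,
                    key=lambda tag: cosine(doc_vec, tag_vecs[tag]),
                    reverse=True)
    return ranked[:k]

# Made-up tag embeddings and a made-up embedding for an unseen document.
tag_vecs = {
    "sports": [0.9, 0.1, 0.0],
    "finance": [0.0, 0.9, 0.2],
    "politics": [0.1, 0.3, 0.9],
}
doc_vec = [0.8, 0.3, 0.1]

print(predict_tags(doc_vec, tag_vecs, k=2))
```

Because tags live in the same space as documents, a newly created tag only needs an embedding to become a candidate in this search, which matches the abstract's point about handling new tags.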

joint conference on lexical and computational semantics | 2012