Publication


Featured research published by Gábor Berend.


Natural Language Engineering | 2016

Exploiting extra-textual and linguistic information in keyphrase extraction

Gábor Berend

Keyphrases are the most important phrases of a document, which makes them useful for improving natural language processing tasks, including information retrieval, document classification, document visualization, summarization and categorization. Here, we propose a supervised framework augmented by novel extra-textual information derived primarily from Wikipedia. Unlike most other methods relying on Wikipedia, our approach does not require a full textual index of all Wikipedia articles, as we only exploit the category hierarchy and a list of multiword expressions derived from Wikipedia. This approach is not only less resource intensive, but also produces comparable or superior results compared to previous similar works. Our thorough evaluations also suggest that the proposed framework performs consistently well on multiple datasets, matching or even outperforming the results obtained by other state-of-the-art methods. Besides introducing features that incorporate extra-textual information, we also experimented with a novel way of representing features that are derived from the POS tagging of the keyphrase candidates.


Proceedings of the Workshop on Noisy User-generated Text | 2015

USZEGED: Correction Type-sensitive Normalization of English Tweets Using Efficiently Indexed n-gram Statistics

Gábor Berend; Ervin Tasnádi

This paper describes the framework applied by team USZEGED at the “Lexical Normalisation for English Tweets” shared task. Our approach first employs a CRF-based sequence labeling framework to decide the kind of corrections the individual tokens require, then performs the necessary modifications relying on external lexicons and a massive collection of efficiently indexed n-gram statistics from English tweets. Our solution is based on the assumption that from the context of an OOV word, it is possible to reconstruct its IV equivalent, as there are users who use the standard English form of the OOV word within the same context. Our approach achieved an F-score of 0.8052, the second best among the unconstrained submissions, the category our submission also belongs to.
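The assumption stated in the abstract above, that an OOV token can be recovered from in-vocabulary (IV) words seen in the same context, can be illustrated with a toy sketch. This is not the USZEGED system: the function names, the one-word context window and the example tweets are hypothetical, and the CRF step that decides the correction type is omitted entirely.

```python
from collections import Counter


def build_context_index(tweets, vocabulary):
    """Count IV words by their (left neighbour, right neighbour) context.

    Hypothetical helper: a real system would index far larger n-gram
    statistics; here a plain dict of Counters is enough.
    """
    index = {}
    for tweet in tweets:
        tokens = ["<s>"] + tweet.lower().split() + ["</s>"]
        for left, word, right in zip(tokens, tokens[1:], tokens[2:]):
            if word in vocabulary:
                index.setdefault((left, right), Counter())[word] += 1
    return index


def normalize_token(left, oov, right, index):
    """Return the IV word most often seen between `left` and `right`.

    Falls back to the original token when the context was never observed.
    """
    candidates = index.get((left, right))
    if not candidates:
        return oov
    return candidates.most_common(1)[0][0]


# Toy data (assumed for illustration only).
tweets = ["see you tomorrow", "see you tomorrow", "see you later"]
vocabulary = {"see", "you", "tomorrow", "later"}
index = build_context_index(tweets, vocabulary)
print(normalize_token("you", "2moro", "</s>", index))
```

With this toy data the context ("you", end of tweet) was observed with "tomorrow" twice and "later" once, so the sketch picks "tomorrow".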


International Conference on Computational Linguistics | 2014

SZTE-NLP: Aspect level opinion mining exploiting syntactic cues

Viktor Hangya; Gábor Berend; István Varga; Richárd Farkas

In this paper, we introduce our contributions to the SemEval-2014 Task 4 ‐ Aspect Based Sentiment Analysis (Pontiki et al., 2014) challenge. We participated in the aspect term polarity subtask where the goal was to classify opinions related to a given aspect into positive, negative, neutral or conflict classes. To solve this problem, we employed supervised machine learning techniques exploiting a rich feature set. Our feature templates exploited both phrase structure and dependency parses.


International World Wide Web Conferences | 2015

Supervised Prediction of Social Network Links Using Implicit Sources of Information

Ervin Tasnádi; Gábor Berend

In this paper, we introduce a supervised machine learning framework for the link prediction problem. The social network we conducted our empirical evaluation on originates from the restaurant review portal, yelp.com. The proposed framework not only uses the structure of the social network to predict non-existing edges in it, but also makes use of further graphs that were constructed based on implicit information provided in the dataset. The implicit information we relied on includes the language use of the members of the social network and their ratings with respect to the businesses they reviewed. Here, we also investigate the possibility of building supervised learning models to predict social links without relying on features derived from the structure of the social network itself, but based on such implicit information alone. Our empirical results not only revealed that the features derived from different sources of implicit information can be useful on their own, but also that incorporating them in a unified framework has the potential to improve classification results, as the different sources of implicit information can provide independent and useful views about the connectedness of users.
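The idea of combining structural and implicit signals for a candidate user pair can be sketched as a small feature extractor. The feature names, the choice of Jaccard overlap and the toy data below are assumptions made for illustration; the abstract does not specify which features or classifier the authors used.

```python
def jaccard(a, b):
    """Jaccard overlap of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pair_features(u, v, friends, reviewed):
    """Build a feature dict for the candidate link (u, v).

    `friends` maps a user to the set of their friends (structural view);
    `reviewed` maps a user to the set of businesses they reviewed
    (one possible implicit view). Both mappings are hypothetical inputs.
    """
    fu, fv = friends.get(u, set()), friends.get(v, set())
    ru, rv = reviewed.get(u, set()), reviewed.get(v, set())
    return {
        "common_friends": len(fu & fv),
        "friend_jaccard": jaccard(fu, fv),
        "business_jaccard": jaccard(ru, rv),
    }


# Toy data (assumed for illustration only).
friends = {"a": {"c", "d"}, "b": {"c", "e"}}
reviewed = {"a": {"x", "y"}, "b": {"y"}}
features = pair_features("a", "b", friends, reviewed)
print(features)
```

Dropping the first two entries leaves a feature vector built from implicit information alone, which mirrors the second setting the abstract describes. Any supervised classifier could then be trained on such vectors.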


Acta Cybernetica | 2018

ℓ1 Regularization of Word Embeddings for Multi-Word Expression Identification

Gábor Berend

In this paper we compare the effects of applying various state-of-the-art word representation strategies in the task of multi-word expression (MWE) identification. In particular, we analyze the strengths and weaknesses of the usage of ℓ1-regularized sparse word embeddings for identifying MWEs. Our earlier study demonstrated the effectiveness of regularized word embeddings in other sequence labeling tasks, i.e. part-of-speech tagging and named entity recognition, but they have not yet been rigorously evaluated for the identification of MWEs.
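Why an ℓ1 penalty yields sparse vectors can be seen from its proximal operator, soft-thresholding, which sets small coordinates exactly to zero. The sketch below shows only this generic building block; it is not the paper's training procedure, and the function name and the example values are assumptions.

```python
def soft_threshold(vector, lam):
    """Apply the proximal operator of lam * ||.||_1 coordinate-wise.

    Coordinates with magnitude at most `lam` become exactly 0.0;
    the remaining ones are shrunk towards zero by `lam`.
    """
    result = []
    for x in vector:
        if x > lam:
            result.append(x - lam)
        elif x < -lam:
            result.append(x + lam)
        else:
            result.append(0.0)
    return result


# Toy dense vector (assumed for illustration only).
dense = [1.0, -0.25, 0.5, -2.0]
sparse = soft_threshold(dense, 0.5)
print(sparse)  # [0.5, 0.0, 0.0, -1.5]
```

A larger `lam` zeroes out more coordinates, trading reconstruction fidelity for sparsity.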


Recent Advances in Natural Language Processing | 2011

Multiword Expressions and Named Entities in the Wiki50 Corpus

Veronika Vincze; István Nagy T.; Gábor Berend


International Joint Conference on Natural Language Processing | 2011

Opinion Expression Mining by Exploiting Keyphrase Extraction

Gábor Berend


Meeting of the Association for Computational Linguistics | 2011

Detecting Noun Compounds and Light Verb Constructions: a Contrastive Study

Veronika Vincze; István Nagy T.; Gábor Berend


Meeting of the Association for Computational Linguistics | 2010

SZTERGAK : Feature Engineering for Keyphrase Extraction

Gábor Berend; Richárd Farkas


Recent Advances in Natural Language Processing | 2011

Domain-Dependent Identification of Multiword Expressions

István Nagy T.; Veronika Vincze; Gábor Berend

Collaboration


Dive into Gábor Berend's collaborations.

Top Co-Authors


Márton Makrai

Hungarian Academy of Sciences
