Publication


Featured research published by Amit Kirschenbaum.


Meeting of the Association for Computational Linguistics | 2009

Lightly Supervised Transliteration for Machine Translation

Amit Kirschenbaum; Shuly Wintner

We present a Hebrew-to-English transliteration method in the context of a machine translation system. Our method uses machine learning to determine which terms are to be transliterated rather than translated. The training corpus for this purpose includes only positive examples, acquired semi-automatically. Our classifier eliminates more than 38% of the errors made by a baseline method. The identified terms are then transliterated. We present an SMT-based transliteration model trained on a parallel corpus extracted from Wikipedia using a fairly simple method that requires minimal knowledge. The correct result is produced in more than 76% of the cases, and in 92% of the instances it is among the top-5 results. We also demonstrate a small improvement in the performance of a Hebrew-to-English MT system that uses our transliteration module.


Second Language Research | 2017

The Role of Orthotactic Probability in Incidental and Intentional Vocabulary Acquisition L1 and L2.

Denisa Bordag; Amit Kirschenbaum; Maria Rogahn; Erwin Tschirner

Four experiments were conducted to examine the role of orthotactic probability, i.e. the sequential letter probability, in the early stages of vocabulary acquisition by adult native speakers and advanced learners of German. The results show different effects of orthotactic probability in incidental and intentional vocabulary acquisition: whereas low orthotactic probability contributed positively to the incidental acquisition of novel word meanings in the first language (L1), high orthotactic probability positively affected intentional learning in the second language (L2). The results are discussed in the context of the following concepts: (1) triggering the establishment of a new representation, (2) noticing of new lexemes during reading, and (3) vocabulary size of the L1 and L2 mental lexicons.
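
To make "sequential letter probability" concrete, here is a minimal sketch that estimates letter-bigram probabilities from a toy word list and scores a word by its average log bigram probability. The word list, the add-one smoothing, and the function names are assumptions for illustration only; the abstract does not say how the study computed orthotactic probability.

```python
import math
from collections import Counter


def train_bigram_model(words):
    """Count letter bigrams and their left contexts, using '#' as a word boundary."""
    bigrams, contexts = Counter(), Counter()
    alphabet = {"#"}
    for word in words:
        padded = "#" + word.lower() + "#"
        alphabet.update(padded)
        for left, right in zip(padded, padded[1:]):
            bigrams[(left, right)] += 1
            contexts[left] += 1
    return bigrams, contexts, len(alphabet)


def orthotactic_score(word, model):
    """Average log probability of the word's letter bigrams.

    Add-one smoothing over the letters seen in training is a simplification
    chosen for this sketch so that unseen bigrams do not yield log(0).
    """
    bigrams, contexts, alphabet_size = model
    padded = "#" + word.lower() + "#"
    pairs = list(zip(padded, padded[1:]))
    log_prob = sum(
        math.log((bigrams[(left, right)] + 1) / (contexts[left] + alphabet_size))
        for left, right in pairs
    )
    return log_prob / len(pairs)


if __name__ == "__main__":
    # Hypothetical toy lexicon; a real estimate would need a large corpus.
    model = train_bigram_model(["haus", "maus", "laus", "raus", "baum"])
    for candidate in ("kaus", "xqzk"):
        print(candidate, round(orthotactic_score(candidate, model), 3))
```

A pseudoword built from frequent letter sequences of the toy lexicon scores higher than one built from sequences that never occur in it, which is the contrast between high and low orthotactic probability.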


SLSP'13 Proceedings of the First International Conference on Statistical Language and Speech Processing | 2013

Unsupervised segmentation for different types of morphological processes using multiple sequence alignment

Amit Kirschenbaum

The aim of unsupervised, knowledge-free morphological segmentation is to identify the boundaries between morphs in the words of a given language without relying on any knowledge source about that language. This paper describes a segmentation method that draws on previous approaches based on both semantic and orthographic similarity to identify morphologically related words. Using a version of Multiple Sequence Alignment originally applied in bioinformatics, the method extracts both concatenative and non-concatenative (e.g. introflection and circumfixation) morphological patterns and can thus handle languages of different morphological types as well as non-dominant morphological processes within languages of a particular predominant morphological type.
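
The idea of aligning related word forms can be illustrated with a pairwise global alignment (Needleman-Wunsch), a common building block in sequence alignment. This is only a sketch under stated assumptions: it aligns two words rather than many, the scoring values are arbitrary, and it is not the alignment variant used in the paper.

```python
def align(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align two strings; gaps are shown as '-'."""
    n, m = len(a), len(b)
    # score[i][j] is the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)

    # Trace back from the bottom-right cell to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + (
            match if a[i - 1] == b[j - 1] else mismatch
        ):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    # German participle with a ge-...-t circumfix next to its infinitive.
    top, bottom = align("gesagt", "sagen")
    print(top)
    print(bottom)
```

Columns where the two forms share letters point to candidate stem material, while gapped or mismatched columns at the edges point to candidate affixes, including material on both sides of the stem.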


International Conference on Semantic Systems | 2017

IDOL: Comprehensive & Complete LOD Insights

Ciro Baron Neto; Dimitris Kontokostas; Amit Kirschenbaum; Gustavo Publio; Diego Esteves; Sebastian Hellmann

Over the last decade, we have observed a steadily increasing number of RDF datasets made available on the web of data. The decentralized nature of the web, however, makes it hard to identify all these datasets. Moreover, when downloadable data distributions are discovered, the available metadata is often insufficient to describe the datasets properly, which limits their usefulness and reuse. In this paper, we describe an attempt to exhaustively identify the whole linked open data cloud by harvesting metadata from multiple sources, providing insights about duplicated data and the general quality of the available metadata. This was only possible by using a probabilistic data structure called a Bloom filter. Finally, we published a dump file containing metadata which can further be used to enrich existing datasets.
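
For readers unfamiliar with the data structure named above, the following is a minimal Bloom filter sketch. The bit-array size, the number of hash functions, the SHA-256-based hashing, and the example triples are all assumptions made for illustration; the abstract does not describe how the authors configured or applied their filter.

```python
import hashlib


class BloomFilter:
    """Set membership with no false negatives and a tunable false-positive rate."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive several bit positions by salting one hash function with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )


if __name__ == "__main__":
    seen = BloomFilter()
    # Hypothetical triples standing in for data from one harvested dataset.
    seen.add("<ex:Leipzig> <ex:locatedIn> <ex:Germany>")
    seen.add("<ex:Haifa> <ex:locatedIn> <ex:Israel>")
    # A triple from another dataset that was already added is always reported.
    print("<ex:Leipzig> <ex:locatedIn> <ex:Germany>" in seen)
```

Because the filter stores only bits, it can flag probable duplicates across very large collections in little memory, at the cost of occasionally reporting an item as present when it is not.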


Studies in Second Language Acquisition | 2017

Semantic Representation of Newly Learned L2 Words and Their Integration in the L2 Lexicon.

Denisa Bordag; Amit Kirschenbaum; Maria Rogahn; Andreas Opitz; Erwin Tschirner

The present semantic priming study explores the integration of newly learnt L2 German words into the L2 semantic network of advanced learners of German. It provides additional evidence in support of earlier findings reporting semantic inhibition effects for emergent representations. An inhibitory mechanism is proposed that temporarily decreases the resting levels of the representations with which the new representation is linked and thus enables its selection despite its low resting level.


Studies in Second Language Acquisition | 2016

Incidental Acquisition of Grammatical Features During Reading in L1 and L2

Denisa Bordag; Amit Kirschenbaum; Andreas Opitz; Maria Rogahn; Erwin Tschirner

The present study explores the initial stages of incidental acquisition of two grammatical properties of verbs (subcategorization and [ir]regularity) during reading in first language (L1) and second language (L2) German using an adjusted self-paced reading paradigm. The results indicate that L1 speakers are superior to L2 speakers in the incidental acquisition of grammatical knowledge (experiments on subcategorization), except when the new knowledge interferes with previously acquired knowledge and mechanisms (experiments on [ir]regularity): Although both populations performed equally well regarding the acquisition of the subcategorization of verbs from the input (i.e., whether the verbs are transitive or intransitive), they differed with respect to the regularity status of new verbs. L1 speakers (in contrast to L2 learners) seem to disprefer irregularly conjugated verb forms in general, irrespective of their conjugation in the previous input. The results further show that the syntactic complexity of the context and morphological markedness positively affect the incidental acquisition of new words in the L2, triggering learners’ shift of attention from the text level to the word level.


Conference on Intelligent Text Processing and Computational Linguistics | 2015

To Split or Not, and If so, Where? Theoretical and Empirical Aspects of Unsupervised Morphological Segmentation

Amit Kirschenbaum

The purpose of this paper is twofold. First, it offers an overview of the challenges encountered by unsupervised, knowledge-free methods when analysing language data (with a focus on morphology). Second, it presents a system for unsupervised morphological segmentation comprising two complementary methods that can handle a broad range of morphological processes. The first method collects words that are similar in both distribution and form and applies Multiple Sequence Alignment to derive a segmentation of these words. The second method then analyses less frequent words using the segmentation results of the first method. The challenges presented in the theoretical part are illustrated with the workings and output of the introduced unsupervised system and accompanied by suggestions on how to address them in future work.


Bilingualism: Language and Cognition | 2015

Incidental acquisition of new words during reading in L2: Inference of meaning and its integration in the L2 mental lexicon – ERRATUM

Denisa Bordag; Amit Kirschenbaum; Erwin Tschirner; Andreas Opitz

A novel combination of several experimental and non-experimental paradigms was applied to explore the initial stages of incidental vocabulary acquisition (IVA) during reading in German as a second language (L2). The results show that syntactic complexity of the context positively affects incidental acquisition of new words, triggering the learners' shift of attention from the text level to the word level. A subsequent semantic priming task revealed that the new words establish associations with semantically related representations in the L2 mental lexicon after just three previous occurrences and without any consolidation period. The semantic inhibition effect for the new words (contrary to semantic facilitation for known L2 words), however, indicates that the memory traces of the new semantic representation are still very weak and that their retrieval is probably hindered by stronger semantically related representations that have much lower activation thresholds and higher potential for being selected.


International Conference of the German Society for Computational Linguistics and Language Technology | 2017

Investigating the Morphological Complexity of German Named Entities: The Case of the GermEval NER Challenge

Bettina Klimek; Markus Ackermann; Amit Kirschenbaum; Sebastian Hellmann

This paper presents a detailed analysis of Named Entity Recognition (NER) in German, based on the performance of systems that participated in the GermEval 2014 shared task. It focuses on the role of morphology in named entities, an issue too often neglected in the NER task. We introduce a measure to characterize the morphological complexity of German named entities and apply it to the subset of named entities identified by all systems, and to the subset of named entities that none of the systems recognized. We find that morphologically complex named entities are more prevalent in the latter set than in the former, a finding which should be taken into account in the future development of NER methods. In addition, we provide an analysis of issues found in the GermEval gold standard annotation, which also affected the performance measurements of the different systems.


Applications of Natural Language to Data Bases | 2012

Labeling queries for a people search engine

Antje Schlaf; Amit Kirschenbaum; Robert Remus; Thomas Efer

We present methods for labeling queries for a specialized search engine: a people search engine. We propose several methods of varying complexity, from simple probabilistic ones to Conditional Random Fields. All methods are evaluated on a manually annotated corpus of queries submitted to a people search engine. Additionally, we analyze this corpus with respect to typical search patterns and their distribution.
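
As an illustration of what a "simple probabilistic" query labeler could look like, the sketch below assigns each query token the label it received most often in annotated training queries. The label set (FIRST_NAME, LAST_NAME, CITY), the training examples, and the function names are hypothetical; the abstract does not specify the paper's labels or models.

```python
from collections import Counter, defaultdict


def train_token_labeler(labeled_queries):
    """Count how often each token carries each label in annotated queries."""
    counts = defaultdict(Counter)
    label_totals = Counter()
    for query in labeled_queries:
        for token, label in query:
            counts[token.lower()][label] += 1
            label_totals[label] += 1
    # Unseen tokens fall back to the most frequent label overall.
    fallback = label_totals.most_common(1)[0][0]
    return counts, fallback


def label_query(query, model):
    """Label each whitespace-separated token with its most frequent training label."""
    counts, fallback = model
    labeled = []
    for token in query.split():
        seen = counts.get(token.lower())
        label = seen.most_common(1)[0][0] if seen else fallback
        labeled.append((token, label))
    return labeled


if __name__ == "__main__":
    # Hypothetical annotated queries; a real corpus would be far larger.
    training = [
        [("anna", "FIRST_NAME"), ("schmidt", "LAST_NAME"), ("leipzig", "CITY")],
        [("peter", "FIRST_NAME"), ("meier", "LAST_NAME")],
        [("anna", "FIRST_NAME"), ("meier", "LAST_NAME")],
    ]
    model = train_token_labeler(training)
    print(label_query("Anna Meier Leipzig", model))
```

This baseline labels every token independently. A sequence model such as a Conditional Random Field can additionally use the neighbouring tokens and labels, which matters when the same token can be a first name in one query and a last name in another.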
