Oliver Hellwig | Researchain

Publication

Featured research published by Oliver Hellwig.

Sanskrit Computational Linguistics | 2009

SanskritTagger: A Stochastic Lexical and POS Tagger for Sanskrit

Oliver Hellwig

SanskritTagger is a stochastic tagger for unpreprocessed Sanskrit text. The tagger tokenises text and performs part-of-speech tagging using a Markov model. Parameters for these processes are estimated from a manually annotated corpus that currently comprises approximately 1,500,000 words. This article sketches the tagging process, reports the results of tagging a few short passages of Sanskrit text and describes further improvements of the program.

Proceedings of the 3rd International Symposium on Sanskrit Computational Linguistics | 2008

Extracting Dependency Trees from Sanskrit Texts

Oliver Hellwig

In this paper, I describe a hybrid dependency tree parser for Sanskrit sentences improving on a purely lexical parsing approach through simple syntactic rules and grammatical information. The performance of the parser is demonstrated on a group of sentences from epic literature.

Language Technology for Cultural Heritage | 2011

Adapting NLP Tools and Frame-Semantic Resources for the Semantic Analysis of Ritual Descriptions

Nils Reiter; Oliver Hellwig; Anette Frank; Irina Gossmann; Borayin Larios; Julio Rodrigues; Britta D. Zeller

In this paper we investigate the use of standard natural language processing (NLP) tools and annotation methods for processing linguistic data from ritual science, which is concerned with the study of structure and variance of rituals. The work is embedded in an interdisciplinary project that addresses this study by applying empirical and quantitative computational linguistic analysis techniques to ritual descriptions from Indian rituals. We present motivation and prospects of such a computational approach to ritual structure research and sketch the overall project research plan. In particular, we motivate the choice of frame semantics as a theoretical framework for the semantic analysis of rituals. We discuss the special characteristics of the textual data and examine several domain adaptation strategies in order to use standard NLP resources and tools on the ritual domain. We also report on our workflows and methods for semi-automatic semantic annotation, which is used as a basis for the extraction of event chains. We close with some preliminary investigations on how to uncover regularities and differences of rituals.

Literary and Linguistic Computing | 2009

A chronometric approach to Indian alchemical literature

Oliver Hellwig

Indian alchemy, a branch of traditional Indian medicine (Āyurveda), has produced a corpus of texts that are difficult to date using regular philological techniques. This article describes a contents-based computational method that is capable of calculating the relative chronology of these texts. Central parts of alchemical literature are encoded in a language model that can be understood by a computer and then compared with an alignment algorithm. Phylogenetic trees derived from these alignments show regularities in the ordering of alchemical texts, and these may be interpreted as temporal patterns. Processing these patterns with a minimization algorithm, we are able to compute a relative chronology of the corpus, which is largely consistent with results obtained using traditional philological techniques.

International Sanskrit Computational Linguistics Symposium | 2010

Performance of a Lexical and POS Tagger for Sanskrit

Oliver Hellwig

Due to the phonetic, morphological, and lexical complexity of Sanskrit, the automatic analysis of this language is a real challenge in the area of natural language processing. The paper describes a series of tests that were performed to assess the accuracy of the tagging program SanskritTagger. To our knowledge, it offers the first reliable benchmark data for evaluating the quality of taggers for Sanskrit using an unrestricted dictionary and texts from different domains. Based on a detailed analysis of the test results, the paper points out possible directions for future improvements of statistical tagging procedures for Sanskrit.

Literary and Linguistic Computing | 2010

Etymological trends in the Sanskrit vocabulary

Oliver Hellwig

The article examines how the etymological composition of the Sanskrit lexicon is influenced by time and whether this composition can be used to date Sanskrit texts automatically. For this purpose, statistical tests are applied to a corpus of lexically analyzed texts. Results reported in the article may contribute to the diachronic lexicography of Sanskrit and help to develop computational methods for analyzing anonymous and undated Sanskrit texts.

Literary and Linguistic Computing | 2014