Publication


Featured research published by Uri Zernik.


Information Processing and Management | 1989

Information extraction and text summarization using linguistic knowledge acquisition

Lisa F. Rau; Paul S. Jacobs; Uri Zernik

Storing and accessing texts in a conceptual format has a number of advantages over traditional document retrieval methods. A conceptual format facilitates natural language access to text information. It can support imprecise and inexact queries, conceptual information summarization, and, ultimately, document translation. The lack of extensive linguistic coverage is the major barrier to extracting useful information from large bodies of text. Current natural language processing (NLP) systems do not have rich enough lexicons to cover all the important words and phrases in extended texts. Two methods of overcoming this limitation are (1) to apply a text processing strategy that is tolerant of unknown words and gaps in linguistic knowledge, and (2) to acquire lexical information automatically from the texts. These two methods have been implemented in a prototype intelligent information retrieval system called SCISOR (System for Conceptual Information Summarization, Organization and Retrieval). This article describes the text processing, language acquisition, and summarization components of SCISOR.
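The two methods lend themselves to a compact illustration. Below is a minimal Python sketch of method (1), gap-tolerant processing, feeding method (2), acquisition; the LEXICON contents and the tolerant_analyze function are hypothetical stand-ins for the purpose of the example, not SCISOR's actual components.

    # Hypothetical word-to-concept lexicon; SCISOR's real lexicon is far richer.
    LEXICON = {"storing": "STORE-EVENT", "texts": "TEXT-OBJECT",
               "queries": "ASK-EVENT"}

    def tolerant_analyze(tokens):
        """Map known tokens to concepts; skip unknown words instead of
        failing, and collect them as candidates for lexical acquisition."""
        concepts, unknowns = [], []
        for tok in tokens:
            concept = LEXICON.get(tok.lower())
            if concept is not None:
                concepts.append(concept)
            else:
                unknowns.append(tok)   # gap noted, processing continues
        return concepts, unknowns

    concepts, unknowns = tolerant_analyze("Storing texts supports queries".split())
    print(concepts)   # ['STORE-EVENT', 'TEXT-OBJECT', 'ASK-EVENT']
    print(unknowns)   # ['supports'] -- fed back into lexical acquisition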


International Conference on Computational Linguistics | 1990

Tagging for learning: collecting thematic relations from corpus

Uri Zernik; Paul S. Jacobs

Recent work in text analysis has suggested that data on words that frequently occur together reveal important information about text content. Co-occurrence relations can serve two main purposes in language processing. First, the statistics of co-occurrence have been shown to produce accurate results in syntactic analysis. Second, the way that words appear together can help in assigning thematic roles in semantic interpretation. This paper discusses a method for collecting co-occurrence data, acquiring lexical relations from the data, and applying these relations to semantic analysis.
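As a rough illustration of the collection step, the Python sketch below counts window-based co-occurrences and scores word pairs by pointwise mutual information; the whitespace tokenization, window size, and PMI scoring are assumptions made for the example, not the paper's exact procedure.

    import math
    from collections import Counter

    def cooccurrence_pmi(sentences, window=5):
        """Collect co-occurrence counts within a sliding window and score
        each ordered word pair by pointwise mutual information (PMI)."""
        word_counts, pair_counts = Counter(), Counter()
        for sent in sentences:
            tokens = sent.lower().split()
            word_counts.update(tokens)
            for i, w in enumerate(tokens):
                for v in tokens[i + 1 : i + window]:
                    pair_counts[(w, v)] += 1
        n_words = sum(word_counts.values())
        n_pairs = sum(pair_counts.values())
        return {
            (w, v): math.log2((c / n_pairs)
                              / ((word_counts[w] / n_words)
                                 * (word_counts[v] / n_words)))
            for (w, v), c in pair_counts.items()
        }

    scores = cooccurrence_pmi(["the board adjourned the meeting",
                               "the committee adjourned the meeting early"])
    print(scores[("adjourned", "meeting")])   # high PMI: a recurring relation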


Meeting of the Association for Computational Linguistics | 1985

Towards a Self-Extending Lexicon

Uri Zernik; Michael G. Dyer

The problem of manually modifying the lexicon appears with any natural language processing program. Ideally, a program should be able to acquire new lexical entries from context, the way people learn. We address the problem of acquiring entire phrases, specifically figurative phrases, through augmenting a phrasal lexicon. Facilitating such a self-extending lexicon involves (a) disambiguation---selection of the intended phrase from a set of matching phrases, (b) robust parsing---comprehension of partially-matching phrases, and (c) error analysis---use of errors in forming hypotheses about new phrases. We have designed and implemented a program called RINA which uses demons to implement functional-grammar principles. RINA receives new figurative phrases in context and through the application of a sequence of failure-driven rules, creates and refines both the patterns and the concepts which hold syntactic and semantic information about phrases.
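The error-analysis step can be pictured with a small Python sketch: when a stored pattern fails against a new example, a failure-driven rule generalizes the offending position. The refine function and the phrase are hypothetical illustrations, not RINA's demon machinery.

    def refine(pattern, example_tokens):
        """Failure-driven repair: keep slots and words the example confirms;
        replace a fixed word the example contradicts with an open slot."""
        refined = []
        for w in pattern:
            if w.startswith("?") or w in example_tokens:
                refined.append(w)
            else:
                refined.append("?slot")   # failure here: generalize
        return refined

    # "put his foot down" fails against "she put her foot down",
    # so the possessive position is opened up.
    print(refine(["?x", "put", "his", "foot", "down"],
                 "she put her foot down".split()))
    # ['?x', 'put', '?slot', 'foot', 'down']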


International Conference on Computational Linguistics | 1988

Default reasoning in natural language processing

Uri Zernik; Allen L. Brown

In natural language, as in other computational task domains, it is important to operate with default assumptions. First, many constraints required for constraint propagation are initially unspecified. Second, in highly ambiguous tasks such as text analysis, ambiguity can be reduced by considering more plausible scenarios first. Default reasoning is problematic for first-order logic when allowing non-monotonic inferences. Whereas in monotonic logic facts can only be asserted, in non-monotonic logic a system must be kept consistent even as previously assumed defaults are retracted. Non-monotonicity is pervasive in natural language due to the serial nature of utterances. When reading text left to right, default assumptions made early in a sentence must often be withdrawn as reading proceeds. Truth maintenance, which accounts for non-monotonic inferences, can resolve this issue and address important linguistic phenomena. In this paper we describe how, in NMG (Non-Monotonic Grammar), a truth maintenance system that monitors a logic parser can significantly enhance the parser's capabilities.
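The retraction behavior can be made concrete with a toy Python sketch; the Beliefs class and the garden-path example are hypothetical illustrations of default assertion and withdrawal, not the NMG or truth-maintenance machinery itself.

    class Beliefs:
        """Toy belief store: facts carry a justification, and defaults
        can be withdrawn when later input contradicts them."""
        def __init__(self):
            self.facts = {}   # fact -> "default" or "observed"

        def assume(self, fact):
            self.facts.setdefault(fact, "default")

        def observe(self, fact, contradicts=()):
            for old in contradicts:   # retract defeated defaults
                if self.facts.get(old) == "default":
                    del self.facts[old]
            self.facts[fact] = "observed"

    b = Beliefs()
    # Reading "The old man ..." left to right: assume the common reading.
    b.assume("man/NOUN")
    # "... the boats" arrives; only a verb reading survives, so the
    # earlier default is retracted rather than left inconsistent.
    b.observe("man/VERB", contradicts=["man/NOUN"])
    print(b.facts)   # {'man/VERB': 'observed'}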


Meeting of the Association for Computational Linguistics | 1986

Encoding and Acquiring Meanings for Figurative Phrases

Michael G. Dyer; Uri Zernik

Here we address the problem of mapping phrase meanings into their conceptual representations. Figurative phrases are pervasive in human communication, yet they are difficult to explain theoretically. In fact, the ability to handle idiosyncratic behavior of phrases should be a criterion for any theory of lexical representation. Due to the huge number of such phrases in the English language, phrase representation must be amenable to parsing, generation, and also to learning. In this paper we demonstrate a semantic representation which facilitates, for a wide variety of phrases, both learning and parsing.


Human Language Technology | 1990

Generic text processing: a progress report

Paul S. Jacobs; George R. Krupka; Susan W. McRoy; Lisa F. Rau; Norman K. Sondheimer; Uri Zernik

A generic natural language system, without modification, can effectively analyze an arbitrary input at least to the level of word sense tagging. Considerable research has addressed the transportability of natural language systems, but not generic text processing capabilities. For example, previous DARPA-sponsored work [1, 2] produced transportable interfaces to database systems. Each new application of these interfaces generally required modifications to lexicons, new semantic knowledge bases, and other specialized features. The most that natural language text processing systems have accomplished has been the parsing of arbitrary text, without any real semantic analysis.


International Conference on Computational Linguistics | 1988

Language acquisition: coping with lexical gaps

Uri Zernik

Computer programs have so far not fared well in modeling language acquisition. For one thing, learning methodology applicable in general domains does not readily carry over to the linguistic domain. For another, the linguistic representation used by language processing systems is not geared to learning. We introduced a new linguistic representation, the Dynamic Hierarchical Phrasal Lexicon (DHPL) [Zernik 88], to facilitate language acquisition. From this, a language learning model was implemented in the program RINA, which enhances its own lexical hierarchy by processing examples in context. We identified two tasks: first, how linguistic concepts are acquired from training examples and organized in a hierarchy; this task was discussed in previous papers [Zernik 87]. Second, we show in this paper how a lexical hierarchy is used in predicting new linguistic concepts. Thus, the program does not stall even in the presence of a lexical unknown, and a hypothesis can be produced to cover that lexical gap.
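A toy Python sketch of the prediction step: when a phrase is missing a concept, the lookup climbs the lexical hierarchy and adopts the most specific ancestor's concept as a hypothesis. The HIERARCHY contents are hypothetical illustrations, not DHPL's actual entries.

    # Hypothetical fragment of a phrasal hierarchy; specific entries point
    # to more general parents that carry a concept.
    HIERARCHY = {
        "take <obj> up": {"parent": "take <obj> <particle>"},
        "take <obj> <particle>": {"parent": "take <obj>",
                                  "concept": "TRANSFER-ACT (direction open)"},
        "take <obj>": {"concept": "GENERIC-TAKING"},
    }

    def hypothesize(phrase):
        """Climb the hierarchy until an ancestor supplies a concept; that
        concept becomes the hypothesis covering the lexical gap."""
        entry = HIERARCHY.get(phrase)
        while entry is not None:
            if "concept" in entry:
                return entry["concept"]
            entry = HIERARCHY.get(entry.get("parent"))
        return None   # no known ancestor: the gap stays open

    print(hypothesize("take <obj> up"))   # 'TRANSFER-ACT (direction open)'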


International Conference on Computational Linguistics | 1992

Closed yesterday and closed minds: asking the right questions of the corpus to distinguish thematic from sentential relations

Uri Zernik

Collocation-based tagging and bracketing programs have attained promising results. Yet, they have not arrived at the stage where they could be used as pre-processors for full-fledged parsing: accuracy is still not high enough. To improve accuracy, it is necessary to investigate the points where statistical data is being misinterpreted, leading to incorrect results. In this paper we investigate the inaccuracy that is injected when a pre-processor relies solely on collocations and blurs the distinction between two separate relations: thematic relations and sentential relations. Thematic relations are word pairs, not necessarily adjacent (e.g., adjourn a meeting), that encode information at the concept level. Sentential relations, on the other hand, concern adjacent word pairs that form a noun group; e.g., preferred stock is a noun group that must be identified as such at the syntactic level. Blurring the difference between these two phenomena contributes to errors in tagging pairs such as expressed concerns, a verb-noun construct, as opposed to preferred stocks, an adjective-noun construct. Although both relations are manifested in the corpus as high mutual-information collocations, they possess different properties and need to be separated. In our method, we distinguish between the two cases by asking additional questions of the corpus. By definition, thematic relations take on further variations in the corpus: expressed concerns (a thematic relation) also appears as concerns expressed, expressing concerns, express his concerns, etc. On the other hand, preferred stock (a sentential relation) does not take any such syntactic variations. We show how this method impacts preprocessing and parsing, and we provide empirical results based on the analysis of an 80-million-word corpus.
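The corpus test reduces to a simple check, sketched below in Python: a pair attested in several syntactic variants is classified as thematic, while a pair frozen in one adjacent order is treated as sentential. The counts, variant lists, and threshold are hypothetical illustrations, not the paper's exact procedure.

    def is_thematic(corpus_counts, variants, threshold=2):
        """Classify a pair as thematic if the corpus attests at least
        `threshold` distinct syntactic variants of it; otherwise treat
        it as a frozen, sentential (noun-group) relation."""
        attested = sum(1 for v in variants if corpus_counts.get(v, 0) > 0)
        return attested >= threshold

    counts = {"expressed concerns": 41, "concerns expressed": 9,
              "expressing concerns": 5, "preferred stock": 120}
    print(is_thematic(counts, ["expressed concerns", "concerns expressed",
                               "expressing concerns",
                               "express his concerns"]))           # True
    print(is_thematic(counts, ["preferred stock", "stock preferred",
                               "preferring stock"]))               # False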


Machine Translation | 1990

Lexical acquisition: Where is the semantics?

Uri Zernik

In this paper, which motivates lexical acquisition, we explain why existing lexicons are not complete and describe the impact of lexical gaps. We survey acquisition methods relative to their required resources and show that learning algorithms must be designed so that they rely on resources that are technologically available. We discuss in this light the availability and accessibility of machine-readable dictionaries and corpora. Finally we investigate the aspect of lexical semantics, addressing the question, What is the resource from which lexical semantics can be acquired? The answer is not clear because semantics is nowhere to be found explicitly. Thus, we resort to learning semantics by bootstrapping from secondary clues.


Machine Learning: A Guide to Current Research | 1986

Language acquisition: learning phrases in context

Uri Zernik; Michael G. Dyer

The lexicon provides the main linguistic database in any natural language processing program, associating words with their corresponding concepts. Even further significance is attached to the lexicon by the phrasal approach, where the phrasal lexicon contains entire phrases rather than single words. Consequently, parsing concerns the interaction of phrases with general language patterns (e.g., passive voice, infinitives) rather than the interaction of single words. A lexical entry, or a phrase, is a pattern-concept pair representing both the linguistic pattern and the conceptualization of the phrase. This approach to language processing is appropriate for learning due to the modularity of knowledge, as represented by lexical entries.
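A minimal Python sketch of a pattern-concept pair and a crude matcher; the dataclass, the LEXICON entries, and the subsequence matching are hypothetical illustrations of the representation described above, not RINA's actual structures.

    from dataclasses import dataclass

    @dataclass
    class PhrasalEntry:
        pattern: str   # linguistic pattern with open slots (?x, ...)
        concept: str   # conceptualization the pattern maps to

    LEXICON = [
        PhrasalEntry("?x kick the bucket", "DIE(actor=?x)"),
        PhrasalEntry("?x spill the beans", "REVEAL-SECRET(actor=?x)"),
    ]

    def match(utterance):
        """Return the concept of the first entry whose fixed words appear
        in order in the utterance (a crude stand-in for real matching)."""
        tokens = utterance.lower().split()
        for entry in LEXICON:
            fixed = [w for w in entry.pattern.split()
                     if not w.startswith("?")]
            it = iter(tokens)
            if all(w in it for w in fixed):   # ordered subsequence test
                return entry.concept
        return None

    print(match("John will kick the bucket"))   # DIE(actor=?x)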
