Lisa Pearl
University of California, Irvine
Publications
Featured research published by Lisa Pearl.
Language Learning and Development | 2009
Lisa Pearl; Jeffrey Lidz
We identify three components of any learning theory: the representations, the learner's data intake, and the learning algorithm. With these in mind, we model the acquisition of the English anaphoric pronoun one in order to identify necessary constraints for successful acquisition, and the nature of those constraints. Whereas previous modeling efforts have succeeded by using a domain-general learning algorithm that implicitly restricts the data intake to be a subset of the input, we show that the same kind of domain-general learning algorithm fails when it does not restrict the data intake. We argue that the necessary data intake restrictions are domain-specific in nature. Thus, while a domain-general algorithm can be quite powerful, a successful learner must also rely on domain-specific learning mechanisms when learning anaphoric one.
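A minimal sketch of the kind of contrast at issue, assuming a single belief value updated online over toy data categories (the categories, update rule, and input proportions below are invented for illustration and are not the paper's actual model):

```python
# Illustrative only: an online learner tracking p = belief that anaphoric 'one'
# picks out the larger N' constituent ('red bottle') rather than the bare noun
# ('bottle'), with and without a filter on which data it takes in.
import random

def run_learner(data, restrict_intake, lr=0.05):
    p = 0.5                               # initial belief in the N' hypothesis
    for d in data:
        if restrict_intake and d == "ambiguous":
            continue                      # restricted intake: ignore ambiguous examples
        if d == "unambiguous":            # context makes the N' reading unavoidable
            p += lr * (1 - p)
        else:                             # ambiguous data, compatible with either reading
            p -= lr * p
    return p

random.seed(0)
# hypothetical input mix: unambiguous data are rare, as in child-directed speech
data = random.choices(["unambiguous", "ambiguous"], weights=[1, 20], k=500)
print("unrestricted intake:", round(run_learner(data, restrict_intake=False), 2))
print("restricted intake:  ", round(run_learner(data, restrict_intake=True), 2))
```

The point of the sketch is only that the same update rule reaches different end states depending on which data are allowed into the intake.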
Language Acquisition | 2013
Lisa Pearl; Jon Sprouse
The induction problems facing language learners have played a central role in debates about the types of learning biases that exist in the human brain. Many linguists have argued that some of the learning biases necessary to solve these language induction problems must be both innate and language-specific (i.e., the Universal Grammar (UG) hypothesis). Though there have been several recent high-profile investigations of the necessary learning bias types for different linguistic phenomena, the UG hypothesis is still the dominant assumption for a large segment of linguists due to the lack of studies addressing central phenomena in generative linguistics. To address this, we focus on how to learn constraints on long-distance dependencies, also known as syntactic island constraints. We use formal acceptability judgment data to identify the target state of learning for syntactic island constraints and conduct a corpus analysis of child-directed data to affirm that there does appear to be an induction problem when learning these constraints. We then create a computational learning model that implements a learning strategy capable of successfully learning the pattern of acceptability judgments observed in formal experiments, based on realistic input. Importantly, this model does not explicitly encode syntactic constraints. We discuss learning biases required by this model in detail as they highlight the potential problems posed by syntactic island effects for any theory of syntactic acquisition. We find that, although the proposed learning strategy requires fewer complex and domain-specific components than previous theories of syntactic island learning, it still raises difficult questions about how the specific biases required by syntactic islands arise in the learner. We discuss the consequences of these results for theories of acquisition and theories of syntax.
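A sketch of the statistical core of a learner of this kind (hedged: the node labels, input paths, counts, and smoothing below are invented for illustration), where a wh-dependency is represented as the sequence of phrase-structure container nodes spanning the gap and is scored by the product of smoothed trigram probabilities estimated from child-directed input:

```python
# Illustrative sketch: score wh-dependencies by container-node trigram probabilities.
from collections import Counter

def trigrams(path):
    padded = ["start"] + path + ["end"]
    return [tuple(padded[i:i + 3]) for i in range(len(padded) - 2)]

# hypothetical container-node paths extracted from dependencies in the input
input_paths = [["IP"], ["IP", "VP"], ["IP", "VP"],
               ["IP", "VP", "CP-that", "IP", "VP"]]
counts = Counter(t for p in input_paths for t in trigrams(p))
total = sum(counts.values())

def score(path, alpha=0.5):
    """Smoothed product of trigram probabilities; very low scores pattern with island violations."""
    s = 1.0
    for t in trigrams(path):
        s *= (counts[t] + alpha) / (total + alpha * (len(counts) + 1))
    return s

print(score(["IP", "VP", "CP-that", "IP", "VP"]))      # attested-looking path
print(score(["IP", "VP", "CP-whether", "IP", "VP"]))   # unattested trigrams -> far lower score
```

No island constraint is written into the scorer; island-violating dependencies simply accumulate trigrams that are rare or absent in the input.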
Conference of the Association for Machine Translation in the Americas | 2002
Bonnie J. Dorr; Lisa Pearl; Rebecca Hwa; Nizar Habash
The frequent occurrence of divergences (structural differences between languages) presents a great challenge for statistical word-level alignment. In this paper, we introduce DUSTer, a method for systematically identifying common divergence types and transforming an English sentence structure to bear a closer resemblance to that of another language. Our ultimate goal is to enable more accurate alignment and projection of dependency trees in another language without requiring any training on dependency-tree data in that language. We present an empirical analysis comparing the complexities of performing word-level alignments with and without divergence handling. Our results suggest that our approach facilitates word-level alignment, particularly for sentence pairs containing divergences.
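A toy sketch of one divergence rewrite of the kind such a system applies (not DUSTer's actual rules or data structures): head swapping, as in English "swim across X" versus Spanish "cruzar X nadando", where the path preposition is promoted to head and the manner verb demoted to a modifier:

```python
# Toy illustration: rewrite a small English dependency tree so its head/modifier
# structure better matches a head-swapping language such as Spanish.
def head_swap(tree):
    pp = next((c for c in tree["children"] if c["rel"] == "path-pp"), None)
    if pp is None:
        return tree                       # rule does not apply
    demoted_verb = {"word": tree["word"], "rel": "manner-mod", "children": []}
    other_kids = [c for c in tree["children"] if c is not pp]
    return {"word": pp["word"], "rel": "root",
            "children": pp["children"] + other_kids + [demoted_verb]}

english = {"word": "swam", "rel": "root", "children": [
    {"word": "John", "rel": "subj", "children": []},
    {"word": "across", "rel": "path-pp", "children": [
        {"word": "river", "rel": "obj", "children": []}]}]}

print(head_swap(english))   # 'across' (cf. 'cruzo') heads the tree; 'swam' (cf. 'nadando') modifies it
```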
Literary and Linguistic Computing | 2012
Lisa Pearl; Mark Steyvers
We describe a new supervised machine learning approach for detecting authorship deception, a specific type of authorship attribution task particularly relevant for cybercrime forensic investigations, and demonstrate its validity on two case studies drawn from realistic online data sets. The core of our approach involves identifying uncharacteristic behavior for an author, based on a writeprint extracted from unstructured text samples of the author's writing. The writeprints used here involve stylometric features and content features derived from topic models, an unsupervised approach for identifying relevant keywords that relate to the content areas of a document. One innovation of our approach is to transform the writeprint feature values into a representation that individually balances characteristic and uncharacteristic traits of an author, and we subsequently apply a Sparse Multinomial Logistic Regression classifier to this novel representation. Our method yields high accuracy for authorship deception detection on the two case studies, confirming its utility.
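A minimal sketch of the pipeline shape described above, assuming a handful of stylometric features only (topic-model content features are omitted) and using scikit-learn's L1-penalized logistic regression as a stand-in for Sparse Multinomial Logistic Regression; the features, documents, and labels are invented:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def stylometric_features(text):
    words = text.split()
    return np.array([
        np.mean([len(w) for w in words]),       # mean word length
        text.count(",") / max(len(words), 1),   # comma rate
        len(set(words)) / max(len(words), 1),   # type/token ratio
    ])

def deviation(doc_feats, mu, sd):
    """How uncharacteristic each feature value is for this author (z-score magnitude)."""
    return np.abs((doc_feats - mu) / (sd + 1e-6))

# toy writeprint: feature statistics from known samples of the author's writing
author_docs = ["I think, on balance, that the model is right.",
               "We argue, carefully, that the data support this view."]
X_author = np.stack([stylometric_features(d) for d in author_docs])
mu, sd = X_author.mean(axis=0), X_author.std(axis=0)

# toy training data: 0 = genuinely this author, 1 = deceptive attribution
docs = ["I think, again, that this conclusion is sound.",
        "lol no way thats totally wrong imo"]
labels = [0, 1]
X = np.stack([deviation(stylometric_features(d), mu, sd) for d in docs])

clf = LogisticRegression(penalty="l1", solver="liblinear").fit(X, labels)
print(clf.predict(X))
```

The classifier operates on deviation scores rather than raw feature values, which is the "balancing characteristic and uncharacteristic traits" idea in miniature.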
Linguistic Inquiry | 2009
Ivano Caponigro; Lisa Pearl
In this squib, we argue that the wh-words where, when, and how are inherently nominal, rather than prepositional, though they are NPs with a peculiar property: they are always base-generated as the complement of a preposition (P), which is often silent. Our main evidence comes from the behavior of embedded noninterrogative wh-clauses known as free relatives (FRs). We show that this behavior can be easily accounted for if where, when, and how are inherently nominal. We bring further empirical support to our proposal by extending it to wh-interrogatives and by discussing the similarities between FRs and the NPs that have been called bare-NP adverbs or adverbial NPs (Emonds 1976, 1987, Larson 1985, McCawley 1988). We also show that potential alternative accounts that make different assumptions about the nature of where, when, and how are unable to account for the data.
Journal of Semantics | 2012
Ivano Caponigro; Lisa Pearl; Neon Brooks; David Barner
Plural definite descriptions (e.g. the things on the plate) and free relative clauses (e.g. what is on the plate) have been argued to share the same semantic properties, despite their syntactic differences. Specifically, both have been argued to be nonquantificational expressions referring to the maximal element of a given set (e.g. the set of things on the contextually salient plate). We provide experimental support for this semantic analysis with the first reported simultaneous investigation of children’s interpretation of both constructions, highlighting how experimental methods can inform semantic theory. A Truth-Value Judgment task and an Act-Out task show that children know that the two constructions differ from quantificational nominals (e.g. all the things on the plate) very early on (4 years old). Children also acquire the adult interpretation of both constructions at the same time, around 6‐7 years old. This happens despite major differences in the frequency of these constructions, according to our corpus study of children’s linguistic input. We discuss possible causes for this late emergence. We also argue that our experimental findings contribute to the recent theoretical debate on the correct semantic analysis of free relatives.
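A brief sketch of the maximality analysis at issue, in standard notation with a Link-style sigma (maximality) operator; this is a common way of writing the analysis the paper tests, not a quotation from it:

```latex
% Both constructions denote the maximal plural individual satisfying the description:
\[
[\![\text{the things on the plate}]\!] \;=\; \sigma x.\,\mathrm{things}(x) \wedge \mathrm{on\_the\_plate}(x)
\]
\[
[\![\text{what is on the plate}]\!] \;=\; \sigma x.\,\mathrm{on\_the\_plate}(x)
\]
% whereas a quantificational nominal like 'all the things on the plate' distributes a
% predicate P over the atomic parts of that maximal individual:
\[
\forall y\,[\, y \leq \sigma x.\,\mathrm{thing\_on\_plate}(x) \wedge \mathrm{atom}(y) \rightarrow P(y)\,]
\]
```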
Meeting of the Association for Computational Linguistics | 2001
Rebecca Green; Lisa Pearl; Bonnie J. Dorr; Philip Resnik
This paper describes automatic techniques for mapping 9611 entries in a database of English verbs to WordNet senses. The verbs were initially grouped into 491 classes based on syntactic features. Mapping these verbs into WordNet senses provides a resource that supports disambiguation in multilingual applications such as machine translation and cross-language information retrieval. Our techniques make use of (1) a training set of 1791 disambiguated entries, representing 1442 verb entries from 167 classes; (2) word sense probabilities, from frequency counts in a tagged corpus; (3) semantic similarity of WordNet senses for verbs within the same class; (4) probabilistic correlations between WordNet data and attributes of the verb classes. The best results achieved 72% precision and 58% recall, versus a lower bound of 62% precision and 38% recall for assigning the most frequently occurring WordNet sense, and an upper bound of 87% precision and 75% recall for human judgment.
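An illustrative sketch of the kind of evidence combination described above (not the paper's system; it requires NLTK with the WordNet data downloaded): score each WordNet sense of a verb by combining a tagged-corpus frequency prior with its semantic similarity to the senses of other verbs in the same syntactic class.

```python
from nltk.corpus import wordnet as wn

def sense_scores(verb, classmates):
    scores = {}
    for syn in wn.synsets(verb, pos=wn.VERB):
        # prior: tagged-corpus frequency of this sense's lemma for the verb
        prior = sum(l.count() for l in syn.lemmas() if l.name() == verb) + 1
        # class evidence: best similarity to any sense of each classmate, averaged
        sims = []
        for other in classmates:
            others = wn.synsets(other, pos=wn.VERB)
            if others:
                sims.append(max(syn.path_similarity(o) or 0.0 for o in others))
        class_evidence = sum(sims) / len(sims) if sims else 0.0
        scores[syn.name()] = prior * (0.1 + class_evidence)
    return scores

# e.g., disambiguate 'run' using members of a hypothetical motion-verb class
best = sorted(sense_scores("run", ["walk", "jog", "sprint"]).items(),
              key=lambda kv: -kv[1])[:3]
print(best)
```

The weighting constant and the way the two signals are multiplied are placeholders; the point is only that frequency and within-class similarity jointly pick the sense.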
Cognitive Science | 2015
Lawrence Phillips; Lisa Pearl
The informativity of a computational model of language acquisition is directly related to how closely it approximates the actual acquisition task, sometimes referred to as the model's cognitive plausibility. We suggest that though every computational model necessarily idealizes the modeled task, an informative language acquisition model can aim to be cognitively plausible in multiple ways. We discuss these cognitive plausibility checkpoints generally and then apply them to a case study in word segmentation, investigating a promising Bayesian segmentation strategy. We incorporate cognitive plausibility by using an age-appropriate unit of perceptual representation, evaluating the model output in terms of its utility, and incorporating cognitive constraints into the inference process. Our more cognitively plausible model shows a beneficial effect of cognitive constraints on segmentation performance. One interpretation of this effect is as a synergy between the naive theories of language structure that infants may have and the cognitive constraints that limit the fidelity of their inference processes, where less accurate inference approximations are better when the underlying assumptions about how words are generated are less accurate. More generally, these results highlight the utility of incorporating cognitive plausibility more fully into computational models of language acquisition.
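A compact sketch of the scoring at the heart of a Bayesian unigram segmenter over syllables of this general kind (hedged: the lexicon counts, parameter values, and generative assumptions below are invented for illustration; the paper's models and inference procedures are more sophisticated):

```python
# Illustrative only: score candidate segmentations of a syllable sequence, where a
# word is either reused from the current lexicon or generated fresh at a cost that
# grows with its length in syllables. Online learning would update `counts` after
# each utterance rather than batch-processing the corpus.

def base_prob(word, p_syll=1/50, p_stop=0.5):
    """Probability of generating a brand-new word made of these syllables."""
    return (p_syll ** len(word)) * p_stop * (1 - p_stop) ** (len(word) - 1)

def word_prob(word, counts, total, alpha=20.0):
    return (counts.get(word, 0) + alpha * base_prob(word)) / (total + alpha)

def seg_prob(words, counts, total):
    p = 1.0
    for w in words:
        p *= word_prob(w, counts, total)
    return p

# hypothetical lexicon accumulated from earlier utterances (words are syllable tuples)
counts = {("the",): 30, ("ki", "ti"): 8, ("dog", "gi"): 5}
total = sum(counts.values())

candidates = [
    [("the",), ("ki", "ti")],     # 'the kitty'
    [("the", "ki"), ("ti",)],     # 'theki ty'
    [("the", "ki", "ti")],        # the whole utterance as one word
]
for c in candidates:
    print(c, seg_prob(c, counts, total))   # the familiar-word segmentation scores highest
```

Using syllables rather than phonemes as the unit, and updating counts incrementally per utterance, are examples of the cognitive plausibility checkpoints discussed above.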
Proceedings of the 5th Workshop on Cognitive Aspects of Computational Language Learning (CogACLL) | 2014
Lawrence Phillips; Lisa Pearl
Statistical learning has been proposed as one of the earliest strategies infants could use to segment words out of their native language because it does not rely on language-specific cues that must be derived from existing knowledge of the words in the language. Statistical word segmentation strategies using Bayesian inference have been shown to be quite successful for English (Goldwater et al. 2009), even when cognitively inspired processing constraints are integrated into the inference process (Pearl et al. 2011, Phillips & Pearl 2012). Here we test this kind of strategy on child-directed speech from seven languages to evaluate its effectiveness cross-linguistically, with the idea that a viable strategy should succeed in each case. We demonstrate that Bayesian inference is indeed a viable cross-linguistic strategy, provided the goal is to identify useful units of the language, which can range from sub-word morphology to whole words to meaningful word combinations.
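A small sketch of one common way segmentation output is scored against gold-standard words when comparing across languages (token precision, recall, and F-score over word spans; this is a standard convention in the segmentation literature, not necessarily the paper's exact metric):

```python
def token_scores(predicted, gold):
    """predicted/gold: lists of utterances, each a list of words (strings)."""
    def spans(words):
        out, i = set(), 0
        for w in words:
            out.add((i, i + len(w)))
            i += len(w)
        return out
    tp = fp = fn = 0
    for p_utt, g_utt in zip(predicted, gold):
        p, g = spans(p_utt), spans(g_utt)
        tp += len(p & g); fp += len(p - g); fn += len(g - p)
    prec, rec = tp / (tp + fp), tp / (tp + fn)
    return prec, rec, 2 * prec * rec / (prec + rec)

gold      = [["the", "kitty"], ["look", "at", "the", "doggy"]]
predicted = [["the", "kitty"], ["lookat", "the", "doggy"]]   # 'lookat' undersegmented
print(token_scores(predicted, gold))
```

Note that an "error" like lookat may still count as a useful unit in the broader sense discussed above (a meaningful word combination), which is why evaluation relative to the learner's goal matters.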
Journal of Speech, Language, and Hearing Research | 2015
Lisa Pearl; Jon Sprouse
PURPOSE: Given the growing prominence of computational modeling in the acquisition research community, we present a tutorial on how to use computational modeling to investigate learning strategies that underlie the acquisition process. This is useful for understanding both typical and atypical linguistic development.
METHOD: We provide a general overview of why modeling can be a particularly informative tool and some general considerations when creating a computational acquisition model. We then review a concrete example of a computational acquisition model for complex structural knowledge referred to as syntactic islands. This includes an overview of syntactic island knowledge, a precise definition of the acquisition task being modeled, the modeling results, and how to meaningfully interpret those results in a way that is relevant for questions about knowledge representation and the learning process.
CONCLUSIONS: Computational modeling is a powerful tool that can be used to understand linguistic development. The general approach presented here can be used to investigate any acquisition task and any learning strategy, provided both are precisely defined.