Christopher Dozier
Thomson Reuters
Publications
Featured research published by Christopher Dozier.
Language Resources and Evaluation | 2010
Christopher Dozier; Ravikumar Kondadadi; Marc Light; Arun Vachher; Sriharsha Veeramachaneni; Ramdev Wudali
Named entities in text are persons, places, companies, etc. that are explicitly mentioned in text using proper nouns. The process of finding named entities in a text and classifying them to a semantic type is called named entity recognition. Resolution of named entities is the process of linking a mention of a name in text to a pre-existing database entry. This grounds the mention in something analogous to a real-world entity. For example, a mention of a judge named Mary Smith might be resolved to a database entry for a specific judge of a specific district of a specific state. This recognition and resolution of named entities can be leveraged in a number of ways, including providing hypertext links to information stored about a particular judge: their education, who appointed them, their other case opinions, etc. This paper discusses named entity recognition and resolution in legal documents such as US case law, depositions, pleadings, and other trial documents. The types of entities include judges, attorneys, companies, jurisdictions, and courts. We outline three methods for named entity recognition: lookup, context rules, and statistical models. We then describe an actual system for finding named entities in legal text and evaluate its accuracy. Similarly, for resolution, we discuss our blocking techniques, our resolution features, and the supervised and semi-supervised machine learning techniques we employ for the final matching.
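The recognize-then-resolve pipeline described in this abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `JUDGES` records, the single "Judge First Last" context rule, blocking on last name, and the hand-written agreement score. The paper itself uses supervised and semi-supervised learning for the final matching step, which this sketch replaces with a simple count of agreeing features.

```python
import re
from typing import Dict, List, Optional

# Hypothetical database of judge records (all values are made up).
JUDGES: List[Dict[str, str]] = [
    {"id": "J1", "first": "Mary", "last": "Smith", "state": "MN"},
    {"id": "J2", "first": "Mark", "last": "Smith", "state": "MN"},
    {"id": "J3", "first": "Mary", "last": "Jones", "state": "TX"},
]

# Context rule (assumed): the word "Judge" followed by two capitalized words.
JUDGE_RULE = re.compile(r"\bJudge\s+([A-Z][a-z]+)\s+([A-Z][a-z]+)")


def recognize_judges(text: str) -> List[Dict[str, str]]:
    """Recognition step: find judge mentions using the context rule."""
    return [{"first": m.group(1), "last": m.group(2)}
            for m in JUDGE_RULE.finditer(text)]


def block(mention: Dict[str, str]) -> List[Dict[str, str]]:
    """Blocking step: keep only records that share the mention's last name."""
    return [rec for rec in JUDGES if rec["last"] == mention["last"]]


def resolve(mention: Dict[str, str], doc_state: str) -> Optional[str]:
    """Resolution step: pick the blocked candidate with the most agreeing
    features, or None when no candidate agrees on anything beyond the block."""
    best_id, best_score = None, 0
    for rec in block(mention):
        score = int(rec["first"] == mention["first"]) + int(rec["state"] == doc_state)
        if score > best_score:
            best_id, best_score = rec["id"], score
    return best_id


mentions = recognize_judges("The motion was heard by Judge Mary Smith on Monday.")
print(mentions, resolve(mentions[0], doc_state="MN"))
```

Blocking keeps the comparison set small: only records sharing the last name are scored, so the cost of resolution grows with the size of the block and not with the size of the whole database.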
International Conference on Artificial Intelligence and Law | 2007
Christopher Dozier; Ravi Kondadadi; Khalid Al-Kofahi; Mark Chaudhary; Xi S. Guo
Medical terms occur across a wide variety of legal, medical, and news corpora. Documents containing these terms are of particular interest to legal professionals operating in such fields as medical malpractice, personal injury, and product liability. This paper describes a novel method of tagging medical terms in legal, medical, and news text that is very fast and also has high recall and precision. To date, most research in medical term spotting has been confined to medical text and has approached the problem by extracting noun phrases from sentences and mapping them to a list of medical concepts via a fuzzy lookup. The medical term tagging described in this paper relies on a fast finite state machine that finds within sentences the longest contiguous sets of words associated with medical terms in a medical term authority file, converts word sets into medical term hash keys, and looks up medical concept IDs associated with the hash keys. Additionally, our system relies on a probabilistic term classifier that uses local context to disambiguate terms being used in a medical sense from terms being used in a non-medical sense. Our method is two orders of magnitude faster than an approach based on noun phrase extraction and has better precision and recall for terms pertaining to injuries, diseases, drugs, medical procedures, and medical devices. The methods presented here have been implemented and are the core engines for a Thomson West product called the Medical Litigator. Thus far, the Medical Litigator has processed over 100 million documents and generated over 165 million tags representing approximately 164,000 unique medical concepts. The resulting system achieved recall between 0.79 and 0.93 and precision between 0.94 and 0.97, depending on the document type.
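The longest-match lookup described in this abstract can be sketched roughly as follows. This is not the paper's finite state machine: it is a plain greedy dictionary scan, and the authority entries, the concept IDs, and the order-insensitive `term_key` scheme are all assumptions made for illustration. The probabilistic context classifier is omitted.

```python
from typing import Dict, List, Tuple


def term_key(words: List[str]) -> str:
    """Build an order-insensitive, lowercase key for a set of words
    (an assumed key scheme, not necessarily the one used in the paper)."""
    return " ".join(sorted(w.lower() for w in words))


# Hypothetical authority file: term -> concept ID (IDs are made up).
AUTHORITY: Dict[str, str] = {
    "myocardial infarction": "C001",
    "infarction": "C002",
    "fracture of femur": "C003",
}
KEY_TO_CONCEPT = {term_key(t.split()): cid for t, cid in AUTHORITY.items()}
MAX_LEN = max(len(t.split()) for t in AUTHORITY)


def tag_terms(sentence: str) -> List[Tuple[str, str]]:
    """Scan left to right, preferring the longest run of contiguous tokens
    whose key appears in the authority file."""
    tokens = sentence.split()  # whitespace tokenization only; punctuation is not handled
    tags: List[Tuple[str, str]] = []
    i = 0
    while i < len(tokens):
        for n in range(min(MAX_LEN, len(tokens) - i), 0, -1):
            key = term_key(tokens[i:i + n])
            if key in KEY_TO_CONCEPT:
                tags.append((" ".join(tokens[i:i + n]), KEY_TO_CONCEPT[key]))
                i += n  # skip past the matched span
                break
        else:
            i += 1  # no term starts at this token
    return tags


print(tag_terms("patient suffered a myocardial infarction last year"))
```

Trying the longest window first is what keeps "myocardial infarction" from being tagged as the shorter term "infarction". Each lookup is a constant-time hash probe, so the scan needs no sentence parsing, which is the kind of saving over noun phrase extraction that the abstract reports.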
International Conference on Artificial Intelligence and Law | 2003
Christopher Dozier; Peter Jackson; Xi S. Guo; Mark Chaudhary; Yohendran Arumainayagam
This paper describes how an online directory of expert witnesses was created from jury verdict and settlement documents using text mining techniques. We have created an expert witness directory that contains over 100,000 expert profiles, based on approximately 300,000 jury verdict and settlement documents, publicly available professional license information, an expertise taxonomy, and automatic text mining techniques. This directory can be browsed by area of expertise as well as by location and name. In addition, expert profiles are automatically linked to MEDLINE articles and jury verdict and settlement documents. The supporting technologies that made this application possible include information extraction from text via regular expression parsing, record linkage through Bayesian-based matching, and automatic rule-based classification. To the best of our knowledge, this is the largest expert witness directory of its kind and the first to be built using automatic text mining techniques.
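Two of the supporting technologies named in this abstract, regular expression extraction and Bayesian record linkage, can be sketched together. The input line format, the field names, the prior, and the agreement probabilities below are all made-up assumptions; the abstract does not give the features or parameters the authors used. The linkage step is a generic naive Bayes posterior over field agreements.

```python
import re
from typing import Dict, Optional

# Assumed line format: "Expert: First Last, specialty, City"
EXPERT_RULE = re.compile(
    r"Expert:\s*(?P<name>[A-Z][a-z]+ [A-Z][a-z]+),\s*"
    r"(?P<specialty>[a-z ]+),\s*(?P<city>[A-Z][a-z]+)"
)

# Per field: (P(agree | same person), P(agree | different people)). Made-up values.
AGREEMENT = {"name": (0.95, 0.01), "specialty": (0.80, 0.10), "city": (0.70, 0.05)}
PRIOR_MATCH = 0.01  # assumed prior probability that a random pair is a match


def extract_expert(line: str) -> Optional[Dict[str, str]]:
    """Extract an expert record from one line, or None if the pattern fails."""
    m = EXPERT_RULE.search(line)
    return m.groupdict() if m else None


def match_probability(a: Dict[str, str], b: Dict[str, str]) -> float:
    """Posterior probability that two records describe the same expert,
    treating field agreements as conditionally independent."""
    p_match, p_non = PRIOR_MATCH, 1.0 - PRIOR_MATCH
    for field, (m, u) in AGREEMENT.items():
        agrees = a[field].lower() == b[field].lower()
        p_match *= m if agrees else 1.0 - m
        p_non *= u if agrees else 1.0 - u
    return p_match / (p_match + p_non)


extracted = extract_expert("Expert: John Doe, orthopedic surgery, Dallas")
license_record = {"name": "John Doe", "specialty": "orthopedic surgery", "city": "Dallas"}
print(extracted, match_probability(extracted, license_record))
```

In a setup like this, a pair would be linked only when the posterior clears a chosen threshold, and that threshold controls the trade-off between merging distinct experts and splitting one expert into several profiles.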
Archive | 2008
Marc Light; Frank Schilder; Ravi Kondadadi; Christopher Dozier; Wenhui Liao; Sriharsha Veeramachaneni
Archive | 2005
Yohendran Arumainayagam; Christopher Dozier
Archive | 2008
Christopher Dozier; Souptik Datta; Merine Thomas; Hugo Molina-Salgado
Conference on Computational Natural Language Learning | 2004
Kenneth Allen Williams; Christopher Dozier; J. Andrew McCulloh
Archive | 2008
Marc Light; Frank Schilder; Christopher Dozier
Meeting of the Association for Computational Linguistics | 2004
Xi S. Guo; Mark Chaudhary; Christopher Dozier; Yohendran Arumainayagam; Venkatesan Subramanian
Archive | 2006
Christopher Dozier; Mark Chaudhary; Ravi Kondadadi