Thomas J. Lee | Researchain

Publication

Featured research published by Thomas J. Lee.

Briefings in Bioinformatics | 2010

Pathway Tools version 13.0: integrated software for pathway/genome informatics and systems biology

Peter D. Karp; Suzanne M. Paley; Markus Krummenacker; Mario Latendresse; Joseph M. Dale; Thomas J. Lee; Pallavi Kaipa; Fred Gilham; Aaron Spaulding; Liviu Popescu; Tomer Altman; Ian T. Paulsen; Ingrid M. Keseler; Ron Caspi

Pathway Tools is a production-quality software environment for creating a type of model-organism database called a Pathway/Genome Database (PGDB). A PGDB such as EcoCyc integrates the evolving understanding of the genes, proteins, metabolic network and regulatory network of an organism. This article provides an overview of Pathway Tools capabilities. The software performs multiple computational inferences including prediction of metabolic pathways, prediction of metabolic pathway hole fillers and prediction of operons. It enables interactive editing of PGDBs by DB curators. It supports web publishing of PGDBs, and provides a large number of query and visualization tools. The software also supports comparative analyses of PGDBs, and provides several systems biology analyses of PGDBs including reachability analysis of metabolic networks, and interactive tracing of metabolites through a metabolic network. More than 800 PGDBs have been created using Pathway Tools by scientists around the world, many of which are curated DBs for important model organisms. Those PGDBs can be exchanged using a peer-to-peer DB sharing system called the PGDB Registry.

BMC Bioinformatics | 2006

BioWarehouse: a bioinformatics database warehouse toolkit

Thomas J. Lee; Yannick Pouliot; Valerie Wagner; Priyanka Gupta; David W. J. Stringer-Calvert; Jessica D. Tenenbaum; Peter D. Karp

Background: This article addresses the problem of interoperation of heterogeneous bioinformatics databases. Results: We introduce BioWarehouse, an open source toolkit for constructing bioinformatics database warehouses using the MySQL and Oracle relational database managers. BioWarehouse integrates its component databases into a common representational framework within a single database management system, thus enabling multi-database queries using the Structured Query Language (SQL) but also facilitating a variety of database integration tasks such as comparative analysis and data mining. BioWarehouse currently supports the integration of a pathway-centric set of databases including ENZYME, KEGG, and BioCyc, and in addition the UniProt, GenBank, NCBI Taxonomy, and CMR databases, and the Gene Ontology. Loader tools, written in the C and JAVA languages, parse and load these databases into a relational database schema. The loaders also apply a degree of semantic normalization to their respective source data, decreasing semantic heterogeneity. The schema supports the following bioinformatics datatypes: chemical compounds, biochemical reactions, metabolic pathways, proteins, genes, nucleic acid sequences, features on protein and nucleic-acid sequences, organisms, organism taxonomies, and controlled vocabularies. As an application example, we applied BioWarehouse to determine the fraction of biochemically characterized enzyme activities for which no sequences exist in the public sequence databases. The answer is that no sequence exists for 36% of enzyme activities for which EC numbers have been assigned. These gaps in sequence data significantly limit the accuracy of genome annotation and metabolic pathway prediction, and are a barrier for metabolic engineering. Complex queries of this type provide examples of the value of the data warehousing approach to bioinformatics research. Conclusion: BioWarehouse embodies significant progress on the database integration problem for bioinformatics.
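
As a rough illustration of the kind of warehouse query the abstract describes (finding enzyme activities with no associated sequence), the sketch below uses Python's built-in sqlite3 module on a toy in-memory database. The table names `enzyme_activity` and `protein_sequence`, their columns, and the placeholder rows are all hypothetical; they are not the BioWarehouse schema, and the toy data does not reproduce the 36% figure.

```python
import sqlite3


def build_toy_warehouse():
    """Create an in-memory database with hypothetical tables and placeholder rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE enzyme_activity (ec_number TEXT PRIMARY KEY);
        CREATE TABLE protein_sequence (id INTEGER PRIMARY KEY, ec_number TEXT);
        INSERT INTO enzyme_activity VALUES ('toy-1'), ('toy-2'), ('toy-3'), ('toy-4');
        -- 'toy-4' deliberately has no sequence, so it counts as an orphan.
        INSERT INTO protein_sequence (ec_number) VALUES ('toy-1'), ('toy-2'), ('toy-2'), ('toy-3');
        """
    )
    return conn


def orphan_fraction(conn):
    """Return the fraction of activities that have no linked sequence."""
    total = conn.execute("SELECT COUNT(*) FROM enzyme_activity").fetchone()[0]
    orphans = conn.execute(
        """
        SELECT COUNT(*)
        FROM enzyme_activity AS a
        WHERE NOT EXISTS (
            SELECT 1 FROM protein_sequence AS p WHERE p.ec_number = a.ec_number
        )
        """
    ).fetchone()[0]
    return orphans / total if total else 0.0


if __name__ == "__main__":
    print(orphan_fraction(build_toy_warehouse()))
```

The point of the sketch is only that, once several sources sit in one schema, a question spanning them reduces to a single SQL statement.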

Journal of Artificial Intelligence Research | 2003

Interactive execution monitoring of agent teams

David E. Wilkins; Thomas J. Lee; Pauline M. Berry

There is an increasing need for automated support for humans monitoring the activity of distributed teams of cooperating agents, both human and machine. We characterize the domain-independent challenges posed by this problem, and describe how properties of domains influence the challenges and their solutions. We will concentrate on dynamic, data-rich domains where humans are ultimately responsible for team behavior. Thus, the automated aid should interactively support effective and timely decision making by the human. We present a domain-independent categorization of the types of alerts a plan-based monitoring system might issue to a user, where each type generally requires different monitoring techniques. We describe a monitoring framework for integrating many domain-specific and task-specific monitoring techniques and then using the concept of value of an alert to avoid operator overload. We use this framework to describe an execution monitoring approach we have used to implement Execution Assistants (EAs) in two different dynamic, data-rich, real-world domains to assist a human in monitoring team behavior. One domain (Army small unit operations) has hundreds of mobile, geographically distributed agents, a combination of humans, robots, and vehicles. The other domain (teams of unmanned ground and air vehicles) has a handful of cooperating robots. Both domains involve unpredictable adversaries in the vicinity. Our approach customizes monitoring behavior for each specific task, plan, and situation, as well as for user preferences. Our EAs alert the human controller when reported events threaten plan execution or physically threaten team members. Alerts were generated in a timely manner without inundating the user with too many alerts (less than 10% of alerts are unwanted, as judged by domain experts).

Intelligent Systems in Molecular Biology | 2008

Annotation-based inference of transporter function

Thomas J. Lee; Ian T. Paulsen; Peter D. Karp

Motivation: We present a method for inferring and constructing transport reactions for transporter proteins based primarily on the analysis of the names of individual proteins in the genome annotation of an organism. Transport reactions are declarative descriptions of transporter activities, and thus can be manipulated computationally, unlike free-text protein names. Once transporter activities are encoded as transport reactions, a number of computational analyses are possible including database queries by transporter activity; inclusion of transporters into an automatically generated metabolic-map diagram that can be painted with omics data to aid in their interpretation; detection of anomalies in the metabolic and transport networks, such as substrates that are transported into the cell but are not inputs to any metabolic reaction or pathway; and comparative analyses of the transport capabilities of different organisms. Results: On randomly selected organisms, the method achieves precision and recall rates of 0.93 and 0.90, respectively, in identifying transporter proteins by name within the complete genome. The method obtains 67.5% accuracy in predicting complete transport reactions; if allowance is made for predictions that are overly general yet not incorrect, reaction prediction accuracy is 82.5%. Availability: The method is implemented as part of PathoLogic, the inference component of the Pathway Tools software. Pathway Tools is freely available to researchers at non-commercial institutions, including source code; a fee applies to commercial institutions. Contact: [email protected] Supplementary information: Supplementary data are available at Bioinformatics online.

Engineering Applications of Artificial Intelligence | 2008

Airlift mission monitoring and dynamic rescheduling

David E. Wilkins; Stephen F. Smith; Laurence A. Kramer; Thomas J. Lee; Timothy W. Rauenbusch

We describe the Flight Manager Assistant (FMA), a prototype system designed to support real-time management of airlift operations at the USAF Air Mobility Command (AMC). In current practice, AMC flight managers are assigned to manage individual air missions. They tend to be overburdened with associated data monitoring and constraint checking, and generally react to detected problems in a local, myopic fashion. Consequently, decisions taken for one mission can often have deleterious effects on others. FMA combines two key capabilities for overcoming these problems: (1) intelligent monitoring of incoming information (for example, weather, airport operations, aircraft status) and recognition of those situations that require corrective action and (2) dynamic rescheduling of missions in response to detected problems, both to understand the global implications of changed circumstances and to determine appropriate rescheduling actions. FMA builds on two existing technologies: an execution-monitoring framework previously applied to small-unit operations and control of robots, and a dynamic scheduling tool that is transitioning into operational use in AMC's Tanker/Airlift Control Center. FMA's dynamic-mediation module provides for collaborative mission management by different planning and execution offices by structuring communication for decision making.

Data Integration in the Life Sciences | 2008

BioWarehouse: Relational Integration of Eleven Bioinformatics Databases and Formats

Peter D. Karp; Thomas J. Lee; Valerie Wagner

BioWarehouse is an open-source project for integrating bioinformatics databases within a relational database warehouse. It has two key features. A comprehensive database schema models many different bioinformatics datatypes. A set of loader tools permits loading of public bioinformatics databases, and of standard bioinformatics formats, into that database schema. Thus, multiple databases can be queried together within a single common schema. The supported databases are BioCyc, CMR, ENZYME, Eco2DBase, GenBank, Gene Ontology, KEGG, NCBI Taxonomy, and UniProt. The supported formats are BioPAX (protein interactions subset only) and MAGE-ML.

International Conference on Systems Engineering | 2005

Identifying candidate genes using the BioWarehouse: a case study

Yannick Pouliot; Thomas J. Lee; Valerie Wagner; Peter D. Karp

The BioWarehouse is an open source data warehousing environment focused on supporting bioinformatics databases (DBs). Operating on the MySQL or Oracle relational database management systems (RDBMSs), BioWarehouse integrates public source DBs such as Swiss-Prot and GenBank into a unified normalized schema operating under a single DB management system. BioWarehouse also imposes partial semantic normalization on the source data, thus decreasing semantic heterogeneity and facilitating multi-DB queries using the Structured Query Language (SQL). As an application case study of the BioWarehouse, we have identified candidate genes for orphan activities, defined as activities for which no cognate gene sequences exist. 1,356 (36%) of enzymatic activities that have been assigned an enzyme commission (EC) number are orphans (Karp, 2004). Such high prevalence is problematic, given that many of these activities are decades old and often perform essential functions. Most notably, the existence of orphans introduces gaps in sequence data that significantly limit the accuracy of genome annotation and metabolic pathway prediction. Fortunately, with more than 200 genomes sequenced to completion, and with the availability of systems such as BioWarehouse, the computational identification of candidate genes associated with orphan activities can be envisioned. The BioWarehouse's conglomeration of databases, combined with Oracle 10g's native integration of analytical tools into SQL queries (such as the basic local alignment search tool (BLAST) and POSIX regular expressions), enabled us to identify a small number of high-confidence candidate genes associated with a specific orphan activity. We describe the complex queries used in this work to illustrate the value of the data warehousing approach to bioinformatics research.

Symposium on Visual Languages and Human-Centric Computing | 2013

Discovering action idioms: bridging the gap between system-level events and human-level actions

Melinda T. Gervasio; Thomas J. Lee

As computing devices become more pervasive in our daily lives, effective communication between the user and the system becomes increasingly important. The ability to describe actions at a human level of abstraction is key. However, the level at which computer system events are most easily captured is often well below the level at which humans conceptualize actions. We present a sequential pattern mining approach to discovering human-level actions (action idioms) from instrumentation logs of lower-level events. To support validation by a human expert, idiom discovery is designed to maximize recall, with filtering heuristics applied to help eliminate false positives. Empirical evaluation on data from a fielded application shows the promise of the approach for the automatic discovery of action idioms.
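
To make the idea of mining idioms from event logs more concrete, here is a minimal sketch that counts contiguous event n-grams recurring across logged sessions. It is a deliberately simplified stand-in, not the paper's algorithm: the function name, the session format, and the `min_support` threshold are assumptions, and the paper's filtering heuristics are not modeled. A low support threshold loosely mirrors the recall-oriented design, leaving false positives for a human reviewer to reject.

```python
from collections import Counter


def frequent_event_ngrams(sessions, min_len=2, max_len=4, min_support=2):
    """Return contiguous event n-grams that occur in at least min_support sessions.

    sessions: list of event-name lists, one list per logged session (assumed format).
    Support counts sessions, not occurrences, so one noisy session cannot
    promote a pattern on its own.
    """
    support = Counter()
    for events in sessions:
        seen = set()  # each n-gram counts at most once per session
        for n in range(min_len, max_len + 1):
            for i in range(len(events) - n + 1):
                seen.add(tuple(events[i:i + n]))
        support.update(seen)
    return {gram: count for gram, count in support.items() if count >= min_support}


if __name__ == "__main__":
    # Hypothetical low-level event logs, for illustration only.
    example_sessions = [
        ["open", "edit", "save"],
        ["open", "edit", "close"],
        ["edit", "save"],
    ]
    for gram, count in sorted(frequent_event_ngrams(example_sessions).items()):
        print(gram, count)
```

Candidate patterns returned this way would still need expert review before being named as human-level actions.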

Intelligent User Interfaces | 2011

How to serve soup: interleaving demonstration and assisted editing to support nonprogrammers

Melinda T. Gervasio; Will Haines; David N. Morley; Thomas J. Lee; C. Adam Overholtzer; Shahin Saadati; Aaron Spaulding

The Adept Task Learning system is an end-user programming environment that combines programming by demonstration and direct manipulation to support customization by nonprogrammers. Previously, Adept enforced a rigid procedure-authoring workflow consisting of demonstration followed by editing. However, a series of system evaluations with end users revealed a desire for more feedback during learning and more flexibility in authoring. We present a new approach that interleaves incremental learning from demonstration and assisted editing to provide users with a more flexible procedure-authoring experience. The approach relies on maintaining a soup of alternative hypotheses during learning, propagating user edits through the soup, and suggesting repairs as needed. We discuss the learning and reasoning techniques that support the new approach and identify the unique interaction design challenges they raise, concluding with an evaluation plan to resolve the design challenges and complete the improved system.

Archive | 2009

Method and apparatus for automated assistance with task management

Hung Bui; Steven Eker; Daniel Elenius; Melinda T. Gervasio; Thomas J. Lee; Mei Marker; David N. Morley; Janet Murdock; Karen L. Myers; Bart Peintner; Shahin Saadati; Eric Yeh; Neil Yorke-Smith
