Publications
Featured research published by Jeffrey Thomas Kreulen.
Communications of the ACM | 2006
Paul P. Maglio; Savitha Srinivasan; Jeffrey Thomas Kreulen; Jim Spohrer
Computer scientists work with formal models of algorithms and computation, and someday service scientists may work with formal models of service systems. The four examples here document some of the early efforts to establish a new academic discipline and new profession.
IBM Systems Journal | 2002
William F. Cody; Jeffrey Thomas Kreulen; Vikas Krishna; William Scott Spangler
Enterprise executives understand that timely, accurate knowledge can mean improved business performance. Two technologies have been central in improving the quantitative and qualitative value of the knowledge available to decision makers: business intelligence and knowledge management. Business intelligence has applied the functionality, scalability, and reliability of modern database management systems to build ever-larger data warehouses, and to utilize data mining techniques to extract business advantage from the vast amount of available enterprise data. Knowledge management technologies, while less mature than business intelligence technologies, are now capable of combining today's content management systems and the Web with vastly improved searching and text mining capabilities to derive more value from the explosion of textual information. We believe that these systems will blend over time, borrowing techniques from each other and inspiring new approaches that can analyze data and text together, seamlessly. We call this blended technology BIKM. In this paper, we describe some of the current business problems that require analysis of both text and data, and some of the technical challenges posed by these problems. We describe a particular approach based on an OLAP (on-line analytical processing) model enhanced with text analysis, and describe two tools that we have developed to explore this approach: eClassifier performs text analysis, and Sapient integrates data and text through an OLAP-style interaction model. Finally, we discuss some new research that we are pursuing to enhance this approach.
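The core idea of the OLAP-plus-text approach can be sketched as follows: documents classified into text-derived categories become one more dimension that can be rolled up alongside structured fields. This is a minimal illustrative sketch; all data, category names, and the keyword-matching classifier are invented assumptions, and the paper's Sapient tool is far richer than this.

```python
from collections import Counter, defaultdict

def classify(text, keyword_map):
    """Assign a document to the first category whose keywords it contains."""
    for category, words in keyword_map.items():
        if any(w in text.lower() for w in words):
            return category
    return "other"

def rollup(records, keyword_map, by):
    """Count records grouped by a structured field and a text-derived category."""
    cube = defaultdict(Counter)
    for rec in records:
        cat = classify(rec["text"], keyword_map)
        cube[rec[by]][cat] += 1
    return cube

records = [
    {"region": "EMEA", "text": "Customer reports a billing error"},
    {"region": "EMEA", "text": "Shipment arrived damaged"},
    {"region": "APAC", "text": "Billing dispute over invoice"},
]
keywords = {"billing": ["billing", "invoice"], "logistics": ["shipment", "damaged"]}
cube = rollup(records, keywords, by="region")
print(dict(cube["EMEA"]))  # {'billing': 1, 'logistics': 1}
```

Here the `region` field plays the role of a conventional OLAP dimension while the text category is computed on the fly, which is the blending of data and text analysis the abstract describes.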
Journal of Management Information Systems | 2003
W. Scott Spangler; Jeffrey Thomas Kreulen; Justin Lessler
We present a novel system and methodology for generating and then browsing multiple taxonomies over a document collection. Taxonomies are generated using a broad set of capabilities, including metadata, keyword queries, and automated clustering techniques that serve as a seed taxonomy. The taxonomy editor, eClassifier, provides powerful tools to visualize and edit each taxonomy to make it reflective of the desired theme. Cluster validation tools allow the editor to verify that documents received in the future can be automatically classified into each taxonomy with sufficiently high accuracy. In general, those seeking knowledge from a document collection may have only a vague notion of exactly what they are attempting to understand, and would like to explore related topics and concepts rather than simply being given a set of documents. For this purpose, we have developed MindMap, an interface utilizing multiple taxonomies and the ability to interact with a document collection.
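The seeding step described above can be illustrated with a minimal sketch (this is not eClassifier itself): documents are reduced to bags of words, seed categories are keyword sets, and each document joins the category with the greatest term overlap. Names and data below are illustrative assumptions.

```python
def bag_of_words(text):
    return set(text.lower().split())

def seed_taxonomy(docs, seeds):
    """Assign each doc to the seed category with the greatest word overlap."""
    taxonomy = {name: [] for name in seeds}
    taxonomy["unassigned"] = []
    for doc in docs:
        words = bag_of_words(doc)
        best, score = "unassigned", 0
        for name, seed_words in seeds.items():
            overlap = len(words & seed_words)
            if overlap > score:
                best, score = name, overlap
        taxonomy[best].append(doc)
    return taxonomy

docs = [
    "database query optimization",
    "neural network training",
    "query planning in databases",
]
seeds = {"databases": {"database", "databases", "query"},
         "ml": {"neural", "training"}}
tax = seed_taxonomy(docs, seeds)
print(tax["databases"])  # both database-related docs land together
```

A human editor would then refine this rough seed assignment, which is the interactive editing and validation loop the abstract emphasizes.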
Pacific Symposium on Biocomputing | 2006
James J. Rhodes; Stephen K. Boyer; Jeffrey Thomas Kreulen; Ying Chen; Patricia Ordóñez
Text analytics is becoming an increasingly important tool used in biomedical research. While advances continue to be made in the core algorithms for entity identification and relation extraction, a need for practical applications of these technologies arises. We developed a system that allows users to explore the US Patent corpus using molecular information. The core of our system contains three main technologies: a high-performing chemical annotator that identifies chemical terms and converts them to structures, a similarity search engine based on the emerging IUPAC International Chemical Identifier (InChI) standard, and a set of on-demand data mining tools. By leveraging this technology we were able to rapidly identify and index 3,623,248 unique chemical structures from 4,375,036 US Patents and Patent Applications. Using this system a user may go to a web page, draw a molecule, search for related Intellectual Property (IP) and analyze the results. Our results show that this is a far more effective way to identify IP than traditional keyword-based approaches.
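The similarity-search component can be sketched in miniature: structures are reduced to identifier-derived feature sets (the invented strings below stand in for real InChI layers), and similarity is the Tanimoto (Jaccard) coefficient commonly used in cheminformatics. Everything here, including the patent IDs, is an illustrative assumption rather than the system's actual annotator or index.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two feature sets."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def search(index, query_features, threshold=0.5):
    """Return (patent_id, score) pairs above a similarity threshold."""
    hits = [(pid, tanimoto(fp, query_features)) for pid, fp in index.items()]
    return sorted([h for h in hits if h[1] >= threshold], key=lambda h: -h[1])

# Hypothetical index: patent id -> feature set derived from its structures.
index = {
    "US1234567": {"C6H6", "ring", "aromatic"},
    "US7654321": {"C2H6O", "hydroxyl"},
}
print(search(index, {"C6H6", "ring", "aromatic", "methyl"}))
# [('US1234567', 0.75)]
```

The point of the InChI standard in this design is that two independently drawn depictions of the same molecule normalize to the same identifier, so set-based similarity becomes meaningful across millions of patents.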
Conference on Information and Knowledge Management | 2002
W. Scott Spangler; Jeffrey Thomas Kreulen
Taxonomies are meaningful hierarchical categorizations of documents into topics reflecting the natural relationships between the documents and their business objectives. Improving the quality of these taxonomies and reducing the overall cost required to create them is an important area of research. Supervised and unsupervised text clustering are important technologies that comprise only a part of a complete solution. However, there remains a great need for humans to interact efficiently with a taxonomy during the editing and validation phase. We have developed a comprehensive approach to solving this problem, and implemented this approach in a software tool called eClassifier. eClassifier provides features to help the taxonomy editor understand and evaluate each category of a taxonomy and visualize the relationships between the categories. Multiple techniques allow the user to make changes at both the category and document level. Metrics then establish how well the resultant taxonomy can be modeled for future document classification. In this paper, we present a comprehensive set of viewing, editing and validation techniques we have implemented in the Lotus Discovery Server, resulting in a significant reduction in the time required to create a quality taxonomy.
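The validation metric mentioned above can be illustrated with a simple stand-in (an assumption, not eClassifier's actual algorithm): train a nearest-centroid classifier on the edited taxonomy, then measure the fraction of held-out documents routed to their intended category as an estimate of how reliably future documents will be auto-classified.

```python
from collections import Counter

def centroid(docs):
    """Aggregate term counts over all documents in a category."""
    c = Counter()
    for d in docs:
        c.update(d.lower().split())
    return c

def classify(doc, centroids):
    """Pick the category whose centroid has the largest dot product with doc."""
    words = Counter(doc.lower().split())
    return max(centroids,
               key=lambda k: sum(words[w] * centroids[k][w] for w in words))

def validation_accuracy(taxonomy, holdout):
    cents = {name: centroid(docs) for name, docs in taxonomy.items()}
    correct = sum(1 for label, doc in holdout if classify(doc, cents) == label)
    return correct / len(holdout)

# Toy taxonomy and held-out labeled documents (invented for illustration).
taxonomy = {"storage": ["disk array raid", "tape backup storage"],
            "network": ["router switch packet", "tcp network latency"]}
holdout = [("storage", "raid disk failure"), ("network", "packet latency spike")]
print(validation_accuracy(taxonomy, holdout))  # 1.0
```

A low score on a category would tell the editor that the category's documents are not lexically coherent enough to be classified automatically, prompting another round of editing.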
International Phoenix Conference on Computers and Communications | 1995
Brian O'Krafka; Sriram Srinivasan Mandyam; Jeffrey Thomas Kreulen; Ramanathan Raghavan; A. Saha; Nadeem Malik
Cache-coherent multiprocessors are typically verified by extensive simulation with randomly generated testcases. With this methodology, certain aspects of test coverage can be measured using monitors that record the occurrence of specific events during simulation. If certain events do not occur sufficiently often, the designer must somehow bias the random test generator or write testcases by hand to improve coverage of the desired event. This is usually a labor-intensive process that is made worse by frequent changes in design specifications and the high cost of simulating large multiprocessor models. This paper describes MPTG (MultiProcessor Test Generator): a portable test generator that automates much of this labor-intensive component of the simulation process. MPTG does this by deterministically generating sets of testcases that are guaranteed to cause specific events to happen. For example, with a single, compact test specification it is possible to generate a set of tests that exercise all transaction types and current cache state combinations at a particular cache in the system. Alternatively, it is easy to generate a set of tests that exercise all two-way races that can occur at a particular cache. Test generation at this level of detail requires the incorporation of a system-wide coherence protocol within the test generator, which can make it difficult to port the test generator to different systems. Portability is achieved in MPTG by breaking the test generator into two parts: a generic test generation engine and a system-specific set of protocol tables.
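The example coverage goal in the abstract, exercising all transaction types against all initial cache states at one cache, amounts to deterministically enumerating a cross product. The sketch below assumes generic MESI states and generic bus transaction names; the real MPTG instead derives these from system-specific protocol tables.

```python
from itertools import product

# Assumed, generic transaction types and MESI line states for illustration.
TRANSACTIONS = ["READ", "RWITM", "FLUSH", "CLEAN"]
CACHE_STATES = ["M", "E", "S", "I"]

def generate_tests(target_cache=0):
    """Emit one testcase per (transaction, initial-state) pair at the target cache."""
    tests = []
    for txn, state in product(TRANSACTIONS, CACHE_STATES):
        tests.append({
            "setup": f"cache[{target_cache}].line = {state}",
            "stimulus": f"issue {txn} to cache[{target_cache}]",
        })
    return tests

tests = generate_tests()
print(len(tests))  # 16: every transaction against every initial state
```

Because the enumeration is exhaustive and deterministic, coverage of these events is guaranteed by construction rather than hoped for from random biasing, which is the core advantage the paper claims.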
Hawaii International Conference on System Sciences | 2002
W. Scott Spangler; Jeffrey Thomas Kreulen; Justin Lessler
We present a novel system and methodology for browsing and exploring topics and concepts within a document collection. The process begins with the generation of multiple taxonomies from the document collection, each having a unique theme. We have developed the MindMap interface for exploring the document collection. Starting from an initial keyword query, the MindMap interface helps the user to explore the concept space by first presenting the user with related terms and high-level topics in a radial graph. After refining the query by selecting any related terms, one of the related high-level concepts can be selected for further investigation. The MindMap uses a novel binary tree interface to explore the composition of a concept based on the presence or absence of terms. From the binary tree a concept can be further explored and visualized. Individual documents are presented as spatial coordinates where distance between points relates to document similarity. As the user browses this spatial representation, text is presented from the document that is most relevant to the user's initial query. Individual points can be selected to pull up the relevant paragraphs from the document with the keywords highlighted. Finally, selected documents are displayed and the user can interact with and investigate them further.
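The "distance relates to similarity" relation underlying the spatial display can be sketched with term-frequency vectors and cosine similarity. This is a hedged illustration with invented documents; the projection from pairwise similarities to 2-D coordinates (e.g. multidimensional scaling) is omitted, and the abstract does not specify MindMap's actual similarity measure.

```python
import math
from collections import Counter

def tf(text):
    """Term-frequency vector of a document."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

d1 = tf("taxonomy editing for document collections")
d2 = tf("editing a taxonomy of document topics")
d3 = tf("cache coherence protocol verification")
print(cosine(d1, d2) > cosine(d1, d3))  # True: similar docs sit closer
```

In a MindMap-style view, documents with high pairwise similarity would be laid out near each other, so clusters in the plot correspond to topical neighborhoods the user can browse.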
International Conference on Data Mining | 2009
Ying Chen; W. Scott Spangler; Jeffrey Thomas Kreulen; Stephen K. Boyer; Thomas D. Griffin; Alfredo Alba; Amit Behal; Bin He; Linda Kato; Ana Lelescu; Cheryl A. Kieliszewski; Xian Wu; Li Zhang
Intellectual Properties (IP), such as patents and trademarks, are among the most critical assets in today's enterprises and research organizations. They represent the core innovation and differentiators of an organization. When leveraged effectively, they not only protect a business from its competition, but also generate significant opportunities in licensing, execution, and long-term research and innovation. In certain industries, e.g., the pharmaceutical industry, patents lead to multi-billion-dollar revenue per year. In this paper, we present a holistic information mining solution, called SIMPLE, which mines large corpora of patents and scientific literature for insights. Unlike much prior work that deals with specific aspects of analytics, SIMPLE is an integrated and end-to-end IP analytics solution which addresses a wide range of challenges in patent analytics such as data complexity, scale, and nomenclature issues. It encompasses techniques for patent data processing and modeling, analytics algorithms, and a web interface and web services for analytics service delivery and end-user interaction. We use real-world case studies to demonstrate the effectiveness of SIMPLE.
Extending Database Technology | 2006
Hakan Hacigümüs; James J. Rhodes; W. Scott Spangler; Jeffrey Thomas Kreulen
We present the architecture of a Business Information Analysis provisioning system, BISON. The service provisioning system combines two prominent domains: structured/unstructured data analysis and service-oriented computing. We also discuss open research problems in the area.
International Phoenix Conference on Computers and Communications | 1995
Ramanathan Raghavan; Jeffrey Thomas Kreulen; Brian O'Krafka; Shahram Salamian; Avijit Saha; Nadeem Malik
The long development times and high costs of multiprocessor (MP) designs arise from their design complexity. To reduce the time and costs, it is critical that design bugs are detected early in the development cycle using design verification tools. The traditional method of hardware design verification is to simulate the actual hardware designs, usually specified in a hardware description language such as VHDL. Two major drawbacks of this methodology when applied to MP systems are the huge size of MP models and the long simulation times. In addition to the difficulty of detecting incorrect behavior in hardware cache-coherent systems, MP system verification presents many other challenges as well. In this paper we present an MP verification methodology that lets the actual hardware designs coexist with behavioral models that approximate the functional behavior of the designs they represent. We describe an event-driven behavioral simulation engine that drives the entire simulation, an MP test language, a test executive that injects new transactions into the system, and a coherence monitor that helps detect coherency-related bugs in hardware designs quickly and efficiently.
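A coherence monitor of the kind described above can be reduced to an invariant check run after every simulated transaction. The sketch below assumes a generic MESI protocol, where a line that is Modified or Exclusive in one cache must be Invalid everywhere else; the states and violating scenarios are illustrative assumptions, not the paper's actual monitor.

```python
def check_coherence(states):
    """Check the MESI single-owner invariant for one cache line.

    states: per-cache MESI state letters for the same line, e.g. ["M","I","I"].
    Returns "ok" or a violation description.
    """
    owners = [i for i, s in enumerate(states) if s in ("M", "E")]
    sharers = [i for i, s in enumerate(states) if s == "S"]
    if len(owners) > 1:
        return f"violation: multiple owners in caches {owners}"
    if owners and sharers:
        return f"violation: owner {owners[0]} coexists with sharers {sharers}"
    return "ok"

print(check_coherence(["M", "I", "I", "I"]))  # ok
print(check_coherence(["M", "S", "I", "I"]))  # violation: owner 0 coexists ...
```

Running such a check continuously during event-driven simulation catches coherency bugs at the moment they arise, rather than waiting for a corrupted value to surface in a test's final data comparison.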