Dennis P. Groth | Researchain

Publication

Featured research published by Dennis P. Groth.

IEEE Transactions on Visualization and Computer Graphics | 2006

Provenance and Annotation for Visual Exploration Systems

Dennis P. Groth; Kristy Streefkerk

Exploring data using visualization systems has been shown to be an extremely powerful technique. However, one of the challenges with such systems is an inability to completely support the knowledge discovery process. More than simply looking at data, users will make a semipermanent record of their visualizations by printing out a hard copy. Subsequently, users will mark and annotate these static representations, either for dissemination purposes or to augment their personal memory of what was witnessed. In this paper, we present a model for recording the history of user explorations in visualization environments, augmented with the capability for users to annotate their explorations. A prototype system is used to demonstrate how this provenance information can be recalled and shared. The prototype system generates interactive visualizations of the provenance data using a spatio-temporal technique. Beyond the technical details of our model and prototype, results from a controlled experiment that explores how different history mechanisms impact problem solving in visualization environments are presented.

advanced visual interfaces | 2004

A collaborative annotation system for data visualization

Sean E. Ellis; Dennis P. Groth

We present Collaborative Annotations on Visualizations (CAV), a system for annotating visual data in remote and collocated environments. Our system consists of a network framework and a client application built for tablet PCs. CAV is designed to support the collection and sharing of annotations through the use of mobile devices connected to visualization servers. We have developed a working system prototype based on tablet PCs that supports digital ink, voice and text annotation, and illustrates our approach in a variety of application domains, including biology, chemistry, and telemedicine. We have created an XML-based open standard that supports access to a variety of client devices by publishing visualizations (data and annotations) as streams of images. CAV's primary goal is to enhance scientific discovery by supporting collaboration in the context of data visualizations.

SIAM Journal on Computing | 2004

Average-Case Performance of the Apriori Algorithm

Paul Walton Purdom; Dirk Van Gucht; Dennis P. Groth

The failure rate of the Apriori Algorithm is studied analytically for the case of random shoppers. The time needed by the Apriori Algorithm is determined by the number of item sets that are output (successes: item sets that occur in at least k baskets) and the number of item sets that are counted but not output (failures: item sets where all subsets of the item set occur in at least k baskets but the full set occurs in fewer than k baskets). The number of successes is a property of the data; no algorithm that is required to output each success can avoid doing work associated with the successes. The number of failures is a property of both the algorithm and the data. We find that under a wide range of conditions the performance of the Apriori Algorithm is almost as bad as is permitted under sophisticated worst-case analyses. In particular, there is usually a bad level with two properties: (1) it is the level where nearly all of the work is done, and (2) nearly all item sets counted are failures. Let l be the...
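The distinction between successes and failures in the abstract above can be illustrated with a minimal sketch of level-wise Apriori counting. This is not the authors' code or analysis; the function name `apriori_counts`, the variable names, and the toy baskets are assumptions made for illustration only. A candidate is counted only when all of its immediate subsets were frequent at the previous level, so every counted set that falls short of the threshold k matches the abstract's definition of a failure.

```python
from itertools import combinations


def apriori_counts(baskets, k):
    """Level-wise Apriori sketch (illustrative, not the paper's implementation).

    Returns (successes, failures):
      successes: counted item sets that occur in at least k baskets
      failures:  counted item sets that occur in fewer than k baskets
    """
    baskets = [frozenset(b) for b in baskets]
    items = sorted({item for b in baskets for item in b})
    # Level 1: every single item is a candidate (its only proper subset is
    # the empty set, which is contained in every basket).
    candidates = [frozenset([item]) for item in items]
    successes, failures = [], []
    level = 1
    while candidates:
        frequent = set()
        for cand in candidates:
            support = sum(1 for b in baskets if cand <= b)
            if support >= k:
                frequent.add(cand)
                successes.append(cand)
            else:
                failures.append(cand)
        # Next level: keep only sets whose subsets of the current size are
        # all frequent, so infrequent subsets prune the candidate uncounted.
        level += 1
        next_candidates = set()
        for a in frequent:
            for b in frequent:
                union = a | b
                if len(union) == level and all(
                    frozenset(sub) in frequent
                    for sub in combinations(union, level - 1)
                ):
                    next_candidates.add(union)
        candidates = sorted(next_candidates, key=sorted)
    return successes, failures


# Hypothetical toy data: four baskets, threshold k = 2.
demo_baskets = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "d"}]
demo_successes, demo_failures = apriori_counts(demo_baskets, k=2)
print(len(demo_successes), len(demo_failures))  # prints "5 2"
```

In this toy run the failures are {d} and {b, c}: both were counted because their subsets were frequent, yet neither reaches the threshold. The set {a, b, c} is never counted at all, since its subset {b, c} is infrequent, which is why failures depend on the algorithm as well as the data.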

conference on software engineering education and training | 2001

It's all about process: project-oriented teaching of software engineering

Dennis P. Groth; Edward L. Robertson

Process considerations are a central part of the material for a software engineering course; they are also central to accomplishing full-lifecycle, team-based systems development projects in such a course. This paper discusses the ways in which we have achieved an effective process structure within an academic context of full-year project courses. The key features are a kernel project plan and a process management mechanism. The project plan is a schedule including eight milestones with fixed due dates and quite explicit deliverables. The management is accomplished through an advanced full-year course, whose participants guide the project teams through the process.

Proceedings. Eighth International Conference on Information Visualisation, 2004. IV 2004. | 2004

Information provenance and the knowledge rediscovery problem

Dennis P. Groth

Visualizations leverage innate human capabilities for recognizing interesting aspects of data. Even if users might agree on what is interesting about a visualization, the steps that they use in the knowledge discovery process may be significantly different. This results in an inability to effectively recreate the exact conditions of the discovery process, which we call the knowledge rediscovery problem. Because we cannot expect a user to fully document each of their interactions, there is a need for visualization systems to maintain user trace data in a way that enhances a user's ability to communicate what they found to be interesting, as well as how they found it. We present a model for representing user interactions that articulates with a corresponding set of annotations, or observations that are made during the exploration. Such ability is critical to addressing the knowledge rediscovery problem, and is a fundamental component for systems that must provide information provenance.

Proceedings of the 1998 workshop on New paradigms in information visualization and manipulation | 1998

Architectural support for database visualization

Dennis P. Groth; Edward L. Robertson

The rapid proliferation and growth of database management systems has resulted in the retention of massive amounts of information for data processing and analysis needs. Many data processing requirements can be satisfied through the use of traditional database languages, such as SQL. These languages retrieve and present query results in record-oriented tables. The table of records format is best for presenting every record, but it cannot give a feel for the overall character of the data set.

british national conference on databases | 2002

Improving Query Evaluation with Approximate Functional Dependency Based Decompositions

Chris Giannella; Mehmet M. Dalkilic; Dennis P. Groth; Edward L. Robertson

We investigate how relational restructuring may be used to improve query performance. Our approach parallels recent research extending semantic query optimization (SQO), which uses knowledge about the instance to achieve more efficient query processing. Our approach differs, however, in that the instance does not govern whether the optimization may be applied; rather, the instance governs whether the optimization yields more efficient query processing. It also differs in that it involves an explicit decomposition of the relation instance. We use approximate functional dependencies as the conceptual basis for this decomposition and develop query rewriting techniques to exploit it. We present experimental results leading to a characterization of a well-defined class of queries for which improved processing time is observed.

ieee international conference on information visualization | 2003

Visual representation of database queries using structural similarity

Dennis P. Groth

It is often useful to get high-level views of datasets in order to identify areas of interest worthy of further exploration. In relational databases, the high-level view can be described using entity-relationship diagrams, which identify relationships between entities in the data model. Such high-level views are useful for database design activities, and can be used to generate user interfaces for constructing queries. This research introduces techniques for visualizing structural similarity of database queries. We demonstrate that individual queries can be visualized using graph visualization techniques. A distance measure based on query structure is proposed that provides database designers and administrators with a high-level perspective of relationships in the underlying data.

technical symposium on computer science education | 2008

Improving computer science diversity through summer camps

Dennis P. Groth; Helen H. Hu; Betty Lauer; Hwajung Lee

Summer camps offer a ripe opportunity for increasing computer science diversity. This panel provides several examples of summer camps that specifically recruit from traditionally underrepresented demographics. The panelists run camps at a community college, a private liberal-arts college, and public universities. The camps are residential and day camps, coed and all-female camps, ranging from three days to two weeks long, with campers from 10-year-olds to high school seniors. In addition to describing their camps, the panelists will provide information on securing funding, recruiting campers from underrepresented populations, measuring impact, and lessons learned along the way. Demonstrations of what campers accomplished will also be shown.

international conference on applications of declarative programming and knowledge management | 2001

Discovering frequent itemsets in the presence of highly frequent items

Dennis P. Groth; Edward L. Robertson

This paper presents new techniques for focusing the discovery of frequent itemsets within large, dense datasets containing highly frequent items. The existence of highly frequent items adds significantly to the cost of computing the complete set of frequent itemsets. Our approach allows for the exclusion of such items during the candidate generation phase of the Apriori algorithm. Afterwards, the highly frequent items can be reintroduced, via an inferencing framework, providing for a capability to generate frequent itemsets without counting their frequency. We demonstrate the use of these new techniques within the well-studied framework of the Apriori algorithm. Furthermore, we provide empirical results using our techniques on both synthetic and real datasets - both relevant since the real datasets exhibit statistical characteristics different from the probabilistic assumptions behind the synthetic data. The source we used for real data was the U.S. Census.