Songting Chen
Princeton University
Publication
Featured research published by Songting Chen.
international conference on management of data | 2007
Junichi Tatemura; Arsany Sawires; Oliver Po; Songting Chen; K. Selçuk Candan; Divyakant Agrawal; Maria Goveas
Mashup Feeds is a system that supports integrated web service feeds as continuous queries. We introduce collection-based stream processing semantics to enable information extraction by monitoring source evolution over time.
international conference on management of data | 2008
Junichi Tatemura; Songting Chen; Fenglin Liao; Oliver Po; K. Selçuk Candan; Divyakant Agrawal
UQBE is a mashup tool for non-programmers that supports query-by-example (QBE) over a schema that the user makes up without knowing the schemas of the original sources. Based on automated schema matching with uncertainty, the UQBE system returns the results with the highest confidence and lets the user refine them interactively. Each tuple in the query result is associated with lineage: a Boolean formula over schema matching decisions that represents the conditions under which the tuple is included in the result. Given binary feedback on tuples from the user, which may be imprecise, the system solves an optimization problem to refine the confidence values of the matching decisions. The demo features graphical user interaction with the UQBE system, including querying and refinement.
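As a rough illustration of lineage as a Boolean formula over matching decisions, the sketch below computes the confidence of one result tuple. It is not the UQBE implementation: the disjunctive-normal-form representation, the independence of matching decisions, and the names `tuple_confidence`, `lineage`, and `match_conf` are all assumptions made for this example.

```python
from itertools import product

def tuple_confidence(lineage, match_conf):
    """Probability that a lineage formula holds (illustrative sketch only).

    lineage:    formula in disjunctive normal form, given as a list of
                clauses; each clause is a set of matching-decision names
                that must all be correct (hypothetical representation).
    match_conf: dict mapping each matching-decision name to the assumed
                probability that the match is correct. Decisions are
                assumed independent, which the abstract does not state.
    """
    names = sorted({m for clause in lineage for m in clause})
    total = 0.0
    # Enumerate every truth assignment ("possible world") of the decisions.
    for values in product([False, True], repeat=len(names)):
        world = dict(zip(names, values))
        # The tuple is in the result if at least one clause is fully true.
        if any(all(world[m] for m in clause) for clause in lineage):
            p = 1.0
            for m in names:
                p *= match_conf[m] if world[m] else 1.0 - match_conf[m]
            total += p
    return total

# Tuple produced either by matches m1 AND m2, or by match m3 alone.
example_lineage = [{"m1", "m2"}, {"m3"}]
example_conf = {"m1": 0.9, "m2": 0.8, "m3": 0.5}
confidence = tuple_confidence(example_lineage, example_conf)
```

With these made-up numbers the confidence comes out at approximately 0.86, that is, 1 - (1 - 0.9 * 0.8) * (1 - 0.5). The enumeration is exponential in the number of decisions, so it only suits small formulas. The feedback-driven refinement step described in the abstract is not modeled here.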
IEEE Transactions on Knowledge and Data Engineering | 2008
Songting Chen; Hua-Gang Li; Junichi Tatemura; Wang-Pin Hsiung; Divyakant Agrawal; Kasim Selcuk Candan
An XML publish/subscribe system needs to filter a large number of queries over XML streams. Most existing systems only consider filtering the simple XPath statements. In this paper, we focus on filtering of the more complex generalized-tree-pattern (GTP) queries. Our filtering mechanism is based on a novel Tree-of-Path (TOP) encoding scheme, which compactly represents the path matches for the entire document. First, we show that the TOP encodings can be efficiently produced via a shared bottom-up path matching. Second, with the aid of this TOP encoding, we can (1) achieve polynomial time and space complexity for post processing, (2) avoid redundant predicate evaluations, (3) allow an efficient duplicate-free and merge join-based algorithm for merging multiple encoded path matches and (4) simplify the processing of GTP queries. Overall our approach maximizes the sharing opportunity across queries by exploiting the suffix as well as prefix sharing. At the same time, our TOP encodings allow efficient post processing for GTP queries. Extensive performance studies show that our GFilter solution not only achieves significantly better filtering performance than state-of-the-art algorithms, but also is capable of efficiently filtering the more complex GTP queries.
international conference on management of data | 2008
Yan Qi; K. Selçuk Candan; Junichi Tatemura; Songting Chen; Fenglin Liao
OLAP is an important tool in decision support. With the help of domain knowledge, such as hierarchies of attribute values, OLAP helps the user observe the effects of various decisions. One assumption of most OLAP operations is that the available domain knowledge is precise. In particular, they assume that the hierarchy of values over which the user can navigate forms a taxonomy. In this paper, we first note that when multiple heterogeneous data sources are involved in gathering the data and the associated domain knowledge, the integrated knowledge base, constructed by combining locally available taxonomies based on concept matchings, may not itself be a taxonomy. Specifically, the existence of intersections among concepts from different sources compromises the tree structure of the integrated taxonomy and prevents effective use of hierarchical navigation techniques, such as drill-down and roll-up. To cope with this, we introduce concept un-classification, where a select few of the concepts are eliminated to ensure that the remaining structure is a navigable taxonomy without concept intersections. Since un-classifying originally classified data is not desirable, we consider ways to minimize un-classification in the process. We introduce a cost model that captures the imprecision caused by the un-classification process, and we formulate the problem of finding an un-classification strategy that eliminates intersections and adds minimal imprecision to the resulting structure. We show that, when performed naively, this task can be very costly, and we therefore propose a bottom-up preprocessing strategy that supports basic navigational analytics operations, such as drill-down and roll-up. Experiments over synthetic and real-life data verified the effectiveness and efficiency of our approach.
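To make the notion of concept intersection concrete, the sketch below finds the concept pairs that prevent an integrated hierarchy from being a tree. It assumes each concept can be modeled as the set of leaf values it covers, and that a navigable taxonomy requires any two concepts to be either disjoint or nested. The function name `intersecting_pairs` and the sample data are hypothetical. The paper's cost model and bottom-up strategy are not reproduced.

```python
from itertools import combinations

def intersecting_pairs(concepts):
    """Return concept pairs that overlap without one containing the other.

    concepts: dict mapping a concept name to the set of leaf values it
              covers (an assumed model, not taken from the paper).
    """
    bad = []
    items = sorted(concepts.items(), key=lambda kv: kv[0])
    for (a, sa), (b, sb) in combinations(items, 2):
        overlap = sa & sb
        nested = sa <= sb or sb <= sa
        # A partial overlap breaks the tree structure of the hierarchy.
        if overlap and not nested:
            bad.append((a, b))
    return bad

# Two sources classify "TR" differently, so Asia and Europe intersect.
sample = {
    "World": {"JP", "CN", "TR", "FR", "DE"},
    "Asia": {"JP", "CN", "TR"},
    "Europe": {"FR", "DE", "TR"},
    "EU": {"FR", "DE"},
}
conflicts = intersecting_pairs(sample)  # [("Asia", "Europe")]
```

Eliminating either "Asia" or "Europe" from this sample leaves a structure with no intersections. Deciding which concept to eliminate at minimal imprecision is the optimization problem the abstract describes.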
mobile data management | 2008
Egemen Tanin; Songting Chen; Junichi Tatemura; Wang-Pin Hsiung
Monitoring moving objects is one of the key application domains for sensor networks. In the absence of cooperative objects and devices attached to these objects, target tracking algorithms have to be used for monitoring. In this paper, we show that many of the applications of moving object monitoring systems could be addressed with low-frequency snapshot-based queries. For this query type, we show that existing target tracking algorithms may not be the least expensive solutions. We introduce an approach that uses two alternating strategies: we maintain cheap, low-quality knowledge of moving objects' locations between snapshots and trigger expensive sensor readings only when a snapshot period has elapsed. With extensive experiments, we show that our approach is significantly more energy efficient than established methods. It is also more effective than existing data-and-query centric in-network query processing schemes, as it can maintain object identities between snapshots.
international conference on embedded networked sensor systems | 2007
Egemen Tanin; Songting Chen; Junichi Tatemura; Wang-Pin Hsiung
Efficient data acquisition in WSNs has attracted significant interest. For example, TinyDB [2] introduced query dissemination and data aggregation trees. Later, a probabilistic model of the physical world was used in [1]. More recently, [3] argued that probabilistic models of the physical world used in acquisition may miss outliers and introduced spatio-temporal suppression-based methods. We classify these established approaches as query-and-data centric approaches for optimizing the data acquisition process.
very large data bases | 2006
Songting Chen; Hua Gang Li; Junichi Tatemura; Wang Pin Hsiung; Divyakant Agrawal; K. Selçuk Candan
very large data bases | 2006
Hua Gang Li; Songting Chen; Junichi Tatemura; Divyakant Agrawal; K. Selçuk Candan; Wang Pin Hsiung
very large data bases | 2006
K. Selçuk Candan; Wang Pin Hsiung; Songting Chen; Junichi Tatemura; Divyakant Agrawal
Archive | 2007
Songting Chen; Junichi Tatemura; Wang-Pin Hsiung; Divyakant Agrawal; Kasim Selcuk Candan; Hua-gang Li