
Publication


Featured research published by Sakire Arslan Ay.


ACM Multimedia | 2008

Viewable scene modeling for geospatial video search

Sakire Arslan Ay; Roger Zimmermann; Seon Ho Kim

Video sensors are becoming ubiquitous and the volume of captured video material is very large; tools for searching video databases are therefore indispensable. Current techniques that extract features purely from the visual signals of a video struggle to achieve good results. By considering video-related meta-information, more relevant and precisely delimited search results can be obtained. In this study we propose a novel approach for querying videos based on the notion that the geographical location of the captured scene, in addition to the location of the camera, can provide valuable information and may be used as a search criterion in many applications. This study provides an estimation model of the viewable area of a scene for indexing and searching and reports on a prototype implementation. Among our objectives is to stimulate a discussion of these topics in the research community, as information fusion of different georeferenced data sources is becoming increasingly important. Initial results illustrate the feasibility of the proposed approach.
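
The viewable-scene model itself is not spelled out in the abstract; below is a minimal sketch of one plausible reading, treating the scene as a circular sector given by the camera position, its compass heading, the lens's angular field of view, and a maximum visible distance. The function name and parameter set are illustrative assumptions, not the paper's exact formulation.

```python
import math

def in_viewable_scene(cam, obj, heading_deg, view_angle_deg, max_dist):
    """Test whether a point lies inside a camera's viewable scene, modeled
    as a circular sector: position + compass heading + angular field of
    view + maximum visible distance (all assumed parameters)."""
    dx, dy = obj[0] - cam[0], obj[1] - cam[1]          # planar offsets: east, north
    if math.hypot(dx, dy) > max_dist:
        return False                                   # too far to be visible
    bearing = math.degrees(math.atan2(dx, dy)) % 360   # 0 deg = due north
    diff = abs((bearing - heading_deg + 180) % 360 - 180)
    return diff <= view_angle_deg / 2                  # within the angular sector

# A landmark 50 m to the north-east, camera facing north-east with a 60-degree lens:
print(in_viewable_scene((0, 0), (35, 35), heading_deg=45,
                        view_angle_deg=60, max_dist=100))   # True
```

Sampling a test like this along a clip's sensor track would give per-segment visibility, which is presumably what the indexing and search steps build on.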


ACM Multimedia | 2011

Automatic tag generation and ranking for sensor-rich outdoor videos

Zhijie Shen; Sakire Arslan Ay; Seon Ho Kim; Roger Zimmermann

Video tag annotations have become a useful and powerful feature to facilitate video search in many social media and web applications. The majority of tags assigned to videos are supplied by users, a task which is time-consuming and may result in annotations that are subjective and lack precision. A number of studies have utilized content-based extraction techniques to automate tag generation. However, these methods are compute-intensive and challenging to apply across domains. Here, we describe a complementary approach for generating tags based on the geographic properties of videos. With today's sensor-equipped smartphones, the location and orientation of a camera can be continuously acquired in conjunction with the captured video stream. Our novel technique utilizes these sensor meta-data to automatically tag outdoor videos in a two-step process. First, we model the viewable scenes of the video as geometric shapes by means of the accompanying sensor data and determine the geographic objects that are visible in the video by querying geo-information databases through the viewable scene descriptions. Subsequently, we extract textual information about the visible objects to serve as tags. Second, we define six criteria to score the tag relevance and rank the obtained tags based on these scores. Then we associate the tags with the video and with accurately delimited segments of the video. To evaluate the proposed technique we implemented a prototype tag generator and conducted a user study. The results demonstrate significant benefits of our method in terms of automation and tag utility.
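
Since the six relevance criteria are only named, not listed, in the abstract, the sketch below fills in two stand-in criteria (sighting duration and closeness to the camera) purely for illustration; generate_tags, the dictionary field names, and the injected visible predicate are all assumptions.

```python
import math
from collections import defaultdict

def camera_distance(fov, obj):
    """Planar distance from the camera position to a geo-object's centroid."""
    return math.hypot(obj["x"] - fov["x"], obj["y"] - fov["y"])

def generate_tags(fov_samples, geo_objects, visible):
    """Step 1: find the geo-objects inside each sampled viewable scene.
    Step 2 (partial): score tags so that longer and closer sightings rank
    higher; only two of the paper's six criteria are imitated here."""
    scores = defaultdict(float)
    seconds = defaultdict(list)
    for t, fov in enumerate(fov_samples):          # one FOV sample per second
        for obj in geo_objects:
            if visible(fov, obj):                  # step 1: visibility query
                seconds[obj["name"]].append(t)
                scores[obj["name"]] += 1.0 / (1.0 + camera_distance(fov, obj))
    ranked = sorted(scores, key=scores.get, reverse=True)
    # Tag, score, and the seconds of the clip in which the object is visible,
    # mirroring the association of tags with delimited video segments.
    return [(name, round(scores[name], 3), seconds[name]) for name in ranked]
```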


Multimedia Systems | 2010

Relevance ranking in georeferenced video search

Sakire Arslan Ay; Roger Zimmermann; Seon Ho Kim

The rapid adoption and deployment of ubiquitous video cameras has led to the collection of voluminous amounts of media data. However, indexing and searching large video databases remain very challenging tasks. Recently, some recorded video data are automatically annotated with meta-data collected from various sensors such as Global Positioning System (GPS) receivers and compass devices. In our earlier work, we proposed the notion of a viewable scene model derived from the fusion of location and direction sensor information with a video stream. Such georeferenced media streams are useful in many applications and, very importantly, they can effectively be searched via their meta-data on a large scale. Consequently, search by geo-properties complements traditional content-based retrieval methods. The result of a georeferenced video query will in general consist of a number of video segments that satisfy the query conditions with varying degrees of relevance. For example, a building of interest may appear in a video segment, but may only be visible in a corner. Therefore, an essential and integral part of a video query is the ranking of the result set according to the relevance of each clip. An effective result ranking is even more important for video than for text search, since browsing the results can only be achieved by viewing each clip, which is very time-consuming. In this study, we investigate and present three ranking algorithms that use the spatial and temporal properties of georeferenced videos to effectively rank search results. To allow our techniques to scale to large video databases, we further introduce a histogram-based approach that allows fast online computations. An experimental evaluation demonstrates the utility of the proposed methods.
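
A hedged sketch of the histogram idea: precompute, per video, how many seconds of footage cover each cell of a coarse grid, then score a query region online by summing covered-seconds. The cell size, function names, and the particular score are assumptions; the paper's three ranking algorithms are not reproduced here.

```python
from collections import defaultdict

CELL = 100.0  # histogram cell size in meters (an assumed granularity)

def build_histogram(fov_cells_per_second):
    """Offline step: given, for each second of a video, the set of grid
    cells its viewable scene covers, count covered-seconds per cell."""
    hist = defaultdict(int)
    for cells in fov_cells_per_second:
        for cell in cells:
            hist[cell] += 1
    return hist

def rank(query_cells, histograms):
    """Online step: score each video by total covered-seconds inside the
    query region; a cheap proxy for spatio-temporal overlap relevance."""
    scores = {vid: sum(h[c] for c in query_cells if c in h)
              for vid, h in histograms.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```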


Advances in Geographic Information Systems | 2010

Generating synthetic meta-data for georeferenced video management

Sakire Arslan Ay; Seon Ho Kim; Roger Zimmermann

Various sensors, such as GPS and compass devices, can now be manufactured cost-effectively, which allows their deployment in conjunction with mobile video cameras. Hence, recorded clips can automatically be annotated with geospatial information, and the resulting georeferenced videos may be used in various Geographic Information System (GIS) applications. However, the research community lacks large-scale, realistic test datasets of such sensor-fused information with which to evaluate new techniques, since collecting real-world test data requires considerable time and effort. To fill this void, we propose an approach for generating synthetic video meta-data with realistic geospatial properties for mobile video management research. We highlight the essential aspects of georeferenced video meta-data and present an approach to simulate the behavioral patterns of mobile cameras in the synthetic data. The data generation process can be customized through user parameters for a variety of GIS applications that use mobile videos. We demonstrate the feasibility and applicability of the proposed approach by providing comparisons with real-world data.
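
One way such synthetic meta-data could be produced is a smoothed random walk for the camera position, with a compass heading that pans around the direction of travel. Everything below (names, distributions, default parameters) is an illustrative assumption rather than the paper's generator.

```python
import math
import random

def synthesize_track(n_seconds, speed=1.4, turn_sigma=5.0, pan_sigma=10.0):
    """Generate one synthetic camera track: per-second position following a
    random walk with smoothed turning (roughly walking speed), plus a
    camera heading that pans around the direction of travel."""
    x = y = 0.0
    course = random.uniform(0, 360)          # direction of travel, degrees
    samples = []
    for t in range(n_seconds):
        course = (course + random.gauss(0, turn_sigma)) % 360
        x += speed * math.sin(math.radians(course))
        y += speed * math.cos(math.radians(course))
        heading = (course + random.gauss(0, pan_sigma)) % 360   # camera pans
        samples.append({"t": t, "x": x, "y": y, "heading": heading})
    return samples

# Example: a two-minute synthetic clip's worth of meta-data.
track = synthesize_track(120)
```

Tuning parameters such as turn_sigma and pan_sigma is where user customization for different GIS applications would plug in.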


ACM Transactions on Multimedia Computing, Communications, and Applications | 2008

Distributed musical performances: Architecture and stream management

Roger Zimmermann; Elaine Chew; Sakire Arslan Ay; Moses Pawar

An increasing number of novel applications produce a rich set of different data types that need to be managed efficiently and coherently. In this article we present our experience with designing and implementing a data management infrastructure for a distributed immersive performance (DIP) application. The DIP project investigates a versatile framework for the capture, recording, and replay of video, audio, and MIDI (Musical Instrument Digital Interface) streams in an interactive environment for collaborative music performance. We focus on two classes of data streams that are generated within this environment. The first category consists of high-resolution isochronous media streams, namely audio and video. The second class comprises MIDI data produced by electronic instruments. MIDI event sequences are alphanumeric in nature and fall into the category of data streams that have been of interest to data management researchers in recent years. We present our data management architecture, which provides a repository for all DIP data. Streams of both categories need to be acquired, transmitted, stored, and replayed in real time. Data items are correlated across different streams with temporal indices. The audio and video streams are managed in our own High-performance Data Recording Architecture (HYDRA), which integrates multistream recording and retrieval in a consistent manner. This article reports on the practical issues and challenges that we encountered during the design, implementation, and experimental phases of our prototype. We also present some analysis results and discuss future extensions for the architecture.
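
The cross-stream correlation can be pictured as mapping each alphanumeric MIDI event to the media frame active at its capture timestamp. The sketch below assumes a shared capture clock and is only an illustration of the temporal-index idea, not the HYDRA implementation.

```python
import bisect

def correlate(midi_events, frame_times):
    """Align each MIDI event with the video frame being captured at its
    timestamp, assuming both streams share one capture clock."""
    out = []
    for ev in midi_events:                       # [(timestamp_sec, message), ...]
        i = bisect.bisect_right(frame_times, ev[0]) - 1
        out.append((ev, max(i, 0)))              # (event, frame index)
    return out

# Example: a 30 fps frame clock and two note-on events.
frames = [i / 30.0 for i in range(300)]
print(correlate([(0.51, "note_on C4"), (2.02, "note_on E4")], frames))
```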


Journal of Visual Communication and Image Representation | 2010

Design and implementation of geo-tagged video search framework

Seon Ho Kim; Sakire Arslan Ay; Roger Zimmermann

User-generated video content is experiencing significant growth, which is expected to continue and further accelerate. As an example, users are currently uploading 20 hours of video per minute to YouTube. Making such video archives effectively searchable is one of the most critical challenges of multimedia management. Current search techniques that utilize signal-level content extraction from video struggle to scale. Here we present a framework based on the complementary idea of acquiring sensor streams automatically in conjunction with video content. Of special interest are the geographic properties of mobile videos. The meta-data from sensors can be used to model the coverage area of scenes as spatial objects such that videos can effectively, and on a large scale, be organized, indexed, and searched based on their fields of view. We present an overall framework that is augmented with our design and implementation ideas to illustrate the feasibility of this concept of managing geo-tagged video.
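
As a rough picture of the meta-data such a framework attaches to a clip, the record below bundles the per-sample values an FOV model needs; the field names and types are assumptions for illustration, not the framework's schema.

```python
from dataclasses import dataclass

@dataclass
class FovSample:
    """One sensor sample attached to a video stream: when it was taken in
    media time, plus the parameters that define the scene's coverage area."""
    video_id: str
    t: float           # media time in seconds
    lat: float         # GPS position
    lon: float
    heading: float     # compass direction, degrees from north
    view_angle: float  # angular extent of the lens, degrees
    max_dist: float    # estimated visible distance, meters

# A video then becomes searchable as the sequence of spatial objects formed
# by its samples, independent of the (much larger) visual content.
sample = FovSample("vid-001", 12.0, 1.3521, 103.8198, 45.0, 60.0, 250.0)
```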


ACM Multimedia | 2009

GRVS: a georeferenced video search engine

Sakire Arslan Ay; Lingyan Zhang; Seon Ho Kim; Ma He; Roger Zimmermann

An increasing number of recorded videos are being tagged with geographic properties of the camera scenes. This meta-data is of significant use for storing, indexing and searching large collections of videos. By considering video-related meta-information, more relevant and precisely delimited search results can be returned. Our system implementation demonstrates a prototype of a georeferenced video search engine (GRVS) that utilizes an estimation model of a camera's viewable scene for efficient video search. For video acquisition, our system provides automated annotation software that captures videos and their respective fields of view (FOVs). The acquisition software allows community-driven data contributions to the search engine.


GeoInformatica | 2014

Large-scale geo-tagged video indexing and queries

He Ma; Sakire Arslan Ay; Roger Zimmermann; Seon Ho Kim

With the widespread adoption of smartphones, a large number of user-generated videos are produced every day. The embedded sensors, e.g., GPS and the digital compass, make it possible to access videos based on their geo-properties. In our previous work, we created a framework for integrated, sensor-rich video acquisition (with one instantiation implemented in the form of smartphone applications) which associates a continuous stream of location and viewing-direction information with the collected videos, hence allowing them to be expressed and manipulated as spatio-temporal objects. These sensor meta-data are considerably smaller in size than the visual content and are helpful in effectively and efficiently searching for geo-tagged videos in large-scale repositories. In this study, we propose a novel three-level grid-based index structure and introduce a number of related query types, including typical spatial queries and ones based on a bounded radius and a viewing-direction restriction. These two criteria are important in many video applications, and we demonstrate their importance with a real-world dataset. Moreover, experimental results on a large-scale synthetic dataset show that our approach provides a significant speed improvement of at least 30%, considering a mix of queries, compared to a multi-dimensional R-tree implementation.
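
The paper's index has three levels; the single-level sketch below only illustrates the underlying principle of bucketing FOV samples into grid cells and answering bounded-radius queries with an optional viewing-direction restriction. Class and field names, the cell size, and the angular tolerance are assumptions.

```python
import math
from collections import defaultdict

CELL = 500.0  # grid cell size in meters (an assumed granularity)

class GridIndex:
    """Single-level stand-in for the paper's three-level grid index."""

    def __init__(self):
        self.cells = defaultdict(list)

    def insert(self, s):
        """s: dict with camera position 'x', 'y' and compass 'heading'."""
        self.cells[(int(s["x"] // CELL), int(s["y"] // CELL))].append(s)

    def query(self, x, y, radius, direction=None, tol=45.0):
        """Bounded-radius query, optionally restricted to samples whose
        heading is within `tol` degrees of `direction`."""
        hits = []
        r = int(radius // CELL) + 1
        cx, cy = int(x // CELL), int(y // CELL)
        for i in range(cx - r, cx + r + 1):      # visit candidate cells only
            for j in range(cy - r, cy + r + 1):
                for s in self.cells.get((i, j), ()):
                    if math.hypot(s["x"] - x, s["y"] - y) > radius:
                        continue                 # outside the bounded radius
                    if direction is not None:
                        d = abs((s["heading"] - direction + 180) % 360 - 180)
                        if d > tol:
                            continue             # wrong viewing direction
                    hits.append(s)
        return hits
```

Coarser and finer grids can be layered over the same keying scheme, which is presumably where the additional index levels come in.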


Proceedings of the First Annual ACM SIGMM Conference on Multimedia Systems | 2010

Vector model in support of versatile georeferenced video search

Seon Ho Kim; Sakire Arslan Ay; Byunggu Yu; Roger Zimmermann

Increasingly, geographic properties are being associated with videos, especially those captured from mobile cameras. The meta-data from camera-attached sensors can be used to model the coverage area of the scene as a spatial object such that videos can be organized, indexed and searched based on their fields of view (FOV). The most accurate representation of an FOV is the geometric shape of a circular sector. However, spatial search and indexing methods are traditionally optimized for rectilinear shapes because of their simplicity. Established methods often use an approximation shape, such as a minimum bounding rectangle (MBR), to efficiently filter a large archive for possibly matching candidates. A second, refinement step is then applied to perform the time-consuming, precise matching function. MBR estimation has been successful for general spatial overlap queries; however, it provides limited flexibility for georeferenced video search. In this study we propose a novel vector-based model for FOV estimation which provides a more versatile basis for georeferenced video search while providing competitive performance for the filter step. We demonstrate how the vector model can provide a unified method to perform traditional overlap queries while also enabling searches that, for example, concentrate on the vicinity of the camera's position or harness its view direction. To the best of our knowledge no comparable technique exists today.
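
A hedged reading of the vector model: store each FOV as a camera point plus a view vector, so that a vicinity query needs only the point, and a direction query becomes a dot-product test, something an MBR alone cannot express. The concrete filter functions below are illustrative, not the paper's exact conditions.

```python
import math

def fov_vector(x, y, heading_deg, max_dist):
    """Vector representation of an FOV: the camera point plus a view
    vector of length max_dist along the compass heading."""
    rad = math.radians(heading_deg)
    return (x, y, max_dist * math.sin(rad), max_dist * math.cos(rad))

def vicinity_filter(fov, qx, qy, radius):
    """Keep FOVs whose camera lies within `radius` of the query point;
    this uses only the point part of the vector."""
    x, y, _, _ = fov
    return math.hypot(qx - x, qy - y) <= radius

def direction_filter(fov, qdx, qdy, min_cos=0.5):
    """Keep FOVs looking roughly along the query direction (qdx, qdy):
    a cosine-similarity test on the view vector."""
    _, _, dx, dy = fov
    n1 = math.hypot(dx, dy) or 1.0
    n2 = math.hypot(qdx, qdy) or 1.0
    return (dx * qdx + dy * qdy) / (n1 * n2) >= min_cos
```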


ACM SIGMM Conference on Multimedia Systems | 2011

Energy-efficient mobile video management using smartphones

Jia Hao; Seon Ho Kim; Sakire Arslan Ay; Roger Zimmermann

Mobile devices are increasingly popular for the versatile capture and delivery of video content. However, the acquisition and transmission of large amounts of video data on mobile devices face fundamental challenges such as power and wireless bandwidth constraints. To support diverse mobile video applications, it is critical to overcome these challenges. We present a design framework that brings together several key ideas to enable energy-efficient mobile video management applications. First, we leverage off-the-shelf smartphones as mobile video sensors. Second, concurrently with video recording we acquire geospatial sensor meta-data to describe the videos. Third, we immediately upload the meta-data to a server to enable low-latency video search. This last step allows for very energy-efficient transmissions, as the sensor data sets are small and the bulky video data can be uploaded on demand, if and when needed. We present the design, a simulation study, and a preliminary prototype of the proposed system. Experimental results show that our approach substantially prolongs the battery life of mobile devices while only slightly increasing the search latency.
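
The upload policy described above can be sketched as two phases: push the small sensor meta-data immediately so the clip becomes searchable, and ship the bulky video only when the server asks for it. The clip structure and the upload callback below are hypothetical stand-ins for the app's recording object and network layer.

```python
import json

def on_recording_finished(clip, upload):
    """Phase 1: immediately upload the tiny sensor meta-data so the clip
    is searchable; a few kilobytes of GPS/compass samples, cheap to send."""
    meta = {"clip_id": clip["id"], "samples": clip["sensor_samples"]}
    upload("/metadata", json.dumps(meta))

def on_server_request(clip, upload, clip_id):
    """Phase 2: transmit the bulky video bytes only on demand, if and when
    a search result is actually viewed."""
    if clip["id"] == clip_id:
        upload("/video/" + clip_id, clip["video_bytes"])
```

Deferring the expensive transmission is what trades a slight increase in search latency for a substantially longer battery life.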

Collaboration


Dive into Sakire Arslan Ay's collaborations.

Top Co-Authors

Roger Zimmermann, National University of Singapore

Seon Ho Kim, University of Southern California

Zhijie Shen, National University of Singapore

He Ma, National University of Singapore

Jia Hao, National University of Singapore

Beomjoo Seo, National University of Singapore

Guanfeng Wang, National University of Singapore

Lingyan Zhang, National University of Singapore

Ma He, National University of Singapore

Ying Zhang, National University of Singapore