Publication


Featured research published by Ivo Santos.


very large data bases | 2009

Microsoft CEP server and online behavioral targeting

Mohamed H. Ali; C. Gerea; Balan Sethu Raman; Beysim Sezgin; T. Tarnavski; Tomer Verona; Ping Wang; Peter Zabback; Asvin Ananthanarayan; Anton Kirilov; M. Lu; Alex Raizman; R. Krishnan; Roman Schindlauer; Torsten Grabs; S. Bjeletich; Badrish Chandramouli; Jonathan Goldstein; S. Bhat; Ying Li; V. Di Nicola; Xiaoyang Sean Wang; David Maier; S. Grell; O. Nano; Ivo Santos

In this demo, we present the Microsoft Complex Event Processing (CEP) Server, Microsoft CEP for short. Microsoft CEP is an event stream processing system featured by its declarative query language and its multiple consistency levels of stream query processing. Query composability, query fusing, and operator sharing are key features in the Microsoft CEP query processor. Moreover, the debugging and supportability tools of Microsoft CEP provide visibility of system internals to users. Web click analysis has been crucial to behavior-based online marketing. Streams of web click events provide a typical workload for a CEP server. Meanwhile, a CEP server with its processing capabilities plays a key role in web click analysis. This demo highlights the features of Microsoft CEP under a workload of web click events.


international conference on data engineering | 2011

Accurate latency estimation in a distributed event processing system

Badrish Chandramouli; Jonathan Goldstein; Roger S. Barga; Mirek Riedewald; Ivo Santos

A distributed event processing system consists of one or more nodes (machines), and can execute a directed acyclic graph (DAG) of operators called a dataflow (or query), over long-running high-event-rate data sources. An important component of such a system is cost estimation, which predicts or estimates the “goodness” of a given input, i.e., operator graph and/or assignment of individual operators to nodes. Cost estimation is the foundation for solving many problems: optimization (plan selection and distributed operator placement), provisioning, admission control, and user reporting of system misbehavior. Latency is a significant user metric in many commercial real-time applications. Users are usually interested in quantiles of latency, such as worst-case or 99th percentile. However, existing cost estimation techniques for event-based dataflows use metrics that, while they may have the side-effect of being correlated with latency, do not directly or provably estimate latency. In this paper, we propose a new cost estimation technique using a metric called Mace (Maximum cumulative excess). Mace is provably equivalent to maximum system latency in a (potentially complex, multi-node) distributed event-based system. The close relationship to latency makes Mace ideal for addressing the problems described earlier. Experiments with real-world datasets on Microsoft StreamInsight deployed over 1–13 nodes in a data center validate our ability to closely estimate latency (within 4%), and the use of Mace for plan selection and distributed operator placement.


distributed event-based systems | 2014

JetStream: enabling high performance event streaming across cloud data-centers

Radu Tudoran; Olivier Nano; Ivo Santos; Alexandru Costan; Hakan Soncu; Luc Bougé; Gabriel Antoniu

The easily-accessible computation power offered by cloud infrastructures coupled with the revolution of Big Data are expanding the scale and speed at which data analysis is performed. In their quest for finding the Value in the 3 Vs of Big Data, applications process larger data sets, within and across clouds. Enabling fast data transfers across geographically distributed sites becomes particularly important for applications which manage continuous streams of events in real time. Scientific applications (e.g. the Ocean Observatory Initiative or the ATLAS experiment) as well as commercial ones (e.g. Microsoft's Bing and Office 365 large-scale services) operate on tens of data-centers around the globe and follow similar patterns: they aggregate monitoring data, assess the QoS or run global data mining queries based on inter-site event stream processing. In this paper, we propose a set of strategies for efficient transfers of events between cloud data-centers and we introduce JetStream: a prototype implementing these strategies as a high performance batch-based streaming middleware. JetStream is able to self-adapt to the streaming conditions by modeling and monitoring a set of context parameters. It further aggregates the available bandwidth by enabling multi-route streaming across cloud sites. The prototype was validated on tens of nodes from US and Europe data-centers of the Windows Azure cloud using synthetic benchmarks and with application code from the context of the ALICE experiment at CERN. The results show an increase in transfer rate of 250 times over individual event streaming. Besides, introducing an adaptive transfer strategy brings an additional 25% gain. Finally, the transfer rate can further be tripled thanks to the use of multi-route streaming.


international conference on management of data | 2012

RACE: real-time applications over cloud-edge

Badrish Chandramouli; Joris Claessens; Suman Nath; Ivo Santos; Wenchao Zhou

The Cloud-Edge topology - where multiple smart edge devices such as phones are connected to one another via the Cloud - is becoming ubiquitous. We demonstrate RACE, a novel framework and system for specifying and efficiently executing distributed real-time applications in the Cloud-Edge topology. RACE uses LINQ for StreamInsight to succinctly express a diverse suite of useful real-time applications. Further, it exploits the processing power of edge devices and the Cloud to partition and execute such queries in a distributed manner. RACE features a novel cost-based optimizer that efficiently finds the optimal placement, minimizing global communication cost while handling multi-level join queries and asymmetric network links.


Future Generation Computer Systems | 2016

JetStream: Enabling high throughput live event streaming on multi-site clouds

Radu Tudoran; Alexandru Costan; Olivier Nano; Ivo Santos; Hakan Soncu; Gabriel Antoniu

Scientific and commercial applications operate nowadays on tens of cloud datacenters around the globe, following similar patterns: they aggregate monitoring or sensor data, assess the QoS or run global data mining queries based on inter-site event stream processing. Enabling fast data transfers across geographically distributed sites allows such applications to manage the continuous streams of events in real time and quickly react to changes. However, traditional event processing engines often consider data resources as second-class citizens and support access to data only as a side-effect of computation (i.e. they are not concerned by the transfer of events from their source to the processing site). This is an efficient approach as long as the processing is executed in a single cluster where nodes are interconnected by low latency networks. In a distributed environment, consisting of multiple datacenters, with orders of magnitude differences in capabilities and connected by a WAN, this will undoubtedly lead to significant latency and performance variations. This is namely the challenge we address in this paper, by proposing JetStream, a high performance batch-based streaming middleware for efficient transfers of events between cloud datacenters. JetStream is able to self-adapt to the streaming conditions by modeling and monitoring a set of context parameters. It further aggregates the available bandwidth by enabling multi-route streaming across cloud sites, while at the same time optimizing resource utilization and increasing cost efficiency. The prototype was validated on tens of nodes from US and Europe datacenters of the Windows Azure cloud with synthetic benchmarks and a real-life application monitoring the ALICE experiment at CERN. The results show a 3x increase of the transfer rate using the adaptive multi-route streaming, compared to state of the art solutions.


very large data bases | 2013

DiAl: distributed streaming analytics anywhere, anytime

Ivo Santos; Marcel Tilly; Badrish Chandramouli; Jonathan Goldstein

Connected devices are expected to grow to 50 billion in 2020. Through our industrial partners and their use cases, we validated the importance of inflight data processing to produce results with low latency, in particular local and global data analytics capabilities. In order to cope with the scalability challenges posed by distributed streaming analytics scenarios, we propose two new technologies: (1) JStreams, a low footprint and efficient JavaScript complex event processing engine supporting local analytics on heterogeneous devices and (2) DiAlM, a distributed analytics management service that leverages cloud-edge evolving topologies. In the demonstration, based on a real manufacturing use case, we walk through a situation where operators supervise manufacturing equipment through global analytics, and drill down into alarm cases on the factory floor by locally inspecting the data generated by the manufacturing equipment.


international conference on management of data | 2013

Query containment in entity SQL

Guillem Rull; Philip A. Bernstein; Ivo Santos; Yannis Katsis; Sergey Melnik; Ernest Teniente

We describe a software architecture we have developed for a constructive containment checker of Entity SQL queries defined over extended ER schemas expressed in Microsoft's Entity Data Model. Our application of interest is compilation of object-to-relational mappings for Microsoft's ADO.NET Entity Framework, which has been shipping since 2007. The supported language includes several features which have been individually addressed in the past but, to the best of our knowledge, they have not been addressed all at once before. Moreover, when embarking on an implementation, we found no guidance in the literature on how to modularize the software or apply published algorithms to a commercially-supported language. This paper reports on our experience in addressing these real-world challenges.


distributed event-based systems | 2014

Achieving high throughput for large scale event streaming across geographically distributed data-centers with JetStream

Radu Tudoran; Olivier Nano; Ivo Santos; Alexandru Costan; Hakan Soncu; Luc Bougé; Gabriel Antoniu

The increasing scale at which data processing is being performed nowadays calls for data management systems that enable high-performance data exchanges among geographically remote instances of large web services. In this demonstration we show how JetStream can increase the transfer rate of events which are streamed between geographically remote cloud data centers. The demonstration setup focuses on presenting how the binding can be done between JetStream and the event source on one hand and with the StreamInsight processing engine on the other hand. By considering a data source with an event generation rate that is variable in time, we demonstrate the importance of adapting the transfer scheme to the streaming context.


Archive | 2011

Local event processing

Olivier Nano; Ivo Santos; Marcel Tilly; Tomer Verona


Archive | 2010

Visual analysis and debugging of complex event flows

Ramkumar Krishnan; Tihomir Tarnavski; Sebastien Peray; Ivo Santos; Olivier Nano; Marcel Tilly
