Annika Baumann
Humboldt University of Berlin
Publication
Featured research published by Annika Baumann.
Expert Systems With Applications | 2018
Annika Baumann; Johannes Haupt; Fabian Gebert; Stefan Lessmann
We assess the applicability of graph metrics to predict purchase probabilities. Real-world clickstream data of two online retailers is used. Graphs are derived out of sessions of website visitors. Distance- and centrality-based graph metrics are useful for prediction. Closeness vitality, radius, number of circles and self-loops are most important. The prediction of online user behavior (next clicks, repeat visits, purchases, etc.) is a well-studied subject in research. Prediction models typically rely on clickstream data that is captured during the visit of a website and embodies user agent-, path-, time- and basket-related information. The aim of this paper is to propose an alternative approach to extract auxiliary information from the website navigation graph of individual users and to test the predictive power of this information. Using two real-world large datasets of online retailers, we develop an approach to construct within-session graphs from clickstream data and demonstrate the relevance of corresponding graph metrics to predict purchases.
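As an illustration of what a within-session graph and its metrics can look like, here is a minimal sketch in plain Python. It is not the authors' feature extraction: the page names, the `session_graph` helper and the choice to compute only self-loops and radius (two of the metrics the abstract names) are assumptions made for this example.

```python
from collections import deque

def session_graph(clicks):
    """Build a directed graph (adjacency sets) from consecutive page views."""
    graph = {page: set() for page in clicks}
    for src, dst in zip(clicks, clicks[1:]):
        graph[src].add(dst)
    return graph

def count_self_loops(graph):
    """Number of pages that link to themselves (e.g. a page reload)."""
    return sum(1 for node, nbrs in graph.items() if node in nbrs)

def radius(graph):
    """Radius of the undirected version of the graph: the smallest eccentricity."""
    undirected = {n: set() for n in graph}
    for src, nbrs in graph.items():
        for dst in nbrs:
            if src != dst:
                undirected[src].add(dst)
                undirected[dst].add(src)

    def eccentricity(start):
        # Breadth-first search gives shortest hop distances from `start`.
        dist = {start: 0}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nbr in undirected[node]:
                if nbr not in dist:
                    dist[nbr] = dist[node] + 1
                    queue.append(nbr)
        return max(dist.values())

    return min(eccentricity(n) for n in undirected)

# Hypothetical session: the visitor reloads the product page once.
clicks = ["home", "category", "product", "product", "category", "cart"]
g = session_graph(clicks)
features = {
    "nodes": len(g),
    "edges": sum(len(nbrs) for nbrs in g.values()),
    "self_loops": count_self_loops(g),
    "radius": radius(g),
}
print(features)
```

Because a session is a single walk through the site, its graph is connected, so the eccentricity of every node is well defined. An empty session would need separate handling.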
Computers & Industrial Engineering | 2015
Benjamin Fabian; Annika Baumann; Jessika Lackner
Relying on services in the cloud involves high availability risks. We present a survey on cloud outages and causes. Furthermore, we analyze the network connectivity of cloud providers. Our approach is based on an empirical dataset of the Internet topology. This helps to identify providers that could suffer more from Internet outages. Relying on services in the cloud involves manifold availability risks and concerns. This article focuses on the network reachability of cloud services. We present a study on cloud outages and causes, and analyze the topological connectivity of major cloud service providers (CSPs) by graph-based measures. Our approach is based on the construction and integration of an empirical dataset describing the connections between Autonomous Systems (ASs) of organizations that form the Internet backbone. According to our findings, though the ASs of CSPs generally appear to be better connected than an average AS, they also vastly differ in several connectivity measures, sometimes by more than an order of magnitude. Our results help to identify well-connected CSPs and CSPs that could potentially suffer more from Internet outages, if no additional path redundancy is provided. Our approach can be used by CSPs to assess connectivity beyond their own premises. It can also support cloud service customers during benchmarking and selection of CSPs when high availability is a critical requirement.
conference on risks and security of internet and systems | 2014
Annika Baumann; Benjamin Fabian
The importance of the Internet as today's communication and information medium cannot be overestimated. Reduced Internet reliability can lead to significant financial losses for businesses and economies. But how robust is the Internet with respect to failures, accidents, and malicious attacks? We investigate this question from the perspective of graph analysis. First, we develop a graph model of the Internet at the level of Autonomous Systems based on empirical data. Then, a global assessment of Internet robustness is conducted with respect to several failure and attack modes. Our results indicate that even today the Internet could be very vulnerable to smart attack strategies.
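The contrast between random failures and targeted attacks can be sketched on a toy graph. The example below is an assumption-laden illustration, not the paper's method or data: the six-node graph is invented, and removing the highest-degree nodes is only one common attack strategy among many. Robustness is measured here as the share of the original nodes that remain in the largest connected component.

```python
import random
from collections import deque

def largest_component_fraction(graph, removed):
    """Share of the original nodes in the largest surviving component."""
    alive = set(graph) - set(removed)
    seen, best = set(), 0
    for start in alive:
        if start in seen:
            continue
        seen.add(start)
        queue, size = deque([start]), 0
        while queue:
            node = queue.popleft()
            size += 1
            for nbr in graph[node]:
                if nbr in alive and nbr not in seen:
                    seen.add(nbr)
                    queue.append(nbr)
        best = max(best, size)
    return best / len(graph)

def targeted_attack(graph, k):
    """Pick the k highest-degree nodes (degrees taken from the intact graph)."""
    return sorted(graph, key=lambda n: len(graph[n]), reverse=True)[:k]

def random_failure(graph, k, seed=0):
    """Pick k nodes uniformly at random, reproducibly via the seed."""
    return random.Random(seed).sample(sorted(graph), k)

# Hypothetical topology: node 0 is a hub, nodes 1 and 2 also peer directly.
graph = {0: {1, 2, 3, 4, 5}, 1: {0, 2}, 2: {0, 1}, 3: {0}, 4: {0}, 5: {0}}

attack = largest_component_fraction(graph, targeted_attack(graph, 1))
failure = largest_component_fraction(graph, random_failure(graph, 1))
print(f"after targeted attack: {attack:.2f}, after random failure: {failure:.2f}")
```

On this toy graph, removing the hub leaves only the pair of directly peering nodes connected, whereas removing a leaf leaves the rest of the network intact.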
european conference on information systems | 2015
Annika Baumann; Stefan Lessmann; Kristof Coussement; Koen W. De Bock
Churn modeling is important to sustain profitable customer relationships in saturated consumer markets. A churn model predicts the likelihood of customer defection. This helps to target retention offers to the right customers and use marketing resources efficiently. Several statistical prediction methods exist in marketing, but all of these suffer from an important limitation: they do not allow the analyst to account for campaign planning objectives and constraints during model building. Our key proposition is that creating churn models in awareness of actual business requirements increases the performance of the final model for marketing decision support. To demonstrate this, we propose a decision-centric framework to create churn models. We test our modeling framework on eight real-life churn data sets and find that it performs significantly better than state-of-the-art churn models. We estimate that our approach increases the per customer profits of retention campaigns by .47 on average. Further analysis confirms that this improvement comes directly from maximizing business objectives during model building. The main implication of our study is thus that companies should shift from a purely statistical to a more business-driven modeling approach when predicting customer churn.
critical information infrastructures security | 2015
Annika Baumann; Benjamin Fabian
International Journal of Networking and Virtual Organisations | 2013
Annika Baumann; Benjamin Fabian
The Internet of today permeates societies and markets as a critical infrastructure. Dramatic network incidents have already happened in history with strong negative economic impacts. Therefore, assessing the vulnerability of Internet connections against failures, accidents and malicious attacks is an important field of high practical relevance. Based on a large integrated dataset describing the Internet as a complex graph, this paper develops a multi-dimensional Connectivity Risk Score that, to our knowledge, constitutes the first proposal for a topological connectivity-risk indicator of single Autonomous Systems, the organizational units of the Internet backbone. This score encompasses a variety of topological robustness metrics and can help risk managers to assess the vulnerability of their organizations even beyond network perimeters. Such analyses can be conducted in a user-friendly way with the help of CORIA, a newly developed software framework for connectivity risk analysis. Our approach can serve as an important element in an encompassing strategy to assess and improve companies’ connectivity to the Internet.
international conference on communications | 2017
Benjamin Fabian; Annika Baumann; Mathias Ehlert; Vasilis Ververis; Tatiana Ermakova
Given the importance of the internet for worldwide communication and services, its resilience against attacks, accidents, or attempts to misuse political control becomes critical for businesses and society. This article focuses on the question of how vulnerable specific geographical regions are to an internet access disruption or to censorship-based impediments due to governmental control. In particular, a new metric is developed that measures the geographical internet resilience on a country level. For this purpose, several indices based on geography, technologies, and control are combined into a single, rank-based score indicating the internet resilience of a particular country compared to others.
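One simple way to combine several indices into a single rank-based score is to rank the countries on each index and average the ranks. The sketch below assumes exactly that; the abstract does not state the actual aggregation rule, and the country labels, index names and values are hypothetical.

```python
def rank_positions(values):
    """Map each key to its rank (1 = best); higher values are better.
    Tied values share the best rank among them."""
    ordered = sorted(values.values(), reverse=True)
    return {key: ordered.index(value) + 1 for key, value in values.items()}

def combined_rank_score(indices):
    """Average the per-index ranks of each country (lower = more resilient)."""
    per_index = [rank_positions(vals) for vals in indices.values()]
    countries = per_index[0].keys()
    return {c: sum(r[c] for r in per_index) / len(per_index) for c in countries}

# Hypothetical index values for three hypothetical countries.
indices = {
    "geography": {"A": 0.9, "B": 0.4, "C": 0.6},
    "technology": {"A": 0.7, "B": 0.8, "C": 0.2},
    "control": {"A": 0.5, "B": 0.3, "C": 0.1},
}
scores = combined_rank_score(indices)
most_resilient = min(scores, key=scores.get)
print(scores, most_resilient)
```

Averaging ranks rather than raw values keeps indices with different units or scales comparable, at the cost of discarding how large the gaps between countries are.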
Social Science Research Network | 2017
Tatiana Ermakova; Benjamin Fabian; Annika Baumann; Mykyta Izmailov; Hanna Krasnova
The Internet can be considered the most important infrastructure for modern society and businesses. A loss of Internet connectivity has strong negative financial impacts on businesses and economies. Therefore, assessing Internet connectivity, in particular beyond an organization's own premises and area of direct control, is of growing importance in the face of potential failures, accidents, and malicious attacks. This paper presents CORIA, a software framework for an easy analysis of connectivity risks based on large network graphs. It provides researchers, risk analysts, network managers and security consultants with a tool to assess an organization's connectivity and path options through the Internet backbone, including a user-friendly and insightful visual representation of results. CORIA is flexibly extensible in terms of novel data sets, graph metrics, and risk scores that enable further use cases. The performance of CORIA is evaluated by several experiments on the Internet graph and further randomly generated networks.
european conference on information systems | 2015
Benjamin Fabian; Annika Baumann; Marian Keil
The Bitcoin digital currency increasingly attracts a substantial number of Internet users. This study focuses on the future outlook of Bitcoin by identifying drivers and impediments of the currency's adoption. To this end, we conduct an empirical survey of around one hundred Bitcoin experts and discuss the results. Our research contributes to the practical and theoretical discussions in the cryptocurrency field and broadens the understanding of the adoption and future perspectives of Bitcoin.
international conference on web information systems and technologies | 2018
Annika Baumann; Benjamin Fabian; Matthias Lischke
Reddit is a social news website that aims to protect user privacy by encouraging users to adopt pseudonyms and by refraining from any kind of personal data collection. However, users are often not aware that a lot of information about them can be gathered indirectly by analyzing their contributions and behaviour on the site. In order to investigate the feasibility of large-scale user classification with respect to the attributes social gender and citizenship, this article provides and evaluates several data mining techniques. First, a large text corpus is collected from Reddit and annotations are derived using lexical rules. Then, a discriminative approach to classification using support vector machines is undertaken and extended by using topics generated by latent Dirichlet allocation as features. Based on supervised latent Dirichlet allocation, a new generative model is drafted and implemented that captures Reddit’s specific structure of organizing information exchange. Finally, the presented techniques for user classification are evaluated and compared in terms of classification performance as well as time efficiency. Our results indicate that large-scale user classification on Reddit is feasible, which may raise privacy concerns among its community.