Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Natasa Milic-Frayling is active.

Publications


Featured research published by Natasa Milic-Frayling.


Communities and Technologies | 2009

Analyzing (social media) networks with NodeXL

Marc A. Smith; Ben Shneiderman; Natasa Milic-Frayling; Eduarda Mendes Rodrigues; Vladimir Barash; Cody Dunne; Tony Capone; Adam Perer; Eric Gleave

We present NodeXL, an extendible toolkit for network overview, discovery and exploration implemented as an add-in to the Microsoft Excel 2007 spreadsheet software. We demonstrate NodeXL data analysis and visualization features with a social media data sample drawn from an enterprise intranet social network. A sequence of NodeXL operations from data import to computation of network statistics and refinement of network visualization through sorting, filtering, and clustering functions is described. These operations reveal sociologically relevant differences in the patterns of interconnection among employee participants in the social media space. The tool and method can be broadly applied.
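To make the described pipeline concrete, here is a minimal Python sketch of an analogous import-metrics-cluster-filter sequence using networkx on a made-up edge list. NodeXL itself is an Excel add-in, so this illustrates the workflow rather than its implementation; all names and data are hypothetical.

```python
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

# "Data import": a hypothetical edge list of message exchanges between employees.
edges = [("ann", "bob"), ("ann", "cai"), ("bob", "cai"),
         ("cai", "dee"), ("dee", "eli"), ("eli", "fay"), ("dee", "fay")]
G = nx.Graph(edges)

# "Computation of network statistics": per-vertex metrics.
degree = dict(G.degree())
betweenness = nx.betweenness_centrality(G)

# "Clustering": group vertices into communities.
communities = greedy_modularity_communities(G)

# "Sorting and filtering": keep well-connected vertices, ordered by centrality.
visible = [v for v in G if degree[v] >= 2]
for v in sorted(visible, key=betweenness.get, reverse=True):
    print(f"{v}: degree={degree[v]}, betweenness={betweenness[v]:.2f}")
print("communities:", [sorted(c) for c in communities])
```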


International ACM SIGIR Conference on Research and Development in Information Retrieval | 2004

Feature selection using linear classifier weights: interaction with classification models

Dunja Mladenic; Janez Brank; Marko Grobelnik; Natasa Milic-Frayling

This paper explores feature scoring and selection based on weights from linear classification models and investigates how these methods combine with various learning models. Our comparative analysis includes three learning algorithms: Naïve Bayes, Perceptron, and Support Vector Machines (SVM), in combination with three feature weighting methods: Odds Ratio, Information Gain, and weights from linear models (the linear SVM and Perceptron). Experiments show that feature selection using weights from linear SVMs yields better classification performance than the other feature weighting methods when combined with the three explored learning algorithms. The results support the conjecture that it is the sophistication of the feature weighting method, rather than its apparent compatibility with the learning algorithm, that improves classification performance.
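For illustration, a small Python sketch of the three feature weighting schemes under comparison, applied to a synthetic document-term matrix. The odds ratio and information gain formulas follow their standard definitions; the data and parameter choices are assumptions, not the paper's experimental setup.

```python
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
X = (rng.random((200, 30)) < 0.2).astype(float)              # doc-term presence matrix
y = (X[:, 0] + X[:, 1] + rng.random(200) > 1.0).astype(int)  # synthetic labels

def odds_ratio(X, y, eps=1e-6):
    p_pos = X[y == 1].mean(axis=0).clip(eps, 1 - eps)  # P(term | positive)
    p_neg = X[y == 0].mean(axis=0).clip(eps, 1 - eps)  # P(term | negative)
    return np.log((p_pos * (1 - p_neg)) / ((1 - p_pos) * p_neg))

def information_gain(X, y, eps=1e-12):
    def H(p):
        p = np.clip(p, eps, 1 - eps)
        return -(p * np.log2(p) + (1 - p) * np.log2(1 - p))
    p_t = X.mean(axis=0)                                               # P(term present)
    p_c_t = (X[y == 1].sum(axis=0) + eps) / (X.sum(axis=0) + 2 * eps)  # P(pos | term)
    p_c_not = ((1 - X)[y == 1].sum(axis=0) + eps) / ((1 - X).sum(axis=0) + 2 * eps)
    return H(y.mean()) - p_t * H(p_c_t) - (1 - p_t) * H(p_c_not)

# Weights from a linear model: absolute components of the SVM weight vector.
svm_w = np.abs(LinearSVC(C=1.0, dual=False).fit(X, y).coef_.ravel())

k = 5  # keep the k highest-scoring features under each scheme
for name, score in [("odds ratio", odds_ratio(X, y)),
                    ("info gain", information_gain(X, y)),
                    ("SVM |w|", svm_w)]:
    print(name, "->", np.argsort(score)[-k:][::-1])
```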


International ACM SIGIR Conference on Research and Development in Information Retrieval | 2011

Crowdsourcing for book search evaluation: impact of HIT design on comparative system ranking

Gabriella Kazai; Jaap Kamps; Marijn Koolen; Natasa Milic-Frayling

The evaluation of information retrieval (IR) systems over special collections, such as large book repositories, is out of reach of traditional methods that rely upon editorial relevance judgments. Increasingly, the use of crowdsourcing to collect relevance labels has been regarded as a viable alternative that scales with modest costs. However, crowdsourcing suffers from undesirable worker practices and low-quality contributions. In this paper we investigate the design and implementation of effective crowdsourcing tasks in the context of book search evaluation. We observe the impact of aspects of the Human Intelligence Task (HIT) design on the quality of relevance labels provided by the crowd. We assess the output in terms of label agreement with a gold standard data set and observe the effect of the crowdsourced relevance judgments on the resulting system rankings. This enables us to observe the effect of crowdsourcing on the entire IR evaluation process. Using the test set and experimental runs from the INEX 2010 Book Track, we find that varying the HIT design, as well as the pooling and document ordering strategies, leads to considerable differences in agreement with the gold set labels. We then observe the impact of the crowdsourced relevance label sets on the relative system rankings using four IR performance metrics. System rankings based on MAP and Bpref remain less affected by different label sets, while Precision@10 and nDCG@10 lead to dramatically different system rankings, especially for labels acquired from HITs with weaker quality controls. Overall, we find that crowdsourcing can be an effective tool for the evaluation of IR systems, provided that care is taken when designing the HITs.
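The core measurement logic, label agreement with a gold set followed by a comparison of system rankings under different label sets, can be sketched as follows. The documents, runs, use of Precision@2 as the ranking metric, and Kendall's tau are illustrative assumptions, not the paper's exact protocol.

```python
from scipy.stats import kendalltau

gold  = {"d1": 1, "d2": 0, "d3": 1, "d4": 0, "d5": 1}   # gold relevance labels
crowd = {"d1": 1, "d2": 1, "d3": 1, "d4": 0, "d5": 0}   # crowdsourced labels

agreement = sum(gold[d] == crowd[d] for d in gold) / len(gold)
print(f"label agreement with gold set: {agreement:.0%}")

# Result lists of three hypothetical retrieval systems.
runs = {"sysA": ["d1", "d3", "d2", "d5", "d4"],
        "sysB": ["d2", "d4", "d1", "d3", "d5"],
        "sysC": ["d5", "d1", "d3", "d4", "d2"]}

def precision_at_k(ranking, labels, k=2):
    return sum(labels[d] for d in ranking[:k]) / k

def system_ranking(labels):
    return sorted(runs, key=lambda s: precision_at_k(runs[s], labels), reverse=True)

rank_gold, rank_crowd = system_ranking(gold), system_ranking(crowd)
tau, _ = kendalltau([rank_gold.index(s) for s in runs],
                    [rank_crowd.index(s) for s in runs])
print("ranking under gold labels: ", rank_gold)
print("ranking under crowd labels:", rank_crowd)
print(f"Kendall tau between rankings: {tau:.2f}")
```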


International World Wide Web Conference | 2004

SmartBack: supporting users in back navigation

Natasa Milic-Frayling; Rachel Jones; Kerry Rodden; Gavin Smyth; Alan F. Blackwell; Ralph Sommerer

This paper presents the design and user evaluation of SmartBack, a feature that complements the standard Back button by enabling users to jump directly to key pages in their navigation session, making common navigation activities more efficient. Defining key pages was informed by the findings of a user study that involved detailed monitoring of Web usage and analysis of Web browsing in terms of navigation trails. The pages accessible through SmartBack are determined automatically, based either on the structure of the user's navigation trails or on page association with specific user activities, such as searching or browsing bookmarked sites. We discuss implementation decisions and present results of a usability study in which we deployed the SmartBack prototype and monitored usage for a month in both corporate and home settings. The results show that the feature brings a qualitative improvement to the browsing experience of individuals who use it.
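A heuristic sketch of the underlying idea: scan a navigation trail and collect candidate "key pages" such as trail origins, search pages, and frequently revisited hubs. The rules and event names below are illustrative assumptions, not the published SmartBack algorithm.

```python
from collections import Counter

# A navigation session: (url, how_the_page_was_reached). Hypothetical data.
trail = [
    ("bing.com/search?q=hotels", "typed"),
    ("hotelA.example/rooms",     "link"),
    ("hotelA.example/prices",    "link"),
    ("bing.com/search?q=hotels", "back"),
    ("hotelB.example",           "link"),
    ("news.example",             "bookmark"),
]

def key_pages(trail):
    visits = Counter(url for url, _ in trail)
    keys = []
    for url, cause in trail:
        is_origin = cause in ("typed", "bookmark")  # starts a new sub-trail
        is_search = "search?" in url                # search result page
        is_hub = visits[url] > 1                    # revisited branching point
        if (is_origin or is_search or is_hub) and url not in keys:
            keys.append(url)
    return keys

print(key_pages(trail))  # candidate targets for direct "smart back" jumps
```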


Information Retrieval | 2013

An analysis of human factors and label accuracy in crowdsourcing relevance judgments

Gabriella Kazai; Jaap Kamps; Natasa Milic-Frayling

Crowdsourcing relevance judgments for the evaluation of search engines is used increasingly to overcome the issue of scalability that hinders traditional approaches relying on a fixed group of trusted expert judges. However, the benefits of crowdsourcing come with risks due to the engagement of a self-forming group of individuals (the crowd), motivated by different incentives, who complete the tasks with varying levels of attention and success. This increases the need for a careful design of crowdsourcing tasks that attracts the right crowd for the given task and promotes quality work. In this paper, we describe a series of experiments using Amazon’s Mechanical Turk, conducted to explore the ‘human’ characteristics of the crowds involved in a relevance assessment task. In the experiments, we vary the level of pay offered, the effort required to complete a task, and the qualifications required of the workers. We observe the effects of these variables on the quality of the resulting relevance labels, measured based on agreement with a gold set, and correlate them with self-reported measures of various human factors. We elicit information from the workers about their motivations, interest and familiarity with the topic, perceived task difficulty, and satisfaction with the offered pay. We investigate how these factors combine with aspects of the task design and how they affect the accuracy of the resulting relevance labels. Based on the analysis of 960 HITs and 2,880 HIT assignments resulting in 19,200 relevance labels, we arrive at insights into the complex interaction of the observed factors and provide practical guidelines to crowdsourcing practitioners. In addition, we highlight challenges in the data analysis that stem from the peculiarity of the crowdsourcing environment, where the sample of individuals engaged in specific work conditions is inherently influenced by the conditions themselves.
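One way to picture the factor analysis: correlate workers' self-reported measures with their per-worker label accuracy against a gold set. The sketch below uses synthetic data and Spearman correlation as a stand-in for the paper's actual analysis.

```python
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(1)
n_workers = 50
familiarity = rng.integers(1, 6, n_workers)   # self-reported, 1-5 scale
satisfaction = rng.integers(1, 6, n_workers)  # self-reported, 1-5 scale
# Synthetic accuracy that loosely increases with topic familiarity.
accuracy = np.clip(0.5 + 0.08 * familiarity + rng.normal(0, 0.1, n_workers), 0, 1)

for name, factor in [("familiarity", familiarity), ("satisfaction", satisfaction)]:
    rho, p = spearmanr(factor, accuracy)
    print(f"{name}: Spearman rho={rho:.2f} (p={p:.3f})")
```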


WIT Transactions on Information and Communication Technologies | 2002

Feature Selection Using Support Vector Machines

Janez Brank; Marko Grobelnik; Natasa Milic-Frayling; Dunja Mladenic

Text categorization is the task of classifying natural language documents into a set of predefined categories. Documents are typically represented by sparse vectors under the vector space model, where each word in the vocabulary is mapped to one coordinate axis and its occurrence in the document gives rise to one nonzero component in the vector representing that document. When training classifiers on large collections of documents, both the time and memory requirements connected with processing of these vectors may be prohibitive. This calls for using a feature selection method, not only to reduce the number of features but also to increase the sparsity of document vectors. We propose a feature selection method based on linear Support Vector Machines (SVMs). First, we train the linear SVM on a subset of training data and retain only those features that correspond to highly weighted components (in the absolute value sense) of the normal to the resulting hyperplane that separates positive and negative examples. This reduced feature space is then used to train a classifier over a larger training set, because more documents now fit into the same amount of memory. In our experiments we compare the effectiveness of the SVM-based feature selection with that of more traditional feature selection methods, such as odds ratio and information gain, in achieving the desired tradeoff between vector sparsity and classification performance. Experimental results indicate that, at the same level of vector sparsity, feature selection based on SVM normals yields better classification performance than odds ratio- or information gain-based feature selection when linear SVM classifiers are used.
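A minimal sketch of the described procedure in Python with scikit-learn: fit a linear SVM on a subset, keep the features with the largest absolute components of the normal vector, and retrain on the full set in the reduced space. Data, sizes, and thresholds are illustrative.

```python
import numpy as np
from scipy.sparse import csr_matrix
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
X = csr_matrix((rng.random((1000, 500)) < 0.05).astype(float))  # sparse doc-term
w_true = np.zeros(500)
w_true[:20] = 1.0                                               # 20 informative terms
y = (X @ w_true + rng.normal(0, 0.5, 1000) > 1.0).astype(int)

# Step 1: train on a subset and rank features by |component of the normal|.
subset = slice(0, 300)
svm = LinearSVC(dual=False).fit(X[subset], y[subset])
keep = np.argsort(np.abs(svm.coef_.ravel()))[-50:]   # retain 50 features

# Step 2: retrain on the full collection in the reduced (sparser) space.
clf = LinearSVC(dual=False).fit(X[:, keep], y)
print("retained features:", keep.size,
      "| training accuracy:", round(clf.score(X[:, keep], y), 3))
```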


European Conference on Information Retrieval | 2012

On aggregating labels from multiple crowd workers to infer relevance of documents

Mehdi Hosseini; Ingemar J. Cox; Natasa Milic-Frayling; Gabriella Kazai; Vishwa Vinay

We consider the problem of acquiring relevance judgements for information retrieval (IR) test collections through crowdsourcing when no true relevance labels are available. We collect multiple, possibly noisy relevance labels per document from workers of unknown labelling accuracy. We use these labels to infer the document relevance based on two methods. The first method is the commonly used majority voting (MV), which determines the document relevance based on the label that received the most votes, treating all the workers equally. The second is a probabilistic model that concurrently estimates the document relevance and the workers' accuracy using expectation maximization (EM). We run simulations and conduct experiments with crowdsourced relevance labels from the INEX 2010 Book Search track to investigate the accuracy and robustness of the relevance assessments to the noisy labels. We observe the effect of the derived relevance judgments on the ranking of the search systems. Our experimental results show that the EM method outperforms the MV method in the accuracy of both the relevance assessments and the resulting IR system rankings. The performance improvements are especially noticeable when the number of labels per document is small and the labels are of varied quality.
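The contrast between the two aggregation methods can be sketched compactly. The EM loop below is a simplified binary Dawid-Skene-style variant with a single symmetric accuracy per worker, assumed here for illustration; the paper's model may differ in its details.

```python
import numpy as np

# labels[w, d] = worker w's label for document d (1 relevant, 0 not). Toy data.
labels = np.array([[1, 1, 0, 1, 0],
                   [1, 0, 0, 1, 0],
                   [0, 1, 1, 1, 1],   # a noisier worker
                   [1, 1, 0, 1, 0]])
n_workers, n_docs = labels.shape

# Majority voting: every worker counts equally.
mv = (labels.mean(axis=0) > 0.5).astype(int)

# EM: alternate between estimating P(relevant) per document (E-step) and
# per-worker accuracy (M-step), initialized from agreement with the MV answer.
acc = (labels == mv).mean(axis=1).clip(0.01, 0.99)
for _ in range(20):
    # E-step: posterior relevance given workers' labels and accuracies
    # (uniform prior over relevant / not relevant).
    like1 = np.prod(np.where(labels == 1, acc[:, None], 1 - acc[:, None]), axis=0)
    like0 = np.prod(np.where(labels == 0, acc[:, None], 1 - acc[:, None]), axis=0)
    p_rel = like1 / (like1 + like0)
    # M-step: accuracy = expected fraction of a worker's labels matching the truth.
    match = labels * p_rel + (1 - labels) * (1 - p_rel)
    acc = match.mean(axis=1).clip(0.01, 0.99)

print("majority vote:", mv)
print("EM relevance :", (p_rel > 0.5).astype(int))
print("worker accuracies:", acc.round(2))
```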


Web Intelligence | 2003

WebScout: support for revisitation of Web pages within a navigation session

Natasa Milic-Frayling; Ralph Sommerer; Kerry Rodden

WebScout is a system that creates a personal archive of Web pages seen by the user and a rich record of the user's navigation, including various types of user- and system-generated annotations. We explore how this rich archive can be used to support user navigation, in particular revisitation of pages within a navigation session. We describe the WebScout SessionNavigator feature, which enhances the current browser functionality by providing both sequential and graph representations of the user's navigation. It introduces the concept of a WebTrail, which designates a sequence of navigation steps initiated by a particular event, such as a search, explicit entry of a URL into the address bar, or following a link from a bookmark list. We present details of a user study that explores how users perceive and remember their navigation on the Web.
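A small sketch of the WebTrail notion: split a navigation log into trails, each opened by an initiating event (a search, a typed URL, or a bookmark), with ordinary link clicks extending the current trail. The event names are assumptions for illustration.

```python
TRAIL_STARTERS = {"search", "typed_url", "bookmark"}

def split_into_trails(log):
    """log: list of (url, event) pairs in visit order."""
    trails, current = [], []
    for url, event in log:
        if event in TRAIL_STARTERS and current:
            trails.append(current)   # the starter event opens a new trail
            current = []
        current.append((url, event))
    if current:
        trails.append(current)
    return trails

log = [("bing.com/?q=lisbon", "search"),
       ("visitlisbon.example", "link"),
       ("visitlisbon.example/hotels", "link"),
       ("weather.example", "typed_url"),
       ("news.example", "bookmark"),
       ("news.example/tech", "link")]

for i, trail in enumerate(split_into_trails(log), 1):
    print(f"WebTrail {i}:", " -> ".join(url for url, _ in trail))
```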


Social Informatics | 2012

Do You Know the Way to SNA?: A Process Model for Analyzing and Visualizing Social Media Network Data

Derek L. Hansen; Dana Rotman; Elizabeth Bonsignore; Natasa Milic-Frayling; Eduarda Mendes Rodrigues; Marc A. Smith; Ben Shneiderman

Traces of activity left by social media users can shed light on individual behavior, social relationships, and community efficacy. Tools and processes to make sense of social traces are essential for enabling practitioners to study and nurture meaningful and sustainable social interaction. Yet such tools and processes remain in their infancy. This paper describes a study of 15 graduate students who were learning to apply Social Network Analysis (SNA) to data from online communities. Their emergent practices were observed via a pre-post survey, diaries, observations, interviews, analysis of assignments and online class interactions, and a group modeling session. From this in-depth look, we derive the Network Analysis and Visualization (NAV) process model and use it to highlight the stages where interaction with peers, experts, and features of the SNA tool was most useful. Visualization played an essential role in supporting networked thinking, as did the iterative nature of goal formation, data structuring, and data analysis. The paper concludes with a discussion of how the NAV model informs the design of SNA tools and services and supports social media practitioners.


Privacy, Security, Risk and Trust | 2011

Group-in-a-Box Layout for Multi-faceted Analysis of Communities

Eduarda Mendes Rodrigues; Natasa Milic-Frayling; Marc A. Smith; Ben Shneiderman; Derek L. Hansen

Communities in social networks emerge from interactions among individuals and can be analyzed through a combination of clustering and graph layout algorithms. These approaches result in 2D or 3D visualizations of clustered graphs, with groups of vertices representing individuals that form a community. However, in many instances the vertices have attributes that divide individuals into distinct categories, such as gender, profession, or geographic location. It is often important to investigate which categories of individuals comprise each community and, vice versa, how the community structures associate individuals from the same category. Currently, there are no effective methods for analyzing both the community structure and the category-based partitions of social graphs. We propose Group-in-a-Box (GIB), a meta-layout for clustered graphs that enables multi-faceted analysis of networks. It uses the treemap space-filling technique to display each graph cluster or category group within its own box, sized according to the number of vertices therein. GIB optimizes visualization of the network sub-graphs, providing a semantic substrate for category-based and cluster-based partitions of social graphs. We illustrate the application of GIB to multi-faceted analysis of real social networks and discuss desirable properties of GIB using synthetic datasets.
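To illustrate the space-filling idea, here is a toy slice-and-dice treemap in Python that allocates each cluster a box with area proportional to its vertex count. GIB's actual layout is more sophisticated; this is only a minimal sketch of the principle.

```python
def slice_and_dice(clusters, x, y, w, h, vertical=True):
    """clusters: list of (name, size); returns {name: (x, y, w, h)} boxes."""
    boxes, total, offset = {}, sum(s for _, s in clusters), 0.0
    for name, size in clusters:
        frac = size / total
        if vertical:   # split the strip left-to-right
            boxes[name] = (x + offset, y, w * frac, h)
            offset += w * frac
        else:          # split top-to-bottom
            boxes[name] = (x, y + offset, w, h * frac)
            offset += h * frac
    return boxes

# Hypothetical clusters sized by vertex count, laid out in a 100x60 canvas.
clusters = [("cluster A", 40), ("cluster B", 25), ("cluster C", 10)]
for name, (bx, by, bw, bh) in slice_and_dice(clusters, 0, 0, 100, 60).items():
    print(f"{name}: box at ({bx:.0f},{by:.0f}), {bw:.0f}x{bh:.0f} (area {bw*bh:.0f})")
```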

Collaboration


Dive into Natasa Milic-Frayling's collaborations.

Top Co-Authors

Ingemar J. Cox

University College London

David A. Evans

Carnegie Mellon University
