Publication


Featured research published by Derek Doran.


Data Mining and Knowledge Discovery | 2011

Web robot detection techniques: overview and limitations

Derek Doran; Swapna S. Gokhale

Most modern Web robots that crawl the Internet to support value-added services and technologies possess sophisticated data collection and analysis capabilities. Some of these robots, however, may be ill-behaved or malicious, and hence, may impose a significant strain on a Web server. It is thus necessary to detect Web robots in order to block undesirable ones from accessing the server. Such detection is also essential to ensure that the robot traffic is considered appropriately in the performance and capacity planning of Web servers. Despite a variety of Web robot detection techniques, there is no consensus regarding a single technique, or even a specific “type” of technique, that performs well in practice. Therefore, to aid in the development of a practically applicable robot detection technique, this survey presents a critical analysis and comparison of the prevalent detection approaches. We propose a framework to classify the existing detection techniques into four categories based on their underlying detection philosophy. We compare the different classes to gain insights into those characteristics that make up an effective robot detection scheme. Finally, we discuss why the contemporary techniques fail to offer a general solution to the robot detection problem and propose a set of key ingredients necessary for strong Web robot detection.


Advances in Social Networks Analysis and Mining | 2013

Human sensing for smart cities

Derek Doran; Swapna S. Gokhale; Aldo Dagnino

Smart cities are powered by the ability to self-monitor and respond to signals and data feeds from heterogeneous physical sensors. These physical sensors, however, are fraught with interoperability and dependability challenges. Moreover, they also cannot shed light on human emotions and factors that impact smart city initiatives. Yet every day, millions of city dwellers share their observations, thoughts, feelings, and experiences about their city through social media updates. This paper describes how citizens can serve as human sensors in providing supplementary, alternate, and complementary sources of information for smart cities. It presents a methodology, based on a probabilistic language model, to extract the perceptions that may be relevant to smart city initiatives from social media updates. Geo-tagged tweets collected over a two-month period from New York City are used to illustrate the potential of social media powered human sensors.


Advances in Social Networks Analysis and Mining | 2013

A comparison of web robot and human requests

Derek Doran; Kevin Morillo; Swapna S. Gokhale

Sophisticated Web robots sport a wide variety of functionality and visiting characteristics, constituting a significant percentage of the requests serviced by a Web server. Unlike human clients that retrieve information off a site by navigating links and ignoring irrelevant information, Web robots may collect many different types of resources, and employ varying navigation strategies to find the knowledge on the site they desire. Thus, the resource request patterns of their visits are unpredictable and cannot be inferred based on our knowledge of human request patterns. In this paper, we perform an analysis on the types of resources requested by Web robots using recent Web logs from an academic Web server. We study the distribution of response sizes and response codes, the types of resources requested, and popularity of resources for requests from Web robots. Throughout, we contrast our findings against human resource request patterns. We find reasons to suggest that robots severely handicap the ability of Web server caches to operate with high performance.


Network Computing and Applications | 2008

Discovering New Trends in Web Robot Traffic Through Functional Classification

Derek Doran; Swapna S. Gokhale

This paper proposes a novel functional classification scheme to understand and analyze web robot traffic. The scheme is rooted in the recognition that the crawling behavior of a robot on a site is primarily governed by its intended purpose or functionality. We apply the classification rules to analyze web server access logs from the University of Connecticut School of Engineering domain. The analysis results indicate how the classification scheme can provide insights into the robot traffic based on their functionality.


International Conference on Machine Learning and Applications | 2012

The Importance of Outlier Relationships in Mobile Call Graphs

Derek Doran; Veena B. Mendiratta; Chitra Phadke; Huseyin Uzunalioglu

Mobile phones have become one of the primary tools for individuals to communicate, to access data networks, and to share information. Service providers collect data about the calls placed on their network, and these calls exhibit a large degree of variability. Providers model the structure of the relationships between network subscribers as a mobile call graph. In this paper, we apply a new measure to quantify by how much a relationship between users in a mobile call graph deviates from an average relationship. This measure is used to explore the connection between calling behaviors and the complex structure mobile call graphs take. We study a large call graph from a major service provider and learn that distant, outlier relationships play the largest role in maintaining connectivity between cellular users, and that calling features of users more strongly influence tie variation compared to social features. We also observe a rapid decay of the graph's massively connected component as outlier ties are removed.


Advances in Social Networks Analysis and Mining | 2016

Finding street gang members on Twitter

Lakshika Balasuriya; Sanjaya Wijeratne; Derek Doran; Amit P. Sheth

Most street gang members use Twitter to intimidate others, to present outrageous images and statements to the world, and to share recent illegal activities. Their tweets may thus be useful to law enforcement agencies to discover clues about recent crimes or to anticipate ones that may occur. Finding these posts, however, requires a method to discover gang member Twitter profiles. This is a challenging task since gang members represent a very small population of the 320 million Twitter users. This paper studies the problem of automatically finding gang members on Twitter. It outlines a process to curate one of the largest sets of verifiable gang member profiles that have ever been studied. A review of these profiles establishes differences in the language, images, YouTube links, and emojis gang members use compared to the rest of the Twitter population. Features from this review are used to train a series of supervised classifiers. Our classifier achieves a promising F1 score with a low false positive rate.


Journal of the Association for Information Science and Technology | 2012

A classification framework for web robots

Derek Doran; Swapna S. Gokhale

The behavior of modern web robots varies widely when they crawl for different purposes. In this article, we present a framework to classify these web robots from two orthogonal perspectives, namely, their functionality and the types of resources they consume. Applying the classification framework to a year-long access log from the UConn SoE web server, we present trends that point to significant differences in their crawling behavior.


International Conference on Machine Learning and Applications | 2015

Request Type Prediction for Web Robot and Internet of Things Traffic

H. Nathan Rude; Derek Doran

The volume of Web robot traffic seen by Web servers and clouds continues to increase with the popularity of Internet of Things (IoT) devices. Such traffic exhibits decidedly different statistical and resource request patterns compared to humans. However, the optimizations ensuring high levels of Web systems and cloud performance require traffic to exhibit the statistical and behavioral patterns of humans, not robots. This necessitates the design of novel Web system optimizations to handle Web robot traffic effectively. Caches are a basic component of high performing Web systems, but their effectiveness relies on accurate resource request prediction. In this paper, we explore a suite of classifiers for the resource request type prediction problem for robot traffic. Our analysis reveals: (i) a striking difference in the request patterns of robots across multiple servers from the same domain, and (ii) that Elman neural networks hold promise to predict request types despite these differences.


Computational Aspects of Social Networks | 2013

Triads, transitivity, and social effects in user interactions on Facebook

Derek Doran; Huda Alhazmi; Swapna S. Gokhale

Most computational techniques that analyze Online Social Networks (OSNs) aim to discover patterns in a network's structure and the behavior of its users, but do not seek to understand how people's motives lead to these patterns. Studying the social effects that cause these patterns, however, can produce deeper insights that may transcend a specific network and are generically applicable. Therefore, a more promising approach is to anchor computational techniques to the underlying social effects that can explain why users interact the way they do. In this paper, we discover how the social effects of stature, relationship strength, and egocentricity shape the interactions among Facebook users. These effects are explored through transitivity in triads, which are network units that capture dynamics among triples of users. The analysis suggests that Facebook interactions are influenced by users with concentrated stature and strong bonds. However, the activities of popular and over-active users have little influence.


International Conference on Machine Learning and Applications | 2012

Detecting Web Robots Using Resource Request Patterns

Derek Doran; Swapna S. Gokhale

A significant proportion of Web traffic is now attributed to Web robots, and this proportion is likely to grow over time. These robots may threaten the security, privacy, functionality, and performance of a Web server due to their unregulated crawling behavior. Therefore, to assess their impact, it must be possible to accurately detect Web robot requests. Contemporary detection approaches, however, may cease to be effective as the behavior of both robots and humans evolves. In this paper, we present a novel detection approach that is based on the contrasts in the resource request patterns of robots and humans. The proposed scheme, which relies on an invariant behavioral difference between humans and robots, builds on the lessons from contemporary approaches. We demonstrate that the proposed approach can accurately detect Web robots and argue that it is expected to remain effective even as they continue their rapid evolution.
