Anthony Skjellum

University of Alabama at Birmingham

Publications

Featured research published by Anthony Skjellum.

Concurrency and Computation: Practice and Experience | 2006

Grid-flow: a grid-enabled scientific workflow system with a petri net-based interface

Anthony Skjellum; Zhijie Guan

Advances in computer technologies have enabled scientists to explore research issues in their respective domains at scales greater and finer than ever before. The availability of efficient data collection and analysis tools presents researchers with vast opportunities to process heterogeneous data within a distributed environment. To support the opportunities enabled by massive computation, a suitable scientific workflow system is needed to help users manage data and programs and design reusable procedures of scientific experimental tasks. In this paper, the design and prototype implementation of a scientific workflow infrastructure, called Grid‐Flow, is presented. Grid‐Flow assists researchers in specifying scientific experiments using a Petri‐net‐based interface. The Grid‐Flow infrastructure is designed as a Service Oriented Architecture with multi‐layer component models. The contributions of Grid‐Flow are as follows: (1) a new, lightweight, programmable Grid workflow language, Grid‐Flow Description Language, is provided to describe the workflow process in a Grid environment; (2) a Petri‐net‐based user interface, based on the Generic Modeling Environment, is demonstrated to help the user design the workflow process with a Petri‐net model; and (3) a program integration component of the Grid‐Flow system is presented to integrate all possible programs into the system.

International Parallel and Distributed Processing Symposium | 2008

Accelerating Reed-Solomon coding in RAID systems with GPUs

Matthew L. Curry; Anthony Skjellum; H.L. Ward; Ronald B. Brightwell

Graphical Processing Units (GPUs) have been applied to more types of computations than just graphics processing for several years. Until recently, however, GPU hardware has not been capable of efficiently performing general data processing tasks. With the advent of more general-purpose extensions to GPUs, many more types of computations are now possible. One such computation that we have identified as being suitable for the GPU's unique architecture is Reed-Solomon coding in a manner appropriate for RAID-type systems. In this paper, we motivate the need for RAID with triple-disk parity and describe a pipelined architecture for using a GPU for this purpose. Performance results show that the GPU can outperform a modern CPU on this problem by an order of magnitude and also confirm that a GPU can be used to support a system with at least three parity disks with no performance penalty.

ACM Symposium on Applied Computing | 2008

Mining spam email to identify common origins for forensic application

Chun Wei; Alan P. Sprague; Gary Warner; Anthony Skjellum

In recent years, spam email has become a major tool for criminals to conduct illegal business on the Internet. Therefore, in this paper we describe a new research approach that uses data mining techniques to study spam emails with the focus on law enforcement forensic analysis. After we retrieve useful attributes from spam emails, we use a connected components clustering algorithm to form relationships between messages. These initial clusters are then refined by using a weighted edges model where membership in the cluster requires the weight to exceed a chosen threshold. The results of the cluster membership are validated by WHOIS data, by the IP address of the computer hosting the advertised sites, and through comparison of graphical images of website fetches. This technique has been successful in identifying relationships between spam campaigns that were not identified by human researchers, enabling additional data to be brought into a single investigation.
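
The two clustering steps in this abstract (connected components, then a weight threshold on edges) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name `cluster_messages`, the message ids, and the idea that an edge weight counts shared attributes are all assumptions made for the example.

```python
from collections import defaultdict


def cluster_messages(messages, weights, threshold):
    """Return connected components over edges whose weight exceeds threshold.

    messages:  iterable of message ids (hypothetical)
    weights:   dict mapping (id_a, id_b) -> edge weight (assumed here to be
               a count of shared attributes; the paper's weighting may differ)
    """
    parent = {m: m for m in messages}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), w in weights.items():
        if w > threshold:  # keep only sufficiently strong links
            parent[find(a)] = find(b)

    groups = defaultdict(set)
    for m in messages:
        groups[find(m)].add(m)
    return sorted(sorted(g) for g in groups.values())


messages = ["m1", "m2", "m3", "m4"]
weights = {("m1", "m2"): 3, ("m2", "m3"): 1, ("m3", "m4"): 5}
clusters = cluster_messages(messages, weights, threshold=2)
```

With a threshold of 0 every edge survives and all four messages fall into one component; raising the threshold to 2 drops the weak `m2`-`m3` link and splits the group in two.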

2011 eCrime Researchers Summit | 2011

High-performance content-based phishing attack detection

Brad Wardman; Tommy Stallings; Gary Warner; Anthony Skjellum

Phishers continue to alter the source code of the web pages used in their attacks to mimic changes to legitimate websites of spoofed organizations and to avoid detection by phishing countermeasures. Manipulations can be as subtle as source code changes or as apparent as adding or removing significant content. To appropriately respond to these changes to phishing campaigns, a cadre of file matching algorithms is implemented to detect phishing websites based on their content, employing a custom data set consisting of 17,992 phishing attacks targeting 159 different brands. The results of the experiments using a variety of different content-based approaches demonstrate that some can achieve a detection rate of greater than 90% while maintaining a low false positive rate.

Concurrency and Computation: Practice and Experience | 2011

Gibraltar: A Reed-Solomon coding library for storage applications on programmable graphics processors

Matthew L. Curry; Anthony Skjellum; H. Lee Ward; Ron Brightwell

Reed–Solomon coding is a method for generating arbitrary amounts of erasure correction information from original data via matrix–vector multiplication in finite fields. Previous work has shown that modern CPUs are not well‐matched to this type of computation, requiring applications that depend on Reed–Solomon coding at high speeds (such as high‐performance storage arrays) to use hardware implementations. This work demonstrates that high performance is possible with current cost‐effective graphics processing units across a wide range of operating conditions and describes how performance will likely evolve in similar architectures. It describes the characteristics of the graphics processing unit architecture that enable high‐speed Reed–Solomon coding. A high‐performance practical library, Gibraltar, has been prototyped that performs Reed–Solomon coding on graphics processors in a manner suitable for storage arrays, along with applications with similar data resiliency needs. This library enables variably resilient erasure correcting codes to be used in a broad range of applications. Its performance is compared with that of a widely available CPU implementation, and a rationale for its API is presented. Its practicality is demonstrated through a usage example.
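
To make "matrix–vector multiplication in finite fields" concrete, the sketch below computes parity bytes over GF(2^8) on a CPU. It is not Gibraltar's API or its coding matrix: the helper names (`gf_mul`, `gf_pow`, `encode`), the reduction polynomial 0x11d, and the coefficient choice 2^(i*j) are assumptions picked for illustration.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8), reducing by x^8+x^4+x^3+x^2+1 (0x11d)."""
    result = 0
    while b:
        if b & 1:
            result ^= a      # addition in GF(2^8) is XOR
        a <<= 1
        if a & 0x100:
            a ^= 0x11d       # reduce back into 8 bits
        b >>= 1
    return result


def gf_pow(a, n):
    """Raise a to the n-th power in GF(2^8) by repeated multiplication."""
    result = 1
    for _ in range(n):
        result = gf_mul(result, a)
    return result


def encode(data, num_parity):
    """Return parity bytes as (coefficient matrix) x (data vector).

    Row i, column j of the assumed matrix is 2^(i*j), so row 0 is plain XOR.
    """
    parity = []
    for i in range(num_parity):
        acc = 0
        for j, d in enumerate(data):
            acc ^= gf_mul(gf_pow(2, i * j), d)
        parity.append(acc)
    return parity


data = [1, 2, 3]                 # one byte from each of three data disks
parity = encode(data, 2)         # two parity bytes

# Recover a lost data[1] from the XOR parity row and the surviving bytes.
recovered = parity[0] ^ data[0] ^ data[2]
```

Each byte position across the disks is coded independently, which is one reason the computation maps naturally onto many parallel threads.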

International Conference on Parallel Processing | 2010

A Lightweight, GPU-Based Software RAID System

Matthew L. Curry; H. Lee Ward; Anthony Skjellum; Ronald B. Brightwell

While RAID is the prevailing method of creating reliable secondary storage infrastructure, many users desire more flexibility than offered by current implementations. Traditionally, RAID capabilities have been implemented largely in hardware in order to achieve the best performance possible, but hardware RAID has rigid designs that are costly to change. Software implementations are much more flexible, but software RAID has historically been viewed as much less capable of high throughput than hardware RAID controllers. This work presents a system, Gibraltar RAID, that attains high RAID performance by offloading the calculations related to error correcting codes to GPUs. This paper describes the architecture, performance, and qualities of the system. A comparison to a well-known software RAID implementation, the md driver included with the Linux operating system, is presented. While this work is presented in the context of high performance computing, these findings also apply to a general RAID market.

Petascale Data Storage Workshop | 2008

Arbitrary dimension Reed-Solomon coding and decoding for extended RAID on GPUs

Matthew L. Curry; Anthony Skjellum; H.L. Ward; Ronald B. Brightwell

Reed-Solomon coding is a method of generating arbitrary amounts of checksum information from original data via matrix-vector multiplication in finite fields. Previous work has shown that CPUs are not well-matched to this type of computation, but recent graphical processing units (GPUs) have been shown through a case study to perform this encoding quickly for the 3 + 3 (three data + three parity) case. In order to be utilized in a true RAID-like system, it is important to understand how well this computation can scale in the number of data disks supported. This paper details the performance of a general Reed-Solomon encoding and decoding library that is suitable for use in RAID-like systems. Both generation and recovery are performance-tested and discussed.

Dependable Systems and Networks | 2014

Design and Evaluation of FA-MPI, a Transactional Resilience Scheme for Non-blocking MPI

Amin Hassani; Anthony Skjellum; Ron Brightwell

With the rapid scale out of supercomputers comes a corresponding higher failure frequency. Fault-tolerant methods have evolved to adapt to high rates of failure, but the behavior of MPI, the most widely used scalable programming middleware, is insufficient when confronting such failures. We present FA-MPI (Fault-Aware MPI), a set of extensions to the MPI standard designed to enable applications to implement a wide range of fault-tolerant methods. FA-MPI introduces transactional concepts to the MPI programming model for the first time to address failure detection, isolation, mitigation, and recovery via application-driven policies. To reach the maximum achievable performance of these scalable machines, overlapping communication and I/O with computation through non-blocking operations (while reducing jitter) are design themes of growing importance. Therefore, we emphasize fault-tolerant, non-blocking communication operations combined with a set of nestable lightweight transactional Try Block API extensions architected to exploit system and application hierarchy both for failure detection and recovery. This is to enable applications to run to completion with higher probability than otherwise. Scaling up and out and fault-free overhead are key concerns that can be managed by tuning transaction granularity; we provide a simulation of FA-MPI in a stencil 3D program to illustrate this. Supported failure models include but are not limited to process failures, a key difference from other proposed fault-tolerant extensions to MPI. Restriction to non-blocking operations is a current limitation as compared to other proposed approaches insofar as legacy applications are concerned, but FA-MPI aligns well with future-looking applications emphasizing Exascale. In addition, tools to evolve legacy MPI programs to this fault-aware paradigm will soon bridge that portability gap.

The Journal of Digital Forensics, Security and Law | 2010

Reeling in Big Phish with a Deep MD5 Net

Brad Wardman; Gary Warner; Heather McCalley; Sarah Turner; Anthony Skjellum

Phishing continues to grow as phishers discover new exploits and attack vectors for hosting malicious content; the traditional response using takedowns and blacklists does not appear to impede phishers significantly. A handful of law enforcement projects, for example the FBI's Digital PhishNet and the Internet Crime and Complaint Center (ic3.gov), have demonstrated that they can collect phishing data in substantial volumes, but these collections have not yet resulted in a significant decline in criminal phishing activity. In this paper, a new system is demonstrated for prioritizing investigative resources to help reduce the time and effort expended examining this particular form of online criminal activity. This research presents a means to correlate phishing websites by showing that certain websites are created by the same phishing kit. Such kits contain the content files needed to create the counterfeit website and often contain additional clues to the identity of the creators. A clustering algorithm is presented that uses collected phishing kits to establish clusters of related phishing websites. The ability to correlate websites provides law enforcement or other potential stakeholders with a means for prioritizing the allocation of limited investigative resources by identifying frequently repeating phishing offenders.
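
One way to picture correlating sites through shared kit files is to hash every content file and compare the resulting sets. The sketch below is only an illustration under stated assumptions: the file names and contents are made up, and the Jaccard overlap used here is a stand-in, since the abstract does not say which similarity measure the paper applies.

```python
import hashlib


def file_hashes(files):
    """Map a site's files (name -> bytes) to the set of their MD5 digests."""
    return {hashlib.md5(content).hexdigest() for content in files.values()}


def overlap(hashes_a, hashes_b):
    """Jaccard overlap of two hash sets (an assumed similarity measure)."""
    union = hashes_a | hashes_b
    return len(hashes_a & hashes_b) / len(union) if union else 0.0


# Hypothetical sites that share two of three kit files.
site_a = {
    "index.html": b"<html>login form A</html>",
    "logo.gif": b"GIF89a-brand-logo",
    "post.php": b"<?php /* collect credentials */ ?>",
}
site_b = {
    "index.html": b"<html>login form B</html>",
    "logo.gif": b"GIF89a-brand-logo",
    "post.php": b"<?php /* collect credentials */ ?>",
}

score = overlap(file_hashes(site_a), file_hashes(site_b))
```

Sites whose score exceeds a chosen cut-off could then be linked and grouped into clusters of likely common origin.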

ACM Transactions on Storage | 2014

A Lightweight Data Location Service for Nondeterministic Exascale Storage Systems

Zhiwei Sun; Anthony Skjellum; Lee Ward; Matthew L. Curry

In this article, we present LWDLS, a lightweight data location service designed for Exascale storage systems (storage systems with order of 10^18 bytes) and geo-distributed storage systems (large storage systems with physically distributed locations). LWDLS provides a search-based data location solution, and enables free data placement, movement, and replication. In LWDLS, probe and prune protocols are introduced that reduce topology mismatch, and a heuristic flooding search algorithm (HFS) is presented that achieves higher search efficiency than pure flooding search while having comparable search speed and coverage to the pure flooding search. LWDLS is lightweight and scalable in terms of incorporating low overhead, high search efficiency, no global state, and avoiding periodic messages. LWDLS is fully distributed and can be used in nondeterministic storage systems and in deterministic storage systems to deal with cases where search is needed. Extensive simulations modeling large-scale High Performance Computing (HPC) storage environments provide representative performance outcomes. Performance is evaluated by metrics including search scope, search efficiency, and average neighbor distance. Results show that LWDLS is able to locate data efficiently with low cost of state maintenance in arbitrary network environments. Through these simulations, we demonstrate the effectiveness of protocols and search algorithm of LWDLS.
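
For context, the pure flooding search that HFS is compared against can be sketched as a hop-limited breadth-first broadcast. This is the baseline only, not the HFS heuristic, which the abstract does not describe; the function name `flood_search`, the four-node topology, and the message-counting rule are assumptions for the example.

```python
from collections import deque


def flood_search(neighbors, start, holders, ttl):
    """Hop-limited flooding from start.

    neighbors: dict mapping node -> list of adjacent nodes (hypothetical)
    holders:   set of nodes that store the requested data
    ttl:       maximum number of hops a query may travel
    Returns (node_found_or_None, messages_sent_so_far).
    """
    visited = {start}
    frontier = deque([(start, 0)])
    messages = 0
    while frontier:
        node, hops = frontier.popleft()
        if node in holders:
            return node, messages
        if hops == ttl:
            continue                      # query expires at this node
        for peer in neighbors[node]:
            messages += 1                 # every forward costs one message
            if peer not in visited:       # duplicates are received but dropped
                visited.add(peer)
                frontier.append((peer, hops + 1))
    return None, messages


neighbors = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["a", "d"],
    "d": ["b", "c"],
}
found, cost = flood_search(neighbors, "a", holders={"d"}, ttl=2)
```

Even on this tiny graph, several of the six messages are redundant duplicates, which is the kind of inefficiency a heuristic search aims to reduce.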