M. Asif Naeem

Auckland University of Technology

Publication

Featured researches published by M. Asif Naeem.

data warehousing and olap | 2010

R-MESHJOIN for near-real-time data warehousing

M. Asif Naeem; Gillian Dobbie; Gerald Weber; Shafiq Alam

To fulfill the increasing demand of business for the latest information, current data integration approaches are moving towards real-time updates. One important element in real-time data integration is the join of a continuous incoming data stream with a disk-based relation. In this paper we investigate a stream-based join algorithm, called mesh join (MESHJOIN), and propose an improved version called reduced MESHJOIN (R-MESHJOIN). Both algorithms tune the memory, allocating parts of the memory to key components. In MESHJOIN there is a dependency between the size of partitions in an internal queue for the stream data and the number of iterations required to bring the disk-based relation into memory. This dependency hampers the optimal distribution of memory among the join components. In particular, the size of the disk-buffer varies with the size of the disk-based relation, which is unnecessary. The R-MESHJOIN algorithm, by contrast, removes this dependency, which enables an optimal distribution of available memory among the join components. In R-MESHJOIN a change in the size of the disk-based relation does not affect the size of the disk-buffer. An experimental study is conducted to validate the arguments.

web intelligence | 2010

Particle Swarm Optimization Based Hierarchical Agglomerative Clustering

Shafiq Alam; Gillian Dobbie; Patricia Riddle; M. Asif Naeem

Clustering, an important data mining task that groups data on the basis of similarities among the data, can be divided into two broad categories: partitional clustering and hierarchical clustering. We combine these two methods and propose a novel clustering algorithm called Hierarchical Particle Swarm Optimization (HPSO) data clustering. The proposed algorithm exploits the swarm intelligence of cooperating agents in a decentralized environment. The experimental results were compared with benchmark clustering techniques, which include K-means, PSO clustering, Hierarchical Agglomerative Clustering (HAC) and Density-Based Spatial Clustering of Applications with Noise (DBSCAN). The results are evidence of the effectiveness of swarm-based clustering and of its capability to perform clustering in a hierarchical agglomerative manner.

congress on evolutionary computation | 2010

A swarm intelligence based clustering approach for outlier detection

Shafiq Alam; Gillian Dobbie; Patricia Riddle; M. Asif Naeem

Outlier detection is an important field in data mining and knowledge discovery, which aims to identify abnormal observations in a large dataset. Common application areas of outlier detection are intrusion detection in computer networks, credit card fraud detection, detecting abnormal changes in stock prices, and identifying abnormal health conditions. We propose the use of a novel swarm intelligence based clustering technique called Hierarchical Particle Swarm Optimization Based Clustering (HPSO-clustering) for outlier detection. The proposed technique is able to perform Hierarchical Agglomerative Clustering (HAC) as well as outlier detection. In the proposed approach a swarm of particles evolves through different stages to identify outliers and normal clusters. Experiments on benchmark datasets show that the efficiency of the approach is better than that of some other popular outlier detection techniques.

International Journal of Data Warehousing and Mining | 2011

HYBRIDJOIN for Near-Real-Time Data Warehousing

Gillian Dobbie; M. Asif Naeem; Gerald Weber

An important component of near-real-time data warehouses is the near-real-time integration layer. One important element in near-real-time data integration is the join of a continuous input data stream with a disk-based relation. For high-throughput streams, stream-based algorithms, such as Mesh Join (MESHJOIN), can be used. However, in MESHJOIN the performance of the algorithm is inversely proportional to the size of the disk-based relation. The Index Nested Loop Join (INLJ) can be set up so that it processes stream input and can deal with intermittences in the update stream, but it has low throughput. This paper introduces a robust stream-based join algorithm called Hybrid Join (HYBRIDJOIN), which combines the two approaches. A theoretical result shows that HYBRIDJOIN is asymptotically as fast as the faster of the two algorithms. The authors present performance measurements of the implementation. In experiments using synthetic data based on a Zipfian distribution, HYBRIDJOIN performs significantly better for typical parameters of the Zipfian distribution, and in general performs in accordance with the theoretical model, while the other two algorithms are unacceptably slow under different settings.

advances in computing and communications | 2011

NL-Based Automated Software Requirements Elicitation and Specification

Ashfa Umber; Imran Sarwar Bajwa; M. Asif Naeem

This paper presents a novel approach to automate the process of software requirements elicitation and specification. Software requirements elicitation is perhaps the most important phase of software development, as a small error at this stage can result in absurd software designs and implementations. The automation of the initial phase (such as requirements elicitation) can also contribute to the long-standing challenge of automated software development. The presented approach is based on Semantics of Business Vocabulary and Business Rules (SBVR), a recent OMG standard. We have also developed a prototype tool, SR-Elicitor (an Eclipse plugin), which can be used by software engineers to record natural language software requirements and automatically transform them into an SBVR software requirements specification. The major contribution of the presented research is to demonstrate the potential of an SBVR-based approach, implemented in a prototype tool, to improve the process of requirements elicitation and specification.

data warehousing and knowledge discovery | 2012

A lightweight stream-based join with limited resource consumption

M. Asif Naeem; Gillian Dobbie; Gerald Weber

Many stream-based applications have plenty of resources available to them, but there are also applications where resource consumption must be limited. For one important class of stream-based joins, where a stream is joined with a non-stream master data set, the algorithm called MESHJOIN was proposed. MESHJOIN uses limited memory and is a candidate for a resource-aware system setup. The problem that is considered in this paper is that MESHJOIN is not very selective. In particular, the performance of the algorithm is always inversely proportional to the size of the master data table. As a consequence, the resource consumption is in some scenarios sub-optimal. We present an algorithm CACHEJOIN, which performs asymptotically at least as well as MESHJOIN but performs better in realistic scenarios, particularly if parts of the master data are used with different frequencies. In order to quantify the performance differences, we compare both algorithms using a synthetic data set with a known skewed distribution.

applications of natural language to data bases | 2012

Interacting with data warehouse by using a natural language interface

M. Asif Naeem; Saif Ullah; Imran Sarwar Bajwa

Writing Online Analytical Processing (OLAP) queries for data warehouses is a complex task that requires skill, especially for novice users. The situation becomes more critical when a low-skilled person wants to access or analyze business data from a data warehouse. These scenarios require more expertise and skill in understanding and writing accurate and functional queries. However, these complex tasks can be simplified by providing an easy interface to the users. To resolve such issues, an automated software tool is needed that assists both users and software engineers. In this paper we present a novel approach named QueGen (Query Generator) that generates OLAP queries based on a specification provided in natural English language. Users need to write the requirements in simple English in a few statements. After a semantic analysis and mapping of associated information, QueGen generates the intended OLAP queries, which can be executed directly on data warehouses. An experimental study has been conducted to analyze the performance and accuracy of the proposed tool.

data warehousing and knowledge discovery | 2013

SSCJ: A Semi-Stream Cache Join Using a Front-Stage Cache Module

M. Asif Naeem; Gerald Weber; Gillian Dobbie; Christof Lutteroth

Semi-stream processing has become an emerging area of research in the field of data stream management. One common operation in semi-stream processing is joining a stream with disk-based master data using a join operator. This join operator typically works under limited main memory, and this memory is generally not large enough to hold the whole disk-based master data. Recently, a number of semi-stream join algorithms have been proposed in the literature to achieve optimal performance, but there is still room to improve the performance. In this paper we propose a novel Semi-Stream Cache Join (SSCJ) using a front-stage cache module. The algorithm takes advantage of skewed distributions, and we present results for Zipfian distributions of the type that appear in many applications. We compare the performance of SSCJ with that of a well-known related join algorithm, Hybrid Join (HYBRIDJOIN). We also provide the cost model for our approach and validate it with experiments.

international conference on digital information management | 2014

A model transformation from NL to SBVR

Shabana Ramzan; Imran Sarwar Bajwa; Ikram Ul Haq; M. Asif Naeem

In Requirements Engineering, requirements are usually written as natural language sentences. Natural languages are ambiguous and inconsistent, so requirements written in them also tend to be ambiguous. To avoid this problem of ambiguity, we present a model transformation approach to generate requirements based on SBVR (Semantics of Business Vocabulary and Business Rules). The information provided in the source metamodel (NL) is automatically transformed into the target metamodel (SBVR). The SBVR metamodel can not only be processed by machine but also provides a precise and reliable model for software design. A standard SBVR metamodel is already available, but for natural language we propose our own metamodel because no standard metamodel is available for natural languages.

business intelligence for the real-time enterprises | 2009

Comparing Global Optimization and Default Settings of Stream-Based Joins

M. Asif Naeem; Gillian Dobbie; Gerald Weber

One problem encountered in real-time data integration is the join of a continuous incoming data stream with a disk-based relation. In this paper we investigate a stream-based join algorithm, called mesh join (MESHJOIN), and focus on a critical component in the algorithm, called the disk-buffer. In MESHJOIN the size of the disk-buffer varies with a change in the total memory budget, and tuning is required to get the maximum service rate within the limited available memory. Until now there was little data on how the position of the optimum value depends on the memory size, and no performance comparison has been carried out between the optimum and reasonable default sizes for the disk-buffer. To avoid tuning, we propose a reasonable default value for the disk-buffer size with a small and acceptable performance loss. The experimental results validate our arguments.
