Norwati Mustapha | Researchain

Publication

Featured research published by Norwati Mustapha.

Artificial Intelligence Review | 2011

A review: accuracy optimization in clustering ensembles using genetic algorithms

Reza Ghaemi; Nasir Sulaiman; Hamidah Ibrahim; Norwati Mustapha

The clustering ensemble has emerged as a prominent method for improving the robustness, stability, and accuracy of unsupervised classification solutions. It combines multiple partitions generated by different clustering algorithms into a single clustering solution. Genetic algorithms are known for their strong ability to solve optimization problems, including clustering. To date, significant progress has been made toward finding consensus clusterings that yield better results than existing clusterings. This paper presents a survey of genetic algorithms designed for clustering ensembles. It begins by introducing clustering ensembles and clustering ensemble algorithms. It then describes a number of proposed genetic-guided clustering ensemble algorithms, in particular their genotypes, fitness functions, and genetic operations. Next, the clustering accuracies of the genetic-guided clustering ensemble algorithms are compared. The paper concludes that using genetic algorithms in clustering ensembles improves clustering accuracy, and it addresses open questions subject to future research.

Knowledge Discovery and Data Mining | 2010

A Novel Approach for High Dimensional Data Clustering

Ali Alijamaat; Madjid Khalilian; Norwati Mustapha

Clustering is considered the most important unsupervised learning problem. It aims to find structure in a collection of unlabeled data. Dealing with a large quantity of data items can be problematic because of time complexity. In addition, high-dimensional data, such as time series data, is a challenging area in data clustering. Novel algorithms that are robust, scalable, efficient, and accurate are needed to cluster these kinds of data. In this study we propose a two-stage algorithm based on K-Means to achieve this objective.

Computer Society of Iran Computer Conference | 2008

OPWUMP: An Architecture for Online Predicting in WUM-Based Personalization System

Mehrdad Jalali; Norwati Mustapha; Nasir Sulaiman; Ali Mamat

The Internet is one of the fastest growing areas of intelligence gathering. During navigation, web users leave many records of their activity. This huge amount of data can be a useful source of knowledge. Sophisticated mining processes are needed to extract, understand, and use this knowledge. Web Usage Mining (WUM) systems are specifically designed to carry out this task by analyzing usage data for a particular Web site. WUM can model user behavior and therefore forecast users’ future movements. Online prediction is one web usage mining application. However, the accuracy of prediction and classification in current architectures for predicting users’ future requests still cannot satisfy users, especially on huge Web sites. To provide online prediction efficiently, we develop an architecture for online predicting in WUM-based personalization system (OPWUMP). This article advances a Web usage mining architecture that enhances classification accuracy through interaction between classification, evaluation, current user activities, and the user profile in the online phase of the architecture.

International Conference on Computer and Automation Engineering | 2009

K-Means Divide and Conquer Clustering

Madjid Khalilian; Farsad Zamani Boroujeni; Norwati Mustapha; Md. Nasir Sulaiman

Cluster analysis, a primitive form of exploration with little or no prior knowledge, consists of research developed across a wide variety of communities. Most clustering techniques ignore differences in size or level: clustering is mostly concerned with grouping similar objects or samples together, ignoring the fact that similar objects might belong to different levels. For very large data sets, data reduction, usually in the form of dimension reduction, should be performed before applying data-mining techniques, and the main question is whether some of the prepared and preprocessed data can be discarded without sacrificing the quality of the results. Existing clustering techniques normally merge small clusters with big ones, removing their identity. In this study we propose a method that uses a divide and conquer technique to improve the performance of the K-Means clustering method.

International Conference on Pattern Recognition | 2008

A new clustering approach based on graph partitioning for navigation patterns mining

Mehrdad Jalali; Norwati Mustapha; Ali Mamat; Md. Nasir Sulaiman

We present a study of Web-based user navigation pattern mining and propose a novel approach for clustering user navigation patterns. The approach is based on graph partitioning for modeling user navigation patterns. To cluster user navigation patterns, we create an undirected graph based on the connectivity between each pair of Web pages, and we propose a novel formula for assigning weights to the edges of this graph. The experimental results show that the approach can improve the quality of clustering for user navigation patterns in Web usage mining systems. These results can be used for predicting a user’s next request in huge Web sites.

Journal of Big Data | 2016

Data stream clustering by divide and conquer approach based on vector model

Madjid Khalilian; Norwati Mustapha; Nasir Sulaiman

Recently, many researchers have focused on data stream processing as an efficient method for extracting knowledge from big data. Data stream clustering is an unsupervised approach that is employed for huge data. The continuing effort on data stream clustering methods has one common goal, which is to achieve an accurate clustering algorithm. However, some issues have been overlooked by previous work on data stream clustering solutions: (1) clustering datasets that include big segments of repetitive data, (2) monitoring the clustering structure for ordinal data streams, and (3) determining important parameters such as k, the exact number of clusters in a data stream. In this paper, the DCSTREAM method is proposed to address these issues and cluster big datasets using the vector model and the k-Means divide and conquer approach. Experimental results show that DCSTREAM can achieve superior quality and performance compared to the STREAM and ConStream methods for abrupt and gradual real-world datasets. Results also show that batch processing in DCSTREAM and ConStream is time consuming compared to STREAM, but it avoids further analysis for detecting outliers and novel micro-clusters.

Multimedia Tools and Applications | 2013

An integrated semantic-based approach in concept based video retrieval

Sara Memar; Lilly Suriani Affendey; Norwati Mustapha; Shyamala Doraisamy; Mohammadreza Ektefa

Multimedia content has been growing quickly, and video retrieval is regarded as one of the most prominent issues in multimedia research. To retrieve a desired video, users express their needs in terms of queries. Queries can be on object, motion, texture, color, audio, etc. Low-level representations of video differ from the higher-level concepts that a user associates with video. Therefore, queries based on semantics are more realistic and tangible for the end user. Comprehending the semantics of a query has opened new insight into video retrieval and into bridging the semantic gap. However, the problem is that the video needs to be manually annotated in order to support queries expressed in terms of semantic concepts. Annotating the semantic concepts that appear in video shots is a challenging and time-consuming task. Moreover, it is not possible to provide annotations for every concept in the real world. In this study, an integrated semantic-based approach for similarity computation is proposed to enhance retrieval effectiveness in concept-based video retrieval. The proposed method integrates knowledge-based and corpus-based semantic word similarity measures in order to retrieve video shots for concepts whose annotations are not available to the system. The TRECVID 2005 dataset is used for evaluation, and the results of the proposed method are compared against the individual knowledge-based and corpus-based semantic word similarity measures used in previous studies in the same domain. The superiority of the integrated similarity method is shown and evaluated in terms of Mean Average Precision (MAP).
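For readers unfamiliar with the evaluation metric named in the abstract above, the following is a minimal sketch of how Mean Average Precision is commonly computed over ranked retrieval results. It follows the standard textbook definition, not code from the paper; the function names and the toy shot identifiers are assumptions made for illustration.

```python
def average_precision(ranked, relevant):
    """Average precision of one ranked result list.

    ranked:   item ids in the order the system returned them
    relevant: set of item ids judged relevant for the query
    """
    hits = 0
    precision_sum = 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            # Precision measured at the rank of each relevant hit.
            precision_sum += hits / rank
    return precision_sum / len(relevant) if relevant else 0.0


def mean_average_precision(runs):
    """Mean of average precision over (ranked, relevant) pairs, one per query."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


# Hypothetical example: two queries over made-up video shot ids.
runs = [
    (["shot1", "shot2", "shot3"], {"shot1", "shot3"}),
    (["shot4", "shot5"], {"shot5"}),
]
score = mean_average_precision(runs)
```

A higher score means relevant shots tend to appear earlier in the ranking, which is why the metric suits comparisons between similarity measures.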

International Symposium on Information Technology | 2008

A new classification model for online predicting users’ future movements

Mehrdad Jalali; Norwati Mustapha; Ali Mamat; Md. Nasir Sulaiman

Nowadays many internet users prefer to navigate to the web pages that interest them within a particular web site rather than navigating all of the site’s pages. For this reason, some techniques have been developed for predicting users’ future requests. Data mining algorithms can be applied to many prediction problems. We can exploit Web Usage Mining (WUM) to extract knowledge based on user behavior during web navigation. WUM applies data mining techniques to extract knowledge from user log files on a particular web server. WUM can model user behavior and therefore forecast users’ future movements by mining user navigation patterns. To provide online prediction efficiently, we advance an architecture for online prediction in a web usage mining system by proposing a novel model based on the Longest Common Subsequence algorithm for classifying user navigation patterns. Predicting users’ future movements in this manner can improve the accuracy of recommendations.
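The abstract above names the Longest Common Subsequence (LCS) algorithm as the basis for classifying navigation patterns but does not give details. The sketch below shows one plausible reading under stated assumptions: a session is assigned to whichever stored pattern shares the longest common subsequence of pages with it. The `classify_session` helper, the pattern labels, and the page names are hypothetical and are not taken from the paper.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two sequences (dynamic programming)."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def classify_session(session, patterns):
    """Return the label of the pattern with the largest LCS against the session.

    Assumption: raw LCS length is used as the similarity score. The paper
    may normalize or weight it differently.
    """
    return max(patterns, key=lambda label: lcs_length(session, patterns[label]))


# Hypothetical navigation patterns mined offline, and a live session.
patterns = {
    "news": ["home", "news", "sports"],
    "shop": ["home", "catalog", "cart", "checkout"],
}
label = classify_session(["home", "catalog", "cart"], patterns)
```

Because LCS tolerates skipped pages, a session can still match a pattern when the user takes detours, which is one reason it is a natural fit for click-stream data.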

IEEE Global Conference on Consumer Electronics | 2014

Proactive architecture for Internet of Things (IoTs) management in smart homes

Thinagaran Perumal; Nasir Sulaiman; Norwati Mustapha; Ahmad Shahi; R Thinaharan

Smart homes are heterogeneous by nature and consist of diverse components that promote user comfort and security. In recent times, smart homes have seen tremendous growth in Internet of Things (IoTs) applications. The huge diversity of IoTs applications generally leads to interoperability requirements that need to be fulfilled. Current IoTs management is achieved using physical platforms that lack intelligence in decision making. A proactive architecture that deploys the Event-Condition-Action (ECA) method is proposed to resolve the management of heterogeneous IoTs in smart homes. The proactive architecture, developed with a core repository that stores persistent data of the IoTs schema, proved to be an ideal solution for solving interoperability in smart homes.
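As a rough illustration of the Event-Condition-Action idea mentioned in the abstract above, the sketch below models a rule as an event name, a condition on the event payload, and an action that runs when the condition holds. This is a generic ECA pattern, not the paper's architecture; the `Rule` class, the `dispatch` function, and the sensor names are assumptions.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    event: str                          # event name the rule listens for
    condition: Callable[[dict], bool]   # predicate over the event payload
    action: Callable[[dict], str]       # command produced when the rule fires


def dispatch(rules, event, payload):
    """Run every rule registered for the event whose condition holds."""
    return [r.action(payload) for r in rules
            if r.event == event and r.condition(payload)]


# Hypothetical rule: on motion, switch the light on if the room is dark.
rules = [
    Rule(event="motion",
         condition=lambda p: p["lux"] < 50,
         action=lambda p: "light_on"),
]
commands = dispatch(rules, "motion", {"lux": 10})
```

Keeping rules as data rather than hard-coded branches is what lets devices from different vendors be managed through one shared rule set.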

International Conference on Spatial Data Mining and Geographical Knowledge Services | 2011

An extended ID3 decision tree algorithm for spatial data

Imas Sukaesih Sitanggang; Razali Yaakob; Norwati Mustapha; Ahmad Ainuddin Nuruddin

Applying data mining tasks such as classification to spatial data is more complex than applying them to non-spatial data, because spatial data mining algorithms have to consider not only the objects of interest themselves but also the neighbours of those objects in order to extract useful and interesting patterns. One classification algorithm, the ID3 algorithm, which was originally designed for non-spatial datasets, has been improved by other researchers in previous work to construct a spatial decision tree from a spatial dataset containing polygon features only. The objective of this paper is to propose a new spatial decision tree algorithm based on the ID3 algorithm for discrete features represented as points, lines, and polygons. Just as the ID3 algorithm uses information gain for attribute selection, the proposed algorithm uses spatial information gain to choose the best splitting layer from a set of explanatory layers. A new formula for spatial information gain is proposed using spatial measures for point, line, and polygon features. Empirical results demonstrate that the proposed algorithm can be used to join two spatial objects when constructing spatial decision trees on small spatial datasets. The proposed algorithm has been applied to a real spatial dataset consisting of point and polygon features. The result is a spatial decision tree with 138 leaves and an accuracy of 74.72%.
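The abstract above builds on the information gain criterion of the classic ID3 algorithm. As background, here is a minimal sketch of that classic, non-spatial criterion. It does not implement the paper's spatial information gain, whose formula is not given here; the function names and the toy records are assumptions.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Entropy reduction obtained by splitting rows on one attribute."""
    base = entropy([r[target] for r in rows])
    groups = {}
    for r in rows:
        groups.setdefault(r[attribute], []).append(r[target])
    # Weighted entropy of the partitions produced by the split.
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder


# Hypothetical records: 'cover' separates the classes perfectly, 'zone' does not.
rows = [
    {"cover": "forest", "zone": "a", "burned": "yes"},
    {"cover": "forest", "zone": "b", "burned": "yes"},
    {"cover": "urban", "zone": "a", "burned": "no"},
    {"cover": "urban", "zone": "b", "burned": "no"},
]
best = max(["cover", "zone"], key=lambda a: information_gain(rows, a, "burned"))
```

ID3 picks the attribute with the highest gain at each node. According to the abstract, the spatial variant makes the analogous choice among explanatory layers using spatial measures.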