Aina Musdholifah | Researchain

Publication

Featured research published by Aina Musdholifah.

intelligent systems design and applications | 2010

Triangular kernel nearest neighbor based clustering for pattern extraction in spatio-temporal database

Aina Musdholifah; Siti Zaiton Mohd Hashim

To date, various fields of application have used spatio-temporal databases not only to store data but also to support decision making. For example, traffic accident analysis requires knowledge of the patterns of accidents that result in death. In such analysis, a clustering technique is needed to extract patterns. This paper presents clustering of a spatio-temporal database using a kernel nearest neighbor approach, chosen for its ability to determine the number of clusters automatically. Various kernel functions exist in the literature, and the issue of concern is how to determine an appropriate kernel function for this application. In this study, two commonly used kernel functions, Gaussian and triangular, are investigated. In the experiments conducted, both functions produce reasonable clusters, but triangular kernel nearest neighbor based clustering (TKNN) performs better, with a smaller number of iterations, than Gaussian kernel nearest neighbor based clustering (ILGC) and K-means. Thus, TKNN is a good option for clustering spatio-temporal databases.
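The two kernel functions compared in this abstract have standard textbook forms. The following is a minimal sketch of those forms only, not the paper's TKNN or ILGC implementation; the function names and the scaled-distance argument `u` are assumptions made for illustration.

```python
import math


def gaussian_kernel(u: float) -> float:
    """Standard Gaussian kernel: exp(-u^2 / 2) / sqrt(2 * pi)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def triangular_kernel(u: float) -> float:
    """Triangular kernel: 1 - |u| for |u| <= 1, and 0 elsewhere."""
    return max(0.0, 1.0 - abs(u))


# Hypothetical usage: u is a distance divided by a bandwidth.
for u in (0.0, 0.5, 1.0, 2.0):
    print(u, gaussian_kernel(u), triangular_kernel(u))
```

The triangular kernel has bounded support and needs no exponential, so it is cheaper to evaluate per pair of points. Whether that accounts for the iteration counts reported above is not stated in the abstract.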

international conference on computer and communication engineering | 2010

KNN-kernel based clustering for spatio-temporal database

Aina Musdholifah; Siti Zaiton Mohd Hashim; Ito Wasito

Extracting and analyzing interesting patterns from spatio-temporal databases has drawn great interest in various fields of research. Recently, a number of experiments have explored the problem of spatial or temporal data mining, and some clustering algorithms have been proposed. However, few studies have dealt with the integration of spatial data mining and temporal data mining. Moreover, the data in a spatio-temporal database can be categorized as high-dimensional data, and current density-based clustering may have difficulties with complex data sets, including high-dimensional data. This paper presents Iterative Local Gaussian Clustering (ILGC), an algorithm that combines K-nearest neighbour (KNN) density estimation and kernel density estimation to cluster spatio-temporal data. In this approach, KNN density estimation is extended and combined with a kernel function, where KNN contributes by iteratively determining the best local data for kernel density estimation. The local best is defined as the set of neighbouring data that maximizes the kernel function. The Bayesian rule is used to deal with the problem of selecting the best local data. This paper uses the Gaussian kernel, which has proven successful in clustering. To validate the KNN-kernel based algorithm, we compare its performance against other popular algorithms, such as Self-Organizing Maps (SOM) and K-means, on a crime database. Results show that KNN-kernel based clustering outperforms the others.
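One way to picture combining k-nearest-neighbour and kernel density estimation is to score each point with a Gaussian kernel evaluated only over its k nearest neighbours. The sketch below is an illustration under that assumption. It is not the ILGC algorithm: it omits the iterative selection of local data and the Bayesian rule described in the abstract, and the function name, bandwidth choice, and normalization are hypothetical.

```python
import math


def knn_kernel_density(points, k):
    """Return an unnormalized density score for every point.

    For each point, the k nearest neighbours are found by Euclidean
    distance, the distance to the k-th neighbour is used as the
    bandwidth h, and a Gaussian kernel is averaged over those neighbours.
    """
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        neighbours = dists[:k]
        h = neighbours[-1] or 1e-12  # guard against a zero bandwidth
        total = sum(math.exp(-0.5 * (d / h) ** 2) for d in neighbours)
        scores.append(total / (k * h))
    return scores


# Hypothetical data: four points in a tight group and one far away.
sample = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
density = knn_kernel_density(sample, k=2)
```

In this toy data the isolated point receives the lowest score, which is the property a density-based method relies on to separate dense regions from sparse ones.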

international conference on computer science and information technology | 2013

Robust Local Triangular Kernel density-based clustering for high-dimensional data

Aina Musdholifah; Siti Zaiton Mohd Hashim; Razali Ngah

A number of clustering algorithms can be employed to find clusters in multivariate data. However, the effectiveness and efficiency of existing algorithms are limited when the data is high-dimensional, contains a large amount of noise, and consists of clusters with arbitrary shapes and densities. In this paper, a new kernel density-based clustering algorithm, called Local Triangular Kernel-based Clustering (LTKC), is proposed to deal with these conditions. LTKC is based on a combination of k-nearest-neighbor density estimation and triangular kernel density-based clustering. The advantages of our LTKC approach are: (1) it has a firm mathematical basis; (2) it requires only one parameter, the number of neighbors; (3) it determines the number of clusters automatically; (4) it allows discovering clusters with arbitrary shapes and densities; and (5) it is significantly faster than existing algorithms. LTKC is tested using artificial data and applied to some UCI data. A comparison with k-means, KFCM and well-known density-based clustering algorithms, including ILGC, DBSCAN, and DENCLUE, shows the superiority of our proposed LTKC algorithm.

systems, man and cybernetics | 2012

Hybrid PCA-ILGC clustering approach for high dimensional data

Aina Musdholifah; Siti Zaiton Mohd Hashim; Razali Ngah

The rapid growth in the availability of high-dimensional datasets has made conventional approaches insufficient for extracting hidden useful information. As a result, researchers today are challenged to develop new techniques to deal with data that is massive not only in the number of records but also in the number of attributes. In order to improve the effectiveness and accuracy of mining tasks on high-dimensional data, an efficient dimensionality reduction method should be executed in the data preprocessing stage before a clustering technique is applied. Many clustering algorithms have been proposed and used to discover useful information from a dataset. Iterative Local Gaussian Clustering (ILGC) is a simple density-based clustering technique that has successfully discovered the number of clusters represented in a dataset. In this paper we propose using Principal Component Analysis (PCA) to preprocess the data prior to ILGC clustering in order to simplify the analysis and visualization of multi-dimensional data sets. The proposed approach is validated with benchmark classification datasets. In addition, the performance of the proposed hybrid PCA-ILGC clustering approach is compared to the original ILGC, basic k-means and hybridized k-means. The experimental results indicate that the proposed approach is able to obtain clusters with higher accuracy, and that the time taken to process the data decreased.
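The preprocessing step described here, reducing dimensionality with PCA before clustering, can be illustrated with a small standard-library sketch. It extracts only the first principal component, and it uses power iteration on the covariance matrix as a stand-in for the eigen-decomposition or SVD a numerical library would normally provide. The function name and sample data are hypothetical, and nothing here reproduces the paper's PCA-ILGC pipeline.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (direction, scores) for the first principal component.

    rows is a list of equal-length numeric sequences. The direction is a
    unit vector; scores are the centered rows projected onto it.
    """
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [
        [sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
        for a in range(d)
    ]
    # Power iteration: repeatedly multiply by cov and renormalize.
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    scores = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return v, scores


# Hypothetical data lying along the direction (1, 2).
direction, projected = first_principal_component([(1, 2), (2, 4), (3, 6), (4, 8)])
```

The one-dimensional `projected` values could then be passed to a clustering routine in place of the original attributes, which is the general idea behind reducing dimensionality before clustering.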

knowledge discovery and data mining | 2012

Triangular kernel nearest-neighbor-based clustering algorithm for discovering true clusters

Aina Musdholifah; Siti Zaiton Mohd Hashim

Clustering is a powerful exploratory technique for extracting knowledge from given data. Several clustering techniques that have been proposed require a predetermined number of clusters. However, triangular kernel nearest-neighbor-based clustering (TKNN) has been shown to determine the number and members of clusters automatically. TKNN provides good solutions for clustering non-spherical and high-dimensional data without prior knowledge of data labels. On the other hand, there is no definitive measure for evaluating the accuracy of a clustering result. In order to evaluate the performance of the proposed TKNN clustering algorithm, we used various benchmark classification datasets. Thus, TKNN is proposed for discovering true clusters of arbitrary shape, size and density contained in the datasets. The experimental results on benchmark datasets showed the effectiveness of our technique. Our proposed TKNN achieved more accurate clustering results and required less processing time than k-means, ILGC, DBSCAN and KFCM.

Journal of Computer Science | 2018

Multiview Hierarchical Agglomerative Clustering for Identification of Development Gap and Regional Potential Sector

Tb. Ai Munandar; Azhari; Aina Musdholifah; Lincolin Arsyad

The identification of regional development gaps is an effort to see how far development has progressed in every District in a Province. By observing the gaps, policymakers are expected to be able to determine which regions should be prioritized for future development. Along with the regional gaps, identification of Gross Regional Domestic Product (GRDP) sectors is an effort to identify development achievements in certain fields, seen from the potential GRDP of a District. Two approaches are often used to identify regional development gaps and potential sectors: Klassen Typology and Location Quotient (LQ), respectively. However, the results of these methods do not show the proximity of the development gaps between one District and another within the same cluster. These methods only assign regions and GRDP sectors to fixed clusters based on their own parameter values. This research develops a new approach that combines Klassen, LQ and hierarchical agglomerative clustering (HAC) into a new method named multiview hierarchical agglomerative clustering (MVHAC). GRDP sector data for 23 Districts in West Java province were tested using Klassen, LQ, HAC and MVHAC, and the results were compared. The results show that MVHAC is able to unify the capabilities of the three previous methods and to clearly visualize the proximity of the development gaps between the regions and their GRDP sectors. MVHAC groups the 23 Districts into 3 main clusters: Cluster 1 (Quadrant 1) with 5 Districts, Cluster 2 (Quadrant 2) with 12 Districts, and Cluster 3 (Quadrant 4) with 6 Districts.
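The Location Quotient mentioned in this abstract is commonly defined as a sector's share of a district's GRDP divided by the same sector's share of the reference region's GRDP, with values above 1 usually read as a sign of a potential sector. The following is a minimal sketch of that common definition. The parameter names and sample figures are hypothetical and are not taken from the paper's West Java data.

```python
def location_quotient(sector_district, total_district, sector_reference, total_reference):
    """Location Quotient of one sector in one district.

    LQ = (sector_district / total_district) / (sector_reference / total_reference)
    """
    district_share = sector_district / total_district
    reference_share = sector_reference / total_reference
    return district_share / reference_share


# Hypothetical figures: the sector is 30% of the district's GRDP
# but only 20% of the province's GRDP, so LQ is approximately 1.5.
lq = location_quotient(30.0, 100.0, 200.0, 1000.0)
is_potential_sector = lq > 1.0
```

Because LQ is computed independently for each sector and district, it yields a label per pair rather than a measure of how close two districts are, which is the limitation the abstract attributes to it.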

Indonesian Journal of Computing and Cybernetics Systems | 2018

Local Triangular Kernel-Based Clustering (LTKC) for Case Indexing on Case-Based Reasoning

Damar Riyadi; Aina Musdholifah

This study aims to improve the performance of Case-Based Reasoning (CBR) by using cluster analysis as an indexing method to speed up case retrieval. The clustering method is Local Triangular Kernel-based Clustering (LTKC). The cosine coefficient is used to find the relevant cluster, while the similarity value is calculated using Manhattan distance, Euclidean distance, and Minkowski distance. The results of these methods are compared to find which gives the best result. This study uses three test datasets: malnutrition disease, heart disease, and thyroid disease. Test results showed that CBR with LTKC indexing has better accuracy and processing time than CBR without indexing. For malnutrition disease, the best accuracy at threshold 0.9 was obtained using Euclidean distance, with 100% accuracy and 0.0722 seconds average retrieval time. For heart disease, the best accuracy at threshold 0.9 was obtained using Minkowski distance, with 95% accuracy and 0.1785 seconds average retrieval time. For thyroid disease, the best accuracy at threshold 0.9 was obtained using Minkowski distance, with 92.52% accuracy and 0.3045 seconds average retrieval time. A comparison of CBR with SOM indexing, DBSCAN indexing, and LTKC indexing for malnutrition disease and heart disease showed almost equal accuracy.
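The measures named in this abstract have standard definitions, sketched below for numeric feature vectors. This is an illustration of those definitions only. How the study converts a distance into a similarity value, which Minkowski order it uses, and how the 0.9 threshold is applied are not stated in the abstract, so none of that is modelled here. The function names and sample vectors are hypothetical.

```python
import math


def minkowski(a, b, p):
    """Minkowski distance of order p between two equal-length vectors."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def manhattan(a, b):
    """Manhattan distance: Minkowski distance with p = 1."""
    return minkowski(a, b, 1)


def euclidean(a, b):
    """Euclidean distance: Minkowski distance with p = 2."""
    return minkowski(a, b, 2)


def cosine_coefficient(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical case vectors.
new_case, stored_case = (0.0, 0.0), (3.0, 4.0)
distances = {
    "manhattan": manhattan(new_case, stored_case),
    "euclidean": euclidean(new_case, stored_case),
    "minkowski_p3": minkowski(new_case, stored_case, 3),
}
```

In an indexed retrieval scheme of the kind described, a measure such as `cosine_coefficient` would first select a cluster, and one of the distance functions would then rank only the cases inside that cluster.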

International Review on Computers and Software | 2016

Building Melodic Feature Knowledge of Gamelan Music Using Apriori Based on Functions in Sequence (AFiS) Algorithm

Khafiizh Hastuti; Azhari Azhari; Aina Musdholifah; Rahayu Supanggah

Gamelan is a traditional music ensemble from Java, Indonesia, whose melody has characteristics that make the melodic sound of gamelan music easy to recognize. This research aims to build melodic feature knowledge of gamelan music in the form of note-sequence rules. An algorithm called AFiS (Apriori based on Functions in Sequence) is introduced to produce rules by mining frequent note sequences. The basic idea of the AFiS algorithm is to define functions in a sequence, and then to chain the functions based on their position order to identify the support value for each function. The AFiS algorithm is applied to define rules of gamelan melodic features in terms of ideal note sequences for composition. The accuracy of the note-sequence rules is evaluated by developing a recommendation system that uses the rules defined in this research. The program is expected to correctly fill in notes randomly deleted from the sequences. The results show that the accuracy of the knowledge and note-sequence rules of gamelan music, measured by correct answers, is up to 86.5%. Another evaluation examines whether the different answers given by the program are acceptable as alternatives to the original notes. This evaluation involved 4 human experts, who described their acceptance of the alternative notes based on the different answers. The results show that the different notes in 4 of 5 gendings are accepted by the experts as alternative notes.

Archive | 2013