Mostofa Kamal Rasel | Researchain

Publication

Featured researches published by Mostofa Kamal Rasel.

IETE Technical Review | 2014

A Review of Ensemble Learning Based Feature Selection

Donghai Guan; Weiwei Yuan; Young-Koo Lee; Mostofa Kamal Rasel

Abstract Feature selection is an important topic in machine learning. In recent years, ensemble learning has been integrated into feature selection, and the resulting ensemble learning based feature selection approach has been proposed and studied. The general idea is to generate multiple diverse feature selectors and combine their outputs. This approach is superior to conventional feature selection methods in many respects. Its most prominent advantage is its ability to handle the stability issue, since stability is usually poor in existing feature selection methods. This review covers different issues related to ensemble learning based feature selection, including the main modules and the stability measurement. To the best of our knowledge, this is the first review that focuses on ensemble feature selection. It can be a useful reference in the literature on feature selection.
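The general idea described in the abstract (several diverse selectors whose outputs are combined, plus a stability measurement) can be illustrated with a minimal sketch. Everything here is an assumption chosen for illustration and is not taken from the review: diversity comes from bootstrap resampling, the base selector is an absolute Pearson correlation ranker, outputs are combined by summed rank, and stability is measured as mean pairwise Jaccard similarity. The function names are hypothetical.

```python
import random
from collections import defaultdict


def correlation_score(xs, ys):
    """Absolute Pearson correlation between one feature column and the labels."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0  # a constant column carries no ranking information
    return abs(cov) / (vx * vy) ** 0.5


def rank_features(rows, labels):
    """Base selector: feature indices ordered from most to least relevant."""
    n_features = len(rows[0])
    scores = [correlation_score([r[j] for r in rows], labels)
              for j in range(n_features)]
    return sorted(range(n_features), key=lambda j: -scores[j])


def ensemble_select(rows, labels, k, n_selectors=20, seed=0):
    """Run the base selector on bootstrap samples and combine by summed rank."""
    rng = random.Random(seed)
    rank_sum = defaultdict(int)
    subsets = []  # top-k subset chosen by each individual selector
    for _ in range(n_selectors):
        idx = [rng.randrange(len(rows)) for _ in rows]  # bootstrap sample
        ranking = rank_features([rows[i] for i in idx],
                                [labels[i] for i in idx])
        subsets.append(set(ranking[:k]))
        for position, feature in enumerate(ranking):
            rank_sum[feature] += position
    selected = sorted(rank_sum, key=lambda f: rank_sum[f])[:k]
    return selected, subsets


def jaccard_stability(subsets):
    """Mean pairwise Jaccard similarity of the per-selector subsets."""
    pairs = [(a, b) for i, a in enumerate(subsets) for b in subsets[i + 1:]]
    return sum(len(a & b) / len(a | b) for a, b in pairs) / len(pairs)
```

A stability value close to 1 means the individual selectors largely agree on the chosen features; values near 0 indicate that small changes in the training data change the selection.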

Information Sciences | 2016

Topic modeling and improvement of image representation for large-scale image retrieval

Nguyen Anh Tu; Dong-Luong Dinh; Mostofa Kamal Rasel; Young-Koo Lee

In this paper, we present a new visual search system for finding similar images in a large database. Building such a system raises a number of challenges regarding the robustness of the image representations and the efficiency of the retrieval framework. To tackle these challenges, we first propose an encoding technique based on soft-assignment of local features to convert an entire image into a single vector, which is a compact and discriminative representation. This encoded vector is suitable for most types of efficient indexing methods to produce an initial result. To compensate for the lack of geometric and object-related information in the encoding scheme, we then propose a probabilistic topic model to formalize the spatial structure among the local features. Moreover, the topic model allows us to effectively extract the object and background regions from the image. This is performed by a Markov Chain Monte Carlo algorithm for approximate inference. Finally, benefiting from the extracted objects in each image, we present a re-ranking scheme to automatically refine the initial search results. Our proposed retrieval framework has two major advantages: i) an aggregation strategy through soft-assignment improves the discriminative power of the representation, which has a determinative effect on the retrieval precision; and ii) the probabilistic latent topic model enables us to not only gain insight into the spatial structure of the image, but also handle a large variation in the object appearance. The experimental results from four benchmark datasets show that our approach provides competitive accuracy, and runs about ten times faster. Our studies also verify that the proposed approach works effectively on large-scale databases of millions of images.
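As a rough illustration of soft-assignment encoding, the sketch below turns a set of local descriptors into one vector by spreading each descriptor over all codebook entries with softmax weights on negative squared distance. The abstract does not specify the weighting kernel, the aggregation, or the normalisation, so the `beta` parameter, the histogram-style accumulation, and the L2 normalisation are assumptions, and `soft_assign_encode` is a hypothetical name.

```python
import math


def soft_assign_encode(descriptors, codebook, beta=1.0):
    """Encode local descriptors as one L2-normalised vector via soft assignment.

    Each descriptor contributes a weight to every codeword, instead of voting
    only for its nearest codeword as in hard assignment.
    """
    encoded = [0.0] * len(codebook)
    for d in descriptors:
        sq_dists = [sum((a - b) ** 2 for a, b in zip(d, c)) for c in codebook]
        nearest = min(sq_dists)
        # Subtracting the smallest distance keeps exp() numerically stable.
        weights = [math.exp(-beta * (s - nearest)) for s in sq_dists]
        total = sum(weights)
        for k, w in enumerate(weights):
            encoded[k] += w / total
    norm = math.sqrt(sum(v * v for v in encoded))
    return [v / norm for v in encoded] if norm > 0 else encoded
```

A larger `beta` makes the assignment approach hard assignment; a smaller one spreads each descriptor more evenly across codewords.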

Information Sciences | 2017

Disk-based shortest path discovery using distance index over large dynamic graphs

Jihye Hong; Kisung Park; Yongkoo Han; Mostofa Kamal Rasel; Dawanga Vonvou; Young-Koo Lee

Abstract The continual change of the internet world is rapidly altering networks. Shortest path discovery, especially over dynamic networks such as web page links, telephone or route networks, and ontologies, has received intense attention because of its importance for services in IoT. For example, when a new road opens or an existing road becomes unavailable for an unexpected reason, the shortest paths must be recomputed. The system should respond promptly to its users with the updated recommended paths. In this paper, we propose a disk-based shortest path method that efficiently updates the shortest paths in a very large dynamic graph. The proposed method uses partial shortest paths as indices for efficient shortest path discovery. We classify the changes in the graph into four cases: the insertion of edges, the deletion of edges, the increase of edge weights, and the decrease of edge weights. Our proposed strategy updates only the corresponding parts of the indices for each case. Our experiments on real-world dynamic datasets verify that the proposed framework updates the shortest paths 4 to 50 times faster than the existing type of framework.
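The four update cases can be made concrete with a small in-memory sketch. This is not the paper's disk-based index: it is a simplified stand-in that maintains single-source distances, assumes non-negative edge weights, and uses hypothetical function names. Insertions and weight decreases can only shorten paths, so the sketch propagates improvements from the changed edge; deletions and weight increases matter only when the edge lay on a current shortest path, in which case the sketch conservatively recomputes.

```python
import heapq

INF = float("inf")


def relax_from(graph, dist, start):
    """Dijkstra-style propagation of improved distances starting at `start`."""
    heap = [(dist[start], start)]
    while heap:
        d, x = heapq.heappop(heap)
        if d > dist[x]:
            continue  # stale heap entry
        for y, w in graph.get(x, {}).items():
            if d + w < dist.get(y, INF):
                dist[y] = d + w
                heapq.heappush(heap, (d + w, y))
    return dist


def dijkstra(graph, source):
    """Full recomputation; graph is {u: {v: weight}} with directed edges."""
    return relax_from(graph, {source: 0}, source)


def classify_change(graph, u, v, new_weight):
    """Name the update case; new_weight=None means the edge is removed."""
    old = graph.get(u, {}).get(v)
    if old is None:
        return "unchanged" if new_weight is None else "insert"
    if new_weight is None:
        return "delete"
    if new_weight < old:
        return "decrease"
    if new_weight > old:
        return "increase"
    return "unchanged"


def apply_change(graph, dist, source, u, v, new_weight):
    """Apply one edge change and repair `dist`; returns (case, dist)."""
    case = classify_change(graph, u, v, new_weight)
    old = graph.get(u, {}).get(v)
    if case == "unchanged":
        return case, dist
    if case == "delete":
        del graph[u][v]
    else:
        graph.setdefault(u, {})[v] = new_weight
    if case in ("insert", "decrease"):
        # Distances can only shrink: update v, then push the gain onward.
        if u in dist and dist[u] + new_weight < dist.get(v, INF):
            dist[v] = dist[u] + new_weight
            relax_from(graph, dist, v)
        return case, dist
    # Delete or increase: only matters if the edge was on a shortest path.
    if u in dist and v in dist and dist[u] + old == dist[v]:
        dist = dijkstra(graph, source)  # conservative fallback
    return case, dist
```

The fallback recomputation is the part a dedicated index would replace with a localized update; the sketch only shows why the four cases call for different handling.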

Information Sciences | 2016

iTri: Index-based triangle listing in massive graphs

Mostofa Kamal Rasel; Yongkoo Han; Jin-Seung Kim; Kisung Park; Nguyen Anh Tu; Young-Koo Lee

Abstract Triangle listing is a basic operator when dealing with many graph problems. However, in-memory algorithms do not work well with recently developed massive graphs such as social networks because these graphs cannot be accommodated in the memory. Thus, external memory-based algorithms have been proposed recently, but these approaches still require frequent multiple scans of the whole graph on the disk and large volumes of calculations are performed that involve the whole graph during every iteration. In this study, we propose a novel index-based method for listing triangles in massive graphs. First, we present new notions for the vertex range index and potential cone vertex index. Next, we propose an index join-based triangle listing algorithm. Our method accesses the indexed data asynchronously and joins them to list triangles using a multi-threaded parallel processing technique. Based on experiments, we demonstrate that our algorithm outperforms the state-of-the-art solution methods by three to eight times in terms of the wall clock time.

Archive | 2018

An Efficient Subgraph Compression-Based Technique for Reducing the I/O Cost of Join-Based Graph Mining Algorithms

Mostofa Kamal Rasel; Young-Koo Lee

Many join-based graph mining algorithms, such as triangle listing and clique enumeration, output a large volume of intermediate or final data that sometimes dominates the mining cost. A few studies have addressed the size of the output data. However, those techniques are limited in that they are highly specific to their corresponding graph mining algorithms. In this paper, through careful observation of the output patterns, we propose a general compression solution that can be applied to any join-based graph algorithm. It first categorizes the overlapping and non-overlapping vertices in a resultant subgraph set of a join-based graph mining algorithm. Then it compresses the output data by removing the redundancy from the overlapping vertices and by encoding the non-overlapping vertices using a non-aligned hybrid bit vector compression technique. Our proposed technique performs the compression on-the-fly and can easily be adopted by the join-based graph mining algorithms. Experiments on real datasets show that our proposed technique, when adopted in a triangle listing algorithm, reduces the size of the output data by a factor of three and the running time by a factor of more than two. The proposed technique also reduces the I/O cost for a maximal clique listing algorithm.
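A minimal sketch of the overlapping versus non-overlapping idea, applied to triangle output: triangles that share the same first two vertices store that pair once (the overlapping part), and the differing third vertices (the non-overlapping part) are packed into a bit vector. The paper's non-aligned hybrid bit vector encoding is not reproduced here; a plain Python integer is used as an uncompressed bit vector instead, and the function names are hypothetical.

```python
from collections import defaultdict


def compress_triangles(triangles):
    """Group triangles (u, v, w) by the shared pair (u, v).

    The pair is stored once per group; bit w of the group's integer is set
    for every third vertex w that completes a triangle with (u, v).
    """
    groups = defaultdict(int)
    for u, v, w in triangles:
        groups[(u, v)] |= 1 << w
    return dict(groups)


def decompress_triangles(compressed):
    """Expand the grouped representation back into explicit triangles."""
    triangles = []
    for (u, v), bits in compressed.items():
        w = 0
        while bits:
            if bits & 1:
                triangles.append((u, v, w))
            bits >>= 1
            w += 1
    return triangles
```

Because a join-based algorithm typically emits all third vertices for one pair together, this grouping can be done while results are produced rather than in a separate pass.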

Information Sciences | 2018

Summarized bit batch-based triangle listing in massive graphs

Mostofa Kamal Rasel; En Elena; Young-Koo Lee

Abstract The presence of triangles in massive graphs provides many important uses in different graph algorithms, such as finding highly relevant vertices for dense subgraph mining, measuring the clustering coefficient, and computing the transitivity for network analysis. In-memory algorithms cannot be used for triangle listing in massive graphs because the graphs are too large to fit into memory. External memory-based techniques address this problem by focusing on the I/O efficiency to improve performance. However, triangulation is a CPU-intensive process that iteratively joins lists of neighbors to determine the adjacent vertices in each triangle. Therefore, the cost of a triangle listing algorithm on a massive graph is dominated by the join operations among the lists of neighbors. In this paper, we propose a disk-based triangle listing approach that uses an efficient technique to join the lists of neighbors by exploiting CPU parallelism through bitwise operations. We represent the lists of neighbors using bit vectors and compress them using our proposed summarized bit batch, which allows the bitwise operations to be performed directly on the compressed data. Our proposed technique slices a bit vector into a series of word-length bit batches and summarizes them by pruning the bit batches that contain only 0-bits. Our proposed approach for listing the triangles then asynchronously accesses the summarized bit batches and joins them using bitwise operations. Our proposed technique achieves 40% higher compression for some real-world datasets compared to the classic compression technique. The triangulation technique using the summarized bit batches also significantly outperforms the existing solutions in terms of wall clock time.
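The slicing, pruning, and bitwise join steps can be sketched in memory as follows. This is an illustration under assumptions, not the paper's disk-based, asynchronous implementation: the word length is fixed at 64, a dictionary from batch index to word stands in for the summarized structure, vertices are non-negative integers, and the graph is treated as simple and undirected. All names are hypothetical.

```python
WORD = 64  # assumed word length of one bit batch


def summarize(neighbors):
    """Slice a neighbor bit vector into WORD-bit batches, keeping only the
    batches that contain at least one 1-bit (all-zero batches are pruned)."""
    batches = {}
    for x in neighbors:
        batches[x // WORD] = batches.get(x // WORD, 0) | (1 << (x % WORD))
    return batches


def join(a, b):
    """Intersect two summarized vectors with bitwise AND on matching batches."""
    if len(a) > len(b):
        a, b = b, a  # iterate over the smaller summary
    out = {}
    for i, word in a.items():
        both = word & b.get(i, 0)
        if both:
            out[i] = both
    return out


def decode(batches):
    """Recover the sorted vertex ids encoded in a summarized vector."""
    result = []
    for i in sorted(batches):
        word = batches[i]
        while word:
            low = word & -word  # isolate the lowest set bit
            result.append(i * WORD + low.bit_length() - 1)
            word ^= low
    return result


def list_triangles(edges):
    """List each triangle once as (u, v, w) with u < v < w."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        lo, hi = min(u, v), max(u, v)
        adj.setdefault(lo, set()).add(hi)  # orient edges low -> high
    summaries = {u: summarize(ns) for u, ns in adj.items()}
    triangles = []
    for u in sorted(adj):
        for v in sorted(adj[u]):
            if v in summaries:
                for w in decode(join(summaries[u], summaries[v])):
                    triangles.append((u, v, w))
    return triangles
```

Each AND compares up to 64 candidate neighbors at once, which is the CPU-level parallelism the abstract refers to, and pruned batches are never touched during the join.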

International Conference on Big Data and Smart Computing | 2016

On efficiently summarizing a large dynamic graph

Kifayat Ullah Khan; Mostofa Kamal Rasel; Muhammad Noorulamin; Waqas Nawaz; Young-Koo Lee

Graph summarization is a well-known technique for creating summaries of mega-sized structures such as social networks and the World Wide Web. A prime bottleneck in this process is the inefficient pairwise similarity computation used to find similar nodes for compression. Previous work provides a scalable similarity computation strategy that uses Locality Sensitive Hashing (LSH) to improve the execution time of pairwise methods when summarizing a static graph. Although adopting LSH provides the desired acceleration, it requires large storage space for indexing candidate similar nodes. This problem becomes even more challenging for a dynamic graph, where the space complexity increases from O(b·n) to O(b·n·k), where b, n, and k are the number of hash tables, the total number of nodes, and the number of snapshots of the graph, respectively. In this paper, we propose a new index structure for LSH to align candidate similar nodes from a dynamic graph with the least storage space complexity. The proposed structure reduces space requirements by a factor η, where the range of η depends on the structural redundancy of the graphs. We evaluate our proposed solution on the summarization of four real-world dynamic graphs and obtain compression of up to 52%.
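To make the O(b·n) storage cost concrete, the sketch below builds a plain LSH index for a single static snapshot: each of the b hash tables stores every node once, bucketed by a MinHash of its neighbor set. It does not implement the paper's proposed index structure. The choice of MinHash with one hash function per table, integer node ids smaller than the prime modulus, and the function names are all assumptions made for illustration.

```python
import random
from collections import defaultdict

PRIME = 2_147_483_647  # modulus for the assumed linear hash family


def build_lsh_tables(adjacency, n_tables=4, seed=0):
    """Build one bucket table per hash function over the neighbor sets.

    adjacency maps an integer node id to a set of integer neighbor ids.
    Every node with neighbors is stored once per table, so one snapshot
    needs b * n index entries.
    """
    rng = random.Random(seed)
    params = [(rng.randrange(1, PRIME), rng.randrange(PRIME))
              for _ in range(n_tables)]
    tables = []
    for a, c in params:
        buckets = defaultdict(list)
        for node, neighbors in adjacency.items():
            if neighbors:
                signature = min((a * x + c) % PRIME for x in neighbors)
                buckets[signature].append(node)
        tables.append(dict(buckets))
    return tables


def candidate_pairs(tables):
    """Nodes sharing a bucket in any table are candidate similar nodes."""
    pairs = set()
    for buckets in tables:
        for nodes in buckets.values():
            for i, u in enumerate(nodes):
                for v in nodes[i + 1:]:
                    pairs.add((min(u, v), max(u, v)))
    return pairs
```

Repeating this construction independently for each of k snapshots multiplies the number of stored entries by k, which is the O(b·n·k) growth the abstract describes.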

International Conference on Big Data and Smart Computing | 2016