Md. Saiful Islam

Shahjalal University of Science and Technology

Publication

Featured research published by Md. Saiful Islam.

computer and information technology | 2016

Word embedding with Hellinger PCA to detect the sentiment of Bengali text

Md. Saiful Islam; Md. Al Amin; Shapan Das Uzzal

The sentiment of a sentence or comment can be detected more accurately by applying word embeddings. This article uses a word co-occurrence matrix and Skip-Gram to determine the actual contexts of words, and Hellinger PCA to find the most similar words and generate a sliding window of the most probable context words around each word. It is shown that classifying the sentiment of a comment with word embeddings achieves higher accuracy with a larger corpus. For our corpus of 2,500 comments, the accuracy achieved is 70%, and it increases rapidly with the size of the corpus.
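The co-occurrence and Hellinger steps mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy corpus, the window size and the function names are assumptions, and the PCA projection is left out so that only the Hellinger comparison of context distributions is shown.

```python
import math
from collections import Counter, defaultdict

def cooccurrence_probs(sentences, window=1):
    """Return P(context | word) from symmetric-window co-occurrence counts."""
    counts = defaultdict(Counter)
    for tokens in sentences:
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[word][tokens[j]] += 1
    probs = {}
    for word, ctx in counts.items():
        total = sum(ctx.values())
        probs[word] = {c: n / total for c, n in ctx.items()}
    return probs

def hellinger_distance(p, q):
    """Hellinger distance between two discrete distributions stored as dicts."""
    keys = set(p) | set(q)
    s = sum((math.sqrt(p.get(k, 0.0)) - math.sqrt(q.get(k, 0.0))) ** 2 for k in keys)
    return math.sqrt(s / 2.0)

# Hypothetical toy corpus of transliterated tokens (not from the paper).
corpus = [
    ["khub", "bhalo", "cinema"],
    ["khub", "sundor", "cinema"],
    ["khub", "kharap", "golpo"],
]
probs = cooccurrence_probs(corpus, window=1)
d_same = hellinger_distance(probs["bhalo"], probs["sundor"])  # identical contexts
d_diff = hellinger_distance(probs["bhalo"], probs["kharap"])  # partly different contexts
```

Words that share the same contexts end up at distance 0, while words with partly different contexts are further apart; a PCA over the square-rooted probability rows would then compress these distributions into dense vectors.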

international conference on electrical computer and communication engineering | 2017

A support vector machine mixed with TF-IDF algorithm to categorize Bengali document

Md. Saiful Islam; Fazla Elahi Md Jubayer; Syed Ikhtiar Ahmed

Document categorization is a technique for determining the category of a document. This paper deals with the automatic classification of Bangla documents. In the proposed system, a support vector machine classifies a document into one of twelve predefined categories. The model uses TF-IDF (term frequency-inverse document frequency) weighting with length normalization for feature selection once the data set has been preprocessed. The results achieved by applying SVM to categorize Bangla documents are shown to be very promising compared with conventional methods in which features are chosen on a bag-of-words basis. The accuracy of the proposed methodology is 92.57% for twelve categories.
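A minimal sketch of TF-IDF weighting with length normalization follows. The abstract does not give the exact TF-IDF variant, so the plain `tf * log(N / df)` formula, the L2 normalization and the toy documents are assumptions; the SVM that would be trained on these vectors is not shown.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return one L2-normalised TF-IDF vector (as a dict) per tokenised document."""
    n = len(docs)
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))  # document frequency: count each term once per document
    vectors = []
    for tokens in docs:
        tf = Counter(tokens)
        weights = {t: tf[t] * math.log(n / df[t]) for t in tf}
        norm = math.sqrt(sum(w * w for w in weights.values()))
        if norm > 0:
            # Length normalisation: long and short documents become comparable.
            weights = {t: w / norm for t, w in weights.items()}
        vectors.append(weights)
    return vectors

# Hypothetical toy documents (not from the paper's data set).
docs = [
    ["cricket", "match", "run"],
    ["election", "vote", "match"],
    ["cricket", "run", "run"],
]
vecs = tfidf_vectors(docs)
```

In the second document, the terms that occur in no other document ("election", "vote") receive a higher weight than "match", which also occurs in the first document.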

computer and information technology | 2016

Supervised approach of sentimentality extraction from Bengali Facebook status

Md. Saiful Islam; Md. Ashiqul Islam; Md. Afjal Hossain; Jagoth Jyoti Dey

Sentiment is the only thing that separates humans from machines. To simulate feelings in machines, many researchers have tried to create methods and automate the process of extracting opinions about a particular news item, product or life entity. Sentiment Analysis (SA) combines the opinions, emotions and subjectivity of a text. SA is currently the most demanding task in Natural Language Processing. Social networking sites like Facebook are widely used to express opinions about a particular entity of life. Newspapers publish news about a particular event, and users express their feedback in the news comments. Online product feedback is increasing day by day, so review and opinion mining plays a very important role in understanding people's satisfaction. Such opinion mining has potential for knowledge discovery. The main target of SA is to find opinions in text, extract sentiments from them and define their polarity, i.e. positive or negative. In this domain, most models were designed for the English language. This paper describes a novel approach using a Naive Bayes classification model for the Bengali language. A supervised classification method is used together with language rules to detect the sentiment of Bengali Facebook statuses.
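As an illustration of the supervised Naive Bayes step, here is a small multinomial classifier with add-one (Laplace) smoothing. It is a sketch under assumptions: the class name, the smoothing choice and the toy training statuses are hypothetical, and the language rules the paper combines with the classifier are not reproduced.

```python
import math
from collections import Counter

class NaiveBayes:
    """Multinomial Naive Bayes over tokens with add-one smoothing."""

    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        self.doc_counts = Counter(labels)
        self.word_counts = {c: Counter() for c in self.classes}
        for tokens, label in zip(docs, labels):
            self.word_counts[label].update(tokens)
        self.vocab = {w for c in self.classes for w in self.word_counts[c]}
        self.total = len(labels)
        return self

    def predict(self, tokens):
        best, best_score = None, -math.inf
        for c in self.classes:
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(self.doc_counts[c] / self.total)
            denom = sum(self.word_counts[c].values()) + len(self.vocab)
            for w in tokens:
                if w in self.vocab:  # words never seen in training are ignored
                    score += math.log((self.word_counts[c][w] + 1) / denom)
            if score > best_score:
                best, best_score = c, score
        return best

# Hypothetical transliterated training statuses (not from the paper).
train_docs = [["khub", "bhalo"], ["onek", "sundor"], ["khub", "kharap"], ["baje", "golpo"]]
train_labels = ["positive", "positive", "negative", "negative"]
model = NaiveBayes().fit(train_docs, train_labels)
label = model.predict(["bhalo", "sundor"])
```

Working in log space avoids numeric underflow when many word probabilities are multiplied together.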

international conference on electrical computer and communication engineering | 2017

Sentiment analysis of Bengali comments with Word2Vec and sentiment information of words

Md. Al-Amin; Md. Saiful Islam; Shapan Das Uzzal

The vector representation of Bengali words using the word2vec model (Mikolov et al., 2013) plays an important role in Bengali sentiment classification. It is observed that words from the same context stay closer together in the word2vec vector space and are more similar to each other than to other words. In this article, a new approach to sentiment classification of Bengali comments using word2vec and word-level sentiment extraction is presented. Combining the word2vec word co-occurrence score with the sentiment polarity score of the words yields an accuracy of 75.5%.

international conference on electrical computer and communication engineering | 2017

A comprehensive study on sentiment of Bengali text

Md. Al-Amin; Md. Saiful Islam; Shapan Das Uzzal

Sentiment Analysis is one of the most important and challenging research topics in the field of natural language processing and opinion mining. In this article, six different approaches for determining the actual sentiment of a sentence are discussed and their performance is analyzed. In the parts-of-speech ratio method, the Parts of Speech (POS) of the queries are tagged, and the POS ratio and the Hamming distance between the positive classifier and the query and between the negative classifier and the query are computed. To detect the sentiment more accurately, cosine similarity using TF-IDF is applied: TF, DF and IDF are computed, and from them a positive vector, a negative vector and a query vector are calculated. In cosine similarity using custom TF-IDF, a custom POS tagger is used and TF, DF and IDF are computed. Another method, a Naïve Bayes model using unigrams and a stemmer, also gives good performance. In this approach, the prior and conditional probabilities are calculated and the root words are extracted. The Naïve Bayes model using bigrams, a stemmer and a normalizer performs better than the other models. The last method discussed is word embedding with Hellinger PCA, which uses a word co-occurrence matrix and Skip-Gram to determine the actual contexts of words, and Hellinger PCA to find the most similar words and generate a sliding window of the most probable context words around each word.
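The cosine-similarity method described above (a positive vector, a negative vector and a query vector built from TF, DF and IDF) might look roughly like the following sketch. How the class vectors are aggregated is not stated in the abstract, so summing per-comment TF-IDF vectors, the `log(N / df) + 1` IDF variant and the toy comments are all assumptions.

```python
import math
from collections import Counter

def idf_table(docs):
    """IDF per term; the '+ 1' keeps terms that occur in every document non-zero."""
    n = len(docs)
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))
    return {t: math.log(n / d) + 1.0 for t, d in df.items()}

def tfidf(tokens, idf):
    """TF-IDF vector of one token list; terms without an IDF entry are dropped."""
    tf = Counter(t for t in tokens if t in idf)
    return {t: c * idf[t] for t, c in tf.items()}

def class_vector(docs, idf):
    """Sum the TF-IDF vectors of all comments belonging to one class."""
    total = Counter()
    for tokens in docs:
        for t, w in tfidf(tokens, idf).items():
            total[t] += w
    return dict(total)

def cosine(u, v):
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def classify(query, pos_vec, neg_vec, idf):
    q = tfidf(query, idf)
    return "positive" if cosine(q, pos_vec) >= cosine(q, neg_vec) else "negative"

# Hypothetical transliterated comments (not from the paper).
positive = [["khub", "bhalo"], ["bhalo", "cinema"]]
negative = [["khub", "kharap"], ["kharap", "cinema"]]
idf = idf_table(positive + negative)
pos_vec = class_vector(positive, idf)
neg_vec = class_vector(negative, idf)
result = classify(["bhalo", "golpo"], pos_vec, neg_vec, idf)
```

The query is assigned to whichever class vector it points toward most closely; ties fall to "positive" in this sketch, which is an arbitrary choice.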

international conference on informatics electronics and vision | 2016

An efficient way for segmentation of Bangla characters in printed document using curved scanning

Ahnaf Farhan Rownak; Md. Fazle Rabby; Sabir Ismail; Md. Saiful Islam

The main cause of poor Optical Character Recognition (OCR) output for Bangla text is segmentation-related error. Differently shaped characters, connected characters, modifiers at the top and bottom, and overlapping regions between consecutive characters are the main obstacles to effective segmentation of printed Bangla text. This paper introduces an efficient strategy for segmenting characters whose regions overlap with other characters. The proposed strategy achieved 99.8% accuracy in line segmentation, 99.5% in word segmentation and 99% in character segmentation. The remaining errors are introduced when two consecutive characters have multiple touching points.

international conference on informatics electronics and vision | 2016

A noble approach for recognizing Bangla real number automatically using CMU Sphinx4

Md. Mahadi Hasan Nahid; Md. Ashraful Islam; Md. Saiful Islam

Speech recognition, the process of converting speech to text, is a widely researched topic around the world. Many scientists and researchers are working to improve the performance of speech recognition systems. Most languages in the world have their own speech recognizer, but our mother tongue, Bangla, has no working one. This work is a small attempt to build a Bengali speech recognizer to enrich our language. In this paper we propose a novel approach to developing an automatic Bangla real number recognizer and analyze the performance of this system using the most popular speech recognizer API, CMU Sphinx 4, and Avro, a popular Unicode-based Bangla writing software.

international conference on electrical engineering and information communication technology | 2016

Semi supervised keyword based Bengali document categorization

Fahim Quadery; Abdullah Al Maruf; Tamjid Ahmed; Md. Saiful Islam

Document categorization has been an important area of research over the last couple of decades. The basic task is to classify a given document into predefined classes. Bengali is among the ten most spoken languages in the world, with more than 200 million speakers, yet Bengali document categorization still lacks significant research effort. In the first phase of this paper, a model is designed that extracts keywords from a Bengali document. We crawled over 35,000 news documents from popular Bengali newspapers and journals. These documents were stemmed, and less significant words were removed using a stemmer and a Parts-of-Speech (POS) tagger. A statistical approach is used to extract keywords from the documents. Then a probabilistic distribution and semi-supervised learning with the Naïve Bayes algorithm are used to approximate the category of a given Bengali document. The results and statistical data show the effectiveness of this model.

computer and information technology | 2016

A support vector machine approach for real time vision based human robot interaction

Nishikanto Sarkar Simul; Nusrat Mubin Ara; Md. Saiful Islam

Today, humanoid robots are being exhibited that perform various tasks as a human's personal assistant. To be an assistant, a robot needs to interact with humans as a human would. For this reason, a robot needs to understand human gender, facial expression and facial gesture in real time. Ribo is a humanoid robot built in the RoboSUST lab that can communicate in Bangla with Bengali speakers. In this article the authors show how theoretical knowledge of real-time facial expression recognition, gender detection and yes/no detection from facial gesture is implemented in Ribo. Real-time facial expression and gender detection can be performed using a Support Vector Machine (SVM). A prepared dataset of facial landmarks labeled with five expressions (sad, angry, smile, surprise and normal) is given to the SVM to construct a classifier. To predict an expression, facial images are captured in real time and their facial landmark data are passed to the SVM. The Local Binary Pattern (LBP) algorithm is used to extract features from face images. These features, labeled as male or female, are used to build the gender classifier. The 'yes/no' face gesture is detected by tracking the movement of the face over a certain time. Together, these implementations form a framework that will be used in Ribo to recognize human facial expressions and facial gesture movements and to detect human gender.
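The Local Binary Pattern feature extraction mentioned in this abstract can be sketched in its basic 3x3 form. The abstract does not say which LBP variant is used, so the 8-neighbour layout, the clockwise bit order starting at the top-left neighbour, the `>=` comparison and the toy patch are assumptions.

```python
def lbp_code(img, r, c):
    """8-bit LBP code of pixel (r, c): one bit per neighbour that is >= the centre."""
    center = img[r][c]
    # Clockwise from the top-left neighbour; this ordering is a convention.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code

def lbp_histogram(img):
    """256-bin histogram of LBP codes over all interior pixels of a grey image."""
    hist = [0] * 256
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            hist[lbp_code(img, r, c)] += 1
    return hist

# Hypothetical 3x3 grey-level patch.
patch = [
    [5, 9, 1],
    [4, 6, 7],
    [2, 8, 3],
]
code = lbp_code(patch, 1, 1)  # neighbours 9, 7 and 8 are >= 6, so bits 1, 3, 5 are set: 42
hist = lbp_histogram(patch)
```

The histogram of codes over a face region, rather than the raw pixels, would be the feature vector handed to a classifier such as the SVM described above.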

arXiv: Computation and Language | 2017