Publication


Featured research published by Yap Bee Wah.


International Conference on Science and Social Research | 2010

Evaluating spatial and temporal effects of accidents likelihood using random effects panel count model

Wan Fairos Wan Yaacob; Mohamad Alias Lazim; Yap Bee Wah

Poisson and Negative Binomial models have been widely used to model road accident counts. However, several restrictions on the data have been highlighted in the use of such models, among them the assumption that the variance equals the mean, the assumption that no serial correlation exists, and the effect of unmeasured variables that may affect the dependent variable, the so-called unobserved heterogeneity. An appropriate solution to this issue is to treat the data as a panel of time series and cross-sectional data using a panel model approach. This study analyzes the number of road accident occurrences using time series cross-sectional data for 14 states in Malaysia. The random effects negative binomial (RENB) model and the cross-sectional negative binomial (NB) model are examined. The models were developed to identify the contributing factors that affect the number of road accidents in Malaysia. We examine various factors associated with road accident occurrence, including the registered vehicles in the state, the amount of rainfall, the number of rainy days, time trend and the monthly effect of seasonality. Various model specifications were estimated. The specification comparisons indicate a benefit from using the NB model when spatial and temporal effects are unobserved, while the RENB model appears superior in the present case of accident counts when incorporating temporal and cross-sectional variations, as it offers advantages in model flexibility.
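
The equal mean and variance restriction mentioned in this abstract can be illustrated with a quick check. The sketch below is an illustration only, not the authors' analysis: the monthly counts are made-up numbers and `dispersion_ratio` is a hypothetical helper name.

```python
from statistics import mean, variance


def dispersion_ratio(counts):
    """Return the sample variance divided by the sample mean.

    A Poisson model assumes this ratio is close to 1. Values well above 1
    suggest overdispersion, which a negative binomial model can accommodate.
    """
    m = mean(counts)
    if m == 0:
        raise ValueError("mean of counts is zero; ratio is undefined")
    return variance(counts) / m


# Hypothetical monthly accident counts for one state (illustrative only).
monthly_counts = [12, 30, 7, 45, 18, 60, 9, 25, 33, 14, 52, 21]

ratio = dispersion_ratio(monthly_counts)
print(f"variance / mean = {ratio:.2f}")
```

A ratio far above 1 for real data would be one informal sign that a plain Poisson model is too restrictive; it does not by itself address serial correlation or unobserved heterogeneity.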


International Conference on Statistics in Science, Business and Engineering | 2012

Comparison of predictive models to predict survival of cardiac surgery patients

Hezlin Aryani Abd Rahman; Yap Bee Wah; Zuraida Khairudin; Nik Nairan Abdullah

With recent innovations in computer database technology, voluminous data related to cardiac surgery are easily stored and made available for further analysis. However, these large quantities of data are often not fully utilized in modelling cardiac surgery outcomes. Most previous studies have focused on applying statistical techniques to small samples of data in order to reveal simple linear relationships between the factors and the survival of cardiac patients. Data mining offers a significant advantage over conventional statistical techniques, which often require the normality assumption. This study developed and compared new models to predict the survival of cardiac surgery patients. The dataset consists of 5154 observations with 23 variables, as suggested by domain experts from a renowned heart-surgery centre in Malaysia. After the data cleaning process, a total of 4976 cases and 12 variables were used for further analysis. Three predictive models, namely Logistic Regression, Decision Tree and Artificial Neural Network, were developed and compared using the classification accuracy rate, sensitivity and specificity. In the Logistic Regression using the ENTER selection method, the whole sample of 4976 cases had imbalanced classes, which led to biased results. Therefore, using the undersampling technique suggested by [14], a sample of 1209 cases (17% died and 83% alive) was drawn and further analysis was performed using this sample. Results showed that Artificial Neural Network is the best predictive model, with classification accuracy, sensitivity and specificity of 88.4%, 95.67% and 58.06% respectively.
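
The three comparison criteria named in this abstract can be computed from a confusion matrix. The following is a minimal sketch under stated assumptions: the counts are invented for illustration, `classification_metrics` is a hypothetical helper, and the abstract does not say which outcome was coded as the positive class.

```python
def classification_metrics(tp, fn, tn, fp):
    """Compute accuracy, sensitivity and specificity from confusion-matrix counts.

    tp: positives predicted as positive    fn: positives predicted as negative
    tn: negatives predicted as negative    fp: negatives predicted as positive
    """
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
    }


# Hypothetical counts, not taken from the study.
metrics = classification_metrics(tp=90, fn=10, tn=60, fp=40)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

With imbalanced classes, accuracy can stay high while one of the other two rates is poor, which is why all three are usually reported together.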


Journal of Clinical & Translational Endocrinology | 2017

Examining diabetes distress, medication adherence, diabetes self-care activities, diabetes-specific quality of life and health-related quality of life among type 2 diabetes mellitus patients

Zeinab Jannoo; Yap Bee Wah; Alias Mohd Lazim; Mohamed Azmi Hassali

Highlights
• A five-factor theoretical model is proposed.
• The SEM model evaluated relationships among three endogenous and two exogenous variables.
• Higher levels of medication adherence had a significant direct effect on diabetes distress.
• Self-care activities had a significant direct effect on diabetes distress and HRQoL.
• Diabetes-specific QoL had a significant effect on HRQoL.


Journal of Applied Accounting Research | 2014

Tax non-compliance among SMCs in Malaysia: tax audit evidence

Nor Azrina Mohd Yusof; Lai Ming Ling; Yap Bee Wah

Purpose: The pervasiveness of tax non-compliance remains a serious concern to most tax authorities around the world. The negative impact of tax non-compliance on the economy and the evolving nature of the Malaysian corporate tax system have motivated this study. The purpose of this paper is to examine the determinants of corporate tax non-compliance among small-and-medium-sized corporations (SMCs) in Malaysia.
Design/methodology/approach: This study used economic deterrence theory to analyze and test 375 tax-audited cases finalized by the Inland Revenue Board of Malaysia in 2011.
Findings: Multiple regression results revealed that marginal tax rate, company size and type of industry exerted significant effects on corporate tax non-compliance. The services and construction industries were noted to be the predominant industries engaged in tax non-compliance. The amount of concealed income unearthed during tax audits indicates clearly that there is widespread tax non-compliance in Malaysia and that the quantum of tax lost through non-compliance is quite high.
Research limitations/implications: This study only sampled SMCs audited in 2011; hence, care has been exercised in generalizing the findings.
Practical implications: This study affirms that marginal tax rate, company size and type of industry are the main factors influencing the compliance behavior of SMCs. The findings provide important insights not only to the Malaysian tax authority, but also to tax authorities and tax researchers in other parts of the world, given that tax non-compliance of SMCs is a prevalent and universal problem. For example, with regard to the finding that marginal tax rate and company size are linked to non-compliance, it can be surmised that tax authorities ought to divert resources to firms with such characteristics when conducting audits.
Originality/value: Most tax research examining corporate tax non-compliance has used financial data from annual reports to predict tax non-compliance, which is not very accurate. This study used actual tax audit cases obtained from the tax authority, which are reflective of the actual situation. This study complements the scant existing literature by empirically evaluating the factors that influence corporate tax non-compliance in a developing country like Malaysia.


International Conference on Statistics in Science, Business and Engineering | 2012

Comparison of conventional measures of skewness and kurtosis for small sample size

Sarah binti Yusoff; Yap Bee Wah

The normality assumption can be checked in three ways: graphical methods (histogram, normal Q-Q plot, and boxplot), descriptive statistics (values of skewness and kurtosis), or tests of normality (such as the Shapiro-Wilk test, Kolmogorov-Smirnov test, Lilliefors test, Jarque-Bera test or Anderson-Darling test). This paper focuses on the two descriptive statistics, skewness and kurtosis. A simulation study was carried out to compare the performance of three different types of conventional measures (TYPE 1, TYPE 2, and TYPE 3) of skewness and kurtosis for symmetric and asymmetric distributions. Monte Carlo simulation using the R programming language was used to generate data from symmetric and skewed distributions. For symmetric distributions, the performance of TYPE 1, 2 and 3 skewness is comparable. Meanwhile, the TYPE 2 kurtosis measure performs better for the symmetric normal distribution. For symmetric distributions with negative kurtosis, TYPE 1 kurtosis seems to perform better, while for asymmetric distributions, TYPE 2 skewness and kurtosis are better measures. However, none of the three measures performs well for leptokurtic distributions such as the t-distribution.
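
The abstract does not spell out the three types. The sketch below assumes they correspond to the three sample formulas described by Joanes and Gill (1998), which is also how the `type` argument of `skewness` and `kurtosis` in the R package e1071 is numbered; that mapping is an assumption, not something stated in the abstract. The paper used R, and Python is used here only for illustration.

```python
import math


def central_moment(xs, k):
    """k-th sample central moment with divisor n."""
    n = len(xs)
    m = sum(xs) / n
    return sum((x - m) ** k for x in xs) / n


def skewness(xs, kind=1):
    """Sample skewness of the assumed type 1, 2 or 3."""
    n = len(xs)
    g1 = central_moment(xs, 3) / central_moment(xs, 2) ** 1.5
    if kind == 1:
        return g1
    if kind == 2:  # needs n >= 3
        return g1 * math.sqrt(n * (n - 1)) / (n - 2)
    if kind == 3:
        return g1 * ((n - 1) / n) ** 1.5
    raise ValueError("kind must be 1, 2 or 3")


def kurtosis(xs, kind=1):
    """Sample excess kurtosis of the assumed type 1, 2 or 3."""
    n = len(xs)
    g2 = central_moment(xs, 4) / central_moment(xs, 2) ** 2 - 3
    if kind == 1:
        return g2
    if kind == 2:  # needs n >= 4
        return ((n + 1) * g2 + 6) * (n - 1) / ((n - 2) * (n - 3))
    if kind == 3:
        return (g2 + 3) * (1 - 1 / n) ** 2 - 3
    raise ValueError("kind must be 1, 2 or 3")


sample = [1, 2, 3, 4, 5]  # small symmetric sample, illustrative only
for kind in (1, 2, 3):
    print(kind, skewness(sample, kind), kurtosis(sample, kind))
```

The three types differ only by factors that depend on the sample size, so they converge as n grows; the differences matter mostly for the small samples the paper studies.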


THE 2ND ISM INTERNATIONAL STATISTICAL CONFERENCE 2014 (ISM-II): Empowering the Applications of Statistical and Mathematical Sciences | 2015

Assessing the effects of different types of covariates for binary logistic regression

Hamzah Abdul Hamid; Yap Bee Wah; Xian-Jin Xie; Hezlin Aryani Abd Rahman

It is well known that the type of data distribution in the independent variable(s) may affect many statistical procedures. This paper investigates and illustrates the effect of different types of covariates on the parameter estimation of a binary logistic regression model. A simulation study with different sample sizes and different types of covariates (uniform, normal, skewed) was carried out. Results showed that the parameter of a binary logistic regression model is severely overestimated when the sample size is less than 150 for covariates that have normal and uniform distributions, while the parameter is underestimated when the distribution of the covariate is skewed. Parameter estimation improves for all types of covariates when the sample size is large, that is, at least 500.


International Conference on Statistics in Science, Business and Engineering | 2012

Fatality prediction model for motorcycle accidents in Malaysia

Norashikin Nasaruddin; Wong Shaw Voon; Yap Bee Wah; Mohamad Alias Lazim

This paper builds a fatality prediction model for motorcycle accident data in Malaysia. The number of registered motorcycles in Malaysia has increased four-fold compared with 20 years ago. The motorcycle accident rate and the fatality rate among riders and pillion riders in Malaysia have also increased dramatically. However, results show that when the number of fatalities per 10,000 registered motorcycles is taken into account, the fatality rate shows a decreasing trend from 1996 onwards. The motorcycle accident data for the period 1996 to 2010 were analyzed using Smeed's Law and a regression method. The results show that the regression approach gives better estimates of the fatality rate than Smeed's equation.
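
As a rough illustration of the two quantities discussed in this abstract, the sketch below uses the classical form of Smeed's Law, D = 0.0003 (n p^2)^(1/3), where D is annual road deaths, n is registered vehicles and p is population. The abstract does not say which form or constants the authors used, so the constants here are an assumption, the input figures are made up, and the function names are hypothetical.

```python
def smeed_fatalities(vehicles, population):
    """Annual road deaths predicted by the classical form of Smeed's Law."""
    return 0.0003 * (vehicles * population ** 2) ** (1 / 3)


def fatalities_per_10k(deaths, vehicles):
    """Deaths per 10,000 registered vehicles."""
    return deaths / vehicles * 10_000


# Hypothetical inputs, not Malaysian data.
vehicles = 1_000_000
population = 1_000_000

predicted = smeed_fatalities(vehicles, population)
print(f"predicted deaths: {predicted:.1f}")
print(f"per 10,000 vehicles: {fatalities_per_10k(predicted, vehicles):.2f}")
```

Under this form, predicted deaths per vehicle fall as vehicles per person rise, which is consistent in direction with the decreasing per-10,000 rate the abstract reports.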


Fuzzy Systems and Knowledge Discovery | 2011

Predicting car purchase intent using data mining approach

Yap Bee Wah; Nor Huwaina Ismail; Simon Fong

Data mining involves the exploration and analysis of large databases to find patterns and valuable information that can aid decision making. This paper illustrates the use of a data mining approach to build predictive models of customers' intent to purchase a car after booking it. Records show that customers who have booked a car tend to cancel their booking. Three data mining predictive models, Logistic Regression (LR), Decision Tree (DT) and Neural Network (NN), were used to model the intent of purchase (IOP). The sample for this study has 1935 cases. The data were partitioned into training (70%) and validation (30%) samples. The performance of the three predictive models was compared on validation accuracy rate, sensitivity and specificity. Results show that the validation accuracy rates of all three models are quite similar (LR=91.79%, CART=91.17%, NN=91.17%), while LR has the highest sensitivity (LR=87.77%, CART=85.47%, NN=85.89%). Important customer characteristics were also revealed by these models.
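
The 70/30 partition described in this abstract can be sketched as follows. This is an assumption-laden illustration: the abstract does not say whether the split was simple random or stratified, so a simple random split is shown, and `partition` is a hypothetical helper operating on placeholder record IDs.

```python
import random


def partition(records, train_fraction=0.7, seed=42):
    """Shuffle records and split them into training and validation lists."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Placeholder record IDs standing in for the 1935 cases.
train, valid = partition(range(1935))
print(len(train), len(valid))
```

Models are then fitted on `train` only, and accuracy, sensitivity and specificity are computed on `valid`, so the reported figures reflect performance on cases the models have not seen.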


Communications in Statistics - Simulation and Computation | 2018

Investigating the Power of Goodness-of-fit Tests for Multinomial Logistic Regression

Hamzah Abdul Hamid; Yap Bee Wah; Xian Jin Xie; Ong Seng Huat

Goodness-of-fit tests are important for assessing whether a model fits the data. In this paper we investigate the Type I error and power of two goodness-of-fit tests for multinomial logistic regression via a simulation study. The GoF test using a partitioning strategy (clustering) in the covariate space was compared with another test, Cg, which is based on grouping of predicted probabilities. The power of both tests was investigated when a quadratic term or an interaction term was omitted from the model. The proposed test shows good Type I error and ample power except for models with a highly skewed covariate distribution. The proposed test also has good power in detecting omission of a continuous interaction term. An application to a real dataset was performed to illustrate the use of the goodness-of-fit test for multinomial logistic regression in practice using R.


ADVANCES IN INDUSTRIAL AND APPLIED MATHEMATICS: Proceedings of 23rd Malaysian National Symposium of Mathematical Sciences (SKSM23) | 2016

Handling imbalanced dataset using SVM and k-NN approach

Yap Bee Wah; Hezlin Aryani Abd Rahman; Haibo He; Awang Bulgiba

Data mining classification methods are affected when the data is imbalanced, that is, when one class is larger than the other class in size for the case of a two-class dependent variable. Many new methods have been developed to handle imbalanced datasets. In handling a binary classification task, Support Vector Machine (SVM) is one of the methods reported to give a high accuracy in predictive modeling compared to the other techniques such as Logistic Regression and Discriminant Analysis. The strength of SVM is the robustness of its algorithm and the capability to integrate with kernel-based learning that results in a more flexible analysis and optimized solution. Another popular method to handle imbalanced data is the random sampling method, such as random undersampling, random oversampling and synthetic sampling. The application of the Nearest Neighbours techniques in sampling approach has been seen as having a bigger advantage compared to other methods, as it can handle both structured and non-structured data. There are some studies that implement an ensemble method of both SVM and Nearest Neighbours with good results. This paper discusses the various methods in handling imbalanced data and an illustration of using SVM and k-Nearest Neighbours (k-NN) on a real-data set.
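
Random undersampling, one of the sampling methods named in this abstract, can be sketched in a few lines. This is a generic illustration rather than the procedure used in the paper: `random_undersample` is a hypothetical helper, the data are synthetic, and it balances classes to equal size, which is only one possible target ratio.

```python
import random


def random_undersample(rows, label_of, seed=0):
    """Randomly drop rows from larger classes until all classes match the smallest."""
    by_class = {}
    for row in rows:
        by_class.setdefault(label_of(row), []).append(row)

    smallest = min(len(group) for group in by_class.values())
    rng = random.Random(seed)  # seeded for reproducibility

    balanced = []
    for group in by_class.values():
        balanced.extend(rng.sample(group, smallest))
    rng.shuffle(balanced)
    return balanced


# Synthetic two-class data: 10 minority rows (label 1), 90 majority rows (label 0).
rows = [(i, 1) for i in range(10)] + [(i, 0) for i in range(90)]
balanced = random_undersample(rows, label_of=lambda row: row[1])
print(len(balanced))
```

Undersampling discards majority-class information, which is the usual motivation for the oversampling and synthetic sampling alternatives the abstract also mentions.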

Collaboration


Dive into Yap Bee Wah's collaboration.

Top Co-Authors

Ruhaya Atan

Universiti Teknologi MARA

Saunah Zainon

Universiti Teknologi MARA