Publication


Featured research published by Zhanyu Ma.


Pattern Recognition | 2014

Bayesian estimation of Dirichlet mixture model with variational inference

Zhanyu Ma; Pravin Kumar Rana; Jalil Taghia; Markus Flierl; Arne Leijon

In statistical modeling, parameter estimation is an essential and challenging task. Estimation of the parameters in the Dirichlet mixture model (DMM) is analytically intractable, due to the integral expressions of the gamma function and its corresponding derivatives. We introduce a Bayesian estimation strategy to estimate the posterior distribution of the parameters in the DMM. By assuming the gamma distribution as the prior for each parameter, we approximate both the prior and the posterior distribution of the parameters with a product of several mutually independent gamma distributions. The extended factorized approximation method is applied to introduce a single lower bound to the variational objective function, and an analytically tractable estimation solution is derived. Moreover, only one function is maximized during the iterations, and therefore the convergence of the proposed algorithm is theoretically guaranteed. With synthesized data, the proposed method shows advantages over the EM-based method and a previously proposed Bayesian estimation method. With two important multimedia signal processing applications, the good performance of the proposed Bayesian estimation method is demonstrated.
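A minimal sketch of the model this abstract describes, not the paper's variational algorithm: it draws synthetic data from a two-component Dirichlet mixture and computes the component responsibilities, the quantity any DMM estimator (EM-based or variational) must evaluate. The component parameters, weights, and sample counts are illustrative assumptions.

```python
import numpy as np
from scipy.special import gammaln

def dirichlet_logpdf(X, alpha):
    """Row-wise log-density of Dirichlet(alpha); rows of X lie on the simplex."""
    return (gammaln(alpha.sum()) - gammaln(alpha).sum()
            + ((alpha - 1.0) * np.log(X)).sum(axis=1))

rng = np.random.default_rng(0)
alphas = [np.array([8.0, 2.0, 2.0]), np.array([2.0, 2.0, 8.0])]   # assumed components
weights, counts = np.array([0.6, 0.4]), [600, 400]

X = np.vstack([rng.dirichlet(a, size=c) for a, c in zip(alphas, counts)])

# Responsibilities r_nk proportional to w_k * Dir(x_n | alpha_k); the VI scheme
# replaces the point estimates alpha_k with expectations under gamma posteriors.
log_r = np.column_stack([np.log(w) + dirichlet_logpdf(X, a) for w, a in zip(weights, alphas)])
r = np.exp(log_r - log_r.max(axis=1, keepdims=True))
r /= r.sum(axis=1, keepdims=True)
print("mean responsibility for component 0:", round(float(r[:, 0].mean()), 3))
```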


IEEE Transactions on Pattern Analysis and Machine Intelligence | 2015

Variational Bayesian Matrix Factorization for Bounded Support Data

Zhanyu Ma; Andrew E. Teschendorff; Arne Leijon; Yuanyuan Qiao; Honggang Zhang; Jun Guo

A novel Bayesian matrix factorization method for bounded support data is presented. Each entry in the observation matrix is assumed to be beta distributed. As the beta distribution has two parameters, two parameter matrices can be obtained, both of which contain only nonnegative values. In order to provide a low-rank matrix factorization, the nonnegative matrix factorization (NMF) technique is applied. Furthermore, each entry in the factorized matrices, i.e., the basis and excitation matrices, is assigned a gamma prior. Therefore, we name this method the beta-gamma NMF (BG-NMF). Due to the integral expression of the gamma function, estimation of the posterior distribution in the BG-NMF model does not have an analytically tractable solution. With the variational inference framework and the relative convexity property of the log-inverse-beta function, we propose a new lower bound to approximate the objective function. With this new lower bound, we derive an analytically tractable solution to approximately calculate the posterior distributions. Each of the approximated posterior distributions is also gamma distributed, which retains the conjugacy of the Bayesian estimation. In addition, a sparse BG-NMF can be obtained by including a sparseness constraint in the gamma prior. Evaluations with synthetic data and real-life data demonstrate the good performance of the proposed method.
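A minimal sketch of the BG-NMF generative model described above, not the variational estimator itself: each observation entry is beta distributed with shape parameters given by low-rank nonnegative factor products under gamma priors. Matrix sizes, rank, and the gamma hyperparameters are illustrative assumptions.

```python
import numpy as np
from scipy.stats import beta

rng = np.random.default_rng(1)
n, m, k = 50, 40, 3                       # observation size and assumed rank

# Gamma priors on the nonnegative basis/excitation factors (shape 2, scale 1 assumed).
U_a, V_a = rng.gamma(2.0, size=(n, k)), rng.gamma(2.0, size=(k, m))
U_b, V_b = rng.gamma(2.0, size=(n, k)), rng.gamma(2.0, size=(k, m))
A, B = U_a @ V_a, U_b @ V_b               # low-rank, entrywise nonnegative shape parameters

X = rng.beta(A, B)                        # bounded-support observations in (0, 1)
loglik = beta.logpdf(X, A, B).sum()       # the likelihood the VI lower bound approximates
print("data shape:", X.shape, "log-likelihood:", round(float(loglik), 1))
```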


IEEE Transactions on Neural Networks | 2018

Decorrelation of Neutral Vector Variables: Theory and Applications

Zhanyu Ma; Jing-Hao Xue; Arne Leijon; Zheng-Hua Tan; Zhen Yang; Jun Guo

In this paper, we propose novel strategies for decorrelating neutral vector variables. Two fundamental invertible transformations, namely the serial nonlinear transformation and the parallel nonlinear transformation, are proposed to carry out the decorrelation. For a neutral vector variable, which is not multivariate-Gaussian distributed, conventional principal component analysis cannot yield mutually independent scalar variables. With the two proposed transformations, a highly negatively correlated neutral vector can be transformed into a set of mutually independent scalar variables with the same degrees of freedom. We also evaluate the decorrelation performance for vectors generated from a single Dirichlet distribution and from a mixture of Dirichlet distributions. The mutual independence is verified with the distance correlation measure. The advantages of the proposed decorrelation strategies are studied extensively and demonstrated with synthesized data and practical application evaluations.
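A sketch of a serial, stick-breaking-style decorrelation of a Dirichlet (neutral) vector, in the spirit of the serial nonlinear transformation described above; the exact serial and parallel constructions are given in the paper. The distance-correlation check mirrors the independence measure the abstract mentions; the Dirichlet parameters and sample size are assumptions.

```python
import numpy as np

def serial_transform(X):
    """u_k = x_k / (1 - x_1 - ... - x_{k-1}); for Dirichlet rows the u_k are independent Betas."""
    remainder = 1.0 - np.cumsum(X[:, :-1], axis=1)
    return np.column_stack([X[:, [0]], X[:, 1:-1] / remainder[:, :-1]])

def distance_correlation(x, y):
    """Sample distance correlation between two scalar variables."""
    a = np.abs(x[:, None] - x[None, :]); b = np.abs(y[:, None] - y[None, :])
    A = a - a.mean(0) - a.mean(1)[:, None] + a.mean()
    B = b - b.mean(0) - b.mean(1)[:, None] + b.mean()
    return np.sqrt((A * B).mean() / np.sqrt((A * A).mean() * (B * B).mean()))

rng = np.random.default_rng(2)
X = rng.dirichlet([3.0, 4.0, 5.0, 2.0], size=800)   # negatively correlated neutral vectors
U = serial_transform(X)
print("dCor(x1, x2):", round(float(distance_correlation(X[:, 0], X[:, 1])), 3))
print("dCor(u1, u2):", round(float(distance_correlation(U[:, 0], U[:, 1])), 3))
```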


IEEE Transactions on Pattern Analysis and Machine Intelligence | 2014

Bayesian Estimation of the von-Mises Fisher Mixture Model with Variational Inference

Jalil Taghia; Zhanyu Ma; Arne Leijon

This paper addresses the Bayesian estimation of the von Mises-Fisher (vMF) mixture model with variational inference (VI). The learning task in VI consists of optimizing the variational posterior distribution. However, the exact VI solution is not analytically tractable due to intractable moments involving functional forms of the Bessel function in their arguments. To derive a closed-form solution, we further lower bound the evidence lower bound, where the bound is tight at one point in the parameter distribution. With the value of the bound guaranteed to increase during maximization, we derive an analytically tractable approximation to the posterior distribution which has the same functional form as the assigned prior distribution. The proposed algorithm requires no iterative numerical calculation in the re-estimation procedure, and it can potentially determine the model complexity and avoid the over-fitting problem associated with conventional approaches based on expectation maximization. Moreover, we derive an analytically tractable approximation to the predictive density of the Bayesian mixture model of vMF distributions. The performance of the proposed approach is verified by experiments with both synthetic and real data.
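A minimal sketch, not the paper's VI scheme: evaluating the log-density of a von Mises-Fisher mixture on the unit sphere. The modified Bessel function that makes exact VI intractable enters through the normalizing constant. The mixture parameters and the test points below are illustrative assumptions.

```python
import numpy as np
from scipy.special import ive   # exponentially scaled Bessel I_v, for numerical stability

def vmf_logpdf(X, mu, kappa):
    """Rows of X and mu are unit vectors in R^p; kappa >= 0 is the concentration."""
    p = X.shape[1]
    v = p / 2.0 - 1.0
    log_norm = v * np.log(kappa) - (p / 2.0) * np.log(2 * np.pi) - (np.log(ive(v, kappa)) + kappa)
    return log_norm + kappa * (X @ mu)

rng = np.random.default_rng(3)
X = rng.normal(size=(500, 3))
X /= np.linalg.norm(X, axis=1, keepdims=True)            # points on the unit sphere

mus = np.array([[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])       # assumed mean directions
kappas, weights = np.array([10.0, 4.0]), np.array([0.5, 0.5])

log_components = np.stack([np.log(w) + vmf_logpdf(X, m, k)
                           for w, m, k in zip(weights, mus, kappas)], axis=1)
loglik = np.logaddexp.reduce(log_components, axis=1).sum()
print("mixture log-likelihood:", round(float(loglik), 1))
```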


Neurocomputing | 2016

Feature selection for neutral vector in EEG signal classification

Zhanyu Ma; Zheng-Hua Tan; Jun Guo

In the design of brain-computer interface systems, classification of electroencephalogram (EEG) signals is an essential and challenging task. Recently, as marginalized discrete wavelet transform (mDWT) representations can reveal features related to the transient nature of EEG signals, mDWT coefficients have been frequently used in EEG signal classification. In our previous work, we proposed a super-Dirichlet distribution-based classifier, which performed better than the state-of-the-art support vector machine-based classifier. In this paper, we further study the neutrality of the mDWT coefficients. The mDWT coefficient vector has unit L1-norm and all its elements are nonnegative. Assuming the mDWT coefficient vector to be a neutral vector, we apply the parallel nonlinear transformation (PNT) framework to transform it nonlinearly into a set of independent scalar coefficients. Based on these scalar coefficients, a feature selection strategy is proposed in the transformed feature domain. Experimental results show that the feature selection strategy helps improve the classification accuracy.
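A sketch of mDWT-style features consistent with the description above, under the assumption that "marginalized" means summing absolute wavelet coefficients per subband and normalizing to unit L1 norm, which yields a nonnegative vector on the simplex (a candidate neutral vector). The wavelet, decomposition level, and the random stand-in for an EEG channel are assumptions.

```python
import numpy as np
import pywt   # PyWavelets

def mdwt_features(signal, wavelet="db4", level=4):
    """Per-subband absolute-coefficient mass of the DWT, normalized to unit L1 norm."""
    coeffs = pywt.wavedec(signal, wavelet, level=level)
    marginals = np.array([np.abs(c).sum() for c in coeffs])
    return marginals / marginals.sum()      # nonnegative entries summing to 1

rng = np.random.default_rng(4)
eeg_like = rng.normal(size=1024)            # stand-in for one EEG channel
x = mdwt_features(eeg_like)
print("feature vector:", np.round(x, 3), "sum:", round(float(x.sum()), 3))
```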


Knowledge Based Systems | 2015

Construction of semantic bootstrapping models for relation extraction

Chunyun Zhang; Weiran Xu; Zhanyu Ma; Sheng Gao; Qun Li; Jun Guo

Highlights: a general formalization of existing bootstrapping frameworks; a formalization of the new semantic bootstrapping model; a unique SSDP to guide the learning iterations of bootstrapping; a novel bottom-up kernel method for comparing patterns; application of the new model to the KBP-ESF task.

Traditionally, pattern-based relation extraction methods are based on an iterative bootstrapping model, which generally suffers from semantic drift or low recall. In this paper, we present a novel semantic bootstrapping framework that uses the semantic information of patterns and a flexible matching method to address these problems. We introduce a formalization for this class of bootstrapping models, which allows semantic constraints to guide the learning iterations and uses a flexible bottom-up kernel to compare patterns. To gain insight into the reliability and applicability of our framework, we applied it to the English Slot Filling (ESF) task of Knowledge Base Population (KBP) at the Text Analysis Conference (TAC). Experimental results show that our framework obtains performance superior to the state of the art.
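A schematic sketch of the generic pattern-based bootstrapping loop that the framework above formalizes and constrains; the semantic constraints (SSDP) and the bottom-up kernel match are not reproduced here. The toy corpus, relation triples, and seed pairs are assumptions.

```python
# Toy corpus of (subject, connecting phrase, object) triples and seed pairs
# for an assumed employer_of relation.
corpus = [
    ("Alice", "works for", "Acme"),
    ("Bob", "is employed by", "Globex"),
    ("Carol", "works for", "Initech"),
    ("Dave", "was born in", "Oslo"),
]
seeds = {("Alice", "Acme")}

patterns, extracted = set(), set(seeds)
for _ in range(2):                               # a couple of bootstrapping iterations
    # 1) induce patterns from sentences whose entity pair is already known
    patterns |= {rel for subj, rel, obj in corpus if (subj, obj) in extracted}
    # 2) apply the induced patterns to harvest new entity pairs
    extracted |= {(subj, obj) for subj, rel, obj in corpus if rel in patterns}

print("patterns:", patterns)
print("extracted pairs:", extracted)
```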


Neurocomputing | 2018

Cross-modal subspace learning for fine-grained sketch-based image retrieval

Peng Xu; Qiyue Yin; Yongye Huang; Yi-Zhe Song; Zhanyu Ma; Liang Wang; Tao Xiang; W. Bastiaan Kleijn; Jun Guo

Sketch-based image retrieval (SBIR) is challenging due to the inherent domain gap between sketch and photo. Compared with the pixel-perfect depictions of photos, sketches are highly abstract, iconic renderings of the real world. Therefore, matching sketches and photos directly using low-level visual cues is insufficient, since a common low-level subspace that traverses semantically across the two modalities is non-trivial to establish. Most existing SBIR studies do not directly tackle this cross-modal problem. This naturally motivates us to explore the effectiveness of cross-modal retrieval methods in SBIR, which have been applied successfully to image-text matching. In this paper, we introduce and compare a series of state-of-the-art cross-modal subspace learning methods and benchmark them on two recently released fine-grained SBIR datasets. Through a thorough examination of the experimental results, we demonstrate that subspace learning can effectively model the sketch-photo domain gap. In addition, we draw a few key insights to drive future research.
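A sketch of one representative cross-modal subspace learning baseline (canonical correlation analysis) of the kind benchmarked above, with random stand-ins for sketch and photo features; the actual study compares several such methods on fine-grained SBIR datasets, and the feature dimensions below are assumptions.

```python
import numpy as np
from sklearn.cross_decomposition import CCA

rng = np.random.default_rng(5)
n, d_sketch, d_photo = 200, 64, 128
latent = rng.normal(size=(n, 16))                        # assumed shared structure
sketch_feats = latent @ rng.normal(size=(16, d_sketch)) + 0.1 * rng.normal(size=(n, d_sketch))
photo_feats = latent @ rng.normal(size=(16, d_photo)) + 0.1 * rng.normal(size=(n, d_photo))

# Learn a shared subspace from paired sketch/photo features.
cca = CCA(n_components=8).fit(sketch_feats, photo_feats)
s_proj, p_proj = cca.transform(sketch_feats, photo_feats)

# Retrieval in the shared subspace: rank photos by cosine similarity to each sketch.
s = s_proj / np.linalg.norm(s_proj, axis=1, keepdims=True)
p = p_proj / np.linalg.norm(p_proj, axis=1, keepdims=True)
ranks = np.argsort(-(s @ p.T), axis=1)
top1_acc = (ranks[:, 0] == np.arange(n)).mean()
print("top-1 retrieval accuracy in the CCA subspace:", round(float(top1_acc), 3))
```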


2016 First International Workshop on Sensing, Processing and Learning for Intelligent Machines (SPLINE) | 2016

Effect of multi-condition training and speech enhancement methods on spoofing detection

Hong Yu; Achintya Kumar Sarkar; Dennis Alexander Lehmann Thomsen; Zheng-Hua Tan; Zhanyu Ma; Jun Guo

Many researchers have demonstrated the good performance of spoofing detection systems under clean training and testing conditions. However, it is well known that the performance of speaker and speech recognition systems degrades significantly in noisy conditions. Therefore, it is of great interest to investigate the effect of noise on the performance of spoofing detection systems. In this paper, we investigate a multi-condition training method in which spoofing detection models are trained with a mix of clean and noisy data. In addition, we study the effect of different noise types as well as speech enhancement methods on a state-of-the-art spoofing detection system based on the dynamic linear frequency cepstral coefficients (LFCC) feature and a Gaussian mixture model maximum-likelihood (GMM-ML) classifier. In the experiments, we consider three additive noise types, canteen, babble, and white Gaussian noise, at different signal-to-noise ratios, and two mainstream speech enhancement methods, Wiener filtering and minimum mean-square error estimation. The experimental results show that the enhancement methods are not suitable for the spoofing detection task, as the spoofing detection accuracy is reduced after speech enhancement. Multi-condition training, however, shows potential for reducing error rates in spoofing detection.
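A sketch of the GMM maximum-likelihood (GMM-ML) decision rule used by the spoofing detection back end described above: one GMM per class, scored by the average per-frame log-likelihood ratio. Random vectors stand in for LFCC features; feature extraction and the multi-condition training setup are not reproduced here, and the component count is an assumption.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(6)
genuine_train = rng.normal(loc=0.0, size=(1000, 20))     # stand-in LFCC frames
spoofed_train = rng.normal(loc=0.7, size=(1000, 20))

gmm_genuine = GaussianMixture(n_components=8, covariance_type="diag", random_state=0).fit(genuine_train)
gmm_spoofed = GaussianMixture(n_components=8, covariance_type="diag", random_state=0).fit(spoofed_train)

def llr_score(utterance_frames):
    """Average per-frame log-likelihood ratio; positive values favour the genuine model."""
    return (gmm_genuine.score_samples(utterance_frames)
            - gmm_spoofed.score_samples(utterance_frames)).mean()

test_utt = rng.normal(loc=0.0, size=(120, 20))            # a "genuine" test utterance
print("LLR score:", round(float(llr_score(test_utt)), 3))
```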


Neurocomputing | 2015

Multi-label learning with prior knowledge for facial expression analysis

Kaili Zhao; Honggang Zhang; Zhanyu Ma; Yi-Zhe Song; Jun Guo

Facial expression is one of the most expressive ways to display human emotions, and facial expression analysis (FEA) has been broadly studied in the past decades. In daily life, few facial expressions are exactly one of the predefined affective states; most are blends of several basic expressions. Even though the concept of 'blended emotions' was proposed years ago, most researchers have not yet treated FEA as a multiple-output problem. In this paper, a multi-label learning algorithm for FEA is proposed to solve this problem. Firstly, to depict facial expressions more effectively, we model FEA as a multi-label problem, which describes all facial expressions with multiple continuous values and labels of predefined affective states. Secondly, in order to model FEA jointly with multiple outputs, a multi-label Group Lasso regularized maximum margin classifier (GLMM) and a Group Lasso regularized regression (GLR) algorithm are proposed, which can analyze all facial expressions at one time instead of modeling FEA as a binary learning problem. Thirdly, to improve the effectiveness of the proposed model on video sequences, GLR is further extended to a Total Variation and Group Lasso based regression model (GLTV), which adds a prior term (the Total Variation term) to the original model. The JAFFE dataset and the extended Cohn-Kanade (CK+) dataset are used to verify the superior performance of our approaches with commonly used criteria in the multi-label classification and regression realms.
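A sketch of the group-lasso-regularized regression idea underlying the GLR and GLTV models above, via proximal gradient descent with block soft-thresholding; the paper's max-margin (GLMM) and total-variation (GLTV) variants add further terms on top of this. The grouping, step size, and penalty weight are illustrative assumptions.

```python
import numpy as np

def group_soft_threshold(w, groups, thresh):
    """Proximal operator of thresh * sum_g ||w_g||_2 (block soft-thresholding)."""
    out = w.copy()
    for g in groups:
        norm = np.linalg.norm(w[g])
        out[g] = 0.0 if norm <= thresh else (1.0 - thresh / norm) * w[g]
    return out

rng = np.random.default_rng(7)
X, true_w = rng.normal(size=(300, 12)), np.zeros(12)
true_w[0:4] = rng.normal(size=4)                           # only the first group is active
y = X @ true_w + 0.1 * rng.normal(size=300)

groups = [slice(0, 4), slice(4, 8), slice(8, 12)]
w, step, lam = np.zeros(12), 1.0 / np.linalg.norm(X, 2) ** 2, 5.0
for _ in range(500):                                       # proximal gradient iterations
    grad = X.T @ (X @ w - y)
    w = group_soft_threshold(w - step * grad, groups, step * lam)

print("group norms:", [round(float(np.linalg.norm(w[g])), 3) for g in groups])
```

With the group penalty, the inactive groups are driven exactly to zero, which is the property the multi-label FEA models exploit for joint selection across outputs.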


Signal Processing | 2014

Fast communication: Dirichlet mixture modeling to estimate an empirical lower bound for LSF quantization

Zhanyu Ma; Saikat Chatterjee; W. Bastiaan Kleijn; Jun Guo

The line spectral frequencies (LSFs) are commonly used for the linear predictive/autoregressive model in speech and audio coding. Recently, probability density function (PDF)-optimized vector quantization (VQ) has been studied intensively for quantization of LSF parameters. In this paper, we study the VQ performance bound of the LSF parameters. The LSF parameters are transformed to the ΔLSF domain and the underlying distribution of the ΔLSF parameters is modeled by a Dirichlet mixture model (DMM) with a finite number of mixture components. The quantization distortion, in terms of the mean squared error (MSE), is calculated with high rate theory. For LSF quantization, the mapping relation between the perceptually motivated log spectral distortion (LSD) and the MSE is empirically approximated by a polynomial. With this mapping function, the minimum required bit rate (an empirical lower bound) for transparent coding of the LSF under DMM modeling is derived.
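A sketch of the ΔLSF construction described above: consecutive differences of the ordered LSF vector, normalized by pi, lie on the probability simplex and can therefore be modeled by a Dirichlet mixture. The LSF vector and the LSD-vs-MSE points at the end are made-up placeholders that only illustrate the polynomial mapping step.

```python
import numpy as np

def delta_lsf(lsf):
    """lsf: increasing values in (0, pi). Returns the normalized difference vector."""
    padded = np.concatenate(([0.0], np.sort(lsf), [np.pi]))
    return np.diff(padded) / np.pi           # nonnegative entries summing to 1

lsf = np.array([0.25, 0.60, 1.10, 1.65, 2.30, 2.80])      # illustrative LSF vector
d = delta_lsf(lsf)
print("delta-LSF:", np.round(d, 3), "sum:", round(float(d.sum()), 3))

# Empirical LSD-vs-MSE pairs would come from quantization experiments; these
# placeholder numbers only demonstrate the polynomial-fitting step.
mse = np.array([1e-4, 5e-4, 1e-3, 5e-3, 1e-2])
lsd = np.array([0.4, 0.8, 1.1, 2.1, 2.9])
coeffs = np.polyfit(np.log10(mse), lsd, deg=2)
print("polynomial coefficients (LSD as a function of log10 MSE):", np.round(coeffs, 3))
```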

Collaboration


Dive into Zhanyu Ma's collaborations.

Top Co-Authors

Jun Guo, Beijing University of Posts and Telecommunications
Hong Yu, Beijing University of Posts and Telecommunications
Yi-Zhe Song, Queen Mary University of London
Arne Leijon, Royal Institute of Technology
Jalil Taghia, Royal Institute of Technology
Jiyang Xie, Beijing University of Posts and Telecommunications
Xiaoxu Li, Beijing University of Posts and Telecommunications
Chunyun Zhang, Shandong University of Finance and Economics
Liang Wang, Chinese Academy of Sciences