Publication


Featured research published by Mingzhou Song.


Intelligent Computing: Theory and Applications III | 2005

Highly efficient incremental estimation of Gaussian mixture models for online data stream clustering

Mingzhou Song; Hongbin Wang

We present a probability-density-based data stream clustering approach which requires only the newly arrived data, not the entire historical data, to be saved in memory. This approach incrementally updates the density estimate using only the newly arrived data and the previously estimated density. The idea is rooted in a theorem on estimator updating, and it works naturally with Gaussian mixture models. We implement it through the expectation maximization algorithm and a cluster merging strategy based on multivariate statistical tests for equality of covariance and mean. Our approach is highly efficient in clustering voluminous online data streams when compared to the standard EM algorithm. We demonstrate the performance of our algorithm on clustering a simulated Gaussian mixture data stream and clustering real noisy spike signals extracted from neuronal recordings.
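
The abstract above does not state the updating theorem itself, so the following is only a minimal sketch of the general idea of keeping summaries instead of historical data. It assumes one-dimensional data and a single Gaussian component, and it uses the standard pooled-moment formula rather than the authors' method; the function names `summarize` and `merge_summaries` are hypothetical.

```python
from statistics import fmean, pvariance


def summarize(xs):
    """Return (count, mean, population variance) for one batch of numbers."""
    return len(xs), fmean(xs), pvariance(xs)


def merge_summaries(a, b):
    """Combine two (count, mean, variance) summaries without the raw data.

    Assumption: this is the textbook pooled-moment update, shown for
    illustration only; it is not the estimator-updating theorem of the paper.
    """
    n1, m1, v1 = a
    n2, m2, v2 = b
    n = n1 + n2
    mean = (n1 * m1 + n2 * m2) / n
    # Each batch contributes its own spread plus the shift of its mean.
    var = (n1 * (v1 + (m1 - mean) ** 2) + n2 * (v2 + (m2 - mean) ** 2)) / n
    return n, mean, var


old_batch = [1.0, 2.0, 4.0, 7.0]   # already summarized, could be discarded
new_batch = [3.0, 5.0, 6.0]        # newly arrived data

merged = merge_summaries(summarize(old_batch), summarize(new_batch))
direct = summarize(old_batch + new_batch)  # reference: recompute from all data
```

The merged summary agrees with the summary recomputed from all points, which is why only the previous estimate and the new batch need to be held in memory in a scheme of this kind.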


BMC Genomics | 2013

A comprehensive meta QTL analysis for fiber quality, yield, yield related and morphological traits, drought tolerance, and disease resistance in tetraploid cotton

Joseph I. Said; Zhongxu Lin; Xianlong Zhang; Mingzhou Song; Jinfa Zhang

Background: The study of quantitative trait loci (QTL) in cotton (Gossypium spp.) is focused on traits of agricultural significance. Previous studies have identified a plethora of QTL attributed to fiber quality, disease and pest resistance, branch number, seed quality and yield and yield related traits, drought tolerance, and morphological traits. However, results among these studies differed due to the use of different genetic populations, markers and marker densities, and testing environments. Since two previous meta-QTL analyses were performed on fiber traits, a number of papers on QTL mapping of fiber quality, yield traits, morphological traits, and disease resistance have been published. To obtain a better insight into the genome-wide distribution of QTL and to identify consistent QTL for marker assisted breeding in cotton, an updated comparative QTL analysis is needed. Results: In this study, a total of 1,223 QTL from 42 different QTL studies in Gossypium were surveyed and mapped using Biomercator V3 based on the Gossypium consensus map from the Cotton Marker Database. A meta-analysis was first performed using manual inference and confirmed by Biomercator V3 to identify possible QTL clusters and hotspots. QTL clusters are composed of QTL of various traits which are concentrated in a specific region on a chromosome, whereas hotspots are composed of only one trait type. QTL were not evenly distributed along the cotton genome and were concentrated in specific regions on each chromosome. QTL hotspots for fiber quality traits were found in the same regions as the clusters, indicating that clusters may also form hotspots. Conclusions: Putative QTL clusters were identified via meta-analysis and will be useful for breeding programs and future studies involving Gossypium QTL. The presence of QTL clusters and hotspots indicates consensus regions across cultivated tetraploid Gossypium species, environments, and populations which contain large numbers of QTL, and in some cases multiple QTL associated with the same trait termed a hotspot. This study combines two previous meta-analysis studies and adds all other currently available QTL studies, making it the most comprehensive meta-analysis study in cotton to date.


Nature Methods | 2016

Inferring causal molecular networks: empirical assessment through a community-based effort

Steven M. Hill; Laura M. Heiser; Thomas Cokelaer; Michael Unger; Nicole K. Nesser; Daniel E. Carlin; Yang Zhang; Artem Sokolov; Evan O. Paull; Christopher K. Wong; Kiley Graim; Adrian Bivol; Haizhou Wang; Fan Zhu; Bahman Afsari; Ludmila Danilova; Alexander V. Favorov; Wai Shing Lee; Dane Taylor; Chenyue W. Hu; Byron L. Long; David P. Noren; Alexander J Bisberg; Gordon B. Mills; Joe W. Gray; Michael R. Kellen; Thea Norman; Stephen H. Friend; Amina A. Qutub; Elana J. Fertig

It remains unclear whether causal, rather than merely correlational, relationships in molecular networks can be inferred in complex biological settings. Here we describe the HPN-DREAM network inference challenge, which focused on learning causal influences in signaling networks. We used phosphoprotein data from cancer cell lines as well as in silico data from a nonlinear dynamical model. Using the phosphoprotein data, we scored more than 2,000 networks submitted by challenge participants. The networks spanned 32 biological contexts and were scored in terms of causal validity with respect to unseen interventional data. A number of approaches were effective, and incorporating known biology was generally advantageous. Additional sub-challenges considered time-course prediction and visualization. Our results suggest that learning causal relationships may be feasible in complex settings such as disease states. Furthermore, our scoring approach provides a practical way to empirically assess inferred molecular networks in a causal sense.


Applied and Environmental Microbiology | 2012

Effect of Rainfall-Induced Soil Geochemistry Dynamics on Grassland Soil Microbial Communities

Karelyn Cruz-Martinez; Anna Rosling; Yang Zhang; Mingzhou Song; Gary L. Andersen; Jillian F. Banfield

In Mediterranean-type grassland ecosystems, the timing of rainfall events controls biogeochemical cycles, as well as the phenology and productivity of plants and animals. Here, we investigate the effect of short-term (days) soil environmental conditions on microbial community structure and composition during a natural wetting and drying cycle. Soil samples were collected from a meadow in Northern California at four time points after the first two rainfall events of the rainy season. We used 16S rRNA microarrays (PhyloChip) to track changes in bacterial and archaeal community composition. Microbial communities at time points 1 and 3 were significantly different from communities at time points 2 and 4. Based on ordination analysis, the available carbon, soil moisture, and temperature explained most of the variation in community structure. For the first time, a complementary and more comprehensive approach using linear regression and generalized logical networks was used to identify linear and nonlinear associations among environmental variables and with the relative abundance of subfamilies. Changes in soil moisture and available carbon were correlated with the relative abundance of many phyla. Only the phylum Actinobacteria showed a lineage-specific relationship to soil moisture but not to carbon or nitrogen. The results indicate that the use of a high taxonomic rank in correlations with nutritional indicators might obscure divergent subfamily-level responses to environmental parameters. An important implication of this research is that there is short-term variation in microbial community composition driven in part by rainfall fluctuation that may not be evident in long-term studies with coarser time resolution.


International Conference on Communication Technology | 1996

Motion estimation in DCT domain

Mingzhou Song; Anni Cai; Jing-ao Sun

In this paper, a new block matching criterion is proposed for motion estimation in video coding. The new criterion is based on a comparison between the discrete cosine transform (DCT) coefficients of two blocks to be matched. Since the DCT spectrum statistically concentrates in the neighborhood of the dc component and the number of non-zero coefficients is quite small, only a few coefficients need to be considered when matching blocks according to the new criterion. In almost all video coding standards, DCT is a necessary step for spatial redundancy reduction. Thus the generated DCT coefficients can be utilized in the new criterion. However, there are still some blocks whose DCT coefficients are not available. Several algorithms for calculating the DCT coefficients of these blocks are given. When this new criterion is combined with the logarithmic search, promising results are produced.
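
The abstract does not give the exact form of the matching criterion, so the sketch below is only one plausible reading of it: blocks are compared by the sum of absolute differences over a few low-frequency DCT coefficients. The cutoff rule `u + v < max_order`, the naive transform, and the names `dct2`, `low_freq_cost`, and `best_match` are all assumptions made for illustration.

```python
import math


def dct2(block):
    """Naive orthonormal 2-D DCT-II of a square block given as a list of rows."""
    n = len(block)

    def alpha(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            s = 0.0
            for x in range(n):
                for y in range(n):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
            out[u][v] = alpha(u) * alpha(v) * s
    return out


def low_freq_cost(coef_a, coef_b, max_order=2):
    """Sum of absolute differences over coefficients with u + v < max_order.

    Assumption: the paper's criterion may weight or select coefficients
    differently; this cutoff only illustrates using a few of them.
    """
    n = len(coef_a)
    return sum(abs(coef_a[u][v] - coef_b[u][v])
               for u in range(n) for v in range(n) if u + v < max_order)


def best_match(target, candidates, max_order=2):
    """Return (index of the cheapest candidate, list of all costs)."""
    t = dct2(target)
    costs = [low_freq_cost(t, dct2(c), max_order) for c in candidates]
    return min(range(len(costs)), key=costs.__getitem__), costs


target = [[float(4 * r + c) for c in range(4)] for r in range(4)]
candidates = [
    [[10.0] * 4 for _ in range(4)],            # flat block
    [row[:] for row in target],                # exact copy of the target
    [[v + 20.0 for v in row] for row in target],  # brighter copy
]
best_index, costs = best_match(target, candidates)
```

With `max_order=2` only three coefficients per block enter the comparison (the dc term and its two nearest neighbors). A search strategy such as the logarithmic search mentioned above would decide which candidate blocks to evaluate; that part is not shown here.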


International Conference on Pattern Recognition | 2000

Algorithm performance contest

Selim Aksoy; Ming Ye; Michael L. Schauf; Mingzhou Song; Yalin Wang; Robert M. Haralick; J. R. Parker; Juraj Pivovarov; Dominik Royko; Changming Sun; Gunnar Farnebäck

This contest involved the running and evaluation of computer vision and pattern recognition techniques on different data sets with known groundtruth. The contest included three areas: binary shape recognition, symbol recognition, and image flow estimation. A package was made available for each area. Each package contained either real images with manual groundtruth or programs to generate data sets of ideal as well as noisy images with known groundtruth. They also contained programs to evaluate the results of an algorithm according to the given groundtruth. These evaluation criteria included the generation of confusion matrices, computation of the misdetection and false alarm rates, and other performance measures suitable for the problems. The paper summarizes the data generation for each area and experimental results for a total of six participating algorithms.


EURASIP Journal on Bioinformatics and Systems Biology | 2009

Reconstructing generalized logical networks of transcriptional regulation in mouse brain from temporal gene expression data

Mingzhou Song; Chris K. Lewis; Eric R. Lance; Elissa J. Chesler; Roumyana Yordanova; Michael A. Langston; Kerrie H. Lodowski; Susan E. Bergeson

Gene expression time course data can be used not only to detect differentially expressed genes but also to find temporal associations among genes. The problem of reconstructing generalized logical networks to account for temporal dependencies among genes and environmental stimuli from transcriptomic data is addressed. A network reconstruction algorithm was developed that uses statistical significance as a criterion for network selection to avoid false-positive interactions arising from pure chance. The multinomial hypothesis testing-based network reconstruction allows for explicit specification of the false-positive rate, unique from all extant network inference algorithms. The method is superior to dynamic Bayesian network modeling in a simulation study. Temporal gene expression data from the brains of alcohol-treated mice in an analysis of the molecular response to alcohol are used for modeling. Genes from major neuronal pathways are identified as putative components of the alcohol response mechanism. Nine of these genes have associations with alcohol reported in literature. Several other potentially relevant genes, compatible with independent results from literature mining, may play a role in the response to alcohol. Additional, previously unknown gene interactions were discovered that, subject to biological verification, may offer new clues in the search for the elusive molecular mechanisms of alcoholism.


IEEE Transactions on Medical Imaging | 2002

Integrated surface model optimization for freehand three-dimensional echocardiography

Mingzhou Song; Robert M. Haralick; Florence H. Sheehan; Richard K. Johnson

The major obstacle of three-dimensional (3-D) echocardiography is that the ultrasound image quality is too low to reliably detect features locally. Almost all available surface-finding algorithms depend on decent quality boundaries to get satisfactory surface models. We formulate the surface model optimization problem in a Bayesian framework, such that the inference made about a surface model is based on the integration of both the low-level image evidence and the high-level prior shape knowledge through a pixel class prediction mechanism. We model the probability of pixel classes instead of making explicit decisions about them. Therefore, we avoid the unreliable edge detection or image segmentation problem and the pixel correspondence problem. An optimal surface model best explains the observed images such that the posterior probability of the surface model for the observed images is maximized. The pixel feature vector as the image evidence includes several parameters such as the smoothed grayscale value and the minimal second directional derivative. Statistically, we describe the feature vector by the pixel appearance probability model obtained by a nonparametric optimal quantization technique. Qualitatively, we display the imaging plane intersections of the optimized surface models together with those of the ground-truth surfaces reconstructed from manual delineations. Quantitatively, we measure the projection distance error between the optimized and the ground-truth surfaces. In our experiment, we use 20 studies to obtain the probability models offline. The prior shape knowledge is represented by a catalog of 86 left ventricle surface models. In another set of 25 test studies, the average epicardial and endocardial surface projection distance errors are 3.2 ± 0.85 mm and 2.6 ± 0.78 mm, respectively.


IET Systems Biology | 2009

Discrete dynamical system modelling for gene regulatory networks of 5-hydroxymethylfurfural tolerance for ethanologenic yeast

Mingzhou Song; Zhengyu Ouyang; Z.L. Liu

Composed of linear difference equations, a discrete dynamical system (DDS) model was designed to reconstruct transcriptional regulations in gene regulatory networks (GRNs) for ethanologenic yeast Saccharomyces cerevisiae in response to 5-hydroxymethylfurfural (HMF), a bioethanol conversion inhibitor. The modelling aims at identification of a system of linear difference equations to represent temporal interactions among significantly expressed genes. Power stability is imposed on a system model under the normal condition in the absence of the inhibitor. Non-uniform sampling, typical in a time-course experimental design, is addressed by a log-time domain interpolation. A statistically significant DDS model of the yeast GRN, derived from time-course gene expression measurements under HMF exposure, revealed several verified transcriptional regulation events. These events implicate Yap1 and Pdr3, transcription factors consistently known for their regulatory roles by other studies or postulated by independent sequence motif analysis, suggesting their involvement in yeast tolerance and detoxification of the inhibitor.
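
To make the notion of a system of linear difference equations concrete, here is a minimal sketch that assumes the simplest form x(t+1) = A x(t) for two variables with uniformly spaced, noise-free samples. It fits A by ordinary least squares through the normal equations. The stability constraint, the log-time interpolation, and the significance testing described in the abstract are not reproduced, and the names `simulate` and `fit_2x2` are hypothetical.

```python
def simulate(a, x0, steps):
    """Iterate x(t+1) = A x(t) for a 2x2 matrix A, returning all states."""
    xs = [x0]
    for _ in range(steps):
        x, y = xs[-1]
        xs.append((a[0][0] * x + a[0][1] * y,
                   a[1][0] * x + a[1][1] * y))
    return xs


def fit_2x2(xs):
    """Least-squares estimate of A from consecutive states.

    With G = sum x(t) x(t)^T and C = sum x(t+1) x(t)^T, the normal
    equations give A = C G^{-1}.
    """
    g = [[0.0, 0.0], [0.0, 0.0]]
    c = [[0.0, 0.0], [0.0, 0.0]]
    for prev, nxt in zip(xs, xs[1:]):
        for i in range(2):
            for j in range(2):
                g[i][j] += prev[i] * prev[j]
                c[i][j] += nxt[i] * prev[j]
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    if abs(det) < 1e-12:
        raise ValueError("states are (nearly) collinear; A is not identifiable")
    g_inv = [[g[1][1] / det, -g[0][1] / det],
             [-g[1][0] / det, g[0][0] / det]]
    return [[sum(c[i][k] * g_inv[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


# Illustrative coefficients only; they are not taken from the paper.
true_a = [[0.9, 0.1], [-0.2, 0.8]]
states = simulate(true_a, (1.0, 0.5), steps=6)
estimated_a = fit_2x2(states)
```

In this toy setting the estimate recovers the generating matrix up to rounding error. Real expression data would add noise and many more genes, which is where model selection by statistical significance becomes necessary.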


International Conference on Data Mining | 2008

Comparison of Cluster Representations from Partial Second- to Full Fourth-Order Cross Moments for Data Stream Clustering

Mingzhou Song; Lin Zhang

Under seven external clustering evaluation measures, a comparison is made for cluster representations from the partial second order to the fourth order in data stream clustering. Two external clustering evaluation measures, purity and cross entropy, adopted for data stream clustering performance evaluation in the past, penalize the performance of an algorithm when each hypothesized cluster contains points in different target classes or true clusters, while ignoring the issue of points in a target class falling into different hypothesized clusters. The seven measures address both sides of the clustering performance. The geometry represented by the partial second-order statistics of a cluster is non-oblique ellipsoidal and cannot describe the orientation, asymmetry, or peakedness of a cluster. The higher-order cluster representation presented in this paper introduces the third and fourth cross moments, enabling the cluster geometry to go beyond an ellipsoid. The higher-order statistics allow two clusters with different representations to merge into a multivariate normal cluster, using normality tests based on multivariate skewness and kurtosis. The clustering performance under the seven external clustering evaluation measures with a synthetic and two real data streams demonstrates the effectiveness of the higher-order cluster representations.
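
The one-sidedness of purity noted above can be seen in a small example. The sketch below uses the common definition of purity (the fraction of points that belong to the majority class of their cluster); the abstract does not name the seven measures, so the reversed quantity computed here is purely illustrative and is not claimed to be one of them.

```python
from collections import Counter, defaultdict


def purity(clusters, classes):
    """Fraction of points that fall in the majority class of their cluster."""
    groups = defaultdict(Counter)
    for k, c in zip(clusters, classes):
        groups[k][c] += 1
    return sum(max(cnt.values()) for cnt in groups.values()) / len(clusters)


# Class "a" is split across hypothesized clusters 0 and 1.
classes = ["a", "a", "a", "a", "b", "b"]
clusters = [0, 0, 1, 1, 2, 2]

forward = purity(clusters, classes)   # each cluster is pure
# Swapping the roles asks how well each class stays in one cluster
# (an illustrative counterpart, not a measure taken from the paper).
reverse = purity(classes, clusters)
```

Here `forward` is 1.0 even though a true class has been fragmented, while `reverse` drops to 4/6, which is the kind of effect a two-sided evaluation is meant to capture.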

Collaboration

Top Co-Authors

Jinfa Zhang, New Mexico State University
Joseph I. Said, New Mexico State University
Zhengyu Ouyang, New Mexico State University
Yang Zhang, New Mexico State University
Hien Nguyen, New Mexico State University
Haizhou Wang, New Mexico State University
Hongbin Wang, City University of New York
Stéphane Boissinot, New York University Abu Dhabi