Jiyuan An | Researchain

Publication

Featured research published by Jiyuan An.

Disease Models & Mechanisms | 2010

Disease-specific, neurosphere-derived cells as models for brain disorders

Nicholas Matigian; Greger Abrahamsen; Ratneswary Sutharsan; Anthony L. Cook; Alejandra Mariel Vitale; Amanda Nouwens; Bernadette Bellette; Jiyuan An; Matthew J. Anderson; Anthony Gordon Beckhouse; Maikel Bennebroek; Rowena Cecil; Alistair Morgan Chalk; Julie Cochrane; Yongjun Fan; François Féron; Richard D. McCurdy; John J. McGrath; Wayne Murrell; Chris Perry; Jyothy Raju; Sugandha Ravishankar; Peter A. Silburn; Greg T. Sutherland; Stephen M. Mahler; George D. Mellick; Stephen A. Wood; Carolyn M. Sue; Christine A. Wells; Alan Mackay-Sim

SUMMARY There is a pressing need for patient-derived cell models of brain diseases that are relevant and robust enough to produce the large quantities of cells required for molecular and functional analyses. We describe here a new cell model based on patient-derived cells from the human olfactory mucosa, the organ of smell, which regenerates throughout life from neural stem cells. Olfactory mucosa biopsies were obtained from healthy controls and patients with either schizophrenia, a neurodevelopmental psychiatric disorder, or Parkinson’s disease, a neurodegenerative disease. Biopsies were dissociated and grown as neurospheres in defined medium. Neurosphere-derived cell lines were grown in serum-containing medium as adherent monolayers and stored frozen. By comparing 42 patient and control cell lines we demonstrated significant disease-specific alterations in gene expression, protein expression and cell function, including dysregulated neurodevelopmental pathways in schizophrenia and dysregulated mitochondrial function, oxidative stress and xenobiotic metabolism in Parkinson’s disease. The study has identified new candidate genes and cell pathways for future investigation. Fibroblasts from schizophrenia patients did not show these differences. Olfactory neurosphere-derived cells have many advantages over embryonic stem cells and induced pluripotent stem cells as models for brain diseases. They do not require genetic reprogramming and they can be obtained from adults with complex genetic diseases. They will be useful for understanding disease aetiology, for diagnostics and for drug discovery.

Intelligent Data Engineering and Automated Learning | 2003

Grid-Based Indexing for Large Time Series Databases

Jiyuan An; Hanxiong Chen; Kazutaka Furuse; Nobuo Ohbo; Eamonn J. Keogh

Similarity search in large time series databases is an interesting and challenging problem. Because the data are high-dimensional, the difficulties associated with the curse of dimensionality arise. The most promising solution is to apply dimensionality reduction and construct a multi-dimensional index structure for the reduced data. In this work we introduce a new approach called grid-based Datawise Dimensionality Reduction (DDR), which attempts to preserve the characteristics of time series. We then apply quantization to construct an index structure. An experimental comparison with existing techniques demonstrates the utility of our approach.

Web Age Information Management | 2002

C2VA: Trim High Dimensional Indexes

Hanxiong Chen; Jiyuan An; Kazutaka Furuse; Nobuo Ohbo

Classical multi-dimensional indexes are based on data space partitioning. Their effectiveness declines because the number of indexing units grows exponentially as the number of dimensions increases, to the point where using such index structures is less effective than a linear scan of all the data. The VA-file proposed a method of coordinate approximation, based on the observation that nearest neighbor search becomes of linear complexity in high-dimensional spaces. In this paper we propose C2VA (Clustered Compact VA) for dimensionality reduction. We find that real datasets are rarely uniformly distributed, although uniformity is the main assumption of the VA-file. Instead of approximating all dimensions, we determine the condition under which less important dimensions can be skipped. This avoids generating a huge index file for a large, high-dimensional dataset and hence saves many I/O accesses when scanning. Moreover, we guarantee that C2VA preserves the precision of the bounds of the VA-file, which maximizes the efficiency gain. Our experimental results support these claims.
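
The coordinate approximation that the VA-file relies on can be sketched as follows. This is only an illustration of the plain VA-file idea, not of C2VA itself: it assumes coordinates normalized to [0, 1), a uniform grid with the same number of bits per dimension, and Euclidean distance. The function names va_approximate and lower_bound are hypothetical.

```python
import math

def va_approximate(point, bits=2):
    """Replace each coordinate in [0, 1) by the index of its grid cell."""
    cells = 1 << bits  # number of cells per dimension
    return [min(int(x * cells), cells - 1) for x in point]

def lower_bound(query, approx, bits=2):
    """Smallest possible Euclidean distance from query to any point
    lying in the grid cell described by approx."""
    cells = 1 << bits
    total = 0.0
    for q, c in zip(query, approx):
        lo, hi = c / cells, (c + 1) / cells  # cell boundaries in this dimension
        if q < lo:
            total += (lo - q) ** 2
        elif q > hi:
            total += (q - hi) ** 2
        # if lo <= q <= hi the cell may contain q, so this dimension adds 0
    return math.sqrt(total)

approx = va_approximate([0.1, 0.9])        # [0, 3]
bound = lower_bound([0.6, 0.8], approx)    # about 0.35, below the true distance
```

During a scan, a point whose lower bound already exceeds the best distance found so far can be skipped without reading its full coordinates, which is where the I/O saving comes from.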

Journal of Visual Languages and Computing | 2007

A dimensionality reduction algorithm and its application for interactive visualization

Jiyuan An; Jeffrey Xu Yu; Chotirat Ann Ratanamahatana; Yi-Ping Phoebe Chen

Visualization is one of the most effective methods for analyzing how high-dimensional data are distributed. Dimensionality reduction techniques, such as PCA, can be used to map high dimensional data to a two- or three-dimensional space. In this paper, we propose an algorithm called HyperMap that can be effectively applied to visualization. Our algorithm can be seen as a generalization of FastMap. It preserves its linear computation complexity, and overcomes several main shortcomings, especially in visualization. Since there are more than two pivot objects in each axis of a target space, more distance information needs to be preserved in each dimension. Then in visualization, the number of pivot objects can go beyond the limitation of six (2-pivot objects x 3-dimensions). Our HyperMap algorithm also gives more flexibility to the target space, such that the data distribution can be observed from various viewpoints. Its effectiveness is confirmed by empirical evaluations on both real and synthetic datasets.
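
For context, the two-pivot projection used by FastMap, which the abstract names as the method HyperMap generalizes, can be sketched as below. This shows FastMap's standard single-axis formula only, not HyperMap; the helper name fastmap_coordinate and the sample points are illustrative assumptions.

```python
import math

def fastmap_coordinate(dist, a, b, i):
    """Project object i onto the axis through pivot objects a and b,
    using only pairwise distances (law of cosines)."""
    d_ab = dist(a, b)
    return (dist(a, i) ** 2 + d_ab ** 2 - dist(b, i) ** 2) / (2.0 * d_ab)

# Example with Euclidean points: the pivots lie on the x-axis,
# so the projection recovers the x-coordinate of the third point.
pivot_a, pivot_b, obj = (0.0, 0.0), (4.0, 0.0), (1.0, 3.0)
x = fastmap_coordinate(math.dist, pivot_a, pivot_b, obj)  # about 1.0
```

Each target axis here depends on exactly two pivots, which is the source of the six-pivot limit for a three-dimensional view mentioned in the abstract.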

Australasian Database Conference | 2002

The convex polyhedra technique: an index structure for high-dimensional space

Jiyuan An; Hanxiong Chen; Kazutaka Furuse; Masahiro Ishikawa; Nobuo Ohbo

This paper proposes a new dimensionality reduction technique and an indexing mechanism for high-dimensional data sets in which data points are not uniformly distributed. The proposed technique decomposes a data space into convex polyhedra, and the dimensionality of each data point is reduced according to which polyhedron contains it. One advantage of the proposed technique is that it reduces the dimensionality locally. This local dimensionality reduction helps improve indexing mechanisms for non-uniformly distributed data sets. To show the applicability and effectiveness of the proposed technique, this paper describes a new indexing mechanism called the CVA-file (Compact VA-File), which is a revised version of the VA-file. With the proposed dimensionality reduction technique, the size of the data points stored in index files can be reduced. Furthermore, the technique can estimate upper and lower bounds for each entry in the index files by using geometric properties of convex polyhedra. Results from experimental simulations show that the CVA-file performs better than the VA-file on non-uniformly distributed real data sets.

Journal of Biotechnology | 2008

Finding edging genes from microarray data

Jiyuan An; Yi-Ping Phoebe Chen

MOTIVATION: A set of genes and their expression levels are used to classify disease and normal tissues. Due to the massive number of genes in a microarray, there are a large number of edges that divide different classes of genes in microarray space. The edging genes (EGs) can be co-regulated genes; they can also be on the same pathway or deregulated by the same non-coding genes, such as siRNA or miRNA. Every gene in a set of EGs is vital for identifying a tissue's class. A change in the expression of one EG may alter a tissue from normal to disease and vice versa. Finding EGs is therefore of biological importance. In this work, we propose an algorithm to find these EGs effectively.

RESULTS: We tested our algorithm on five microarray datasets. The results are compared with those of the border-based algorithm, which was used to find gene groups and subsequently divide different classes of tissues. Our algorithm finds significantly more EGs than the border-based algorithm does. Because our algorithm prunes irrelevant patterns at earlier stages, its time and space complexities are much lower than those of the border-based algorithm.

AVAILABILITY: The proposed algorithm is implemented in C++ on the Linux platform. The EGs in the five microarray datasets have been calculated. The preprocessed datasets and the discovered EGs are available at http://www3.it.deakin.edu.au/~phoebe/microarray.html.

Database Systems for Advanced Applications | 2005

A new indexing method for high dimensional dataset

Jiyuan An; Yi-Ping Phoebe Chen; Qinying Xu; Xiaofang Zhou

Indexing high-dimensional datasets has attracted extensive attention from many researchers in the last decade. Since R-tree-type index structures are known to suffer from the “curse of dimensionality”, Pyramid-tree-type index structures, which are based on the B-tree, have been proposed to break the curse. However, when the number of dimensions is high, the number of pyramids is often insufficient to discriminate between data points, and the effectiveness of the Pyramid tree degrades dramatically as dimensionality increases. In this paper, we focus on one particular aspect of the “curse of dimensionality”: the surface of a hypercube in a high-dimensional space approaches 100% of the total hypercube volume as the number of dimensions approaches infinity. We propose a new indexing method based on the surface of dimensionality. We prove that the Pyramid tree technique is a special case of our method. The results of our experiments demonstrate the clear superiority of our novel method.
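
The surface effect described in this abstract can be checked numerically. The sketch below is not the authors' indexing method; it assumes a unit hypercube and a hypothetical shell thickness eps, and computes the fraction of the volume that lies within eps of some face, which is 1 - (1 - 2*eps)^d for d dimensions.

```python
def shell_fraction(dims, eps=0.05):
    """Fraction of a unit hypercube's volume lying within eps of its surface.

    The inner cube that stays at least eps away from every face has
    side length 1 - 2*eps, so its volume is (1 - 2*eps) ** dims.
    """
    return 1.0 - (1.0 - 2.0 * eps) ** dims

# The fraction grows quickly towards 1 as the dimensionality increases.
for d in (1, 2, 10, 50, 100):
    print(d, shell_fraction(d))
```

With eps = 0.05, the shell holds about 10% of the volume in one dimension but more than 99% in 100 dimensions, so almost all uniformly distributed points sit close to the surface.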

Active Media Technology | 2005

Keyword extraction for text categorization

Jiyuan An; Yi-Ping Phoebe Chen

Text categorization (TC) is one of the main applications of machine learning. Many methods have been proposed, such as the Rocchio method, naive Bayes-based methods, and SVM-based text classification methods. These methods learn from labeled text documents and then construct a classifier, which can predict the category of a newly arriving text document. However, these methods do not give a description of each category. In the machine learning field, there are many concept learning algorithms, such as ID3 and CN2. This paper proposes a more robust algorithm for inducing concepts from training examples, based on the enumeration of all possible keyword combinations. Experimental results show that the rules produced by our approach are more precise and simpler than those of other methods.

Web Intelligence | 2004

Concept Learning of Text Documents

Jiyuan An; Yi-Ping Phoebe Chen

Concept learning of text documents can be viewed as the problem of acquiring the definition of a general category of documents. To define the category of a text document, a conjunction of keywords is usually used. These keywords should be few and comprehensible. A naïve method is to enumerate all combinations of keywords and extract suitable ones. However, because of the enormous number of keyword combinations, it is infeasible to extract the most relevant keywords for describing document categories by enumerating all possible combinations. Many heuristic methods have been proposed, such as GA-based and immune-based algorithms. In this work, we introduce a pruning technique and propose a robust enumeration-based concept learning algorithm. Experimental results show that the rules produced by our approach are more comprehensible and simpler than those produced by other methods.
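
The naïve enumeration mentioned in this abstract can be illustrated with a small sketch. It is not the authors' pruning algorithm: it simply tries every keyword conjunction up to a hypothetical max_len and ranks rules by precision, then by number of matching target documents, then by brevity. The function name, the ranking, and the toy documents are all assumptions made for illustration.

```python
from itertools import combinations

def best_conjunction(docs, labels, target, max_len=2):
    """Exhaustively search keyword conjunctions describing `target`.

    docs is a list of keyword sets; a rule matches a document when every
    keyword of the rule occurs in it. Returns (rule, precision, hits).
    """
    vocab = sorted(set().union(*docs))
    best = None
    for k in range(1, max_len + 1):
        for rule in combinations(vocab, k):
            matched = [lab for doc, lab in zip(docs, labels) if set(rule) <= doc]
            if not matched:
                continue
            hits = sum(lab == target for lab in matched)
            # Prefer higher precision, then more hits, then shorter rules.
            score = (hits / len(matched), hits, -k)
            if best is None or score > best[0]:
                best = (score, rule)
    if best is None:
        return None
    return best[1], best[0][0], best[0][1]

docs = [{"gene", "protein"}, {"gene", "index"}, {"index", "tree"},
        {"tree", "search", "index"}]
labels = ["bio", "bio", "db", "db"]
print(best_conjunction(docs, labels, "bio"))  # (('gene',), 1.0, 2)
```

The number of candidate rules grows combinatorially with the vocabulary size and max_len, which is why the abstract argues for pruning rather than plain enumeration.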

International Conference on Pattern Recognition | 2006

Finding Rule Groups to Classify High Dimensional Gene Expression Datasets

Jiyuan An; Yi-Ping Phoebe Chen

Microarray data provide quantitative information about the transcription profile of cells. Machine learning methodology has increasingly attracted bioinformatics researchers as a way to analyze microarray datasets, and some machine learning approaches are widely used to classify and mine biological datasets. However, because many gene expression datasets have extremely high dimensionality, traditional machine learning methods cannot be applied effectively and efficiently. This paper proposes a robust algorithm for finding rule groups to classify gene expression datasets. Most classification algorithms select dimensions (genes) heuristically to form rule groups that identify classes such as cancerous and normal tissues. In contrast, our algorithm is guaranteed to find the best k dimensions (genes), those most discriminative for classifying samples in different classes, to form rule groups for the classification of expression datasets. Our experiments show that the rule groups obtained by our algorithm have higher accuracy than those of other classification approaches.