Publication


Featured research published by Hyeyoung Park.


Neural Computation | 2000

Adaptive Method of Realizing Natural Gradient Learning for Multilayer Perceptrons

Shun-ichi Amari; Hyeyoung Park; Kenji Fukumizu

The natural gradient learning method is known to have ideal performance for on-line training of multilayer perceptrons. It avoids plateaus, which give rise to the slow convergence of the backpropagation method, and it is Fisher efficient, whereas the conventional method is not. However, implementing the method requires calculating the Fisher information matrix and its inverse, which is practically very difficult. This article proposes an adaptive method of directly obtaining the inverse of the Fisher information matrix. It generalizes the adaptive Gauss-Newton algorithms and provides a solid theoretical justification for them. Simulations show that the proposed adaptive method works very well for realizing natural gradient learning.
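For context, the general shape of such an adaptive inverse-Fisher update, reconstructed from the abstract rather than transcribed from the paper (step sizes and notation are assumptions), is a rank-one recursion of the form

```latex
\hat{G}^{-1}_{t+1} = (1+\epsilon_t)\,\hat{G}^{-1}_t
  - \epsilon_t\,\hat{G}^{-1}_t\,\nabla\ell_t\,(\nabla\ell_t)^{\top}\,\hat{G}^{-1}_t,
\qquad
\theta_{t+1} = \theta_t - \eta_t\,\hat{G}^{-1}_{t+1}\,\nabla\ell_t,
```

where $\nabla\ell_t$ is the per-example loss gradient, $\epsilon_t$ is a small step size for the matrix estimate, and $\eta_t$ is the learning rate. The recursion tracks the inverse Fisher matrix directly, so the full matrix is never formed or inverted.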


Neural Networks | 2000

Adaptive natural gradient learning algorithms for various stochastic models

Hyeyoung Park; Shun-ichi Amari; Kenji Fukumizu

The natural gradient method has ideal dynamic behavior: it resolves the slow learning speed of the standard gradient descent method caused by plateaus. However, it requires calculating the Fisher information matrix and its inverse, which makes direct implementation of the natural gradient almost impossible. To solve this problem, a preliminary study proposed an adaptive method of calculating an estimate of the inverse of the Fisher information matrix, called the adaptive natural gradient learning method. In this paper, we show that the adaptive natural gradient method can be extended to a wide class of stochastic models: regression with an arbitrary noise model and classification with an arbitrary number of classes. We give explicit forms of the adaptive natural gradient for these models and confirm the practical advantage of the proposed algorithms through computational experiments on benchmark problems.
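As a rough illustration of the adaptive scheme summarized above, the following sketch applies the rank-one inverse-Fisher recursion to a toy regression model with Gaussian noise. Everything here (the model, the step sizes, the variable names) is an illustrative assumption, not the authors' code.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression y = w_true . x + Gaussian noise; w is learned online.
d = 5
w_true = 0.3 * rng.normal(size=d)
w = np.zeros(d)
G_inv = np.eye(d)        # running estimate of the inverse Fisher matrix
eps = 1e-3               # matrix step size (assumed small constant)

for t in range(20000):
    x = rng.normal(size=d)
    y = w_true @ x + 0.1 * rng.normal()
    grad = -(y - w @ x) * x          # gradient of the squared-error loss

    # Rank-one adaptive update of the inverse Fisher estimate
    Gg = G_inv @ grad
    G_inv = (1 + eps) * G_inv - eps * np.outer(Gg, Gg)

    # Natural gradient parameter step with a decaying schedule (assumed)
    eta = 1.0 / (t + 100)
    w -= eta * G_inv @ grad

print("parameter error:", np.linalg.norm(w - w_true))
```

A decaying learning rate is used because the theory behind natural gradient learning ties its Fisher efficiency to Robbins-Monro schedules; a small constant would keep the parameters fluctuating around the optimum.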


Neural Computation | 2006

Singularities Affect Dynamics of Learning in Neuromanifolds

Shun-ichi Amari; Hyeyoung Park; Tomoko Ozeki

The parameter spaces of hierarchical systems such as multilayer perceptrons include singularities due to the symmetry and degeneration of hidden units. A parameter space forms a geometrical manifold, called the neuromanifold in the case of neural networks. Such a model is identified with a statistical model, and a Riemannian metric is given by the Fisher information matrix. However, the matrix degenerates at singularities. Such a singular structure is ubiquitous not only in multilayer perceptrons but also in the gaussian mixture probability densities, ARMA time-series model, and many other cases. The standard statistical paradigm of the Cramér-Rao theorem does not hold, and the singularity gives rise to strange behaviors in parameter estimation, hypothesis testing, Bayesian inference, model selection, and in particular, the dynamics of learning from examples. Prevailing theories so far have not paid much attention to the problem caused by singularity, relying only on ordinary statistical theories developed for regular (nonsingular) models. Only recently have researchers remarked on the effects of singularity, and theories are now being developed. This article gives an overview of the phenomena caused by the singularities of statistical manifolds related to multilayer perceptrons and gaussian mixtures. We demonstrate our recent results on these problems. Simple toy models are also used to show explicit solutions. We explain that the maximum likelihood estimator is no longer subject to the gaussian distribution even asymptotically, because the Fisher information matrix degenerates, that the model selection criteria such as AIC, BIC, and MDL fail to hold in these models, that a smooth Bayesian prior becomes singular in such models, and that the trajectories of dynamics of learning are strongly affected by the singularity, causing plateaus or slow manifolds in the parameter space. The natural gradient method is shown to perform well because it takes the singular geometrical structure into account. The generalization error and the training error are studied in some examples.
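A minimal example of the kind of singularity discussed here, a standard toy model in this literature (written from the abstract, not copied from the paper), is the two-component gaussian mixture

```latex
p(x;\, w, \mu) = (1 - w)\,\varphi(x) + w\,\varphi(x - \mu),
```

where $\varphi$ is the standard normal density. On the set $\{\mu = 0\} \cup \{w = 0\}$ the two components collapse to the same distribution, the parameters $(w, \mu)$ become unidentifiable, and the Fisher information matrix degenerates there, which is why the usual Cramér-Rao and model-selection asymptotics break down.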


Human and Ecological Risk Assessment | 2007

Degradation of Antibiotics (Tetracycline, Sulfathiazole, Ampicillin) Using Enzymes of Glutathione S-Transferase

Hyeyoung Park; Youn-Kyoo Choung

Swine wastewater is not easily treated in biological wastewater treatment plants. One reason is that some antibiotics are not easily degradable in a normal treatment system and inhibit the biological organisms in the treatment system. Specifically, tetracycline, sulfathiazole, and ampicillin are representative antibiotics found in swine wastewater. Degrading these refractory and inhibitory antibiotics requires a special method, such as an enzymatic one. This research tested the feasibility of enzymatic treatment in vitro using an enzyme assay. The Glutathione S-Transferases (GSTs) are a family of proteins that catalyze the conjugation of reduced glutathione with a variety of hydrophobic chemicals containing electrophilic centers. Using GSTs, these major antibiotics were transformed into components that were non-toxic to the microorganisms that treat manure wastewater. The initial concentrations of tetracycline, sulfathiazole, and ampicillin were 100 mg/L, 100 mg/L, and 50 mg/L, respectively, and the concentration of pig feed was kept at its usual level. The GSTs acted by biotransforming the antibiotics: 60–70% of each antibiotic was transformed by GSTs by the end of the degradation reaction, which lowered their inhibitory strength against the microorganisms.


Neural Processing Letters | 2009

Singularity and Slow Convergence of the EM algorithm for Gaussian Mixtures

Hyeyoung Park; Tomoko Ozeki

Singularities in the parameter spaces of hierarchical learning machines are known to be a main cause of the slow convergence of gradient descent learning. The EM algorithm, another learning algorithm that gives a maximum likelihood estimator, also suffers from slow convergence, which often appears when the component overlap is large. We analyze the dynamics of the EM algorithm for Gaussian mixtures around singularities and show that there exists a slow manifold caused by the singular structure, which is closely related to the slow convergence of the EM algorithm. We also conduct numerical simulations to confirm the theoretical analysis. Through the simulations, we compare the dynamics of the EM algorithm with those of the gradient descent algorithm and show that their slow dynamics are caused by the same singular structure, and thus they behave in the same way around singularities.
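For reference, here is a minimal EM loop for a one-dimensional two-component Gaussian mixture, the class of models whose convergence near singularities is analyzed above. This is a textbook sketch with assumed variable names, not the authors' code; heavy component overlap (means close together) is the regime where the slow manifold appears.

```python
import numpy as np

rng = np.random.default_rng(1)

# Data from two overlapping Gaussians; strong overlap is where EM slows down.
x = np.concatenate([rng.normal(-0.5, 1.0, 500), rng.normal(0.5, 1.0, 500)])

# Mixing weight and component means; unit variance is assumed known.
w, mu = 0.5, np.array([-1.0, 1.0])

def gauss(x, m):
    return np.exp(-0.5 * (x - m) ** 2) / np.sqrt(2 * np.pi)

for it in range(200):
    # E-step: posterior responsibility of component 2 for each point
    p1, p2 = (1 - w) * gauss(x, mu[0]), w * gauss(x, mu[1])
    r = p2 / (p1 + p2)
    # M-step: closed-form updates of the weight and the means
    w = r.mean()
    mu = np.array([((1 - r) * x).sum() / (1 - r).sum(),
                   (r * x).sum() / r.sum()])

print(w, mu)
```

In runs of this kind, moving the true means closer together markedly increases the number of iterations EM needs, which is the slowdown the analysis attributes to the singular structure.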


Journal of the Physical Society of Japan | 2003

On-line learning theory of soft committee machines with correlated hidden units - Steepest gradient descent and natural gradient descent

Masato Inoue; Hyeyoung Park; Masato Okada

The permutation symmetry of the hidden units in multilayer perceptrons causes the saddle structure and plateaus of the learning dynamics in gradient learning methods. The correlation of the weight vectors of hidden units in a teacher network is thought to affect this saddle structure, resulting in a prolonged learning time, but the mechanism is still unclear. In this paper, we discuss it with regard to soft committee machines and on-line learning, using statistical mechanics. Conventional gradient descent needs more time to break the symmetry as the correlation of the teacher weight vectors rises. On the other hand, no plateaus occur with natural gradient descent, regardless of the correlation, in the limit of a low learning rate. Analytical results support these dynamics around the saddle point.
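A minimal numerical setup for the phenomenon described above, assuming a two-hidden-unit soft committee machine with activation g(u) = erf(u/√2), a teacher with correlated hidden weight vectors, and plain online gradient descent. The dimensions, rates, and correlation value are illustrative assumptions.

```python
import numpy as np
from scipy.special import erf

rng = np.random.default_rng(2)
N, K = 100, 2                          # input dimension, hidden units
g = lambda u: erf(u / np.sqrt(2))      # soft committee activation

# Teacher weight vectors with overlap c (|b_k| = sqrt(N), b1.b2/N ~ c)
c = 0.8
b1 = rng.normal(size=N); b1 *= np.sqrt(N) / np.linalg.norm(b1)
b2 = c * b1 + np.sqrt(1 - c**2) * rng.normal(size=N)
B = np.stack([b1, b2])

J = rng.normal(scale=0.1, size=(K, N))  # student weights, small random init
eta = 0.5 / N                           # small learning rate

for t in range(200001):
    x = rng.normal(size=N)
    h, y_t = J @ x / np.sqrt(N), g(B @ x / np.sqrt(N)).sum()
    err = g(h).sum() - y_t
    # online gradient step; g'(u) = sqrt(2/pi) * exp(-u^2 / 2)
    J -= eta * err * (np.sqrt(2 / np.pi) * np.exp(-h**2 / 2))[:, None] * x / np.sqrt(N)

    if t % 40000 == 0:                  # crude generalization-error estimate
        xs = rng.normal(size=(2000, N))
        e = 0.5 * np.mean((g(xs @ J.T / np.sqrt(N)).sum(1)
                           - g(xs @ B.T / np.sqrt(N)).sum(1)) ** 2)
        print(t, e)
```

In simulations of this kind, raising c lengthens the plateau phase during which the error stays nearly constant before the symmetry between the student's hidden units breaks.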


Neurocomputing | 2014

Robust recognition of face with partial variations using local features and statistical learning

Jeongin Seo; Hyeyoung Park

Despite the enormous interest in face recognition in the fields of computer vision and pattern recognition, it remains a challenge because of the diverse variations in facial images. In order to deal with variations such as illumination, expression, pose, and occlusion, it is important to find a discriminative feature that is robust to the variations while keeping the core information of the original images. In this paper, we attempt to develop a face recognition method that is robust to partial variations through statistical learning of local features. By representing a facial image as a set of local feature descriptors such as the scale-invariant feature transform (SIFT), we expect to achieve a representation robust to the variations in typical 2D images, such as illumination and translation. By estimating the probability density of the local feature descriptors observed in facial data, we expect to absorb typical variations in facial images, such as expressions and partial occlusions. In the classification stage, the estimated probability density is used to define a weighted distance measure between two images. Through computational experiments on benchmark data sets, we show that the proposed method is more robust to partial variations such as expressions and occlusions than conventional face recognition methods.
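A rough sketch of the pipeline the abstract describes: SIFT descriptors per image, a density model over training descriptors, and a density-weighted distance between descriptor sets. The density model and the weighting rule below (a Gaussian KDE and a weighted nearest-descriptor match) are illustrative stand-ins, not the paper's exact formulation.

```python
import cv2
import numpy as np
from sklearn.neighbors import KernelDensity

sift = cv2.SIFT_create()

def descriptors(path):
    """SIFT descriptors of a grayscale face image, shape (n_keypoints, 128)."""
    img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    _, des = sift.detectAndCompute(img, None)
    return des

# Placeholder training image paths (assumption; substitute a real gallery).
train_paths = ["face_001.png", "face_002.png"]
train_des = np.vstack([descriptors(p) for p in train_paths])

# Density model over pooled training descriptors; the bandwidth is a guess.
kde = KernelDensity(bandwidth=50.0).fit(train_des)

def weighted_distance(des_a, des_b):
    """Nearest-descriptor distance from A to B, weighted by how typical
    each descriptor of A is under the training density."""
    log_w = kde.score_samples(des_a)
    w = np.exp(log_w - log_w.max())          # numerically stabilized weights
    d = np.linalg.norm(des_a[:, None] - des_b[None], axis=2).min(axis=1)
    return float((w * d).sum() / w.sum())
```

At test time, a probe image would be assigned to the gallery identity with the smallest weighted distance; down-weighting rare descriptors is what lets occluded or distorted regions contribute less.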


Neural Computation | 2004

Improving Generalization Performance of Natural Gradient Learning Using Optimized Regularization by NIC

Hyeyoung Park; Noboru Murata; Shun-ichi Amari

Natural gradient learning is known to be efficient in escaping plateaus, which are a main cause of the slow learning speed of neural networks. The adaptive natural gradient learning method for practical implementation has also been developed, and its advantage in real-world problems has been confirmed. In this letter, we deal with the generalization performance of the natural gradient method. Since natural gradient learning fits the parameters to the training data quickly, overfitting may easily occur, resulting in poor generalization performance. To solve this problem, we introduce a regularization term into natural gradient learning and propose an efficient method for optimizing the regularization strength using a generalized Akaike information criterion (the network information criterion, NIC). We discuss the properties of the regularization strength optimized by NIC through theoretical analysis as well as computer simulations, and we confirm the computational efficiency and generalization performance of the proposed method through computational experiments on real-world benchmark problems.
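Schematically, the procedure described above minimizes a penalized loss and tunes the penalty scale with the network information criterion. The following is a hedged sketch of the objective with assumed notation, not a formula copied from the paper:

```latex
\theta_\lambda = \arg\min_{\theta}\Big[\,L_{\mathrm{train}}(\theta) + \lambda\,\Omega(\theta)\Big],
\qquad
\mathrm{NIC}(\lambda) \approx L_{\mathrm{train}}(\theta_\lambda)
  + \frac{1}{n}\,\mathrm{tr}\!\big(Q_\lambda\,G_\lambda^{-1}\big),
```

where $\Omega$ is the regularizer (e.g. a weight-decay term), $n$ is the number of training examples, and $G_\lambda$ and $Q_\lambda$ are Hessian-type and Fisher-type matrices of the penalized loss at $\theta_\lambda$; the scale $\lambda$ is chosen to minimize $\mathrm{NIC}(\lambda)$.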


Journal of Physics A | 2003

Online learning dynamics of multilayer perceptrons with unidentifiable parameters

Hyeyoung Park; Masato Inoue; Masato Okada

In the over-realizable learning scenario of multilayer perceptrons, in which the student network has a larger number of hidden units than the true or optimal network, some of the weight parameters are unidentifiable. In this case, the teacher network consists of a union of optimal subspaces included in the parameter space. The optimal subspaces, which lead to singularities, are known to affect the estimation performance of neural networks. Using statistical mechanics, we investigate the online learning dynamics of two-layer neural networks in the over-realizable scenario with unidentifiable parameters. We show that the convergence speed strongly depends on the initial parameter conditions. We also show that there is a quasi-plateau around the optimal subspace, which differs from the well-known plateaus caused by permutation symmetry. In addition, we discuss the property of the final learning state, relating this to the singular structures.


international symposium on neural networks | 2009

Nonlinear dimension reduction using ISOMap based on class information

Minkook Cho; Hyeyoung Park

Image processing and machine learning communities have long addressed the problems involved in the analysis of large high-dimensional data sets. To handle high-dimensional data efficiently, it is important to learn the core properties of a given data set. Manifold learning methods such as ISOMap try to identify a low-dimensional manifold from a set of unorganized samples. ISOMap is an extension of the classical multidimensional scaling method for dimension reduction, which finds a linear subspace in which the dissimilarity between data points is preserved. To measure dissimilarity, ISOMap uses geodesic distances on the manifold instead of Euclidean distances. In this paper, we propose a modification of ISOMap that uses class information, which is often provided along with the input data in applications such as pattern classification. Since conventional ISOMap does not use class information when approximating the true geodesic distance between each pair of data points, it is difficult to construct a data structure related to class membership, which can give important information for tasks such as data visualization and classification. The proposed method utilizes class membership when measuring the distance between data pairs, so as to find a low-dimensional manifold that preserves the distances between classes as well as the distances between data points. Through computational experiments on artificial data sets and real facial data sets, we confirm that the proposed method gives better performance than conventional ISOMap.
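A sketch of the modification the abstract describes: build the usual ISOMap neighborhood graph, but stretch edges that cross class boundaries before computing geodesic distances and the MDS embedding. The stretching rule, the factor beta, and the neighborhood size are illustrative assumptions, not the paper's formula.

```python
import numpy as np
from sklearn.neighbors import kneighbors_graph
from scipy.sparse.csgraph import shortest_path

def class_isomap(X, y, n_neighbors=10, beta=2.0, n_components=2):
    """ISOMap variant that stretches between-class edges by a factor beta."""
    y = np.asarray(y)
    W = kneighbors_graph(X, n_neighbors, mode="distance").toarray()
    W = np.maximum(W, W.T)                       # symmetrize the k-NN graph
    cross = y[:, None] != y[None, :]             # pairs with different labels
    W[(W > 0) & cross] *= beta                   # penalize class-crossing edges
    # Geodesic (shortest-path) distances; assumes the graph is connected
    D = shortest_path(W, method="D", directed=False)
    # Classical MDS on the geodesic distance matrix
    n = len(X)
    H = np.eye(n) - np.ones((n, n)) / n
    K = -0.5 * H @ (D ** 2) @ H
    vals, vecs = np.linalg.eigh(K)
    idx = np.argsort(vals)[::-1][:n_components]
    return vecs[:, idx] * np.sqrt(np.maximum(vals[idx], 0.0))
```

For example, `class_isomap(X, labels)` yields 2-D coordinates in which same-class points stay relatively closer, since class-crossing paths are made artificially longer before embedding.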

Collaboration


Dive into Hyeyoung Park's collaborations.

Top Co-Authors

Shun-ichi Amari
RIKEN Brain Science Institute

Minkook Cho
Kyungpook National University

Tomoko Ozeki
RIKEN Brain Science Institute

Hyunsoek Choi
Kyungpook National University

Jeongin Seo
Kyungpook National University