
Publications


Featured research published by Bill G. Horne.


IEEE Transactions on Systems, Man, and Cybernetics, Part B | 1997

Computational capabilities of recurrent NARX neural networks

Hava T. Siegelmann; Bill G. Horne; C. L. Giles

Recently, fully connected recurrent neural networks have been proven to be computationally rich: at least as powerful as Turing machines. This work focuses on another class of network which is popular in control applications and has been found to be very effective at learning a variety of problems. These networks are based upon Nonlinear AutoRegressive models with eXogenous Inputs (NARX models) and are therefore called NARX networks. As opposed to other recurrent networks, NARX networks have limited feedback, which comes only from the output neuron rather than from hidden states. They are formalized by y(t) = Ψ(u(t−n_u), …, u(t−1), u(t), y(t−n_y), …, y(t−1)), where u(t) and y(t) represent the input and output of the network at time t, n_u and n_y are the input and output orders, and the function Ψ is the mapping performed by a multilayer perceptron. We constructively prove that NARX networks with a finite number of parameters are computationally as strong as fully connected recurrent networks, and thus as strong as Turing machines. We conclude that, in theory, one can use NARX models rather than conventional recurrent networks without any computational loss, even though their feedback is limited. Furthermore, these results raise the question of how much feedback or recurrence is necessary for any network to be Turing equivalent, and of which restrictions on feedback limit computational power.
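
The recurrence above is easy to write down concretely. Below is a minimal sketch of a NARX network with Ψ realized as a one-hidden-layer MLP; all sizes, the random weights, and the zero-padding of taps before t = 0 are illustrative assumptions, not details from the paper.

    import numpy as np

    rng = np.random.default_rng(0)
    nu, ny, hidden = 3, 2, 8                          # input order, output order, MLP width
    W1 = rng.standard_normal((hidden, nu + 1 + ny))   # taps: u(t-nu..t) and y(t-ny..t-1)
    b1 = np.zeros(hidden)
    W2 = rng.standard_normal(hidden)
    b2 = 0.0

    def psi(taps):
        # One-hidden-layer MLP: the mapping Psi in the NARX definition.
        return W2 @ np.tanh(W1 @ taps + b1) + b2

    def narx_run(u):
        # Run the NARX recurrence; taps before t = 0 are zero-padded (an assumption).
        y = np.zeros(len(u))
        for t in range(len(u)):
            u_taps = [u[t - k] if t - k >= 0 else 0.0 for k in range(nu, -1, -1)]
            y_taps = [y[t - k] if t - k >= 0 else 0.0 for k in range(ny, 0, -1)]
            y[t] = psi(np.array(u_taps + y_taps))
        return y

    print(narx_run(rng.standard_normal(10)))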


IEEE Transactions on Neural Networks | 1996

An analysis of noise in recurrent neural networks: convergence and generalization

Kam-Chuen Jim; C. L. Giles; Bill G. Horne

This work concerns the effect of noise on the performance of recurrent neural networks. We introduce and analyze various methods of injecting synaptic noise into dynamically driven recurrent nets during training. Theoretical results show that applying a controlled amount of noise during training may improve convergence and generalization performance. We analyze the effects of various noise parameters and predict that the best overall performance is achieved by injecting additive noise at each time step. Noise contributes a second-order gradient term to the error function which can be viewed as an anticipatory agent that aids convergence. This term appears to find promising regions of weight space in the early stages of training, when the training error is large, and should improve convergence on error surfaces with local minima. The first-order term is a regularization term that can improve generalization. Specifically, it can encourage internal representations in which the state nodes operate in the saturated regions of the sigmoid discriminant function. While this effect can improve performance on automata inference problems with binary inputs and target outputs, it is unclear what effect it will have on other types of problems. To substantiate these predictions, we present simulations on learning the dual parity grammar from temporal strings for all noise models, and on learning a randomly generated six-state grammar using the predicted best noise model.
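
As one concrete reading of the predicted best noise model, the sketch below adds fresh additive noise at every time step of a simple sigmoid recurrent net during training; the architecture, noise placement, and noise level are illustrative assumptions rather than the paper's exact formulation.

    import numpy as np

    rng = np.random.default_rng(1)
    n_in, n_state = 2, 5
    Wx = rng.standard_normal((n_state, n_in))
    Ws = rng.standard_normal((n_state, n_state))

    def forward(u_seq, noise_std=0.0):
        # noise_std > 0 adds fresh additive noise at every time step (training only).
        s = np.zeros(n_state)
        for u in u_seq:
            pre = Wx @ u + Ws @ s
            if noise_std > 0.0:
                pre += rng.normal(0.0, noise_std, n_state)
            s = 1.0 / (1.0 + np.exp(-pre))    # sigmoid state update
        return s

    # Train with noisy passes, evaluate noise-free:
    s_train = forward(rng.standard_normal((6, n_in)), noise_std=0.1)
    s_test = forward(rng.standard_normal((6, n_in)))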


IEEE Transactions on Signal Processing | 1997

A delay damage model selection algorithm for NARX neural networks

Tsung-Nan Lin; C.L. Giles; Bill G. Horne; Sun-Yuan Kung

Recurrent neural networks have become popular models for system identification and time series prediction. Nonlinear autoregressive models with exogenous inputs (NARX) neural network models are a popular subclass of recurrent networks and have been used in many applications. Although embedded memory can be found in all recurrent network models, it is particularly prominent in NARX models. We show that using intelligent memory order selection through pruning and good initial heuristics significantly improves the generalization and predictive performance of these nonlinear systems on problems as diverse as grammatical inference and time series prediction.
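
Setting the paper's specific "delay damage" criterion aside, the general shape of memory-order selection by pruning can be sketched as a greedy loop that removes the delay tap whose absence hurts a validation criterion least. Everything here (the function prune_delays, the validation callback my_val_error, the stopping rule) is a hypothetical stand-in for illustration, not the paper's algorithm.

    import numpy as np

    def prune_delays(taps, val_error, max_taps):
        # taps: active delay indices; val_error(taps) -> validation error (float).
        taps = list(taps)
        while len(taps) > max_taps:
            # Error of each candidate model with one tap removed.
            errs = [val_error([d for d in taps if d != t]) for t in taps]
            del taps[int(np.argmin(errs))]    # drop the cheapest tap to lose
        return taps

    # e.g. kept = prune_delays(range(1, 9), my_val_error, max_taps=3)
    # where my_val_error retrains/evaluates a NARX model on the given taps.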


Neural Networks | 1996

Bounds on the complexity of recurrent neural network implementations of finite state machines

Bill G. Horne; Don R. Hush

In this paper the efficiency of recurrent neural network implementations of m-state finite state machines will be explored. Specifically, it will be shown that the node complexity for the unrestricted case can be bounded above by O(√m). It will also be shown that the node complexity is O(m log m) when the weights and thresholds are restricted to the set {−1, 1}, and O(m) when the fan-in is restricted to two. Matching lower bounds will be provided for each of these upper bounds, assuming that the state of the FSM can be encoded in a subset of the nodes of size ⌈log m⌉.
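
For a feel of how these bounds separate, the snippet below evaluates the three growth rates at one concrete machine size; constants are suppressed, so the numbers are orders of growth, not actual node counts.

    from math import sqrt, log2, ceil

    m = 1024  # number of FSM states
    print("unrestricted:       O(sqrt(m)) ->", round(sqrt(m)))      # 32
    print("weights in {-1,1}:  O(m log m) ->", round(m * log2(m)))  # 10240
    print("fan-in 2:           O(m)       ->", m)                   # 1024
    print("state encoding uses ceil(log m) =", ceil(log2(m)), "nodes")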


IEEE Transactions on Neural Networks | 1997

Time-delay neural networks: representation and induction of finite-state machines

Daniel S. Clouse; C.L. Giles; Bill G. Horne; G.W. Cottrell

In this work, we characterize and contrast the capabilities of the general class of time-delay neural networks (TDNNs) with those of input delay neural networks (IDNNs), the subclass of TDNNs whose delays are limited to the inputs. Each class of networks is capable of representing the same set of languages: those embodied by the definite memory machines (DMMs), a subclass of finite-state machines. We demonstrate the close affinity between TDNNs and DMM languages by learning a very large DMM (2048 states) using only a few training examples. Even though both architectures can represent the same class of languages, they have distinguishable learning biases. Intuition suggests that general TDNNs, which include delays in hidden layers, should perform well compared to IDNNs on problems in which the output can be expressed as a function of narrow input windows that repeat in time. On the other hand, these general TDNNs should perform poorly when the input windows are wide or there is little repetition. We confirm these hypotheses via a set of simulations and statistical analysis.
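
An IDNN makes the connection to definite memory machines transparent: the output at time t is a function of a fixed window of recent inputs only. The sketch below is a minimal illustrative IDNN; the window width, MLP size, and zero-padding are assumptions, not values from the paper.

    import numpy as np

    rng = np.random.default_rng(2)
    d, hidden = 4, 6                      # input window width, MLP width
    W1 = rng.standard_normal((hidden, d))
    W2 = rng.standard_normal(hidden)

    def idnn(u_seq):
        # Output at each t depends on the window u(t-d+1 .. t) only.
        out = []
        for t in range(len(u_seq)):
            window = [u_seq[t - k] if t - k >= 0 else 0.0
                      for k in range(d - 1, -1, -1)]
            out.append(W2 @ np.tanh(W1 @ np.array(window)))
        return out

    print(idnn(rng.standard_normal(8)))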


Neural Networks | 1995

Learning a class of large finite state machines with a recurrent neural network

C. Lee Giles; Bill G. Horne; Tsung-Nan Lin

One of the issues in any learning model is how it scales with problem size. The problem of learning finite state machines (FSMs) from examples with recurrent neural networks has been extensively explored. However, these results are somewhat disappointing in the sense that the machines that can be learned are too small to be competitive with existing grammatical inference algorithms. We show that a type of recurrent neural network (Narendra & Parthasarathy, 1990, IEEE Trans. Neural Networks, 1, 4–27) which has feedback but no hidden state neurons can learn a special type of FSM called a finite memory machine (FMM) under certain constraints. These machines have a large number of states (simulations are for 256- and 512-state FMMs) but have minimal order, relatively small depth, and little logic when the FMM is implemented as a sequential machine.


IEEE Transactions on Systems, Man, and Cybernetics | 1992

Error surfaces for multilayer perceptrons

Don R. Hush; Bill G. Horne; John M. Salas

This work examines characteristics of error surfaces for the multilayer perceptron neural network that help explain why learning techniques using hill-climbing methods are so slow in these networks, and that provide insights into techniques for speeding up learning. First, the surface has a stair-step appearance with many very flat and very steep regions. When the number of training samples is small, there is often a one-to-one correspondence between individual training samples and the steps on the surface. As the number of samples increases, the surface becomes smoother. In addition, the surface has flat regions that extend to infinity in all directions, making it dangerous to apply learning algorithms that perform line searches. The magnitude of the gradients on the surface strongly supports the need for floating-point representations during learning. The consequences of various weight initialization techniques are also discussed.
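
One simple way to see the stair-step and flat regions described above is to slice the error surface along a line in weight space. The toy probe below does exactly that for a tiny 2-3-1 MLP on random data; the network, data, and direction are all illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(3)
    X = rng.standard_normal((20, 2))              # small training set
    y = (X[:, 0] * X[:, 1] > 0).astype(float)     # toy binary targets
    w0 = rng.standard_normal(2 * 3 + 3)           # 2-3-1 MLP, flattened weights (no biases)
    direction = rng.standard_normal(w0.size)      # random direction in weight space

    def mse(w):
        W1, w2 = w[:6].reshape(3, 2), w[6:]
        h = np.tanh(X @ W1.T)
        return np.mean((h @ w2 - y) ** 2)

    for alpha in np.linspace(-5, 5, 11):          # error along the 1-D slice
        print(f"{alpha:+.1f}  {mse(w0 + alpha * direction):.4f}")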


Neural Networks | 1998

How embedded memory in recurrent neural network architectures helps learning long-term temporal dependencies

Tsung-Nan Lin; Bill G. Horne; C. Lee Giles

Learning long-term temporal dependencies with recurrent neural networks can be a difficult problem. It has recently been shown that a class of recurrent neural networks called NARX networks performs much better than conventional recurrent neural networks for learning certain simple long-term dependency problems. The intuitive explanation for this behavior is that the output memories of a NARX network can be manifested as jump-ahead connections in the time-unfolded network. These jump-ahead connections can propagate gradient information more efficiently, thus reducing the sensitivity of the network to long-term dependencies. This work gives empirical justification to our hypothesis that similar improvements in learning long-term dependencies can be achieved with other classes of recurrent neural network architectures simply by increasing the order of the embedded memory. In particular, we explore the impact of learning simple long-term dependency problems on three classes of recurrent neural network architectures: globally recurrent networks, locally recurrent networks, and NARX (output feedback) networks. Comparing the performance of these architectures with different orders of embedded memory on two simple long-term dependency problems shows that all of these classes of network architectures demonstrate significant improvement in learning long-term dependencies when the order of embedded memory is increased. These results can be important to a user committed to a specific recurrent neural network architecture, because simply increasing the embedded memory order of that architecture will make it more robust to the problem of long-term dependency learning.
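
The jump-ahead intuition can be made concrete with a little arithmetic: with embedded memory of order D, output taps let the gradient step back D time steps at once, so bridging a gap of k steps takes about ⌈k/D⌉ hops in the unfolded network instead of k. A few sample values:

    from math import ceil

    k = 64  # temporal gap to bridge
    for D in (1, 2, 4, 8):  # embedded memory order
        print(f"order {D}: shortest gradient path about {ceil(k / D)} hops")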


IEEE Transactions on Neural Networks | 1998

Efficient algorithms for function approximation with piecewise linear sigmoidal networks

Don R. Hush; Bill G. Horne

This paper presents a computationally efficient algorithm for function approximation with piecewise linear sigmoidal nodes. A one-hidden-layer network is constructed one node at a time using the well-known method of fitting the residual. The task of fitting an individual node is accomplished using a new algorithm that searches for the best fit by solving a sequence of quadratic programming problems. This approach offers significant advantages over derivative-based search algorithms (e.g., backpropagation and its extensions). Unique characteristics of this algorithm include finite-step convergence, a simple stopping criterion, solutions that are independent of initial conditions, good scaling properties, and a robust numerical implementation. Empirical results are included to illustrate these characteristics.
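
The "fitting the residual" construction is easy to sketch: each new node is trained against whatever error the current network still makes, and its output weight is set by least squares. In the sketch below the candidate nodes are random clipped ramps (piecewise linear sigmoids); the paper's actual node fit uses a sequence of quadratic programs rather than this random search, so treat it as illustrative only.

    import numpy as np

    rng = np.random.default_rng(4)
    x = np.linspace(-1, 1, 200)
    target = np.sin(3 * x)
    pred = np.zeros_like(x)

    for _ in range(6):                            # add six nodes greedily
        residual = target - pred
        # Candidate piecewise linear sigmoid: clipped ramp with random breakpoints.
        a, b = np.sort(rng.uniform(-1, 1, 2))
        node = np.clip((x - a) / max(b - a, 1e-6), 0.0, 1.0)
        # Least-squares output weight for this node against the residual.
        w = (node @ residual) / (node @ node + 1e-12)
        pred += w * node
        print(f"residual RMS: {np.sqrt(np.mean((target - pred) ** 2)):.4f}")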


Journal of Intelligent and Robotic Systems | 1990

Neural networks in robotics: A survey

Bill G. Horne; Mohammad Jamshidi; Nader Vadiee

The purpose of this paper is to provide an overview of the research being done in neural network approaches to robotics, outline the strengths and weaknesses of current approaches, and predict future trends in this area.

Collaboration


Dive into Bill G. Horne's collaborations.

Top Co-Authors

C. Lee Giles
Pennsylvania State University

Don R. Hush
University of New Mexico

Tsung-Nan Lin
National Taiwan University

Peter Tiňo
University of Birmingham