Publications
Featured research published by Martin Anthony.
Archive | 1999
Martin Anthony; Peter L. Bartlett
Introduction

Results in the previous chapter show that the VC-dimension of the class of functions computed by a network of linear threshold units with W parameters is no larger than a constant times W log W. These results cannot immediately be extended to networks of sigmoid units (with continuous activation functions), since the proofs involve counting the number of distinct outputs of all linear threshold units in the network as the input varies over m patterns, and a single sigmoid unit has an infinite number of output values. In this chapter and the next we derive bounds on the VC-dimension of certain sigmoid networks, including networks of units having the standard sigmoid activation function σ(α) = 1/(1 + e^(−α)). Before we begin this derivation, we study an example that shows that the form of the activation function is crucial.

The Need for Conditions on the Activation Functions

One might suspect that if we construct networks of sigmoid units with a well-behaved activation function, they will have finite VC-dimension. For instance, perhaps it suffices if the activation function is sufficiently smooth, bounded, and monotonically increasing. Unfortunately, the situation is not so simple. The following result shows that there is an activation function that has all of these properties, and even has its derivative monotonically increasing to the left of zero and decreasing to the right (so it is convex and concave in those regions), and yet is such that a two-layer network having only two computation units in the first layer, each with this activation function, has infinite VC-dimension.
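As a concrete illustration (not taken from the text), the following is a minimal Python sketch of the standard sigmoid activation σ(α) = 1/(1 + e^(−α)) and of the architecture mentioned above: a two-layer network with two sigmoid units in the first layer and a single real-valued output unit. All weight and bias values are arbitrary placeholders chosen only to make the sketch runnable.

```python
import numpy as np

def sigmoid(a):
    """Standard sigmoid activation: sigma(a) = 1 / (1 + exp(-a))."""
    return 1.0 / (1.0 + np.exp(-a))

def two_layer_net(x, w1, b1, w2, b2):
    """Two-layer network: two sigmoid units in the first layer feeding
    one linear output unit. Parameters are illustrative only."""
    h = sigmoid(w1 @ x + b1)   # outputs of the two first-layer sigmoid units
    return w2 @ h + b2         # real-valued output of the network

# Example with a 2-dimensional input; weights are arbitrary.
x  = np.array([0.5, -1.0])
w1 = np.array([[1.0, -2.0],
               [0.5,  3.0]])   # one row of weights per first-layer unit
b1 = np.array([0.0, -1.0])
w2 = np.array([0.7, -0.3])
b2 = 0.1
print(two_layer_net(x, w1, b1, w2, b2))
```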
Archive | 1999
Martin Anthony; Peter L. Bartlett
Feed-Forward Neural Networks

In this chapter, and many subsequent ones, we deal with feed-forward neural networks. Initially, we shall be particularly concerned with feed-forward linear threshold networks, which can be thought of as combinations of perceptrons. To define a neural network class, we need to specify the architecture of the network and the parameterized functions computed by its components. In general, a feed-forward neural network has as its main components a set of computation units, a set of input units, and a set of connections from input or computation units to computation units. These connections are directed; that is, each connection is from a particular unit to a particular computation unit. The key structural property of a feed-forward network, the feed-forward condition, is that these connections do not form any loops. This means that the units can be labelled with integers in such a way that if there is a connection from the unit labelled i to the computation unit labelled j, then i < j. Associated with each unit is a real number called its output. The output of a computation unit is a particular function of the outputs of the units that are connected to it. The feed-forward condition guarantees that the outputs of all units in the network can be written as an explicit function of the network inputs.
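To make the feed-forward condition concrete, here is a minimal Python sketch (not from the text) that evaluates a feed-forward linear threshold network by visiting computation units in increasing label order, which is possible precisely because every connection goes from a lower-labelled unit to a higher-labelled one. The particular network below, its unit structure, and its weights are illustrative assumptions, not the book's example.

```python
def threshold(a):
    """Linear threshold activation: output 1 if a >= 0, else 0."""
    return 1.0 if a >= 0 else 0.0

def evaluate_feedforward(inputs, units):
    """
    Evaluate a feed-forward linear threshold network.

    `inputs` are the values of the input units, labelled 0 .. len(inputs)-1.
    `units` lists the computation units in increasing label order; each unit
    is a tuple (predecessor_labels, weights, bias).  Because connections only
    go from lower to higher labels, one pass over `units` suffices.
    """
    outputs = list(inputs)
    for preds, weights, bias in units:
        activation = sum(w * outputs[p] for w, p in zip(weights, preds)) + bias
        outputs.append(threshold(activation))
    return outputs[-1]   # output of the last computation unit

# Illustrative network: two inputs (labels 0, 1), two first-layer threshold
# units (labels 2, 3), one output unit (label 4); together they compute XOR.
units = [
    ([0, 1], [1.0, 1.0], -0.5),   # unit 2: fires if x1 OR x2
    ([0, 1], [1.0, 1.0], -1.5),   # unit 3: fires if x1 AND x2
    ([2, 3], [1.0, -1.0], -0.5),  # unit 4: OR minus AND, i.e. XOR
]
for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x, evaluate_feedforward(list(x), units))
```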