## Neural Network Models

Neural networks are loosely modeled on the human brain and are characterized by the pattern of connections between the network layers, the number of neurons in each layer, the learning algorithm, and the neuron activation functions. Generally speaking, a neural network is a set of connected input and output units in which each connection has an associated weight. During the learning phase, the network learns by adjusting the weights so that it can correctly predict or classify the output target for a given set of input samples. Of the numerous neural network architectures that have been developed in the literature, three important types were implemented in this study to compare their predictive ability against the classical linear regression model. The following three subsections give a brief introduction to these three neural network models.

_Multilayer Feed-Forward Neural Network_

Multilayer feed-forward neural networks have been widely used for financial forecasting due to their ability to correctly classify and predict the dependent variable (Vellido, Lisboa, & Vaughan, 1999). Backpropagation is by far the most popular algorithm for training multilayer feed-forward neural networks. Since feed-forward neural networks are well known and described elsewhere, the network structures and backpropagation algorithms are not described here. Readers interested in greater detail can refer to earlier chapters or to Rumelhart and McClelland (1986) for a comprehensive explanation of the backpropagation algorithm.

During neural network modeling, Malliaris and Salchenberger (1993) suggest that validation techniques are required to identify the proper number of hidden layer nodes, thus avoiding underfitting (too few neurons) and overfitting (too many neurons). Generally, too many neurons in the hidden layers produce excessive connections, yielding a neural network that memorizes the data and lacks the ability to generalize. One approach that can be used to avoid overfitting is n-fold cross-validation (Peterson, St Clair, Aylward, & Bond, 1995). The five-fold cross-validation used in this experiment can be described as follows: the data sample is randomly partitioned into five equal-sized folds and the network is trained five times. In each training pass, one fold is omitted from the training data and the resulting model is validated on the cases in that omitted fold, which is known as the validation set. The first period (200 months) of the data set is used for the five-fold cross-validation experiment, leaving the second period as truly untouched out-of-sample data. The average root-mean-squared error over the five unseen validation sets is normally a good predictor of the error rate of a model built from all the data.
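The five-fold procedure described above can be sketched as follows. This is a minimal illustration, not the study's implementation: the data are synthetic (200 rows standing in for the 200 months), an ordinary least-squares fit stands in for the neural network, and the names `five_fold_rmse`, `fit_ols`, and `predict_ols` are hypothetical.

```python
import numpy as np

def five_fold_rmse(x, y, fit, predict, n_folds=5, seed=0):
    """Average validation RMSE over n_folds randomly partitioned folds."""
    rng = np.random.default_rng(seed)
    # Randomly partition the sample indices into equal-sized folds.
    folds = np.array_split(rng.permutation(len(x)), n_folds)
    rmses = []
    for k in range(n_folds):
        val_idx = folds[k]  # the omitted fold is the validation set
        train_idx = np.concatenate(
            [folds[j] for j in range(n_folds) if j != k]
        )
        model = fit(x[train_idx], y[train_idx])
        residual = predict(model, x[val_idx]) - y[val_idx]
        rmses.append(float(np.sqrt(np.mean(residual ** 2))))
    return float(np.mean(rmses)), rmses

# Stand-in model (assumption): ordinary least squares with an intercept.
def fit_ols(x, y):
    design = np.column_stack([np.ones(len(x)), x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

def predict_ols(coef, x):
    return np.column_stack([np.ones(len(x)), x]) @ coef

# Synthetic data (assumption): 200 observations, so each fold holds 40 cases.
data_rng = np.random.default_rng(42)
x = data_rng.normal(size=(200, 3))
y = x @ np.array([0.5, -0.2, 0.1]) + data_rng.normal(scale=0.1, size=200)

avg_rmse, fold_rmses = five_fold_rmse(x, y, fit_ols, predict_ols)
print(f"average validation RMSE: {avg_rmse:.4f}")
```

Passing `fit` and `predict` as arguments keeps the fold logic independent of the model, so the same loop could in principle score different candidate network structures.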

Another approach that can be used to achieve better generalization in trained neural networks is called early stopping (Demuth & Beale, 1998). This technique can be used effectively together with the cross-validation experiment: the validation set is used to decide when to stop training. When the network begins to overfit the data, the error on the validation cases will typically begin to rise. In this study, training was stopped when the validation error had increased for five iterations, and the weights and biases were then returned to their values at the minimum of the validation error. The average error on the validation cases (40 months in each fold for this study) from the n-fold cross-validation experiment is then used as the criterion for determining the network structure, namely the number of hidden layers, number of neurons, learning algorithm, learning rate, and activation functions.
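The early-stopping rule can be sketched in a few lines. This is an illustrative sketch under stated assumptions: "increased for five iterations" is read here as five consecutive iterations without improving on the best validation error seen so far, and `train_with_early_stopping`, `update`, and `val_error` are hypothetical names. A toy one-parameter "model" replaces the network so that the example is self-contained.

```python
import copy

def train_with_early_stopping(weights, update, val_error,
                              patience=5, max_iter=1000):
    """Train until validation error fails to improve `patience` times in a row.

    Returns the weights at the minimum validation error, that error,
    and the iteration at which training stopped.
    """
    best_weights = copy.deepcopy(weights)
    best_error = val_error(weights)
    fails = 0
    iteration = 0
    for iteration in range(1, max_iter + 1):
        weights = update(weights)          # one training iteration
        err = val_error(weights)
        if err < best_error:
            # New minimum: remember these weights and reset the counter.
            best_error = err
            best_weights = copy.deepcopy(weights)
            fails = 0
        else:
            fails += 1
            if fails >= patience:
                break                      # validation error kept rising
    return best_weights, best_error, iteration

# Toy stand-in (assumption): a scalar weight drifts upward by 0.1 per
# iteration, and the validation error is smallest near w = 1.
best_w, best_err, stopped_at = train_with_early_stopping(
    0.0,
    update=lambda w: w + 0.1,
    val_error=lambda w: (w - 1.0) ** 2,
)
print(best_w, best_err, stopped_at)
```

Keeping a copy of the best weights is what allows the return to the validation minimum after the stopping condition is met.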

_Generalized Regression Neural Network_

While a number of articles address the ability of multilayer feed-forward neural network models for financial forecasting, none of these studies has practically applied the generalized regression neural network (GRNN) to forecast stock returns. Similar to the feed-forward neural networks, the GRNN can be used for function approximation to estimate the values of continuous dependent variables, such as future position, future values, and multivariable interpolation. The GRNN is a kind of radial-basis-function network and also looks similar to a feed-forward neural network responding to an input pattern by processing the input variables from one layer to the next with no feedback paths (Specht, 1991). However, its operation is fundamentally different. The GRNN is based on nonlinear regression theory that can be used when an assumption of linearity is not justified.

The training set contains the values of x (independent variables) that correspond to the values of y (dependent variable). This regression method produces the optimal expected value of y, which minimizes the mean-squared error. The GRNN approach removes the need to assume a specific functional form, allowing the appropriate form to be expressed as a probability density function that is empirically determined from the observed data using Parzen window estimation (Parzen, 1962). Therefore, this approach is not limited to any particular form and requires no prior knowledge of the estimated function. The GRNN formula is briefly described as follows:

$$E[y \mid x] = \frac{\int_{-\infty}^{\infty} y \, f(x, y) \, dy}{\int_{-\infty}^{\infty} f(x, y) \, dy} \tag{5}$$

where $y$ is the output of the estimator, $x$ is the estimator input vector, $E[y \mid x]$ is the expected value of $y$ given $x$, and $f(x, y)$ is the known joint continuous probability density function of $x$ and $y$. When the density $f(x, y)$ is not known, it is estimated from a sample of observations of $x$ and $y$. For a nonparametric estimate of $f(x, y)$, the class of consistent estimators proposed by Parzen (1962) is used. As a result, the following equation gives the optimal expected value of $y$:

$$y = \frac{\sum_{i} w_i h_i}{\sum_{i} h_i}$$

Figure 1. Generalized regression neural network architecture (input layer, hidden layer 1, hidden layer 2, output layer)

where $w_i$ is the target output corresponding to the input training vector $x_i$ and the output $y$, $h_i = \exp[-D_i^2 / (2\sigma^2)]$ is the output of hidden neuron $i$, $D_i^2 = (x - u_i)^T (x - u_i)$ is the squared distance between the input vector $x$ and the training vector $u_i$, and $\sigma$ is the smoothing parameter of the radial basis function. The GRNN architecture is shown in Figure 1. A neuron in hidden layer 1 is created to hold the input vector, and the weight between this newly created hidden neuron and the neuron in hidden layer 2 is assigned the target value.
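The GRNN estimate defined above, a weighted average of the stored targets with radial-basis weights, can be sketched in NumPy. This is a minimal illustration rather than the implementation used in the study: the function name `grnn_predict`, its argument names, and the tiny data set are assumptions, and each training vector $u_i$ is taken to be a stored training input.

```python
import numpy as np

def grnn_predict(x_query, train_x, train_y, sigma):
    """GRNN output: sum_i(w_i * h_i) / sum_i(h_i) for each query row."""
    x_query = np.atleast_2d(np.asarray(x_query, dtype=float))
    train_x = np.asarray(train_x, dtype=float)
    train_y = np.asarray(train_y, dtype=float)
    # Squared distance D_i^2 between every query and every stored vector u_i.
    diff = x_query[:, None, :] - train_x[None, :, :]
    d2 = np.sum(diff ** 2, axis=2)
    # Hidden-neuron outputs h_i = exp(-D_i^2 / (2 sigma^2)).
    h = np.exp(-d2 / (2.0 * sigma ** 2))
    # Targets w_i act as the weights into the summation layer.
    return (h @ train_y) / h.sum(axis=1)

# Illustrative data (assumption): three one-dimensional training cases.
train_x = np.array([[0.0], [1.0], [2.0]])
train_y = np.array([1.0, 3.0, 5.0])

sharp = grnn_predict([[0.0]], train_x, train_y, sigma=0.05)
smooth = grnn_predict([[0.0]], train_x, train_y, sigma=1000.0)
print(sharp, smooth)
```

A small $\sigma$ makes the estimate follow the nearest stored target, while a very large $\sigma$ pulls it toward the mean of all targets, which is why the smoothing parameter is the quantity to tune. Note that with a very small $\sigma$ and a query far from every training vector, all $h_i$ can underflow to zero; this sketch does not guard against that case.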
