## Interpretation of Results

Table 8.4 gives information on the partial derivatives of the models as well as the corresponding marginal significance or P-values of these estimates, based on the bootstrap distributions. We see that the estimates of the network and logit models are for all practical purposes identical. The probit model results do not differ by much, whereas the Weibull estimates differ by a bit more, but not by a large factor. Many studies using classification methods are not interested in the partial...
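For the logit model, the partial derivative of the predicted probability with respect to an input has a simple closed form: the coefficient times p(1 - p). The sketch below shows that calculation only; the coefficient values and the evaluation point are hypothetical, and the bootstrap step behind the P-values in Table 8.4 is not reproduced here.

```python
import math


def logistic(z):
    """Logistic (sigmoid) function used by the logit model."""
    return 1.0 / (1.0 + math.exp(-z))


def logit_partials(beta, x):
    """Partial derivatives dP/dx_k of P(y=1|x) = logistic(beta0 + sum_k beta_k * x_k).

    beta[0] is the intercept; beta[1:] pair up with the entries of x.
    """
    z = beta[0] + sum(b * v for b, v in zip(beta[1:], x))
    p = logistic(z)
    return [b * p * (1.0 - p) for b in beta[1:]]


# Hypothetical coefficients and evaluation point, chosen only for illustration.
beta = [0.0, 1.0, -2.0]
x = [0.0, 0.0]
partials = logit_partials(beta, x)
print(partials)
```

Because the derivative depends on p, it varies with the point at which it is evaluated; evaluating at the sample means of the inputs is one common convention.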

## Conclusion

Evaluation of network performance relative to the linear approaches should rest on some combination of in-sample and out-of-sample criteria, as well as common-sense criteria. We should never be afraid to ask how much these models add to our insight and understanding. Of course, we may use a neural network simply to forecast or simply to evaluate particular properties of the data, such as the significance of one or more input variables for explaining the behavior of the output variable....

## MATLAB Program Notes

Optimization software is quite common. The MATLAB function fminunc.m, for unconstrained minimization, part of the Optimization Toolbox, is the one used for the quasi-Newton gradient-based methods. It offers many options, such as the specification of tolerance criteria and the maximum number of iterations. This function, like most optimization software, is a minimization function. To maximize a likelihood function, we minimize the negative of the likelihood function. The genetic algorithm used above is...
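The idea of maximizing a likelihood by minimizing its negative can be sketched in Python with SciPy's `minimize` and its quasi-Newton BFGS method. This is an analog of the workflow described above, not the author's MATLAB code: the simulated logit data, the coefficient values, and the option settings (`gtol`, `maxiter`) are all assumptions made for illustration.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)

# Simulated data (illustrative only): a logit model with known coefficients.
n = 2000
X = np.column_stack([np.ones(n), rng.standard_normal(n)])
true_beta = np.array([0.5, -1.0])
prob = 1.0 / (1.0 + np.exp(-X @ true_beta))
y = (rng.uniform(size=n) < prob).astype(float)


def neg_log_likelihood(beta):
    """Negative logit log-likelihood; minimizing it maximizes the likelihood."""
    z = X @ beta
    # log(1 + exp(z)) computed stably with logaddexp
    return np.sum(np.logaddexp(0.0, z) - y * z)


def neg_log_likelihood_grad(beta):
    """Analytical gradient of the negative log-likelihood."""
    z = X @ beta
    return X.T @ (1.0 / (1.0 + np.exp(-z)) - y)


result = minimize(
    neg_log_likelihood,
    x0=np.zeros(2),
    jac=neg_log_likelihood_grad,
    method="BFGS",                            # quasi-Newton, gradient-based
    options={"gtol": 1e-6, "maxiter": 500},   # tolerance and iteration cap
)
beta_hat = result.x
print(beta_hat, result.nit)
```

The tolerance and iteration-limit options play the same role as the corresponding fminunc settings mentioned above, although the option names differ between the two libraries.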

## MATLAB Example

To give the preceding regression diagnostics clearer focus, the following MATLAB code randomly generates a time series $y = \sin(x)^2 + \exp(-x)$ as a nonlinear function of a random variable x, then uses a linear regression model to approximate it, and computes the in-sample diagnostic statistics. This program makes use of functions olsl.m, wnnestl.m, and bds.m, available on the webpage of the author. Create random regressors, constant term, Compute ols coefficients and diagnostics beta, tstat,...
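As a rough Python analog of the steps just described (not the author's MATLAB program, and without the author's own functions), the sketch below generates y = sin(x)^2 + exp(-x) from a standard normal x, fits a linear regression with a constant term, and computes a few in-sample diagnostics. The sample size, the seed, and the choice of diagnostics (t-statistics, R-squared, Durbin-Watson) are assumptions.

```python
import numpy as np

rng = np.random.default_rng(42)

# Create the random regressor and the nonlinear target (sample size assumed).
n = 1000
x = rng.standard_normal(n)
y = np.sin(x) ** 2 + np.exp(-x)

# Add a constant term and compute the OLS coefficients.
X = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta

# In-sample diagnostics.
dof = n - X.shape[1]
sigma2 = resid @ resid / dof                       # residual variance
cov_beta = sigma2 * np.linalg.inv(X.T @ X)         # coefficient covariance
tstat = beta / np.sqrt(np.diag(cov_beta))          # t-statistics
r_squared = 1.0 - resid @ resid / np.sum((y - y.mean()) ** 2)
durbin_watson = np.sum(np.diff(resid) ** 2) / (resid @ resid)

print(beta, tstat, r_squared, durbin_watson)
```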

## Feedforward Networks

Figure 2.1 illustrates the architecture of a neural network with one hidden layer containing two neurons, three input variables $x_i$, $i = 1, 2, 3$, and one output y. We see parallel processing. In addition to the sequential processing of typical linear systems, in which only observed inputs are used to predict an observed output by weighting the input neurons, the two neurons in the hidden layer process the inputs in a parallel fashion to improve the predictions. The connectors between the input...
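A forward pass through a network of this shape (three inputs, two hidden neurons, one output) can be sketched as follows. The logsigmoid activation, the linear output layer, and all weight values are assumptions made for illustration; the excerpt above does not specify them.

```python
import math


def logsigmoid(n):
    """Logsigmoid squasher, assumed here as the hidden-layer activation."""
    return 1.0 / (1.0 + math.exp(-n))


def feedforward(x, hidden_weights, hidden_biases, output_weights, output_bias):
    """One hidden layer: each neuron squashes a weighted sum of all inputs,
    and the output is a linear combination of the neuron activations."""
    hidden = [
        logsigmoid(b + sum(w * xi for w, xi in zip(ws, x)))
        for ws, b in zip(hidden_weights, hidden_biases)
    ]
    return output_bias + sum(g * h for g, h in zip(output_weights, hidden))


# Hypothetical inputs and weights: 3 inputs, 2 hidden neurons, 1 output.
x = [0.5, -1.0, 2.0]
hidden_weights = [[0.1, 0.2, 0.3], [-0.3, 0.2, -0.1]]
hidden_biases = [0.0, 0.0]
output_weights = [1.0, -1.0]
output_bias = 0.5

y_hat = feedforward(x, hidden_weights, hidden_biases, output_weights, output_bias)
print(y_hat)
```

Both hidden neurons receive the same three inputs at once, which is the parallel processing the text refers to.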

## Approximation with Polynomials and Neural Networks

We can see how efficient neural networks are relative to linear and polynomial approximations with a very simple example. We first generate a standard normal random variable x of sample size 1000, and then generate a variable $y = [\sin(x)]^2 + e^{-x}$. We can then do a series of regressions with polynomial approximators and a simple neural network with two neurons, and compare the multiple correlation coefficients. We do this with the following set of MATLAB commands, which access the following functions...
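The polynomial half of this comparison can be sketched in Python with NumPy. This is not the author's MATLAB code: the seed and the polynomial degrees are assumptions, and the two-neuron network fit is left out because it requires a nonlinear estimation routine.

```python
import numpy as np

rng = np.random.default_rng(0)

# Standard normal x of sample size 1000 and y = sin(x)^2 + exp(-x).
x = rng.standard_normal(1000)
y = np.sin(x) ** 2 + np.exp(-x)


def poly_r_squared(x, y, degree):
    """In-sample R-squared of a least-squares polynomial fit of the given degree."""
    coeffs = np.polyfit(x, y, degree)
    fitted = np.polyval(coeffs, x)
    ss_res = np.sum((y - fitted) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


# Degrees chosen for illustration: linear, quadratic, cubic.
r2_by_degree = {d: poly_r_squared(x, y, d) for d in (1, 2, 3)}
for degree, r2 in r2_by_degree.items():
    print(degree, round(r2, 4))
```

Since each higher-degree polynomial nests the lower ones, the in-sample R-squared cannot fall as the degree rises; the relevant question is how many parameters each approximator needs to reach a given fit.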

## Data Requirements: How Large for Predictive Accuracy?

Many researchers shy away from neural network approaches because they are under the impression that large amounts of data are required to obtain accurate predictions. Yes, it is true that there are more parameters to estimate in a neural network than in a linear model. The more complex the network, the more neurons there are. With more neurons, there are more parameters, and without a relatively large data set, degrees of freedom diminish rapidly in progressively more complex networks. In...
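The link between neurons, parameters, and degrees of freedom can be made concrete with a small calculation. The sketch assumes a single-hidden-layer feedforward network with one output and a constant (bias) term in every neuron and in the output; networks with other structures will have different counts.

```python
def network_parameter_count(n_inputs, n_neurons):
    """Parameters of a one-hidden-layer network with a single output:
    (n_inputs + 1) weights per hidden neuron, plus (n_neurons + 1) output weights."""
    return n_neurons * (n_inputs + 1) + (n_neurons + 1)


def linear_parameter_count(n_inputs):
    """Parameters of a linear model with a constant term."""
    return n_inputs + 1


def degrees_of_freedom(n_obs, n_params):
    """Observations left over after estimating n_params parameters."""
    return n_obs - n_params


# Hypothetical setting: 100 observations and 3 inputs.
n_obs, n_inputs = 100, 3
for n_neurons in (1, 2, 5, 10):
    k = network_parameter_count(n_inputs, n_neurons)
    print(n_neurons, k, degrees_of_freedom(n_obs, k))
print("linear", linear_parameter_count(n_inputs))
```

Under these assumptions each extra neuron adds `n_inputs + 2` parameters, so degrees of freedom fall linearly in the number of neurons and faster when there are more inputs.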

## Linear Principal Components

The linear approach to reducing a large set of variables to a smaller set of signals is called principal components analysis (PCA). PCA identifies linear projections or combinations of data that explain most of the variation of the original data, or extract most of the information from the larger set of variables, in decreasing order of importance. Obviously, and trivially, for a data set of K vectors, K linear combinations will explain the total variation of...
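A minimal PCA sketch in Python, using an eigendecomposition of the sample covariance matrix, illustrates both points: the components are ordered by the share of variation they explain, and all K components together explain all of it. The simulated data (four noisy copies of one common signal) are an assumption made for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated data: K = 4 series driven by one common signal plus noise.
n, k = 500, 4
common = rng.standard_normal(n)
data = np.column_stack([common + 0.3 * rng.standard_normal(n) for _ in range(k)])

# Center the data and decompose the sample covariance matrix.
centered = data - data.mean(axis=0)
cov = np.cov(centered, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)   # eigh returns ascending eigenvalues

# Reorder so that components come in decreasing order of importance.
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

explained = eigvals / eigvals.sum()      # share of total variation per component
components = centered @ eigvecs          # the principal components themselves

print(np.round(explained, 4))
```

Dimensionality reduction then amounts to keeping only the first few columns of `components`, those whose cumulative share in `explained` is judged large enough.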

## Forecasting, Classification, and Dimensionality Reduction

This book shows how neural networks may be put to work for more accurate forecasting, classification, and dimensionality reduction for better decision making in financial markets, particularly in the volatile emerging markets of Asia and Latin America, but also in domestic industrialized-country asset markets and business environments. The importance of better forecasting, classification methods, and dimensionality reduction methods for better decision making, in the light of increasing...

## Contents

- 1.1 Forecasting, Classification, and Dimensionality
- 1.3 The Interface
- 1.4 Plan of the Book
- 1 Econometric Foundations
- 2 What Are Neural Networks
  - 2.1 Linear Regression
  - 2.2 GARCH Nonlinear
    - 2.2.1 Polynomial
    - 2.2.2 Orthogonal
  - 2.3 Model
  - 2.4 What Is A Neural
    - 2.4.1 Feedforward
    - 2.4.2 Squasher
    - 2.4.3 Radial Basis
    - 2.4.4 Ridgelet
    - 2.4.5 Jump
    - 2.4.6 Multilayered Feedforward Networks
    - 2.4.7 Recurrent
    - 2.4.8 Networks with Multiple Outputs
  - 2.5 Neural Network Smooth-Transition Regime Switching
    - 2.5.1...