NNLEARN Instruction
NNLEARN(options) start end
# list of input series in Regression Format
# list of output series
NNLEARN uses backpropagation techniques to train a new or existing neural network to model the relationship between a set of input series and a set of output series. You can use NNTEST to generate output (the fitted values) from a neural network trained using NNLEARN.
Parameters
start, end    range of entries of the output series to use
Supplementary Cards
The first supplementary card supplies the list of input series, while the second card supplies the list of output series. The input series (first supplementary card) are analogous to explanatory or independent variables in a regression, while the output series (second card) are analogous to the dependent variable(s).
The first card supports regression format, which means that you can include lags or leads on the input list. The output list, however, must consist only of one or more series names (no lags or leads).
Options
SAVE=memory vector (required)
RESTART/[NORESTART]
The SAVE option saves the estimated weights of the neural network model, as well as general information about the model (number of inputs, number of outputs, etc.), in a VECTOR of REALS. The memory vector can be used in subsequent NNLEARN commands for additional training as described below, or with the NNTEST instruction to generate fitted values.
If you re-use an existing memory vector, NNLEARN will, by default, use the values in the vector as the starting point for further training. This allows you to continue training an existing network with additional NNLEARN commands. Use the RESTART option if you want to re-use the same vector name but want NNLEARN to start the estimation from a new set of randomly generated initial values. In either case, after the estimation is completed, the new weights and information are saved into the memory vector, replacing the earlier values.
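For example, the following sketch (the series names IN1, IN2, and OUT are hypothetical) trains a network, continues training from the saved weights, and then restarts from new random values:
*  initial training; creates the memory vector WEIGHTS
nnlearn(save=weights,hidden=2,iters=1000)
# in1 in2
# out
*  further training, starting from the weights saved above
nnlearn(save=weights,hidden=2,iters=1000)
# in1 in2
# out
*  discard the saved weights; start from random initial values
nnlearn(save=weights,restart,hidden=2,iters=1000)
# in1 in2
# out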
HIDDEN=number of hidden nodes [number of input series]
This specifies the number of hidden nodes in the network.
DIRECT/[NODIRECT]
If DIRECT, the model will include direct links between the input nodes and the output nodes. If NODIRECT, the only connection will be through hidden nodes.
YMIN=minimum scale value for outputs [minimum output value]
YMAX=maximum scale value for outputs [maximum output value]
These options set the upper and lower bounds on the values of the outputs of the neural network. They control how the internal values of the network (which range from 0 to 1 or –1 to 1 depending on the squashing function used) are mapped to the actual output values. By default, these are set to the maximum and minimum values of the original training sample.
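If the outputs are known to lie in a fixed range, such as a probability in [0,1], you can set the scaling explicitly rather than relying on the sample extremes. A sketch, with hypothetical series names:
nnlearn(save=mem,hidden=3,ymin=0.0,ymax=1.0)
# in1 in2
# prob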
PAD=fraction to pad outputs [0]
The values of the network outputs run from 0 to 1 or –1 to 1 (depending on the SQUASH choice). By default, the outputs are scaled so that this range maps to the smallest and largest values in the training sample output series. If the model is ever used with samples that should produce larger or smaller output values than were present in the training sample, the outputs produced by NNTEST will be artificially truncated. You can avoid this by using the PAD option to provide a value between 0 and 1 which indicates the fraction of “padding” to include when rescaling the output variables.
If, for instance, you choose PAD=.2, the smallest output value in the training sample will be mapped to .1 while the largest will be mapped to .9. If the original range of the data were from 7.2 to 8, this would allow the network to produce forecasts up to 8.1 and down to 7.1.
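Assuming the padding is split evenly between the two ends (as in the example above), a training output \(y\) is mapped to the scaled value \(s = p/2 + (1 - p)(y - y_{\min})/(y_{\max} - y_{\min})\) for a 0-to-1 squash, where \(p\) is the PAD fraction and \(y_{\min}\) and \(y_{\max}\) are the sample extremes. Inverting this mapping shows that the network can produce outputs anywhere between \(y_{\min} - \tfrac{p/2}{1-p}(y_{\max} - y_{\min})\) and \(y_{\max} + \tfrac{p/2}{1-p}(y_{\max} - y_{\min})\), which reproduces the 7.1-to-8.1 range in the example.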
MODE=[EPOCH]/EXAMPLE
This controls how often new weights are computed. With EPOCH, NNLEARN does a forward and backward pass through the network for all observations in the sample range before recomputing the weights (batch updating). With EXAMPLE, the weights are recomputed after the forward and backward pass for each observation in the sample (on-line updating).
SQUASH=[LOGISTIC]/HT1/HT2
Selects the sigmoidal filter to be used for “squashing” the node outputs:
LOGISTIC    \(1/(1 + e^{-u})\)
HT1    \(\tanh(u)\)
HT2    \(\tanh(u/2)\)
where u is the basic output of a node: a linear function of the input values and the current weights. The actual output of each node is the chosen sigmoidal function applied to u. These functions scale the outputs so that they fall between 0 and 1 (LOGISTIC) or between –1 and 1 (HT1 and HT2).
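The three choices are closely related: since \(\tanh(u/2) = 2/(1 + e^{-u}) - 1\), HT2 is just the LOGISTIC function rescaled from (0,1) to (–1,1), while \(\tanh(u) = 2/(1 + e^{-2u}) - 1\) is a steeper version of the same shape.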
TRACE/[NOTRACE]
If you use TRACE, NNLEARN will issue a progress report after each iteration.
SMPL=Standard SMPL option [unused]
This series or formula should return 0 for entries you want to omit from the training, and non-zero values for the other entries.
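For example, to exclude entry 3 from the training (a sketch; the dummy series name is arbitrary):
set skip3 = (t<>3)
nnlearn(save=mem,hidden=2,smpl=skip3)
# x1 x2
# xor_actual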
ITERS=maximum number of iterations [no default limit]
Sets a limit on the maximum number of iterations that will be performed. Each “iteration” involves less computation than an iteration of, say, a MAXIMIZE instruction, but also has less of a chance to produce an improvement. You might, on occasion, need an iteration limit in the thousands.
CVCRIT=convergence criterion [.00001]
RSQUARED=minimum R-squared level
These mutually exclusive options provide two ways of specifying the convergence criterion for the learning process. They can produce equivalent fits; they simply offer different ways of thinking about the stopping rule.
If you use the CVCRIT option, NNLEARN will train the model until the mean square error (the mean of the squared differences between the output series and the current network outputs) is less than the CVCRIT value.
If you use the RSQUARED option, NNLEARN will train the model until the mean square error is less than \((1 - R^2)\sigma^2\), where \(R^2\) is the value specified in the RSQUARED option, and \(\sigma^2\) is the smallest of the output series variances.
The default setting is CVCRIT=.00001. If you specify both options, NNLEARN will use the CVCRIT setting.
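For example, with RSQUARED=.99 and an output series whose variance is 4.0, training continues until the mean square error falls below \((1 - .99) \times 4.0 = .04\).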
THETA=theta parameter [0.7] (must be in the range \(0 < \theta < 1\))
KAPPA=kappa parameter [0.1] (must be in the range \(0 \le \kappa < 1\))
PHI=phi parameter [0.5] (must be in the range \(0 < \varphi < 1\))
MU=momentum parameter [0.0] (must be in the range \(0 \le \mu < 1\))
These allow you to control the parameters of the adaptive learning rate algorithm. THETA controls how the current derivative is averaged with recent derivatives. KAPPA controls how a weight’s learning rate is increased when the current derivative has the same sign as recent derivatives, while PHI controls how it is reduced when the derivative changes direction. A non-zero value for the MU option adds “momentum” to the adaptive learning process, which helps prevent temporary changes in direction from adversely affecting the learning process (that is, it damps wild swings in different directions).
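For instance, you might add momentum and allow extra iterations when the training appears to be oscillating (the option values here are purely illustrative):
nnlearn(save=mem,hidden=2,mu=0.3,iters=5000)
# x1 x2
# xor_actual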
Example
As a simple demonstration, we’ll fit a neural network model for the XOR (exclusive OR) function. The XOR function takes two binary values (1 or 0, true or false) as input and returns true when exactly one of the inputs is true, false otherwise.
all 4
data(unit=input,org=obs) / x1 x2 xor_actual
0 0 0
0 1 1
1 0 1
1 1 0
nnlearn(save=mem,rsquared=.9999,hidden=2)
# x1 x2
# xor_actual
nntest / mem
# x1 x2
# xor_output
print / x1 x2 xor_actual xor_output
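If the training converges (the RSQUARED=.9999 criterion is quite strict), the printed XOR_OUTPUT values should be very close to the actual values 0, 1, 1, 0.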