Example: Neural Network Forecasting Application

This application illustrates one common approach to time series prediction with a neural network. Here the output target is a single time series. In general, the inputs to such a network consist of lagged values of the time series together with other concomitant variables, both continuous and categorical. In this application, however, only the first three lags of the time series are used as network inputs.

The objective is to train a neural network for forecasting the series Y_t, t = 0, 1, 2, ..., from its first three lags, i.e.,

Y_t = f(Y_{t-1}, Y_{t-2}, Y_{t-3})
Since this series consists of data from several company departments, lagging of the series must be done within departments, which creates many missing values. The original data contain 118,519 training patterns. After lagging, 16,507 are identified as missing and removed, leaving 102,012 usable training patterns. Missing values are denoted by a sentinel that cannot occur in the data, the value -9,999,999,999.0.
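The lagging step can be pictured with a short sketch. The following is not part of the example source: the array names, the department vector, and the buildLags() helper are hypothetical, and only the missing-value sentinel comes from the example. It assumes rows are ordered as described in the source header below, with the most recent observation first within each department.

public class LagWithinDepartments {

    private static final double MISSING = -9999999999.0; // sentinel from the example

    // y[i] is the series value for row i and dept[i] its department code.
    // Rows are assumed sorted newest-first within each department, so the
    // previous value in time is the next row down.  Returns
    // x[i] = {Y_{t-1}, Y_{t-2}, Y_{t-3}}, using MISSING whenever a lag
    // would cross a department boundary or run off the end of the data.
    static double[][] buildLags(double[] y, int[] dept) {
        double[][] x = new double[y.length][3];
        for (int i = 0; i < y.length; i++) {
            for (int lag = 1; lag <= 3; lag++) {
                int k = i + lag; // earlier in time, since rows are newest-first
                x[i][lag - 1] = (k < y.length && dept[k] == dept[i])
                        ? y[k] : MISSING;
            }
        }
        return x;
    }
}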

The structure of the network consists of three input nodes and two layers, with three perceptrons in the hidden layer and one in the output layer. The following figure depicts this structure:

Figure 9. An example 2-layer Feed Forward Neural Network

There are a total of 16 weights in this network, including the 4 bias weights. All perceptrons in the hidden layer use logistic activation, and the output perceptron uses linear activation. Because of the large number of training patterns, the Activation.LOGISTIC_TABLE activation function is used instead of Activation.LOGISTIC. Activation.LOGISTIC_TABLE uses a table lookup for calculating the logistic activation function, which significantly reduces training time. However, the two are not completely interchangeable: if a network is trained using Activation.LOGISTIC_TABLE, then the same activation function must be used for forecasting.
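The practical consequence is that a program that rebuilds the network for forecasting, rather than deserializing the trained object, must set Activation.LOGISTIC_TABLE again. The following is a minimal sketch under that assumption; the weight-loading step is left hypothetical:

import com.imsl.datamining.neural.*;

public class RebuildForForecasting {
    public static void main(String[] args) {
        // Rebuild the same 3-3-1 architecture used in training.
        FeedForwardNetwork network = new FeedForwardNetwork();
        network.getInputLayer().createInputs(3);
        network.createHiddenLayer().createPerceptrons(3);
        network.getOutputLayer().createPerceptrons(1);
        network.linkAll();
        Perceptron[] nodes = network.getPerceptrons();
        for (int i = 0; i < nodes.length - 1; i++) {
            // Must match training: Activation.LOGISTIC here would reproduce
            // the weights but not the table-lookup outputs used in training.
            nodes[i].setActivation(Activation.LOGISTIC_TABLE);
        }
        nodes[nodes.length - 1].setActivation(Activation.LINEAR);

        double[] trainedWeights = new double[network.getNumberOfWeights()];
        // ... fill trainedWeights with the saved values (hypothetical step)
        network.setWeights(trainedWeights);
    }
}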

Every input node is linked to every perceptron in the hidden layer, and each hidden-layer perceptron is linked to the output perceptron. All inputs and the output target are then scaled using the ScaleFilter class to ensure that they fall in the range [0, 1]. As a consequence, forecasts must be unscaled using the decode() method of the ScaleFilter class.
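The scaling round trip can be seen in isolation in the following sketch, which uses the same bounds the example applies below; the sample values are hypothetical:

import com.imsl.datamining.neural.ScaleFilter;

public class ScaleRoundTrip {
    public static void main(String[] args) {
        ScaleFilter scaleFilter = new ScaleFilter(ScaleFilter.BOUNDED_SCALING);
        scaleFilter.setBounds(-105000, 25000000, 0, 1);

        double[][] y = {{1500000.0}, {-2000.0}, {24000000.0}}; // hypothetical data
        scaleFilter.encode(0, y); // scale column 0 into [0, 1] for training
        System.out.println("scaled:   " + y[0][0]);
        scaleFilter.decode(0, y); // forecasts are unscaled the same way
        System.out.println("unscaled: " + y[0][0]);
    }
}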

Training is conducted using the epoch trainer. This trainer allows users to customize training into two stages. Typically this is necessary when training using a large number of training patterns. Stage I training uses randomly selected subsets of training patterns to search for network solutions. Stage II training is optional, and uses the entire set of training patterns. For larger sets of training patterns, training could take many hours, or even days. In that case, Stage II training might be bypassed.

In this example, Stage I training is conducted using the Quasi-Newton trainer applied to 20 epochs, each consisting of 5,000 randomly selected observations. Stage II training also uses the Quasi-Newton trainer. The training results for each Stage I epoch and for the final Stage II solution are stored in a training log file NeuralNetworkEx1.log.
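Condensed to its essentials, this configuration looks like the sketch below; the complete setup, including logging and random number seeds, is in the createTrainer() method of the source that follows:

import com.imsl.datamining.neural.*;

public class TrainerSketch {
    static Trainer createEpochTrainer() {
        QuasiNewtonTrainer stage1 = new QuasiNewtonTrainer();
        stage1.setMaximumTrainingIterations(5000);
        QuasiNewtonTrainer stage2 = new QuasiNewtonTrainer();
        stage2.setMaximumTrainingIterations(5000);

        EpochTrainer epoch = new EpochTrainer(stage1, stage2);
        epoch.setNumberOfEpochs(20); // Stage I: 20 epochs ...
        epoch.setEpochSize(5000);    // ... of 5,000 randomly selected patterns
        return epoch;
    }
}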

The training patterns are contained in two data files, continuous.txt and output.txt, which share the same format. The first line of each file contains the number of columns, or variables, in that file. The second line contains space-delimited integer values: the column indices associated with the incoming data. The remaining lines contain space-delimited, floating-point values, one for each of the incoming variables.

For example, the first four lines of the continuous.txt file consist of the following:

3
1 2 3
0 0 0
0 0 0

There are 3 continuous input variables, which are numbered, or labeled, 1, 2, and 3.

Source Code


import com.imsl.datamining.neural.*;
import com.imsl.math.*;
import java.io.*;
import java.util.*;
import java.util.logging.*;

//***************************************************************************
// NeuralNetworkEx1.java                                                     
// Two Layer Feed-Forward Network Complete Example for Simple Time Series    
//***************************************************************************
// Synopsis:   This example illustrates how to use a Feed-Forward Neural     
//             Network to forecast time series data.  The network target is a
//             time series and the three inputs are the 1st, 2nd, and 3rd lag
//             for the target series.                                        
// Activation: Logistic_Table in Hidden Layer, Linear in Output Layer        
// Trainer:    Epoch Trainer: Stage I - Quasi-Newton, Stage II - Quasi-Newton
// Inputs:     Lags 1-3 of the time series                                   
// Output:     A Time Series sorted chronologically in descending order,     
//             i.e., the most recent observations occur before the earliest, 
//             within each department                                        
//***************************************************************************
public class NeuralNetworkEx1 implements Serializable {

    private static String QuasiNewton = "quasi-newton";
    private static String LeastSquares = "least-squares";
    // **********************************************************************
    // Network Architecture   
    // **********************************************************************
    private static int nObs = 118519;      // number of training patterns
    private static int nInputs = 3;        // three network inputs
    private static int nCategorical = 0;   // no categorical input attributes
    private static int nContinuous = 3;    // three continuous input attributes
    private static int nOutputs = 1;       // one continuous output
    private static int nLayers = 2;        // number of perceptron layers
    private static int nPerceptrons = 3;   // perceptrons in hidden layer
    private static int perceptrons[] = {3};// number of perceptrons in each
    // hidden layer
    // PERCEPTRON ACTIVATION	
    private static Activation hiddenLayerActivation
            = Activation.LOGISTIC_TABLE;
    private static Activation outputLayerActivation = Activation.LINEAR;
    // **********************************************************************
    // Epoch Training Optimization Settings
    // **********************************************************************
    private static boolean trace = true; //trainer logging
    private static int nEpochs = 20;     //number of epochs
    private static int epochSize = 5000; //samples per epoch
    // Stage I Trainer - Quasi-Newton Trainer *******************************
    private static int stage1Iterations = 5000;           //max. iterations
    private static double stage1MaxStepsize = 50.0;       //max. stepsize
    private static double stage1StepTolerance = 1e-09;    //step tolerance
    private static double stage1RelativeTolerance = 1e-11;//rel. tolerance
    // Stage II Trainer - Quasi-Newton Trainer ******************************
    private static int stage2Iterations = 5000;           //max. iterations
    private static double stage2MaxStepsize = 50.0;       //max. stepsize
    private static double stage2StepTolerance = 1e-09;    //step tolerance
    private static double stage2RelativeTolerance = 1e-11;//rel. tolerance
    // **********************************************************************
    // FILE NAMES AND FILE READER DEFINITIONS
    // **********************************************************************
    // READERS
    private static BufferedReader contFileInputStream;
    private static BufferedReader outputFileInputStream;
    // OUTPUT FILES
    // File Name for training log produced when trace = true
    private static String trainingLogFile = "NeuralNetworkEx1.log";
    // File Name for Serialized Network
    private static String networkFileName = "NeuralNetworkEx1.ser";
    // File Name for Serialized Trainer
    private static String trainerFileName = "NeuralNetworkTrainerEx1.ser";
    // File Name for Serialized xData File (training input attributes)
    private static String xDataFileName = "NeuralNetworkxDataEx1.ser";
    // File Name for Serialized yData File (training output targets)
    private static String yDataFileName = "NeuralNetworkyDataEx1.ser";
    // INPUT FILES
    // Continuous input attributes file.  File contains Lags 1-3 of series
    private static String contFileName = "continuous.txt";
    // Continuous network targets file.  File contains the original series
    private static String outputFileName = "output.txt";
    // **********************************************************************
    // Data Preprocessing Settings
    // **********************************************************************
    private static double lowerDataLimit = -105000;      // lower scale limit
    private static double upperDataLimit = 25000000;     // upper scale limit
    private static double missingValue = -9999999999.0;  // missing values
    // indicator
    // **********************************************************************
    // Time Parameters for Tracking Training Time
    // **********************************************************************
    private static Calendar startTime;

    // **********************************************************************
    // Error Message Encoding for Stage II Trainer - Quasi-Newton Trainer
    // **********************************************************************
    // Note:  For the Epoch Trainer, the error status returned is the status
    // for the Stage II trainer, unless Stage II training is not used.
    // **********************************************************************
    private static String errorMsg = "";
    // Error Status Messages for the Quasi-Newton Trainer
    private static String errorMsg0
            = "--> Network Training";
    private static String errorMsg1
            = "--> The last global step failed to locate a lower point "
            + "than the\ncurrent error value.  The current solution may "
            + "be an approximate\nsolution and no more accuracy is "
            + "possible, or the step tolerance\nmay be too large.";
    private static String errorMsg2
            = "--> Relative function convergence;  both the actual "
            + "and\npredicted relative reductions in the error function "
            + "are less than\nor equal to the relative function "
            + "convergence tolerance.";
    private static String errorMsg3
            = "--> Scaled step tolerance satisfied;  the current solution "
            + "may be\nan approximate local solution, or the algorithm is "
            + "making very slow\nprogress and is not near a solution, or "
            + "the step tolerance is too big.";
    private static String errorMsg4
            = "--> Quasi-Newton Trainer threw a \n"
            + "MinUnconMultiVar.FalseConvergenceException.";
    private static String errorMsg5
            = "--> Quasi-Newton Trainer threw a \n"
            + "MinUnconMultiVar.MaxIterationsException.";
    private static String errorMsg6
            = "--> Quasi-Newton Trainer threw a \n"
            + "MinUnconMultiVar.UnboundedBelowException.";

    // **********************************************************************
    // MAIN
    // **********************************************************************
    public static void main(String[] args) throws Exception {
        double weight[];   // Network weights
        double gradient[]; // Network gradient after training
        double xData[][];  // Training Patterns Input Attributes
        double yData[][];  // Training Targets Output Attributes
        double contAtt[][];// A 2D matrix for the continuous training attributes
        double outs[][];   // A matrix containing the training output targets
        int i, j, m = 0;  // Array indices
        int nWeights = 0;  // Number of network weights
        int nCol = 0;  // Number of data columns in input file
        int ignore[];      // Array of 0's and 1's (0=missing value)
        int cont_col[], outs_col[], isMissing[] = {0};
        // ******************************************************************
        // Initialize timers
        // ******************************************************************
        startTime = Calendar.getInstance();
        System.out.println("--> Starting Data Preprocessing at: "
                + startTime.getTime());

        // ******************************************************************
        // Read continuous attribute data
        // ******************************************************************
        // Initialize ignore[] for identifying missing observations
        ignore = new int[nObs];
        isMissing = new int[1];
        openInputFiles();

        nCol = readFirstLine(contFileInputStream);

        nContinuous = nCol;
        System.out.println("--> Number of continuous variables:     "
                + nContinuous);
        // If the number of continuous variables is greater than zero
        // then read the remainder of this file (contFile)
        if (nContinuous > 0) {
            // contFile contains continuous attribute data
            contAtt = new double[nObs][nContinuous];
            cont_col = readColumnLabels(contFileInputStream, nContinuous);
            for (i = 0; i < nObs; i++) {
                isMissing[0] = -1;
                contAtt[i] = readDataLine(contFileInputStream,
                        nContinuous, isMissing);
                ignore[i] = isMissing[0];
                if (isMissing[0] >= 0) {
                    m++;
                }
            }
        } else {
            nContinuous = 0;
            contAtt = new double[1][1];
            contAtt[0][0] = 0;
        }
        closeFile(contFileInputStream);
        // ******************************************************************
        // Read continuous output targets
        // ******************************************************************
        nCol = readFirstLine(outputFileInputStream);
        nOutputs = nCol;
        System.out.println("--> Number of output variables:         "
                + nOutputs);
        outs = new double[nObs][nOutputs];
        // Read numeric labels for continuous input attributes
        outs_col = readColumnLabels(outputFileInputStream, nOutputs);

        m = 0;
        for (i = 0; i < nObs; i++) {
            isMissing[0] = ignore[i];
            outs[i] = readDataLine(outputFileInputStream, nOutputs, isMissing);
            ignore[i] = isMissing[0];
            if (isMissing[0] >= 0) {
                m++;
            }
        }
        System.out.println("--> Number of Missing Observations:     " + m);
        closeFile(outputFileInputStream);
        // Remove missing observations using the ignore[] array
        m = removeMissingData(nObs, nContinuous, ignore, contAtt);
        m = removeMissingData(nObs, nOutputs, ignore, outs);

        System.out.println("--> Total Number of Training Patterns:  " + nObs);
        nObs = nObs - m;
        System.out.println("--> Number of Usable Training Patterns: " + nObs);

        // ******************************************************************
        // Setup Method and Bounds for Scale Filter
        // ******************************************************************
        ScaleFilter scaleFilter
                = new ScaleFilter(ScaleFilter.BOUNDED_SCALING);
        scaleFilter.setBounds(lowerDataLimit, upperDataLimit, 0, 1);
        // ******************************************************************
        // PREPROCESS TRAINING PATTERNS
        // ******************************************************************
        System.out.println("--> Starting Preprocessing of Training Patterns");
        xData = new double[nObs][nContinuous];
        yData = new double[nObs][nOutputs];
        for (i = 0; i < nObs; i++) {
            for (j = 0; j < nContinuous; j++) {
                xData[i][j] = contAtt[i][j];
            }
            yData[i][0] = outs[i][0];
        }
        scaleFilter.encode(0, xData);
        scaleFilter.encode(1, xData);
        scaleFilter.encode(2, xData);
        scaleFilter.encode(0, yData);
        // ******************************************************************
        // CREATE FEEDFORWARD NETWORK
        // ******************************************************************
        System.out.println("--> Creating Feed Forward Network Object");
        FeedForwardNetwork network = new FeedForwardNetwork();
        // setup input layer with number of inputs = nInputs = 3
        network.getInputLayer().createInputs(nInputs);
        // create a hidden layer with nPerceptrons=3 perceptrons
        network.createHiddenLayer().createPerceptrons(nPerceptrons);
        // create output layer with nOutputs=1 output perceptron 
        network.getOutputLayer().createPerceptrons(nOutputs);
        // link all inputs and perceptrons to all perceptrons in the next layer
        network.linkAll();
        // Get Network Perceptrons for Setting Their Activation Functions
        Perceptron perceptrons[] = network.getPerceptrons();
        // Set all hidden layer perceptrons to logistic_table activation
        for (i = 0; i < perceptrons.length - 1; i++) {
            perceptrons[i].setActivation(hiddenLayerActivation);
        }
        perceptrons[perceptrons.length - 1].
                setActivation(outputLayerActivation);
        System.out.println("--> Feed Forward Network Created with 2 Layers");
        // ******************************************************************
        // TRAIN NETWORK USING EPOCH TRAINER
        // ******************************************************************
        System.out.println("--> Training Network using Epoch Trainer");
        Trainer trainer = createTrainer(QuasiNewton, QuasiNewton);
        Calendar startTime = Calendar.getInstance();
        // Train Network
        trainer.train(network, xData, yData);

        // Check Training Error Status
        switch (trainer.getErrorStatus()) {
            case 0:
                errorMsg = errorMsg0;
                break;
            case 1:
                errorMsg = errorMsg1;
                break;
            case 2:
                errorMsg = errorMsg2;
                break;
            case 3:
                errorMsg = errorMsg3;
                break;
            case 4:
                errorMsg = errorMsg4;
                break;
            case 5:
                errorMsg = errorMsg5;
                break;
            case 6:
                errorMsg = errorMsg6;
                break;
            default:
                errorMsg = "--> Unknown Error Status Returned from Trainer";
        }
        System.out.println(errorMsg);
        Calendar currentTimeNow = Calendar.getInstance();
        System.out.println("--> Network Training Completed at: "
                + currentTimeNow.getTime());
        double duration = (double) (currentTimeNow.getTimeInMillis()
                - startTime.getTimeInMillis()) / 1000.0;
        System.out.println("--> Training Time: " + duration + " seconds");

        // ******************************************************************
        // DISPLAY TRAINING STATISTICS
        // ******************************************************************
        double stats[] = network.computeStatistics(xData, yData);
        // Display Network Errors
        System.out.println("***********************************************");
        System.out.println("--> SSE:                       "
                + (float) stats[0]);
        System.out.println("--> RMS:                       "
                + (float) stats[1]);
        System.out.println("--> Laplacian Error:           "
                + (float) stats[2]);
        System.out.println("--> Scaled Laplacian Error:    "
                + (float) stats[3]);
        System.out.println("--> Largest Absolute Residual: "
                + (float) stats[4]);
        System.out.println("***********************************************");
        System.out.println("");
        // ******************************************************************
        // OBTAIN AND DISPLAY NETWORK WEIGHTS AND GRADIENTS
        // ******************************************************************
        System.out.println("--> Getting Network Weights and Gradients");
        // Get weights               
        weight = network.getWeights();
        // Get number of weights = number of gradients
        nWeights = network.getNumberOfWeights();
        // Obtain Gradient Vector
        gradient = trainer.getErrorGradient();
        // Print Network Weights and Gradients
        System.out.println(" ");
        System.out.println("--> Network Weights and Gradients:");
        System.out.println("***********************************************");
        double[][] printMatrix = new double[nWeights][2];
        for (i = 0; i < nWeights; i++) {
            printMatrix[i][0] = weight[i];
            printMatrix[i][1] = gradient[i];
        }
        // Print result without row/column labels.
        String[] colLabels = {"Weight", "Gradient"};
        PrintMatrix pm = new PrintMatrix();
        PrintMatrixFormat mf;
        mf = new PrintMatrixFormat();
        mf.setNoRowLabels();
        mf.setColumnLabels(colLabels);
        pm.setTitle("Weights and Gradients");
        pm.print(mf, printMatrix);

        System.out.println("***********************************************");
        // ******************************************************************
        // SAVE THE TRAINED NETWORK BY SAVING THE SERIALIZED NETWORK OBJECT
        // ******************************************************************
        System.out.println("\n--> Saving Trained Network into "
                + networkFileName);
        write(network, networkFileName);
        System.out.println("--> Saving Network Trainer into "
                + trainerFileName);
        write(trainer, trainerFileName);
        System.out.println("--> Saving xData into "
                + xDataFileName);
        write(xData, xDataFileName);
        System.out.println("--> Saving yData into "
                + yDataFileName);
        write(yData, yDataFileName);
    }

    // **********************************************************************
    // OPEN DATA FILES
    // **********************************************************************
    static public void openInputFiles() {
        try {
            // Continuous  Input Attributes
            InputStream contInputStream = new FileInputStream(contFileName);
            contFileInputStream
                    = new BufferedReader(
                            new InputStreamReader(contInputStream));
            // Continuous Output Targets
            InputStream outputInputStream
                    = new FileInputStream(outputFileName);
            outputFileInputStream
                    = new BufferedReader(
                            new InputStreamReader(outputInputStream));
        } catch (Exception e) {
            System.out.println("-->ERROR: " + e);
            System.exit(0);
        }
    }
    
    // **********************************************************************
    // READ FIRST LINE OF DATA FILE AND RETURN NUMBER OF COLUMNS IN FILE
    // **********************************************************************
    static public int readFirstLine(BufferedReader inputFile) {
        String inputLine = "", temp;
        int nCol = 0;
        try {
            temp = inputFile.readLine();
            inputLine = temp.trim();
            nCol = Integer.parseInt(inputLine);
        } catch (Exception e) {
            System.out.println("--> ERROR READING 1st LINE OF File" + e);
            System.exit(0);
        }
        return nCol;
    }
    
    // **********************************************************************
    // READ COLUMN LABELS (2ND LINE IN FILE)
    // **********************************************************************
    static public int[] readColumnLabels(BufferedReader inputFile, int nCol) {
        int contCol[] = new int[nCol];
        String inputLine = "", temp;
        String dataElement[];
        // Read numeric labels for continuous input attributes
        try {
            temp = inputFile.readLine();
            inputLine = temp.trim();
        } catch (Exception e) {
            System.out.println("--> ERROR READING 2nd LINE OF FILE: " + e);
            System.exit(0);
        }
        dataElement = inputLine.split(" ");
        for (int i = 0; i < nCol; i++) {
            contCol[i] = Integer.parseInt(dataElement[i]);
        }
        return contCol;
    }

    // **********************************************************************
    // READ DATA ROW
    // **********************************************************************
    static public double[] readDataLine(BufferedReader inputFile,
            int nCol, int[] isMissing) {
        double missingValueIndicator = -9999999999.0;
        double dataLine[] = new double[nCol];
        String inputLine = "", temp;
        String dataElement[];
        try {
            temp = inputFile.readLine();
            inputLine = temp.trim();
        } catch (Exception e) {
            System.out.println("-->ERROR READING LINE: " + e);
            System.exit(0);
        }
        dataElement = inputLine.split(" ");
        for (int j = 0; j < nCol; j++) {
            dataLine[j] = Double.parseDouble(dataElement[j]);
            if (dataLine[j] == missingValueIndicator) {
                isMissing[0] = 1;
            }
        }
        return dataLine;
    }

    // **********************************************************************
    // CLOSE FILE
    // **********************************************************************
    static public void closeFile(BufferedReader inputFile) {
        try {
            inputFile.close();
        } catch (Exception e) {
            System.out.println("ERROR: Unable to close file: " + e);
            System.exit(0);
        }
    }

    // **********************************************************************
    // REMOVE MISSING DATA
    // **********************************************************************
    // Now remove all missing data using the ignore[] array
    // and recalculate the number of usable observations, nObs
    // This method is inefficient, but it works.  It removes one case at a
    // time, starting from the bottom.  As a case (row) is removed, the cases
    // below are pushed up to take its place.
    // **********************************************************************
    static public int removeMissingData(int nObs, int nCol, int ignore[],
            double[][] inputArray) {
        int m = 0;
        for (int i = nObs - 1; i >= 0; i--) {
            if (ignore[i] >= 0) {
                // the ith row contains a missing value
                // remove the ith row by shifting all rows below the 
                // ith row up by one position, e.g. row i+1  -> row i
                m++;
                if (nCol > 0) {
                    for (int j = i; j < nObs - m; j++) {
                        for (int k = 0; k < nCol; k++) {
                            inputArray[j][k] = inputArray[j + 1][k];
                        }
                    }
                }
            }
        }
        return m;
    }
    
    // **********************************************************************
    // Create Stage I/Stage II Trainer
    // **********************************************************************
    static public Trainer createTrainer(String s1, String s2) {
        EpochTrainer epoch = null; // Epoch Trainer (returned by this method)
        QuasiNewtonTrainer stage1Trainer; // Stage I  Quasi-Newton Trainer
        QuasiNewtonTrainer stage2Trainer; // Stage II Quasi-Newton Trainer
        LeastSquaresTrainer stage1LS;     // Stage I  Least Squares Trainer
        LeastSquaresTrainer stage2LS;     // Stage II Least Squares Trainer
        Calendar currentTimeNow;          // Calendar time tracker

        // Create Epoch (Stage I/Stage II) trainer from above trainers.
        System.out.println("    --> Creating Epoch Trainer");
        if (s1.equals(QuasiNewton)) {
            // Setup stage I quasi-newton trainer
            stage1Trainer = new QuasiNewtonTrainer();
            //stage1Trainer.setMaximumStepsize(maxStepSize);
            stage1Trainer.setMaximumTrainingIterations(stage1Iterations);
            stage1Trainer.setStepTolerance(stage1StepTolerance);
            if (s2.equals(QuasiNewton)) {
                stage2Trainer = new QuasiNewtonTrainer();
                //stage2Trainer.setMaximumStepsize(maxStepSize);
                stage2Trainer.setMaximumTrainingIterations(stage2Iterations);
                epoch = new EpochTrainer(stage1Trainer, stage2Trainer);
            } else {
                if (s2.equals(LeastSquares)) {
                    stage2LS = new LeastSquaresTrainer();
                    stage2LS.setInitialTrustRegion(1.0e-3);
                    //stage2LS.setMaximumStepsize(maxStepSize);
                    stage2LS.setMaximumTrainingIterations(stage2Iterations);
                    epoch = new EpochTrainer(stage1Trainer, stage2LS);
                } else {
                    epoch = new EpochTrainer(stage1Trainer);
                }
            }
        } else {
            // Setup stage I least squares trainer
            stage1LS = new LeastSquaresTrainer();
            stage1LS.setInitialTrustRegion(1.0e-3);
            stage1LS.setMaximumTrainingIterations(stage1Iterations);
            //stage1LS.setMaximumStepsize(maxStepSize);
            if (s2.equals(QuasiNewton)) {
                stage2Trainer = new QuasiNewtonTrainer();
                //stage2Trainer.setMaximumStepsize(maxStepSize);
                stage2Trainer.setMaximumTrainingIterations(stage2Iterations);
                epoch = new EpochTrainer(stage1LS, stage2Trainer);
            } else {
                if (s2.equals(LeastSquares)) {
                    stage2LS = new LeastSquaresTrainer();
                    stage2LS.setInitialTrustRegion(1.0e-3);
                    //stage2LS.setMaximumStepsize(maxStepSize);
                    stage2LS.setMaximumTrainingIterations(stage2Iterations);
                    epoch = new EpochTrainer(stage1LS, stage2LS);
                } else {
                    epoch = new EpochTrainer(stage1LS);
                }
            }
        }
        epoch.setNumberOfEpochs(nEpochs);
        epoch.setEpochSize(epochSize);
        epoch.setRandom(new com.imsl.stat.Random(1234567));
        epoch.setRandomSamples(new com.imsl.stat.Random(12345),
                new com.imsl.stat.Random(67891));
        System.out.println("    --> Trainer:  Stage I - "
                + s1 + " Stage II " + s2);
        System.out.println("    --> Number of Epochs:   " + nEpochs);
        System.out.println("    --> Epoch Size:         " + epochSize);
        // Describe optimization setup for Stage I training
        System.out.println("    --> Creating Stage I  Trainer");
        System.out.println("    --> Stage I Iterations:          "
                + stage1Iterations);
        System.out.println("    --> Stage I Step Tolerance:      "
                + stage1StepTolerance);
        System.out.println("    --> Stage I Relative Tolerance:  "
                + stage1RelativeTolerance);
        System.out.println("    --> Stage I Step Size:           "
                + "DEFAULT");
        System.out.println("    --> Stage I Trace:               " + trace);
        if (s2.equals(QuasiNewton) || s2.equals(LeastSquares)) {
            // Describe optimization setup for Stage II training
            System.out.println("    --> Creating Stage II Trainer");
            System.out.println("    --> Stage II Iterations:         "
                    + stage2Iterations);
            System.out.println("    --> Stage II Step Tolerance:     "
                    + stage2StepTolerance);
            System.out.println("    --> Stage II Relative Tolerance: "
                    + stage2RelativeTolerance);
            System.out.println("    --> Stage II Step Size:          "
                    + "DEFAULT");
            System.out.println("    --> Stage II Trace:              "
                    + trace);
        }
        if (trace) {
            try {
                Handler handler = new FileHandler(trainingLogFile);
                Logger logger = Logger.getLogger("com.imsl.datamining.neural");
                logger.setLevel(Level.FINEST);
                logger.addHandler(handler);
                handler.setFormatter(EpochTrainer.getFormatter());
                System.out.println("    --> Training Log Stored in "
                        + trainingLogFile);
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
        currentTimeNow = Calendar.getInstance();
        System.out.println("--> Starting Network Training at "
                + currentTimeNow.getTime());
        // Return Stage I/Stage II trainer  
        return epoch;
    }

    // **********************************************************************
    // WRITE SERIALIZED OBJECT TO A FILE
    // **********************************************************************
    static public void write(Object obj, String filename)
            throws IOException {
        FileOutputStream fos = new FileOutputStream(filename);
        ObjectOutputStream oos = new ObjectOutputStream(fos);
        oos.writeObject(obj);
        oos.close();
        fos.close();
    }
}

Output

--> Starting Data Preprocessing at: Thu Oct 14 17:27:04 CDT 2004
--> Number of continuous variables:     3
--> Number of output variables:         1
--> Number of Missing Observations:     16507
--> Total Number of Training Patterns:  118519
--> Number of Usable Training Patterns: 102012
--> Starting Preprocessing of Training Patterns
--> Creating Feed Forward Network Object
--> Feed Forward Network Created with 2 Layers
--> Training Network using Epoch Trainer
    --> Creating Epoch Trainer
    --> Trainer:  Stage I - quasi-newton Stage II quasi-newton
    --> Number of Epochs:   20
    --> Epoch Size:         5000
    --> Creating Stage I  Trainer
    --> Stage I Iterations:          5000
    --> Stage I Step Tolerance:      1.0E-9
    --> Stage I Relative Tolerance:  1.0E-11
    --> Stage I Step Size:           DEFAULT
    --> Stage I Trace:               true
    --> Creating Stage II Trainer
    --> Stage II Iterations:         5000
    --> Stage II Step Tolerance:     1.0E-9
    --> Stage II Relative Tolerance: 1.0E-11
    --> Stage II Step Size:          DEFAULT
    --> Stage II Trace:              true
    --> Training Log Stored in NeuralNetworkEx1.log
--> Starting Network Training at Thu Oct 14 17:32:33 CDT 2004
--> The last global step failed to locate a lower point than the
current error value.  The current solution may be an approximate
solution and no more accuracy is possible, or the step tolerance
may be too large.
--> Network Training Completed at: Thu Oct 14 18:18:08 CDT 2004
--> Training Time: 2735.341 seconds
***********************************************
--> SSE:                       3.88076
--> RMS:                       0.12284768
--> Laplacian Error:           125.36781
--> Scaled Laplacian Error:    0.20686063
--> Largest Absolute Residual: 0.500993
***********************************************

--> Getting Network Weights and Gradients

--> Network Weights and Gradients:
***********************************************
Weights and Gradients
   Weight   Gradient
   1.921   -0
   1.569    0
-199.709    0
   0.065   -0
  -0.003    0
 106.62     0
   1.221   -0
   0.787    0
 119.169    0
-129.8      0
 146.822    0
  -0.076    0
  -6.022   -0
  -5.257    0.001
   2.19     0
  -0.377    0

***********************************************

--> Saving Trained Network into NeuralNetworkEx1.ser
--> Saving Network Trainer into NeuralNetworkTrainerEx1.ser
--> Saving xData into NeuralNetworkxDataEx1.ser
--> Saving yData into NeuralNetworkyDataEx1.ser

Results

The above output indicates that the network completed its training successfully. The final sum of squared errors was 3.88, and the RMS error (the scaled version of the sum of squared errors) was 0.12. All of the gradients at this solution are nearly zero, which is expected if network training found a local or global optimum. Non-zero gradients usually indicate a problem with network training.

Examining the training log for this application, NeuralNetworkEx1.log, illustrates the importance of Stage II training.

Portions of the Training Log - NeuralNetworkEx1.log

.
.
.
End EpochTrainer Stage 1
        Best Epoch       15
        Error Status     17
        Best Error       0.03979299031789641
        Best Residual    0.03979299031789641
        SSE              1072.1281419136983
        RMS              33.93882798404427
        Laplacian        429.30253410528974
        Scaled Laplacian 0.7083620086220087
        Max Residual     11.837166167929052
Exiting com.imsl.datamining.neural.EpochTrainer.train Stage 1
Beginning com.imsl.datamining.neural.EpochTrainer.train Stage 2
.
.
.
Exiting com.imsl.datamining.neural.EpochTrainer.train Stage 2
    Summary
    Error Status     1
    Best Error       3.88076005209094
    SSE              3.88076005209094
    RMS              0.12284767343218107
    Laplacian        125.3678136373788
    Scaled Laplacian 0.20686063843020083
    Max Residual     0.5009930332151435

The training log indicates that the best Stage I epoch was epoch 15, and that 17 of the 20 Stage I epochs reported a problem with the training optimization. Other parts of the log indicate that these problems included possible local minima and exceeding the maximum number of iterations. Although these are warning messages rather than true errors, they do indicate that convergence to a global optimum is uncertain for 17 of the 20 epochs. Increasing the epoch size might have provided more stable Stage I training.

More disturbing is that for the best epoch, epoch 15, the sum of squared errors totaled over all training patterns was 1072.13. Epoch 15 was used as the starting point for Stage II training, which reduced this sum of squared errors to 3.88. This suggests that although the epoch size, epochSize=5000, was too small for effective Stage I training, the Stage II trainer was able to locate a better solution.

However, even the Stage II trainer returned a non-zero error status, errorStatus=1. This is a warning that the Stage II trainer may have found a local optimum. Further attempts were made to determine whether a better network could be found, but these alternate solutions only marginally lowered the sum of squared errors.

The trained network was serialized and stored into four files (a sketch of restoring them for forecasting follows the list):

the network file - NeuralNetworkEx1.ser,
the trainer file - NeuralNetworkTrainerEx1.ser,
the xData file - NeuralNetworkxDataEx1.ser, and
the yData file - NeuralNetworkyDataEx1.ser.
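A forecasting program can restore these objects and produce forecasts in the original units. The following is a minimal sketch, assuming the same scaling bounds used in training; the raw lag values are hypothetical:

import com.imsl.datamining.neural.*;
import java.io.*;

public class ForecastFromSerializedNetwork {
    public static void main(String[] args) throws Exception {
        // Restore the trained network from its serialized object file.
        ObjectInputStream ois = new ObjectInputStream(
                new FileInputStream("NeuralNetworkEx1.ser"));
        FeedForwardNetwork network = (FeedForwardNetwork) ois.readObject();
        ois.close();

        // Rebuild the scale filter with the bounds used during training.
        ScaleFilter scaleFilter = new ScaleFilter(ScaleFilter.BOUNDED_SCALING);
        scaleFilter.setBounds(-105000, 25000000, 0, 1);

        // One pattern of raw (unscaled) lag values -- hypothetical numbers.
        double[][] x = {{520000.0, 480000.0, 455000.0}};
        scaleFilter.encode(0, x);
        scaleFilter.encode(1, x);
        scaleFilter.encode(2, x);

        // The deserialized network retains its LOGISTIC_TABLE activation.
        double[][] yHat = {{network.forecast(x[0])[0]}};
        scaleFilter.decode(0, yHat); // return the forecast to original units
        System.out.println("forecast = " + yHat[0][0]);
    }
}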


Links to Input Data Files Used in this Example and the Training Log:

continuous.txt - Input attributes file.
output.txt - Network targets file.
NeuralNetworkEx1.log - Training Log.