time_series_filter

Converts time series data to the format required for processing by a neural network.

Synopsis

#include <imsls.h>

float *imsls_f_time_series_filter (int n_patterns, int n_var, int max_lag, float x[], …, 0)

The type double function is imsls_d_time_series_filter.

Required Arguments

int n_patterns   (Input)
Number of observations.  The number of observations must be greater than n_lags.

int n_var   (Input)
Number of variables (columns) in x.  The number of variables must be one or greater, n_var ≥ 1.

int max_lag   (Input)
The number of lags.  The number of lags must be one or greater, max_lag ≥ 1, and less than or equal to n_patterns.

float x[]   (Input)
An array of size n_patterns by n_var.  All data must be sorted in chronological order from most recent to oldest observation.

Return Value

A pointer to an internally allocated array of size (n_patterns-max_lag) by n_var*(max_lag+1).  If errors are encountered, NULL is returned.

Synopsis with Optional Arguments

#include <imsls.h>

float *imsls_f_time_series_filter (int n_patterns, int n_var,
 int max_lag, float x[],
 IMSLS_RETURN_USER, float z[],
 0)

Optional Arguments

IMSLS_RETURN_USER, float z[]   (Output)
User supplied array of size (n_patterns-max_lag) by n_var*(max_lag+1) containing the filtered data.
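
When z is supplied through IMSLS_RETURN_USER, the caller has to size the array before the call. The following is a minimal sketch of that arithmetic in plain C. The helper name filtered_len is hypothetical and not part of the library, and its argument checks are an assumption based on the constraints listed under Required Arguments; the library's own validation may differ.

```c
#include <stddef.h>

/* Hypothetical helper (not an IMSL function): number of elements needed
 * for a user-supplied z of size (n_patterns-max_lag) by n_var*(max_lag+1).
 * Returns 0 when the arguments violate the documented constraints. */
size_t filtered_len(int n_patterns, int n_var, int max_lag)
{
    if (n_var < 1 || max_lag < 1 || max_lag > n_patterns)
        return 0;

    size_t n_rows = (size_t)(n_patterns - max_lag);   /* rows of z    */
    size_t n_cols = (size_t)n_var * (size_t)(max_lag + 1); /* columns of z */
    return n_rows * n_cols;
}
```

For n_patterns=5, n_var=2 and max_lag=2 this gives 3*6=18 elements, so an array declared as float z[18] would be large enough to pass as the IMSLS_RETURN_USER argument.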

Description

Function imsls_f_time_series_filter accepts a data matrix and lags every column to form a new data matrix.  The input matrix, x, contains n_var columns.  Each column is transformed into (max_lag+1) columns by lagging its values.

Since a lag of zero is always included in the output matrix z, the total number of lags is
n_lags = max_lag+1.

The output data array, z, can be represented symbolically as:

z = |x(0) : x(1) : x(2) : … : x(max_lag)|,

where x(i) is the i-th lag of the incoming data matrix, x.  For example, if  x={1, 2, 3, 4, 5} and n_var=1, then n_patterns=5, and x(0)=x, x(1)={2, 3, 4, 5}, x(2)={3, 4, 5}, etc.
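
The single-column lags described above can be sketched in plain C as follows. This is an illustration of the definition only, not the library's implementation; the function name lag_series is hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper (not an IMSL function): writes the k-th lag of the
 * series x (length n) into out and returns the number of values written.
 * The k-th lag drops the first k values, so it has n-k entries. */
size_t lag_series(const float *x, size_t n, size_t k, float *out)
{
    if (k >= n)
        return 0;                 /* no values remain after lagging */

    for (size_t i = 0; i + k < n; ++i)
        out[i] = x[i + k];        /* shift the series by k positions */

    return n - k;
}
```

With x={1, 2, 3, 4, 5}, a call with k=1 writes {2, 3, 4, 5} and a call with k=2 writes {3, 4, 5}, matching x(1) and x(2) above.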

Consider an example in which n_patterns = 5 and n_var = 2, with all variables having continuous input attributes.  It is assumed that the most recent observations are in the first row and the oldest are in the last row.

If max_lag=1, then the number of columns will be n_var*(max_lag+1)=2*2=4, and the number of rows will be n_patterns-max_lag=5-1=4.

If max_lag=2, then the number of columns will be n_var*(max_lag+1)=2*3=6, and the number of rows will be n_patterns-max_lag=5-2=3.
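
The row and column counts above follow from how the lagged blocks are laid side by side. The plain C sketch below reproduces that layout without the library. It is an assumption inferred from the symbolic form of z and from the output shown in Example 1, not the library's actual implementation, and the name lag_filter is hypothetical. It rejects max_lag equal to n_patterns because that case would leave no rows.

```c
#include <stdlib.h>

/* Hypothetical re-implementation of the lag layout (not an IMSL function).
 * x is n_patterns by n_var, stored row by row.  The result has
 * (n_patterns-max_lag) rows and n_var*(max_lag+1) columns; columns
 * k*n_var .. k*n_var+n_var-1 hold the k-th lag of x.
 * Returns NULL on invalid arguments or allocation failure.
 * The caller releases the result with free(). */
float *lag_filter(int n_patterns, int n_var, int max_lag, const float *x)
{
    if (n_var < 1 || max_lag < 1 || max_lag >= n_patterns)
        return NULL;

    int n_rows = n_patterns - max_lag;
    int n_cols = n_var * (max_lag + 1);

    float *z = malloc((size_t)n_rows * (size_t)n_cols * sizeof *z);
    if (z == NULL)
        return NULL;

    for (int i = 0; i < n_rows; ++i)            /* output row        */
        for (int k = 0; k <= max_lag; ++k)      /* lag block         */
            for (int j = 0; j < n_var; ++j)     /* variable in block */
                z[i * n_cols + k * n_var + j] = x[(i + k) * n_var + j];

    return z;
}
```

Applied to the 5 by 2 data used in Example 1 (rows {1, 6} through {5, 10}), max_lag=1 gives a 4 by 4 array whose first row is {1, 6, 2, 7}, and max_lag=2 gives a 3 by 6 array whose first row is {1, 6, 2, 7, 3, 8}.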

Example 1

In this example, the matrix x with 5 rows and 2 columns is lagged twice, i.e., max_lag=2. This produces a two-dimensional output matrix with (n_patterns-max_lag)=5-2=3 rows and n_var*(max_lag+1)=2*3=6 columns. The first two columns correspond to lag=0, which simply places the first three rows of the original data into these columns. The 3rd and 4th columns contain the first lags of the original 2 columns, and the 5th and 6th columns contain the second lags. Note that the output matrix z has fewer rows than the input matrix x.

#include <imsls.h>

#define N_PATTERNS 5
#define N_VAR 2
#define MAX_LAG 2

int main()
{
  /* Input data: N_PATTERNS rows by N_VAR columns, stored row by row. */
  float x[N_PATTERNS*N_VAR] = {1, 6,
                               2, 7,
                               3, 8,
                               4, 9,
                               5, 10};
  float *z;

  z = imsls_f_time_series_filter(N_PATTERNS, N_VAR, MAX_LAG, (float*)x, 0);

  imsls_f_write_matrix("X", N_PATTERNS, N_VAR, (float*)x, 0);
  imsls_f_write_matrix("Z", N_PATTERNS-MAX_LAG, N_VAR*(MAX_LAG+1), z, 0);
}

Output

            X
            1           2
1           1           6
2           2           7
3           3           8
4           4           9
5           5          10

                                    Z
            1           2           3           4           5           6
1           1           6           2           7           3           8
2           2           7           3           8           4           9
3           3           8           4           9           5          10

 

