time_series_filter
Converts time series data to the format required for processing by a neural network.
Synopsis
#include <imsls.h>
float *imsls_f_time_series_filter (int n_patterns, int n_var, int max_lag, float x[], …,0)
The type double function is imsls_d_time_series_filter.
Required Arguments
int n_patterns (Input)
Number of observations (rows) in x. The number of observations must be greater than n_lags, where n_lags = max_lag+1 (see Description).
int n_var (Input)
Number of variables (columns) in x. The number of variables must be one or greater, n_var ≥ 1.
int max_lag (Input)
The number of lags. The number of lags must be one or greater, max_lag ≥ 1, and less than or equal to n_patterns.
float x[] (Input)
An array of size n_patterns by n_var. All data must be sorted in chronological order from most recent to oldest observation.
Return Value
A pointer to an internally allocated array of size (n_patterns-max_lag) by (n_var×(max_lag+1)). If errors are encountered, NULL is returned.
Synopsis with Optional Arguments
#include <imsls.h>
float *imsls_f_time_series_filter (int n_patterns, int n_var, int max_lag, float x[],
IMSLS_RETURN_USER, float z[],
0)
Optional Arguments
IMSLS_RETURN_USER, float z[] (Output)
User-supplied array of size (n_patterns-max_lag) by (n_var×(max_lag+1)) containing the filtered data.
Description
Function imsls_f_time_series_filter accepts a data matrix and lags every column to form a new data matrix. The input matrix, x, contains n_var columns. Each column is transformed into (max_lag+1) columns by lagging its values.
Since a lag of zero is always included in the output matrix z, the total number of lags is n_lags = max_lag+1.
The output data array, z, can be represented symbolically as:
z = |x(0) : x(1) : x(2) : … : x(max_lag)|,
where x(i) is the i-th lag of the incoming data matrix, x. For example, if x={1, 2, 3, 4, 5} and n_var=1, then n_patterns=5, and x(0)=x, x(1)={2, 3, 4, 5}, x(2)={3, 4, 5}, etc.
Consider an example in which n_patterns = 5 and n_var = 2, with all variables having continuous input attributes. It is assumed that the most recent observations are in the first row and the oldest are in the last row.
If max_lag=1, then the number of columns will be n_var*(max_lag+1)=2*2=4, and the number of rows will be n_patterns-max_lag=5-1=4.
If max_lag=2, then the number of columns will be n_var*(max_lag+1)=2*3=6, and the number of rows will be n_patterns-max_lag=5-2=3.
Example
In this example, the matrix x with 5 rows and 2 columns is lagged twice, i.e., max_lag=2. This produces a two-dimensional output matrix with n_patterns-max_lag=5-2=3 rows and n_var*(max_lag+1)=2*3=6 columns. The first two columns correspond to lag=0, which simply places the original data into these columns. The 3rd and 4th columns contain the first lags of the original 2 columns, and the 5th and 6th columns contain the second lags. Note that the output matrix z has fewer rows than the input matrix x.
 
#include <imsls.h>

#define N_PATTERNS 5
#define N_VAR      2
#define MAX_LAG    2

int main()
{
    /* Input data: N_PATTERNS rows by N_VAR columns, stored row by row. */
    float x[N_PATTERNS*N_VAR] = {1,  6,
                                 2,  7,
                                 3,  8,
                                 4,  9,
                                 5, 10};
    float *z;

    /* Lag every column of x by 0, 1, ..., MAX_LAG. */
    z = imsls_f_time_series_filter(N_PATTERNS, N_VAR, MAX_LAG, x, 0);

    imsls_f_write_matrix("X", N_PATTERNS, N_VAR, x, 0);
    imsls_f_write_matrix("Z", N_PATTERNS-MAX_LAG, N_VAR*(MAX_LAG+1), z, 0);

    return 0;
}
Output
 
      X
     1    2
1    1    6
2    2    7
3    3    8
4    4    9
5    5   10

                Z
     1    2    3    4    5    6
1    1    6    2    7    3    8
2    2    7    3    8    4    9
3    3    8    4    9    5   10