timeSeriesFilter

Converts time series data to the format required for processing by a neural network.

Synopsis

timeSeriesFilter (maxLag, x)

Required Arguments

int maxLag (Input): The number of lags. maxLag must be at least 1 and no greater than nPatterns, that is, 1 ≤ maxLag ≤ nPatterns.
float x[] (Input): An array of size nPatterns by nVar. All data must be sorted in chronological order from most recent to oldest observation.

Return Value

An array of size (nPatterns-maxLag) by (nVar×(maxLag+1)). If errors are encountered, None is returned.
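The output dimensions follow directly from the arguments. The snippet below is a minimal sketch of that arithmetic; `expected_shape` is a hypothetical helper written for illustration, not a PyIMSL function, and it assumes the documented constraint 1 ≤ maxLag ≤ nPatterns.

```python
def expected_shape(n_patterns, n_var, max_lag):
    """Return (rows, columns) of the lagged output matrix.

    Hypothetical helper for illustration only; not part of PyIMSL.
    """
    # Assumed constraint, taken from the argument description above.
    if not 1 <= max_lag <= n_patterns:
        raise ValueError("max_lag must satisfy 1 <= max_lag <= n_patterns")
    # Each lag removes one usable row; each column expands into max_lag + 1 columns.
    return (n_patterns - max_lag, n_var * (max_lag + 1))


print(expected_shape(5, 2, 1))  # (4, 4)
print(expected_shape(5, 2, 2))  # (3, 6)
```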

Description

Function timeSeriesFilter accepts a data matrix and lags every column to form a new data matrix. The input matrix, x, contains nVar columns. Each column is transformed into (maxLag+1) columns by lagging its values.

Since a lag of zero is always included in the output matrix z, the total number of lags is nLags = maxLag+1.

The output data array, z, can be represented symbolically as:

$\texttt{z} = |x(0) : x(1) : x(2) : \ldots : x(\texttt{maxLag})|,$

where x(i) is the i-th lag of the incoming data matrix, x. For example, if x={1, 2, 3, 4, 5} and nVar=1, then nPatterns=5, and $x(0)=x$, $x(1)=\{2,3,4,5\}$, $x(2)=\{3,4,5\}$, etc.
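The lag notation can be illustrated with a short NumPy sketch. `lag_column` is a hypothetical helper introduced here for illustration, not a PyIMSL routine; it assumes that the i-th lag simply drops the first i observations, as in the example above.

```python
import numpy as np


def lag_column(x, i):
    """Return the i-th lag of a 1-D series by dropping its first i entries.

    Hypothetical helper for illustration only; not part of PyIMSL.
    """
    return np.asarray(x)[i:]


x = np.array([1, 2, 3, 4, 5])
for i in range(3):
    # Prints the lag index followed by the lagged series.
    print(i, lag_column(x, i))
```

Because each lag is shorter than the one before it, only the first nPatterns-maxLag entries of every lag can be placed side by side, which is why the output matrix has fewer rows than the input.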

Consider an example in which nPatterns = 5 and nVar = 2, with all variables having continuous input attributes. It is assumed that the most recent observations are in the first row and the oldest are in the last row.

$\begin{split}x = \begin{bmatrix} 1 & 6 \\ 2 & 7 \\ 3 & 8 \\ 4 & 9 \\ 5 & 10 \\ \end{bmatrix}\end{split}$

If maxLag=1, then the number of columns will be nVar*(maxLag+1)=2*2=4, and the number of rows will be nPatterns–maxLag=5-1=4:

$\begin{split}z = \begin{bmatrix} 1 & 6 & 2 & 7\\ 2 & 7 & 3 & 8\\ 3 & 8 & 4 & 9\\ 4 & 9 & 5 & 10\\ \end{bmatrix}\end{split}$

If maxLag=2, then the number of columns will be nVar*(maxLag+1)=2*3=6, and the number of rows will be nPatterns–maxLag=5-2=3:

$\begin{split}z = \begin{bmatrix} 1 & 6 & 2 & 7 & 3 & 8 \\ 2 & 7 & 3 & 8 & 4 & 9 \\ 3 & 8 & 4 & 9 & 5 & 10 \\ \end{bmatrix}\end{split}$
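The two matrices above can be reproduced with plain NumPy. The following is a minimal sketch under the assumption that the transformation is a horizontal stack of row-shifted copies of x; `lag_matrix` is a hypothetical name, and this is not the PyIMSL implementation.

```python
import numpy as np


def lag_matrix(x, max_lag):
    """Stack lags 0..max_lag of every column of x side by side.

    Illustrative sketch only; not the PyIMSL implementation.
    """
    x = np.asarray(x)
    n_rows = x.shape[0] - max_lag  # nPatterns - maxLag
    # Lag i is x shifted up by i rows, truncated to n_rows rows.
    blocks = [x[i:i + n_rows] for i in range(max_lag + 1)]
    return np.hstack(blocks)


x = np.array([[1, 6], [2, 7], [3, 8], [4, 9], [5, 10]])
z1 = lag_matrix(x, 1)
z2 = lag_matrix(x, 2)
print(z1.shape, z2.shape)  # (4, 4) (3, 6)
print(z2)
```

Within each row, columns are grouped by lag: the first nVar columns hold lag 0, the next nVar columns hold lag 1, and so on.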

Example

In this example, the matrix x with 5 rows and 2 columns is lagged twice, i.e., maxLag=2. This produces a two-dimensional output matrix with (nPatterns-maxLag)=5-2=3 rows and nVar*(maxLag+1)=2*3=6 columns. The first two columns correspond to lag=0, which simply places the original data into these columns. The 3rd and 4th columns contain the first lags of the original 2 columns, and the 5th and 6th columns contain the second lags. Note that the output matrix z has fewer rows than the input matrix x.

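from numpy import array
from pyimsl.stat.timeSeriesFilter import timeSeriesFilter
from pyimsl.stat.writeMatrix import writeMatrix

# Rows are ordered from the most recent (first) to the oldest (last) observation.
x = array([[1., 6.], [2., 7.], [3., 8.], [4., 9.], [5., 10.]])
# Lag both columns with maxLag=2; z has 3 rows and 6 columns.
z = timeSeriesFilter(2, x)
writeMatrix("x", x, writeFormat="%5i")
writeMatrix("z", z, writeFormat="%5i")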

Output

 
       x
       1      2
1      1      6
2      2      7
3      3      8
4      4      9
5      5     10
 
                     z
       1      2      3      4      5      6
1      1      6      2      7      3      8
2      2      7      3      8      4      9
3      3      8      4      9      5     10