data_sets
Retrieves a commonly analyzed data set.
Synopsis
#include <imsls.h>
float *imsls_f_data_sets (int data_set_choice, ..., 0)
The type double function is imsls_d_data_sets.
Required Arguments
int data_set_choice (Input)
Data set indicator. Set data_set_choice = 0 to print a description of all fourteen data sets. In this case, any optional arguments are ignored.
data_set_choice  n_observations  n_variables  Description of Data Set
 1                   16               7       Longley
 2                  176               2       Wolfer sunspot
 3                  150               5       Fisher iris
 4                  144               1       Box and Jenkins Series G
 5                   13               5       Draper and Smith Appendix B
 6                  197               1       Box and Jenkins Series A
 7                  296               2       Box and Jenkins Series J
 8                  100               4       Robinson Multichannel Time Series
 9                  113              34       Afifi and Azen Data Set A
10                  958              10       Tic-Tac-Toe Endgame
11                 4601              58       Spambase Data Set
12                  690              16       Credit Approval
13                20000              17       Letter Recognition Data
14                  366              35       Dermatology Database
Return Value
If data_set_choice ≠ 0, the requested data set is returned. If data_set_choice = 0 or an error occurs, NULL is returned.
Synopsis with Optional Arguments
#include <imsls.h>
float *imsls_f_data_sets (int data_set_choice,
IMSLS_X_COL_DIM, int x_col_dim,
IMSLS_N_OBSERVATIONS, int *n_observations,
IMSLS_N_VARIABLES, int *n_variables,
IMSLS_PRINT_NONE,
IMSLS_PRINT_BRIEF,
IMSLS_PRINT_ALL,
IMSLS_RETURN_USER, float x[],
0)
Optional Arguments
IMSLS_X_COL_DIM, int x_col_dim (Input)
Column dimension of the user-allocated array x.
IMSLS_N_OBSERVATIONS, int *n_observations (Output)
Number of observations or rows in the output matrix.
IMSLS_N_VARIABLES, int *n_variables (Output)
Number of variables or columns in the output matrix.
IMSLS_PRINT_NONE
No printing is performed. This option is the default.
IMSLS_PRINT_BRIEF
Rows 1 through 10 of the data set are printed.
IMSLS_PRINT_ALL
All rows of the data set are printed.
IMSLS_RETURN_USER, float x[] (Output)
User-supplied array containing the data set.
Description
Function imsls_f_data_sets retrieves a standard data set frequently cited in statistics textbooks or in this manual. The following table gives the references for each data set:
data_set_choice  Reference
 1               Longley (1967)
 2               Anderson (1971, p. 660)
 3               Fisher (1936); Mardia et al. (1979, Table 1.2.2)
 4               Box and Jenkins (1976, p. 531)
 5               Draper and Smith (1981, pp. 629-630)
 6               Box and Jenkins (1976, p. 525)
 7               Box and Jenkins (1976, pp. 532-533)
 8               Robinson (1976, p. 204)
 9               Afifi and Azen (1979, pp. 16-22)
10               Aha, D. W. (1991, pp. 117-121), and Asuncion, A. & Newman, D.J. (2007)
11               Asuncion, A. & Newman, D.J. (2007)
12               Quinlan (1987, pp. 221-234, 1997), and Asuncion, A. & Newman, D.J. (2007)
13               P. W. Frey and D. J. Slate (Machine Learning, Vol. 6, No. 2, March 91), and Asuncion, A. & Newman, D.J. (2007)
14               G. Demiroz, H. A. Govenir, and N. Ilter (Artificial Intelligence in Medicine), and Asuncion, A. & Newman, D.J. (2007)
Example
In this example, imsls_f_data_sets is used to copy the Draper and Smith (1981, Appendix B) data set into x.
 
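#include <imsls.h>

int main()
{
    float *x;

    /* Retrieve data set 5 (Draper and Smith, Appendix B). */
    x = imsls_f_data_sets (5, 0);

    /* Print the 13 x 5 matrix under a title. */
    imsls_f_write_matrix("Draper and Smith, Appendix B", 13, 5, x, 0);

    return 0;
}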
Output
 
     Draper and Smith, Appendix B
         1       2       3       4       5
 1     7.0    26.0     6.0    60.0    78.5
 2     1.0    29.0    15.0    52.0    74.3
 3    11.0    56.0     8.0    20.0   104.3
 4    11.0    31.0     8.0    47.0    87.6
 5     7.0    52.0     6.0    33.0    95.9
 6    11.0    55.0     9.0    22.0   109.2
 7     3.0    71.0    17.0     6.0   102.7
 8     1.0    31.0    22.0    44.0    72.5
 9     2.0    54.0    18.0    22.0    93.1
10    21.0    47.0     4.0    26.0   115.9
11     1.0    40.0    23.0    34.0    83.8
12    11.0    66.0     9.0    12.0   113.3
13    10.0    68.0     8.0    12.0   109.4