data_sets
Retrieves a commonly analyzed data set.
Synopsis
#include <imsls.h>
float *imsls_f_data_sets (int data_set_choice, ..., 0)
The type double function is imsls_d_data_sets.
Required Arguments
int data_set_choice (Input)
Data set indicator. Set data_set_choice = 0 to print a description of all fourteen data sets. In this case, any optional arguments are ignored.
data_set_choice | n_observations | n_variables | Description of Data Set |
---|---|---|---|
1 | 16 | 7 | Longley |
2 | 176 | 2 | Wolfer sunspot |
3 | 150 | 5 | Fisher iris |
4 | 144 | 1 | Box and Jenkins Series G |
5 | 13 | 5 | Draper and Smith Appendix B |
6 | 197 | 1 | Box and Jenkins Series A |
7 | 296 | 2 | Box and Jenkins Series J |
8 | 100 | 4 | Robinson Multichannel Time Series |
9 | 113 | 34 | Afifi and Azen Data Set A |
10 | 958 | 10 | Tic-Tac-Toe Endgame |
11 | 4601 | 58 | Spambase Data Set |
12 | 690 | 16 | Credit Approval |
13 | 20000 | 17 | Letter Recognition Data |
14 | 366 | 35 | Dermatology Database |
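The dimensions in this table determine how much space a caller needs when supplying its own array through IMSLS_RETURN_USER. The following is a minimal sketch, not part of the library: data_set_dim_t, data_set_dims, and data_set_n_elements are hypothetical names that restate the table so a buffer size can be computed in advance. It assumes each row of the user array occupies x_col_dim consecutive elements.

```c
#include <stddef.h>

/* Hypothetical helper type (not an IMSL type): one row of the table above. */
typedef struct {
    int n_observations;
    int n_variables;
} data_set_dim_t;

/* Dimensions copied from the table. Index 0 is a placeholder because
   data_set_choice = 0 only prints descriptions and returns no data. */
static const data_set_dim_t data_set_dims[15] = {
    {0, 0},
    {16, 7}, {176, 2}, {150, 5}, {144, 1}, {13, 5}, {197, 1}, {296, 2},
    {100, 4}, {113, 34}, {958, 10}, {4601, 58}, {690, 16}, {20000, 17},
    {366, 35}
};

/* Number of array elements needed to hold data set `choice` when every
   row is stored with `x_col_dim` columns (an assumption of this sketch).
   Returns 0 for an invalid choice or when x_col_dim is smaller than
   the data set's n_variables. */
size_t data_set_n_elements(int choice, int x_col_dim)
{
    if (choice < 1 || choice > 14)
        return 0;
    if (x_col_dim < data_set_dims[choice].n_variables)
        return 0;
    return (size_t)data_set_dims[choice].n_observations * (size_t)x_col_dim;
}
```

For instance, data_set_n_elements(5, 5) gives the 65 elements needed for the Draper and Smith data set with no padding columns.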
Return Value
If data_set_choice ≠ 0, the requested data set is returned. If data_set_choice = 0 or an error occurs, NULL is returned.
Synopsis with Optional Arguments
#include <imsls.h>
float *imsls_f_data_sets (int data_set_choice,
IMSLS_X_COL_DIM, int x_col_dim,
IMSLS_N_OBSERVATIONS, int *n_observations,
IMSLS_N_VARIABLES, int *n_variables,
IMSLS_PRINT_NONE,
IMSLS_PRINT_BRIEF,
IMSLS_PRINT_ALL,
IMSLS_RETURN_USER, float x[],
0)
Optional Arguments
IMSLS_X_COL_DIM, int x_col_dim (Input)
Column dimension of the user-allocated array x.
IMSLS_N_OBSERVATIONS, int *n_observations (Output)
Number of observations or rows in the output matrix.
IMSLS_N_VARIABLES, int *n_variables (Output)
Number of variables or columns in the output matrix.
IMSLS_PRINT_NONE
No printing is performed. This option is the default.
IMSLS_PRINT_BRIEF
Rows 1 through 10 of the data set are printed.
IMSLS_PRINT_ALL
All rows of the data set are printed.
IMSLS_RETURN_USER, float x[] (Output)
User-supplied array containing the data set.
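The role of x_col_dim is easiest to see in terms of array indexing. The sketch below assumes the usual C row-major convention, in which element (i, j) of the user array lives at x[i * x_col_dim + j]. It does not call the library: element_at and copy_with_col_dim are hypothetical stand-ins that only illustrate how a matrix with n_variables columns would sit inside an array whose rows are x_col_dim elements apart.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical accessor (not an IMSL function): element (row, col) of a
   matrix stored row by row in x, with rows x_col_dim elements apart. */
float element_at(const float *x, int x_col_dim, int row, int col)
{
    assert(x != NULL && row >= 0 && col >= 0 && col < x_col_dim);
    return x[(size_t)row * (size_t)x_col_dim + (size_t)col];
}

/* Hypothetical stand-in for filling a user-supplied array: copies a
   tightly packed n_observations x n_variables matrix into x, whose rows
   are x_col_dim elements apart. Columns beyond n_variables are left
   untouched. */
void copy_with_col_dim(const float *packed, int n_observations,
                       int n_variables, float *x, int x_col_dim)
{
    int i, j;

    assert(x_col_dim >= n_variables);
    for (i = 0; i < n_observations; i++)
        for (j = 0; j < n_variables; j++)
            x[(size_t)i * (size_t)x_col_dim + (size_t)j] =
                packed[(size_t)i * (size_t)n_variables + (size_t)j];
}
```

Under this assumption, a caller that reserves extra columns (x_col_dim greater than n_variables) must use x_col_dim, not n_variables, as the row stride when reading values back.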
Description
Function imsls_f_data_sets retrieves a standard data set frequently cited in statistics textbooks or in this manual. The following table gives the references for each data set:
data_set_choice | Reference |
---|---|
1 | Longley (1967) |
2 | Anderson (1971, p. 660) |
3 | Fisher (1936); Mardia et al. (1979, Table 1.2.2) |
4 | Box and Jenkins (1976, p. 531) |
5 | Draper and Smith (1981, pp. 629-630) |
6 | Box and Jenkins (1976, p. 525) |
7 | Box and Jenkins (1976, pp. 532-533) |
8 | Robinson (1976, p. 204) |
9 | Afifi and Azen (1979, pp. 16-22) |
10 | Aha, D. W. (1991, pp. 117-121), and Asuncion, A. & Newman, D.J. (2007) |
11 | Asuncion, A. & Newman, D.J. (2007) |
12 | Quinlan (1987, pp. 221-234, 1997), and Asuncion, A. & Newman, D.J. (2007) |
13 | P. W. Frey and D. J. Slate (Machine Learning, Vol. 6, No. 2, March 1991), and Asuncion, A. & Newman, D.J. (2007) |
14 | G. Demiroz, H. A. Govenir, and N. Ilter (Artificial Intelligence in Medicine), and Asuncion, A. & Newman, D.J. (2007) |
Example
In this example, imsls_f_data_sets is used to copy the Draper and Smith (1981, Appendix B) data set into x.
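#include <imsls.h>

int main(void)
{
    float *x;

    /* Data set 5 is Draper and Smith, Appendix B: 13 observations, 5 variables. */
    x = imsls_f_data_sets(5, 0);

    /* Print the 13 x 5 matrix with a title. */
    imsls_f_write_matrix("Draper and Smith, Appendix B", 13, 5, x, 0);

    return 0;
}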
Output
Draper and Smith, Appendix B
           1       2       3       4       5
 1     7.0    26.0     6.0    60.0    78.5
 2     1.0    29.0    15.0    52.0    74.3
 3    11.0    56.0     8.0    20.0   104.3
 4    11.0    31.0     8.0    47.0    87.6
 5     7.0    52.0     6.0    33.0    95.9
 6    11.0    55.0     9.0    22.0   109.2
 7     3.0    71.0    17.0     6.0   102.7
 8     1.0    31.0    22.0    44.0    72.5
 9     2.0    54.0    18.0    22.0    93.1
10    21.0    47.0     4.0    26.0   115.9
11     1.0    40.0    23.0    34.0    83.8
12    11.0    66.0     9.0    12.0   113.3
13    10.0    68.0     8.0    12.0   109.4
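Once a data set has been retrieved, it can be processed like any other row-by-row C array. As a minimal sketch that runs without the library, the block below transcribes the 13 by 5 values from the output above into a static array and computes column means. The names draper_smith and column_mean are hypothetical and introduced only for illustration, and the layout assumes rows are stored contiguously with n_variables elements each.

```c
#include <stddef.h>

#define N_OBSERVATIONS 13
#define N_VARIABLES 5

/* Values transcribed from the printed output, one row per observation. */
static const float draper_smith[N_OBSERVATIONS * N_VARIABLES] = {
     7.0f, 26.0f,  6.0f, 60.0f,  78.5f,
     1.0f, 29.0f, 15.0f, 52.0f,  74.3f,
    11.0f, 56.0f,  8.0f, 20.0f, 104.3f,
    11.0f, 31.0f,  8.0f, 47.0f,  87.6f,
     7.0f, 52.0f,  6.0f, 33.0f,  95.9f,
    11.0f, 55.0f,  9.0f, 22.0f, 109.2f,
     3.0f, 71.0f, 17.0f,  6.0f, 102.7f,
     1.0f, 31.0f, 22.0f, 44.0f,  72.5f,
     2.0f, 54.0f, 18.0f, 22.0f,  93.1f,
    21.0f, 47.0f,  4.0f, 26.0f, 115.9f,
     1.0f, 40.0f, 23.0f, 34.0f,  83.8f,
    11.0f, 66.0f,  9.0f, 12.0f, 113.3f,
    10.0f, 68.0f,  8.0f, 12.0f, 109.4f
};

/* Hypothetical helper (not an IMSL function): mean of column `col`
   (0-based) of an n_observations x n_variables matrix stored row by row. */
double column_mean(const float *x, int n_observations, int n_variables,
                   int col)
{
    double sum = 0.0;
    int i;

    for (i = 0; i < n_observations; i++)
        sum += x[(size_t)i * (size_t)n_variables + (size_t)col];
    return sum / n_observations;
}
```

With the library available, the same helper could presumably be applied to the pointer returned by imsls_f_data_sets(5, 0) in place of the transcribed array.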