data_sets
Retrieves a commonly analyzed data set.
Synopsis
#include <imsls.h>
float *imsls_f_data_sets (int data_set_choice, ..., 0)
The type double function is imsls_d_data_sets.
Required Arguments
int data_set_choice (Input)
Data set indicator. Set data_set_choice = 0 to print a description of all fourteen data sets. In this case, any optional arguments are ignored.
data_set_choice | n_observations | n_variables | Description of Data Set
1 | 16 | 7 | Longley
2 | 176 | 2 | Wolfer sunspot
3 | 150 | 5 | Fisher iris
4 | 144 | 1 | Box and Jenkins Series G
5 | 13 | 5 | Draper and Smith Appendix B
6 | 197 | 1 | Box and Jenkins Series A
7 | 296 | 2 | Box and Jenkins Series J
8 | 100 | 4 | Robinson Multichannel Time Series
9 | 113 | 34 | Afifi and Azen Data Set A
10 | 958 | 10 | Tic-Tac-Toe Endgame
11 | 4601 | 58 | Spambase Data Set
12 | 690 | 16 | Credit Approval
13 | 20000 | 17 | Letter Recognition Data
14 | 366 | 35 | Dermatology Database
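When a caller supplies its own storage, it helps to know how many elements a given data set occupies. The following is a minimal sketch that copies the dimensions from the table above into a lookup array; the names data_set_dims, DATA_SET_DIMS, and data_set_element_count are hypothetical and are not part of the library. It assumes the array is stored with a column dimension equal to n_variables.

```c
#include <stddef.h>

/* Hypothetical helper type: dimensions of one data set. */
typedef struct {
    int n_observations;
    int n_variables;
} data_set_dims;

/* Dimensions copied from the table above. Index 0 is a placeholder,
   because data_set_choice = 0 only prints descriptions. */
static const data_set_dims DATA_SET_DIMS[] = {
    {0, 0},
    {16, 7},     /*  1 Longley */
    {176, 2},    /*  2 Wolfer sunspot */
    {150, 5},    /*  3 Fisher iris */
    {144, 1},    /*  4 Box and Jenkins Series G */
    {13, 5},     /*  5 Draper and Smith Appendix B */
    {197, 1},    /*  6 Box and Jenkins Series A */
    {296, 2},    /*  7 Box and Jenkins Series J */
    {100, 4},    /*  8 Robinson Multichannel Time Series */
    {113, 34},   /*  9 Afifi and Azen Data Set A */
    {958, 10},   /* 10 Tic-Tac-Toe Endgame */
    {4601, 58},  /* 11 Spambase Data Set */
    {690, 16},   /* 12 Credit Approval */
    {20000, 17}, /* 13 Letter Recognition Data */
    {366, 35}    /* 14 Dermatology Database */
};

/* Returns n_observations * n_variables for the given choice,
   or 0 if the choice is outside 1..14. */
size_t data_set_element_count(int data_set_choice)
{
    if (data_set_choice < 1 || data_set_choice > 14) {
        return 0;
    }
    return (size_t)DATA_SET_DIMS[data_set_choice].n_observations *
           (size_t)DATA_SET_DIMS[data_set_choice].n_variables;
}
```

For example, data_set_element_count(5) gives 65, the number of float values in the 13 by 5 Draper and Smith data set.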
Return Value
If data_set_choice ≠ 0, the requested data set is returned. If data_set_choice = 0 or an error occurs, NULL is returned.
Synopsis with Optional Arguments
#include <imsls.h>
float *imsls_f_data_sets (int data_set_choice,
IMSLS_X_COL_DIM, int x_col_dim,
IMSLS_N_OBSERVATIONS, int *n_observations,
IMSLS_N_VARIABLES, int *n_variables,
IMSLS_PRINT_NONE,
IMSLS_PRINT_BRIEF,
IMSLS_PRINT_ALL,
IMSLS_RETURN_USER, float x[],
0)
Optional Arguments
IMSLS_X_COL_DIM, int x_col_dim (Input)
Column dimension of the user-allocated space.
IMSLS_N_OBSERVATIONS, int *n_observations (Output)
Number of observations or rows in the output matrix.
IMSLS_N_VARIABLES, int *n_variables (Output)
Number of variables or columns in the output matrix.
IMSLS_PRINT_NONE
No printing is performed. This option is the default.
IMSLS_PRINT_BRIEF
Rows 1 through 10 of the data set are printed.
IMSLS_PRINT_ALL
All rows of the data set are printed.
IMSLS_RETURN_USER, float x[] (Output)
User-supplied array containing the data set.
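The effect of a column dimension larger than n_variables can be illustrated without calling the library. The sketch below assumes row-major storage in which row i begins at offset i * x_col_dim, the usual layout for C matrices; this layout is an assumption here, not something stated above. The helpers element_at and copy_with_col_dim are hypothetical names introduced only for illustration.

```c
#include <stddef.h>

/* Hypothetical helper: returns element (row, col), both zero-based, of a
   matrix stored row by row in x with x_col_dim floats per stored row. */
float element_at(const float *x, int x_col_dim, int row, int col)
{
    return x[(size_t)row * (size_t)x_col_dim + (size_t)col];
}

/* Hypothetical helper: copies an n_obs by n_var matrix stored contiguously
   in src into dst, whose rows are x_col_dim floats apart
   (x_col_dim >= n_var). Padding columns in dst are left untouched. */
void copy_with_col_dim(const float *src, int n_obs, int n_var,
                       float *dst, int x_col_dim)
{
    int i, j;

    for (i = 0; i < n_obs; i++) {
        for (j = 0; j < n_var; j++) {
            dst[(size_t)i * (size_t)x_col_dim + (size_t)j] =
                src[(size_t)i * (size_t)n_var + (size_t)j];
        }
    }
}
```

With a 2 by 3 source matrix and x_col_dim = 4, the second row starts at dst[4] rather than dst[3], and dst[3] is unused padding.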
Description
Function imsls_f_data_sets retrieves a standard data set frequently cited in statistics textbooks or in this manual. The following table gives the references for each data set:
data_set_choice | Reference
1 | Longley (1967)
2 | Anderson (1971, p. 660)
3 | Fisher (1936); Mardia et al. (1979, Table 1.2.2)
4 | Box and Jenkins (1976, p. 531)
5 | Draper and Smith (1981, pp. 629-630)
6 | Box and Jenkins (1976, p. 525)
7 | Box and Jenkins (1976, pp. 532-533)
8 | Robinson (1976, p. 204)
9 | Afifi and Azen (1979, pp. 16-22)
10 | Aha, D. W. (1991, pp. 117-121), and Asuncion, A. & Newman, D.J. (2007)
11 | Asuncion, A. & Newman, D.J. (2007)
12 | Quinlan (1987, pp. 221-234, 1997), and Asuncion, A. & Newman, D.J. (2007)
13 | P. W. Frey and D. J. Slate (Machine Learning Vol 6 #2 March 91), and Asuncion, A. & Newman, D.J. (2007)
14 | G. Demiroz, H. A. Govenir, and N. Ilter (Artificial Intelligence in Medicine), and Asuncion, A. & Newman, D.J. (2007)
Example
In this example, imsls_f_data_sets is used to copy the Draper and Smith (1981, Appendix B) data set into x.
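#include <imsls.h>

int main(void)
{
    float *x;
    /* Retrieve data set 5 (Draper and Smith, Appendix B): 13 rows, 5 columns. */
    x = imsls_f_data_sets(5, 0);
    imsls_f_write_matrix("Draper and Smith, Appendix B", 13, 5, x, 0);
    /* Release the array allocated by the library. */
    imsls_free(x);
    return 0;
}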
Output
          Draper and Smith, Appendix B
         1        2        3        4        5
 1     7.0     26.0      6.0     60.0     78.5
 2     1.0     29.0     15.0     52.0     74.3
 3    11.0     56.0      8.0     20.0    104.3
 4    11.0     31.0      8.0     47.0     87.6
 5     7.0     52.0      6.0     33.0     95.9
 6    11.0     55.0      9.0     22.0    109.2
 7     3.0     71.0     17.0      6.0    102.7
 8     1.0     31.0     22.0     44.0     72.5
 9     2.0     54.0     18.0     22.0     93.1
10    21.0     47.0      4.0     26.0    115.9
11     1.0     40.0     23.0     34.0     83.8
12    11.0     66.0      9.0     12.0    113.3
13    10.0     68.0      8.0     12.0    109.4