data_sets

Retrieves a commonly analyzed data set.

Synopsis

#include <imsls.h>

float *imsls_f_data_sets (int data_set_choice, ..., 0)

The type double function is imsls_d_data_sets.

Required Arguments

int data_set_choice (Input)
Data set indicator. Set data_set_choice = 0 to print a description of all fourteen data sets. In this case, any optional arguments are ignored.

data_set_choice   n_observations   n_variables   Description of Data Set
1                 16               7             Longley
2                 176              2             Wolfer sunspot
3                 150              5             Fisher iris
4                 144              1             Box and Jenkins Series G
5                 13               5             Draper and Smith Appendix B
6                 197              1             Box and Jenkins Series A
7                 296              2             Box and Jenkins Series J
8                 100              4             Robinson Multichannel Time Series
9                 113              34            Afifi and Azen Data Set A
10                958              10            Tic-Tac-Toe Endgame
11                4601             58            Spambase Data Set
12                690              16            Credit Approval
13                20000            17            Letter Recognition Data
14                366              35            Dermatology Database
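The dimensions listed in the table can be kept in a small lookup table, for example to size a buffer before requesting a data set. The following is a minimal sketch in plain C with no IMSL calls. The names data_set_dims, DIMS, and data_set_min_elements are hypothetical and not part of the library, and the sketch assumes the data set occupies n_observations rows of x_col_dim elements each.

```c
#include <assert.h>
#include <stddef.h>

#define N_DATA_SETS 14

/* Hypothetical record of one row of the table above. */
typedef struct {
    int n_observations;
    int n_variables;
} data_set_dims;

/* Dimensions copied from the table, indexed by data_set_choice - 1. */
static const data_set_dims DIMS[N_DATA_SETS] = {
    {16, 7},    {176, 2},  {150, 5},    {144, 1},   {13, 5},
    {197, 1},   {296, 2},  {100, 4},    {113, 34},  {958, 10},
    {4601, 58}, {690, 16}, {20000, 17}, {366, 35}
};

/* Number of elements a buffer needs to hold the chosen data set,
   assuming (not confirmed by this page) that each observation
   occupies one row of x_col_dim elements. */
size_t data_set_min_elements(int data_set_choice, int x_col_dim)
{
    const data_set_dims *d;

    assert(data_set_choice >= 1 && data_set_choice <= N_DATA_SETS);
    d = &DIMS[data_set_choice - 1];
    assert(x_col_dim >= d->n_variables);
    return (size_t)d->n_observations * (size_t)x_col_dim;
}
```

For data set 5 with x_col_dim equal to n_variables, the helper returns 13 * 5 = 65 elements.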

Return Value

If data_set_choice ≠ 0, the requested data set is returned. If data_set_choice = 0 or an error occurs, NULL is returned.

Synopsis with Optional Arguments

#include <imsls.h>

float *imsls_f_data_sets (int data_set_choice,
    IMSLS_X_COL_DIM, int x_col_dim,
    IMSLS_N_OBSERVATIONS, int *n_observations,
    IMSLS_N_VARIABLES, int *n_variables,
    IMSLS_PRINT_NONE,
    IMSLS_PRINT_BRIEF,
    IMSLS_PRINT_ALL,
    IMSLS_RETURN_USER, float x[],
    0)

Optional Arguments

IMSLS_X_COL_DIM, int x_col_dim (Input)
Column dimension of the user-allocated space.

IMSLS_N_OBSERVATIONS, int *n_observations (Output)
Number of observations or rows in the output matrix.

IMSLS_N_VARIABLES, int *n_variables (Output)
Number of variables or columns in the output matrix.

IMSLS_PRINT_NONE
No printing is performed. This option is the default.

IMSLS_PRINT_BRIEF
Rows 1 through 10 of the data set are printed.

IMSLS_PRINT_ALL
All rows of the data set are printed.

IMSLS_RETURN_USER, float x[] (Output)
User-supplied array containing the data set.
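When the column dimension of the allocated space differs from the number of variables, element positions have to be computed from x_col_dim rather than n_variables. The following is a minimal sketch in plain C with no IMSL calls. It assumes the data set is stored row by row, one observation per row, with x_col_dim elements allocated per row; data_set_element is a hypothetical helper and not part of the library.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of IMSL): return the value of
   variable `col` for observation `row`, both zero-based, from a
   data set assumed to be stored row by row. x_col_dim is the
   allocated row length, which may exceed n_variables. */
float data_set_element(const float *x, int x_col_dim,
                       int n_observations, int n_variables,
                       int row, int col)
{
    assert(x != NULL);
    assert(x_col_dim >= n_variables);
    assert(row >= 0 && row < n_observations);
    assert(col >= 0 && col < n_variables);
    return x[(size_t)row * (size_t)x_col_dim + (size_t)col];
}
```

With x_col_dim equal to n_variables, the index reduces to row * n_variables + col, which is the layout of a contiguous matrix with no padding.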

Description

Function imsls_f_data_sets retrieves a standard data set frequently cited in statistics textbooks or in this manual. The following table gives the references for each data set:

data_set_choice   Reference
1                 Longley (1967)
2                 Anderson (1971, p. 660)
3                 Fisher (1936); Mardia et al. (1979, Table 1.2.2)
4                 Box and Jenkins (1976, p. 531)
5                 Draper and Smith (1981, pp. 629-630)
6                 Box and Jenkins (1976, p. 525)
7                 Box and Jenkins (1976, pp. 532-533)
8                 Robinson (1976, p. 204)
9                 Afifi and Azen (1979, pp. 16-22)
10                Aha, D. W. (1991, pp. 117-121), and Asuncion, A. & Newman, D.J. (2007)
11                Asuncion, A. & Newman, D.J. (2007)
12                Quinlan (1987, pp. 221-234, 1997), and Asuncion, A. & Newman, D.J. (2007)
13                P. W. Frey and D. J. Slate (Machine Learning, Vol. 6, No. 2, March 1991), and Asuncion, A. & Newman, D.J. (2007)
14                G. Demiroz, H. A. Govenir, and N. Ilter (Artificial Intelligence in Medicine), and Asuncion, A. & Newman, D.J. (2007)

Example

In this example, imsls_f_data_sets retrieves the Draper and Smith (1981, Appendix B) data set. The returned pointer is stored in x, and the 13 by 5 matrix is then printed.

 

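#include <imsls.h>

int main()
{
    float *x;

    /* Retrieve data set 5 (Draper and Smith, Appendix B). */
    x = imsls_f_data_sets (5, 0);

    /* Print the data set as a 13 by 5 matrix. */
    imsls_f_write_matrix("Draper and Smith, Appendix B", 13, 5, x, 0);
}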

Output

 

      Draper and Smith, Appendix B
          1        2        3        4        5
 1      7.0     26.0      6.0     60.0     78.5
 2      1.0     29.0     15.0     52.0     74.3
 3     11.0     56.0      8.0     20.0    104.3
 4     11.0     31.0      8.0     47.0     87.6
 5      7.0     52.0      6.0     33.0     95.9
 6     11.0     55.0      9.0     22.0    109.2
 7      3.0     71.0     17.0      6.0    102.7
 8      1.0     31.0     22.0     44.0     72.5
 9      2.0     54.0     18.0     22.0     93.1
10     21.0     47.0      4.0     26.0    115.9
11      1.0     40.0     23.0     34.0     83.8
12     11.0     66.0      9.0     12.0    113.3
13     10.0     68.0      8.0     12.0    109.4