This directory contains 38 individual data files representing the data sets analyzed in the book "A Casebook for a First Course in Statistics and Data Analysis," by Samprit Chatterjee, Mark S. Handcock and Jeffrey S. Simonoff, John Wiley and Sons, New York, 1995.

The names of the files correspond to the names used in the book. The data files are written in plain ASCII (character) text, so it should be possible to import them into virtually any statistical, database management, or spreadsheet package. Missing values are represented by M in all data files.
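As an illustration of handling the M missing-value code, here is a minimal Python sketch for the purely numeric files, assuming they are whitespace-separated. The function name `parse_free_format` and the sample text are hypothetical and not taken from the book or the data files.

```python
import math

def parse_free_format(text, missing="M"):
    """Parse whitespace-separated numeric rows, mapping the missing code to NaN."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        rows.append([math.nan if f == missing else float(f) for f in fields])
    return rows

# Hypothetical sample rows, not copied from any actual data file.
sample = "1.5 2 M\n3 M 4.25\n"
rows = parse_free_format(sample)
```

Files that contain an alphabetic variable would not parse this way, because labels may include spaces and are not numeric.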

The entry "All data files" is a concatenation of the individual data files using the individual file name as the delimiter.
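The exact layout of the delimiter is not specified here. The sketch below assumes, purely for illustration, that each file name appears alone on its own line immediately before that file's contents. The function `split_concatenation` and the sample text are hypothetical.

```python
def split_concatenation(text, names):
    """Split text into {file name: contents}.

    Assumption: each known file name sits alone on a delimiter line.
    """
    known = set(names)
    parts = {}
    current = None
    for line in text.splitlines():
        label = line.strip()
        if label in known:
            current = label          # start collecting a new file
            parts[current] = []
        elif current is not None:
            parts[current].append(line)
    return {name: "\n".join(lines) for name, lines in parts.items()}

# Hypothetical concatenation with made-up rows.
combined = "adopt.dat\nrow a\nrow b\ndjsp.dat\nrow c\n"
files = split_concatenation(combined, ["adopt.dat", "djsp.dat"])
```

If the real delimiter lines carry extra text around the file name, the membership test would need to be adapted.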

There is one additional file, "geyser2a.dat," which is an easier-to-use version of "geyser2.dat." In "geyser2a.dat," the values of the original DURATION variable that were noted only as Short, Medium or Long are coded as missing. In addition, the last column of the data set is a new variable (LONGSHRT) that codes whether the previous duration was short (0) or long (1).

Several of the files include an alphabetic (labeling) variable. It is likely that these files would have to be input into a package using fixed, rather than free, format. The relevant files, along with appropriate Fortran format statements, are as follows:

  • adopt.dat : (A18,I6,I6,I6)
  • djsp.dat : (A11,F8.2,F8.2)
  • ers.dat : (A5,I6,I4,I4,I4,I4,I3,I3,I3,I3,I3,I3)
  • foot.dat : (A5,A4,I3,I3,I3,I4,I4)
  • free.dat : (A23,F5.3,F6.3,I3,I3,I4,I4,I5,I5,F6.3,F6.3,I4,I4,I2)
  • funds.dat : (A22,I1,I8,I5)
  • health.dat : (A22,A20,I3,I8,F7.1)
  • hockey1.dat : (A25,I5,I5,I5)
  • hockey2.dat : (A25,I5,I5,I5,I6)
  • liinc.dat : (A24,I7,I8)
  • lischool.dat : (A25,I4,I6,I6)
  • mort.dat : (A25,I1,F8.3,F7.2)
  • nba.dat : (A15,I4,I4,I6,I4,F5.1,F5.1,F5.1,F6.1,F6.1)
  • pcb.dat : (A21,F10.2,F9.2)
  • ppp.dat : (A15,F12.5,F12.5,F12.5,F12.5)
  • rock.dat : (A30,I4,I4,I4,I4,I4,I4,I4,I4)
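For packages without a Fortran-style reader, the format statements above can be turned into column widths. The following Python sketch is one way to do that, under these assumptions: only A, I and F descriptors without repeat counts occur (as in the list above), numeric values carry an explicit decimal point where needed, and a field containing M or only blanks is treated as missing. The function names and the sample line are hypothetical.

```python
import re

def fortran_widths(fmt):
    """Turn a simple format such as '(A18,I6,F8.2)' into (kind, width) pairs."""
    fields = []
    for token in fmt.strip().strip("()").split(","):
        m = re.fullmatch(r"\s*([AIF])(\d+)(?:\.\d+)?\s*", token)
        if m is None:
            raise ValueError(f"unsupported edit descriptor: {token!r}")
        fields.append((m.group(1), int(m.group(2))))
    return fields

def parse_fixed_line(line, fmt, missing="M"):
    """Slice one fixed-format line into a list of Python values."""
    values = []
    pos = 0
    for kind, width in fortran_widths(fmt):
        raw = line[pos:pos + width].strip()
        pos += width
        if kind == "A":
            values.append(raw)          # label column, kept as text
        elif raw == missing or raw == "":
            values.append(None)         # assumption: M or blank means missing
        elif kind == "I":
            values.append(int(raw))
        else:
            values.append(float(raw))   # the implied-decimal rule of F is ignored
    return values

# Hypothetical line laid out to match (A11,F8.2,F8.2); not real data.
sample_line = "{:<11}{:>8}{:>8}".format("Jan 1990", "1.25", "M")
record = parse_fixed_line(sample_line, "(A11,F8.2,F8.2)")
```

Slicing by width rather than splitting on whitespace keeps labels that contain spaces intact, which is the reason fixed format is needed for these files.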

Descriptions of the data sources, and further information about the data sets, can be found in the "Descriptions of the data files" section of the book.
